Tag Archives: Data archiving
With that said, the basic approaches to consider are from the top-down, or the bottom-up. You can be successful with either approach. However, there are certain efficiencies you’ll gain with a specific choice, and it could significantly reduce the risk and cost. Let’s explore the pros and cons of each approach.
Approaching data integration from the top-down means moving from the high level integration flows, down to the data semantics. Thus, you an approach, perhaps even a tool-set (using requirements), and then define the flows that are decomposed down to the raw data.
The advantages of this approach include:
The ability to spend time defining the higher levels of abstraction without being limited by the underlying integration details. This typically means that those charged with designing the integration flows are more concerned with how they have to deal with the underlying source and target, and this approach means that they don’t have to deal with that issue until later, as they break down the flows.
The disadvantages of this approach include:
The data integration architect does not consider the specific needs of the source or target systems, in many instances, and thus some rework around the higher level flows may have to occur later. That causes inefficiencies, and could add risk and cost to the final design and implementation.
For the most part, this is the approach that most choose for data integration. Indeed, I use this approach about 75 percent of the time. The process is to start from the native data in the sources and targets, and work your way up to the integration flows. This typically means that those charged with designing the integration flows are more concerned with the underlying data semantic mediation than the flows.
The advantages of this approach include:
It’s typically a more natural and traditional way of approaching data integration. Called “data-driven” integration design in many circles, this initially deals with the details, so by the time you get up to the integration flows there are few surprises, and there’s not much rework to be done. It’s a bit less risky and less expensive, in most cases.
The disadvantages of this approach include:
Starting with the details means that you could get so involved in the details that you miss the larger picture, and the end state of your architecture appears to be poorly planned, when all is said and done. Of course, that depends on the types of data integration problems you’re looking to solve.
No matter which approach you leverage, with some planning and some strategic thinking, you’ll be fine. However, there are different paths to the same destination, and some paths are longer and less efficient than others. As you pick an approach, learn as you go, and adjust as needed.
This magic quadrant focuses on what Gartner calls Structured Data Archiving. Data Archiving is used to index, migrate, preserve and protect application data in secondary databases or flat files. These are typically located on lower-cost storage, for policy-based retention. Data Archiving makes data available in context of the originating business process or application. This is especially useful in the event of litigation or of an audit.
The Magic Quadrant calls out two use cases. These use cases are “live archiving of production applications” and “application retirement of legacy systems.” Informatica refers to both use cases, together, as “Enterprise Data Archiving.” We consider this to be a foundational component of a comprehensive Information Lifecycle Management strategy.
The application landscape is constantly evolving. For this reason, data archiving is a strategic component of a data growth management strategy. Application owners need a plan to manage data as applications are upgraded, replaced, consolidated, moved to the cloud and/or retired.
When you don’t have a plan in production, data accumulates in the business application. When this happens, performance bothers the business. In addition, data bloat bothers IT operations. When you don’t have a plan for legacy systems, applications accumulate in the data center. As a result, increasing budgets bother the CFO.
A data growth management plan must include the following:
- How to cycle through applications and retire them
- How to smartly store the application data
- How to ultimately dispose data while staying compliant
Structured data archiving and application retirement technologies help automate and streamline these tasks.
Informatica Data Archive delivers unparalleled connectivity, scalability and a broad range of innovative options (i.e. Smart Partitioning, Live Archiving, and retiring aging and legacy data to the Informatica Data Vault), and comprehensive retention management and data reporting and visualization. We believe our strengths in this space are the key ingredients for deploying a successful enterprise data archive.
For more information, read the Gartner Magic Quadrant for Structured Data Archiving and Application Retirement.
What springs to mind when you think about old applications? What happens to them when they outlived their usefulness? Do they finally get to retire and have their day in the sun, or do they tenaciously hang on to life?
Think for a moment about your situation and of those around you. From the time work started you have been encouraged and sometimes forced to think about, plan for and fund your own retirement. Now consider the portfolio your organization has built up over the years; hundreds or maybe thousands of apps, spread across numerous platforms and locations – A mix of home-grown with the best-in-breed tools or acquired from the leading application vendors.
Evaluating Your Current Situation
- Do you know how many of those “legacy” systems are still running?
- Do you know how much these apps are costing?
- Is there a plan to retire them?
- How is the execution tracking to plan?
Truth is, even if you have a plan, it probably isn’t going well.
Providing better citizen service at a lower cost
This is something every state and local organization aspires to do by reducing costs. Many organizations are spending 75% or more of their budgets on just keeping the lights on – maintaining existing applications and infrastructure. Being able to fully retire some, or many of these applications saves significant money. Do you know how much these applications are costing your organization? Don’t forget to include the whole range of costs that applications incur – including the physical infrastructure costs such as mainframes, networks and storage, as well as the required software licenses and of course the time of the people that actually keep them running. What happens when those with with Cobol and CICS experience retire? Usually the answer is not good news. There is a lot to consider and many benefits to be gained through an effective application retirement strategy.
August 2011 report by ESG Global shows that some 68% of organizations had over six or more legacy applications running and that 50% planned to retire at least one of those over the following 12-18 months. It would be interesting to see today’s situation and be able evaluate how successful these application retirement plans have been.
A common problem is knowing where to start. You know there are applications that you should be able to retire, but planning, building and executing an effective and success plan can be tough. To help this process we have developed a strategy, framework and solution for effective and efficient application retirement. This is a good starting point on your application retirement journey.
To get a speedy overview, take six minutes to watch this video on application retirement.
We have created a community specifically for application managers in our ‘Potential At Work’ site. If you haven’t already signed up, take a moment and join this group of like-minded individuals from across the globe.
I recently met with a longtime colleague from the Oracle E-Business Suite implementation eco-system, now VP of IT for a global technology provider. This individual has successfully implemented data archiving and data masking technologies to eliminate duplicate applications and control the costs of data growth – saving tens of millions of dollars. He has freed up resources that were re-deployed within new innovative projects such as Big Data – giving him the reputation as a thought leader. In addition, he has avoided exposing sensitive data in application development activities by securing it with data masking technology – thus securing his reputation.
When I asked him about those projects and the impact on his career, he responded, ‘Data archiving and data security are table stakes in the Oracle Applications IT game. However, if I want to be a part of anything important, it has to involve Cloud and Big Data.’ He further explained how the savings achieved from Informatica Data Archive enabled him to increase employee retention rates because he was able to fund an exciting Hadoop project that key resources wanted to work on. Not to mention, as he transitioned from physical infrastructure to a virtual server by retiring legacy applications – he had accomplished his first step on his ‘journey to the cloud’. This would not have been possible if his data required technology that was not supported in the cloud. If he hadn’t secured sensitive data and had experienced a breach, he would be looking for a new job in a new industry.
Not long after, I attended a CIO summit where the theme of the conference was ‘Breakthrough Innovation’. Of course, Cloud and Big Data were main stage topics – not just about the technology, but about how it was used to solve business challenges and provide services to the new generation of ‘entitled’ consumers. This is the description of those who expect to have everything at their fingertips. They want to be empowered to share or not share their information. They expect that if you are going to save their personal information, it will not be abused. Lastly, they may even expect to try a product or service for free before committing to buy.
In order to size up to these expectations, Application Owners, like my long-time colleague, need to incorporate Data Archive and Data Masking in their standard SDLC processes. Without Data Archive, IT budgets may be consumed by supporting old applications and mountains of data, thereby becoming inaccessible for new innovative projects. Without Data Masking, a public breach will drive many consumers elsewhere.
Healthcare organizations are currently engaged in major transformative initiatives. The American Recovery and Reinvestment Act of 2009 (ARRA) provided the healthcare industry incentives for the adoption and modernization of point-of-care computing solutions including electronic medical and health records (EMRs/EHRs). Funds have been allocated, and these projects are well on their way. In fact, the majority of hospitals in the US are engaged in implementing EPIC, a software platform that is essentially the ERP for healthcare.
These Cadillac systems are being deployed from scratch with very little data being ported from the old systems into the new. The result is a dearth of legacy applications running in aging hospital data centers, consuming every last penny of HIS budgets. Because the data still resides on those systems, hospital staff continues to use them making it difficult to shut down or retire.
Most of these legacy systems are not running on modern technology platforms – they run on systems such as HP Turbo Image, Intercache Mumps, and embedded proprietary databases. Finding people who know how to manage and maintain these systems is costly and risky – risky in that if data residing in those applications is subject to data retention requirements (patient records, etc.) and the data becomes inaccessible.
A different challenge for CFOs of these hospitals is the ROI on these EPIC implementations. Because these projects are multi-phased, multi-year, boards of directors are asking about the value realized from these investments. Many are coming up short because they are maintaining both applications in parallel. Relief will come when systems can be retired – but getting hospital staff and regulators to approve a retirement project requires evidence that they can still access data while adhering to compliance needs.
Many providers have overcome these hurdles by successfully implementing an application retirement strategy based on the Informatica Data Archive platform. Several of the largest pediatrics’ children’s hospitals in the US are either already saving or expecting to save $2 Million or more annually from retiring legacy applications. The savings come from:
- Eliminating software maintenance and license costs
- Eliminate hardware dependencies and costs
- Reduced storage requirements by 95% (data archived is stored in a highly compressed, accessible format)
- Improved efficiencies in IT by eliminating specialized processes or skills associated with legacy systems
- Freed IT resources – teams can spend more of their time working on innovations and new projects
Informatica Application Retirement Solutions for Healthcare provide hospitals with the ability to completely retire legacy applications, retire and maintain access to archive data for hospital staff. And with built in security and retention management, records managers and legal teams are satisfying compliance requirements. Contact your Informatica Healthcare team for more information on how you can get that EPIC ROI the board of directors is asking for.
The term “big data” has been bandied around so much in recent months that arguably, it’s lost a lot of meaning in the IT industry. Typically, IT teams have heard the phrase, and know they need to be doing something, but that something isn’t being done. As IDC pointed out last year, there is a concerning shortage of trained big data technology experts, and failure to recognise the implications that not managing big data can have on the business is dangerous. In today’s information economy, as increasingly digital consumers, customers, employees and social networkers we’re handing over more and more personal information for businesses and third parties to collate, manage and analyse. On top of the growth in digital data, emerging trends such as cloud computing are having a huge impact on the amount of information businesses are required to handle and store on behalf of their customers. Furthermore, it’s not just the amount of information that’s spiralling out of control: it’s also the way in which it is structured and used. There has been a dramatic rise in the amount of unstructured data, such as photos, videos and social media, which presents businesses with new challenges as to how to collate, handle and analyse it. As a result, information is growing exponentially. Experts now predict a staggering 4300% increase in annual data generation by 2020. Unless businesses put policies in place to manage this wealth of information, it will become worthless, and due to the often extortionate costs to store the data, it will instead end up having a huge impact on the business’ bottom line. Maxed out data centres Many businesses have limited resource to invest in physical servers and storage and so are increasingly looking to data centres to store their information in. As a result, data centres across Europe are quickly filling up. Due to European data retention regulations, which dictate that information is generally stored for longer periods than in other regions such as the US, businesses across Europe have to wait a very long time to archive their data. For instance, under EU law, telecommunications service and network providers are obliged to retain certain categories of data for a specific period of time (typically between six months and two years) and to make that information available to law enforcement where needed. With this in mind, it’s no surprise that investment in high performance storage capacity has become a key priority for many. Time for a clear out So how can organisations deal with these storage issues? They can upgrade or replace their servers, parting with lots of capital expenditure to bring in more power or more memory for Central Processing Units (CPUs). An alternative solution would be to “spring clean” their information. Smart partitioning allows businesses to spend just one tenth of the amount required to purchase new servers and storage capacity, and actually refocus how they’re organising their information. With smart partitioning capabilities, businesses can get all the benefits of archiving the information that’s not necessarily eligible for archiving (due to EU retention regulations). Furthermore, application retirement frees up floor space, drives the modernisation initiative, allows mainframe systems and older platforms to be replaced and legacy data to be migrated to virtual archives. Before IT professionals go out and buy big data systems, they need to spring clean their information and make room for big data. Poor economic conditions across Europe have stifled innovation for a lot of organisations, as they have been forced to focus on staying alive rather than putting investment into R&D to help improve operational efficiencies. They are, therefore, looking for ways to squeeze more out of their already shrinking budgets. The likes of smart partitioning and application retirement offer businesses a real solution to the growing big data conundrum. So maybe it’s time you got your feather duster out, and gave your information a good clean out this spring?
The Oracle Application User Group Archive and Purge Special Interest Group held its semi-annual meeting on Sunday September 30th at Oracle OpenWorld. Once again, this session was very well attended – but more so this year because of the expert panel which included: Admed Alomari – Founder of Cybermoor, Isam Alyousfi – Seniorr Director, Lead Oracle Applications Tuning Group, Sameer Barakat, Oracle Applications Tuning Group, and Ziyad Dahbour – now at Informatica (Founder of TierData, Founder of Outerbay). (more…)
The need for more robust data retention management and enforcement is more than just good data management practice. It is a legal requirement for financial services organizations across the globe to comply with the myriad of local, federal, and international laws that mandate the retention of certain types of data for example:
- Dodd-Frank Act: Under Dodd-Frank, firms are required to maintain records for no less than five years.
- Basel Accord: The Basel guidelines call for the retention of risk and transaction data over a period of three to seven years. Noncompliance can result in significant fines and penalties.
- MiFiD II: Transactional data must also be stored in such a way that it meets new records retention requirements for such data (which must now be retained for up to five years) and easily retrieved, in context, to prove best execution.
- Bank Secrecy Act: All BSA records must be retained for a period of five years and must be filed or stored in such a way as to be accessible within a reasonable period of time.
- Payment Card Industry Data Security Standard (PCI): PCI requires card issuers and acquirers to retain an audit trail history for a period that is consistent with its effective use, as well as legal regulations. An audit history usually covers a period of at least one year, with a minimum of three months available on-line.
- Sarbanes-Oxley:Section 103 requires firms to prepare and maintain, for a period of not less than seven years, audit work papers and other information related to any audit report, in sufficient detail to support the conclusions reached and reported to external regulators.
Each of these laws have distinct data collection, analysis, and retention requirements that must be factored into existing information management practices. Unfortunately, existing data archiving methods including traditional database and tape backup methods lack the required capabilities to effectively enforce and automate data retention policies to comply with industry regulations. In addition, a number of internal and external challenges make it even more difficult for financial institutions to archive and retain required data due to the following trends: (more…)
Data warehouses are applications– so why not manage them like one? In fact, data grows at a much faster rate in data warehouses, since they integrate date from multiple applications and cater to many different groups of users who need different types of analysis. Data warehouses also keep historical data for a long time, so data grows exponentially in these systems. The infrastructure costs in data warehouses also escalate quickly since analytical processing on large amounts of data requires big beefy boxes. Not to mention the software license and maintenance costs of such a large amount of data. Imagine how many backup media is required to backup tens to hundreds of terabytes of data warehouses on a regular basis. But do you really need to keep all that historical data in production?
One of the challenges of managing data growth in data warehouses is that it’s hard to determine which data is actually used, which data is no longer being used, or even if the data was ever used at all. Unlike transactional systems where the application logic determines when records are no longer being transacted upon, the usage of analytical data in data warehouses has no definite business rules. Age or seasonality may determine data usage in data warehouses, but business users are usually loath to let go of the availability of all that data at their fingertips. The only clear cut way to prove that some data is no longer being used in data warehouses is to monitor its usage.