Category Archives: Database Archiving
This magic quadrant focuses on what Gartner calls Structured Data Archiving. Data Archiving is used to index, migrate, preserve and protect application data in secondary databases or flat files. These are typically located on lower-cost storage, for policy-based retention. Data Archiving makes data available in context of the originating business process or application. This is especially useful in the event of litigation or of an audit.
The Magic Quadrant calls out two use cases. These use cases are “live archiving of production applications” and “application retirement of legacy systems.” Informatica refers to both use cases, together, as “Enterprise Data Archiving.” We consider this to be a foundational component of a comprehensive Information Lifecycle Management strategy.
The application landscape is constantly evolving. For this reason, data archiving is a strategic component of a data growth management strategy. Application owners need a plan to manage data as applications are upgraded, replaced, consolidated, moved to the cloud and/or retired.
When you don’t have a plan in production, data accumulates in the business application. When this happens, performance bothers the business. In addition, data bloat bothers IT operations. When you don’t have a plan for legacy systems, applications accumulate in the data center. As a result, increasing budgets bother the CFO.
A data growth management plan must include the following:
- How to cycle through applications and retire them
- How to smartly store the application data
- How to ultimately dispose of data while staying compliant
Structured data archiving and application retirement technologies help automate and streamline these tasks.
Informatica Data Archive delivers unparalleled connectivity, scalability and a broad range of innovative options (e.g., Smart Partitioning, Live Archiving, and retiring aging and legacy data to the Informatica Data Vault), along with comprehensive retention management, data reporting and visualization. We believe our strengths in this space are the key ingredients for deploying a successful enterprise data archive.
For more information, read the Gartner Magic Quadrant for Structured Data Archiving and Application Retirement.
Oracle DBAs are challenged with keeping mission-critical databases up and running with predictable performance as data volumes grow. Our customers are changing their approach to proactively managing Oracle performance while simplifying IT by leveraging our innovative Data Archive Smart Partitioning features. Smart Partitioning leverages Oracle Database Partitioning, simplifying the deployment and management of partitioning strategies. DBAs have been able to respond to requests to improve business process performance without having to write any custom code or SQL scripts.
With Smart Partitioning, DBAs have a new dialogue with business analysts. Rather than wading into the technology weeds, they ask how many months, quarters or years of data are required to get the job done. Within a few clicks, they can show how users can self-select how much data gets processed when they run queries, reports or programs, which in effect shows users how to control their own performance by controlling the volume of data they pull from the database.
Smart Partitioning is configured using easily understood business dimensions such as time, company and business unit. These dimensions make it easy to ‘slice’ data to meet the job at hand. Performance becomes manageable and under business control. Another benefit is in your non-production environments. Creating smaller, fully functional subset databases now fits easily into your cloning operations.
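To make the idea of slicing by a time dimension concrete, here is a minimal conceptual sketch in Python. It is not how Smart Partitioning or Oracle Database Partitioning is implemented; the row layout, the `posted_on` field and the function names are all hypothetical. It only illustrates that when data is grouped by quarter, a request for "the last N quarters" touches just those groups.

```python
from collections import defaultdict
from datetime import date


def quarter_key(d):
    """Map a date to its (year, quarter) partition key."""
    return (d.year, (d.month - 1) // 3 + 1)


def partition_by_quarter(rows):
    """Group rows into partitions keyed by (year, quarter)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[quarter_key(row["posted_on"])].append(row)
    return partitions


def recent_rows(partitions, as_of, quarters):
    """Return rows from the most recent `quarters` partitions only."""
    year, q = quarter_key(as_of)
    wanted = []
    for _ in range(quarters):
        wanted.append((year, q))
        q -= 1
        if q == 0:  # roll back from Q1 to Q4 of the previous year
            year, q = year - 1, 4
    # .get() avoids creating empty partitions for quarters with no data
    return [row for key in wanted for row in partitions.get(key, [])]


# Hypothetical sample data, one row per quarter
rows = [
    {"id": 1, "posted_on": date(2012, 11, 5)},
    {"id": 2, "posted_on": date(2013, 2, 14)},
    {"id": 3, "posted_on": date(2013, 5, 1)},
    {"id": 4, "posted_on": date(2013, 8, 20)},
]
partitions = partition_by_quarter(rows)
last_two = recent_rows(partitions, as_of=date(2013, 9, 1), quarters=2)
print([row["id"] for row in last_two])
```

Asking for two quarters reads only the two most recent groups; the older rows are never scanned, which is the effect the business user controls by choosing how much history to pull.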
Finally, Informatica has been working closely with the Oracle Enterprise Solutions Group to align Informatica Data Archive Smart Partitioning with the Oracle ZS3 Appliance to maximize performance and savings while minimizing the complexity of implementing an Information Lifecycle Management strategy.
When the average person hears of cloning, my bet is that they think of the controversy and ethical issues surrounding cloning, such as the cloning of Dolly the sheep, or the possible cloning of humans by a mad geneticist in a rogue nation state. I would also put money down that when an Informatica blog reader thinks of cloning they think of “The Matrix” or “Star Wars” (that dreadful episode II Attack of the Clones). I did. Unfortunately.
But my pragmatic expectation is that when Informatica customers think of cloning, they also think of Data Cloning software. Data Cloning software clones terabytes of database data into a host of other databases, data warehouses, analytical appliances, and Big Data stores such as Hadoop. And just for hoots and hollers, you should know that almost half of all Data Integration efforts involve replication, be it snapshot or real-time, according to TDWI survey data. Survey also says… replication is the second most popular — or second most used — data integration tool, behind ETL.
Do your company’s cloning tools work with non-standard types? Know that Informatica cloning tools can reproduce Oracle data to just about anything on 2 tuples (or more). We do non-discriminatory duplication, so it’s no wonder we especially fancy cloning the Oracle! (a thousand apologies for the bad “Matrix” pun)
Just remember that data clones are an important and natural component of business continuity, and the use cases span both operational and analytic applications. So if you’re not cloning your Oracle data safely and securely with the quality results that you need and deserve, it’s high time that you get some better tools.
Send in the Clones
With that in mind, if you haven’t tried cloning before, Informatica is making its Fast Clone database cloning trial software available as a free download for a limited time. Click here to get it now.
This is the first in a series of articles where I will take an in-depth look at how state and local governments are affected by data breaches and what they should be considering as part of their compliance, risk-avoidance and remediation plans.
Each state has one or more agencies that are focused on the lives, physical and mental health and overall welfare of their citizens. The mission statement of the Department of Public Welfare of Pennsylvania, my home state, is typical. It reads: “Our vision is to see Pennsylvanians living safe, healthy and independent lives. Our mission is to improve the quality of life for Pennsylvania’s individuals and families. We promote opportunities for independence through services and supports while demonstrating accountability for taxpayer resources.”
Just as in the enterprise, over the last couple of decades the way an agency deals with citizens has changed dramatically. No longer is everything paper-based and manually intensive. Each state has made enormous efforts not just to automate more and more of its processes but, more recently, to put everything online. The combination of these two factors has led to the situation where just about everything a state knows about each citizen is stored in numerous databases and data warehouses and, of course, accessed through the Web.
It’s interesting that in the PA mission statement two of the three focus areas are safety and health. I am sure that when written these were meant in the physical sense. We now have to consider what each state is doing to safeguard and promote the digital safety and health of its citizens. You might ask what digital safety and health means. At the highest level this is quite straightforward: each state must ensure the data it holds about its citizens is safe from inadvertent or deliberate exposure or disclosure. It seems that each week we read about another data breach (see the high-profile data breach infographic), whether an accidental loss (a stolen laptop, for instance) or a deliberate one (hacking, for example), of data about people, the citizens. Often that includes data that can be used to identify individuals, and once an individual citizen is identified they are at risk of identity theft, credit card fraud or worse.
Of the 50 states, 46 now have a series of laws and regulations in place about when and how they need to report on data breaches or losses. This is all well and good, but it is a bit like shutting the stable door after the horse has bolted, only with higher stakes, as there are potentially dire consequences for the digital safety and health of their citizens.
In the next article I will look at the numerous areas that are often overlooked when states establish and execute their data protection and data privacy plans.
Informatica announced yesterday that the Informatica ILM Nearline product is SAP-certified. ILM Nearline helps IT organizations reduce the costs of managing data growth in existing implementations of the SAP NetWeaver Business Warehouse (SAP NetWeaver BW) and SAP HANA. By doing so, customers can leverage freed budgets and resources to invest in their application landscape and data center modernization initiatives. Informatica ILM Nearline v6.1A for use with SAP NetWeaver BW and SAP HANA, available today, is purpose-built for SAP environments leveraging native SAP interfaces.
Data volumes are growing the fastest in data warehouse and reporting applications, yet a significant amount of that data is rarely used or infrequently accessed. In deployments of SAP NetWeaver BW, standard SAP archiving can reduce the size of a production data warehouse database to help preserve its performance, but if users ever want to query or manipulate the archived data, the data needs to be loaded back into the production system, disrupting data analytics processes and extending time to insight. The same holds true for SAP HANA.
To address this, ILM Nearline enables IT to migrate large volumes of largely inactive SAP NetWeaver BW or SAP HANA data from the production database or in-memory store to online, secure, highly compressed, immutable files in a near-line system while maintaining end-user access. The result is a controlled environment running SAP NetWeaver BW or SAP HANA with predictable, ongoing hardware, software and maintenance costs. This helps ensure service-level agreements (SLAs) can be met while freeing up ongoing budget and resources so IT can focus on innovation.
Informatica ILM Nearline for use with SAP NetWeaver BW and SAP HANA has been certified with the following interfaces:
- NW-BW-NLS Nearline Storage SAP NetWeaver BW 7.30 on SAP HANA for Informatica Data Archive 6.1A
- NW-BW-NLS 7.30 – Nearline Storage – SAP NetWeaver BW 7.30 for Informatica Data Archive 6.1A
- BC-HCS 6.20 – HTTP Content Server 6.20 for Interface for Informatica Data Archive 6.1
“Informatica ILM Nearline for use with SAP NetWeaver BW and SAP HANA is all about reducing the costs of data while keeping the data easily accessible and thus valuable,” said Adam Wilson, general manager, ILM, Informatica. “As data volumes continue to soar, the solution is especially game-changing for organizations implementing SAP HANA as they can use the Informatica-enabled savings to help offset and control the costs of their SAP HANA licenses without disrupting the current SAP NetWeaver BW users’ access to the data.”
Specific advantages of Informatica ILM Nearline include:
- Industry-leading compression rates – Informatica ILM Nearline’s compression rates exceed standard database compression rates by a sizable margin. Customers typically achieve rates in excess of 90 percent, and some have reported rates as high as 98 percent.
- Easy administration and data access – No database administration is required for data archived by Informatica ILM Nearline. Data is accessible from the user’s standard SAP application screen without any IT interventions and is efficiently stored to simplify backup, restore and data replication processes.
- Limitless capacity – Highly scalable, the solution is designed to store limitless amounts of data without affecting data access performance.
- Easy storage tiering – As data is stored in a highly compressed format, the nearline archive can be easily migrated from one storage location to another in support of a tiered storage strategy.
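To put the quoted compression rates in perspective, here is a small arithmetic sketch. It assumes that a "compression rate" of 90 percent means 90 percent of the original space is saved, which is an interpretation rather than something the announcement defines, and the 10 TB starting size is a hypothetical figure chosen only for illustration.

```python
def compressed_size_tb(raw_tb, compression_rate):
    """Size after compression; compression_rate is the fraction of space saved (assumed meaning)."""
    if not 0 <= compression_rate < 1:
        raise ValueError("compression_rate must be in [0, 1)")
    return raw_tb * (1 - compression_rate)


raw_tb = 10  # hypothetical amount of inactive data, in terabytes
for rate in (0.90, 0.98):  # the two rates quoted in the text
    size = compressed_size_tb(raw_tb, rate)
    print(f"{rate:.0%} compression: {raw_tb} TB -> {size:.2f} TB")
```

Under that assumption, 10 TB shrinks to roughly 1 TB at 90 percent and roughly 0.2 TB at 98 percent, which is why moving a compressed archive between storage tiers is comparatively cheap.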
Available now, Informatica ILM Nearline for use with SAP NetWeaver BW and SAP HANA is based on intellectual property acquired from Sand Technology in Q4 2011 and enhanced by Informatica.
 Informatica Survey Results, January 23, 2013 (citation from Enterprise Data Archive for Hybrid IT Webinar)
The Oracle Application User Group (OAUG) Archive and Purge Special Interest Group (SIG) held its semi-annual session first thing in the morning, at 8:00am on Sunday, September 22, 2013. The chairman of the SIG, Brian Bent, must have drawn the short straw for session times. Regardless, attendance was incredibly strong and the topic, ‘Cleaning up your Oracle E-Business Suite Mess’, was well received.
From the initial audience survey, most attendees have made the jump to OEBS R12 and very few have implemented an Information Lifecycle Management (ILM) strategy. As organizations migrate to the latest version, the rate of data growth increases significantly, such that performance takes a plunge, costs for infrastructure and storage spike, and DBAs are squeezed trying to make do.
The bulk of the discussion was on what Oracle offers for purging Concurrent Programs. The focus was on system tables – not functional archive and purge routines, like General Ledger or Accounts Receivable. That will be a topic of another SIG day.
For starters, Oracle provides Concurrent Programs to purge administrative data. Look for ‘Big Tables’ owned by APPLSYS for more candidates and search for the biggest tables / indexes. Search for ‘PURGE’ on MyOracleSupport (MOS) – do your homework to decide if the Purge programs apply to you. If you are concerned about deleting data, you can create an archive table, add an ‘on delete’ trigger to the original table, run the purge and automatically save the data in the archive table (Guess what? This is a CUSTOMIZATION).
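The archive-table approach described above can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` module and SQLite trigger syntax as a stand-in for an Oracle database (Oracle triggers are written in PL/SQL and look different), and the `requests` and `requests_archive` tables are hypothetical, not real E-Business Suite objects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE requests (request_id INTEGER PRIMARY KEY, completed_on TEXT);
    CREATE TABLE requests_archive (request_id INTEGER, completed_on TEXT, archived_at TEXT);

    -- Copy each row into the archive table just before it is deleted.
    CREATE TRIGGER requests_archive_on_delete
    BEFORE DELETE ON requests
    FOR EACH ROW
    BEGIN
        INSERT INTO requests_archive (request_id, completed_on, archived_at)
        VALUES (OLD.request_id, OLD.completed_on, datetime('now'));
    END;

    INSERT INTO requests VALUES (1, '2013-06-01'), (2, '2013-09-15'), (3, '2013-09-20');
""")

# Stand-in for the purge program: delete rows completed before a cutoff date.
conn.execute("DELETE FROM requests WHERE completed_on < ?", ("2013-09-01",))
conn.commit()

remaining = [r[0] for r in conn.execute(
    "SELECT request_id FROM requests ORDER BY request_id")]
archived = [r[0] for r in conn.execute(
    "SELECT request_id FROM requests_archive ORDER BY request_id")]
print("remaining:", remaining)
print("archived:", archived)
```

The purge itself needs no knowledge of the archive; the trigger saves every deleted row as a side effect. As the post points out, adding such a trigger to an application table is a customization, so it should be weighed against support and upgrade implications.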
Some areas to look at include FND_CONCURRENT_REQUESTS and FND_LOBS.
For FND_CONCURRENT_REQUESTS:
- Most customers purge data older than 7 to 30 days
- Oracle recommends keeping this table under 25,000 rows
- Consider additional purges that delete data about concurrent requests that run frequently
For FND_LOBS:
- DBAs do not delete from FND_LOBS; the only way to get rid of these rows is for Oracle to provide a concurrent program for the module that users used to load them
- FND_LOBS can take an enormous amount of space and make exporting and importing your database take a long time
- You can also look to store FND_LOBS as secure files, but this requires advanced compression licenses
- Log enhancement requests for more concurrent programs to clean up FND_LOBS
- Look to third-party solutions, such as Informatica
Other suggestions include WORKFLOW, but this requires more research.
For more information, join the Oracle Application User Group and sign up for the Archive and Purge Special Interest Group.
In my last post, I discussed how our Informatica ILM Nearline allows vast amounts of detail data to be accessed at speeds that rival the performance of online systems, which in turn gives business analysts and application managers the power to assess and fine-tune important business initiatives on the basis of actual historical facts. We saw that the promise of Informatica ILM Nearline is basically to give you all the data you want, when and how you want it — without compromising the performance of existing data warehouse and business reporting systems.
Today, I want to consider what this capability means specifically for a business. What are the concrete benefits of implementing Informatica ILM Nearline? Here are a few of the most important ones.
Informatica ILM Nearline enables you to keep all your valuable data available for analysis.
Having more data accessible – more details, covering longer periods – enables a number of improvements in Business Intelligence processes:
- A clearer understanding of emerging trends in the business – what will go well in the future as well as what is now “going south”
- Better support for iterative analyses, enabling more intensive Business Performance Management (BPM)
- Better insight into customer behavior over the long term
- More precise target marketing, bringing a three- to five-fold improvement in campaign yield
Informatica ILM Nearline enables you to dramatically increase information storage and maintain service levels without increasing costs or administration requirements.
- Extremely high compression rates give the ability to store considerably more information in a given hardware configuration
- A substantially reduced data footprint means much faster data processing, enabling effective satisfaction of Service Level Agreements without extensive investments in processing power
- Minimal administration requirements bring reductions in resource costs, and ensure that valuable IT and business resources will not be diverted from important tasks just to manage and maintain the Informatica ILM Nearline implementation
- High data compression also substantially reduces the cost of maintaining a data center by reducing requirements for floor space, air conditioning and so on.
Informatica ILM Nearline simplifies and accelerates Disaster Recovery scenarios.
A reduced data footprint means more data can be moved across existing networks, making Informatica ILM Nearline an ideal infrastructure for implementing and securing an offsite backup process for massive amounts of data.
Informatica ILM Nearline keeps all detail data in an immutable form, available for delivery on request.
Having read-only detail data available on-demand enables quick response to audit requests, avoiding the possibility of costly penalties for non-compliance. Optional security packages can be used to control user access and data privacy.
Informatica ILM Nearline makes it easy to offload data from the online database before making final decisions about what is to be moved to an archiving solution.
The traditional archiving process typically involves extensive analysis of data usage patterns in order to determine what should be moved to relatively inaccessible archival storage. With an Informatica ILM Nearline solution, it’s a simple matter to move large amounts of data out of the online database, thereby improving performance and guaranteeing satisfaction of SLAs, while still keeping the data available for access when required. Data that is determined to be no longer used, but which still needs to be kept around to comply with data retention policies or regulations, can then be easily moved into an archiving solution.
Taken together, these benefits make a strong case for implementing an Informatica ILM Nearline solution when the data tsunami threatens to overwhelm the enterprise data warehouse. In future posts, I will be investigating each of these in more detail.
Data warehouses tend to grow very quickly because they integrate data from multiple sources and maintain years of historical data for analytics. A number of our customers have data warehouses in the hundreds of terabytes to petabytes range. Managing such a large amount of data becomes a challenge. How do you curb runaway costs in such an environment? Completing maintenance tasks within the prescribed window and ensuring acceptable performance are also big challenges.
We have provided best practices to archive aged data from data warehouses. Archiving data will keep the production data size at almost a constant level, reducing infrastructure and maintenance costs, while keeping performance up. At the same time, you can still access the archived data directly if you really need to from any reporting tool. Yet many are loath to move data out of their production system. This year, at Informatica World, we’re going to discuss another method of managing data growth without moving data out of the production data warehouse. I’m not going to tell you what this new method is, yet. You’ll have to come and learn more about it at my breakout session at Informatica World: What’s New from Informatica to Improve Data Warehouse Performance and Lower Costs.
I look forward to seeing all of you at Aria, Las Vegas next month. Also, I am especially excited to see our ILM customers at our second Product Advisory Council again this year.
Alternative Methods of Managing Data Growth and Best Practices for Using Them as Part of an Enterprise Information Lifecycle Management Strategy
Data, whether manually created or machine generated, tends to live on forever, because people hold on to it for fear that they might lose information by destroying it.
There is a saying in Bhagavad Gita:
jaathasya hi dhruvo mr.thyur dhr.uvam janma mr.thasya cha |
thasmaad aparihaarye’rthe’ na thvam sochithum-arhasi ||
“For death is certain to one who is born; to one who is dead, birth is certain; therefore, thou shalt not grieve for what is unavoidable.”
Partitioning and archiving are alternative methods of improving database and application performance. Depending on a database administrator’s comfort level with one technology or method over another, either partitioning or archiving could be implemented to address performance issues due to data growth in production applications. But what are the best practices for using one method or the other, and how can they be used better together?