Category Archives: Database Archiving
Data warehouses tend to grow very quickly because they integrate data from multiple sources and maintain years of historical data for analytics. A number of our customers have data warehouses in the hundreds of terabytes to petabytes range. Managing that much data becomes a challenge: how do you curb runaway costs, complete maintenance tasks within the prescribed window, and still ensure acceptable performance?
We have provided best practices for archiving aged data from data warehouses. Archiving keeps the production data size at a nearly constant level, reducing infrastructure and maintenance costs while keeping performance up. At the same time, you can still access the archived data directly from any reporting tool if you need to. Yet many are loath to move data out of their production system. This year, at Informatica World, we’re going to discuss another method of managing data growth without moving data out of the production data warehouse. I’m not going to tell you what this new method is, yet. You’ll have to come and learn more about it at my breakout session at Informatica World: What’s New from Informatica to Improve Data Warehouse Performance and Lower Costs.
I look forward to seeing all of you at Aria, Las Vegas next month. Also, I am especially excited to see our ILM customers at our second Product Advisory Council again this year.
Alternative Methods of Managing Data Growth and Best Practices for Using Them as Part of an Enterprise Information Lifecycle Management Strategy
Data, whether manually created or machine generated, tends to live on forever, because people hold on to it for fear that they might lose information by destroying it.
There is a saying in Bhagavad Gita:
jaathasya hi dhruvo mr.thyur dhr.uvam janma mr.thasya cha |
thasmaad aparihaarye’rthe’ na thvam sochithum-arhasi ||
“For death is certain to one who is born; to one who is dead, birth is certain; therefore, thou shalt not grieve for what is unavoidable.”
Partitioning and archiving are both methods of improving database and application performance. Depending on a database administrator’s comfort level with one technology or method over another, either partitioning or archiving could be implemented to address performance issues caused by data growth in production applications. But what are the best practices for using one or the other method, and how can they be used better together?
The need for more robust data retention management and enforcement is more than just good data management practice. It is a legal requirement for financial services organizations across the globe to comply with the myriad local, federal, and international laws that mandate the retention of certain types of data. For example:
- Dodd-Frank Act: Under Dodd-Frank, firms are required to maintain records for no less than five years.
- Basel Accord: The Basel guidelines call for the retention of risk and transaction data over a period of three to seven years. Noncompliance can result in significant fines and penalties.
- MiFID II: Transactional data must be stored in a way that meets the new records retention requirements (such data must now be retained for up to five years) and can be easily retrieved, in context, to prove best execution.
- Bank Secrecy Act: All BSA records must be retained for a period of five years and must be filed or stored in such a way as to be accessible within a reasonable period of time.
- Payment Card Industry Data Security Standard (PCI DSS): PCI DSS requires card issuers and acquirers to retain an audit trail history for a period that is consistent with its effective use, as well as legal regulations. An audit history usually covers a period of at least one year, with a minimum of three months available online.
- Sarbanes-Oxley: Section 103 requires firms to prepare and maintain, for a period of not less than seven years, audit work papers and other information related to any audit report, in sufficient detail to support the conclusions reached and reported to external regulators.
Each of these laws has distinct data collection, analysis, and retention requirements that must be factored into existing information management practices. Unfortunately, existing data archiving methods, including traditional database and tape backup methods, lack the capabilities required to effectively enforce and automate data retention policies that comply with industry regulations. In addition, a number of internal and external trends make it even more difficult for financial institutions to archive and retain required data.
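As a hypothetical illustration of how such retention rules might be encoded and enforced (the record-type names, periods, and functions below are assumptions for this sketch, not any vendor’s API), an archiving tool could treat retention policies as data and apply a simple purge-eligibility check:

```python
from datetime import date, timedelta

# Hypothetical retention periods (in years) drawn from the regulations above;
# real policies must come from legal and compliance review.
RETENTION_YEARS = {
    "dodd_frank_record": 5,
    "basel_risk_data": 7,        # upper bound of the 3-7 year range
    "mifid2_transaction": 5,
    "bsa_record": 5,
    "sox_audit_workpaper": 7,
}

def is_purgeable(record_type: str, created: date, today: date) -> bool:
    """Return True only if the record's mandated retention period has elapsed."""
    years = RETENTION_YEARS.get(record_type)
    if years is None:
        # Unknown record types are never purged automatically.
        return False
    # Approximate year arithmetic; production code should be calendar-exact.
    return today >= created + timedelta(days=365 * years)

def is_purgeable_with_hold(record_type: str, created: date, today: date,
                           legal_hold: bool = False) -> bool:
    """A record under legal hold is never purgeable, regardless of age."""
    return (not legal_hold) and is_purgeable(record_type, created, today)
```

The point of the sketch is that retention becomes an automated, auditable rule rather than a manual decision, and that a legal-hold flag always overrides age-based purging.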
In this video, Richard Cramer, chief healthcare strategist, and Claudia Chandra, senior director, product management, ILM, Informatica, discuss healthcare and application retirement.
During this discussion (the first of two videos), Richard and Claudia cover the following topics as they relate to healthcare:
- The business case for application retirement
- Additional drivers for application retirement
The second video discusses application retirement project scope, enterprise IT initiatives, and how application retirement is the fastest way for IT to save money now.
Just like your house needs yearly spring cleaning and you need to regularly throw out old junk, your application portfolio needs periodic review and rationalization to identify legacy, redundant applications that can be decommissioned to reduce bloat and save costs. If you have a hard time letting go of old stuff, it’s probably even harder for your application users to let go of access to their data. However, retiring applications doesn’t have to mean that you also lose the data within them. If the data within those applications is still needed for periodic reporting or for regulatory compliance, there are still ways to retain the data without maintaining the application.
Gartner hosted a webinar on January 10, 2012: Gartner Worldwide IT Spending Forecast. One of the topics covered was industry IT spend for 2012.
In covering that topic, they made a point of saying that due to severe flooding in Thailand, they expect storage to be in short supply (as much as a 29% global shortfall) through the end of 2012. The price of storage per gigabyte is expected to increase as a result, and supplies will fall short of demand. They recommended finding alternatives to purchasing storage to keep costs down.
Data warehouses are applications, so why not manage them like one? In fact, data grows at a much faster rate in data warehouses, since they integrate data from multiple applications and cater to many different groups of users who need different types of analysis. Data warehouses also keep historical data for a long time, so data grows exponentially in these systems. Infrastructure costs in data warehouses also escalate quickly, since analytical processing on large amounts of data requires big, beefy boxes. Not to mention the software license and maintenance costs for such a large amount of data. Imagine how much backup media is required to back up tens to hundreds of terabytes of data warehouse data on a regular basis. But do you really need to keep all that historical data in production?
One of the challenges of managing data growth in data warehouses is that it’s hard to determine which data is actually used, which data is no longer being used, or even whether the data was ever used at all. Unlike transactional systems, where the application logic determines when records are no longer being transacted upon, the usage of analytical data in data warehouses has no definite business rules. Age or seasonality may determine data usage in data warehouses, but business users are usually loath to let go of the availability of all that data at their fingertips. The only clear-cut way to prove that some data is no longer being used in a data warehouse is to monitor its usage.
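One simple way to approach that kind of usage monitoring, sketched below under the assumption that query text is available as plain log lines (real warehouses usually expose query history through system views instead), is to count how often each table is referenced over a monitoring window; the function and table names here are illustrative:

```python
import re
from collections import Counter

# Crude extraction of table names following FROM or JOIN in logged SQL.
# A production monitor would use the database's own query-history catalog.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)

def table_usage(query_log_lines):
    """Count FROM/JOIN references per table across a set of logged queries."""
    counts = Counter()
    for line in query_log_lines:
        counts.update(name.lower() for name in TABLE_REF.findall(line))
    return counts

def cold_tables(all_tables, counts):
    """Tables never referenced during the window are archiving candidates."""
    return sorted(t for t in all_tables if counts[t.lower()] == 0)
```

Run over a long enough window (a full year, to capture seasonality), the zero-reference tables are evidence you can show business users before proposing to archive their data.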
Database partitioning and database archiving are both methods for improving application performance. Many IT organizations use one or the other, but using them together can provide additional incremental value to an organization.
Database partitioning is a well-known method to DBAs and is supported by most commercially available databases. The benefits of partitioning include faster queries, because the optimizer can prune partitions that a query does not touch, and easier maintenance, because operations such as backups and index rebuilds can run on individual partitions.
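To make the partition-pruning idea concrete, here is a minimal sketch (the table name, monthly scheme, and functions are illustrative assumptions, not tied to any specific database) of how date-range partitioning lets a query touch only the partitions that overlap its date filter:

```python
from datetime import date

def month_partition(d: date) -> str:
    """Route a row to a monthly partition based on its date key."""
    return f"orders_{d.year}_{d.month:02d}"

def partitions_for_range(start: date, end: date):
    """Partition pruning: only partitions overlapping [start, end] are scanned."""
    parts = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        parts.append(f"orders_{y}_{m:02d}")
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return parts
```

A query filtered to a three-month window scans three partitions instead of the whole table, and archiving becomes a matter of detaching the oldest partitions rather than deleting rows.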
Enterprise Applications Myth #2: You Only Need to Focus on the Application in Any Modernization Initiative
This is the second in a series of myth-busting posts. Myth #1 was “Apps and Data Live and Die Together”.
Myth #2: When embarking on an application modernization initiative, either doing a significant application upgrade or entirely replacing legacy applications, you really only need to focus on the application and making sure that it aligns with your business processes.