Tag Archives: Database Performance
Data warehouses tend to grow very quickly because they integrate data from multiple sources and maintain years of historical data for analytics. A number of our customers have data warehouses ranging from hundreds of terabytes to petabytes. Managing that much data is a challenge. How do you curb runaway costs in such an environment? Completing maintenance tasks within the prescribed window and ensuring acceptable performance are also big challenges.
We have provided best practices for archiving aged data from data warehouses. Archiving keeps the production data size at an almost constant level, reducing infrastructure and maintenance costs while keeping performance up. At the same time, you can still access the archived data directly from any reporting tool if you really need to. Yet many are loath to move data out of their production system. This year, at Informatica World, we’re going to discuss another method of managing data growth without moving data out of the production data warehouse. I’m not going to tell you what this new method is yet. You’ll have to come and learn more about it at my breakout session at Informatica World: What’s New from Informatica to Improve Data Warehouse Performance and Lower Costs.
I look forward to seeing all of you at the Aria in Las Vegas next month. I am also especially excited to see our ILM customers again at our second Product Advisory Council this year.
It is that time of year when some reflect on the past or ponder the future. If part of your end-of-year ritual includes cleaning out a cluttered closet or room in the house, consider the same ritual for the data in your databases.
In December 2005, Sun Microsystems conducted an interview with Bill Inmon, the father of the data warehouse concept. He said, “ILM keeps a data warehouse from costing huge amounts of money and maintains good performance consistently throughout the data warehouse environment.” Four years later, the average size of a data warehouse has increased by 200%, surpassing the multi-terabyte size benchmark.
With these mammoth databases comes an increase in the cost of managing them and a potential deterioration in performance. It is common practice to use techniques like indexing and database partitioning to address query performance issues in very large databases, but those techniques do not address the challenges associated with the raw volume of data.