Tag Archives: application retirement
As the final part of our series, Architecting a Database Archiving Solution, we will review a process I use to assess a client's existing total cost of ownership (TCO) for a database application and how to justify a database archiving solution. The key metrics I begin with are listed and explained below:
During this series of “Architecting a Database Archiving Solution”, we discussed the Anatomy of A Database Archiving Solution and End User Access Requirements. In this post we will review the archive repository options at a very high level. Each option has its pros and cons and needs to be evaluated in more detail to determine which will be the best fit for your situation.
Series: Architecting A Database Archiving Solution Part 3: End User Access & Performance Expectations
In my previous post in the series, Architecting a Database Archiving Solution, we discussed the major architecture components. In this post, we will focus on how end user access requirements and expected performance service levels drive the core of the architecture discussion.
End user access requirements can be determined by answering the following questions. When data is archived from a source database:
- How long does the archived data need to be retained? The longer the retention period, the more the solution architecture needs to account for potentially significant data volumes and technology upgrades or obsolescence. This will determine the cost factors of keeping data online in a database or an archive file versus nearline or offline on other media such as tape.
Before we can go into more detail on how to architect a database archiving solution, let's review at a high level the major components of one. In general, a database archiving solution comprises four key pieces – application metadata, a policy engine, an archive repository, and an archive access layer.
Application Metadata – This component contains information that is used to define what tables will participate in a database archiving activity. It stores the relationships between those tables, including database or application level constraints and any criteria that needs to be considered when selecting data that will be archived. The metadata for packaged applications, such as Oracle E-Business Suite, PeopleSoft, or SAP can usually be purchased in pre-populated repositories, such as Informatica’s Application Accelerators for Data Archive to speed implementation times.
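The table relationships and selection criteria this metadata captures can be sketched as a simple structure. The table names, join conditions, and criteria below are hypothetical illustrations; real products such as Informatica's Application Accelerators use their own repository formats:

```python
# A minimal, hypothetical sketch of application metadata for archiving.
# Table names, join conditions, and criteria are illustrative only.
order_entity = {
    "driving_table": "orders",                 # root table of the transactional entity
    "children": {                              # related tables and how they join in
        "order_lines": "order_lines.order_id = orders.order_id",
        "invoices": "invoices.order_id = orders.order_id",
    },
    "criteria": ["orders.status = 'Closed'"],  # rules rows must meet to be archivable
}

# Child tables that must be archived together with each order
related_tables = sorted(order_entity["children"])
```

The key idea is that the metadata, not the policy engine, owns the knowledge of which tables form a complete business entity and must move together.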
Policy Engine – This component is where business users define their retention policies in terms of time durations and possibly other related rules (e.g., keep all financial data for the current quarter plus seven years, and the general and sub-ledgers must have a status of "Closed"). The policy engine is also responsible for executing the policy within the database and moving data to a configured archive repository. This involves translating the policy and metadata into structured query language that the database understands (e.g., SELECT * FROM table_a WHERE column_1 < CURRENT_DATE - INTERVAL '2' YEAR AND column_2 = 'Closed'). Depending on the policy, users may want to move the data to an archive (meaning it is removed from the source application) or just create a copy in the archive. The policy engine takes care of all of those steps.
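As a rough sketch of that translation step, the following function (assumed table and column names; SQLite-style date syntax) builds a selection query from a retention policy. A real policy engine would also handle SQL dialects, quoting, and the related child tables defined in the metadata:

```python
def policy_to_sql(table, date_column, retention_years, extra_criteria=()):
    """Translate a retention policy into a SELECT that identifies archivable
    rows. Illustrative only: real engines handle dialects, quoting, and the
    child tables that must move along with the driving table."""
    predicates = [f"{date_column} < DATE('now', '-{retention_years} years')"]
    predicates.extend(extra_criteria)
    return f"SELECT * FROM {table} WHERE " + " AND ".join(predicates)

# A "keep seven years of closed ledger data" policy, expressed as SQL
sql = policy_to_sql("gl_ledger", "period_end", 7, ["status = 'Closed'"])
```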
Archive Repository – This stores the archived database records. The choices for the repository vary and will be determined by a number of factors, typically driven by end user archive access requirements (we will discuss these in the next post). Choices include another archive database, highly compressed queryable archive files, and XML files, to name a few.
Archive Access Layer – This is the mechanism that makes the database archive accessible to a native application, a standard business reporting tool, or a data discovery portal. Again, these options vary and will be determined by the end user access requirements and the technology standards in the organization's data center.
In the next post in this series, we will discuss in further detail how end user access and performance requirements impact the selection of these components.
Julie Lockner, Founder, www.CentricInfo.com
Classifying database data for an ILM project requires a process for categorization and classification that involves the business owners, Records Management, Security, IT, DBAs, and developers. In an ideal scenario, a company has documented every single business process down to data flows and database tables, and IT can map database tables to the underlying infrastructure. Since most of us work in realistic scenarios, here is one approach you can take to classify information without knowing all the interrelations.
In one of my earlier blogs, I wrote about why you still need database archiving when you already partition your database. In a similar vein, many people also ask me why you still need to archive when you already have database compression to reduce your storage capacity and cost. The benefits of archiving, which you can't achieve with just compression and/or partitioning, are still the same:
- Archiving allows you to move data volumes entirely out of the production system to improve response time and reduce infrastructure costs. Why keep unused data, even if compressed, on high-cost server infrastructure when you don't need to? Why add overhead to query processing when you can remove the data from being processed at all?
- Avoid server and software license upgrades. By removing inactive data from the database, you no longer require as much processing power, and you can keep your existing server without having to add CPU cores and additional licenses for your database and application. This further reduces costs.
- Reduce overall administration and maintenance costs. If you still keep unused data around in your production system, you still need to back it up, replicate it for high availability, clone it for non-production copies, recover it in the event of a disaster, upgrade it, organize and partition it, and consider it as part of your performance tuning strategy. Yes, it will take less time to back up, copy, and restore, since the data is compressed and smaller, but why include that data in production maintenance activities at all if it's infrequently used?
- Remove the multiplier effect. The cost of additional data volume in production systems is multiplied when you consider how many copies you have of that production data in mirrors, backups, clones, non-production systems, and reporting warehouses. The multiplier is smaller when the data is compressed, but it's still wasted capacity in multiple locations, not to mention the additional server, software license, and maintenance costs associated with the additional volumes in those multiple copies. So it's best to just reduce the data volume at the source.
- Ensure compliance by enforcing retention and disposition policies. As I discussed in my previous blog on the difference between archiving and backup, archiving is the solution for long term data retention. Archiving solutions, such as Informatica Data Archive, have integration points with records management software or provide built-in retention management to enforce the retention of data for a specified period based on policies. During that period, the immutability and authenticity of the archived data is ensured, and when the retention period expires, records are automatically purged after the appropriate review and approval process. Regulated data needs to be retained long enough to comply with regulations, but keeping data for too long can also become a legal liability. So it’s important that expired records are purged in a timely manner. Just keeping data in production databases indefinitely doesn’t help you to reduce your compliance and legal risks.
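The retention-and-disposition flow in that last point can be sketched roughly as follows. The record structure and `approved` callback are hypothetical; products like Informatica Data Archive integrate with records management software for the review and approval step:

```python
from datetime import date

def eligible_for_disposition(records, today, approved):
    """Return archived records whose retention period has expired and whose
    purge has passed review/approval. Hypothetical record structure; a real
    retention manager also enforces immutability during the retention period."""
    return [
        r for r in records
        if r["archived_on"].replace(year=r["archived_on"].year + r["retention_years"]) <= today
        and approved(r)
    ]

archive = [
    {"id": 1, "archived_on": date(2002, 3, 31), "retention_years": 7},
    {"id": 2, "archived_on": date(2008, 6, 30), "retention_years": 7},
]
# Record 1 expired in 2009 and is eligible; record 2 is retained until 2015.
purge_list = eligible_for_disposition(archive, date(2010, 1, 1), lambda r: True)
```

The important design point is that nothing is purged on expiry alone: expiry only makes a record *eligible*, and the approval gate stays in the loop.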
Implementing enterprise application and database archiving is simply a best practice. The best way to improve performance and reduce infrastructure and maintenance costs is to reduce the data volume in your production systems. Why increase overhead when you don't have to? Today's archiving solutions allow you to maintain easy access to the data after archival, so there is no reason to keep data around just for the sake of accessibility. By moving inactive but regulated data to a central archival store, you can uniformly enforce retention policies. At the same time, you can reduce the time and cost of eDiscovery by making all types of data centrally and easily searchable.
Many of my clients struggle with how to design a database archiving solution. Database archiving is not as clean as email or file archiving. Project owners who have done their research understand why they need an archiving solution: to address performance degradation, increased costs, or both, due to uncontrolled data volume growth in their production databases. Where help is most appreciated is during the planning phase of a project: defining which requirements are critical and how those requirements translate into an archive architecture.
Oracle has been relatively quiet of late about Oracle Fusion Applications availability. But as the first applications are scheduled to roll out this year (2010), it might be a good time to revisit your Fusion Applications upgrade plans. Is data archiving a part of them? If it isn't, it should be.
Last month I outlined the reasons why IT organizations should consider eliminating certain applications. The savings from redirecting hardware, software maintenance licenses, and full-time equivalents (FTEs) with specialized skill sets to other, more critical projects can be significant. I also noted that all these benefits can be achieved so long as you can continue to retain and access the data easily and cost-effectively for compliance and reporting purposes.
A question that might linger in your mind is how much cost savings you can really achieve if you retire the application but still retain its data in a database, with the associated maintenance cost. But what if you had the option to store your data at a fraction of its original size on common file systems, with full auditability and built-in retention management, while still maintaining on-demand query access for your business users? The resulting savings could be so significant that application retirement might not only be an option, but an overwhelmingly compelling initiative that must be implemented.
One of the key deliverables for an ILM project that my team is wrapping up is a metadata repository of all the database tables we have applied an ILM solution to. In this repository, we list the database, schema, and table name; what Record Series the data belongs to; the corresponding retention period and criteria; the business owner; and source information. Not only will this repository be used to archive and purge data on a regular, operational basis, but it will also be used by Records Management to track Records Retention compliance.
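As a rough sketch, such a repository could be modeled as a single table holding those fields. The schema and sample row below are my own illustration, not the project's actual deliverable, shown here using SQLite:

```python
import sqlite3

# Illustrative schema for an ILM metadata repository (hypothetical names).
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE ilm_metadata (
        database_name   TEXT,
        schema_name     TEXT,
        table_name      TEXT,
        record_series   TEXT,  -- Records Management classification
        retention       TEXT,  -- e.g. 'current quarter + 7 years'
        criteria        TEXT,  -- predicate identifying archivable rows
        business_owner  TEXT,
        source_info     TEXT
    )
""")
con.execute(
    "INSERT INTO ilm_metadata VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("erp_prod", "gl", "gl_ledger", "Financial Records",
     "current quarter + 7 years", "status = 'Closed'",
     "Finance", "Oracle E-Business Suite"),
)
rows = con.execute("SELECT table_name, retention FROM ilm_metadata").fetchall()
```

Because both the archive jobs and Records Management read from the same table, there is a single source of truth for what is retained, for how long, and why.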