Tag Archives: data growth
Why Backups Are Terrible Archives
Businesses retain information in an enterprise data archive either for compliance (to adhere to data retention regulations) or because business users are afraid to let go of data they are used to having access to. Many IT teams have told us they retain data in archives because they want to cut infrastructure costs and the business has not clearly articulated its retention requirements. As a result, enterprise data archiving has morphed into serving multiple purposes for IT: it eliminates the costs of maintaining aging data in production applications and lets business users access the information on demand, all while adhering to whatever retention policies, if any, are known or defined.
Lean Data Warehouse – Clean Up The Waste
Many years ago (over 30, to be precise), I can recall walking the halls of more than one Fortune 500 company and seeing four-foot-high stacks of boxes of computer printouts in the hallways outside managers’ offices. In fact, it was not uncommon to see pallet loads of computer printouts in some companies. When I asked one manager what the reports were and why they had so many, he said, “We don’t look at the reports any more, but we don’t know how to get the data center to stop sending them.”
Dodd-Frank Legislation and Structured Data Retention
The “Dodd-Frank Wall Street Reform and Consumer Protection Act” was recently passed by the US federal government to regulate financial institutions. Under this legislation, more “watchdog” agencies will audit banks and lending and investment institutions to ensure compliance. For example, an Office of Financial Research within the Department of the Treasury will be responsible for collecting and analyzing data. The legislation brings with it a higher risk of fines for non-compliance.
Start Running Because The Data Tsunami Is Approaching
The phrase ‘Data Tsunami’ has been used by numerous authors in the last few months, and it is difficult to find a more suitable analogy: what is approaching is of such an increased order of magnitude that the IT industry’s existing expectations for data growth will be swamped in the next few years.
However impressive a spectacle a tsunami is, it still wreaks havoc on those who are unprepared or who believe they can tread water and simply float to the surface once the trouble has passed.
Making Big Data A Little Smaller
Big Data. The term has certainly caught on and the phenomenon is real. Every nanosecond of every day, more and more data is being created at ever-increasing speeds. And since storage has become so economical, in both cost and footprint, there are fewer compelling reasons for the enterprise to manage down the size of its databases. In response, new and emerging technologies such as universal database connectivity, complex event processing, connectivity to social network feeds and in-memory processing have been developed to better manage Big Data’s scale. While this is great news for the enterprise, it comes with some challenges with respect to business analytics.
Efficiency Is The Name Of Today’s Game
In my last post I talked about airlines becoming more efficient (or not), and I started thinking about how everything today is about efficiency. That is not a bad thing when you consider the growth of data volumes we’re seeing everywhere (go to YouTube and search “exponential times” for some interesting videos). Efficiency is necessary for scale, but it is also about better use of resources (think Green).
Series: Architecting A Database Archiving Solution Final Part 5: Data Growth Assessments
As the final part of our series, Architecting A Database Archiving Solution, we will review a process I use to assess a client’s existing total cost of ownership for a database application and to justify a database archiving solution. The key metrics I begin with are listed and explained below:
Series: Architecting A Database Archiving Solution Part 4: Archive Repository Options
So far in this series, “Architecting a Database Archiving Solution”, we have discussed the Anatomy of A Database Archiving Solution and End User Access Requirements. In this post we will review the archive repository options at a very high level. Each option has its pros and cons and needs to be evaluated in more detail to determine which will be the best fit for your situation.
Series: Architecting A Database Archiving Solution Part 3: End User Access & Performance Expectations
In my previous post in this series, Architecting A Database Archiving Solution, we discussed the major architecture components. In this post, we will focus on how end user access requirements and expected performance service levels drive the core of an architecture discussion.
End user access requirements can be determined by answering the following questions. When data is archived from a source database:
- How long does the archived data need to be retained? The longer the retention period, the more the solution architecture needs to account for potentially significant data volumes and technology upgrades or obsolescence. This will determine cost factors of keeping data online in a database or an archive file, versus nearline or offline on other media such as tape.
Architecting A Database Archiving Solution Part 2: The Anatomy Of A Database Archiving Solution
Before we can go into more detail on how to architect a database archiving solution, let’s review at a high level the major components of one. In general, a database archiving solution consists of four key pieces: application metadata, a policy engine, an archive repository and an archive access layer.
Application Metadata – This component contains information that is used to define which tables will participate in a database archiving activity. It stores the relationships between those tables, including database or application level constraints and any criteria that need to be considered when selecting data to be archived. The metadata for packaged applications, such as Oracle E-Business Suite, PeopleSoft, or SAP, can usually be purchased in pre-populated repositories, such as Informatica’s Application Accelerators for Data Archive, to speed implementation times.
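To make the idea of application metadata more concrete, here is a minimal Python sketch of what such a record might hold. It is an illustration only, not the format of any real product: the class names (ForeignKey, ArchiveEntity), the field names and the invoice tables in the usage note are all hypothetical. It shows one plausible use of the stored relationships, which is working out the order in which tables can safely be copied to, or deleted from, an archive.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class ForeignKey:
    """Hypothetical record of one parent/child relationship between tables."""
    child_table: str
    child_column: str
    parent_table: str
    parent_column: str


@dataclass
class ArchiveEntity:
    """Hypothetical metadata for one archivable business entity."""
    name: str
    tables: list[str]
    foreign_keys: list[ForeignKey] = field(default_factory=list)

    def copy_order(self) -> list[str]:
        """Return tables with parents before children.

        Copying in this order means a child row never arrives in the
        archive before the parent row it refers to.
        """
        # Map each table to the set of tables it depends on (its parents).
        graph: dict[str, set[str]] = {table: set() for table in self.tables}
        for fk in self.foreign_keys:
            graph.setdefault(fk.child_table, set()).add(fk.parent_table)
        return list(TopologicalSorter(graph).static_order())

    def delete_order(self) -> list[str]:
        """Return tables with children before parents, for safe deletion."""
        return list(reversed(self.copy_order()))
```

For example, an entity built from the made-up tables invoice_header, invoice_line and invoice_line_tax, where each line refers to a header and each tax row refers to a line, would be copied header first and deleted tax rows first. A circular relationship makes TopologicalSorter raise a CycleError, which a real tool would need to handle explicitly.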
Policy Engine – This component is where business users define their retention policies in terms of time durations and possibly other related rules (e.g. keep all financial data for the current quarter plus seven years, and require the general and sub ledgers to have a status of “Closed”). The policy engine is also responsible for executing the policy within the database and moving data to a configured archive repository. This involves translating the policy and metadata into structured query language that the database understands (in pseudo-SQL: SELECT * FROM TABLE_A WHERE COLUMN_1 is more than 2 years old AND COLUMN_2 = 'Closed'). Depending on the policy, users may want to move the data to an archive (meaning it is removed from the source application) or just create a copy in the archive. The policy engine takes care of all those steps.
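The translation step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular archiving product works: RetentionPolicy, eligible_where and run_policy are hypothetical names, the policy is reduced to "older than N years and in a given status" (the "current quarter plus seven years" rule would need extra date logic), dates are assumed to be stored as ISO-8601 text, and SQLite from the Python standard library stands in for the source database.

```python
import sqlite3
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical policy record; all field names are illustrative."""
    table: str            # source table the policy applies to
    date_column: str      # column holding the record's business date
    retain_years: int     # rows older than this are eligible for archiving
    status_column: str    # column holding the record's status
    required_status: str  # only rows with this status are eligible


def cutoff_date(today: date, years: int) -> date:
    """Return the date `years` before `today`, mapping Feb 29 to Feb 28."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:  # today is Feb 29 and the target year is not a leap year
        return today.replace(year=today.year - years, day=28)


def eligible_where(policy: RetentionPolicy, today: date) -> tuple[str, tuple]:
    """Translate a policy into a parameterized WHERE clause."""
    # Table and column names come from trusted metadata, never from user
    # input; the values are passed as bound parameters.
    where = f"{policy.date_column} < ? AND {policy.status_column} = ?"
    params = (cutoff_date(today, policy.retain_years).isoformat(),
              policy.required_status)
    return where, params


def run_policy(conn: sqlite3.Connection, policy: RetentionPolicy,
               archive_table: str, today: date, move: bool) -> int:
    """Copy eligible rows to the archive table; delete them too if `move`."""
    where, params = eligible_where(policy, today)
    with conn:  # a single transaction: commit on success, roll back on error
        cursor = conn.execute(
            f"INSERT INTO {archive_table} "
            f"SELECT * FROM {policy.table} WHERE {where}",
            params,
        )
        copied = cursor.rowcount
        if move:
            conn.execute(f"DELETE FROM {policy.table} WHERE {where}", params)
    return copied
```

Running the copy and the delete inside one transaction is a deliberate choice in this sketch: if the delete fails, the copy is rolled back as well, so a row is never lost or duplicated part-way through a move. In practice the archive repository is often a separate system, in which case the move would need its own verification step before the source rows are removed.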
Archive Repository – This stores the database archive records. The choices for the repository vary and will be determined by a number of factors, typically driven by end user archive access requirements (we will discuss this in the next post). Some of these choices include another archive database, highly compressed query-able archive files, and XML files, to name a few.
Archive Access Layer – This is the mechanism that makes the database archive accessible either to a native application, a standard business reporting tool, or a data discovery portal. Again, these options vary and will be determined by the end user access requirements and the technology standards in the organization’s data center.
In the next post in this series, we will discuss in further detail how End User Access and Performance Requirements impact the selection of these components.
Julie Lockner, Founder, www.CentricInfo.com
