Category Archives: Data Governance
Data Management Issue Categories
In my last post I started to talk about ideas for classifying data management issues, on the reasoning that classification helps determine whether acquiring a particular solution will actually address the core issues. I have used this categorization with some of our customers, and the process of classification does lend some clarity when considering solutions. There are five categories: (more…)
Lean Data Warehouse – Clean Up The Waste
Many years ago (over 30 to be precise) I can recall walking the halls of more than one Fortune 500 company and seeing four-foot-high stacks of boxes with computer printouts in the hallway outside of managers’ offices. In fact, it was not uncommon to see pallet-loads of computer printouts in some companies. When I asked one manager what the reports were and why they had so many, he said, “We don’t look at the reports anymore, but we don’t know how to get the data center to stop sending them.” (more…)
How Do You Handle the Recent Storage Shortage?
Gartner hosted a webinar on January 10, 2012: Gartner Worldwide IT Spending Forecast. One of the topics covered was industry IT spend for 2012.
In covering that topic, they made a point of saying that, due to severe flooding in Thailand, they expect storage to be in short supply (as much as a 29% global shortfall) through the end of 2012. The price of storage per GB is expected to increase as a result. They recommended finding alternatives to purchasing storage to keep costs down. (more…)
Informatica Positioned in the Recent Gartner Magic Quadrant for MDM
Gartner recently published its annual Magic Quadrant for Master Data Management of Customer Data Solutions, which “positions MDM of customer data solution vendors (and their products) on the basis of their Completeness of Vision relative to the market and their Ability to Execute on that vision.” The growth of the MDM market has been phenomenal: $1.6 billion in 2011, up 21% from 2010, and projected to grow at the same rate to $1.9 billion in 2012. (more…)
Reading The Tea Leaves: Predictions For Data Quality In 2012
Following up from my previous post on 2011 reflections, it’s now time to take a look at the year ahead and consider what key trends will likely impact the world of data quality as we know it. As I mentioned in my previous post, we saw continued interest in data quality across all industries and I expect that trend to only continue to pick up steam in 2012. Here are three areas in particular that I foresee will rise to the surface: (more…)
Optimize Data Warehouses with Data Usage Monitoring and Data Warehouse Archiving
Data warehouses are applications, so why not manage them like applications? In fact, data grows at a much faster rate in data warehouses, since they integrate data from multiple applications and cater to many different groups of users who need different types of analysis. Data warehouses also keep historical data for a long time, so data volumes in these systems keep compounding. Infrastructure costs also escalate quickly, since analytical processing on large amounts of data requires big, beefy boxes, not to mention the software license and maintenance costs that come with such a large amount of data. Imagine how much backup media is required to back up tens to hundreds of terabytes of data warehouse data on a regular basis. But do you really need to keep all that historical data in production?
One of the challenges of managing data growth in data warehouses is that it’s hard to determine which data is actually used, which data is no longer being used, or even whether the data was ever used at all. Unlike transactional systems, where the application logic determines when records are no longer being transacted upon, the usage of analytical data in data warehouses follows no definite business rules. Age or seasonality may determine data usage in data warehouses, but business users are usually loath to give up having all that data at their fingertips. The only clear-cut way to prove that some data is no longer being used in a data warehouse is to monitor its usage.
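As a rough illustration of what usage monitoring might produce, here is a minimal Python sketch. It assumes the warehouse can export an access log of (table, partition, timestamp) records and a list of all known partitions. The function name `find_unused_partitions`, the one-year idle window, and the sample data are all hypothetical, not taken from any particular monitoring product.

```python
from datetime import datetime, timedelta


def find_unused_partitions(all_partitions, access_log, as_of, idle_days=365):
    """Split partitions into never-accessed and dormant groups.

    all_partitions: iterable of (table, partition) tuples known to the warehouse
    access_log:     iterable of (table, partition, accessed_at) tuples
    as_of:          the date the analysis is run
    idle_days:      assumed idle window; partitions untouched for longer are dormant
    """
    cutoff = as_of - timedelta(days=idle_days)

    # Record the most recent access per partition.
    last_access = {}
    for table, partition, accessed_at in access_log:
        key = (table, partition)
        if key not in last_access or accessed_at > last_access[key]:
            last_access[key] = accessed_at

    never_used, dormant = [], []
    for key in all_partitions:
        seen = last_access.get(key)
        if seen is None:
            never_used.append(key)   # no access on record at all
        elif seen < cutoff:
            dormant.append(key)      # accessed once, but not recently
    return sorted(never_used), sorted(dormant)


# Hypothetical sample data for illustration only.
partitions = [("sales", "2009"), ("sales", "2010"),
              ("sales", "2011"), ("returns", "2008")]
log = [
    ("sales", "2011", datetime(2012, 1, 5)),
    ("sales", "2010", datetime(2011, 12, 20)),
    ("sales", "2009", datetime(2010, 3, 1)),
    ("sales", "2011", datetime(2011, 11, 2)),
]

never_used, dormant = find_unused_partitions(
    partitions, log, as_of=datetime(2012, 1, 31)
)
print("never used:", never_used)
print("dormant:", dormant)
```

Partitions in either group are candidates for archiving rather than proof that the data can be deleted; the idle window would need to cover seasonal usage patterns before anything is moved out of production.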
Characteristics of Entities and Characteristics of Roles
I am 5’9” tall. I have brown-ish eyes (actually they seem to have adjusted slightly to being hazel-ish sometimes, but my driver’s license says brown). I was born on a specific date in a specific year. My first name is Howard, my middle name is David, and my last name is Loshin. These are all characteristics of me as an individual, and for the most part these attributes are static. Yes, I might shrink as I get older, and I could change my name; I could also lie about my birth date and/or year. But these are relatively drastic changes, and for most individuals, they are relatively good criteria as a start for determining an identity for a specific entity. (more…)
2011 #Cloud Integration Predictions in Review
‘Tis the season, but before I post my 2012 cloud integration predictions, I thought I’d spend a few minutes scoring how I did this year. Last year I predicted the following:
- Cloud adoption will drive two-tier cloud integration strategies
- LOB-driven cloud integration projects will lead to strategic MDM initiatives
- Cloud integration platforms will emerge
- Database.com will gain enterprise adoption
- Private Cloud confusion will continue
Here’s my assessment: (more…)
Dodd-Frank Legislation and Structured Data Retention
The “Dodd-Frank Wall Street Reform and Consumer Protection Act” has recently been passed by the US federal government to regulate financial institutions. Under this legislation, more “watchdog” agencies will be auditing banks and lending and investment institutions to ensure compliance. As an example, there will be an Office of Financial Research within the Treasury Department responsible for collecting and analyzing data. This legislation brings with it a higher risk of fines for non-compliance. (more…)
Don’t Forget to Manage the Retention and Disposal of Data on Hadoop
In an article by Mark Brunelli, “Forrester’s Kobielus: It’s time for a Hadoop standards body,” James Kobielus of Forrester Research argues that Hadoop is still a bit immature and needs to adopt standards. Mr. Kobielus goes on to indicate that when implementing Hadoop, “whether it’s through a data warehouse or Hadoop cluster, you’re talking about petabytes or multiple hundreds of terabytes worth of storage.” Hadoop, while designed to access these large data volumes (which can include social media data), does nothing to manage retention of that data. (more…)

