Category Archives: Data Warehousing
This year marks Informatica’s 20th anniversary: twenty years of solving the problem of getting data from point A to point B, improving its quality, establishing a single view and managing it over its life-cycle. Yet after 20 years of innovation and leadership in the data integration market, when one would think that the problem had been solved and that all data had been extracted, transformed, cleansed and managed, it actually hasn’t been; companies still need data integration. Why? Data is a complicated business. And with data increasingly becoming central to business survival, organizations are constantly looking for ways to unlock new sources of it, use it as an unforeseen source of insight and do it all with greater agility and at lower cost. (more…)
On a recent visit to a client, three people asked me to autograph their copies of Integration Competency Center: An Implementation Guidebook. David Lyle and I published the book in 2005, but it was clear from the dog-eared corners and bookmark tabs that it is still relevant and actively used today. Much has changed in the last seven years, including the emergence of Big Data, Data Virtualization, Cloud Integration, Self-Service Business Intelligence, Lean and Agile practices, Data Privacy, Data Archiving (the “death” part of the information life-cycle), and Data Governance. These areas were not the mainstream concerns in 2005 that they are today. The concepts and advice in the original ICC (Integration Competency Center) book are still valid in this new context, but the question I’d like readers to comment on is this: should we write a new book that explicitly provides guidance for these new capabilities in a shared services environment? (more…)
“I know it when I see it.” So wrote Potter Stewart, Associate Justice of the Supreme Court, in his Jacobellis v. Ohio opinion (1964). He was talking about pornography. The same holds true for data. For example, most business users have a hard time describing exactly what data they need for a new BI report, including which source system to get the data from, in terms precise enough for designers, modelers and developers to build the report right the first time. But if you sit down with a user in front of an analyst tool and profile the potential source data, they will tell you in an instant whether it’s the right data or not. (more…)
Today Informatica announced that we have joined Google’s Cloud Platform Partner Program, with the introduction of a true cloud-based connector for Google BigQuery. Taking advantage of the power of social media, Google made the announcement through their Developer Blog, noting that the new members of their partner program will, “make it much easier to automatically load data from a broad set of sources, as well as to analyze and visualize the data with spectacular dashboards. (more…)
Before the kids got out of class for the summer, my daughter’s fourth-grade class was learning about product marketing. Her teacher is a good friend of ours and knows I work a great deal with Informatica’s product marketing teams, so she asked me, or more accurately volunteered me, to come into the class and talk a little about marketing concepts.
I took plenty of marketing classes in college and have worked closely with marketing teams throughout my career. Even so, I’m not really qualified to talk marketing. But they are fourth graders; how hard could it be, right?
I told the head of the Enterprise Data Warehouse at a large bank, “You don’t have a data warehouse, you have 50,000 tables.” The issue was that the bank had built the EDW without the necessary fundamentals in place. It wasn’t for lack of money; in fact, the EDW was one of the biggest “money sinks” in the bank. The problem was that it was sitting on a sinking foundation.
One version of the truth isn’t achieved by putting all your data in one big system or one big database – that’s impossible. An enterprise data warehouse is indeed part of the solution, but it needs to be built on a solid foundation. What does a solid foundation look like? Here are five pillars for one version of the truth. (more…)
Treating Big Data Performance Woes with the Data Replication Cure Blog Series – Part 1
“Big Data” is all the rage – it is virtually impossible to check out any information management media channel, online resource, or community of interest without having your eyeballs bathed in articles touting the benefits and inevitability of what has come to be known as big data. I have watched this transformation over the past few years as data warehousing and business analytics appliances have entered the mainstream. Pure and simple: what was the bleeding edge of technology twenty years ago in performance computing is now commonplace, with Hadoop being the primary platform (or more accurately, programming environment) for developing big data analytics applications. (more…)