Tag Archives: NoSQL

Rising DW Architecture Complexity

I was talking to an architect-customer last week at a company event, and he described how his enterprise data warehouse architecture was getting much more complex after many years of relative calm and stability. In the old days, you had some data sources, a data warehouse (with a single database), and some related edge systems.

The current trend is that new types of data and new types of physical storage are changing all of that.

When I got back from my trip, I found a TDWI white paper by Philip Russom that describes the situation very well and details his research on the subject: Evolving Data Warehouse Architectures in the Age of Big Data.

From an enterprise data architecture and management point of view, this is a very interesting paper.

  • First, DW architectures are getting complex because of all the new physical storage options available:
    • Hadoop – very large scale and inexpensive
    • NoSQL DBMS – beyond tabular data
    • Columnar DBMS – very fast seek time
    • DW Appliances – very fast / very expensive
  • What is driving these changes is the rapidly-increasing complexity of data. Data volume has captured the imagination of the press, but it is really the rising complexity of the data types that is going to challenge architects.
  • But here is what really jumped out at me. When they asked the people in their survey what the important components of their data warehouse architecture are, the answer came back: standards and rules. Specifically, they meant how data is modeled, how data quality metrics are created, metadata requirements, interfaces for data integration, etc.

The conclusion for me, from this part of the survey, was that business strategy is requiring more complex data for better analyses (example: real-time response or proactive recommendations) and business processes (example: advanced customer service). This, in turn, is driving IT to look into more advanced technology to deal with different data types and different use cases for the data. And finally, the way they are dealing with the exploding complexity is through standards, particularly data standards. If you are dealing with increasing complexity and have to do it better, faster and cheaper, the only way you are going to survive is by standardizing as much as reasonably makes sense. But not a bit more.

If you think about it, it is good advice.  Get your data standards in place first.  It is the best way to manage the data and technology complexity.  …And a chance to be the driver rather than the driven.

I highly recommend reading this white paper.  There is far more in it than I can cover here. There is also a Philip Russom webinar on DW Architecture that I recommend.

Hadoop Tuesday Update: Discovering Hadoop’s Vibrant Open Source Community

There’s a historic parallel for Hadoop’s rapidly growing ecosystem and excitement – the Linux operating system had a similar trajectory more than a decade ago. At that time, as companies embraced the open source system, a vibrant ecosystem of users, vendors and community supporters evolved to move the technology forward and add value.

Now, we see the same thing happening with Big Data, as an impressive ecosystem emerges around Hadoop. “This is a very strong and vibrant and varied community,” Matt Aslett, analyst with the 451 Group, pointed out at the recent Hadoop Tuesdays webcast. “It very much reminds us of the early early stages of Linux, where you have vendors and users who each have something to gain from Hadoop being successful.” (more…)
