Tag Archives: architect
The current trend is that new types of data and new types of physical storage are changing all of that.
When I got back from my trip, I found a TDWI white paper by Philip Russom that describes the situation very well and details his research on the subject: Evolving Data Warehouse Architectures in the Age of Big Data.
From an enterprise data architecture and management point of view, this is a very interesting paper.
- First, DW architectures are getting complex because of all the new physical storage options available:
- Hadoop – very large scale and inexpensive
- NoSQL DBMS – beyond tabular data
- Columnar DBMS – very fast seek time
- DW Appliances – very fast / very expensive
- What is driving these changes is the rapidly-increasing complexity of data. Data volume has captured the imagination of the press, but it is really the rising complexity of the data types that is going to challenge architects.
- But here is what really jumped out at me. When the survey asked respondents to name the important components of their data warehouse architecture, the answer came back: standards and rules. Specifically, they meant how data is modeled, how data quality metrics are created, metadata requirements, interfaces for data integration, etc.
The conclusion for me, from this part of the survey, was that business strategy is requiring more complex data for better analyses (example: realtime response or proactive recommendations) and business processes (example: advanced customer service). This, in turn, is driving IT to look into more advanced technology to deal with different data types and different use cases for the data. And finally, the way they are dealing with the exploding complexity is through standards, particularly data standards. If you are dealing with increasing complexity and have to do it better, faster and cheaper, the only way you are going to survive is by standardizing as much as reasonably makes sense. But not a bit more.
If you think about it, it is good advice. Get your data standards in place first. It is the best way to manage the data and technology complexity. …And a chance to be the driver rather than the driven.
I highly recommend reading this white paper. There is far more in it than I can cover here. There is also a Philip Russom webinar on DW Architecture that I recommend.
A couple of comments on the importance of integration platforms like Informatica in an EDW/Hadoop environment:
- Hadoop does mean you can do some quick and inexpensive exploratory analysis with little or no ETL. The issue is that it will not perform at the level you need to take it to production. As the webinar points out, applying some structure to the data with columnar files (not RDBMS) will dramatically speed up query performance.
- The other thing that makes an integration platform more important than ever is the explosion of data complexity. As Dr. Kimball put it:
“Integration is even more important these days because you are looking at all sorts of data sources coming in from all sorts of directions.”
To perform interesting analyses, you are going to have to be able to join data with different formats and different semantic meaning. And that is going to require integration tools.
- Thirdly, if you are going to put this data into production, you will want to incorporate data cleansing, metadata management, and possibly formal data governance to ensure that your data is trustworthy, auditable, and has business context. There is no point in serving up bad data quickly and inexpensively. The result will be poor business decisions and flawed analyses.
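To make the earlier point about columnar files concrete, here is a minimal sketch of why a column-oriented layout helps analytic queries. It is an illustration only: the order records and function names are made up, and counting "values read" is a stand-in for the I/O that a real columnar file format would save.

```python
# Hypothetical order records, stored row by row (one record per entry).
ROWS = [
    {"order_id": 1, "region": "west", "amount": 120.0},
    {"order_id": 2, "region": "east", "amount": 80.0},
    {"order_id": 3, "region": "west", "amount": 50.0},
]


def to_columns(rows):
    """Pivot row-oriented records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


COLUMNS = to_columns(ROWS)


def total_amount_row_store(rows):
    """Sum 'amount' from row storage; every field of every record is read."""
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)  # the whole record is fetched to get one field
        total += row["amount"]
    return total, values_read


def total_amount_column_store(columns):
    """Sum 'amount' from column storage; only the one column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)
```

Both functions return the same total, but the row-oriented version touches every field of every record, while the column-oriented version touches only the values the query actually needs.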
For Data Warehouse Architects
The challenge is to deliver actionable content from the exploding amount of data available. You will need to be constantly scanning for new sources of data and looking for ways to quickly and efficiently deliver that to the point of analysis.
For Enterprise Architects
The challenge with adding Big Data to Your EDW Architecture is to define and drive a coherent enterprise data architecture across your organization that standardizes people, processes, and tools to deliver clean and secure data in the most efficient way possible. It will also be important to automate as much as possible to offload routine tasks from the IT staff. The key to that automation will be the effective use of metadata across the entire environment to not only understand the data itself, but how it is used, by whom, and for what business purpose. Once you have done that, then it will become possible to build intelligence into the environment.
For more on Informatica’s vision for an Intelligent Data Platform and how this fits into your enterprise data architecture, see Think “Data First” to Drive Business Value.
We are way past the point where the architecture needs to be aligned with business goals and value delivery. That is necessary but no longer sufficient. We are now at the point where architecture needs to be central to the creation of an organization’s strategy process. Not to get hyperbolic, but anything less is risky for your career.
The Challenge: Digitization
I just came back from the MIT Center for Information Systems Research (CISR) research forum. One of the leading topics was digitization and how every business is becoming digitized. To those in the High Tech industry, this may be an “of course” topic, but to most other industries it is a wrenching change. Even those who are comfortable with the idea of digitization risk taking this too lightly.
The fact is that most products and services will have a digital component to them in the near future, and an increasing number of products and services will be entirely digital. Digitization and the technologies that enable it are going to bring about a period of increased disruption. This will mean:
- New competitors. Examples: autonomous cars, sports equipment with embedded sensors that provide feedback, personal assistants fully capable of making decisions and taking action. Gartner is predicting that almost everything over $100 will have a sensor by the turn of the decade.
- New competitors jumping across industry boundaries. Examples: Apple iTunes and Google cars to name a few.
Why Architects Are Important
Architects are in a unique position not only to understand the technology trends driving this disruption, but also to know how to leverage these trends to drive business value within their organizations. The very best architects are going to be those who are deeply involved in defining the organization's strategy, not just figuring out how to implement it.
Evidence of Change
Many architects and CIOs currently report very little interest from upper management in IT. That is about to change, and quickly. At the MIT CISR forum I attended last week, they presented research around this area that is very telling:
- Half of Board of Directors members believe that their board’s ability to oversee the strategic use of IT is “less than effective.”
- 26% of Boards hired consultants to evaluate major projects or the IT unit.
- 60% of Boards want to spend more time on digital issues next year.
- Board members estimate that 32% of their company’s revenues are under threat from digital disruption.
That last bullet is the really interesting piece of research. 32% is a huge impact.
The Role of Data in Digitization
Digitization by its very nature is all about data. The winners in this space will be those that can manage and deliver relevant data the quickest. The question for architects is this: Do you have the architecture and agility to take advantage of the coming disruptions and opportunities? Are you actively advising your organization on how to leverage them? As we have documented in many previous blogs, many organizations are poorly positioned to manage their data as a discoverable and easily sharable asset. This will be essential for:
- Delivering business initiatives and showing value faster (agility).
- Enabling business self-service so that IT is not the bottleneck in new analyses and decisions.
All of this requires new thinking around enterprise data architecture. For fresh thinking on this subject, see Think “Data First” to Drive Business Value.
Just last week, I visited a client for whom I had been consulting on and off for several years. On the meeting room wall, I saw their Enterprise Architecture portfolio, beautifully designed and printed on a giant sheet of paper. My host proudly informed me how much she had enjoyed putting that diagram together in 2009.
I jokingly reminded her of the famous notion of “art for art’s sake,” which is an appropriate phrase to describe what many architects are doing when populating frameworks. Indeed, when we refer to Enterprise Architecture, we must remember that the term ‘architecture’ is, itself, a metaphor.
In a tough economy, when competition is increasingly global and marketplaces are shifting, this ability to make tough decisions is going to be essential. Opportunities to save costs are going to be really valued, and architecture invariably helps companies save money. The ability to reuse, and thus rapidly seize the next related business opportunity, is also going to be highly valued.
The thing you have to be careful of is that if you see your markets disappearing, if your product is outdated, or your whole industry is redefining itself, as we have seen in things like media, you have to be ready to innovate. Architecture can restrict your innovative gene, by saying, “Wait, wait, wait. We want to slow down. We want to do things on our platform.” That can be very dangerous, if you are really facing disruptive technology or market changes.
Albert Camus wrote a famous essay exploring the Sisyphus myth called “The Myth of Sisyphus,” where he reinterpreted the central theme of the myth. Similarly, we need to challenge the myths of Enterprise Architecture and enterprise system/solution architecture in general – not meekly accept them.
IEEE says, “A key premise of this metaphor is that important decisions may be made early in system development in a manner similar to the early decision-making found in the development of civil architecture projects.”
Keep asking yourself, “When is what we built that’s stable actually constraining us too much? When is it preventing important innovation?” For many architects, that’s going to be tough, because you start to love the architecture, the standards, and the discipline. You love what you’ve created, but if it isn’t right for the market you’re facing, you have to be ready to let it go and go seize the next opportunity.
The central message is as follows: ‘documenting’ architecture in various layers of abstraction for the purposes of ‘completeness’ is plainly ridiculous. This is especially true when the effort to produce the artifacts takes such an amount of time as to make the whole collection obsolete on completion.
Adrian gathered experts and built workgroups to dig into the issue and do root cause analysis. The workgroups came back with some pretty surprising results.
- Most people expected that “incorrect data” (missing, out of date, incomplete, or wrong data) would be the main problem. What they found was that this was only #5 on the list of issues.
- The #1 issue was “Too much data.” People working with the data could not find the data they needed because there was too much data available, and it was hard to figure out which was the data they needed.
- The #2 issue was that people did not know the meaning of data. And because people had different interpretations of the data, they often produced analyses with conflicting results. For example, “claims paid date” might mean the date the claim was approved, the date the check was cut or the date the check cleared. These different interpretations resulted in significantly different numbers.
- In third place was the difficulty in accessing the data. Their environment was a forest of interfaces, access methods and security policies. Some were documented and some not.
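The “claims paid date” problem above is easy to reproduce. The following is a minimal sketch with made-up claim records and hypothetical field names (`approved`, `check_cut`, `check_cleared`); it shows how the same question, “how much did we pay in March?”, yields three different answers depending on which date an analyst takes “paid” to mean.

```python
from datetime import date

# Made-up claims for illustration; each carries all three candidate "paid" dates.
CLAIMS = [
    {"amount": 100, "approved": date(2014, 3, 28),
     "check_cut": date(2014, 3, 31), "check_cleared": date(2014, 4, 3)},
    {"amount": 250, "approved": date(2014, 3, 30),
     "check_cut": date(2014, 4, 1), "check_cleared": date(2014, 4, 4)},
    {"amount": 400, "approved": date(2014, 2, 27),
     "check_cut": date(2014, 3, 2), "check_cleared": date(2014, 3, 5)},
]


def paid_in_month(claims, year, month, paid_date_field):
    """Total claim amount whose chosen 'paid' date falls in the given month."""
    return sum(
        claim["amount"]
        for claim in claims
        if claim[paid_date_field].year == year
        and claim[paid_date_field].month == month
    )


# Same data, same month, three interpretations of "claims paid date".
totals = {
    field: paid_in_month(CLAIMS, 2014, 3, field)
    for field in ("approved", "check_cut", "check_cleared")
}
print(totals)  # {'approved': 350, 'check_cut': 500, 'check_cleared': 400}
```

None of the three numbers is wrong; they answer different questions. A shared business definition recorded as metadata is what lets two analysts arrive at the same figure.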
In one of the workgroups, a senior manager put the problem in a larger business context:
“Not being able to leverage the data correctly allows competitors to break ground in new areas before we do. Our data in my opinion is the ‘MOST’ important element for our organization.”
What started as a relatively straightforward data quality project became a more comprehensive enterprise data management initiative that could literally change the entire organization. By the project’s end, Adrian found himself leading the data strategy of the organization.
This kind of story is happening with increasing frequency across all industries as all businesses become more digital, the quantity and complexity of data grows, and the opportunities to offer differentiated services based on data grow. We are entering an era of data-fueled organizations where the competitive advantage will go to those who use their data ecosystem better than their competitors.
Gartner is predicting that we are entering an era of increased technology disruption. Organizations that focus on data as their competitive edge will have the advantage. It has become clear that a strong enterprise data architecture is central to the strategy of any industry-leading organization.
For more future-thinking on the subject of enterprise data management and data architecture, see Think “Data First” to Drive Business Value.
What would the ideal data architecture of the year 2020 look like?
Informatica wants to know how YOU would answer that question. For this reason, we’ve created the Informatica Architect’s Challenge, a chance for YOU to share how you would approach enterprise data architecture differently. Send us your proposal and you could win 100 iPad Minis for the school of your choice.
There are a lot of challenges to think about here, but let’s start with these:
- Organizations are requiring dramatically faster delivery of business initiatives and are unhappy with the current performance of IT. Think this is “marketing hyperbole?” See the McKinsey survey.
- Data in most organizations is highly fragmented and scattered across dozens or hundreds of different systems. Simply finding and prepping data is becoming the majority of the work in any IT project.
- The problem is only going to get worse as cloud, 3rd party data, social, mobile, big data, and the Internet of Things dramatically increase the complexity of enterprise data environments.
Data is the one thing that uniquely differentiates your organization from its competitors. The question is: How are you going to architect to deliver the data to fuel your future business success? How will you manage the challenges of increasing complexity while delivering with the speed your organization requires?
It’s a chance to make a positive contribution to education, while at the same time gaining some professional visibility for yourself as a thought leader. We can’t wait to see what you’ll create!
For additional details, please visit the Informatica Architect’s Challenge official page.
I’m looking forward to doing a Webinar on data virtualization this Thursday, April 22nd. Why? Because this is the single most beneficial concept of architecture, including SOA, and it’s often overlooked by the rank-and-file developers and architects out there. I’m constantly evangelizing the benefits of data virtualization, including integrating data from many different data sources in real time, and enabling query-based applications to get data from multiple systems.
The idea is pretty simple, really. Considering that there are many physical database schemas within most enterprises, and typically no common view of the data, data virtualization allows you to map many physical schemas to virtual schemas that are a better representation of the business. For example, you can present a single view of customer data, sales data, and other data that has the same logical meaning but may be scattered amongst many different physical database systems, using any number of implementation models.
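To illustrate the mapping idea, here is a minimal sketch in plain Python rather than any particular data virtualization product. Everything in it is hypothetical: the two in-memory "physical" sources, their column names, and the `virtual_customers` and `query` helpers. The point is only that the unified customer view is computed at query time from a mapping, without copying the data into a new store.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]

# Two hypothetical physical sources that describe customers differently.
CRM_ROWS = [
    {"cust_id": 1, "full_name": "Ada Lovelace", "email_addr": "ada@example.com"},
    {"cust_id": 2, "full_name": "Alan Turing", "email_addr": "alan@example.com"},
]
BILLING_ROWS = [
    {"CUSTOMER_NO": "B-7", "NAME": "Grace Hopper", "EMAIL": "grace@example.com"},
]

# Each source is reached through a fetch function, standing in for a live connection.
SOURCES: Dict[str, Callable[[], List[Row]]] = {
    "crm": lambda: CRM_ROWS,
    "billing": lambda: BILLING_ROWS,
}

# Virtual schema customer(id, name, email, source): virtual column -> physical column.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "crm": {"id": "cust_id", "name": "full_name", "email": "email_addr"},
    "billing": {"id": "CUSTOMER_NO", "name": "NAME", "email": "EMAIL"},
}


def virtual_customers() -> Iterator[Row]:
    """Yield customers from every source, reshaped into the virtual schema."""
    for source_name, fetch in SOURCES.items():
        mapping = MAPPINGS[source_name]
        for physical_row in fetch():
            row: Row = {
                virtual_col: physical_row[physical_col]
                for virtual_col, physical_col in mapping.items()
            }
            row["id"] = str(row["id"])  # normalize differing key types
            row["source"] = source_name  # keep lineage visible to the caller
            yield row


def query(predicate: Callable[[Row], bool]) -> List[Row]:
    """Run a filter against the virtual view without knowing the physical schemas."""
    return [row for row in virtual_customers() if predicate(row)]
```

A caller can then ask `query(lambda r: str(r["email"]).endswith("@example.com"))` and receive rows from both systems in one shape, with no knowledge of `cust_id` versus `CUSTOMER_NO`.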