Tag Archives: enterprise data architecture
Talking to architects about analytics at a recent event, I kept hearing the familiar theme: data scientists are spending 80% of their time on “data wrangling,” leaving only 20% for delivering the business insights that will drive the company’s innovation. It was clear to everybody I spoke to that the situation will only worsen. The coming growth everybody sees in data volume and complexity will only lengthen the time to value.
Gartner recently predicted that:
“by 2015, 50% of organizations will give up on managing growth and will redirect funds to improve classification and analytics.”
Some of the details of this study are interesting. In the end, many organizations are coming to two conclusions:
- It’s risky to delete data, so they keep it around as insurance.
- All data has potential business value, so more organizations are keeping it around for potential analytical purposes.
The other mega-trend here is that more and more organizations are looking to compete on analytics – and they need data to do it, both internal data and external data.
From an architect’s perspective, here are several observations:
- The floodgates are open and analytics is a top priority. Given that, the emphasis should be on architecting to manage the dramatic increases in both data quantity and data complexity rather than on trying to stop it.
- The immediate architectural priority has to be on simplifying and streamlining your current enterprise data architecture. Break down those data silos and standardize your enterprise data management tools and processes as much as possible. As discussed in other blogs, data integration is becoming the biggest bottleneck to business value delivery in your environment. Gartner has projected that “by 2018, more than half the cost of implementing new large systems will be spent on integration.” The more standardized your enterprise data management architecture is, the more efficient it will be.
- With each new data type, new data tool (Hive, Pig, etc.), and new data storage technology (Hadoop, NoSQL, etc.), ask first whether your existing enterprise data management tools can handle the task before people go out and create a new “data silo” based on the cool, new technologies. Sometimes a new silo will be necessary, but not always.
- The focus needs to be on speeding value delivery for the business. And the key bottleneck is highly likely to be your enterprise data architecture.
Rather than focusing on limiting data growth, the priority should be on managing that growth in the most standardized and efficient way possible. It is time to think about enterprise data management as a function with standard processes, skills and tools (just like Finance, Marketing or Procurement).
Several of our leading customers have built or are building a central “Data as a Service” platform within their organizations. This is a single, central place where all developers and analysts can go to get trustworthy data that is managed by IT through a standard architecture and served up for use by all.
For more information, see “The Big Big Data Workbook”
*Gartner Predicts 2015: Managing ‘Data Lakes’ of Unprecedented Enormity, December 2014 http://www.gartner.com/document/2934417#
The start of the year is a great time to refresh and take a new look at your capabilities, goals, and plans for your future-state architecture. That being said, you have to take into consideration that the most scarce resource in your architecture is probably your own personal time.
Looking forward, here are three things that I would recommend that every architect do. I realize that all three of these relate to data, but as I have said in the eBook, Think “Data First” to Drive Business Value, we believe that data is the key bottleneck in your enterprise architecture in terms of slowing the delivery of business initiatives in support of your organization’s business strategy.
So, here are the recommendations. None of these will cost you anything if you are a current Informatica PowerCenter customer. And #2 and #3 are free regardless. It is only a matter of your time:
1. Take a look at the current Informatica Cloud offering and in particular the templating capabilities.
Informatica Cloud is probably much more capable than you think. The standard templating functionality supports very complex use cases and does it all from a very easy-to-use, no-coding user interface. It comes with a strong library of integration stubs that can be dragged and dropped into Microsoft Visio to create complex integrations. Once the flow is designed in Visio, it can be easily imported into Informatica Cloud, and from there users have a wizard-driven UI to do the final customization for sources, targets, mappings, transformations, filters, etc. It is all very powerful and easy to use.
- YouTube: Building Custom templates https://www.youtube.com/watch?v=yHmFkxov6bs
- 30 day free Informatica Cloud trial. http://more.informatica.com/en/cloud_trial/org?offer=30day-ICwebPage
Why This Matters to Architects
- You will see how easy it is for new groups to get going with fairly complex integrations.
- This is a great tool for departmental or new user use, and it will be completely compatible with the rest of your Informatica architecture – not another technology silo for you to manage.
- Any mapping created for Informatica on-premise can also run on the cloud version.
2. Download Informatica Rev and understand what it can do for your analysts and “data wranglers.”
Your data analysts are spending 80% of their time managing their data and only 20% on the actual analysis they are trying to provide. Informatica Rev is a great way to prepare your data before use in analytics tools such as Qlik, Tableau, and others.
With Informatica Rev, people who are not data experts can access, mash up, prototype and cleanse their data, all in a user interface that looks like a spreadsheet and requires no previous experience in data tools.
- For a free Informatica Rev download https://rev.informatica.com/
- Informatica Rev (Project Springbok) demo https://www.youtube.com/watch?v=0F_58bHKDDs
Why This Matters for Architects
- Your data analysts are going to use analytics tools with or without the help of IT. This enables you to help them while ensuring that they are managing their data well and optimizing their productivity.
- This tool will also enable them to share their “data recipes” and for IT to be involved in how they access and use the organization’s data.
3. Look at the new features in PowerCenter 9.6.
First, upgrade to 9.6 if you haven’t already, and particularly take a good look at these new capabilities that are bundled in every version. Many people we talk to have 9.6 but don’t realize the power of what they already own.
- Profiling: Discover and analyze your data quickly. Find relationships and data issues.
- Data Services: This presents any JDBC or ODBC repository as a logical data object. From there you can rapidly prototype new applications using these logical objects without worrying about the complexities of the underlying repositories. It can also do data cleansing on the fly.
- Webinar: Great Data by Design. https://www.brighttalk.com/webcast/10477/104939
- PowerCenter 9.6 deep dive demo https://www.brighttalk.com/webcast/10477/110535
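To make the profiling idea in the list above a little more concrete, here is a minimal sketch in plain Python. It is not PowerCenter's profiling feature or its API; the `profile_column` helper and the sample `customers` rows are hypothetical and invented purely for illustration. It only shows the kind of questions a profile answers: how many values are missing, how many distinct values exist, and which value dominates.

```python
from collections import Counter


def profile_column(rows, column):
    """Return basic profile statistics for one column of a list of dict rows."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing values.
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical sample data, invented for illustration only.
customers = [
    {"id": 1, "state": "CA", "email": "a@example.com"},
    {"id": 2, "state": "CA", "email": ""},
    {"id": 3, "state": "NY", "email": None},
    {"id": 4, "state": "ca", "email": "d@example.com"},
]

state_profile = profile_column(customers, "state")
email_profile = profile_column(customers, "email")
print(state_profile)
print(email_profile)
```

Even on four rows, the profile surfaces two typical issues: half of the email values are missing, and the state column has three distinct values where two were expected because "CA" and "ca" are spelled inconsistently.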
Why This Matters for Architects
- The key challenge for IT and for Architects is to be able to deliver at the “speed of business.” These tools can dramatically improve the productivity of your team and speed the delivery of projects for your business “customers.”
Taking the time to understand what these tools can do in terms of increasing the productivity of your IT team and enabling your end users to self-service will make you a better business partner overall and increase your influence across the organization. Have a great year!
The white paper, “The Great Rethink: Building a Highly Responsive and Evolving Data Integration Architecture” by Claudia Imhoff and Joe McKendrick provides an interesting view of what such an architecture might look like. The paper describes how to move from ad hoc Data Integration to an Enterprise Data Architecture. The paper also describes an approach towards building architectural maturity and a next-generation enterprise data architecture that helps organizations to be more competitive.
Organizations that look to compete based on their data are searching for ways to design an architecture that:
- On-boards new data quickly
- Delivers clean and trustworthy data
- Delivers data at the speed required of the business
- Ensures that data is handled in a secure way
- Is flexible enough to incorporate new data types and new technology
- Enables end user self-service
- Speeds up business value delivery for an organization
In my previous blog, Digital Strategy and Architecture, we discussed the demands that digital strategies are putting on enterprise data architecture in particular. Add to that the additional stress from business initiatives such as:
- Supporting new mobile applications
- Moving IT applications to the cloud – which significantly increases data management complexity
- Dealing with external data. One recent study estimates that a full 25% of the data being managed by the average organization is external data.
- Next-generation analytics and predictive analytics with Hadoop and NoSQL
- Integrating analytics with applications
- Event-driven architectures and projects
- The list goes on…
The point here is that most people are unlikely to be funded to build an enterprise data architecture from scratch that can meet all these needs. A pragmatic approach would be to build out your future state architecture in each new strategic business initiative that is implemented. The real challenge of being an enterprise architect is ensuring that all of the new work does indeed add up to a coherent architecture as it gets implemented.
The “Great Rethink” white paper describes a practical approach to achieving an agile and responsive future state enterprise data architecture that will support your strategic business initiatives. It also describes a high level data integration architecture and the building blocks to achieving that architecture. This is highly recommended reading.
Also, you might recall that Informatica sponsored the Informatica Architect’s Challenge this year to design an enterprise-wide data architecture of the future. The contest has closed and we have a winner. See the site for details: Informatica Architect Challenge.
What is digitization?
It can take many forms. Here are a few types of digitization of business and examples:
|Type of digitization|Example|
|---|---|
|Products that add digital components|Sports equipment with sensors for immediate feedback|
|Products sold through digital channels|Conde Nast magazines|
|“Solutions” that are assembled and delivered in digital channels|USAA Insurance|
|Products that are entirely digital|Apple iTunes, eSurance, PayPal, Google|
|Companies monetizing their data|Healthcare clinical data|
The really interesting thing about digitization that you can see from some of the examples above is that it enables new competition to enter your space and competitors to leap industry boundaries. The concept of “barriers to entry” itself is eroding.
The Impact of Digitization on IT
Some interesting facts from MIT CISR’s research with Boards of Directors on digitization jumped out at me:
- Board members estimate that 32% of their company’s revenues are under threat from digital disruption. This is a really stunning number when you think about it.
- Half of Board members believe that their board’s ability to oversee the strategic use of IT is “less than effective.”
- 26% of Boards hired consultants to evaluate major projects or the IT unit.
- 60% of Boards want to spend more time on digital issues next year.
The Impact of Digitization for Architects?
It boils down to two things:
- Architects need to deliver a digital platform to enable business agility in a time of increasing competition and disruption. This includes standardization around business processes, data, and the platform.
- Architects need to get more proactive in the strategy process for their organizations both in terms of the platforms and architecture and in terms of a general understanding of the challenges and opportunities that arise from digital disruption.
For more on enterprise data architecture, best practices and reference architectures see the eBook: Think “Data First” to Drive Business Value
We are way past the point where the architecture needs to be aligned with business goals and value delivery. That is necessary but no longer sufficient. We are now at the point where architecture needs to be central to the creation of an organization’s strategy process. Not to get hyperbolic, but anything less is risky for your career.
The Challenge: Digitization
I just came back from the MIT Center for Information Systems Research (CISR) research forum. One of the leading topics was digitization and how every business is becoming digitized. To those in the High Tech industry, this may be an “of course” topic, but to most other industries it is a wrenching change. Even those who are comfortable with the idea of digitization risk taking this too lightly.
The fact is that most products and services will have a digital component to them in the near future, and an increasing number of products and services will be entirely digital. Digitization and the technologies that enable it are going to bring about a period of increased disruption. This will mean:
- New competitors. Examples: autonomous cars, sports equipment with embedded sensors that provide feedback, personal assistants fully capable of making decisions and taking action. Gartner is predicting that almost everything over $100 will have a sensor by the turn of the decade.
- New competitors jumping across industry boundaries. Examples: Apple iTunes and Google cars to name a few.
Why Architects Are Important
Architects are in a unique position not only to understand the technology trends driving this disruption, but also to know how to leverage these trends to drive business value within their organizations. The very best architects are going to be those who are deeply involved in defining the organization’s strategy, not just figuring out how to implement it.
Evidence of Change
Many architects and CIOs currently report very little interest from upper management in IT. That is about to change, and quickly. At the MIT CISR forum I attended last week, they presented research around this area that is very telling:
- Half of Board of Directors members believe that their board’s ability to oversee the strategic use of IT is “less than effective.”
- 26% of Boards hired consultants to evaluate major projects or the IT unit.
- 60% of Boards want to spend more time on digital issues next year.
- Board members estimate that 32% of their company’s revenues are under threat from digital disruption.
That last bullet is the really interesting piece of research. 32% is a huge impact.
The Role of Data in Digitization
Digitization by its very nature is all about data. The winners in this space will be those that can manage and deliver relevant data the quickest. The question for architects is this: Do you have the architecture and agility to take advantage of the coming disruptions and opportunities? Are you actively advising your organization on how to leverage them? As we have documented in many previous blogs, many organizations are poorly positioned to manage their data as a discoverable and easily sharable asset. This will be essential for:
- Delivering business initiatives and showing value faster (agility).
- Enabling business self-service so that IT is not the bottleneck in new analyses and decisions.
All of this requires new thinking around enterprise data architecture. For fresh thinking on this subject see Thinking “Data First” to Drive Business Value.
Adrian gathered experts and built workgroups to dig into the issue and do root cause analysis. The workgroups came back with some pretty surprising results.
- Most people expected that “incorrect data” (missing, out of date, incomplete, or wrong data) would be the main problem. What they found was that this was only #5 on the list of issues.
- The #1 issue was “Too much data.” People working with the data could not find the data they needed because there was too much data available, and it was hard to figure out which was the data they needed.
- The #2 issue was that people did not know the meaning of data. And because people had different interpretations of the data, they often produced analyses with conflicting results. For example, “claims paid date” might mean the date the claim was approved, the date the check was cut or the date the check cleared. These different interpretations resulted in significantly different numbers.
- In third place was the difficulty in accessing the data. Their environment was a forest of interfaces, access methods and security policies. Some were documented and some not.
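The “claims paid date” problem in the list above is easy to reproduce. The following is a minimal sketch in Python, using hypothetical claim records and field names (`approved`, `check_cut`, `check_cleared`) that are invented for illustration and do not come from the project described here. It computes “claims paid in January” three times, once for each interpretation of the paid date.

```python
from datetime import date

# Hypothetical claim records, invented for illustration only.
claims = [
    {"amount": 1000, "approved": date(2015, 1, 28),
     "check_cut": date(2015, 1, 30), "check_cleared": date(2015, 2, 4)},
    {"amount": 2500, "approved": date(2015, 1, 30),
     "check_cut": date(2015, 2, 2), "check_cleared": date(2015, 2, 6)},
    {"amount": 400, "approved": date(2015, 1, 15),
     "check_cut": date(2015, 1, 16), "check_cleared": date(2015, 1, 20)},
]


def paid_in_month(claims, date_field, year, month):
    """Sum claim amounts whose chosen 'paid date' falls in the given month."""
    return sum(
        c["amount"]
        for c in claims
        if (c[date_field].year, c[date_field].month) == (year, month)
    )


# Same data, same question, three definitions of "paid date".
january_totals = {
    field: paid_in_month(claims, field, 2015, 1)
    for field in ("approved", "check_cut", "check_cleared")
}
print(january_totals)
```

With these sample records the January total is 3900 by approval date, 1400 by check-cut date, and 400 by check-cleared date. None of the three analysts is wrong; they are simply answering different questions, which is why an agreed definition matters more than the accuracy of any single field.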
In one of the workgroups, a senior manager put the problem in a larger business context:
“Not being able to leverage the data correctly allows competitors to break ground in new areas before we do. Our data in my opinion is the ‘MOST’ important element for our organization.”
What started as a relatively straightforward data quality project became a more comprehensive enterprise data management initiative that could literally change the entire organization. By the project’s end, Adrian found himself leading the data strategy of the organization.
This kind of story is happening with increasing frequency across all industries as all businesses become more digital, the quantity and complexity of data grows, and the opportunities to offer differentiated services based on data grow. We are entering an era of data-fueled organizations where the competitive advantage will go to those who use their data ecosystem better than their competitors.
Gartner is predicting that we are entering an era of increased technology disruption. Organizations that focus on data as their competitive edge will have the advantage. It has become clear that a strong enterprise data architecture is central to the strategy of any industry-leading organization.
For more future-thinking on the subject of enterprise data management and data architecture, see Think “Data First” to Drive Business Value.