Data Integration - Informatica

Informatica Data Services

Right-Time Information for the Real-Time Enterprise

SOA’s Last Mile Part III: How to Address SOA’s Data-Centric Pitfalls Effectively

David Lyle

This blog post is part three of an ongoing series highlighting the importance of data in a Service-Oriented Architecture (SOA). I look forward to hearing your thoughts and input on the subject.

I’m back. It’s been a little longer than normal, longer than I would have liked. Perhaps that’s because ‘addressing SOA’s data-centric pitfalls’ isn’t easy. (Really it’s because I’ve been working on other things. But let’s get back to the topic at hand.)

One of the benefits of the SOA approach is the ability to think top-down about problems. The usual approach is to work tightly with the business to define your processes from a business perspective, leading to clearly defined services that the business understands and you can implement together.

This is wonderful and has a clarifying symmetry that Software Engineering has been trying to achieve since the days of CASE. But now, here we are in 2008 with the SOA standards defined and the tools available to potentially achieve this vision. Ah, finally, the integration hairball will be contained and life will improve immeasurably for all!

But as I talked about last time, one of the reasons that things aren’t that simple is the data-centric pitfalls. And addressing this problem is not easy if you want to take a long-term, enterprise-oriented approach.

In talking with folks who have walked down this path, struggled with data problems, and are trying to think holistically about a workable longer-term solution, three themes come up again and again:

1. You must work top-down and bottom-up at the same time.
2. A Data Governance initiative will help to improve quality and oversight of your key data by getting the business involved with solving data-centric problems.
3. Some sort of Master Data Management initiative can help to provide key data for your services, such that the resulting building blocks you create will be more successfully reusable.

I’ll dive into these more deeply in the next section.

Top-Down Design and Bottom-Up Design: Certainly working with the business to define your business processes and the necessary resulting services that are required to make these processes work is a foundational approach in the SOA methodology that has multiple benefits. Besides documenting how the business works, top-down documentation has the more important benefit of getting IT and the business on the same page, speaking the same language. But time and again, customers have expressed to me the importance of simultaneously working the bottom-up angle with the top-down design.

As I described in my previous blog posting on the data-centric pitfalls, bottom-up investigation means examining the sources of the data, defining the semantics of the data in question, tracing how the meanings of this data have changed over time, and understanding the quality of the data you are working with. Certainly, a data profiling tool can be very important for doing this efficiently and correctly. But it can also be useful to pull the resulting information into a Business Glossary where you document what you have learned.
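To make the idea of profiling a little more concrete, here is a minimal sketch in Python of the kind of summary such an investigation might start with. It is not how any particular profiling tool works; the function name `profile_column`, the sample `customers` records and the choice of statistics are all assumptions made for illustration.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column: row count, missing values, distinct values, most common values."""
    values = [row.get(column) for row in rows]

    def is_missing(value):
        # Treat None and blank strings as missing (an assumption for this sketch).
        return value is None or str(value).strip() == ""

    present = [v for v in values if not is_missing(v)]
    counts = Counter(present)
    return {
        "column": column,
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "top_values": counts.most_common(3),
    }


# Hypothetical sample records from a source system.
customers = [
    {"id": 1, "status": "A"},
    {"id": 2, "status": "active"},
    {"id": 3, "status": ""},
    {"id": 4, "status": "A"},
]

report = profile_column(customers, "status")
print(report)
```

Even on this toy sample, the summary surfaces the questions you would want to take to the business: why is one status blank, and do "A" and "active" mean the same thing? The answers are what would go into the Business Glossary.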

Data Governance: The fruits of your bottom-up data investigation with the business should be documented and maintained as an important resource for the company. This provides a simple impetus to getting a Data Governance initiative going, such that you begin working with the business to improve and maintain the quality, definitions, availability and auditability of your data in a more systematic way. In other words, are you doing your bottom-up investigation and losing the knowledge learned into the ether of the enterprise? Or is this knowledge being documented into a Metadata Management repository that can be searched and understood by the business and IT into the future?

It is no coincidence that Data Governance has grown up as a topic just after people started gaining SOA experience. Certainly, whether they called it Data Governance or not, some organizations have been taking cracks at “data governance” for a couple of decades. But now, “data governance” has become “Data Governance” and there are conferences and trade shows for this topic alone. Coincidence? I think not.

Master Data Management: In a similar vein, it is also no coincidence that Master Data Management has sprouted in this same epoch in the continuum of software engineering. Master Data Management should be seen as a means to an end, not an end in itself. Managing master data can provide a structure and approach to solving some (but not all) of the data-centric pitfalls outlined in the previous posting.

Certainly, Data Governance and Master Data Management carry lots of baggage and imply large initiatives on their own. The key takeaway is that the data problems and complexities that exist were not created overnight, nor will they be solved overnight. Companies are attempting to institute better governance and controls to create repeatable approaches to getting a handle on these issues, and these controls can greatly help SOA initiatives if all these efforts are viewed as complementary parts of a whole.

However, DG and MDM do not necessarily have to mean multiple simultaneous ocean-boiling efforts to solve all of an organization’s data problems. Starting small but thinking long-term is always the best approach. And doing something rather than allowing entropy to continue to reign will provide significant benefits.

In this posting, I’ve talked about three high-level approaches to handling the data-centric pitfalls in SOA. Next time I’ll talk about how BPM is a related topic that has also suffered due to similar data issues. Read all about it in the next post.

Next up: “BPM is Missing the Data”

Blog Update For Current Subscribers

Informatica has launched the Informatica Perspectives blog (RSS) where you can now find the latest Data Services discussions among other topics. Please update your RSS subscription to track the following RSS feed for the latest blog posts on Data Services.

Thanks,

The Informatica Team

SOA and BPM Tend to Overlook the Complexity of Integrating Fragmented Enterprise Data

Ash Parikh

This blog post focuses on the typical data-centric challenges that SOA and BPM deployments face. Without accurate, consistent and timely information, SOA and BPM cannot effectively deliver on their promise. I look forward to your take on this.

As we know, SOA uses a simple Web services paradigm to address high-level application integration and business process orchestration, but it cannot address more granular data issues. Examples include semantic inconsistencies, inaccuracies, diverse data formats and access mechanisms, and varying requirements for data latency and volume. Typical SOA and BPM deployments assume the availability of readily consumable information. This introduces the costly risk of data inconsistency and inaccuracy surfacing later and undermining the business and IT value of an SOA or a BPM initiative.

I think that in order to maximize the business and IT value of an application-centric integration strategy, organizations need to look closely at data integration challenges, requirements, and prospective solutions. Application-centric integration approaches like SOA, BPM, EAI, and ESB promise agility, but unless they are complemented by a sophisticated data integration platform, they will most likely fail to deliver on that promise. The platform has to deliver holistic and accurate information as a service to consuming applications and business processes, at exactly the speed and latency the business needs. Our Real-Time resource center looks at this in more detail.

What do you think about real-time or right-time data integration? Do you agree that without considering an organization’s flexible latency needs (from batch to real-time), business agility is at risk?

Ted Friedman, VP Distinguished Analyst, Gartner recently stated, "Most important for organizations to recognize is that their data integration will require a mix of latencies — while real-time activity is on the increase, there will always be a need for higher-latency data integration work, since not all data in the architecture changes frequently, and not all processes, teams and roles are capable of harnessing real-time data."

Do you agree? What’s been your experience?

Getting EIM Right Part III: What is "Right-Time" Information?

Ash Parikh

This blog post is part of an ongoing series highlighting the importance of EIM and how a properly strategized and architected EIM initiative can remove the cost, complexity and risk associated with enterprise integration infrastructures.

Thus, in my opinion, in order to effectively enable business agility, businesses need access to information at the speed of business, or what is called “right-time” information. “Right-time” information, as we have discussed, is information that is made available to the business at exactly the speed or latency at which it is required, be it batch, near real-time or real-time. When businesses have access to holistic and accurate information exactly when it is needed, it becomes much easier to respond quickly to changing compliance laws, roll out new and differentiated functionality, improve the overall customer service experience, rapidly and effectively support mergers and acquisitions, and hence enable true business agility.

It has been a busy two months for me as I have been trying to catch up on all the post-Informatica World activity, as you can follow in the "Informatica World Blog." As promised in my earlier post in this series, I want to round off this discussion around Getting EIM Right with a summary of how I define "right-time" information. I would like to hear from you whether you see it in the same light or a different one. As I see it, it is extremely important that accurate and consistent information is available at exactly the time it is needed in order to respond effectively to the needs of the business, supporting timely decision making.

If we look around, it’s a new world driven by powerful macroeconomic conditions such as globalization, growth, governance and risk mitigation. With growing challenges in achieving agility and flexibility under these conditions, businesses are starting to see increasing demand to support sophisticated operational scenarios such as consolidating customer data in real time to support a call center, or delivering timely and precise forecasts to optimize supply chain operations. People and businesses seem to want to access their information much faster than ever before. Also, in speaking to a number of CIOs, IT executives and IT managers, I have heard that enterprise IT organizations are increasingly trying to use the enterprise data within their analytic domains for more mission-critical applications.

As we can see around us, enterprise data is constantly being accessed, manipulated, and used by more users, through more applications, in increasingly shorter time spans. I can see this trend being reinforced as businesses increasingly adopt industry standards like SWIFT in the financial services industry, ACORD in insurance and HL7 in healthcare, to exchange information with their partners. While in some use cases batch data movement may be sufficient, effective and possibly even required, other real-time, 24×7 or mission-critical operations may need live or current information to maximize operational efficiency.

Here is a graph that I like a lot and that I use frequently to explain this point, as it succinctly depicts all these factors in what I call the enterprise information latency continuum. This graph showcases both the increase in demand for more current or live information as well as a blend of analytical and operational data for enabling businesses to better respond to macroeconomic conditions all around us.


[Graph: the enterprise information latency continuum]

What do you think?

A DMReview Magazine Article of Note on Maximizing Business Value

Ash Parikh

This blog post features a link to an article that appears in the June 2008 issue of DMReview Magazine, written by David and me. We look forward to hearing your thoughts and input on the subject.

This article introduces you to a data services platform as the most efficient approach for enabling business agility across the enterprise, through the delivery of right-time information, be it information delivered in batch, near real time or real time. As you will read in the article, with a data services platform you can enable scalable access, integration and right-time delivery of business-critical information to enterprise-wide composite applications. We have also tried to explain how a data services platform can maximize business value using right-time information for driving competitive advantage, lowered risk and cost-effective project implementations.

Read on…
Maximize Business Value through Right-Time Information Using Data Services - DMReview Magazine

The Latest News and Updates from Informatica World

If you didn't make it to Informatica World 2008 this year, be sure to check out the latest news, announcements, photos, videos, and more on the Informatica World 2008 blog. Several Informatica thought leaders are now live blogging this week's events, sharing their thoughts on this year's event and product announcements. Take a look. We welcome you to share your thoughts and questions.

SOA's Last Mile Part II: SOA's Hidden Data-Centric Pitfalls

David Lyle

This blog post is part two of an ongoing series highlighting the importance of data in a Service-Oriented Architecture (SOA). I look forward to hearing your thoughts and input on the subject.

Last posting, I ranted about the fact that ‘data’ is finally a topic of discussion with respect to SOA initiatives. SOA provides business services that at their deepest level interact with data. What are the data-centric pitfalls that SOA can run into?

First off, data has meaning. While an enterprise ‘meaning’ can be presented by the services to outside consumers of those services, someone has to deal with the fact that the foundational business systems may have different meanings for the underlying data. The ‘transformation’ is frequently very important and complex.

Secondly, the meaning of data can change over time as the business changes. These changes will impact the services and the ‘transformations’ mentioned above. And sometimes these changes will affect the users of the services.

Thirdly, the quality of data is not perfect. How do you deal with these imperfections?

Fourthly, the systems of record for data are not usually neatly compartmentalized. At most complex enterprises, there isn’t just one Order Management system, or one HR system. The concepts of Customer, Policy, Employee, etc., can be spread across many heterogeneous systems, with overlapping responsibilities.

I’m sure there’s a fifth, a sixth, etc. But let’s just elaborate on these four. [Read more]

Getting EIM Right Part II: Where Technologies Such as EAI and EII Have Failed

 

David Lyle

This blog post is part of an ongoing series highlighting the importance of Enterprise Information Management (EIM) and how a properly strategized and architected EIM initiative can remove the cost, complexity and risk associated with enterprise integration infrastructures. I look forward to hearing your thoughts and input on the subject.

In the last post, I mentioned some of the typical modern day business concerns that were expressed to me by a number of customers and prospects. As I dug deeper and tried to understand how these enterprises were dealing with these concerns, it became obvious to me that in order to effectively deal with the business challenges, the underlying IT infrastructure needs to provide a single, comprehensive view into all business critical information assets. Also, the IT infrastructure needs to seamlessly handle the complexity of all enterprise data—its varying volume, its varying latencies, its many formats and structures.

So, are there existing solutions that can efficiently deal with all the complexity of enterprise data? The simple answer is no! Existing technologies such as Enterprise Application Integration (EAI), Business Process Management (BPM), Enterprise Service Bus (ESB) and Enterprise Information Integration (EII) have fallen short of dealing with all the complexities of enterprise data. Either they have spent their time addressing only the application integration hairball and forgotten that a similar situation exists in the data layer, or they are the wrong or an inefficient tool for the problem at hand. That problem is dealing with the complexity of enterprise data: its varied latencies, volumes, formats and structures.

[Read more]

SOA's Last Mile, Part I: Data is a Common Theme in SOA

David Lyle

This blog post is part one of an ongoing series highlighting the importance of data in a Service-Oriented Architecture (SOA) and in Business Process Management (BPM). I look forward to hearing your thoughts and input on the subject.

In 2005, I attended several SOA conferences and tried to discuss ‘data’ with attendees and vendors. Most people looked at me quizzically, then ignored the topic, saying that SOA will abstract away concerns about data types, formats, location, and such. While some nodded about the importance of data semantics, there was little appreciation of the fact that without some kind of ‘data abstraction layer’ for services to utilize, everyone will end up solving the same data access, cleansing, transformation, semantic translation, and integration problems again and again, this time within Java code buried within the services themselves, creating a complex, new ‘Integration Hairball’. Ouch!
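As a rough illustration of what I mean by a shared layer, here is a minimal Python sketch in which the semantic translation lives in one place and every service calls it, rather than each service burying its own copy of the mapping. The system names ("billing", "crm"), the status codes and the function `canonical_status` are all hypothetical, chosen only to show the shape of the idea.

```python
# Hypothetical mappings from each source system's status codes
# to a single enterprise-wide meaning.
STATUS_MAPPINGS = {
    "billing": {"A": "active", "I": "inactive"},
    "crm": {"1": "active", "0": "inactive"},
}


def canonical_status(system, raw_value):
    """Translate a source-specific status code into the shared enterprise term."""
    mapping = STATUS_MAPPINGS[system]
    cleaned = str(raw_value).strip()  # basic cleansing before translation
    if cleaned not in mapping:
        # Surface unknown codes instead of guessing; they are a data quality signal.
        raise ValueError(f"Unknown status {raw_value!r} from system {system!r}")
    return mapping[cleaned]


# Two services reading from different systems get the same enterprise meaning.
print(canonical_status("billing", " A "))
print(canonical_status("crm", 1))
```

When the meaning of a code changes, the change is made once in the shared mapping instead of being hunted down inside every service that happened to touch that data.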

But now, almost three years later, data is front and center. With new technologies, people seem to realize that this new ‘Integration Hairball’ will be created in a fraction of the time it took to create the existing, pre-SOA hairball, unless proper approaches to the ‘data problem’ are taken into account with respect to people, processes and technology around data utilized in the SOA initiatives.

Without taking the data into account from the beginning, SOA is just the next evolution of CORBA, COM, client/server, etc. Certainly, SOA can have benefits by itself, but it’s necessary to recognize that it isn’t complete without having a plan to manage the ‘data problem’. [Read more]

Getting EIM Right Part I: Why is Business Critical Information Such a Strategic Enterprise Asset?

David Lyle

This blog post is part one of an ongoing series highlighting the importance of Enterprise Information Management (EIM) and how a properly strategized and architected EIM initiative can remove the cost, complexity and risk associated with enterprise integration infrastructures. I look forward to hearing your thoughts and input on the subject.

In speaking to a number of customers over the last year or so, it has become increasingly clear to me that continuous change is an integral part of doing business in today’s complex world. Powerful forces such as globalization, improved customer service needs, mergers and acquisitions and business process outsourcing seem to be driving enterprises to become increasingly agile.

Based on what I have gathered from these discussions, making full use of good-quality, consistent information that is available exactly when it is needed seems to be at the heart of enabling agility in the enterprise.

[Read more]
