David Linthicum

David S. Linthicum is SVP at Cloud Technology Partners and an internationally recognized industry expert and thought leader. Dave has authored 13 books on computing, the latest of which is Cloud Computing and SOA Convergence in Your Enterprise, a Step-by-Step Approach. Dave’s industry experience includes tenures as CTO and CEO of several successful software companies, and upper-level management positions in Fortune 100 companies. He keynotes leading technology conferences on cloud computing, SOA, enterprise application integration, and enterprise architecture.

More Evidence That Data Integration Is Clearly Strategic

A recent study from Epicor Software Corporation surveyed more than 300 IT and business decision-makers at Australian organizations with more than 20 employees.  The results, published in the independent research report “From Business Processes to Product Distribution,” highlight the biggest challenges and opportunities facing Australian businesses.

Key findings from the report include:

  • 65% of organizations cite data processing and integration as hampering distribution capability, with nearly half claiming their existing software and ERP systems are not suitable for distribution.
  • Nearly two-thirds of enterprises have some form of distribution process, involving products or services.
  • More than 80% of organizations have at least some problem with product or service distribution.
  • More than 50% of CIOs in organizations with distribution processes believe better distribution would increase revenue and optimize business processes, with a further 38% citing reduced operating costs.

The core findings: “With better data integration comes better automation and decision making.”

This report is one of many I’ve seen over the years that come to the same conclusion: most of those involved in the operations of the business don’t have access to the key data points they need.  Thus, they can’t automate tactical decisions, nor can they “mine” the data to understand the true state of the business.

The more a business deals with building and moving products, the more data integration becomes an imperative.  As stated in this survey, as well as others, the large majority cite “data processing and integration as hampering distribution capabilities.”

Of course, these issues go well beyond Australia.  Most enterprises I’ve dealt with have some gap between the need to share key business data to support business processes and decision support, and what currently exists in terms of data integration capabilities.

The focus here is on the multiple kinds of value that data integration can bring.  These include:

  • The ability to track everything as it moves from manufacturing, to inventory, to distribution, and beyond.  You can bind these events to core business processes, such as automatically reordering parts to make more products and replenish inventory (see the sketch after this list).
  • The ability to see into both the past and the future.  Emerging approaches to predictive analytics finally allow businesses to look ahead, as well as to understand what went truly right and truly wrong in the past.
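
To make the first point concrete, here is a minimal sketch of an automated reorder decision driven by an integrated view of inventory and manufacturing data.  The field names and thresholds are hypothetical, not drawn from the Epicor report:

```python
# Hypothetical integrated view of one part, assembled from the
# manufacturing, inventory, and distribution systems.
part = {
    "part_id": "AX-100",
    "on_hand": 140,          # from the inventory system
    "reserved": 90,          # committed to open manufacturing orders
    "reorder_point": 100,    # business rule
    "reorder_qty": 500,
}

def reorder_needed(part: dict) -> bool:
    """Reorder when free stock falls below the reorder point."""
    free_stock = part["on_hand"] - part["reserved"]
    return free_stock < part["reorder_point"]

if reorder_needed(part):
    # In a real flow, this step would call the procurement system's API.
    print(f"Reorder {part['reorder_qty']} units of {part['part_id']}")
```

None of this logic is possible, of course, unless the inventory and manufacturing data can actually be brought together in one place, which is precisely the integration gap the survey describes.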

While data integration technology has been around for decades, most businesses that both manufacture and distribute products have not taken full advantage of this technology.  The reasons range from perceptions around affordability, to the skills required to maintain the data integration flow.  However, the truth is that you really can’t afford to ignore data integration technology any longer.  It’s time to create and deploy a data integration strategy, using the right technology.

This survey is just one instance of a larger pattern.  Data integration was considered optional in the past.  With today’s emerging notions around the strategic use of data, it is clearly no longer optional.


The Pros and Cons: Data Integration from the Bottom-Up and the Top-Down

What are the first steps of a data integration project?  Many organizations are at a loss.  There are several ways to approach data integration, and your approach depends largely upon the size and complexity of your problem domain.

With that said, the basic approaches to consider are top-down and bottom-up.  You can be successful with either approach.  However, each brings certain efficiencies, and the right choice can significantly reduce risk and cost.  Let’s explore the pros and cons of each approach.

Top-Down

Approaching data integration from the top down means moving from the high-level integration flows down to the data semantics.  You first select an approach, and perhaps even a tool-set, based on your requirements, and then define the flows, which are decomposed down to the raw data.
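
As a rough sketch of what “flows first, details later” can look like, here a flow is declared against named logical steps, with the concrete source and target bindings deferred until decomposition.  All system and table names are hypothetical:

```python
# Top-down: declare the integration flow first; bind endpoints later.
from dataclasses import dataclass, field

@dataclass
class Flow:
    name: str
    steps: list                                   # ordered logical steps
    bindings: dict = field(default_factory=dict)  # filled in during decomposition

order_flow = Flow(
    name="orders_to_warehouse",
    steps=["extract:orders", "transform:canonical_order", "load:warehouse"],
)

# Later, as the design is decomposed, the raw source/target details arrive.
order_flow.bindings["extract:orders"] = {"system": "ERP", "table": "ORD_HDR"}
order_flow.bindings["load:warehouse"] = {"system": "DW", "table": "FACT_ORDERS"}
```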

The advantages of this approach include:

The ability to spend time defining the higher levels of abstraction without being limited by the underlying integration details.  Those charged with designing the integration flows don’t have to deal with the specifics of the underlying source and target systems until later, as they break down the flows.

The disadvantages of this approach include:

In many instances, the data integration architect does not consider the specific needs of the source or target systems, so some rework of the higher-level flows may be needed later.  That causes inefficiencies, and can add risk and cost to the final design and implementation.

Bottom-Up

For the most part, this is the approach that most choose for data integration.  Indeed, I use this approach about 75 percent of the time.  The process is to start from the native data in the sources and targets, and work your way up to the integration flows.  This typically means that those charged with designing the integration flows are more concerned with the underlying data semantic mediation than the flows.
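
To make “data semantic mediation” concrete, here is a minimal sketch that starts from the native source records and maps them into one canonical structure before any flows are designed.  The records and field mappings are hypothetical:

```python
# Bottom-up: start from the native source data and mediate its semantics.
# Two systems describe the same customer differently.
crm_record = {"cust_no": "C-801", "full_name": "Ada Lopez", "st": "VIC"}
billing_record = {"CUSTOMER_ID": "C-801", "NAME": "Ada Lopez", "STATE": "VIC"}

# Field-level mappings discovered by examining the raw data.
CRM_MAP = {"cust_no": "customer_id", "full_name": "name", "st": "state"}
BILLING_MAP = {"CUSTOMER_ID": "customer_id", "NAME": "name", "STATE": "state"}

def to_canonical(record: dict, mapping: dict) -> dict:
    """Rename native fields to the agreed canonical names."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}

assert to_canonical(crm_record, CRM_MAP) == to_canonical(billing_record, BILLING_MAP)
```

Only once the semantics line up this way do you move up to designing the flows themselves.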

The advantages of this approach include:

It’s typically a more natural and traditional way of approaching data integration.  Called “data-driven” integration design in many circles, this initially deals with the details, so by the time you get up to the integration flows there are few surprises, and there’s not much rework to be done.  It’s a bit less risky and less expensive, in most cases.

The disadvantages of this approach include:

Starting with the details means that you could get so involved in the details that you miss the larger picture, and the end state of your architecture appears to be poorly planned, when all is said and done.  Of course, that depends on the types of data integration problems you’re looking to solve.

No matter which approach you leverage, with some planning and some strategic thinking, you’ll be fine.  However, there are different paths to the same destination, and some paths are longer and less efficient than others.  As you pick an approach, learn as you go, and adjust as needed.


Government Cloud Data Integration: Some Helpful Advice

A recent study found that government cloud data integration has not been very effective. This post will provide some helpful advice.

As covered in Loraine Lawson’s blog, MeriTalk surveyed federal government IT professionals about their use of cloud computing. As it turns out, “89 percent out of 153 surveyed expressed ‘some apprehension about losing control of their IT services,’ according to MeriTalk.”

Loraine and I agree on what the survey says about the government’s data integration, management, and governance: agencies don’t seem to be very good at cloud data management…yet. Some of the other gruesome details include:

  • 61 percent do not have quality, documented metadata.
  • 52 percent do not have well understood data integration processes.
  • 50 percent have not identified data owners.
  • 49 percent do not have known systems of record.

“Overall, respondents did not express confidence about the success of their data governance and management efforts, with 41 percent saying their data integration management efforts were some degree of ‘not successful.’ This lead MeriTalk to conclude, ‘Data integration and remediation need work.’”

The problem with the government is that data integration, data governance, data management, and even data security have not been priorities. The government has a huge amount of data to manage, and they have not taken the necessary steps to adopt the best practices and technology that would allow them to manage it properly.

Now that everyone is moving to the cloud, the government included, questions are popping up about the proper way to manage data within the government, from the traditional government enterprises to the public cloud. Clearly, there is much work to be done to get the government ready for the cloud, or even ready for emerging best practices around data management and data integration.

If the government is to move in the right direction, it must first come to terms with the data. This means understanding where the data is, what it does, who owns it, its access mechanisms, security, governance, etc., and applying this understanding holistically to most of the data under management.
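
One way to begin coming to terms with the data is simply to inventory it. Here is a minimal sketch of what a catalog entry capturing the attributes above might look like; every value is hypothetical:

```python
# Hypothetical data-inventory entry: one per dataset under management.
catalog_entry = {
    "dataset": "citizen_benefits",
    "location": "agency-x-oracle-prod",   # where the data is
    "purpose": "eligibility decisions",   # what it does
    "owner": "Benefits Program Office",   # who owns it
    "access": ["REST API", "nightly batch extract"],
    "security": "PII - restricted",
    "governance": {"system_of_record": True, "retention_years": 7},
}
```

Even a flat inventory like this speaks directly to the survey’s findings around undocumented metadata, unidentified data owners, and unknown systems of record.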

The problem within the government is that the data is so complex, distributed, and, in many cases, unique, that it’s difficult for the government to keep good track of it. Moreover, the way the government does procurement, typically in silos, leads to a much larger data integration problem. I have worked with government agencies that had over 5,000 siloed systems, each with their own database or databases, most of which do not leverage data integration technology to exchange data.

There are ad-hoc data integration approaches and some technology in place, but nowhere close to what’s needed to support the amount and complexity of the data. Now that government agencies are looking to move to the cloud, the issues around data management are beginning to be better understood.

So, what’s the government to do? This is a huge issue that can’t be fixed overnight. It will require incremental changes over the next several years. It also means allocating more resources to data management and data integration than have been allocated in the past, and moving them much higher up the priority list.

These are not insurmountable problems. However, they require a great deal of focus before things will get better. The movement to the cloud seems to be providing that focus.


Once Again, Data Integration Proves Critical to Data Analytics

When it comes to cloud-based data analytics, a recent study by Ventana Research (as found in Loraine Lawson’s recent blog post) provides a few interesting data points.  The study reveals that 40 percent of respondents cited lowered costs as a top benefit, that improved efficiency was a close second at 39 percent, and that better communication and knowledge sharing also ranked highly at 34 percent.

Ventana Research also found that organizations cite a unique and more complex reason to avoid cloud analytics and BI.  Legacy integration work can be a major hindrance, particularly when BI tools are already integrated with other applications.  In other words, it’s the same old story:

You can’t make sense of data that you can’t see.

The ability to deal with existing legacy systems when moving to concepts such as big data or cloud-based analytics is critical to the success of any enterprise data analytics strategy.  However, most enterprises don’t focus on data integration as much as they should, hoping instead that they can solve the problems using ad-hoc approaches.

These approaches rarely work as well as they should, if at all.  Thus, any investment made in data analytics technology is often diminished because the BI tools or applications that leverage analytics can’t see all of the relevant data.  As a result, the available data tells only part of the story, those who leverage data analytics stop relying on the information, and that means failure.

What’s frustrating to me about this issue is that the problem is easily solved.  Those in the enterprise charged with standing up data analytics should put a plan in place to integrate new and legacy systems.  As part of that plan, there should be a common understanding around business concepts/entities, such as a customer, a sale, or inventory, and all of the data related to these concepts/entities must be visible to the data analytics engines and tools.  This requires a data integration strategy and technology.
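
As a minimal sketch of that “common understanding,” here is one way to pin down a shared definition of a business entity that every feeding system must map into before the analytics layer sees the data.  The fields and mapper are hypothetical:

```python
# A shared, agreed definition of "customer" that every source system maps to.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Customer:
    customer_id: str           # the enterprise-wide key, not a system-local one
    name: str
    segment: str               # e.g., "retail" or "wholesale"
    region: Optional[str] = None

# Each legacy system supplies its own mapper into the shared shape.
def from_legacy_crm(row: dict) -> Customer:
    return Customer(
        customer_id=row["CUST_NO"],
        name=row["CUST_NAME"].title(),
        segment=row.get("SEG", "unknown"),
    )

print(from_legacy_crm({"CUST_NO": "C-801", "CUST_NAME": "ADA LOPEZ"}))
```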

As enterprises embark on a new day of more advanced and valuable data analytics technology, largely built upon the cloud and big data, the data integration strategy should be systemic.  This means mapping a path for the data from the source legacy systems to the views that the data analytics systems should include.  What’s more, this data should be available in real operational time, because data analytics loses value as the data becomes older and out-of-date.  We operate in a real-time world now.

So, the work ahead requires planning to occur at both the conceptual and physical levels to define how data analytics will work for your enterprise.  This includes what you need to see, when you need to see it, and then mapping a path for the data back to the business-critical and, typically, legacy systems.  Data integration should be first and foremost when planning the strategy, technology, and deployments.


Moving to the Cloud: 3 Data Integration Facts That Every Enterprise Should Understand

According to a survey conducted by Dimensional Research and commissioned by Host Analytics, “CIOs continue to grow more and more bullish about cloud solutions, with a whopping 92% saying that cloud provides business benefits, according to a recent survey. Nonetheless, IT execs remain concerned over how to avoid SaaS-based data silos.”

Since the survey was published, many enterprises have, indeed, leveraged the cloud to host business data in both IaaS and SaaS incarnations.  Overall, there seem to be two types of enterprises: First are the enterprises that get the value of data integration.  They leverage the value of cloud-based systems, and do not create additional data silos.  Second are the enterprises that build cloud-based data silos without a sound data integration strategy, and thus take a few steps backward in terms of effectively leveraging enterprise data.

There are facts about data integration that most in enterprise IT don’t yet understand, and the use of cloud-based resources actually makes things worse.  The shame of it all is that, with a bit of work and some investment, the value should come back to the enterprises 10 to 20 times over.  Let’s consider the facts.

Fact 1: When implementing new systems, such as those being stood up on public cloud platforms, any data integration investment comes back 10 to 20 fold.  Yet when building a data integration strategy and investing in data integration technology, the focus is typically too much on cost and not enough on the benefit.

Many in enterprise IT point out that their problem domain is unique, and thus their circumstances need special consideration.  While I always perform domain-specific calculations, the patterns of value typically remain the same.  You should determine the metrics that are right for your enterprise, but the positive values will be fairly consistent, varying only in degree.

Fact 2: It’s not just about moving data from place to place; it’s also about the proper management of data.  This includes a central understanding of data semantics (metadata), and a place to manage a “single version of the truth” when dealing with the massive amounts of distributed data that enterprises must typically manage, data that is now also distributed within public clouds.

Most of those who manage enterprise data, cloud or no-cloud, have no common mechanism to deal with the meaning of the data, or even the physical location of the data.  While data integration is about moving data from place to place to support core business processes, it should come with a way to manage the data as well.  This means understanding, protecting, governing, and leveraging the enterprise data, both locally and within public cloud providers.
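
As a toy illustration of what managing a “single version of the truth” involves, here is a sketch that merges conflicting copies of the same record using a simple survivorship rule (the most recently updated non-empty value wins).  The rule and fields are hypothetical:

```python
# Toy "golden record" merge: latest non-empty value wins per field.
from datetime import date

copies = [
    {"customer_id": "C-801", "phone": "555-0100", "updated": date(2014, 1, 5)},
    {"customer_id": "C-801", "phone": "555-0199", "updated": date(2014, 6, 2)},
    {"customer_id": "C-801", "phone": "",         "updated": date(2014, 7, 1)},
]

def golden_record(copies: list) -> dict:
    merged, stamps = {}, {}
    for copy in copies:
        for field, value in copy.items():
            if field == "updated" or value in ("", None):
                continue  # skip the timestamp itself and empty values
            if field not in merged or copy["updated"] > stamps[field]:
                merged[field], stamps[field] = value, copy["updated"]
    return merged

print(golden_record(copies))  # the phone resolves to 555-0199
```

Real master data management is far more involved, but without even this much, every cloud and on-premises copy of a record becomes its own version of the truth.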

Fact 3: Some data belongs on clouds, and some data belongs in the enterprise.  Some in enterprise IT have pushed back on cloud computing, stating that data outside the firewall is a bad idea due to security, performance, legal issues…you name it.  Others try to move all data to the cloud.  The point of value is somewhere in between.

The fact of the matter is that the public cloud is not the right fit for all data.  Enterprise IT must carefully consider the tradeoff between cloud-based and in-house, including performance, security, compliance, etc.  Finding the best location for the data is the same problem we’ve dealt with for years.  Now we have cloud computing as an option.  Work from your requirements to the target platform, and you’ll find what I’ve found: Cloud is a fit some of the time, but not all of the time.


3 Ways to Sell Data Integration Internally

So, you need to grab some budget for a data integration project, but no one understands what data integration is or what business problems it solves, and it’s difficult to explain without a white board and a lot of time.  I’ve been there.

I’ve “sold” data integration as a concept for the last 20 years.  Let me tell you, it’s challenging to define the benefits to those who don’t work with this technology every day.  That said, most of the complaints I hear about enterprise IT are around the lack of data integration, and thus the inefficiencies that go along with that lack, such as re-keying data, data quality issues, lack of automation across systems, and so forth.

Considering that most of you will sell data integration to your peers and leadership, I’ve come up with 3 proven ways to sell data integration internally.

First, focus on the business problems.  Use real-world examples from your own business.  It’s not tough to find any number of cases where the data was just not there to make core operational decisions, leading to huge mistakes that proved costly to the company.  Or, more likely, there are things like ineffective inventory management, with no way to understand when orders need to be placed.  Or, there’s the go-to standard: no single definition of what a “customer” or a “sale” is amongst the systems that support the business.  That one is like back pain: everyone has it at some point.

Second, define the business case in practical terms with examples.  Once you define the business problems that exist due to the lack of a sound data integration strategy and technologies, it’s time to put numbers behind them.  Those in IT have a tendency to either wildly overstate or wildly understate the amount of money that’s being wasted, and thus could be saved, by using data integration approaches and technology.  So, provide practical numbers that you can back up with existing data.
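
Here is a back-of-the-envelope sketch of the kind of practical numbers that tend to land well; every figure below is a placeholder to be replaced with data measured in your own enterprise:

```python
# Back-of-envelope annual waste from a lack of data integration.
# All inputs are placeholders; substitute figures from your own systems.
rekeying_hours_per_week = 120    # staff hours spent re-keying data
loaded_hourly_cost = 45.0        # fully loaded cost per staff hour
error_incidents_per_year = 60    # incidents traced to bad or missing data
avg_cost_per_incident = 2500.0

rekeying_cost = rekeying_hours_per_week * 52 * loaded_hourly_cost
error_cost = error_incidents_per_year * avg_cost_per_incident

print(f"Re-keying:      ${rekeying_cost:,.0f} per year")
print(f"Data errors:    ${error_cost:,.0f} per year")
print(f"Total at stake: ${rekeying_cost + error_cost:,.0f} per year")
```

Numbers you can defend line by line beat a single impressive-sounding total every time.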

Finally, focus on a phased approach to implementing your data integration solution.  The “Big Bang Theory” is a great way to describe the beginning of the universe, but it’s not the way you want to define the rollout of your data integration technology.  Define a workable plan that moves from one small grouping of systems and databases to another, over time, with a reasonable amount of resources and technology.  You do this to remove risk from the effort, manage costs, and ensure that you can dial lessons learned back into the effort.  I would rather roll out data integration within an enterprise using small teams and more problem domains than attempt to do everything within a few years.

The reality is that data integration is no longer optional for enterprises these days.  It’s required for so many reasons, from data sharing, information visibility, compliance, security, automation…the list goes on and on.  IT needs to take point on this effort.  Selling data integration internally is the first and most important step.  Go get ‘em.


Data Integration with Devices is Easier than You Think

The concept of the “Internet of Things” (IOT) is about getting the devices we leverage in our daily lives, or devices used in industrial applications, to communicate with other devices or systems. This is not a new notion, but the bandwidth and connectivity mechanisms that make the IOT practical are a recent development.

My first job out of college was to figure out how to get devices that monitored and controlled an advanced cooling and heating system to communicate with a centralized and automated control center. We ended up building custom PCs for the application, running a version of Unix (DOS would not cut it), and the PCs mounted in industrial cases would communicate with the temperature and humidity sensors, as well as turn on and turn off fans and dampers.

At the end of the day, this was a data integration problem, not an engineering problem, that we were attempting to solve. The devices had to talk to the PCs, and the PCs had to talk to a centralized system (a mainframe) that was able to receive the data, as well as use that data to determine what actions to take. For instance, the system had to determine that 78 degrees was too warm for a clean room, open a damper and turn on a fan to reduce the temperature, and then turn them off when the temperature returned to normal.
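
That control decision is simple enough to sketch. Here is a minimal version of the rule just described, with a hypothetical threshold and simulated sensor readings standing in for the original system:

```python
# Minimal sketch of the clean-room rule described above.
# The threshold and device actions are illustrative, not the original system's.
TOO_WARM_F = 78.0

def control_step(temp_f: float, damper_open: bool) -> bool:
    """Return the new damper/fan state for one sensor reading."""
    if temp_f >= TOO_WARM_F and not damper_open:
        print(f"{temp_f}F: open damper, start fan")
        return True
    if temp_f < TOO_WARM_F and damper_open:
        print(f"{temp_f}F: close damper, stop fan")
        return False
    return damper_open  # no change

state = False
for reading in (75.5, 78.2, 79.0, 76.4):  # simulated sensor readings
    state = control_step(reading, state)
```

The engineering was in the sensors and the custom PCs; the lasting problem was getting the readings and commands to flow between systems.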

Back in the day, we had to create and deploy custom drivers and software. These days, most devices have well-defined interfaces, or APIs, that developers and data integration tools can access to gather information from that device. We also have high performing networks. Much like any source or target system, these devices produce data which is typically bound to a structure, and that data can be consumed and restructured to meet the needs of the target system.

For instance, data coming off a smart thermostat in your home may be in the following structure:

Device (char 10)
Date (char 8)
Temp (num 3)

You’re able to access this device using an API (typically a REST-based Web Service), which returns a single chunk of data which is bound to the structure, such as:

Device (“9999999999”)
Date (“09162014”)
Temp (076)

Then you can transform the structure into something that’s native to the target system that receives this data, as well as translate the data (e.g., converting the data from characters to numbers). This is where data integration technology makes money for you, given its ability to deal with the complexity of translating and transforming the information that comes off the device, so it can be placed in a system or data store that’s able to monitor, analyze, and react to this data.
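
A minimal sketch of that consume-and-transform step, assuming a hypothetical endpoint that returns the record above as JSON; the target field names are also invented for illustration:

```python
# Sketch: pull the thermostat record and reshape it for a target system.
# The endpoint URL and target field names are hypothetical.
import json
from datetime import datetime
from urllib.request import urlopen

def fetch_reading(url: str) -> dict:
    with urlopen(url) as resp:  # REST GET returning JSON
        return json.load(resp)

def transform(raw: dict) -> dict:
    """Translate types and restructure for the target system."""
    return {
        "device_id": raw["Device"].strip(),
        "read_at": datetime.strptime(raw["Date"], "%m%d%Y").date().isoformat(),
        "temp_f": int(raw["Temp"]),  # characters -> number
    }

# Using the sample record from above in place of a live API call:
raw = {"Device": "9999999999", "Date": "09162014", "Temp": "076"}
print(transform(raw))  # {'device_id': '9999999999', 'read_at': '2014-09-16', 'temp_f': 76}
```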

This is really what the IOT is all about: the ability to have devices spin out data that is leveraged to make better use of those devices. The possibilities are endless as to what can be done with that data, and how we can better manage these devices. Data integration is key. Trust me, it’s much easier to integrate with devices these days than it was back in the day.

Thank you for reading about Data Integration with Devices! Editor’s note: For more information on Data Integration, consider downloading “Data Integration for Dummies.”


The Changing ROI of Data Integration

Over the years we’ve always tried to better define the ROI of data integration.  It seems pretty simple.  There is an increasing value to core enterprise systems and data stores once they communicate effectively with other enterprise systems and data stores.  There is unrealized value when systems and stores do not exchange data.

However, the nature of data integration has evolved, and so has the way we define the value.  The operational benefits are still there, but there are more strategic benefits to consider as well.

Data integration patterns have progressed from simple patterns that replicated data amongst systems and data stores, to more service-based use of core business data that is able to provide better time-to-market advantages and much better agility.  These are the strategic concepts that, when measured, add up to much more value than the simple operational advantages we first defined as the ROI of data integration.

The new ROI for data integration can be defined a few ways, including:

The use of data services to combine core data assets with composite applications and critical business processes.  This allows those who leverage data services, which is a form of data integration, to mix and match data services to provide access to core applications or business processes.  The applications leverage the data services (typically REST-based Web services) as ways to access back-end data stores, and can even redefine the metadata for the application or process (a.k.a., Data Virtualization).

This provides for a compressed time-to-market for critical business solutions, and thus returns much of the investment.  What’s more important is the enterprise’s new ability to change and adapt to new business opportunities, and thus realize the value of agility.  This is clearly where the majority of the ROI resides.
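
A minimal sketch of such a data service, exposing one back-end record set through a REST endpoint that applications can mix and match; the store contents, route, and use of Python’s standard HTTP server are all illustrative:

```python
# Sketch of a tiny REST-style data service over a back-end store.
# A real service would sit on a proper framework and real systems of record.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

BACKEND = {"C-801": {"customer_id": "C-801", "name": "Ada Lopez", "segment": "retail"}}

class CustomerService(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expected path: /customers/<id>
        parts = self.path.strip("/").split("/")
        record = BACKEND.get(parts[1]) if len(parts) == 2 and parts[0] == "customers" else None
        self.send_response(200 if record else 404)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(record or {"error": "not found"}).encode())

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), CustomerService).serve_forever()
```

The point is that consumers see a stable service rather than the back-end store itself, which is what lets composite applications mix and match data without re-integrating each time.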

The use of integrated data to make better automated operational decisions.  This means taking integrated data, either as services or through simple replication, and using that data to make automated decisions.  Examples would be the ability to determine whether inventory levels will support an increase in sales, or whether the risk levels for financial trades are too high.

The use of big data analytics to define advanced use of data, including predicting the future.  This refers to the process of leveraging big data, and big data analytics, to make critical calls around the business, typically calls that are more strategic in nature.  An example would be the use of predictive analytics that leverages petabytes of data to determine whether a product line is likely to be successful, or whether production levels will likely decline or increase.  This differs from the operational use of data discussed previously, in that we’re making strategic rather than tactical use of the information derived from the data.  The ROI here, as you would guess, is huge.

A general pattern is that the ROI is much greater around data integration than it was just 5 years ago.  This is due largely to the fact that enterprises understand that data is everything, when it comes to driving a business.  The more effective the use of data, the better you can drive the business, and that means more ROI.  It’s just that simple.

Editor’s note: For more information on Data Integration, consider downloading “Data Integration for Dummies.”


Even More Drivers for Data Integration and Data Cleansing Tools

The growth of big data drives many things, including the use of cloud-based resources, the growth of non-traditional databases, and, of course, the growth of data integration. What’s typically not as well understood are the required patterns of data integration, and the ongoing need for better and more innovative data cleansing tools.

Indeed, while writing Big Data@Work: Dispelling the Myths, Uncovering the Opportunities, Tom Davenport observed data scientists at work. During his talk at VentureBeat’s DataBeat conference, Davenport said data scientists would need better data integration and data cleansing tools before they’d be able to keep up with the demand within organizations.

But Davenport is not alone. Most who deploy big data systems see the need for data integration and data cleansing tools. In most instances, not having those tools in place hindered progress.

I would agree with Davenport, in that the number one impediment to moving to any type of big data is how to clean and move data. Addressing that aspect of big data is Job One for enterprise IT.

The fact is, just implementing Hadoop-based databases won’t make a big data system work. Indeed, the data must come from existing operational data stores, and leverage all types of interfaces and database models. The fundamental need to translate the data structure and content to effectively move from one data store (or stores, typically) to the big data systems has more complexities than most enterprises understand.

The path forward may require more steps than originally anticipated, and perhaps the whole big data thing was sold as something much easier than it actually is. My role for the last few years has been to be the guy who lets enterprises know that data integration and data cleansing are core components of the process of building and deploying big data systems. You may as well learn to deal with it early in the process.

The good news is that data integration is not a new concept, and the technology is more than mature. What’s more, data cleansing tools can now be a part of the data integration technology offerings, and actually clean the data as it moves from place to place, and do so in near real-time.
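
As a minimal sketch of what cleansing “as the data moves” can look like inside a pipeline, here a few in-flight rules trim, normalize, and reject records between source and target. The rules and record shape are illustrative only, not any particular vendor’s tooling:

```python
# Sketch: cleansing rules applied in-flight as records move between stores.
from typing import Optional

def cleanse(record: dict) -> Optional[dict]:
    """Return a cleaned record, or None to reject it."""
    key = (record.get("customer_id") or "").strip()
    if not key:
        return None  # reject: no usable key
    return {
        "customer_id": key,
        "name": (record.get("name") or "").strip().title(),
        "email": (record.get("email") or "").strip().lower() or None,
    }

def pipeline(source: list) -> list:
    cleaned = (cleanse(rec) for rec in source)
    return [rec for rec in cleaned if rec is not None]

dirty = [{"customer_id": " C-801 ", "name": "ADA LOPEZ", "email": "ADA@X.COM"},
         {"customer_id": "", "name": "?"}]
print(pipeline(dirty))  # the second record is rejected in flight
```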

So, doing big data anytime soon? Now is the time to define your big data strategy, in terms of the new technology you’ll be dragging into the enterprise. It’s also time to expand or change the use of data integration and perhaps the enabling technology that is built or designed around the use of big data.

I hate to sound like a broken record, but somebody has to say this stuff.


Data Integration Eight Years Later

I recently came across an article from 2006, which is clearly out-of-date, but still a good read about the state of data integration eight years ago. “Data integration was hot in 2005, and the intense interest in this topic continues in 2006 as companies struggle to integrate their ever-growing mountain of data.

A TDWI study on data integration last November found that 69% of companies considered data integration issues to be a high or very high barrier to new application development. To solve this problem, companies are increasing their spending on data integration products.”

Business intelligence (BI) and data warehousing were the way to go at the time, and companies were spending millions to stand up these systems. Data integration was all massive data movements and manipulations, typically driven by tactical tools rather than true data integration solutions.

The issue I had at the time was the inability to deal with real-time operational data, and the cost of the technology and deployments. While these issues were never resolved with traditional BI and data warehousing technology, we now have access to databases that can manage over a petabyte of data, and the ability to cull through the data in seconds.

The ability to support massive amounts of data has reignited interest in data integration. Up-to-the-minute operational data in these massive data stores is now actually possible. We can now understand the state of the business as it happens, and thus make incremental adjustments based upon almost perfect information.

What this situation leads to is true value. We have delivery of the right information to the right people, at the right time, and the ability to place automated processes and policies around this data. Business becomes self-correcting and self-optimizing. The outcome is a business that is data-driven, and thus more responsive to the markets as well as to the business world itself.

However, big data is an impossible dream without a focus on how the data moves from place to place, using data integration best practices and technology. I guess we can call this big data integration, but it’s really the path to provide these massive data stores with the operational data required to determine the proper metrics for the business.

Data integration is not a new term. However, the application of new ways to leverage and value data brings unprecedented value to enterprises. Millions of dollars an hour of value are being delivered to Global 2000 organizations that leverage these emerging data integration approaches and technology. What’s more, data integration is moving from the tactical to the strategic budgets of IT.

So, what’s changed in eight years? We finally figured out how to get the value from our data, using big data and data integration. It took us long enough, but I’m glad it’s finally become a priority.
