Checklist For Data Integration Platform Capabilities

As I have discussed in previous posts, a data integration platform has to cover a lot of ground if it is going to be the backbone for sharing data across a company or organization.  Below is a checklist of key capabilities that you should evaluate.  Of course, no checklist should be used blindly, so a couple of comments first.

First, are all of these capabilities relevant to you?  Try to think beyond the specific project you have at hand to the different types of data integration projects others in your organization may be pursuing.  After all, the point of a platform, as opposed to a tool, is to promote reuse and standardization across the organization and across different projects or initiatives.  Second, think about what is needed now versus what is likely to be needed in the future.

Here are the capabilities we feel that any data integration platform worth its salt should support in order to claim that it is comprehensive:

  • Any data integration step.  There are five key lifecycle steps for data integration: access, discover, cleanse, integrate and deliver.  Does the platform support all the steps?
  • Any data.  Most organizations have data in hundreds of different formats: in enterprise applications, in databases, in flat files, in message queues, in spreadsheets and other documents.  That data covers many different business entities and can be spread across many geographies and countries.  Can the platform handle any data type or format, including structured and unstructured data, and all master data types (e.g. customer data, product data, finance data), for all the lifecycle steps?
  • Anywhere.  Most organizations have data in thousands of places.  This data is not only within the enterprise, but it’s also coming from business partners, or is being managed by SaaS (software-as-a-service) vendors “in the cloud”.  Can the platform integrate data regardless of whether it sits inside or outside the firewall?
  • Any time.  There is a wide spectrum of timeframes and latencies for data integration, depending on the application and use case.  The timing can range from weeks and days to seconds.  Can the platform access, cleanse, integrate and deliver data whenever applications and users need it, whether in real time, in batch, or through change data capture (CDC)?
  • Any role.  Many different people, including stewards, analysts, architects, administrators and developers, are involved in data integration projects.  They have different tasks to accomplish and different skill sets.  At the same time, they all need to work together.  Does the platform provide role-specific tools designed for each person’s skills and tasks?
  • Any use.  Organizations also have multiple types of data integration projects to tackle, from data warehousing to migrations to MDM to data quality to cloud data integration and B2B data exchange.  Has the platform been proven in a broad variety of real-world use cases?  Does it support both analytical (e.g. reporting and analysis) and operational (e.g. business processes related to operations execution) use cases?
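
To make the five lifecycle steps from the first item a little more concrete, here is a minimal sketch in Python.  It is only an illustration, not how any particular platform works: the sample data, the column names and the function names (access, discover, cleanse, integrate, deliver) are all made up for this example, and a real platform would do far more at every step.

```python
import csv
import io
import json

# Hypothetical sample sources. A real platform would read from applications,
# databases, message queues, files, or SaaS services.
CRM_CSV = "id,name,email\n1, Ada Lovelace ,ADA@EXAMPLE.COM\n2,Alan Turing,alan@example.com\n"
BILLING_CSV = "customer_id,balance\n1,120.50\n2,0\n"


def access(raw_csv):
    """Access: read rows from a source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def discover(rows):
    """Discover: profile the data (here, non-empty value counts per column)."""
    profile = {}
    for row in rows:
        for column, value in row.items():
            profile[column] = profile.get(column, 0) + (1 if value.strip() else 0)
    return profile


def cleanse(rows):
    """Cleanse: trim whitespace and lowercase email addresses."""
    cleaned = []
    for row in rows:
        fixed = {column: value.strip() for column, value in row.items()}
        if "email" in fixed:
            fixed["email"] = fixed["email"].lower()
        cleaned.append(fixed)
    return cleaned


def integrate(customers, balances):
    """Integrate: join the two sources on the customer identifier."""
    balance_by_id = {b["customer_id"]: float(b["balance"]) for b in balances}
    return [dict(c, balance=balance_by_id.get(c["id"])) for c in customers]


def deliver(records):
    """Deliver: serialize the integrated records for a consumer (here, JSON)."""
    return json.dumps(records, indent=2)


def run_pipeline():
    """Run access, cleanse and integrate over both sample sources."""
    customers = cleanse(access(CRM_CSV))
    balances = cleanse(access(BILLING_CSV))
    return integrate(customers, balances)
```

Calling deliver(run_pipeline()) returns the joined customer records as JSON text, and discover(access(CRM_CSV)) gives a rough profile of the raw source before cleansing.  The point of the sketch is simply that each step is a separate concern, which is why the checklist asks whether a platform covers all five.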

There are some additional criteria that should be used to evaluate a data integration platform, beyond its functional capabilities.  More on that in a future posting…

