3 Avoidable Data Integration Blunders

Data integration is an old science. We started syncing data over networks between data stores and applications many years ago, and the practice has since evolved into something pretty bulletproof. That is, if you put in the time to make the right calls.

To that end, I’ve put together a list of 3 data integration blunders that I’ve seen happen a lot lately. These are very avoidable blunders. With a bit of work, you can remove these risks from your next data integration project.

First, security is systemic to everything that you do. It starts at the beginning of the project, when you select your source and target data stores, and continues through defining transformations and translations, testing, execution, and operations. Security needs to be part of a thoughtful process, not a bolt-on afterthought, as happens in many enterprises.

Many of those who do data integration tend to encrypt at rest or in flight, but not both. The end result is that the data is exposed either in transit or where it is stored. Also make sure to consider identity management, logging, and key management as you define, build, and test your data integration solution.
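As a rough illustration, here is a minimal Python sketch of a check that flags flows encrypted in only one of the two places. The `Endpoint` and `Link` classes, their fields, and the `security_gaps` function are hypothetical names invented for this example. They are not part of any data integration product, and a real project would pull these facts from its own configuration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical description of a source or target data store."""
    name: str
    encrypted_at_rest: bool


@dataclass(frozen=True)
class Link:
    """Hypothetical description of one data flow between two endpoints."""
    source: Endpoint
    target: Endpoint
    encrypted_in_transit: bool


def security_gaps(link: Link) -> List[str]:
    """Return a readable list of encryption gaps for a single flow."""
    gaps = []
    # Check both ends of the flow for encryption at rest.
    for endpoint in (link.source, link.target):
        if not endpoint.encrypted_at_rest:
            gaps.append(f"{endpoint.name}: data is not encrypted at rest")
    # Check the connection itself for encryption in flight.
    if not link.encrypted_in_transit:
        gaps.append(
            f"{link.source.name} -> {link.target.name}: "
            "data is not encrypted in transit"
        )
    return gaps
```

Running a check like this over every flow before go-live is one way to catch the "one but not both" pattern early. An empty list means neither gap was found for that flow. It says nothing about identity management, logging, or key management, which need their own review.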

Second, define your metadata first. Data integration is much easier if you know what data you need to integrate. Defining your metadata helps you understand the data, and once your metadata is under control, your data integration comes under control with it.

Metadata should not just cover format, owner, integrity, etc. It should also include items that define compliance, security, governance, and the more advanced concepts that we need to deal with these days. By doing this, you’ll give your data integration project a much better chance of success, since you’re reducing the possibility that you’ll incorrectly integrate the data or fail to provide the appropriate security.
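To make this concrete, here is a minimal Python sketch of a metadata record that carries security and compliance attributes alongside format and owner, plus a check for what is still undefined. The `FieldMetadata` class, its field names, the example labels, and the `missing_metadata` function are all assumptions made for illustration. Your own metadata model will likely differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldMetadata:
    """Hypothetical metadata record for one data element."""
    name: str
    data_format: str = ""
    owner: str = ""
    # Assumed example labels: "public", "internal", "restricted".
    classification: str = ""
    # Assumed example tags: "GDPR", "HIPAA".
    compliance_tags: List[str] = field(default_factory=list)


# Attributes that must be filled in before the element is integrated.
REQUIRED = ("data_format", "owner", "classification", "compliance_tags")


def missing_metadata(meta: FieldMetadata) -> List[str]:
    """Return the names of required attributes that are still empty."""
    return [attr for attr in REQUIRED if not getattr(meta, attr)]
```

For example, a record created with only a name, format, and owner would report `classification` and `compliance_tags` as missing, which is a signal to resolve those questions before mapping the element to a target.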

Finally, don’t sign up for the data integration technology of the month club. This is a common blunder. Data integration projects are done at different points in time, and data integration project leaders have a tendency to pick different technologies each time, based upon their perception of what’s popular.

The issue is twofold. First, if you have several data integration technologies around, it’s unlikely that you’ll get them under a common management layer. At the very least, having several will increase complexity, which will increase cost and risk. Second, you’ll have to keep around the skills to maintain each data integration solution, and thus you’ll have to hire or train for several kinds of technologies. The disadvantages of that scenario are obvious.

It’s often difficult to avoid some blunders. Mistakes are made in almost every new venture, and most data integration projects just work around them. What I’m referring to are mistakes that are very much avoidable, based upon what we know so far. Learn from your peers. Don’t repeat the same mistakes.