Real-time Data - A Data Integration Challenge
Posted in Data Integration, Enterprise Data Management by Don Tirsell
One technical challenge not often discussed in data integration circles is the impact of real-time data on performance and scalability. I attribute this to a lack of real-world experience in handling real-time data, or a lack of recognition by IT that data integration software can effectively manage real-time data. Many architects and IT developers I meet lump real-time into the EAI domain. This was a logical assumption five years ago, because the data integration market was then known primarily for tackling “large batch volume” workloads (or, as I like to refer to them, “big batch problems”).
Informatica has spent 10 years focused to a good degree on solving that “big batch” problem. The inherent division between design time and run time in the underlying platform architecture enabled the introduction of parallelization/partitioning techniques, 64-bit processing, support for RDBMS vendor-supplied batch utilities/APIs, and improved data conversion/transformation, all without impacting the business logic design. This has proven invaluable to our customers in meeting their requirements for increasing volumes and shrinking load windows.
Let’s discuss some of the technical challenges we’ve faced when incorporating support for real-time into the Informatica data integration platform. For one thing, real-time data flows in rapid-fire, transactional patterns rather than large concerted blasts. There is often an SLA associated with the real-time processing of each response, rather than a time window to complete a full batch load. In addition, real-time data involves a different set of systems and formats than batch: web services, message queues, the web itself, email, and database-related change-data capture all offer sources of, and destinations for, real-time data. In other words, a different perspective is required to solve real-time problems.
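To make the SLA point concrete, here is a minimal sketch in Python. It is purely illustrative and is not Informatica code: the function names, the latency figures, the 1,000 ms load window, and the 100 ms SLA are all assumptions, and it assumes records are processed one after another. It shows how the same set of processing times can comfortably pass a batch-style check while still failing a per-response check.

```python
def batch_meets_window(latencies_ms, window_ms):
    """Batch view: only the total time for the whole load matters.

    Assumes records are processed sequentially, so the elapsed time
    is the sum of the individual processing times.
    """
    return sum(latencies_ms) <= window_ms


def sla_violations(latencies_ms, sla_ms):
    """Real-time view: every individual response is checked against the SLA.

    Returns the positions of the records that took longer than sla_ms.
    """
    return [i for i, latency in enumerate(latencies_ms) if latency > sla_ms]


# Hypothetical per-record processing times, in milliseconds.
latencies = [5.0, 7.0, 250.0, 6.0]

print(batch_meets_window(latencies, window_ms=1000.0))  # True
print(sla_violations(latencies, sla_ms=100.0))          # [2]
```

The load as a whole finishes well inside its window, yet the third record (position 2, counting from zero) would have broken a 100 ms SLA. A batch-oriented measure hides exactly the kind of outlier that a real-time workload cannot tolerate.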
Luckily, Informatica added this perspective to its development and performance teams back in 2001 and, through three subsequent major releases (Informatica 6, 7, and 8), delivered a platform that masters both batch and real-time workloads concurrently, so that batch and real-time architectures co-exist (and thrive) on the same platform.
Are you trying to incorporate real-time data into your business solutions?
Have you taken a look at how real-time data fits into your enterprise data architecture?
Have you tried to real-time-enable your data warehouse, only to struggle?
Ralph Kimball and I presented on this topic on August 21st. We outlined the challenges and opportunities for Real-time Operational Data Warehouses, and I reviewed Informatica’s approach to real-time data integration, including a few customer examples. It’s no longer just about batch; it’s about BOTH!










