Answering Questions about Informatica Data Integration Hub v10
As a follow-up to the launch of Data Integration Hub v10, we held a well-attended Technical Deep Dive and Demo webinar to cover the new cloud and big data functionality in version 10 in more detail and to demonstrate it live. Data Integration Hub provides a modern publish/subscribe hybrid architecture for data integration that brings new levels of agility, productivity, flexibility and team collaboration. Since there were many good questions during the webinar, we wanted to share them and our answers in this post.
How do you handle subscriptions that need an initial historical load and then delta loads going forward?
There are various ways to handle this, but some of our customers have provisioned both a Full and a Delta topic for the same data domain. Source systems publish infrequently to the Full topic so that new consumers can use it to load the domain initially. Those consumers then begin consuming the Delta topic at regular intervals to stay up to date.
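To make the Full/Delta pattern concrete, here is a minimal sketch in plain Python. It does not use any Data Integration Hub API: the topics are modeled as in-memory lists, and the names `publish_full`, `publish_delta`, `Consumer`, and the `as_of` marker are all hypothetical, introduced only for illustration. The one assumption worth noting is that each full snapshot records how many delta publications it already includes, so a new consumer knows where to start reading deltas.

```python
# Illustrative only: topics are plain lists, not Data Integration Hub objects.

def publish_delta(delta_topic, changes):
    """Append one batch of changed records to the Delta topic."""
    delta_topic.append(list(changes))


def publish_full(full_topic, delta_topic, snapshot):
    """Append a full snapshot, tagged with the number of delta batches it already reflects."""
    full_topic.append({"as_of": len(delta_topic), "rows": list(snapshot)})


class Consumer:
    """Hypothetical subscriber that loads once from Full, then follows Delta."""

    def __init__(self):
        self.rows = {}        # current state of the data domain, keyed by record id
        self.next_delta = 0   # index of the next delta batch to apply

    def initial_load(self, full_topic):
        latest = full_topic[-1]  # most recent full publication
        self.rows = {r["id"]: dict(r) for r in latest["rows"]}
        self.next_delta = latest["as_of"]

    def consume_deltas(self, delta_topic):
        for batch in delta_topic[self.next_delta:]:
            for record in batch:
                if record.get("deleted"):
                    self.rows.pop(record["id"], None)
                else:
                    self.rows[record["id"]] = dict(record)
        self.next_delta = len(delta_topic)


def demo():
    full_topic, delta_topic = [], []
    publish_delta(delta_topic, [{"id": 1, "name": "Acme"}])
    publish_full(full_topic, delta_topic,
                 [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}])
    publish_delta(delta_topic, [{"id": 2, "name": "Globex Corp"},
                                {"id": 3, "name": "Initech"}])
    publish_delta(delta_topic, [{"id": 1, "deleted": True}])

    consumer = Consumer()
    consumer.initial_load(full_topic)    # one-time historical load
    consumer.consume_deltas(delta_topic)  # scheduled catch-up from then on
    return consumer
```

A new consumer calls `initial_load` once and `consume_deltas` on every later run; existing consumers never need to touch the Full topic again.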
How does Data Integration Hub compare to ESBs?
Unlike message-centric Enterprise Service Buses, Data Integration Hub is data-centric. Because it leverages PowerCenter, PowerExchange and Informatica Cloud connectivity and processing, it can scale to large data volumes and supports a rich set of data transformations, including data quality. Data Integration Hub also supports mixed latency: it can automatically deliver data immediately, in scheduled batches, or in near real time, depending on what the consuming system needs.
How does Data Integration Hub compare with multidomain MDM?
Data Integration Hub can be thought of as a transient data broker. It persists the data primarily to support mixed latency consumers. While it greatly improves the efficiency and consistency of data that is distributed between applications, it lacks the match/merge, curation and relationship capabilities that are cornerstones of MDM. Data Integration Hub may complement MDM for certain use cases by providing an efficient data acquisition and/or data distribution mechanism.
Does the Data Integration Hub work with external scheduling products?
Yes, the Data Integration Hub provides both command line and programmatic APIs that allow the publications and subscriptions defined within the Data Integration Hub to be scheduled and run using an enterprise scheduling package.
How does Data Integration Hub pull data from a source? Does it use PowerExchange connectors or ODBC/JDBC?
Data Integration Hub can leverage any connectivity approach that is available within PowerCenter and/or Informatica Cloud. So while ODBC/JDBC might be an option for certain systems, for the vast majority we provide native adapters.
Can you first create a mapping in PowerCenter and then use it in Data Integration Hub, instead of creating it in Data Integration Hub and then modifying it in PowerCenter?
Yes. We call this a custom Data Integration Hub workflow. If you have custom transformation logic, you would most likely develop a custom publication or subscription workflow using the traditional PowerCenter tooling, then reference that workflow when you define the publication or subscription in the Data Integration Hub. Many of our customers auto-generate all of their workflows and then edit them to introduce any custom logic, thereby accelerating their development time.
What database platforms are supported for the repository database?
Based on customer demand, the Data Integration Hub presently supports Oracle and Microsoft SQL Server as the relational publication repository.
Does Data Integration Hub store data at the detail level rather than the aggregate level?
Typically data is stored at the detail level. Keep in mind, however, that data is published into the hub using a PowerCenter workflow, so that workflow could easily aggregate or summarize the data. Of course, this only makes sense if all consumers want summarized data rather than detailed feeds, which is rarely the case. This is one of the many design considerations that our Professional Services team or an Implementation Partner will discuss during a Data Integration Hub implementation project.
What are some best practices that you would recommend?
There are many, and this topic could easily fill a webinar of its own. One high-level best practice worth mentioning is to start with a well-defined, net new project. Don’t try to boil the ocean by retrofitting all of your existing Informatica workflows into Data Integration Hub. Pick a well-suited project with multiple data domains, multiple applications and repetitive consumption patterns that would otherwise result in inefficiencies caused by numerous redundant interfaces.
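To see why repetitive consumption patterns make a good first project, it helps to count interfaces. The sketch below is a deliberately simplified model, not a sizing tool: it assumes a single data domain in which every consuming application needs data from every source, and the function names are ours. Under that assumption, point-to-point integration needs one interface per source/consumer pair, while a publish/subscribe hub needs one publication per source and one subscription per consumer.

```python
def point_to_point_interfaces(sources: int, consumers: int) -> int:
    """One dedicated interface for every source/consumer pair."""
    return sources * consumers


def hub_interfaces(sources: int, consumers: int) -> int:
    """One publication per source plus one subscription per consumer."""
    return sources + consumers


if __name__ == "__main__":
    # Example: 3 source systems feeding 5 consuming applications.
    print(point_to_point_interfaces(3, 5))  # 15
    print(hub_interfaces(3, 5))             # 8
```

The gap widens as applications are added, which is why a project with several sources and several consumers shows the benefit sooner than one with a single pair of systems.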
A replay of the webinar – What’s New in Informatica Data Integration Hub 10 – Technical Deep Dive and Demo is available here. There is more information about the Data Integration Hub on the product page.