
Making The Case For Staging And Cloud Computing Integration

Darren Cunningham, in his recent blog post How to Migrate To The Cloud, made some great points about using a staging area for data integration in cloud computing. The reasons he would leverage a staging area include:

  • It enables better business control before the data is pushed from one system to the other.
  • It enables tracking and reconciliation of a business process.
  • It enables the addition of new sources or targets through reuse, instead of building a spaghetti plate of direct point-to-point interfaces. It fits the SOA paradigm.
  • It breaks the dependencies between the two systems, enabling asynchronous synchronization, or synchronous synchronization with different data set sizes (single message or bulk).

The reality of data integration is that the patterns of use depend largely on the requirements. While real-time or near-real-time data integration is required in certain circumstances, the use of a staging area for data integration is gaining ground as more and more corporate data sets reside in clouds.

As Darren points out above, the core reason is control of the data as it moves from system to system, typically from on-premise to the cloud and back again. The use of intermediary staging, where the data can be viewed, manipulated, and cleaned, ensures that data quality checks and any required data governance occur consistently.
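To make the idea of cleaning data in staging concrete, here is a minimal sketch. The `StagingArea` class, the `id` and `email` fields, and the validation rules are all assumptions for illustration; a real staging area would usually be a set of database tables with rules driven by your own governance requirements.

```python
from dataclasses import dataclass, field


@dataclass
class StagingArea:
    """Hypothetical in-memory staging area (illustration only)."""
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

    def land(self, record: dict) -> None:
        # Trim string values first so the same rules apply to every source.
        cleaned = {key: value.strip() if isinstance(value, str) else value
                   for key, value in record.items()}
        email = cleaned.get("email")
        if cleaned.get("id") is None or not isinstance(email, str) or "@" not in email:
            # Hold bad rows for review instead of pushing them to the target.
            self.rejected.append(record)
            return
        cleaned["email"] = email.lower()
        self.accepted.append(cleaned)


staging = StagingArea()
for row in [
    {"id": 1, "email": "  Ann@Example.com "},
    {"id": 2, "email": "not-an-email"},
    {"id": None, "email": "bob@example.com"},
]:
    staging.land(row)
```

Only the first row reaches `staging.accepted`; the other two wait in `staging.rejected`, where someone can inspect and correct them before anything is sent to the target system.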

Moreover, most on-premise to cloud computing problem domains deal with more than one system. The use of staging allows you to add and remove systems easily, using the staging area as a place where several data sources are combined, processed, and retransmitted to the target system. When attempting this using real-time or near-real-time technology, the transformation becomes complex, and thus difficult to manage and execute.
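A quick back-of-the-envelope comparison shows why adding systems is easier with staging. This sketch assumes the worst case for direct integration, where every source must feed every target; the function names are made up for this example.

```python
def point_to_point_interfaces(sources: int, targets: int) -> int:
    # Every source needs its own interface to every target.
    return sources * targets


def staged_interfaces(sources: int, targets: int) -> int:
    # Each system needs only one interface: into or out of staging.
    return sources + targets


for sources, targets in [(2, 2), (4, 3), (10, 5)]:
    print(sources, targets,
          point_to_point_interfaces(sources, targets),
          staged_interfaces(sources, targets))
```

Under that assumption, adding one more source to a landscape with five targets means five new direct interfaces, but only one new interface when everything flows through staging.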

In the world of SOA we often argue about the use of loose coupling, and the use of a staging area makes that possible. Because the systems are largely decoupled, any changes to the source or target systems are easily managed within the staging area with simple tweaks to the transformation and translation logic.
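As a rough illustration of what a "simple tweak" can look like, the sketch below keeps the translation logic as a field mapping inside the staging layer. The field names (`cust_name`, `Phone`, `MainPhone`) and the `translate` helper are hypothetical.

```python
# Hypothetical field mapping held in the staging layer: source name -> target name.
FIELD_MAP = {"cust_name": "Name", "cust_phone": "Phone"}


def translate(record: dict, field_map: dict) -> dict:
    # Rename only mapped fields; unmapped source fields are dropped on purpose.
    return {target: record[source]
            for source, target in field_map.items()
            if source in record}


source_row = {"cust_name": "Acme", "cust_phone": "555-0100", "internal_flag": "x"}
before = translate(source_row, FIELD_MAP)

# Suppose the target renames "Phone" to "MainPhone": only the mapping changes.
updated_map = {**FIELD_MAP, "cust_phone": "MainPhone"}
after = translate(source_row, updated_map)
```

Neither the source extract nor the target loader has to change here; the adjustment lives entirely in the mapping that the staging area owns.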

When leveraging a staging area for on-premise to cloud computing integration, there are a few key areas of guidance that I would provide.

First, the design of the integration path is very important. Take time to understand the source and target schemas, and then design the transformation and routing logic accordingly.

Second, when selecting technology, make sure to understand your own use cases, and any growth or changes that will occur within your problem domain. Many enterprises purchase only for what they need now, and end up with layers upon layers of different integration technologies, with none truly solving their problems.

Finally, consider the latency and limited bandwidth of the Internet. In many instances, transferring gigabytes of data won’t finish in minutes, as it does between on-premise systems. You need to account for this within your data integration design, and plan for the growth of the data sets over time.
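A rough estimate is usually enough to see the problem. The sketch below is simple arithmetic, not a measurement: the link speeds, the 50 GB data set, and the 0.7 efficiency factor (the assumed share of nominal bandwidth you actually get) are all illustrative assumptions.

```python
def transfer_minutes(size_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Rough time to move size_gb over a link of link_mbps megabits per second.

    efficiency is an assumed fraction of the nominal rate that is actually
    achieved after protocol overhead and contention.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    megabits = size_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return megabits / (link_mbps * efficiency) / 60


wan = transfer_minutes(50, 100)      # 50 GB over an assumed 100 Mbps internet link
lan = transfer_minutes(50, 10_000)   # same data over an assumed 10 Gbps local network
```

With these assumed numbers the internet transfer takes on the order of an hour and a half, while the local transfer takes about a minute. Rerunning the estimate with your projected data volumes is a cheap way to check whether a bulk load still fits its window a year from now.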


10 Responses to Making The Case For Staging And Cloud Computing Integration

  1. Pingback: Making the Case for Staging and Cloud Computing Integration … | Suporte de Informática

  2. Thanks for adding to the discussion Dave! I think your points about the right design, technology, and latency are good ones. We do see a great deal of point-to-point, process-centric integration requirements coming from midsized companies in particular who are using the Informatica Cloud to connect SaaS applications with back-office systems. The direct connect approach also seems particularly attractive when there are either limited IT resources or a feeling from the LOB that their needs aren’t being met. The politics of cloud application integration might be an interesting topic for a future post…

    The original comment to my post was from Bertrand Cariou: http://bit.ly/dk51k7

    I wonder if he has anything else to add.

  3. Aditya Thota says:

    In a typical DW architecture we have staging tables, historical tables, and finally DW tables. Can we process the data using SaaS as below?
    1. Extract from source tables.
    2. Land to staging tables (on-premise) using SaaS.
    3. Write to historical tables (on-premise) after cleansing using SaaS.
    4. Write to DW tables (on-premise) after dimensional lookups (that reside on-premise) using SaaS.
    What could the potential implications be in this typical EDW architecture?
    Thanks,
    Aditya

  4. Aditya Thota says:

    Darren,
    Excellent article!!
    In a typical DW architecture we have staging tables, historical tables, and finally DW tables. Can we process the data using SaaS as below?
    1. Extract from source tables.
    2. Land to staging tables (on-premise) using SaaS.
    3. Write to historical tables (on-premise) after cleansing using SaaS.
    4. Write to DW tables (on-premise) after dimensional lookups (that reside on-premise) using SaaS.
    What could the potential implications be in this typical EDW architecture?
    Thanks,
    Aditya

  5. Darren Cunningham says:

    Aditya, you’ve got it. We see this exact use case happening every day for both data synchronization and data replication. We talked about the ICC possibilities for ongoing replication with a SaaS app like Salesforce in this webinar.

    http://www.slideshare.net/dcunni07/infa-cloud-datareplication

    Let me know if you’d like to discuss further.

    Darren
    dcunningham @ informatica.com

  6. Aditya Thota says:

    Darren,
    That still answers the question on being able to move data in and out of the cloud based on the use case I have mentioned above. Can I use Informatica Cloud to write to my staging tables, history tables, and data mart tables (all in a database on-premises)? Simply, is it a Y/N?
    Thanks,
    Aditya

  7. Aditya Thota says:

    Ooops!
    That still DOESN’T answer the question on being able to move data in and out of the cloud based on the use case I have mentioned above. Can I use Informatica Cloud to write to my staging tables, history tables, and data mart tables (all in a database on-premises)? Simply, is it a Y/N?
    Thanks,
    Aditya

  8. Pingback: Cloud Integration Webinar: Salesforce.com and SAP « In(tegrate) the Clouds
