Tag Archives: grid computing

Hitting the Batch Wall, Part 2: Hardware Scaling

This is the second installment of my multi-part blog series on “hitting the batch wall.” Well, it’s not so much about hitting the batch wall, but what you can do to avoid hitting the wall. Today’s topic is “throwing hardware” at the problem (a.k.a. hardware scaling). I’ll discuss the common approaches and the tradeoffs of hardware scaling with Informatica software.

Before I can begin to discuss hardware scaling, I start with this warning: faster hardware only improves the load window situation when it resolves a bottleneck. Data integration jobs are a lot like rush hour traffic, they can only run as fast as the slowest component. It doesn’t make any sense to buy a Ferrari if you will always be driving behind a garbage truck. In other words, if your ETL jobs are constrained by the source/target systems or I/O or even just memory, then faster/more CPUs will rarely improve the situation. Understand your bottlenecks before you start throwing hardware at them! (more…)

FacebookTwitterLinkedInEmailPrintShare
Posted in Data Integration | Tagged , , , , , | 2 Comments

Building the Business Case for Big Data: Learn to Walk Before You Run

In a recent webinar, Mark Smith, CEO at Ventana Research and David Lyle, vice president, Product Strategy at Informatica discussed: “Building the Business Case and Establishing the Fundamentals for Big Data Projects.”  Mark pointed out that the second biggest barrier that impedes improving big data initiatives is that the “business case is not strong enough.” The first and third barriers respectively, were “lack of resources” and “no budget” which are also related to having a strong business case. In this context, Dave provided a simple formula from which to build the business case:

Return on Big Data = Value of Big Data / Cost of Big Data (more…)

FacebookTwitterLinkedInEmailPrintShare
Posted in Big Data | Tagged , , , , , , , , , , , | Leave a comment