Big Data Edition is Here!

The PowerCenter Big Data Edition’s here!  The PowerCenter Big Data Edition’s here!

“The new phone book’s here! The new phone book’s here!”
“…Millions of people look at this book every day! This is the kind of spontaneous publicity – your name in print – that makes people. I’m in print! Things are going to start happening to me now!”

Navin R. Johnson, The Jerk

OK, so maybe the rest of you aren’t as excited as Navin R. Johnson (aka Steve Martin) and the rest of us over here in InformaticaLand (the happiest place on earth for data integration), but you will be pretty soon. I have to say that this is pretty darn cool.

One of the big issues that comes up over and over again (like plastic parsley: use it over and over again) in high tech is that a new technology comes out and it is awesome, but you have to be a rocket scientist or a brain surgeon to use it. Or, in the case of Hadoop, you need to be a data scientist. That is the one big issue we keep hearing from the mainstream market about the big data opportunity: people are simply afraid of it. They don’t have the knowledge, and they can’t acquire the skills they need, to make the deployment of a Hadoop cluster successful.

This was also true back in 1993, when Informatica was founded. People had to extract, transform and load data back then, and the overwhelming solution was to hand code the process. Then companies like Informatica came along, automated that process, and created graphical tooling that made it easier for non-data scientists to move and manage what was considered, at the time, to be large amounts of data. So it is déjà vu all over again, this time with Hadoop and big data.

The main difference is that this time, ETL products for moving data already exist. By simply extending the paradigm to support Hadoop as a data transformation engine, the tens of thousands of developers who were using Informatica PowerCenter to develop ETL jobs on the Informatica ETL engine can now also use that same development environment to create Hadoop jobs. So if you want to start pulling in social media data, processing it, and then incorporating the output into your existing data flows, you can use the same graphical development environment you have grown accustomed to all these years.

The big data scientists here at Informatica have invented something we are calling the Virtual Data Machine. It is thanks to this concept that the same GUI used for traditional ETL can also be used for data federation, grid-based ETL and now for Hadoop! I have been in high tech software for over 20 years, and while I have only been at Informatica for a little while, I have to say that the thinking behind the Virtual Data Machine is absolutely brilliant. The bottom line is that customers who purchased PowerCenter for ETL were future-proofed for Hadoop, and they didn’t even know it.

But more importantly, you don’t have to be a rocket scientist to get started using Hadoop. And that is even more exciting than finding out that your name is in the new phone book.

This entry was posted in Big Data, Data Integration.