Tag Archives: PowerCenter
When I was seven years old, Danny Weiss had a birthday party where we played the telephone game. The idea is this: there are 8 people sitting around a table, and the first person tells the next person a little story. They tell the next person the story, and so on, all the way around the room. At the end of the game, you compare the original story that the first person told to the story the 8th person tells. Of course, the stories are very different and everyone giggles hysterically… we were seven years old after all.
The reason I was thinking about this story is that data integration development is about as inefficient as a seven-year-old’s birthday party. The typical process is that a business analyst, using the knowledge in their head about the business applications they are responsible for, creates a spreadsheet in Microsoft Excel that has a list of database tables and columns along with a set of business rules for how the data is to be transformed as it is moved to a target system (a data warehouse or another application). The spreadsheet, which is never checked against real data, is then passed to a developer, who creates code in a separate system to move the data. That code is checked by a QA person, and the result is checked again by the business analyst at the end of the process. This is the first time the business analyst verifies their specification against real data.
99 times out of 100, the data in the target system doesn’t match what the business analyst was expecting. Why? Either the original specification was wrong because the business analyst made a typo, or the data is inaccurate. Or the data in the original system wasn’t organized the way the analyst thought it was organized. Or the developer misinterpreted the spreadsheet. Or the business analyst simply doesn’t need this data anymore and needs some other data instead. The result is lots of errors, just like the telephone game. And the only way to fix it is with rework and then more rework.
But there is a better way. What if the data analyst could validate their specification against real data and self-correct on the fly before passing the specification to the developer? What if the specification were not just a specification, but a prototype that could be passed directly to the developer, who wouldn’t recode it, but would just modify it to add scalability and reliability? The result is much less rework and much faster development. In fact, up to 5 times faster.
That is what Agile Data integration is all about. Rapid prototyping and self-validation against real data up front by the business analyst. Sharing of results in a common toolset back and forth to the developer to improve the accuracy of communication.
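To make the idea of "self-validation against real data" a little more concrete, here is a minimal sketch in plain Python. It is not how PowerCenter does it; the `Rule` class, the `profile_spec` function, the column names and the sample rows are all hypothetical, made up purely for illustration. The point is only that a mapping specification can be run against a handful of real rows before it ever reaches a developer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """One line of a (hypothetical) mapping spec: source column -> target column."""
    source_column: str
    target_column: str
    transform: Callable[[str], object]


def profile_spec(rules, rows):
    """Apply every rule to every sample row and collect problems instead of raising."""
    problems = []
    for i, row in enumerate(rows):
        for rule in rules:
            if rule.source_column not in row:
                problems.append((i, rule.target_column, "missing source column"))
                continue
            try:
                rule.transform(row[rule.source_column])
            except (ValueError, TypeError) as exc:
                problems.append((i, rule.target_column, str(exc)))
    return problems


# Hypothetical spec and sample data, invented for this example.
rules = [
    Rule("cust_id", "customer_id", int),
    Rule("signup", "signup_year", lambda s: int(s[:4])),
]
sample_rows = [
    {"cust_id": "101", "signup": "2013-06-04"},
    {"cust_id": "N/A", "signup": "2013-06-05"},  # id is not numeric
    {"cust_id": "103"},                          # signup column is absent
]

problems = profile_spec(rules, sample_rows)
for row_index, target, message in problems:
    print(f"row {row_index}: {target}: {message}")
```

With this sample, the second and third rows are flagged, which is exactly the kind of surprise the analyst would otherwise discover only at the end of the telephone game.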
Because we believe the agile process is so important to your success, Informatica is giving all of our PowerCenter Standard Edition (and higher editions) customers agile data integration for FREE!!! That’s right, if you are a current customer of Informatica PowerCenter, we are giving you the tools you need to go from the old-fashioned, error-prone, waterfall, telephone-game style of development to a modern 21st century Agile process.
• FREE rapid prototyping and data profiling for the data analyst.
• Go from prototype to production with no recoding.
• Better communication and better collaboration between analyst and developer.
PowerCenter 9.6. Agile Data Integration built in. No more telephone game. It doesn’t get any better than that.
My wife invited my new neighbors over for dinner this past Saturday night. They are a French couple with a super cute 5 year old son. Dinner was nice, and like most ex-pats in the San Francisco Bay Area, he is in high tech. His company is a successful internet company in Europe, but it has had a hard time penetrating the U.S. market, which is why they moved to the Bay Area. He is starting up a satellite engineering organization in Palo Alto and he asked me where he can find good “big data” engineers. He is having a hard time finding people.
This is a story that I am hearing quite a bit from customers that I have been talking to as well. They want to start up big data teams, but can’t find enough skilled engineers who understand how to develop in Pig or Hive or YARN or whatever is coming next in the Hadoop/MapReduce world.
This reminds me of when I used to work in the telecom software business 20 years ago and everyone was looking at technologies like DCE and CORBA to build out distributed computing environments to solve complex problems that couldn’t be solved easily on a single computing system. If you don’t know what DCE or CORBA are/were, that’s OK. It is kind of the point. They are distributed computing development platforms that failed because they were too damn hard and there just weren’t enough people who could understand how to use them effectively. Now DCE and CORBA were not trying to solve the same problems as Hadoop, but the basic point still stands: they were damn hard, and the reality is that programming on a Hadoop platform is damn hard as well.
So could Hadoop fail, just like CORBA and DCE? I doubt it, for a few key reasons. First, there is a considerable amount of venture and industrial investment going into Hadoop to make it work. Not since Java has there been such a concerted effort by the industry to try to make a new technology successful. Second, much of that investment is in providing graphical development environments and applications that use the storage and compute power of Hadoop, but hide its complexity. That is what Informatica is doing with PowerCenter Big Data Edition. We are making it possible for data integration developers to parse, cleanse, transform and integrate data using Hadoop as the underlying storage and engine. But the developer doesn’t have to know anything about Hadoop. The same thing is happening at the analytics layer, at the data prep layer and at the visualization layer.
Bit by bit, software vendors are hiding the underlying complexity of Hadoop so organizations won’t have to hire an army of big data scientists to solve interesting problems. They will still need a few of them, but not so many that Hadoop will end up like those other technologies that most Hadoop developers have never even heard of.
Power to the elephant. And more later about my dinner guest and his super cute 5 year old son.
At InformaticaWorld, we made a very exciting announcement: the introduction of PowerCenter Express, our entry-level data integration and profiling tool. What is PowerCenter Express, exactly? Well, in a nutshell, it’s giving the Power of PowerCenter to everyone, “to the people” if you like. We made PowerCenter Express available to all attendees at InformaticaWorld and they’ll be able to install it and be up and running in less than ten minutes. Since it’s PowerCenter, they’ll be able to scale up to enterprise class capabilities whenever they need to, using Vibe, our “Map Once, Deploy Anywhere” technology. Starting in July, PowerCenter Express will be generally available to everyone as a free download from Informatica’s Marketplace.
What we are doing with PowerCenter Express is making sure that everyone, including departments and growing businesses, has access to PowerCenter’s high quality data integration and profiling tools. Until now the options for these groups have been limited: hand coding or open source products. Neither of these options is able to scale to handle enterprise class data integration requirements. That meant that, before PowerCenter Express, when these smaller organizations reached the point where they needed enterprise class capabilities and had to migrate to an enterprise data integration tool, they had no choice but to scrap all of their prior work. We don’t want that to happen anymore. We don’t want anyone to have to re-write mappings, to re-do work, ever. We want people to be able to map once, and deploy anywhere. And that’s what PowerCenter Express makes possible: any organization, no matter how small, can start with PowerCenter, the gold standard for data integration, and stay with PowerCenter, re-using those same mappings when they transition to enterprise class, or when they want to deploy those mappings to Hadoop.
The reality is, as organizations’ data integration complexity reaches a certain point, they end up coming to Informatica for the best products, the best support and the biggest ecosystem of developers. But in the past, for smaller organizations, starting with the fully functional PowerCenter wasn’t always the best option. With PowerCenter Express, organizations can start small, start now, and scale fast. PowerCenter Express offers a real choice and future protection for entry-level data integration.
If you’d like to learn more about PowerCenter Express before the public launch, shoot me an email at EBurns@Informatica.com. And start following me here, I’ll be posting a lot about this exciting new product over the coming weeks and months.
Emily V. Burns
Sr. Product Marketing Manager, PowerCenter Express
For those of you hanging out at Informatica World, this is not news. For those of you who aren’t in Vegas with us, you missed the unveiling of the world’s best entry level data integration platform. So you heard it here second, not first. Next time, if you want to hear about this kind of stuff first, you have to show up at Informatica World! <shameless plug for INFAWorld 2013 complete>
So, what is it that I am bragging about? PowerCenter Express, that’s what. This is the latest addition to the Informatica PowerCenter family of products, specifically designed for entry level data integration and data profiling. This product will be downloadable over the Internet and installs in as little as 5 minutes. It is super simple to use but has all of the rich transformation functionality you are used to from Informatica. Also, you don’t have to install a separate profiling product, everything is self-contained. The product comes with built in “cheat sheets” that walk you through how to use the product in a step by step fashion. In addition, there is complete documentation as well as video based tutorials.
But best of all, PC Express delivers the kind of product quality you are accustomed to from Informatica. What does that mean? It means that unlike most of the entry level data integration products available for download, PC Express just works. It doesn’t crash just because your ETL job requires more memory than you have on your machine; it gracefully caches to disk.
But wait, there’s more. For the first time ever, Informatica is offering a FREE version of our market leading PowerCenter product. There will be two versions of PowerCenter Express:
- PowerCenter Express Personal Edition – available for FREE for a single developer at a time
- PowerCenter Express Professional Edition – available for $8K/user per year subscription (at the time of this blog post)
And one last important point. PC Express is based on the same virtual data machine as our enterprise class products and our cloud based products. This means that at some later date, if you decide you need more scalability, more users, or enterprise class features like high availability, you can easily migrate from PC Express to the other Informatica data integration product lines.
So if you are at Informatica World, you will be receiving an email outlining how you can download and try out PowerCenter Express. If you aren’t at Informatica World, maybe you have a friend who will share the secret website location where you can get a sneak peek at PowerCenter Express. If you don’t have any friends who went to Informatica World, well, you will just have to wait until the download site goes public in July. And next time you will know that you better go to Informatica World if you want to get early access to cool stuff.
“We do nightly updates to our data warehouse, but we have no way to validate that the data was moved and transformed correctly in the time available. We only have time to test a subset and hope that it is the right subset.”
“We have tools to tell us about performance after an issue occurs, but nothing that helps us prevent the issue in the first place. So, we find out about failures from the end-users looking at a bad report. This causes delayed or poor business decision making, and also impacts our departments’ reputation.”
These are just a couple of quotes I have heard from data integration end users. We have been collecting similar information from thousands of our data integration customers around their challenges across the data integration lifecycle. What we are hearing is that the growth of data within organizations and ever increasing demand for more timely data has introduced a number of threats. In turn, these new demands have created massive variability in the way customers approach their projects, which introduces a host of data integration challenges, especially in production.
So naturally, organizations have taken a variety of approaches to combat these threats – ranging from adding full-time employee/contractor teams to test and monitor workflows to developing customized scripts or a combination of both. Ok – so there are monitoring tools out there. But, the fact remains that generic monitoring tools don’t uncover deep data integration issues. This was discussed at length here. Additionally, typical testing efforts such as “stare and compare” and hand-coding are manual, un-repeatable, and un-auditable as discussed here.
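As a rough illustration of what "repeatable" means compared with "stare and compare", here is a small sketch that reconciles a source table against a target table by row count and by key. It uses Python’s built-in `sqlite3` module with an in-memory database; the table names (`src_orders`, `dw_orders`), the columns and the data are hypothetical, and this is not how Informatica Data Validation works internally. It also loads both tables into memory, which is fine for a demo but would need hashing or chunking at real warehouse volumes.

```python
import sqlite3


def table_snapshot(conn, table, key, columns):
    """Return (row_count, {key_value: tuple_of_other_columns}) for one table.

    Table and column names are interpolated into the SQL, so they must be
    trusted identifiers, never user input.
    """
    col_list = ", ".join([key] + columns)
    rows = conn.execute(f"SELECT {col_list} FROM {table}").fetchall()
    return len(rows), {row[0]: row[1:] for row in rows}


def compare_tables(conn, source, target, key, columns):
    """Build a small reconciliation report between source and target."""
    src_count, src = table_snapshot(conn, source, key, columns)
    tgt_count, tgt = table_snapshot(conn, target, key, columns)
    return {
        "source_rows": src_count,
        "target_rows": tgt_count,
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "unexpected_in_target": sorted(tgt.keys() - src.keys()),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Hypothetical source and warehouse tables, invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE dw_orders  (order_id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO src_orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO dw_orders  VALUES (1, 10.0), (2, 25.0);
""")

report = compare_tables(conn, "src_orders", "dw_orders", "order_id", ["amount"])
print(report)
```

Because the check is just code, it can run after every nightly load and its output can be kept as an audit trail, which a manual spot check cannot offer.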
Now recall the point about added pressure of explosive data growth and increasing demands for timely delivery, and add to it the wide variability in how organizations approach projects – it’s scary. But what’s more concerning is that this variability riddles production environments with errors, inefficiencies, and security threats. Manual and reactive approaches to remedy the problem only exacerbate issues by increasing complexity. The result is delays in identifying, monitoring and fixing issues, which typically trigger fire drills, manual workarounds and reactive measures without addressing the underlying problems.
If any of this sounds familiar to you, you may want to take advantage of what we are doing at Informatica World 2013 to help you address these challenges. To make things convenient for you, I have prepared a customized guide on relevant sessions, hands-on labs and booths to help you understand how automated, repeatable and auditable testing, along with a pre-emptive approach to defusing threats before they erupt into full-blown issues, can help you. Please feel free to sign up here for some of the breakout sessions we are hosting on this topic, or swing by one of the labs or booths:
Tuesday, June 4
- 9:00AM – 10:00AM – Platform & Products Track – Best Practices for Doing Data Replication the Right Way
- 2:00PM – 3:00PM – Architecture Track – PowerCenter Architecture
Wednesday, June 5
- 2:30PM – 3:30PM – Platform & Products Track – New Approaches to Reducing PowerCenter Testing and Monitoring Time
Thursday, June 6
- 9:00AM – 10:00AM – Tech Talk Track – Proactive Monitoring: Greater IT Productivity with Streamlined Data Integration
- 10:15AM – 11:15AM – Platform & Products Track – What’s New from Informatica to Improve Data Warehouse Performance and Lowering Costs
BIRDS OF A FEATHER ROUNDTABLES (Check Agenda for Daily Timings):
- Testing Strategies and Tools
- Automating Administrative Maintenance Tasks
HANDS-ON LABS (Check Agenda for Daily Timings):
- Table 42 – Informatica Data Validation
- Table 43 – Informatica Proactive Monitoring
BOOTHS (Check Agenda for Daily Timings):
- PowerCenter Developer Productivity and Production Manageability
I look forward to seeing you at Informatica World 2013.
The hype around big data is certainly top of mind with executives at most companies today but what I am really seeing are companies finally making the connection between innovation and data. Data as a corporate asset is now getting the respect it deserves in terms of a business strategy to introduce new innovative products and services and improve business operations. The most advanced companies have C-level executives responsible for delivering top and bottom line results by managing their data assets to their maximum potential. The Chief Data Officer and Chief Analytics Officer own this responsibility and report directly to the CEO. (more…)
In Ashwin Viswanath’s previous video blog, he spoke about why it is important to have a cloud integration solution that has purpose-built integration applications. In this video, he delves deeper into the security aspects of cloud integration and how to rapidly provision integration environments for distributed business units, subsidiaries and departments in a quick and efficient manner.
This year marks the 20th anniversary for Informatica. Twenty years of solving the problem of getting data from point A to point B, improving its quality, establishing a single view and managing it over its life-cycle. Yet after 20 years of innovation and leadership in the data integration market, when one would think the problem had been solved, all data had been extracted, transformed, cleansed and managed, it actually hasn’t — companies still need data integration. Why? Data is complicated business. And with data increasingly becoming central to business survival, organizations are constantly looking for ways to unlock new sources of it, use it as an unforeseen source of insight and do it all with greater agility and at lower cost. (more…)
Search Advertising, also known as Search Engine Marketing (SEM), is a unique medium for attracting new customers. It is a method of placing online advertisements on Web pages that show results from search engine queries. Through the same search engine advertising services, advertisements can also be placed on Web pages with other published content. It is a multi-billion dollar industry. Advertising on Google, Yahoo and MSN gives you a total reach of roughly 86.4% of all Internet users. With such a broad reach, Search Advertising is one of the most extensible forms of advertising available, with added benefits that other forms of advertising are lacking.
In Search Advertising advertisers pay the website owner for clicks on their ads. There are two major types of Search Advertising: Sponsored Search and Content Placement Targeting. (more…)
In a recent webinar, Mark Smith, CEO at Ventana Research, and David Lyle, vice president, Product Strategy at Informatica, discussed “Building the Business Case and Establishing the Fundamentals for Big Data Projects.” Mark pointed out that the second biggest barrier that impedes improving big data initiatives is that the “business case is not strong enough.” The first and third barriers, respectively, were “lack of resources” and “no budget,” which are also related to having a strong business case. In this context, Dave provided a simple formula from which to build the business case:
Return on Big Data = Value of Big Data / Cost of Big Data (more…)
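To show the arithmetic, here is the formula as a tiny Python function. The function name and the dollar figures are hypothetical, chosen only to make the division easy to follow; they are not numbers from the webinar.

```python
def return_on_big_data(value, cost):
    """Return on Big Data = Value of Big Data / Cost of Big Data."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return value / cost


# Hypothetical figures: $1.5M of estimated value against $500K of cost.
ratio = return_on_big_data(value=1_500_000, cost=500_000)
print(ratio)  # 3.0
```

A ratio above 1 means the estimated value exceeds the cost; the business case gets stronger either by raising the numerator or by lowering the denominator.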