Tag Archives: PowerCenter Express

Large Data Sets Experience is Needed by Computer Science Graduates

In recent interviews with graduates, I asked about the technical items listed on their resumes. What I heard surprised me in a couple of cases, in particular how small the data sets were that recent graduates had worked with. Several had never used a large data set. I got the impression that some professors do not care about the volume of data students work with when they build their applications.

The media constantly discusses a mismatch between the skills that education provides and the capabilities graduates need in the workplace, and whether graduates are prepared for work. If graduates have never used a large data set, skills that employers need may be missing. I will outline the skills that can be gained by working with large data sets.

Some types of data handling are simply high volume. Business intelligence and analytics consume more data than they did 20 years ago, so handling the increasing volume is important. Research programming and data science are truly part of big data. Even if you are not doing data science, you may be preparing and handling the data sets. Some industries and organisations just have higher volumes of data; retail is one example. Companies that used to have less volume are obtaining more data as they adapt to the big data world. We should expect the same trend at organisations that already had high data volumes: they are going to have to handle far more data than before.

There are practical aspects to handling large data sets. Working with them builds experience in storage management and design, data loading, query optimization, parallelization, bandwidth issues and data quality. Taking on those issues also requires, and develops, architecture skills.

Today, the trends known as the Internet of Things, All Things Data, and Data First are forming. As a result there will be demand for graduates who are familiar with handling high volumes of data.

The responsibility for using a large data set falls to the student. Faculty staff need to encourage this, though, because they often set and guide the students’ goals. A number of large data sets that students could use are on the web. One example is the Harvard Library Bibliographic Dataset, available at http://openmetadata.lib.harvard.edu/bibdata. Another is the City of Chicago, which makes a number of datasets available for download in a wide range of standard formats at https://data.cityofchicago.org/. The advantages of large public data sets are their volume and the opportunity to assess their data quality. Public data sets can hold many records, and they represent many more combinations than we can quickly generate by hand. Using even a small real-world data set is a vast improvement over the limited number of variations likely in self-generated data. It may even be better than using a tool to generate data. Once downloaded, such data can be manipulated and used as a base for loading.
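As a rough illustration of what assessing the data quality of a downloaded data set can look like, here is a minimal Python sketch that streams a CSV file row by row and counts empty values per column. It assumes the data set was downloaded in CSV format with a header row. The function name `profile_csv` and the sample columns are hypothetical and are not taken from either data set mentioned above.

```python
import csv
import io
from collections import Counter


def profile_csv(lines):
    """Stream CSV lines and count rows and empty values per column.

    `lines` is any iterable of text lines (an open file works), so the
    whole data set never has to fit in memory.
    """
    reader = csv.reader(lines)
    header = next(reader)  # assumption: the first row holds column names
    row_count = 0
    empty_counts = Counter()
    for row in reader:
        if not row:  # skip completely blank lines
            continue
        row_count += 1
        for index, column in enumerate(header):
            # Treat a missing trailing field the same as an empty one.
            value = row[index] if index < len(row) else ""
            if value.strip() == "":
                empty_counts[column] += 1
    return row_count, dict(empty_counts)


# Tiny in-memory stand-in for a downloaded file (hypothetical columns).
sample = io.StringIO("id,title,year\n1,Moby Dick,1851\n2,,1900\n3,Emma,\n")
rows, empties = profile_csv(sample)
print(rows, empties)
```

For a real download, pass an open file instead of the sample, for example `open(path, newline="", encoding="utf-8")`. Because the rows are streamed, the same function works whether the file has a hundred rows or several million.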

Loading large data sets is part of being prepared, and it requires tools. These range from simple loaders to full data integration suites. A good option for students who need to load data sets is PowerCenter Express. It was announced last year and is free for use with up to 250,000 rows per day. It is an ideal way to experience a full enterprise data integration tool and work with significantly higher volumes.
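To give a feel for the "simple loader" end of that range, here is a minimal Python sketch that loads rows into a database in fixed-size batches. It uses Python's built-in `sqlite3` module and is not how PowerCenter Express works. The table name `records`, its two columns, and the function `load_in_batches` are assumptions made up for the illustration.

```python
import sqlite3
from itertools import islice


def load_in_batches(connection, rows, batch_size=1000):
    """Insert (id, title) tuples in fixed-size batches; return rows loaded."""
    # Hypothetical target table, created on first use.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS records (id INTEGER, title TEXT)"
    )
    iterator = iter(rows)
    total = 0
    while True:
        # Pull at most batch_size rows so memory use stays bounded.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        with connection:  # commits the batch, or rolls it back on error
            connection.executemany(
                "INSERT INTO records (id, title) VALUES (?, ?)", batch
            )
        total += len(batch)
    return total


connection = sqlite3.connect(":memory:")
generated = ((i, f"title {i}") for i in range(2500))
print(load_in_batches(connection, generated, batch_size=1000))
```

Committing per batch rather than per row or once at the very end is a common compromise: a failure loses at most one batch, and the database is not forced to commit after every single insert.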

Big Data is here and it is a growing trend, so students need to work with larger data sets than before. This is also feasible: the tools and the data sets that students need are available. Therefore, in view of the current trends, large data set use should become standard practice in computer science and related courses.

Posted in Data Integration, Data Quality, PowerCenter, Uncategorized

Power(Center) to the People

At InformaticaWorld, we made a very exciting announcement: the introduction of PowerCenter Express, our entry-level data integration and profiling tool. What is PowerCenter Express, exactly? Well, in a nutshell, it’s giving the Power of PowerCenter to everyone, “to the people” if you like. We made PowerCenter Express available to all attendees at InformaticaWorld, and they’ll be able to install it and be up and running in less than ten minutes. Since it’s PowerCenter, they’ll be able to scale up to enterprise-class capabilities whenever they need to, using Vibe, our “Map Once, Deploy Anywhere” technology. Starting in July, PowerCenter Express will be generally available to everyone as a free download from Informatica’s Marketplace.

With PowerCenter Express, we are making sure that everyone, including departments and growing businesses, has access to PowerCenter’s high-quality data integration and profiling tools. Until now, the options for these groups have been limited to hand coding or open source products. Neither option scales to handle enterprise-class data integration requirements. So when these smaller organizations reached the point where they needed enterprise-class capabilities and had to migrate to an enterprise data integration tool, they had no choice but to scrap all of their prior work. We don’t want that to happen anymore. We don’t want anyone to have to re-write mappings or re-do work, ever. We want people to be able to map once and deploy anywhere. That is what PowerCenter Express makes possible: any organization, no matter how small, can start with PowerCenter, the gold standard for data integration, and stay with PowerCenter, re-using those same mappings when they transition to enterprise class or when they want to deploy those mappings to Hadoop.

The reality is that as organizations’ data integration complexity reaches a certain point, they end up coming to Informatica for the best products, the best support and the biggest ecosystem of developers. But in the past, for smaller organizations, starting with the fully functional PowerCenter wasn’t always the best option. With PowerCenter Express, organizations can start small, start now, and scale fast. PowerCenter Express offers a real choice and future protection for entry-level data integration.

If you’d like to learn more about PowerCenter Express before the public launch, shoot me an email at EBurns@Informatica.com. And start following me here: I’ll be posting a lot about this exciting new product over the coming weeks and months.

Emily V. Burns
Sr. Product Marketing Manager, PowerCenter Express

Posted in Marketplace

World’s Best Entry Level Data Integration Product Has Finally Arrived!

For those of you hanging out at Informatica World, this is not news. For those of you who aren’t in Vegas with us, you missed the unveiling of the world’s best entry-level data integration platform. So you heard it here second, not first. Next time, if you want to hear about this kind of stuff first, you have to show up at Informatica World! <shameless plug for INFAWorld 2013 complete>

So, what is it that I am bragging about? PowerCenter Express, that’s what. This is the latest addition to the Informatica PowerCenter family of products, specifically designed for entry-level data integration and data profiling. This product will be downloadable over the Internet and installs in as little as 5 minutes. It is super simple to use but has all of the rich transformation functionality you are used to from Informatica. Also, you don’t have to install a separate profiling product; everything is self-contained. The product comes with built-in “cheat sheets” that walk you through how to use the product in a step-by-step fashion. In addition, there is complete documentation as well as video-based tutorials.

But best of all, PC Express delivers the kind of product quality you are accustomed to from Informatica. What does that mean? It means that unlike most of the entry-level data integration products available for download, PC Express just works. It doesn’t crash just because your ETL job requires more memory than you have on your machine; it gracefully caches to disk.

But wait, there’s more. For the first time ever, Informatica is offering a FREE version of our market-leading PowerCenter product. There will be two versions of PowerCenter Express:

  • PowerCenter Express Personal Edition – available for FREE for a single developer at a time
  • PowerCenter Express Professional Edition – available for $8K/user per year subscription (at the time of this blog post)

And one last important point. PC Express is based on the same virtual data machine as our enterprise-class products and our cloud-based products. This means that at some later date, if you decide you need more scalability, more users, or enterprise-class features like high availability, you can easily migrate from PC Express to the other Informatica data integration product lines.

So if you are at Informatica World, you will be receiving an email outlining how you can download and try out PowerCenter Express. If you aren’t at Informatica World, maybe you have a friend who will share the secret website location where you can get a sneak peek at PowerCenter Express. If you don’t have any friends who went to Informatica World, well, you will just have to wait until the download site goes public in July. And next time you will know that you had better go to Informatica World if you want to get early access to cool stuff.

Posted in Data Integration, SaaS