Tag Archives: ETL
Managing the recovery and flow of data files throughout your enterprise is much like managing the flow of oil from well to refinery – a wide range of tasks must be carefully completed to ensure optimal resource recovery. If these tasks are not handled properly, or are not addressed in the correct order, valuable resources may be lost. When the process involves multiple pipelines, systems, and variables, managing the flow of data can be difficult.
Organizations have many options for automating the processes of gathering data, transferring files, and executing key IT jobs, including home-built scheduling solutions, system-integrated schedulers, and enterprise schedulers. Enterprise schedulers, such as Skybot Scheduler, often provide the most control over an organization’s workflow because they can create schedules connecting various applications, systems, and platforms.
In this way, the enterprise scheduler facilitates the transfer of data into and out of Informatica PowerCenter and Informatica Cloud, and ensures that raw materials are refined into valuable resources.
Enterprise Scheduling Automates Your Workflow
Think of an enterprise scheduler as the pipeline bearing data from its source to the refinery. Rather than allowing jobs or processes to execute randomly or to sit idle, the enterprise scheduler automates your organization’s workflow, ensuring that tasks are executed under the appropriate conditions without the need for manual monitoring or the risk of data loss.
Skybot Scheduler addresses the most common pain points associated with data recovery, including:
- Scheduling dependencies: Before PowerCenter or Cloud can complete a data-gathering process, other dependencies must be addressed: information must be swept and updated, and files may need to be reformatted. Skybot Scheduler automates these tasks, keeping the data recovery process moving steadily forward.
- Reacting to key events: As with oil recovery, small details can derail the successful mining of data. Key events, such as directory changes, file arrivals, and evaluation requirements, can clog the pipeline. Skybot Scheduler maintains the flow of data by recognizing these key events and reacting to them automatically (a minimal sketch of this pattern follows this list).
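Skybot itself is configured through its own interface, but the underlying pattern (watch for a file arrival, then launch a PowerCenter workflow) can be sketched in a few lines of Python. This is an illustrative sketch only: the directory path, service, domain, folder, and workflow names are hypothetical, and while pmcmd is PowerCenter's real command-line client, the exact arguments depend on your domain configuration.

```python
# Illustrative sketch, not Skybot internals: react to a file arrival by
# starting a PowerCenter workflow with pmcmd, PowerCenter's command-line
# client. All names and paths below are hypothetical placeholders.
import subprocess
import time
from pathlib import Path

LANDING_DIR = Path("/data/landing")  # hypothetical directory to watch
seen = set()

def start_workflow(trigger_file: Path) -> None:
    """Launch the load workflow once a trigger file has arrived."""
    subprocess.run(
        ["pmcmd", "startworkflow",
         "-sv", "IntegrationService",   # hypothetical Integration Service
         "-d", "Domain_Dev",            # hypothetical domain
         "-uv", "PM_USER",              # username taken from an env variable
         "-pv", "PM_PASS",              # password taken from an env variable
         "-f", "SalesFolder",           # hypothetical repository folder
         "wf_load_sales"],              # hypothetical workflow name
        check=True,
    )
    print(f"Started wf_load_sales for {trigger_file.name}")

while True:
    for f in sorted(LANDING_DIR.glob("*.csv")):
        if f not in seen:               # a new file has arrived
            seen.add(f)
            start_workflow(f)
    time.sleep(30)                      # poll every 30 seconds
```

An enterprise scheduler replaces this sort of hand-rolled polling loop with managed event monitors, cross-system dependencies, notifications, and error handling.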
Choose the Best Pipeline Available
Skybot Scheduler is one of the most powerful enterprise scheduling solutions available today, and is the only enterprise scheduler integrated with PowerCenter and Cloud.
Capable of creating comprehensive cross-platform automation schedules, Skybot Scheduler manages the many steps involved in extracting, transforming, and loading data, recognizing directory changes and other key events and reacting to them automatically.
In short, by managing your workflow, Skybot Scheduler increases the efficiency of ETL processes and reduces the potential for costly errors.
To learn more about the power of enterprise scheduling and Skybot Scheduler, check out the webinar Improving Informatica ETL Processing with Enterprise Job Scheduling or download the free trial.
On a recent trip to a new city, I was told that the easiest way from the airport to the hotel was the Metro. I could speak the language, but reading it was another matter. I was surprised by how quickly I navigated to the hotel by following the Metro map, which is based on the successful design of the London Underground map.
Harry Beck was not a cartographer; he was an engineering draftsman who started drawing a different type of map in his spare time. Beck believed that passengers were not worried about the map’s distance accuracy, so he reduced it to straight lines and sharp angles, producing something closer to an electrical schematic than to a conventional geographic map. The company that ran the London Underground was skeptical of Beck’s map, since it was radically different and they had not commissioned the project.
This is the second installment of my multi-part blog series on “hitting the batch wall.” Well, it’s not so much about hitting the batch wall as about what you can do to avoid hitting it. Today’s topic is “throwing hardware” at the problem (a.k.a. hardware scaling). I’ll discuss the common approaches to hardware scaling with Informatica software and the tradeoffs involved.
Before I discuss hardware scaling, I start with this warning: faster hardware only improves the load window when it resolves a bottleneck. Data integration jobs are a lot like rush-hour traffic: they can only run as fast as the slowest component. It doesn’t make any sense to buy a Ferrari if you will always be driving behind a garbage truck. In other words, if your ETL jobs are constrained by the source or target systems, by I/O, or even just by memory, then faster or additional CPUs will rarely improve the situation. Understand your bottlenecks before you start throwing hardware at them!
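To make the slowest-component point concrete, here is a toy calculation; the stage throughputs below are invented for illustration, not measurements from any real system.

```python
# Toy model: an ETL job moves data no faster than its slowest stage.
# All throughput figures are invented for illustration.
stages = {"source reads": 50, "transformations": 200, "target writes": 80}  # MB/s

slowest = min(stages, key=stages.get)
print(f"Effective throughput: {stages[slowest]} MB/s, limited by {slowest}")

# "Throw hardware" at the CPU-bound stage: double the transformation rate.
stages["transformations"] *= 2
slowest = min(stages, key=stages.get)
print(f"After the upgrade: {stages[slowest]} MB/s, still limited by {slowest}")
# The load window does not shrink: the source system, not CPU, is the wall.
```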
This year marks the 20th anniversary for Informatica: twenty years of solving the problem of getting data from point A to point B, improving its quality, establishing a single view of it, and managing it over its life cycle. After 20 years of innovation and leadership in the data integration market, one would think the problem had been solved and all data had been extracted, transformed, cleansed, and managed. It hasn’t; companies still need data integration. Why? Data is a complicated business. And with data increasingly central to business survival, organizations are constantly looking for ways to unlock new sources of it, use it as an unforeseen source of insight, and do it all with greater agility and at lower cost.
In a recent webinar, Mark Smith, CEO of Ventana Research, and David Lyle, vice president of Product Strategy at Informatica, discussed “Building the Business Case and Establishing the Fundamentals for Big Data Projects.” Mark pointed out that the second biggest barrier to improving big data initiatives is that the “business case is not strong enough.” The first and third barriers, respectively, were “lack of resources” and “no budget,” both of which also come back to having a strong business case. In this context, Dave provided a simple formula from which to build the business case:
Return on Big Data = Value of Big Data / Cost of Big Data
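As a toy illustration of the formula (all figures invented, not from the webinar):

```python
# Hypothetical numbers to illustrate the formula; not from the webinar.
value_of_big_data = 2_400_000  # estimated annual benefit, in dollars
cost_of_big_data = 800_000     # infrastructure, software, and staff, in dollars

return_on_big_data = value_of_big_data / cost_of_big_data
print(f"Return on Big Data: {return_on_big_data:.1f}x")  # 3.0x
```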
“The report of my death was an exaggeration.”
– Mark Twain
Ah yes, another conference, another old technology declared dead. Mainframe… dead. Any programming language other than Java… dead. 8-track tapes… OK, well, some things thankfully do die, along with the Ford Pinto in which I used to listen to the Beatles’ greatest-hits Red Album over and over on 8-track. Ah yes, the good old days. But I digress.
Ever wondered if an initiative is worth the effort? Ever wondered how to quantify its worth? This is a loaded question, as you may suspect, but I wanted to ask it nevertheless, because my team of Global Industry Consultants works with clients around the world to do just that (a Business Value Assessment, or BVA) for solutions anchored around Informatica’s products.
Because these solutions typically involve multiple core business processes, stretch across multiple departments, and leverage a legion of technology components (ETL, metadata management, business glossary, BPM, data virtualization, legacy ERP, CRM, and billing systems), the exercise initially sounds dauntingly complex. Opening this can of worms may end in measurement fatigue (I think I just discovered a new medical malaise).
In my previous blog post on this subject, I talked about the incredible innovations of Hadoop as a new analytics engine and Informatica’s innovations in removing unmaintainable, complex hand-coding. In this post I want to drill into the world of Informatica ETL and Hadoop to show why these two innovations are critical to augmenting traditional data processing approaches as companies begin to leverage Big Data for new analytics.
I recently had the pleasure of participating in a big data panel at the Pacific Crest investors’ conference (the replay is available here). I was joined on the panel by Hortonworks, MapR, DataStax, and Microsoft. There is clearly a lot of interest in the world of big data and how the market is evolving. I came away from the panel with four fundamental thoughts.