Tag Archives: Data Warehousing
You probably know this already, but I’m going to say it anyway: It’s time you changed your infrastructure. I say this because most companies are still running infrastructure optimized for ERP, CRM and other transactional systems. That’s all well and good for running IT-intensive, back-office tasks. Unfortunately, this sort of infrastructure isn’t great for today’s business imperatives of mobility, cloud computing and Big Data analytics.
Virtually all of these imperatives are fueled by information gleaned from potentially dozens of sources to reveal our users’ and customers’ activities, relationships and likes. Forward-thinking companies are using such data to find new customers, retain existing ones and increase their market share. The trick lies in translating all this disparate data into useful meaning. And to do that, IT needs to move beyond focusing solely on transactions, and instead shine a light on the interactions that matter to their customers, their products and their business processes.
They need what we at Informatica call a “Data First” perspective. You can check out my first blog post about being Data First here.
A Data First POV changes everything from product development, to business processes, to how IT organizes itself and, most especially, the impact IT has on your company’s business. That’s because cloud computing, Big Data and mobile app development shift IT’s responsibilities away from running and administering equipment and toward aggregating, organizing and improving myriad data types pulled in from internal and external databases, online posts and public sources. And that shift makes IT a more empowering force for business change. Think about it: The ability to connect and relate the dots across data from multiple sources finally gives you real power to improve entire business processes, departments and organizations.
I like to say that the role of IT is now “big I, little t,” with that lowercase “t” representing both technology and transactions. But that role requires a new set of priorities. They are:
- Think about information infrastructure first and application infrastructure second.
- Create great data by design. Architect for connectivity, cleanliness and security. Check out the eBook Data Integration for Dummies.
- Optimize for speed and ease of use – SaaS and mobile applications change often. Click here to try Informatica Cloud for free for 30 days.
- Make data a team sport. Get tools into your users’ hands so they can prepare and interact with it.
I never said this would be easy, and there’s no blueprint for how to go about doing it. Still, I recognize that a little guidance will be helpful. In a few weeks, Informatica’s CIO Eric Johnson and I will talk about how we at Informatica practice what we preach.
Getting started with Cloud Data Warehousing using Amazon Redshift is now easier than ever, thanks to the Informatica Cloud’s 60-day trial for Amazon Redshift. Now, anyone can easily and quickly move data from any on-premise, cloud, Big Data, or relational data sources into Amazon Redshift without writing a single line of code and without being a data integration expert. You can use Informatica Cloud’s six-step wizard to quickly replicate your data or use the productivity-enhancing cloud integration designer to tackle more advanced use cases, such as combining multiple data sources into one Amazon Redshift table. Existing Informatica PowerCenter users can use Informatica Cloud and Amazon Redshift to extend an existing data warehouse through an affordable and scalable approach. If you are currently exploring self-service business intelligence solutions such as Birst, Tableau, or MicroStrategy, the combination of Redshift and Informatica Cloud makes it incredibly easy to prepare the data for analytics by any BI solution.
To get started, execute the following steps:
- Go to http://informaticacloud.com/cloud-trial-for-redshift and click on the ‘Sign Up Now’ link
- You’ll be taken to the Informatica Marketplace listing for the Amazon Redshift trial. Sign up for a Marketplace account if you don’t already have one, and then click on the ‘Start Free Trial Now’ button
- You’ll then be prompted to log in with your Informatica Cloud account. If you do not have an Informatica Cloud username and password, register by clicking the appropriate link and filling in the required details
- Once you finish registration and obtain your login details, download the Vibe™ Secure Agent to your Amazon EC2 virtual machine (or to a local Windows or Linux instance), and ensure that it can access your Amazon S3 bucket and Amazon Redshift cluster.
- Ensure that your S3 bucket and Redshift cluster are both in the same AWS region
- To start using the Informatica Cloud connector for Amazon Redshift, create a connection to your Amazon Redshift nodes by providing your AWS Access Key ID and Secret Access Key, specifying your cluster details, and obtaining your JDBC URL string.
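As a rough illustration of the connection details gathered in the last step, here is a minimal Python sketch. It assumes the `jdbc:redshift://host:port/database` URL form used by Amazon's Redshift JDBC driver (setups based on the PostgreSQL driver use `jdbc:postgresql://` instead) and Redshift's default port of 5439. The `RedshiftCluster` class, the example endpoint and the database name are hypothetical and are not part of Informatica Cloud.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedshiftCluster:
    """Hypothetical holder for the cluster details a connection form asks for."""
    endpoint: str      # cluster endpoint host name, copied from the AWS console
    database: str      # database name chosen when the cluster was created
    port: int = 5439   # Redshift's default port


def jdbc_url(cluster: RedshiftCluster) -> str:
    """Build a JDBC URL in the jdbc:redshift://host:port/database form."""
    return f"jdbc:redshift://{cluster.endpoint}:{cluster.port}/{cluster.database}"


if __name__ == "__main__":
    # Example values only; substitute your own endpoint and database.
    cluster = RedshiftCluster(
        endpoint="examplecluster.abc123.us-west-2.redshift.amazonaws.com",
        database="dev",
    )
    print(jdbc_url(cluster))
```

The AWS Access Key ID and Secret Access Key are deliberately left out of the sketch; they are best entered in the connection form or read from your environment rather than stored in code.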
You are now ready to begin moving data to and from Amazon Redshift by creating your first Data Synchronization task (available under Applications). Pick a source, pick your Redshift target, map the fields, and you’re done!
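For readers who want to picture what "map the fields" amounts to, here is a conceptual Python sketch. It is not the Informatica Cloud API: the task itself is configured in the wizard without code, and the field names, `FIELD_MAP`, and both functions below are hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical mapping: source field name -> Redshift target column name.
FIELD_MAP: Dict[str, str] = {
    "AccountId": "account_id",
    "Name": "account_name",
    "AnnualRevenue": "annual_revenue",
}


def map_record(record: Dict[str, Any], field_map: Dict[str, str]) -> Dict[str, Any]:
    """Rename mapped fields; unmapped source fields are dropped, missing ones become None."""
    return {target: record.get(source) for source, target in field_map.items()}


def map_records(records: Iterable[Dict[str, Any]], field_map: Dict[str, str]) -> List[Dict[str, Any]]:
    """Apply the same field mapping to every source record."""
    return [map_record(record, field_map) for record in records]


if __name__ == "__main__":
    source_rows = [{"AccountId": "001", "Name": "Acme", "Industry": "Retail"}]
    print(map_records(source_rows, FIELD_MAP))
```

In this sketch the unmapped `Industry` field never reaches the target, and `annual_revenue` arrives as `None` because the source row does not supply it.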
The value of using Informatica Cloud to load data into Amazon Redshift lies in its ability to move massive amounts of data in parallel. The Informatica engine uses push-down technology to move processing close to where the data resides. Other data integration solutions for Redshift perform batch processing with an XML engine, which is inherently slow at large data volumes, and lack multitenant architectures that scale well. Informatica Cloud, by contrast, processes over 2 billion transactions every day.
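The push-down idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not of Informatica's engine: it uses an in-memory SQLite database as a stand-in for the data store, and the `orders` table and both functions are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a small in-memory table to stand in for a remote data store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("east", 10.0), ("west", 5.0), ("east", 2.5)],
    )
    return conn


def total_without_pushdown(conn: sqlite3.Connection, region: str) -> float:
    """Pull every row to the client, then filter and sum there."""
    rows = conn.execute("SELECT region, amount FROM orders").fetchall()
    return sum(amount for row_region, amount in rows if row_region == region)


def total_with_pushdown(conn: sqlite3.Connection, region: str) -> float:
    """Let the database filter and aggregate; only one value crosses the wire."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE region = ?",
        (region,),
    ).fetchone()
    return total


if __name__ == "__main__":
    conn = make_db()
    print(total_without_pushdown(conn, "east"), total_with_pushdown(conn, "east"))
```

Both functions return the same total; the difference is how much data has to leave the database to compute it, which is what matters at large volumes.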
Amazon Redshift has brought agility, scalability, and affordability to petabyte-scale data warehousing, and Informatica Cloud has made it easy to transfer all your structured and unstructured data into Redshift so you can focus on getting data insights today, not weeks from now.
The reality in data warehousing is that the primary focus is on delivery. The data warehouse team is tasked with extracting, transforming, integrating, and loading data into the warehouse within increasingly tight timeframes. Twenty years ago, monthly data warehouse loads were common. Ten years ago, weekly loads became the norm. Five years ago, daily loads were called for. Nowadays, near-real-time analytics demands the data warehouse be loaded more frequently than once a day. (more…)
Thousands of Oracle OpenWorld 2012 attendees visited the Informatica booth to learn how to leverage their combined investments in Oracle and Informatica technology. Informatica delivered over 40 presentations on topics ranging from cloud to data security to smart partitioning. Key Informatica executives and experts from product engineering and product management spoke with hundreds of users and answered questions on how Informatica can help them improve Oracle application performance, lower risk and costs, and reduce project timelines. (more…)
The widespread adoption of electronic health records (EHRs) is a key objective of the Health Information Technology for Economic and Clinical Health (HITECH) Act, enacted as part of the American Recovery and Reinvestment Act of 2009. With the pervasive use of EHRs, an enormous volume of clinical data that has previously been locked away in paper charts will be readily accessible. The potential value of this data to yield insights into what works in healthcare, and what doesn’t work, dwarfs the benefits of simply replacing a paper chart with an electronic system. There’s appropriate enthusiasm that this data is going to be a veritable goldmine for enterprise data warehousing, business intelligence, and comparative effectiveness research. However, there are other, equally valuable, uses for this data to enhance clinical decision-making and improve the value of healthcare spending. Simply having instant access to large volumes of data that span thousands or tens of thousands of physicians, hundreds of thousands of patients and millions of encounters offers an unparalleled opportunity to increase the quality and lower the cost of healthcare. (more…)
Most big data discussions have focused on the technology or on the frequently replayed business discoveries used as examples of big data’s power. Many companies are still in the experimental stages of big data, asking for guidance regarding what their benefits would be, how they can re-align themselves to take advantage, and what new processes might be helpful to make them successful with these powerful new capabilities. (more…)
We have all heard of data federation, and of late we have also been hearing how simple, traditional data federation often gets passed off as data virtualization. Let’s get back to basics and take a hard look at what the real need is.
Data federation is not a new concept. When it first arrived on the scene many years ago, technologists got excited because it offered a way to quickly access numerous disparate data sources without physically moving data. Years passed and the term kept appearing in research paper after research paper, but the anticipated widespread adoption never happened. TDWI’s Wayne Eckerson does a great job of tracking the evolution of data federation in his recent webinar and blog. Simple, traditional data federation does one thing and only one thing well: it creates a virtual view across heterogeneous data sources, delivering data in real time, typically to reporting tools and composite applications. In its very simplicity lay its downfall.
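To make the "virtual view" idea concrete, here is a minimal Python sketch of simple federation. It is an illustration under stated assumptions rather than any vendor's implementation: an in-memory SQLite table and a CSV string stand in for two heterogeneous sources, and every name in it (`CRM_CSV`, `make_orders_db`, `federated_totals`) is hypothetical.

```python
import csv
import io
import sqlite3
from typing import Iterator, Tuple

# Source 1 (hypothetical): customer names exported from a CRM as CSV.
CRM_CSV = "customer_id,name\n1,Acme\n2,Globex\n"


def make_orders_db() -> sqlite3.Connection:
    """Source 2 (hypothetical): an orders table in a relational database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 100.0), (2, 40.0), (1, 25.0)],
    )
    return conn


def federated_totals(conn: sqlite3.Connection, crm_csv: str) -> Iterator[Tuple[str, float]]:
    """Yield (customer name, order total) by combining both sources at query time.

    Nothing is copied into a shared store; each call reads the sources as they are now.
    """
    names = {
        int(row["customer_id"]): row["name"]
        for row in csv.DictReader(io.StringIO(crm_csv))
    }
    query = (
        "SELECT customer_id, SUM(amount) FROM orders "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    for customer_id, total in conn.execute(query):
        yield names.get(customer_id, "unknown"), total


if __name__ == "__main__":
    for name, total in federated_totals(make_orders_db(), CRM_CSV):
        print(name, total)
```

The sketch also hints at the limitation the paragraph describes: it only joins and presents the data as-is, with no cleansing, profiling or reconciliation of the two sources.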
The devil, as they say, is in the detail. Your organization might have invested years of effort and millions of dollars in an enterprise data warehouse, but unless the data in it is accurate and free of contradiction, it can lead to misinformed business decisions and wasted IT resources.
We’re seeing an increasing number of organizations confront the issue of data quality in their data warehousing environments in efforts to sharpen business insights in a challenging economic climate. Many are turning to master data management (MDM) to address the devilish data details that can undermine the value of a data warehousing investment.
Consider this: Just 24 percent of data warehouses deliver “high value” to their organizations, according to a survey by The Data Warehousing Institute (TDWI). Twelve percent are low value and 64 percent are moderate value “but could deliver more,” TDWI’s report states. For many organizations, questionable data quality is the reason why data warehouses fall short of their potential. (more…)
The “Business” Needs Critical Data “Now” – We Need The Next Generation Data Federation Technology “Yesterday!”
There is a lot of talk about using data federation, Enterprise Information Integration (EII) or data virtualization to deliver new data to the business, on-demand. However, do existing approaches cut it?
I have been following the data integration space for many years now, and like many of you, I have wondered about the viability of data federation as a data integration approach. Not because it lacks promise. It has many advantages as a fast, flexible and low-cost approach to integrating multiple and diverse data sources in real time, without the need for physical data movement.
However, according to the numerous architects that I have had the pleasure of meeting with on the Informatica 9 World Tour, simple or traditional data federation has not been able to live up to its immense promise. And why is that, I asked? The reasons were many…