Informatica

Don’t Forget to Manage the Retention and Disposal of Data on Hadoop

In Mark Brunelli’s interview with James Kobielus of Forrester Research, “Forrester’s Kobielus: It’s time for a Hadoop standards body,” Kobielus argues that Hadoop is still somewhat immature and needs to adopt standards. He goes on to note that when implementing Hadoop, “whether it’s through a data warehouse or Hadoop cluster, you’re talking about petabytes or multiple hundreds of terabytes worth of storage.” Hadoop, while designed to access these large data volumes (which can include social media data), does nothing to manage retention of that data. (more…)

Posted in Application ILM, Application Retirement, Big Data, Data Governance, Database Archiving, Financial Services, Governance, Risk and Compliance

Apache Hadoop MapReduce Meets Informatica Data Parsing

Guest blog from Arun C. Murthy, Founder & Architect, Hortonworks

As the framework architects and developers of Apache Hadoop MapReduce, we are always looking for ways to simplify the complex tasks associated with large-scale processing of data. We want users and organizations to spend their time on analyzing their growing data to gain valuable insights, not on menial tasks such as massaging their data for consumption or tediously parsing complex structures in their data. The Informatica HParser technology is extremely valuable in this regard. (more…)

Posted in B2B, Big Data, Marketplace, News & Announcements

Future Integration Needs: Embracing Complex Data

Hear from Informatica’s Karen Hsu on a new study’s findings and the implications of big, complex data.

For more on this see: Future Integration Needs: Embracing Complex Data

Posted in B2B, VLog

MDM and ACORD Standards: Synergies and Considerations

Hear from Informatica’s Karen Hsu on the new ACORD-certified Information Management solution that helps insurance organizations drive customer-centricity.

For more on this see: Master Data Management and ACORD Standards: Synergies and Considerations

Posted in B2B, Customer Acquisition & Retention, Master Data Management, VLog

Action Plan for Hadoop Data Integration: Conclusion of Hadoop Blog Series

I had the opportunity to review and comment on the draft of a new Hadoop technical guide. It’s great to see the published paper: Technical Guide: Unleashing the Power of Hadoop with Informatica. This guide outlines the following five steps to get started with Hadoop from a data integration perspective.

(1) Select the Right Projects for Hadoop Implementation

Choose projects that fit Hadoop’s strengths and minimize its disadvantages. Enterprises use Hadoop in data-science applications for log analysis, data mining, machine learning and image processing involving unstructured or raw data. Hadoop’s lack of a fixed schema works particularly well for answering ad-hoc queries and exploratory “what if” scenarios. Hadoop Distributed File System (HDFS) and MapReduce address the growth in enterprise data volumes from terabytes to petabytes and more, as well as the increasing variety of complex multi-dimensional data from disparate sources. (more…)

Posted in Data Integration, Enterprise Data Management

Hadoop Security: Part 6 of Hadoop Series

Security is a work in progress for the Apache Hadoop project and sub-projects, as I discuss in an O’Reilly Hadoop tutorial, “Get started with Hadoop: from evaluation to your first production cluster”. Below are several of the security tips and best practices from that article. (more…)

Posted in Big Data

Video: Electronic Health Records Update

Richard Cramer, Chief Healthcare Strategist for Informatica, shares some views on Electronic Health Record (EHR) adoption, including HITECH and Meaningful Use pressures. He also talks about the challenges that the future holds for EHRs.

Visit Informatica’s Healthcare pages for more on EMRs.

Posted in Big Data, Healthcare, VLog

Hadoop Toolbox: Part 5 of Hadoop Series

Many organizations will mix and match individual Apache projects and sub-projects using Apache Hadoop’s loosely coupled architecture. This Hadoop toolbox provides a powerful set of tools and capabilities, but it does have some important limitations that can require a platform approach to address.

The Hadoop Distributed File System (HDFS) combines storage and processing in each data node. With HDFS, you can add new files or append to existing files, but you cannot replace a file without using a new filename. The append capability works well for adding new time-stamped logs as they come in, but can complicate storage of structured files. (more…)
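To make the add/append/no-replace behavior concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the HDFS API: the class `WriteOnceStore`, its methods, and the file names and contents are all hypothetical and exist only to illustrate the semantics described above.

```python
class WriteOnceStore:
    """Toy in-memory model of add/append/no-replace file semantics.

    This is an illustration only; it does not call HDFS.
    """

    def __init__(self):
        self._files = {}

    def create(self, name, data):
        # A new file can be added, but an existing name cannot be overwritten.
        if name in self._files:
            raise FileExistsError(f"{name} exists; write under a new filename")
        self._files[name] = bytes(data)

    def append(self, name, data):
        # Existing files can only grow at the end.
        if name not in self._files:
            raise FileNotFoundError(name)
        self._files[name] += bytes(data)

    def read(self, name):
        return self._files[name]


store = WriteOnceStore()

# Time-stamped log lines fit the append model naturally.
store.create("logs/app.log", b"2011-10-01 start\n")
store.append("logs/app.log", b"2011-10-02 heartbeat\n")

# A structured file that changes cannot be replaced in place.
store.create("customers_v1.csv", b"id,name\n1,Ann\n")
try:
    store.create("customers_v1.csv", b"id,name\n1,Anne\n")
    replaced = True
except FileExistsError:
    replaced = False

# The corrected version has to be written under a new filename.
store.create("customers_v2.csv", b"id,name\n1,Anne\n")
```

In this model, every correction to a structured file produces another versioned copy, which is one way to picture why append-oriented storage can complicate structured data.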

Posted in Big Data

Dating With Data: Part 4 In Hadoop Series

eHarmony, an online dating service, uses Hadoop processing and the Hive data warehouse for analytics to match singles based on each individual’s “29 Dimensions® of Compatibility”, per a June 2011 press release by eHarmony and one of its suppliers, SeaMicro. According to eHarmony, an average of 542 eHarmony members marry daily in the United States. (more…)

Posted in Big Data

Hadoop Extends Data Architectures: Part 3 In Hadoop Series

The list and diversity of NoSQL, “NewSQL”, cloud, grid, and other data architecture options seem to grow every year.

The Harry Potter books and movies were a particularly popular inspiration for project names. For example, to power features such as “People You May Know” and “Jobs You May Be Interested In”, LinkedIn uses Hadoop together with the Azkaban batch workflow scheduler and the Voldemort key-value store. We’ll see if the Twilight series has a similar impact on project names.

(more…)

Posted in Big Data