A Return to Big Data

Quite a bit has happened in big data since my last post on Informatica Perspectives almost a year and a half ago.  I have spent my career helping organizations get control of their uncontrolled data growth, and now industry visionaries are promoting this brave new world of big data.

Most data center managers and database administrators say they already struggle to keep up with the data volumes they have now.  And now the business wants to store more data and analyze more types of data in a shorter period of time?  The answer is a resounding yes!  If there is a justifiable reason with a proven return, IT will need to look at innovative solutions to handle what’s coming, and handle it affordably at scale.  In the meantime, I recommend taking a look at some of my archived blogs on how to implement data archiving, data governance, data classification and data privacy best practices.  While these blogs were written before Hadoop made it mainstream, the principles and practices still apply.  In fact, I would argue that they are that much more important now that the problem domain just got Big.

This week, Informatica is hosting its annual conference, Informatica World, in Las Vegas, May 14-18, with a key theme of Maximizing Your Return on Data.  I love this theme.  It all ties back to a business justification and prioritizing IT spend based on the ROI.  Big Data could mean a tremendous investment in time, capital spend, and training, and it should not be undertaken without a business case.  If you can minimize your costs and risks somewhere else in the data center, more can be invested in this promising technology innovation.

In my Informatica World sessions, one of which focuses on ‘The Impact of Big Data on Performance’, I am going to share some great research data collected from almost 400 of your peers on their definition of big data and on the point at which data volumes started to degrade their system performance.  During my hiatus from Informatica Perspectives, I spent a spell as an analyst digging into Big Data and its profound impact on business and on IT Data Center processes.  I am looking forward to sharing the knowledge with those who plan to attend.  And for those who are not fortunate enough to come to the event, I will be sure to continue blogging, tweeting, and writing about how, even though data is big today, tomorrow it will only be bigger.

Informatica World 2012 – Big Data Track – Tuesday May 15th 11:20am

Big Data and Its Impact on Performance

When organizations accumulate data in droves and the underlying infrastructure is not designed to handle the volume, one of the first things to go is performance.  This may be a symptom of big data.  Do you have big data?  If you are not sure, this presentation will guide you through an overview of what all the hoopla surrounding “Big Data” really means, from its core definition to how it impacts downstream data processing.

We will discuss:

  • How organizations can get a handle on big data in their transaction processing applications
  • How big data impacts data integration processes, and
  • How emerging platforms, such as MPP analytical databases, MapReduce frameworks, and Hadoop, are used to address Big Data Analytics performance challenges

This entry was posted in Application ILM, Big Data, Informatica 9.5.

2 Responses to A Return to Big Data

  1. Ramakrishna says:

    We tried to use a filter condition in the Application Source Qualifier at the session level, but we are facing a few issues.

    • When we hardcode the values in the filter condition, it works fine.

    LS_Account_Primary_Key_Concatenated__c in ('1085GEEU_OU_EUR_LS_BE','1088GEEU_OU_SEK_PSDS_SWE_BIO')

    • But when we parameterize the value and use the parameter in the filter condition, it throws the error below.
    Can you please let us know whether we can use a parameter variable in the SOQL filter condition?

    -bash-3.2$ more PK_ACCOUNT_2.txt

    LS_Account_Primary_Key_Concatenated__c in ($$Mapng_var)

    SOSQL [Select Id, LS_Account_Primary_Key_Concatenated__c From Account Where LS_Account_Primary_Key_Concatenated__c in ($$Mapng_var)]. Fault code [sf:MALFORMED_QUERY]. Reason [MALFORMED_QUERY:
    LS_Account_Primary_Key_Concatenated__c in ($$Mapng_var)
    ERROR at Row:1:Column:112
    line 1:112 no viable alternative at character ‘$’].

    • Ram Kishore says:

      You can try either of the following steps:

      – Please see KB 232569. An EBF is available for 951 HF4.

      – Or you can use a workflow variable instead of a mapping variable (KB 101733).
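
The fault text in the question shows the query reaching Salesforce with `$$Mapng_var` still in it, so the parser stops at the first `$`. As a rough illustration only, the Python sketch below reproduces that position and shows the query shape the parser would accept once the placeholder is replaced with a quoted list. The helper names `soql_quote` and `expand_in_list` are hypothetical and are not part of PowerCenter or Salesforce; how PowerCenter itself resolves the parameter is covered by the KB articles above, not by this sketch.

```python
# Illustration only: these helpers are hypothetical, not a PowerCenter or Salesforce API.

FIELD = "LS_Account_Primary_Key_Concatenated__c"
# The query exactly as it appears in the fault message, placeholder unexpanded.
TEMPLATE = f"Select Id, {FIELD} From Account Where {FIELD} in ($$Mapng_var)"


def soql_quote(value: str) -> str:
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def expand_in_list(template: str, name: str, values: list[str]) -> str:
    """Replace the $$name placeholder with a comma-separated list of quoted values."""
    placeholder = f"$${name}"
    if placeholder not in template:
        raise ValueError(f"placeholder {placeholder} not found in template")
    quoted = ",".join(soql_quote(v) for v in values)
    return template.replace(placeholder, quoted)


# Zero-based offset of the first '$' in the unexpanded query.
unexpanded_offset = TEMPLATE.find("$")

# The same two keys that worked when hardcoded in the filter condition.
expanded = expand_in_list(
    TEMPLATE,
    "Mapng_var",
    ["1085GEEU_OU_EUR_LS_BE", "1088GEEU_OU_SEK_PSDS_SWE_BIO"],
)

print(unexpanded_offset)
print(expanded)
```

The offset computed for the unexpanded query is 112, which lines up with `Column:112` in the fault. That is consistent with the placeholder never having been substituted, rather than with a problem in the key values themselves.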

