10 Customer Questions About Informatica’s Marketing Data Lake

Last week, Prash Chandramohan and I presented a deep-dive webinar on Informatica’s Marketing Data Lake to close to 700 attendees. As with any technical demo webinar, we weren’t able to address questions during our one-hour slot, as our focus was to ensure attendees understood the key capabilities and features of Informatica’s Marketing Data Lake.

Prash and I went through the hundreds of questions we received and found a lot of overlap. In this blog, we address the top 10 questions that we think will benefit everyone.

If you missed the live webinar on Feb 2, you can listen to the on-demand version here. You can also get a copy of the presentation and the Data Lake Reference Architecture in the “attachment” section of the webinar.

Here are our answers to your questions. We would love to hear more from you and you can reach us by commenting on this blog.

1. Can the Marketing Data Lake complement an existing Data Warehouse?

Yes. Traditional data storage and processing platforms, such as databases or data warehouse appliances (Teradata, Netezza, Greenplum), can complement Hadoop and NoSQL. Most customers have a mature data warehouse implementation from which they can “pull” data into the marketing data lake as needed.

2. Is it possible to match “unknown” visitors to your website?

There are certain cases when it’s not possible to identify an unknown visitor, for example, a visitor coming from a new IP address with empty cookies in the browser’s incognito mode. However, once the user identifies themselves, for example by downloading material, all their previous interactions are connected to them. If you’re using Adobe Analytics or Google Analytics, you can track the customer via cookie ID and integrate with Marketo to discern the known visitor.
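To make the idea of connecting earlier anonymous interactions concrete, here is a minimal Python sketch. It is only an illustration under our own assumptions, not how Adobe Analytics, Google Analytics, or Marketo implement it: the event fields (`cookie_id`, `action`, `email`) and the function name `stitch_interactions` are hypothetical.

```python
from collections import defaultdict


def stitch_interactions(events):
    """Attach each event to a known visitor once its cookie has been identified.

    events: list of dicts with 'cookie_id', 'action', and an optional 'email'
    (hypothetical field names used only for this illustration).
    Returns (known, unknown): known maps email -> list of actions,
    unknown maps cookie_id -> list of actions never tied to an identity.
    """
    # First pass: learn which cookie IDs ever identified themselves.
    cookie_to_email = {}
    for event in events:
        if event.get("email"):
            cookie_to_email[event["cookie_id"]] = event["email"]

    # Second pass: route every event, including earlier anonymous ones.
    known = defaultdict(list)
    unknown = defaultdict(list)
    for event in events:
        email = cookie_to_email.get(event["cookie_id"])
        if email:
            known[email].append(event["action"])
        else:
            unknown[event["cookie_id"]].append(event["action"])
    return dict(known), dict(unknown)


events = [
    {"cookie_id": "c1", "action": "viewed pricing page"},
    {"cookie_id": "c2", "action": "viewed blog"},
    {"cookie_id": "c1", "action": "downloaded white paper", "email": "pat@example.com"},
]
known, unknown = stitch_interactions(events)
```

In this sketch, the visitor behind cookie `c1` is anonymous on the first event but is linked to both events once the download supplies an email address. Cookie `c2` never identifies itself, so it stays unknown.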

3. Will you be able to show lineage from traditional data sources, like Oracle?

Intelligent Data Lake is built on a metadata foundation that catalogs metadata from Informatica, RDBMS, BI tools, Hive, HDFS, S3, Redshift, etc., and we are continuously adding more sources.

4. Is there a difference between the Marketing Data Lake and the Intelligent Data Lake?

The Marketing Data Lake is a vertical solution built using Informatica Big Data Solutions (see reference architecture) to solve specific marketing analytics use cases. This way, our customers get direct value from a complete, special-purpose solution. Intelligent Data Lake is a product used to find, prepare, and govern data for analysis in a uniquely collaborative way that enables the business to build marketing analytic solutions.

5. In the relationship diagram, what are these “users” you showed? Are these users who have access to the data?

These users are associated with the data asset in some way. They may have created it in IDL or used it in an IDL project. This does not necessarily mean they have accessed the data.

6. Can Informatica’s matching and cleansing tools be deployed in the lake (or Hadoop)?

Yes, Informatica MDM – Relate 360 (formerly Big Data Relationship Management) can be deployed on Hadoop, which allows customers to match duplicated party data. MDM – Relate 360 allows you to enrich customer profiles and create a trusted view of your customers and their relationships on Hadoop. You can use the graph-based visualization provided by MDM – Relate 360 to explore the relationships in a business-user-friendly interface.

7. Can you explain in which phase data masking occurs?

Data masking is typically done during data ingestion into the lake. Big Data Management (BDM) has a data masking transformation with different masking techniques to de-identify, desensitize, and anonymize sensitive data, protecting it from unauthorized access in application use, business intelligence, application testing, and outsourcing.
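As a rough illustration of what masking at ingestion means, the Python sketch below applies two common generic techniques to a record before it would be written to the lake: salted hashing (pseudonymization) and partial masking. This is not the BDM data masking transformation itself. The field names, the `demo-salt` value, and the helper functions are all assumptions made for this example.

```python
import hashlib


def pseudonymize(value, salt):
    """Replace a value with a truncated, salted SHA-256 digest.

    The result is deterministic, so the same input always maps to the
    same token and masked records can still be joined on it.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def mask_keep_last(value, visible=4, mask_char="*"):
    """Hide all but the last `visible` characters of a string."""
    if len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def mask_record(record, salt="demo-salt"):
    """Return a masked copy of a record (hypothetical 'email' and 'phone' fields)."""
    masked = dict(record)  # copy so the raw record is left untouched
    masked["email"] = pseudonymize(record["email"], salt)
    masked["phone"] = mask_keep_last(record["phone"])
    return masked


raw = {"email": "pat@example.com", "phone": "5551234567", "campaign": "spring-launch"}
masked = mask_record(raw)
```

Here the non-sensitive `campaign` field passes through unchanged, the phone number keeps only its last four digits, and the email becomes a token that no longer reveals the address. In a real deployment the salt would be a managed secret rather than a hard-coded string.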

8. To follow up on data security, can you control who can see the published data?

Data access to published data sets is enforced at the Hadoop level, so users need to be aware of where they are publishing the asset, and the Hadoop admin has to ensure that only authorized users can access that data.

9. How do you execute and operationalize the data preparation steps in IDL?

Within IDL, any interaction with the data in the application is recorded as a “recipe”. When the data is published, these recipes are translated into an Informatica Big Data Management mapping, which is configured and executed to run at scale, via the smart executor, on any supported processing engine: Spark, Informatica Blaze, or MapReduce.
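The record-then-replay idea behind recipes can be sketched in a few lines of Python. This is a conceptual illustration only: the step names (`filter_equals`, `uppercase`, `drop_column`) and the dictionary format are hypothetical and do not reflect the actual IDL recipe format or the generated Big Data Management mapping.

```python
def apply_step(rows, step):
    """Apply one recorded preparation step to a list of row dicts."""
    op = step["op"]
    if op == "filter_equals":
        return [r for r in rows if r.get(step["column"]) == step["value"]]
    if op == "uppercase":
        return [{**r, step["column"]: str(r[step["column"]]).upper()} for r in rows]
    if op == "drop_column":
        return [{k: v for k, v in r.items() if k != step["column"]} for r in rows]
    raise ValueError(f"unknown step: {op}")


def run_recipe(rows, recipe):
    """Replay every recorded step, in order, against the given rows."""
    for step in recipe:
        rows = apply_step(rows, step)
    return rows


# Steps "recorded" while a user prepared a small sample of the data.
recipe = [
    {"op": "filter_equals", "column": "country", "value": "US"},
    {"op": "uppercase", "column": "segment"},
    {"op": "drop_column", "column": "country"},
]

sample = [
    {"lead": "A", "country": "US", "segment": "smb"},
    {"lead": "B", "country": "DE", "segment": "enterprise"},
]
result = run_recipe(sample, recipe)
```

Because the recipe is plain data rather than code tied to the sample, the same ordered steps can be replayed later against the full data set, which is the property that makes translation to a scalable execution engine possible.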

10. How is Informatica viewed in the Big Data space?

Recently, Forrester positioned Informatica as a leader in The Forrester Wave™: Big Data Fabric, Q4 2016. To learn more about all the top big data fabric vendors, their key features and functionalities, and how they compare, check out: http://infa.media/HVA3271LN

Thank you for attending this webinar. If you didn’t get a chance to see the complete session, you can always watch the on-demand version here.