Naked Marketing: The 60-Day Sprint to Our Big Data Marketing Data Lake

At the very beginning of the Naked Marketing series, I said I’d show you how we went from zero to Big Data Marketing hero in 60 days.

Caveat: A lot of foundation work had to happen before the 60-day sprint. And in posts #2 to #7, I drilled down into things like the Business Case, the Tech Stack, our Checklists and the Data behind our Big Data Marketing Operations Odyssey.

So now it’s time to talk about the 60-day sprint itself. The fun part. The fast part. And the part that changed everything—for our marketing team, our SDRs, and our wider sales team.

The Sprint to our Big Data Marketing Data Lake

The post about The Foundations was all about the building blocks: how we moved from basic out-of-the-box analytics to a full implementation of Adobe Analytics, moved to Marketo, implemented marketing source/channel and campaign codes, and put in predictive lead scoring with Lattice Engines.

So there we were, standing at the starting line, looking ahead at our goal: to deliver actual revenue to our business instead of a ‘phantom pipeline’—and to be able to prove it with new levels of transparency and accountability.

In the next 60 days, we promised ourselves, we would do four important things:

  • Connect our silos, so the data from each marketing app we use would integrate with the others.
  • Put the neatly linked data from all the apps into our data lake, where we could slice it up, put it to work, and run our weird and wonderful queries.
  • Stick a visualization tool on the front, so we could create new dashboards, reports, and drill-downs (we chose Tableau for this).
  • Introduce our new capability to our colleagues in sales, so they could put this new insight to work and think up new, revenue-driving use cases.

All that in 60 days. Let’s go.

Linking the data across our core marketing applications: The APIs

The first two weeks of our sprint were focused on creating data linkages across our core marketing applications using the Application Programming Interfaces (APIs) that each app provides.

APIs are the critical connection points of the digital marketing era. They allow any application to exchange information with any other. A tech company’s success often depends on the robustness of its API. APIs matter.

For our 60-day sprint, we concentrated on using APIs to create five data linkages between:

  1. Adobe Analytics and Marketo
  2. Adobe Analytics and Demandbase
  3. Adobe Analytics and Rio SEO
  4. Marketo and Lattice Engines
  5. Marketo and Salesforce

    1. Adobe Analytics and Marketo Link

The most important data link for our big data marketing program is between Adobe Analytics (tracking our web visits) and Marketo (tracking our lead nurturing, email, and other campaign activity).

Every new visitor to our website gets two IDs: the Marketo ID and an Adobe Analytics ID. All of each visitor’s activity is tracked in both, initially as an anonymous user. Once the person fills in a form, all of his or her past activity immediately accrues to the Marketo ID, making them a known visitor.

Below is Anish Jariwala’s Activity Log in the Marketo system, showing all of his activity, with a timestamp:


Marketo Activity Log for Anish with his own Marketo ID in the URL


Clearly, this data carries important indicators about Anish’s intent, product interest, etc.

We needed to expose this Marketo ID in Adobe Analytics—making it the key to connecting these two data sources in our marketing data lake. So after exposing Marketo’s SOAP API to our website using Adobe Tag Manager (more on this below), we were able to capture the Marketo ID in Adobe Analytics as a dimension (this is an eVar in Adobe’s lingo).

Below is the screenshot showing the Adobe Analytics tag that fires when someone visits our website. As you can see, MKTO ID 6334939 is captured under the dimension (eVar8 – v8):


INFA home page
Various Adobe Analytics tags are fired when someone visits the website. This screengrab shows how a Marketo ID is captured during a visit.


That innocuous-looking entry is actually a big deal. It means that we can connect all of Anish’s activity across the website with all of his other activity as he engages with our campaigns (his email opens and clicks; his webinar registrations; his visit to our booth at Informatica World… you get the idea).
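
To make the join concrete, here is a minimal sketch of how web hits and campaign activity could be grouped under one Marketo ID. It is an illustration, not the actual pipeline: the row layouts, the field names (`visitor_id`, `evar8`, `marketo_id`) and the `build_timeline` helper are all assumptions. Only the Marketo ID value and the use of eVar8 come from the example above.

```python
from collections import defaultdict

# Hypothetical rows exported from Adobe Analytics; "evar8" holds the Marketo ID.
web_hits = [
    {"visitor_id": "aa-001", "evar8": "6334939", "page": "/products"},
    {"visitor_id": "aa-001", "evar8": "6334939", "page": "/pricing"},
    {"visitor_id": "aa-777", "evar8": "", "page": "/blog"},  # still anonymous
]

# Hypothetical rows from the Marketo activity log.
marketo_activities = [
    {"marketo_id": "6334939", "activity": "Open Email"},
    {"marketo_id": "6334939", "activity": "Fill Out Form"},
]


def build_timeline(hits, activities):
    """Group web pages and campaign activities under each Marketo ID."""
    timeline = defaultdict(lambda: {"pages": [], "activities": []})
    for hit in hits:
        if hit["evar8"]:  # hits without a Marketo ID stay anonymous
            timeline[hit["evar8"]]["pages"].append(hit["page"])
    for act in activities:
        timeline[act["marketo_id"]]["activities"].append(act["activity"])
    return dict(timeline)


profiles = build_timeline(web_hits, marketo_activities)
```

The anonymous hit is left out on purpose: until a visitor fills in a form, there is no Marketo ID to join on.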

Signal from noise

But what if Anish visited our website from different devices? What if he used his laptop at work but an iPad at home?

The bad news: Adobe Analytics and Marketo both generate a unique ID for every new browser per device. So, until Anish fills out a form on another browser (device), he looks like two or more different people to us.

The good news: Marketo has a de-duping feature called ‘Merge Leads’. So when Anish uses the same email address to submit a form on different devices, Marketo collapses the info into the original Marketo ID—very, very handy.

(Our own Master Data Management automates the same kind of thing on a wider set of parameters, kicking exceptions up to a data steward for a ruling.)

In the Adobe Analytics screenshot below, you can see that Anish’s Marketo ID (6334939) has five web visitors associated with it: he used five different browser/device combinations to access our website but submitted forms from each using the same email. So we’ve got all five devices connected to Anish now:


A lead using various devices to access content and using the same email ID when he/she submits a form.
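
The device roll-up can be sketched in a few lines. This is a simplified illustration under assumptions: the visitor IDs below are made up, and `devices_per_lead` is a hypothetical helper, not a Marketo or Adobe Analytics function. It only shows the idea of counting distinct Adobe visitor IDs per Marketo ID once the leads have been merged.

```python
from collections import defaultdict


def devices_per_lead(rows):
    """Map each Marketo ID to the set of Adobe visitor IDs seen with it.

    rows: iterable of (adobe_visitor_id, marketo_id) pairs; marketo_id is
    None or empty while the visitor is still anonymous.
    """
    devices = defaultdict(set)
    for visitor_id, marketo_id in rows:
        if marketo_id:
            devices[marketo_id].add(visitor_id)
    return dict(devices)


# Hypothetical visitor IDs: five browser/device combinations, one lead.
rows = [
    ("aa-laptop-chrome", "6334939"),
    ("aa-laptop-firefox", "6334939"),
    ("aa-ipad-safari", "6334939"),
    ("aa-phone-chrome", "6334939"),
    ("aa-home-pc-edge", "6334939"),
    ("aa-unknown", None),  # never submitted a form
]

devices = devices_per_lead(rows)
device_count = len(devices["6334939"])
```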


If the same visitor uses multiple email addresses to submit a form, Marketo will always assign a new MKTO ID every time, concluding that they’re different people. We can’t do anything about this noise at this time—but we estimate that it only applies to less than 10 percent of our data.

If this were a financial system, we’d need to be perfect, but hey, this is Marketing. Better to be directionally right than perfectly wrong. We can still make great decisions; in fact, they will be much better thanks to the line of sight we have across all systems.

    2. Adobe Analytics and Demandbase Integration

Demandbase uses reverse IP lookup to provide firmographics like industry, company name, and location whenever someone visits our site (as long as they visit from their business office or when they are connected to their company via VPN).

Demandbase has various APIs that integrate with Adobe Analytics, so this integration only took a few hours.

We just had to decide which dimensions (eVars) to use to populate Adobe Analytics with our Demandbase data fields and we were all set. Again, just as with Adobe Analytics, all Demandbase information is anonymous until we connect it to a known lead in our data lake.
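
As a rough sketch of that decision, the mapping can be thought of as a small lookup table. The eVar slot numbers and field keys below are purely hypothetical placeholders (the actual slots are not given here), and `to_evars` is an illustrative helper rather than part of any Demandbase or Adobe API.

```python
# Hypothetical mapping of Demandbase firmographic fields to eVar slots.
DEMANDBASE_TO_EVAR = {
    "company_name": "eVar20",  # placeholder slot
    "industry": "eVar21",      # placeholder slot
    "location": "eVar22",      # placeholder slot
}


def to_evars(firmographics):
    """Translate a Demandbase lookup result into eVar name/value pairs.

    Unknown or empty fields are dropped, e.g. when the visitor is not on
    a recognizable company network.
    """
    return {
        DEMANDBASE_TO_EVAR[field]: value
        for field, value in firmographics.items()
        if field in DEMANDBASE_TO_EVAR and value
    }


evars = to_evars({"company_name": "Example Corp", "industry": "Software", "location": ""})
```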

    3. Adobe Analytics and Rio SEO integration

We use Rio-SEO to capture social shares, influencers, and word-of-mouth activity (people coming to our site by clicking a link in a social share, content shared via email or chat, etc.).

About 1 percent of our traffic comes from people we call influencers—people who share our content with their colleagues and followers. This influential 1 percent brings in a whopping 5-7 percent of our site traffic. And we believe this to be some of the most important site traffic we are getting.

When someone uses one of our share buttons on the site—or when they copy/paste a URL into an email—we can identify the original sharer when the recipient clicks on the link.

Rio-SEO has its own unique code called FBID (it’s device/browser specific so you can call it a Rio-SEO cookie). This ID is appended to every URL when you visit our site. Look up at the address bar now and you’ll see yours.

For Anish, his FBID is: EiTpb-hEMwi. We capture this as eVar 60 in Adobe Analytics and you can see it in his profile below:


INFA home page with insert
Rio-Social ID is captured from URL string to a dimension (evar60) in Adobe Analytics


When Anish sends a link to others, they get their own unique FBID when they click on the link. They become the WoM—Word of Mouth—and he is the influencer.

We receive Rio-SEO data via FTP on a weekly basis, showing the influencer/WoM relationships. We match this information in our marketing data lake and surface the relationships in our Tableau dashboard.
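
Here is a minimal sketch of how such a weekly file could be turned into influencer/WoM relationships. The CSV layout, the column names (`sharer_fbid`, `recipient_fbid`) and the recipient IDs are assumptions made for illustration; the real Rio-SEO feed format is not described here. Only Anish's FBID comes from the example above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical weekly feed: one row per click on a shared link.
sample_feed = """sharer_fbid,recipient_fbid
EiTpb-hEMwi,hypoAAA0001
EiTpb-hEMwi,hypoBBB0002
hypoBBB0002,hypoCCC0003
"""


def load_relationships(text):
    """Return {influencer FBID: set of word-of-mouth FBIDs} from CSV text."""
    influencers = defaultdict(set)
    for row in csv.DictReader(io.StringIO(text)):
        influencers[row["sharer_fbid"]].add(row["recipient_fbid"])
    return dict(influencers)


relationships = load_relationships(sample_feed)
anish_reach = len(relationships["EiTpb-hEMwi"])
```

Note that a recipient can become an influencer in turn, as the third row shows.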

All these integrations described so far are for anonymous visitors—Adobe Analytics, Demandbase, and Rio-SEO. These relationships and analytics are great to have and certainly help inform our marketing but if I were to approach our sales team with this information, they’d want company and contact details. That’s why making connections in the data lake is so important.

    4. Marketo and Lattice Engines Integration

Lattice is the predictive lead scoring tool that I talked about in Naked Marketing Post 5. Based on historical patterns it tells us how likely a prospect is to buy from us—a hugely important guide for our sales and marketing efforts.

Marketo and Lattice Engines integrate really well. Every time a lead is created in Marketo or a key activity is updated at a lead level, the lead record is scored or re-scored based on one or more of our Lattice models (we built different ones for different product lines and geographies)—the scores are directly written to Marketo via API.

The nine activity types we use to trigger Lattice scoring are:

      1. New Lead
      2. Click Link
      3. Visit Webpage
      4. Interesting Moment
      5. Open Email
      6. Email Bounced Soft
      7. Fill Out Form
      8. Unsubscribe Email
      9. Click Email
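
The trigger logic can be sketched as a simple membership check. The nine names are the activity types listed above; the `should_rescore` helper and the idea of checking them in code are assumptions for illustration, since the actual triggering happens inside the Marketo and Lattice integration.

```python
# The nine activity types listed above that trigger Lattice (re)scoring.
SCORING_TRIGGERS = frozenset({
    "New Lead",
    "Click Link",
    "Visit Webpage",
    "Interesting Moment",
    "Open Email",
    "Email Bounced Soft",
    "Fill Out Form",
    "Unsubscribe Email",
    "Click Email",
})


def should_rescore(activity_type):
    """Return True if this activity type should trigger a new score."""
    return activity_type in SCORING_TRIGGERS


# Example: a form fill triggers scoring; an activity outside the list does not.
form_fill_triggers = should_rescore("Fill Out Form")
other_triggers = should_rescore("Some Other Activity")
```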

There are about 1,000 different variables and attributes (available via Lattice) that can be included in the predictive models in addition to the triggers we provide, such as technology profile, website profile, firmographics, website keywords, and growth trends. Of course, not all of them will turn out to be positive or negative predictors for conversion to revenue. The ones that do will become part of the predictive model.

The Lattice score (based on the hundreds of different variables in the final model) comes back into Marketo and then flows on into the data lake. The rating is also stamped on the opportunity at the point of conversion (written into a custom field on the opportunity table); that value is ultimately pulled from the Salesforce opportunity object and loaded in.

    5. Marketo and Salesforce Integration

Salesforce holds all the data about what happens to leads once they’re passed to Sales. This, obviously, is critical for modeling our world and achieving the end goal: attributing revenue to marketing touches.

Fortunately, Marketo and Salesforce have a good basic bidirectional API sync. Every 5-10 minutes, fields that are updated in either app get synced in both directions, Marketo to Salesforce and Salesforce to Marketo. Data that is not covered by the basic integration or that requires some additional massaging, especially custom fields, is synced using our own Informatica Data Integration tools. The advantages are that no coding is required, debugging is simple, and any issues are reported.

For new records created in Marketo (through a form-fill, a list upload, the web API or manually), we had two options:

      1. Update a record in Salesforce in the next sync cycle.
      2. Update a record in Salesforce when the lead is considered ready to be contacted by Sales.

At Informatica, we decided to follow option #1—syncing a lead to Salesforce right away to give visibility to Sales. Complete transparency.

A lead will move to Salesforce from Marketo only if the minimum attributes are available. For us, this includes first name, last name, company, and email address.

In the reverse direction, a Lead or Contact record will sync from Salesforce back to Marketo as long as the record has an email address. Then the Salesforce ID is passed over to Marketo when the lead syncs.
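
The two sync rules can be expressed as simple checks. This is a sketch under assumptions: the field keys (`first_name`, `last_name`, `company`, `email`) and both helper functions are hypothetical names chosen for illustration, not the configuration of the actual Marketo and Salesforce sync.

```python
# Minimum attributes a Marketo lead needs before it moves to Salesforce.
REQUIRED_FOR_SALESFORCE = ("first_name", "last_name", "company", "email")


def _has_value(record, field):
    """True if the field is present and not blank."""
    return bool(str(record.get(field) or "").strip())


def can_sync_to_salesforce(lead):
    """Marketo -> Salesforce: all four minimum attributes must be filled."""
    return all(_has_value(lead, field) for field in REQUIRED_FOR_SALESFORCE)


def can_sync_to_marketo(record):
    """Salesforce -> Marketo: the record only needs an email address."""
    return _has_value(record, "email")


complete = {"first_name": "Ana", "last_name": "Lee",
            "company": "Example Corp", "email": "ana@example.com"}
missing_company = {"first_name": "Ana", "last_name": "Lee",
                   "company": " ", "email": "ana@example.com"}
```

The asymmetry is visible here: `missing_company` would not move to Salesforce, but a Salesforce record with the same fields would still sync back to Marketo because it has an email address.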

Below, you can see the SFDC (Salesforce) ID for Anish in the lead record in Marketo. This SFDC ID is the glue (connecting key, join) between Marketo and Salesforce.


Marketo with SFDC ID
Connecting Marketo to Salesforce using SFDC ID as a key

What all these data connections mean

These five bi-directional data handshakes are actually much more than that. Because all of this data lives in our marketing data lake, the connections we created between the data sets in Adobe Analytics, Marketo, SFDC, etc. mean that all the data is now joined up via a single lead or account ID.

No more silos.
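
To illustrate what "joined up" means in practice, here is a minimal sketch that walks the chain of keys from an anonymous web visitor to Salesforce opportunities. All lookup tables, the SFDC ID value and the `resolve_opportunities` helper are hypothetical, and the model is simplified (in a real Salesforce org, opportunities hang off contacts and accounts rather than directly off a lead ID).

```python
def resolve_opportunities(visitor_id, visitor_to_marketo, marketo_to_sfdc,
                          opportunities_by_sfdc):
    """Follow Adobe visitor ID -> Marketo ID -> SFDC ID -> opportunities.

    Returns an empty list as soon as one link in the chain is missing,
    for example while the visitor is still anonymous.
    """
    marketo_id = visitor_to_marketo.get(visitor_id)
    if marketo_id is None:
        return []
    sfdc_id = marketo_to_sfdc.get(marketo_id)
    if sfdc_id is None:
        return []
    return opportunities_by_sfdc.get(sfdc_id, [])


# Hypothetical lookup tables as they might be derived in the data lake.
visitor_to_marketo = {"aa-001": "6334939"}           # from eVar8
marketo_to_sfdc = {"6334939": "sfdc-hypothetical"}   # from the Marketo lead record
opportunities_by_sfdc = {
    "sfdc-hypothetical": [{"name": "Example opportunity", "amount": 50000}],
}

known = resolve_opportunities("aa-001", visitor_to_marketo,
                              marketo_to_sfdc, opportunities_by_sfdc)
anonymous = resolve_opportunities("aa-999", visitor_to_marketo,
                                  marketo_to_sfdc, opportunities_by_sfdc)
```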

Slapping a pretty face on all this

Okay, so all this data is in the data lake. But to gain insight, we need to expose it to the people who can put it to work: the field marketers, SDRs, biz dev teams, and marketers who are making decisions about who to target, what to say, and what tactics to use.

That means a dashboard—or a series of dashboards to be more accurate. A dashboard is the thin layer that sits on top of our data lake, a window, showing us various views about what’s in it.

To be honest, it’s by far the easiest part of big data marketing. There are lots and lots of excellent data visualization and dashboard tools out there. They’ll all do pretty much what you need: munge data into report widgets then stick them next to each other (and enable drill downs).

We chose Tableau for our dashboard. We like its ease of use, and they’re partners.

(It’s funny that a lot of people in Informatica are now referring to the Marketing Data Lake as ‘The Tableau Dashboard’, as if the entire project was just a matter of sticking a dashboard on top of some clumps of data, purely a visualization exercise. Unfortunately, this is how a lot of teams buy their data viz tools too: as if the last mile of data quality, integration, and analytics is actually the whole journey!)

Why we love our Tag Manager

I can’t talk about all this integration work without giving a hat tip to the tool that makes it all so much easier to deploy and manage: Adobe Tag Manager.

ATM, like Google Tag Manager and others, is a tool that helps you manage all the JavaScript snippets and tags that you deploy on your website in one central place.

Without it, we’d have to manually place the various code snippets on every relevant page for every marketing app we use (from web analytics to marketing automation to social shares… all of them).

Then, whenever we wanted to change the code, we’d have to go back to every instance of that snippet and update it manually. That’s a huge hassle that would prohibit a lot of what we’re doing.

With ATM, you inject and update all scripts and tags in one convenient place. Once. Which is a lifesaver. And it works for more than we discussed here, e.g. advertising and retargeting tags.

The Aha! Moment

About 30 days into our 60-day sprint, I had an “Aha!” moment—a moment when I knew what we were doing would really add value to the whole revenue generation machine.

For me, it was when Anish called me over to show me a real Marketo ID combined with our Adobe Analytics data. “See?” he said, “We now have the email and phone number of that web visitor right there.”

Up until that moment, all of this was just theory. From that moment on, it was our new reality: a connected sales and marketing operation that linked all of our data to a profile; an end-to-end view of the whole customer journey.

I’d like to say we heard the sound of angels singing or that we popped open a bottle of champagne to celebrate. The truth is, we said, “Nice one.” And went home. There was still a lot of work to do.

(A business development rep in Austin reacted more appropriately when he saw the new data mining power at his disposal. “This is an absolute game changer,” he said).

Five Lessons

So that’s a quick tour of our 60-day big data marketing sprint. On day 1, we had a collection of marketing applications and platforms but no way to connect them. On day 60, we had a completely joined-up revenue engine with total visibility into which of our sales and marketing activities actually work.

Here are five lessons we learned on the way:

1) Start your journey with a business outcome in mind.

A lot of teams start journeys like this with a list of questions they need to answer or reports they’d like to see. That’s okay as a rough guide, but it could paint you into a corner.

Instead, start with the business outcomes you want to drive. The questions you want to answer will almost surely change over the course of the project—and might even get in the way of achieving the real business value.

So start with goals about real outcomes, like lead to opportunity conversion rate increase or improved SDR outbound call close rates. And keep all eyes on the prize.

2) Start with the revenue side.

Let’s face it, we’re all here to sell stuff. So make sure your big data marketing program prioritizes delivering relevant insight to the sales team.

Budget optimization in the marketing department is a great thing. But what really motivates sales people is when they see real new opportunities and can drill down for the insight that creates sales conversations.

3) Make friends with IT

There’s no way our marketing team could do all this integration without the active involvement of our IT colleagues.

But IT is charged with delivering the most value to the business for the least investment. Until we sat down with them, their natural reaction to our plan was, “We have most of this data in our data warehouse. We don’t need yet another data store.” They had a point: why duplicate data?

Once we’d talked them through our needs and strategy (and showed them how inexpensive it is to hold this data in Hadoop) the IT team didn’t just accept our program, they were enthusiastic about it.

Soon, the challenge wasn’t to get IT people to help—it was how to make everyone else in IT feel a part of the project. As a group, we made sure that everyone learned about the project, even if they weren’t driving it. So everyone in warehousing is also getting data lake skills.

And just as you can’t do this without IT, IT can’t do it without you. You need to be actively engaged in translating your business logic over to the IT people so that they can integrate things the right way. This is a two-way commitment.

4) Make sure your web analytics and marketing automation work well independently before you bring the data together.

You need to know that each system is doing exactly what you need it to do before trying to sync them up.

To get an idea of what I mean by that, check out Post 6 for the two checklists that Anish created: one for Marketo setup; one for Adobe Analytics.

5) Market your successes.

It’s important to let everyone know what you’re doing and how it’s working. We gave frequent updates to key stakeholders in Sales and Marketing but also to the CEO and even the company’s board.

The high visibility and clear communication about goals and progress turned the project into a runaway success that got all the attention and help needed to grow the momentum!

Book – The Marketing Data Lake

Naked Marketing, Prologue – “Finally We can Connect All the Dots”

Naked Marketing, Post 1 – “A Big Data Marketing Operations Odyssey”

Naked Marketing, Post 2 – “Who’s Who Behind Our Big Data Marketing”

Naked Marketing, Post 3 – “The 5 Foundations for Big Data Marketing”

Naked Marketing, Post 4 – “The Business Case for Big Data Marketing”

Naked Marketing, Post 5 – “The Big Data Marketing Technology Stack”

Naked Marketing, Post 6 – “Big Data Marketing Checklists for Marketo and Adobe Analytics”

Naked Marketing, Post 7 – “The Data For Big Data Marketing”

Naked Marketing, Post 9 – “The Sales Leaders’ View of the Marketing Data Lake”

Naked Marketing, Post 10 – “The Account Based Marketing Dashboard”

Naked Marketing, Post 11 – “The Big, Beautiful Bubble Chart – Nailing Marketing Attribution”