From Lab to Factory: How to Turn Big Data into Business Value

You’ve successfully proven the value of your big data proof of concept. Now you need to operationalize your experiment into a full-fledged production platform. One of the biggest challenges in moving a project from the lab to a factory environment is doing it well, doing it in a repeatable way (because this won’t be your last big data project), and avoiding failures along the way.

The key lies in using a common set of data management standards and technologies so you can transition projects in a smooth and predictable way. But that’s often easier said than done.

In a previous blog post, we described the 5 Steps to a Successful Big Data POC. Today, I want to describe the steps for turning successful big data POCs into production environments. Based on insights, lessons learned, and best practices gained from working with enterprises worldwide, we came up with a workbook, “From Lab to Factory: The Big Data Management Workbook,” that shows you what to avoid, what to do right, and what really matters when it comes to big data management.


New ebook packed with advice for turning successful big data experiments into production deployments


To begin the journey from lab to factory, you need an enterprise architecture capable of serving two distinct purposes:

  1. The lab: Make data ready for analysis in a lab environment where analysts can efficiently run meaningful analytic experiments and pilots.
  2. The factory: Make data production-ready in a factory environment so it can be used for specific projects and products as they’re being operationalized.

The good news is that the requirements for these two purposes, while quite different, can be met by a common set of architectural components. But it’s absolutely crucial to get the architecture right before you begin. Our workbook goes into much greater detail on this, but here’s a high-level look at a few key things that will help you successfully move from lab to factory:

  • Define a clear data management strategy that considers both the lab and the factory from the start. The lab side of your data management infrastructure is all about empowering your analysts to run their analytic experiments on their own with less help from IT. The factory side is about automating a data supply chain that turns the data and insights discovered in the lab into tangible business value. The factory should ensure that IT has the tools and investment it needs to build out what your business end-users and customers need.
  • In the lab environment, you want to create a structure that enables your analysts to build their own data pipelines on the fly. If you give them tools that self-document transformations and data flows, you’ll speed up the process of IT provisioning these data pipelines in a production environment. The factory can then engineer directly from the logic and objects used in the lab environment.
  • Build an automated, standardized data management infrastructure that supports both the lab and factory. A big data lab that can’t rapidly implement innovative solutions in a production-grade factory environment is only half-complete. And a data management infrastructure that can’t support self-service autonomy for analysts to experiment isn’t complete either. The bottom line is that smart architectural and infrastructural decisions can help you reduce the risk of experimentation while streamlining production.
  • Respect the 3 pillars of big data management: integration, governance, and security. Apply them, and you won’t just be streamlining IT’s development and production processes; you’ll be giving the smartest scientists and analysts in your business the license they need to innovate. Check out this blog post from Informatica’s Chief Product Officer, Amit Walia, to learn more about big data management.
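
To make the idea of self-documenting transformations from the list above more concrete, here is a minimal Python sketch. It is an illustration only, not the workbook's method or any vendor's tooling: the class `SelfDocumentingPipeline`, its `step` decorator, and the two sample transformations are all hypothetical names invented for this example. It shows a lab-side pipeline that records a description of each transformation as the analyst adds it, so the factory team could later read the recorded steps instead of reverse-engineering the experiment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Illustrative types; a real platform would define its own record format.
Record = Dict[str, object]
Transform = Callable[[List[Record]], List[Record]]


@dataclass
class SelfDocumentingPipeline:
    """Hypothetical lab pipeline that keeps a description of every step."""
    name: str
    steps: List[Tuple[str, str, Transform]] = field(default_factory=list)

    def step(self, description: str) -> Callable[[Transform], Transform]:
        """Decorator: register a transformation together with its description."""
        def register(func: Transform) -> Transform:
            self.steps.append((func.__name__, description, func))
            return func
        return register

    def run(self, records: List[Record]) -> List[Record]:
        """Apply every registered transformation in the order it was added."""
        for _, _, func in self.steps:
            records = func(records)
        return records

    def lineage(self) -> List[str]:
        """Return a readable list of the steps, in order, for handoff to IT."""
        return [
            f"{i}. {name}: {description}"
            for i, (name, description, _) in enumerate(self.steps, start=1)
        ]


pipeline = SelfDocumentingPipeline("customer_orders_experiment")


@pipeline.step("Drop records without a customer id")
def drop_missing_customers(records: List[Record]) -> List[Record]:
    return [r for r in records if r.get("customer_id") is not None]


@pipeline.step("Convert order amount from cents to dollars")
def cents_to_dollars(records: List[Record]) -> List[Record]:
    return [{**r, "amount": r["amount"] / 100} for r in records]


if __name__ == "__main__":
    sample = [
        {"customer_id": 1, "amount": 1250},
        {"customer_id": None, "amount": 500},
    ]
    print(pipeline.run(sample))
    print("\n".join(pipeline.lineage()))
```

The design choice here is that documentation is captured at the moment a step is defined, so the lineage cannot drift from the logic that actually runs. A production implementation would likely also record schemas, data sources, and ownership, which this sketch leaves out.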

I hope you’ll take a look at our workbook, “From Lab to Factory: The Big Data Management Workbook.” It will help you design an enterprise architecture that serves both the lab and the factory: making data ready for analysis in a lab where analysts can run valuable experiments to discover insights, and making data production-ready in a factory that is set up to deliver value to business stakeholders and customers.