Part 1: The Importance of Test Data Management for Development and Operations

Test Data Management

The adoption of agile software development has prompted tighter collaboration between development and operations (DevOps), a practice that in turn makes agile delivery possible.

DevOps enables more frequent, timely, and higher-quality software delivery by deploying automated tools for testing, release, configuration management, and continuous integration, while fostering collaboration across the various teams involved in a release, including development, QA, operations, release management, and managers.

Test management and automation tools have long been recognized for speeding up software delivery and improving quality, thus enabling DevOps. Test management tools provide capabilities for managing test cases (creation, organization, and maintenance), test planning and execution, defect management, linking of test cases to results and defects, test coverage analysis, and test automation monitoring and reporting[1]. The management of the test data itself, however, is less of a focus. This is where test data management tools come into the picture.

Test Data Management tools provide the following key capabilities to fill the gaps left by test automation solutions:

  • Sensitive data protection
  • Test data generation
  • Test data maintenance
  • Test data coverage analysis
  • Test data lifecycle management
  • Support for collaboration
  • Self-service for testers
  • Integration with Test Automation and DevOps tools

Sensitive Data Protection

Most quality assurance teams insist on using production data for testing because it provides the most realistic test environment. However, production data contains sensitive information that should not be exposed to developers and testers, especially when testing is outsourced or off-shored. Examples of sensitive information are personally identifiable information (PII), such as customers’ names, addresses, social security numbers, and dates of birth; healthcare records (PHI); and credit card information (PCI). Sensitive data must be de-identified when used in non-production environments. Persistent Data Masking solutions enable organizations to protect their sensitive data by using various masking algorithms (e.g. encryption, substitution, aging, credit-card-specific rules, etc.) to obfuscate the original data. The masking ensures that no user can view or easily deduce the original data, while preserving the original characteristics of the data (masked addresses, names, etc. still look like real ones) and keeping related masked data consistent, thus ensuring test data quality.
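As a rough illustration of substitution masking that stays consistent across related data, here is a minimal Python sketch. It is not how any particular masking product works: the name pools, the key, and the function names are all hypothetical, and a keyed hash (HMAC) is used simply so that the same original value always maps to the same masked value.

```python
import hashlib
import hmac

# Hypothetical substitution pools; a real tool would use far larger dictionaries.
FIRST_NAMES = ["Alex", "Jordan", "Morgan", "Riley", "Casey", "Taylor"]
LAST_NAMES = ["Reed", "Hale", "Stone", "Brooks", "Lane", "Frost"]

# Assumption: the key is managed outside the non-production environment.
SECRET_KEY = b"demo-key-change-me"


def _digest(value: str) -> bytes:
    """Keyed hash of the original value; equal inputs give equal outputs."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()


def mask_name(full_name: str) -> str:
    """Substitute a name with one drawn from the pools, deterministically."""
    d = _digest(full_name.lower())
    first = FIRST_NAMES[d[0] % len(FIRST_NAMES)]
    last = LAST_NAMES[d[1] % len(LAST_NAMES)]
    return f"{first} {last}"


def mask_ssn(ssn: str) -> str:
    """Replace every digit but keep the layout (e.g. ddd-dd-dddd)."""
    d = _digest(ssn)
    out = []
    i = 0
    for ch in ssn:
        if ch.isdigit():
            out.append(str(d[i % len(d)] % 10))
            i += 1
        else:
            out.append(ch)  # separators are preserved as-is
    return "".join(out)
```

Because the mapping is deterministic, a customer name that appears in two tables is masked to the same substitute in both, which is what keeps joins and lookups working on the masked copy.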

Test Data Generation

Though testing usually uses production data, that data is not always available, nor is it always complete. For new applications, there is no production system from which to source test data, or the production system does not contain enough data to cover the various scenarios. Production data also might not cover corner cases or erroneous data and scenarios. Test Data Generation tools allow you to create synthetic data based on a rich set of rules for various data domains (e.g. credit cards, addresses, e-mail, IP addresses, age, etc.), relationships (parent-child), out-of-bound data, aggregation, and others. Test Data Generation tools can be used on their own for simple or new applications, or to supplement existing production data.
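To make rule-based generation concrete, the following Python sketch produces two kinds of synthetic values: credit card numbers that pass the Luhn checksum, and ages that include boundary and out-of-bound values. The function names, the "4" prefix, and the 18 to 120 age range are assumptions chosen for illustration, not rules taken from any specific tool.

```python
import random


def luhn_check_digit(partial: str) -> str:
    """Compute the Luhn check digit for a digit string that lacks one."""
    total = 0
    # Walk right to left; the rightmost digit of `partial` is doubled first.
    for i, ch in enumerate(reversed(partial)):
        n = int(ch)
        if i % 2 == 0:
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """Validate a full number, check digit included."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        n = int(ch)
        if i % 2 == 1:
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0


def generate_card(rng: random.Random, prefix: str = "4", length: int = 16) -> str:
    """Random card-like number with a valid check digit (synthetic, not a real card)."""
    body_len = length - len(prefix) - 1
    body = prefix + "".join(str(rng.randint(0, 9)) for _ in range(body_len))
    return body + luhn_check_digit(body)


def generate_ages(rng: random.Random, count: int, low: int = 18, high: int = 120) -> list:
    """Valid ages plus boundary and out-of-bound values for negative tests."""
    valid = [rng.randint(low, high) for _ in range(count)]
    edge = [low - 1, low, high, high + 1]
    return valid + edge
```

Passing in a seeded `random.Random` makes the generated set reproducible, so a failing test can be rerun against exactly the same data.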

Test Data Maintenance

Once you have created a set of test data, whether it’s sourced from production or also includes generated data, you may want to:

  • Store it for future use
  • Reuse it for other test cases
  • Manipulate it by:
    • creating slices (subsets)
    • multiplying it (superset) for load testing
    • augmenting it with additional test data from other test data sets or with generated test data
  • Tag it so that you can find it later
  • Search for test data you’ve saved before or that others have created

All of the capabilities above are needed for test data maintenance and are provided by a Test Data Warehouse solution.
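The operations listed above can be sketched with a small in-memory model in Python. This is only a toy under stated assumptions: `DataSet` and `Warehouse` are hypothetical names, rows are plain dictionaries with an `id` key, and a real Test Data Warehouse would persist data sets and handle far larger volumes.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    """A named, taggable collection of test rows (each row is a dict)."""
    name: str
    rows: list
    tags: set = field(default_factory=set)

    def subset(self, predicate, name):
        """Create a slice containing only the rows that match the predicate."""
        return DataSet(name, [r for r in self.rows if predicate(r)], set(self.tags))

    def multiply(self, factor, name, key="id"):
        """Create a superset for load testing; keys are suffixed to stay unique."""
        rows = []
        for copy in range(factor):
            for r in self.rows:
                clone = dict(r)
                clone[key] = f"{r[key]}-{copy}"
                rows.append(clone)
        return DataSet(name, rows, set(self.tags))

    def augment(self, other, name):
        """Combine this data set with rows from another data set."""
        return DataSet(name, self.rows + other.rows, self.tags | other.tags)


class Warehouse:
    """Stores data sets by name and finds them again by tag."""

    def __init__(self):
        self._sets = {}

    def store(self, ds):
        self._sets[ds.name] = ds

    def get(self, name):
        return self._sets[name]

    def search(self, tag):
        """Return the names of all stored data sets carrying the tag."""
        return sorted(n for n, ds in self._sets.items() if tag in ds.tags)
```

For example, a tester might store a base `customers` set, derive a US-only slice with `subset`, tag that slice `smoke`, and later retrieve it by calling `search("smoke")`.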

In Part 2 of my blog, I will discuss the rest of the essential test data management capabilities to support DevOps.

[1] Gartner Market Guide for Test Management Tools, October 1, 2015.