Test Data in the World of Agile Development

I was recently talking to an executive responsible for the IT infrastructure of a mid-sized bank. He mentioned that he has 17 core applications that are critical to the business and over 200 ancillary applications. Historically, each core application sees one significant change every quarter and two small changes every week. That is about 70 big changes (17 × 4 = 68) and more than 1,700 small changes (17 × 2 × 52 = 1,768) every year! And that is just for the core applications that are critical to the business.

[Image: Broken integration tests]

How does the business ensure that these perturbations do not break existing capabilities within the systems being changed? Worse, how does it ensure that the changes do not break downstream applications that rely on the data captured in these systems? And how do testing teams pick the right test data to cover all the permutations and combinations that these changes introduce?

Test data management solutions address these problems by subsetting a small set of production data to be used for testing and by masking that data so that sensitive values are not exposed to a large set of users. Test data generation then augments the subset with data for features that are not yet in production, and introduces bad data into the environment so that error handling can be tested as well.
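To make subsetting and masking concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular test data management product works. The record layout, the field names and the helpers subset, mask_value and mask are all hypothetical, and the hashing approach is a stand-in for the format-preserving masking a real tool would likely use.

```python
import hashlib
import random

# Hypothetical "production" rows; the field names are illustrative assumptions.
PRODUCTION = [
    {"id": i, "name": f"Customer {i}", "ssn": f"000-00-{i:04d}", "balance": 100 * i}
    for i in range(1, 1001)
]

# Fields treated as sensitive in this sketch (an assumption, not a standard).
SENSITIVE_FIELDS = ("name", "ssn")


def subset(rows, size, seed=42):
    """Pick a small sample of rows; the fixed seed makes the sample repeatable."""
    rng = random.Random(seed)
    return rng.sample(rows, size)


def mask_value(value, salt="test-env"):
    """Replace a value with a stable token so the original is not shown to testers."""
    digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).hexdigest()
    return f"masked-{digest[:12]}"


def mask(rows, fields=SENSITIVE_FIELDS):
    """Return copies of the rows with the sensitive fields masked."""
    return [
        {key: mask_value(val) if key in fields else val for key, val in row.items()}
        for row in rows
    ]


# 25 masked rows out of 1,000; non-sensitive fields such as balance are untouched.
test_data = mask(subset(PRODUCTION, size=25))
```

Because both the sample and the masking are deterministic, the same test data set can be regenerated on every run, which matters once provisioning is automated.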

As the number of applications grows many times over and development strategies move from the waterfall model to agile and continuous integration, it is no longer enough to provision test data; it has to be provisioned in a way that can be automated and repeated. That requires a warehouse of test data that is categorized and tagged by test case and test area. This test data warehouse should:

  • Allow testers to review test data in the warehouse, tag interesting data and data sets, and search on those tags
  • Allow testers to augment test data where it is missing
  • Visualize the test data, identify white spaces (combinations that no test data covers), and generate test data to fill them
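The capabilities listed above can be sketched in a few lines of Python. This is a toy, in-memory illustration under stated assumptions: the Warehouse class, its method names, and the account_type and currency dimensions are all hypothetical, and "white space" is interpreted here as any combination of dimension values that no stored record covers.

```python
from itertools import product


class Warehouse:
    """Toy in-memory test data warehouse (hypothetical design)."""

    def __init__(self):
        self._datasets = {}  # dataset name -> {"records": [...], "tags": set()}

    def add(self, name, records, tags=()):
        self._datasets[name] = {"records": list(records), "tags": set(tags)}

    def tag(self, name, *tags):
        self._datasets[name]["tags"].update(tags)

    def search(self, *tags):
        """Return the names of data sets that carry all of the given tags."""
        wanted = set(tags)
        return sorted(n for n, d in self._datasets.items() if wanted <= d["tags"])

    def white_space(self, dimensions):
        """Return the value combinations that no record in the warehouse covers."""
        fields = list(dimensions)
        covered = {
            tuple(record.get(f) for f in fields)
            for d in self._datasets.values()
            for record in d["records"]
        }
        return [
            combo
            for combo in product(*(dimensions[f] for f in fields))
            if combo not in covered
        ]


wh = Warehouse()
wh.add(
    "retail-accounts",
    [
        {"account_type": "checking", "currency": "USD"},
        {"account_type": "savings", "currency": "USD"},
    ],
    tags=["accounts", "regression"],
)
wh.add("fx-accounts", [{"account_type": "checking", "currency": "EUR"}], tags=["accounts", "fx"])
wh.tag("fx-accounts", "regression")

dimensions = {"account_type": ["checking", "savings"], "currency": ["USD", "EUR"]}
gaps = wh.white_space(dimensions)  # savings accounts in EUR are not covered

# Generate one record per gap and store it as a new, tagged data set.
generated = [dict(zip(dimensions, combo)) for combo in gaps]
wh.add("generated-gap-fill", generated, tags=["accounts", "generated"])
```

A real warehouse would sit on a database and offer a visual view of coverage, but the same three operations (tag, search, find and fill gaps) would still be at its core.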

Once this warehouse of test datasets is available, testers should be able to reserve the test records they are using for their testing. The reserved records can then be restored in the test environment from the test data warehouse, so testers can run their test cases without impacting other testers who share the same test environment.
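As a rough sketch of how reservation and restore could fit together, consider the Python below. Everything in it is an assumption made for illustration: the ReservationBook class, the restore helper, and the use of plain dictionaries to stand in for both the warehouse and the shared test environment.

```python
class ReservationError(Exception):
    """Raised when a tester asks for a record that someone else holds."""


class ReservationBook:
    """Tracks which tester has reserved which record (hypothetical design)."""

    def __init__(self):
        self._owner_by_record = {}

    def reserve(self, tester, record_ids):
        ids = list(record_ids)
        # Collect records already held by a different tester.
        taken = {
            rid: self._owner_by_record[rid]
            for rid in ids
            if self._owner_by_record.get(rid, tester) != tester
        }
        if taken:
            raise ReservationError(f"already reserved: {taken}")
        for rid in ids:
            self._owner_by_record[rid] = tester

    def release(self, tester):
        self._owner_by_record = {
            rid: owner for rid, owner in self._owner_by_record.items() if owner != tester
        }

    def reserved_by(self, tester):
        return sorted(rid for rid, owner in self._owner_by_record.items() if owner == tester)


def restore(environment, warehouse_copy, record_ids):
    """Overwrite only the given records with pristine copies from the warehouse."""
    for rid in record_ids:
        environment[rid] = dict(warehouse_copy[rid])


# Pristine records in the warehouse and a shared test environment loaded from them.
pristine = {1: {"balance": 100}, 2: {"balance": 200}, 3: {"balance": 300}}
env = {rid: dict(record) for rid, record in pristine.items()}

book = ReservationBook()
book.reserve("alice", [1, 2])
book.reserve("bob", [3])

env[1]["balance"] = 0    # alice's test run changes her record
env[3]["balance"] = 999  # bob's test run changes his

# Restoring alice's reserved records leaves bob's in-progress data alone.
restore(env, pristine, book.reserved_by("alice"))
```

The design choice worth noting is that restore works per record rather than per environment, which is what lets several testers share one environment without resetting each other's data.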

This still leaves open the question of how organizations test a business process that spans multiple systems. How can they create test data across these systems that allows a business process to be executed from one end to the other? Once organizations master this capability, they will reach their potential in automating their test processes and reduce the risk that each perturbation in the environment inevitably brings.