This article explores Agile Data Integration (DI) and Agile Business Intelligence (BI) and contrasts the practices and technologies that support each. First, some definitions.
Agile DI is the application of agile techniques (iterative/incremental development, cross-functional self-organizing teams, rapid/flexible response to change, etc.) to address data integration challenges such as migrating data between systems or consolidating data from multiple systems. Agile BI is the application of agile techniques to address business intelligence challenges such as identifying and analyzing data to support better business decision-making. These two disciplines sometimes overlap or support each other. For example, you might use Agile DI to move data into a data warehouse and Agile BI to get it out of the warehouse in a useful form.
While agile techniques can be practiced manually, for example with hand-written stories on a project wall or a sprint backlog on Post-it notes, tools such as automated regression testing help enable rapid iterations and response to change. The question I’d like to address in this post is which technologies can have a big impact on enabling Agile DI and BI. Here are two examples (there are others) from the Informatica platform.
The Informatica 9.1 platform supports Agile DI by allowing data stewards, business analysts, technical architects, and integration developers to interact through role-based user interfaces accessing shared metadata rather than emailing Excel or Word documents back and forth. This matters because “getting the data right” is a collaborative learning process across a cross-functional team. Developers often complain that the users are constantly changing their requirements, but usually it is more a matter of the requirements being refined as people learn. For example, needs change as a) source system analysis results in a refined understanding of data definitions, b) target system needs are revised to respond to available source data, c) data profiling analysis highlights quality or consistency issues that result in programming workarounds, and d) testing results identify problems or new opportunities for additional integration.
There is no such thing as “perfect source data” or “perfect target system requirements” except in extremely simple scenarios. Most of the time, getting the data right is an iterative learning process, so the faster the members of the cross-functional team can learn from each other and incrementally build the end solution, the more agile they become. The Informatica 9.1 platform supports an integrated business glossary, data profiling, mapping specifications, and development metadata in shared repositories. Developers access the metadata through a thick-client developer workstation, while analysts use a thin-client browser-based interface to access a view of the same metadata that is relevant to them. Everyone involved in the process can pass notes and comments to each other through the same shared repository.
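To make the data profiling step a little more concrete, here is a minimal sketch in plain Python. It is not how the Informatica platform does profiling; the `profile` function, the column names, and the sample rows are all hypothetical and exist only for illustration. It counts missing values and distinct values per column, which is often enough to surface the kind of quality or consistency issue that sends a team back to refine its requirements.

```python
from collections import Counter


def is_missing(value):
    """Treat None and empty strings as missing (an assumption for this sketch)."""
    return value is None or value == ""


def profile(rows, columns):
    """Return per-column counts of missing and distinct values.

    `rows` is a list of dicts, one per source record; `columns` is the
    list of column names to profile. Both are hypothetical inputs.
    """
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if not is_missing(v)]
        report[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            # Values that occur more than once, e.g. a duplicated key.
            "duplicates": sorted(
                v for v, n in Counter(present).items() if n > 1
            ),
        }
    return report


# Hypothetical source extract: a repeated id, a missing state, and
# inconsistent casing ("NY" vs "ny") that a target system might reject.
sample_rows = [
    {"id": 1, "state": "NY"},
    {"id": 2, "state": "ny"},
    {"id": 3, "state": None},
    {"id": 3, "state": "CA"},
]

result = profile(sample_rows, ["id", "state"])
```

On this sample the report flags the repeated `id` and the missing `state`, and the distinct count for `state` hints at the casing inconsistency. Findings like these are what the analyst and the developer would then discuss through the shared repository.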
Agile BI enabling technology is different. One example is the Informatica Data Services capability, which enables data virtualization. In a data warehouse approach to BI, the data first has to be copied to the data warehouse (DW) before the analytical query and reporting process can start. If the analytical process discovers missing data or uncovers an opportunity for additional analysis of data that wasn’t requested the first time around, it must wait until another DI effort is ramped up to deliver the additional data. By contrast, the data virtualization approach provides a common “view” into the database tables and metadata of multiple systems. No data is moved until a query is executed against the virtual view. If the initial query identifies the need or opportunity for a new line of analysis, the analyst simply changes the query and runs it again. No waiting! Once again, the ability to rapidly iterate and accelerate the learning process enables an agile approach to BI. Don’t take my word for it: check out what HealthNow had to say about how they used data services as an integral part of their SOA strategy.
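The virtual-view idea can be sketched with nothing more than Python's built-in `sqlite3` module. This is only an analogy under stated assumptions, not Informatica Data Services: two attached in-memory databases stand in for two separate source systems, and the schema names (`crm`, `billing`), table names, and sample rows are all hypothetical. The view stores no data of its own; rows are read from the underlying tables only when a query runs against it.

```python
import sqlite3


def build_virtual_view():
    """Create two stand-in source systems and one view spanning both."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH of ':memory:' creates a separate in-memory database.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")

    conn.execute(
        "CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
    )
    conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?, ?)",
        [(1, "Acme", "East"), (2, "Globex", "West")],
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )

    # A TEMP view may reference tables in attached databases.
    # It is a stored query only; no rows are copied anywhere.
    conn.execute(
        """
        CREATE TEMP VIEW customer_invoices AS
        SELECT c.name, c.region, i.amount
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        """
    )
    return conn


conn = build_virtual_view()

# First line of analysis: revenue per customer.
by_customer = conn.execute(
    "SELECT name, SUM(amount) FROM customer_invoices GROUP BY name ORDER BY name"
).fetchall()

# New question? Change the query and run it again; no new load job is needed.
by_region = conn.execute(
    "SELECT region, SUM(amount) FROM customer_invoices GROUP BY region ORDER BY region"
).fetchall()
```

The second query is the point of the example: moving from a per-customer to a per-region analysis required changing one SQL statement, not commissioning another round of data movement.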