The ability to create abstract schemas that are mapped to back-end physical databases provides a huge advantage for those enterprises looking to get their data under control. However, given the power of data virtualization, there are a few things that those in charge of data integration should know. Here are a few quick tips.
Tip 1: Start with a new schema that is decoupled from the data sources.
Many of those who leverage data virtualization attempt to build a schema on top of existing databases, reusing whatever attributes are found within those databases. A much better approach is to design a database schema that’s directly related to the requirements of the business. For now, ignore the design of back-end databases.
Then, link the physical databases to the virtual database schema. Of course, some tradeoffs have to be made. However, the end design should be much closer to what the virtual database should be, and not just a copy of physical database attributes.
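To make the two steps concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular data virtualization product works. Every name in it (`crm_db`, `billing_db`, the table and column names, and the helper functions) is hypothetical. The virtual view is declared first, from business requirements alone, and the links to physical columns are added afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalColumn:
    """Location of one attribute in a back-end database (all names hypothetical)."""
    source: str
    table: str
    column: str


# Step 1: the virtual schema, designed from business requirements only.
CUSTOMER_VIEW = ("customer_id", "full_name", "lifetime_value")

# Step 2: link each virtual attribute to a physical column afterwards.
CUSTOMER_MAPPING = {
    "customer_id": PhysicalColumn("crm_db", "cust", "cust_no"),
    "full_name": PhysicalColumn("crm_db", "cust", "nm"),
    "lifetime_value": PhysicalColumn("billing_db", "acct_summary", "ltv_amt"),
}


def unmapped_attributes(view, mapping):
    """Return virtual attributes that still have no physical link."""
    return [attr for attr in view if attr not in mapping]


def sources_for(attributes, mapping):
    """Return the sorted names of the back-end databases a request would touch."""
    return sorted({mapping[attr].source for attr in attributes})


print(unmapped_attributes(CUSTOMER_VIEW, CUSTOMER_MAPPING))
print(sources_for(["customer_id", "full_name"], CUSTOMER_MAPPING))
print(sources_for(CUSTOMER_VIEW, CUSTOMER_MAPPING))
```

Because the view names (`full_name`, `lifetime_value`) come from the business vocabulary rather than from the physical columns (`nm`, `ltv_amt`), the back-end databases can change without changing what consumers see. The `unmapped_attributes` check is one place where the tradeoffs mentioned above would surface.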
Tip 2: Consider performance in your design.
In most instances, performance issues can be traced to poor design, not poor technology. When considering data virtualization, make sure you understand the performance tradeoffs tied to the number of physical databases involved and to how the data will be gathered and externalized.
As a rule of thumb, your solution will perform only as well as the worst-performing back-end database. Suppose you leverage three physical databases: database 1 produces its result set in 0.03 seconds, database 2 in 0.002 seconds, and database 3 in 3.4 seconds. Then the latency is determined by database 3, and it takes at least 3.4 seconds to get the combined result set. However, creating abstractions that leverage only database 1 and database 2 will result in a latency of only about 0.03 seconds.
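The arithmetic behind this rule of thumb can be sketched in a few lines of Python. This is a simplified model, not a benchmark. It assumes the back-end databases are queried in parallel and it ignores the overhead of the virtualization layer itself, so the result is a lower bound. The function name `federated_latency` and the dictionary keys are made up for illustration. The timings are the ones from the example.

```python
# Response times in seconds for each back-end database, from the example.
LATENCIES = {"database_1": 0.03, "database_2": 0.002, "database_3": 3.4}


def federated_latency(latencies, sources):
    """Lower bound on response time when the given sources are queried in parallel.

    The combined result cannot be returned before the slowest source answers.
    """
    return max(latencies[name] for name in sources)


all_three = federated_latency(LATENCIES, ["database_1", "database_2", "database_3"])
fast_two = federated_latency(LATENCIES, ["database_1", "database_2"])

print(all_three)  # 3.4
print(fast_two)   # 0.03
```

If the sources were queried one after another instead, the times would add up rather than being bounded by the maximum, which makes the slow database even more costly.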
Tip 3: Focus on the business value.
As technologists, we tend to focus a bit too much on the coolness of the technology and not enough on the business value. I would urge those looking at data virtualization to consider the value that this technology will bring to the business. This means working up a business case.
The use of a business case does two things: First, it provides a foundation of understanding to obtain the funding to carry out a data virtualization project. Second, it establishes the definition of success for the project.
When you develop the business case, make sure you consider the value of the tactical problem you’re solving, such as the ability to have a single view of customer data. Also consider the strategic value, which is typically greater, such as the ability to quickly alter data to adapt to shifts in the business.
Of course, the value of these data virtualization tips is largely dependent upon your enterprise and business requirements. However, if data virtualization is something you’re considering, these are good starting points.