If you don’t understand application semantics (simply put, the meaning of data), then you have no hope of creating the proper data integration solution. I’ve been stating this fact since the 1990s, and it has proven correct over and over again.
Just to be clear: You must understand the data to define the proper integration flows and transformation scenarios, and to provide service-oriented frameworks, meaning levels of abstraction, for your data integration domain. This applies both to the movement of data from source to target systems and to the abstraction of data using data virtualization approaches and technology, such as the technology from the host of this blog.
This is where many data integration projects fall down. Most data integration occurs at the information level. So, you must always deal with semantics and how to describe semantics relative to a multitude of information systems. There is also a need to formalize this process, putting some additional methodology and technology behind the management of metadata, as well as the relationships therein.
Many in the world of data integration have begun to adopt the notion of ontology (or the instances of ontology: ontologies). Ontology is a term borrowed from philosophy that refers to the science of describing the kinds of entities in the world and how they are related.
Why should we care? Ontologies are important to data integration solutions because they provide a shared and common understanding of data that exists within the business domain. Moreover, ontologies illustrate how to facilitate communication between people and information systems. You can think of ontologies as the understanding of everything, and how everything should interact to reach a common objective: in this case, the optimization of the business.
By leveraging this concept we can organize and share enterprise information, as well as manage content and knowledge, which allows better interoperability and integration of inter- and intra-company information systems and databases. We can also layer common ontologies within verticals, or domains with repeatable patterns. For example, organizations in the retail or healthcare verticals can leverage common data meanings shared across their vertical.
At its essence, an ontology is a conceptual information model. Ontologies describe the things that exist in a problem domain, including properties, concepts, and rules, and how they relate to one another. Together, these support a standard reference model for information integration (the link to data integration), as well as knowledge sharing.
The use of ontologies is not at all new. For years we’ve been leveraging ontologies in the science of data integration because the use of these models supports human understanding of the information. The time has come to look at the value of this approach as our data integration domains become more complex and far reaching. I’ve found that the ability to place order in these complex environments is critical to the success of the data integration strategy.
This use is self-explanatory within the context of data integration. Ontologies also provide the ability to facilitate information-based access, and information integration across very different information systems. We achieve this by formalizing the data semantics between intra- and inter-organizational information resources.
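As a rough illustration of formalizing semantics between very different systems, the Python sketch below maps the local field names of two systems onto shared terms, then uses those shared terms to translate a record from one system to the other. The two systems (a CRM and a billing system), their field names, the shared terms, and the sample record are all hypothetical.

```python
# Hypothetical mappings: local field name -> shared (ontology) term.
CRM_MAPPING = {
    "cust_no": "customer_id",
    "full_name": "customer_name",
    "zip": "postal_code",
}
BILLING_MAPPING = {
    "CustomerID": "customer_id",
    "Name": "customer_name",
    "PostCode": "postal_code",
}


def to_canonical(record, mapping):
    """Rename local fields to shared terms; return (canonical, unmapped)."""
    canonical, unmapped = {}, {}
    for key, value in record.items():
        if key in mapping:
            canonical[mapping[key]] = value
        else:
            # No agreed meaning yet, so do not guess; keep it aside for review.
            unmapped[key] = value
    return canonical, unmapped


def from_canonical(canonical, mapping):
    """Rename shared terms back to one system's local field names."""
    inverse = {term: local for local, term in mapping.items()}
    return {inverse[term]: value for term, value in canonical.items() if term in inverse}


def translate(record, source_mapping, target_mapping):
    """Move a record between systems by way of the shared terms."""
    canonical, _ = to_canonical(record, source_mapping)
    return from_canonical(canonical, target_mapping)


# Made-up sample record, for illustration only.
crm_record = {
    "cust_no": "C-100",
    "full_name": "Ada Lovelace",
    "zip": "94107",
    "last_login": "2013-01-05",
}
billing_record = translate(crm_record, CRM_MAPPING, BILLING_MAPPING)
print(billing_record)
```

The design choice worth noting is that each system is mapped once to the shared terms rather than to every other system, so adding a third system means writing one new mapping instead of two.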
If data integration is on your to-do list in 2013, I urge you to look into this approach. You’ll find that the game is won at the planning and understanding stage of the project, and the more you can do to make that stage easier and better, the more likely you are to succeed.