Explosion of Big Data Leads to the Exposition of Data Integration
A new report from Forrester talks about the explosion of big data, with non-relational databases making up the bulk of the growth. The report, which the analyst firm claims is the first of its kind, is titled Big Data Management Solutions Forecast 2016 to 2021.
In essence, the report argues that NoSQL and Hadoop will see the biggest growth during the five-year period, with the markets growing 25.0% and 32.9% per year respectively. Forrester also claims that the big data technology space will grow at three times the rate of the overall technology market.
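To put those rates in perspective, here is a minimal sketch of the compounding arithmetic. It assumes the quoted figures are compound annual growth rates that hold steady across all five years, which the summary above does not confirm. The function name `growth_multiple` is purely illustrative.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """Return how many times larger a market is after compounding annually."""
    return (1.0 + annual_rate) ** years


YEARS = 5  # assumed length of the forecast period
RATES = {"NoSQL": 0.250, "Hadoop": 0.329}  # annual rates quoted in the article

for name, rate in RATES.items():
    multiple = growth_multiple(rate, YEARS)
    print(f"{name}: about {multiple:.2f}x over {YEARS} years")
```

Under that assumption, a market growing 25.0% per year roughly triples in five years, and one growing 32.9% per year roughly quadruples.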
According to the report, big data technology can be divided into six buckets: enterprise data warehousing, NoSQL, Hadoop, big data integration, data virtualization, and in-memory data fabric. Big data integration is any technology that can support updating and transporting data from big data systems, including NoSQL- and Hadoop-based data stores.
The growth of big data integration is driven by a few key patterns, including:
- The appearance of net new data stores that support big data systems, which in turn require data integration for updates and extracts.
- The focus on data security, including encryption in flight and encryption at rest.
- The focus on performance, including delivering data in real time or near real time, in support of key business processes.
- The rise of new technology that is related to big data technology, such as the Internet of Things and Machine Learning.
- The rise of the importance of data within most enterprises that have revenues of over one billion dollars a year.
With all of that said, the path to data integration for big data systems is not as easy as it sounds. Those who are looking to select big data integration technology need to focus on a few key requirements: performance, suitability to task, security, and data governance features.
Picking the right tools means taking a holistic look at how data needs to move in and out of the big data systems. You also need to map which operations must be performed on the data, such as schema transformation. In some cases, we may be removing a schema altogether, or adding a schema at run time. Keep in mind that big data systems deal with both structured and unstructured data, and both kinds need to be managed with the same efficiency.
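As an illustration of adding a schema at run time, here is a minimal sketch in plain Python. The raw records stay schemaless JSON, and a schema is applied only when they are read. The field names, the `ORDER_SCHEMA` mapping, and the `apply_schema` helper are all hypothetical, and real integration tools handle this in their own ways.

```python
import json
from typing import Any, Callable

# Hypothetical schema: field name -> converter applied when the record is read.
Schema = dict[str, Callable[[Any], Any]]

ORDER_SCHEMA: Schema = {
    "order_id": int,
    "amount": float,
    "customer": str,
}


def apply_schema(raw_line: str, schema: Schema) -> dict[str, Any]:
    """Parse one JSON record and coerce the fields named in the schema.

    Fields missing from the record become None; fields that are not
    in the schema are dropped.
    """
    record = json.loads(raw_line)
    typed: dict[str, Any] = {}
    for field, convert in schema.items():
        value = record.get(field)
        typed[field] = convert(value) if value is not None else None
    return typed


# Raw records as they might sit in a schemaless store (made-up sample data).
raw_records = [
    '{"order_id": "1001", "amount": "19.99", "customer": "acme", "note": "rush"}',
    '{"order_id": 1002, "amount": 5}',
]

typed_records = [apply_schema(line, ORDER_SCHEMA) for line in raw_records]
for row in typed_records:
    print(row)
```

The design choice here is that the stored data never changes. A different consumer could read the same records with a different schema, which is the flexibility that run-time schemas are meant to provide.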
Performance is perhaps the biggest issue here. We need to move data at “the speed of need.” It must arrive at the target system in time for the business process or analytical service to absorb it, and thus provide the information the business needs to operate. Lackluster performance, which often occurs, is a difficult problem to fix without major surgery. Consider performance issues before they become real problems.
Big data integration is the most strategic use of data integration technology that we’ve seen for some time. Indeed, this is all about using data better, and that requires changes to your data storage as well as to the plumbing. Will you be ready to take advantage of your data?