
The Case for Data Integration

Data permeates our modern world, necessitating increasingly complex technology solutions to ensure diverse systems work in harmony. The evolution of packaged software applications, each with its own data silo, mandates data sharing between different applications. From sharing data between the most complex business applications to synchronizing the address book on your smartphone, the task of transforming and transferring this data comprises the expanding industry of data integration.

Explore with us why we integrate systems, how data is architected, and the many ways it’s integrated. We will discuss the various styles of integration and the products that support them, as well as the pros and cons of each. We provide an objective and comparative approach to general techniques, allowing you to draw your own conclusions about what is best for your particular use case. Rather than compare specific vendors, we will give you the tools to do your own evaluation.

What is Data Integration?

Various vendors and technical experts offer their own definitions for this term, many of them self-serving. However, most agree that data integration describes the systems that combine data from different sources into an application for “meaningful and valuable information” (IBM, 2016). Organizations of all sizes employ a variety of applications that often do not share a common data source or format, and frequently need third-party tools or plugins to make those systems interact. Data integration has grown as both the volume of data and the need to share it have exploded.

Data Integration includes:

  • Data Warehouses: information decoupled from source applications for improved readability
  • Application Integration: data sharing between independent applications
  • Regulatory Compliance: backup and recovery, with or without versioning of records or entire data sets
  • Data Migration: preventing vendor lock-in
  • Big Data: creation of the Haystack in hopes of finding needles
  • Device Integration: the Internet of Things, or IoT

 

Styles of Integration

The styles of integration include but are not limited to:

  • Flat File interfaces, which hearken back to the days of punched cards and 9-track tapes
  • Tightly coupled, which builds interfaces between applications in the actual application layer using APIs
  • Mashups, which combine data from disparate sources into a common user interface
  • ETL, which involves three primary steps:
    • Extracting data from diverse source systems and storage technologies
    • Transforming data into the desired format and content
    • Loading data into target systems
  • Data Bus, which employs a publish-and-subscribe model to allow applications to get the data they need
  • Data Replication, which simply copies data from one source to another, often changing the underlying storage method
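
To make the three ETL steps listed above concrete, here is a minimal sketch in Python using only the standard library. Everything specific in it is an assumption made for illustration: the CSV layout, the `customers` table, and the function names `extract`, `transform`, `load`, and `run_etl` are hypothetical and do not come from any particular product.

```python
import csv
import io
import sqlite3

# Hypothetical source data: a CSV export from one system (assumed layout).
SOURCE_CSV = """id,name,signup_date
1, Ada Lovelace ,12/10/2015
2,grace hopper,01/02/2016
"""


def extract(text):
    """Extract: read rows from the source format (CSV in this sketch)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: tidy names and convert MM/DD/YYYY dates to YYYY-MM-DD."""
    records = []
    for row in rows:
        month, day, year = row["signup_date"].split("/")
        name = row["name"].strip().title()
        records.append((int(row["id"]), name, f"{year}-{month}-{day}"))
    return records


def load(records, conn):
    """Load: write the transformed records into the target store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, signup_date TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three steps in order against an open SQLite connection."""
    load(transform(extract(text)), conn)
```

Calling `run_etl(SOURCE_CSV, sqlite3.connect(":memory:"))` would populate an in-memory target. Keeping the three steps as separate functions mirrors the list above, and means the source or the target can be swapped without touching the transformation logic.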

 

Evolution of Data Integration

Flat Files: Since mainframe days, flat files have served as interfaces between systems built by different teams for different reasons or at different times. Integration consisted of reformatting data, merging, sorting, and copying flat files, and then sometimes loading the data into a target database.

Unified Operational Data Model: In the 1980s, the fantasy was that a corporate IT department would employ a single database model representing all the company’s data, and that every application would use this single version of the truth as its repository. The goal was never achievable, given the complexity and cost of having internal staff rebuild all of a company’s systems. The database technology of the time also did not lend itself to the task until the relational model caught on in the 1990s.

Integrating Applications: As packaged vendor software proliferated and replaced legacy custom applications, integrating applications on the back end became the new, more realistic objective.

Corporate Data Warehouses: Corporate data warehouses that could merge and restructure this data for high performance reporting became another driving factor for development of integration techniques.

Big Data: Gartner analysts Sheila Childs and Merv Adrian outlined the impacts for organizations employing “big data” systems to analyze business information. They noted that “big data is not simply big volume”, citing the complexity and volume of enterprise applications, and recommended that “IT should join with the business to aggressively embrace the concept of big data as providing the potential to deliver new revenue and/or competitive differentiation.”