Design, implementation and evaluation of a distributed mediator system for data integration

University dissertation from Linköping : Linköpings universitet

Abstract: An important factor in the strength of a modern enterprise is its capability to effectively store and process information. As a legacy of the mainframe computing trend in recent decades, large enterprises often have many isolated data repositories used only within portions of the organization. The methodology used in the development of such systems, also known as legacy systems, is tailored to the application, without concern for the rest of the organization. For organizational reasons, such isolated systems still emerge within different portions of the enterprises. While these systems improve the efficiency of the individual enterprise units, their inability to interoperate and provide the user with a unified information picture of the whole enterprise is a "speed bump" in taking the corporate structures to the next level of efficiency.

Several technical obstacles arise in the design and implementation of a system for integration of such data repositories (sources), most notably distribution, autonomy, and data heterogeneity. This thesis presents a data integration system based on the wrapper-mediator approach. In particular, it describes the facilities for passive data mediation in the AMOS II system. These facilities consist of: (i) object-oriented (OO) database views for reconciliation of data and schema heterogeneities among the sources, and (ii) a multidatabase query processing engine for processing and executing queries over data in several data sources with different processing capabilities.

Some of the major data integration features of AMOS II are:

A distributed mediator architecture where query plans are generated using distributed compilation in several communicating mediator and wrapper servers.

Data integration by reconciled OO views spanning multiple mediators and specified through declarative OO queries. These views are capacity-augmenting views, i.e. locally stored attributes can be associated with them.

Processing and optimization of queries to the reconciled views using OO concepts such as overloading, late binding, and type-aware query rewrites.

Query optimization strategies for efficient processing of queries over a combination of locally stored and reconciled data from external data sources.

The AMOS II system is implemented on a Windows NT/95 platform.
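
The wrapper-mediator approach and the idea of a capacity-augmenting view can be illustrated with a minimal Python sketch. It is not AMOS II code and does not use its query language; the class names (Wrapper, Mediator), the sample sources, and all field names are hypothetical and chosen only for illustration. Each wrapper translates one source's rows into a common schema, and the mediator merges them into a reconciled view to which locally stored attributes can be attached.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


@dataclass
class Wrapper:
    """Hypothetical wrapper: maps one source's rows to a common schema."""
    name: str
    fetch: Callable[[], Iterable[Row]]   # stands in for access to the source
    translate: Callable[[Row], Row]      # reconciles schema differences

    def rows(self) -> Iterator[Row]:
        for raw in self.fetch():
            yield self.translate(raw)


class Mediator:
    """Hypothetical mediator exposing a reconciled view over its wrappers."""

    def __init__(self, wrappers: Iterable[Wrapper], key: str) -> None:
        self.wrappers = list(wrappers)
        self.key = key
        self.local: Dict[object, Row] = {}  # locally stored attributes

    def store_local(self, key_value: object, **attrs: object) -> None:
        """Attach locally stored attributes to an object of the view."""
        self.local.setdefault(key_value, {}).update(attrs)

    def view(self) -> List[Row]:
        """Merge rows from all wrappers on the key, then add local attributes."""
        merged: Dict[object, Row] = {}
        for wrapper in self.wrappers:
            for row in wrapper.rows():
                merged.setdefault(row[self.key], {}).update(row)
        for key_value, attrs in self.local.items():
            if key_value in merged:
                merged[key_value].update(attrs)
        return [merged[k] for k in sorted(merged)]

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        """Evaluate a simple selection over the reconciled view."""
        return [row for row in self.view() if predicate(row)]


def build_example() -> Mediator:
    """Two made-up sources that name the same concepts differently."""
    payroll = [{"emp_no": 1, "salary_sek": 300000},
               {"emp_no": 2, "salary_sek": 420000}]
    staff = [{"id": 1, "full_name": "Ada"},
             {"id": 3, "full_name": "Bo"}]
    wrappers = [
        Wrapper("payroll", lambda: payroll,
                lambda r: {"id": r["emp_no"], "salary": r["salary_sek"]}),
        Wrapper("staff", lambda: staff,
                lambda r: {"id": r["id"], "name": r["full_name"]}),
    ]
    mediator = Mediator(wrappers, key="id")
    mediator.store_local(1, reviewed=True)  # attribute stored only locally
    return mediator


if __name__ == "__main__":
    for row in build_example().view():
        print(row)
```

The sketch materializes the whole view in memory before filtering. A real mediator would instead compile and optimize a query plan and push work to the sources according to their processing capabilities, which is what the query processing features listed in the abstract address.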
