Virtual Full Replication for Scalable Distributed Real-Time Databases

University dissertation from Linköping University

Abstract: A fully replicated distributed real-time database provides high availability and predictable access times, independent of user location, since all the data is available at each node. However, full replication requires that all updates are replicated to every node, resulting in exponential growth of bandwidth and processing demands with the number of nodes and objects added. To eliminate this scalability problem, while retaining the advantages of full replication, this thesis explores Virtual Full Replication (ViFuR); a technique that gives database users a perception of using a fully replicated database while only replicating a subset of the data.We use ViFuR in a distributed main memory real-time database where timely transaction execution is required. ViFuR enables scalability by replicating only data used at the local nodes. Also, ViFuR enables flexibility by adaptively replicating the currently used data, effectively providing logical availability of all data objects. Hence, ViFuR substantially reduces the problem of non-scalable resource usage of full replication, while allowing timely execution and access to arbitrary data objects.In the thesis we pursue ViFuR by exploring the use of database segmentation. We give a scheme (ViFuR-S) for static segmentation of the database prior to execution, where access patterns are known a priori. We also give an adaptive scheme (ViFuR-A) that changes segmentation during execution to meet the evolving needs of database users. Further, we apply an extended approach of adaptive segmentation (ViFuR-ASN) in a wireless sensor network - a typical dynamic large-scale and resource-constrained environment. We use up to several hundreds of nodes and thousands of objects per node, and apply a typical periodic transaction workload with operation modes where the used data set changes dynamically. We show that when replacing full replication with ViFuR, resource usage scales linearly with the required number of concurrent replicas, rather than exponentially with the system size.

  This dissertation MIGHT be available in PDF-format. Check this page to see if it is available for download.