Enabling Technologies for Management of Distributed Computing Infrastructures

University dissertation from Umeå : Umeå Universitet

Abstract: Computing infrastructures offer remote access to computing power that can be employed, e.g., to solve complex mathematical problems or to host computational services that need to be online and accessible at all times. From the perspective of the infrastructure provider, large amounts of distributed and often heterogeneous computer resources need to be united into a coherent platform that is then made accessible to and usable by potential users. Grid computing and cloud computing are two paradigms that can be used to form such unified computational infrastructures.

Resources from several independent infrastructure providers can be joined to form large-scale decentralized infrastructures. The primary advantage of doing this is that it increases the scale of the available resources, making it possible to address more complex problems or to run a greater number of services on the infrastructures. In addition, there are advantages in terms of factors such as fault-tolerance and geographical dispersion. Such multi-domain infrastructures require sophisticated management processes to mitigate the complications of executing computations and services across resources from different administrative domains.

This thesis contributes to the development of management processes for distributed infrastructures that are designed to support multi-domain environments.
It describes investigations into how fundamental management processes such as scheduling and accounting are affected by the barriers imposed by multi-domain deployments, which include technical heterogeneity, decentralized and (domain-wise) self-centric decision making, and a lack of information on the state and availability of remote resources.

Four enabling technologies or approaches are explored and developed within this work: (I) The use of explicit definitions of cloud service structure as inputs for placement and management processes to ensure that the resulting placements respect the internal relationships between different service components and any relevant constraints. (II) Technology for the runtime adaptation of Virtual Machines to enable the automatic adaptation of cloud service contexts in response to changes in their environment caused by, e.g., service migration across domains. (III) Systems for managing meta-data relating to resource usage in multi-domain grid computing and cloud computing infrastructures. (IV) A global fairshare prioritization mechanism that enables computational jobs to be consistently prioritized across a federation of several decentralized grid installations.

Each of these technologies will facilitate the emergence of decentralized computational infrastructures capable of utilizing resources from diverse infrastructure providers in an automatic and seamless manner.