Virtual infrastructures for computational science: software and architectures for distributed job and resource management

University dissertation from Umeå : Institutionen för datavetenskap, Umeå universitet

Author: Per-Olov Östberg; Umeå universitet; [2011]

Abstract: In computational science, the scale of problems addressed and the resolution of solutions achieved are often limited by the available computational capacity. The current methodology of scaling computational capacity to large scale (i.e., larger than individual resource site capacity) includes aggregation and federation of distributed resource systems. Regardless of how this aggregation manifests, scaling of scientific computational problems typically involves (re)formulation of computational structures and problems to exploit problem and resource parallelism. Efficient parallelization and scaling of scientific computations to large scale is difficult and further complicated by a number of factors introduced by resource aggregation, e.g., resource heterogeneity and coupling of computational methodology. Scaling complexity severely impacts computation enactment and necessitates the use of mechanisms that provide higher abstractions for management of computations in distributed computing environments.

This work addresses design and construction of virtual infrastructures for scientific computation that abstract computation enactment complexity, decouple computation specification from computation enactment, and facilitate large-scale use of computational resource systems. In particular, this thesis discusses job and resource management in distributed virtual scientific infrastructures intended for Grid and Cloud computing environments. The main area studied is Grid computing, which is approached using Service-Oriented Computing and Architecture methodology. Thesis contributions discuss both methodology and mechanisms for construction of virtual infrastructures, and address individual problems such as job management, application integration, scheduling job prioritization, and service-based software development.

In addition to scientific publications, this work also makes contributions in the form of software artifacts that demonstrate the concepts discussed. The Grid Job Management Framework (GJMF) abstracts job enactment complexity and provides a range of middleware-agnostic job submission, control, and monitoring interfaces. The FSGrid framework provides a generic model for specification and delegation of resource allocations in virtual organizations, and enacts allocations based on distributed fairshare job prioritization. Mechanisms such as these decouple job and resource management from computational infrastructure systems and facilitate the construction of scalable virtual infrastructures for computational science.
