Managing applications and data in distributed computing infrastructures

University dissertation from Uppsala universitet

Abstract: Over the last few decades, the needs of computational power and data storage by collaborative, distributed scientific communities have increased very rapidly. Distributed computing infrastructures such as computing and storage grids provide means to connect geographically distributed resources and helps in addressing the needs of these communities. Much progress has been made in developing and operating grids, but several issues still need further attention. This thesis discusses three different aspects of managing large-scale scientific applications in grids:• Using large-scale scientific applications is often in itself a complex task, and to set them up and run experiments in a distributed environment adds another level of complexity. It is important to design general purpose and application specific frameworks that enhance the overall productivity for the scientists. The thesis present further development of a general purpose framework where existing portal technology is combined with tools for robust and middleware independent job management. Also, a pilot implementation of a domain-specific problem solving environment based on a grid-enabled R solution is presented.• Many current and future applications will need large-scale storage systems. Centralized systems are eventually not scalable enough to handle huge data volumes and also have can have additional problems with security and availability. An alternative is a reliable and efficient distributed storage system. In the thesis the architecture of a self-healing, grid-aware distributed storage cloud, Chelonia, is described and performance results for a pilot implementation are presented.• In a distributed computing infrastructure it is very important to manage and utilize the available resources efficiently. The thesis presents a review of different resource brokering techniques and how they are implemented in different production level middlewares. Also, a modified resource allocation model for the Advanced Resource Connector (ARC) middleware is described and performance experiments are presented.