Exhaustion Dominated Performance: Methodology, Tools and Empirical Experiments

Abstract: The sensitivity of applications to insufficient resources in High Performance Computing (HPC) clusters has been a longstanding concern for industry. In this thesis we propose a black-box method for characterizing and analyzing engineering simulation applications with respect to the available hardware resources when those resources are depleted. The underlying hypothesis is that one dominating bottleneck exists at any given moment during execution. Under this hypothesis, the response of an engineering simulation application to exhaustion can be explained as an approximately linear dependency of execution time for the problem at hand. Verifying the hypothesis required accurate, non-intrusive measurement and precise methods and tools for depleting individual hardware resources. The results of this work are methods for depleting the available bandwidth, the available RAM and the processor capacity of compute nodes, together with practical guidelines for recognizing bottlenecks.

The bandwidth depletion method reduced the available bandwidth by generating stable and robust network traffic, with only 0.5% deviation from the desired target and a maximum variance of approximately 1%.

Furthermore, the processor depletion method successfully generated artificial CPU load and occupied as much as 90% of the processor capacity. Verification of this tool confirmed an accuracy of 0.11% median deviation; the error for a given target load ranged from 0.00% at minimum to 1.04% at maximum.

Our experiments with HPL and Fluent, run while exhausting the RAM and the available bandwidth of the computer cluster, confirmed that the proposed method for analyzing and recognizing bottlenecks is fully applicable and a viable option for characterization in this domain.
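The "approximately linear dependency of execution time" stated in the abstract can be illustrated with an ordinary least-squares fit. The sketch below is not the thesis's tooling. The abstract does not say which quantity execution time is plotted against, so the sketch assumes a measured depletion level of a single resource, such as the fraction of bandwidth withheld. The names `fit_line` and `r_squared` and all numbers in the demo are hypothetical and purely synthetic.

```python
def fit_line(levels, times):
    """Least-squares fit of times = slope * levels + intercept.

    levels: depletion levels of one resource (assumed independent variable)
    times:  measured execution times at those levels
    """
    n = len(levels)
    if n < 2 or n != len(times):
        raise ValueError("need at least two paired samples")
    mean_x = sum(levels) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    if sxx == 0:
        raise ValueError("all depletion levels are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, times))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(levels, times, slope, intercept):
    """Coefficient of determination: how well the line explains the data."""
    mean_y = sum(times) / len(times)
    ss_tot = sum((y - mean_y) ** 2 for y in times)
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(levels, times))
    return 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic placeholder data, NOT measurements from the thesis.
    levels = [0.0, 0.25, 0.5, 0.75]
    times = [100.0, 150.0, 200.0, 250.0]
    slope, intercept = fit_line(levels, times)
    print(slope, intercept, r_squared(levels, times, slope, intercept))
```

An R² value close to 1 on real measurements would be consistent with a single dominating bottleneck over the sampled range. A clear drop in R² might indicate that the dominating bottleneck changes somewhere in that range.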
