Multi-Threaded Distributed System Simulations Using Bi-Lateral Delay Lines

University dissertation from Linköping : Linköping University Electronic Press

Abstract: As the speed increase of single-core processors keeps declining, it is important to adapt simulation software to take advantage of multi-core technology. There is a great need for simulating large-scale systems with good performance. This makes it possible to investigate how different parts of a system work together, without the need for expensive physical prototypes. For this to be useful, however, the simulations cannot take too long, because this would delay the design process. Some uses of simulation also put very high demands on simulation performance, such as real-time simulations, design optimization or Monte Carlo-based sensitivity analysis. Being able to quickly simulate large-scale models can save much time and money.

The power required to cool a processor is proportional to the processor speed squared. It is therefore no longer profitable to keep increasing the speed. This is commonly referred to as the "power wall". Manufacturers of processors have instead begun to focus on building multi-core processors consisting of several cores working in parallel. Adapting program code to multi-core architectures constitutes a major challenge for software developers.

Traditional simulation software uses centralized equation-system solvers, which by nature are hard to make parallel. By instead using distributed solvers, equations from different parts of the model can be solved simultaneously. For this to be effective, it is important to minimize overhead costs and to make sure that the workload is evenly distributed over the processor cores.

Dividing an equation system into several parts and solving them separately means that time delays will be introduced between the parts. If these occur in the right locations, this can be physically correct, since it also takes some time for information to propagate in physical systems. The transmission line element method (TLM) constitutes an effective method for separating system models by introducing impedances between components, causing physically motivated time delays.

Contributions in this thesis include parts of the development of the new generation of the Hopsan simulation tool, with support for TLM and distributed solvers. An automatic algorithm for partitioning models has been developed. A multi-threaded simulation algorithm using barrier synchronization has also been implemented.

Measurements of simulation time confirm that the simulation time is decreased almost proportionally to the number of processor cores for large models. The decrease, however, is reduced if the cores are divided on different processors. This was expected, due to the communication delay for processors communicating over shared memory. Experiments on real-time systems with four cores show that a four times as large model can be simulated without losing real-time performance.

The division into distributed solvers constitutes a sort of natural cosimulation. A future project could be to use this as a platform for linking different simulation tools together and simulating them with high performance. This would make it possible to model each part of the system in the most suitable tool, and then connect all parts into one large model.
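
The combination of TLM decoupling and barrier synchronization described in the abstract can be illustrated with a minimal sketch. This is not the Hopsan implementation: the component models, the parameter values, and all names (`left_component`, `right_component`, `simulate`, `ZC`) are assumptions made for illustration. Two components sit at the ends of a lossless delay line with characteristic impedance `ZC`; each end obeys p = c + ZC*q, where c is the wave variable that left the opposite end one time step earlier. Because each component only needs the delayed wave variable, the two can be solved by separate threads, with a barrier separating the read phase from the write phase of every step. Python threads are used here only to show the structure; they would not give the speed-up reported in the thesis.

```python
import threading

ZC = 1.0     # characteristic impedance of the delay line (assumed value)
STEPS = 10   # number of time steps; the line delay equals one step


def left_component(c1):
    """Pressure source of 10 behind a resistance of 1, at line end 1."""
    q1 = (10.0 - c1) / (1.0 + ZC)  # flow into the line
    p1 = c1 + ZC * q1              # TLM boundary equation at end 1
    return p1, q1


def right_component(c2):
    """Resistive load of 3 to zero pressure, at line end 2."""
    q2 = -c2 / (3.0 + ZC)          # flow into the line (negative: it leaves)
    p2 = c2 + ZC * q2              # TLM boundary equation at end 2
    return p2, q2


def simulate(components, steps=STEPS):
    """Solve two TLM-decoupled components in parallel, one thread each."""
    waves = [0.0, 0.0]             # waves[i]: wave variable arriving at end i
    history = [[], []]             # history[i]: pressure at end i per step
    barrier = threading.Barrier(len(components))

    def worker(i):
        other = 1 - i
        for _ in range(steps):
            # Read phase: solve the local equations from the delayed wave.
            p, q = components[i](waves[i])
            history[i].append(p)
            barrier.wait()         # all threads have finished reading
            # Write phase: this wave reaches the other end one step later.
            waves[other] = p + ZC * q
            barrier.wait()         # all threads have finished writing

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return history


if __name__ == "__main__":
    p1_hist, p2_hist = simulate([left_component, right_component])
    print(p1_hist[-1], p2_hist[-1])
```

With these assumed values both end pressures settle at the static pressure-divider value 10 * 3 / (1 + 3) after a few steps, which shows that the delay introduced by the line does not change the steady-state solution. The two barrier waits per step guarantee that no thread overwrites a wave variable that another thread is still reading.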
