Laying Tiles Ornamentally: An approach to structuring container traversals

University dissertation from Chalmers University of Technology

Abstract: As hardware becomes more capable of parallel execution, more program scheduling decisions must be made to use that hardware efficiently. To this end, compilers implement coarse-grained loop transformations in addition to the traditional fine-grained instruction reordering. Implementors of embedded domain-specific languages (EDSLs) face a difficult choice: either translate operations on collections naively to a low-level language, hoping that its optimizer will do the job, or implement their own optimizer as part of the EDSL.

We turn to the concept of loop tiling from the imperative world and find its equivalent for recursive functions. We show the construction of a tiled functorial map over containers that can be naively translated to a corresponding nested loop.
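For readers unfamiliar with loop tiling, the following is a minimal C sketch of the imperative idea over a flat array. It is not taken from the dissertation: the names map_untiled, map_tiled and square, and the use of int elements, are assumptions made purely for illustration. The tiled version splits the single loop into an outer loop over tiles and an inner loop over the elements of one tile, and both versions compute the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element-wise operation, used only for illustration. */
int square(int x) { return x * x; }

/* Untiled map: a single loop over all n elements. */
void map_untiled(const int *in, int *out, size_t n, int (*f)(int)) {
    for (size_t i = 0; i < n; i++)
        out[i] = f(in[i]);
}

/* Tiled map: the outer loop steps from tile to tile, the inner loop
   visits the elements of the current tile. The last tile may hold
   fewer than `tile` elements when n is not a multiple of `tile`. */
void map_tiled(const int *in, int *out, size_t n, size_t tile, int (*f)(int)) {
    assert(tile > 0); /* a zero tile size would never advance */
    for (size_t start = 0; start < n; start += tile) {
        /* Clamp the end of the tile to the end of the array. */
        size_t end = (n - start < tile) ? n : start + tile;
        for (size_t i = start; i < end; i++)
            out[i] = f(in[i]);
    }
}
```

Calling map_tiled(in, out, 10, 4, square) processes the elements in tiles of sizes 4, 4 and 2 and fills out with the same values as map_untiled(in, out, 10, square). For a one-dimensional map like this, tiling changes only the loop structure; whether it pays off depends on the container and the traversal.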

We illustrate the connection between untiled and tiled functorial maps by means of the type-theoretic notion of an algebraic ornament. This approach produces a family of container traversals indexed by tile sizes and serves as the basis of a proof that untiled and tiled functorial maps have the same semantics.

We evaluate our approach by designing a language of tree traversals as a DSL embedded in Haskell that compiles to C code. We use this language to implement tiled and untiled tree traversals, which we benchmark under varying tile sizes and shapes of input trees. For some tree shapes, we show that a tiled tree traversal can be up to 50% faster than an untiled one, given a good choice of tile size.
