| Title | Compiler optimizations and autotuning for stencils and geometric multigrid |
| Publication Type | dissertation |
| School or College | College of Engineering |
| Department | Kahlert School of Computing |
| Author | Basu, Protonu |
| Date | 2016 |
| Description | Stencil computations are operations on structured grids. They are frequently found in partial differential equation solvers, making their performance critical to a range of scientific applications. On modern architectures where data movement costs dominate computation, optimizing stencil computations is a challenging task. Typically, domain scientists must reduce and orchestrate data movement to tackle the memory bandwidth and latency bottlenecks. Furthermore, optimized code must map efficiently to ever increasing parallelism on a chip. This dissertation studies several stencils with varying arithmetic intensities, thus requiring contrasting optimization strategies. Stencils traditionally have low arithmetic intensity, making their performance limited by memory bandwidth. Contemporary higher-order stencils are designed to require smaller grids, hence less memory, but are bound by increased floating-point operations. This dissertation develops communication-avoiding optimizations to reduce data movement in memory-bound stencils. For higher-order stencils, a novel transformation, partial sums, is designed to reduce the number of floating-point operations and improve register reuse. These optimizations are implemented in a compiler framework, which is further extended to generate parallel code targeting multicores and graphics processor units (GPUs). The augmented compiler framework is then combined with autotuning to productively address stencil optimization challenges. Autotuning explores a search space of possible implementations of a computation to find the optimal code for an execution context. In this dissertation, autotuning is used to compose sequences of optimizations to drive the augmented compiler framework. This compiler-directed autotuning approach is used to optimize stencils in the context of a linear solver, Geometric Multigrid (GMG). 
GMG uses sequences of stencil computations, and presents greater optimization challenges than isolated stencils, as interactions between stencils must also be considered. The efficacy of our approach is demonstrated by comparing the performance of generated code against manually tuned code, against commercial compiler-generated code, and against analytic performance bounds. Generated code outperforms manually optimized codes on multicores and GPUs. Against Intel's compiler on multicores, generated code achieves up to 4x speedup for stencils, and 3x for the solver. On GPUs, generated code achieves 80% of an analytically computed performance bound. |
| Type | Text |
| Publisher | University of Utah |
| Subject | autotuning; code generation; compilers; geometric multigrid; parallel programming; stencil computations |
| Dissertation Name | Doctor of Philosophy in Computer Science |
| Language | eng |
| Rights Management | © Protonu Basu |
| Format | application/pdf |
| Format Medium | application/pdf |
| Format Extent | 27,183 bytes |
| Identifier | etd3/id/4084 |
| ARK | ark:/87278/s6dj8q0g |
| DOI | https://doi.org/10.26053/0H-175S-S8G0 |
| Setname | ir_etd |
| ID | 197634 |
| OCR Text | COMPILER OPTIMIZATIONS AND AUTOTUNING FOR STENCILS AND GEOMETRIC MULTIGRID by Protonu Basu. A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science. School of Computing, The University of Utah, May 2016. Copyright © Protonu Basu 2016. All Rights Reserved.
The University of Utah Graduate School. STATEMENT OF DISSERTATION APPROVAL. The dissertation of Protonu Basu has been approved by the following supervisory committee members: Mary Hall, Chair, 7-Dec-2015 Date Approved; Samuel Williams, Member, 17-Nov-2015 Date Approved; Rajeev Balasubramonian, Member, 10-Nov-2015 Date Approved; Martin Berzins, Member, 10-Nov-2015 Date Approved; Ganesh Gopalakrishnan, Member, 10-Nov-2015 Date Approved; and by Ross Whitaker, Chair/Dean of the Department/College/School of Computing, and by David B. Kieda, Dean of The Graduate School.
ABSTRACT
Stencil computations are operations on structured grids. They are frequently found in partial differential equation solvers, making their performance critical to a range of scientific applications. On modern architectures where data movement costs dominate computation, optimizing stencil computations is a challenging task. Typically, domain scientists must reduce and orchestrate data movement to tackle the memory bandwidth and latency bottlenecks. Furthermore, optimized code must map efficiently to ever increasing parallelism on a chip.
This dissertation studies several stencils with varying arithmetic intensities, thus requiring contrasting optimization strategies. Stencils traditionally have low arithmetic intensity, making their performance limited by memory bandwidth. Contemporary higher-order stencils are designed to require smaller grids, hence less memory, but are bound by increased floating-point operations. This dissertation develops communication-avoiding optimizations to reduce data movement in memory-bound stencils.
For higher-order stencils, a novel transformation, partial sums, is designed to reduce the number of floating-point operations and improve register reuse. These optimizations are implemented in a compiler framework, which is further extended to generate parallel code targeting multicores and graphics processor units (GPUs). The augmented compiler framework is then combined with autotuning to productively address stencil optimization challenges. Autotuning explores a search space of possible implementations of a computation to find the optimal code for an execution context. In this dissertation, autotuning is used to compose sequences of optimizations to drive the augmented compiler framework. This compiler-directed autotuning approach is used to optimize stencils in the context of a linear solver, Geometric Multigrid (GMG). GMG uses sequences of stencil computations, and presents greater optimization challenges than isolated stencils, as interactions between stencils must also be considered.
The efficacy of our approach is demonstrated by comparing the performance of generated code against manually tuned code, against commercial compiler-generated code, and against analytic performance bounds. Generated code outperforms manually optimized codes on multicores and GPUs. Against Intel's compiler on multicores, generated code achieves up to 4x speedup for stencils, and 3x for the solver. On GPUs, generated code achieves 80% of an analytically computed performance bound.
To my parents and sister
CONTENTS
ABSTRACT ... iii
LIST OF FIGURES ... x
LIST OF TABLES ... xiv
ACKNOWLEDGEMENTS ... xv
CHAPTERS
1. INTRODUCTION ... 1
1.1 Geometric Multigrid and Stencil Computations ... 2
1.2 Classification of Stencil Computations ... 3
1.2.1 Stencil Coefficient Types ... 5
1.2.2 Radius of Stencils ... 7
1.2.3 Common Stencil Iterations ...
7
1.2.3.1 Jacobi ... 7
1.2.3.2 Gauss-Seidel and Gauss-Seidel Red-Black ... 7
1.2.4 Arithmetic Intensity of Stencils ... 8
1.2.5 Summary of Stencils Optimized ... 9
1.2.5.1 Higher-Order Methods ... 9
1.2.5.2 Grid Boundary Conditions ... 10
1.3 Approaches to Reducing Data Movement ... 10
1.4 Optimization Challenges ... 12
1.5 Domain-Specific Optimization Techniques ... 14
1.6 Domain-Specific Extensions and Autotuning ... 14
1.7 Contributions ... 16
1.8 Thesis Structure ... 17
2. CHILL ... 19
2.1 Organization of CHiLL ... 19
2.2 Integer Sets and Relations ... 20
2.3 Iteration Spaces ... 21
2.4 Relations for Loop Transformations ... 22
2.5 Dependence Graph and Legality of Transformations ... 24
2.6 Loop Transformations in CHiLL ... 24
2.6.1 Loop Permutation ... 25
2.6.2 Loop Skewing ... 25
2.6.3 Loop Tiling ... 26
2.6.4 Loop Fusion ... 26
2.7 Computing Footprint of Array References ... 30
2.8 Extending Polyhedral Technology in CHiLL ... 30
2.9 CUDA-CHiLL ... 31
2.9.1 Parallel Decomposition ... 31
2.10 Summary ... 32
3. THE miniGMG BENCHMARK ... 35
3.1 V-cycle ... 35
3.2 Domain-Decomposition and Parallelism ...
36
3.3 V-cycle and Operator Code Skeletons ... 41
3.3.1 Smooth ... 43
3.3.2 Residual ... 46
3.3.3 Restriction and Interpolation ... 46
3.4 Optimization Challenges ... 46
4. COMMUNICATION-AVOIDING OPTIMIZATIONS ... 49
4.1 Types of Communication ... 49
4.2 Communication-Avoiding Optimizations ... 50
4.2.1 Fusing Components of Smooth ... 50
4.2.2 Deep Ghost Zones ... 51
4.2.3 Wavefront Computation ... 53
4.2.4 Parallel Code Generation ... 61
4.2.5 Residual-Restriction Fusion ... 65
4.2.6 Smooth-Residual-Restriction Wavefront ... 66
4.3 Autotuning Opportunities ... 70
4.4 Compiler Implementation ... 71
4.4.1 Fusing Components of Smooth ... 72
4.4.2 Overlapping Ghost Zones ... 73
4.4.3 Stencils as Accumulation ... 74
4.4.4 Rebuilding the Dependence Graph ... 77
4.4.5 Wavefronts ... 78
4.4.6 Smooth Residual Restriction Fusion ... 80
4.4.7 Parallel Code Generation ... 80
4.5 Putting It Together ... 81
4.6 Results ... 82
4.6.1 Problem Specification ... 85
4.6.2 miniGMG Configuration ... 85
4.6.3 Manually Optimized and Baseline miniGMG ... 85
4.6.4 Evaluated Platforms ... 86
4.6.5 Analysis of GSRB Smooth Performance ... 86
4.6.5.1 Computing Roofline Memory Bounds ... 87
4.6.5.2 Optimizations and Smooth Performance ...
90
4.6.6 Smooth, Residual and Restriction Fusion ... 94
4.7 Summary and Conclusion ... 95
5. OPTIMIZATIONS FOR COMPUTE-INTENSIVE STENCILS ... 96
5.1 Stencil Definitions and Accuracy ... 97
5.2 Stencil Reordering: Partial Sums ... 98
5.2.1 Composition of Optimizations ... 105
5.3 Compiler Implementation ... 105
5.3.1 Background Review ... 106
5.3.2 Abstractions for Partial Sums ... 106
5.3.3 Deriving Stencil Points and Bounding Box (BB) ... 107
5.3.4 Deriving Coefficients (Coeff) ... 107
5.3.5 Deriving Partial Sums and Buffers ... 108
5.3.6 Exploiting Reuse in Partial Sums ... 109
5.3.7 Exploiting Symmetry to Reduce Floating-Point Operations ... 109
5.3.8 Code Generation ... 110
5.4 Experimental Results ... 111
5.4.1 Review of Evaluated Platforms ... 112
5.4.2 Problem Solved on miniGMG ... 112
5.4.3 Experimental Methodology ... 113
5.4.4 Computing Roofline Memory Bounds ... 116
5.4.5 Stencil Performance ... 117
5.4.6 Smooth Performance on the Fine Grid ... 119
5.4.7 Smooth Performance Throughout the V-Cycle ... 121
5.4.8 miniGMG Solver Performance and Error ... 122
5.4.9 Distributed Memory Results ... 127
5.5 Conclusions ... 127
6. GEOMETRIC MULTIGRID ON GPUs ... 130
6.1 Gauss-Seidel Red-Black (GSRB) Smooth on GPUs ... 131
6.1.1 Strategy for Parallel Decomposition of Smooth ... 131
6.2 Code Generation with CUDA-CHiLL ... 132
6.2.1 Mapping to Blocks and Threads ...
135
6.2.2 Extensions to Code Generation in CUDA-CHiLL ... 135
6.2.3 Space of Generated GSRB Variants ... 137
6.3 Experimental Results ... 137
6.3.1 miniGMG on GPUs ... 139
6.3.2 Experimental Methodology ... 139
6.3.3 Optimizations in Handtuned Code ... 141
6.3.4 Performance of Smooth ... 141
6.4 Conclusions ... 142
7. RELATED WORK ... 143
7.1 Optimizations to Reduce Data Movement ... 143
7.2 Stencil Reordering Optimizations ... 144
7.2.1 Comparison with Array Common Subexpression Elimination ... 144
7.2.2 Other Stencil Reordering Techniques ... 145
7.3 Optimizations for Solvers ... 146
7.4 Stencil Computations on GPUs ... 147
7.4.1 Manual Optimization Efforts ... 147
7.4.2 Domain-Specific Automated Optimization Efforts ... 148
7.5 Summary and Conclusions ... 149
8. FUTURE RESEARCH AND CONCLUSIONS ... 150
8.1 Contributions ... 150
8.1.1 Optimizing Memory Bandwidth Limited Stencils ... 151
8.1.2 Optimizing Compute-Intensive Higher-Order Stencils ... 151
8.1.3 Parallel Code Generation ... 152
8.2 Future Work ... 153
8.2.1 Nonperiodic Boundary Conditions ... 153
8.2.2 Target Emerging Many-Core Architectures ... 154
8.2.3 Code Generation for Emerging Runtimes ... 154
8.3 Conclusion ...
155
APPENDIX: STENCILS ... 156
REFERENCES ... 159
LIST OF FIGURES
1.1 The Geometric Multigrid V-cycle. The hierarchical linear solver starts with large, fine-resolution grids and comes down the V-cycle by successive application of smooth, residual, and restriction. The GMG goes back up the V-cycle by applications of smooths and interpolation. ... 3
1.2 Visualization of stencils optimized: Figures (a)(b)(c)(d) are constant-coefficient 3D 7-, 13-, 27- and 125-point stencils, respectively. Figures (e) and (f) are 2D cross-sections of 3D stencils, (e) is a 7-point variable coefficient stencil, and (f) illustrates restriction and interpolation stencils. ... 4
1.3 Illustration of (a) Jacobi iterations and (b) Gauss-Seidel Red-Black (GSRB) iterations with a 2D 5-point stencil on a grid. Jacobi is an out-of-place sweep. To compute a stencil it reads in 5 points from an input grid and writes out a computed value to an output grid. GSRB partitions the grid points into either red or black points such that each red point is surrounded by black points and vice versa. Each GSRB sweep updates either the red or the black points. It is an in-place operation, as the grid being updated is also read. The figure illustrates the update of a red point, which requires its own value and values from the black neighbors. ... 8
1.4 Visualization of periodic boundary conditions for structured grids. Figure (a) shows a 2D 5-point stencil and (b) illustrates application of the stencil on the upper-left corner of a 2D grid. Periodic boundary conditions mean points outside the boundary wrap around the grid. For the 2D 5-point stencil which reads points 0-4, the points 3 and 4 wrap around the grid as illustrated by the dashed lines. ... 11
1.5 Autotuning based on the CHiLL compiler framework. CHiLL has a scripting language interface. Novel transformations were built into CHiLL. The autotuners implemented in the research presented here generated CHiLL scripts to direct optimization. CHiLL ran the scripts to generate code variants. ... 15
3.1 The multigrid V-cycle for solving Lu^h = f^h.
Note, superscripts denote grid spacing. ... 37
3.2 Visualization of the domain/process/subdomain hierarchy in miniGMG. ... 38
3.3 Execution of the miniGMG V-cycle. A node is assigned a 256^3 domain which is decomposed into a list of 64^3 subdomains (boxes). The list of subdomains is partitioned into MPI processes. The subdomains owned by each MPI process are then computed in parallel by a single OpenMP thread. ... 39
3.4 Application of a 2D 5-point stencil on a 4x4 2D grid. (a) Shape of the stencil. (b) Application of this stencil on an interior and a boundary point of the grid. Ghost zone is shaded gray. ... 40
3.5 Visualization of ghost zones and data exchange between subdomains. A 12x12 domain (grid) is geometrically decomposed into 9 4x4 subdomains (tiles). Each subdomain needs a ghost zone which must be exchanged between neighboring tiles after a stencil computation is applied to the entire domain. ... 40
4.1 Two applications of a 2D 5-point stencil on a 4x4 grid with a two-deep ghost zone. (a) The stencil is first applied on an 8x8 grid to compute the values for the 6x6 grid (blue points). The outer layer of grid points shaded grey are read and used as the ghost zone. (b) The second stencil sweep uses the 6x6 grid to compute the output of the 4x4 grid (red points). For the second stencil sweep, the blue points are used as the ghost zone. ... 54
4.2 A deeper ghost zone changes the communication pattern and volume between a subdomain and its neighbors. A one-deep ghost zone requires exchange with left, right, top, and bottom neighbors. An additional layer of ghost zone now requires exchange with neighbors on the corners as well. In addition, a larger volume of data must be exchanged. ... 54
4.3 Progress of the GSRB wavefront. ... 57
4.4 Jacobi wavefront. ... 58
4.5 Multiple threads working collaboratively to process a subdomain/box. ... 62
4.6 Example parallel decompositions on Hopper, which has 6 cores per socket.
All the boxes in a subdomain may work in parallel, or all the threads may work on one box collaboratively, or nested parallelism may be used. ... 65
4.7 Wavefront applying smooths, residual and restriction. ... 68
4.8 Code generation steps for accumulation transformation. ... 76
4.9 Loop skewing and loop permutation to create a wavefront. ... 79
4.10 Speedups of CHiLL-generated and manually tuned GSRB smooth relative to the baseline code on Hopper. The speedups are shown for all levels of the V-cycle. Generated code outperforms expert-written code on all sizes except the 32^3 box, and always outperforms baseline code. ... 88
4.11 Speedups of CHiLL-generated and manually tuned GSRB smooth relative to the baseline code on Edison. The speedups are shown for all levels of the V-cycle. Generated code outperforms expert-written code on all sizes except the 32^3 box, and always outperforms baseline code. ... 89
4.12 Speedup for the MG solver attained from incrementally enabled optimizations in the CHiLL compiler. The performance results are categorized by type of smooth and architecture. "VC" is variable-coefficient. ... 93
5.1 (top) Visualizations of the discretized 3D Laplacian operators (stencils) used in this chapter. (bottom) 2D cross sections through the centers of the 3D stencils. Color is used to denote the coefficient associated with that point. The 27- and 125-point stencils have complex symmetries that we exploit. ... 98
5.2 Visualization of 2D 9-point stencil application on a 2D grid. The figure shows the stencil operator being applied on three consecutive iterations of the inner loop (j,i), (j,i+1) and (j,i+2). The edge of points {(j+1,i+1), (j,i+1), (j-1,i+1)}, bound by the bold rectangle, gets reused by the three iterations. The coefficients of the stencil are color coded, blue = w1, green = w2, and red = w3. ... 101
5.3 Illustration of deriving partial sums. The right edge from the input array is loaded and multiplied by weights stored in the array of coefficients. The sums of products of the loaded points and the right, center, and left edges of the array of coefficients are B0, B1, and B2, respectively.
... 102
5.4 The reuse of the leading edge of points loaded at iteration <j,i> gets captured in three buffers R, C, and L (top). Buffer entries R[i] (1), C[i+1] (2), L[i+2] (3) correspond to B0, B1, and B2 from Figure 5.3. The loaded edge is factored into r1 and r2 based on symmetry of the color-coded coefficients. The factors are used to compute B0, B1, and B2. The final output out[j][i] is the sum of buffer entries. ... 102
5.5 Increasing symmetries in coefficients allow us to increasingly reduce floating-point computation. Symmetry about the j axis (in b) permits discarding half the coefficients, and symmetry about the i axis (c) and the diagonal (d) lets the compiler consider even fewer coefficients. ... 103
5.6 Visualization of a 2D plane from the 3D array of coefficients for the 125-point stencils (left). As shown in Figure 5.5(d), there are 6 unique coefficients 0-5. When applying the 125-point stencil at iteration <j,i> using partial sums, the leading 2D plane of 25 points is loaded and factored according to the unique coefficients (right). The six factors r0-r5 are multiplied by appropriate coefficients to compute partial sums which are then buffered. The factors are created by summing loaded points which are multiplied by the same coefficient, and coefficient 0 corresponds to the loaded point in[k][j][i+2]. ... 103
5.7 Code generation steps for partial sum transformation. ... 111
5.8 ApplyOp (y=Ax) stencil performance attained with the CHiLL compiler by optimization, operator, and platform. The Roofline memory bound is for the noncommunication-avoiding implementation. Partial sums can move compute-limited operations towards a memory-limited state. ... 118
5.9 Jacobi smooth performance attained with the CHiLL compiler by optimization, operator, and platform. The Roofline memory bound is for the noncommunication-avoiding (no wavefront) implementation and is lower than ApplyOp due to additional data movement like the RHS. The wavefront transformation allows CHiLL to exceed this limit.
... 120
5.10 Jacobi smooth performance on Edison attained with the CHiLL compiler as a function of level in the V-cycle (256^3 fine grids down to 16^3 coarse grids) for the 27- and 125-point operators. Observe that the reference implementation of the memory-limited 27-point operator receives a cache boost on the coarser levels, while the compute-limited 125-point does not. ... 123
5.11 miniGMG performance (millions of degrees of freedom solved per second) using either the Intel compiler (baseline) or the CHiLL compiler. The labels indicate the overall solver speedup attained via CHiLL. The performance of the tenth-order 125-point solver is within a factor of 2 of the second-order 7-point solver, but provides nearly a million times lower error. ... 125
5.12 Error attained in miniGMG as a function of operator and grid size. A 1GB vector represents a single 512^3 grid. Multigrid will require several of these grids. Observe that the tenth-order method delivers a three-digit increase in accuracy for every 8x increase in memory. ... 126
5.13 The performance benefit of CHiLL is not lost as one weak scales miniGMG with a 27-point stencil to 31,944 cores (1331 nodes). The above figure illustrates weak-scaled miniGMG time to solution using either the baseline code or the CHiLL compiler. Four weighted Jacobi relaxations are used at each level, with BiCGStab as the bottom solver, and a 256^3 domain per node. ... 128
6.1 Organization of CUDA threads into a 3D (2,4,64) grid with 2D (32,16) thread blocks. Each 2D (X,Y) plane in the 3D grid has 8 2D thread blocks of dimension (32,16). Each 2D plane in the 3D grid works on a single subdomain (box) in miniGMG, and there are 64 such planes to process 64 subdomains. ... 133
6.2 Illustration of work done by each thread block. Figure (a) shows 8 (32,16) thread blocks working on a 64^3 subdomain (box), with each thread computing a column of 64 output points. The blue column in figure (b) represents the 32*16*64 = 32,768 output grid points computed by each thread block.
... 133
6.3 Execution times for 480 iterations of GSRB (240 Red, 240 Black) smooth on 64^3 boxes. A lower bar means better performance. The horizontal lines correspond to the performance of manually tuned codes. The x-axis shows the 2D thread block sizes (TX,TY). The grid corresponding to each thread block is (64/TX, 64/TY, 64). Best performance was achieved with thread block (64,16) and grid (1,4,64). ... 140
6.4 The fraction of the effective DRAM bandwidth achieved by GSRB smooth. Higher values correspond to better performance. ... 140
LIST OF TABLES
1.1 Description of stencils optimized. VC and CC stand for variable and constant coefficient, respectively. #Flops and #Bytes are per update of a grid point, and we only account for compulsory cache misses. We also take into account write allocation, that is, we do not consider cache bypass. ... 5
2.1 A subset of transformations available in CHiLL. ... 24
4.1 The table illustrates the classification of optimizations described in this chapter as reducing vertical communication, horizontal communication, or both. Parallel code generation has been left out as it is not a communication-avoiding optimization but helps improve wavefront, which reduces vertical communication. ... 51
4.2 Overview of evaluated platforms. ... 87
4.3 Configurations of best-performing code variants for GSRB smooth on Hopper. ... 88
4.4 Configurations of best-performing code variants for GSRB smooth on Edison. ... 89
5.1 The table shows the Roofline memory (DRAM) bounds for the ApplyOp computation with four different stencils. ApplyOp reads in a grid, applies the stencil operator to it and writes out the computed values to a different output grid. The bounds are expressed in terms of Million Stencils per Second that can be computed when DRAM bandwidth is the bottleneck. The stencil was applied to a 256^3 domain which was split into a list of 64^3 subdomains.
To account for ghost zones, a 64^3 subdomain translates to 66^3 and 68^3 grids for stencils with radius 1 and 2, respectively. Stencils with larger radius require larger data volumes and have lower Roofline memory bounds. ... 117
5.2 The table shows the Roofline memory (DRAM) bounds for the Jacobi smooths using four different stencils. The bounds are expressed in terms of Million Stencils per Second that can be computed when DRAM bandwidth is the bottleneck. In addition to input and output grids, smooths require reading in two more grids (arrays), rhs and lambda. This extra data movement means smooths have lower Roofline bounds than ApplyOp. ... 117
5.3 CHiLL was able to select optimizations uniquely for each multigrid level and platform. <#,#> denotes the number of inter- and intra-box threads with nested OpenMP. ... 124
6.1 Overview of evaluated NVIDIA GPU. ... 139
ACKNOWLEDGEMENTS
First, I'd like to thank my dissertation advisor, Mary Hall, for her encouragement and guidance. Her dedication to research and emphasis on tackling important, challenging, and real problems has been inspiring. Over the years, she has provided an incredible amount of help in writing and presenting my research. This dissertation has greatly benefited from her advice and editing.
Most of the research in this dissertation has been done in collaboration with researchers from Berkeley Lab as part of the x-tune project. I would like to thank Samuel Williams, Leonid Oliker, Brian Van Straalen, and Phillip Colella. They have helped give my research a new direction.
Samuel Williams, who is also on my committee, has been pivotal to my PhD. Sam's technical knowledge, attention to detail, and most importantly, enthusiasm, have benefited me immensely. I am thankful to Sam for all the discussions in person and over email, and for being on my PhD committee.
I am thankful to my dissertation committee members Martin Berzins, Ganesh Gopalakrishnan, and Rajeev Balasubramonian for their feedback and encouragement. I hope some of our discussions can lead to future collaborations.
My labmates in the CTOP group have been good friends, collaborators, and proofreaders. I would like to thank Anand Venkat, who has helped me debug code at all hours of the day. I also want to thank Manu Shantharam, Saurav Muralidharan, and Amit Roy for giving me feedback on drafts of my papers and dissertation. I am thankful to Chun Chen, who had implemented CHiLL. I wish he was around longer.
My friends in Salt Lake City and elsewhere have been crucial to my life as a graduate student. I'd like to especially thank Dan, Keita, Mike, Didem, Nil, Anusua, Anand, Shusen, Megan, Grant, Kristina, Grace, Nic, and Avishek for making the last six years enjoyable, and helping me handle the ups and downs of the PhD process. I would like to thank my parents, Partha Sarathi and Subhra Basu, and my sister Rajashree Basu Kundu for their love, support and patience. I am forever grateful to them for the values they have instilled in me, and for believing in me even when most others didn't. I am humbled by their sacrifices that made it possible for me to travel to the other side of the world and pursue my graduate studies.
I would like to thank the National Energy Research Scientific Computing Center (NERSC) for access to their machines Hopper and Edison. NERSC is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. I would also like to express my gratitude to Berkeley Lab for hosting me for the 2013 and 2014 summers. This research was funded by the Department of Energy Office of Science award #DE-SC0008682, the National Science Foundation award #CCF-1018881, and the Department of Defense, through a contract to the University of Maryland.
CHAPTER 1
INTRODUCTION
Stencil computations are operations on structured grids. A structured grid [1] represents a uniformly discretized continuous domain, and a stencil application on the grid is simply an application of a differential operator. Stencil computations are a ubiquitous pattern in parallel computing, and they are frequently found at the heart of partial differential equation (PDE) solvers. As PDE solvers are used in a large fraction of scientific applications, ranging from fluid dynamics to electromagnetics, the performance of stencil computations is critical to scientific computing.
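As a concrete illustration of such a stencil (a minimal sketch, not code from the dissertation), one sweep of a constant-coefficient 3D 7-point stencil updates each interior point from its own value and its six face neighbors; the grid name, coefficients c0 and c1, and sizes below are illustrative choices:

```python
import numpy as np

def apply_7pt(in_grid, out_grid, c0, c1):
    """One sweep of a constant-coefficient 3D 7-point stencil.

    Each interior output point is a weighted sum of the grid point and its
    six face neighbors; the one-deep boundary acts as a ghost zone and is
    left untouched.
    """
    nz, ny, nx = in_grid.shape
    for k in range(1, nz - 1):
        for j in range(1, ny - 1):
            for i in range(1, nx - 1):
                out_grid[k, j, i] = (
                    c0 * in_grid[k, j, i]
                    + c1 * (in_grid[k - 1, j, i] + in_grid[k + 1, j, i]
                            + in_grid[k, j - 1, i] + in_grid[k, j + 1, i]
                            + in_grid[k, j, i - 1] + in_grid[k, j, i + 1])
                )

# Apply the stencil to a point source: a discrete Laplacian-style operator.
grid = np.zeros((6, 6, 6))
grid[3, 3, 3] = 1.0
out = np.zeros_like(grid)
apply_7pt(grid, out, c0=-6.0, c1=1.0)
```

Every output point touches seven inputs but performs only a handful of flops, which is why (as the next paragraph notes) these kernels are memory bandwidth-bound.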
Stencil computations traditionally have very low arithmetic intensities, where arithmetic intensity is the number of floating-point operations performed per byte read from memory. Stencil computations execute as low as 0.2 floating-point operations (flops) per byte. This low arithmetic intensity, combined with the fact that memory bandwidth of modern machines lags far behind floating-point power, makes stencil computations notoriously memory bandwidth-limited. Thus, improving performance of stencil computations requires reducing and managing data movement.
The importance of stencil computations in scientific computing, and the widening performance gap between computation and data movement, has motivated efforts from diverse research communities to develop optimizations for stencils. Code optimization experts have designed manual and automatic optimizations to improve memory bandwidth use, and applied mathematicians have developed compute-intensive higher-order stencils that need far less memory.
In this dissertation we optimize both traditional memory bandwidth limited stencils and the more compute-intensive higher-order stencils using a compiler-based approach. Novel optimizations for memory- and compute-bound stencils are implemented in a compiler framework. A compiler-based approach not only helps improve application programmer productivity, but also allows the code optimization expert to leverage decades of compiler research. This compiler-based approach is used to optimize stencil computations in isolation, and in the context of a linear solver, Geometric Multigrid.
1.1 Geometric Multigrid and Stencil Computations
Multigrid (MG) [2] methods are extensively used in a variety of numerical simulations to solve linear systems (Ax = b). Multigrid methods use a hierarchy of grids with different resolutions to accelerate the convergence of iterative linear solvers.
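To make the arithmetic-intensity figure concrete, here is a back-of-the-envelope calculation (an illustrative sketch, not taken from the dissertation; the flop and byte counts below are one common accounting, assuming double precision, perfect cache reuse of neighboring points, and write-allocate traffic):

```python
def arithmetic_intensity(flops_per_point, compulsory_bytes_per_point):
    """Flops per byte of compulsory DRAM traffic for one grid-point update."""
    return flops_per_point / compulsory_bytes_per_point

# Constant-coefficient 3D 7-point stencil, double precision:
#   8 flops  (6 adds + 2 multiplies for out = c0*x + c1*sum_of_neighbors)
#   24 bytes (8 to read one new input point, assuming the other six are
#             reused from cache, + 8 to write the output + 8 for the
#             write-allocate fill of the output cache line)
ai = arithmetic_intensity(8, 24)  # about 0.33 flops/byte
```

Variable-coefficient variants move additional coefficient arrays and land even lower, consistent with the roughly 0.2 flops/byte cited above; either way the figure sits far below the machine balance of modern processors.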
Traditional iterative solvers operate on grids at a single resolution and require a higher number of iterations. Multigrid uses corrections of the solution from iterations on the coarser levels to improve the convergence rate of the solution at the finest level. Geometric Multigrid (GMG) is a special case of MG where the linear operator A is simply a stencil computation applied to the grid x.

Geometric Multigrid (GMG) is a hierarchical approach to solving a linear system. GMG has four key operations: smooth, residual, restriction, and interpolation. These operations all involve computation of stencils and are applied in a sequence known as the V-cycle, illustrated in Figure 1.1. GMG starts with a large, fine-resolution grid. It applies several iterations of smooth on it, calculates the residual and then restricts into a smaller, coarser grid. This sequence is applied multiple times until a bottom grid size is reached. Further smooths are applied to the coarsest grid at the bottom level. Once the coarsest grid is solved, the algorithm applies the solution to the finer grids. This is done by consecutive applications of smooths followed by interpolation. Interpolation is the inverse of the restriction operation and interpolates values from a small coarse grid to a larger, finer one.

Smooth involves a stencil computation followed by pointwise grid updates. Multiple iterations of smooth are applied at each grid level in GMG; the time spent in smooth dominates runtime. Residual is also a stencil computation, but unlike smooth, only a single iteration is required at each level. Restriction and interpolation are stencil computations which are inverses of each other. Restriction is a reduction operation which averages multiple points on a fine grid and writes out the value to a smaller, coarser grid. Interpolation is a scatter operation which maps a single point in a coarse grid to multiple points in a finer grid.

Figure 1.1: The Geometric Multigrid V-cycle. The hierarchical linear solver starts with large, fine-resolution grids and comes down the V-cycle by successive application of smooth, residual, and restriction.
The GMG goes back up the V-cycle by applications of smooths and interpolation.

1.2 Classification of Stencil Computations

Figure 1.2 illustrates the various stencils optimized in this dissertation. The following four features of stencil computations have been used to characterize the various stencils. These stencil features help understand performance characteristics and identify optimization challenges and opportunities.

1. Stencil coefficient types
2. Radius of stencils
3. Stencil iteration types
4. Arithmetic intensity of stencils

Table 1.1 lists the stencils from Figure 1.2 with the corresponding characteristics mentioned above. The first five stencils in Figure 1.2 ((a)-(e)) are used in smooth and residual. In GMG, smooth and residual operations use the same stencil. For example, a GMG where smooth uses a 7-point stencil will also have a residual operation using the same 7-point stencil. The last stencil, Figure 1.2(f), represents both the restriction and interpolation operations. Restriction reads 8 points, averages their values, and writes the result to a coarser grid.

Figure 1.2: Visualization of stencils optimized: Figures (a)(b)(c)(d) are constant-coefficient 3D 7-, 13-, 27- and 125-point stencils, respectively. Figures (e) and (f) are 2D cross-sections of 3D stencils, (e) is a 7-point variable coefficient stencil, and (f) illustrates restriction and interpolation stencils.

Table 1.1: Description of stencils optimized. VC and CC stand for variable and constant coefficient, respectively. #Flops and #Bytes are per update of a grid point, and we only account for compulsory cache misses. We also take into account write allocation, that is, we do not consider cache bypass.
Stencil            Radius   Iterations     #Flops   #Bytes   Arithmetic Intensity
7-point VC         1        GSRB           17       80       0.21
7-point VC         1        Jacobi         17       48       0.35
7-point CC         1        Jacobi         8        24       0.33
13-point CC        2        Jacobi         15       24       0.63
27-point CC        1        Jacobi         32       24       1.33
125-point CC       2        Jacobi         134      24       5.58
8-pt Restriction   1        out-of-place   1        10       0.10

The coarse grid written by restriction is thus half the size of the fine grid in each dimension. Interpolation is the inverse operation, where a single point from the coarse grid is scattered to eight points in the fine grid.

1.2.1 Stencil Coefficient Types

Stencil computations sweep through grids, performing a weighted sum of points read from an input grid (array). If the weights used are constant scalars, it is termed a constant coefficient stencil. However, if the coefficients or weights change from one grid point to another, they are not constants and have to be stored in separate grids. These types of stencil computations are classified as variable coefficient stencils.

Stencils in Figures 1.2(a)(b)(c)(d) are constant coefficient stencils; the color at each stencil point in the figures represents the weight or coefficient. Figure 1.2(e) represents a 3D variable coefficient 7-point stencil. Like the stencil in Figure 1.2(a), this stencil also reads in 7 points, but does not use scalar constants for weights; instead it reads the weights from arrays beta_i, beta_j and beta_k. Code Listings 1.1 and 1.2 show simplified code for the 7-point constant coefficient and variable coefficient stencil operators, respectively. In variable coefficient stencils, in addition to the input and output grids, the grids corresponding to the coefficients are also accessed, creating higher traffic throughout the memory hierarchy.

double w1, w2;

// (ghost-zone/boundary handling omitted for clarity)
for (k = 0; k < N; k++)
  for (j = 0; j < N; j++)
    for (i = 0; i < N; i++) {
      phi_out[k][j][i] = w1 * phi_in[k][j][i]
        + w2 * ( phi_in[k][j][i+1] + phi_in[k][j][i-1]
               + phi_in[k][j+1][i] + phi_in[k][j-1][i]
               + phi_in[k+1][j][i] + phi_in[k-1][j][i] );
    }

Listing 1.1: Out-of-place grid sweeps for Jacobi iterations with a 3D 7-point constant coefficient stencil operator. The code snippet corresponds to the stencil in Figure 1.2(a). The constant coefficients are w1 and w2.
int sweep_color;

// (ghost-zone/boundary handling omitted for clarity)
for (k = 0; k < N; k++)
  for (j = 0; j < N; j++)
    for (i = 0; i < N; i++) {
      if ((i + j + k + sweep_color) % 2 == 0) {
        phi[k][j][i] =
            beta_i[k][j][i+1] * ( phi[k][j][i+1] - phi[k][j][i]   )
          - beta_i[k][j][i]   * ( phi[k][j][i]   - phi[k][j][i-1] )
          + beta_j[k][j+1][i] * ( phi[k][j+1][i] - phi[k][j][i]   )
          - beta_j[k][j][i]   * ( phi[k][j][i]   - phi[k][j-1][i] )
          + beta_k[k+1][j][i] * ( phi[k+1][j][i] - phi[k][j][i]   )
          - beta_k[k][j][i]   * ( phi[k][j][i]   - phi[k-1][j][i] );
      }
    }

Listing 1.2: In-place grid sweeps for Gauss-Seidel Red-Black iterations with a 3D 7-point variable coefficient stencil operator. The code snippet corresponds to the stencil in Figure 1.2(e). The variable coefficients are beta_i, beta_j and beta_k.

1.2.2 Radius of Stencils

In stencil computations, at each grid point, a set of neighboring points is read. The radius of the stencil is the offset of the farthest point read in each dimension. The 7- and 27-point stencils have a radius of 1, as they read ±1 points in each of the i, j, k dimensions. Similarly, the 13- and 125-point stencils read ±2 points in each of the i, j, k dimensions and have a radius of 2. Stencils with a larger radius need to read in more grid points, have a larger working set, and also perform an increased number of floating-point operations.

1.2.3 Common Stencil Iterations

Stencil iterations can be either out-of-place grid sweeps or in-place grid sweeps. In an out-of-place sweep, the grid being updated with computed values is different from the grids being read. In an in-place sweep, the grid being read and written is the same. The three common types of stencil iterations are Jacobi, Gauss-Seidel, and Gauss-Seidel Red-Black (GSRB).

1.2.3.1 Jacobi

Figure 1.3(a) illustrates Jacobi iterations with a 2D 5-point stencil, and Listing 1.1 shows code for Jacobi iterations with a 3D 7-point constant coefficient stencil. The grid being updated is different from the grid being read. All output grid points can be updated independently, making Jacobi iterations an embarrassingly parallel computation.
1.2.3.2 Gauss-Seidel and Gauss-Seidel Red-Black

Gauss-Seidel iterations are in-place grid sweeps, where the grid being updated is also being read. This means that at a given grid sweep, some points would have been updated and others would not. There is thus a dependence on the grid points being updated, which limits parallelism.

Gauss-Seidel Red-Black iterations aim to increase the parallelism in Gauss-Seidel iterations by partitioning the read/write grid into red and black points. The red points are surrounded by black points and vice-versa. A single Gauss-Seidel iteration then gets divided into two iterations: a red iteration followed by a black one. The red iteration updates the red grid points by reading a red point and its neighboring black points, and the black iteration updates the black points similarly. The red and black iterations are embarrassingly parallel, as all writes are independent.

Figure 1.3(b) illustrates the use of a 2D 5-point stencil in GSRB iterations, where a red point is being updated with values from itself and the neighboring black points. Listing 1.2 shows GSRB iterations for a 3D, variable-coefficient 7-point stencil, where phi is being read and updated. The if-condition which guards the statement ensures that only red or black points get updated based on the value of sweep_color.

Figure 1.3: Illustration of (a) Jacobi iterations and (b) Gauss-Seidel Red-Black (GSRB) iterations with a 2D 5-point stencil on a grid. Jacobi is an out-of-place sweep. To compute a stencil it reads in 5 points from an input grid and writes out a computed value to an output grid. GSRB partitions the grid points into either red or black points such that each red point is surrounded by black points and vice versa. Each GSRB sweep updates either the red or black points. It is an in-place operation, as the grid being updated is also read. The figure illustrates the update of a red point which requires its own value and values from the black neighbors.
1.2.4 Arithmetic Intensity of Stencils

Arithmetic intensity of a stencil computation is the ratio of floating-point operations (flops) performed to the memory (in bytes) that needs to be read in from the DRAM. By quantifying the balance between computation and communication, arithmetic intensity identifies a stencil as either being limited by memory bandwidth or floating-point intensity, and thus guides the selection of required optimizations. The arithmetic intensity for the stencils discussed in this dissertation is shown in Table 1.1.

To compute a stencil for Jacobi iterations of the 7-, 13-, 27- and 125-point constant coefficient stencils (rows 3-6), we need to read 8 bytes, write-allocate 8 bytes and write 8 bytes, a total of 24 bytes per stencil. The 7-, 13-, 27- and 125-point stencils need 8, 15, 32 and 134 flops, and thus they have approximate arithmetic intensities of 0.33, 0.63, 1.33 and 5.58, respectively.

Variable coefficient (VC) stencils read in more arrays (grids) and thus have much higher requirements, as can be seen from rows 1 and 2 in Table 1.1. The Jacobi iterations for the 7-point VC stencil read (phi, beta_i, beta_j, beta_k) in 32 bytes, write-allocate 8 bytes and write back 8 bytes for a total of 48 bytes per stencil. The computation executes 17 flops to compute each stencil output. The GSRB iterations only compute half the grid points in each grid sweep, thus need two sweeps and double the data movement to update all grid points. Computing each stencil in GSRB iterations needs 80 bytes per stencil (not 48 x 2 = 96: as the same grid is read and updated, write allocation is excluded) and executes 17 flops per stencil.

1.2.5 Summary of Stencils Optimized

In this dissertation the stencils optimized have been illustrated in Figure 1.2, and their features listed in Table 1.1. Table 1.1 classifies the stencils based on their shape (radius and #points), size (#points), types of coefficients, iterations, and finally, the demands they place on the memory bandwidth and floating-point units of the processor.
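The byte-and-flop accounting behind the Jacobi rows of Table 1.1 can be written down directly. The helper below is a back-of-the-envelope sketch of that model (our own illustration, not code from the dissertation): one 8-byte read, one 8-byte write allocation, and one 8-byte write per point, plus 8 bytes for every variable-coefficient grid streamed in.

```c
/* Per-grid-point DRAM traffic model behind Table 1.1 (doubles = 8 bytes):
 * read the input grid, write-allocate the output, write the output,
 * and stream in any variable-coefficient grids. */
static double arithmetic_intensity(int flops, int coeff_grids_read) {
    int bytes = 8               /* read phi_in            */
              + 8               /* write-allocate phi_out */
              + 8               /* write phi_out          */
              + 8 * coeff_grids_read;
    return (double)flops / bytes;
}
```

With `coeff_grids_read = 0`, the constant-coefficient Jacobi entries fall out (8/24, 15/24, 32/24, 134/24); with 3 coefficient grids and 17 flops, the 7-point VC Jacobi entry is 17/48.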
In addition to these stencils, the 8-point restriction operation used in Geometric Multigrid is also a target. This stencil is a reduction operation which takes as input a larger N^3 grid and outputs a smaller (N/2)^3 grid; it averages values of 8 points from the larger grid and writes it to a single point in the smaller one.

1.2.5.1 Higher-Order Methods

In addition to the features listed in Table 1.1, the 7-, 13-, 27- and 125-point stencils can be classified by the order of the PDE solver they are used in. The order of a PDE solver denotes how fast the computed error decreases as the grid spacing decreases (or grid size increases). If a solver computes an error e using a grid of size 64^3, as we decrease grid spacing by half, the grid size increases to 128^3 and the error drops to e/2^p, where p is the order of the method. Thus error drops exponentially as the order of a method increases. In this dissertation, the 7-point stencils are second-order stencils used in second-order solvers. The 13-, 27- and 125-point stencils are used in higher-order PDE solvers, and they are fourth-, sixth-, and tenth-order, respectively.

1.2.5.2 Grid Boundary Conditions

Stencil computations are applied on structured grids with boundaries which can be commonly classified into periodic or nonperiodic. Application of a 2D 5-point stencil with a periodic boundary condition is illustrated in Figure 1.4. Points on the boundary of the grid have neighbors which wrap around the grid. As illustrated in the figure, the top-left and top neighbors of the upper-left grid point are the bottom-left and top-right points, respectively. In this dissertation only periodic boundary conditions have been considered.
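Before moving on, the 8-point restriction described in Section 1.2.5 can be sketched in a few lines. The code below is a minimal illustration written for this discussion (not miniGMG's implementation): each coarse point averages the 2x2x2 fine-grid cell it covers, using 7 adds and 1 multiply.

```c
#define N 8   /* fine grid dimension; assumed even */

/* Average each 2x2x2 block of the fine N^3 grid into one point of
 * the coarse (N/2)^3 grid (weight 1/8 per fine point). */
void restrict_8pt(double coarse[N/2][N/2][N/2], double fine[N][N][N]) {
    for (int k = 0; k < N/2; k++)
      for (int j = 0; j < N/2; j++)
        for (int i = 0; i < N/2; i++)
          coarse[k][j][i] = 0.125 *
            ( fine[2*k][2*j][2*i]     + fine[2*k][2*j][2*i+1]
            + fine[2*k][2*j+1][2*i]   + fine[2*k][2*j+1][2*i+1]
            + fine[2*k+1][2*j][2*i]   + fine[2*k+1][2*j][2*i+1]
            + fine[2*k+1][2*j+1][2*i] + fine[2*k+1][2*j+1][2*i+1] );
}
```

Note the streaming pattern that produces the 0.10 intensity in Table 1.1: every fine point is read exactly once, while only one coarse point is written per eight fine points read.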
1.3 Approaches to Reducing Data Movement

Traditional stencil computations execute far less than a flop for every byte of data they need, thus their performance is limited by their heavy memory demands. In the past, on machines with smaller caches, operations on large structured grids could easily be bound by capacity misses in cache, leading to a variety of studies on blocking and tiling optimizations [3, 4, 5, 6, 7, 8, 9]. In recent years, numerous efforts have focused on improving memory bandwidth use by increasing temporal locality. Automated and manual optimizations have attempted to increase locality by fusing multiple stencil sweeps through techniques like cache-oblivious, time-skewing, wavefront, or overlapped tiling [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]. Almost all of these have concentrated their efforts on isolated stencil computations and have not considered an entire solver such as GMG. Solvers use a number of stencil computations and require several optimizations to be applied in sequence. This presents the twin challenges of developing optimizations that can be composed, and coming up with the best sequence of optimizations.

To reduce the heavy memory demands of stencil computations, applied mathematicians are adopting higher-order methods for solvers [22, 23]. Higher-order methods achieve desired accuracy while working on much smaller grids (lower resolution). Smaller grids lower memory capacity requirements, which leads to reduced data movement when computations stream through the grids. Higher-order methods achieve increased accuracy by using stencil computations with a higher number of floating-point operations.

Figure 1.4: Visualization of periodic boundary conditions for structured grids. Figure (a) shows a 2D 5-point stencil and (b) illustrates application of the stencil on the upper-left corner of a 2D grid. Periodic boundary conditions mean points outside the boundary wrap around the grid. For the 2D 5-point stencil which reads points 0-4, the points 3 and 4 wrap around the grid as illustrated by the dashed lines.
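To make the sweep-fusion idea from Section 1.3 concrete, here is a toy 1D sketch of our own (not taken from any of the cited papers): two applications of a 3-point averaging stencil are fused so the input grid is streamed through memory once, with only a 3-value window of the intermediate sweep kept live.

```c
#define NMAX 16   /* maximum grid size for the toy example */

static double avg3(const double *a, int i) {
    return (a[i-1] + a[i] + a[i+1]) / 3.0;
}

/* Baseline: two separate sweeps; the grid and the temporary each
 * stream through memory, doubling the traffic. Requires n <= NMAX. */
void two_sweeps(double *out, const double *in, int n) {
    double tmp[NMAX];
    tmp[0] = in[0]; tmp[n-1] = in[n-1];          /* boundaries copied */
    for (int i = 1; i < n-1; i++) tmp[i] = avg3(in, i);
    out[0] = tmp[0]; out[n-1] = tmp[n-1];
    for (int i = 1; i < n-1; i++) out[i] = avg3(tmp, i);
}

/* Fused: one pass over 'in'; only tmp[i-1], tmp[i], tmp[i+1] are kept,
 * recomputed on the fly instead of stored to memory. */
void fused_sweeps(double *out, const double *in, int n) {
    double t_prev = in[0];               /* tmp[0] */
    double t_cur  = avg3(in, 1);         /* tmp[1] */
    out[0] = in[0]; out[n-1] = in[n-1];
    for (int i = 1; i < n-1; i++) {
        double t_next = (i+1 < n-1) ? avg3(in, i+1) : in[n-1];
        out[i] = (t_prev + t_cur + t_next) / 3.0;
        t_prev = t_cur; t_cur = t_next;
    }
}
```

The fused version trades redundant recomputation of intermediate values for a halving of grid traffic, exactly the trade-off the time-skewing and wavefront techniques above make at larger scale.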
To update a grid point, higher-order stencils operate on a larger neighborhood of points; this often leads to performance being limited by the intensity of floating-point computation rather than memory bandwidth. Optimizing compute-intensive higher-order stencils has received far less attention than the memory-bound stencils. Manual optimizations [12, 24] and automated approaches [25, 26] alleviate performance bottlenecks of compute-intensive stencils by reducing loads and/or computation. Like optimizations for memory-bound stencils, these techniques have only targeted isolated stencil computations on a single large grid, and not in the context of a solver, and importantly, these optimization efforts have ignored the interplay between optimizations to reduce computation intensity and to reduce memory traffic.

1.4 Optimization Challenges

The wide variety of stencils presents a spectrum of optimization challenges. With arithmetic intensities that range from 0.2 to over 5 (Table 1.1), stencil computations stress different subsystems on a node. This section outlines the various optimization challenges presented by stencils and GMG.

Memory bandwidth optimizations

Traditionally stencil computations have had very low arithmetic intensities. This makes the performance of stencil computations notoriously limited by the memory bandwidth of the machines. To improve memory use, optimizations often fuse multiple stencil sweeps into one, and trade off redundant floating-point computation for data movement.

Difference between stencil applications and smooths

Stencil computations can be expressed as y = Ax, where A is the stencil being applied to the grid x. Smooths, represented as x_new = x + w D^{-1} (b - Ax), require two more arrays, b and D^{-1}. This means more data movement and makes smooths even more memory bandwidth-limited than stencil applications. Thus, optimizing smooths may require even more aggressive bandwidth optimization than simple stencil applications.
Managing floating-point computations

Stencils in Table 1.1 with arithmetic intensity greater than one have high intensities of floating-point operations. In fact, the 125-point stencil executes over 5 flops per byte moved. In such cases, the stencil computation may not even achieve performance corresponding to the DRAM bandwidth because it is limited by the large number of floating-point operations. Optimizing these compute-bound stencils requires identifying and exploiting reuse of computation to reduce floating-point operations. Furthermore, to achieve high floating-point performance on modern machines, it is essential for stencil codes to effectively use the SIMD (single instruction multiple data) instructions available on modern architectures. SIMD instructions process multiple data elements simultaneously and can potentially improve floating-point performance by several integer factors (by the width of the SIMD unit).

Composing a sequence of optimizations

Optimizations which reuse and reduce floating-point operations and target compute-bound stencils must be designed to work with optimizations that target memory bandwidth-limited stencils. Reducing floating-point intensity of a compute-bound stencil will improve its performance, and may make it limited by the memory bandwidth. At this point, the optimization needs to be combined with memory bandwidth optimizations to push the performance even higher.

Optimization across GMG operators

Geometric Multigrid has five principal operations, of which four (smooth, residual, restriction, and interpolation) involve calculation of stencils. In addition to optimizing each of these operators in isolation, it is possible to optimize across them. For example, fusing two or more of these operations will reduce the number of times grids must be streamed into memory and improve memory bandwidth optimization. Unfortunately, such fusion will also involve tradeoffs such as increased working set and redundant computation. Finally, effectiveness of such optimizations will depend on the stencil, iteration, and architecture. Thus optimizing GMG means searching a larger space of possible optimizations compared to optimizing a single stencil computation.
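As a toy illustration of the "reuse of computation" mentioned above (in the spirit of the partial-sums optimization developed later in this dissertation, though much simplified and our own construction), consider a 2D 3x3 box stencil with equal weights. Maintaining running column sums lets each output reuse two of its three column sums from the previous iteration:

```c
#define NX 8
#define NY 8

/* Naive 3x3 box stencil: 9 loads and 8 adds per output point. */
double box_naive(double g[NY][NX], int j, int i) {
    double s = 0.0;
    for (int dj = -1; dj <= 1; dj++)
        for (int di = -1; di <= 1; di++)
            s += g[j + dj][i + di];
    return s;
}

/* Column-sum variant: each 3-point column sum (2 adds) is computed once
 * and reused by the 3 outputs whose windows contain it, cutting the adds
 * per output point from 8 to roughly 4. */
void box_colsums(double out[NY][NX], double g[NY][NX]) {
    for (int j = 1; j < NY - 1; j++) {
        double c0 = g[j-1][0] + g[j][0] + g[j+1][0];
        double c1 = g[j-1][1] + g[j][1] + g[j+1][1];
        for (int i = 1; i < NX - 1; i++) {
            double c2 = g[j-1][i+1] + g[j][i+1] + g[j+1][i+1];
            out[j][i] = c0 + c1 + c2;   /* reuse c0, c1 from last step */
            c0 = c1; c1 = c2;
        }
    }
}
```

The same idea scales with stencil size: for the 125-point stencil the reusable partial sums are much larger, which is why reducing flops this way pays off most for higher-order operators.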
Mapping computation to parallel threads

The increasing number of hardware threads on modern architectures makes mapping parallelism in stencil codes to threads crucial to performance. There is a delicate balance of parallelism, locality, redundant computation and working set in stencil computations. Increasing one of them may adversely affect the others. To achieve high performance, we must be able to generate and search many possible parallel variants which implement the same stencil computation.

1.5 Domain-Specific Optimization Techniques

Even though a large number of optimizations for stencils are known to the compiler community, current commercial compilers fail to generate highly optimized stencil code. Static compiling techniques cannot anticipate all possible execution environments, such as architecture, problem input size, and trade-offs between optimizations. Thus, state-of-the-art commercial compilers do not typically risk potential slowdown by applying aggressive optimizations. To address the lack of support from commercial compilers, several domain-specific compilers and tools have been developed [27, 28, 29, 30, 31, 26].

These automated approaches tailor their optimizations and code generation to stencil computations. With the exception of the domain-specific language Halide [31], all other domain- or application-specific approaches have concentrated on isolated stencil applications. They have not applied their techniques to optimizing an entire solver with several stencil computations, and they have concentrated efforts on traditional memory-bound stencils with limited or no emphasis on higher-order stencils.

Another domain-specific approach to optimizing scientific codes, presented in [12, 32], builds application-specific code generators. These are then used to generate many optimized code variants implementing the same computation; these variants are then searched to find the optimal code. This approach generates very high-quality code, targets stencils both in isolation and in solvers, and has optimized compute-intensive stencils. Unfortunately, this approach requires code generators to be rewritten for each new application without much reuse of optimizations across applications.
This requires significant human effort and hurts the productivity of the optimization expert.

1.6 Domain-Specific Extensions and Autotuning

Autotuning has been used to remedy this situation. Autotuning a computation involves generating and searching a space of possible implementations of the input computation to find the optimal code for a given execution context.

This dissertation presents research that extends a compiler framework with known and novel domain-specific optimizations targeting stencil computations and GMG. This approach of adding novel optimizations in a compiler framework and leveraging the new and known optimizations through autotuning is referred to as compiler-directed autotuning. Building domain-specific optimizations into a compiler framework [33] allows reuse of known optimizations across applications and greatly reduces the effort of writing application-specific code generators. A domain-specific autotuner is then created to drive the augmented compiler framework to generate multiple code variants for a computation, and the best code variant for a given execution is then empirically found.

The novel transformations for stencils and Geometric Multigrid are built into the CHiLL [33] compiler framework. CHiLL is a code transformation framework which supports dependence analysis, loop transformation and code generation. CHiLL allows composition of optimizations and is developed to support autotuning. The compiler framework has a scripting language interface; the scripts (also called transformation recipes) are sequences of optimizations which direct compiler application of transformations [34]. Figure 1.5 illustrates this approach.

Figure 1.5: Autotuning based on the CHiLL compiler framework. CHiLL has a scripting language interface.
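To give a flavor of such a transformation recipe, the sketch below is a hypothetical CHiLL-style script composing three optimizations for a smooth operator. The command names, arguments, and layout are illustrative only (consult the CHiLL user manual for the actual interface); the point is that an autotuner can vary the numeric parameters to generate many variants.

```
# Hypothetical CHiLL-style recipe (illustrative syntax, not verbatim CHiLL)
source: smooth.c          # input C file
procedure: smooth         # function to transform
loop: 0                   # loop nest to target

permute([2,1,3])          # reorder loops for better locality
tile(0, 2, 64)            # tile loop level 2 with tile size 64
unroll(0, 3, 4)           # unroll the innermost loop by 4
```

An autotuner would then sweep, for example, the tile size and unroll factor, emit one script per point in the search space, and keep the variant that runs fastest.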
Novel transformations were built into CHiLL. The autotuners implemented in the research presented here generated CHiLL scripts to direct optimization. CHiLL ran the scripts to generate code variants. An autotuner generates multiple scripts which are used by CHiLL to generate multiple code variants for a computation, and the best performing variant is finally selected.

The benefit of compiler-directed autotuning has been demonstrated in this research by applying the technique to optimizing stencils in isolation and in the context of Geometric Multigrid. Novel domain-specific compiler optimizations for memory-limited stencils and compute-intensive higher-order stencils have been developed. Considerable interaction between optimizations for memory- and compute-bound stencils is seen, highlighting the need to compose sequences of optimizations. Furthermore, to target more complex and realistic structured grid codes seen in applications, optimizations are presented for a GMG benchmark (miniGMG [32]) that proxies block-structured AMR codes. Block-structured AMR codes such as CHOMBO [35] are commonly seen in modern scientific applications. They partition a grid into smaller sub-grids on which stencil computations are applied, and present more optimization challenges to the compiler than a stencil being applied to a single grid.

1.7 Contributions

This dissertation presents novel research which uses compiler optimizations and autotuning to optimize stencil computations and Geometric Multigrid. The contributions of this dissertation are outlined below.

1. Communication-avoiding optimizations

This research develops domain-specific compiler optimizations and uses autotuning to reduce inter- and intra-node communication. The generated code achieves up to 4x speedup for smooths, 3x for the GMG solver, and matches or betters highly optimized manually tuned code.

2.
Optimize compute-intensive higher-order stencils

Higher-order GMG solvers are implemented to illustrate the high accuracy obtained by these methods and identify increased intensity of floating-point operations as the performance bottleneck. Research presented in this dissertation develops and implements a novel optimization, partial sums, which reuses computation to reduce floating-point operations. By using partial sums in conjunction with communication-avoiding optimizations, speedups up to 4x for higher-order smooths are achieved.

3. Parallel code generation for many- and multicore architectures

Research presented in this dissertation uses a compiler framework to generate optimized OpenMP and CUDA parallel code for stencils to target many- and multicore architectures. The parallel code generation capability is used to explore different threading strategies.

1.8 Thesis Structure

The remaining chapters in this dissertation are:

Chapter 2 introduces the CHiLL compiler framework. It describes the basic compiler abstractions and known compiler optimizations in CHiLL which have been leveraged in this dissertation.

Chapter 3 describes the miniGMG benchmark which was used to implement and optimize Geometric Multigrid. The chapter explains in detail the principal operations in GMG and their implementation in the benchmark.

Chapter 4 describes the communication-avoiding optimizations used to optimize memory bandwidth-bound operations in GMG. The chapter first introduces the optimizations, then explains their implementation details and finally presents performance results and analysis.

Chapter 5 motivates the use of higher-order stencils and then identifies performance bottlenecks associated with these compute-intensive methods. Motivation for a new optimization targeting higher-order stencils is presented, followed by its description and implementation. The performance of the higher-order stencils is finally presented and the efficacy of higher-order methods is shown.

Chapter 6 describes code generation for graphics processing units (GPUs).
The chapter explores parallel decomposition strategies to map GMG to the large number of hardware threads available on these platforms.

Chapter 7 discusses research related to optimizing stencil computations and GMG.

Chapter 8 presents conclusions and outlines future research.

CHAPTER 2

CHILL

The novel transformations presented in this dissertation are built into the CHiLL compiler framework [36, 33]. CHiLL is a loop transformation and code generation framework with a scripting language interface. CHiLL was designed to support autotuning by allowing sequences of transformations to be composed and applied. CHiLL also exposes parameters of the transformations for autotuning. The input to CHiLL is a source code written in C (or Fortran), and a transformation script. The script describes the set of transformations to be composed to optimize the provided source [34]. In the research presented here, the script is either generated by an autotuner or written by an expert programmer, but it can also be derived automatically by a compiler decision algorithm [37]. After applying optimizations, CHiLL generates optimized C (or Fortran) code. To target CUDA code generation for NVIDIA GPUs, we use CUDA-CHiLL [34, 37]. It is a thin layer built on top of CHiLL to generate CUDA code. This chapter describes fundamental abstractions in CHiLL and CUDA-CHiLL, and gives examples of how they are used.
2.1 Organization of CHiLL

At the heart of CHiLL is a polyhedral framework that composes complex transformation sequences. Internally CHiLL uses Omega+, an enhanced version of Omega [38], and CodeGen+ [39]. A polyhedral model represents each statement's execution in the loop nest as a lattice point in the space constrained by loop bounds, known as the iteration space. Then a loop transformation can be simply viewed as a mapping from one iteration space to another. CHiLL manipulates iteration spaces derived from the original program, using a dependence graph as an abstraction to reason about the safety of the transformations under consideration [40]. In CHiLL, iteration spaces are represented as integer sets, and loop transformations are linear mappings applied to these integer sets. Omega+ is used to represent the integer sets and linear mappings, apply the mappings to the integer sets, and compute data dependences.

In a polyhedral model, after the relations representing loop transformations have been applied to input iteration spaces, optimized code is generated from the rewritten iteration spaces. Code generation involves scanning the polyhedra representing the iteration spaces of an optimized loop nest. The quality of the generated code directly impacts performance. Therefore, CHiLL uses a code generator called CodeGen+ that has advanced the state of the art in polyhedral scanning.

In the remainder of this chapter we describe the iteration space of a statement in a loop nest, relations used to transform iteration spaces, data dependence, and legality of transformations. We use these concepts to illustrate four well-known loop transformations implemented in CHiLL: loop permutation, loop fusion, loop tiling, and loop skewing. Finally we introduce CUDA-CHiLL, a CUDA code generation tool built on top of CHiLL.

2.2 Integer Sets and Relations

Iteration spaces and mappings between iteration spaces are represented mathematically using Omega+. Omega+ provides interfaces to represent integer sets and mappings using Presburger formulas [41]. Presburger formulas are constructed by combining affine (equality or inequality) constraints on integer variables with the logical operations NOT, AND, and OR, and the quantifiers "for all" and "there exists".
For example, let S be the set of integers between 0 and 64, and S_even be the set of even integers in that range. These sets are defined as follows:

S      := {[i] : 0 < i < 64}
S_even := {[i] : exists a : (0 < i < 64 && i = 2a)}
R      := {[i] -> [i'] : i' = i + 1}

Relations are used to map between sets. They are also expressed using Presburger arithmetic. The relation R, shown above, adds 1 to each element of the set it is applied to. Thus, if R is applied to S, it will map it to the new set of integers {[i'] : 1 < i' < 65}.

2.3 Iteration Spaces

In a polyhedral model, loops surrounding a statement can be described as a polyhedron in an integer linear space. Thus we can represent a loop nest's iteration space with a set of inequalities on loop index variables in the affine domain, where these variables have only integer coefficients. This representation of the iteration space is suitable for perfectly nested loops (all assignment statements in the innermost loop). For the loop nest in Listing 2.1, the iteration space of the statement can be simply represented as:

IS0 := {[i, j] : 0 <= i < N && 0 <= j < N}

For imperfectly nested loops as shown in Listing 2.2, additional information is needed to capture ordering of statements and represent the loop structure. For the imperfect loop nest, the statements S1, S2, and S3 are at different levels of nesting. To reason about the effect of transformations on imperfect loop nests, the iteration space representation must capture this loop structure.
To capture ordering constraints between statements in an imperfect loop nest, we add an auxiliary loop to each loop level, with an additional auxiliary loop as the last dimension. These auxiliary loops sink all statements to the loop level of the innermost loop. This means the iteration spaces of all the statements in the loop nest have the same dimensionality. Furthermore, the auxiliary loops are added ensuring that the order in which the statements are executed is preserved; in other words, the lexicographic order of the statements is correct [33]. Thus for an n-deep loop nest, we have (2n+1)-dimension iteration spaces, and all statements, irrespective of their nesting depth, have the same number of dimensions in their iteration space. Although auxiliary loops carry special meaning during loop transformations, auxiliary loops and other loops are treated equivalently during code generation. Auxiliary loop iteration spaces are always constant valued and do not show up in the final code. For the code in Listing 2.2, the iteration spaces of the statements are:

IS1 := {[i, j] : 0 <= i < N && j = -1}
IS2 := {[i, j] : 0 <= i < N && 0 <= j < i}
IS3 := {[i, j] : 0 <= i < N && j = i - 1}

From the 2D iteration spaces of the statements it can be clearly seen that all the statements have the same loop nesting level (iterations of the same dimensionality). Internally, the 2D space is stored as a 5D space with the help of auxiliary loops:

r1 := {[i, j] -> [0, i, 0, j, 0]}
r2 := {[i, j] -> [0, i, 1, j, 0]}
r3 := {[i, j] -> [0, i, 2, j, 0]}

2.4 Relations for Loop Transformations

CHiLL uses Omega+ to represent iteration spaces as sets, and uses relations to map between iteration spaces to represent loop transformations. The use of relations to represent loop transformations is illustrated with the example of loop shifting. The iteration space for the code in Listing 2.1 is shown below as IS0. The relation R, which shifts the iteration space in each dimension by 1, is applied to IS0; the output iteration space is IS0'.
IS0 := {[i,j] : 1 <= i < N && 1 <= j < N}
R := {[i,j] -> [i',j'] : i' = i + 1 && j' = j + 1}
IS0' := R(IS0) = {[i,j] : 2 <= i < N+1 && 2 <= j < N+1}

The loop nest corresponding to IS0' is shown in Listing 2.3. It can be easily seen that the loop bounds have been shifted by unity in each dimension. In addition, CHiLL adds a negative shift to the array references to counter the shift in the iteration space.

In the remaining discussion, relations will be abbreviated to R := {[i] -> [i']}, unless further details are necessary. Each loop transformation from an n-deep loop nest to a new m-deep loop nest is represented as a relation map. The loops marked as ci's in map are the auxiliary loops. Ignoring the auxiliary loops, we can rewrite map as map'.

map := {[c1, l1, ..., cn, ln, cn+1] -> [c'1, l'1, ..., c'm, l'm, c'm+1]}
map' := {[l1, ..., ln] -> [l'1, ..., l'm]}

#define N 64

void func0(){
  double A[N+2][N+2], B[N+2][N+2];
  int i, j;

  for(i=1; i<N; i++){
    for(j=1; j<N; j++){
      // Statement S0
      // iteration space IS0
      A[j][i] = B[j][i-1]
              + B[j][i] + B[j][i+1];
    }}
}

Listing 2.1: Simple perfectly nested loop.

#define N 64

void func1(){
  double Sum[N], A[N][N], B[N];
  int i, j;

  for(i=0; i<N; i++){
    // Statement S1
    // iteration space IS1
    Sum[i] = 0;
    for(j=0; j<i; j++){
      // Statement S2
      // iteration space IS2
      Sum[i] = Sum[i] + A[j][i]*B[j];
    }
    // Statement S3
    // iteration space IS3
    B[i] = B[i] - Sum[i];
  }
}

Listing 2.2: Loop nest with three statements at different nesting levels.
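As a sanity check on shifting, a C sketch (ours, not generated by CHiLL) can compare the loop of Listing 2.1 with its shifted counterpart and confirm that both write identical values:

```c
#include <string.h>
#include <stdbool.h>
#define N 64

// Original stencil loop (Listing 2.1).
static void unshifted(double A[N+2][N+2], double B[N+2][N+2]) {
    for (int i = 1; i < N; i++)
        for (int j = 1; j < N; j++)
            A[j][i] = B[j][i-1] + B[j][i] + B[j][i+1];
}

// Shifted variant: bounds moved by +1, references compensated by -1.
static void shifted(double A[N+2][N+2], double B[N+2][N+2]) {
    for (int i = 2; i <= N; i++)
        for (int j = 2; j <= N; j++)
            A[j-1][i-1] = B[j-1][i-1-1] + B[j-1][i-1] + B[j-1][i+1-1];
}

// Returns true when the two forms produce identical output on a test grid.
bool shift_preserves_result(void) {
    static double A1[N+2][N+2], A2[N+2][N+2], B[N+2][N+2];
    for (int j = 0; j < N+2; j++)
        for (int i = 0; i < N+2; i++)
            B[j][i] = (double)(j * 1000 + i);
    memset(A1, 0, sizeof A1);
    memset(A2, 0, sizeof A2);
    unshifted(A1, B);
    shifted(A2, B);
    return memcmp(A1, A2, sizeof A1) == 0;
}
```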
#define N 64

void func0(){
  double A[N+2][N+2], B[N+2][N+2];
  int i, j;

  // Loop bounds have been shifted by (+1)
  for(i=2; i<=N; i++){
    for(j=2; j<=N; j++){
      // Statement S0
      // Array indices have been shifted by (-1)
      // Iteration space IS0'
      A[j-1][i-1] = B[j-1][i-1-1] + B[j-1][i-1] + B[j-1][i+1-1];
    }}
}

Listing 2.3: Loop nest after shifting.

2.5 Dependence Graph and Legality of Transformations

Dependence analysis [40,38] is a key component of any compiler framework, ensuring the correctness of generated code. CHiLL uses Omega+ for data dependence (flow, anti, output, and input dependence) analysis [38]. Using the data dependence information from Omega+, CHiLL creates a dependence graph for the statements in the input loop nest. The dependence graph is a standard component of loop restructuring compilers [40], and in CHiLL, the dependence graph is used to test for legality of loop transformations.

2.6 Loop Transformations in CHiLL

CHiLL implements a wide range of loop transformation algorithms which transform a loop nest from one state of representation to another, more suitable one. The representation of a loop nest includes the statements, iteration spaces, and the dependence graph. Common loop transformations implemented in CHiLL are listed in Table 2.1; for a complete list, refer to the CHiLL user manual [36]. The following subsections briefly outline the known loop transformations which have been used in the research presented in this dissertation.

Table 2.1: A subset of transformations available in CHiLL.

Transformation       Purpose
Loop permutation     Change the order of loops.
Loop tiling          Partition the iteration space into small blocks and iterate through the blocks in sequence.
Loop unrolling       Similar to tiling in changing the iteration order, but uses an explicitly unrolled loop body.
Loop fusion          Fuse distinct loops for different statements into one.
Loop distribution    Distribute different statements in a single loop nest into distinct subloops, each enclosing a separate statement.
Loop peeling         Unroll a number of iterations from the beginning or the end of a loop.
Loop splitting       Split the original loop into subloops, each representing a disjoint part of the original iteration space.
Loop shifting        Adjust the loop index by adding a specified amount to the nontransformed index.
Loop skewing         Modify the iteration space so that multiple dependences are carried by the same loop nesting level.

2.6.1 Loop Permutation

Loop permutation changes the ordering of the loops [40]. For an n-deep loop nest, given a permutation sigma of the loop order, the relation for permutation can be expressed as:

permute := {[l1, ..., ln] -> [l_sigma(1), ..., l_sigma(n)]}

CHiLL uses the dependence graph of the loop nest to make sure the permutation is legal and does not violate data dependences. CHiLL also carefully updates auxiliary loops to reflect the change in loop order [33].

Permutation using CHiLL is illustrated in Listings 2.4-2.6. Listing 2.4 shows the input code for a 2D stencil. The CHiLL script driving the transformation is shown in Listing 2.6. The first three lines direct CHiLL to the loop nest in the given file and procedure. The original() command initializes CHiLL, sets up the iteration space (with auxiliary loops), and collects dependence information for the statements in the loop nest. The permute command specifies the final ordering of the loop nest. The command can be represented by the relation permute', which is applied to the iteration space of the loop nest. Once the transformation is complete, polyhedra scanning generates the output code shown in Listing 2.5.

permute' := {[l1, l2] -> [l2, l1]}

2.6.2 Loop Skewing

Loop skewing changes the iteration space of a statement in a loop nest by adding an outer loop index value to an inner loop index. Skewing is illustrated using Listings 2.4, 2.7, and 2.8. In the generated code in Listing 2.7, the inner loop index i is now a linear function of the outer loop index j. To account for the change in inner loop bounds, the array references have also been updated. In general, skewing can be expressed as the relation skew, where the ai terms are integer constants. Thus, the modified loop index is simply a linear combination of the other loop indices. The CHiLL script shown in Listing 2.8 uses the command skew([0],2,[1,1]).
This directs CHiLL to skew loop level 2 (i) surrounding statement 0, such that the new loop level 2 is a linear combination of loop levels 1 and 2, with the constant terms a1 and a2 set to 1. The mapping corresponding to this transformation is skew'.

skew := {[l1, l2, ..., ln] -> [l'1, l2, ..., ln] : l'1 = a1*l1 + a2*l2 + ... + an*ln}
skew' := {[j,i] -> [j,i'] : i' = 1*j + 1*i}

2.6.3 Loop Tiling

Loop tiling involves decomposing a single loop into two loops, one which executes a tile of consecutive iterations and one which iterates over the tiles. Listings 2.9 and 2.10 illustrate tiling of the i loop into loops ii and i with 1D tiles of width 16. The CHiLL command tile(0,2,16,2,counted) tiles loop level 2 (i) for statement 0 into tiles of size 16, and the tile controlling loop (ii) is at loop level 2 after the transformation is complete. For clarity of presentation, we assume that the loop bounds are from 0 to 64 instead of starting from 1, as in the other examples. This simple tile command for the given problem and tile size can be represented by the mapping tile'16. Mappings for tiling in general are more complex, and further details can be found in [40,33]. Generated code where both loops i and j are tiled is shown in Listing 2.11, with the corresponding CHiLL script in Listing 2.12.

tile'16 := {[j,i] -> [j,ii,i'] : i' = 16*ii + k && 0 <= k < 16 && 0 <= ii < 4}

2.6.4 Loop Fusion

Combining statements in adjacent loops into a single loop nest is known as loop fusion [40]. Initially, CHiLL will have statements in a loop nest fused whenever possible. This automatic fusion falls out of CHiLL's algorithm to add auxiliary loops to ensure all statements have the same dimensionality of iteration space. The automatic fusion is illustrated in Listings 2.13 and 2.14. CHiLL will automatically transform the input code in Listing 2.13 to the fused code in Listing 2.14 by making a call to the original() command. In addition, CHiLL also provides explicit fuse commands for optimization purposes. The fusion algorithm takes a set of statements and the loop level as parameters.
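The legality of automatic fusion in this example can be checked directly: since B is only read, fusing the two i-loops cannot change the result. A small C sketch (ours, mirroring Listings 2.13 and 2.14):

```c
#include <string.h>
#include <stdbool.h>
#define N 64

// Distributed loops (as in Listing 2.13) vs the fused form (Listing 2.14).
// B is only read, so fusing the two i-loops is legal and must give
// bit-identical results for both A and C.
bool fusion_preserves_result(void) {
    double A1[N+2], C1[N+2], A2[N+2], C2[N+2], B[N+2];
    for (int i = 0; i < N+2; i++) B[i] = (double)(i * i);
    memset(A1, 0, sizeof A1); memset(C1, 0, sizeof C1);
    memset(A2, 0, sizeof A2); memset(C2, 0, sizeof C2);
    for (int t = 0; t < 2; t++) {             // distributed loops
        for (int i = 1; i < N; i++) A1[i] = B[i-1] + B[i] + B[i+1];
        for (int i = 1; i < N; i++) C1[i] = B[i-1] + B[i] + B[i+1];
    }
    for (int t = 0; t < 2; t++) {             // fused loop
        for (int i = 1; i < N; i++) {
            A2[i] = B[i-1] + B[i] + B[i+1];
            C2[i] = B[i-1] + B[i] + B[i+1];
        }
    }
    return memcmp(A1, A2, sizeof A1) == 0 && memcmp(C1, C2, sizeof C1) == 0;
}
```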
In addition to loop transformations such as tiling and fusion, CHiLL can also be used to compute properties of loop nest computations. The next section uses Listing 2.15 to illustrate how CHiLL computes the data footprint of an array in a loop.

#define N 64

void stencil2D()
{
  double A[N+2][N+2], B[N+2][N+2];
  int i, j;

  for(j=1; j<N; j++){
    for(i=1; i<N; i++){
      A[j][i] = B[j][i]
              + B[j][i+1] + B[j][i-1]
              + B[j+1][i] + B[j-1][i];
    }}
}

Listing 2.4: Input code for the 2D stencil computation.

#define N 64

void stencil2D()
{
  double A[66UL][66UL];
  double B[66UL][66UL];
  int i;
  int j;
  for (i=1; i<=63; i+=1)
    for (j=1; j<=63; j+=1)
      A[j][i] = B[j][i] + B[j][i+1]
              + B[j][i-1] + B[j+1][i]
              + B[j-1][i];
}

Listing 2.5: Generated code after loop permutation.

source: stencil2D.c
procedure: stencil2D
format: rose

original()
permute([2,1])

Listing 2.6: CHiLL script for loop permutation.

#define __rose_lt(x,y) ((x)<(y)?(x):(y))
#define __rose_gt(x,y) ((x)>(y)?(x):(y))
#define N 64

void stencil2D()
{
  double A[66UL][66UL];
  double B[66UL][66UL];
  int i;
  int j;
  for (j=1; j<=63; j+=1)
    for (i=j+1; i<=j+63; i+=1)
      A[j][-j+i] = B[j][-j+i]
                 + B[j][-j+i+1] + B[j][-j+i-1]
                 + B[j+1][-j+i] + B[j-1][-j+i];
}

Listing 2.7: Generated code after loop skewing.

source: stencil2D.c
procedure: stencil2D
format: rose

original()
skew([0],2,[1,1])

Listing 2.8: CHiLL script for loop skewing.

#define N 64

void stencil2D()
{
  double A[66UL][66UL];
  double B[66UL][66UL];
  int i, ii;
  int j;
  for (j=0; j<=63; j+=1)
    for (ii=0; ii<=3; ii+=1)
      for (i=16*ii; i<=16*ii+15; i+=1)
        A[j][i] = B[j][i]
                + B[j][i+1] + B[j][i-1]
                + B[j+1][i] + B[j-1][i];
}

Listing 2.9: Generated code after loop tiling.
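The skewed code of Listing 2.7 can likewise be checked against the input stencil of Listing 2.4; the following C sketch (ours) verifies that the two forms produce identical output:

```c
#include <string.h>
#include <stdbool.h>
#define N 64

// Input form of the 2D 5-point stencil (as in Listing 2.4).
static void plain2D(double A[66][66], double B[66][66]) {
    for (int j = 1; j < N; j++)
        for (int i = 1; i < N; i++)
            A[j][i] = B[j][i] + B[j][i+1] + B[j][i-1] + B[j+1][i] + B[j-1][i];
}

// Skewed form (as in Listing 2.7): the inner index runs j+1..j+63 and
// the array references subtract j to compensate.
static void skewed2D(double A[66][66], double B[66][66]) {
    for (int j = 1; j <= 63; j++)
        for (int i = j + 1; i <= j + 63; i++)
            A[j][-j+i] = B[j][-j+i] + B[j][-j+i+1] + B[j][-j+i-1]
                       + B[j+1][-j+i] + B[j-1][-j+i];
}

// Returns true when both forms produce identical output on a test grid.
bool skew_preserves_result(void) {
    static double A1[66][66], A2[66][66], B[66][66];
    for (int j = 0; j < 66; j++)
        for (int i = 0; i < 66; i++)
            B[j][i] = (double)(j * 100 + i);
    memset(A1, 0, sizeof A1);
    memset(A2, 0, sizeof A2);
    plain2D(A1, B);
    skewed2D(A2, B);
    return memcmp(A1, A2, sizeof A1) == 0;
}
```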
source: stencil2D.c
procedure: stencil2D
format: rose

original()
tile(0,2,16,2,counted)

Listing 2.10: CHiLL script for loop tiling.

#define N 64

void stencil2D(){
  double A[66UL][66UL];
  double B[66UL][66UL];
  int i, ii;
  int j, jj;
  for (jj=0; jj<=3; jj+=1)
    for (ii=0; ii<=3; ii+=1)
      for (j=16*jj; j<=16*jj+15; j+=1)
        for (i=16*ii; i<=16*ii+15; i+=1)
          A[j][i] = B[j][i]
                  + B[j][i+1] + B[j][i-1]
                  + B[j+1][i] + B[j-1][i];
}

Listing 2.11: Code after tiling loops i, j.

source: stencil2D.c
procedure: stencil2D
format: rose

original()
tile(0,2,16,1,counted)
tile(0,2,16,1,counted)

Listing 2.12: CHiLL script for tiling loops i and j.

#define N 64

void stencil1D()
{
  double A[N+2];
  double B[N+2];
  double C[N+2];
  int i;
  int t;
  for (t=0; t<2; t++){
    for (i=1; i<64; i++)
      A[i] = B[i-1] + B[i] + B[i+1];
    for (i=1; i<64; i++)
      C[i] = B[i-1] + B[i] + B[i+1];
  }
}

Listing 2.13: Input code to CHiLL for the fusion example.

#define N 64

void stencil1D()
{
  double A[N+2];
  double B[N+2];
  double C[N+2];
  int i;
  int t;
  for (t=0; t<=1; t+=1){
    for (i=1; i<=63; i+=1){
      A[i] = B[i-1] + B[i] + B[i+1];
      C[i] = B[i-1] + B[i] + B[i+1];
    }
  }
}

Listing 2.14: Automatic loop fusion in CHiLL after invoking the command original().

#define N 64
for (i=0; i<N; i++)
  // Statement S0
  b[i] = a[i-1] + a[i] + a[i+1];

Listing 2.15: Simple loop nest used to compute the data footprint.

2.7 Computing the Footprint of Array References

CHiLL can compute a footprint space for each array reference in a loop by using the iteration space of the loop in conjunction with the array reference. Footprint computation is explained using Listing 2.15. The statement in the loop has four array references: b[i], a[i], a[i-1], and a[i+1]. Each array reference generates a linear mapping which maps a point in the iteration space of the loop to a point in the data footprint space, which is simply a transformed integer set.
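To make the footprint mapping concrete, the following C sketch (ours, not CHiLL's implementation) computes the extent of the union of the footprints of a[i-1], a[i], and a[i+1] over IS = {[i] : 0 <= i < 64}:

```c
// Union footprint of array a in Listing 2.15: apply the mappings for
// offsets -1, 0, and +1 to IS = {[i] : 0 <= i < 64} and track the
// smallest and largest index touched. The expected union is [-1, 64].
void footprint_of_a(int *lo, int *hi) {
    const int offsets[3] = {-1, 0, 1};
    *lo = 0 + offsets[0];
    *hi = 0 + offsets[0];
    for (int r = 0; r < 3; r++)
        for (int i = 0; i < 64; i++) {
            int p = i + offsets[r];
            if (p < *lo) *lo = p;
            if (p > *hi) *hi = p;
        }
}
```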
The linear mappings generated by the array references a[i-1], a[i], and a[i+1] are ref-1, ref0, and ref+1, respectively, and can be expressed as:

ref-1 := {[i] -> [i'] : i' = i - 1}
ref0 := {[i] -> [i'] : i' = i}
ref+1 := {[i] -> [i'] : i' = i + 1}

The footprint spaces of these array references are the application of the linear mappings to the iteration space (IS) of the loop. Thus the footprint spaces of these references can be represented as:

footprint-1 = ref-1(IS) := {[i] : -1 <= i < 63}
footprint0 = ref0(IS) := {[i] : 0 <= i < 64}
footprint+1 = ref+1(IS) := {[i] : 1 <= i <= 64}

The union of the footprint spaces gives the footprint accessed by array a. Computing the union can often result in an overapproximation. Further details on computing array footprints can be found in [33], and their use in transformations such as datacopy and in array dataflow analysis can be found in [33] and [42], respectively.

2.8 Extending Polyhedral Technology in CHiLL

CHiLL is not merely a polyhedral framework; it also allows manipulating the intermediate representation (IR) of the loops and statements. In fact, loop unrolling in CHiLL is not a polyhedral transformation [33]. The novel transformations added to CHiLL as part of this dissertation involve modifying the IR of the input statements, and are not polyhedral transformations. To maintain composability of transformations in CHiLL, the approach taken is to rebuild the polyhedral representation of the program after the new transformation is applied. This involves correctly updating the iteration spaces of the modified statements and their lexicographic order, and rebuilding the dependence graph.

2.9 CUDA-CHiLL

Graphics processing unit (GPU) accelerators have become a common hardware target for scientific computing, as they offer high computing power (teraflops per node) with better power efficiency than traditional multicores. Chapter 6 in this dissertation explores compiler optimizations and code generation for smooth operators on NVIDIA GPUs.
NVIDIA GPUs are programmed using the CUDA programming model. NVIDIA GPU architectures organize the parallelism on a node in a two-level hierarchy, with a number of streaming multiprocessors (SMs), each of which has a SIMD unit with several cores. CUDA reflects this two-level hierarchy. A CUDA program (called a CUDA kernel) describes a computation decomposition into a one- to three-dimensional space of thread blocks called a grid, where a block is mapped to one of the SMs. Each thread block likewise defines a one- to three-dimensional space of threads. A kernel thread program is executed for each point in the grid.

To help program NVIDIA GPUs, this dissertation uses CUDA-CHiLL. CUDA-CHiLL is a layer on top of CHiLL which generates parallel CUDA code from sequential code. Once loop transformations and other optimizations are applied using existing machinery in CHiLL, parallel CUDA code is generated using CUDA-CHiLL.

2.9.1 Parallel Decomposition

GPUs are a tiled architecture where each streaming multiprocessor (SM) represents a separate tile. Parallel code should be partitioned across SMs so that each thread operates on mostly independent, localized data. Subdividing the iteration space of a loop into blocks or tiles with a fixed maximum size has been widely used when constructing parallel computations [43,44,45]. The shape and size of the tile can be chosen to take advantage of the target parallel hardware and memory architecture.

Tiling was described in Section 2.6.3. Here we use it to create thread and block loops in CUDA. After tiling, CUDA-CHiLL is used to map one, two, or three loop levels to block indices for grid dimensions, and to map up to three loops to thread indices. This is illustrated with an example of matrix-vector multiplication in Listings 2.16-2.19. The input sequential code is shown in Listing 2.16. The problem size, and thus the loop trip count, is N=1024. The statement in the loop body is called statement S0. The inputs are an NxN matrix and an N-wide vector. CUDA-CHiLL has a lua scripting-language interface. Listing 2.17 shows the lua script which directs CUDA-CHiLL to generate a CUDA kernel for matrix-vector multiply.
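The block/thread decomposition CUDA-CHiLL produces for this example can be emulated in plain C to see how blocks and threads partition the work; the loop structure below mirrors the tiled form (32 blocks of 32 threads, with i = tx + 32*bx), and the names are illustrative:

```c
#include <stdbool.h>

// Plain-C emulation of the block/thread decomposition for the 1024x1024
// matrix-vector multiply: the tile-controlling loop becomes 32 blocks (bx),
// the tile loop becomes 32 threads per block (tx), so each thread owns
// column i = tx + 32*bx and walks the 16 j-tiles of width 64 sequentially.
bool cuda_mapping_covers_matrix(void) {
    static int touched[1024][1024];          // zero-initialized
    for (int bx = 0; bx < 32; bx++)          // grid dimension (was loop ii)
        for (int tx = 0; tx < 32; tx++)      // thread dimension (was loop i)
            for (int k = 0; k <= 15; k++)    // sequential tile loop
                for (int j = 64*k; j <= 64*k + 63; j++)
                    touched[j][tx + 32*bx]++;
    // Every matrix element must be visited exactly once.
    for (int j = 0; j < 1024; j++)
        for (int i = 0; i < 1024; i++)
            if (touched[j][i] != 1) return false;
    return true;
}
```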
For statement 0 in the input code, the tile_by_index command in Listing 2.17 tiles the i and j loops. The tile sizes for loops i and j are TI=32 and TJ=64, respectively. The tile controlling loops for i and j are ii and k, respectively. The final order of the loops is given by {ii,k,i,j}. Listing 2.18 illustrates the loop restructuring effected by the tile_by_index command. Loop i has a trip count of TI=32, and the tile controlling loop ii has a trip count of N/TI=32. Similarly, loop j has a trip count of TJ=64, and the tile controlling loop k has a trip count of N/TJ=16.

After tiling, the loop levels are assigned to CUDA blocks and threads using the cudaize command. The cudaize command in Listing 2.17 assigns loop ii to a block dimension, and loop i to a thread dimension. As can be seen in Listing 2.18, the loops are marked to be assigned to blocks and threads, and Listing 2.19 shows the generated CUDA kernel executed by each thread. Loops ii and i have been replaced with per-block and per-thread identifiers bx and tx, and the corresponding array references have been updated by CUDA-CHiLL.

#define N 1024

void normalMV(float c[N][N], float a[N], float b[N]){
  int i, j;

  for (i=0; i<N; i++)
    for (j=0; j<N; j++)
      a[i] = a[i] + c[j][i]*b[j];
}

Listing 2.16: The input sequential code for matrix-vector multiply.

TI=32
TJ=64
N=1024

tile_by_index(0, {"i","j"},
  {TI,TJ}, {l1_control="ii", l2_control="k"},
  {"ii","k","i","j"})

cudaize(0, "mv_GPU", {a=N, b=N, c=N*N},
  {block={"ii"}, thread={"i"}}, {})

Listing 2.17: CUDA-CHiLL script for matrix-vector multiply.

// ~cuda~ preferredIdx: bx
for(ii=0; ii<=31; ii++){
  for(k=0; k<=15; k++){
    // ~cuda~ preferredIdx: tx
    for(i=32*ii; i<=32*ii+31; i++){
      for(j=64*k; j<=64*k+63; j++){
        s0(i,j);
      }}}}

Listing 2.18: Tiled code with candidate loops for CUDA blocks and threads.

__global__ void mv_GPU(float *a, float (*c)[1024], float *b)
{
  int j;
  int k;
  int bx;
  bx = blockIdx.x;
  int tx;
  tx = threadIdx.x;
  for (k=0; k<=15; k+=1)
    for (j=64*k; j<=64*k+63; j+=1)
      a[tx+32*bx] = a[tx+32*bx] + c[j][tx+32*bx]*b[j];
}

Listing 2.19: Generated CUDA kernel for matrix-vector multiply.

2.10 Summary

This chapter describes the CHiLL compiler framework and the CUDA-CHiLL extension. CHiLL was designed to support autotuning by allowing easy and correct composition of transformations. CHiLL leverages polyhedral technology and internally uses Omega+ and CodeGen+ to mathematically represent and manipulate loops and to generate output code, respectively. The transformations described in this dissertation were built into CHiLL. This enabled composition of novel transformations with known compiler techniques developed over many decades of research.

CHAPTER 3

THE miniGMG BENCHMARK

This chapter describes Geometric Multigrid (GMG), a family of algorithms used to accelerate the convergence of iterative solvers. The basic operations in a Geometric Multigrid are essentially stencil computations or a mix of stencils and pointwise updates. In the past, most compiler research in optimizing stencils concentrated on stencil computations in isolation. In contrast, this dissertation focuses on optimizing a linear solver that uses multiple stencils.

To that end, optimized stencil kernels are generated for the miniGMG benchmark. miniGMG is a compact Geometric Multigrid benchmark which proxies multigrid solvers in Adaptive Mesh Refinement (AMR) applications; it has over 2000 lines of C code with a dozen performance-critical functions. Compiler techniques are used to optimize important stencil kernels in miniGMG which dominate runtime. The following sections present the baseline implementation of the miniGMG benchmark and highlight the stencil computations in the context of the overall solver. miniGMG has five principal operations: smooth, residual, restriction, interpolation, and ghost zone exchange, executed in a sequence known as the V-cycle. The next section presents details of the V-cycle in miniGMG and describes the data decomposition and parallelism used. Code skeletons are used to make our discussion concrete. We end the chapter by describing challenges to optimizing miniGMG.
3.1 V-cycle

Multigrid methods provide a powerful technique to accelerate the convergence of iterative solvers for linear systems and are therefore used extensively in a variety of numerical simulations. Conventional iterative solvers operate on data at a single resolution and often require too many iterations. Multigrid methods create a hierarchy of grid levels and use corrections of the solution from iterations on the coarser levels to improve the convergence rate of the solution at the finest level. Geometric multigrid (GMG) begins with a structured mesh, where each progressively coarser grid contains half the grid points in each dimension. Given the fact that the operators are the same irrespective of grid spacing, this exponential reduction in grid sizes can bound multigrid's computational complexity to O(N), where N is the number of variables. When performance is highly correlated to computational complexity, the time spent on the finer grids will dominate the runtime.

Figure 3.1 visualizes the structure of a multigrid V-cycle for solving Lu^h = f^h, in which L is the operator, u is the solution, f is the right-hand side, and superscripts represent grid spacings. At each grid spacing, multiple smooth operators reduce the error in the solution. The smooth can be a simple relaxation such as Jacobi, or something more complex, like Gauss-Seidel Red-Black (GSRB).

The right-hand side of the next coarser grid is defined as the restriction of the residual (f^h - Lu^h). Eventually, the grid (or collection of grids) cannot be coarsened any further using geometric multigrid. At that point, most algorithms switch to a bottom solver that can be as simple as multiple relaxations or as complicated as algebraic multigrid, a Krylov iterative solver, or a direct sparse solver. Once the coarsest grid is solved, the multigrid algorithm applies the solution (a correction) to progressively finer grids. This requires an interpolation of u^2h onto u^h. Smooth at the finer grid resolution is applied to the new correction.
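The O(N) bound follows from a geometric series: each coarser 3D level has one-eighth the points of the previous one, so the total across levels is less than N*(1 + 1/8 + 1/64 + ...) = (8/7)*N. A quick C sketch (ours) for a hierarchy running from a fine grid down to a coarsest grid:

```c
// Total grid points across a 3D multigrid hierarchy where each level has
// half the points per dimension: N + N/8 + N/64 + ... < (8/7)*N, which is
// the O(N) bound. Levels run from n_per_dim^3 down to coarsest_per_dim^3.
long total_points(long n_per_dim, long coarsest_per_dim) {
    long total = 0;
    for (long d = n_per_dim; d >= coarsest_per_dim; d /= 2)
        total += d * d * d;
    return total;
}
```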
3.2 Domain Decomposition and Parallelism

miniGMG executes the V-cycle on a 3D domain. As shown in Figure 3.2, the benchmark creates a global 3D domain and partitions it into subdomains of sizes similar to those found in multigrid solvers in real-world AMR applications such as CHOMBO [35]. Our configuration of miniGMG fixes the domain (problem) size to a 256^3 discretization on each multicore or GPU node, and uses subdomains of size 64^3. Thus at the finest level of the V-cycle, the 256^3 domain is decomposed into a list of 64 boxes or subdomains(1) of size 64^3. We subsequently use the terms subdomains and boxes interchangeably.

(1) miniGMG supports varying the subdomain (box) size; usually AMR applications use sizes from 32^3 to 128^3.

Figure 3.1: The multigrid V-cycle for solving Lu^h = f^h. Note, superscripts denote grid spacing.

Figures 3.2 and 3.3 present the problem sizes, parallel decomposition, and the V-cycle that were used for the research presented here. The global 256^3 domain was decomposed into 64^3 boxes or subdomains at the finest resolution. Smooth (Lu^h = f^h) and the residual (f^h - Lu^h) are computed on these 64^3 boxes, which are then coarsened to 32^3 boxes using the restrict operation (Section 1.1). This sequence is continued until we reach the bottom level of the V-cycle. In our implementation we use a truncated V-cycle where restriction stops at the 4^3 level (the bottom level). Thus our configuration of miniGMG has five levels (64^3, 32^3, 16^3, 8^3, 4^3) for the V-cycle. As this dissertation is focused on optimizing the multigrid V-cycle on a single node, a simple relaxation scheme using the smooth applied on the finer grids is also used at the bottom level. The simple relaxation scheme at the bottom level is sufficient to attain single-node multigrid convergence.(2) After smooths are applied at the coarsest 4^3 level, the boxes are interpolated to finer 8^3 boxes. The sequence of applying smooths and interpolations is continued until we reach the finest 64^3 boxes. At the finest level further smooths are applied and the V-cycle is complete.
The baseline implementation of miniGMG uses the MPI+OpenMP model to express parallelism on traditional multicore architectures. MPI is a library specification for message-passing [46].

(2) miniGMG includes both CG and BiCGStab bottom solvers to enable scalable multigrid implementations. For experiments where we have scaled to a larger number of nodes, the bottom solver has been modified to use BiCGStab.

Figure 3.2: Visualization of the domain/process/subdomain hierarchy in miniGMG (a collection of subdomains owned by an MPI process; one subdomain holds 64^3 elements).

miniGMG decomposes the global domain (256^3) into a list of subdomains which is then partitioned among MPI processes. For example, as illustrated in Figure 3.3, on a machine with four sockets we may have four MPI processes. Each process gets a (256 x 128 x 128) chunk of the global domain containing 16 boxes of size 64^3. Inside an MPI process the list of boxes is processed in parallel; each box is computed by an OpenMP thread. As we go down the V-cycle, moving from larger, finer grids to smaller, coarser ones, the number of boxes remains the same and the work per thread decreases, thus reducing parallelism.

Figure 3.3: Execution of the miniGMG V-cycle. A node is assigned a 256^3 domain which is decomposed into a list of 64^3 subdomains (boxes). The list of subdomains is partitioned among MPI processes. The subdomains owned by each MPI process are then computed in parallel, each box by a single OpenMP thread.

Applying a stencil on the boundary of a structured grid requires points exterior to the grid, as illustrated in Figure 3.4. The figure shows a 2D 5-point stencil being applied to an interior and a boundary point on a 4 x 4 grid. To apply the stencil on the boundary, an extra layer of points called a ghost zone(3) (gray colored points) is required. Thus to compute the illustrated stencil on a 4 x 4 grid, a larger 6 x 6 grid needs to be allocated.

(3) Ghost zones are also known as halo regions.

Figure 3.4: Application of a 2D 5-point stencil on a 4 x 4 2D grid. (a) Shape of the stencil. (b) Application of this stencil to an interior and a boundary point of the grid. The ghost zone is shaded gray.

When a grid is geometrically decomposed or tiled into smaller grid tiles, each grid tile needs a ghost zone. This is illustrated in Figure 3.5, where a 12 x 12 grid is decomposed into 4 x 4 tiles. Each grid tile or subdomain has a ghost zone which is a copy of the boundary points of its neighboring subdomains. This can be seen in the color coding used in the figure. The tile at the center has blue interior points, and the colors of its ghost zone points correspond to the interior grid points of its neighbors.

Figure 3.5: Visualization of ghost zones and data exchange between subdomains. A 12 x 12 domain (grid) is geometrically decomposed into nine 4 x 4 subdomains (tiles). Each subdomain needs a ghost zone which must be exchanged between neighboring tiles after a stencil computation is applied to the entire domain.

After a stencil computation sweeps through the entire domain and updates each point on the grid, the neighboring subdomains must exchange ghost regions to avoid having stale values. The ghost zone points are read but not updated during the stencil computation, and thus their exchange is necessary to ensure correctness. The size of the ghost zone and the data exchange pattern depend on the shape of the stencil used.

The global 3D domain (256^3 at the finest level) corresponds to equally-sized grids. The grids represent the correction, right-hand side, residual, and stencil coefficients, and each grid is stored as a separate array. It is important to note that the grids or arrays corresponding to the global domains are not allocated as contiguous chunks of memory. Instead, miniGMG allocates the subdomains within a level as equally sized grids (arrays), each grid contiguous in memory. This means at the finest level of the V-cycle a list of 64^3 grids is allocated instead of a single 256^3 grid.
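The per-subdomain allocation implied by this scheme is easy to state in code. A small C helper (ours, not miniGMG's API) computes the number of elements allocated for one grid, assuming a ghost zone of a given depth on every face:

```c
// Elements allocated for one subdomain's grid when a ghost zone of depth
// ghost_depth is added on every face: (box_dim + 2*ghost_depth)^3.
// For example, 64^3 boxes with a one-deep ghost zone occupy 66^3 elements.
long grid_alloc_elems(long box_dim, long ghost_depth) {
    long d = box_dim + 2 * ghost_depth;
    return d * d * d;
}
```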
The miniGMG benchmark allocates contiguous memory for the grids and includes the extra memory to support the ghost zones needed by the subdomains. Thus, at each level of the V-cycle, the memory allocated for the grid of a subdomain is (subdomain_size + 2 x ghost_zone)^3. At the finest level, where we use 64^3 subdomains with a one-deep ghost zone, the baseline allocates the 64^3 grids as 66^3 contiguous elements.

The benchmark implements optimized routines which exchange the ghost zones via buffers. The data exchange does not use floating-point operations but still needs significant time. The shape of the stencil determines the data exchange pattern between subdomains, and the data exchange is required after every stencil sweep of the domain is complete.

3.3 V-cycle and Operator Code Skeletons

This section uses code skeletons extracted from the miniGMG benchmark to make the smooth, residual, and restriction/interpolation operations concrete. It starts with code for the V-cycle to highlight the sequence in which these operations and the ghost zone exchanges are invoked.

Lines 1-20 of Listing 3.1 illustrate going down the miniGMG V-cycle, and lines 24-35 show the bottom solve. Code for going back up the V-cycle has been omitted for brevity.

Application of multiple smooths at each level (0 being the finest and NumLevel being the bottom) is shown in lines 2-15. Lines 10-14 show the smooth being applied to the subdomains (boxes) in parallel, and lines 6-7 highlight the ghost zone exchange required prior to and between applications of each smooth. Application of the residual (line 16) is similar to smooth. The residual is also applied in parallel on the boxes, and there is a ghost zone exchange prior to the residual computation.

Unlike smooth, the residual is only applied once. Restriction (line 19) follows the residual; it is computed once and applied in parallel on the boxes. No ghost zone exchange is required prior to applying restriction. Going back up the V-cycle is similar, but the residual is not applied, and restrict is replaced by interpolation. Lines 24-35 show the bottom solve being applied with the ghost zone exchanges between each application. The bottom smooth is applied many more times than the number of smooths at other levels.

 1 for(level=0; level<NumLevel; level++){
 2   for(smooth=0; smooth<NumSmooths; smooth++){
 3
 4     // communication phase...
 5     // the boxes exchange boundaries with neighbors
 6     exchange_boundary_phi();
 7
 8     // apply smooth on each box in parallel
 9     #pragma omp parallel for private(box)
10     for(box=0; box<NumBoxInSubdomain; box++){
11       color = smooth;
12       gsrb_smooth_function(Domain->SubDomain[box], phi, rhs, color);
13     }
14   }
15
16   compute_residual();
17   // restrict to form the coarser and smaller grid,
18   // i.e., going down the V-cycle from a 64^3 grid to a 32^3 grid
19   compute_restriction();
20 } // down
21
22 // bottom solve...
23
24 for(smooth=0; smooth<NumBottomSmooths; smooth++){
25
26   exchange_boundary_phi();
27
28   // apply smooth on each box in parallel
29   #pragma omp parallel for private(box)
30   for(box=0; box<NumBoxInSubdomain; box++){
31     color = smooth;
32     gsrb_smooth_function(Domain->SubDomain[box], phi, rhs, color);
33   }
34 } // bottom solve
35
36 // back up the V-cycle...

Listing 3.1: The miniGMG V-cycle.

3.3.1 Smooth

Smooths using a number of stencils have been optimized in this dissertation. miniGMG has been configured to use smooths with Gauss-Seidel Red-Black (GSRB) and Jacobi iterations of variable-coefficient stencils, and Jacobi iterations of constant-coefficient stencils. This section presents a smooth using a variable-coefficient stencil. Listing 3.2 shows code for a smooth (Lu^h = f^h), where L = a*alpha*I - b*div(beta*grad).

Programmers often wish to maintain flexibility and thus create smooth operators by composing multiple simpler operators, as illustrated in Listing 3.2. The smooth operator calculates the Laplacian, the Helmholtz operator, and a Gauss-Seidel relaxation in sequence. The first loop nest (lines 2-13) calculates div(beta*grad(u)), storing it to a temporary array. This loop executes the variable-coefficient stencil. It reads in points from the grid phi and multiplies them with the appropriate coefficients from beta_i, beta_j, and beta_k. The next loop nest (lines 15-20) updates that temporary by calculating a*alpha*u - b*div(beta*grad(u)).
The final loop nest (lines 22-29) performs the GSRB relaxation using the temporary array.

The same variable-coefficient stencil can be used with Jacobi relaxation. Listings 3.3 and 3.4 compare the loop structures corresponding to GSRB and Jacobi relaxes, respectively. For clarity, the statements for the Laplacian, Helmholtz, and relaxation from Listing 3.2 have been represented by S0, S1, and S2. Jacobi iterations have statements even_S0, even_S1, and even_S2, and odd_S0, odd_S1, and odd_S2. Statement odd_S0 executes the same stencil as even_S0, except that in odd_S0 the array temp is read and phi is updated. Similarly, in odd_S1 and odd_S2, temp and phi are interchanged. Thus, odd-numbered smooth applications with Jacobi relaxations update temp, and even-numbered smooths update phi. The crucial difference between the two relaxation schemes is the complex if-condition used in GSRB (line 26 in Listing 3.2). Jacobi updates every point on the grid and does not have this condition; it has only a simpler if-condition to check whether phi or temp gets updated.

 1 // Laplacian(phi) = b div beta grad phi
 2 for (k=0; k<N; k++)
 3   for (j=0; j<N; j++)
 4     for (i=0; i<N; i++)
 5       // statement S0
 6       temp[k][j][i] = b*h2inv*(
 7          beta_i[k][j][i+1]*( phi[k][j][i+1] - phi[k][j][i] )
 8         -beta_i[k][j][i]  *( phi[k][j][i]   - phi[k][j][i-1] )
 9         +beta_j[k][j+1][i]*( phi[k][j+1][i] - phi[k][j][i] )
10         -beta_j[k][j][i]  *( phi[k][j][i]   - phi[k][j-1][i] )
11         +beta_k[k+1][j][i]*( phi[k+1][j][i] - phi[k][j][i] )
12         -beta_k[k][j][i]  *( phi[k][j][i]   - phi[k-1][j][i] ));
13
14 // Helmholtz(phi) = (a alpha I - laplacian)*phi
15 for (k=0; k<N; k++)
16   for (j=0; j<N; j++)
17     for (i=0; i<N; i++)
18       // statement S1
19       temp[k][j][i] = a*alpha[k][j][i]*phi[k][j][i] - temp[k][j][i];
20
21 // GSRB relaxation: phi = phi - lambda*(helmholtz - rhs)
22 for (k=0; k<N; k++)
23   for (j=0; j<N; j++)
24     for (i=0; i<N; i++){
25       // color is 0 for the Red pass, 1 for Black
26       if((i+j+k+color)%2 == 0)
27         // statement S2
28         phi[k][j][i] = phi[k][j][i] - lambda[k][j][i]*(temp[k][j][i] - rhs[k][j][i]);
29     }

Listing 3.2: Smooth operator with Gauss-Seidel Red-Black relaxations.

for (k=0; k<N; k++)
  for (j=0; j<N; j++)
    for (i=0; i<N; i++)
      // Laplacian
      S0();

for (k=0; k<N; k++)
  for (j=0; j<N; j++)
    for (i=0; i<N; i++)
      // Helmholtz
      S1();

for (k=0; k<N; k++)
  for (j=0; j<N; j++)
    for (i=0; i<N; i++)
      if((i+j+k+color)%2 == 0)
        // GSRB update
        S2();

Listing 3.3: GSRB relaxation.

if(smooth_application%2 == 0){

  for (k=0; k<N; k++)
    for (j=0; j<N; j++)
      for (i=0; i<N; i++)
        even_S0();

  for (k=0; k<N; k++)
    for (j=0; j<N; j++)
      for (i=0; i<N; i++)
        even_S1();

  for (k=0; k<N; k++)
    for (j=0; j<N; j++)
      for (i=0; i<N; i++)
        even_S2();

} else {

  for (k=0; k<N; k++)
    for (j=0; j<N; j++)
      for (i=0; i<N; i++)
        odd_S0();

  for (k=0; k<N; k++)
    for (j=0; j<N; j++)
      for (i=0; i<N; i++)
        odd_S1();

  for (k=0; k<N; k++)
    for (j=0; j<N; j++)
      for (i=0; i<N; i++)
        odd_S2();
}

Listing 3.4: Jacobi relaxation.

3.3.2 Residual

The residual uses the same stencil as smooth. The code is illustrated in Listing 3.5. For brevity of presentation, the residual computation has been shown as one loop nest instead of separate loop nests similar to the baseline smooth. Like smooth, the residual uses a stencil and requires ghost zone data, and thus ghost zones are exchanged before computing the residual. The residual is computed once per multiple applications of smooth and contributes far less to the overall solve time.

3.3.3 Restriction and Interpolation

Restriction and interpolation are integral operations of Geometric Multigrid. The restriction operation is applied when going down the multigrid V-cycle. Restriction takes as input a grid corresponding to the residual and computes a coarser-grained grid from it. The code for restriction is shown in Listing 3.6. The piecewise constant restriction used here is common to finite-volume methods. It is a constant-coefficient stencil which reads in eight points from the input fine-resolution grid, computes an average, and writes it to a coarser output grid. The output grid is half the size of the input grid in each dimension; this leads to the nonunit loop strides, and the indexing of the coarse grid involves a division by the constant coarsening factor.
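The averaging performed by this restriction stencil can be spot-checked in C; the sketch below (ours) applies the 2x2x2 average of the restriction operator to a small constant grid, which restriction must reproduce exactly:

```c
#include <stdbool.h>

// Piecewise-constant restriction: each 2x2x2 block of fine cells is
// averaged into one coarse cell, so a constant fine grid must restrict
// to the same constant. Small 4^3 -> 2^3 sketch.
bool restriction_preserves_constant(double value) {
    enum { F = 4, C = 2 };
    double res[F][F][F], coarse[C][C][C];
    for (int k = 0; k < F; k++)
        for (int j = 0; j < F; j++)
            for (int i = 0; i < F; i++)
                res[k][j][i] = value;
    for (int k = 0; k < F; k += 2)
        for (int j = 0; j < F; j += 2)
            for (int i = 0; i < F; i += 2)
                coarse[k/2][j/2][i/2] = 0.125 * (
                    res[k][j][i]     + res[k][j][i+1]   +
                    res[k][j+1][i]   + res[k][j+1][i+1] +
                    res[k+1][j][i]   + res[k+1][j][i+1] +
                    res[k+1][j+1][i] + res[k+1][j+1][i+1]);
    for (int k = 0; k < C; k++)
        for (int j = 0; j < C; j++)
            for (int i = 0; i < C; i++)
                if (coarse[k][j][i] != value) return false;
    return true;
}
```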
The difference in size between input and output grids for the restriction operation differentiates it from the other stencil operations, which use equal-sized grids. The other feature unique to the stencil used in the restriction operation is that it does not read the ghost zone points of the input fine grid. Thus no ghost zone exchange is required prior to applying the restriction.

Interpolation is a scatter operation which performs the inverse of restriction. It maps a point from a coarse grid to eight points in the fine grid when going up the V-cycle. Interpolation is not an optimization target in this dissertation.

 1 // Input: Residual Operator
 2 for(k=0;k<K;k++)
 3  for(j=0;j<J;j++)
 4   for(i=0;i<I;i++)
 5    // statement S3 : Compute residual
 6    res[k][j][i]=rhs[k][j][i]
 7     - a * alpha[k][j][i] * phi[k][j][i]
 8     + b*h2inv*(
 9       beta_i[k][j][i+1]*( phi[k][j][i+1]-phi[k][j][i] )
10      -beta_i[k][j][i]  *( phi[k][j][i]-phi[k][j][i-1] )
11      +beta_j[k][j+1][i]*( phi[k][j+1][i]-phi[k][j][i] )
12      -beta_j[k][j][i]  *( phi[k][j][i]-phi[k][j-1][i] )
13      +beta_k[k+1][j][i]*( phi[k+1][j][i]-phi[k][j][i] )
14      -beta_k[k][j][i]  *( phi[k][j][i]-phi[k-1][j][i] )
15     );

Listing 3.5: Residual operator.

 1 // Input: Restriction Operator
 2 for(k=0;k<K;k+=2)
 3  for(j=0;j<J;j+=2)
 4   for(i=0;i<I;i+=2)
 5    coarser_res[k/2][j/2][i/2]=0.125 *(
 6     res[k][j][i]+res[k][j][i+1]+
 7     res[k][j+1][i]+res[k][j+1][i+1]+
 8     res[k+1][j][i]+res[k+1][j][i+1]+
 9     res[k+1][j+1][i]+res[k+1][j+1][i+1]
10    );

Listing 3.6: Restriction operator.

3.4 Optimization Challenges

Optimizations for miniGMG need to address three broad performance challenges on current and future architectures:

Reducing data movement.
Stencil computations are commonly memory-bandwidth limited, and thus minimizing data traffic is the single most important factor in optimizing them for modern architectures. Geometric Multigrid uses a sequence of stencil computations, and thus optimizations for GMG must aim to increase locality both for individual stencils and across stencil computations.

Parallel code generation.
With the increasing number of threads on a node, expressing parallelism in the generated code is crucial to getting good performance. Parallel code generation for miniGMG is particularly interesting, as parallelism decreases on descending the V-cycle and threading strategies need to be tailored to adapt to the level of the V-cycle.

Managing floating-point computation.
Smooths in Geometric Multigrid can use compute-intensive stencils, leading to floating-point computations being the bottleneck. In such cases optimizations must aim to reuse computation to reduce floating-point operations. In addition to computation reuse, code generation must efficiently use architectural features such as SIMD units which boost the performance of floating-point operations.

There is considerable interaction between these optimizations; for example, locality-increasing optimizations for miniGMG may increase computation and decrease parallelism. To explore these trade-offs, automatic code-tuning frameworks must compose and apply sequences of optimizations. Generating high-performance code will require searching the space of such compositions to pick the one best suited for a given input stencil and target architecture.

CHAPTER 4

COMMUNICATION-AVOIDING OPTIMIZATIONS

The principal operations in Geometric Multigrid commonly execute less than one floating-point operation for every byte of data. This low arithmetic intensity, coupled with the fact that data movement is far more expensive than floating-point operations, makes the performance of miniGMG memory bandwidth limited. Thus, the key to achieving high performance is to reduce data movement.

In this chapter we introduce the types of data movement in the miniGMG benchmark, followed by the description of optimizations targeting data traffic. The description of the optimizations is followed by the details of their implementation in the compiler. The final part of the chapter presents the performance results and analysis of the generated code.
As the focus of this chapter is on reducing data movement, the variable-coefficient 3D 7-point stencil is used throughout. Variable-coefficient stencils read in many more arrays when compared to constant-coefficient ones, and consequently require higher volumes of data movement. Thus, applying the optimizations developed here to variable-coefficient stencils highlights their efficacy.

4.1 Types of Communication

Data movement in miniGMG can be classified as either horizontal communication or vertical communication:

Vertical Communication
The Geometric Multigrid operators smooth, residual, restriction, and interpolation (Listings 3.2, 3.5 and 3.6) are all three-deep loop nests which sweep over 3D grids. Each 3D grid is stored as an array, and sweeping or streaming through them generates data movement through the memory hierarchy. This memory traffic generated by each application of an operator is termed vertical communication.

Horizontal Communication
Sections 3.1 to 3.3 describe how miniGMG decomposes the global domain into subdomains which are processed in parallel using threads (OpenMP) and processes (MPI). After an operation such as a smooth is applied to the subdomains, a ghost zone exchange between neighboring subdomains is required. This data movement between threads and processes to update the ghost zones is called horizontal communication.

4.2 Communication-Avoiding Optimizations

Vertical communication is required for each operator application as it sweeps through 3D grids, and horizontal communication is required between operator applications to update ghost zones. This chapter presents optimizations to reduce both types of data movement. Table 4.1 lists the optimizations and the type of data movement they target. The optimizations are presented in the context of miniGMG used with a variable-coefficient smooth and residual, Listings 3.2 and 3.5, respectively. The variable-coefficient stencils used in these operations have a very low arithmetic intensity and highlight the data-movement optimizations.
4.2.1 Fusing Components of Smooth

Programmers often wish to maintain flexibility and thus create smooth operators by composing multiple simpler operators, as illustrated in Listings 3.2 and 3.4. The smooth operator calculates the Laplacian, Helmholtz, and either a Gauss-Seidel or Jacobi relaxation in sequence. Each of these simpler operators is a loop nest which sweeps through the grids. The finest level of the miniGMG V-cycle has a 256³ domain per node decomposed into 64³ subdomains. Each subdomain stores the grids for phi, temp, alpha, beta_i, beta_j, beta_k, lambda and rhs as 66³ arrays; the net memory requirement of these arrays is more than 1 GB of memory and does not fit into last-level caches. Since the finer grids do not fit into the last-level cache, each loop nest or grid sweep results in data movement between DRAM and the last-level cache.

Table 4.1: Classification of the optimizations described in this chapter as reducing vertical communication, horizontal communication, or both. Parallel code generation has been left out as it is not a communication-avoiding optimization but helps improve wavefront, which reduces vertical communication.

Optimization                           | Vertical | Horizontal
Smooth Operator Fusion                 |    X     |     -
Deep Ghost Zones                       |    -     |     X
Wavefront                              |    X     |     -
Residual-Restriction Fusion            |    X     |     -
Smooth-Residual-Restriction Wavefront  |    X     |     X

The three separate loop nests in smooth generate high DRAM traffic, as the grids are streamed into cache three times. To reduce this data movement, the compiler fuses the multiple smooth operators together. Fusion is itself a vertical communication-avoiding optimization, since the results computed by one operator will remain in cache when used as input by the next operator.

The output of fusion for the GSRB and Jacobi smooths is outlined in Listings 4.1 and 4.2, respectively. Loop fusion for the GSRB smooth requires the if-condition that previously guarded just the GSRB update statement S2() to now guard the execution of all three statements S0(), S1(), and S2().

An additional communication-avoiding optimization for the GSRB smooth is to replace the array temp with a scalar and not write it back to memory on completion.
The first two statements write to temp, and the last statement uses the value written to update phi. Replacing temp with a scalar saves the vertical communication associated with accessing temp. This is not possible in the Jacobi smooth, where temp is used across the two (k,j,i) loop nests.

 1 for (k=0;k<N;k++){
 2  for (j=0;j<N;j++){
 3   for (i=0;i<N;i++){
 4
 5    if((i+j+k+color)%2==0){
 6
 7     S0(k,j,i); /* Laplacian */
 8     S1(k,j,i); /* Helmholtz */
 9     S2(k,j,i); /* GSRB relaxation */
10
11    } /* end if */
12 }}}

Listing 4.1: Fused GSRB smooth.

 1 if(sweep%2==0){
 2
 3  for (k=0;k<N;k++){
 4   for (j=0;j<N;j++){
 5    for (i=0;i<N;i++){
 6
 7     even_S0(k,j,i); /* Laplacian */
 8     even_S1(k,j,i); /* Helmholtz */
 9     even_S2(k,j,i); /* Jacobi relaxation */
10  }}}
11
12 } else if(sweep%2==1){
13
14  for (k=0;k<N;k++){
15   for (j=0;j<N;j++){
16    for (i=0;i<N;i++){
17
18     odd_S0(k,j,i); /* Laplacian */
19     odd_S1(k,j,i); /* Helmholtz */
20     odd_S2(k,j,i); /* Jacobi relaxation */
21  }}}
22
23 } /* end if */

Listing 4.2: Fused Jacobi smooth.

4.2.2 Deep Ghost Zones

Smooth contains a stencil computation which requires ghost zones. Stencil computations performed on the boundary of the subdomains read these ghost zone points but do not compute values to update them. Thus, after a stencil computation sweeps through the grids, the ghost zone values become stale and they need to be updated by data exchange with neighboring subdomains. Deeper ghost zones reduce ghost zone exchanges by performing redundant computations to update ghost zones. This is illustrated in Figure 4.1, where a 2D 5-point stencil is applied to a 4x4 grid with a two-deep ghost zone. The first stencil application computes values for a 6x6 grid (blue points are updated); the following stencil sweep uses the 6x6 grid to compute the values for the final 4x4 grid (red points). The second stencil sweep was possible without a prior data exchange because the first sweep updated a 6x6 grid, which meant the ghost zone used to compute the 4x4 grid had updated values. The blue points shown in Figure 4.1(b) are computed redundantly, but they reduce horizontal communication, with each consecutive stencil sweep working on a smaller grid. Thus, deeper ghost zones
reduce the frequency of horizontal communication.

Deeper ghost zones exchange fewer messages but change the pattern of communication between neighboring subdomains. Figure 4.2 shows the communication pattern for a two-deep ghost zone used by a 2D 5-point stencil. When compared to Figure 3.5, it is evident that a larger message corresponding to the deeper ghost zone must be exchanged. In addition to the larger messages, the number of neighbors involved also increases, as corner points in the ghost zone also need to be updated.

Figure 4.1: Two applications of a 2D 5-point stencil on a 4x4 grid with a two-deep ghost zone. (a) The stencil is first applied on an 8x8 grid to compute the values for the 6x6 grid (blue points). The outer layer of grid points shaded grey are read and used as the ghost zone. (b) The second stencil sweep uses the 6x6 grid to compute the output of the 4x4 grid (red points). For the second stencil sweep, the blue points are used as the ghost zone.

Figure 4.2: A deeper ghost zone changes the communication pattern and volume between a subdomain and its neighbors. A one-deep ghost zone requires exchange with the left, right, top, and bottom neighbors. An additional layer of ghost zone now requires exchange with the neighbors on the corners as well. In addition, a larger volume of data must be exchanged.

The output code for GSRB and Jacobi smooths with deeper ghost zones is shown in Listings 4.3 and 4.4, respectively. A new t-loop has been added to apply smooth multiple times. The loop bounds for the k-, j-, and i-loops are now functions of t to ensure that with each application of smooth the region of the grid which is computed shrinks in all dimensions and uses only valid data. In addition to the loop bounds, the if-condition used in GSRB also has to include the loop index t, since consecutive smooths update either the red or the black points.

 1 /* d = ghost zone depth */
 2 for (t=0;t<d;t++){
 3  for (k=t-(d-1);k<N+(d-1)-t;k++){
 4   for (j=t-(d-1);j<N+(d-1)-t;j++){
 5    for (i=t-(d-1);i<N+(d-1)-t;i++){
 6
 7     if((i+j+k+t+color)%2==0){
 8
 9      S0(t,k,j,i); /* Laplacian */
10      S1(t,k,j,i); /* Helmholtz */
11      S2(t,k,j,i); /* GSRB relaxation */
12
13     } /* end if */
14 }}}}

Listing 4.3: Fused GSRB smooth with overlapped ghost zones.

 1 /* d = ghost zone depth */
 2 for (t=0;t<d;t++){
 3
 4  if(t%2==0){
 5
 6   for (k=t-(d-1);k<N+(d-1)-t;k++){
 7    for (j=t-(d-1);j<N+(d-1)-t;j++){
 8     for (i=t-(d-1);i<N+(d-1)-t;i++){
 9
10      even_S0(t,k,j,i); /* Laplacian */
11      even_S1(t,k,j,i); /* Helmholtz */
12      even_S2(t,k,j,i); /* Jacobi relaxation */
13    }}}
14
15  } else if(t%2==1){
16
17   for (k=t-(d-1);k<N+(d-1)-t;k++){
18    for (j=t-(d-1);j<N+(d-1)-t;j++){
19     for (i=t-(d-1);i<N+(d-1)-t;i++){
20
21      odd_S0(t,k,j,i); /* Laplacian */
22      odd_S1(t,k,j,i); /* Helmholtz */
23      odd_S2(t,k,j,i); /* Jacobi relaxation */
24    }}}
25
26 }} /* end if */ /* end t */

Listing 4.4: Fused Jacobi smooth with overlapped ghost zones.

Deeper ghost zones reduce horizontal communication at the expense of redundant computation. As one goes down the V-cycle, the coarser grids have much-reduced computation and cannot afford redundant computation. Thus ghost zone depth must be tuned for each level of the V-cycle to find the optimum balance.

4.2.3 Wavefront Computation

Deep ghost zones allow multiple smooths to be applied before a ghost zone exchange. As each smooth application is a grid sweep, the multiple grid sweeps generate high vertical communication between the caches and DRAM. A wavefront computation is used to reduce this vertical communication. A wavefront fuses multiple grid sweeps into one, thereby reducing DRAM traffic.

Wavefront computation is explained in terms of the GSRB smooth, which uses a 3D 7-point stencil, and then extended to the Jacobi smooth. The following discussion assumes that a four-deep ghost zone is used, which means four GSRB smooths are applied: red (R1), black (B1), red (R2), and finally a black (B2) sweep. GSRB partitions the grid into red and black points, where a red point has only black neighbors and vice-versa. A red sweep updates red points and a black sweep updates black ones.

A sweep through a 3D N x N x N grid can be visualized as streaming through N (k = 0,...,N-1) 2D N x N ij-planes.
A wavefront computation fuses the four grid sweeps R1-B1-R2-B2 into one by computing values for a set of four ij-planes at a time. This is illustrated in Figure 4.3(a), which shows the cross section of a 3D grid and illustrates how a wavefront works on ij-planes. The first red sweep, R1, on plane k = z + 3 is computed, followed by B1 on plane k = z + 2, R2 on k = z + 1, and B2 on plane k = z. At this stage, plane k = z has gone through all four sweeps, and the wavefront now performs the same sequence on planes k = z + 4 to k = z + 1, as shown in Figure 4.3(b). An ij-plane goes through all four sweeps R1, B1, R2, and B2 as the wavefront progresses through it, and in one sweep of the grid, four smooths are applied. The wavefront described here processes four planes, and is thus termed four deep.

The Jacobi smooth uses an out-of-place stencil computation where every odd-numbered smooth application reads phi and updates temp, and every even-numbered application reads temp and updates phi. Figure 4.4 illustrates a four-deep wavefront for the Jacobi smooth. The wavefront needs a four-deep ghost zone and fuses four smooths into one. The figure highlights the ping-pong between temp and phi.

Jacobi iterations, like GSRB, use a 3D 7-point stencil, and thus read in three ij-planes (top, center, and bottom) and output a single ij-plane. In the wavefront shown in Figure 4.4, the first plane output is updated to temp. Plane k = z + 7 of temp is computed using planes k = z + 8 (top), k = z + 7 (center), and the bottom

Figure 4.3: Progress of the GSRB wavefront.
Figure 4.4: Jacobi wavefront.

plane k = z + 6 of phi. The next step of the wavefront updates phi and computes plane k = z + 5 of phi using planes k = z + 6, k = z + 5, and k = z + 4 from temp. The next two updates proceed similarly, as shown in the figure.

In the wavefront for Jacobi iterations, there is a difference of 2 between the planes that are updated; for example, in Figure 4.4 planes z + 7, z + 5, z + 3, and finally z + 1 are updated in sequence. The GSRB wavefront, on the other hand, updates 4 consecutive planes: z + 3, z + 2, z + 1, and z. This crucial difference between wavefronts for GSRB and Jacobi iterations arises from data dependences and means that many more planes need to be read and held in memory for Jacobi iterations, leading to a larger working set.

Code Listings 4.5 and 4.6 show the skeleton code for the four-deep GSRB and Jacobi wavefronts applied on a 64³ box, respectively. It is important to note that in both code fragments the k- and t-loops have been interchanged (or permuted). The time step t-loop, which was previously outermost and was responsible for applying multiple grid sweeps, is now nested inside the outermost k-loop. This means the k-dimension is scanned or iterated through only once and corresponds to a single grid sweep. The two innermost j- and i-loops are not changed, as we do not modify how points in an ij-plane are updated/traversed.

The modified indexing used in the statements (S0, S1, ...) in the two code listings determines the order in which the ij-planes are updated in the wavefront computations.
The indexing of the statements prior to creating a wavefront was of the form (t,k,j,i), i.e., S0(t,k,j,i) (Listing 4.3). In the GSRB wavefront the indexing changes to (t,k-t,j,i), and for Jacobi it is (t,k-2*t,j,i). The modified array index expression for the k-dimension means that at each iteration of the t-loop, the ij-plane that is updated is one (GSRB) or two (Jacobi) planes lower than the one updated in the previous iteration of t. As seen previously, for four iterations of the t-loop (from 0 to 3), this translates to planes k = z + 3, k = z + 2, k = z + 1, and k = z getting updated for GSRB. And for Jacobi, these are planes k = z + 7, k = z + 5, k = z + 3, and k = z + 1.

 1 for (k=-3;k<=66;k++){
 2  for (t=0;t<=min(3,intFloor(k+3,2));t++){
 3   for (j=t-3;j<=-t+66;j++){
 4    for (i=t-3+intMod(-k-color-j-(t-3),2);i<=-t+66;i+=2)
 5    {
 6     S0(t,k-t,j,i); /* Laplacian */
 7     S1(t,k-t,j,i); /* Helmholtz */
 8     S2(t,k-t,j,i); /* GSRB */
 9 }}}}

Listing 4.5: GSRB wavefront for a 64³ box with a four-deep ghost zone.

 1 for (k=-3;k<=70;k++){
 2  for (t=max(0,k-66);t<=min(3,intFloor(k+3,3));t++){
 3   if(t%2==0){
 4    for (j=t-3;j<=66-t;j++){
 5     for (i=-3;i<=66;i++)
 6     {
 7      even_S0(t,k-2*t,j,i); /* Laplacian */
 8      even_S1(t,k-2*t,j,i); /* Helmholtz */
 9      even_S2(t,k-2*t,j,i); /* Jacobi relaxation */
10     }
11   }}
12   if(t%2==1){
13    for (j=t-3;j<=66-t;j++){
14     for (i=-3;i<=66;i++)
15     {
16      odd_S0(t,k-2*t,j,i); /* Laplacian */
17      odd_S1(t,k-2*t,j,i); /* Helmholtz */
18      odd_S2(t,k-2*t,j,i); /* Jacobi relaxation */
19     }
20    }
21   }
22 }}

Listing 4.6: Jacobi wavefront for a 64³ box with a four-deep ghost zone.

4.2.4 Parallel Code Generation

Wavefronts hold multiple planes in memory, thus increasing the working set. This may lead to spilling out of the faster caches (L1/L2). We generate nested multithreaded code via OpenMP to share planes across threads and reduce the working set per thread.

Figure 4.5 illustrates how an ij-plane in a box gets shared between a number of OpenMP threads for the GSRB smooth with a four-deep wavefront. The four threads tile the iteration space of the j-loop. Each thread processes the four R-B-R-B planes of the wavefront before needing to synchronize.
Code Listings 4.7 and 4.8 show skeleton codes for a four-deep GSRB and Jacobi wavefront applied on a 64³ box, respectively. In Listing 4.5 there is a single thread processing the box, and in Listing 4.7 there are 12 OpenMP threads collaboratively processing the box. The j-loop in Listing 4.5, which performed 70 iterations (-3 to 66), has been tiled, and each tile is assigned to a thread. With 12 threads, each thread gets ceil(70/12) = 6 iterations. Line 16 shows where the threads working on a box need to synchronize. The synchronization point is after the completion of the t-loop. This means that threads can process all four sweeps R-B-R-B before synchronizing.

The OpenMP barrier in line 16 of Listing 4.7 means that all the threads working on a box wait for all other threads to finish processing a set of 4 planes before proceeding. This is expensive and not required in this case, as each thread only needs to sync with its immediate neighbors. For example, in Figure 4.5 thread 2 needs to sync with thread 1 and thread 3. To ameliorate the effect of the expensive barrier, we followed the strategy of expert manual tuners in [32] and generated code to implement spin locks. An array of locks sized by the number of collaborating threads was created, and each thread only waited on its two neighbors. Spin locks improve performance, but unfortunately this is not a portable approach and breaks programming (OpenMP) abstractions.

Collaborative threading with multiple threads per box for the Jacobi wavefront is shown in Listing 4.8. The generated code is simpler, but the threading is more expensive. This is because, due to data dependences, the threads must synchronize after each plane is processed. This is more expensive than GSRB, where multiple planes can be processed before synchronization. For the Jacobi, the OpenMP parallel for directive was used rather than creating an OpenMP parallel region, as in the case of GSRB.

Figure 4.5: Multiple threads working collaboratively to process a subdomain/box.

Generating code where multiple threads process a box creates three strategies for thread decomposition. As illustrated in Figure 4.6, we can have interbox parallelism, nested parallelism, and intrabox parallelism. Each box is processed by a single thread in interbox parallelism, and in intrabox parallelism a single box is processed with all threads working on it. Nested parallelism has multiple boxes with multiple threads working inside each box and leverages nested parallelism in OpenMP. As shown in Figure 4.6, on a system with six threads we can have six boxes being processed in parallel by a single thread each, or x boxes being processed with y threads in them such that x*y = 6, or a single box with all 6 threads working on it. Larger boxes have a bigger working set than the smaller boxes down the V-cycle, which suggests that the system should assign more threads per box for the larger grids and fewer threads for the smaller grids; ultimately the thread distribution is optimized via autotuning.

 1 #pragma omp parallel private(...) num_threads(12)
 2 {
 3  tid=omp_get_thread_num();
 4
 5  for (k=-3;k<=66;k++){
 6   for (t=0;t<=min(3,intFloor(k+3,2));t++){
 7    for (j=6*tid-3;j<=min(6*tid+2,66);j++){
 8     for (i=t-3+intMod(-k-color-j-(t-3),2);i<=-t+66;i+=2)
 9     {
10      S0(t,k-t,j,i); /* Laplacian */
11      S1(t,k-t,j,i); /* Helmholtz */
12      S2(t,k-t,j,i); /* GSRB */
13     }
14    }
15   }
16 #pragma omp barrier /* (or explicit locks) */
17  }
18 }

Listing 4.7: Threaded GSRB wavefront for a 64³ box with a four-deep ghost zone.
 1 #pragma omp parallel private(...) num_threads(3)
 2
 3 for (k=-3;k<=70;k++){
 4  for (t=max(0,k-66);t<=min(3,intFloor(k+3,3));t++){
 5   if(t%2==0){
 6
 7 #pragma omp for
 8    for (j=t-3;j<=66-t;j++){
 9     for (i=-3;i<=66;i++)
10     {
11      even_S0(t,k-2*t,j,i); /* Laplacian */
12      even_S1(t,k-2*t,j,i); /* Helmholtz */
13      even_S2(t,k-2*t,j,i); /* Jacobi relaxation */
14     }
15    }
16   }
17   if(t%2==1){
18
19 #pragma omp for
20    for (j=t-3;j<=66-t;j++){
21     for (i=-3;i<=66;i++)
22     {
23      odd_S0(t,k-2*t,j,i); /* Laplacian */
24      odd_S1(t,k-2*t,j,i); /* Helmholtz */
25      odd_S2(t,k-2*t,j,i); /* Jacobi relaxation */
26     }
27    }
28   }
29
30  }
31 }

Listing 4.8: Threaded Jacobi wavefront for a 64³ box with a four-deep ghost zone.

Figure 4.6: Example parallel decompositions on Hopper, which has 6 cores per socket. All the boxes in a subdomain may be processed in parallel, or all the threads may work on one box collaboratively, or nested parallelism may be used. Thread configuration <X,Y> denotes X boxes with Y threads per box: interbox parallelism is <6,1>, nested parallelism <2,3>, and intrabox parallelism <1,6>.

4.2.5 Residual-Restriction Fusion

Wavefront computation, introduced in Section 4.2.3, fuses multiple sweeps of a box/grid into one and reduces vertical communication. We extend the same strategy of fusing multiple grid sweeps into one by fusing the residual and restriction computations into one sweep. Unfortunately, there exists a data dependence between residual (Listing 3.5) and restriction (Listing 4.9) which prevents this fusion. The dependence arises because every iteration of the triply nested ijk-loop of residual updates a single point on the output (finer) grid. Restriction needs to read 8 points from the finer grid and restrict them to a single output point of the coarser grid. If the loops were naively fused, restriction would read points on the input finer grid before they were correctly updated.
To break this data dependence, a novel compiler transformation was designed which converts the restriction stencil, an 8-point out-of-place gather, into an accumulation, enabling restriction to be fused with the preceding operators. Listing 4.10 shows the output of this transformation. In line 12 of the output code, each point from the input grid is read, multiplied by the coefficient, and scattered to the correct output point in the coarse grid. Instead of needing to read 8 points, restriction now reads a single point from the fine grid, and thus the data dependence is broken. Since we are generating an accumulation, care must be taken to zero out the values in the output grid points before accumulating to them; this is performed in line 6 of Listing 4.10.

The output of the residual-restriction fusion is shown in Listing 4.11. Line 6 zeroes out planes of the coarse grid before accumulating. The nested for-loop (lines 10-14) computes the residual (S3) and immediately scatters it to the correct point in the output coarse grid. Thus residual-restriction now completes in one grid sweep performed by the common outermost k-loop.

4.2.6 Smooth-Residual-Restriction Wavefront

We extend the wavefront strategy further to create an even deeper wavefront from fused smooths, residual, and restriction. This fuses six grid sweeps (4 smooths, residual, and restriction) into one sweep. To eliminate the ghost zone exchange required prior to the residual computation, we have to increase the ghost zone depth to five from the previously used four. The deeper ghost zone reduces horizontal communication further. The wavefront created then reduces vertical communication. The deeper wavefront means a larger working set. The working set is managed by generating nested parallel code.
The smooth-residual-restriction wavefront is illustrated in Figure 4.7. The cross-section of a box shows a wavefront that is six planes deep. The first four planes compute 2 GSRB sweeps, the fifth plane computes the residual, and the last plane computes the restriction and writes to the coarser output grid.

Listing 4.12 illustrates the code generated for this wavefront for the GSRB smooth applied to a 64³ box with a five-deep ghost zone. There are three threads working collaboratively inside the box, and they synchronize using spin locks (lines 36-38). The code illustrates that the k-loop was skewed against the time t-loop, and then they were permuted, making k the outer loop and giving a single grid sweep (in the k-dimension). The smooths (lines 9-16) are followed by initializing a plane of the output coarser grid (lines 18-23), and then the computation of residual and restriction

 1 /* Input: Restriction Operation */
 2 for(k=0;k<K;k+=2)
 3  for(j=0;j<J;j+=2)
 4   for(i=0;i<I;i+=2)
 5    coarser_res[k/2][j/2][i/2]=0.125 *(
 6     res[k][j][i]+res[k][j][i+1]+
 7     res[k][j+1][i]+res[k][j+1][i+1]+
 8     res[k+1][j][i]+res[k+1][j][i+1]+
 9     res[k+1][j+1][i]+res[k+1][j+1][i+1]
10    );

Listing 4.9: Restriction operation.

 1 /* Output: Restriction as a Scatter Operation */
 2 for(k=0;k<K;k+=2)
 3  for(j=0;j<J;j+=2)
 4   for(i=0;i<I;i+=2)
 5    /* statement S4 : Initialize coarse_res */
 6    coarser_res[k/2][j/2][i/2]=0;
 7
 8 for(k=0;k<K;k++)
 9  for(j=0;j<J;j++)
10   for(i=0;i<I;i++)
11    /* statement S5 : Restrict fine_res to coarse_res */
12    coarser_res[k/2][j/2][i/2]+=0.125* res[k][j][i];

Listing 4.10: Restriction as an accumulation.
 1 /* Output: Restriction as a Scatter Operation */
 2 for(k=0;k<K;k++){
 3  if(k%2==0){
 4   for(j=0;j<J;j+=2){
 5    for(i=0;i<I;i+=2){
 6     S4(); /* statement S4 : Initialize coarse_res */
 7   }}
 8  }
 9
10  for(j=0;j<J;j++){
11   for(i=0;i<I;i++){
12    S3(); /* Compute residual */
13    S5(); /* statement S5 : Restrict fine_res to coarse_res */
14  }}
15
16 } /* End K */

Listing 4.11: Fused residual-restriction.

Figure 4.7: Wavefront applying smooths, residual, and restriction. A cross section of the N³ fine grid is shared by four threads; the six wavefront planes perform the red smooth, black smooth, second red and black smooths, the residual, and the restriction into the (N/2)³ coarse grid.

 1 #pragma omp parallel private(...) \
 2   shared(locks) num_threads(3)
 3 {
 4  tid=omp_get_thread_num();
 5  num_threads=omp_get_num_threads();
 6  left=max(tid-1,0);
 7  right=min(tid+1,num_threads-1);
 8
 9  for(k=-4;k<=67;k++){
10   for(t=0;t<=min(3,intFloor(k+4,2));t++){
11    for(j=24*tid-4;j<=24*tid+19;j++){
12     for(i=-4+intMod(-k-color-j-(t-4),2);i<=67;i+=2){
13      S0(t,k-t,j,i); /* Laplacian */
14      S1(t,k-t,j,i); /* Helmholtz */
15      S2(t,k-t,j,i); /* GSRB */
16    }}} /* End t,j,i */
17
18    if(4<=k&&intMod(k,2)==0){
19     for(j=max(24*tid-4,0);
20       j<=min(62,24*tid+18);j+=2){
21      for(i=0;i<=62;i+=2){
22       S4(t,k-t,j,i); /* Initialize coarse_res */
23    }}} /* End if */
24
25    if(4<=k){
26     for(j=max(24*tid-4,0);
27       j<=min(62,24*tid+18);j++){
28      for(i=0;i<=63;i++){
29       S3(t,k-t,j,i); /* Compute residual */
30       S5(t,k-t,j,i); /* Restrict fine_res to coarse_res */
31    }}} /* End if */
32
33    /* After computing the 5-deep wavefront */
34    /* Threads sync with their right and left neighbors */
35
36    locks[tid]=k;
37    if(left!=tid){ while(locks[left]<k) pause(); }
38    if(right!=tid){ while(locks[right]<k) pause(); }
39
40  } /* End k */
41
42 } /* End OMP region */

Listing 4.12: Simplified generated code for a threaded wavefront with GSRB, residual, and restriction fused. The code is specialized for a 64³ box, with a five-deep ghost zone and 3 threads working inside a box.

to the coarser grid (lines 25-31). The threads synchronize after completing multiple smooths, the residual, and the restriction. This deep wavefront is applied when going down the V-cycle. On the way back up, residual and restriction are replaced by the interpolation operation. Hence we use a four-deep wavefront going back up the V-cycle, but the ghost zone depth is still fixed at five, resulting in excess communication.

In a similar manner, wavefront computations can be generated for Jacobi-style stencils, but Jacobi stencils present additional challenges to the memory system. Jacobi reads and writes to different arrays, leading to an even larger working set. Collaborative threading for Jacobi stencils is less effective, since due to de
| Reference URL | https://collections.lib.utah.edu/ark:/87278/s6dj8q0g |



