Compiler optimizations and autotuning for stencils and geometric multigrid

Title Compiler optimizations and autotuning for stencils and geometric multigrid
Publication Type dissertation
School or College College of Engineering
Department Computing
Author Basu, Protonu
Date 2016
Description Stencil computations are operations on structured grids. They are frequently found in partial differential equation solvers, making their performance critical to a range of scientific applications. On modern architectures where data movement costs dominate computation, optimizing stencil computations is a challenging task. Typically, domain scientists must reduce and orchestrate data movement to tackle the memory bandwidth and latency bottlenecks. Furthermore, optimized code must map efficiently to ever-increasing parallelism on a chip. This dissertation studies several stencils with varying arithmetic intensities, thus requiring contrasting optimization strategies. Stencils traditionally have low arithmetic intensity, making their performance limited by memory bandwidth. Contemporary higher-order stencils are designed to require smaller grids, hence less memory, but are bound by increased floating-point operations. This dissertation develops communication-avoiding optimizations to reduce data movement in memory-bound stencils. For higher-order stencils, a novel transformation, partial sums, is designed to reduce the number of floating-point operations and improve register reuse. These optimizations are implemented in a compiler framework, which is further extended to generate parallel code targeting multicores and graphics processing units (GPUs). The augmented compiler framework is then combined with autotuning to productively address stencil optimization challenges. Autotuning explores a search space of possible implementations of a computation to find the optimal code for an execution context. In this dissertation, autotuning is used to compose sequences of optimizations to drive the augmented compiler framework. This compiler-directed autotuning approach is used to optimize stencils in the context of a linear solver, Geometric Multigrid (GMG).
GMG uses sequences of stencil computations and presents greater optimization challenges than isolated stencils, as interactions between stencils must also be considered. The efficacy of our approach is demonstrated by comparing the performance of generated code against manually tuned code, against commercial compiler-generated code, and against analytic performance bounds. Generated code outperforms manually optimized code on multicores and GPUs. Against Intel's compiler on multicores, generated code achieves up to 4x speedup for stencils, and 3x for the solver. On GPUs, generated code achieves 80% of an analytically computed performance bound.
Type Text
Publisher University of Utah
Subject Autotuning; Code Generation; Compilers; Geometric Multigrid; Parallel Programming; Stencil Computations
Dissertation Name Doctor of Philosophy in Computer Science
Language eng
Rights Management ©Protonu Basu
Format application/pdf
Format Medium application/pdf
Format Extent 27,183 bytes
Identifier etd3/id/4084
ARK ark:/87278/s6dj8q0g
Setname ir_etd
ID 197634
Reference URL https://collections.lib.utah.edu/ark:/87278/s6dj8q0g