Opportunities for near data computing in MapReduce workloads

Publication Type dissertation
School or College College of Engineering
Department Computing
Author Pugsley, Seth Hintze
Title Opportunities for near data computing in MapReduce workloads
Date 2015-05
Description In-memory big data applications, including in-memory versions of the MapReduce framework, are growing in popularity. The move away from disk-based datasets shifts the performance bottleneck from slow disk accesses to memory bandwidth. MapReduce is data-parallel, and is therefore amenable to execution on as many parallel processors as possible, with each processor requiring high memory bandwidth. We propose using Near Data Computing (NDC) as a means to develop systems that are optimized for in-memory MapReduce workloads, offering high compute parallelism and even higher memory bandwidth. This dissertation explores three different implementations and styles of NDC to improve MapReduce execution. First, we use 3D-stacked memory+logic devices to process the Map phase on compute elements in close proximity to database splits. Second, we attempt to replicate the performance characteristics of 3D-stacked NDC using only commodity memory and inexpensive processors, improving the performance of both the Map and Reduce phases. Finally, we incorporate fixed-function hardware accelerators to improve sorting performance within the Map phase. This dissertation shows that in-memory MapReduce performance can potentially be improved by two orders of magnitude by designing system and memory architectures specifically tailored to that end.
Type Text
Publisher University of Utah
Subject Big data; Computer architecture; Hardware accelerators; Memory systems
Dissertation Institution University of Utah
Dissertation Name Doctor of Philosophy
Language eng
Rights Management Copyright © Seth Hintze Pugsley 2015
Format Medium application/pdf
Format Extent 2,596,108 Bytes
Identifier etd3/id/3521
ARK ark:/87278/s6m07dpp
Setname ir_etd
ID 197074
Reference URL https://collections.lib.utah.edu/ark:/87278/s6m07dpp