Opportunities for near data computing in mapreduce workloads

Title	Opportunities for near data computing in mapreduce workloads
Publication Type	dissertation
School or College	College of Engineering
Department	Computing
Author	Pugsley, Seth Hintze
Date	2015-05
Description	In-memory big data applications are growing in popularity, including in-memory versions of the MapReduce framework. The move away from disk-based datasets shifts the performance bottleneck from slow disk accesses to memory bandwidth. MapReduce is a data-parallel application, and is therefore amenable to being executed on as many parallel processors as possible, with each processor requiring high amounts of memory bandwidth. We propose using Near Data Computing (NDC) as a means to develop systems that are optimized for in-memory MapReduce workloads, offering high compute parallelism and even higher memory bandwidth. This dissertation explores three different implementations and styles of NDC to improve MapReduce execution. First, we use 3D-stacked memory+logic devices to process the Map phase on compute elements in close proximity to database splits. Second, we attempt to replicate the performance characteristics of the 3D-stacked NDC using only commodity memory and inexpensive processors to improve performance of both Map and Reduce phases. Finally, we incorporate fixed-function hardware accelerators to improve sorting performance within the Map phase. This dissertation shows that it is possible to improve in-memory MapReduce performance by potentially two orders of magnitude by designing system and memory architectures that are specifically tailored to that end.
Type	Text
Publisher	University of Utah
Subject	Big data; Computer architecture; Hardware accelerators; Memory systems
Dissertation Institution	University of Utah
Dissertation Name	Doctor of Philosophy
Language	eng
Rights Management	Copyright © Seth Hintze Pugsley 2015
Format	application/pdf
Format Medium	application/pdf
Format Extent	2,596,108 Bytes
Identifier	etd3/id/3521
ARK	ark:/87278/s6m07dpp
Setname	ir_etd
ID	197074
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6m07dpp
