Publication Type |
pre-print |
School or College |
College of Engineering |
Department |
Computing, School of |
Creator |
Balasubramonian, Rajeev |
Other Author |
Udipi, Aniruddha N.; Muralimanohar, Naveen; Davis, Al; Jouppi, Norman P. |
Title |
LOT-ECC: LOcalized and tiered reliability mechanisms for commodity memory systems |
Date |
2012-01-01 |
Description |
Memory system reliability is a serious and growing concern in modern servers. Existing chipkill-level memory protection mechanisms suffer from several drawbacks. They activate a large number of chips on every memory access - this increases energy consumption, and reduces performance due to the reduction in rank-level parallelism. Additionally, they increase access granularity, resulting in wasted bandwidth in the absence of sufficient access locality. They also restrict systems to use narrow-I/O x4 devices, which are known to be less energy-efficient than the wider x8 DRAM devices. In this paper, we present LOT-ECC, a localized and multi-tiered protection scheme that attempts to solve these problems. We separate error detection and error correction functionality, and employ simple checksum and parity codes effectively to provide strong fault-tolerance, while simultaneously simplifying implementation. Data and codes are localized to the same DRAM row to improve access efficiency. We use system firmware to store correction codes in DRAM data memory and modify the memory controller to handle data mapping. We thus build an effective fault-tolerance mechanism that provides strong reliability guarantees, activates as few chips as possible (reducing power consumption by up to 44.8% and reducing latency by up to 46.9%), and reduces circuit complexity, all while working with commodity DRAMs and operating systems. Finally, we propose the novel concept of a heterogeneous DIMM that enables the extension of LOT-ECC to x16 and wider DRAM parts. |
Type |
Text |
Publisher |
Institute of Electrical and Electronics Engineers (IEEE) |
First Page |
285 |
Last Page |
296 |
Dissertation Institution |
University of Utah |
Language |
eng |
Bibliographic Citation |
Udipi, A. N., Muralimanohar, N., Balasubramonian, R., Davis, A., & Jouppi, N. P. (2012). LOT-ECC: LOcalized and tiered reliability mechanisms for commodity memory systems. Proceedings - International Symposium on Computer Architecture, no. 6237025, 285-96. |
Rights Management |
(c)2012 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. |
Format Medium |
application/pdf |
Format Extent |
415,196 bytes |
Identifier |
uspace,17708 |
ARK |
ark:/87278/s6k93s9k |
Setname |
ir_uspace |
ID |
708077 |
Reference URL |
https://collections.lib.utah.edu/ark:/87278/s6k93s9k |