|
|
Creator | Title | Description | Subject | Date |
1 |
|
Balasubramonian, Rajeev | Quantifying the relationship between the power delivery network and architectural policies in a 3D-stacked memory device | Many of the pins on a modern chip are used for power delivery. If fewer pins were used to supply the same current, the wires and pins used for power delivery would have to carry larger currents over longer distances. This results in an "IR-drop" problem, where some of the voltage is dropped across t... | | 2013-01-01 |
2 |
|
Balasubramonian, Rajeev | Non-uniform power access in large caches with low-swing wires | Modern processors dedicate more than half their chip area to large L2 and L3 caches and these caches contribute significantly to the total processor power. A large cache is typically split into multiple banks and these banks are either connected through a bus (uniform cache access - UCA) or an on-c... | Large caches; Low-swing wires; Non Uniform Cache Access; NUCA | 2009 |
3 |
|
Swope, Steven M. | Simon II kernel reference manual | The principal objective of Simon II is to provide a flexible and adaptable framework for constructing simulators for a wide variety of parallel systems. A simulator consists of a set of software building blocks. Each building block, i.e. object, simulates a specific component of the parallel system... | Simon II; Kernels; Simulators; Parallel systems | 1986 |
4 |
|
Balasubramonian, Rajeev | Reducing the complexity of the register file in dynamic superscalar processors | Dynamic superscalar processors execute multiple instructions out-of-order by looking for independent operations within a large window. The number of physical registers within the processor has a direct impact on the size of this window as most in-flight instructions require a new physical register a... | Dynamic superscalar processors; Register file; Instruction-level parallelism; Microarchitecture; Reorder buffer | 2001 |
5 |
|
Regehr, John | Precise garbage collection for C | Magpie is a source-to-source transformation for C programs that enables precise garbage collection, where precise means that integers are not confused with pointers, and the liveness of a pointer is apparent at the source level. Precise GC is primarily useful for long-running programs and programs t... | | 2009-01-01 |
6 |
|
Balasubramonian, Rajeev | Towards scalable, energy-efficient, bus-based on-chip networks | It is expected that future on-chip networks for many-core processors will impose huge overheads in terms of energy, delay, complexity, verification effort, and area. There is a common belief that the bandwidth necessary for future applications can only be provided by employing packet-switched netwo... | On-chip networks; Multi-core computing; Bus-based; Energy efficient | 2010 |
7 |
|
Eide, Eric Norman | Static and dynamic structure in design patterns | Design patterns are a valuable mechanism for emphasizing structure, capturing design expertise, and facilitating restructuring of software systems. Patterns are typically applied in the context of an object-oriented language and are implemented so that the pattern participants correspond to obje... | Design patterns; Static structure; Dynamic structure | 2001-11-01 |
8 |
|
Hansen, Charles D. | Fast stereoscopic images with ray-traced volume rendering | One of the drawbacks of standard volume rendering techniques is that it is often difficult to comprehend the three-dimensional structure of the volume from a single frame; this is especially true in cases where there is no solid surface. Generally, several frames must be generated and viewed sequent... | Volume rendering; Ray tracing; Stereoscopic images; Reprojection | 1994 |
9 |
|
Hansen, Charles D. | Data distributed, parallel algorithm for ray-traced volume rendering | This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5, and networked workstations. This algorithm distributes both the data and the computations to individ... | Volume rendering; Ray tracing; Computer algorithms; Scientific visualization; Network computing; Massively parallel processing | 1993 |
10 |
|
Evans, David | Graphical man/machine communications: December 1971 | Semi-Annual Technical Report for period 1 June 1971 to 31 December 1971. This document includes a summary of research activities and facilities at the University of Utah under Contract F30602-70-C-0300. Information conveys important research milestones attained during this period by each of the f... | Man/machine communications; Computing systems; Digital waveform processing | 1971-12 |
11 |
|
Brunvand, Erik L. | Low latency self-timed flow-through FIFOs | Self-timed flow-through FIFOs are constructed easily using only a single C-element as control for each stage of the FIFO. Throughput can be very high in this type of FIFO as the communication required to send new data to the FIFO is local to only the first element of the FIFO. Circuit density can ... | | 1995 |
12 |
|
Brunvand, Erik L. | Reduced latency self-timed FIFO circuits | Self-timed flow-through FIFOs are constructed easily using only a single C-element as control for each stage of the FIFO. Throughput can be very high in FIFOs of this type because new data can be sent to the FIFO after communicating locally with only the first element of the FIFO. Therefore the thro... | FIFO circuits; Self-timed; Flow-through; Reduced latency | 1994 |
13 |
|
Hansen, Charles D. | Data distributed, parallel algorithm for ray-traced volume rendering | This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5, and networked workstations. This algorithm distributes both the data and the computations to individ... | Volume rendering; Ray tracing; Computer algorithms; Scientific visualization; Network computing; Massively parallel processing | 1993 |
14 |
|
Hansen, Charles D.; Wald, Ingo | Interactive isosurface ray tracing of time-varying tetrahedral volumes | We describe a system for interactively rendering isosurfaces of tetrahedral finite-element scalar fields using coherent ray tracing techniques on the CPU. By employing state-of-the-art methods in polygonal ray tracing, namely aggressive packet/frustum traversal of a bounding volume hierarc... | | 2007-11 |
15 |
|
Brunvand, Erik L. | Estimating performance of a ray-tracing ASIC design | Recursive ray tracing is a powerful rendering technique used to compute realistic images by simulating the global light transport in a scene. Algorithmic improvements and FPGA-based hardware implementations of ray tracing have demonstrated realtime performance but hardware that achieves performance ... | | 2006 |
16 |
|
Balasubramonian, Rajeev | Dynamic memory hierarchy performance optimization | Although microprocessor performance continues to increase at a rapid pace, the growing processor-memory speed gap threatens to limit future performance gains. In this paper, we propose a novel configurable cache and TLB as an alternative to conventional two-level hierarchies. This organization le... | Microprocessor performance; Processor-memory speed gap | 2000 |
17 |
|
Balasubramonian, Rajeev | Dynamically allocating processor resources between nearby and distant ILP | Modern superscalar processors use wide instruction issue widths and out-of-order execution in order to increase instruction-level parallelism (ILP). Because instructions must be committed in order so as to guarantee precise exceptions, increasing ILP implies increasing the sizes of structures s... | Instruction-level parallelism; Microarchitecture; Primary thread; Future thread; Instruction reuse buffer | 2001 |
18 |
|
Parker, Steven G.; Hansen, Charles D. | Distributed interactive ray tracing for large volume visualization | We have constructed a distributed parallel ray tracing system that interactively produces isosurface renderings from large data sets on a cluster of commodity PCs. The program was derived from the SCI Institute's interactive ray tracer (*-Ray), which utilizes small to large shared memory platforms, ... | Ray tracing; Volume rendering; Large data; Cluster computing; Distributed shared memory | 2003 |
19 |
|
Awate, Suyash P.; Whitaker, Ross T. | Higher-order image statistics for unsupervised, information-theoretic, adaptive, image filtering | The restoration of images is an important and widely studied problem in computer vision and image processing. Various image filtering strategies have been effective, but invariably make strong assumptions about the properties of the signal and/or degradation. Therefore, these methods typically la... | Image filtering; Image restoration | 2005-04-15 |
20 |
|
Hansen, Charles D.; Wald, Ingo | Interactive ray tracing of arbitrary implicits with SIMD interval arithmetic | We present a practical and efficient algorithm for interactively ray tracing arbitrary implicit surfaces. We use interval arithmetic (IA) both for robust root computation and guaranteed detection of topological features. In conjunction with ray tracing, this allows for rendering literally any progr... | | 2007-09 |
21 |
|
Gopalakrishnan, Ganesh | Some unusual micropipeline circuits | We present a few unusual Micropipelines (Sutherland, CACM, September 1989) that employ the Muller C-ELEMENT or an extension of the C-ELEMENT called LOCKC (Liebchen and Gopalakrishnan, ICCD, 1992). We first describe two variations of the two-dimensional Micropipeline structure realized using ordinary... | Micropipeline circuits; Micropipelines | 1993 |
22 |
|
Regehr, John | Evolving real-time systems using hierarchical scheduling and concurrency analysis | We have developed a new way to look at real-time and embedded software: as a collection of execution environments created by a hierarchy of schedulers. Common schedulers include those that run interrupts, bottom-half handlers, threads, and events. We have created algorithms for deriving response tim... | | 2003-01-01 |
23 |
|
Balasubramonian, Rajeev | Staged reads: mitigating the impact of DRAM writes on DRAM reads | Main memory latencies have always been a concern for system performance. Given that reads are on the critical path for CPU progress, reads must be prioritized over writes. However, writes must be eventually processed and they often delay pending reads. In fact, a single channel in the main memory ... | | 2012-01-01 |
24 |
|
Hansen, Charles D. | Parallel volume rendering using binary-swap compositing | Existing volume rendering methods, though capable of very effective visualizations, are computationally intensive and therefore fail to achieve interactive rendering rates for large data sets. Although computing technology continues to advance, computer processing power never seems to catch up to th... | | 1994 |
25 |
|
Venkatasubramanian, Suresh | Approximate Bregman near neighbors in sublinear time: beyond the triangle inequality | Bregman divergences are important distance measures that are used extensively in data-driven applications such as computer vision, text mining, and speech processing, and are a key focus of interest in machine learning. Answering nearest neighbor (NN) queries under these measures is very important i... | | 2012-01-01 |