1 - 25 of 80
No. | Creator | Title | Description | Subject | Date
1 | Balasubramonian, Rajeev | Quantifying the relationship between the power delivery network and architectural policies in a 3D-stacked memory device | Many of the pins on a modern chip are used for power delivery. If fewer pins were used to supply the same current, the wires and pins used for power delivery would have to carry larger currents over longer distances. This results in an "IR-drop" problem, where some of the voltage is dropped across t... | | 2013-01-01
2 | Balasubramonian, Rajeev | Non-uniform power access in large caches with low-swing wires | Modern processors dedicate more than half their chip area to large L2 and L3 caches and these caches contribute significantly to the total processor power. A large cache is typically split into multiple banks and these banks are either connected through a bus (uniform cache access - UCA) or an on-c... | Large caches; Low-swing wires; Non Uniform Cache Access; NUCA | 2009
3 | Swope, Steven M. | Simon II kernel reference manual | The principal objective of Simon II is to provide a flexible and adaptable framework for constructing simulators for a wide variety of parallel systems. A simulator consists of a set of software building blocks. Each building block, i.e. object, simulates a specific component of the parallel system... | Simon II; Kernels; Simulators; Parallel systems | 1986
4 | Balasubramonian, Rajeev | Reducing the complexity of the register file in dynamic superscalar processors | Dynamic superscalar processors execute multiple instructions out-of-order by looking for independent operations within a large window. The number of physical registers within the processor has a direct impact on the size of this window as most in-flight instructions require a new physical register a... | Dynamic superscalar processors; Register file; Instruction-level parallelism; Microarchitecture; Reorder buffer | 2001
5 | Regehr, John | Precise garbage collection for C | Magpie is a source-to-source transformation for C programs that enables precise garbage collection, where precise means that integers are not confused with pointers, and the liveness of a pointer is apparent at the source level. Precise GC is primarily useful for long-running programs and programs t... | | 2009-01-01
6 | Balasubramonian, Rajeev | Towards scalable, energy-efficient, bus-based on-chip networks | It is expected that future on-chip networks for many-core processors will impose huge overheads in terms of energy, delay, complexity, verification effort, and area. There is a common belief that the bandwidth necessary for future applications can only be provided by employing packet-switched netwo... | On-chip networks; Multi-core computing; Bus-based; Energy efficient | 2010
7 | Eide, Eric Norman | Static and dynamic structure in design patterns | Design patterns are a valuable mechanism for emphasizing structure, capturing design expertise, and facilitating restructuring of software systems. Patterns are typically applied in the context of an object-oriented language and are implemented so that the pattern participants correspond to obje... | Design patterns; Static structure; Dynamic structure | 2001-11-01
8 | Hansen, Charles D. | Fast stereoscopic images with ray-traced volume rendering | One of the drawbacks of standard volume rendering techniques is that it is often difficult to comprehend the three-dimensional structure of the volume from a single frame; this is especially true in cases where there is no solid surface. Generally, several frames must be generated and viewed sequent... | Volume rendering; Ray tracing; Stereoscopic images; Reprojection | 1994
9 | Hansen, Charles D. | Data distributed, parallel algorithm for ray-traced volume rendering | This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5, and networked workstations. This algorithm distributes both the data and the computations to individ... | Volume rendering; Ray tracing; Computer algorithms; Scientific visualization; Network computing; Massively parallel processing | 1993
10 | Evans, David | Graphical man/machine communications: December 1971 | Semi-Annual Technical Report for period 1 June 1971 to 31 December 1971. This document includes a summary of research activities and facilities at the University of Utah under Contract F30602-70-C-0300. Information conveys important research milestones attained during this period by each of the f... | Man/machine communications; Computing systems; Digital waveform processing | 1971-12
11 | Brunvand, Erik L. | Low latency self-timed flow-through FIFOs | Self-timed flow-through FIFOs are constructed easily using only a single C-element as control for each stage of the FIFO. Throughput can be very high in this type of FIFO as the communication required to send new data to the FIFO is local to only the first element of the FIFO. Circuit density can ... | | 1995
12 | Brunvand, Erik L. | Reduced latency self-timed FIFO circuits | Self-timed flow-through FIFOs are constructed easily using only a single C-element as control for each stage of the FIFO. Throughput can be very high in FIFOs of this type because new data can be sent to the FIFO after communicating locally with only the first element of the FIFO. Therefore the thro... | FIFO circuits; Self-timed; Flow-through; Reduced latency | 1994
13 | Hansen, Charles D. | Data distributed, parallel algorithm for ray-traced volume rendering | This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5, and networked workstations. This algorithm distributes both the data and the computations to individ... | Volume rendering; Ray tracing; Computer algorithms; Scientific visualization; Network computing; Massively parallel processing | 1993
14 | Hansen, Charles D.; Wald, Ingo | Interactive isosurface ray tracing of time-varying tetrahedral volumes | We describe a system for interactively rendering isosurfaces of tetrahedral finite-element scalar fields using coherent ray tracing techniques on the CPU. By employing state-of-the-art methods in polygonal ray tracing, namely aggressive packet/frustum traversal of a bounding volume hierarc... | | 2007-11
15 | Brunvand, Erik L. | Estimating performance of a ray-tracing ASIC design | Recursive ray tracing is a powerful rendering technique used to compute realistic images by simulating the global light transport in a scene. Algorithmic improvements and FPGA-based hardware implementations of ray tracing have demonstrated realtime performance but hardware that achieves performance ... | | 2006
16 | Balasubramonian, Rajeev | Dynamic memory hierarchy performance optimization | Although microprocessor performance continues to increase at a rapid pace, the growing processor-memory speed gap threatens to limit future performance gains. In this paper, we propose a novel configurable cache and TLB as an alternative to conventional two-level hierarchies. This organization le... | Microprocessor performance; Processor-memory speed gap | 2000
17 | Balasubramonian, Rajeev | Dynamically allocating processor resources between nearby and distant ILP | Modern superscalar processors use wide instruction issue widths and out-of-order execution in order to increase instruction-level parallelism (ILP). Because instructions must be committed in order so as to guarantee precise exceptions, increasing ILP implies increasing the sizes of structures s... | Instruction-level parallelism; Microarchitecture; Primary thread; Future thread; Instruction reuse buffer | 2001
18 | Parker, Steven G.; Hansen, Charles D. | Distributed interactive ray tracing for large volume visualization | We have constructed a distributed parallel ray tracing system that interactively produces isosurface renderings from large data sets on a cluster of commodity PCs. The program was derived from the SCI Institute's interactive ray tracer (*-Ray), which utilizes small to large shared memory platforms, ... | Ray tracing; Volume rendering; Large data; Cluster computing; Distributed shared memory | 2003
19 | Awate, Suyash P.; Whitaker, Ross T. | Higher-order image statistics for unsupervised, information-theoretic, adaptive, image filtering | The restoration of images is an important and widely studied problem in computer vision and image processing. Various image filtering strategies have been effective, but invariably make strong assumptions about the properties of the signal and/or degradation. Therefore, these methods typically la... | Image filtering; Image restoration | 2005-04-15
20 | Hansen, Charles D.; Wald, Ingo | Interactive ray tracing of arbitrary implicits with SIMD interval arithmetic | We present a practical and efficient algorithm for interactively ray tracing arbitrary implicit surfaces. We use interval arithmetic (IA) both for robust root computation and guaranteed detection of topological features. In conjunction with ray tracing, this allows for rendering literally any progr... | | 2007-09
21 | Gopalakrishnan, Ganesh | Some unusual micropipeline circuits | We present a few unusual Micropipelines (Sutherland, CACM, September 1989) that employ the Muller C-ELEMENT or an extension of the C-ELEMENT called LOCKC (Liebchen and Gopalakrishnan, ICCD, 1992). We first describe two variations of the two-dimensional Micropipeline structure realized using ordinary... | Micropipeline circuits; Micropipelines | 1993
22 | Regehr, John | Evolving real-time systems using hierarchical scheduling and concurrency analysis | We have developed a new way to look at real-time and embedded software: as a collection of execution environments created by a hierarchy of schedulers. Common schedulers include those that run interrupts, bottom-half handlers, threads, and events. We have created algorithms for deriving response tim... | | 2003-01-01
23 | Balasubramonian, Rajeev | Staged reads: mitigating the impact of DRAM writes on DRAM reads | Main memory latencies have always been a concern for system performance. Given that reads are on the critical path for CPU progress, reads must be prioritized over writes. However, writes must be eventually processed and they often delay pending reads. In fact, a single channel in the main memory ... | | 2012-01-01
24 | Hansen, Charles D. | Parallel volume rendering using binary-swap compositing | Existing volume rendering methods, though capable of very effective visualizations, are computationally intensive and therefore fail to achieve interactive rendering rates for large data sets. Although computing technology continues to advance, computer processing power never seems to catch up to th... | | 1994
25 | Venkatasubramanian, Suresh | Approximate Bregman near neighbors in sublinear time: beyond the triangle inequality | Bregman divergences are important distance measures that are used extensively in data-driven applications such as computer vision, text mining, and speech processing, and are a key focus of interest in machine learning. Answering nearest neighbor (NN) queries under these measures is very important i... | | 2012-01-01