Publication Type	technical report
School or College	College of Engineering
Department	Computing, School of
Creator	Tinker, Peter A.
Title	The design and implementation of an OR-parallel logic programming system
Date	1987-08
Description	The research focus in OR-parallel logic programming is shifting rapidly from theoretical considerations and simulation on uniprocessors to implementation on true multiprocessors. This dissertation describes the design and implementation of such a system for OR-parallel Horn clause logic programs on the BBN Butterfly Parallel Processor.
Type	Text
Publisher	University of Utah
Subject	OR-parallel; computers
Subject LCSH	Logic programming; Parallel processing (Electronic computers); Multiprocessors
Language	eng
Bibliographic Citation	Tinker, P. A. (1987). The design and implementation of an OR-parallel logic programming system.
Series	University of Utah Computer Science Technical Report
Relation is Part of	ARPANET
Format Medium	application/pdf
Format Extent	187,545 Bytes
File Name	Tinker-The_Design_and_Implementation.pdf
Conversion Specifications	Original scanned with Kirtas 2400 and saved as 400 ppi uncompressed TIFF. PDF generated by Adobe Acrobat Pro X for CONTENTdm display
ARK	ark:/87278/s6f78cn2
Setname	ir_computersa
ID	95418
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6f78cn2

Page Metadata

Title	Page 52
Setname	ir_computersa
ID	95319
OCR Text	ditions, with the remaining switch capacity dissipated through operations at the source and destination nodes. The performance is summarized in Figure 7, which shows only the part of the curve that changes rapidly. Performance improves nearly linearly thereafter.
While there is a substantial difference in the speed with which the PNC can perform accesses to local memory compared with remote memory, this difference is small when compared with that for loosely-coupled architectures such as hypercubes. Benchmarks indicate that a local 16-bit fetch to a 68020 register takes about 1.35 microseconds and a remote fetch takes 6.3 microseconds. Thus, while a penalty is paid for remote accesses, the ratio of times for remote and local accesses is not very large. Another way of viewing this situation is that local accesses are rather slow, but remote accesses are not very much slower. In particular, remote accesses are not slow enough that it is worthwhile to perform a context switch between processes on the source node while a process waits for a remote memory operation to complete. Instead, all 16-bit fetches are performed synchronously. It is important to note that instruction fetches (which are always local, although this need not be the case, architecturally) take as long as any other local fetches.
If contiguous words of memory are to be accessed, block transfers can be used to move data at the rate of about 19.4 milliseconds for a 64-kilobyte block. Block transfers become more efficient than single-word transfers if the size of the data exceeds about 100 bytes. Block transfer requests take precedence over single-word transfers and, unlike single-word transfers, can proceed asynchronously with the operation of the CPU. These benchmark results are shown in Figure 8. Only the low end of the curve is shown; again, performance improves roughly linearly thereafter.
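The benchmark figures quoted above can be combined into a rough cost model. The sketch below is illustrative only and is not taken from the report: it assumes that block-transfer time is a fixed setup cost plus a per-byte cost, and it fits that straight line through the two points the text gives (the 64-kilobyte block time and the approximate 100-byte break-even with single-word remote fetches). The function names and the linear model itself are assumptions.

```python
# Figures quoted in the text.
LOCAL_FETCH_US = 1.35         # local 16-bit fetch, microseconds
REMOTE_FETCH_US = 6.3         # remote 16-bit fetch, microseconds
BLOCK_BYTES = 64 * 1024       # 64-kilobyte block
BLOCK_TIME_US = 19.4 * 1000   # 19.4 milliseconds for that block
BREAK_EVEN_BYTES = 100        # approximate crossover quoted in the text
WORD_BYTES = 2                # one 16-bit fetch moves two bytes


def remote_to_local_ratio():
    """Ratio of remote to local 16-bit fetch time."""
    return REMOTE_FETCH_US / LOCAL_FETCH_US


def single_word_time_us(nbytes):
    """Time to move nbytes using synchronous remote 16-bit fetches."""
    words = -(-nbytes // WORD_BYTES)  # ceiling division
    return words * REMOTE_FETCH_US


def fit_block_model():
    """Return (setup_us, us_per_byte) for a hypothetical linear model.

    ASSUMPTION (not from the report): block time = setup + rate * nbytes,
    with the line passing through the break-even point and the
    64-kilobyte measurement.
    """
    t_small = single_word_time_us(BREAK_EVEN_BYTES)
    us_per_byte = (BLOCK_TIME_US - t_small) / (BLOCK_BYTES - BREAK_EVEN_BYTES)
    setup_us = t_small - us_per_byte * BREAK_EVEN_BYTES
    return setup_us, us_per_byte


def block_time_us(nbytes):
    """Estimated block-transfer time under the assumed linear model."""
    setup_us, us_per_byte = fit_block_model()
    return setup_us + us_per_byte * nbytes


if __name__ == "__main__":
    setup_us, us_per_byte = fit_block_model()
    print(f"remote/local ratio: {remote_to_local_ratio():.2f}")
    print(f"implied setup cost: {setup_us:.1f} us, rate: {us_per_byte:.3f} us/byte")
    for n in (50, 100, 200, 1024):
        print(n, single_word_time_us(n), round(block_time_us(n), 1))
```

Under these assumptions the remote-to-local ratio is about 4.7, and the fitted setup cost is what makes single-word fetches cheaper below roughly 100 bytes. The setup figure is only an inference from the quoted numbers, not a measured value.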
Many Butterfly systems are configured with multiple paths through the switch between each pair of nodes. A 16-node machine, for example, may be paired with
*These figures are the result of timing 4000 consecutive in-line 16-bit fetch instructions to a machine register. As with all subsequent benchmark figures presented here, these instructions were executed with interrupts disabled, and instruction-fetch time is included in the timing results.
**This figure was obtained by executing in-line block-transfer code with interrupts disabled.
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6f78cn2/95319