show the graph (i.e., the parallel computing tree) for solving a DRA problem; (2) to generate and communicate all the parallel data patterns during each DRA iteration without actually moving any of them; and (3) to accomplish requirements 1 and 2 without paying too great a price. In what follows, we present the design techniques that accomplish these three requirements.

6.4.2 The Architectural Wavefront

Definition 6.5 (Weiser-Davis [205]) A wavefront, denoted as A, represents an ordered set of data elements a(1,t), a(2,t), ..., a(n-1,t), a(n,t), where t is the time subscript. The elements a(i,t), for all t, belong to the i-th data stream. For simplicity, the time subscript in the elements of a wavefront is omitted, and a(i,t) is simply written as a(i).

The above definition describes a case in which an intermediate set of solution results can be decomposed into an ordered set of uniform, homogeneous subsolutions at different time instances. Each subsolution set, i.e., each wavefront, can be generated on a static computing architecture, provided that all the required data are input into the system and stay at the right places.

Although this kind of data decomposability holds for a linear problem on a synchronous computing architecture, it seems infeasible to route a 4-dimensional, large volume of data on a static DRA computing architecture within a few clock cycles in order to generate a wavefront of an ordered data set. Could this bottleneck be removed by analyzing and applying architecture decomposability? In this line of thought, if there is a sequence of architectures, each of which is capable of performing one stage of the computation (assuming the computation can be decomposed into a plurality of stages), these different architectures can be combined in sequence
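The wavefront idea in Definition 6.5 can be sketched concretely. The example below is our own illustration, not code from the text: it takes a grid computation in which each cell depends on its north and west neighbours, so all cells on the anti-diagonal i + j = t are mutually independent and together form the wavefront at time t; each wavefront could then be evaluated in parallel on a static array once the previous wavefront's results are in place.

```python
# Illustrative sketch of wavefront decomposition (names and the toy
# recurrence are hypothetical, chosen only to show the ordering).

def wavefronts(n):
    """Yield each wavefront as the ordered list of cells with i + j = t."""
    for t in range(2 * n - 1):
        yield [(i, t - i) for i in range(n) if 0 <= t - i < n]

def compute(n):
    """Fill an n x n grid one wavefront at a time. Within a single
    wavefront, every cell is independent of the others, so all of its
    cells could be computed simultaneously on a static architecture."""
    a = [[0] * n for _ in range(n)]
    for front in wavefronts(n):
        for i, j in front:
            north = a[i - 1][j] if i > 0 else 0
            west = a[i][j - 1] if j > 0 else 0
            a[i][j] = north + west + 1  # toy recurrence with N/W dependence
    return a

print(compute(3))  # each anti-diagonal was produced as one wavefront
```

Within each wavefront the inner loop over `front` carries no dependences, which is exactly the property that lets a synchronous array generate one wavefront per step once the data are routed to the right places.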