shown in Figure 2.1 as the token ring. The model size was chosen to maintain the same load on every computing node; thus the maximum model size was 180x180x180 grid points. For the purpose of calculating speedup and parallel efficiency, it was not possible to fit the 180x180x180 model onto one machine to obtain the sequential performance figures. Instead, those were calculated indirectly by using a 30x180x180 model on one machine and extrapolating the numbers to the larger size. The model itself was a simple homogeneous halfspace with a P-wave velocity of 3000 m/s, an S-wave velocity of 1730 m/s, and a density of 2500 kg/m3; the corresponding Poisson's ratio is 0.25. Speedup and parallel efficiency were calculated using the equations of Section 1.3, and the megaflops calculations took into account the time required for minimal I/O operations to obtain the results of the calculations. The minimal result I/O was assumed to be a single seismic section recorded parallel to the longest dimension on the surface of the model, in addition to a single snapshot taken during the calculations.

Tables 3.1, 3.2, 3.3, and 3.4 summarize some of the results obtained using the FDDI token ring network. The first two tables show the results obtained with a model of 30x180x180 grid points per node, and the last two correspond to a model of 60x180x90 grid points per node. Even though the total data volume is the same in each case, the two cases differ in the NYxNZ panel size, which determines the message size that must be exchanged between neighboring nodes for the interslice boundary computation. In the first case the panel size is 129.6 kbytes, for a total transferred data volume of 4665.6 kbytes per node per iteration, whereas in the second case the panel size is 64.8 kbytes, for a total transferred data volume of 2332.8 kbytes per node per iteration. This translates into a smaller communication time, more megaflops, and a better parallel efficiency rating for the case with the smaller message size.

It is also worth noting that the statistics for the cases with three or fewer computing nodes exhibit somewhat poor scalability and consistency, whereas the statistics for the configurations with four or more computing nodes are more consistent and show a good scalability trend. This behavior may be attributed to two factors: first, the sequential performance numbers were obtained by extrapolation from one node simulating a model
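
The equations of Section 1.3 are not reproduced here, but the derivation of the performance figures above can be sketched in a few lines, assuming the usual definitions S(p) = T1/Tp and E(p) = S(p)/p and a linear extrapolation of the 30x180x180 single-node run to the full 180x180x180 model (a factor of 180/30 = 6 in the slice dimension). All timing values in the sketch are placeholders, not measured results; the measured numbers live in Tables 3.1 through 3.4.

    #include <stdio.h>

    int main(void)
    {
        /* Placeholder timings; the actual measurements are in Tables 3.1-3.4. */
        double t_small = 100.0;   /* time for the 30x180x180 run on one node   */
        double t_par   = 110.0;   /* time for 180x180x180 on p computing nodes */
        int    p       = 6;       /* number of computing nodes                 */

        double t_seq   = t_small * (180.0 / 30.0);  /* extrapolated sequential time */
        double speedup = t_seq / t_par;             /* S(p) = T1 / Tp               */
        double eff     = speedup / (double)p;       /* E(p) = S(p) / p              */

        printf("T_seq = %.1f s, S(%d) = %.2f, E(%d) = %.1f%%\n",
               t_seq, p, speedup, p, 100.0 * eff);
        return 0;
    }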
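
The panel and transfer-volume figures quoted above can likewise be reproduced by simple arithmetic. The sketch below assumes 4-byte grid values and 1000-byte kbytes (consistent with 180x180x4 = 129.6 kbytes and 180x90x4 = 64.8 kbytes), and treats the ratio of the quoted totals to the panel sizes (36 in both cases) as the implied number of panel transfers per node per iteration; the Poisson's ratio check for the stated velocities is included as well.

    #include <stdio.h>

    int main(void)
    {
        const double bytes_per_point = 4.0;    /* assumed single-precision grid values */
        const double kb              = 1000.0; /* the text quotes 1000-byte kbytes     */

        /* NY x NZ interslice panels for the two per-node model shapes. */
        double panel_a = 180.0 * 180.0 * bytes_per_point / kb; /* 30x180x180 case */
        double panel_b = 180.0 *  90.0 * bytes_per_point / kb; /* 60x180x90 case  */

        /* The quoted totals of 4665.6 and 2332.8 kbytes per node per iteration
           correspond to 36 panel transfers in either case.                     */
        printf("panel = %.1f kB, total = %.1f kB per node per iteration\n",
               panel_a, 36.0 * panel_a);
        printf("panel = %.1f kB, total = %.1f kB per node per iteration\n",
               panel_b, 36.0 * panel_b);

        /* Poisson's ratio for Vp = 3000 m/s, Vs = 1730 m/s:
           nu = (Vp^2 - 2*Vs^2) / (2*(Vp^2 - Vs^2)), which is approximately 0.25. */
        double vp = 3000.0, vs = 1730.0;
        double nu = (vp * vp - 2.0 * vs * vs) / (2.0 * (vp * vp - vs * vs));
        printf("Poisson's ratio = %.3f\n", nu);
        return 0;
    }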