| OCR Text |
5.6 Estimating the Delays The worst case for this stage of the multiplier is as follows. The carry propagate addition (CPA) takes its worst-case time. Initially, the overflow bit is reset and we go through the mux. In the worst case, however, this choice of the final result may not be correct, so we have to go through another two gate delays (equivalent to one mux delay in DCFL) followed by another pass through the mux. This is followed by another mux delay in the shifter. Thus there are about four mux delays after the CPA in the worst case. At 25°C and the tt process corner, the delay of a mux is about 200ps (one inverter and 100f load). After layout, etc., this may be about 300ps. From the previous section, the delay for the CPA is 2.7ns. Four mux delays of about 300ps each add about 1.2ns, so in the worst case the total delay would be about 3.9ns. To allow for a margin of error and for delays through the control logic, this delay can be rounded up to 7ns. Thus, the worst-case time for the final stage of the multiplier is expected to be 7ns at the tt process corner and 25°C. Also note that, even in the best case, there will be two mux delays after the CPA. Thus, a bundled implementation will not be significantly worse. 5.7 Summary In this chapter, some techniques for implementing round to nearest/even, the default rounding mode of the IEEE standard, were presented. It was shown how the rounding could be merged with the final CPA, saving area and delay, and how it could be implemented with an iterative architecture. This stage of the multiplier saw a possible use of the precharged circuits discussed in Chapter 3 of this thesis. The worst-case delay of this stage was expected to be about 7ns at the tt process corner and 25°C. Using the results of Chapter 4 and this chapter, the total latency for a single multiply will be about 24ns at the tt process corner and 25°C. |
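The delay estimate of Section 5.6 can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: the function and constant names are hypothetical, the CPA and mux figures are the ones quoted in the text, and the best-case figure assumes the same post-layout mux delay of about 300ps, which the text does not state explicitly.

```python
# Delay estimate for the final multiplier stage, using the figures quoted in the text.
# All names here are illustrative and not taken from the thesis.
CPA_DELAY_NS = 2.7       # carry propagate adder delay (from the previous section)
MUX_DELAY_NS = 0.3       # post-layout mux delay estimate (about 0.2 ns before layout)
WORST_CASE_MUXES = 4     # mux, two gate delays (one mux equivalent), mux, shifter mux
BEST_CASE_MUXES = 2      # the text notes two mux delays even in the best case


def stage_delay_ns(mux_count, mux_delay_ns=MUX_DELAY_NS, cpa_delay_ns=CPA_DELAY_NS):
    """Return the CPA delay plus mux_count mux delays, in nanoseconds."""
    return cpa_delay_ns + mux_count * mux_delay_ns


worst = stage_delay_ns(WORST_CASE_MUXES)
# Assumption: the best case uses the same per-mux delay as the worst case.
best = stage_delay_ns(BEST_CASE_MUXES)
print(f"worst case: {worst:.1f} ns, best case: {best:.1f} ns")
```

Under these assumptions the best and worst cases differ by only two mux delays (about 0.6ns), which is consistent with the remark that a bundled implementation would not be significantly worse. The 7ns figure used in the text is the 3.9ns estimate plus a margin for error and control logic, and is not derived by this calculation.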