Path: utzoo!attcan!uunet!husc6!uwvax!oddjob!ncar!noao!nud!tom
From: tom@nud.UUCP (Tom Armistead)
Newsgroups: comp.arch
Subject: Re: RISC machines and scoreboarding
Message-ID: <1110@nud.UUCP>
Date: 29 Jun 88 18:23:09 GMT
References: <1082@nud.UUCP> <2438@winchester.mips.COM> <1098@nud.UUCP> <2459@gumby.mips.COM>
Reply-To: tom@nud.UUCP (Tom Armistead)
Organization: Motorola Microcomputer Division, Tempe, Az.
Lines: 34

In article <2459@gumby.mips.COM> earl@mips.COM (Earl Killian) writes:
>But my question in all this is why did the 88000 choose to fully
>pipeline floating point and thus allow such a large number of pending
>results? I understand why the 6600 and its successors did it, but the
>same analysis for the 88000 suggests it is unnecessary. You don't
>have the cache bandwidth to make fp pipelining useful even on large
>highly vectorizable problems (i.e. 32b per cycle isn't enough). You
>can't feed the fp pipeline fast enough.

Assuming the FP operands are not in cache, this is true. However, there
will be some class of problems which can make effective use of the FP
pipelining and assuming that FP pipelining has no bad side effects (see
paragraph below), it only makes sense to provide the feature.

>The price you paid for pipelining appears to be enormous: the 88100's

I didn't design the 88100 (I am not a chip designer at all). However, I
doubt that any speed difference is due to a tradeoff made to provide FP
pipelining. Whether the FP unit was pipelined or not, I think the FP
latencies would still be the same. Do you have some reason to think that
not pipelining can decrease FP latency? If no increase in latency is
incurred to provide pipelining, then providing it can only help
performance.

>fp op latencies average 2.5x longer than the R3010's when measured in
>cycles; 3x longer when measured in ns.

Would you post the R3010 FP stats? Not that I doubt you, I'm just
interested in how each particular R3010 FP instruction compares to the
equivalent 88K instruction.
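
To make the latency-versus-throughput argument above concrete, here is a
minimal sketch in Python of a toy in-order issue model. It is not a model
of the 88100 or the R3010: the function name total_cycles, the
feed_interval parameter, and every number used are assumptions picked for
illustration only. It takes as given the premise argued above, that
pipelining leaves the per-operation latency unchanged.

```python
def total_cycles(n_ops, latency, pipelined, feed_interval=1, dependent=False):
    """Cycles to complete n_ops FP operations in a toy in-order model.

    latency       -- cycles from issue to result for one operation
    pipelined     -- True if a new operation can issue every cycle
    feed_interval -- hypothetical cycles needed to deliver the operands
                     for one operation (models limited operand bandwidth)
    dependent     -- True if each operation needs the previous result
    """
    if n_ops == 0:
        return 0
    if dependent or not pipelined:
        # The next operation cannot start until the previous one finishes,
        # either because it needs the result or because the unit is busy.
        issue_interval = max(latency, feed_interval)
    else:
        # Independent operations on a pipelined unit can overlap; the issue
        # rate is limited only by how fast operands can be delivered.
        issue_interval = max(1, feed_interval)
    # The last operation issues after (n_ops - 1) intervals, then needs
    # one full latency to produce its result.
    return (n_ops - 1) * issue_interval + latency


if __name__ == "__main__":
    N, LAT = 100, 5  # illustrative values, not measured latencies
    cases = [
        ("independent, operands ready", dict(feed_interval=1)),
        ("independent, operand feed limited", dict(feed_interval=4)),
        ("dependent chain", dict(dependent=True)),
    ]
    for label, kwargs in cases:
        piped = total_cycles(N, LAT, pipelined=True, **kwargs)
        unpiped = total_cycles(N, LAT, pipelined=False, **kwargs)
        print(f"{label}: pipelined={piped} unpipelined={unpiped}")
```

Under these assumptions the pipelined unit is never slower than the
unpipelined one. It gains the most on independent operations whose
operands are already available, and the gain shrinks as operand delivery
becomes the bottleneck, which is the concern raised in the quoted text.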
-- Just a few more bits in the stream. The Sneek