Path: utzoo!attcan!uunet!nbires!ncar!noao!nud!tom
From: tom@nud.UUCP (Tom Armistead)
Newsgroups: comp.arch
Subject: Re: RISC machines and scoreboarding
Message-ID: <1111@nud.UUCP>
Date: 29 Jun 88 18:55:48 GMT
References: <1082@nud.UUCP> <2438@winchester.mips.COM> <1098@nud.UUCP> <2465@winchester.mips.COM>
Reply-To: tom@nud.UUCP (Tom Armistead)
Organization: Motorola Microcomputer Division, Tempe, Az.
Lines: 32

In article <2465@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:

>	a) Are there indeed 2 latency cycles (i.e., that instruction 3

    Yes, 2 latency cycles on loads; 0 on stores.

>	b) If so, what is the reason for the second latency slot?

    Address calculation.  The 88k provides a basic set of three addressing
modes.  The most frequently used construct for referencing memory is
"register pointer + offset" (the offset is rarely 0).  Without this basic
addressing mode, the code would have to do something like:

	add	ptr,ptr,offset		; Form the effective address.
	ld	dest,ptr,0		; Load through the adjusted pointer.
	sub	ptr,ptr,offset		; If pointer must be preserved.

for most loads and stores.  This overhead is worse than a 2-tick latency
on loads.  Also note that there is 0 latency on 88k store instructions*.
Since many ld instructions have a corresponding st instruction somewhere, the
88k latency averaged over loads and stores will be < 2 ticks.

* What is the latency on an R3000 store instruction?
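
    The averaging argument above can be sketched as a weighted mean.  This
is only an illustration: the function name and the ld/st mixes are
hypothetical, not measured 88k figures.  The only numbers taken from this
post are the 2-tick load latency and the 0-tick store latency.

```python
def average_memory_latency(loads, stores, load_latency=2, store_latency=0):
    """Mean latency ticks per memory reference for a given ld/st mix.

    The defaults are the latencies quoted in the post (2 on loads,
    0 on stores).  The mix passed in is the caller's assumption.
    """
    total = loads + stores
    if total == 0:
        raise ValueError("need at least one memory reference")
    return (loads * load_latency + stores * store_latency) / total


# Hypothetical mixes, chosen only to show the trend.
for loads, stores in [(1, 0), (2, 1), (1, 1)]:
    avg = average_memory_latency(loads, stores)
    print(f"{loads} ld : {stores} st -> {avg:.2f} ticks average")
```

    With no stores at all the average is the full 2 ticks; any stores in
the mix pull it below 2, which is all the argument claims.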

>Note that our numbers say that in our machines, it would cost us
>10-15% in overall performance to go from 1 cycle latency to 2,

    Assuming the addressing modes and other aspects of the machine remain
the same, this figure is in the ballpark (although a little high).
However, I do not think the static machine assumption (that nothing else
changes along with the latency) is a valid one to make.
-- 
Just a few more bits in the stream.

The Sneek