Path: utzoo!mnetor!uunet!husc6!think!ames!sdcsvax!ucsdhub!hp-sdd!hplabs!pyramid!prls!mips!mark
From: mark@mips.UUCP (Mark G. Johnson)
Newsgroups: comp.arch
Subject: Impossible 40MHz R2000 ??
Message-ID: <1145@mips.UUCP>
Date: 15 Dec 87 23:49:06 GMT
Lines: 77

Quoting from author	jesup@pawl22.pawl.rpi.edu (Randell E. Jesup)
in article		<140@imagine.PAWL.RPI.EDU>
of comp.arch on date	13 Dec 87 13:02:20 GMT

	> Given current technology, r2000 could probably be scaled
	> to about 20 MHz.  However, custom RISC designs in CMOS are
	> now reaching 40 MHz, which would be impossible with the
	> double-clocked interface currently on the r2000.  Perhaps
	> the interface could be removed, given enough pins, but
	> that gets you back into the packaging limits.

"Impossible" is quite a strong word.  "Difficult", sure.  But he's
saying that a 2.4X improvement of a first-chip-designed-at-a-startup-
company, 2-micron-generic-silicon-foundry device is IMPOSSIBLE.
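
For reference, the 2.4X figure is just the ratio of the two clock rates
quoted in this thread. A minimal sketch of that arithmetic (the function
and variable names are mine, chosen for illustration only):

```python
# Clock rates quoted in the post, in MHz.
f_now_mhz = 16.7
f_target_mhz = 40.0


def cycle_ns(f_mhz):
    """Return the cycle time in nanoseconds for a clock rate in MHz."""
    return 1000.0 / f_mhz


# Required improvement in clock rate.
speedup = f_target_mhz / f_now_mhz

print(f"speedup needed:   {speedup:.2f}X")
print(f"cycle at present: {cycle_ns(f_now_mhz):.1f} ns")
print(f"cycle at 40 MHz:  {cycle_ns(f_target_mhz):.1f} ns")
```

So the cycle shrinks from roughly 60 ns to 25 ns.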

A few things might change :-) :-) between now (16.7 MHz) and 40 MHz.
Principal among these is experience; several different systems using
this double-clocked approach have now been built (by SGI, MIPS, and
others) and their properties have been measured and analyzed.
Weaknesses, if any :-) :-), can be improved, and strengths can
be exploited.

Other factors conspire to make the job of building a 40 MHz
double-clocked interface not "impossible":

	1.  Cache RAM access times will continue to decrease, likely
	    at the same rate as the processor clock, since SRAM vendors
	    now build RISC chips (including SPARC, R2000, Am29K).  So
	    RAM access time will probably stay at 40-50% of processor
	    cycle time (presently 60 ns cycle, 25-30 ns RAM access).
	    The rest of the cycle is used up by setup & hold times,
	    bus drive (slew) times, timing uncertainties, and "margin".

	2.  Surface mount packages (having the *same number* of
	    leads, 144) might be used instead of the current 144 pin
	    Pin Grid Array.  Their lower inductance and better
	    controlled impedance can decrease dispersion and improve
	    signal quality.  Such packages, available today, are
	    more than 2.4X better than the existing PGA package on
	    these counts, so the net percentage of the cycle wasted
	    in package-induced "timing slop" would decrease.

	3.  Output voltage drive levels might shrink from the present
	    0.0 volts and 5.0 volts, to (an example) 0.4V and 2.7V.
	    This speeds up output transitions (dT = dV * C/I) without
	    increasing switching noise.  Less of the cycle (in
	    percentage terms) would be spent slewing the bus around.

	4.  The clock generation and distribution technology may
	    get more than 2.4X more precise.  If this happened,
	    the fraction of the cycle lost to "slop" (timing edge
	    uncertainty) would go down.

	5.  BiCMOS fab processes might be employed, permitting the
	    use of open emitter, Wired-OR interfacing to ECL-compatible
	    cache RAM chips.  Current ECL RAMs have access times of
	    about 15-20 nsec, a smaller fraction of the current
	    processor cycle (60 nsec) than current MOS RAM chips.  So
	    the timing margins *improve*.  Additionally, multiple-driver
	    "collisions" or "contention" are non-deadly in the wired-OR
	    ECL structure (unlike CMOS tristate busses), so the time
	    between disabling X and enabling Y onto the bus can be
	    reduced dramatically.  And there's reason to believe that BiCMOS
	    RAM access times will scale at about the same rate as
	    traditional CMOS RAMs.
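
The cycle-budget arithmetic behind items 1, 3, and 5 can be sketched as
follows. The cycle and RAM access times and the voltage levels are the
ones quoted above; the bus capacitance and drive current are purely
hypothetical values picked to make the numbers concrete, and all names
are mine:

```python
# Figures quoted in the list above.
CYCLE_NOW_NS = 60.0              # present processor cycle (16.7 MHz)
CYCLE_40MHZ_NS = 1000.0 / 40.0   # 25 ns cycle at 40 MHz
CMOS_RAM_NS = (25.0, 30.0)       # present MOS cache RAM access times
ECL_RAM_NS = (15.0, 20.0)        # present ECL RAM access times


def fraction_of_cycle(access_ns, cycle_ns):
    """Fraction of the processor cycle consumed by RAM access."""
    return access_ns / cycle_ns


def slew_ns(dv_volts, c_farads, i_amps):
    """Output transition time from dT = dV * C / I, in nanoseconds."""
    return dv_volts * c_farads / i_amps * 1e9


# Items 1 and 5: RAM access as a share of today's 60 ns cycle.
cmos_share = [fraction_of_cycle(t, CYCLE_NOW_NS) for t in CMOS_RAM_NS]
ecl_share = [fraction_of_cycle(t, CYCLE_NOW_NS) for t in ECL_RAM_NS]

# Item 1: RAM access time needed at 40 MHz if the share stays at 40-50%.
ram_needed_ns = [0.40 * CYCLE_40MHZ_NS, 0.50 * CYCLE_40MHZ_NS]

# Item 3: slew time for full swing vs. reduced swing.
# ASSUMPTION: 50 pF bus load and 20 mA drive are hypothetical values,
# not taken from any R2000 data.
C_BUS = 50e-12
I_DRIVE = 20e-3
slew_full = slew_ns(5.0 - 0.0, C_BUS, I_DRIVE)
slew_reduced = slew_ns(2.7 - 0.4, C_BUS, I_DRIVE)

print("CMOS RAM share of cycle:", [f"{s:.0%}" for s in cmos_share])
print("ECL RAM share of cycle: ", [f"{s:.0%}" for s in ecl_share])
print("RAM access needed at 40 MHz (ns):", ram_needed_ns)
print(f"slew, 5.0 V swing: {slew_full:.2f} ns")
print(f"slew, 2.3 V swing: {slew_reduced:.2f} ns")
print(f"slew ratio: {slew_reduced / slew_full:.2f}")
```

Whatever C and I actually are, the slew ratio depends only on the two
voltage swings (2.3 V / 5.0 V), so the reduced swing cuts the transition
time to a bit under half for the same drive current.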


Taken individually or as a group, these scenarios indicate (at least
to me) that 40 MHz "double clocked cache interfaces" are indeed
possible, and might in fact be as robust as (or more so than) existing
implementations at 16.7 MHz.

Regards,

-Mark Johnson	*** DISCLAIMER: The opinions above are personal. ***	
UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mark   TEL: 408-720-1700 x208
US mail: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086