Path: utzoo!mnetor!uunet!husc6!think!ames!sdcsvax!ucsdhub!hp-sdd!hplabs!pyramid!prls!mips!mark
From: mark@mips.UUCP (Mark G. Johnson)
Newsgroups: comp.arch
Subject: Impossible 40MHz R2000 ??
Message-ID: <1145@mips.UUCP>
Date: 15 Dec 87 23:49:06 GMT
Lines: 77

Quoting from author jesup@pawl22.pawl.rpi.edu (Randell E. Jesup) in
article <140@imagine.PAWL.RPI.EDU> of comp.arch on date
13 Dec 87 13:02:20 GMT

> Given current technology, r2000 could probably be scaled
> to about 20 MHz. However, custom RISC designs in CMOS are
> now reaching 40 MHz, which would be impossible with the
> double-clocked interface currently on the r2000. Perhaps
> the interface could be removed, given enough pins, but
> that gets you back into the packaging limits.

"Impossible" is quite a strong word. "Difficult", sure. But he's
saying that a 2.4X improvement of a first-chip-designed-at-a-startup-company,
2-micron-generic-silicon-foundry device is IMPOSSIBLE.

A few things might change :-) :-) between now (16.7 MHz) and 40 MHz.
Principal among these is experience; several different systems using
this double-clocked approach have now been built (by SGI, MIPS, and
others) and their properties have been measured and analyzed.
Weaknesses, if any :-) :-), can be improved, and strengths can be
exploited.

Other factors conspire to make the job of building a 40 MHz
double-clocked interface not "impossible":

  1. Cache RAM access times will continue to decrease, likely at
     the same rate as the processor clock, since SRAM vendors
     now build RISC chips (including SPARC, R2000, Am29K). So
     RAM access time will probably stay at 40-50% of processor
     cycle time. {presently 60 ns cycle, 25-30 ns RAM access}.
     The rest of the cycle is used up by setup & hold times, bus
     drive (slew) times, timing uncertainties, and "margin".

  2. Surface mount packages (having the *same number* of leads,
     144) might be used instead of the current 144 pin Pin Grid
     Array. Their lower inductance and better controlled
     impedance can decrease dispersion and improve signal
     quality. Such packages, available today, are more than
     2.4X better than the existing PGA package, so the net
     percentage of the cycle wasted in package-induced "timing
     slop" would decrease.

  3. Output voltage drive levels might shrink from the present
     0.0 volts and 5.0 volts, to (an example) 0.4V and 2.7V.
     This speeds up output transitions (dT = dV * C/I) without
     increasing switching noise. Less of the cycle (in
     percentage terms) would be spent slewing the bus around.

  4. The clock generation and distribution technology may get
     a factor of > 2.4X more precise. If this happened, the
     fraction of the cycle lost to "slop" (timing edge
     uncertainty) would go down.

  5. BiCMOS fab processes might be employed, permitting the use
     of open emitter, Wired-OR interfacing to ECL-compatible
     cache RAM chips. Current ECL RAMs are about 15-20 nsec,
     a smaller fraction of the current processor cycle (60 nsec)
     than current MOS RAM chips. So the timing margins
     *improve*. Additionally, multiple-driver "collisions" or
     "contention" are non-deadly in the wired-OR ECL structure,
     (unlike CMOS tristate busses), such that the time between
     disabling X and enabling Y onto the bus, can be reduced
     dramatically. And there's reason to believe that BiCMOS
     RAM access times will scale at about the same rate as
     traditional CMOS RAMs.

Taken individually or as a group, these scenarios indicate (at
least to me) that 40 MHz "double clocked cache interfaces" are
indeed possible, and might in fact be as robust (or moreso) than
existing implementations at 16.7 MHz.

Regards,
   -Mark Johnson

*** DISCLAIMER: The opinions above are personal. ***
UUCP:    {decvax,ucbvax,ihnp4}!decwrl!mips!mark
TEL:     408-720-1700 x208
US mail: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
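
For anyone who wants to play with the numbers in points 1, 3 and 5
above, here is a rough back-of-the-envelope sketch. The clock rates,
RAM access times and voltage levels come from the post; the bus
capacitance (50 pF) and drive current (20 mA) are purely illustrative
assumptions, not R2000 figures, and the function names are made up for
this example.

```python
# Back-of-the-envelope timing arithmetic for points 1, 3 and 5.
# ASSUMPTIONS (not from the post): 50 pF bus load, 20 mA drive current.


def cycle_ns(freq_mhz):
    """Clock period in nanoseconds for a clock frequency in MHz."""
    return 1000.0 / freq_mhz


def ram_fraction(access_ns, freq_mhz):
    """Fraction of one processor cycle consumed by RAM access time."""
    return access_ns / cycle_ns(freq_mhz)


def slew_time_ns(delta_v, cap_pf, current_ma):
    """dT = dV * C / I.  Volts * picofarads / milliamps = nanoseconds."""
    return delta_v * cap_pf / current_ma


if __name__ == "__main__":
    # Point 1: today's budget, and the RAM speed needed to keep the
    # same 40-50% share of the cycle at 40 MHz.
    for access in (25.0, 30.0):
        print("16.7 MHz, %2.0f ns RAM: %.0f%% of cycle"
              % (access, 100 * ram_fraction(access, 16.7)))
    for share in (0.40, 0.50):
        print("40 MHz at %.0f%% of cycle: %.1f ns RAM needed"
              % (100 * share, share * cycle_ns(40.0)))

    # Point 3: reduced output swing, same (assumed) load and drive.
    cap_pf, current_ma = 50.0, 20.0      # illustrative assumptions
    full = slew_time_ns(5.0 - 0.0, cap_pf, current_ma)
    reduced = slew_time_ns(2.7 - 0.4, cap_pf, current_ma)
    print("slew: %.2f ns full swing, %.2f ns reduced swing"
          % (full, reduced))

    # Point 5: ECL RAM access as a share of today's cycle.
    for access in (15.0, 20.0):
        print("16.7 MHz, %2.0f ns ECL RAM: %.0f%% of cycle"
              % (access, 100 * ram_fraction(access, 16.7)))
```

Whatever load and drive values are plugged in, the ratio of the two
slew times depends only on the voltage swings (2.3 V versus 5.0 V), so
the reduced-swing case comes out at a bit under half the full-swing
time.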