Path: utzoo!attcan!uunet!lll-winken!lll-lcc!ames!ubvax!vsi1!wyse!mips!earl
From: earl@mips.COM (Earl Killian)
Newsgroups: comp.arch
Subject: Re: RISC machines and scoreboarding
Message-ID: <2547@wright.mips.COM>
Date: 6 Jul 88 03:46:35 GMT
References: <1362@oakhill.UUCP>
Lines: 44
In-reply-to: mpaton@oakhill.UUCP's message of 1 Jul 88 20:57:15 GMT

In article <1362@oakhill.UUCP> mpaton@oakhill.UUCP (Michael Paton) writes:
mp> The MIPS processors do not snoop their bus and therefore leave
mp> memory coherence to the write-through mechanism.  In
mp> multiprocessing applications, the memory bus can become saturated
mp> with a few processors on the bus (~4?).  Write-back caches cause a
mp> sufficient reduction in memory bus traffic to allow twice the
mp> number of processing ensembles to utilize the bus.

Write through to a 32b bus does indeed limit you to about 4 processors
(the max supported by the 88000).  If you want more than that, use a
64b bus (~8 processors), or use a secondary cache (which has its own
benefits) and make it write-back.  Both approaches have already been
implemented in R2000-based systems.

When you build an R3000-based MP, you don't have to limit the amount of
cache per processor (unlike, e.g., the 88000, which allows only 16KB
per processor in a 4-processor system).  If you're building an MP,
presumably you're interested in performance, so it seems strange to
cripple each processor with a small cache.

mp> ...in particular, we attempted to beat on the SRAM technology less
mp> hard.  If we are correct, this should be more scalable in the
mp> future (read ECL/GaAs) as off-chip delays approach .4 cycle.

It's hard to envision ECL output drive-times approaching 0.4 clocks;
modern ECL parts (e.g. Sony's CXB1100Q 3-NOR) can receive signals from
off-chip, do the NOR function, and drive off chip again (using 100K
levels) in 390 picoseconds.  {EDN 6-23-88, p. 97}  So (0.4 * Tclock) =
390ps giving Tclock = 975 picoseconds (1.03 GHz) !!!
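
For anyone who wants to redo that arithmetic with other numbers, here is
a minimal sketch in Python.  The function names are made up for
illustration; the only inputs taken from the post are the 390 ps
round-trip figure and the 0.4-cycle fraction.

```python
def implied_clock_period_ps(delay_ps, fraction_of_cycle):
    """Clock period (ps) at which delay_ps occupies fraction_of_cycle of a cycle."""
    if not 0 < fraction_of_cycle <= 1:
        raise ValueError("fraction_of_cycle must be in (0, 1]")
    return delay_ps / fraction_of_cycle


def clock_frequency_ghz(period_ps):
    """Convert a clock period in picoseconds to a frequency in GHz."""
    return 1000.0 / period_ps


def fraction_of_cycle(delay_ps, period_ps):
    """Fraction of one clock period taken up by delay_ps."""
    return delay_ps / period_ps


# 390 ps: off-chip receive + NOR + off-chip drive, as quoted from EDN.
ECL_ROUND_TRIP_PS = 390.0

period_ps = implied_clock_period_ps(ECL_ROUND_TRIP_PS, 0.4)
print(f"{period_ps:.0f} ps, {clock_frequency_ghz(period_ps):.2f} GHz")
```

fraction_of_cycle() turns the question around: given an assumed clock
period, it reports how much of the cycle the 390 ps delay consumes.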
A more "believable" clock period might be 4ns (a la Cray-2), in which
case the drive-off time is 0.1 clock.... a smaller fraction of the
cycle than you quote for your CMOS design.

mp> Alternatively, our design costs less to manufacture in high volume
mp> and allow less costly SRAM parts than the MIPS Co. design.

An R3000 may require faster SRAMs, but these are multiple-sourced,
off-the-shelf, commodity devices, and the price of 30 16Kx4 20ns SRAMs
is actually lower than that of eight 88200s (sole-sourced).
--
UUCP: {ames,decwrl,prls,pyramid}!mips!earl
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086