Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site mips.UUCP
Path: utzoo!linus!philabs!prls!amdimage!amdcad!decwrl!Glacier!mips!mash
From: mash@mips.UUCP (John Mashey)
Newsgroups: net.arch
Subject: Re: Re: Cache revisited + page tables
Message-ID: <168@mips.UUCP>
Date: Wed, 14-Aug-85 07:26:13 EDT
Article-I.D.: mips.168
Posted: Wed Aug 14 07:26:13 1985
Date-Received: Mon, 19-Aug-85 06:49:14 EDT
References: <5374@fortune.UUCP> <901@loral.UUCP> <2583@sun.uucp> <37@intelca.UUCP>
Distribution: net
Organization: MIPS Computer Systems, Mountain View, CA
Lines: 64

Ken Shoemaker writes:
> > John Van Zandt of Loral Instrumentation (ucbvax!sdcsvax!jvz) said:
> > > Caches are strange creatures, exactly how they are designed can impact
> > > performance significantly.  Your idea of separate caches for supervisor
> > > and user is a good one.
> > I believe (uninformed opinion) that this makes it perform worse, all
> > other things equal.  It means that at any given time, half the cache
> > is unusable, thus if you spend 90% of your time in user state you
> > only have a halfsize cache.  (Ditto if you are doing system state stuff.)
>
> I believe (uninformed opinion) that a better solution is to use a
> multi-way set associative cache. ... A multi-way associative cache
> is really the best of both worlds, since it allows both options (a large
> non-split cache or two half size split caches), albeit at increased expense...
> The same argument could be applied to separate instruction/data caches.

Most of this is true, except for caveats to the last sentence.
Direct-mapped, split I & D caches behave somewhat like 2-way set-assoc.
caches, i.e., they have measurably higher hit rates than a direct-mapped
joint I&D cache.  [With direct-mapped joint, all you need is one
frequently-used loop that happens to reference clashing data, and you
get many misses.]

Caches are indeed strange things, but they do follow a few reasonable rules.
Given a fixed total amount of cache memory:
	1) (N+1)-way set-associative cache has a higher hit rate than N-way.
	(In particular, 2-way is higher than 1-way (direct).)
	2) Joint caches have higher hit rates than split ones.

Unfortunately,
	1) (N+1)-way is more expensive than N-way (for the same speed).
	Even worse, unless you're building everything from scratch, you may
	be able to buy parts to make N-way go fast enough, but maybe not for
	N+1.  In particular, there's a big jump from N=1 to anything higher,
	and fast CPUs demand fast cache access.
	2) Split I&D caches can get away with using slower SRAMs than do
	joint I&D caches, at least for some architectures, because you can
	more-or-less alternate accesses to the 2 caches.

Although true, the above is a ferocious over-simplification - it's very hard
to evaluate cache designs without knowing:
	a) CPU chip nature (speed; execution cycles per I-fetch - i.e.,
	CISC vs RISC).
	b) Choice of write-thru vs write-back caches; use of write buffers.
	c) Approach to cache coherency, i.e., bus-watching cache vs
	software methods.
	d) Main bus speed.
	e) Requirements (or lack thereof) for real multi-processor support.
	f) Use of optimizing compilers that put things in registers,
	often driving the hit rate down [yes, down], although the speed
	is improved and there are fewer total memory references.
	g) Physical versus virtual caches, and interaction with timing of
	whatever memory management scheme is used; related are interactions
	with the operating system, given style of shared address spaces, and
	requirements, if any, for cache-flushing - there are many tradeoff
	combinations possible.
	h) Cost, speed, and availability of fast RAMs.
	i) Board size and power limitations.
[I'm sure I've forgotten some, but these come to mind quickly.]

Nontrivial stuff; not necessarily intuitive; easy to do wrong.
-- 
-john mashey
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash
DDD:  	415-960-1200
USPS: 	MIPS Computer Systems, 1330 Charleston Rd, Mtn View, CA 94043
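
[A rough illustration of the clashing-loop remark and of rule 1 above; it
is a sketch, not a measurement.  Everything in it is an assumption made
for the example: the 16-byte line, the 256-line total, LRU replacement,
the two addresses, and the names Cache, clashing_loop and compare.  The
trace is the deliberately bad case of a one-line loop whose code and
data lines land on the same index of a direct-mapped joint cache.]

```python
from collections import OrderedDict

LINE_BYTES = 16     # assumed line size
TOTAL_LINES = 256   # assumed fixed total cache size (4 KB here)


class Cache:
    """Set-associative cache with LRU replacement; ways=1 is direct-mapped."""

    def __init__(self, total_lines, ways):
        self.ways = ways
        self.num_sets = total_lines // ways
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.misses = 0

    def access(self, addr):
        line = addr // LINE_BYTES
        entries = self.sets[line % self.num_sets]
        tag = line // self.num_sets
        if tag in entries:
            entries.move_to_end(tag)          # mark as most recently used
            return
        self.misses += 1
        if len(entries) >= self.ways:
            entries.popitem(last=False)       # evict least recently used
        entries[tag] = True


def clashing_loop(iterations):
    """Hypothetical trace: one code line and one data line, 16 KB apart."""
    code_addr, data_addr = 0x0000, 0x4000
    for _ in range(iterations):
        yield ("I", code_addr)
        yield ("D", data_addr)


def compare(iterations):
    """Return total misses for three layouts of the same TOTAL_LINES."""
    joint_direct = Cache(TOTAL_LINES, 1)
    joint_2way = Cache(TOTAL_LINES, 2)
    icache = Cache(TOTAL_LINES // 2, 1)       # split: half the lines each
    dcache = Cache(TOTAL_LINES // 2, 1)
    for kind, addr in clashing_loop(iterations):
        joint_direct.access(addr)
        joint_2way.access(addr)
        (icache if kind == "I" else dcache).access(addr)
    return {
        "joint direct-mapped": joint_direct.misses,
        "joint 2-way": joint_2way.misses,
        "split direct-mapped": icache.misses + dcache.misses,
    }


if __name__ == "__main__":
    for name, misses in compare(1000).items():
        print(f"{name:22s} {misses} misses")
```

[With 1000 iterations the joint direct-mapped cache misses on all 2000
references, while the joint 2-way and the split direct-mapped caches
each take only the 2 cold misses.  This trace is built to show the
clash; it says nothing about rule 2, which concerns hit rates over
realistic workloads.]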