Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site mips.UUCP
Path: utzoo!linus!philabs!prls!amdimage!amdcad!decwrl!Glacier!mips!mash
From: mash@mips.UUCP (John Mashey)
Newsgroups: net.arch
Subject: Re: Re: Cache revisited + page tables
Message-ID: <168@mips.UUCP>
Date: Wed, 14-Aug-85 07:26:13 EDT
Article-I.D.: mips.168
Posted: Wed Aug 14 07:26:13 1985
Date-Received: Mon, 19-Aug-85 06:49:14 EDT
References: <5374@fortune.UUCP> <901@loral.UUCP> <2583@sun.uucp> <37@intelca.UUCP>
Distribution: net
Organization: MIPS Computer Systems, Mountain View, CA
Lines: 64

Ken Shoemaker writes:
> > John Van Zandt of Loral Instrumentation (ucbvax!sdcsvax!jvz) said:
> > > Cache's are strange creatures, exactly how they are designed can impact
> > > performance significantly.  Your idea of separate cache's for supervisor
> > > and user is a good one.
> > I believe (uninformed opinion) that this makes it perform worse, all
> > other things equal.  It means that at any given time, half the cache
> > is unusable, thus if you spend 90% of your time in user state you
> > only have a halfsize cache.  (Ditto if you are doing system state stuff.)
> 
> I believe (uninformed opinion) that a better solution is to use a
> multi-way set associative cache.  ...  A multi-way associative cache
> is really the best of both worlds, since it allows both options (a large
> non-split cache or two half size split caches), albeit at increased expense...
> The same argument could be applied to seperate instruction/data caches.

Most of this is true, except for caveats to the last sentence.
Direct-mapped, split I & D caches behave somewhat like 2-way set-assoc. caches,
i.e., they have measurably higher hit rates than a direct-mapped joint I&D cache.
[With direct-mapped joint, all you need is one frequently-used loop that
happens to reference clashing data, and you get many misses.]
Caches are indeed strange things, but they do follow a few reasonable rules.
Given a fixed total amount of cache memory:

1) An (N+1)-way set-associative cache has a higher hit rate than an N-way one
(in particular, 2-way is higher than 1-way (direct-mapped)).

2) Joint caches have higher hit rates than split ones.
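
As a rough illustration of rule 1 and of the clashing-loop case in the
bracketed remark above, here is a minimal simulation sketch. Everything in it
is an assumption made for the example rather than a measurement: the LRU
replacement policy, the 16-byte line size, the 256-line total capacity, the
two addresses, and the names count_hits, code_addr and data_addr. The trace
is a worst case picked on purpose, so it shows how a direct-mapped joint
cache can thrash; it does not contradict rule 2 for typical workloads.

```python
from collections import OrderedDict

def count_hits(trace, total_lines, ways, line_bytes=16):
    """Count hits for an LRU set-associative cache (ways=1 is direct-mapped)."""
    num_sets = total_lines // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in trace:
        line = addr // line_bytes
        s = sets[line % num_sets]
        if line in s:
            hits += 1
            s.move_to_end(line)        # mark as most recently used
        else:
            if len(s) >= ways:
                s.popitem(last=False)  # evict the least recently used line
            s[line] = True
    return hits

# Hypothetical worst case: a loop whose instruction fetch and data reference
# are exactly one cache-size (256 lines * 16 bytes = 4096 bytes) apart.
TOTAL_LINES = 256
code_addr, data_addr = 0x0000, 0x1000
i_trace = [code_addr] * 1000
d_trace = [data_addr] * 1000
joint_trace = [a for pair in zip(i_trace, d_trace) for a in pair]

joint_direct = count_hits(joint_trace, TOTAL_LINES, ways=1)
joint_2way = count_hits(joint_trace, TOTAL_LINES, ways=2)
# Split I & D: two direct-mapped caches, each with half the total lines.
split_direct = (count_hits(i_trace, TOTAL_LINES // 2, ways=1)
                + count_hits(d_trace, TOTAL_LINES // 2, ways=1))

n = len(joint_trace)
print("joint direct-mapped:", joint_direct / n)
print("joint 2-way:        ", joint_2way / n)
print("split direct-mapped:", split_direct / n)
```

On this trace the joint direct-mapped cache misses on every reference, while
the 2-way joint cache and the split direct-mapped pair each miss only on the
first touch of each line.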

Unfortunately,
1) (N+1)-way is more expensive than N-way (for the same speed).
Even worse, unless you're building everything from scratch, you may
be able to buy parts to make N-way go fast enough, but maybe not for N+1.
In particular, there's a big jump from N=1 to anything higher, and
fast CPUs demand fast cache access.

2) Split I&D caches can get away with using slower SRAMs than do joint
I&D caches, at least for some architectures, because you can more-or-less
alternate accesses to the 2 caches.

Although true, the above is a ferocious over-simplification - it's very
hard to evaluate cache designs without knowing:
a) CPU chip nature  (speed; execution cycles per I-fetch - i.e., CISC vs RISC).
b) Choice of write-thru vs write-back caches; use of write buffers.
c) Approach to cache coherency, i.e., bus-watching cache vs software methods.
d) Main bus speed.
e) Requirements (or lack thereof) for real multi-processor support.
f) Use of optimizing compilers that put things in registers, often driving
the hit rate down [yes, down], although the speed is improved and there are
fewer total memory references.
g) Physical versus virtual caches, and interaction with timing of
whatever memory management scheme is used; related are interactions
with the operating system, given style of shared address spaces, and requirements,
if any, for cache-flushing - there are many tradeoff combinations possible.
h) Cost, speed, and availability of fast RAMs.
i) Board size and power limitations.
[I'm sure I've forgotten some, but these come to mind quickly.]
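
Point (f) is the least intuitive item on the list, so here is a small worked
example. All numbers are hypothetical and chosen only to show the direction of
the effect: the reference and miss counts, the assumption that every
reference the compiler removes would have been a cache hit, and the 1-cycle
hit / 10-cycle miss costs.

```python
# Hypothetical counts before optimization.
refs_before = 1000      # total memory references
misses = 50             # references that miss in the cache

# Assumption: the optimizer keeps hot variables in registers, removing 400
# references, all of which would have hit in the cache. Misses are unchanged.
refs_removed = 400
refs_after = refs_before - refs_removed

hit_rate_before = (refs_before - misses) / refs_before
hit_rate_after = (refs_after - misses) / refs_after

# Assumed costs, in cycles, for a crude memory-time estimate.
HIT_CYCLES, MISS_CYCLES = 1, 10
cycles_before = (refs_before - misses) * HIT_CYCLES + misses * MISS_CYCLES
cycles_after = (refs_after - misses) * HIT_CYCLES + misses * MISS_CYCLES

print("hit rate:", hit_rate_before, "->", round(hit_rate_after, 4))
print("memory cycles:", cycles_before, "->", cycles_after)
```

Under these assumptions the hit rate falls from 95% to about 91.7%, yet the
time spent on memory references drops from 1450 to 1050 cycles, because the
references that disappeared were the cheap ones.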

Nontrivial stuff; not necessarily intuitive; easy to do wrong.
-- 
-john mashey
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash
DDD:  	415-960-1200
USPS: 	MIPS Computer Systems, 1330 Charleston Rd, Mtn View, CA 94043