Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84 SMI; site sun.uucp
Path: utzoo!linus!philabs!cmcl2!seismo!harvard!talcott!panda!genrad!decvax!decwrl!sun!gnu
From: gnu@sun.uucp (John Gilmore)
Newsgroups: net.arch
Subject: Re: MMU Cache revisited
Message-ID: <2581@sun.uucp>
Date: Thu, 8-Aug-85 20:29:14 EDT
Article-I.D.: sun.2581
Posted: Thu Aug 8 20:29:14 1985
Date-Received: Mon, 12-Aug-85 02:34:16 EDT
References: <5374@fortune.UUCP> <268@gcc-bill.ARPA> <1838@amdahl.UUCP>
Distribution: net
Organization: Sun Microsystems, Inc.
Lines: 26

Someone pointed out that changing contexts in the IBM 370 doesn't cause a big performance hit because the page table cache (called the "TLB") contains multiple contexts' entries, indexed with a pointer into another cache ("STO stack"). It should be mentioned that early 370's do not have the STO hardware; it was added because context switching took too long. (Their first virtual memory systems did not give each process a different address space; they just let the address space everybody shared be larger than the physical memory. For that they didn't need to change the MMU context.)

It has also not been mentioned that systems where you copy page table entries into dedicated fast RAMs need not recopy on every context switch. In the Sun-2 MMU, for example, 8 complete contexts can remain in the fast RAMs, and the only reloading required is when you are context-switching among more than 8 processes. On a single-user system (running Unix, where most processes die quickly anyway) this is not a performance bottleneck.

From the hardware designs I've seen, it's a lot harder to build an MMU with a cache than it is to build one out of RAM. This is because the cache is doing in hardware what would otherwise be done in software (updating the entries in the hardware translation table). Whether this is worth it or not depends on the individual system and what it will be used for.
I suspect the overhead difference between the two is negligible in the overall system load, unless the hardware is badly designed, so I favor the cheaper approach.
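
To make the Sun-2 point concrete, here is a rough toy model of an MMU whose fast RAM holds a fixed number of complete contexts. It is only a sketch: the figure of 8 comes from the text above, but the class and function names are made up for illustration, and the least-recently-used eviction policy is an assumption (nothing above says how the software picks a context to throw out).

```python
from collections import OrderedDict

NUM_HW_CONTEXTS = 8  # figure quoted above for the Sun-2 MMU


class ContextRam:
    """Toy model of an MMU whose fast RAM holds a fixed number of contexts."""

    def __init__(self, slots=NUM_HW_CONTEXTS):
        self.slots = slots
        self.resident = OrderedDict()  # pid -> slot number, oldest first
        self.reloads = 0               # times a full map had to be copied in

    def switch_to(self, pid):
        """Make pid the running context; return the hardware slot it uses."""
        if pid in self.resident:
            # Map is still in fast RAM: only the context register changes.
            self.resident.move_to_end(pid)
            return self.resident[pid]
        if len(self.resident) < self.slots:
            slot = len(self.resident)  # a slot that has never been used
        else:
            # Assumed policy: evict the least recently used context.
            _, slot = self.resident.popitem(last=False)
        self.resident[pid] = slot
        self.reloads += 1              # software copies the whole map in
        return slot


def reloads_for(num_processes, rounds=10):
    """Round-robin num_processes for several rounds and count map reloads."""
    mmu = ContextRam()
    for _ in range(rounds):
        for pid in range(num_processes):
            mmu.switch_to(pid)
    return mmu.reloads
```

With 8 or fewer processes, reloads_for() counts one load per process no matter how many rounds are run. With 9 processes scheduled strictly round-robin, the assumed LRU policy reloads on every switch, which is the worst case for this scheme; a real workload where most processes die quickly would fall somewhere in between.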