Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!columbia!rutgers!cbmvax!welland
From: welland@cbmvax.UUCP (Bob Welland)
Newsgroups: comp.arch
Subject: Re: Phys vs Virtual Addr Caches
Message-ID: <2154@cbmvax.UUCP>
Date: Fri, 24-Jul-87 11:11:07 EDT
Article-I.D.: cbmvax.2154
Posted: Fri Jul 24 11:11:07 1987
Date-Received: Sat, 25-Jul-87 14:42:35 EDT
References: <3904@spool.WISC.EDU> <23622@sun.uucp>
Reply-To: welland@cbmvax.UUCP (Bob Welland)
Organization: Commodore Technology, West Chester, PA
Lines: 51

>Here's a question. Why do people build their caches to respond to physical
>addresses instead of virtual addresses? [ . . . ]
>If you cache virtual addresses you can present the address to the cache
>as soon as it is generated, no delay to do translation. At the same time
>you are doing the cache lookup you can be doing the translation in case
>there is a miss.
>
>Am I missing something or is this the wave of the future?

There are a few reasons why people use physical address caches instead of
virtual address caches (to reverse the perspective):

1. Cache consistency is very difficult with virtual address caches. This is
   because virtual addresses are "private" to the process they are
   associated with, while physical addresses are the "normal form" for the
   system as a whole. Cache consistency is basically collision detection:
   to detect a collision you need to compare addresses. Normal-form
   addresses are easy to compare (simple equality), while private-form
   addresses require a more complex comparison algorithm (first sketch at
   the end of this article).

2. Extra tag space is needed for a process ID to distinguish colliding
   virtual addresses from different processes. It is also necessary to
   flush the cache when you reuse a process ID (or use a big PID field);
   this can be rather ugly (second sketch at the end).

3. Often it is possible to use the low-order address bits (which in a
   paging system are untranslated) to access the cache in parallel with
   the address translation. In VLSI these two paths can be very well
   matched: usually you use a small content-addressable memory (CAM) for
   address translation and a large RAM for the tags. This often means that
   address translation is "free" because it is done in parallel (third
   sketch at the end). It is easy to do in VLSI but quite difficult with
   random logic, so you do not see this approach taken often. The reason
   it is difficult in random logic is that building a CAM there is
   basically impossible, so you end up using a more elaborate translation
   scheme (e.g. Sun's two-level translation table) that makes translation
   time consuming.

Most of the people who build very fast caches build them from discrete
logic, so they end up with the translate-then-cache dilemma described
above. As VLSI technology evolves, building more complex structures will
become possible, allowing the MMU and cache to be one and the same.

So in summary: yes, virtual caches are the wave of the present, but not
(in my mind) the wave of the future.

Robert Welland

Opinions expressed are my own and not those of Commodore.
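
Sketch 1 (my own illustration, not part of the argument above; all sizes
made up: direct-mapped, 32-byte lines, 256 sets). With physical tags a bus
watcher can check another master's write against the cache with a plain
equality compare; with virtual tags the same check would first need a
reverse (physical-to-virtual) translation, possibly several if the page is
mapped into more than one address space.

#include <stdint.h>
#include <stdbool.h>

#define LINE_SHIFT 5
#define NUM_SETS   256

struct pline {
    bool     valid;
    uint32_t tag;              /* physical address bits above the index */
};

static struct pline pcache[NUM_SETS];

/* Called for every write seen on the bus from another master. */
void snoop_write(uint32_t paddr)
{
    uint32_t index = (paddr >> LINE_SHIFT) & (NUM_SETS - 1);
    uint32_t tag   = paddr >> (LINE_SHIFT + 8);   /* 8 = log2(NUM_SETS) */

    /* Physical addresses are the "normal form": equality is enough. */
    if (pcache[index].valid && pcache[index].tag == tag)
        pcache[index].valid = false;              /* drop the stale line */
}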
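
Sketch 2 (again my own, with hypothetical field widths): a virtual cache
tag has to carry a process ID alongside the virtual address bits, and the
cache has to be swept when a PID is reused.

#include <stdint.h>
#include <stdbool.h>

#define V_NUM_SETS 256

struct vline {
    bool     valid;
    uint16_t pid;          /* extra tag space just to tell processes apart */
    uint32_t vtag;         /* virtual address bits above the index         */
};

static struct vline vcache[V_NUM_SETS];

bool vcache_hit(uint16_t pid, uint32_t index, uint32_t vtag)
{
    struct vline *l = &vcache[index % V_NUM_SETS];
    return l->valid && l->pid == pid && l->vtag == vtag;
}

/* The ugly part: when the OS recycles a PID, every line that might still
 * carry the old process's tag has to go. */
void flush_pid(uint16_t pid)
{
    for (int i = 0; i < V_NUM_SETS; i++)
        if (vcache[i].pid == pid)
            vcache[i].valid = false;
}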
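
Sketch 3 (hypothetical sizes): with 4K pages, bits 0-11 are untranslated.
If line offset plus set index fit in those 12 bits (32-byte lines * 128
sets = 4K here), the cache can be indexed straight from the virtual
address while the CAM translates the upper bits, and only the final tag
compare has to wait for the physical page number. The CAM is stubbed out
here; in hardware the two lookups overlap.

#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT 12
#define LINE_SHIFT 5
#define P_NUM_SETS 128            /* 32 bytes * 128 sets = 4K, one page */

struct line {
    bool     valid;
    uint32_t ppn;                 /* physical page number used as the tag */
};

static struct line cache[P_NUM_SETS];

/* Stand-in for the CAM; identity mapping purely for the sketch. */
static uint32_t tlb_translate(uint32_t vpn)
{
    return vpn;
}

bool lookup(uint32_t vaddr)
{
    /* Index comes only from untranslated page-offset bits. */
    uint32_t index = (vaddr >> LINE_SHIFT) & (P_NUM_SETS - 1);

    /* In hardware this runs in parallel with the line above. */
    uint32_t ppn = tlb_translate(vaddr >> PAGE_SHIFT);

    return cache[index].valid && cache[index].ppn == ppn;
}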