Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site fortune.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxl!ihnp4!fortune!rpw3
From: rpw3@fortune.UUCP
Newsgroups: net.arch
Subject: Re: RISC perspective - (nf)
Message-ID: <2736@fortune.UUCP>
Date: Sat, 10-Mar-84 23:02:23 EST
Article-I.D.: fortune.2736
Posted: Sat Mar 10 23:02:23 1984
Date-Received: Sun, 11-Mar-84 07:06:49 EST
Sender: notes@fortune.UUCP
Organization: Fortune Systems, Redwood City, CA
Lines: 55

#R:orstcs:-280000100:fortune:16500005:000:2889
fortune!rpw3 Mar 10 19:07:00 1984

And while all you guys are busy stuffing things in registers, let us
not forget that process-switch time gets worse the more "short term"
state there is to save/restore. Especially when using an operating
system model of the Concurrent Euclid sort, or any other system in which
the "interrupt handlers" are full processes in their own right, saving
and restoring ("swapping") the enormous amount of state implied by all
those registers can result in a net reduction of throughput. A properly
designed non-write-through cache does not suffer from that problem,
since many pieces of many interrupt processes can come to live in the
cache.

Small aside: I really wish that the Motorola 68000 had only 8 general
registers rather than 8 addr + 8 data. For any instruction set of the
VAX/68k/16k style, 16 registers is too many if you do a lot of process
switching.

If the "registers" are implemented as main memory locations which are
addressed with short addresses relative to some process base, and the
identification of "register" addresses is fast enough that cached
"registers" compete with hardware register addressing, then the number
of registers should be set solely by the number of bits we wish to use
for them in the instructions. I still feel that even in this case the
number of specially named "registers" (short addresses) will be quite
small (32 or fewer).
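To make the "registers as short addresses relative to a process base" idea concrete, here is a rough sketch in Python. It is only a toy model under stated assumptions: the names Machine, NUM_REGS, reg_addr and switch_to are invented for illustration, the register count of 8 is arbitrary, and the cache that would make this fast on real hardware is not modelled at all.

```python
# Toy model (all names hypothetical): "registers" are ordinary memory words,
# addressed by a short register number added to a per-process base.

NUM_REGS = 8  # assumed register count; set by the bits spent in the instruction


class Machine:
    def __init__(self, mem_words):
        self.mem = [0] * mem_words  # main memory (a real design would cache it)
        self.base = 0               # process base for the running process

    def reg_addr(self, r):
        """Translate a short register number into a main-memory address."""
        if not 0 <= r < NUM_REGS:
            raise IndexError("no such register: %d" % r)
        return self.base + r

    def read(self, r):
        return self.mem[self.reg_addr(r)]

    def write(self, r, value):
        self.mem[self.reg_addr(r)] = value

    def switch_to(self, new_base):
        """Process switch: nothing is copied, only the base changes."""
        if new_base < 0 or new_base + NUM_REGS > len(self.mem):
            raise ValueError("register block falls outside memory")
        old_base = self.base
        self.base = new_base
        return old_base


m = Machine(64)
m.switch_to(16)   # process A keeps its registers at locations 16..23
m.write(3, 111)
m.switch_to(32)   # process B keeps its registers at locations 32..39
m.write(3, 222)
m.switch_to(16)   # back to A: register 3 is intact without any save/restore
```

The point of the sketch is that switch_to does a constant amount of work no matter how many registers there are. Whether that wins in practice depends on the register addresses hitting in the cache, which is exactly the condition stated in the paragraph above.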

An interesting example to look at for register naming is the old
RCA-1602 (?) 4-bit microcontroller chip. It had a few registers (4?),
but the neat thing was that the actual scratchpad cell that was used for
a given register was run-time settable. To do a "process switch" you
said "Use loc.234 for the N register, 107 for the T reg, and (last)
loc.025 for the PC".

Now, following that through to bigger machines, we should look at
decoupling the NAMING of "registers" (which is restricted by the desire
to keep the instruction stream efficient) from the LOCATION of registers
(which should be a large address space). The context-switch problem then
becomes one of changing the name-location map quickly, as the TI 9900
does. Additionally, with smart enough compilers (but don't ask me to
program in assembly for THIS machine!) you can change the assignment of
registers to locations as needed throughout a given module or even
subroutine, to balance the "register working set".

("Quick Martha, call the nut house! He's gone and re-invented register
page-maps!")

If the register page-map were actually of the translation-lookaside-cache
style (with the same number of entries as registers), and were onboard
the CPU chip, it could probably compete in performance with fixed
hardwired register sets.

Just a little something to jog the discussion along...

Rob Warnock

UUCP: {sri-unix,amd70,hpda,harpo,ihnp4,allegra}!fortune!rpw3
DDD: (415)595-8444
USPS: Fortune Systems Corp, 101 Twin Dolphin Drive, Redwood City, CA 94065