Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!rutgers!mit-eddie!uw-beaver!cornell!rochester!pt.cs.cmu.edu!andrew.cmu.edu!zs01+
From: zs01+@andrew.cmu.edu (Zalman Stern)
Newsgroups: comp.arch
Subject: Re: using assembler
Message-ID: <0X1Yh8y00Vs8QEumsN@andrew.cmu.edu>
Date: 15 Aug 88 02:24:08 GMT
References: <6341@bloom-beacon.MIT.EDU> <60859@sun.uucp> <474@m3.mfci.UUCP> <2926@utastro.UUCP> <37014@linus.UUCP> <1086@garth.UUCP>, <17326@gatech.edu>
Organization: Carnegie Mellon
Lines: 80
In-Reply-To: <17326@gatech.edu>

> *Excerpts from ext.nn.comp.arch: 27-Jul-88 Re: using assembler Ken Seefried*
> *iii@gatech. (2803)*
> And C is a cakewalk? Fine, I'll tell you what...let's get together some
> time and program, say, a MIPS machine, you in asm, and me in C.
> I'll even tie one hand behind my back and program in FORTRAN. And
> we'll see how far each gets (the MIPS is a RISC machine. Even trivial
> operations require non-trivial amounts of asm code).

I am not sure which instructions you are referring to as requiring non-trivial amounts of code to synthesize. Are they things I am likely to use quite often? Here are some reasons why the MIPS R2000/R3000 should be quite reasonable for assembly language programming:

Simplicity: 3 hours with "MIPS R2000 RISC Architecture" by Gerry Kane and I understood (or "grokked" if you prefer) the R2000 and R2010 (floating point unit). There were none of those "What would I use that instruction for?" type questions going through my mind.

Orthogonal register set: One register is pretty much as good as another. None of this "Where am I going to put the CX register so I can do a shift" crud you run into on the earlier Intel beasties.

Abundance of registers: You get ten or twelve (depending on whether or not you count v0 and v1) unsaved registers to use as temporaries. (Actually, you can add four to that for the argument passing registers a0-a3.)

Arguments passed in registers: Many routines will not need to allocate a stack frame at all. This frees you from having to deal with the calling convention a lot of the time.

Single cycle instructions: You don't have to have an instruction timing table handy to write efficient code. Almost every instruction takes one cycle. The only exceptions I know of are multiply/divide, loads/stores, and branches. (And of course floating point.)

Intelligent assembler: The assembler removes the burden of scheduling delay slots from the programmer. The assembler can also synthesize addressing modes for the programmer.

Of course I don't write entire programs in assembly. (For many reasons, most of which can be summed up by saying "Assembly language is just the wrong level of abstraction.") I occasionally find it necessary to write a routine or two in assembly, either because high level languages can't do what I need or because I need extreme speed. Examples of where this has come up in practice are dynamic loading and DES encryption.

We have a dynamic loading system which uses a "link snapping" mechanism. This means that when you call a routine that hasn't been loaded yet, you wind up in some trampoline code that loads the routine, fixes the original reference to the routine to point to the newly loaded code, and finally jumps to the new routine. Since there is no way to jump to a routine in C, this trampoline code must be written in assembly.

In the DES case, assembly can win big because DES is essentially a bunch of bit manipulations on a small block of data (64 bits if I remember correctly). In assembly, the entire block of data can be loaded into the register file and manipulated. The lack of loads and stores during the manipulation makes the encryption run much faster. (I have yet to run into a C compiler that is tense enough to do this. Maybe someday, one will exist.) Most people have decided that the portability loss of assembly is not worth the speed gain for DES code.

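To make the link-snapping description above concrete, here is a minimal sketch in C. It is an approximation under stated assumptions, not the dynamic loading system discussed in this post: the names (square_ref, square_stub, load_routine, real_square) are hypothetical, the "loader" is faked by returning the address of a routine that is already linked in, and the routine's signature is fixed in advance. It also hints at why the real trampoline needs assembly: C can only call through the patched reference with a known argument list, whereas a true trampoline has to jump to the target with the caller's registers and stack untouched, whatever the signature.

```c
#include <assert.h>

/* Hypothetical routine type. A real trampoline must work for any signature;
   this sketch assumes int -> int so that plain C can express it. */
typedef int (*routine_fn)(int);

/* Counts how many times the pretend loader ran. */
static int load_count = 0;

/* Stands in for a routine living in a module that is not loaded yet. */
static int real_square(int x)
{
    return x * x;
}

static int square_stub(int x);   /* the trampoline, defined below */

/* The reference that callers go through. It starts out pointing at the stub. */
static routine_fn square_ref = square_stub;

/* Pretend loader. A real one would map the object code and relocate it;
   here it just hands back the address of real_square. */
static routine_fn load_routine(void)
{
    load_count++;
    return real_square;
}

/* Trampoline: load the routine, snap the link, then transfer control. */
static int square_stub(int x)
{
    square_ref = load_routine();        /* fix the original reference */
    assert(square_ref != square_stub);  /* the link is now snapped */
    return square_ref(x);               /* a call, not a jump: the C limitation */
}
```

The first call through square_ref (for example square_ref(3), which returns 9) lands in the stub and triggers the load. Every later call goes straight to real_square, so load_count stays at 1.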
I have never actually programmed on the MIPS machine. I have however written assembly code for the IBM RT which has some of the same features. (Notably passing arguments in registers.) I have had a much easier time on the RT than on either the VAX, the 68000, or the 8086. (Granted the 68020 and the 80386 fix a few of my complaints with these processor families.)

In short, a processor's machine language ought to be simple, regular, and damn fast.

Sincerely,
Zalman Stern
Internet: zs01+@andrew.cmu.edu     Usenet: I'm soooo confused...
Information Technology Center, Carnegie Mellon, Pittsburgh, PA 15213-3890