Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!mit-eddie!genrad!decvax!decwrl!labrea!navajo!billw
From: billw@navajo.STANFORD.EDU (William E. Westfield)
Newsgroups: comp.lang.misc
Subject: Assembly language and Speed...
Message-ID: <1233@navajo.STANFORD.EDU>
Date: Thu, 18-Dec-86 21:13:13 EST
Article-I.D.: navajo.1233
Posted: Thu Dec 18 21:13:13 1986
Date-Received: Fri, 19-Dec-86 05:26:52 EST
Organization: Stanford University
Lines: 62
Keywords: its sooo frustrating...

Well, I'll admit to doing almost all of my current programming in assembly languages of various sorts (mostly tops20 monitor and utility hacking - it's already in macro, so I don't have much choice. On the other hand, the PDP10, like the PDP11, has an artistically beautiful instruction set. The other thing I program a lot in assembly language is Intel 808x based MSDOS machines...)

While higher level languages certainly have portability advantages in many cases (but not all), and there are clear areas where complexity forbids assembly (I hope never to have to write a package doing heavy floating point math in assembler, even on a machine that has floating point instructions!), I'd like to argue against the idea that HLL code is only a "little bit" slower than assembly code.

It is true that any compiler can generate code as good as or better than a human's for straight line math code. A good compiler ought to be able to optimize loops and array accesses and register allocation for complex expressions as well as a human. But where HLLs fall down is when you start to do function and procedure calls. And of course you should do a lot of those in a well structured program.

An assembly language programmer has the option of writing "fast" subroutines that pass arguments in registers and don't bother setting up stack frames, and so on.
He can make sure that the arguments he is passing in registers are already in the registers that they are supposed to be in, and that results go where they are ready to be used next. Arguments can be passed by value or by reference, whichever is more convenient, or faster. And of course there are special instructions to be exploited.

As an extreme example, which I have actually done, consider writing some kind of parser in a high level language. One of the things you have to do is push and pop tokens from a stack. Great. Almost every popular processor has instructions that do PUSH and POP in hardware. Some processors do overflow and/or underflow detection at the same time. Many processors allow multiple stacks to exist concurrently (multiple stack pointers in registers). In other cases it is perfectly possible to use the hardware CALL/RETURN/etc stack for user values too.

So what happens when you try to do this in Pascal, huh? Well, you write functions called PUSH and POP, and your arguments get put in a stack frame on the hardware stack, and you do a call instruction which puts the return address on the stack and goes off to code that probably fetches the value of the stack pointer from memory and puts it back, and does a return instruction that fetches the return address back, and then you clean up the stack frame and continue on. Wow, that was only 10 or 20 times slower than the hardware instruction. Boy am I glad that I was able to write that in a high level language with error checking and things like that instead of having to remember the strange mnemonic for the PUSH instruction (gee, it could have been something really nasty like MOVW source,A2@-). That was a frustrating Pascal program to write.

[OK, I'll allow that in C this might end up written as a macro doing something like *sp++ = &source that would have a good chance of being optimized to a single instruction too, but this is not specifically a C vs Assembler debate...]

Sigh.
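
For the curious, here is a minimal sketch in C of the two styles contrasted above: PUSH and POP as real function calls (roughly what the Pascal version amounts to) versus macros that expand in place. The names token_t, STACK_DEPTH, push_fn, pop_fn and demo are made up for illustration, and the sketch pushes the token value itself rather than an address. Whether a given compiler really boils the macro down to one instruction is an assumption that depends on the machine and the optimizer.

```c
#include <assert.h>

#define STACK_DEPTH 64            /* hypothetical capacity, illustration only */

typedef int token_t;              /* hypothetical token type */

static token_t stack[STACK_DEPTH];
static token_t *sp = stack;       /* points at the next free slot */

/* Function-call flavour: every push pays for argument passing,
 * a call, a return, and a software overflow check. */
static void push_fn(token_t t)
{
    assert(sp < stack + STACK_DEPTH);
    *sp++ = t;
}

/* Same story for pop, with a software underflow check. */
static token_t pop_fn(void)
{
    assert(sp > stack);
    return *--sp;
}

/* Macro flavour: expands in place. No call, no frame, no checking. */
#define PUSH(t) (*sp++ = (t))
#define POP()   (*--sp)

/* Tiny demo mixing both flavours on the same stack.
 * Pushes 1, 2, 3 and returns the sum of the three values popped. */
static int demo(void)
{
    int sum = 0;

    push_fn(1);
    PUSH(2);
    push_fn(3);

    sum += POP();       /* pops 3 */
    sum += pop_fn();    /* pops 2 */
    sum += POP();       /* pops 1 */

    return sum;         /* the stack is empty again here */
}
```

Both flavours do the same work on the same stack pointer; the difference is only in how much call overhead surrounds the one statement that matters. Note that the macro version gives up the overflow and underflow checks, which is exactly the trade that hardware stack instructions with built-in detection avoid.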
BillW, writing small, fast, programs, in assembler, and happy.