Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!mit-eddie!genrad!decvax!decwrl!labrea!navajo!billw
From: billw@navajo.STANFORD.EDU (William E. Westfield)
Newsgroups: comp.lang.misc
Subject: Assembly language and Speed...
Message-ID: <1233@navajo.STANFORD.EDU>
Date: Thu, 18-Dec-86 21:13:13 EST
Article-I.D.: navajo.1233
Posted: Thu Dec 18 21:13:13 1986
Date-Received: Fri, 19-Dec-86 05:26:52 EST
Organization: Stanford University
Lines: 62
Keywords: its sooo frustrating...


Well, I'll admit to doing almost all of my current programming in
assembly languages of various sorts (mostly tops20 monitor and utility
hacking - it's already in macro, so I don't have much choice.  On the
other hand, the PDP10, like the PDP11, has an artistically beautiful
instruction set.  The other thing I program a lot in assembly language
is Intel 808x based MSDOS machines...)

While higher level languages certainly have portability advantages in
many cases (but not all), and there are clear areas where complexity
forbids assembly (I hope never to have to write a package doing heavy
floating point math in assembler, even on a machine that has floating
point instructions!), I'd like to argue against the idea that HLL code
is only a "little bit" slower than assembly code.

It is true that any compiler can generate code as good as or better than
a human for straight line math code.  A good compiler ought to be able to
optimize loops and array accesses and register allocation for complex
expressions as well as a human.  But where HLL's fall down is when you
start to do function and procedure calls.  And of course you should do
a lot of those in a well structured program.

An assembly language programmer has the option of writing "fast"
subroutines that pass arguments in registers and don't bother setting
up stack frames, and so on.  He can make sure that the arguments he is
passing in registers are already in the registers that they are
supposed to be in, and results go where they are ready to be used next.
Arguments can be passed by value or by reference, whichever is more
convenient, or faster.  And of course there are special instructions to
be exploited.

As an extreme example, which I have actually done, consider writing
some kind of parser in a high level language.  One of the things you
have to do is push and pop tokens from a stack.  Great.  Almost every
popular processor has instructions that do PUSH and POP in hardware.
Some processors do overflow and/or underflow detection at the same
time.  Many processors allow multiple stacks to exist concurrently
(multiple stack pointers in registers).  In other cases it is
perfectly possible to use the hardware CALL/RETURN/etc stack for user
values too.  So what happens when you try to do this in PASCAL, huh?
Well, you write functions called PUSH and POP, and your arguments get
put on a stack frame on the hardware stack, and you do a call
instruction which puts the return addresses on a stack and goes off to
code that probably fetches the value of stack pointer from memory and
puts it back, and does a return instruction that fetches the return
address back, and then you clean up the stack frame and continue on.
Wow, that was only 10 or 20 times slower than the hardware instruction.
Boy am I glad that I was able to write that in a high level language
with error checking and things like that instead of having to remember
the strange mnemonic for the PUSH instruction (gee, it could have been
something really nasty like MOVW source,A2@-).

That was a frustrating Pascal program to write.
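
For concreteness, a rough sketch of the call-per-operation shape described
above, transliterated into C rather than Pascal.  The names STACK_SIZE,
stack, sp, push and pop are made up for illustration and are not taken
from the original parser.  Every push or pop costs an argument setup, a
call, a load and store of the stack pointer in memory, a bounds check,
and a return (unless the compiler happens to inline it).

```c
#include <stddef.h>

#define STACK_SIZE 64               /* hypothetical capacity */

static int    stack[STACK_SIZE];    /* token stack kept in memory */
static size_t sp = 0;               /* stack pointer is a global, also in memory */

/* Push one token.  Returns 0 on success, -1 on overflow. */
int push(int token)
{
    if (sp >= STACK_SIZE)           /* software overflow check */
        return -1;
    stack[sp++] = token;            /* fetch sp, store token, write sp back */
    return 0;
}

/* Pop one token into *token.  Returns 0 on success, -1 on underflow. */
int pop(int *token)
{
    if (sp == 0)                    /* software underflow check */
        return -1;
    *token = stack[--sp];           /* fetch sp, load token, write sp back */
    return 0;
}
```

A caller would write something like `if (push(tok) != 0) ...` at every
site, which is exactly the per-operation overhead being complained about.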

[OK, I'll allow that in C this might end up written as a macro
 doing something like *sp++ = &source that would have a good chance
 of being optimized to a single instruction too, but this is not
 specifically a C vs Assembler debate...]
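
A minimal sketch of that macro approach, under some assumptions: the
names NTOKENS, tokens, tos, PUSH, POP and add_top_two are hypothetical,
the stack grows upward, and the value (not the address) of the operand
is stored.  There is no overflow or underflow checking at all; that is
the price of getting down to something a compiler has a chance of
turning into one auto-increment or auto-decrement instruction.

```c
#define NTOKENS 64                  /* hypothetical capacity */

static int  tokens[NTOKENS];        /* token stack */
static int *tos = tokens;           /* points at the next free slot */

/* No bounds checks: the caller must not overflow or underflow. */
#define PUSH(v) (*tos++ = (v))      /* store, then post-increment */
#define POP()   (*--tos)            /* pre-decrement, then load */

/* Example use: replace the top two tokens with their sum. */
int add_top_two(void)
{
    int b = POP();
    int a = POP();
    PUSH(a + b);
    return tos[-1];                 /* peek at the new top of stack */
}
```

On a machine whose stack grows downward, the push would more naturally
be written `*--tos = (v)` and the pop `*tos++`.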

Sigh.

BillW, writing small, fast, programs, in assembler, and happy.