Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!lll-lcc!ames!ucbcad!ucbvax!ulysses!ggs
From: ggs@ulysses.homer.nj.att.com (Griff Smith)
Newsgroups: comp.lang.misc
Subject: Re: Assembly language and Speed...
Message-ID: <1552@ulysses.homer.nj.att.com>
Date: Fri, 26-Dec-86 20:30:18 EST
Article-I.D.: ulysses.1552
Posted: Fri Dec 26 20:30:18 1986
Date-Received: Sat, 27-Dec-86 00:44:58 EST
References: <1233@navajo.STANFORD.EDU> <464@unc.unc.UUCP> <4375@mit-eddie.MIT.EDU>
Organization: AT&T Bell Laboratories, Murray Hill
Lines: 88
Keywords: Space
Summary: Peephole optimization is a well-developed art

In article <4375@mit-eddie.MIT.EDU>, rh@mit-eddie.MIT.EDU (Randy Haskins) writes:
> Just to add to the argument, here is an example of what a human can do
> that I'd seriously doubt that a compiler would do.  I first saw something
> like this in some local hacks to the Monitor for a Tops-20 at MIT.  At
> first, I was offended, but then I saw the sheer elegance of it.
> 
> PRSPC:	Skipa	T2,[32.]	; move a space into 2, skip next instruction
> PRTAB:	Movei  T2,9.		; put a tab in 2
>		Movei   T1,.PRIOU	; primary output
>		Bout%			; byte output
>		 Ercal  .+1		; ignore error
>		Popj    P		; return
> 
> Not only does it avoid having to make an extraneous jump or call (in order
> to use common code), it saves space.  Would a compiler "know" to do something
> like this?  I doubt it.
> 
> Randwulf  (Randy Haskins);  Path= genrad!mit-eddie!rh

Actually, this kind of optimization was standard for the TOPS-10 and
TOPS-20 Fortran and LISP compilers.  A simple peephole optimizer could
find these cases easily.  The first step would be for the optimizer to
recognize that the ends of the two functions are identical; it would
then do a space optimization by creating a jump to the common code.
The {skipa, movei} trick would then fall out while applying standard
peephole optimizations of moves and jumps.  As an example of how far a
compiler could carry it, consider the following hypothetical output
from a (probably non-existent) optimizing C compiler:

Given the statement:

	if (a == 0 || a == 1)
		b += 1;
	else
		b -= 1;

the raw PDP-10 assembly language from the code generator would be something like:

	MOVE	R1,a	; if (a == 0
	JUMPE	R1,L1
	CAIE	R1,1	; || a == 1)
	 JRST	L2
L1:
	AOS	b	; b += 1
	JRST	L3
L2:
	SOS	b	; b -= 1
L3:

A simple optimization of the last five lines of text would reduce this to:

	MOVE	R1,a	; if (a == 0
	JUMPE	R1,L1
	CAIE	R1,1	; || a == 1)
	 JRST	L2
L1:
	AOSA	b	; b += 1 (skip next instruction)
L2:
	 SOS	b	; b -= 1
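
The rewrite just applied is a pure pattern match, and it can be sketched in
a few lines.  What follows is a minimal illustration in Python, not code from
any real compiler: the tuple-based instruction format and the names
SKIP_ALWAYS, label_is_referenced and jump_over_one_to_skip are all invented
for this sketch.  It assumes the instruction being converted is not itself
the target of a preceding skip, which a real optimizer would have to check.

```python
# Hypothetical format: a label is a string ending in ":",
# an instruction is a tuple (opcode, operand, ...).
SKIP_ALWAYS = {"AOS": "AOSA", "SOS": "SOSA"}  # opcode -> skip-always variant


def is_label(item):
    return isinstance(item, str)


def label_is_referenced(prog, name, ignore_index):
    """True if an instruction other than prog[ignore_index] names the label."""
    return any(
        not is_label(item) and name in item[1:]
        for i, item in enumerate(prog)
        if i != ignore_index
    )


def jump_over_one_to_skip(prog):
    """Rewrite  X / JRST Lend / Lmid: / Y / Lend:  as  XA / Lmid: / Y."""
    out = list(prog)
    i = 0
    while i + 4 < len(out):
        x, jump, mid, y, end = out[i:i + 5]
        if (not is_label(x) and x[0] in SKIP_ALWAYS
                and not is_label(jump) and jump[0] == "JRST"
                and is_label(mid) and not is_label(y)
                and is_label(end) and jump[1] == end[:-1]
                and not label_is_referenced(out, end[:-1], i + 1)):
            # The jump passed over exactly one instruction, so a skip suffices;
            # the end label is dropped because nothing else refers to it.
            out[i:i + 5] = [(SKIP_ALWAYS[x[0]],) + x[1:], mid, y]
        i += 1
    return out


RAW = [
    ("MOVE", "R1", "a"),
    ("JUMPE", "R1", "L1"),
    ("CAIE", "R1", "1"),
    ("JRST", "L2"),
    "L1:",
    ("AOS", "b"),
    ("JRST", "L3"),
    "L2:",
    ("SOS", "b"),
    "L3:",
]

for item in jump_over_one_to_skip(RAW):
    print(item)
```

Applied to the raw listing, the rule produces the AOSA/SOS form shown above.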

This can then be further optimized by inverting the sense of the compare
and eliminating the jump (JRST):

	MOVE	R1,a	; if (a == 0
	JUMPE	R1,L1
	CAIN	R1,1	; || a == 1)
L1:
	 AOSA	b	; b += 1 (skip next instruction)
	 SOS	b	; b -= 1
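
This step is also a simple pattern: a conditional skip followed by a jump
over a single instruction becomes the opposite skip with no jump at all.
Here is a rough Python sketch under the same caveats as any toy: the tuple
format and the names INVERSE, referenced_elsewhere and invert_skip_over_jump
are made up for illustration, and only the CAIE/CAIN pair is handled.

```python
# Hypothetical format: a label is a string ending in ":",
# an instruction is a tuple (opcode, operand, ...).
INVERSE = {"CAIE": "CAIN", "CAIN": "CAIE"}  # skip-if-equal <-> skip-if-not-equal


def is_label(item):
    return isinstance(item, str)


def referenced_elsewhere(prog, name, ignore_index):
    """True if an instruction other than prog[ignore_index] names the label."""
    return any(
        not is_label(item) and name in item[1:]
        for i, item in enumerate(prog)
        if i != ignore_index
    )


def invert_skip_over_jump(prog):
    """Rewrite  SKIPcc / JRST L / [labels] / I / L:  as  SKIPnot-cc / [labels] / I."""
    out = list(prog)
    i = 0
    while i + 1 < len(out):
        test, jump = out[i], out[i + 1]
        if (not is_label(test) and test[0] in INVERSE
                and not is_label(jump) and jump[0] == "JRST"):
            j = i + 2
            while j < len(out) and is_label(out[j]):
                j += 1          # labels occupy no space, so step past them
            # out[j] is now the one instruction the JRST would jump over.
            if (j + 1 < len(out) and out[j + 1] == jump[1] + ":"
                    and not referenced_elsewhere(out, jump[1], i + 1)):
                inverted = (INVERSE[test[0]],) + test[1:]
                out[i:j + 2] = [inverted] + out[i + 2:j + 1]
        i += 1
    return out


STEP2 = [
    ("MOVE", "R1", "a"),
    ("JUMPE", "R1", "L1"),
    ("CAIE", "R1", "1"),
    ("JRST", "L2"),
    "L1:",
    ("AOSA", "b"),
    "L2:",
    ("SOS", "b"),
]

for item in invert_skip_over_jump(STEP2):
    print(item)
```

Run on the previous listing, it yields the CAIN form shown above, with the
JRST and the L2 label gone.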

Finally, by replacing the {move, jumpe} with an instruction that loads
and tests in one operation we get:

	SKIPE	R1,a	; if (a == 0
	 CAIN	R1,1	; || a == 1)
	 AOSA	b	; b += 1 (skip next instruction)
	 SOS	b	; b -= 1
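
One way to gain some confidence in a chain of rewrites like this is to run
every version against the original C statement.  The Python below is a toy
interpreter for just the handful of instructions used here; the tuple format
and the names run, c_statement and VERSIONS are invented for this sketch, and
it ignores 36-bit word size and every other PDP-10 detail.

```python
def run(prog, a, b):
    """Interpret a tiny subset of PDP-10 instructions; return the final b."""
    labels, code = {}, []
    for item in prog:
        if isinstance(item, str):
            labels[item[:-1]] = len(code)   # a label names the next instruction
        else:
            code.append(item)
    mem, reg, pc = {"a": a, "b": b}, {}, 0
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "MOVE":                    # load register from memory
            reg[args[0]] = mem[args[1]]
        elif op == "SKIPE":                 # load, skip next if value is zero
            reg[args[0]] = mem[args[1]]
            if mem[args[1]] == 0:
                pc += 1
        elif op == "JUMPE":                 # jump if register is zero
            if reg[args[0]] == 0:
                pc = labels[args[1]]
        elif op == "CAIE":                  # compare immediate, skip if equal
            if reg[args[0]] == args[1]:
                pc += 1
        elif op == "CAIN":                  # compare immediate, skip if not equal
            if reg[args[0]] != args[1]:
                pc += 1
        elif op == "JRST":                  # unconditional jump
            pc = labels[args[0]]
        elif op == "AOS":                   # add one to memory
            mem[args[0]] += 1
        elif op == "AOSA":                  # add one to memory, always skip
            mem[args[0]] += 1
            pc += 1
        elif op == "SOS":                   # subtract one from memory
            mem[args[0]] -= 1
        else:
            raise ValueError("unsupported opcode: " + op)
    return mem["b"]


def c_statement(a, b):
    """The C statement from the article: if (a == 0 || a == 1) b += 1; else b -= 1;"""
    return b + 1 if a == 0 or a == 1 else b - 1


HEAD = [("MOVE", "R1", "a"), ("JUMPE", "R1", "L1")]
VERSIONS = [
    HEAD + [("CAIE", "R1", 1), ("JRST", "L2"), "L1:", ("AOS", "b"),
            ("JRST", "L3"), "L2:", ("SOS", "b"), "L3:"],
    HEAD + [("CAIE", "R1", 1), ("JRST", "L2"), "L1:", ("AOSA", "b"),
            "L2:", ("SOS", "b")],
    HEAD + [("CAIN", "R1", 1), "L1:", ("AOSA", "b"), ("SOS", "b")],
    [("SKIPE", "R1", "a"), ("CAIN", "R1", 1), ("AOSA", "b"), ("SOS", "b")],
]

for prog in VERSIONS:
    for a in range(-3, 5):
        assert run(prog, a, 10) == c_statement(a, 10)
print("all four listings agree with the C statement")
```

The check covers only a few values of a on either side of 0 and 1, which is
enough to exercise every path through these four short sequences.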

None of these substitutions are any more complicated than ones commonly
used by the UNIX(R) C compiler, yet the resulting sequence of instructions
appears to be the result of clever design.  Given how long it took to
make sure I didn't have any mistakes in the above example, I would much
rather have a compiler do it for me.
-- 
Griff Smith	AT&T (Bell Laboratories), Murray Hill
Phone:		1-201-582-7736
UUCP:		{allegra|ihnp4}!ulysses!ggs
Internet:	ggs@ulysses.uucp