Path: utzoo!attcan!uunet!lll-winken!lll-lcc!ames!ll-xn!mit-eddie!uw-beaver!uw-june!pardo
From: pardo@june.cs.washington.edu (David Keppel)
Newsgroups: comp.arch
Subject: Re: Copying bytes quickly (Was: CISC bashing at USENIX)
Message-ID: <5305@june.cs.washington.edu>
Date: 16 Jul 88 05:43:06 GMT
References: <6965@ico.ISC.COM> <936@garth.UUCP> <202@baka.stan.UUCP> <1746@vaxb.calgary.UUCP> <974@garth.UUCP>
Reply-To: pardo@uw-june.UUCP (David Keppel)
Organization: U of Washington, Computer Science, Seattle
Lines: 31

In article <974@garth.UUCP> walter@garth.UUCP (Walter Bays) writes:
>[ while(--count) *dst++ = *src++; ]
>[ bcopy should use an unrolled loop, word copy if ok alignment ]
>[ assert byte alignment ]

Interestingly, the VAX |movc3| and |movc5| instructions will copy
byte-at-a-time until they get to a longword boundary, then copy 4
bytes at a time until they get to less than 4 bytes from the end of
the copy, then copy byte-at-a-time again.  All done in microcode.
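
In C, that strategy looks roughly like the sketch below.  This is only
an illustration, not the microcode itself: the name copy_aligned is
made up, a longword is assumed to be 4 bytes, and the sketch aligns on
the destination address (a guess; the description above does not say
which address the microcode aligns on).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the VAX microcode:
 * head bytes, then 4-byte longwords, then tail bytes. */
void *copy_aligned(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Byte-at-a-time until the destination is on a 4-byte boundary. */
    while (count > 0 && ((uintptr_t)d & 3) != 0) {
        *d++ = *s++;
        count--;
    }

    /* Four bytes at a time while a full longword remains. */
    while (count >= 4) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* source may still be unaligned */
        memcpy(d, &w, sizeof w);
        d += 4;
        s += 4;
        count -= 4;
    }

    /* Fewer than 4 bytes left: finish byte-at-a-time. */
    while (count > 0) {
        *d++ = *s++;
        count--;
    }
    return dest;
}
```

The two memcpy calls in the middle loop are there only to keep the
sketch portable to machines that fault on unaligned loads; a VAX does
not need them.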

>[ smart compilers inline ]

With GNU CC, you can inline assembly code and NOT have to guess
where the registers are going to wind up.  Thus you can write your
own "smart compiler" for a VAX:


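#ifdef vax
#ifdef __GNUC__
/* movc3 len,src,dst: len is a 16-bit word (count < 65536); r0-r5 are clobbered. */
/* Arguments follow the usual bcopy order: source first, then destination. */
#   define bcopy(src,dest,count) {\
	register const void *__m_src = (src); \
	register void *__m_dest = (dest); \
	register unsigned int __m_count = (count); \
	asm volatile \
	    ("movc3 %0,(%1),(%2)" \
	     : /* no outputs */ \
	     : "r"(__m_count), "r"(__m_src), "r"(__m_dest) \
	     : "r0", "r1", "r2", "r3", "r4", "r5", "memory"); \
    }
#endif
#endif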
	
NOTE: I haven't actually tried this.

	;-D on  ( Cheap hacks for speed )  Pardo