Path: utzoo!attcan!uunet!lll-winken!lll-tis!helios.ee.lbl.gov!pasteur!ucbvax!decwrl!alverson
From: alverson@decwrl.dec.com (Robert Alverson)
Newsgroups: comp.arch
Subject: Re: RISC bashing at USENIX
Summary: Library code does the right thing
Message-ID: <603@bacchus.DEC.COM>
Date: 14 Jul 88 20:50:36 GMT
References: <6965@ico.ISC.COM> <936@garth.UUCP> <202@baka.stan.UUCP> <59798@sun.uucp> <204@baka.stan.UUCP>
Reply-To: alverson@decwrl.UUCP (Robert Alverson)
Distribution: na
Organization: Digital Equipment Corporation
Lines: 30

In article <204@baka.stan.UUCP> stan!landru@boulder.edu writes:
>In article <59798@sun.uucp> pope@sun.UUCP (John Pope) writes:
>>>register long count;
>>>register long *src, *dst;
>           ^^^^
>>>     while( --count )
>>>     {
>>>             *dst++ = *src++;
>>>     }
>>*** Warning! Brain damaged software alert! ***
>>This should be re-coded to use the bcopy() library routine, which
>>does a 32 bit copy instead of a byte at a time. You should see a
>>*noticable* improvement. Moral: use your libraries, that's what they're
>>there for.

Despite the incorrectness of Pope's reasoning (the quoted loop already
copies 32-bit longs, not bytes), I tend to agree that you should use a
library routine for a function as low-level as copying memory. In
particular, a library routine might unroll the loop many times, so that
the cost per word approaches that of a single load+store pair. This
would make the cost per byte nearly 5 cycles on Sparc (I think),
bringing it to 300ns (?). That is still rather high; it seems like a
RISC ought to do a load+store in 2 or 3 cycles (scheduled!).

Similarly, on a VAX, the library routine might just happen to
correspond directly to a VAX instruction, so that the loop could be
executed in microcode. In any case, copying memory seems like such a
fundamentally useful operation that you can expect the library code to
be at least as good as what you can get out of the compiler.

Bob
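
To make the unrolling point concrete, here is a rough sketch of what a
hand-unrolled word copy might look like next to the library call. The
function names are made up for illustration, the unroll factor of four
is arbitrary, and standard memcpy() stands in for bcopy(); both versions
assume the source and destination do not overlap. A real library routine
would likely also worry about alignment and be tuned per machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): copy `count` longs,
 * unrolled four times so the loop overhead (compare, branch, counter
 * update) is paid once per four load+store pairs. */
void copy_longs_unrolled(long *dst, const long *src, size_t count)
{
    assert(count == 0 || (dst != NULL && src != NULL));

    /* Main body: four words per iteration. */
    while (count >= 4) {
        dst[0] = src[0];
        dst[1] = src[1];
        dst[2] = src[2];
        dst[3] = src[3];
        dst += 4;
        src += 4;
        count -= 4;
    }

    /* Remainder: zero to three leftover words. */
    while (count > 0) {
        *dst++ = *src++;
        count--;
    }
}

/* The library route: let memcpy() do whatever the platform does best
 * (unrolled loop, block-move instruction, and so on). */
void copy_longs_library(long *dst, const long *src, size_t count)
{
    memcpy(dst, src, count * sizeof *src);
}
```

Both functions should leave the destination identical; the difference is
only in who gets to decide how the words are moved. With the library
call, that decision can change per machine without touching the caller.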