Path: utzoo!attcan!uunet!lll-winken!lll-tis!helios.ee.lbl.gov!pasteur!agate!ucbvax!TYCHO.YERKES.UCHICAGO.EDU!pearce
From: pearce@TYCHO.YERKES.UCHICAGO.EDU ("Eric C. Pearce")
Newsgroups: comp.lang.c
Subject: Duff's device.
Message-ID: <8809092109.AA06071@tycho.yerkes.uchicago.edu>
Date: 9 Sep 88 21:09:55 GMT
Sender: usenet@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 30

After trying Doug Schmidt's Duff timer on our Sun 3/260 with the
Sun-supplied cc compiler, I found Duff's device considerably faster
(almost a factor of two). However, the test is unfair in the sense
that no attempt is made to optimize the non-Duff copy without
resorting to syntax nightmares like the Duff code.

Additionally, Schmidt's test is somewhat unfair since it leaves some
loop initialization (such as the initialization of A and B) out of the
loop. That cost will be quite small if the copy is large, though.

Personally, I prefer this "conventional" copy fragment:

    A = array1;
    B = array2;
    n = Count / 8;

    /* copy the Count%8 leftover elements first */
    for (i = Count%8; i > 0; i--)
        *A++ = *B++;

    /* then copy the rest in blocks of eight */
    while (--n >= 0) {
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
        *A++ = *B++;
    }

From a performance standpoint, it falls only 2% behind Duff's device
and is 102% more readable and not "kludgy".
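
For comparison, here is a minimal sketch of the two copies side by side as self-contained C functions. This is not the code from Schmidt's timer, which is not shown in this post. The function names (copy_unrolled, copy_duff, copies_agree), the int element type, and the 64-element test buffers are all assumptions made for illustration. The Duff version is the memory-to-memory variant that increments both pointers; Duff's original wrote to a single fixed output location.

```c
#include <assert.h>
#include <string.h>

/* The "conventional" copy from the post: leftover elements first,
 * then blocks of eight. Element type int is an assumption. */
static void copy_unrolled(int *dst, const int *src, int count)
{
    int i, n;

    assert(count >= 0);
    n = count / 8;
    for (i = count % 8; i > 0; i--)
        *dst++ = *src++;
    while (--n >= 0) {
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
    }
}

/* Duff's device, memory-to-memory variant: the switch jumps into the
 * middle of the unrolled do-while to handle the count%8 leftovers. */
static void copy_duff(int *dst, const int *src, int count)
{
    int n;

    assert(count >= 0);
    if (count == 0)
        return;                 /* the device itself assumes count > 0 */
    n = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *dst++ = *src++;
    case 7:      *dst++ = *src++;
    case 6:      *dst++ = *src++;
    case 5:      *dst++ = *src++;
    case 4:      *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--n > 0);
    }
}

/* Hypothetical helper: returns 1 if both routines copy exactly the
 * first `count` elements and leave the rest of the buffer untouched. */
static int copies_agree(int count)
{
    int src[64], a[64], b[64];
    int i;

    assert(count >= 0 && count <= 64);
    for (i = 0; i < 64; i++) {
        src[i] = i * 3 + 1;
        a[i] = -1;
        b[i] = -1;
    }
    copy_unrolled(a, src, count);
    copy_duff(b, src, count);
    return memcmp(a, b, sizeof a) == 0
        && memcmp(a, src, (size_t)count * sizeof a[0]) == 0
        && (count == 64 || a[count] == -1);
}
```

Calling copies_agree with counts on either side of a multiple of eight (say 7, 8, and 9) exercises both the leftover handling and the unrolled block in each routine. It checks only correctness; it says nothing about the timing figures quoted above.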