Path: utzoo!attcan!uunet!lll-winken!lll-tis!helios.ee.lbl.gov!pasteur!ucbvax!decwrl!labrea!rutgers!njin!princeton!phoenix!pupthy!lgy
From: lgy@pupthy.PRINCETON.EDU (Larry Yaffe)
Newsgroups: comp.arch
Subject: Fancy indexing (was: Re: m88000 benchmarks)
Summary: Move the data!
Keywords: FFT benchmarks M88K MIPS
Message-ID: <3261@phoenix.Princeton.EDU>
Date: 15 Jul 88 03:29:21 GMT
References: <1384@claude.oakhill.UUCP>
Sender: news@phoenix.Princeton.EDU
Reply-To: lgy@pupthy.PRINCETON.EDU (Larry Yaffe)
Organization: Physics Dept, Princeton Univ
Lines: 21

In article <1384@claude.oakhill.UUCP> wca@oakhill.UUCP (william anderson) writes:

    [comments on E. Killian's coding of FFT inner loop]

+One problem with this code is that it assumes the "stride" of the loop
+(the variable "n1" in the C code segment above) is unity!
+What about the code for the inner loop in the general case?  What
+effect does the assumption of non-unity stride have on the MIPS loop
+timing?
    [...]

    This FFT code is a typical example where it is worthwhile to
move data explicitly outside the inner loop, in order to keep the
indexing in the inner loop as simple as possible.  The overhead of
the copy is small compared to the work in the inner loop (for any
calculation of substantial size).

    In my experience, even on machines whose hardware supports
scaled indexing, the simplest addressing modes are significantly
faster - enough so that using the fancy modes is non-optimal.
Matrix multiplication is the classic example of this.

+         /\        /\          William C. Anderson
+        //\\      //\\         Member of the M88000 Design Group
+       ///\\\    ///\\\        Motorola Microprocessor Division
+      //    \\  //    \\       Oak Hill, TX.
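
To illustrate the "move the data" point for the matrix multiplication
case, here is a minimal sketch in C.  It is not code from this thread:
the function name matmul_copy_col, the scratch buffer "col", and the
assumption of square row-major matrices are all hypothetical choices
made for the example.  Each column of B is copied once into a
contiguous buffer, so the inner loop uses only unit-stride indexing.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the original post): computes
 * C = A * B for n x n matrices stored in row-major order.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int matmul_copy_col(int n, const double *a, const double *b, double *c)
{
    double *col = malloc((size_t)n * sizeof *col);
    if (col == NULL)
        return -1;

    for (int j = 0; j < n; j++) {
        /* Move the data: the stride-n accesses to B happen here,
         * n loads per column, outside the inner loop. */
        for (int k = 0; k < n; k++)
            col[k] = b[(size_t)k * n + j];

        for (int i = 0; i < n; i++) {
            const double *row = a + (size_t)i * n;
            double sum = 0.0;

            /* Inner loop: both operands are walked with unit stride,
             * so only the simplest addressing is needed. */
            for (int k = 0; k < n; k++)
                sum += row[k] * col[k];

            c[(size_t)i * n + j] = sum;
        }
    }

    free(col);
    return 0;
}
```

The copy costs n loads and stores per column, against n*n
multiply-adds that then use the buffer, which is the sense in which
the overhead is small for any sizable n.  Whether this actually beats
strided or scaled indexing depends on the machine and compiler, so it
should be timed rather than assumed.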