Path: utzoo!attcan!uunet!husc6!cmcl2!nrl-cmf!ames!hc!lll-winken!maddog!brooks
From: brooks@maddog.llnl.gov (Eugene D. Brooks III)
Newsgroups: comp.arch
Subject: Re: m88000 benchmarks (LONG)
Keywords: m88000 benchmark fft instruction scheduling
Message-ID: <9442@lll-winken.llnl.gov>
Date: 5 Jul 88 21:58:58 GMT
References: <1359@claude.oakhill.UUCP>
Sender: usenet@lll-winken.llnl.gov
Reply-To: brooks@maddog.UUCP (Eugene D. Brooks III)
Organization: Lawrence Livermore National Laboratory
Lines: 15

In article <1359@claude.oakhill.UUCP> wca@oakhill.UUCP (william anderson) writes:
A detailed exposition of the performance of two compilers and hand assembly
for the FFT inner loop.

For even better pipeline utilization from compiler B, the public domain
compiler, change the register allocation strategy so that when looking
for a scratch register it remembers the last one allocated and searches
upward from there, wrapping around to R0 when it hits the top of the
register file.
By doing this the false dependencies caused by register reuse will be
minimized and the postprocessing rescheduler will have more code that
can be moved around.  We used this strategy with PCC for the Cerberus
instruction set with a postprocessing optimizer and it exploited the
functional unit pipelines very well.  The same will apply for the 88000.
The situation for Cerberus will be even better using a GCC/{postprocessing
scheduler} combo.  I am willing to bet that the "public domain" compiler
was GCC, an ANSI C compiler that is to be taken quite seriously!
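
To make the allocation strategy above concrete, here is a minimal sketch
in C.  It is not the PCC or GCC code; the names (scratch_alloc,
alloc_lowest_first, alloc_round_robin, scratch_free) and the 8 register
file are made up for illustration, and a real allocator would also have
to mark reserved registers as permanently in use.

```c
#include <stdbool.h>

#define NUM_REGS 8   /* illustrative size only, not a real register file */

/* Hypothetical allocator state: which registers are busy, and which
   register was handed out most recently. */
struct scratch_alloc {
    bool in_use[NUM_REGS];
    int  last;
};

void scratch_init(struct scratch_alloc *a)
{
    for (int r = 0; r < NUM_REGS; r++)
        a->in_use[r] = false;
    a->last = NUM_REGS - 1;   /* so the first round-robin search starts at R0 */
}

void scratch_free(struct scratch_alloc *a, int r)
{
    a->in_use[r] = false;
}

/* The usual policy: always hand out the lowest free register.  A register
   freed a moment ago is reused immediately, so unrelated temporaries end
   up sharing one register name. */
int alloc_lowest_first(struct scratch_alloc *a)
{
    for (int r = 0; r < NUM_REGS; r++) {
        if (!a->in_use[r]) {
            a->in_use[r] = true;
            a->last = r;
            return r;
        }
    }
    return -1;                /* nothing free; the caller must spill */
}

/* The policy described above: start just past the last register handed
   out, search upward, and wrap around to R0 at the top of the file. */
int alloc_round_robin(struct scratch_alloc *a)
{
    for (int i = 1; i <= NUM_REGS; i++) {
        int r = (a->last + i) % NUM_REGS;
        if (!a->in_use[r]) {
            a->in_use[r] = true;
            a->last = r;
            return r;
        }
    }
    return -1;                /* nothing free; the caller must spill */
}
```

With an allocate, free, allocate sequence, alloc_lowest_first returns R0
both times, so the second temporary cannot be written until the first has
been read and the rescheduler is stuck with that ordering.  The same
sequence through alloc_round_robin returns R0 and then R1, and the two
computations can be interleaved freely.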