Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 8/28/84; site lll-crg.UUCP
Path: utzoo!watmath!clyde!cbosgd!ihnp4!qantel!dual!lll-crg!brooks
From: brooks@lll-crg.UUCP (Eugene D. Brooks III)
Newsgroups: net.lang.c
Subject: The "poor" performance of the Caltech C compiler.
Message-ID: <866@lll-crg.UUCP>
Date: Wed, 25-Sep-85 01:16:59 EDT
Article-I.D.: lll-crg.866
Posted: Wed Sep 25 01:16:59 1985
Date-Received: Sun, 29-Sep-85 04:11:40 EDT
References: <418@phri.UUCP> <700002@fthood> <187@graffiti.UUCP> <175@mit-bug.UUCP> <897@turtlevax.UUCP> <698@sfmag.UUCP> <56@escher.UUCP>
Reply-To: brooks@lll-crg.UUCP (Eugene D. Brooks III)
Organization: Lawrence Livermore Labs, CRG
Lines: 31


Regarding the performance of the Caltech C compiler on single
precision floating point operations:

This compiler implemented both single and double precision register
variables and delivered quite a substantial speed improvement over
the standard Unix compiler.  I recorded speed increases of up to a
factor of 2.5 for some vector operations implemented as unrolled loops
in C.

	register float *a, *b, *c;

	int dim;

	dim /= 8;

	do {
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
	} while(--dim > 0);	/* pre-decrement so the body runs exactly dim times */

With appropriate handling of the dim%8 part, this went like the devil.  Other
operations (dot product, etc.) obtained similar speed improvements.  Inspection
of the resulting assembler code showed that one could not do better by
writing in assembler.
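
For readers who want to try the idea, here is a minimal sketch of one way
the dim%8 remainder might be handled.  It is not the code from the original
measurements; the function name vadd, its parameter n, and the choice to do
the leftover elements before the unrolled blocks are all assumptions made
for illustration.

```c
/* Hypothetical helper (not from the post): a[i] = b[i] + c[i] for n floats.
   The n % 8 leftover elements are handled first, one at a time, then the
   remaining n / 8 groups go through the eight-way unrolled body. */
void vadd(float *a, const float *b, const float *c, int n)
{
	int rem = n % 8;	/* leftover elements, 0..7 */
	int blocks = n / 8;	/* number of full groups of eight */

	while (rem-- > 0)
		*a++ = *b++ + *c++;

	/* A while loop (rather than do-while) also copes with n < 8. */
	while (blocks-- > 0) {
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
		*a++ = *b++ + *c++;
	}
}
```

Calling vadd(a, b, c, 19) would do 3 single additions followed by 2 passes
through the unrolled body.  Whether this ordering is as fast as doing the
remainder last will depend on the compiler and machine.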