Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site cheviot.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!teddy!panda!talcott!harvard!seismo!mcvax!ukc!cheviot!robert
From: robert@cheviot.UUCP (Robert Stroud)
Newsgroups: net.lang
Subject: Re: Smart compilers
Message-ID: <206@cheviot.UUCP>
Date: Tue, 8-Jan-85 15:57:31 EST
Article-I.D.: cheviot.206
Posted: Tue Jan  8 15:57:31 1985
Date-Received: Sat, 12-Jan-85 07:34:47 EST
Reply-To: robert@cheviot.UUCP (Robert Stroud)
Organization: U. of Newcastle upon Tyne, U.K.
Lines: 67



There has been a lot of discussion recently about how to optimise
the Fortran program,

      DO 100 I=1,10
         IF (Y .GT. 0) X(I) = SQRT(Y)
  100    CONTINUE

Most of the suggestions have been wrong, and even those that are right
would not work in the presence of certain pathological cases of aliasing
and side-effects. For example, suppose SQRT were a user function which
modified the COMMON block variable I as a side-effect. Or suppose SQRT
gave a negative result and Y were EQUIVALENCE'd onto X(3). Could even
a really good data flow analyser cope with such pathologies, and would it
be reasonable to expect it to? If this were a SUBROUTINE fragment and
X and Y were parameters or COMMON block elements, then whether the
optimisation was valid might well depend on precisely how the subroutine
was called; in other words, it would work sometimes but not always.
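
To make the EQUIVALENCE case concrete, here is a rough sketch, written
in Python rather than Fortran purely so that it is easy to run. Everything
in it is my own assumption for illustration: storage is modelled as a
flat list, X(1..10) occupies mem[0..9], Y lives at mem[y_addr], and
neg_sqrt stands in for a hypothetical user-supplied SQRT that returns
the negative root. It is not a claim about what any real compiler does.

```python
import math


def neg_sqrt(v):
    """Hypothetical user-supplied SQRT that returns the negative root."""
    return -math.sqrt(v)


def loop_as_written(mem, y_addr, sqrt=neg_sqrt):
    """Test Y and call SQRT on every iteration, as the source says.

    X(1..10) is mem[0..9]; Y is mem[y_addr].
    """
    for i in range(10):
        if mem[y_addr] > 0:
            mem[i] = sqrt(mem[y_addr])
    return mem


def loop_hoisted(mem, y_addr, sqrt=neg_sqrt):
    """The 'optimised' form: test Y and call SQRT once, outside the loop."""
    if mem[y_addr] > 0:
        t = sqrt(mem[y_addr])
        for i in range(10):
            mem[i] = t
    return mem


def fresh(y_addr, y=4.0):
    """Eleven zeroed cells with Y initialised at the chosen address."""
    mem = [0.0] * 11
    mem[y_addr] = y
    return mem
```

With Y in its own cell (y_addr = 10) the two versions leave identical
storage. With Y overlaid on X(3) (y_addr = 2) the loop as written stops
assigning once X(3) goes negative, so only X(1) to X(3) become -2.0,
whereas the hoisted version sets all ten elements. The same code is
right or wrong depending only on where the caller put Y.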

I agree completely with the principle that optimisers should not modify
the semantics of programs, but there is a grey area where things get
very tricky. I believe that there are a lot of little-known rules for
programming in FORTRAN which are part of the language definition and
which attempt to prevent these pathological cases arising. They exist
only in order to guarantee that certain optimisations will always
be possible. I am thinking of a paper called "Serious Fortran" (I'm
afraid I can't give a more precise reference), and of things like the
restrictions on modifications to loop variables in extended DO loop
ranges and the apparently little-known fact that the DO loop variable
is UNDEFINED after executing a loop.

[This may be historical - can any Fortran expert shed more light?]

As an aside, in Pascal you are not allowed to modify a FOR loop variable
or alter the value of a WITH expression within the bounds of the statement.
But does anyone know of a Pascal compiler that actually checks for this
and prohibits even a violation as blatant as this one?

       WITH P^ DO
       BEGIN
          P := P^.next;
       END;
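
Why this matters to an optimiser can be sketched as follows, again in
Python rather than Pascal so that it can be run. I am assuming a WITH
body that advances P to the next record and then assigns to a field,
a hypothetical Node record with fields value and next, and two
hypothetical code generation strategies: one evaluates P^ once on
entry and keeps the address, the other re-evaluates P^ at every field
reference. The names are mine, not taken from any real compiler.

```python
class Node:
    """Hypothetical list record with the fields 'value' and 'next'."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


# Pascal being modelled:  WITH P^ DO BEGIN P := next; value := 0 END

def with_cached(p):
    """Evaluate P^ once on entry and keep using that record."""
    rec = p           # address of P^ captured at entry
    p = rec.next      # P := next
    rec.value = 0     # value := 0 lands in the ORIGINAL record
    return p


def with_reevaluated(p):
    """Re-evaluate P^ at every field reference."""
    p = p.next        # P := next
    p.value = 0       # value := 0 lands in the NEW record
    return p


def make_list():
    """Two-element list; returns both nodes so they can be inspected."""
    second = Node(2)
    first = Node(1, second)
    return first, second
```

Under the first strategy the first record is zeroed; under the second
it is the record after it. A program that breaks the rule can therefore
change its behaviour when the optimiser starts caching the address.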

In practice these rules are not enforced (or indeed cannot be enforced)
and so programs which are strictly speaking illegal will compile without
error but will give unexpected results when optimised. An obvious solution
to the problem is to try to introduce syntactic restrictions which make
aliasing and side-effects either impossible or easily detectable. This
was the approach taken by the designers of the language Euclid, but what
started as Pascal got extremely complicated, and I think it is fair to
say that this approach is a lot harder than it looks. Aliases and
side-effects will be with us as long as we use languages with variables
and assignment.

But again, even if in theory all these nasty things are lurking beneath
the surface ready to bite us, in practice do we really run into such
pathological cases? If things were really that bad, we wouldn't be able
to write any programs that worked (:-)!

Does anyone have any good horror stories about being bitten by an alias
or a side-effect? [Preferably in languages with proper abstraction, i.e.
BASIC is excluded!]

Robert Stroud,
Computing Laboratory,
University of Newcastle upon Tyne