Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site cheviot.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!teddy!panda!talcott!harvard!seismo!mcvax!ukc!cheviot!robert
From: robert@cheviot.UUCP (Robert Stroud)
Newsgroups: net.lang
Subject: Re: Smart compilers
Message-ID: <206@cheviot.UUCP>
Date: Tue, 8-Jan-85 15:57:31 EST
Article-I.D.: cheviot.206
Posted: Tue Jan 8 15:57:31 1985
Date-Received: Sat, 12-Jan-85 07:34:47 EST
Reply-To: robert@cheviot.UUCP (Robert Stroud)
Organization: U. of Newcastle upon Tyne, U.K.
Lines: 67

There has been a lot of discussion recently about how to optimise the
Fortran program,

          DO 100 I=1,10
             IF (Y .GT. 0) X(I) = SQRT(Y)
      100 CONTINUE

Most of the suggestions have been wrong, and even those that are right
would not work in the presence of certain pathological cases of aliasing
and side-effects.

For example, suppose SQRT was a user function which modified the COMMON
block variable I as a side-effect. Or suppose SQRT gave a negative result
and Y was EQUIVALENCE'd onto X(3). Could even a really good data flow
analyser cope with such pathologies, and would it be reasonable to expect
it to be able to cope??

If this was a SUBROUTINE fragment and X and Y were parameters or COMMON
block elements, then whether the optimisation was valid or not might well
depend on precisely how the subroutine was called; in other words, it
would work sometimes but not necessarily always.

I agree completely with the principle that optimisers should not modify
the semantics of programs, but there is a grey area where things get very
tricky.

I believe that there are a lot of little-known rules for programming in
FORTRAN which are part of the language definition and which attempt to
prevent these pathological cases arising. They only exist in order to
guarantee that certain optimisations will always be possible.
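To make the EQUIVALENCE case above concrete, here is a minimal sketch. It is
written in Python rather than Fortran purely so that it is easy to run, and it
only imitates the storage association: Y is modelled as another name for the
third element of X. The names user_sqrt, Storage, loop_as_written and
loop_hoisted are all made up for illustration, and the "hoisted" version is
just one plausible form of the optimisation under discussion, not what any
particular compiler does.

```python
import math


def user_sqrt(v):
    """Hypothetical user-supplied SQRT that returns the negative root."""
    return -math.sqrt(v)


class Storage:
    """X(1..10), with Y sharing storage with X(3) as EQUIVALENCE would."""

    def __init__(self, y):
        self.x = [0.0] * 10
        self.x[2] = y  # X(3) in Fortran's 1-based numbering

    @property
    def y(self):
        # Y is not a separate variable, only another name for X(3).
        return self.x[2]


def loop_as_written(s):
    # Direct transcription of the DO loop: test Y and call SQRT every time.
    for i in range(10):
        if s.y > 0:
            s.x[i] = user_sqrt(s.y)
    return s.x


def loop_hoisted(s):
    # "Optimised" form: the test and the SQRT call are done once, up front.
    if s.y > 0:
        t = user_sqrt(s.y)
        for i in range(10):
            s.x[i] = t
    return s.x


as_written = loop_as_written(Storage(16.0))
hoisted = loop_hoisted(Storage(16.0))
print("as written:", as_written)
print("hoisted:   ", hoisted)
```

With Y starting at 16, the loop as written stops assigning once X(3) (and hence
Y) has been overwritten with -4, so the remaining elements keep their old
values, whereas the hoisted version fills all ten elements with -4. The two
agree only when Y does not overlap X.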

I am thinking of a paper called "Serious Fortran" (I'm afraid I can't give
a more precise reference), and things like the restrictions on
modifications to loop variables in extended DO loop ranges and the
apparently little-known fact that the DO loop variable is UNDEFINED after
executing a loop. [This may be historical - can any Fortran expert shed
more light??]

As an aside, in Pascal you are not allowed to modify a for loop variable
or alter the value of a with expression within the bounds of the
statement. But does anyone know of a Pascal compiler that actually checks
for this and prohibits even such blatant violations as...

      WITH P^ DO
      BEGIN
         P := P^.next;
      END;

In practice these rules are not enforced (or indeed cannot be enforced),
and so programs which are strictly speaking illegal will compile without
error but will give unexpected results when optimised.

An obvious solution to the problem is to try to introduce syntactic
restrictions which make aliasing and side-effects either impossible or
easily detectable. This was the approach taken by the designers of the
language Euclid, but what started as Pascal got extremely complicated, and
I think it is fair to say that this approach is a lot harder than it
looks. Aliases and side-effects will be with us as long as we use
languages with variables and assignment.

But again, even if in theory all these nasty things are lurking beneath
the surface ready to bite us, in practice do we really run into such
pathological cases? If things were really that bad, we wouldn't be able
to write any programs that worked (:-)!

Does anyone have any good horror stories about being bitten by an alias
or a side-effect? [Preferably in languages with proper abstraction - i.e.
BASIC is excluded!!]

Robert Stroud,
Computing Laboratory,
University of Newcastle upon Tyne