Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!killer!ames!pasteur!ucbvax!decwrl!purdue!i.cc.purdue.edu!j.cc.purdue.edu!pur-ee!a.cs.uiuc.edu!uxc.cso.uiuc.edu!uicsrd.csrd.uiuc.edu!hoefling
From: hoefling@uicsrd.csrd.uiuc.edu
Newsgroups: comp.arch
Subject: Re: getting rid of branches
Message-ID: <43700043@uicsrd.csrd.uiuc.edu>
Date: 7 Jul 88 17:40:00 GMT
References: <12258@mimsy.UUCP>
Lines: 66
Nf-ID: #R:mimsy.UUCP:12258:uicsrd.csrd.uiuc.edu:43700043:000:1956
Nf-From: uicsrd.csrd.uiuc.edu!hoefling    Jul  7 12:40:00 1988

>/* Written  1:21 pm  Jul  4, 1988 by ho@svax.cs.cornell.edu */
>
>original:     do 100 i = 1, 100
>                statement1
>                if (x(i)) goto 200
>                statement2
>        100   continue
>              statement3
>        200   statement4
>
>transformed:
>              ex1 = .true.
>              do 100 i = 1, 100
>                if (ex1) statement1
>                if (ex1) ex1 = .not. x(i)
>                if (ex1) statement2
>        100   continue
>              if (.not. ex1) goto 200
>              statement3
>        200   statement4
>
>the do loop can now be vectorized.

If "x" in the original is invariant inside the loop (i.e. no dependences
involving it within the loop), then it is trivial to determine on which
iteration the exit will occur [ it's the first iteration i for which x(i)
is .TRUE. ].

Knowing that, it is also trivial to determine how many "statement1"s,
"statement2"s, "statement3"s and "statement4"s will be executed.  It is
therefore also trivial to set up vector statements which do the statements
that many times.

The crucial question is whether there are any dependences from statement2
to statement1.  Such a dependence would make the loop not vectorizable (at
some point, an instance of statement1 would have to wait for an instance of
statement2 to finish).

If we assume that there are no dependences on "x" and no dependences from
statement2 to statement1, then the problem comes down to simply finding
the index of the first occurrence of .TRUE. in "x".

C---Let's say that "first_TRUE_index" returns 0 if no x(i) is .TRUE.

      exit_index = first_TRUE_index(x)
      if (exit_index .EQ. 0) then
         limit1 = 100
         limit2 = 100
      else
         limit1 = exit_index
         limit2 = exit_index-1
      end if

      dovector i=1,limit1
         statement1
      end dovector

      dovector i=1,limit2
         statement2
      end dovector

      if (exit_index .EQ. 0) statement3
      statement4

Jay Hoeflinger
Center for Supercomputing Research and Development
U of Illinois
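
The split-loop scheme above can be checked with a small sketch. It is written in Python rather than the post's Fortran-style notation, only so that it runs as is; it is an illustration under stated assumptions, not the vectorizer's actual output. The function names original_loop and split_loops and the trace tags "s1".."s4" are hypothetical. Each "statement" is modeled as a record appended to a trace, which takes for granted the conditions stated above: nothing in the loop modifies x, and statement1 never depends on statement2.

```python
def first_true_index(x):
    """1-based index of the first True in x, or 0 if there is none
    (the convention assumed for first_TRUE_index in the post)."""
    for i, flag in enumerate(x, start=1):
        if flag:
            return i
    return 0


def original_loop(x):
    """The original loop with its early exit. Returns the executed
    statement instances as (tag, iteration) pairs; 0 means 'outside the loop'."""
    trace = []
    for i in range(1, len(x) + 1):
        trace.append(("s1", i))
        if x[i - 1]:
            break                      # the 'goto 200' exit
        trace.append(("s2", i))
    else:
        trace.append(("s3", 0))        # reached only when the loop runs to completion
    trace.append(("s4", 0))
    return trace


def split_loops(x):
    """The split form: find the exit iteration first, then run each
    statement over its own index range (the two 'dovector' loops)."""
    n = len(x)
    exit_index = first_true_index(x)
    if exit_index == 0:
        limit1, limit2 = n, n
    else:
        limit1, limit2 = exit_index, exit_index - 1
    trace = [("s1", i) for i in range(1, limit1 + 1)]
    trace += [("s2", i) for i in range(1, limit2 + 1)]
    if exit_index == 0:
        trace.append(("s3", 0))
    trace.append(("s4", 0))
    return trace
```

The two versions execute the same statement instances but in a different order (interleaved versus grouped by statement), so a comparison should ignore order, for example sorted(original_loop(x)) == sorted(split_loops(x)). That reordering is exactly why a dependence from statement2 to statement1 would rule the transformation out.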