Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!killer!ames!pasteur!agate!garnet!weemba
From: weemba@garnet.berkeley.edu (Obnoxious Math Grad Student)
Newsgroups: comp.arch
Subject: Re: Vectorising conditional code.
Message-ID: <11855@agate.BERKELEY.EDU>
Date: 9 Jul 88 14:18:47 GMT
References: <893@garth.UUCP>
Sender: usenet@agate.BERKELEY.EDU
Reply-To: weemba@garnet.berkeley.edu (Obnoxious Math Grad Student)
Organization: Brahms Gang Posting Central
Lines: 68
Supersedes: <11854@agate.BERKELEY.EDU>
In-reply-to: smryan@garth.UUCP (Steven Ryan)

In article <893@garth.UUCP>, smryan@garth (Steven Ryan) writes:
>    for i
>        a[i] := ...
>        exit if p[i]
>        b[i] := ...

>The actual number of iterations is controlled by the predicate p[].
>This means the actual number of elements stored into a[] is determined
>by a subsequent statement. I see no simple way to handle this.

See my comments below.  If the above code fragment is the inner loop of
a bigger calculation, and each inner loop is independent of all others,
then vectorization can be done straightforwardly.

>If I were actually working on this at the moment, I would like see enough
>typical cases

I once played with the Mandelbrot set on a Cray-1.  The relevant code
fragment is (with complex arithmetic):

    for c in [set of pixels] {
        for(z=i=0; |z|<2 && i

where all the extra work is worth the effort.

In this case, it was definitely worth it.

Now back to your example and my promised comments.

>    for i
>        a[i] := ...
>        exit if p[i]
>        b[i] := ...

What I did, then, was to vectorize the a[i] calculation.  I had no b[i]
calculation--this is true for any while loop.  But this could be handled
just as easily.  I write one part of the code that does A-P, and another
part that does B-A-P.  Every pixel would do one round of A-P, and then
it's just a B-A-P while loop.

More fun happens if you replace the "exit" with "continue": then the
pixels start in the A-P batch and eventually enough migrate to B-A-P
allowing both to loop for a while.
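
The batching idea described above (run step A on a whole batch of pixels
at once, test P, and drop the pixels that exit) might be sketched in C
roughly as follows.  This is an illustration only, not the Cray-1 code
from the post.  It assumes the inner loop is the usual Mandelbrot step
z = z*z + c with exit when |z| >= 2, and that there is no B part.  The
names mandel_batch, MAX_BATCH and max_iter are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BATCH 1024  /* hypothetical upper bound on pixels per call */

/* For each of n points c = (cre[k], cim[k]), store in out[k] the number
 * of steps z = z*z + c taken (starting from z = 0) until |z| >= 2,
 * capped at max_iter.  The inner loops run across pixels, not across
 * iterations, which is what makes them vector-shaped. */
void mandel_batch(size_t n, const double *cre, const double *cim,
                  int max_iter, int *out)
{
    double zre[MAX_BATCH], zim[MAX_BATCH];
    size_t idx[MAX_BATCH];      /* idx[j]: which pixel occupies lane j */
    size_t active = n;

    assert(n <= MAX_BATCH);
    for (size_t k = 0; k < n; k++) {
        zre[k] = zim[k] = 0.0;
        idx[k] = k;
        out[k] = max_iter;      /* default: never escaped */
    }

    for (int iter = 0; iter < max_iter && active > 0; iter++) {
        /* A: one step on every active lane.  No branches here, so this
         * is the loop a vector unit could run at full length. */
        for (size_t j = 0; j < active; j++) {
            double r = zre[j], i = zim[j];
            zre[j] = r * r - i * i + cre[idx[j]];
            zim[j] = 2.0 * r * i + cim[idx[j]];
        }

        /* P: test the exit predicate and compress the survivors to the
         * front, so the next sweep again sees a dense batch.  Writing
         * to slot keep is safe in place because keep <= j. */
        size_t keep = 0;
        for (size_t j = 0; j < active; j++) {
            if (zre[j] * zre[j] + zim[j] * zim[j] >= 4.0) {
                out[idx[j]] = iter + 1;     /* this pixel has exited */
            } else {
                zre[keep] = zre[j];
                zim[keep] = zim[j];
                idx[keep] = idx[j];
                keep++;
            }
        }
        active = keep;
    }
}
```

A loop with a B part would keep a second batch: lanes that survive P
move on to B and then rejoin A, which is the B-A-P round.  The sketch
leaves out refilling the batch with fresh pixels as lanes empty.
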
Keeping the vector registers full and beating off gridlock is tedious,
but it is not overly difficult.

Remember, this method works if you have a huge outer loop that guarantees
that the code fragments A-P and B-A-P are vector calculations.  It can,
in principle, vectorize arbitrarily complex conditionals.

ucbvax!garnet!weemba    Matthew P Wiener/Brahms Gang/Berkeley CA 94720