Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!ames!xanth!mcnc!decvax!ima!esegue!compilers-sender
From: nick@lfcs.edinburgh.ac.uk (Nick Rothwell)
Newsgroups: comp.compilers
Subject: Compiling Functional Programs and Parallelism
Message-ID: <1989Aug18.181601.406@esegue.uucp>
Date: 18 Aug 89 18:16:01 GMT
Sender: compilers-sender@esegue.uucp (John R. Levine)
Reply-To: Nick Rothwell
Organization: Compilers Central
Lines: 87
Approved: compilers@esegue.segue.bos.ma.us

I've had a few pieces of mail asking me to clarify my scheme for compiling a
functional language to execute in parallel on a shared-memory machine. I'll
just give a quick overview.

As I said previously, if you're happy with combinator graphs and reduction
architectures, then fine. I personally feel that the hardware technology
isn't quite there yet, and, more importantly, I'm not sure how much use
reduction execution is for other languages or for an entire operating
system; I think it's too specialised. My compilation scheme requires a
fairly conventional control-flow architecture with threads and shared
memory, plus a simple semaphore operation on memory locations.

The compiler is in two passes (at least :-)). The first pass has a simple
rule which says that each function application should be forked off as a
new process, and the parent should wait for the result. This is fairly
simple-minded, and fairly useless at this stage. For example:

    fun f x = (f1 x, f2 x, f3 x)

A call to `f' creates a new process context. That process forks a child for
`f1', waits for it to finish, forks a child for `f2', and so on. Finally it
builds a tuple record and returns it to its caller.

The second pass is where things get interesting. It is worth noting that the
second pass is *nothing more* than a peephole optimiser on the intermediate
code. We are attempting to get a result back to the parent while still doing
useful work; this is where the parallelism starts to come from.

Firstly, we can return the record to the parent *while it is empty*. We
create the record, return it to the parent, and then deal with `f1', `f2',
`f3'. The semaphore operations on the memory mean that the parent will block
if it tries to access a record field that hasn't been filled in yet; but it
can do other things first. If `f1', `f2' and `f3' return records, then `f'
might get them back with empty fields; then, `f' can return to its parent
while the `fi's are still going, and so on. The net result is that empty
records get claimed from the heap, and structures are filled in in parallel
by a number of processes.

Next, there are the usual tail-recursion optimisations:

    fun f x = g x

Execution of `f' can pass silently to `g'.

Finally, we attempt to bound the amount of concurrency. Creating a new
process for each function application is very expensive, so we use
tail-recursion optimisation techniques to knock a lot of these out. The
above example is an obvious one: use `f's process for `g', rather than
creating a new one. In fact, the optimiser can go one better than this:

    fun f x = cons(..., g x)

This is "structurally tail recursive": apart from some structure assignment,
`f's last action is to activate `g'. So, I have an optimisation which makes
this iterative. `f' builds the cons cell, passes it back to the parent, and
then jumps to `g'. This has some interesting implications. Consider:

    fun f 0 = nil
      | f n = cons(n, f(n-1))

    fun g(cons(x, y)) = cons(x+1, g y)

Then, `f 100' will start a process to generate a list of 100 numbers.
`g(f 100)' will start a process to generate 100 numbers, and another process
to consume the first list in parallel and generate a second list.
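As a rough model of this pipeline (a sketch only: Haskell rather than the
language and runtime the thesis actually targets, with MVars standing in for
the semaphore operations on memory locations and forkIO standing in for
process creation; the names `Stream', `produce', `fill' and `consume' are
invented here for illustration):

    -- Sketch of the f/g pipeline above; not the thesis implementation.
    -- An MVar models a semaphore-protected memory cell: readMVar blocks
    -- until some process has filled the cell in.
    import Control.Concurrent (forkIO)
    import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

    -- A list whose tail cells may not have been filled in yet.
    data Stream = Nil | Cons Int (MVar Stream)

    -- Models `fun f 0 = nil | f n = cons(n, f(n-1))'.  One process is
    -- forked for the whole list; `fill' hands each cons cell back
    -- (putMVar) before computing the tail, then loops.  Looping here,
    -- rather than forking a process per element, is the "structurally
    -- tail recursive" optimisation.
    produce :: Int -> IO (MVar Stream)
    produce n = do
      cell <- newEmptyMVar
      _ <- forkIO (fill n cell)       -- one child process for the list
      return cell                     -- parent gets the cell immediately
      where
        fill :: Int -> MVar Stream -> IO ()
        fill 0 c = putMVar c Nil
        fill k c = do
          rest <- newEmptyMVar
          putMVar c (Cons k rest)     -- cell now visible to the consumer
          fill (k - 1) rest           -- "jump" to produce the tail

    -- Models `fun g(cons(x, y)) = cons(x+1, g y)': chases the producer
    -- down the list, blocking only on cells not yet filled in.
    consume :: MVar Stream -> IO [Int]
    consume c = do
      s <- readMVar c                 -- the semaphore operation: may block
      case s of
        Nil      -> return []
        Cons x t -> do
          rest <- consume t
          return (x + 1 : rest)

    main :: IO ()
    main = do
      xs <- produce 100 >>= consume   -- `g(f 100)': producer and consumer
      print (take 5 xs)               -- run as two overlapping processes

The consumer here collects its results into an ordinary Haskell list rather
than a second Stream, so it shows the overlap but not the full generality.
With GHC, forkIO alone gives the concurrent behaviour; compiling with
-threaded gives genuine parallelism on a shared-memory machine.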
This generalises to `fun f x = (a, b, c, d, e, f, g x)' and so on.

That's about it. My thesis also dabbled rather unsuccessfully with some
polymorphic typechecking issues, and I integrated a concurrent Prolog
compiler into the functional language as well. If you want a copy, I think
it costs money (5 quid or something). Either write to:

    Dorothy McKie,
    Department of Computer Science,
    James Clerk Maxwell Building,
    The King's Buildings,
    University of Edinburgh EH9 3JZ
    SCOTLAND

or e-mail to dam@uk.ac.ed.ecsvax, asking for ordering details. Sorry, but I
don't know much about administrative matters like this... I'll pass on the
enquiries I've had already.

        Nick.
[From Nick Rothwell]
--
Send compilers articles to compilers@ima.isc.com or, perhaps, Levine@YALE.EDU
or {decvax | harvard | yale | bbn}!ima.  Meta-mail to ima!compilers-request.
Please send responses to the author of the message, not the poster.