Path: utzoo!utgpu!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!cwjcc!gatech!hubcap!kale
From: kale@m.cs.uiuc.edu (L. V. Kale')
Newsgroups: comp.parallel
Subject: Superlinear(?) speedups
Message-ID: <3756@hubcap.UUCP>
Date: 5 Dec 88 21:05:06 GMT
Sender: fpst@hubcap.UUCP
Lines: 31
Approved: parallel@hubcap.clemson.edu

Here is one more way/reason that leads to the famed superlinear speedups.

We observed superlinear speedups on some simple divide-and-conquer programs running on our "Chare Kernel" system (**) on the iPSC/2 hypercubes. Further investigation showed the cause of this to be memory usage. In particular, the memory allocation scheme we used has the property that its performance tends to degrade somewhat as the number of allocations (with subsequent deallocations) grows: one has to search more for the right-size block due to fragmentation. With an increasing number of processors, the number of allocations done on each processor decreases, thus reducing the time spent in memory management. As our dynamic load balancing scheme is very good :-) we get the computation nicely distributed, and presto: superlinear speedup. (A toy sketch of the allocation effect appears at the end of this post.)

In a sense, this is simply a consequence of having more memory. But then, that is part of having more processors. The question is: having paid real dollar costs P times those of one processor (for P processors), can one get a speedup of more than P? And in this case, the answer is yes... I won't get into the controversy about whether superlinear speedups are real or not. Just reporting an observation, for now.

** "The Chare Kernel Language for Parallel Programming: A Perspective",
   L.V. Kale and W.W. Shu, TR UIUCDCS-R-88-1451.
   (If interested in receiving copies of this and related papers, write to me:
   L.V. Kale, Dept. of Computer Science, University of Illinois,
   1304 W. Springfield Ave., Urbana, IL 61801) or kale@cs.uiuc.edu
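
P.S. For the curious, here is a minimal, purely illustrative sketch of the
effect described above. It is NOT the Chare Kernel allocator; the block
counts, sizes, and request pattern are made up. It just shows that with a
first-fit free list, the average search length per allocation grows with the
number of allocations that have already hit the list, so a processor that
does N/P of the allocations pays less per allocation than one that does all N.

/* first_fit.c -- illustrative only; not the Chare Kernel allocator. */
#include <stdio.h>
#include <stdlib.h>

#define NBLOCKS 10000           /* hypothetical number of free-list entries */

struct block {
    size_t size;                /* bytes available in this block */
    int    used;                /* 1 if handed out, 0 if still free */
};

static struct block pool[NBLOCKS];
static long probes;             /* total blocks examined over all allocations */

/* First fit: walk the list until a free block of sufficient size is found.
 * The more already-taken or too-small blocks precede a usable one, the more
 * probes each allocation costs.
 */
static int ff_alloc(size_t want)
{
    int i;
    for (i = 0; i < NBLOCKS; i++) {
        probes++;
        if (!pool[i].used && pool[i].size >= want) {
            pool[i].used = 1;
            return i;
        }
    }
    return -1;                  /* no block big enough */
}

int main(void)
{
    int i, n = 50;              /* n = allocations done by this "processor" */

    /* Seed the pool: mostly small blocks, an occasional large one --
     * a crude stand-in for a fragmented heap. */
    for (i = 0; i < NBLOCKS; i++) {
        pool[i].size = (i % 100 == 99) ? 4096 : 16;
        pool[i].used = 0;
    }

    for (i = 0; i < n; i++)
        (void) ff_alloc(1024); /* only the large blocks satisfy this */

    printf("allocations: %d, average probes per allocation: %.1f\n",
           n, (double) probes / n);
    return 0;
}

Rerunning with smaller n (more processors sharing the same total work) drops
the average probe count, which is the per-processor saving that, together
with good load balancing, shows up as speedup greater than P.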