Path: utzoo!utgpu!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!rutgers!gatech!hubcap!Douglas
From: jones%HERKY.CS.UIOWA.EDU@VM1.NODAK.EDU (Douglas Jones)
Newsgroups: comp.parallel
Subject: communication delays (was analysis of parallel algorithms)
Message-ID: <3781@hubcap.UUCP>
Date: 7 Dec 88 13:21:45 GMT
Sender: fpst@hubcap.UUCP
Lines: 51
Approved: parallel@hubcap.clemson.edu

[I found this on comp.theory.  Sorry to repost it for those readers, but
I thought it was worth it.  We haven't gotten into the theoretical models
much --- this might be the time.  Steve]

In Hai-Ning Liu's theorynet note of Dec. 5, 1988, he asks:

> I am interested in results about communication delay between any two
> processors.  Note this is different from the PRAM model.

I have long suspected that there is a natural law that communication
delays in a system with n components are O(log n).  As a proposed natural
law, this is subject to counterexample and refinement, but it cannot be
proven.

Note that most Parallel Random Access Machine (PRAM) models assume that
memory access time is constant, independent of memory size and the number
of processors, but real parallel random-access memory cannot be built
this way.

To see this, imagine any fixed technology.  All known technologies build
memory from modules with a fixed capacity per module.  Even if there is
only one processor, that processor must have a way to send memory
addresses and data to all of the modules.  All known technologies have a
limit on the fanout allowed on any signal.  If there are more modules
than this fanout limit, some kind of repeaters or amplifiers must be
used.  Say the fanout limit allows a maximum of k modules to listen to
any signal.  In general, this requires a k-ary tree of amplifiers to
carry address or data signals from the processor to memory.  For a
sufficiently large memory, the delays in this k-ary tree will dominate
the memory access time, so we have O(log n) access, with logarithms to
the base k.

A similar argument applies to the data path from memory to the processor,
but here the problem is the limited fanin of feasible multiplexors, with
the result that a multiplexor tree is needed to bring signals from the
modules to the processor.

Similar arguments apply to n processors sharing one memory, and these
arguments directly underlie the design of the butterfly switch.  In
multiprocessor systems with message passing, one can make similar
arguments about the message-routing hardware.

Note that all of these arguments rest on the key phrase "all known
technologies".  I know of no way to prove that someone will not, tomorrow,
find a new technology that evades these limits; such a new technology
might, after all, rest on now-unknown physical principles.  At this point,
a digression into the realm of Karl Popper would be appropriate, but not
in this newsgroup.

					Douglas Jones
					jones@herky.cs.uiowa.edu
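
[A minimal sketch, not part of the original post, of the counting behind
the k-ary tree argument above: the number of amplifier (or multiplexor)
levels a signal must cross, and hence the delay, grows as ceil(log_k n).
The fanout limit k = 8 and the range of n are arbitrary illustrative
choices.]

    #include <stdio.h>

    /* Levels of amplification needed for one signal to reach n modules
     * when no signal may drive more than k loads: ceil(log_k n),
     * computed by repeated multiplication to avoid floating point.    */
    static int tree_depth(long n, int k)
    {
        int  levels = 0;
        long reach  = 1;      /* modules reachable through 'levels' levels */

        while (reach < n) {
            reach *= k;       /* each added level multiplies fanout by k   */
            levels++;
        }
        return levels;
    }

    int main(void)
    {
        int  k = 8;           /* assumed fanout limit of the technology    */
        long n;

        for (n = 1; n <= 1000000L; n *= 10)
            printf("n = %7ld modules  ->  %d levels of delay\n",
                   n, tree_depth(n, k));
        return 0;
    }

With k = 8, the depth grows by one level for roughly every eightfold
increase in the number of modules, which is the O(log n) behaviour argued
for above.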