Path: utzoo!utgpu!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!rutgers!gatech!hubcap!Douglas
From: jones%HERKY.CS.UIOWA.EDU@VM1.NODAK.EDU (Douglas Jones)
Newsgroups: comp.parallel
Subject: communication delays (was analysis of parallel algorithms)
Message-ID: <3781@hubcap.UUCP>
Date: 7 Dec 88 13:21:45 GMT
Sender: fpst@hubcap.UUCP
Lines: 51
Approved: parallel@hubcap.clemson.edu

[I found this on comp.theory.  Sorry to repost it for those readers, but
	I thought it was worth it.  We haven't gotten into the
	theoretical models much --- this might be the time
 Steve
]



In his theorynet note of Dec. 5, 1988, Hai-Ning Liu asks:

> I am interested in results about communication delay between any two
> processors.  Note this is different from PRAM model.

I have long suspected that there is a natural law that communication delays
in a system with n components are O(log n).  As a proposed natural law, this
is subject to counterexample and refinement, but cannot be proven.

Note that most PRAM (Parallel Random Access Machine) models assume that memory
access time is constant, independent of memory size and the number of
processors, but real parallel random access memory cannot be built this way.
To see this,
imagine any fixed technology.  All known technologies build memory from
modules with a fixed capacity per module.  Even if there is only one processor,
that processor must have a way to send memory addresses and data to all of the
modules.  All known technologies have a limit on the fanout allowed on any
signal.  If there are more modules than this fanout limit, some kind of
repeaters or amplifiers must be used.  Say the fanout limit allows a maximum
of k modules to listen to any signal.  In general, this requires a k-ary tree
of amplifiers to take address or data signals from the processor to memory.
For a sufficiently large memory, the delays involved in this k-ary tree will
dominate the memory access time, so we have O(log n) access, with logarithms
to the base k.
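The depth of that amplifier tree is the point of the argument, so here is a
minimal sketch of it (the function name and the assumption of one unit of
delay per amplifier stage are mine, not from the post): with a fanout limit
of k, each stage multiplies the number of reachable modules by k, so the
delay to reach n modules is the smallest depth d with k^d >= n.

```python
def tree_depth(n_modules, fanout):
    """Depth of the k-ary amplifier tree needed for one signal source
    to drive n_modules modules, given a fanout limit of `fanout`.
    Each stage multiplies the reachable module count by the fanout,
    so the depth grows as log base `fanout` of n_modules."""
    depth = 0
    reach = 1
    while reach < n_modules:
        reach *= fanout   # one more rank of amplifiers
        depth += 1
    return depth
```

Doubling the memory adds only a constant number of stages, which is exactly
the O(log n) access-time behavior described above.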

A similar argument applies to the data path from memory to processors, but
here, the problem is the limited fanin of feasible multiplexors, with the
result that a multiplexor tree is needed to bring signals from modules to the
processor.

Similar arguments apply to n processors sharing one memory, and these arguments
directly underlie the design of the butterfly switch.
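To make the butterfly connection concrete, here is a small sketch (the
function and its cost model are my illustration, not from the post): an
n-input butterfly built from 2x2 switches has log2(n) stages of n/2 switches
each, so its delay is one switch traversal per stage, i.e. O(log n).

```python
def butterfly_cost(n):
    """Stage count and 2x2-switch count for an n-input butterfly
    network, n a power of two. The delay through the network is
    proportional to the stage count, which is log2(n)."""
    assert n > 0 and (n & (n - 1)) == 0, "n must be a power of two"
    stages = n.bit_length() - 1       # log2(n) stages
    switches = stages * (n // 2)      # n/2 two-by-two switches per stage
    return stages, switches
```

The hardware cost grows as n log n while the delay grows only as log n,
which is why such switches were attractive for shared-memory machines.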

In multiprocessor systems with message passing, one can make similar arguments
about the message routing hardware.

Note that all of these arguments rest on the key phrase "all known
technologies".  I know of no way to prove that someone tomorrow cannot find a
new technology that evades these limits; such a new technology might, after
all, rest on now-unknown physical principles.  At this point, a digression into
the realm of Karl Popper would be appropriate, but not in this newsgroup.

					Douglas Jones
					jones@herky.cs.uiowa.edu