Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site redwood.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!ihnp4!houxm!whuxl!whuxlm!akgua!sdcsvax!sdcrdcf!hplabs!hpda!fortune!redwood!rpw3
From: rpw3@redwood.UUCP (Rob Warnock)
Newsgroups: net.arch
Subject: Re: Cube designs vs. x,y,z bus
Message-ID: <173@redwood.UUCP>
Date: Wed, 27-Feb-85 07:03:55 EST
Article-I.D.: redwood.173
Posted: Wed Feb 27 07:03:55 1985
Date-Received: Sun, 3-Mar-85 04:35:52 EST
References: <48@pbear.UUCP> <268@oliveb.UUCP> <7306@watrose.UUCP>
Organization: [Consultant], Foster City, CA
Lines: 79

+---------------
| The whole point of point-to-point communication channels is to eliminate all
| forms of bus contention that may occur between processors...
| ...Point-to-point, on the other hand, buys zero bus squabbling (i.e. full bus
| bandwidth on each wire), at the price of data for "distant" processors
| having to be routed through some intermediaries.
| Hope this answers the "why point-to-point" questions ....
| Chris Shaw
+---------------

Yes, this sounds nice, but unfortunately, you have just pushed the
"bus squabbling" into each processor node! For "N" greater than 2 or
3, "N" times the "full bus bandwidth" is not available WITHIN each
processor to SERVICE such point-to-point channels! (...unless each
channel is terribly slow.) Take a look sometime at the memory-bus
characteristics of the "full-function DMA" Ethernet chips, such as the
Intel 82586 and the AMD/Mostek LANCE.  (To avoid confusion, I will
use "M-bus" to mean a processor's internal memory bus, and "E-bus" to
mean the external Ethernet or similar bit-serial bus.)

Because of the time the chip spends holding the M-bus, you can't run
more than about two simultaneous controllers on the same M-bus at 10
Mbits/sec. In order to get 6 or 8 or more point-to-point channels on
one M-bus, you have to slow each channel down so much that you would
be better off with the "bus contention" on a full-speed E-bus!
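
To put rough numbers on that, here is a back-of-the-envelope sketch. The
40% M-bus occupancy per controller is an illustrative assumption, not a
measured figure for the 82586 or the LANCE, and the function names are
made up for this example. It also assumes that a controller's M-bus
occupancy scales linearly with its channel rate.

```python
import math

def max_controllers(occupancy_per_controller, mbus_budget=1.0):
    """Upper bound on controllers one M-bus can carry at full channel speed.

    occupancy_per_controller: assumed fraction of M-bus time one controller
    holds while its channel runs at full rate (hypothetical figure).
    mbus_budget: fraction of M-bus time we are willing to give to DMA.
    """
    # Small epsilon guards against floating-point results such as 1.9999999.
    return math.floor(mbus_budget / occupancy_per_controller + 1e-9)

def channel_rate_for(n_controllers, full_rate_mbps,
                     occupancy_per_controller, mbus_budget=1.0):
    """Per-channel rate when n controllers must share the M-bus budget.

    Assumes occupancy is proportional to channel rate, so each channel is
    slowed until its share of the M-bus fits.
    """
    share = mbus_budget / n_controllers
    return full_rate_mbps * min(1.0, share / occupancy_per_controller)

if __name__ == "__main__":
    print(max_controllers(0.4))              # controllers at full speed
    print(channel_rate_for(8, 10.0, 0.4))    # Mbits/sec per channel with 8
```

Under those assumptions, eight channels on one M-bus each run at roughly
a third of the full E-bus rate, and that is before the processor itself
gets any memory cycles.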

Yes, I know that by supplying a whole mess of external data and address
buffers (and control logic for them), you can cut the M-bus-occupancy
time of these chips, but that only raises "N" to 3 or 4 before you run
out of memory bandwidth. (Even the "x,y,z" configuration is going to
need some help to support just 3 controllers.) And before I would try
to interface either of the above to a 32-bit (or wider) memory bus, I
would "drop back and punt" and use a simple serializer such as the Seeq
or Fujitsu Ethernet chips and a state machine to do the DMA. But by the
time you have built 8 fast channels and have widened the memory enough
to support them, you could afford perhaps DOUBLE the number of the
cheaper "x,y,z" E-bus processors!

Remember, with fast source-path routing and smart DMA controllers such
as the Intel or AMD/Mostek parts, sending a packet from E-bus "x" to
E-bus "y" to E-bus "z" at 10 Mbits/sec can easily be FASTER than sending
it point-to-point over a 1 Mbit/sec channel.
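
The arithmetic behind that claim can be sketched as follows. The
1500-byte packet size, the store-and-forward model (each hop receives
the whole packet before resending it), and the zero per-hop overhead
default are assumptions for illustration only.

```python
def transfer_time_ms(packet_bytes, rate_mbps, hops=1, per_hop_overhead_ms=0.0):
    """Store-and-forward transfer time in milliseconds.

    Each hop sends the entire packet at rate_mbps, plus an assumed fixed
    routing/DMA overhead per hop. Contention and propagation delay are ignored.
    """
    bits = packet_bytes * 8
    bits_per_ms = rate_mbps * 1000.0      # 1 Mbit/sec == 1000 bits per ms
    per_hop = bits / bits_per_ms + per_hop_overhead_ms
    return hops * per_hop

if __name__ == "__main__":
    three_bus_hops = transfer_time_ms(1500, 10.0, hops=3)   # x -> y -> z
    one_slow_link = transfer_time_ms(1500, 1.0, hops=1)     # point-to-point
    print(three_bus_hops, one_slow_link)
```

With these numbers the three E-bus hops take about 3.6 ms against 12 ms
for the single slow link, so the routed path wins as long as the per-hop
overhead stays under roughly 2.8 ms.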

[Note: Slight subject change coming -- away from "point-to-point vs. bus" ]

I happen to favor a hybrid approach, as I mentioned in an earlier posting,
which uses an E-bus to group a number (possibly small, say 8 perhaps? ;-} )
of processors together to make a "fat point" (I believe I used the word
"hyper-point" earlier). Each processor needs but two (2) E-bus controllers:
one used for the internal or intra-point communication, and one used for
inter-point or "edge" connections. (This is easily achieved at 10 Mbits/sec
with existing controller chips and 16-bit data paths.)

If each "edge" E-bus contains but two processors, you can exactly model
the hyper-cube style, but with each "point" being 6-10 processors with
2 controllers each rather than one processor with 6-10 controllers.
Yes, each transmission from one "edge" E-bus to another requires an
intermediate hop through the "point" E-bus, but this is compensated by
the higher speed of the "edge" links, which run at a full 10 Mbits/sec.
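
One way to check the wiring of that two-processor-per-edge case is the
sketch below. The naming scheme (processor (point, k) faces dimension k
of its point) and the tuple labels for buses are my own assumptions for
illustration, not part of any existing design.

```python
def build_hyper_points(dim):
    """Map each processor to its two E-buses for a dim-cube of hyper-points.

    Each of the 2**dim points holds dim processors. Processor (point, k)
    has one controller on the point's internal E-bus and one on the edge
    E-bus joining the point to its neighbor across dimension k.
    """
    processors = {}
    for point in range(2 ** dim):
        for k in range(dim):
            neighbor = point ^ (1 << k)          # flip bit k: Hamming distance 1
            edge_bus = ("edge", min(point, neighbor), max(point, neighbor))
            processors[(point, k)] = (("point", point), edge_bus)
    return processors

def bus_membership(processors):
    """Invert the mapping: bus -> list of processors attached to it."""
    members = {}
    for proc, buses in processors.items():
        for bus in buses:
            members.setdefault(bus, []).append(proc)
    return members

if __name__ == "__main__":
    procs = build_hyper_points(6)
    members = bus_membership(procs)
    edges = [b for b in members if b[0] == "edge"]
    print(len(procs), len(edges))   # processors and edge E-buses for a 6-cube
```

Every processor ends up with exactly two controllers, every edge E-bus
with exactly two processors, and every point E-bus with dim processors,
which is the hyper-cube shape with the fan-out moved out of the node.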

But higher-degree "edges" are possible -- one can fold edges together (by
connecting edge E-busses) either to form "hyper-hyper-cubes" (whatever that
might mean) or to simply save on processors. If every other point (using
Hamming distance to give meaning to "every other") E-bussed all of its
edges together, such "distinguished" points need have only one processor
(and no "internal" E-bus), and you end up with something I can only describe
as the "half-dual" of a hyper-cube.

Intermediate forms are of course possible, as well as the other extreme.
I leave as an exercise the construction of an "x,y,z bus" system from
hyper-points whose processors contain only two E-bus controllers per M-bus...


Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA  94404