Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!gatech!hubcap!Irving
From: Irving@hubcap.UUCP
Newsgroups: comp.hypercube
Subject: Re: Question on hypercube routing
Message-ID: <319@hubcap.UUCP>
Date: Sat, 18-Jul-87 16:15:37 EDT
Article-I.D.: hubcap.319
Posted: Sat Jul 18 16:15:37 1987
Date-Received: Sun, 19-Jul-87 01:14:11 EDT
Sender: fpst@hubcap.UUCP
Lines: 60
Approved: hypercube@hubcap.clemson.edu

In article <278@hubcap.UUCP> pase%oregon-grad.csnet@RELAY.CS.NET (Douglas M. Pase) writes:
>Message routing, it would seem, is a simple problem on a hypercube [...]

>			*** The Question ***
>
>Does anyone know of a routing algorithm which utilizes more evenly the channels
>available for *all* combinations of node sources and destinations?
>--
>Doug Pase   --   ...ucbvax!tektronix!ogcvax!pase  or  pase@Oregon-Grad.csnet

Well, it just so happens that my supervisor and I have a paper submitted for
publication on just this problem...

First, my apologies for the long delay; turnaround time out here is pretty
bad.  I have seen most of the other responses (they all arrived at the same
time).

Anyhow, what we have done is to compare fixed-order routing (the one you
outlined), random-order routing (at each router, a direction is chosen at
random from the directions in which the message must move), and two
"adaptive" schemes where the dimensions chosen depend on all the messages
passing through the node at the same time.
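
As a rough illustration of the first two schemes (a sketch only, not code
from the paper): nodes are numbered so that neighbours differ in exactly one
bit, and a message must move in every dimension where its current node and
its destination differ.  The function names are made up for this example,
and "lowest dimension first" is an assumed ordering for the fixed-order
case.

```python
import random

def differing_dims(node, dest):
    """Dimensions (bit positions) in which node and dest differ."""
    diff = node ^ dest
    return [d for d in range(diff.bit_length()) if (diff >> d) & 1]

def next_hop_fixed(node, dest):
    """Fixed-order: always correct the lowest differing dimension (assumed order)."""
    dims = differing_dims(node, dest)
    return node ^ (1 << dims[0]) if dims else node

def next_hop_random(node, dest, rng=random):
    """Random-order: pick any one of the dimensions still to be corrected."""
    dims = differing_dims(node, dest)
    return node ^ (1 << rng.choice(dims)) if dims else node

def route(src, dest, next_hop):
    """Follow next_hop from src until dest is reached; return the node path."""
    path = [src]
    while path[-1] != dest:
        path.append(next_hop(path[-1], dest))
    return path

# Node 0 to node 5 in a 3-cube: dimensions 0 and 2 must be corrected.
print(route(0b000, 0b101, next_hop_fixed))
print(route(0b000, 0b101, next_hop_random))
```

Either way the path has the same length (the number of differing bits);
the schemes differ only in which wires the traffic is spread over.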

The locally optimal assignment of messages to outgoing wires (that is,
moving the most messages forward at once) is exactly the Bipartite Matching
problem, for which the best known solution is a very tricky O(n^2.5)
algorithm that would be almost impossible to implement effectively in
silicon.  We used this as one of our adaptive schemes; the other was a
simple "first-fit" method.
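
To make the difference concrete, here is a small sketch under stated
assumptions.  Each message at a node lists the dimensions that would move
it closer to its destination; each outgoing wire can carry one message per
step.  The first_fit below is one plausible reading of "first-fit", not
necessarily the method in the paper, and max_matching uses the plain
augmenting-path method rather than the O(n^2.5) algorithm mentioned above.
All names are invented for the example.

```python
def first_fit(wants):
    """wants[i] lists the useful dimensions for message i.
    Each message, in order, takes its first wire that is still free."""
    used = set()
    assign = {}
    for i, dims in enumerate(wants):
        for d in dims:
            if d not in used:
                used.add(d)
                assign[i] = d
                break
    return assign  # message index -> dimension

def max_matching(wants):
    """Maximum bipartite matching of messages to wires by augmenting paths."""
    owner = {}  # dimension -> message index currently holding that wire

    def try_place(i, seen):
        for d in wants[i]:
            if d in seen:
                continue
            seen.add(d)
            # Take a free wire, or push the current holder onto another wire.
            if d not in owner or try_place(owner[d], seen):
                owner[d] = i
                return True
        return False

    for i in range(len(wants)):
        try_place(i, set())
    return {i: d for d, i in owner.items()}

# Message 0 can use wire 0 or 1; message 1 can only use wire 0.
wants = [[0, 1], [0]]
print(first_fit(wants))     # message 0 grabs wire 0, message 1 must wait
print(max_matching(wants))  # both messages move
```

In this example first-fit moves one message and the matching moves two.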

We simulated all of these under random message loads at a variety of load
levels, and compared queue lengths, total message transit time (in cube
steps), and a few other things.  Basically, the "fixed-order" methods perform
dismally compared to the "adaptive" methods.


The second part of the paper investigates what happens with limited queues
at each corner of the cube.  Messages are assigned moves away from their
destination if necessary to prevent queue overflows.
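
One way to picture a single routing step with a bounded queue is sketched
below.  This is an illustration under assumptions, not the scheme from the
paper: forward moves are handed out first-fit, up to queue_space of the
blocked messages wait, and the rest are sent out on whatever wires are
still free, even though those wires lead away from their destinations.
The name route_step, its parameters, and the "overflow" list are all
hypothetical.

```python
def route_step(wants, n_dims, queue_space):
    """One step at one node.

    wants[i]    -- dimensions that move message i closer to its destination
    n_dims      -- number of outgoing wires (cube dimension)
    queue_space -- how many blocked messages may wait at this node
    """
    used = {}      # dimension -> message index sent forward on that wire
    waiting = []   # messages that found no useful free wire
    for i, dims in enumerate(wants):
        d = next((d for d in dims if d not in used), None)
        if d is None:
            waiting.append(i)
        else:
            used[d] = i

    forward = {i: d for d, i in used.items()}
    queued = waiting[:queue_space]

    # Blocked messages that do not fit in the queue are deflected onto
    # free wires, i.e. moved away from their destination.
    free = [d for d in range(n_dims) if d not in used]
    deflected = {}
    overflow = []  # nothing left to do with these; not modelled further
    for i in waiting[queue_space:]:
        if free:
            deflected[i] = free.pop(0)
        else:
            overflow.append(i)
    return forward, queued, deflected, overflow

# Three messages all want wire 0 in a 3-cube with one queue slot.
print(route_step([[0], [0], [0]], n_dims=3, queue_space=1))
```

Here one message goes forward, one waits, and one is deflected onto a free
wire instead of overflowing the queue.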

What we found was that very little queue space is needed (i.e., 2 or 4 spaces
in total, not per dimension, for an 8-cube) to get the best throughput/delay
under heavy loads.  Under light loads it doesn't make any difference, since
messages don't have to wait in the queues much.  Also, our first-fit method
comes very close to Bipartite Matching without the complex algorithm.


The paper is available as a tech report from the University of Saskatchewan;
e-mail me (reid@sask.UUCP) your postal address if you would like a copy.

I have also done a fairly complete design of a router chip for an 8-cube
using the first-fit algorithm; it maps quite well into VLSI.  The design
can be extended to attach an arbitrary number (limited by pin count and chip
size) of PEs to each corner of the cube to make better use of the high load
capacity of the first-fit routers.

-- 
 - irving -   (reid@sask.uucp or {alberta, ihnp4, utcsri}!sask!reid)

Whose idea was this, anyway?