Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site redwood.UUCP
Path: utzoo!watmath!clyde!cbosgd!ihnp4!mhuxn!mhuxj!mhuxr!ulysses!allegra!bellcore!decvax!tektronix!hplabs!hpda!fortune!redwood!rpw3
From: rpw3@redwood.UUCP (Rob Warnock)
Newsgroups: net.arch
Subject: Re: Re: Caltech's Cosmic Cube
Message-ID: <183@redwood.UUCP>
Date: Tue, 5-Mar-85 01:39:04 EST
Article-I.D.: redwood.183
Posted: Tue Mar 5 01:39:04 1985
Date-Received: Sat, 9-Mar-85 08:47:42 EST
References: <333@oakhill.UUCP> <21294@lanl.ARPA> <7268@watrose.UUCP> <166@cmu-cs-wb1.ARPA> <198@cornell.UUCP>
Organization: [Consultant], Foster City, CA
Lines: 111

+---------------
| >What puzzles me is why use point to point channels between processors (and
| >do routing if a connection does not exist)? Wouldn't it be much simpler to
| >use a dedicated ethernet?
| I assume that most of the communication between processors consists of very
| short packets, i.e., a single floating point number...
+---------------

Just went to a very interesting talk today at NASA/Ames given by Cleve
Moler of Intel Scientific Computers, who make a commercial hypercube
system (announced in net.arch previously). Don't know about Caltech's
applications, but for Intel's, the messages tend to be fairly large
vectors, actually. (Hundreds of floating-point numbers.)

+---------------
| ... Ethernet is very
| inefficient when it is handling short packets, since it has a lot of overhead
| per packet. In actual practice, the 10mb bandwidth is approximated only when
| packets are very long (perhaps 10KB, I forget).
+---------------

Well, not 10KB, since the maximum legal packet is 1518 bytes. The
minimum packet size is 46 data bytes (64 total bytes including preamble,
address, and CRC), and those can happen every 60.8 microseconds (51.2
for the packet and 9.6 "mikes" of inter-packet delay), or every 76 byte
times.
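
For anyone who wants to check the arithmetic, here is a rough Python sketch of the accounting above: 18 bytes of address, type, and CRC around the data, plus 12 byte times (9.6 microseconds) of inter-packet gap, at 0.8 microseconds per byte. The function and constant names are illustrative only. The `preamble_bytes` parameter is an assumption added here: it defaults to 0 to reproduce the figures above, and can be set to 8 if you want to count the 64-bit preamble as separate per-packet overhead. Collisions are ignored.

```python
# Rough per-packet timing and efficiency for 10 Mbit/s Ethernet,
# ignoring collisions. All names here are illustrative.

BIT_RATE = 10_000_000      # bits per second
HEADER_CRC_BYTES = 18      # 6 dest + 6 source + 2 type + 4 CRC
GAP_BYTE_TIMES = 12        # 9.6 us inter-packet gap at 0.8 us per byte
MIN_DATA_BYTES = 46        # smallest legal data field
MAX_DATA_BYTES = 1500      # 1518-byte maximum packet minus 18 bytes


def byte_times_per_packet(data_bytes, preamble_bytes=0):
    """Byte times from the start of one packet to the start of the next."""
    if not MIN_DATA_BYTES <= data_bytes <= MAX_DATA_BYTES:
        raise ValueError("data field must be 46..1500 bytes")
    return preamble_bytes + HEADER_CRC_BYTES + data_bytes + GAP_BYTE_TIMES


def packet_period_us(data_bytes, preamble_bytes=0):
    """Minimum spacing of back-to-back packets, in microseconds."""
    bits = byte_times_per_packet(data_bytes, preamble_bytes) * 8
    return bits * 1e6 / BIT_RATE


def efficiency(data_bytes, preamble_bytes=0):
    """Fraction of channel time that carries data bytes."""
    return data_bytes / byte_times_per_packet(data_bytes, preamble_bytes)


if __name__ == "__main__":
    for size in (46, 128, 256, 1024, 1500):
        print(f"{size:5d} data bytes: "
              f"{packet_period_us(size):7.1f} us/packet, "
              f"{100 * efficiency(size):5.1f}% efficient")
```

Counting the preamble (`preamble_bytes=8`) lowers every figure somewhat, most noticeably for the shortest packets.
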
Let's see, that's a minimum efficiency of 46/76 or about 60%, in the
absence of collisions. Packets of only 128 data bytes yield 81%; 256
bytes, 89.5%; and 1024 bytes, 97%. Even with collisions, channel
efficiency stays high for packets over 128 bytes or so, but remember
that in the backplane "bus" application here, the Ethernet channel is
VERY short (much less than a bit time), so collisions are much less
frequent. (Try solving the equations for efficiency in the original
Ethernet paper for C = 10 Mbit/sec and T = 0.1 microsecond.)

+---------------
| Also, I bet most of the algorithms for the Cosmic Cube are fairly
| synchronous, so all the processors would want to be broadcasting at the
| same time...
+---------------

That didn't seem to be the case for the application problems I saw
presented today -- concurrent, yes; "synchronous", no. Further, the
targets of messages were always specific processors (processes,
actually). Broadcast did not seem to be (yet) implemented.

+---------------
| ... Ethernet assumes that the net is not very loaded. A 10%
| loaded Ethernet is very rare.
+---------------

True, a heavily-loaded Ethernet is rare in, say, a real-life
office-automation environment. But Ethernet doesn't "assume" that; in
fact, the access algorithm and total throughput are stable even under
extreme overload. (See "Measured Performance of an Ethernet...", Shoch
& Hupp.) The net will not collapse, as long as the rules are followed,
and the throughput will be high if packets are a few hundred bytes or
more. On a "bus" backplane, the throughput will be even higher (the
number of "hosts" is smaller, and the "cable" is shorter).

+---------------
| Also, Ethernet is not that cheap. Each connection runs a few hundred
| dollars. A straightforward serial connection would only be a few dollars,
+---------------

Geez... I wonder why the Intel hypercube uses ETHERNET chips...
EIGHT (8) OF THEM!!! ;-} ;-} And they use them for mere point-to-point
links!
Seriously, you should look at current chip prices. In "backplane"
applications you don't need a full transceiver per connection, but can
interconnect at the "transceiver cable" level (or even at TTL, if you
supply clock).

+---------------
| ... A straightforward serial connection would only be a few dollars,
| and a parallel port is even faster and almost as cheap (wiring costs, you
| know).
+---------------

Sorry, most of the cost is NOT in the serialization, but in the bus
interface, buffer handling, and line driving/receiving -- all things
which a parallel interface also has to do. And the parallel interface
doesn't have the noise immunity (at least not a cheap TTL one), while
the Ethernet transceiver-cable driver/receivers cheerfully drive 50
meters over a shielded twisted pair (differential shifted-ECL levels).

+---------------
| ...As long as the interconnection pattern is regular and there are
| not too many processors (too many is more than the number that fit in one
| or two cabinets) the Cosmic Cube interconnection scheme should be cheap
| and simple.
|
| Ralph Johnson
+---------------

I'd like to see you interconnect 128 processors in a hypercube using
50-pin ribbon cable! ;-} The interconnection pattern is regular, but
it's not necessarily convenient! (Remember, each processor is a
"corner", and as you "linearize" the Cube by putting it in a rack, the
interconnects get to be a bit of a rat's nest.)

Disclaimer: I am not selling the Intel method; I have some concerns
about having that many high-speed point-to-point links on a memory bus.
(I am an advocate of quasi-bus serial backplanes, rather than
point-to-point.) However, Intel's use of Ethernet chips is quite
reasonable, given the connection pattern they chose, and is MUCH
preferred to 8 parallel interfaces!

Rob Warnock
Systems Architecture Consultant

UUCP: {ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD: (415)572-2607
USPS: 510 Trinidad Lane, Foster City, CA 94404