Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site watrose.UUCP
Path: utzoo!watmath!watrose!cdshaw
From: cdshaw@watrose.UUCP (Chris Shaw)
Newsgroups: net.arch
Subject: Re: Cube designs vs. x,y,z bus
Message-ID: <7327@watrose.UUCP>
Date: Fri, 1-Mar-85 17:44:04 EST
Article-I.D.: watrose.7327
Posted: Fri Mar 1 17:44:04 1985
Date-Received: Sat, 2-Mar-85 02:53:37 EST
References: <48@pbear.UUCP> <268@oliveb.UUCP> <7306@watrose.UUCP> <5056@fortune.UUCP>
Reply-To: cdshaw@watrose.UUCP (Chris Shaw)
Organization: U of Waterloo, Ontario
Lines: 48

> Ah, surely you jest. Given that each node on the pseudo cube has
> its own associated memory, the vast majority of that processor's time
> will be spent without touching the 'bus'. And in most cases, the only
> times the bus is used is for infrequent data movement (let's say for
> mmu misses) and for interprocessor communication.

No, I don't jest, and here's why. The purpose of the cube is NOT to have a
machine which dozens of people can log on to and each have their own
286-based micro. The motivation for the cube is to get a machine in which
all of the processors work together on the same VERY large problem. In other
words, the cube is a PARALLEL processing machine, not just a machine with
lots of processors. As a previous reply to your posting indicated, you can't
have too much parallelism in a machine of this ilk.

> As long as there are not too many processors on any one bus or
> ethernet link, the number of times where you would have to wait for the
> bus would be minimal. The trade-off as to how many would be allowed
> is part of the architect's job, to analyze the usage of the machine, the
> tasks it must do, the performance requirements and the cost.

There are two things I see wrong with this suggestion:

1) It seems to imply that there would be several versions of a machine,
   say a linear algebra engine, a database engine, etc. This sounds kind of
   hokey to me.
2) I think your estimate of the communications load is far too small.
   Caltech has a cube running on which they solve the 7-body problem. This
   problem requires that you calculate 21 pairs of interactions (7*6/2, though
   I don't know for sure). The structure of the solution was to have 7 body
   processes and one I/O process on a 4-node square (2 processes per node).
   Each body process sent its position to 3 other processes, and received
   similar data from the other 3. With info for each body, calculations
   were done, and the results passed on to the remaining processes.
   (See Jan '85 CACM for the real description.)

The point is, the solution to this problem is highly communication-based.
Many applications where matrix-bashing is needed are also likely to depend
on getting partial results sent to them from other processes within the
cube. Basically, as much time could conceivably be spent on sending and
receiving data as is spent on doing actual calculations.

>-Jim Wall
>...amd!fortune!wall

Chris Shaw
University of Waterloo
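
As a rough check on the 7-body figures quoted above, here is a small Python sketch. It assumes (this is my assumption, not something taken from the CACM description) that the bodies are arranged on a logical ring and each one sends its position to the next 3 bodies around it. The names ring_exchange and N_BODIES are made up for illustration. Under that assumption, 7 senders times 3 messages gives 21 messages, and those cover each of the 21 unordered pairs exactly once.

```python
from itertools import combinations

N_BODIES = 7  # number of bodies, as in the post


def ring_exchange(n):
    """Hypothetical exchange pattern: body i sends its position to the
    next (n - 1) // 2 bodies on a logical ring. Returns a list of
    (sender, receiver) tuples."""
    half = (n - 1) // 2
    return [(i, (i + k) % n) for i in range(n) for k in range(1, half + 1)]


# Every unordered pair of bodies that must interact.
all_pairs = {frozenset(p) for p in combinations(range(N_BODIES), 2)}

# Position messages sent under the assumed ring pattern.
messages = ring_exchange(N_BODIES)

# Unordered pairs that end up with both positions in one place.
covered = {frozenset(m) for m in messages}

print("pairs to compute:", len(all_pairs))
print("position messages:", len(messages))
print("every pair covered exactly once:",
      covered == all_pairs and len(messages) == len(covered))
```

The sketch only counts the first round of position messages; passing the computed results back to the other process in each pair would add further traffic, which is in line with the argument that communication is a large share of the work.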