Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site redwood.UUCP
Path: utzoo!linus!decvax!decwrl!amdcad!fortune!foros1!redwood!rpw3
From: rpw3@redwood.UUCP (Rob Warnock)
Newsgroups: net.micro.68k
Subject: Re: 68020 Performance Revisited Again
Message-ID: <77@redwood.UUCP>
Date: Mon, 5-Nov-84 00:25:20 EST
Article-I.D.: redwood.77
Posted: Mon Nov  5 00:25:20 1984
Date-Received: Tue, 6-Nov-84 07:37:53 EST
References: <4107@decwrl.UUCP>
Organization: [Consultant], Foster City, CA
Lines: 86

+---------------
| The truth of the matter is probably somewhere in the range between my
| figures and MacGregor's...
+---------------

That may be good theory, and I agree with most of your comments about
cache design (see some long stuff I posted some months ago), but my actual
experience with "real" UNIX tasks (cc, nroff, grep, vi, mail, news, etc.)
runs counter to even your "conciliatory" numbers:

+---------------
| 68K MEMORY	   100ns	  200 ns	  300ns
| 8MHz    68000    0.14-0.25      0.14-0.25       0.14-0.25
+---------------
---> 32:16					265ns
---> 5.5MHz 68k					~0.5

My experience with the Fortune Systems 32:16, which runs a 5.5 MHz clock
(no wait states) with 200ns 64K chips (the cycle also allows 65 ns for
ECC that's not used, so call them 265 ns chips), is that on every
CPU-intensive benchmark I tried that did not involve (significant)
floating-point, the 5.5MHz 68k ran almost EXACTLY 0.5 * VAX-11/780 speed
(single-user, in both cases). (Note that the compiler used on the 68k treats
"int" == "long" == 32 bits, as does the VAX.)
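
To make the memory timing above concrete, here is a rough sketch of the
arithmetic in Python. The 200 ns, 65 ns, and 5.5 MHz figures come from the
paragraph above; the four-clock minimum bus cycle is the standard MC68000
figure for a cycle with no wait states. The function names are made up for
illustration, and comparing access time against the whole bus cycle is only
a crude sanity check, not a real timing analysis.

```python
# Rough timing arithmetic for the Fortune 32:16 figures quoted in the post.
# Function names are illustrative only.

RAM_ACCESS_NS = 200.0      # from the post: 200ns 64K chips
ECC_ALLOWANCE_NS = 65.0    # from the post: ECC time left in the cycle
CLOCK_MHZ = 5.5            # from the post
BUS_CYCLE_CLOCKS = 4       # MC68000 minimum bus cycle, no wait states


def clock_period_ns(clock_mhz):
    """Length of one clock period in nanoseconds."""
    return 1000.0 / clock_mhz


def effective_access_ns(ram_ns, ecc_ns):
    """Chip access time plus the ECC allowance built into the cycle."""
    return ram_ns + ecc_ns


def bus_cycle_ns(clock_mhz, clocks=BUS_CYCLE_CLOCKS):
    """Duration of a bus cycle of the given number of clocks."""
    return clocks * clock_period_ns(clock_mhz)


if __name__ == "__main__":
    access = effective_access_ns(RAM_ACCESS_NS, ECC_ALLOWANCE_NS)
    cycle = bus_cycle_ns(CLOCK_MHZ)
    print("effective access: %.0f ns" % access)
    print("bus cycle at %.1f MHz: %.1f ns" % (CLOCK_MHZ, cycle))
    print("access fits inside one bus cycle:", access < cycle)
```

On these numbers the "265 ns" memory sits well inside a no-wait-state bus
cycle at 5.5 MHz, which is consistent with the machine running without
wait states.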

On disk-intensive tasks, the speeds were very nearly the ratio of the
random access times of the drives involved. For certain tasks for which
the 68k software had been carefully tuned (e.g., tty output), it actually
outperformed the VAX (though making the same changes to the VAX/4.1bsd
kernel would surely wipe out the discrepancy). On mixed tasks, it did
somewhat better than linear interpolation would predict (but this is
to be expected, since there is a non-linear soft transition from disk-bound
to CPU-bound).

+---------------
| My initial motivation was to put an end to the practice of comparing
| microprocessor chips to 7-year-old fully-configured, virtual-memory,
| multi-user computer systems. 
+---------------

I agree, wholeheartedly! But when you say...

+---------------
| ...I firmly believe that the data in the
| table above can be used as ballpark figures for systems built around
| the 68020, but one must remain cautious of the hooks.
| 
| Joe Falcone
+---------------

Sorry, your table isn't even close to what my stopwatch says about Fortune,
CT Miniframe (10MHz, no wait states), Callan, and others. [Hint: I think
you may have been misled by basing your VAX numbers on the theoretical
performance of the 200ns SBI -- isn't it true that a 780 processor can't
keep the SBI busy? Also, you need to allow for bias in comparing UNIX to VMS.]

As far as my experience goes so far, unless the designers screw up in the
UNIX port or in the memory management or in the disk subsystem,
MY "ballpark figure" is that a straightforward 68000 system at 10MHz (with
no wait states) closely equals a VAX-11/780 in UNIX system performance with
(say) 5-25 users doing "typical UNIX" things (with the "same" lineage UNIX).
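
The 10MHz ballpark is roughly what falls out of scaling the measured 5.5MHz
result (0.5 * VAX) linearly with clock rate. The sketch below does that
arithmetic. Linear scaling is an assumption here (it presumes memory keeps
up with no wait states at the higher clock), and scale_by_clock is a
hypothetical helper, not anything taken from the systems discussed.

```python
# Extrapolate a measured speed ratio to a different clock rate.
# ASSUMPTION: performance scales linearly with clock (no added wait states).


def scale_by_clock(measured_ratio, measured_mhz, target_mhz):
    """Scale a speed ratio (relative to a VAX-11/780) by the clock ratio."""
    return measured_ratio * (target_mhz / measured_mhz)


if __name__ == "__main__":
    # From the post: a 5.5MHz 68k ran at about 0.5 * VAX-11/780.
    estimate = scale_by_clock(0.5, 5.5, 10.0)
    print("estimated 10MHz 68000 speed: %.2f * VAX" % estimate)
```

The estimate comes out a little under 1.0, in line with "closely equals"
rather than "beats".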

On the other hand, whipping up a blazing 68020 system's not so easy, either.
I firmly agree that getting a 20MHz 68020 to do 4 * VAX (factor of two in
clock over 10MHz times ~1.5 for instruction cache times ~1.5 for 32-bit bus)
is NOT going to be easy, and maybe not even economical!
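
For reference, the multipliers in the parenthesis above combine as in the
sketch below. The three factors are the post's own rough ("~") figures;
treating them as independent and simply multiplying them is the assumption,
and the names are illustrative only.

```python
# Combine the rough speedup factors quoted for a 20MHz 68020 over a
# 10MHz 68000. ASSUMPTION: the factors are independent and multiply.

FACTORS = {
    "clock (20MHz vs 10MHz)": 2.0,
    "instruction cache": 1.5,   # "~1.5" in the post
    "32-bit bus": 1.5,          # "~1.5" in the post
}


def combined_speedup(factors):
    """Multiply the individual speedup factors together."""
    total = 1.0
    for value in factors.values():
        total *= value
    return total


if __name__ == "__main__":
    print("combined speedup: %.2f" % combined_speedup(FACTORS))
```

The straight product is 4.5, so "4 * VAX" already allows a little slack
for the approximate factors and for a 10MHz baseline that is only about
equal to a VAX.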

But a SYSTEM designer might settle for 2 to 2.5 times VAX, and win big in
price/performance. (E.g., DON'T use a cache, but just use the fastest 256K chips
you can get, interleaved to save power while using overlapped RAS/MMU-decode.
Use multi-processors if you still need more horses in the box.)

Summary:
	1. I agree with your general style of analysis, ...
	2. ...but I think your "baseline" is still WAY off what I have seen.
	3. "Incidental" issues like disk I/O and tty drivers can make a
	   FAR greater difference on user-perceived system performance --
	   be careful about too much fine-tuning of the CPU/memory.
	4. I am only talking about "typical UNIX" apps, not F.P. crunching.

Rob Warnock

UUCP:	{ihnp4,ucbvax!amd}!fortune!redwood!rpw3
DDD:	(415)572-2607
Envoy:	rob.warnock/kingfisher
USPS:	510 Trinidad Ln, Foster City, CA  94404