Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!bloom-beacon!husc6!yale!mfci!colwell
From: colwell@mfci.UUCP (Bob Colwell)
Newsgroups: comp.arch
Subject: Re: Memory latency / cacheing / scientific programs
Keywords: cache latency bus memory
Message-ID: <452@m3.mfci.UUCP>
Date: 28 Jun 88 02:12:49 GMT
References: <243@granite.dec.com> <443@m3.mfci.UUCP> <448@m3.mfci.UUCP> <244@granite.dec.com>
Sender: root@mfci.UUCP
Reply-To: colwell@mfci.UUCP (Bob Colwell)
Organization: Multiflow Computer Inc., Branford Ct. 06405
Lines: 73

In article <244@granite.dec.com> jmd@granite.UUCP (John Danskin) writes:
>
>Somewhere up in my original posting I said 'scoreboarding'. With
>scoreboarding, the code still runs. It may not run fast anymore, but
>that is another problem.
>
>You tell the compiler what the latencies are so that it can produce
>good code.  You implement scoreboards so that binaries port from your
>mark 1 machine to your mark 2 machine (they don't run as fast as they
>should, but they probably run a little faster).
>
>The real problem seems to be that scoreboards are expensive. You guys
>may be on the right track (blow off compatibility, make the machine
>soooo fast and relatively cheap that people will use it despite the
>headache).

I sure hope so.  We considered a lot of alternatives, including
putting in a hardware mode to make newer hardware correctly mimic the
pipelines of older machines, but we all suffered allergic rashes for
a week.
Visions of something analogous to dragging the 8086 register set
around for the rest of our lives danced through our heads.  It's just
really yucky (scientifically speaking) to make your current hardware
slower so that old executables run unmodified.

Your point about the reason for scoreboarding was on target.
What I was trying to say was that however expensive you think
scoreboards are for normal machines, I believe they'd turn out to be
lots more so on a VLIW like ours, because of the sheer number of
instruction stream bits you'd have to simultaneously monitor (1024
all told, across 8 different boards).  I'd rather put all that
extra hardware to use as more functional unit horsepower, or more
I/O, caches, whatever.  Something that would contribute to my usable
compute horsepower.
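
As a rough illustration of the scoreboarding behavior described in the
quoted text (a toy model only, not any real machine), the Python sketch
below issues instructions in order, one per cycle, and stalls whenever
a source register is not yet ready.  The function name, the
three-instruction program, and the "mark 1" / "mark 2" latencies are
all assumptions made up for this example.

```python
def run_with_scoreboard(program, latency):
    """Issue instructions in order, one per cycle, stalling on busy sources.

    program: list of (op, dest, sources) tuples.
    latency: dict mapping op name -> cycles until its result is ready.
    Returns (total issue cycles, stall cycles).
    """
    ready_at = {}   # register -> first cycle at which it may be read
    cycle = 0
    stalls = 0
    for op, dest, sources in program:
        # Scoreboard check: wait until every source register is ready.
        earliest = max([ready_at.get(r, 0) for r in sources], default=0)
        if earliest > cycle:
            stalls += earliest - cycle
            cycle = earliest
        ready_at[dest] = cycle + latency[op]
        cycle += 1  # move on to the next issue slot
    return cycle, stalls


# Hypothetical schedule tuned for a 2-cycle load: one independent
# instruction fills the slot between the load and its consumer.
program = [
    ("load", "r1", []),
    ("add",  "r9", ["r8"]),
    ("add",  "r2", ["r1"]),
]
mark1 = {"load": 2, "add": 1}   # assumed latencies the code was compiled for
mark2 = {"load": 4, "add": 1}   # assumed latencies of a different model

print(run_with_scoreboard(program, mark1))
print(run_with_scoreboard(program, mark2))
```

Under the latencies it was scheduled for, the program issues with no
stalls; under the longer load latency the same program still runs, but
the scoreboard inserts stall cycles.  The cost being argued about is
that the readiness check has to cover every operand in every
instruction issued, which on a wide instruction word means a great
many bits per cycle.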

>Tradeoffs are shifting in that direction, but I still see a lot of
>value in binary compatibility at least for a few models of a design.
>
>Have you guys thought about keeping an intermediate language copy of
>each executable IN the executable with the 'cached' binary? Have the
>loader check the binary to see if it has the right tag for the current
>machine, if it does, run the code, if it doesn't then  regenerate code
>from the intermediate language spec and then run. You would want to
>provide a utility for executable conversion as this is not a real
>performance answer. If the intermediate language was sophisticated
>enough you might even be able to do the code generation reasonably
>quickly...
>
>You would pay a disk space penalty now, but avoid being bitten
>by the inevitable backwards compatibility problems that seem so
>irrelevant when you build your first machine...

Dunno about expanding the sizes of our executables; they're already
something like 3X what a VAX would use.  We considered the utility
idea, but felt that the manpower required to implement it could be
better spent making the new machine run as fast as possible (which
would convince the customer that recompiling was the right move).
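
For concreteness, the loader scheme proposed in the quoted text could
be sketched roughly as follows.  This is a minimal Python model under
stated assumptions, not something that was built: the `Executable`
fields, `generate_code`, `load`, and the machine tags are all
hypothetical names, and strings stand in for real machine code and
for the intermediate language.

```python
from dataclasses import dataclass


@dataclass
class Executable:
    machine_tag: str    # machine model the cached binary was generated for
    binary: str         # stand-in for the cached machine code
    intermediate: str   # stand-in for the intermediate-language copy


def generate_code(intermediate, machine_tag):
    """Hypothetical code generator: IL in, machine-specific binary out."""
    return f"{machine_tag}:{intermediate}"


def load(exe, current_machine):
    """Return (binary to run, whether it had to be regenerated)."""
    regenerated = False
    if exe.machine_tag != current_machine:
        # Tag mismatch: rebuild the binary from the intermediate copy
        # and cache it so the next load takes the fast path.
        exe.binary = generate_code(exe.intermediate, current_machine)
        exe.machine_tag = current_machine
        regenerated = True
    return exe.binary, regenerated


exe = Executable("mark1", generate_code("prog", "mark1"), "prog")
print(load(exe, "mark1"))   # tag matches, cached binary is used
print(load(exe, "mark2"))   # tag differs, code is regenerated
print(load(exe, "mark2"))   # regenerated binary is now cached
```

The disk space penalty shows up here as the `intermediate` field
carried alongside `binary` in every executable.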

Personally, I think it is one of the hallmarks of RISC design in
general that one trades whatever one can for performance, and as long
as it is performance that customers buy, our design choices are quite
constrained.  It seems to me that big companies (the 'D' word and
the 'I' word) require object compatibility, because that's a primary
reason for their sales.  Companies who hope to compete with them
will not, however, get their sales for that same reason.  We have to
provide enough performance to get attention, and hope that the set of
tradeoffs represented thereby still has critical mass.  (I know
that's a horrible circumlocution, but I want to keep my job here!)


Bob Colwell            mfci!colwell@uunet.uucp
Multiflow Computer
175 N. Main St.
Branford, CT 06405     203-488-6090