Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!bloom-beacon!husc6!yale!mfci!colwell
From: colwell@mfci.UUCP (Bob Colwell)
Newsgroups: comp.arch
Subject: Re: Memory latency / cacheing / scientific programs
Keywords: cache latency bus memory
Message-ID: <452@m3.mfci.UUCP>
Date: 28 Jun 88 02:12:49 GMT
References: <243@granite.dec.com> <443@m3.mfci.UUCP> <448@m3.mfci.UUCP> <244@granite.dec.com>
Sender: root@mfci.UUCP
Reply-To: colwell@mfci.UUCP (Bob Colwell)
Organization: Multiflow Computer Inc., Branford Ct. 06405
Lines: 73

In article <244@granite.dec.com> jmd@granite.UUCP (John Danskin) writes:
>
>Somewhere up in my original posting I said 'scoreboarding'. With
>scoreboarding, the code still runs. It may not run fast anymore, but
>that is another problem.
>
>You tell the compiler what the latencies are so that it can produce
>good code. You implement scoreboards so that binaries port from your
>mark 1 machine to your mark 2 machine (they don't run as fast as they
>should, but they probably run a little faster).
>
>The real problem seems to be that scoreboards are expensive. You guys
>may be on the right track (blow off compatibility, make the machine
>soooo fast and relatively cheap that people will use it despite the
>headache).

I sure hope so. We considered a lot of alternatives, including
putting in a hardware mode to make newer hardware correctly mimic the
pipelines of older, but we all suffered allergic rashes for a week.
Visions of something analogous to dragging the 8086 register set
around for the rest of our lives danced through our heads. It's just
really yucky (scientifically speaking) to make your current hardware
slower so that old executables run unmodified.

Your point about the reason for scoreboarding was on target.
What I was trying to say was that however expensive you think they
are for normal machines, I believe they'd turn out to be lots more so
on a VLIW like ours, because of the sheer number of instruction
stream bits you'd have to simultaneously monitor (1024 all told,
across 8 different boards). I'd rather put all that extra hardware
to use as more functional unit horsepower, or more I/O, caches,
whatever. Something that would contribute to my usable compute
horsepower.

>Tradeoffs are shifting in that direction, but I still see a lot of
>value in binary compatibility at least for a few models of a design.
>
>Have you guys thought about keeping an intermediate language copy of
>each executable IN the executable with the 'cached' binary? Have the
>loader check the binary to see if it has the right tag for the current
>machine, if it does, run the code, if it doesn't then regenerate code
>from the intermediate language spec and then run. You would want to
>provide a utility for executable conversion as this is not a real
>performance answer. If the intermediate language was sophisticated
>enough you might even be able to do the code generation reasonably
>quickly...
>
>You would pay a disk space penalty now, but avoid being bitten
>by the inevitable backwards compatibility problems that seem so
>irrelevant when you build your first machine...

Dunno about expanding the sizes of our executables; they're already
something like 3X what a VAX would use. We considered the utility
idea, but felt that the manpower required to implement it could be
better spent making the new machine run as fast as possible (which
would convince the customer that recompiling was the right move).

Personally, I think it is one of the hallmarks of RISC design in
general that one trades whatever one can for performance, and as long
as it is performance that customers buy our design choices are quite
constrained.
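
For concreteness, the loader scheme in the quoted suggestion above
(check the cached binary's tag against the current machine, run it if
it matches, otherwise regenerate from the intermediate-language copy)
might be sketched roughly as below. This is only an illustration, not
anything that was actually built: the Executable record, the load
function, the tag strings "mark1" and "mark2", and the toy code
generator are all hypothetical names, and the "machine code" is just
a string.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Executable:
    """Hypothetical on-disk executable: an IL copy plus a cached binary."""
    il: str           # intermediate-language copy kept in the file
    cached_tag: str   # machine model the cached binary was generated for
    cached_code: str  # stand-in for the generated machine code


def load(exe: Executable, machine_tag: str,
         codegen: Callable[[str, str], str]) -> str:
    """Return code for machine_tag, regenerating it if the cache is stale."""
    if exe.cached_tag != machine_tag:
        # Wrong tag for this machine: rebuild from the IL and re-cache.
        exe.cached_code = codegen(exe.il, machine_tag)
        exe.cached_tag = machine_tag
    return exe.cached_code


def toy_codegen(il: str, machine_tag: str) -> str:
    """Assumed placeholder for a real code generator."""
    return f"[{machine_tag}] {il}"


if __name__ == "__main__":
    exe = Executable(il="add r1, r2", cached_tag="mark1",
                     cached_code="[mark1] add r1, r2")
    print(load(exe, "mark1", toy_codegen))  # tag matches, cached code is used
    print(load(exe, "mark2", toy_codegen))  # tag differs, code is regenerated
```

Note that the file carries both the IL and the cached binary, which is
where the disk space penalty mentioned in the quoted text comes from.
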
It seems to me that big companies (the 'D' word and the 'I' word)
require object compatibility, because that's a primary reason for
their sales. Companies who hope to compete with them will not,
however, get their sales for that same reason. We have to provide
enough performance to get attention, and hope that the set of
tradeoffs represented thereby still have critical mass. (I know
that's a horrible circumlocution, but I want to keep my job here!)

Bob Colwell            mfci!colwell@uunet.uucp
Multiflow Computer
175 N. Main St.
Branford, CT 06405     203-488-6090