Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!pyramid!voder!apple!bcase
From: bcase@apple.UUCP (Brian Case)
Newsgroups: comp.arch
Subject: Re: Horizontal pipelining
Message-ID: <6832@apple.UUCP>
Date: Wed, 25-Nov-87 12:18:23 EST
Article-I.D.: apple.6832
Posted: Wed Nov 25 12:18:23 1987
Date-Received: Sun, 29-Nov-87 02:10:42 EST
References: <201@PT.CS.CMU.EDU> <388@sdcjove.CAM.UNISYS.COM> <988@edge.UUCP> <958@winchester.UUCP> <11444@sci.UUCP>
Reply-To: bcase@apple.UUCP (Brian Case)
Organization: Apple Computer Inc., Cupertino, USA
Lines: 25

In article <11444@sci.UUCP> kenm@sci.UUCP (Ken McElvain) writes:

[Seems to be talking about something like the PPUs of the old Cybers.]

>I agree that cache [or TLB] hit rates will almost certainly go down.
>However, miss penalties will also drop. It is quite possible that
>a cache fill could happen in the time it takes for the barrel
>to turn around.
>A ten stage barrel processor running at 25Mhz would easily allow
>over 300ns for a cache fill before it cost another instruction slot.
>The performance limit here is likely to be the bandwidth of the
>cache fill mechanism.

Yes, but if a fair fraction of the processors in the barrel are causing
misses (say 3 or so) then your memory system will have to be multiported
(or very fast, in which case why not just one fast processor?). This
doesn't invalidate what you are saying, just an observation.

>Another issue is the instruction set. It's not clear that you want
>a bunch of registers. It may be much better to do more of a memory
>to memory architecture. (I would recommend keeping some base registers).
>A number of other areas also have some surprising tradeoffs.

I fail to see why memory-memory would be better than registers. Can you
give some proof? Also, what other areas have surprising tradeoffs, and
what are they?
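
A rough back-of-the-envelope sketch of the timing argument above, in Python. The 10 stages, 25 MHz clock, and "3 or so" misses come from the post; the function names and the single-ported, one-fill-at-a-time memory model are assumptions made purely for illustration, not anything either poster specified. At 25 MHz a cycle is 40 ns, so a ten-slot barrel takes 400 ns to come back around, which is consistent with the quoted "over 300ns" if a few stages pass before the miss is detected (again an assumption).

```python
def barrel_timing(stages, clock_mhz):
    """Return (cycle time, full barrel rotation time), both in ns."""
    cycle_ns = 1000.0 / clock_mhz      # 25 MHz -> 40 ns per cycle
    rotation_ns = stages * cycle_ns    # time before the same slot issues again
    return cycle_ns, rotation_ns


def fill_budget_single_port_ns(rotation_ns, misses_per_rotation):
    """Average time each fill may take if a single-ported memory
    (hypothetical model) must service every miss serially within
    one barrel rotation."""
    return rotation_ns / misses_per_rotation


cycle_ns, rotation_ns = barrel_timing(stages=10, clock_mhz=25)
budget_ns = fill_budget_single_port_ns(rotation_ns, misses_per_rotation=3)
print(cycle_ns, rotation_ns, round(budget_ns, 1))
```

Under these assumptions, three misses per rotation leave roughly 133 ns per fill on a single port rather than the full rotation, which is the multiported-or-very-fast-memory concern raised in the reply.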