Path: utzoo!attcan!uunet!cs.utexas.edu!usc!apple!oliveb!amiga!cbmvax!daveh
From: daveh@cbmvax.UUCP (Dave Haynie)
Newsgroups: comp.sys.amiga.tech
Subject: Re: DMA or polling (was Re: GVP controller)
Message-ID: <7616@cbmvax.UUCP>
Date: 10 Aug 89 18:10:43 GMT
References: <8908092130.AA23369@jade.berkeley.edu>
Organization: Commodore Technology, West Chester, PA
Lines: 88

in article <8908092130.AA23369@jade.berkeley.edu>, 451061@UOTTAWA.BITNET (Valentin Pepelea) says:
> Steve -Raz- Berry writes in <120232@sun.Eng.Sun.COM>
>> In article <8908072207.AA14796@jade.berkeley.edu> 451061@UOTTAWA.BITNET
>> (Valentin Pepelea) writes:
>> >The net result is that the processor therefore spends less time on the data
>> >transfer and is available more often for other concurrent tasks.

>> Yikes! I'm sorry, but I TOTALLY disagree with you on this one.
>> Logicly, if you look at the time to complete a given task, based only
>> on the number of bus cycles it takes to transfer a given block of data,
>> DMA will always win. Period.

> Clearly you don't understand, or perhaps I did not explain well. The
> bottleneck here is the speed at which the hard disk turns, and therefore
> the rate at which data is available to the DMA channel.

>> Sorry, this is one EE type that just won't believe it.

> Obviously some EE types are better than others.

Well, you all know me as an EE type. I think there's confusion here
because the problem hasn't been properly decomposed.

There are two transfers going on in most hard drive systems -- from the
drive to the controller, and from the controller to system memory. It's
always a losing proposition to transfer data directly, as it's read from
the drive, to system memory, regardless of whether you go via a CPU read
method or a DMA method. Fortunately, it's almost impossible as well,
unless you're dealing with direct manipulation of an ST-506 interface.
Assuming a SCSI device, you really don't have any idea how the data is
handled between the physical hard drive and the SCSI channel. Still, the
best a direct asynchronous SCSI read or DMA can do is significantly less
than any buffering scheme you might come up with. The Apple Macintosh is
a good example of what happens when you don't buffer up your SCSI, if for
no other reason than to convert the SCSI byte stream to a word stream
before travelling between the controller and the system memory.

So let's agree not to take any simple, stupid approaches -- all the
mentioned controllers, GVP, Commodore, and Microbotics, take a much more
intelligent approach.

GVP is the simplest in concept. It sucks up a whole block into local RAM,
then transfers this at memory-to-memory speeds across the bus, from its
local RAM to its final destination. On a 68000, even with some cleverly
designed copy loops like CopyMemQuick() or similar, you'll still have over
two bus crossings per word transferred -- one from the local RAM to the
68000, one from the 68000 to the system RAM, and occasional stops to fetch
opcodes. With a 68010 or better, you can basically ignore the opcode fetch
time, but you still have the two complete bus crossings per word. With a
68020 or 68030 and some 32 bit memory, you can reduce this to two slow and
one fast bus crossings per longword, which comes pretty close to one bus
crossing per word, but not quite.

The Commodore controllers are all DMA driven and backed by a FIFO. The
2090 will read from the SCSI controller into its FIFO, and when the FIFO
starts to fill, it'll take the bus, dump 32 words across at full speed,
and then give back the bus. This results in one bus crossing per word,
plus a small bus arbitration time. Most other DMA driven controllers work
very similarly.

The main idea here is that the fastest a non-DMA controller will ever run
is approximately the same as the normal speed of a DMA controller.
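
The per-word bus-crossing counts above can be tallied in a few lines. This
is only a rough sketch of the arithmetic, not a timing model of any real
board: the function names are made up for illustration, and the default
values for opcode fetch overhead, the relative cost of a "fast" 32 bit
crossing, and the bus arbitration cost are assumptions, not measured
figures. Only the 2 crossings per word, the "two slow and one fast per
longword", and the 32-word burst come from the discussion above.

```python
# Rough tally of bus crossings per 16-bit word for each transfer scheme.
# One "slow" crossing is the unit of cost. All default parameter values
# are illustrative assumptions, not measurements.

def cpu_copy_68000(opcode_overhead=0.25):
    """68000 copy loop: one read and one write per word, plus opcode fetches.

    opcode_overhead is an assumed extra cost per word for instruction fetch.
    """
    return 2.0 + opcode_overhead


def cpu_copy_68010():
    """68010 or better: opcode fetch is ignored, two crossings per word remain."""
    return 2.0


def cpu_copy_32bit(fast_cost=0.5):
    """68020/68030 with 32 bit memory: two slow crossings plus one fast
    crossing per longword, and a longword is two words.

    fast_cost is the assumed cost of the fast crossing relative to a slow one.
    """
    return (2.0 + fast_cost) / 2.0


def dma_fifo(arbitration=2.0, burst_words=32):
    """DMA from a FIFO: one crossing per word, plus arbitration per burst.

    arbitration is an assumed cost, in crossings, paid once per burst.
    """
    return 1.0 + arbitration / burst_words


if __name__ == "__main__":
    schemes = [
        ("68000 programmed copy", cpu_copy_68000()),
        ("68010+ programmed copy", cpu_copy_68010()),
        ("68020/030 + 32 bit RAM copy", cpu_copy_32bit()),
        ("DMA with 32-word FIFO burst", dma_fifo()),
    ]
    for name, cost in schemes:
        print(f"{name:30s} {cost:.4f} crossings per word")
```

With these assumed parameters the 32 bit programmed copy approaches the
DMA figure without reaching it, which is the point of the comparison; a
different choice of fast_cost or arbitration moves the numbers but not
the ordering, as long as fast_cost stays above twice the per-word
arbitration cost.
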
Without a 68020 or 68030 and some 32 bit RAM, the DMA controller is always
a win. You can, of course, pick a bad DMA controller and compare it to a
good programmed controller, or vice versa, to accentuate the point of YOUR
particular religious views, but I'm dealing in science here.

There is one situation where a non-DMA device will run faster than a DMA
device in Amiga systems. If you have a 68020 or 68030 system with 32 bit
memory above the 24 bit address space of the 68000, a good non-DMA device
like GVP's will go faster under FFS. The deal here is that the programmed
transfer doesn't have any 24 bit limits, while the DMA transfer does.
Plus, with a 32 bit card, the non-DMA transfer is already approaching the
speed of the DMA transfer (the difference with a fast '030 card may be as
much software overhead as hardware differences). So while the non-DMA
transfer works normally, the DMA device must dump its data to a temporary
RAM buffer, and then run a CPU driven copy to the final destination. That
copy is likely about as fast as the non-DMA transfer, so in this
situation, the non-DMA device may be around twice as fast as the DMA
transfer. This situation will disappear with full 32 bit DMA devices, but
you won't be having them on the A2000 bus.

> Valentin
-- 
Dave Haynie Commodore-Amiga (Systems Engineering) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: D-DAVE H     BIX: hazy
           Be careful what you wish for -- you just might get it