Path: utzoo!utgpu!attcan!uunet!cbmvax!daveh
From: daveh@cbmvax.UUCP (Dave Haynie)
Newsgroups: comp.arch
Subject: Re: Sw vs. Hw BitBlit.
Message-ID: <4461@cbmvax.UUCP>
Date: 10 Aug 88 15:47:53 GMT
References: <61783@sun.uucp>
Organization: Commodore Technology, West Chester, PA
Lines: 62

in article <61783@sun.uucp>, guy@gorodish.Sun.COM (Guy Harris) says:
> Keywords: BitBlit.

> The second common case of "bitblt" is scrolling a rectangular region
> of a bitmap, usually the display.  Since the word boundaries in the
> scan lines of a bitmap are at the same place in each line, the speed of
> scrolling depends primarily on the speed of the MC68000 instruction
>     mov.l %a0@+, %a1@+
> or, in C,
>     register long *p, *q;
>     *p++ = *q++;
> For typical rectangles, the edges, which must be handled with more
> complicated code, do not dominate the performance.  There is nothing
> hardware can do to accelerate this loop except provide faster memory
> access.  If the display were accessed through a narrower or clumsier
> interface, it would take longer to move the data.

With a MC68000, not so.  Given an equal memory access speed, something like
a DMA controller can be several times faster than the 68000.  All it needs
do is fetch data from location A, dump it to location B, and increment some
internal counters.  While it looks like that's what the 68000 is doing, it's
really also fetching the move instruction and a branch instruction of some
kind.  So for every word moved, you're probably fetching as many instruction
words as overhead.  Certainly the 68010 in some cases and the 68020 in most
cases solve this problem via caching, but I can't yet buy either of these
parts for the $2.50 or so I pay for a 68000.

> If a BitBlt chip is reasonably cheap, and can do the whole job, it may be worth
> it.  Note that in the cases shown, you got at most a 3.5x speedup (scroll
> screen horizontally).
> For vertical scrolling, you got only 1.18x; for randomly
> drawing the letter 'a', you got only 1.23x; and for texturing a random 40x40
> square, you got 1.95x.  How cheap does it have to be for that to be worth it?
> (The "do the whole job" comes from comments made in the paper that a
> half-hearted hardware assist can get in the way, rather than help.)

You also have to consider a few more things.  For instance, if you have a
blitter that operates on video memory and lets the CPU do things with non
video memory in parallel (like on the Amiga, and apparently on the Sun
mentioned), then you have a big advantage, in that any blit may end up
costing nothing but the setup time in terms of real CPU usage.  Still no
good reason to use the blitter for small, single character blits, but it
can really be a justification for larger things.  And given that a blit chip
can often be a much simpler design than the host CPU, there's a real good
chance it WILL be able to have a faster path to memory.

That depends of course on the chip and the base CPU in your system.  If the
combination of a blitter chip and 68000 ran me more than a 68020, that had
better be one heck of a blitter, or I'm wasting my $$$ -- the 68020 being
more general purpose than a blitter can give you a better overall system
performance.  But if I can get my blitter and 68000 CPU and maybe a bunch of
other functions for less than the cost of a 68010, I'm probably winning (if
I'm not concerned about the 68010's virtual memory facilities, which a Sun
of course obviously is).
-- 
Dave Haynie  "The 32 Bit Guy"     Commodore-Amiga  "The Crew That Never Rests"
   {ihnp4|uunet|rutgers}!cbmvax!daveh      PLINK: D-DAVE H     BIX: hazy
        "I can't relax, 'cause I'm a Boinger!"
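
For anyone who wants rough numbers on the instruction-fetch overhead argument made earlier in this post, here is a minimal sketch in C. It is a back-of-envelope model, not a cycle-accurate one. It assumes every bus access costs the same whether it fetches an instruction word or moves data, and it ignores setup time, refresh and bus contention. The struct and function names are made up for illustration, and the per-iteration counts you feed it (for example one instruction word for the move plus two for a loop branch) are assumptions about a tight copy loop on a 16-bit bus, not measurements.

```c
#include <assert.h>

/* Hypothetical model of one iteration of a copy loop. */
struct copy_loop {
    unsigned insn_words;   /* instruction words fetched per iteration (assumed) */
    unsigned data_reads;   /* data bus reads per iteration */
    unsigned data_writes;  /* data bus writes per iteration */
};

/* Bus accesses a CPU needs for n iterations: instructions plus data. */
unsigned long cpu_bus_accesses(struct copy_loop l, unsigned long n)
{
    return n * (l.insn_words + l.data_reads + l.data_writes);
}

/* Bus accesses a DMA-style engine needs: data only, no instruction fetch. */
unsigned long dma_bus_accesses(struct copy_loop l, unsigned long n)
{
    return n * (l.data_reads + l.data_writes);
}

/* CPU bus traffic as a percentage of the DMA engine's traffic. */
unsigned long cpu_to_dma_percent(struct copy_loop l)
{
    /* A loop that moves no data has no meaningful ratio. */
    assert(l.data_reads + l.data_writes > 0);
    return 100UL * cpu_bus_accesses(l, 1) / dma_bus_accesses(l, 1);
}
```

Under those assumptions, a long-word move on a 16-bit bus described as {3, 2, 2} costs 7 accesses per iteration against 4 for the data-only engine, so cpu_to_dma_percent returns 175. A word move described as {3, 1, 1} costs 5 against 2, so it returns 250. A cache that keeps the loop on chip corresponds to setting insn_words to 0, at which point the two counts are equal.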