Path: utzoo!utgpu!water!watmath!clyde!bellcore!rutgers!cbmvax!uunet!mcvax!philmds!leo
From: leo@philmds.UUCP (Leo de Wit)
Newsgroups: comp.sys.atari.st
Subject: Re: null fill eliminated
Keywords: addition
Message-ID: <491@philmds.UUCP>
Date: 3 Jun 88 09:54:28 GMT
References: <490@philmds.UUCP>
Reply-To: leo@philmds.UUCP (L.J.M. de Wit)
Organization: Philips I&E DTS Eindhoven
Lines: 76

Here are some small corrections for the fast loader I put on the net this week.

1) There is a header for the module now. It says:

* Even when loading from ramdisk or harddisk the ROM program null fills all
* uninitialized data, heap, stack (often the major part of your RAM).
* This null filler makes loading programs faster. Its null filling is 7 times
* as fast as the ROM's, using the quick movem.l instruction. Besides it only
* clears the BSS space.
* At least the fillhigh and filllow addresses have to be adapted to suit your
* ROM version.

2) The bigint definition should read:

bigint    equ     $7ffffff0

3) I abandoned the idea of no null filling at all. Some programs generated
bus errors when started with this VBL routine active, so I've looked things
up in K & R. In paragraph 4.9 (Initialization):

   ...In the absence of explicit initialization, external and static
   variables are guaranteed to be initialized to zero; ...

So the routine now clears the BSS space; the programs that generated errors
now work OK. The null filling is performed by null filling chunks of 128
bytes using movem.l instructions; that seems to be the fastest way,
especially if you move many registers at a time. The 'modulo 128' part is
cleared first, at the top of the BSS.
Here it is (I have left the initialization routine out):

fastload  movea.l 74(sp),a0           * PC
          cmpa.l  #fillhigh,a0
          bhi.s   fastdone
          cmpa.l  #filllow,a0
          blt.s   fastdone
          lea.l   32(sp),a0           * Address D5 on stack
          cmp.l   #bigint,(a0)
          bge.s   fastdone            * Already filled
          move.l  #bigint,(a0)        * Maximize D5 on stack
          move.l  68(sp),a6           * Value of A6 on stack to A6
          move.l  -4(a6),a4           * Start of block to fill
          move.l  -58(a6),d0          * # bytes to fill: BSS size
          move.l  d0,d1
          and.w   #$7f,d1             * d1 = d0 & 0x7f
          moveq.l #0,d2
          lea.l   (a4,d0.l),a5        * End (one past)
          bra.s   fastl1
fastl0    move.b  d2,-(a5)            * Clear top d1 bytes
fastl1    dbra    d1,fastl0
          moveq.l #0,d0               * Nullify d0-d7/a0-a3
          move.l  d0,d1
          move.l  d0,d2
          move.l  d0,d3
          move.l  d0,d4
          move.l  d0,d5
          move.l  d0,d6
          move.l  d0,d7
          move.l  d0,a0
          move.l  d0,a1
          move.l  d0,a2
          move.l  d0,a3
          bra.s   fastl3              * a5 - a4 is now a multiple of 128
fastl2    movem.l d0-d7/a0-a3,-(a5)   * Clear 4 * (12 + 12 + 8) = 128 bytes / turn
          movem.l d0-d7/a0-a3,-(a5)
          movem.l d0-d7,-(a5)
fastl3    cmpa.l  a4,a5
          bgt.s   fastl2              * Until start address A4 reached
fastdone  rts

          section s.data

noque     dc.b    'No vbl entry available!',13,10,0

          end
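
For readers who don't read 68000 assembler, the order in which the routine
clears the BSS can be sketched in C. This is only an illustration of the
two-phase fill (the odd 'modulo 128' bytes at the top first, then whole
128-byte chunks downwards), not the loader itself. The name clear_bss and
its arguments are made up for the example, and memset() merely stands in
for the three movem.l instructions; it says nothing about their speed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 128   /* bytes cleared per loop turn, as in the movem.l loop */

/* Hypothetical helper: clear `size` bytes starting at `start`, working
 * downwards from the end, in the same order as the assembly routine. */
static void clear_bss(unsigned char *start, size_t size)
{
    unsigned char *p = start + size;       /* one past the end (role of a5) */
    size_t rest = size & (CHUNK - 1);      /* size modulo 128 (role of d1)  */

    while (rest-- > 0)                     /* clear the top `rest` bytes    */
        *--p = 0;

    /* What remains between start and p is now a multiple of 128 bytes. */
    assert((size_t)(p - start) % CHUNK == 0);

    while (p > start) {                    /* until the start is reached    */
        p -= CHUNK;
        memset(p, 0, CHUNK);               /* stand-in for the movem.l's    */
    }
}
```

Calling clear_bss(buf, 300), for instance, would first clear the top 44
bytes one at a time and then two chunks of 128 bytes.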