Path: utzoo!attcan!utgpu!watmath!att!dptg!rutgers!usc!ucsd!ames!amdahl!pyramid!ncc!alberta!idacom!andrew
From: andrew@idacom.UUCP (Andrew Scott)
Newsgroups: comp.lang.forth
Subject: Re: Forth Compilation (again) ?
Message-ID: <712@idacom.UUCP>
Date: 8 Aug 89 19:53:46 GMT
References: <114600003@uxa.cso.uiuc.edu>
Organization: IDACOM Electronics Ltd., Edmonton, Alta.
Lines: 61

In article <114600003@uxa.cso.uiuc.edu>, ews00461@uxa.cso.uiuc.edu writes:
>
> I've seen Forth systems that generate object code.  How is usually done ?
> Inline code ?  Lots of jsr (subroutine calls) ?  Are they simply using
> the dictionary as a "symbol table" ?

Yes, yes, and sometimes.  Or, to be a little less glib, there are many ways
of implementing subroutine threaded systems, and you've touched on some of
the techniques used.

First of all, a subroutine threaded Forth uses subroutine calls as the
threading mechanism.  Depending on the underlying architecture, this can be
faster than any other kind of threading because the inner interpreter has
been eliminated.  Literals are handled by inline code that pushes the value
to the stack.  Conditional code uses the processor's native branch
instructions instead of manipulating the inner interpreter's instruction
pointer.

The subroutine calls need not take up more space, either.  For a 68000
system, you can use the two-byte or four-byte forms of BSR when compiling
calls to words that are within 128 bytes or 32K bytes of the current
location.  You only need a JSR instruction when you call a word more than
32K away; if the code is sufficiently modular, that shouldn't happen often.

Inline code is usually used as an optimization in these systems.  Words like
DUP, DROP, +, and EXIT are usually very short and do not increase the size
of the code when expanded inline.

I recently wrote a subroutine-threaded Forth that added a new class of
optimizations to those above.  I made the compiler do more than " CFA , "
when compiling a word.  Instead, it finds sequences of words that can be
collapsed into short sequences of inline code.  For example:

    Forth           68000 code
    -----           ----------
    DUP >R          MOVE.L (SP), -(RP)

    LIT +           ADDI.L #N, (SP)      (ADDQ used when possible)

    = 0BRANCH       CMPM.L (SP)+, (SP)+
                    BNE.S  ??

The last example illustrates how conditionals can be made even faster.  IF,
UNTIL, WHILE etc. all compile the primitive 0BRANCH (or ?BRANCH in other
Forths).  A relational operator such as = can be "folded into" the branch,
eliminating the need to push a true/false value to the stack only to have it
popped off and tested by 0BRANCH.

Using a list of about 75 sequence "rules", the compiler produced code for a
particular application that ran about three times as fast as indirect
threaded Forth and was 12% smaller.  The optimizer was not that large,
either: it was written in about 250 lines of assembler (for compilation
speed), and the rules file was about 200 lines of Forth.  No "inline" flags
were required either.  The only hook was to replace the " CFA , " in the
guts of INTERPRET with a new word, which I called (COMPILE).

(BTW: I'm glad to see that activity on comp.lang.forth has picked up
recently!)
--
Andrew Scott
andrew@idacom  - or -  {att, watmath, ubc-cs}!alberta!idacom!andrew
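
Below is a minimal sketch, in Forth, of the two compile-time decisions
described above: picking BSR.S, BSR.W, or JSR by the distance to the word
being called, and the (COMPILE) hook that tries the sequence rules before
falling back to an ordinary call.  It assumes a 32-bit-cell Forth hosted on
the 68000 itself; the names W, DISP, SHORT?, COMPILE-CALL and RULE-MATCH
(here only a stub) are invented for the sketch and are not the code from
the system described in the post.

    HEX

    : W,   ( x -- )              \ compile a 16-bit value, high byte first
       100 /MOD C, C, ;          \ (the 68000 is big-endian)

    : DISP   ( cfa -- n )        \ branch displacement, relative to the byte
       HERE 2 + - ;              \ just past the BSR opcode word

    : SHORT?   ( n -- flag )            \ fits the 8-bit form?  A displacement
       DUP 0<>  SWAP ABS 80 <  AND ;    \ of 0 means "16-bit word follows"

    : COMPILE-CALL   ( cfa -- )
       DUP DISP                         \ -- cfa disp
       DUP SHORT? IF
          FF AND 6100 OR W,  DROP       \ BSR.S: opcode 61xx, xx = displacement
       ELSE DUP ABS 7FFE < IF
          6100 W,  FFFF AND W,  DROP    \ BSR.W: opcode 6100, 16-bit displacement
       ELSE
          DROP  4EB9 W,  ,              \ JSR abs.l: opcode 4EB9, then the 32-bit
       THEN THEN ;                      \ address (, stores a cell, big-endian here)

    : RULE-MATCH   ( cfa -- flag )      \ stub: a real one consults the rules
       DROP 0 ;                         \ table and compiles inline code

    : (COMPILE)   ( cfa -- )            \ the hook that replaces " CFA , "
       DUP RULE-MATCH IF
          DROP                          \ a rule already compiled it inline
       ELSE
          COMPILE-CALL                  \ otherwise compile an ordinary call
       THEN ;

    DECIMAL

One plausible way to flesh out RULE-MATCH is to remember the last word or
two passed to (COMPILE) and, when a rule such as = 0BRANCH matches, back
HERE up over the call just compiled and emit the inline 68000 code from the
rules table instead.  The post does not spell out how its 250-line optimizer
does this, so the back-up-HERE approach is only one possibility.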