Path: utzoo!utgpu!attcan!uunet!husc6!cmcl2!nrl-cmf!ames!pasteur!ucbvax!BLIULG11.BITNET!A-PIRARD
From: A-PIRARD@BLIULG11.BITNET (Andre PIRARD)
Newsgroups: comp.lang.forth
Subject: Re: I need a word!
Message-ID: <8808101430.AA28524@jade.berkeley.edu>
Date: 10 Aug 88 08:01:02 GMT
References:
Sender: daemon@ucbvax.BERKELEY.EDU
Reply-To: Forth Interest Group International List
Organization: The Internet
Lines: 70

>and I need a word that can take a buffer address and a string address,
>I guess, and return the address of the letter following first occurrance of
>the string in the buffer!

Well, your problem is one of finding the occurrence of a string in
another one. At the Southern Belgium Fig Chapter, we have been debating
for a long time how Comforth should perform this function in terms of
stack behaviour. We finally settled on the definition and code below.
You will easily be able to adapt it to anything you wish.

We found that the definition of -MATCH suggested by the standard often
involves intricate computation on the parameters it returns. We like
thinking of strings in terms of address and count. MATCH returns what we
think is most useful for immediate use: the substring preceding the
match and the one following it.

The following example shows how easily it is used to type on different
lines the parts of a character string separated by commas:

    : SPLIT  \ scon --
       BEGIN DUP WHILE
          " ," MATCH TYPE CR 1 /STRING
       REPEAT 2DROP ;

MATCH    scon1 scon2 -- scon3 scon1'

Search for the first occurrence in scon1 of the characters of scon2.
Split the string at the match point, giving scon1' and scon3, such that
scon1=scon1'+scon3 and either, when a match exists, scon3=scon2+rest or,
when there is no match or scon2 is the null string, scon3=null. When
scon2 is not null, the size of scon3 may be used as a flag.

MATCH is most useful for removing separators from a string: handle the
first part, then continue with the rest, repeating until a null string
is encountered.
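For readers who want to try these stack semantics outside Forth, here is a minimal Python sketch of the behaviour described above. It is a model only, not Comforth code: the names `match`, `slash_string` and `split` are hypothetical, strings stand in for address/count pairs, and each returned tuple lists the results in stack order (scon3 first, scon1' last, i.e. on top). It also assumes that /STRING applied to a null string leaves it null, which Python slicing does but which the post does not state.

```python
def match(s1, s2):
    """Model of MATCH: return (scon3, scon1') for text s1 and pattern s2."""
    # A null pattern is treated like "no match", as the description specifies.
    i = s1.find(s2) if s2 else -1
    if i < 0:
        return "", s1            # scon3 is null, scon1' is all of scon1
    return s1[i:], s1[:i]        # scon3 = scon2 + rest, scon1' = part before it


def slash_string(s, n):
    """Model of /STRING: drop the first n characters (clamped by slicing)."""
    return s[n:]


def split(s, sep=","):
    """Model of SPLIT that collects the parts instead of typing them."""
    parts = []
    while s:                                # BEGIN DUP WHILE
        rest, head = match(s, sep)          # " ," MATCH
        parts.append(head)                  # TYPE CR
        s = slash_string(rest, len(sep))    # 1 /STRING
    return parts                            # REPEAT 2DROP
```

With this model, `match("abc,def", ",")` yields the pair `(",def", "abc")`, and concatenating the second element with the first always rebuilds the original string, which is the scon1=scon1'+scon3 property stated in the description.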
/STRING may be used to behead scon3 by the size of scon2, removing the
matching part from that rest. The size of scon2 is limited to 255
characters on some CPUs.

Note: our descriptions use the symbol to refer to a pair of stack
entries containing the address and count of a character string whose
contents are not changed by the definition.

    CODE MATCH  \ a1 l1 a2 l2 -- a1' l1' a1 l3
                \ find scon2 at a1' in scon1
    LABEL NOMATCH
       DI , CX ADD   CX , CX XOR       \ l1'=0 when no match
    LABEL OKMATCH
       DX , DI MOV   SI POP   DI POP   AX POP
       DX PUSH   CX PUSH               \ a1',l1', matching substring
       AX PUSH   DX , AX SUB           \ a1,l3=a1'-a1, skipped one
       NEXT
    ENTRY:
       BX POP   CX POP   AX POP   AX PUSH   DI PUSH   SI PUSH
       DI , AX MOV   AL , 0 [BX] MOV   \ DI=a1 CX=l1 BX=a2 DX=l2 AL=first char of 2
       DX , DX OR   NOMATCH JE         \ null string 2 special case
       DX DEC   BX INC
       BEGIN
          REPNZ SCASB   NOMATCH JNE    \ search next first char equality => a1',l1'
          CX , DX CMP   NOMATCH JB     \ not enough to compare rest, l1'