Path: utzoo!utgpu!water!watmath!watdragon!lion!smking
From: smking@lion.waterloo.edu (Scott M. King)
Newsgroups: comp.editors
Subject: Re: pattern matches
Keywords: regular expression pattern match
Message-ID: <7618@watdragon.waterloo.edu>
Date: 4 Jul 88 03:27:21 GMT
References: <427@grand.UUCP> <37200009@m.cs.uiuc.edu>
Sender: daemon@watdragon.waterloo.edu
Reply-To: smking@lion.waterloo.edu (Scott M. King)
Organization: U. of Waterloo, Ontario
Lines: 34

In article <37200009@m.cs.uiuc.edu> liberte@m.cs.uiuc.edu writes:
>e.g. abc(def)^ would match a string that starts with
>abc but is NOT followed by def.

If I have the pattern "abc(d.*e)^ghi", and the line "abcdxxxghi", then
what gets matched, if anything? The abc part is easy. Then, the pattern
matcher tries to match a d.*e but it doesn't find one. That means the
(d.*e)^ is successful, but since it didn't find anything, it has to
start looking for the ghi at "dxxxghi", and doesn't find it. Thus the
match fails.

So, your ^ operator must act like a *test* for anything but the pattern
rather than a *match* for anything but the pattern. I.e., test for the
pattern, but don't eat up any characters while testing. Note that this
behaviour would mean that with the line "abcecf", the pattern
"ab(cd)^c." would match "abce", not "abcecf".

>There is possible confusion with [^abc] as negation of a char set,
>but such use could be phased out since [abc]^ is equivalent.

But [abc]^ would really just test for anything but an a, b or c. It
wouldn't also match it.

Of course, being able to test for a subpattern without actually matching
it would probably be handy on occasion. However, it would be much better
to use another special character like ! to indicate that the pattern
should only be tested. "(rexp)" would match "rexp", and "(rexp)!" would
test for "rexp". I guess you could still use your ^ operator in
conjunction with this to test for anything but a pattern. However, using
(rexp)^ (without !) would either be undefined, or mean the same thing as
(rexp)!^

--
Scott M. King
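
The test-versus-match distinction above can be tried out with a rough sketch in Python. This assumes that the lookahead assertions in Python's `re` module are a fair stand-in for the operators discussed: `(?!rexp)` for the "test for anything but the pattern" reading of `(rexp)^`, and `(?=rexp)` for the proposed `(rexp)!`. Neither `^` nor `!` as written in the post is real `re` syntax, and the helper `matched` and the sample strings "xd" and "foobar" are made up for illustration.

```python
import re

# Assumption: (?!...) stands in for the proposed "(rexp)^" test and
# (?=...) for the proposed "(rexp)!" test. Neither consumes characters.

def matched(pattern, line):
    """Return the matched text, or None if the pattern does not match."""
    m = re.search(pattern, line)
    return m.group(0) if m else None

# "abc(d.*e)^ghi" on "abcdxxxghi": the test succeeds (no d.*e follows abc),
# but ghi must then start at "dxxxghi", so the overall match fails.
first = matched(r"abc(?!d.*e)ghi", "abcdxxxghi")

# "ab(cd)^c." on "abcecf": the test eats nothing, so "c." matches "ce".
second = matched(r"ab(?!cd)c.", "abcecf")

# "[abc]^" read as a test, versus "[^abc]" which matches a character.
tested = matched(r"x(?![abc])", "xd")
consumed = matched(r"x[^abc]", "xd")

# "(rexp)!": test for rexp without making it part of the match.
positive = matched(r"foo(?=bar)", "foobar")

print(first, second, tested, consumed, positive)
```

Under these assumptions, the first pattern finds nothing and the second yields "abce", in line with the reasoning in the post. The `tested` and `consumed` cases show why `[abc]^` read as a test is not a drop-in replacement for `[^abc]`: the former leaves the following character out of the match, the latter includes it.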