Path: utzoo!utgpu!water!watmath!watdragon!lion!smking
From: smking@lion.waterloo.edu (Scott M. King)
Newsgroups: comp.editors
Subject: Re: pattern matches
Keywords: regular expression pattern match
Message-ID: <7618@watdragon.waterloo.edu>
Date: 4 Jul 88 03:27:21 GMT
References: <427@grand.UUCP> <37200009@m.cs.uiuc.edu>
Sender: daemon@watdragon.waterloo.edu
Reply-To: smking@lion.waterloo.edu (Scott M. King)
Organization: U. of Waterloo, Ontario
Lines: 34

In article <37200009@m.cs.uiuc.edu> liberte@m.cs.uiuc.edu writes:
>e.g. abc(def)^ would match a string that starts with
>abc but is NOT followed by def.

If I have the pattern "abc(d.*e)^ghi" and the line "abcdxxxghi", then
what gets matched, if anything? The abc part is easy. Next, the pattern
matcher tries to match d.*e at "dxxxghi" and fails, since there is no e.
That means the (d.*e)^ is successful, but since it consumed no characters,
the matcher has to start looking for the ghi at "dxxxghi", and doesn't
find it. Thus the match fails.
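
As a rough illustration, the reading above can be mimicked with the
negative lookahead (?!...) of Python's re module, which succeeds without
consuming characters when its subpattern does not match. Using it as a
stand-in for the proposed (d.*e)^ is an assumption of this sketch, not
the notation under discussion.

```python
import re

# Assumption: (?!d.*e) stands in for the proposed (d.*e)^, read as a
# test that consumes no characters.
pattern = re.compile(r"abc(?!d.*e)ghi")

# After abc, the test succeeds (no d.*e follows), but ghi must then be
# found at "dxxxghi", so the overall match fails.
failed = pattern.search("abcdxxxghi")

# With nothing between abc and ghi, the same pattern does match.
matched = pattern.search("abcghi")

print(failed)
print(matched.group())
```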

So, your ^ operator must act like a *test* for anything but the pattern
rather than a *match* for anything but the pattern.
I.e., test for the pattern, but don't eat up any characters while testing.
Note that this behaviour would mean that with the line "abcecf",
the pattern "ab(cd)^c." would match "abce", not "abcecf": the (cd)^
only tests the "ce" after ab, leaving it for c. to match.
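
The difference between the two readings can be sketched in Python's re
syntax. Both translations are assumptions: (?!cd) models (cd)^ as a pure
test, and (?:(?!cd)..) models just one possible consuming reading, in
which (cd)^ eats two characters that are not "cd".

```python
import re

line = "abcecf"

# Assumption: (?!cd) stands in for (cd)^ read as a test only.
as_test = re.search(r"ab(?!cd)c.", line)

# Assumption: one possible consuming reading, where (cd)^ eats two
# characters that are not "cd" before c. is tried.
as_match = re.search(r"ab(?:(?!cd)..)c.", line)

print(as_test.group())
print(as_match.group())
```

Under the test-only reading the c. is left to match "ce"; under the
consuming reading the "ce" is eaten first and c. matches "cf".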

>There is possible confusion with [^abc] as negation of a char set,
>but such use could be phased out since [abc]^ is equivalent.

But, [abc]^ would really just test for anything but an a, b or c.
It wouldn't also match (consume) that character, so it is not
equivalent to [^abc].
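
A small sketch of that distinction, again borrowing Python's re syntax
and assuming (?![abc]) as a stand-in for a test-only [abc]^:

```python
import re

# Assumption: (?![abc]) stands in for [abc]^ read as a test only.
test_only = re.compile(r"(?![abc])")
char_class = re.compile(r"[^abc]")

# On "xyz" the test succeeds but matches the empty string, while
# [^abc] consumes the x.
tested = test_only.match("xyz").group()
consumed = char_class.match("xyz").group()

# At the end of the input the test still succeeds, but [^abc] needs
# a character to consume and so fails.
test_at_end = test_only.match("")
class_at_end = char_class.match("")

print(repr(tested), repr(consumed))
print(test_at_end is not None, class_at_end is None)
```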

Of course, being able to test for a subpattern without actually
matching it would probably be handy on occasion. However,
it would be much better to use another special character
like ! to indicate that the pattern should only be tested.
"(rexp)" would match "rexp", and "(rexp)!" would test for "rexp".
I guess you could still use your ^ operator in conjunction with this
to test for anything but a pattern. However, using (rexp)^ (without !)
would either be undefined, or mean the same thing as (rexp)!^.
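
For comparison, Python's re module happens to offer a similar pair of
operators. The mapping used here is an assumption made for illustration:
(?=rexp) plays the role of (rexp)! and (?!rexp) the role of (rexp)!^.
The strings foo, bar and baz are made-up examples.

```python
import re

# Plain group: "bar" is matched and consumed.
match_it = re.search(r"foo(?:bar)", "foobar").group()

# Assumption: (?=bar) stands in for (bar)!, a test that consumes nothing.
test_it = re.search(r"foo(?=bar)", "foobar").group()

# Assumption: (?!bar) stands in for (bar)!^, a test for anything but bar.
test_not = re.search(r"foo(?!bar)", "foobaz").group()
test_not_fails = re.search(r"foo(?!bar)", "foobar")

print(match_it, test_it, test_not, test_not_fails)
```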
--

Scott M. King