Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!ncar!tank!uxc!uxc.cso.uiuc.edu!a.cs.uiuc.edu!m.cs.uiuc.edu!liberte
From: liberte@m.cs.uiuc.edu
Newsgroups: comp.lang.misc
Subject: Re: Dumb Lexical Analyzers are Smart
Message-ID: <5200027@m.cs.uiuc.edu>
Date: 20 Sep 88 06:11:00 GMT
References: <5200026@m.cs.uiuc.edu>
Lines: 49
Nf-ID: #R:m.cs.uiuc.edu:5200026:m.cs.uiuc.edu:5200027:000:2117
Nf-From: m.cs.uiuc.edu!liberte    Sep 20 01:11:00 1988

My interpretation of Bill Smith's argument is that languages should
distinguish tokens lexically if they are used in different ways
syntactically. As part of that argument, he says:

> C is not such
> a language because of the typedef construct. A typedef changes
> the lexical class of the new type's identifier to avoid
> horrendous ambiguity in the language.

But a typedef would not have to change the lexical class of
identifiers if typedef identifiers were only used in the syntax in
unambiguous ways. However, they are used ambiguously, and the only way
to disambiguate is to use the (semantic) fact that an identifier was
declared as a typedef.

For example, when you see "foo * bar;" in C, you can't tell whether it
is a declaration using the typedef identifier "foo" or a
multiplication expression; not without checking whether the
surrounding lines are declarations and/or whether "foo" is a typedef
id. But a better syntax could make that locally unambiguous.

Pascal also supports user-defined type identifiers, but they may only
be used, like all type identifiers, in unambiguous ways. Actually,
some Pascals support type casting that looks identical to function
calls, so this is ambiguous in some sense, but it doesn't matter at
the syntactic level. (A similar example is the ambiguity between a
parameterless function call and a variable reference.)

So, while you see it as a problem to be solved at the lexical level,
I would prefer to solve it at the syntax level.

> Advantages I can see:
>
> 1. A person familiar with the lexical rules of the language can
> more easily understand a routine without consulting all of the
> declarations involved.

Granted. But if the syntax doesn't permit ambiguous use of tokens,
then a reader of a program should be able to tell pretty quickly what
is meant from the local context.

The other advantages you list relate to splitting up lexical and
syntax processing and not requiring semantic info. But the same
advantages obtain with a syntactic solution.

Dan LaLiberte
uiucdcs!liberte
liberte@cs.uiuc.edu
liberte%a.cs.uiuc.edu@uiucvmd.bitnet