Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!linus!philabs!cmcl2!seismo!harvard!godot!ima!ism780b!jim
From: jim@ism780b.UUCP
Newsgroups: net.lang.c
Subject: Re: Standardization questions (cpp mostl
Message-ID: <51@ism780b.UUCP>
Date: Mon, 8-Oct-84 00:20:01 EDT
Article-I.D.: ism780b.51
Posted: Mon Oct 8 00:20:01 1984
Date-Received: Tue, 9-Oct-84 19:42:26 EDT
Lines: 0
Nf-ID: #R:decvax:-8300:ism780b:25500027:000:6846
Nf-From: ism780b!jim Oct 6 15:37:00 1984

> 1. undefined token has the value zero (in #if's). Cpp should
>    print a warning -- or error if the evaluator is intelligent
>    about statements like:
>       #if defined (foo) && foo == 0
>    My cpp prints a warning, as it erroneously evaluates the
>    entire statement.

I disagree. I've seen plenty of code which does #if foo or #if !foo
and I don't think a "-Dfoo 0" should be required.

> 2. is "invisible" to all processing in
>    the standard. I regard it as "whitespace" outside of
>    strings, and hence a token delimiter. This greatly
>    simplifies accurate error message generation.

If is allowed everywhere, it should behave the same everywhere, so it
can be handled at the lowest possible level. And since it must be
ignored in strings, it should be ignored everywhere.

> 3. The standard isn't clear about and --
>    are they everywhere identical to ? I.e. may they appear
>    between the start of a line and the # that introduces a control
>    statement? The standard is also unclear about the action to be
>    taken at the end of an include file: is the a token delimiter?
>    Does it terminate a line?

The System V cpp allows FF and VT before the #, but it does not allow
other space there. FF and VT should be allowed at least wherever other
whitespace is allowed. Allowing arbitrary space before the # breaks the
use of cpp as a general preprocessor, but that is probably not a concern
of the committee.

> 4. Just how invisible are comments? For example, are the following
>    correct?
>
>       /* foo */ #ifdef foo
>       # /* foo */ endif

Reiser's cpp is poorly written, and breaks on comments in any surprising
place. However, it is hard to think of a syntax which does not make that
a bug (except for the case where the # must be the first character on
the line).

> 5. cpp should accept "# " as a synonym for "#line "
>    so that it accepts its own output format.

Absolutely. The fact that the committee has not allowed for this reveals
that they have not spent much time looking at existing cpp
implementations or usage.

> 6. Some people write
>
>       #ifdef foobar
>       #endif foobar
>
>    This should be provided for in the syntax -- or explicitly
>    rejected.

I agree; it should be allowed. Also for #else. Arbitrary text should be
allowed to the right of the required tokens.

> 7. I added __DATE__ to the preprocessor predefineds. It's
>    useful for embedding debugging status (but not essential).

The form __FOO__ should be reserved for preprocessor built-ins, with
__FILE__ and __LINE__ required and all others implementation-defined.

> 8. I claim that nested comments /* ... /* ... */ warrant
>    a warning message -- that is a very common source of
>    error in the programs I see (and impossible to detect
>    without a warning message).

I think the standard allows this warning but does not demand it. This is
probably good policy for all warnings, and will encourage implementors
to provide the warnings to be competitive.

>Those are all the problems I have (today). Here are some questions
>about the new concatenation operation:
>
>1. May it appear anywhere, or only on a #define line?

If it appears anywhere, the # introducing a control line becomes
syntactically ambiguous. However, that is probably ok. I can't imagine
a proper implementation that wouldn't have to do more work to disallow
it outside of #defines than to allow it.

>2.
>   What are the semantics of, say,
>
>       #define foo abc # def
>
>    Is it (1) "foo"; (2) read "abc"; (3) read '#' and realize we're
>    expanding a token, so (4) read "def" and glue them together?
>    If so, what happens when "abc" or "def" are macro's:
>
>       #define unique here # __LINE__

Is white space allowed around the #? That makes the syntax a bit messy;
the concatenation operator then becomes "an intermixed sequence of zero
or more whitespace characters and one or more #'s". How much whitespace
do you have to scan in order to find the #? If the # is allowed in
running text, that whitespace would normally be copied, but obviously
not if it is followed by a #. I don't see why macros are a problem.
The # is just like whitespace in that it delimits a token, but it is
not copied to the output.

> 3. May the #define token be concatenated:
>
>       #define unique # counter __LINE__

That would require expanding names delimited by #, even when they appear
in a position not normally expanded. No big deal, but it doesn't get you
much; see below.

> 4. If I should write:
>
>       #define unique_var var # counter
>       #define counter (counter + 1)
>       #define another_var var # counter
>
>    will cpp "do what I mean?"

Of course not; the preprocessor does not do arithmetic, and counter is
not expanded at the time of the define. But I agree that the semantics
must be fully specified so the behavior of such cases is well-defined.

>I just added stringization to Decus cpp and discovered something
>interesting:
>
>    #define print(format, value) printf("Result " "format", value)
>    print("%d", 123);
>
>My first attempt expanded to
>
>    printf("Result " ""%d"", 123);
>
>I've added a hack to strip one level of quotes, but aren't too
>happy with it.

Note that you just can't omit the argument quotes as you may want to
pass ',' through. Why not just define it as

    #define print(format, value) printf("Result " format, value)

Certainly the string concatenation should not happen until format is
evaluated.
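A minimal sketch of the definition suggested above, assuming a compiler
that merges adjacent string literals after macro expansion (as the later
ANSI standard does). The names print_to and format_result are
hypothetical and used only for illustration; snprintf stands in for
printf so that the result can be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The argument keeps its own double quotes.  After expansion the
   compiler sees "Result " "%d" and merges the two adjacent literals,
   so no quote stripping is needed inside the preprocessor. */
#define print_to(buf, size, format, value) \
    snprintf((buf), (size), "Result " format, (value))

/* Hypothetical helper: writes "Result 123" into buf and returns the
   number of characters that snprintf reports. */
int format_result(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return print_to(buf, size, "%d", 123);
}
```

Calling format_result with a 32-byte buffer leaves the string
"Result 123" in it. Because the format stays quoted at the call site, a
comma or parenthesis inside it never reaches the macro argument parser.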
>Also, is this ok:
>
>    print('%d', 123);
>
>In that case, I generate
>
>    printf("Result " "'%d'", 123);
>
>without comment.

I would think that you want

    print("'%d'", 123);

You should consistently require double quotes. If you don't want to
require quotes, then you can't allow commas, right parens, /*, etc. in
the argument. Trying to have your cake and eat it too by stripping
quotes just doesn't cut it. Remember that this isn't m4, where the
quotes are balanced (`').

> The committee might consider specifying the core run-time library
> (str..., is..., the math routines, and a few others) such that the
> compiler may generate in-line code or non-standard calling sequences.
> There should be a way to override some or all of this, of course.
> This was done for Fortran with no evil effects.

I agree. It would be nice if there were a way to specify in a header
file that a routine is possibly builtin, so lint could complain if you
take its address.

-- Jim Balter, INTERACTIVE Systems (ima!jim)