Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site mulga.OZ
Path: utzoo!watmath!clyde!bonnie!akgua!sdcsvax!dcdwest!ittvax!decvax!mulga!kre
From: kre@mulga.OZ (Robert Elz)
Newsgroups: net.lang.c
Subject: Re: ANSI C and the C Pre-Processor
Message-ID: <459@mulga.OZ>
Date: Sat, 29-Sep-84 06:54:00 EDT
Article-I.D.: mulga.459
Posted: Sat Sep 29 06:54:00 1984
Date-Received: Mon, 1-Oct-84 04:32:40 EDT
References: <1691@pegasus.UUCP>, <4335@utzoo.UUCP>, <4082@tekecs.UUCP>
Organization: Comp Sci, Melbourne Uni, Australia
Lines: 132

From Henry Spencer:
|
| > ............ However, this idea is being extended to include strings and
| > character constants as tokens that don't get scanned for replacement text.
|
| K+R, section 12.1: "Text inside a string or a character constant is
| not subject to replacement." In other words, this is not something new:
| the language has always been specified to behave that way.

I think it is instructional to consider the wording of the proposed (draft)
standard. [This is from the July version, I doubt that it's changed in
the Sept one].

Sect 9.2:
	..... Character constants and strings in the token sequence or in
	the rest of the program are not scanned for defined identifiers
	or formal parameters. ....

Now consider the wording in the April version (it was sect 9.1 then)

Sect 9.1:
	..... Character strings in the token sequence or in the rest of
	the program are not scanned for defined identifiers. ....

Note the difference: the July wording adds character constants and,
more importantly, formal parameters. K&R was never clear on this point -
its wording here (and elsewhere) was ambiguous. That is, a perfectly
viable interpretation, taken by Reiser, was that strings in the token
sequence could be scanned for parameters.

There are (as has been pointed out many times) many reasons for allowing
this. The ONLY one for denying it, that I can see, is that some people
get confused (don't understand what's happening).
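To make the two readings concrete, here is a minimal sketch, not taken
from any of the drafts. SHOW and reading_applied are hypothetical names
made up for illustration, and snprintf is assumed to be available in the
C library. A preprocessor following the draft wording should leave the
string alone; one following the Reiser interpretation should substitute
the actual argument into it.

```c
#include <stdio.h>
#include <string.h>

/* SHOW is a hypothetical macro: its formal parameter `x` appears once
   inside a string literal and once as an ordinary token. */
#define SHOW(buf, x) snprintf((buf), sizeof (buf), "x = %d", (x))

/* Formats `count` through SHOW and reports which reading the
   preprocessor applied to the string in the replacement text. */
static const char *reading_applied(void)
{
    char buf[32];
    int count = 7;

    SHOW(buf, count);

    /* Draft wording: the string is not scanned, giving "x = 7".
       Reiser reading: `x` is replaced in the string, giving "count = 7". */
    if (strcmp(buf, "x = 7") == 0)
        return "draft";
    if (strcmp(buf, "count = 7") == 0)
        return "reiser";
    return "other";
}
```

Calling reading_applied() should return "draft" on a preprocessor that
follows the draft wording, and "reiser" on one that scans strings for
formal parameters.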
The right way to solve that problem is to clearly document what happens -
no-one will have any problems with it if it's made clear what will happen.

Henry continues:
|
| > The questions are: Should this change be endorsed?
|
| Of course it should be endorsed, since it's not really a change at all.
| The standard is the documentation, not Reiser's code.

The problem is that K&R is *not* a standard. If it were, we wouldn't need
X3J11. In the absence of a standard, and in the presence of ambiguous
documentation, the only place to look is in the implementations.

Henry also stated (quote omitted) that most non-Unix C compilers adopted
the restrictive approach. So, now we have a conflict - no immediate
practical reason (in terms of broken code) for jumping one way or the
other. In short, nearly the ideal situation for adopting the best solution.

If C were a language for amateur programmers, beginners, etc, I would
tend to favour the restricted approach. But that's not what C is. It's
a dangerous language, filled with dangerous features. It's for
professionals. We should adopt the most useful approach - the one that
gives the greatest power to the programmer - that is clearly the liberal
approach.

Pragmatically too, it will be much easier to convert programs broken
by this strategy (those in which macro replacement text contains strings
containing "accidental" references to parameters) than those broken by
the current draft proposed standard (those that use replacement inside
strings to good effect). In the former case, all that needs to be done
is to rename the formal parameter. In the latter, some whole new mechanism
needs to be devised - possibly requiring changes in the source.
I also suspect that fewer programs would be broken by the former.

Henry again:
|
| As for what should be done to bring back the lost functionality... the
| ANSI C folks have basically said "if you want a general-purpose macro
| processor, use m4".
| The programs that this "change" will break are
| broken already, and should be fixed to do it right.

No-one is asking for a full blown macro processor, just that subset
that is really useful for C programs. If the committee were to take
the "use m4" attitude, they would logically have to standardize m4
as a (possibly optional) part of the C compiler. Otherwise all those
programs that go to the trouble of adopting their recommendation,
and use m4, will stop being portable, which can hardly be the aim.

Joe Mueller replied:
|
| As Henry stated, the X3J11 committee (ANSI C), felt that the preprocessor
| was not intended to be a general purpose macro processor, BUT, we did
| acknowledge that there was a large body of code that used these types
| of "features". The committee is currently considering proposals for
|
| a) token concatenation operations within the preprocessor. It will
|    definitely NOT be startoftoken/**/argument. Currently it looks like
|    the # will be used like this: startoftoken#argument. I don't believe
|    we have definitely decided the syntax for the operation. I think that
|    the committee did decide that the functionality was needed.

I agree that this is needed - while I regret the need to alter some
of my source (I am a xxx/**/yyy user) I admit that this is a revolting
way of forming tokens; something better, anything better, would be welcome.
[No, please don't tell me about your favourite revolting way of avoiding
xxx/**/yyy, I've seen most of them, none of the existing ones is clearly
better.] The '#' operator proposal looks reasonable to me.

When you're considering this, please also remember to do something
about the problems of blanks in the actual parameter strings - are
they significant, or not? That is, spaces between the preceding comma
or '(' and the start of the replacement text, and blanks after the
text before the ')' or next comma.
I would prefer that the standard make it clear that these should not be
included as part of the replacement text.

Joe:
|
| b) "stringizing" (I didn't make up this term, someone else did) arguments
|    is also under consideration. One proposal is to do the substitution
|    if the argument name is the only thing within the quotes. i.e.
|	#define foo(bar) printf("bar")
|    will expand bar within the quotes where
|	#define foo(bar) printf("the argument was bar")
|    will not expand bar.

Ugh! How could you justify that! I appreciate that, combined with
constant string concatenation, it would give all the functionality
that is needed - the second example could be rephrased as:

	#define foo(bar) printf("the argument was ""bar")

but that's going to be a nasty distinction to try to explain to anyone.
And that would break ALL existing implementations.

Seems to me that in this case, adopting the Reiser interpretation is the
better thing to do. Document it clearly, so people aren't trapped, and
that should end the problems.

Robert Elz
decvax!mulga!kre