Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!wuarchive!gem.mps.ohio-state.edu!ginosko!uunet!mcsun!unido!gmdzi!wittig
From: wittig@gmdzi.UUCP (Georg Wittig)
Newsgroups: comp.std.c
Subject: C source character set
Message-ID: <1302@gmdzi.UUCP>
Date: 2 Oct 89 15:31:58 GMT
Organization: GMD - German National Research Laboratory for Computer Science
Lines: 68

Maybe the following are RTFM questions, but I don't have the ANSI C
papers, and Harbison & Steele II doesn't seem to cover it ...

My questions are about the legal characters in a C source programme:

[1] There exist editors that allow you to enter any ASCII character.
Consider the following program fragment:

    /* in the following lines let @ be the character '\0' */
    int x;
    x = 1 + /* foo @ bar */ 2 /*
    */ ;

Is this program fragment equivalent to

[a] ``int x; x = 1 + 2;''
    In this case C compilers cannot use ``fgets'' to read the source
    lines.

or

[b] ``int x; x = 1 + ;''
    This will result in a syntax error message in later compiler
    phases.

What about a '\0' outside a C comment? Does it terminate the current
line, or must it be kept so that a syntax error message will be the
result? What about a '\0' in a string constant?

[2] Furthermore, there are (non-UNIX) operating systems that encode the
end of a source line by the number of bytes of that line instead of
inserting a newline character (\x0a or \x0d in ASCII, \x15 in EBCDIC)
at the end of that line. As an example, the line ``abc'' could be
encoded as ``\3abc'', and not as ``abc\x0d''. In those environments
``[f]getc'' must generate an artificial '\n' character at the end of
the line. Or am I mistaken? What if exactly this artificial '\n' is
also a character of the line? What is a ``line'' in this context?
Consider a (perverse looking) macro like the following:

    /* in the following line let @ be the character '\n' */
    #define X(a,b) foo@#define X(a,b) ((a)+(b))
    i = X(27,38);

Is this required to pass the preprocessor phase without an error
message, and if so what is the output of that phase? I can think of at
least 5 different ways to process such a crazy macro.

[3] Line continuation by `\': Does it only apply to #define contexts
and string constant contexts, or is it a general rule? Example:

    int terrible_long_identifier;
    terrible_lon\
g_identifier = 1;

Does the assignment statement alter the value of that terrible long
variable, or is it a syntax error (``terrible_lon'' and
``g_identifier'' undeclared)?

Thanks in advance,
--
Georg Wittig   GMD-Z1.BI   P.O. Box 1240   D-5205 St. Augustin 1   (West Germany)
email: wittig@gmdzi.uucp   phone: (+49 2241) 14-2294
-------------------------------------------------------------------------------
"Freedom's just another word for nothing left to lose" (Kris Kristofferson)