Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!wuarchive!gem.mps.ohio-state.edu!apple!sun-barr!decwrl!shlump.nac.dec.com!mountn.dec.com!minow
From: minow@mountn.dec.com (Martin Minow)
Newsgroups: comp.std.c
Subject: Re: C source character set
Message-ID: <887@mountn.dec.com>
Date: 2 Oct 89 20:29:54 GMT
References: <1302@gmdzi.UUCP>
Reply-To: minow@mountn.dec.com (Martin Minow)
Organization: Digital Equipment Corporation
Lines: 37

In article <1302@gmdzi.UUCP> wittig@gmdzi.UUCP (Georg Wittig) writes:
>[1] There exist editors that allow you to enter any ASCII character. Consider
>    the following program fragment:
>
>		/* in the following lines let @ be the character '\0' */
>		int x;
>		x = 1 +	/* foo @ bar */
>		    2	/* */
>		    ;
This is probably a "quality of implementation" issue (because of NUL's
specific use in C to terminate strings).  In my opinion, a good
implementation ought to sweep out such characters.  More interesting is
whether the '@' can stand for one of the national letters in the ISO
Latin-1 alphabet (these have values from 0xA0 to 0xFF).  Again, "good"
implementations will allow characters that aren't in the C source
alphabet in comments, 'char' constants, and "string" constants.
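
To make "sweep out" concrete, here is a minimal sketch of what a source
reader might do before handing text to the tokenizer.  The name
sweep_nuls() is hypothetical, and nothing requires an implementation to
behave this way; it only illustrates the opinion above.

```c
#include <stddef.h>

/* Hypothetical helper (not from any real compiler): remove every NUL
   byte from buf[0..len) in place and return the new length. */
size_t sweep_nuls(char *buf, size_t len)
{
    size_t in, out = 0;

    for (in = 0; in < len; in++) {
        if (buf[in] != '\0')        /* keep everything except NUL */
            buf[out++] = buf[in];
    }
    return out;
}
```

Bytes in the 0xA0 to 0xFF range pass through untouched, so national
letters in comments and strings would survive such a sweep.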

>
>[2] Furthermore, there are (non-UNIX) operating systems that encode the end of
>    a source line by the number of bytes of that line instead of inserting a
>    newline character

fgets() should return each such record as "string\n"; how it would treat
an embedded \n is a quality of implementation issue.  I would suggest
that there should be no difference between an explicit \n and one
generated to signal an end-of-record.
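
A rough sketch of that mapping, assuming the I/O library receives each
record as a pointer plus a byte count.  record_to_line() and its
parameters are made up for illustration; a real record-oriented run-time
library will look different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: present one counted record (rec, rec_len) as a
   newline-terminated C string, the way fgets() would deliver it.
   Returns the number of characters stored (excluding the NUL),
   or 0 if the result does not fit in line[]. */
size_t record_to_line(const char *rec, size_t rec_len,
                      char *line, size_t line_size)
{
    if (line_size < rec_len + 2)    /* room for '\n' and '\0' */
        return 0;
    memcpy(line, rec, rec_len);     /* an embedded '\n' is copied as-is */
    line[rec_len] = '\n';           /* generated end-of-record marker */
    line[rec_len + 1] = '\0';
    return rec_len + 1;
}
```

Because an embedded \n is copied unchanged, the caller cannot tell it
from the generated one, which is the behavior suggested above.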

> I can think of at least 5
>    different ways to process such a crazy macro.

>[3] Line continuation by `\'

Backslash-newline may occur anywhere, even inside a token (ignoring
trigraphs).  Thus "terribly_lon\
g_identifier" is a single legal identifier.
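
A small example of the same thing, with names invented for the purpose
(get_value() and the value 42 are mine).  Line splicing happens before
tokenization, so the declaration below defines one identifier.

```c
/* The backslash-newline pair is deleted before the source is broken
   into tokens, so the two physical lines form a single identifier. */
static int terribly_lon\
g_identifier = 42;

int get_value(void)
{
    /* Refers to the same object, spelled here without the splice. */
    return terribly_long_identifier;
}
```

Note that the continuation line must start in column one; any leading
white space would end up between the two halves and split the token.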

Martin Minow
minow@thundr.dec.com