Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site warwick.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!think!harvard!seismo!mcvax!ukc!warwick!kay
From: kay@warwick.UUCP (Kay Dekker)
Newsgroups: net.unix
Subject: Unix text files
Message-ID: <2339@flame.warwick.UUCP>
Date: Sat, 26-Oct-85 11:37:17 EST
Article-I.D.: flame.2339
Posted: Sat Oct 26 11:37:17 1985
Date-Received: Tue, 29-Oct-85 00:49:01 EST
References: <23@pixel.UUCP> <2235@brl-tgr.ARPA> <2333@flame.warwick.UUCP> <2308@brl-tgr.ARPA>
Reply-To: kay@warwick.UUCP (Kay Dekker)
Organization: VLSI Group, Warwick University, UK
Lines: 71
Xpath: warwick flame flame ubu

Extensive quoting ensues, as I've moved the discussion to net.unix from
net.bugs, and people may have missed this...

Some time back, gwyn@brl-tgr.ARPA (Doug Gwyn) wrote:

>> >Many UNIX text-file utilities will discard a (necessarily final)
>> >text line that does not end in a newline.  Quite simply, such a
>> >file is not a proper UNIX text file.

and I responded with:

>> Who says?  Where's the definition of a 'proper' UNIX text file?

to which he replied:

>The problem is, there are several interpretations of such a file,
>depending on the utility involved.  Perhaps there should be a
>well-defined standard interpretation, but there isn't currently.
>
>"A file of text consists simply of a string of characters, with
>lines demarcated by the newline character."  -- from "The UNIX
>Time-Sharing System" by Ritchie & Thompson
>
>"text file, ASCII file -- a file, the bytes of which are understood
>to be in ASCII code"  -- from "Glossary" in "UNIX Time-Sharing
>System Programmer's Manual", 8th Ed.
>
>"A text stream is an ordered sequence of bytes composed into lines,
>each line consisting of zero or more characters plus a terminating
>new-line character.  ...  The sequentially last character read in
>from a text stream will, however, always be sequentially the last
>character that was earlier written out to the text stream, if that
>character was a new-line."  -- from ANSI X3J11/85-045
>
>My personal choice would be similar to Ritchie & Thompson, where
>newlines delimit (NOT "terminate") text lines, so that the last
>character in a text file would not need to be a newline.  However,
>this raises the question of what utilities should do with the
>null line at the end of every text file that DOES end with a
>newline; this will still be utility-dependent (and should be
>documented whenever it is handled differently from other text
>lines in the file).
>
>X3J11/85-045 botched it anyhow, since they intended that ALL UNIX
>files qualify as "text streams" under stdio (vs. "binary streams",
>which have to be handled differently on some non-UNIX OSes).
>
>So, how do we establish a standard interpretation for non-newline-
>terminated UNIX text files?

Doug,
	I may be being optimistic (and thus *wrong*) but I don't see where
the problem with your suggestion [newlines delimiting text lines] lies:
the rule would be, simply,

"Text consists of an ordered sequence of characters, with lines delimited
by newline characters.  Text is normally terminated by a newline.  This
newline should be considered to be followed by a (nonexistent) null line.
The null line should not be considered to be part of the text.
	"If the last character of the text is not a newline, then consider
the text to be terminated by a newline - null line pair; however, this
newline - null line pair should not be considered to have been part of
the file."

I *think* that's right...
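
To make the rule concrete, here is a minimal sketch of how a utility might
split text into lines under it.  The function name text_lines is made up
for illustration, and treating completely empty text as having no lines
at all is my own assumption; the rule as stated does not cover that case.

```python
def text_lines(text: str) -> list[str]:
    """Split text into lines, with newlines delimiting the lines.

    Hypothetical helper illustrating the proposed rule; not taken from
    any existing utility.
    """
    # Assumption: empty text contains no lines (the rule is silent here).
    if text == "":
        return []

    parts = text.split("\n")

    # If the text ends with a newline, the split leaves a trailing null
    # line.  The rule says that null line is not part of the text.
    if text.endswith("\n"):
        parts.pop()

    # If the text does NOT end with a newline, the rule says to act as
    # though a newline - null line pair followed it.  Nothing is added
    # to the result, since that pair is not part of the file; the last
    # line is simply kept rather than discarded.
    return parts
```

Under this sketch, "a\nb\n" and "a\nb" both give the two lines "a" and
"b", while "a\n\n" gives "a" followed by one genuinely empty line.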
							Kay.
-- 
"The only good thing that I can find to say about the idea of colonies
in space is that America could, at last, have a world to herself."
						-- Elisabeth Zyne
			... mcvax!ukc!warwick!flame!kay