Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!ames!oliveb!sun!gorodish!guy
From: guy%gorodish@Sun.COM (Guy Harris)
Newsgroups: comp.bugs.4bsd
Subject: Re: read() from tty has fencepost error
Message-ID: <22664@sun.uucp>
Date: Sun, 5-Jul-87 04:51:37 EDT
Article-I.D.: sun.22664
Posted: Sun Jul 5 04:51:37 1987
Date-Received: Sun, 5-Jul-87 09:37:16 EDT
References: <648@haddock.UUCP> <6040@brl-smoke.ARPA> <13048@topaz.rutgers.edu> <6053@brl-smoke.ARPA>
Sender: news@sun.uucp
Lines: 82

> At one time, a special "delimiter" marker was inserted into the stream
> at that point.  Apparently, some UNIXy implementations do it one way
> and some another.

Non-STREAMS tty drivers generally have a "raw" queue and a "canonical"
queue.  Reads in "cooked" mode take place from the "canonical" queue.

In the AT&T drivers, of various flavors (V7, S3, S5), characters
accumulate in the "raw" queue until a "read" is done.  If the terminal
is in cooked mode when the "read" is done, the "read" blocks until a
line terminator (newline, EOF, or "secondary end-of-line" character) is
received.  At that point, one and only one line is canonicalized
(erase/kill processing is done) and is moved to the "canonical" queue.
If the "line" is terminated by an EOF rather than an end-of-line
character, the EOF does NOT appear in the canonical queue.  Thus, the
top-level reading code won't see delimiters.

The 4BSD driver(s) move data from the "raw" queue to the "canonical"
queue as soon as a line terminator is received.  "Canonicalization" is
done on the fly; for example, as soon as an "erase" character is
received, the character it erases is removed from the "raw" queue.
(This makes it easier to implement more correct handling of the "erase"
character - it's easier for the driver to know what character is being
erased, so it can do a better job of erasing it from the screen - and
also makes it easier to handle a "reprint" character that causes the
current queued-up input to be re-echoed.  It also means that erase,
kill, etc. characters do NOT count against the 256-character limit of
uncanonicalized characters, but subtract from that count.)

If the line ended with EOF, the EOF is left in the canonical queue as a
delimiter.  It is stripped out when the "read" is done; however, if
there are five characters in the queue, and the "read" asks for five
bytes, only those five characters are looked at.  If an EOF follows
them, it is left in the queue and seen by the next "read".

> I seem to recall that SVR3.0 STREAMS was missing the M_DELIM message type,
> so whenever AT&T finally gets the whole character I/O system converted to
> STREAMS, they couldn't insert a delimiter if they wanted too (according to
> Ron, that would be consistent with current UNIX System V behavior).

This is true.  STREAMS messages somewhat resemble "mbuf" chains;
delimiters are implicit in the structure of these chains (when you get
to the end of one, you're at the end of a message).  A line would be a
single STREAMS message; the EOF would be discarded ASAP, since it is not
needed as a delimiter.  As such, any driver based on the S5R3 STREAMS
code will give the "push", rather than the "delimiter" behavior
(regardless of whether it implements "canonicalize at read time" or
"canonicalize at input time" behavior).

The "streams" code described in Dennis Ritchie's paper in the BSTJ (I
have no idea if that implementation is called STREAMS or just "streams")
has a "delimiter" message type.
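
As an illustration only (this is not code from any of the drivers
discussed here), the "delimiter" and "push" behaviors can be modeled
with a toy canonical queue.  Every name in it (struct canq,
put_eof_line, toy_read) is made up for the sketch, and the -1 return
merely stands in for a "read" that would block waiting for input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DELIM 0x04      /* ^D, used here as the in-queue delimiter marker */

/* Hypothetical toy canonical queue: a flat buffer plus a read position. */
struct canq {
    char buf[64];
    int  len;
    int  pos;
};

/*
 * Queue one line that the user ended with EOF.  If keep_delim is nonzero
 * the EOF stays in the queue as a delimiter (the 4BSD behavior described
 * above); otherwise it is discarded (the "push" behavior).
 */
static void put_eof_line(struct canq *q, const char *text, int keep_delim)
{
    size_t n = strlen(text);

    assert((size_t)q->len + n + 1 <= sizeof q->buf);
    memcpy(q->buf + q->len, text, n);
    q->len += (int)n;
    if (keep_delim)
        q->buf[q->len++] = DELIM;
}

/*
 * Toy read: returns the number of bytes copied, 0 for an end-of-file
 * indication, or -1 where a real read() would block (queue empty).
 */
static int toy_read(struct canq *q, char *dst, int want)
{
    int got = 0;

    if (q->pos == q->len)
        return -1;
    while (got < want && q->pos < q->len) {
        if (q->buf[q->pos] == DELIM) {
            q->pos++;           /* strip the delimiter and end this read */
            return got;
        }
        dst[got++] = q->buf[q->pos++];
    }
    return got;                 /* a delimiter past "want" is not looked at */
}
```

Queue "hello" ended by EOF and ask for five bytes twice.  With the
delimiter kept, the calls return 5 and then 0, so the program sees an
end-of-file it did not expect.  With the EOF discarded, they return 5
and then -1, i.e. the second "read" simply waits for more input.
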
I don't know what sort of behavior the various V8 "streams"-based (as
opposed to S5R3 STREAMS-based) tty drivers provide; Dennis' paper
described two drivers, one giving the 4.1BSD "old" line discipline
behavior (which may resemble V7 behavior) and one giving the 4.1BSD
"new" line discipline behavior (which probably resembles other 4BSD
systems).

I agree with most of the people here; the non-4BSD behavior is correct.
When I type ^D, it doesn't mean that I'm putting a ^D into the input
queue, it means I'm terminating a record.

> Alas, another difference among UNIX variants.  What does POSIX have to
> say about this?

From the draft of Draft 10 (*sic*) we have here:

	7.1.1.11 Special Characters

	...

	EOF	...When received, all the characters waiting to be read
		are immediately passed to the program, without waiting
		for a new-line, and the EOF is discarded.  Thus, if
		there are no characters waiting (that is, the EOF
		occurred at the beginning of a line), zero characters
		shall be passed back, representing an end-of-file
		indication.

	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com