Path: utzoo!attcan!uunet!husc6!ukma!gatech!mcnc!rti!xyzzy!meissner
From: meissner@xyzzy.UUCP (Michael Meissner)
Newsgroups: comp.sources.d
Subject: Re: v05i053: A "safe" replacement for gets()
Message-ID: <2055@xyzzy.UUCP>
Date: 28 Nov 88 15:36:27 GMT
References: <6508@csli.STANFORD.EDU>
Reply-To: meissner@xyzzy.UUCP (Michael Meissner)
Organization: Data General (Languages @ Research Triangle Park, NC.)
Lines: 49

In article <6508@csli.STANFORD.EDU> wagner@arisia.xerox.com (Juergen Wagner)
writes:
| I am wondering how long the discussion will last... The alternatives were:
	...	/* extraneous suggestions deleted */

| o  replace gets by something more intelligent
|    We arrive at fgets. The "more intelligent" will certainly mean some kind
|    of overflow checking. That's already available in fgets.

To my way of thinking, both gets and fgets are brain damaged, in that
they encourage the use of fixed-size buffers.  I think we should be
migrating programmers to something that allocates its buffer and, if
the buffer overflows, reallocs a bigger one.  Whether the previous
buffer is reused should be an option available to the programmer.  No
matter what buffer size you choose, you will run into situations where
you either waste space because of pessimistic assumptions or hit input
that is longer than expected.

In most of the code I've looked at that uses fgets, either no check is
made to see whether the newline is actually in the buffer, or the
check is made via strchr (or index for BSD types) and the code assumes
that NULL is never returned.
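
For illustration, here is a minimal sketch of what a careful fgets
caller has to do.  The name fgets_checked and its return convention
are made up for this example; the point is only that the NULL result
from strchr has to be handled, not assumed away.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any library): read one line with
 * fgets and report whether the whole line fit in buf.
 * Returns -1 on end of file or error, 0 if a complete line was read
 * (trailing newline removed), 1 if the line was too long for buf.
 */
int fgets_checked(char *buf, int size, FILE *stream)
{
	char *nl;
	int c;

	if (fgets(buf, size, stream) == NULL)
		return -1;

	nl = strchr(buf, '\n');
	if (nl != NULL) {		/* the usual case */
		*nl = '\0';
		return 0;
	}

	/* No newline in the buffer: look at what comes next. */
	c = getc(stream);
	if (c == EOF || c == '\n')	/* line just fit, or last line unterminated */
		return 0;
	ungetc(c, stream);		/* rest of the line is still unread */
	return 1;
}
```

When 1 comes back the caller still has to decide what to do with the
rest of the line, which is exactly the bookkeeping a fixed-size buffer
forces on you.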

I think this readline function should look something like (in ANSI C
prototypes):

	typedef struct {
		size_t	line_alloc_max;	/* current max # bytes allocated */
		size_t	line_num_chars;	/* # chars in this line or 0 */
		char	line_buffer[1];	/* start of line buffer */
	} line_t;

	extern line_t *readline( FILE *stream, line_t *oldline );

If readline returns NULL, it means an error or end of file.  If
oldline is not NULL, readline reuses the line buffer passed in.  Except
for the initial allocation, it should be faster than the fgets
solution, since the caller gets the line length back (line_num_chars)
and does not have to rescan the buffer to find the newline.
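
One way the function could be written is sketched below.  This is only
a sketch under assumptions the description above does not settle: the
starting size (LINE_INITIAL), the doubling growth policy, the helper
line_resize, dropping the newline from the stored line, and freeing
the buffer before returning NULL are all choices made for the example.
It also leans on the old trick of allocating past the end of
line_buffer[1]; C99 and later would use a flexible array member.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
	size_t	line_alloc_max;	/* current max # bytes allocated */
	size_t	line_num_chars;	/* # chars in this line or 0 */
	char	line_buffer[1];	/* start of line buffer */
} line_t;

#define LINE_INITIAL	64	/* assumed starting capacity */

/* Hypothetical helper: (re)allocate so line_buffer holds nbytes bytes. */
static line_t *line_resize(line_t *line, size_t nbytes)
{
	line_t *p = realloc(line, offsetof(line_t, line_buffer) + nbytes);

	if (p != NULL)
		p->line_alloc_max = nbytes;
	return p;
}

line_t *readline(FILE *stream, line_t *oldline)
{
	line_t *line = oldline;
	size_t n = 0;
	int c;

	if (line == NULL && (line = line_resize(NULL, LINE_INITIAL)) == NULL)
		return NULL;

	while ((c = getc(stream)) != EOF && c != '\n') {
		if (n + 1 >= line->line_alloc_max) {	/* keep room for '\0' */
			line_t *bigger = line_resize(line, line->line_alloc_max * 2);

			if (bigger == NULL) {
				free(line);
				return NULL;
			}
			line = bigger;
		}
		line->line_buffer[n++] = (char)c;
	}

	if (c == EOF && n == 0) {	/* end of file or error, nothing read */
		free(line);		/* assumption: buffer is released here */
		return NULL;
	}

	line->line_buffer[n] = '\0';	/* newline itself is not stored */
	line->line_num_chars = n;
	return line;
}
```

With this convention the usual loop is
"while ((line = readline(fp, line)) != NULL)", starting from
line = NULL, and nothing is left to free when the loop ends.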

You can also get fancy (as I've done sometimes) and define another
argument that gives a comment character (like '#' in shell scripts),
so that anything after the comment character is automatically
discarded, and also break the line into whitespace-separated fields,
since reading tabular data is a common use for fgets.
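
A rough sketch of that fancier step, written as a separate pass over a
line that has already been read.  The name split_fields and its
argument list are invented for the example; it modifies the text in
place and quietly ignores any fields beyond max_fields.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: cut text off at the comment character (pass
 * '\0' for none), then split what is left into whitespace-separated
 * fields.  Pointers to the fields are stored in fields[]; the return
 * value is the number of fields found.
 */
size_t split_fields(char *text, int comment, char **fields, size_t max_fields)
{
	size_t n = 0;
	char *p;

	if (comment != '\0' && (p = strchr(text, comment)) != NULL)
		*p = '\0';		/* discard the comment */

	p = text;
	while (n < max_fields) {
		while (isspace((unsigned char)*p))
			p++;		/* skip leading whitespace */
		if (*p == '\0')
			break;
		fields[n++] = p;	/* start of a field */
		while (*p != '\0' && !isspace((unsigned char)*p))
			p++;
		if (*p != '\0')
			*p++ = '\0';	/* terminate the field */
	}
	return n;
}
```

Given the text "alpha<TAB>42  beta # trailing comment" and '#' as the
comment character, this yields the three fields "alpha", "42" and
"beta".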

-- 
Michael Meissner, Data General.

Uucp:	...!mcnc!rti!xyzzy!meissner
Arpa:	meissner@dg-rtp.DG.COM   (or) meissner%dg-rtp.DG.COM@relay.cs.net