Path: utzoo!attcan!uunet!wyse!vsi1!ames!elroy!jpl-devvax!lwall
From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Newsgroups: comp.sources.d
Subject: Re: v05i053: A "safe" replacement for gets()
Message-ID: <3600@jpl-devvax.JPL.NASA.GOV>
Date: 28 Nov 88 22:58:47 GMT
References: <6508@csli.STANFORD.EDU> <2055@xyzzy.UUCP>
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA.
Lines: 43

In article <2055@xyzzy.UUCP> meissner@xyzzy.UUCP (Michael Meissner) writes:
: To my way of thinking, both gets and fgets are brain damaged, in that
: they encourage using fixed-size buffers.  I think we should be
: migrating programmers to something that allocates buffers, and if the
: buffer overflows, realloc's a bigger buffer.  Whether or not the
: previous buffer is reused should be an option available to the
: programmer.  No matter what buffer size you choose, you will run into
: situations where either you waste space based on pessimistic
: assumptions, or run into input that is larger than expected.

I'm inclined to agree with this.  For the quick-and-dirty program, fgets()
is fine, but I'm appalled at the number of Unix utilities that have fixed
line length limits.  There's simply no excuse for it in this day and age.
(Red-faced, he thinks about patch, which has a limit of 1024 characters...)

: If readline returns NULL, it means an error or end of file, and if
: oldline is not NULL, it means reuse the line buffer passed in.  Except
: for the initial allocation, it should be faster than the fgets
: solution, since the newline is already located.

You can make it even faster by slurping the appropriate iob values into
registers and bypassing getc().  (Don't do this at home, kids.)  For an
example, see the str_gets() routine in perl.  While it's true that you
*shouldn't* cheat on the iob structure, you can get away with it almost
everywhere.  An #ifdef will handle other situations.
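
For reference, here is a minimal sketch of the kind of growing-buffer
reader being discussed, written with plain getc() rather than any iob
trickery.  The name read_line, the three-argument interface, and the
starting size of 64 are assumptions made for illustration; the thread
does not give a prototype, and this is not the perl str_gets() routine.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical interface (the thread gives no prototype):
 *   oldline  buffer returned by a previous call, or NULL to allocate one
 *   cap      in/out: allocated size of that buffer
 * Returns the line with its newline removed, or NULL on end of file,
 * read error, or allocation failure.  When NULL is returned, any
 * buffer that was passed in has been freed.
 */
char *read_line(FILE *fp, char *oldline, size_t *cap)
{
    char *buf = oldline;
    size_t len = 0;
    int c;

    if (buf == NULL) {
        *cap = 64;                      /* arbitrary starting size */
        buf = malloc(*cap);
        if (buf == NULL)
            return NULL;
    }
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= *cap) {          /* keep room for the '\0' */
            size_t ncap = *cap * 2;
            char *nbuf = realloc(buf, ncap);
            if (nbuf == NULL) {
                free(buf);
                return NULL;
            }
            buf = nbuf;
            *cap = ncap;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {         /* nothing read: EOF or error */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

A caller would typically start with line = NULL and loop on
"while ((line = read_line(fp, line, &cap)) != NULL)", so the same
buffer is reused and only grows when a longer line turns up.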

: You can also get fancy (like I've done sometimes), and define another
: argument that gives a comment character (like '#' in shell scripts),
: that automatically discards anything after the comment character, and
: also break the line into whitespace separated fields, since reading
: tabular data is a common use for fgets.

Ack.  Pfft.

You could do this for a particular application, but don't put it into the
general routine.  The perl routine mentioned does do alternate record
terminator stuff (including paragraph mode), but you can still keep
everything in registers (on a Vax) while doing so.  I don't think I'd
have put the extra functionality into that routine otherwise.
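
One way to keep that separation is to leave the reader alone and do
the comment stripping and field splitting in small application-level
helpers that run on the line afterwards.  The sketch below is only an
illustration; the names strip_comment and split_fields and their
arguments are hypothetical, not anything from the posted code or from
perl.

```c
#include <string.h>

/* Cut the line off, in place, at the first comment character. */
void strip_comment(char *line, int comment_char)
{
    char *p = strchr(line, comment_char);

    if (p != NULL)
        *p = '\0';
}

/*
 * Split a line, in place, into fields separated by spaces or tabs.
 * Stores at most max pointers in fields[] and returns how many were
 * stored; any fields beyond max are left unsplit and ignored.
 */
int split_fields(char *line, char **fields, int max)
{
    int n = 0;
    char *p = line;

    while (n < max) {
        p += strspn(p, " \t");          /* skip leading whitespace */
        if (*p == '\0')
            break;
        fields[n++] = p;
        p += strcspn(p, " \t");         /* find the end of the field */
        if (*p == '\0')
            break;
        *p++ = '\0';                    /* terminate it and move on */
    }
    return n;
}
```

An application that wants shell-style comments would call
strip_comment(line, '#') and then split_fields() on each line it
reads; one that doesn't pays nothing for the feature.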

Larry Wall
lwall@jpl-devvax.jpl.nasa.gov
"He who toys with the most wins dies."