Path: utzoo!yunexus!geac!syntron!jtsv16!uunet!peregrine!elroy!ames!mailrus!cornell!uw-beaver!ssc-vax!lee
From: lee@ssc-vax.UUCP (Lee Carver)
Newsgroups: comp.unix.wizards
Subject: file descriptor vs. file pointer closing
Keywords: open(2), close(2), exit(2)
Message-ID: <1122@ssc-bee.ssc-vax.UUCP>
Date: 12 Aug 88 23:49:43 GMT
Article-I.D.: ssc-bee.1122
Organization: Boeing Aerospace Corp., Seattle WA
Lines: 187

Why should closing a file descriptor necessarily close the file
pointer?  Especially when there is more than one file descriptor
associated with the file pointer.

The following is ~180 lines of discussion.

--- The plan

I was trying to build a nice, prompting, validating input reader for
adding to shell scripts.  The idea was to run this program (we'll call
it readtkn), and have it read and validate the user's input, then
write the result to stdout.  I managed to build something more useful
than not.  Typical usage might be:

   # this file is 'demo'
   set src=`readtkn 'Enter source file > ' opt opt opt`
   set dst=`readtkn 'Enter destination > ' opt opt opt`
   cp $src $dst

Obviously, expand this to your heart's content.  Presumably, the
options in the above example constrain the user's input to valid file
names, etc.

--- The problem

Unfortunately, we ran into serious problems during testing, and at any
other time that a script using readtkn is sent a file of responses
instead of reading the user's terminal.

We test our package by running the scripts with known answers, and
verifying the expected behavior, along the lines of:

   # this file is 'test'
   demo << EOF
   model
   /tmp/model
   EOF
   diff model /tmp/model

The second execution of readtkn finds an end of file.  It seems that
when the first execution of readtkn terminates, it closes the file
descriptor (exit semantics).  Thus, when the second readtkn runs, it
is handed the file descriptor of a closed file, and reads EOF.
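
A minimal sketch for checking this hypothesis in isolation, assuming a
POSIX-style system with fork() and mkstemp().  The function name
parent_read_after_child_close and the scratch file are made up for
illustration and are not part of readtkn.  Stdio is deliberately left
out, so only raw descriptors are involved: a forked child reads one
byte and closes its descriptor, then the parent reads from its own
copy of the same descriptor.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical probe (not part of readtkn): put "AB" in a scratch
 * file, let a forked child read one byte and close its descriptor,
 * then let the parent read from the same descriptor.
 * Returns the byte the parent read, 0 on EOF, or -1 on any error.
 */
int parent_read_after_child_close(void)
{
    char path[] = "/tmp/fdprobeXXXXXX";  /* assumed scratch location */
    char ch;
    int status;
    ssize_t n;
    pid_t pid;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                        /* file lives until last close */
    if (write(fd, "AB", 2) != 2 || lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* child: consume one byte, then close and leave */
        n = read(fd, &ch, 1);
        close(fd);
        _exit(n == 1 && ch == 'A' ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* parent: read from its own copy of the descriptor */
    n = read(fd, &ch, 1);
    close(fd);
    if (n < 0)
        return -1;
    return n == 0 ? 0 : ch;
}
```

Called from a small main(), a return value of 'B' would mean that the
child's close left the parent's descriptor usable and that the shared
offset moved by exactly the one byte the child read.  A return of 0
or -1 would point the other way.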

My understanding of the UNIX process and file structure tells me that
each file descriptor is associated with a file pointer.  When readtkn
is run (by fork), it creates a new file descriptor to the original
file pointer.  Thus, if either the child (readtkn) or the parent
(shell/demo) changes the file pointer (seek, read, etc.), the other
one is affected.

The problem is the close, which is called automatically for every open
file descriptor on exit.  The close eliminates the file pointer, even
though there are two file descriptors associated with it.

It seems to me that this should not be so.  The file pointer that the
first readtkn closes is shared by a file descriptor in the shell.  The
child must close its own descriptor/connection, but why should that
cause the file pointer to be closed as well?

In fact, one might be inclined to argue that the file pointer is
"owned" by the parent, not the child.  So what is the child doing
closing a file pointer that it does not own?

Why does this work at all if stdin is /dev/tty?  Apparently, the shell
reopens stdin if it is closed by a child process, but I'm not sure.

--- The proposals

Clearly, there are programs that rely on these semantics.  So we
cannot change the semantics of close, at least in the normal case.
That eliminates the proposal that only the "owner" of the file pointer
can close the file pointer.

The next alternative is a new fcntl option to mark a file descriptor
as "don't close on exit".  This is somewhat similar to the F_SETFD
option for "close on exec".  Personally, I don't like this, since I'm
not sure I could manage it.

My feeling is that a "new" close call should be provided
(disconnect?).  The semantics of disconnect would be to close the file
pointer only when the last file descriptor is disconnected.  An
equivalent variation would be an fcntl option that activates these
semantics on close.

A problem with these proposals is the interaction of stdio.
If the input actually read by readtkn is less than the amount
pre-fetched by stdio, we need to clean up.  The un-read but
pre-fetched bytes need to be restored to the file.  Perhaps an lseek
to the last delivered byte is all that is needed.

Hopefully I'm missing something obvious.  If not, or if you have a
better idea, send me mail.

--- The workaround

On our system at least (see disclosure), we were able to get things to
work by wrapping a shell script around readtkn in the following style:

   # this file is 'readtkn', a wrapper for the program readtkn.exe
   if test $AUTOMATED then
      read scrap
      echo $scrap | readtkn.exe $*
   else
      readtkn.exe $*
   endif

This does work, but seems quite inelegant.  Also, it limits you to
full lines in readtkn (no single character reads).  Also, you have to
have explicit knowledge of nesting because of the AUTOMATED variable.
Without that, interactive users don't get their prompts until after
they supply the correct answer (sigh).

--- The disclosure

This was actually done in full flower on an Apollo system with their
proprietary Aegis operating system, and proprietary "/com/sh" shell.
After complaining to them about this weirdness, I discovered that it
also happens on un-adulterated UNIX (well, BSD 4.3).  So, if the
problem statement isn't exactly UNIX-ese, I'm sorry.

These sample programs have been run, with the indicated results, on
ssc-bee, a BSD 4.3 VAX-11/785.

My plan is to tell Apollo how I'd like to see it fixed.  Since it
seems broken on UNIX too, maybe we can all benefit.

--- The details

So, now we conclude with the actual file sources.  The file names
should be clear.  The body of each file is indented three spaces, and
each file is terminated with the line ' *** EOF ***'.

test driver script:

   sample << EOF
   AB
   EOF
 *** EOF ***

sample, the script called from above:

   readtkn
   readtkn
 *** EOF ***

results of running the test driver:

   A
   --- END OF FILE ---
 *** EOF ***

desired results, if stdin stayed open:

   A
   B
 *** EOF ***

readtkn.c, the source of readtkn:

   #include <stdio.h>

   main ( argc, argv )
   int argc;
   char **argv;
   {
      int ch;

      /* read a single character through stdio */
      ch = getchar ();
      if ( ch == EOF )
         puts ( "--- END OF FILE ---" );
      else {
         /* echo the character on a line of its own */
         putchar ( ch );
         putchar ( '\n' );
      }
      exit (0);
   }
 *** EOF ***

--- The signature

Please mail me your comments and suggestions.  I'll summarize what
comes in.  Thanks.

Lee Carver
Boeing Aerospace
csnet: lcarver@boeing.com
uucp:  {...}!uw-beaver!ssc-vax!ssc-bee!lee
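
On the stdio pre-fetch concern raised under the proposals: a minimal
sketch, assuming a POSIX-style system, of a reader that turns off
stdio buffering on stdin so that getchar() takes only one byte from
the shared descriptor.  This is an alternative to the lseek idea, not
something taken from readtkn itself.  The names read_one_unbuffered
and offset_after_child_read, and the scratch file, are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read one character from stdin with stdio buffering switched off. */
int read_one_unbuffered(void)
{
    setvbuf(stdin, NULL, _IONBF, 0);     /* must precede any other stdin use */
    return getchar();
}

/*
 * Hypothetical check: a child reads one character from a scratch file
 * holding "AB\n", installed as its stdin.  Returns the shared file
 * offset as the parent sees it afterwards, or -1 on any error.
 */
long offset_after_child_read(void)
{
    char path[] = "/tmp/ubprobeXXXXXX";  /* assumed scratch location */
    int status;
    long off;
    pid_t pid;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                        /* file lives until last close */
    if (write(fd, "AB\n", 3) != 3 || lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* child: make the scratch file stdin, read one character */
        if (dup2(fd, 0) < 0)
            _exit(1);
        _exit(read_one_unbuffered() == 'A' ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* parent: ask where the shared offset ended up */
    off = (long) lseek(fd, 0, SEEK_CUR);
    close(fd);
    return off;
}
```

An offset of 1 would mean the child consumed only the character it
delivered, leaving the rest of the input for the next reader.  The
cost of this design is one read per character, which may matter for a
reader that takes whole lines.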