Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/5/84; site rlgvax.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxj!houxm!vax135!cornell!uw-beaver!tektronix!hplabs!oliveb!ios!qubix!sun!decwrl!decvax!wivax!cadmus!harvard!seismo!rlgvax!guy
From: guy@rlgvax.UUCP (Guy Harris)
Newsgroups: net.unix-wizards
Subject: Re: Recovery possible from signal aborted write(2) ?
Message-ID: <138@rlgvax.UUCP>
Date: Wed, 26-Sep-84 00:23:22 EDT
Article-I.D.: rlgvax.138
Posted: Wed Sep 26 00:23:22 1984
Date-Received: Tue, 25-Sep-84 21:02:35 EDT
References: <3253@ecsvax.UUCP>
Organization: CCI Office Systems Group, Reston, VA
Lines: 59

> Using systems 3 or 5, is there any way of deterministically restarting
> output to a tty that has been interrupted by a signal?
>
> Scenario:
>	Screen managing type program generating lots of escape sequences
>	also is subject to receiving several signals a minute.  The
>	program wants to buffer its output for efficiency, so it does
>	write(2)s of say 50-100 characters at a time.  When it receives
>	a signal, however, the write returns a -1, rather than indicating
>	the number of characters actually output.  This makes it very
>	hard to guarantee screen integrity while buffering output.
>
> One obvious solution is to forgo buffering.  Any other suggestions?

You could disable interrupts, but that's all you can do in System III or
System V.  4.1BSD, 4.2BSD, and several other systems have added a "hold"
action to "signal" (yes yes, I know, 4.2 actually replaced the whole
signal mechanism).  It is like the "ignore" action, except that it
"holds" any signals that come in while that action is on rather than
discarding them.  Then, when the action is changed back to "catch", the
held signal will come through.  Thus, you just "hold" the interrupts
while the screen is being painted.  However, if you actually want to
catch the signals while the write is occurring, no common UNIX I know of
lets you do this.

The problem here is that signals were originally intended as "traps";
they indicated that some "error" had occurred and that the program should
stop what it's doing and quit (the user hitting their interrupt key is
considered an "error" of this sort).  Then they were shanghaied into
service as software interrupts; unfortunately, they don't work well as
software interrupts.  For one thing, you can't continue an interrupted
system call; 4.xBSD will *restart* an interrupted system call that never
got started, but if the "write", say, had already written some data it
says "the hell with it" and just aborts it.  For another, you can't defer
them inside a critical region (except with the aforementioned "hold"
mechanism).  And, of course, when a signal comes in the signal action is
reset to the default, which usually blows the process away; this means
that if the signals come in fast enough the process will simply (and
mysteriously) die.  (For fun, if your interrupt key is a quickly
repeating key on your terminal, try holding it down while at the shell
level; unless you're on 4.2 or some other system that doesn't do this
reset, you stand a good chance of getting logged out as the shell gets
blown away by a SIGINT before it gets a chance to reset the signal
handler.)

Our office automation system has already run into this problem; if the
user hits their interrupt key while the screen's being painted, they can
lose.  So it is a real problem; the "hold" mechanism will do OK for us
(we only get interrupts from the keyboard, so deferring them while the
screen is painted is no problem) but it may not work for everybody.

Having the "write" return the number of characters actually written out
seems good offhand.  If a "read" moved 0 bytes, however, it could be
mistaken for an EOF.  Perhaps this one would have to be special-cased.
None of the other "slow" system calls have this problem, as far as I
know.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy