Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!uwvax!oddjob!gargoyle!ihnp4!cbosgd!osu-cis!tut!lvc
From: lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani)
Newsgroups: comp.unix.xenix,comp.unix.questions
Subject: Re: Need help with SCO: the process that would not die.
Message-ID: <2345@tut.cis.ohio-state.edu>
Date: Fri, 27-Nov-87 17:01:52 EST
Article-I.D.: tut.2345
Posted: Fri Nov 27 17:01:52 1987
Date-Received: Mon, 30-Nov-87 00:42:22 EST
References: <116@citcom.UUCP> <911@csun.UUCP>
Organization: Ohio State Computer & Info Science
Lines: 30
Keywords: ps kill process SCO xenix
Summary: thats the attnix way
Xref: mnetor comp.unix.xenix:1225 comp.unix.questions:5127

In article <911@csun.UUCP>, abcscnge@csun.UUCP (Scott Neugroschl) writes:
>
> I realize this isn't a Xenix question (from me), but we have a similar
> problem with our Zilog S8000 running ZEUS 3.2 (Zilog's version of SYS III)
> at work (not CSUN). It appears to be related to signal processing. Our
> in-house guru tells us that the process is "locked on I/O", implying that
> the signal really screwed up the kernel data. Recommend you look at the
> signal handling logic if possible, and ask the people causing the lockup
> if they have done an interrupt (ctrl-c or DEL) just before it locked...
>
> Any wizards out there know of such bugs in either kernel (xenix or zilog)?
>
> Scott "The Pseudo-Hacker" Neugroschl

It's not a bug. This is the way UNIX and all derivatives (that I know
of) are designed. Whether this is a good design is another question.

If the operating system is performing certain I/O operations on behalf
of your program (e.g. a close), and the operation does not complete (for
whatever reason - usually a hardware problem), your program won't die,
and can't be killed with a signal, not even SIGKILL. You might adb the
os and fiddle some bits, but I don't recommend it.
A reboot is the only sure way to make it go away, though other tricks
sometimes work depending on the circumstances.

The wchan is an address that can be used to identify the offending
hardware: a tty structure, a tape, or a network device, for example.
A local guru should be able to tell you what device corresponds to the
address. If he or she can't, he or she isn't much of a guru.

This is one area where UNIX is particularly weak. Hardware failures
ought to be handled more robustly, and most certainly when they involve
non-critical devices. I don't see any hope soon for a better strategy.
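
For anyone who wants to try the wchan lookup described above, here is a
rough sketch in Python. It assumes a modern Linux box whose ps accepts
"ps -eo pid,stat,wchan,comm", where a STAT beginning with "D" marks an
uninterruptible sleep and WCHAN is printed as a kernel symbol name
instead of a raw address. That is not what the Xenix or ZEUS ps prints.
The function names, the sample listing, and the "tape_wait" wait
channel are all made up for illustration.

```python
# Sketch only: pick out processes in uninterruptible sleep from text shaped
# like the output of `ps -eo pid,stat,wchan,comm` (assumed Linux procps format).

# Hypothetical listing; "tape_wait" is an invented wait channel name.
SAMPLE = """\
  PID STAT WCHAN          COMMAND
    1 Ss   do_epoll_wait  init
  412 D    tape_wait      tar
  977 S+   -              sh
"""


def parse_ps(text):
    """Turn the ps listing into a list of dicts, skipping the header line."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        parts = line.strip().split(None, 3)  # command name may contain spaces
        if len(parts) < 4:
            continue  # ignore lines that do not have all four columns
        pid, stat, wchan, comm = parts
        rows.append({"pid": int(pid), "stat": stat, "wchan": wchan, "comm": comm})
    return rows


def stuck_processes(rows):
    """Keep only processes whose state starts with D (uninterruptible sleep)."""
    return [row for row in rows if row["stat"].startswith("D")]


if __name__ == "__main__":
    for row in stuck_processes(parse_ps(SAMPLE)):
        print(row["pid"], row["comm"], "waiting in", row["wchan"])
```

To use it on a live system, replace SAMPLE with the real output of the
ps command above. The wait channel only tells you where to start
looking; mapping it to a particular device still takes someone who
knows the kernel.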