Path: utzoo!utgpu!water!watmath!clyde!cbosgd!ncr-sd!matt
From: matt@ncr-sd.SanDiego.NCR.COM (Matt Costello)
Newsgroups: comp.unix.wizards
Subject: Re: Wait, Select, and a SIGCHLD Race Condition
Message-ID: <1944@ncr-sd.SanDiego.NCR.COM>
Date: 12 Dec 87 05:53:50 GMT
References: <5105@sol.ARPA>
Reply-To: matt@ncr-sd.SanDiego.NCR.COM (Matt Costello)
Organization: NCR Corporation, Rancho Bernardo
Lines: 39

In article <5105@sol.ARPA> stuart@cs.rochester.edu writes:
>I need advice (or sympathy) for handling a race condition in 4.3BSD
>flavored UNIX.  Briefly, I want to use wait3 to reap all the dead or
>stopped children of a process, then use select to wait for the first
>new IO or child activity.

I use two methods to get around race conditions in signals.
They are:

1.  If you are not using SIGALRM for something else, have your timeout
    routine re-enable the SIGALRM at 1 second intervals until it is
    turned off in the outer level code.  If the original signal hits
    the timing hole then the second (or third) won't.  The beauty of
    this is that it is usable in any version of UNIX, since it uses no
    features specific to BSD or USG.

    To avoid missing any child processes with SIGCHLD:

	onedied()
	{
		signal(SIGCHLD,SIG_DFL);	/* will infinite loop otherwise */
		signal(SIGALRM,onedied);
		alarm(1);
	}

	signal(SIGCHLD,onedied);
	/* race condition is here... */
	numfds = select();		/* or read(), or msgrcv() */
	alarm(0);

2.  For select() or any operation where the process is waiting on
    incoming IO, you can have the signal routine send a dummy message
    that will cause the select() to return immediately.  Rather than
    aborting the operation, find some way to make it terminate
    normally.  This works wonderfully for SYSV message queues, since it
    is perfectly legal to send a zero length message.
-- 
Matt Costello	+1 619 485 2926
{sdcsvax,cbosgd,pyramid,nosc.ARPA}!ncr-sd!matt