Path: utzoo!attcan!uunet!legato!mojo
From: mojo@legato (Joseph Moran)
Newsgroups: comp.unix.wizards
Subject: Re: csh pgrp problem
Message-ID: <920@legato.LEGATO.COM>
Date: 12 Aug 89 20:50:32 GMT
References: <712@skye.ed.ac.uk>
Reply-To: mojo@legato (Joseph Moran)
Organization: Legato Systems, Inc., Palo Alto, CA
Lines: 75

In article <712@skye.ed.ac.uk> richard@aiai.UUCP (Richard Tobin) writes:
>Running under SunOS 4, we occasionally encounter an annoying problem:
>a pipeline (eg cat /etc/passwd | more) will stop, with the message
>
> Stopped (tty output)
>
>I believe I've found the problem, what I want to know is whether there's
>a simple fix, perhaps in more recent versions of csh than we have here.

Unfortunately, the `simple' fix I know of is to continue to use vfork
with csh...

>I had some problems when I first compiled this shell for SunOS 4, and
>the simplest solution seemed to be to #undef VFORK, since fork() in
>SunOs 4 does copy-on-write.

Here's some history on this stuff. While I was working at Sun, I did
most of the work on the new VM system. When the new VM project was
started, we believed that we could just have the vfork system call do
a standard copy-on-write fork for binary compatibility. Then we could
retire vfork from the C library, since vfork was a hack marked for
deletion.

When I ran a prototype new VM kernel on my workstation, I occasionally
ran into the "Stopped (tty output)" problem when using csh and pipes.
I spent some time tracking this mess down. I found that if I compiled
csh with VFORK not defined, csh would occasionally fail the same way
that it did running on the new VM system with vfork replaced by fork.
From this I concluded that I was seeing the result of a long-standing
csh bug that was never noticed at Berkeley (where both csh and vfork
originated) since vfork was always used there for csh.
Folklore has it that vfork was created solely for csh because of the
performance costs of csh doing Unix forks in a paged environment
without copy-on-write.

After tracking down the race condition in setting the process group
stuff in csh, I decided that it was too hard for me personally to fix
(I was doing kernel VM work, not csh support). As time went on, we
found more places that depended on the subtle effects of vfork.
Eventually it was decided that SunOS needed to continue to support
vfork even after we had a copy-on-write fork, just because of a few
$%$#$!* programs that either took advantage of the vfork semantics
(e.g., csh using vfork to keep exec hash statistics) or accidentally
depended on them (e.g., the csh process group problem when not using
vfork).

>What seems to be happening is that the shell forks twice (once for cat
>and once for more). Each child sets its process group to the jobid,
>which is cat's process id. The first child sets the terminal process
>group to the same thing. However, there's nothing to guarantee that
>the first child sets the terminal process group before the second child
>starts running, and perhaps once in 20 times it doesn't. In these
>cases the ioctls performed by more cause a SIGTTOU.

Yes - this is the problem that I found. And this is one of the reasons
why SunOS 4.0 csh still uses vfork even though fork now uses
copy-on-write.

>Presumably using vfork() forces things to happen in the right order.

Exactly - when using vfork the child process gets to run first and
"borrow the address space" of the parent until the child execs or
exits. Only after the child execs or exits does the parent get its
address space back and get to run again.

I think that the general lesson to be learned here is to not introduce
"temporary hack system calls", because it can be hard to get rid of
them later: some important program(s) will either accidentally or
consciously depend on the (subtle effects of that) hack.
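
For readers who want to see the ordering concretely, here is a minimal C
sketch of the "child runs first, parent resumes after exec or exit"
behaviour described above, built from plain fork() and a close-on-exec
pipe. It is not taken from csh or SunOS and it does not share the address
space the way vfork does. The names fork_child_first, enter_own_group and
demo_group_set_before_parent_resumes are hypothetical, and it uses the
POSIX setpgid() call rather than the 4.2BSD-era interfaces. The terminal
process group step (tcsetpgrp(), or the TIOCSPGRP ioctl) would sit in the
same child-side setup, but is left out here so the sketch runs without a
controlling terminal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child-side setup: make the child the leader of a new
 * process group, as the first process of a pipeline would be. A real
 * shell would also hand the terminal to this group at this point;
 * that is omitted so the sketch runs without a controlling tty. */
static void enter_own_group(void)
{
    (void)setpgid(0, 0);
}

/* Hypothetical helper: fork, run setup() in the child, then exec argv.
 * The parent blocks until the child has exec'd or exited, so setup()
 * is guaranteed to have finished before the parent continues.
 * Returns the child's pid, or -1 on error. */
static pid_t fork_child_first(void (*setup)(void), char *const argv[])
{
    int gate[2];
    pid_t pid;
    char byte;

    assert(argv != NULL && argv[0] != NULL);
    if (pipe(gate) == -1)
        return -1;
    /* Close-on-exec: the write end closes by itself when the child execs. */
    if (fcntl(gate[1], F_SETFD, FD_CLOEXEC) == -1 || (pid = fork()) == -1) {
        close(gate[0]);
        close(gate[1]);
        return -1;
    }
    if (pid == 0) {
        close(gate[0]);
        if (setup != NULL)
            setup();
        execvp(argv[0], argv);
        _exit(127);     /* exec failed; exiting closes the write end too */
    }
    close(gate[1]);
    /* read() returns 0 (EOF) once no write end is left open. */
    while (read(gate[0], &byte, 1) == -1 && errno == EINTR)
        ;
    close(gate[0]);
    return pid;
}

/* Demo: returns 1 if the child was already in its own process group by
 * the time the parent resumed, 0 if not, -1 on error. */
int demo_group_set_before_parent_resumes(void)
{
    char *const argv[] = { "sleep", "5", NULL };
    pid_t pid = fork_child_first(enter_own_group, argv);
    int ok;

    if (pid == -1)
        return -1;
    ok = (getpgid(pid) == pid);
    kill(pid, SIGKILL);         /* the demo child is no longer needed */
    waitpid(pid, NULL, 0);
    return ok;
}
```

With a helper like this, a shell starting "cat /etc/passwd | more" would
not fork the second process until the first had finished its process
group setup, which is the ordering that vfork supplied by accident. The
other common approach, having the parent and every child each make the
same process group calls so that the order no longer matters, is not
shown here.
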

Joseph Moran
Legato Systems Inc.
260 Sheridan Avenue
Palo Alto, CA 94306
(415) 329-7886
mojo@legato.com or {sun,uunet}!legato!mojo