Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!gatech!hao!oddjob!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.unix.questions
Subject: Re: create_process() vs vfork()
Message-ID: <7428@mimsy.UUCP>
Date: Thu, 9-Jul-87 23:04:46 EDT
Article-I.D.: mimsy.7428
Posted: Thu Jul  9 23:04:46 1987
Date-Received: Sun, 12-Jul-87 08:51:15 EDT
References: <7737@brl-adm.ARPA> <1186@ius2.cs.cmu.edu> <8174@utzoo.UUCP> <129@xyzzy.UUCP>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 69

In article <129@xyzzy.UUCP> throopw@xyzzy.UUCP (Wayne A. Throop) writes:
>Further, I assume that COW fork() consuming no "additional CPU time"
>relative to vfork() means that the process in question writes to no
>pages of memory whatsoever.

No, it gets to write to one or two stack pages.

>... in the trivial sense that one could add a busy loop to a vfork()
>implementation to slow it down to be equal to or slower than a fork()
>implementation, I agree with Chris and disagree with my statement above.

This is not necessary.

>But if the MIPS R2000 really has no additional cost of managing the
>split of address spaces and copy of a page or two of data that is
>written to by the child, I'll be very surprised indeed.

Since the virtual-to-physical translation on the R2000 is done by
kernel code, one need not do anything special to split an address
space: the two processes can simply point to the same mappings.  The
child part of a fork usually reads about like this:

	register int pid;

	if ((pid = fork()) == 0) {		/* child */
		dup2(a, 0);			/* a becomes stdin */
		dup2(b, 1);			/* b becomes stdout */
		signal(SIGINT, SIG_IGN);
		execv(p, v);
		_exit(1);			/* only if execv fails */
	}
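
For anyone who wants to run this, here is one way to flesh the
fragment out into a complete function.  It is a sketch only:
spawn_redirected() and its in_fd/out_fd parameters are names made up
for illustration, and the parent-side waitpid() and the 126/127 exit
codes are additions.  The child branch is the same sequence as above.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from any library.  Forks, points the
 * child's stdin/stdout at in_fd/out_fd, ignores SIGINT in the child,
 * and execs path.  Returns the child's exit status, or -1 if fork or
 * wait failed or the child did not exit normally.
 */
int
spawn_redirected(const char *path, char *const argv[], int in_fd, int out_fd)
{
	pid_t pid;
	int status;

	assert(path != NULL && argv != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: only locals (registers and stack) are touched. */
		if (in_fd != 0 && dup2(in_fd, 0) < 0)
			_exit(126);
		if (out_fd != 1 && dup2(out_fd, 1) < 0)
			_exit(126);
		signal(SIGINT, SIG_IGN);
		execv(path, argv);
		_exit(127);	/* reached only if execv failed */
	}
	/* Parent: wait for the child and report how it exited. */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with in_fd 0 and out_fd 1 leaves the child's standard
input and output alone; passing the write end of a pipe as out_fd
captures the child's output in the parent.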

All of this writes only to registers and the stack (not static
data), so the trick is that in fork(), you pre-copy one or two pages
of stack space, and share all the rest.  The advance copies are
already writable, so there are no extra faults before the exec().
The parent process is likely to write to its data space, but you
prevent it from running for a moment, until the child execs or until
some (short) timeout has expired.
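
The pre-copy trick can be illustrated with a toy model.  This is
emphatically not kernel code: the eight-page address space, the
struct names, and the functions space_init(), cow_fork(), and store()
are all invented for the sketch.  It only counts the write faults a
process would take after a copy-on-write fork, with and without the
advance stack copies.

```c
#include <assert.h>
#include <stdbool.h>

#define NPAGES   8	/* toy address space; the top pages are the stack */
#define MAXFRAME 256

struct pte   { int frame; bool writable; };
struct space { struct pte pt[NPAGES]; int faults; };

static int refcnt[MAXFRAME];	/* how many spaces map each frame */
static int next_frame;

int
frame_alloc(void)
{
	assert(next_frame < MAXFRAME);
	refcnt[next_frame] = 1;
	return next_frame++;
}

/* Give a fresh process private, writable pages. */
void
space_init(struct space *s)
{
	for (int i = 0; i < NPAGES; i++) {
		s->pt[i].frame = frame_alloc();
		s->pt[i].writable = true;
	}
	s->faults = 0;
}

/* Copy-on-write fork that copies the top `precopy` pages up front. */
void
cow_fork(struct space *parent, struct space *child, int precopy)
{
	for (int i = 0; i < NPAGES; i++) {
		if (i >= NPAGES - precopy) {
			/* advance copy: private and already writable */
			child->pt[i].frame = frame_alloc();
			child->pt[i].writable = true;
		} else {
			/* shared: both sides lose write permission */
			child->pt[i].frame = parent->pt[i].frame;
			refcnt[parent->pt[i].frame]++;
			parent->pt[i].writable = false;
			child->pt[i].writable = false;
		}
	}
	child->faults = 0;
}

/* Simulate a store: a read-only page faults, and is copied if shared. */
void
store(struct space *s, int page)
{
	struct pte *p = &s->pt[page];

	if (p->writable)
		return;
	s->faults++;
	if (refcnt[p->frame] > 1) {
		refcnt[p->frame]--;
		p->frame = frame_alloc();
	}
	p->writable = true;
}
```

In this model a child that stores into its top two pages takes two
faults after cow_fork(&parent, &child, 0) and none after
cow_fork(&parent, &child, 2).  A parent that stores into a shared
data page takes a fault either way, which is the reason for holding
the parent off until the exec.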

>... any virtual memory system must spend CPU or memory or
>silicon to achieve the *effect* of a PTE.  The cost cannot be zero.

True, but the cost may be the same whether the pretend-PTE is
being shared via a vfork style fork or via a copy-on-write style
fork().  This means that the cost for the vfork style sharing is
greater than it might have to be if the hardware were different,
but who cares?  We were trying to eliminate vfork anyway.

>Again, my claim is that *if* one wishes to make an optimized process
>create, the way to do it is to use a create process call for cases where
>it is possible and profitable . . . .

Certainly.  I am just playing devil's advocate anyway.  All the
hardware *I* have around here makes copy-on-write fork inherently
more expensive than vfork fork.  I do not know whether the difference
is noticeable.  (Maybe I should hack up copy-on-write this evening
just to find out :-) .)  (No, have to get 4.3BSD working on that
8250 first.  No trouble, just a few thousand lines of BI code to
write, working without manuals as usual [*].  Now where did I put
those Mach tapes? . . .)

-----
[*] Actually, DEC may be reversing their trend:  The KDB50 manuals
that came with the machine actually have a substantial fraction of
the information I need.  This is, at least, a good sign.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris