Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!apple!sun-barr!newstop!sun!terra!brent
From: brent%terra@Sun.COM (Brent Callaghan)
Newsgroups: comp.protocols.nfs
Subject: Re: login hangs when server dies (finally fixed)
Message-ID: <125703@sun.Eng.Sun.COM>
Date: 3 Oct 89 20:52:39 GMT
References: <1989Sep28.195215.4656@tubsibr.uucp> <30535@news.Think.COM>
Sender: news@sun.Eng.Sun.COM
Lines: 58

In article <30535@news.Think.COM>, barmar@kulla (Barry Margolin) writes:
> Someone else already mentioned the possibility that a directory
> mounted from the dead server is in your PATH.  Another problem is
> NFS-mounted files adjacent to one of the directories between your
> working directory and the root.  This can cause the getwd() function
> to hang.  Here's why:
> 
> In order to find out the name of a directory, getwd() first stat()s
> "." to find out its inode#.  Then it scans ".." and stat()s each file
> until it finds one with the same inode#.  If one of these files is an
> NFS mount point, or a symbolic link to a path on an NFS-mounted file
> system, the stat() must contact the NFS server, and will hang if it's
> down.

In SunOS 4.0 the getwd() algorithm was changed.  If it crosses a
mountpoint in its walk up the file tree it takes a peek in /etc/mtab.
It looks for a mount entry with the same device id and when it finds
the mountpoint it just steals the path from the /etc/mtab entry
and prepends it to the current path.
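
For illustration only, here is a minimal sketch of that lookup.  It is
Python for brevity and is not the libc source; the function names are
made up, and it assumes the usual mtab field order (filesystem,
directory, type, options, freq, passno).  The stat function is passed
in as a parameter so the logic can be exercised without touching real
mountpoints.

```python
import os

def parse_mtab(text):
    """Return (mount_dir, options) pairs from mtab-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank, short, or comment lines.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        entries.append((fields[1], fields[3]))
    return entries

def find_mountpoint(dev, entries, stat=os.stat):
    """Return the mount directory whose device id equals dev, or None.

    Every candidate is stat'ed, which is exactly where a dead NFS
    server can make the caller hang.
    """
    for mount_dir, _opts in entries:
        try:
            if stat(mount_dir).st_dev == dev:
                return mount_dir
        except OSError:
            continue
    return None
```

Once a match is found, the caller would prepend the returned directory
to the path components it has already collected below the mountpoint.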

The good news about this is that you don't have to walk all the way
up to the root and can avoid stat'ing mountpoints in "/" or "/usr".
The bad news is that you stat *all* the mountpoints in /etc/mtab - so
the hanging problem could be worse than before.  To work around this
problem the device ids for all the mountpoints in /etc/mtab are
cached in a file /tmp/.getwd.  As long as the /etc/mtab is not
updated (no mounts or unmounts) the cache can be consulted for
pathname/device-id pairs instead of risking lots of stat'ing of
mountpoints.

A problem with this scheme is that the getwd cache becomes invalid
whenever the /etc/mtab is modified.  If you use the automounter the
/etc/mtab can be updated so frequently that the getwd cache is
almost useless.
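
A rough sketch of the kind of freshness check such a cache needs is
below.  Comparing modification times is an assumption made for the
example; it is not a statement of how the 4.0 library decides that the
cache is stale, and the function name is hypothetical.

```python
import os

def cache_is_fresh(mtab_path, cache_path):
    """True if the cache exists and is at least as new as the mtab.

    Assumption: staleness is judged by comparing modification times.
    """
    try:
        return os.stat(cache_path).st_mtime >= os.stat(mtab_path).st_mtime
    except OSError:
        # Missing or unreadable file: treat the cache as unusable.
        return False
```

With a check like this, every mount or unmount pushes the mtab's
timestamp past the cache's, so the next caller has to rebuild the cache
by stat'ing the mountpoints again.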

In SunOS 4.1 (real soon now) the /tmp/.getwd cache is gone.  The
device id for each mountpoint is now kept in the /etc/mtab file
itself.  It lives in the mount options string as a hex number
following the string "dev=".  Commands like mount and automount
that append entries to the /etc/mtab just stat() the new mount
(not much chance of hanging) and insert the "dev=" stuff into
the mount option string.  This is not user-visible - unless
you cat the /etc/mtab.  Since the device id is invariant for
the lifetime of a mount it makes sense for the device id to
live in the /etc/mtab with all the other mount-invariant
information.  Any program like getwd(), df or find that likes
to stat mountpoints to get a devid can avoid the stat if
the "dev=" is present.
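
As a sketch of how a program might use this (Python for brevity, with
hypothetical function names, and assuming the options are a
comma-separated string in which "dev=" is followed by hex digits):

```python
import os

def dev_from_options(opts):
    """Return the device id from a 'dev=' option as an int, or None."""
    for opt in opts.split(","):
        if opt.startswith("dev="):
            try:
                return int(opt[4:], 16)   # the value is a hex number
            except ValueError:
                return None
    return None

def mount_device(mount_dir, opts, stat=os.stat):
    """Device id for a mount: from 'dev=' if present, else by stat()."""
    dev = dev_from_options(opts)
    if dev is not None:
        return dev                # no stat(), so nothing to hang on
    return stat(mount_dir).st_dev # fallback when 'dev=' is missing
```

For example, dev_from_options("rw,hard,dev=ff01") gives 0xff01, while
an option string with no "dev=" gives None and the caller falls back
to stat().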

Now getwd() doesn't need to stat() mountpoints at all (unless the dev=
is missing for some reason).  This seems to be working well - I 
think we've got that one licked (finally).

	Brent

Made in New Zealand -->  Brent Callaghan  @ Sun Microsystems
			 uucp: sun!bcallaghan
			 phone: (415) 336 1051