Path: utzoo!attcan!uunet!ginosko!gem.mps.ohio-state.edu!tut.cis.ohio-state.edu!mailrus!ames!think!kulla!barmar
From: barmar@kulla (Barry Margolin)
Newsgroups: comp.protocols.nfs
Subject: Re: login hangs when server dies
Message-ID: <30535@news.Think.COM>
Date: 30 Sep 89 17:08:24 GMT
References: <1989Sep28.195215.4656@tubsibr.uucp>
Sender: news@Think.COM
Organization: Thinking Machines Corporation, Cambridge MA, USA
Lines: 39

In article <1989Sep28.195215.4656@tubsibr.uucp> petri@tubsibr.UUCP (Stefan Petri) writes:
>Problem: when a nfs-servers dies, the `login' on all of his
>clients will hang somewhere after displaying /etc/motd (saying nfs-server gargel
>not responding, still trying) ; even if neither the user
>nor the system needs (seems to need ? ) the files from that died
>server. ( e.g. remote-mounted man-pages).

We've seen this on Suns running SunOS 4.0.3 and on earlier releases.
And it isn't only the shell that hangs; I've seen "rn" and
"emacsclient" hang as well.

Someone else already mentioned the possibility that a directory
mounted from the dead server is in your PATH.  Another cause is an NFS
mount point (or a symbolic link into an NFS file system) that is a
sibling of any directory on the path from your working directory up to
the root.  This can cause the getwd() function to hang.  Here's why:

To find the name of a directory, getwd() first stat()s "." to get its
inode number.  It then scans ".." and stat()s each entry until it
finds one with the same inode number; that entry's name is the last
component of the path.  It repeats this one level up at a time until
it reaches the root.  If any entry it stat()s along the way is an NFS
mount point, or a symbolic link to a path on an NFS-mounted file
system, the stat() must contact the NFS server, and will hang if the
server is down.
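
For anyone who wants to see the scan concretely, here is a rough sketch
of one level of it.  This is not the SunOS source; name_in_parent() is
a name I made up for illustration, and it handles only a single step
(the real routine loops until "." and ".." are the same directory).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* name_in_parent() is a hypothetical helper, not the real getwd().
 * It performs one step of the scan: find the entry in DIR/.. whose
 * inode and device match DIR, and copy that entry's name into NAME.
 * Returns 0 on success, -1 if nothing matched or on error. */
int name_in_parent(const char *dir, char *name, size_t len)
{
    char parent[PATH_MAX], entry[PATH_MAX];
    struct stat self, sb;
    struct dirent *de;
    DIR *dp;
    int n, found = -1;

    assert(dir != NULL && name != NULL && len > 0);

    if (stat(dir, &self) != 0)
        return -1;
    n = snprintf(parent, sizeof parent, "%s/..", dir);
    if (n < 0 || (size_t)n >= sizeof parent)
        return -1;
    if ((dp = opendir(parent)) == NULL)
        return -1;

    while ((de = readdir(dp)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        n = snprintf(entry, sizeof entry, "%s/%s", parent, de->d_name);
        if (n < 0 || (size_t)n >= sizeof entry)
            continue;
        /* stat() follows symbolic links and has to ask the server
         * about an NFS mount point: this is the call that can hang. */
        if (stat(entry, &sb) != 0)
            continue;
        if (sb.st_ino == self.st_ino && sb.st_dev == self.st_dev) {
            n = snprintf(name, len, "%s", de->d_name);
            if (n >= 0 && (size_t)n < len)
                found = 0;
            break;
        }
    }
    closedir(dp);
    return found;
}
```

Note that every sibling that sorts before the match gets stat()ed, so
a single dead mount point next to your directory is enough to stall
the whole lookup.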

With the automounter, the same scan causes a second problem: even
when all the NFS servers are up, getwd() may take a long time to
complete.  Many of the file systems it encounters may not be mounted
yet, each stat() triggers a mount, and each mount takes a significant
fraction of a second.

I think the solution is for getwd() to use lstat() rather than stat(),
and to keep mount points out of the root, reaching them through
symbolic links instead.  This way, getwd() will only encounter links,
and lstat() shouldn't need to access the target of the link.
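
To illustrate the difference, here is a small sketch of the comparison
done with lstat().  Both function names are hypothetical, and the link
target is a made-up path that is assumed not to exist; it stands in for
a directory on a server that is down.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/* entry_matches_nofollow() is a hypothetical helper.  It returns 1 if
 * PATH is the directory described by SELF, 0 if it is not, and -1 if
 * lstat() fails.  lstat() reports on a symbolic link itself, so the
 * link's target is never looked up. */
int entry_matches_nofollow(const char *path, const struct stat *self)
{
    struct stat sb;

    assert(path != NULL && self != NULL);

    if (lstat(path, &sb) != 0)
        return -1;
    if (S_ISLNK(sb.st_mode))
        return 0;   /* a link is never the directory we are looking for */
    return sb.st_ino == self->st_ino && sb.st_dev == self->st_dev;
}

/* Hypothetical helper for experimenting: create a link whose target is
 * assumed not to exist, as a stand-in for an unreachable server. */
int make_dangling_link(const char *linkpath)
{
    return symlink("/nonexistent/server/export", linkpath);
}
```

With a link made by make_dangling_link(), stat() fails because it tries
to resolve the target, while entry_matches_nofollow() simply returns 0
and the scan moves on.  This only helps for links; an entry that is
itself a mount point would presumably still need the server, which is
why the mount points have to move out of the root as well.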

Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar