Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!brl-adm!adm!narten@purdue.EDU
From: narten@purdue.EDU (Thomas Narten)
Newsgroups: comp.unix.wizards
Subject: Re: A possible network bug in Sun unix?
Message-ID: <1640@brl-adm.ARPA>
Date: Thu, 18-Dec-86 10:01:07 EST
Article-I.D.: brl-adm.1640
Posted: Thu Dec 18 10:01:07 1986
Date-Received: Thu, 18-Dec-86 23:34:37 EST
Sender: news@brl-adm.ARPA
Lines: 96

This may be a feature of Sun UNIX, but is probably not restricted to
it. It is caused by two problems:

1) Unix has a keepalive option on sockets that times out (breaks)
   connections if the peer in the connection goes away. For TCP,
   "going away" is defined as not having received any packets from
   the peer in X amount of time. Rlogind uses this option.

2) Sun diskless machines reboot much more quickly than normal Unix
   machines, because they don't have large disks for fsck to churn
   away on. In particular, they are back up and running before old
   connections have timed out due to (1).

(1) is implemented by running a timer that expires whenever no packets
have been exchanged for a certain period of time. When the timer
expires, TCP sends a one byte data segment that is outside of its send
window (i.e. it already has an ACK for that sequence number). The peer
TCP, on receiving the segment, notes that it already has the data and
sends back an ACK for the sequence number that it expects to see. The
client TCP gets that ACK, and updates its timer, indicating that the
connection is still alive. The connection eventually breaks if no ACKs
are received.

This works just fine as long as both TCPs are still there, or if one
end of the TCP connection goes away in the sense that the host is
unreachable. On the other hand, if one machine crashes and reboots
quickly, the following occurs:

The client TCP sends a keepalive packet, which the peer TCP receives.
Now, however, there is no protocol control block for that connection,
so the peer TCP sends back a RESET.
The client TCP receives the packet, updates its keepalive timer
(hum... I got a packet, the connection must still be fine), then
checks the sequence numbers that were ACKed. The ACK is outside of its
receive window and there was no data sent in the segment, so TCP drops
the packet, ignoring the RESET. (This follows the TCP spec.)

>Since the other end of the rlogin will stick around until some I/O
>forces it to recognize the connection is broken (we just cat'ed to the
>pty on the remote system and it closed),

This results from the RESET being ignored, since it is not within the
receive window. If you force the TCP to send real data, the ACK that
gets returned will be within the receive window, and the RESET causes
the connection to break.

One workaround is to change the line in tcp_input(...):

	tp->t_timer[TCPT_KEEP] = TCPTV_KEEP;

to something like:

	if ((tiflags&TH_RST) == 0)
		tp->t_timer[TCPT_KEEP] = TCPTV_KEEP;

With this change an ignored RESET no longer restarts the keepalive
timer, so the connection will eventually time out. Both 4.2 and 4.3
BSD suffer from this problem.

>1) You rlogin from your sun workstation (Sun-3/50 in this case) to another
>   system on the network.
>2) Your sun workstation crashes.
>3) After rebooting you try to rlogin to the same other system again and
>   you can't even after multiple tries.

I tried to duplicate your behavior on our Sun machines running NFS3.2,
trying to connect to 4.2, 4.3 and NFS3.0 machines. I don't have a 3.0
machine handy that I can crash at will. I would rlogin to host A,
reboot the workstation, and rlogin to A again. Each time, I was able
to rlogin successfully. Each connection used the same port numbers.
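
Returning to the ignored RESET for a moment: the difference between the
keepalive case and the real-data case can be sketched in a few lines of
C. This is only an illustration, not the kernel code. It assumes the
keepalive probe carries ACK = rcv_nxt - 1 (my reading of "the ACK is
outside of its receive window"), it uses the RFC 793 rules that a RESET
for a nonexistent connection takes its sequence number from the ACK
field of the offending segment and that a zero-length segment is
acceptable only if its sequence number falls inside the receive window,
and all of the function names are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Wraparound-safe sequence comparisons (same idea as BSD's SEQ_LT/SEQ_GEQ). */
bool seq_lt(tcp_seq a, tcp_seq b)  { return (int32_t)(a - b) < 0; }
bool seq_geq(tcp_seq a, tcp_seq b) { return (int32_t)(a - b) >= 0; }

/*
 * RFC 793 acceptability test for a zero-length segment:
 *   window == 0:  SEG.SEQ == RCV.NXT
 *   window  > 0:  RCV.NXT <= SEG.SEQ < RCV.NXT + RCV.WND
 */
bool zero_len_segment_acceptable(tcp_seq seg_seq, tcp_seq rcv_nxt,
                                 uint32_t rcv_wnd)
{
    if (rcv_wnd == 0)
        return seg_seq == rcv_nxt;
    return seq_geq(seg_seq, rcv_nxt) && seq_lt(seg_seq, rcv_nxt + rcv_wnd);
}

/*
 * RFC 793 reset generation for a segment that carries an ACK but matches
 * no connection: the RESET's sequence number is the incoming ACK field.
 */
tcp_seq rst_seq_for_unknown_connection(tcp_seq incoming_ack)
{
    return incoming_ack;
}

/*
 * Hypothetical helper: the surviving end sent a segment with ACK field
 * ack_sent to a rebooted peer. Will it honor the RESET that comes back?
 */
bool rst_honored(tcp_seq ack_sent, tcp_seq rcv_nxt, uint32_t rcv_wnd)
{
    tcp_seq rst_seq = rst_seq_for_unknown_connection(ack_sent);
    return zero_len_segment_acceptable(rst_seq, rcv_nxt, rcv_wnd);
}
```

With rcv_nxt = 1000 and a 4096 byte window, rst_honored(999, 1000, 4096)
is false (the keepalive case, under the assumption above) while
rst_honored(1000, 1000, 4096) is true (the real-data case), which is
the behavior described earlier.
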

Note that under normal conditions, the following packet exchange takes
place:

	A                                B

	send SYN, SEQ=n,ACK=0            (thinks connection is established)

	                                 gets SYN, sends back ACK=m,SEQ=o

	gets ACK, notices sequence
	number is not what it
	expects & replies with:
	ACK=0,SEQ=m,RESET

	                                 gets RESET, drops connection
	                                 and sends back
	                                 RESET,ACK=m,SEQ=o

At this point the "old" rlogin has gone away, and the next SYN will
cause the connection to become established properly.

I suppose that things could break if the sequence number chosen by A
was the same as B was expecting, but that would be an awful
coincidence. It is the case, however, that when a machine reboots, it
starts with an initial sequence number of 0. If your machine crashes
several times in quick succession, it is possible that the sequence
numbers on the peer connection could also be very low. Still, I find
it hard to believe that this is the cause of the problem.

Do you have any way of determining what sequence numbers are involved
in the connections, or what sort of packets are floating around for
the connection in question?

Thomas Narten
narten@purdue.EDU or {ihnp4, allegra}!purdue!narten