Path: utzoo!attcan!uunet!wuarchive!gem.mps.ohio-state.edu!apple!oliveb!mipos3!omepd!intelob!rjh
From: rjh@intelob.intel.com (Bob Hathaway)
Newsgroups: comp.protocols.tcp-ip
Subject: TCP Urgent Data Handling
Message-ID: <4982@omepd.UUCP>
Date: 27 Sep 89 16:48:12 GMT
Sender: news@omepd.UUCP
Reply-To: rjh@intelob.UUCP (Bob Hathaway)
Organization: Intel Corp., Hillsboro
Lines: 70



I'm implementing TCP urgent data handling for our product line and have
discovered some ambiguous semantics.  This implementation will support
multiple transport layer interfaces, including a Unix socket layer, and must
be able to communicate with other TCP/IP implementations.  However, it
appears that BSD Unix doesn't implement the TCP specification as I would
expect.  I'd appreciate hearing from any internet experts, or being referred
to any existing specifications, which can clarify urgent data handling.


TCP RFC 793 and the MIL STD do not offer precise semantics for urgent
data handling.  Single-byte messages are simple, but larger messages seem
to be poorly defined.  For example, Ultrix assumes the first byte of a
multi-character message is the urgent byte, while 4.3 BSD assumes the last
byte is.  Also, 4.3 breaks large urgent messages into several segments, each
with the URG bit set and the urgent pointer pointing to just past the data
*in that segment*.  The receiver will believe each segment is a separate
urgent message, and each segment will overwrite the previously saved urgent
byte unless inlining is specified.  This behavior seems erroneous.
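
To make the effect concrete, here is a minimal sketch in C of the receiver
behavior described above.  It is not BSD code; the function name and the
single saved-byte model are assumptions made for illustration only.  It
walks a message in MSS-sized pieces, treats the byte just before each
segment's urgent pointer as "the" urgent byte, and counts how many times an
urgent mark is recorded.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of per-segment urgent handling without inlining.
 * Each segment carries URG with UP just past its own data, so the
 * receiver saves the last byte of every segment in turn.
 * Returns the number of urgent marks seen; *saved holds the byte that
 * survives.  Returns 0 if mss is 0.
 */
static size_t simulate_per_segment_urgent(const uint8_t *msg, size_t msg_len,
                                          size_t mss, uint8_t *saved)
{
    size_t marks = 0;
    size_t off = 0;

    if (mss == 0)
        return 0;

    while (off < msg_len) {
        size_t chunk = msg_len - off;
        if (chunk > mss)
            chunk = mss;

        /* UP points one past this segment's data; save the byte before it. */
        *saved = msg[off + chunk - 1];
        marks++;
        off += chunk;
    }
    return marks;
}
```

For a 3000-byte message and a 1024-byte MSS this records three urgent marks,
and only the final byte of the message survives as the saved urgent byte.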

A more correct interpretation of the TCP specification for multi-segment
urgent messages seems to be to set the URG bit on the first segment only
and to set the urgent pointer to one byte past the last byte of the entire
multi-segment urgent message.  The transport service will treat the entire
message as urgent data, allowing the socket layer to extract a single
byte from it if necessary.  Future socket implementations will hopefully
conform more closely to the TCP specification.  With this interpretation,
a receiver sets the TCB variable RCV.UP = SEG.UP when the URG bit is
detected, and arriving data up to *RCV.UP* is assumed to be urgent.
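
The receiver side of this interpretation might be sketched in C roughly as
follows.  The struct layout and function names are hypothetical, and the
sketch assumes RCV.UP is kept as an absolute sequence number computed as
SEG.SEQ + SEG.UP (the urgent pointer being an offset from the segment's
sequence number).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical minimal TCB holding only the fields used here. */
struct tcb {
    uint32_t rcv_up;    /* RCV.UP: sequence number just past the urgent data */
    bool     urg_valid; /* true once an urgent pointer has been recorded */
};

/* Modular sequence-number comparison: true if a precedes b. */
static bool seq_lt(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Called when a segment arrives with URG set. */
static void tcb_note_urgent(struct tcb *t, uint32_t seg_seq, uint16_t seg_up)
{
    t->rcv_up = seg_seq + seg_up;   /* assumption: UP is relative to SEG.SEQ */
    t->urg_valid = true;
}

/* How many of the len bytes starting at seq fall before RCV.UP? */
static uint32_t urgent_bytes(const struct tcb *t, uint32_t seq, uint32_t len)
{
    uint32_t remaining;

    if (!t->urg_valid || !seq_lt(seq, t->rcv_up))
        return 0;

    remaining = t->rcv_up - seq;
    return remaining < len ? remaining : len;
}
```

Under these assumptions only the first segment needs to carry URG; later
segments are classified as urgent purely by comparing their sequence numbers
against the stored RCV.UP.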

For example, this interpretation of the TCP specification will result in:

             URGENT MESSAGE		  SEGMENTS
	     ==============               ========
	      					URG=1, UP=3000 -+
           m	|-------|	          m     |-------|	|
		|	|	     		|	|	|
		|	|	     m + 1023	|-------|	|
		|	|			 	 	|
		|	|		        URG=UP=0	|
		~	~	     m + 1024   |-------|	|
		|	|			|	|	|
		|	|	     m + 2047   |-------|	|
		|	|					|
		|	|		        URG=UP=0	|
		|	|	     m + 2048   |-------|	|
     m + 2999	|-------|	     		|	|	|
				     m + 2999   |-------|	|
							<-------+

PSH will also be set on all outgoing segments. 
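
The sender side of the diagram could be sketched in C as below.  This is
only an illustration of the proposed scheme: the header struct and the
planning function are hypothetical, and the sketch assumes the whole urgent
message fits in the 16-bit urgent pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-segment header fields relevant to urgent data. */
struct seg_hdr {
    uint32_t seq;   /* sequence number of the first byte in the segment */
    uint16_t len;   /* number of data bytes in the segment */
    uint16_t up;    /* urgent pointer, meaningful only when urg is set */
    bool     urg;
    bool     psh;
};

/*
 * Split an urgent message of msg_len bytes into segments of at most mss
 * bytes.  URG is set on the first segment only, with UP equal to the
 * length of the whole message; PSH is set on every segment.
 * Returns the number of headers written, or 0 if the message is empty,
 * mss is 0, the message exceeds the 16-bit UP, or out[] is too small.
 */
static size_t plan_urgent_segments(uint32_t start_seq, uint32_t msg_len,
                                   uint16_t mss, struct seg_hdr *out,
                                   size_t max_out)
{
    size_t n = 0;
    uint32_t off = 0;

    if (msg_len == 0 || mss == 0 || msg_len > UINT16_MAX)
        return 0;

    while (off < msg_len) {
        uint32_t chunk = msg_len - off;

        if (n == max_out)
            return 0;
        if (chunk > mss)
            chunk = mss;

        out[n].seq = start_seq + off;
        out[n].len = (uint16_t)chunk;
        out[n].urg = (off == 0);
        out[n].up  = out[n].urg ? (uint16_t)msg_len : 0;
        out[n].psh = true;

        off += chunk;
        n++;
    }
    return n;
}
```

With a 3000-byte message and a 1024-byte MSS this yields three segments of
1024, 1024 and 952 bytes, matching the diagram: URG=1, UP=3000 on the first
and URG=UP=0 on the other two.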

For reliability, SEG.UP would have to be constraint-checked.  Segments that
arrive after the first segment, but within the original urgent message, with
the URG bit set and a new UP would also have to be handled, for example if
the second or third segment above arrived with URG=1, UP=new value.  Such
updates could be treated as errors, by sending a RST and logging the event,
or as correct, by updating RCV.UP.  We opt to treat UP updates within a
message as an error condition and disconnect with a RST, because they
indicate the peer is out of sync.
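
The check we have in mind might look roughly like this in C.  The names are
hypothetical, RCV.UP is again assumed to be an absolute sequence number, and
two choices here are my own assumptions rather than anything the
specifications state: a segment with URG set and UP=0 is rejected, and a
segment whose UP resolves to the same end point (e.g. a retransmission) is
tolerated.

```c
#include <stdbool.h>
#include <stdint.h>

enum urg_verdict {
    URG_ACCEPT,   /* start of a new urgent message: record RCV.UP */
    URG_SAME,     /* points at the end already recorded: no change */
    URG_RESET     /* inconsistent with the message in progress: send RST */
};

/* Modular sequence-number comparison: true if a precedes b. */
static bool seq_lt(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Classify a segment that arrived with the URG bit set. */
static enum urg_verdict check_urgent(bool urg_valid, uint32_t rcv_up,
                                     uint32_t seg_seq, uint16_t seg_up)
{
    uint32_t new_up = seg_seq + seg_up;

    /* Assumed constraint: URG with an empty urgent range is malformed. */
    if (seg_up == 0)
        return URG_RESET;

    /* Nothing outstanding, or the segment starts at or after RCV.UP. */
    if (!urg_valid || !seq_lt(seg_seq, rcv_up))
        return URG_ACCEPT;

    /* Inside the current urgent message: only the same end is tolerated. */
    if (new_up == rcv_up)
        return URG_SAME;

    return URG_RESET;
}
```

In the example above with m = 1000, a second segment arriving with URG=1 and
a UP that resolves to anything other than sequence number 4000 would be
answered with a RST.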

This scheme raises some important questions, such as its correctness and
its compatibility with existing systems.  Can any existing specifications
or internet experts clarify this?

Thanks,
Bob Hathaway
rjh@inteloa.intel.com
!tektronix!biin!rjh