Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!thumper!ulysses!andante!mit-eddie!uw-beaver!tektronix!reed!omen!caf
From: caf@omen.UUCP
Newsgroups: comp.protocols.misc
Subject: Re: About Protocols for File Transfer
Message-ID: <686@omen.UUCP>
Date: 2 Jun 88 13:47:58 GMT
References: <303@cfcl.UUCP> <698@lakesys.UUCP> <692@ncrcce.StPaul.NCR.COM> <9295@eddie.MIT.EDU> <8WbMLYy00Vs8EzltB4@andrew.cmu.edu> <304@cfcl.UUCP>
Reply-To: caf@omen.UUCP (Chuck Forsberg WA7KGX)
Organization: Omen Technology Inc, Portland Oregon
Lines: 107
Posted: Thu Jun  2 09:47:58 1988

While reading Dave's comments on file transfers I thought I'd pass on some
of the lessons I learned while developing ZMODEM and some of its antecedents.

In article <304@cfcl.UUCP> dwh@cfcl.UUCP (Dave Hamaker) writes:
 ....
:Protocol transfers and non-protocol transfers have no speed advantage over
:each other in terms of the raw data to be sent.  Since the non-protocol
:transfer sends nothing else, protocol transfers cannot be faster.  If it
:is true that additional information must be communicated for checking to
:be possible, and it is, how can protocol transfers avoid being slower?
:THE CHECKING INFORMATION CAN TAKE THE FORM OF FEEDBACK; IT CAN TRAVEL IN
:THE OTHER DIRECTION (full duplex, remember?)!  This by itself will not

Some modems are more full duplex than others.  The current crop of
V.29 and PEP based high speed modems cannot handle too-frequent
ACKs without a severe throughput loss.  At best, it costs as much
to send a CRC in one direction as the other when using these modems.
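
A toy model makes the point (a Python sketch with made-up numbers;
real modem turnaround times vary, and nothing here measures any
particular modem):

```python
# Toy model of throughput loss when each block waits for an ACK on a
# modem whose reverse channel forces a costly line turnaround.
# All numbers are illustrative, not measurements.

def throughput(block_bytes, bps, turnaround_s, acks_per_block):
    block_time = block_bytes * 10 / bps        # ~10 bit times per async byte
    total = block_time + acks_per_block * turnaround_s
    return block_bytes / total                 # effective bytes/second

# 9600 bps PEP-style link, a 1 second turnaround charged per ACK:
streaming = throughput(1024, 9600, 1.0, 0)     # no per-block ACKs
stop_wait = throughput(128, 9600, 1.0, 1)      # stop-and-wait, 128-byte blocks
```

With those (hypothetical) figures the streaming sender runs several
times faster than the stop-and-wait sender, which is why these modems
punish chatty protocols.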

:quite reach "as fast as" performance.  Some "attention" signal is needed
:which is distinct from data.  Asynchronous RS-232 even provides this, in
:the form of the "break."

Since many systems cannot pass a "break" signal, a real world
protocol must reserve a code combination instead.  This adds 
some overhead.
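
The idea is simple byte stuffing: pick an attention code and escape
any occurrence of it in the data.  A minimal sketch (ZMODEM's ZDLE
is in fact CAN, 0x18, but the real protocol escapes several more
characters than shown here, e.g. XON/XOFF for network protection):

```python
ZDLE = 0x18  # reserved attention code (ZMODEM's ZDLE); simplified rules

def escape(data):
    """Stuff the stream so ZDLE never appears as plain data."""
    out = bytearray()
    for b in data:
        if b == ZDLE:
            out += bytes([ZDLE, b ^ 0x40])   # ZDLE + transformed byte
        else:
            out.append(b)
    return bytes(out)

def unescape(data):
    """Invert escape(); a bare ZDLE is free to mean 'attention'."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b == ZDLE:
            out.append(next(it) ^ 0x40)
        else:
            out.append(b)
    return bytes(out)
```

The overhead is one extra byte per escaped character, which is where
the "some overhead" comes from: it depends on how often the reserved
codes occur in the file.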

:A break signal causes the receiver to return the check information for the
:current short packet (if any), return its own break signal, and react to
:the next input character.  It will either be another break, signaling the

As a practical matter, there's no guarantee what a break
signal will look like at the other end.  It depends on the
receiving UART and the transmission medium.  On some devices
it looks like simple null characters.  Break generation hasn't
progressed much beyond the break key on a Teletype ASR33.
Protocolwise, a break is somewhere between a cherry bomb and a
nuke.

:end of the transmission; or a sequence-number byte followed by two CRC
:bytes on the sequence number, followed by retransmitted data (at this
:point the receiver is back to normal).  If the CRC of the sequence number
:received is wrong, the receiver sends back a break and ignores data until
:another break is received.  A break is also the receiver's final response
:to the break which marks the end of the transmission.
:
:A "transmission window" is also needed.  This extends some number of
:packets beyond the first packet which is waiting for check information to
:be returned. The window is advanced as check information is verified.  The
:sender may not send data outside the window and the receiver may deduce
:from the highest packet received which earlier packets may be presumed
:correct (and written to disk, etc.).  There is no framing information with
:the data, and the receiver should actually act as if the window is a little
:larger that it actually is.  This prevents things like phantom characters
:caused by line noise from advancing the receiver's view of the window to
:the point where the receiver would be misled to accept an unverified packet.
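
One way to read Dave's slack rule (a Python sketch with hypothetical
window sizes; his description doesn't pin down the numbers):

```python
WINDOW = 8   # packets the sender may have outstanding (hypothetical)
SLACK = 2    # extra margin the receiver assumes, per the rule above

def presumed_ok(highest_seen, seq):
    # Packet `seq` may be presumed correct only if the sender could not
    # legally have sent `highest_seen` while `seq` was still unverified.
    # The receiver pads the window by SLACK so a few noise-inserted
    # characters can't trick it into accepting an unverified packet.
    return seq <= highest_seen - (WINDOW + SLACK)
```

So with these numbers, seeing packet 12 lets the receiver commit
packet 2 and earlier, but not packet 3, even though a strict window
of 8 would allow it.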

The one thing I've learned for certain is that *nothing* can
be presumed in a protocol transfer.  There is no guarantee the
receiver will receive the first, second, or third break signal
from the sender signifying bad data has been received.  Then
again, if you're lucky you won't have to send a break, the
line disturbance might do it for you.  Dat's da breaks, chum.

There isn't much of an upper bound on the number of extra
characters that might be inserted into the stream.  Would you
believe several hours' worth of "UUUUUUUUUUUUUUUU" at 1200
bps?  (I enhanced error recovery procedures after that one.)
That's one modem's rendition of dial tone.

In the absence of independently validated segments in the file
transmission, the receiver must buffer the entire file to be
very sure it's been received intact, unless the operating system
can truncate files on the fly.  And of course the message the
sender sends to the receiver to indicate the end of file must
be difficult for noise to mimic.
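
A Python sketch of what "difficult for noise to mimic" means in
practice.  This is loosely modeled on ZMODEM's ZEOF frame, which
carries the file position; the exact frame layout below is invented
for illustration, not ZMODEM's wire format:

```python
import binascii

def make_eof(file_length):
    # Magic + length, protected by a CRC-32 over both.
    body = b"ZEOF" + file_length.to_bytes(4, "little")
    crc = binascii.crc32(body).to_bytes(4, "little")
    return body + crc

def accept_eof(frame, bytes_received):
    # Noise must forge the magic, a valid CRC, AND the exact byte
    # count before the receiver will believe the transfer is over.
    if len(frame) != 12 or frame[:4] != b"ZEOF":
        return False
    body, crc = frame[:8], frame[8:]
    if binascii.crc32(body).to_bytes(4, "little") != crc:
        return False
    return int.from_bytes(frame[4:8], "little") == bytes_received
```

A single bare EOT character, by contrast, is exactly the sort of
thing a noise burst produces for free.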

:The window needs to be large enough to smooth out the effect of transmission
:delays.
:
:I'm sure everyone will agree that this protocol meets the "as fast as"
:requirement, although it is possible to quibble.  A non-protocol transfer

We haven't even started on network protection, provisions for
multiple streams, and adaptive error control procedures.
In the case of ZMODEM transmitting with 1k subpacket length,
the overhead for sending the CRC is about one half per cent.
On a binary file, the overhead for network protection is
several times more.

ZMODEM actually uses feedback a bit like PROTDH.  In the
absence of errors, ZMODEM in full streaming mode transmits but
one data header for the entire file.  Each subpacket contains
only two protocol overhead characters (not counting CRC).
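
The arithmetic behind "about one half per cent" (a Python sketch;
the exact figure depends on CRC width, and escaping overhead for
network protection is data-dependent, so it isn't modeled here):

```python
# Per-subpacket cost when streaming 1k subpackets: two framing
# characters plus the CRC.

def crc_overhead(subpacket_bytes, framing_bytes, crc_bytes):
    return (framing_bytes + crc_bytes) / subpacket_bytes

crc16 = crc_overhead(1024, 2, 2)   # roughly 0.4%
crc32 = crc_overhead(1024, 2, 4)   # roughly 0.6%
```

Either way it lands in the neighborhood of half a per cent, which
is why the CRC is nearly free at this subpacket size.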

:doesn't need the two break signals at the end.  For that matter, the sender
:has to wait for all outstanding check data before sending the final break

With CRCs in the forward direction, the ZMODEM sender doesn't
have to wait before sending the EOT.  With the Relay protocol,
the sender may actually be several FILES ahead of the
receiver.

  .....

:    1. I think it's good to be reminded that looking at problems in
:       unconventional ways can result in solutions for the seemingly
:       impossible problem.