Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/5/84; site oliveb.UUCP
Path: utzoo!watmath!clyde!cbosgd!ulysses!allegra!oliveb!jerry
From: jerry@oliveb.UUCP (Jerry Aguirre)
Newsgroups: net.news.b
Subject: Re: ihave/sendme and 2.10.3
Message-ID: <606@oliveb.UUCP>
Date: Thu, 26-Sep-85 15:53:10 EDT
Article-I.D.: oliveb.606
Posted: Thu Sep 26 15:53:10 1985
Date-Received: Sun, 29-Sep-85 06:44:23 EDT
References: <190@peregrine.UUCP>
Distribution: net
Organization: Olivetti ATC; Cupertino, Ca
Lines: 69

I have modified news version 2.10.2 to include "avail" and "iwant"
control messages.  These implement an efficient form of the ihave/sendme
control messages.

The "avail" message consists of a list of article IDs and optionally,
the associated spool filename.  The list can be generated via a log
entry in the sys file or extracted from the history file with an editor
script.
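
As a hypothetical illustration (the IDs, path, and exact layout below
are made up), an "avail" body might look like:

	<1234@somevax.UUCP> /usr/spool/news/net/news/b/1027
	<871@othersys.UUCP>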

The receiving system checks each article ID in the "avail" list
against its history and requests any article it has not already seen.
There are two options for requesting the articles.

The first is to send back an "iwant" control message listing the
article IDs being requested.  Including the spool filename speeds
things up because the sending system need not search its history
file.  The original
system can then send the requested articles via (compressed) batching or
whatever.
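
Hypothetically again, the "iwant" body would just echo the lines for
the articles wanted, filename and all:

	<1234@somevax.UUCP> /usr/spool/news/net/news/b/1027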

The second alternative is to directly access the article via uucp.  A
command of the form:
	uux -r -z "!rnews < remotesys!pathname"

is executed to fetch the news article from the remote system.  This
reduces the delay in transmitting the article but doesn't allow for
batching or compression.

Which transmission method to use depends on the volume of the
requested articles: batching (and compression) pays off when many are
wanted, while the direct uux fetch is quicker for the odd missing
article.

I had originally intended this scheme for exactly what you proposed.
Two sites could send each other a list of the articles that they have
received and then only send the articles that were lost in regular
transmission.  This is a very low overhead operation.  A day's worth of
articles can be listed in a message of a few K bytes.  The receiving
system can check them against the history in < 30 seconds.
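
For the curious, here is a rough sketch in C of that check.  The
dbminit/fetch calls are the standard dbm(3) routines that B news
already uses for history lookups; the one-ID-per-line input format
and the key including its trailing null are assumptions on my part:

	#include <stdio.h>
	#include <string.h>

	/* dbm(3) declarations; old systems have no header for these */
	typedef struct { char *dptr; int dsize; } datum;
	extern int dbminit();
	extern datum fetch();

	/* Read article IDs, one per line, on stdin and print the
	 * ones not found in the news history: the ones to request. */
	main()
	{
		char line[BUFSIZ];
		datum key, val;

		if (dbminit("/usr/lib/news/history") < 0) {
			fprintf(stderr, "cannot open history\n");
			exit(1);
		}
		while (fgets(line, sizeof line, stdin) != NULL) {
			line[strcspn(line, " \t\n")] = '\0';
			if (line[0] == '\0')
				continue;
			key.dptr = line;
			key.dsize = strlen(line) + 1;	/* null stored: an assumption */
			val = fetch(key);
			if (val.dptr == NULL)		/* not in history: new to us */
				printf("%s\n", line);
		}
		exit(0);
	}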

I tried it out on my own systems and ran into a serious problem.  The
news history is just not reliable enough.  There can be large numbers of
articles that exist locally but are not in the history.  One of my
systems wound up requesting several hundred articles that it already had
but didn't know about.  After numerous user complaints about the
duplications, I stopped testing until I could modify the news history to
work better.

I have in mind a scheme to map the article ID into a pathname.  Then the
rnews program could simply attempt to creat (mode 644) the pathname,
and if the article already existed the create would fail.  This
should be faster than that silly DBM mechanism and more portable.  The
created pathname would become the base name for the article.  All the
net.* cross-postings would be links to the base name.  In this way the
history becomes the articles and there can be no disagreement between
them.  Given the xref mod, the base article would reference all the
links made to it so expire could remove them all.
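
One wrinkle: creat(2) by itself happily truncates an existing file
that is writable rather than failing, so the sketch below leans on
the 4.2BSD exclusive-create open flags for the fail-if-it-exists
behavior.  The spool directory name is made up, and the ID-to-path
mapping here is the naive one (a fuller version is sketched below,
after the filename discussion):

	#include <stdio.h>
	#include <fcntl.h>

	/* Try to file an article under a pathname derived from its
	 * ID.  Returns an open fd for writing the article, or -1 if
	 * we already have it -- the filesystem *is* the history. */
	int
	filearticle(id)
	char *id;
	{
		static char path[BUFSIZ];

		/* naive mapping; a '/' in the ID would break it (see
		 * the fuller sketch below) */
		sprintf(path, "/usr/spool/news/.ids/%s", id);

		/* creat(2) would truncate, not fail; the 4.2BSD
		 * exclusive create fails with EEXIST instead */
		return open(path, O_WRONLY|O_CREAT|O_EXCL, 0644);
	}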

This makes referencing a parent article absurdly simple and fast.  One
need only map the article ID to its base pathname to have the
article.  The current readnews code to reference a parent article is
not only circuitous but is also just plain broken.

Given that Unix allows anything but a null or / in a filename, mapping
the article ID into a pathname is simple.  The only critical part is
creating enough subdirectories to keep the access time fast.
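
To make that concrete, here is one way the mapping might go; the
hash, the directory fan-out, and the '#' stand-in for any '/' in the
ID are all arbitrary choices of mine:

	#include <stdio.h>
	#include <string.h>

	#define NSUBDIR	128	/* directory fan-out; pick to taste */

	/* Map an article ID into a spool pathname, spreading the
	 * articles over NSUBDIR subdirectories so no one directory
	 * grows big enough to make the name lookups slow. */
	char *
	idtopath(id)
	char *id;
	{
		static char path[BUFSIZ];
		unsigned hash = 0;
		char *p, *q;

		for (p = id; *p; p++)
			hash = hash * 31 + *p;
		sprintf(path, "/usr/spool/news/.ids/%02x/", hash % NSUBDIR);

		/* only '/' (and null) is illegal in a filename
		 * component, so translate just that character */
		q = path + strlen(path);
		for (p = id; *p; p++)
			*q++ = (*p == '/') ? '#' : *p;
		*q = '\0';
		return path;
	}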

Anybody want to take on the project of improving the news history
processing?

					Jerry Aguirre @ Olivetti ATC
{hplabs|fortune|idi|ihnp4|tolerant|allegra|tymix|olhqma}!oliveb!jerry