Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/17/84; site plus5.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!ihnp4!mgnetp!we53!busch!wuphys!plus5!hokey
From: hokey@plus5.UUCP (Hokey)
Newsgroups: net.news
Subject: Re: too many new groups (keyword based news)
Message-ID: <626@plus5.UUCP>
Date: Tue, 5-Mar-85 23:09:39 EST
Article-I.D.: plus5.626
Posted: Tue Mar  5 23:09:39 1985
Date-Received: Thu, 7-Mar-85 03:52:33 EST
References: <576@vortex.UUCP> <614@plus5.UUCP> <2413@nsc.UUCP>
Reply-To: hokey@plus5.UUCP (Hokey)
Organization: Plus Five Computer Services, St. Louis
Lines: 39
Summary: 

There is a difference between using keywords instead of newsgroups, and
using keywords for archival lookup.

I am primarily interested in the former.  We could start with words like
"request" and "inquiry", for example.  These, coupled with other words
(an operating system, hardware, language) would be of great help in cleaning
up the way articles are plastered all over.
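For example, an article asking a question about a language under a particular
operating system might carry headers along these lines (a made-up illustration,
not an existing convention):

```
Subject: pointer arithmetic question
Keywords: inquiry, unix, C
```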

I suspect the first step is to enhance the news posting programs to prompt
(excuse the alliteration) for keywords in addition to the Subject line.  We
would have to "grow" the keyword and alias databases over time.  Eventually,
the keywords would be able to take over from newsgroups entirely.
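The prompting side might look something like this sketch: the posting program
reads a Keywords line and folds each word to its canonical form through the
alias database.  The `aliases` table and the function names here are invented
for illustration; a real implementation would read the database from a file.

```c
/* Sketch: canonicalize keywords through a (hypothetical) alias table. */
#include <stdio.h>
#include <string.h>

struct alias { const char *from, *to; };

/* Assumed alias database; the real one would grow over time on disk. */
static const struct alias aliases[] = {
	{ "question", "inquiry" },
	{ "query",    "inquiry" },
	{ "req",      "request" },
};

/* Return the canonical form of a keyword, or the keyword unchanged. */
const char *canonical(const char *kw)
{
	size_t i;
	for (i = 0; i < sizeof aliases / sizeof aliases[0]; i++)
		if (strcmp(kw, aliases[i].from) == 0)
			return aliases[i].to;
	return kw;
}

/* Rewrite a comma/space separated Keywords line into `out`,
 * with every keyword replaced by its canonical form. */
void normalize_keywords(char *line, char *out, size_t outlen)
{
	char *kw = strtok(line, " ,\n");
	out[0] = '\0';
	while (kw != NULL) {
		if (out[0] != '\0')
			strncat(out, ",", outlen - strlen(out) - 1);
		strncat(out, canonical(kw), outlen - strlen(out) - 1);
		kw = strtok(NULL, " ,\n");
	}
}
```

So a user who types "query, unix" would have the article posted with
"inquiry,unix", and lookups would never have to chase variant spellings.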

We already have a situation where newsgroups fragment because the readers
wish to restrict the quantity of stuff they see.  Likewise, people tend
to post to multiple newsgroups just to reach a wider audience.  These
two trends are antithetical.

There are two ways to store the article index.  One way is to maintain the
index by message ID, the other way is by keyword.  Data compression can be
handled in either case.  For starters, message IDs can be compressed by
having a table of "registered" sites.  Sites in this list would have their
site name replaced by a compressed offset into the site table.  Similar
compression could be done to the sequence number.  The keywords would be
replaced by compressed offsets into the keyword table, which would also
carry an additional entry for each alias of a keyword.
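A minimal sketch of the message-ID side, assuming a small built-in site table
and a made-up "seq!offset" encoding (the real table would be distributed with
the software, and the encoding would be binary rather than ascii):

```c
/* Sketch: compress "<seq@site>" against a "registered" site table. */
#include <stdio.h>
#include <string.h>

/* Assumed registered-site table, for illustration only. */
static const char *sites[] = { "utzoo.UUCP", "ihnp4.UUCP", "plus5.UUCP" };
#define NSITES (sizeof sites / sizeof sites[0])

/* Replace the site name with its offset into the table, producing
 * "seq!offset"; unregistered sites are kept verbatim.  Returns 0 on
 * success, -1 if the ID doesn't parse. */
int compress_id(const char *id, char *out, size_t outlen)
{
	char seq[64], site[128];
	size_t i;

	if (sscanf(id, "<%63[^@]@%127[^>]>", seq, site) != 2)
		return -1;
	for (i = 0; i < NSITES; i++) {
		if (strcmp(site, sites[i]) == 0) {
			snprintf(out, outlen, "%s!%zu", seq, i);
			return 0;
		}
	}
	snprintf(out, outlen, "%s", id);  /* not registered: keep as-is */
	return 0;
}
```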

If the compression were good enough, we could keep each keyword index in
an ascii file with delimiters surrounding the article IDs.  This would
greatly ease the deletion/insertion problem of index maintenance.
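With delimited ascii, insertion is an append and deletion is a single splice.
A sketch of the deletion half, assuming a '|' delimiter and an in-memory
buffer (the real index would of course live in a file under the spool
directory):

```c
/* Sketch: maintain one keyword's index as a '|'-delimited ascii string. */
#include <stdio.h>
#include <string.h>

/* Remove "|id|" from the index by closing up the gap in place. */
void del_id(char *index, const char *id)
{
	char pat[128];
	char *p;

	snprintf(pat, sizeof pat, "|%s|", id);
	p = strstr(index, pat);
	if (p != NULL)  /* keep one '|' so the neighbours stay delimited */
		memmove(p, p + strlen(pat) - 1, strlen(p + strlen(pat) - 1) + 1);
}

/* Append "id|" to an index string that already ends in '|'. */
void add_id(char *index, size_t len, const char *id)
{
	strncat(index, id, len - strlen(index) - 1);
	strncat(index, "|", len - strlen(index) - 1);
}
```

Expiring an article then touches only the index lines for its keywords, with
no need to rewrite a packed binary structure.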

If an online article database were kept, it could be scanned for keywords
by grep.  If people didn't like the keywords, they could easily add or
delete them.

-- 
Hokey           ..ihnp4!plus5!hokey
		  314-725-9492