Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/17/84; site plus5.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!ihnp4!mgnetp!we53!busch!wuphys!plus5!hokey
From: hokey@plus5.UUCP (Hokey)
Newsgroups: net.news
Subject: Re: too many new groups (keyword based news)
Message-ID: <626@plus5.UUCP>
Date: Tue, 5-Mar-85 23:09:39 EST
Article-I.D.: plus5.626
Posted: Tue Mar  5 23:09:39 1985
Date-Received: Thu, 7-Mar-85 03:52:33 EST
References: <576@vortex.UUCP> <614@plus5.UUCP> <2413@nsc.UUCP>
Reply-To: hokey@plus5.UUCP (Hokey)
Organization: Plus Five Computer Services, St. Louis
Lines: 39
Summary: There is a difference between using keywords instead of newsgroups, and using keywords for archival lookup.

I am primarily interested in the former.

We could start with words like "request" and "inquiry", for example.
These, coupled with other words (an operating system, hardware, a
language), would be of great help in cleaning up the way articles are
plastered all over.

I suspect the first way to approach the problem is to enhance the news
posting programs to prompt (excuse the alliteration) for keywords in
addition to a Subject line. We would have to "grow" the keyword and
alias databases over time. Eventually, the keywords would be able to
"take over" from newsgroups.

We already have a situation where newsgroups fragment because the
readers wish to restrict the quantity of stuff they see. Likewise,
people tend to post to multiple newsgroups just to reach a wider
audience. These two trends are antithetical.

There are two ways to store the article index: one is to maintain the
index by message ID, the other is by keyword. Data compression can be
handled in either case. For starters, message IDs can be compressed by
keeping a table of "registered" sites. Sites in this list would have
their site name replaced by a compressed offset into the site table.
Similar compression could be done to the sequence number.
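As a rough sketch of the site-table idea (the table contents, names, and
encoding here are my own invention, not part of any existing news
software): a message ID like <626@plus5.UUCP> splits into a sequence
number and a site name, and a registered site name is replaced by its
offset into the table.

```python
# Hypothetical sketch: compress message IDs against a table of
# "registered" sites. A registered site name is replaced by its
# offset into the table; an unregistered site keeps its literal name.
# The site list below is illustrative only.

REGISTERED_SITES = ["utzoo.UUCP", "vortex.UUCP", "plus5.UUCP", "nsc.UUCP"]

def compress_id(message_id):
    """'<626@plus5.UUCP>' -> (site_offset_or_name, sequence_number)."""
    seq, site = message_id.strip("<>").split("@", 1)
    if site in REGISTERED_SITES:
        return (REGISTERED_SITES.index(site), int(seq))
    return (site, int(seq))          # unregistered site: keep the name

def expand_id(compressed):
    """Inverse of compress_id: rebuild the '<seq@site>' form."""
    site, seq = compressed
    if isinstance(site, int):
        site = REGISTERED_SITES[site]
    return "<%d@%s>" % (seq, site)
```

In a real implementation the offset and sequence number would of course
be packed into a few bytes rather than kept as Python tuples; the point
is only that the site table makes the per-article cost small.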
The keywords would be replaced by a compressed offset into the keyword
table, and there would be an additional entry for each keyword for an
alias. If the compression were good enough, we could keep each keyword
index in an ASCII file with delimiters surrounding the article IDs.
This would greatly ease the deletion/insertion problem of index
maintenance.

If an online article database were kept, it could be scanned for
keywords by grep. If people didn't like the keywords, they could
easily add or delete them.
-- 
Hokey           ..ihnp4!plus5!hokey
                314-725-9492