Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 (Fortune) 6/7/84; site dmsd.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!godot!harvard!seismo!umcp-cs!gymble!lll-crg!dual!amd!fortune!hpda!dmsd!bass
From: bass@dmsd.UUCP (John Bass)
Newsgroups: net.news
Subject: Re: too many new groups (keyword based news)
Message-ID: <163@dmsd.UUCP>
Date: Mon, 4-Mar-85 11:03:07 EST
Article-I.D.: dmsd.163
Posted: Mon Mar  4 11:03:07 1985
Date-Received: Thu, 7-Mar-85 04:25:20 EST
References: <576@vortex.UUCP> <614@plus5.UUCP> <2413@nsc.UUCP>
Lines: 41

Chuq, I'm afraid, is overcome by the magnitude of some of the numbers and
appears not to have thought it through well enough. The dragging performance
of rnews and expire comes from TWO main issues: doing system("blah")
to invoke the creation of each article, and the tremendous amount
of time spent doing namei's throughout the news lib and spool areas. As for
estimates of the increased space, putting the news into a real
N-key database format will REDUCE the total space used by most systems
due to round-off in 1K-block filesystems.
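
To show the shape of the lookup I have in mind, here is a rough sketch
using the stock dbm-style routines. The database path and the key are
made up for illustration, and dbm itself is not suited to holding whole
articles, but it shows why the namei traffic goes away: one open for
the whole database instead of a directory walk per article.

    #include <ndbm.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        /* illustrative path; a real news database would live wherever
         * the admin puts it */
        DBM *db = dbm_open("/usr/spool/news/articles", O_RDONLY, 0);
        char msgid[] = "<163@dmsd.UUCP>";       /* look up by Message-ID */
        datum key, val;

        if (db == NULL)
            return 1;

        key.dptr  = msgid;
        key.dsize = strlen(msgid);

        val = dbm_fetch(db, key);               /* no per-article namei */
        if (val.dptr != NULL)
            fwrite(val.dptr, 1, val.dsize, stdout);

        dbm_close(db);
        return 0;
    }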

On my 2.10.1 system each incoming batched news item appears to require
in excess of 70 disk transactions (I will have exact numbers in about
a week; I suspect it may be double that). Done properly in an N-key
database system averaging 5 keys per article, that number should be
closer to 15 or so. About 3/4 of the current disk traffic appears to be
in namei and most of the rest in exec; again I will have better numbers
in a week or so. I have already experimented with replacing the
system("blah") calls with direct execl() calls, with a noticeable
improvement.

I think that pushing news into one big database will actually reduce
filespace requirements by about 20% or more. This comes from two main
sources: average wasted space of 512 bytes per article on a 1 Kbyte-block
filesystem, and per-file overhead of 16 bytes for the directory entry,
plus about 20 bytes of directory overhead (tree plus wasted directory
space that is allocated), plus 64 bytes of inode, plus first-level
indirect overhead on long directories. Thus we have over 600 bytes of
overhead per article in the current system, while a properly done
database will likely have less than 100 bytes per item. That should be
a net savings of nearly 1.3 Mbytes on a 2700-item news database.
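
For anyone checking my arithmetic, the figures above work out roughly
as follows (the 100-byte database overhead is an estimate, not a
measurement):

    #include <stdio.h>

    int
    main(void)
    {
        /* block round-off + directory entry + directory tree waste +
         * inode; indirect blocks on long directories are extra */
        long fs_per_item = 512 + 16 + 20 + 64;
        long db_per_item = 100;                 /* assumed database cost */
        long items = 2700;
        long saved = (fs_per_item - db_per_item) * items;

        printf("current overhead: %ld bytes/item\n", fs_per_item);
        printf("net savings: %ld bytes (about %.2f Mbytes)\n",
               saved, saved / (1024.0 * 1024.0));
        return 0;
    }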

News has outgrown the tools approach used to prototype it ... a serious
rewrite is LONG overdue. From the work I have already done, I believe a
rewrite can reduce the news system to 10% of its current disk traffic
and under 50% of its current cpu time, WITH significant increases in
functionality. The experience gained in implementing the notes system
should have been more carefully examined years ago.

-- 
John Bass
DMS Design (System Performance and Arch Consultants)
{dual,fortune,idi,hpda}!dmsd!bass     (408) 996-0557