Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10 beta 3/9/83; site desint.UUCP
Path: utzoo!watmath!clyde!bonnie!akgua!sdcsvax!sdcrdcf!trwrb!desint!geoff
From: geoff@desint.UUCP (Geoff Kuenning)
Newsgroups: net.news.b
Subject: Re: expire takes 73 minutes of cpu?!?!?
Message-ID: <206@desint.UUCP>
Date: Sat, 10-Nov-84 03:39:33 EST
Article-I.D.: desint.206
Posted: Sat Nov 10 03:39:33 1984
Date-Received: Mon, 12-Nov-84 11:36:07 EST
References: <1828@nsc.UUCP>
Distribution: net
Organization: his home computer, Thousand Oaks, CA
Lines: 39

From chuqui@nsc.UUCP:

>Expire is, to put it nicely, a hog.

The current expire opens up every article file to look for an "Expires:" line
in the header.  To find out how much this costs (approximately), I did:

	cd /usr/spool/news
	time find . -type f -print | xargs cat >/dev/null

(In retrospect, head -5 on each file would have been more accurate, since
expire only needs the header, but cat is not off by too much.)  It ran for 45
minutes before I had to abort it, and produced exactly the same seeking
pattern as expire.  My normal expire runs take somewhere from an hour to 1:15
when there is no other disk activity, and eat essentially 100% of the seek
time on the drive.
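
A header-only version of the same measurement might look something like the
sketch below.  It is an illustration, not what expire does internally: the
function name scan_expires is made up here, and it assumes that each article's
header ends at the first blank line.  It still opens every file, so it should
show much the same seeking.

```shell
# scan_expires DIR (hypothetical helper, not part of the news software):
# print "pathname<TAB>value" for every article under DIR whose header
# contains an Expires: line.
# ASSUMPTION: the header ends at the first blank line of the file.
scan_expires() {
    find "$1" -type f -print | while IFS= read -r f; do
        # Quit at the first blank line so the article body is never scanned.
        exp=$(sed -n -e '/^$/q' -e 's/^Expires: *//p' "$f")
        if [ -n "$exp" ]; then
            printf '%s\t%s\n' "$f" "$exp"
        fi
    done
}
```

Timing "scan_expires /usr/spool/news >/dev/null" would then give a figure
comparable to the cat run above.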
 
The obvious solution is to put the expiration date in the history file.  This
is a bit beyond my current free-time level.  So I was wondering about doing
a shell script something like this:

	break up /usr/lib/news/history into article pathnames
	sort the list, and 'comm' it against yesterday's list to get a list of
		newly-arrived articles
	Append their pathnames and expiration dates to a file called
		/usr/lib/news/expdates
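
The first two steps above could be sketched roughly as follows.  This is only
a sketch under assumptions: the function name new_articles is hypothetical, the
history file is assumed to hold one line per article of the form
"message-id<TAB>date<TAB>group/number ...", and yesterday's list is assumed to
be a file of spool-relative pathnames already sorted with LC_ALL=C.

```shell
# new_articles HISTORY YESTERDAY (hypothetical helper):
# print the article pathnames listed in HISTORY that are not in YESTERDAY.
# ASSUMPTION: each history line is "message-id<TAB>date<TAB>group/number ...",
# and YESTERDAY is a list of pathnames already sorted with LC_ALL=C.
new_articles() {
    awk -F'\t' '{
        n = split($3, refs, " ")
        for (i = 1; i <= n; i++) print refs[i]
    }' "$1" |
    tr . / |                   # net.news.b/123 -> net/news/b/123
    LC_ALL=C sort -u |
    LC_ALL=C comm -23 - "$2"   # keep lines found only in today's list
}
```

Looking up each new article's expiration date and appending it to the
expdates file is left out here; that step still has to open each new article
once.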

From here, it is fairly easy to see how to expire without opening lots of
files.  When I work it out in more detail, though, it becomes obvious that you
need to write at least one program that uses getdate.y to crack the dates and
has the smarts to expire based on newsgroup, the Expires line, Date-Received,
and similar header lines.  So that doesn't seem like much of an approach,
either.
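
To show where the saving would come from, here is a sketch of the selection
step once the dates are in a directly comparable form.  The layout is invented
for illustration: each line of the expdates file is assumed to be
"pathname seconds-since-epoch", and expired_paths is a hypothetical name.
Producing that number is exactly the date-cracking job described above, which
this sketch does not attempt, and it ignores per-newsgroup rules entirely.

```shell
# expired_paths EXPDATES NOW (hypothetical helper):
# print the pathname of every entry whose expiration time is at or before NOW.
# ASSUMPTION: each line is "pathname<SPACE>seconds-since-epoch"; this layout
# is invented for the sketch, and pathnames contain no whitespace.
expired_paths() {
    awk -v now="$2" '$2 + 0 <= now + 0 { print $1 }' "$1"
}
```

The output is a list of pathnames that could be handed to xargs rm, so only
one file is read to decide what goes.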

Can anybody out there come up with a quick hack to cut down on these multiple
opens?  Or does somebody maybe have the time to do it right?
-- 

	Geoff Kuenning
	First Systems Corporation
	...!ihnp4!trwrb!desint!geoff