Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site phri.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!panda!talcott!harvard!cmcl2!phri!roy
From: roy@phri.UUCP (Roy Smith)
Newsgroups: net.news.b
Subject: checkgroups consumes file system
Message-ID: <3@phri.UUCP>
Date: Sat, 2-Nov-85 15:00:43 EST
Article-I.D.: phri.3
Posted: Sat Nov  2 15:00:43 1985
Date-Received: Mon, 4-Nov-85 01:47:40 EST
References: <187@phri.UUCP>
Organization: Public Health Research Inst. (NY, NY)
Lines: 73
Original-Subject: loss of traffic through phri

> Because we ran out of space on our spool file system today...

	It turns out that the culprit was news itself.  I ran the latest
checkgroups message <1826@gatech.CSNET> yesterday.  A hour or so later, we
ran out of space on /usr.  Doing a hasty expire (i.e. cd ~news/trash-group;
rm *) and rm'ing a bunch of other stuff didn't seem to help as much as I
thought it would, and we ran out of space again soon after that.

	I brought Unix down and found some 3000 (!) checkgroups messages in
/usr/spool/news/control, at about 11kb each.  I had saved the article in
xxx and done "sed 1,/Lines:/d xxx | sh >& xxx.out &", much the same
procedure I usually use.  I just repeated the procedure today with no
trouble, so I'm really confused as to what is going on.  No, I havn't
changed the directory permissions before or since.

	Some excerpts from yesterday's xxx.out (over 2000 lines total, many
of them blank):

inews: Cannot reread article
inews: Cannot reread article
[...]
inews: Cannot install article as /usr/spool/news/control/3140
inews: Link into /usr/spool/news/control/3140 failed, errno 5, check dir permissions.
inews: Cannot reread article
inews: Cannot install article as /usr/spool/news/control/3140
inews: Link into /usr/spool/news/control/3140 failed, errno 5, check dir permissions.
[...]
inews: News system locked up
[...]
inews: Cannot install article as /usr/spool/news/control/3647
inews: Cannot install article as /usr/spool/news/control/3648
inews: Cannot install article as /usr/spool/news/control/3649
inews: Cannot install article as /usr/spool/news/control/3650
[...]
inews: Cannot install article as /usr/spool/news/control/3908
inews: Link into /usr/spool/news/control/3908 failed, errno 5, check dir permissions.
inews: Cannot reread article
inews: Cannot install article as /usr/spool/news/control/3908
inews: Link into /usr/spool/news/control/3908 failed, errno 5, check dir permissions.
[...]

	And, some excerpts from /usr/lib/news/log (I *think* the first line
shown is the first mention of the checkgroup):

Nov  1 18:33	local	posted <543@phri.UUCP> ng control subj 'cmsg checkgroups'
Nov  1 18:33	local	Ctl Msg control from roy: checkgroups
Nov  1 18:33	cmcl2.UUCP	received <1840@gatech.CSNET> ng mod.newslists,net.announce.newusers subj 'Changes to Publicly Accessible Mailing Lists'
Nov  1 18:33	local	system(sed '1,/^$/d' /usr/spool/news/.ar012964 | /usr/lib/news/checkgroups usenet) status 36096
Nov  1 18:33	local	posted <544@phri.UUCP> ng control subj 'cmsg checkgroups'
Nov  1 18:33	local	Ctl Msg control from news: checkgroups
Nov  1 18:33	cmcl2.UUCP	from usenet@gatech.CSNET relay version B 2.10.2 9/18/84; site cmcl2.UUCP
Nov  1 18:33	local	system(sed '1,/^$/d' /usr/spool/news/.ar012970 | /usr/lib/news/checkgroups usenet) status 36096
[...]
Nov  1 19:48	local	Ctl Msg control from news: checkgroups
Nov  1 19:49	cmcl2.UUCP	Duplicate article <320@ssc-vax.UUCP> rejected
Nov  1 19:49	local	system(sed '1,/^$/d' /usr/spool/news/.ar015494 | /usr/lib/news/checkgroups usenet) status 36096
Nov  1 19:49	local	posted <931@phri.UUCP> ng control subj 'cmsg checkgroups'

	As you can see, inews just kept trying to post the checkgroups to
newsgroup control, without any success.  It seems to me that after a
certain number of tries, it should just give up.  As it is, inews just kept
trying until it killed the system -- 30 Mbytes of aborted articles in
~news/control, plus a log file well over 1300 kbytes long!

	I'm not sure why it failed the in the first place.  Could doing a
checkgroups while news is coming in be a Bad Thing?  Actually, several
news-things were going on at the same time.  In addition to the checkgroups
and the incoming feed, we were queueing up news for 2 or 3 downstream
sites, and expire was running (the load on our 11/750 was 8!)
-- 
Roy Smith 
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016