Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP Posting-Version: version B 2.10.2 9/18/84; site phri.UUCP Path: utzoo!watmath!clyde!burl!ulysses!allegra!mit-eddie!genrad!panda!talcott!harvard!cmcl2!phri!roy From: roy@phri.UUCP (Roy Smith) Newsgroups: net.news.b Subject: Re: checkgroups consumes file system Message-ID: <201@phri.UUCP> Date: Sat, 2-Nov-85 15:12:01 EST Article-I.D.: phri.201 Posted: Sat Nov 2 15:12:01 1985 Date-Received: Mon, 4-Nov-85 01:47:53 EST References: <187@phri.UUCP> Organization: Public Health Research Inst. (NY, NY) Lines: 110 > Because we ran out of space on our spool file system today... It turns out that the culprit was news itself. I ran the latest checkgroups message <1826@gatech.CSNET> yesterday. A hour or so later, we ran out of space on /usr. Doing a hasty expire (i.e. cd ~news/trash-group; rm *) and rm'ing a bunch of other stuff didn't seem to help as much as I thought it would, and we ran out of space again soon after that. I brought Unix down and found some 3000 (!) checkgroups messages in /usr/spool/news/control, at about 11kb each. I had saved the article in xxx and done "sed 1,/Lines:/d xxx | sh >& xxx.out &", much the same procedure I usually use. I just repeated the procedure today with no trouble, so I'm really confused as to what is going on. No, I havn't changed the directory permissions before or since. Some excerpts from yesterday's xxx.out (over 2000 lines total, many of them blank): inews: Cannot reread article inews: Cannot reread article [...] inews: Cannot install article as /usr/spool/news/control/3140 inews: Link into /usr/spool/news/control/3140 failed, errno 5, check dir permissions. inews: Cannot reread article inews: Cannot install article as /usr/spool/news/control/3140 inews: Link into /usr/spool/news/control/3140 failed, errno 5, check dir permissions. [...] inews: News system locked up [...] inews: Cannot install article as /usr/spool/news/control/3647 inews: Cannot install article as /usr/spool/news/control/3648 inews: Cannot install article as /usr/spool/news/control/3649 inews: Cannot install article as /usr/spool/news/control/3650 [...] inews: Cannot install article as /usr/spool/news/control/3908 inews: Link into /usr/spool/news/control/3908 failed, errno 5, check dir permissions. inews: Cannot reread article inews: Cannot install article as /usr/spool/news/control/3908 inews: Link into /usr/spool/news/control/3908 failed, errno 5, check dir permissions. [...] And, some excerpts from /usr/lib/news/log (I *think* the first line shown is the first mention of the checkgroup): Nov 1 18:33 local posted <543@phri.UUCP> ng control subj 'cmsg checkgroups' Nov 1 18:33 local Ctl Msg control from roy: checkgroups Nov 1 18:33 cmcl2.UUCP received <1840@gatech.CSNET> ng mod.newslists,net.announce.newusers subj 'Changes to Publicly Accessible Mailing Lists' Nov 1 18:33 local system(sed '1,/^$/d' /usr/spool/news/.ar012964 | /usr/lib/news/checkgroups usenet) status 36096 Nov 1 18:33 local posted <544@phri.UUCP> ng control subj 'cmsg checkgroups' Nov 1 18:33 local Ctl Msg control from news: checkgroups Nov 1 18:33 cmcl2.UUCP from usenet@gatech.CSNET relay version B 2.10.2 9/18/84; site cmcl2.UUCP Nov 1 18:33 local system(sed '1,/^$/d' /usr/spool/news/.ar012970 | /usr/lib/news/checkgroups usenet) status 36096 [...] Nov 1 19:48 local Ctl Msg control from news: checkgroups Nov 1 19:49 cmcl2.UUCP Duplicate article <320@ssc-vax.UUCP> rejected Nov 1 19:49 local system(sed '1,/^$/d' /usr/spool/news/.ar015494 | /usr/lib/news/checkgroups usenet) status 36096 Nov 1 19:49 local posted <931@phri.UUCP> ng control subj 'cmsg checkgroups' As you can see, inews just kept trying to post the checkgroups to newsgroup control, without any success. It seems to me that after a certain number of tries, it should just give up. As it is, inews just kept trying until it killed the system -- 30 Mbytes of aborted articles in ~news/control, plus a log file well over 1300 kbytes long! I'm not sure why it failed the in the first place. Could doing a checkgroups while news is coming in be a Bad Thing? Actually, several news-things were going on at the same time. In addition to the checkgroups and the incoming feed, we were queueing up news for 2 or 3 downstream sites, and expire was running (the load on our 11/750 was 8!) It also seems that our /usr/lib/news/seq file got zonked. Several times during this episode, the article sequence numbers seem to have been reset to 1. I've artificially reset our sequencer to 2000, hoping that will get us past any sequence numbers generated during this mess. From /usr/lib/news/errlog: [...] Nov 1 20:49 <489@dadla.UUCP> inews: Article <489@dadla.uucp> locked up Nov 1 21:51 <1533@phri.UUCP> inews: Cannot reread article Nov 1 21:51 <1534@phri.UUCP> inews: Cannot reread article Nov 1 21:51 <1535@phri.UUCP> inews: Cannot reread article Nov 1 21:51 <1@phri.UUCP> inews: Cannot reread article Nov 1 21:52 <2@phri.UUCP> inews: Cannot reread article Nov 1 21:53 <4@phri.UUCP> inews: Cannot reread article Nov 1 21:53 <170@sagan.UUCP> inews: Cannot reread article Nov 1 21:55 <2@phri.UUCP> inews: Cannot install article as /usr/spool/news/control/3140 Nov 1 21:55 <2@phri.UUCP> inews: Link into /usr/spool/news/control/3140 failed, errno 5, check dir permissions. Nov 1 21:55 <2@phri.UUCP> inews: Cannot reread article Nov 1 21:55 <3@phri.UUCP> inews: Cannot install article as /usr/spool/news/control/3140 [...] Nov 1 22:03 <2@phri.UUCP> inews: Cannot reread article Nov 1 22:03 <3@phri.UUCP> inews: Cannot reread article Nov 1 22:03 <4@phri.UUCP> inews: Cannot reread article [...] Nov 2 00:51 <1@phri.UUCP> inews: Cannot reread article Nov 2 00:51 <2@phri.UUCP> inews: Cannot reread article Nov 2 00:52 <1@phri.UUCP> inews: Cannot reread article Nov 2 00:52 <2@phri.UUCP> inews: Cannot reread article And here there looks like a write glitch (probably becuase the file system was full by then): Nov 2 00:52 <3@phri.UUCP> inews: Cannot rereadNov 2 01:10 <29@phri.UUCP> inews: Cannot reread article Nov 2 01:11 <39@phri.UUCP> inews: Cannot reread article Nov 2 01:11 <40@phri.UUCP> inews: Cannot reread article -- Roy SmithSystem Administrator, Public Health Research Institute 455 First Avenue, New York, NY 10016