Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/17/84 chuqui version 1.7 9/23/84; site daisy.UUCP
Path: utzoo!decvax!tektronix!hplabs!nsc!daisy!cwc
From: cwc@daisy.UUCP (C.W. Chung)
Newsgroups: net.news.sa
Subject: Re: Wanted: Faster Dump/Backup Procedures for bsd4.2 (Summary)
Message-ID: <114@daisy.UUCP>
Date: Mon, 17-Jun-85 19:03:34 EDT
Article-I.D.: daisy.114
Posted: Mon Jun 17 19:03:34 1985
Date-Received: Sun, 30-Jun-85 05:35:30 EDT
References: <112@daisy.UUCP>
Distribution: na
Organization: Daisy Systems Corp., Mountain View, Ca
Lines: 73
Xref: tektronix net.news.sa:00144 


This is a summary of what I have heard from the net.  Thanks to all those
netters who have sent me info on this subject.  I really appreciate your
responses and thoughtful suggestions.

There are several ways to speed up the dumping: use concurrent processes to
overlap the disk and tape I/O, modify the kernel to support higher throughput,
increase the record size used by dump/restore, or dump to disk instead of tape.

The 'caltech mod' by Don Speck (now at BRL?) makes use of concurrent processes
to overlap the disk and tape I/O.  The number of concurrent processes is
apparently a tunable parameter.  It is set to 3 by default: one process writes
to the tape and two processes read from the disk.
As Jeff Gilliam and others pointed out, Don Speck posted the source
modification to the /etc/dump program a while ago.  I went back to
net.sources and found it (id: 8339brl-tgr.arpa; posted 2/24/85).  It replaces
dumptape.c of the 4.2 /usr/src/etc/dump.  Building and installing it is quite
simple, and I had it up and running very quickly.  I have not tried it on a
large file system, but the results of dumping the root partition did not
measure up to my expectations.  I found no substantial improvement at all
(4:39 minutes with the 4.2 dump vs. 4:33 minutes with the modified dump, for
4996 tape blocks).  This is probably due to the small size of the file system.
On an Eagle disk the root partition spreads over 100 cylinders, although a
minimum of 16 cylinders is enough to hold its 7.8 Mbytes.  With such a small
file system there is no significant advantage in creating several processes
to overlap the I/O.  I expect the time saving on a larger file system will be
substantial, however.
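The idea behind the mod can be shown in miniature (this is my own sketch, not
Don's code): two processes joined by a pipe, one reading while the other
writes, so both devices stay busy at once.  In real use the reader's if=
would be the raw disk (e.g. /dev/rra0a) and the writer's of= the tape
(e.g. /dev/rmt8); ordinary files stand in below so it can be run anywhere.

```shell
# Reader and writer run concurrently, joined by a pipe: the read of
# the next chunk overlaps the write of the previous one.  Plain /tmp
# files stand in for the raw disk and the tape drive.
echo "pretend this is a file system image" > /tmp/disk-image
dd if=/tmp/disk-image bs=10k 2>/dev/null | dd of=/tmp/tape-image bs=10k 2>/dev/null
cmp /tmp/disk-image /tmp/tape-image && echo "copy matches the original"
```

The real mod does more than this (several disk readers feeding one tape
writer), but the overlap comes from the same producer/consumer arrangement.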


Don's mod does a good job of overlapping the I/O activity.  However, it does
not solve the overhead problem in the UNIX file system.  The kernel has to be
modified to support a higher sustained I/O transfer rate.  Apparently Don is
also working on that (the forthcoming raw i/o speedup?).  Chris Torek of the
Univ. of Maryland has a mod called 'mass driver' (to dump out a huge block
of data at a high rate??).  He said he may bring it to USENIX.  That should
be a welcome speedup.

Another way is to use a bigger blocking factor.  (The default is 1Kbyte
blocks, 10 blocks per record.)  There is an undocumented option '-b' to do
just that.  However, 'restore' does not support this option!!  Dave Barto
has a version of restore that understands the '-b' option.
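A hypothetical invocation (the device and file-system names here are just
examples, not a tested recipe):

```shell
# Key letters: 0 = full dump, u = update /etc/dumpdates, b = blocking
# factor (here 32 1Kbyte blocks per record instead of the default 10),
# f = dump device.  The arguments follow in key-letter order.
dump 0ubf 32 /dev/rmt8 /dev/ra0g
```

Remember that a tape written this way cannot be read back with the stock
4.2 restore.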

Roy Smith makes a suggestion which I have also been contemplating for some
time: dump to disk and then write the image to tape afterwards.  This
shortens the time that the system has to be quiet/shut down.  Obviously you
need independent seek arms; the disk you are writing to had better not be
the one you are dumping from.  You also need to allocate a reasonably big
space for the dump; one candidate is the swap partition.  Another possibility
is to use a separate disk pack for the dump if you have a removable disk
drive.  The tape density (-d) and tape length (-s) options can be used to
control how many blocks are dumped to the device, and hence should be useful
for multi-reel dumps as well.  This mode of dumping is particularly useful
for incremental dumps.  However, I am more interested in shortening the time
for full, multi-reel dumps.  If the dump fits on one tape, I can load the
tape drive and kick off the dump in the early morning without human
assistance.  Backup is annoying only because I have to sit around waiting
for the dump to finish one tape so that I can put in another reel.  I have
hacked up dump to give me an estimate of the tape to be used, so that I can
decide whether to take the dump or not.
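The scheme, sketched with made-up names, plus the back-of-the-envelope reel
arithmetic that the -s/-d options imply:

```shell
# Stage the image on a spare disk first, then stream it to tape
# (hypothetical device and path names, left as comments):
#   dump 0uf /backup/root.dump /dev/ra0a
#   dd if=/backup/root.dump of=/dev/rmt8 bs=10k
#
# Rough reel capacity from -s (length in feet) and -d (density in
# bpi).  This is an upper bound: inter-record gaps are not charged
# for here, and they cost a lot at small record sizes.
feet=2400 bpi=1600
blocks=`expr $feet \* 12 \* $bpi / 1024`
echo "at most $blocks 1Kbyte blocks on a $feet-foot reel at $bpi bpi"
```

For a standard 2400-foot reel at 1600 bpi that works out to 45000 1Kbyte
blocks at best, which is the sort of figure my tape-estimate hack compares
against the dump size.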

Looks like with a combination of concurrent processes, the raw i/o speedup,
and a larger blocking factor, we could really speed up dump/restore a lot.
I am looking forward to the postings from Don Speck, Chris Torek and Dave
Barto.  I am also looking forward to seeing actual measurements of the
various dump speedups.

I only have the original posting from Don Speck.  I'll be happy to forward
it to anyone who missed the original one.

Thanks.
C.W.Chung
-- 
{cbosgd,fortune,hplabs,ihnp4,seismo}!nsc!daisy!cwc        Chi Wo Chung
Daisy System Corp, 700 Middlefield Road, Mountain View CA 94039. (415)960-6976