Path: utzoo!attcan!uunet!steinmetz!davidsen
From: davidsen@steinmetz.ge.com (William E. Davidsen Jr)
Newsgroups: comp.binaries.ibm.pc.d
Subject: Re: SIMTEL20 to ban ARC files
Keywords: lzw, atob/btoa, 7 bit pure
Message-ID: <12229@steinmetz.ge.com>
Date: 26 Sep 88 17:12:09 GMT
References:   <6630@ihlpl.ATT.COM> <2736@uoregon.uoregon.edu> <8475@smoke.ARPA> <2594@csccat.UUCP> <424@pigs.UUCP> <6583@dasys1.UUCP>
Reply-To: davidsen@crdos1.UUCP (bill davidsen)
Organization: General Electric CRD, Schenectady, NY
Lines: 44

In article <6583@dasys1.UUCP> tneff@dasys1.UUCP (Tom Neff) writes:

| This is somewhat reminiscent of the old order-of-operations quandary
| in the days of LBR and SQ on CP/M and MSDOS.  There you had two separate
| tools, a squeezer and a librarian.  There were two headaches.  First,
| people kept LBR'ing *first* and then squeezing (yielding an LQR file),
| which is the wrong way to do it for two well-defined reasons[1].

| --------------------------
| 
| NOTES
| 
| [1] Squeezed libraries are much worse than libraries of individually
| squeezed members because (a) an LQR has no immediately accessible
| directory structure - it has to be unsqueezed before you can look at
| it; and (b) dissimilar member files (README, executable, fonts etc)
| yield a heterogeneous library which responds poorly to most types
| of compression.

  Your conclusion is not universally valid. When the member files are
similar (source code, data files, etc.), the compression will be greater
if the archive is compressed as a whole: LZW is adaptive, so strings
learned from one member are reused on the next, and the ratio improves
as the input grows.
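
  A minimal sketch of the effect, not the code of compress, arc, or zoo:
a textbook LZW encoder that only counts the codes it would emit. The
name lzw_code_count and the synthetic "source files" in members are
made up for illustration, the string table is allowed to grow without
limit (real implementations cap the code width), and code counts are
only a rough stand-in for output size.

```python
def lzw_code_count(data: bytes) -> int:
    """Count the codes a textbook LZW encoder would emit for data.

    Assumption: the string table grows without bound; real tools cap it.
    """
    # Start with one table entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = 0
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the match while the table knows the string.
            current = candidate
        else:
            codes += 1                     # emit the code for `current`
            table[candidate] = len(table)  # learn the new, longer string
            current = bytes([byte])
    if current:
        codes += 1                         # flush the final match
    return codes


# Hypothetical, deliberately similar "source files".
members = [
    (f"int func{i}(int x)\n{{\n    return x + {i};\n}}\n" * 20).encode()
    for i in range(8)
]

# Each member starts from an empty table vs. one table shared by all.
individual = sum(lzw_code_count(m) for m in members)
grouped = lzw_code_count(b"".join(members))

print("codes, members compressed individually:", individual)
print("codes, members compressed as one stream:", grouped)
```

  In the grouped case every member after the first starts with a table
already full of strings from its neighbors, which is where the saving
comes from. With dissimilar members that advantage mostly disappears.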

  As an example I compressed a set of source files in two ways: each
file compressed individually and then placed in an archive, and all
files placed uncompressed in an archive which was then compressed as a
single file. For cpio, that meant compressing the files and running cpio
on the results vs. running cpio on all the files and compressing the
output. The second method still allows use of zcat to list the directory
or extract.

Results (percent compression; higher is better):
 archiver       individual%     group%
  zoo               58            68
  compress          57            68
  arc               55            66
  cpio+cmprs        60            66

  If the objective is convenient storage with some compression, you are
completely correct, but for saving disk space or transfer time,
compressing members individually is not optimal in many common cases.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me