Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!mcvax!diku!olamb!kimcm
From: kimcm@olamb.UUCP (Kim Chr. Madsen)
Newsgroups: comp.databases
Subject: Re: Optimization algorithms
Message-ID: <185@olamb.UUCP>
Date: Wed, 17-Dec-86 03:32:13 EST
Article-I.D.: olamb.185
Posted: Wed Dec 17 03:32:13 1986
Date-Received: Thu, 18-Dec-86 20:36:36 EST
References: <8612010047.AA14252@BORAX.LCS.MIT.EDU> 9768@sri-spam.istc.sri.com <8988UH2@PSUVM>
Organization: AmbraSoft A/S (Denmark)
Lines: 42

In article <8988UH2@PSUVM>, UH2@PSUVM.BITNET writes:
> 1.  Have you tried egrep?  In theory it takes longer to encode the
> search string, but then the search is faster.

All very fine for ordinary ASCII files that are not too big! But for big
files, or files you have ordered in some way to optimize searching (which
might not be ASCII sorting), even egrep is too slow to produce the answers.

Another thing is that Regular Expressions (as used in egrep) tend to be
unintelligible to non-UNIX-wizards (e.g. secretaries), and making shell
scripts to hide the REs will almost certainly limit their usability.

> 
> 2.  By adding structure to your phone book, you are gonna lose the
> transparent interface to other tools provided by the shell.

OK, UNIX provides you with a byte-stream, which is all very fine and makes a
lot of applications available to the same files. But the concept is
already broken in several places; just think of the /etc/passwd file: a
nicely structured file, with a record structure. But you can still use
standard UNIX tools on it (cat, sed, awk, vi, etc.).
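
To make the point concrete, here is a minimal sketch (in Python, purely for
illustration) of reading a file laid out like /etc/passwd as records while it
stays an ordinary text file. The sample entries and the names SAMPLE, FIELDS
and parse_passwd are made up for this example; only the seven colon-separated
fields follow the usual passwd layout.

```python
# Made-up sample data in the usual /etc/passwd layout:
# name:passwd:uid:gid:gecos:home:shell
SAMPLE = """root:x:0:0:Super User:/:/bin/sh
alice:x:101:10:Alice Example:/usr/alice:/bin/csh
"""

FIELDS = ("name", "passwd", "uid", "gid", "gecos", "home", "shell")


def parse_passwd(text):
    """Turn passwd-style text into a list of dicts, one per line (record)."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Each line is one record; ':' separates the fields.
        records.append(dict(zip(FIELDS, line.split(":"))))
    return records


records = parse_passwd(SAMPLE)
```

The same text can still be piped through cat, sed or awk, since nothing in it
is more than lines of characters.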

Database systems that use C-ISAM (or the like) will normally make
similar ASCII files, with one record per line and fields separated by a
special character. (That is not always a good idea: many use '|' (vertical
bar) as the default separator, but this is a genuine letter in many
languages. Still, we have to face that UNIX is based on good ole 7-bit
ASCII.) This file (the table file) is normally paired with an index
file, which contains the offsets of the record entries and field entries
in the table file.
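
The table-file-plus-index idea can be sketched roughly as follows. This is
not the actual C-ISAM file format, only an illustration under simple
assumptions: the index is an in-memory mapping from a key field to the byte
offset of the record's line, and field offsets are left out. The names SEP,
build_index, fetch and demo, and the phone-book rows, are hypothetical.

```python
import os
import tempfile

SEP = b"|"  # assumed field separator, as in the text above


def build_index(table_path, key_field=0):
    """Map each record's key to the byte offset where its line starts."""
    index = {}
    with open(table_path, "rb") as table:
        while True:
            offset = table.tell()      # offset of the line about to be read
            line = table.readline()
            if not line:
                break                  # end of file
            key = line.rstrip(b"\n").split(SEP)[key_field]
            index[key] = offset
    return index


def fetch(table_path, index, key):
    """Seek straight to one record instead of scanning the whole file."""
    with open(table_path, "rb") as table:
        table.seek(index[key])
        return table.readline().rstrip(b"\n").split(SEP)


def demo():
    # Made-up phone-book rows: key|first name|number
    rows = [b"madsen|Kim|555-0101\n",
            b"jensen|Ole|555-0102\n",
            b"hansen|Eva|555-0103\n"]
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.writelines(rows)
        path = f.name
    try:
        index = build_index(path)
        return index, fetch(path, index, b"jensen")
    finally:
        os.remove(path)
```

A lookup then costs one seek and one line read, however large the table
file is, while the table file itself remains readable with the usual tools.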

Other database systems do not use the standard UNIX filesystem, but set
aside a disk partition which they access raw, with their own I/O routines.
This is generally done to speed up disk I/O, which tends to be the
bottleneck in database systems (giving such a system superior performance
compared with systems using the standard UNIX FS). This may sound far
from the UNIX approach, but again it has been done in standard UNIX
for a long time: think of the swap partition on every UNIX disk, and
think of the performance loss it would be to implement swapping on
the standard FS.

					Kim Chr. Madsen