Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!mcvax!diku!olamb!kimcm
From: kimcm@olamb.UUCP (Kim Chr. Madsen)
Newsgroups: comp.databases
Subject: Re: Optimization algorithms
Message-ID: <185@olamb.UUCP>
Date: Wed, 17-Dec-86 03:32:13 EST
Article-I.D.: olamb.185
Posted: Wed Dec 17 03:32:13 1986
Date-Received: Thu, 18-Dec-86 20:36:36 EST
References: <8612010047.AA14252@BORAX.LCS.MIT.EDU> 9768@sri-spam.istc.sri.com <8988UH2@PSUVM>
Organization: AmbraSoft A/S (Denmark)
Lines: 42

In article <8988UH2@PSUVM>, UH2@PSUVM.BITNET writes:
> 1. Have you tried egrep? In theory it takes longer to encode the
>    search string, but then the search is faster.

All very fine for ordinary ASCII files that are not too big! But for
large files, or files you have ordered in some way to optimize
searching (not necessarily ASCII sort order), even egrep is too slow
to produce the answers.

Another thing is that regular expressions (as used in egrep) tend to
be unintelligible to non-UNIX-wizards (e.g. secretaries), and making
shell scripts to hide the REs will almost certainly limit their
usability.

>
> 2. By adding structure to your phone book, you are gonna lose the
>    transparent interface to other tools provided by the shell.

OK, UNIX provides you with a byte stream, which is all very fine and
makes a lot of applications available to the same files. But the
concept is already broken in several places; just think of the
/etc/passwd file: a nicely structured file with a record structure.
Yet you can still use standard UNIX tools on it (cat, sed, awk, vi,
etc.).

Database systems that use C-ISAM (or the like) will normally make
similar ASCII files, with one record per line and fields separated by
a special character. (Not always a good idea: many use '|' (vertical
bar) as the default separator, but this is a genuine letter in many
languages. Still, we have to face that UNIX is based on good ole
7-bit ASCII.)
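
To make the one-record-per-line layout concrete, here is a minimal
sketch in Python (not something from the original discussion). The
sample data, the FIELDS tuple and the parse_records helper are all
made up for illustration; only the seven colon-separated fields follow
the familiar /etc/passwd layout.

```python
# Hypothetical sample data in /etc/passwd layout: one record per line,
# seven fields separated by ':'.
SAMPLE = """\
root:*:0:0:Super User:/:/bin/sh
kimcm:*:101:20:Kim Chr. Madsen:/usr/kimcm:/bin/csh
"""

# Illustrative field names; they are labels for this sketch only.
FIELDS = ("name", "passwd", "uid", "gid", "gecos", "home", "shell")


def parse_records(text, sep=":"):
    """Split each non-empty line into a dict keyed by FIELDS."""
    records = []
    for line in text.splitlines():
        if not line:
            continue
        values = line.split(sep)
        if len(values) != len(FIELDS):
            # A separator character inside a field would land here,
            # which is the weakness of single-character separators.
            raise ValueError(f"expected {len(FIELDS)} fields: {line!r}")
        records.append(dict(zip(FIELDS, values)))
    return records


records = parse_records(SAMPLE)
```

The file stays plain text, so cat, sed and awk keep working on it; the
record structure exists only in how a program chooses to read each
line.
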
This ASCII file (the table file) is normally paired with an index
file, which contains the offsets of the records and fields in the
table file.

Other database systems do not use the standard UNIX filesystem but set
up a disk partition which they access raw, with their own I/O
routines. This approach is generally used to speed up disk I/O, which
tends to be the bottleneck in database systems (giving such a system
superior performance compared with systems using the standard UNIX
FS). This may sound far from the UNIX approach, but again it has been
done in standard UNIX for a long time: think of the swap partition on
every UNIX disk, and think of the performance loss it would be to
implement swapping on the standard FS.

Kim Chr. Madsen
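
The table-file-plus-index-file arrangement mentioned above can be
sketched roughly as follows, in Python. This is only an illustration
of the idea of offset indexes, not C-ISAM's actual on-disk format: the
phone-book data and the build_index and fetch helpers are
hypothetical, the index is kept in memory rather than in a separate
file, and it records only where each record starts, not each field.

```python
import io


def build_index(table):
    """Return the byte offset at which each record (line) starts."""
    offsets = []
    table.seek(0)
    while True:
        pos = table.tell()
        line = table.readline()
        if not line:
            break
        offsets.append(pos)
    return offsets


def fetch(table, offsets, n, sep=b"|"):
    """Seek straight to record n and return its fields as a list."""
    table.seek(offsets[n])
    return table.readline().rstrip(b"\n").split(sep)


# Hypothetical phone-book table: one record per line, '|' between
# fields. An in-memory byte stream stands in for the table file.
table = io.BytesIO(
    b"Andersen|555-0101\n"
    b"Madsen|555-0142\n"
    b"Smith|555-0199\n"
)
index = build_index(table)
second = fetch(table, index, 1)
```

With the offsets at hand, a lookup is one seek and one read instead of
a scan from the top of the file, while the table file itself remains
readable by the ordinary text tools.
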