Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site akgua.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxj!ihnp4!zehntel!hplabs!sdcrdcf!sdcsvax!akgua!glc
From: glc@akgua.UUCP (G.L. Cleveland [Lindsay])
Newsgroups: net.followup
Subject: Re: Request for info on a SW package [What is ISAM?]
Message-ID: <999@akgua.UUCP>
Date: Sun, 23-Sep-84 17:34:03 EDT
Article-I.D.: akgua.999
Posted: Sun Sep 23 17:34:03 1984
Date-Received: Wed, 26-Sep-84 19:20:04 EDT
References: <2052@ucbvax.ARPA> <524@turtlevax.UUCP>
Organization: AT&T Technologies/Bell Labs, Atlanta
Lines: 73

Re:

>> From: timos@ucbingres (Timos Sellis)
>> 
>> I am looking for information on a software package which enables
>> UNIX users to work with ISAM structured files.
>
>What is ISAM?
>-- 
>Ken Turkowski @ CADLINC, Palo Alto, CA

Ken,

   As "One who was there" when IBM came out with ISAM in the late
60's, my comment on "What is ISAM" is "You *really* don't want to
know!"

   ISAM stands for Indexed Sequential Access Method.  I attended
IBM classes on its design and usage, and also had to hand-hold a
bunch of developers who were trying to use it for a random-access
application.  It could very often be a disk grinder!
It had three different files per "data set".  One was an index file
which contained a (large) on-disk table of keys pointing into the
main area.  The second file was the actual data, primarily a
sequentially-ordered file.  The third file was the "overflow" area.
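
  If a picture helps, the layout is easy enough to mock up.  Here
is a toy sketch in Python -- nothing like IBM's actual on-disk
formats, and the (key, value) records, the names, and the block
size of 4 are my own inventions -- just to show the three areas:

    # Toy model of one ISAM "data set": three areas, built from
    # records that are pre-sorted by key.
    def load_data_set(records, block_size=4):
        # Prime (main) area: the key-ordered records, grouped into
        # fixed-size blocks (tracks, on the real hardware).
        prime = [records[i:i + block_size]
                 for i in range(0, len(records), block_size)]
        # Index area: one entry per block, giving the highest key
        # stored in that block and the block's location.
        index = [(block[-1][0], n) for n, block in enumerate(prime)]
        # Overflow area: starts out empty.
        overflow = []
        return index, prime, overflow

    def add_record(overflow, key, value):
        # A record added after the initial load goes to overflow.
        overflow.append((key, value))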

  To access a record, it would scan through the index area until it
got an "equal to or greater than" match on the search key.  The IBM
disk controller had that sort of key-search logic built into the
hardware, which is why they designed this "serial scan".

  With the disk address of a block which might contain the desired
record now known, that block was read.  However, if the record had
been added since the entire data set had been loaded, it wouldn't
be there.  So then a serial search of the "overflow" area was done
to find the record (or report its absence).  This technique meant
that you had to do at least two accesses to the disk at two (often
widely-separated) tracks, with the possibility of a third (or more)
just to get one record.
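
  In terms of that toy sketch, the two-or-more-access dance for a
single record looks roughly like this (again a simplification of
mine; the real overflow handling was rather more elaborate than one
flat list):

    # Continuing the sketch: one record retrieval, showing why it
    # takes at least two (and maybe more) trips to the disk.
    def lookup(index, prime, overflow, key):
        # Step 1: serial scan of the index for an "equal to or
        # greater than" match -- the search the controller did in
        # hardware.
        block_no = None
        for high_key, n in index:
            if high_key >= key:
                block_no = n
                break
        # Step 2: read the candidate block in the main area
        # (first disk access).
        if block_no is not None:
            for k, value in prime[block_no]:
                if k == key:
                    return value
        # Step 3: the record may have arrived after the load, so
        # fall back to a serial search of the overflow area (second
        # and further accesses, on often distant tracks).
        for k, value in overflow:
            if k == key:
                return value
        return None   # record is absent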

  IBM later came out with some improvements to the technique, such
as maintaining an "index to the index" in memory.

  The kicker was if you had an active file!  The overflow area
would have too many records and the "hit" ratio in the main area
would decrease.  So then you would do a reorganization of the
entire data set.  This was done by accessing it as if it were a
sequential file and copying everything out to a tape.  Next you would
wipe out the file and recreate it from that tape, coming up with an
empty overflow area.
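
  In miniature, continuing the same sketch, the reorganization
amounts to a sequential unload followed by a fresh load:

    # Continuing the sketch: reorganization = sequential unload to
    # "tape", then a fresh load with an empty overflow area.
    def reorganize(index, prime, overflow):
        # Unload the main and overflow areas as one key-ordered
        # sequential stream (the "copy everything out to a tape").
        everything = [rec for block in prime for rec in block] + overflow
        tape = sorted(everything, key=lambda rec: rec[0])
        # Wipe the file and recreate it from that stream: the index
        # is rebuilt and the overflow area comes up empty.
        return load_data_set(tape)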

  One fellow had a file which took up an entire disk pack.  The
processing (elapsed) time was painful.  On my recommendation, he
added a preliminary step to copy the "index" portion to a separate
(temporary) disk.  After the main processing step, he then copied
the revised index back to the original pack.  These two steps took
10 minutes.  But the reduced disk arm movement changed the overall
processing time from 12 hours to 3 hours!

  I think the idea behind ISAM was to make a random access method
for a "dumb" programmer who only knew how to process seqential
(tape) files.  Fortunately the Data Base Manager packages started
appearing in the mid 70's and got rid of the appeal of the ISAM beast.


  Now...aren't you sorry you asked!

Cheers,
  Lindsay

Lindsay Cleveland  (...{ihnp4|mcnc|sdcsvax|clyde}!akgua!glc)
AT&T Technologies/Bell Laboratories ... Atlanta, Ga
(404) 447-3909 ...  Cornet 583-3909