Usenet
Path: utzoo!utgpu!watmath!clyde!att!rutgers!ukma!uflorida!novavax!proxftl!twwells!comparc
From: comparc@twwells.uucp (comp.archives)
Newsgroups: comp.archives
Subject: Comp.archives database format
Message-ID: <234@twwells.uucp>
Date: 1 Dec 88 11:15:08 GMT
Reply-To: comparc@twwells.UUCP (comp.archives)
Organization: None, Ft. Lauderdale
Lines: 333
Approved: bill@twwells.uucp (T. William Wells)
This is the third attempt at the database structure. Changes are
still possible, so send in any comments you might have.
Here is a short summary of the changes from the previous version:
1) The definition of a site is somewhat vague. What I am going to
do is to consider one set of archives under the control of a
single administrator as an archive site. This means that the
site entry won't have different sets of data for archives
located at the same site. This also means that the archive
name will be somewhat less related to the address of the
archive.
2) The access method and access tag of the CO fields have been
swapped. The access method now comes first.
---
Comments in the databases begin with a #. They are retained with the
data but are otherwise ignored.
In the line-oriented databases, a line that has no value must still
be entered: give the keyword and leave the rest of the line blank.
---
The site database contains a series of entries separated by blank
lines. Each entry has the following lines:
NM
EN
TM
TT
AD
MA
CO
IX
KW
DE
Each of the lines from AD to DE may be repeated as often as necessary
to contain the data.
Following is a detailed description of each line.
NM
This name should be related to the address used to find the site,
though it doesn't have to be. This should be kept fairly short.
EN <user>@<site> (<name>) <date>
This says who entered the database entry. The <date> is the
output from the date command.
TM <zone>;[[<day>],...<start>-<end> <load>];...
This lets people know when the best times are to use the archive.
The first field is the time zone the archive is located in; all
times in the site entry are presumed to be relative to that time
zone. <day> is a three-letter day abbreviation. <start> and
<end> are times in 24 hr notation. <load> is a single word
describing the load on your system at these times; the suggested
words are: none, light, moderate, heavy, swamped, best, worst.
TT
A short title for the archive.
AD <user>@<site> (<name>)
The person who administers the archive. If more than one person
administers the archive, there should be more than one of these.
MA
The mailing address for help or information. Leave this blank
unless you want snail-mail. People who mail to this address had
better include a SASE or e-mail address or forget about getting
any response.
CO ftp;<tag>;<domain>;<address>;<directory>;<times>
CO uucp;<tag>;<path>;<L.sys-entry>
This line describes each method of getting at the archive. If there
is more than one way to get at the archive, or more than one
directory containing archive information, then there will be more
than one of these lines.
The access tag is used when not all of your archived information
is available through all paths to your site. For example, you
might have a mail based server for small items but require a
direct link for larger things. Each item that you list as
available through your archive has a tag that is used to indicate
which way it can be accessed.
There may be more than one line for a single tag. This would mean
that there is more than one way to get to the same set of
information.
The access method is the first field on the line. Right now, it
is either uucp or ftp; more will be added as needed.
The remaining fields depend on the access method.
There are two fields for uucp access. The first is the path name
which archive file names are relative to. The second is an L.sys
entry that would be used to access your site.
For ftp, the fields are the domain name for accessing the archive,
the internet address for the above, the directory where the
archive information resides, and the times when the archive is
available.
If the archive is always available, leave that field blank.
Otherwise, format as [[<day>],...<start>-<end>];...
IX ;;;;;
This line describes the index file(s) for the archive. It is the
same format as the entries in the index database, except that the
first three fields are not present. You should also list README
files and the like.
KW <keyword>,...
This is a list of keywords that describe what the site carries.
DE
These are a few lines that describe the site. They should be kept
reasonably short, but should give any information not specified in
the previous lines that might be useful to the archive user.
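As a rough illustration of the CO line described above, here is a
minimal Python sketch that splits one line into its fields. The
dictionary keys (tag, domain, address, directory, times, path, lsys)
are names invented for this example, and the sample values in the
usage note are made up. It assumes the last field may itself contain
semicolons, since the availability times are separated by them.

```python
# Illustrative field names per access method; these names are assumptions,
# not part of the database format itself.
CO_FIELDS = {
    "ftp": ("tag", "domain", "address", "directory", "times"),
    "uucp": ("tag", "path", "lsys"),
}


def parse_co(line):
    """Split one CO line into a dict: access method first, then its fields."""
    keyword, _, rest = line.partition(" ")
    if keyword != "CO":
        raise ValueError("not a CO line: %r" % line)
    method, _, tail = rest.partition(";")
    if method not in CO_FIELDS:
        raise ValueError("unknown access method: %r" % method)
    names = CO_FIELDS[method]
    # Limit the split so the final field keeps any semicolons it contains.
    values = tail.split(";", len(names) - 1)
    if len(values) != len(names):
        raise ValueError("expected %d fields after %r" % (len(names), method))
    record = {"method": method}
    record.update(zip(names, values))
    return record
```

For example, parse_co("CO uucp;main;/usr/archives;somesite Any ACU 2400")
would give a record whose "path" is /usr/archives; the tag and the
L.sys text there are hypothetical.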
---
The archived information database contains a series of entries
separated by blank lines. Each entry has the following lines:
NM
VR
AU
MA
EN
TT
KW
SY
DE
Following is a detailed description of each line.
NM
The name of the item. If the item is a program that runs in
one environment, the name is that environment hyphenated with
the program name; otherwise it is just the name. Note that
this is not intended to be useful by itself, e.g., unix-pcomm
might eventually also refer to something that has been made to
work under VMS. Should there be two items with the same name,
the later item will have its author's name appended. For
example, should John Turkey later write a pcomm for UNIX, it
would be called unix-pcomm-turkey.
VR version
VR date
These tell which version this entry refers to. The first form is
used for things with named versions, the second is used for
something which is regularly updated. The date, for the second
format, is yymmdd, and specifies the date the thing was last
updated. Some things are so continuously updated that they
should not have a version; for them, leave this line blank.
AU <user>@<site> (<name>)
This is the person or persons who wrote the thing. If there
is more than one author, use more than one line.
MA <user>@<site> (<name>)
This is who is maintaining the item. If the item is not being
maintained, leave this blank. If several people are
maintaining it, use several lines. Note that anyone whose name
is on one of these lines can expect e-mail about the item.
EN <user>@<site> (<name>) <date>
This is the person responsible for the entry and the date on
which the entry was added or updated.
TT
A title for the item.
KW <keyword>,...
Keywords describing the item. Some good kinds of keywords:
`all-source', which means that all the source (other than that
of the tools mentioned below) needed is included;
`public-domain', which indicates that the item is in the public
domain.
SY <hardware>[:<options>];<os>[:<options>];<effort>;<tools>
For each system this item runs on (or must be used on), there
should be one of these lines. The fields are:
1) The hardware it runs on. If it runs on any hardware which a
particular OS runs on, the entry is `any'. If the item
needs hardware other than the standard for the system, add
words for it after a colon.
2) The OS it runs under. There are several generic names like
`unix' or `sysv-unix'. Optional OS things which are needed
are indicated the same way hardware options are. Also,
software which is not listed in this database which is
needed to make this item go is listed here. For example,
were this item to be a Dbase program, this field would be:
MS-DOS:Dbase-II.
3) How much effort is needed to make it go. If following the
directions is sufficient, the entry is `install'.
4) This entry contains any tools, not normally available on
your system, which one must have in order to build or use
this item. All items which are in this section must also
have their own entries in the information directory.
DE
This is a short description of the item. This should be kept
brief; putting the man page here is not appropriate.
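To make the four SY fields concrete, here is a small Python sketch
that splits an SY line on semicolons and then splits the hardware
and OS fields on colons to pick out their options. The dictionary
keys are names chosen for this example only. The tools field is
left as a raw string, because the separator between several tools
is not stated above.

```python
def split_options(field):
    """Split 'name:opt1:opt2' into ('name', ['opt1', 'opt2'])."""
    name, *options = field.split(":")
    return name, options


def parse_sy(line):
    """Parse one SY line into hardware, OS, effort and tools."""
    keyword, _, rest = line.partition(" ")
    if keyword != "SY":
        raise ValueError("not an SY line: %r" % line)
    parts = rest.split(";")
    if len(parts) != 4:
        raise ValueError("expected 4 fields, got %d" % len(parts))
    hardware, hardware_options = split_options(parts[0])
    os_name, os_options = split_options(parts[1])
    return {
        "hardware": hardware,
        "hardware_options": hardware_options,
        "os": os_name,
        "os_options": os_options,
        "effort": parts[2],
        "tools": parts[3],  # kept raw; list separator is not specified
    }
```

With the Dbase example from field 2, an OS field of MS-DOS:Dbase-II
comes back as os "MS-DOS" with os_options ["Dbase-II"].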
Here is an entry, suitable for the databases created through
comp.archives.
NM free-distribution-database
VR
AU bill@twwells.UUCP (T. William Wells)
MA bill@twwells.UUCP (T. William Wells)
EN bill@twwells.UUCP (T. William Wells) Fri Nov 11 00:56:16 EST 1988
TT Database of freely distributable, electronically accessible information.
KW database,public-domain
SY any;any;;
DE This database is constructed from the information that passes
DE through comp.archives. It contains information on any software,
DE databases, documents, or what-have-you, that is both freely
DE distributable and available electronically. "Freely
DE distributable" means that, if you have a copy of the item, you
DE can (at least) make exact copies and give them away, and you
DE don't have to tell the owner of the item (if any) that you have
DE done so. "Electronically available" means that it is either
DE accessible through a publicly accessible network, or is available
DE by a means that does not involve paying a fee to the
DE distributor. This information is provided as a free service and
DE there is *no one* guaranteeing that any of it is accurate or
DE useful. Use it at your own risk.
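An entry like the one above could be read with something along
these lines. This is only a sketch in Python: it assumes the
keyword is everything up to the first space, collects repeated
keywords (such as DE) into lists, and simply skips # comments,
whereas a real tool would have to retain them with the data. The
SAMPLE text is a shortened copy of the entry above.

```python
SAMPLE = """\
# shortened copy of the example entry
NM free-distribution-database
VR
AU bill@twwells.UUCP (T. William Wells)
KW database,public-domain
SY any;any;;
DE This database is constructed from the information that passes
DE through comp.archives.
"""


def parse_entries(text):
    """Return a list of entries; each maps a keyword to its list of values."""
    entries = []
    current = {}
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.startswith("#"):
            continue  # sketch only: comments are dropped, not retained
        if not line:
            # A blank line separates entries.
            if current:
                entries.append(current)
                current = {}
            continue
        keyword, _, value = line.partition(" ")
        current.setdefault(keyword, []).append(value)
    if current:
        entries.append(current)
    return entries


entry = parse_entries(SAMPLE)[0]
description = " ".join(entry["DE"])
```

A line holding only its keyword, such as the VR line here, comes
back as an empty string rather than being dropped.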
---
The site index ties the previous two databases together. This is the
format:
;;;;;;
;;
The first two fields link this entry to an entry in the info
database; they correspond to the NM and VR fields. If this
file is not listed in the database, these fields are blank.
`Site-name' is the name of the site, as recorded in the site
database.
`Access-type' is one of the access tags specified in the site
entry. Note that this is in the style of UNIX file names:
wild cards are permitted.
`handle' is used with the information from the site entry to
construct the request from the archive. For example, using
uucp, if the site entry contained /usr/archives as the path
to which file names are relative, and this field contains
foobar.shar, then the path name you should use to get this
item is /usr/archives/foobar.shar.
`Date' is the date on which this entry was added to the database.
This should be yymmdd.
`Tools' is a list of programs needed to unarchive the file;
each must be a name in the info database. Standard system
utilities are not listed.
`Comments' is anything useful to add.
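The handle, access-type and date fields above might be handled as
in the following Python sketch. The function names are invented for
illustration. It assumes that "wild cards are permitted" means the
access-type can be a shell-style pattern matched against the tags
in the site entry, which is one reading of the text and not a
confirmed rule.

```python
import fnmatch
import posixpath
from datetime import datetime


def resolve_handle(base_path, handle):
    """Join the site's base path and an index handle, UNIX style."""
    # Note: an absolute handle would replace base_path entirely.
    return posixpath.join(base_path, handle)


def matching_tags(access_type, site_tags):
    """Return the site tags matched by a (possibly wildcarded) access-type."""
    return [tag for tag in site_tags if fnmatch.fnmatchcase(tag, access_type)]


def parse_yymmdd(text):
    """Turn a yymmdd string into a date; two-digit years 69-99 mean 19xx."""
    return datetime.strptime(text, "%y%m%d").date()
```

Using the uucp example above, resolve_handle("/usr/archives",
"foobar.shar") gives /usr/archives/foobar.shar.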
---
The DB: postings contain information to update the database. The
update information starts with the first line beginning with an @ and
ends with a line containing @END. Additional information, not
intended to be part of the database can be added before the first @
line or after the @END line.
Commands to add data look like:
@ADD <database>
and the following data is what is to be added. <database> is one of
the strings INFO, SITE, or INDEX. The new data is terminated by a
blank line. This blank line is required, no matter what the next
command is.
Commands to delete data look like:
@DEL <database> <key>
The key depends on what is being deleted. Deletions from the
information database just use the item name. Deletions from the site
database use the site name. Deletions from the archive index use the
site name, the access method, and the access handle for the line to be
deleted.
There is a special command to delete all index entries for a site;
its form is:
@DELALL INDEX <site-name>
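As a sketch of how a DB: posting could be scanned, the Python below
skips everything before the first line beginning with @, stops at
@END, and collects each command. It assumes the database name
follows the command on the same line, with the key after it for
deletions, separated by spaces; that layout is an assumption, as is
the treatment of the key as one opaque string.

```python
def parse_posting(text):
    """Return a list of (command, database, payload) tuples from a posting."""
    lines = text.splitlines()
    commands = []
    i = 0
    # Text before the first @ line is not part of the update.
    while i < len(lines) and not lines[i].startswith("@"):
        i += 1
    while i < len(lines):
        words = lines[i].split(None, 2)
        i += 1
        if not words or not words[0].startswith("@"):
            continue  # blank line or stray text between commands
        name = words[0][1:]
        if name == "END":
            break  # anything after @END is ignored
        database = words[1] if len(words) > 1 else ""
        if name == "ADD":
            # New data runs up to the required terminating blank line.
            data = []
            while i < len(lines) and lines[i].strip():
                data.append(lines[i])
                i += 1
            commands.append((name, database, data))
        elif name in ("DEL", "DELALL"):
            key = words[2] if len(words) > 2 else ""
            commands.append((name, database, key))
        else:
            raise ValueError("unknown command: %r" % words[0])
    return commands
```

Because an @ADD block is read up to the next blank line, leaving
out the required blank line would cause the following command to be
swallowed as data, which is one reason the blank line matters.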
---
Bill
{uunet|novavax}!proxftl!twwells!bill
send comp.archives postings to twwells!comp-archives
send comp.archives related mail to twwells!comp-archives-request