Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!rutgers!ames!ucbcad!ucbvax!BU-CS.BU.EDU!bzs
From: bzs@BU-CS.BU.EDU (Barry Shein)
Newsgroups: mod.protocols.tcp-ip
Subject: NFS
Message-ID: <8612232003.AA05624@bu-cs.bu.edu>
Date: Tue, 23-Dec-86 15:03:38 EST
Article-I.D.: bu-cs.8612232003.AA05624
Posted: Tue Dec 23 15:03:38 1986
Date-Received: Wed, 24-Dec-86 04:17:19 EST
Sender: daemon@ucbvax.BERKELEY.EDU
Organization: The ARPA Internet
Lines: 66
Approved: tcp-ip@sri-nic.arpa

I think there is a misconception brewing here about UNIX file
semantics. It is true that the low-level UNIX system calls (e.g. OPEN,
READ, WRITE, LSEEK) impose no structure on a file except as a stream
of bytes. This is not peculiar to UNIX; almost any O/S that I know of
has some way to just get the bytes off the disk, although systems that
prefer structured/typed files tend to resist that and steer the user
towards an access method. Of course, a processor with a sophisticated
IOP (e.g. a data base back-end) might be an exception to this rule, but
I believe such situations are beyond the scope of this discussion.

Any access method could be layered on top of the UNIX low-level calls,
and many have been. Surely I could write DEC's RMS or IBM's access
methods in terms of these simple calls.
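
A minimal sketch of that layering, assuming a made-up fixed-length
record format (RECORD_SIZE, write_record and read_record are
illustrative names, not any vendor's access method). Python's os.open
family maps directly onto the low-level calls, so the "access method"
here is nothing more than arithmetic on byte offsets:

```python
import os
import tempfile

RECORD_SIZE = 32  # hypothetical fixed record length, in bytes


def write_record(fd, index, payload):
    """Store payload as record number `index`, NUL-padded to RECORD_SIZE."""
    if len(payload) > RECORD_SIZE:
        raise ValueError("payload longer than one record")
    os.lseek(fd, index * RECORD_SIZE, os.SEEK_SET)
    os.write(fd, payload.ljust(RECORD_SIZE, b"\0"))


def read_record(fd, index):
    """Fetch record number `index` and strip the NUL padding."""
    os.lseek(fd, index * RECORD_SIZE, os.SEEK_SET)
    data = os.read(fd, RECORD_SIZE)
    if len(data) < RECORD_SIZE:
        raise IndexError("no such record")
    return data.rstrip(b"\0")


def demo():
    fd, path = tempfile.mkstemp()
    try:
        write_record(fd, 0, b"first")
        write_record(fd, 2, b"third")  # record 1 is left as NUL bytes
        return read_record(fd, 2), read_record(fd, 1), os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.unlink(path)
```

The file itself records nothing about RECORD_SIZE or the padding
convention; a reader that does not share those assumptions sees only
96 bytes.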

As a specific example, consider the UNIX DBM calls, which store
arbitrary data as hashed key/value pairs. This presents the same
problem as any more structured system: how would I fetch the next
key/value pair out of such a file from a remote, non-UNIX system? It is
really no different from fetching the next ISAM record.
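
To make the DBM point concrete, here is a rough sketch using Python's
dbm.dumb module as a stand-in. That is an assumption for illustration
only: dbm.dumb is not the UNIX DBM library, does not hash, and does
not share its on-disk format. The database name and the keys are made
up.

```python
import dbm.dumb
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "phones")  # hypothetical database name
        with dbm.dumb.open(base, "c") as db:
            db[b"alice"] = b"555-0100"
            db[b"bob"] = b"555-0199"

        # With the access method: ask for pairs by key.
        with dbm.dumb.open(base, "r") as db:
            pairs = sorted((key, db[key]) for key in db.keys())

        # Without it: the data file is only a run of bytes.
        with open(base + ".dat", "rb") as raw:
            blob = raw.read()
    return pairs, blob
```

A remote system handed only `blob` can find the stored values in it,
but has no notion of "the next key/value pair" unless it also
implements the library's layout.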

This is somewhat in response to Geoff's note (which was a very good
direction for thought). I am only saying that the problem is entirely
symmetrical: there is no magic property of access methods, whether
built into an O/S or supplied as application libraries. Bytes is
bytes. The only possible difference is that a system that provides
many access methods might be able to quickly list the access
methods its users are probably using (give or take how the users
employed the various options such as record size, blocking,
bucket size, etc.).

I only bring this up so that we don't wring our hands over what I
believe to be a common misconception. Any O/S could (I presume)
present its files as a stream of bytes; the problems would then
be symmetrical.

There are some differences, such as guaranteed atomicity of updates
and types of failure (e.g. how extents are handled), but I don't
believe this discussion has reached that level of detail yet, and I
suspect those differences would be solvable within any scheme that
solves the other, more salient problems.

However, unlike Geoff, I am more pessimistic IN THE GENERAL CASE.
If two systems have a !very! similar access method, such as an ISAM
implementation, then writing an interface between the two should be
relatively straightforward, although it is still fraught with danger.
For example, IBM's V-record format uses 16 bits to express lengths;
another system may support a V-record format without using 16 bits,
so how compatible could you make those two access methods? In the case
where the access method doesn't exist at all on one side, I can't see
how it could be used (oh, I suppose a V-record could be returned to a
text-oriented application as a "string", but that sort of thing
is limited as a solution).
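
The length-field mismatch can be sketched as follows. This assumes a
deliberately simplified layout, a bare big-endian length followed by
the payload; it is not IBM's actual record descriptor, and the
function names are hypothetical.

```python
import struct


def pack_vrecord(payload, length_format):
    """Prefix payload with its length, encoded per a struct format
    such as '>H' (16 bits) or '>I' (32 bits)."""
    return struct.pack(length_format, len(payload)) + payload


def unpack_vrecord(buffer, length_format):
    """Read one length-prefixed record from the front of buffer."""
    header = struct.calcsize(length_format)
    (length,) = struct.unpack(length_format, buffer[:header])
    return buffer[header:header + length]
```

A record written with a 16-bit length and read back the same way
round-trips. Read with a 32-bit length instead, b"hello" comes back
as b"llo", because two payload bytes are consumed as part of the
length. Going the other way, a 70000-byte payload cannot be packed
with '>H' at all; struct.pack raises struct.error.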

I won't even mention the Fortran programmer who would like to access
a file full of 128-bit binary floating-point values via this NFS. (No,
XDR doesn't work unless someone knows it's time to employ it, and even
then it may not work: does your machine have 128-bit floats?)
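
A small illustration of why someone has to know when to apply a
conversion. It uses 64-bit IEEE doubles as a stand-in, since Python's
struct module has no 128-bit float format; the helper names are made
up for the example.

```python
import struct

# Eight bytes as they might arrive from a remote file: just bytes.
raw = struct.pack(">d", 1.0)  # big-endian IEEE 754 double


def as_big_endian_double(data):
    return struct.unpack(">d", data)[0]


def as_little_endian_double(data):
    return struct.unpack("<d", data)[0]


def as_big_endian_int64(data):
    return struct.unpack(">q", data)[0]
```

All three calls succeed on `raw`, and only the first returns 1.0.
Nothing in the byte stream says which reading is the right one; that
knowledge has to come from outside the file.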

I don't think it's insoluble, but I do suspect we will have to be
prescriptive (rather than descriptive) to provide a standard. Given
a standardized menu of network access methods, could *you* do your
work?

-Barry Shein, Boston University