Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!mcnc!gatech!bloom-beacon!think!ames!sdcsvax!ucbvax!CUNIXC.COLUMBIA.EDU!cck
From: cck@CUNIXC.COLUMBIA.EDU
Newsgroups: comp.protocols.appletalk
Subject: MacNFS's file mapping.
Message-ID: <8707191828.AA21745@columbia.edu>
Date: Sun, 19-Jul-87 14:27:44 EDT
Article-I.D.: columbia.8707191828.AA21745
Posted: Sun Jul 19 14:27:44 1987
Date-Received: Sun, 19-Jul-87 20:49:41 EDT
References: <@andrew.cmu.edu:tom@citi.umich.edu> <8707180921.AA17507@columbia.edu>
Sender: daemon@ucbvax.BERKELEY.EDU
Distribution: world
Organization: The ARPA Internet
Lines: 196

Tom's message is very well thought out. I do have some things to add, though. Let's take a step back and define the "requirements" as we (Bill and I) saw them.

> I. STORAGE OF THE DATA, RESOURCE, AND FINDER INFO FORKS

The format should allow:

Primary:
  P1  storage of Macintosh files under Unix with complete information
      (e.g. resource, data, and "finder info" forks)
  P2  use of Unix files under the Mac OS (e.g. allow the Mac to access
      files not stored as in P1)
  P3  quick, efficient access for the various network servers/clients
      (e.g. allow Mac NFS/Aufs/Tops to enumerate and access files, etc.)

Secondary:
  S1  access to Macintosh files stored on a Unix file system through Unix

Bill S. and I both strongly disagree with the approach of combining the three files into one file! There are significant disadvantages and few advantages. A few disadvantages are: the need for special routines to handle files under (S1), difficulties in handling (P2), problems with the "holes" in Unix files that this would require, etc. The primary advantage would be that it appears simpler to the "naive" user, and the coordination of the three parts would be "built in" (e.g. you would never be left in the situation where you have a .resource file and no .finderinfo and data files).
This method might well be the method of choice if the Unix system were only a file server and naught else.

The primary differences between the following two approaches are:
  o Aufs scheme has better coherence than EFS scheme, though same as A/UX scheme.
  o Aufs scheme is easier to implement!

Using three files in one directory.

As Tom noted, if anything set a standard in the past, it was EFS. We thought about it long and hard before we decided not to go with this scheme. I don't remember the details of the conversation, but will enumerate some of the advantages of the scheme we decided upon later.

The EFS and A/UX schemes meet goal (P1) completely; (P2) requires a bypass mechanism under the EFS scheme and can be considered covered by the A/UX scheme; and (P3) is reasonably handled. (S1) is also well handled.

The Aufs scheme is quite simple (Tom covered most of this, but I wish to reiterate with some justifications). Simply: the data fork is the closest match to a Unix file, so store it as-is in the specified directory (same as A/UX). The resource fork and the so-called "finder info" fork (mostly part of the desktop on the Mac - finder info in the resource fork is still there, though) are "special" and can be stored under the same name in special subdirectories of the specified directory. To be concrete, the Mac file "keeper" stored in a directory "stuff" would be stored by Aufs on the Unix file system as:

  stuff/keeper              - data fork
  stuff/.finderinfo/keeper  - "finder info" fork
  stuff/.resource/keeper    - resource fork

Advantages: easy to scan directories for files, easy to manipulate Mac and Unix files in a rational way, elegant - most implementation decisions are resolved in an easily manageable way with few problems.

Disadvantages: pain to do copies, moves, and deletes on stored Mac files without utility programs.

With one caveat, this scheme completely covers the goals P1-P3 and S1 listed above.
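
As a rough sketch of the three-file layout described above (Python; the function and constant names here are made up for illustration and are not taken from the Aufs source), the mapping from a Mac file to its three Unix paths might look like this:

```python
import posixpath

# Subdirectory names as given in the post; the constant names are hypothetical.
FINDERINFO_DIR = ".finderinfo"
RESOURCE_DIR = ".resource"


def aufs_fork_paths(directory, name):
    """Return the Unix paths holding each fork of the Mac file `name`.

    The data fork lives directly in `directory`; the other two forks
    are stored under the same name in special subdirectories.
    """
    return {
        "data": posixpath.join(directory, name),
        "finderinfo": posixpath.join(directory, FINDERINFO_DIR, name),
        "resource": posixpath.join(directory, RESOURCE_DIR, name),
    }


if __name__ == "__main__":
    for fork, path in aufs_fork_paths("stuff", "keeper").items():
        print(fork, path)
```

Because the data fork keeps its plain name in the directory, ordinary Unix tools see it as a normal file, which is what makes (S1) cheap in this layout.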
Caveat: to implement P2, we must "default" finder information for Unix files (e.g. assign "default" finder information to be used when no ".finderinfo/.." file is found).

Enough of this though - I could go on listing advantages and disadvantages for a long time. You know the scheme I advocate.

> 3. Other issues. (E.g. no data fork, no resource fork situations.)

Well, for Aufs I think the best way to explain this is to say that a directory with .finderinfo and .resource subdirectories is considered to be a "Macintosh" directory - e.g. a reasonable place to store Macintosh files. (The distinction also makes it easy for us to simply tell people that only certain directories are special, e.g. have the special subdirectories.) We believe the tradeoffs here - primarily that you cannot store a Macintosh file just anywhere (as a matter of fact, I consider this a distinct advantage) - are reasonable.

Aufs will only create the .finderinfo and .resource directories when it receives the "create directory" command - e.g. "New Folder". This means that for the various "unix" directories (for example /usr/bin), no junk will be left lying around. We believe this to be important, and it quickly resolves the issue of when to create files - iff the appropriate directories exist.

One more issue that Tom did not bring up is that the contents of the so-called finder information fork need to be standardized. Currently Aufs stores the 32 bytes of finder information (cf. AFP spec.) and any comment in this file. Additional information might be an AFP "short name" for MS-DOS style clients and/or a mapping from the 14 character SVID file names to 32 character Macintosh file names - some careful thought is required to determine whether this is the appropriate place for these mappings (some more on this later).

> II. FORMAT OF TEXT FILES.

Nothing to add to what Tom has to say except that he has some good ideas here. Hopefully, we will add some (in some form) to Aufs.

> III. FILE NAME MAPPING.
Aufs does handle it slightly differently.

Mac name to Unix name: Any non-printing character (and "/") is stored as two hexadecimal digits "escaped" by a colon (vs. a ^ under NFS).

Unix name to Mac name: Treats ":hh" as the hex representation of a character. Sequences such as "::" or ":" followed by a non-hex character result in the ":"(s) being translated into "|"(s).

We chose ":" because it can't be in Mac file names and is rarely if ever used in Unix file names.

The 14 character file name problem that all the SVID compliant systems such as A/UX, HPUX, etc. pose can be resolved in two ways:
  o head in sand - simply don't allow names longer than 14 characters (not really so ridiculous - most names are reasonable).
  o some mapping database - can reasonably live in three places:
      a) as part of finderinfo or another such "special" file
      b) as part of the volume desktop information
      c) in the directory as so-called "local" desktop information

Not sure which to do yet. We don't consider (b) to be a particularly efficient or clean solution (reeks too much of keeping a "directory" of the files in the volume - a real problem for Unix files and for being able to access files via Unix).

Another problem to be mentioned is that the Mac OS doesn't distinguish case while Unix does. Aufs simply ignores the difference because most (if not all) Mac OS utilities will display the correct case and use the correct case in accessing the files. A notable exception is MPW. A simple solution might be to lowercase everything, but then you have the problem that two Unix files, Makefile and makefile, can co-reside - which one is the right one? The way things are now you will get the one with the case you specify (e.g. always the right one - not sure if both are displayed by the Finder/standard file package though). (Simple solution - make the Mac OS distinguish case in file names :-).

> IV. DESK TOP INFORMATION.
Aufs separates the icon and application info into .IDeskTop and .ADeskTop for one reason - it's simpler to handle.

We were careful about the amount of information that had to be shared per volume (e.g. the .ADeskTop and .IDeskTop files) because of the problems in resolving competing reads/writes. (Note: for files, just hope for the best!!! - this means two people with write permission to the same volume had better be careful!!!) Aufs supports read-only volumes right now. In fact, the two primary uses of Aufs at our site are: (a) private (read: individual) file storage, where coordination of reads/writes is not a real issue, and (b) shared read-only volumes.

I guess I've gone on enough, but would just like to say that where previous methods existed, we thought carefully before trying to supplant them with our own methods - in all cases we felt there was sufficient justification to do so.

One more thing - I've listed our primary requirements (P1-P3) above. We believe that Aufs does a decent job in meeting them. If you wanted to drop some of the requirements, such as (P2), then different strategies would go into effect.

In implementing Aufs, careful thought was put into layering it so that the protocol-specific parts are separated not only from the OS-dependent parts (which have turned out to be fairly Unix OS independent - not surprising though), but also from the parts that implement the particular paradigm (e.g. the model has the server allowing functions P1, P2, P3, and S1). Thus, if you really don't have a need for some of the primary requirements, you can take the Aufs source code and make it into what you really want without a massive (but not inconsiderable) amount of work (e.g. you won't be completely reinventing the wheel).

I know I've missed some points, but I hope this provides a better insight into why Aufs does things the way it does.

Charlie C. Kim
User Services
Center for Computing Activities and Libraries
Columbia University