Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!mailrus!ames!pasteur!ucbvax!ucsd!ucsdhub!jack!crash!pnet01!haitex
From: haitex@pnet01.cts.com (Wade Bickel)
Newsgroups: comp.sys.amiga.tech
Subject: New IFF format details (long).
Message-ID: <3450@crash.cts.com>
Date: 19 Sep 88 12:56:37 GMT
Sender: news@crash.cts.com
Organization: People-Net [pnet01], El Cajon CA
Lines: 369

cmcmanis%pepper@Sun.COM (Chuck McManis) writes:
>First what was your misunderstanding? IFF can be parsed fairly readily by
>most "types" of parsers primarily because it's grammar is self consistent.
>
>-> So the question I wish to pose is; Would you (the Amiga community)
>->reject a re-design of the current IFF standard?
>
>Yes if you decided to redesign it simply because you blew it while reading
>the documents. Please, don't be offended but there are already "other"...
>
>Please, let us know what the "flaw" is first and *then* ask us if we want

Ok, here goes...

                                --------

The problem with the current IFF is that it is not generic. To be more
specific, a FORM specifier is not a chunk per se. Under EA's definition,
an ILBM is defined as:

        +-----------------------------------+
        | 'FORM'        size                |
        +-----------------------------------+
        | 'ILBM'                            |
        +-----------------------------------+
        | +-------------------------------+ |
        | | 'BMHD'      size              | |
        | | data                          | |
        | +-------------------------------+ |
        | | 'CMAP'      size              | |
        | | data                          | |
        | +-------------------------------+ |
        | pad bytes (if needed)             |
        +-----------------------------------+
        | 'BODY'        size                |
        | data                              |
        +-----------------------------------+

The difficulty is that the 'ILBM' specifier is a special case: it has no
size specifier. This wreaks havoc on a generic parser. It also results
in a nesting depth limitation (i.e., BMHD cannot contain chunks).

Another problem is that no bad chunk management is done. If any chunk is
bad, the whole file is bad. Why not make a reasonable effort to retain the
valid chunks?
If the CMAP is messed up, do we really need to throw away the BODY?
Recovering the CMAP would, in many cases, take but minutes using a tool
such as Doug's Color Commander (Seven Seas' Software), whereas an artist
might lose hours of careful manipulation of the BODY. By allocating a bit
in each chunk header this can be easily accommodated.

Another problem is that there are no dirty chunk provisions. I feel that
dirty chunk tracking would be a valuable option. Dirty chunks would occur
when, after finding some recognized chunks, unrecognized chunks are
encountered. IFF '85 discards these chunks. I propose that, as a user
option, unrecognized chunks be retained when a program modifies a
partially understood IFF '88 file. This can be easily achieved by
allocating two bits in each chunk header. When unrecognized chunks are
written they're marked as dirty, and any chunks which have been modified
are also noted. This would allow programs with new, or proprietary,
chunks to be made more compatible with existing programs (certain paint
programs come to mind...).

   { BTW: I got the idea for the need for dirty chunk handling from
     Carolyn Scheppner, so don't tell me I'm off the wall on this one,
     I just happen to agree with her and offer this as one solution.
     I'm very open to any better solutions. }

In IFF '88 a LONGWORD (i.e., 32 bits) would be included at the top of all
chunks to maintain the "status" of the chunk.
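
To make the status longword concrete, here is a minimal sketch in Python
(used only because it is easy to run; the proposal itself is Modula-2).
The bit positions, the header field order (ID, size, status), the
big-endian byte order, and the names pack_header, unpack_header and
mark_on_save are all assumptions made for illustration. The proposal only
says that bits would be allocated for "bad", "dirty" and "modified".

```python
import struct

# Hypothetical bit assignments; the proposal does not fix them.
STATUS_BAD = 0x1       # chunk contents failed validation
STATUS_DIRTY = 0x2     # chunk was carried through without being understood
STATUS_MODIFIED = 0x4  # chunk was rewritten by the saving program


def pack_header(ck_id: bytes, size: int, status: int) -> bytes:
    """Pack an assumed 12-byte header: 4-byte ID, 32-bit size, 32-bit status."""
    if len(ck_id) != 4:
        raise ValueError("chunk ID must be exactly 4 bytes")
    return struct.pack(">4sII", ck_id, size, status)


def unpack_header(raw: bytes):
    """Inverse of pack_header; returns (ck_id, size, status)."""
    return struct.unpack(">4sII", raw[:12])


def mark_on_save(status: int, recognized: bool, changed: bool) -> int:
    """Update the status bits when a program writes a chunk back out."""
    if not recognized:
        status |= STATUS_DIRTY      # retained, but not understood
    if changed:
        status |= STATUS_MODIFIED   # this program altered the contents
    return status
```

A reader that finds STATUS_BAD set on a CMAP could then skip just that
chunk and still load the BODY, which is the recovery case argued for above.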
Consider the following proposed IFF '88 format:

        +-----------------------------------+
        | 'FORM'        size,status         |
        | +-------------------------------+ |
        | | 'ILBM'      size,status       | |
        | | +---------------------------+ | |
        | | | 'BMHD'    size,status     | | |
        | | | data                      | | |
        | | +---------------------------+ | |
        | | | 'CMAP'    size,status     | | |
        | | | data                      | | |
        | | +---------------------------+ | |
        | | | 'BODY'    size,status     | | |
        | | | data                      | | |
        | | +---------------------------+ | |
        | +-------------------------------+ |
        +-----------------------------------+

(Pad bytes are not shown, but are assumed to be added at the end of any
odd-length chunk; a checksum is assumed to be included at the end of each
chunk as well.)

This format allows a generic parser to recognize 'FORM' and 'ILBM' as just
another chunk type. More importantly, it allows a much simpler parser
design that is also much more versatile. It is entirely possible to place
chunks within ANY chunk type. Thus data structures such as B-Trees are
easily and efficiently supported. Example:

    +-----------------------------------------------------+
    | 'FORM' size,status                                  |
    | +-------------------------------------------------+ |
    | | '23BT' size,status                              | |
    | | +---------------------------------------------+ | |
    | | | 'NODE' size,status                          | | |
    | | | +-----------------------------------------+ | | |
    | | | | 'NDAT' size,status                      | | | |
    | | | | data                                    | | | |
    | | | +-----------------------------------------+ | | |
    | | | | 'NODE' size,status                      | | | |
    | | | | +-------------------------------------+ | | | |
    | | | | | 'NDAT' size,status                  | | | | |
    | | | | | data                                | | | | |
    | | | | +-------------------------------------+ | | | |
    | | | | | 'NODE' size,status                  | | | | |
    | | | | | +---------------------------------+ | | | | |
    | | | | | | 'NDAT' size,status              | | | | | |
    | | | | | | data                            | | | | | |
    | | | | | +---------------------------------+ | | | | |
    | | | | | | NODEs, etc. etc. etc...         | | | | | |
    | | | | | |                                 | | | | | |
    | | | | | |^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^| | | | | |
    | | | | +-------------------------------------+ | | | |
    | | | +-----------------------------------------+ | | |
    | | | | 'NODE' size,status                      | | | |
    | | | | +-------------------------------------+ | | | |
    | | | | | {NDAT and 3 NODEs, etc.}            | | | | |
    | | | | +-------------------------------------+ | | | |
    | | | +-----------------------------------------+ | | |
    | | | | 'NODE' size,status                      | | | |
    | | | | +-------------------------------------+ | | | |
    | | | | | {NDAT and 3 NODEs, etc.}            | | | | |
    | | | | +-------------------------------------+ | | | |
    | | | +-----------------------------------------+ | | |
    | | +---------------------------------------------+ | |
    | +-------------------------------------------------+ |
    +-----------------------------------------------------+

Among other things, this format would support quicker searches of the file
for a specific node, since nodes can be searched in a true tree-like
fashion. However, this is not the point of the change.

What I really want to do is create a purely data-driven mechanism, as
opposed to the code-driven one in the current IFF. Rather than having to
write code to handle each type of occurrence, a structure would be
initialized at run time, and this would be passed to the Reader or Writer
parser to be handled. In this way it would never be necessary to update
the Library(s).

The following is a document specifying how the system is to work.

=============================================================================

                    Conceptual Design Specification
                   ---------------------------------

Like its predecessor, IFF '88 is a recursive descent parser design. The
primary difference between the old design and the new one is that while
IFF '85 was code driven, IFF '88 is data driven. Whereas IFF '85 readers
and writers require re-compilation of the source to accommodate format
updates, IFF '88 will not.
IFF '88 also incorporates a more natural recursive descent format.

Basically, IFF '88 will consist of a number of libraries. In the simplest
scheme there would be two libraries, one containing two parsers (read and
write) and the other containing support routines. In a more complex scheme
5 libraries would be created: one for each parser, one for each set of
related support routines, and the fifth for routines shared by both the
reader and writer libraries.

To use IFF '88 the developer will initialize a control structure (a list
of nodes) which will be used to read/write the files. Effectively, your
program will write a program, which will be used to write or read the
desired file. Initialization of the data structures will be simplified
with routines provided in the support libraries. Defining a control
structure will be achieved through calls much like those used to
initialize Intuition menu structures, which most of us are quite familiar
with.

The IFF '88 parser design is generic and performs no error checking on the
validity of the control structure it is passed. It will be the
responsibility of the developer to ensure that a valid control structure
is passed to the parser.


                         The Writer Mechanism
                       ------------------------

In order to write a file, an implementation first creates and properly
initializes a writer structure, then calls the writer function, which
parses the structure and writes the file.


                    ENTRIES in the Write Structure
                  ----------------------------------

The basic element of the writer/reader structure will henceforth be called
an "entry".
An entry to the writer structure is simply the following record:

    StdProcPtr      = POINTER TO PROCEDURE(ADDRESS);

    WrtAlgParamsPtr = POINTER TO WrtAlgParams;
    WrtAlgParams    = RECORD
                        DataAddr  : ADDRESS;
                        ByteCount : LONGCARD;
                      END;

    WENTRY = RECORD
               ckID     : ARRAY[0..4] OF CHAR;
               ckStatus : LONGWORD;
               PreCall,
               WrtAlg,
               PostCall : StdProcPtr;
               PreData,
               WrtData,
               PostData : ADDRESS;
               WLev     : WLevelPtr;   (* defined later in this doc. *)
             END;

The fields have the following definitions:

    ckID    : 4 byte ID as defined in IFF '85.

    ckStatus: 32 bits to be used for flags and such. I envision three
              flags to be used for "bad", "dirty", and "modified" chunk
              identification.

    WrtAlg  : The algorithm used to write the chunk contents as referenced
              by the "WrtData" field. In the simplest case WrtAlg will
              point at a standard WriteBytes routine. This routine is
              passed one parameter on the stack. In this way differences
              in compiler parameter passing conventions can be more easily
              resolved.

    PreCall : Normally NIL. Used for special cases to execute a pre-write
              function, and is passed the value held in "PreData" as its
              parameter.

    PostCall: As for PreCall, but called after the call to WrtAlg.

    WrtData : Passed to the function pointed to by WrtAlg. There is no
              restriction on what this field is to be used for. However,
              as a general convention it will be used to hold the address
              of an initialized WrtAlgParams record.

    PreData : As WrtData, but used in conjunction with PreCall.

    PostData: As WrtData, but used in conjunction with PostCall.

    WLev    : A pointer to a lower WLEVEL structure. If this pointer is
              NIL then this entry contains data and the other fields of
              this entry are processed. If it is not NIL the other fields
              in this entry are ignored, and the WLEVEL structure pointed
              at is parsed. A variant record could also be used, but this
              is easier and thus less prone to cause undesired results.
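
As an illustration of the call order described above (PreCall, then
WrtAlg, then PostCall, each receiving its own data field as its single
parameter), here is a rough Python rendering of the entry record. It is a
sketch only: the Python field names mirror the Modula-2 ones, run_entry is
a hypothetical helper, and having the write routine return the bytes
instead of writing to an open file is a simplification of mine, not part
of the proposal.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class WrtAlgParams:
    data: bytes        # stands in for DataAddr
    byte_count: int    # ByteCount


@dataclass
class WEntry:
    ck_id: bytes
    ck_status: int = 0
    pre_call: Optional[Callable[[Any], None]] = None
    wrt_alg: Optional[Callable[[Any], bytes]] = None
    post_call: Optional[Callable[[Any], None]] = None
    pre_data: Any = None
    wrt_data: Any = None
    post_data: Any = None
    w_lev: Any = None  # lower level, or None when the entry holds data


def write_bytes(params: WrtAlgParams) -> bytes:
    """Stand-in for the standard WriteBytes routine: one parameter in."""
    return params.data[:params.byte_count]


def run_entry(entry: WEntry) -> bytes:
    """Process one data entry: PreCall, then WrtAlg, then PostCall."""
    if entry.w_lev is not None:
        raise ValueError("entry points at a lower level; other fields are ignored")
    if entry.pre_call is not None:
        entry.pre_call(entry.pre_data)
    body = entry.wrt_alg(entry.wrt_data)
    if entry.post_call is not None:
        entry.post_call(entry.post_data)
    return body
```

Because each routine takes exactly one argument, a user-supplied
compression routine could be dropped into wrt_alg without the calling code
changing, which is the point of the single-parameter convention.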


                    LEVELs in the Write Structure
                  ---------------------------------

Levels in the write structure represent nesting control of the file
writing mechanism.

    WLevelPtr = POINTER TO WLevel;
    WLevel    = RECORD
                  Entry : WENTRY;
                  Next  : WLevelPtr;
                END;

Using levels in the write structure is quite simple. A level is composed
of any number of WLevel nodes, linked together in a list, and defines how
the parser should organize chunks. The following example should provide an
efficient explanation of the operational mechanism.


            Parsing an Example Initialized Write Structure
          ---------------------------------------------------

The parser is very simple. The easiest way to describe its function is
through example, so...

First we need something to parse, so consider the following initialized
structure for writing a simple ILBM. The parser is passed a WLevelPtr
which we will call root. Uninitialized fields are not shown. Record types
are shown in {} as in "{WLevel}" and are abstract (not part of the actual
data). The contents of a record type are indented one space. Sorry for
the lack of graphics in this doc.

 root
   \
    \
   {WLevel}
    {WEntry}
     ckID = "FORM";
     WLev ----> {WLevel}
    Next = NIL;  {WEntry}
                  ckID = "ILBM";
                  WLev ----> {WLevel}
                 Next = NIL;  {WEntry}
                               ckID = "BMHD";
                               WrtAlg = ADR(WriteBytes());
                               WrtData ---> {WrtAlgParams}
                              Next           ADR(BitMapHdr);
                               |             TSIZE(BitMapHdr);
                               |
                               V
                              {WLevel}
                               {WEntry}
                                ckID = "CMAP";
                                WrtAlg = ADR(WriteBytes());
                                WrtData ---> {WrtAlgParams}
                              Next            ADR(ColorTable);
                               |              nColors;
                               |
                               V
                              {WLevel}
                               {WEntry}
                                ckID = "BODY";
                                WrtAlg = ADR(BodyWrtAlg());
                                WrtData ---> {WrtAlgParams}
                              Next            ADR(BitMap);
                               |
                               |
                               V
                              NIL

Effectively each node in the level structure is a node in a simple binary
tree. One of the descendant pointers is contained in the WLEVEL structure
and is used to establish lists of entries at the same level. The other
descendant pointer, WLev, is contained in the WENTRY structure. It is used
to establish lower levels or specify that the chunk contains data (by
being NIL).
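
The walk over this structure can be sketched in a few lines. The following
Python is a guess at the mechanism, not the actual parser: it follows Next
for siblings and WLev for nesting, as described above. Several things are
assumptions of mine: the header layout (ID, size, status, all big-endian),
a size that counts only the bytes after the header, the omission of the
per-chunk checksum, write routines that return bytes, and building the
file in memory so that sizes are known before the header is emitted. The
names write_level and write_bytes and the sample data lengths are
hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class WEntry:
    ck_id: bytes
    ck_status: int = 0
    wrt_alg: Optional[Callable[[Any], bytes]] = None
    wrt_data: Any = None
    w_lev: Optional["WLevel"] = None   # None means the entry holds data


@dataclass
class WLevel:
    entry: WEntry
    next: Optional["WLevel"] = None    # sibling at the same nesting depth


def write_bytes(data: bytes) -> bytes:
    """Stand-in for the standard WriteBytes routine."""
    return bytes(data)


def write_level(level: Optional[WLevel]) -> bytes:
    """Serialize a sibling list of chunks, recursing through WLev."""
    out = bytearray()
    while level is not None:
        e = level.entry
        if e.w_lev is not None:
            body = write_level(e.w_lev)     # nested chunks; other fields ignored
        else:
            body = e.wrt_alg(e.wrt_data)    # leaf data
        # Assumed header layout: ID, big-endian size, big-endian status.
        out += struct.pack(">4sII", e.ck_id, len(body), e.ck_status)
        out += body
        if len(body) % 2:
            out += b"\x00"                  # pad odd-length chunks
        level = level.next
    return bytes(out)


# Build FORM > ILBM > (BMHD, CMAP, BODY) as in the diagram above,
# with made-up data lengths of 20, 6 and 3 bytes.
body = WLevel(WEntry(b"BODY", wrt_alg=write_bytes, wrt_data=b"\xaa\xbb\xcc"))
cmap = WLevel(WEntry(b"CMAP", wrt_alg=write_bytes, wrt_data=b"\x00" * 6), next=body)
bmhd = WLevel(WEntry(b"BMHD", wrt_alg=write_bytes, wrt_data=b"\x00" * 20), next=cmap)
ilbm = WLevel(WEntry(b"ILBM", w_lev=bmhd))
root = WLevel(WEntry(b"FORM", w_lev=ilbm))
image = write_level(root)
```

Note that write_level has no knowledge of FORM, ILBM, or any other chunk
type; everything specific to the file lives in the structure it is handed.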

The reader is a bit more complicated, but follows the same general
principles. The structure is more complex, allowing groupings of chunks.
Level pointers can be connected to higher levels, creating a recursive
reader.

What all this buys us is versatility. Because it is possible to link user
routines into the writer or reader structures, it is not necessary to
update the library to incorporate a new low-level algorithm, such as a
compression algorithm. Also, LISTs and CATs are unnecessary; simple
extension through Levels is sufficient to write any file.

It would probably be desirable to replace the "FORM" keyword with
something new, such as "NIFF" or "IF88".

Sorry this is not well organized, but I have already spent more of the day
on this than I can spare. There is undoubtedly room for improvement;
suggestions? If there is any interest I'll go into more detail. Right now
I have to get back to X-Specs 3D stuff.

                                                        Thanks,

UUCP: {cbosgd, hplabs!hp-sdd, sdcsvax, nosc}!crash!pnet01!haitex
ARPA: crash!pnet01!haitex@nosc.mil
INET: haitex@pnet01.CTS.COM

Opinions expressed are mine, and not necessarily those of my employer.