From: utzoo!decvax!ucbvax!ARPAVAX:C70:editor-people Newsgroups: fa.editor-p Title: Buffer representations Article-I.D.: ucb.1946 Posted: Tue Sep 14 06:59:10 1982 Received: Wed Sep 22 07:23:07 1982 >From KLH@MIT-MC Tue Sep 14 06:55:18 1982 For what it's worth, when I developed ELLE I decided to go with a linked-list-of-text scheme. I deliberately wanted to avoid the list-of-lines strategy since that is what its predecessor TORES was using, and I was sick of the problems associated with it. One advantage of this approach is that it doesn't matter WHAT is in a node of the list; the editor works whether your lines are 80 or 8000000 characters long. I specifically wanted to be able to "edit" core dumps, for example. If your memory space is a problem (ELLE was intended for use with machines limited to 16 bits of address space, such as the 11/40) and your scheme requires having a "current line" entirely in memory, then eventually you will blow yourself up. You may blow yourself up anyway if you don't have a way to merge your linked lists of lines or swap out the list of swapped-out lines, etc etc ad nauseam, which seems to be emulating the list-of-text approach. Note that a key feature of this scheme is that a node need not hold text in memory. It can simply point to where the text was swapped out on disk. This is one of the other important reasons I went with this approach; if all the text could be kept in memory all the time, I doubt I'd find a linked list as attractive. Another advantage is that it is easy to merge and split blocks freely -- you don't have to worry about "line boundaries". In fact, reading a file into the buffer consists simply of creating a node which points to the entire file on disk; chunks are read in as needed. If the buffer is never modified, the GC can always reduce this list back to a single node. It's true that this requires clever programming and some crunching during redisplay, which does use char/line I/D (just to emphasize I wasn't punting anything). The redisplay data structure in ELLE is a table of lines, one for each physical screen line, in which the buffer data is duplicated. Routines which modify the buffer use a "core" set of routines which know how to inform redisplay of the exact range of modifications. I could go on, but most of this seems pretty standard, and people who want more details can just ask. As for performance, nobody has ever complained that ELLE was too slow. It works well, which is the main point. It is possible to ask it to do things that take a long time, but those things (like upper-casing everything in a million-byte file) would be pretty slow no matter what scheme you used! Of course, if your machine has a huge address space, forget all this hair and simply use the buffer-gap method, keeping everything in memory. It's simpler. This is how the real ITS EMACS (TECO) works, by the way.