Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: notesfiles - hp internal release 1.2; site hp-pcd.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxl!houxm!houxz!vax135!floyd!cmcl2!seismo!hao!hplabs!hp-pcd!grant
From: grant@hp-pcd.UUCP
Newsgroups: net.micro
Subject: Re: Re: text compression methods
Message-ID: <7100015@hp-pcd.UUCP>
Date: Mon, 11-Jun-84 15:26:00 EDT
Article-I.D.: hp-pcd.7100015
Posted: Mon Jun 11 15:26:00 1984
Date-Received: Sun, 10-Jun-84 01:51:01 EDT
References: <363@sri-arpa.UUCP>
Organization: Hewlett-Packard Portable Computer Division - Corvallis, OR
Lines: 16
Nf-ID: #R:sri-arpa:-36300:hpcvrd:7100015:000:799
Nf-From: hpcvrd!grant Jun 6 11:26:00 1984

For a different approach yet, see the first article in the June, 84
"Computer" magazine (IEEE). It gets around 2x compression without any
prior knowledge of character/word probabilities.

The algorithm is to start with a set of symbols which represent the
input characters. As a message is read in, characters are parsed off to
match the longest symbol defined. When this string is ended, a new
symbol is defined which is the parsed symbol followed by the next
character. The parsed symbol is then sent. Decoding rebuilds the symbol
table from the input stream, so the translation table does not need to
be transmitted.

One question I had: what does the algorithm do when it runs out of
symbols?

I think this method has good potential.

Grant Garner
hp-pcd!grant
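
The scheme described above can be sketched in a few lines of Python. This is only an illustration of the description in the post (it resembles what is usually called LZW), not the code from the article. The names compress, decompress and max_symbols are made up for this sketch. Freezing the table once it is full is an assumption too: it is one possible answer to the "runs out of symbols" question, not necessarily what the article does.

```python
def compress(data: bytes, max_symbols: int = 4096) -> list[int]:
    """Encode bytes as a list of symbol numbers (hypothetical helper)."""
    # Start with one symbol per possible input character (byte).
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the match to find the longest known symbol.
            current = candidate
            continue
        # Longest match ended: send it, then define match + next character.
        codes.append(table[current])
        if len(table) < max_symbols:  # assumption: stop growing when full
            table[candidate] = len(table)
        current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def decompress(codes: list[int], max_symbols: int = 4096) -> bytes:
    """Rebuild the bytes and the symbol table from the codes alone."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The encoder used a symbol it had only just defined.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid symbol number: {code}")
        out.append(entry)
        if len(table) < max_symbols:  # mirror the encoder's rule
            table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)
```

For example, decompress(compress(b"TOBEORNOTTOBEORTOBEORNOT")) gives back the original 24 bytes from 16 symbol numbers. The translation table itself is never sent; the decoder only needs to agree with the encoder on the starting symbols and on max_symbols.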