Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site alice.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!alice!trickey
From: trickey@alice.UucP (Howard Trickey)
Newsgroups: net.internat
Subject: Re: Hyphenation
Message-ID: <4546@alice.UUCP>
Date: Sun, 10-Nov-85 09:17:41 EST
Article-I.D.: alice.4546
Posted: Sun Nov 10 09:17:41 1985
Date-Received: Mon, 11-Nov-85 05:42:18 EST
References: <471@harvard.ARPA> <773@mmintl.UUCP>, <1861@watdcsu.UUCP>
Organization: Bell Labs, Murray Hill
Lines: 17

> Yes, and none of them are any good.

The hyphenation algorithm invented by Frank Liang and incorporated in
TeX is good. It is essentially a way of converting a hyphenated word
list (from a dictionary, but with all forms of all words) into a list
of "patterns". You can set parameters to trade off table size against
the percentage of hyphens found and the error rate. The standard TeX
table takes about 20 kbytes, finds 86.7% of the hyphens in an inflected
Webster's Pocket Dictionary (and all of the hyphens in the 676 most
common words), and produces no wrong hyphens. With about 2 kbytes you
could find 35.2% of the hyphens, again with no errors. (Please note
that this algorithm is in TeX82, not the original TeX.)

See "Word Hy-phen-a-tion by Com-put-er" by Frank Liang (Ph.D. thesis),
Stanford CS Dept. tech report STAN-CS-83-977, for details.

Several groups have built French hyphenation tables using this
algorithm and found that they are typically much smaller than the
English ones.
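
For readers who have not seen how a pattern table is applied, here is a rough Python sketch of the lookup side of the scheme (not the pattern-generation side, which is what the thesis is mostly about). Everything in it is an assumption made for illustration: the function names, the tiny hand-picked pattern list, and the minimum fragment lengths of 2 and 3 letters are mine, and the list is nowhere near the real 20-kbyte table. The idea is that each pattern carries digits between its letters, every pattern that matches somewhere in the word contributes its digits, the largest digit wins at each position, and an odd result marks an allowed break.

```python
import re

def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters and inter-letter digits."""
    letters = re.sub(r"\d", "", pattern)
    values = [0] * (len(letters) + 1)   # values[k] sits just before letters[k]
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            values[pos] = int(ch)
        else:
            pos += 1
    return letters, values

def build_table(patterns):
    """Map each pattern's letters to its list of digits."""
    return dict(parse_pattern(p) for p in patterns)

def hyphenate(word, table, min_left=2, min_right=3):
    """Return the fragments of `word` between the breaks the patterns allow."""
    w = "." + word.lower() + "."        # '.' marks the word boundaries
    points = [0] * (len(w) + 1)
    # Try every substring; where a pattern matches, keep the largest digit.
    for start in range(len(w)):
        for end in range(start + 1, len(w) + 1):
            values = table.get(w[start:end])
            if values:
                for k, v in enumerate(values):
                    points[start + k] = max(points[start + k], v)
    # An odd value before word[j] allows a break there (offset 1 for the '.').
    pieces, last = [], 0
    for j in range(min_left, len(word) - min_right + 1):
        if points[j + 1] % 2 == 1:
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return pieces

# Hypothetical toy pattern list, chosen only to cover the one example word.
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]
TABLE = build_table(PATTERNS)

print("-".join(hyphenate("hyphenation", TABLE)))   # hy-phen-ation
```

Note that the even digits do real work here: "1na" alone would allow a break before the "n", but "he2n" overrides it with a larger even value. That kind of inhibition is how a table can miss some hyphens while still avoiding wrong ones.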