Path: utzoo!attcan!uunet!lll-winken!lll-lcc!ames!pasteur!ucbvax!hplabs!hpda!hp-sde!wunder
From: wunder@hp-sde.SDE.HP.COM (Walter Underwood)
Newsgroups: comp.misc
Subject: Re: Soundex algorithm
Message-ID: <460001@hp-sde.SDE.HP.COM>
Date: 12 Jul 88 16:59:48 GMT
References: <2130@hubcap.UUCP>
Organization: HP Software Dev Environments - Palo Alto, CA
Lines: 20

> The table has L=4, R=6; I find this surprising, as both R and L are
> semivowels and they are easily confused by those who did not grow up
> with the distinction (e.g., some Orientals).
>
> In-Real-Life: Chris Torek

Phonetic algorithms are closely tied to the language. You really
need a different algorithm for each language, and even for variants
of the language (Canadian French, Cajun French, and French French,
for example).

How about an American user phonetically spelling a French name?
Now we have the user language and the name language. Yikes.

Things get even worse when you translate Chinese names into English
and look them up with Soundex. You get every Lee, Li, etc., in the
book, because English does not include phonetic distinctions that
exist in Chinese.

So, Soundex is a quick hack, and we probably should live with the
limitations. A better solution is probably much more complex.

wunder
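
For anyone who wants to try the Lee/Li case discussed above, here is a minimal sketch of the usual American Soundex in Python. It is not taken from the thread. The function name `soundex` and the `CODES` table layout are my own, and the handling of H and W (skipped without separating duplicates) follows one common variant; other write-ups differ on that point.

```python
# Digit classes of the classic American Soundex table (note L=4, R=6).
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a 4-character Soundex code, or "" if name has no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]                # the first letter is kept literally
    prev = CODES.get(letters[0], "")   # but its digit still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # assumption: H and W do not separate duplicates
        code = CODES.get(ch, "")       # vowels and Y have no digit
        if code and code != prev:
            result += code
        prev = code                    # a vowel resets prev, so a repeat across it counts
    return (result + "000")[:4]        # pad with zeros, truncate to four characters


if __name__ == "__main__":
    for name in ("Lee", "Li", "Liu", "Ree", "Robert", "Rupert"):
        print(name, soundex(name))
```

With this sketch Lee, Li, and Liu all come out as L000, which is the "every Lee, Li, etc., in the book" effect. Lee and Ree get different codes only because the first letter is kept as-is; later in a name, the L=4, R=6 split does the same job.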