Path: utzoo!attcan!uunet!lll-winken!lll-lcc!ames!pasteur!ucbvax!hplabs!hpda!hp-sde!wunder
From: wunder@hp-sde.SDE.HP.COM (Walter Underwood)
Newsgroups: comp.misc
Subject: Re: Soundex algorithm
Message-ID: <460001@hp-sde.SDE.HP.COM>
Date: 12 Jul 88 16:59:48 GMT
References: <2130@hubcap.UUCP>
Organization: HP Software Dev Environments - Palo Alto, CA
Lines: 20

> The table has L=4, R=6; I find this surprising, as both R and L are
> semivowels and they are easily confused by those who did not grow up
> with the distinction (e.g., some Orientals).
>
> In-Real-Life: Chris Torek

Phonetic algorithms are closely tied to the language.  You really need
a different algorithm for each language, and even for variants of the
language (Canadian French, Cajun French, and French French, for example).
How about an American user phonetically spelling a French name?  Now
we have the user language and the name language.  Yikes.

Things get even worse when you transliterate Chinese names into English
and look them up with Soundex.  You get every Lee, Li, etc., in the
book, because English does not include phonetic distinctions that
exist in Chinese.
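
To make the collapse concrete, here is a minimal sketch in Python.  It
assumes the common American Soundex table (the one with L=4, R=6) and
the usual rule that H and W do not separate equal codes; the table in
the original article may differ in details, and the function name
soundex() is just for illustration.

```python
# Letter groups of the common American Soundex table (assumed here).
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a 4-character Soundex code, or "" if name has no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")      # the first letter's code also counts
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code:
            if code != prev:         # collapse runs of the same code
                digits.append(code)
            prev = code
        elif ch not in "HW":         # vowels (and Y) break a run; H, W do not
            prev = ""
    # Keep the first letter, pad with zeros, truncate to four characters.
    return (first + "".join(digits) + "000")[:4]


for name in ("Lee", "Li", "Lu", "Lau", "Robert", "Rupert"):
    print(name, soundex(name))
```

Every vowel is dropped, so Lee, Li, Lu, and Lau all come out as L000,
while Robert and Rupert both come out as R163.  Because the first
letter is kept as-is, the L=4/R=6 split only matters after the first
letter.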

So Soundex is a quick hack, and we should probably live with its
limitations.  Any better solution is likely to be much more complex.

wunder