Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site opus.UUCP
Path: utzoo!watmath!clyde!floyd!harpo!seismo!hao!cires!nbires!opus!rcd
From: rcd@opus.UUCP
Newsgroups: net.lang.c
Subject: Re: What Can't C Do? (strings)
Message-ID: <251@opus.UUCP>
Date: Mon, 19-Mar-84 22:02:08 EST
Article-I.D.: opus.251
Posted: Mon Mar 19 22:02:08 1984
Date-Received: Wed, 21-Mar-84 02:23:57 EST
References: <6900@unc.UUCP>, <226@opus.UUCP> <58@utastro.UUCP>
Organization: NBI, Boulder
Lines: 18

<>
The articles so far suggesting string comparison in C have addressed questions such as "does it belong there?" A very different issue is that the correct string-comparison algorithm is anything but obvious. Using the native collating sequence - even in the ASCII part of the world - is only a primitive solution. Text can seldom be compared "reasonably", in terms of human expectations, with the ASCII ordering.

For one thing, "dictionary order" is different from ASCII, since it regards the case of letters as less significant than a difference in the letters themselves. Comparisons of names present another problem (and a messy one) - you have to get Mc and Mac together, etc. Then there's the language problem - for example, in Spanish "ll" and "ch" are treated as single letters.

Probably the situations in which the native collating sequence is correct are limited to numeric comparisons (integers only, of course) and general string comparisons where ANY order is OK as long as it satisfies the rules of an ordering relation - as in searching, building ordered trees, etc.
-- 
{hao,ucbvax,allegra}!nbires!rcd
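
To make the dictionary-order point concrete, here is a minimal sketch in C, not a definitive implementation. The name dict_compare is hypothetical, and the sketch assumes single-byte characters in the default "C" locale. It treats a difference in letters as more significant than a difference in case, and falls back to the native (strcmp) order only to break ties. It does nothing about Mc/Mac or about Spanish "ll" and "ch".

```c
#include <ctype.h>
#include <string.h>

/*
 * dict_compare (hypothetical name): dictionary-style comparison.
 * First pass: compare the strings with letter case ignored.
 * Tie-break: if they are equal ignoring case, use the native
 * collating sequence so that the result is still a total order.
 * Returns -1, 0, or 1.
 */
int dict_compare(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    int ca, cb, tie;

    while (*p != '\0' && *q != '\0') {
        ca = tolower(*p);
        cb = tolower(*q);
        if (ca != cb)
            return (ca < cb) ? -1 : 1;  /* letters differ */
        p++;
        q++;
    }
    if (*p != '\0')
        return 1;    /* b is a prefix of a, ignoring case */
    if (*q != '\0')
        return -1;   /* a is a prefix of b, ignoring case */

    /* Same letters throughout: case decides, via native order. */
    tie = strcmp(a, b);
    return (tie > 0) - (tie < 0);
}
```

With the native ASCII order, strcmp puts "Zebra" before "apple" because every upper-case letter sorts ahead of every lower-case one. dict_compare puts "apple" first, and still distinguishes "Apple" from "apple", so it remains usable as an ordering relation for searching and tree building.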