Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.1 6/24/83; site opus.UUCP
Path: utzoo!watmath!clyde!floyd!harpo!seismo!hao!cires!nbires!opus!rcd
From: rcd@opus.UUCP
Newsgroups: net.lang.c
Subject: Re: What Can't C Do? (strings)
Message-ID: <251@opus.UUCP>
Date: Mon, 19-Mar-84 22:02:08 EST
Article-I.D.: opus.251
Posted: Mon Mar 19 22:02:08 1984
Date-Received: Wed, 21-Mar-84 02:23:57 EST
References: <6900@unc.UUCP>, <226@opus.UUCP> <58@utastro.UUCP>
Organization: NBI, Boulder
Lines: 18

<>
The articles so far on adding string comparison to C have addressed issues
such as "does it belong there?"  A very different issue is that the
correct string-comparison algorithm is anything but obvious.  Using the
native collating sequence - even in the ASCII part of the world - is only a
primitive solution.  Text can seldom be compared "reasonably", in terms of
human expectations, with the ASCII ordering.  For one thing, "dictionary
order" is different from ASCII, since it regards case of letters as less
significant than differing letters.  Comparisons of names present another
problem (and a messy one) - you have to get Mc and Mac together, etc.  Then
there's the language problem - for example, in Spanish "ll" and "ch" are
treated as single letters.  Probably the situations in which the native
collating sequence is correct are limited to numeric comparisons (unsigned
integers padded to equal width only, of course, since "10" sorts before "9")
and general string comparisons where ANY order is OK as long as it satisfies
the rules of an ordering relation - as in searching, building ordered trees,
etc.
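
To make the "dictionary order" point concrete, here is a minimal sketch of a
comparison that treats case as less significant than differing letters.  The
name dict_cmp is hypothetical, and the tie-break (falling back to the native
order, which on ASCII puts "Apple" before "apple") is an arbitrary assumption;
it does nothing about Mc/Mac or multi-character letters such as "ll" and "ch".

```c
#include <ctype.h>
#include <string.h>

/*
 * dict_cmp: hypothetical helper, not part of any standard library.
 * Compares letters without regard to case first; case is consulted
 * only when the strings are otherwise identical.
 * Returns -1, 0, or 1, in the manner of strcmp().
 */
int dict_cmp(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    int r;

    /* First pass: find the first position that differs ignoring case. */
    while (*p != '\0' && *q != '\0') {
        int cp = tolower(*p);
        int cq = tolower(*q);
        if (cp != cq)
            return cp < cq ? -1 : 1;
        p++;
        q++;
    }

    /* One string is a prefix of the other: the shorter sorts first. */
    if (*p != '\0')
        return 1;
    if (*q != '\0')
        return -1;

    /* Same letters throughout: break the tie with the native order
       (an assumption; a real dictionary might put lowercase first). */
    r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

With this, dict_cmp("apple", "Banana") is negative, whereas on an ASCII
machine strcmp("apple", "Banana") is positive because every uppercase letter
precedes every lowercase one.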
-- 
{hao,ucbvax,allegra}!nbires!rcd