Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site cca.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!unc!mcnc!decvax!cca!g-rh
From: g-rh@cca.UUCP (Richard Harter)
Newsgroups: net.math
Subject: Sorting a sorted list by a different order of keys
Message-ID: <3070@cca.UUCP>
Date: Mon, 24-Jun-85 23:32:19 EDT
Article-I.D.: cca.3070
Posted: Mon Jun 24 23:32:19 1985
Date-Received: Thu, 27-Jun-85 04:42:57 EDT
References: <>
Reply-To: g-rh@cca-unix.UUCP (Richard Harter)
Organization: Computer Corp. of America, Cambridge
Lines: 62
Keywords: sort, sorting, mathsort
Summary: In-core sorts can usually be done in O(n)

Gregory J.E. Rawlins, Department of Computer Science, U. Waterloo,
writes:

>    There are only three ways to "beat" n lg n.  Either you know
>something about the distribution of the input prior to running
>your algorithm, or you decide to count something other than the
>number of comparisons of two data elements, or you count something
>other than the worst possible case; in all three cases you
>have changed the model.  Radix sort is a simple example of the
>first type of special case, since it won't work unless you know
>that the input consists of integers in some prespecified range.
>Sorting in parallel using n processors (taking constant time)
>is an example of the second.  Hash sort is an example of the
>third type, since you are concerned with the average case.

	Er, well, it is a bit more complicated than that.
Your third formulation is wrong; the basic theorem of sorting
says that even the average case can be no better than
O(n log n).  However the theorem is a little misleading, since
O(n) sorts exist for rather general classes of cases.  The
basis for the theorem is that a sort must distinguish between
the n! possible permutations of the input.  Each binary
comparison yields at most one bit of information, so at least
log2(n!) comparisons are required; since n! grows like n**n
(Stirling's formula), log2(n!) is O(n log n).  This is the
best possible average case.
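
	As a quick numerical check of that bound (my own
illustration, not part of the theorem; it assumes a C math
library providing lgamma(), whose value at n+1 is ln(n!)),
the following fragment compares lg(n!) with n lg n:

    #include <stdio.h>
    #include <math.h>

    int main()
    {
        double n;

        /* lgamma(n + 1) = ln(n!); dividing by ln(2) gives lg(n!). */
        for (n = 10.0; n <= 1000000.0; n *= 10.0)
            printf("n = %8.0f   lg(n!) = %12.1f   n lg n = %12.1f\n",
                   n, lgamma(n + 1.0) / log(2.0),
                   n * log(n) / log(2.0));
        return 0;
    }

The two columns track each other closely, which is just
Stirling's formula at work.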

	The reason that the basic theorem is misleading is
that it is stated in terms of binary comparisons.  This rests
on two assumptions.  The first is that the binary comparison
is the appropriate unit of measurement.  The second is that
the only information to be used is relative ordering
information.

	The problem with the first assumption is that the
appropriate fundamental operation is the machine-word
operation.  All modern computers inherently do parallel
processing, because they operate on all of the bits of a
word at once.  The second assumption is appropriate for the
most general class of sort, where keys can be completely
arbitrary; in practice, however, additional information
always exists and can be exploited.

	A good example of an O(n) sort is the generalized
math sort.  Given n keys, allocate n buckets.  Make a first
pass over the data to find the smallest and largest keys.
Make a second pass over the data, using linear interpolation
on each key to compute its bucket index.  After the second
pass, use any sort method which is O(n) for almost-sorted
data (insertion sort, for example).  In practice this sort
is O(n); the provisos are beyond the scope of this article.
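
	Here is a minimal sketch of the idea in C (my own
illustration, not anyone's production code), assuming double
keys and one bucket per key.  The scatter into buckets is done
as a stable counting sort:

    #include <stdlib.h>

    void math_sort(double *a, int n)
    {
        int i, j, b;
        double min, max, span, v;
        double *out;
        int *count;

        if (n < 2) return;

        /* pass 1: find the range of the keys */
        min = max = a[0];
        for (i = 1; i < n; i++) {
            if (a[i] < min) min = a[i];
            if (a[i] > max) max = a[i];
        }
        span = max - min;
        if (span == 0.0) return;    /* all keys equal: done */

        count = (int *) calloc(n, sizeof(int));
        out = (double *) malloc(n * sizeof(double));

        /* pass 2: bucket index by linear interpolation */
        for (i = 0; i < n; i++)
            count[(int)((n - 1) * ((a[i] - min) / span))]++;
        for (i = 1; i < n; i++)     /* prefix sums -> offsets */
            count[i] += count[i - 1];
        for (i = n - 1; i >= 0; i--) {
            b = (int)((n - 1) * ((a[i] - min) / span));
            out[--count[b]] = a[i];
        }
        for (i = 0; i < n; i++)
            a[i] = out[i];
        free(out);
        free(count);

        /* cleanup: insertion sort, O(n) on almost-sorted data */
        for (i = 1; i < n; i++) {
            v = a[i];
            for (j = i - 1; j >= 0 && a[j] > v; j--)
                a[j + 1] = a[j];
            a[j + 1] = v;
        }
    }

On reasonably uniform keys each bucket receives O(1) elements,
so both the scatter and the cleanup pass are linear; the
provisos alluded to above (badly skewed or duplicate-heavy
keys) are exactly what can push the cleanup past O(n).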

	Address space is important.  It can be shown (see
Knuth) that if auxiliary storage devices are needed then
sorting must be O(n log n), and (I believe) that nothing
better than a merge sort is possible.

	Finally, radix sorting is O(n log n) at best.  A radix
sort makes one O(n) pass over the data for each digit of the
key, and n distinct keys must be at least log n bits long, so
with fixed-width digits at least O(log n) passes are required.
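
	For concreteness, here is a minimal LSD radix sort
sketch in C (again my own illustration), for 32-bit unsigned
keys with 8-bit digits.  The number of passes is the key
length divided by the digit width, which is where the log n
factor comes from once keys must be log n bits long:

    #include <stdlib.h>
    #include <string.h>

    /* Each pass is a stable counting sort on one byte, so the
     * total work is O(n * (key bits / digit bits)). */
    void radix_sort(unsigned *a, int n)
    {
        unsigned *tmp = (unsigned *) malloc(n * sizeof(unsigned));
        int count[256], i, shift;

        for (shift = 0; shift < 32; shift += 8) {
            memset(count, 0, sizeof(count));
            for (i = 0; i < n; i++)         /* count digit values */
                count[(a[i] >> shift) & 0xff]++;
            for (i = 1; i < 256; i++)       /* prefix sums */
                count[i] += count[i - 1];
            for (i = n - 1; i >= 0; i--)    /* stable scatter */
                tmp[--count[(a[i] >> shift) & 0xff]] = a[i];
            memcpy(a, tmp, n * sizeof(unsigned));
        }
        free(tmp);
    }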

			Richard Harter, SMDS Inc.
			(Running at g-rh@cca-unix)