Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!seismo!husc6!cmcl2!phri!roy
From: roy@phri.UUCP (Roy Smith)
Newsgroups: comp.unix.questions,comp.sources.wanted
Subject: Re: Multiple Field Sorts in UNIX(tm)
Message-ID: <2810@phri.UUCP>
Date: Wed, 22-Jul-87 20:01:04 EDT
Article-I.D.: phri.2810
Posted: Wed Jul 22 20:01:04 1987
Date-Received: Sat, 25-Jul-87 01:48:37 EDT
References: <2459@whuts.UUCP>
Reply-To: roy@phri.UUCP (Roy Smith)
Organization: Public Health Research Inst. (NY, NY)
Lines: 42
Summary: sort(1) can do it
Xref: mnetor comp.unix.questions:3301 comp.sources.wanted:1688

In article <2459@whuts.UUCP> tes@whuts.UUCP (STERKEL) writes:
> I need a multiple field sort that maintains sub-field order.
> [...]
> An inefficient implementation of this has been:
> cat file | sort on field3 | sort on field2 | sort on field1 > sorted
>
> BUT, this only for sorts that are "bubble" and/or "shell".  Using Sort(1)
> "scrambles" the previous sorts on each pass, leaving me with no easy way
> to use sort(1) to do multiple field sorts.

	What you want to do is "sort +2 +1 +0 file > sorted".  The +N
arguments mean to skip N fields.  This is a bit counter-intuitive, but it
boils down to "+2" meaning skip 2 fields, i.e. use the third field.  The
following "+1" and "+0" arguments mean use the second and first fields as
secondary and tertiary sort keys; exactly what you want (if I understood
your question right).

	The sort(1) man page says that if all keys compare equal, it sorts
on the whole input line.  This is why it "scrambles" your file when you do
a multi-pass sort.  It would be nice if sort had a "stable" option (i.e.
make two lines which compare equal on all specified keys stay in the
original input order; exactly what you need to do the multi-pass sort you
describe) but as far as I know, it doesn't.  Not being a real sorting guru,
I don't know what impact that option would have on performance.

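	For illustration, here is a minimal sketch of the one-pass,
multi-key sort described above.  It makes some assumptions that are not in
the original question: it uses the POSIX "-k" key syntax instead of "+N",
since some newer sort implementations may not accept the "+N" form; the
function name sort_by_fields and the sample data are made up; and fields
are taken to be separated by blanks.

```sh
#!/bin/sh
# Hypothetical helper (the name is an assumption): sort records read from
# stdin by field 3, then field 2, then field 1, in a single sort run.
# "-k3,3" restricts the key to field 3 alone; a bare "-k3" would make the
# key run from field 3 to the end of the line.
# LC_ALL=C forces plain byte-order comparison so results do not depend on
# the caller's locale.
sort_by_fields() {
    LC_ALL=C sort -k3,3 -k2,2 -k1,1
}

# Made-up sample data, three blank-separated fields per line.
printf '%s\n' 'b 2 x' 'a 2 x' 'c 1 y' 'a 1 x' | sort_by_fields
# prints:
#   a 1 x
#   a 2 x
#   b 2 x
#   c 1 y
```

Because all three keys go to one sort invocation, ties on field 3 are
broken by field 2 and then field 1 within the same run, so no stable
multi-pass trick is needed.
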
	What I can tell you is that "sort +2 +1 +0" is a lot faster than
"sort +2 | sort +1 | sort +0", unless you happen to be on a Sequent, and
maybe not even then.

	You wanted a SysV answer.  I'll admit that this is a 4.N answer,
but since I don't think sort has changed much (on the outside) since v6,
I'm pretty confident that it works on SysV too.

	The Unix sort(1) utility is an amazingly powerful tool, especially
with the "+N.M" stuff to skip fields and characters within fields.
Unfortunately, the sort man page is about as abstruse as man pages get.
I've been reading that man page on and off for about 10 years and I'm still
not sure I understand all the details.
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016