Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!dptg!att!chinet!henry
From: henry@chinet.chi.il.us (Henry C. Schmitt)
Newsgroups: comp.sys.mac.programmer
Subject: Re: Hashing....
Summary: Don't use example given!!
Message-ID: <9690@chinet.chi.il.us>
Date: 28 Sep 89 05:13:23 GMT
References: <4345@internal.Apple.COM> <1728@neoucom.UUCP>
Reply-To: henry@chinet.chi.il.us (Henry C. Schmitt)
Distribution: na
Organization: Chinet - Public Access Unix
Lines: 58

In article <1728@neoucom.UUCP> sam@neoucom.UUCP (Scott A. Mason) writes:
>In article <4345@internal.Apple.COM> athos@apple.com (Rick Eames) writes:
>>I am a relatively new programmer and I am searching for a good
>>algorithm for finding one pattern in a source string...
>>
>It may not be a hashing function, but it does find a substring in a given
>string.  Kind of a quick and dirty type utility one writes when needed.
>----------cut here-----------
>/* searchfor returns 1 if substring is a substring of string.  It
>returns 0 otherwise. The theory behind searchfor is to search for
>the first character of the substring in the source string, and then
>compare from there.  This helps speed up the process by reducing the
>calls to strncmp. */
> [algorithm deleted!]
I don't mean to be rude, but "Ack! Thpppt!"  Creeping along looking
for the first character is just about the worst way to find a
string!  (It does, however, have the advantage of being simple!)
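
For comparison, here is a rough C sketch of the kind of search the
quoted comment describes: find the first character, then strncmp from
there.  This is NOT the deleted original, only my guess at its shape;
the name searchfor_sketch and the handling of an empty substring are
assumptions.

```c
#include <string.h>

/* Hypothetical reconstruction of the first-character approach.
   Returns 1 if substring occurs in string, 0 otherwise. */
int searchfor_sketch(const char *string, const char *substring)
{
    size_t n = strlen(substring);
    const char *p = string;

    if (n == 0)
        return 1;       /* assumption: an empty pattern counts as found */

    /* Jump to each occurrence of the first character, then compare. */
    while ((p = strchr(p, substring[0])) != NULL) {
        if (strncmp(p, substring, n) == 0)
            return 1;
        p++;            /* not a match here; resume one character later */
    }
    return 0;
}
```

In the worst case (lots of partial matches) this looks at each text
character many times, which is what the rest of this post complains
about.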

Since I am currently taking a graduate course in algorithms and we
are doing (you guessed it) string searching, I recommend the
Boyer-Moore algorithm.  It has the interesting property of usually
working faster the longer the substring is, since a mismatch can let
it skip ahead by up to len(sub) characters at once!

Here it is (directly from my notes and untested) in pseudo-code:
BEGIN B-M
	/* Skip[c] = distance from the last occurrence of c in sub */
	/* to the end of sub, or len(sub) if c is not in sub */
	For c := 0 to max-char
		Skip[c] := len(sub)
	For i := 1 to len(sub)
		Skip[sub[i]] := len(sub) - i
	Tstart := 1
	OUT-LOOP: While Tstart <= len(str) - len(sub) + 1
		s := len(sub)
		t := Tstart + len(sub) - 1
		/* Compare right to left; test s > 0 first so sub[0] is never read */
		IN-LOOP: While s > 0 and sub[s] = str[t]
			s := s - 1
			t := t - 1
		END IN-LOOP
		If s = 0 then return Tstart
		slide := Skip[str[t]] - len(sub) + s
		If slide < 1 then slide := 1
		Tstart := Tstart + slide
	END OUT-LOOP
	Return 0 /* Not found */
END B-M
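
For the C-inclined, here is a rough sketch of how the same idea might
look in C.  It uses only a per-character skip table; full Boyer-Moore
adds a second shift rule based on the suffix already matched, which is
left out here.  The names bm_search and skip are my own, and like the
pseudo-code it returns a 1-based position, or 0 when sub is not found.
Treating an empty sub as a match at position 1 is an assumption.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Sketch of a skip-table (Boyer-Moore style) substring search.
   Returns the 1-based position of sub in str, or 0 if absent. */
size_t bm_search(const char *str, const char *sub)
{
    size_t skip[UCHAR_MAX + 1];
    size_t n = strlen(str), m = strlen(sub);
    size_t i, tstart;

    if (m == 0)
        return 1;           /* assumption: empty pattern matches at 1 */
    if (m > n)
        return 0;

    /* skip[c] = distance from the last occurrence of c in sub to the
       end of sub, or m if c does not occur in sub at all. */
    for (i = 0; i <= UCHAR_MAX; i++)
        skip[i] = m;
    for (i = 0; i < m; i++)
        skip[(unsigned char)sub[i]] = m - 1 - i;

    tstart = 0;             /* 0-based start of the current alignment */
    while (tstart <= n - m) {
        size_t s = m;       /* pattern characters still to be matched */
        size_t back;        /* characters already matched at the right */
        unsigned char c;

        /* Compare right to left. */
        while (s > 0 && sub[s - 1] == str[tstart + s - 1])
            s--;
        if (s == 0)
            return tstart + 1;

        /* Mismatch on text character c: line it up with its last
           occurrence in sub, but always move at least one place. */
        c = (unsigned char)str[tstart + s - 1];
        back = m - s;
        tstart += (skip[c] > back) ? skip[c] - back : 1;
    }
    return 0;
}
```

Something like bm_search("hello world", "world") should hand back 7.
The casts to unsigned char are there so characters above 127 do not
turn into negative table indexes.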

I highly recommend the book _Algorithms_ by Robert Sedgewick for all
the gory details of Boyer-Moore.  In fact this book has over 40
chapters of excellent algorithms for doing just about anything.  Any
good Comp. Sci. bookstore will have it (I've seen it in B.Dalton's
Software Etc.)

Hope this helps!

-- 
  H3nry C. Schmitt     | CompuServe: 72275,1456  (Rarely)
                       | GEnie: H.Schmitt  (Occasionally)
 Royal Inn of Yoruba   | UUCP: Henry@chinet.chi.il.us  (Best Bet)