Path: utzoo!telly!ddsw1!mcdchg!rutgers!apple!voder!pyramid!cbmvax!uunet!igor!dsb@Rational.COM
From: dsb@Rational.COM (David S. Bakin)
Newsgroups: gnu.g++
Subject: Re: Perfect hash function for g++ reserved words
Message-ID: <323@igor.UUCP>
Date: 20 Sep 88 16:44:22 GMT
References: <8809192248.aa03503@PARIS.ICS.UCI.EDU>
Sender: news@igor.UUCP
Reply-To: dsb@Rational.COM (David S. Bakin)
Distribution: gnu
Organization: Rational
Lines: 28
In-reply-to: schmidt%siam.ics.uci.edu@PARIS.ICS.UCI.EDU ("Douglas C. Schmidt")

I don't want to make a big fuss, especially as it is someone else who is
being so good about writing code and posting it, but ...

A perfect hash function isn't the best choice for this purpose if any effort
is spent on making it a minimal perfect hash function.  I notice that the
g++ hash function just posted isn't minimal, but at the same time some
table entries are duplicated instead of being left NULL.

Since 1) a test against NULL is faster than a string comparison
  and 2) most (statistics, anyone?) calls to the hash function produce
         the result MISMATCH, because most calls are for identifiers, not
         for reserved words (especially in C, maybe not as much for Ada)
  and 3) memory is cheap (at least, you can't convince me that gcc/g++
         is designed otherwise)
then it is a good idea to have a perfect hash function that hashes randomly
into a large table -- say one with at least 50% or even 75% NULL entries,
so that only 1-in-2 or 1-in-4 probes result in a string comparison while
the rest terminate immediately.
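
To make the idea concrete, here is a rough sketch in C.  It is not the
posted g++ code: the word list is a small made-up subset, and the names
word_hash, build_table and reserved_lookup, as well as the hash itself
(length + first char + last char), are assumptions chosen only for
illustration.  With 8 words in 32 slots, 75% of the entries stay NULL.

```c
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 32   /* 8 words in 32 slots: 75% of entries stay NULL */

/* Illustrative subset of reserved words, NOT the real g++ list. */
static const char *const reserved_words[] = {
    "if", "int", "for", "else", "while", "return", "char", "do"
};
#define WORD_COUNT (sizeof reserved_words / sizeof reserved_words[0])

static const char *table[TABLE_SIZE];

/* Hypothetical hash: length + first char + last char, mod table size.
   Caller must pass len >= 1. */
unsigned word_hash(const char *s, size_t len)
{
    return (unsigned)(len + (unsigned char)s[0] + (unsigned char)s[len - 1])
           % TABLE_SIZE;
}

/* Fill the sparse table.  Returns 0 if the hash is perfect for the word
   list (no two words share a slot), -1 on a collision. */
int build_table(void)
{
    size_t i;

    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = NULL;

    for (i = 0; i < WORD_COUNT; i++) {
        const char *w = reserved_words[i];
        unsigned h = word_hash(w, strlen(w));

        if (table[h] != NULL)
            return -1;
        table[h] = w;
    }
    return 0;
}

/* Returns the matching reserved word, or NULL for MISMATCH. */
const char *reserved_lookup(const char *s)
{
    size_t len = strlen(s);
    const char *entry;

    if (len == 0)
        return NULL;

    entry = table[word_hash(s, len)];
    if (entry == NULL)
        return NULL;            /* empty slot: exit without any strcmp */

    return strcmp(entry, s) == 0 ? entry : NULL;
}
```

Call build_table() once at startup, then reserved_lookup() for each token.
An identifier such as "count" lands in an empty slot and is rejected by the
NULL test alone; one such as "in" happens to land on the slot for "char"
and costs a single strcmp before being rejected.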

Again -- I appreciate the effort of other people to write the code and post
it!

-- Dave 
-- 
----------------------------------------------------------
Dave Bakin				    (408) 496-3600
c/o Rational; 3320 Scott Blvd.; Santa Clara, CA 95054-3197
Internet:  dsb@rational.com	 Uucp:  ...!uunet!igor!dsb