Path: utzoo!censor!geac!yunexus!oz
From: oz@yunexus.UUCP (Ozan Yigit)
Newsgroups: comp.lang.c
Subject: Re: Perfect HASH functions.....(tries)
Keywords: hash 16-bit runtime
Message-ID: <4043@yunexus.UUCP>
Date: 30 Sep 89 04:59:38 GMT
References: <9900014@bradley> <1989Sep23.192021.26473@paris.ics.uci.edu> <1989Sep24.214153.8867@rpi.edu> <1989Sep26.133339.2890@twwells.com> <3987@yunexus.UUCP> <1989Sep29.021823.12598@twwells.com>
Reply-To: oz@yunexus.UUCP (Ozan Yigit)
Organization: York U. Communications Research & Development
Lines: 29

In article <1989Sep29.021823.12598@twwells.com> bill@twwells.com (T. William Wells) writes:
>I guess I was unclear. The trie (which I know about, most of my data
>compression work deals with them) is actually a degenerate case of my
>"invention". The neat trick was in first generating a hash code of
>something to be stored in a table, then using the hash code to search
>the trie. 

Well, maybe I was unclear.... It is a pretty useful trick, but still a
very old one, dating back to 1978: Per-Ake Larson, "Dynamic Hashing",
BIT 18 (1978). This is exactly what dbm/ndbm and my pd sdbm use. Details
of how ndbm does this trie traversal via the hash value have been
discussed in this group before.
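
For anyone who has not seen the idea, here is a minimal sketch of
walking a binary trie with the bits of a hash value. It is an
illustration only, not the dbm/ndbm or sdbm code: the node layout, the
names (struct node, new_leaf, split_leaf, find_bucket) and the choice of
consuming bits from the low-order end are all my assumptions for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node layout: an internal node has two children,
 * a leaf has none and carries a bucket number instead. */
struct node {
    struct node *child[2];   /* both NULL for a leaf */
    int bucket;              /* meaningful only for a leaf */
};

static struct node *new_leaf(int bucket)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);       /* sketch only: no real error handling */
    n->child[0] = n->child[1] = NULL;
    n->bucket = bucket;
    return n;
}

/* A full bucket is split by turning its leaf into an internal node
 * with two fresh leaves; only this subtree changes. */
static void split_leaf(struct node *n, int bucket0, int bucket1)
{
    assert(n->child[0] == NULL && n->child[1] == NULL);
    n->child[0] = new_leaf(bucket0);
    n->child[1] = new_leaf(bucket1);
}

/* Walk down from the root, using one hash bit per level
 * (low-order bit first), until a leaf is reached. */
static int find_bucket(const struct node *root, unsigned long hash)
{
    const struct node *n = root;

    while (n->child[0] != NULL) {
        n = n->child[hash & 1UL];
        hash >>= 1;
    }
    return n->bucket;
}
```

The point to notice is that find_bucket() never looks at the key
itself, only at successive bits of its hash, so the shape of the trie
depends entirely on how those bits are distributed.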

>.. using the trie means that a
>full bucket only affects the current subtree, resulting in both space
>and time savings, with the drawback of doing memory allocations.

Right, but you have to be very careful with the hash function: (back to
where we started) it has to be "bit-randomizing" [in fact, Larson uses a
random binary-number generator seeded with the hash value] so that the
trie does not degenerate into "one-way branching" because a large number
of bits are common to the hash values.
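
To make the one-way branching concrete, here is a small sketch that
counts how many trie levels two hash values share when the trie
consumes bits from the low-order end. The names shared_levels and
fold_low are made up for the example, and fold_low is a deliberately
simple stand-in for a mixing step, not Larson's seeded generator and
not a hash I would recommend.

```c
#include <stdint.h>

/* Number of levels two hash values travel together in a binary trie
 * that consumes bits starting from the least significant one.
 * Returns 32 if the values are identical (they never separate). */
static int shared_levels(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    int depth = 0;

    if (diff == 0)
        return 32;
    while ((diff & 1u) == 0) {   /* skip the common low-order bits */
        diff >>= 1;
        depth++;
    }
    return depth;
}

/* Hypothetical, very weak mixer: folds the second byte into the
 * lowest byte so that the low-order bits vary with it. */
static uint32_t fold_low(uint32_t x)
{
    return x ^ (x >> 8);
}
```

With raw values such as 0x100 and 0x200 (multiples of 256, as you might
get from aligned addresses), shared_levels() gives 8: the trie has to
grow a chain of eight single-child levels before the two keys part.
After fold_low() the values become 0x101 and 0x202, which already differ
in the lowest bit, so they separate at the root.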

oz
-- 
The king: If there's no meaning             Usenet:    oz@nexus.yorku.ca
in it, that saves a world of trouble        ......!uunet!utai!yunexus!oz
you know, as we needn't try to find any.    Bitnet: oz@[yulibra|yuyetti]
Lewis Carroll (Alice in Wonderland)         Phonet: +1 416 736-5257x3976