Path: utzoo!censor!geac!yunexus!oz
From: oz@yunexus.UUCP (Ozan Yigit)
Newsgroups: comp.lang.c
Subject: Re: Perfect HASH functions.....(tries)
Keywords: hash 16-bit runtime
Message-ID: <4043@yunexus.UUCP>
Date: 30 Sep 89 04:59:38 GMT
References: <9900014@bradley> <1989Sep23.192021.26473@paris.ics.uci.edu> <1989Sep24.214153.8867@rpi.edu> <1989Sep26.133339.2890@twwells.com> <3987@yunexus.UUCP> <1989Sep29.021823.12598@twwells.com>
Reply-To: oz@yunexus.UUCP (Ozan Yigit)
Organization: York U. Communications Research & Development
Lines: 29

In article <1989Sep29.021823.12598@twwells.com> bill@twwells.com (T. William Wells) writes:

>I guess I was unclear. The trie (which I know about, most of my data
>compression work deals with them) is actually a degenerate case of my
>"invention". The neat trick was in first generating a hash code of
>something to be stored in a table, then using the hash code to search
>the trie.

Well, maybe I was unclear.... It is a pretty useful, but still very
old, trick, dating back to 1978:

	Per-Ake Larson
	"Dynamic Hashing", BIT 18 (1978).

This is exactly what dbm/ndbm and my pd sdbm use. Details of how ndbm
does this trie traversal via the hash value have been discussed in
this group before.

>.. using the trie means that a
>full bucket only affects the current subtree, resulting in both space
>and time savings, with the drawback of doing memory allocations.

Right, but you have to be very careful with the hash function: (back
to where we started) it has to be "bit-randomizing" [in fact, Larson
uses a hash-number seeded random binary-number generator] so that the
trie will not "one way branch" due to a large number of common bits in
the hash values.

oz
-- 
The king: If there's no meaning in it, that saves a world of trouble
you know, as we needn't try to find any.
	Lewis Carroll (Alice in Wonderland)
Usenet: oz@nexus.yorku.ca
        ......!uunet!utai!yunexus!oz
Bitnet: oz@[yulibra|yuyetti]
Phonet: +1 416 736-5257x3976
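
For readers who want to see the hash-then-walk-the-trie idea spelled out, here is a minimal in-memory sketch in C. It is not the dbm/ndbm/sdbm code, and it is not Larson's exact scheme either: it consumes the hash bits directly, low bit first, where Larson feeds the hash to a random bit generator. Every name in it (node, hash_fn, trie_insert, trie_lookup, len_hash, BUCKET_CAP, MAX_DEPTH) is made up for the illustration, keys are assumed to be distinct, and the mixing hash is just one plausible "bit-randomizing" choice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUCKET_CAP 4    /* keys per leaf bucket (arbitrary for the sketch) */
#define MAX_DEPTH  32   /* one trie level per bit of a 32-bit hash */

typedef uint32_t (*hash_fn)(const char *);

typedef struct node {
    struct node *child[2];          /* both NULL in a leaf */
    int count;                      /* keys stored here (leaves only) */
    const char *keys[BUCKET_CAP];   /* borrowed pointers, not copies */
} node;

/* FNV-1a plus a final mixing step, so the low bits are well scrambled. */
static uint32_t hash_str(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
    h ^= h >> 16; h *= 0x85ebca6bu;
    h ^= h >> 13; h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Deliberately poor hash: the low 8 bits are always zero. */
static uint32_t len_hash(const char *s) { return (uint32_t)strlen(s) << 8; }

static node *new_node(void) {
    node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    return n;
}

static void trie_free(node *n) {
    if (n == NULL) return;
    trie_free(n->child[0]);
    trie_free(n->child[1]);
    free(n);
}

/* Walk down from the root, using hash bit d to pick the branch at depth d. */
static node *find_leaf(node *n, uint32_t h, int *depth) {
    int d = 0;
    while (n->child[0] != NULL) {
        n = n->child[(h >> d) & 1u];
        d++;
    }
    if (depth != NULL) *depth = d;
    return n;
}

static int trie_lookup(node *root, hash_fn hf, const char *key) {
    node *leaf = find_leaf(root, hf(key), NULL);
    for (int i = 0; i < leaf->count; i++)
        if (strcmp(leaf->keys[i], key) == 0) return 1;
    return 0;
}

/* Insert a key. A full leaf is split on its next hash bit, so only that
 * subtree changes. Returns the depth of the leaf that received the key,
 * or -1 if the hash bits ran out before the bucket could be split. */
static int trie_insert(node *root, hash_fn hf, const char *key) {
    uint32_t h = hf(key);
    int d;
    node *leaf = find_leaf(root, h, &d);
    while (leaf->count == BUCKET_CAP) {
        if (d == MAX_DEPTH) return -1;
        leaf->child[0] = new_node();
        leaf->child[1] = new_node();
        for (int i = 0; i < BUCKET_CAP; i++) {
            node *c = leaf->child[(hf(leaf->keys[i]) >> d) & 1u];
            c->keys[c->count++] = leaf->keys[i];
        }
        leaf->count = 0;
        leaf = leaf->child[(h >> d) & 1u];
        d++;
    }
    leaf->keys[leaf->count++] = key;
    return d;
}
```

Start with an empty root from new_node() and pass the same hash function to every call on a given trie. Feeding it len_hash shows the "one way branch" problem directly: every key goes left for the first eight levels, so a single bucket overflow costs a chain of splits before any hash bit tells the keys apart, whereas a well-mixed hash normally separates them within a level or two.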