Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!uflorida!novavax!twwells!bill From: bill@twwells.com (T. William Wells) Newsgroups: comp.lang.c Subject: Re: Perfect HASH functions..... Keywords: hash 16-bit runtime Message-ID: <1989Sep26.133339.2890@twwells.com> Date: 26 Sep 89 13:33:39 GMT References: <9900014@bradley> <1989Sep23.192021.26473@paris.ics.uci.edu> <1989Sep24.214153.8867@rpi.edu> Organization: None, Ft. Lauderdale, FL Lines: 46 In article <1989Sep24.214153.8867@rpi.edu> flynn@acm.rpi.edu (Kevin Lincoln Flynn) writes: : As long as people are talking hashing, can anyone think of a GOOD way to hash : 16-bit numbers determined dynamically at runtime into a fast way to look them : up? Specifically, I'm talking about the attribute dictionary of an object in : an OO system we're working on here -- the tags of attributes are 16-bit handles : about which nothing is known at runtime. Someone suggested closed hashing, : with the number of buckets growing by roughly doubling primes.... ie start : with two buckets, then 3, then 7, then 17, etc..... any ideas out there? : (oh yeah, the hash function as well as the strategy'd by nice. [ :) ]) Here's my invention, which is not quite a hash table, but will do quite as well: create a binary tree in which the numbers are used to control the descent of the tree. In other words, if you had three numbers (of four bits): 0100, 0110, 1000, your tree would look like this: * / \ * 1000 / \ 0100 0110 To find 0110, look at the left bit, a zero, which means go left; the next bit, a 1 means go right, and then, since we are at a leaf, we are done. If the item is not in the tree, one runs into a leaf that is not equal to the item or a null leaf pointer. Both insertion and deletion should be fairly obvious. The times for search, insert, and delete are all bounded and are linear in the maximum depth of the tree. There are no collisions and it is very fast to search. In fact, the only drawback I know of is that you really should dynamically allocate the node structures, unless you are willing to accept a difficult to compute upper bound on the number of things you can store in the tree. You can, however, add additional information to the nodes and keep the number of nodes to no more than 50%, more or less, of the number of leaves. There are, of course, many variations on this. The one I invented it with generates hash code for strings, uses the hash code to control the tree traversal, and stores the items in small buckets at the leaves. --- Bill { uunet | novavax | ankh | sunvice } !twwells!bill bill@twwells.com