Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!uunet!twwells!bill From: bill@twwells.com (T. William Wells) Newsgroups: comp.lang.c Subject: Re: Perfect HASH functions..... Keywords: hash 16-bit runtime Message-ID: <1989Sep29.021823.12598@twwells.com> Date: 29 Sep 89 02:18:23 GMT References: <9900014@bradley> <1989Sep23.192021.26473@paris.ics.uci.edu> <1989Sep24.214153.8867@rpi.edu> <1989Sep26.133339.2890@twwells.com> <3987@yunexus.UUCP> Organization: None, Ft. Lauderdale, FL Lines: 38 In article <3987@yunexus.UUCP> oz@yunexus.UUCP (Ozan Yigit) writes: : In article <1989Sep26.133339.2890@twwells.com> bill@twwells.com (T. William Wells) writes: : >Here's my invention, which is not quite a hash table, but will do : >quite as well: create a binary tree in which the numbers are used to : >control the descent of the tree. : : Nice trie !! :-) Actually, it is nice to be able to invent things (I did : that once or twice), even if they are about a decade old. See "radix : search tries" or "compressed tries" or "digital search tries" . Also see : Knuth, or "Handbook of Algorithms and Data Structures" by G. H. Gonnet, : Addison-Wesley ISBN 0-201-14218-X. I guess I was unclear. The trie (which I know about, most of my data compression work deals with them) is actually a degenerate case of my "invention". The neat trick was in first generating a hash code of something to be stored in a table, then using the hash code to search the trie. (Ugh, I hate that word!) Later, a friend of mine was writing a disassembler and wanted a way to translate addresses to symbols and I realized that the identity hash function :-) would do his job quite nicely. (And this "hash" also preserves ordering, which made the job of finding the symbol below or equal to an address fairly easy.) And that version is what I was suggesting. Anyway, I developed the original hash method by a consideration of the hash method wherein the hash table is doubled when a bucket fills and each element of the table is duplicated adjacent to itself. I realized that a transformation of that would have nodes representing the original entries with children nodes representing the newly duplicated nodes. Unlike that method, using the trie means that a full bucket only affects the current subtree, resulting in both space and time savings, with the drawback of doing memory allocations. This is my standard hash routine. Maybe someday I'll stop being a perfectionist about the code and release it. --- Bill { uunet | novavax | ankh | sunvice } !twwells!bill bill@twwells.com