Path: utzoo!attcan!uunet!kddlab!icot32!icot21!chik
From: chik@icot21.icot.junet (Chikayama Takashi)
Newsgroups: comp.lang.prolog
Subject: Re: common subterms (was Re: Why no macro facility?)
Message-ID:
Date: 15 Jul 88 01:32:18 GMT
References: <9671@lll-winken.llnl.gov> <6899@burdvax.PRC.Unisys.COM> <6206@megaron.arizona.edu> <561@ecrcvax.UUCP>
Sender: chik@icot32.icot.junet
Reply-To: chikayama@icot.junet
Organization: Institute for New Generation Computer Technology, Tokyo, Japan.
Lines: 33
In-reply-to: micha@ecrcvax.UUCP's message of 14 Jul 88 07:29:14 GMT

In article <561@ecrcvax.UUCP> micha@ecrcvax.UUCP (Micha Meier) writes:

   I have also thought about factoring common terms, however it seems
   to me that recognizing them can become *very* costly, especially
   when many lists occur in the clause, or do you have some clever
   algorithm?

The cost is expected to be not much more than linear in the size of
the program. You can compute a hash function for all the structures
occurring in a clause, and register them in a hash table. On
registration, earlier occurrences of the same structure can be
recognized.

The cost of computing the hash function can be linear in the size of
the lowest-level structures. If the hashing function is designed so
that the hash value of an upper-level structure depends only on the
hash values of its elements, the total cost of computing the hash
function is still linear (you don't have to recompute it for the
elements).

The cost of registration is constant when the same structure is _not_
already in the table, as long as the hash table is not too densely
populated and the hashing function is designed properly to avoid too
many accidental collisions. When the same structure is already there,
matching takes cost proportional to the size of the structure. Thus,
in the worst case, the total cost is proportional to the square of the
clause size. However, as this _worst_ case is the _best_ case for
common subterm factoring, it pays off, doesn't it?
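
The registration scheme described above might be sketched roughly as follows, in Python rather than Prolog. This is only an illustration under assumptions of my own, not the code of any actual compiler: the term representation (a string for an atom or variable, a tuple `(functor, arg1, ..., argN)` for a structure) and the names `SubtermTable`, `register` and `common_subterms` are hypothetical. It is also a variant of the scheme in the post: the table key of a structure is built from its functor and the table ids of its already-registered arguments, so a hit needs only a shallow comparison rather than a match proportional to the size of the structure.

```python
from typing import Dict, List, Union

# Assumed representation: an atom or variable is a str,
# a structure is a tuple (functor, arg1, ..., argN).
Term = Union[str, tuple]


class SubtermTable:
    """Registers every subterm of a clause bottom-up in a hash table."""

    def __init__(self) -> None:
        self.ids: Dict[tuple, int] = {}   # key -> table id
        self.terms: List[Term] = []       # table id -> first term seen
        self.counts: List[int] = []       # table id -> number of occurrences

    def register(self, term: Term) -> int:
        if isinstance(term, tuple):
            functor, args = term[0], term[1:]
            # The key depends only on the functor and the ids of the
            # arguments, so nothing below the arguments is rehashed.
            key = ("compound", functor, tuple(self.register(a) for a in args))
        else:
            key = ("atomic", term)
        ident = self.ids.get(key)
        if ident is None:                 # first occurrence: new table entry
            ident = len(self.terms)
            self.ids[key] = ident
            self.terms.append(term)
            self.counts.append(0)
        self.counts[ident] += 1
        return ident

    def common_subterms(self) -> List[Term]:
        # Structures that were registered more than once.
        return [t for t, n in zip(self.terms, self.counts)
                if n > 1 and isinstance(t, tuple)]


# p(cons(a, cons(b, nil)), q(cons(b, nil)))
clause = ("p", ("cons", "a", ("cons", "b", "nil")), ("q", ("cons", "b", "nil")))
table = SubtermTable()
table.register(clause)
print(table.common_subterms())  # prints [('cons', 'b', 'nil')]
```

Note that when a large structure occurs twice, its inner structures are reported as common as well; a real compiler would presumably factor only the outermost one.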

If your compiler is written in a language in which efficient hashing
is difficult, such as _standard_ Prolog, it's a thousand pities :-)

T.Chikayama