Path: utzoo!attcan!uunet!kddlab!icot32!icot21!chik
From: chik@icot21.icot.junet (Chikayama Takashi)
Newsgroups: comp.lang.prolog
Subject: Re: common subterms (was Re: Why no macro facility?)
Message-ID: 
Date: 15 Jul 88 01:32:18 GMT
References: <9671@lll-winken.llnl.gov> <6899@burdvax.PRC.Unisys.COM> <6206@megaron.arizona.edu> <561@ecrcvax.UUCP>
Sender: chik@icot32.icot.junet
Reply-To: chikayama@icot.junet
Organization: Institute for New Generation Computer Technology, Tokyo, Japan.
Lines: 33
In-reply-to: micha@ecrcvax.UUCP's message of 14 Jul 88 07:29:14 GMT


In article <561@ecrcvax.UUCP> micha@ecrcvax.UUCP (Micha Meier) writes:

	I have also thought about factoring common terms, however it seems
	to me that recognizing them can become *very* costly, especially
	when many lists occur in the clause, or do you have some
	clever algorithm?

The cost should be not much more than linear in the size of the
program.  You can compute a hash value for every structure occurring
in a clause and register each structure in a hash table.  On
registration, an earlier occurrence of the same structure can be
recognized.
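
A minimal sketch of the registration step, in Python rather than
Prolog for brevity.  The term representation (atoms and variables as
strings, compound terms as tuples headed by the functor) and the names
subterms and common_subterms are assumptions made for illustration,
not anything from an actual compiler.  It uses Python's built-in
hash(), which rehashes nested tuples from scratch, so it shows only
the table lookup and not the linear-cost hashing.

```python
from collections import defaultdict

# Assumed term representation: atoms/variables are strings,
# compound terms are tuples (functor, arg1, ..., argN).

def subterms(term):
    """Yield every compound subterm of term, children before parents."""
    if isinstance(term, tuple):
        for arg in term[1:]:
            yield from subterms(arg)
        yield term

def common_subterms(clause):
    """Return the compound subterms that occur more than once in clause."""
    table = defaultdict(list)          # hash value -> list of [term, count]
    for t in subterms(clause):
        bucket = table[hash(t)]
        for entry in bucket:
            if entry[0] == t:          # structural match, cost ~ size of t
                entry[1] += 1
                break
        else:
            bucket.append([t, 1])      # first occurrence: register it
    return [term for bucket in table.values()
            for term, count in bucket if count > 1]

# p(f(X, g(Y)), h(f(X, g(Y)))): both g(Y) and f(X, g(Y)) occur twice.
clause = ('p', ('f', 'X', ('g', 'Y')), ('h', ('f', 'X', ('g', 'Y'))))
repeated = common_subterms(clause)
```

Keeping a bucket list per hash value means an accidental collision
only costs one extra comparison; it never merges distinct structures.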

Computing the hash value of a lowest-level structure costs time linear
in its size.  If the hash function is designed so that the hash value
of an upper-level structure depends only on the hash values of its
elements, the total cost of computing all the hash values is still
linear (you don't have to recompute them for the elements).
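
One way such a bottom-up hash might look, again sketched in Python.
The tuple representation of terms, the name hash_all, the multiplier
1000003 and the 64-bit mask are all assumptions chosen for the
example; any mixing function that consumes only the functor, the
arity and the children's hash values would serve.

```python
MASK = (1 << 64) - 1   # keep hash values in a fixed 64-bit range

def hash_all(term, out):
    """Return the hash of term, appending (hash, subterm) to out for
    every compound subterm.  Each node is visited exactly once and its
    hash is combined from the hashes already computed for its children,
    so the total work is linear in the size of the term."""
    if not isinstance(term, tuple):
        return hash(term) & MASK               # atom or variable
    h = hash((term[0], len(term) - 1)) & MASK  # functor and arity
    for arg in term[1:]:
        child = hash_all(arg, out)             # computed once, reused here
        h = ((h * 1000003) ^ child) & MASK
    out.append((h, term))
    return h

# p(f(X, g(Y)), h(f(X, g(Y))))
clause = ('p', ('f', 'X', ('g', 'Y')), ('h', ('f', 'X', ('g', 'Y'))))
hashed = []
hash_all(clause, hashed)
```

The pairs in hashed come out children first, so they can be
registered in a table in that order without hashing anything twice.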

Registration costs constant time when the same structure is _not_
already in the table, as long as the hash table is not too densely
populated and the hash function is designed properly to avoid too
many accidental collisions.  When the same structure is already there,
matching costs time proportional to the size of the structure.  Thus,
in the worst case, the total cost is proportional to the square of the
clause size.  However, as this _worst_ case is the _best_ case for
common subterm factoring, it pays off, doesn't it?
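
To make the quadratic worst case concrete, here is a small Python
sketch that counts comparison steps when a list of n elements occurs
twice in a clause.  Every suffix of the second copy is itself a
repeated structure, and matching each one against its first
occurrence costs time proportional to its length.  The names match,
make_list and match_steps, and the encoding of lists as nested
('.', Head, Tail) tuples, are assumptions made for illustration; the
matching is deliberately the naive full structural comparison
described above.

```python
def match(a, b, counter):
    """Structural comparison that counts every node it visits."""
    counter[0] += 1
    if isinstance(a, tuple) and isinstance(b, tuple):
        return (len(a) == len(b) and a[0] == b[0]
                and all(match(x, y, counter)
                        for x, y in zip(a[1:], b[1:])))
    return a == b

def make_list(n):
    """Build the list [e1, ..., en] as nested ('.', Head, Tail) tuples."""
    t = '[]'
    for i in range(n, 0, -1):
        t = ('.', 'e%d' % i, t)
    return t

def match_steps(n):
    """Count the steps needed to match every suffix of a second copy
    of an n-element list against the same suffix of the first copy."""
    a, b = make_list(n), make_list(n)
    counter = [0]
    while isinstance(a, tuple):
        match(a, b, counter)       # cost proportional to remaining length
        a, b = a[2], b[2]          # move on to the next suffix
    return counter[0]

steps_10 = match_steps(10)
steps_20 = match_steps(20)
```

Doubling the list length here more than triples the step count, which
is the growth one would expect from a total proportional to the
square of the size.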

If your compiler is written in a language in which efficient hashing
is difficult, such as _standard_ Prolog, it's a thousand pities :-)

T.Chikayama