Path: utzoo!utgpu!watmath!clyde!att!pacbell!ames!sgi!arisia!quintus!ok
From: ok@quintus.uucp (Richard A. O'Keefe)
Newsgroups: comp.arch
Subject: Re: BENCHMARKS AND LIPS
Message-ID: <798@quintus.UUCP>
Date: 2 Dec 88 10:49:02 GMT
References: <1740MLWLG@CUNYVM> <746@quintus.UUCP> <595@mqcomp.oz>
Sender: news@quintus.UUCP
Reply-To: ok@quintus.UUCP (Richard A. O'Keefe)
Organization: Quintus Computer Systems, Inc.
Lines: 36

In article <595@mqcomp.oz> s8504867@mqcomp.mq.oz (John Gardner) writes:
>In article <746@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>>Logical Inferences Per Second is a property of a _Prolog_ implementation,
>Come off the grass. That's like saying MIPS are a property of C
>only. While it is nice to do benchmarks by only varying what you want to
>compare, it is certainly valid to calculate LIPS using any theorem prover,
>not just prolog. Do you think that all other theorem provers are incapable
>of logical inferences ? Do you think prolog is the only language available
>for this sort of work ?

Stop knocking down straw men; you'll only get straw in your hair and
then what will the neighbours think? It is no more "valid to calculate
LIPS using any theorem prover" than it is to calculate Dhrystones "using
any program". Remember: the meaning of a compound noun is *not* a simple
composition of the meanings of the words it is made from. Is a foot-hill
a hill made of feet? Is a benchmark a mark on a bench? (Not now.)

Yes, other programs are theorem provers capable of drawing logical
inferences. Prolog is a pretty weak theorem prover; that's _why_ it is
usable as a programming language. If you compare the number of
resolutions per second in a good theorem prover (say Markgraf Karl, or
ITP) with the kind of LIPS rating a good Prolog should get, (a) the
theorem prover would look _terrible_, and (b) you would learn nothing of
interest. In fact, you don't learn a whole lot comparing the LIPS rating
of two Prolog systems, either.
Run some tests and you can get quite a few more procedure calls per
second than the LIPS rating; run some others and you can get far fewer.
We should be trying to get rid of "LIPS", not trying to spread the
disease.

For comparing theorem provers in general, there's a book of examples
from one of the Argonne crowd which might be useful for benchmarking.
CPU and wall-clock time in seconds to solve each of those problems would
be much more illuminating than a single figure which favours depth-first
search.
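
To make the last suggestion concrete, here is a minimal sketch of a
per-problem timing harness in Python. It is only an illustration: the
names time_problem and report, and the two stand-in "problems" busy and
sleepy, are hypothetical and are not taken from any benchmark suite. In
real use each entry would run a prover on one problem.

```python
import time

def time_problem(solve):
    """Return (cpu_seconds, wall_seconds) for a single call to solve()."""
    cpu0, wall0 = time.process_time(), time.perf_counter()
    solve()
    return time.process_time() - cpu0, time.perf_counter() - wall0

def report(problems):
    """Time each named problem; return a list of (name, cpu, wall) rows."""
    rows = []
    for name, solve in problems.items():
        cpu, wall = time_problem(solve)
        rows.append((name, cpu, wall))
    return rows

# Hypothetical stand-ins for real benchmark problems (assumptions only).
def busy():
    # CPU-bound work: CPU time and wall time should be close.
    sum(i * i for i in range(200_000))

def sleepy():
    # Mostly waiting: wall time should exceed CPU time.
    time.sleep(0.05)

if __name__ == "__main__":
    for name, cpu, wall in report({"busy": busy, "sleepy": sleepy}):
        print(f"{name:10s} cpu {cpu:8.3f} s   wall {wall:8.3f} s")
```

Reporting one row per problem, rather than folding everything into one
number, keeps it visible when a system does well on some problems and
badly on others.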