Xref: utzoo comp.arch:6121 comp.lang.prolog:1204
Path: utzoo!utgpu!attcan!uunet!seismo!sundc!pitstop!sun!quintus!ok
From: ok@quintus.uucp (Richard A. O'Keefe)
Newsgroups: comp.arch,comp.lang.prolog
Subject: Re: Perils of comparison -- an example
Message-ID: <295@quintus.UUCP>
Date: 20 Aug 88 05:41:06 GMT
References: <282@quintus.UUCP> <15221@shemp.CS.UCLA.EDU> <292@quintus.UUCP> <1303@eos.UUCP>
Sender: news@quintus.UUCP
Reply-To: ok@quintus.UUCP (Richard A. O'Keefe)
Organization: Quintus Computer Systems, Inc.
Lines: 52

In article <1303@eos.UUCP> eugene@eos.UUCP (Eugene Miya) writes:
>In article <292@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>>Don't get me wrong: Naive Reverse is not a specially good benchmark.
>
>I see you came from prolog and cross posted to arch.
>
Misemphasis: it was a joint posting to both groups because I thought
the original article (comments on a paper about a new architecture)
was relevant to both groups.

>I've stated this many times in comp.arch, and I'll repeat this once
>for the Prolog community's benefit.  Measurement of repetition
>isn't equivalent to repetition of measurement on a computer.  Cache,
>paging, and optimization conspire against oversimplistic
>measurements of this type.

We *know* that.  But we *also* know that if you measure one iteration
of a typical micro-benchmark it falls below the resolution of the clock.
The point of running nrev a few thousand times is to get a figure that
can be distinguished from clock quantisation.

Let me summarise my position here:
(1) A paper describing a machine called DLM appeared in FGCS.
(2) The paper compared the DLM with a 68020 using *different* micro benchmarks
(3) one of which is the official definition of LI/s, but
(4) neither of which is good.
(5) Because of (2) and other reasons, it appears that the special-purpose
    machine is not as much of an advance over conventional chips as it seems.
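
The repetition argument above (run the micro-benchmark enough times that
the total rises above the clock's resolution) can be sketched as follows.
This is a minimal Python illustration, not code from the post or from any
Prolog system: the names nrev and time_repeated, the 30-element list, and
the figure of 5000 repetitions are all assumptions chosen for the example.

```python
import time


def nrev(xs):
    """Naive reverse: reverse the tail, then append the head (quadratic)."""
    if not xs:
        return []
    return nrev(xs[1:]) + [xs[0]]


def time_repeated(fn, arg, repetitions):
    """Time `repetitions` calls of fn(arg); return (total, per_call) in seconds."""
    start = time.perf_counter()
    for _ in range(repetitions):
        fn(arg)
    total = time.perf_counter() - start
    return total, total / repetitions


# Hypothetical workload: a 30-element list, repeated 5000 times (assumed values).
data = list(range(30))
total, per_call = time_repeated(nrev, data, 5000)
print(f"total {total:.6f} s, about {per_call:.9f} s per call")
```

The per-call figure is only an average over the whole loop. As the quoted
objection says, it reflects warm caches and repeated execution, not a
single cold run.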

>I've been trying to find out what "really constitutes a Logical
>Instruction"  As far as I can tell, it's totally arbitrary whereas
>Instructions and Operations tend to correspond to discrete states
>(barring instruction pipelining, yes yes....).  (Yes I have Gabriel's
>thesis and others).

Gabriel's thesis?  Do you mean Tick's?
Logical Inferences per second are defined by the naive reverse benchmark
and by nothing else.  The place to look for the definition is Warren's thesis.
The term has been *mis*applied as "number of procedure calls per second"
which can be almost anything depending on what code you run.

>Your keyword about measuring prolog is "Naive."  This isn't a putdown,
>but the Prolog community will have to recognize some of these problems.
>Another gross generalization from Eugene Miya.

Well, if it isn't a putdown, it'll do until a real one comes along.
We *know* that these micro-benchmarks don't extrapolate well, but they're
the best we've got.  ("Naive" refers to the algorithm, by the way.)

Some constructive advice about how to structure a benchmark suite to
compare implementations of a high-level language on a range of 32-bit
workstations would be really welcome, and if the comp.arch community
is really so sophisticated, such advice should be forthcoming, no?
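
To make the difference between the naive-reverse definition and a raw
"procedure calls per second" count concrete, here is a hedged Python
sketch that counts the calls made by naive reverse. It is an illustration
only, not Warren's definition: the function names are hypothetical, and
the figure of 496 calls for a 30-element list is the conventionally quoted
count, which the post itself does not state.

```python
def nrev_counting(xs):
    """Naive reverse that also counts every nrev/append call it makes."""
    calls = 0

    def append(a, b):
        nonlocal calls
        calls += 1
        if not a:
            return b
        return [a[0]] + append(a[1:], b)

    def nrev(ys):
        nonlocal calls
        calls += 1
        if not ys:
            return []
        return append(nrev(ys[1:]), [ys[0]])

    return nrev(xs), calls


def lips(inferences_per_run, runs, seconds):
    """Inferences per second for `runs` repetitions taking `seconds` in total."""
    return inferences_per_run * runs / seconds


n = 30
result, count = nrev_counting(list(range(n)))
# For an n-element list the call count is (n + 1) * (n + 2) / 2.
print(count, (n + 1) * (n + 2) // 2)
```

Because the count is fixed by the benchmark itself, a rate computed this
way is comparable across implementations, whereas a call count taken from
arbitrary code is not.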