Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: $Revision: 1.6.2.16 $; site pbear.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!ucbvax!decvax!yale!pbear!peterb
From: peterb@pbear.UUCP
Newsgroups: net.arch
Subject: Re: Re: RISC (really on multiplication d
Message-ID: <600005@pbear.UUCP>
Date: Sat, 13-Jul-85 00:24:00 EDT
Article-I.D.: pbear.600005
Posted: Sat Jul 13 00:24:00 1985
Date-Received: Wed, 17-Jul-85 04:55:56 EDT
References: <149@mips.UUCP>
Lines: 29
Nf-ID: #R:mips:-14900:pbear:600005:000:1349
Nf-From: pbear!peterb    Jul 13 00:24:00 1985


/* Written  9:12 pm  Jul 11, 1985 by amdahl!mat in pbear:net.arch */
>>
>> Having a hardware multiplication instruction isn't as much of a win as
>> you might think.  On everybody's favorite chip, the 8088, a 16 x 16
>> multiply takes about 115 cycles, while shifts and adds are 2 and 3 cycles
>> respectively.  This means that for practically any constant multiplier
>> you'll get faster code by constructing your multiply from shifts, adds,
>> and subtracts.
>>
>Unless the multiplier is very wide and smart enough to do things like sum
>partial products in parallel with each multiplication.  This gives speed
>you couldn't get with the above.  However, it sure is not obvious what
>the right thing to do is.
>--
>Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat
/* End of text from pbear:net.arch */
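
To make the quoted shift-and-add point concrete, here is a minimal sketch in
Python (not 8088 code) of multiplying by a constant with two shifts and one
add. The constant 10, the name times_10, and the 16-bit mask are my own
assumptions for illustration; nothing here comes from the quoted article
beyond the general idea.

```python
MASK16 = 0xFFFF  # assumption: emulate a 16-bit register that wraps around


def times_10(x):
    """Multiply x by 10 with shifts and one add, since 10*x = 8*x + 2*x."""
    return ((x << 3) + (x << 1)) & MASK16
```

At the quoted 2 cycles per shift and 3 per add, a sequence like this should
cost on the order of ten cycles plus register moves, against roughly 115 for
the multiply instruction.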

From the timings, one can see that an 8-bit multiply varies from 70 to 77
cycles and a 16-bit multiply from 118 to 133, a spread of 7 and 15 cycles
respectively. This variation is a byproduct of the shift/add sequence used to
do the multiply, and it is slower than it need be. A 26116 can do a 16 by 16
multiply in about 19 clock cycles; that's about 1.9 microseconds at a 100 ns
cycle time.
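
For anyone who wants to see why a shift/add multiply has data-dependent
timing, here is a minimal sketch in Python. It is the generic textbook
technique, not the 8088's actual microcode, which I have not seen; the
function names and the "adds" counter are hypothetical and only there to show
that the work done depends on how many bits are set in the multiplier. The
second function just redoes the cycles-to-microseconds arithmetic above.

```python
def shift_add_multiply(a, b, bits=16):
    """Return (product, adds) for unsigned a * b done by shift-and-add.

    adds counts the partial products added, one per set bit of b.
    """
    product = 0
    adds = 0
    for i in range(bits):
        if (b >> i) & 1:        # this multiplier bit is set
            product += a << i   # add the shifted multiplicand
            adds += 1
    return product, adds


def multiply_time_us(cycles, cycle_ns):
    """Convert a cycle count to microseconds at the given cycle time."""
    return cycles * cycle_ns / 1000.0
```

A multiplier of 0 needs no adds and a multiplier of 0xFFFF needs 16, so a
machine that spends an extra cycle on each add will show a spread like the one
in the timings. multiply_time_us(19, 100) gives the 1.9 microsecond figure.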

Why it is so slow I don't know. Could anybody explain this?

Peter Barada
{ihnp4!inmet|{harvard|cca}!ima}!pbear!peterb