Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!henry
From: henry@utzoo.UUCP (Henry Spencer)
Newsgroups: net.micro.68k,net.micro.16k
Subject: Re: Re: PDP11s vs the micros
Message-ID: <5890@utzoo.UUCP>
Date: Tue, 20-Aug-85 17:50:51 EDT
Article-I.D.: utzoo.5890
Posted: Tue Aug 20 17:50:51 1985
Date-Received: Tue, 20-Aug-85 17:50:51 EDT
References: <1617@hao.UUCP> <847@mako.UUCP> <2422@sun.uucp>
Organization: U of Toronto Zoology
Lines: 38

The following two Dave Trissel quotes are from the same message:

> [we can continue other instructions in parallel] ... Even if the write
> fails (bus errors) there could be several more instructions executed (in fact
> any amount until one is hit which requires the bus again.)

> The stack save equates to about the same overhead as executing 12
> instructions.

In other words, all you need is 12 contiguous non-memory-referencing
instructions and the 68020's stack puke will actually break even!  This
is stretching it a bit, since on the PDP-11 typically every third or fourth
instruction made some sort of memory reference; I doubt that the 68000 family
does much better.  Speaking of the 68000 *family*, note that a 68010 takes
the full performance hit every time, since it doesn't pipeline much.
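
To put a rough number on "stretching it a bit", here is a back-of-envelope
sketch.  It assumes each instruction references memory independently with a
fixed probability (one in three or one in four, per the PDP-11 figure above);
real instruction streams are not independent, and the function name is purely
illustrative.

```python
def prob_clean_run(mem_ref_rate, run_length=12):
    """Chance that `run_length` consecutive instructions all avoid memory.

    Assumption (not from the original posts): each instruction references
    memory independently with probability `mem_ref_rate`.
    """
    return (1.0 - mem_ref_rate) ** run_length


if __name__ == "__main__":
    # "Every third or fourth instruction" touches memory.
    for rate in (1 / 3, 1 / 4):
        print(f"mem-ref rate {rate:.2f}: P(12 clean) = {prob_clean_run(rate):.4f}")
```

Under that assumption the chance of a 12-instruction run with no memory
reference is under 1% at one-in-three and about 3% at one-in-four, so the
break-even case would be rare.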

On the other hand, I'm glad to hear that Motorola did have the sense to
put a floating-point-used flag in the FPU, so at least you don't have to
shovel 300 bytes of state around unnecessarily.

> Intel has a novel approach on their 8087 and 80287 where they let the process
> context switch without saving FP state.  If another process tries using
> floating-point an interrupt occurs letting the OS then swap context only
> when necessary.  The trouble with this technique is that all it takes is
> for one out of every 20 or so context switches to require a re-save and you
> start losing overall processor time over just saving it unconditionally.
> At worst, if you have several processes constantly sharing the FP chip then
> you have essentially forced a complete extra interrupt exception invocation
> for every change in context - a massive penalty.

An interesting possibility would be to have the hardware support *both* an
FPU-used flag *and* a trap-on-first-FPU-use bit.  It would not seem too
difficult to set up some code in the kernel that switches between the
two strategies as a function of the number of FPU context switches that
have occurred lately.
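
A minimal sketch of what such switching logic might look like, written as a
Python model rather than real kernel code.  Everything here is hypothetical:
the class name, the window size, and the strategy labels are made up for
illustration, and the 1-in-20 threshold is simply borrowed from the figure
quoted above, not measured.

```python
from collections import deque


class FpuSwitchPolicy:
    """Pick between two FPU context-switch strategies from recent history.

    "trap": leave FP state in place and trap on first FPU use.
    "flag": check the FPU-used flag and save state at context-switch time.
    All names and defaults are illustrative assumptions.
    """

    def __init__(self, window=40, threshold=1 / 20):
        # 1 if that context switch ended up needing an FP state swap, else 0.
        self.recent = deque(maxlen=window)
        self.threshold = threshold

    def record_switch(self, needed_fp_swap):
        """Note whether the latest context switch required moving FP state."""
        self.recent.append(1 if needed_fp_swap else 0)

    def swap_rate(self):
        """Fraction of recent context switches that needed an FP state swap."""
        if not self.recent:
            return 0.0
        return sum(self.recent) / len(self.recent)

    def strategy(self):
        """Use the trap while FP swaps are rare; otherwise save via the flag."""
        return "flag" if self.swap_rate() >= self.threshold else "trap"
```

The kernel would call record_switch() on each context switch and consult
strategy() to decide whether to arm the trap-on-first-use bit for the next
one.  A real implementation would presumably want some hysteresis so it
doesn't flap between the two near the threshold.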
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry