Path: utzoo!censor!geac!jtsv16!uunet!tut.cis.ohio-state.edu!ucbvax!agate!bionet!ames!ll-xn!mit-eddie!uw-beaver!tektronix!sequent!jjb
From: jjb@sequent.UUCP (Jeff Berkowitz)
Newsgroups: comp.arch
Subject: Re: delayed branch
Message-ID: <19870@sequent.UUCP>
Date: 9 Aug 89 06:16:56 GMT
References: <828@eutrc3.urc.tue.nl>
Reply-To: jjb@sequent.UUCP (Jeff Berkowitz)
Organization: Sequent Computer Systems, Inc
Lines: 52

In article <828@eutrc3.urc.tue.nl> rcpieter@rc4.urc.tue.nl writes:
>Just wondering---
> - What happens on existing processors which use delayed branches when the
>instruction put in the branch instruction's shadow is also a branch?

If you haven't thought this through, it's quite amusing.  Given such a
machine, the typical CISC code (erroneous in this case, of course) -

    looptop:
	...
	jsr	subr
	br	looptop

causes exactly *one* instruction at "subr" to be executed before going
back to "looptop".  (When executing the jsr, you fetch the branch; when
executing the branch, you fetch @subr; when executing @subr, you fetch
at looptop, etc.)
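
The fetch sequence above can be traced with a small simulation.  This is
only a sketch, assuming a hypothetical machine with exactly one delay
slot, modelled as a pc/npc pair.  The addresses, the "op" placeholder
opcode and the run() helper are invented for illustration, and the
return address of the jsr is ignored because it never matters here.

```python
# Hypothetical program layout; addresses are list indices.
PROGRAM = [
    ("op", None),        # 0: looptop: loop body
    ("jsr", "subr"),     # 1: call, with one delay slot
    ("br", "looptop"),   # 2: sits in the jsr's shadow
    ("op", None),        # 3: never reached
    ("op", None),        # 4: subr: first instruction
    ("op", None),        # 5: subr: second instruction
]
LABELS = {"looptop": 0, "subr": 4}


def run(program, labels, max_steps):
    """Return the addresses executed, in order, for max_steps steps."""
    pc, npc = 0, 1          # current instruction and the one already fetched
    trace = []
    for _ in range(max_steps):
        op, target = program[pc]
        trace.append(pc)
        next_npc = npc + 1              # default: keep fetching sequentially
        if op in ("br", "jsr"):
            next_npc = labels[target]   # redirect takes effect one slot late
        pc, npc = npc, next_npc
    return trace


if __name__ == "__main__":
    print(run(PROGRAM, LABELS, 8))   # [0, 1, 2, 4, 0, 1, 2, 4]
```

Address 4 (the first instruction at "subr") shows up once per trip
around the loop, and address 5 never executes at all.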

On one machine in my past, the architects were disturbed enough by this
behavior to "fix" it in hardware: if a branch occurred in the shadow,
the machine automatically converted the "shadowed" branch into a noop!
If the first branch happened to be a jsr, as in the example above, the
machine would not only "noop" it but would remember that it should be
the return address of "subr", so it would be executed when you got back.
This allowed code like

	jsr syscall_enter
	bcs error		# executed on return from syscall_enter

to work "just like a CISC".

The machine actually had two instructions in the "shadow" (we called
these "trailers").  So there were three possible return addresses for
any jsr instruction -

    the next instruction, if it was a branch, jump, or jsr
    else the instruction after that, if *it* was a branch, jump, or jsr
    else the instruction after *that*.

The instruction pipeline had to determine which to "push" on-the-fly.
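
As a sketch of the three-way choice listed above (not the actual
hardware logic, which I can't reproduce here): the function name, the
mnemonic set and the list-of-mnemonics program representation are all
assumptions made for illustration.

```python
# Assumed set of control-transfer mnemonics; the real machine's
# instruction set is not given in the text.
CONTROL_TRANSFER = {"br", "bcs", "jmp", "jsr"}


def return_address(program, jsr_addr):
    """Pick the return address for the jsr at jsr_addr.

    program is a list of mnemonics indexed by address; the two
    instructions after the jsr are its trailers.
    """
    for offset in (1, 2):
        # The first trailer that is itself a branch, jump, or jsr gets
        # squashed now and becomes the place to resume on return.
        if program[jsr_addr + offset] in CONTROL_TRANSFER:
            return jsr_addr + offset
    # Neither trailer is a control transfer: both run in the shadow,
    # so execution resumes just past them.
    return jsr_addr + 3


if __name__ == "__main__":
    print(return_address(["jsr", "bcs", "add", "sub"], 0))   # 1
    print(return_address(["jsr", "add", "br", "sub"], 0))    # 2
    print(return_address(["jsr", "add", "sub", "mov"], 0))   # 3
```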

Was this complexity worth it?  Well, presumably it improved code density.
I am not aware of any measurements.  It made breakpoint debugging a
serious pain; you needed a zillion types of breakpoint instructions
so that if you set a breakpoint on a branch that was the second trailer
of a jsr, the breakpoint instruction itself would *also* look like a
branch.  Anyway, the machine was not successful, in part because of the
complexity of trying to implement a lot of little rules like this in
hardware.
-- 
Jeff Berkowitz N6QOM			uunet!sequent!jjb
Sequent Computer Systems		Custom Systems Group