Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!mailrus!husc6!necntc!ima!johnl
From: johnl@ima.ISC.COM (John R. Levine)
Newsgroups: comp.arch
Subject: Re: Is the Intel memory model safe from NO-ONE ?!?
Summary: of course not, and for good reasons, too
Keywords: intel 286 segmentation
Message-ID: <1018@ima.ISC.COM>
Date: 12 May 88 17:12:56 GMT
References: <3384@drivax.UUCP> <4053@killer.UUCP>
Reply-To: johnl@ima.UUCP (John R. Levine)
Organization: Not much
Lines: 59

In article <4053@killer.UUCP> elg@killer.UUCP (Eric Green) writes:
>There's one case where segmentation (of code) is a Big Win (segmentation of
>data space is almost never a win): Shared libraries. ...
>
>On a segmented machine, all shared libraries could start at address 0, in
>different segments in different processes. It is a much more elegant solution
>than chopping your address space in such a manner that you cannot have two
>windowing libraries loaded at the same time.  

You'd think so, wouldn't you? Unfortunately, on the 286, code segments
invariably include segment:offset addresses for jump and call instructions
in-line in the code. It's also quite common to have sequences like this:

	mov ax,seg something	; get segment number of something
	mov es,ax	; put it in a segment register
	mov dx,es:something ; get the thing out of its segment

The first mov instruction also has a segment number in-line in the code. The
practical effect is to require that shared libraries be bound to fixed
addresses when they are loaded, and that they be bound to the same segment
numbers in each process in which they are used.
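
To make the point concrete, here is a rough C sketch of what a direct
intersegment call looks like as bytes in the code segment: opcode 9A, then
the 16-bit offset, then the 16-bit segment number, low byte first. The
function names and the example segment number are made up for illustration;
only the byte layout comes from the instruction set.

```c
#include <assert.h>
#include <stddef.h>

enum { FAR_CALL_OPCODE = 0x9A, FAR_CALL_LEN = 5 };

/* Emit a direct intersegment CALL: opcode, 16-bit offset, 16-bit segment
   number, both little-endian. Returns the number of bytes written.
   (emit_far_call is a hypothetical helper, not part of any real tool.) */
size_t emit_far_call(unsigned char *code, unsigned seg, unsigned off)
{
    assert(seg <= 0xFFFF && off <= 0xFFFF);
    code[0] = FAR_CALL_OPCODE;
    code[1] = (unsigned char)(off & 0xFF);
    code[2] = (unsigned char)(off >> 8);
    code[3] = (unsigned char)(seg & 0xFF);
    code[4] = (unsigned char)(seg >> 8);
    return FAR_CALL_LEN;
}

/* Recover the segment number that this call will load into CS. */
unsigned far_call_segment(const unsigned char *code)
{
    assert(code[0] == FAR_CALL_OPCODE);
    return code[3] | ((unsigned)code[4] << 8);
}
```

If two processes map the same physical code bytes, both of them see whatever
far_call_segment() returns for those bytes, so that segment number has to
mean the same thing in both processes' segment tables.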

Gordon Letwin's article on OS/2 in the current Byte goes into considerable
detail describing the backflips he had to go through to make shared libraries
work, including reserving in all of the segment tables a range of segment
numbers when it loads a library, then making those segments point to the
library in the tasks that are using it, and making them invalid in all other
tasks.
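
A toy model of that bookkeeping, as I read the description, might look like
the following. This is only a sketch under my own assumptions: the table
sizes, the three slot states, and every name here are invented, and it says
nothing about how OS/2 actually lays out its descriptor tables.

```c
#include <assert.h>

enum { NTASKS = 4, NSEGS = 16 };          /* toy sizes, not OS/2's */
enum slot { SLOT_FREE, SLOT_RESERVED, SLOT_VALID };

/* One toy segment table per task; zero-initialized, so all slots start free. */
static enum slot seg_table[NTASKS][NSEGS];

/* Find a segment number that is free in EVERY task, mark it reserved
   (that is, invalid) everywhere, and return it; -1 if there is none. */
int reserve_library_segment(void)
{
    for (int s = 0; s < NSEGS; s++) {
        int free_everywhere = 1;
        for (int t = 0; t < NTASKS; t++)
            if (seg_table[t][s] != SLOT_FREE)
                free_everywhere = 0;
        if (free_everywhere) {
            for (int t = 0; t < NTASKS; t++)
                seg_table[t][s] = SLOT_RESERVED;
            return s;
        }
    }
    return -1;
}

/* Make the reserved segment number point at the library in one task. */
void attach_library(int task, int seg)
{
    assert(seg_table[task][seg] == SLOT_RESERVED);
    seg_table[task][seg] = SLOT_VALID;
}

/* A task may use the segment only if its own table marks it valid. */
int can_use(int task, int seg)
{
    return seg_table[task][seg] == SLOT_VALID;
}
```

The cost shows up in reserve_library_segment(): every library segment burns
a slot in every task's table, whether or not that task ever uses the library.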

The issue of binding shared libraries to address spaces is not exactly a new
one. TSS/360 did a reasonable job of it in 1969, on a non-segmented
architecture, taking the approach that code segments had to be 100% pure and
contain no relocatable addresses at all, and that at each procedure call the
caller passed to the callee the address of the callee's data segment. Each
routine kept the addresses of all of its callees' data segments in its own data
segment, and there was some hack to pass the address of the main routine's
data in the initial call. The 360 has no direct addressing, so almost all data
addressing is done based on a pointer either loaded from memory or passed in
somehow as a parameter; the extra effort to do stuff the TSS way was very low.
(TSS had other problems, but shared libraries weren't one of them.)
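
The convention is easy to mimic in C, which may make it clearer. This is an
analogy of my own, not TSS code: the structure and routine names are invented,
and a C pointer stands in for the data segment address. The routines hold no
static data and no embedded data addresses; each reaches its data only
through the pointer its caller passes in.

```c
#include <assert.h>

/* Per-process data for a hypothetical library routine "counter". */
struct counter_data { int count; };

/* Per-process data for a hypothetical caller. It keeps the address of
   its callee's data, as described above. */
struct caller_data {
    struct counter_data *counter;   /* callee's data, for this process */
    int calls_made;
};

/* Pure code: everything it touches is reached through the pointer
   that the caller hands in. */
int counter_bump(struct counter_data *self)
{
    return ++self->count;
}

/* Also pure: finds the callee's data through its own data pointer. */
int caller_run(struct caller_data *self)
{
    self->calls_made++;
    return counter_bump(self->counter);
}
```

Two "processes" can then run the same code against separate data, simply by
starting from different caller_data structures, and nothing in the code has
to be relocated.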

I suppose they could have enforced a rule like this in OS/2, since all of the
OS/2 code is new or at least recompiled. But it would be a horrible hack.
Where would you pass the segment number -- as an extra argument on the stack,
in the DS or ES, or somewhere else? If it's an extra argument, it creates
considerable excitement for the many programmers who use slightly non-standard
calling sequences. If in a segment register, there's a serious performance hit
because reloading a segment register is very slow, even if the new value is
the same as the old.

The message here is that although the *86's segmentation scheme is somewhat
less awful than the bank-switching kludges used on the Z80, it doesn't solve
the problems that segmentation normally does, and so hardly deserves the
same name as the addressing scheme in Multics or the B5000.

(end of diatribe)
-- 
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
Rome fell, Babylon fell, Scarsdale will have its turn.  -G. B. Shaw