Path: utzoo!attcan!uunet!lll-winken!lll-tis!ames!mailrus!husc6!necntc!ima!johnl
From: johnl@ima.ISC.COM (John R. Levine)
Newsgroups: comp.arch
Subject: Re: Is the Intel memory model safe from NO-ONE ?!?
Summary: of course not, and for good reasons, too
Keywords: intel 286 segmentation
Message-ID: <1018@ima.ISC.COM>
Date: 12 May 88 17:12:56 GMT
References: <3384@drivax.UUCP> <4053@killer.UUCP>
Reply-To: johnl@ima.UUCP (John R. Levine)
Organization: Not much
Lines: 59

In article <4053@killer.UUCP> elg@killer.UUCP (Eric Green) writes:
>There's one case where segmentation (of code) is a Big Win (segmentation of
>data space is almost never a win): Shared libraries. ...
>
>On a segmented machine, all shared libraries could start at address 0, in
>different segments in different processes. It is a much more elegant solution
>than chopping your address space in such a manner that you cannot have two
>windowing libraries loaded at the same time.

You'd think so, wouldn't you? Unfortunately, on the 286, code segments
invariably include segment:offset addresses for far jump and call
instructions in-line in the code. It's also quite common to have sequences
like this:

	mov	ax,seg something	; get segment number of something
	mov	es,ax			; put it in a segment register
	mov	dx,es:something		; get the thing out of its segment

The first mov instruction also has a segment number in-line in the code.
The practical effect is to require that shared libraries be bound to fixed
addresses when they are loaded, and that they be bound to the same segment
numbers in each process in which they are used. Gordon Letwin's article on
OS/2 in the current Byte goes into considerable detail describing the
backflips he had to go through to make shared libraries work, including
reserving in all of the segment tables a range of segment numbers when it
loads a library, then making those segments point to the library in the
tasks that are using it, and making them invalid in all other tasks.
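
To make the consequence concrete, here is a toy model in C of the scheme just described. It is only a sketch under assumptions of mine, not OS/2's actual data structures: the names (struct process, load_library, lib_read), the table size, and the selector value 5 are all hypothetical. The point it illustrates is that a selector baked into shared code forces the loader to reserve that same slot in every process's segment table.

```c
#include <assert.h>
#include <stddef.h>

#define NSEGS        8   /* size of each toy per-process segment table */
#define LIB_SELECTOR 5   /* hypothetical selector embedded in the library code */

/* One entry in a toy per-process segment table. */
struct segment {
    int valid;           /* 0 = selector is not mapped in this process */
    const int *base;     /* where the segment's contents live */
};

struct process {
    const char *name;
    struct segment table[NSEGS];
};

/* The shared library's data segment; one copy for all processes. */
static const int lib_data[] = { 42 };

/*
 * Stand-in for library code containing "mov ax,seg something".
 * LIB_SELECTOR is a constant in the code, so every process that runs
 * this code uses the same slot number. Returns 0 on success, or -1 to
 * model the fault taken when the selector is invalid in that process.
 */
static int lib_read(const struct process *p, int *out)
{
    const struct segment *s = &p->table[LIB_SELECTOR];
    if (!s->valid)
        return -1;
    *out = s->base[0];
    return 0;
}

/*
 * Toy loader: the slot is reserved in every process, mapped to the
 * library where uses[i] is nonzero and left invalid everywhere else.
 */
static void load_library(struct process *procs, size_t n, const int *uses)
{
    assert(procs != NULL && uses != NULL);
    for (size_t i = 0; i < n; i++) {
        procs[i].table[LIB_SELECTOR].valid = uses[i] != 0;
        procs[i].table[LIB_SELECTOR].base = uses[i] ? lib_data : NULL;
    }
}
```

With two processes and uses = { 1, 0 }, lib_read succeeds and yields 42 in the first process and returns -1 in the second. Note that the second process still loses slot 5 even though it never touches the library, which is the cost of having the selector in-line in the code.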

The issue of binding shared libraries to address spaces is not exactly a
new one. TSS/360 did a reasonable job of it in 1969, on a non-segmented
architecture, taking the approach that code segments had to be 100% pure
and contain no relocatable addresses at all, and that at each procedure
call the caller passed to the callee the address of the callee's data
segment. Each routine kept the addresses of all of its callees' data
segments in its own data segment, and there was some hack to pass the
address of the main routine's data in the initial call. The 360 has no
direct addressing, so almost all data addressing is done based on a pointer
either loaded from memory or passed in somehow as a parameter; the extra
effort to do stuff the TSS way was very low. (TSS had other problems, but
shared libraries weren't one of them.)

I suppose they could have enforced a rule like this in OS/2, since all of
the OS/2 code is new or at least recompiled. But it would be a horrible
hack. Where would you pass the segment number -- as an extra argument on
the stack, in DS or ES, or somewhere else? If it's an extra argument, it
creates considerable excitement for the many programmers who use slightly
non-standard calling sequences. If in a segment register, there's a
serious performance hit because reloading a segment register is very slow,
even if the new value is the same as the old.

The message here is that although the *86's segmentation scheme is somewhat
less awful than the bank-switching kludges used on the Z80, it doesn't
solve the problems that segmentation normally does, and so hardly deserves
the same name as the addressing scheme in Multics or the B5000.

(end of diatribe)
-- 
John R. Levine, IECC, PO Box 349, Cambridge MA 02238-0349, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
Rome fell, Babylon fell, Scarsdale will have its turn. -G. B. Shaw