Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/18/84; site rlgvax.UUCP
Path: utzoo!watmath!clyde!bonnie!akgua!sdcsvax!dcdwest!ittvax!decvax!genrad!wjh12!harvard!seismo!rlgvax!guy
From: guy@rlgvax.UUCP (Guy Harris)
Newsgroups: net.lang.c
Subject: Re: offsets in structures.
Message-ID: <209@rlgvax.UUCP>
Date: Thu, 18-Oct-84 19:53:45 EDT
Article-I.D.: rlgvax.209
Posted: Thu Oct 18 19:53:45 1984
Date-Received: Sun, 21-Oct-84 14:22:08 EDT
References: <393@orion.UUCP> <5172@brl-tgr.ARPA>, <6542@mordor.UUCP> <5272@brl-tgr.ARPA>, <196@rlgvax.UUCP> <5319@brl-tgr.ARPA> <204@rlgvax.18 Oct 84 23:53:45 GMT
Organization: CCI Office Systems Group, Reston, VA
Lines: 120

Arithmetic expressions that produce pointers:

1) Purely integer expressions:

As discussed in my previous article, K&R indicates that no such expression, except a constant 0, is to be interpreted as a null pointer. The phrase "the constant 0" appears in several places (in the discussion of the conditional operator, as well as the places mentioned in my previous article); I do not think that the modifier "the constant" appears by accident. I believe it was explicitly put there to indicate that an arbitrary integral result of zero need not be converted into a null pointer; only an explicit zero constant need be so converted. If somebody has a statement to the contrary, from either K or R, they should exhibit it.

2) Pointer plus or minus an integer expression:

The actual phrase in "7.4 Additive operators" reads

    A pointer *to an object in an array* and a value of any integral type may be added. The latter is in all cases converted to an address offset by multiplying it by the length of the object to which the pointer points. The result is a pointer of the same type as the original pointer, and which points to another object in the same array, appropriately offset from the original object.

A null pointer does not point to any object in an array.
If you add an integer to a pointer, by the paragraph above the resulting pointer points to an object in an array. Therefore, it is not a null pointer.

I am quite aware that if you have a pointer to an element in a character array on a PDP-11, and the element has the address 0177777, adding one to that pointer yields the result 0. This is not an argument that you can produce a null pointer by an arithmetic expression. First of all, arrays move forward in memory, so there *is* no next element in that array, as the element in question is at the end of your address space. Second of all, if you have a machine on which a null pointer does not have the value zero, and you add 1 to a pointer whose value is such that adding 1 to it will cause wrap-around, you have still not produced a null pointer. You may have produced a pointer that doesn't point where it "should", and which may even point to a non-existent part of the address space, but that does not mean it must be a null pointer.

3) Other expressions:

Under 14.4, "Explicit pointer conversions", it says

    Certain conversions involving pointers are permitted *but have implementation-dependent aspects*....

    ...An object of integral type may be explicitly converted to a pointer. The mapping always carries an integer converted from a pointer back to the same pointer, but is otherwise machine dependent.

This implies that if you convert a null pointer to an integer, the integer that results must convert back into a null pointer. The most natural and "unsurprising" conversion (see the previous paragraph in section 14.4 on conversions from pointer to integer) is just a bitwise copy. If converting a null pointer produces an integer with the value 0xff000000, so be it. If that's how a null pointer is represented internally, I'd find conversion of a null pointer into a zero integer more surprising than conversion of it into 0xff000000.
Given that, converting an integer back into a pointer by a bitwise copy would be the natural way to do it; this would convert an integer value of 0, other than a constant 0 (which is *not* an integer converted from a pointer), into a pointer with the value 0, not a null pointer, and would convert an integer with the value 0xff000000 into a null pointer.

Yes, this implies that it's a pain to produce a pointer which points to location 0. It even implies that producing a pointer which points to location 0 can't be done the same way you produce a pointer which points to location 1; you'd have to say

    something *p;
    int i;

    p = (i - i);

Worse things have happened. It may be a pain to produce such a pointer, but it's not impossible, and it's not *that* common an operation.

So what sort of arithmetic expressions are left?

I do rescind my earlier statement that 16-bit "int"s and 32-bit pointers are illegal. The statement that "(the integer-to-pointer mapping) always carries an integer converted from a pointer back into the same pointer" does not imply that an "int" must be big enough to hold a pointer. It merely implies that there must be an *integral type* big enough to hold a pointer; "int" is not the largest integral type, just the most "natural" type. "Natural" is not a precise specification; it implies that the choice of size of "int" is machine dependent. Of course, what is most "natural" given the data path width of the machine isn't necessarily the most "natural" given the size of objects you can put on the machine; try using "malloc" and "realloc" to grow a symbol table past 64K on a machine with 16-bit "int"s but 32-bit pointers. (It can't be done in a straightforward fashion. Believe me. We have such a machine, and we've *tried*. The standard UNIX "nm" uses that technique, and if your symbol table is bigger than 64K bytes, you lose.)
So if you have 16-bit "int"s, you can't convert the pointer with the bit pattern 0x801234 into an "int" and back and get the same value back, but you can convert it to the integral type "long" and back; as it says in section 14.4, paragraph 2,

    A pointer may be converted to any of the integral types *large enough to hold it. Whether an "int" or "long" is required is machine dependent.* (italics mine)

However, K&R does state specifically that the difference between two pointers is an "int", not just an integral value. (We don't do that. *Nostra culpa* - not "*mea culpa*"; it wasn't my idea. Our newer systems will bite the bullet and have 32-bit "int"s, mainly for compatibility with our 32-bit supermini, but also because they're 4.2BSD-based, and there are probably several *months* of work changing 4.2BSD to use "long" instead of "int" when it means "32-bit quantity". I assume the AT&T 68000 C compiler gets this right, when built for 16-bit "int"s.)

    Guy Harris
    {seismo,ihnp4,allegra}!rlgvax!guy