Path: utzoo!mnetor!uunet!husc6!cmcl2!nrl-cmf!ames!oliveb!sun!gorodish!guy
From: guy@gorodish.Sun.COM (Guy Harris)
Newsgroups: comp.lang.c
Subject: Re: Help me cast this!: Ultrix 2.x bug
Message-ID: <52684@sun.uucp>
Date: 10 May 88 18:26:06 GMT
References: <294@fedeva.UUCP> <1451@iscuva.ISCS.COM> <11344@mimsy.UUCP> <392@m3.mfci.UUCP>
Sender: news@sun.uucp
Lines: 144
Keywords: pointer to array of struct

First: mail to "root" at "mfci" failed, so I'll post this; could the
person maintaining netnews at Multiflow please try to arrange that not
all messages from there have a "From:" line listing "root@mfci.UUCP" as
the poster?  The *real* poster's name appears in the "Reply-To:" line,
so the information *is* available.

Second:

> >-However, pcc compilers [without Guy Harris's fix, or equivalent]
> >-don't give a warning, and I was once told that Dennis Ritchie considers
> >-it to be perfectly legal C.
> >Told by whom?
>
> By Bjarne Stroustrup, whom I assume simply asked him. This was the
> result of a mail conversation I was having with him several years ago
> over what &a should mean when a is an array. Of course, &a is not
> legal K&R C, but Bjarne thought it should be treated just like a,
> i.e., that &a should yield a pointer to the first element of a.

This is *not* the same as the "it" referred to above, which is an
assignment of a value of type "struct outfile **" to a variable of type
"struct outfile (*)[]".  That assignment is not valid; "array of <type>"
and "pointer to <type>" are inequivalent types, and therefore "pointer
to array of <type>" and "pointer to pointer to <type>" are inequivalent
types.  I would be *EXTREMELY* surprised if Dennis Ritchie felt they
were equivalent.

If it's not clear why they must be inequivalent, here's a specific
example.  Consider that, if all pointers are represented in a particular
implementation as pointers to the first byte of an object, then "p++"
causes the address contained in "p" to be incremented by the size of
the object pointed to.
Now, the size of "struct outfile *" might, say, be 4 on a machine with
8-bit bytes and 32-bit pointers.  However, the size of
"struct outfile [23]", for example, is 23*sizeof (struct outfile), and
the size of "struct outfile []" is unknown (in effect, zero).

As such, if you have:

        struct outfile **p;
        struct outfile (*output)[];

on an implementation of the sort described above, with 8-bit bytes and
32-bit pointers, the expression "p++" will increment the address
contained in "p" by 4 bytes (as "sizeof (struct outfile *)" is 4) and
the expression "output++" will probably elicit a complaint from the
compiler (as "sizeof (struct outfile [])" is unknown).

In (old) K&R C (the new K&R presumably describes the ANSI rules), it is
considered incorrect to put "&" before an array or function.  In almost
all contexts, an expression of type "array of <type>" or "function
returning <type>" is converted to type "pointer to <type>" or "pointer
to function returning <type>".  The pointer-valued expressions in
question are not lvalues, and thus cannot be preceded with "&", just as
you can't say "&3".

Some *compilers* permit an "&" to be placed before expressions of this
type, and treat it as redundant.  Some compilers also appear to consider
"pointer to <type>" and "array of <type>" to be equivalent.
Unfortunately, this causes some invalid programs to compile without
complaint; those programs fail later.  In fact, one such program *did*
fail; somebody posted something to "comp.lang.c" about it (actually to
"net.lang.c", if I remember correctly, which indicates how long ago
this was!), which was what got me to look for and find the PCC bug in
question.

> PS Dennis claims that this is C:
>         main()
>         {
>                 int a[5][7] ;
>                 int (*p)[5][7];
>
>                 p = (int***) a; /* no & */
>                 printf("a %d p %d *p %d\n",a,p,*p); /* a == p == *p !!! */
>                 (*p)[2][4] = 123 ;
>                 printf("%d\n",a[2][4]); /* 123 */
>         }
> It works! Amazing!

"a" is of type "array of 5 arrays of 7 'int's".  "p" is of type "pointer
to array of 5 arrays of 7 'int's."
There is no way in K&R C to type-correctly assign a pointer to "a" to
"p".  The "(int ***)" cast is incorrect; "p" does *NOT* have type
"pointer to pointer to pointer to 'int'."

The fact that the values of "a", "p", and "*p" are the same should not
be surprising.  In almost all contexts, an array-valued expression is
converted to a pointer to the first element of the array.  "*p" is an
array-valued expression and gets so converted; in effect, "*p" yields
the same address as "p" in almost all contexts (though not the same
type).  The fact that the expressions "a" and "p" have the same numeric
value is a consequence of the fact that *most* C implementations
represent pointers by the address of the first addressable unit of the
object pointed to.  As such, the addresses represented by "a" and "p"
are the same.

If Dennis considers the above valid C, either by K&R rules or by ANSI C
rules, I would like to see his reasoning.  Everything *except for* the
"p = (int ***)a" is valid K&R C and valid ANSI C.  (Actually, if one
wants to be *extremely* fussy, one can complain about:

        the "printf" - there is no guarantee in K&R that *any*
        particular "printf" format specifier can be used to print any
        particular pointer, and ANSI C guarantees only that "%p" can be
        used to print "void *";

        the lack of certain #includes, such as "#include <stdio.h>";

        the lack of declaration of arguments for "main()";

but none of those are germane to this particular discussion.)

The following *would* be valid K&R C (modulo the other stuff):

        main()
        {
                int a[5][7];
                int (*p)[7];

                p = a;  /* no &, no cast */
                printf("a %d p %d *p %d\n",a,p,*p); /* a == p == *p !!! */
                p[2][4] = 123;
                printf("%d\n",a[2][4]); /* 123 */
        }

Note that "p" is of type "pointer to array of 7 'int's."  "a" is of type
"array of 5 arrays of 7 'int's."  In most contexts, the expression "a"
is converted to a pointer to the first element of "a"; this first
element is of type "array of 7 'int's," so a pointer to it is of type
"pointer to array of 7 'int's," which is the same type as "p".
The above is also valid ANSI C.

The following would be valid ANSI C (modulo the other stuff), but not
valid K&R C:

        main()
        {
                int a[5][7];
                int (*p)[5][7];

                p = &a; /* no cast */
                printf("a %d p %d *p %d\n",a,p,*p);
                        /* "a", "p", and "*p" have the same numeric value */
                        /* however, "p" and "*p" are *NOT* equivalent */
                        /* "p" is a pointer to "a", "*p" is a pointer to "a[0]" */
                (*p)[2][4] = 123;
                printf("%d\n",a[2][4]); /* 123 */
        }
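
For anybody who wants to check the "increment by the size of the object"
argument on their own implementation, here is a minimal sketch in ANSI C
(it uses "&a" and prototypes, so it is not K&R C).  The function names
and the one-member "struct outfile" are assumptions made up purely for
illustration; the real structure never appears in this thread.  It
measures how far "p + 1" moves for each pointer type, and checks that
"a", "p", and "*p" compare equal as addresses even though their types
differ.

```c
#include <stddef.h>

/* Hypothetical stand-in: the real "struct outfile" is not shown in the thread. */
struct outfile { int fd; };

/* Stride of "struct outfile **": the size of one pointer. */
size_t pointer_to_pointer_stride(void)
{
    return sizeof(struct outfile *);
}

/* Stride of "struct outfile (*)[23]": the size of the whole 23-element array. */
size_t pointer_to_array23_stride(void)
{
    return sizeof(struct outfile [23]);
}

/* Bytes between p and p + 1 when p is "pointer to array of 7 ints". */
size_t row_pointer_stride(void)
{
    int a[5][7];
    int (*p)[7] = a;        /* "a" converts to a pointer to its first row */
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Bytes between p and p + 1 when p is "pointer to array of 5 arrays of 7 ints". */
size_t whole_array_pointer_stride(void)
{
    int a[5][7];
    int (*p)[5][7] = &a;    /* ANSI C only: "&a" points to the whole array */
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Returns 1 if "a", "p", and "*p" all designate the same address. */
int same_address(void)
{
    int a[5][7];
    int (*p)[5][7] = &a;
    return (void *)a == (void *)p && (void *)p == (void *)*p;
}

/* Stores through the whole-array pointer, then reads back through "a". */
int store_through_pointer(void)
{
    int a[5][7] = {{0}};
    int (*p)[5][7] = &a;
    (*p)[2][4] = 123;
    return a[2][4];
}
```

On an implementation with 4-byte 'int's, the two array-pointer strides
would come out as 28 and 140 bytes respectively, which is exactly why
"pointer to array of 7 'int's" and "pointer to array of 5 arrays of 7
'int's" cannot be treated as interchangeable, let alone either of them
and "int ***".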