Path: utzoo!utgpu!water!watmath!clyde!att!ihnp4!poseidon!psrc From: psrc@poseidon.UUCP (Paul S. R. Chisholm) Newsgroups: comp.lang.c Subject: Re: Array indexing vs. pointers... Summary: avoiding multiplication Message-ID: <517@poseidon.UUCP> Date: 27 Sep 88 03:39:29 GMT References: <8809191521.AA17824@ucbvax.Berkeley.EDU> <68995@sun.uucp> <934@l.cc.purdue.edu> Organization: AT&T Bell Laboratories Lines: 73 In article <8809191521.AA17824@ucbvax.Berkeley.EDU> U23405@UICVM (Michael J. Steiner) writes: >First, I have a question. How and why is array indexing slower than >pointer arithmetic? They are very similar. Also, I think that compilers >should automatically translate array indexing into pointer arithmetic when >dealing with arrays in C. Any comments, opinions? I waited a long time to respond to this one, because I expected everyone to post the obvious answer. Yes, a[i] is equivalent to *(a+i), *in* *C*; but what about in machine language? Y'know, the stuff the CPU really eats? "The definition of 'adding 1 to a pointer,', and by extension, all pointer arithmetic, is that the increment is scaled by the size in storage of the object is pointed to." [K&R 1st ed, p. 94] This implies multiplying the pointer offset or array index by the size. Granted, the size is known, is one for char's, and is often a power of two; but what about structures that are seventeen bytes long? So, let's consider the (*un*optimized) translation of the following loops into a pseudo-C assembler language, where pointers arithmetic is scaled by the smallest addressable cell (probably a byte): C: for ( i = 0; i < N; ++i ) a[ i ] = b[ i ]; /* a and b are of type t */ /* or equivalently, *(a+i) = *(b+i) machine language: i = 0 begin_for: if ( ! ( i < N ) ) goto end_for tmp = i * sizeof t *( a + tmp ) = *( b + tmp ) ; probably a single instruction i += 1 goto begin_for end_for: TOTAL: N+1 integer assignments, N increments, N copies; N multiples! C: for ( p = a, q = b, i = 0; i < N; ++i ) *p++ = *q++; machine language: p = a q = b i = 0 begin_for: if ( ! ( i < N ) ) goto end_for *p = *q ; on a PDP-11, these next three lines p += sizeof t ; can be replaced by a single instruction q += sizeof t ; ditto VAX, 680x0, etc. i += 1 goto begin_for end_for: TOTAL: 3 integer assignments, 3*N increments, N copies; 0 multiplies! We've traded N multiplies and N assignments for 2*N increments, which can easily (on a PDP-11 or similar CPU) be snuck into auto-increment modes of the copy instruction. Yes, an optimizer would reduce the first to the second, or better. But one of Ritchie's goals was to make it easy for the compiler generate good low-level code from clever high-level code. Moral: A really good C programmer should know assembler language and compiler technology to write ordinary applications efficiently. Homework: Discover what kind of code your compiler generates for *p++ = *q++ when pointing to a char, int, or odd length structure. If p and q are auto, not register, is *p++ = *q++ the same as: *(++p - 1) = *(++q - 1) /* C, not assembler */ (that is to say, very ugly code), or possibly equivalent to: tmp = *p = *q, ++p, ++q, tmp Paul S. R. Chisholm, psrc@poseidon.att.com (formerly psc@lznv.att.com) AT&T Bell Laboratories, att!poseidon!psrc, AT&T Mail !psrchisholm I'm not speaking for the company, I'm just speaking my mind.