Path: utzoo!attcan!uunet!husc6!purdue!umd5!mimsy!chris
From: chris@mimsy.UUCP (Chris Torek)
Newsgroups: comp.lang.fortran
Subject: Re: Should I convert FORTRAN code to C?
Message-ID: <12302@mimsy.UUCP>
Date: 4 Jul 88 10:04:03 GMT
References: <2742@utastro.UUCP> <20008@beta.UUCP> <224@raunvis.UUCP> <20531@beta.lanl.gov>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 164

>>In article <20518@beta.lanl.gov> jlg@beta.lanl.gov (Jim Giles) claimed that
>>>... The C gurus all have been saying that a static multidimensional
>>>array is implemented as pointer to array, but that a dynamic
>>>multidimensional array is implemented as pointer to pointer
>>>(to pointer ... etc for the number of dimensions).

>In article <12292@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>Only the false gurus.  :-)

In article <20531@beta.lanl.gov> jlg@beta.lanl.gov (Jim Giles) writes:
>I would like to see dynamically allocated multi-dimensional arrays in
>C that aren't declared as:
>    int **a ...

Okay:

	int n, m, *a, i, j;

	/* make a an int a[n][m] */
	a = (int *)zalloc(n * m, sizeof(int));
	/* zalloc is like calloc but never fails (aborts on error) */

	/* work with it */
	for (j = 0; j < n; j++)
		for (i = 0; i < m; i++)
			... ref a[j * m + i] ...
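
Spelled out as something that compiles, the flat version might look
like the sketch below.  The zalloc here is only a stand-in written
for this example (calloc plus abort), and fill_and_sum is a made-up
name; neither comes from any library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for zalloc: like calloc, but aborts
 * instead of returning NULL. */
static void *zalloc(size_t count, size_t size)
{
	void *p = calloc(count, size);

	if (p == NULL) {
		fprintf(stderr, "zalloc: out of memory\n");
		abort();
	}
	return p;
}

/* Build a flat n-by-m array in which element (j, i) holds 10*j + i,
 * then return the sum of all its elements. */
static long fill_and_sum(int n, int m)
{
	int *a, i, j;
	long sum = 0;

	assert(n > 0 && m > 0);
	a = (int *)zalloc((size_t)n * m, sizeof(int));
	for (j = 0; j < n; j++)
		for (i = 0; i < m; i++)
			a[j * m + i] = 10 * j + i;	/* row j, column i */
	for (j = 0; j < n * m; j++)
		sum += a[j];	/* the storage is one contiguous run */
	free(a);
	return sum;
}
```

Note that the row stride is m (the number of columns), not n.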

The other reason your claim in >>> above is wrong is that statically
allocated 2-D arrays can be done with row vectors:

	/* these must be global */
	int a0[M], a1[M], a2[M], a3[M], ..., a99[M]; /* if N=100 */
	int *a[N] = { a0, a1, a2, a3, ..., a99 };

>But all implementations (both here on the net and in journals) have done
>this.

Because it is convenient.  The version above with j*m + i is not.
(This is the key point; more later.)

>The result is that a reference to such an array really IS turned
>into pointer-to-pointer type constructs.  If you have an alternate
>implementation, please send it to me and I'll shut-up.

Someone already did.  One can combine the row-vector and the flat
array:

	int **a, i;
	a = (int **)zalloc(N, sizeof(int *));
	*a = (int *)zalloc(N * M, sizeof(int));
	/* a[0] is already set; point each later row into the flat block */
	for (i = 1; i < N; i++)
		a[i] = *a + i * M;

	/*
	 * now may work with either a[j][i] (row vector)
	 * or a[0][j * M + i] (flat).
	 */
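
A self-contained sketch of that combined allocator follows, with a
check that both access styles name the same element.  The names
alloc2d, free2d and views_agree are invented for this example, and
plain calloc plus abort stands in for zalloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an n-by-m int array as one flat block plus a vector of
 * row pointers into it.  Aborts on allocation failure. */
static int **alloc2d(int n, int m)
{
	int **a, j;

	assert(n > 0 && m > 0);
	a = (int **)calloc((size_t)n, sizeof(int *));
	if (a == NULL)
		abort();
	a[0] = (int *)calloc((size_t)n * m, sizeof(int));
	if (a[0] == NULL)
		abort();
	for (j = 1; j < n; j++)
		a[j] = a[0] + j * m;	/* row j starts j*m ints in */
	return a;
}

static void free2d(int **a)
{
	free(a[0]);	/* the flat block */
	free(a);	/* the row vector */
}

/* Return 1 if a[j][i] and a[0][j*m + i] are the same object for
 * every (j, i), else 0. */
static int views_agree(int n, int m)
{
	int **a = alloc2d(n, m);
	int i, j, ok = 1;

	for (j = 0; j < n; j++)
		for (i = 0; i < m; i++) {
			a[j][i] = j * m + i;	/* row-vector store */
			if (&a[j][i] != &a[0][j * m + i] ||
			    a[0][j * m + i] != j * m + i)
				ok = 0;	/* flat read disagreed */
		}
	free2d(a);
	return ok;
}
```

Only two allocations are made no matter how many rows there are, so
freeing the array takes exactly two calls to free.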

>>Ideally, one should be able to say
>>
>>	void foo(int m, int n, int arr[n][m]) {
>>		... /* ref arr[j][i] */ ...
>>	}
>>	/* called as foo(M, N, arr) */

>I agree.  The point is that you can't.

Right.  This makes working with flat arrays of more than one dimension
inconvenient (NOT impossible!).

>Furthermore, 'foo' has to be implemented twice if the expected
>argument (arr) could be either statically or dynamically allocated.

Wrong.  As long as the static and dynamic versions agree---both
must either be flat, or be row-vector---the same code can be used
for both.  If the dynamic array is both flat and row-vector, then
it does not matter which the static version is.  Incidentally, I
can make the static version also be both flat and row-vector:

	/* these must again be global */
	int a00[M * N];
	int *a[N] = { a00 + 0, a00 + M, a00 + 2*M, ..., a00 + 99*M };
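
With N cut down to 3 (an assumption made only so that the
initializer can be written out in full), the static
flat-and-row-vector form can be checked like this; static_mismatches
is a name invented for the example.

```c
#include <assert.h>

#define N 3	/* kept small so the initializer fits on one line */
#define M 4

/* these must again be global */
static int a00[M * N];
static int *a[N] = { a00 + 0, a00 + M, a00 + 2*M };

/* Store through the row vector, read back through the flat array;
 * return the number of mismatches (0 when the two views agree). */
static int static_mismatches(void)
{
	int i, j, bad = 0;

	/* the last row pointer must have been given an initializer */
	assert(a[N - 1] == a00 + (N - 1) * M);
	for (j = 0; j < N; j++)
		for (i = 0; i < M; i++)
			a[j][i] = 100 * j + i;
	for (j = 0; j < N; j++)
		for (i = 0; i < M; i++)
			if (a00[j * M + i] != 100 * j + i)
				bad++;
	return bad;
}
```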

>In the latter case, the declaration has to be: int **arr.

In the row-vector case.  If we use the flat-and-row-vector dynamic
implementation, the call must read

	foo(m, n, &arr[0][0])

and the declaration must read

	void foo(int m, int n, int *arr) {
		int j, i;
		for (j = 0; j < n; j++)
			for (i = 0; i < m; i++)
				... work with arr[j*m + i] ...
	}

If we use the flat static implementation, the call and the declaration
remain identical.
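
To make the "one foo for both" point concrete, here is a sketch in
which the same function is handed a flat static array and a flat
dynamic one.  The body chosen for foo (summing the elements) and the
helpers fill, sum_static and sum_dynamic are made up for this
example; each row is stepped over at stride m.

```c
#include <assert.h>
#include <stdlib.h>

#define N 3
#define M 4

static int sarr[N * M];	/* flat static array: N rows of M ints */

/* One foo serves both cases: n rows of m ints each. */
static long foo(int m, int n, int *arr)
{
	int j, i;
	long sum = 0;

	for (j = 0; j < n; j++)
		for (i = 0; i < m; i++)
			sum += arr[j * m + i];
	return sum;
}

/* Set element (j, i) to j*m + i. */
static void fill(int m, int n, int *arr)
{
	int j, i;

	for (j = 0; j < n; j++)
		for (i = 0; i < m; i++)
			arr[j * m + i] = j * m + i;
}

static long sum_static(void)
{
	fill(M, N, sarr);
	return foo(M, N, sarr);
}

static long sum_dynamic(void)
{
	int *darr = (int *)calloc((size_t)N * M, sizeof(int));
	long sum;

	if (darr == NULL)
		abort();
	fill(M, N, darr);
	sum = foo(M, N, darr);
	free(darr);
	return sum;
}
```

With the flat-and-row-vector allocation, the argument would instead
be written &arr[0][0], which is the same pointer to the flat block.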

>>>... The Fortran generally runs faster because its
>>>conception of array is better suited to these types of optimization.

>>which is correct, but not because of the form of subscripting, but
>>rather because of an implicit rule against aliasing ...

>Which is what I pointed out when I said that C needs the 'noalias'
>directive.

I think that the noalias directive damages C far more than it improves it,
just as I think that rule in Fortran damages Fortran.

>The Fortran rule against aliasing is not implicit either.

I meant only that it is not explicit in the code.

>The standard specifically prohibits it.  Due to separate compilation
>constraints, the compiler can't check the problem.

... in the general case, yes.  In almost all cases the compiler
(`linker', if you prefer) can, at least if the Fortran rule against
aliasing matches what I think was intended by X3J11's `noalias'.  If
the rule is stronger than that (I hope not), the `linker' (which should
certainly do code generation, if you want any sort of decent inline
expansion) can detect all illegal aliasing (again, I hope the rule is
not *that* strong!---it would prohibit all sorts of useful tricks that
never cause trouble on real hardware).

>By the way, the Fortran 8x version of dynamic memory allocation doesn't
>have the pointer-to-pointer problem.  When you declare an allocatable
>array, its dimensionality is part of the declaration.

Likewise for gcc:

	void f(int n, int m) {
		int a[n][m];
		/* or, if you prefer to use malloc:
		    int (*a)[m] = (int (*)[m])zalloc(n * m, sizeof(int)); */
		...
	}

Again, this is convenient.  Unfortunately, it is also not standard, so
it cannot be used in general.  If one wants a flat array (because the
compiler is unable to vectorise the row-vector version, perhaps), one
must use the inconvenient access version, a[j*m + i] (with either the
flat allocation or the combined flat-plus-row-vector allocation).  When
something is convenient, it gets used; when it is inconvenient enough,
it does not get used (for instance, I never use Fortran [*]).  This
gives a slight advantage to the flat-plus-row-vector version:  when
speed is unimportant (or when your compiler is smart enough), use the
convenient row-vector access a[j][i]; when you have to play tricks to
convince the compiler to vectorise, use the inconvenient flat access
a[0][j*m + i].  The problems with the f-p-r-v version are that the
allocator is more complex (minor: it is only written once) and that
using it in two different ways is confusing (major: this must constantly
be documented).
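
For comparison, here is a sketch of the heap-allocated form of the
gcc version, assuming a compiler that accepts variable-sized array
types the way gcc does.  The name vla_demo is invented, and calloc
plus abort again stands in for zalloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an n-by-m array through a pointer to a row of m ints,
 * fill it using a[j][i], and sum it through a flat int pointer. */
static long vla_demo(int n, int m)
{
	int (*a)[m];	/* pointer to a row of m ints; m known at run time */
	int *flat;
	int i, j;
	long sum = 0;

	assert(n > 0 && m > 0);
	a = (int (*)[m])calloc((size_t)n * m, sizeof(int));
	if (a == NULL)
		abort();
	for (j = 0; j < n; j++)
		for (i = 0; i < m; i++)
			a[j][i] = j * m + i;	/* convenient access */
	flat = (int *)a;	/* same storage, viewed flat */
	for (j = 0; j < n * m; j++)
		sum += flat[j];
	free(a);
	return sum;
}
```

Here there is a single allocation and no row vector at all; the
compiler does the j*m + i arithmetic itself.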

-----
[*] Bet you took this one wrong!  Instead, when I must, I use Ratfor. :-)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris