Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!wuarchive!brutus.cs.uiuc.edu!ginosko!uunet!mstan!amull
From: amull@Morgan.COM (Andrew P. Mullhaupt)
Newsgroups: comp.sw.components
Subject: Re: Garbage Collection & ADTs
Summary: Irreconcilable differences?
Message-ID: <393@e-street.Morgan.COM>
Date: 26 Sep 89 00:10:37 GMT
References: <900@scaup.cl.cam.ac.uk> <6530@hubcap.clemson.edu> <62342@tut.cis.ohio-state.edu>
Organization: Morgan Stanley & Co. NY, NY
Lines: 83


	The debate raging over garbage collection in this newsgroup is an
interesting contrast to the "effect of free()" thread in comp.lang.c; there
are many there who argue that they need to examine pointers after de-allocating
them(!). One argument advanced is that even though it's a bad thing, since it's
usually OK, it should become standard practice. This is to my mind in character
for C programmers. In this newsgroup, we find other programmers arguing that
programmers should have no direct use for pointers, and so the language can
determine the correct lifetime of dynamic data.

	I suggest it is going to prove necessary for the languages which have
this "Pascalian" point of view to close all loopholes (typecasts to integer,
etc.) to prevent the widespread use of C style perfidious trickery to escape
from sensible programming practice. (If you give them an inch...)

	I would also point out that, despite the claims of its adherents, APL
(A Pointerless Language), although it has no explicit pointers, does not do a
great job with storage management. Often space is allocated, then de-allocated
as soon as possible, only to be re-allocated for the same object, somewhere
else in the workspace, with the attendant risks of fragmentation and garbage
collection. The programmer is often forced to write peculiar code (even for APL)
in order to try to provide memory management cues to the interpreter. There
is often no relief for the problem when storage efficiency requires re-use of
a local variable for more than one purpose. At least they have the cute idea
of always garbage collecting immediately after printing the last response to
the user. (This has the effect of making garbage collection less noticeable
when working in the typical software engineer's paradise of a tiny test problem,
and excruciatingly painful when you get a trivial syntax error in a workspace
where you just started a huge operation.) The point is that it makes a lot of
difference when garbage collection is performed, and whether the programmer can
control it.
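
	A minimal sketch of what "the programmer can control it" might look
like, in Python rather than APL, since CPython's standard gc module exposes
this kind of switch. The helper names run_without_collection and make_cycles
are hypothetical. Note the assumption: in CPython this switch governs only the
cyclic collector; reference counting still frees most objects immediately.

```python
import gc


def run_without_collection(work):
    """Run work() with automatic cyclic collection off, then collect on demand.

    Hypothetical helper: the collection happens at a moment the caller chose,
    not in the middle of the work.
    """
    was_enabled = gc.isenabled()
    gc.disable()                  # no automatic collection while work() runs
    try:
        result = work()
    finally:
        if was_enabled:
            gc.enable()           # restore the previous setting
    unreachable = gc.collect()    # pay for the collection now, deliberately
    return result, unreachable


def make_cycles(n=1000):
    """Create n self-referential lists, which only the cyclic collector reclaims."""
    for _ in range(n):
        a = []
        a.append(a)
    return n


if __name__ == "__main__":
    result, unreachable = run_without_collection(make_cycles)
    print(result, "cycles made;", unreachable, "unreachable objects found")
```

The cost of collection is not avoided here, only moved to a point where it
does not interrupt the work.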

	Another APL story: When calling FORTRAN from APL, you must allocate
the storage for the FORTRAN subprogram in the APL workspace. There are many
important FORTRAN subprograms which involve two calls sharing data in an
auxiliary array residing in the APL workspace (i.e. at the mercy of the APL
garbage collector). It can happen that APL decides it must perform garbage
collection between the two calls to FORTRAN and the auxiliary array is moved
to a new home. As can be expected in cases like this, the manual guarantees
unpredictable results. The classic example of this is a Fast Fourier Transform
which sets up all the trig calls in one call and performs the transform in a
second. Repeated transforms can then be done by repeating the second call only,
if you're calling from FORTRAN (or C, or Pascal, etc.), but this is too daring
from APL. And since there is no way to force APL to leave your array alone, the
only hope, and common practice, is to ensure that the garbage collection is
performed before the first call to FORTRAN, and to look at the size of the
available workspace to make sure the arrays to be transformed are not large
enough to trigger the calamity again. This means that you always suffer the
performance hit of garbage collection, and you always have to essentially manage
your own storage. If writing an FFT in APL wasn't so slow, and the highly
vectorized FORTRAN FFT so fast, and the data arrays so large as to endanger
garbage collection, this wouldn't be much of a payoff. But they are.
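
	The failure and the workaround described above can be imitated with a
toy model. This is a sketch under stated assumptions, not how any APL
interpreter or FORTRAN library actually works: Workspace, fft_setup, fft_apply
and transform_twice are all hypothetical names, the "workspace" is a flat
Python list compacted by sliding live arrays down, and the "FFT" only fills
and re-reads a cosine table through a raw offset.

```python
import math


class Workspace:
    """Toy relocating workspace: compaction changes the offsets of live arrays."""

    def __init__(self, size):
        self.cells = [0.0] * size
        self.table = {}           # name -> (offset, length)
        self.top = 0              # next free cell

    def free_space(self):
        """Cells available without triggering a collection."""
        return len(self.cells) - self.top

    def allocate(self, name, values):
        n = len(values)
        if n > self.free_space():
            self.collect()        # the collector runs when *it* decides to
        if n > self.free_space():
            raise MemoryError("workspace full")
        self.cells[self.top:self.top + n] = values
        self.table[name] = (self.top, n)
        self.top += n
        return self.table[name][0]

    def erase(self, name):
        del self.table[name]      # leaves a hole until the next collection

    def collect(self):
        """Slide live arrays toward offset 0, lowest offset first."""
        new_top = 0
        for name, (off, n) in sorted(self.table.items(), key=lambda kv: kv[1][0]):
            self.cells[new_top:new_top + n] = self.cells[off:off + n]
            self.table[name] = (new_top, n)
            new_top += n
        self.top = new_top


def fft_setup(ws, offset, n):
    """Stand-in for the first foreign call: fill a trig table at a raw offset."""
    for k in range(n):
        ws.cells[offset + k] = math.cos(2 * math.pi * k / n)


def fft_apply(ws, offset, n):
    """Stand-in for the second foreign call: read the table at the same raw offset."""
    return ws.cells[offset:offset + n]


def transform_twice(collect_first):
    """Return True if the second call still sees the table the first call built."""
    ws = Workspace(16)
    ws.allocate("junk", [9.0] * 8)
    ws.erase("junk")              # garbage waiting to be compacted
    if collect_first:
        ws.collect()              # workaround: collect before the first call
    n = 4
    offset = ws.allocate("trig", [0.0] * n)
    fft_setup(ws, offset, n)
    expected = fft_apply(ws, offset, n)
    data = [1.0] * 8
    if collect_first and len(data) > ws.free_space():
        raise RuntimeError("data too large: allocation would trigger collection")
    ws.allocate("data", data)     # may compact and move "trig"
    return fft_apply(ws, offset, n) == expected
```

In this model transform_twice(False) returns False, because the stale offset
now points at the data array, while transform_twice(True) returns True at the
price of an unconditional collection and a manual free-space check.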

	This last example falls into the category of "application programmer
doing something naughty (by extension through library routine) with (what are
essentially) pointers." The interesting part is that *BOTH* APL and FORTRAN get
to accuse you of this: APL doesn't want you to rely on the array being in any
given place, and FORTRAN doesn't want you to be able to move it. This is the
kind of unforeseen conflict which prevents "RE-USABILITY" in many cases. From
their separate points of view, both the FORTRAN and APL implementors have
correct and sensible implementations, but their products are being used (to
very good result) in a way which neither expected, and through silly tricks
which make for highly unreadable and difficult to maintain code. 

	The moral of the story is: 

1. Put in garbage collection, but also put in sufficient controls for the
programmer.

2. If you need to make assertions about scope of dynamic data, FORCE them.

3. Programmers (even us Pascal lovers) will not stop at anything when the
big push comes to the big shove, and we'll likely get around sensible 
and normal guidelines which are not supported by adequate sensible facilities.
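
	One possible reading of point 2, sketched in Python: the claim "this
data lives only inside this block" is enforced at run time instead of being
left as a convention. Handle and scoped are hypothetical names, and a real
language facility would presumably check this at compile time; this is only
an illustration of the idea.

```python
from contextlib import contextmanager


class Handle:
    """Wrapper that refuses access once its declared scope has ended."""

    def __init__(self, values):
        self._values = list(values)
        self._live = True

    def get(self):
        if not self._live:
            raise RuntimeError("storage used outside its declared scope")
        return self._values


@contextmanager
def scoped(values):
    """Yield a Handle that is forcibly invalidated when the block exits."""
    handle = Handle(values)
    try:
        yield handle
    finally:
        handle._live = False      # the assertion about scope is forced here
        handle._values = None     # drop the storage itself


if __name__ == "__main__":
    with scoped([1, 2, 3]) as h:
        print(h.get())            # fine inside the block
    try:
        h.get()                   # examining the data after release
    except RuntimeError as err:
        print("refused:", err)
```

Examining the storage after release then fails loudly instead of being
"usually OK".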

Later,
Andrew Mullhaupt.

Disclaimer:
Any opinions here are not necessarily those of my employer.

(Did you know that Morgan Stanley & Co. Inc., is the world's largest APL
  shop? IBM told us so...)