Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!cs.utexas.edu!uunet!hsi!wright
From: wright@hsi.UUCP (Gary Wright)
Newsgroups: comp.sw.components
Subject: Re: Garbage Collection & ADTs
Message-ID: <604@hsi86.hsi.UUCP>
Date: 26 Sep 89 15:50:08 GMT
References: <599@hsi86.hsi.UUCP> <6579@hubcap.clemson.edu>
Reply-To: wright@hsi.com (Gary Wright)
Organization: Health Systems Intl., New Haven, CT.
Lines: 109

In article <6579@hubcap.clemson.edu> billwolf%hazel.cs.clemson.edu@hubcap.clemson.edu writes:
>From wright@hsi.UUCP (Gary Wright):
>> I have seen a number of people claim that GC can not be used in a
>> real-time system and then conclude that GC is a bad thing that should
>> not be used.  
>
>   You're missing several important steps here having to do with
>   performance degradation and the violation of the general principle
>   that one should never do at run time that which can be done instead
>   at compile time, design time, etc.; in general, the sooner a problem
>   is handled (and similarly, the sooner an error is detected), the less
>   costly it will be to handle it.  

In my posting I tried to emphasize that there is probably a trade-off
between using a language that supports GC and one that doesn't.  It is
not clear to me that the relative costs are such as to render GC
useless.  I do believe that the costs associated with non-GC languages
are harder to measure.  Please provide some reasoning for your
statements if you feel differently; simply claiming that the costs do
not justify GC is not enough.  I will try to provide some reasoning
for GC below.

I agree that as much as possible should be done at compile or
design time.  I would prefer that the compiler do most of the dirty work.
Also, in the context of reusable software, design decisions should, as much
as possible, not limit later reuse of the software (clear violations
of this are C++'s "virtual" and "inline" keywords).

>   Why pay and pay and pay at run time
>   when an efficient self-managing solution (valid forevermore) can be 
>   used (and reused) instead?

It is not clear to me that garbage collection entails a particularly
harsh performance penalty.  It is also not clear to me that the
self-managing solution is simple enough to allow such components to be
easily created and modified (i.e. modified via inheritance, which
Ada does not support).

After reading your posting, I went back and read the sections in
"Object-oriented Software Construction" regarding memory management.
What follows is mostly Bertrand Meyer's ideas regarding memory
management.  I tend to agree, so I will simply present his ideas with
my comments.

Bertrand Meyer claims that programmer-controlled deallocation "is
unacceptable for two reasons: security and complication of program
writing."

By security, Meyer means that programmers are mortal: they will
make mistakes and will dispose of objects that still have active
references.  Complication refers to the fact that simply disposing
of an object is not sufficient; all the objects it refers to must also
be disposed.  Meyer calls this the "recursive dispose problem":

	This means that a specific release procedure must be
	written for any type describing objects that may refer
	to other objects.  The result will be a set of mutually
	recursive procedures of great complication.

	[...]
	Instead of concentrating on what should be his job -
	solving an application problem - the programmer turns 
	into a bookkeeper, or garbage collector (whichever 
	metaphor you prefer).  The complexity of programs
	is increased, hampering readability and hence other
	qualities such as ease of error detection and
	ease of modification.  This further affects security: 
	the more complex a program, the more likely it is to 
	contain errors.

This is the trade-off I was talking about. You can do without GC,
but the increased complexity of your programs has its own
penalty also.
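To make the recursive dispose problem concrete, here is a small C++
sketch.  The types (Employee, Department, Name) and the "live" counter
are my own inventions for illustration, not anything from Meyer's book;
the point is just the shape of the code a programmer must write and
maintain without GC.

```cpp
#include <cassert>
#include <cstring>

static int live = 0;                     // crude leak counter for the demo

// Three hypothetical record types; Employee owns a Name and a
// Department, and the Department owns its own Name.
struct Name       { char* text; };
struct Department { Name* name; };
struct Employee   { Name* name; Department* dept; };

Name* make_name(const char* s) {
    Name* n = new Name;
    n->text = new char[std::strlen(s) + 1];
    std::strcpy(n->text, s);
    ++live;
    return n;
}

Department* make_department(const char* s) {
    Department* d = new Department;
    d->name = make_name(s);
    ++live;
    return d;
}

Employee* make_employee(const char* who, const char* where) {
    Employee* e = new Employee;
    e->name = make_name(who);
    e->dept = make_department(where);
    ++live;
    return e;
}

// Each owning type needs its own release procedure, and the procedures
// call one another: this is Meyer's "recursive dispose" pattern.
void dispose_name(Name* n)             { delete[] n->text; delete n; --live; }
void dispose_department(Department* d) { dispose_name(d->name); delete d; --live; }
void dispose_employee(Employee* e) {
    dispose_name(e->name);
    dispose_department(e->dept);       // omit this call and the Department
    delete e;                          // and its Name leak silently
    --live;
}
```

Every new owning type added to the system means another hand-written
dispose procedure, and every new reference field means another call
site that can be forgotten.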

Meyer goes on to describe an alternative, the "Self-Management Approach"
in which the designer of an ADT takes care of the storage management
associated with the ADT.  This is the approach I believe Bill Wolf
is advocating.  This is indeed a "responsible alternative" to leaving
everything to the programmer.  Meyer advocates the use of this technique
for languages that do not support GC.  In his example, all objects to
be added to a linked list are copied into storage that is managed
by the linked list.  This would seem to be quite inefficient for large
objects.  I'm not sure how storage could be safely managed when an ADT
only holds references to objects.
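The self-management approach might look like the following C++ sketch.
The class name, Node layout, and operations are assumptions of mine;
the essential point, from Meyer, is that the list copies each element
into storage it owns and releases everything itself, so the client
never frees anything.

```cpp
#include <cassert>
#include <cstring>

// A list of strings that owns copies of its elements.
struct Node {
    char* copy;                   // list-owned copy of the client's data
    Node* next;
};

class StringList {
public:
    ~StringList() {               // the list, not the client, releases storage
        while (head) {
            Node* n = head;
            head = head->next;
            delete[] n->copy;
            delete n;
        }
    }
    void add(const char* s) {     // copies s -- expensive for large objects
        Node* n = new Node;
        n->copy = new char[std::strlen(s) + 1];
        std::strcpy(n->copy, s);
        n->next = head;
        head = n;
    }
    const char* first() const { return head ? head->copy : 0; }
private:
    Node* head = nullptr;
};
```

Note that the safety comes precisely from the copying: if the list held
only pointers supplied by the client, it could not know when it was safe
to delete them, which is the problem I raise above.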

In his example, procedures that add nodes to or delete nodes from the
linked list cause new nodes to be created or destroyed.  If an ADT
provides procedures that directly create or destroy objects within the
ADT, then the ADT has not really hidden the problem from the programmer
at all.  At most, it has hidden the recursive dispose problem.

This approach does not solve the previously mentioned problems of
security and complication.  Instead of the applications programmer
worrying about storage, the ADT designer must worry about it.  My hunch
is that the distinction between an ADT designer and an applications
programmer is clear for objects like linked lists, stacks, etc., but
that it is not so clear the farther you get from the basic data
structures.  I wonder if the overhead of having ADTs manage storage
via copying, plus the redundant code (all the recursive dispose
procedures), doesn't come close to the overhead of GC.

It should be noted that Meyer advocates a GC facility that is
incremental in nature (not bursty) and can be explicitly turned on or
off when necessary (e.g. when real-time constraints exist).
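The control interface Meyer has in mind could be sketched as below.
This is a toy, not a real collector: the Collector class, the step
budget, and the dead-object list are all invented for illustration.
What it shows is the two properties above: reclamation can be switched
off entirely during a time-critical section, and when on it does only a
bounded amount of work per step instead of a long pause.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Collector {
public:
    void set_enabled(bool on) { enabled = on; }

    void mark_dead(void* p) { dead.push_back(p); }  // object became garbage

    // Reclaim at most `budget` dead objects; returns how many were freed.
    // A bounded budget is what makes the collector incremental.
    std::size_t step(std::size_t budget) {
        if (!enabled) return 0;    // real-time section: do no work at all
        std::size_t freed = 0;
        while (freed < budget && !dead.empty()) {
            ::operator delete(dead.back());
            dead.pop_back();
            ++freed;
        }
        return freed;
    }

    std::size_t pending() const { return dead.size(); }

private:
    bool enabled = true;
    std::vector<void*> dead;
};
```

While the collector is off, garbage simply accumulates; the program
trades memory for predictable timing, and pays the reclamation cost
later, a little at a time.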

Your turn. :-)
-- 
Gary Wright 					...!uunet!hsi!wright
Health Systems International                    wright@hsi.com