Path: utzoo!utgpu!attcan!uunet!lll-winken!lll-lcc!ames!ncar!oddjob!uxc!uxc.cso.uiuc.edu!a.cs.uiuc.edu!p.cs.uiuc.edu!johnson
From: johnson@p.cs.uiuc.edu
Newsgroups: comp.lang.smalltalk
Subject: Re: Software Reuse 2
Message-ID: <80500038@p.cs.uiuc.edu>
Date: 19 Aug 88 15:07:00 GMT
References: <552@dcl-csvax.comp.lancs.ac.uk>
Lines: 107
Nf-ID: #R:dcl-csvax.comp.lancs.ac.uk:552:p.cs.uiuc.edu:80500038:000:5556
Nf-From: p.cs.uiuc.edu!johnson    Aug 19 10:07:00 1988


For a discussion of how Smalltalk encourages reusable software, see
the article "Designing Reusable Classes" that I wrote with Brian
Foote and which appeared in the latest (the second) issue of the
Journal of Object-Oriented Programming.

Briefly, there are a number of features of Smalltalk that encourage
software reuse, and inheritance is only one of them.  A very important
feature is late binding of procedure calls, usually called "message
sending".  This is not really related to inheritance, though languages
like C++ combine the two.  The programming environment also encourages
reuse by providing tools for browsing class descriptions and
an interactive debugger that helps the programmer figure out what the
reused code is REALLY doing.  One of the most important "features" of
Smalltalk is the culture that has grown up around it, where the
designer of a reusable package is the chief among princes and "Not 
Invented Here" is the object of derision.
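
As a rough illustration of late binding without inheritance, here is a
small sketch in Python rather than Smalltalk.  The class and function names
are hypothetical and do not come from any system mentioned in this post.
The two classes share no superclass of their own, yet the same call site
works for both because the method is looked up at run time from the
receiver.

```python
import math


# Hypothetical classes for illustration; neither inherits from the other,
# and they have no common superclass apart from Python's built-in object.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


def total_area(shapes):
    # Each `shape.area()` is bound at run time from the receiver's class,
    # not from any declared common supertype.
    return sum(shape.area() for shape in shapes)
```

Because total_area depends only on the message "area", any new class that
answers it can be reused with this code unchanged.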

One of the things that makes it difficult to talk about inheritance is
that inheritance has many uses.  The most obvious is sharing code, but
even code sharing has at least two uses.  One is making a system smaller
and simpler by reducing the number of procedures and adding structure.
This kind of code sharing requires a great deal of hard work and
inspiration on the part of the designers.  The other use of code sharing
is during rapid-prototyping, when the designer makes subclasses of
anything that is handy, forcing classes to serve roles for which they
were never designed.  
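
The contrast between the two kinds of code sharing can be sketched as
follows, in Python and with entirely hypothetical classes.  QuickStack is
the prototyping style: it subclasses whatever is handy.  Collection, Stack
and Queue are one possible designed alternative, where shared behavior is
written once against a small hook.

```python
# All names here are hypothetical; none come from the post.

class QuickStack(list):
    """Prototype-style reuse: subclass whatever is handy (here, list)."""

    def push(self, item):
        self.append(item)
    # pop() is inherited from list, but so are insert(), sort() and others,
    # which let callers break the stack discipline.


class Collection:
    """Designed reuse: shared behavior is written once, via one hook."""

    def items(self):
        raise NotImplementedError  # subclasses supply the storage

    def is_empty(self):
        return len(self.items()) == 0

    def size(self):
        return len(self.items())


class Stack(Collection):
    def __init__(self):
        self._items = []

    def items(self):
        return self._items

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class Queue(Collection):
    def __init__(self):
        self._items = []

    def items(self):
        return self._items

    def add(self, item):
        self._items.append(item)

    def remove(self):
        return self._items.pop(0)
```

QuickStack takes a minute to write but exposes every list operation.  The
second design takes more thought, and in practice usually more than one
attempt.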

These two kinds of code sharing result in very different class hierarchies,
and a class hierarchy of the second kind, which arose by chance, will never
be mistaken for a class hierarchy of the first kind, which arose by design.
However, poorly designed class hierarchies are much
easier to produce, and it is usually impossible to design a good class
hierarchy until you have made a few versions.  Thus, you might as well start
with an ugly class hierarchy and not even try to design a good one until you
know what you are doing.

I am exaggerating a bit.  It is always worthwhile to think about a problem
before you start coding, and a little thought can result in a class hierarchy
that is much better than what you would get by chance.  However, unless you
have done it before, you will NEVER, NEVER get the class hierarchy right
the first time, and if you think differently then you are just fooling
yourself.  Our major class hierarchies have been redone half a dozen times,
and there is no evidence that we are done with them yet.  We are searching
for perfection, however, and we probably could have stopped a little while
ago and claimed victory if we were just looking for something that was
good enough.

The size of class hierarchies varies greatly depending on the problem
domain.  My Smalltalk compiler has 32 parse node classes.  There are five or
six flow node classes.  Small class hierarchies will not have much
structure, just a common superclass and a few subclasses, but a large
class hierarchy should have a fair bit of structure.  If I see a class
hierarchy consisting of a superclass and 20 subclasses then I know that
the design is not very far along.  However, class hierarchies usually
grow incrementally, with occasional reorganizations, so a class is very
unlikely to end up with 20 direct subclasses.  A well-designed class hierarchy
containing 20 classes will have a depth of 3 to 5.  For example,
the parse node class hierarchy is

TypedParseNode ()
    TypedAssignmentNode ('variable' 'value' )
    TypedBlockNode ('arguments' 'statements' 'type' 'returns' 'codeIndex' 'nestingLevel' 'blockTemps' )
    TypedCascadeNode ('receiver' 'messages' )
    TypedCaseNode ('testCase' 'options' 'otherwise' )
    TypedChoiceNode ('choice' 'trueCase' )
        TypedChoiceClassNode ()
    TypedCoercionNode ('statement' 'type' )
    TypedIndexNode ('rcvr' 'index' 'type' )
    TypedInlineNode ('argSetup' 'stmts' 'inlineInterval' )
        TypedInlineBlockNode ()
        TypedInlineMethodNode ()
    TypedJumpDestinationNode ('label' 'result' 'references' )
    TypedJumpNode ('destinationNode' )
    TypedLeafNode ('key' )
        TypedArgVarNode ()
        TypedBlockArgNode ('myBlock' )
        TypedBlockTempNode ('myBlock' )
        TypedContextNode ()
        TypedGlobalVarNode ('type' )
        TypedInstVarNode ()
        TypedLiteralNode ('type' )
        TypedSelfNode ()
        TypedSpecialVarNode ()
        TypedTempVarNode ()
            TypedTempRegVarNode ()
    TypedMessageNode ('receiver' 'selector' 'arguments' 'type' )
    TypedMethodNode ('selectorOrFalse' 'arguments' 'statement' 'encoder' 'temporaries' 'rcvType' 'retType' 'myLabel' 'literals' 'blockInfo' )
    TypedPrimitiveNode ('number' 'rtlList' 'argList' 'codeString' 'codeStream' 'returnTo' 'failCase' 'receiver' 'returnType' )
    TypedProcedureNode ('receiver' 'arguments' 'selector' 'class' 'methodIndex' 'retType' )
    TypedReturnNode ('expr' )
    TypedStatementListNode ('statements' )

Note that 17 of the 32 classes, or just over half of them, are at level 2 (with the
root being level 1) and there is only one level 4 class.  Thus, the class
hierarchy has the general form

	*
  ************
  ************
        *

This is not too uncommon in a single inheritance class hierarchy, though
usually it is a little thicker at the bottom.  I think that some of our
TypedLeafNode subclasses need to be reorganized, resulting in some more
level 4 classes.
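
The per-level tally can be checked mechanically.  The following is a small
Python sketch, not part of the compiler.  It assumes the class names are
copied from the listing above with instance variables dropped and four
spaces of indentation per level.  The function name level_counts is made
up for this example.

```python
from collections import Counter

# Class names copied from the parse node listing, instance variables omitted.
# Indentation is four spaces per level, as in the printout.
HIERARCHY = """\
TypedParseNode
    TypedAssignmentNode
    TypedBlockNode
    TypedCascadeNode
    TypedCaseNode
    TypedChoiceNode
        TypedChoiceClassNode
    TypedCoercionNode
    TypedIndexNode
    TypedInlineNode
        TypedInlineBlockNode
        TypedInlineMethodNode
    TypedJumpDestinationNode
    TypedJumpNode
    TypedLeafNode
        TypedArgVarNode
        TypedBlockArgNode
        TypedBlockTempNode
        TypedContextNode
        TypedGlobalVarNode
        TypedInstVarNode
        TypedLiteralNode
        TypedSelfNode
        TypedSpecialVarNode
        TypedTempVarNode
            TypedTempRegVarNode
    TypedMessageNode
    TypedMethodNode
    TypedPrimitiveNode
    TypedProcedureNode
    TypedReturnNode
    TypedStatementListNode
"""


def level_counts(listing, indent=4):
    """Return {level: number of classes}, with the root at level 1."""
    counts = Counter()
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        depth = (len(line) - len(line.lstrip(" "))) // indent
        counts[depth + 1] += 1
    return dict(sorted(counts.items()))


print(level_counts(HIERARCHY))
```

For this listing the tally is 1, 17, 13 and 1 classes at levels 1 through 4.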

Ralph Johnson