Path: utzoo!attcan!uunet!tut.cis.ohio-state.edu!gem.mps.ohio-state.edu!rpi!nyser!cmx!dl
From: dl@cmx.npac.syr.edu (Doug Lea)
Newsgroups: comp.lang.c++
Subject: Re: named return values
Message-ID: <1835@cmx.npac.syr.edu>
Date: 10 Aug 89 10:46:25 GMT
References: <1826@cmx.npac.syr.edu> <26302@shemp.CS.UCLA.EDU> <6444@columbia.edu>
Reply-To: dl@cmx.npac.syr.edu (Doug Lea)
Organization: Northeast Parallel Architectures Center, Syracuse NY
Lines: 177


I have a great deal of sympathy with Steve Kearns's views, but disagree
with his conclusions. For many `algebraic' classes, including vectors
and matrices, there is a natural object-oriented set of operations and
semantics that looks substantially different from the more familiar
value-based operations and semantics. I think many such classes ought
to support BOTH.

To compose an argument about why this should be so, I'll scale down
and consider a class-based re-implementation of simple integers:

class Int
{
  int  rep;
public:
       Int(const Int& a)         :rep(a.rep) {}
       Int(int i = 0)            :rep(i) {}

  void operator = (const Int& a) { rep = a.rep; }

  // purely object-style: each operation mutates *this in place and
  // returns nothing
  void negate()                  { rep = -rep; }
  void operator +=(const Int& a) { rep += a.rep; }
  void operator -=(const Int& a) { rep -= a.rep; }
  void operator *=(const Int& a) { rep *= a.rep; }
  void operator /=(const Int& a) { rep /= a.rep; }
};


This definition is very much an object-oriented one. However, I bet
that few programmers would like to use it. Instead of coding integer
operations in a familiar expression-based (value-based) notation such
as (given Int a, b, c, d)

   a = (b - a) * -(d / c);     (*)

they would be forced into something akin to assembly language
calculations, hand-translating their intentions (i.e., (*)) into


  Int t1(b);  t1 -= a;         (**)
  Int t2(d);  t2 /= c;
  Int t3(t2); t3.negate();
  Int t4(t1); t4 *= t3;
  a = t4;

Unless, of course, they are pretty good hand-optimizers! In that case,
using some basic, well-known arithmetic rewriting principles, they
could write this more efficiently as

  a -= b; a *= d; a /= c;      (***)

Hardly anyone likes to hand-optimize such expressions. (The fact that
the object operations employ operator notation (`t4 *= t3') instead of
prefix notation (e.g., `t4.mul(t3)') scarcely makes this any more fun.)

Now, it is not hard to completely automate the first (*) => (**)
translation step via

Int operator - (const Int& a)               return r(a) { r.negate(); }
Int operator + (const Int& a, const Int& b) return r(a) { r += b; }
Int operator - (const Int& a, const Int& b) return r(a) { r -= b; }
Int operator * (const Int& a, const Int& b) return r(a) { r *= b; }
Int operator / (const Int& a, const Int& b) return r(a) { r /= b; }

As I mentioned in my previous note, these are perhaps best thought of
as extensions or disguised forms of Int constructors.  Named return
values make these simpler, faster, and more obvious, I think.


My points so far are:

1) For many classes, expression-/value-based operations support a more
    natural programming style that C++, as a hybrid language, is fully
    capable of providing.

2) Object-based operations are surely the more central and basic in
    any object-oriented language, since value-based operations may
    be layered on top of the object-based ones.

3) As proponents of functional programming like to argue, value-based
    operations result in code that is almost always easier to verify
    for correctness, both informally and formally. Moreover, layering
    in value operations as a `translation' step makes design and
    verification at the object level easier too, since one need
    `merely' determine that the translation preserves correctness.



The kinds of optimizations seen going from (**) to (***) are further
refinements of this expression translation process, familiar to
assembly language programmers, compiler writers, and those studying
general program transformation techniques (e.g., the work inspired by
Burstall & Darlington).

Automation of such optimizations requires capabilities that seem
currently unavailable in C++. There are a lot of well-known `tricks'
(reference counting, temp classes, etc.) that can exploit some of the
most glaring opportunities, but there are no built-in
`compiler-compiler' constructs that would allow programmers to specify
all of the kinds of optimizations possible for these kinds of classes.

Among the most frustrating aspects of all this is that these
optimizations are, on the one hand, fairly obvious and well-known,
but, on the other hand, very tedious and error-prone to carry out
manually every time you write out an expression.  Yet more frustrating
is the fact that nearly any compiler already knows how to optimize
expressions involving built-in data types like int, float, char, but
knows nothing at all about expressions involving user-defined types
like Int.


These seem to be the available options for implementors of such classes:

    1) Refuse to support expression-oriented operations.

    2) Support both the object-oriented methods, and simple
       value-to-object translations, thereby allowing programmers to
       hand-optimize using the object operations if they need to.

    3) Support as much optimization as you can manage from within
       the confines of C++.

    4) Write expression node classes that form expression trees during
       *runtime*, and fully optimize the trees, again during runtime
       before processing. (E.g., ExprNode operator + (Int&, Int&); makes
       a node, and Int::operator=(ExprNode&); evaluates it.)
 
    5) Write class-specific preprocessors, that perform optimized
       translations using knowledge that can't be expressed in C++.

    6) Extend the language.

Choosing (1) means that people will find the class difficult and/or
annoying to use.  Even the purest object-oriented languages provide at
least some arithmetic operator support, probably because programmers
would refuse to use them otherwise. As Michael Tiemann says, code
isn't reusable if it's not usable!

(2) is perhaps the most practical way to proceed right now, and is
fully within the spirit of the C/C++ maxim of `make it right, *then*
make it fast'. (Note, for example, that C/C++, unlike many languages,
already supports many object-based operations on builtin types, like
operator +=, that were designed, in part, to make hand-optimization by
programmers easier.)

I have written a bunch of libg++ classes along the lines of (3).
Unfortunately, such coding often involves cleverness or trickery that
obfuscates otherwise simple designs, and does not have a great payoff
since only a very small subset of common optimizations can be done
within such limitations.

In effect, (4) turns such classes into little interpreters. This *is*
doable, but approaches viability only when the objects and expressions
are so large and complex that the time taken to create and evaluate
the expression tree during runtime is always paid off by the resulting
savings. Moreover, it seems just plain wrong to do compile-time
processing during run time.

I am currently investigating some forms of (5), in part to scope out
the kinds of issues and problems inherent in (6), about which I do not
yet have any good concrete suggestions.



Among the things I find most interesting about this topic is that
there are some classes (numbers, strings...) in which value-based
programming is seen by most everyone as the most natural and desirable
methodology, others (queues, windows, ...) in which object-oriented
programming is most natural, and some (sets, matrices, ...)  where people
like to use both. I'm not exactly sure why this is so. But a great
attraction of C++ is that there is the possibility that both views can
be accommodated.


Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
email: dl@oswego.edu              or dl%oswego.edu@nisc.nyser.net
UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl