Path: utzoo!attcan!uunet!tut.cis.ohio-state.edu!gem.mps.ohio-state.edu!rpi!nyser!cmx!dl
From: dl@cmx.npac.syr.edu (Doug Lea)
Newsgroups: comp.lang.c++
Subject: Re: named return values
Message-ID: <1835@cmx.npac.syr.edu>
Date: 10 Aug 89 10:46:25 GMT
References: <1826@cmx.npac.syr.edu> <26302@shemp.CS.UCLA.EDU> <6444@columbia.edu>
Reply-To: dl@cmx.npac.syr.edu (Doug Lea)
Organization: Northeast Parallel Architectures Center, Syracuse NY
Lines: 177

I have a great deal of sympathy with Steve Kearns's views, but disagree
with his conclusions.

For many `algebraic' classes, including vectors and matrices, there is
a natural object-oriented set of operations and semantics that looks
substantially different from the more familiar value-based operations
and semantics. I think many such classes ought to support BOTH.

To make an argument about why this should be so, I'll scale down and
consider a class-based re-implementation of simple integers:

    class Int
    {
      int rep;
    public:
      Int(const Int& a)               :rep(a.rep) {}
      Int(int i = 0)                  :rep(i) {}

      void operator =  (const Int& a) { rep =  a.rep; }

      void negate()                   { rep = -rep; }

      void operator += (const Int& a) { rep += a.rep; }
      void operator -= (const Int& a) { rep -= a.rep; }
      void operator *= (const Int& a) { rep *= a.rep; }
      void operator /= (const Int& a) { rep /= a.rep; }
    };

This definition is very much an object-oriented one. However, I bet
that few programmers would like to use it. Instead of coding integer
operations in a familiar expression-based (value-based) notation like,
for Int a, b, c, d,

    a = (b - a) * -(d / c);                          (*)

they would be forced into something akin to assembly language
calculations, hand-translating their intentions (i.e., (*)) into

    Int t1(b);  t1 -= a;                             (**)
    Int t2(d);  t2 /= c;
    Int t3(t2); t3.negate();
    Int t4(t1); t4 *= t3;
    a = t4;

Unless, of course, they are pretty good hand-optimizers! In that case,
using some basic, well-known arithmetic optimization (rewriting)
principles, they could write this more efficiently as

    a -= b;                                          (***)
    a *= d;
    a /= c;

Hardly anyone likes to hand-optimize such expressions. (The fact that
the object operations employ operator notation (`t4 *= t3') instead of
prefix notation (e.g., `t4.mul(t3)') scarcely makes this any more fun.)

Now, it is not hard to completely automate the first (*) => (**)
translation step via

    Int operator - (const Int& a)               return r(a) { r.negate(); }
    Int operator + (const Int& a, const Int& b) return r(a) { r += b; }
    Int operator - (const Int& a, const Int& b) return r(a) { r -= b; }
    Int operator * (const Int& a, const Int& b) return r(a) { r *= b; }
    Int operator / (const Int& a, const Int& b) return r(a) { r /= b; }

As I mentioned in my previous note, these are perhaps best thought of
as extensions or disguised forms of Int constructors. Named return
values make these simpler, faster, and more obvious, I think.

My points so far are:

1) For many classes, expression-based (value-based) operations support
   a more natural programming style, one that C++, as a hybrid
   language, is fully capable of supporting.

2) Object-based operations are surely the more central and basic in
   any object-oriented language, since value-based operations may be
   layered on top of the object-based ones (as sketched below).

3) As proponents of functional programming like to argue, the use of
   value-based operations results in code that is almost always easier
   to verify for correctness, both informally and formally.
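For concreteness, here is roughly what that layering looks like when
written without the named-return-value extension, as a plain sketch
with an explicit local result (the extra copy on return is exactly
what the named-return-value form is meant to avoid):

    Int operator - (const Int& a, const Int& b)
    {
      Int r(a);     // result initialized from the left operand
      r -= b;       // then refined via the object-based operation
      return r;     // a named return value would avoid this copy
    }

    Int operator - (const Int& a)   // unary negation, layered the same way
    {
      Int r(a);
      r.negate();
      return r;
    }

Either way, an expression like (*) now translates automatically into
something equivalent to (**).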
However, layering in value operations as a `translation' step makes
design and verification at the object level easier too, since one need
`merely' determine that the translation preserves correctness. The
kinds of optimizations seen going from (**) to (***) are further
refinements of this expression translation process, familiar to
assembly language programmers, compiler writers, and those studying
general program transformation techniques (e.g., the work inspired by
Burstall & Darlington).

Automation of such optimizations requires capabilities that seem
currently unavailable in C++. There are a lot of well-known `tricks'
(reference counting, temp classes, etc.) that can exploit some of the
most glaring opportunities, but there are no built-in
`compiler-compiler' constructs that would allow programmers to specify
all of the kinds of optimizations possible for these kinds of classes.

Among the most frustrating aspects of all this is that these
optimizations are, on the one hand, fairly obvious and well known,
but, on the other hand, very tedious and error-prone to carry out
manually every time you write out an expression. Yet more frustrating
is the fact that nearly any compiler already knows how to optimize
expressions involving built-in data types like int, float, and char,
but knows nothing at all about expressions involving user-defined
types like Int.

These seem to be the available options for implementors of such
classes:

1) Refuse to support expression-oriented operations.

2) Support both the object-oriented methods and simple value-to-object
   translations, thereby allowing programmers to hand-optimize using
   the object operations if they need to.

3) Support as much optimization as you can manage from within the
   confines of C++.

4) Write expression node classes that form expression trees at
   *runtime*, and fully optimize the trees, again at runtime, before
   processing. (E.g., ExprNode operator + (Int&, Int&); makes a node,
   and Int::operator=(ExprNode&); evaluates it; see the sketch below.)

5) Write class-specific preprocessors that perform optimized
   translations using knowledge that can't be expressed in C++.

6) Extend the language.

Choosing (1) means that people will find the class difficult and/or
annoying to use. Even the purest object-oriented languages provide at
least some arithmetic operator support, probably because programmers
would refuse to use them otherwise. As Michael Tiemann says, code
isn't reusable if it's not usable!

(2) is perhaps the most practical way to proceed right now, and is
fully within the spirit of the C/C++ maxim of `make it right, *then*
make it fast'. (Note, for example, that C/C++, unlike many languages,
already supports many object-based operations on builtin types, like
operator +=, that were designed, in part, to make hand-optimization by
programmers easier.)

I have written a bunch of libg++ classes along the lines of (3).
Unfortunately, such coding often involves cleverness or trickery that
obfuscates otherwise simple designs, and it does not have a great
payoff, since only a very small subset of common optimizations can be
done within such limitations.

In effect, (4) turns such classes into little interpreters. This *is*
doable, but it approaches viability only when the objects and
expressions are so large and complex that the time taken to create and
evaluate the expression tree at runtime is always paid off by the
resulting savings. Moreover, it seems just plain wrong to do
compile-time processing during run time.
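To make the scheme in (4) a bit more concrete, here is a minimal
sketch of the expression-node idea, restricted to a single pending
binary operation. The node layout and names are purely illustrative; a
real version would need further overloads so that nodes combine with
nodes (and careful attention to operand lifetimes) in order to build
and optimize genuine trees:

    struct ExprNode;                    // one node of a (tiny) expression tree

    class Int                           // a variant of Int that accepts trees
    {
      int rep;
    public:
      Int(int i = 0)     :rep(i) {}
      int  value() const { return rep; }
      void operator = (const ExprNode& e);   // evaluates a tree
    };

    struct ExprNode                     // records one deferred binary operation
    {
      char       op;                    // '+', '-', '*', or '/'
      const Int* left;
      const Int* right;
      ExprNode(char o, const Int* l, const Int* r) :op(o), left(l), right(r) {}
    };

    ExprNode operator + (const Int& a, const Int& b) { return ExprNode('+', &a, &b); }
    ExprNode operator * (const Int& a, const Int& b) { return ExprNode('*', &a, &b); }
    // operator - and operator / are analogous

    void Int::operator = (const ExprNode& e)
    {
      // a full version would rewrite and optimize a whole tree here;
      // this one merely evaluates a single node
      switch (e.op)
      {
        case '+': rep = e.left->value() + e.right->value(); break;
        case '*': rep = e.left->value() * e.right->value(); break;
      }
    }

With this in place, `c = a + b;' builds a node via operator + and then
evaluates it inside operator =; the cost is exactly the runtime tree
construction and interpretation noted above.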
I am currently investigating some forms of (5), in part to scope out
the kinds of issues and problems inherent in (6), about which I do not
yet have any good concrete suggestions.

Among the things I find most interesting about this topic is that
there are some classes (numbers, strings, ...) for which value-based
programming is seen by nearly everyone as the most natural and
desirable methodology, others (queues, windows, ...) for which
object-oriented programming is most natural, and some (sets, matrices,
...) where people like to use both. I'm not exactly sure why this is
so. But a great attraction of C++ is the possibility that both views
can be accommodated.

Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
email: dl@oswego.edu or dl%oswego.edu@nisc.nyser.net
UUCP : ...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl