Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!mailrus!csd4.milw.wisc.edu!cs.utexas.edu!husc6!ogccse!blake!lgy
From: lgy@blake.acs.washington.edu (Laurence Yaffe)
Newsgroups: comp.lang.c++
Subject: Re: named return values
Message-ID: <3163@blake.acs.washington.edu>
Date: 9 Aug 89 07:15:36 GMT
References: <1826@cmx.npac.syr.edu> <26302@shemp.CS.UCLA.EDU> <6444@columbia.edu>
Reply-To: lgy@newton.phys.washington.edu (Laurence Yaffe)
Organization: University of Washington, Seattle
Lines: 50

In article <6444@columbia.edu> kearns@cs.columbia.edu writes:
>While this is very nice in theory, in practice it can lead to horrible
>performance because of the various temporary matrices that are created.

    [ various comments about the desirability of explicitly controlling
      memory use for matrix operations deleted ]

> It is also
>more "honest": matrices are NOT good candidates for having value semantics
>because their copying time is large.
>-steve
>(kearns@cs.columbia.edu)

    The claim that frequent copying of matrices causes unacceptable
performance degradation appears to be common dogma, but what real evidence
supports this?  Since most common operations on matrices (multiplication,
diagonalization, decomposition, inversion, ...) involve order N^3 operations
for N-dimensional matrices, while copying is only order N^2, the overhead
of copying will be significant only if

    (a) matrices are small and copies are very frequent (compared to
        other operations),
    (b) matrices are so large that memory limitations intervene, or
    (c) no O(N^3) operations are being performed.

In years of my own work, I've never seen real examples of case (c), and
only a few examples of case (a).  Over quite a range of applications, I've
found that the break-even point where O(N^2) copies become important is
well under N=10, typically 3 or 4.  And for compute-intensive applications
with matrices that small, special methods tend to be more appropriate
(fixed-dimension types, inline coding, ...).

    I have run into examples of case (b), most recently in a calculation
involving 1280 x 1280 dimensional matrices which needed more than 80 Mb
of swap space!  But this type of problem seems to be largely a thing of
the past - unless you have a very fast machine, or the patience to do
O(N^3) operations on 1000 x 1000 matrices.

    On all the machines I've used, sequentially accessing all matrix
elements in a row is significantly faster than accessing a column (better
locality of reference, faster pointer increment).  And yet surprisingly
few canned matrix multiply routines pre-transpose one of the matrices (or
use equivalent tricks involving an O(N^2) movement of data) in order to
take advantage of this fact.  Absolutely criminal...

    Anyone have real data (or just more anecdotal tales) on the
significance of matrix copies in real applications?

-- 
Laurence G. Yaffe                 Internet: lgy@newton.phys.washington.edu
University of Washington          Bitnet:   yaffe@uwaphast.bitnet
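
For concreteness, here is a minimal sketch of the pre-transpose trick
described above, assuming a row-major layout; the type and function names
are illustrative only and not taken from any particular library.

// Sketch: an N x N multiply that pays one O(N^2) transpose of the second
// operand so that the O(N^3) inner loop walks both matrices row-wise.
// Row-major storage in a flat array; all names here are hypothetical.

#include <cstddef>
#include <vector>

using Matrix = std::vector<double>;          // n*n elements, row-major

Matrix multiply_pretransposed(const Matrix& a, const Matrix& b,
                              std::size_t n)
{
    // O(N^2): form bt = transpose(b), so bt's rows are b's columns.
    Matrix bt(n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            bt[j * n + i] = b[i * n + j];

    // O(N^3): both a and bt are now read sequentially (stride 1),
    // whereas a naive product strides through b by column (stride n).
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * bt[j * n + k];
            c[i * n + j] = sum;
        }
    return c;
}

The extra O(N^2) copy is exactly the kind of cost the article argues is
negligible next to the O(N^3) work it speeds up.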