Path: utzoo!utgpu!watmath!att!ucbvax!agate!usenet
From: hughes@math.berkeley.edu (Eric Hughes)
Newsgroups: comp.databases
Subject: Re: Extended RDB vs OODB
Message-ID: <1989Aug17.180057.2623@agate.berkeley.edu>
Date: 17 Aug 89 18:00:57 GMT
References: <28@dgis.daitc.mil>
Sender: usenet@agate.berkeley.edu (USENET Administrator;;;;ZU44)
Reply-To: hughes@math.berkeley.edu (Eric Hughes)
Organization: UC Berkeley Math Dept
Lines: 50
In-reply-to: jkrueger@dgis.daitc.mil (Jonathan Krueger)

In article <28@dgis.daitc.mil>, jkrueger@dgis (Jonathan Krueger) writes:
>Therefore I'd like to divide the question: efficient implementation of
>a data model versus inherently bad performance of some models for some
>operations.  Recent traffic has confused the two issues without
>addressing either.

"Inherently bad performance" is a slippery term.  It is important to
remember that a database model is an abstraction, and that there are
many different implementations of the same abstraction.  To speak of
"inherently bad performace" would require that one find some predicate
which is invariant over all possible implementations and show that
such an invariant has certain undesirable minimum time and/or space
growth rates.  I can see both information theory and complexity theory
being useful in this regard, but as far as I know no work has been
done in this area.  For example, one could calculate some measure of
the information present in a query and relate that to some critical
path of operation to get a lower bound.
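
For instance (a purely illustrative sketch in Python; the function
names and numbers below are my own, not results from anywhere): a
query that singles out one row among n distinct rows conveys at least
log2(n) bits, so any implementation answering it must extract at
least that much information from its data structures, whereas a naive
flat-file scan touches all n records.

# Purely illustrative back-of-envelope calculation (all names here are
# hypothetical): the information-theoretic floor for singling out one
# row among n, versus the records a naive flat-file scan must touch.
import math

def min_bits_to_identify(n_rows):
    """Lower bound, in bits, on the information a point query conveys."""
    return math.log2(n_rows)

def records_touched_by_scan(n_rows):
    """Worst-case records examined by an unindexed flat-file scan."""
    return n_rows

for n in (10**3, 10**6, 10**9):
    print(f"n={n}: at least {min_bits_to_identify(n):.0f} bits of work, "
          f"scan touches {records_touched_by_scan(n)} records")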

In short, I don't think it makes much sense to talk of "inherently"
bad performance, at least for now.

The situation is not hopeless, though, because this kind of operation
counting is possible once one fully specifies the type of
implementation.  For example, one could analyze the performance of an
RDBMS whose implementation consisted of nothing but flat files with no
indices.  One can get time and space estimates in terms of the file
sizes.  By adding indices to the implementation one can reduce some
O(n) operations to O(log n), and by adding pointer rings to represent
a one-to-many relationship (redundant data for speed's sake) one can
reduce some O(log n) operations to O(log log n) (which, for most
applications, is as good as constant).  This is not inherent
performance, but it might be termed "practically inherent."
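
To make the flat-file-versus-index comparison concrete, here is a
minimal sketch (the table, column names, and data are invented for
illustration): the same one-to-many lookup done as an O(n) scan of an
unindexed flat file and as an O(log n + k) binary search against a
sorted index.

import bisect

# Flat file: a list of (order_id, customer_id) records, invented data.
orders = [(1, 42), (2, 7), (3, 42), (4, 19), (5, 42)]

def orders_for_customer_scan(customer_id):
    """O(n): examine every record in the flat file."""
    return [oid for oid, cid in orders if cid == customer_id]

# Index on customer_id, built once in O(n log n):
# sorted (customer_id, order_id) pairs plus the key column for searching.
index = sorted((cid, oid) for oid, cid in orders)
keys = [cid for cid, _ in index]

def orders_for_customer_indexed(customer_id):
    """O(log n + k): binary search to the first match, then read the k hits."""
    i = bisect.bisect_left(keys, customer_id)
    hits = []
    while i < len(index) and index[i][0] == customer_id:
        hits.append(index[i][1])
        i += 1
    return hits

assert orders_for_customer_scan(42) == orders_for_customer_indexed(42) == [1, 3, 5]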

Let's be careful when we talk about the performance of a model.  Only
programs have performance measures per se, and so to measure the
performance of a model one must somehow relate it to the performance
of some programs.

Finally, I have a two-part conjecture.  First, asymptotic performance
measurements (i.e. "order of" type measurements) of the relational and
object-oriented models (and all their children :-) are identical.
Second, no model has a set of constants for such asymptotic
measurements (i.e. the constants that get rid of the "order of"
symbol) whose inverses dominate (i.e. are everywhere better than) the
respective constants of all other models.  To paraphrase: both models
have inherently the same order-of-magnitude performance, but certain
models are better suited to certain operations than others.

Eric Hughes
hughes@math.berkeley.edu   ucbvax!math!hughes