Path: utzoo!attcan!uunet!mcvax!cernvax!ethz!marti
From: marti@ethz.UUCP (Robert Marti)
Newsgroups: comp.databases
Subject: Re: A few words on the "normalization"
Message-ID: <1737@ethz.UUCP>
Date: 10 Aug 89 07:28:26 GMT
References: <1707@ethz.UUCP> <36270009@hpindwa.HP.COM>
Organization: ETH Zuerich
Lines: 40

In article <36270009@hpindwa.HP.COM>, donovan@hpindwa.HP.COM (Donovan Hsieh)
writes as a contribution to our ongoing (never-ending :-) discussion:
> In theory, the suggested way to design a fully normalized relational database
> scheme is as follows: [ ... ]
> But in reality, the difficulties are:
> 	1. You'll probably never be able to prove that you have obtained
> 	   all possible FDs & MVDs for a real-world enterprise, except
> 	   for toy databases. [ ... ]
I agree.  Absolutely.  I never claimed anything else in my previous
postings.  The point I was trying to make was:  If a designer is not
aware of certain relationships (dependencies) between data items, or
is mistaken about the kind of relationship between them (e.g.,
one-to-many instead of many-to-many, or an FD instead of an MVD), the
result will be an incorrect model of the world no matter which design
methodology is used.
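To illustrate the FD-versus-MVD pitfall, here is a toy sketch (my own
example, written in Python; think of it as executable pseudo-code)
which checks both kinds of dependency on a relation stored as a list
of dictionaries:

def satisfies_fd(rel, X, Y):
    # X -> Y: equal X-values force equal Y-values.
    seen = {}
    for t in rel:
        key = tuple(t[a] for a in X)
        val = tuple(t[a] for a in Y)
        if seen.setdefault(key, val) != val:
            return False
    return True

def satisfies_mvd(rel, X, Y):
    # X ->> Y: within each X-group, the Y-values and the values of
    # the remaining attributes Z must form a full cross product.
    Z = [a for a in rel[0] if a not in X and a not in Y]
    groups = {}
    for t in rel:
        groups.setdefault(tuple(t[a] for a in X), []).append(t)
    for group in groups.values():
        ys = {tuple(t[a] for a in Y) for t in group}
        zs = {tuple(t[a] for a in Z) for t in group}
        pairs = {(tuple(t[a] for a in Y), tuple(t[a] for a in Z))
                 for t in group}
        if len(pairs) != len(ys) * len(zs):
            return False
    return True

# An employee with several skills and several dependents:
# skill does NOT functionally depend on emp, but emp ->> skill holds.
rel = [{"emp": "jones", "skill": "cobol", "dep": "sam"},
       {"emp": "jones", "skill": "cobol", "dep": "sue"},
       {"emp": "jones", "skill": "sql",   "dep": "sam"},
       {"emp": "jones", "skill": "sql",   "dep": "sue"}]
print(satisfies_fd(rel, ["emp"], ["skill"]))   # False
print(satisfies_mvd(rel, ["emp"], ["skill"]))  # True

A designer who records emp -> skill here has simply mismodeled the
world, and no normalization algorithm will notice.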

> 	3. [ ... ] the net result of a mechanical normalization process is
>          not always the most efficient schema for real-world queries.
Again, I agree.  I would even go further and claim that it hardly
ever produces the most efficient schema.
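A hypothetical illustration (mine, not Donovan's): take a relation
R(supp, city, status) with the FDs supp -> city and city -> status.
Normalization splits R into (supp, city) and (city, status), and the
presumably frequent query "what is the status of supplier s?" now
costs a join -- modeled here with two Python dictionaries standing in
for the decomposed relations:

supp_city   = {"s1": "paris", "s2": "oslo"}   # supp -> city
city_status = {"paris": 10, "oslo": 20}       # city -> status

def status_of(supp):
    # Two lookups, i.e. a join over the decomposed relations, where
    # the unnormalized R would have answered with a single access.
    return city_status[supp_city[supp]]

print(status_of("s1"))   # 10

The normalized schema is the "better" one syntactically, but whether
it is the more efficient one depends entirely on the query mix.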

>          This is because all normalization algorithms are purely
>          "syntactic" rather than "semantic".  They have no knowledge of
>          the real-world database requirements (from the perspective of
>          query execution efficiency or referential integrity).
Well, in the end, all information in the computer is purely syntactic,
even if you use a so-called semantic model.  In order to capture what
you call "real-world database requirements", you'd probably want a
design tool/algorithm which also takes into consideration
- arbitrary integrity constraints (including referential integrity)
- expected data volumes (if known)
- "canned" update transactions and queries and their frequency (if known)
(The paper by Finkelstein et al. in ACM TODS, Vol. 13, No. 1, March
1988 presents a design tool which is a step in this direction; a toy
sketch of the idea follows below.)
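To give a flavor of what such a tool might do, here is a toy sketch
(entirely my own; all names and cost figures are made up): enumerate
candidate schemas and pick the one that minimizes the
frequency-weighted cost of the canned queries and transactions.

# Rough cost (say, page accesses) of each canned operation under
# each candidate schema.  A real tool would derive these numbers
# from data volumes and access paths instead of guessing them.
candidates = {
    "fully_normalized": {"lookup_status": 2, "update_city": 1},
    "denormalized":     {"lookup_status": 1, "update_city": 5},
}

# Expected executions per day of each canned operation.
workload = {"lookup_status": 10000, "update_city": 50}

def weighted_cost(costs):
    return sum(workload[op] * c for op, c in costs.items())

best = min(candidates, key=lambda s: weighted_cost(candidates[s]))
print(best)   # "denormalized" here, since lookups dominate

Integrity constraints would enter as hard restrictions on which
candidates are admissible at all; the sketch leaves that out.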

-- 
Robert Marti                      Phone:      +41 1 256 52 36
Institut fur Informationssysteme
ETH-Zentrum                       CSNET/ARPA: marti%inf.ethz.ch@relay.cs.net
CH-8092 Zurich, Switzerland       UUCP:       ...uunet!mcvax!ethz!marti