CP363 : Normalization

What is Normalization?


Why Normalize?

Remember that these three anomalies can corrupt the contents of a database. (We will see some examples momentarily.) By reducing data duplication and eliminating these anomalies, we intend to minimize the possibility of data corruption and simplify the development, maintenance, and expandability of the database.


How Do We Normalize?

It is important to remember that normalization techniques are based not only on algorithms but also on the semantics of the data in the database. If we do not understand the data domains and relationships, we cannot normalize properly.

A normalized schema should have the properties:


Normal Forms

The following normal forms are listed from least to most normalized:

Normal Forms 1 through 3 are based upon primary keys. Every relation is assumed to have a primary key made up of one or more attributes.

Reminder: A superkey is a set of attributes that uniquely identifies a tuple in a relation. A key is a minimal superkey, i.e. a superkey from which no attribute can be removed without losing uniqueness. A relation may have several keys; these are called candidate keys. The candidate key chosen to identify tuples is the primary key.
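
The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation, its attribute names, and the helper functions is_superkey and candidate_keys are all hypothetical and not part of the course material. Note also that checking sample rows can only show that an attribute set is consistent with being a key for that sample. Whether it really is a key depends on the semantics of the data, not on any one instance.

```python
from itertools import combinations

# Hypothetical sample relation: each dict is one tuple.
ROWS = [
    {"student_id": 1, "email": "a@example.com", "name": "Ann", "program": "CS"},
    {"student_id": 2, "email": "b@example.com", "name": "Bob", "program": "CS"},
    {"student_id": 3, "email": "c@example.com", "name": "Ann", "program": "Math"},
    {"student_id": 4, "email": "d@example.com", "name": "Ann", "program": "CS"},
]


def is_superkey(rows, attrs):
    """Return True if no two rows agree on every attribute in attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows):
    """Return the minimal superkeys of the sample, smallest sets first."""
    attributes = sorted(rows[0])
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set containing an already-found key is a superkey,
            # but it is not minimal, so it is not a key.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


if __name__ == "__main__":
    print(is_superkey(ROWS, ("student_id", "name")))  # superkey, not minimal
    print(is_superkey(ROWS, ("name", "program")))     # two rows share Ann/CS
    print(candidate_keys(ROWS))
```

In this sample, {student_id, name} is a superkey but not a key, because name is superfluous. Both student_id and email come out as candidate keys, and a designer would pick one of them as the primary key.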