2020-12-16 07:12

CP363 : Functional Dependencies

There are two basic design methodologies for designing a database:

Bottom-Up Design Methodology: Starts with all possible individual attributes for a database and builds relations from them. Rarely used.
Top-Down Design Methodology: Starts with tables and relations that have already been defined conceptually, and further refines the database design. The concepts of functional dependency and normalization are used with this methodology as criteria for proper database design.

What is a "good" relational database design?

Independent data is kept in separate tables, i.e. each table should consist of a primary key and a set of mutually independent attributes.

Functional dependency and normalization are tools for good relational database design.

Functional Dependency

A set of attributes Y is functionally dependent on a set of attributes X ( X→Y ) when Y can be uniquely determined by X. This can be read as "X functionally determines Y", or as "Y is functionally determined by X". Note that the converse, Y→X, is not necessarily true.

This does not mean that each value of Y belongs to only one value of X. For example, although {Student_ID,Course}→Grade (i.e. a student in a given course may have only one grade, for example an 'A'), other students may also have an 'A' in the same course. The FD simply means that a particular student in a particular course has one and only one grade. Likewise, Grade→{Student_ID,Course} is clearly not true.

A functional dependency is a type of constraint on attributes that arises out of the meaning of those attributes. In other words, in any given tuple the value of one set of attributes depends on the value of another set of attributes. FDs depend on the semantics, or rules, underlying the relation in question. The previous example, {Student_ID,Course}→Grade, would turn out to be false if a student is allowed to take the same course again in a different term. In that case the functional dependency should be {Student_ID,Course,Term}→Grade instead. It is not always easy to infer a functional dependency from data: no number of matching tuples proves an FD, although it takes only a single violation to falsify it, i.e. show that it is false.
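
The idea that a single violation falsifies an FD can be sketched in Python. This is only an illustration: the function name find_violation, the Term values, and the second grade are hypothetical and are not part of the course data.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]


def find_violation(rows: List[Row], lhs: List[str], rhs: List[str]) -> Optional[Tuple[Row, Row]]:
    """Return the first pair of rows that agree on lhs but differ on rhs, or None."""
    seen: Dict[tuple, Row] = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if key in seen:
            first = seen[key]
            # Same X values but different Y values: the FD X -> Y is falsified.
            if tuple(first[a] for a in rhs) != tuple(row[a] for a in rhs):
                return first, row
        else:
            seen[key] = row
    return None


# Hypothetical data: one student takes CP363 twice, in two different terms.
rows = [
    {"Student_ID": "999568440", "Course": "CP363", "Term": "2019F", "Grade": "F"},
    {"Student_ID": "999568440", "Course": "CP363", "Term": "2020F", "Grade": "B"},
]

print(find_violation(rows, ["Student_ID", "Course"], ["Grade"]) is not None)  # True
print(find_violation(rows, ["Student_ID", "Course", "Term"], ["Grade"]))      # None
```

A result of None only says that these rows contain no counterexample. It does not prove that the FD holds in general.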

Based upon the first example from the Anomalies web page:

Student_ID	Course	Dept	Last_Name	First_Name	Instructor	Grade
999568440	CP363	Computing	Snord	Cranston	D. Brown	F
999568440	CP400	Computing	Snord	Cranston	T. Yang	A-
999568440	CP102	Computing	Snord	Cranston	D. Brown	C
987859400	PC466	Physics	Zzap	Zachary	B. Pavlova	D
987859400	HP202	History	Zzap	Zachary	S. Zeller	D
987859400	CP102	Computing	Zzap	Zachary	D. Brown	B+
005689250	CP102	Computing	Snord	Lillibelle	D. Brown	A+

Based upon the data we can see, we can claim the following:

{Last_Name,Course}→Grade: is false, as the same combination of last name and course occurs with different grades: Snord in CP102 has both a 'C' and an 'A+'.
Student_ID→{Last_Name,First_Name}: is true, and there is no reason to doubt that the semantics behind this matches our observation. It follows from this that both Student_ID→Last_Name and Student_ID→First_Name are true by the decomposition rule.
Instructor→Course: is false, as one instructor is involved in more than one course: D. Brown teaches both CP363 and CP102.
Course→Instructor: may be true, as each value of Course determines only one value of Instructor. However, if the semantics of the situation claimed that a course may have multiple instructors then this FD would be false, despite the fact that the data we can see supports it. Without knowing the semantics we cannot be sure of an FD.
Instructor→Dept: may be true, if the semantics are that an instructor may belong to only one department.
Course→Dept: is true, as a course is offered by only one department. It could also be claimed that if both Course→Instructor and Instructor→Dept are true then Course→Dept must be true by the transitive rule for functional dependencies.
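
The claims above can be checked mechanically against the sample table. The following Python sketch uses a helper named fd_holds, which is a name chosen here for illustration and not part of any library. A result of True only means that the visible rows do not contradict the FD; the semantics still decide whether it really holds.

```python
from collections import defaultdict

COLUMNS = ["Student_ID", "Course", "Dept", "Last_Name", "First_Name", "Instructor", "Grade"]
DATA = [
    ("999568440", "CP363", "Computing", "Snord", "Cranston", "D. Brown", "F"),
    ("999568440", "CP400", "Computing", "Snord", "Cranston", "T. Yang", "A-"),
    ("999568440", "CP102", "Computing", "Snord", "Cranston", "D. Brown", "C"),
    ("987859400", "PC466", "Physics", "Zzap", "Zachary", "B. Pavlova", "D"),
    ("987859400", "HP202", "History", "Zzap", "Zachary", "S. Zeller", "D"),
    ("987859400", "CP102", "Computing", "Zzap", "Zachary", "D. Brown", "B+"),
    ("005689250", "CP102", "Computing", "Snord", "Lillibelle", "D. Brown", "A+"),
]
ROWS = [dict(zip(COLUMNS, values)) for values in DATA]


def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs while differing on rhs."""
    images = defaultdict(set)
    for row in rows:
        images[tuple(row[a] for a in lhs)].add(tuple(row[a] for a in rhs))
    # Every distinct lhs value must map to exactly one rhs value.
    return all(len(values) == 1 for values in images.values())


checks = [
    (["Last_Name", "Course"], ["Grade"]),             # False: Snord/CP102 has C and A+
    (["Student_ID"], ["Last_Name", "First_Name"]),    # True
    (["Instructor"], ["Course"]),                     # False: D. Brown has CP363 and CP102
    (["Course"], ["Instructor"]),                     # True on this sample
    (["Instructor"], ["Dept"]),                       # True on this sample
    (["Course"], ["Dept"]),                           # True on this sample
]
for lhs, rhs in checks:
    print(",".join(lhs), "->", ",".join(rhs), ":", fd_holds(ROWS, lhs, rhs))
```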

Some FD Rules:

Reflexive Rule: If X is a superset of Y, then X→Y
Augmentation Rule: If X→Y, then XZ→YZ
Transitive Rule: If X→Y and Y→Z then X→Z
Decomposition Rule: If X→YZ then X→Y and X→Z
Union Rule: If X→Y and X→Z then X→YZ
Pseudotransitive Rule: If X→Y and WY→Z then WX→Z
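
One way to see what these rules imply together is to compute the closure of a set of attributes, i.e. every attribute that can be reached by applying the given FDs repeatedly. The sketch below is a standard attribute-closure loop written in Python. The function names closure and implies are chosen here for illustration, and each FD is assumed to be given as a pair of sets (left side, right side).

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left side is already known,
            # everything on the right side is determined as well.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from the FDs in fds."""
    return set(rhs) <= closure(lhs, fds)


# Transitive rule: Course -> Instructor and Instructor -> Dept give Course -> Dept.
fds = [({"Course"}, {"Instructor"}), ({"Instructor"}, {"Dept"})]
print(sorted(closure({"Course"}, fds)))      # ['Course', 'Dept', 'Instructor']
print(implies(fds, {"Course"}, {"Dept"}))    # True
print(implies(fds, {"Dept"}, {"Course"}))    # False

# Pseudotransitive rule: X -> Y and WY -> Z give WX -> Z.
abstract = [({"X"}, {"Y"}), ({"W", "Y"}, {"Z"})]
print(implies(abstract, {"W", "X"}, {"Z"}))  # True
```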

A minimal set of dependencies is a set of dependencies that has:

a standard form where every FD has a single attribute on the right-hand side
no redundant attributes on the left-hand side of any FD in the set, i.e. no attribute can be removed from a left-hand side without changing what the set implies
no superfluous FDs, i.e. no FD in the set can be derived from the other FDs in the set using the FD rules stated above

The set of FDs that describes a set of relations in a database is not necessarily unique. A minimal cover is a set of FDs that is equivalent to the original set and meets the above requirements; the same set of FDs may have more than one minimal cover.
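
A minimal cover can be computed by applying the three requirements in order. The Python sketch below is one possible way to do this, assuming each input FD is given as a pair of sets (left side, right side). The names minimal_cover and closure and the input list fds are chosen here for illustration. Processing the FDs in a different order can produce a different, equally valid minimal cover.

```python
def closure(attrs, fds):
    """Attributes determined by attrs; fds are (left-side set, single attribute) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, attr in fds:
            if lhs <= result and attr not in result:
                result.add(attr)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: standard form, a single attribute on every right-hand side.
    cover = [(frozenset(lhs), a) for lhs, rhs in fds for a in sorted(rhs)]
    # Step 2: drop attributes from a left side when the rest still determines the right side.
    for i, (lhs, attr) in enumerate(cover):
        for b in sorted(lhs):
            reduced = lhs - {b}
            if reduced and attr in closure(reduced, cover):
                lhs = reduced
                cover[i] = (lhs, attr)
    cover = list(dict.fromkeys(cover))  # remove exact duplicates, keep order
    # Step 3: drop any FD that the remaining FDs already imply.
    i = 0
    while i < len(cover):
        lhs, attr = cover[i]
        rest = cover[:i] + cover[i + 1:]
        if attr in closure(lhs, rest):
            cover = rest
        else:
            i += 1
    return cover


fds = [
    ({"Course"}, {"Instructor"}),
    ({"Instructor"}, {"Dept"}),
    ({"Course"}, {"Dept"}),                               # implied by the two FDs above
    ({"Student_ID"}, {"Last_Name", "First_Name"}),
    ({"Student_ID", "Course"}, {"Grade", "Last_Name"}),   # Course is not needed for Last_Name
]
for lhs, attr in minimal_cover(fds):
    print("{" + ",".join(sorted(lhs)) + "} -> " + attr)
```

For this input the sketch removes Course→Dept, because it follows from the transitive rule, and reduces {Student_ID,Course}→Last_Name to Student_ID→Last_Name, which is already present.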

Why are functional dependencies useful? An FD represents an integrity constraint, i.e. semantics can be rendered into database constraints. Further, FDs can aid in the normalization of a set of relations. Properly defining a minimal set of FDs can help remove anomalies from table and relationship designs. For example, does the sample table on this page represent a minimal set of functional dependencies for the data used?