Lesson 4: Relational Algebra

Introduction

Relational Algebra defines transformations that convert an input relation (table) to an output relation. In other words, it gives the mathematical basis for how SQL database are queried. In practice not all algebraic operations are implemented in practical database, and many databases have extensions that are not modeled by relational algebra. In the course we will not do any formal proofs on this algebra nor look at how it is implemented 'under the hood'.

Reference

Refers to an existing relation.

Algebra
R
SQL
SELECT * FROM table
Examples
Select Entire Table
              
SELECT * FROM member
            

Returns every tuple in the member table. (No particular order is requested or implied.)

Selection

Selects specific tuples from a relation.

A selection condition may compare:

Conditions may use the comparison operators =, <, <=, >, >=, and != or <>, and the Boolean operators AND, OR, NOT, etc.

Algebra
σ<selection condition>(R)
(σ : lower-case sigma)
SQL
SELECT * FROM table WHERE condition
Examples
Select Member by Surname
              
SELECT * FROM member
  WHERE memberSurname = 'Copp'
            

Returns all tuples in member where the attribute memberSurname is equal to the constant value 'Copp'.

Select Member by Birth Year
              
SELECT * FROM member
  WHERE YEAR(memberBirthDate) < 1950
            

Applies the YEAR function to the memberBirthDate attribute and compares the result to the constant value 1950.

Select Member by Age
              
SELECT * FROM member
  WHERE YEAR(memberBirthDate) < 1950
  AND memberInstitution LIKE '%Laurier%'
            

The LIKE operator allows a partial comparison to a value. The % symbol is a string wildcard in SQL. Thus the comparison:
attribute = 'Laurier'
compares an attribute value exactly against the string 'Laurier', while:
attribute LIKE '%Laurier%'
looks for the string 'Laurier' anywhere within the attribute value. Thus it would match 'Wilfrid Laurier University' or 'Laurier College' since both string contain the substring 'Laurier'.

Uses the boolean operator AND to return all tuples in member where a member was born before 1950 and is associated with a institution containing 'Laurier' in its name.

Select Loan That Is Not Returned
              
SELECT * FROM Loan
  WHERE Date_Returned IS NULL
            

Returns all tuples in Loan where Date_Returned does not have a value, i.e. the item has not yet been returned. (Comparisons to NULL are made with the IS operator.)

Select Loan Returned the Same Day it is Borrowed
              
SELECT * FROM Loan
  WHERE Date_Returned = Date_Borrowed
            

Returns all the tuples in Loan where the item was returned on the same day it was borrowed by comparing the two attributes Date_Returned and Date_Borrowed.

Select Long Term Loans
              
SELECT * FROM Loan
  WHERE DATEDIFF(Date_Returned, Date_Borrowed) >= 365
            

DATEDIFF is a MySQL function that returns the number of days between two DATE values. Thus this statement returns all tuples from Loan where the item borrowed was out for at least one year.

Select Long Term Loans Still Out
              
SELECT * FROM Loan
  WHERE Date_Returned IS NULL
  AND DATEDIFF(CURRENT_DATE, Date_Borrowed) >= 365
            

CURRENT_DATE is a MySQL variable that returns the current date. Thus this statement returns all tuples from Loan where the item borrowed has been out for at least one year and is not yet returned.

Projection

Drops attributes from a result set. Attributes are returned in the same order in which they are listed. (The order of the tuples is unchanged!)

Selection and projection may be used together.

Algebra
π<attribute list>(R)
(π: lower-case pi)
SQL
SELECT attribute list FROM table
Examples
Extract Only Names
              
SELECT memberSurname, memberForename FROM member
            

Returns only the values of the memberSurname and memberForename attributes in the member table.

Extract Only Distinct Names

Note that although the definition of projection is that no duplicate tuples are returned when only non-key attributes are listed, this is not true for SQL. To return only non-duplicate tuples you must use the DISTINCT clause.

              
SELECT DISTINCT memberSurname FROM member
            

Returns each value of memberSurname only once.

Extract Specific Last Names
              
SELECT memberSurname, memberForename FROM member
  WHERE memberSurname = 'Copp'
            

Returns the full names of all members whose surname is 'Copp'.

Renaming

Attributes and relations can be renamed. This is often useful in making the results of queries easier to read and understand. Renamed attributes and relations are said to have aliases.

Algebra
ρ< new attribute list>(R) ρS< new attribute list>(R) ρS(R)
(ρ: lower-case rho)
SQL
SELECT attribute AS new attribute FROM table
SELECT * FROM old table AS new table
Examples
Extract Concatenated Names with New Identifier
              
SELECT CONCAT(memberSurname, ', ',  memberForename) AS name FROM member
            

Returns the member name in the form 'Copp, Terry' as the attribute name. The 'CONCAT()' function concatenates string in MySQL. (Most other SQL database dialects use '||' as the concatenation operator.)

Rename Table
              
SELECT * FROM member AS M
            

member is temporarily renamed to M.

(See the description of Cross Product for an example of when a result set can be renamed in SQL.)

Cross Product

Pairs all tuples from two tables. This is generally not useful on its own, but becomes useful when further selection is applied to its result.

Algebra
R<attribute list> × S<attribute list>
(×: cross-product)
SQL
SELECT * FROM table1, table2
Examples
All Possible Combinations
              
SELECT * FROM member, broad
            

Returns all possible pair combinations of members and broads, i.e. all members are listed against all broads (where a broad represents an area of expertise).

Cross Product with Projection
              
SELECT memberSurname, broadDesc FROM member, broad
            

Projection may be used within a cross product.

Neither of these examples make much sense in the context of this database, unless, for example, you were trying to match every possible member against every possible broad expertise for display on a web page. Cross products have very limited use, and can create very large result sets unless carefully limited.

Set Union

Merges the result sets of two selection statements. All the tuples in both result sets are included in the final result set.

Algebra
RS
(∪: union)
SQL
SQL: SELECT * FROM table UNION SELECT * FROM table
Examples
Union with Projection and Selection
              
SELECT memberSurname, memberInstitution FROM member
  WHERE memberInstitution LIKE '%LCMSDS%'
UNION
SELECT memberSurname, memberInstitution FROM member
  WHERE memberInstitution LIKE '%Laurier%'
            

The intent is to return a list of member's surnames who are associated with LCMSDS or Wilfrid Laurier University. Duplicate last names are eliminated. Both selection statements must return the same attributes.

(Note intent. Looking for 'LCMSDS' and 'Laurier' as part of the institution name includes any institution with the words 'LCMSDS' or 'Laurier' in them, which would not necessarily be just these institutions.)

Union with Duplicates
              
SELECT memberSurname, memberInstitution FROM member
  WHERE memberInstitution LIKE '%LCMSDS%'
UNION ALL
SELECT memberSurname, memberInstitution FROM member
  WHERE memberInstitution LIKE '%Laurier%'
            

With the use of UNION ALL duplicates are not eliminated.

Set Intersection

Merges the result sets of two selection statements. Only the tuples common to both result sets are included in the final result set.

Algebra
RS
(∩: intersection)
SQL
SELECT * FROM table INTERSECT SELECT * FROM table
Examples
Intersection with Projection and Selection
              
SELECT memberSurname, memberTitle, memberInstitution FROM member
  WHERE memberInstitution NOT LIKE '%University%'
INTERSECT
SELECT memberSurname, memberTitle, memberInstitution FROM member
  WHERE memberTitle = 'Dr.'

            

Returns a set of all members who work at UW and have the title 'Dr.'. Duplicate member names are eliminated. Add the ALL clause to keep any duplicates.

This example could be rewritten as:

Intersection Equivalent
              
SELECT memberSurname, memberTitle, memberInstitution FROM member
  WHERE memberInstitution NOT LIKE '%University%'
  AND memberTitle = 'Dr.'

            

This version cannot produce duplicates since it is a single SELECT and not the intersection of two selections.

Set Difference

The final result set contains only those tuples from the first result set that are not in the second result set.

Algebra
R - S
SQL
SELECT * FROM table EXCEPT SELECT * FROM table
Examples
Difference with Projection and Selection
              
SELECT memberSurname, memberTitle, memberInstitution FROM member
  WHERE memberInstitution LIKE '%University%'
EXCEPT
SELECT memberSurname, memberTitle, memberInstitution FROM member
  WHERE memberTitle = 'Dr.'

            

Returns a list of members who are associated with a university but do not have the title 'Dr.'

You might believe this example could be rewritten as:

Exception NOT Equivalent
              
SELECT memberSurname, memberTitle, memberInstitution FROM member
  WHERE memberInstitution LIKE '%University%'
  AND memberTitle != 'Dr.'

            

But it does not return the same results! Why? Because of the comparison:
memberTitle != 'Dr.'
The members who have no title have a NULL value stored in memberTitle, not an empty string. The comparison:
memberTitle != 'Dr.'
compares only strings, attributes with a NULL value are ignored.

To properly capture members with no title, the SELECT must be written as:

              
SELECT memberSurname, memberTitle, memberInstitution FROM member
  WHERE memberInstitution LIKE '%University%'
  AND  (memberTitle != 'Dr.' OR memberTitle IS NULL)

            

Note the use of:
IS NULL
for the NULL comparison. (This is similar to the use of is None in Python for similar reasons.

There are subtleties in SQL, like every other language. Watch out in particular for NULL values.