CP363 : SQL Subqueries

A subquery is simply a query within a query. The inner query is executed first, then the outer query can use the results of the inner query in its processing. The inner query can be thought of as a virtual table.


Subquery as a Temporary Table

Examples

The following example finds all students who have repeated a course more than once:

SELECT s.Student_ID, last_name, first_name, Course, Repeats
FROM Student AS s INNER JOIN
  ( SELECT Student_ID, Course, COUNT(*) AS Repeats
  FROM Student_Course
  GROUP BY Student_ID, Course )
  AS Course_Repeats
ON s.Student_ID = Course_Repeats.Student_ID
WHERE Repeats > 1

In this example, the subquery:

  ( SELECT Student_ID, Course, COUNT(*) AS Repeats
  FROM Student_Course
  GROUP BY Student_ID, Course )
  AS Course_Repeats

creates a temporary table that lists students and the number of times they have taken any course. This temporary table is given an alias, Course_Repeats, so that it can be joined against the Student table in order to get the student names as well as the course name and count. The WHERE clause can be applied against any of the resulting fields, and in this case is applied to the Repeats field.


Subquery With the IN Clause

The IN clause is part of a WHERE clause that restricts the result of a SELECT to tuples that contain values in a list. NOT IN restricts the result set to tuples that do not have values in a list. Example:

SELECT last_name, first_name, title
  FROM member
  WHERE member_id NOT IN
    ( SELECT member_id FROM pub )

Here the subquery gets a list of all member_ids from pub, i.e. a list of members who have actually published something. The outer query then compares all member_ids in member to the contents of this list and returns the names of those members who are not IN this list.

The opposite result (i.e. the names of members who have publications) could be generated with the same subquery, but without the NOT:

SELECT last_name, first_name, title
  FROM member
  WHERE member_id IN
    ( SELECT member_id FROM pub )

However, the same thing could be accomplished with an simple join:

SELECT last_name, first_name, title
  FROM member AS m INNER JOIN pub AS p
  ON m.member_id = p.member_id

In these examples it is the need to generate a negative result set (which members have not published) that makes the use of a subquery necessary.

The IN clause works only with a subquery that returns a result set consisting of the values of a single attribute. It also works with a list of literals, as in these examples:

SELECT last_name, first_name, Title
  FROM member
  WHERE member_id in ( 9, 12, 35 )

or

SELECT member_id
  FROM member
  WHERE last_name IN ( 'KILGOUR', 'BROWN' )

Subqueries With the EXISTS Clause

The use of the EXISTS clause is similar to the use of IN, except that the subquery need not restrict itself to returning the list of values of a single attribute. So long as any result other than NULL is returned by the subquery, the EXISTS condition becomes true. Like IN, EXISTS may be used with a NOT.

SELECT last_name, first_name, Title FROM member AS m
  WHERE NOT EXISTS
    ( SELECT * FROM pub WHERE member_id = m.member_id )

This subquery returns all publications belonging to the member selected in the outer SELECT statement. Should the subquery return a NULL, the outer query returns a tuple from the member table, i.e. it lists only members who do not have publications. This is equivalent to the first example of the use of the IN clause.

These clauses may be mixed, as in this example:

SELECT DISTINCT last_name, first_name, title
  FROM member AS m
  INNER JOIN member_keyword AS m_k
  ON m.member_id = m_k.member_id
  WHERE NOT EXISTS
    ( SELECT * FROM pub
      WHERE member_id = m.member_id )
  AND m_k.keyword_id IN ( 7, 8 )

This selects all members who do not have publications as well as having expertise in "Arms Control and Non-Proliferation Studies" (keyword_id = 7) and "Strategic Studies" (keyword_id = 8).

The following selection statement produces the same result:

SELECT DISTINCT last_name, first_name, title
  FROM member AS m
  WHERE member_id NOT IN
    ( SELECT member_id FROM pub )
  AND EXISTS
    ( SELECT * FROM member_keyword
      WHERE keyword_id IN ( 7, 8 )
      AND member_id = m.member_id )

Another example:

SELECT last_name FROM member AS m
  WHERE member_id IN
    ( SELECT member_id
      FROM member_keyword
      WHERE keyword_id = 12 )
  AND member_id NOT IN
    ( SELECT member_id
      FROM member_keyword
      WHERE keyword_id = 19 )

This example selects the last name of all members who have expertise in "Military History" (keyword_id = 12), but who do not have expertise in "Military Alliances" (keyword_id = 19).


Subqueries as Fields

Subqueries can be used to generate fields or columns similar to how they can generate a temporary table. The subquery must generate a single value rather than a table or list of values, as in this example:

SELECT s.Student_ID, last_name, first_name,
  ( SELECT COUNT(Course)
  FROM Student_Course AS sc
  WHERE s.Student_ID = sc.Student_ID ) AS TotalCourses
FROM Student AS s

where the subquery:

  ( SELECT COUNT(Course)
  FROM Student_Course AS sc
  WHERE s.Student_ID = sc.Student_ID ) AS TotalCourses

counts how many courses an individual student has taken. This generates a single number for each student, and this number can be used, with its alias, as a field in the surrounding query.


Subqueries with Subqueries

Subqueries may have subqueries of their own:

SELECT last_name FROM member AS m
  WHERE member_id IN
    ( SELECT member_id
      FROM member_keyword
      WHERE keyword_id = 12 )
  AND member_id NOT IN
    ( SELECT member_id
      FROM member_keyword
      WHERE keyword_id IN ( 18, 19, 20 ) )

This is similar to the previous example except that the member must not have expertise in either "Information Warfare" (18) or "Non-Traditional Threats to Security" (20) as well.

The use of subqueries allows for the creation of very complex and powerful selection statements.