A subquery is simply a query within a query. The inner query is executed first, then the outer query can use the results of the inner query in its processing. The inner query can be thought of as a virtual table.
Examples
The following example finds all students who have repeated a course more than once:
SELECT s.Student_ID, last_name, first_name, Course, Repeats FROM Student AS s INNER JOIN ( SELECT Student_ID, Course, COUNT(*) AS Repeats FROM Student_Course GROUP BY Student_ID, Course ) AS Course_Repeats ON s.Student_ID = Course_Repeats.Student_ID WHERE Repeats > 1
In this example, the subquery:
( SELECT Student_ID, Course, COUNT(*) AS Repeats FROM Student_Course GROUP BY Student_ID, Course ) AS Course_Repeats
creates a temporary table that lists students and the number of times they
have taken any course. This temporary table is given an alias, Course_Repeats
,
so that it can be joined against the Student
table in order to
get the student names as well as the course name and count. The WHERE
clause can be applied against any of the resulting fields, and in this case
is applied to the Repeats
field.
IN
Clause
The IN
clause is part of a WHERE
clause that
restricts the result of a SELECT
to tuples that contain values
in a list. NOT IN
restricts the result set to tuples that do
not have values in a list. Example:
SELECT last_name, first_name, title FROM member WHERE member_id NOT IN ( SELECT member_id FROM pub )
Here the subquery gets a list of all member_ids from pub,
i.e. a list of members who have actually published something. The outer
query then compares all member_ids in member
to the contents of this list and returns the names of those members who are
not IN
this list.
The opposite result (i.e. the names of members who have publications) could
be generated with the same subquery, but without the NOT
:
SELECT last_name, first_name, title FROM member WHERE member_id IN ( SELECT member_id FROM pub )
However, the same thing could be accomplished with an simple join:
SELECT last_name, first_name, title FROM member AS m INNER JOIN pub AS p ON m.member_id = p.member_id
In these examples it is the need to generate a negative result set (which members have not published) that makes the use of a subquery necessary.
The IN
clause works only with a subquery that returns a result
set consisting of the values of a single attribute. It also works with a
list of literals, as in these examples:
SELECT last_name, first_name, Title FROM member WHERE member_id in ( 9, 12, 35 )
or
SELECT member_id FROM member WHERE last_name IN ( 'KILGOUR', 'BROWN' )
EXISTS
Clause
The use of the EXISTS
clause is similar to the use of IN
,
except that the subquery need not restrict itself to returning the list of
values of a single attribute. So long as any result other than NULL
is returned by the subquery, the EXISTS
condition becomes true.
Like IN
, EXISTS
may be used with a NOT
.
SELECT last_name, first_name, Title FROM member AS m WHERE NOT EXISTS ( SELECT * FROM pub WHERE member_id = m.member_id )
This subquery returns all publications belonging to the member selected in
the outer SELECT
statement. Should the subquery return a NULL
,
the outer query returns a tuple from the member table, i.e.
it lists only members who do not have publications. This is equivalent to
the first example of the use of the IN
clause.
These clauses may be mixed, as in this example:
SELECT DISTINCT last_name, first_name, title FROM member AS m INNER JOIN member_keyword AS m_k ON m.member_id = m_k.member_id WHERE NOT EXISTS ( SELECT * FROM pub WHERE member_id = m.member_id ) AND m_k.keyword_id IN ( 7, 8 )
This selects all members who do not have publications as well as having expertise in "Arms Control and Non-Proliferation Studies" (keyword_id = 7) and "Strategic Studies" (keyword_id = 8).
The following selection statement produces the same result:
SELECT DISTINCT last_name, first_name, title FROM member AS m WHERE member_id NOT IN ( SELECT member_id FROM pub ) AND EXISTS ( SELECT * FROM member_keyword WHERE keyword_id IN ( 7, 8 ) AND member_id = m.member_id )
Another example:
SELECT last_name FROM member AS m WHERE member_id IN ( SELECT member_id FROM member_keyword WHERE keyword_id = 12 ) AND member_id NOT IN ( SELECT member_id FROM member_keyword WHERE keyword_id = 19 )
This example selects the last name of all members who have expertise in "Military History" (keyword_id = 12), but who do not have expertise in "Military Alliances" (keyword_id = 19).
Subqueries can be used to generate fields or columns similar to how they can generate a temporary table. The subquery must generate a single value rather than a table or list of values, as in this example:
SELECT s.Student_ID, last_name, first_name, ( SELECT COUNT(Course) FROM Student_Course AS sc WHERE s.Student_ID = sc.Student_ID ) AS TotalCourses FROM Student AS s
where the subquery:
( SELECT COUNT(Course) FROM Student_Course AS sc WHERE s.Student_ID = sc.Student_ID ) AS TotalCourses
counts how many courses an individual student has taken. This generates a single number for each student, and this number can be used, with its alias, as a field in the surrounding query.
Subqueries may have subqueries of their own:
SELECT last_name FROM member AS m WHERE member_id IN ( SELECT member_id FROM member_keyword WHERE keyword_id = 12 ) AND member_id NOT IN ( SELECT member_id FROM member_keyword WHERE keyword_id IN ( 18, 19, 20 ) )
This is similar to the previous example except that the member must not have expertise in either "Information Warfare" (18) or "Non-Traditional Threats to Security" (20) as well.
The use of subqueries allows for the creation of very complex and powerful selection statements.