Title: Structured Query Language
1Structured Query Language
2What is a Subquery?
- A subquery is a SELECT statement thats coded
within another SQL statement. - A subquery can return a single value, a result
set that contains a single column, or a result
set that contains one or more columns. - A subquery that returns a single value can be
coded, or introduced, anywhere an expression is
allowed. - A subquery that returns a single column can be
introduced in place of a list of values, such as
the values for an IN phrase. - A subquery that returns one or more columns can
be introduced in place of a table in the FROM
clause. - A subquery thats used in a WHERE or HAVING
clause is called a subquery search condition or a
subquery predicate. This is the most common use
for a subquery.
Source (Murachs SQL Server 2005 for Developers)
3What is a Subquery?
- The syntax for a subquery is the same as for a
standard SELECT statement. However, a subquery
doesnt typically include the GROUP BY or HAVING
clause, and it cant include an ORDER BY clause
unless the TOP phrase is used. - Subqueries can be nested within other subqueries.
However, subqueries that are nested more than two
or three levels deep can be difficult to read and
can result in poor performance.
4Subqueries in a SELECT statement
- In a WHERE clause as a search condition
- In a HAVING clause as a search condition
- In the FROM clause as a table specification
- In the SELECT clause as a column specification
5A Subquery Example
- Placing one query inside another
- Inner query is a subquery
- It is executed first the result of the subquery
is stored in a temporary table and is evaluated
against the conditions of the outer query
select from student where student_gpa gt
(select avg(student_gpa)
from student )
6Subqueries VS. Joins
- How subqueries compare to joins
- Like a join, a subquery can be used to code
queries that work with two or more tables. - Most subqueries can be restated as joins and most
joins can be restated as subqueries. - Advantages of joins
- The result set of a join can include columns from
both tables. The result set of a query with a
subquery can only include columns from the table
named in the outer query, not in the subquery. - A join tends to be more intuitive when it uses an
existing relationship between the two tables,
such as a primary key to foreign key
relationship. - A query with a join typically performs faster
than the same query with a subquery, especially
if the query uses only inner joins.
Source (Murachs SQL Server 2005 for
Developers)
7Subqueries VS. Joins
- Advantages of subqueries
- You can use a subquery to pass an aggregate value
to the outer query. - A subquery tends to be more intuitive when it
uses an ad hoc relationship between the two
tables. - Long, complex queries can sometimes be easier to
code using subqueries.
8Subqueries VS. Joins
- A query that uses an inner join
- SELECT InvoiceNumber, InvoiceDate, InvoiceTotal
- FROM Invoices JOIN Vendors
- ON Invoices.VendorID Vendors.VendorID
- WHERE VendorState 'CA'
- ORDER BY InvoiceDate
- The same query restated with a subquery
- SELECT InvoiceNumber, InvoiceDate, InvoiceTotal
- FROM Invoices
- WHERE VendorID IN
- (SELECT VendorID
- FROM Vendors
- WHERE VendorState 'CA')
- ORDER BY InvoiceDate
9Using IN phrase with a subquery
- WHERE test_expression NOT IN (subquery)
- How to use subqueries with the IN operator
- You can introduce a subquery with the IN operator
to provide the list of values that are tested
against the test expression. - When you use the IN operator, the subquery must
return a single column of values. - A query that uses the NOT IN operator with a
subquery can typically be restated using an outer
join.
10Example
- A query that returns vendors without invoices
- SELECT VendorID, VendorName, VendorState
- FROM Vendors
- WHERE VendorID NOT IN
- (SELECT DISTINCT VendorID
- FROM Invoices)
- OR
- SELECT Vendors.VendorID, VendorName, VendorState
- FROM Vendors LEFT JOIN Invoices
- ON Vendors.VendorID Invoices.VendorID
- WHERE Invoices.VendorID IS NULL
11Multi-Table Subqueries
- Tables are usually joined in SQL by using the
WHERE clause to ensure the matching values in the
common columns - However, there are other alternatives to joining
the tables - E.g.
- IN, NOT IN, ALL, ANY
- all of these are used with sub-queries that
return multiple values
12Multi-Table Subqueries with ALL
- A query that returns invoices that are larger
than the largest invoice for vendor 34 - SELECT VendorName, InvoiceNumber, InvoiceTotal
- FROM Invoices JOIN Vendors ON Invoices.VendorID
- Vendors.VendorID
- WHERE InvoiceTotal gt ALL
-
(SELECT InvoiceTotal - FROM
Invoices - WHERE
VendorID 34) - ORDER BY VendorName
x gt ALL (1, 2)x gt 2x must be greater than the
maximum value returned by the subquery x lt ALL
(1, 2)x lt 1x must be less than the minimum value
returned by the subquery. x ltgt ALL (1, 2)(x ltgt 1)
AND (x ltgt 2) This condition is equivalent to x
NOT IN (1, 2)
13Multi-Table Subqueries with ANY
- A query that returns invoices smaller than the
largest invoice for vendor 115 - SELECT VendorName, InvoiceNumber, InvoiceTotal
- FROM Vendors JOIN Invoices ON Vendors.VendorID
- Invoices.InvoiceID
- WHERE InvoiceTotal lt ANY
-
(SELECT InvoiceTotal -
FROM Invoices -
WHERE VendorID 115)
x gt ANY (1, 2)x gt 1x must be greater than the
minimum value returned by the subquery. x lt ANY
(1, 2)x lt 2x must be less than the maximum value
returned by the subquery. x ltgt ANY (1, 2)(x ltgt
1) OR (x ltgt 2)This condition will evaluate to
True for any non-empty result set containing at
least one non-null value that isnt equal to x.
14Subqueries in the FROM clause
- A subquery thats coded in the FROM clause
returns a result set called a derived table. - When you create a derived table, you must assign
an alias to it. Then, you can use the derived
table within the outer query just as you would
any other table. - When you code a subquery in the FROM clause, you
must assign names to any calculated values in the
result set. - Derived tables are most useful when you need to
further summarize the results of a summary query.
15Subqueries in the FROM clause
- A query that uses a derived table to retrieve the
top 5 vendors by average invoice total - SELECT Invoices.VendorID, MAX(InvoiceDate) AS
LatestInv, - AVG(InvoiceTotal) AS AvgInvoice
- FROM Invoices JOIN
- (SELECT TOP 5 VendorID, AVG(InvoiceTotal)
- AS AvgInvoice
- FROM Invoices
- GROUP BY VendorID
- ORDER BY AvgInvoice DESC) AS TopVendor
- ON Invoices.VendorID TopVendor.VendorID
- GROUP BY Invoices.VendorID
- ORDER BY LatestInv DESC
16Subqueries in the SELECT clause
- When you code a subquery for a column
specification in the SELECT clause, the subquery
must return a single value. - A subquery thats coded within a SELECT clause is
usually a correlated subquery. - Subqueries are seldom coded in the SELECT clause.
Joins are used instead because theyre generally
faster and more readable.
17Subqueries in the SELECT clause
- A query that uses a correlated subquery in its
SELECT to retrieve the most recent invoice for
each vendor - SELECT DISTINCT VendorName,
- (SELECT MAX(InvoiceDate) FROM Invoices
- WHERE Invoices.VendorID Vendors.VendorID)
- AS LatestInv
- FROM Vendors
- ORDER BY LatestInv DESC
18 - The same query using a join instead of a subquery
in the SELECT clause - SELECT VendorName, MAX(InvoiceDate) AS LatestInv
- FROM Vendors JOIN Invoices
- ON Vendors.VendorID Invoices.VendorID
- GROUP BY VendorName
- ORDER BY LatestInv DESC
19Building Complex Queries
- A procedure for building complex queries
- 1. State the problem to be solved by the query in
English. - 2. Use pseudocode to outline the query. The
pseudocode should identify the subqueries used by
the query and the data they return. It should
also include aliases used for any derived tables. - 3. If necessary, use pseudocode to outline each
subquery. - 4. Code the subqueries and test them to be sure
that they return the correct data. - Code and test the final query.
- The problem to be solved
- Which vendor in each state has the largest
invoice total?
20Building Complex Queries
- Pseudocode for the query
- SELECT Summary1.VendorState, Summary1.VendorName,
- TopInState.SumOfInvoices
- FROM (Derived table returning VendorState,
VendorName, - SumOfInvoices) AS Summary1
- JOIN (Derived table returning VendorState,
- MAX(SumOfInvoices)) AS TopInState
- ON Summary1.VendorState TopInState.VendorSta
te AND - Summary1.SumOfInvoices
TopInState.SumOfInvoices - ORDER BY Summary1.VendorState