Title: The Query Language SQL
1The Query Language SQL
2What we cannot compute with RA
- Aggregate operations. E.g. The number of
climbers who have climbed Last Tango' '' or
The average age of climbers''. These are
possible in SQL which has numerous extensions to
relational algebra. - Recursive queries. Given a relation
Parent(Parent, Child) compute the Ancestor
relation. (Can do this in Datalog.) - Computing with non 1NF relations e.g. lists,
arrays, multisets, nested relations.
3Basic Query
SELECT DISTINCT target-list FROM
relation-list WHERE qualification
- relation-list A list of relation names (possibly
with a range-variable after each name). - target-list A list of attributes of relations in
relation-list. can be used to denote all atts. - qualification Comparisons (Attr op const or
Attr1 op Attr2, where op is one of
combined using AND, OR and NOT. - DISTINCT (optional) keyword indicates that the
answer should not contain duplicates. Default is
that duplicates are not eliminated!
4Conceptual Evaluation Strategy
- Compute the product of relation-list
- Discard tuples that fail qualification
- Project over attributes in target-list
- If DISTINCT then eliminate duplicates
-
- This is probably a very bad way of executing
the query, and a good query optimizer will use
all sorts of tricks to find efficient strategies
to compute the same answer.
5Sample tables
Routes RId RName Grade Rating Height 1 Last
Tango II 12 100 2 Garden Path I 2
60 3 The Sluice I 8 60 4 Picnic
III 3 400 Climbers
Climbs Cid CName Skill Age CId RId Date
Duration 123 Edmund EXP 80 123 1 10/10/88
5 214 Arnold BEG 25 123 3 11/08/87 1 313
Bridget EXP 33 313 1 12/08/89 5 212 James
MED 27 214 2 08/07/92 2
313 1 06/07/94 3
6Select/project queries
SELECT Grade, Height FROM Routes
7Distinct
- Note that SQL did not eliminate duplicates. We
need to request this explicitly.
SELECT DISTINCT Grade, Height FROM Routes
8Pattern Matching
- Can be used in where clause. _ denotes any
character, 0 or more characters.
SELECT FROM Routes WHERE RName LIKE 'L_o'
9Arithmetic
- as can be used to label columns in the output
arithmetic can be used to compute results
SELECT DISTINCT Grade, Height/10 as H FROM Routes
Grade H II 10 I 6 III 40
10Set operations -- union
SELECT CId FROM Climbers WHERE Age lt
40 UNION SELECT CId FROM Climbs WHERE RID 1
- Duplicates do not occur in the union.
11The UNION ALL operator preserves duplicates
SELECT Cid FROM Climbers WHERE Age lt 40 UNION
ALL SELECT Cid FROM Climbs WHERE RID 1
12What does union compatible mean?
SELECT CId FROM Climbers UNION SELECT RId FROM
Routes
Ok
SELECT CName FROM Climbers UNION SELECT
RId FROM Routes
Error
13Intersection and difference
SELECT CId FROM Climbers WHERE Age gt
40 INTERSECT SELECT CId FROM Climbs WHERE RId 1
SELECT CId FROM Climbers WHERE Age lt
40 MINUS SELECT CId FROM Climbs WHERE RId 1
14Nested queries
- We could also have written the previous queries
as follows
SELECT CId FROM Climbers WHERE Age gt 40 AND CId
IN (SELECT CId FROM Climbs
WHERE RId 1)
SELECT CId FROM Climbers WHERE Age lt 40 AND CId
NOT IN (SELECT CId FROM Climbs
WHERE RId 1)
15Nested queries with correlation
SELECT CId FROM Climbers c WHERE EXISTS (SELECT
FROM Climbs b WHERE
c.CIdb.CId AND b.RID 1) SELECT CId FROM
Climbers c WHERE NOT EXISTS (SELECT
FROM Climbs b WHERE
c.CIdb.CId) SELECT CId FROM Climbers c WHERE
EXISTS UNIQUE (SELECT
FROM Climbs b WHERE c.CIdb.CId AND
RID 1)
16More on set comparison ops
- Besides IN, NOT IN, EXISTS, NOT EXISTS, UNIQUE
and NOT UNIQUE we can also say ltopgt ANY, ltopgt
ALL, where ltopgt is any of - What does the following mean in English?
SELECT CName, Age FROM Climbers WHERE Age gt ALL
(SELECT Age FROM Climbers)
17Set comparison ops, cont.
- What does the following mean in English?
SELECT FROM Climbers WHERE Age gt ANY (SELECT
Age FROM Climbers
WHERE CName'Arnold')
18Using expressions for relation names
- Consider the following query Find the names of
climbers who have not climbed any route.
SELECT CName FROM (SELECT CId FROM
Climbers MINUS SELECT CId FROM
Climbs) Temp, Climbers WHERE Temp.CId
Climbers.CId
19Products
SELECT FROM Climbers,Climbs
- Note that the CID column name is duplicated in
the output.
20Conditional join
SELECT FROM Climbers,Climbs WHERE Climbers.CId
Climbs.CId
CID CNAME SKIL AGE CID RID DAY DURATION 123
Edmund EXP 80 123 1 10-OCT-88 5 123 Edmund
EXP 80 123 3 08-NOV-87 1 313 Bridget EXP 33
313 1 08-DEC-89 5 214 Arnold BEG 25 214 2
07-AUG-92 2 313 Bridget EXP 33 313 1
07-JUN-94 3
21Example 1
- The names of climbers who have climbed route 1.
SELECT CName FROM Climbers, Climbs WHERE
Climbers.CId Climbs.CId AND RId 1
22Example 2
- The names of climbers who have climbed the route
named Last Tango.
SELECT CName FROM Climbers, Climbs, Routes WHERE
Climbers.CId Climbs.CId AND Routes.RId
Climbs.RID AND RName 'Last Tango'
23Example 3
- The IDs of climbers who have climbed the same
route at least twice. - Note the use of aliases for relations.
SELECT C1.CId FROM Climbs C1, Climbs C2 WHERE
C1.CId C2.CId AND C1.RId C2.RId AND
(C1.Day ltgt C2.Day OR C1.DURATION ltgt
C2.DURATION))
24Example 4
- Recall The names of climbers who have not
climbed any route
SELECT CName FROM (SELECT CId FROM
Climbers MINUS SELECT CId FROM
Climbs) Temp, Climbers WHERE Temp.CId
Climbers.CId
25Example 4, cont.
SELECT CName FROM Climbers WHERE CId NOT IN
(SELECT CId FROM Climbs)
26Universal Quantification
- The IDs of climbers who have climbed all routes.
SELECT CId FROM Climbs c1 WHERE NOT EXISTS
(SELECT RId ? Routes not climbed FROM
Routes r by c1. WHERE NOT EXISTS
(SELECT FROM Climbs
c2 WHERE c1.CIdc2.CId
and c2.RIdr.RId)
27Non-algebraic operations
- SQL has a number of operations that cannot be
expressed in relational algebra. The first is
to express arithmetic in queries.
SELECT RName, Rating Height AS Difficulty FROM
Routes
28Arithmetic, cont
- Arithmetic (and other expressions) cannot be used
at the top level. E.g. 22 isn't an SQL query. - Question -- how would you get SQL to compute 22?
29Counting
SELECT COUNT(RId) FROM Routes
- Surprisingly, the answer to both of these is the
following
SELECT COUNT(Grade) FROM Routes
30Counting, cont.
- To fix this, we use the keyword DISTINCT
- Can also use SUM, AVG, MIN and MAX.
SELECT COUNT(DISTINCT Grade) FROM Routes
31Group by
- So far, these aggregate operators have been
applied to all qualifying tuples. Sometimes we
want to apply them to each of several groups of
tuples. - For example Print the number of routes in each
grade.
32Group by
SELECT Grade, COUNT() FROM Routes GROUP BY
Grade
- Note that only the columns that appear in the
GROUP BY statement and aggregated columns can
appear in the output. So the following would
generate an error.
SELECT Grade, RName, COUNT() FROM Routes GROUP
BY Grade
33Group by having
- HAVING is to GROUP BY as WHERE is to FROM
- HAVING is used to restrict the groups that
appear in the result.
SELECT Height, AVG(Rating) FROM Routes GROUP BY
Height HAVING Height lt 300
34Another example
SELECT Height, AVG(Rating) FROM Routes GROUP BY
Height HAVING MAX(Rating) lt 10
35Null Values
- The value of an attribute can be unknown (e.g., a
rating has not been assigned) or inapplicable
(e.g., no spouse). - SQL provides a special value null for such
situations. - The presence of null complicates many issues.
E.g. - Special operators needed to check if value is/is
not null. - Is ratinggt8 true or false when rating is equal to
null? What about AND, OR and NOT connectives?
3-valued logic (true, false and unknown). - Meaning of constructs must be defined carefully.
(e.g., WHERE clause eliminates rows that dont
evaluate to true.)
36Outer Join
- A variant of the join that relies on null values
- Tuples of Climbers that do not match some tuple
in Climbs would normally be excluded from the
result the left outer join preserves them with
null values for the missing Climbs attributes.
SELECT FROM Climbers NATURAL LEFT OUTER
JOIN Climbs
37Result of left outer join
- Null values can be disallowed in a query result
- by specifying NOT NULL.
38Summary
- SQL is relationally complete all of the
operators of the relational algebra can be
simulated. - Additional features string comparisons, set
membership, arithmetic and grouping.