Title: More on SQL
1More on SQL
- Lecture Week 9
- INFS 614, Fall 2008
2Example Instances
R1
S1
S2
3SQL Seen so far
- Syntax of Basic SQL query
- Relation-list (FROM)
- Target-list (SELECT)
- Qualification (WHERE)
- Semantic Conceptual Evaluation Strategy
- String operators LIKE, NOT LIKE
- Set operators Union (all), Intersect (all),
Except (all) - Qualification involving sets IN, EXISTS,
UNIQUE, ANY, ALL (and their negated version) - Nested queries.
4Post Processing
- Processing on the result of an SQL query
- Sorting can sort the tuples in the output by any
column (even the ones not appearing in the SELECT
clause) - Duplicate removal
- Example
- Aggregation operators
SELECT Distinct S.sname FROM Sailors S,
Reserves R WHERE S.sidR.sid and R.bid103 Order
by S.sid asc
5Order By Example
SELECT S.sname, S.age FROM Sailors S WHERE
S.sname LIKE U ORDER BY S.age,S.sname
- For each attribute in the sort list, we can
specify its order - ASC Ascending Order
- DESC Descending Order
- Default Order is Ascending
6Aggregate Operators
- Significant extension of relational algebra.
COUNT () COUNT ( DISTINCT A) SUM ( DISTINCT
A) AVG ( DISTINCT A) MAX (A) MIN (A)
single column
7COUNT () COUNT ( DISTINCT A) SUM ( DISTINCT
A) AVG ( DISTINCT A) MAX (A) MIN (A)
Aggregate Operators
SELECT COUNT () FROM Sailors S
SELECT AVG (S.age) FROM Sailors S WHERE
S.rating10
single column
SELECT COUNT (DISTINCT S.rating) FROM Sailors
S WHERE S.snameBob
SELECT AVG ( DISTINCT S.age) FROM Sailors
S WHERE S.rating10
SELECT S.sname FROM Sailors S WHERE S.rating
(SELECT MAX(S2.rating)
FROM Sailors S2)
8Find name and age of the oldest sailor(s)
SELECT S.sname, MAX (S.age) FROM Sailors S
- The first query is illegal! (Well look into the
reason a bit later, when we discuss GROUP BY.) - The third query is equivalent to the second
query, and is allowed in the SQL/92 standard, but
is not supported in some systems.
SELECT S.sname, S.age FROM Sailors S WHERE
S.age (SELECT MAX (S2.age)
FROM Sailors S2)
SELECT S.sname, S.age FROM Sailors S WHERE
(SELECT MAX (S2.age) FROM
Sailors S2) S.age
9GROUP BY and HAVING
- So far, weve applied aggregate operators to all
(qualifying) tuples. Sometimes, we want to apply
them to each of several groups of tuples. - Consider Find the age of the youngest sailor
for each rating level. - In general, we dont know how many rating levels
exist, and what the rating values for these
levels are! - Suppose we know that rating values go from 1 to
10 we can write 10 queries that look like this
(!)
10Queries With GROUP BY and HAVING
SELECT DISTINCT target-list FROM
relation-list WHERE qualification GROUP
BY grouping-list HAVING group-qualification
- The target-list contains (i) attribute names
(ii) terms with aggregate operations (e.g., MIN
(S.age)). - The attribute list (i) must be a subset of
grouping-list. - Intuitively, each answer tuple corresponds to a
group, and these attributes must have a single
value per group. (A group is a set of tuples
that have the same value for all attributes in
grouping-list.)
11Conceptual Evaluation
- The cross-product of relation-list is computed,
tuples that fail qualification are discarded,
unnecessary fields are deleted, and the
remaining tuples are partitioned into groups by
the value of attributes in grouping-list. - The group-qualification is then applied to
eliminate some groups. Expressions in
group-qualification must have a single value per
group! - In effect, an attribute in group-qualification
that is not an argument of an aggregate op also
appears in grouping-list. (SQL does not exploit
primary key semantics here!) - One answer tuple is generated per qualifying
group.
12Query with GROUP BY and HAVING
- Query Find the age of the youngest sailor for
each rating level.
SELECT S.rating, MIN (S.age) FROM Sailors
S GROUP BY S.rating
13Find the age of the youngest sailor older than
18, for each rating with at least 2 such sailors
older than 18
SELECT S.rating, MIN (S.age) FROM Sailors
S WHERE S.age gt 18 GROUP BY S.rating HAVING
COUNT () gt 1
- Only S.rating and S.age are mentioned in the
SELECT, GROUP BY or HAVING clauses other
attributes unnecessary. - 2nd column of result is unnamed. (Use AS to name
it.)
14For each red boat, find the number of
reservations for this boat
SELECT B.bid, COUNT () AS rescount FROM Boats
B, Reserves R WHERE R.bidB.bid AND
B.colorred GROUP BY B.bid
- Grouping over a join of two relations.
- What do we get if we remove B.colorred from
the WHERE clause and add a HAVING clause with
this condition?
15Find the age of the youngest sailor with age gt
18, for each rating with at least 2 sailors (of
any age)
SELECT S.rating, MIN (S.age) FROM Sailors
S WHERE S.age gt 18 GROUP BY S.rating HAVING 1
lt (SELECT COUNT ()
FROM Sailors S2 WHERE
S.ratingS2.rating)
- HAVING clause can also contain a correlated
subquery. - Compare this with the query where we considered
only ratings with 2 sailors over 18! - What if HAVING clause is replaced by
- HAVING COUNT() gt1
16For each sailor with more than three
reservations, find the number of his reservations
SELECT R.sid, COUNT () AS rescount FROM
Reserves R GROUP BY R.sid HAVING COUNT() gt 3
Find the age of the youngest sailor, for each
rating level gt 5 with an average age gt 25
SELECT S.rating, MIN(S.age) AS minAge FROM
Sailors S WHERE S.rating gt 5 GROUP BY
S.rating HAVING AVG(S.age) gt 25
SELECT S.rating, MIN(S.age) AS minAge FROM
Sailors S GROUP BY S.rating HAVING S.rating gt 5
AND AVG(S.age) gt 25
17Find those ratings for which the average age of
sailors is the minimum over all ratings
- Aggregate operations cannot be nested! WRONG
SELECT S.rating FROM Sailors S WHERE
AVG(S.age) (SELECT MIN (AVG (S2.age))
FROM Sailors S2 GROUP BY S2.rating)
18Continue from previous slide
- Correct solution (in SQL/92)
SELECT Temp.rating, Temp.avgage FROM (SELECT
S.rating, AVG (S.age) AS avgage FROM
Sailors S GROUP BY S.rating) AS
Temp WHERE Temp.avgage (SELECT MIN
(Temp.avgage)
FROM Temp)
19Note on Oracle SQL in the labs
Do not use the keyword AS to name a temporary
table within the FROM clause.
20Note on Oracle SQL in the labs
SELECT Temp.rating, Temp.avgage FROM (SELECT
S.rating, AVG (S.age) AS avgage FROM
Sailors S GROUP BY S.rating) AS
Temp WHERE Temp.avgage (SELECT MIN
(Temp.avgage)
FROM Temp)
21Conclusions (so far)
- Post processing on the result of queries is
supported. - Aggregation is the most complex post processing
- Group by clause partitions the results into
groups - Having clause puts condition on groups (just
like Where clause on tuples).
22Null Values
- Field values in a tuple are sometimes unknown
(e.g., a rating has not been assigned) or
inapplicable (e.g., no spouses name). - SQL provides a special value null for such
situations.
23Dealing with the null value
- Special operators are needed to check if value
is/is not null. - Attribute_name IS NULL TRUE if the attributes
value is null. FALSE otherwise - Attribute_name IS NOT NULL TRUE if the
attributes value is not null. FALSE otherwise - Is ratinggt8 true or false when rating is equal to
null? - Actually, its unknown.
- How about rating gt 8 AND age lt 40 ?
- Three-valued logic.
24Three Valued Logic
- Logical operators AND, OR and NOT using a
3-valued logic (TRUE, FALSE and unknown).
25Null Values Some issues
- The presence of null complicates many issues
- WHERE clause eliminates tuples for which the
qualification does not evaluate to TRUE. - Important in nested queries involving EXISTS or
UNIQUE - Two tuples in SQL are duplicates in a relation if
corresponding attributes have the same value or
both contains null.
SELECT S1.sname FROM SAILORS S1 WHERE EXISTS
( SELECT FROM SAILORS S2 WHERE S1.sname
S2.sname AND AND S1.SID lt S2.SID AND
S1.RATING ltgt S2.RATING)
26Issues with the null value Summary
- WHERE and HAVING clause eliminates rows (groups)
for which the qualification does not evaluate to
true (i.e., evaluate to false or unknown). - Aggregate functions ignore null values (except
count()). - Distinct treats all null values as the same.
- The arithmetic operations , -, , / return null
if one of the arguments is null.
27Outer joins
(left outer-join)
28In Oracle
- Select
- From Sailor S, Reserves R
- Where S.sid R.sid ()
How about Select S.sid, count(R.bid) From Sailor
S, Reserves R Where S.sid R.sid () Group by
S.sid
OR Select S.sid, count() From Sailor S, Reserves
R Where S.sid R.sid () Group by S.sid
29More outer joins
- Left outer join
- sign on the right in Oracle
- Select from R, S where R.idS.id()
- Right outer join
- sign on the left in Oracle
- Select from R, S where R.id()S.id
- Full outer join
- not implemented in Oracle
30More on value functions
- Values can be transformed before aggregated
- e.g. SELECT sum(S.A/2) FROM S
- An interesting decode function (Oracle specific)
- decode(value, if1, then1, if2, then2, , else)
- SELECT sum(decode(Major, INFS, 1, 0)) as
No_IS_Stu, sum(decode(Major, INFS, 0,
1)) as No_NonIS_Stu - FROM Student
- Calculating GPA from letter grades?
31Examples
- Department (D-code, D-Name, Chair-SSn)Course
(D-code, C-no, Title, Units)Prereq (D-code,
C-no, P-code, P-no)Class (Class-no, D-code,
C-no, Instructor-SSn)Faculty (Ssn, F-Name,
D-Code, Rank)Student (Ssn, S-Name, Major,
Status)Enrollment (Class-no, Student-Ssn)Transcr
ipt (Student-Ssn, D-Code, C-no, Grade)
32Query 1
- Department (D-code, D-Name,
Chair-SSn)Course (D-code, C-no, Title,
Units)Prereq (D-code, C-no, P-code, P-no)Class
(Class-no, D-code, C-no, Instructor-SSn)Faculty
(Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name,
Major, Status)Enrollment (Class-no,
Student-Ssn)Transcript (Student-Ssn, D-Code,
C-no, Grade)
List the classes (Class-no) currently taken by
students whose names start with 'T'.
SELECT distinct e.Class-no FROM Enrollment e,
Student s WHERE e.Student-Ssn s.Ssn AND
s.S-Name LIKE 'T'
33Query 2
- Department (D-code, D-Name,
Chair-SSn)Course (D-code, C-no, Title,
Units)Prereq (D-code, C-no, P-code, P-no)Class
(Class-no, D-code, C-no, Instructor-SSn)Faculty
(Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name,
Major, Status)Enrollment (Class-no,
Student-Ssn)Transcript (Student-Ssn, D-Code,
C-no, Grade)
List the students (SSN) who are currently taking
exactly one class.
SELECT e.Student-Ssn FROM Enrollment e GROUP BY
e.Student-Ssn HAVING 1count()
34Query 3
- Department (D-code, D-Name,
Chair-SSn)Course (D-code, C-no, Title,
Units)Prereq (D-code, C-no, P-code, P-no)Class
(Class-no, D-code, C-no, Instructor-SSn)Faculty
(Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name,
Major, Status)Enrollment (Class-no,
Student-Ssn)Transcript (Student-Ssn, D-Code,
C-no, Grade)
Give the percentage of the students (among all
students) who are currently taking courses
offered by ISE (D-code'ISE').
SELECT count(distinct e.Student-Ssn)/count(distinc
t s.Ssn) as Percent FROM Enrollment e, Class
c, Student s WHERE e.Class-noc.Class-no and
c.D-code'ISE'
35Query 4
- Department (D-code, D-Name, Chair-SSn)Course
(D-code, C-no, Title, Units)Prereq (D-code,
C-no, P-code, P-no)Class (Class-no, D-code,
C-no, Instructor-SSn)Faculty (Ssn, F-Name,
D-Code, Rank)Student (Ssn, S-Name, Major,
Status)Enrollment (Class-no, Student-Ssn)Transcr
ipt (Student-Ssn, D-Code, C-no, Grade)
List the faculty members (F-Name) who teach 2 or
more classes. List these faculty members by the
number of classes they teach (ascending order).
SELECT f.F-Name FROM Faculty f, Class c WHERE
f.Ssnc.Instructor-SSn GROUP BY f.Ssn,
f.F-Name HAVING count(distinct c.Class-no)gt2 ORDE
R BY count(distinct c.Class-no), F-Name
36Query 5
- Department (D-code, D-Name,
Chair-SSn)Course (D-code, C-no, Title,
Units)Prereq (D-code, C-no, P-code, P-no)Class
(Class-no, D-code, C-no, Instructor-SSn)Faculty
(Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name,
Major, Status)Enrollment (Class-no,
Student-Ssn)Transcript (Student-Ssn, D-Code,
C-no, Grade)
List the students (SSN and Name) along with the
number of classes they are taking. If a student
is not taking any class, the student should also
be listed (with 0 as the number of classes he/she
is taking). The list should be ordered by the
number of classes (in an ascending order), and in
case of a tie, by the SSN of the students.
SELECT s.Ssn, s.S-Name, count(distinct
Class-no) FROM Student s, Enrollment e WHERE
s.Ssn e.Student-Ssn () GROUP BY s.Ssn,
s.S-Name ORDER BY count(distinct Class-no), Ssn
37Query 6
- Department (D-code, D-Name,
Chair-SSn)Course (D-code, C-no, Title,
Units)Prereq (D-code, C-no, P-code, P-no)Class
(Class-no, D-code, C-no, Instructor-SSn)Faculty
(Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name,
Major, Status)Enrollment (Class-no,
Student-Ssn)Transcript (Student-Ssn, D-Code,
C-no, Grade)
List the faculty members (F-Name) who teach more
than twice as many classes as Professor Smith
(F-Name'Smith') is teaching. (Note that if
Professor Smith is not teaching anything, then
any professor who teaches at least one class will
satisfy the above query.)
SELECT f.F-Name FROM Faculty f, Class c WHERE
f.Ssnc.Instructor-SSn GROUP BY f.Ssn,
f.F-Name HAVING count(distinct c.Class-no) gt
(SELECT 2count(distinct Class-no)
FROM Faculty f, Class c WHERE
f.F-Name'Smith' and
f.Ssnc.Instructor-SSn)
38Query 7
- Department (D-code, D-Name,
Chair-SSn)Course (D-code, C-no, Title,
Units)Prereq (D-code, C-no, P-code, P-no)Class
(Class-no, D-code, C-no, Instructor-SSn)Faculty
(Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name,
Major, Status)Enrollment (Class-no,
Student-Ssn)Transcript (Student-Ssn, D-Code,
C-no, Grade)
Find the number of departments which do not have
a chairman (Chair_ssn is 'NULL').
SELECT count(d.D-code) FROM Department d WHERE
d.Chair-SSn is NULL
39Query 8
- Department (D-code, D-Name,
Chair-SSn)Course (D-code, C-no, Title,
Units)Prereq (D-code, C-no, P-code, P-no)Class
(Class-no, D-code, C-no, Instructor-SSn)Faculty
(Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name,
Major, Status)Enrollment (Class-no,
Student-Ssn)Transcript (Student-Ssn, D-Code,
C-no, Grade)
For each department (i.e., students Major), give
the number of graduate students (status'Grad')
and the number of other students (status ltgt
'Grad'). The two numbers must be shown in the
same row as the department code. Hint use
decode.
SELECT Major, sum(decode(Status, 'Grad', 1,0)) as
Grad, sum(decode(Status, 'Grad',
0,1)) as NoGrad FROM Student GROUP BY Major
40Query 9
For each department (D-code), give the highest
rank of the professors in the department along
with the number of the faculty with that highest
rank. The output contains one row for each
department. Assume FullgtAssociategtAssistant,
i.e., lexicographic order is fine. Note that some
department may not have professors in some ranks
(e.g., a department may not have full, or
associate or assistant professors).
SELECT f.D-Code as dept, f.maxrank, count(e.Ssn)
as num FROM (SELECT D-code, max(Rank) as
maxrank FROM Faculty GROUP BY
D-code) AS f, Faculty e WHERE
f.D-codee.D-code AND e.Rankf.maxrank GROUP BY
f.D-code, f.maxrank ORDER BY f.D-code