Title: Relational Algebra and My SQL
1Relational Algebra and My SQL
Lecture 5 CS157B
- Prof. Sin Min Lee
- Deparment of Computer Science
- San Jose State University
2Functional Dependency
A, B ? R(.) , we say FD A?B if
- For all t1, t2 element of R if t1A t2A gt
t1B t2B
Definition A? R(.) is a key if FD A
?R
Definition A? R(.) is a candidate key if FD
A?R and not proper subset of A is a key.
34. Find all the candidate keys for the following
table ( Mid1 Study Guide)
- R(A B C D)
- 1 2 3 4
- 2 2 3 5
- 3 2 5 1
- 1 2 5 6
- S (A B C D)
- 1 2 3 4
- 2 2 3 5
- 3 2 5 1
- 1 2 5 6
4Queries
- Q1
- A?R ()
- A is not a key
- Q2
- B?R ()
- B is not a key
- Q3
- C?R ()
- C is not a key
- Q4
- D?R ()
- Yes-D is a key
- Q5
- AB?R ()
- AB together cannot form the key
- Q6
- AC?R ()
- Yes- AC together cannot form the key
- Q7
5Relational Algebra
- Basic operations
- Selection ( ) Selects a subset of rows
from relation. - Projection ( ) Deletes unwanted columns
from relation. - Cross-product ( ) Allows us to combine two
relations. - Set-difference ( ) Tuples in reln. 1, but
not in reln. 2. - Union ( ) Tuples in reln. 1 and in reln. 2.
- Additional operations
- Intersection, join, division, renaming Not
essential, but (very!) useful. - Since each operation returns a relation,
operations can be composed! (Algebra is closed.)
6Projection
- Deletes attributes that are not in projection
list. - Schema of result contains exactly the fields in
the projection list, with the same names that
they had in the (only) input relation. - Projection operator has to eliminate duplicates!
(Why??, what are the consequences?) - Note real systems typically dont do duplicate
elimination unless the user explicitly asks for
it.
7Selection
- Selects rows that satisfy selection condition.
- Schema of result identical to schema of (only)
input relation. - Result relation can be the input for another
relational algebra operation! (Operator
composition.)
8Union, Intersection, Set-Difference
- All of these operations take two input relations,
which must be union-compatible - Same number of fields.
- Corresponding fields have the same type.
- What is the schema of result?
9Cross-Product
- Each row of S1 is paired with each row of R1.
- Result schema has one field per field of S1 and
R1, with field names inherited if possible. - Conflict Both S1 and R1 have a field called sid.
10Joins
- Condition Join
- Result schema same as that of cross-product.
- Fewer tuples than cross-product. Filters tuples
not satisfying the join condition. - Sometimes called a theta-join.
11Joins
- Equi-Join A special case of condition join
where the condition c contains only equalities. - Result schema similar to cross-product, but only
one copy of fields for which equality is
specified. - Natural Join Equijoin on all common fields.
12Division
- Not supported as a primitive operator, but useful
for expressing queries like
Find sailors who
have reserved all boats. - Precondition in A/B, the attributes in B must be
included in the schema for A. Also, the result
has attributes A-B. - SALES(supId, prodId)
- PRODUCTS(prodId)
- Relations SALES and PRODUCTS must be built using
projections. - SALES/PRODUCTS the ids of the suppliers
supplying ALL products.
13Examples of Division A/B
B1
B2
B3
A/B1
A/B2
A/B3
A
14Expressing A/B Using Basic Operators
- Division is not essential op just a useful
shorthand. - (Also true of joins, but joins are so common that
systems implement joins specially. Division is
NOT implemented in SQL). - Idea For SALES/PRODUCTS, compute all products
such that there exists at least one supplier not
supplying it. - x value is disqualified if by attaching y value
from B, we obtain an xy tuple that is not in A.
The answer is ?sid(Sales) - A
15(No Transcript)
16(No Transcript)
17EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN
- Equality join connects tuples from two relations
that match on certain attributes. The specified
joining columns are kept in the resulting
relation. - ?name(?dnametoy(Emp Dept)))
- Natural join connects tuples from two relations
that match on the specified common attributes - ?name(?dnametoy(Emp Dept)))
- How is an equality join between Emp and Dept
using dno different than a natural join between
Emp and Dept using dno? - Equality join SS, name, age, salary, Emp.dno,
Dept.dno, - Natural join SS, name, age, salary, dno,
dname, - Join is similar to equality join using different
comparison operators - A S op , ?, , , lt, gt
- att op att
(dno)
(dno)
18EXAMPLE JOIN
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 1
3 Bob 22 27000 1
4 Kathy 30 30000 2
5 Shideh 4 4000 1
dno dname floor mgrss
1 Toy 1 5
2 Shoe 2 1
-
- Equality Join, (Emp Dept)))
Dept
EMP
SS Name Age Salary EMP.dno Dept.dno dname floor mgrss
1 Joe 24 20000 2 2 Shoe 2 1
2 Mary 20 25000 1 1 Toy 1 5
3 Bob 22 27000 1 1 Toy 1 5
4 Kathy 30 30000 2 2 Shoe 2 1
5 Shideh 4 4000 1 1 Toy 1 5
(dno)
19EXAMPLE JOIN
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 1
3 Bob 22 27000 1
4 Kathy 30 30000 2
5 Shideh 4 4000 1
dno dname floor mgrss
1 Toy 1 5
2 Shoe 2 1
-
- Natural Join, (Emp Dept)))
Dept
EMP
SS Name Age Salary dno dname floor mgrss
1 Joe 24 20000 2 Shoe 2 1
2 Mary 20 25000 1 Toy 1 5
3 Bob 22 27000 1 Toy 1 5
4 Kathy 30 30000 2 Shoe 2 1
5 Shideh 4 4000 1 Toy 1 5
(dno)
20EXAMPLE JOIN
SS Name Age Salary dno
1 Joe 24 20000 2
2 Mary 20 25000 1
3 Bob 22 27000 1
4 Kathy 30 30000 2
5 Shideh 4 4000 1
dno dname floor mgrss
1 Toy 1 5
2 Shoe 2 1
Dept
EMP
SS Name Age Salary dno x.SS x.Name x.Age x.Salary x.dno
2 Mary 20 25000 1 2 Shideh 4 4000 1
3 Bob 22 27000 1 3 Shideh 4 4000 1
4 Kathy 30 30000 2 4 Shideh 4 4000 1
Salary gt 5 salary
21EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN
(Cont)
- Example retrieve the name of employees who earn
more than Joe - ?name(Emp (salgtx.sal)?nameJoe(? x(Emp)))
- Semi-Join selects the columns of one relation
that joins with another. It is equivalent to a
join followed by a projection - Emp (dno)Dept ?SS, name, age, salary,
dno(Emp Dept)
22(No Transcript)
23(No Transcript)
24JOIN OPERATORS
- Condition Joins
- - Defined as a cross-product followed by a
selection - R ?c S sc(R ? S)
(? is called the bow-tie) - where c is the condition.
- - Example
- Given the sample relational instances S1 and R1
The condition join S ?S1.sidltR1.sid R1 yields
25JOIN OPERATORS
- Condition Joins
- - Defined as a cross-product followed by a
selection - R ?c S sc(R ? S)
(? is called the bow-tie) - where c is the condition.
- - Example
- Given the sample relational instances S1 and R1
The condition join S ?S1.sidltR1.sid R1 yields
26(No Transcript)
27(No Transcript)
28(No Transcript)
29- Equijoin
- Special case of the condition join where the join
condition consists solely of equalities between
two fields in R and S connected by the logical
AND operator (?). - Example Given the two sample relational
instances S1 and R1
The operator S1 R.sidSsid R1 yields
30SQL
- SQL (Structured Query Language) is the standard
language for commercial DBMSs - SEQUEL (Structured English QUEry Language)was
originally defined by IBM for System R - standardization of SQL began in the 80s
- current standard is SQL-99
- SQL is more than a query language it includes a
DDL, DML and administration commands - SQL is an example of a transform-oriented
language. - A language designed to use relations to transform
inputs into required outputs.
31Basic structure of an SQL query
2
322
The SituationStudent Particulars
field type width contents id numeric
4 student id number name character
10 name dob date 8 date of
birth sex character 1 sex M /
F class character 2 class hcode character
1 house code R, Y, B, G dcode character
3 district code remission logical 1 fee
remission mtest numeric 2 Math test score
33General Structure
I
SELECT ...... FROM ...... WHERE ......
SELECT ALL / DISTINCT expr1 AS col1, expr2
AS col2 FROM tablename WHERE condition
34General Structure
I
SELECT ALL / DISTINCT expr1 AS col1, expr2
AS col2 FROM tablename WHERE condition
- The query will select rows from the source
tablename and output the result in table form.
- Expressions expr1, expr2 can be
- (1) a column, or
- (2) an expression of functions and fields.
- And col1, col2 are their corresponding column
names in the output table.
35- 3.2 SQL
- SELECT a1, ..., an
- FROM R1, R2, , Rm
- WHERE Con1, ,Conk
- This means
- p a1, ..., an( s Con1( (s Conk ( R1 R2
Rm))))
36- Example
- Find the SSN and tax for each person.
- SELECT SSN, Tax
- FROM Taxrecord, Taxtable
- WHERE wages interest capital_gain
income
37- AS keyword used to rename relations
- Two SQL expressions can be combined by
- INTERSECT
- UNION
- MINUS set difference
38- Example
- Find the names of the streets that intersect.
- SELECT S.NAME, T.NAME
- FROM Streets AS S, Streets AS T
- WHERE S.X T.X and
- S.Y T.Y
39- Example Assume we have the relations
- Broadcast ( Radio, X , Y )
- Town ( Name, X, Y )
- Find the parts of Lincoln, NE that can
- be reached by at least one Radio station.
- (SELECT X, Y
- FROM Town
- WHERE Name Lincoln)
- INTERSECT
- (SELECT X, Y
- FROM Broadcast)
40- Example
- Find the SSN and tax for each person.
-
- pSSN,Tax swagesinterestcapital_gain
income Taxrecord Taxtable - Example
- Find the area of Lincoln reached by a radio
station. - ( pX,Y ( sNameLincoln Town ) ) ? (
pX,Y Broadcast )
41- Another way of connecting SQL expressions
- is using the IN keyword.
- SELECT ..
- FROM ..
- WHERE a IN ( SELECT b
- FROM ..
- WHERE .. )
42- SQL with aggregation
- SELECT aggregate_function
- FROM .
- WHERE
- aggregate_function
- Max (c1a1 .. cnan)
where ai are attributes - Min (c1a1 .. cnan)
and ci are constants - Sum(a) where a is an
attribute that is - Avg(a) constant in
each constraint tuple - Count(a)
43- Example
- Package(Serial_No, From, Destination, Weight)
- Postage (Weight , Fee)
- Find the total postage of all packages sent
- from Omaha.
- SELECT Sum(Fee)
- FROM Package, Postage
- WHERE Package.Weight Postage.Weight AND
- Package.From Omaha
44- GROUP BY
- SELECT a1, , an, aggregate_function
- FROM ..
- WHERE
- GROUP BY a1, ..., ak
- Evaluates basic SQL query
- Groups the tuples according to different values
of a1,..,ak - Applies the aggregate function to each group
separately - a1, , ak ? a1, , an
45- Example
- Find the total postage sent out from each
city. - SELECT Package.From, Sum(Postage.Fee)
- FROM Package, Postage
- WHERE Package.Weight Postage.Weight
- GROUP BY Package.From
46General Structure
I
SELECT ALL / DISTINCT expr1 AS col1, expr2
AS col2 FROM tablename WHERE condition
- DISTINCT will eliminate duplication in the output
while ALL will keep all duplicated rows.
- condition can be
- (1) an inequality, or
- (2) a string comparison
- using logical operators AND, OR, NOT.
47General Structure
I
- Before using SQL, open the student file
- USE student
eg. 1 List all the student records.
SELECT FROM student
48General Structure
I
eg. 2 List the names and house code of 1A
students.
SELECT name, hcode, class FROM student
WHERE class"1A"
49General Structure
I
eg. 2 List the names and house code of 1A
students.
50General Structure
I
eg. 3 List the residential district of the Red
House members.
SELECT DISTINCT dcode FROM student WHERE
hcode"R"
51Data Manipulation
- Select query data in the database
- Insert insert data into a table
- Update updates data in a table
- Delete delete data from a table
Source Database Systems Connolly/Begg
52Retrieve all columns and all rows
- SELECT firstColumn,,lastColumn
- FROM tableName
- SELECT
- FROM tableName
53Use of Distinct
- SELECT DISTINCT columnName
- FROM tableName
columnName
A A B B C D
columnName
A B C D
54Calculated fields
- SELECT columnName/2
- FROM tableName
price
5.00 3.00 6.00
price
10.00 6.00 12.00
55Comparison Search Condition
- equals
- lt gt is not equal to (ISO standard)
- ! (allowed in some dialects)
- lt is less than
- gt is greater than
- lt is less than or equal to
- gt is greater than or equal to
Source Database Systems Connolly/Begg
56Comparison Search Condition
- An expression is evaluated left to right.
- Subexpressions in brackets are evaluated first.
- NOTs are evaluated before ANDs and ORs.
- ANDs are evaluated before ORs.
Source Database Systems Connolly/Begg
57Range Search Condition
- SELECT columnName
- FROM tableName
- WHERE columnName BETWEEN 20 AND 30
- SELECT columnName
- FROM tableName
- WHERE columnName gt 20
- AND columnName lt 30
58Set membership search condition
- SELECT columnName
- FROM tableName
- WHERE columnName
- IN (name1, name2)
- SELECT columnName
- FROM tableName
- WHERE columnName name1
- OR columnName name2
59Pattern matching symbols
- represents any sequence of zero
- or more characters (wildcard).
- _ represents any single character
Source Database Systems Connolly/Begg
60Pattern match search condition
- h begins with the character h .
- h_ _ _ four character string beginning with
the - character h.
- e any sequence of characters, of length
at - least 1, ending with the character e.
- CS157B any sequence of characters of any
- length containing CS157B
Source Database Systems Connolly/Begg
61Pattern match search condition
- LIKE h
- begins with the character h .
- NOT LIKE h
- does not begin with the character h.
Source Database Systems Connolly/Begg
62Pattern match search condition
- To search a string that includes a
- pattern-matching character
- 15
- Use an escape character to represent
- the pattern-matching character.
- LIKE 15 ESCAPE
-
Source Database Systems Connolly/Begg
63NULL search condition
- DOES NOT WORK
- comment
- comment !
- DOES WORK
- comment IS NULL
- comment IS NOT NULL
64Sorting
- The ORDER BY clause
- consists of list of column identifiers that the
result is to be sorted on, separated by commas. - Allows the retrieved rows to be ordered by
ascending (ASC) or descending (DESC) order
Source Database Systems Connolly/Begg
65Sorting
- Column identifier may be
- A column name
- A column number (deprecated)
Source Database Systems Connolly/Begg
66Sorting
- SELECT type, rent
- FROM tableName
- ORDER BY type, rent ASC
type rent
Apt Apt Flat Flat 450 500 600 650
type rent
Flat Apt Flat Apt 650 450 600 500
Source Database Systems Connolly/Begg
67Aggregate Functions
- COUNT returns the number
- SUM returns the sum
- AVG returns the average
- MIN returns the smallest
- MAX returns the largest
- value in a specified column.
Source Database Systems Connolly/Begg
68Use of COUNT( )
- How many students in CS157B?
- SELECT COUNT( ) AS my count
- FROM CS157B
my count
40
69GROUP BY clause
- When GROUP BY is used, each item in the SELECT
list must be single-valued per group. - The SELECT clause may contain only
- Column names
- Aggregate functions
- Constants
- An expression involving combinations of the above
Source Database Systems Connolly/Begg
70Grouping
- SELECT dept, COUNT(staffNo) AS my count
SUM(salary) - FROM tableName
- GROUP BY dept
- ORDER BY dept
dept staffNo Salary
A B C A B 1 1 1 2 2 200.00 200.00 200.00 100.00 100.00
dept my count Salary
A B C 2 2 1 300.00300.00 200.00
71Restricting Grouping
- HAVING clause
- is with the GROUP BY clause.
- filters groups into resulting table.
- includes at least one aggregate function.
- WHERE clause
- filters individual rows into resulting table.
- Aggregate functions cannot be used.
Source Database Systems Connolly/Begg
72- SELECT dept, COUNT(staffNo) AS my count,
SUM(salary) AS my sum - FROM Staff
- GROUP BY dept
- HAVING COUNT(staffNo) gt 1
- ORDER BY dept
dept staffNo Salary
A B C A B 1 1 1 2 2 200.00 200.00 200.00 100.00 100.00
dept my count my sum
A B 2 2 300.00 300.00
Source Database Systems Connolly/Begg
73Subqueries
- SELECT columnNameA
- FROM tableName1
- WHERE columnNameB (SELECT columnNameB
- FROM tableName2
- WHERE condition)
result from inner SELECT applied as a condition
for the outer SELECT
Source Database Systems Connolly/Begg
74Subquery with Aggregate Function
List all staff whose salary is greater than the
average salary, show by how much their salary is
greater than the average.
- SELECT fName, salary
- ( SELECT AVG(salary)
- FROM Staff ) AS salDiff
- FROM Staff
- WHERE salary gt ( SELECT AVG(salary)
- FROM Staff )
Source Database Systems Connolly/Begg
75Nested Subqueries Use of IN
- SELECT property
- FROM PropertyForRent
- WHERE staff IN(
- SELECT staff
- FROM Staff
- WHERE branch (
- SELECT branch
- FROM Branch
- WHERE street
- 112 A St))
Selects branch at 112 A St
Source Database Systems Connolly/Begg
76Nested Subqueries Use of IN
- SELECT property
- FROM PropertyForRent
- WHERE staff IN(
- SELECT staff
- FROM Staff
- WHERE branch ( branch ) )
Select staff members who works at branch.
Source Database Systems Connolly/Begg
77Nested Subqueries Use of IN
- SELECT property
- FROM PropertyForRent
- WHERE staff IN( staffs who works
- at branch on 112 A St)
Since there are more than one row selected,
cannot be used.
Source Database Systems Connolly/Begg
78Use of ANY/SOME
- SELECT name, salary
- FROM Staff
- WHERE salary gt SOME( SELECT salary
- FROM Staff
- WHERE branch A )
Result list of staff with salary greater than
2000.
Result2000,3000,4000
Source Database Systems Connolly/Begg
79Use of ALL
- SELECT name, salary
- FROM Staff
- WHERE salary gt ALL( SELECT salary
- FROM Staff
- WHERE branch A )
Result list of staff with salary greater than
4000.
Result2000,3000,4000
Source Database Systems Connolly/Begg
80Use of Any/Some and All
- If the subquery is empty
- ALL returns true
- ANY returns false
- ISO standard allows SOME to be
- used interchangeably with ANY.
Source Database Systems Connolly/Begg