Title: Lecture 11: Basic SQL, Integrity constraints
1Lecture 11Basic SQL,Integrity constraints
- www.cl.cam.ac.uk/Teaching/current/Databases/
2SQL A language for relational databases
- Originates from System R project at IBM
- So widespread it has been standardised
- SQL-86 (ANSI)
- SQL-89 (minor revisions)
- SQL-92 (major revision ANSI/ISO current
standard) - SQL1999 (major extension - objects)
- SQL2003 (more features and XML support)
3SQL components
- SQL/92 is a huge standard (600 pages)
- SQL is a comprehensive database language,
containing - Data definition language (DDL) to define schemas
- Data manipulation language (DML) to insert,
delete, modify and retrieve data - Lots of other features e.g. views, schema
evolution, access control, triggers, assertions,
4Relation instances
Reserves
Boats
sid bid day
22 101 101001
55 103 111201
bid bname colour
101 Clipper Red
102 Hatter Blue
103 Interlake green
sid sname rating age
11 Sue 7 26
22 Tim 7 26
33 Bob 8 28
55 Kim 10 28
Sailors
5Simple queries
- Contents of entire relation SELECT
FROM Sailors - Qualifications SELECT FROM
Sailors WHERE agegt22
6Simple queries cont.
- Projection of attributes SELECT rating,age
FROM Sailors - Note that SQL does not remove duplicates. We need
to request this explicitly SELECT DISTINCT
rating,age FROM Sailors
7Simple queries cont.
- Can query across multiple relations
SELECT S.sname FROM Sailors S,
Reserves R WHERE S.sidR.sid AND
R.bid103 - SQL allows the range variables to be dropped,
e.g. SELECT sname FROM
Sailors,Reserves WHERE Sailors.sidReserves.
sid AND Reserves.bid103
8Basic SQL query syntax
- SELECT DISTINCT target-list FROM
relation-list WHERE qualification - relation-list is a list of relation names
(possibly with range variables) - target-list is a list of attributes of relations
in relation-list - qualification is a Boolean expression
9Semantics of SQL queries
- The semantics of SQL query evaluation can be
conceptually defined as - Compute the product of relation-list
- Discard those tuples that fail the qualification
- Delete attributes that are not in target-list
- If DISTINCT is specified, eliminate duplicate
rows - NOTE This is probably the least efficient way to
evaluate a query!
10Compiling to relational algebra
- The query SELECT a1, , an FROM
R1, , Rm WHERE b - Can be compiled to the RA expression
?a1,,an(?b(R1??Rm)) - (Assuming no duplicates)
11Example
- Find sailors whove reserved at least one boat
- SELECT S.sid FROM Sailors S,
Reserves R WHERE S.sidR.sid - Would adding DISTINCT to this query make any
difference?
12String matching
- SQL provides powerful string matching facilities
- These can be used in the WHERE clause
- _ denotes any one character
- denotes zero or more characters
- For example SELECT FROM Sailors
WHERE sname LIKE _ic
13Arithmetic
- SQL provides arithmetic operators, which can be
used in the SELECT clause - Use AS to name columns SELECT sname, age5 AS
OldAge FROM Sailors - Can also be used in the WHERE clause SELECT
S1.sname FROM Sailors S1, Sailors S2
WHERE S1.rating2S2.rating 4
14Set theoretic operations
- SQL provides the familiar set theoretic
operators UNION, INTERSECT, IN (set membership) - SQL also allows a WHERE clause to contain another
query, e.g. SELECT S.sname FROM Sailors
S WHERE S.sid IN (SELECT R.sid
FROM Reserves R
WHERE R.bid103)
15Example
- Find names of sailors whove reserved a red or a
green boat
SELECT S.sname FROM Sailors S, Reserves R,
Boats B WHERE S.sidR.sid AND R.bidB.bid
AND (B.colourred OR B.colourgreen)
16Example
- Find names of sailors whove reserved a red and a
green boat
SELECT S.sname FROM Sailors S, Reserves R1,
Reserves R2, Boats B1, Boats B2 WHERE
S.sidR1.sid AND R1.bidB1.bid AND
S.sidR2.sid AND R2.bidB2.bid AND
B1.colourred AND B2.colourgreen
17Integrity constraints
- An integrity constraint (IC) is a condition that
must be true for any instance of the database - E.g. domain constraints
- ICs are specified when schema is defined
- ICs are checked when relations are modified
- A legal instance of a relation is one that
satisfies all the specified ICs - Wed like the DBMS to check for ICs
18Primary key constraints
- A set of fields is a (candidate) key for a
relation if - No two distinct tuples can have same values in
all the key fields and - This is not true for any subset of the key
- If part 2 is false, then its a superkey
- If theres more than one key for a relation, then
one is chosen by the DBA to be the primary key
19Primary and candidate keys in SQL
- Can define possibly many candidate keys, one of
which is chosen as the primary key CREATE
TABLE Enrolled ( sid CHAR(20),
name CHAR(20), age INTEGER,
grade CHAR(2), PRIMARY KEY (sid),
UNIQUE (name,age))
20Foreign key
- A foreign key is a set of fields in one relation
that is used to refer to a tuple in another
relation (must correspond to a primary key of the
second relation) - A bit like a pointer
- For example, sid in Reserves is a foreign key
referring to Sailors
21Foreign keys in SQL
- Only sailors listed in the Sailors relation
should be allowed to make reservations
CREATE TABLE Reserves (sid INTEGER,
bid INTEGER, date DATE,
PRIMARY KEY (sid,bid,date), FOREIGN KEY
(sid) REFERENCES Sailors)
22Referential integrity
- If all foreign key constraints are enforced then
referential integrity is achieved. - Similar to no dangling pointers
- Can you name a data model without referential
integrity?
23Semantic integrity constraints
- ICs come from the semantics of the real-world
enterprise that is being modelled - Another class of ICs is semantic integrity
constraints. These apply to the actual data
values that can be stored, e.g. Sailor rating is
between 0 and 10
24Summary
- You should now understand
- Simple SQL queries
- SELECT-FROM-WHERE
- Compilation to relational algebra
- Nested queries
- Integrity constraints
- Primary and candidate keys