Title: Integrity Constraints and Functional Dependencies
1Integrity Constraints and Functional Dependencies
2Review
- Three things managed by a DBMS
- Data organization
- E/R Model
- Relational Model
- Data Retrieval
- Relational Algebra
- Relational Calculus
- SQL
- Data Integrity
- Integrity Constraints
3Integrity Constraints
- Purpose prevent semantic inconsistencies in data
e.g.
e.g.
4 kinds of ICs 1. Key Constraints 2. Attribute
Constraints 3. Referential Integrity
Constraints 4. Global Constraints
No entry for Kenmore... ???
4ICs
- What are they?
- predicates on the database
- must always be true (, checked whenever db gets
updated) - There are the following 4 types of ICs
- Key constraints (1 table)
- e.g., 2 accts cant share the same acct_no
- Attribute constraints (1 table)
- e.g., 2 accts must have nonnegative balance
- Referential Integrity constraints ( 2 tables)
- E.g. bnames associated w/ loans must be names of
real branches - Global Constraints (n tables)
- E.g., all loans must be carried by at least 1
customer with a svngs acct
5Key Constraints
- Idea specifies that a relation is a set, not a
bag
SQL examples 1. Primary Key
CREATE TABLE branch(
bname
CHAR(15) PRIMARY KEY,
bcity CHAR(20),
assets
INT) or CREATE TABLE
depositor(
cname CHAR(15),
acct_no CHAR(5),
PRIMARY
KEY(cname, acct_no)) 2. Candidate Keys
CREATE TABLE
customer (
ssn CHAR(9) PRIMARY KEY,
cname
CHAR(15),
address CHAR(30),
city
CHAR(10),
UNIQUE (cname, address, city)
6Key Constraints
- Effect of SQL Key declarations
- PRIMARY (A1, A2, .., An) or
- UNIQUE (A1, A2, ..., An)
Insertions check if any tuple has same values
for A1, A2, .., An as any
inserted tuple. If found, reject
insertion Updates to any of A1, A2, ..., An
treat as insertion of entire tuple
- Primary vs Unique (candidate)
- 1 primary key per table, several unique keys
allowed. - Only primary key can be referenced by foreign
key (ref integrity) - DBMS may treat primary key differently
- (e.g. create an index
on PK)
7Attribute Constraints
- Idea
- Attach constraints to values of attributes
- Enhances types system (e.g. gt 0 rather than
integer) - In SQL
1. NOT NULL e.g. CREATE TABLE
branch( bname
CHAR(15) NOT NULL,
.... ) Note
declaring bname as primary key also prevents null
values 2. CHECK e.g. CREATE TABLE
depositor(
.... balance
int NOT NULL,
CHECK( balance gt 0),
....
) affect insertions, update in affected columns
8Attribute Constraints
- Domains can associate constraints with DOMAINS
rather than attributes
e.g instead of CREATE TABLE depositor(
....
balance INT NOT NULL,
CHECK (balance gt
0)
) One can write CREATE
DOMAIN bank-balance INT (
CONSTRAINT not-overdrawn
CHECK (value gt 0),
CONSTRAINT not-null-value
CHECK( value NOT
NULL)) CREATE TABLE depositor (
.....
balance bank-balance,
) Advantages?
9Attribute Constraints
- Advantage of associating constraints with domains
1. can avoid repeating specification of same
constraint for multiple columns
2. can name constraints e.g. CREATE DOMAIN
bank-balance INT ( CONSTRAINT
not-overdrawn
CHECK (value gt 0),
CONSTRAINT not-null-value
CHECK( value NOT NULL))
allows one to 1. add or remove
ALTER DOMAIN bank-balance
ADD CONSTRAINT capped
CHECK( value lt 10000)
2. report better errors (know which
constraint violated)
10Referential Integrity Constraints
- Idea prevent dangling tuples (e.g. a loan
with a bname, Kenmore, when no Kenmore tuple in
branch)
Referencing Relation (e.g. loan)
Referenced Relation (e.g. branch)
foreign key bname
primary key bname
Ref Integrity ensure that
foreign key value ? primary
key value (note need not to ensure ?, i.e.,
not all branches have to have loans)
11Referential Integrity Constraints
bname
bname
x
Referencing Relation (e.g. loan)
Referenced Relation (e.g. branch)
x
x
In SQL CREATE TABLE branch(
bname CHAR(15) PRIMARY KEY
....) CREATE TABLE
loan ( .........
FOREIGN KEY bname REFERENCES branch)
Affects 1) Insertions, updates of
referencing relation 2) Deletions, updates of
referenced relation
12Referential Integrity Constraints
c
c
ti
x
x
tj
x
what happens when we try to delete this tuple?
A
B
Ans 3 possibilities 1)
reject deletion/ update 2)
set ti c, tjc NULL
3) propagate deletion/update
DELETE delete ti, tj
UPDATE set tic, tjc
to updated values
13Referential Integrity Constraints
c
c
ti
x
x
tj
x
what happens when we try to delete this tuple?
A
B
CREATE TABLE A ( .....
FOREIGN KEY c REFERENCES B action
.......... )
Action 1) left blank (deletion/update
rejected) 2)
ON DELETE SET NULL/ ON UPDATE SET NULL
sets tic NULL, tjc NULL
3) ON DELETE CASCADE
deletes ti, tj
ON UPDATE CASCADE
sets tic, tjc to new key values
14Global Constraints
- Idea two kinds
- 1) single relation (constraints spans multiple
columns) - E.g. CHECK (total svngs check) declared in
the CREATE TABLE - 2) multiple relations CREATE ASSERTION
SQL examples 1) single relation
All Bkln branches must have assets gt 5M
CREATE TABLE branch (
..........
bcity CHAR(15),
assets INT, CHECK
(NOT(bcity Bkln) OR assets gt 5M)) Affects
insertions into branch
updates of bcity or assets in branch
15Global Constraints
SQL example 2) Multiple relations every loan
has a borrower with a savings account CHECK
(NOT EXISTS ( SELECT
FROM loan
AS L WHERE NOT
EXISTS(
SELECT
FROM borrower B, depositor D, account A
WHERE
B.cname D.cname AND
D.acct_no
A.acct_no AND
L.lno B.lno)))
Problem Where to put this constraint? At
depositor? Loan? .... Ans None of the above
CREATE ASSERTION loan-constraint
CHECK( ..... )
Checked with EVERY DB update! very
expensive.....
16Global Constraints
- Issues
- 1) How does one decide what global constraint
to impose? - 2) How does one minimize the cost of checking
the global constraints? - Ans Functional dependencies.
- but before we go there
17Summary Integrity Constraints
18Functional Dependencies
An example
loan-info
Observe tuples with the same value for lno
will always have the same value for amt We
write lno ? amt (lno determines amt, or
amt is functional determined by lno)
True or false? amt ? lno? lno ?
cname? lno ? bname? bname ? lno?
cant always decide by looking at populated
dbs
19Functional Dependencies
- In general
- A1 A2 .... An ? B
single attribute
set of attributes
Formally if 2 tuples agree on
their values for A1, A2,...,An they will also
agree on their values for B
Formally ? t, u (tA1 uA1 ?
tA2 uA2 ? .... ? tAn uAn ) gt tB
uB
20Back to Global ICs
- How do we decide what constraints to impose?
Consider loan-info(bname, lno, cname, amt) with
FDs lno ? bname
How do we ensure that lno ? bname?
CREATE ASSERTION lno-bname CHECK ( NOT
EXIST (SELECT
FROM loan-info
l1, loan-info l2
WHERE ?))
? l1.lno l2.lno AND l1.bname ltgt
l2.bname
FDs tell us what global constraints to
impose....
21Back to FDs
- How to derive them?
- (1) Key constraints ( e.g. bname a key for
branch -
bname ? bname -
bname ? bcity -
bname ? assets) - we can write bname ? bname bcity
assets
Q Define superkeys in terms of FDs
A Any set of attributes in a relation that
functionally determines all
attributes in the relation
Q Define candidate key in terms of FDs
A Any superkey such that the removal of any
attribute leaves a set that does
not functionally determine all attributes.
22Functional Dependencies
- How to derive them?
- (1) Key constraints
- (2) Laws of physics.... e.g. time room ? course
- (3) Trial-and-error...
- Given R(A, B,C) try each of the following
to see if they make sense - A? B AB?C
- A?C AC?B
- B?A BC?A
- B?C
- C?A
- C?B
What about AB ? A ? B
? B ?
Just say ...plus all of the trivial dependencies
23Back to Global ICs
(2) Avoiding the expense Recall lno ? bname
preserved by
CREATE ASSERTION lno-bname CHECK ( NOT
EXIST (SELECT
FROM loan-info
l1, loan-info l2
WHERE l1.lno l2.lno AND
l1.bname ltgt l2.bname))
Is it necessary to have an assertion for every
FDs?
- Ans Luckily, no. Can preprocess FD set
- some FDs can be eliminated
- some FDs can be combined
24Functional Dependencies
Combining FDs
a. cname ?ccity CREATE ASSERTION name-city
CHECK ( NOT EXIST
(SELECT FROM
customer c1, customer c2
WHERE c1.cname c2.cname AND
c1.ccity ltgt
c2.ccity))
b. cname ?cstreet CREATE ASSERTION name-street
CHECK ( NOT EXIST
(SELECT FROM
customer c1, customer c2
WHERE c1.cname c2.cname AND
c1.cstreet ltgt
c2.cstreet))
?? ((c1.ccity ltgt c2.ccity) OR (c1.cstreet ltgt
c2.cstreet))
combine into cname ? ccity cstreet CREATE
ASSERTION name-city-street CHECK ( NOT
EXIST (SELECT
FROM customer c1, customer c2
WHERE c1.cname c2.cname
AND
??))
25Functional Dependencies
Determining unnecessary FDs
Consider cname ? cname
CREATE ASSERTION name-name CHECK ( NOT
EXIST (SELECT
FROM customer c1,
customer c2 WHERE
c1.cname c2.cname AND
c1.cname ltgt c2.cname))
cannot possibly be violated!
Note X ? Y s.t. Y X
a trivial dependency (true,
regradless of attributes
involved) So Dont create
assertions for trivial dependencies
26Functional Dependencies
Determining unnecessary FDs
even non-trivial FDs can be unnecessary e.g.
a . lno ? bname CREATE ASSERTION
lno-bname CHECK ( NOT EXIST
(SELECT
FROM loan-info l1,
loan-info l2 WHERE
l1.lno l2.lno AND
l1.bname ltgt l2.bname)) b.
bname?assets CREATE ASSERTION
lno-bname CHECK ( NOT EXIST
(SELECT
FROM loan-info l1,
loan-info l2 WHERE
l1.bname l2.bname AND
l1.assets ltgt l2.assets))
27Cont.
c. lno?assets CREATE ASSERTION
lno-bname CHECK ( NOT EXIST
(SELECT
FROM loan-info l1,
loan-info l2 WHERE
l1.lno l2.lno AND
l1.assets ltgt l2.assets))
But if (a) and (b) succeed, then c must also
28Functional Dependencies
Using FDs to determine global ICs Step 1
Given schema R A1, ..., An use
key constraints, laws of physics,
trial-and-error, etc ... to
determine an initial FD set, F. Step 2 Use FD
elimination techniques to generate an alternative
(but equivalent) FD set, F Step 3
Write assertions for each f in F . (for now)
Issues (1) How do we guarantee that
F F? ans closures
(2) How do we find a minimal F F?
ans canonical cover algorithm
29Functional Dependencies
Example suppose R A, B, C, D, E,
H and we determine that
F A ? BC,
B ? CE,
A ? E, AC ?
H, D ? B
Then we determine the canonical cover of F
Fc A ? BH,
B ? CE, D ?
B ensuring that F and Fc are equivalent
Note F requires 5 assertions
Fc requires 3 assertions
30Functional Dependencies
Equivalence of FD sets FD sets F and G are
equivalent if the imply the same set of FDs
e.g. A? B and B ? C implies A ? C
equivalence usually expressed in terms of closures
Closures For any FD set, F, F is the set of
all FDs implied by F. can calculate in 2
ways (1) Attribute Closure
(2) Armstrongs axioms Both techniques
tedious-- will do only for toy examples F
equivalent to G iff F G
31Attribute Closures
Given R A, B, C, D, E, H and
F A ? BC,
B ? CE,
A ? E,
AC ? H,
D ? B
Attribute closure CD Iteration
Result ----------------------------------- 0
C D 1
C D B 2 C D B E
What is the closure of CD (CD) ?
Algorithm att-closure (X set of Attributes)
Result ? X repeat until stable
for each FD in F, Y ? Z, do if
Y Result then
Result ? Result ? Z
32Attribute Closures
Q what is ACD ?
Ans ACD ? R
Q How do you determine if ACD is a superkey?
Ans it is if ACD
? R
Q How can you determine if ACD is a candidate
key? Ans It is
if ACD ? R
AC -/-gt R
AD -/-gt R not true gt AD is a
candidate key CD
-/-gt R
33Attribute Closures to determine FD closures
Given R A, B, C, D, E, H and
F A ? BC,
B ? CE,
A ? E,
AC ? H,
D ? B
F A ? A, B ? B,
C ? C, D ? D,
E ? E, H ?
H, AB ? AB,
AC ? AC, AD ? AD,
......
To decide if F,G are equivalent (1)
Compute F (2) Compute G (3) Is
(1) (2) ? Expensive F has 63 rules (in
general 2R-1 rules)
34FD Closures Using Armstrongs Axioms
- A. Fundamental Rules (W, X, Y, Z sets of
attributes) - 1. Reflexivity
- If Y X then X ? Y
- 2. Augmentation
- If X ? Y then WX ? WY
- 3. Transitivity
- If X? Y and Y ? Z then X?Z
- B. Additional rules (can be proved from A)
- 4. UNION If X ? Y and X ? Z then X
? YZ - 5. Decomposition If X ? YZ then X ? Y,
X ?Z - 6. Pseudotransitivity If X ? Y and WY ?
Z then WX ?Z
35FD Closures Using Armstrongs Axioms
Given F A ? BC,
(1) B ?
CE, (2) A
? E, (3)
AC ? H, (4)
D ? B (5)
Exhaustively apply Armstrongs axioms to generate
F F F ? 1. A ? B,
A ? C decomposition on (1)
2. A ? CE transitivity to 1.1 and (2)
3. B ? C, B ? E decomp to
(2) 4. A ? C, A ? E decomp
to 2 5. A ? H
pseudotransitivity to 1.2 and (4)