Title: Chapter 9 Integrity
1Chapter 9 Integrity
2Outline
- Introduction
- A Closer Look
- Predicates and Propositions
- Relvar Predicates and Database Predicates
- Checking the Constraints
- Internal vs. External Predicates
- Correctness vs. Consistency
- Integrity and Views
- A Constraint Classification Scheme
- Keys
- Triggers
- SQL Facilities
3Constraint Examples
- Every supplier status value is in the range 1 to
100 inclusive - CONSTRAINT SC1
- IS_EMPTY ( S WHERE STATUS lt 1 OR STATUS gt 100 )
- Every supplier in London has status 20
- If there are any parts at all, at least one of
them is blue - No two distinct suppliers have the same supplier
number - Every shipment involves an existing supplier
- No supplier with status less than 20 supplies any
part in a quantity greater than 500
4Introduction
- Originally focused on keys, the theory of
integrity has evolved to focus on constraints in
general - An integrity constraint is a boolean expression
associated with a database that is required to
evaluate at all times to true - RDBMS must monitor updates that might violate the
constraints and reject those that do - An integrity constraint can regarded as a formal
expression of a business rule - Constraints may be a priori or a posteriori
5A Priori vs a Posteriori Constraints
- Type implies an a priori constraint
- Since every attribute of every relvar is of some
type, the collection of types is a priori for the
relvar - Business rule constraints of the sort that are
represented by uniqueness and value constraints,
are a posteriori, that is, a result of decisions
after the fact
6Introduction (Cont.)
- The general formulation for a relvar constraint
is as follows If a certain tuple appears in a
certain relvar then that tuple satisfies a
certain condition - This applies equally to value constraints,
uniqueness constraints and key constraints - More generally, if certain tuples appear in
certain relvars then those tuples satisfy a
certain condition
7Key Constraint Formal Definition Beginning
- FORALL X ? S, xn ? NAME,
- xt ? INTEGER, xc ? CHAR,
- Y ? S, yn ? NAME,
- yt ? INTEGER, yc ? CHAR
- (IF S, x, SNAME xn, STATUS xt,
- CITY xc ? S AND
- S, y, SNAME yn, STATUS yt,
- CITY yc ? S
8Key Constraint Formal Definition Conclusion
- THEN (IF x y
- THEN xn yn, AND
- xt yt, AND
- xc yc ) )
- This expresses that S is a superkey, and
possibly a candidate key
9Predicates and Propositions
- An expression is a predicate, and its variables
are parameters to the predicate - When we instantiate the variables, we are passing
arguments to the predicate, and so turning it
into a proposition, which is either true or false - A constraint is an expression, and therefore a
predicate, which is checked by passing it
arguments, and evaluating the proposition to be
true or false
FORALL s ?S, sn ? NAME, st ? INTEGER, sc
?CHAR( IF S s, SNAME sn, STATUS st, CITY
sc ? S THEN st gt 1 AND st lt 100 )
10Checking the Constraints
- Constraints should be checked immediately before
attempting any insert or update or delete - This is equally true from an implementation
perspective and from the model - To do otherwise is inefficient, and violates the
Golden Rule - Golden rule no update operation must ever assign
to any database a value that causes its database
predicate to evaluate to false
11Internal vs. External Predicates The Closed
World Assumption
- Internal predicates are those understood and
enforced by the system - External predicates are those understood and
implemented by the user - Internal predicates should reflect external
predicates - The Closed World Assumption If an otherwise
valid tuple does not appear in a relvar, its
corresponding proposition is false
12Correctness vs. Consistency
- The Closed World Assumption is logically valid,
but is unenforceable by the system - External predicates are not understood by the
system therefore the system can enforce
consistency, but not truth - The external predicate for a relvar is its
intended interpretation - Thus a database may be populated by valid but
false propositions - Correctness implies consistency, and
inconsistency implies incorrectness - A database is correct iff it fully reflects the
true state of affairs in the real world
13A Constraint Classification Scheme
14Type Constraints
- A type constraint is a definition of the set of
values that constitute a given type - e.g. TYPE WEIGHT POSSREP D DECIMAL(5,
1) CONSTRAINT D gt 0.0 and D lt 5000.0 - Type constraints can always be thought of as
being checked during the execution of some
selector invocation - e.g. WEIGHT(7500.0)
- Type constraints are always checked immediately,
and no relvar can ever acquire a value for any
attribute in any tuple that is not of the
appropriate type - Type constraints can be dropped only by dropping
the type iteself
15Attribute Constraints
- An attribute constraint (a priori constraint) is
a constraint on the values a given attribute is
permitted to assume - e. g. S, SNAME, SCITY, CITY are constrained to
be of types S, NAME, INTEGER, and CHAR,
respectively - An attribute constraint can be dropped only by
dropping the attribute itself
VAR S BASE RELATION S S, SNAME NAME,
STATUS INTEGER, CITY CHAR
16Relvar and Database Constraints
- A relvar constraint is a constrain on the values
a given relvar is permitted to assume (involve
only one relvar) - The relvar predicate for a relvar R is the
logical AND or conjunction of all of the
constraints that apply to R - e.g. the relvar predicate for S is the
conjunction of constraints 1, 2, 4, 5, and 6, and
the relvar predicate for SP is the conjunction of
constraints 5 and 6 - A database constraint is a constraint on the
values a given database is permitted to assume
(involve two or more relvars) - A database predicate is the conjunction of all
predicates of its relvars
17The Golden Rule
- First version no update operation must ever
assign to any relvar a value that causes its
relvar predicate to evaluate to false - Final version no update operation must ever
assign to any database a value that causes its
database predicate to evaluate to false
18Transition Constraints
- Transition constraints can apply to a database or
a relvar, but not an attribute, or a type - Transition constraints constrain certain actions,
for example forbidding an update to change a
status from married to never married - If we have a way to refer within a single
expression to both (a) the value of the variables
in question before an arbitrary update and (b)
the value of that same variable after that same
update, then we have the means to formulate any
desired transition constraint
19Transition Constraints (Cont.)
- CONSTRAINT TRC1FORALL SX FOR ALL SX (SX.S ltgt
SX.S OR SX.STATUS lt SX.STATUS) - CONSTRAINT TRC2FORALL PX SUM(SPX WHERE SPX.P
PX.P, QTY) lt SUM(SPX WHERE SPX.P PX.P,
QTY)
20Keys Topics
- Candidate Keys
- Superkeys
- Primary Keys
- Alternate Keys
- Foreign Keys
- Referential Integrity
- Referential Actions
21Candidate Keys
- Let K be a set of attributes of relvar R. Then K
is a candidate key for R iff it has both of the
following properties - Uniqueness No legal value of R ever contains
two distinct tuples with the same value for K - Irreducibility No proper subset of K has the
uniqueness property - Candidate key definition examples
VAR S BASE RELATION S S, SNAME NAME,
STATUS INTEGER, CITY CHAR KEY S
VAR SP BASE RELATION S S, P P,
QTY QTY KEY S, P
Composite candidate key
Simple candidate key
22More Candidate Key Definition Examples
VAR ELEMENT BASE RELATION NAME NAME,
SYMBOL CHAR, ATOMIC INTEGER KEY NAME
KEY SYMBOL KEY ATOMIC
VAR MARRIAGE BASE RELATION HUSBAND NAME,
WIFE NAME, DATE DATE KEY HUSBAND, WIFE
KEY HUSBAND, DATE KEY WIFE, DATE
Note that a candidate key definition is really
just shorthand for a certain relvar constraint
All relvars, including views, have candidate keys
23Superkey
- A superset of a candidate key is a superkey
- A superkey has the uniqueness property, but it
does not have the irreducibility property - e.g. S, CITY is a superkey for S
- If SK is a superkey for R and A is an attribute
of R, then the functional dependency SK ? A
necessarily holds in R (see chapter 11) - There is no implication there must be an index
(or any other special physical access path) on a
candidate key
24Primary Keys and Alternate Keys
- If a relvar has two or more candidate keys, one
must be chosen to be the primary key - The others are then designated alternate keys
- This choice is logically arbitrary
- Logically, candidate keys are of paramount
importance choosing the primary key is ancillary
25Foreign Keys
- Loosely, a foreign key is a set of attributes of
some relvar R2 whose values are required to match
values of some candidate key of some relvar R1 - Formal definition A foreign key in R2 is a set
of attributes of R2, say FK, such that - There exists a relvar R1 (R1 and R2 not
necessarily distinct) with a candidate key CK - It is possible to rename some subset of the
attributes of FK, such that FK becomes FK and
FK and CK are of the same tuple type - For all time, each value of FK in the current
value of R2 yields a value for FK that is
identical to the value of CK in some tuple in the
current value of R1 (the converse is not a
requirement)
26Notes
- An FK value represents a reference to the tuple
containing the matching CK value (the referenced
tuple) - The constraint that values of FK must match
values of CK is known as referential constraint - The problem of ensuring that the database does
not include any invalid foreign key values is the
referential integrity problem - Referential diagram
Referential path
SP
S
P
R(n-1)
Rn
R1
R2
Referential cycle
S
P
SP
S
P
R(n-1)
Rn
Rn
R1
27Foreign Key Examples
- VAR SP BASE RELATION FOREIGN KEY S
REFERENCES S - VAR EMP BASE RELATION EMP EMP, , MGR_EMP
EMP, FOREIGN KEY RENAME MGR_EMP AS EMP
REFERENCES EMP - FOREIGN KEY ltitem commalistgt REFERENCES
ltrelvargt - Each item is an attribute name of the referencing
relvar or an RENAME expression
28Foreign Keys Referential Integrity
- The database must not contain any unmatched
foreign key values - Originally foreign keys were defined in terms of
the primary key in the referenced relvar, but
this qualification is superfluous, although
perhaps desirable in practice - The relationship in one directional it is not a
requirement that every referenced candidate key
value appear in the foreign keys of the referrer
29Foreign Keys Referential Actions
- DELETE may violate the referential integrity
constraint - Consider DELETE S WHERE S S(S1) NO
ACTION - If only delete S1 in S, there may be some
shipments made by S1 - The system should perform an appropriate
compensation action that will guarantee that the
overall result still satisfy the constraint - RESTRICT limits the action of the DELETE to just
those tuples that do not have referring tuples in
another relvar - CASCADE broadcasts the DELETE to include any
tuples that reference the affected tuples - UPDATE requires similar behavior
- CASCADE/RESTRICT/NO ACTION apply to all relvars
involved in a referential path
30Triggers
31Overview
- Declarative integrity support vs. procedural
integrity support - Triggered procedures are precompiled procedures
that are stored along with the DB and invoked
automatically whenever some specified event
occurs - Triggered procedures are not the recommended way
to implement integrity constraints (harder to
understand and harder to optimize) - The applicability of triggered procedures is not
limited to the integrity problem - Alert the user if some exception occurs (part
quantity goes below the danger level) - Debugging, Auditing, performance measurement
- Carrying out compensating actions
32Trigger Example
- CREATE VIEW LONDON_SUPPLIERAS SELECT S, SNAME,
STATUSFROM SWHERE CITY London - CREATE TRIGGER LONDON_SUPPLIER_INSERTINSTEAD OF
INSERT ON LONDON_SUPPLIERREFERENCING NEW ROW AS
RFOR EACH ROWINSERT INTO S (S, SNAME, STATUS,
CITY)VALUES (R.S, R.SNAME, R.STATUS, London)
33Notes
- CREATE TRIGGER specifies an event, a condition,
and an action - The event is an operation on the DB
- INSERT, DELETE, UPDATE, COMMIT, reaching a
specified time of day, exceeding a specified
elapsed time, violating a specified constraint - The condition is a boolean expression that has to
evaluate to TRUE in order for the action to be
executed (if no condition, always TRUE) - The action is the triggered procedure
- BEFORE, AFTER, or INSTEAD OF the specified event
- Can be performed on FOR EACH ROW, or FOR EACH
STATEMENT - There will be a way for the action be refer to
the data as it is both before and after the
specified event has occurred
34Notes (Cont.)
Declarative solutions, when available, are always
to be preferred to procedural ones
- Event condition triggering event event
condition action trigger - Triggers are also known as event-condition-action
rules (ECA rules) - A DB that has associated triggers is sometimes
called an active database - Use of triggers can be problematic in practice
- If the same event causes several distinct
triggers to fire, then the sequence in which they
do so might be important and undefined - Trigger chain (T1 ? T2 ? T3)
- Trigger T may cause itself to fire again,
recursively
35MySQL Triggers
- Trigger definition
- CREATE TRIGGER lt trigger name gt BEFORE AFTER
INSERT UPDATE DELETE ON ltbase table
namegtFOR EACH ROWlttriggered SQL statementgt - lttrigger namegt table_name _
trigger_type_abbreviation - t26_bu
- Does not support for each statement triggers
- Drop trigger
- DROP TRIGGER lttrigger namegt
36MySQL Triggers (Cont.)
- Transition variables
- NEW OLD.column_name
- With INSERT, only NEW is legal. With DELETE, only
OLD is legal. With UPDATE, both NEW and OLD are
legal.
37MySQL Triggers Example
- Create Table
- CREATE TABLE T22 (S1 INTEGER UNSIGNED NOT
NULL,S2 INTEGER UNSIGNED,PRIMARY KEY(S1) ) - Create Trigger
- CREATE TRIGGER T22_BIBEFORE INSERT ON T22FOR
EACH ROWBEGIN SET _at_x 'Trigger was
activated!' SET NEW.S2 NEW.S111END - INSERT INTO T22 (S1) VALUES (32)
38MySQL Triggers Example (Cont.)
- Implementing a CHECK constraint
- CREATE TABLE t25 (s1 INT, s2 CHAR(5), PRIMARY KEY
(s1)) - CREATE TRIGGER t25_bi BEFORE INSERT ON t25 FOR
EACH ROW IF LEFT(NEW.s2,1)ltgt'A' THEN SET
NEW.s10 END IF - CREATE TRIGGER t25_bu BEFORE UPDATE ON t25 FOR
EACH ROW IF LEFT(NEW.s2,1)ltgt'A' THEN SET
NEW.s10 END IF - Test
- INSERT INTO t25 VALUES (0,'a') / priming the
pump / - INSERT INTO t25 VALUES (5,'b') / gets error
'23000' /
39SQL Facilities
40Overview
- SQL does not support type constraints, nor
attribute constraints, nor relvar constraints,
nor database constraints - SQL supports base table constraints, which are a
superset of a subset of relvar and database
constraints - No view constraints
- SQL support general constraints called assertions
- SQL has no direct support for transition
constraints, which can be implemented
procedurally via triggers
41Base Table Constraints
- Specified on either CREATE TABLE or ALTER TABLE
- Support candidate key constraints, foreign key
constraints, and check constraints - Candidate key
- PRIMARY KEY (ltcolumn name commalistgt) (at most
one) - Each specified column is additionally assume to
be NOT NULL - UNIQUE ( ltcolumn name commalistgt) (any number)
42Base Table Constraints (Cont.)
- Foreign key
- FOREIGN KEY (ltcolumn name commalistgt)REFERENCES
ltbase table namegt (ltcolumn name commalistgt)
ON DELETE ltreferential actiongt ON UPDATE
ltreferential actiongt - lt referential actiongt NO ACTION (default),
RESTRICT, CASCADE, SET DEFAULT, SET NULL - The second ltcolumn name commalistgt is required if
the foreign key references a candidate key that
is not a primary key (matching on the basis of
column position within the commalist
43Base Table Constraints (Cont.)
- Check Constraints always true for an empty base
table - CHECK ( lt bool exp gt )
- Not limited to referring just to the involved
base table, but can instead refer to any
accessible portion of the DB - Example
- CREATE TABLE SP(S S NOT NULL, P P NOT NULL,
QTY QTY NOT NULL, PRIMARY KEY ( S, P ),
FOREIGN KEY ( S ) REFERENCES S ON DELETE
CASCADE ON UPDATE CASCADE, FOREIGN KEY (
P ) REFERENCES P ON DELETE CASCADE ON
UPDATE CASCADE,CHECK (QTY gt QTY (0) and QTY lt
QTY (5000) ) ) - NOT NULL is a shorthand for CHECK (ltcolumn namegt
IS NOT NULL)
44General Constraints Assertions
- Assertion definition
- CREATE ASSERTION ltconstraint namegt CHECK (
ltbool expgt ) - Drop a assertion DROP ASSERTION ltconstraint
namegt - Examples
- CREATE ASSERTION SC1 CHECK( NOT EXISTS (SELECT
FROM S WHERE S.STATUS lt 10 OR S.STATUS
gt 100 )) - CREATE ASSERTION SC4 CHECK ( UNIQUE (SELECT S.S
FROM S ) ) - CREATE ASSERTION SSP6 CHECK( NOT EXISTS ( SELECT
FROM S, SP WHERE S.STATUS lt 20 AND
S.S SP.S AND SP.QTY gt QTY (500) ) )
45Constraint and View
- CREATE VIEW LONDON SUPPLIER AS SELECT S,
SNAME, STATUS FROM S WHERE CITY London - UNIQUE (S) is an invalid SQL specification, but
- We can specify the general constraint to achieve
the same purpose - CREATE ASSERTION LSK CHECK( UNIQUE (SELECT S
FROM LONDON_SUPPLIER ) )
46Deferred Checking
- In the textbooks scheme, all constraints are
check immediately - In SQL, constraints can be defined to be
DEFERRABLE or NOT DEFERRABLE - If a constraint is DEFERRABLE, it can further be
defined to be INITIALLY DEFERRED or INITIALLY
IMMEDIATELY, which defines its state at the
beginning of each transaction - NOT DEFERRABLE constraints are always checked
immediately - DEFERRABLE constraints can be dynamically
switched on and off - SET CONSTRAINTS ltconstraint name commalistgt
ltoptiongt - SET CONSTRAINTS SSP5, SSP6 DEFERRED
47Deferred Checking (Cont.)
- DEFERRABLE constraints are checked only when they
are in the IMMEDIATE state - Setting a DEFERRABLE constraint into the
IMMEDIATE state causes the constraint to be
immediately checked - If the check fails, the SET IMMEDIATE fails
- COMMIT forces a SET IMMEDIATE for all DEFERRABLE
constraints if any integrity check then fails,
the transaction is rolled back
48Deferred Checking
- SQL constraint verification can be DEFERRABLE or
NOT DEFERRABLE - NOT DEFERRABLE means the check will be immediate
- DEFERRABLE offers the option of
- SET IMMEDIATE