Title: Views
1Data Integrity
2Data Management
- Effective management of data is critical in
modern organisations. Significant problems that
typically arise include - Multiple definitions of the same data entity (in
separate databases) - Inconsistent definition of similar data, where
there are such multiple definitions - this can be
hazardous when attempting to merge, consolidate
or aggregate data - Missing primary and foreign keys - data retrieval
is impossible - Poor knowledge of existing data, their location,
and their meaning
3Data Management
4Data Integrity
- Data Integrity is the correctness and
completeness of data in a database - Data integrity may be violated in several ways
- Inserting invalid data values
- Updating data to incorrect values
- Missing or ambiguous data
- Redundant orphan data not being purged
- Data operations lost because of system failure
- Partially-completed transactions applied
- Poor concurrency management
- Poor management of distributed transactions
- Inadequate data security
5Data Integrity
- DBMS systems assure data integrity by
implementing checks to - enforce entity integrity
- enforce referential integrity
- validate fields - is value within specified
domain ? (also called domain integrity) - enforce uniqueness constraints
- enforce required-value constraints (NOT NULL)
- manage transactions and concurrent operations
- manage distributed operations
- recover data after system crash
- define security constraints
- enforce business rules - using triggers
6Entity Integrity
- There are two critical aspects to Entity
Integrity - the primary key (identifier) of a table must
always contain a value - the primary key must always be unique
- In the case of composite (concatenated) primary
keys, no part of the key can be null - this has implications during logical database
design when normalising relations - The ANSI/ISO SQL standard specifies that the
primary key should be defined as part of the
CREATE TABLE statement. - The DBMS invokes the non-null uniqueness
constraint on every insertion or update.
7Entity Integrity SQL
- In SQL, primary key is explicitly specified in
CREATE TABLE statement - Example (generic syntax)
- CREATE TABLE table_1 (
- field_1 data_type PRIMARY KEY,
- field_2 data_type,
- .
- .
- .
- field_n data_type )
8Entity Integrity SQL
- Alternative example - composite key (generic
syntax) - CREATE TABLE table_1 (
- field_1 data_type,
- field_2 data_type,
- .
- .
- .
- field_n data_type,
- PRIMARY KEY (field_1, field_2) )
- However, SQL also permits a table to be created
without specifying a primary key
9Entity Integrity Microsoft Access
- Single-column or composite primary key can be
specified when creating a new table or altering
design of existing table
10Entity Integrity Microsoft Access
- If a primary key is not defined when attempting
to save the definition of a newly created table,
Access prompts user - may create key of
AutoNumber (Counter) type
11Uniqueness
- Sometimes a column, other than the primary key,
also contains a value which is unique to each
record - This would be the case for records where there
might be alternative primary keys - for example - Vehicle Registration Number, Chassis Number
- Employee Staff ID, RSI Number, Email Address
- Supplier Supplier Code, Telephone Number, Fax
Number, WWW Address (URL)
12Uniqueness SQL
- In SQL, unique columns (fields) are explicitly
specified in CREATE TABLE statement - Example (generic syntax)
- CREATE TABLE table_1 (
- field_1 data_type PRIMARY KEY,
- field_2 data_type,
- field_3 data_type UNIQUE,
- .
- .
- .
- field_n data_type )
- If a field is specified as (being part of) the
primary key, it is unnecessary to specify it as
UNIQUE - in fact, if you try to do so, may cause errors
in some dialects of SQL !
13Uniqueness Microsoft Access
- Uniqueness can be specified in Field Properties
by setting value of Indexed property to Yes (No
Duplicates)
14Required Data
- There will often be situations where fields,
other than the primary key field, must contain a
value - Example (Structured English Algebraic Notation)
- CUSTOMER Customer_Code
- Customer_Name
- Customer_Address
- (Customer_PhoneNum)
- (Customer_Email)
- Every INSERT operation must specify a non-null
value - Every UPDATE operation must specify a non-null
value - NULL is not equivalent to zero or to the empty
string ()
15Required Data SQL
- The NOT NULL constraint is specified as part of
the CREATE TABLE statement - Example (generic syntax)
- CREATE TABLE table_1 (
- field_1 data_type PRIMARY KEY,
- field_2 data_type NOT NULL
UNIQUE, - field_3 data_type,
- field_4 data_type NOT NULL,
- .
- .
- .
- field_n data_type )
- If a field is specified as (being part of) the
primary key, it is unnecessary to specify it as
NOT NULL
16Required Data Microsoft Access
- Required values can be specified in Field
Properties by setting value of Required property
to Yes - unnecessary for Primary Key field(s)
17Required Data Microsoft Access
- Alternatively, required values can be specified
in Field Properties by setting value of
Validation Rule property to Is Not Null
18Domain Integrity
- Appropriate checks must be designed to ensure
that no field takes on a value that is outside of
the range of legal values - Correct data type ?
- Field size restrictions ?
- Set of discrete values ?
- Range of allowable values ? Upper and / or lower
limits ? - Pattern matching ? Standard input mask / format ?
- Special case foreign key domains (referential
integrity) - The DBMS should invoke these validation checks on
every INSERT or UPDATE operation
19Domain Integrity SQL
- Data types and field sizes are specified as part
of the CREATE TABLE statement - Example (generic syntax)
- CREATE TABLE table_1 (
- field_1 CHAR(20) PRIMARY KEY,
- field_2 CHAR(15),
- field_3 NUMBER(3),
- field_4 NUMBER(5,2),
- field_5 DATE,
- .
- .
- .
- field_n data_type )
- Note data types vary between dialects of SQL
20Domain Integrity SQL
- Check constraints can also be specified as part
of the CREATE TABLE statement - Example (generic syntax)
- CREATE TABLE table_1 (
- field_1 CHAR(20) PRIMARY KEY,
- field_2 CHAR(15),
- field_3 NUMBER(3),
- field_4 NUMBER(5,2),
- field_5 DATE,
- .
- .
- .
- field_n data_type,
- CHECK (field_2 IN (EUROPE, FAR EAST,
NORTH - AMERICA, AUSTRALASIA)),
- CHECK (field_5 BETWEEN
- 01-JAN-1999 AND 01-JAN-2000) )
21Domain Integrity Microsoft Access
- Microsoft Access supports a variety of data types
22Domain Integrity Microsoft Access
- Further constraints can be specified in Field
Properties using Field Size, Input Mask,
Validation Rule, Validation Text
23Referential Integrity
- Parent - Child relationships exist between
tables in a relational database - This distinction is also referred to as Master -
Detail or Strong Entity - Weak Entity - All relationships in a relational database are
effectively one-to-many - many-to-many are broken down into multiple
equivalent one-to-many relationships using
intersection tables - one-to-one are subsumed by one-to-many
- The table at the many side of the relationship
is the Child, and the table at the one side is
the Parent - One-to-many relationships are implemented by
defining a foreign key in the Child table
24Referential Integrity Problems
- When creating a record in the Child table, the
value inserted into the foreign key must match
one of the values in the primary key of its
Parent - Otherwise, the database is corrupted (an Orphan
is created) - Occasionally, NULL values are allowed in foreign
keys - The same rule applies when updating a record in
the Child table - When deleting a row from the Parent table, if
there are matching rows in the Child table(s)
which are not also deleted, they become orphans - Likewise, when updating the value in the primary
key of a Parent table, the relationship with
associated records in the Child table(s) may be
inadvertently severed
25Referential Integrity Solutions
- Relationships should be explicitly defined in the
database definition - The second problem is overcome by checking the
value of the updated foreign key and disallowing
illegal unmatched values - However, note a potential problem though a value
might be matched, it need not be correct ! - Example if a clerk were keying credit card
numbers, and mis-keyed a number, but if by
coincidence the mis-keyed number was valid, the
wrong person ends up getting billed ! - ? The third and fourth problems are similar and
can be resolved in a variety of ways - No Restriction
- Restrict
- Cascade
- Set Null
- Set to Default Value
26Referential Integrity Solutions
- No Restriction
- Any record in the table may be updated / deleted
without regard to any records in any other tables - Restrict (Delete)
- A deletion of a record in a parent table must be
disallowed until any matching records are deleted
from child table(s) - Cascade
- A deletion / update of a record in a parent table
must be automatically followed by the deletion /
update of matching records in related child
table(s) - Set Null / Set to Default Value
- A deletion / update of a record in a parent table
must be automatically followed by setting any
matching foreign keys in related child table(s)
to NULL or specified default value
27UPDATE and DELETE Operations
- The Restrict rule applies at a singular level
it only affects the Parent table in a
relationship - Set Null / Set to Default Value are two-level
rules which stop at the Child table - In contrast, the Cascade rule applies to
multiple levels, and should therefore be
carefully used - data can be inadvertently erased or changed if
invoked in inappropriate circumstances
28Referential Integrity SQL
- Foreign keys and update / delete rules are
specified as part of the CREATE TABLE statement - The consequence is that Parent tables must be
created first - Example (generic syntax)
- CREATE TABLE table_1 (
- field_1 CHAR(20) PRIMARY KEY,
- field_2 CHAR(15),
- field_3 DATE,
- .
- .
- .
- FOREIGN KEY (field_n) REFERENCES table_2
- ON DELETE CASCADE )
29Referential Integrity Microsoft Access
- Microsoft Access supports referential integrity,
and the Restrict and Cascade policies
30Referential Integrity Microsoft Access
- Combo (drop-down) boxes or list boxes can be
used in forms to assure valid domain for foreign
key fields
31Referential Cycles
- Occasionally, there may be more than a single
relationship between a pair of tables - If null values were not allowed in the Foreign
Key columns, insertion deadlocks could arise - SQL permits null values in a foreign key field
- The delete and update rules that can be defined
are also affected by this cyclic relationship
32Business Rules
- Many data integrity constraints are tied to
business rules which fall beyond the immediate
scope of SQL - Enforcement of these rules is the responsibility
of the Database Administrator through development
of application programs - Triggers / Event Procedures may be defined,
being an action or set of actions to be
automatically executed if a particular event were
to occur - Example a re-order may be triggered if stock
levels of a particular item were to drop below a
specified level