Title: Data Integrity
1Data Integrity
2Topic
- What is a transaction?
- What are the goals and strategies for maintaining
data integrity?
3What is a transaction?
- A transaction is a series of actions to be taken
on the database such that they must be entirely
completed or aborted - A transaction is a unit of work
- All-or-nothing rule serializability
4Transactions COMMIT ROLLBACK
- A transaction can have one of two outcomes.
- successful that transaction should be committed
and the DB reaches a new consistent state. - unsuccessful the transaction should be aborted
and the DB must be restored to the consistent
state it was in before the transaction started. - COMMIT An SQL command to make a set of DB
changes permanent. - ROLLBACK An SQL command to reverse a set of
temporary DB changes.
5A Transaction Example
- We wish to register the credit sale of 100 units
of product X to customer Y in the amount of 500 - Transaction Start
- Read account receivable customer Y and store it
in A - A A 500
- Write A as the new account receivable
- Read quantity on hand of product X and store it
in B - if B
- Transaction rollback
- else B B - 100
- Write B as the new quantity on hand
- Transaction Commit
6Goals Strategies for Data Integrity
- Maintaining data integrity implies 3 goals
- Protecting existence
- Maintaining quality
- Ensuring confidentiality
- The more complex the environment the more effort
that must be expended on maintaining integrity
7Strategies for Protecting Existence
- Preventative measure
- Isolation (i.e., insulate the DB from destruction
by keeping it in safe places, such as vaults) - Remedial measure
- Database backup and journaling
- Database recovery
- Switching to a duplicate database
- Backward recovery
- Forward recovery
- Reprocessing transactions
8Database update process
U
p
d
a
t
e
U
p
d
a
t
e
U
p
d
a
t
e
t
r
a
n
s
a
c
t
i
o
n
A
t
r
a
n
s
a
c
t
i
o
n
B
t
r
a
n
s
a
c
t
i
o
n
C
D
a
t
a
b
a
s
e
D
a
t
a
b
a
s
e
D
a
t
a
b
a
s
e
D
a
t
a
b
a
s
e
(
s
t
a
t
e
2
)
(
s
t
a
t
e
3
)
(
s
t
a
t
e
4
)
(
s
t
a
t
e
1
)
D
a
t
a
b
a
s
e
(
s
t
a
t
e
2
)
9Strategies for Maintaining Quality
- Update authorization
- Use of GRANT and REVOKE SQL commands
- Integrity constraints / data validation
- Entity integrity constraint
- Referential integrity constraint
- Data type and length of an attribute
- Domain of an attribute
- Etc.
- Concurrent update control
10GRANT
- Defines a users privileges
- Format
- GRANT privileges ON object TO users
- WITH GRANT OPTION
- An object is a table or view
- The keyword privilege can be ALL PRIVILEGES or
chosen from - SELECT
- UPDATE
- DELETE
- INSERT
- Privileges can be granted to everybody using the
keyword PUBLIC or - to selected users by specifying their user
identifier - WITH GRANT OPTION permits a user to pass
privileges to another - user
11Using GRANT
- Give Alice all rights to the STOCK table
- GRANT ALL PRIVILEGES ON STOCK TO ALICE
- Permit the accounting staff, Todd and Nancy, to
update the price of a stock - GRANT UPDATE (STKPRICE) ON STOCK TO TODD, NANCY
- Give all staff the privilege to select rows from
ITEM - GRANT SELECT ON ITEM TO PUBLIC
- Give Alice all rights to view STK
- GRANT SELECT, UPDATE, DELETE, INSERT ON STK TO
ALICE
12REVOKE
- Removes privileges
- Format
- REVOKE privileges ON object FROM users
- When a users privileges are revoked, so are all
those privileges which were passed on by this
user.
13Using REVOKE
- Remove Sophie's ability to select from ITEM
- REVOKE SELECT ON ITEM FROM SOPHIE
- Nancy is no longer permitted to update stock
prices - REVOKE UPDATE ON STOCK FROM NANCY
14Grant and Revoke
- GRANT SELECT ON EMPLOYEE TO A3
- WITH GRANT OPTION
- The clause WITH GRANT OPTION means that A3 can
now propagate the SELECT privilege to other
accounts by using GRANT - GRANT SELECT ON EMPLOYEE TO A4
- Notice that A4 cannot propagate the SELECT
privilege to other accounts. - If the SELECT privilege on the EMPLOYEE table is
revoked from A3... - REVOKE SELECT ON EMPLOYEE FROM A3
- then the DBMS must automatically revoke the
SELECT privilege on EMPLOYEE from A4.
15Concurrent Control
- Concurrency control is the process of managing
simultaneous operations on the database without
having them interfere with one another. - Concurrency control is important because the
simultaneous execution of transactions over a
shared database can create several data integrity
and consistency problems. - A common problem Lost Update
- A common solution Locking
16Lost Update Problem
- Lost update An apparently successfully
completed update operation by one user can be
overridden by another user - Example Assume that two transactions
independently deposit 100 to a same bank account
that originally has 200 in balance
17Lost Update Problem
Time
T2
T1
200
18Lost Update Problem
Time
T2
T1
200
start
start
read balance
200
19Lost Update Problem
Time
T2
T1
200
start
start
read balance
read balance
200
200
20Lost Update Problem
Time
T2
T1
200
start
start
read balance
read balance
balance balance 100
?
200
21Lost Update Problem
Time
T2
T1
200
start
start
read balance
read balance
balance balance 100
200
300
22Lost Update Problem
Time
T2
T1
200
start
start
read balance
read balance
balance balance 100
balance balance 100
?
300
23Lost Update Problem
Time
T2
T1
200
start
start
read balance
read balance
balance balance 100
balance balance 100
300
300
24Lost Update Problem
Time
T2
T1
300
start
start
read balance
read balance
balance balance 100
balance balance 100
write balance
300
300
25Lost Update Problem
Time
T2
T1
300
start
start
read balance
read balance
balance balance 100
balance balance 100
write balance
write balance
300
300
26Lost Update Problem
Time
T2
T1
300
start
start
read balance
read balance
balance balance 100
balance balance 100
write balance
write balance
commit
300
300
27Lost Update Problem
Time
T2
T1
300
start
start
read balance
read balance
balance balance 100
balance balance 100
write balance
write balance
commit
300
300
28Lost Update Problem
Time
T2
T1
300
start
start
read balance
read balance
balance balance 100
balance balance 100
write balance
write balance
commit
commit
300
300
29Lost Update Problem
Time
T2
T1
300
start
start
read balance
read balance
balance balance 100
balance balance 100
write balance
write balance
commit
commit
300
300
30Lost Update Problem
Time
T2
T1
300
start
start
read balance
read balance
balance balance 100
balance balance 100
write balance
write balance
commit
commit
What should be the correct final balance?
31Solution Locking
- Locking the lock prevents one transaction from
using the data item while another transaction is
using it. - Two types of locks
- Slocks (shared or read locks) - more than one
transactions may obtain a shared lock on a data
item - Xlocks (exclusive or write locks) - will be
granted if no other locks are held on the data
item
32Locking Example
- Again, assume that two transactions independently
deposit 100 to a same bank account that
originally has 200 in balance. This time, also
assume that locking is applied.
33Locking Example
T2
Time
T1
start
start
write-lock(balance)
read balance
34Locking Example
T2
Time
T1
start
start
write-lock(balance)
read balance
write-lock(balance) Waiting...
35Locking Example
T2
Time
T1
start
start
write-lock(balance)
read balance
write-lock(balance) Waiting...
balance balance 100
write balance
commit
unlock (balance)
36Locking Example
T2
Time
T1
start
start
write-lock(balance)
read balance
balance balance 100
write balance
commit
unlock (balance)
write-lock(balance)
read balance
balance balance 100
write balance
commit
unlock(balance)
Final Balance?
37Granularity of Locks
The most flexible multi-user data access / the
most computer overhead
- A field (attribute)
- A row
- A page
- A table
- The whole database
The less flexible multi-user data access / the
less computer overhead
38Potential Locking Problem Deadlock
- When two or more transactions wait indefinitely
for each other to release a lock, they are in a
deadlock (i.e., deadly embrace)
39A First Example of a Deadlock
- Ex
- User As update transaction xlocks record 1
- User Bs update transaction xlocks record 2
- User A attempts to read record 2 for update
- User B attempts to read record 1 for update
40A Second Example of a Deadlock
- Transactions, T1 and T2, request locks in
parallel - T1 locks S( A ),
- T2 locks S( C ),
41A Second Example of a Deadlock
- Transactions, T1 and T2, request locks in
parallel - T1 locks S( A ), W( B ),
- T2 locks S( C ), W( D ),
42A Second Example of a Deadlock
- Transactions, T1 and T2, request locks in
parallel - T1 locks S( A ), W( B ), S( C ),
- T2 locks S( C ), W( D ), W( A ),
43A Second Example of a Deadlock
- Transactions, T1 and T2, request locks in
parallel - T1 locks S( A ), W( B ), S( C ), S( D )
- T2 locks S( C ), W( D ), W( A ), ...
44A Second Example of a Deadlock
- Transactions, T1 and T2, request locks in
parallel - T1 locks S( A ), W( B ), S( C ), S( D ), ...
- T2 locks S( C ), W( D ), W( A ), ...
45Controlling Deadlock
- Two general techniques to control deadlocks
- Prevention A transaction has to lock in advance
all data items it will require if these data
items are already locked, the transaction has to
wait. - Resolution The DBMS detects and breaks deadlocks.
46Strategies for Ensuring Confidentiality
- Data access control
- Identification through
- remembered information (ex. password)
- possessed object (ex. plastic card)
- personal characteristic (ex. fingerprint)
- Authorization
- Use of GRANT and REVOKE SQL commands
- Encryption
- Sensitive information needs to remain secure
- Encryption hides the meaning of a message
- Decryption reveals the meaning of an encrypted
message