DBMS and Data Management - PowerPoint PPT Presentation

1 / 55
About This Presentation
Title:

DBMS and Data Management

Description:

( undoing the whole transaction) The Need for Concurrency Control ... The effect of the rollback in Transaction B is to undo the update of V at time T2. ... – PowerPoint PPT presentation

Number of Views:85
Avg rating:3.0/5.0
Slides: 56
Provided by: mcsy46
Category:

less

Transcript and Presenter's Notes

Title: DBMS and Data Management


1
DBMS and Data Management
2
DBMS
  • Definition A database management system (DBMS)
    is a set of software that are used to define,
    store, manipulate and control the data in a
    database.
  • define --- define data types, structures and
    constraints.
  • store --- store data to support efficient access.
  • manipulate --- perform retrieval and update
    operations using a query language.
  • control --- control access to data.
  • Database System Database DBMS

3
Integrity, Security
  • Topics
  • Integrity Constraints within a DBMS
  • Security of Data

4
Database Integrity
DBMS cannot protect against all accidental
errors a date being entered incorrectly, mistyp
ing a name, BUT DBMS should protect against
unreasonable entries, updates, deletions
etc. age gt 120 salary gt max_salary making
employees redundant without removing them from
the payroll data.
5
Types of Integrity Constraints
  • 1. The data model underlying the DBMS may have
    its own integrity constraints.
  • For example the relational model has two
    forms of integrity rule which must be satisfied
  • Entity Integrity
  • Referential Integrity
  • 2. There are also a set of constraints regarding
    permitted values for one or more attributes.
  • Domain Constraints
  • 3. In addition most DBMS permit additional
    (general) constraints known as
  • user specified integrity constraints
  • (aka enterprise constraints, or
    business rules)
  • (These are constraints which the data must
    satisfy in order to be meaningful to the
    organisation - User actually being data or
    database administrator)
  • Ideally the DBMS should enforce ALL the above
    types of constraint - however it often falls to
    the application program to protect against
    integrity violations.

6
  • Relational Model Entity Integrity
  • 1. a database reflects some part of reality
  • 2. unique objects in the "real" world must be
    identifiable and unique within the database
  • 3. objects are identified using identifiers
    (keys) made from 1 or more attributes
  • 4. identifiers must be
  • unique (for a given object
    type) minimal (have no spurious attributes)
  • That is the reason why PRIMARY KEY is NEEDED

7
  • SQL setting the primary key, (with 2
    attributes)
  • Create TABLE t (
  • attributeA type,
  • attributeB type,
  • PRIMARY KEY (attributeA, attributeB),
  • )

8
Justification a null valued key implies a
missing, or unknown, identity. If the identity
is missing then the object in the real world is
not identifiable. If the identity is unknown
then you cannot know if the object is already in
the database - therefore you cannot tell that if
the identity is unique.
9
Relational Model Referential Integrity A
foreign key is an attribute (possibly composite)
of one relation, whose value matches the primary
key of some tuple in a relation. (sometimes
called a posted identifier)
SQL setting the foreign key, Create TABLE t
( attributeA type, attributeB
type, attributeC type, PRIMARY KEY
(attributeA, attributeB), FOREIGN KEY
(attributeC) REFERENCES anothertable(referecedAtt
ribute), )
10
  • During update or delete
  • Restricted
  • Update/Delete only if no foreign key
    exists
  • Cascades
  • Update/Delete operation cascades to
    update/delete the foreign key entries
  • Nullifies (or set default value)
  • Update/Delete primary key values
  • Null (or default value) in the foreign
    key entry.

11
Domain Constraints
Each value appearing in a particular place in the
database should come from a specific domain of
values appropriate to the use of the value.
  • These can be defined as
  • an explicit set of values
  • supplier code domain (S1, S2, S3, S4, S5)
  • a range of valid values
  • age domain (0 .. 120)
  • or a pattern which values must satisfy
  • employee code domain ( AAAA9999AAA)
  • (4 letters followed by 4 digits followed by 3
    letters)
  • or (sometimes) just a datatype
  • size domain integer
  • or combinations of the above.

12
SQL for domain constrain
  • Create TABLE t (
  • attributeA DATE,
  • attributeB VARCHAR2(30), NOT NULL
  • attributeC CHAR(5), NOT NULL
  • )

13
User Specified Integrity Constraints
  • Categories of User Specified Integrity
    Constraints
  • Relation rules
  • record constraints
  • set constraints
  • which may be
  • static
  • dynamic (before and after alterations)

14
Relation Rules
What combination of attribute values are
acceptable in a relation?
  • Record constraints
  • Set constraints

treat the relation one row at a time For all
Vehicle.Type CAR then Vehicle.Wheels
(3,4) Vehicle.Type CYCLE then Vehicle.Wheels
(2,3) Vehicle.Type HOVERCRAFT then
Vehicle.Wheels null This prevents meaningless
combinations of data for an individual object in
the database.
apply to the relation as a whole, that is all
data in the table SUM(employee.salary) lt limit
COUNT(distinct dept.car) lt 10 AVG(staff.overtime)
IN /- 10 of 6
15
Relation rules may be Static or DynamicStatic
constraints
must hold between data items at ALL
times. employee.salary lt manager.salary IN
employee, manager WHERE employee.mgr
manager.mgr
Dynamic constraints must hold between database
states before AND after changes to the relation
or relations T1(salary) - T0(salary) gt 0 after
update(salary) at time T1. That is, any change to
salaries must be increasing and the integrity
constraints may relate to several tables.
16
Responses to Integrity Violation
Attempts to alter a database in ways which
violate integrity must be prevented Responses
include reject the attempted operation request/
prompt for alternative values abort the
current transaction Integrity checking may be
performed on input on deletion on
update deferred to some convenient time (end of
transaction)
17
Deferred Integrity Constraints Checking
Some checks MUST be deferred to a transaction
boundary (the transaction is the unit of
integrity). Consider a pair of
constraints For each project entry there
must be at least one employee. For each
employee there must be a project on which
they are employed
18
Security.
  • Why do we need security?
  • Many benefits accrue from the use of a DBMS,
    centralised data storage and access management
    making the contents of the database available to
    the user.
  • However, this centralised storage potentially
    means that all users have access to all possible
    data. Clearly this is not secure, additional
    commands must be available to prevent users
    accessing data which requires sensitive
    management.
  • Security features are required to prevent
    unauthorised
  • Disclosure
  • Alteration
  • Destruction
  • of objects in the database.

19
The Classical Solution to Security.
  • 1. Identify the user.
  • access through a login sequence requiring a user
    identifier.
  • 2. Authenticate the users identity.
  • require the user to input a secure password.
  • 3. Match the user identifier with a list of
    privileges.
  • 4. Privileges such as
  • what type of operation can be performed?
  • on what objects in the database?
  • 5. Before a transaction is executed check objects
    and operations required for the transaction
    execution against the privileges granted to the
    user.

20
The above would use an authority matrix to
determine which users had what rights on what
objects.
In a relational database the above would form
part of the database itself.
21
Example Privileges
SQL includes the grant statement (DCL) to grant
privileges on a table or view. GRANT
privilege_list ALL ALL PRIVILEGES ON
object_name TO authorisation_id_list
PUBLIC where privilege_list consists of a comma
separated list whose components come from select
insert delete update column_list references
column_list references gives a user the right
to create foreign keys that reference to the
table concerned. The person who issues the
statement must own the tables or views specified
in the statement. GRANT select, insert ON stock
TO smith GRANT ALL ON product TO fox GRANT update
(stock_code, stock_price) ON stock TO brown GRANT
select ON branch TO PUBLIC
22
The reverse of granting access privileges is to
REVOKE privileges. REVOKE privilege_list ALL
ALL PRIVILEGES ON object_name FROM
authorisation_id_list PUBLIC REVOKE select,
insert ON stock FROM smith REVOKE ALL ON product
FROM fox REVOKE update (stock_code,
stock_price) ON stock FROM brown REVOKE select
ON branch FROM PUBLIC
23
  • In order to limit a users access to data we
    could specify a set of views on to a relation.
  • The views would expose only those parts of the
    relation which the user is permitted to see, an
    abstraction of the underlying data.
  • No access privileges are granted on the
    underlying table, only on the views.
  • We use the External Schema to limit access to the
    Conceptual Data.

Consider, A database concerned with personnel
information which includes salary information,
the database is used both by the payroll office
which needs to see information on NI, Salary,
working hours etc. and also by the medical staff
who need to see records of home address and
phone numbers for emergency contacts.
24
Clearly we wish to restrict the payroll view to
exclude the medical information, and restrict the
medical view to exclude the financial
information. personnel
(USING SQL) CREATE VIEW medical
AS SELECT Name, Address, Phone,
Emergency_contact FROM personnel
CREATE VIEW payroll AS SELECT Name,
Address, NI, Salary, weekly_hours FROM
personnel GRANT select ON payroll TO
payroll_clerk GRANT select ON medical
TO nurse Will do what we require.
25
Notice that the view can be quite
sophisticated CREATE VIEW zwu (name,
address, salary) AS SELECT Name,
Address, Salary FROM personnel WHERE
personnel.Name "Zimin Wu" GRANT select ON zwu
TO Zimin Or how about summarising information
in the payroll file, without making the
individual salaries accessible?
CREATE VIEW salary_summary (total_salary)
AS SELECT sum(Salary) FROM
personnel GRANT select ON
salary_summary TO bank_manager So, the view is a
useful 'access limiting' tool.
26
A actual scenario
  • Suppose a teacher asks the teaching assistant
    (TA)to key in the scores of those students who
    have taken the re-sit exam of Database(2701) for
    him in a database system. What solution should
    the teacher have in order to protect the privacy
    of each student? (Hint show only necessary
    information and grant proper privileges to the
    TA)
  • Student information Table
  • ScoreSID,CID,score,resit,sname

27
Transaction and Concurrency Control
  • Topics
  • Large Scale Data Management
  • Lots of Data
  • Lots of Users
  • Lots of
  • What are Transactions?
  • Concurrency Control
  • Why?
  • How?

28
Large Scale Data Management Systems
  • To put into perspective the problems of large
    scale database management systems consider the
    requirements for the implementation of an
    electronic cash system (e.g. smart cards)
  • There are 400 Billion recorded consumer payment
    transactions each year.
  • Person to Person transactions account for a
    further 400 Billion.
  • There are 2 Trillion US dollars worth of sub 10
    transactions world-wide per year.
  • e-cash (Internet payment mechanisms) are
    expected to drop to the sub cent level for
    one-off per page access.
  • Banks like to keep accurate, secure records of
    ALL financial activities.
  • Credit card transactions are authorised in
    approx 30 secs - no matter how busy the system is
    (how many simultaneous attempts to authorise
    cards occur).

29
Transaction Processing
  • A database transaction represents a real world
    task.
  • A transaction can be defined as a program unit
    that has four core properties
  • Atomicity (a logical unit of work)
  • all or nothing, transactions are indivisible
    units.
  • Transactions either completely succeed fail no
    partial, or completely successes
  • Consistency (logical unit of integrity)
  • moves database from one consistent state to
    another.

30
  • Independence (logical unit of concurrency)
  • transactions execute independently of other
    transactions which may be occurring
    simultaneously within the database system.
  • Durability (logical unit of recovery)
  • failures in the database system cannot cause the
    effects of completed transactions to be undone.
    Once committed the effects are recoverable
    after failure.

31
How to represent a Transaction.
  • How to represent a Transaction.
  • Most DBMS (or Database Languages) have
    mechanisms similar to the SQL description which
    follows (they do differ from system to system)
  • A program unit can be one or more statements -
    we talk about multi-statement transactions (MST).
    A DBMS needs user support to inform it of
    the boundaries of a transaction.
  • Marking Transaction Boundaries
  • A transaction begins with the user's first
    executable SQL statement.SQL has two simple
    commands in the Data Manipulation Language to
    allow the user to mark the end of transactions.

32
  • COMMIT
  • marks the end of a transaction - and is used
    where the effects of the transaction are to be
    made visible to the other transactions in the
    database.
  • ROLLBACK
  • also marks the end of a transaction - it is used
    where you want to undo the effects of the
    transaction (rollback the state of the database)
    to the state it was in just before the BEGIN
    TRANSACTION was executed. (undoing the whole
    transaction)

33
The Need for Concurrency Control
  • Most databases are used in multi-user mode
    resulting in more than one transaction occurring
    simultaneously in the system.
  • Databases frequently have lots of users.
  • (Airline booking, Telesales Operations, Credit
    Card Mgt.)
  • Within a collection of transactions individual
    statements might be interleaved as well as the
    underlying operating system causing processes to
    be interleaved.

34
  • CLASSIC PROBLEMS
  • Without some form of concurrency management the
    effects of simultaneous transactions may
    interfere causing three possible types of
    problem
  • The lost update
  • The uncommitted dependency
  • Inconsistent analysis.

35
The Lost Update Problem
At time after T4 the update performed by
transaction A is lost the stored value of V is
30.
36
The Uncommitted Dependency
The effect of the rollback in Transaction B is to
undo the update of V at time T2. However,
Transaction A has fetched the updated value of V
at time T3. If another transaction executed the
same code as A after T4 the result would be
different.
37
Inconsistent Analysis
What is the value of the sum total? Is it
right?
38
Requirements of Concurrency Control
To avoid these problems transactions must be
serializable.
A non-serial schedule is any sequence of reads
and writes performed by a set of concurrent
transactions. A serial schedule is a schedule
where the read and write operations of each
transaction are executed consecutively without
any interleaved operations from other
transactions.
39
There is no interference, but still the two
sequences lead to different results.
40
Concurrency Control Locking.
  • Basically, locking is simple.
  • When a transaction is going to use some record
    or table and cannot risk that object changing
    then it acquires a lock on the object.
  • The effect of the lock is to prevent other
    transactions from gaining access to that object.
  • As usual the actual locking operations in most
    DBMS are more complex than the above, for example
  • locking a whole table may be unduly restrictive
  • the use of a single mode locking strategy
    prevents shared 'read only' access
  • Serialisability is not guaranteed with locks
    only.

41
Most DBMS use
  • Locking Modes
  • Read Locks
  • reading is none destructive so multiple readers
    can share a read locked object
  • writers are prevented from obtaining read locked
    objects
  • Write Locks
  • writers are destructive, readers and writers
    cannot obtain write locked objects
  • Some DBMSs (e.g. Oracle) provide for lock
    conversion (aka upgrade) automatically as
    appropriate

42
  • Multiple Level
  • locks can be acquired at different levels, e.g.
    table and row levels in Oracle.
  • may be escalated in certain DBMSs
  • Locking Phase
  • Growing Phase
  • locks are acquired throughout the transaction
    no locks are released until the end of the
    growing phase
  • Shrinking Phase
  • after 1st lock is released no additional locks
    can be acquired
  • This strategy ensures serializability.

43
  • How does INGRES do it?
  • Setting Locks
  • set automatically by the DBMS
  • Types of Locking
  • X - exclusive (write) locks, only one
    transaction at a time may hold this per
    resource.
  • S - shared (read) locks, multiple transactions
    can hold an S lock on this resource, but no
    transaction can obtain an X lock on a resource
    while any S lock is held on it.
  • IX/IS - intended X/S locks, taken on tables
    before X and S locks are taken at page level.
  • Locking Levels
  • Page level locks, physical device pages used to
    limit locks to relevant pieces of a table
    (needs the table to be physically distributed in
    a predictable manner, e.g. hashed key).
  • Table level locks, lock the whole table.

44
Summary of locking level usage in INGRES
Oracle uses a similar approach, but SELECT
statement (without FOR UPDATE in SELECT) does not
lock data.
45
Page Vs Table Locking.
  • Lock Escalation
  • If a query uses gt 10 pages then a table lock is
    taken. This prevents a large number of page locks
    being accumulated during execution.
  • If during execution more than 10 page locks are
    acquired on a table then they are promoted to a
    table lock, and the accumulated page locks are
    dropped.

46
Example 1 SELECT FROM emp WHERE name
"Jack" Actions 1) acquire table level IS
lock on emp. 2) acquire page level lock on
page 2 of emp table, assuming "Jack" is on page
2. 3) execute query. 4) drop all
locks. Example 2 BEGIN TRANSACTION SELECT FROM
dept WHERE dname "ADMIN" UPDATE emp SET
salary40000 WHERE position "MGR" END
TRANSACTION Actions 1) acquire table level
IS lock on dept. 2) acquire page level S
lock on page 1 of dept. 3) execute select
query. 4) acquire table level X lock on
emp, (position is not the primary key and is not
indexed so cannot use page locking) 5)
execute update query. 6) drop ALL locks.
47
What if you request a lock on a locked
table? Lock Compatibility
  • An attempt to obtain a lock on a previously
    locked object may result in the requesting
    transaction being blocked.
  • A blocked action is held in a wait state
    until the resource requested is freed.
  • BUT
  • this blocking may lead to DEADLOCK!

48
Consider
DEADLOCK!
49
  • Question
  • If the DBMS can detect deadlock can the above
    situation be resolved adequately?
  • Answer Yes (with varying levels of
    sophistication).
  • INGRES aborts (issues a rollback) one of the
    transactions arbitrarily and reports an error to
    the user.
  • Oracle reports an error to one of the
    transactions and expects the signalled
    transation to roll back explicitly to allow the
    other transaction to continue.
  • More elegant systems should rollback one or
    other transaction (silently, no need to concern
    the user) and reschedule the transaction from
    the information in the system log.
  • Really elegant systems will rollback the
    transaction which had performed the least work
    at the time the deadlock was detected.
  • Beware of transaction starvation (Livelock)
    solvable with a priority system.

50
16.6 Deadlock Handling
  • System is deadlocked if there is a set of
    transactions such that every transaction in the
    set is waiting for another transaction in the
    set.
  • Deadlock prevention protocols ensure that the
    system will never enter into a deadlock state.
  • Some prevention strategies
  • Require that each transaction locks all its data
    items before it begins execution
    (predeclaration).
  • Impose partial ordering of all data items and
    require that a transaction can lock data items
    only in the order specified by the partial order
    (graph-based protocol).

51
More Deadlock Prevention Strategies
  • Following schemes use transaction timestamps for
    the sake of deadlock prevention alone.
  • wait-die scheme non-preemptive(???)
  • older transaction may wait for younger one to
    release data item. Younger transactions never
    wait for older ones they are rolled back
    instead.
  • a transaction may die several times before
    acquiring needed data item
  • wound-wait scheme preemptive
  • older transaction wounds (forces rollback) of
    younger transaction instead of waiting for it.
    Younger transactions may wait for older ones.
  • may be fewer rollbacks than wait-die scheme.

52
Deadlock prevention (Cont.)
  • Both in wait-die and in wound-wait schemes, a
    rolled back transactions is restarted with its
    original timestamp.
  • Older transactions thus have precedence over
    newer ones, and starvation is hence avoided.
  • Timeout-Based Schemes
  • a transaction waits for a lock only for a
    specified amount of time. After that, the wait
    times out and the transaction is rolled back.
  • thus deadlocks are not possible
  • simple to implement but starvation is possible.
  • Also difficult to determine good value of the
    timeout interval.

53
Deadlock Detection
  • Deadlocks can be described as a wait-for graph,
    which consists of a pair G (V,E),
  • V is a set of vertices (all the transactions in
    the system)
  • E is a set of edges each element is an ordered
    pair Ti ?Tj.
  • If Ti ? Tj is in E, then there is a directed
    edge from Ti to Tj, implying that Ti is waiting
    for Tj to release a data item.
  • When Ti requests a data item currently being held
    by Tj, then the edge Ti Tj is inserted in the
    wait-for graph. This edge is removed only when Tj
    is no longer holding a data item needed by Ti.
  • The system is in a deadlock state if and only if
    the wait-for graph has a cycle. Must invoke a
    deadlock-detection algorithm periodically(????)
    to look for cycles.

54
Deadlock Detection (Cont.)
Wait-for graph with a cycle
Wait-for graph without a cycle
55
Deadlock Recovery
  • When deadlock is detected
  • Some transaction will have to rolled back (made a
    victim) to break deadlock.
  • Select that transaction as victim that will incur
    minimum cost.
  • Rollback -- determine how far to roll back
    transaction
  • Total rollback Abort the transaction and then
    restart it.
  • More effective to roll back transaction only as
    far as necessary to break deadlock.
  • Starvation happens if same transaction is always
    chosen as victim.
  • Include the number of rollbacks in the cost
    factor to avoid starvation
Write a Comment
User Comments (0)
About PowerShow.com