Title: Chapter 4: Transaction Management
1 Chapter 4: Transaction Management
- Title: Efficient Locking for Concurrent Operations on B-Trees
- Authors: Philip L. Lehman, S. Bing Yao
- Pages: 334-354
2 Efficient Locking for Concurrent Operations on B-Trees
- Problem
  - Problem statement
  - Why is this problem important?
  - Why is this problem hard?
- Approaches
  - Approach description, key concepts
  - Contributions (novelty, improvements)
- Assumptions
3 Problem Statement
- Given
  - Data on secondary storage devices
  - A database index
- Find: efficient locking
  - Locking mechanisms for search, insertion, and deletion
- Objectives
  - The mechanisms are safe under concurrent operations
- Constraints
  - Many processes are allowed to operate on the data simultaneously.
  - Processes do not share primary memory.
  - A disk page is the smallest unit of read and write.
  - Locks should not prevent other processes from reading the locked page.
4 Why is this problem important?
- The B-tree and its variants are widely used as data structures for storing large files of information on secondary storage devices.
- Most databases are manipulated concurrently by several processes.
5 Why is this problem hard?
- Locking the root may reduce concurrency.
- Operations depend on relationships between nodes
  - parent and child
- An insert may trigger splits that propagate up many levels
  - a split or insert can conflict with concurrent reads and inserts
- Naive concurrent operation on a B-tree can produce erroneous results.
  - A, B, C: blocks of primary storage
  - x, y, z: variables in primary storage
6 Novelty of Contribution
- Related work
  - The naive approach to the concurrent B-tree problem fails.
  - Using semaphores locks the entire subtree affected by updates.
  - B-tree
    - Locks are applied mostly in lower sections of the tree.
- Contributions
  - Uses a small, constant number of locks at any time
  - Locks only prevent concurrent updates; they do not block readers.
7 Principles of Blink-tree
- Add a single link pointer field to each node.
- The link provides an additional method for reaching a node.
- The two nodes produced by a split are joined by a link pointer and are functionally equivalent to a single node.
- The link pointer serves as a temporary fix that allows correct concurrent operation.
- Additionally, the Blink-tree enables serial search, i.e., retrieving the nodes on one level in order (e.g., retrieving only the leaves).
Reference: A. Guttman, "R-trees: A Dynamic Index Structure for Spatial Searching," 1984
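The node layout described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name `Node`, the field names, and the helper `scan_level` are hypothetical, and the `high_key` field (an upper bound on the keys a node may hold) is an assumption added here because the slides only mention the link pointer.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One tree node (one disk page in the slides' model)."""
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty for leaves
    link: Optional["Node"] = None    # right sibling on the same level
    high_key: Optional[int] = None   # assumed upper bound; None means "no bound"

    @property
    def is_leaf(self) -> bool:
        return not self.children


def scan_level(leftmost: Node) -> List[int]:
    """Serial search: walk one level left to right using only link pointers."""
    collected = []
    node = leftmost
    while node is not None:
        collected.extend(node.keys)
        node = node.link
    return collected


# Three leaves chained by link pointers.
c = Node(keys=[7, 9])
b = Node(keys=[4, 5], link=c, high_key=5)
a = Node(keys=[1, 2], link=b, high_key=2)
leaf_keys = scan_level(a)
```

Walking the links from the leftmost leaf visits every leaf without touching any parent, which is the serial search the last bullet refers to.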
8 Example of Blink-tree
9 Search, Insertion Algorithms
- Search
  - If the current node has been split, the search algorithm recovers by following the link pointer to the newly created node.
- Insertion
  - An insertion may cause
    - a node split (unsafe)
  - Lock a node before modifying it.
Example: splitting node a into nodes a and b
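A single-threaded sketch of how a search recovers from a split it did not see. All names here (`Node`, `move_right`, `split_leaf`, `search_leaf`) are hypothetical, and the `high_key` bound used to decide when to follow the link is an assumption not stated on the slides. Real locking and disk I/O are omitted; the three assignments that shrink node a stand in for what would be a single page write.

```python
class Node:
    def __init__(self, keys, high_key=None, link=None):
        self.keys = keys
        self.high_key = high_key  # assumed upper bound on keys; None = no bound
        self.link = link          # right sibling on the same level


def move_right(node, key):
    """Follow link pointers while the key lies beyond this node's bound."""
    while node.high_key is not None and key > node.high_key:
        node = node.link
    return node


def split_leaf(a):
    """Split leaf a: the upper half of its keys moves to a new node b."""
    mid = len(a.keys) // 2
    # Build b completely first, so that b is valid before anything points to it.
    b = Node(a.keys[mid:], high_key=a.high_key, link=a.link)
    # Then shrink a and link it to b (one page write in the slides' model).
    a.keys = a.keys[:mid]
    a.high_key = a.keys[-1]
    a.link = b
    return b


def search_leaf(leaf, key):
    """Search a leaf reached through a possibly stale pointer."""
    leaf = move_right(leaf, key)
    return key in leaf.keys


leaf = Node([10, 20, 30, 40])
stale = leaf                          # pointer a searcher read from the parent
split_leaf(leaf)                      # a concurrent insert splits the leaf
missed = 40 in stale.keys             # a plain B-tree lookup would miss key 40
found = search_leaf(stale, 40)        # following the link finds it
```

The stale pointer still leads to the right data because the new node is always reachable from the old one, even before the parent learns about the split.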
10 Locking Efficiency
- The insertion algorithm holds at most a constant number of locks (three) for any process at any time.
- Split → chaining across the level of nodes containing the parent to find the correct insertion position → three nodes are locked for the duration of one operation.
- This type of locking occurs rarely in a Blink-tree
  - Extremely small collision probability
Example: splitting node a into nodes a and b
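The three-lock bound can be illustrated by counting locks during the worst case named above: the split child is still locked while the inserter chains rightward across the parent level. This is only a bookkeeping sketch under stated assumptions. `LockTracker` and `propagate_split` are hypothetical names, nodes are represented by plain strings, and the exact point at which the child is unlocked is simplified.

```python
class LockTracker:
    """Records which nodes are locked and the peak number held at once."""

    def __init__(self):
        self.held = set()
        self.max_held = 0

    def lock(self, node):
        assert node not in self.held, "node is already locked"
        self.held.add(node)
        self.max_held = max(self.max_held, len(self.held))

    def unlock(self, node):
        self.held.remove(node)


def propagate_split(tracker, child, parent_level):
    """Simulate posting a split to the parent level.

    parent_level lists candidate parent nodes left to right; the last entry
    is assumed to be the node where the new separator key belongs.
    """
    tracker.lock(child)               # the node that was just split
    current = parent_level[0]
    tracker.lock(current)             # first candidate parent
    for right in parent_level[1:]:
        tracker.lock(right)           # lock the right neighbor first ...
        tracker.unlock(current)       # ... then release the node we leave
        current = right
    # The separator key would be inserted into `current` here.
    tracker.unlock(child)
    tracker.unlock(current)
    return tracker.max_held


worst = propagate_split(LockTracker(), "a", ["p1", "p2", "p3"])
no_chain = propagate_split(LockTracker(), "a", ["p1"])
```

However long the chain of parent-level nodes is, the peak stays at three: the child, the node being left, and the node being entered.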
11 Validation Methodology
- Correctness proof
  - Theorem 1: Deadlock freedom. The system cannot produce a deadlock.
    - Impose an order on the nodes: bottom to top, left to right.
    - The inserter places locks according to this well-ordering.
    - As long as the inserter follows the well-ordering, it never places a lock on any node below a locked node, nor on any node to its left.
  - Theorem 2: All put operations correctly modify the tree structure.
    - Classify put operations into three types.
    - Prove the correctness of the first case and show that consecutive put operations are equivalent to one change.
  - Theorem 3: Interaction theorem. The actions of an insertion process do not impair the correctness of the actions of other processes.
    - Classify three possible types of insertion.
    - Apply Lemma 3 to several aspects separately.
- Livelock: one process runs indefinitely.
  - An extremely unlikely problem
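The well-ordering behind Theorem 1 can be made concrete with a small check. The representation is an assumption for illustration: each node is a `(level, position)` pair with level 0 at the leaves and positions counted from the left, and the function names are hypothetical. The check is slightly stricter than the slide's statement, since it compares each new lock against the previous acquisition rather than only against locks still held.

```python
def order_key(node):
    """Bottom to top, then left to right within a level."""
    level, position = node
    return (level, position)


def follows_well_ordering(lock_sequence):
    """True if every lock comes strictly later in the order than the one before it."""
    return all(
        order_key(earlier) < order_key(later)
        for earlier, later in zip(lock_sequence, lock_sequence[1:])
    )


# Leaf, then two nodes moving right on level 1, then a node on level 2.
ok = follows_well_ordering([(0, 3), (1, 1), (1, 2), (2, 0)])
# Locking a lower node after a higher one breaks the order.
goes_down = follows_well_ordering([(1, 1), (0, 3)])
# Locking a node to the left on the same level breaks it too.
goes_left = follows_well_ordering([(1, 2), (1, 1)])
```

If every process acquires locks in increasing order, a cycle of processes each waiting for the next cannot form, because the nodes around such a cycle would each have to come strictly after the previous one.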
12 Class Exercise 1/2
- How can we resolve the erroneous behavior of the B-tree using the Blink-tree?
  - A, B, C: blocks of primary storage
  - x, y, z: variables in primary storage
13 Class Exercise 2/2
- Can an insert lead to deadlock? Livelock?
- Many nodes have two pointers pointing to them:
  - One from the parent
  - One from the left sibling
- Which one is created first?
- In figure (b), why was the right link created first?
Example: splitting node a into nodes a and b
14 Summary
- Paper's focus
  - Blink-tree implementation and correctness
- Ideas
  - The link provides an additional method to reach a node.
  - The two nodes produced by a split work as a single node via the link.
- Contributions
  - The locking scheme is simpler (no read locks).
  - Only a constant number of nodes are locked.
- Analytical validation
  - Correctness proofs
15 Assumptions, Rewrite Today
- Assumptions
  - Many processes can operate on the data simultaneously.
  - A process is allowed to lock and unlock a disk page.
- Rewrite today
  - Compare with newer methods
    - T-tree
  - Experimental evaluation
    - Simulation
    - Measure lock efficiency