Title: Distributed Transaction Processing
1. Distributed Transaction Processing
2. Review
- Question 1
  - Why does a hash help reveal file tampering? That is, why do you know there's a problem if hash(received_file) ≠ hash(original_file)?
- Question 2
  - Suppose Darth sets up shop in a local coffee shop where students use a publicly accessible WiFi network to check their email.
  - If Darth runs a packet sniffer and watches users connecting to U-M Telnet servers, under what conditions will he be able to read a user's account name and password in order to type them in manually later?
  - Would it help if the password were encrypted before being sent?
  - How can students protect themselves against this risk?
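For Question 1, the comparison can be tried out directly. This is a minimal sketch, assuming SHA-256 as the hash function (the slides do not name one) and assuming the digest of the original reaches the receiver over a channel the attacker cannot alter. The file contents and the helper names `sha256_hex` and `looks_untampered` are made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical file contents; one character differs between the two.
original_file = b"Pay Alice 100 dollars"
received_file = b"Pay Alice 900 dollars"

# Assumption: the sender publishes this digest over a trusted channel.
published_digest = sha256_hex(original_file)

def looks_untampered(received: bytes, expected_digest: str) -> bool:
    """True if the received bytes hash to the published digest."""
    return sha256_hex(received) == expected_digest

print(looks_untampered(original_file, published_digest))
print(looks_untampered(received_file, published_digest))
```

Even a one-character change produces a different digest, so a mismatch signals that the received file is not the original.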
3. [Figure: network diagram showing public hosts on the global Internet, two firewalls, a proxy server, and internal hosts in two protected enclaves]
4. Proxy Servers
- Proxy receives requests for certain applications
  - For example, an HTTP request for a particular URL
- Proxy checks if request is permitted
  - For example, users might not be allowed to access gambling sites from a corporate computer
- If request is okay, proxy passes request on to final destination
- Otherwise, request is denied
- Proxy may also serve a caching function
  - If request can be handled locally, don't bother to pass it on to final destination
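The decision steps above can be sketched in a few lines. This is an illustrative sketch only: the blocked host name, the in-memory cache, and the `fetch_from_origin` stand-in are all hypothetical, and no real network traffic is involved.

```python
from urllib.parse import urlparse

BLOCKED_HOSTS = {"gambling.example"}   # hypothetical corporate policy
cache = {}                             # URL -> previously fetched body
origin_calls = []                      # records requests passed on

def fetch_from_origin(url: str) -> str:
    """Stand-in for forwarding the request to its final destination."""
    origin_calls.append(url)
    return f"<content of {url}>"

def handle_request(url: str) -> str:
    host = urlparse(url).hostname
    if host in BLOCKED_HOSTS:
        return "403 Forbidden"         # request is denied
    if url in cache:
        return cache[url]              # handled locally from the cache
    body = fetch_from_origin(url)      # request is okay: pass it on
    cache[url] = body
    return body

print(handle_request("http://gambling.example/poker"))
print(handle_request("http://news.example/today"))
print(handle_request("http://news.example/today"))  # served from cache
```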
5. Typical Firewall Configurations
- Transparent
  - Allow incoming traffic to web server on port 80
  - Allow incoming traffic to any machine on ports above 1023
  - Allow outgoing traffic to any IP address, any port
  - Block all other packets
- Proxy as Bastion
  - In this configuration, the proxy is the only point of contact between the public and private networks
  - Allow incoming traffic to web server on port 80 and ports above 1023
  - Allow outgoing traffic from Bastion/Proxy server on ports 23, 80 to any IP address
  - Block all other packets
Note: In this context, the direction of the traffic indicates which host is responsible for opening the connection. Once open, data flows both ways.
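As a rough illustration, the "Transparent" rule set can be written as a function that classifies a connection attempt. This sketch assumes the second rule means ports above 1023, and the web server address is invented; `direction` records which side opens the connection, as in the note above.

```python
from dataclasses import dataclass

WEB_SERVER = "10.0.0.80"   # hypothetical internal address of the web server

@dataclass
class Connection:
    direction: str   # "in" or "out": which side opens the connection
    dst_ip: str
    dst_port: int

def transparent_policy(conn: Connection) -> str:
    """Apply the rules in order; anything unmatched is blocked."""
    if conn.direction == "out":
        return "allow"                 # outgoing: any address, any port
    if conn.dst_ip == WEB_SERVER and conn.dst_port == 80:
        return "allow"                 # incoming web traffic
    if conn.dst_port > 1023:
        return "allow"                 # incoming to high ports (assumed rule)
    return "block"                     # block all other packets

print(transparent_policy(Connection("in", WEB_SERVER, 80)))
print(transparent_policy(Connection("in", "10.0.0.5", 23)))
```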
6. More Permissive Configuration
- Block incoming traffic from known bad addresses
  - Avoids some IP spoofing attacks
- Block incoming traffic to known bad ports
  - E.g., multicast, if you're not using multicast
  - E.g., Napster
- Allow others
  - Security experts prefer policies that prohibit everything not explicitly permitted
  - "Permitted unless prohibited" enables more innovation
    - E.g., access to experimental new services
7. Vulnerability Assessment Tools
- Check configurations for known weaknesses
- Check for violations of organization's security policy
  - For example, an individual office computer that allows modem connections
- Simulate known attacks
8. Intrusion Detection Tools
- Monitor activity
  - Look for known signatures of cracking
  - Look for unusual activity
    - Requires some model of normal activity
- What to monitor
  - Host-based: logs of activity on individual machines
  - Network-based
    - Promiscuous mode intercepts all packets
    - Process them as fast as you can
    - Unlike packet filter, can look for patterns in sequences of packets
- Problem of false alarms
  - Each alarm requires human investigation
9. Learning Objectives
- Understand Transaction Processing Techniques
  - Logging
  - Atomic Commit
- Understand Concurrency Control Techniques
  - Locking
  - Merging
- Apply these principles and techniques in new settings
10. A Transaction
[Figure: a transaction is a collection of reads and writes that begins from a durable starting state. Successful completion leads to a durable, consistent ending state; an abort leads to a rollback.]
11. SI540 Players: Transaction Examples
- Withdraw 100
  - Read current balance
  - new balance = current - 100
  - Write new balance
  - Dispense cash
- Transfer 100
  - Read savings
  - new savings = current - 100
  - Read checking
  - new checking = current + 100
  - Write savings
  - Write checking
12. What Can Go Wrong?
- Some actions fail to complete
  - For example, the application software crashes
- Interference from another transaction
- Some data lost after actions complete
13. ACID Properties
- Atomic: Process all of a transaction or none of it; a transaction cannot be further subdivided (like an atom)
- Consistent: Data on all systems reflects the same state
- Isolated: Transactions do not interact or interfere with one another; transactions act as if they are independent
- Durable: Effects of a completed transaction are persistent
14. Transaction Processing Middleware
- Software components/libraries
  - Coordinate execution (processing) of transactions
  - Help assure maintenance of ACID properties
- Examples
  - X/Open DTP model (e.g., Encina, Tuxedo)
  - OTS interface extending CORBA (several vendors)
  - Java Transaction API
  - Microsoft Transaction Server
15. Transaction Processing Responsibilities
- Application programmer or user
  - Decides which actions are grouped into a transaction
- Transaction management software assures
  - Atomic commitment
  - No interference between processes (concurrency control)
- Resource management software provides
  - Database manipulation
  - Transaction logging and rollback
16. SI540 Players: Casting
- Application 1: transfer
- Application 2: withdraw
- Transaction Manager
- Resource Manager 1: savings account
  - File System 1
  - Database 1
- Resource Manager 2: checking account
  - File System 2
  - Database 2
17. No Long-Term Memory
- Applications, Transaction Managers and Resource Managers have main memory, no permanent storage
- RM1 has FS1 and DB1 as permanent storage
  - Similarly for RM2
18. SI540 Players: Crash Before Completion
- Application 1: Transfer 100
  - Read savings
  - new savings = current - 100
  - Read checking
  - new checking = current + 100
  - Write savings to DB
  - System crash before write of new checking balance
19. Recovery
- Rollback
  - Recover to starting state
  - Techniques
    - Take snapshot (checkpoint) of starting state
      - E.g., initial bank balance (and all other states)
      - And keep a redo log
    - Alternative: keep an undo log
      - E.g., bank balance changed; old value was x
- And Resume (if recoverable)
  - Redo all committed actions (since last checkpoint)
  - Or undo all uncommitted actions
  - Ignore all uncompleted/uncommitted actions
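The undo-log alternative can be illustrated with the crashed transfer from the previous slide. This is a sketch under simplifying assumptions: the "database" and the log are plain in-memory Python objects standing in for durable storage, and the balances are invented.

```python
# In-memory stand-ins for durable storage (assumption for illustration).
database = {"savings": 500, "checking": 200}
undo_log = []   # entries of (data item, old value)

def logged_write(item, new_value):
    """Record the old value before overwriting it."""
    undo_log.append((item, database[item]))
    database[item] = new_value

def rollback():
    """Undo uncommitted writes, most recent first."""
    while undo_log:
        item, old_value = undo_log.pop()
        database[item] = old_value

# Transfer 100: savings is written, then the system "crashes"
# before the new checking balance is written.
logged_write("savings", database["savings"] - 100)
rollback()
print(database)
```

After the rollback, the database is back in its starting state, as if the transfer had never begun.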
20. Creating REDO Log
- Keep a log of all database writes ON DISK (file system), so that it is still available after a crash
  - Each entry: transaction ID, data item, new value
  - E.g., (Tj, x = 125), (Ti, y = 56)
  - Actions must be idempotent (redoable)
    - NOT x = x + 100
- But don't write to the database yet
- At the end of transaction execution
  - Add "commit" to the log
  - Do all the writes to the database
  - Add "complete" to the log
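The logging sequence above might look like the following sketch. It makes several assumptions: the log and database are in-memory stand-ins (a real log lives on disk), the record layout `("write", tid, item, value)` is invented for illustration, and the balances are made up.

```python
# In-memory stand-ins; a real system keeps the log in a file on disk.
redo_log = []
database = {"savings": 500, "checking": 200}

def transfer(tid, amount):
    """Run a transfer, logging every write before touching the database."""
    new_savings = database["savings"] - amount
    new_checking = database["checking"] + amount
    pending = [("savings", new_savings), ("checking", new_checking)]
    # Log absolute new values (idempotent), not increments like x = x + 100.
    for item, value in pending:
        redo_log.append(("write", tid, item, value))
    redo_log.append(("commit", tid))      # end of transaction execution
    for item, value in pending:
        database[item] = value            # only now write to the database
    redo_log.append(("complete", tid))

transfer("T1", 100)
for record in redo_log:
    print(record)
```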
21. Recovery With REDO Log
- Restart after a crash by redoing the log
  - Any write for a committed but uncompleted transaction gets written again
    - What if the value was already written?
  - Any write for a non-committed or completed transaction is ignored
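A recovery pass over such a log can be sketched as follows. The record layout (`"write"`, `"commit"`, `"complete"` tuples) and the sample log are assumptions made for illustration. Because each write record holds an absolute value, rewriting a value that already reached the database is harmless.

```python
def recover(log, database):
    """Replay writes of transactions that committed but did not complete."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    completed = {rec[1] for rec in log if rec[0] == "complete"}
    for rec in log:
        if rec[0] != "write":
            continue
        _, tid, item, value = rec
        if tid in committed and tid not in completed:
            database[item] = value   # safe to repeat: the value is absolute
    return database

log = [
    ("write", "T1", "savings", 400),
    ("write", "T1", "checking", 300),
    ("commit", "T1"),                  # crash before "complete" was logged
    ("write", "T2", "checking", 250),  # T2 never committed: ignored
]
# Suppose the crash hit after savings was written but before checking was.
db_after_crash = {"savings": 400, "checking": 200}
recovered = recover(log, dict(db_after_crash))
print(recovered)
```

Running `recover` a second time on the recovered state changes nothing, which is the point of requiring idempotent log entries.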
22. SI540 Players: REDO Log
- First, demonstrate log creation in a crash-free environment
- Demo a crash after the application issues the write savings instruction?
- Demo a crash following the commit at the end of the transaction execution?
23. Distributed Commit
- Transaction involves multiple resource managers
  - Each has its own storage and logs
- Commit work of all processes or none of them
  - How to assure this atomic commitment?
24. Transaction Processing Architecture
[Figure: application logic, a transaction manager, and resource managers. Labeled interactions: join; prepare, commit, abort.]
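25. 2-Phase Commit (2PC)
[Figure: message flow between the transaction manager and the resource managers]
- Phase 1: transaction manager sends prepare() to the resource managers; each answers yes or no
- Phase 2
  - All yes: transaction manager sends commit() to the resource managers
  - One or more nos: transaction manager sends abort() to the resource managers, which roll back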
26. SI540 Players: 2-Phase Commit (2PC)
- System now includes a Transaction Manager (TM)
- Commit does not automatically follow transaction execution
  - Instead, TM verifies that all systems are prepared and then instructs them to commit or abort
  - Only then does Resource Manager (RM) add commit to log and perform writes
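The TM's role can be sketched as a small coordinator. This is a simplified illustration, not a full protocol implementation: the `ResourceManager` class and its `can_commit` flag are hypothetical, everything runs in one process, and timeouts, crashes, and logging are left out.

```python
class ResourceManager:
    """Hypothetical RM that votes on prepare() and obeys the TM's decision."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self) -> bool:
        if self.can_commit:
            self.state = "prepared"
        return self.can_commit        # the yes_or_no vote

    def commit(self):
        self.state = "committed"      # add commit to log, perform writes

    def abort(self):
        self.state = "aborted"        # roll back

def two_phase_commit(resource_managers) -> str:
    # Phase 1: ask every RM to prepare and collect all the votes.
    votes = [rm.prepare() for rm in resource_managers]
    # Phase 2: commit only if all said yes; otherwise abort everywhere.
    if all(votes):
        for rm in resource_managers:
            rm.commit()
        return "committed"
    for rm in resource_managers:
        rm.abort()
    return "aborted"

rms = [ResourceManager("savings"), ResourceManager("checking", can_commit=False)]
print(two_phase_commit(rms), [rm.state for rm in rms])
```

The votes are gathered into a list first so that every RM is asked, even after one has already answered no.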
27. Analysis of 2PC Scenarios
- What happens when
  - All say yes to prepare() request
  - One says no
  - One doesn't respond to prepare()
    - Perhaps recovers later
  - After commit() sent, some RM crashes
  - After all yes, TM crashes
28. Summary
- Need a way to coordinate work performed on data stored across multiple, distributed components
- These transactions should pass the ACID test; that is, they should be atomic, consistent, isolated, and durable
- Atomicity through 2-Phase Commit (2PC) and transaction logs for rollback and recovery
29. Learning Objectives
- Understand Transaction Processing Techniques
  - Logging
  - Atomic Commit
- Understand Concurrency Control Techniques
  - Locking
  - Merging
- Apply these principles and techniques in new settings
30. Review
- What are some examples of transactions that should pass the ACID test?
- Can you think of an example of when some of the ACID properties could be optional?
- Do you think that updates to the DNS are atomic? Why do you think the DNS was implemented this way?
- The 2-phase commit protocol ensures that transactions are atomic, but at what cost? What is sacrificed in order to provide atomicity?
31. A Concurrency Control Problem
- User A modifies document version v1, saves as version vA
- User B modifies document version v1, saves as version vB
- Neither vA nor vB is the most current
- Which version should the next user work from?
32. Pessimistic Concurrency Control: Locks
- A locks document v1
- A gets v1 and edits
- B wants to edit, but can't acquire the lock
- B waits
- A writes/saves doc as v2
- B gets notified
- Now B gets lock
- B gets v2
- B writes/saves doc as v3
- Disadvantage of locks
  - Reduces amount of parallel processing
33. Optimistic Concurrency Control: Merging
- A checks out (gets) v1
- B checks out (gets) v1
- A checks in (saves) doc, becomes v2
- B tries to check in doc
  - Version management software (e.g., CVS) notices v2 (saved by A) is more recent than v1 checked out by B
  - Alerts B to problem
  - May provide help in merging changes
    - Based on doing a diff of the versions
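The check-in test at the heart of this scheme can be sketched as a version comparison. This is an illustrative toy, not how CVS itself is implemented: the `Repository` class and `VersionConflict` exception are hypothetical, and merging is left to the caller.

```python
class VersionConflict(Exception):
    """Raised when a check-in is based on an out-of-date version."""

class Repository:
    def __init__(self, text):
        self.version = 1
        self.text = text

    def check_out(self):
        return self.version, self.text

    def check_in(self, base_version, new_text):
        # Optimistic check: accept only if nobody checked in since check-out.
        if base_version != self.version:
            raise VersionConflict(
                f"based on v{base_version}, but current is v{self.version}")
        self.version += 1
        self.text = new_text
        return self.version

repo = Repository("original draft")
a_base, a_text = repo.check_out()          # A checks out v1
b_base, b_text = repo.check_out()          # B checks out v1
repo.check_in(a_base, a_text + " + A")     # A's check-in becomes v2
try:
    repo.check_in(b_base, b_text + " + B") # B is alerted to the problem
except VersionConflict as err:
    print("conflict:", err)
```

No one waits for a lock here; the cost is that B must merge with v2 and retry.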
34. Transaction Interference As A Concurrency Control Problem
- Transfer 100
  - Read savings
  - new savings = current - 100
  - Read checking
  - new checking = current + 100
  - (gap)
  - (gap)
  - Write savings
  - Write checking
- Withdraw 100
  - Read checking
  - new checking = current - 100
  - Write checking
  - Dispense cash
(Steps are listed in time order; a gap marks a pause in that transaction.)
35. Isolation Of Transactions (Serializability)
[Figure: two concurrent transactions, Transaction_1 and Transaction_2, operating on shared resources give the same result as running Transaction_1 then Transaction_2, or Transaction_2 then Transaction_1.]
36. Bad (Not Serializable) Execution
- Transfer 100
  - Read savings
  - new savings = current - 100
  - Read checking
  - new checking = current + 100
  - (gap)
  - Write savings
  - Write checking
- Withdraw 100
  - Read checking
  - new checking = current - 100
  - Write checking
  - Dispense cash
(Steps are listed in time order; a gap marks a pause in that transaction.)
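To see why this interleaving is bad, it helps to run the numbers. The sketch below assumes the withdrawal executes entirely between the transfer's read of checking and its write of checking; the starting balances (500 and 200) are invented.

```python
def bad_interleaving(savings, checking):
    """Withdrawal runs between the transfer's reads and its writes."""
    # Transfer: read both balances and compute the new values.
    transfer_savings = savings - 100
    transfer_checking = checking + 100   # based on a soon-to-be-stale read
    # Withdraw: read, compute, write, dispense.
    checking = checking - 100
    cash = 100
    # Transfer: write both balances, overwriting the withdrawal's update.
    savings = transfer_savings
    checking = transfer_checking
    return savings, checking, cash

def serial_execution(savings, checking):
    """Withdraw first, then transfer (one of the two serial orders)."""
    checking = checking - 100
    cash = 100
    savings = savings - 100
    checking = checking + 100
    return savings, checking, cash

print(bad_interleaving(500, 200))    # (400, 300, 100)
print(serial_execution(500, 200))    # (400, 200, 100)
```

In the bad run the customer ends up with 800 in total (balances plus cash) from a starting 700, which matches neither serial order.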
37. OK (Same As Withdraw First)
- Transfer 100
  - Read savings
  - new savings = current - 100
  - (gap)
  - Read checking
  - new checking = current + 100
  - Write savings
  - Write checking
- Withdraw 100
  - Read checking
  - new checking = current - 100
  - Write checking
  - Dispense cash
(Steps are listed in time order; a gap marks a pause in that transaction.)
Withdrawal transaction is complete before transfer transaction accesses the shared data.
If using 2PC on each transaction, should new balance be made visible before or after commit? If new balance is visible before commit in 2PC, must be prepared for cascading rollback, hence must be careful on pre-commit. Simpler to make it visible after complete.
38. Avoiding Interference
- Acquire exclusive lock before accessing resource
  - Wait or abort if lock not available
- Resource manager enforces the lock
39. Two Phase Locking (2PL)
- 2PL guarantees serializability, using locks (reservations)
- Each transaction does the following
  - Phase one: acquire "locks" for all data to be accessed
    - If data already "locked" by another transaction, have to wait
    - Lockpoint is just after acquisition of the last lock
  - Perform the computations and updates
  - Phase two: release all the locks
- Claim: execution equivalent to serial execution in time ordering of lockpoints
  - If Ti acquires all its locks before Tj does, same result as if Ti executes before Tj starts (exercise: prove this)
- If also doing 2PC, unlock before or after commit?
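The two phases can be sketched with a toy lock table. Several things here are assumptions for illustration: the `LockManager` class is hypothetical, all locks are exclusive, and a transaction that finds a lock taken gives up and returns `False` (the "abort" option) rather than blocking until the lock is free.

```python
class LockManager:
    """Toy exclusive-lock table: data item -> transaction holding it."""
    def __init__(self):
        self.owner = {}

    def acquire(self, tid, item) -> bool:
        holder = self.owner.setdefault(item, tid)
        return holder == tid          # False if another transaction holds it

    def release_all(self, tid):
        for item in [i for i, t in self.owner.items() if t == tid]:
            del self.owner[item]

def run_two_phase(tid, items, update, db, locks) -> bool:
    # Phase one: acquire locks for all data to be accessed.
    for item in items:
        if not locks.acquire(tid, item):
            locks.release_all(tid)    # give up; caller may retry later
            return False
    # Lockpoint reached: perform the computations and updates.
    update(db)
    # Phase two: release all the locks.
    locks.release_all(tid)
    return True

def transfer(db):
    db["savings"] -= 100
    db["checking"] += 100

def withdraw(db):
    db["checking"] -= 100

locks = LockManager()
db = {"savings": 500, "checking": 200}
run_two_phase("T1", ["savings", "checking"], transfer, db, locks)
run_two_phase("T2", ["checking"], withdraw, db, locks)
print(db)
```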
40. SI540 Players: Two Phase Locking
- Start two transactions simultaneously
- Withdraw 100
  - Lock checking
  - Read checking
  - new checking = current - 100
  - Write checking
  - Unlock checking
  - Dispense cash
- Transfer 100
  - Lock savings
  - Read savings
  - new savings = current - 100
  - Lock checking
  - Read checking
  - new checking = current + 100
  - Write savings
  - Unlock savings
  - Write checking
  - Unlock checking
41. Generalizations
- Concurrency control schemes often distinguish between read and write access
  - If two transactions both read the same data, there is no constraint placed on the serial order
  - 2PL as described here works, but allows less concurrency, because it makes transactions wait even if both are only reading
- Other techniques besides 2PL are available
  - Timestamp ordering
    - Serial order is defined by start times of the transactions
    - Transaction logs inconsistent with that serial order cause undo of the bad transactions
    - Conflict recovery rather than prevention
42. Danger from Waiting: Deadlock
- Transaction 1
  - Lock A (granted)
  - Lock B (denied)
  - (waits)
  - Release all
- Transaction 2
  - Lock B (granted)
  - Lock A (denied)
  - (waits)
  - Release all
- Solution
  - All processes acquire locks in same order
  - Or detect and recover from deadlocks
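The first solution, acquiring locks in the same order everywhere, can be sketched with two threads. The choice of alphabetical order as the global order is an assumption for illustration; any fixed order shared by all transactions would do.

```python
import threading

locks = {"A": threading.Lock(), "B": threading.Lock()}
finished = []

def run(name, wanted):
    """Acquire the wanted locks in one global (alphabetical) order."""
    ordered = sorted(wanted)          # T2 asks for B, A but takes A first
    for key in ordered:
        locks[key].acquire()
    try:
        finished.append(name)         # the transaction's work goes here
    finally:
        for key in reversed(ordered):
            locks[key].release()

t1 = threading.Thread(target=run, args=("T1", ["A", "B"]))
t2 = threading.Thread(target=run, args=("T2", ["B", "A"]))
t1.start()
t2.start()
t1.join(timeout=5)
t2.join(timeout=5)
print(sorted(finished))
```

Because both threads take A before B, neither can hold B while waiting for A, so the circular wait from the slide cannot form.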
43. Transaction Processing Responsibilities
- Application programmer or user
  - Decides which actions are grouped into a transaction
- Transaction management software assures
  - Atomic commitment
  - No interference between processes (concurrency control)
- Resource management software provides
  - Database manipulation
  - Transaction logging and rollback
44. Graphical Summary
[Figure: message sequence among application server, resource manager, and transaction manager. Labels, in order:]
- Request(tp_ID,)
- join(tp_ID)
- Request_lock(tp_ID, res_ID)
- Lock
- OK or Already_locked
- More requests
- All_done(tp_ID)
- Prepare(tp_ID)
- Commit(tp_ID) or Abort(tp_ID)
- Commit or rollback
- Unlock
45. Summary
- Need a way to coordinate work performed on data stored across multiple, distributed components
- These transactions should pass the ACID test; that is, they should be atomic, consistent, isolated, and durable
- Atomicity and Durability through 2-Phase Commit (2PC) and transaction logs for rollback and recovery
- Consistency and isolation through 2-Phase Locking (2PL)