Title: Forensic Analysis of Database Tampering
1Forensic Analysis of Database Tampering
- Kyriacos Pavlou and Richard T. Snodgrass
- Computer Science Department
- The University of Arizona
2Introduction
- The problem: how to systematically perform forensic analysis on a compromised database.
- Recent federal laws (HIPAA, the Sarbanes-Oxley Act, etc.) and incidents of corporate collusion mandate audit log security.
- Snodgrass et al. (VLDB 2004) showed how to detect database tampering.
- Approach: hash the data manipulated by transactions with a cryptographically strong hash function, notarize the hash values, and periodically validate them.
- Forensic analysis seeks to ascertain:
- When the intrusion transpired
- What data was altered
- Who the intruder is
- Why the intrusion transpired
3Outline
- Tamper Detection
- Forensic Analysis
- The corruption diagram
- Types of corruption events
- Forensic Algorithms
- Three algorithms
- Forensic strength
- Future Work
4Tamper Detection
- Several related ideas together allow tamper detection:
- DBMS can maintain audit log in background
- Transaction-time table
- Append-only
- The modified data can be cryptographically hashed to produce a secure one-way hash of the transaction.
- The hash value is notarized with an external notarization service, so it cannot be changed afterwards.
- Implementation optimizations
- opportunistic hashing
- transaction ordering list
- linked hashing
- The latest hash value is a hash of all the changes made to the database since database creation.
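The linked-hashing idea on this slide can be illustrated with a short sketch. It assumes SHA-256 as the hash function and models the external notarization service as a plain dictionary. The names `link`, `chain_hash`, `notarize`, and `validate` are hypothetical and are not taken from the system the slides describe.

```python
import hashlib

def link(prev_hash: bytes, record: bytes) -> bytes:
    """Fold one record into the running hash (SHA-256 is an assumption)."""
    # Length-prefix the record so that record boundaries are unambiguous.
    payload = prev_hash + len(record).to_bytes(8, "big") + record
    return hashlib.sha256(payload).digest()

def chain_hash(records) -> bytes:
    """Hash covering every change since database creation."""
    h = b"\x00" * 32  # arbitrary fixed starting value
    for rec in records:
        h = link(h, rec)
    return h

# Stand-in for the external notarization service: id -> notarized hash.
notary = {}

def notarize(doc_id: str, records) -> None:
    notary[doc_id] = chain_hash(records)

def validate(doc_id: str, records) -> bool:
    """Rehash the audit log and compare it with the notarized value."""
    return notary[doc_id] == chain_hash(records)

audit_log = [b"insert:alice:100", b"update:alice:250", b"insert:bob:75"]
notarize("NE1", audit_log)
untouched = validate("NE1", audit_log)

tampered_log = list(audit_log)
tampered_log[1] = b"update:alice:999"  # simulated corruption event
tampered = validate("NE1", tampered_log)
print(untouched, tampered)
```

Because each hash depends on the previous one, changing any earlier record changes the latest hash value, which is why a single comparison suffices to detect tampering.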
5Tamper Detection
- Two phases
- Normal Processing
- Validation
- The validation result is a single bit.
6Definitions
- Corruption Event (CE): any event that corrupts data and compromises the database (intrusion, human intervention, bug).
- Corruption time (tc): the actual time instant at which a CE occurred.
- Validation Event (VE): the validation of the audit log by the Notarization Service (NS).
- Time of VE (tv): the time instant at which a VE occurred.
- Validation failure vs. validation success: the NS's answer to a query for a particular hash value, denoting tampering or the lack thereof, respectively.
- Notarization Event (NE): the notarization of a document by the NS.
- Time of NE (tn): the time instant at which an NE occurred.
7Definitions (cntd)
- Forensic analysis involves the following:
- Temporal detection: the determination of tc.
- Spatial detection: the determination of where, i.e., the location in the database of the data affected in a CE.
- This data is termed the corruption locus data (lc).
- In fact, we try to ascertain the locus time (tl), the time instant at which lc was originally stored (its transaction commit time).
- Note that a CE can have many lc's, in which case it is termed a multi-locus CE, or only one lc, in which case it is termed a single-locus CE.
8The Corruption Diagram
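[Figure: the corruption diagram. Axes: "When" and "Where". Notarization events NE0 to NE3, joined by links, are shown together with validation event VE1 (result TRUE). Legend: NE = Notarization Event, VE = Validation Event.]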
9The Corruption Diagram
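[Figure: the corruption diagram, extended. Axes: "When" (actual clock time) and "Where" (commit time). Notarization events NE0 to NE6, joined by links, are shown together with validation events VE1 and VE2 (both TRUE). Legend: NE = Notarization Event, VE = Validation Event, CE = Corruption Event.]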
10Forensic Analysis
- If a corruption is detected, the forensic analyzer springs into action.
- The analyzer tries to ascertain a corruption region: the bounds on the uncertainty of the "where" and "when" of the corruption.
11Notarization and Validation Intervals
- Non-aligned validation just delays detection of tampering.
- Validation factor V: I_V = V · I_N, where I_V is the validation interval and I_N is the notarization interval.
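The relationship between the notarization interval I_N, the validation interval I_V, and the validation factor V can be made concrete with a small schedule calculation. This is a sketch under stated assumptions: time is measured in whole days from database creation, validations are aligned with notarizations, and the function name `schedule` is hypothetical.

```python
def schedule(i_n: int, v: int, days: int):
    """Return (notarization days, validation days) for an aligned schedule."""
    i_v = v * i_n  # validation interval: I_V = V * I_N
    ne_days = list(range(0, days + 1, i_n))    # NE0 at day 0, then every I_N days
    ve_days = list(range(i_v, days + 1, i_v))  # VE1 at day I_V, then every I_V days
    return ne_days, ve_days

ne_days, ve_days = schedule(i_n=2, v=2, days=16)
print(ne_days)  # [0, 2, 4, 6, 8, 10, 12, 14, 16]
print(ve_days)  # [4, 8, 12, 16]
```

With I_N = 2 days and V = 2, every second notarization event coincides with a validation event, giving nine notarization events and four validation events over 16 days.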
12Analyzing Timestamp Corruption
- So far we have considered data-only CEs. We now examine the case where the timestamps of the tuples are changed.
[Figure: classification of CEs as data-only, backdating, or postdating, and as retroactive or introactive.]
13Monochromatic Algorithm
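[Figure: corruption diagram for the Monochromatic Algorithm. Axes: "When" and "Where". Notarization events NE0 to NE6; validation event VE1 is TRUE and VE2 is FALSE, at which point forensic analysis begins. The diagram marks the time of corruption (tc) and the place of corruption tl (commit time). The corruption region captures the uncertainty as to the position of the CE.]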
14Monochromatic Algorithm
- Central insight: the data can be rehashed by the validator and checked against the notarized values.
- Corruption region bounds: I_V × I_N
- Area is solely dependent on the two intervals.
- Cannot handle CEs involving timestamp corruption.
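The following is a minimal sketch of how such a corruption region might be bounded. It illustrates the bounding idea and is not the authors' algorithm as published. It assumes a single data-only CE, time measured in days, notarization events every I_N days starting at day 0, and that the validator has already rehashed the data and recorded, for each NE, whether the recomputed hash matched the notarized one. All function and parameter names are hypothetical.

```python
def monochromatic_region(ne_ok, i_n, failed_ve_day, i_v):
    """Bound a single data-only CE from per-NE rehash results.

    ne_ok[k] is True if the hash recomputed up to NE k (at day k * i_n)
    still matches the value notarized at NE k.
    """
    # With a cumulative hash chain, the first mismatching NE is the first one
    # that covers the corrupted tuple, so tl lies in the preceding interval.
    first_bad = next((k for k, ok in enumerate(ne_ok) if not ok), None)
    if first_bad is None:
        return None  # every notarized hash still matches
    where = (max(first_bad - 1, 0) * i_n, first_bad * i_n)  # bounds on tl
    # The CE occurred after the last successful validation event.
    when = (failed_ve_day - i_v, failed_ve_day)             # bounds on tc
    area = (where[1] - where[0]) * (when[1] - when[0])
    return {"where": where, "when": when, "area": area}

# NE0..NE6 every 2 days; VE1 at day 6 succeeded, VE2 at day 12 failed.
region = monochromatic_region(
    ne_ok=[True, True, False, False, False, False, False],
    i_n=2, failed_ve_day=12, i_v=6,
)
print(region)
```

In this example the region spans one notarization interval in the "where" dimension and one validation interval in the "when" dimension, so its area is I_V × I_N.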
15The RGB Forensic Algorithm
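[Figure: corruption diagram for the RGB Forensic Algorithm, with I_V = 4 days and I_N = 2 days. Axes: "When" and "Where". Notarization events NE0 to NE8; validation events VE1, VE2, and VE3 are TRUE and VE4 is FALSE, at which point forensic analysis begins. Red chains are notarized at VE1 and VE3, blue and green chains at VE2. The diagram also carries T and F labels and marks tc and tl.]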
16The RGB Forensic Algorithm
- Introduction of RGB partial hash chains
- Allows the bounding of both tl and tp
- Incurs extra NS cost
- Each of the two corruption regions is bounded by I_V × I_N
- We would like to reduce the area of the corruption regions.
17The Polychromatic Algorithm
[Figure: corruption diagram for the Polychromatic Algorithm, with I_V = 4 days, I_N = 2 days, and a desired uncertainty width of 1 day. Axis label: "When". Notarization events NE0 to NE8; validation events VE1, VE2, and VE3 are TRUE and VE4 is FALSE, at which point forensic analysis begins. Two red chains are notarized at VE1 and VE3; two blue chains and one green chain are notarized at VE2. A backdating CE is shown, with tc, tl, and the backdating time tb marked, along with T and F labels.]
18The Polychromatic Algorithm
- Introduction of extra partial hash chains
- Reduces uncertainty of corruption region
- Incurs additional NS cost
- Uncertainty can be arbitrarily shrunk via a logarithmic number of red and blue hash chains.
- Hence, the width is no longer dependent on I_V and I_N.
19Forensic Strength
- Components
- Work of forensic analysis
- Region-area of CE
- Width of postdating / backdating uncertainty
- Inverse Forensic Strength
- $\mathrm{IFS}(D, I_N, V) = \bigl(\mathrm{NumNotarizes}(D, I_N, V) + \mathrm{ForensicAnalysis}(D, I_N, V)\bigr) \cdot \mathrm{RegionArea}(I_N, V) \cdot \mathrm{UncertaintyWidth}(D, I_N)$
- where $V = I_V / I_N$ is the validation factor and $D$ is the number of days before the first validation failure.
- Monochromatic: $O(V \cdot D^2 \cdot I_N)$
- RGB: $O(V \cdot D \cdot I_N^2)$
- Polychromatic: $O((V + \lg I_N) \cdot D)$
- We assume that $D \gg I_N$.
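To get a feel for how the three bounds compare, the leading terms can be evaluated for sample parameters. This is a rough sketch: it drops all constant factors, takes lg as the base-2 logarithm, and reads the Polychromatic bound as (V + lg I_N) · D. The values chosen for D, I_N, and V are illustrative and do not come from the slides, and the function name is hypothetical.

```python
import math

def inverse_forensic_strength(d: int, i_n: int, v: int) -> dict:
    """Leading terms only; a smaller value means stronger forensic analysis."""
    return {
        "monochromatic": v * d**2 * i_n,
        "rgb": v * d * i_n**2,
        "polychromatic": (v + math.log2(i_n)) * d,
    }

# One year before the first validation failure, I_N = 2 days, V = 2.
ifs = inverse_forensic_strength(d=365, i_n=2, v=2)
for name, value in ifs.items():
    print(f"{name:14s} {value:,.0f}")
```

Under the assumption that D is much larger than I_N, the quadratic dependence on D dominates the Monochromatic bound, while the Polychromatic bound grows only linearly in D.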
20Future Work
- Develop a stronger lower bound for this problem.
- Accommodate multi-locus and complex CEs.
- Differentiate postdating and backdating CEs.
- Implement forensic analysis in validator.
- Consider interaction between transaction-time
storage manager and underlying WORM storage.
21Summary
- We have presented a means of performing forensic analysis.
- We have introduced a graphical representation to visualize CEs, termed the corruption diagram.
- We have designed three forensic algorithms:
- Monochromatic
- RGB
- Polychromatic
22Acknowledgements
- NSF grants IIS-0100436, IIS-0415101 and
EIA-0080123 and a grant from Microsoft provided
partial support for this work.