Title: Database Security (Chapter 8, Sections 4-7)
1 Database Security (Chapter 8, Sections 4-7)
- Student: Ying Hong
- Course: Database Security
- Instructor: Dr. Yang
2 Contents
- Sensitive Data
- Inference Problem
- Multilevel Databases
- Proposals for Multilevel Security
- Concluding Remarks
- References
3 Sensitive Data: Introduction (1)
- Sensitive data is data that should not be made public.
- Three kinds of databases:
  - One that contains nothing sensitive
  - One in which everything is sensitive
  - One that contains some but not all sensitive data, possibly with varying degrees of sensitivity
4 Sensitive Data: Introduction (2)
- The access control problem is to limit users' access so that they can obtain only the data to which they have legitimate access.
- Several factors can make data sensitive:
  - inherently sensitive
  - from a sensitive source
  - declared sensitive
  - part of a sensitive attribute or a sensitive record
  - sensitive in relation to previously disclosed information (composite data)
5 Sensitive Data: Access Decisions (1)
- Access decisions are made by the database administrator and are based on an access policy.
- The DBMS implements the access decisions.
- Several factors are considered when deciding whether to permit an access:
- Availability of the data: some required data may not be accessible.
  - Example: locking of tuples during an update
  - In serious cases, blocked access may result in denial of service (DoS).
6 Sensitive Data: Access Decisions (2)
- Acceptability of the access: some data may be sensitive and not accessible to some users. This control is not as simple as it sounds, because:
  - the sensitive fields may not be directly requested but only referenced
  - some users may want a nonsensitive statistic derived from the sensitive data
  - Example: p.351
7 Sensitive Data: Access Decisions (3)
- Assurance of authenticity: certain characteristics of the user that are external to the database may also be considered.
  - access to the database may be permitted only during working hours
  - previous requests made by the user may be considered (because sensitive data can sometimes be revealed by combining less sensitive data)
8 Sensitive Data: Types of Disclosures (1)
- Data can be sensitive, and even information about the data can be a form of disclosure. A successful security strategy must therefore protect against both direct and indirect disclosure.
- Exact data: the sensitive data itself is disclosed.
- Bounds: by learning bounds on a sensitive value and applying a narrowing technique, the user may determine the value to any desired precision.
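The narrowing technique behind bounds disclosure can be sketched as a binary search. This is an illustration only: `salary_exceeds` is a hypothetical yes/no oracle standing in for any query that reveals whether a hidden value lies above a threshold, and the salary figure is made up.

```python
def narrow(oracle, low, high):
    """Binary-search a hidden integer in [low, high] using only
    'is the value greater than x?' answers."""
    queries = 0
    while low < high:
        mid = (low + high) // 2
        queries += 1
        if oracle(mid):      # hidden value > mid
            low = mid + 1
        else:                # hidden value <= mid
            high = mid
    return low, queries


SECRET_SALARY = 58300  # made-up value; the attacker never reads it directly


def salary_exceeds(threshold):
    # Hypothetical stand-in for a query such as
    # "is this person's salary greater than threshold?"
    return SECRET_SALARY > threshold


value, queries = narrow(salary_exceeds, 0, 1_000_000)
print(value, queries)
```

Each answer halves the interval, so a range of about a million values is resolved exactly in at most 20 bound queries.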
9 Sensitive Data: Types of Disclosures (2)
- Negative result: a query may be crafted to obtain a negative result from which sensitive data can be inferred.
- Existence: the existence of data may itself be sensitive, regardless of the actual value.
- Probable value: it may be possible to determine the probability that a certain element has a certain value.
10 Sensitive Data: Security vs. Precision
- Sharing of nonsensitive data
  - Security: disclose only nonsensitive data, and reject any query that mentions a sensitive field.
  - Precision: protect all sensitive data while disclosing as much nonsensitive data as possible.
- The ideal combination is perfect security with maximum precision. In practice, we must often sacrifice precision to maintain security.
- Fig. 8-8, p.354.
11 Inference Problem
- The inference problem is the derivation of sensitive data from nonsensitive data. It is a subtle vulnerability in database security.
- A sample table
12 Inference Problem: Direct Attack (1)
- A direct attack is an attempt to retrieve values by directly querying sensitive fields.
- Example:
    SELECT Name
    FROM Sample
    WHERE Sex = 'M' AND Drugs = 1
- A direct attack may be disguised with extra clauses that appear to broaden the query but select no additional records.
- Example:
    SELECT Name
    FROM Sample
    WHERE (Sex = 'M' AND Drugs = 1)
       OR (Sex <> 'M' AND Sex <> 'F')
       OR Dorm = 'Ayres'
13 Inference Problem: Direct Attack (2)
- The rule of "n items over k percent": data should not be released if n items represent over k percent of the result reported.
- In the previous example, the one person selected represents 100 percent of the data reported.
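One way to read the "n items over k percent" rule is as a dominance check on the values behind a reported result. The sketch below follows that reading; the function name, the choice of the n largest contributions, and the sample numbers are assumptions for illustration.

```python
def violates_n_k_rule(contributions, n, k):
    """Return True if the n largest contributions make up more than
    k percent of the reported total, so the result should be withheld."""
    total = sum(contributions)
    if total == 0:
        return False
    top_n = sum(sorted(contributions, reverse=True)[:n])
    return 100.0 * top_n / total > k


# One selected record is 100 percent of the result: withhold it.
print(violates_n_k_rule([1], n=1, k=90))
# Four made-up contributions, none dominant: release is allowed.
print(violates_n_k_rule([5000, 3000, 2000, 4000], n=1, k=50))
```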
14 Inference Problem: Indirect Attack (1)
- An indirect attack infers a result from one or more statistical results.
- Several examples of indirect attack:
  - Sum: a reported sum may be used to infer a value. (Table 8-3: sums of financial aid by dorm and sex)
  - Count: a count can be combined with a sum to produce even more revealing results. (Table 8-4: count of students by dorm and sex)
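A small sketch of why a count makes a sum more revealing: when a cell's count is 1, the published sum for that cell is exactly one person's value. The records, dorm names, and amounts below are made up and are not the data of Tables 8-3 and 8-4.

```python
from collections import defaultdict

# Made-up records: (name, sex, dorm, financial_aid)
records = [
    ("A", "M", "North", 5000),
    ("B", "F", "North", 0),
    ("C", "F", "North", 3000),
    ("D", "M", "South", 2000),
    ("E", "M", "South", 1000),
    ("F", "F", "East", 4000),
]

sums = defaultdict(int)
counts = defaultdict(int)
for _name, sex, dorm, aid in records:
    sums[(sex, dorm)] += aid      # published table of sums
    counts[(sex, dorm)] += 1      # published table of counts

# Any cell with count 1 exposes an individual's financial aid.
exposed = {cell: sums[cell] for cell in counts if counts[cell] == 1}
print(exposed)
```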
15 Inference Problem: Indirect Attack (2)
- Intersecting medians: unauthorized users may use intersecting medians to determine a sensitive field's value. (Figure 8.9, Table 8-5)
- Tracker attacks: the desired data is obtained by issuing additional queries that produce small results.
  - $q = \mathrm{count}(a \land b \land c)$
  - $q = \mathrm{count}(a) - \mathrm{count}(a \land \lnot(b \land c)) = \mathrm{count}(a) - \mathrm{count}(a \land (\lnot b \lor \lnot c))$
  - Correction (page 358)
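The tracker identity can be tried on a toy table. In this sketch the rows, the minimum result size, and the `QueryRefused` exception are all hypothetical; the point is only that two permitted counts reproduce a count the DBMS refuses to answer directly.

```python
# Made-up rows: (sex, race, dorm)
ROWS = [
    ("F", "C", "North"),   # the single record matching a AND b AND c
    ("F", "C", "South"),
    ("F", "B", "North"),
    ("F", "B", "South"),
    ("M", "C", "North"),
    ("M", "B", "South"),
]
MIN_RESULT_SIZE = 2  # hypothetical policy: refuse smaller counts


class QueryRefused(Exception):
    pass


def count(predicate):
    n = sum(1 for row in ROWS if predicate(row))
    if n < MIN_RESULT_SIZE:
        raise QueryRefused("result too small to release")
    return n


def a(r): return r[0] == "F"
def b(r): return r[1] == "C"
def c(r): return r[2] == "North"


try:
    count(lambda r: a(r) and b(r) and c(r))
    direct_allowed = True
except QueryRefused:
    direct_allowed = False

# Tracker: count(a) - count(a AND NOT (b AND c)); both counts are permitted.
q = count(a) - count(lambda r: a(r) and not (b(r) and c(r)))
print(direct_allowed, q)
```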
16 Inference Problem: Indirect Attack (3)
- Linear system vulnerability: with a little algebra and a little more luck in the distribution of the database contents, it is possible to construct a series of queries whose results relate to several different sets.
  - $q_1 = c_1 + c_2 + c_3 + c_4 + c_5$
  - $q_2 = c_1 + c_2 + c_4$
  - $q_3 = c_3 + c_4$
  - $q_4 = c_4 + c_5$
  - $q_5 = c_2 + c_5$
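The five query results above determine every individual value. Since q1 - q2 = c3 + c5 and q3 - q4 = c3 - c5, subtracting gives 2*c5, and the rest follows by back-substitution. A minimal sketch, with made-up hidden values:

```python
def solve(q1, q2, q3, q4, q5):
    """Recover c1..c5 from the five sums q1..q5."""
    c5 = ((q1 - q2) - (q3 - q4)) // 2   # (c3 + c5) - (c3 - c5) = 2 * c5
    c2 = q5 - c5
    c4 = q4 - c5
    c3 = q3 - c4
    c1 = q2 - c2 - c4
    return c1, c2, c3, c4, c5


# Made-up hidden values; the attacker sees only the sums.
hidden = (10, 20, 30, 40, 50)
c1, c2, c3, c4, c5 = hidden
sums = (
    c1 + c2 + c3 + c4 + c5,  # q1
    c1 + c2 + c4,            # q2
    c3 + c4,                 # q3
    c4 + c5,                 # q4
    c2 + c5,                 # q5
)
recovered = solve(*sums)
print(recovered)
```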
17 Inference Problem: Controls for Inference Attacks
- Query controls
  - Effective primarily against direct attacks
- Item controls
  - Suppression: the query is rejected and no sensitive data is provided.
  - Concealing: the answer provided is close to, but not exactly, the actual value.
- Contrast between security and precision.
18 Inference Problem: Examples of Controls (1)
- Limited response suppression
  - Rule of n items over k percent: eliminate low-frequency elements.
  - A more effective approach: suppress additional cells in the same row and column. (Tables 8-6 and 8-7)
- Combining results
  - Combine some rows and columns. (Tables 8-8 and 8-9)
  - Or present values in ranges.
  - Or present values by rounding.
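The suppression and concealing controls above can each be sketched in a few lines. The helper names, thresholds, and sample cells are assumptions for illustration; this version does not implement the additional row-and-column suppression.

```python
def suppress_small(table, min_count):
    """Limited response suppression: hide low-frequency cells."""
    return {cell: (n if n >= min_count else None) for cell, n in table.items()}


def to_range(value, width):
    """Report the range of the given width that contains the value."""
    low = (value // width) * width
    return (low, low + width - 1)


def round_to(value, step):
    """Report the value rounded to the nearest multiple of step."""
    return step * round(value / step)


counts = {("M", "North"): 1, ("F", "North"): 2, ("M", "South"): 3}  # made up
print(suppress_small(counts, min_count=2))
print(to_range(3450, 1000))
print(round_to(3700, 1000))
```

Each helper trades precision for security: the released figure is still useful in aggregate, but no longer pins down an individual value.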
19 Inference Problem: Examples of Controls (2)
- Random sample
  - The result is computed on a randomly selected subset of the database rather than on the whole database.
- Random data perturbation
  - The result is perturbed by a small error.
- Query analysis
  - A query and its implications are analyzed.
  - Complexity:
    - Maintain a query history for each user.
    - Judge each query in the context of previous queries.
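Random data perturbation can be sketched by adding a small bounded error to each value before statistics are computed. The fixed seed is an assumption of this sketch, chosen so that repeating a query returns the same perturbed answer; the data values are made up.

```python
import random


def make_perturbed(values, max_error, seed=0):
    """Add an independent error in [-max_error, max_error] to each value."""
    rng = random.Random(seed)  # fixed seed: repeated queries agree
    return [v + rng.uniform(-max_error, max_error) for v in values]


true_values = [5000, 0, 3000, 2000, 1000, 4000]  # made-up data
released = make_perturbed(true_values, max_error=50)

true_mean = sum(true_values) / len(true_values)
released_mean = sum(released) / len(released)
print(true_mean, round(released_mean, 2))
```

Because the errors are both positive and negative, aggregate statistics stay close to the true values while individual values are disguised.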
20 Inference Problem: Conclusion
- There are no perfect solutions.
- Three paths to follow:
  - Suppress obviously sensitive information (easy).
  - Track what the user knows (costly).
    - These first two approaches limit the queries accepted and the data provided.
  - Disguise the data (at a cost in precision).
    - This approach is applied only to the released data.
21 Multilevel Databases: Differentiated Security
- An illusion: sensitivity is determined by attribute alone. → sensitive vs. nonsensitive attributes
- The fact: differentiated data security
- Three characteristics of database security:
  - The security of a single element may differ from that of other elements in the same row or column.
  - Several grades of security are needed.
  - The security of an aggregate (a sum, a count, or a group of values) may differ from the security of the individual elements.
22 Multilevel Databases: Granularity
- An analogy: defining the sensitivity of each value in a database is similar to applying a sensitivity level to each individual word of a document.
- Both an element and a combination of elements may have a distinct sensitivity.
- To keep each value of a database at its own sensitivity level:
  - An access control policy must define which users can have access to what data. → confidentiality (or secrecy)
  - Each value must be guaranteed NOT to be changed by any unauthorized person. → integrity
23 Multilevel Databases: Security Issues (1)
- Integrity
  - *-property for access control: in multilevel databases, a high-level user should not be able to write a lower-level data element.
  - See p.279 for the official definition of the *-property.
  - Problem: sometimes reading and writing must be done by the same process, such as the DBMS.
  - Solution: either a process cleared at a high level cannot write to a lower level, or the process must be a trusted process.
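The write restriction described above, together with the usual read restriction of multilevel models, can be sketched as two checks. The level names and the `trusted` flag are assumptions of this sketch, not definitions taken from p.279.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical ordering of sensitivity levels
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


def may_read(subject, obj):
    """A subject may read only data at or below its own level."""
    return subject >= obj


def may_write(subject, obj, trusted=False):
    """*-property: no writing down to a lower level,
    unless the process is explicitly trusted."""
    return trusted or subject <= obj


print(may_write(Level.SECRET, Level.CONFIDENTIAL))                 # write down
print(may_write(Level.SECRET, Level.CONFIDENTIAL, trusted=True))   # trusted DBMS
```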
24 Multilevel Databases: Security Issues (2)
- Confidentiality
  - In multilevel databases, two users at different security levels may get different answers to the same query.
  - Unknowing redundancy, known as polyinstantiation, may be introduced; that is, one record can appear many times, with a different level of secrecy each time.
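A toy illustration of polyinstantiation: the same key is stored once per secrecy level, and a query returns only the instances at or below the caller's clearance. The names, assignments, and the two-level scheme are made up for this sketch.

```python
# Made-up rows: (name, assignment, level); level 0 = low, 1 = high
ROWS = [
    ("Hill", "Program manager", 0),
    ("Hill", "Field agent", 1),   # second instance of the same record
    ("Reed", "Analyst", 0),
]


def assignments(name, clearance):
    """Return every instance of the record visible at this clearance."""
    return [a for n, a, level in ROWS if n == name and level <= clearance]


low_answer = assignments("Hill", clearance=0)
high_answer = assignments("Hill", clearance=1)
print(low_answer)
print(high_answer)
```

The low-level user has no indication that a second, more sensitive instance of the record exists.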
25 Proposals for Multilevel Security: Partitioning
- Model: the database is divided into separate databases, each at its own security level.
- Weaknesses
  - Destroys a basic advantage of a database: elimination of redundancy.
  - Combined use of the separate databases is problematic.
26 Proposals: Encryption
- Model: each level of sensitive data is stored in a table encrypted under a key unique to that level of sensitivity.
- Weaknesses
  - The risk of a chosen plaintext attack may increase.
    - Controls: different encryption keys, cipher block chaining (Fig. 8-11, p.365)
  - High overhead of processing a query: required fields must be decrypted.
- Encryption is not often used to implement separation in databases.
27 Proposals: Integrity and Sensitivity Locks (1)
- Integrity lock ("spray paint"): protection is applied to individual elements, not to tables.
- Each data item includes (Figure 8-12):
  - Data itself: stored in plaintext for efficiency of access.
  - Sensitivity label: defines the sensitivity of the data.
    - Sensitivity labels are unforgeable, unique, and concealed.
  - Checksum: computed across both the data and the sensitivity label. (Cryptographic checksum: Figure 8-13)
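A cryptographic checksum over both the data and its sensitivity label can be sketched with a keyed hash. HMAC-SHA256 is a choice made for this sketch and is not necessarily the construction of Figure 8-13; the key and the sample values are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key, known only to the trusted access procedure
KEY = b"demo-key-held-by-trusted-procedure"


def checksum(data, label):
    """Keyed checksum binding the data to its sensitivity label."""
    message = label.encode() + b"\x00" + data.encode()
    return hmac.new(KEY, message, hashlib.sha256).hexdigest()


def verify(data, label, tag):
    """True only if neither the data nor the label was altered."""
    return hmac.compare_digest(checksum(data, label), tag)


tag = checksum("Field agent", "SECRET")
print(verify("Field agent", "SECRET", tag))         # intact
print(verify("Field agent", "UNCLASSIFIED", tag))   # label downgraded
```

Because the untrusted database manager does not hold the key, it cannot relabel or alter an element without the trusted procedure detecting the mismatch.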
28 Proposals: Integrity and Sensitivity Locks (2)
- A sensitivity lock is a combination of a unique identifier and the sensitivity level. (Figure 8-14)
  - Sensitivity lock = E(Key, Sensitivity label, unique identifier)
- Intention
  - Use any (untrusted) database manager with a trusted procedure that handles access control. (Figure 8-15)
29 Proposals: Integrity and Sensitivity Locks (3)
- It is only a short-term solution for multilevel security.
- Weaknesses
  - The efficiency of integrity locks is a serious drawback.
    - The space required is expanded.
    - The processing time for decoding the sensitivity label is a problem.
  - Trojan horse attack: the database manager sees all data.
30 Proposals: Trusted Front-End
- Model (Figure 8-16)
  - A trusted front-end (known as a guard) is added.
  - Interaction between user, front-end, and DBMS (page 368)
- cf. Commutative filters
  - Reformat the user's request if necessary and return only data of appropriate sensitivity to the user.
- Advantages
  - The database manager can do as much of the work as possible, such as selection, optimization, and subqueries.
  - Overall efficiency of the system improves.
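The guard's filtering role can be sketched as a thin layer in front of an untrusted query engine. The table, the numeric levels, and the function names are hypothetical; user authentication and integrity-lock verification are left out of this sketch.

```python
# Made-up table held by the untrusted DBMS; "level" is the sensitivity label
TABLE = [
    {"name": "Ann", "project": "Payroll", "level": 0},
    {"name": "Bob", "project": "Audit", "level": 1},
    {"name": "Cy", "project": "Merger", "level": 2},
]


def untrusted_dbms(predicate):
    """Does the selection work, with no knowledge of who is asking."""
    return [row for row in TABLE if predicate(row)]


def guard(clearance, predicate):
    """Trusted front-end: pass the query on, then return only rows the
    user is cleared for, with the label stripped from the output."""
    results = untrusted_dbms(predicate)
    return [
        {k: v for k, v in row.items() if k != "level"}
        for row in results
        if row["level"] <= clearance
    ]


low_view = guard(0, lambda row: True)
high_view = guard(2, lambda row: True)
print(low_view)
print(len(high_view))
```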
31 Proposals: Distributed/Federated Database
- Operation of a trusted front-end
  - Controls access to two database managers of different sensitivity.
  - Takes the user's query and formulates single-level queries to the databases as appropriate.
  - Returns results to the user, combining results appropriately if they are obtained from different databases.
- Weaknesses
  - The front-end potentially includes most of the functionality of a full database manager.
  - Data must be kept in separate databases according to their degrees of sensitivity.
32 Proposals: Window/View
- Window
  - A window is a subset of a database, containing exactly the information that a user is entitled to access.
- View
  - A view can represent a single user's subset database, so that all of a user's queries access only that subset and data too sensitive for the user is filtered out.
- Layered system (TCB, trusted computing base)
  - First layer: access control and user authentication.
  - Second layer: indexing and computation for the database.
  - Third layer: translation of views into the base relations.
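A view as a user's subset database can be sketched with an in-memory SQLite database. The table, column, and view names are made up. SQLite has no per-user permissions, so this shows only the filtering; a real system would also restrict the user to the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        ('Ann', 'Sales', 50000),
        ('Bob', 'Research', 70000),
        ('Cy', 'Sales', 55000);
    -- The view exposes only Sales rows and omits the salary column.
    CREATE VIEW sales_directory AS
        SELECT name, dept FROM employee WHERE dept = 'Sales';
    """
)

# Queries against the view are translated into the base relation.
rows = conn.execute("SELECT * FROM sales_directory ORDER BY name").fetchall()
print(rows)
```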
33 Concluding Remarks: On My Own
- Partitioning and distributed databases are not popular, mainly because of the difficulty of combined use of different databases.
- The integrity lock is just a short-term solution for the security of multilevel databases. The trusted front-end and commutative filter look more practical, but are more complicated.
- Encryption is used, but its big problem is processing overhead.
- Window and view are concepts more closely related to the functionality of the database itself.
34 Concluding Remarks: To Our Projects
- We may focus on encryption, the trusted front-end, the commutative filter, and views, looking for practical implementations of some of them.
- Basic ideas:
  - Encryption: encrypt sensitive data stored in databases, with a crypto server to handle access and user authentication.
  - Trusted front-end / commutative filter: requires more space to store the sensitivity degree of sensitive data, with a guard that handles access and user authentication and may reformat users' queries.
  - View: create a view for each user, with a module (which may be called a view server) to handle access and user authentication.
35 References
- Charles P. Pfleeger, Security in Computing, Chapter 8, Sections 4-7
- Developing a Database Encryption Strategy, RSA Security (Home > Events > RSA Web Seminars)