Title: Paper Study - Application of Variable Precision Rough Set Approach to Car Driver Assessment
1. Paper Study - Application of Variable Precision Rough Set Approach to Car Driver Assessment
- Presented by Lichun (Jack) Zhu
- Course 60-539
- Winter 2006
- Instructor: Dr. Christie Ezeife
- University of Windsor
2. Agenda
- Introduction
- Rough Set Theory
- Variable Precision Rough Set Theory
- Linear Hierarchy of Decision Table (HDTL) Algorithm
- How the data is prepared
- Result interpretation
- Summary and Conclusion
- Q & A
3. Introduction
- Problem Statement
- Need to identify unsafe car drivers based on historical driving records.
- Driving records in the database are incomplete and inaccurate.
- Solution: a new approach to analyze data that contains inaccurate information
- Variable Precision Rough Set Theory
- Linear Hierarchy of Decision Table algorithm
- Classification
4. Introduction to Rough Set Theory
- Background
- First introduced by Pawlak (1982)
- A mathematical method to describe uncertainty and incompleteness
- Basic concepts
- Terms: Information System S, Universe U, Attributes A (condition attributes, decision attributes)
- S = (U, A)
5. Introduction to Rough Set Theory
- Domain Va: with every attribute a of A we associate a set Va as the domain of a, e.g. V_S = {Male, Female}
- Indiscernibility relation I(B): if B ⊆ A, I(B) is defined on U by (x, y) ∈ I(B) if and only if a(x) = a(y) for every a ∈ B, where a(x) is the value of attribute a for tuple x. I(B) is therefore an equivalence relation.
- B-elementary sets B1, ..., Bi, ...: the classes of the partition of the universe, U/I(B) or simply U/B; we also define B(x) = Bi if and only if x ∈ Bi
6. Introduction to Rough Set Theory
- An example of an Information System
- Table 1: U = {1, 2, 3, 4, 5, 6}, A = {S, G, N, R}
- Let B = {S, G, N}. Then I(B) = {(1,1), (1,6), (6,1), (2,2), (3,3), (4,4), (5,5), (6,6)} and U/B = {{1,6}, {2}, {3}, {4}, {5}} = {B1, B2, B3, B4, B5}
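To make the indiscernibility relation concrete, the following Python sketch computes the partition U/B for a small table shaped like Table 1. The attribute values in `table` are invented for illustration (the slides do not reproduce the table contents); only the resulting partition {{1,6},{2},{3},{4},{5}} and the high-risk set X = {2,3,4,6} used on the later slides are taken from the presentation.

```python
# Minimal sketch: computing the B-elementary sets U/B for a hypothetical
# information system shaped like Table 1 (row values are illustrative only).
from collections import defaultdict

table = {
    1: {"S": "M", "G": "A", "N": 0, "R": "low"},
    2: {"S": "F", "G": "B", "N": 2, "R": "high"},
    3: {"S": "M", "G": "B", "N": 1, "R": "high"},
    4: {"S": "F", "G": "A", "N": 3, "R": "high"},
    5: {"S": "F", "G": "B", "N": 0, "R": "low"},
    6: {"S": "M", "G": "A", "N": 0, "R": "high"},
}

def partition(universe, B):
    """Group tuples that agree on every attribute in B, i.e. the classes of I(B)."""
    classes = defaultdict(set)
    for x, row in universe.items():
        classes[tuple(row[a] for a in B)].add(x)
    return list(classes.values())

print(partition(table, ["S", "G", "N"]))   # [{1, 6}, {2}, {3}, {4}, {5}]
```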
7. Introduction to Rough Set Theory
- Approximation
- For an interest set X ⊆ U, we define:
- B-lower(X) = ∪ {B(x) : x ∈ U, B(x) ⊆ X}
- B-upper(X) = ∪ {B(x) : x ∈ U, B(x) ∩ X ≠ ∅}
- BNR_B(X) = B-upper(X) − B-lower(X)
- For example, if X contains all tuples with high risk, X = {2, 3, 4, 6}, then:
- B-lower(X) = {2, 3, 4}
- B-upper(X) = {1, 2, 3, 4, 6}
- BNR_B(X) = {1, 6}
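A short continuation of the previous sketch computes the lower and upper approximations and the boundary region for X = {2, 3, 4, 6}; it assumes `table` and `partition()` from the sketch under Slide 6.

```python
# Minimal sketch: B-lower(X), B-upper(X) and BNR_B(X) over the partition U/B.
def approximations(universe, B, X):
    lower, upper = set(), set()
    for Bi in partition(universe, B):
        if Bi <= X:        # class entirely contained in X
            lower |= Bi
        if Bi & X:         # class overlaps X
            upper |= Bi
    return lower, upper

X = {2, 3, 4, 6}           # tuples labelled high risk in the slides' example
lower, upper = approximations(table, ["S", "G", "N"], X)
print(lower, upper, upper - lower)   # {2, 3, 4} {1, 2, 3, 4, 6} {1, 6}
```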
8. Introduction to Rough Set Theory
- Figure 1: Rough Set Concept. U = B1 ∪ ... ∪ B14; B-lower(X) = yellow region; B-upper(X) = yellow and green regions; BNR_B(X) = green region.
9. Variable Precision Rough Set Theory
- Background information
- Problem with Rough Sets: the B-lower approximation will always be EMPTY if uncertainty is widespread.
- Solution: use a probability-based approach
- Presented by Ziarko (1993), Yao and Wong (1992), Slezak and Ziarko (2002), etc.
- Definitions
- Lower limit l, satisfying 0 ≤ l < P(X) < 1
- l-negative region of X: NEG_l(X) = ∪ {Bi : P(X|Bi) ≤ l}
- Upper limit u, satisfying 0 < P(X) < u ≤ 1
- u-positive region of X: POS_u(X) = ∪ {Bi : P(X|Bi) ≥ u}
- (l,u)-boundary region of X: BNR_{l,u}(X) = ∪ {Bi : l < P(X|Bi) < u}
10. Variable Precision Rough Set Theory
- For the data in Table 1, P(X) = 4/6 = 2/3 ≈ 0.67. If l = 0.25 and u = 0.75, then NEG_0.25(X) = {5}, POS_0.75(X) = {2, 3, 4}, BNR_{0.25,0.75}(X) = {1, 6}
- Table 2: Sample Decision Table DT_{B,X}(U) with P(X) = 0.67, l = 0.25, u = 0.75
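The same example can be reproduced with the VPRS definitions; the sketch below (again reusing `table` and `partition()` from the earlier sketches) classifies each elementary set by its conditional probability P(X|Bi).

```python
# Minimal sketch: l-negative, u-positive and (l,u)-boundary regions of X.
def vprs_regions(universe, B, X, l, u):
    neg, pos, bnd = set(), set(), set()
    for Bi in partition(universe, B):
        p = len(Bi & X) / len(Bi)   # conditional probability P(X | Bi)
        if p <= l:
            neg |= Bi
        elif p >= u:
            pos |= Bi
        else:
            bnd |= Bi
    return neg, pos, bnd

neg, pos, bnd = vprs_regions(table, ["S", "G", "N"], {2, 3, 4, 6}, l=0.25, u=0.75)
print(neg, pos, bnd)   # {5} {2, 3, 4} {1, 6}
```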
11. Variable Precision Rough Set Theory
- Figure 2: VPRS Concept. U = B1 ∪ ... ∪ B17; NEG(X) = white region; POS(X) = yellow region; BNR(X) = green region.
12. Linear Hierarchy of Decision Table Algorithm (Ziarko, 2002)
- Corresponds to the Tree-structured Hierarchy of Decision Table algorithm
13. Linear Hierarchy of Decision Table (HDTL) Algorithm
- Advantage: the Linear Hierarchy of Decision Table algorithm effectively eliminates the exponential growth of the decision hierarchy size
14. Linear Hierarchy of Decision Table (HDTL) Algorithm (supervised approach)
Initialization:
1. U' ← U, C' ← C, D' ← D
2. Compute POS_u(X) and NEG_l(X)
Iteration:
3. repeat
4.   while (POS_u(X) = EMPTY and NEG_l(X) = EMPTY)
5.     C' ← new(C', U')   (define new condition attributes)
6.     Compute POS_u(X) and NEG_l(X)
7.   Output DT_{C',X}(U')   (output the decision table based on the union of the positive and negative regions)
8.   if POS_u(X) ∪ NEG_l(X) = U' then exit
9.   U' ← U' − (POS_u(X) ∪ NEG_l(X))
10.  C' ← new(C', U')   (define new condition attributes)
11.  D' ← D restricted to U'   (restrict decision attributes to the current set of data U')
12.  Compute POS_u(X) and NEG_l(X)
- There is a problem at this point: when the attempt to define new condition attributes fails, the procedure should terminate.
- The removal of the positive and negative regions from U' (step 9) embodies the linear approach of generating the dataset for the subsequent layer.
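The control flow of the HDTL loop can be sketched as below. This is a structural sketch only, not the paper's implementation: `regions(U, C)` stands for the "Compute POS_u(X) and NEG_l(X)" step restricted to the current data, and `new_attrs(C, U)` stands for the unspecified new(C, U) step, assumed here to return None when it fails, so the sketch also includes the termination check suggested by the note above.

```python
# Hedged structural sketch of the HDTL iteration (supervised approach).
def hdtl(U, C, regions, new_attrs):
    layers = []                        # one decision table per layer
    pos, neg = regions(U, C)
    while True:
        # Keep defining new condition attributes until something is classified.
        while not pos and not neg:
            C = new_attrs(C, U)
            if C is None:              # note above: stop if this step fails
                return layers
            pos, neg = regions(U, C)
        layers.append((C, pos, neg))   # output the decision table for this layer
        if pos | neg == U:             # every tuple classified: done
            return layers
        U = U - (pos | neg)            # only the boundary region goes to the next layer
        C = new_attrs(C, U)
        if C is None:
            return layers
        pos, neg = regions(U, C)       # restricting the decision attribute to U'
                                       # is implicit: regions() only sees the current U
```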
15. How the data is prepared
- Attributes: Sex, Date-of-birth, City-population, Number-of-convictions, Number-of-past-accidents and Has-accident-in-last-year
- Data scale: about 29,000 records
- Data normalization
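The slide does not show the actual encoding, so the following is only a hedged sketch of the kind of normalization it refers to: mapping raw driver records into a few discrete codes so that equivalence classes of the condition attributes become meaningful. The field names and bin boundaries are illustrative assumptions, not the paper's encoding.

```python
# Illustrative normalization of one driver record (bins are assumptions,
# not the paper's actual encoding).
from datetime import date

def normalize(record, today=date(2006, 1, 1)):
    age = today.year - record["date_of_birth"].year
    return {
        "Sex": record["sex"],                                    # already categorical
        "AgeGroup": "young" if age < 25 else "middle" if age < 60 else "senior",
        "CitySize": "small" if record["city_population"] < 50_000 else "large",
        "Convictions": min(record["number_of_convictions"], 3),  # cap at 3+
        "PastAccidents": min(record["number_of_past_accidents"], 3),
    }

# The decision attribute Has-accident-in-last-year defines the target set X.
```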
16. Result interpretation
- 5 test cycles, generating 5 first-layer decision tables and 3 second-layer decision tables.
- A problem can be seen in the test results: in all of the presented test cycles, the boundary sets of the first cycle contain only one combination of attributes. Therefore the generated decision table hierarchy is no different from that of the Tree-structured Hierarchy of Decision Table algorithm at the first two layers. The author did not report any further investigation of boundary sets that contain more than one combination of attributes.
17. Summary and Conclusion
- Strong points
- Provides a valuable alternative solution for rule finding and classification based on inaccurate data.
- The HDTL algorithm also avoids the exponential expansion of hierarchical data structures.
- Weak point
- The test results provided are incomplete and not strong enough to demonstrate the effectiveness and accuracy of the Linear Hierarchy of Decision Table algorithm.
18. References
- Pawlak, Z., "Decision Rules, Bayes' Rule and Rough Sets", New Directions in Rough Sets, Data Mining, and Granular-Soft Computing: 7th International Workshop, RSFDGrC'99, Yamaguchi, Japan, November 1999, Proceedings, pp. 1-9.
- Ziarko, W., "Incremental Learning with Hierarchies of Rough Decision Tables", Proceedings of the North American Fuzzy Information Processing Society Conference (NAFIPS'04), Banff, Alberta, 2004, pp. 802-808.
19. Q & A