Title: Rough Sets Theory
- Rough Set Theory. Introduction 
- Rough Sets and Information Systems 
- Approximation of sets 
- Undefinable sets 
- A Simple Informal Example 
- Reducts in Information Systems 
- Example 
- Summary
Kersti Antoi 
Rough Sets Theory. Introduction
Rough set theory is a mathematical tool for 
dealing with vagueness and uncertainty. 
- Knowledge has a granular structure. 
- Some objects of interest cannot be discerned and 
 appear as the same (or similar).
- Objects characterised by the same information are 
 indiscernible.
- The set of all objects indiscernible from a given 
 object is called an elementary set.
- Any union of elementary sets is referred to 
 as a precise set; otherwise a set is rough
 (imprecise, vague).
In the proposed approach, any vague concept is 
replaced by a pair of precise concepts, called 
the lower and the upper approximation.
The lower approximation consists of all objects 
that surely belong to the concept.
The upper approximation contains all objects that 
possibly belong to the concept.
The boundary region (doubtful region) of the 
vague concept is the difference between the upper 
and the lower approximation.
Approximations are the two basic operations in 
rough set theory.
The basic operations of rough set theory are 
used to discover fundamental patterns in data. 
- The main specific problems addressed by this 
 theory are:
  - representation of uncertain or imprecise 
   knowledge
  - empirical learning and knowledge acquisition from 
   experience
  - knowledge analysis
  - analysis of conflicts
  - identification and evaluation of data 
   dependencies
  - approximate pattern classification
  - reasoning with uncertainty
  - information-preserving data reduction
- The important application areas for rough sets 
 are:
  - medical diagnosis
  - pharmacology
  - stock market prediction and financial data 
   analysis
  - banking
  - market research
  - information storage and retrieval systems
  - pattern recognition, including speech and 
   handwriting recognition
  - control system design
  - image processing
  - digital logic design
  - and many others
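Rough Sets and Information Systems
Information system: $S = \langle U, A, V, f \rangle$
- $U$: universe of $S$ (a nonempty, finite set of objects, the elements of $S$)
- $A$: finite set of attributes
- $V = \bigcup_{a \in A} V_a$, where $V_a$ is the domain of attribute $a$ (its set of values)
- $f$: description (information) function, $f : U \times A \to V$ with $f(x, a) \in V_a$, $\forall x \in U$, $\forall a \in A$
Indiscernibility relation for $P \subseteq A$: $\mathrm{IND}(P) = \{(x, y) \in U \times U : f(x, a) = f(y, a),\ \forall a \in P\}$
P-elementary sets: equivalence classes of $\mathrm{IND}(P)$
Description of a P-elementary set $X$: $\mathrm{Des}_P(X) = \{(a, v) : f(x, a) = v,\ \forall x \in X,\ \forall a \in P\}$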
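The P-elementary sets can be computed by grouping objects on their attribute values. The following is a minimal sketch, assuming the information function is stored as a dictionary of rows; the function name `elementary_sets` and the three-object table are hypothetical and serve only as illustration.

```python
from collections import defaultdict

def elementary_sets(table, attrs):
    """Group objects that share the same value on every attribute in attrs.

    table maps each object to a dict {attribute: value}, i.e. f(x, a).
    The returned sets are the equivalence classes of IND(attrs).
    """
    groups = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)  # description of the object on attrs
        groups[key].add(obj)
    return list(groups.values())

# Hypothetical three-object system with attributes "colour" and "size".
table = {
    "x1": {"colour": "red", "size": "big"},
    "x2": {"colour": "red", "size": "small"},
    "x3": {"colour": "blue", "size": "big"},
}
classes = elementary_sets(table, ["colour"])
```

With only `colour` available, `x1` and `x2` are indiscernible and fall into one elementary set; adding `size` separates all three objects.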
Let $P \subseteq A$ and $Y \subseteq U$. The P-lower approximation of $Y$, 
denoted by $\underline{P}Y$, and the P-upper approximation of $Y$, 
denoted by $\overline{P}Y$, are defined as
$$\underline{P}Y = \bigcup \{X \in U|P : X \subseteq Y\}, \qquad \overline{P}Y = \bigcup \{X \in U|P : X \cap Y \neq \emptyset\},$$
where $U|P$ denotes the family of P-elementary sets.
The P-boundary (doubtful region) of set $Y$ is 
defined as
$$\mathrm{Bn}_P(Y) = \overline{P}Y - \underline{P}Y.$$
The set $\underline{P}Y$ is the set of all elements of $U$ which can 
be classified with certainty as elements of $Y$, 
employing the set of attributes $P$. The set $\overline{P}Y$ is the 
set of elements of $U$ which can possibly be 
classified as elements of $Y$, using the set of 
attributes $P$. The set $\mathrm{Bn}_P(Y)$ is the set of 
elements which cannot be classified with certainty either to 
$Y$ or to its complement using the set of attributes $P$. 
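The three definitions above translate directly into set operations. This is a minimal sketch, assuming the P-elementary sets are already available as a list of Python sets; the function name `approximations` and the sample data are hypothetical.

```python
def approximations(classes, target):
    """Return (lower, upper, boundary) of target w.r.t. the given elementary sets."""
    lower, upper = set(), set()
    for block in classes:
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block shares at least one element with the target
            upper |= block
    return lower, upper, upper - lower

# Hypothetical universe {1, ..., 5} split into three elementary sets.
classes = [{1, 2}, {3}, {4, 5}]
Y = {1, 2, 4}
lower, upper, boundary = approximations(classes, Y)
```

Here objects 4 and 5 are indiscernible but only 4 belongs to `Y`, so both end up in the boundary.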
Undefinable sets
- Set $Y$ is definable in $P$ iff $\underline{P}Y = \overline{P}Y$; otherwise 
 set $Y$ is undefinable in $P$.
- If $\underline{P}Y \neq \emptyset$ and $\overline{P}Y \neq U$, $Y$ will be called roughly 
 definable in $P$.
- If $\underline{P}Y \neq \emptyset$ and $\overline{P}Y = U$, $Y$ will be called 
 externally undefinable in $P$.
- If $\underline{P}Y = \emptyset$ and $\overline{P}Y \neq U$, $Y$ will be called 
 internally undefinable in $P$.
- If $\underline{P}Y = \emptyset$ and $\overline{P}Y = U$, $Y$ will be called totally 
 undefinable in $P$.
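The five cases can be told apart with a few comparisons once the two approximations are known. A minimal sketch, assuming the lower and upper approximations are passed in as Python sets; the function name `definability` is hypothetical.

```python
def definability(lower, upper, universe):
    """Classify a set by its lower and upper approximations in a given universe."""
    if lower == upper:
        return "definable"
    if lower and upper != universe:
        return "roughly definable"
    if lower and upper == universe:
        return "externally undefinable"
    if not lower and upper != universe:
        return "internally undefinable"
    return "totally undefinable"  # empty lower approximation, upper equals universe

universe = {1, 2, 3, 4}
label = definability({1}, {1, 2}, universe)
```

The equality test comes first, so the remaining branches only ever see sets with a nonempty boundary.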
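With every set $Y \subseteq U$, we can associate an accuracy 
of approximation defined as
$$\alpha_P(Y) = \frac{\mathrm{card}(\underline{P}Y)}{\mathrm{card}(\overline{P}Y)}.$$
Let $\mathcal{Y} = \{Y_1, Y_2, \ldots, Y_n\}$ be a partition of $U$. The subsets $Y_i$, $i = 1, \ldots, n$, are the categories of 
partition $\mathcal{Y}$. By the P-lower (P-upper) approximation 
of $\mathcal{Y}$ in $S$ we mean the sets $\underline{P}\mathcal{Y} = \{\underline{P}Y_1, \underline{P}Y_2, \ldots, \underline{P}Y_n\}$ 
and $\overline{P}\mathcal{Y} = \{\overline{P}Y_1, \overline{P}Y_2, \ldots, \overline{P}Y_n\}$, respectively. 
The coefficient
$$\gamma_P(\mathcal{Y}) = \frac{\sum_{i=1}^{n} \mathrm{card}(\underline{P}Y_i)}{\mathrm{card}(U)}$$
is called the quality of approximation of 
partition $\mathcal{Y}$ by the set of attributes $P$ (quality of 
sorting). 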
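Both coefficients reduce to counting elements. The sketch below assumes the usual cardinality-ratio definitions: accuracy is the size of the lower approximation divided by the size of the upper approximation, and quality is the total size of the lower approximations of all categories divided by the size of the universe. The function names and the sample data are hypothetical.

```python
from fractions import Fraction

def lower_upper(classes, target):
    """Lower and upper approximation of target w.r.t. the elementary sets."""
    lower, upper = set(), set()
    for block in classes:
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper

def accuracy(classes, target):
    """Assumed definition: card(lower) / card(upper); target must be nonempty."""
    lower, upper = lower_upper(classes, target)
    return Fraction(len(lower), len(upper))

def quality(classes, categories, universe):
    """Assumed definition: sum of card(lower of each category) / card(universe)."""
    covered = sum(len(lower_upper(classes, y)[0]) for y in categories)
    return Fraction(covered, len(universe))

universe = {1, 2, 3, 4, 5}
classes = [{1, 2}, {3}, {4, 5}]
categories = [{1, 2, 4}, {3, 5}]  # a partition of the universe
```

For this data the first category has accuracy 1/2, and the quality of the whole partition is 3/5: three of the five objects can be sorted into a category with certainty.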
A Simple Example
Name    Education     Decision
Joe     High School   No
Mary    High School   Yes
Peter   Elementary    No
Paul    University    Yes
Cathy   Doctorate     Yes
O = {Mary, Paul, Cathy}
A = {Education}
R(A) = {{Joe, Mary}, {Peter}, {Paul}, {Cathy}}
POS(O) = LOWER(O) = {Paul, Cathy}
NEG(O) = {Peter}
BND(O) = {Joe, Mary}
UPPER(O) = POS(O) ∪ BND(O) = {Paul, Cathy, Joe, Mary}
(Education, University) or (Education, Doctorate) --> Good prospects
(Education, Elementary) --> No good prospects
(Education, High School) --> Good prospects (i.e. possibly)
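The regions of this example can be recomputed from the table. A minimal sketch, assuming the table is stored as a dictionary from name to an (education, decision) pair; the variable names are hypothetical.

```python
from collections import defaultdict

# Rows of the example table: name -> (education, decision).
people = {
    "Joe": ("High School", "No"),
    "Mary": ("High School", "Yes"),
    "Peter": ("Elementary", "No"),
    "Paul": ("University", "Yes"),
    "Cathy": ("Doctorate", "Yes"),
}

# Partition R(A) induced by the single attribute Education.
blocks = defaultdict(set)
for name, (education, _) in people.items():
    blocks[education].add(name)

# The concept O: everyone whose decision is Yes.
O = {name for name, (_, decision) in people.items() if decision == "Yes"}

lower = set().union(*(b for b in blocks.values() if b <= O))  # POS(O)
upper = set().union(*(b for b in blocks.values() if b & O))   # UPPER(O)
boundary = upper - lower                                      # BND(O)
negative = set(people) - upper                                # NEG(O)
```

Joe and Mary share the same education but differ in the decision, which is why the High School block lands in the boundary.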
Reducts in Information Systems
A set of attributes $P$ is independent if for every 
proper subset $Q$ of $P$, that is $Q \subset P$, 
$\mathrm{IND}(P) \neq \mathrm{IND}(Q)$; otherwise $P$ is dependent (in $S$).
A subset $P \subseteq Q \subseteq A$ is a reduct of $Q$ (in $S$) if $P$ 
is an independent subset of $Q$ and $\mathrm{IND}(P) = \mathrm{IND}(Q)$. 
The family of all reducts of $P$ is denoted by $\mathrm{RED}(P)$. 
For every $P \subseteq A$, $\mathrm{RED}(P) \neq \emptyset$, and if $P$ is 
independent then $\mathrm{RED}(P) = \{P\}$.
An element $p \in P$ is said to be dispensable for $P$ 
if $\mathrm{IND}(P) = \mathrm{IND}(P - \{p\})$; otherwise $p$ 
is indispensable.
The set of all indispensable elements for $P$ is 
said to be the core of $P$ and is denoted by $\mathrm{CORE}(P)$.
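Dispensability can be checked by removing one attribute at a time and comparing the resulting partitions. This is a minimal sketch under the assumption that the information function is stored as a dictionary of rows; the helper names and the small table are hypothetical. The independence test relies on the fact that a set is independent exactly when none of its elements is dispensable.

```python
def partition(table, attrs):
    """Partition of the objects induced by IND(attrs), as a frozenset of blocks."""
    groups = {}
    for obj, row in table.items():
        groups.setdefault(tuple(row[a] for a in attrs), set()).add(obj)
    return frozenset(frozenset(g) for g in groups.values())

def core(table, attrs):
    """Indispensable attributes: removing any of them changes the partition."""
    full = partition(table, attrs)
    return {p for p in attrs
            if partition(table, [a for a in attrs if a != p]) != full}

def is_independent(table, attrs):
    """A set of attributes is independent iff every element is indispensable."""
    return core(table, attrs) == set(attrs)

# Hypothetical table in which attribute "z" duplicates attribute "x".
table = {
    "o1": {"x": 0, "y": 0, "z": 0},
    "o2": {"x": 0, "y": 1, "z": 0},
    "o3": {"x": 1, "y": 1, "z": 1},
}
```

In this table either `x` or `z` can be dropped without losing any distinction, so only `y` is in the core of `{x, y, z}`, while `{x, y}` is independent.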
Example
Let $U$ consist of five elements denoted by $t_1, \ldots, t_5$ 
and let $A = \{a, b, c, d, e\}$, $V_a = \{0, 1\}$, 
$V_b = \{0, 2\}$, $V_c = \{1, 2, 3\}$, $V_d = \{1, 3\}$, 
$V_e = \{0, 1, 2, 3\}$. 
All partitions in this system are
$\emptyset = \{\{t_1, t_2, t_3, t_4, t_5\}\}$
$a = \{\{t_1, t_5\}, \{t_2, t_3, t_4\}\}$
$b = \{\{t_1, t_3\}, \{t_2, t_4, t_5\}\}$
$c = \{\{t_1, t_3\}, \{t_2, t_4\}, \{t_5\}\}$
$d = \{\{t_1, t_3, t_5\}, \{t_2, t_4\}\}$
$e = \{\{t_1, t_5\}, \{t_2\}, \{t_3\}, \{t_4\}\}$
$(ab) = \{\{t_2, t_4\}, \{t_1\}, \{t_3\}, \{t_5\}\}$
$(ad) = \{\{t_2, t_4\}, \{t_1, t_5\}, \{t_3\}\}$
$(be) = \{\{t_1\}, \{t_2\}, \{t_3\}, \{t_4\}, \{t_5\}\}$
The table gives the information function:
U    a  b  c  d  e
t1   0  2  1  3  1
t2   1  0  2  1  2
t3   1  2  1  3  3
t4   1  0  2  1  0
t5   0  0  3  3  1
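This information system has nine different 
indiscernibility relations:
- $\mathrm{IND}(\emptyset)$
- $\mathrm{IND}(a)$
- $\mathrm{IND}(b)$
- $\mathrm{IND}(c) = \mathrm{IND}(bc) = \mathrm{IND}(bd) = \mathrm{IND}(cd) = \mathrm{IND}(bcd)$
- $\mathrm{IND}(d)$
- $\mathrm{IND}(e) = \mathrm{IND}(ae) = \mathrm{IND}(de) = \mathrm{IND}(ade)$
- $\mathrm{IND}(ab) = \mathrm{IND}(ac) = \mathrm{IND}(abc) = \mathrm{IND}(abd) = \mathrm{IND}(acd) = \mathrm{IND}(abcd)$
- $\mathrm{IND}(ad)$
- $\mathrm{IND}(A) = \mathrm{IND}(be) = \mathrm{IND}(ce) = \mathrm{IND}(abe) = \mathrm{IND}(ace) = \mathrm{IND}(bce) = \mathrm{IND}(bde) = \mathrm{IND}(cde) = \mathrm{IND}(abce) = \mathrm{IND}(abde) = \mathrm{IND}(acde) = \mathrm{IND}(bcde)$
$\mathrm{RED}(A) = \{be, ce\}$ and $\mathrm{CORE}(A) = \{e\}$.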
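These results can be checked by brute force over all 32 attribute subsets. The sketch below is only an illustration, not an efficient reduct algorithm; it assumes the table from the example is stored as tuples in attribute order a, b, c, d, e, and it takes the core as the intersection of all reducts. All names are hypothetical.

```python
from itertools import combinations

ATTRS = "abcde"
TABLE = {  # information function from the example, values ordered as a, b, c, d, e
    "t1": (0, 2, 1, 3, 1),
    "t2": (1, 0, 2, 1, 2),
    "t3": (1, 2, 1, 3, 3),
    "t4": (1, 0, 2, 1, 0),
    "t5": (0, 0, 3, 3, 1),
}

def partition(attrs):
    """Partition of U induced by IND(attrs), as a frozenset of blocks."""
    groups = {}
    for obj, row in TABLE.items():
        key = tuple(row[ATTRS.index(a)] for a in attrs)
        groups.setdefault(key, set()).add(obj)
    return frozenset(frozenset(g) for g in groups.values())

# Every subset of A, written as a string such as "be"; "" is the empty set.
subsets = ["".join(c) for r in range(len(ATTRS) + 1)
           for c in combinations(ATTRS, r)]

# Distinct indiscernibility relations correspond to distinct partitions.
relations = {partition(s) for s in subsets}

# Subsets that induce the same partition as the full attribute set A.
full = partition(ATTRS)
preserving = [s for s in subsets if partition(s) == full]

# A reduct is a preserving subset with no preserving proper subset.
reducts = [s for s in preserving
           if not any(set(q) < set(s) for q in preserving)]
core = set.intersection(*(set(r) for r in reducts))
```

Running this gives nine distinct partitions, the reducts `be` and `ce`, and the core `{e}`, in agreement with the slide.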
Summary
To remember
- A rough set is a set defined only by its lower 
 and upper approximations. A set, O, whose boundary
 is empty is exactly definable.
- If a minimal subset of attributes, A, is sufficient to 
 create a partition R(A) which exactly defines
 a set of objects, then we say that A is a reduct.
- The intersection of all reducts is known as the 
 core.