Title: Models of Incremental Concept Formation (Gennari, Langley, Fisher)
1 Models of Incremental Concept Formation
Gennari, Langley, Fisher
2 Overview
- Concept Formation/Conceptual Clustering
- Known projects
- EPAM
- UNIMEM
- COBWEB
- CLASSIT
- Future Research for CLASSIT
3 Concept Formation and Conceptual Clustering
- Given: sequential presentation of instances and their associated descriptions
- Find: clusterings that group these instances into categories
- Find: an intensional definition for each category that summarizes its instances
- Find: a hierarchical organization for those categories
- Presentation of instances is incremental
- Search through the hypothesis space is mostly hill-climbing
- Learning is unsupervised
- Concept hierarchy
4 Known Projects: EPAM (1/2)
- EPAM is a discrimination network
- Nodes are tests
- If familiarization (matching) fails, discrimination (adding a class) occurs
- For each operation, a new traversal of the discrimination tree is made
- No real concept hierarchy, because EPAM only stores concept descriptions at terminal nodes
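The familiarization/discrimination cycle above can be sketched roughly as follows. This is a heavily simplified toy, not EPAM itself: the `Node` class, the `learn` function, the dictionary-based instances, and the rule "test the first differing attribute" are all assumptions made for illustration.

```python
class Node:
    """One node of a toy discrimination net (hypothetical structure)."""
    def __init__(self, image=None):
        self.test = None      # attribute tested here; None means terminal node
        self.children = {}    # attribute value -> child Node
        self.image = image    # stored description, kept only at terminal nodes


def learn(root, instance):
    """Sort one instance down the net, then familiarize or discriminate."""
    node = root
    # Each call starts a fresh traversal from the root.
    while node.test is not None:
        value = instance.get(node.test)
        if value not in node.children:
            # No branch for this value: discrimination adds a new terminal node.
            node.children[value] = Node(image=dict(instance))
            return "discriminate"
        node = node.children[value]
    if node.image is None:
        node.image = dict(instance)   # empty terminal node: store the image
        return "familiarize"
    differing = [a for a in sorted(instance) if node.image.get(a) != instance[a]]
    if not differing:
        return "familiarize"          # the stored image matches the instance
    # Matching failed: turn the terminal node into a test on a differing attribute.
    attr = differing[0]
    old_image = node.image
    node.test, node.image = attr, None
    node.children[old_image.get(attr)] = Node(image=old_image)
    node.children[instance[attr]] = Node(image=dict(instance))
    return "discriminate"


root = Node()
for inst in [{"color": "red", "shape": "round"},
             {"color": "red", "shape": "square"}]:
    print(learn(root, inst))
```

Note that descriptions live only at terminal nodes; internal nodes hold nothing but a test, which is why the net is not a concept hierarchy.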
5 Known Projects: EPAM (2/2)
6 Known Projects: UNIMEM (1/2)
- Both terminal and nonterminal nodes hold concept information
- For any set of instances, any attribute-value pair, A_i = V_ij, and any class, C_k:
- Predictability: P(A_i = V_ij | C_k)
- How well can the feature be predicted given an instance of the concept?
- Predictiveness: P(C_k | A_i = V_ij)
- When the predictability of a feature > threshold1, the feature becomes a permanent part of the node's description
- When the predictability of a feature < threshold2, the feature is removed from the concept description
- Reorganization!
- Allows placing of instances in multiple categories
7 Known Projects: UNIMEM (2/2)
For any set of instances, any attribute-value pair, A_i = V_ij, and any class, C_k:
Predictability: P(A_i = V_ij | C_k)
Predictiveness: P(C_k | A_i = V_ij)
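To make the two conditional probabilities concrete, here is a minimal sketch that estimates them by counting. The function names, the dictionary-based instances, and the toy data are assumptions for illustration; UNIMEM itself tracks these quantities through scores on features rather than by recounting.

```python
def predictability(instances, labels, attr, value, cls):
    """Estimate P(attr = value | cls): how often members of cls show the feature."""
    in_class = [x for x, c in zip(instances, labels) if c == cls]
    if not in_class:
        return 0.0
    return sum(1 for x in in_class if x.get(attr) == value) / len(in_class)


def predictiveness(instances, labels, attr, value, cls):
    """Estimate P(cls | attr = value): how strongly the feature points to cls."""
    with_value = [c for x, c in zip(instances, labels) if x.get(attr) == value]
    if not with_value:
        return 0.0
    return sum(1 for c in with_value if c == cls) / len(with_value)


# Hypothetical toy data: every bird has wings, but so does one bat.
instances = [{"wings": "yes"}, {"wings": "yes"}, {"wings": "yes"}, {"wings": "no"}]
labels = ["bird", "bird", "bat", "bat"]
print(predictability(instances, labels, "wings", "yes", "bird"))
print(predictiveness(instances, labels, "wings", "yes", "bird"))
```

In this toy data the feature is fully predictable for "bird" yet only partly predictive of it, which is the distinction the two definitions draw.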
8 Known Projects: COBWEB (1/3)
- COBWEB stores probabilities of concepts in an is-a hierarchy
- Terminal nodes are always specific instances
- COBWEB never deletes instances
- COBWEB can split and merge nodes using category utility
- Category utility is the increase in the expected number of attribute values that can be guessed correctly given the categories, over the expected number of correct guesses without knowing the categories, i.e. using only P(A_i = V_ij)
- Formula for Category Utility
- COBWEB can handle only nominal attributes
- COBWEB retains all instances ever encountered in terminal nodes, which can lead to overfitting
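For reference, category utility as it is usually stated for COBWEB, written with the symbols used above. K, the number of categories in the partition being scored, is the only symbol not defined on this slide.

```latex
CU(C_1, \dots, C_K) =
  \frac{\sum_{k=1}^{K} P(C_k) \sum_{i} \sum_{j} P(A_i = V_{ij} \mid C_k)^2
        \;-\; \sum_{i} \sum_{j} P(A_i = V_{ij})^2}{K}
```

The first term is the expected number of correct guesses given the categories, the second is the same expectation without them, and dividing by K lets partitions of different sizes be compared.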
9 Known Projects: COBWEB (2/3)
Terminal nodes only contain probabilities of 1 and 0.
Root node: P(V | C) = P(V).
Other nodes: P(V | C) is relative to the parent.
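The probabilities stored at each node are what category utility is computed from. A minimal sketch for nominal attributes, assuming instances are plain dictionaries and a partition is a list of non-empty lists of instances (both are illustrative choices, not COBWEB's actual data structures):

```python
from collections import Counter


def expected_correct(instances):
    """Sum over attributes i and values j of P(A_i = V_ij)^2 within `instances`."""
    n = len(instances)
    counts = Counter((attr, val) for x in instances for attr, val in x.items())
    return sum((c / n) ** 2 for c in counts.values())


def category_utility(partition):
    """Category utility of a partition (list of non-empty clusters)."""
    everything = [x for cluster in partition for x in cluster]
    n = len(everything)
    k = len(partition)
    # P(C_k) is estimated as the fraction of instances in cluster k.
    within = sum(len(c) / n * expected_correct(c) for c in partition)
    return (within - expected_correct(everything)) / k


red = {"color": "red", "shape": "round"}
blue = {"color": "blue", "shape": "square"}
print(category_utility([[red, red], [blue, blue]]))   # clusters match the structure
print(category_utility([[red, blue], [red, blue]]))   # clusters ignore the structure
```

A partition whose clusters each mirror the overall distribution gains nothing over guessing from P(A_i = V_ij) alone, so its score is zero.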
10 Known Projects: COBWEB (3/3)
Merging or Splitting is considered at each level
of the classification process
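What merging and splitting do to one level of the hierarchy can be sketched as below. The `Concept` class and both helper functions are hypothetical; the sketch shows only the structural change and leaves out the category-utility comparison that decides whether an operator is applied.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Concept:
    count: int = 0                                       # instances under this node
    av_counts: Counter = field(default_factory=Counter)  # (attr, value) -> count
    children: list = field(default_factory=list)

    def prob(self, attr, value):
        """P(attr = value | this concept), estimated from stored counts."""
        return self.av_counts[(attr, value)] / self.count if self.count else 0.0


def merge(parent, a, b):
    """Replace siblings a and b by a new node that has them as children."""
    merged = Concept(a.count + b.count, a.av_counts + b.av_counts, [a, b])
    parent.children = [c for c in parent.children
                       if c is not a and c is not b] + [merged]
    return merged


def split(parent, node):
    """Remove node and promote its children to the parent's level."""
    parent.children = [c for c in parent.children if c is not node] + node.children


a = Concept(2, Counter({("color", "red"): 2}))
b = Concept(1, Counter({("color", "blue"): 1}))
c = Concept(1, Counter({("color", "green"): 1}))
root = Concept(4, a.av_counts + b.av_counts + c.av_counts, [a, b, c])
m = merge(root, a, b)
print(len(root.children), m.count, m.prob("color", "red"))
```

Because the two operators are inverses of each other, they let the hill-climbing search back out of a poor earlier choice caused by the order in which instances arrived.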
11 Known Projects: CLASSIT (1/6)
- Much like COBWEB (matching, creating, merging and splitting)
- Uses probability distributions for attribute values
- Different evaluation function, adjusted for continuous (interval) attributes
- with σ = standard deviation, μ = mean, I = number of attributes, K = number of classes
- σ_ik = standard deviation for a given attribute in a given class
- σ_ip = standard deviation for a given attribute in the parent node
- Cutoff: a "similar enough?" test that guards against overfitting
- Acuity: minimum value for σ, the minimal "just noticeable difference"
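With the symbols defined above, CLASSIT's evaluation function is usually written as (sum over k of P(C_k) times the sum over i of 1/σ_ik, minus the sum over i of 1/σ_ip), divided by K. A minimal sketch under that assumption, with instances as numeric tuples and acuity applied as a floor on σ (the function names and data layout are illustrative):

```python
import math


def std(values, acuity=1.0):
    """Population standard deviation, floored at `acuity` so 1/sigma stays finite."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return max(math.sqrt(var), acuity)


def classit_utility(partition, acuity=1.0):
    """Score a partition: a list of non-empty clusters of equal-length numeric tuples."""
    parent = [x for cluster in partition for x in cluster]
    n = len(parent)
    k = len(partition)
    num_attrs = len(parent[0])

    def inv_sigma_sum(cluster):
        # Sum over the I attributes of 1 / sigma_i for this set of instances.
        return sum(1.0 / std([x[i] for x in cluster], acuity)
                   for i in range(num_attrs))

    within = sum(len(c) / n * inv_sigma_sum(c) for c in partition)
    return (within - inv_sigma_sum(parent)) / k


tight = [[(1.0,), (1.2,)], [(9.0,), (9.2,)]]
mixed = [[(1.0,), (9.0,)], [(1.2,), (9.2,)]]
print(classit_utility(tight, acuity=0.1) > classit_utility(mixed, acuity=0.1))
```

Without the acuity floor, a node holding a single instance would have σ = 0 and an infinite score, so the floor is what keeps singleton clusters from always winning.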
12 Known Projects: CLASSIT (2/6)
13 Known Projects: CLASSIT (3/6)
14 Known Projects: CLASSIT (4/6)
15 Known Projects: CLASSIT (5/6)
16 Known Projects: CLASSIT (6/6)
17 Future Research for CLASSIT
- Think of an evaluation function that supports numerical data as well as symbolic attributes
- Improve the matching process
- Smart attribute selection
- Relative values of attributes instead of absolute ones
- Matching with missing attributes or components (partial match)