Title: Chapter 6 Knowledge Acquisition
26.1 ??
- ???? (Knowledge Acquisition) ?????????????????
Expertise Transfer
???
Computerized Representation
??
3?????????????
- ???????? (training cases)
- ??????
- ???????????
- ??????KE??
- ?????????
4??????
- Substantive(???????) Knowledge
- ???????
- ??????????????
- ???? (Strategic Knowledge)
- ????????
- ???3000???
5??????
Substantive Knowledge
Strategic Knowledge
Classification Decision making
Control Planning
MORE SALT MOLE ASK
Repertory Grid Approach
Other Approach
TEIRESIAS
KRITON
ETS
NeoETS
KSSO
AQUINAS
KITTEN
KNACK
RuleCon
6 The Acquisition of Substantive Knowledge
- Repertory Grid(????)-Oriented Methods
- ??? ?????????? (elements)
- ??? ??????????? (constructs)
- ????????, ?????????????, ??????????????????
- ??? ???????, ?????, ?15
- ??? ?????????? (Implication graph)
7??? ??????????
Measles, German Measles, Dengue Fever, Chickenpox, Smallpox
??? ???????????
Measles, German Measles, Dengue Fever, Chickenpox, Smallpox
1 5 1 5 1 1 2 5 5 2 4
high fever red purple headache
no high fever not red no purple no headache
8??? ???????, ?????
Measles, German Measles, Dengue Fever, Chickenpox, Smallpox
1 5 1 2 3 1 5 1 1 2 2 5 2 5 5 5 4 2 2 4
high fever red purple headache
no high fever not red no purple no headache
??? ??????????
headache
red
purple
high fever
9??????????
- First column
- IF high_fever and red and purple and (not headache)
- THEN Disease = Measles
- CF = MIN(0.8, 1.0, 0.8, 0.8) = 0.8
- Second column
- IF (not high_fever) and (not red) and (not purple) and (not headache)
- THEN Disease = German Measles
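The MIN combination used for the first column can be sketched in a few lines of Python. This is only an illustration: the function name `rule_certainty` is hypothetical, and pairing each of the four values 0.8, 1.0, 0.8, 0.8 with a condition in the order listed is an assumption, since the slide does not say which rating produced which value.

```python
def rule_certainty(condition_cfs):
    """Combine the certainties of a rule's conditions with MIN."""
    condition_cfs = list(condition_cfs)
    if not condition_cfs:
        raise ValueError("a rule needs at least one condition")
    return min(condition_cfs)

# Assumed pairing of conditions and certainties (order as on the slide).
measles_conditions = {
    "high_fever": 0.8,
    "red": 1.0,
    "purple": 0.8,
    "not headache": 0.8,
}

measles_cf = rule_certainty(measles_conditions.values())
print("IF " + " and ".join(measles_conditions)
      + f" THEN Disease = Measles (CF = {measles_cf})")
```

The weakest condition bounds the certainty of the whole conjunction, which is why a single 0.8 entry is enough to bring the rule down to 0.8.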
10?????????
- ???????????
- ???????????
- ????????
- ????????????
- ???????
- ????????
11 6.2 ELICITATION OF SUBSTANTIVE KNOWLEDGE
- ????? (Knowledge Representation)
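Repertory grid:
            dog   bird   fish
4-legs       1     5      5     not 4-legs
2-legs       5     1      5     not 2-legs
no-legs      5     5      1     has-legs
Acquisition table:
             dog   bird   fish
# of legs    4,2   2,2    0,2
In the entry "4,2" for dog, 4 states that a dog has 4 legs and 2 states that the expert is very sure.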
12 An acquisition table is a repertory grid of multiple data types
- Boolean: true or false.
- Single value: an integer, a real, or a symbol.
- Set of values: a set of integers, real numbers or symbols.
- Range of values: a set of integers or real numbers.
- X: no relation.
- U: unknown or undecidable.
- Ratings
- 2: very likely to be.
- 1: maybe.
136.3 ?????????
E1 E2 E3 E4 E5
C1 C2 C3 C4 1 5 5 4 2 1 5 1 1 5 1 5 1 2 2 1 5 1 1 4 C1 C2 C3 C4
14 Problem of Multi-Level Knowledge and
Acquirability
[Figure: INPUT DATA items lead to SUBGOALs, which in turn lead to the GOAL]
15 The Concept of Acquirability
- The value of a terminal attribute of a decision tree must either be a constant or be acquirable from users. For example:
- IF (leaf-shape = scale) and (class = Gymnosperm)
- THEN family = Cypress.
- Class is not an acquirable attribute.
16?
?
?
Leaf Shape
Class
Family
17Domain basis and classification knowledge
Diseases
Domain basis
(?????)
Other diseases
Acute Exanthemas
Classification knowledge
Measles, German measles, Dengue fever,
18 ???????
- ????????????????
- ??, ??, ??, ???,,
- ??????? ?????????, ???????????
- ??????????????
- (Headache = yes) and (Feel_tired = yes) and
- (cough = yes) and ...
- --> Disease = Catch_cold
19 ????????
- ??????????????, ???????????
- ???????????
206.4 EMCUD????????????
- ????? (Knowledge Representation)
- ???? (Conventional Repertory grid) ? Acquisition
Table -
- ?????? (Attribute Ordering Table - AOT)
21 ??????????????
      Obj1  Obj2  Obj3  Obj4  Obj5
A1     D     D     2     1     D
A2     1     1     1     D     D
A3     X     X     D     1     D
- ???AOT?????
- D??????????
- X????????
- ???????????????? (?????????)
22???????
      Obj1          Obj2   Obj3        Obj4    Obj5
A1    {9,10,12},2   20,2   (13-16],2   17,2    3,2
A2    YES,1         NO,2   YES,1       YES,2   NO,2
A3    X             X      4.3,2       2.1,2   6.0,2
- ??????????????
- RULE1: (A1 ∈ {9,10,12}) ∧ (A2 = YES) → GOAL = Obj1
- Where
- F(confidence) = 1.0 if confidence = 2
- F(confidence) = 0.8 if confidence = 1
- and
- Certainty Factor CF = MIN(F(2), F(1)) = 0.8
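As a rough sketch of how RULE1 follows from the Obj1 column of the acquisition table, the Python below turns each non-X entry into a condition and combines the confidences with F and MIN. The data layout (`obj1`, a dict of `(value, confidence)` pairs) and the helper name `make_rule` are assumptions made for illustration, not part of EMCUD itself.

```python
# F(confidence) as given on the slide: 2 -> 1.0, 1 -> 0.8
CONFIDENCE_TO_CF = {2: 1.0, 1: 0.8}

# Obj1 column of the acquisition table (hypothetical encoding):
# attribute -> (value, confidence), or "X" for "no relation".
obj1 = {"A1": ({9, 10, 12}, 2), "A2": ("YES", 1), "A3": "X"}

def make_rule(goal, column):
    """Build (conditions, goal, CF) from one column of the table."""
    conditions, cfs = [], []
    for attr, entry in column.items():
        if entry == "X":          # attribute is unrelated to this object
            continue
        value, confidence = entry
        if isinstance(value, (set, frozenset)):
            conditions.append(f"{attr} in {sorted(value)}")
        else:
            conditions.append(f"{attr} = {value}")
        cfs.append(CONFIDENCE_TO_CF[confidence])
    return conditions, goal, min(cfs)

conditions, goal, cf = make_rule("Obj1", obj1)
print("IF " + " and ".join(conditions) + f" THEN GOAL = {goal} (CF = {cf})")
```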
23??AOT???
- EMCUD: If A1 ∉ {9,10,12}, is it possible that GOAL = Obj1?
- EXPERT: No. /* This implies that A1 dominates Obj1, and AOT<Obj1,A1> = D */
- EMCUD: If A2 ≠ YES, is it possible that GOAL = Obj1?
- EXPERT: Yes. /* A2 does not dominate Obj1 */
- EMCUD: If A1 > 16 or A1 ≤ 13, is it possible that GOAL = Obj3?
- EXPERT: Yes. /* A1 does not dominate Obj3 */
- EMCUD: If A2 ≠ YES, is it possible that GOAL = Obj3?
- EXPERT: Yes. /* A2 does not dominate Obj3 */
- EMCUD: If A3 ≠ 4.3, is it possible that GOAL = Obj3?
- EXPERT: No. /* A3 does dominate Obj3 */
24 - EMCUD: Please rank A1 and A2 in the order of importance to Obj3 by choosing one of the following expressions:
- 1) A1 is more important than A2
- 2) A1 is less important than A2
- 3) A1 is as important as A2
- EXPERT: 1 /* A1 is more important to Obj3 than A2, hence AOT<Obj3,A1> = 2 and AOT<Obj3,A2> = 1 */
      Obj1  Obj2  Obj3  Obj4  Obj5
A1     D     D     2     1     D
A2     1     1     1     D     D
A3     X     X     D     1     D
25??????
- From RULE3, the following embedded rules will be generated by negating the predicates of A1 and A2:
- RULE3,1: NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
- RULE3,2: (13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
- RULE3,3: NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3
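The enumeration of embedded rules can be sketched as follows: every non-empty subset of the non-dominant predicates is negated, while predicates marked D in the AOT are kept as they are. The names `rule3`, `aot_obj3` and `embedded_rules` are hypothetical, and the predicate texts (in particular A3 = 4.3) are read off the acquisition table rather than stated in this form on the slide.

```python
from itertools import combinations

# Predicates of RULE3, keyed by attribute (illustrative text form).
rule3 = {"A1": "13 < A1 <= 16", "A2": "A2 = YES", "A3": "A3 = 4.3"}

# AOT column for Obj3: integers rank importance, "D" marks dominance.
aot_obj3 = {"A1": 2, "A2": 1, "A3": "D"}

def embedded_rules(predicates, aot):
    """Negate every non-empty subset of the non-dominant predicates."""
    negatable = [a for a in predicates if aot[a] not in ("D", "X")]
    rules = []
    for size in range(1, len(negatable) + 1):
        for negated in combinations(negatable, size):
            parts = [f"NOT({text})" if attr in negated else f"({text})"
                     for attr, text in predicates.items()]
            rules.append((negated, " and ".join(parts)))
    return rules

rules = embedded_rules(rule3, aot_obj3)
for index, (negated, text) in enumerate(rules, start=1):
    print(f"RULE3,{index}: {text} -> GOAL = Obj3")
```

With two negatable predicates this yields three embedded rules, in the same order as RULE3,1 to RULE3,3 above.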
26 Certainty Sequence (CS)
- Represents the degree of certainty degradation.
- CS(RULEi,j) = SUM(AOT<Obji,Ak>)
- for each Ak in the negated predicates of RULEi,j
- For example:
- CS(RULE3,3) = AOT<Obj3,A1> + AOT<Obj3,A2>
- = 2 + 1 = 3
- The embedded rules generated from RULE3:
- RULE3,1: NOT(13 < A1 ≤ 16) ∧ (A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3, CS = 2
- RULE3,2: (13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3, CS = 1
- RULE3,3: NOT(13 < A1 ≤ 16) ∧ NOT(A2 = YES) ∧ (A3 = 4.3) → GOAL = Obj3, CS = 3
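The CS values listed for the three embedded rules can be recomputed with a short Python sketch. The dictionary layout and the function name `certainty_sequence` are illustrative assumptions; the AOT values are those given for Obj3.

```python
# AOT column for Obj3 ("D" marks a dominant attribute).
aot_obj3 = {"A1": 2, "A2": 1, "A3": "D"}

# Attributes whose predicates are negated in each embedded rule.
negated_predicates = {
    "RULE3,1": ["A1"],
    "RULE3,2": ["A2"],
    "RULE3,3": ["A1", "A2"],
}

def certainty_sequence(negated, aot):
    """CS = sum of the AOT entries of the negated attributes."""
    return sum(aot[attr] for attr in negated)

cs = {name: certainty_sequence(negated, aot_obj3)
      for name, negated in negated_predicates.items()}
print(cs)
```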
27 Construct Constraint List
- Sort the embedded rules according to the CS values:
- RULE3,2: CS = 1
- RULE3,1: CS = 2
- RULE3,3: CS = 3
- A prune-and-search algorithm
- EMCUD: Do you think RULE3,1 is acceptable?
- Expert: Yes. /* then RULE3,2 is also accepted */
- EMCUD: Do you think RULE3,3 is acceptable?
- Expert: No. /* then CS = 3 is recorded in the constraint list */
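One way to read the prune-and-search step is as a binary search over the rules sorted by CS, under the assumption that acceptance is monotone (if a rule is accepted, every rule with a smaller CS is accepted too). The sketch below follows that reading; it is not necessarily the exact procedure EMCUD uses, and `elicit_threshold` and the `ask` callback are hypothetical names standing in for the dialogue with the expert.

```python
def elicit_threshold(sorted_rules, ask):
    """Binary search for the first unacceptable rule.

    sorted_rules: list of (name, cs) in ascending CS order.
    ask: callable(name) -> bool, the expert's answer.
    Returns (accepted rule names, CS recorded as constraint or None).
    """
    lo, hi = 0, len(sorted_rules)   # rules[:lo] accepted, rules[hi:] rejected
    while lo < hi:
        mid = (lo + hi) // 2
        if ask(sorted_rules[mid][0]):
            lo = mid + 1            # this rule and all weaker-CS rules pass
        else:
            hi = mid                # this rule and all higher-CS rules fail
    accepted = [name for name, _ in sorted_rules[:lo]]
    constraint = sorted_rules[lo][1] if lo < len(sorted_rules) else None
    return accepted, constraint

# Replay of the dialogue: RULE3,1 is acceptable, RULE3,3 is not.
answers = {"RULE3,1": True, "RULE3,2": True, "RULE3,3": False}
asked = []

def ask(name):
    asked.append(name)
    return answers[name]

rules = [("RULE3,2", 1), ("RULE3,1", 2), ("RULE3,3", 3)]
accepted, constraint = elicit_threshold(rules, ask)
print(asked, accepted, constraint)
```

Under this reading only two questions are needed for three rules, and the expert is never asked about RULE3,2 directly.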
28?????? (Certainty Factors)
- Confirm: 1.0
- Strongly support: 0.8
- Support: 0.6
- May support: 0.4
- CFi,j = Upper-Boundi - (CSi,j / MAX(CSi)) × (Upper-Boundi - Lower-Boundi)
- MAX(CSi): maximum CS value of the embedded rules generated from RULEi.
- Upper-Boundi: certainty factor of the original rule RULEi
- Lower-Boundi: certainty factor of the embedded rule with MAX(CSi) /* the rule with least confidence */
29???????????
- ???RULE3???????
- 1. Upper Bound CF(RULES3) 0.8
- 2. ??RULES3 ?????, ?????????? (MAX(CS)) ??
- RULE3,1
- EMCUD: If RULE3 strongly supports GOAL = Obj3, what about RULE3,1?
- Expert: 1. /* The Lower-Bound = 0.6 */
- CF3,1 = 0.8 - (2/2) × (0.8 - 0.6) = 0.6
- CF3,2 = 0.8 - (1/2) × (0.8 - 0.6) = 0.7
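The mapping from CS values to certainty factors is a linear interpolation between the upper and lower bounds, which the following sketch reproduces for the RULE3 example. The function name `embedded_cf` is hypothetical; MAX(CS) is taken as 2 here because, in the example, RULE3,1 is treated as the accepted embedded rule with the largest CS.

```python
def embedded_cf(cs, max_cs, upper, lower):
    """CF = Upper - (CS / MAX(CS)) * (Upper - Lower)."""
    if max_cs <= 0:
        raise ValueError("MAX(CS) must be positive")
    return upper - (cs / max_cs) * (upper - lower)

upper_bound = 0.8   # CF of RULE3 ("strongly support")
lower_bound = 0.6   # expert's answer for the rule with MAX(CS)

cf_3_1 = embedded_cf(2, 2, upper_bound, lower_bound)
cf_3_2 = embedded_cf(1, 2, upper_bound, lower_bound)
print(round(cf_3_1, 2), round(cf_3_2, 2))
```

The rule with CS = MAX(CS) lands exactly on the lower bound, and rules with fewer or less important negated predicates fall proportionally closer to the upper bound.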
30repertory grid
original rules
Attribute-Ordering Table
eliciting embedded rules
possible embedded rules
Constraint List
thresholding
accepted embedded rules
mapping
mapping function
certainty factors of the embedded rules
31ACQUISITION TABLE
? ?
? ? ? ? ? ? YES YES YES
AOT
? ?
? ? ? ? ? ? YES,2 YES,2 YES,1
32- ???????
- IF (?? = YES) ∧ (?? = YES) ∧ (?? = YES)
- THEN DISEASE = ??
- EMCUD
- IF (?? = YES) ∧ (?? <> YES) ∧ (?? = YES)
- THEN DISEASE = ??, CF = 0.67
- IF (?? = YES) ∧ (?? = YES) ∧ (?? <> YES)
- THEN DISEASE = ??, CF = 0.73
- IF (?? = YES) ∧ (?? <> YES) ∧ (?? <> YES)
- THEN DISEASE = ??, CF = 0.6
33 OBJECT CHAIN: A METHOD FOR QUESTION SELECTION
- For a grid with 50 elements (or objects), there are C(50,3) = 19600 possible choices of questions (triads of elements) to elicit constructs (or attributes).
- Initial repertory grid and the object chains
- OBJECT CHAIN
- Obj1 --> 2,3,4,5
- Obj2 --> 1,3,4,5
- Obj3 --> 1,2,4,5
- Obj4 --> 1,2,3,5
- Obj5 --> 1,2,3,4
      Obj1  Obj2  Obj3  Obj4  Obj5
34 The expert gives attribute P1 to distinguish Obj1 and Obj2 from Obj3
- OBJECT CHAIN
- Obj1 --> 2,5
- Obj2 --> 1,5
- Obj3 --> 4
- Obj4 --> 3
- Obj5 --> 1,2
      Obj1  Obj2  Obj3  Obj4  Obj5
P1     T     T     F     F     T
35 The expert gives attribute P2 to distinguish Obj2 and Obj5 from Obj1
- OBJECT CHAIN
- Obj1 --> NULL
- Obj2 --> 5
- Obj3 --> NULL
- Obj4 --> NULL
- Obj5 --> 2
      Obj1  Obj2  Obj3  Obj4  Obj5
P1     T     T     F     F     T
P2     T     F     T     F     F
36 The expert gives attribute P3 to distinguish Obj2 from Obj5
- OBJECT CHAIN
- Obj1 --> NULL
- Obj2 --> NULL
- Obj3 --> NULL
- Obj4 --> NULL
- Obj5 --> NULL
      Obj1  Obj2  Obj3  Obj4  Obj5
P1     T     T     F     F     T
P2     T     F     T     F     F
P3     F     T     T     F     F
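The object chains in this walk-through can be recomputed from the grid: an object's chain holds every other object that has the same value on all attributes elicited so far. The sketch below assumes that reading; `object_chains` is a hypothetical helper, and the P1 to P3 rows are the ones used in the three steps (with P1 = T T F F T throughout).

```python
from math import comb

OBJECTS = ["Obj1", "Obj2", "Obj3", "Obj4", "Obj5"]

def object_chains(grid, objects=OBJECTS):
    """Map each object to the objects it cannot yet be told apart from."""
    profiles = {obj: tuple(row[i] for row in grid.values())
                for i, obj in enumerate(objects)}
    return {obj: [other for other in objects
                  if other != obj and profiles[other] == profiles[obj]]
            for obj in objects}

grid = {}
initial = object_chains(grid)                 # every object chained to all others
grid["P1"] = ["T", "T", "F", "F", "T"]
after_p1 = object_chains(grid)
grid["P2"] = ["T", "F", "T", "F", "F"]
after_p2 = object_chains(grid)
grid["P3"] = ["F", "T", "T", "F", "F"]
after_p3 = object_chains(grid)

print(after_p1)
print(after_p2)
print(after_p3)
print(comb(50, 3))   # number of triads for a 50-element grid
```

Elicitation stops once every chain is empty, which here takes three questions for five objects.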
37 Advantages
- Fewer questions are asked (log2 n to n-1 questions).
- All of the objects are classified.
- Every question matches the current requirement of classifying objects.
Disadvantages
- It may force the expert to think in a specific direction.
- Some important attributes may be ignored.
38 Eliciting hierarchy of grids
- For the expert system of classifying families of plants
Goal is FAMILY
              Cypress      Pine              Bald Cypress   Magnolia
Leaf shape    scale        needle            needle         scale
Needle pat.   X            random,evenline   evenline       X
Class         Gymnosperm   Gymnosperm        Gymnosperm     Magnolia
Silver band   X            T                 F              X
39 - Since class is not acquirable, it becomes the goal of a new grid.
Goal is CLASS
         Gymnosperm   Magnolia   Angiosperm
type     Tree         Herb       Tree
flate    F            T          T
40 - Since type is not acquirable, it becomes the goal of a new grid.
Goal is TYPE
            Herb    Vine       Tree      Shrub
stem        green   woody      woody     woody
position    X       creeping   upright   upright
one trunk   F       T          T         F
41Decision tree of the hierarchy of grids
FAMILY OF PLANT
LEAF SHAPE
NEEDLE PATTERN
CLASS
TYPE
FLATE
STEM
POSITION
ONE TRUNK
426.5 EMCUD ????????
- ????
- ???????
- ??
- ????
- ??
- Personal Consultant Easy
43case number 1 2 3 4 5 6 7 8 9 10 11 12 13
physician(??) 12 3 3 1 2 1 14 2 6 5 5 3 1
old prototype 12 X X X X X 14 X 6 X X 3 1
new prototype 12 3 3 1 2 1 14 2 6 5 5 3 1
case number 14 15 16 17 18 19 20 21 22 23 24 25
physician 6 6 12 5 8 9 14 13 4 1 2 14
old prototype X X 12 5 X 9 14 13 4 1 2 14
new prototype 6 6 12 5 8 9 14 13 4 1 2 14
- The codes of diseases and their translations
- 1 - Measles
- 2 - German measles
- 3 - Chickenpox
- 4 - Smallpox
- 5 - Scarlet
- 6 - Exanthem subitum
- 7 - Fifth disease
- 8 - Meningococcemia
- 9 - Rocky Mt. Spotted fever
- 10 - Typhus fevers
- 11 - Infectious mononucleosis
- 12 - Enterovirus infections
- 13 - Drug eruptions
- 14 - Eczema herpeticum
- Table 6.3: Testing results of the old and new prototypes.
446.6 ???????
- ?????????????, ??????????????
- ???
- Synonyms of elements (possible solutions)
- Synonyms of traits (attributes to classify the
solutions) - Conflicts of ratings
45 Each expert has his or her own way of doing the work.
Habitual domain of Expert 1
Habitual domain of Expert 2
Integrated Knowledge Use more attributes to make
choices from more possible decisions
46Expert 1
Expert 2
Expert N
Busy
Busy
Busy
Far away
Far away
Knowledge Engineer
It is difficult to have all of the experts work
together
47Expert 1
Expert 2
Expert N
Phase 1 interview
Repertory Grid 1
Repertory Grid 2
Repertory Grid N
The unions of element sets and construct sets
Common Repertory Grid
Phase 2 interview
Expert 1
Expert 2
Expert N
Eliminate some redundant vocabularies
Common Repertory Grid
48Phase 3 interview
Expert 1
Expert 2
Expert N
Rated Common Repertory Grid 1
Rated Common Repertory Grid 2
Rated Common Repertory Grid N
Knowledge Integration
Integrated Repertory Grid
Rule Generation
49Repertory Grid 1
Repertory Grid 2
Repertory Grid N
The unions of element sets and construct sets
Common Repertory Grid
Phase 2 interview
Expert 1
Expert 2
Expert N
Eliminate some redundant vocabularies
Common Repertory Grid
Phase 3 interview
Expert 1
Expert 2
Expert N
50Rated Common Repertory Grid 1
Rated Common Repertory Grid 2
Rated Common Repertory Grid N
Knowledge Integration
Integrated Repertory Grid
Flat Repertory Grid
Generate AOT
AOT
Filled AOT 2
Filled AOT 1
Filled AOT N
Integration of AOTs
Integrated AOT
Rule Generation
51 Expert 1
                           E2  E1  E3  E4  E5
Eye pain                   5   4   1   4   5
Pupil size                 1   1   5   1   1
headache                   4   4   5   3   1
Cornea                     5   5   5   4   3
Inflame of Eye             4   1   1   5   4
Tears                      4   1   1   5   5
Redness                    5   1   1   5   4
Vision                     1   4   5   1   1
Pupillary light response   5   2   2   5   5
Both Side                  5   1   4   1   1
Expert 2
                           E2  E1  E3  E4  E5
Eye pain                   5   3   1   5   4
Pupil size                 1   2   4   1   1
headache                   3   4   5   2   1
Cornea                     5   5   5   3   2
Inflame of Eye             5   1   1   5   4
Tears                      4   1   1   4   5
Redness                    5   1   1   5   5
Vision                     1   3   4   1   1
Pupillary light response   5   2   1   5   5
Both Side                  5   1   3   1   1
Knowledge Integration
52 Expert 3
                           E2  E1  E3  E4  E5
Eye pain                   5   4   1   5   5
Pupil size                 1   1   5   1   1
headache                   4   4   5   2   1
Cornea                     5   5   5   4   2
Inflame of Eye             5   1   1   5   4
Tears                      4   1   1   5   5
Redness                    5   1   1   5   5
Vision                     1   4   5   1   1
Pupillary light response   5   2   1   5   5
Both Side                  5   1   4   1   1
53 Results of the first experiment
- Differential Diagnosis for Common Causes of Inflamed Eyes.
- 60 test cases are used to evaluate the knowledge base from Expert 1, the knowledge base from Expert 2, and the integrated knowledge base.
Knowledge base   Ratio of Correct Diagnosis
Expert 1         0.67
Expert 2         0.64
Integrated       0.8
54 Results of the first experiment
- Differential Diagnosis for Common Causes of Inflamed Eyes.
- 336 test cases are used to evaluate the knowledge base from Expert 1, the knowledge base from Expert 2, and the integrated knowledge base.
Knowledge base   Number of Correct Diagnosis   Ratio of Correct Diagnosis
Expert 1         255                           0.759
Expert 2         243                           0.723
Integrated       306                           0.911
556.7 ???? (Machine Learning)
- ???????, ???????????????????????
- ??
- Expert Systems
- Cognitive(??) Simulation
- Problem Solving
- Control
- ??
- Perceptron (Rosenblatt, 1961)
- Meta-Dendral (Buchanan, Feigenbaum, Sridharan, 1972)
- AM (Lenat, 1976)
- LEX (Mitchell, Utgoff, Banerji, 1983)
56Michalski, 1983
Learning
Learning by Analogy
Rote Learning
Learning by Instruction
Learning by Induction
Learning from Observation and Discovery
Learning from Examples
57Machine Learning(????) Central to A.I.
Learning from Examples.
59Learning Strategies
Neural Learning
Symbolic Learning
Incremental Learning
Batch Learning
e.g. ID3
e.g. Perceptron
e.g. Version Space
e.g. PRISM
60???????????T.M. Mitchell 1979
- Depth-first search
- Specific-to-general breadth-first search
- Version space
61??
- ?????
- an unordered pair of simple objects, each characterized by three attributes (size, color, shape)
- ????
- {(Large,Red,Triangle) (Small,Blue,Circle)}
- {(Large,Blue,Circle) (Small,Red,Triangle)}
- {(Large,Blue,Triangle) (Small,Blue,Triangle)}
-
62?????? (Depth-first search)
1.(Large,Red,Triangle) (Small,Blue,Circle)
(Large,Red,Triangle) (Small,Blue,Circle)
2.(Large,Blue, Circle) (Small,Red, Triangle)
(Large,Red,Triangle) (Small,Blue,Circle)
(Large,?,?) (Small,?,?)
3.(Large,Blue,Triangle) (Small,Blue,Triangle)
(Large,Red,Triangle) (Small,Blue,Circle)
(Large,?,?) (Small,?,?)
(?,Red,Triangle) (?,Blue,Circle)
63?? 1. ?????? (backtracking) 2.
??????????????????
64Specific-to-general breadth-first search
1.(Large,Red,Triangle) (Small,Blue,Circle)
(Large,Red,Triangle) (Small,Blue,Circle)
2.(Large,Blue, Circle) (Small,Red, Triangle)
(Large,Red,Triangle) (Small,Blue,Circle)
(Large,?,?) (Small,?,?)
(?,Red,Triangle) (?,Blue,Circle)
3.(Large,Blue,Triangle) (Small,Blue,Triangle)
(Large,Red,Triangle) (Small,Blue,Circle)
(Large,?,?) (Small,?,?)
(?,Red,Triangle) (?,Blue,Circle)
65?? Needs to check past negative instances to
assure that the revised generalization is not
overly general
66 Symbolic Learning: determine one or several hypotheses, each of which is consistent with the presented training instances
Hypothesis Space
Attributes
Learning Unit
Matching Predicates
Training Instances
67Example assume only one attribute exists
transc
trig
explog
sin
cos
tan
ln
exp
- Instance space: terminal nodes; Hypothesis space: all nodes
- Predicates: predecessor-successor relations
- Positive Training Instances: sin and cos
- Negative Training Instance: ln
- ⇒ Concept: trig
68Terminology
- An Instance Space
- a set of instances which can be legally described by a given instance language
- Attribute-based Instance Space
- Structured Instance Space
- A Hypothesis Space
- a set of hypotheses which can be legally described by a generalization language
- Conjunctive Form, e.g. Color = red and shape = convex
- Disjunctive Form, e.g. C1 or C2 or C3
- Most prevalent form: conjunctive form
5 kinds of expressions
69Terminology
- Predicates
- required for testing whether a given instance is contained in the instance set corresponding to a given hypothesis
- Powerful basis for organizing a search
- Two partial ordering relations exist:
- A is more specific than B
- B is more general than A
- if each instance contained in A is also contained in B
70????
- ? ???????? (Incremental Learning)
- ? For Conjunctive Hypothesis
- Idea
- ??????????????S?G???
- S????? (Specific) ???
- G????? (General) ???
-
G
more general
more specific
S
71??
transc
trig
explog
sin
cos
tan
ln
exp
- (sin, +): S = {sin}, G = {transc}
- (ln, -): S = {sin}, G = {trig}
- (cos, +): S = {trig}, G = {trig}
- Concept: trig
- Lemma: for all a ∈ S and b ∈ G, a is more specific than b
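The three updates of S and G on the function hierarchy can be traced with a small candidate-elimination sketch. It is specialised to a tree-shaped hypothesis space, where S is always a single node; the names `PARENT`, `covers`, `specialize` and `version_space` are hypothetical, and the general algorithm for conjunctive hypotheses over several attributes needs more machinery than shown here.

```python
# Generalization tree from the slide: child -> parent.
PARENT = {
    "trig": "transc", "explog": "transc",
    "sin": "trig", "cos": "trig", "tan": "trig",
    "ln": "explog", "exp": "explog",
}

def ancestors(node):
    """Return node, its parent, ..., up to the root."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def covers(hypothesis, instance):
    """A hypothesis covers every node at or below it in the tree."""
    return hypothesis in ancestors(instance)

def specialize(g, negative, s):
    """Most general descendants of g that exclude `negative` and still cover s."""
    if not covers(g, negative):
        return [g] if s is None or covers(g, s) else []
    children = [c for c, p in PARENT.items() if p == g]
    return [h for c in children for h in specialize(c, negative, s)]

def version_space(examples, root="transc"):
    s, g_set, trace = None, [root], []
    for instance, positive in examples:
        if positive:
            # Generalize S minimally; drop members of G that miss the instance.
            s = instance if s is None else next(
                a for a in ancestors(s) if covers(a, instance))
            g_set = [g for g in g_set if covers(g, instance)]
        else:
            # Specialize G minimally so that the negative instance is excluded.
            g_set = [h for g in g_set for h in specialize(g, instance, s)]
        trace.append((s, list(g_set)))
    return trace

trace = version_space([("sin", True), ("ln", False), ("cos", True)])
for s, g_set in trace:
    print("S =", s, " G =", g_set)
```

No past instance is revisited at any step: S and G alone summarise everything seen so far, and learning has converged once they coincide.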
72
1. {(Large,Red,Triangle) (Small,Blue,Circle)} (positive instance)
S: {(Large,Red,Triangle) (Small,Blue,Circle)}
G: {(?,?,?) (?,?,?)}
2. {(Large,Blue,Circle) (Small,Red,Triangle)} (positive instance)
S: {(Large,?,?) (Small,?,?)}, {(?,Red,Triangle) (?,Blue,Circle)}
G: {(?,?,?) (?,?,?)}
3. {(Large,Blue,Triangle) (Small,Blue,Triangle)} (negative instance)
S: {(?,Red,Triangle) (?,Blue,Circle)}
G: {(?,Red,?) (?,?,?)}, {(?,?,Circle) (?,?,?)}
73 Check contradiction between S and G
- Step 1: Take a generalization s in S and a generalization g in G. Check s with g; if g is not more general than s, mark s and g.
- Step 2: Repeat Step 1 until each generalization in S and G is processed.
- Step 3: Discard those generalizations in S with G marks and those in G with S marks.
74 Advantage: There is no need to check past instances, which is the reason to apply it in our parallel learning algorithm
75ID 3
Attribute 1
Value 2
Value 1
Attribute 2
Attribute 2
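76 Entropy: for each attribute, calculate the entropy
$$E = \sum_{i=1}^{m} \left[ -n_i^{+} \log_2 \frac{n_i^{+}}{n_i^{+} + n_i^{-}} - n_i^{-} \log_2 \frac{n_i^{-}}{n_i^{+} + n_i^{-}} \right]$$
where m is the number of values of the attribute, and $n_i^{+}$ and $n_i^{-}$ are the numbers of positive and negative instances having the i-th value.
Among all the feasible attributes, the one which causes the minimum entropy will be chosen as the next attribute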
77Example
COLOR  black brown brown black brown black brown brown brown black black black
SIZE   large large medium small medium large small small large medium medium small
COAT   shaggy smooth shaggy shaggy smooth smooth shaggy smooth shaggy shaggy smooth smooth
COLOR - - - - - -
78 - For attribute color: black: n+ = 2, n- = 4
- brown: n+ = 4, n- = 2
- E(color) = -2 log2(2/6) - 4 log2(4/6) - 4 log2(4/6) - 2 log2(2/6)
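The E(color) expression can be evaluated directly. The sketch below uses the count-weighted form that the worked expression implies, with one (n+, n-) pair per attribute value; the function name `attribute_entropy` is hypothetical, and treating a zero count as contributing 0 is an added convention that the slide does not state.

```python
from math import log2

def attribute_entropy(counts):
    """Sum over values of -n+ log2(n+/n) - n- log2(n-/n), with n = n+ + n-."""
    total = 0.0
    for n_pos, n_neg in counts.values():
        n = n_pos + n_neg
        for k in (n_pos, n_neg):
            if k:                      # assumed convention: 0 * log2(0) = 0
                total -= k * log2(k / n)
    return total

# (n+, n-) for each value of COLOR, as given in the example.
color_counts = {"black": (2, 4), "brown": (4, 2)}
e_color = attribute_entropy(color_counts)
print(round(e_color, 3))
```

Computing the same quantity for SIZE and COAT and picking the smallest value gives the attribute to test next; that step needs the per-instance class labels, which are not reproduced here.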
79 [Figure: decision tree produced for the example, with tests on SIZE (large, medium, small), COAT (shaggy, smooth) and COLOR (black, brown)]
80 PRISM (Cendrowska, 1987)
- Attribute-Value Pair
- e.g. A1, A2, A3,
- Instead of Attribute
- Information Gain
- e.g. $\log_2 \frac{P(\text{Class 1} \mid A_1)}{P(\text{Class 1})}$
- Minimize Number of Rules
- And Number of Attributes
81 - COLOR = black: 2/6
- COLOR = brown: 4/6
- SIZE = small: 1/4
- SIZE = medium: 1/4
- SIZE = large: 4/4
- COAT = shaggy: 3/6
- COAT = smooth: 3/6
- SIZE = large is chosen
- SIZE = large → Positive Class
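The choice of SIZE = large can be reproduced from the listed fractions alone. In this sketch each attribute-value pair maps to (positive instances, instances with that value); the prior of 6/12 for the positive class is an assumption derived from the colour counts (2 + 4 positives among 12 instances), and `best_pair` and `information_gain` are hypothetical names.

```python
from math import log2

# (positive count, total count) for each attribute-value pair.
pair_counts = {
    ("COLOR", "black"): (2, 6),
    ("COLOR", "brown"): (4, 6),
    ("SIZE", "small"): (1, 4),
    ("SIZE", "medium"): (1, 4),
    ("SIZE", "large"): (4, 4),
    ("COAT", "shaggy"): (3, 6),
    ("COAT", "smooth"): (3, 6),
}

PRIOR_POSITIVE = 6 / 12   # assumed: 6 positive instances out of 12

def information_gain(pos, total, prior=PRIOR_POSITIVE):
    """log2( P(class | pair) / P(class) ); undefined when pos == 0."""
    return log2((pos / total) / prior)

def best_pair(counts):
    """Pick the pair with the highest P(positive | pair)."""
    return max(counts, key=lambda pair: counts[pair][0] / counts[pair][1])

chosen = best_pair(pair_counts)
gain = information_gain(*pair_counts[chosen])
print(chosen, gain)
```

Because the prior is the same for every pair, maximising the gain is equivalent to maximising P(positive | pair), which is why the comparison can be made on the raw fractions.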
82Exercise
- ????????,????Repertory Grid(????)???????????
- ????????????????????Embedded Meanings(????)?