Transcript and Presenter's Notes

Title: Machine Learning: Symbol-based


1
Machine Learning: Symbol-based
Chapter 9
9.0 Introduction
9.1 A Framework for Symbol-based Learning
9.2 Version Space Search
9.3 The ID3 Decision Tree Induction Algorithm
9.4 Inductive Bias and Learnability
9.5 Knowledge and Learning
9.6 Unsupervised Learning
9.7 Reinforcement Learning
9.8 Epilogue and References
9.9 Exercises
Additional sources used in preparing the slides: Jean-Claude Latombe's CS121 (Introduction to Artificial Intelligence) lecture notes, http://robotics.stanford.edu/latombe/cs121/2003/home.htm (version spaces, decision trees), and Tom Mitchell's machine learning notes (explanation-based learning)
2
Chapter Objectives
  • Learn about several paradigms of symbol-based
    learning
  • Learn about the issues in implementing and using
    learning algorithms
  • The agent model can learn, i.e., can use prior
    experience to perform better in the future

3
A learning agent
[Figure: a learning agent, with a critic and a learning element connected to the KB, and sensors and actuators interacting with the environment]
4
A general model of the learning process
5
A learning game with playing cards
  • I would like to show what a full house is. I give
    you examples which are/are not full houses:
  • 6 6 6 9 9 is a full house
  • 6 6 6 6 9 is not a full house
  • 3 3 3 6 6 is a full house
  • 1 1 1 6 6 is a full house
  • Q Q Q 6 6 is a full house
  • 1 2 3 4 5 is not a full house
  • 1 1 3 4 5 is not a full house
  • 1 1 1 4 5 is not a full house
  • 1 1 1 4 4 is a full house

6
A learning game with playing cards
  • If you haven't guessed already, a full house is
    three of a kind and a pair of another kind.
  • 6 6 6 9 9 is a full house
  • 6 6 6 6 9 is not a full house
  • 3 3 3 6 6 is a full house
  • 1 1 1 6 6 is a full house
  • Q Q Q 6 6 is a full house
  • 1 2 3 4 5 is not a full house
  • 1 1 3 4 5 is not a full house
  • 1 1 1 4 5 is not a full house
  • 1 1 1 4 4 is a full house

7
Intuitively,
  • I'm asking you to describe a set. This set is the
    concept I want you to learn.
  • This is called inductive learning, i.e., learning
    a generalization from a set of examples.
  • Concept learning is a typical inductive learning
    problem: given examples of some concept, such as
    "cat", "soybean disease", or "good stock
    investment", we attempt to infer a definition
    that will allow the learner to correctly
    recognize future instances of that concept.

8
Supervised learning
  • This is called supervised learning because we
    assume that there is a teacher who classified the
    training data: the learner is told whether an
    instance is a positive or negative example of a
    target concept.

9
Supervised learning?
  • This definition might seem counterintuitive. If
    the teacher knows the concept, why doesn't s/he
    tell us directly and save us all the work?
  • The teacher only knows the classification; the
    learner has to find out what characterizes it.
    Imagine an online store: there is a lot of
    data concerning whether a customer returns to the
    store. The information is there in terms of
    attributes and whether they come back or not.
    However, it is up to the learning system to
    characterize the concept, e.g.,
  • If a customer bought more than 4 books, s/he will
    return.
  • If a customer spent more than $50, s/he will
    return.

10
Rewarded card example
  • Deck of cards, with each card designated by
    [r,s], its rank and suit, and some cards
    "rewarded"
  • Background knowledge in the KB:
    ((r=1) ∨ … ∨ (r=10)) ⇒ NUM(r)
    ((r=J) ∨ (r=Q) ∨ (r=K)) ⇒ FACE(r)
    ((s=S) ∨ (s=C)) ⇒ BLACK(s)
    ((s=D) ∨ (s=H)) ⇒ RED(s)
  • Training set:
    REWARD(4,C) ∧ REWARD(7,C) ∧ REWARD(2,S) ∧
    ¬REWARD(5,H) ∧ ¬REWARD(J,S)

11
Rewarded card example
  • Training set:
    REWARD(4,C) ∧ REWARD(7,C) ∧ REWARD(2,S) ∧
    ¬REWARD(5,H) ∧ ¬REWARD(J,S)
  • Card: in the target set?
  • 4♣: yes
  • 7♣: yes
  • 2♠: yes
  • 5♥: no
  • J♠: no
  • Possible inductive hypothesis h:
    h ≡ NUM(r) ∧ BLACK(s) ⇒ REWARD(r,s)
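A minimal runnable sketch (not part of the original slides) that checks this hypothesis against the five training cards; the Python encoding of ranks and suits is an assumption made for illustration.

    # Check that h = NUM(r) AND BLACK(s) => REWARD agrees with the training set.
    NUM = set(range(1, 11))            # numbered ranks 1..10
    BLACK = {"S", "C"}                 # spades and clubs

    def h(rank, suit):
        # hypothesized concept: a card is rewarded iff it is a numbered black card
        return rank in NUM and suit in BLACK

    training = [((4, "C"), True), ((7, "C"), True), ((2, "S"), True),
                ((5, "H"), False), (("J", "S"), False)]
    for (rank, suit), rewarded in training:
        assert h(rank, suit) == rewarded   # h agrees with every training example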

12
Learning a predicate
  • Set E of objects (e.g., cards, drinking cups,
    writing instruments)
  • Goal predicate CONCEPT(X), where X is an object
    in E, that takes the value True or False (e.g.,
    REWARD, MUG, PENCIL, BALL)
  • Observable predicates A(X), B(X), … (e.g., NUM,
    RED, HAS-HANDLE, HAS-ERASER)
  • Training set: values of CONCEPT for some
    combinations of values of the observable
    predicates
  • Find a representation of CONCEPT of the form
    CONCEPT(X) ⇔ A(X) ∧ (B(X) ∨ C(X))

13
How can we do this?
  • Go with the most general hypothesis possible:
    "any card is a rewarded card." This will cover
    all the positive examples, but will not be able
    to eliminate any negative examples.
  • Go with the most specific hypothesis possible:
    "the rewarded cards are 4♣, 7♣, 2♠." This will
    correctly sort all the examples in the training
    set, but it is overly specific and will not be
    able to sort any new examples.
  • But the above two are good starting points.

14
Version space algorithm
  • What we want to do is start with the most
    general and most specific hypotheses; when we
    see a positive example, we minimally generalize
    the most specific hypotheses, and when we see a
    negative example, we minimally specialize the
    most general hypotheses.
  • When the most general hypothesis and the most
    specific hypothesis are the same, the algorithm
    has converged; this is the target concept.

15
Pictorially

[Figure: instance space with positive (+) and negative (-) examples; the hypotheses lying between the boundary of G and the boundary of S are the potential target concepts]
16
Hypothesis space
  • When we shrink G, or enlarge S, we are
    essentially conducting a search in the hypothesis
    space
  • A hypothesis is any sentence h of the form
    CONCEPT(X) ⇔ A(X) ∧ (B(X) ∨ C(X)), where the
    right-hand side is built with observable
    predicates
  • The set of all hypotheses is called the
    hypothesis space, or H
  • A hypothesis h agrees with an example if it
    gives the correct value of CONCEPT

17
Size of the hypothesis space
  • n observable predicates
  • 2^n entries in the truth table
  • A hypothesis assigns True or False to each of the
    2^n entries (i.e., it is any subset of them), so
    there are 2^(2^n) hypotheses to choose from:
    BIG!
  • n=6 → 2^64 ≈ 1.8 x 10^19: BIG!
  • Generate-and-test won't work.

18
Simplified Representation for the card problem
  • For simplicity, we represent a concept by [r,s],
    with
  • r ∈ {a, n, f, 1, …, 10, j, q, k}
  • s ∈ {a, b, r, ♠, ♣, ♦, ♥}. For example:
  • [n,♣] represents NUM(r) ∧ (s=♣) ⇒
    REWARD(r,s)
  • [a,a] represents
    ANY-RANK(r) ∧ ANY-SUIT(s) ⇒ REWARD(r,s)

19
Extension of an hypothesis
  • The extension of an hypothesis h is the set of
    objects that verify h.
  • For instance,
  • the extension of [f,♣] is {j♣, q♣, k♣}, and
  • the extension of [a,a] is the set of all cards.

20
More general/specific relation
  • Let h1 and h2 be two hypotheses in H
  • h1 is more general than h2 iff the extension of
    h1 is a proper superset of the extension of h2
  • For instance,
  • [a,a] is more general than [f,♣],
  • [f,♣] is more general than [q,♣],
  • [f,r] and [n,r] are not comparable
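A small runnable sketch (not from the slides) of these two notions, under an assumed Python encoding of the [r,s] hypothesis language: the extension is computed by enumerating the deck, and "more general than" is a proper-superset test on extensions.

    # Extensions of [r,s] hypotheses and the "more general than" relation.
    RANKS = list(range(1, 11)) + ["j", "q", "k"]
    SUITS = ["♠", "♣", "♦", "♥"]
    DECK = {(r, s) for r in RANKS for s in SUITS}

    def rank_ok(hr, r):
        if hr == "a": return True                    # any rank
        if hr == "n": return r in range(1, 11)       # numbered rank
        if hr == "f": return r in ("j", "q", "k")    # face rank
        return hr == r                               # a specific rank

    def suit_ok(hs, s):
        if hs == "a": return True                    # any suit
        if hs == "b": return s in ("♠", "♣")         # black suit
        if hs == "r": return s in ("♦", "♥")         # red suit
        return hs == s                               # a specific suit

    def extension(h):
        return {c for c in DECK if rank_ok(h[0], c[0]) and suit_ok(h[1], c[1])}

    def more_general(h1, h2):
        return extension(h1) > extension(h2)         # proper superset of extensions

    assert extension(("f", "♣")) == {("j", "♣"), ("q", "♣"), ("k", "♣")}
    assert more_general(("a", "a"), ("f", "♣"))
    assert not more_general(("f", "r"), ("n", "r"))  # not comparable (and vice versa)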

21
More general/specific relation (contd)
  • The inverse of the more general relation is the
    more specific relation
  • The more general relation defines a partial
    ordering on the hypotheses in H

22
A subset of the partial order for cards
23
G-Boundary / S-Boundary of V
  • An hypothesis in V is most general iff no
    hypothesis in V is more general
  • G-boundary G of V: set of most general hypotheses
    in V
  • An hypothesis in V is most specific iff no
    hypothesis in V is more specific
  • S-boundary S of V: set of most specific
    hypotheses in V

24
Example: the starting hypothesis space
[Figure: the initial hypothesis lattice, with the G-boundary at the top and the S-boundary at the bottom]
25
4♣ is a positive example
We replace every hypothesis in S whose extension
does not contain 4♣ by its generalization set.
The generalization set of a hypothesis h is the
set of the hypotheses that are immediately more
general than h.
[Lattice: aa, na, ab, nb, a♣, 4a, n♣, 4b, 4♣; the hypotheses immediately above 4♣ form the generalization set of 4♣]
26
7♣ is the next positive example
Minimally generalize the most specific hypothesis
set: we replace every hypothesis in S whose extension
does not contain 7♣ by its generalization set.
[Lattice: aa, na, ab, nb, a♣, 4a, n♣, 4b, 4♣]
27
7♣ is positive (cont'd)
Minimally generalize the most specific hypothesis
set.
[Lattice: aa, na, ab, nb, a♣, 4a, n♣, 4b, 4♣]
28
7♣ is positive (cont'd)
Minimally generalize the most specific hypothesis
set.
[Lattice: aa, na, ab, nb, a♣, 4a, n♣, 4b, 4♣]
29
5♥ is a negative example
Minimally specialize the most general hypothesis
set.
[Lattice: aa, na, ab, nb, a♣, 4a, n♣, 4b, 4♣; the specialization set of aa is highlighted]
30
5♥ is negative (cont'd)
Minimally specialize the most general hypothesis
set.
[Lattice: aa, na, ab, nb, a♣, 4a, n♣, 4b, 4♣]
31
After 3 examples (2 positive, 1 negative)
G and S, and all hypotheses in between, form
exactly the version space.
[Lattice: ab, nb, a♣, n♣]
1. If an hypothesis between G and S
disagreed with an example x, then an
hypothesis in G or S would also disagree with
x, and hence would have been removed.
32
After 3 examples (2 positive, 1 negative)
G and S, and all hypotheses in between, form
exactly the version space.
[Lattice: ab, nb, a♣, n♣]
2. If there were an hypothesis not in
this set which agreed with all examples,
then it would have to be either no more
specific than any member of G (but then it
would be in G) or no more general than some
member of S (but then it would be in S).
33
At this stage
[Version space: ab, nb, a♣, n♣]
Do 8?, 6?, j? satisfy CONCEPT?
34
2♠ is the next positive example
Minimally generalize the most specific hypothesis
set.
[Lattice: ab, nb, a♣, n♣]
35
j♠ is the next negative example
Minimally specialize the most general hypothesis
set.
[Lattice: ab, nb]
36
Result
4♣ (+), 7♣ (+), 2♠ (+), 5♥ (-), j♠ (-)
nb
NUM(r) ∧ BLACK(s) ⇒ REWARD(r,s)
37
The version space algorithm
  • Begin
  • Initialize G to the most general concept in the
    space and S to the first positive training
    instance
  • For each example x:
  • If x is positive, then (G,S) ←
    POSITIVE-UPDATE(G,S,x)
  • else (G,S) ← NEGATIVE-UPDATE(G,S,x)
  • If G = S and both are singletons, then the
    algorithm has found a single concept that is
    consistent with all the data and the algorithm
    halts
  • If G and S become empty, then there is no concept
    that covers all the positive instances and none
    of the negative instances
  • End

38
The version space algorithm (contd)
  • POSITIVE-UPDATE(G,S,x)
  • Begin
  • Delete all members of G that fail to match x
  • For every s ∈ S, if s does not match x, replace s
    with its most specific generalizations that match
    x
  • Delete from S any hypothesis that is more general
    than some other hypothesis in S
  • Delete from S any hypothesis that is neither more
    specific than nor equal to a hypothesis in G
    (different from the textbook)
  • End

39
The version space algorithm (contd)
  • NEGATIVE-UPDATE(G,S,x)
  • Begin
  • Delete all members of S that match x
  • For every g ∈ G that matches x, replace g with
    its most general specializations that do not
    match x
  • Delete from G any hypothesis that is more
    specific than some other hypothesis in G
  • Delete from G any hypothesis that is neither more
    general than nor equal to a hypothesis in S
    (different from the textbook)
  • End
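A runnable sketch of the whole loop on the simplified [rank, suit] card representation. This is not the textbook's code: the hierarchy encoding, the helper names, the extension-based test for "more general", the restriction to one-step specializations (enough for this small hierarchy and training sequence), and the omission of the duplicate-elimination steps (G and S stay tiny here) are assumptions made for illustration.

    NUMS, FACES = list(range(1, 11)), ["j", "q", "k"]
    BLACK, RED = ["♠", "♣"], ["♦", "♥"]
    DECK = [(r, s) for r in NUMS + FACES for s in BLACK + RED]

    # one-step generalization hierarchies: specific rank -> n/f -> a, specific suit -> b/r -> a
    RANK_UP = {**{r: "n" for r in NUMS}, **{r: "f" for r in FACES}, "n": "a", "f": "a"}
    SUIT_UP = {**{s: "b" for s in BLACK}, **{s: "r" for s in RED}, "b": "a", "r": "a"}
    UP = (RANK_UP, SUIT_UP)
    DOWN = tuple({p: [c for c, q in up.items() if q == p] for p in up.values()} for up in UP)

    def covers(v, x, i):                    # does hypothesis value v cover attribute value x?
        while x != v and x in UP[i]:
            x = UP[i][x]
        return x == v

    def matches(h, card):
        return covers(h[0], card[0], 0) and covers(h[1], card[1], 1)

    def ext(h):                             # extension of a hypothesis
        return {c for c in DECK if matches(h, c)}

    def generalize(h, card):                # minimal generalization of h that covers card
        h = list(h)
        for i in (0, 1):
            while not covers(h[i], card[i], i):
                h[i] = UP[i][h[i]]
        return tuple(h)

    def specialize(h, card):                # one-step specializations of h that exclude card
        out = []
        for i in (0, 1):
            for child in DOWN[i].get(h[i], []):
                h2 = (child, h[1]) if i == 0 else (h[0], child)
                if not matches(h2, card):
                    out.append(h2)
        return out

    def positive_update(G, S, x):
        G = [g for g in G if matches(g, x)]
        S = [generalize(s, x) for s in S]
        S = [s for s in S if any(ext(s) <= ext(g) for g in G)]   # keep s below some g
        return G, S

    def negative_update(G, S, x):
        S = [s for s in S if not matches(s, x)]
        G = [h for g in G for h in ([g] if not matches(g, x) else specialize(g, x))]
        G = [g for g in G if any(ext(s) <= ext(g) for s in S)]   # keep g above some s
        return G, S

    G, S = [("a", "a")], [(4, "♣")]         # most general hypothesis; first positive example
    for card, positive in [((7, "♣"), True), ((5, "♥"), False), ((2, "♠"), True), (("j", "♠"), False)]:
        G, S = positive_update(G, S, card) if positive else negative_update(G, S, card)
    print(G, S)                             # both converge to [('n', 'b')]: NUM and BLACK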

40
Comments on Version Space Learning (VSL)
  • It is a bi-directional search. One direction is
    specific to general and is driven by positive
    instances. The other direction is general to
    specific and is driven by negative instances.
  • It is an incremental learning algorithm. The
    examples do not have to be given all at once (as
    opposed to learning decision trees). The version
    space is meaningful even before it converges.
  • The order of examples matters for the speed of
    convergence.
  • As is, it cannot tolerate noise (misclassified
    examples); the version space might collapse.

41
Examples and near misses for the concept arch
42
More on generalization operators
  • Replacing constants with variables. For
    example, color(ball, red) generalizes to
    color(X, red)
  • Dropping conditions from a conjunctive
    expression. For example, shape(X, round) ∧
    size(X, small) ∧ color(X, red) generalizes
    to shape(X, round) ∧ color(X, red)

43
More on generalization operators (contd)
  • Adding a disjunct to an expression. For
    example, shape(X, round) ∧ size(X, small) ∧
    color(X, red) generalizes to shape(X,
    round) ∧ size(X, small) ∧ (color(X, red) ∨
    color(X, blue))
  • Replacing a property with its parent in a class
    hierarchy. If we know that primary_color is a
    superclass of red, then color(X, red)
    generalizes to color(X, primary_color)
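A brief sketch (an assumed list-of-literals encoding, not from the slides) of the first two operators:

    from itertools import combinations

    expr = [("shape", "X", "round"), ("size", "X", "small"), ("color", "X", "red")]

    def drop_conditions(conj):
        # all generalizations obtained by dropping a single condition
        return [list(c) for c in combinations(conj, len(conj) - 1)]

    def replace_constant(conj, constant, variable):
        # generalize by replacing a constant with a variable everywhere it occurs
        return [tuple(variable if t == constant else t for t in lit) for lit in conj]

    print(drop_conditions(expr))                                      # the three conjunctions with one condition dropped
    print(replace_constant([("color", "ball", "red")], "ball", "X"))  # [('color', 'X', 'red')]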

44
Another example
  • sizes = {large, small}
  • colors = {red, white, blue}
  • shapes = {sphere, brick, cube}
  • object (size, color, shape)
  • If the target concept is "a red ball," then size
    should not matter, color should be red, and shape
    should be sphere
  • If the target concept is "ball," then neither size
    nor color should matter, and shape should be sphere.

45
A portion of the concept space
46
Learning the concept of a red ball
  • G: {obj(X, Y, Z)}   S: {}
  • positive: obj(small, red, sphere)
  • G: {obj(X, Y, Z)}   S: {obj(small, red, sphere)}
  • negative: obj(small, blue, sphere)
  • G: {obj(large, Y, Z), obj(X, red, Z), obj(X,
    white, Z), obj(X, Y, brick), obj(X, Y, cube)}
    S: {obj(small, red, sphere)}
    Then delete from G every hypothesis that is neither
    more general than nor equal to a hypothesis in S:
  • G: {obj(X, red, Z)}   S: {obj(small, red, sphere)}

47
Learning the concept of a red ball (contd)
  • G: {obj(X, red, Z)}   S: {obj(small, red, sphere)}
  • positive: obj(large, red, sphere)
  • G: {obj(X, red, Z)}   S: {obj(X, red, sphere)}
  • negative: obj(large, red, cube)
  • G: {obj(small, red, Z), obj(X, red, sphere),
    obj(X, red, brick)}   S: {obj(X, red, sphere)}
    Then delete from G every hypothesis that is
    neither more general than nor equal to a
    hypothesis in S:
  • G: {obj(X, red, sphere)}   S: {obj(X, red,
    sphere)}: converged to a single concept
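A compact sketch (assumed encoding, not from the slides) of the first specialization step above: hypotheses are (size, color, shape) triples in which None plays the role of a variable.

    SIZES, COLORS, SHAPES = ["large", "small"], ["red", "white", "blue"], ["sphere", "brick", "cube"]
    DOMAINS = [SIZES, COLORS, SHAPES]

    def matches(h, obj):
        return all(hv is None or hv == ov for hv, ov in zip(h, obj))

    def specializations(h, negative):
        # minimal specializations of h that exclude the negative example:
        # substitute, at each variable position, every constant that differs from the negative's value
        out = []
        for i, domain in enumerate(DOMAINS):
            if h[i] is None:
                out += [h[:i] + (v,) + h[i + 1:] for v in domain if v != negative[i]]
        return out

    S = [("small", "red", "sphere")]
    candidates = specializations((None, None, None), ("small", "blue", "sphere"))
    G = [g for g in candidates if all(matches(g, s) for s in S)]   # prune against S
    print(G)   # [(None, 'red', None)], i.e. obj(X, red, Z), as in the walkthrough above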

48
LEX a program that learns heuristics
  • Learns heuristics for symbolic integration
    problems
  • Typical transformations used in performing
    integration include:
    OP1: ∫ r f(x) dx → r ∫ f(x) dx
    OP2: ∫ u dv → uv - ∫ v du
    OP3: 1 · f(x) → f(x)
    OP4: ∫ (f1(x) + f2(x)) dx → ∫ f1(x) dx + ∫ f2(x) dx
  • A heuristic tells when an operator is
    particularly useful: if a problem state matches
    ∫ x transcendental(x) dx, then apply OP2 with
    bindings u = x, dv = transcendental(x) dx

49
A portion of LEX's hierarchy of symbols
50
The overall architecture
  • A generalizer that uses candidate elimination to
    find heuristics
  • A problem solver that uses the current heuristics
    to solve practice problems and produces a problem
    trace
  • A critic that produces positive and negative
    instances from a problem trace (the credit
    assignment problem)
  • A problem generator that produces new candidate
    problems

51
A version space for OP2 (Mitchell et al., 1983)
52
Comments on LEX
  • The evolving heuristics are not guaranteed to be
    admissible. The solution path found by the
    problem solver may not actually be a shortest
    path solution.
  • The problem generator is the least developed
    part of the program.
  • Empirical studies: before training, 5 problems were
    solved in an average of 200 steps; after training
    with 12 problems, the 5 problems were solved in an
    average of 20 steps.

53
More comments on VSL
  • Still lots of research going on
  • Uses breadth-first search which might be
    inefficient
  • might need to use beam search to prune hypotheses
    from G and S if they grow excessively
  • another alternative is to use an inductive bias and
    restrict the concept language
  • How to address the noise problem? Maintain
    several G and S sets.

54
Decision Trees
  • A decision tree allows a classification of an
    object by testing its values for certain
    properties
  • check out the example at
    www.aiinc.ca/demos/whale.html
  • The learning problem is similar to concept
    learning using version spaces in the sense that
    we are trying to identify a class using the
    observable properties.
  • It is different in the sense that we are trying
    to learn a structure that determines class
    membership after a sequence of questions. This
    structure is a decision tree.

55
Reverse engineered decision tree of the whale
watcher expert system
[Decision tree figure: tests include "see flukes?", "see dorsal fin?", "size?", "blows?", and "blow forward?"; leaf classes include blue whale, sperm whale, humpback whale, bowhead whale, gray whale, narwhal, and right whale; one branch continues on the next slide]
56
Reverse engineered decision tree of the whale
watcher expert system (contd)
[Decision tree figure, continued: tests include "see flukes?", "see dorsal fin?", "blow?", "size?", "dorsal fin and blow visible at the same time?", and "dorsal fin tall and pointed?"; leaf classes include killer whale, northern bottlenose whale, sei whale, and fin whale]
57
What does the original data look like?
58
The search problem
  • Given a table of observable properties, search
    for a decision tree that
  • correctly represents the data (assuming that the
    data is noise-free), and
  • is as small as possible.
  • What does the search tree look like?

59
Comparing VSL and learning DTs
A hypothesis learned in VSL can be represented as
a decision tree. Consider the predicate that we
used as a VSL example: NUM(r) ∧ BLACK(s) ⇒
REWARD(r,s). The following decision tree
represents it:
[Decision tree: test NUM?; if False, output False; if True, test BLACK?; if False, output False; if True, output True]
60
Predicate as a Decision Tree
The predicate CONCEPT(x) ⇔ A(x) ∧ (¬B(x) ∨ C(x))
can be represented by the following decision
tree:
  • Example: a mushroom is poisonous iff it is yellow
    and small, or yellow, big and spotted
  • x is a mushroom
  • CONCEPT = POISONOUS
  • A = YELLOW
  • B = BIG
  • C = SPOTTED
  • D = FUNNEL-CAP
  • E = BULKY

61
Training Set
62
Possible Decision Tree
63
Possible Decision Tree
CONCEPT ⇔ (D ∧ (¬E ∨ A)) ∨
(C ∧ (B ∨ ((E ∧ ¬A) ∨ A)))
KIS bias → build the smallest decision tree
Computationally intractable problem → greedy
algorithm
64
Getting Started
The distribution of the training set is:
True: 6, 7, 8, 9, 10, 13   False: 1, 2, 3, 4, 5, 11, 12
65
Getting Started
The distribution of the training set is:
True: 6, 7, 8, 9, 10, 13   False: 1, 2, 3, 4, 5, 11, 12
Without testing any observable predicate,
we could report that CONCEPT is False (majority
rule) with an estimated probability of error
P(E) = 6/13
66
Getting Started
The distribution of the training set is:
True: 6, 7, 8, 9, 10, 13   False: 1, 2, 3, 4, 5, 11, 12
Without testing any observable predicate,
we could report that CONCEPT is False (majority
rule) with an estimated probability of error
P(E) = 6/13
Assuming that we will only include one observable
predicate in the decision tree, which predicate
should we test to minimize the probability of
error?
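A small sketch of this selection criterion. The 13-example table itself is in a figure, so the per-example predicate values below are hypothetical; only the 6-True / 7-False class split comes from the slide.

    from fractions import Fraction

    labels = [True] * 6 + [False] * 7              # class values of the 13 training examples

    def majority_error(ys):
        # probability of error of the majority rule on a list of labels
        if not ys:
            return Fraction(0)
        t = sum(ys)
        return Fraction(min(t, len(ys) - t), len(ys))

    print(majority_error(labels))                  # 6/13, as on the slide

    def split_error(xs, ys):
        # estimated error after testing one Boolean predicate whose values are xs
        n = len(ys)
        branches = {True: [], False: []}
        for x, y in zip(xs, ys):
            branches[x].append(y)
        return sum(Fraction(len(b), n) * majority_error(b) for b in branches.values())

    A = [True] * 6 + [False] * 5 + [True, False]   # hypothetical values of predicate A
    print(split_error(A, labels))                  # the predicate with the smallest value is tested first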
67
Assume It's A
68
Assume It's B
69
Assume It's C
70
Assume It's D
71
Assume It's E
So, the best predicate to test is A
72
Choice of Second Predicate
[Partial tree: test A; if A is False, report False; if A is True, test C]
The majority rule gives the probability of error
Pr(E|A) = 1/8 and Pr(E) = 1/13
73
Choice of Third Predicate
[Partial tree: test A; if False, report False; if True, test C; one branch of C is the leaf True and the other branch tests B]
74
Final Tree
CONCEPT ⇔ A ∧ (C ∨ ¬B)
75
Learning a decision tree
Function induce_tree(example_set, properties)
begin
  if all entries in example_set are in the same class
    then return a leaf node labeled with that class
  else if properties is empty
    then return a leaf node labeled with the disjunction of all classes in example_set
  else begin
    select a property, P, and make it the root of the current tree;
    delete P from properties;
    for each value, V, of P
    begin
      create a branch of the tree labeled with V;
      let partition_V be the elements of example_set with value V for property P;
      call induce_tree(partition_V, properties) and attach the result to branch V
    end
  end
end

If property P is Boolean, the partition will
contain two sets, one with property P true and
one with it false.
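A compact Python rendering of this pseudocode (an illustrative sketch, not the textbook's code), assuming each example is a dict of property values plus a 'class' entry; the property to split on is simply taken in the order given, since the pseudocode leaves the selection heuristic open.

    def induce_tree(examples, properties):
        classes = {e["class"] for e in examples}
        if len(classes) == 1:                       # all examples in the same class: leaf
            return classes.pop()
        if not properties:                          # no properties left: disjunction of classes
            return sorted(classes, key=str)
        prop, rest = properties[0], properties[1:]  # "select a property P" (no heuristic here)
        tree = {prop: {}}
        for value in {e[prop] for e in examples}:   # one branch per observed value of P
            partition = [e for e in examples if e[prop] == value]
            tree[prop][value] = induce_tree(partition, rest)
        return tree

    examples = [{"size": "small", "color": "red", "class": True},
                {"size": "large", "color": "red", "class": True},
                {"size": "large", "color": "blue", "class": False}]
    print(induce_tree(examples, ["color", "size"]))   # {'color': {'red': True, 'blue': False}}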
76
What happens if there is noise in the training
set?
  • The part of the algorithm shown below handles
    this
  • if properties is empty then return leaf
    node labeled with disjunction of all
    classes in example_set
  • Consider a very small (but inconsistent) training
    set

A | classification
T | T
F | F
F | T
[Tree: test A?; the True branch is the leaf True, the False branch is the leaf labeled False ∨ True]
77
Using Information Theory
  • Rather than minimizing the probability of error,
    most existing learning procedures try to minimize
    the expected number of questions needed to decide
    if an object x satisfies CONCEPT.
  • This minimization is based on a measure of the
    quantity of information that is contained in
    the truth value of an observable predicate and is
    explained in Section 9.3.2. We will skip the
    technique given there and use the probability of
    error approach.

78
Assessing performance
79
The evaluation of ID3 in a chess endgame
80
Other issues in learning decision trees
  • If data for some attribute is missing and is
    hard to obtain, it might be possible to
    extrapolate or use "unknown."
  • If some attributes have continuous values,
    groupings might be used.
  • If the data set is too large, one might use
    bagging to select a sample (drawn with replacement)
    from the training set. Or, one can use boosting to
    assign a weight showing importance to each
    instance. Or, one can divide the sample set into
    subsets and train on one, and test on others (see
    the sketch below).
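A minimal sketch (not from the slides) of drawing one such bootstrap sample for bagging:

    import random

    def bootstrap_sample(examples, rng=random):
        # sample len(examples) items with replacement; some repeat, some are left out
        return [rng.choice(examples) for _ in examples]

    training_set = list(range(20))          # stand-in for 20 training examples
    print(bootstrap_sample(training_set))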

81
Explanation based learning
  • Idea: one can learn better when the background
    theory is known
  • Use the domain theory to explain the instances
    taught
  • Generalize the explanation to come up with a
    learned rule

82
Example
  • We would like the system to learn what a cup is,
    i.e., we would like it to learn a rule of the
    form: premise(X) ⇒ cup(X)
  • Assume that we have a domain theory:
    liftable(X) ∧ holds_liquid(X) ⇒ cup(X)
    part(Z,W) ∧ concave(W) ∧ points_up(W) ⇒ holds_liquid(Z)
    light(Y) ∧ part(Y,handle) ⇒ liftable(Y)
    small(A) ⇒ light(A)
    made_of(A,feathers) ⇒ light(A)
  • The training example is the following:
    cup(obj1), small(obj1), part(obj1,handle),
    owns(bob,obj1), part(obj1,bottom), part(obj1,bowl),
    points_up(bowl), concave(bowl), color(obj1,red)

83
First, form a specific proof that obj1 is a cup
[Proof tree: cup(obj1) follows from liftable(obj1) and holds_liquid(obj1); liftable(obj1) follows from light(obj1) and part(obj1,handle); light(obj1) follows from small(obj1); holds_liquid(obj1) follows from part(obj1,bowl), points_up(bowl), and concave(bowl)]
84
Second, analyze the explanation structure to
generalize it
85
Third, adopt the generalized proof
[Generalized proof tree: cup(X) follows from liftable(X) and holds_liquid(X); liftable(X) follows from light(X) and part(X,handle); light(X) follows from small(X); holds_liquid(X) follows from part(X,W), points_up(W), and concave(W)]
86
The EBL algorithm
  • Initialize hypothesis
  • For each positive training example not covered by
    hypothesis
  • 1. Explain how training example satisfies
    target concept, in terms of domain theory
  • 2. Analyze the explanation to determine the
    most general conditions under which this
    explanation (proof) holds
  • 3. Refine the hypothesis by adding a new rule,
    whose premises are the above conditions, and
    whose consequent asserts the target concept

87
Wait a minute!
  • Isn't this just a restatement of what the
    learner already knows?
  • Not really:
  • a theory-guided generalization from examples
  • an example-guided operationalization of theories
  • Even if you know all the rules of chess, you get
    better if you play more
  • Even if you know the basic axioms of
    probability, you get better as you solve more
    probability problems

88
Comments on EBL
  • Note that the irrelevant properties of obj1
    were disregarded (e.g., color is red, it has a
    bottom)
  • Also note that irrelevant generalizations were
    sorted out due to its goal-directed nature
  • Allows justified generalization from a single
    example
  • Generality of result depends on domain theory
  • Still requires multiple examples
  • Assumes that the domain theory is correct
    (error-free), as opposed to approximate domain
    theories, which we will not cover.
  • This assumption holds in chess and other search
    problems.
  • It allows us to assume that an explanation is a
    proof.

89
Two formulations for learning
  • Inductive
  • Given:
  • Instances
  • Hypotheses
  • Target concept
  • Training examples of the target concept
  • Determine:
  • Hypotheses consistent with the training examples
  • Analytical
  • Given:
  • Instances
  • Hypotheses
  • Target concept
  • Training examples of the target concept
  • Domain theory for explaining examples
  • Determine:
  • Hypotheses consistent with the training examples
    and the domain theory

90
Two formulations for learning (contd)
  • Inductive
  • Hypothesis fits data
  • Statistical inference
  • Requires little prior knowledge
  • Syntactic inductive bias
  • Analytical
  • Hypothesis fits domain theory
  • Deductive inference
  • Learns from scarce data
  • Bias is domain theory

DT and VS learners are similarity-based. Prior
knowledge is important; it might be one of the
reasons for humans' ability to generalize from as
few as a single training instance. Prior
knowledge can guide the search in a space of an
unlimited number of generalizations that can be
produced from training examples.
91
An example META-DENDRAL
  • Learns rules for DENDRAL
  • Remember that DENDRAL infers structure of
    organic molecules from their chemical formula and
    mass spectrographic data.
  • Meta-DENDRAL constructs an explanation of the
    site of a cleavage using
  • structure of a known compound
  • mass and relative abundance of the fragments
    produced by spectrography
  • a half-order theory (e.g., double and triple
    bonds do not break; only fragments larger than
    two carbon atoms show up in the data)
  • These explanations are used as examples for
    constructing general rules

92
Analogical reasoning
  • Idea: if two situations are similar in some
    respects, then they will probably be similar in others
  • Define the source of an analogy to be a problem
    solution. It is a theory that is relatively well
    understood.
  • The target of an analogy is a theory that is not
    completely understood.
  • Analogy constructs a mapping between
    corresponding elements of the target and the
    source.

93
(No Transcript)
94
Example atom/solar system analogy
  • The source domain contains: yellow(sun),
    blue(earth), hotter-than(sun,earth),
    causes(more-massive(sun,earth),
    attract(sun,earth)),
    causes(attract(sun,earth),
    revolves-around(earth,sun))
  • The target domain that the analogy is intended
    to explain includes: more-massive(nucleus,
    electron), revolves-around(electron, nucleus)
  • The mapping is: sun → nucleus and earth →
    electron
  • The extension of the mapping leads to the
    inferences: causes(more-massive(nucleus,electron),
    attract(nucleus,electron)),
    causes(attract(nucleus,electron),
    revolves-around(electron,nucleus))
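A tiny sketch (an assumed tuple encoding, not from the slides) of extending the mapping: apply sun → nucleus and earth → electron to the source's causal relations to obtain the inferred target relations.

    source = [
        ("causes", ("more-massive", "sun", "earth"), ("attract", "sun", "earth")),
        ("causes", ("attract", "sun", "earth"), ("revolves-around", "earth", "sun")),
    ]
    mapping = {"sun": "nucleus", "earth": "electron"}

    def translate(term):
        # recursively replace mapped symbols inside nested relation tuples
        if isinstance(term, tuple):
            return tuple(translate(t) for t in term)
        return mapping.get(term, term)

    for relation in source:
        print(translate(relation))
    # ('causes', ('more-massive', 'nucleus', 'electron'), ('attract', 'nucleus', 'electron'))
    # ('causes', ('attract', 'nucleus', 'electron'), ('revolves-around', 'electron', 'nucleus'))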

95
A typical framework
  • Retrieval: given a target problem, select a
    potential source analog.
  • Elaboration: derive additional features and
    relations of the source.
  • Mapping and inference: map source
    attributes into the target domain.
  • Justification: show that the mapping is valid.