Title: CSCI 4310
1CSCI 4310
2Book
3Knowledge Representation
- General problem-solving techniques are useful, but effectiveness often depends on extensive, domain-specific knowledge.
- Knowledge-based systems use a KNOWLEDGE BASE (KB) of facts about the world.
- Knowledge usually comes from experts in the domain.
4Examples of knowledge
- Diagnosing a programming bug
- Ex: checking for base cases in recursive routines
- Requires domain knowledge
5Using Knowledge Representations
[Diagram labels: Examples, Statements; Questions, requests; Answers, analyses; Inference Mechanism(s); Learning Mechanism(s); Knowledge Base]
- The contents of the KB are part of the cognitive model
6Knowledge Representation (KR) language
- Expressiveness
- Can all the knowledge required for the problem be represented adequately?
- Naturalness
- Does the representation allow the knowledge to be input and manipulated in a natural fashion?
7Knowledge Representation (KR) language continued
- Efficiency
- Can the system access and process the domain knowledge efficiently?
- Inference
- Does the representation support the generation of inferences (new knowledge)?
8Definition
- A knowledge-based (or expert) system is
- An AI program
- Capable of representing and reasoning about some knowledge-rich domain
- With a view to solving problems and giving advice
9KB system levels
- We can talk about knowledge-based systems at
different levels
Knowledge
Logical
Implementation
10KB system levels continued
- The knowledge level describes what is known, independent of representation.
- Knowing in an abstract sense that robins are birds
- The logical level describes the statement(s) in the Knowledge Representation model that represent a fact.
- isa(robin,bird)
11KB system levels continued
- The implementation level refers to the way in which the knowledge is encoded.
- isa(robin,bird) might be encoded as
- List
- Array
- Database record
- Something more abstract
12Getting started
- What types of information need to be represented?
- Which knowledge representation model should we use?
- How should information be encoded in the knowledge representation model?
- How will the information be accessed?
13Types of Knowledge
- Declarative: facts about the world.
- Ex: Robins have wings.
- Procedural: operations embodying knowledge.
- Ex: an algorithm for addition.
- Analogy: associating knowledge about different things.
- Ex: Robins can fly. Robins are like sparrows. So I suspect that sparrows can fly too.
14Types of Knowledge cont.
- Generalization: making generalizations from specific examples.
- Ex: Robins can fly, sparrows can fly, cardinals can fly. I suspect all birds can fly.
- Meta-level knowledge: knowledge about what is known.
- Ex: I don't know Jim Rogers, so I probably don't know his phone number.
- Rough definition of "meta": one level higher.
15Explicit vs. Implicit Knowledge
- Explicit knowledge is information that is encoded directly in the representation.
- Ex: has-part(robin,wing)
- Implicit knowledge is information that can be derived from the representation.
- Explicit: has-part(bird,wing)
- Explicit: is-a(robin,bird)
- Implicit: has-part(robin,wing)
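A minimal sketch of how the implicit fact could be derived from the two explicit ones, assuming has-part is inherited along is-a links. The dictionary and set layout and the function name `has_part` are illustrative choices, not part of the slides.

```python
# Explicit facts, encoded directly in the representation.
IS_A = {"robin": "bird"}          # is-a(robin, bird)
HAS_PART = {("bird", "wing")}     # has-part(bird, wing)

def has_part(thing, part):
    """True if has-part(thing, part) is explicit or derivable via is-a links."""
    while thing is not None:
        if (thing, part) in HAS_PART:
            return True
        thing = IS_A.get(thing)   # climb one is-a link; None ends the walk
    return False

print(has_part("robin", "wing"))  # derived at query time, never stored
```

The knowledge base stays small because has-part(robin,wing) is never stored, but each query pays for the walk up the is-a chain.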
16Explicit vs. Implicit Knowledge
- Trade-off: relying on implicit knowledge can reduce the size of the knowledge base, but it can increase the access time.
- Similar to using opening chess moves calculated offline.
- Time vs. space once again
17Explicit vs. Implicit Knowledge
- The trivial example on the previous slides is deceptive
- Deriving implicit knowledge can be prohibitively expensive
- Satisfiability (SAT) problem
- The granddaddy NP-complete problem
- Graph coloring can be encoded as SAT
- Many planning and scheduling problems can also be encoded as SAT
18Knowledge Representation Models
- Propositional Logic
- Predicate Calculus
- Production Systems (rule-based systems)
- Semantic networks
- Frames
- Bayesian Networks
19Example
- http://www.aiinc.ca/demos/whale.html
20Components of a rule-based system
- Working memory
- Knowledge base of current assertions, sometimes called the context
- Things that are believed to be true about the world.
- This will change as rules are evaluated
- Overlap with concepts from Automata Theory
21Components of a rule-based system cont.
- Rule base: a set of inference rules
- Each rule (sometimes called a production) is a condition-action pair.
- All predicates in the condition must be true for the rule to fire.
- An action can add or delete facts from the working memory.
- Rule interpreter
- Determines which rules to apply and executes actions.
22Forward Reasoning
- Until no rule can fire or the goal state is achieved:
- 1. Find all rules whose left sides match assertions in working memory.
- 2. Pick some to execute; modify working memory by applying the right sides of the rules.
- There is no point executing multiple rules that take identical actions.
- Rules may be implicitly ordered in terms of likelihood, importance, etc.
- Choose the first or highest-priority satisfied rule
- 3. Iterate
23Example
- If A or B then C
- If C or (D and E) then F
- If C and F then G
- If G or H then I
- Working Memory
- A, E
- Which rules will follow?
- Cycle 1, Cycle 2, Cycle 3, etc.
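The cycles in this example can be traced with a short sketch. It assumes that every applicable rule fires in each cycle and that matching uses the working memory as it stood at the start of the cycle; the slides leave the choice of which rules to execute open. The rule names R1 to R4 are labels added here for readability.

```python
# Each rule: (name, condition over working memory, conclusion to add).
RULES = [
    ("R1", lambda wm: "A" in wm or "B" in wm, "C"),
    ("R2", lambda wm: "C" in wm or ("D" in wm and "E" in wm), "F"),
    ("R3", lambda wm: "C" in wm and "F" in wm, "G"),
    ("R4", lambda wm: "G" in wm or "H" in wm, "I"),
]

def forward_chain(rules, facts):
    """Fire all applicable rules each cycle until nothing new can be added."""
    wm = set(facts)
    trace = []
    while True:
        # Match: evaluated against the working memory before this cycle's updates.
        fired = [(name, concl) for name, cond, concl in rules
                 if cond(wm) and concl not in wm]
        if not fired:
            return wm, trace
        # Execute: add the conclusions of every rule that fired.
        wm.update(concl for _, concl in fired)
        trace.append(fired)

wm, trace = forward_chain(RULES, {"A", "E"})
for cycle, fired in enumerate(trace, start=1):
    print(f"Cycle {cycle}: {fired}")
```

Under these assumptions one rule fires per cycle: C is added first, then F, then G, then I.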
24Three Parts to the Forward-Chaining Rule Interpreter
- Match
- Identify which rules are applicable at any given point in the reasoning
- Conflict Resolution
- Select which (of possibly many rules) should be applied at any given point in the reasoning
- Execute
- Execute the right-hand side of the chosen rule
25Three Parts to the Forward-Chaining Rule Interpreter cont.
- Heart of Rule-Based Systems
- The Knowledge Base
- Matching algorithm
- Conflict Resolution
26Matching for Forward-Chaining
- Simply search all rules incrementally
- Problems
- A large rule base would lead to a very slow search.
- Satisfaction of the rule's preconditions may not be obvious.
- Could view as a search through all possible variable bindings and use depth-first search, for example. It's not obvious what sort of heuristics would help.
27Forward Chaining Example
IF X Y Z THEN C
28Pros and Cons of Forward Reasoning
- Forward reasoning has no goal in mind, so it can generate a lot of irrelevant assertions in an undirected fashion.
- Think of all the info you discard when you add something to your working memory
- I am driving
- On a road
- Roads are made of asphalt
- My tires are making contact with the road
- None of this is relevant.
- Humans detect relevancy to prune the search space
- Some mental disorders cloud this ability
29Pros and Cons of Forward Reasoning
- Forward-chaining systems often require the user to encode heuristic knowledge to guide the search.
- Rule base often consists of both domain knowledge and control knowledge.
30Rules
- If (?x has hair) then (?x is mammal)
- If (?x has feathers) then (?x is bird)
- If (?x flies) and (?x lays eggs) then (?x is bird)
- If (?x is mammal) and (?x eats meat) then (?x is carnivore)
- If (?x is mammal) and (?x eats grass) then (?x is herbivore)
- If (?x is mammal) and (?x has hooves) then (?x is herbivore)
- If (?x is carnivore) and (?x has tawny color) then (?x is tiger)
- If (?x is herbivore) and (?x has black/white color) then (?x is zebra)
- If (?x is bird) and (?x swims) then (?x is penguin)
- If (?x is bird) and (?x flies) and (?x has black/white color) then (?x is albatross)
31Facts
- F1 (Subject has hair)
- F2 (Subject eats grass)
- F3 (Subject has black/white color)
- Stored in Working Memory
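Running the rules of the previous slide forward from F1 to F3 can be sketched as below. This is a simplified illustration, not a general pattern matcher: it assumes every pattern mentions only the single variable ?x, so a rule reduces to a list of properties and a binding for ?x is just a subject name. The tuple encoding is an assumption made here.

```python
# Rules as (conditions, conclusion); every pattern is about the same ?x.
RULES = [
    (["has hair"], "is mammal"),
    (["has feathers"], "is bird"),
    (["flies", "lays eggs"], "is bird"),
    (["is mammal", "eats meat"], "is carnivore"),
    (["is mammal", "eats grass"], "is herbivore"),
    (["is mammal", "has hooves"], "is herbivore"),
    (["is carnivore", "has tawny color"], "is tiger"),
    (["is herbivore", "has black/white color"], "is zebra"),
    (["is bird", "swims"], "is penguin"),
    (["is bird", "flies", "has black/white color"], "is albatross"),
]

# F1 to F3 as (subject, property) pairs.
FACTS = {
    ("Subject", "has hair"),
    ("Subject", "eats grass"),
    ("Subject", "has black/white color"),
}

def forward_chain(rules, facts):
    """Add conclusions until a full pass over the rules adds nothing new."""
    wm = set(facts)
    changed = True
    while changed:
        changed = False
        subjects = {s for s, _ in wm}          # candidate bindings for ?x
        for conditions, conclusion in rules:
            for x in subjects:
                if all((x, c) in wm for c in conditions) and (x, conclusion) not in wm:
                    wm.add((x, conclusion))
                    changed = True
    return wm

wm = forward_chain(RULES, FACTS)
print(sorted(wm - FACTS))
```

With these three facts the derived assertions are mammal, then herbivore, then zebra; no bird or carnivore rule ever matches.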
32Backward Reasoning
- Start with a hypothesis and
- Some assertions in working memory
- Work backward from the hypothesis
- Tries to counter the undirected nature of
forward-chaining
33Backward Reasoning
- Until the hypothesis has been satisfied, or until no more rules are applicable, do the following:
- 1. Find all rules whose right side matches the hypothesis.
34Backward Reasoning part 2
- 2. For each matching rule
- Try to support each of the rule's conditions by matching against assertions in working memory, or by generating subhypotheses and backward chaining recursively.
- If all the rule's conditions are satisfied, then success!
35Control Backward Chaining
- IF A THEN C
- IF B THEN C
- IF C THEN D
- If we want to establish D as being true, then we should establish C as being true.
- To do this we need to show that A is true or that B is true
- Eventually we have to
- ask the user of the system
- go to a database
- interrogate a sensor
- ...
36Rules form a search tree
- R1 IF A and B and C and D THEN E
- R2 IF X and Y THEN A
- R3 IF Z THEN B
- R4 IF W THEN C
- R5 IF F THEN D
- R6 IF G THEN D
- R7 IF H THEN G
- Prove E?
- Facts X, Y, Z, W, H
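A minimal sketch of proving E by backward chaining over R1 to R7, using depth-first search and trying rules and premises in the order listed. The dictionary layout and the function name `prove` are illustrative assumptions. The sketch has no loop detection, so it would not terminate on a cyclic rule set.

```python
# conclusion -> list of alternative premise lists (each list is an AND).
RULES = {
    "E": [["A", "B", "C", "D"]],   # R1
    "A": [["X", "Y"]],             # R2
    "B": [["Z"]],                  # R3
    "C": [["W"]],                  # R4
    "D": [["F"], ["G"]],           # R5, R6
    "G": [["H"]],                  # R7
}
FACTS = {"X", "Y", "Z", "W", "H"}

def prove(goal, rules, facts, depth=0):
    """Depth-first backward chaining; True if the goal can be established."""
    print("  " * depth + f"prove {goal}")
    if goal in facts:
        return True
    # Try each rule whose right side matches the goal, in listed order.
    for premises in rules.get(goal, []):
        if all(prove(p, rules, facts, depth + 1) for p in premises):
            return True
    return False

print(prove("E", RULES, FACTS))
```

The trace shows the backtracking on D: R5 fails because F is neither a fact nor a rule conclusion, so the search falls back to R6 and establishes G through R7 and the fact H.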
37Rules drawn as a tree
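[Figure: the rules drawn as an AND tree rooted at E. R1 combines A, B, C, and D; R2 derives A from X and Y; R3 derives B from Z; R4 derives C from W; R5 derives D from F; R6 derives D from G; R7 derives G from H.]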
38Search aspects of backward chaining
- What search strategy do we use?
- Normally depth first
- When two rules are available to prove a conclusion, which one do we use first?
- if A then C
- if B then C
- When a premise consists of multiple components, in what order do we work on them?
- If A and B and C then D
39Knowledge about animals (1)
- If (?x has hair) then (?x is mammal)
- If (?x feeds young milk) then (?x is mammal)
- If (?x has feathers) then (?x is bird)
- If (?x flies) and (?x lays eggs) then (?x is bird)
- If (?x is mammal) and (?x eats meat) then (?x is carnivore)
- If (?x is mammal) and (?x has pointed teeth) and (?x has claws) and (?x has eyes point forward) then (?x is carnivore)
- If (?x is mammal) and (?x eats grass) then (?x is herbivore)
- If (?x is mammal) and (?x has hooves) then (?x is herbivore)
40Knowledge about animals (2)
- If (?x is carnivore) and (?x has tawny color) and (?x has dark spots) then (?x is cheetah)
- If (?x is carnivore) and (?x has tawny color) and (?x has dark stripes) then (?x is tiger)
- If (?x is herbivore) and (?x has tawny color) and (?x has dark spots) and (?x has long neck) then (?x is giraffe)
- If (?x is herbivore) and (?x has black/white color) then (?x is zebra)
- If (?x is bird) and (?x has long neck) then (?x is ostrich)
- If (?x is bird) and (?x swims) and (?x is black/white color) then (?x is penguin)
- If (?x is bird) and (?x flies) and (?x is black/white color) then (?x is albatross)
41What species is george?
- Want to know what species george is
- Information available on request
- (george has hair).
- (george lays unknown).
- (george, unknown).
- (george eats unknown).
- (george has pointed teeth).
- (george has claws).
- (george has eyes point forward).
- (george tawny color).
- (george has dark spots).
- (george neck short).
42Matching for Backward Chaining
- With forward chaining we can generate all applicable rules and then select one using a well-defined conflict resolution strategy. Backward chaining is more complicated because...
43Matching for Backward Chaining
- The hypothesis must be matched against WM assertions and rule consequents.
- The hypothesis can contain variables.
- More than one rule can provide a variable binding.
- Backward chaining typically uses DFS with backtracking to select individual rules.
44Fan Out
- A set of facts can lead to many conclusions
- A higher degree of fan out argues for backward
chaining
45Fan In
- A higher degree of fan in argues for forward
chaining
46Pros and Cons of Backward Reasoning
- Backward-chaining is best when there is a distinct goal state that is likely to be obtainable. (If there are many acceptable goal states, then forward chaining might be fine.)
- Backward reasoning can be efficient when the branching factor of the initial working memory is higher than the branching factor of the assertions that lead to the conclusion.
- Backward reasoning only asks for information when it is relevant, which is extremely useful when knowledge is expensive to access.
47Rule-based Systems
- Many problems are best characterized by a set of rules. If-then rules (implication) are the focus of problem-solving.
- Problem-solving systems that use rule-based knowledge representation and rule-based reasoning are sometimes called production systems.
- This is not how humans do it, though
48Rule-based Systems
- Production systems may use a somewhat less formal representation scheme, use an incomplete inference procedure, and treat the consequents of rules as logical actions rather than logical conclusions.
- Book refers to this as a reaction system
- Need a conflict-resolution procedure to decide which action to take
50What can an expert system do?
- In principle, anything that an expert can do, and can be persuaded to articulate.
- It is useful to distinguish between
- analytic tasks
- which involve analysing something which already exists, and
- synthetic tasks
- which involve creating something which doesn't.
51Experts vs. knowledge-engineers
- An expert is someone who
- Possesses knowledge about a domain
- Is skilled at applying this knowledge to problems in the domain
- A knowledge-engineer is skilled at interacting with domain experts and formalizing their knowledge
- Automation of this process is very desirable
- But also difficult
- The field of machine learning
52Advantages of rule-based systems
- Modularity
- The knowledge is represented as small, individual pieces. Changes to one rule do not directly affect other rules. Parallelism.
- Uniformity
- All knowledge uses the same simple representation. We can build general-purpose rule interpreters.
- Naturalness
- Domain experts can often easily express their knowledge in terms of rules. Rule-based systems generate explanations that are easy to understand.
53Disadvantages of rule-based systems
- Inefficiency
- Can be difficult to focus the effort in productive directions. (Writing good control knowledge is hard!) Difficult to organize knowledge hierarchically. Cannot easily control the grain size of steps.
- Opacity
- Can be hard to follow problem-solving control. Rule interactions can be difficult to anticipate and control.
54Good domains for rule-based systems
- Domains that are characterized by a lot of individual facts (e.g., medicine), but not necessarily a formal, unified theory (e.g., physics).
- Domains for which the actions are mostly independent of one another. For example, a patient monitoring system vs. income tax accounting.
- Domains for which subject knowledge can be easily separated from process knowledge. For example, taxonomic information vs. algorithmic information.
55Example systems
- MYCIN
- Identify bacterial infections and recommend treatment
- R1
- Configure DEC computers
- PROSPECTOR
- Interpret geological data and advise on the
presence of ore deposits
56MYCIN
- Developed at Stanford University by Shortliffe, Buchanan et al. from 1972.
- Concerned with the diagnosis and treatment of some infectious diseases.
- Steps
- Identify organisms
- Recommend treatment
- Important because its approach has been widely copied
57MYCIN - Sample dialogue
- What is the patient's name? John Smith
- Age? 55
- Have you obtained positive cultures indicating general type? Yes
- Let's call the most recent culture CULTURE-1. From what site was CULTURE-1 taken? From the blood
- .
58MYCIN - Sample dialogue 2
- When? May 9, 1988
- Let's call the first significant organism from this blood culture ORGANISM-1. Do you know the identity of ORGANISM-1? No
- Is ORGANISM-1 a rod or a coccus or something else? Rod
- What is the Gram stain of ORGANISM-1? Gram-negative
59MYCIN - Sample dialogue 3
- My therapy recommendations will be based on the following possible identities of the organism(s) that seem to be significant:
- the identity of ORGANISM-1 may be PSEUDOMONAS
- the identity of ORGANISM-2 may be KLEBSIELLA
- the identity of ORGANISM-3 may be ENTEROBACTER
- My preferred therapy recommendation is as follows. Give the following in combination:
- GENTAMYCIN. Dose: 1.7 mg/kg Q8H - IV or IM. Comments: Modify dose in renal failure
- CARBENICILLIN ...
60Control Strategies
- Concerned with the general issue of how to control the reasoning process so that it efficiently finds a solution.
- Control in Search refers to the order or selection of nodes/states to explore.
- Control in Rule Bases refers to the order or selection of rules/actions to execute at each cycle of a rule-based system (conflict resolution).
- output of matching -> list of applicable rules and their variable bindings
- output of conflict resolution -> which rule to apply
61Conflict Resolution Strategies
- 1. Assign absolute priorities to each rule in advance.
- ControlPanel(x) and Dusty(x) -> Action: Dust(x)
- ControlPanel(x) and MeltdownLightOn(x) -> Action: Evacuate(x)
- 2. Assign relative priorities to the rules, depending upon the problem state.
62Conflict Resolution Strategies 2
- 3. Choose the most specific rule (i.e., the rule with conditions that most closely match the current situation).
- Mammal(x) -> add(Legs(x, 4))
- Mammal(x) and Human(x) -> add(Legs(x, 2))
- 4. Choose a rule that uses assertions that were made most recently.
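Strategies 1, 3, and 4 can each be sketched as a selection function over the conflict set, the list of rules that matched in the current cycle. The `Match` record, the priority numbers, and the timestamps are hypothetical values chosen for illustration. Specificity is approximated here by counting conditions, which is one common simplification and not the only possible definition.

```python
from dataclasses import dataclass

@dataclass
class Match:
    rule: str
    conditions: tuple            # the conditions that matched
    priority: int = 0            # strategy 1: fixed in advance (assumed values)
    newest_fact_time: int = 0    # strategy 4: time of the latest assertion used

def by_priority(conflict_set):
    """Strategy 1: highest absolute priority wins."""
    return max(conflict_set, key=lambda m: m.priority)

def by_specificity(conflict_set):
    """Strategy 3: the rule with the most conditions wins."""
    return max(conflict_set, key=lambda m: len(m.conditions))

def by_recency(conflict_set):
    """Strategy 4: the rule using the most recently made assertion wins."""
    return max(conflict_set, key=lambda m: m.newest_fact_time)

panel = [
    Match("dust", ("ControlPanel(x)", "Dusty(x)"), priority=1, newest_fact_time=3),
    Match("evacuate", ("ControlPanel(x)", "MeltdownLightOn(x)"), priority=10,
          newest_fact_time=7),
]
legs = [
    Match("four-legs", ("Mammal(x)",)),
    Match("two-legs", ("Mammal(x)", "Human(x)")),
]

print(by_priority(panel).rule)      # evacuate
print(by_specificity(legs).rule)    # two-legs
print(by_recency(panel).rule)       # evacuate
```

Strategy 2 (relative priorities) would need a priority function that takes the problem state as input, which is omitted here.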
63Semantic Approaches to Conflict Resolution
- Preferences based on objects that matched
- Add knowledge to rules to indicate more important objects
- Preferences based on the action that the matched rule would perform
- Treat conflict resolution as a (search) problem in its own right
64Control Rules
- Some systems use control rules (sometimes called
metarules) to make the heuristic search strategy
explicit and easily modifiable. This is called
search control knowledge.
65Control Rules
- Metarule Example
- Under conditions A and B,
- rules that do not mention X (at all / in their LHS / in their RHS)
- will (definitely be useless / probably be useless / ... / probably be especially useful / definitely be especially useful)
66Control knowledge can take a variety of forms
- 1. Knowledge about which states are preferable
- 2. Knowledge about which rule is best to apply when
- 3. Knowledge about how to order subgoals
- 4. Knowledge about useful rule sequences
67Handling uncertainty
- Not all our knowledge is 100% certain
- Different approaches to uncertainty can be viewed along the following dimensions:
- What knowledge and data can be represented as being uncertain?
- What is this representation?
- How are different pieces of evidence combined?
- How do different levels of certainty affect what the system does?
68Uncertainty in MYCIN
- Both knowledge and data can be represented as being uncertain.
- Rules (knowledge)
- IF the stain of the organism is Gram negative
- AND the morphology of the organism is rod
- AND the aerobicity of the organism is aerobic
- THEN the class of the organism is enterobacteriaceae with confidence 0.8
69Uncertainty in MYCIN
- Data
- the stain of the organism is definitely Gram negative (1.0).
- the morphology is rod, with confidence 0.8.
- the morphology is coccus, with confidence 0.2
- the aerobicity is aerobic, with confidence 0.6
- the class is enterobacteriaceae, with confidence 0.48
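The 0.48 is consistent with the usual certainty-factor treatment of a conjunctive premise: take the minimum confidence among the premises and scale it by the rule's confidence, so min(1.0, 0.8, 0.6) x 0.8 = 0.48. The slides do not state this combination rule explicitly, so the sketch below is an assumption that reproduces the number; the function name is illustrative.

```python
def conclusion_cf(premise_cfs, rule_cf):
    """CF of a conclusion: weakest premise (AND = min) scaled by the rule's CF."""
    return min(premise_cfs) * rule_cf

# Premise confidences: Gram negative 1.0, rod 0.8, aerobic 0.6; rule CF 0.8.
cf = conclusion_cf([1.0, 0.8, 0.6], rule_cf=0.8)
print(round(cf, 2))  # 0.48
```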
70Uncertainty in MYCIN
- The representation uses Certainty Factors (CF)
- -1 ≤ CF ≤ 1
- Data
- CF = 1: the fact is certainly true
- CF = 0: we know nothing about whether the fact is true or not
- CF = -1: the fact is certainly not true
71Uncertainty in MYCIN
- Rules
- CF = 1
- if the premise is known to be true then the conclusion is known to be true
- CF = 0
- the premise brings no evidence for or against the conclusion
- CF = -1
- if the premise is known to be true then the conclusion is known to be false