COM362 Knowledge Engineering - PowerPoint PPT Presentation

1
Classic Case Studies
  • John MacIntyre
  • 0191 515 3778
  • john.macintyre_at_sunderland.ac.uk

2
The Classics
  • DENDRAL
  • determine molecular structure of an unknown
    compound
  • started in 1965
  • MYCIN
  • medical diagnosis system
  • started in 1972

3
DENDRAL
  • Developed at Stanford University in 1965
  • Possibly the first computer program EVER to rival
    human experts in a specialized field
  • Determine molecular structure of an unknown
    compound
  • Used a modified form of generate and test
    methodology

4
The DENDRAL Problem
  • Chemist is presented with an unknown chemical
    compound
  • Chemist must determine the molecular structure
  • Therefore needs to find out which atoms are in
    the structure
  • Needs to know how the atoms are connected to form
    molecules

5
The DENDRAL Problem
  • Data from mass spectrometer
  • Not straightforward!
  • Molecules can fragment in different ways
  • need to make some predictions about how molecules
    are LIKELY to break
  • sub-components of the molecule may be found in
    many different compounds
  • chemists therefore determine compound
    sub-components, and apply constraints that other
    sub-components must satisfy

6
The DENDRAL Problem
  • Not a trivial problem!
  • Consider the formula C6H13NO2
  • There are 10,000 isomers of this compound!
  • Each isomer can be uniquely identified
  • Could simply generate each of the 10,000
    isomers in turn and test
  • Very expensive in computing time!
  • We would therefore like to constrain the generation
    of candidate isomers to save time

7
Constrained Generation
  • CONGEN
  • DENDRAL program for constrained generation of
    complete chemical structures
  • Manipulates symbols representing atoms and
    molecules
  • Uses a set of constraints on how atoms can be
    inter-connected
  • Chemist can specify and vary the initial
    constraints (e.g. based on experimental evidence)

8
Specifying Constraints
  • Defining constraining structures
  • specify superatoms that compound must contain
  • typically in organic compounds, rings or chains
    of carbon atoms linked to hydrogens
  • Defining other constraints
  • open for the chemist to hypothesize
  • e.g. the compound must contain a ring of 6
    carbon atoms
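
The idea of constraining generation, rather than generating every candidate and testing afterwards, can be illustrated with a toy sketch. This is not CONGEN and not real chemistry: candidates are just linear strings of atom symbols, and the names `generate`, `required` and `forbidden_pairs` are assumptions made up for this example. The point is that a partial candidate is dropped as soon as it violates a constraint, so everything that would have been built on top of it is never generated.

```python
from collections import Counter


def generate(atoms, required, forbidden_pairs):
    """Yield linear chains (strings) built from the multiset `atoms`.

    `required` is a substring every finished chain must contain (a toy
    stand-in for a superatom). `forbidden_pairs` lists adjacent atom
    pairs that are not allowed (a toy stand-in for bonding constraints).
    """

    def extend(chain, remaining):
        if not remaining:
            # Test step: keep only chains with the required substructure.
            if required in chain:
                yield chain
            return
        for atom in sorted(remaining):
            if chain and (chain[-1], atom) in forbidden_pairs:
                continue  # prune: this partial chain already breaks a constraint
            rest = remaining.copy()
            rest[atom] -= 1
            if rest[atom] == 0:
                del rest[atom]
            yield from extend(chain + atom, rest)

    yield from extend("", Counter(atoms))


candidates = list(
    generate("CCNO", required="CN", forbidden_pairs={("N", "O"), ("O", "N")})
)
print(candidates)
```

Of the 12 distinct orderings of two C, one N and one O, only four survive both constraints in this toy setting. Mirror-image strings are not merged here; a real structure generator works on molecular graphs and must avoid such duplicates.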

9
Assessing Candidates
  • CONGEN may produce hundreds or thousands of
    candidate structures
  • First pass at assessing the candidates
  • Use basic rules of mass spectrometry to test
    candidates and remove most unlikely ones
  • MSPRUNE is another DENDRAL program that does this
  • MSRANK ranks remaining structures according to
    how their graphs match expected graphs for known
    compounds

10
Scoring Candidates
  • Peaks (features) in the spectral graphs are
    weighted to represent their importance
  • Weighted scores are produced to give the rank
    ordering for each candidate structure
  • Essentially this is a hypothesize-and-test
    strategy
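
A minimal sketch of weighted scoring, assuming each candidate structure comes with a set of predicted peak positions and each peak has an importance weight. The function names, peak values and weights are invented for illustration; the actual scoring used by MSRANK is not given in these slides.

```python
def score(predicted_peaks, observed_peaks, weights):
    """Sum the weights of the predicted peaks that were actually observed."""
    return sum(weights.get(p, 1.0) for p in predicted_peaks if p in observed_peaks)


def rank(candidates, observed_peaks, weights):
    """Return (score, name) pairs, best-scoring candidate first."""
    scored = [
        (score(peaks, observed_peaks, weights), name)
        for name, peaks in candidates.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical data: peak positions and weights are made up.
observed = {43, 57, 71}
weights = {43: 1.0, 57: 2.0, 71: 3.0}
candidates = {
    "A": {43, 57},
    "B": {57, 71},
    "C": {29, 43},
}

ranking = rank(candidates, observed, weights)
print(ranking)
```

Candidate B ranks first because it explains the two most heavily weighted peaks.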

11
Evaluating DENDRAL
  • Available on the network of Stanford University,
    California
  • Used by hundreds of people around the world every
    day
  • Has been used to challenge long-published
    chemical literature successfully
  • The first stepping-stone between traditional
    problem solving and modern expert systems

12
Features of DENDRAL
  • Uses information from domain experts to help
    limit the search space for candidate structures
  • Uses an explicit representation of knowledge -
    fragmentation rules
  • No real inference mechanism - iterative passes
    through the rules controlled by user

13
The Keys to Success?
  • DENDRAL was successful because
  • It did not set out to replace the expert, only to
    assist the expert
  • The search technique is based on a proven model
    of knowledge with known mathematical properties
  • There is a language which can be used to
    represent the structures easily and is well
    specified

14
MYCIN
  • Developed at Stanford University in 1972
  • Regarded as the first true expert system
  • Assists physicians in the treatment of blood
    infections
  • Many revisions and extensions to MYCIN over the
    years

15
The MYCIN Problem
  • Physician wishes to specify an antimicrobial
    agent - basically an antibiotic - to kill
    bacteria or arrest their growth
  • Some agents are poisonous!
  • No agent is effective against all bacteria
  • Most physicians are not expert in the field of
    antibiotics

16
The Decision Process
  • There are four questions in the process of
    deciding on treatment
  • Does the patient have a significant infection?
  • What are the organism(s) involved?
  • What set of drugs might be appropriate to treat
    the infection?
  • What is the best choice of drug or combination of
    drugs to treat the infection?

17
MYCIN Components
  • KNOWLEDGE BASE
  • facts and knowledge about the domain
  • DYNAMIC PATIENT DATABASE
  • information about a particular case
  • CONSULTATION PROGRAM
  • asks questions, gives advice on a particular case
  • EXPLANATION PROGRAM
  • answers questions and justifies advice
  • KNOWLEDGE ACQUISITION PROGRAM
  • adds new rules and changes existing rules

18
Basic MYCIN Structure
  • [Diagram not reproduced. Components shown:
    Physician User, Consultation Program, Static
    Knowledge Base, Dynamic Patient Data, Explanation
    Program, Knowledge Acquisition Program, Infectious
    Disease Expert]

19
The MYCIN Knowledge Base
  • Where the rules are held
  • Basic rule structure in MYCIN is
  • if condition 1 and ... and condition m hold
  • then draw conclusion 1 and ... and conclusion n
  • Rules written in the LISP programming language
  • Rules can include certainty factors to help
    weight the conclusions drawn

20
An Example Rule
  • IF (1) The stain of the organism is Gram
    negative, and
  • (2) The morphology of the organism is rod, and
  • (3) The aerobicity of the organism is aerobic
  • THEN
  • There is strongly suggestive evidence (0.8)
    that the class of the organism is
    Enterobacteriaceae
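
The example rule can be written down as data plus a small matching function. This is a sketch in Python, not MYCIN's LISP representation; the `Rule` class, the field names and the parameter spellings are assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    premises: tuple    # (parameter, value) pairs that must all hold
    conclusion: tuple  # (parameter, value) asserted when the rule fires
    certainty: float   # strength attached to the conclusion


RULE = Rule(
    premises=(
        ("stain", "gram-negative"),
        ("morphology", "rod"),
        ("aerobicity", "aerobic"),
    ),
    conclusion=("class", "Enterobacteriaceae"),
    certainty=0.8,
)


def apply_rule(rule, organism):
    """Return (conclusion, certainty) if every premise matches, else None."""
    if all(organism.get(param) == value for param, value in rule.premises):
        return rule.conclusion, rule.certainty
    return None


organism = {"stain": "gram-negative", "morphology": "rod", "aerobicity": "aerobic"}
result = apply_rule(RULE, organism)
print(result)
```

Keeping the rule as data rather than as program logic is what lets an explanation or knowledge acquisition component inspect and modify it.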

21
Calculating Certainty
  • Rule certainties are regarded as probabilities
  • Therefore must apply the rules of probability in
    combining rules
  • Multiplying probabilities which are less than
    certain results in lower and lower certainty!
  • E.g. 0.8 x 0.6 = 0.48
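
The arithmetic above can be checked with a short sketch. It follows the simplification used on this slide (certainties multiplied like probabilities along a chain of rules) and should not be read as MYCIN's full certainty-factor calculus; the function name `chain_certainty` is an assumption for this example.

```python
import math


def chain_certainty(certainties):
    """Multiply the certainties of rules applied one after another."""
    return math.prod(certainties)


combined = chain_certainty([0.8, 0.6])
longer_chain = chain_certainty([0.8, 0.6, 0.5])

print(round(combined, 2))
print(round(longer_chain, 2))
```

Each additional rule with certainty below 1.0 lowers the combined value further, which is the effect the slide warns about.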

22
Other Types of Knowledge
  • Facts and definitions such as
  • lists of all organisms known to the system
  • knowledge tables of clinical parameters and the
    values they can take (e.g. morphology)
  • classification system for clinical parameters and
    the context in which they are applied (e.g.
    referring to patient or organism)
  • Much of MYCIN's knowledge refers to 65 clinical
    parameters

23
MYCIN's Context Trees
  • Used to organise case data
  • Helps to visualise how information within the
    case is related
  • Easily extended and adapted as more clinical
    evidence becomes available

24
Example Context Tree
  • [Diagram not reproduced. Context tree rooted at
    PATIENT-1, with nodes CULTURE-1, CULTURE-2,
    CULTURE-3, OPERATION, ORGANISM-1, ORGANISM-2,
    ORGANISM-3, DRUG-1, DRUG-2]

25
MYCIN Control Structure
  • Uses a goal-based strategy to attempt to solve,
    in the first instance, a TOP LEVEL GOAL RULE
  • Establishes sub-goals required to satisfy the top
    level goal
  • Therefore establishes the concept of backward
    chaining

26
Top Level Goal
  • IF (1) There is an organism which requires
    therapy
  • and
  • (2) consideration has been given to any other
    organism requiring therapy
  • THEN
  • compile a list of possible therapies, and
  • determine the best one in this list

27
MYCIN Subgoals
  • Sub-goals are a generalised form of the top-level
    goal
  • Hence sub-goals consider the proposition that
    there is a particular organism
  • Exhaustive search on all relevant rules to test
    this proposition (until or unless one succeeds
    with total certainty)
  • More like exhaustive search than backward chaining
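
A minimal sketch of goal-driven (backward) reasoning, assuming plain true/false facts and no certainty factors. The first rule mirrors the earlier example rule; the second rule and all identifiers are hypothetical and exist only to give the goal something to chain through. The sketch also assumes the rules contain no cycles.

```python
# Each rule is (premises, conclusion).
RULES = [
    (["gram_negative", "rod", "aerobic"], "enterobacteriaceae"),
    (["enterobacteriaceae"], "organism_requires_therapy"),  # hypothetical rule
]


def prove(goal, facts, rules, trace):
    """Return True if `goal` is a known fact or can be derived from `rules`."""
    trace.append(goal)  # record every goal and sub-goal visited
    if goal in facts:
        return True
    for premises, conclusion in rules:
        if conclusion != goal:
            continue
        # Each premise becomes a sub-goal, proved in the same way.
        if all(prove(p, facts, rules, trace) for p in premises):
            return True
    return False


trace = []
facts = {"gram_negative", "rod", "aerobic"}
result = prove("organism_requires_therapy", facts, RULES, trace)
print(result, trace)
```

The trace shows the top-level goal being reduced to sub-goals until known facts are reached. With true/false facts the search can stop at the first rule that succeeds; with certainties below 1.0 every relevant rule would have to be tried, which is why the slide describes MYCIN's behaviour as closer to exhaustive search.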

28
Selection of Therapy
  • Done after the diagnostic phase is complete
  • Two phases
  • Selection of a list of candidate drugs
  • Choice of preferred drugs or combinations of
    drugs from the list
  • Therapy rules use information on
  • Sensitivity of organism to drug
  • Contraindications on the drug

29
Example Recommendation
  • IF The identity of the organism is Pseudomonas
  • THEN
  • I recommend therapy from the following drugs
  • 1 - COLISTIN (0.98)
  • 2 - POLYMYXIN (0.96)
  • 3 - GENTAMICIN (0.96)
  • 4 - CARBENICILLIN (0.65)
  • 5 - SULFISOXAZOLE (0.64)
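
The two-phase selection can be sketched using the figures from this example recommendation. The threshold, the contraindication table and the patient condition `condition_x` are placeholders invented for illustration and carry no clinical meaning; the function names are likewise assumptions.

```python
# Scores taken from the example recommendation on this slide.
SENSITIVITY = {
    "COLISTIN": 0.98,
    "POLYMYXIN": 0.96,
    "GENTAMICIN": 0.96,
    "CARBENICILLIN": 0.65,
    "SULFISOXAZOLE": 0.64,
}


def candidate_drugs(sensitivity, threshold):
    """Phase 1: list drugs scoring at or above the threshold, best first."""
    kept = [drug for drug, value in sensitivity.items() if value >= threshold]
    # sorted() is stable, so drugs with equal scores keep their listed order.
    return sorted(kept, key=lambda drug: -sensitivity[drug])


def preferred_drugs(candidates, contraindications, patient_conditions):
    """Phase 2: drop any candidate contraindicated for this patient."""
    return [
        drug
        for drug in candidates
        if not (contraindications.get(drug, set()) & patient_conditions)
    ]


# Placeholder contraindication table, not clinical data.
contraindications = {"GENTAMICIN": {"condition_x"}}

shortlist = candidate_drugs(SENSITIVITY, threshold=0.9)
choice = preferred_drugs(shortlist, contraindications, {"condition_x"})
print(shortlist)
print(choice)
```

Separating the two phases keeps the sensitivity knowledge independent of patient-specific information, so the same candidate list can be filtered differently for each case.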

30
Evaluating MYCIN
  • Many studies show that MYCIN's recommendations
    compare favourably with experts for diseases like
    meningitis
  • One study compared MYCIN with expert and
    non-expert physicians on real patient cases
  • MYCIN matched experts
  • MYCIN was better than non-experts

31
MYCIN Limitations
  • A research tool - never intended for practical
    application
  • Limited knowledge base - only covers a small
    number of infectious diseases
  • Needed more computing power than most hospitals
    had at the time!
  • Doctors reluctant to use it
  • Poor interface

32
Conclusions
  • DENDRAL was a ground-breaking program as it
    showed that computers could match experts in a
    specific domain
  • DENDRAL was always intended as an expert
    assistant
  • MYCIN was the first expert system which
    included an inference control structure
  • MYCIN is limited for practical use

33
Further Reading
  • Introduction to Expert Systems
  • P. Jackson, Addison-Wesley, 1990
  • Expert Systems: Principles and Programming
  • J. Giarratano, G. Riley, PWS Publishing, 1994
  • Artificial Intelligence: Tools, Techniques and
    Applications
  • T. O'Shea, M. Eisenstadt, Open University, 1984