Slides: 55
Provided by: drjerem

1
Introduction to Clinical Terminology and
Classification
  • AL Rector, OpenGALEN / CO-ODE, The Medical
    Informatics Group, U of Manchester
  • www.cs.man.ac.uk/mig/galen, www.opengalen.org,
    www.co-ode.org, oiled.man.ac.uk,
    rector_at_cs.man.ac.uk

2
Where we come from
Best Practice
3
OpenGALEN Philosophy
  • Terminology is software
  • Terminology is the interface between people and
    machines
  • Re-use is the key
  • Patient-centred information
  • Terminology must have a purpose
  • Always ask: What's it for?
  • Not art for art's sake
  • Terminology supports clinical applications - not
    vice versa
  • Applications for someone to do something for
    somebody
  • Keep the Horse before the Cart
  • Always ask: How will we know if it works?
    How will we know if it fails?

4
OpenGALEN Key ideas
  • Separation of kinds of knowledge
  • Terminology, medical record and information
    system schemas
  • Models of meaning, Models of Use
  • Concepts, language, Coding, Indexing,
    Pragmatics
  • Machine level, User level
  • Knowledge is fractal!
  • There will always be more detail to be added
  • Therefore terminologies must be extensible
  • Formal logical Support
  • Too big and complicated to maintain by hand
  • Extensibility requires rules
  • Software needs logical rigour

5
Axes for kinds of Knowledge
  • Machine level
  • Human Level
  • Concepts
  • Language
  • Coding
  • Indexing
  • Pragmatics User Interface
  • Terminology
  • Medical Records/Information systems
  • Decision Support rules

6
9) Interface of EHR, Messaging and Decision Support
Significant Research Topic Now
7
Uses of Terminology
  • Clinical
  • Epidemiology and quality assurance
  • Reproducibility / Comparability
  • Indexing
  • Software
  • Re-use !
  • Integration and Messaging between systems
  • Authoring and configuring systems
  • Data capture and presentation (user interface)
  • Indexing information and knowledge (meta-data,
    The Web)

8
An Old Problem
  • "On those remote pages it is written that animals
    are divided into:
  • a. those that belong to the Emperor
  • b. embalmed ones
  • c. those that are trained
  • d. suckling pigs
  • e. mermaids
  • f. fabulous ones
  • g. stray dogs
  • h. those that are included in this classification
  • i. those that tremble as if they were mad
  • j. innumerable ones
  • k. those drawn with a very fine camel's hair
    brush
  • l. others
  • m. those that have just broken a flower vase
  • n. those that resemble flies from a distance"

From The Celestial Emporium of Benevolent
Knowledge, Borges
9
History: Origins of existing terminologies
  • Epidemiology
  • ICD - Farr in 1860s to ICD9 in 1979
  • International reporting of morbidity/mortality
  • ICPC - 1980s
  • Clinically validated epidemiology in primary care
  • Now expanded for use in Dutch GP software
  • Librarianship
  • MeSH - NLM from around 1900 - Index Medicus
    Medline
  • EMTree - from Elsevier in 1950s - EMBase
  • Remuneration
  • ICD9-CM (Clinical Modification) 1980
  • 10 x larger than ICD; aimed at US insurance
    reimbursement

10
Traditional Systems
  • Built by people for interpretation by people
    (Coding clerks)
  • Most knowledge implicit in rubrics
  • Must understand medicine to use intelligently
  • Not built for software
  • On paper for use on paper
  • Enumerated - top down all possibilities listed
  • Serial - Single use - Single View
  • Hierarchical Thesauri
  • Traditional terminological techniques from
    librarianship
  • Broader than / Narrower than (ISO 1087)
  • no logical foundation
  • Focused on terms
  • Language and concepts mixed
  • Synonyms, preferred terms, etc caused confusion

11
History (2)
  • Pathology indexing
  • SNOMED 1970s to 1990 (SNOMED International)
  • First faceted or combinatorial system
  • Topology, morphology, aetiology, function
  • Plus diseases cross referenced to ICD9
  • Specialty Systems
  • Mostly similar hierarchical systems
  • ACR-NEMA/SDM - Radiology
  • NANDA, ICNP - Nursing

12
History (3)
  • Early computer systems
  • Read I (4 digit Read)
  • Aimed at saving space on early computers
  • 1-5 Mbyte / 10,000 patients
  • Hierarchical modelled on ICD9
  • Detailed signs and symptoms for primary care
  • Purchased by UK government in 1990
  • Single use
  • Morbidity indexing
  • Medical Entities Dictionary (MED)
  • Jim Cimino

13
History (4)
  • Aspirations for electronic patient records (EPRs)
  • Weed's Problem Oriented Medical Record
  • Direct entry by health care professionals
  • Aspirations for decision support
  • Ted Shortliffe (MYCIN), Clem McDonald (Computer
    based reminders), Perry Miller (Critiquing),..
  • Aspirations for re-use
  • Patient centred information
  • Needed common multi-use multi-purpose terminology
  • None worked

14
Motivations and Business Models
  • Remuneration
  • ICD9/10-CM in US for insurance and Medicare for
    diseases
  • Current Procedural Terminology (CPT) for
    surgical procedures
  • Public Health Reporting
  • ICD9/10
  • Clinical Recording
  • Read 1-3, SNOMED-RT/CT
  • ICPC - International Classification of Primary
    Care
  • Indexing publications
  • MeSH (Medical Subject Headings) - basis of
    indexing MedLine/PubMed
  • EMTree - basis of indexing EMBASE
  • Support for applications and decision support
  • GALEN

15
Summary of Changes at end of 1st Generation
  • From terminologies for people to terminologies
    for machines
  • From paper to software
  • From single use to multiple re-use for patient
    centred systems
  • From entry by coding clerks to direct entry by
    health care professionals
  • From pre-defined reporting for statistics to
    reliable indexing for decision support

16
Changes at end of first generation
  • From models of USE to models of MEANING
  • But tended to lose the model of use
  • The goal of useful and usable systems lost

17
Problems with First Generation Enumerated
Systems in coping with these changes
18
Problems (1)
  • Scaling !!!
  • More detail and more specialities required
    scaling up, but...
  • The combinatorial explosion
  • Example Burns
  • 100 sites x 3 depths → 404 codes
  • 5 subsites/site x chemical or thermal → 7272
  • x 3 extents x 3 durations → 116,352
  • The Persian chessboard
  • 2^64 ≈ 10^19
  • 10^19 grains of rice ≈ 100 billion tonnes of rice
  • 10^19 nanoseconds ≈ 300 years
  • Read II grew from 20,000 to 250,000 terms in 100
    staff-years
  • still too small to be useful
  • but too big to use
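
The burns arithmetic above can be checked with a short sketch. The slide's
totals (404, 7272, 116,352) come out exactly if every axis is given one
extra "unspecified" value; that reading is an assumption inferred from the
numbers, not something the slide states. The function name is illustrative.

```python
from math import prod

# Assumption: each axis carries one extra "unspecified" value, which is
# what makes the slide's totals come out exactly.
def enumerated_codes(axis_sizes):
    """Codes needed to list every combination of the axes in advance."""
    return prod(n + 1 for n in axis_sizes)

sites_depths = enumerated_codes([100, 3])             # 101 * 4 = 404
with_subsites = enumerated_codes([100, 3, 5, 2])      # 404 * 6 * 3 = 7272
with_extent = enumerated_codes([100, 3, 5, 2, 3, 3])  # 7272 * 4 * 4 = 116352

# A compositional scheme only has to list the values on each axis.
axis_values = sum([100, 3, 5, 2, 3, 3])               # 116

# The Persian chessboard: doubling across 64 squares.
chessboard = 2 ** 64

print(sites_depths, with_subsites, with_extent, axis_values, chessboard)
```

The contrast is between a product (enumeration) and a sum (composition):
adding one more axis multiplies the first and only adds to the second.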

19
Benefits
  • Avoid the Exploding Bicycle
  • From phrase book to dictionary and grammar
  • Tame combinatorial explosions
  • 1980 - ICD-9 (E826) 8
  • 1990 - READ-2 (T30..) 81
  • 1995 - READ-3 87
  • 1996 - ICD-10 (V10-19) 587
  • V31.22 Occupant of three-wheeled motor vehicle
    injured in collision with pedal cycle, person on
    outside of vehicle, nontraffic accident, while
    working for income
  • and meanwhile elsewhere in ICD-10
  • W65.40 Drowning and submersion while in bath-tub,
    street and highway, while engaged in sports
    activity
  • X35.44 Victim of volcanic eruption, street and
    highway, while resting, sleeping, eating or
    engaging in other vital activities

20
Problems (2)
  • Information implicit in the rubrics
  • Hypertension excluding pregnancy
  • Computers can't read!
  • Invisible to software
  • No explicit information except the hierarchy
  • Minimal support for software
  • No opportunity to use software to help
  • Language and concepts confused
  • Synonyms
  • Preferred terms
  • Homonyms
  • Only simple look up and spelling correction

21
Problems (3)
  • Mixed Organisation
  • Heart diseases in 13 of 19 chapters of ICD
  • Tumours, infections, congenital abnormalities,
    toxic,
  • Steroids in five chapters of standard drug
    classifications
  • Anti-inflammatories, anti-asthmatics,
  • Unreliable for indexing or Abstractions
  • How to say something about all heart diseases?
  • Fixed organisation
  • Single hierarchy - Single use
  • Where to put gout - arthritis or metabolic
    disease?
  • Back and forth in each edition of ICD
  • No re-use

22
Problems (3b): Thesauri rather than Classifications
23
Problems (4)
  • Semantic identifiers
  • Codes really paths - moving a concept meant
    changing its code
  • 3 Cardiovascular disorders
  • 3.4 Disorders of Artery...
  • ...3.4.2 Disorders of coronary artery...
  • 3.4.2.3 Coronary thrombosis
  • Easy to process but...
  • Reorganisation requires changing codes
  • Codes cannot be permanent
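
A minimal sketch of why path-like codes cannot be permanent, using the
3.4.2.3 example above. The "chapter 7" move and the identifier "C0001"
are hypothetical, chosen only to illustrate the point.

```python
def path_code(parent_code, position):
    """A 'semantic' code: the parent's code plus a position under it."""
    return f"{parent_code}.{position}" if parent_code else str(position)

cardiovascular = path_code("", 3)          # "3"
artery = path_code(cardiovascular, 4)      # "3.4"
coronary = path_code(artery, 2)            # "3.4.2"
thrombosis_v1 = path_code(coronary, 3)     # "3.4.2.3"

# Reorganisation: suppose a later edition files coronary thrombosis first
# under a hypothetical chapter 7. The code itself has to change.
thrombosis_v2 = path_code("7", 1)          # "7.1"

# Nonsemantic alternative: a meaningless id, with the hierarchy kept in a
# separate table. Moving the concept edits the table, not the id.
concept_id = "C0001"                       # hypothetical identifier
parents_v1 = {concept_id: ["3.4.2"]}
parents_v2 = {concept_id: ["7"]}

print(thrombosis_v1, thrombosis_v2, concept_id)
```

Data recorded under "3.4.2.3" no longer lines up with the new edition,
whereas data recorded under the nonsemantic id is unaffected by the move.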

24
Problems (5)
  • Maintenance
  • 20 Years from ICD9 to ICD10
  • 100 person-years from Read 1 to Read 3
  • Mega francs/guilders/crowns/marks on European
    coding schemes
  • Thousands of unpaid hours of committee time
  • Impossible / meaningless decisions take longest
  • You can search forever for something that is not
    there
  • Multiple uses compete -
  • Must choose one use
  • Most successful were clear about their purpose -
    ICD, ICPC, MeSH
  • Codes change meaning with version changes
  • Old data misleading!

25
Problems (6)
  • Version specific artefacts
  • Not otherwise specified (NOS)
  • Used to move a general concept down
  • Not elsewhere classified (NEC)
  • Catch all - Nowhere else in coding system e.g.
    Tumour not elsewhere classified
  • dependent on version,
  • Other
  • Catch all - Not listed below, e.g. Other
    diseases of the cardiovascular system
  • dependent on version
  • Not used consistently
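
The version dependence of catch-all rubrics can be shown with a toy
chapter. The condition names and the two "editions" below are invented
for illustration; the point is only that "Other" is defined by what else
is listed.

```python
# Hypothetical universe of conditions in one chapter.
all_conditions = {"angina", "infarction", "myocarditis", "endocarditis"}

# Codes explicitly listed in two hypothetical editions.
listed_v1 = {"angina", "infarction"}
listed_v2 = {"angina", "infarction", "myocarditis"}  # one rubric added

def other(listed):
    """What the catch-all 'Other ...' rubric covers in a given edition."""
    return all_conditions - listed

other_v1 = other(listed_v1)
other_v2 = other(listed_v2)

# Same code, different meaning: a record coded 'Other' under edition 1
# may be a myocarditis case, which 'Other' in edition 2 can never be.
print(sorted(other_v1), sorted(other_v2))
```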

26
Problem (7): Language is slippery. Two hands or
four?
27
Language/Concepts are slippery
  • Human cognition makes it look easy
  • Logic fails to capture it
  • Classification is easy until you try to do it
  • Trying since Aristotle in the West and Ancient
    Chinese in the East
  • Words/Concepts mean what a community decides they
    mean
  • Does a chimpanzee have four hands?
  • Is a prion alive?
  • Is surgery on the ovary a kind of Endocrine
    surgery?
  • Easier to agree on the concrete than the abstract
  • Easy to agree on useful abstractions and
    generalisations
  • Harder to agree on how to name them

28
Problems (8)
  • There is no re-use - there is no standard
  • The grand challenge: a common controlled
    vocabulary for medicine
  • But re-use requires multiple different views
  • People's needs differ / People do and find
    different things
  • By profession
  • Doctors and specialties, nurses,
    physiotherapists, dentists
  • By situation
  • Inpatient, outpatient, primary care, community
  • By task
  • Diagnosis, management, prescribing,
  • patient care, public health, quality assurance,
    management, planning
  • By country and community
  • US, UK, France, Germany, Japan, Korea, ...

29
Summary of Problems: 1st Generation Enumerated
Systems
  • Enumerated Single Hierarchies
  • List all possibilities in advance
  • Cannot cope with fractal knowledge
  • Most knowledge implicit
  • Invisible to software
  • Can't agree on common concepts and classification
  • Unreliable for indexing
  • Difficult to use for healthcare professionals
  • No support for user interface
  • Can't build and maintain big classifications
  • Language and concepts don't translate easily to
    logic and software

30
Cimino's Desiderata (1)
  • Concept orientation
  • Separate language (terms) and concepts (codes)
  • Concept permanence
  • Never re-use a code (retire it)
  • Nonsemantic concept identifiers
  • Separate the code from the path
  • Polyhierarchy
  • Allow one concept to be classified in multiple
    ways
  • Gout can be both a metabolic disease and an
    arthritis
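
The four desiderata on this slide fit in one small data structure. This
is a sketch under stated assumptions: the ids "K1".."K4", the class and
the function names are hypothetical, not taken from any real terminology.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    concept_id: str                             # nonsemantic: no path inside
    terms: list                                 # language kept apart from the code
    parents: set = field(default_factory=set)   # polyhierarchy: many parents
    retired: bool = False                       # permanence: retire, never delete

concepts = {}

def add_concept(concept_id, terms, parents=()):
    """Register a new concept; an id that was ever used is refused."""
    if concept_id in concepts:
        raise ValueError(f"{concept_id} already used; codes are never re-used")
    concepts[concept_id] = Concept(concept_id, list(terms), set(parents))

def retire(concept_id):
    """Mark a concept as retired; its id stays reserved."""
    concepts[concept_id].retired = True

def ancestors(concept_id):
    """All concepts reachable by following parent links."""
    seen, stack = set(), list(concepts[concept_id].parents)
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(concepts[parent].parents)
    return seen

add_concept("K1", ["Disease"])
add_concept("K2", ["Metabolic disease"], {"K1"})
add_concept("K3", ["Arthritis"], {"K1"})
add_concept("K4", ["Gout"], {"K2", "K3"})       # classified in both places

print(sorted(ancestors("K4")))
```

Because the id carries no path, gout can sit under both parents without
the "back and forth" between editions described earlier.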

31
Cimino's Desiderata (2)
  • Formal Definitions
  • i.e. be compositional
  • Reject "Not elsewhere classified"
  • concept permanence and NEC
  • Multiple granularities
  • Organ, tissue, cellular, molecular
  • Grades, types, classes of diseases
  • Special clinical criteria
  • Multiple consistent views
  • Allow different organisations
  • e.g. functional, anatomical, pathological

32
Cimino's Desiderata (3)
  • Represent context
  • Family history, risk, source of information
  • Evolve gracefully
  • Allow controlled changes
  • Recognise redundancy (equivalence)
  • Carcinoma Lung ⇔ Carcinoma of the lung
  • How would we know?
  • How could a machine know?
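
One possible answer to "How could a machine know?" is to compare formal
definitions rather than strings. The sketch below assumes each phrase has
already been mapped to a definition; that mapping is the hard part and is
simply given here as a hypothetical lookup table.

```python
def definition(base, **qualifiers):
    """Canonical form: a base concept plus an unordered set of qualifiers."""
    return (base, frozenset(qualifiers.items()))

# Hypothetical phrase-to-definition table, for illustration only.
lexicon = {
    "Carcinoma Lung": definition("Carcinoma", site="Lung"),
    "Carcinoma of the lung": definition("Carcinoma", site="Lung"),
    "Carcinoma of the liver": definition("Carcinoma", site="Liver"),
}

def equivalent(term_a, term_b):
    """Two terms are redundant if their definitions are identical."""
    return lexicon[term_a] == lexicon[term_b]

print(equivalent("Carcinoma Lung", "Carcinoma of the lung"))
print(equivalent("Carcinoma Lung", "Carcinoma of the liver"))
```

String comparison would treat the first pair as different; comparison of
definitions treats them as one concept with two terms.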

33
Solution 0: You are worrying about the wrong
problem
  • International Classification of Primary Care
    (ICPC)
  • Focus on repeatability and quality across
    languages for a small (<2000) number of codes

34
Solution Generation 1: Megaterm Crossmapping
UMLS
Decision support
Clinical Applications
Medical Records
Data entry
35
Cross mapped and typed terminologies and
vocabularies
36
The UMLS Knowledge Sources
  • Metathesaurus
  • Cross mappings
  • Language resources
  • NORM stemming and term recognition
  • UMLS Semantic Net
  • 170 types attached to categorise concepts
  • Disease, anatomical part, micro-organism, etc.

37
(No Transcript)
38
Solution 1: Cross-mapping - UMLS
  • Unified Medical Language System (UMLS) from US
    National Library of Medicine
  • De facto common registry for vocabularies
  • Concept Unique Identifiers (CUIs) and Lexical
    Unique Identifiers (LUIs) are de facto the common
    nomenclature
  • NB: must use a CUI plus a LUI to get unique
    identification
  • Licence terms
  • Class I free for use
  • Class III heavily restricted
  • (Class II almost nonexistent)

39
Solution 1: Cross-mapping - UMLS
  • An invaluable resource, but...
  • No better than the vocabularies which are mapped
  • Limited detail for patient care
  • Unreliable for indexing or abstraction of
    knowledge
  • Best for relating everything to MeSH for indexing
    literature
  • Still limited by combinatorial explosion
  • Still can't cope with fractal knowledge
  • Not extensible - no help in building or extending
    terminologies
  • No help in reorganising existing terminologies to
    re-use for new purposes
  • Top down
  • Information still implicit
  • Minimal help with software
  • No help with data capture, user interfaces

40
Solution IIa: Build what you need as you need it
  • LOINC (Logical Observation Identifiers Names
    and Codes): dominant coding system for
    laboratory systems, http://www.loinc.org/
  • Clinical LOINC contains increasing amounts of
    clinical references
  • Fully Class I; included in UMLS
  • Closely linked to HL7 and HL7 vocabulary
    committee

41
(No Transcript)
42
Build and Control what you need only
  • HL7 Messaging standard
  • Controls the codes that hold messages together
  • Uses codes from elsewhere as payload
  • See www.hl7.org
  • (Possibly the world's worst web site)
  • Some material members only

43
Solutions Generations 2-3: Compositional Systems
  • Beat the combinatorial explosion
  • Build concepts out of pieces - Lego
  • Dictionary and grammar rather than phrasebook
  • But hard

44
Solution Generation 1.5: Faceted
  • Faceted systems: SNOMED International
  • Inflammation + Lung + Infection + Pneumococcus →
    Pneumococcal pneumonia
  • Limit combinatorial explosion, but
  • Rigid - a limited number of axes / facets /
    chapters
  • Each facet has the problems of a first generation
    enumerated system
  • Much knowledge still implicit
  • No way to know how identifiers relate
  • No explicit relations, only
  • No way to recognise redundancy / equivalence
  • No help with data capture or user interface / No
    way to recognise nonsense
  • Carcinoma + Hair + Donkey + Emotional → ????
  • Still can't cope with fractal knowledge
  • Limited extensibility; limited help with
    building, extending or reorganising
  • Still Top Down
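
A sketch of the "no way to recognise nonsense" point. A purely faceted
code can only be checked axis by axis; rejecting nonsense needs an extra
set of rules about which values may combine. The facet values and the
SENSIBLE rule table are assumptions made up for this illustration.

```python
# Hypothetical facet values; a faceted code is one value per axis.
FACETS = {
    "morphology": {"Inflammation", "Carcinoma"},
    "topography": {"Lung", "Hair"},
    "aetiology": {"Pneumococcus", "Donkey"},
}

def faceted_code_ok(code):
    """All a faceted system can check: each value belongs to its axis."""
    return all(code[axis] in values for axis, values in FACETS.items())

# Assumed extra 'grammar': which morphology may apply to which site.
SENSIBLE = {("Inflammation", "Lung"), ("Carcinoma", "Lung")}

def sensible(code):
    """Axis check plus the explicit combination rules."""
    pair = (code["morphology"], code["topography"])
    return faceted_code_ok(code) and pair in SENSIBLE

pneumonia = {"morphology": "Inflammation", "topography": "Lung",
             "aetiology": "Pneumococcus"}
nonsense = {"morphology": "Carcinoma", "topography": "Hair",
            "aetiology": "Donkey"}

print(faceted_code_ok(nonsense), sensible(nonsense), sensible(pneumonia))
```

The nonsense code passes the axis check because nothing in the facets
themselves says how values relate; only the explicit rules reject it.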

45
Generation 2: Enumerated Compositional
  • Read III with qualifiers
  • Inflammation: site lung, cause pneumococcus →
    Pneumococcal pneumonia
  • More semantics but
  • Limited qualifiers - limited views - limited
    re-use
  • Limited help with data capture - User interface
    difficult
  • Much information still implicit - limited
    software support
  • No way to recognise redundancy / equivalence /
    errors
  • Organisation still mixed - indexing better but
    still unreliable
  • Limited separation of language and concepts
  • Still can't cope with fractal knowledge
  • Limited extensibility; limited help with building
    and reorganising terminologies
  • Top down

46
Logic Based Ontologies: The basics
Primitive skeleton
Descriptions
Definitions
Reasoning
Validating
Thing
red partOf Heart
red partOf Heart
(feature pathological)
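
The labels above (primitive skeleton, definitions, reasoning) can be
illustrated with a toy subsumption check. This is a deliberately small
sketch, not a description logic reasoner; the concept and role names are
hypothetical and are not taken from GALEN or any other ontology.

```python
# Primitive skeleton: child -> parent, asserted by hand.
PRIMITIVE_PARENT = {
    "Heart": "Organ",
    "Organ": "Thing",
    "Disorder": "Thing",
}

def is_a(child, ancestor):
    """Walk up the primitive skeleton looking for the ancestor."""
    while child is not None:
        if child == ancestor:
            return True
        child = PRIMITIVE_PARENT.get(child)
    return False

# Definitions: a base concept plus role restrictions {role: filler}.
DEFINED = {
    "OrganDisorder": ("Disorder", {"hasLocation": "Organ"}),
    "HeartDisorder": ("Disorder", {"hasLocation": "Heart"}),
}

def subsumes(general, specific):
    """Reasoning: general subsumes specific if the base and every role
    restriction of general is met by something at least as specific."""
    g_base, g_roles = DEFINED[general]
    s_base, s_roles = DEFINED[specific]
    return is_a(s_base, g_base) and all(
        role in s_roles and is_a(s_roles[role], filler)
        for role, filler in g_roles.items()
    )

print(subsumes("OrganDisorder", "HeartDisorder"))
print(subsumes("HeartDisorder", "OrganDisorder"))
```

Nobody asserted that a heart disorder is an organ disorder; the
classification follows from the definitions and the primitive skeleton.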
47
CT Vocabulary
  • Reference Terminology vs Interface
    Terminologies
  • Reference terminology: enumerated hierarchy of
    formally defined terms
  • Interface terminology: navigation structure for
    user interface
  • Explicitly excluded from SNOMED-RT
  • Terming, Coding, and Grouping
  • Terming - finding the lexical string
  • Coding - finding the correct unique code
    (concept)
  • Grouping - putting codes into groupers for
    epidemiological or other purposes
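
The three steps can be sketched as three separate lookups. All terms,
codes ("K100", "K200") and grouper names below are hypothetical and exist
only to show that the steps are distinct.

```python
# Hypothetical synonym table, code table and groupers.
TERMS = {"heart attack": "myocardial infarction",
         "mi": "myocardial infarction"}
CODES = {"myocardial infarction": "K100", "angina": "K200"}
GROUPERS = {"ischaemic heart disease": {"K100", "K200"}}

def terming(text):
    """Terming: find the lexical string (normalise, resolve synonyms)."""
    key = text.strip().lower()
    return TERMS.get(key, key)

def coding(term):
    """Coding: find the unique code (concept) for the term."""
    return CODES[term]

def grouping(code):
    """Grouping: find every grouper that contains the code."""
    return sorted(name for name, members in GROUPERS.items()
                  if code in members)

code = coding(terming(" Heart attack "))
print(code, grouping(code))
```

Keeping the three tables apart means a grouper can change for a new
report without touching the codes, and a synonym can be added without
touching either.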

48
Generation 2.5: Pre-coordinated Formal Compositions
  • SNOMED-CT
  • Formal collaboration between College of American
    Pathologists (CAP/SNOMED) and NHS
  • Formal logical model for classifying a fixed list
    of definitions
  • Simple fixed ontology (7 links)
  • Now officially adopted and probably available for
    both NHS and related academic uses
  • GALEN derived terminologies
  • UK Drug Ontology
  • Procedure classifications

49
Generation III
  • Fully compositional, post-coordinated
  • Not yet in use or fully available
  • GALEN-like
  • Will probably arrive with Semantic Web

50
Other Key Resources
  • Anatomy
  • Digital Anatomist Foundational Model of Anatomy
  • University of Washington
    (http://sig.biostr.washington.edu/projects/da/)
  • Comprehensive model of STRUCTURAL anatomy
  • Transformed into formal representation in
    Freiburg
  • Feasibility rather than production
  • Mouse
  • The Edinburgh Mouse Atlas Project
    (http://genex.hgu.mrc.ac.uk/)
  • Bioinformatics
  • GO - The Gene Ontology
  • MGED - Microarray Gene Expression Data
  • OMIM - Online Mendelian Inheritance in Man
  • Drugs
  • Proprietary databases: First Databank, Micromed
  • UK Drug Dictionary (UKCPRS)
  • National Cancer Institute CaCore Ontologies

51
Current Status (1)
  • UMLS is the central coordinating force
  • Any terminology needs links to CUIs and
    LUIs
  • Many people using CLASS I licensed terms only
  • Links to MeSH and PubMed
  • ICD9/10-CM used for reporting of diseases for
    insurance and Medicare in the US
  • ICD-10 used for official reporting in UK
  • CPT and OPCS used for reporting of procedures in
    US and UK respectively
  • SNOMED-CT purchased by US and mandated in UK
  • As yet few convincing

52
Current Status (2)
  • ICPC widely used in primary care on the continent,
    especially in the Netherlands
  • LOINC used for lab systems; HL7 for messaging
  • Variants of SNOMED used for pathology in many
    places
  • Many specialist systems
  • SNOMED-DICOM-Microglossary (SDM) for imaging
  • Unrelated to SNOMED
  • Several nursing systems
  • A variety of open source resources appearing

53
Current Status (3)
  • Commercial world dominated by proprietary systems
  • MedCin
  • All based on Model of Use

54
The Semantic Web and OWL
  • Ontologies: fancy word for terminologies
  • Means many things to many people
  • W3C has produced a standard language for
    compositional logic based ontologies, OWL
  • OIL + DAML → DAML+OIL → OWL
  • See oiled.man.ac.uk
  • See www.co-ode.org
  • See http://www.w3.org/2001/sw/WebOnt/
  • Rapid proliferation of open source tools and
    resources
  • No longer a biomedical problem only
  • Serious computer scientists finally involved