Computational Linguistics for Referent Tracking in Electronic Healthcare Records: a research agenda CogSCI Colloquium Oct 19, 2005 - PowerPoint PPT Presentation

About This Presentation
Description:

Title: PowerPoint-presentatie Author: werner Last modified by: werner Created Date: 3/22/2005 12:37:51 PM Document presentation format: Diavoorstelling – PowerPoint PPT presentation

Slides: 67
Transcript and Presenter's Notes


1
Computational Linguistics for Referent Tracking
in Electronic Healthcare Records: a research
agenda. CogSCI Colloquium, Oct 19, 2005
  • Dr. W. Ceusters
  • European Centre for Ontological Research
  • Saarland University, Saarbrücken - Germany

2
Presentation overview
  • ECOR and me
  • The Electronic Health Record (EHR)
  • Problems with terminologies and their use in the
    EHR
  • Realist ontology
  • Referent Tracking
  • Opportunities for computational linguistics

3
European Centre for Ontological Research
4
ECOR's members and partners
External members
Local members
Partners
Status Oct 2, 2005
5
Goals and objectives
  • sustained and coordinated collaboration with
    institutions with proven track record of
    excellence in ontological research and in the
    application of ontology to solve concrete
    problems.
  • interdisciplinary approach based on philosophical
    rigour
  • exchange of research personnel for short research
    visits
  • participation in joint projects,
  • joint supervision of doctoral research,
  • joint production of software and authorship of
    research papers
  • collaborate in seeking funding at national and
    international levels for ontology-related
    research and development activities

6
Recently also in the US
7
Short personal history
8
The Electronic Health Record
9
Current US Government eHealth goals and strategies
  • Goal 1: Inform Clinical Practice.
      • S1. Provide incentives for EHR adoption.
      • S2. Reduce risk of EHR investment.
      • S3. Promote EHR diffusion in rural and
        underserved areas.
  • Goal 2: Interconnect Clinicians.
      • S1. Regional collaborations.
      • S2. Develop a national health information
        network.
      • S3. Coordinate federal health information
        systems.
  • Goal 3: Personalize Care.
      • S1. Encourage use of Personal Health Records.
      • S2. Enhance informed consumer choice.
      • S3. Promote use of telehealth systems.
  • Goal 4: Improve Population Health.
      • S1. Unify public health surveillance
        architectures.
      • S2. Streamline quality and health status
        monitoring.
      • S3. Accelerate research and dissemination of
        evidence.

US Department of Health and Human Services,
July 21, 2004
10
Electronic Health Record
  • ISO/TS 18308:2003
  • Electronic Health Record (EHR): a repository of
    information regarding the health of a subject of
    care, in computer processable form.
  • EHR system: the set of components that form the
    mechanism by which electronic health records are
    created, used, stored, and retrieved. It includes
    people, data, rules and procedures, processing
    and storage devices, and communication and
    support facilities.
  • More common meaning of "EHR system": only the
    software being executed

11
The Medical Informatics dogma
  • To structure or NOT to be
  • Fact: computers can only deal with a structured
    representation of reality
      • structured data: relational databases,
        spreadsheets
      • structured information: XML simulates context
      • structured knowledge: rule-based knowledge
        systems
  • Conclusion: a need for structured data
    entry (???)

12
Example of data entry form
www.comchart.com
13
Structured EHR data entry
  • Current technical solutions
  • Data entry forms
  • provide the structure
  • various paradigms
  • Rigid, pre-fixed
  • Adaptable to user-preferences, but fixed when
    used
  • Dynamically adapting to entered data in context
  • Terminologies, coding and classification systems
  • provide the language to be used
  • Exchange of information preserving meaning
  • Statistics and epidemiology

14
The International Classification of Diseases
(WHO)
  • ...
  • Chapter II Neoplasms (C00-D48)
  • Chapter III Diseases of the Blood and
    Blood-forming organs and certain disorders
    involving the immune mechanism (D50-D89)
  • Excludes auto-immune disease (systemic) NOS
    (M35.9)
  • ....
  • Nutritional Anemias (D50-D53)
  • D50 Iron deficiency anaemia
  • Includes ...
  • D50.0 Iron deficiency anaemia secondary
    to blood loss (chronic)
  • Excludes ...
  • D50.1 ...
  • D51 Vit B12 deficiency anaemia
  • Haemolytic Anemias (D55-D59)
  • ...
  • Chapter IV ...

15
The alphabetic index of ICD-9-CM
  • hydrops 782.3
  • abdominis 789.5
  • amnii (complicating pregnancy)
  • (see also hydramnios) 657
  • congenital - see Hydrops, fetalis
  • fetal(is) or new-born 778.
  • due to iso-immunisation 773.3
  • not due to iso-immunisation 778.0
  • meningeal NEC 331.4
  • pericardium - see Pericarditis

16
Snomed International (1995): number of records
(V3.1)
  • T Topography 12,385
  • M Morphology 4,991
  • F Function 16,352
  • L Living Organisms 24,265
  • C Drugs and Biological Products 14,075
  • A Physical Agents, Forces and Activities
    1,355
  • D Disease/ Diagnosis 28,623
  • P Procedures 27,033
  • S Social Context 433
  • J Occupations 1,886
  • G General Modifiers 1,176
  • TOTAL RECORDS 132,641

17
Snomed International (1995): knowledge in the
codes
  • leaflet posterior
  • anatomic
  • mitral
  • cardiac valve
  • cardiovascular

18
Snomed International: multiple ways to express
the same thing
  • D5-46210 Acute appendicitis, NOS
  • D5-46100 Appendicitis, NOS
  • G-A231 Acute
  • M-41000 Acute inflammation, NOS
  • G-C006 In
  • T-59200 Appendix, NOS
  • G-A231 Acute
  • M-40000 Inflammation, NOS
  • G-C006 In
  • T-59200 Appendix, NOS
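
The four expressions on this slide can be compared mechanically once every pre-coordinated code is expanded into its parts. What follows is a minimal sketch in Python. The `DECOMPOSITION` table is an assumption that merely restates the equivalences implied by the slide; it is not taken from the SNOMED distribution, and `normalise` is a hypothetical helper.

```python
# Hypothetical decomposition table, read off the slide; not an official SNOMED resource.
DECOMPOSITION = {
    "D5-46210": {"D5-46100", "G-A231"},            # Acute appendicitis = Appendicitis + Acute
    "D5-46100": {"M-40000", "G-C006", "T-59200"},  # Appendicitis = Inflammation + In + Appendix
    "M-41000": {"G-A231", "M-40000"},              # Acute inflammation = Acute + Inflammation
}


def normalise(codes):
    """Expand pre-coordinated codes until only atomic codes remain."""
    atomic = set()
    pending = list(codes)
    while pending:
        code = pending.pop()
        if code in DECOMPOSITION:
            pending.extend(DECOMPOSITION[code])
        else:
            atomic.add(code)
    return frozenset(atomic)


# The four ways of saying "acute appendicitis" listed on the slide.
EXPRESSIONS = [
    ["D5-46210"],
    ["D5-46100", "G-A231"],
    ["M-41000", "G-C006", "T-59200"],
    ["G-A231", "M-40000", "G-C006", "T-59200"],
]

normal_forms = {normalise(expr) for expr in EXPRESSIONS}
print(len(normal_forms))  # 1: all four expressions reduce to the same set of atomic codes
```

Treating an expression as an unordered set of codes ignores which modifier attaches to which term, so this only illustrates the redundancy; it is not a general equivalence test.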

19
The search for internal formal consistency
medSORT-II (Evans & Hersh, 93)
  • no pin-prick sensation in calf >
  • <neuro-sensation-mx>
  • <method> <pin-prick-test>
    pin-prick
  • <locus> <body-region> calf
  • <result> <eval-attr> <attr>
    sensation
  • <value> absent

20
UMLS: Unified Medical Language System (NLM)
  • Tool for information retrieval, made up of 4
    components
  • Metathesaurus: contains information about
    biomedical concepts and how they are represented
    in diverse terminological systems.
  • Semantic Network: contains information about
    concept categories and the permissible
    relationships among them
  • Information Sources Map: contains both
    human-readable and machine-processable
    information about all kinds of biomedical
    terminological systems
  • SPECIALIST Lexicon: English words with
    part-of-speech (POS) information

21
UMLS Semantic Network
22
Main problems
  • Internal and external consistency of
    terminologies.
  • What do the terms in a terminology stand for ?

23
Problems with terminologies (1)
24
Problems with terminologies (2)
  • ventricle used in 2 different meanings

25
Problems with terminologies (3)
  • Mixing of differentiae
  • Ontological nonsense

26
Problems with terminologies (4)
Incomplete classification
27
Previous work
  • Many of these deficiencies can be identified,
    corrected or prevented by doing the right sort of
    ontology using a proper tool.
  • SNOMED-CT
  • NCIT
  • UMLS Semantic Network
  • But this is NOT the topic of this presentation

28
What's wrong with the current use of terminologies
(and) ontologies in the EHR?
29
Current mainstream thinking
30
The story of Jane Smith: an old case, well known
in the literature ...
31
July 4th, 1990: Jane goes shopping
32
A visit to the hospital
  • City Health Centre Dr. Peters
  • (City HC) Dr. Longley

33
Diagnosis: a severe spiral fracture of the femur
34
City HC's representation formalism (for statements
in records)
Categories: represent concepts and are analogous
to classes in other formalisms
Individuals: concrete instances of categories
which persist in space and time
Occurrences: specific occurrences of
individuals, which must be situated in space and
time. The most important group of occurrences are
observations, i.e. agents' observations of
individuals.
Rector AL, Nowlan WA, Kay S, Goble CA, Howkins
TJ. A framework for modelling the electronic
medical record. Methods Inf Med. 1993
Apr;32(2):109-19.
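
The three notions of this formalism map naturally onto a small class hierarchy. The sketch below is one possible reading in Python, not the data model of Rector et al.; the class and field names are assumptions chosen for illustration, and the example values come from the Jane Smith story on the surrounding slides.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Category:
    """A concept, analogous to a class in other formalisms."""
    name: str


@dataclass(frozen=True)
class Individual:
    """A concrete instance of a category that persists in space and time."""
    label: str
    category: Category


@dataclass(frozen=True)
class Occurrence:
    """A specific occurrence of an individual, situated in space and time."""
    individual: Individual
    place: str
    when: date


@dataclass(frozen=True)
class Observation(Occurrence):
    """An occurrence in which an agent observes an individual."""
    agent: str


fracture = Individual("Jane Smith's fracture", Category("fracture of femur"))
obs = Observation(fracture, "City HC", date(1990, 7, 4), agent="Dr. Peters")
print(obs.individual.category.name)  # fracture of femur
```

Note that the record statement is attached to the observation, while the fracture itself is only reachable through it; this is the kind of indirection the later slides criticise.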
35
A look at the database: use of SNOMED codes for
unambiguous understanding
How many numerically different disorders are
listed here ?

How many different types of disorders are listed
here ?

How many disorders have patients 5572, 2309 and
298 each had thus far in their lifetime ?

cause, not disorder
36
Would it be easier if you could see the code
labels?
Patient  Date        Code       Label
5572     04/07/1990  79001      Essential hypertension
0939     24/12/1991  255174002  benign polyp of biliary tract
2309     21/03/1992  26442006   closed fracture of shaft of femur
0939     20/12/1998  255087006  malignant polyp of biliary tract
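
The rows listed on this slide can be loaded into Python to show what a code-based count can and cannot deliver. This is a minimal sketch; the tuple layout (patient, date, code, label) is an assumption about what the four values in each row mean.

```python
from collections import defaultdict

# (patient id, date, code, label) as listed on the slide
RECORDS = [
    ("5572", "04/07/1990", "79001", "Essential hypertension"),
    ("0939", "24/12/1991", "255174002", "benign polyp of biliary tract"),
    ("2309", "21/03/1992", "26442006", "closed fracture of shaft of femur"),
    ("0939", "20/12/1998", "255087006", "malignant polyp of biliary tract"),
]

# Count the distinct codes recorded for each patient.
codes_per_patient = defaultdict(set)
for patient, _date, code, _label in RECORDS:
    codes_per_patient[patient].add(code)

for patient, codes in sorted(codes_per_patient.items()):
    print(patient, len(codes))
```

Patient 0939 gets a count of 2, but the rows alone do not say whether the benign and the malignant polyp are one polyp recorded at two moments or two numerically different polyps. Counting codes counts types of description, not disorders.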
37
A look at the problems ...
38
Main problem areas for City HC's EHR
  • Statements refer only very implicitly to the
    concrete entities about which they give
    information.
  • Idiosyncrasies of concept-based terminologies
  • They tell us only that some instance of the class
    the codes refer to is referred to in the
    statement, but not precisely which instance.
  • Are usually confused about classes and
    individuals.
  • Country and Belgium.
  • Mixing up the act of observation and the thing
    observed.
  • Mixing up statements and the entities these
    statements refer to.

39
Consequences
  • Very difficult to
  • Count the number of (numerically) different
    diseases
  • Bad statistics on incidence, prevalence, ...
  • Bad basis for health cost containment
  • Relate (numerically same or different) causal
    factors to disorders
  • Dangerous public places (specific work floors,
    swimming pools),
  • dogs with rabies,
  • HIV contaminated blood from donors,
  • food from unhygienic source, ...
  • Hampers prevention
  • ...

40
Proposed solution: Referent Tracking
  • Purpose
  • explicit reference to the concrete individual
    entities relevant to the accurate description of
    each patient's condition, therapies, outcomes,
    ...
  • Method
  • Introduce an Instance Unique Identifier (IUI) for
    each relevant individual (particular,
    instance).
  • Distinguish between
  • IUI assignment for instances that do exist
  • IUI reservation for entities expected to come
    into existence in the future
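
The distinction between assignment and reservation can be made concrete with a small in-memory store. The sketch below is only an illustration in Python: `IUIRepository`, its status strings, and the `confirm` step are assumptions, and the planned surgery is a made-up example rather than part of the case on the slides.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class IUIRepository:
    """Hypothetical in-memory store; maps each IUI to a status and a description."""
    entries: dict = field(default_factory=dict)

    def _new(self, description, status):
        iui = f"IUI-{uuid.uuid4()}"
        self.entries[iui] = {"description": description, "status": status}
        return iui

    def assign(self, description):
        """IUI assignment: the particular already exists."""
        return self._new(description, "assigned")

    def reserve(self, description):
        """IUI reservation: the entity is expected to exist in the future."""
        return self._new(description, "reserved")

    def confirm(self, iui):
        """Turn a reservation into an assignment once the entity exists."""
        if self.entries[iui]["status"] != "reserved":
            raise ValueError(f"{iui} is not a reservation")
        self.entries[iui]["status"] = "assigned"


repo = IUIRepository()
fracture = repo.assign("Jane Smith's fracture of the femur")
surgery = repo.reserve("planned surgical repair of that fracture")  # made-up example
repo.confirm(surgery)  # to be called once the surgery has actually taken place
```

Keeping the status next to the identifier lets statements about a planned act be recorded before the act exists, without pretending that it already does.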

41
Ontology
  • Ontology: the study of being as a science
  • An ontology is a representation of some
    pre-existing domain of reality which
  • (1) reflects the properties of the objects within
    its domain in such a way that there obtains
    a systematic correlation between reality and the
    representation itself,
  • (2) is intelligible to a domain expert
  • (3) is formalized in a way that allows it to
    support automatic information processing
  • ontological (as adjective)
  • Within an ontology.
  • Derived by applying the methodology of ontology
  • ...

42
An ontological analysis
continuants
43
Ontological recategorisation
Jane Smith's consultation with Dr. Peters
at City HC on 4th July 1990
Dr. Peters' assessment of Jane Smith's fracture
of femur at City HC on 4th July 1990
44
Essentials of Referent Tracking
  • Generation of universally unique identifiers
  • deciding what particulars should receive an IUI
  • finding out whether or not a particular has
    already been assigned an IUI (each particular
    should receive at most one IUI)
  • using IUIs in the EHR, i.e. issues concerning the
    syntax and semantics of statements containing
    IUIs
  • determining the truth values of statements in
    which IUIs are used
  • correcting errors in the assignment of IUIs.

45
IUI assignment
  • an act carried out by the first cognitive
    agent feeling the need to acknowledge the
    existence of a particular it has information
    about by labelling it with a UUID.
  • cognitive agent
  • A person
  • An organisation
  • A device or software agent, e.g.
  • Bank note printer,
  • Image analysis software.

46
Criteria for IUI assignment (1)
  • The particular's existence must be determined
  • Easy for persons in front of you, body parts, ...
  • Easy for planned acts: they do not exist before
    the plan is executed!
  • Only the plan exists and possibly the statements
    made about the future execution of the plan
  • More difficult: subjective symptoms
  • But the statements the patient makes about them
    do exist !
  • However
  • no need to know what the particular exactly is,
    i.e. which universal it instantiates
  • No need to be able to point to it precisely
  • One bee out of a particular swarm that stung the
    patient, one pain out of a series of pain attacks
    that made the patient worried
  • But this is not a matter of choice, not any
    out of ...

47
Criteria for IUI assignment (2)
  • The particular's existence must not already have
    been determined as the existence of something
    else
  • Morning star and evening star
  • Himalaya
  • Multiple sclerosis
  • It must not already have been assigned an IUI.
  • It must be relevant to do so
  • Personal decision, (scientific) community
    guideline, ...
  • Possibilities offered by the EHR system
  • If an IUI has been assigned by somebody, everybody
    else making statements about the particular
    should use it
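
The rule that a particular receives at most one IUI amounts to "look up before you assign". The Python sketch below assumes a hypothetical `Registry` keyed on names, which is a strong simplification: deciding that two descriptions denote the same particular is the hard part and is reduced here to an explicit `add_alias` call.

```python
import uuid


class Registry:
    """Hypothetical registry that hands out at most one IUI per particular."""

    def __init__(self):
        self._iui_by_name = {}

    def lookup(self, name):
        """Return the IUI already known under this name, or None."""
        return self._iui_by_name.get(name.lower())

    def assign(self, name):
        """Return the existing IUI if the particular is already known, else a new one."""
        existing = self.lookup(name)
        if existing is not None:
            return existing
        iui = f"IUI-{uuid.uuid4()}"
        self._iui_by_name[name.lower()] = iui
        return iui

    def add_alias(self, alias, name):
        """Record that `alias` denotes a particular that already has an IUI."""
        self._iui_by_name[alias.lower()] = self._iui_by_name[name.lower()]


registry = Registry()
venus = registry.assign("morning star")
registry.add_alias("evening star", "morning star")
print(registry.assign("evening star") == venus)  # True: no second IUI is created
```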

48
Representation in the EHR
  • Relevant particulars referred to using IUIs
  • Relationships that obtain between particulars at
    time t expressed using relations from an ontology
    (type OBO)
  • Statements describing for each particular, at
    time t
  • Of what universal from an ontology it is an
    instance
  • AND/OR (if one insists)
  • By means of what concept from a concept-based
    system it can sensibly be described
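
The two kinds of statement named on this slide, relationships between particulars and descriptions of a particular, can be sketched as two record types. The Python below is an illustration only: the type names, field names, IUI labels and dates are assumptions, not the format used by any actual referent tracking system.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """A relationship between two particulars that obtains at time t."""
    subject: str   # IUI
    relation: str  # relation name taken from an ontology
    obj: str       # IUI
    t: str         # ISO date


class InstanceOf(NamedTuple):
    """Particular `iui` instantiates `universal` (taken from `source`) at time t."""
    iui: str
    universal: str
    source: str    # an ontology or, if one insists, a concept-based system
    t: str


# Illustrative statements; the IUI labels are made up.
statements = [
    InstanceOf("IUI-1", "person", "ontology", "1990-07-04"),
    InstanceOf("IUI-2", "fracture of femur", "ontology", "1990-07-04"),
    Relation("IUI-2", "is-located-in", "IUI-1", "1990-07-04"),
]


def about(iui, stmts):
    """All statements that mention the given particular in any position."""
    return [s for s in stmts
            if iui in (getattr(s, "iui", None),
                       getattr(s, "subject", None),
                       getattr(s, "obj", None))]


print(len(about("IUI-1", statements)))  # 2
```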

particulars
49
Pragmatics of IUIs in EHRs
  • IUI assignment requires an additional effort
  • In principle, no more effort (or just a little
    more) than directly using codes from
    concept-based systems
  • A search for concept-codes is replaced by a
    search for the appropriate IUI using exactly the
    same mechanisms
  • Browsing
  • Code-finder software
  • Auto-coding software (CLEF NLP software, Andrea
    Setzer)
  • With that IUI comes a wealth of already
    registered information
  • If different IUIs apply for the same patient, the
    user must decide which one is the one under
    scrutiny, or whether it is yet another new
    instance
  • A transfer or reference mechanism makes the
    statements visible through the RTDB

50
Advantage: better reality representation
IUI-003
51
Other Advantages
  • mapping as by-product of tracking
  • Descriptions about the same particular using
    different ontologies/concept-based systems
  • Quality control of ontologies and concept-based
    systems
  • Systematically inconsistent descriptions within
    or across terminologies may indicate poor
    definition of the respective terms
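
Mapping as a by-product can be sketched by grouping descriptions on the IUI they are about: terms from different systems that repeatedly describe the same particular are candidate mappings. The data in this Python sketch are illustrative, and `system-A` / `system-B` are placeholder names rather than real terminologies.

```python
from collections import defaultdict
from itertools import combinations

# (IUI, coding system, term): illustrative descriptions, not real record data
DESCRIPTIONS = [
    ("IUI-7", "system-A", "fracture of femur"),
    ("IUI-7", "system-B", "closed fracture of shaft of femur"),
    ("IUI-9", "system-A", "fracture of femur"),
    ("IUI-9", "system-B", "closed fracture of shaft of femur"),
    ("IUI-12", "system-A", "essential hypertension"),
]

terms_by_iui = defaultdict(set)
for iui, system, term in DESCRIPTIONS:
    terms_by_iui[iui].add((system, term))

# Count how often two terms from different systems describe the same particular.
co_described = defaultdict(int)
for terms in terms_by_iui.values():
    for a, b in combinations(sorted(terms), 2):
        if a[0] != b[0]:
            co_described[(a, b)] += 1

for (a, b), n in co_described.items():
    print(a, "<->", b, n)
```

The same grouping supports quality control: a pair of terms that should exclude each other but keeps turning up on the same IUI points at a poorly defined term.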

52
How to make this practical for the text-based
parts of an EHR?
  • Referent tracking
  • in the linguistic sense !

53
The problem summarised
  • natural language is the only medium that is able
    to communicate clinical information about
    individual patients without loss of necessary
    detail
  • (virtual) structured data repositories are
    required to make subsequent analyses possible
  • any transformation from free language to coding
    and classification systems results in information
    loss that is unacceptable for individual patient
    care, yet such a transformation is a conditio
    sine qua non for population-based studies
  • today's graphical user interfaces can deal
    reasonably well with pick lists built around
    controlled vocabularies that fulfil a bridging
    function from free language towards coding and
    classification systems, but are incompatible with
    referent tracking

54
The ultimate scenario
Ontology
continuant
disorder
person
CAG repeat
EHR
Juvenile HD
IUI-1 affects IUI-2 IUI-3 affects
IUI-2 IUI-1 causes IUI-3
Referent Tracking Database
55
A case study
  • Goals
  • Demonstrate the application of referent tracking
    to a concrete patient story
  • Familiarise you with the ontological analysis
    of what is involved
  • Understand the actions an NLU algorithm has to
    perform when transforming (running) text into a
    series of IUI-assertions (information
    extraction)
  • Interest the computational linguists among you
    in embarking on joint projects with us.

56
Jim Cimino's Woods Hole case
  • Jane Smith is a 30 year old, Native American
    female who presents to the emergency room with
    the chief complaint of cough and chest pain.
  • The patient reports that she has had a productive
    cough for three days but that chest pain
    developed one hour ago. She gives a history of
    hypertension. She also reports that she was
    treated in the past for tuberculosis while she
    was pregnant. The patient reports an allergy to
    Bufferin.
  • Physical examination revealed a well-developed,
    well-nourished female in moderate respiratory
    distress. Vital signs showed a pulse of 90, a
    respiratory rate of 22, an oral temperature of
    100.3, and a blood pressure of 150/100.
    Examination revealed rales and rhonchi in the
    left upper chest. Abdominal exam revealed a
    tender, palpable liver edge.
  • Labs:
  • Chem7 (serum): Glucose 100 (70-105)
  • Chem7 (plasma): Glucose 150 (75-110)
  • CBC: Hgb 15 (12.0-15.8), Hct 45 (42.4-48.0),
    WBC 11,000 (3,540-9,060), Platelets 145,000
    (165,000-415,000)
  • A fingerstick blood sugar was 80.
  • Urinalysis showed protein of 1 and glucose of 0.
  • A blood culture was positive for
    methicillin-resistant Staphylococcus aureus
    (MRSA).

57
case study continued ...
  • ECG: Sinus Rhythm, 74 BPM, Axis -30 degrees, ST
    segment 2mm elevated and T-waves down in leads I,
    L, V5 and V6
  • Chest X-ray: Left upper lobe infiltrate, left
    ventricular hypertrophy
  • The patient's nurse reported that the patient
    seemed depressed about her condition. On
    questioning, the nurse found that the patient was
    caring for her elderly father and was concerned
    that she would no longer be able to manage caring
    for herself and him. The nurse asked the
    patient's physician to consider an antidepressant
    and a social work consult.
  • A medical student reviewing the case is concerned
    about the risk of MRSA in patients with pneumonia
    and a recent myocardial infarction. She decides
    to do a literature search.

58
Step 1: identify the phrases referring to
particulars
  • Jane Smith is a 50 year old,
  • Native American female who presents
  • to the emergency room
  • with the chief complaint
  • of cough and chest pain.

59
Step 2: identify which particulars these
phrases refer to
60
Compare with simple clinical coding in
juxtaposition
61
Compare with the output of the perfect semantic
analyser we all would dream of
Compare with the output of the NAIVE !!! semantic
analyser we all would dream of
CS3-complaining
62
What it (more or less) should be
chest-pain
CS3-complaining
Has-Saying
Has-referent
CS3-chest pain
Has-Saying
coughing
Has-referent
CS3-coughing
63
Most important difference
Use of generic terms
Use of concrete particulars
64
Step 3: are relevant and necessary particulars
missing?
  • Referred to
  • Jane Smith
  • Jane Smith's age
  • Jane Smith's race
  • Jane Smith's gender
  • Jane Smith's showing up at ...
  • The specific emergency room in the health
    facility
  • Jane Smith's primarily complaining ...
  • The temporal part ... coughs
  • Jane Smith's chest
  • Jane Smith's particular pain
  • Missing
  • The health facility
  • The healthcare worker she consulted
  • The particular coughs (under the condition she
    tells the objective truth)
  • The underlying disorder (under whatever state of
    affairs)

65
Step 4: IUI assignment
  • Assumptions
  • the RTS already contains
  • IUI-1: Jane Smith
  • Coi <IUIa, ta, CS3, IUI-1, woman, tr>
  • IUI-1.1: Ri <IUIa, ta, depends-on, BFO,
    IUI-1.1, IUI-1, tr>
  • Coi <IUIa, ta, CS1, IUI-1.1, age, tr>
  • IUI-1.2: Coi <IUIa, ta, CS1, IUI-1.2,
    cherokee, tr>
  • Ri <IUIa, ta, depends-on, BFO, IUI-1.2,
    IUI-1, tr>
  • IUI-1.3: Coi <IUIa, ta, CS3, IUI-1.3, chest
    pain, tr>
  • Ri <IUIa, ta, is-located-in, BFO, IUI-1.3,
    IUI-1, tr>
  • All dates in the statements are 2 years earlier
    than now
  • What to do with
  • Jane Smith
  • Jane Smith's race (CS1 native American)
  • Jane Smith's gender (CS1 female)
  • Jane Smith's chest pain (CS3 chest pain)
  • Jane Smith's age (50)
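
The assumed content of the RTS can be written down as typed tuples and searched for existing IUIs before any new one is assigned. In the Python sketch below the field names are my reading of the positions in the slide's Coi and Ri tuples (author, assertion time, system or relation, particular(s), concept, time at which it holds); they are assumptions, as is the `candidates` helper.

```python
from typing import NamedTuple, Tuple


class Co(NamedTuple):
    """Concept description of a particular (the slide's Coi tuples)."""
    author: str
    t_asserted: str
    system: str
    particular: str
    concept: str
    t_holds: str


class R(NamedTuple):
    """Relation between particulars (the slide's Ri tuples)."""
    author: str
    t_asserted: str
    relation: str
    ontology: str
    particulars: Tuple[str, ...]
    t_holds: str


# Assumed prior content of the RTS, as listed on the slide.
RTS = [
    Co("IUIa", "ta", "CS3", "IUI-1", "woman", "tr"),
    R("IUIa", "ta", "depends-on", "BFO", ("IUI-1.1", "IUI-1"), "tr"),
    Co("IUIa", "ta", "CS1", "IUI-1.1", "age", "tr"),
    Co("IUIa", "ta", "CS1", "IUI-1.2", "cherokee", "tr"),
    R("IUIa", "ta", "depends-on", "BFO", ("IUI-1.2", "IUI-1"), "tr"),
    Co("IUIa", "ta", "CS3", "IUI-1.3", "chest pain", "tr"),
    R("IUIa", "ta", "is-located-in", "BFO", ("IUI-1.3", "IUI-1"), "tr"),
]


def candidates(system, concept, bearer):
    """IUIs already described by (system, concept) and related to `bearer`."""
    related = {r.particulars[0] for r in RTS
               if isinstance(r, R) and r.particulars[1] == bearer}
    return [c.particular for c in RTS
            if isinstance(c, Co) and (c.system, c.concept) == (system, concept)
            and c.particular in related]


print(candidates("CS3", "chest pain", "IUI-1"))       # ['IUI-1.3']
print(candidates("CS1", "native American", "IUI-1"))  # []
```

A hit is only a candidate: the chest pain recorded two years ago need not be the chest pain that developed an hour ago, so someone (or some NLU component) still has to choose between reusing IUI-1.3 and assigning a new IUI. The empty result for "native American" shows the opposite problem: IUI-1.2 is described as "cherokee", so an exact match on the concept label misses it.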

66
Conclusion
  • Referent tracking can solve a number of problems
    in an elegant way.
  • Existing (or emerging) technologies can be used
    for the implementation.
  • Old technologies (cbs) can play an interesting
    role.
  • A Big Brother feeling is to be expected, but is
    easy to counter with adequate measures.
  • The proof of the pudding is in the eating
  • A pilot is going to be set up
  • Collaboration sought for dealing with NLU