OntoBasis Project 4th General Meeting, July 7, 2003

1
OntoBasis Project, 4th General Meeting, July 7, 2003
  • CNTS Universiteit Antwerpen
  • Marie-Laure Reinberger
  • Walter Daelemans

2
Outline
  • Task 1.1 - Adapting text analysis tools for lexon
    extraction
  • Task 1.2 - Evaluation of ontologies
  • Task 1.3 - Adaptation of ontologies
  • Task 2.1 - Overview of existing ontology
    (meta-)models
  • Prospects

3
Approach
  • Retrieving semantic information using:
    • syntactic structures
    • unsupervised methods
    • domain-specific corpora

4
Data: two corpora extracted from Medline
  • Keywords: hepatitis A, hepatitis B
    • Size: approx. 4 million words
    • Very specific
  • Keyword: blood
    • Size: approx. 7 million words
    • More general

5
Improving the syntactic analysis
  • Increasing the shallow parser lexicon with
    medical terms
  • Training the parser on specific sentences taken
    from Medline

6
Clustering
  • Unsupervised method that consists of
    partitioning a set of objects into groups or
    clusters, depending on the similarity between
    those objects
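
The definition above can be illustrated with a minimal sketch. It assumes each object is a term described by the set of verbs it co-occurs with, that similarity is Jaccard overlap, and that a greedy single-link pass is good enough for illustration. The toy data, the 0.5 threshold, and the function names are all assumptions, not the project's actual setup.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(features, threshold=0.5):
    """Greedy single-link clustering: an object joins the first cluster
    that contains a member at least `threshold`-similar to it;
    otherwise it starts a new cluster."""
    clusters = []
    for name in sorted(features):
        for group in clusters:
            if any(jaccard(features[name], features[other]) >= threshold
                   for other in group):
                group.append(name)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([name])
    return clusters


# Hypothetical toy data: each term is described by verbs it occurs with.
features = {
    "blood_sample": {"collect", "draw", "analyze"},
    "blood_specimen": {"collect", "draw", "store"},
    "vaccine": {"inject", "administer"},
    "HBV_vaccine": {"inject", "administer", "develop"},
}
clusters = cluster(features)
print(clusters)
```

On this toy input the two blood terms end up in one group and the two vaccine terms in another. A real system would more likely use a proper hierarchical or probabilistic clustering algorithm; the greedy pass here never merges clusters once they are formed.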

7
Pattern Matching
  • Pattern matching consists of finding patterns in
    texts that induce a relation between words, and
    generalizing this pattern to build relations
    between concepts
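
As a rough illustration of the first half of this idea (finding a pattern that induces a relation between words), the sketch below scans chunked text for Noun Phrase, Preposition, Noun Phrase windows. The chunk labels, the input format, and the function name are hypothetical; a real shallow parser will produce a richer structure.

```python
def match_np_prep_np(chunks):
    """Return (term1, preposition, term2) for every NP PREP NP window
    of three consecutive chunks."""
    triples = []
    for (t1, w1), (t2, w2), (t3, w3) in zip(chunks, chunks[1:], chunks[2:]):
        if (t1, t2, t3) == ("NP", "PREP", "NP"):
            triples.append((w1, w2, w3))
    return triples


# Hypothetical chunker output for one sentence: (chunk label, head text).
chunks = [
    ("NP", "transmission"), ("PREP", "of"), ("NP", "hepatitis_B_virus"),
    ("VP", "occurs"), ("PREP", "during"), ("NP", "surgery"),
]
triples = match_np_prep_np(chunks)
print(triples)
```

Only the first three chunks match here, because "during surgery" is preceded by a verb phrase rather than a noun phrase. Generalizing from words to concepts would then mean replacing each matched term by the class it belongs to.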

8
Methods
  • Apply pattern matching and cluster the results
  • Apply pattern matching and combine the results
    with clusters obtained with another method
  • Subject Phrase - Verb - Object Phrase
  • Noun Phrase - Preposition - Noun Phrase
  • Noun Phrases or Terms
  • Nouns

9
Evaluation
  • Former evaluation: WordNet
    • missing words and terms
    • missing relations
  • UMLS: devoted to the medical domain
    • more accurate
    • still missing relations (functional)
    • Example: collect, draw / blood, specimen, sample
    • Relation: blood specimen - blood sample

10
Clusters of terms, hepatitis corpus, before and after
improvement of the shallow parser (SP)
11
  • But!
    • Before: evaluation of 148 pairs, 152 UMLS pairs
      formed
    • After: evaluation of 454 pairs, 381 UMLS pairs
      formed → more reliable
  • And
    • Object relations being more specific → more UMLS
      pairs → more reliable results

12
Clusters of terms
13
Examples
  • Hepatitis
    • liver_transplantation transplantation
      orthotopic_liver_transplantation
    • immunoadsorbent immunosorbent immunoassay
      immunospot immunosorbent_assay
  • Blood
    • day h month hour min
    • Banker den_Hollander Knudsen Tanner

14
Method
  • Clusters C of terms formed according to
    similarity between classes of co-occurring verbs
  • Pattern matching: T1 Prep T2
  • Cluster C_i associated with a preposition Prep_i
    and a set of terms
  • → Relations: C_i Prep C_j
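
One way to read the steps above is sketched below: term-level matches of the form T1 Prep T2 are lifted to cluster-level relations by replacing each term with its cluster. The cluster contents reuse terms that appear in this deck, but the cluster identifiers, the matched triples, and the support count are illustrative assumptions rather than the project's actual procedure.

```python
from collections import Counter


def lift_to_clusters(triples, clusters):
    """Map term-level (T1, Prep, T2) matches to cluster-level relations
    (Ci, Prep, Cj) and count how many term pairs support each one."""
    cluster_of = {term: cid for cid, terms in clusters.items() for term in terms}
    relations = Counter()
    for t1, prep, t2 in triples:
        # Keep a match only if both terms belong to some cluster.
        if t1 in cluster_of and t2 in cluster_of:
            relations[(cluster_of[t1], prep, cluster_of[t2])] += 1
    return relations


# Hypothetical clusters and pattern matches.
clusters = {
    "C1": {"dose", "injection", "vaccination"},
    "C2": {"hepatitis_B_vaccine", "HBV_vaccine", "vaccine"},
}
triples = [
    ("dose", "of", "vaccine"),
    ("injection", "of", "HBV_vaccine"),
    ("vaccination", "with", "vaccine"),
    ("dose", "of", "placebo"),  # skipped: "placebo" is in no cluster
]
relations = lift_to_clusters(triples, clusters)
print(relations)
```

Counting the supporting term pairs gives a simple way to rank candidate relations: here "C1 of C2" is backed by two matches and "C1 with C2" by one.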

15
Examples
  • dose injection vaccination of
    hepatitis_B_vaccine HBV_vaccine vaccine
  • recurrence transmission of infection
    hepatitis_B_virus viral_infection HCV hepatitis_B
    HCV_infection disease HBV HBV_infection
    viral_hepatitis
  • heparin blood_pressure blood blood_loss during
    aortic_surgery operation apostosis surgery
    coronary_angiography hemipathectomy
    coronary_artery_bypass emergency-surgery
    cardiac_surgery surgical_resection hemodialysis
    procedure dialysis transplantation
  • level sign level of platelet molecule
    interleukin cell t_cell CD8 white_blood_cell
    red_blood-cell macrophage risk

16
Patterns: hepatitis vs. blood
17
Outline
  • Task 1.1 - Adapting text analysis tools for lexon
    extraction
  • Task 1.2 - Evaluation of ontologies
  • Task 1.3 - Adaptation of ontologies
  • Task 2.1 - Overview of existing ontology
    (meta-)models
  • Prospects

18
Evaluation of UMLS
  • Pairs of terms not found or not associated in
    UMLS
  • Relation checked on Google, using the mutual
    information measure
  • Discovery of missing terms and relations
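
The slide does not say which mutual information variant was used. The sketch below assumes pointwise mutual information (PMI) estimated from search-engine hit counts; the counts, the index size `total`, and the function name are all hypothetical.

```python
import math


def pmi(hits_x, hits_y, hits_xy, total):
    """Pointwise mutual information log2(P(x, y) / (P(x) * P(y))),
    with probabilities estimated as hit counts divided by `total`."""
    if min(hits_x, hits_y, hits_xy) <= 0:
        # No joint (or individual) evidence at all.
        return float("-inf")
    return math.log2((hits_xy * total) / (hits_x * hits_y))


# Hypothetical hit counts for two terms and for their co-occurrence.
score = pmi(hits_x=1000, hits_y=2000, hits_xy=500, total=1_000_000)
print(round(score, 3))
```

A score well above zero suggests the two terms co-occur more often than chance would predict, which is the kind of evidence that could support a relation missing from UMLS. A score near zero suggests independence.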

19
Results
  • Transcriptase activity - transcriptase inhibitor
  • Transcriptase activity - transcriptase polymerase
    chain reaction
  • Transcriptase activity - transcription
  • Transcriptase activity - transcription polymerase
    chain reaction
  • Transcriptase inhibitor - transcriptase
    polymerase chain reaction
  • Mask - protective eyewear
  • Face mask - protective eyewear
  • Glove - protective eyewear

20
Outline
  • Task 1.1 - Adapting text analysis tools for lexon
    extraction
  • Task 1.2 - Evaluation of ontologies
  • Task 1.3 - Adaptation of ontologies
  • Task 2.1 - Overview of existing ontology
    (meta-)models
  • Prospects

21
Adaptation of WordNet
  • red_blood_cell is_a red_cell is_a cell
  • cell hepatoma_cell human_hepatoma_Hep3B_cell
    woodchuck_hepatocyte normal_human_hepatocyte in
    liver
  • renal_failure liver_disease liver_disease_secondary
    renal_disease is_a disorder IDC
  • patient therapy liver_transplantation
    transplantation for IDC renal_failure
    liver_disease liver_disease_secondary
    renal_disease
  • use of mask glove protective_eyewear is_a
    protection with patient subject child adult
    infant person
  • face_mask is_a mask
  • contamination of mask
  • surgical_face_mask is_a face_mask
  • red_blood_cell is_a cell
  • renal_failure is_a disorder
  • face_mask is_a mask
  • mask is_a protection
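
The is_a links listed above form small chains. A minimal sketch of how such links could be stored and walked is given below, assuming each term has at most one parent; the dictionary representation and the function name are illustrative and are not WordNet's actual data model.

```python
def ancestors(term, is_a):
    """Follow is_a links upward from `term` and return the chain of
    parents, stopping at the root or if a cycle is detected."""
    chain = []
    seen = {term}
    while term in is_a and is_a[term] not in seen:
        term = is_a[term]
        chain.append(term)
        seen.add(term)
    return chain


# is_a links taken from the slide (single parent per term assumed).
is_a = {
    "surgical_face_mask": "face_mask",
    "face_mask": "mask",
    "mask": "protection",
    "red_blood_cell": "cell",
    "renal_failure": "disorder",
}
chain = ancestors("surgical_face_mask", is_a)
print(chain)
```

Walking the chain shows where a newly extracted term would attach: surgical_face_mask sits under face_mask, which sits under mask, which sits under protection.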

22
Adaptation of WordNet
  • recurrence transmission of HCV hepatitis_B
    HBV hepatitis_A HBV_infection
    hepatitis_B_virus_infection hepatitis_B_infection
    HBsAg viral_hepatitis infection hepatitis_B_virus
    viral_infection hepatitis_C_virus HCV_infection
    disease virus
  • patient with hepatitis hepatitis_A
    hepatitis_B acute_hepatitis hepatitis_type_A
    acute_viral_hepatitis type_A_hepatitis
    type_B_hepatitis hepatitis_B_virus
    infectious_hepatitis chronic_active_hepatitis
    cholestatic_hepatitis chronic_viral_hepatitis_B
    chronic_viral_hepatitis_C
    post-transfusion_hepatitis

23
Outline
  • Task 1.1 - Adapting text analysis tools for lexon
    extraction
  • Task 1.2 - Evaluation of ontologies
  • Task 1.3 - Adaptation of ontologies
  • Task 2.1 - Overview of existing ontology
    (meta-)models
  • Prospects

24
Ontology (meta-)models
  • A meta-model will determine the level of
    interoperability that can be reached between the
    different applications.
  • Different issues:
    • Ontology growing
    • Ontology storing
    • Adaptation of ontologies
    • Reuse of ontologies

25
Combination of ontologies
  • Find to what extent they overlap
  • Link equivalent concepts
  • Respect consistency, coherence, non-redundancy
  • Take care of:
    • syntactic mismatches
    • semantic mismatches
    • conceptualization mismatches
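
The first two steps (measuring overlap and linking equivalent concepts) can be sketched very roughly as below. The sketch assumes two concepts are equivalent when their labels coincide after a crude normalization, which only addresses simple syntactic mismatches; the concept lists and function names are hypothetical.

```python
def normalize(label):
    """Crude label normalization: lowercase and unify separators."""
    return label.lower().replace("-", "_").replace(" ", "_")


def align(concepts_a, concepts_b):
    """Link concepts whose normalized labels coincide and report the
    Jaccard overlap between the two normalized label sets."""
    index_a = {normalize(c): c for c in concepts_a}
    index_b = {normalize(c): c for c in concepts_b}
    shared = sorted(index_a.keys() & index_b.keys())
    links = [(index_a[key], index_b[key]) for key in shared]
    union = index_a.keys() | index_b.keys()
    overlap = len(shared) / len(union) if union else 0.0
    return links, overlap


# Hypothetical concept labels from two small ontologies.
ontology_a = ["face_mask", "Glove", "protective eyewear"]
ontology_b = ["Face-Mask", "glove", "gown"]
links, overlap = align(ontology_a, ontology_b)
print(links, overlap)
```

Semantic and conceptualization mismatches cannot be detected by label comparison alone; they would need the relations and definitions attached to each concept.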

26
Systems studied
  • On-To-Knowledge and Sesame
  • KAON, Text-To-Onto system
  • Protégé-2000 (Stanford)
  • VOID, the Kactus toolkit
  • WebOnto (Open University, England)

27
Prospects
  • Improving the building of relations between terms
  • Evaluation of UMLS: find evidence of anomalous
    relations
  • Adaptation of WordNet: generalization to the
    whole thesaurus