Ontological Analysis - PowerPoint PPT Presentation

Geri Steve, Aldo Gangemi, Domenico M. Pisanelli. Istituto di Tecnologie Biomediche, CNR, Rome, Italy. http://saussure.irmkant.rm.cnr.it ... – PowerPoint PPT presentation

Transcript and Presenter's Notes

1
  • Ontological Analysis Integration of
    Terminologies Towards An Environmental Reference
    Ontology Library

Geri Steve, Aldo Gangemi, Domenico M. Pisanelli
Istituto di Tecnologie Biomediche, CNR, Rome, Italy
http://saussure.irmkant.rm.cnr.it
{steve,gangemi,pisanelli}@saussure.irmkant.rm.cnr.it
2
Which part are you talking about?
  • If my liver is part of my digestive system, and
    that system is part of me, is my liver part of
    me?
  • If my liver is a part of me and I am part of the
    CNR, is my liver part of the CNR?
  • My liver is a component of my digestive system,
    while I am a member of CNR. There is no rule for
    composing component and member relations
  • Moreover, I am a body, but I am also a person. A
    living person depends on a body. Nevertheless, a
    living person can be a member of CNR, but a body
    cannot
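The liver puzzle can be sketched in a few lines of Python. This is only an illustrative toy, assuming that the component relation is transitive and that no rule lets it compose with membership; the entity names and the `transitive_closure` helper are hypothetical and not part of the presentation.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a set of (x, y) pairs."""
    closure = set(pairs)
    while True:
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

# Toy facts taken from the slide; the entity names are illustrative.
component_of = {("liver", "digestive system"), ("digestive system", "body")}
member_of = {("person", "CNR")}
# Dependence is recorded separately and is never used to derive membership.
depends_on = {("person", "body")}

# Each relation is closed on its own; the two are never chained together.
component_closure = transitive_closure(component_of)
member_closure = transitive_closure(member_of)

liver_part_of_body = ("liver", "body") in component_closure
liver_part_of_cnr = ("liver", "CNR") in (component_closure | member_closure)
body_member_of_cnr = ("body", "CNR") in member_closure

print(liver_part_of_body, liver_part_of_cnr, body_member_of_cnr)
```

Keeping one closure per relation is the design choice that blocks the unwanted inference: the liver is part of the body, but nothing makes it part of CNR, and the body does not inherit the person's membership.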

3
Object or place?
  • Is a body region an object that one could cut, or
    a place?
  • Is a gene a DNA fragment, or a DNA region
    (allele)?
  • Is a river an orographic object, or the
    geographic place of a watercourse?
  • Despite many differences, these three cases seem
    analogous: they share a polysemy that partly
    depends on an abstract difference between objects
    and regions, and on a related axiom specifying
    that objects must be located at some region

4
River in the GEMET thesaurus
5
Should we worry about those things?
  • Even in the presence of polysemous names, a
    standalone application using a local databank or
    terminological repository may be able to
    accomplish its task without serious flaws.
  • However, when it is integrated with another
    application, semantic mismatches constitute a
    serious obstacle for the agent or interface that
    is negotiating or sharing information.
  • The ever-increasing demand for data sharing has
    to rely on a solid conceptual foundation in order
    to give a semantics to the terabytes available in
    different databases and eventually traveling over
    the networks.
  • Ontologies are currently recognized as the answer
    to this need for a conceptual foundation.

6
The advantages of ontologies
  • to allow a more effective data and knowledge
    sharing
  • to facilitate knowledge re-use in decision
    support systems
  • to give a theoretical foundation to vocabulary
    standardization activities

7
Our task
  • We learn domain ontologies (in medicine,
    environment) by integrating the conceptual models
    that can be extracted from terminological sources
  • The goal is to build Domain Reference Ontologies
    in the form of modular libraries of formal
    theories
  • In our ONIONS methodology, ontology learning
    needs both incremental bottom-up learning from
    sources, and incremental definition and reuse of
    general theories that can account for the
    intended meaning of terms

8

ONtologic Integration Of Naïve Sources

9
Minimal history
  • The ONIONS methodology for ontology integration
    has been under development since the early 1990s
    to account for the problem of conceptual
    heterogeneity. It addresses some problems
    encountered in the context of the European
    project GALEN and the Italian projects SOLMC
    (Ontological and Linguistic Tools for Conceptual
    Modeling) and ONTOINT (Ontological Integration of
    Information)

10
Some related research projects
  • GALEN, GALEN-IN-USE
  • CYC anatomy
  • SNOMED RT
  • HL7 vocabulary committee
  • MED

11
What is an ontology?
  • A specification of a conceptualization
  • (Gruber, 1993)
  • The subject of ontology is the study of the
    categories of things that exist or may exist in
    some domain. The product of such a study, called
    an ontology, is a catalog of the types of things
    that are assumed to exist in a domain of interest
    D from the perspective of a person who uses a
    language L for the purpose of talking about D.
    ...
  • (Sowa, 1997)
  • A partial and indirect specification of a
    conceptualization
  • -restricted notion- (Guarino, 1998)

12
What is an ontology (restricted notion)?
  • An ontology is a set of axioms that account for
    the intended meaning (the intended models) of a
    vocabulary (the namespace of a logical language)
  • A set of axioms usually only approximates such
    intended models, which in turn only approximate
    the conceptualization of vocabulary items
  • A conceptualization is a set of conceptual
    relations that range over a domain and a set of
    relevant states of affairs (possible worlds) for
    that domain
  • Therefore, a precise definition of "ontology" (in
    a restricted, formal sense) might be "a partial
    specification of the intended models of the
    conceptualization of a vocabulary"

13
Types of ontologies (broad notion)
  • Catalog of normalized terms, e.g. a list of terms
    used in the reports from a laboratory: no
    taxonomy, no axioms, and no glosses
  • Glossed catalog, e.g. a dictionary of medicine: a
    catalog with glosses
  • Thesaurus, e.g. many parts of the UMLS
    Metathesaurus, GEMET: a hierarchical collection
    of terms; the hierarchical link is usually
    polysemous
  • Taxonomy, e.g. the ICD10: a collection of classes
    with a partial order induced by inclusion
    (classification)
  • Axiomatized taxonomy, e.g. the GALEN Core Model:
    a taxonomy with axioms
  • Ontology library, e.g. the Ontolingua repository:
    a set of axiomatized taxonomies with relations
    among them. Each element of the library is a
    module, which can be included in another one.
    Also, a concept from one module can simply be
    used in another one. Ontology modules can be
    considered subdivisions of the namespace of a
    model

14
From Data Integration to Conceptual Integration
  • Heterogeneous texts
  • Heterogeneous semi-structured texts (retrieval
    of web data types and descriptions)
  • Heterogeneous databases (schema integration,
    information brokering)
  • => In all these cases, heterogeneity concerns the
    conceptualization of the terminology used in the
    sources

15
Polysemy and overlapping
  • The primary causes of heterogeneity are:
  • polysemy (conceptual disalignment, i.e. different
    intended meanings of one name), and
  • conceptual overlapping (different names having
    overlapping meanings)
  • Both arise in the union of the vocabularies of
    any two sources, so ontologies are a major
    component in providing semantic access to (and
    integration of) terminological resources
  • Incidentally, polysemy is usually found within
    the same source as well (views, themes, homonyms)

16
Ontology Learning
  • From Natural Language
  • From Semi-structured Data
  • From Structured Data
  • From Terminologies
  • => Integration of sources needs (principled)
    conceptual abstraction

17
Conceptual abstraction: an example
  • The domain ontology A has "body region" with the
    intended meaning of "loosely specified part of
    the body that can be cut, filled, etc."
  • The domain ontology B has "body region" with the
    intended meaning of "region of the body at which
    body parts are located"
  • There is a metonymy acting on "body region" in A,
    whose intended meaning concerns body parts
    located at some region, although they are denoted
    by referring to the region itself (the intended
    meaning in B)
  • Hence, the metonymic name should be distinguished
    from the plain name, and correctly related to it
  • The distinction between objects (body parts) and
    regions, and the notion of a localization
    relation holding between objects and regions, are
    both necessary to make the metonymy clear, and
    cannot be found in the specifications given in A
    or B. They have to be found in some generic theory
18
Ontology integration: conceptual issues
  • Ontology integration is, generally speaking,
    the construction of an ontology C that formally
    specifies the union of the vocabularies of two
    other ontologies A and B
  • To be sure that A and B can be integrated at some
    level, C has to commit to both A's and B's
    conceptualizations. In other words, the intension
    of the concepts in A and B should be mapped to
    the intension of C's concepts
  • Unfortunately, this cannot be realized using only
    the conceptual relations specified in A and B for
    local tasks (for a specific context). The
    methodological principle adopted here is that
    generic ontologies reused from the philosophical,
    linguistic, mathematical, and AI literature must
    ground the comparison of different intensions.
    Our approach may be called principled conceptual
    integration

19
Aspects of integration
  • Three aspects of an ontology are taken into
    account:
  • the intended models of the conceptualizations of
    its vocabulary
  • the domain of interest of such models, i.e. the
    'topic' of the ontology
  • the namespace of the ontology
  • The most interesting case is when A and B are
    supposed to commit to the conceptualization of
    the same domain of interest or of two overlapping
    domains. In particular, A and B may be:

20
Some integration cases for the same topic
  • Alternative ontologies: the intended models of
    the conceptualizations of A and B are different
    (they partially overlap or are completely
    disjoint) while the domain of interest is
    (mostly) the same. This is a typical case that
    requires integration: different descriptions of
    the same topic are to be integrated
  • Truly overlapping ontologies: both the intended
    models of the conceptualizations of A and B and
    their domains of interest have a substantial
    overlap. This is another frequent case of
    required integration: descriptions of strongly
    related topics are to be integrated
  • Equivalent ontologies with vocabulary mismatches:
    the intended models of the conceptualizations of
    A and B are the same, as well as the domain of
    interest, but the namespaces of A and B are
    overlapping or disjoint. This is the case of
    equivalent theories with alternative vocabularies
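The three cases can be told apart mechanically once the three aspects are made comparable. The sketch below is a simplification under stated assumptions: intended models, domains, and namespaces are modelled as finite sets, "(mostly) the same" is reduced to set equality, and "substantial overlap" to a non-empty intersection. The `Ontology` class and `integration_case` function are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ontology:
    models: frozenset   # stand-in for the intended models
    domain: frozenset   # stand-in for the domain of interest
    names: frozenset    # the namespace

def integration_case(a, b):
    """Classify a pair of ontologies into one of the cases on the slide."""
    if a.domain == b.domain and a.models == b.models:
        if a.names != b.names:
            return "equivalent with vocabulary mismatches"
        return "identical"
    if a.domain == b.domain:
        return "alternative"
    if a.domain & b.domain and a.models & b.models:
        return "truly overlapping"
    return "no case from the slide applies"

anatomy_a = Ontology(frozenset({"m1", "m2"}), frozenset({"anatomy"}), frozenset({"organ"}))
anatomy_b = Ontology(frozenset({"m2", "m3"}), frozenset({"anatomy"}), frozenset({"organ"}))
renamed = Ontology(frozenset({"m1", "m2"}), frozenset({"anatomy"}), frozenset({"organum"}))
physiology = Ontology(
    frozenset({"m2", "m4"}), frozenset({"anatomy", "physiology"}), frozenset({"organ", "function"})
)

print(integration_case(anatomy_a, anatomy_b))
print(integration_case(anatomy_a, renamed))
print(integration_case(anatomy_a, physiology))
```

In practice none of the three aspects is directly observable as a set, so the comparison itself is the result of the ontological analysis rather than an input to it.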

21
Ontological integration: operational issues
  • Depending on the amount of change necessary for
    the operational integration of A and B, different
    levels of interoperability can be distinguished:
  • Mediation: it requires no changes to A and B, but
    only mapping relations that describe the
    equivalence (partial or total) of A's and B's
    elements to C's elements. This may result in weak
    interoperability, since usually the intended
    models of A and B only overlap: some concepts
    from A may not have a correspondent in B, and
    vice versa. This is the design choice for some
    recent information brokering architectures.
    However, such architectures have a weak
    commitment towards a principled way of conceptual
    integration, possibly because of its additional
    cost
  • Alignment: it requires some change to fill the
    biggest gaps of A and B with respect to an ideal
    C that completely integrates A and B. Therefore,
    alignment requires at least a partial conceptual
    integration. It may support a limited
    interoperability: for example, deep inferences
    may be excluded
  • Unification: it may require a major
    reorganization of A and B, which are
    'harmonized'. Unification intervenes on the
    inferential features of the systems, and consists
    in a complete operational integration: everything
    that can be done in one system can be done in the
    other. It results in the most complete
    interoperability but requires a complete
    conceptual integration as well. From the
    conceptual viewpoint, unification consists in the
    adoption of C as a standard in the systems using
    A or B
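Mediation, the lightest of the three levels, can be sketched as two mapping tables into C that leave A and B untouched. The example below is a hypothetical illustration: the term identifiers, the dictionaries, and the helper functions are assumptions, and total equivalence is the only kind of mapping represented.

```python
def translate(term, a_to_c, b_to_c):
    """Translate a term of A into the terms of B mapped to the same C element."""
    target = a_to_c.get(term)
    return sorted(b for b, c in b_to_c.items() if c == target)

def unmatched_concepts(a_to_c, b_to_c):
    """Return the terms of A whose image in C has no counterpart in B."""
    reachable_from_b = set(b_to_c.values())
    return sorted(a for a, c in a_to_c.items() if c not in reachable_from_b)

# Hypothetical mapping relations; A and B themselves are not modified.
a_to_c = {"A:organ": "C:organ", "A:body-region": "C:body-part-at-region"}
b_to_c = {"B:organ": "C:organ", "B:body-region": "C:region"}

organ_in_b = translate("A:organ", a_to_c, b_to_c)
region_in_b = translate("A:body-region", a_to_c, b_to_c)
gaps = unmatched_concepts(a_to_c, b_to_c)

print(organ_in_b, region_in_b, gaps)
```

The empty translation for "A:body-region" is the weak interoperability the slide warns about; alignment or unification would have to change A or B to close that gap.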

22
Ontology integration: practical issues
  • Lack of hierarchies
  • Ambiguous hierarchies
  • Informality
  • Lack of modularity
  • Polysemy
  • Uncertain semantics
  • Prototypical descriptions
  • Ontological opaqueness
  • Lack of a (minimal) set of axioms
  • Confusing lexical clues
  • Awkward naming policy
  • 'Remainder' partitions
  • 'Exception' partitions
  • Terminological cycles
  • Meta-level soup
  • Low maintenance capabilities

23
Ontologies: some desiderata
  • An explicit taxonomy with subsumption among
    concepts
  • Semantic explicitness of links
  • Modularity of namespace
  • A stratified design of the modules
  • Absence of polysemy within a module
  • Disjointness of concepts within a module and
    within the top-level
  • A proper interface between the ontology namespace
    and one or more sets of lexical realizations
  • Linguistically meaningful naming policy
    (cognitive transparency)
  • Rich documentation
  • Some minimal axiomatization to detail the
    difference among sibling concepts
  • Explicit linkage to concepts and relations from
    generic theories
  • Meta-level assignments to distinguish among the
    formal primitives assigned to concepts
  • Languages and implementations that support the
    previous needs as well as the possibility of
    collaborative modeling

24
The ONIONS Methodology
  • ONIONS implementation is meant to provide
    extensive axiomatization, clear semantics, and
    ontological depth to a domain terminology
  • Extensive axiomatization is obtained through a
    conceptual analysis of the terminological sources
    and their representation in a logical language
    with a rigorous semantics
  • Ontological depth is obtained by reusing a
    library of generic ontologies, on which the
    axiomatization depends. Such a library may
    include multiple choices among partially
    incompatible ontologies. In particular, we
    suggest the importance of: mereology, or theory
    of parts; topology, or theory of wholes,
    connexity, and boundaries; morphology, or theory
    of form and congruence; localization, or theory
    of regions; time theory; actors, or theory of
    participants in a process; dependence theory; and
    the theory of environmental niches

25
The main steps (I)
  • 0. Semantically opaque hierarchies and lists are
    pre-processed in order to create clean
    taxonomies
  • 1. All concepts, relations, templates, rules, and
    axioms from a source ontology are represented in
    the ONIONS formalisms, currently Loom,
    Ontolingua, and OKBC
  • 2. When available, plain text descriptions are
    analyzed and axiomatized (text formalization)
  • 3. The union of such products is integrated by
    means of a set of generic ontologies. This is the
    most characteristic activity in ONIONS, which can
    be briefly described as follows:

26
The main steps (II)
  • 3.1. For any set of sibling concepts in a
    taxonomy, the conceptual difference between each
    of them is inferred, and such difference is
    formalized by axioms that reuse the relations and
    concepts already in the library. If no concept is
    available to represent the difference, new
    concepts are added to the library
  • 3.2. For any set of polysemous senses of a term,
    different concepts are stated and placed within
    the library according to their topic and to the
    available modules. (Polysemy occurs when two
    concepts with overlapping or disjoint intended
    models have the same name.)
  • 3.3. Often, polysemous senses of a term - as well
    as different 'alternative' concepts - are
    metonymically related, for example
    process/outcome (as in inflammation),
    region/object (as in body region), etc.
    Alternatives must be properly defined by making
    the relationship between them explicit, e.g.
    "has-product" for inflammation, "location" for
    body-region
  • 3.4. When stating new concepts, the relations
    necessary to maintain consistency with the
    existing concepts are instantiated. If conflicts
    arise with existing theories, a more general and
    more comprehensive theory is sought. If this is
    impracticable, an alternative theory is created

27
The main steps (III)
  • 3.5. Relevant integration cases. Since ONIONS
    requires the use of generic theories to
    axiomatize alternative theories, the integration
    of a concept C from an ontology O is performed by
    comparing C with the concepts D1,...,Dn already
    present in the evolving ontology library L, whose
    ontology set M1,...,Mn contains at least a
    significant subset of generic ontologies and the
    set of domain ontologies at that state in the
    evolution of L. The following cases appear
    relevant to the methodology:
  • 3.5.1. C's name is polysemous in O (internal
    polysemy). Iterate 3.2-3.4
  • 3.5.2. C's name is a homonym of the name of a Di.
    (Homonymy occurs when both the intended models
    and the domains of two concepts with the same
    name are disjoint.) Homonyms must be
    differentiated by modifying the name, or by
    preventing the homonyms from being included in
    the same module namespace
  • 3.5.3. C's name is a synonym of the name of a Di.
    (Synonymy is the converse of homonymy and occurs
    when two concepts with different names have both
    the same intended model and the same domain.)
    Synonyms must be preserved, or included in the
    set of lexical realizations related to the
    concept
  • 3.5.4. C is subsumed by some Di in L, but it has
    no total mapping on any Dj in L. The gap in L
    must be filled by adding C as a subconcept of Di
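Cases 3.5.2 and 3.5.3 can be written down as two predicates. The sketch below assumes, for illustration only, that the intended models and the domain of a concept are given as finite sets; the `Concept` class, the `integrate` function, the action labels, and the sample concepts are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    name: str
    models: frozenset   # stand-in for the intended models
    domain: frozenset   # stand-in for the domain of interest

def is_homonym(c, d):
    """Same name, disjoint intended models and disjoint domains (case 3.5.2)."""
    return c.name == d.name and c.models.isdisjoint(d.models) and c.domain.isdisjoint(d.domain)

def is_synonym(c, d):
    """Different names, same intended models and same domain (case 3.5.3)."""
    return c.name != d.name and c.models == d.models and c.domain == d.domain

def integrate(c, library):
    """Return the (action, library concept name) pairs suggested for concept c."""
    actions = []
    for d in library:
        if is_homonym(c, d):
            actions.append(("rename or separate module", d.name))
        elif is_synonym(c, d):
            actions.append(("add lexical realization", d.name))
    return actions

# Toy library; the model and domain labels are placeholders.
library = [
    Concept("culture", frozenset({"lab-culture"}), frozenset({"microbiology"})),
    Concept("neoplasm", frozenset({"tumor-model"}), frozenset({"pathology"})),
]
homonym_case = integrate(
    Concept("culture", frozenset({"social-culture"}), frozenset({"anthropology"})), library
)
synonym_case = integrate(
    Concept("tumor", frozenset({"tumor-model"}), frozenset({"pathology"})), library
)

print(homonym_case)
print(synonym_case)
```

The remaining cases (3.5.1 and 3.5.4) depend on subsumption and on polysemy inside a single source, so they are not covered by this sketch.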

28
The main steps (IV)
  • 3.5.5. C is an intersection between two concepts
    Di and Dj in L. Solved by distinguishing types
    and roles, or different defining elements
  • 3.5.6. C has an alternative concept Di in L (same
    domain, but overlapping or disjoint intended
    models)
  • 3.5.6.1. If C metonymically depends on Di, C is
    properly related to Di
  • 3.5.6.2. If C and Di are different viewpoints on
    the same domain of interest, both concepts are
    kept; if necessary, they are included in separate
    modules
  • 3.5.6.3. If the intended model of C is finer than
    Di's, Di is substituted with C
  • 3.5.6.4. If the intended model of C is coarser
    than Di's, C is ignored (but track of it is kept
    for mapping between sources)

29
The main steps (V)
  • 4. The library of generic, intermediate, and
    domain ontologies should be stratified, i.e.
    domain modules should include intermediate
    modules - that should include generic modules -
    so that each set of modules can be plugged into
    or unplugged from its more general set without
    affecting the coherence of the entire library
  • 5. The source ontologies are explicitly mapped to
    the integrated ontology, in order to allow
    interoperability. The only admitted mappings are
    equivalent and coarser equivalent. Formally: for
    any source ontology SO and an ontology IO that is
    supposed to result (also) from the integration of
    SO, for any concept Ci in SO, there is a Di in IO
    such that $C_i^I = D_i^I$ (equivalence of
    possible interpretations), or there is a
    disjunctive concept (or Di Dj) in IO such that
    $C_i^I = D_i^I \cup D_j^I$ (equivalence of
    possible interpretations to a disjunction of
    concepts, i.e. to a union of finer concepts)
  • 5.1. Partial mappings must have been already
    resolved through the methodology; if any remain,
    some step in the integration procedure must be
    iterated
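The admissibility condition of step 5 can be checked on a toy example. The sketch below assumes a single finite interpretation in which each concept denotes a set of instance identifiers, whereas the slide speaks of all possible interpretations; it also only tries disjunctions of two concepts, as in the formula. The function `find_admitted_mapping` and the sample extensions are hypothetical.

```python
from itertools import combinations

def find_admitted_mapping(source_ext, integrated):
    """Return ("equivalent" | "coarser equivalent", names), or None if only partial."""
    # Equivalent: some integrated concept has exactly the same extension.
    for name, ext in integrated.items():
        if ext == source_ext:
            return ("equivalent", (name,))
    # Coarser equivalent: the source concept equals a union of two finer concepts.
    for (n1, e1), (n2, e2) in combinations(integrated.items(), 2):
        if e1 | e2 == source_ext:
            return ("coarser equivalent", (n1, n2))
    return None  # partial mapping: per step 5.1, an earlier step must be iterated

# Toy extensions of concepts in the integrated ontology IO.
integrated = {
    "bone-fracture": frozenset({"f1", "f2"}),
    "malunion": frozenset({"f3"}),
    "ulcer": frozenset({"u1"}),
}

exact = find_admitted_mapping(frozenset({"u1"}), integrated)
coarser = find_admitted_mapping(frozenset({"f1", "f2", "f3"}), integrated)
partial = find_admitted_mapping(frozenset({"f1"}), integrated)

print(exact, coarser, partial)
```

A `None` result corresponds to a partial mapping, which step 5.1 treats as a sign that the integration procedure is not yet finished for that concept.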

30
Ambiguous hierarchies
31
A principled formalization
(defconcept ununited-fracture
  :is-primitive (:and fracture
                      (:some morphology
                             (:and bone
                                   (:or (:some embodies malunion)
                                        (:not integral))))
                      (:some dependently-postdates fracture)
                      (:all interpretant clinical-condition)))

32
Some UMLS concepts pertaining to the intersection of
"Amino Acid, Peptide, or Protein" and "Carbohydrate"
  • (hamster oviduct-specific glycoprotein)
  • (Par j I)
  • ((Man)6(GlcNAc)2Asn)
  • (Zn(2)-IAA)
  • (collapsing factor)
  • (BDV 18K glycoprotein)
  • (SI-gene-associated glycoprotein, Nicotiana)
  • (FdI allergen)
  • (sca gene product)
  • (EPV20 protein)
  • (lubricin)
  • (Pluritene)
  • (Par h 1 allergen)
  • (Wnt11 gene product)
  • (I-D-Gal-BSA)
  • (mannose-bovine serum albumin conjugate)
  • (acrosome granule lysin)
  • (sulfatide activator)
  • (vaccinia virus A34R protein)

=> More than 118,000 UMLS concepts (25%) are
classified under an intersection
33
Ontological analysis of the intersection
(defconcept Amino Acid, Peptide, or Protein Carbohydrate
  "834 instances. This conjunct includes two sibling types.
   A protein containing a carbohydrate."
  :annotations ((Sugg.Name "carbohydrate-containing-protein")
                (onto-status integrated))
  :is-primitive (:and protein
                      (:some has-component carbohydrate))
  :context substances)
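The definition above reads as a conjunction: the concept is a protein with at least one carbohydrate component. The Python sketch below checks that condition on a toy knowledge base. It is an assumption-laden illustration, not an emulation of the Loom classifier: the `types` and `has_component` tables are hypothetical, "glycan-chain" is an invented individual, and because the definition is primitive the condition is necessary but not sufficient for membership.

```python
def satisfies_definition(individual, types, has_component):
    """Check (and protein (some has-component carbohydrate)) for one individual."""
    is_protein = "protein" in types.get(individual, set())
    has_carbohydrate_part = any(
        "carbohydrate" in types.get(part, set())
        for part in has_component.get(individual, set())
    )
    return is_protein and has_carbohydrate_part

# Toy assertions; only "lubricin" comes from the list of UMLS concepts above.
types = {
    "lubricin": {"protein"},
    "glycan-chain": {"carbohydrate"},
    "albumin": {"protein"},
}
has_component = {"lubricin": {"glycan-chain"}}

lubricin_ok = satisfies_definition("lubricin", types, has_component)
albumin_ok = satisfies_definition("albumin", types, has_component)

print(lubricin_ok, albumin_ok)
```

A check of this kind is useful mainly in the negative direction: an individual filed under the intersection that fails it points to a misclassification in the source.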

34
Morphologies
  • Names of anatomical morphologies are often
    polysemous:
  • Both a condition and the function that caused the
    condition ("inflammation", "ulcer", "fracture",
    "wound", "hyperplasia")
  • Both an object and the function that produced the
    object ("neoplasm", "hemorrhage")
  • Both an object O and the condition created in
    another object O' by O ("obstruction")
  • For example: "the fracture has been caused by a
    fall" vs. "the fracture is transverse"; "the
    obstruction occurred in the jejunum" vs. "the
    obstruction has been removed"
  • Conceptual analysis brings out other issues
    concerning morphologies:
  • The dependence between a morphological condition,
    a function, and the related organ. For example,
    an "ulcer" (as a condition) of a stomach implies
    that the stomach embodies an ulceration function
    (an ulcer as a function)
  • The mereological import of morphologies: some are
    featured by an organ, some only by a part of an
    organ. For instance, an "ectopic heart" is wholly
    ectopic, but an "ulcerated stomach" is only
    partly ulcerated

35
Morphologies analyzed
  • a property ("color", "consistency", "thickness",
    "size", "number", "shape")
  • a condition
  • a topologically relevant condition
  • an alteration of connection
  • that creates a configuration (a new property) in
    an object ("fracture", "wound")
  • in the holey interior of an object
    ("obstruction")
  • between several objects ("fusion")
  • an alteration of the boundary between an object's
    holey interior and the object's complement
  • creating a configuration in the boundary
    ("cavitation", "ulcer")
  • producing a substance flow ("hemorrhage",
    "ulcer")
  • an abnormal placement ("dislocation", "ectopia",
    "absence")
  • a form alteration condition ("deformity",
    "hyperplasia", "hypoplasia")
  • a condition involving the alteration of several
    properties ("inflammation", "eruption")
  • an abnormal, foreign object ("mass", "neoplasm",
    "calculus", "obstruction")

36
Making relations explicit
37
Medical source ontologies
  • The UMLS top-level (1998 edition: 132 "semantic
    types", 91 "relations", and 412 "templates"),
  • The SNOMED-III top-level (510 "terms" and 25
    "links"),
  • GMN top-level (708 "terms"),
  • The ICD10 top-level (185 "terms"), and
  • The GALEN Core Model v.5h (2,730 "entities", 413
    "attributes" and 1,692 axioms), etc.
  • The 1998 edition of the UMLS Metathesaurus
    (476,000 "concepts", 93,000 explicit templates,
    and 599,000 thesaurus-like templates)

38
The current ON9.2 library
39
The current top-level
40
Tool for representation: ONTOLINGUA
Tool for representation and classification: LOOM
Tool for intermediate representation and interchange: OKBC
Tool for browsing and editing: ONTOSAURUS
42
Results
  • ON9.2: integration of the medical top levels
    within a library of generic theories. It includes
    a set of 50 modules with about 1,500 concepts. It
    is available in both the Ontolingua and Loom
    languages
  • Making explicit the terminological knowledge of
    the Metathesaurus: intersections of UMLS semantic
    types, relations defined by sources (IS_A and
    other relations)
  • Integration of the Metathesaurus intersections
    within ON9.2
  • Contextualization of the Metathesaurus
  • An integrated model of clinical guidelines

43
What is a Domain Reference Ontology?
  • An ontology usable to build new ontologies in a
    domain, or to plug existing ontologies in it
  • Our research in medical conceptual structures
    aims at defining a Medical Reference Ontology
    (library)
  • The current research in environmental metadata
    could be reconsidered as the construction of an
    Environmental Reference Ontology
  • We are confident that our methodology is suitable
    for this task without substantial revision
  • Warning: at first sight, conceptual heterogeneity
    in the environmental domain seems harder to
    handle than in medicine

45
  • "Es gibt nichts Praktischeres als eine gute
    Theorie"
  • "There is nothing more practical than a good
    theory"
  • (Ludwig Boltzmann)

46
References
  • For generalities, the library, and conceptual
    investigations:
  • Gangemi A, Pisanelli DM, Steve G, "An overview of
    the ONIONS project: Applying ontologies to the
    integration of medical terminologies", Data and
    Knowledge Engineering, 31 (1999), 183-220
  • For the investigation of the UMLS:
  • Pisanelli DM, Gangemi A, Steve G, "An Ontological
    Analysis of the UMLS Metathesaurus", Journal of
    the American Medical Informatics Association,
    vol. 5 (symposium supplement), 1998
  • For the pre-processing of informal terminological
    repositories:
  • Steve G, Gangemi A, Pisanelli DM, "Integrating
    Medical Terminologies with ONIONS Methodology",
    in Kangassalo H, Charrel JP (eds.), Information
    Modelling and Knowledge Bases VIII, Amsterdam,
    IOS Press, 1997
  • For the integration of clinical guidelines:
  • Pisanelli DM, Gangemi A, Steve G, "Toward a
    Standard for Guideline Representation: an
    Ontological Approach", Journal of the American
    Medical Informatics Association, vol. 6
    (symposium supplement), 1999