Principles for Building Biomedical Ontologies - PowerPoint PPT Presentation

Description:

Title: PowerPoint Presentation Author: Suzanna Lewis Last modified by: Suzanna Lewis Created Date: 9/7/2005 1:23:00 PM Document presentation format – PowerPoint PPT presentation

1
Principles for Building Biomedical Ontologies
  • Suzanna Lewis
  • National Center for Biomedical Ontology
  • 22 October 2005
  • Advanced Bioinformatics, Cold Spring Harbor

2
National Center for Biomedical Ontology
http://bioontology.org/
  • Mark Musen
  • Suzanna Lewis
  • Barry Smith
  • Sima Misra
  • Daniel Rubin
  • Michael Ashburner
  • Monte Westerfield
  • Ida Sim
  • PI Core 1 computer science (SMI)
  • Co-PI Core 2 bioinformatics (BiKR GO)
  • Core 6 Outreach and training (ECOR)
  • Associate Program Director
  • Program Director
  • Core 3 Phenotype Project (Cambridge FlyBase
    and GO)
  • Core 3 Phenotype Project (UOregon PI of ZFIN)
  • Core 3 HIV clinical trials Project (UCSF)

3
BiKRs
  • Sima Misra
  • Shu Shengqiang
  • Christopher J. Mungall
  • Nomi Harris
  • John Day-Richter
  • Karen Eilbeck
  • Mark Gibson

4
Outline for the Morning
  • A definition of ontology
  • Four sessions
  • Organizational Challenges
  • Principles for Ontology Construction
  • Case Studies from the GO
  • Case Studies for group discussion.

5
My newbie questions
What I've heard
  • What data is missing?
  • Organism, environment, data quality and
    attribution
  • Where is the data generated?
  • TIGR, Sanger, JGI, and coming soon to a 954 near
    you!
  • How will it be gathered?
  • Still an issue. Low threshold of effort relative
    to benefits of complying
  • What is the motivation?
  • Data is accumulating on disks across the world,
    and we'd like to be able to locate and use it

The hardest part: Sharing (semantics)
6
Ontologies help with decision making
Where should I eat?
A handy ontology tells us what's there
7
Type of cuisine
(Presumable) country of origin
Ontologies don't just organize data; they also
facilitate inference, and that creates new
knowledge, often unconsciously in the user.
8
What a computer would likely infer about the
world from this helpful ontology
Flag of fresh juice
Fresh Juice is a national cuisine
Where delicatessen food hails from
Frozen Yogurt cuisine in search of a national
identity?
9
Ontology is all about meaning
  • Communities form (scientific) theories
  • that seek to explain all of the existing evidence
  • and can be used for prediction
  • We make inferences and decisions based upon what
    we know about (biological) reality.

10
Make our meanings clear enough for a computer to
understand
  • An ontology is a computable representation of
    this underlying (biological) reality.
  • An ontology enables a computer to reason over the
    data in (some of) the ways that we do
  • particularly to query and locate relevant data.
  • A shared, common, backbone taxonomy of relevant
    entities, and the relationships between them,
    within an application domain.
  • Referred to by information scientists as an
    'Ontology'.

11
But really
  • What is an Ontology?
  • From Aristotle to Artificial Intelligence
  • It is a formalism of what exists
  • Follows formal rules for creating definitions
    originally laid down by Aristotle.
  • A definition is the specification of the essence
    (nature, invariant structure) shared by all the
    members of a class or natural kind.

12
The Aristotelian Methodology
  • Topmost nodes are the undefinable primitives.
  • The definition of a class lower down in the
    hierarchy is provided by specifying the parent of
    the class together with the relevant differentia.
  • Differentia tells us what marks out instances of
    the defined class within the wider parent class
    as in
  • Plasma membrane
  • is a cell part (immediate parent)
  • that surrounds the cytoplasm (differentia)
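
The genus-plus-differentia pattern above can be sketched in code. This is only an illustration, not part of the original slides: the `OntologyClass` type and its field names are hypothetical, and the only data used is the plasma membrane example from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyClass:
    """A class defined by its immediate parent (genus) plus a differentia."""
    name: str
    genus: Optional["OntologyClass"] = None  # None marks an undefinable primitive
    differentia: str = ""

    def definition(self) -> str:
        """Render the definition as '<name>: a <genus> that <differentia>'."""
        if self.genus is None:
            return f"{self.name}: primitive (no definition)"
        return f"{self.name}: a {self.genus.name} that {self.differentia}"


# Hypothetical two-node hierarchy built from the slide's example.
cell_part = OntologyClass("cell part")
plasma_membrane = OntologyClass(
    "plasma membrane", genus=cell_part, differentia="surrounds the cytoplasm"
)

print(plasma_membrane.definition())
```

Storing the parent and the differentia separately means the text definition can always be regenerated from the structure, which is the point of the Aristotelian form.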

13
classes
Physical object (substance)
mammal
frog
leaf class
all members of the class frog share a froggy
nature
14
Anatomical structures
Lung
Heart
Thorax
Cell
Cornelius Rosse
15
Content of FMA
Challenge: Duplicate graphical model in symbolic
model
Universals or classes: Kinds of anatomical
entities
Adapted from Bloom & Fawcett, Textbook of
Histology, 12th ed., Chapman & Hall, 1994
16
Content of FMA
17
1. Organizational Challenges
  • http://obo.sourceforge.net

18
So you want an ontology
  • What do you have to do to make/get/use/steal/beg
    one?

19
[Decision flowchart. Node labels: Why, Survey, Domain covered?, Public?, Community?, Active?, Applied?, Salvage, Develop, Improve, Collaborate & Learn. Branches are labeled yes / no.]
20
What you must do
  • Justify exactly why there is a need
  • Scope it very, very tightly
  • Communicate with people

21
The decisions you must make
  • What domain does it cover?
  • Is it privately held?
  • Is it active?
  • Is it applied?

22
[Decision flowchart from slide 19, repeated. The Collaborate & Learn node is annotated "Listen to Barry".]
23
Due diligence background research
  • Step 1 Learn what is out there
  • The most comprehensive list is on the OBO site.
    http://obo.sourceforge.net
  • Assess ontologies critically and realistically.
  • Make contact

24
[Decision flowchart from slide 19, repeated. The Collaborate & Learn node is annotated "Listen to Barry".]
25
Ontologies must be shared
  • Proprietary ontologies
  • Belief that ownership of the terminology gives
    the owners a competitive edge
  • For example, Incyte or Monsanto in the past,
    SNOMED for non-US.
  • Data cannot be shared if the ontologies
    describing the data are not shared.
  • Don't reinvent. Use the power of combination and
    collaboration

26
[Decision flowchart from slide 19, repeated. The Collaborate & Learn node is annotated "Listen to Barry".]
27
Pragmatic assessment of an ontology
  • Is there access to help, e.g.
  • help-me@weird.ontology.net ?
  • Does a warm body answer help mail within a
    reasonable time, say 2 working days?

28
[Decision flowchart from slide 19, repeated. The Collaborate & Learn node is annotated "Listen to Barry".]
29
Use it to improve it
  • Every ontology improves when it is applied to
    actual data
  • It improves even more when these data are used to
    answer questions
  • There will be fewer problems in the ontology and
    more commitment to fixing remaining problems when
    important research data is involved that
    scientists depend upon
  • Be very wary of ontologies that have never been
    applied

30
Work with that community
  • To improve (if you found one)
  • To develop (if you did not)
  • Getting it right
  • It is impossible to get it right the 1st (or 2nd,
    or 3rd, ...) time.
  • What we know about reality is continually growing

31
Implication: prepare for change
  • Establish a mechanism for change.
  • Use CVS or Subversion.
  • Changes must be reviewed by experts
  • Unique Identifiers
  • Versions
  • Archives

32
Ontology development is hard
  • Have a stake in seeing it work.
  • Have broad, detailed domain knowledge.
  • Will engage in vigorous debate without engaging
    egos.
  • Will do concrete work and attend frequent working
    sessions (quarterly), phone conferences (weekly),
    e-mail correspondence (daily).

33
2. Principles for Ontology Construction
34
Why do we need rules for good ontology?
  • Ontologies must be intelligible
  • to humans (for annotation) and
  • to machines (for reasoning and error-checking)
  • Unintuitive rules for classification lead to
    entry errors (problematic links)
  • Facilitate training of curators
  • Overcome obstacles to alignment with other
    ontology and terminology systems
  • Enhance harvesting of content through automatic
    reasoning systems
  • Following basic rules makes more useful ontologies

35
Aristotle's categories
This is Aristotle's list of types of predication,
that is, the different ways in which things can
be said to be. He identifies 10 mutually
exclusive categories.
36
SNOMED-CT Top Level
  • Substance
  • Body Structure
  • Specimen
  • Context-Dependent Categories
  • Attribute
  • Finding
  • Staging and Scales
  • Organism
  • Physical Object
  • Events
  • Environments and Geographic Locations
  • Qualifier Value
  • Special Concept
  • Pharmaceutical and Biological Products
  • Social Context
  • Disease
  • Procedure
  • Physical Force

37
Examples of Rules
  • Don't confuse instances with universals
  • Your navel (instance) is not the abstract
    representation of all navels
  • Your microarray result is not the abstract
    representation of all microarray results
  • The meaning of an ontology should not change when
    the programming language changes

38
First Rule: Univocity
  • Terms (including those describing relations)
    should have the same meanings on every occasion
    of use.
  • In other words, they should refer to the same
    kinds of instances in reality

39
Example of univocity problem in case of part_of
relation
  • (Old) Gene Ontology
  • part_of = "may be part of"
  • flagellum part_of cell
  • part_of = "is at times part of"
  • replication fork part_of the nucleoplasm
  • part_of = "is included as a sub-list in"

40
Second Rule: Positivity
  • Complements of classes are not themselves
    classes.
  • Terms such as non-mammal, or non-frog, or
    non-membrane do not designate genuine classes.

41
Third Rule: Objectivity
  • Which classes exist is not a function of our
    biological knowledge.
  • Terms such as unknown or unclassified do not
    designate biological natural kinds.
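
As a rough illustration of how the Positivity and Objectivity rules might be screened for mechanically, the sketch below flags term names by pattern. The patterns and the `lint_term_names` helper are assumptions made for this example; a name match is only a hint for a human curator, not proof of a violation.

```python
import re

# Assumed patterns: name fragments that the two rules warn about.
SUSPECT_PATTERNS = {
    "positivity": re.compile(r"\bnon-"),
    "objectivity": re.compile(r"\b(unknown|unclassified|unlocalized)\b"),
}


def lint_term_names(names):
    """Return (name, rule) pairs for term names that look like rule violations."""
    findings = []
    for name in names:
        for rule, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(name.lower()):
                findings.append((name, rule))
    return findings


# Hypothetical term list for demonstration.
terms = ["mammal", "non-mammal", "membrane", "molecular function unknown"]
for name, rule in lint_term_names(terms):
    print(f"{name!r} may violate {rule}")
```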

42
Fourth Rule: Single Inheritance
  • No class in a classificatory hierarchy should
    have more than one is_a parent on the immediate
    higher level
  • I.e. no diamonds

43
Following the single inheritance rule
  • The position of a term within the hierarchy
    enriches its own definition by incorporating
    automatically the definitions of all the terms
    above it.
  • The entire information content of the term
    hierarchy can be translated very cleanly into a
    computer representation

44
Problems with multiple inheritance
  • B C
  • is_a1 is_a2
  • A
  • is_a no longer univocal
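
A minimal sketch of how "diamonds" could be detected, assuming the hierarchy is available as (child, parent) is_a pairs. The function name and the extra "root" node are hypothetical; A, B and C come from the diagram above.

```python
from collections import defaultdict


def multiple_parent_classes(is_a_edges):
    """Map each class with more than one direct is_a parent to its sorted parents."""
    parents = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)
    return {cls: sorted(ps) for cls, ps in parents.items() if len(ps) > 1}


# The slide's diagram: A is_a B and A is_a C ("root" is added for illustration).
edges = [("A", "B"), ("A", "C"), ("B", "root"), ("C", "root")]
print(multiple_parent_classes(edges))
```

Each class reported here is a place where a curator should ask whether the two is_a links really mean the same thing.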

45
Fifth Rule: Clarity of Text Definitions
  • The terms used in a definition should be simpler
    (more intelligible) than the term to be defined
  • otherwise the definition provides no assistance
    to human understanding
  • Machines can cope with the full formal
    representation (they don't need the text)

46
Sixth Rule: Basis in Reality
  • When building or maintaining an ontology, always
    think carefully about how classes (types, kinds,
    species) relate to instances in reality
  • Axioms governing instances
  • Every class has at least one instance (exceptions
    will occur at top levels)
  • Each child class has a smaller collection of
    instances than its parent class

47
Axiom: Every parent class has at least two
children
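
The instance axioms from the last two slides can be checked against toy data as follows. This is a sketch under stated assumptions: each class is mapped to the full set of its instances (including those of its subclasses), and all class and instance names are invented for the example.

```python
def check_axioms(children, instances):
    """Check the instance axioms.

    children:  class -> list of direct child classes
    instances: class -> set of all instances of that class
    """
    problems = []
    for cls, members in instances.items():
        if not members:
            problems.append(f"{cls} has no instances")
    for parent, kids in children.items():
        if len(kids) == 1:
            problems.append(f"{parent} has only one child")
        for kid in kids:
            # A child must have strictly fewer instances than its parent.
            if not instances[kid] < instances[parent]:
                problems.append(f"{kid} does not have fewer instances than {parent}")
    return problems


# Hypothetical data: 'mammal' is well formed, 'reptile' breaks two axioms.
children = {"mammal": ["dog", "cat"], "reptile": ["snake"]}
instances = {
    "mammal": {"rex", "tom"},
    "dog": {"rex"},
    "cat": {"tom"},
    "reptile": {"sid"},
    "snake": {"sid"},
}
for problem in check_axioms(children, instances):
    print(problem)
```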
48
The reason that rules are important
Interoperability
  • Ontologies should work together
  • Avoid redundancy in ontology building
  • Support reuse
  • Ontologies should be capable of being used by
    other ontologies (cumulation)

49
The problem of ontology re-use
  • SNOMED
  • MeSH
  • UMLS
  • NCIT
  • HL7-RIM
  • None of these have clearly defined relations
  • Still remain too much at the level of TERMINOLOGY
  • Not based on a common set of rules
  • Not based on a common set of relations

50
An example of unclear relationship use
  • A is_a B
  • A is more specific in meaning than B
  • HL7-RIM
  • Individual Allele is_a Act of Observation
  • cancer documentation is_a cancer
  • disease prevention is_a disease

51
How to define A is_a B
  • A is_a B =def.
  • A and B are names of universals (natural kinds,
    types) in reality
  • all instances of A are as a matter of biological
    science also instances of B

52
Benefits of well-defined relationships
  • If the relations in an ontology are well-defined,
    then reasoning can cascade from one relational
    assertion (A R1 B) to the next (B R2 C).
    Relations used in ontologies thus far have not
    been well defined in this sense.
  • "Find all DNA binding proteins" should also find
    all transcription factor proteins because
  • Transcription factor is_a DNA binding protein
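
The query behavior described above can be sketched as a walk down the is_a hierarchy. The helper names and the protein identifiers are placeholders invented for this example; only the is_a statement itself comes from the slide.

```python
def descendants(is_a, ancestor):
    """Return every class reachable from `ancestor` by following is_a downward."""
    found, stack = set(), [ancestor]
    while stack:
        current = stack.pop()
        for child, parent in is_a:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found


def query(is_a, annotations, term):
    """Find gene products annotated to `term` or to any of its is_a descendants."""
    wanted = {term} | descendants(is_a, term)
    return sorted(product for product, annotated in annotations if annotated in wanted)


is_a = [("transcription factor", "DNA binding protein")]
# Hypothetical annotations; the protein names are placeholders.
annotations = [
    ("protein_1", "transcription factor"),
    ("protein_2", "DNA binding protein"),
    ("protein_3", "kinase"),
]
print(query(is_a, annotations, "DNA binding protein"))
```

The query returns protein_1 as well as protein_2 only because is_a is defined so that every instance of the child is also an instance of the parent.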

53
Biomedical data integration / interoperability
  • Will never be achieved through integration of
    meanings or concepts
  • The problem: different user communities use
    different concepts
  • What is really needed is a well-defined, commonly
    used set of relationships

54
Seventh Rule: Distinguish Universals and Instances
  • A good ontology must distinguish clearly between
  • universals (types, kinds, classes)
  • and
  • instances (tokens, individuals, particulars)

55
Why distinguish classes from instances?
  • What holds on the level of instances may not hold
    on the level of universals
  • For example, my definition of an adjacent_to
    relation requires that it work in either
    direction
  • (This particular) nucleus adjacent_to (this
    particular) cytoplasm
  • Always true
  • Cytoplasm adjacent_to nucleus
  • Not always true

56
Using relations
  • Between classes
  • is_a, part_of, ...
  • Between an instance and a class
  • this explosion instance_of the class explosion
  • Between instances
  • Mary's heart part_of Mary
  • Relations must be defined to always work

57
Defining the part_of relation can be a problem
  • part_of as a relation between classes versus
    part_of as a relation between instances
  • nucleus part_of cell (classes)
  • your heart part_of you (instances)
  • testis part_of human being ?
  • heart part_of human being ?
  • human being has_part human testis ?

58
Similar considerations are required to clearly
define nearly all relations
  • A causes B
  • A is_located in B
  • A is_adjacent_to B
  • A derives_from B
  • Zygote derives_from ovum, sperm
  • A transformation_of B
  • Adult transformation_of child

59
The Rules
  1. Univocity: Terms should have the same meanings on
    every occasion of use
  2. Positivity: Terms such as non-mammal or
    non-membrane do not designate genuine classes.
  3. Objectivity: Terms such as unknown or
    unclassified or unlocalized do not designate
    biological natural kinds.
  4. Single Inheritance: No class in a classification
    hierarchy should have more than one is_a parent
    on the immediate higher level
  5. Intelligibility of Definitions: The terms used in
    a definition should be simpler (more
    intelligible) than the term to be defined
  6. Basis in Reality: When building or maintaining an
    ontology, always think carefully about how classes
    relate to instances in reality
  7. Distinguish Classes and Instances

60
Some rules are Rules of Thumb
  • The world is full of difficult trade-offs
  • The benefits of formal (logical and ontological)
    rigor need to be balanced
  • Against the constraints of computer tractability,
  • Against the needs of biomedical practitioners.
  • BUT do the very best you can!

61
3. Case Studies from the GO
  • http://www.geneontology.org

62
How has GO dealt with some specific aspects of
ontology development?
  • Univocity
  • Positivity
  • Objectivity
  • Definitions
  • Formal definitions
  • Written definitions
  • Ontology Re-use (Alignment)

63
The Challenge of Univocity: People call the same
thing by different names
Taction
Tactile sense
Tactition
?
64
Univocity: GO uses one term and many
characterized synonyms
Taction
Tactile sense
Tactition
perception of touch GO:0050975
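
One primary term with several synonyms can be modeled as a simple lookup. This is a sketch only: the term name, identifier and three synonyms come from the slide, while the `resolve` helper and the case-insensitive matching are assumptions.

```python
# Primary term name by identifier (from the slide).
PRIMARY = {"GO:0050975": "perception of touch"}

# Synonyms from the slide, all pointing at the same identifier.
SYNONYMS = {
    "taction": "GO:0050975",
    "tactile sense": "GO:0050975",
    "tactition": "GO:0050975",
}


def resolve(label):
    """Return the term identifier for a primary name or synonym, else None."""
    key = label.strip().lower()
    for term_id, name in PRIMARY.items():
        if name == key:
            return term_id
    return SYNONYMS.get(key)


for label in ["Taction", "Tactile sense", "Tactition", "perception of touch"]:
    print(label, "->", resolve(label))
```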
65
The Challenge of Univocity: People use the same
words to describe different things
66
Bud initiation? How is a computer to know?
67
Univocity: GO adds sensu descriptors to
discriminate among organisms
68
The Challenge of Positivity
Some organelles are membrane-bound. A centrosome
is not a membrane bound organelle, but it still
may be considered an organelle.
69
The Challenge of Positivity: Sometimes absence is
a distinction in a biologist's mind
non-membrane-bound organelle GO:0043228
membrane-bound organelle GO:0043227
70
Positivity
  • Note the logical difference between
  • non-membrane-bound organelle and
  • not a membrane-bound organelle
  • The latter includes everything that is not a
    membrane bound organelle!

71
The Challenge of Objectivity: Database users want
to know if we don't know anything (Exhaustiveness
with respect to knowledge)
We don't know anything about the ligand that
binds this type of GPCR
We don't know anything about a gene product
with respect to these
72
Objectivity
  • How can we use GO to annotate gene products when
    we know that we don't have any information about
    them?
  • Currently GO has terms in each ontology to
    describe unknown
  • An alternative might be to annotate genes to root
    nodes and use an evidence code to describe that
    we have no data.
  • Similar strategies could be used for things like
    receptors where the ligand is unknown.

73
GPCRs with unknown ligands
We could annotate to this
74
GO Definitions
A definition written by a biologist: necessary and
sufficient conditions; written definition (not
computable)
Graph structure: necessary conditions; formal
(computable)
75
Relationships and definitions
  • Important considerations
  • Placement in the graph: selecting parents
  • Appropriate relationships to different parents
  • True path violation

76
True path violation: What is it?
nucleus
Part_of relationship
"...the path from a child term all the way up to
its top-level parent(s) must always be true".
chromosome
Is_a relationship
Mitochondrial chromosome
77
True path violation: What is it?
nucleus
chromosome
Is_a relationships
Part_of relationship
Nuclear chromosome
Mitochondrial chromosome
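
The two diagrams can be compared with a small sketch that collects the part_of statements a term inherits from its is_a ancestors. The function name and the edge-list representation are assumptions for this example; it follows only direct part_of links of the term and its ancestors.

```python
def inferred_part_of(is_a, part_of, term):
    """Collect part_of targets that `term` has itself or inherits via is_a."""
    targets, seen, stack = set(), set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        targets.update(whole for part, whole in part_of if part == current)
        stack.extend(parent for child, parent in is_a if child == current)
    return targets


# First diagram: chromosome part_of nucleus, mitochondrial chromosome is_a chromosome.
is_a_old = [("mitochondrial chromosome", "chromosome")]
part_of_old = [("chromosome", "nucleus")]
# The inferred statement places the mitochondrial chromosome in the nucleus,
# which is false: a true path violation.
print(inferred_part_of(is_a_old, part_of_old, "mitochondrial chromosome"))

# Second diagram: only nuclear chromosome is part_of nucleus.
is_a_new = is_a_old + [("nuclear chromosome", "chromosome")]
part_of_new = [("nuclear chromosome", "nucleus")]
print(inferred_part_of(is_a_new, part_of_new, "mitochondrial chromosome"))
print(inferred_part_of(is_a_new, part_of_new, "nuclear chromosome"))
```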
78
The Importance of synonyms: is tRNA a function?
Molecular_function
Triplet codon amino acid adaptor activity
GO Definition: Mediates the insertion of an amino
acid at the correct point in the sequence of a
nascent polypeptide chain during protein
synthesis. Synonym: tRNA
79
Ontology integration: One of the current goals of
GO is integration
References to Cell Types in GO, with the matching
Cell Types in the Cell Ontology
  • cone cell fate commitment : retinal_cone_cell
  • keratinocyte differentiation : keratinocyte
  • adipocyte differentiation : fat_cell
  • dendritic cell activation : dendritic_cell
  • lymphocyte proliferation : lymphocyte
  • T-cell homeostasis : T_lymphocyte
  • garland cell differentiation : garland_cell
  • heterocyst cell differentiation : heterocyst

80
We can integrate the GO with other ontologies
  • Chemical ontologies
  • 3,4-dihydroxy-2-butanone-4-phosphate synthase
    activity
  • Anatomy ontologies
  • metanephros development
  • GO itself
  • mitochondrial inner membrane peptidase activity
  • Nota bene some time and effort will be required

81
Building Ontology
Improve
Collaborate and Learn
82
Applied Ontology: a summary
  • Dedicated editors
  • Practice good ontological hygiene
  • Engage the community
  • Reward compliance and get the ontology into use
  • Plan for change over time
  • KISS: Concentrate on what you can definitely
    agree upon, the steps you can take with certainty.

83
4. Case Studies for group discussion
84
mitosis and meiosis
  • It's been a full lunar cycle since we last talked
    about this on the mailing list, and I would like
    to draw everyone's attention once again to the
    exciting topics of chromosome segregation,
    nuclear division and cell division. The basic
    problem is the multiplicity of meanings attached
    to 'mitosis'. The word is used in the literature
    and colloquially to represent everything from
    chromosome segregation up to a full round of
    nuclear and cell division and there is no
    consensus on how to define it in scientific or
    general dictionaries (check www.onelook.com for
    proof). To compound the problem, the only process
    common to all species which undergo 'mitosis' is
    chromosome segregation; not all species undergo
    nuclear division or cell division during the
    processes described in the literature as
    'mitosis'. In the ontologies, we currently have
    'mitosis' defined as chromosome segregation and
    nuclear division. This is therefore wrong for
    those species in which there is no nuclear
    division accompanying chromosome segregation. How
    are we going to define mitosis?

85
  • Events of the mitotic cell cycle that need to be
    represented
  • mitotic chromosome segregation
  • mitotic nuclear division
  • mitotic cell division
  • Only component common to all these is mitotic
    chromosome segregation.
  • Structure must be flexible enough to accommodate
    any of the flavors of 'mitosis', no matter what
    the species and no matter whether the annotator
    has read the definition or not.

87
Backing up assertions
  • QUESTION: What evidence code is appropriate to
    use for statements of common knowledge?

88
  • The current documentation states that TAS may be
    used as the evidence code for statements of
    common knowledge.
  • For example, let's say you have a paper that says
    that Protein X is an xxxxx , with a direct assay
    for activity, so you can use IDA for this
    function term. Then it also makes a mutation in
    the gene for Protein X and shows that it is
    involved in process yyyy, so you can use IMP for
    the process term. But, the paper does not have
    any direct evidence about the localization of
    Protein X. However, everyone knows that process
    yyyy occurs in the cytoplasm, so you can annotate
    protein X to the component term cytoplasm
    (GO:5737) by TAS using a general reference like
    Biochemistry by Lubert Stryer.

89
  • There is not really a traceable statement in
    Stryer providing evidence that process yyyy
    occurs in this location in yeast.
  • SGD feels that it is better to use the newer
    evidence code IC for these common knowledge
    types of annotations. Thus, if an SGD curator
    felt that it was reasonable to make the
    annotation cytoplasm based on the knowledge
    that Protein X has the process annotation yyyy, then
    the curator could assign the component term
    cytoplasm (GO:5737) using IC and the GO id of
    the process term yyyy.

90
  • Many of these common knowledge types of
    statements are often not well based in actual
    experiments conducted on the organism of
    interest. Early biochemists would often
    perform experiments with materials that were easy
    to obtain, e.g. calf thymus, and assume that this
    accurately represented the situation for another
    organism, e.g. human. This may or may not be the
    case.

91
What is the most appropriate GO term for
annotating a response to methylmercury?
  • "Response to mercury ion" doesn't seem quite
    right, as it specifically states that the
    response is "as a result of exposure to mercuric
    ions (Hg2+)", but the more general-sounding
    "response to mercury" is a synonym of it. In the
    publication I am working on, they exposed
    zebrafish to methylmercury and documented the
    resulting changes in gene expression.

92
"Response to mercury ion"
  • Definition: A change in state or activity of the
    organism (in terms of movement, secretion, enzyme
    production, gene expression, etc.) as a result of
    exposure to mercuric ions (Hg2+).
  • Synonyms: response to mercuric, response to
    mercury

93
Homeobox > DNA binding?
  • http://www.geneontology.org/email-annotation/annotation-arc/annotation-2005/0208.html

94
  • Bloggers and other online groups (e.g.
    del.icio.us, Flickr online photo archive,
    Technorati) have been self-categorizing or
    'tagging' web sites and their content using
    user-defined words and phrases and not an
    expertly curated vocabulary or ontology. The end
    result is that a vast amount of content has been
    indexed using a rich vocabulary of tags (to date,
    Technorati has over 1.2 billion links tagged with
    1.2 million tags).
  • Whilst this certainly lacks the formal
    consistency that would be obtained with curated
    annotation against a standard vocabulary, the
    quantity of content being categorized far exceeds
    what could be done by a group of annotators and
    perhaps is richer because the tags are defined by
    the users and creators of that content, not by a
    third party interpreting the material after the
    fact.
  • Given the ever increasing quantity of scientific
    data, the proliferation of online publishing,
    etc., could scientists tagging their own data
    with their own terms be the way to go?

95
  • How can you recruit and train people, in both
    logic and biology, given that without a
    sufficient number of competent personnel the
    ontology cannot be maintained?

96
Thanks to NIH and HHMI for funding and support
  • And to my fantastic colleagues (whose slides
    these are)
  • MICHAEL ASHBURNER, BARRY SMITH, DAVID HILL,
    CORNELIUS ROSSE, CHRIS MUNGALL

97
P.S. Graphical User Interfaces
  • Semantics

98
Common pitfalls
  • Don't confuse instances with artifacts of your
    database representation...

99
Instances are not included!
  • It is the universals that are important
  • though instances must be taken into account.
  • Please keep this in mind, it is crucial to
    understanding the tutorial
  • Simon is an instance of the universal (class)
    human

100
Concept
  • Concepts are in your head and will change as our
    understanding changes
  • Universals exist and have an objective reality

101
Ontologies as Controlled Vocabularies
  • expressing discoveries in the life sciences in a
    uniform way
  • providing a uniform framework for managing
    annotation data deriving from different sources
    and with varying types and degrees of evidence

102
Structured definitions contain both genus and
differentiae
Essence = Genus + Differentiae
neuron cell differentiation
Genus: differentiation (processes whereby a
relatively unspecialized cell acquires the
specialized features of ...)
Differentiae: acquires features of a neuron
103
Key idea: To define ontological relations
  • Move from associative relations between meanings
    to strictly defined relations between the
    entities themselves.
  • The relations can then be used computationally in
    the way required.
  • For example: part_of, develops_from
  • Definitions will enable computation
  • To define relations we must look at more than the
    classes.
  • We need also to take account of instances and time

104
is_a is pressed into service to mean a variety
of different things
  • shortfalls from single inheritance are often
    clues to multiple is_a classification meanings
  • the resulting ambiguities make it difficult for
    curators to reliably enter new terms (errors).
  • serves as obstacle to integration with
    neighboring ontologies
  • The success of ontology alignment depends
    crucially on the degree to which basic
    ontological relations such as is_a and part_of
    can be relied on as having the same meanings in
    the different ontologies to be aligned.

105
Definitions of the all-some form
  • Allow cascading inferences
  • If A R1 B and B R2 C, then we know that
  • Every A stands in R1 to some B,
  • but we know also that, whichever B this is, it
    can be plugged into the R2 relation, because R2
    is defined for every B.
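
The cascading inference can be illustrated on toy instance data. This is a sketch under assumptions: the helper names and all instance identifiers are invented, and relations are stored as plain sets of (instance, instance) pairs.

```python
def holds_all_some(instances_of, rel, A, B):
    """Class-level 'A rel B': every instance of A stands in rel to some instance of B."""
    return all(
        any((a, b) in rel for b in instances_of[B])
        for a in instances_of[A]
    )


def chain(instances_of, r1, r2, A, B, C):
    """For each instance a of A, list the (b, c) pairs reached via r1 and then r2."""
    result = {}
    for a in instances_of[A]:
        result[a] = [
            (b, c)
            for b in instances_of[B] if (a, b) in r1
            for c in instances_of[C] if (b, c) in r2
        ]
    return result


# Hypothetical instance data.
instances_of = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1"]}
r1 = {("a1", "b1"), ("a2", "b2")}
r2 = {("b1", "c1"), ("b2", "c1")}

print(holds_all_some(instances_of, r1, "A", "B"))
print(holds_all_some(instances_of, r2, "B", "C"))
print(chain(instances_of, r1, r2, "A", "B", "C"))
```

When both class-level statements hold in the all-some sense, every instance of A is guaranteed a non-empty chain, whichever B it happens to be related to.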

106
Not only relations
  • We can apply the same methodology to other
    top-level categories in ontology, e.g.
  • anatomical structure
  • process
  • function
  • regulation, inhibition, suppression, co-factor
    ...
  • boundary, interior
  • contact, separation, continuity
  • tissue, membrane, sequence, cell

107
To the degree that the above rules are not
satisfied, error checking and ontology alignment
will be achievable, at best, only with human
intervention and via force majeure
108
What we have argued for
  • A methodology which enforces clear, coherent
    definitions
  • This promotes quality assurance
  • intent is not hard-coded into software
  • Meaning of relationships is defined, not inferred
  • Guarantees automatic reasoning across ontologies
    and across data at different granularities

109
The importance of relationships
  • Cyclin dependent protein kinase
  • Complex has a catalytic and a regulatory subunit
  • How do we represent these activities (function)
    in the ontology?
  • Do we need a new relationship type (regulates)?

Molecular_function
Catalytic activity
Enzyme regulator activity
protein kinase activity
Protein kinase regulator activity
protein Ser/Thr kinase activity
Cyclin dependent protein kinase activity
Cyclin dependent protein kinase regulator activity
110
GO textual definitions: Related GO terms have
similarly structured (normalized) definitions
111
Alignment of the Two Ontologies will permit the
generation of consistent and complete definitions
GO

Cell type

Osteoblast differentiation: Processes whereby an
osteoprogenitor cell or a cranial neural crest
cell acquires the specialized features of an
osteoblast, a bone-forming cell which secretes
extracellular matrix.
New Definition
112
Alignment of the Two Ontologies will permit the
generation of consistent and complete definitions
id: GO:0001649
name: osteoblast differentiation
synonym: osteoblast cell differentiation
genus: differentiation GO:0030154 (differentiation)
differentium: acquires_features_of CL:0000062 (osteoblast)
definition (text): Processes whereby a relatively
unspecialized cell acquires the specialized
features of an osteoblast, the mesodermal cell
that gives rise to bone
Formal definitions with necessary and sufficient
conditions, in both human readable and computer
readable forms
113
part_of
  • part_of must be time-indexed for spatial classes
  • A part_of B is defined as
  • Given any instance a and any time t,
  • If a is an instance of the universal A at t,
  • then there is some instance b of the universal B
  • such that
  • a is an instance-level part_of b at t
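
The time-indexed definition can be checked against a small fact base as sketched below. The representation (triples of instance, class or whole, and time) and all identifiers are assumptions for this example; the sketch also reads the definition as requiring b to be an instance of B at the same time t.

```python
def class_part_of(instance_of, inst_part_of, A, B):
    """Check class-level 'A part_of B' against time-indexed instance facts.

    instance_of:  set of (instance, class, time) facts
    inst_part_of: set of (part, whole, time) facts
    """
    for a, cls, t in instance_of:
        if cls != A:
            continue
        has_whole = any(
            (a, b, t) in inst_part_of
            for b, cls_b, t_b in instance_of
            if cls_b == B and t_b == t
        )
        if not has_whole:
            return False
    return True


# Hypothetical facts: one nucleus inside one cell at two time points.
instance_of = {
    ("n1", "nucleus", 1), ("c1", "cell", 1),
    ("n1", "nucleus", 2), ("c1", "cell", 2),
}
inst_part_of = {("n1", "c1", 1), ("n1", "c1", 2)}

print(class_part_of(instance_of, inst_part_of, "nucleus", "cell"))
print(class_part_of(instance_of, inst_part_of, "cell", "nucleus"))
```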

114
derives_from
C1 c1 at t1
C c at t
time
C' c' at t
ovum
zygote
derives_from
sperm
115
transformation_of
  • C2 transformation_of C1 is defined as
  • Given any instance c of C2
  • c was at some earlier time an instance of C1
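
A minimal sketch of the same definition in code, assuming each instance has a time-ordered history of the classes it instantiated. The history format and the example ages are invented; the adult and child classes come from the earlier "Adult transformation_of child" example.

```python
def transformation_of(history, C2, C1):
    """Check 'C2 transformation_of C1'.

    history: instance -> list of (time, class) pairs, sorted by time.
    Every instance of C2 must have been an instance of C1 at an earlier time.
    """
    for timeline in history.values():
        for index, (_, cls) in enumerate(timeline):
            if cls != C2:
                continue
            earlier_classes = [c for _, c in timeline[:index]]
            if C1 not in earlier_classes:
                return False
    return True


# Hypothetical history for a single individual.
history = {"mary": [(0, "child"), (18, "adult")]}

print(transformation_of(history, "adult", "child"))
print(transformation_of(history, "child", "adult"))
```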

116
embryological development
117
tumor development
118
Key
  • In the following discussion
  • Classes are in upper case
  • A is the class
  • Instances are in lower case
  • a is a particular instance

119
Placement in the graph
  • Example: Proteasome complex