Title: Introduction to ontologies
1Introduction to ontologies
2DATA TYPES ???
3Vocabulary examples
Different resources try to control their
vocabulary. However, there is no standardized
format or definition, and often no equivalence
(cross-reference) to another resource. The
vocabularies are organized in a simple tree
format.
4Tree format
- Each node (or term, or family member) has only one parent.
- Each item in the tree (except for the root) has exactly one other element above it.
- A directory listing is a familiar example of this kind of tree.
[Figure, flattened: example tree] Plant structure -- cell -- trichoblast -- sporophyte -- shoot -- stem -- leaf -- root -- trichoblast -- tissue
[Figure, flattened: directory tree] C -- data -- Program Files -- Office -- Microsoft Works -- Microsoft Dont Works -- Games -- Monkey Wars -- Windows
A directory structure is a tree because each
directory is contained within exactly one other
directory.
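The single-parent property can be sketched in a few lines of Python. The nesting used here is an assumed reading of the flattened directory figure, and `PARENT` and `path_to_root` are illustrative names, not part of any tool mentioned in these slides.

```python
# Each directory maps to its single parent; the root maps to None.
# The nesting is an assumption: the original figure is flattened.
PARENT = {
    "C": None,
    "data": "C",
    "Program Files": "C",
    "Office": "Program Files",
    "Microsoft Works": "Office",
    "Microsoft Dont Works": "Office",
    "Games": "Program Files",
    "Monkey Wars": "Games",
    "Windows": "C",
}


def path_to_root(node, parent=PARENT):
    """Follow the single parent link upward until the root is reached."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


print(" -> ".join(path_to_root("Monkey Wars")))
```

Because every node stores exactly one parent, there is only ever one path from a node to the root.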
5Directed Acyclic Graph (DAG).
[Figure, flattened: example DAG] Object -- toxin -- mercury -- paint thinner -- canned items -- canned soups -- paint thinner
[Figure, flattened: example DAG] Plant structure -- cell -- apical cell -- trichoblast -- root -- root cortex -- trichoblast
A DAG is very similar to a tree, except that each
node can have multiple parents, and there can be
more than one root.
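As a minimal sketch, a DAG can be stored by giving each term a list of parents instead of a single one. The parent links below are an assumed reading of the plant structure figure; `PARENTS`, `roots`, and `is_acyclic` are hypothetical names chosen for illustration.

```python
# Each term maps to a list of parents; a root has an empty list.
# The links are an assumed reading of the flattened figure.
PARENTS = {
    "plant structure": [],
    "cell": ["plant structure"],
    "root": ["plant structure"],
    "apical cell": ["cell"],
    "root cortex": ["root"],
    "trichoblast": ["cell", "root"],  # two parents: allowed in a DAG
}


def roots(parents):
    """Terms without parents; a DAG may have more than one."""
    return sorted(term for term, ps in parents.items() if not ps)


def is_acyclic(parents):
    """Return True if no term can reach itself by following parent links."""
    state = {}  # term -> "visiting" or "done"

    def visit(term):
        if state.get(term) == "done":
            return True
        if state.get(term) == "visiting":
            return False  # came back to a term still being explored
        state[term] = "visiting"
        ok = all(visit(p) for p in parents[term])
        state[term] = "done"
        return ok

    return all(visit(term) for term in parents)


print(roots(PARENTS), is_acyclic(PARENTS))
```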
6DAGs can be much more complicated. There can be
many levels of ancestors, and each node
can have any number of parents.
[Figure, flattened: larger DAG] -- cell -- epidermal cell -- trichoblast -- guard cell -- tissue -- epidermis -- epidermal cell -- root -- epidermis -- epidermal cell -- trichoblast -- shoot -- stem -- epidermis -- epidermal cell -- leaf -- epidermis -- epidermal cell -- guard cell
However, no matter how complicated a DAG is, you
can always display it as a tree by repeating a node
under each of its parents.
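One way to picture this is to unfold the DAG recursively, printing a shared term once under each of its parents. This is a sketch only: the `CHILDREN` map is an assumed small example and `unfold` is a hypothetical helper.

```python
# Parent -> children map for a small DAG (assumed example);
# "trichoblast" has two parents.
CHILDREN = {
    "plant structure": ["cell", "root"],
    "cell": ["apical cell", "trichoblast"],
    "root": ["root cortex", "trichoblast"],
}


def unfold(term, children, depth=0):
    """Return indented lines; a term with several parents is repeated under each."""
    lines = ["  " * depth + term]
    for child in children.get(term, []):
        lines.extend(unfold(child, children, depth + 1))
    return lines


print("\n".join(unfold("plant structure", CHILDREN)))
```

The recursion terminates because the graph has no cycles; the cost is that shared terms, and everything below them, appear more than once.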
7Parent to term relationship-1 (IS A and PART OF)
In the DAG, the glossary terms appear to be arranged
according to their organization within a given plant
part or sub-section. This can be made
more explicit by including relationship types
based on known concepts.
[Figure, flattened] Plant structure -- cell -- guard cell -- trichoblast -- root -- root cortex -- trichoblast
- The statements look like:
- Trichoblast and root cortex are part of root.
- Trichoblast and apical cell are instances of cell.
- Cell and root are instances of plant structure.
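The statements above can be sketched as typed (child, relationship, parent) triples. The identifiers `is_a` and `part_of`, and the helper names, are illustrative choices rather than a fixed format.

```python
# (child, relationship, parent) triples taken from the statements above.
RELATIONS = [
    ("trichoblast", "part_of", "root"),
    ("root cortex", "part_of", "root"),
    ("trichoblast", "is_a", "cell"),
    ("apical cell", "is_a", "cell"),
    ("cell", "is_a", "plant structure"),
    ("root", "is_a", "plant structure"),
]

PHRASES = {"is_a": "is an instance of", "part_of": "is part of"}


def statements(relations):
    """Render each triple as a readable sentence."""
    return [f"{child} {PHRASES[rel]} {parent}" for child, rel, parent in relations]


def parents_of(term, relations, rel_type=None):
    """Parents of a term, optionally restricted to one relationship type."""
    return [p for c, r, p in relations if c == term and rel_type in (None, r)]


for line in statements(RELATIONS):
    print(line)
```

Keeping the relationship type on every link is what lets a reader tell "trichoblast is part of root" apart from "trichoblast is an instance of cell".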
8Parent to term relationship-2 (DEVELOPS FROM)
[Figure: plant structure, cell, root, guard cell, trichoblast, root hair, and root cortex connected by edges labeled is a (i), part of (p), and develops from (D).]
9Anatomy of an Ontology
- In the Ontology
- Every term has
- A unique identifier (id or an accession)
- Term name
- Synonyms / aliases
- Definition
- Source of the definition
- Relationship to its parent term(s)
- Comments if any
- Species-specific definitions are avoided.
- No term is ever deleted from the ontology database.
- A term can become OBSOLETE (a flag indicating that it should not be used in annotations).
- Rationale for creating a term in the ontology
- Anatomy and morphology (macro anatomy)
- Derivation
- Spatial / positional organization
- Introduce grouping terms as higher nodes for the same concepts/ontogeny.
- Avoid overlaps with Gene Ontology terms by excluding subcellular structures.
- Avoid attributes of anatomical terms, e.g. superior/inferior ovary.
- Use the synonym field effectively to cover various instances, types, and terms with attributes.
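A term record with the fields listed above might look like the following sketch. The class names, field names, and the `EX:` identifiers are hypothetical and do not reflect the actual database schema.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    term_id: str                  # unique identifier (accession)
    name: str
    definition: str = ""
    definition_source: str = ""
    synonyms: list = field(default_factory=list)
    parents: list = field(default_factory=list)  # (relationship, parent_id) pairs
    comment: str = ""
    obsolete: bool = False        # flag only; the record itself is kept


class Ontology:
    def __init__(self):
        self._terms = {}

    def add(self, term):
        if term.term_id in self._terms:
            raise ValueError(f"duplicate id: {term.term_id}")
        self._terms[term.term_id] = term

    def retire(self, term_id, reason=""):
        """Mark a term obsolete instead of deleting it."""
        term = self._terms[term_id]
        term.obsolete = True
        if reason:
            term.comment = reason

    def usable_for_annotation(self):
        return [t for t in self._terms.values() if not t.obsolete]

    def __len__(self):
        return len(self._terms)


onto = Ontology()
onto.add(Term("EX:0000001", "root", synonyms=["radix"]))  # hypothetical id and synonym
onto.add(Term("EX:0000002", "old term"))                  # hypothetical id
onto.retire("EX:0000002", reason="replaced by a broader term")
print(len(onto), [t.name for t in onto.usable_for_annotation()])
```

Retiring a term leaves it in the collection, so existing references to its identifier still resolve.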
10An example of a term Root
11How are anatomy terms organized in the DAG?
A previously shown view
PO anatomy structure
12The Granular View
PART OF female gametophyte IS A cell DEVELOPS
FROM polar nucleus
A user can search using the exact term or a
broader term and still reach the set of
terms he or she is interested in.
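Searching by a broader term amounts to collecting that term together with everything beneath it. The sketch below assumes a small parent-to-children map based on the epidermal cell example; `term_and_descendants` is a hypothetical helper.

```python
# Parent -> children map (assumed nesting from the epidermal cell example).
CHILDREN = {
    "cell": ["epidermal cell"],
    "epidermal cell": ["trichoblast", "guard cell"],
}


def term_and_descendants(term, children):
    """Collect a term and every term reachable below it."""
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children.get(current, []))
    return found


print(sorted(term_and_descendants("epidermal cell", CHILDREN)))
print(sorted(term_and_descendants("guard cell", CHILDREN)))
```

A query on "epidermal cell" therefore also reaches "guard cell" and "trichoblast", while a query on "guard cell" returns only that term.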
13True Path Rule (TPR)
The path from a child term all the way up to its
top-level parent(s) must always be true.
Did you notice a problem in this tree?
What's the solution?
14True Path Rule solution to the violation
- The solution
- Drop the fruit's relationship with shoot.
- Consider making fruit an instance of sporophyte.
- Thus radicle, as we truly know, is an instance of root and is also part of a sporophyte.
Lineage of radicle in POC anatomy
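Whether a path is "true" is a biological judgment, so code cannot decide it, but it can list every path from a term to the top for a curator to read. The sketch below assumes a simplified lineage for radicle; the links and the name `paths_to_top` are illustrative only.

```python
# Term -> list of (relationship, parent) links.
# Simplified, assumed lineage for illustration.
PARENTS = {
    "radicle": [("is_a", "root")],
    "root": [("part_of", "sporophyte")],
    "sporophyte": [],
}


def paths_to_top(term, parents):
    """Return every path from a term up to a top-level term."""
    links = parents.get(term, [])
    if not links:
        return [[term]]
    paths = []
    for rel, parent in links:
        for tail in paths_to_top(parent, parents):
            paths.append([term, rel] + tail)
    return paths


for path in paths_to_top("radicle", PARENTS):
    print(" ".join(path))
```

Each printed line is one chain of statements that must hold from the child to the top; a chain that reads as false signals a True Path Rule violation.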
15Using sensu for species-specific terms (sensu means "in the sense of" or "restricted to")
- Many plant parts are not common to all plants.
- The convention is to include any term that can apply to more than one taxonomic class of plants.
- There are cases where a word or phrase has different meanings when applied to different organisms.
- A floret in Asteraceae is very different from a grass (Poaceae) floret.
- A spikelet in the maize (Zea) tassel is different from a generic grass (Poaceae) spikelet.
- Distinguish identical terms from one another by their definitions and by the sensu designation.
- It makes the node available to other species that use the same term.
- A node is divided into sensu sub-trees where the children are, or are likely to be, different.
- Terms with the sensu attribute do not have children with generic concepts.
Generic floret
Grass floret
16Different Types Of Tree in Anatomy Ontology
17Ontology Structure Plant Anatomy example
[Figure: plant anatomy ontology with genes annotated to its nodes. Gene labels shown on the nodes: Fl1 dl1 cps Du8 atp1; Fl1 dl1 cps Du8; dl1 Cps Du8; Cps Du8; atp1; Du8.]
Learn more about the annotation process and ontology uses.
18Gramene Present strategy
[Diagram, flattened. Labels shown:]
- Mutant / phenotype: simple or complex
- Traits: Trait-1, Trait-2, Trait-3, Trait-4, Trait-5
- Assayed trait(s) / Trait Ontology terms
- SCORE: absolute / relative (free text)
- PO_Developmental stage(s)
- PO_Anatomy(s)
- GO_Function, GO_Process, GO_Component
- Environment interaction (EO)
- Link labels: assayed at / affected / expressed; part of; expressed in / affected; assayed / affected
19Semi-dwarf mutant, sd1
- Plant height is semi-dwarf.
- Resistant to lodging, especially at high fertilizer levels.
- High yielding.
- Elongation of lower internodes is less than that of upper internodes.
- Inhibition of cell division during elongation.
- Defective in a biosynthetic enzyme.
- GA20ox-2 catalyzes the conversion of GA53 to GA20.
(1) Monna et al., (2) Sasaki et al., and (3)
Spielmeyer et al. (2002)
20Associated Plant Growth Stages
Assayed at
Expressed at / affected
[Figure: timeline of SES rice growth stages 1 to 9]
SES rice growth stages
From Moldenhauer & Slaton, Rice Growth and Development (http://www.uaex.edu/Other_Areas/publications/HTML/MP192/default.asp1)
21Associated CVO terms
PO_anatomy sd-1
- PO
- PO_growth stage affected: stem elongation
- PO_growth stage assayed at: heading
- PO_anatomy-affected: stem / culm
- PO_anatomy-expressed in: second internode
- GO
- GO_F: GA-20 oxidase
- GO_F: regulation of cell proliferation
- GO_P: GA biosynthesis
- GO_P: cytokinesis
- GO_C: cytoplasm
22PATO
23Fertilizer Growth hormone
semidwarf-1 sd1
Suggested curation strategy
Affected: stem elongation. Assayed at: heading
- GO_F GA-20 oxidase
- GO_F regulation of cell proliferation
- GO_P GA biosynthesis
- GO_P cytokinesis
- GO_C cytoplasm
- Height: semi-dwarf (90-110 cm)
- Yield: more
- Response: insensitive
- Enzyme activity: less / absent
- Gene expression: less / absent
- PO_anatomy-affected: stem / culm
- PO_anatomy-expressed in: second internode
24Heading date-1 hd1/se1Ath_CO
Photoperiod (hr_day-night); Intensity (lux); Quality (wavelength); Temperature
Affected: panicle development. Assayed at: heading
- GO_F transcription factor
- GO_P cell fate commitment
- GO_P entrainment of circadian clock
- GO_P response to photoperiod
- Time: early_flowering (in 8-12 weeks)
- Response: sensitive to Environ.
- Gene exp: more_LD, less_SD
- PO_anatomy-affected: panicle / inflorescence
- PO_anatomy-expressed in: leaf, inflorescence meristem