The Genome Gamble, Knowledge or Carnage?

1
The Genome Gamble, Knowledge or Carnage?
  • Comparative Genomics Leading the Way at Organon

Tim Hulsen, Oss, November 11, 2003
2
Summary
  • (1) An introduction to orthology and paralogy
  • (2) Orthology determination within eukaryotes
  • (3) Testing the advantages of our ortholog set
  • (4) Using evolutionary conservation of
    co-expression for function prediction
  • (5) Evolutionary conservation of chromosomal
    distance and orientation

3
(1) An introduction to orthology and paralogy
  • Homologous genes: genes that have a common
    ancestor
  • Orthologous genes: genes that evolved from a
    common ancestor through a speciation event (i.e.
    equivalents in different species)
  • Paralogous genes: genes that evolved from a
    common ancestor through a duplication event

4
Orthology and paralogy explained graphically
(from http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/Orthology.html)
5
The importance of orthology and paralogy
  • Orthology relationships are especially important for
    function prediction: orthologous genes generally
    have the same function, but in different species
  • Paralogy relationships can be used for function
    prediction too: paralogous genes are often
    involved in the same process, but have different
    molecular functions (e.g. globins)

6
(2) Orthology determination within eukaryotes
  • Not much eukaryotic orthology data available at this
    moment:
  • euKaryotic Orthologous Groups (KOG, NCBI)
  • Inparanoid
  • OrthoMCL
  • Existing databases are either too inclusive or
    too restrictive
  • Most methods rely on the best bidirectional hit
    (E-value), while orthology is an evolutionary
    principle and should be determined using
    phylogenetic trees!
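
The best-bidirectional-hit criterion mentioned above can be sketched in a few lines. This is only an illustration of the general idea, not the procedure used by KOG, Inparanoid or OrthoMCL. The gene names and scores are hypothetical, and the sketch assumes a similarity score where higher is better (with E-values the comparison would be reversed).

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring target gene."""
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical similarity scores (higher = more similar); gene names are made up.
human_vs_mouse = {
    ("hsA", "mmA"): 900, ("hsA", "mmB"): 400,
    ("hsB", "mmA"): 850, ("hsB", "mmB"): 300,
}
mouse_vs_human = {
    ("mmA", "hsA"): 900, ("mmA", "hsB"): 850,
    ("mmB", "hsA"): 400, ("mmB", "hsB"): 300,
}

pairs = bidirectional_best_hits(human_vs_mouse, mouse_vs_human)
print(pairs)
```

In this toy input only hsA and mmA are each other's best hit, so hsB and mmB are left without a partner. That is one way a pure best-hit criterion can be too restrictive when duplications have occurred.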

7
Our orthology determination
within eukaryotes
  • Hs

  • At, Ce, Dm, Ec, Gt, Hs, Mm, Sc, Sp
  • Z > 20, RH > 0.5 QL
  • 24,263 groups

GENOME
Hs-Mm 85,848 pairs Hs-Dm 55,934 pairs etc.
TREE SCANNING
8
Our orthology determination using phylogenetic
trees
  • Example: BMP6 (Bone Morphogenetic Protein 6) --> 5
    orthologous relations are defined, all Hs-Mm

9
The ortholog database Eukaryortho
http://t2.teras.sara.nl:4086
(only accessible from Organon, CMBI and SARA)
10
(3) Testing the advantages of our ortholog set
  • Quality of orthology is difficult to test
  • Orthologs should have more or less the same
    function --> use conservation of function as an
    orthology benchmark
  • Gene Ontology (GO) database: hierarchical system
    of function and location descriptions
  • Orthologs are in the same functional category when
    they are in the same 4th level GO Molecular
    Function class

11
GO molecular function benchmark
  • [Figure: GO hierarchy, levels 0 to 4]
  • Molecular function: one of the three subroots
    (together with biological process and cellular
    location)
  • True orthologs should share a 4th level
    molecular function (here GO:0019912)
  • Our Hs-Mm ortholog set: 67
  • KOG Hs-Mm ortholog set: 51
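
The benchmark above can be illustrated with a small sketch: walk each annotated term up to the class at level 4 and count the ortholog pairs that share one. Everything in the example is hypothetical (term IDs, gene names, the toy hierarchy), and it assumes the root is level 0; it is not the code behind the reported numbers.

```python
def paths_from_root(term, parents):
    """List every root-to-term path in a GO-like DAG (a term may have several parents)."""
    if not parents.get(term):
        return [[term]]
    return [path + [term]
            for parent in parents[term]
            for path in paths_from_root(parent, parents)]


def classes_at_level(terms, parents, level=4):
    """Collect the classes at depth `level` (root = 0) at or above any of `terms`."""
    return {path[level]
            for term in terms
            for path in paths_from_root(term, parents)
            if len(path) > level}


def fraction_sharing_class(pairs, annotations, parents, level=4):
    """Fraction of ortholog pairs whose members share at least one class at `level`."""
    shared = sum(
        1 for a, b in pairs
        if classes_at_level(annotations[a], parents, level)
        & classes_at_level(annotations[b], parents, level)
    )
    return shared / len(pairs)


# Toy hierarchy (child -> parents); all identifiers are invented for illustration.
parents = {
    "root": [], "f1": ["root"], "f2": ["f1"], "f3": ["f2"],
    "f4a": ["f3"], "f4b": ["f3"],
    "f5a": ["f4a"], "f5b": ["f4a"], "f5c": ["f4b"],
}
annotations = {"hs1": {"f5a"}, "mm1": {"f5b"}, "hs2": {"f5a"}, "mm2": {"f5c"}}
pairs = [("hs1", "mm1"), ("hs2", "mm2")]

fraction = fraction_sharing_class(pairs, annotations, parents)
print(fraction)
```

Pairs whose annotations do not reach level 4 are counted as not sharing a class in this sketch; a real benchmark might instead exclude them.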

12
Co-expression benchmark
  • Second method: comparing expression profiles of
    each orthologous gene pair
  • Using GeneLogic Expressor data set
  • Human chips: 3269 samples, 44792 fragments, 115
    tissue categories, 15 SNOMED tissue categories
  • Mouse chips: 859 samples, 36701 fragments, 25
    tissue categories, 12 SNOMED tissue categories

13
SNOMED tissue categories used for co-expression
calculation
HUMAN                        MOUSE
1  Blood vessel              1  Blood vessel
2  Cardiovascular system     2  Cardiovascular system
3  Digestive organs          3  Digestive organs
4  Digestive system          4  Digestive system
5  Endocrine gland           -
6  Female genital system     5  Female genital system
7  Hematopoietic system      6  Hematopoietic system
8  Integumentary system      7  Integumentary system
9  Male genital system       8  Male genital system
10 Musculoskeletal system    9  Musculoskeletal system
11 Nervous system            10 Nervous system
12 Product of conception     -
13 Respiratory system        11 Respiratory system
14 Topographic region        -
15 Urinary tract             12 Urinary tract
14
Calculating the correlation
\[
r = \frac{N\sum xy - (\sum x)(\sum y)}{\sqrt{\left(N\sum x^2 - (\sum x)^2\right)\left(N\sum y^2 - (\sum y)^2\right)}}
\]

Human gene 1   Mouse gene 1   Tissue     Human gene 2   Mouse gene 2
206316_s_at    162926_at      category   205428_s_at    97166_at
41.04          83.56          1          62.95          49.11
30.78          61.11          2          67.72          45.18
74.73          92.95          3          93.2           40.76
43.9           78.85          4          68.48          41.2
39.23          88.93          5          54.8           41.24
88.72          100.7          6          52.16          49.64
39.71          83.15          7          73.56          42.84
135.42         169.28         8          46.59          49.58
55.98          79.91          9          205.58         0
0              59.05          10         142.9          34.7
54.78          97.37          11         48.57          48.04
68.11          87.85          12         48.97          46.26
--> High correlation: 0.914167           --> Low correlation: -0.935731
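
As a check on the calculation, the correlation formula from this slide can be applied directly to the two example gene pairs in the table. The function name `pearson` and the variable names are our own; the expression values are copied from the table, one per shared tissue category.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation using the sum-based formula from the slide."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


# Expression values per tissue category (1..12), copied from the table.
human_gene_1 = [41.04, 30.78, 74.73, 43.9, 39.23, 88.72,
                39.71, 135.42, 55.98, 0, 54.78, 68.11]
mouse_gene_1 = [83.56, 61.11, 92.95, 78.85, 88.93, 100.7,
                83.15, 169.28, 79.91, 59.05, 97.37, 87.85]
human_gene_2 = [62.95, 67.72, 93.2, 68.48, 54.8, 52.16,
                73.56, 46.59, 205.58, 142.9, 48.57, 48.97]
mouse_gene_2 = [49.11, 45.18, 40.76, 41.2, 41.24, 49.64,
                42.84, 49.58, 0, 34.7, 48.04, 46.26]

r_pair_1 = pearson(human_gene_1, mouse_gene_1)
r_pair_2 = pearson(human_gene_2, mouse_gene_2)
print(r_pair_1, r_pair_2)
```

The slide reports 0.914167 for the first pair and -0.935731 for the second, so the output should agree with those values up to rounding.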
15
Co-expression comparison of our ortholog set to
the KOG set
16
(4) Using evolutionary conservation of
co-expression for function prediction
Human
Gene A Gene B
Co-expression = Cab (-1 ≤ corr. ≤ 1)
(Co-expression calculated over 115 tissues in
human, 25 in mouse)
Human/Mouse
Gene A Gene B
Ca'b' ≥ Cab
--> Increases probability that A and B are involved
in the same process
17
GO biological process benchmark
  • [Figure: GO hierarchy, levels 0 to 4]
  • Biological process: one of the three subroots
    (together with cellular location and molecular
    function)
  • Both orthologs and paralogs are often involved in
    the same process/pathway (sharing a 4th level
    biological process, here GO:0007584)

18
Conservation of co-expression used in function
prediction
19
The importance of (conserved) co-expression for
function prediction
  • Co-expression without conservation can already be
    used for function prediction
  • Paralogous conservation gives a 2x higher
    accuracy
  • Orthologous conservation gives a 3x or 4x higher
    accuracy
  • Alternative to GO Biological Process: KEGG
    Pathway database --> similar results
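
One simple way to picture "conserved co-expression" is to keep only those human gene pairs that are co-expressed and whose mouse orthologs are co-expressed as well. The sketch below does that with a fixed cutoff. The cutoff of 0.6, the gene names and the correlation values are all assumptions made for illustration, and the talk's actual criterion may differ.

```python
def conserved_coexpression(c_human, c_mouse, orthologs, threshold=0.6):
    """Return human gene pairs co-expressed (>= threshold) in human and,
    via their orthologs, in mouse. The threshold is a hypothetical cutoff."""
    conserved = []
    for (a, b), c_ab in sorted(c_human.items()):
        if a not in orthologs or b not in orthologs:
            continue  # no ortholog known for one of the genes
        key = (orthologs[a], orthologs[b])
        # The mouse table may store the pair in either order.
        c_mouse_ab = c_mouse.get(key, c_mouse.get(key[::-1]))
        if c_mouse_ab is not None and c_ab >= threshold and c_mouse_ab >= threshold:
            conserved.append((a, b))
    return conserved


# Made-up correlations and ortholog assignments.
c_human = {("hsA", "hsB"): 0.8, ("hsA", "hsC"): 0.7, ("hsB", "hsC"): 0.2}
c_mouse = {("mmA", "mmB"): 0.75, ("mmC", "mmA"): 0.1, ("mmB", "mmC"): 0.9}
orthologs = {"hsA": "mmA", "hsB": "mmB", "hsC": "mmC"}

conserved_pairs = conserved_coexpression(c_human, c_mouse, orthologs)
print(conserved_pairs)
```

Here hsA/hsC is co-expressed in human only and hsB/hsC in mouse only, so just hsA/hsB counts as conserved.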

20
(5) Evolutionary conservation of chromosomal
distance and orientation
Human
Gene A Gene B
Distance = Dab (bp)
Orientation = Oab (→→, ←→, →←)
Co-expression = Cab (-1 ≤ corr. ≤ 1)
(Co-expression calculated over 115 tissues in
human, 25 in mouse)
Human/Mouse
Gene A Gene B
Da'b' ≤ Dab, Oa'b' = Oab, Ca'b' ≥ Cab
--> Increases probability that A and B are involved
in the same process
21
Function prediction using co-expression and
chromosomal distance (without conservation)
22
Conservation of chromosomal distance used in
function prediction
23
The importance of chromosomal distance and
orientation for function prediction
  • Chromosomal distance in eukaryotes less important
    than in prokaryotes (due to the absence of
    operons)
  • Only genes with distance < 1 Mbp seem to be
    coregulated
  • Conservation of relative orientation seems to be
    important only for very close gene pairs
  • Only a limited number of genes can be functionally
    annotated using the conservation of chromosomal
    distance and orientation
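
For concreteness, chromosomal distance and relative orientation of a gene pair could be computed along the following lines. The `Gene` record, the coordinates and the orientation labels are assumptions for this sketch; only the 1 Mbp cutoff comes from the slide.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    chrom: str
    start: int   # leftmost coordinate in bp
    end: int     # rightmost coordinate in bp
    strand: str  # "+" or "-"


def distance_bp(a: Gene, b: Gene) -> Optional[int]:
    """Gap in bp between two genes, or None if they lie on different chromosomes."""
    if a.chrom != b.chrom:
        return None
    first, second = sorted((a, b), key=lambda g: g.start)
    return max(0, second.start - first.end)  # overlapping genes get distance 0


def orientation(a: Gene, b: Gene) -> str:
    """Relative orientation of two genes, read from left to right on the chromosome."""
    first, second = sorted((a, b), key=lambda g: g.start)
    if first.strand == second.strand:
        return "same direction"
    return "divergent" if first.strand == "-" else "convergent"


MAX_DISTANCE_BP = 1_000_000  # the 1 Mbp cutoff mentioned on the slide

# Hypothetical genes and coordinates.
gene_a = Gene("chr1", 1_000, 5_000, "-")
gene_b = Gene("chr1", 205_000, 210_000, "+")
gene_c = Gene("chr2", 1_000, 5_000, "+")

d_ab = distance_bp(gene_a, gene_b)
close_enough = d_ab is not None and d_ab < MAX_DISTANCE_BP
print(d_ab, orientation(gene_a, gene_b), close_enough)
```

Conservation could then be assessed by computing the same two quantities for the mouse orthologs and comparing them with the human values.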

24
Conclusions
  • Orthologous and paralogous relations can be used
    to improve function prediction
  • Our orthologous pairs of Protein World proteins
    perform better than KOG, in terms of
    co-expression and involvement in the same process
  • Chromosomal distance and relative orientation
    between genes can be used for function prediction
    too, in a limited number of cases
  • Future plans: find examples where the function of
    a protein can be predicted using these methods

25
Credits
  • Martijn Huynen
  • Peter Groenen
  • Others at Comics
  • Others at Organon Bioinf.