Gencode Mar '10 Meeting: Pseudogene Project Update, Mark Gerstein

1
Gencode Mar '10 Meeting: Pseudogene Project Update
Mark Gerstein
Illustration from Gerstein & Zheng (2006). Sci Am.
2
Overall Flow: Pipeline Runs, Coherent Sets,
Annotation, Transfer to Sanger
  • Overall Approach
      • Overall pipeline runs at Yale and UCSC, yielding
        raw pseudogenes
      • Extraction of coherent subsets for further
        analysis and annotation
      • Passing to Sanger for detailed manual analysis
        and curation
      • Incorporation into final GENCODE annotation
      • Pipeline modification
  • Chronology of Sets
      • Encode Pilot 1
      • Ribosomal Protein pseudogenes
      • Unitary pseudogenes (Hard)
      • Glycolytic Pseudogenes
      • Polymorphic Pseudogenes
      • Pseudogenes Associated with SDs

3
Specific Pseudogene Assignments: Glycolytic
Pseudogenes (completed)
4
Number of pseudogenes for each glycolytic enzyme
Liu et al. BMC Genomics ('09)
Large numbers of processed GAPDH pseudogenes in
mammals comprise one of the biggest families, but
their numbers are not obviously correlated with mRNA
abundance.
Figure labels: GAPDH (Processed/Duplicated)
5
Number of pseudogenes for each glycolytic enzyme
Liu et al. BMC Genomics ('09)
Figure labels: GAPDH (Processed/Duplicated): 60 Proc/2 Dup
6
Distribution of human GAPDH pseudogenes
Large numbers of processed GAPDH pseudogenes in
mammals comprise one of the biggest families, but
their numbers are not obviously correlated with mRNA
abundance.
60 Proc/2 Dup
Liu et al. BMC Genomics ('09, in press)
7
Approximate Age of GAPDH pseudogenes
Burst of Retrotranspositional Activity
Age calculated based on the Kimura 2-parameter model
of nucleotide substitution
Liu et al. BMC Genomics ('09)
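The Kimura 2-parameter distance named on this slide can be sketched as below. This is a minimal illustration under stated assumptions, not the code used by Liu et al.: the function name `k2p_distance` is hypothetical, the two sequences are assumed to be pre-aligned and of equal length, and alignment columns containing gaps or ambiguous bases are simply skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance (substitutions per site) between two
    aligned nucleotide sequences of equal length.

    Hypothetical helper for illustration only; columns with gaps or
    ambiguous bases are ignored (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")

    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

Converting such a distance into an age additionally requires a substitution rate, which the slide does not state.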
8
Synteny of GAPDH pseudogenes
Synteny derived based on local gene orthology
Liu et al. BMC Genomics ('09)
9
Specific Pseudogene Assignments: Unitary
Pseudogenes (completed)
10
Pseudogenes
Unitary pseudogene
  • Pseudogenes: nongenic DNA segments with high
    sequence similarity to functional genes
  • Unitary pseudogenes: unprocessed pseudogenes with
    no functional counterparts

11
Identification pipeline
Unitary pseudogene
Zhang et al. Genome Biology (in press, '10)
12
Relativity of unitary pseudogenes
Unitary pseudogene
Zhang et al. Genome Biology (in press, '10)
13
Unitary Pseudogene Families
14
Dating the pseudogenization events
Unitary pseudogene
15
Specific Pseudogene Assignments: Polymorphic
Pseudogenes (in process)
16
11 Polymorphic Pseudogenes
17
Polymorphic pseudogenes (3 with allele frequency
data)
Zhang et al. Genome Biology (in press, '10)
3 SNPs not found to be under recent positive
selection...
18
Fst hierarchical clustering for rs4940595 in
SERPINB11
...but population structure at rs4940595 (the
difference in the allelic frequencies in
different populations) could be the result of
different selective regimes that the same allele
at rs4940595 is subjected to in different
population subdivisions.
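As an illustration of how Fst summarizes allele-frequency differences between populations at a single biallelic SNP, the sketch below uses Wright's heterozygosity-based definition with equally weighted populations. The slide does not say which Fst estimator was used, so this choice is an assumption; the function name `wright_fst` and any frequencies passed to it are hypothetical and are not data from the study.

```python
def wright_fst(allele_freqs):
    """Wright's Fst for one biallelic locus.

    allele_freqs: frequency of the same allele in each population
    (values in [0, 1]). Populations are weighted equally, which is an
    assumption of this sketch rather than the presenters' method.
    """
    if len(allele_freqs) < 2:
        raise ValueError("need at least two populations")
    if any(p < 0.0 or p > 1.0 for p in allele_freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")

    n = len(allele_freqs)
    p_bar = sum(allele_freqs) / n

    # Expected heterozygosity of the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the subpopulations.
    h_sub = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / n

    if h_total == 0.0:
        return 0.0  # locus is monomorphic everywhere; no differentiation

    return (h_total - h_sub) / h_total
```

Identical frequencies in all populations give 0, and populations fixed for opposite alleles give 1; intermediate values indicate partial differentiation.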
19
Specific Pseudogene Assignments: SD-associated
Pseudogenes (in process)
20
Segmental duplications (SDs)
  • Regions of the genome with ≥ 90% sequence
    identity and ≥ 1 kb in length
  • Based on neutral divergence, correspond to the last
    40 million years of human evolution
  • Comprise 5-6% of the human genome
  • Enriched with genes (18%) and pseudogenes
    (duplicated 45%, processed 22%)


Can the study of ψgenes in SDs provide
information not obvious from the individual datasets?
Bailey et al., Science, 2002
21
Nucleotide substitutions in ψgenes and SDs
containing them
Figure labels: Parent gene; Duplicated ψgene
K2m: Nucleotide substitutions per site computed
using Kimura's two-parameter model
Most ψgenes show the same number of substitutions
as the larger SD region containing them
  • Duplication accompanied by disablement
  • Followed by neutral rate of evolution
22
Acknowledgements
  • Z Zhang
  • E Khurana
  • Y J Liu
  • YK Lam
  • S Balasubramanian
  • G Fang
  • N Carriero
  • R Robilotto
  • P Cayting
  • M Wilson
  • A Frankish
  • M Diekhans
  • R Harte
  • T Hubbard
  • J Harrow

Pseudogene.org