Introduction to Entrez Genome Projects - PowerPoint PPT Presentation

Provided by: donnam73

Transcript and Presenter's Notes

Title: Introduction to Entrez Genome Projects


1
Introduction to Entrez Genome Projects
2
Data scope of genome resources at NCBI
Environmental samples?
Organisms
Nematoda
C.elegans, C.briggsae
Microbes
Viruses
Fungi/small eukaryotes
Plants
A. thaliana, barley, corn, oat, rice, soybean, tomato, wheat
Fishes
Insects
D. melanogaster, A. gambiae, D. pseudoobscura, honey bee
Chicken
Dog
Mouse/Rat
pig, cow
Human
chimpanzee
3
Data scope of genome resources at NCBI
Sequences
  • Nucleotide
  • EST, cDNA, mRNA, STS
  • patents
  • GSS
  • Traces
  • Genomic: complete genome, whole genome shotgun assembly (different assembly methods)
  • BAC clone based sequencing
  • Resequencing
  • Annotation

4
Entrez Genome Project
  • NOT Entrez Genome?
  • Entrez Genomes is a collection of COMPLETE chromosomes, plasmids, organelles, and viruses.
  • Created in 1995.
  • Doesn't have a way of linking all the data for a given organism other than by taxid.
  • Problems
  • How to define a COMPLETE genome
  • Same organism sequenced by different groups
  • Agrobacterium tumefaciens str. C58 (Cereon and U. Washington)
  • Corynebacterium glutamicum ATCC 13032 (Japan and Germany)
  • Bacillus licheniformis DSM 13 (USA and Germany)
  • A genome project is more than chromosomes and proteins
  • Not Entrez Taxonomy?
  • Designed as a taxonomic hierarchy, not organized by genomes
  • Collects all Entrez links associated with the organism
  • Problems
  • Same organism sequenced by different groups
  • Sequence links are lumped together, for example, Oryza sativa

5
Cultivar Chinsurah Boro II
6
Entrez Genome Project
complete and incomplete large-scale sequencing,
assembly, annotation, and mapping projects for
cellular organisms
  • Project is defined by
  • Organism
  • Project type (and/or sequencing method)
  • Sequencing center
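
The three-part project definition above can be sketched as a small record type. This is only an illustration: the class and field names (`GenomeProject`, `project_type`, `center`) and the project type string are assumptions, not NCBI's schema. The organism and the two centers are taken from the "same organism sequenced by different groups" example earlier in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeProject:
    """Hypothetical record: one project per (organism, type, center)."""
    organism: str
    project_type: str  # illustrative value, not an official project type
    center: str


# The same organism sequenced by two different groups.
projects = {
    GenomeProject("Agrobacterium tumefaciens str. C58", "genomic sequencing", "Cereon"),
    GenomeProject("Agrobacterium tumefaciens str. C58", "genomic sequencing", "U. Washington"),
}

# Keyed by organism alone, the two efforts collapse into a single entry;
# keyed by the full definition, they remain distinct projects.
organisms = {p.organism for p in projects}
print(len(organisms), len(projects))
```

Keying on the full triple is what keeps the two sequencing efforts apart, which an organism-only key (such as a taxid) would not do.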

7
Schematic diagram of a generic eukaryotic genome project
[Diagram: an organism-specific overview project linking to NCBI data, external data, and third-party sites]
  • 1 Genomic sequencing (WGS), assembly, and annotation (complete), Center B
  • 2 Genomic sequencing (WGS) (complete), Center A
  • 4 BAC-ends sequencing (incomplete), Center F
  • 6 Large-scale cDNA sequencing (incomplete), Center B
  • Data labels: Nucleotide data at NCBI (GenBank); Genomic data at NCBI (RefSeq)
8
Entrez Genome Project
Is it implemented
  • Hierarchical structure
  • Flexible project types
  • Related projects
  • Entrez links
  • Relational database
  • Manually curated organism descriptions
  • Related resources/links
  • Sequencing centers
  • Submission form
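
A hierarchical project structure stored in a relational database could look roughly like the sketch below. The table and column names (`project`, `parent_id`, `kind`, `complete`) and the placeholder organism are assumptions made for illustration; the slides do not describe the actual schema. The child projects mirror two entries from the schematic diagram (WGS at Center A, BAC-ends sequencing at Center F).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE project (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES project(id),  -- NULL for an overview project
        organism  TEXT NOT NULL,
        kind      TEXT NOT NULL,
        center    TEXT,
        complete  INTEGER NOT NULL DEFAULT 0
    )
    """
)

# One overview project with two child projects (hypothetical rows).
rows = [
    (1, None, "Example organism", "organism-specific overview", None, 0),
    (2, 1, "Example organism", "genomic sequencing (WGS)", "Center A", 1),
    (3, 1, "Example organism", "BAC-ends sequencing", "Center F", 0),
]
conn.executemany("INSERT INTO project VALUES (?, ?, ?, ?, ?, ?)", rows)

# Related projects are the rows that share the same overview parent.
children = conn.execute(
    "SELECT kind, center FROM project WHERE parent_id = ? ORDER BY id", (1,)
).fetchall()
print(children)
```

A self-referencing `parent_id` is one simple way to express the overview/child hierarchy; other designs (a separate link table, for instance) would work as well.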
12
Entrez Genome Project
Is it presented
Genome Project > Overview > Project
  • Brief description (Docsum defline)
  • Project data
  • Lineage
  • Image
  • Chromosome info
  • Map Viewer search
  • Related Projects
  • Publications
  • Organism description
  • Resource links
  • NCBI Resources (Tools)
  • Organism data in GenBank
  • Sequencing Centers
  • Sequencing Projects
  • Related Resources
Organism groups
  • Eukaryotes: Animals, Plants, Fungi, Protists
  • Prokaryotes: Archaea, Bacteria
Entrez search
Reports
  • Statistics
  • Sequencing Centers
  • Eukaryotic projects
  • Prokaryotic projects
  • Sequence links
16
Eukaryotic Projects List
17
Organism name
Short summary
Taxonomic groups
Sequencing status
Estimated size
22
Prokaryotic Genomic Data
Amount of Data
  • genomes (nucleotides, proteins, RNAs)
  • expression analysis (microarrays, etc.)
  • microbial community sequencing (Sargasso Sea, etc.)
Organization of Data
  • currently by type of data
  • taxonomically
23
Deluge of Data
Growth of complete microbial genomes in the last ten years.
September 1, 2005: 254 complete genomes
25
Anatomy of a Prokaryotic Project
26
Anatomy of a Prokaryotic Project
External data and sites
Genome Information
Organism and strain description
Prokaryotic genome attributes
27
Prokaryotic Projects List
28
Microbial Projects List
29
Microbial Projects List
Complete Genomes
Columns: Organism, Kingdom, Genome Size, GC Content, Accessions, Release Date, Center, NCBI Links
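
For readers unfamiliar with the GC column, GC content is the percentage of bases in a sequence that are G or C. The sketch below shows that calculation in its simplest form. The function name `gc_content` is hypothetical, and the handling of ambiguous bases is an assumption: every character counts toward the length. The slides do not say how NCBI computes the value.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    # Assumption: ambiguous bases (e.g. N) count toward the length only.
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


print(gc_content("ATGC"))
```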
30
Microbial Projects List
Genomes in Progress
Columns: Organism, Kingdom, Contigs, Genome Size, GC, Accessions, BLAST, Center

31
Microbial Projects List
Organism Info
Columns: Organism, Kingdom, Genome, GC, Gram, Shape, Arrangement, Spores, Motility, Salinity, Oxygen, Habitat, Temp., Host, Disease
32
Microbial Projects List
33
Organism/Genome Attributes
34
Project types
35
Environmental samples
36
Comparative genomics
37
Future Directions
- linking other data (microarrays)
- comparative genomics projects (ex. Bacillus)
- environmental microbial community sequencing projects
- links to granting agencies
- International Nucleotide Sequence Databases meta-genomic data provided by scientific communities
38
Submission of Projects
create project from existing data
create project from announced sequencing projects
direct submission from outside users
39
Submission of Projects
http://www.ncbi.nlm.nih.gov/genomes/mpfsubmission.cgi
40
Entrez Genome Project
  • Curators
  • Prokaryotes: William Klimke, Stacy Ciufo, Leigh Riley, Gert Roosen, Rich McVeigh, Nikolai Daraselia, Emir Khatipov
  • Eukaryotes: Ethan Carver, Melissa Landrum, Anjana Raina, Barbara Ruef, Patti Sherman, Janet Weber, Lynn Schriml
  • Software developers: Andrei Kochergin, Sergei Resenchuk
  • Graphics: Svetlana Iazvovskaia
  • Usability: Mark Johnson
  • Project coordinators: Tatiana Tatusova, Kim Pruitt

41
Entrez Genome Project
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=genomeprj
  • 1391 projects indexed and searchable in Entrez
  • 1706 in works
  • 1040 organism-specific overview projects with manual descriptions
Genome sequencing projects
Organism       Complete   In progress   Total
Prokaryotes         254           421     675
Eukaryotes           19           185     204
Total               273           606     879
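
The totals in the genome sequencing project counts on this slide follow directly from the per-group figures. A short sketch of that arithmetic, using only the numbers on the slide (the dictionary layout and key names are assumptions for illustration):

```python
# Counts as given on the slide.
counts = {
    "Prokaryotes": {"complete": 254, "in_progress": 421},
    "Eukaryotes": {"complete": 19, "in_progress": 185},
}

# Row totals: complete + in progress for each group.
row_totals = {org: sum(c.values()) for org, c in counts.items()}

# Column totals: prokaryotes + eukaryotes for each status.
column_totals = {
    status: sum(c[status] for c in counts.values())
    for status in ("complete", "in_progress")
}

grand_total = sum(row_totals.values())
print(row_totals, column_totals, grand_total)
```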
Comments and suggestions are welcome. Mail to:
genomeprj_at_ncbi.nlm.nih.gov
genomes_at_ncbi.nlm.nih.gov