INDIAN INITIATIVE FOR RICE GENOME SEQUENCING

1
INDIAN INITIATIVE FOR RICE GENOME SEQUENCING
ANNUAL PROGRESS REPORT 17 Jan, 2002-2003
VIVEK DALAL
2
PROCESS FLOWCHART IN IIRGS
START
IDENTIFY CLONE
(PHYSICAL MAPPING)
LIB. PREPARATION & SEQUENCING
(SHOTGUN CLONING & SEQUENCING)
QUALITY CHECK
(GENOINFORMATICS)
FAIL
PASSED
TEMPLATE PREP.
(SHOTGUN CLONING)
DNA SEQUENCING
(SEQUENCING)
TCF
(DATA STORE)
ASSEMBLY
(GENOINFORMATICS)
(GENOINFORMATICS)
3
(SUBMISSION)
ORIENTATION (PHASE II)
GENE PREDICTION
ANNOTATION
FINISHING (PHASE III)
STOP
NCBI GENBANK
(RESUBMISSION)
TCF
(DATA STORE)
(GENOINFORMATICS)
4
CRITERIA FOR QUALITY CHECK
  • Treat all hits to E. coli as contamination.
  • Treat all hits to pBeloBAC as contamination.
  • Treat all significant hits to pUC19, or any
    other cloning or sub-cloning vector, as
    contamination.
  • A maximum of 10% total contamination is
    allowed.
  • Templates are estimated based on Mean Read
    Length and Avg. Success Rate.
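
The criteria above can be sketched as a small pass/fail check. This is a minimal illustration, not the project's actual pipeline. It assumes each read has already been screened (for example with BLAST) and labelled with the source of its best significant hit. The labels, the function name quality_check, and the reading of the limit as 10% of all reads are assumptions.

```python
# Hypothetical labels for the best significant hit of each read (assumption).
CONTAMINANT_SOURCES = {"E.coli", "pBeloBAC", "pUC19", "other_vector"}


def quality_check(read_hits, max_contamination=0.10):
    """Return (contamination_fraction, passed) for a list of per-read labels.

    read_hits holds one label per read; None means no contaminant hit.
    The 10% default assumes the slide's limit is a fraction of all reads.
    """
    if not read_hits:
        raise ValueError("no reads to check")
    contaminated = sum(1 for hit in read_hits if hit in CONTAMINANT_SOURCES)
    fraction = contaminated / len(read_hits)
    return fraction, fraction <= max_contamination


# Illustrative plate: 96 reads, 5 hitting E. coli and 3 hitting the BAC vector.
hits = ["E.coli"] * 5 + ["pBeloBAC"] * 3 + [None] * 88
fraction, passed = quality_check(hits)
print(f"{fraction:.1%} contaminated, passed={passed}")
```

A library that fails this check would loop back to library preparation, as in the flowchart's FAIL branch.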

RESULTS OF QUALITY CHECK
5
ASSEMBLY
  • Assembly uses a combination of 3 programs, namely:
  • PHRED: Assigns quality values to each base.
  • PHRAP: Trims vector sequences and assembles the
    reads into contigs.
  • CONSED: Provides a graphical view of the
    assembly.
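
The slide does not define the quality values. Assuming the standard Phred convention, Q = -10 log10(P), where P is the estimated probability that a base call is wrong, the conversion can be sketched as follows. The function names are illustrative.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality value to an estimated error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability (0 < p <= 1) to a Phred quality value."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 40):
    print(f"Q{q}: about 1 error in {round(1 / phred_to_error_prob(q))} bases")
```

Under this convention, an error rate of one in 10,000 bases corresponds to Q40.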

Phase I (figure): contigs A, B, E, C, D, H, F, G
6
Sequence Assembly - I
Avg. seq. reads: 2000/day
Avg. no. of bases: 7,75,000/day, i.e. 52,50,000/wk
OSJNBa0079N13
No. of plates: 8 F/R
Coverage: 6.3X
Total contigs: 21
No. of contigs > 2K: 12
Largest contig: 27.9K
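
The assembly figures reported on this slide (total contigs, contigs above 2 kb, largest contig, fold coverage) can be derived from a list of contig lengths and the total number of read bases. The sketch below is illustrative only: the function names and all input numbers are made up for the example and are not the values for OSJNBa0079N13.

```python
def summarize_contigs(contig_lengths, min_length=2000):
    """Summarize an assembly from its contig lengths (in bases)."""
    return {
        "total_contigs": len(contig_lengths),
        "contigs_over_min": sum(1 for n in contig_lengths if n > min_length),
        "largest_contig": max(contig_lengths, default=0),
    }


def fold_coverage(total_read_bases, clone_length):
    """Fold coverage = bases sequenced / length of the clone insert."""
    return total_read_bases / clone_length


# Made-up contig lengths and read totals, for illustration only.
lengths = [27900, 15000, 8000, 1500, 900]
summary = summarize_contigs(lengths)
coverage = fold_coverage(total_read_bases=1_000_000, clone_length=100_000)
print(summary, f"{coverage:.1f}X")
```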
7
Sequence Assembly - II
OSJNBa0079N13
No. of plates: 16 F/R
Coverage: 10X
Total contigs: 11
Contigs > 2K: 6
Largest contig: 66Kb
Submitted to GenBank: 140Kb
8
Verification of BAC ends from sequence reads
AAGCTT: HindIII site
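
The slide's figure is not available in this transcript. Assuming the check is that reads spanning the vector/insert junction contain the HindIII recognition site AAGCTT, locating the site in a read can be sketched as below. The function name and the example read are hypothetical. AAGCTT is its own reverse complement, so scanning one strand is sufficient.

```python
HINDIII_SITE = "AAGCTT"  # palindromic: identical on the reverse strand


def find_sites(sequence, site=HINDIII_SITE):
    """Return 0-based start positions of every occurrence of site."""
    seq = sequence.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Made-up read containing two HindIII sites.
read = "ggatccAAGCTTgcatgcctgcaAAGCTTa"
print(find_sites(read))
```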
9
Validation with BAC End Sequences
Ba70D14 End Contig
Ba70D14 Forward BAC end seq.
Ba70D14 End Contig
Ba70D14 Reverse BAC end seq.
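
Validating an assembly against BAC end sequences amounts to finding each end sequence in one of the end contigs and noting on which strand it matches. The sketch below uses exact string matching as a stand-in for a real alignment (BLAST or a similar tool would be needed to tolerate sequencing errors). The function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate_bac_end(contig, bac_end):
    """Return (position, strand) of bac_end in contig, or None if absent.

    Strand '+' means the end sequence matches the contig as given;
    '-' means its reverse complement matches.
    """
    contig = contig.upper()
    bac_end = bac_end.upper()
    pos = contig.find(bac_end)
    if pos != -1:
        return pos, "+"
    pos = contig.find(reverse_complement(bac_end))
    if pos != -1:
        return pos, "-"
    return None


# Made-up contig with a forward end at its start and a reverse end at its end.
contig = "AAGCTTGGCCATTGACCGTA"
print(locate_bac_end(contig, "AAGCTTGGCC"))
print(locate_bac_end(contig, "TACGGTCA"))
```

A forward end matching one end contig and a reverse end matching the other, on opposite strands, is consistent with a correctly assembled clone.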
10
Orientation of BAC Clones
(Figure: relative orientation of BAC clones 52B22, 89M05 and 85G12, using the 89M05 forward end (FE) contig; labels 1 to 4.)
11
Orientation of BAC Clones
(Figure: relative orientation of BAC clones 52B22, 89M05 and 85G12, using the reverse BAC end sequence of 52B22 and the 89M05 reverse end (RE) contig; labels 1 to 4.)
12
SUMMARY OF PHASE II SUBMISSION TO GENBANK

13
STRATEGY FOR GENE PREDICTION
14
ANNOTATION STANDARDS (IRGSP, FEB. 2002)
  • Sequences with 100% identity at the amino acid
    level to known proteins will receive the same,
    original gene name.
  • Sequences with less than 100% identity but with
    significant homology to known proteins will be
    called "putative" proteins of the same name.
  • Protein matches with BLASTP bit scores of > 100,
    e-values of < e-20, or equivalent
    criteria, will be regarded as significant
    homologies.
  • Sequences with homology to unknown ESTs will be
    called "unknown."
  • Sequences predicted by multiple gene prediction
    programs with no homology to any EST will be
    called "hypothetical" proteins.
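
The naming rules above can be read as a decision procedure. The sketch below is one possible reading, not the IRGSP's own implementation: it treats the bit-score and e-value thresholds as alternatives (either one makes a match significant), checks protein evidence before EST evidence, and returns a category label rather than a gene name. The function name, parameters, and the "unclassified" fallback are assumptions.

```python
def classify_gene(identity_pct=None, bit_score=None, e_value=None,
                  est_hit=False, n_predictors=0):
    """Assign an annotation category to a predicted gene.

    identity_pct, bit_score, e_value describe the best BLASTP match to a
    known protein (None if there is no match). est_hit is True when the
    sequence matches an EST of unknown function. n_predictors is the
    number of gene prediction programs that predicted the gene.
    """
    # Assumption: either threshold alone counts as significant homology.
    significant = (
        (bit_score is not None and bit_score > 100)
        or (e_value is not None and e_value < 1e-20)
    )
    if identity_pct is not None and identity_pct == 100:
        return "known"          # receives the original gene name
    if identity_pct is not None and significant:
        return "putative"       # "putative" protein of the same name
    if est_hit:
        return "unknown"
    if n_predictors >= 2:
        return "hypothetical"
    return "unclassified"       # not covered by the rules on the slide


print(classify_gene(identity_pct=100, bit_score=500))
print(classify_gene(identity_pct=85, bit_score=250, e_value=1e-60))
print(classify_gene(est_hit=True))
print(classify_gene(n_predictors=2))
```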

15
GENE PREDICTION & ANNOTATION RESULTS
  • Total No. of Genes Predicted - 984
  • Exact / 100% identical - 156
  • Putative - 339
  • Unknown - 78
  • Hypothetical - 411

Known genes = Pi-ta, Pib, Xa2, Xa21, RGAs, Yr10, NBS-LRR,
salinity tolerance, Gag-Pol polyprotein etc.
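
As a quick consistency check on the category counts reported on this slide, the four categories can be summed and expressed as shares of the total. The counts are taken from the slide; the variable names are illustrative.

```python
# Category counts as reported on the slide.
counts = {
    "Exact / 100% identical": 156,
    "Putative": 339,
    "Unknown": 78,
    "Hypothetical": 411,
}

total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print("Total:", total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The four categories add up to the 984 predicted genes, so every prediction falls into exactly one category.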
16
Problems in sequences
Single clone area
Gap
Gap
Single strand area
Multiple clone coverage on both strands
17
Finishing DNA Sequences
Finishing: the process of polishing raw
sequences, transforming the fragmented rough
draft into a long, continuous final product
without breaks or errors.
Objectives:
  • Resolve sequence ambiguities and discrepancies,
    such that the error rate is less than one in
    10,000 bases.
  • Provide double-stranded coverage for every
    base:
      • minimum of two different clones
      • two different directions
      • two different chemistries
  • Achieve contiguity.
  • Delineate vector/insert junctions.
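
Finding the regions that still need finishing work (gaps and single-strand areas, as on the "Problems in sequences" slide) can be sketched as a per-base coverage scan. This is a simplified illustration: it assumes reads are given as 0-based, half-open coordinates on the consensus with a strand, and it does not model clones or sequencing chemistries. The function names and example reads are hypothetical.

```python
def strand_depth(length, reads):
    """Return per-base read depth on the '+' and '-' strands."""
    plus = [0] * length
    minus = [0] * length
    for start, end, strand in reads:
        depth = plus if strand == "+" else minus
        for i in range(max(start, 0), min(end, length)):
            depth[i] += 1
    return plus, minus


def problem_regions(length, reads):
    """Return (start, end, label) runs lacking double-stranded coverage."""
    plus, minus = strand_depth(length, reads)
    regions = []
    for i, (p, m) in enumerate(zip(plus, minus)):
        if p and m:
            continue  # covered on both strands
        label = "gap" if not p and not m else "single_strand"
        if regions and regions[-1][1] == i and regions[-1][2] == label:
            regions[-1] = (regions[-1][0], i + 1, label)  # extend the run
        else:
            regions.append((i, i + 1, label))
    return regions


# Made-up reads on a 20-base consensus.
reads = [(0, 8, "+"), (4, 12, "-"), (15, 20, "+")]
for region in problem_regions(20, reads):
    print(region)
```

Each reported region is a candidate for directed finishing reads, for example from the opposite strand or from a second clone.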

18
www.nrcpb.org
19
www.nrcpb.org/rgp.html
20
THANK YOU