Max Bachour - PowerPoint PPT Presentation


Transcript and Presenter's Notes

Title: Max Bachour


1
  • Max Bachour

Jessica Chen
2
  • Shotgun or 454 sequencing
  • High-throughput sequencing technique that can
    collect a large amount of data quickly.
  • Works by partially digesting a genome or long
    strand of DNA into small overlapping fragments.
  • These small fragments are sequenced, and fragments
    that overlap are matched together.
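
The overlap matching described above can be sketched in a few lines. This is only a toy greedy merge, not what a real assembler does; the function names, the minimum overlap of 3 bases, and the example fragments are all assumptions made for illustration.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments that shares the largest overlap."""
    contigs = list(fragments)
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more: keep the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Hypothetical fragments, chosen so that neighbours share 4 bases.
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Fragments with no overlap are left as separate contigs rather than forced together.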

3
Steps Behind 454 sequencing
  1. The genome is fragmented and the fragments are
    denatured.
  2. Fragments are attached to beads, one fragment
    per microbead, and amplified.
  3. Each bead is placed in its own well of a
    fiber-optic slide.
  4. Packing beads are placed in all the wells.

4
Steps Behind 454 sequencing
  • A solution of one nucleotide is flooded onto the tray.
  • If the base added is next in the sequence, it will be
    added to the single-stranded DNA on the bead.
  • When a nucleotide is added to the DNA, pyrophosphate
    (two linked phosphate groups) is released.
  • Enzymes on the packing beads convert the
    pyrophosphate to ATP, and then the ATP to light.

5
Steps Behind 454 sequencing
  • Computer and camera detect light in a certain
    well as a certain base is added to the tray.
  • Base is washed off and process is repeated with
    another base.
  • End product is a large number of sequenced
    fragments.
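
The flood, detect, and wash cycle can be simulated for a single well. This is a simplified sketch: it takes the strand being synthesized as input (the real chemistry extends the complement of the template), it assumes a flow order of T, A, C, G, and it treats the light signal as exactly equal to the number of bases incorporated in one flow. All names here are hypothetical.

```python
def simulate_flows(synthesized: str, flow_order: str = "TACG", max_cycles: int = 50):
    """Return one (base, signal) pair per flow for a single well.

    signal is how many bases were incorporated while that base was on the tray;
    0 means the base was not next in the sequence, so no light was produced.
    """
    position = 0
    flows = []
    for _ in range(max_cycles):
        for base in flow_order:
            count = 0
            # A run of identical bases is incorporated within the same flow.
            while position < len(synthesized) and synthesized[position] == base:
                count += 1
                position += 1
            flows.append((base, count))
            if position == len(synthesized):
                return flows
    return flows


def call_bases(flows) -> str:
    """Rebuild the sequence from the recorded signals."""
    return "".join(base * signal for base, signal in flows)


if __name__ == "__main__":
    flows = simulate_flows("TTAG")
    print(flows)              # [('T', 2), ('A', 1), ('C', 0), ('G', 1)]
    print(call_bases(flows))  # TTAG
```

In this model a run such as TT shows up as one flow with a doubled signal, so reading long runs of the same base depends on measuring light intensity accurately.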

6
Genome Sequence Analysis
  • Contig assembly
  • Identifying open reading frames (ORFs) using gene
    prediction programs
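
To make "open reading frame" concrete, here is a minimal sketch that scans the three forward frames for a stretch running from ATG to a stop codon. It is far simpler than a gene prediction program, which models much more than start and stop codons; the function name and the minimum length are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 3):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    start is 0-based, end is exclusive and includes the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    seq = "CCATGAAATTTTAGGG"  # hypothetical contig fragment
    for start, end, frame in find_orfs(seq):
        print(frame, seq[start:end])  # 2 ATGAAATTTTAG
```

Scanning the reverse strand as well would require repeating the search on the reverse complement.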

7
What is the initial problem with assembly?
[Figure labels: Sequenced fragmented DNA; CONTIG 1; CONTIG 2; Incorrectly Assembled DNA Sequence]

8
How is this problem solved?
[Figure labels: Sequenced fragmented DNA; Masked DNA Sequence; Assembled DNA Sequence; CONTIG 3, CONTIG 1, CONTIG 5, CONTIG 4, CONTIG 2]
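
Masking can be illustrated with a short sketch, assuming (the slide does not spell this out) that the masked regions are known repeat sequences. Each occurrence is replaced with N so that an assembler cannot use it as an overlap. The function name, the repeat list, and the example sequence are hypothetical.

```python
def mask_repeats(sequence: str, repeats: list[str], mask_char: str = "N") -> str:
    """Replace every occurrence of each known repeat with mask characters."""
    masked = list(sequence)
    for repeat in repeats:
        if not repeat:
            continue  # ignore empty entries
        start = sequence.find(repeat)
        while start != -1:
            masked[start:start + len(repeat)] = mask_char * len(repeat)
            start = sequence.find(repeat, start + 1)  # also catches overlapping hits
    return "".join(masked)


if __name__ == "__main__":
    read = "ACGTTTAGGGACCTTAGGGT"  # hypothetical read with two copies of a repeat
    print(mask_repeats(read, ["TTAGGG"]))  # ACGTNNNNNNACCNNNNNNT
```

The masked sequence keeps its original length, so coordinates still line up with the unmasked read.
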
9
How do we identify genes?
  • Use gene prediction programs (FGENESH, GENSCAN,
    GeneMark) to determine potential genes; also
    determine any repeat sequences
  • Enter the contig as input
  • Which of the predicted genes are most likely
    to be real genes?
  • Use BLAST

10
How do we use BLAST?
  • Run tblastn with the predicted protein of each
    gene against an EST database (ESTDB)
  • Why ESTDB? It is a record of all known/identified
    mRNA (cDNA library)
  • Why tblastn? The amino acid sequence is more
    likely to be conserved
  • Also use blastn and blastp
  • blastp: determine expression of gene
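
tblastn compares a protein query against a nucleotide database translated in all six reading frames. The sketch below shows what that translation step looks like for one sequence, using the standard genetic code. It is an illustration only, not BLAST's implementation, and the helper names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict[int, str]:
    """Frames 1..3 read the given strand, frames -1..-3 its reverse complement."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    frames = six_frame_translations("ATGAAATAG")
    print(frames[1])   # MK*
    print(frames[-1])  # LFH
```

A hit reported in a negative frame means the protein matched a translation of the reverse complement of the database sequence.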

11
Analyzing BLAST data
Gene 1
Protein sequence MFVVQYLGSSRSWTSCSHSSKPGVDSRGRAEPHLAVGRSSLLGRVQTGLKGGGMKDSDLT
GDSSLARANQSMGICKSEGTVDRRLKSQVSQLLLGLLLIRLEGLLATCMTGPHGDAGAGS
THK

>gb|FC457105.1| UCRVU04_CCNI646_g1 Cowpea 524B Mixed Tissue and Conditions cDNA
Library UCRVU04-1 Vigna unguiculata cDNA clone CCNI646, mRNA
sequence.
Length = 807

Score = 215 bits (548), Expect(2) = 2e-55, Method: Compositional matrix adjust.
Identities = 110/112 (98%), Positives = 110/112 (98%), Gaps = 0/112 (0%)
Frame = -1

Query 12 SWTSCSHSSKPGVDSRGRAEPHLAVGRSSLLGRVQTGLKGGGMKDSDLTGDSSLARANQS 71
SWTSCSHS KPGVDSRGRAEPHLAVGRSSLLGRVQTGLKGGGMKDSDLTGDSSLARANQS
Sbjct 438 SWTSCSHSKPGVDSRGRAEPHLAVGRSSLLGRVQTGLKGGGMKDSDLTGDSSLARANQS 259

Query 72 MGICKSEGTVDRRLKSQVSQLLLGLLLIRLEGLLATCMTGPHGDAGAGSTHK 123
MGICK EGTVDRRLKSQVSQLLLGLLLIRLEGLLATCMTGPHGDAGAGSTHK
Sbjct 258 MGICKEGTVDRRLKSQVSQLLLGLLLIRLEGLLATCMTGPHGDAGAGSTHK 103

  • Critical data: e-value, match, EST source
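
The numbers in the hit above can be cross-checked with a little arithmetic. The sketch below uses only values printed in the report (Query 12 to 123, Sbjct 438 to 103, 110 identities out of 112); the function names and the 1e-5 significance cutoff are assumptions for illustration, not a rule from the slides.

```python
def residues_from_nucleotides(start: int, end: int) -> int:
    """Amino acids spanned by a translated nucleotide range (inclusive coordinates)."""
    return (abs(end - start) + 1) // 3


def percent_identity(identical: int, aligned: int) -> float:
    return 100.0 * identical / aligned


def is_significant(e_value: float, threshold: float = 1e-5) -> bool:
    """Smaller e-values mean the match is less likely to be due to chance."""
    return e_value <= threshold


if __name__ == "__main__":
    query_span = 123 - 12 + 1                          # residues of the protein query
    sbjct_span = residues_from_nucleotides(438, 103)   # EST coordinates, minus strand
    print(query_span, sbjct_span)                      # 112 112
    print(round(percent_identity(110, 112), 1))        # 98.2
    print(is_significant(2e-55))                       # True
```

The subject coordinates run from high to low (438 down to 103), which is what a hit on the reverse strand looks like and matches the negative frame in the report.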

12
Advantages and Disadvantages
  • Fast sequencing at a high volume
  • Cheap compared to other methods
  • Much higher coverage protection
  • Repetitive sequences can mislead the assembly
    program into treating unrelated sequences as
    connected.
  • More prone to errors and missing sequences

13
Drastically changed genomics in a very short
amount of time