Title: Cloning and Sequencing Explorer Series
2 Cloning and Sequencing Explorer Series: Bioinformatics
3 Instructors
- Stan Hitomi
- Coordinator Math Science
- Principal Alamo School
- San Ramon Valley Unified School District
- Danville, CA
- Kirk Brown
- Lead Instructor, Edward Teller Education Center
- Science Chair, Tracy High School
- and Delta College, Tracy, CA
- Bio-Rad Curriculum and Training Specialists
- Sherri Andrews, Ph.D.
- sherri_andrews_at_bio-rad.com
- Essy Levy, M.Sc.
- essy_levy_at_bio-rad.com
- Leigh Brown, M.A.
4 Bioinformatics
- The application of information technology to
molecular biology
5 Questions Concerning your Data
- Class Data Set
- Are our sequences high quality?
- Are my sequences similar to GAPDH?
- Are any of my sequences primarily cloning vector?
- Individual Clone Sequences
- Do my individual sequences align to give me a single long sequence?
- Are there discrepancies between my reads?
- Which GAPDH gene did we clone?
- Annotation of Clone Sequence
- What is the intron-exon structure/mRNA sequence of my clone?
- What is the protein sequence of my clone?
6 Sequence Data Analysis Tools
- Sequence data storage and analysis tools (iFinch and FinchTV)
- Sequence comparison algorithm (NCBI BLAST)
- Sequence assembly (CAP3)
- mRNA sequence prediction (BLAST and manual)
- Protein sequence prediction (EMBL-EBI EMBOSS Transeq)
7 Advanced Preparation
- Practice with iFinch using the guest account (highly recommended!)
- Activate your iFinch account (2-month subscription)
- Download FinchTV onto lab computers
- Set up project and folder in iFinch
- Upload sequence data
8 Guest iFinch Account
- http://classroom1.bio-rad.ifinch.com/Finch
- Username: BR_guest
- Password: guest
- Example data sets for each stage of process
- No uploading of data
9 Your own iFinch account
- Each account has a unique URL
- http://Platenumber.ifinch.com/Finch
- E.g. http://A150936.ifinch.com/Finch
- Instructor username: Platenumber, e.g. A150936
- Instructor password: Platenumber, e.g. A150936
- Student username: Platenumber_student, e.g. A150936_student
- Student password: Platenumber, e.g. A150936
- Once activated, change your passwords!
- Active for 2 months.
10 Download FinchTV
11 Make project folder and upload data to iFinch Demo
12 Student Activities
- Review data quality and view sequence traces
- Use BLAST for preliminary check on which GAPDH was cloned
- Assemble sequences into a contig
- Verify which GAPDH gene was cloned
- Predict intron-exon boundaries and generate mRNA sequence
- Predict protein sequence
13 Sequence Quality
14 Q20 values
- The quality value of a base call is
- $Q = -10 \log_{10}(P_{\text{error}})$
- where $P_{\text{error}}$ is the probability of an error.
- Thus if the chance that a base call is incorrect is 1/100, $P_{\text{error}}$ would be 0.01 and the quality value would be 20 (Q20).
- Convention rates sequences by the number of base calls that have quality values of 20 or higher: a Q20 value.
- The quality values of a sequence are calculated automatically by software in iFinch. A common program for this, called Phred, was developed at the University of Washington.
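The quality formula above can be checked with a few lines of Python. This is a minimal sketch, not the code Phred or iFinch runs: the function names and the list of example quality values are made up for illustration.

```python
import math


def phred_quality(p_error: float) -> float:
    """Return Q = -10 * log10(p_error) for one base call."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be in the interval (0, 1]")
    return -10.0 * math.log10(p_error)


def error_probability(q: float) -> float:
    """Invert the formula: the error probability implied by quality Q."""
    return 10.0 ** (-q / 10.0)


def q20_count(qualities, threshold=20):
    """Count base calls whose quality is at or above the threshold."""
    return sum(1 for q in qualities if q >= threshold)


# Hypothetical per-base quality values for a very short read.
example_quals = [8, 15, 20, 32, 45, 19, 50]

print(phred_quality(0.01))       # a 1-in-100 error chance is about Q20
print(error_probability(30))     # Q30 corresponds to about 1 error in 1000
print(q20_count(example_quals))  # number of bases at Q20 or better
```

A read's Q20 value in the sense used here is simply the result of `q20_count` over all of its base calls.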
15 Sequence Quality
- Example Q20 values shown: 732, 161, 238
16 iFinch and Sequence Quality
17 Screen for poor quality sequence, vector, GAPDH family
18 Class Data Set
19 Sort Class Data into Folders
20 Record Data Information
21 Download sequences for initial screen using BLAST
- Open Guest iFinch account
- User: BR_guest, Psw: guest
- Click Folders
- Click Salvia folder
- Look at data
- Go back to folder report
- Click Download folder data and save to a new folder on the hard drive
- View FASTA format in MS Word or text editor
- Upload file back to iFinch
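For anyone who wants to inspect the downloaded FASTA file outside a word processor, the format is simple enough to read with a short script. The sketch below is a minimal reader under the usual FASTA convention (a `>` header line followed by one or more sequence lines); the record names and sequences are invented, not taken from the Salvia data set.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its full sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)  # sequence may span several lines
    return {n: "".join(parts) for n, parts in records.items()}


# Hypothetical two-record file; real data would be read from the .fsa file.
example = """>read_A01 hypothetical example
GATTACAGATTACA
GGCCTTAA
>read_C01 hypothetical example
TTGACCA
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

To use it on a real download, replace `example` with the contents of the file, for example `open("salvia.fsa").read()`.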
22 BLAST sequences for initial screen
- Click NCBI BLAST on iFinch homepage
- Choose nucleotide search
- Browse for downloaded salvia.fsa file to upload
- Choose Others (nr etc.), select Reference genomic sequences
- Choose Plants (taxid)
- Choose Somewhat similar sequences (blastn)
- Click BLAST
23 BLAST Results
- All 4 sequences were analyzed by BLAST; choose from pull-down menu at top of page
- Mouse over top bar
- Scroll down to list of homologous sequences
- E value represents the number of equally good sequence matches to the query sequence that would be expected in a database of the same size containing random sequences.
- Scroll down to sequence alignments
- Query: your sequence
- Subject: database matching sequence
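To get a feel for how the E value behaves, it helps to compute one by hand. The sketch below uses the simplified relation E = m × n × 2^(−S'), where S' is the bit score, m the query length, and n the total database length. This is an approximation for illustration only: the function name and the example numbers are assumptions, and NCBI BLAST applies effective-length corrections that are not modeled here.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E value: chance matches expected at this bit score or better."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical search: a 1,000-base query against a 1,000,000,000-base database.
for score in (30, 50, 100):
    print(score, expected_hits(score, 1_000, 1_000_000_000))
```

Two things to notice: the E value drops by half for every extra bit of score, and the same alignment gets a larger E value when the database is larger.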
24 Which GAPC Gene?
26 Questions Concerning your Data
- Class Data Set
- Are our sequences high quality?
- Are my sequences similar to GAPDH?
- Are any of my sequences primarily cloning vector?
- Individual Clone Sequences
- Do my individual sequences align to give me a single long sequence?
- Are there discrepancies between my reads?
- Which GAPDH gene did we clone?
- Annotation of Clone Sequence
- What is the intron-exon structure/mRNA sequence of my clone?
- What is the protein sequence of my clone?
27 Initial Screen Result
- We have cloned Salvia GAPC gene
- Now we need to put the sequences together to make a contig (contiguous sequence)
- Then correct any sequence discrepancies between different reads
28 CAP3 Program (Contig Assembly Program)
- On iFinch home page click sequence assembly
29 Assembly Results
- Your sequence file (your input)
- Single sequences (any seqs that could not be assembled)
- Contigs (save in FASTA format as .txt file)
- Assembly details (save as landscape .txt file)
30 Salvia Contig
- Reads shown in the contig: A01, I01, C01, G01
31 Check for Discrepancies
- Look through assembly file for sequence discrepancies
- Open chromatogram files in FinchTV
- Examine actual chromatograms and use personal judgment to determine which base call is correct
- Correct FinchTV file and save back to iFinch (not available in guest account), noting the changes in the revision history
- If the consensus sequence has changed, download the folder sequences again as before and reassemble with the CAP3 program
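Scanning an assembly for discrepancies by eye is tedious, so the idea can be sketched in code. The example below assumes the reads have already been laid out as equal-length aligned strings, with `.` marking positions a read does not cover. That layout, the read sequences, and the function name are all hypothetical; this does not parse the CAP3 assembly details file, and the chromatograms still decide which call is right.

```python
def find_discrepancies(aligned_reads):
    """Return (position, calls) for each column where covering reads disagree.

    aligned_reads maps read name -> aligned string; '.' means no coverage.
    Positions are reported 1-based.
    """
    lengths = {len(seq) for seq in aligned_reads.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned reads must have the same length")
    length = lengths.pop()

    discrepancies = []
    for i in range(length):
        calls = {
            name: seq[i].upper()
            for name, seq in aligned_reads.items()
            if seq[i] != "."
        }
        if len(set(calls.values())) > 1:
            discrepancies.append((i + 1, calls))
    return discrepancies


# Hypothetical alignment of three overlapping reads.
reads = {
    "A01": "ACGTTGCA....",
    "C01": "..GTTGGATTCA",
    "G01": "......CATTCA",
}

for position, calls in find_discrepancies(reads):
    print(position, calls)
```

Each reported position is a place to open the corresponding chromatograms in FinchTV and check the peaks.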
32 BLAST search with contig
- Submit contig FASTA file for BLAST search (same database as before: plant reference genomic database)
34 Questions Concerning your Data
- Class Data Set
- Are our sequences high quality?
- Are my sequences similar to GAPDH?
- Are any of my sequences primarily cloning vector?
- Individual Clone Sequences
- Do my individual sequences align to give me a single long sequence?
- Are there discrepancies between my reads?
- Which GAPDH gene did we clone?
- Annotation of Clone Sequence
- What is the intron-exon structure/mRNA sequence of my clone?
- What is the protein sequence of my clone?
35 Determine Gene Structure
36 Workflow
37 BLAST Search Against Reference mRNA Database
- Blastn search with contig against plant Reference mRNA sequences database
- Change Algorithm parameters
38 Reformat BLAST results
- Reformat results in plain text format
- Save files to iFinch folder
39 Save Contig File in MS Word
- Delete all paragraph marks using find and replace command
- Save to hard drive as .rtf file with a new name.
- Color contig sequence with exons as determined from BLAST results
- Put exons together in a first draft of the mRNA sequence and save to iFinch folder
- Submit draft mRNA sequence to blastn against plant reference mRNA database
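Putting the exons together can also be done programmatically once the exon boundaries have been read off the BLAST alignment. The sketch below assumes 1-based, inclusive coordinates on the contig, which is how BLAST reports alignment positions. The toy contig, the coordinates, and the function name are invented for illustration.

```python
def splice_exons(contig, exons):
    """Join exon segments of a contig into a draft mRNA sequence.

    exons is a list of (start, end) pairs, 1-based and inclusive.
    """
    pieces = []
    previous_end = 0
    for start, end in sorted(exons):
        if start <= previous_end or end < start or end > len(contig):
            raise ValueError(f"bad or overlapping exon coordinates: {(start, end)}")
        pieces.append(contig[start - 1:end])
        previous_end = end
    return "".join(pieces)


# Toy contig: exons in upper case, introns in lower case (hypothetical).
contig = "ATGGCC" + "gtaagtttcag" + "GTTAAG" + "gtgagtcctag" + "GGATAA"
exons = [(1, 6), (18, 23), (35, 40)]

draft_mrna = splice_exons(contig, exons)
print(draft_mrna)
```

Adjusting a boundary then means changing one number and rerunning, rather than recoloring and re-pasting the sequence by hand.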
40 BLAST search with derived mRNA sequence
- Correct intron-exon boundaries (use Arabidopsis mRNA as model)
- Resubmit to BLAST
- Repeat if necessary until no indels are evident and you are satisfied with a final mRNA sequence
- Save to iFinch folder
41 Use blastx to Search Protein Database
- Blastx translates the nucleic acid sequence into amino acid sequence in all six reading frames and searches a protein database.
42 Translate mRNA into Protein Sequence
43 Check Protein Sequence with blastp Search
- Ensure translation is in correct frame
- Save to iFinch folder
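Why the reading frame matters is easiest to see by translating the same sequence in all three forward frames. The sketch below uses the standard genetic code and a made-up 18-base mRNA. It is a stand-in for illustration, not the EMBOSS Transeq implementation, and the function name is an assumption.

```python
# Build the standard genetic code table (codons ordered T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(mrna, frame=0):
    """Translate a sequence from the given forward frame (0, 1, or 2).

    '*' marks a stop codon; 'X' marks a codon with an unrecognized base.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    seq = mrna.upper().replace("U", "T")
    protein = []
    for i in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# Hypothetical short mRNA.
mrna = "ATGGCCGTTAAGGGATAA"
for frame in (0, 1, 2):
    print(frame, translate(mrna, frame))
```

Only one frame starts with M and runs to a single stop at the end; a stop codon in the middle of a translation is a sign that the frame is wrong or that an intron-exon boundary needs another look.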
44 Congratulations!
- You have cloned, sequenced and annotated a novel gene.
- You could now submit this to GenBank.
- Data from additional samples would strengthen the data; for example, assemble sequences from the same gene from different student teams
- Download data from iFinch if you wish to keep it for the long term
45 Webinars
- Enzyme Kinetics: A Biofuels Case Study
- Real-Time PCR: What You Need To Know and Why You Should Teach It!
- Proteins: Where DNA Takes on Form and Function
- From plants to sequence: a six week college biology lab course
- From singleplex to multiplex: making the most out of your real-time experiments
- explorer.bio-rad.com > Support > Webinars