Tweaking BLAST - PowerPoint PPT Presentation

Provided by: mikegil
1
Tweaking BLAST
Although you normally see BLAST as a web page with boxes to place data in, tick boxes, etc., it is actually a command-line program that can be run just by typing the right command and options, e.g.
  > blastall -p blastn -i my_sequence.fasta -d refseq
This is the simplest form, where the basic program blastall takes a number of different options or parameters, each indicated by -x and followed by its value:
  -p <which blast flavour to run>
  -i <file with query sequence in>
  -d <pre-indexed database name>
There are many other parameters; any that are not listed explicitly take a default value appropriate to the blast flavour requested. E.g. for -W <word size>, blastn uses -W 11, whereas blastx uses -W 3.
There are also some options on the web pages which are not really parameters but manage the job in some way. One of the most useful of these is on the NCBI blast pages, where you can use Entrez queries or pick from an organism list to restrict your search.
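As a rough illustration of how the command line above is put together, here is a minimal Python sketch that only assembles the argument list and prints it; nothing is executed. The helper name build_blastall_command is hypothetical, and the sketch assumes the legacy blastall convention of a lower-case -i for the query file.

```python
import shlex


def build_blastall_command(program, query_file, database, **extra):
    """Return the argument list for a legacy blastall run (nothing is executed)."""
    cmd = ["blastall", "-p", program, "-i", query_file, "-d", database]
    # Extra single-letter options, e.g. W=7 becomes "-W 7".
    for flag, value in extra.items():
        cmd += [f"-{flag}", str(value)]
    return cmd


# The simplest form from the slide: every other parameter keeps its default.
basic = build_blastall_command("blastn", "my_sequence.fasta", "refseq")
print(shlex.join(basic))

# The same search with an explicit word size instead of the blastn default.
custom = build_blastall_command("blastn", "my_sequence.fasta", "refseq", W=7)
print(shlex.join(custom))
```

Keeping the command as a list rather than a single string means the same value could later be handed to a process launcher without any quoting problems.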
2
The Many Parameters of BLAST
There are almost literally hundreds of parameters, but most are way too obscure even for die-hard techies like me! Very few of them are regularly useful in any but their default value, but just occasionally they are very necessary. Here are some of the ones that I have used:
  -e  max expected value
  -m  output format (graphical or tabular/spreadsheet)
  -F  filter query sequence for low complexity (default TRUE)
  -U  use only upper case regions of query (default FALSE)
  -G  gap opening cost
  -E  gap extension cost
  -q  nucleotide mismatch penalty (BLASTx uses matrices)
  -r  nucleotide match reward
  -b  number of matching sequences to report
  -g  allow gaps (default TRUE)
  -W  word size
  -z  effective database size (removes effect of actual database size!)
  -S  query strands to search (default both directions)
  -l  restrict database sequences to given list of gi numbers
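To see why fixing the effective database size with -z matters, it may help to look at how the expected value depends on database size. The sketch below uses the simplified Karlin-Altschul form E = K * m * n * exp(-lambda * S); the values of K and lambda are placeholders chosen purely for illustration, not the ones BLAST would compute, and real BLAST also corrects the sequence lengths before applying the formula.

```python
import math

# Placeholder statistical parameters, chosen only for illustration.
K = 0.7
LAMBDA = 1.4


def expect_value(raw_score, query_len, db_len, k=K, lam=LAMBDA):
    """Simplified Karlin-Altschul estimate: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


score, query_len = 30, 500

# The same alignment score found in two databases of different size.
small_db = expect_value(score, query_len, db_len=1_000_000)
large_db = expect_value(score, query_len, db_len=2_000_000)

# E grows in proportion to database size, so the hit looks "worse"
# in the larger database even though the alignment is identical.
print(small_db, large_db, large_db / small_db)
```

Passing the same nominal size to every search, which is what -z does, makes E-values from searches against different databases directly comparable.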
3
(No Transcript)
4
BLAST Parameters Exercises
1. BLASTn vs. BLASTp
Go to informatics.gurdon.cam.ac.uk/online/workshops/useful-web-sites.html
Open blast-parameter-sequences.html
Copy the sequence >blastn-vs-blastp and go to the NCBI BLAST Home Page. This is a Xenopus tropicalis cDNA sequence.
Go to the NUCLEOTIDE BLAST section. Run BLASTn against the nr nucleotide database using all default options. Then hit Format and wait for the results in a new page.
Now repeat, but go to the TRANSLATED BLAST section and BLAST against the nr protein database using BLASTx.
How might the different results help us view the presence of this gene in other vertebrates?
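One reason the nucleotide and translated searches can differ is that many codon changes do not alter the encoded protein. The following Python sketch, using two made-up 9-base sequences and the standard genetic code, compares identity at the DNA level with identity after translation. It translates a single forward frame only, whereas BLASTx considers all six reading frames.

```python
from itertools import product

# Standard genetic code, laid out in the usual TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate frame 1 of an upper-case DNA string, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Made-up sequences that differ only at third codon positions.
seq_a = "GCTGAAAAA"
seq_b = "GCCGAGAAG"

print("DNA identity:    ", identity(seq_a, seq_b))
print("Protein identity:", identity(translate(seq_a), translate(seq_b)))
```

Here a third of the bases differ, yet both sequences encode the same three residues, which is the kind of conservation a translated search can pick up in more distant species.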
5
BLAST Parameters Exercises
2. Low complexity filtering
Go to informatics.gurdon.cam.ac.uk/online/workshops/useful-web-sites.html
Open blast-parameter-sequences.html
Copy the sequence >low-complexity-filtering-A and go to the NCBI BLAST Home Page.
Go to the TRANSLATED BLAST section, BLASTx.
Carefully UNTICK the Choose filter: Low complexity box in the second section, and then run BLASTx against the nr database.
What do you feel about these alignments?
Re-run, but leave the low-complexity filter ON this time. Does this change our view of the protein matches?
Now continue with >low-complexity-filtering-B and C. C is an especially interesting case: what can we deduce about the cDNA sequence? Annotators beware!
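To give a feel for what a low-complexity filter is looking for, here is a small Python sketch that masks windows whose Shannon entropy falls below a threshold. This is not the algorithm NCBI BLAST uses (it relies on dedicated programs such as SEG and DUST); the window size, the threshold and the example sequences are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits per residue) of the residues in a window."""
    counts = Counter(window)
    total = len(window)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def mask_low_complexity(seq, window=12, threshold=2.2):
    """Replace every residue lying in a low-entropy window with 'X'."""
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


# A made-up protein fragment: a varied stretch followed by a glutamine run.
protein = "ACDEFGHIKLMN" + "Q" * 12
print(mask_low_complexity(protein))
```

A run of a single residue carries no information, so it matches equally "well" against any other such run in the database; masking it keeps those matches from crowding out the meaningful ones.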
6
BLAST Parameters Exercises
3. BLASTn vs tBLASTx and nucleotide mismatch penalties
Go to informatics.gurdon.cam.ac.uk/online/workshops/useful-web-sites.html
Open blast-parameter-sequences.html
Also open the NCBI BLAST Home Page and go to the SPECIAL: Align two sequences section.
There are several Xenopus tropicalis cyclins.
Copy the sequence >cyclin-A1-Xt to the Sequence 1 window.
Copy the sequence >cyclin-A2-Xt to the Sequence 2 window.
Run the default comparison, which should be BLASTn. Note the alignment.
Now run again using tBLASTx. What does this do to our understanding of the relationship between these two sequences? Are they homologs, orthologs or paralogs, or none of these?
Revert to BLASTn, and try varying the values for mismatch penalties and gapping; start by reducing the mismatch penalty to -1. Can we learn anything from this?
Now repeat the first parts of the exercise with cyclin-D1 in place of cyclin-A2.
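The effect of the mismatch penalty can be worked out with a little arithmetic. The Python sketch below scores an ungapped nucleotide alignment from its match and mismatch counts and computes the fraction of identical bases at which the score breaks even. It ignores gaps entirely, and the 70/30 alignment is an invented example.

```python
def ungapped_score(matches, mismatches, reward=1, penalty=-3):
    """Raw score of an ungapped alignment under a match/mismatch scheme."""
    return matches * reward + mismatches * penalty


def break_even_identity(reward=1, penalty=-3):
    """Fraction of identical bases at which the ungapped score is exactly zero."""
    # Solve p * reward + (1 - p) * penalty = 0 for p.
    return -penalty / (reward - penalty)


# A made-up 100-base alignment with 70% identity.
for penalty in (-3, -1):
    print(
        f"penalty {penalty}: score {ungapped_score(70, 30, penalty=penalty)}, "
        f"break-even identity {break_even_identity(penalty=penalty):.0%}"
    )
```

With the default -3 an alignment needs more than 75% identity to score above zero, while at -1 anything above 50% does, so more divergent nucleotide alignments start to appear.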
7
BLAST Parameters Exercises
4. Limit Entrez query
Entrez queries can be used in the NCBI BLAST web page to restrict the search to more specific items. For instance, to find only matches in fruit fly proteins, enter Drosophila melanogaster[ORGN] in the Limit by entrez query box in the second section (you can also select the organism from the adjacent drop-down list). To combine items use logical AND, OR or NOT.
Go to informatics.gurdon.cam.ac.uk/online/workshops/useful-web-sites.html
Open blast-parameter-sequences.html
Copy the sequence >cyclin-D1-Xt and go to the NCBI BLAST Home Page.
Go to the TRANSLATED BLAST section, BLASTx, and paste the sequence.
Use an Entrez query to find all rodent sequences (rat and mouse) with a good match to cyclin-D1.
At what E-value do we expect we are no longer looking at cyclins? Try running the search again with that E-value as a limit.
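As a small illustration of how such query strings are composed, the Python sketch below builds organism terms with the [ORGN] field tag and joins them with a logical operator. The helper names are hypothetical, and the two frog species are just an example unrelated to the exercise.

```python
def organism_term(name):
    """Tag an organism name with the Entrez organism field."""
    return f"{name}[ORGN]"


def combine(operator, *terms):
    """Join Entrez terms with AND, OR or NOT, wrapped in parentheses."""
    if operator not in ("AND", "OR", "NOT"):
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(terms) + ")"


fly_only = organism_term("Drosophila melanogaster")
either_frog = combine(
    "OR",
    organism_term("Xenopus laevis"),
    organism_term("Xenopus tropicalis"),
)

print(fly_only)
print(either_frog)
```

The resulting text is what would be typed into the Limit by entrez query box.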
8
BLAST Parameters Exercises
5. Word Size
Go to informatics.gurdon.cam.ac.uk/online/workshops/useful-web-sites.html
Open blast-parameter-sequences.html
Copy the sequence >morpholino and go to the NCBI BLAST Home Page.
Go to the NUCLEOTIDE BLAST section, BLASTn, and paste the sequence.
Check OFF the low complexity filter, and then run the search.
Now re-run the search, setting the following parameters:
  Low complexity: OFF
  Expect: 100
  Word Size: 7
  Other advanced: -q -1 (mismatch penalty -1 instead of default -3)
What difference does this make?
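Word size matters most for short queries, because a BLASTn hit has to start from an exact word match. The Python sketch below checks whether a query and a subject share any exact word of a given length; the 25-base sequences are invented, and the check is a simplification of the seeding step rather than BLAST's actual implementation.

```python
def shared_words(query, subject, word_size):
    """Return the set of exact words of length word_size found in both sequences."""
    subject_words = {
        subject[i:i + word_size] for i in range(len(subject) - word_size + 1)
    }
    return {
        query[i:i + word_size]
        for i in range(len(query) - word_size + 1)
        if query[i:i + word_size] in subject_words
    }


# Invented 25-base query and a target that differs at two positions,
# so the longest exactly matching run is 8 bases.
query = "ACGTTGCAAGCTTACGGATCCATGA"
subject = "ACGTTGCATGCTTACGGCTCCATGA"

for word_size in (11, 7):
    found = shared_words(query, subject, word_size)
    print(f"word size {word_size}: {len(found)} shared word(s)")
```

With the default word size of 11 this near-match never seeds an alignment, whereas at word size 7 it does, which is why short oligos with a couple of mismatches can go unreported under default settings.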