How to use the web for bioinformatics - PowerPoint PPT Presentation

About This Presentation
Title:

How to use the web for bioinformatics

Slides: 22
Provided by: ethans4
Transcript and Presenter's Notes

Title: How to use the web for bioinformatics


1
How to use the web for bioinformatics
  • Ethan Strauss
  • ethan.strauss_at_promega.com
  • 274-4330 X 1171
  • http://www.q7.com/ethan

2
Objectives
  • At the end of this session you should be able to
    do all of the following using freely available
    tools on the World Wide Web:
  • Use GenBank or a similar database to find nucleic
    acid sequences of interest
  • Understand the parts of a GenBank entry
  • Use a BLAST server (e.g. ) to find related
    sequences.
  • Perform an alignment of several nucleic acid
    sequences
  • Obtain the protein sequence which corresponds to
    a specific nucleic acid sequence

3
How to find all those dang URLs!
  • http://q7.com/ethan/molbio/

4
Outline
  • Sequence Databases
  • What does a GenBank entry look like?
  • Translation and other Utilities
  • BLAST
  • Multiple Sequence Alignment
  • PCR Primer Design

5
Sequence Databases
  • NCBI databases: nucleic acids, proteins,
    literature, genomes, taxonomy, SNPs and more!
  • EMBL: nucleic acid, protein, structure,
    microarray data and more.
  • DDBJ: nucleic acid, protein.
  • SwissProt: very well annotated protein database.
  • Many other general and specialized databases
    exist.

6
Sequence Databases: NCBI/GenBank
  • National Center for Biotechnology Information
    (NCBI)
  • Sponsored and run by the US government.
  • Contains many different databases and huge
    amounts of information.
  • Most or all data is freely downloadable.
  • This one site is probably sufficient for all your
    nucleic acid and protein database needs!

7
Sequence Databases: Entrez
  • Allows searching and access to NCBI databases.

8
Sequence Databases: Sequence Records
  • LOCUS - Number, Size, Type, Topology,
    Division, Date
  • DEFINITION - Name of the sequence
  • ACCESSION - Unique ID number
  • VERSION - Other numbers associated with the
    sequence
  • KEYWORDS
  • SOURCE - What it was isolated from
  • ORGANISM - More taxonomic detail
  • REFERENCE - Paper or papers about the sequence
  • AUTHORS
  • TITLE
  • JOURNAL
  • FEATURES - A complete list of all of the
    features of a sequence. Can be very extensive and
    useful!
  • ORIGIN - The actual sequence!
  • http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=58533118
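
To make the record layout concrete, here is a minimal Python sketch that splits a GenBank-style flat-file record into its top-level keywords and pulls out the sequence from ORIGIN. The record below is a made-up toy (the accession TOY00001 and its sequence are hypothetical, not a real entry), and the parser handles only the simple layout shown here; it is not a full GenBank parser.

```python
import re

# Toy record for illustration only; accession and sequence are made up.
RECORD = """\
LOCUS       TOY00001                  24 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Hypothetical toy sequence.
ACCESSION   TOY00001
VERSION     TOY00001.1
KEYWORDS    .
SOURCE      synthetic construct
  ORGANISM  synthetic construct
            other sequences; artificial sequences.
FEATURES             Location/Qualifiers
     source          1..24
ORIGIN
        1 atggcctaaa tgcccgggtt ttga
//
"""


def parse_genbank(text):
    """Return (fields, sequence) for one GenBank-style record."""
    fields = {}
    key = None
    seq_parts = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if in_origin:
            # Drop the position numbers and spaces, keep only letters.
            seq_parts.append(re.sub(r"[^A-Za-z]", "", line))
            continue
        match = re.match(r"^([A-Z]+)\s*(.*)$", line)
        if match:  # top-level keywords start in the first column
            key = match.group(1)
            if key == "ORIGIN":
                in_origin = True
                continue
            fields[key] = match.group(2).strip()
        elif key is not None:
            # Indented lines (e.g. ORGANISM, feature rows) continue the
            # most recent top-level keyword.
            fields[key] += "\n" + line.strip()
    return fields, "".join(seq_parts).upper()


fields, sequence = parse_genbank(RECORD)
print(fields["ACCESSION"])
print(sequence)
```

Indented sub-keywords such as ORGANISM are simply kept as continuation text of their parent keyword (SOURCE) in this sketch.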

9
Hands on
  • Find a gene of interest using the Entrez
    interface.
  • We will be working with this sequence throughout
    class, so you may want to open a word processing
    program and save the sequence (only) there for
    future reference

10
General Utilities
  • http://searchlauncher.bcm.tmc.edu/seq-util/seq-util.html
  • Translation
  • Restriction Digestion
  • Reformatting (alternately FASTA Formatter)
  • Complement/Reverse
  • Etc.
  • http://www.promega.com/biomath/calc11.htm
  • Melting Temperature of an oligo.
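
Two of these utilities are easy to sketch in Python: reverse complement, and a rough melting temperature. The Tm function below uses the Wallace rule of thumb, 2 °C per A/T plus 4 °C per G/C, which is only a crude estimate for short oligos. It is an assumption chosen for illustration; the web calculators listed above may well use a different, more accurate method.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(oligo):
    """Rule-of-thumb Tm in degrees C: 2*(A+T) + 4*(G+C)."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


print(reverse_complement("ATGC"))
print(wallace_tm("ATGCATGCATGCATGC"))
```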

11
Hands on
  • Translate your sequence in all 6 reading frames.
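
If you want to check the web tool's output, six-frame translation can be sketched in a few lines of Python. This assumes the standard genetic code and plain A/C/G/T input; any codon containing another letter is reported as X. The function names are hypothetical helpers, not part of any tool mentioned in these slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from the first base; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Return translations for frames +1..+3 and -1..-3."""
    rc = seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for name, protein in sorted(six_frames("ATGGCCTAA").items()):
    print(name, protein)
```

The three minus frames are read from the reverse complement, which is why a sequence has six reading frames rather than three.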

12
BLAST
  • Basic Local Alignment Search Tool
  • Compares a query sequence against all sequences
    in a database.
  • Very powerful for finding biologically
    significant relationships and full gene sequences
    in the database when you have a fragment etc.
  • Different types
  • Nucleic acid - Nucleic acid
  • Protein - Protein
  • Nucleic acid translation - Protein
  • Protein - Nucleic acid translation
  • Translation - Translation
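
The five comparison types above correspond to the standard NCBI BLAST program names (blastn, blastp, blastx, tblastn, tblastx). Those names are not on the slide; the small Python lookup below is just one way to keep the pairing straight, and the query/database labels it uses are made up for this sketch.

```python
# (query type, database type) -> BLAST program name
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("translated nucleotide", "protein"): "blastx",
    ("protein", "translated nucleotide"): "tblastn",
    ("translated nucleotide", "translated nucleotide"): "tblastx",
}


def pick_blast_program(query, database):
    """Return the BLAST program for a query/database pairing."""
    try:
        return BLAST_PROGRAMS[(query, database)]
    except KeyError:
        raise ValueError(f"no BLAST program for {query!r} vs {database!r}")


print(pick_blast_program("translated nucleotide", "protein"))
```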

13
BLAST

14
BLAST

15
Hands on
  • Use 120 bases (2 lines) from your sequence to
    find at least two other sequences related to it.
  • Note that if we all hit NCBI BLAST at once, it
    will be slow. We may not have time to wait.
  • Get all 3 sequences (your original and two
    others) into FASTA format using READSEQ.
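
For reference, FASTA format is just a ">" header line followed by the sequence. The Python sketch below is not READSEQ; it is a hand-rolled stand-in that strips the position numbers and spaces from a pasted GenBank sequence block and wraps the result at 60 bases per line (the width is an assumption, chosen because 120 bases fill 2 lines). The function names are hypothetical.

```python
import re


def clean_sequence(raw):
    """Remove digits, spaces and line breaks from a pasted sequence."""
    return re.sub(r"[^A-Za-z]", "", raw).upper()


def to_fasta(name, raw, width=60):
    """Format a sequence as FASTA with a fixed line width."""
    seq = clean_sequence(raw)
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def first_bases(raw, n=120):
    """Return the first n bases, e.g. to use as a BLAST query."""
    return clean_sequence(raw)[:n]


pasted = "        1 atggcctaaa tgcccgggtt ttga"
print(to_fasta("my_sequence", pasted), end="")
```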

16
Multiple Sequence Alignment
  • Many programs can align multiple sequences with
    each other to find the best fit for all.
  • This is generally more biologically meaningful
    for protein sequences since they are more highly
    conserved.
  • Clustal is the most common.

17
Multiple Sequence Alignment
  • MEAGAYLNAIIFVLVATIIAVISRGLTRTEPCTIRITGESITVHACHIDSX  ETIKALA
    MEAGAYLNAIIFVLVATIIAVISRGLTRTEPCTIRITGESITVHACHIDS...ETIKALA
    MEA..YLNAII.VLV.TIIAVIS..L.RTEPC.IkITGESITV.ACklDa.....I..L.
    MEAgaYLNAIIfVLVaTIIAVISrgLtRTEPCtIrITGESITVhAChiDsx  etIkaLa
  • LK PLSLERLFQ
    LK.PLSLERLFQ
    ......L.....
    lk plsLerlfq
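
Alignment programs often print a consensus line under the aligned sequences. The Python sketch below shows one simple way such a line can be computed from sequences that are already aligned: an uppercase letter where every sequence agrees, a lowercase letter where a majority agrees, and a dot otherwise. This scheme is an assumption for illustration and is not necessarily the one used by the program that produced the alignment above.

```python
from collections import Counter


def consensus(aligned, gap_chars=".- "):
    """Build a consensus line from equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned):
        residues = [c.upper() for c in column if c not in gap_chars]
        if not residues:
            out.append(".")  # column contains only gaps
            continue
        residue, count = Counter(residues).most_common(1)[0]
        if count == len(column):
            out.append(residue)  # conserved in every sequence
        elif count > len(column) / 2:
            out.append(residue.lower())  # majority only
        else:
            out.append(".")  # no majority residue
    return "".join(out)


print(consensus(["MEAGAYL", "MEAGAYL", "MEA..YL"]))
```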

18
Hands on
  • Use your FASTA-formatted sequences to perform a
    multiple sequence alignment.
  • Transfer the alignment to a word processing
    program and see if you can make it look decent.
  • Change to Courier or Courier New
  • Reduce Font Size
  • Change to Landscape view

19
PCR Primer Design
  • There are many PCR primer design programs online
    and off.
  • I recommend Primer3. It is complex, but
    powerful.
  • You can ignore most parameters.
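
One thing worth remembering when reading any primer design output: the forward primer matches the start of the target, while the reverse primer is the reverse complement of its end. The Python sketch below illustrates only that point, plus a GC percentage check. It is a deliberately naive stand-in and does none of the checks Primer3 performs (Tm matching, self-complementarity, and so on); the function names and the default length of 20 are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def naive_primer_pair(template, length=20):
    """Take primers from the two ends of the template, nothing smarter."""
    t = template.upper()
    if len(t) < 2 * length:
        raise ValueError("template is too short for two primers")
    forward = t[:length]
    reverse = reverse_complement(t[-length:])
    return forward, reverse


fwd, rev = naive_primer_pair("ATGGCCTAAATGCCCGGGTTTTGA", length=10)
print(fwd, gc_percent(fwd))
print(rev, gc_percent(rev))
```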

20
Hands on
  • Design primers for the sequence you have been
    working with.

21
Homework
  • Report: Please turn in a report which includes
    the following:
  • Information about your initial sequence,
    including:
  • GenBank Accession Number
  • Species
  • Description
  • Location of ORF and any other important features.
  • Information about the 4 other sequences, including
    the above:
  • GenBank Accession Number
  • Species
  • Description
  • Location of ORF and any other important features.
  • E value from your BLAST results.
  • The sequences of the PCR primers you chose or a
    short explanation of why you could not find
    primers to amplify all of these genes.
  • The multiple sequence alignment with the
    locations of the primers clearly marked.