HMMER tutorial - PowerPoint PPT Presentation

Slides: 31
Provided by: weixu
Tags: hmmer | motif | tutorial

1
HMMER tutorial
  • ???
  • g39208007_at_ym.edu.tw

2
Account
  • IP: 140.129.78.120
  • Account: binfo2005
  • Password: 2005binfo

3
HMMER
  • http://hmmer.wustl.edu/
  • The theory behind profile HMMs: R. Durbin, S.
    Eddy, A. Krogh, and G. Mitchison, Biological
    sequence analysis: probabilistic models of
    proteins and nucleic acids, Cambridge University
    Press, 1998.

4
Flowchart
http://bioweb.pasteur.fr/seqanal/motif/hmmer-uk.html
5
Format of input alignment files
  • Output of CLUSTAL family of programs
  • Wisconsin/GCG MSF format
  • PHYLIP format (the input format for the PHYLIP
    phylogenetic analysis programs)
  • Aligned FASTA format
  • Stockholm format (HMMER's native format, used by
    the Pfam and Rfam databases)
  • SELEX format
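
To make two of the formats listed above concrete, here is a minimal sketch that converts aligned FASTA text into a bare-bones Stockholm alignment (a "# STOCKHOLM 1.0" header, one "name sequence" line per sequence, and a closing "//"). The function names and the toy alignment are hypothetical and not part of HMMER; real Stockholm files may also carry annotation lines, which this sketch ignores.

```python
def parse_aligned_fasta(text):
    """Return a list of (name, aligned_sequence) pairs from aligned FASTA text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_stockholm(records):
    """Format (name, aligned_sequence) pairs as a minimal Stockholm alignment."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    width = max(len(name) for name, _ in records)
    lines = ["# STOCKHOLM 1.0"]
    for name, seq in records:
        lines.append(f"{name.ljust(width)}  {seq}")
    lines.append("//")
    return "\n".join(lines) + "\n"


# Toy alignment (made up for illustration only).
toy_alignment = """>seq1
ACDE-GH
>seq2
AC-EFGH
"""
stockholm_text = to_stockholm(parse_aligned_fasta(toy_alignment))
print(stockholm_text)
```

For real data, the sreformat program mentioned later in these slides is the more appropriate tool for converting between alignment formats.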

6
Searching a sequence database with a single
profile HMM
  • Build a profile HMM with hmmbuild:
    > hmmbuild globin.hmm globins50.msf
  • Calibrate the profile HMM with hmmcalibrate:
    > hmmcalibrate globin.hmm
  • Search the sequence database with hmmsearch:
    > hmmsearch globin.hmm Artemia.fa
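
The three steps above can be chained from a small script. The sketch below only assembles the command lines exactly as they appear on this slide and prints them; it runs them only when asked to and when the programs are found on the PATH. The helper names (search_commands, run_pipeline) are hypothetical. It assumes the HMMER 2 programs described in these slides; later HMMER releases no longer ship hmmcalibrate, so check the installed version first.

```python
import shutil
import subprocess


def search_commands(alignment, hmm_file, sequence_db):
    """Return the three command lines from the slide as argv lists."""
    return [
        ["hmmbuild", hmm_file, alignment],     # build the profile HMM
        ["hmmcalibrate", hmm_file],            # calibrate the profile HMM
        ["hmmsearch", hmm_file, sequence_db],  # search the sequence database
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(">", " ".join(argv))
        if dry_run:
            continue
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(f"{argv[0]} not found on PATH")
        # check=True stops the pipeline at the first failing step.
        subprocess.run(argv, check=True)


commands = search_commands("globins50.msf", "globin.hmm", "Artemia.fa")
run_pipeline(commands)  # dry run: only prints the three commands
```

Calling run_pipeline(commands, dry_run=False) would execute the steps in order, so a failed build never reaches the search step.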

9
local alignment versus global alignment
  • To HMMER, whether local or global alignments are
    allowed is part of the model, rather than being
    accomplished by running a different algorithm.
  • You need to choose what kind of alignments you
    want to allow when you build the model with
    hmmbuild.
  • By default, hmmbuild builds models that allow
    alignments that are global with respect to the
    HMM and local with respect to the sequence, and
    that allow multiple domains to hit per sequence.
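
As a toy illustration of the default mode described above (global with respect to the model, local with respect to the sequence, multiple hits per sequence), the sketch below slides a tiny ungapped profile along a sequence. Every profile column must be matched, flanking residues are ignored, and every window above a threshold is reported. This is not HMMER's algorithm: there are no insert or delete states, and the profile, probabilities, and threshold are all made-up assumptions.

```python
import math

# Hypothetical 3-column profile: per-position residue probabilities.
PROFILE = [
    {"A": 0.7, "C": 0.1},
    {"C": 0.8, "A": 0.1},
    {"D": 0.6, "E": 0.3},
]
DEFAULT_P = 0.01     # assumed probability for residues not listed in a column
NULL_P = 1.0 / 20.0  # uniform null model over the 20 amino acids


def window_score(window):
    """Log-odds score in bits of one ungapped window against the full profile."""
    return sum(
        math.log2(column.get(residue, DEFAULT_P) / NULL_P)
        for column, residue in zip(PROFILE, window)
    )


def find_domains(sequence, threshold=0.0):
    """Return (start, score) for every window that spans the whole profile
    and scores above the threshold. Start positions are 0-based."""
    hits = []
    for start in range(len(sequence) - len(PROFILE) + 1):
        score = window_score(sequence[start:start + len(PROFILE)])
        if score > threshold:
            hits.append((start, round(score, 2)))
    return hits


# Two copies of the motif separated by unrelated residues.
hits = find_domains("GGACDGGGACEGG")
print(hits)
```

Both copies of the motif are reported as separate hits, while the surrounding G residues contribute nothing, which is the behaviour the "local with respect to the sequence" wording refers to.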

10
Searching a query sequence against a profile HMM
database
  • Creating your own profile HMM database:
    > hmmbuild -A myhmms rrm.sto
    > hmmbuild -A myhmms fn3.sto
    > hmmbuild -A myhmms pkinase.sto
    > hmmcalibrate myhmms
  • Parsing the domain structure of a sequence with
    hmmpfam:
    > hmmpfam myhmms 7LES_DROME

11
Creating and maintaining multiple alignments with
hmmalign
  • Another use of profile HMMs is to create multiple
    sequence alignments of large numbers of
    sequences.
  • A profile HMM can be built from a seed alignment
    of a small number of representative sequences,
    and this profile HMM can be used to efficiently
    align any number of additional sequences.
  • > hmmalign -o globins630.ali globin.hmm
    globins630.fa

14
HMMER scoring and determining significance
  • HMMER gives you at least two scoring criteria to
    judge by: the HMMER raw score, and an E-value.
  • The E-value is calculated from the bit score. It
    tells you how many false positives you would have
    expected to see at or above this bit score.
  • HMMER bit scores reflect whether the sequence is
    a better match to the profile model (positive
    score) or to the null model of nonhomologous
    sequences (negative score).
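
The relationship between the two criteria can be sketched numerically. The code below computes a bit score as a log-odds ratio between the profile model and the null model, and derives an E-value from it by assuming the scores of nonhomologous sequences follow an extreme value (Gumbel) distribution, which is the kind of fit hmmcalibrate is described as estimating. The parameter values (mu, lam, database size) are hypothetical, and the function names are not part of HMMER.

```python
import math


def bit_score(log_p_model, log_p_null):
    """Log-odds score in bits, from natural-log probabilities of the sequence
    under the profile model and under the null model."""
    return (log_p_model - log_p_null) / math.log(2)


def evd_pvalue(score, mu, lam):
    """P(S >= score) for a Gumbel distribution with location mu and scale lam."""
    # 1 - exp(-x) written with expm1 to stay accurate for tiny x.
    return -math.expm1(-math.exp(-lam * (score - mu)))


def e_value(score, mu, lam, n_sequences):
    """Expected number of false positives scoring at or above `score`
    in a database of n_sequences."""
    return n_sequences * evd_pvalue(score, mu, lam)


# Hypothetical calibration parameters and database size.
MU, LAM, N = -10.0, 0.7, 1000

# A sequence that is 8 times more probable under the model than under the null.
score = bit_score(math.log(8e-10), math.log(1e-10))
print(f"bit score = {score:.2f}")
for s in (0.0, 10.0, 20.0):
    print(f"score {s:5.1f} bits -> E-value {e_value(s, MU, LAM, N):.3g}")
```

A positive bit score means the model explains the sequence better than the null model, and the E-value shrinks quickly as the score grows, so a high score in a large database can still be significant.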

15
hmmsearch output
18
  • Building a model:
      • hmmbuild: Build a model from a multiple
        sequence alignment
  • Using a model:
      • hmmalign: Align sequences to an existing model
        (outputs a multiple alignment)
      • hmmconvert: Convert a model into different
        formats
      • hmmcalibrate: Take an HMM and empirically
        determine parameters that are used to make
        searches more sensitive, by calculating more
        accurate expectation value scores (E-values)
      • hmmemit: Emit sequences probabilistically from
        a profile HMM
      • hmmsearch: Search a sequence database for
        matches to an HMM
  • HMM databases:
      • hmmfetch: Get a single model from an HMM
        database
      • hmmindex: Index an HMM database (not available
        on the WEB server)
      • hmmpfam: Search an HMM database for matches to
        a query sequence
  • Other programs:
      • alistat: Show some simple statistics about a
        sequence alignment file
      • seqstat: Show some simple statistics about a
        sequence file
      • getseq: Retrieve a (sub-)sequence from a
        sequence file (not available on the WEB server)
      • sreformat: Reformat a sequence or alignment
        file into a different format

19
References
  • HMMER user guide
  • Eddy SR. (1998) Profile hidden Markov models.
    Bioinformatics.

26
Related links
  • HMMER: http://hmmer.wustl.edu/
  • SAM: http://www.cse.ucsc.edu/research/compbio/sam.html
  • PFTOOLS: http://www.isrec.isb-sib.ch/ftp-server/pftools/
  • HMMpro: http://www.netid.com/html/hmmpro.html
  • GENEWISE: http://www.ebi.ac.uk/Wise2/
  • PROBE: ftp://ftp.ncbi.nih.gov/pub/neuwald/probe1.0/
  • META-MEME: http://metameme.sdsc.edu/
  • BLOCKS: http://www.blocks.fhcrc.org/
  • PSI-BLAST: http://www.ncbi.nlm.nih.gov/BLAST/newblast.html

27
Homework: Search for homologies with hidden
Markov models
  • Obtain the UniProtKB/Swiss-Prot entry P10242 of
    the myb proto-oncogene protein (AC P10242, entry
    name MYB_HUMAN).
  • Goal: use the amino acid sequence of the myb
    protein to search the NCBI nr protein database
    with BLASTp, build an HMM for myb domains from
    the hits, and use this HMM to search the
    UniProtKB/Swiss-Prot protein database.
  • Select 10 myb domains while screening the hits of
    the BLASTp search and copy the corresponding
    parts of the sequences to a file in FASTA format.
  • Build a multiple sequence alignment of these ten
    myb domains with ClustalW.

28
Homework: Search for homologies with hidden
Markov models (cont.)
  • Download HMMER from http://hmmer.wustl.edu/ and
    install it.
  • Build and calibrate an HMM of these myb domains
    with hmmbuild and hmmcalibrate.
  • Use hmmsearch to search the UniProtKB/Swiss-Prot
    protein library with the HMM of the myb domains.
  • Screen the hits, build a new HMM that includes
    selected hits, and run hmmsearch again.
  • How many hits do you get? What are they?

29
HMM
30
Some examples