Matlab Bioinformatics Toolkit Evaluation

Slides: 20
Provided by: sisl
Learn more at: http://www.cs.cmu.edu
Transcript and Presenter's Notes

1
Matlab Bioinformatics Toolkit Evaluation
  • Kanishka Bhutani

2
What I expected?
  • Local/global sequence alignments.
  • Multiple sequence alignments.
  • Choice of different scoring matrices (BLOSUM, PAM) for evaluation.
  • Build Hidden Markov Models.
  • Easily import sequences from databases (Pfam, PDB, Swiss-Prot).

3
What I found?
  • Most of the features.
  • Bonus:
  • Microarray normalization tools.
  • Microarray visualization tools, including box plots and heat maps.

4
Any surprises?
  • No multiple sequence alignments.
  • Avg./std. dev. of hydrophobicity and solvent accessibility: is there a command?
  • proteinplot: GUI for protein structure analysis.
  • Import your file to view, select parameters, and display stats.

5
What did I try?
  • Local alignment, global alignment.
  • For short sequences:
  • swalign(seq1, seq2)
  • nwalign(seq1, seq2)
  • seq1, seq2: AA or NT sequences.
  • For imported long sequences:
  • Convert the sequence into a vector of integer values.
  • Commands: nt2int, aa2int
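
To illustrate what "convert seq into a vector of integer values" means, here is a minimal sketch in base Matlab that does not use the toolbox. The code order A=1, C=2, G=3, T=4 is an assumption chosen for illustration; nt2int and aa2int handle the full alphabets, including ambiguity symbols.

```matlab
% Toy integer encoding of a nucleotide sequence (illustrative only).
seq = 'ACGTTGCA';
alphabet = 'ACGT';                  % assumed code order: A=1, C=2, G=3, T=4

% ismember returns, for each character, its position in the alphabet
% (0 for characters that are not in the alphabet).
[~, codes] = ismember(upper(seq), alphabet);

disp(codes)
```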

6
Pairwise sequence alignment
  • S = getgenbank('NM_00001')
  • M = getgenbank('NM_00002')
  • Output: a header and a sequence.
  • K = nt2int(S.Sequence)
  • B = nt2int(M.Sequence)
  • [sc, align] = nwalign(K, B, 'Alphabet', 'NT')
  • Output: alignment score and aligned sequences.
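
As a rough picture of what a global alignment score is, the sketch below fills a Needleman-Wunsch dynamic-programming table in base Matlab. It is not the toolbox's nwalign implementation: the match, mismatch, and gap values are illustrative assumptions, whereas nwalign scores with a substitution matrix such as BLOSUM or PAM.

```matlab
% Minimal Needleman-Wunsch global alignment score (score only, no traceback).
seq1 = 'GATTACA';
seq2 = 'GCATGCU';
matchScore = 1;        % assumed illustrative values,
mismatchScore = -1;    % not the toolbox defaults
gapPenalty = -1;

n = length(seq1);
m = length(seq2);
F = zeros(n + 1, m + 1);
F(:, 1) = (0:n)' * gapPenalty;   % aligning a prefix of seq1 against gaps
F(1, :) = (0:m) * gapPenalty;    % aligning a prefix of seq2 against gaps

for i = 1:n
    for j = 1:m
        if seq1(i) == seq2(j)
            s = matchScore;
        else
            s = mismatchScore;
        end
        % Best of: align the two residues, gap in seq2, gap in seq1
        F(i + 1, j + 1) = max([F(i, j) + s, ...
                               F(i, j + 1) + gapPenalty, ...
                               F(i + 1, j) + gapPenalty]);
    end
end

globalScore = F(end, end);
disp(globalScore)
```

A local (Smith-Waterman) score differs mainly in clamping each cell at zero and taking the maximum over the whole table instead of the bottom-right cell.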

7
Getting sequences: very easy!
  • getgenbank: Retrieve sequence information from the GenBank database.
  • getembl: Retrieve sequence information from the EMBL database.
  • getpept: Retrieve sequence information from the GenPept database.
  • gethmmprof: Get an HMM from the Pfam database.

8
Experiment
  • hmmodel = gethmmprof('PF00001')

9
Visualization of model
  • showhmmprof(hmmodel, 'Scale', 'logodds')

10
Get GPCR seqs
  • S = getgenbank('NM_024531')
  • disp(S.Sequence)

11
Alignment of the seqs
  • var = gethmmalignment('PF00001', 'Type', 'seed')
  • disp([char(var.Header) char(var.Sequence)])

12
For GPCR Family C
  • Similarly for different families.
  • Multiple aligned sequences retrieved.

13
GUI: proteinplot
  • User friendly.
  • Avg./std. dev. values for:
  • Hydrophobicity.
  • Secondary structure propensity (alpha helices or beta strands).
  • Accessibility (accessible and buried residues).

14
mGluR1 plot (proteinplot)

15
mGluR1 results
Parameter             Average   Std. Dev.
Accessible residues   5.04      1.25
Buried residues       8.22      1.816
Alpha helix           0.89      0.1565
Beta sheet            0.97      0.1038
Hydrophobicity        3.01      0.9608

16
Test a sequence with the HMM
  • Retrieve mGluR1 from GenBank:
  • mgr = getgenbank('NM_012407')
  • glusequence = mgr.Sequence
  • Test it with the HMM model for class A:
  • [a, sglu] = hmmprofalign(modelA, glusequence, 'ShowScore', true)
  • Score: -203.53
  • Seq

17
Log-odds score plot for the best path

18
Difficulties and questions
  • No multiple sequence alignment.
  • Demos: not very helpful.
  • Difficult to view the sequences, as no display command was found.
  • Bugs:
  • Storing huge sequences (GPCR A) in a file gives a parsing error.
  • The hmmprofdemo command stops abruptly and gives errors.
  • proteinplot (GUI) often hangs the machine.
  • How to verify the sequences using the HMM models?
  • Regular expression matches, and highlighting those positions?

19
Suggested experiment
  • Given: an unknown sample dataset of proteins and a known dataset of proteins (with known structural information).
  • Use the BLMT to extract over-expressed 4-grams in a protein sequence, or a group of protein sequences, from the known set.
  • Use the regular-expression search function in the Matlab toolkit to look for those 4-grams in the unknown proteins, and hence predict their structure.
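
The 4-gram steps of this experiment can be sketched in base Matlab as follows. This is only an illustration under stated assumptions: the sequences are made-up toy data, a raw count threshold (minCount) stands in for whatever over-expression statistic the BLMT computes, and the structure-prediction step is not covered.

```matlab
% Toy sketch: count 4-grams in a known set, then search an unknown sequence.
knownSeqs = {'MKTAYIAKQR', 'GGMKTAYLLV'};   % hypothetical known proteins
unknownSeq = 'AAMKTAQQMKTAYW';              % hypothetical unknown protein
n = 4;                                      % n-gram length
minCount = 2;                               % assumed threshold for "frequent"

% Count every overlapping n-gram across the known sequences
counts = containers.Map('KeyType', 'char', 'ValueType', 'double');
for k = 1:numel(knownSeqs)
    s = knownSeqs{k};
    for i = 1:(length(s) - n + 1)
        gram = s(i:i + n - 1);
        if isKey(counts, gram)
            counts(gram) = counts(gram) + 1;
        else
            counts(gram) = 1;
        end
    end
end

% Keep the n-grams seen at least minCount times
grams = keys(counts);
frequent = grams(cellfun(@(g) counts(g) >= minCount, grams));

% Report where each frequent n-gram starts in the unknown sequence
for k = 1:numel(frequent)
    positions = regexp(unknownSeq, frequent{k}, 'start');
    fprintf('%s found at: %s\n', frequent{k}, mat2str(positions));
end
```

Plain amino acid letters contain no regular-expression metacharacters, so each 4-gram can be passed to regexp as a literal pattern.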