1
Bioinformatics for Proteomics studies
  • Tamanna Sultana
  • Bioinformatics Analysis Core (BAC)
  • Genomics Proteomics Core Laboratories (GPCL)
  • University of Pittsburgh

2
Proteomics
  • The term proteomics refers to, but is not limited
    to, the analysis of proteins in terms of
    separation, identification, quantification,
    expression and function.

3
Bioinformatics for proteomics
  • Informatics is a field of study that focuses on
    the use of technology for improving access to and
    utilization of information. ...
    (library.ahima.org/xpedio/groups/public/documents/ahima/bok1_025042.hcsp)
  • Information science
  • the sciences concerned with gathering,
    manipulating, storing, retrieving, and
    classifying recorded information
    (wordnetweb.princeton.edu/perl/webwn)
  • its broad meaning is the science of processing
    data. Within health and social care, it is used
    to refer to the processing of data on patients
    and clients, normally but by no means
    exclusively through IT systems.
    (www.smarthealthcare.com/glossary)

4
Mass Spectrometry (MS) based proteomics
5
Bottom up MS workflow close up
  • Sample
  • Sample preparation
  • Further sample prep through separation (LC)
  • MS
  • Spectral data
  • Experimental peak list
  • Which proteins to analyze?
  • Which database/search engine (algorithm) to use?
  • How do I know this is correct?
6
Sample preparation
  • Purification
  • Gel electrophoresis
  • Fractionation

[Figures: gel axes labeled pI and MW; image sources www.qiagen.com, GPCL, www.prometicbiosciences.com]
7
Sample analysis by Bottom up MS
  • Digestion (cleavage) of proteins by an enzyme
  • Fractionation of peptides
  • Off line
  • On line
  • Analysis by MS and tandem MS (MS/MS)

MGLSDGEWQQ VLNVWGKVEA DIAGHGQEVL IRLFTGHPET
LEKFDKFKHL KTEAEMKASE DLKKHGTVVL TALGGILKKK
GHHEAELKPL AQSHATKHKI PIKYLEFISD AIIHVLHSKH
PGDFGADAQG AMTKALELFR NDIAAKYKEL GFQG
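The digestion step can be sketched in silico. This is a minimal sketch, assuming trypsin as the enzyme (the slide does not name one) and the common rule "cleave after K or R unless the next residue is P". The function name tryptic_digest is hypothetical; the sequence is the one shown above with the spaces removed.

```python
import re

# Sequence from the slide, spaces removed.
PROTEIN = (
    "MGLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPET"
    "LEKFDKFKHLKTEAEMKASEDLKKHGTVVLTALGGILKKK"
    "GHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSKH"
    "PGDFGADAQGAMTKALELFRNDIAAKYKELGFQG"
)


def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an assumed trypsin rule: cut after K/R, not before P."""
    # Zero-width split points (Python 3.7+): after K or R, not followed by P.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    peptides = []
    for i in range(len(pieces)):
        # Join up to `missed_cleavages` additional neighbouring pieces.
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


if __name__ == "__main__":
    for peptide in tryptic_digest(PROTEIN):
        print(peptide)
```

Passing missed_cleavages=2 would mirror the "2 maximum" setting used later in the database search parameters.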
8
Tandem MS
  • Basic components of any MS: ion source, mass
    analyzer, detector
  • 4700 Proteomics Analyzer, Applied Biosystems
9
MS
  • MS, followed by precursor ion selection

10
Fragment ion spectrum
Tandem MS
11
Tandem mass spectrum
http://qbab.aber.ac.uk
12
Tandem mass spectra (MS/MS) can be used for
peptide sequencing
  • Database Searching
  • Peptide Mass Fingerprinting
  • Sequence tag approach
  • De novo sequencing
  • inspect raw data

http://qbab.aber.ac.uk
13
(No Transcript)
14
Top hits from Mascot search: there are multiple
accession numbers for the same protein
15
Creatine kinase B is the highest scoring protein
Match to gi|21536286, Score: 681
Creatine kinase - B (Homo sapiens)
Nominal mass (Mr): 42591; Calculated pI value:
5.34; Observed mass/pI: 43 kDa, 6.2-6.27
Sequence Coverage: 46%
1 MPFSNSHNAL KLRFPAEDEF PDLSAHNNHM AKVLTPELYA ELRAKSTPSG
51 FTLDDVIQTG VDNPGHPYIM TVGCVAGDEE SYEVFKDLFD PIIEDRHGGY
101 KPSDEHKTDL NPDNLQGGDD LDPNYVLSSR VRTGRSIRGF CLPPHCSRGE
151 RRAIEKLAVE ALSSLDGDLA GRYYALKSMT EAEQQQLIDD HFLFDKPVSP
201 LLSASGMARD WPDARGIWHN DNKTFLVWVN EEDHLRVISM QKGGNMKEVF
251 TRFCTGLTQI ETLFKSKDYE FMWNPHLGYI LTCPSNLGTG LRAGVHIKLP
301 NLGKHEKFSE VLKRLRLQKR GTGGVDTAAV GGVFDVSNAD RLGFSEVELV
351 QMVVDGVKLL IEMEQRLEQG QAIDDLMPAQ K
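Sequence coverage, as reported above, is the share of the protein's residues that fall inside at least one matched peptide. The following is a minimal sketch of that calculation. The function name sequence_coverage is hypothetical, and the example uses only the first 20 residues of the sequence above with a made-up list of matched peptides, so it does not reproduce the reported figure.

```python
def sequence_coverage(protein, peptides):
    """Percentage of residues covered by at least one matched peptide."""
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:
            # Mark every residue of this occurrence as covered.
            for k in range(start, start + len(peptide)):
                covered[k] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


if __name__ == "__main__":
    # Illustrative input only: a 20-residue prefix and one assumed match.
    prefix = "MPFSNSHNALKLRFPAEDEF"
    matched = ["MPFSNSHNALK"]
    print(f"{sequence_coverage(prefix, matched):.1f}% coverage")
```

Overlapping or repeated peptides are counted once per residue, which is why a boolean mask is used instead of summing peptide lengths.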
16
Problems frequently faced in proteomics
  • How to choose which proteins to analyze
  • Depends on the goal and the availability of
    support
  • Which methods to choose to quantify the proteins
  • Again it depends on the goal
  • How do I know I have all the correct proteins
  • Complicated
  • Depends on the sample, methods, instruments and
    software used

17
BAC's fee-for-service methods for supporting
proteomic studies
  • Differential protein expression analysis
  • Traditional 2D
  • DiGE
  • Consensus peptide identification
  • Protein database searches with Mascot, Sequest,
    X!Tandem, Phenyx etc.
  • Consensus searches among Mascot, Sequest and
    X!Tandem
  • Pathway analysis of identified proteins
  • Intelligent Systems and Bioinformatics Laboratory
  • Pathway Express

18
Integration of BAC with Proteomics Lab
PI and Proteomics lab decide the project path
BAC suggests the study design
Samples submitted to the lab according to the
study design
BAC performs the Data analysis
For samples not involving 2D gel electrophoresis
2D gel analysis
Peptide ID consensus
List of differentially expressed spots sent to
Core
Further project specific bioinformatics
analysis/help
Protein ID generated by lab
19
Difference gel electrophoresis (DiGE) image
analysis
Surya Viswanathan, Mustafa Ünlü, Jonathan S
Minden. Nature Protocols 1, 1351 - 1358 (2006)
20
Labeling strategy for 3 samples
  • gel1: Cy3, Cy5
  • gel2: Cy3, Cy5
  • gel3: Cy3, Cy5
21
DiGE analysis of protein isoform expression in
STAT3 constitutively activated versus STAT3 loss
variant multiple myeloma cell line U266
  • Date: May 22, 2008
  • PI: ___ ___
  • BAC Analyst: Tamanna Sultana
  • Project Location: www.genetics.pitt.edu/mygpcl/

Example of DiGE analysis and Reporting to PI
22
Specific aims/objectives
  • To assess proteomic differences between U266
    with constitutively activated STAT3 and the U266
    STAT3 loss variant using 2D-DIGE, with particular
    emphasis on mapping the CENPM 58 AA isoform

23
Sample Details
  • Samples
  • U266 with constitutively activated STAT3 (2661)
  • U266 STAT3 loss variant (2662)
  • Sample condition
  • Protein extracted and cleaned by PI lab
  • Sample amount
  • 100 µg of each sample was used
  • Sample buffer
  • Lysis buffer, 20 µL

24
Sample processing (Traditional 2D/DiGE)
  • Sample prep
  • Labeling: 2661 Cy3, 2662 Cy5, and reciprocal
  • 1st dimension: Protean IEF Cell, BioRad
  • Gel strip: 3-10 NL, 17 cm
  • Sample volume load: 300 µL
  • Running conditions: 250 V for 15 min, ramp to
    10000 V in 3 hr, reach 60000 V-hr, hold at 500 V
  • 2nd dimension: Protean II xi Cell, BioRad
  • Gel: Jule, 8-16%
  • Running buffer: TGS (BioRad), 2X in top chamber
    and 1X in bottom chamber
  • Running conditions: 16 mA for 45 min, followed by
    30 mA for 5 hr
  • Gels generated: 2

25
Gel processing
  • Gel fixing
  • Buffer: 40% methanol, 5% acetic acid
  • Time: overnight
  • Gel staining
  • None
  • Gel imaging
  • DiGE scanner: custom made with Prometrix CCD
    camera
  • Images generated per gel: 2 (total 4 images)
  • Image labels
  • BricknerGel_A_30sec_Cy3-2661
  • BricknerGel_A_30sec_Cy5-2662
  • BricknerGel_B_30sec_Cy5-2661
  • BricknerGel_B_30sec_Cy3-2662
  • Image storage location:
    http://www.genetics.pitt.edu/mygpcl/050208
  • Scanned gels and leftover sample storage
    location
  • Gels in Rm. 9035, BST3 at 4 °C, samples at -80 °C

26
Previous analysis summary
  • None

27
Image analysis using Delta2D software from
Decodon
28
Dual-view image of 2661 and 2662
Possible knockdown
Over expression in 2662 labeled
Under expression in 2662 labeled
2662: blue spots; 2661: orange spots; overlap:
black
29
Quantitation table-1 (over-expression of 2662)
Columns: Ave. volume of 2661, STDEV 2661, Ave. volume of 2662, STDEV 2662, Statistics, Label
0.02286 95.541 0.157 21.485 6.86425 ID758
0.04919 5.16631 0.222 23.826 4.51013 ID1007
0.01342 46.4842 0.046 11.841 3.39664 ID638
0.08416 7.3891 0.278 10.941 3.30421 ID1034
0.018 42.3414 0.055 44.115 3.06272 ID856
0.03559 23.0147 0.101 13.473 2.8341 ID1001
0.04949 16.6076 0.136 2.09 2.75143 ID989
0.01316 66.0837 0.036 13.76 2.70679 ID766
0.00253 99.3149 0.006 98.832 2.52439 ID441
0.00738 99.0185 0.018 15.034 2.48139 ID777
0.02265 36.2054 0.054 3.8892 2.40566 ID768
0.00958 20.6202 0.022 6.2787 2.27036 ID637
0.01124 54.9229 0.025 28.956 2.26648 ID747
0.1959 27.1355 0.423 36.584 2.15748 ID1093
0.00491 20.2911 0.01 31.267 2.10956 ID671
0.27372 19.1721 0.568 25.056 2.07484 ID561
0.0122 22.5978 0.025 76.485 2.03149 ID436
0.01617 33.9946 0.032 43.049 2.0087 ID843
30
Quantitation table-2 (under-expression of 2662)
Columns: Ave. volume of 2661, STDEV 2661, Ave. volume of 2662, STDEV 2662, Statistics, Label
0.02564 47.8103 0.009 21.208 -2.94189 ID1052
0.21899 41.5971 0.096 47.237 -2.27408 ID1005
0.02429 74.5 0.006 41.081 -3.96436 ID780
0.01715 2.38384 0.007 10.352 -2.38462 ID879
0.78354 5.48853 0.18 28.436 -4.35728 ID1035
0.2324 5.45632 0.089 14.433 -2.61328 ID965
1.36616 5.92218 0.628 16.394 -2.17684 ID674
0.07541 56.8537 0.034 27.446 -2.20996 ID789
0.33283 1.68301 0.128 2.5139 -2.60336 ID966
0.07982 11.8707 0.028 39.784 -2.85716 ID972
0.13488 20.6176 0.055 52.409 -2.47384 ID894
0.30791 7.37787 0.152 12.419 -2.03194 ID413
0.03932 15.9807 0.015 58.697 -2.61746 ID1102
0.2053 27.0974 0.077 53.213 -2.68374 ID678
1.81212 13.1749 0.524 33.088 -3.45585 ID953
0.07955 38.023 0.038 35.563 -2.11097 ID736
0.01965 66.2334 0.004 60.707 -5.13589 ID171
0.35734 32.9212 0.138 47.716 -2.59257 ID646
0.5569 18.2044 0.233 14.208 -2.39388 ID682
0.5001 6.47287 0.249 30.814 -2.0096 ID681
0.02863 54.201 0.01 55.458 -2.98691 ID1096
1.17678 41.0268 0.451 66.544 -2.609 ID388
0.00893 0.69254 0.004 50.129 -2.02044 ID919
0.06059 60.0037 0.02 80.683 -2.99036 ID1044
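The statistics column in the two quantitation tables behaves like a signed fold change between the average spot volumes: the 2662/2661 ratio when 2662 is higher, and the negated 2661/2662 ratio when it is lower. That reading is inferred from the numbers, not stated on the slides, and so is the 2-fold cutoff used below. A minimal sketch under those assumptions, with hypothetical function names:

```python
def signed_fold_change(vol_ref, vol_test):
    """Return test/ref when test >= ref, otherwise -(ref/test)."""
    if vol_ref <= 0 or vol_test <= 0:
        raise ValueError("spot volumes must be positive")
    if vol_test >= vol_ref:
        return vol_test / vol_ref
    return -(vol_ref / vol_test)


def classify(fold, threshold=2.0):
    """Label a spot using an assumed symmetric fold-change cutoff."""
    if fold >= threshold:
        return "over-expressed"
    if fold <= -threshold:
        return "under-expressed"
    return "unchanged"


if __name__ == "__main__":
    # (label, ave. volume 2661, ave. volume 2662) taken from the tables above.
    rows = [("ID758", 0.02286, 0.157), ("ID1035", 0.78354, 0.18)]
    for label, v2661, v2662 in rows:
        fold = signed_fold_change(v2661, v2662)
        print(label, round(fold, 3), classify(fold))
```

The results differ slightly from the tabulated values because the slide shows the 2662 volumes rounded to three decimals.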
31
Graphical representation
32
Preliminary Conclusions
  • 2661 versus 2662
  • Number of over-expressed spots: 18
  • Number of under-expressed spots: 24
  • Report storage location
  • Folder: 052708_Delta2D_image analysis
  • Contents: PowerPoint presentation, pick list,
    snapshot images with labeling and Delta2D report

33
Mass spectrometry (MS) based peptide
identification
  • Gel bands of interest are excised for in-gel
    digestion
  • Alternatively, a protein sample of interest can
    be digested in solution
  • The digest is then subjected to MS or MS/MS for
    peptide identification
  • You can choose to run LC before MS
  • Some instruments, like FT-MS, allow MS and MS/MS
    of undigested protein samples

34
List of useful proteomics websites
  • http://www.fixingproteomics.org/
  • http://www.ionsource.com/
  • www.ebi.ac.uk
  • www.proteomecommons.org
  • www.peptideatlas.org
  • http://www.biochem.mpg.de/mann
  • http://ncrr.pnl.gov/software/
  • http://tools.proteomecenter.org
  • http://www.pil.sdu.dk/
  • www.proteomesoftware.com
  • www.hprd.org
  • www.expasy.org

Click on each link to get familiar
35
Protein database search engines (Algorithms)
  • Commercial
  • Mascot (Matrix Science)
  • Sequest (Thermo Scientific)
  • Phenyx (GeneBio)
  • Spectrum Mill (Agilent Technologies)
  • Paragon etc. (Applied Biosystems)
  • Free or open source
  • Mascot (free public server, for small datasets)
  • http://www.matrixscience.com/search_form_select.html
  • X!Tandem
  • http://www.thegpm.org/tandem/
  • Requires command-line use
  • OMSSA etc.
  • http://pubchem.ncbi.nlm.nih.gov/omssa/

36
Source of ambiguity in proteomics data analysis
  • Quality of the MS/MS spectra
  • Redundant databases
  • Search strategy used
  • Nature of the model used by database search
    engines (scoring algorithms)
  • The most common source of ambiguity and incorrect
    assignment

37
Overview of processing mass spectrometry data
Proteomics data validation: why all must provide
data. Lennart Martens and Henning Hermjakob.
Mol. BioSyst., 2007, 3, 518-522.
38
Why do different search engines generate different
peptide lists from the same dataset?
  • Mascot
  • Probability-based MOWSE scoring
  • Sequest
  • Cross-correlation (Xcorr) between experimental
    and theoretical spectra is used
  • Reports deltaCn
  • X!Tandem
  • Considers only B/Y-type ions
  • Creates a database of proteins identified and
    performs an extensive search on only identified
    proteins
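The b- and y-type ions mentioned above are the N-terminal and C-terminal fragments of a peptide, and their theoretical m/z values are what a search engine compares against the observed fragment spectrum. The following is a minimal sketch for singly charged ions using standard monoisotopic residue masses. It is not X!Tandem's implementation, and the function name by_ions is hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def by_ions(peptide):
    """Return ([b1..b(n-1)], [y1..y(n-1)]) m/z values for charge 1+."""
    masses = [MONO[residue] for residue in peptide]
    b_ions, y_ions = [], []
    running = 0.0
    for mass in masses[:-1]:            # N-terminal prefixes
        running += mass
        b_ions.append(running + PROTON)
    running = 0.0
    for mass in reversed(masses[1:]):   # C-terminal suffixes
        running += mass
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = by_ions("GAK")
    print("b ions:", [round(v, 4) for v in b])
    print("y ions:", [round(v, 4) for v in y])
```

Each complementary pair b(i) and y(n-i) sums to the neutral peptide mass plus two protons, which is a quick sanity check on any fragment table.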

39
Protein databases
  • NCBI
  • NCBI's Entrez Protein database, from the National
    Center for Biotechnology Information, contains
    redundant protein sequences with poor annotation.
  • RefSeq is NCBI's Reference Sequence database, a
    comprehensive, integrated, non-redundant,
    well-annotated set of sequences.
  • UniProt/Swiss-Prot
  • The UniProt Knowledgebase (UniProtKB) consists of
    two sections:
  • manually annotated and reviewed
    UniProtKB/Swiss-Prot and
  • automatically annotated UniProtKB/TrEMBL.
  • UniProtKB/Swiss-Prot is well curated, well
    annotated, non-redundant and considerably smaller
    than NCBI, and is therefore widely used.
  • IPI
  • IPI, the International Protein Index, is used for
    species-specific searches and is maintained by
    the European Bioinformatics Institute (EBI).
  • The decision as to which database to use depends
    solely on the aim of the project and the type of
    experiment concerned.
  • If the goal is the highest sensitivity, NCBI is
    more desirable as a first step.
  • However, searching a large database is time
    consuming and requires manual validation as a
    second step and/or further distillation of the
    protein list based on other specific databases.
    For identifying sequence variants, NCBI is a
    better starting point.
  • UniProtKB/Swiss-Prot, on the other hand, is a
    better option for investigators seeking faster,
    reliable search results.
  • If species information is known, the IPI database
    is a good candidate, containing protein sequences
    with cross-references to all its source data,
    e.g. Ensembl, UniProt, RefSeq.

40
#1 problem
  • Proliferation of new search algorithms, with a
    variety of settings: which one(s)?

41
Importance of database search algorithms in
peptide identification
Each search engine identifies about the same
number of spectra, but the overlap is surprisingly
small. Different search engines match different
spectra.
[Venn diagram of SEQUEST, Mascot and X!Tandem; region counts 9, 4, 22, 34, 19, 7, 5]
Courtesy Proteome Software Inc.
42
Sequest, Mascot and X!Tandem scores
  • SEQUEST: XCorr > 2.5, DeltaCn > 0.1
  • Mascot: Ion Score - Identity Score > 0
  • X!Tandem: E-value < 0.01

How do we compare these?
That's where Scaffold comes in
Courtesy Proteome Software Inc.
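The three per-engine cutoffs listed on this slide can be written as simple predicates, which makes the comparison problem concrete: each engine scores on its own scale, so the raw numbers cannot be compared directly. This is a minimal sketch with hypothetical function names; only the threshold values come from the slide.

```python
def passes_sequest(xcorr, delta_cn):
    """SEQUEST cutoff from the slide: XCorr > 2.5 and DeltaCn > 0.1."""
    return xcorr > 2.5 and delta_cn > 0.1


def passes_mascot(ion_score, identity_score):
    """Mascot cutoff from the slide: ion score above the identity score."""
    return ion_score - identity_score > 0


def passes_xtandem(e_value):
    """X!Tandem cutoff from the slide: E-value < 0.01."""
    return e_value < 0.01


if __name__ == "__main__":
    # Made-up scores for one spectrum, for illustration only.
    print("SEQUEST :", passes_sequest(xcorr=3.1, delta_cn=0.25))
    print("Mascot  :", passes_mascot(ion_score=41.0, identity_score=38.0))
    print("X!Tandem:", passes_xtandem(e_value=0.003))
```

A spectrum can pass one engine's cutoff and fail another's, which is the motivation for converting every score to a probability before combining results.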
43
Scaffold workflow
  • Stages: Peptide Prophet, Scaffold Merger, Protein
    Prophet
  • For each spectrum
  • Get SEQUEST IDs, calculate SEQUEST probability
  • Get Mascot IDs, calculate Mascot probability
  • Get X!Tandem IDs, calculate X!Tandem probability
  • Calculate combined peptide probability
  • Calculate protein probabilities

Scaffold uses Nesvizhskii's algorithm to convert
SEQUEST and Mascot scores to peptide
probabilities, and another algorithm by
Nesvizhskii to combine peptide probabilities.
Nesvizhskii, A. I. et al., Anal. Chem. 2003, 75,
4646-4658
44
Scaffold View
Click on Scaffold and import file MSX50.sfd
We will be able to import this file into Protein
Center after exporting it in ProtXML file format
(MSX50.xml)
45
Consensus Study Conducted by BAC
  • Mascot only (MO)
  • Sequest only (SO)
  • X!Tandem only (XO)
  • Union of M & S (MSU)
  • MXU
  • SXU
  • MSXU
  • Intersection of M & S (MSI)
  • SXI
  • MXI
  • MSXI

[Venn diagram of S, M and X]
Evaluate the performance of each method using a
standard protein dataset. Performance measures are
the sensitivity and specificity of each method.
Sultana T, Jordan R, Lyons-Weiler J. 2009.
Optimization of the use of consensus methods for
the detection and putative identification of
peptides via mass spectrometry using protein
standard mixtures. J Proteomics Bioinform 2:
263-273.
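The eleven consensus methods listed on this slide map directly onto set operations, and sensitivity and specificity can then be computed against a known standard mixture. The following is a minimal sketch with made-up identifiers. It assumes "only" means using that engine's results alone, and it assumes a fixed universe of candidate proteins for counting true negatives; neither detail is spelled out on the slide, and the function names are hypothetical.

```python
def consensus_sets(m, s, x):
    """Build the 11 consensus sets from Mascot (m), Sequest (s), X!Tandem (x)."""
    return {
        "MO": set(m), "SO": set(s), "XO": set(x),
        "MSU": m | s, "MXU": m | x, "SXU": s | x, "MSXU": m | s | x,
        "MSI": m & s, "MXI": m & x, "SXI": s & x, "MSXI": m & s & x,
    }


def sensitivity_specificity(identified, true_set, universe):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    tp = len(identified & true_set)
    fn = len(true_set - identified)
    fp = len(identified - true_set)
    tn = len(universe - true_set - identified)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


if __name__ == "__main__":
    # Made-up identifications; "a", "b", "c" play the role of the standards.
    mascot, sequest, xtandem = {"a", "b", "c"}, {"b", "c", "d"}, {"c", "e"}
    truth, universe = {"a", "b", "c"}, set("abcdef")
    for name, ids in consensus_sets(mascot, sequest, xtandem).items():
        sens, spec = sensitivity_specificity(ids, truth, universe)
        print(f"{name:5s} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy data the three-way union recovers every standard but admits false positives, while the three-way intersection does the opposite, which is the trade-off the study measures.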
46
Scaffold confidence filter settings
  • Minimum Protein: defines the probability that
    a protein's identification is correct
  • (20%, 50%, 80%, 90%, 95%, 99%, 99.9%)
  • Minimum Peptides: filters results by the
    number of unique peptides on which the
    identification is based
  • (1, 2, 3, 4, 5)
  • Minimum Peptide: requires a minimum
    probability from at least one spectrum
  • (0%, 20%, 50%, 80%, 90%, 95%)

47
GPCL-BAC recommendations (Sultana et al., 2009
publication)
  • Most accurate peptide identification
  • Union of Sequest, Mascot & X!Tandem (MSXU)
  • Scaffold filter: 95% protein probability, 2
    minimum unique peptides, 50% peptide
    probability.
  • Most sensitive peptide identification
  • Union of Sequest & Mascot (MSU) or union of
    Sequest & X!Tandem (MSXU) or Sequest only (SO)
  • Scaffold filter: 80% protein probability, 1
    minimum unique peptide, 50% peptide
    probability.
  • Most specific peptide identification
  • Union of Mascot & X!Tandem (MXU) or Mascot only
    (MO)
  • Scaffold filter: 99% protein probability, 3
    minimum unique peptides, 50% peptide
    probability.
  • Sultana T, Jordan R, Lyons-Weiler J. 2009.
    Optimization of the use of consensus methods for
    the detection and putative identification of
    peptides via mass spectrometry using protein
    standard mixtures. J Proteomics Bioinform 2:
    263-273.

Sensitivity and specificity trade off
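The three recommended filter settings can be captured as named presets and applied to a protein record. The following is a minimal sketch and not Scaffold's own interface: the class, the preset names and the keep function are hypothetical, and treating each threshold as "greater than or equal to" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaffoldFilter:
    min_protein_prob: float     # percent
    min_unique_peptides: int
    min_peptide_prob: float     # percent


# Values from the recommendations above (protein %, peptides, peptide %).
PRESETS = {
    "accurate": ScaffoldFilter(95.0, 2, 50.0),
    "sensitive": ScaffoldFilter(80.0, 1, 50.0),
    "specific": ScaffoldFilter(99.0, 3, 50.0),
}


def keep(protein_prob, unique_peptides, best_peptide_prob, flt):
    """True when a protein record meets all three thresholds of a preset."""
    return (
        protein_prob >= flt.min_protein_prob
        and unique_peptides >= flt.min_unique_peptides
        and best_peptide_prob >= flt.min_peptide_prob
    )


if __name__ == "__main__":
    # Illustrative record: 89.9% protein probability, 2 unique peptides,
    # and an assumed best peptide probability of 95%.
    for name, flt in PRESETS.items():
        print(name, keep(89.9, 2, 95.0, flt))
```

Tightening the protein probability and peptide count moves a list toward specificity, and loosening them moves it toward sensitivity.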
48
Biology of MICA Protein in Human Sarcoma Cells
  • Final Research Report
  • March 5, 2010
  • PI: ____ _____
  • GPCL-Bioinformatics Analysis Core Analysts:
  • Tamanna Sultana and James Lyons-Weiler
  • Project Location: http://mygpcl/

Example of consensus peptide ID data analysis and
reporting
49
Specific Aims (obtained from PI)
  • ID proteins in sample
  • Identification of other closely interacting
    proteins with MICA in human sarcoma cells

50
Study Details
  • Samples: protein extracts of osteosarcoma cell
    lines
  • SCH2473A8MA3_sample 1
  • SCH2473A9MA3_sample 2
  • Sample preparation
  • Stably overexpressed MICA was immunoprecipitated
    with MICA antibody and the complex was then
    pulled down
  • 5 µg/10 µL protein was reduced with TCEP,
    alkylated with iodoacetamide and digested with
    trypsin.
  • LCQ-Deca-XL (LC-ESI-MS) was used for MS and MS/MS
    data generation
  • Data sets generated
  • SCH2473A8MA3-SpectraFile.RAW
  • SCH2473A9MA3-SpectraFile.RAW

51
Previous Analysis Summary
  • Sequest search results provided by Manny
    Schreiber from Proteomics Lab

52
GPCL-BAC Profile Data Analysis
  • Start with the LC-MS Raw data file
  • Convert to peak list (MGF file format)
  • Run Mascot and X!Tandem search
  • Using Scaffold, combine the search results of
    Sequest, Mascot and X!Tandem
  • Provide protein lists
  • (Most) Accurate (list 1)
  • (Most) Sensitive (list 2)
  • (Most) Specific (list 3)
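The peak list step in this workflow produces Mascot generic format (MGF) text, in which each spectrum sits between BEGIN IONS and END IONS, with KEY=value parameter lines followed by "m/z intensity" peak lines. The following is a minimal reader sketch for that layout. The function name parse_mgf and the example spectrum are made up, and real files may carry extra header parameters that this sketch ignores.

```python
def parse_mgf(text):
    """Parse MGF text into a list of {'params': {...}, 'peaks': [(mz, inten)]}."""
    spectra = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                parts = line.split()
                intensity = float(parts[1]) if len(parts) > 1 else 0.0
                current["peaks"].append((float(parts[0]), intensity))
    return spectra


EXAMPLE = """BEGIN IONS
TITLE=example scan 1
PEPMASS=445.12
CHARGE=2+
147.11 1200.0
175.12 850.5
END IONS
"""

if __name__ == "__main__":
    for spectrum in parse_mgf(EXAMPLE):
        print(spectrum["params"]["TITLE"], len(spectrum["peaks"]), "peaks")
```

Lines outside a BEGIN IONS / END IONS pair are skipped, so file-level parameters do not break the reader.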

53
SCH2473A8MA3.RAW
SCH2473A9MA3.RAW
54
Database Search Parameters
  • Database
  • IPI human v. 3.57
  • Search algorithms
  • Mascot, Sequest & X!Tandem
  • Modifications (variable)
  • Carbamidomethyl (+57 @C) and oxidation (+16 @M)
  • Missed cleavages: 2 maximum
  • Error tolerance: 2 Da on both parent and fragment
    ions
  • Peak list conversion
  • Raw files were converted into Mascot generic
    format (MGF) peak lists using extract_msn provided
    with the Xcalibur software of the LCQ instrument
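The variable modifications and the 2 Da tolerance above determine which candidate peptides can match an observed precursor mass. The following is a minimal sketch of that check using monoisotopic masses: every cysteine may or may not carry carbamidomethyl (+57.02146), and every methionine may or may not be oxidized (+15.99491). The function names are hypothetical, and this is not how any of the three search engines implements the step.

```python
from itertools import product

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
CARBAMIDOMETHYL = 57.02146   # variable, on C
OXIDATION = 15.99491         # variable, on M


def candidate_masses(peptide):
    """Neutral masses for every count of modified C and M residues."""
    base = sum(MONO[residue] for residue in peptide) + WATER
    n_c, n_m = peptide.count("C"), peptide.count("M")
    return sorted(
        base + i * CARBAMIDOMETHYL + j * OXIDATION
        for i, j in product(range(n_c + 1), range(n_m + 1))
    )


def matches_precursor(observed_mass, peptide, tolerance_da=2.0):
    """True if any modification state lies within the mass tolerance."""
    return any(abs(observed_mass - mass) <= tolerance_da
               for mass in candidate_masses(peptide))


if __name__ == "__main__":
    print([round(m, 3) for m in candidate_masses("CM")])
    print(matches_precursor(325.5, "CM"))
```

Each variable modification multiplies the number of candidate masses, which is one reason wide tolerances and many variable modifications increase both search time and the chance of random matches.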

55
Accurate list for sample 1 (MSXU-95_2_50)
Columns: Biological sample name, Protein name, Protein accession numbers, Protein molecular weight (Da), Protein identification probability (%), Number of unique peptides, Number of unique spectra, Number of total spectra, Percentage of total spectra, Percentage sequence coverage
MascotA8 Calicin IPI00299881 66,564.70 100.00 16 16 17 1.60 25.20
MascotA8 Putative uncharacterized protein (Fragment) IPI00816622 9,137.30 100.00 10 10 13 1.22 69.90
MascotA8 tudor domain containing 10 isoform a IPI00432733,IPI00514618 40,923.70 100.00 7 7 9 0.85 15.30
MascotA8 Protein IPI00513900 58,318.90 100.00 20 22 23 2.16 20.00
MascotA8 Isoform 2 of Tropomyosin alpha-4 chain IPI00216975 32,705.70 100.00 18 19 20 1.88 35.90
MascotA8 Isoform 2 of Protein spire homolog 1 IPI00645268 83,940.80 100.00 34 36 40 3.76 31.70
MascotA8 similar to hCG1820764 IPI00741841 11,265.40 99.90 5 5 8 0.75 50.50
MascotA8 Protein FAM26E IPI00166835 35,152.50 99.80 5 6 16 1.50 9.39
MascotA8 Ras-related protein Rab-5A IPI00023510 23,640.80 99.40 4 5 9 0.85 23.70
MascotA8 Isoform 1 of Tropomyosin alpha-4 chain IPI00010779 28,504.40 98.70 4 4 4 0.38 33.50
56
Sensitive list for sample 1 (MSU-80_1_50)
Columns: Biological sample name, Protein name, Protein accession numbers, Protein molecular weight (Da), Protein identification probability (%), Number of unique peptides, Number of unique spectra, Number of total spectra, Percentage of total spectra, Percentage sequence coverage
MascotA8 Calicin IPI00299881 66,564.70 100.00 16 16 17 1.60 25.20
MascotA8 Putative uncharacterized protein (Fragment) IPI00816622 9,137.30 100.00 10 10 13 1.22 69.90
MascotA8 tudor domain containing 10 isoform a IPI00432733,IPI00514618 40,923.70 100.00 7 7 9 0.85 15.30
MascotA8 Protein IPI00513900 58,318.90 100.00 20 22 23 2.16 20.00
MascotA8 Isoform 2 of Tropomyosin alpha-4 chain IPI00216975 32,705.70 100.00 18 19 20 1.88 35.90
MascotA8 Isoform 2 of Protein spire homolog 1 IPI00645268 83,940.80 100.00 34 36 40 3.76 31.70
MascotA8 similar to hCG1820764 IPI00741841 11,265.40 99.90 5 5 8 0.75 50.50
MascotA8 Protein FAM26E IPI00166835 35,152.50 99.80 5 6 16 1.50 9.39
MascotA8 Ras-related protein Rab-5A IPI00023510 23,640.80 99.40 4 5 9 0.85 23.70
MascotA8 Isoform 1 of Tropomyosin alpha-4 chain IPI00010779 28,504.40 98.70 4 4 4 0.38 33.50
SequestA8 Isoform 2 of Membrane-associated guanylate kinase, WW and PDZ domain-containing protein 2 IPI00376306,IPI00413880 157,220.10 95.00 1 1 1 0.11 0.56
SequestA8 Isoform 1 of Uncharacterized protein KIAA1107 IPI00298902,IPI00885110 138,584.40 95.00 1 1 1 0.11 0.48
SequestA8 Isoform 2 of Protein FAM184A IPI00647504,IPI00746850,IPI00871942,IPI00874044 61,451.20 95.00 1 1 1 0.11 5.10
SequestA8 Isoform 2 of Metallothionein-1G IPI00413064 6,051.60 92.60 1 1 1 0.11 41.00
MascotA8 GTPase IMAP family member 8 IPI00168482 99,026.30 89.90 2 2 3 0.28 1.25
SequestA8 Isoform 1 of WSC domain-containing protein 2 IPI00005652,IPI00217627 65,758.60 89.90 1 1 1 0.11 1.37
MascotA8 similar to GOLGA8A protein IPI00887650,IPI00888792 125,017.00 88.30 2 2 2 0.19 1.04
SequestA8 Isoform 1 of Zinc finger protein 461 IPI00152053 66,195.40 87.90 1 1 1 0.11 3.37
MascotA8 Myosin-8 IPI00302329 222,749.30 85.20 2 2 2 0.19 1.29
SequestA8 Isoform Short of Heat shock factor protein 1 IPI00218507,IPI00902551 52,863.50 85.20 1 1 1 0.11 1.02
SequestA8 cDNA FLJ61614, highly similar to TBC1 domain family member 1 IPI00878875,IPI00879034 27,553.40 85.00 1 1 1 0.11 7.54
SequestA8 Microfibrillar-associated protein 2 IPI00022621,IPI00644827 20,736.10 83.50 1 1 1 0.11 2.75
57
Specific list for sample 1 (MO-99_3_50)
Columns: Biological sample name, Protein name, Protein accession numbers, Protein molecular weight (Da), Protein identification probability (%), Number of unique peptides, Number of unique spectra, Number of total spectra, Percentage of total spectra, Percentage sequence coverage
MascotA8 Calicin IPI00299881 66,564.70 100.00 16 16 17 1.60 25.20
MascotA8 Putative uncharacterized protein (Fragment) IPI00816622 9,137.30 100.00 10 10 13 1.22 69.90
MascotA8 tudor domain containing 10 isoform a IPI00432733,IPI00514618 40,923.70 100.00 7 7 9 0.85 15.30
MascotA8 Protein IPI00513900 58,318.90 100.00 20 22 23 2.16 20.00
MascotA8 Isoform 2 of Tropomyosin alpha-4 chain IPI00216975 32,705.70 100.00 18 19 20 1.88 35.90
MascotA8 Isoform 2 of Protein spire homolog 1 IPI00645268 83,940.80 100.00 34 36 40 3.76 31.70
MascotA8 similar to hCG1820764 IPI00741841 11,265.40 99.90 5 5 8 0.75 50.50
MascotA8 Protein FAM26E IPI00166835 35,152.50 99.80 5 6 16 1.50 9.39
MascotA8 Ras-related protein Rab-5A IPI00023510 23,640.80 99.40 4 5 9 0.85 23.70
MascotA8 Isoform 1 of Tropomyosin alpha-4 chain IPI00010779 28,504.40 98.70 4 4 4 0.38 33.50
58
Venn diagram for sample 1 using sensitive list
[Venn diagram of SEQUEST, Mascot and X!Tandem; region counts 29, 3, 0, 0, 19, 42, 7]
59
Accurate list for sample 2 (MSXU-95_2_50)
Columns: Biological sample name, Protein name, Protein accession numbers, Protein molecular weight (Da), Protein identification probability (%), Number of unique peptides, Number of unique spectra, Number of total spectra, Percentage of total spectra, Percentage sequence coverage
MascotA9 annexin A2 isoform 1 IPI00418169 40,395.30 100.00 6 6 7 0.57 24.60
MascotA9 Isoform 2 of Mucin-19 IPI00829833,IPI00896516 564,046.60 100.00 11 11 11 0.89 3.27
MascotA9 Protein IPI00916368 145,322.20 100.00 5 5 6 0.49 8.37
MascotA9 Isoform 3 of HEAT repeat-containing protein 5B IPI00333696,IPI00479069 214,978.60 100.00 7 7 7 0.57 4.89
MascotA9 Isoform 1 of AT-hook-containing transcription factor 1 IPI00170594,IPI00217957,IPI00876984,IPI00878213 253,444.20 100.00 8 8 8 0.65 5.23
MascotA9 Zinc finger protein 282 IPI00003798 74,277.40 99.90 5 5 7 0.57 12.10
MascotA9 Interferon-induced protein with tetratricopeptide repeats 3 IPI00024254 55,968.00 99.90 3 3 3 0.24 11.20
MascotA9 Isoform 1 of Transcription factor TFIIIB component B'' homolog IPI00760877,IPI00893272 293,875.60 99.90 6 6 7 0.57 4.84
MascotA9 Metallothionein-2 IPI00022498 6,023.60 99.90 5 5 7 0.57 90.20
MascotA9 Vimentin IPI00418471 53,634.60 99.90 3 3 3 0.24 6.01
MascotA9 Isoform 1 of Disabled homolog 2-interacting protein IPI00789361 131,611.90 99.80 4 4 4 0.33 3.87
MascotA9 Similar to Signal peptidase complex subunit 2 IPI00452747 24,959.10 99.80 3 3 4 0.33 16.80
SequestA9 Vimentin IPI00418471 53,634.60 99.80 2 2 2 0.17 4.94
MascotA9 Annexin A5 IPI00329801,IPI00872379 35,920.60 99.70 2 2 2 0.16 6.25
MascotA9 Isoform 2 of Protein Dok-7 IPI00168218 37,143.30 99.60 3 3 4 0.33 13.10
MascotA9 Isoform GTBP-N of DNA mismatch repair protein Msh6 IPI00384456 152,771.90 99.60 4 4 4 0.33 7.65
MascotA9 Isoform 1 of HEAT repeat-containing protein 5A IPI00783902 222,619.90 99.50 4 4 5 0.41 3.08
MascotA9 Isoform 1 of Cullin-4A IPI00419273 87,665.70 99.30 4 4 4 0.33 5.14
MascotA9 Intersectin 1 isoform 9 IPI00657958 5,457.10 98.70 3 3 3 0.24 42.00
MascotA9 Isoform 3 of Protein diaphanous homolog 2 IPI00844086 125,031.10 98.70 4 4 4 0.33 4.56
MascotA9 Nuclear receptor-interacting protein 1 IPI00010196 126,926.60 98.30 3 3 4 0.33 4.66
MascotA9 Semaphorin-3G IPI00024570 86,682.20 97.50 3 3 4 0.33 3.07
MascotA9 Glycerol-3-phosphate dehydrogenase NAD, cytoplasmic IPI00295777 37,549.60 96.70 2 2 2 0.16 18.60
MascotA9 Isoform 1 of NACHT, LRR and PYD domains-containing protein 4 IPI00151977,IPI00218614 113,400.60 96.20 2 2 3 0.24 2.11
MascotA9 FLJ00049 protein (Fragment) IPI00025469 14,347.50 95.90 2 2 2 0.16 31.60
MascotA9 DFNA5 protein family protein IPI00884370 13,300.90 95.70 2 2 2 0.16 27.60
MascotA9 Uncharacterized protein LOC400831 IPI00783496 13,573.60 94.80 2 2 2 0.16 19.70
60
Sensitive list for sample 2 (MSU-80_1_50)
Columns: Biological sample name, Protein name, Protein accession numbers, Protein molecular weight (Da), Protein identification probability (%), Number of unique peptides, Number of unique spectra, Number of total spectra, Percentage of total spectra, Percentage sequence coverage
MascotA9 annexin A2 isoform 1 IPI00418169 40,395.30 100.00 6 6 7 0.57 24.60
MascotA9 Isoform 2 of Mucin-19 IPI00829833,IPI00896516 564,046.60 100.00 11 11 11 0.89 3.27
MascotA9 Protein IPI00916368 145,322.20 100.00 5 5 6 0.49 8.37
MascotA9 Isoform 3 of HEAT repeat-containing protein 5B IPI00333696,IPI00479069 214,978.60 100.00 7 7 7 0.57 4.89
MascotA9 Isoform 1 of AT-hook-containing transcription factor 1 IPI00170594,IPI00217957,IPI00876984,IPI00878213 253,444.20 100.00 8 8 8 0.65 5.23
MascotA9 Zinc finger protein 282 IPI00003798 74,277.40 99.90 5 5 7 0.57 12.10
MascotA9 Interferon-induced protein with tetratricopeptide repeats 3 IPI00024254 55,968.00 99.90 3 3 3 0.24 11.20
MascotA9 Isoform 1 of Transcription factor TFIIIB component B'' homolog IPI00760877,IPI00893272 293,875.60 99.90 6 6 7 0.57 4.84
MascotA9 Metallothionein-2 IPI00022498 6,023.60 99.90 5 5 7 0.57 90.20
MascotA9 Vimentin IPI00418471 53,634.60 99.90 3 3 3 0.24 6.01
MascotA9 Isoform 1 of Disabled homolog 2-interacting protein IPI00789361 131,611.90 99.80 4 4 4 0.33 3.87
MascotA9 Similar to Signal peptidase complex subunit 2 IPI00452747 24,959.10 99.80 3 3 4 0.33 16.80
SequestA9 Vimentin IPI00418471 53,634.60 99.80 2 2 2 0.17 4.94
MascotA9 Annexin A5 IPI00329801,IPI00872379 35,920.60 99.70 2 2 2 0.16 6.25
MascotA9 Isoform 2 of Protein Dok-7 IPI00168218 37,143.30 99.60 3 3 4 0.33 13.10
MascotA9 Isoform GTBP-N of DNA mismatch repair protein Msh6 IPI00384456 152,771.90 99.60 4 4 4 0.33 7.65
MascotA9 Isoform 1 of HEAT repeat-containing protein 5A IPI00783902 222,619.90 99.50 4 4 5 0.41 3.08
MascotA9 Isoform 1 of Cullin-4A IPI00419273 87,665.70 99.30 4 4 4 0.33 5.14
MascotA9 Intersectin 1 isoform 9 IPI00657958 5,457.10 98.70 3 3 3 0.24 42.00
MascotA9 Isoform 3 of Protein diaphanous homolog 2 IPI00844086 125,031.10 98.70 4 4 4 0.33 4.56
MascotA9 Nuclear receptor-interacting protein 1 IPI00010196 126,926.60 98.30 3 3 4 0.33 4.66
MascotA9 Semaphorin-3G IPI00024570 86,682.20 97.50 3 3 4 0.33 3.07
MascotA9 Glycerol-3-phosphate dehydrogenase NAD, cytoplasmic IPI00295777 37,549.60 96.70 2 2 2 0.16 18.60
MascotA9 Isoform 1 of NACHT, LRR and PYD domains-containing protein 4 IPI00151977,IPI00218614 113,400.60 96.20 2 2 3 0.24 2.11
MascotA9 FLJ00049 protein (Fragment) IPI00025469 14,347.50 95.90 2 2 2 0.16 31.60
MascotA9 DFNA5 protein family protein IPI00884370 13,300.90 95.70 2 2 2 0.16 27.60
MascotA9 Uncharacterized protein LOC400831 IPI00783496 13,573.60 94.80 2 2 2 0.16 19.70
SequestA9 Complement protein C4B frameshift mutant (Fragment) IPI00922744 38,120.90 94.70 1 1 1 0.09 7.63
SequestA9 Annexin A5 IPI00329801,IPI00872379 35,789.40 94.70 1 1 1 0.09 5.02
SequestA9 annexin A2 isoform 1 IPI00418169 40,395.30 94.70 1 1 1 0.09 4.48
SequestA9 anaphase-promoting complex subunit 7 isoform a IPI00008248,IPI00915282 59,986.80 94.70 1 1 1 0.09 4.10
SequestA9 Copine-3 IPI00024403 60,113.60 94.70 1 1 1 0.09 2.42
SequestA9 8 kDa protein IPI00917340 8,033.10 94.70 1 1 1 0.09 6.67
SequestA9 cDNA FLJ54922 IPI00908630 17,676.10 94.70 1 1 1 0.09 9.58
SequestA9 Putative transposase IPI00385003 51,650.80 94.70 1 1 1 0.09 5.07
SequestA9 40S ribosomal protein S25 IPI00012750 13,725.00 94.70 1 1 2 0.17 8.80
SequestA9 C1orf2 protein IPI00059407 15,443.10 94.70 1 1 1 0.09 3.47
SequestA9 Spermatogenesis- and oogenesis-specific basic helix-loop-helix-containing protein 1 IPI00456833 34,470.20 94.70 1 1 1 0.09 1.52
SequestA9 T-lymphoma invasion and metastasis-inducing protein 1 IPI00011400,IPI00747471 118,638.60 94.70 1 1 1 0.09 0.75
SequestA9 Isoform 1 of Bromodomain-containing protein 3 IPI00014266,IPI00410716 79,524.50 94.70 1 1 1 0.09 2.07
SequestA9 Putative uncharacterized protein DKFZp566H184 IPI00552379 14,090.60 94.70 1 1 1 0.09 9.49
SequestA9 Conserved hypothetical protein IPI00745054 11,127.60 94.70 1 1 1 0.09 4.85
SequestA9 cDNA FLJ60844, highly similar to Homo sapiens centrosomal protein 170 kDa (CEP170), transcript variant alpha, mRNA IPI00910412 45,955.70 94.70 1 1 1 0.09 4.85
SequestA9 Nuclease-sensitive element-binding protein 1 IPI00031812,IPI00450235,IPI00643351 35,905.70 94.70 1 1 1 0.09 6.17
SequestA9 Isoform 1 of Caprin-1 IPI00783872,IPI00873926,IPI00910763 78,346.40 94.40 1 1 1 0.09 1.97
MascotA9 Pre-B-cell leukemia transcription factor 4 IPI00019294 40,836.60 93.80 2 2 2 0.16 6.42
MascotA9 Sperm acrosome-associated protein 5 IPI00044895 17,878.30 93.70 2 2 2 0.16 20.80
SequestA9 Methylenetetrahydrofolate reductase IPI00646036 15,800.00 91.90 1 1 1 0.09 6.29
SequestA9 Major facilitator superfamily domain-containing protein 9 IPI00301231,IPI00303331,IPI00477801,IPI00644135 37,842.70 91.20 1 1 1 0.09 1.52
SequestA9 fibrillin 2 precursor IPI00019439 314,744.50 91.00 1 1 1 0.09 0.72
SequestA9 Isoform 1 of JmjC domain-containing histone demethylation protein 1D IPI00418567,IPI00738581 106,540.00 90.20 1 1 1 0.09 2.98
SequestA9 Uncharacterized protein LOC400831 IPI00783496 13,573.60 89.00 1 1 1 0.09 16.40
SequestA9 Isoform 3 of Syntenin-2 IPI00220218,IPI00302318 22,645.20 87.70 1 1 1 0.09 2.90
MascotA9 Guanine nucleotide-binding protein subunit beta-4 IPI00012451,IPI00026268,IPI00639998,IPI00640949,IPI00642117 37,549.80 85.80 1 1 1 0.08 9.12
MascotA9 similar to 40S ribosomal protein S12 IPI00157456 14,560.50 85.80 1 1 2 0.16 16.50
MascotA9 Isoform 1 of Regulator of nonsense transcripts 1 IPI00034049,IPI00399170 124,328.90 85.50 1 1 3 0.24 1.24
SequestA9 Ras-related C3 botulinum toxin substrate 1 isoform Rac1b variant (Fragment) IPI00555566 15,559.00 81.40 1 1 1 0.09 3.47
MascotA9 Nuclease-sensitive element-binding protein 1 IPI00031812,IPI00450235,IPI00643351 35,905.70 81.30 1 1 3 0.24 6.17
SequestA9 Zinc finger protein 512B IPI00074893,IPI00307591,IPI00413362,IPI00435422 97,245.50 80.80 1 1 1 0.09 0.56
SequestA9 UDP-glucose 6-dehydrogenase IPI00031420 55,007.30 80.00 1 1 1 0.09 2.63
MascotA9 Ribosomal-like PROTEINHLA-F product (Fragment) IPI00848054 9,548.00 79.80 1 1 1 0.08 23.30
SequestA9 Isoform 3 of Serine/threonine-protein kinase 40 IPI00161013 46,376.20 79.70 1 1 1 0.09 1.20
61
Specific list for sample 2 (MO-99_3_50)
Biological sample name Protein name Protein accession numbers Protein molecular weight (Da) Protein identification probability (%) Unique peptides Unique spectra Total spectra Percentage of total spectra Sequence coverage (%)
MascotA9 annexin A2 isoform 1 IPI00418169 40,395.30 100.00 6 6 7 0.57 24.60
MascotA9 Isoform 2 of Mucin-19 IPI00829833,IPI00896516 564,046.60 100.00 11 11 11 0.89 3.27
MascotA9 Protein IPI00916368 145,322.20 100.00 5 5 6 0.49 8.37
MascotA9 Isoform 3 of HEAT repeat-containing protein 5B IPI00333696,IPI00479069 214,978.60 100.00 7 7 7 0.57 4.89
MascotA9 Isoform 1 of AT-hook-containing transcription factor 1 IPI00170594,IPI00217957,IPI00876984,IPI00878213 253,444.20 100.00 8 8 8 0.65 5.23
MascotA9 Zinc finger protein 282 IPI00003798 74,277.40 99.90 5 5 7 0.57 12.10
MascotA9 Interferon-induced protein with tetratricopeptide repeats 3 IPI00024254 55,968.00 99.90 3 3 3 0.24 11.20
MascotA9 Isoform 1 of Transcription factor TFIIIB component B'' homolog IPI00760877,IPI00893272 293,875.60 99.90 6 6 7 0.57 4.84
MascotA9 Metallothionein-2 IPI00022498 6,023.60 99.90 5 5 7 0.57 90.20
MascotA9 Vimentin IPI00418471 53,634.60 99.90 3 3 3 0.24 6.01
MascotA9 Isoform 1 of Disabled homolog 2-interacting protein IPI00789361 131,611.90 99.80 4 4 4 0.33 3.87
MascotA9 Similar to Signal peptidase complex subunit 2 IPI00452747 24,959.10 99.80 3 3 4 0.33 16.80
MascotA9 Isoform 2 of Protein Dok-7 IPI00168218 37,143.30 99.60 3 3 4 0.33 13.10
MascotA9 Isoform GTBP-N of DNA mismatch repair protein Msh6 IPI00384456 152,771.90 99.60 4 4 4 0.33 7.65
MascotA9 Isoform 1 of HEAT repeat-containing protein 5A IPI00783902 222,619.90 99.50 4 4 5 0.41 3.08
MascotA9 Isoform 1 of Cullin-4A IPI00419273 87,665.70 99.30 4 4 4 0.33 5.14
MascotA9 Intersectin 1 isoform 9 IPI00657958 5,457.10 98.70 3 3 3 0.24 42.00
MascotA9 Isoform 3 of Protein diaphanous homolog 2 IPI00844086 125,031.10 98.70 4 4 4 0.33 4.56
62
Venn diagram for sample 2 using sensitive list
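[Venn diagram: proteins identified by SEQUEST, Mascot, and X!Tandem. Region counts as extracted (assignment to regions not recoverable): 35, 0, 13, 3, 5, 42, 2]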
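The region counts in a three-engine Venn diagram like this one can be derived with plain set operations. The sketch below is only an illustration: the protein labels P1 to P6 are hypothetical placeholders, not identifications from this study, and the function name is made up for the example.

```python
def venn3_regions(sequest, mascot, xtandem):
    """Return the seven mutually exclusive regions of a three-set Venn diagram."""
    s, m, x = set(sequest), set(mascot), set(xtandem)
    return {
        "SEQUEST only": s - m - x,
        "Mascot only": m - s - x,
        "X!Tandem only": x - s - m,
        "SEQUEST and Mascot only": (s & m) - x,
        "SEQUEST and X!Tandem only": (s & x) - m,
        "Mascot and X!Tandem only": (m & x) - s,
        "all three": s & m & x,
    }


# Hypothetical protein identifications, for illustration only.
sequest = {"P1", "P2", "P3", "P4"}
mascot = {"P3", "P4", "P5"}
xtandem = {"P4", "P6"}

regions = venn3_regions(sequest, mascot, xtandem)
counts = {name: len(members) for name, members in regions.items()}
total_in_union = sum(counts.values())
```

Because the seven regions do not overlap, their counts sum to the size of the union of the three engine lists, which is a quick sanity check on any Venn diagram.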
63
Size of the consensus set (number of proteins
identified) for each consensus method
Consensus method Sample 1 Sample 2
MSXU-95_2_50 (Accurate) 10 27
MSU-80_1_50 (Sensitive) 23 64
MO-99_3_50 (Specific) 10 18
64
Sample 1 consensus list for each consensus method
Accurate protein list Specific protein list Sensitive protein list
Calicin Calicin Calicin
Putative uncharacterized protein (Fragment) Putative uncharacterized protein (Fragment) Putative uncharacterized protein (Fragment)
tudor domain containing 10 isoform a tudor domain containing 10 isoform a tudor domain containing 10 isoform a
Protein Protein Protein
Isoform 2 of Tropomyosin alpha-4 chain Isoform 2 of Tropomyosin alpha-4 chain Isoform 2 of Tropomyosin alpha-4 chain
Isoform 2 of Protein spire homolog 1 Isoform 2 of Protein spire homolog 1 Isoform 2 of Protein spire homolog 1
similar to hCG1820764 similar to hCG1820764 similar to hCG1820764
Protein FAM26E Protein FAM26E Protein FAM26E
Ras-related protein Rab-5A Ras-related protein Rab-5A Ras-related protein Rab-5A
Isoform 1 of Tropomyosin alpha-4 chain Isoform 1 of Tropomyosin alpha-4 chain Isoform 1 of Tropomyosin alpha-4 chain
    Isoform 2 of Membrane-associated guanylate kinase, WW and PDZ domain-containing protein 2
    Corrupt Accession PIIPI00890727.1SWISS-PROTQ8NB90-2
    Isoform 1 of Uncharacterized protein KIAA1107
    Isoform 2 of Protein FAM184A
    Isoform 2 of Metallothionein-1G
    GTPase IMAP family member 8
    Isoform 1 of WSC domain-containing protein 2
    similar to GOLGA8A protein
    Isoform 1 of Zinc finger protein 461
    Myosin-8
    Isoform Short of Heat shock factor protein 1
    cDNA FLJ61614, highly similar to TBC1 domain family member 1
    Microfibrillar-associated protein 2
65
Sample 2 consensus list for each consensus method
Accurate protein list Specific protein list Sensitive protein list
annexin A2 isoform 1 annexin A2 isoform 1 annexin A2 isoform 1
Isoform 2 of Mucin-19 Isoform 2 of Mucin-19 Isoform 2 of Mucin-19
Protein IPI00916368 Protein IPI00916368 Protein IPI00916368
Isoform 3 of HEAT repeat-containing protein 5B Isoform 3 of HEAT repeat-containing protein 5B Isoform 3 of HEAT repeat-containing protein 5B
Isoform 1 of AT-hook-containing transcription factor 1 Isoform 1 of AT-hook-containing transcription factor 1 Isoform 1 of AT-hook-containing transcription factor 1
Zinc finger protein 282 Zinc finger protein 282 Zinc finger protein 282
Interferon-induced protein with tetratricopeptide repeats 3 Interferon-induced protein with tetratricopeptide repeats 3 Interferon-induced protein with tetratricopeptide repeats 3
Isoform 1 of Transcription factor TFIIIB component B'' homolog Isoform 1 of Transcription factor TFIIIB component B'' homolog Isoform 1 of Transcription factor TFIIIB component B'' homolog
Metallothionein-2 Metallothionein-2 Metallothionein-2
Vimentin Vimentin Vimentin
Isoform 1 of Disabled homolog 2-interacting protein Isoform 1 of Disabled homolog 2-interacting protein Isoform 1 of Disabled homolog 2-interacting protein
Similar to Signal peptidase complex subunit 2 Similar to Signal peptidase complex subunit 2 Similar to Signal peptidase complex subunit 2
Vimentin Isoform 2 of Protein Dok-7 Vimentin
Annexin A5 Isoform GTBP-N of DNA mismatch repair protein Msh6 Annexin A5
Isoform 2 of Protein Dok-7 Isoform 1 of HEAT repeat-containing protein 5A Isoform 2 of Protein Dok-7
Isoform GTBP-N of DNA mismatch repair protein Msh6 Isoform 1 of Cullin-4A Isoform GTBP-N of DNA mismatch repair protein Msh6
Isoform 1 of HEAT repeat-containing protein 5A Intersectin 1 isoform 9 Isoform 1 of HEAT repeat-containing protein 5A
Isoform 1 of Cullin-4A Isoform 3 of Protein diaphanous homolog 2 Isoform 1 of Cullin-4A
Intersectin 1 isoform 9   Intersectin 1 isoform 9
Isoform 3 of Protein diaphanous homolog 2   Isoform 3 of Protein diaphanous homolog 2
Nuclear receptor-interacting protein 1   Nuclear receptor-interacting protein 1
Semaphorin-3G   Semaphorin-3G
Glycerol-3-phosphate dehydrogenase NAD, cytoplasmic   Glycerol-3-phosphate dehydrogenase NAD, cytoplasmic
Isoform 1 of NACHT, LRR and PYD domains-containing protein 4   Isoform 1 of NACHT, LRR and PYD domains-containing protein 4
FLJ00049 protein (Fragment)   FLJ00049 protein (Fragment)
DFNA5 protein family protein   DFNA5 protein family protein
Uncharacterized protein LOC400831   Uncharacterized protein LOC400831
    Complement protein C4B frameshift mutant (Fragment)
    Annexin A5
    annexin A2 isoform 1
    anaphase-promoting complex subunit 7 isoform a
    Copine-3
    8 kDa protein
    cDNA FLJ54922
    Putative transposase
    40S ribosomal protein S25
    C1orf2 protein
    Spermatogenesis- and oogenesis-specific basic helix-loop-helix-containing protein 1
    T-lymphoma invasion and metastasis-inducing protein 1
    Isoform 1 of Bromodomain-containing protein 3
    Putative uncharacterized protein DKFZp566H184
    Conserved hypothetical protein
    cDNA FLJ60844, highly similar to Homo sapiens centrosomal protein 170 kDa (CEP170), transcript variant alpha, mRNA
    Nuclease-sensitive element-binding protein 1
    Corrupt Accession PIIPI00647138.1TREMBLQ5JP05ENSEMB
    Isoform 1 of Caprin-1
    Corrupt Accession PIIPI00890727.1SWISS-PROTQ8NB90-2
    Pre-B-cell leukemia transcription factor 4
    Sperm acrosome-associated protein 5
    Methylenetetrahydrofolate reductase
    Major facilitator superfamily domain-containing protein 9
    fibrillin 2 precursor
    Isoform 1 of JmjC domain-containing histone demethylation protein 1D
    Uncharacterized protein LOC400831
    Isoform 3 of Syntenin-2
    Guanine nucleotide-binding protein subunit beta-4
    similar to 40S ribosomal protein S12
    Isoform 1 of Regulator of nonsense transcripts 1
    Ras-related C3 botulinum toxin substrate 1 isoform Rac1b variant (Fragment)
    Nuclease-sensitive element-binding protein 1
    Zinc finger protein 512B
    UDP-glucose 6-dehydrogenase
    Ribosomal-like PROTEINHLA-F product (Fragment)
    Isoform 3 of Serine/threonine-protein kinase 40
66
Methods Text
  • Sample Preparation
  • Protein extracts of osteosarcoma cell lines,
    SCH2473A8MA3_sample 1 and
    SCH2473A9MA3_sample 2, were processed in the
    Genomics and Proteomics Core Laboratories (GPCL)
    at the University of Pittsburgh prior to
    tandem mass spectrometry. In brief,
    immunoprecipitated samples provided by the PI's
    lab were reduced with
    tris(2-carboxyethyl)phosphine (TCEP), alkylated
    with iodoacetamide (IAC), and digested with
    trypsin (Promega). The ESI-MS and
    information-dependent acquisition (IDA) MS/MS
    spectra were acquired at GPCL with an
    LCQ-Deca-XL coupled with a nano-LC system
    (Thermo Scientific, Waltham, MA). The IDA was
    set so that MS/MS was performed on the three
    most intense peaks per cycle.
  • Database Search
  • Raw experimental spectra files were searched
    against the human database, IPI Human v3.57, to
    identify peptides using the following three
    search algorithms: Sequest, Mascot, and
    X!Tandem. The search parameters for candidate
    peptides were: precursor ion tolerance, 2 Da;
    fragment ion tolerance, 2 Da; variable
    modifications, carbamidomethyl on cysteine and
    oxidation on methionine; maximum missed
    cleavages, 2. For the Mascot and X!Tandem
    searches, the raw files were converted to MGF
    peak lists.
  • Merging the Data
  • Files containing database search results derived
    from Mascot, Sequest, and X!Tandem were imported
    into Scaffold. The software then merged,
    re-scored, and re-ranked the peptide lists
    identified by the three search algorithms.
    Scaffold uses PeptideProphet and ProteinProphet,
    which employ Bayesian statistics to combine the
    probability of identifying spectra with the
    probability that all search methods agree with
    each other.
  • Protein List Generation
  • Accurate list: union of Mascot, Sequest, and
    X!Tandem (MSXU) at 95% minimum protein
    probability, 2 minimum unique peptides, and 50%
    minimum peptide probability
  • Sensitive list: union of Mascot and Sequest
    (MSU) at 80% minimum protein probability, 1
    minimum unique peptide, and 50% minimum peptide
    probability
  • Specific list: Mascot only (MO) at 99% minimum
    protein probability, 3 minimum unique peptides,
    and 50% minimum peptide probability
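The protein-level part of the three list definitions above can be sketched as a simple threshold filter. This is a minimal illustration, not Scaffold's implementation: the class and function names are hypothetical, the inclusive comparison (>=) is an assumption, and the peptide-probability cutoff and the choice of search engines are applied upstream and are not modeled here. The example rows reuse probability and unique-peptide values from the protein tables in this deck.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    name: str
    protein_probability: float  # percent, as reported in the protein tables
    unique_peptides: int


# (minimum protein probability in percent, minimum unique peptides)
THRESHOLDS = {
    "accurate": (95.0, 2),   # MSXU-95_2_50
    "sensitive": (80.0, 1),  # MSU-80_1_50
    "specific": (99.0, 3),   # MO-99_3_50
}


def filter_hits(hits, list_name):
    """Keep hits that meet both protein-level thresholds (assumed inclusive)."""
    min_probability, min_peptides = THRESHOLDS[list_name]
    return [
        hit.name
        for hit in hits
        if hit.protein_probability >= min_probability
        and hit.unique_peptides >= min_peptides
    ]


hits = [
    ProteinHit("annexin A2 isoform 1", 100.0, 6),
    ProteinHit("Vimentin", 99.9, 3),
    ProteinHit("Pre-B-cell leukemia transcription factor 4", 93.8, 2),
    ProteinHit("fibrillin 2 precursor", 91.0, 1),
]

specific = filter_hits(hits, "specific")
accurate = filter_hits(hits, "accurate")
sensitive = filter_hits(hits, "sensitive")
```

Loosening the thresholds only ever adds proteins, which is why the sensitive list is the largest of the three for both samples.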

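The "maximum missed cleavages" search parameter in the Database Search step controls which candidate peptides a search engine generates from each database protein. The sketch below illustrates the idea with the commonly used trypsin rule (cleave after K or R, but not before P). It is a simplified illustration with a made-up sequence and a hypothetical function name, not the digestion code used by Sequest, Mascot, or X!Tandem.

```python
def tryptic_peptides(sequence, max_missed_cleavages=2):
    """List peptides from an in-silico tryptic digest of a protein sequence."""
    if not sequence:
        return []

    # Cleavage points: after K or R, unless the next residue is P.
    cut_points = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            cut_points.append(i + 1)
    cut_points.append(len(sequence))

    peptides = []
    last_index = len(cut_points) - 1
    for start in range(last_index):
        # Joining up to max_missed_cleavages extra fragments models
        # cleavage sites that the enzyme skipped.
        furthest = min(start + max_missed_cleavages + 1, last_index)
        for end in range(start + 1, furthest + 1):
            peptides.append(sequence[cut_points[start]:cut_points[end]])
    return peptides


# Made-up sequence for illustration; the R at position 6 is followed by P,
# so it is not treated as a cleavage site.
fully_cleaved = tryptic_peptides("AAKGGRPCCKDD", max_missed_cleavages=0)
one_missed = tryptic_peptides("AAKGGRPCCKDD", max_missed_cleavages=1)
two_missed = tryptic_peptides("AAKGGRPCCKDD", max_missed_cleavages=2)
```

Allowing more missed cleavages enlarges the candidate peptide list, which increases both search time and the chance of random matches.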
67
Pathway-Express
  • Pathway-Express analysis
  • Convert the protein accession numbers to
    GenBank accession IDs using DAVID
  • The converted ID list is the input file for
    Pathway-Express
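Before the conversion step, the accession column of the protein tables has to be flattened, since many rows list several comma-separated IPI accessions. A minimal sketch of that preparation follows. The function name is hypothetical, and the one-ID-per-line layout is an assumption about what the conversion tool accepts, so check it against the tool's own upload instructions.

```python
def collect_accessions(accession_fields):
    """Split comma-separated accession fields and drop duplicates, keeping order."""
    seen = set()
    ordered = []
    for field in accession_fields:
        for accession in field.split(","):
            accession = accession.strip()
            if accession and accession not in seen:
                seen.add(accession)
                ordered.append(accession)
    return ordered


# Accession fields copied from rows of the protein tables in this deck.
fields = [
    "IPI00031812,IPI00450235,IPI00643351",
    "IPI00019294",
    "IPI00031812,IPI00450235,IPI00643351",  # same protein reported by a second engine
]

id_list = collect_accessions(fields)
upload_text = "\n".join(id_list)  # one accession per line (assumed input layout)
```

Deduplicating matters here because the same protein can appear once per search engine in the merged tables.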


68
Impacted Pathways (sample 1)
Rank Database Name Pathway Name Impact Factor # Genes in Pathway # Input Genes in Pathway # Pathway Genes on Chip % Input Genes in Pathway % Pathway Genes in Input p-value
68 KEGG Ribosome 0.669 101 40 74 15.385 39.604 -3.77E-13
5 KEGG Parkinson's disease 25.022 137 15 101 5.769 10.949 5.26E-11
6 KEGG Alzheimer's disease 22.184 178 16 145 6.154 8.989 1.11E-09
7 KEGG Cardiac muscle contraction 20.419 87 11 69 4.231 12.644 9.58E-09
8 KEGG Huntington's disease 16.844 189 14 154 5.385 7.407 1.48E-07
9 KEGG Focal adhesion 16.757 203 15 189 5.769 7.389 3.15E-07
10 KEGG hsa05131 13.673 54 7 46 2.692 12.963 6.95E-06
11 KEGG Pathogenic Escherichia coli infection 13.673 54 7 46 2.692 12.963 6.95E-06
3 KEGG Antigen processing and presentation 51.553 89 8 68 3.077 8.989 1.10E-05
12 KEGG ECM-receptor interaction 12.773 84 8 76 3.077 9.524 2.51E-05
15 KEGG Allograft rejection 10.079 38 5 31 1.923 13.158 1.12E-04
14 KEGG Graft-versus-host disease 10.315 42 5 32 1.923 11.905 1.32E-04
17 KEGG Type I diabetes mellitus 9.624 44 5 34 1.923 11.364 1.77E-04
18 KEGG Autoimmune thyroid disease 8.285 53 5 45 1.923 9.434 6.76E-04
19 KEGG Prostate cancer 7.728 90 6 81 2.308 6.667 0.001729
16 KEGG Vibrio cholerae infection 9.737 62 5 58 1.923 8.065 0.002149
1 KEGG Leukocyte transendothelial migration 213.497 119 6 108 2.308 5.042 0.007188
21 KEGG Glioma 7.209 65 4 60 1.538 6.154 0.014638
25 KEGG Pathways in cancer 5.456 330 10 310 3.846 3.03 0.025205
29 KEGG Colorectal cancer 4.328 84 4 79 1.538 4.762 0.03588
2 KEGG Cell adhesion molecules (CAMs) 206.572 134 5 120 1.923 3.731 0.041079
13 KEGG Tight junction 12.499 135 5 121 1.923 3.704 0.042324
28 KEGG Gap junction 4.449 96 4 87 1.538 4.167 0.048312
33 KEGG Asthma 3.82 30 2 25 0.769 6.667 0.058296
23 KEGG Melanogenesis 6.196 102 4 95 1.538 3.922 0.062833
36 KEGG p53 signaling pathway 3.344 69 3 65 1.154 4.348 0.082574
22 KEGG Regulation of actin cytoskeleton 6.313 217 6 195 2.308 2.765 0.087915
40 KEGG Hematopoietic cell lineage 2.925 87 3 77 1.154 3.448 0.121068
26 KEGG Natural killer cell mediated cytotoxicity 4.661 135 4 123 1.538 2.963 0.128995
37 KEGG Bladder cancer 3.046 42 2 40 0.769 4.762 0.130371
47 KEGG Protein export 2.502 12 1 10 0.385 8.333 0.146607
35 KEGG Wnt signaling pathway 3.359 152 4 140 1.538 2.632 0.178981
32 KEGG Jak-STAT signaling pathway 3.837 155 4 145 1.538 2.581 0.194779
44 KEGG Renin-angiotensin system 2.686 17 1 17 0.385 5.882 0.23629
43 KEGG Cell cycle 2.687 118 3 108 1.154 2.542 0.241388
30 KEGG Calcium signaling pathway 4.04 182 4 164 1.538 2.198 0.258177
53 KEGG Mismatch repair 2.078 23 1 20 0.385 4.348 0.271791
20 KEGG Long-term potentiation 7.271 73 2 65 0.769 2.74 0.272422
38 KEGG Epithelial cell signaling in Helicobacter pylori infection 3.041 68 2 65 0.769 2.941 0.272422
50 KEGG Systemic lupus erythematosus 2.245 144 3 118 1.154 2.083 0.283798
4 KEGG Adherens junction 41.809 78 2 67 0.769 2.564 0.284099
34 KEGG Melanoma 3.68 71 2 68 0.769 2.817 0.289931
42 KEGG Phosphatidylinositol signaling system 2.768 76 2 71 0.769 2.632 0.307389
41 KEGG Insulin signaling pathway 2.912 138 3 128 1.154 2.174 0.326778
46 KEGG Thyroid cancer 2.534 29 1 26 0.385 3.448 0.337936
49 KEGG Cytokine-cytokine receptor interaction 2.275 263 5 245 1.923 1.901 0.341883
48 KEGG Small cell lung cancer 2.405 86 2 80 0.769 2.326 0.359144
39 KEGG TGF-beta signaling pathway 3.035 87 2 83 0.769 2.299 0.376099
60 KEGG Primary immunodeficiency 1.769 35 1 34 0.385 2.857 0.41691
31 KEGG GnRH signaling pathway 4.031 103 2 92 0.769 1.942 0.425772
51 KEGG Nucleotide excision repair 2.129 44 1 41 0.385 2.273 0.478272
62 KEGG Proteasome 1.438 48 1 42 0.385 2.083 0.486496
63 KEGG mTOR signaling pathway 1.274 52 1 46 0.385 1.923 0.518121
52 KEGG Endometrial cancer 2.092 52 1 47 0.385 1.923 0.52572
24 KEGG Basal cell carcinoma 5.781 55 1 52 0.385 1.818 0.561958
54 KEGG Acute myeloid leukemia 2.025 59 1 52 0.385 1.695 0.561958
55 KEGG Non-small cell lung cancer 2.025 54 1 52 0.385 1.852 0.561958
57 KEGG B cell receptor signaling pathway 1.891 65 1 62 0.385 1.538 0.626366
64 KEGG PPAR signaling pathway 0.982 70 1 65 0.385 1.429 0.643781
65 KEGG Complement and coagulation cascades 0.978 69 1 65 0.385 1.449 0.643781
45 KEGG Renal cell carcinoma 2.576 69 1 66 0.385 1.449 0.649405
58 KEGG Chronic myeloid leukemia 1.855 75 1 69 0.385 1.333 0.66575
59 KEGG Pancreatic cancer 1.848 72 1 70 0.385 1.389 0.671027
61 KEGG Toll-like receptor signaling pathway 1.727 102 1 90 0.385 0.98 0.760765
67 KEGG Axon guidance 0.853 129 1 124 0.385 0.775 0.860919
66 KEGG Neuroactive ligand-receptor interaction 0.917 256 2 239 0.769 0.781 0.892646
56 KEGG MAPK signaling pathway 1.947 272 2 241 0.769 0.735 0.895342
27 KEGG Olfactory transduction 4.611 382 2 367 0.769 0.524 0.980492
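The two percentage columns in this table can be reproduced from the count columns. The sketch below checks the Ribosome and Parkinson's disease rows. The total of 260 input genes is not stated on the slide; it is inferred from the table (40 input genes corresponds to 15.385%). The hypergeometric tail function is included only as one common way to compute an over-representation p-value. Whether it matches the p-value column here is an assumption, and the reference set size it needs is not given in the table.

```python
from math import comb

TOTAL_INPUT_GENES = 260  # inferred from the table, not stated on the slide


def percent(part, whole):
    """Percentage rounded to three decimals, as in the table."""
    return round(100.0 * part / whole, 3)


# Ribosome row: 101 genes in pathway, 40 input genes in pathway.
ribosome_pct_input = percent(40, TOTAL_INPUT_GENES)
ribosome_pct_pathway = percent(40, 101)

# Parkinson's disease row: 137 genes in pathway, 15 input genes in pathway.
parkinson_pct_input = percent(15, TOTAL_INPUT_GENES)
parkinson_pct_pathway = percent(15, 137)


def hypergeom_tail(k, pathway_size, sample_size, population_size):
    """P(X >= k) when drawing sample_size genes from population_size genes,
    of which pathway_size belong to the pathway."""
    upper = min(pathway_size, sample_size)
    favorable = sum(
        comb(pathway_size, i) * comb(population_size - pathway_size, sample_size - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(population_size, sample_size)


# Toy numbers for illustration only: all 5 drawn genes fall in a 5-gene
# pathway out of a 10-gene reference set.
toy_p_value = hypergeom_tail(5, 5, 5, 10)
```

Note that rank follows the impact factor rather than the p-value, so a pathway with a modest p-value can still rank near the top.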
69
Impacted Pathways (Sample 1)
[Figure: impacted pathway diagram for sample 1; legend: under expressed, over expressed]
70
Protein Center v 3.0.4
  • A Proteomics bioinformatics tool available
    through HSLS

71
Import and view a protein dataset, export desired
info
  • Go to
    http://hsls.proteincenter.proxeon.com/ProXweb/
  • Log in
  • Click on the file MSX50.xml
  • Explore the peptide, protein, and cluster views
  • Learn how to select information of interest,
    export files or create a report
  • Learn how to compare two datasets (A8.xml and
    A9.xml)

72
Bioinformatics and statistical analysis
  • Single protein
  • Look up using Accession Key 6807647
  • Dataset: human red blood cell (hRBC) dataset
  • Datasets
  • Proxeon
  • Tutorials
  • Data set comparison and click on hRBC_proteome

For further help email Protein Center support at
proteincenter-support_at_proxeon.com
73
Acknowledgement
  • Genomics and Proteomics Core Laboratories
  • James Lyons-Weiler, Director of BAC
  • Rick Jordan, programmer of BAC