Integration of PSLID and SLIF with Virtual Cell

1
Integration of PSLID and SLIF with Virtual Cell
  • Robert F. Murphy, Les Loew & Ion Moraru
  • Ray and Stephanie Lane Professor of Computational
    Biology
  • Molecular Biosensors and Imaging Center,
    Departments of Biological Sciences, Biomedical
    Engineering and Machine Learning and

2
Alan Waggoner (CMU) and Simon Watkins (Pitt)
3
Brian Athey (UMich), Bob Murphy (CMU)
4
Central questions
  • How many distinct locations within cells can
    proteins be found in? What are they?

5
Automated Interpretation
  • Traditional analysis of fluorescence microscope
    images has occurred by visual inspection
  • Our goal over the past twelve years has been
    to automate interpretation with the ultimate goal
    of fully automated learning of protein location
    from images

6
Learn to recognize all major subcellular patterns
2D Images of HeLa cells
Image panels: ER, gpp130, giantin, Mito, LAMP, Nucleolin, Tubulin, DNA, TfR, Actin
7
Classification Results: Computer vs. Human
Murphy et al 2000; Boland & Murphy 2001; Murphy et al 2003; Huang & Murphy 2004
Chart labels: Lysosomes, Giantin (Golgi), Gpp130 (Golgi)
Notes: Even better results using MR methods by Kovacevic group. Even better results for 3D images.
8
Tissue Microarrays
Courtesy: http://www.beecherinstruments.com
Courtesy: www.microarraystation.com
9
Human Protein Atlas
Courtesy: www.proteinatlas.org
10
Test Dataset from Human Protein Atlas
  • Selected 16 proteins from the Atlas
  • Two each from all major organelles (class)
  • 45 tissue types for each class (e.g. liver,
    skin)
  • Goal: Train classifier to recognize each
    subcellular pattern across all tissue types

Insulin in islet cells
Justin Newberg
11
Subcellular Pattern Classification over 45 tissues
Confusion matrix axes: Prediction, Labels
Overall accuracy: 81%
Accuracy for the 50% of images with highest confidence: 97%
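The two accuracy figures on this slide differ only in which images are counted: all of them, or only the half the classifier is most confident about. A minimal sketch of that calculation follows. The function names, the (true label, predicted label, confidence) tuple layout, and the toy results are assumptions made for illustration; they are not data or code from the talk.

```python
def accuracy(pairs):
    """Fraction of (true_label, predicted_label) pairs that match."""
    return sum(t == p for t, p in pairs) / len(pairs)


def accuracy_at_confidence(results, keep_fraction):
    """Accuracy over the keep_fraction of results with the highest confidence.

    results: list of (true_label, predicted_label, confidence) tuples.
    """
    ranked = sorted(results, key=lambda r: r[2], reverse=True)
    n_keep = max(1, round(len(ranked) * keep_fraction))
    return accuracy([(t, p) for t, p, _ in ranked[:n_keep]])


# Hypothetical toy results, not measurements from the talk.
results = [
    ("ER", "ER", 0.95),
    ("Golgi", "Golgi", 0.90),
    ("Nucleus", "Nucleus", 0.85),
    ("Mito", "ER", 0.40),
    ("ER", "ER", 0.60),
    ("Golgi", "Mito", 0.35),
]

overall = accuracy([(t, p) for t, p, _ in results])
top_half = accuracy_at_confidence(results, 0.5)
print(f"overall: {overall:.2f}, top 50% by confidence: {top_half:.2f}")
```

Discarding low-confidence predictions trades coverage for accuracy, which is why the second figure can be much higher than the first.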
12
(No Transcript)
13
Annotations of Yeast GFP Fusion Localization
Database
  • Contains images of 4156 proteins (out of 6234
    ORFs in all 16 yeast chromosomes).
  • GFP tagged immediately before the stop codon of
    each ORF to minimize perturbation of protein
    expression.
  • Annotations were done manually by two scorers and
    co-localization experiments were done for some
    cases using mRFP.
  • Each protein is assigned one or more of 22
    location categories.

14
Classification of Yeast Subcellular Patterns
Chen et al 2007
  • Selected only those assigned to single
    unambiguous location class (21 classes)
  • Trained classifier to recognize those classes
  • 81% agreement with human classification
  • 94.5% agreement for high confidence assignments
    (without using colocalization!)
  • Examination of proteins for which methods
    disagree suggests machine classifier is correct
    in at least some cases

Shann-Ching (Sam) Chen & Geoff Gordon
15
Example of Potentially Incorrect Label
ORF Name: YGR130C
UCSF Location: punctate_composite
Automated Prediction: cell_periphery (60.67%), cytoplasm (30%), ER (9.33%)
Image panels: DNA, GFP, Segmentation
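The automated prediction for YGR130C is reported as three values (60.67, 30 and 9.33) that sum to 100. The slide does not say how they were computed. One plausible reading, assumed here, is that they are the share of individual predictions (for example per cell or per image) falling into each class. The sketch below shows that tally; the function name and the counts of 91, 45 and 14 are hypothetical, chosen only because they reproduce the reported values.

```python
from collections import Counter


def prediction_percentages(predictions):
    """Percentage of predictions per class, most frequent class first."""
    counts = Counter(predictions)
    total = sum(counts.values())
    return [(label, round(100 * n / total, 2)) for label, n in counts.most_common()]


# Hypothetical counts (an assumption, not data from the talk).
predictions = ["cell_periphery"] * 91 + ["cytoplasm"] * 45 + ["ER"] * 14

summary = prediction_percentages(predictions)
for label, pct in summary:
    print(f"{label}: {pct}")
```

Under this reading, the top class disagrees with the manual label, while the spread across the remaining classes gives a rough sense of how uncertain the automated call is.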
16
Supervised vs. Unsupervised Learning
  • This work demonstrated the feasibility of using
    classification methods to assign all proteins to
    known major classes
  • Do we know all locations? Are assignments to
    major classes enough?
  • Need approach to discover classes

17
Location Proteomics
  • Tag many proteins (many methods available; we use
    CD-tagging, developed by Jonathan Jarvik and
    Peter Berget): infect a population of cells with a
    retrovirus carrying a DNA sequence that will insert
    a tag into a random gene in each cell
  • Isolate separate clones, each of which expresses
    one tagged protein
  • Use RT-PCR to identify tagged gene in each clone
  • Collect many live cell images for each clone
    using spinning disk confocal fluorescence
    microscopy

Jarvik et al 2002
18
Chen et al 2003; Chen and Murphy 2005
Group proteins by pattern automatically
Uniform punctate proteins
Nucleolar proteins
Punctate nuclear proteins
Vesicular proteins
Uniform proteins
Nuclear w/ punctate cytoplasm
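Grouping proteins by pattern without predefined classes is a clustering problem. The sketch below uses plain single-linkage agglomerative clustering with a distance cutoff as a generic stand-in; it is not the procedure of the cited papers. The clone names, the two-dimensional feature vectors, and the cutoff are all hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_by_pattern(features, max_distance):
    """Single-linkage agglomerative clustering with a distance cutoff.

    features: dict mapping clone name -> feature vector.
    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than max_distance.
    """
    clusters = [[name] for name in features]

    def linkage(c1, c2):
        return min(distance(features[a], features[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        pairs = [
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        d, i, j = min(pairs)
        if d > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical image-derived features for five tagged clones.
features = {
    "clone_A": (0.10, 0.90),
    "clone_B": (0.12, 0.88),
    "clone_C": (0.80, 0.20),
    "clone_D": (0.82, 0.25),
    "clone_E": (0.50, 0.50),
}

groups = cluster_by_pattern(features, max_distance=0.1)
print(groups)
```

The merge order also defines a tree, so the same procedure without a cutoff yields a hierarchy of patterns in which each cluster is one node.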
19
CD-tagging project
Garcia Osuna et al 2007
  • Running ~100 clones/wk
  • Automated imaging

Results for 225 clones
20
Subcellular Location Families and Generative
Models
  • Rather than using words (e.g., GO terms) to
    describe location patterns, can make entries in
    protein databases that give each protein's
    Subcellular Location Family - a specific node in a
    Subcellular Location Tree
  • Provides necessary resolution that is difficult
    to obtain with words
  • How do we communicate patterns? Use generative
    models learned from images to capture pattern and
    variation in pattern

21
Generative Model Components
Nucleus
Model parameters
Cell membrane
Fitted
Original
Filtered
Protein objects
Zhao & Murphy 2007
22
Synthesized Images
Lysosomes
Endosomes
  • Have XML design for capturing model parameters
  • Have portable tool for generating images from
    model

SLML toolbox - Ivan Cao-Berg, Tao Peng, Ting Zhao
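To illustrate what capturing model parameters in XML can look like, here is a minimal round-trip sketch using Python's standard library. It is not the SLML schema: the element names, attribute names, parameter names, and values are all hypothetical. Only the three component names (nucleus, cell membrane, protein objects) come from the preceding slide.

```python
import xml.etree.ElementTree as ET


def model_to_xml(pattern_name, parameters):
    """Serialize {component: {parameter: value}} to an XML string."""
    root = ET.Element("generative_model", {"pattern": pattern_name})
    for component, values in parameters.items():
        comp = ET.SubElement(root, "component", {"name": component})
        for key, value in values.items():
            param = ET.SubElement(comp, "parameter", {"name": key})
            param.text = repr(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_model(xml_text):
    """Parse the XML produced by model_to_xml back into Python objects."""
    root = ET.fromstring(xml_text)
    parameters = {
        comp.get("name"): {
            p.get("name"): float(p.text) for p in comp.findall("parameter")
        }
        for comp in root.findall("component")
    }
    return root.get("pattern"), parameters


# Hypothetical parameter names and values, for illustration only.
params = {
    "nucleus": {"mean_radius": 8.5},
    "cell_membrane": {"mean_radius": 20.0},
    "protein_objects": {"objects_per_cell": 120.0, "mean_object_size": 4.2},
}

xml_text = model_to_xml("lysosome", params)
name, restored = xml_to_model(xml_text)
print(xml_text)
```

Keeping the parameters in a text format like this is what lets a separate tool regenerate images from the model, or lets a simulation system import it, without access to the original images.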
23
Combining Models for Cell Simulations
Simulation for multiple proteins
Shared Nuclear and Cell Shape
XML
Integrating with Virtual Cell (University of
Connecticut) and M-Cell (Pittsburgh
Supercomputing Center)
24
PSLID: Protein Subcellular Location Image Database
  • Version 4 to be released March 2008
  • Adding 50,000 analyzed images (1,000 clones,
    350,000 cells) from 3T3 cell random tagging
    project
  • Adding 7,500 analyzed images (2,500 genes,
    40,000 cells) from UCSF yeast GFP database
  • Adding 400,000 analyzed images (3,000 proteins,
    45 tissues) from Human Protein Atlas
  • Adding generative models to describe subcellular
    patterns consisting of discrete objects (e.g.,
    lysosomes, endosomes, mitochondria)
  • Return XML file with real images that match a
    query
  • Return XML file with generative model for a
    pattern
  • Connecting to MBIC TCNP fluorescent probes
    database
  • Connecting to CCAM TCNP Virtual Cell system