From Placenta to the Brain: Image Informatics Challenges and Experiences

Transcript and Presenter's Notes

1
(No Transcript)
2
From Placenta to the Brain: Image Informatics
Challenges and Experiences
  • Tony Pan
  • Department of Biomedical Informatics
  • The Ohio State University

3
Agenda
  • Virtual Mouse Placenta
  • BIRN Mouse Brain Distributed Processing
  • GridPACS

4
Virtual Mouse Placenta
5
The biological problem
  • What is the effect of genetic variation on the
    structure and function of an organism?
  • How do molecular changes translate into changes
    at the organism level?
  • Can we translate molecular changes found to be
    associated with a disease into demonstrable
    anatomic and physiologic changes?

6
Understand function of Rb gene
7
Compare phenotypes of normal vs. Rb-deficient mice
(Flowchart labels: Alignment, Slides/Slices, Placenta, Visualization, Segmentation)
8
Questions
  1. What are the mechanisms of fetal death in mutant
    mice?
  2. What structural changes occur in the placenta?
  3. How different are the structural changes between
    the wild and mutant types?

9
Computational Phenotyping Challenges
  • Very large datasets
  • Automated image analysis
  • Three dimensional reconstruction
  • Motion and deformation
  • Integration of multiple data sources
  • Data indexing and retrieval

10
Dataset Size: Systems Biology
  • Future big science animal experiments on cancer,
    heart disease, pathogen host response
  • Basic small mouse is 3 cm³
  • 1 µm resolution: very roughly 10^13 bytes/mouse
  • Molecular data (spatial location): multiply by
    10^2
  • Vary genetic composition, environmental
    manipulation, systematic mechanisms for varying
    genetic expression: multiply by 10^3
  • Total: 10^18 bytes per big science animal
    experiment
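
The estimate on this slide can be reproduced as a back-of-envelope calculation. The sketch below is illustrative only: the figure of 3 bytes per voxel (24-bit color) is an assumption not stated on the slide, and the function name is hypothetical.

```python
# Back-of-envelope check of the dataset-size estimate.
# ASSUMPTION: 3 bytes per voxel (24-bit RGB); the slide does not state this.
MICRONS_PER_CM = 10_000


def voxel_count(volume_cm3: float, resolution_um: float) -> float:
    """Number of cubic voxels of the given edge length in the volume."""
    voxels_per_cm = MICRONS_PER_CM / resolution_um
    return volume_cm3 * voxels_per_cm ** 3


bytes_per_voxel = 3                                    # assumed
per_mouse = voxel_count(3.0, 1.0) * bytes_per_voxel    # order of 10^13 bytes
with_molecular = per_mouse * 10 ** 2                   # molecular data factor
per_experiment = with_molecular * 10 ** 3              # genetic/environmental variation

print(f"per mouse:      {per_mouse:.1e} bytes")
print(f"per experiment: {per_experiment:.1e} bytes")
```

Under that assumption the per-mouse figure lands within an order of magnitude of 10^13 bytes and the per-experiment figure within an order of magnitude of 10^18 bytes, matching the rough totals on the slide.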

11
Now: Virtual Slides (roughly 25 GB/cm² tissue)
12
Structural Complexity in the Natural World
13
Structural complexity in the biological world
14
Data Complexity
  • Biomedical informatics research involves a very
    large number of heterogeneous types of data
  • Descriptive metadata is complex and application
    specific
  • Joins frequently need to be carried out between
    different types of data
  • Image data and mass spec data account for 99% of
    the storage requirements but maybe 5% of the
    complexity

15
What will be done with Complex Grid Data? NCI
caBIG Program (Scenario from U Penn Cancer
Center)
  • A researcher would like to study the error rate
    in pathological diagnoses of solid tumor samples
    and compare numerous molecular diagnostic
    approaches to determine if the molecular
    diagnostic approach can enhance the accuracy of
    pathological diagnoses.
  • Query
  • I want all solid tumors, specifically for lung
    cancer, that have a diagnosis based on tumor
    pathology. Each diagnosis must have an image of
    the tumor that allows for independent
    verification of diagnoses. Each record retrieved
    must also have either proteomics marker data or
    microarray data (Affy or two-color) included so
    that different molecular techniques can be
    correlated to the tumor pathology. In addition, I
    want all protein annotations for markers and
    genes associated with the proteomics and
    microarray data so I can perform meta-analyses.

16
Return to Placenta Problem
  1. What are the mechanisms of fetal death in mutant
    mice?
  2. What structural changes occur in the placenta?
  3. How different are the structural changes between
    the wild and mutant types?

17
Placental Architecture
decidua
giant trophoblasts
trilaminate layer
spongiotrophoblasts
(E13.5)
labyrinth trophoblasts
yolk sac
18
Placental defects in Rb-mutant mice
Good - Labyrinth neat, well-ordered, maternal
blood sinusoids and trophoblasts evenly dispersed
among fetal blood cells.
Bad - Trophoblasts grow wildly, clump together,
and disrupt the fetal and maternal cell layers
necessary for proper embryonic growth.
19
An Exercise in Systems Biology
  • Goals: morphological changes from Rb gene
    mutation
  • Surface area between different cell layers
  • Vascular Density
  • Volume of labyrinth layer
  • Qualitative insights from 3-D visualization
  • .

20
Mouse Placenta Flowchart
Compare phenotypes of normal vs. Rb-deficient mice
(Flowchart labels: Alignment, Slides/Slices, Placenta, Visualization, Segmentation)
21
Mouse Placenta Flowchart
22
Tissue Segmentation: Labyrinth
23
Probabilistic Segmentation
  • Preprocessing - color correction, sub-sampling
  • Training set - 48 manually selected ROIs
  • Probabilistic classifier - Bayesian Maximum A
    Posteriori (MAP) estimator
  • Local region is classified into one of three
    fetal tissue types
  • Features - Color histogram, gradient histogram,
    red pixel count, nuclei size histogram, vacuole
    size histogram (more than 500 features)
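
A Bayesian MAP classifier of the kind named on this slide can be sketched as follows. This is a minimal illustration, not the presenters' implementation: it assumes independent Gaussian features per class (a naive Bayes model), uses synthetic 4-dimensional feature vectors in place of the 500+ histogram features, and labels the three tissue types simply 0, 1, and 2.

```python
import numpy as np


def fit_gaussian_map(features, labels):
    """Estimate per-class prior, mean, and variance from training vectors."""
    classes = np.unique(labels)
    priors = np.array([np.mean(labels == c) for c in classes])
    means = np.array([features[labels == c].mean(axis=0) for c in classes])
    # Small constant keeps variances strictly positive.
    variances = np.array([features[labels == c].var(axis=0) + 1e-6 for c in classes])
    return classes, priors, means, variances


def predict_map(model, features):
    """Pick the class maximizing log prior + log likelihood for each row."""
    classes, priors, means, variances = model
    diff = features[:, None, :] - means[None, :, :]          # (n, classes, dims)
    log_lik = -0.5 * np.sum(
        np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None], axis=2
    )
    return classes[np.argmax(np.log(priors)[None, :] + log_lik, axis=1)]


# Synthetic stand-in for region feature vectors (assumed data, not from the study).
rng = np.random.default_rng(0)
centers = np.array([0.0, 5.0, 10.0])
train_x = np.concatenate([rng.normal(c, 1.0, size=(50, 4)) for c in centers])
train_y = np.repeat(np.array([0, 1, 2]), 50)

model = fit_gaussian_map(train_x, train_y)
accuracy = float(np.mean(predict_map(model, train_x) == train_y))
print(f"training accuracy: {accuracy:.2f}")
```

The MAP decision differs from a maximum-likelihood one only through the prior term, so it matters most when the tissue types are unevenly represented in the training ROIs.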

24
Is It Good? Misclassifications?
25
Tissue Classification: Slide 586
26
Tissue Classification: Sensitivity vs. Specificity
  • 10 images were tested
  • Sensitivity and Specificity are calculated and
    plotted
  • Sensitivity = true positive / (true positive +
    false negative)
  • Specificity = true negative / (false positive +
    true negative)
  • Results
  • K-Means performed the worst
  • 2-point correlation function has highest
    specificity
  • Bayesian MAP has highest sensitivity
  • Difference in sensitivity and specificity is
    related to the features used in the
    classification. A hybrid set of features from the
    three methods would be beneficial.

slide   K-means        2-Point Correlation   Bayesian MAP
        sens   spec    sens   spec           sens   spec
377     0.61   0.80    0.72   0.98           0.90   0.75
586     0.68   0.79    0.73   0.98           0.91   0.74
826     0.59   0.83    0.69   0.99           0.87   0.74
1047    0.82   0.74    0.76   0.99           0.90   0.71
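
The two measures defined on this slide can be computed from a ground-truth mask and a predicted mask as in the sketch below. The function name and the ten-pixel example are illustrative assumptions, not data from the study.

```python
import numpy as np


def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for two boolean label arrays."""
    truth = np.asarray(truth, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = np.sum(truth & predicted)       # true positives
    fn = np.sum(truth & ~predicted)      # false negatives
    tn = np.sum(~truth & ~predicted)     # true negatives
    fp = np.sum(~truth & predicted)      # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: 4 positive pixels, 6 negative pixels.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Here three of four positives are found (sensitivity 0.75) and four of six negatives are correctly rejected (specificity about 0.67).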
27
An Exercise in Systems Biology
  • Goals: morphological changes from Rb gene
    mutation
  • Surface area between different cell layers
  • Vascular Density
  • Volume of labyrinth layer
  • Qualitative insights from 3-D visualization
  • .

28
3D Fingers
Spongiotrophoblast
glycogen
labyrinth
29
Finger Presence
  • Slides 576-625 (46)
  • Rigid Registration using Mutual Information and a
    2-Level Random walk optimizer
  • Deformable Registration using piecewise linear
    selection of control points
  • 3D Reconstruction of the finger outlines in each
    image
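
Mutual information, the similarity measure named for the rigid registration step, can be estimated from a joint intensity histogram. The sketch below shows only that metric; the 2-level random walk optimizer from the slide is not reproduced, and the bin count and the synthetic test image are assumptions.

```python
import numpy as np


def mutual_information(a, b, bins=32):
    """Estimate mutual information (in nats) between two same-sized images."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()                 # joint intensity distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of image a
    py = pxy.sum(axis=0, keepdims=True)       # marginal of image b
    nonzero = pxy > 0                         # skip empty bins to avoid log(0)
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))


# Synthetic image (assumed data): a copy shifted by 5 pixels is misaligned.
rng = np.random.default_rng(1)
image = rng.random((64, 64))
shifted = np.roll(image, 5, axis=1)

mi_aligned = mutual_information(image, image)
mi_shifted = mutual_information(image, shifted)
print(f"aligned: {mi_aligned:.3f}  shifted: {mi_shifted:.3f}")
```

A registration optimizer would search over candidate rigid transforms and keep the one that maximizes this value, since the score is highest when the two slides are aligned.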

30
Rb+/+
Rb-/-
31
(No Transcript)
32
Challenges
  • Damaged / incomplete data
  • Missing parts / slides (broken tissue, folded
    tissue)
  • Flipped slides
  • Large variations Deformations
  • Non-linear warping
  • Different thickness
  • Different color
  • Morphological changes
  • Blood sinus
  • Maternal tissues

(Example image labels: Flipped, Folded, Color Gradient, Broken)
33
Algorithmic Challenges
  • Output: segmented tissue layers from aligned
    slices
  • Large data size
  • High resolution images (1.0-5.0 GB / image)
  • Large number of slides (2-5 mm)
  • Rb mutant: 800 slides / Wild type: 1200 slides
  • Too many features: any optimization has many
    extrema
  • No well-defined boundaries
  • Need robust algorithms: a big challenge

34
BIRN Mouse Brain Distributed Processing
35
Mouse Brain Phenotype Characterization: Confocal
Microscopy (joint work with NCMIR)
(Pipeline diagram labels: correctional tasks; image file; normalization; stitching; warping; declustering; target task; preprocessing tasks; thresholding; tessellation; prefix sum generation; querying)
  • Problem definition: how many pixels of a certain
    color intensity exist within a rectilinear region
    of interest?
  • Implementation: the prefix sum answers the query
    without scanning every pixel within the region of
    interest
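
The prefix-sum idea can be sketched as a two-dimensional summed-area table. This is a minimal single-machine illustration, not the distributed implementation described in the talk; the function names and the random test image are assumptions.

```python
import numpy as np


def build_prefix_sum(image, intensity):
    """Table where entry [r, c] counts matching pixels in image[:r, :c]."""
    mask = (image == intensity).astype(np.int64)
    table = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.int64)
    table[1:, 1:] = mask.cumsum(axis=0).cumsum(axis=1)
    return table


def count_in_region(table, row0, col0, row1, col1):
    """Count matching pixels in rows [row0, row1) and columns [col0, col1)."""
    return int(table[row1, col1] - table[row0, col1]
               - table[row1, col0] + table[row0, col0])


# Synthetic image with intensities 0..3 (assumed data).
rng = np.random.default_rng(2)
image = rng.integers(0, 4, size=(100, 120))
table = build_prefix_sum(image, intensity=2)

fast = count_in_region(table, 10, 20, 40, 90)
brute = int(np.sum(image[10:40, 20:90] == 2))
print(fast, brute)
```

Once the table is built, each rectangular query costs four lookups regardless of the region's size, which is why the preprocessing pass pays off for repeated queries on very large images.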

36
Solving aggregate queries involving Sum or Count
operations on spatial data
SELECT Add(Value(x,y)) FROM Image WHERE (x,y)
in POLYGON <(10,20),(300,400)>
37
(No Transcript)
38
Algorithms Overview
  • We develop distributed algorithms that function
    across any number of compute nodes in a cluster
  • When an algorithm begins, the data is generally
    already distributed across each compute node's
    disk
  • Algorithms work on an out-of-core basis by
    pulling in one or a few tiles at a time, and
    write their result out similarly, one tile at a
    time to local disk
  • Conversions at each end of the pipeline from
    single-file formats (e.g. .IMG, .PPM) to and from
    distributed storage
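
The out-of-core pattern described above (read one tile, process it, write the result, move on) can be sketched on a single machine as follows. The tile size, file naming, and thresholding operation are illustrative assumptions; the real system distributes tiles across cluster nodes.

```python
import tempfile
from pathlib import Path

import numpy as np


def write_tiles(image, tile, directory):
    """Split an image into square tiles, one .npy file per tile."""
    paths = []
    for r in range(0, image.shape[0], tile):
        for c in range(0, image.shape[1], tile):
            path = Path(directory) / f"tile_{r}_{c}.npy"
            np.save(path, image[r:r + tile, c:c + tile])
            paths.append(path)
    return paths


def threshold_tiles(paths, level, out_directory):
    """Threshold each tile independently; only one tile is in memory at a time."""
    out_paths = []
    for path in paths:
        data = np.load(path)
        out = Path(out_directory) / path.name
        np.save(out, (data > level).astype(np.uint8))
        out_paths.append(out)
    return out_paths


rng = np.random.default_rng(3)
image = rng.integers(0, 256, size=(96, 128))
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    tiles = write_tiles(image, tile=32, directory=src)
    results = threshold_tiles(tiles, level=200, out_directory=dst)
    total_from_tiles = sum(int(np.load(p).sum()) for p in results)

expected = int(np.sum(image > 200))
print(total_from_tiles, expected)
```

Because each tile is processed independently, peak memory is bounded by the tile size rather than the image size.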

39
Distributed Execution: DataCutter
  • Pipe-and-filter metaphor of data processing
  • Data is streamed from producer to consumer
    filters
  • Framework for task- and data-parallel
    manipulation of large scientific data
  • Transparent copies of filters
  • Provide distributed computation and
    application-specific storage access
  • XML description of data and task flow
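
The pipe-and-filter metaphor can be illustrated with Python generators, where each filter consumes a stream from its producer and yields to its consumer. This is a conceptual sketch only: it does not use or imitate the DataCutter API, and the filter names and toy tiles are hypothetical.

```python
def read_tiles(count):
    """Producer filter: emit small 'tiles' of four consecutive integers."""
    for i in range(count):
        yield list(range(i * 4, i * 4 + 4))


def threshold(stream, level):
    """Transform filter: map each value to 1 if above the level, else 0."""
    for tile in stream:
        yield [1 if value > level else 0 for value in tile]


def count_set(stream):
    """Consumer-side filter: reduce each tile to its count of set values."""
    for tile in stream:
        yield sum(tile)


# Filters are chained; data streams through one tile at a time.
pipeline = count_set(threshold(read_tiles(3), level=5))
results = list(pipeline)
print(results)
```

In a distributed framework the same chain would be described declaratively and the filters placed on different nodes, with several copies of a filter running in parallel where the workload allows.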

40
Every Corrective Phase Except Warping
41
Indexing Terabyte-scale images on OSC MSS (16
nodes, ext3)
42
GridPACS
43
Outline
  • Motivation
  • Use case
  • Mobius Overview
  • Mako Service
  • Virtual Mako
  • Grid PACS
  • Questions

44
Motivation
  • Integration of multi-institutional data sets
    across modalities.
  • Expose existing data resources with minimal
    effort
  • Provide methods for automatically creating
    databases to model new datasets.
  • Ability to execute distributed queries across all
    exposed data resources.
  • Provide methods for translating between data
    types
  • System should support any data type but promote
    the convergence and standardization of similar
    types.

45
Use Case
46
Mobius
  • The Mobius project attempts to define and build a
    set of services and protocols enabling the
    management and integration of both data and
    metadata.
  • Mobius Core Services
  • Global Model Exchange (GME)
  • Data Storage and Retrieval (Mako)
  • Data Integration and Translation (DTS)
  • Mobius Extension Services
  • Higher level query services, Adhoc federation
    services, Metadata Transportation Services.

47
Mako
  • Service framework that exposes data resources as
    XML data services through a set of well-defined
    interfaces based on the Mako protocol.
  • Interfaces based on the GGF DAIS working group's
    XML realization specification.
  • Example Operations
  • Insertion
  • Retrieval
  • XPath
  • XUpdate
  • Deletion
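
The kinds of operations listed (insertion, XPath retrieval, deletion) can be illustrated on a local XML document with Python's standard library. This is not the Mako protocol or API; the element and attribute names are hypothetical, and `xml.etree.ElementTree` supports only a limited XPath subset. XUpdate is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata document (names are illustrative only).
document = ET.fromstring("""
<study>
  <image id="377" modality="microscopy"><tissue>labyrinth</tissue></image>
  <image id="586" modality="microscopy"><tissue>spongiotrophoblast</tissue></image>
  <image id="900" modality="confocal"><tissue>brain</tissue></image>
</study>
""")

# Retrieval with an XPath-style predicate on an attribute.
microscopy_ids = [
    element.get("id")
    for element in document.findall(".//image[@modality='microscopy']")
]

# Insertion of a new element under the root.
ET.SubElement(document, "image", {"id": "1047", "modality": "microscopy"})

# Deletion of every element matching a predicate.
for element in document.findall("image[@modality='confocal']"):
    document.remove(element)

remaining_ids = [element.get("id") for element in document.findall("image")]
print(microscopy_ids, remaining_ids)
```

A data service exposing these operations over a network would accept the same kinds of requests as messages rather than in-process calls.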

48
Mako Architecture
  • Abstract Communication Layer
  • Configurable Protocol Handling
  • Abstracts Mako Infrastructure from the underlying
    data resource
  • Protocol Handlers Specified at run time.
  • Abstract Handlers are extended to expose a
    particular data resource
  • Handlers are easy to write and deploy.

49
Mako Current Support
  • MakoDB
  • In-house XML database, optimized for supporting
    specialized Mako features.
  • XML Databases
  • Handler implementation for the XMLDB API
  • Tested using Xindice and eXist
  • Relational Databases
  • Handler implementation for exposing relational
    databases using XBridge.
  • Requires the creation of an XBridge Map file.

50
Mako Features
  • Partial Retrieval
  • Distributed Document Object Model (DOM)
  • Binary Object Support
  • Mako protocol supports attaching binary objects
    to XML files.
  • Data Referencing

51
Virtual Mako
  • Simplifies client-side complexity of interfacing
    with multiple Makos by presenting a single
    virtualized interface to a collection of
    federated Makos
  • Acts as a data integration point for distributed
    queries
  • Pluggable algorithms for XML instance
    ingestion/distribution
  • Protocol request broadcast and response
    aggregation
  • Supports all services a standard Mako supports
  • Maps a Virtual Collection to a number of remote
    standard Collections
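
The federation pattern described here (distribute inserts across member stores, broadcast each query, aggregate the responses) can be sketched in a few lines. The class names, the round-robin distribution policy, and the in-memory stores are all hypothetical stand-ins; this is not the Mako or Virtual Mako API.

```python
class LocalStore:
    """Stand-in for one remote data service (hypothetical)."""

    def __init__(self):
        self.records = []

    def query(self, predicate):
        return [record for record in self.records if predicate(record)]


class VirtualStore:
    """Single interface over several stores."""

    def __init__(self, stores):
        self.stores = list(stores)
        self._next = 0

    def insert(self, record):
        # Round-robin is one possible pluggable distribution policy.
        self.stores[self._next % len(self.stores)].records.append(record)
        self._next += 1

    def query(self, predicate):
        # Broadcast the request to every store, then aggregate the responses.
        results = []
        for store in self.stores:
            results.extend(store.query(predicate))
        return results


virtual = VirtualStore([LocalStore(), LocalStore()])
for slide in [377, 586, 826, 1047]:
    virtual.insert({"slide": slide})

hits = virtual.query(lambda record: record["slide"] > 500)
print(sorted(hit["slide"] for hit in hits))
```

The client sees one collection and one query call, while the records physically live in two separate stores.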

52
Grid PACS
  • Designed to address the storage, querying, and
    processing requirements of large-scale image
    databases in a grid-wide environment.
  • Model-centric application, majority of backend
    implemented by simply submitting schemas to a
    number of Makos
  • Enables modeling and execution of image
    processing workflows

53
Grid PACS
  • Relies heavily on the Mobius Infrastructure
  • Data Referencing: metadata and chunks of data
    distributed across the grid via references
  • Partial Retrieval: data retrieved on demand
  • Distributed DOM: emulates local data environment
  • VMako: query broadcast and aggregation
  • Model-driven data storage: on-demand creation of
    schema-based metadata and image storage
    collections on Makos

54
The Future
55
3D Model of the Placenta: Architectural Framework
  • Complete annotation of cells
  • 4th Dimension of Time
  • Genetic and biochemical analysis of purified
    cell populations
  • - Genetic
  • - Chromatin
  • - Gene expression (mRNA, miRNA)
  • - Proteomics
  • - Signaling
  • - Immunohistochemical
  • Monitored data entry
  • Educated Bioinformatics
  • Interrogation/Experimental testing

56
Mammary Gland Microenvironment
57
3D Model of Breast Cancer Tumor progression
  • Epithelial tumor cell
  • Myoepithelial cell
  • Stromal fibroblast
  • Adipocyte
  • Tissue macrophage
  • T cell
  • Endothelial cell

3 Representative Tumor Models of Breast cancer
58
Summary: Computational Phenotyping and High-End
Computing
  • Genes comprise (part of) life's source code
  • Understanding biomedicine requires understanding
    how genes and environment interact in space-time
  • Molecules to man is a big quantitative leap!

59
NIH BISTI Center at Ohio State
Biomedical Research | Imaging Research | Computer Science
Ensure success of biventricular pacing | Semi-gated cardiac imagery analysis | Machine vision; on-demand large data analysis
Role of oncogenes in development | Multiple modality mouse placenta imaging, information synthesis, registration | Image analysis in ensembles of very high resolution 3-D imagery; interactivity
Mechanism of ischemic cardiac injury | Synthesis of multimodal imaging, genotype, gene expression, proteomic data | Grid data management and query; information integration involving multi-modal image and molecular data
60
Multiscale Laboratory Research Group
Ohio State University: Joel Saltz, Gagan Agrawal, Umit Catalyurek, Dan Cowden, Mike Gray, Tahsin Kurc, Shannon Hastings, Steve Langella, Scott Oster, Tony Pan, DK Panda, Srini Parthasarathy, P. Sadayappan, Sivaramakrishnan (K2), Michael Zhang
The Ohio Supercomputer Center: Stan Ahalt, Jason Bryan, Dennis Sessanna, Don Stredney, Pete Wycoff
61
Microscopy Image Analysis
  • Pathology
  • Dr. Dan Cowden
  • Human Cancer Genetics
  • Pamela Wenzel
  • Dr. Gustavo Leone
  • Dr. Alain deBruin
  • Biomedical Informatics
  • Tony Pan
  • Alexandra Gulacy
  • Dr. Kun Huang
  • Dr. Metin Gurcan
  • Dr. Ashish Sharma
  • Dr. Joel Saltz
  • Computer Science and Engineering
  • Kishore Mosaliganti
  • Randall Ridgway
  • Richard Sharp

62
Mobius Team
  • David Ervin
  • Daniel Hall
  • Shannon Hastings
  • Tahsin Kurc
  • Stephen Langella
  • Scott Oster
  • Tony Pan
  • Joel Saltz