Title: Translating Imaging Science to the Emerging Grid Infrastructure
1Translating Imaging Science to the Emerging Grid
Infrastructure
- Jeffrey S. Grethe - BIRN
- University of California, San Diego
2We speak piously of taking measurements and
making small studies that will add another brick
to the temple of science. Most such bricks just
lie around the brickyard.
Platt, J.R. (1964). Strong Inference. Science, 146, 347-353.
3Objectives
- Establish a stable, high-performance network
linking key Biotechnology Centers and General
Clinical Research Centers
- Establish distributed and linked data collections
with partnering groups; create a Data GRID for
the BIRN
- Facilitate the use of "grid-based" computational
infrastructure and integrate BIRN with other GRID
middleware projects
- Enable data mining from multiple data collections
or databases on neuroimaging and bioinformatics
- Build a stable software and hardware
infrastructure that will allow centers to
coordinate efforts to accumulate larger studies
than can be carried out at one site
4Challenges
Neuroscience
High Speed Network
Computation
Mouse BIRN
User Access
FIRST BIRN
Distributed Data
Data Integration
Morphometry BIRN
Informatics
Community
Policies
Best Practices
IRB
HIPAA
Governance
6CREATING BIRN TEST-BED PARTNERSHIPS
- Three research project application test beds
have been assembled to shape BIRN and guide
infrastructure development
- Multi-scale Mouse BIRN: animal models of disease,
multi-scale / multi-method
- Examples: MS Mouse, DAT KOM (a schizophrenic and
otherwise interesting mouse animal model), and a
Parkinson's Disease Mouse
- Brain Morphometrics (Human Structure BIRN):
targets neuroanatomical correlates of
neuropsychiatric illness (Unipolar Depression,
mild Alzheimer's Disease (AD), mild cognitive
impairment (MCI))
- Functional Imaging BIRN: development of a common
functional magnetic resonance imaging (fMRI)
protocol to study regional brain dysfunction
related to the progression and treatment of
schizophrenia; an attack on the underlying cause
of disease
7A National Collaboratory
8Science Drives The Infrastructure
- USE APPLICATION SCIENCE PULL TO GUIDE
DEVELOPMENT OF THE NEXT GENERATION
CYBERINFRASTRUCTURE
- Craft a plan to achieve an important scientific
goal requiring development and implementation of
innovative computational infrastructure.
- Articulate a Grand Challenge and define work to
achieve this goal with increasing levels of
specificity.
- Bring application scientists and computer
scientists together in projects at each level to
build elements of the new infrastructure.
10User Access to Grid Resources
- Application environment being developed to
provide centralized access to BIRN tools,
applications, and resources with a single login
from any Internet-capable location
- Provides simple, intuitive access to Grid
resources for data storage, distributed
computation, and visualization
11Interfacing the Desktop with the Grid
- Developed a Java Grid Interface (JGI) that
provides a wrapper for applications on a user's
desktop
- Brokers communications and information/data
transfer between the application and BIRN
resources (e.g. SRB)
- LONI Pipeline, 3D Slicer, FreeSurfer, and ImageJ
- Continue to extend and develop the JGI
- OGSA compliance
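The wrapper idea on this slide can be sketched in a few lines. This is an illustrative Python sketch, not the JGI itself (which is Java); `run_wrapped`, `invert`, and the dictionary standing in for remote storage such as SRB are all hypothetical names. It shows the pattern only: stage data in, run the unmodified desktop application on local files, stage the result back out.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict


def run_wrapped(app: Callable[[Path, Path], None],
                remote: Dict[str, bytes],
                input_key: str,
                output_key: str) -> None:
    """Stage input from remote storage, run the app locally, stage output back."""
    with tempfile.TemporaryDirectory() as tmp:
        local_in = Path(tmp) / "input.dat"
        local_out = Path(tmp) / "output.dat"
        local_in.write_bytes(remote[input_key])       # download step
        app(local_in, local_out)                      # application only sees local files
        remote[output_key] = local_out.read_bytes()   # upload step


def invert(src: Path, dst: Path) -> None:
    """Hypothetical desktop application: invert every byte of the input file."""
    dst.write_bytes(bytes(255 - b for b in src.read_bytes()))


# The dict is a stand-in for a remote data grid; keys are made-up identifiers.
remote_store = {"scan-001": bytes([0, 10, 255])}
run_wrapped(invert, remote_store, "scan-001", "scan-001-inverted")
print(list(remote_store["scan-001-inverted"]))
```

The application function never touches the remote store, which is the point of a wrapper: existing tools need no changes to take part in grid data transfer.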
12Distribution of a Bioinformatics Toolbox
- Package and deploy test bed-specific software
through the distribution of the BIRN
bioinformatics toolbox
- Use ROCKS (http://www.rocksclusters.org) as the
distribution mechanism
- Bioinformatics toolbox can be made available to
any researcher interested in a robust package of
neuroimaging applications
- First release to occur this fall using the new
ROCKS distribution model
(Diagram: BIRN ROCKS Distribution. Labels: BIRN Roll, FreeSurfer, AIR, AFNI, FSL, Grid Wrappers, Grid Role, Grid Roll, ROCKS Core)
13Scientific Workflow
- Sequence of steps (utilities, applications,
pipelines) required to acquire, process,
visualize, and extract useful information from
scientific data
- Advantages of a workflow managed within the Portal:
- Progress through the workflow can be organized
and tracked
- Automated and transparent mechanisms for the flow
of data from one step to the next using SRB
- Tools are centralized and presented with uniform
GUIs to improve usability
- Administration burden of each step (groups of
steps) is eliminated
- Flexibility to enhance each process through
direct, transparent access to the grid
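A minimal Python sketch of a tracked workflow, under stated assumptions: the `Workflow` class, the step names, and the dictionary standing in for shared storage are hypothetical and not part of the BIRN Portal or SRB. It only illustrates steps handing data to one another through a common store while progress is recorded.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Workflow:
    """Runs named steps in order and records which ones have completed."""
    store: Dict[str, object] = field(default_factory=dict)  # stand-in for shared storage
    steps: List[Tuple[str, Callable, str, str]] = field(default_factory=list)
    completed: List[str] = field(default_factory=list)

    def add_step(self, name: str, func: Callable, source: str, target: str) -> None:
        # Each step reads one key from the store and writes its result to another.
        self.steps.append((name, func, source, target))

    def run(self) -> None:
        for name, func, source, target in self.steps:
            self.store[target] = func(self.store[source])
            self.completed.append(name)  # progress tracking


wf = Workflow(store={"raw": [3.0, 1.0, 2.0]})
wf.add_step("normalize", lambda xs: [x / max(xs) for x in xs], "raw", "normalized")
wf.add_step("summarize", lambda xs: sum(xs) / len(xs), "normalized", "mean")
wf.run()
print(wf.completed, wf.store["mean"])
```

Because every step reads from and writes to the same store, the user never moves files between steps by hand, and the `completed` list gives a simple record of how far the workflow has progressed.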
14Interactive Scientific Workflows
Provide researchers with transparent access to a
computing environment that supports their natural
working paradigm while taking advantage of the
evolving grid infrastructure
Data curation requires determination of data
quality and validity
15Workflow Considerations
- Provide full provenance for data within the BIRN
environment
- Morphometry BIRN is modifying tools to provide
proper provenance information
- Data provenance is being taken into account in
the human imaging database
- Workflow Optimization
- Take advantage of resource discovery services
being deployed
- Use of data provenance information
- Global versus run-time optimizations
- Incorporation of legacy applications
- LONI Pipeline (UCLA)
- Standard install
- Incorporation into Portal
- Advisement on future Grid enhancements to Pipeline
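One way to picture "full provenance" is a record attached to every processing run. The Python sketch below is an assumption-laden illustration, not the BIRN provenance format: `ProvenanceRecord`, `run_with_provenance`, and the `crop` step are hypothetical names, and the choice of fields (tool, version, parameters, input and output digests) is just one plausible minimum.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Tuple


def digest(data: bytes) -> str:
    """Content hash that identifies a data item independent of its file name."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ProvenanceRecord:
    tool: str          # name of the program that produced the output
    version: str       # version of that program
    parameters: dict   # settings used for the run
    inputs: List[str]  # digests of the input data
    output: str        # digest of the output data


def run_with_provenance(tool: str, version: str, func: Callable,
                        data: bytes, **parameters) -> Tuple[bytes, ProvenanceRecord]:
    """Apply func to data and return the result together with its provenance."""
    result = func(data, **parameters)
    record = ProvenanceRecord(tool, version, parameters, [digest(data)], digest(result))
    return result, record


def crop(data: bytes, n: int) -> bytes:
    """Hypothetical processing step: keep only the first n bytes."""
    return data[:n]


out, rec = run_with_provenance("crop", "0.1", crop, b"example image bytes", n=7)
print(json.dumps(asdict(rec), indent=2))
```

Recording digests rather than file names means a later step can check that the data it received is exactly what an earlier step produced.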
17Governance
- Incorporating processes for multi-site studies
and sharing of human data
- HIPAA compliance
- Patient confidentiality
- Institutional Review Board (IRB) approvals
- Developing guidelines for sharing data and
authorship
- Breaking down the barriers
- Mistrust
- Open sharing of information
- Who gets credit
- Commercial products
- Governance
- Integrating new participants
18IRB Working Group
- One member from each BIRN site required to
participate
- Each member is required to review BIRN consents,
waivers and procedures with local IRBs
- Regular video conferences among members to
coordinate information and activities
- Produce BIRN template language for subject
consent, IRB waiver for data upload and IRB
waiver for data download
- Interact with Data Sharing Task Force
19What Regulations Apply?
Institutional Policy
It Depends!
Local Policy
20Data Sharing Task Force
- Produce guidelines and procedures for data
sharing across institutions taking into account
Common Rule, HIPAA and state regulations
- Develop procedures to allow for longitudinal
studies within BIRN
- Examine policies that are relevant to BIRN (e.g.
revised policies being drafted for tissue banks
and data banks)
- Interact with Architecture working groups to help
define security and subject confidentiality
infrastructure and policy
- Data Replication
- Certificate Policies
- Registration Authority Policies
- Local access control
- Auditing activity logs
21EU Privacy Directives
- EU Directive 95/46/EC, Article 8
- Member states shall prohibit the processing of
personal data concerning health or sex life.
- Recommendation No. R (97) 5: exceptions
- Diagnostic and therapeutic reasons
- Public health reasons, public interest
- Criminal offenses
- Fulfilment of specific contractual obligations
- Legal claims
- Consent for specific purposes
22Data Classifications
23Anonymization vs. De-Identification
- Both require deletion of direct identifiers
- Anonymization cannot have a link field
(De-Identified data can).
- Anonymization makes protocol eligible for
exemption from IRB review.
- De-Identification makes data exempt from HIPAA
regulations.
- De-Identification with link field does NOT exempt
data from IRB review.
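The rules on this slide can be written down as a small decision function. The Python sketch below is a simplification for illustration only: `classify` and `Status` are hypothetical names, and treating anonymized data as also exempt from HIPAA is an assumption (it has no direct identifiers, so it is taken here to meet the de-identification condition as well).

```python
from typing import NamedTuple


class Status(NamedTuple):
    classification: str
    hipaa_exempt: bool
    irb_exemption_eligible: bool


def classify(has_direct_identifiers: bool, has_link_field: bool) -> Status:
    """Map two properties of a data set onto the categories named on the slide."""
    if has_direct_identifiers:
        # Neither anonymized nor de-identified: no exemption applies.
        return Status("identifiable", False, False)
    if has_link_field:
        # De-identified with a link field: HIPAA exempt, still subject to IRB review.
        return Status("de-identified", True, False)
    # No identifiers and no link field: anonymized.
    # HIPAA exemption here is an assumption, as noted in the lead-in.
    return Status("anonymized", True, True)


for identifiers, link in [(True, True), (False, True), (False, False)]:
    print(identifiers, link, classify(identifiers, link))
```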
24EU Data Definitions
- Recommendation R (97) 5 on the protection of
medical data
- Personal data covers any information relating to
an identified or identifiable individual.
- An individual shall not be regarded as
identifiable if identification requires an
unreasonable amount of time and manpower.
- In cases where the individual is not
identifiable, the data are referred to as
anonymous.
25Identifiable Health Information
- High-resolution structural images can be used as
an identifier
- Reconstruction of a face from raw anatomical data
might be usable to identify a subject
- Some members of the scientific community
require/desire unaltered raw data
- Allowed to provide both raw and skull-stripped
data
- Need to get approval from local IRB to allow for
the sharing of raw anatomical data
- Users wishing to access data also require IRB
approval
Is there a scalable and distributed solution for
researchers to access identifiable health
information?
Raw
Skull Stripped
26Data Sharing Infrastructure
- Security related metadata
- All data uploaded within BIRN must have
associated metadata
- Data classification
- IRB agreements
- Subject consent
- Longitudinal data
- Data sharing permissions are dependent on
metadata
- For example, de-identified data cannot be shared
with all users
- Secure environment required for the storage of
protected information
- Linkage of BIRN ID with original subject ID
- Protected data
- Auditing of data access and movement required
- HIPAA
- Internal Security
- Data Usage Statistics
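Metadata-dependent permissions plus auditing can be sketched as follows. This Python example is hypothetical throughout: the `DataItem` and `User` fields, the policy in `can_access`, and the in-memory `audit_log` are assumptions made for illustration, not BIRN's actual access rules or storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class DataItem:
    birn_id: str
    classification: str           # "anonymized", "de-identified", or "identifiable"
    consent_allows_sharing: bool  # taken from the subject consent metadata


@dataclass(frozen=True)
class User:
    name: str
    irb_approved_for: FrozenSet[str]  # classifications covered by the user's IRB approval


audit_log: List[Dict[str, object]] = []


def can_access(user: User, item: DataItem) -> bool:
    """Assumed policy: consent is always required; anonymized data is open to any
    user; every other classification needs a matching IRB approval."""
    if not item.consent_allows_sharing:
        return False
    if item.classification == "anonymized":
        return True
    return item.classification in user.irb_approved_for


def request_access(user: User, item: DataItem) -> bool:
    """Decide on a request and record it, whether or not it is granted."""
    allowed = can_access(user, item)
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user.name,
        "birn_id": item.birn_id,
        "allowed": allowed,
    })
    return allowed


scan = DataItem("BIRN-0001", "de-identified", consent_allows_sharing=True)
alice = User("alice", frozenset({"de-identified"}))
bob = User("bob", frozenset())

alice_ok = request_access(alice, scan)
bob_ok = request_access(bob, scan)
print(alice_ok, bob_ok, len(audit_log))
```

Denied requests are logged as well as granted ones, so the same log can serve both compliance auditing and usage statistics.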