Title: The University of Pittsburgh Cancer Institute caBIG Architecture
1 The University of Pittsburgh Cancer Institute caBIG Architecture
- Establishing a framework for
- Creating Products
- Understanding Process
- Producing Value
2 Architectural Goals
- Create a framework for building systems that are widely accessible, modular, interoperable, supportable, locally extensible and future-proof
- Create data storage and communication strategies that are hardware- and software-agnostic
- Support standardization and "convergent evolution" of software tools as they are shared and tested by domain workgroups
- Cancer Biomedical Informatics Grid
- Convergent Architecture Biomedical Informatics Grid
3 Overview of UPMC/UPCI Architecture
[Diagram: systems and connections across UPMC and UPCI. Labels recoverable from the figure:]
- Users: Clinical Users (fat clients, browsers); Clinical and Research Users; Research Users (file transfer)
- Systems: Cerner CIS; Message Router (HL7); Ancillaries; MARS; CRIS
- Tiers: Presentation (browser, Java applets); Middleware (iPlanet, Oracle (PL/SQL), Java, Oracle Forms, Cold Fusion, W2K)
- Protocols and drivers: HL7 2.x; HTTP, RMI, SSL; JDBC, Oracle drv
- Platforms: Oracle, Java API, AIX; C, Python, AIX; Proprietary UPMC, AIX; Oracle 9i; Sun Solaris; Various, Legacy
4 Overview of UPMC/UPCI Architecture
[Diagram: the slide 3 overview with network security elements added. The labels match slide 3, except that "Research Users, File Transfer" is absent and "Proxy Server", "Firewall" and "VPN" are added.]
5 OPI Clinical Systems Application Architecture
- Client Tier
  - UPMC Internal Network PCs (W2K)
  - Web Browsers
  - Oracle Forms
- 2nd-Tier
  - Web Servers: WIN2K OS, iPlanet 6.0, Cold Fusion
  - 1upmc-pci2: intranet.upci.edu
  - Crystal Reports
- 3rd-Tier
  - Application Servers: WIN2K OS
  - 1upmc-opi-cry01: Crystal Enterprise Server
  - 1upmc-pci1: Java DBServer, J2EE environment
- Back-end
  - Database Servers: Solaris 2.8, Oracle 9.2.0.1
  - Redo Log Replication
  - Opiunx01: UPCI Database Production
  - Opiunx04: UPCI Database Standby
6 Key caBIG Projects at UPCI
- Mature
  - Clinical Trials Management Application
  - Tissue Banking System and Organ-Specific Databases
- Currently in development
  - SPIN distributed virtual database
  - SPIN free text concept extraction
7 Clinical Trial Management Application
- Currently deployed at UPCI (since 2000)
- Supports a range of users including
  - Nurse Coordinators
  - Data Managers
  - Clinical Trial Specialists
  - Clinicians
  - Administrators
  - Researchers
  - Pharmaceutical representatives
- Houses over 250 active protocols representing in-house, Co-op, NCI and pharmaceutical-sponsored trials
- Contains clinical data for approximately 16,000 patients
8 CTMA Architecture
- Multi-tiered Java implementation
- Java/Swing client applet (pure Java)
  - Uses RMI to communicate via SSLava to the middle tier
  - Authentication via iPlanet's LDAP server and Oracle
- Data access server (DAS) written in Java (Win2K)
  - Mediates between RMI client calls and middleware objects or database routines
  - Runs on a Windows NT machine designated as intranet host
  - Communicates with an Oracle database using JDBC and the Oracle Type 4 Thin SQL driver, with Oracle's RC4_56 encryption algorithms
- Administrative reports via Crystal Reports Enterprise Server
- Database
  - Oracle 9i database on SUN Solaris
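The mediating role of the data access server can be illustrated with a small sketch. This is not the CTMA code: the class name, the routine name `count_patients`, the `enrollment` table and the in-memory user map are all hypothetical, and SQLite stands in for the Oracle database so that the example runs on its own. It only shows the shape of the idea: the client names an operation, and the middle tier checks the session before running the matching database routine.

```python
import sqlite3


class DataAccessServer:
    """Toy middle tier: validates a session, then dispatches a named
    routine to the database. All names here are illustrative."""

    def __init__(self, conn, users):
        self._conn = conn
        self._users = users          # hypothetical user -> password map
        self._sessions = set()
        # Registry of routines that clients are allowed to invoke.
        self._routines = {"count_patients": self._count_patients}

    def login(self, user, password):
        if self._users.get(user) != password:
            raise PermissionError("authentication failed")
        token = f"session-{len(self._sessions) + 1}"
        self._sessions.add(token)
        return token

    def call(self, token, routine, *args):
        if token not in self._sessions:
            raise PermissionError("unknown session")
        if routine not in self._routines:
            raise KeyError(f"no such routine: {routine}")
        return self._routines[routine](*args)

    def _count_patients(self, protocol_id):
        # Parameterized query: the client never sends raw SQL.
        row = self._conn.execute(
            "SELECT COUNT(*) FROM enrollment WHERE protocol_id = ?",
            (protocol_id,),
        ).fetchone()
        return row[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enrollment (patient_id INTEGER, protocol_id TEXT)")
    conn.executemany(
        "INSERT INTO enrollment VALUES (?, ?)",
        [(1, "P-001"), (2, "P-001"), (3, "P-002")],
    )
    das = DataAccessServer(conn, {"coordinator": "secret"})
    token = das.login("coordinator", "secret")
    return das.call(token, "count_patients", "P-001")
```

Keeping the routine registry on the server side means clients depend only on operation names, so the database layer can change without touching the client applet.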
9 Tissue Banking System and Organ-Specific Databases
- Inventory, specimen management and organ-specific views of aggregate data
- Consent tracking (universal consent, deceased patients)
- Sample inventory including sample features/quality, bar coding, storage and access tracking
- Manual sample annotation using local data elements
- Organ-specific views provided by form-based applications (Oracle Forms)
- Aggregation and summary based on local data elements
- On-the-fly HIPAA de-identification through local proprietary tools
- In production at UPMC and 15 collaborating institutions.
10 Tissue Banking Architecture
- Multi-tiered implementation
  - Client display implemented using Oracle Forms
  - Middleware layer using Oracle Apache Services
  - Data processing and aggregation partially in Java
  - Data extraction and processing in PL/SQL
- Database
  - Oracle 9i database on SUN Solaris
  - Includes primary data and metadata that defines rules for grouping data into specimens, cases, patients, organs, diseases and aggregates of patients
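One way to picture metadata that "defines rules for grouping data" is a table of grouping levels, each naming the fields that identify a group. The sketch below is only an illustration of that idea: the rule table, field names and sample records are hypothetical and are not taken from the tissue bank schema.

```python
from collections import defaultdict

# Hypothetical metadata: each grouping level lists the fields that define it.
GROUPING_RULES = {
    "specimen": ("patient_id", "case_id", "specimen_id"),
    "case": ("patient_id", "case_id"),
    "patient": ("patient_id",),
    "organ": ("organ",),
}


def group_records(records, level, rules=GROUPING_RULES):
    """Group records by the key fields that the metadata assigns to `level`."""
    key_fields = rules[level]
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[field] for field in key_fields)
        groups[key].append(rec)
    return dict(groups)


# Made-up sample rows for illustration only.
RECORDS = [
    {"patient_id": 1, "case_id": "A", "specimen_id": "A1", "organ": "prostate"},
    {"patient_id": 1, "case_id": "A", "specimen_id": "A2", "organ": "prostate"},
    {"patient_id": 2, "case_id": "B", "specimen_id": "B1", "organ": "breast"},
]
```

For example, `group_records(RECORDS, "case")` yields two groups and `group_records(RECORDS, "organ")` puts both prostate specimens together. Because the rules are data rather than code, a new aggregate level could be added by editing the table.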
11 SPIN: Shared Pathology Informatics Network
- 1. Distributed virtual database using a peer-to-peer network with query/response communications based on web services, in development at Harvard
[Diagram: local databases at participating institutions (Pitt, Regenstrief, Harvard, UCLA) exchange Query and Response messages over SOAP; JBoss/servlet middleware, MySQL back end.]
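The query/response pattern on this slide can be sketched in a few lines. The example below is a simplification under stated assumptions: plain Python callables stand in for the SOAP endpoints, the site names and case descriptions are invented, and returning only per-site counts is a choice made for this sketch, not a documented SPIN behavior.

```python
def make_peer(name, cases):
    """Build a stand-in for one institution's local query service."""
    def answer(query):
        # Case-insensitive substring match against the local case list.
        matches = sum(1 for case in cases if query.lower() in case.lower())
        return {"peer": name, "count": matches}
    return answer


def federated_count(peers, query):
    """Send the same query to every peer and combine the responses."""
    responses = [peer(query) for peer in peers]
    return {
        "query": query,
        "total": sum(r["count"] for r in responses),
        "by_peer": {r["peer"]: r["count"] for r in responses},
    }


# Hypothetical peers with made-up local data.
PEERS = [
    make_peer("site_a", ["Prostate adenocarcinoma", "Benign prostatic tissue"]),
    make_peer("site_b", ["Breast carcinoma", "Prostate adenocarcinoma, Gleason 7"]),
]
```

Calling `federated_count(PEERS, "adenocarcinoma")` asks each peer in turn and sums the counts. The data itself stays in each local database, which is the sense in which the combined database is "virtual".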
12 SPIN: Shared Pathology Informatics Network
- 2. Free text UMLS-based concept extraction from pathology reports, in development at Pitt
- De-identification based on local tools
- Processing pipeline includes report section "chunker," tokenizer, and concept tagger including negated concepts, written in Java
- Incorporates GATE, an open source Java framework for language engineering
- Output as XML-structured report with identified subsections (including diagnoses) and embedded code elements
- Modular, will communicate through Java API or web services (automated data extraction from MARS)
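The pipeline stages named above (section chunker, tokenizer, concept tagger with negation, XML output) can be sketched in miniature. This is a toy in Python rather than the Java/GATE implementation the slide describes: the two-entry lexicon and its `DEMO-` codes are placeholders (not UMLS identifiers), the report format is assumed to be one `HEADER: text` line per section, and negation is a naive trigger-word check in a fixed window.

```python
import re
import xml.etree.ElementTree as ET

# Placeholder lexicon; the codes are invented and are NOT real UMLS identifiers.
LEXICON = {"adenocarcinoma": "DEMO-0001", "metastasis": "DEMO-0002"}
NEGATION_TRIGGERS = {"no", "without", "negative"}


def chunk_sections(report):
    """Split 'HEADER: text' lines into (header, text) pairs."""
    sections = []
    for line in report.strip().splitlines():
        header, _, body = line.partition(":")
        sections.append((header.strip().lower(), body.strip()))
    return sections


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tag_concepts(tokens, window=3):
    """Tag lexicon terms; mark one negated if a trigger word appears
    within the `window` tokens before it."""
    tagged = []
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            preceding = tokens[max(0, i - window):i]
            negated = any(t in NEGATION_TRIGGERS for t in preceding)
            tagged.append((tok, LEXICON[tok], negated))
    return tagged


def report_to_xml(report):
    """Run the stages in order and emit one <section> per report section."""
    root = ET.Element("report")
    for header, body in chunk_sections(report):
        section = ET.SubElement(root, "section", name=header)
        section.text = body
        for term, code, negated in tag_concepts(tokenize(body)):
            concept = ET.SubElement(
                section, "concept", code=code, negated=str(negated).lower()
            )
            concept.text = term
    return ET.tostring(root, encoding="unicode")


SAMPLE = """DIAGNOSIS: Prostate adenocarcinoma.
COMMENT: No evidence of metastasis."""
```

Running `report_to_xml(SAMPLE)` produces a `<report>` element with a `diagnosis` and a `comment` section, each carrying its embedded `<concept>` elements. A production tagger would need a real vocabulary and a much more careful treatment of negation scope.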
13 Key Issues/Lessons
- Get software into the hands of users/testers as rapidly as possible.
- Our first generation tools are functional but need to evolve to fully support the ultimate caBIG architecture.
- Iteratively modifying functional tools at implementation sites will be challenging.
- Changes that impact the clinical workflow will be disruptive.
- Developers tend to gravitate to familiar tools. In medical settings, these have not typically included open source.
- Focus when possible on language-agnostic open messaging standards for communication rather than language-specific APIs (consistent with caBIO technical guidelines).
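The last point can be made concrete with HL7 2.x, the messaging standard that appears in the architecture overview. An HL7 v2 message is delimited text (segments separated by carriage returns, fields by `|`, components by `^`), so any language can read it without sharing an API with the sender. The sketch below is a minimal reader, not a full parser: it ignores repetitions, subcomponents and escape sequences, and the sample message is invented for illustration.

```python
def parse_hl7(message):
    """Parse an HL7 v2 message into {segment_name: [segments]}, where each
    segment is a list of fields and each field is a list of components."""
    field_sep = message[3]       # MSH-1: the character right after "MSH"
    component_sep = message[4]   # first of the MSH-2 encoding characters
    segments = {}
    for line in filter(None, message.split("\r")):
        fields = line.split(field_sep)
        name = fields[0]
        if name == "MSH":
            # MSH-1 is the separator itself; insert it so MSH-n is at index n.
            fields.insert(1, field_sep)
        parsed = []
        for i, field in enumerate(fields):
            if name == "MSH" and i <= 2:
                parsed.append([field])   # do not split the delimiter fields
            else:
                parsed.append(field.split(component_sep))
        segments.setdefault(name, []).append(parsed)
    return segments


# Invented sample message; the sender, receiver and patient are fictitious.
SAMPLE_MESSAGE = "\r".join([
    r"MSH|^~\&|LAB|SITE_A|CRIS|SITE_B|200401011200||ORU^R01|MSG0001|P|2.3",
    r"PID|1||12345^^^SITE_A||DOE^JANE",
])
```

With this layout, `parse_hl7(SAMPLE_MESSAGE)["MSH"][0][9]` is the message type (`["ORU", "R01"]`) and `["PID"][0][5]` is the patient name field. The same few lines could be written in Java, C or PL/SQL, which is what the guideline is after.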
14 UPCI's Role in Realizing the Architecture
- Because the UPCI is involved in several workgroups, it may offer a helpful environment for communication between the domain and the cross-cutting workgroups
- Direct communication from the AWG to the project technical staff and direct access to feedback from the projects
- Coordination between the Architecture and the CDE/Vocabulary cross-cutting workgroups.
- Generate integrated feedback to the AWG as policies and recommendations develop.
- Initial implementation and testing of architectural recommendations in several production settings
15 Mechanisms for Providing Guidance
- Bi-directional communication between AWG and the domain WGs
  - Position statements and guidelines from AWG
  - Critiques on guidelines from domain WGs
  - Documentation from ongoing projects, including early docs such as information models and UML diagrams
  - Critique of project plans/designs
  - Process and schedule for documenting development (planning and implementation) important
- Communication mechanisms
  - Online asynchronous discussion, conference calls sparingly
  - Online documents and critiques accessible to all WGs
  - Periodic meetings of AWG, possibly open source "sprint" style over several days
  - Concurrent meetings or representation at domain WG meetings and CDE/Vocab WG
  - Explicit connection with other standards organizations (HL7)?
16 Development Plan
Standards-Centered Framework
17 Development Plan
Standards-Centered Framework Effectively Communicated