Design of a scalable fault tolerant enterprise medical image management system


Title: Design of a scalable fault tolerant enterprise medical image management system


1
Design of a scalable fault tolerant enterprise
medical image management system
  • Chris Hafey
  • July 19, 2002
  • Stentor, Inc.

2
Presentation Overview
  • Medical Imaging Industry
  • Traditional Solutions
  • Stentor's iSite 3.0
  • Fault Tolerance
  • Scalability / Load Balancing
  • Major ACE/TAO Components
  • Real World Lessons

3
Medical Imaging Industry
  • $15 billion a year industry
  • 400 Million Studies a year in North America
  • $35 per study to manage analog film studies
  • Vast majority of image display and distribution
    is film based
  • 40-50% of image data is digitally acquired
  • More than 90% of new devices are digital
  • Up to 50% of films are unavailable at the point
    of care
  • Nearly 15% may be lost

4
Digital Imaging
  • Images vary from 500 KB to 80 MB in size
  • Resolution: from 128x128 to 4k x 4k
  • Bit depth: 1 to 4 bytes per pixel
  • Studies range from 20 MB to 500 MB in size
  • Volumes range from 10,000 studies a year to
    1,000,000 studies a year
  • 5 to 15 Terabytes per year!
  • Images may not be lossy compressed

5
Technical Challenges to Enterprise Distribution
  • System is life critical (99.99% uptime)
  • Must coexist with other enterprise applications
  • Software must run on existing PCs
  • System must work on existing network
    infrastructure
  • System must be fast!
  • Users will not wait more than 2 seconds
  • System must scale to 1000s of concurrent users
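
To make the uptime target concrete, the following minimal sketch converts an uptime percentage into allowed downtime per year. The function name and the 365-day year are assumptions made for illustration, not part of the presentation.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Minutes per year a system may be down at the given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


# 99.99% uptime leaves roughly 52.56 minutes of downtime per year.
print(round(allowed_downtime_minutes(99.99), 2))
```

At this budget, a single unplanned restart that takes an hour already exceeds the yearly allowance.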

6
Traditional Solutions
  • Requires high end hardware
  • Fiber Channel, 15k RPM drives, 2 GB RAM
  • Data automatically routed to each workstation
  • Hierarchical storage: RAID, MO, tape
  • Uses tape backup technology, which is slow and
    unreliable
  • Multiple copies of data make synchronization
    impossible
  • Expensive, Unreliable, has not met expectations

7
Stentor's iSite 3.0
  • iSyntax: the enabling technology
  • System Design
  • Fault Tolerance and Scalability

8
iSyntax Technology
  • Wavelet encoding of images
  • Hierarchical representation
  • Lossless compression
  • Multiple full-fidelity presentations transmitted
    and generated from a single image
  • Zero additional overhead required
  • Optimizes the dynamic transfer of data to match
    the display resolution of the user's monitor
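
The slides do not say which wavelet iSyntax uses, so the following is only a generic illustration of how a wavelet encoding can be both hierarchical and lossless. It uses a reversible integer Haar transform (the S-transform) on one row of pixel values; all function names are hypothetical, and the input length is assumed to be divisible by 2 at every level.

```python
def forward_level(samples):
    """Split samples into half-resolution averages and the details needed to undo it."""
    pairs = list(zip(samples[0::2], samples[1::2]))
    averages = [(a + b) // 2 for a, b in pairs]
    details = [a - b for a, b in pairs]
    return averages, details


def inverse_level(averages, details):
    """Exactly reconstruct the samples that forward_level was given."""
    out = []
    for s, d in zip(averages, details):
        a = s + (d + 1) // 2
        out.extend([a, a - d])
    return out


def encode(samples, levels):
    """Return the coarsest approximation plus detail bands, finest band first."""
    approx, bands = list(samples), []
    for _ in range(levels):
        approx, d = forward_level(approx)
        bands.append(d)
    return approx, bands


def decode(approx, bands):
    """Apply the given detail bands, coarsest first; pass fewer bands for lower resolution."""
    for d in reversed(bands):
        approx = inverse_level(approx, d)
    return approx


row = [10, 12, 14, 200, 202, 198, 7, 9]
approx, bands = encode(row, levels=3)
full = decode(approx, bands)       # every band applied: identical to row
half = decode(approx, bands[1:])   # finest band skipped: half-resolution view
```

Because the coarse approximation and each detail band are stored once, any resolution up to the original can be produced from the same encoded data without keeping separate copies.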

9
iSyntax Encoding
10
iSyntax Decoding
11
iSyntax Benefits
  • Only data required by client is delivered
  • Data is delivered on demand
  • Minimum load on server resources
  • Minimum load on network resources
  • Minimum memory requirements for clients
  • CPU computation moved to client PCs where ample
    resources are available
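
As a rough sketch of "only data required by client is delivered", a client could pick the coarsest level of a resolution pyramid that still covers its viewport. This assumes each level halves the width and height of the previous one; the function and its parameters are hypothetical, not part of iSite.

```python
def pick_level(image_w, image_h, view_w, view_h, max_level):
    """Return the coarsest pyramid level whose size still covers the viewport.

    Level 0 is full resolution; each further level halves width and height.
    """
    level = 0
    while level < max_level:
        next_w = image_w >> (level + 1)
        next_h = image_h >> (level + 1)
        if next_w < view_w or next_h < view_h:
            break
        level += 1
    return level


# A 4096x4096 image shown in a 1024x768 viewport needs only level 2,
# which holds 1/16 of the full-resolution pixels.
level = pick_level(4096, 4096, 1024, 768, max_level=5)
pixel_fraction = 1 / 4 ** level
```

Zooming in would then request additional detail for the visible region only, rather than the whole image.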

12
iSite 3.0 System Design
13
iSite 3.0 Fault Tolerance and Load Balancing
  • 99% of data is stateless
  • Stateful data managed by one clustered server
  • Stateless data managed by many standard servers
  • Server can be restarted without affecting
    connected clients
  • Client knows about all servers, automatically
    fails over and load balances between them
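
The client-side behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not iSite's implementation: the server names, the `send` callback, and the random ordering used for load balancing are all hypothetical.

```python
import random


class AllServersDown(Exception):
    """Raised when every known server refused the request."""


def call_with_failover(servers, request, send, rng=random):
    """Try the servers in random order and return the first successful reply."""
    candidates = list(servers)
    rng.shuffle(candidates)  # random order spreads load across servers
    last_error = None
    for server in candidates:
        try:
            return send(server, request)
        except ConnectionError as exc:
            last_error = exc  # fail over to the next server
    raise AllServersDown(f"no server could handle {request!r}") from last_error


def fake_send(server, request):
    """Stand-in transport for the example: img1 is down, the others answer."""
    if server == "img1":
        raise ConnectionError("img1 is down")
    return f"{server} served {request}"


reply = call_with_failover(["img1", "img2", "img3"], "study-42", fake_send)
```

Retrying on another server is only safe because the requests involved are stateless; stateful operations would need the clustered server mentioned on the slide.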

14
iSite 3.0 Fault Tolerance and Load Balancing
  • JBOD (just a bunch of disks): IDE vs. SCSI
  • Interisis switch used for legacy protocol
  • Newest, likely to be accessed data on multiple
    servers
  • Oldest, unlikely to be accessed data on one server
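
One way to picture the placement rule in the last two bullets is an age-based replica count. The 30-day threshold, the two-copy default, and the hash-based choice of servers are assumptions made for this sketch; the slides give no actual values.

```python
import hashlib


def placement(study_id, age_days, servers, hot_days=30, hot_copies=2):
    """Return the servers that should hold a study.

    Recent studies get several copies; older studies keep a single copy.
    """
    digest = hashlib.sha256(study_id.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(servers)  # stable starting server per study
    copies = hot_copies if age_days <= hot_days else 1
    copies = min(copies, len(servers))
    return [servers[(start + i) % len(servers)] for i in range(copies)]


servers = ["img1", "img2", "img3"]
recent = placement("study-42", age_days=2, servers=servers)
archived = placement("study-42", age_days=400, servers=servers)
```

In this sketch the single copy kept for an old study is the first of the copies it had while recent, so ageing a study only requires deleting replicas, not moving data.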

15
Major ACE/TAO Components
  • Service Configurator
  • ACE_Task
  • Memory mapping octet sequence optimization
  • Network endpoint selector
  • SSL
  • ACE_Configuration
  • Naming Service
  • Portable Interceptors
  • ACE logging mechanism

16
Real World Lessons
  • NAT (Network Address Translation)
  • Inadequate timeout mechanism
  • Idle connections not purged bug
  • No mechanism to close connections
  • Load balancing and fault tolerance

17
NAT (Network Address Translation)
  • Server ORB must be initialized with endpoints for
    both the real and the NAT'd IP address
  • TAO requires a network device per endpoint; we
    used a loopback for the NAT'd IP address
  • Client must use a custom endpoint selector to
    select the correct endpoint
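
The slides do not describe the rule the custom selector applies. As a language-neutral sketch of the idea (not TAO's endpoint selector API), a client could pick the real endpoint when it sits on the server's network and the NAT'd endpoint otherwise. The addresses and the function below are hypothetical.

```python
import ipaddress


def select_endpoint(real, natted, client_networks):
    """Pick the endpoint a client should connect to.

    real, natted: (ip, port) pairs advertised by the server.
    client_networks: CIDR strings for networks the client is attached to.
    """
    real_ip = ipaddress.ip_address(real[0])
    inside = any(real_ip in ipaddress.ip_network(net) for net in client_networks)
    return real if inside else natted


real = ("10.0.0.5", 2809)
natted = ("203.0.113.7", 2809)
hospital_client = select_endpoint(real, natted, ["10.0.0.0/8"])
remote_client = select_endpoint(real, natted, ["192.168.1.0/24"])
```

Here the hospital client connects to the real address, while the remote client falls back to the NAT'd one.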

18
Inadequate Timeout Mechanism
  • Same timeout applied to connection establishment
    and receiving data
  • Ideally a short connection timeout could be used
    with a very long or no receive timeout
  • Current limitations cause clients to hang if
    server is down
  • Problem is addressed through a new ORB policy in
    TAO 1.3
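
The distinction between the two timeouts can be shown with plain sockets. This sketch uses Python's standard `socket` module rather than TAO's policy mechanism, which the slides do not detail; the 2-second connect timeout is an arbitrary example value.

```python
import socket


def open_connection(host, port, connect_timeout=2.0, receive_timeout=None):
    """Connect with a short timeout, then switch to a separate receive timeout.

    receive_timeout=None means later reads block for as long as needed.
    """
    # Fails quickly if the server is down or unreachable.
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    # From here on, only the receive timeout applies to this socket.
    sock.settimeout(receive_timeout)
    return sock
```

With a single shared timeout, the value must be long enough for the slowest reply, which means a client also waits that long before noticing that a server is down.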

19
Idle connections not purged
  • Bug in TAO 1.2.1 prevents ORB from shutting down
    idle connections
  • Causes server to have a hard limit of FD_SETSIZE
    connections
  • Bug causes client ORB to hang waiting for server
    to accept connection
  • Current workaround is to restart server when this
    occurs
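
For context, the purging behaviour that was expected can be sketched as a table that closes the least recently used connection once a limit is reached. This is a generic illustration, not TAO's purging strategy; the class and its callback interface are hypothetical.

```python
from collections import OrderedDict


class ConnectionTable:
    """Tracks open connections and closes the least recently used over a limit."""

    def __init__(self, max_open):
        self.max_open = max_open
        self._conns = OrderedDict()  # conn_id -> close callback, oldest first

    def touch(self, conn_id, close):
        """Record activity on a connection and return the ids that were purged."""
        self._conns.pop(conn_id, None)
        self._conns[conn_id] = close  # most recently used goes to the end
        purged = []
        while len(self._conns) > self.max_open:
            old_id, old_close = self._conns.popitem(last=False)
            old_close()
            purged.append(old_id)
        return purged


closed = []
table = ConnectionTable(max_open=2)
table.touch("a", lambda: closed.append("a"))
table.touch("b", lambda: closed.append("b"))
table.touch("a", lambda: closed.append("a"))      # "a" is now the most recent
purged = table.touch("c", lambda: closed.append("c"))
```

In a real server, `max_open` would be kept below FD_SETSIZE so that new clients can always be accepted.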

20
No mechanism to close connections
  • Necessary to avoid client hangs due to Server
    deadlock
  • Necessary to allow user to abort long running
    invocation
  • Could be used to work around connection purging
    bug
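
The effect being asked for can be demonstrated with plain sockets: shutting down a connection from another thread releases a call that is blocked waiting for a reply. This is a sketch using Python's standard library, not an ORB feature, and the exact behaviour of `shutdown` on a blocked reader can vary by platform.

```python
import socket
import threading


def blocking_call(sock, result):
    """Simulate an invocation that waits for a reply that never arrives."""
    try:
        data = sock.recv(1024)
        result.append("aborted" if data == b"" else data)
    except OSError:
        result.append("aborted")


client, server = socket.socketpair()
result = []
worker = threading.Thread(target=blocking_call, args=(client, result))
worker.start()

# The "abort" action: shut the connection down from another thread,
# which releases the blocked recv().
client.shutdown(socket.SHUT_RDWR)
worker.join(timeout=5)
client.close()
server.close()
```

The same hook would let a user interface cancel a long-running request, or let a watchdog recover a client stuck on a deadlocked server.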

21
Load Balancing and Fault Tolerance
  • Consider making the client smarter
  • Consider separating stateful and stateless data
  • Consider both hardware and software solutions