Life of a Cell - PowerPoint PPT Presentation

Slides: 33
Provided by: jb145


1
Life of a Cell
  • Woes and Wins

2
The Conundrum
  • Distribute -- on-line -- millions of pages of
    aircraft maintenance documentation in a system
    that the FAA requires to be foolproof
  • No downtime
  • All data identical for every mechanic worldwide.
    Always

3
Business Risks
  • An airplane cannot leave the gate if maintenance
    documentation is unavailable.
  • An airplane stuck at the gate causes the airline
    to lose lots of money (system wide)
  • Hasn't been done before

4
Business Drivers
  • Faster access to documentation translates to
    millions of dollars a year in recovered revenue
  • No such thing as "I did that yesterday, I'll just
    wing it" -- documents change daily
  • New document is printed and carried aboard the
    aircraft (or you're busted)
  • Search times and print times must be low

5
Business Drivers
  • Consistency of documentation eliminates
    flip-flop maintenance costs
  • I use procedure A and perform X
  • Downline -- old documents ... "Hey, who did that?
    But uh oh, I can fix it." Procedure B
  • Downline -- new documents, Procedure A ....

6
Business Drivers
  • Safety
  • An incident involving a fatality drops ticket
    sales by 50% for two weeks.
  • If the incident cannot be explained, ticket sales
    remain off until it is
  • US Airways 737 (1994?), Pittsburgh, almost put
    airline out of business
  • Airline people really do care about the people
    they're responsible for

7
The Plan
  • Be the first airline to gain competitive
    advantage by going to 100% online documentation
  • Retire microfilm/microfiche completely
  • Don't lose shirt

8
The Technologies
  • Excalibur Technologies 'EFS' (Electronic File
    System)
  • Transarc AFS 3.3
  • HP Servers
  • Bunchostuff to convert manuals to TIF
  • Windows 3.1 target user platform

9
The Process
  • Scan microfiche/film manual pages to TIF
  • EFS: OCR TIFs
  • AFS: Store TIF pages
  • EFS: Index TIFs (OCR output), keyword indexes
  • AFS: Store index
  • AFS: Replicate to strategically placed
    fileservers
  • Mechanics and engineers:
  • Click on index icon (File cabinet)
  • Keyword search
  • EFS client on Windows 3.1 desktop requests data
    from EFS server running on AFS fileserver

10
Worldwide airline, worldwide cell
  • Fileserver locations decided by:
  • Location on corporate backbone
  • Connectivity from other linestations (smaller
    airports)
  • Number of linestations that can be served from
    location
  • Paranoia (overdesigned by 2x)

11
Domestic Fileserver Locations

12
End User Workstations
  • Every hangar -- many per dock
  • Every gate -- 2x, independent LANs
  • Every engineering department
  • Facilities for support of in-air aircraft
  • (Worldwide)

13
AFS Client Locations
  • Minimal
  • No supported Windows 3.1 AFS client
  • EFS client requests data from AFS client

14
Number of users
  • 40,000 human users
  • "I forgot my password" puts airline out of
    business
  • 1,500 workstations -- workstation hostname is
    user and is written on front of workstation

15
Woes and Wins
  • Network -- shoving data into your LAN
  • Replication management
  • Who is authorized
  • You want me to release how many volumes?
  • vos release times
  • FAA -- the system will not go down! All replicas
    will be identical
  • Let's use a really big cache for Seattle!

16
Woe: Network
  • How to get 300-600 GB of data to fileserver for
    initial load of ROs
  • Slow links to small airports
  • Slow links to international server locations
  • Fast links heavily trafficked
  • vos release can beat the ... out of a network
  • An airline is always in operation -- no magic
    window of opportunity

17
Win: Network
  • Can't use vos release
  • Hey, we have lots of those airplane things
  • Load local (SFO) fileserver array with disks,
    set up viceps
  • vos addsite to fileserver/array, vos release
  • vgexport -- OS says bye to volume groups
  • vos remsite, remove drives
  • Fly to wherever -- vgimport, vos addsite / vos
    release. Rio, anyone?

18
Woes: Replication Management
  • 15,000 RW volumes, all replicated
  • Who's authorized to issue vos release?
  • Which volumes to release? EFS randomly places
    data ...
  • How many volumes did you say to release?

19
Win: Replication Management
  • Authorization/automation
  • Per-fleet, per-manual vosrel PTS group
  • PTS group on every relevant volume root node
  • User interface writes record to work queue, a
    file in /afs
  • Requester, manual/index, priority
  • Fileserver cron job compares requester with
    vosrel PTS group, figures out volume list,
    performs vos release -localauth
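
The authorization step above can be sketched roughly as follows. This is a minimal illustration, not the airline's actual job: the record fields mirror the slide (requester, manual, priority), but the `ReleaseRequest` type, the `plan_releases` function, the "lower number is more urgent" ordering, and the in-memory dictionaries standing in for PTS group membership and the volume naming scheme are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseRequest:
    requester: str  # user who queued the request
    manual: str     # manual or index to release (hypothetical naming)
    priority: int   # assumption: lower number means more urgent


def plan_releases(queue, group_members, volumes_by_manual):
    """Return (volumes_to_release, rejected_requests).

    group_members:     manual -> set of users in that manual's vosrel group
    volumes_by_manual: manual -> volume names belonging to that manual
    """
    to_release, rejected = [], []
    for req in sorted(queue, key=lambda r: r.priority):
        # Only members of the manual's vosrel group may trigger a release.
        if req.requester not in group_members.get(req.manual, set()):
            rejected.append(req)
            continue
        for vol in volumes_by_manual.get(req.manual, []):
            if vol not in to_release:  # avoid releasing a volume twice
                to_release.append(vol)
    return to_release, rejected
```

In a real cell the membership lookup and the release itself would go through the AFS tools; here they are left out so the queue logic can be read on its own.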

20
Woe: Replication Management
  • Which volumes to release?
  • Well-known volume tree and consistent naming
    conventions
  • Release all volumes for requested manual
  • Who cares, really? How many can there be?
  • Sometimes 4,000 volumes per night
  • vos release is slowish -- doesn't check to see if
    volume is unchanged, looks at contents
  • Release cycle > 24 hours, queue issue. OW!

21
Win: Replication Management
  • Filter release requests
  • Compare RO dates, RW dates -- if RW not changed
    and all ROs same date, skip it
  • Filter: 3 seconds
  • vos release no-op: 30 seconds
  • Small fraction of volumes for given manual are
    actually changed
  • Sometimes 0 changed, sometimes < 1%, usually
    small fraction of total
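
The filter rule and the timing figures above can be worked through in a few lines. This is a sketch under assumptions: the function names are hypothetical, the dates are passed in as plain `datetime` values rather than read from the VLDB or volume headers, and "RW not changed" is interpreted as "the RW's last update is not later than the RO date".

```python
from datetime import datetime


def needs_release(rw_last_update, ro_dates):
    """Skip only when the RW is unchanged and every RO carries the same date."""
    if not ro_dates:
        return True  # no replicas yet, so a release is needed
    if len(set(ro_dates)) > 1:
        return True  # replicas disagree with each other
    return rw_last_update > ro_dates[0]  # RW changed after the last release


def hours(volumes, seconds_each):
    """Wall-clock hours to process `volumes` items serially."""
    return volumes * seconds_each / 3600
```

With the slide's numbers, 4,000 no-op releases at 30 seconds each come to about 33 hours, which is how the release cycle ran past 24 hours. The 3-second filter over the same 4,000 volumes takes about 3.3 hours, plus the real release time for the small fraction that actually changed.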

22
Woe: FAA -- the system will not fail!!
  • FAA requires 100% uptime, else won't approve
    system and airline can go fish
  • Yeah, right!

23
Win: FAA -- the system will not fail!!
  • Data outage vs. system outage
  • Replication, of course
  • Multiple configurations for EFS client
  • Crude failover
  • No data outage for six years and counting
  • Well, there were a couple of times when ... but
    we fixed that ...

24
Woe: FAA -- replicas will be identical
  • Several million RW files X 5 replicas
  • Have to prove that all files are identical across
    the 5 ROs for a given volume

25
Win: FAA -- replicas will be identical
  • Tree crawler!
  • A little cheesy -- ls -l | cksum each directory
    in volume and compare results
  • Known bad case looked for 6x per day
  • Key -- fs setserverprefs: I prefer you, now you,
    now you, now you
  • Dedicated client, no mounted .backups
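
The tree crawler idea can be illustrated with a small stand-in. The original used ls and cksum on one dedicated client, steering reads to each replica in turn with fs setserverprefs. The sketch below instead compares several local directory trees, one per replica view, and makes two substitutions that should be read as assumptions: it hashes with SHA-256 rather than cksum, and it fingerprints only entry names and file sizes rather than a full long listing. The function names are hypothetical.

```python
import hashlib
import os


def directory_digests(root):
    """Map each directory (relative to root) to a digest of its listing."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        lines = [f"d {name}" for name in dirnames]
        for name in sorted(filenames):
            size = os.lstat(os.path.join(dirpath, name)).st_size
            lines.append(f"f {size} {name}")
        rel = os.path.relpath(dirpath, root)
        digests[rel] = hashlib.sha256("\n".join(lines).encode()).hexdigest()
    return digests


def mismatched_directories(roots):
    """Return directories whose digest differs from the first root's."""
    reference = directory_digests(roots[0])
    bad = set()
    for root in roots[1:]:
        other = directory_digests(root)
        for rel in reference.keys() | other.keys():
            if reference.get(rel) != other.get(rel):
                bad.add(rel)
    return sorted(bad)
```

Hashing per directory rather than per volume keeps a mismatch report specific enough to point at the damaged part of the tree.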

26
Woe: Let's use a really big cache
  • It seemed like a really good idea
  • 20 files changed per quarter -- < 2/week
  • Average file size 10K
  • Oops, the indexes are monolithic and 300 MB ...
    but don't change often
  • Let's try a 12 GB cache!
  • Hello? I've got twenty minutes to turn the
    shuttle. It takes fifteen minutes to ...

27
Win: Let's not use a really big cache
  • AFS client (still, I believe?) chokes on large
    cache
  • 12 GB = 1,200,000 cache Vfiles
  • At garbage collection time, cache purge looks for
    LRU
  • Gee, that takes a long time. Is the machine
    dead?
  • Let's try a 3 GB cache!
  • (Worked indefinitely from 3.3 through 3.6)
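
The Vfile count above follows from simple arithmetic if the cache is assumed to hold one Vfile per 10K of space, matching the 10K average file size quoted on the previous slide. That ratio, the decimal units, and the function name are assumptions for illustration, not a statement of how any particular AFS client sizes its cache.

```python
def cache_vfiles(cache_gb, avg_file_kb=10):
    """Approximate Vfile count, assuming one Vfile per avg_file_kb of cache."""
    cache_kb = cache_gb * 1_000_000  # decimal units: 1 GB = 1,000,000 KB
    return cache_kb // avg_file_kb
```

Under these assumptions a 12 GB cache yields 1,200,000 Vfiles for the purge to scan, while a 3 GB cache yields 300,000, a quarter of the work.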

28
Other smidgeons
  • vos release manager
  • Does volume need to be released?
  • Are all the relevant fileservers available?
  • Is there a sync site for the VLDB?
  • Do it
  • Did it?
  • Check VLDB entry
  • Compare dates

29
Other smidgeons
  • Data reasonableness checks
  • Do files pointed to by index actually exist?
  • If not, do not vos rel the index
  • Avoids the data outage of an empty index, for
    example (bad day)
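
A reasonableness check of this kind might look like the sketch below. It assumes the list of file paths referenced by an index has already been extracted by some other means, since the EFS index format is not described here. The function names are hypothetical, and treating an empty index as unreleasable is an assumption based on the "empty index" example above.

```python
import os


def missing_targets(referenced_paths):
    """Return the referenced paths that do not exist as regular files."""
    return [p for p in referenced_paths if not os.path.isfile(p)]


def safe_to_release_index(referenced_paths):
    """Releasable only if the index is non-empty and every target exists."""
    return bool(referenced_paths) and not missing_targets(referenced_paths)
```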

30
Other smidgeons
  • popcache
  • Index files monolithic and large
  • Fileservers overseas, slow networks
  • Initial search of newly released index could take
    many minutes
  • Cat indexes to /dev/null every five minutes
  • If index unchanged, local cached copy is used
  • If index changed, pulled from fileserver and user
    doesn't pay penalty for first search
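
The popcache trick is the equivalent of sending each index to /dev/null on a timer so the local AFS cache stays warm. A minimal sketch follows. The function names, the 1 MB read size, and the `cycles` and `sleep` parameters (added so the loop can be bounded and exercised without waiting) are assumptions, not details from the original tool.

```python
import time


def read_through(path, chunk_size=1 << 20):
    """Read a file to the end and discard the data; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += len(chunk)
    return total


def popcache(index_paths, interval_seconds=300, cycles=None, sleep=time.sleep):
    """Touch every index each interval; run forever when cycles is None."""
    done = 0
    while cycles is None or done < cycles:
        for path in index_paths:
            try:
                read_through(path)
            except OSError:
                pass  # an unreachable index should not stop the other reads
        done += 1
        if cycles is None or done < cycles:
            sleep(interval_seconds)
    return done
```

Reading an unchanged index is cheap because it is served from the local cache. Only a newly released index triggers a fetch, and that fetch happens in the background rather than during a mechanic's first search.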

31
Other smidgeons
  • Anyone here ever have these?
  • AFS is complaining about the network, so AFS
    broke the network
  • AFS is the network's canary in a cage
  • We could do the whole thing with NFS!
  • AFS isn't POSIX compliant. Yay DFS!
  • A file lock resides on disk. File in RO volume
    can't be locked. (Oh yes it can.)
  • HP T500 goes to sleep?
  • We could do the whole thing on a Kenmore!

32
Outcome: AFS Rules
  • The airline became the first airline (and may
    still be the only) to place 100% of its aircraft
    maintenance documentation on line
  • The system has run reliably for 5 years
  • So of course it's time to replace it
  • There are three server locations in the US, one
    each in Europe, Hong Kong, Narita, Sydney,
    Montevideo, Rio de J
  • Mechanics no longer mash the microfilm reader
  • This system was enabled by AFS