Title: Section 6 Survey of Residual Risk
1Section 6 Survey of Residual Risk
- Initial Survey of Residual Risk
- Recent Reviews Considering Residual Risk
- Special Topics
- Redbook Candidates
- Launch Liens
Bryant Cramer, NMP Implementation Manager
2Introduction
- Completion of EO-1 has been significantly affected by the Red Team process
- Red Team Charter focuses on Residual Risk
- Consequently, the EO-1 Project has made the consideration of Residual Risk a specific part of each recent presentation
- The reviews involved are as follows
- Initial Pre-Ship Review December 15-17, 1999
- Red Team Review March 28-31, 2000
- Red Team Follow-Up Meeting June 13, 2000
- Delta Pre-Ship Review August 8-9, 2000
- Flight Software Independent Assessment September 8, 2000
- Operational Readiness Review October 3, 2000
- Mission Readiness Review October 5, 2000
3Initial Pre-Ship Review
- Held the initial Pre-Ship Review on 12/15-17/99
- WARP failure had not yet occurred
- Launch still planned for 04/13/00
- Wanted to discuss the results of Thermal Vacuum Test 1
- Acknowledged that considerable work remained to be completed
- At that time, 57 specific tasks were presented as remaining work and areas of potential residual risk
- All but one have been completed with no untoward outcomes
- The remaining task, the Launch Dress Rehearsal, normally occurs a few days before launch and is scheduled by KSC
4Initial Survey of Residual Risk
- Preparatory to the Red Team Review on 3/28-31/00, we did a Survey of Residual Risk
- Developmental activities were tracked through community-based databases
- There are eight such databases
- Project-Level Reviews
- Spacecraft
- Flight Software
- Operations
- GSFC / NCR
- Materials / Processes
- Waivers
- Schedule Liens
- Vast majority (>90%) of all entries are in the first four databases, but since
- Flight Software entries generally mirror Spacecraft entries, and
- Operations database does not include flight hardware -- therefore
- Only the Project-level reviews and the Spacecraft databases were surveyed for residual risk in preparation for the initial Red Team Review
5Initial Survey of Residual Risk (continued)
- All of the RFAs were reviewed and 82 were selected as having potential residual risk
- All of the PRs starting with the first Thermal/Vacuum test were reviewed and classified
- Any change to flight hardware
- Any change to flight software
- Any Redbook candidate
- Any operational constraint
- Any open item affecting flight hardware / software
- All of these were selected (124)
- Any significant test anomaly occurring during I&T was also selected as a Special Topic (57)
- These 263 residual risk candidates were reviewed and accepted by the GSFC Systems Safety and Mission Assurance Office on 3/17/00
- All of these residual risk candidates were addressed in the Red Team Review with the mission elements wherein they reside
- Red Team concerns were almost entirely restricted to the Special Topics
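The candidate counts on this slide can be cross-checked with a short tally. This is only an arithmetic sketch: the dictionary keys are labels chosen here for illustration, and the three counts are the ones quoted on the slide (82 RFAs, 124 PRs, 57 Special Topics).

```python
# Residual risk candidates by source, as quoted on this slide.
# The key names are illustrative labels, not project terminology.
candidates = {
    "RFAs selected": 82,
    "PRs selected": 124,
    "Special Topics (test anomalies)": 57,
}

total = sum(candidates.values())
print(total)  # 263, matching the total accepted on 3/17/00
```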
6Initial Red Team Recommendations (5/9/00)
- Spacecraft is a single-string design
- Risk Mitigation
- Obtain at least 300 hours of failure-free operation on the entire spacecraft -- Estimate 289
- Implement the Accelerated Minimal Mission -- DONE
- WARP is required for the Minimal Mission
- Risk Mitigation
- Obtain at least 300 hours of failure-free operation -- Estimate 700
- Perform WARP Parts Stress Analysis -- DONE
- Design a WARP Back-Up -- DONE
- Flight Software still undergoing change with minimal failure-free hours
- Risk Mitigation
- Conduct walk-throughs of all critical flight and ground S/W -- DONE
- Freeze FSW -- DONE
- Exercise this FSW in T/V testing -- DONE
- High-Fidelity hardware / software test bed does not exist
- Risk Mitigation
- Supplement FSW Maintenance Facility -- DONE
7Initial Red Team Recommendations (continued)
- QA process applied to GSFC GFE inconsistently
- Risk Mitigation
- Maximize failure-free operating time -- DONE
- Implement Accelerated Minimal Mission -- DONE
- Helium adsorption on HRGs
- Risk Mitigation
- Do a worst case analysis -- DONE
- Improved nitrogen purge -- DONE
- WARP, PSE, and ACDS had thermal cycling as opposed to thermal / vacuum testing
- Risk Mitigation
- Determine thermal stresses at maximum power levels -- DONE
- Evaluate performance in T/V testing -- DONE
All initial Red Team Recommendations Accomplished!
8Red Team Follow-Up Meeting
- The Red Team Follow-Up Meeting on 6/13/00 involved the following
- Presentation and discussion of PRA, FMEA, and FTA
- Review of earlier Red Team RFAs
- PRA results
- Reliability of Minimal Mission at 1 year is 0.75
- Accelerating the Data Collection Events from four / day to eight / day allows the Minimal Mission to be largely completed in four months at a reliability of 0.90
- Red Team strongly recommended the implementation of the Accelerated Minimal Mission to achieve the higher reliability
- All RFAs from the initial Red Team Review were accepted
- Two new RFAs were added
- Factor the 1R failures into the reliability calculations
- Describe the single-point failures associated with the Safehold Mode
- These RFAs have since been completed and accepted
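The two PRA figures quoted on this slide (0.75 at one year, 0.90 at four months) are roughly consistent with a simple constant-failure-rate model. The sketch below assumes an exponential reliability law, R(t) = exp(-rate * t). That law and the function names are assumptions made here for illustration and are not necessarily the model used in the PRA.

```python
import math

def failure_rate_from_reliability(reliability: float, years: float) -> float:
    """Constant failure rate (per year) implied by R(t) = exp(-rate * t)."""
    return -math.log(reliability) / years

def reliability_at(rate_per_year: float, years: float) -> float:
    """Reliability after `years` under a constant failure rate."""
    return math.exp(-rate_per_year * years)

# Quoted PRA result: Minimal Mission reliability of 0.75 at one year.
rate = failure_rate_from_reliability(0.75, 1.0)

# Accelerated Minimal Mission: largely complete in four months.
r_four_months = reliability_at(rate, 4.0 / 12.0)
print(round(r_four_months, 2))  # 0.91, close to the quoted 0.90
```

Under this assumption the gain comes entirely from the shorter exposure time: the hardware failure rate is unchanged, but the mission needs only a third of the year.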
9Delta Pre-Ship Review
- Delta Pre-Ship Review held on 8/8-9/00
- Combination review
- Traditional Pre-Ship Review
- Final Red Team Review
- Single Board -- Co-Chaired by Charles Vanek and Ron Thomas
- Three areas of potential risk identified
- S-Band Transponder
- Launch Vehicle Fairing
- Flight Software Independent Assessment
- All three will be presented as Special Topics
- 21 RFAs assigned and have since been submitted
10Flight Software Independent Assessment
- Held at GSFC on 9/08/00
- Two members of the IV&V Facility (West Virginia) visited GSFC
- Requested documentation was posted at a web site two weeks prior to the visit
- No formal presentations during the visit
- Team spent the morning asking questions
- Later inspected software documentation and databases
- Organized by Bill Jackson, IV&V Deputy Director
- A different kind of review -- favorable impressions on both sides
- Treated as a separate Special Topic
- No show-stoppers
- Made eight risk mitigating recommendations that we have adopted
- Final Report included in Probabilistic Risk Assessment document
11Operations Readiness Review
- Operations Readiness Review (ORR) held on 10/3/00
- Combination Review
- Traditional ORR -- chaired by
- Red Team represented by Ann Merwarth
- Walk-on chart summarizing the outcome to be
presented at the MRR
12Special Topics
- 1. WARP Status
- 2. S-Band Transponder
- 3. Launch Vehicle Payload Fairing
- 4. Hyperion Calibration Lamps
- 5. Flight Software Independent Assessment
- 6. Sprague Capacitors
13Special Topic Wideband Advanced Recorder / Processor (WARP)
14WARP Status (1 of 3)
- Wideband Advanced Recorder / Processor (WARP) is a microprocessor-based 48 Gb solid state memory that ingests all imaging data from the three instruments at about 500 Mb/sec
- It is essential for the Minimal Mission
- A 10V regulator failed initially on 1/4/00 and again on 2/13/00 -- the problem was found to be a failed Zener voltage regulator
- The failing part overstressed other parts so that the entire 10V regulator board was replaced
- The WARP was re-assembled, re-verified, and returned to the S/C on 5/15/00
- Re-verification consisted of
- Conducted susceptibility
- Conducted emissions
- 3-axis vibration at workmanship levels
- Four temperature cycles
- Functional tests between each environmental test
- Re-integration onto the Spacecraft was
accomplished with no difficulty
15WARP Status (2 of 3)
- After re-integration there was a second spacecraft Thermal / Vacuum test (T/V II)
- Three trips to 40C for a total of 136 hours
- Two trips to 5C for a total of 135 hours
- A complete CPT at 40C and 5C
- Four functional tests during T/V II
- Over 100 DCEs transmitted
- No difficulties encountered
- Over 500 hours on the new regulator board with no problems
- Now over 2500 hours and well over 400 DCEs on the rest of the WARP
- Calculated reliability at one year is 0.90
- WARP is ready to fly!
16WARP Status (3 of 3)
- After the second WARP failure, it was decided to pursue a Back-Up Solid State Recorder (BSSR)
- The BSSR is a reliable, independent, CCSDS-compatible single card solid state memory that fits into the WARP with its own separate power source
- Team members include Litton, Orbital, and QSS
- Work started in April 2000 and the CDR was held in July
- The WARP performed very well in T/V II and work on the BSSR was discontinued on 9/30/00
- A complete electrical design was completed for the BSSR
- This design would raise the WARP reliability from 0.90 at one year to 0.97, but would cost a nine month launch delay
17Special Topic S-Band Transponder
18S-Band Transponder (1 of 6)
- During T/V II the S-Band Transponder developed storms on the AGC output
- Once properly calibrated, the AGC output is an analog signal proportional to the signal strength in the receiver and is used for this purpose
- Diagnostic tests isolated the problem to the S-Band transponder and it was removed and returned to the manufacturer where the problem was readily demonstrated (the root cause was not determined)
- In the interest of time, the same transponder was removed from the Triana mission and sent to CONIC for further evaluation prior to EO-1 integration
- This transponder experienced two brief AGC transients while in test at CONIC
- To demonstrate that these were random events, this transponder was then exposed to 50 thermal cycles (-20C to 65C) in vacuum over a week
19EO-1 S-Band Transponder Test Logic (2 of 6)
20Thermal Cycling of S-Band Transponder (3 of 6)
21S-Band Transponder (4 of 6)
- The AGC was utterly stable and no other problems were encountered
- EMI / EMC testing was successfully accomplished at an independent lab near CONIC
- Re-integration onto the S/C was successfully accomplished
- Successful tests
- RF Functional Test
- One Complete CPT
- Spacecraft Functional Test
- Compatibility Test Van
- Self-Compatibility scheduled for 10/3/00
- This transponder has over 800 hours on it and has
passed all of our tests
22EO-1 Launch Site S-Band Integration (5 of 6)
[Figure: S-Band Integration flow chart. Numbered annotations (16, 20, 32, 72, 184, 188, 196) denote cumulative hours on S-band following integration.]
23EO-1 Launch Site Flow (6 of 6)
[Figure: launch site flow chart; final annotation reads 208.]
The S-Band Transponder is ready to fly!
24Special Topic Launch Vehicle Payload Fairing
25Launch Vehicle Payload Fairing (1 of 2)
- Some small debris was seen on the video of the Globalstar-7 mission at the time of payload fairing separation
- KSC ERB concluded on 3/3/00 that the actual debris was a lubricant called Fluoroglide
- After examining the data, the EO-1 Project judged the contamination risk to be unacceptable
- KSC ERB recommended on 6/19/00 not to fly the fairing as-is
- In August, Boeing proposed a shield over part of the separation line
- This approach was presented to the Delta Pre-Ship Review -- the Board, as well as the Project, was uncomfortable with this approach
- Boeing agreed to disassemble the fairing, remove the Fluoroglide from the rail assembly, and re-assemble it using a traditional alcohol lubricant that leaves no residue
- This approach has been successfully implemented
- GSFC-recommended tests confirmed the removal of the Fluoroglide
26Launch Vehicle Payload Fairing (2 of 2)
- The EO-1 fairing was also slightly damaged while in transit to Vandenberg Air Force Base (VAFB)
- Damage was a delamination of about 5 x 7
- Such delaminations sometimes occur in fabrication and established procedures are available for their repair
- Such a repair was successfully implemented
- GSFC experts reviewed all of the KSC ERB documentation and are confident the repair is acceptable
- Both issues with the payload fairing have been independently reviewed by GSFC experts and are no longer a concern to the Project
- The residual risk associated with the fairing is acceptably low
- HQ has requested that the Red Team review the resolution of these fairing issues
- This Review is then a Launch Lien
27Special Topic Hyperion Calibration Lamps
28Hyperion Cal Lamp Anomaly Background
- Hyperion response to primary cal lamp noticed to be degraded just prior to EO-1 T/V II; secondary lamp response normal
- Additional tests performed during T/V II
- Primary lamp response continued to degrade
- Secondary lamp response remained stable after small increase in output
- Investigation revealed that the cal lamps were changing output because they were not subjected to an adequate burn-in prior to installation
- Tungsten-halogen lamps must be burned-in to stabilize the filament (which operates at 3000C) before use
- Welch-Allyn recommends burn-in of 1% of life, 6.5 hours minimum
- Estimated cumulative on time for Hyperion lamps (as of 7/31/00)
- Primary approx. 2.5 hours, Secondary approx. 1 hour
- Additional Hyperion cal lamp burn-in performed on spacecraft after EO-1 shipment to VAFB
29Hyperion Flight Cal Lamp Burn-in Results
30GSFC Extended Lamp Burn-In Results
31Maintenance of On-orbit Calibration
- Hyperion radiometric accuracy requirement is 6%
- Case 1: Secondary cal lamps continue operation during mission
- Stability of secondary lamp output is +/- 0.25%
- Radiometric accuracy will remain 2.5% to 3%
- Case 2: Secondary lamps last through several solar cals
- Perform solar cal as early as possible on-orbit to calibrate lamp output
- If secondary fails, increase solar calibration to daily event
- Uses up one of the eight scheduled image collects per day
- Potential degradation in white paint reflectance from frequent solar exposure
- Radiometric accuracy will remain 2.5% to 3% with lamps intact, degrades to 4% to 5% without lamps
- Case 3: Secondary lamps fail immediately, before solar cal
- Rely on solar calibration, cross-instrument (ALI, AC) and cross-platform (Landsat) measurements of selected ground sites, solar and lunar calibration
- Radiometric accuracy will remain at 4% to 5%, still meeting the 6% requirement
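The three cases can be compared against the requirement with a short margin check. This sketch assumes the accuracy figures on this slide are percentages and takes the upper end of each quoted range as the worst case. The case labels are paraphrases written for this example.

```python
REQUIREMENT_PCT = 6.0  # Hyperion radiometric accuracy requirement

# Worst-case accuracy quoted for each case (upper end of each range).
worst_case_pct = {
    "Case 1: secondary lamps operate throughout": 3.0,
    "Case 2: lamps lost after several solar cals": 5.0,
    "Case 3: lamps fail before first solar cal": 5.0,
}

# Margin is the requirement minus the worst-case accuracy, in percentage points.
margins = {case: REQUIREMENT_PCT - acc for case, acc in worst_case_pct.items()}
for case, margin in margins.items():
    print(f"{case}: margin {margin:.1f} points")

all_meet = all(m >= 0 for m in margins.values())
```

Even in the worst case the quoted accuracy stays one percentage point inside the requirement, which is the basis for treating the lamp anomaly as a low risk.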
32Special Topic EO-1 Flight Software Independent
Assessment
33Overview
- At the EO-1 Delta Pre-ship Review, Charles Vanek recommended an IV&V Independent Assessment (IA) of EO-1 Flight Software (FSW) to be completed before MRR
- The IV&V Facility in West Virginia provided two senior analysts for three weeks
- Project documentation was provided to IA team prior to a face-to-face meeting
- Final Report delivered on 9/8/00
- EO-1 accepted all of the recommendations
34Assessment Scope
- Check status of program issues from previous reviews
- Random test runs to demonstrate
- Command validation (accept valids / reject invalids)
- Exception handling
- FSW Patch uplink verification
- Ops mode transitions
- Review problem closures and verification techniques
- TSM/RTS and FDC actions and relationship to FMEA
35Review Materials Provided
- Review Presentations (CDR, PSR, Red Team, and Delta PSR)
- FMEA, FTA, and PRA documentation
- Access to PTR, PR, and Work Order Databases
- FSW design specs and data flow diagrams
- Test procedures, setup, and condition descriptions
- Test output, logs, and other reporting documentation
36Findings
- No significant risks unresolved from previous reviews
- FSW test practices and coverage of the testing are adequate
- Trending analysis demonstrates that FSW is mature and the development process is stable
- No issues identified against fault contingency processes or fault handling
- Eight procedural or documentation clean-up recommendations, which are being completed by the Project
37Recommendations
- Table upload procedures to include processor restart recovery check
- Default memory scrub rate to be increased if dictated by SEU predictions
- Central server for baseline program documentation to be implemented before launch
- Traceability from FSW requirements to test cases to be documented before launch
- Future Work Orders to be annotated to identify steps that address each PTR closure
- Project to review and close remaining PTRs
- Inaccurate comments in RTS database to be corrected
- FDC table entries to be annotated with reference sources
38Special Topic Sprague Capacitors
39Vishay Sprague Capacitor Issue
- At GSFC, IRAC mission had a Vishay Sprague CWR09 part failure. GSFC team went to Sprague to investigate the issue.
- GSFC discovered that Sprague's test procedures and test fixture problems have allowed an undetermined number of improperly screened capacitors to be delivered to several GSFC missions (IRAC, EO-1, MAP)
- CWR09 capacitors from Sprague are suspect if they were processed at the West Palm Beach facility. Capacitors from Sprague's Concord facility are not affected.
- Data on the origins of CWR09 lots is being gathered by EO-1 parts engineers.
- 12 suspect capacitors are in the WARP box.
- Engineers are currently evaluating the specific locations of these capacitors to understand the severity of this issue.
- CWR06 capacitors from Sprague also may be suspect.
- WARP, PSE and possibly ALI may contain these suspect capacitors.
- Parts engineers are evaluating the CWR09 parts first, then will investigate the CWR06 parts.
- CWR06 parts are more numerous than the CWR09s.
40Vishay Sprague Capacitor Issue
- Successful testing of residual parts from the same lot as flight is a way to mitigate change out if sufficient quantities exist, at least 100 per lot. Small samples will not provide sufficient data. EO-1 does not have sufficient residual parts to test, so this is not an option.
- Date Code is not a factor. All date codes for parts tested at West Palm Beach facility are suspect.
- This issue emerged on 9/29/00 and is still being worked.
- The resolution of this issue becomes a Launch Lien.
41Vishay Sprague Capacitor Issue
- At GSFC, IRAC mission had a Vishay Sprague CWR09 part failure. GSFC team went to Sprague to investigate the issue.
- GSFC suspects that Sprague's test procedures and test fixture problems may have allowed an undetermined number of inadequately screened capacitors to be delivered to several GSFC missions (IRAC, EO-1, MAP)
- CWR09 capacitors from Sprague are suspect if they were processed at the West Palm Beach facility. Capacitors from Sprague's Concord facility are not affected.
- Data on the origins of CWR09 lots have been gathered by EO-1 parts engineers.
- Six suspect capacitors are in the WARP.
- Engineers are currently evaluating the specific locations of these capacitors to understand the severity of this issue.
42Vishay Sprague Capacitor Issue
- Successful testing of residual parts from the same lot as flight is a way to mitigate change out if sufficient quantities exist, at least 100 per lot. Small samples will not provide sufficient data. EO-1 does not have sufficient residual parts to test, so this is not an option.
- Date codes near those of the IRAC parts, for parts tested at the West Palm Beach facility, are suspect.
- This issue became known to EO-1 on 9/29/00 and is now being worked aggressively.
- The resolution of this issue becomes a Launch Lien.
43Current Status
- Screening
- Since the concern is potential early failures that escaped through faulty factory screening tests, these parts are somewhat less likely to be involved due to
- Successful completion of a 50A surge test, and
- Over 2500 hours of use with no difficulties -- not a part of LVPC re-build
- Application
- The six parts are in the WARP RSN
- They are power filters across a 15V regulated supply
- They are rated at 50 VDC -- derated >3 to 1
- This is not a stressful application and promotes self-healing of the capacitor (current limited voltage source)
- Still defining the consequences of a failure
- Resolution
- All of the necessary data will be available next week
- Codes 300, 400, and 500 will concur and provide an e-mail to the PMC Chairman
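The derating figure on this slide follows directly from the two voltages quoted. A minimal check, using constant names chosen here for illustration:

```python
RATED_VOLTAGE = 50.0    # capacitor rating quoted on the slide, VDC
APPLIED_VOLTAGE = 15.0  # regulated supply the parts filter, V

# Derating ratio: rated voltage divided by the voltage actually applied.
derating_ratio = RATED_VOLTAGE / APPLIED_VOLTAGE
print(round(derating_ratio, 2))  # 3.33, i.e. better than 3 to 1
```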
44Redbook Candidates
- The EO-1 Project proposes the following ten events as Redbook candidates
- In order of priority
- 1. Helium adsorption on HRGs
- 2. Hyperion calibration lamps
- 3. One-time, unrepeatable, unexplained events
- a. Apparent X-Band RF frequency shift
- b. A/D anomalous maximum value
- c. Chassis current transient
- d. Sustained SADA chassis current event
- The helium adsorption is a small but real risk due to the significant release of helium from the Launch Vehicle in the event of a launch scrub -- we have done all that we can, but a small, finite risk remains
- The rest are of minimally low risk
45Redbook Candidate (1 of 7)
- Litton Space Inertial Reference Unit uses three Hemispherical Resonator Gyros (HRGs)
- Atmospheric helium adsorbs on the quartz of the HRGs, increasing their Parametric Drive Voltage (PDV)
- Not a problem until Spacecraft arrived in Building 7 where helium is used in the environmental chamber testing processes
- PDV initially about -1.0V and decreased to -2.80V within a month. HRGs fail to reliably start when below -5.0V
- Nitrogen purge quickly installed and subsequently improved, including continuous atmospheric monitoring
- Primary helium source determined to be a vent near the Building 7 air intake -- since moved
- This approach has been successful and implemented at the launch site for continuous purging while at VAFB and on the launch stand
- PDV is slowly increasing and now stands at -2.6V
- Our goal was to be above -4.0V throughout the launch campaign and we have met this goal
- Remaining risk is the potential release of significant helium from the launch vehicle in the event of a launch scrub, combined with an outright failure of our purge system
- We are theoretically protected, but a small, finite risk remains in the event of a launch scrub
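The PDV figures on this slide translate into two voltage margins, computed below. The constant names are illustrative, and the values are the ones quoted above; nothing here models the helium adsorption process itself.

```python
START_FAILURE_V = -5.0   # HRGs fail to reliably start below this PDV
CAMPAIGN_GOAL_V = -4.0   # project goal: stay above this through launch
CURRENT_PDV_V = -2.6     # value reported on this slide

# Margins are how far the current PDV sits above each threshold.
margin_to_goal = round(CURRENT_PDV_V - CAMPAIGN_GOAL_V, 2)
margin_to_failure = round(CURRENT_PDV_V - START_FAILURE_V, 2)
print(margin_to_goal, margin_to_failure)  # 1.4 2.4
```

For scale, the initial unpurged exposure in Building 7 moved the PDV by about 1.8V (from -1.0V to -2.80V) within a month, so the 1.4V margin to the goal depends on the purge continuing to work.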
46Redbook Candidate (2 of 7)
- Directly prior to T/V II, the Hyperion primary calibration lamps were found to be degraded, but the secondary was normal
- Investigation revealed that the calibration lamps were not adequately burned-in
- During additional burn-in, a primary calibration lamp failed
- Subsequent examination of 15 bulbs from the same lot revealed
- An anticipated life of 146 hours -- quite adequate for the mission
- Halogen cycle not working well at ambient pressure
- Halogen cycle working properly in vacuum
- Changing out the primary calibration lamps would mean a two-month launch delay
- The secondary is stable and should last throughout the mission
- These calibration lamps are transfer standards between solar calibrations
- Operational workarounds have been developed to compensate for the loss of the secondary calibration lamps and still maintain radiometric accuracy
- This is a negligible risk -- See Special Topic for details
47Redbook Candidate (3 of 7)
- After Spacecraft vibration on 8/4/99, the first attempt to use the X-Band downlink resulted in failure to achieve carrier lock
- Test personnel saw an apparent frequency shift of 6-7 MHz on spectrum analyzer
- The next day, more equipment was brought in to examine the problem, but it disappeared and never occurred again
- Detailed examination failed to find the cause of the problem so a panel of independent experts was convened by the Deputy Center Director
- FTA performed and individual branches systematically ruled out
- Experts' Findings
- Flight hardware essentially ruled out as probable cause
- Most likely cause was RF GSE cables that were damaged in handling / use
- Damaged cables display selective frequency attenuation (SFA) that leads to loss of carrier lock and an apparent frequency shift
- These cables were replaced by the Project early in its investigation
48Redbook Candidate (4 of 7)
- Some of these RF GSE cables were subsequently found and tested -- they displayed SFA
- David Israel (Code 527) was able to successfully model an apparent frequency shift based on SFA for comparable cables
- X-Band System now has over 700 hours on it since this one-time incident and has successfully downlinked well over 300 DCEs
- Although never specifically reproduced, there is strong circumstantial evidence to believe that RF GSE cables were responsible for this one-time event
- This concern is of negligible risk to the mission
49Redbook Candidate (5 of 7)
- PR-837-20-1: While loading software in the housekeeping RSN, the A/D in the PSE RSN indicated that the PSE LVPC total output current was FF (hex), or 19.275A, for one sample while the Spacecraft was quiescent and no heaters were ON
- Occurred only once in over 2500 hours (>9 x 10^6 samples)
- LVPC overcurrent trip did not occur
- Umbilical telemetry (battery current, battery voltage, and chassis current) remained nominal
- Fault tree analysis completed and each branch explored
- No credible cause was identified
- No evidence of FF representing actual current
- GSE found to be operating normally
- A/D timing explored and found to be adequate, but marginal
- Conclude this is a random event of unknown etiology
- Judged to be a random event of low risk to the mission
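Two numbers on this slide can be sanity-checked with simple arithmetic. The sketch reads the sample count as "more than 9 x 10^6" and assumes the FF reading is the full-scale code of an 8-bit linear converter with zero offset. Both the converter width and the linear scaling are assumptions made here, not facts stated in the problem report.

```python
HOURS_OBSERVED = 2500
SAMPLES_OBSERVED = 9e6        # "more than 9 x 10^6 samples"
FULL_SCALE_COUNTS = 0xFF      # assumed 8-bit converter, all bits set
FULL_SCALE_AMPS = 19.275      # current reported for the FF reading

# Implied telemetry sample rate over the observation period.
sample_rate_hz = SAMPLES_OBSERVED / (HOURS_OBSERVED * 3600)

# Implied resolution per count under the linear, zero-offset assumption.
amps_per_count = FULL_SCALE_AMPS / FULL_SCALE_COUNTS

print(sample_rate_hz)            # 1.0
print(round(amps_per_count, 4))  # 0.0756
```

The sample count is therefore consistent with one sample per second, and a single all-ones reading among roughly nine million fits the slide's conclusion that the reading did not represent actual current.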
50Redbook Candidate (6 of 7)
- PR-770-20-3 describes a one-time transient of 300 mA chassis current that was observed on the stripchart recorder on 11/22/99
- The transient was less than one second in duration
- The event was also seen in the telemetry
- EMC testing was occurring at the time of the transient and various test probes were connected to the Spacecraft power bus
- Investigation was unable to identify a probable cause
- This transient was most likely associated with the GSE used in the EMC testing
- As such, this one-time transient is of negligible risk to the mission
51Redbook Candidate (7 of 7)
- PR-572-30-10 describes a prolonged chassis current of 925 mA on 8/20/99
- Ad hoc testing following the event was able to reduce the problem by moving the Solar Array Simulator Cable (EGSE)
- Inspection of this cable revealed split insulation on several wires in the backshell of the connector
- A new Solar Array Simulator Cable was fabricated
- In the succeeding 2500 hours of Spacecraft powered testing the problem has not recurred
- Although we cannot be absolutely certain, there is strong presumptive evidence that the failed insulation in the EGSE cable was responsible for this event
- This event is of negligible risk to the mission
52Launch Liens
- Successful completion of remaining planned work
- Sprague capacitors in WARP LVPC (IRAC problem)
- Conduct Red Team review of fairing issues
- Complete closure of open paper
- Completion of any RFAs from MRR
- Presentation of MRR Overview to Office of Earth
Science