Title: Threat and Error Management: Diagnosing Safety Before a System Breakdown
1Threat and Error Management: Diagnosing Safety Before a System Breakdown
- James Klinect
- The University of Texas at Austin
- February 18, 1999
2Three Organizational Approaches to Safety
- 1. Reactive - improve system defenses to prevent a particular breakdown (incident or accident) from recurring
- 2. Proactive - diagnose weaknesses in system defenses that have the potential to lead to future breakdowns
- 3. Reactive and Proactive - the optimal organizational safety approach
3Reactive Sources of Data
- Data collected after a system breakdown
- Accident reports - too rare (1.5 PMD accident rate)
- Voluntary incident reporting - causes and outcomes
- Same as ASRS but the reports are collected in-house
- Air Safety Awareness Partnership (ASAP) - 3500 reports per year
4Requirements for a Successful Reporting System
- 1. Trust
- 2. Non-punitive policy toward flightcrew error
- 3. Confidentiality
- 4. Managerial commitment to take action
- 5. Feedback information to the line
5Surveying Incident Reporting Roadblocks
6Voluntary Incident Reporting Limitations
- Reports can be skewed - incomplete data about the event (e.g., procedural violations)
- Retrospective - only provides post-hoc data on failed system or crew performance
7Proactive Safety Approach
A proactive safety approach is an informed
approach
8Why Threats and Error?
- Threats and errors reduce the margin of safety and increase the probability of incidents or accidents
- Threats and errors are inevitable; we can only hope to reduce their impact by managing their consequences
- The quality of threat and error management is a better indicator of system safety because it is more robust than incident or accident reports
9Proactive Data Sources
- Data collected before a system breakdown
- Performance evaluations - crew competency
- DFDR (FOQA) - flight instrument parameters
- Surveys - attitudes, perceptions, and suggestions
- Line Operations Safety Audit (LOSA) - crew
performance during normal flight operations
10Line Operations Safety Audit
- Systematic observations of crew performance
- Team of observers from the airline and U.T.
- Non-jeopardy
- Union supported
- Measures
- SOP compliance
- CRM behavioral markers and crew performance
- Threat management - external threat types and behavior
- Error management - error types and behavior
- Informal feedback from the crews about flight operations and training
11The LOSA Error Database
- 1. International Major - 59 crews on 91 flights
- International and domestic flights - South Pacific and Pacific Rim
- 2. U.S. Major - 65 crews on 102 flights
- Only international flights - Central and South America
- 3. U.S. Regional - 60 crews on 121 flights
- Experienced Captains with inexperienced First Officers (less than 4 years in aviation and less than one year in position)
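Taken together, the three audits give the overall size of the database. A minimal Python sketch of that arithmetic, using only the crew and flight counts listed on this slide (the dictionary layout and variable names are illustrative assumptions, not part of any LOSA tooling):

```python
# Crew and flight counts as listed on the slide; the structure is illustrative.
audits = {
    "International Major": {"crews": 59, "flights": 91},
    "U.S. Major": {"crews": 65, "flights": 102},
    "U.S. Regional": {"crews": 60, "flights": 121},
}

# Sum each column across the three audits.
total_crews = sum(a["crews"] for a in audits.values())
total_flights = sum(a["flights"] for a in audits.values())

print(total_crews, total_flights)  # 184 314
```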
12Threat Management
13The Rain of Threats
14Threats
Threats increase the level of risk to safety
- External Events
- Adverse weather
- ATC command
- Terrain
- Aircraft systems malfunction
- Maintenance event
- Dispatch event
- Ground handling event
- Cabin event
- Airport conditions
- Operational pressure
- External Errors
- Maintenance error
- Dispatch error
- ATC error
- Ground crew error
- Cabin crew error
15A Heavy Rain of Threat
- On one flight observation,
- 1. Late arriving aircraft
- 2. Inconsistent fuel slips
- 3. Weight restriction on departure
- 4. Weather and heavy traffic on takeoff
- 5. Lavatory smoke alarm during cruise
- 6. Weather and heavy traffic on arrival
- 7. ATC instructed a runway change in late final
16Threat Results
- 72% of the flights had at least one threat
- From 0 to 10 external threats per flight
- Averaged 2 threats per flight
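The three figures above (share of flights with a threat, range, and average) are simple summaries of a per-flight threat count. A minimal sketch of how such a summary could be computed, assuming each observed flight is reduced to one integer count; the function name and the sample counts are hypothetical and are not audit data:

```python
from statistics import mean

def summarize_threats(threats_per_flight):
    """Summarize per-flight threat counts: share with >= 1 threat, range, mean."""
    n = len(threats_per_flight)
    if n == 0:
        raise ValueError("no flights observed")
    flights_with_threat = sum(1 for t in threats_per_flight if t > 0)
    return {
        "share_with_threat": flights_with_threat / n,
        "min": min(threats_per_flight),
        "max": max(threats_per_flight),
        "mean": mean(threats_per_flight),
    }

# Hypothetical counts for illustration only, not taken from the LOSA database.
example_counts = [0, 2, 1, 0, 5, 3, 0, 2, 10, 1]
summary = summarize_threats(example_counts)
print(summary)
```

The same function would apply unchanged to per-flight error counts.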
17Threats by Phase of Flight
Threats most frequently occur during preflight
and approach
18Most Common Threats
- 1. Adverse weather - 20% of all flights
- 2. Aircraft malfunctions - 12%
- 3. ATC event - 10%
- 4. External errors (ATC, Maintenance, Cabin, Dispatch, and Ground Crew) - 8%
- 5. Operational pressures - 8%
19Threat Recognition and Error Avoidance Behaviors
- Behaviors that crews used to recognize threats and avoid error
- 1. Active Captain leadership
- 2. Vigilance
- 3. Operational plans clearly stated and acknowledged
- 4. Staying ahead of the curve
- 5. Following SOP
20Error Management
21The Other Piece: Flightcrew Errors
22Flightcrew Errors
- Can be triggered by an external threat or occur in isolation
- Flightcrew error - an action or inaction that leads to a deviation from crew intentions or expectations
23A Model of Error Management
24[Diagram]
- Error Types: Intentional Noncompliance, Procedural, Communication, Proficiency, Operational Decision
25Error Types
- 1. Intentional Noncompliance - violations
- ex.) Omitted required briefings
- Performing checklists from memory
- Failure to cross-verify settings
- 2. Procedural - followed procedures but wrong execution
- ex.) Lever and switch settings
- Wrong altitude dialed
- Wrong MCP mode executed
26Error Types
- 3. Communication - misinterpretation or missing information during an exchange
- ex.) Wrong readbacks to ATC
- Missed ATC calls
- Wrong runway communicated
- 4. Proficiency - lack of knowledge error
- ex.) Lack of stick and rudder proficiency
- Lack of knowledge with automation
- Lack of knowledge with procedures
27Error Types
- 5. Operational Decision - discretionary decision not covered by procedures that unnecessarily increased risk
- ex.) Over-reliance on automation
- Unnecessary low maneuver on approach
- Unnecessary navigation through adverse weather
28[Diagram]
- Error Types: Intentional Noncompliance, Procedural, Communication, Proficiency, Operational Decision
- Error Responses: Trap, Exacerbate, Fail to Respond
29Error Responses
- Trap - error is detected and managed before it becomes consequential (undesired state or additional error)
- Exacerbate - error is detected but the crew's action or inaction becomes consequential
- Fail to Respond - lack of a response to an error (undetected or ignored) that can end up being either inconsequential or consequential
30[Diagram]
- Error Types: Intentional Noncompliance, Procedural, Communication, Proficiency, Operational Decision
- Error Responses: Trap, Exacerbate, Fail to Respond
- Error Outcomes: Inconsequential, Undesired State, Additional Error
31Undesired States
Undesired states are deviations from normal
flight that unnecessarily compromise safety
- Lateral deviation - heading
- Vertical deviation - altitude
- Speed too high or low
- Unstable approach
- Near miss
- Fuel level below minimums
- Vertical deviation on the G.S.
- Long landing
- Hard landing
- Landing off centerline
- Wrong taxiway or ramp
- Wrong runway
- Wrong airport
- Wrong country
32[Diagram]
- Error Types: Intentional Noncompliance, Procedural, Communication, Proficiency, Operational Decision
- Error Responses: Trap, Exacerbate, Fail to Respond
- Error Outcomes: Inconsequential, Undesired State, Additional Error
- Undesired State Responses: Mitigate, Additional Error
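The model built up across these slides can be read as a small data structure: each observed error has a type, a crew response, and an outcome, and an undesired state may in turn have its own response. The following Python sketch is one possible encoding under that reading; the class and field names are hypothetical, and the consistency checks simply restate the Trap and Exacerbate definitions given earlier rather than any coding rule confirmed by the audit team:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class ErrorType(Enum):
    INTENTIONAL_NONCOMPLIANCE = "intentional noncompliance"
    PROCEDURAL = "procedural"
    COMMUNICATION = "communication"
    PROFICIENCY = "proficiency"
    OPERATIONAL_DECISION = "operational decision"

class ErrorResponse(Enum):
    TRAP = "trap"
    EXACERBATE = "exacerbate"
    FAIL_TO_RESPOND = "fail to respond"

class ErrorOutcome(Enum):
    INCONSEQUENTIAL = "inconsequential"
    UNDESIRED_STATE = "undesired state"
    ADDITIONAL_ERROR = "additional error"

class UndesiredStateResponse(Enum):
    MITIGATE = "mitigate"
    ADDITIONAL_ERROR = "additional error"

@dataclass(frozen=True)
class ObservedError:
    """One observed flightcrew error (hypothetical record layout)."""
    error_type: ErrorType
    response: ErrorResponse
    outcome: ErrorOutcome
    undesired_state_response: Optional[UndesiredStateResponse] = None

    def __post_init__(self):
        # A trapped error is managed before it becomes consequential.
        if (self.response is ErrorResponse.TRAP
                and self.outcome is not ErrorOutcome.INCONSEQUENTIAL):
            raise ValueError("a trapped error must be inconsequential")
        # An exacerbated error is, by definition, consequential.
        if (self.response is ErrorResponse.EXACERBATE
                and self.outcome is ErrorOutcome.INCONSEQUENTIAL):
            raise ValueError("an exacerbated error must be consequential")
        # Only an undesired state can carry an undesired-state response.
        if (self.undesired_state_response is not None
                and self.outcome is not ErrorOutcome.UNDESIRED_STATE):
            raise ValueError("response given without an undesired state")

# Example: a wrong altitude dialed, not caught, leading to a mitigated deviation.
example = ObservedError(
    ErrorType.PROCEDURAL,
    ErrorResponse.FAIL_TO_RESPOND,
    ErrorOutcome.UNDESIRED_STATE,
    UndesiredStateResponse.MITIGATE,
)
```

Fail to Respond is left unconstrained because, per its definition, it can end up either inconsequential or consequential.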
33Flightcrew Error Results
- 72% of the crews committed at least one error
- 65% of the flights had at least one error
- From 0 to 14 errors per flight
- Averaged 2 errors per flight
- There were between-fleet and within-fleet differences
34Errors by Phase of Flight
35Error Frequencies and Consequences
36Most Common Errors
- 1. Automated systems errors (MCP and FMC) - 21% of all flights
- Failure to cross-verify settings
- Wrong MCP or FMC settings
- Other intentional noncompliance errors
- 2. Checklist errors - 20%
- Checklist performed from memory
- Nonstandard checklist usage
- Self-performed checklist
- Procedural checklist errors
37Most Common Errors
- 3. Sterile cockpit violations - 10%
- 4. ATC errors - 6%
- Missed ATC calls
- Omitted information (readbacks or call signs)
- Accepting ATC instructions that unnecessarily increased risk
- Procedural ATC errors
- 5. Briefing errors (omitted or incomplete) - 5%
38Error Management Results
39Undesired State Results
- Responses to Undesired States
- 75% are mitigated
- 9% lead to additional errors
- 16% required no crew response
- Most common undesired states
- 1. Vertical deviations
- 2. Speed too high
- 3. Lateral deviations
40Error Detection and Management Behaviors
- Behaviors that crews used to detect and manage errors
- 1. Active captain leadership
- 2. Environment set for open communications
- 3. Crew members asking questions and speaking up
- 4. Vigilance
- 5. Prioritization of tasks to manage workload
41Between-Airline Differences
42Violations as the Norm
- One observer noted the following during the U.S. Regional audit on an IOE ride:
- The Check Airman ran the entire taxi checklist by memory.
- Organizations cannot allow violations to normalize
- Why?
- Crews that commit at least one intentional noncompliance error are more likely to commit other types of error than those without an intentional noncompliance error
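The comparison behind this last point can be sketched as a split of crews into two groups, followed by the share in each group that committed any other type of error. The sketch below assumes each crew is reduced to a set of error-type labels; the function name, the labels, and the sample crews are hypothetical and do not reproduce the audit results:

```python
NONCOMPLIANCE = "intentional_noncompliance"

def other_error_rates(crews):
    """Return (rate_with, rate_without): the share of crews committing any
    error type other than intentional noncompliance, split by whether the
    crew committed at least one intentional noncompliance error."""
    def rate(group):
        if not group:
            return None  # no crews in this group, so no rate to report
        with_other = sum(1 for errors in group if errors - {NONCOMPLIANCE})
        return with_other / len(group)

    with_nc = [c for c in crews if NONCOMPLIANCE in c]
    without_nc = [c for c in crews if NONCOMPLIANCE not in c]
    return rate(with_nc), rate(without_nc)

# Hypothetical crews for illustration only, not LOSA data.
sample_crews = [
    {NONCOMPLIANCE, "procedural"},
    {NONCOMPLIANCE, "communication"},
    {NONCOMPLIANCE},
    {"procedural"},
    set(),
    set(),
    set(),
]
rate_with, rate_without = other_error_rates(sample_crews)
print(rate_with, rate_without)
```

A difference between the two rates in real data would still need a significance test before it supports the claim.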
43Summary
44Optimizing Safety Efforts
- Organizational safety approaches need to be reactive and proactive
- Incident reports provide useful data after a system breakdown, but they are only part of the answer
- Safety efforts will be fully optimized when systems diagnose weaknesses before breakdowns
- This requires ongoing line audits of normal flight operations that measure crew performance in threat and error management
45Safety Roles in Improving System Defenses
- Flight Standards
- Line checks to reinforce threat recognition, error avoidance and management
- Flight Operations
- Review/revise SOPs and policies
- Safety
- Error reporting system (ASAP) for data
- Ongoing line audits for different parts of the operation
- Training
- Threat recognition, error avoidance, and error management
- Leadership
- Focus on technical and procedural excellence
46Our Website
- www.psy.utexas.edu/psy/helmreich/nasaut.htm