Part 2 Session 1 Breakout

Herb Shivers, NASA/MSFC

1
Part 2 Session 1 Breakout 1
Old Lessons Apply in the New World
2
Recap from Panel Discussion
  • There has to be an optimum balance among
    technical performance, time schedule and cost.

  • Dr. Eberhard Rees
  • If eternal vigilance is the price of liberty,
    then chronic unease is the price of safety.
  • Professor James Reason (2005, p 37) (substitute
    quality for safety)
  • Quality and System Safety both are instrumental
    in the prevention process

3
What is Quality Engineering?
  • Juran
  • customer satisfaction, or simply "fitness for
    use" (p 20)
  • Ishikawa
  • the practice of developing, designing, producing,
    and servicing a quality product that is most
    economical, useful, and satisfactory to the
    customer (p 64)
  • Crosby
  • conformance to requirements (p 21)
  • Deming
  • a predictable degree of uniformity and
    dependability that is suited to the market at low
    cost. In other words, quality is meeting customer
    needs and wants (p 61)

ASQ, 2001
4
Quality Evolution
  • Babylonian, Egyptian, Greek, Roman weights and
    measures for trade
  • Trades and craft guilds standards (experts)
  • Mass production and machinery (low level
    training)
  • Supervisor quality monitors
  • Inspectors (à la quality control)
  • Deming Plan-Do-Check-Act Cycle
  • Juran, Feigenbaum, Ishikawa: TQM
  • Quality assurance
  • designed in, not inspected in
  • (James Reason, pp 46/7)

5
What is System Safety Engineering?
  • System Safety Engineering (SSE) - A subset of the
    safety engineering discipline that provides
    direct support to programs and projects to
    achieve acceptable mishap risk through a
    systematic approach of hazard analysis, risk
    assessment, and risk management.
  • (J.R. Goodin/NASA/KSC (retired), 2004)
  • System Safety is the application of engineering
    and management principles, criteria, and
    techniques to optimize all aspects of safety
    within the constraints of operational
    effectiveness, time, and cost throughout all
    phases of the system life cycle
  • (Air Force Safety Agency, 2000, p vii)
  • System safety is
  • A management doctrine, and
  • A family of analytical approaches that support
    that doctrine (Mohr, Jacobs Sverdrup, 2002)

6
Some Analysis Types
  • Preliminary Hazard Analysis (PHA)
  • System Hazard Analysis (SHA)
  • Subsystem Hazard Analysis (SSHA)
  • Occupational Health Hazard Assessment (OHHA)
  • Software Hazard Analysis
  • SSE Analyses consider system limits and risks

Mohr, 2002
7
Some Analytical Techniques
  • Preliminary Hazard Analysis
  • Failure Modes and Effects Analysis
  • Fault Tree Analysis
  • Event Tree Analysis
  • Cause-Consequence Analysis
  • Sneak Circuit Analysis
  • Probabilistic Risk Assessment
  • Digraph Analysis
  • Hazard and Operability Study (HAZOP)
  • Management Oversight and Risk Tree Analysis
    (MORT)
  • SSE requires a toolbox of techniques; there is no
    one-size-fits-all tool

Mohr, 2002
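
To make one of the listed techniques concrete, here is a minimal sketch of the arithmetic behind a quantitative Fault Tree Analysis. It is not taken from the presentation or from Mohr. The event names and probabilities are hypothetical, and the calculation assumes that basic events are statistically independent and that each basic event appears only once in the tree. Trees with repeated events need a minimal cut set treatment instead.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Basic:
    """A basic event with an assumed probability of occurrence."""
    name: str
    p: float


@dataclass(frozen=True)
class Gate:
    """A logic gate combining child events; kind is "AND" or "OR"."""
    kind: str
    children: tuple


def probability(node) -> float:
    """Probability of a node, assuming independent, non-repeated events."""
    if isinstance(node, Basic):
        return node.p
    child_p = [probability(child) for child in node.children]
    if node.kind == "AND":
        # All children must occur.
        return prod(child_p)
    if node.kind == "OR":
        # At least one child occurs: 1 - P(none occur).
        return 1.0 - prod(1.0 - p for p in child_p)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical example: loss of cooling occurs if both redundant pumps
# fail, or if a single shared valve fails.
pumps = Gate("AND", (Basic("pump A fails", 0.01), Basic("pump B fails", 0.02)))
top = Gate("OR", (pumps, Basic("shared valve fails", 0.001)))

print(f"P(top event) = {probability(top):.7f}")
```

In this made-up tree the single valve dominates the result even though the pumps are individually less reliable, which is the kind of insight the tree structure is meant to expose.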
8
Why System Safety Engineering?
  • Support management risk decisions relative to
    system hazards
  • Avoid "fly-fail-fix-fly" and "pilot error"
    mentalities
  • Manage safety in the same manner as any other
    design or operational parameter
  • Prevent accidents, not react to them
  • Consider impacts to workers, the public,
    product quality, productivity, environment,
    facilities and equipment

Shivers, 2005
9
Effective System Safety Program Attributes
  • Management Commitment
  • Safety Culture
  • Independent Safety Organization
  • Communication
  • Qualified/Educated Personnel
  • Well-Defined Roles, Processes and Tools
    Including
  • Use of Technical Standards, Capture/Use of
    Lessons Learned,
  • Audits and Reviews, Stop Work Authority
  • Sufficient Resources
  • (Kiessling, Shivers, and Tippet, 2004)

10
Systems Thinking
  • Learn to view connected events as a system (Peter
    Senge, The Fifth Discipline)
  • Seeing wholes: the big picture, unintended
    consequences, cause and effect (including delay),
    long-term views, etc.
  • Our jobs don't exist in isolation
  • Deal with root causes, not symptoms

Senge
11
Who Should Implement SSE?
  • SSE is the responsibility of all technical and
    management personnel on a project team
  • Chief engineers, systems engineers, design
    engineers, project managers all must include SSE
    thinking as a minimum in their work and
    understand what SSE is and does
  • SSE practitioners generally come from the safety
    and mission assurance organizations, but must be
    planned for and included in the team activities

Shivers, 2005
12
SSE Thinking
  • SSE thinking is focused on identifying and
    controlling potential failure, while design
    engineering thinking might be more focused on
    successful operation
  • Together, the two thought modes are complementary
    and lead to a better chance of success, which is
    the goal of each
  • Both thought modes need to be within the realm of
    Systems Thinking in general to consider all
    impacts of decisions made

Shivers, 2005
13
When is SSE Implemented?
  • SSE considerations must be included in the
    up-front conceptualization so that pertinent
    information can be used in trade studies and
    requirements development
  • SSE is applied throughout the life cycle with
    appropriate tools and analyses brought to bear as
    warranted
  • The system safety process can be applied at any
    point in the system life cycle, but the greatest
    advantages are achieved when it is used early in
    the acquisition life cycle
  • The system safety process is normally repeated as
    the system evolves or changes and as problem areas
    are identified (Air Force Safety Agency, 2000, p
    14)
  • Decisions made under cost and schedule pressure
    can lead to hazards (Stroup and Naylor, 2001)

14
SSE and the Life Cycle
  • Early in the life cycle SSE considers hazards
    that may occur any time in the life cycle
  • Early identification usually results in less
    expensive corrections
  • Analysis can be and is done at any time in the
    life cycle

Shivers, 2005
15
System Safety Program Objectives
  • a. Safety, consistent with mission requirements,
    is designed into the system in a timely,
    cost-effective manner
  • b. Hazards are identified, evaluated, and
    eliminated, or the associated risk reduced to a
    level acceptable to the managing activity (MA)
    throughout the entire life cycle of a system
  • c. Historical safety data, including lessons
    learned from other systems, are considered and
    used
  • d. Minimum risk is sought in accepting and using
    new designs, materials, and production and test
    techniques
  • e. Actions taken to eliminate hazards or reduce
    risk to a level acceptable to the MA are
    documented
  • f. Retrofit actions are minimized
  • g. Changes in design, configuration, or mission
    requirements are accomplished in a manner that
    maintains a risk level acceptable to the MA
  • h. Consideration is given to safety, ease of
    disposal, and demilitarization of any hazardous
    materials associated with the system
  • i. Significant safety data are documented as
    lessons learned and are submitted to data
    banks, design handbooks, or specifications
  • j. Hazards identified after production are
    minimized consistent with program restraints

Air Force Safety Agency, 2000, p 1
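
Objectives b, e and g all turn on whether a hazard's risk is "acceptable to the MA". One common way to make that judgment repeatable is a severity/likelihood matrix. The sketch below is illustrative only: the category names follow a widely used pattern, but the scoring rule and the thresholds are assumptions made for this example, not values from the handbook. A real program would use the matrix agreed with its managing activity.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Lower number means worse outcome (assumed ordering for this sketch).
    CATASTROPHIC = 1
    CRITICAL = 2
    MARGINAL = 3
    NEGLIGIBLE = 4


class Likelihood(IntEnum):
    # Lower number means more likely (assumed ordering for this sketch).
    FREQUENT = 1
    PROBABLE = 2
    OCCASIONAL = 3
    REMOTE = 4
    IMPROBABLE = 5


def risk_level(severity: Severity, likelihood: Likelihood) -> str:
    """Classify risk from a hypothetical severity x likelihood score."""
    score = severity * likelihood  # smaller score means higher risk
    if score <= 3:
        return "high"
    if score <= 8:
        return "medium"
    return "low"


def acceptable(severity: Severity, likelihood: Likelihood) -> bool:
    """Assumed policy: only 'high' risk is unacceptable without mitigation."""
    return risk_level(severity, likelihood) != "high"


# A mitigation that cannot change the severity can still lower the risk
# by making the hazard less likely.
before = risk_level(Severity.CATASTROPHIC, Likelihood.PROBABLE)
after = risk_level(Severity.CATASTROPHIC, Likelihood.IMPROBABLE)
print(before, "->", after)
```

Keeping the rule in one function means a design or mission change (objective g) can be re-scored the same way as the original assessment.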
16
Some Concept Phase SSE Tasks
  • Concept Trade Studies
  • Concept alternative studies include quantitative
    and qualitative SSE analysis input and criteria
  • Concept Definition
  • Requirements management, risk management
    planning, feasibility and design trades, and
    safety technical requirements generation include
    results from SSE analysis

Shivers, 2005
17
Some Development Phase SSE Tasks
  • Development of contract requirements in the
    Statement of Work and for the contract data
    requirements (analysis reports)
  • Example analysis requirements
  • System Safety Plan
  • Preliminary Hazard List
  • Preliminary Hazard Analysis
  • Operating and Support Hazard Analyses
  • System Hazard Analyses
  • Fault Tree Analyses (FTA)
  • Probabilistic Risk Analysis (PRA)
  • Design and Development
  • SSE input into specification development and
    verification planning

Shivers, 2005
18
Some Production Phase SSE Tasks
  • Fabrication, integration, test and evaluation
  • SSE input into ground activities and verification
  • Test planning to validate safety features
  • Conducting tests safely

Shivers, 2005
19
Some Operations Phase SSE Tasks
  • Operations
  • SSE input into operations and performance
    validation (must be considered early as well)
  • Operation and Support Hazard Analyses
  • Analyses from the Human Factors Program

Shivers, 2005
20
Some Close Out SSE Tasks
  • Decommissioning, disposal, recycling
  • SSE inputs into process decisions


Shivers, 2005
21
NASA SMA Roles
  • SMA provides
  • SSE practitioners
  • Assurance that requirements are set and met
  • Development of disciplines and tools
  • SMA in-line engineering has a review, evaluation
    and concurrence role
  • The SMA assurance supports engineering,
    validation and verification, policy and planning,
    and independent assessments

22
System Safety Effort Throughout Project Lifecycle
  • Proposal Support
  • Requirements Definition
  • Design Assessment
  • Identification of Hazards
  • Recommended Hazard Controls
  • Assessment of Risk
  • Verification of Hazard Controls
  • Development of Safety Data Packages
  • Interface with KSC Range Safety
  • Safety Support during I&T Activities
  • Track Closure of Verification Items
  • Safety Certification
  • Prelaunch Safety Support

Goddard Space Flight Center, 2006
23
SUMMARY
  • System safety is involved throughout entire
    project lifecycle
  • Hazards to personnel or mission success are
    identified, eliminated or controlled to an
    acceptable level of risk
  • Effectiveness of hazard controls must be verified
  • Hazard analysis results and verification results
    are documented

Goddard Space Flight Center, 2006
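
The last two summary points (verify controls, document results) are often handled with a hazard log. The following is a minimal sketch of such a log, assuming a simple rule that a hazard can be closed only when it has at least one control and every control has been verified. The class names, fields and sample entries are hypothetical and are not drawn from the Goddard material.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Control:
    """A hazard control and whether its effectiveness has been verified."""
    description: str
    verified: bool = False


@dataclass
class Hazard:
    """One hazard log entry with its associated controls."""
    ident: str
    description: str
    controls: List[Control] = field(default_factory=list)

    def is_closed(self) -> bool:
        # A hazard with no controls is treated as open in this sketch.
        return bool(self.controls) and all(c.verified for c in self.controls)


def open_hazards(log: List[Hazard]) -> List[str]:
    """Return identifiers of hazards that still have verification open."""
    return [h.ident for h in log if not h.is_closed()]


# Hypothetical log entries.
log = [
    Hazard("H-1", "Battery overcharge",
           [Control("Charge limiter", verified=True)]),
    Hazard("H-2", "Propellant leak",
           [Control("Dual seals", verified=True),
            Control("Leak check before fueling", verified=False)]),
    Hazard("H-3", "Pinch point at hinge"),
]

print(open_hazards(log))
```

A list like the one returned by `open_hazards` is the sort of thing that would be reviewed before a safety certification milestone.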
24
Organizational Accidents
  • Rare, sometimes catastrophic, events that occur
    within complex modern technologies
  • Have multiple causes
  • Have devastating effects on uninvolved
    populations and things
  • Contrast with individual accidents that involve a
    person as often the victim and agent of the event
  • Difficult to understand and control
  • (James Reason, p 1)

25
Generic Cause of Organizational Accidents
  • All organizational accidents entail the
    breaching of the barriers and safeguards that
    separate damaging and injurious hazards from
    vulnerable people or assets, collectively termed
    losses
  • In individual accidents such defenses are often
    either inadequate or lacking
  • Three factors of breaching defenses
  • Human, technical, organizational
  • Governed by production and protection
  • (James Reason, p 2)

26
Unintended Consequences
  • conflicts between production and protection
    pressures tend to be resolved in favour of the
    former at least until a bad accident occurs.
  • efficient methods for work arise naturally
  • safety adds restrictions to procedures
  • rules become more restrictive over time
  • the scope of allowable actions is reduced
  • violation of procedure becomes necessary to
    accomplish the job
  • (James Reason, p 49)

27
Maintenance Can Seriously Damage Your System
  • it is often latent conditions created by
    maintenance lapses that either set the accident
    sequence in motion or thwart its recovery.
  • of the various possible error types associated
    with the reassembly, installation or restoration
    of components, omissions (the failure to carry
    out necessary steps in the task) comprise the
    largest single error type.
  • (James Reason, pp 85/6)

28
Some Well-known Accidents
  • USS Thresher sinking, 1963
  • QC of brazing, etc.: a quality problem is a
    safety problem
  • Poor design, overhaul followed by severe test
  • Quality - to prevent, not learn from catastrophe
  • Design, manufacturing, identify safety critical
    elements, test and verification, test planning
  • X-31 crash, 1995
  • Faulty Configuration Management
  • Pitot tube heaters not present in design
  • Failure to follow procedure, find process
    escapes, identify critical failures, verification
  • Idaho Falls nuclear reactor explosion, 1961
  • Poor maintenance procedures, on-the-fly process
    modifications, design flaws, QE supervision of
    work

NASA, 2006
29
Project and Systems Management
  • Were developed to manage in an emerging new
    environment
  • A multitude of government agencies, industrial
    firms and other organizations, sometimes on an
    international basis
  • Funds in the multimillion to billion dollar
    category
  • Complex technology sometimes reaching beyond the
    state of the art
  • Large forces of scientists, engineers,
    technicians and administrative personnel
  • Construction of extensive and highly specialized
    facilities

Rees
30
Apollo Program Characteristics
  • Program and systems management perspective
  • Technical risk trades with cost and schedule
  • Planning
  • Visibility
  • Management review
  • Configuration control
  • Penetration
  • Communication
  • Contracting philosophies
  • Organization
  • Authority, roles and responsibilities
  • Innovation
  • Goal focus
  • Continuous study and application of systems
    engineering
  • Relate actions to schedule and budget

31
Systems Aspects
  • Such projects of great magnitude and complexity
    had to be considered under the overall systems
    point of view
  • The Apollo Program had shortcomings, setbacks,
    and deficiencies during its execution all of
    which challenged the management
  • To assure success, minimize technical risks or
    actually mission risks
  • Keep closely to the time schedule
  • Wherever possible must engage in parallel rather
    than consecutive developments

Rees
32
Tight Budget Control and Highest Economy in Expenditure
  • Budget Controls
  • Subordinate to technical needs and the demands of
    the time schedule
  • There is a trade-off between acceptable technical
    risks or product quality, time schedule and
    project cost.
  • To eliminate the technical risk problem,
    frequently undue quality control or over-testing
    of hardware is applied which delays schedules and
    makes costs skyrocket.

Rees
33
Solid Planning
  • Master plans on hardware, software, and
    overall systems
  • Technical approaches
  • Resources such as facilities, manpower and funds
  • Schedules
  • Detailed breakdowns of the overall job and the
    system into subsystems

Rees
34
Visibility
  • Management at all levels should know almost in
    real time what is going on in the program
  • technical occurrences
  • schedule progress or delays
  • financial status
  • From the outset of the program, proper and
    effective channels and ways of communication have
    to be established on the government side between
    upper and lower echelons of management
  • Prime contractors must provide equally effective
    channels down to their respective subcontractors

Rees
35
Significance of Visibility
  • Enable management on all levels to predict
    trends in the progression of the program
  • Vital for taking corrective steps before the
    program runs into impediments
  • The capability of management to foretell trouble
    and thus avoid it by appropriate actions was one
    of the major cornerstones of the Apollo success.
  • Dr. Eberhard Rees
36
Review Milestones
  • Schedule review between government and prime
    contractors.
  • Apollo reviews, for instance, in a chronological
    sequence
  • Program Requirements Review (PRR)
  • Preliminary Design Review (PDR)
  • Critical Design Review (CDR)
  • Design Certification Review (DCR)
  • Pre-Delivery Turn-Over Review (PDTR)
  • Flight Readiness Review (FRR)
  • Countdown Demonstration Test and its Review
    (CDDT)


Rees
37
Significance of Reviews
  • Critically examine and assess the project status
  • Affirm the quality of the product and its
    reliability
  • Assure systems safety
  • Every review resulted in protocolled action
    items
  • Resolve problems
  • Authorize the go-ahead for the next increment of
    the overall plan.

Rees
38
Configuration Control
  • The contractor followed acceptable drawing room
    practice as to procedure and discipline
  • Design intentions were carried through
    manufacturing
  • Only mandatory changes were approved
  • The exact configuration, known down to the most
    minute detail was delivered to the launching site
  • Failures or unsuitable hardware or material could
    be traced down to the point of origin (Apollo
    management called this traceability)
  • Configuration control carried out in a strict
    sense is very expensive. It is, therefore, vital
    that these controls not be overdone and that they
    are wisely introduced to prime contractors and
    subcontractors.

Rees
39
Application of the Penetration Principle
  • Dr. Eberhard Rees on the Penetration Principle
  • It permeated through the contractor
    organization to the subcontractor structure.
    Spawned by this approach, improved failure
    analysis appeared throughout the system;
    in-process inspection was maintained at a high
    level; and receiving inspection techniques and
    effectiveness were improved, among other
    benefits.

40
Significance of Penetration
  • Improved Communication Channels
  • Created close interaction of highly dedicated,
    competent technical and scientific personnel, all
    motivated by the impressive challenge of a huge
    complex program, no matter whether they are
    government or contractor employees
  • Most instrumental in this government-contractor
    relationship was the establishment of resident
    personnel in the prime contractor plants

Rees
41
Contracting Principles
  • Cost-plus-fixed-fee contracts
  • Used because of the uncertainties of effective,
    close pricing in such a program with its many
    unknowns
  • Incentive fee contracts
  • A base fee of modest proportions
  • Plus a scaled or incentive segment awarded to a
    contractor for success in meeting program product
    requirements for performance, cost, and time
    schedule
  • Lends itself well to hardware contracts with
    reasonable, well-determined milestones, cost
    levels and schedule.
  • Award Fee contracts
  • Used where parameters are not easily
    distinguished in advance
  • Support service or engineering service contracts
  • Motivational in nature

Rees
42
Other Pertinent Principles
  • Organize and motivate to achieve effective high
    morale in the workforce
  • Delegate authority clearly, concisely and
    positively to achieve timely decisions
  • Apply innovative concepts and techniques
    courageously
  • Keep objectives pointed toward the goal
  • Require continuing study and application of the
    systems engineering approach
  • Relate actions to schedule and to budget
    continuously

Rees
43
The Apollo Management System
  • Our management system evolved after some
    painful experiences in the early days of Apollo.
    In fact, at the beginning of the program in 1961,
    there was no common system in existence within
    the rather young National Aeronautics and Space
    Administration. Then as the program gathered
    headway and matured, the management system became
    better defined, changing as necessary to keep
    pace with unfolding events. Early it was learned
    that in the environment of a big development
    project, there can be no static system. Change
    and evolution are inevitable.
    Dr. Eberhard Rees

44
Program Integration
  • Three categories of concern
  • First, there are the hardware, systems and
    subsystems specialists who devote attention to
    the delivery of items that are technically
    adequate and qualified for mission performance
  • Second, there are the specialists who approach
    the project from the point of view of controlling
    costs and schedules.
  • As the third organizational element in the
    grouping, there is the on-site resident
    management office. To assure that project
    management interests were advanced and that
    decisions were made and implemented within the
    designated scope of authority of the resident
    group.

Rees
45
Resident Management Offices
  • This resident element proved to be a most
    important link between government and contractor
    activities
  • To expedite decisions, the resident manager
    required functional support, which was provided
    by specialized, on-site contract administration
    and technical engineering staff
  • assigned from parent functional organizations of
    the responsible Center
  • could make decisions on the spot or commit the
    parent office or function at the Center (within
    well-established limits)

Rees
46
Significance of the Resident Management Office

  • Speed the project management process
  • Provide a dynamic interface with the contractor
    on a continuing day-to-day basis
  • Integrate technical and managerial personnel
  • The technical functions tend to strive primarily
    toward perfection to a degree that possibly
    inhibits adequate attention to manufacturing and
    launch schedules or cost. The contractor could
    well be oriented toward schedule, costs and
    profits, whereas the project manager might weigh
    concern more heavily on schedule and costs.
    Through the office of the resident manager, an
    automatic system of checks and balances developed
    to the end that each consideration received its
    appropriate share of attention.
Rees
47
Contractor Penetration
  • Contractor penetration is necessary to
    obtain visibility
  • There is an understandably strong desire on
    the part of industry to take the control and the
    funding and to do the job with but minor
    government intervention. The restiveness that
    stemmed from such close control gradually
    dissipated early in the Apollo Program as the
    benefits accruing from the industry-government
    teams approach were revealed. The manager must
    have control of competent technical and
    administrative staff in order to conduct
    activities efficiently.

Rees
48
Program Management
  • While centralized program management has many
    values, of prime importance is the assignment of
    all responsibility to single organizational
    management structures, pyramiding into a single
    strong personality. Of course with the
    responsibility, the manager must have
    commensurate authority to resolve technical,
    financial, production and other problems that
    otherwise require coordination and approval in
    separate channels at different echelons. And the
    manager must have clear, concise communications
    flowing in all directions.
  • Dr. Eberhard Rees

49
Conclusion
  • System Safety and Quality
  • necessary components of good program and systems
    management
  • very similar in their objectives, but with quite
    different tools and techniques
  • Must be applied early in the life cycle
  • Must be implemented religiously throughout
    program execution
  • Must be continuously examined and improved
  • Are complementary for safety and mission success

50
Acknowledgements (1 of 2)
  • A Brief Overview of Selected System Safety
    Analytical Approaches, R. R. Mohr, Jacobs
    Engineering, 2002.
  • Air Force System Safety Handbook, Air Force
    Safety Agency, July 2000.
  • Cost and Schedule: The Overlooked Hazards, Ron
    Stroup and Warren Naylor, Proceedings of the 19th
    International System Safety Conference, 2001.
  • Improving Performance of the System Safety
    Function at the Marshall Space Flight Center, Ed
    Kiessling and Herb Shivers, NASA Marshall Space
    Flight Center and Donald D. Tippett, The
    University of Alabama in Huntsville, Proceedings
    of the American Society for Engineering
    Management Conference, 9/2004.
  • Human Factors: A Personal Perspective, James
    Reason, Human Factors Seminar, Helsinki, 2006.
  • Managing the Risks of Organizational Accidents,
    James Reason, Ashgate, 1997 (9th reprint, 2005).
  • Quality 101, American Society for Quality, 2001.

51
Acknowledgements (2 of 2)
  • Safety and Mission Success, Technical Managers
    Training, Goddard Space Flight Center, 10/2006
  • Some general SSE information in this presentation
    was taken from works of Pat Clemens/APT Research,
    Huntsville, AL, and Ronnie Goodin/KSC, retired.
  • System Safety Engineering Awareness Training for
    NASA Managers and Engineers, (not yet released),
    2006.
  • System Safety Engineering Technical Warrant,
    Herb Shivers, presented to the NASA Technical
    Authority Conference, June 2005.
  • The Fifth Discipline: The Art and Practice of the
    Learning Organization, Peter Senge, Currency
    Doubleday, 1990 (1st edition), 1994 (paperback
    edition).
  • System Failure Case Studies, NASA, Office of
    Safety and Mission Assurance, Review and
    Assessment Division, 2006.