Critical Systems - PowerPoint PPT Presentation

About This Presentation
Title:

Critical Systems

Slides: 16
Provided by: XR7

Transcript and Presenter's Notes


1
Critical Systems
  • Critical Systems: systems whose failure
    may result in serious damage
  • Safety Critical: a failure may result in human
    injury or environmental damage, which may in turn
    injure humans or other living organisms (e.g. an
    embedded system monitoring a dialysis machine)
  • Mission Critical: a failure may result in not
    achieving an important goal (e.g. an unmanned
    spacecraft navigation system or an election poll
    tabulating system)
  • Business Critical: a failure may result in loss
    of business or loss of customers (e.g. a credit
    card payment system or a general ledger)

2
Areas of Concern for Critical Systems
  • Include the whole system
  • hardware: all the hardware components must be
    maintained and kept from failing (possibly
    even with redundant components)
  • power: all the utilities that service the system,
    such as electricity, must be available (possibly
    with a backup power supply system)
  • software: all the software components, from the
    operating system, database, network, and interfaces
    all the way to the actual application, must be
    operational (possibly with system backup and
    recovery functions that allow a restart from
    the last software checkpoint, or with user undo
    options)
  • user: all the users must be trained and know how
    to interface with the system under both normal
    and failure situations

3
Non-Dependable System
[Figure: progression of negative impacts]

4
Quality and Dependability
  • For Critical Systems, quality means dependability,
    which requires the following general
    characteristics
  • Availability: the system is up and
    operational when the user demands it.
  • Reliability: the system performs as stated
    and prescribed by the user and reference
    documentation.
  • Safety: the system performs without
    threatening the well-being of people
    and/or the environment.
  • Security: the system is resistant to accidental
    or intentional undesired intrusion

5
Reliability and Availability
  • Reliability, strictly speaking, is the
    characteristic that the system performs
    according to the specification.
  • Availability is the characteristic that the
    system is operational when requested.
  • Reliability may be viewed as a superset
    characteristic of Availability. But consider
    the following
  • consider a system A which is relatively
    unreliable, with a failure once a month, but which
    can be fixed and recover in 5 to 10 minutes (e.g. a
    transaction capacity problem that gets fixed as
    soon as more resources are allocated).
  • alternatively, consider a system B which is
    relatively reliable, with a failure once a
    year, but which cannot be fixed and recover in less
    than 2 days (e.g. a design error that escaped
    inspection and test).
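
The comparison between systems A and B can be made concrete with a small calculation. The sketch below is only an illustration: it assumes availability is measured as the fraction of a year the system is up, takes 10 minutes per failure for A and exactly 2 days per failure for B, and uses a hypothetical helper named `availability` that is not part of the original slides.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignoring leap years

def availability(failures_per_year, downtime_minutes_per_failure):
    """Fraction of the year the system is operational (hypothetical helper)."""
    downtime = failures_per_year * downtime_minutes_per_failure
    return (MINUTES_PER_YEAR - downtime) / MINUTES_PER_YEAR

# System A: fails once a month, assumed 10 minutes of downtime per failure.
avail_a = availability(failures_per_year=12, downtime_minutes_per_failure=10)
# System B: fails once a year, assumed 2 days of downtime per failure.
avail_b = availability(failures_per_year=1, downtime_minutes_per_failure=2 * 24 * 60)

print(f"A: {avail_a:.4%}  B: {avail_b:.4%}")
```

Under these assumptions, A fails twelve times as often as B yet accumulates far less downtime per year, so the less reliable system is the more available one.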

6
Other Considerations
  • We need to consider both the user environment and
    the users' (normal) usage pattern
  • if an error does not result in a defect in
    an area highly used by the users or in the
    normal path of the users, then it may never be
    detected. Thus the system may be considered
    available (for the highly used paths) and
    reliable (rarely running into a failure).
  • We also need to understand that not all
    system failures are the same. We need to consider
    the severity of the failure, not just the number
    of failures, in terms of the effect the failure
    has on the users' environment.
  • (your thoughts?) Does this mean that there is
    no need to fix problems that are not on the
    normal path, or that we should assign low
    severity levels to those bugs?

7
Some Metrics for Reliability
  • Mean Time to Failure (MTTF): the average time
    between failures.
  • Given observed failures f1 through fn, record
    the time from the start of operation to the first
    failure, t1, and all subsequent between-failure
    times, t2 through tn.
  • The mean of these between-failure times, (sum of
    t1 through tn)/n, is used as the mean time to
    failure measure and possibly as an estimator for
    the next failure time t(n+1).
  • Rate of Failure: the frequency of failure
    occurrences per unit of time.
  • Given a fixed unit of system usage time, the
    number of failures is recorded.
  • The rate of failure is expected to follow some
    kind of exponential curve where the rate is high
    at first and then decreases after all the normal
    paths are exercised and fixed.
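
The two metrics above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the function names and the sample between-failure times (in hours) are hypothetical.

```python
def mean_time_to_failure(inter_failure_times):
    """Mean of the observed between-failure times t1..tn."""
    if not inter_failure_times:
        raise ValueError("at least one observed failure time is required")
    return sum(inter_failure_times) / len(inter_failure_times)

def failure_rate(failure_count, usage_time):
    """Number of failures per unit of system usage time."""
    if usage_time <= 0:
        raise ValueError("usage_time must be positive")
    return failure_count / usage_time

# Hypothetical observations: hours of operation before each failure.
times = [10.0, 14.0, 21.0, 35.0]
mttf = mean_time_to_failure(times)            # (10 + 14 + 21 + 35) / 4
rate = failure_rate(len(times), sum(times))   # failures per hour over the period
print(mttf, rate)
```

In this sample the between-failure times grow, which matches the expectation that failures become less frequent once the normal paths have been exercised and fixed.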

8
Safety
  • Safety is the attribute where the system is
    capable of operating without threatening
    people or the environment.
  • Primary Safety-Critical: systems, such as
    embedded process control systems, whose failure
    in performing their tasks may cause direct injury
    to people or the environment.
  • Secondary Safety-Critical: systems whose failure
    introduces errors into other systems, such as
    populating a medical database with erroneous
    information, and thus indirectly causes injury to
    people or the environment.
  • A reliable system is critical to safety, but
    reliability is not always enough: a fault-tolerant
    system may be deemed reliable because it has
    redundancy capability, but it may still be unsafe
    if the switch-over time is too long
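
The last point can be illustrated with a simple check of the switch-over time against the longest outage the controlled process can tolerate. The function name and the timing figures below are hypothetical and chosen only to show the idea.

```python
def is_safe_failover(switch_over_seconds, max_tolerable_outage_seconds):
    """True if the backup takes over before the outage becomes hazardous."""
    return switch_over_seconds <= max_tolerable_outage_seconds

# Hypothetical figures: the redundant unit needs 8 s to take over,
# but the controlled process tolerates only 2 s without control.
reliable_but_unsafe = not is_safe_failover(8.0, 2.0)
print(reliable_but_unsafe)
```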

9
Some Causes of Unsafe Systems
  • Incomplete requirements specification, where not
    all the critical situations have been described.
  • Erroneous specification, where the critical
    condition is not characterized correctly.
  • External behavior (e.g. hardware) that causes an
    unanticipated condition for software. (It is
    almost impossible to describe all possible
    external conditions.)
  • User/operator error(s) that cause a combination
    of conditions resulting in an unanticipated
    case, leading to a malfunction

10
Safety-Criticality Needs
  • Hazard Avoidance: a mechanism that requires a
    duplicate step before committing, such as asking
    the user to re-confirm a command before
    actually deleting a file or activating a process.
  • Hazard Detection and Removal: detecting the
    hazard and removing it before a potential
    accident, such as a process control system where
    constant pressure gauging can detect potential
    problems
  • Damage Limitation: minimizing damage from an
    accident that has happened, such as the automatic
    stoppage of elevators when a fire is detected by a
    building maintenance system
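
The three needs above can each be sketched as a small function. Everything here is an illustrative assumption: the function names, the `confirm` callback, the pressure limit, and the elevator commands are hypothetical and not taken from any real control system.

```python
def delete_file(path, confirm):
    """Hazard avoidance: act only after the user re-confirms the command."""
    if not confirm(f"Really delete {path}?"):
        return False
    # A real implementation would remove the file here (e.g. os.remove(path)).
    return True

def pressure_hazard(readings, limit):
    """Hazard detection: index of the first reading above the limit, or None."""
    for index, value in enumerate(readings):
        if value > limit:
            return index
    return None

def elevator_command(fire_detected):
    """Damage limitation: stop the elevators once a fire is detected."""
    return "stop" if fire_detected else "run"

# The user declines the confirmation, so nothing is deleted.
deleted = delete_file("report.txt", confirm=lambda prompt: False)
# The third reading exceeds the assumed limit of 3.0.
first_bad = pressure_hazard([1.0, 1.2, 3.5, 1.1], limit=3.0)
print(deleted, first_bad, elevator_command(True))
```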

11
Security
  • Security is an attribute of the system that
    addresses the extent to which the system protects
    itself from external attacks.
  • Non-Malicious Intention
  • Authorized access to the system for only
    limited functions, which expands to functions
    beyond those limits, but no harm is done
    to the system or the owner of the system.
  • Unauthorized access to the system which violates
    privacy, but does no harm to the system
    itself.
  • Malicious Intention
  • Authorized or unauthorized access to the system
    followed by intentionally performing activities
    which result in harming the system or the owner
    of the system.

12
Security Violation Damages
  • Denial of Service: the normal processing of the
    system is disabled by saturating the system
    capacity or tying up all system resources;
    thus availability becomes a problem.
  • Corruption of Data or Program: the data or the
    program logic of the system may be altered, which
    makes the system unreliable and, possibly,
    unsafe.
  • Exposure of Information: the information managed
    by the system is made available to unauthorized
    parties, which can lead to harming both the system
    and the people, thus making the system unreliable
    and unsafe.

13
Security Critical Systems Need
  • Vulnerability Avoidance: the disabling of
    potentially exploitable weaknesses of the system,
    such as not directly connecting to external
    networks or encrypting the data
  • Attack Detection and Neutralization: the
    inclusion of functions to detect an attack and to
    remove the intrusion before any harm is done, such
    as a virus checker that looks at attachments and
    strips them if there is any possibility of a
    threat.
  • Exposure Limitation: the minimization of harm if
    an exposure (successful attack) takes place, such
    as instituting a regular system backup policy that
    allows system recovery when needed.
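
The attachment-stripping example under Attack Detection and Neutralization can be sketched as a filter over attachment names. This is a deliberately simplified assumption: the blocklist and the function name are hypothetical, and a real virus checker inspects file contents rather than relying on extensions alone.

```python
RISKY_EXTENSIONS = {".exe", ".js", ".vbs", ".scr"}  # assumed blocklist

def strip_risky_attachments(filenames):
    """Split attachment names into (kept, stripped) by file extension."""
    kept, stripped = [], []
    for name in filenames:
        dot = name.rfind(".")
        extension = name[dot:].lower() if dot != -1 else ""
        if extension in RISKY_EXTENSIONS:
            stripped.append(name)
        else:
            kept.append(name)
    return kept, stripped

kept, stripped = strip_risky_attachments(["notes.txt", "invoice.EXE", "README"])
print(kept, stripped)
```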

14
Metrics for Safety and Security
  • There are no clear metrics for safety and security,
    except for listing a set of approaches that
    define some measurements for levels of
    protection
  • avoidance
  • detection and removal
  • limitation of damage and recovery

15
Back to Requirements
  • The attributes of a critical system need to be
    stated in the requirements specification under
    the non-functional category.
  • For reliability and availability, numerical
    metrics are preferred.
  • For safety and security, the description of some
    level of protection needs to be stated, possibly
    also in the functional category.