Title: Digital Communication Assessment For Highly Integrated Control Rooms
1. Digital Communication Assessment For Highly Integrated Control Rooms
- Richard T. Wood
- David E. Holcomb
- Oak Ridge National Laboratory
- Presented at the IAEA Technical Meeting on Integrating Analog and Digital I&C Systems in Hybrid Main Control Rooms at Nuclear Power Plants
- Toronto, Canada
- October 28-November 2, 2007
Opinions and conclusions expressed by the author
do not necessarily represent positions endorsed
by NRC.
2. Control Room Designs For New Proposed Nuclear Power Plants Extensively Employ Digital Network Communications
- Digital communications are potentially beneficial to plant safety and efficiency and are necessitated by analog system obsolescence/unavailability
- Digital communications are more complicated than their analog predecessors and introduce additional failure modes requiring regulatory modernization
- Interim staff guidance has been developed and regulatory research is ongoing
- Interim staff guidance available at
  - http://www.nrc.gov/reading-rm/doc-collections/isg/digital-instrumentation-ctrl.html
3. Interdivisional And Nonsafety-Related-To-Safety Communication Present New Issues
- Current guidance documents (IEEE 603 and IEEE 7-4.3.2) already address intradivisional communication
- Review guidance and acceptance are required for
  - Communication among safety divisions,
  - Communication from a nonsafety console to safety divisions,
  - Communication from nonsafety systems to safety equipment,
  - Communication from a multidivisional safety console, and
  - Connection of nonsafety programming, maintenance, and test equipment to multiple safety divisions
4. Lessons Learned, Accepted Consensus Practices, And Analysis of Credible Failure Mechanisms Establish The Basis For Enhanced Guidance
- Review of relevant experience and emerging regulatory practice from international reactors contributes to the lessons learned
- IEEE and IEC standards as well as best practices from industrial applications provide the accepted consensus practices
- Analysis of communication failures for high-integrity digital communications provides the credible failure mechanisms
5. Communication With Any External Source Should Not Inhibit a Safety Division From Performing Its Designated Function
- Two primary failure scenarios
  - Failure to communicate necessary data when it is needed
  - Communication of erroneous information
- Failures and errors can appear in numerous places along the path from source to receiver
  - Source-generated errors,
  - Communication/transmission channel generated errors (including interposed bridges and routers),
  - Receiver-generated errors, and
  - System-wide, component interaction generated errors
6. Digital Communications Are Typically Configured In One Of Three Forms
- Bus: an array of parallel conductors
  - One module controls which module can put information on the bus
  - Typically only exists within one division
- Serial: point-to-point wiring (fiber)
  - Typically employed for communication between computing level of the system and the sensing and actuation level
  - Information tends to be device-specific, fixed format
- Network: connection to many devices
  - Serial in nature, but allows messages to many receivers
7. Greater Network Complexity Increases The Difficulty Of Assessing Network Reliability
- High reliability is a primary requirement for safety networks
- Flexibility, adaptability, and wide area coverage with many nodes are not needed for NPP safety critical systems
- Point-to-point interconnection provides high reliability through simplicity
- More complex network topologies can provide higher reliability through redundant and diverse links
  - Fault tolerance: redundant link in the event of a failure
  - Fault detection: comparison of transmissions through multiple links
  - Fault removal: reconfiguring transmission around failed links
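
As a rough illustration of fault detection by comparing transmissions through multiple links, the sketch below votes on the copies of one message that arrive over redundant links. The function name `vote` and the two-of-three quorum are assumptions made for this example; the presentation does not prescribe a voting scheme.

```python
from collections import Counter
from typing import Optional, Sequence


def vote(copies: Sequence[Optional[bytes]], quorum: int = 2) -> Optional[bytes]:
    """Return the payload seen on at least `quorum` links, else None.

    `copies` holds one entry per redundant link; None marks a link that
    delivered nothing (a failed or silent link).
    """
    received = [c for c in copies if c is not None]
    if not received:
        return None
    payload, count = Counter(received).most_common(1)[0]
    return payload if count >= quorum else None


# One link delivers a corrupted copy: the two agreeing copies win.
agreed = vote([b"TRIP", b"TRIP", b"TRXP"])
# One link is silent and the survivors disagree: no value can be trusted.
undecided = vote([b"TRIP", None, b"TRXP"])
```

A disagreement that cannot be resolved is reported as `None` so that the caller can fall back to a predefined safe action instead of guessing.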
8. Communication Networks Have a Wide Range of Potential Vulnerabilities
- Babbling idiot (commission fault)
- Inconsistency (Byzantine generals problem)
- Excessive jitter
- Buffer overflow
- Data out of range
- Incorrect ordering
- Message too early
- Very long delays in bridges and routers
- Very long times to initiate communications
- Corruption
- Unintended repetition
- Incorrect sequence
- Loss
- Unacceptable delay
- Insertion
- Masquerade
- Addressing
- Broadcast storm
9. Network Defensive Measures Are Available That Address Each Vulnerability
- Sequence number
- Time stamp
- Timeout (for example, watchdog)
- Source and destination identifier
- Feedback message (acknowledgments and echoes)
- Identification procedure
- Safety code (CRC, cyclic redundancy check)
- Cryptographic techniques
- Redundancy (replication)
- Membership control
- Atomic broadcast
- Time-triggered architecture
- Bus guardian
- Prioritization of messages
- Inhibit times
- Hamming distance applied to node addresses or
message identifiers
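
Several of the measures listed above can be combined in one receiver-side check. The sketch below is a minimal illustration, assuming a hypothetical frame layout (source identifier, destination identifier, 16-bit sequence number, time stamp, payload, CRC-32) and a hypothetical 100 ms freshness limit; none of these specifics come from the presentation.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import Tuple

# Assumed header: source id, destination id, sequence number, send time (s).
HEADER = struct.Struct(">BBHd")


def encode(src: int, dst: int, seq: int, t_sent: float, payload: bytes) -> bytes:
    """Build a frame and append a CRC-32 safety code over header and payload."""
    body = HEADER.pack(src, dst, seq, t_sent) + payload
    return body + struct.pack(">I", zlib.crc32(body))


@dataclass
class Receiver:
    my_id: int
    expected_src: int
    max_age_s: float = 0.1  # assumed freshness limit
    next_seq: int = 0

    def accept(self, frame: bytes, now: float) -> Tuple[str, bytes]:
        """Return ("ok", payload) or (reason, b"") for a rejected frame."""
        if len(frame) < HEADER.size + 4:
            return ("corruption", b"")
        body = frame[:-4]
        (crc,) = struct.unpack(">I", frame[-4:])
        if zlib.crc32(body) != crc:
            return ("corruption", b"")  # safety code mismatch
        src, dst, seq, t_sent = HEADER.unpack(body[:HEADER.size])
        if src != self.expected_src or dst != self.my_id:
            return ("addressing", b"")  # wrong source or destination
        if seq != self.next_seq:
            return ("sequence", b"")  # loss, repetition, or reordering
        if not 0.0 <= now - t_sent <= self.max_age_s:
            return ("timing", b"")  # too early or too late
        self.next_seq = (seq + 1) % 65536
        return ("ok", body[HEADER.size:])
```

A rejected frame leaves the expected sequence number unchanged, so a later valid frame with that number is still accepted.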
10. Communications Isolation Is Key For Interconnections Among Systems Of Different Safety Class
- IEEE 7-4.3.2 recommends erecting barriers as an alternative to requiring all communications components be safety-grade
  - Only permissible isolator failure mode is disconnection from nonsafety network
  - Planned revision to provide more detailed guidance
- Key feature of communications isolation is the use of a separate communications processor with structured access to dual-port memory with safety function processor
  - Ensures that normal execution is not impeded by attention to external communication duties
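
The dual-port memory arrangement can be pictured with a software analogy. The sketch below is only a model, assuming a hypothetical two-buffer mailbox of fixed size: the communications processor fills the inactive buffer and then publishes it, while the safety function processor reads the active buffer without ever waiting. Real dual-port memory is a hardware component, and the class and method names here are invented for illustration.

```python
class DualPortMailbox:
    """Two fixed-size buffers; the writer fills the inactive one, then publishes it."""

    def __init__(self, size: int) -> None:
        self._size = size
        self._buffers = [bytes(size), bytes(size)]
        self._active = 0  # index of the buffer the reader sees

    def publish(self, data: bytes) -> bool:
        """Called by the communications processor.

        Structured access: only records of the agreed size are accepted.
        """
        if len(data) != self._size:
            return False
        inactive = 1 - self._active
        self._buffers[inactive] = bytes(data)
        self._active = inactive  # a single index update exposes the new record
        return True

    def read(self) -> bytes:
        """Called by the safety function processor; never blocks or waits."""
        return self._buffers[self._active]


mailbox = DualPortMailbox(size=4)
mailbox.publish(b"\x00\x01\x02\x03")
latest = mailbox.read()
```

The reader always sees the last complete record, so a stalled or misbehaving writer cannot hold up the safety function's execution cycle in this model.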
11. NPP Safety Systems Require Only A Limited Set Of Message Types
Network Topology Changes With System Status
- Software Coding (Programming Updates)
- Set Points and Parameters
- Command Functions
- Go/No-Go (Interlocks)
- Data Transfer
- System Health Check
12. Different Nuclear Power Regulatory Bodies Employ Different Safety-System Classification Schemes
- U.S. has the most coarsely graduated classification scheme
Based on table from several IAEA TECDOCs (e.g., IAEA-TECDOC-1066). Does not represent precise relationships among the various categories in the standard.
13. Some Regulatory Practices Are Common Among Regulatory Bodies For Current Generation Plants
- For all but the simplest communications, the highest class safety system must be in bypass to accept command access
- Communications with the highest class of safety system, while on-line, are of limited extent and always from a high-quality, regulated system, but not necessarily the highest class system
- Communication from systems of the highest safety class, while on-line, is accomplished via buffered, one-way communication nodes
- Physical access restrictions are required for implementing changes to safety system performance
  - Prevents on-line modification of a safety system
  - Restricts access to more than one safety division concurrently
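
The bypass and single-division rules above can be expressed as a simple access gate. The sketch below is a hypothetical model, not a description of any plant system: the class `DivisionAccessGate` and its methods are invented names, and the physical access restrictions the slide refers to are reduced here to two logical checks.

```python
from typing import Dict, Iterable, Optional


class DivisionAccessGate:
    """Grant command access to one safety division at a time, only in bypass."""

    def __init__(self, divisions: Iterable[str]) -> None:
        self._bypassed: Dict[str, bool] = {d: False for d in divisions}
        self._open_division: Optional[str] = None

    def set_bypass(self, division: str, bypassed: bool) -> None:
        self._bypassed[division] = bypassed
        # Returning a division to service also closes its command access.
        if not bypassed and self._open_division == division:
            self._open_division = None

    def open_access(self, division: str) -> bool:
        if not self._bypassed[division]:
            return False  # no on-line modification of a safety system
        if self._open_division not in (None, division):
            return False  # another division is already open
        self._open_division = division
        return True

    def close_access(self) -> None:
        self._open_division = None
```

For example, with divisions "A" and "B" both in bypass, `open_access("A")` succeeds and a following `open_access("B")` is refused until `close_access()` is called.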
14. GE ABWR I&C System Design Employs A Highly Integrated Digital Communication Architecture
15. Priority Logic Devices Are Employed For Control Of Devices With Both Safety and Nonsafety Functions
- Prioritization logic required to ensure that safety commands take priority over nonsafety commands
- Local feedback safety commands take priority over remote safety commands
  - For example, inhibit valve demands after saturation occurs
Olkiluoto 3 priority and actuation module
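
The priority ordering described on this slide can be sketched as a fixed-priority selector. This is a minimal illustration under assumed names (`select_command` and its three inputs are hypothetical); actual priority modules are typically implemented in dedicated hardware or simple logic devices, not general-purpose software.

```python
from typing import Optional


def select_command(
    local_feedback: Optional[str],
    remote_safety: Optional[str],
    nonsafety: Optional[str],
) -> Optional[str]:
    """Return the highest-priority command that is present.

    Priority order, highest first: local feedback safety command,
    remote safety command, nonsafety command. None means "no demand".
    """
    for command in (local_feedback, remote_safety, nonsafety):
        if command is not None:
            return command
    return None


# A nonsafety "OPEN" demand loses to a remote safety "CLOSE" demand.
chosen = select_command(None, "CLOSE", "OPEN")
# A local feedback "INHIBIT" overrides both remote demands.
inhibited = select_command("INHIBIT", "OPEN", "OPEN")
```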
16. Systematic Approach to Assessing the Acceptability of Digital Communication Systems Addresses Their Vulnerabilities
- Evaluate the effect of the communication on the safety function(s)
  - Functional dependence
- Determine the capability of the architecture to avoid or withstand the occurrence of credible faults
  - Isolation execution dependence
- Failures which trigger dependencies result in
  - Incorrect performance of the safety function or
  - Interruption of safety function execution
17. Structured Review Process Is Required To Evaluate Digital Safety Communications
- Communication interconnections can
  - Compromise the independence of safety functions
  - Provide a propagation path for errors among systems,
  - Introduce new failure modes
- Increased vulnerability necessitates a comprehensive assessment of the nature of the communication, the implementation approach, as well as the credible vulnerabilities and corresponding mitigation approaches
  - Identify architecture, network topology, and key characteristics
  - Check for known vulnerabilities to define a credible set
  - Assess the application of defensive and mitigation strategies
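
The last two review steps, defining a credible set of vulnerabilities and assessing the applied defenses, lend themselves to a coverage check. The sketch below is illustrative only: the function `uncovered` is a hypothetical helper, and the pairing of vulnerabilities with defensive measures in `ADDRESSED_BY` is an assumption made for the example, not an assessment taken from the presentation.

```python
from typing import Dict, List, Set

# Assumed example pairing of vulnerabilities with measures that address them.
ADDRESSED_BY: Dict[str, Set[str]] = {
    "corruption": {"safety code"},
    "loss": {"sequence number", "timeout"},
    "unintended repetition": {"sequence number", "time stamp"},
    "unacceptable delay": {"time stamp", "timeout"},
    "masquerade": {"identification procedure", "cryptographic techniques"},
}


def uncovered(credible: Set[str], applied: Set[str]) -> List[str]:
    """List credible vulnerabilities with no applied defensive measure."""
    return sorted(
        v for v in credible
        if not (ADDRESSED_BY.get(v, set()) & applied)
    )


gaps = uncovered(
    credible={"corruption", "loss", "unacceptable delay", "masquerade"},
    applied={"safety code", "sequence number"},
)
# gaps == ["masquerade", "unacceptable delay"]
```

A vulnerability that is missing from the table is reported as uncovered, which errs on the side of prompting further review.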