Design of Memory Systems for Spaceborne Computers


1
Design of Memory Systems for Spaceborne Computers
  • Richard B. Katz
  • NASA Office of Logic Design
  • Flight Software Workshop 2007 (FSW-07)
  • November 5-6, 2007
  • Laurel, MD

2
Memory Classification
  • While normally associated with computers, some of
    the concepts in this paper also apply to the
    configuration memory of FPGAs.
  • Fixed
  • The contents of the memory are physically fixed
    by the structure of the memory element.
  • Examples: core rope memories (wire wound through
    or around a core), fusible link PROMs, and
    antifuse-based PROMs.
  • Erasable
  • The contents of the memory are non-volatile, like
    the fixed memories, but the contents can be
    changed. In many cases this involves an erase
    operation and then a write.
  • Examples: core, plated wire, electrically
    erasable programmable read only memories
    (EEPROM), erasable programmable read only
    memories (EPROM), ferroelectric memories, and
    flash. The "ROM" in EPROM and EEPROM is a poor
    part of the name, as it implies permanence, which
    is incorrect. Devices such as EEPROM may need
    refreshing over long missions, as many are rated
    with a 10-year storage lifetime, giving them
    dynamic characteristics.
  • Volatile
  • The contents of the memory are volatile: they do
    not retain contents either after the cycling of
    power or during brownout conditions. This
    class is subdivided into two subclasses: static,
    which will retain state indefinitely, and dynamic,
    where the memory must be read and subsequently
    refreshed.
  • Examples include SRAM, DRAM, and SDRAM.

3
Requirement: Design Against Any Credible
Off-Nominal Event
  • These Events Are Considered Both Credible and
    Likely
  • Power Transitions and Disruptions
  • Power Up Transient
  • Power Down Transient
  • Glitches or brownouts on power lines
  • Software Faults
  • Cell and Device Failure
  • Asynchronous Reset
  • Some observations
  • Difficult to design against and many current
    designs do not properly protect against brownouts
    and the power down transient.
  • FPGA-based control signals

4
Software Faults
  • Consider the likelihood of a software fault to be
    100%.
  • Device Protection
  • Many erasable devices implement software write
    protection to protect against inadvertent writes
    to the memory.
  • JEDEC has published a standard on this type of
    protection.
  • Do not keep the keys to unlock the memory
    on-board unless absolutely necessary.
  • Subsystem Protection
  • System-level write protection limits, implemented
    in hardware, protect against software faults.
  • Some systems implement this in software, which is
    risky; see bullet 1 above.
  • Use external hardware discrete command as an
    additional barrier to prevent inadvertent writes.

5
Cell and Device Failure: General Guidelines to be
Tailored for Each Mission and Application
  • High-reliability, radiation-hardened CMOS RAM and
    PROM is available.
  • Designing against cell and device failure should
    be consistent with mission rules on single point
    failures.
  • Examine the "radiation-hardened" label carefully,
    as some devices marked as such are in fact SEU
    soft.
  • Commercial off-the-shelf (COTS) and Single Event
    Upset (SEU) soft devices should have parity for
    error detection or error detection and correction
    (EDAC) circuits, as required for the application.
  • Analyze and test devices for lockup states.
    These can occur in many memory types from illegal
    loads into command registers, poor signal
    integrity, poor power quality, or an SEU. Some
    device lockup states require power cycling to
    clear.
  • Consider the likelihood of an EEPROM or flash
    device fault to be 100%. There are enough
    failures in the industry to justify such an
    approach.

6
Some Component Considerations: Non-volatile Memory
Lockup
SEFI data for the R1701L PROM: This "stuck at"
mode, not necessarily 0, requires power cycling
of this serial device to clear [5]. See also
[6] and other reports for similar results.
SEE Test Results for AT28C010 (EEPROM) [4]: Types
I and II are Single Event Functional Interrupts
(SEFI) and required power cycling to restore
functionality. Errors can be multi-bit,
defeating SEC/DED EDAC schemes.
Some, but not all, non-volatile memory components
can enter lockup states and become stuck,
requiring the cycling of power to restore
functionality. Careful system consideration is
needed when using such devices, with regard to
error detection and clearing, protection of
device I/O pins, and loss of system functionality
and propagation of errors until recovery is
achieved.
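The remark that multi-bit errors defeat single-error-correcting, double-error-detecting EDAC can be illustrated with a toy code. The sketch below is an extended Hamming (8,4) code written for illustration only; it is not the EDAC of any flight system discussed here, and the function names are hypothetical. It corrects one flipped bit, flags two, and silently miscorrects three.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    c = [0] * 8                       # c[1..7] Hamming positions, c[0] overall parity
    c[3], c[5], c[6], c[7] = d
    c[1] = c[3] ^ c[5] ^ c[7]         # covers positions 1, 3, 5, 7
    c[2] = c[3] ^ c[6] ^ c[7]         # covers positions 2, 3, 6, 7
    c[4] = c[5] ^ c[6] ^ c[7]         # covers positions 4, 5, 6, 7
    c[0] = sum(c[1:]) % 2             # makes the total number of ones even
    return c


def decode(codeword):
    """Return (nibble, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    c = list(codeword)
    syndrome = 0
    for pos in range(1, 8):
        if c[pos]:
            syndrome ^= pos           # XOR of the positions holding a one
    overall_odd = sum(c) % 2 == 1
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        c[syndrome] ^= 1              # treated as a single error (0 = parity bit itself)
        status = "corrected"
    else:
        status = "uncorrectable"      # even parity but nonzero syndrome: double error
    nibble = c[3] | (c[5] << 1) | (c[6] << 2) | (c[7] << 3)
    return nibble, status


def flip(codeword, positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(codeword)
    for p in positions:
        out[p] ^= 1
    return out
```

With three upset bits the decoder reports a successful correction while returning wrong data, which is the hazard the test results above point to.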
7
Some Component Considerations: Synchronous DRAM
(SDRAM) Lockup
BURST LENGTH
  A2 A1 A0   M3 = 0      M3 = 1
  0  0  0    1           1
  0  0  1    2           2
  0  1  0    4           4
  0  1  1    8           8
  1  0  0    RESERVED    RESERVED
  1  0  1    RESERVED    RESERVED
  1  1  0    RESERVED    RESERVED
  1  1  1    FULL PAGE   RESERVED
Examination of a command field, Burst Length, for
a Load Mode Register command for one SDRAM type.
[Chart: Loss of functionality for the Hyundai 256M
SDRAM (Auto Refresh Operation Mode) [7]]
SDRAMs contain finite state machines and some
models may lock up, requiring the cycling of
power, if RESERVED commands are loaded. For some
models, this can result in potential damage to a
device. Other ways of entering illegal and
potentially damaging states are an SEU, as
shown in the chart on the right, an error in the
controlling device, poor signal integrity, or poor
power quality. Careful system consideration is
needed when using such devices, with regard
to error detection and clearing, spare
replacement devices in the event of damage, and
loss of system functionality and propagation of
errors until recovery is achieved.
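One defensive measure suggested by the burst-length table on this slide is to reject RESERVED encodings before a Load Mode Register command is ever issued. The sketch below is a minimal illustration of that check; the function name and the dictionary layout are hypothetical, and only the burst-length field shown on the slide is covered.

```python
# Legal burst-length encodings from the slide's table, keyed by M3 and then
# by the A2..A0 field taken as a 3-bit integer. Anything absent is RESERVED.
BURST_LENGTH = {
    0: {0b000: 1, 0b001: 2, 0b010: 4, 0b011: 8, 0b111: "FULL PAGE"},
    1: {0b000: 1, 0b001: 2, 0b010: 4, 0b011: 8},
}


def check_burst_field(a2_a0, m3):
    """Return the burst length for a legal encoding; raise ValueError otherwise."""
    if not 0 <= a2_a0 <= 7 or m3 not in (0, 1):
        raise ValueError("field out of range")
    try:
        return BURST_LENGTH[m3][a2_a0]
    except KeyError:
        raise ValueError(
            f"RESERVED burst-length encoding A2..A0={a2_a0:03b}, M3={m3}"
        ) from None
```

A check like this only guards against a bad value from the controlling logic; it does nothing for an SEU inside the device itself.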
8
Asynchronous Reset
  • Consider the system effects on the memory
    subsystem from an asynchronous reset.
  • Power disruptions, as discussed above, are
    included here.
  • Reset either from another on-board computer or a
    ground command, perhaps in an attempt to clear a
    fault.
  • Will write cycles be aborted while being set up
    or in process, leaving a non-volatile memory in
    an undefined state or altering RAM contents so
    that they are no longer valid for a warm boot?
  • Hardware memory controllers
  • Flight software, which some systems use to
    generate sequences and timing for non-volatile
    memories.
  • Will hardware operations be given time and energy
    to complete on-going operations? Many
    non-volatile memory devices take on the order of
    10 ms to complete.

9
Some Recommendations
  • Boot and Safe-Hold Code
  • High-reliability, radiation-hardened, fixed
    memories should normally be employed for boot and
    safe-hold functions.
  • For applications such as instruments, DMA
    functions, properly implemented, can load
    memories with boot code. In this case, the
    instrument should be safed by hardware logic.
  • DMA functions should not require any operational
    software. A hardware discrete command to clamp a
    processor into reset is also recommended.
  • Hardware discrete commands should be used for
    switching critical memory banks, not software.
  • Systems should require the minimum of resources
    to function to enhance the probability of
    survival in the presence of either faults or
    off-nominal events.

10
Saturn V Launch Vehicle Duplex Memory
Each of the two core memory units was accessed in
parallel and each contained parity. If an error
was detected in the memory unit currently
designated as prime, then data from the secondary
unit was used with the secondary unit now given
the prime designation. Hardware automatically
wrote corrected data upon the detection of an
error.
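The duplex scheme described above can be sketched in a few lines. This is a behavioral model only: the class and method names are hypothetical, odd parity is an assumption (the slide says only that each unit contained parity), and the real mechanism was implemented in hardware.

```python
def odd_parity_bit(word):
    """Parity bit that makes the total number of ones odd (assumed parity sense)."""
    return 0 if bin(word).count("1") % 2 == 1 else 1


def parity_ok(word, parity):
    return (bin(word).count("1") + parity) % 2 == 1


class DuplexMemory:
    """Two memory units written in parallel; one is designated prime."""

    def __init__(self, size):
        blank = (0, odd_parity_bit(0))
        self.units = [[blank] * size, [blank] * size]
        self.prime = 0

    def write(self, addr, word):
        entry = (word, odd_parity_bit(word))
        for unit in self.units:
            unit[addr] = entry

    def read(self, addr):
        secondary = 1 - self.prime
        word, parity = self.units[self.prime][addr]
        if parity_ok(word, parity):
            return word
        good_word, good_parity = self.units[secondary][addr]
        if not parity_ok(good_word, good_parity):
            raise RuntimeError("parity error in both units")
        # Write corrected data back to the failed unit, then swap designations.
        self.units[self.prime][addr] = (good_word, good_parity)
        self.prime = secondary
        return good_word
```

A single parity bit detects only an odd number of flipped bits per word, so the model inherits that limitation.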
11
Apollo Guidance Computer
The advantages of the ropes are numerous. The
program, once wired in, cannot be electrically
altered, a substantial asset for mission
reliability. [2]
The permanent memory requires very few active
components and very little power to operate. It
also has properties that make it indestructible
short of mechanical damage, that is, there is no
inflight failure of any kind that can destroy
this part of the memory.
In case of inflight failure that destroys the
information in this erasable memory the
computation can be restarted by reading in only a
very few words. [3]
Memories in the AGC were single string; each
memory used a parity bit for error detection.
Fixed storage was core rope, a permanent memory
technology, with coincident current core
implementing erasable memory. Involuntary
instructions, which operated as an interrupt and
not under program control, could shift data into
specific words of memory. Data could also be
entered via the astronauts' keyboard and the
"PACE" digital command system before launch. [3]
12
Galileo Attitude Control Computer
[Diagram labels, each appearing twice: RTG Power
For Keep-Alive, CMOS Memory Array, ROM, GSE/DMA,
Arbiter/Controller, CDH/DMA]
Memory units were accessed one at a time. There
was no parity and RAM contents were protected by
write protect registers and monitored by
checksums in the background. Primary and
secondary memory designs were switched via a
discrete command. ROM contents implemented
safe-hold mode. DMA was functional either with
the processor clamped in reset or executing
flight software. A heartbeat was sent to the
CDH via DMA.
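The combination of write protect registers and background checksums can be modeled roughly as follows. Everything specific here is an assumption made for illustration: the block granularity, the simple additive 16-bit checksum, and the class and method names. The slide states only that both mechanisms were used.

```python
class ProtectedRam:
    """RAM divided into blocks, each with a write-protect flag and a reference checksum."""

    def __init__(self, size, block_size):
        self.mem = bytearray(size)
        self.block_size = block_size
        blocks = size // block_size
        self.write_protect = [False] * blocks
        self.reference = [0] * blocks

    def checksum(self, block):
        start = block * self.block_size
        return sum(self.mem[start:start + self.block_size]) & 0xFFFF

    def write(self, addr, value):
        """Store a byte unless the containing block is protected; report success."""
        if self.write_protect[addr // self.block_size]:
            return False
        self.mem[addr] = value & 0xFF
        return True

    def protect(self, block):
        """Record the block's checksum and inhibit further writes to it."""
        self.reference[block] = self.checksum(block)
        self.write_protect[block] = True

    def background_scan(self):
        """Return the protected blocks whose contents no longer match their checksum."""
        return [b for b, locked in enumerate(self.write_protect)
                if locked and self.checksum(b) != self.reference[b]]
```

The write-protect flag stops an errant store, while the scan catches changes that bypass the write path, such as a radiation-induced upset.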
13
Single String Computer A
Single Board Computer
[Conceptual diagram labels: µP; Logic Device;
Command to the flight software; Boot Code in each
of EEPROM Module 1, EEPROM Module 2, and EEPROM
Module 3]
Code is redundantly stored in three EEPROM
modules. Switching between copies is implemented
in software, and all software must be running to
be able to accept and process the command to
switch images. The critical boot code and
interrupt vectors cannot be made fault tolerant
in this software-centric architecture.
Simplified software-centric architecture.
Switching between critical boot sections is done
by software, leaving single point failures in
this architecture. There is no parity or EDAC.
14
Single String Computer B
These two computers are based on the same base
SBC but reflect different engineering approaches.
Single Board Computer
[Conceptual diagram labels: µP; Hardware command
selects between one of two spare modules; Hardware
command for either on- or off-board boot code
selection; Boot Code in each of EEPROM Module 1,
EEPROM Module 2, and EEPROM Module 3]
Code is redundantly stored in three EEPROM
modules. Switching between copies is implemented
in hardware by an external discrete command.
Simplified hardware-centric architecture.
Switching between critical boot sections is done
by hardware discretes, eliminating the EEPROM as
a single point failure. Common mode EEPROM
failure modes do remain.
15
LOLA Memory
  • LOLA is the Lunar Orbiter Laser Altimeter which
    will fly on the Lunar Reconnaissance Orbiter.

16
LOLA Memory Breakdown
[Figure: memory map of the EEPROM, SRAM, and PROM.
Size labels: 128 kbytes, 128 kbytes, and 32 kbytes.
Region labels: Boot, Code, Margin, and Data, with
blocks marked Redundant or Spare.]
  • Notes
  • Each block: 16 kbytes
  • Red blocks are redundant, access controlled by a
    discrete command bit
  • BAE Rad-hard SRAM
  • SRAM is read continuously by a data drip in
    telemetry
  • Aeroflex PROM
  • Hitachi commercial EEPROM
  • All memories readable by DMA.
  • EEPROM and SRAM writable by DMA.
17
Boot Control and Memory Architectures
  • Multiple Sources for Initial Memory Load
  • Each page is 32 kbytes
  • The upper 32 kbytes of the address space is
    loaded with an illegal instruction to force a
    trap if the program loses control and executes
    undefined memory locations.
  • One page of PROM (32k x 8 device)
  • Four pages of EEPROM (128k x 8 device)
  • Can load into one of two pages of SRAM (128k x 8
    device)
  • Source of boot information comes from a
    programmable register.

18
LOLA Memory Philosophy
  • Ground computer can peek and poke memory
    locations.
  • DMA model for the instrument
  • Other than setting/reading discrete bits.
  • Single address field for entire instrument
  • Give the ground computers maximum control of
    instrument for push button control.
  • Science telemetry double buffered.
  • Transparent to the spacecraft, which retrieves
    packets from the same address each 1 s major
    frame period
  • DMA can access telemetry in raw mode, directly
    reading each byte of telemetry.
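The double-buffering idea can be sketched as a pair of buffers that swap roles at each major frame, so a reader always sees a complete packet at the same place. The class and method names below are hypothetical, and the model ignores the actual addressing and bus protocol.

```python
class DoubleBufferedTelemetry:
    """One buffer is filled by the instrument while the other is read out."""

    def __init__(self, size):
        self._buffers = [bytearray(size), bytearray(size)]
        self._fill = 0                      # index of the buffer being filled

    def store(self, offset, data):
        """Instrument side: place data in the buffer currently being filled."""
        buf = self._buffers[self._fill]
        if offset < 0 or offset + len(data) > len(buf):
            raise ValueError("store outside telemetry buffer")
        buf[offset:offset + len(data)] = data

    def major_frame_tick(self):
        """Swap roles at the major frame boundary (1 s in the slides)."""
        self._fill ^= 1

    def read_packet(self):
        """Spacecraft side: always the same call, returns the last complete frame."""
        return bytes(self._buffers[self._fill ^ 1])
```

Because the reader never touches the buffer being filled, it cannot observe a half-written packet.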

19
LOLA Memory Philosophy
---------------------------------------------------------------------------
--
-- LOLA MEMORY ORGANIZATION
--
-- The memory in the LOLA instrument will be organized as an array of
-- bytes with a 24-bit address field. Memory is to be accessed through
-- MIL-STD-1553B interface and employ direct memory access via the
-- "RodChip." The uppermost nibble of this 24-bit address will denote
-- the memory to be accessed as in the table below.
--
--   0  PROM
--   1  EEPROM
--   2  SRAM
--   3
--   4  CT (exclusive of 0-3 above) and RMU
--   5
--   6  TELEMETRY
--
-- Address 0-5 are for physical memory and are provided for engineering
-- use.
--
---------------------------------------------------------------------------
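The address layout described in this header can be illustrated with a short decoder. The function name is hypothetical; the selector values come from the table above, and selectors 3 and 5, which the table leaves blank, are returned as None rather than guessed.

```python
# Uppermost nibble of the 24-bit address selects the memory, per the table above.
MEMORY_SELECT = {
    0x0: "PROM",
    0x1: "EEPROM",
    0x2: "SRAM",
    0x4: "CT and RMU",
    0x6: "TELEMETRY",
}


def decode_address(addr):
    """Split a 24-bit address into (memory name or None, 20-bit offset)."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    select = addr >> 20          # uppermost nibble
    offset = addr & 0xFFFFF      # remaining 20 bits
    return MEMORY_SELECT.get(select), offset
```

For example, `decode_address(0x200010)` selects SRAM with an offset of 0x10.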
20
LOLA EEPROM Protection
  • Discrete bit in MIL-STD-1553B output register
    must be set
  • Logically ANDed with write signal to inhibit
    writes
  • Power-on-reset signal
  • Releases late (power up)
  • Applied early (power down)
  • Controls the device's reset pin
  • Disables device and prevents inadvertent writes
    during power transitions.

21
LOLA EEPROM Protection
  • All write cycles use the sequence to unlock/lock
    the non-volatile memory device
  • Software Data Protection continually enabled
    and never disabled
  • This sequence is a preamble to a write cycle
  • µP/software cannot access EEPROM
  • All EEPROM access is done by hardware
  • Boot controller (reads)
  • DMA controller (reads and writes)

22
KEY EEPROM CHARACTERISTICS
tBLC (Byte Load Cycle): min 1.0 µs, max 30 µs.
Key fact: MIL-STD-1553B word rate is 20 µs per
word.
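The timing figures on this slide can be checked with a little arithmetic. A MIL-STD-1553B word occupies 20 bit times at 1 Mbit/s, which gives the 20 µs figure. The sketch below assumes, as an interpretation rather than something the slide states, that the point of the key fact is that bytes arriving at the bus word rate fall inside the byte load cycle window. The constant and function names are hypothetical.

```python
T_BLC_MIN_US = 1.0       # byte load cycle, minimum (from the slide)
T_BLC_MAX_US = 30.0      # byte load cycle, maximum (from the slide)

WORD_BITS = 20           # 3 bit times of sync + 16 data bits + 1 parity bit
BIT_RATE_HZ = 1_000_000  # MIL-STD-1553B bus rate


def word_period_us(bits=WORD_BITS, bit_rate_hz=BIT_RATE_HZ):
    """Time occupied by one bus word, in microseconds."""
    return bits * 1_000_000 / bit_rate_hz


def within_byte_load_window(period_us):
    """True if successive byte loads spaced period_us apart satisfy tBLC."""
    return T_BLC_MIN_US <= period_us <= T_BLC_MAX_US
```

Under that reading, one byte per bus word arrives every 20 µs, which leaves 10 µs of margin against the 30 µs maximum.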
23
LOLA EEPROM Protection
  • EEPROM Keys for enabling writing
  • Are not stored on board in any form
  • Part of the MIL-STD-1553B RECEIVE command
  • Key (data part) is uploaded from ground in 3
    byte preamble and presented via DMA controller.
  • Are discarded immediately after use
  • Writing may only be done by DMA
  • DMA supplies Address and Data
  • Except for Address during preamble.
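The protection layers on the LOLA EEPROM slides (enable bit, power-on-reset inhibit, and keys supplied from the ground and discarded after use) can be modeled together. This is a behavioral sketch under stated assumptions: the class names, the 3-byte key value used in any example, and the way the device compares the preamble are all hypothetical. The real unlock sequence is defined by the device data sheet and is not reproduced here.

```python
class ToyEeprom:
    """Accepts a byte write only when preceded by the correct 3-byte preamble."""

    def __init__(self, size, unlock_key):
        self._cells = bytearray(size)
        self._unlock_key = bytes(unlock_key)   # stands in for the device's fixed sequence

    def write(self, preamble, addr, value):
        if bytes(preamble) != self._unlock_key:
            return False
        self._cells[addr] = value & 0xFF
        return True

    def read(self, addr):
        return self._cells[addr]


class DmaController:
    """Only path to the EEPROM; holds no key between commands."""

    def __init__(self, eeprom):
        self._eeprom = eeprom
        self.write_enable_bit = False   # discrete bit in the 1553B output register
        self.power_on_reset = False     # asserted during power transitions

    def receive_write(self, key, addr, value):
        """Handle one ground command carrying the key, address, and data."""
        key_buffer = bytearray(key)
        try:
            if not self.write_enable_bit or self.power_on_reset:
                return False            # write strobe is gated off
            return self._eeprom.write(key_buffer, addr, value)
        finally:
            for i in range(len(key_buffer)):
                key_buffer[i] = 0       # discard the key immediately after use
```

Because the key exists on board only for the duration of one command, a software fault has nothing stored that it could replay to unlock the device.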

24
LOLA Flight Software
25
LOLA Flight Software Overview
  • Produce 3 sets of outputs every 35.7 ms.
  • Very small and focused.
  • Interrupt driven software
  • but only one task runs at a time, making the
    software deterministic.
  • Role of software: an interrupt calls a subroutine
    which executes and then stops (executes HLT
    instruction)
  • Provision made for software telemetry for each
    minor frame.
  • Software is loaded by DMA; it has no need to
    configure itself.
  • DMA is completely independent of processor, which
    is held in reset during DMA operations.

26
Key Flight Software Characteristics
  • Size of Flight Code - 12,450 bytes
  • Size of Tables - 3,472 bytes
  • Size of Memory for Data - 3,592 bytes
  • Execution time
  • nominal 22 ms
  • last shot in major frame 32 ms
  • Note: a minor frame rate of 28 Hz gives a 35.7 ms
    period; the major frame rate is 1 Hz.
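The timing figures on this slide imply a processing margin that is easy to compute. The helper names below are hypothetical; the inputs are the 28 Hz minor frame rate and the 22 ms and 32 ms execution times quoted above.

```python
MINOR_FRAME_HZ = 28


def frame_period_ms(rate_hz=MINOR_FRAME_HZ):
    """Length of one minor frame in milliseconds."""
    return 1000.0 / rate_hz


def utilization(exec_ms, rate_hz=MINOR_FRAME_HZ):
    """Fraction of the minor frame consumed by software execution."""
    return exec_ms / frame_period_ms(rate_hz)


def margin_ms(exec_ms, rate_hz=MINOR_FRAME_HZ):
    """Time left in the minor frame after the software finishes."""
    return frame_period_ms(rate_hz) - exec_ms
```

The nominal 22 ms case uses about 62% of the 35.7 ms frame, and the 32 ms last-shot case about 90%, leaving roughly 3.7 ms before the next interrupt.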

27
Algorithm Engine Timer
The signal that is input to the CT from the RMU
and triggers the algorithm engine is called RUPT,
a 200 ns pulse asserted at 28 Hz (every 35.7 ms).
TELEMETRY: One byte per minor frame (35.7 ms) is
to be put into the telemetry. This is the
Algorithm Engine Timer.
  • 00: Software did not run properly and
    terminated early.
  • FF: Software did not complete on time (cycle
    slip).
  • Other values represent time of execution for
    flight software for one minor cycle.
The Algorithm Engine Timer is implemented by a
counter with the following properties:
  • Effectively an 8-bit saturating counter
  • LSB = 200 microseconds
  • Set to "0000000" by RUPT
  • Stopped by an 80C85 I/O write cycle to a
    particular address when the software is done
    processing a minor frame.
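The counter behavior and the meaning of the telemetry byte can be sketched as below. This is a behavioral model with hypothetical names, not the hardware design: it assumes the counter advances once per 200 µs while running and holds at FF once saturated.

```python
LSB_US = 200  # counter resolution from the slide


class AlgorithmEngineTimer:
    """8-bit saturating counter: cleared by RUPT, stopped by the software's I/O write."""

    def __init__(self):
        self.count = 0
        self.running = False

    def rupt(self):
        """Start of a minor frame: clear the counter and start counting."""
        self.count = 0
        self.running = True

    def tick(self):
        """Advance by one LSB; the counter saturates at 0xFF instead of wrapping."""
        if self.running and self.count < 0xFF:
            self.count += 1

    def stop(self):
        """Software signals completion of the minor frame."""
        self.running = False


def decode_timer(byte):
    """Interpret the telemetry byte as (status, execution time in ms or None)."""
    if byte == 0x00:
        return ("terminated early", None)
    if byte == 0xFF:
        return ("cycle slip", None)
    return ("ok", byte * LSB_US / 1000.0)
```

In this model the nominal 22 ms execution time corresponds to a count of 110.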
28
Thank you.
29
References
  • [1] Space Vehicle Design Criteria (Guidance and
    Control): Spaceborne Digital Computer Systems,
    NASA SP-8070, March 1971, National Aeronautics
    and Space Administration.
  • [2] The Apollo Guidance Computer, Ramon L. Alonso
    and Albert L. Hopkins, R-416, August 1963.
  • [3] General Design Characteristics of the Apollo
    Guidance Computer, Eldon C. Hall, R-410, May
    1963.
  • [4] Single Event Functional Interrupt (SEFI)
    Sensitivity in EEPROMs, R. Koga, 1998 MAPLD
    International Conference, September 1998,
    Greenbelt, MD.
  • [5] Single-Event Upset Test Results for the Xilinx
    R1701L PROM, S. M. Guertin, JPL Report, August
    24, 2000.
  • [6] SEE and TID Extension Testing of the Xilinx
    XQR18V04 4Mbit Radiation Hardened Configuration
    PROM, Carl Carmichael, Joe Fabula, Candice Yui,
    and Gary Swift, 2002 MAPLD International
    Conference, September 10-12, 2002, Laurel, MD.
  • [7] Permanent Single Event Functional Interrupts
    (SEFIs) in 128- and 256-megabit Synchronous
    Dynamic Random Access Memories (SDRAMs), R.
    Koga, P. Yu, K.B. Crawford, S.H. Crain, and V.T.
    Tran, 2001 IEEE Radiation Effects Data Workshop.