Failure in the PATHFINDER Mission

1
Failure in the PATHFINDER Mission
  • Chandan Kumar
  • EE 585 Fault Tolerant Computing

2
Outline
  • Background
  • Simplified view of H/W architecture
  • S/W architecture
  • Failure
  • Cause
  • Correction

3
Background
  • Launched Dec 4 1996
  • Landed July 4 1997.
  • Mission Objectives
  • To prove that the development of "faster, better
    and cheaper" spacecraft is possible (with three
    years for development and a cost under US$150
    million).
  • To show that it is possible to send a load of
    scientific instruments to another planet with a
    simple system and at one fifth the cost of a
    Viking mission.

4
Background Contd.
  • To demonstrate NASA's commitment to low-cost
    planetary exploration by finishing the mission
    with a total expenditure of US$280 million,
    including the launch vehicle and mission
    operations.
  • Demonstrate the mobility and usefulness of a
    micro rover on the surface of Mars
  • It carried a number of scientific instruments,
    including:
  • Mars Pathfinder Lander:
  • Imager for Mars Pathfinder (IMP), which includes
    a magnetometer and an anemometer
  • Atmospheric and meteorological sensors (ASI/MET)

5
Background Contd.
  • Rover Sojourner:
  • Imaging system (three cameras: front B&W stereo
    pair, one rear color)
  • Laser striper hazard detection system
  • Alpha Proton X-ray Spectrometer (APXS)
  • Wheel Abrasion Experiment
  • Material Adherence Experiment
  • Accelerometers
  • Potentiometers
  • Final transmission: Sept 27 1997.
  • 16,500 images sent from the lander, 550 from the
    rover.
  • 15 analyses of rocks.

6
Simplified view of Hardware Architecture
  • Single CPU Controls the Spacecraft.
  • Resides on VME bus.
  • Interface cards for Radio and Camera.
  • Interface to 1553 bus.
  • The 1553 bus connects to the cruise and lander
    stages.
  • H/W on the cruise stage controls thrusters, etc.
  • H/W on the lander interfaces to instruments such
    as the accelerometer, radar altimeter and ASI/MET.

7
The Software Architecture
  |<------------------------ 0.125 seconds ------------------------>|
  |<-- bus active -->|<-- bc_dist active -->|       |<-bc_sched->|  |
  |------------------|----------------------|-------|------------|--|
  t1                 t2                     t3      t4           t5 t1

Whenever bc_dist and bc_sched are not active, tasks
other than the ones listed are executing. There is
some idle time.
  • t1: bus hardware starts via hardware control on
    the 8 Hz boundary. The transactions for this
    cycle had been set up by the previous execution
    of the bc_sched task.
  • t2: 1553 traffic is complete and the bc_dist task
    is awakened.
  • t3: bc_dist task has completed all of the data
    distribution.
  • t4: bc_sched task is awakened to set up
    transactions for the next cycle.
  • t5: bc_sched activity is complete.

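
The timing constraint in this cycle can be sketched in a few lines of Python. This is only an illustration: the 8 Hz rate comes from the slide, but the function name `dist_margin` and every offset below are assumed values chosen for the example, not flight data.

```python
CYCLE_HZ = 8                          # bus cycle rate given on the slide
CYCLE_PERIOD_MS = 1000 // CYCLE_HZ    # 125 ms, i.e. 0.125 seconds

def dist_margin(t2_ms, dist_duration_ms, t4_ms):
    """Margin between bc_dist finishing (t3) and bc_sched waking (t4).

    All arguments are offsets in milliseconds from t1 (hypothetical values).
    A negative result means bc_dist was still running when bc_sched woke.
    """
    t3_ms = t2_ms + dist_duration_ms
    return t4_ms - t3_ms

# Assumed numbers: a normal cycle, and one where bc_dist is held up.
nominal = dist_margin(t2_ms=40, dist_duration_ms=30, t4_ms=100)
delayed = dist_margin(t2_ms=40, dist_duration_ms=75, t4_ms=100)
print(CYCLE_PERIOD_MS, nominal, delayed)
```

A positive margin corresponds to the healthy timeline above; a negative margin is the situation in which t3 would fall after t4.
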
8
The Failure
  • The spacecraft began experiencing total system
    resets.
  • This reset reinitializes all of the hardware and
    software. It also terminates the execution of the
    current ground commanded activities.
  • The remainder of the activities for that day were
    not accomplished until the next day

9
The Cause
  • The Failure - a case of Priority Inversion
  • In scheduling, priority inversion is the scenario
    where a low priority task holds a shared resource
    that is required by a high priority task.
  • This causes the execution of the high priority
    task to be blocked until the low priority task
    has released the resource, effectively
    "inverting" the relative priorities of the two
    tasks.
  • If some other medium priority task attempts to
    run in the interim, it will take precedence over
    both the low priority task and the high priority
    task.
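
The scenario on this slide can be reproduced with a small tick-based simulation. This is a minimal sketch, not the spacecraft's scheduler: the `Task` class, the step names and the three example tasks are all made up for illustration, and the mutex deliberately has no priority inheritance.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    prio: int       # larger number = higher priority
    release: int    # tick at which the task becomes ready
    steps: list     # remaining steps: "lock", "run" or "unlock"

def simulate(tasks, max_ticks=50):
    """Fixed-priority preemptive scheduling with one mutex, no inheritance."""
    owner = None                 # task currently holding the mutex
    trace, finish = [], {}
    for tick in range(max_ticks):
        ready = sorted((t for t in tasks if t.release <= tick and t.steps),
                       key=lambda t: -t.prio)
        for task in ready:
            if task.steps[0] == "lock" and owner is not None:
                continue         # blocked on the mutex, try next priority
            step = task.steps.pop(0)
            if step == "lock":
                owner = task
            elif step == "unlock":
                owner = None
            trace.append(task.name)
            if not task.steps:
                finish[task.name] = tick + 1
            break
        else:
            trace.append("idle")  # nothing could run this tick
        if all(not t.steps for t in tasks):
            break
    return trace, finish

tasks = [
    Task("low", 1, 0, ["lock", "run", "run", "unlock"]),
    Task("medium", 2, 1, ["run"] * 5),
    Task("high", 3, 2, ["lock", "run", "unlock"]),
]
trace, finish = simulate(tasks)
print(trace)
print(finish)
```

In this run the high priority task is released at tick 2 and needs only three ticks, yet it finishes at tick 12, after the medium priority task (tick 6), because the low priority task holding the mutex cannot run while the medium priority task is ready.
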

10
The Cause Contd.
  • The failure was identified by the spacecraft as a
    failure of the bc_dist task to complete its
    execution before the bc_sched task started
  • The ASI/MET task receives its information via an
    interprocess communication (IPC) mechanism.
  • The IPC mechanism is based on pipes.
  • The higher priority bc_dist task was blocked by
    the much lower priority ASI/MET task that was
    holding a shared resource.

11
The Cause Contd.
  • The resource that caused this problem was a
    mutual exclusion semaphore used within the
    select() mechanism.
  • The ASI/MET task had acquired this resource and
    was then preempted by several of the medium
    priority tasks.
  • The bc_dist task attempted to send the newest
    ASI/MET data via the IPC mechanism, which wrote
    to the pipe. The write blocked while trying to
    take the semaphore.

12
The Cause Contd.
  • The medium priority tasks ran, still not allowing
    the ASI/MET task to run, until the bc_sched task
    was awakened.
  • At that point, the bc_sched task determined that
    the bc_dist task had not completed its cycle (a
    hard deadline in the system) and declared the
    error that initiated the reset.
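
The check described here can be sketched as a completion flag that bc_dist sets and bc_sched inspects when it wakes. This is an assumed, simplified model: the class `BusCycle`, the exception `DeadlineMissed` and the function `run_cycle` are hypothetical names, and the real flight software is not shown in the slides.

```python
class DeadlineMissed(Exception):
    """Raised when bc_dist has not finished by the time bc_sched wakes."""

class BusCycle:
    def __init__(self):
        self.dist_done = False

    def dist_finished(self):
        # Called by the (modelled) bc_dist task at t3.
        self.dist_done = True

    def sched_wakeup(self):
        # Called by the (modelled) bc_sched task at t4.
        if not self.dist_done:
            raise DeadlineMissed("bc_dist did not complete its cycle")
        self.dist_done = False   # re-arm the check for the next cycle

def run_cycle(cycle, dist_completes):
    """Run one modelled cycle and report 'ok' or 'reset'."""
    if dist_completes:
        cycle.dist_finished()
    try:
        cycle.sched_wakeup()
        return "ok"
    except DeadlineMissed:
        return "reset"           # stands in for the total system reset

cycle = BusCycle()
results = [run_cycle(cycle, True), run_cycle(cycle, False)]
print(results)
```

The second cycle models bc_dist being blocked on the semaphore: the flag is never set, so the wakeup check fails and the sketch reports a reset.
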

13
Correction
  • The creation flags for the semaphore were changed
    to enable priority inheritance.
  • Modifying the semaphore associated with the pipe
    used for bc_dist-to-ASI/MET task communications
    corrected the problem.
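
The effect of enabling priority inheritance can be sketched with a toy mutex that raises its owner's priority while a higher priority task is waiting. This is an illustration under stated assumptions: the `Task` and `Mutex` classes and the `inherit` argument are hypothetical stand-ins for the semaphore and its creation flag, and the priority numbers are invented.

```python
class Task:
    def __init__(self, name, base_prio):
        self.name = name
        self.base_prio = base_prio
        self.prio = base_prio          # effective priority

class Mutex:
    def __init__(self, inherit):
        self.inherit = inherit         # stand-in for the creation flag
        self.owner = None
        self.waiters = []

    def acquire(self, task):
        """Return True if the mutex was taken, False if the task must wait."""
        if self.owner is None:
            self.owner = task
            return True
        self.waiters.append(task)
        if self.inherit and task.prio > self.owner.prio:
            self.owner.prio = task.prio    # owner inherits waiter's priority
        return False

    def release(self):
        """Restore the owner's priority and hand over to the best waiter."""
        self.owner.prio = self.owner.base_prio
        self.owner = None
        if self.waiters:
            self.owner = max(self.waiters, key=lambda t: t.prio)
            self.waiters.remove(self.owner)
        return self.owner

def who_runs_while_blocked(inherit):
    asi_met, medium, bc_dist = Task("asi_met", 1), Task("medium", 2), Task("bc_dist", 3)
    mutex = Mutex(inherit)
    mutex.acquire(asi_met)             # low priority task takes the mutex
    mutex.acquire(bc_dist)             # high priority task blocks on it
    runnable = [asi_met, medium]       # bc_dist is waiting, not runnable
    return max(runnable, key=lambda t: t.prio).name

print(who_runs_while_blocked(False), who_runs_while_blocked(True))
```

Without inheritance the scheduler picks the medium priority task and the mutex holder starves; with inheritance the holder runs at the waiter's priority, so it can release the mutex before the medium priority task gets the CPU.
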

14
S/W modification on the spacecraft
  • Patching is a specialised process.
  • Send the difference between what is onboard and
    what you want on the spacecraft.
  • S/W on the spacecraft modifies the onboard copy.
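
The idea of sending only a difference can be sketched with Python's standard `difflib`. This is a generic illustration, not the mission's patch format: the functions `make_patch` and `apply_patch` and the two byte strings are invented for the example.

```python
from difflib import SequenceMatcher

def make_patch(old: bytes, new: bytes):
    """Ground side: list the spans of `old` to replace and their new bytes."""
    ops = SequenceMatcher(None, old, new, autojunk=False).get_opcodes()
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in ops if tag != "equal"]

def apply_patch(old: bytes, patch):
    """Onboard side: rebuild the new image from the old one plus the patch."""
    out, pos = bytearray(), 0
    for start, end, data in patch:
        out += old[pos:start]      # copy the unchanged bytes
        out += data                # insert the replacement bytes
        pos = end                  # skip the bytes being replaced
    out += old[pos:]
    return bytes(out)

# Hypothetical images: only one setting differs between them.
onboard = b"pipe semaphore: priority_inheritance=off"
desired = b"pipe semaphore: priority_inheritance=on"
patch = make_patch(onboard, desired)
patched = apply_patch(onboard, patch)
print(patch, patched == desired)
```

Only the entries in `patch` would need to be transmitted; the receiving side reconstructs the full image from its own copy.
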

15
Questions??

16
References
  • http://mars.jpl.nasa.gov/missions/past/pathfinder.html
  • http://research.microsoft.com/~mbj/Mars_Pathfinder/Authoritative_Account.html
  • http://en.wikipedia.org/wiki/Mars_Pathfinder