Title: Dependability
1 Dependability & Maintainability: Theory and Methods
Part 2: Repairable systems: Availability
- Andrea Bobbio 
- Dipartimento di Informatica 
- Università del Piemonte Orientale, A. Avogadro 
- 15100 Alessandria (Italy) 
- bobbio_at_unipmn.it - http://www.mfn.unipmn.it/bobbio/IFOA/
IFOA, Reggio Emilia, June 17-18, 2003 
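2 Repairable systems
[Figure: system state (UP/DOWN) versus time t, alternating between UP intervals X1, X2, X3 and DOWN intervals Y1, Y2]
- $X_1, X_2, \ldots, X_n$: successive UP times
- $Y_1, Y_2, \ldots, Y_n$: successive DOWN times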
3 Repairable systems
- The usual hypothesis in modeling repairable systems is that:
-  The successive UP times $X_1, X_2, \ldots, X_n$ are i.i.d. random variables, i.e. samples from a common cdf $F(t)$
-  The successive DOWN times $Y_1, Y_2, \ldots, Y_n$ are i.i.d. random variables, i.e. samples from a common cdf $G(t)$
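The i.i.d. hypothesis above can be illustrated with a minimal simulation sketch. The function names (`simulate_timeline`, `state_at`) and the exponential choices for $F$ and $G$ are assumptions made for illustration only; any other sampler could be passed in.

```python
import random

def simulate_timeline(sample_up, sample_down, horizon, rng):
    """Return a list of (start, end, state) intervals covering [0, horizon].

    The system starts UP; UP durations are drawn from sample_up(rng) and
    DOWN durations from sample_down(rng), independently of each other.
    """
    intervals = []
    t, state = 0.0, "UP"
    while t < horizon:
        duration = sample_up(rng) if state == "UP" else sample_down(rng)
        end = min(t + duration, horizon)
        intervals.append((t, end, state))
        t = end
        state = "DOWN" if state == "UP" else "UP"
    return intervals

def state_at(intervals, t):
    """Return the state ("UP" or "DOWN") of the system at time t."""
    for start, end, state in intervals:
        if start <= t < end:
            return state
    return intervals[-1][2]  # t at (or beyond) the horizon: last state

# Hypothetical choices: exponential F with mean 100, exponential G with mean 5.
rng = random.Random(0)
timeline = simulate_timeline(
    sample_up=lambda r: r.expovariate(1 / 100.0),
    sample_down=lambda r: r.expovariate(1 / 5.0),
    horizon=1000.0,
    rng=rng,
)
for start, end, state in timeline[:4]:
    print(f"{state:4s} from {start:8.2f} to {end:8.2f}")
```

Passing the samplers as arguments keeps the alternating structure separate from the choice of distributions, which is all the hypothesis requires.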
4 Repairable systems
[Figure: system state (UP/DOWN) versus time t, with UP intervals X1, X2, X3 and DOWN intervals Y1, Y2]
- The dynamic behaviour of a repairable system is characterized by:
-  the r.v. $X$ of the successive up times
-  the r.v. $Y$ of the successive down times
5 Maintainability
- Let $Y$ be the r.v. of the successive down times
-  $G(t) = \Pr\{Y \le t\}$ (maintainability)
-  $g(t) = \frac{dG(t)}{dt}$ (density)
-  $h_g(t) = \frac{g(t)}{1 - G(t)}$ (repair rate)
-  $MTTR = \int_0^{\infty} t \, g(t) \, dt$ (Mean Time To Repair)
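The four maintainability quantities can be checked numerically with a small sketch. It assumes, purely for illustration, an exponential down-time distribution with a hypothetical rate `MU`; the derivative and the integral are approximated numerically so that the definitions are used as stated rather than their closed forms.

```python
import math

MU = 0.5  # hypothetical constant repair rate (repairs per hour)

def G(t):
    """Maintainability: probability that a repair is completed by time t."""
    return 1.0 - math.exp(-MU * t)

def g(t, dt=1e-6):
    """Density of the down time, as a central-difference derivative of G."""
    return (G(t + dt) - G(t - dt)) / (2.0 * dt)

def repair_rate(t):
    """Repair rate h_g(t) = g(t) / (1 - G(t))."""
    return g(t) / (1.0 - G(t))

def mttr(upper=100.0, steps=100_000):
    """MTTR as the trapezoidal approximation of the integral of t * g(t)."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * t * g(t)
    return total * h

print(repair_rate(1.0), repair_rate(4.0))  # both close to MU
print(mttr())                              # close to 1 / MU
```

For this exponential choice the repair rate is constant and the MTTR is its reciprocal; with another $G(t)$ only the function `G` would need to change.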
6 Availability
The measure to characterize a repairable system 
is the availability (unavailability)
The availability A(t) of an item at time t is the 
probability that the item is correctly working at 
time t. 
7 Availability
- The measure to characterize a repairable system is the availability (unavailability)
-  $A(t) = \Pr\{\text{system UP at time } t\}$
-  $U(t) = \Pr\{\text{system DOWN at time } t\}$
-  $A(t) + U(t) = 1$
8 Definition of Availability
- An important difference between reliability and availability is:
-  reliability refers to failure-free operation during an interval $(0, t)$
-  availability refers to failure-free operation at a given instant of time $t$ (the time when a device or system is accessed to provide a required function), independently of the number of failure/repair cycles.
9 Definition of Availability
[Figure: System Failure and Restoration Process. The indicator function I(t) is plotted against t: I(t) = 1 (working) while the system is operating and providing a required function, I(t) = 0 (failed) while it is failed and being restored.]
10 Availability evaluation
- In the special case when times to failure and times to restoration are both exponentially distributed, the alternating process can be viewed as a two-state homogeneous Continuous Time Markov Chain
- Time-independent failure rate: $\lambda$
- Time-independent repair rate: $\mu$
11 2-State Markov Availability Model
- Transient availability analysis:
-  for each state, we apply a flow balance equation
-  Rate of buildup = rate of flow IN - rate of flow OUT
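The flow balance rule can be sketched numerically. The example assumes the usual two-state chain (UP to DOWN at the failure rate, DOWN to UP at the repair rate), writes one balance equation per state, and integrates them with explicit Euler steps. The rate values `LAM` and `MU` and the function name are hypothetical.

```python
LAM = 0.01  # hypothetical failure rate (per hour)
MU = 0.1    # hypothetical repair rate (per hour)

def transient_availability(t_end, dt=0.01):
    """Integrate the two flow balance equations with explicit Euler steps.

    Returns a list of (t, P_up, P_down) samples, starting from the UP state.
    """
    p_up, p_down = 1.0, 0.0
    samples = [(0.0, p_up, p_down)]
    steps = int(round(t_end / dt))
    for k in range(1, steps + 1):
        # Rate of buildup = rate of flow IN - rate of flow OUT
        d_up = MU * p_down - LAM * p_up
        d_down = LAM * p_up - MU * p_down
        p_up += dt * d_up
        p_down += dt * d_down
        samples.append((k * dt, p_up, p_down))
    return samples

samples = transient_availability(t_end=100.0)
t, a_t, u_t = samples[-1]
print(f"A({t:.0f}) ~ {a_t:.4f}, U({t:.0f}) ~ {u_t:.4f}")
```

Explicit Euler is chosen only for brevity; the two probabilities sum to 1 at every step because the flow leaving one state is exactly the flow entering the other.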
12 2-State Markov Availability Model
13 2-State Markov Availability Model
[Figure: plot of A(t) versus t, with the levels 1 and Ass marked on the vertical axis]
14 2-State Markov Model
1) Pointwise availability $A(t)$
2) Steady-state availability: limiting value as $t \to \infty$
- If there is no restoration ($\mu = 0$) the availability becomes the reliability: $A(t) = R(t) = e^{-\lambda t}$
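The two measures and the no-restoration case can be compared with a short sketch. It assumes the standard closed-form solution of the two-state model started in the UP state, $A(t) = \frac{\mu}{\lambda+\mu} + \frac{\lambda}{\lambda+\mu} e^{-(\lambda+\mu)t}$, with failure rate $\lambda$ and repair rate $\mu$; the numeric rates are hypothetical.

```python
import math

def availability(t, lam, mu):
    """Pointwise availability A(t) of the two-state model, starting UP."""
    total = lam + mu
    return mu / total + (lam / total) * math.exp(-total * t)

def reliability(t, lam):
    """Reliability R(t) for a constant failure rate lam."""
    return math.exp(-lam * t)

lam, mu = 0.01, 0.1  # hypothetical rates (per hour)
for t in (0.0, 10.0, 50.0, 1000.0):
    # Columns: t, A(t) with repair, A(t) without repair, R(t)
    print(t, availability(t, lam, mu), availability(t, lam, 0.0), reliability(t, lam))
```

With `mu` set to 0 the third and fourth columns coincide, while with repair the availability levels off at $\mu / (\lambda + \mu)$ instead of decaying to zero.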
15 Steady-state Availability
- Steady-state availability
- In many system models, the limit $A_{ss} = \lim_{t \to \infty} A(t)$ exists and is called the steady-state availability
The steady-state availability represents the 
probability of finding a system operational after 
many fail-and-restore cycles. 
16 Steady-state Availability
[Figure: system state versus time t, alternating between UP (1) and DOWN (0)]
- Expected UP time: $E[U(t)] = MUT = MTTF$
- Expected DOWN time: $E[D(t)] = MDT = MTTR$
17 Availability Example (I)
Let a system have a steady-state availability $A_{ss} = 0.95$. This means that, given a mission time $T$, it is expected that the system works correctly for a total time of $0.95 \cdot T$. Or, alternatively, it is expected that the system is out of service for a total time $U_{ss} \cdot T = (1 - A_{ss}) \cdot T$.
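As a worked instance of this example, the sketch below evaluates both totals for a mission time of one year. The mission length of 8760 hours is an assumption introduced here; only the availability value 0.95 comes from the example.

```python
A_SS = 0.95               # steady-state availability from the example
MISSION_HOURS = 8760.0    # hypothetical mission time T: one year of 24 h days

expected_up = A_SS * MISSION_HOURS
expected_down = (1.0 - A_SS) * MISSION_HOURS
print(f"expected up time:   {expected_up:.0f} h")
print(f"expected down time: {expected_down:.0f} h")
```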
18 Availability Example (II)
Let a system have a rated productivity of $W$ per year. The loss due to system out of service can be estimated as $U_{ss} \cdot W = (1 - A_{ss}) \cdot W$. The availability (unavailability) is an index to estimate the real productivity, given the rated productivity.
Alternatively, if the goal is to have a net productivity of $W$ per year, the plant must be designed such that its rated productivity $W'$ satisfies $A_{ss} \cdot W' = W$.
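Both directions of this example can be sketched in a few lines, reading the design condition as: net productivity equals availability times rated productivity. The function names and the productivity figure are hypothetical, and the availability value is borrowed from Example (I).

```python
def productivity_loss(rated, a_ss):
    """Expected yearly loss U_ss * W for a rated productivity W."""
    return (1.0 - a_ss) * rated

def required_rated_productivity(net_target, a_ss):
    """Rated productivity such that a_ss times it equals the net target."""
    if not 0.0 < a_ss <= 1.0:
        raise ValueError("a_ss must be in (0, 1]")
    return net_target / a_ss

A_SS = 0.95          # reuses the availability of Example (I)
W = 1_000_000.0      # hypothetical productivity figure, per year

print(productivity_loss(W, A_SS))            # loss at rated productivity W
print(required_rated_productivity(W, A_SS))  # rating needed for a net of W
```

The required rating exceeds the net target by the factor $1 / A_{ss}$, which grows quickly as the availability drops.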
19 Availability
We can show that $A_{ss} = \frac{MTTF}{MTTF + MTTR}$. This result is valid without making any assumptions on the form of the distributions of times to failure and times to repair. Also, $U_{ss} = 1 - A_{ss} = \frac{MTTR}{MTTF + MTTR}$.
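The independence from the form of the distributions can be illustrated by simulation. The sketch below takes the result to be the standard expression $A_{ss} = MTTF / (MTTF + MTTR)$ and deliberately uses non-exponential (uniform) UP and DOWN times; the distributions, their parameters and the function name are assumptions chosen only for illustration.

```python
import random

def long_run_availability(sample_up, sample_down, cycles, rng):
    """Fraction of time spent UP over a number of UP/DOWN cycles."""
    up_total = down_total = 0.0
    for _ in range(cycles):
        up_total += sample_up(rng)
        down_total += sample_down(rng)
    return up_total / (up_total + down_total)

rng = random.Random(42)

# Hypothetical non-exponential choices:
# UP times uniform on [50, 150] -> MTTF = 100
# DOWN times uniform on [2, 8]  -> MTTR = 5
mttf, mttr = 100.0, 5.0
estimate = long_run_availability(
    sample_up=lambda r: r.uniform(50.0, 150.0),
    sample_down=lambda r: r.uniform(2.0, 8.0),
    cycles=200_000,
    rng=rng,
)
print(estimate, mttf / (mttf + mttr))
```

The simulated fraction of UP time should agree closely with the ratio of the means, even though neither distribution is exponential.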
20 Motivation: High Availability
21 Maintainability
- MDT: Mean Down Time (or MTTR, Mean Time To Restoration).
- The total down time (Y) consists of:
- Failure detection time 
- Alarm notification time 
- Dispatch and travel time of the repair person(s) 
- Repair or replacement time 
- Reboot time
22 Maintainability
- The total down time (Y) consists of:
- Logistic (passive) time 
-  Administrative times 
-  Dispatch and travel time of the repair 
 person(s)
-  Waiting time for spares, tools  
- Effective restoration (active) time 
-  Access and diagnosis time 
-  Repair or replacement time 
-  Test and reboot time
23 Logistics
- Logistic times depend on the organization of the assistance service:
-  Number of crews
-  Location of tools and storehouses
-  Number of spare parts.
24 The number of spares
25 Maintenance Costs
- The total cost of a maintenance action consists of:
-  Cost of spares and replaced parts
-  Cost of person-hours for repair
-  Down-time cost (loss of productivity) 
The down-time cost (due to a loss of 
productivity) can be the most relevant cost 
factor. 
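The cost breakdown can be written as a simple sum. The sketch below assumes that labour and down-time costs scale linearly with hours, which the slide does not state; the class name, field names and every figure are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MaintenanceAction:
    spares_cost: float        # cost of spares and replaced parts
    person_hours: float       # labour spent on the repair
    hourly_labour_cost: float
    down_time_hours: float    # total down time Y caused by the failure
    hourly_loss: float        # productivity lost per hour of down time

    def labour_cost(self) -> float:
        return self.person_hours * self.hourly_labour_cost

    def down_time_cost(self) -> float:
        return self.down_time_hours * self.hourly_loss

    def total_cost(self) -> float:
        return self.spares_cost + self.labour_cost() + self.down_time_cost()

# All figures below are hypothetical and only illustrate the sum.
action = MaintenanceAction(
    spares_cost=400.0,
    person_hours=6.0,
    hourly_labour_cost=50.0,
    down_time_hours=8.0,
    hourly_loss=1000.0,
)
print(action.total_cost())
```

With these illustrative figures the down-time term dominates the other two, which is the situation the slide warns about.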
26 Maintenance Policy
- A maintenance policy is the sequence of actions that minimizes the total cost related to down time:
-  Reactive maintenance: the maintenance action is triggered by a failure.
-  Proactive maintenance: preventive maintenance policy.
27 Life Cycle Cost