Title: Supplemental Material: Data on Tevatron Complex Reliability
1 Supplemental Material: Data on Tevatron Complex Reliability
- Elliott McCrory
- 25 February 2004
2 Presentation Outline
- Main talk is elsewhere
- The data here are supplemental to that talk
3 Downtime Logger, D18
[Screenshot: console application]
4 Web Page: 4 Weeks of Downtime
5 Focus on One Subsystem: BPS
- Entries like these are used as the basis for the analysis
6 Tevatron Complex Subsystems
- BDIAG
- BMAG
- BMISC
- BPS
- BRF
- BVAC
- BWATR
- CMISC
- CNET
- CRLK
- CTIME
- EXPAR
- FDIAG
- FMISC
- FPS
- FWATR
- LDIAG
- LMAG
- LMISC
- LPS
- LRF
- LVAC
- LWATR
- MIDIAG
- MIMISC
- MIPS
- MIRF
- MISC
- MIVAC
- MIWATR
- MIMAG
- MIXFER
- POWER
- RADTRIP
- RRMISC
- SAFETY
- TCRYO
- TDIAG
- TMAG
- TMISC
- TPS
- TQPM
- TQUEN
- TRF
- TVAC
- TWATR
- WATER
- NTF
- PACC (L)
- PBCOOL
- PBDIAG
- PBMAG
- PBMISC
- PBPS
- PBRF
- PBTRGT
- PBVAC
- PBWATR
- PBXFER
7 Definition of Terms
- MTBF
  - Mean Time Between Failures
  - Simple arithmetic average of the time between failures
- $\lambda$
  - Used to characterize the probability of an interval in the classic random queuing problem
  - E.g., the arrival interval of cars at a toll booth
  - $\lambda = 1/\langle t \rangle = 1/\sigma$
  - Actually, not $\sigma$, but R.M.S.
  - Calculated here as $2/(\mathrm{MTBF} + \sigma(\mathrm{MTBF}))$
- Average Downtime Fraction
  - $\sum (\text{time down}) / (\text{specified interval})$
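The definitions above can be sketched in a few lines of Python. This is an illustration only, not the analysis code behind the talk. It assumes that the "Calculated here as" expression reads 2/(MTBF + sigma(MTBF)), and that the R.M.S. spread is the population standard deviation of the intervals. The function name `reliability_summary` and its arguments are hypothetical.

```python
from statistics import mean, pstdev


def reliability_summary(failure_times, downtimes, interval):
    """Summarize one subsystem (hypothetical helper, not from the talk).

    failure_times: sorted times at which failures started
    downtimes:     duration of each outage, same time unit
    interval:      length of the specified interval, same time unit
    """
    # Time between consecutive failures
    gaps = [b - a for a, b in zip(failure_times, failure_times[1:])]
    mtbf = mean(gaps)      # simple arithmetic average
    sigma = pstdev(gaps)   # assumption: R.M.S. deviation about the mean
    # Assumption: lambda is the inverse of the average of MTBF and sigma
    lam = 2.0 / (mtbf + sigma)
    downtime_fraction = sum(downtimes) / interval
    return mtbf, sigma, lam, downtime_fraction


mtbf, sigma, lam, frac = reliability_summary(
    failure_times=[0, 2, 6, 12], downtimes=[0.5, 1.0, 0.5], interval=24
)
```

For purely random (exponential) intervals MTBF and sigma agree, so averaging the two gives the same lambda as either one alone.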
8 MTBF For Each Major System
[Plot: Linac. Running average MTBF (hours) vs. days since 1/1/2003, with running averages over 30 days and over 120 days. Higher is good, lower is bad. The Fall 2003 shutdown is marked.]
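A running average of this kind could be computed as in the sketch below. How the window was actually applied for the plots is not stated in the slides; this version assumes a trailing window that ends at each failure, and the function name `running_mtbf` is hypothetical.

```python
def running_mtbf(failure_days, window_days):
    """Trailing-window MTBF evaluated at each failure (illustrative only).

    failure_days: sorted failure times, in days since the start date
    window_days:  window length, e.g. 30 or 120
    Returns one value per failure, or None when the window holds
    fewer than two failures.
    """
    result = []
    for i, day in enumerate(failure_days):
        # Failures up to and including this one that lie inside the window
        recent = [d for d in failure_days[: i + 1] if day - d <= window_days]
        if len(recent) < 2:
            result.append(None)
            continue
        gaps = [b - a for a, b in zip(recent, recent[1:])]
        result.append(sum(gaps) / len(gaps))
    return result


days = [0, 10, 20, 60, 70]
short = running_mtbf(days, 30)    # reacts quickly, but has gaps
long_ = running_mtbf(days, 120)   # smoother, slower to react
```

The result is in the same unit as the input (days here); a real analysis would convert to hours or minutes as on the plot axes.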
9 MTBF For Each Major System
[Plot: Booster. Running average MTBF (minutes) vs. days since 1/1/2003. Higher is good, lower is bad.]
10 MTBF For Each Major System
[Plot: PBar. Running average MTBF (minutes) vs. days since 1/1/2003. Higher is good, lower is bad. Annotation: "A lot less downtime".]
11 MTBF For Each Major System
[Plot: Main Injector. Running average MTBF (minutes) vs. days since 1/1/2003. Higher is good, lower is bad.]
12 MTBF For Each Major System
[Plot: Tevatron. Running average MTBF (minutes) vs. days since 1/1/2003. Higher is good, lower is bad.]
13 MTBF For Each Major System
[Plot: Controls. Running average MTBF (minutes) vs. days since 1/1/2003. Higher is good, lower is bad.]
14 Tevatron as 2 Machines
- The machines are called Low Beta and Not Low Beta
  - It is Low Beta if the energy is above 900 GeV
  - Otherwise, it is called Not Low Beta
  - Only one machine exists at a time
- Time between failures
  - Only count time in which the machine exists
  - That is, count minutes at 980 GeV to determine the time between failures at 980 GeV
  - Ditto for Not Low Beta
- Eliminate obvious bad points
  - E.g., 5000 minutes of shutdown between failures at 150 GeV for TeV this winter.
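The bookkeeping described on this slide can be sketched as follows. The input format (a list of time segments, each with a duration, an energy, and a flag for whether it ended in a failure), the function name `time_between_failures`, and the `max_gap` cut used to drop obvious bad points are all assumptions made for illustration; the slides do not say how the actual analysis was coded.

```python
def time_between_failures(segments, threshold=900.0, max_gap=None):
    """Split the failure history into Low Beta and Not Low Beta machines.

    segments:  (duration_minutes, energy, ended_in_failure) tuples in time order
    threshold: energy above which the machine counts as Low Beta
    max_gap:   optional cut; longer intervals are dropped as bad points
    """
    clock = {"Low Beta": 0.0, "Not Low Beta": 0.0}
    gaps = {"Low Beta": [], "Not Low Beta": []}
    for duration, energy, failed in segments:
        machine = "Low Beta" if energy > threshold else "Not Low Beta"
        # Only the machine that exists during this segment accumulates time
        clock[machine] += duration
        if failed:
            if max_gap is None or clock[machine] <= max_gap:
                gaps[machine].append(clock[machine])
            clock[machine] = 0.0
    return gaps


history = [
    (100, 980, False),   # store in progress
    (50, 150, True),     # failure at injection energy
    (200, 980, True),    # failure during a store
    (6000, 150, True),   # long shutdown: an obvious bad point
    (30, 150, True),
]
gaps = time_between_failures(history, max_gap=1000)
```

Note that the 100 minutes at 980 before the first failure at 150 still count toward the Low Beta interval, because a failure in one machine does not reset the clock of the other.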
15 Tevatron as 2 Machines, MTBF
MTBF is getting longer!
16 $\lambda(t)$ For Each Major System
[Plot: Linac. $\lambda(t)$ vs. days since 1/1/2003, with running averages over 30 days and over 120 days; the y-axis on the slide is labeled "Running average MTBF, hours". Higher is bad, lower is good.]
17 $\lambda(t)$ For Each Major System
[Plot: Booster. $\lambda(t)$ vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average MTBF, hours". Higher is bad, lower is good.]
18 $\lambda(t)$ For Each Major System
[Plot: Main Injector. $\lambda(t)$ vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average MTBF, hours". Higher is bad, lower is good.]
19 $\lambda(t)$ For Each Major System
[Plot: PBar. $\lambda(t)$ vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average MTBF, hours". Higher is bad, lower is good.]
20 $\lambda(t)$ For Each Major System
[Plot: Tevatron. $\lambda(t)$ vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average MTBF, hours". Higher is bad, lower is good.]
21 $\lambda(t)$ For Each Major System
[Plot: Controls. $\lambda(t)$ vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average MTBF, hours". Higher is bad, lower is good.]
22 Average Downtime Fraction
[Plot: Linac. Running average downtime vs. days since 1/1/2003, with running averages over 30 days and over 120 days; the y-axis on the slide is labeled "Running average Downtime Duration, minutes". Higher is bad, lower is good.]
23 Average Downtime Fraction
[Plot: Booster. Running average downtime vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average Downtime Duration, minutes". Higher is bad, lower is good.]
24 Average Downtime Fraction
[Plot: Main Injector. Running average downtime vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average Downtime Duration, minutes". Higher is bad, lower is good.]
25 Average Downtime Fraction
[Plot: PBar. Running average downtime vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average Downtime Duration, minutes". Higher is bad, lower is good.]
26 Average Downtime Fraction
[Plot: Tevatron. Running average downtime vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average Downtime Duration, minutes". Higher is bad, lower is good.]
27 Average Downtime Fraction
[Plot: Controls. Running average downtime vs. days since 1/1/2003; the y-axis on the slide is labeled "Running average Downtime Duration, minutes". Higher is bad, lower is good.]
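28 Radioactive Decay
$\lambda e^{-\lambda t}$ is the frequency distribution of the classic random queuing problem
$F(t) = 2^{-t/\tau}$
$F(t) = 2^{-t/\tau} = e^{-\ln(2)\, t/\tau} = e^{-\lambda t}$
$\lambda = \ln(2)/\tau = 0.0346$
$\langle t \rangle = \sigma = 1/\lambda = 28.9$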
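The numbers on this slide can be cross-checked with a short script. It takes the quoted decay constant 0.0346 as given and derives the half-life from it; the half-life value itself is not printed on the slide, so the figure computed here is an inference, and the time unit is left unspecified as in the original.

```python
import math

lam = 0.0346                      # decay constant quoted on the slide

# Half-life implied by lambda = ln(2) / tau (derived, not quoted)
half_life = math.log(2) / lam     # about 20.03

# For an exponential distribution the mean interval equals sigma = 1 / lambda
mean_interval = 1.0 / lam         # about 28.9

# The base-2 and base-e forms of F(t) agree at any t
t = 12.5
f_base2 = 2.0 ** (-t / half_life)
f_exp = math.exp(-lam * t)
```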
29 Controls Downtimes
Data Dump of All Controls Downtimes since
1-1-2003
30 $\lambda$ For All Systems
Probability of failure per hour; $1 - \lambda$ is the probability of running for an hour without a failure.
E.g., Linac up for 5 hours: $(1 - 0.183)^5 = 0.364$
[Plot: log scale]
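The Linac example is a product of independent hourly survival probabilities. A minimal sketch, using the 0.183 per-hour figure from the slide and a hypothetical function name:

```python
def survival_probability(failure_prob_per_hour, hours):
    """Probability of staying up for `hours` consecutive hours, assuming
    each hour fails independently with the same probability."""
    return (1.0 - failure_prob_per_hour) ** hours


linac_5h = survival_probability(0.183, 5)
print(round(linac_5h, 3))  # 0.364
```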
31 Quality of $\lambda$ Fit Assumption
$Q = \langle t \rangle / \sigma$
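For purely exponential intervals the mean and the standard deviation are equal, so a ratio of mean interval to spread close to 1 supports the random-failure assumption. The sketch below illustrates this with simulated intervals; the sample size, the seed, and the use of the population standard deviation are choices made for the example, not details from the talk.

```python
import random
from statistics import mean, pstdev


def fit_quality(intervals):
    """Ratio of the mean interval to its spread (hypothetical helper)."""
    return mean(intervals) / pstdev(intervals)


rng = random.Random(0)  # fixed seed so the example is repeatable
exponential = [rng.expovariate(0.0346) for _ in range(20000)]
q_random = fit_quality(exponential)      # close to 1

# Nearly periodic failures give a much larger ratio
q_regular = fit_quality([9.0, 10.0, 11.0])
```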
32 Store Duration
[Plot: $dN/dT$ vs. store duration (hours); series: Failures, Intentional 2003, Intentional 2004.]