Title: Correlation Analysis for USCM8 CERs
1. Correlation Analysis for USCM8 CERs
2002 SCEA National Conference, Phoenix (Scottsdale), Arizona
Dr. Shu-Ping Hu
14 June 2002
shu_at_tecolote.com
2. Outline
- Objectives
- Background
  - Tecolote's position
  - Future study items noted in Mr. Covert's paper (see Reference 1)
- How to apply the correlation formula
  - Ground Rules for Developing USCM7 CERs
  - Using CER Data Points to Compute Pearson's r
  - Multiplicative Error Model (MUPE) and Error Forms
- Pearson's Correlation Coefficient
  - Definition and example
  - Property
- Revisited High Correlation Items from Reference 1
- USCM8 Sample Correlation Coefficients
- Conclusions
3. Objectives
- Derive correlations between the USCM CER
uncertainties using an analytic method
4. Tecolote's Position
- Cost correlation is not the same as CER noise correlation
- With CERs as cost estimating methodologies, most of the correlations are captured through the functional relationships specified in the WBS
- Do any correlations exist for the remaining noise terms?
5. Future Study Items Noted in Reference 1
- High correlation coefficients between USCM7 CER uncertainties in "Correlation Coefficients for Spacecraft Subsystems from the USCM7 Database"
6. How should we apply the correlation formula to the data points?
- Reference 1 used 26 satellites from the entire USCM7 database to compute correlation coefficients for USCM7 CERs
  - Outliers not eliminated
  - Population not homogeneous
- We should not use the entire database to compute correlation coefficients
  - Data point selection
  - Error form consideration
7. Ground Rules for Developing USCM7 CERs
- ATSF deleted due to incomplete cost data
- Programs with no costs identified were not used
  - AE, CRRES, P78-1, P78-2, P72-2, OSO, S3, DMSP 5-D1, DMSP 5-D2, and DMSP 5-D3 did not have a communication payload
  - DSCS, DMSP, DSP, AE, OSO, and SMS did not have an AKM
  - GPS 9-11 and CRRES AKMs were GFE
- Follow-on production programs DSCS 4-7, DSCS 8-14, DMSP 5-D2, DSP 5-12, DSP 18-22, FLTSATCOM 6-8, GPS 9-11, and GPS 13-40 not used in the nonrecurring CERs
- DSCS A (a development program) not used in the T1 CER
- Data points displaying program peculiarity were not used in subsystem CER development
8. Ground Rules for Developing USCM7 CERs (2)
- P78-1, P78-2, P72-2, and S3 were identified as Space Test Programs (STPs)
  - A smaller physical size, maximum reuse of existing HW
  - Shorter design life (6-18 months)
  - Not a full-up design effort for nonrecurring
  - Not a full-up manufacturing effort for recurring
- AE, OSO, and CRRES were considered experimental satellites
- Developed a separate CER for estimating STPs and experimental programs if appropriate
  - Using the primary equation to predict STPs would be incorrect
9. Using CER Data Points to Compute Pearson's r
- Even worse: calculating the corresponding correlation coefficient when the primary equation is used to predict STPs
- If a satellite doesn't have a particular subsystem, do not include it in computing the correlation coefficient for the corresponding subsystem-level CER
  - Percentage errors could be 100% using any CER
- Do not use data points with program peculiarity to compute Pearson's r if they are excluded from the CER
  - Refit the CER with previously excluded outliers if necessary
- A homogeneous data set is essential
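The effect of a non-homogeneous data set can be illustrated with a small sketch. All numbers below are made up for illustration and are not USCM data: four "primary" programs have uncorrelated percentage errors for two hypothetical subsystem CERs, and two STP-like programs are overestimated by both CERs. The helper `pearson_r` is likewise an assumption of this sketch, written out from the textbook formula.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical percentage errors of two subsystem CERs for four
# programs that belong to the CER population (uncorrelated by design).
primary_a = [0.2, -0.2, 0.2, -0.2]
primary_b = [0.2, 0.2, -0.2, -0.2]

# Hypothetical STP-like programs predicted with the primary equations:
# both CERs overestimate them, so both errors are strongly negative.
stp_a = [-0.6, -0.6]
stp_b = [-0.6, -0.6]

r_primary = pearson_r(primary_a, primary_b)
r_pooled = pearson_r(primary_a + stp_a, primary_b + stp_b)

print(f"homogeneous set: r = {r_primary:.2f}")  # 0.00
print(f"pooled set:      r = {r_pooled:.2f}")   # 0.75
```

In this toy case the pooled correlation comes entirely from mixing two populations with different mean errors, not from any association between the noise terms within the homogeneous set.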
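10. Multiplicative Error Model (MUPE)
- Definition for cost variation
  - $Y = f(X)\,\varepsilon$
  - where $E(\varepsilon) = 1$ and $V(\varepsilon) = \sigma^2$
- Note: $E\left(\frac{Y - f(X)}{f(X)}\right) = 0$ and $V\left(\frac{Y - f(X)}{f(X)}\right) = \sigma^2$
[Figure: cost Y plotted against a cost driver X, with the CER curve f(x)]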
11. Candidate Error Forms
- MUPE models use percentage errors: $y_i = f(x_i)\,\varepsilon_i$ for $i = 1, \ldots, n$
  - Note: Residuals are weighted by the reciprocal of the predicted value
- Additive models use residuals: $y_i = f(x_i) + \varepsilon_i$ for $i = 1, \ldots, n$
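A minimal sketch of the two error forms, assuming the percentage error is (actual - predicted) / predicted and the residual is actual - predicted. The cost figures and the function names `percentage_errors` and `residuals` are hypothetical, chosen only to show how the same data points yield different error vectors.

```python
def percentage_errors(actuals, predictions):
    """(y_i - f(x_i)) / f(x_i): error form for multiplicative CERs."""
    return [(y - f) / f for y, f in zip(actuals, predictions)]

def residuals(actuals, predictions):
    """y_i - f(x_i): error form for additive CERs."""
    return [y - f for y, f in zip(actuals, predictions)]

# Hypothetical actual costs and CER predictions (illustrative only).
actual_costs = [110.0, 45.0, 980.0, 210.0]
predicted_costs = [100.0, 50.0, 1000.0, 200.0]

pct = percentage_errors(actual_costs, predicted_costs)
res = residuals(actual_costs, predicted_costs)

for y, f, p, r in zip(actual_costs, predicted_costs, pct, res):
    print(f"actual={y:7.1f} predicted={f:7.1f} pct_error={p:+.2f} residual={r:+.1f}")
```

The third data point has the largest residual (-20) but the smallest percentage error (-2%), which is why the choice of error form matters before any correlation is computed.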
12. Pearson's Correlation Coefficient
- Pearson's correlation coefficient measures the linear association between two sets of pairs $x_i$ and $y_i$
  - $x_i$ and $y_i$ are the paired percentage errors for multiplicative models
  - $x_i$ and $y_i$ are the paired residuals for additive models
- The means of $x_i$ and $y_i$ should both be zero
13. Reference 1: Deriving Correlation Coefficients
- Usually we don't know the true value of $\rho_{xy}$, so approximate it by the sample correlation $r_{xy}$
- Example calculation using randomly generated numbers
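The example calculation from the slide is not reproduced in this transcript. As a stand-in, here is a small sketch of how a sample correlation approximates a known true correlation when the numbers are randomly generated. The seed, the sample size of 20, the true value of 0.5, and the helper `pearson_r` are all assumptions of this sketch.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(2002)   # arbitrary seed so the run is repeatable
true_rho = 0.5      # assumed population correlation
n = 20              # assumed sample size

xs, ys = [], []
for _ in range(n):
    z1 = random.gauss(0.0, 1.0)
    z2 = random.gauss(0.0, 1.0)
    xs.append(z1)
    # Mixing z1 and z2 this way gives y unit variance and corr(x, y) = true_rho.
    ys.append(true_rho * z1 + math.sqrt(1.0 - true_rho ** 2) * z2)

r = pearson_r(xs, ys)
print(f"true rho = {true_rho}, sample r = {r:.3f}")
```

With small samples the printed sample value can sit well away from the true value, which is worth keeping in mind when reading correlations computed from a few dozen satellites.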
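14. Pearson's r Preserved through Linear Transformation
- Given the following:
  - $T = X + Y$
  - $X = f(W)\,\varepsilon$
  - $Y = g(W)\,\eta$
  - (Note: $f$ and $g$ are USCM7 weight-based CERs; $\varepsilon$ and $\eta$ are error terms)
- The correlation between $X$ and $Y$ is the same as the correlation between $\varepsilon$ and $\eta$, i.e., $\rho(X, Y) = \rho(\varepsilon, \eta)$
- Total cost variance at a given weight, $w_t$, is given by $\mathrm{Var}(T) = f(w_t)^2 \sigma_\varepsilon^2 + g(w_t)^2 \sigma_\eta^2 + 2\,\rho(\varepsilon, \eta)\, f(w_t)\, g(w_t)\, \sigma_\varepsilon \sigma_\eta$
- We should consider the correlations between percentage errors instead of residuals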
15. Pearson's r Preserved through Linear Transformation (2)
- General total cost variance: $\mathrm{Var}(T) = \sum_{k} f_k^2 \sigma_k^2 + 2 \sum_{k < m} \rho_{km}\, f_k f_m\, \sigma_k \sigma_m$
- Where
  - $\sigma_k$, $\sigma_m$, and $\rho_{km}$ are the standard deviations of the noise terms for the WBS elements $k$ and $m$, respectively, and the correlation between them.
  - $f_k$ and $f_m$ are the CER estimated values for the WBS elements $k$ and $m$, respectively.
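A small numeric sketch of the total cost variance, assuming the standard variance of a sum of correlated terms, written as a double sum of rho_km * f_k * sigma_k * f_m * sigma_m with rho_kk = 1. The function name `total_cost_variance` and all input values are hypothetical.

```python
import math

def total_cost_variance(estimates, sigmas, rho):
    """Variance of the sum of WBS element costs.

    estimates: CER estimated values f_k
    sigmas:    standard deviations of the multiplicative noise terms
    rho:       correlation matrix of the noise terms (rho[k][k] == 1)
    """
    n = len(estimates)
    variance = 0.0
    for k in range(n):
        for m in range(n):
            variance += (rho[k][m]
                         * estimates[k] * sigmas[k]
                         * estimates[m] * sigmas[m])
    return variance

# Hypothetical two-element WBS: estimates and percentage-error sigmas.
estimates = [100.0, 50.0]
sigmas = [0.3, 0.4]

independent = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]

var_indep = total_cost_variance(estimates, sigmas, independent)
var_corr = total_cost_variance(estimates, sigmas, correlated)

print(f"rho = 0.0: sd of total = {math.sqrt(var_indep):.1f}")
print(f"rho = 0.5: sd of total = {math.sqrt(var_corr):.1f}")
```

With these inputs the variance rises from about 1300 to about 1900 when the noise correlation moves from 0 to 0.5, so the correlation assumed between noise terms directly changes the spread of the total estimate.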
16. Revisited High Correlation Items in Previous Study
- High correlation coefficients listed in Reference 1 are not found with the revised approach
17. USCM8 Sample Correlation Coefficients
- Range (-0.925, 0.913), Mean 0.04, Median 0.02, Skew -0.02
- 1st quartile -0.32, 3rd quartile 0.44, sd 0.44
- 73% of the correlation coefficients are from -0.5 to 0.5
- Three sample correlations with absolute values greater than 0.85: 0.90, 0.91, and -0.93
18. Reference 1: USCM7 Correlation Coefficients
19. Correlations between Structure/Thermal and SEPM Nonrecurring CERs
- For non-communication satellites: 0.90
- For communication satellites: -0.54
- For all satellites: 0.73
20. Conclusions
- Sample correlation coefficient is sensitive to the computing method
  - Use CER data points to compute Pearson's r to avoid heteroscedasticity
- In cost risk analysis, consider the correlations between
  - percentage errors instead of residuals for multiplicative CERs, and
  - residuals instead of percentage errors for additive CERs
- Means of the errors should be zero when computing Pearson's r
- With the revised approach, high correlations from the previous study for USCM7 CERs are not found
- We have found no discernible sample correlations for the USCM8 subsystem-level CERs using the revised method
  - Mean 0.04, Median 0.02, Skew -0.02
  - 73% of them are between -0.5 and 0.5
  - Three sample correlations with absolute values greater than 0.85: 0.90, 0.91, and -0.93 (0.9 is significant, but not the other two)
- Cost correlation is not the same as CER noise correlation. Use this analytic method as a cross-check
21. References
- Covert, Raymond P., "Correlation Coefficients for Spacecraft Subsystems from the USCM7 Database," Third Joint Annual ISPA/SCEA International Conference, Vienna, VA, 12-15 June 2001.
- Garvey, Paul R., "Do Not Use Rank Correlation in Cost Risk Analysis," 32nd Annual DoD Cost Analysis Symposium, Williamsburg, VA, 2-5 February 1999.
- Nguyen, P., et al., Unmanned Spacecraft Cost Model, Seventh Edition, U.S. Air Force Space and Missile Systems Center (SMC/FMC), Los Angeles AFB, CA, August 1994.
- Nguyen, P., et al., Unmanned Spacecraft Cost Model, Eighth Edition, U.S. Air Force Space and Missile Systems Center (SMC/FMC), Los Angeles AFB, CA, October 2001.
- Tecolote Research, Inc., RI$K in ACE User's Manual, GM 075, August 1999.
22. Backup Slides
23. USCM8 Sample Correlation Coefficients