Title: A Review of Benchmarking Methods
1. A Review of Benchmarking Methods
G Brown, N Parkin, and N Stuttard, ONS
2. Overview
- Introduction
- What is benchmarking?
- What we did and why
- Some methods for benchmarking
- Some quality measures
- Comparison of methods
- Summary
3. Introduction
- Purpose: to recommend a method for benchmarking to ONS and the wider GSS
- Benchmarking combines two time series of the same phenomenon, measured at different frequencies
- Result: the benchmarked series is of higher quality
- Work funded from the Quality Improvement Fund
4. What we did and why
- Identified appropriate benchmarking methods
- Tested using several hundred ONS time series
- Used a range of quality measures to rank methods
- Made a judgment to combine results from different quality measures
- Recommended a benchmarking method
- Update of ONS computer systems prompted the examination of methods
5. Benchmarking
- Want good estimates of levels and growth
- Have two series measuring the same phenomenon
- Different frequencies
- Higher frequency: more timely, more accurate growth rates (the indicator series)
- Lower frequency: delayed, more accurate levels (the benchmark series)
6. Benchmarking
- Resulting high-frequency series is the benchmarked series
- It combines good estimates of growth with good estimates of level
7. Benchmarking
- Two types of relation between indicator and benchmark:
- Point in time
- Average
8. Benchmarking, point in time
- Example: unemployment, monthly and quarterly
- Benchmarks apply to the third month in each quarter
- Third monthly estimate in each quarter is forced to equal the benchmark
11. Benchmarking, average
- Example: turnover, monthly and quarterly
- Benchmarks apply to each month in each quarter
- Average turnover of the three months in each quarter is forced to equal the benchmark
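The point-in-time and average relations can be written as two simple checks. The sketch below is illustrative only: the function names and the monthly and quarterly values are made up, and it assumes three months per quarter with the series starting at the beginning of a quarter.

```python
def satisfies_point_in_time(monthly, quarterly, tol=1e-9):
    """Third month of each quarter must equal that quarter's benchmark."""
    return all(abs(monthly[3 * q + 2] - b) <= tol
               for q, b in enumerate(quarterly))


def satisfies_average(monthly, quarterly, tol=1e-9):
    """Mean of the three months in each quarter must equal the benchmark."""
    return all(abs(sum(monthly[3 * q:3 * q + 3]) / 3 - b) <= tol
               for q, b in enumerate(quarterly))


# Hypothetical values, not taken from any ONS series.
monthly = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
print(satisfies_point_in_time(monthly, [12.0, 15.0]))  # third months are 12 and 15
print(satisfies_average(monthly, [11.0, 14.0]))        # quarterly means are 11 and 14
```

A benchmarking method takes an indicator that fails one of these checks and adjusts it until the check passes, while changing the month-to-month movements as little as possible.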
14. Non-negativity
- Most indicator series must be non-negative
- In those cases the benchmarked series must be non-negative too
- The process of benchmarking can produce negative values in the benchmarked series
17. Benchmarking methods
- Methods suggested by ONS, with variants using different splines: proc Expand (in SAS), INTER, Kruger
- Denton
- Cholette-Dagum
- Constrained versions of the above for non-negativity
21. ONS methods (and variants)
- Summary: fit a smooth curve through knots
- (1) Aggregate the indicator series to the benchmark frequency
- (2) Calculate the ratio of the benchmark to the aggregated indicator
- (3) Augment with forecasts and backcasts using X-12-ARIMA
- (4) Interpolate to the frequency of the indicator
- (5) Multiply the indicator by the interpolated series
- Iterate steps (1) to (5)
- Variants use different ways to interpolate
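The aggregate, ratio, interpolate and multiply loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the ONS implementation: it handles average-type benchmarks only, takes the ratio as benchmark divided by aggregated indicator (so that multiplying moves the series towards the benchmark), swaps in linear interpolation via `numpy.interp` in place of a cubic spline, and omits the X-12-ARIMA extension step (`numpy.interp` simply holds the end ratios constant). The function name `ratio_benchmark` and the data are hypothetical.

```python
import numpy as np


def ratio_benchmark(indicator, benchmark, period=3, n_iter=30):
    """Iterative ratio adjustment for average-type benchmarks (sketch)."""
    series = np.asarray(indicator, dtype=float).copy()
    b = np.asarray(benchmark, dtype=float)
    n_bench = len(b)
    if len(series) != n_bench * period:
        raise ValueError("indicator must cover exactly the benchmarked periods")

    t = np.arange(len(series))
    # Knots: the ratio for each low-frequency period sits at its midpoint.
    knots = np.arange(n_bench) * period + (period - 1) / 2.0

    for _ in range(n_iter):
        agg = series.reshape(n_bench, period).mean(axis=1)  # aggregate
        ratio = b / agg                                     # benchmark / aggregate
        # Fore/backcasting is omitted; np.interp holds the end ratios constant.
        factor = np.interp(t, knots, ratio)                 # interpolate (linear here)
        series = series * factor                            # multiply
    return series


# Hypothetical monthly indicator and quarterly benchmarks.
ind = [98, 100, 102, 101, 103, 106, 104, 107, 109, 108, 110, 113]
bm = [105, 108, 111, 116]
out = ratio_benchmark(ind, bm)
print(out.reshape(4, 3).mean(axis=1))  # quarterly means of the adjusted series
```

Each pass leaves a smaller discrepancy than the last because the interpolated factors average close to, but not exactly, the quarterly ratio; that is why the steps are iterated.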
22. Interpolation
- Three types of cubic spline:
- proc Expand (point in time or average)
- INTER (average)
- Kruger (point in time)
- Progressively less prone to produce negative values
23. Denton type
- Summary: try to preserve movements in the indicator
- Minimise a penalty function of differences or relative differences between the indicator and the benchmarked series
- Minimisation using either special methods or off-the-shelf methods for quadratic minimisation
- Denton is usually set up to minimise first differences or proportionate first differences
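A minimal sketch of the additive first-difference form, solved as an equality-constrained quadratic problem, may make the setup concrete. It assumes average-type benchmarks and solves the Lagrangian (KKT) linear system directly with NumPy; the function name `denton_additive` and the data are hypothetical, and this is one standard way to pose the problem rather than the code used in the review.

```python
import numpy as np


def denton_additive(indicator, benchmark, period=3):
    """Minimise sum of squared first differences of (benchmarked - indicator)
    subject to period averages equalling the benchmarks (sketch)."""
    x = np.asarray(indicator, dtype=float)
    b = np.asarray(benchmark, dtype=float)
    n, m = len(x), len(b)

    # J maps the high-frequency series to period averages.
    J = np.zeros((m, n))
    for k in range(m):
        J[k, k * period:(k + 1) * period] = 1.0 / period

    D = np.diff(np.eye(n), axis=0)  # (n-1) x n first-difference operator
    Q = D.T @ D

    # KKT system for: min d'Qd  subject to  J d = b - J x, where d = theta - x.
    kkt = np.block([[2.0 * Q, J.T], [J, np.zeros((m, m))]])
    rhs = np.concatenate([np.zeros(n), b - J @ x])
    d = np.linalg.solve(kkt, rhs)[:n]
    return x + d


# Hypothetical data: 12 months of indicator, benchmarks for the first 3 quarters.
ind = np.array([98, 100, 102, 101, 103, 106, 104, 107, 109, 108, 110, 113.0])
bm = np.array([105.0, 108.0, 111.0])
out = denton_additive(ind, bm)
print(out[:9].reshape(3, 3).mean(axis=1))  # quarterly means of benchmarked series
print(out - ind)                            # adjustment applied to each month
```

For the three months after the last benchmark the adjustment stays constant, because a constant adjustment adds nothing to a first-difference penalty.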
24. Denton and Cholette-Dagum
- For indicator points with no benchmark:
- Denton carries forward the most recent difference between the benchmarked series and the indicator
- Cholette-Dagum assumes the difference decays to zero in a defined way
- Flexible in the way this is modelled
- We assume:
- Decay is geometric
- Rate of decay is fixed in advance for all series
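The geometric decay can be illustrated with a small sketch of an additive regression-type adjustment with binding benchmarks, where the adjustment is given an AR(1)-style correlation `rho ** |i - j|`. This is an assumption about the general form, not the ONS code: the function name `cholette_dagum_additive` and the data are hypothetical, only average-type benchmarks are handled, and `rho = 0.8` is used on the assumption that it plays the role of the fixed decay rate.

```python
import numpy as np


def cholette_dagum_additive(indicator, benchmark, period=3, rho=0.8):
    """Additive adjustment with binding average benchmarks and an
    AR(1)-style correlation structure for the adjustment (sketch)."""
    x = np.asarray(indicator, dtype=float)
    b = np.asarray(benchmark, dtype=float)
    n, m = len(x), len(b)

    # J maps the high-frequency series to period averages.
    J = np.zeros((m, n))
    for k in range(m):
        J[k, k * period:(k + 1) * period] = 1.0 / period

    idx = np.arange(n)
    V = rho ** np.abs(idx[:, None] - idx[None, :])  # correlation matrix

    # Spread the benchmark discrepancies over time according to V.
    weights = np.linalg.solve(J @ V @ J.T, b - J @ x)
    return x + V @ J.T @ weights


# Hypothetical data: 12 months of indicator, benchmarks for the first 3 quarters.
ind = np.array([98, 100, 102, 101, 103, 106, 104, 107, 109, 108, 110, 113.0])
bm = np.array([105.0, 108.0, 111.0])
out = cholette_dagum_additive(ind, bm)
print(out[:9].reshape(3, 3).mean(axis=1))  # quarterly means of benchmarked series
print((out - ind)[8:])                      # adjustment shrinks by a factor rho each month
```

After the last benchmarked month the adjustment shrinks by the factor `rho` each period, in contrast to the Denton behaviour of carrying the last difference forward unchanged.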
26. Non-negativity
- ONS suggestion:
- Benchmark on log scale
- Exponentiate
- Distribute residual differences
- Optimisation approach for Denton type:
- Set up basic method as a matrix problem
- Add constraints as part of matrix setup
- Solve using off-the-shelf optimiser in SAS
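The log-scale route can be sketched as below. Several choices here are assumptions made for illustration: an additive first-difference Denton adjustment is used as the benchmarking step on the log scale, benchmarks are of average type, inputs are strictly positive, and the residual is distributed pro rata within each period. The slides do not say how the residual differences were distributed, and the function names and data are hypothetical.

```python
import numpy as np


def _denton_additive(x, b, period):
    """Additive first-difference Denton with average constraints (helper)."""
    n, m = len(x), len(b)
    J = np.zeros((m, n))
    for k in range(m):
        J[k, k * period:(k + 1) * period] = 1.0 / period
    D = np.diff(np.eye(n), axis=0)
    kkt = np.block([[2.0 * D.T @ D, J.T], [J, np.zeros((m, m))]])
    rhs = np.concatenate([np.zeros(n), b - J @ x])
    return x + np.linalg.solve(kkt, rhs)[:n]


def log_scale_benchmark(indicator, benchmark, period=3):
    """Benchmark on the log scale, exponentiate, then remove the residual."""
    x = np.asarray(indicator, dtype=float)
    b = np.asarray(benchmark, dtype=float)
    if len(x) != len(b) * period:
        raise ValueError("indicator must cover exactly the benchmarked periods")
    if (x <= 0).any() or (b <= 0).any():
        raise ValueError("log-scale benchmarking needs strictly positive inputs")

    # Step 1: benchmark on the log scale.
    log_series = _denton_additive(np.log(x), np.log(b), period)
    # Step 2: exponentiate; the result is positive by construction.
    series = np.exp(log_series)
    # Step 3: period means no longer match exactly, so distribute the
    # residual pro rata within each period (an assumed rule).
    means = series.reshape(len(b), period).mean(axis=1)
    return series * np.repeat(b / means, period)


# Hypothetical data with values close to zero.
ind = [0.9, 0.4, 0.2, 0.3, 0.5, 1.1, 1.6, 1.2, 0.8, 0.5, 0.3, 0.2]
bm = [0.2, 0.9, 1.0, 0.1]
out = log_scale_benchmark(ind, bm)
print(out.min() > 0)                    # stays positive
print(out.reshape(4, 3).mean(axis=1))   # matches the benchmarks
```

The pro rata step guarantees the constraint and keeps the series positive, at the cost of small steps at period boundaries; a smoother way of distributing the residual would reduce those.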
28. Time series used for testing
- Mixture of:
- Monthly to quarterly
- Quarterly to annual
- Average and point in time
- Different lengths
- Included some awkward series (to test non-negativity)
30. How the methods were compared
- Failures: program fails to benchmark
- Verification of benchmarking constraint: benchmarked series not equal to benchmark
- Preserving change: size and direction
- Revisions: size and bias when perturbing or adding a benchmark
- Smoothness: relative variance of indicator and benchmarked series
- Closeness: between indicator and benchmarked series
37. How the methods were compared
- For each of preserving change, revisions, smoothness and closeness:
- Calculate the measure for each method, for each time series, for different lengths of the series
- Rank methods for each series and length
- Average the ranks over all series
- Plot and compare average ranks by length
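The rank-then-average step can be sketched in a few lines. The scores, series names and the convention that a lower score is better are all hypothetical, and ties are broken by sort order rather than given shared ranks.

```python
def average_ranks(scores):
    """scores: {series_name: {method_name: score}}, lower score is better.
    Returns {method_name: mean rank over all series}."""
    ranks = {}
    for per_method in scores.values():
        ordered = sorted(per_method, key=per_method.get)  # best first
        for rank, method in enumerate(ordered, start=1):
            ranks.setdefault(method, []).append(rank)
    return {method: sum(r) / len(r) for method, r in ranks.items()}


# Hypothetical values of one quality measure for one series length.
scores = {
    "series_a": {"Denton": 0.30, "Cholette-Dagum": 0.20, "Kruger": 0.50},
    "series_b": {"Denton": 0.10, "Cholette-Dagum": 0.15, "Kruger": 0.40},
    "series_c": {"Denton": 0.25, "Cholette-Dagum": 0.05, "Kruger": 0.20},
}
print(average_ranks(scores))
```

Repeating this for each series length gives one average rank per method and length, which is the quantity plotted for each quality measure.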
40. Recommended method
- Around 100 plots compared
- Judgment made on the overall best-performing method
- Based on good performance and lack of bad performance
- Recommended method: Cholette-Dagum (0.8)
41. Summary
- Aim: recommend a method for benchmarking to ONS and the wider GSS
- Update of ONS computer systems prompted the examination of methods
- Used several quality measures to rank methods
- Made a judgment to combine results from different quality measures
- Recommended Cholette-Dagum (0.8)
42. Any questions?