Bootstrapping - PowerPoint PPT Presentation

About This Presentation
Title:

Bootstrapping

Description:

Bootstrapping the neglected approach to uncertainty Paul Kershaw University of South Australia European Real Estate Society Conference Eindhoven, Nederlands, 15 ... – PowerPoint PPT presentation

Slides: 17
Provided by: PaulK250

Transcript and Presenter's Notes

1
Bootstrapping: the neglected approach to
uncertainty
Paul Kershaw, University of South Australia
  • European Real Estate Society Conference
  • Eindhoven, Netherlands, 15-18 June 2011

2
Overview
  • The history of confidence intervals
  • Pedagogical predilection to a parametric view
  • Real estate research is NOT normal
  • Do not provide a measure of probability
  • Enter the Jackknife
  • Monte Carlo simulation and Bootstrapping
  • The basic algorithms
  • Real World Applications
  • A better mousetrap

3
Introduction
  • The origins of hypothesis testing date back to
    1279
  • Confidence intervals were derived in 1937
  • A confidence interval estimates the uncertainty
    about the true value of some population parameter
  • There was a 50-year lag before medical journals,
    for example, advocated their use
  • The lazy approach is to assume a normal
    distribution

4
Not Normal
  • Very little about real estate can be considered
    to follow a Normal distribution, including
    prices, land area, building area, age, number of
    bedrooms, location, physical condition,
    construction, tenant's covenant, heating, etc.
  • Linear regression techniques are regularly
    applied, and averages, standard errors and
    parametric confidence intervals are proffered.
    Why?
  • Is it because we are taught to do it that way,
    or because we teach it that way: gloss over the
    ignored assumptions and "just give me a number
    from the printout"?

5
Not a measure of Probability
  • This raises the question: what is the confidence
    interval of a correlation coefficient? It leads
    to a second question: why is it so rarely
    reported?
  • What is a realistic confidence interval for a
    computer-generated valuation using a linear
    regression model?
  • Most proprietary automated valuation models
    (AVMs) provide their own, often ill-defined,
    assessment of accuracy that is usually somewhat
    nebulous.

6
Enter the Jackknife
  • Early efforts in the 1950s revolved around the
    Jackknife (Quenouille, M).
  • The Jackknife provides a technique for
    estimating the bias and standard error of an
    estimate irrespective of the shape of the
    underlying distribution.
  • The following example is based upon the work of
    Efron, B. (1993). The data points are LSAT, the
    average score for the class on a national law
    test, and GPA, the average undergraduate
    grade-point average for the class.
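
For reference, the usual jackknife estimates of bias and standard error can be written as below. The notation is an assumption introduced here, not taken from the slides: \hat{\theta} is the statistic computed on all n observations, \hat{\theta}_{(i)} is the same statistic with observation i omitted, and \bar{\theta}_{(\cdot)} is the average of those n leave-one-out values.

```latex
\bar{\theta}_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_{(i)}, \qquad
\widehat{\mathrm{bias}}_{\mathrm{jack}} = (n-1)\left(\bar{\theta}_{(\cdot)} - \hat{\theta}\right), \qquad
\widehat{\mathrm{se}}_{\mathrm{jack}} = \sqrt{\frac{n-1}{n}\sum_{i=1}^{n}\left(\hat{\theta}_{(i)} - \bar{\theta}_{(\cdot)}\right)^{2}}
```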

7
Sample Data

8
The basic algorithms
  • Compute the sample statistic on n separate
    samples of size n-1. Each sample is the original
    data with a single observation omitted.
  • Jackknife heuristic:
  • Remove one data point only and calculate the
    statistic of interest to give estimate 1
  • Restore it, then repeat for each remaining data
    point to give estimates 2, 3, 4, ..., n
  • Calculate the percentiles of interest to obtain
    the confidence interval
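
The leave-one-out loop above might be sketched in Python roughly as follows. This is a minimal illustration, not the presenter's code: the function name `jackknife` and the five sample values are made up for the example, and the bias and standard error use the standard jackknife formulas rather than anything shown on the slides.

```python
import math
import statistics

def jackknife(data, statistic):
    """Leave-one-out jackknife: returns (estimates, bias, standard_error)."""
    n = len(data)
    full = statistic(data)
    # Estimate i is the statistic computed with observation i omitted.
    estimates = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    mean_est = sum(estimates) / n
    # Standard jackknife bias and standard error formulas.
    bias = (n - 1) * (mean_est - full)
    se = math.sqrt((n - 1) / n * sum((e - mean_est) ** 2 for e in estimates))
    return estimates, bias, se

# Illustrative values only (NOT the LSAT/GPA data from the slides).
sample = [2.0, 4.0, 6.0, 8.0, 10.0]
estimates, bias, se = jackknife(sample, statistics.mean)
```

The list `estimates` holds the n leave-one-out values that the final step of the heuristic would take percentiles of. For the mean, the jackknife standard error coincides with the familiar s/sqrt(n), which makes a convenient sanity check.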

9
Jackknife Calculations

10
Monte Carlo Bootstrapping
  • Monte Carlo simulation caught the imagination of
    practitioners and researchers following Hertz,
    D. (1964), Harvard Business Review
  • Monte Carlo simulation uses repeated sampling to
    determine the properties of some result of
    interest
  • The re-sampling is carried out with replacement
  • If we apply this technique to the previous
    Jackknife data we would be Bootstrapping (cf.
    The Adventures of Baron Munchausen)
  • Bootstrapping is repeatedly re-sampling with
    replacement, calculating the statistic of
    interest and recording its distribution.
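
As a rough sketch of that definition applied to a correlation coefficient, the Python below resamples (x, y) pairs with replacement and reads a percentile interval off the sorted results. Everything named here is an assumption for illustration: the helper names, the ten made-up data pairs, the fixed seed, and the simple index-based percentile convention.

```python
import math
import random

def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_distribution(pairs, statistic, iterations=1000, seed=0):
    """Resample whole pairs with replacement; return the sorted statistics."""
    rng = random.Random(seed)
    n = len(pairs)
    results = []
    for _ in range(iterations):
        resample = [pairs[rng.randrange(n)] for _ in range(n)]
        results.append(statistic(resample))
    return sorted(results)

# Made-up paired observations, for illustration only.
pairs = [(1, 2.1), (2, 2.9), (3, 3.2), (4, 4.8), (5, 4.1),
         (6, 6.3), (7, 6.0), (8, 8.4), (9, 7.7), (10, 9.9)]
dist = bootstrap_distribution(pairs, pearson)
# Simple percentile interval: cut 2.5% off each tail of the sorted results.
lower = dist[int(0.025 * len(dist))]
upper = dist[int(0.975 * len(dist)) - 1]
```

Pairs are resampled as units so that the relationship between x and y is preserved in each resample. A resample consisting of one repeated pair would make the correlation undefined; with ten distinct pairs that is vanishingly unlikely, so the sketch does not guard against it.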

11
Bootstrap Algorithm
  • Remark: to calculate the dispersion of the mean
  • DataArray() = n data points
  • Dim MeanResults(1000)
  • For i = 1 to 1000
  • Sum = 0
  • For j = 1 to n
  • Sum = Sum + DataArray(RandomBetween(1, n))
  • Next j
  • MeanResults(i) = Sum / n
  • Next i
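
A near line-for-line Python rendering of this pseudocode might look like the following. It is a sketch under stated assumptions: the function name `bootstrap_means`, the `seed` parameter and the six data values are invented for the example, and `rng.randrange(n)` stands in for `RandomBetween(1, n)` because Python lists are indexed from 0.

```python
import random
import statistics

def bootstrap_means(data, iterations=1000, seed=None):
    """Return the mean of each of `iterations` resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    mean_results = []
    for _ in range(iterations):        # For i = 1 to 1000
        total = 0.0                    # Sum = 0
        for _ in range(n):             # For j = 1 to n
            total += data[rng.randrange(n)]
        mean_results.append(total / n) # MeanResults(i) = Sum / n
    return mean_results

# Illustrative values only.
data = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0]
means = bootstrap_means(data, seed=1)
# The spread of the recorded means is the dispersion the remark refers to.
dispersion = statistics.stdev(means)
```

Passing a fixed `seed` makes a run repeatable; leaving it as `None` gives a different set of resamples each time.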

12
Real World Application 1
  • What annoys me most: residential price change
    reporting and hot spotting
  • Below are sale prices for Q4 2010 and Q1 2011 for
    detached houses in Aberfoyle Park, South
    Australia

              Q4 2010    Q1 2011    Change
    Median    382,500    385,000     0.65%
    Average   409,932    391,946    -4.39%

13
Bootstrap Results: 1000 iterations
  • The degree of uncertainty is clearly illustrated.
  • The median has a 95% confidence interval of .
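
One way such an interval for the quarter-on-quarter change in the median could be produced is sketched below. The individual sale prices are not in this transcript, so the two price lists are invented purely for illustration, and resampling each quarter independently is an assumption about the method rather than a description of how the slide's results were obtained.

```python
import random
import statistics

def bootstrap_median_change(before, after, iterations=1000, seed=0):
    """Sorted percentage changes in the median across bootstrap resamples."""
    rng = random.Random(seed)
    changes = []
    for _ in range(iterations):
        # Resample each quarter separately, with replacement, at its own size.
        b = rng.choices(before, k=len(before))
        a = rng.choices(after, k=len(after))
        change = 100.0 * (statistics.median(a) / statistics.median(b) - 1.0)
        changes.append(change)
    changes.sort()
    return changes

# Hypothetical sale prices, NOT the Aberfoyle Park data.
q4_2010 = [355000, 362000, 371500, 380000, 385000, 392000, 410000, 455000, 520000]
q1_2011 = [348000, 360000, 372000, 383000, 387000, 395000, 402000, 430000]

changes = bootstrap_median_change(q4_2010, q1_2011)
# Simple 95% percentile interval from the sorted results.
lower = changes[int(0.025 * len(changes))]
upper = changes[int(0.975 * len(changes)) - 1]
```

If the resulting interval spans zero, a reported rise or fall in the median cannot be distinguished from sampling noise, which is the concern raised about price change reporting.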

14
A better mousetrap
  • The traditional approach is to select n from n
    with replacement, calculate the statistic of
    interest, and repeat m times
  • This is inefficient for most statistics of
    interest, including the mean, median, standard
    deviation or correlation coefficient
  • For example, the mean is sum/n
  • If for each iteration we remove just one random
    element and replace it with another random
    element, we can adjust the sum by subtracting the
    value of the removed element and adding the value
    of the incoming element
  • If n is, say, 50, we save 48 mathematical
    operations per iteration
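
The running-sum idea could be sketched in Python as below. This is one reading of the slide, labelled as an assumption: the current resample is kept in memory, one randomly chosen slot is overwritten per iteration with a random element of the original data, and the sum is adjusted by the difference. The function name and the data values are invented for the example.

```python
import random

def incremental_bootstrap_means(data, iterations=1000, seed=0):
    """Bootstrap means via a running sum, changing one element per iteration."""
    rng = random.Random(seed)
    n = len(data)
    # Start from one ordinary resample drawn with replacement.
    resample = [data[rng.randrange(n)] for _ in range(n)]
    total = sum(resample)
    means = []
    for _ in range(iterations):
        slot = rng.randrange(n)            # position in the resample to overwrite
        incoming = data[rng.randrange(n)]  # random element of the original data
        total += incoming - resample[slot] # subtract outgoing, add incoming
        resample[slot] = incoming
        means.append(total / n)
    return means, resample, total

# Illustrative values only.
data = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 18.0, 13.0]
means, final_resample, final_total = incremental_bootstrap_means(data)
```

Each update costs one subtraction and one addition instead of n additions. Because consecutive resamples differ in at most one element, consecutive means are strongly correlated, so more iterations may be needed than with independent resamples to trace out the same distribution.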

15
Summary
  • The bootstrap is simple to implement
  • The results are meaningful and easy to interpret
  • No specious assumptions regarding underlying
    distributions are required
  • Widely accepted
  • It should be embraced by all researchers and
    practitioners

16
Yesteryear's Joys
  • Efron, B. (1979). Bootstrap Methods: Another Look
    at the Jackknife. The Annals of Statistics,
    Volume 7, Number 1, 1-26.
