Title: Linear Regression Example
1 Linear Regression Example
- A homeowner recorded the amount of electricity, in kilowatt-hours (KWH), consumed in his house on each of 21 days. He also recorded the number of hours his air conditioner (AC) was turned on and the number of times his electric clothes dryer (DRYER) was operated.
- His objective was to relate the KWH consumption to the AC and DRYER usage. In particular, he wanted to know how many KWHs the AC used per hour and how many KWHs each run of the DRYER used.
- Statistical regression analysis can serve this purpose.
2 KWH Consumption Data Related to AC and Dryer
Day     1     2     3     4     5     6
AC      1.5   4.5   5.0   2.0   8.5   6.0
DRYER   1     2     2     0     3     3
KWH     35    63    66    17    94    79
3 Plot First
4 Linear Regression Equation
- We shall obtain an equation $\mathrm{KWH} = b_0 + b_1\,(\mathrm{AC})$ that quantifies the rate of increase in KWH as a function of AC. This is the equation of a straight line with $b_0$ the intercept and $b_1$ the slope.
- The equation turns out to be $\mathrm{KWH} = 27.85 + 5.34\,(\mathrm{AC})$
5 Regression Line Plotted through Data
6 Some things to think about
- What is an appropriate model?
- How do we estimate the unknown coefficients?
- How accurate are these estimates?
- Does the model (straight line) represent the data?
- Can the regression equation be used to predict future responses? How accurate are these predictions?
- Since the responses are not deterministic, most of the answers are statistical; i.e., decisions are not 100% accurate, but we will find the best methods and report the confidence of our decisions.
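7 Simple Linear Regression Model
$y = \beta_0 + \beta_1 x + \varepsilon$
- $y$: value of the dependent variable
- $x$: value of the independent variable
- $\beta_0$: intercept of the population regression line
- $\beta_1$: slope of the population regression line
- $\varepsilon$: random error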
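To make the role of the random error concrete, the sketch below simulates responses from a straight-line model $y = \beta_0 + \beta_1 x + \varepsilon$. It assumes normally distributed errors, which is a common convention but is not stated in the surviving slide text. The parameter values and the helper name `simulate_kwh` are illustrative choices only, not the true population values.

```python
import random

def simulate_kwh(ac_hours, beta0, beta1, sigma, rng):
    """Draw one response per x value from y = beta0 + beta1*x + error.

    Errors are assumed normal with mean 0 and standard deviation sigma
    (an assumption made for this illustration).
    """
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in ac_hours]

rng = random.Random(0)                      # fixed seed for repeatability
ac_hours = [1.5, 4.5, 5.0, 2.0, 8.5, 6.0]   # AC hours for days 1-6 from the data slide

# Hypothetical population parameters chosen for illustration.
simulated = simulate_kwh(ac_hours, beta0=27.85, beta1=5.34, sigma=14.5, rng=rng)

# With sigma = 0 the error vanishes and every point falls exactly on the line.
noiseless = simulate_kwh(ac_hours, beta0=27.85, beta1=5.34, sigma=0.0, rng=rng)

for x, y_noisy, y_line in zip(ac_hours, simulated, noiseless):
    print(f"AC={x:4.1f}  simulated KWH={y_noisy:6.1f}  line={y_line:6.2f}")
```

Rerunning with a different seed gives different simulated responses around the same line, which is why the estimates and predictions are statistical rather than exact.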
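8 Fitting the Simple Linear Regression Model
- Summary Statistics for KWH Data ($x$ = AC hours, $y$ = KWH, $n = 21$)
- $\bar{x} = 145.5/21 = 6.93$
- $\bar{y} = 1362/21 = 64.86$
- $S_{xx} = \sum x^2 - (\sum x)^2/n = 1204.75 - 145.5^2/21 = 1204.75 - 1008.11 = 196.64$
- $S_{yy} = \sum y^2 - (\sum y)^2/n = 97914 - 1362^2/21 = 97914 - 88335.4 = 9578.6$
- $S_{xy} = \sum xy - (\sum x)(\sum y)/n = 10487 - (145.5)(1362)/21 = 10487 - 9436.7 = 1050.3$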
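The arithmetic on this slide can be checked with a short script. This is a minimal sketch that starts from the raw sums quoted above (145.5, 1362, 1204.75, 97914, 10487, with 21 days). The function and variable names are illustrative and do not come from the slides.

```python
# Raw sums for the 21 days, as quoted on the summary-statistics slide.
n = 21
sum_x = 145.5      # total AC hours
sum_y = 1362.0     # total KWH
sum_x2 = 1204.75   # sum of squared AC hours
sum_y2 = 97914.0   # sum of squared KWH
sum_xy = 10487.0   # sum of (AC hours * KWH)

def corrected_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return (x_bar, y_bar, Sxx, Syy, Sxy) computed from raw sums."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return x_bar, y_bar, s_xx, s_yy, s_xy

x_bar, y_bar, s_xx, s_yy, s_xy = corrected_sums(
    n, sum_x, sum_y, sum_x2, sum_y2, sum_xy
)
print(f"x_bar={x_bar:.2f}  y_bar={y_bar:.2f}")
print(f"Sxx={s_xx:.2f}  Syy={s_yy:.1f}  Sxy={s_xy:.1f}")
```

Working from raw sums mirrors the hand calculation; with the full 21-day data set one would normally compute the same quantities directly from the observations.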
9 Fitting the Simple Linear Regression Model
- Parameter Estimates
- $b_1 = S_{xy}/S_{xx} = 1050.3/196.64 = 5.34$
- $b_0 = \bar{y} - b_1\bar{x} = 64.86 - (5.34)(6.93) = 27.85$
- $\mathrm{KWH} = 27.85 + 5.34\,(\mathrm{AC})$
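The least-squares estimates follow directly from the summary statistics. The sketch below reproduces them and wraps the fitted line in a small prediction helper. The names `fit_line` and `predict_kwh` are hypothetical, and the inputs are the rounded values shown on the slides.

```python
def fit_line(x_bar, y_bar, s_xx, s_xy):
    """Least-squares slope and intercept for simple linear regression."""
    b1 = s_xy / s_xx           # slope: KWH per additional AC hour
    b0 = y_bar - b1 * x_bar    # intercept: KWH when the AC is off
    return b0, b1

# Summary statistics from the slides (n = 21 days).
b0, b1 = fit_line(x_bar=145.5 / 21, y_bar=1362 / 21, s_xx=196.64, s_xy=1050.3)

def predict_kwh(ac_hours):
    """Estimated mean KWH for a day with the given number of AC hours."""
    return b0 + b1 * ac_hours

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
print(f"Estimated KWH at 10 AC hours: {predict_kwh(10):.2f}")
```

The slope is the quantity the homeowner asked about: the estimated KWH used per hour of AC operation. The prediction can differ from a hand calculation in the second decimal place because the script does not round the coefficients first.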
10 Assessing Model Fit
- Coefficient of determination $R^2$
- $R^2 = \mathrm{SS(regression)}/\mathrm{SS(no\ model)} = 5609.9/9578.6 = 0.586$
- About 58.6% of the variation in KWH is explained by variation in AC
- $R^2$ = (Variation due to model)/(Total variation)
- Many other methods
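The slide quotes SS(regression) = 5609.9 without showing where it comes from. For a straight-line fit, a standard identity is SS(regression) = Sxy^2 / Sxx, and the sketch below uses it, together with the rounded summary statistics, to reproduce the figures. Variable names are illustrative.

```python
# Rounded summary statistics from the fitting slides.
s_xx = 196.64    # corrected sum of squares of AC hours
s_yy = 9578.6    # total variation in KWH, i.e. SS(no model)
s_xy = 1050.3    # corrected sum of cross products

# For simple linear regression, SS(regression) = Sxy^2 / Sxx.
ss_regression = s_xy ** 2 / s_xx
ss_residual = s_yy - ss_regression      # variation left unexplained by the line
r_squared = ss_regression / s_yy

print(f"SS(regression) = {ss_regression:.1f}")
print(f"SS(residual)   = {ss_residual:.1f}")
print(f"R^2            = {r_squared:.3f}")
```

The residual sum of squares divided by n - 2 = 19 gives roughly 208.9, which matches the variance estimate used in the confidence-interval slide.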
11 Inference for a Future Response
Consider the conceivable days when the AC could be turned on for 10 hours. The estimate of the mean of this population of days is
$\mathrm{KWH} = 27.85 + 5.34(10) = 81.25$
A 95% confidence interval for the mean KWH consumption on those days is
$81.25 \pm 2.1\sqrt{208.9\left(\frac{1}{21} + \frac{(10 - 6.93)^2}{196.64}\right)}$
or $81.25 \pm 9.38 = (71.87, 90.63)$.
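The interval can be reproduced numerically. In the sketch below, 208.9 is taken to be the residual mean square and 2.1 the t critical value for 19 degrees of freedom. Both readings are inferred from the numbers on the slides rather than stated there, and the function name `mean_response_ci` is hypothetical.

```python
import math

def mean_response_ci(x0, b0, b1, n, x_bar, s_xx, mse, t_crit):
    """Confidence interval for the mean response at x0 in simple linear regression."""
    y_hat = b0 + b1 * x0
    std_err = math.sqrt(mse * (1 / n + (x0 - x_bar) ** 2 / s_xx))
    half_width = t_crit * std_err
    return y_hat, half_width, (y_hat - half_width, y_hat + half_width)

y_hat, half_width, (lower, upper) = mean_response_ci(
    x0=10,          # AC hours of interest
    b0=27.85,       # fitted intercept
    b1=5.34,        # fitted slope
    n=21,           # number of days
    x_bar=6.93,     # mean AC hours
    s_xx=196.64,    # corrected sum of squares of AC hours
    mse=208.9,      # assumed to be the residual mean square
    t_crit=2.1,     # value used on the slide (approximate t quantile, 19 df)
)
print(f"{y_hat:.2f} +/- {half_width:.2f} = ({lower:.2f}, {upper:.2f})")
```

The half-width grows as x0 moves away from the mean AC usage of 6.93 hours, so estimates far from the observed data are less precise.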
12 A more complete model
13 Two Examples in the book
- Example 11.10. Relation between water/cement ratio (y) and concrete strength (x).
- The book uses a linear model, but a quadratic model is better for real applications, because we wish to find the ratio that gives the highest strength.
- Example 11.11. Relation between alligator weight (w) and length (l). A reasonable model is
14 Application of Statistics to Genetics
Genotype
Phenotype
15 How is a gene found, and why do we need statistics?
16 (Science, 31 March 2006)
17 Domesticated rice
Wild rice
19 Accomplishments
Left: weeds uncontrolled. Right: weeds controlled with transgenic Roundup Ready (herbicide-resistant) soybean.
Herbicide tolerance: Weed control is one of the farmer's biggest challenges in crop production. It is difficult to distinguish weed and crop in nature. Roundup kills many types of plants, including wild-type soybeans.
From http://www.cls.casa.colostate.edu/TransgenicCrops/current.html
20 Corn and cotton
The transgenic corn resists rootworm, and the transgenic cotton resists bollworm.
21 Transgenic plants
- Top: papaya affected by papaya ringspot potyvirus (PRSV)
- Bottom: the difference between transgenic PRSV-resistant papaya and wild type
22 Transgenic animals
Transgenic animals are not easy to produce. Success rate from embryo to maturity: mouse 20%; goat, sheep, cow 5%.
Left: control. Right: with human growth hormone promoter.
From http://www.biotech.iastate.edu/biotech_info_series/bio10.html
23 Firefly gene in tobacco plant
25 We need markers in the genome
- 20 codes: identifiability $1/4^{20} \approx 9 \times 10^{-13}$
- Human genome size: $2.8 \times 10^{9}$
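The two numbers on this slide can be combined in a quick calculation. The sketch below reads "20 codes" as a marker of 20 bases drawn from a 4-letter alphabet and assumes bases are independent and equally likely. Both are simplifying assumptions made here, and the expected-match figure is an added illustration rather than something stated on the slide.

```python
marker_length = 20    # bases in the marker sequence
alphabet_size = 4     # A, C, G, T
genome_size = 2.8e9   # human genome size in bases, from the slide

# Number of distinct sequences of this length.
n_sequences = alphabet_size ** marker_length

# Chance that a random position matches one specific marker,
# assuming independent, equally likely bases.
p_match = 1 / n_sequences

# Rough expected number of chance matches across the whole genome.
expected_chance_matches = genome_size * p_match

print(f"Distinct 20-base sequences: {n_sequences:.3e}")
print(f"Match probability:          {p_match:.1e}")
print(f"Expected chance matches:    {expected_chance_matches:.4f}")
```

Because the expected number of chance matches is far below 1, a 20-base marker is very likely to identify a unique location under these assumptions.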
26 There is no pure breed in the human race.
- Royal hemophilia pedigree: started with Victoria, Queen of England
27 Hemophilia genes were found.
Hem. A, B
colorblind