Title: Data Analysis
Data Analysis
Chapter 8
In this chapter, we focus on 3 parts:
Chapter 6
Data Analysis
- 1. Descriptive Analysis
- 2. Two-way Analysis of Variance
- 3. Forecasting
1. Descriptive Analysis
- 1.1 Index Numbers
- 1.2 Exponential Smoothing
1.1 Index Numbers
- Index number: a number that measures the change in a variable over time relative to the value of the variable during a specific base period.
- Simple index number: an index based on the relative changes (over time) in the price or quantity of a single commodity.
1.1 Index Numbers
- Laspeyres and Paasche Indexes compared
- The Laspeyres Index weights by the purchase quantities of the baseline period.
- The Paasche Index weights by the purchase quantities of the period the index value represents.
- The Laspeyres Index is most appropriate when baseline purchase quantities are reasonable approximations of purchases in subsequent periods.
- The Paasche Index is most appropriate when you want to compare current to baseline prices at current purchase levels.
1.1 Index Numbers
- Calculating a Laspeyres Index
- Collect price information for the k price series (the basket) to be used, denoted $P_{1t}, P_{2t}, \ldots, P_{kt}$.
- Select a base period $t_0$.
- Collect purchase quantity information for the base period, denoted $Q_{1t_0}, Q_{2t_0}, \ldots, Q_{kt_0}$.
- Calculate the weighted total for each time period using the formula $\sum_{i=1}^{k} Q_{it_0} P_{it}$.
- Calculate the index using the formula $I_t = \dfrac{\sum_{i=1}^{k} Q_{it_0} P_{it}}{\sum_{i=1}^{k} Q_{it_0} P_{it_0}} \times 100$.
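The Laspeyres steps can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's worked example: the function name `laspeyres_index`, the two-item basket, and all prices and quantities are hypothetical.

```python
def laspeyres_index(prices, base_quantities, base_period):
    """Return {period: index} using base-period quantities as weights.

    prices: dict mapping each period to a list of k prices.
    base_quantities: list of k purchase quantities in the base period.
    """
    def weighted_total(period):
        # Sum of (base-period quantity x price in the given period).
        return sum(q * p for q, p in zip(base_quantities, prices[period]))

    base_total = weighted_total(base_period)
    return {t: 100.0 * weighted_total(t) / base_total for t in prices}


# Hypothetical basket of two items; 2000 is the base period.
prices = {2000: [2.0, 5.0], 2001: [2.5, 5.5], 2002: [3.0, 6.0]}
base_q = [10, 4]
laspeyres = laspeyres_index(prices, base_q, base_period=2000)
for period, value in sorted(laspeyres.items()):
    print(period, round(value, 1))
```

By construction the index equals 100 in the base period, because numerator and denominator are then the same weighted total.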
1.1 Index Numbers
- Calculating a Paasche Index
- Collect price information for the k price series to be used, denoted $P_{1t}, P_{2t}, \ldots, P_{kt}$.
- Select a base period $t_0$.
- Collect purchase quantity information for every period, denoted $Q_{1t}, Q_{2t}, \ldots, Q_{kt}$.
- Calculate the index for time t using the formula $I_t = \dfrac{\sum_{i=1}^{k} Q_{it} P_{it}}{\sum_{i=1}^{k} Q_{it} P_{it_0}} \times 100$.
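A matching sketch for the Paasche calculation follows. The only structural difference from a Laspeyres index is that the weights are the quantities of the period being indexed. The function name `paasche_index` and all data are hypothetical.

```python
def paasche_index(prices, quantities, base_period):
    """Return {period: index} using each period's own quantities as weights.

    prices, quantities: dicts mapping each period to a list of k values.
    """
    base_prices = prices[base_period]
    index = {}
    for t in prices:
        # Current quantities valued at current prices ...
        current = sum(q * p for q, p in zip(quantities[t], prices[t]))
        # ... versus the same quantities valued at base-period prices.
        at_base = sum(q * p for q, p in zip(quantities[t], base_prices))
        index[t] = 100.0 * current / at_base
    return index


# Hypothetical basket of two items; 2000 is the base period.
prices = {2000: [2.0, 5.0], 2001: [2.5, 5.5], 2002: [3.0, 6.0]}
quantities = {2000: [10, 4], 2001: [9, 5], 2002: [8, 6]}
paasche = paasche_index(prices, quantities, base_period=2000)
for period, value in sorted(paasche.items()):
    print(period, round(value, 1))
```

Because quantities are needed for every period, a Paasche index requires more data collection than a Laspeyres index over the same basket.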
1.2 Exponential Smoothing
- Exponential smoothing is a type of weighted average that applies a weight w to past and current values of the time series ($Y_t$: actual value).
- The exponential smoothing constant w lies between 0 and 1, and the smoothed series $E_t$ is calculated as $E_1 = Y_1$, $E_t = w Y_t + (1 - w) E_{t-1}$.
- How much influence does the past have when w = 0 and when w = 1?
1.2 Exponential Smoothing
- The smoothing constant w is selected by the researcher.
- Small values of w give less weight to the current value and yield a smoother series.
- Large values of w give more weight to the current value and yield a more variable series.
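The effect of w can be checked numerically. The sketch below assumes the usual recursion in which the first smoothed value equals the first observation and each later value is w times the current observation plus (1 - w) times the previous smoothed value. The function name and the sample series are hypothetical.

```python
def exponential_smooth(series, w):
    """Return the exponentially smoothed series for constant w in [0, 1]."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    smoothed = [series[0]]  # start from the first observation
    for y in series[1:]:
        # Weight w on the current value, (1 - w) on the smoothed past.
        smoothed.append(w * y + (1.0 - w) * smoothed[-1])
    return smoothed


series = [10.0, 12.0, 8.0, 14.0, 9.0]  # hypothetical observations
smooth_small = exponential_smooth(series, 0.1)
smooth_large = exponential_smooth(series, 0.9)
print([round(v, 2) for v in smooth_small])
print([round(v, 2) for v in smooth_large])
```

The two extreme cases show how much the past matters: with w = 0 the smoothed series never leaves its starting value, so the past has all the influence, and with w = 1 it reproduces the observations exactly, so the past has none.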
2. Two-way Analysis of Variance
- Two-way ANOVA is a type of study design with one numerical outcome variable and two categorical explanatory variables.
- Example: in a completely randomised design we may wish to compare outcome by age, gender or disease severity. Subjects are grouped by one such factor and then randomly assigned one treatment.
- The technical term for such a group is a block, and the study design is also called a randomised block design.
2. Two-way Analysis of Variance
- 2.1 Randomised Block Design
- 2.2 Analysis in Two-way ANOVA
- 2.3 Analysis of Two-way ANOVA by the regression method
2.1 Randomised Block Design
- Blocks are formed on the basis of expected homogeneity of response in each block (or group).
- The purpose is to reduce variation in response within each block (or group) due to biological differences between individual subjects on account of age, sex or severity of disease.
2.1 Randomised Block Design
- The randomised block design is a more robust design than the simple randomised design.
- The investigator can take into account simultaneously the effects of two factors on an outcome of interest.
- Additionally, the investigator can test for interaction, if any, between the two factors.
Steps in Planning a Randomised Block Design
2.1 Randomised Block Design
- Subjects are randomly selected to constitute a random sample.
- Subjects likely to have a similar response (homogeneity) are put together to form a block.
- Within each block, the interventions are assigned so that each subject receives one treatment.
- Comparisons of treatment outcomes are made within each block.
2.2 Analysis in Two-way ANOVA - 1
- The variance (total sum of squares) is first partitioned into WITHIN and BETWEEN sums of squares. The BETWEEN sum of squares is next partitioned by intervention, blocking and interaction.
SS TOTAL = SS BETWEEN + SS WITHIN
SS BETWEEN = SS INTERVENTION + SS BLOCKING + SS INTERACTION
2.2 Analysis in Two-way ANOVA - 1
Analysis of two-way ANOVA is demonstrated in the slides that follow. The study is an experiment involving a teaching method in which professional actors were brought in to play the role of patients in a medical school. The test scores of male and female students who were taught either by the conventional method of lectures, seminars and tutorials or by the role-play method were recorded. The hypotheses being tested are:
- The role-play method is superior to the conventional way of teaching.
- Female students in general have better test scores than male students.
- The role-play method makes a better impact on students of a particular gender.
Thus, there are two factors, gender and teaching method, and an interaction between teaching method and gender is being sought.
2.2 Analysis in Two-way ANOVA - 2
- Each sum of squares (SS) is divided by its degrees of freedom (df) to get the mean sum of squares (MS).
- The F statistic is computed for each of the three ratios as:
- MS INTERVENTION / MS WITHIN
- MS BLOCK / MS WITHIN
- MS INTERACTION / MS WITHIN
2.2 Analysis of Two-way ANOVA - 3
Analysis of Variance for score
Source     DF     SS     MS      F      P
sex         1   2839   2839  22.75  0.000
Tchmthd     1   1782   1782  14.28  0.001
Error      29   3619    125
Total      31   8240
2.2 Analysis of Two-way ANOVA - 4
Individual 95% CIs for the group means (interval plots; axis 40.0 to 64.0 for sex, 42.0 to 63.0 for teaching method)
Sex        Mean
0          58.5
1          39.6
Tchmthd    Mean
0          56.5
1          41.6
2.2 Analysis of Two-way ANOVA - 5
Analysis of Variance for SCORE
Source      DF     SS     MS      F      P
SEX          1   2839   2839  22.64  0.000
TCHMTHD      1   1782   1782  14.21  0.001
INTERACTN    1    108    108   0.86  0.361
Error       28   3511    125
Total       31   8240
Interaction is not significant (P = 0.361).
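The F column of this table can be recomputed from its DF and SS columns, which is a useful check on how mean squares and F ratios are formed. The sketch below uses only the figures printed in the table; the dictionary layout and the function name `f_ratios` are illustrative choices, and p-values are not computed.

```python
# (degrees of freedom, sum of squares) per source, taken from the table.
anova = {
    "SEX": (1, 2839.0),
    "TCHMTHD": (1, 1782.0),
    "INTERACTN": (1, 108.0),
    "Error": (28, 3511.0),
}


def f_ratios(table, error_key="Error"):
    """Return {source: F} where F = (SS / df) / (SS_error / df_error)."""
    df_error, ss_error = table[error_key]
    ms_error = ss_error / df_error  # mean square WITHIN
    return {
        source: (ss / df) / ms_error
        for source, (df, ss) in table.items()
        if source != error_key
    }


f_stats = f_ratios(anova)
for source, f_value in f_stats.items():
    print(source, round(f_value, 2))
```

The recomputed ratios agree with the printed F values to two decimal places.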
2.2 Analysis of Two-way ANOVA - 6
Individual 95% CIs for the group means (interval plots; axis 40.0 to 64.0 for SEX, 42.0 to 63.0 for TCHMTHD)
SEX        Mean
0          58.5
1          39.6
TCHMTHD    Mean
0          56.5
1          41.6
2.3 Analysis of Two-way ANOVA by the regression method (reference coding)
The regression equation is
SCORE = 65.9 - 18.8 SEX - 14.9 TCHMTHD
Predictor      Coef  SE Coef      T      P
Constant     65.913    3.420  19.27  0.000
SEX         -18.838    3.950  -4.77  0.000
TCHMTHD     -14.925    3.950  -3.78  0.001
S = 11.17   R-Sq = 56.1%   R-Sq(adj) = 53.1%
Analysis of Variance
Source           DF      SS      MS      F      P
Regression        2  4620.9  2310.4  18.51  0.000
Residual Error   29  3619.0   124.8
Total            31  8239.8
2.3 Analysis of Two-way ANOVA by the regression method (effect coding)
The regression equation is
SCORE = 49.0 - 9.42 EFCT-Sex - 7.46 EFCT-Tchmthd - 1.84 Interaction
Predictor     Coef  SE Coef      T      P
Constant    49.031    1.980  24.77  0.000
EFCT-Sex    -9.419    1.980  -4.76  0.000
EFCT-Tch    -7.463    1.980  -3.77  0.001
Interact    -1.838    1.980  -0.93  0.361
S = 11.20   R-Sq = 57.4%   R-Sq(adj) = 52.8%
Reference Coding and Effect Coding - 1
- In both methods, for an explanatory variable with k categories, k-1 dummy variables are created.
- In reference coding the value 1 is assigned to the group of interest and 0 to all others (e.g. Female = 1, Male = 0).
- In effect coding the value -1 is assigned to the control group, 1 to the group of interest (e.g. new treatment), and 0 to all others (e.g. Female = 1, Male (control group) = -1; Role play = 1, conventional teaching (control) = -1).
Reference Coding and Effect Coding - 2
- In reference coding the β coefficients of the regression equation provide estimates of the differences in means from the control (reference) group for the various treatment groups.
- In effect coding the β coefficients provide the differences from the overall mean response for each treatment group.
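The two interpretations can be seen on a toy example with one two-level factor. The sketch below fits a one-predictor least-squares line under each coding. The scores, group labels, and helper name `ols_simple` are hypothetical, and the groups are balanced so that the overall mean equals the mean of the group means.

```python
def ols_simple(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical scores: control mean 60, treated mean 40.
scores = [60.0, 58.0, 62.0, 40.0, 38.0, 42.0]
group = ["control"] * 3 + ["treated"] * 3

reference = [0 if g == "control" else 1 for g in group]  # 0/1 coding
effect = [-1 if g == "control" else 1 for g in group]    # -1/+1 coding

a_ref, b_ref = ols_simple(reference, scores)
a_eff, b_eff = ols_simple(effect, scores)
print("reference coding:", a_ref, b_ref)  # control mean, treated - control
print("effect coding:   ", a_eff, b_eff)  # overall mean, treated - overall
```

With two balanced levels the effect-coded coefficient is half the reference-coded one. The same pattern appears in the regression outputs shown earlier, where -9.419 and -7.463 are about half of -18.838 and -14.925.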
3. Forecasting
- 3.1 The concept of market forecast
- 3.2 The theoretical bases of forecast
- 3.3 The classification of forecast methods
- 3.4 Qualitative Forecast Methods
- 3.5 Quantitative Forecast Methods
3.1 The concept of market forecast
- Market forecasting is the process of estimating, on the basis of market surveys and scientific methods, how the forecast objects will develop over a certain future period, in order to help managers improve the quality of their decisions.
- In this chapter, the forecast objects are mainly the demand quantities of products; sometimes they may also be product prices, competitive situations, environmental factors, and so on.
3.2 The theoretical bases of forecast
- (1) The continuity principle
- It is also called the inertia principle. Because of inertia, a system does not change its basic characteristics in the short run.
- Note: all time series analysis methods are based on this principle.
3.2 The theoretical bases of forecast
- (2) The analogy principle
- Time analogy: making an inference about the future from the past and the present. When two or more things share characteristics (structure, mode, properties, and development tendency), we can forecast the developing and not-yet-developed things by studying those that are already developed or more advanced.
- Note: analogy applies to homogeneous things and also to heterogeneous things.
3.2 The theoretical bases of forecast
- (2) The analogy principle (continued)
- Sampling analogy: making an inference about the whole from the part. When the whole and the part share characteristics, we can forecast the whole by studying the part.
- Note: similarity is the key point, whether between things at different stages of development or between the whole and the part.
3.2 The theoretical bases of forecast
- (3) The relevancy principle
- This principle holds that things are related to one another, especially things that are correlated or causally linked. All statistical regression analysis methods are based on this principle.
3.3 The classification of forecast methods
- Although there are many forecasting methods in theory, forecasts can in general be classified into two types:
- qualitative forecast
- quantitative forecast
3.3.1 Qualitative forecast
- Qualitative forecasting emphasizes development tendencies (and possibly essential characteristics) and is suitable for cases where data are few or lacking, such as science and technology forecasts, development forecasts for infant industries, long-term forecasts, and forecasts of highly uncertain things.
3.3.2 Quantitative forecast
- Quantitative forecasting emphasizes the quantitative relationships of developing things. Essentially it is a family of methods based on quantitative trend extrapolation, and it is suitable for cases where plenty of data are available.
3.3.3 The comparison of two methods
- Qualitative forecasting can contribute to the analysis of basic trends, turning points in development, and the essence of things. Quantitative forecasting gives us a numerical picture of development and makes forecast results convenient to apply. Neither method should be preferred exclusively; otherwise we are likely to misuse forecasting methods.
3.4 Qualitative Forecast Methods
- Delphi method
- Social investigation or consumer survey
- Pooling sellers' opinions
- Informal team discussion
- Integration of experts' forecasts
- The method of subjective probabilities
- All of the above are non-model methods.
3.5 Quantitative Forecast Methods
- Exponential Smoothing
- Mathematical model
- Symbols and meanings: explain every symbol and its meaning.
- Value of α: the greater α is, the more influence the most recent sample observations have on the forecast results, and vice versa. Recommendation: α = 2/(n+1).
3.5 Quantitative Forecast Methods
- Exponential Smoothing
- Mathematical model: horizontal trend
- Mathematical model: linear trend
- Mathematical model: quadratic curve trend
- How to choose among the mathematical models according to the trend of the sample observations on a coordinate diagram.
3.5 Quantitative Forecast Methods
- Exponential Smoothing
- How to determine the initial values of the smoothing parameters: in general, the first observation is used in their place.
- Advantages of exponential smoothing: only a small amount of data needs to be stored, and it is suitable for short-run forecasting.
- Application cases: refer to other teaching materials.
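The model formulas did not survive in the slide text, so the following is only a sketch under an explicit assumption: that the linear-trend model is Brown's double exponential smoothing, a common textbook choice. The function name and the sample data are hypothetical. As the slides suggest, both smoothed series start from the first observation.

```python
def brown_linear_forecast(series, alpha, horizon=1):
    """Forecast `horizon` steps ahead with Brown's double smoothing.

    Assumes 0 < alpha < 1; both smoothed series start at series[0].
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    s1 = s2 = series[0]
    for y in series[1:]:
        s1 = alpha * y + (1.0 - alpha) * s1   # first smoothing
        s2 = alpha * s1 + (1.0 - alpha) * s2  # smoothing of the smoothed series
    level = 2.0 * s1 - s2
    slope = alpha / (1.0 - alpha) * (s1 - s2)
    return level + slope * horizon


# Hypothetical series with a steady upward trend of one unit per period.
trend_series = [float(t) for t in range(40)]
print(round(brown_linear_forecast(trend_series, alpha=0.5), 4))
```

Only the two running values need to be stored between updates, which matches the low storage requirement noted above.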
3.5 Quantitative Forecast Methods
- The growth curve
- Mathematical model
- Logistic curve
- Gompertz curve
3.5 Quantitative Forecast Methods
- The growth curve
- Mathematical processing of the initial observations
- 1. For the Logistic curve
- 2. For the Gompertz curve
The processed observations can be used to calculate the parameters k, a and b. The calculation formulas are as follows.
3.5 Quantitative Forecast Methods
- Calculation of k, a, and b
Note: the processed observations must be divided into 3 groups, so that we obtain 3 sums.
When the number of initial data points is not an integer multiple of 3, we must add or drop initial data points.
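The calculation formulas are missing from the slide text. The sketch below therefore assumes the classical three-sum method for a modified exponential curve $z_t = k + a b^t$ with t = 0, 1, ..., 3r-1. Under the usual textbook forms, which are also an assumption here, the processed data z would be reciprocals of the observations for a Logistic curve and logarithms for a Gompertz curve. The function name and test data are hypothetical.

```python
def three_sum_fit(values):
    """Estimate (k, a, b) in z_t = k + a * b**t from three equal group sums."""
    n = len(values)
    if n == 0 or n % 3 != 0:
        raise ValueError("number of observations must be a multiple of 3")
    r = n // 3
    s1, s2, s3 = (sum(values[i * r:(i + 1) * r]) for i in range(3))
    # (s3 - s2) / (s2 - s1) equals b**r for data that follow the curve;
    # the ratio must be positive for the r-th root to be a real number.
    b = ((s3 - s2) / (s2 - s1)) ** (1.0 / r)
    a = (s2 - s1) * (b - 1.0) / (b ** r - 1.0) ** 2
    k = (s1 - a * (b ** r - 1.0) / (b - 1.0)) / r
    return k, a, b


# Hypothetical processed data generated from k = 100, a = -80, b = 0.8.
z = [100.0 - 80.0 * 0.8 ** t for t in range(9)]
k_hat, a_hat, b_hat = three_sum_fit(z)
print(round(k_hat, 4), round(a_hat, 4), round(b_hat, 4))
```

The method needs exactly three groups of equal size, which is why observations are added or dropped when the count is not a multiple of 3.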
3.5 Quantitative Forecast Methods
- Linear regression
- One independent variable and one dependent variable are chosen for the model, and the relationship between y and x is linear. This model is widely applied in quantitative forecasting.
- The standard model
- $y = a + bx$
- A non-standard equation must be transformed into the standard model.
3.5 Quantitative Forecast Methods
- Linear regression
- Determination of the coefficients a and b
- By the method of least squares, minimize the sum of squared deviations Q, calculated as $Q = \sum_{i=1}^{n} (y_i - a - b x_i)^2$,
- and set the partial derivatives of Q with respect to a and b equal to 0; solving the resulting equations gives a and b.
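Setting the two derivatives of Q to zero leads to the familiar closed form $b = \dfrac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - (\sum x_i)^2}$ and $a = \bar{y} - b\bar{x}$. A minimal sketch of that calculation follows; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (a, b) minimizing the sum of squared deviations of y - a - b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n  # same as mean(y) - b * mean(x)
    return a, b


x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical independent variable
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical dependent variable
a, b = fit_line(x, y)
print(round(a, 4), round(b, 4))
```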
3.5 Quantitative Forecast Methods
- Linear regression
- The forecast model is then $\hat{y} = a + bx$.
It is necessary to check whether the built model is of high quality. The checking methods are:
1. Standard error analysis
- In general, the following is required
2. Correlation coefficient and test of significance. The correlation coefficient is calculated as
3.5 Quantitative Forecast Methods
- Linear regression
- Discussion of the correlation coefficient R
- When R = 0, y has no correlation with x; this case is called zero correlation, and the built model cannot be applied to forecasting.
- When R = 1, y has a perfect direct correlation with x.
- In general, R is required to satisfy R > 0.7. When R < 0.3, the built model cannot be applied. When R lies between 0.3 and 0.7, the model is poor and of little value.
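The rule of thumb above can be sketched as follows, using the standard product-moment correlation coefficient. Two assumptions are made here that the slides do not state: the thresholds are applied to the absolute value of R so that strong negative correlations also qualify, and values exactly at 0.3 or 0.7 are placed in the middle band. Function names and verdict labels are hypothetical.

```python
import math


def correlation(x, y):
    """Product-moment correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def judge_model(r):
    """Classify a model by the 0.3 / 0.7 rule of thumb (on |R|, assumed)."""
    strength = abs(r)
    if strength > 0.7:
        return "acceptable"
    if strength < 0.3:
        return "not applicable"
    return "poor"


r = correlation([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])
print(round(r, 4), judge_model(r))
```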
3.5 Quantitative Forecast Methods
- Linear regression
- The quality of the regression model is also tested by significance.
- If $R > R_{\alpha}(n-m)$, the built model is good and worth applying; on the contrary, if $R \le R_{\alpha}(n-m)$, the built model is worthless.
- $R_{\alpha}(n-m)$ is the critical value of the correlation coefficient R, found by looking it up in the given table. α is the chosen level of significance, such as 0.05. (n-m) is the degrees of freedom, such as n-2, where m is the number of variables.
3.5 Quantitative Forecast Methods
- Linear regression
- Application of the model: if the future value of x is known to be $x_0$, the interval for the forecast variable is
Here, $s_0$ is determined by the formula
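The interval and the formula for $s_0$ did not survive in the slide text. A common textbook form for simple linear regression is $\hat{y}_0 \pm t \, s_0$ with $s_0 = s\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$ and $s = \sqrt{\sum (y_i - \hat{y}_i)^2 / (n-2)}$. The sketch below assumes that form. The critical value t is passed in by the caller, as if read from a t table, and the function name and data are hypothetical.

```python
import math


def prediction_interval(x, y, x0, t_crit):
    """Return (low, high) for a new observation at x0 (assumed textbook form)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    s0 = s * math.sqrt(1.0 + 1.0 / n + (x0 - mean_x) ** 2 / sxx)
    y0 = a + b * x0               # point forecast
    return y0 - t_crit * s0, y0 + t_crit * s0


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
# 3.182 is the two-sided 5% critical value of t with 3 degrees of freedom.
low, high = prediction_interval(x, y, x0=6.0, t_crit=3.182)
print(round(low, 3), round(high, 3))
```

The interval widens as $x_0$ moves away from the mean of the observed x values, which is one reason regression forecasts far outside the data range are less reliable.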
3.5 Quantitative Forecast Methods
- Linear regression
- The critical value used in the interval comes from the t-distribution with significance level α and n-m-1 degrees of freedom, where n is the number of observations and m is the number of variables.
- In addition, many non-linear equations can be transformed into linear regressions by a change of variables. After the transformation we obtain a linear equation; the same approach works for exponential, logarithmic, and reciprocal functions, among others. Such functions are said to be linearizable single-variable regressions.
Application case: see other teaching materials.
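As one illustration of such a transformation, an exponential curve $y = A e^{Bx}$ becomes the straight line $\ln y = \ln A + Bx$ after taking logarithms, so ordinary least squares can be applied to the pairs $(x, \ln y)$. The sketch below assumes this particular curve; the slide's own example is not recoverable. The function name and data are hypothetical.

```python
import math


def fit_exponential(x, y):
    """Fit y = A * exp(B * x) by least squares on (x, ln y); requires y > 0."""
    ln_y = [math.log(v) for v in y]
    n = len(x)
    mean_x, mean_ln = sum(x) / n, sum(ln_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (li - mean_ln) for xi, li in zip(x, ln_y)) / sxx
    intercept = mean_ln - slope * mean_x
    return math.exp(intercept), slope  # back-transform the intercept to A


x = [0.0, 1.0, 2.0, 3.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]  # hypothetical exact data
A, B = fit_exponential(x, y)
print(round(A, 4), round(B, 4))
```

Note that least squares on the transformed scale minimizes errors in ln y rather than in y, so the fit is not identical to a direct non-linear fit.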
3.5 Quantitative Forecast Methods
- Analogy forecasting method: an application case
- We can forecast a target variable by studying the relationship between that variable and an economic indicator (for example, per capita national income, NI, or gross national product, GNP).
- The relationship between vehicle population and NI is given on page 78 of the textbook.
3.5 Quantitative Forecast Methods
- Elastic coefficient method
- For example, we can get the average growth rate of vehicle sales by observing sales in past years, but that rate alone is only a surface picture. If we analyze the growth rate of sales together with the growth rate of an economic indicator, we can improve forecast quality. A detailed case is given in the textbook.
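A minimal sketch of the idea, assuming the elastic coefficient is defined in the usual way as the ratio of the average growth rate of sales to the average growth rate of the indicator. The figures are hypothetical and are not the textbook case; the function names are illustrative.

```python
def average_growth_rate(series):
    """Geometric average growth rate per period between first and last value."""
    periods = len(series) - 1
    return (series[-1] / series[0]) ** (1.0 / periods) - 1.0


def elasticity_forecast(sales, indicator, indicator_growth_forecast):
    """Return (elastic coefficient, forecast of next-period sales)."""
    elasticity = average_growth_rate(sales) / average_growth_rate(indicator)
    sales_growth = elasticity * indicator_growth_forecast
    return elasticity, sales[-1] * (1.0 + sales_growth)


sales = [100.0, 110.0, 121.0]      # hypothetical vehicle sales
indicator = [200.0, 210.0, 220.5]  # hypothetical economic indicator
elasticity, next_sales = elasticity_forecast(sales, indicator, 0.04)
print(round(elasticity, 3), round(next_sales, 2))
```

Here sales grew twice as fast as the indicator, so a forecast 4% rise in the indicator translates into an 8% rise in sales.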
3.5 Quantitative Forecast Methods
- Combination forecasting method
- Concept: combination forecasting means reaching a final forecast conclusion by pooling multiple intermediate forecast results, obtained either from multiple models or from the same model with multiple independent variables.
- Core idea: combination helps to cancel out the chance errors of a single model or a single independent variable.
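One simple way to combine intermediate forecasts is a weighted average. The slides do not say how the results should be weighted, so the scheme below, equal weights by default or caller-supplied weights, is an assumption. The function name and numbers are hypothetical.

```python
def combine_forecasts(forecasts, weights=None):
    """Weighted average of several forecasts; equal weights if none given."""
    if weights is None:
        weights = [1.0] * len(forecasts)
    if len(weights) != len(forecasts):
        raise ValueError("one weight is needed per forecast")
    total = sum(weights)
    return sum(f * w for f, w in zip(forecasts, weights)) / total


# Hypothetical forecasts of the same quantity from three different models.
model_forecasts = [100.0, 110.0, 120.0]
print(combine_forecasts(model_forecasts))
# Give more weight to a model that is trusted more.
print(combine_forecasts([100.0, 120.0], weights=[3.0, 1.0]))
```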
3.5 Quantitative Forecast Methods
- Notes on forecasting experience
- Policy variables: it is very difficult to forecast policy changes, but we can strengthen the monitoring of environmental factors, paying particular attention to the state of the national economy. Establishing a monitoring and early warning system for the national economy is essential.
3.5 Quantitative Forecast Methods
- Notes on forecasting experience
- Predictive accuracy versus goodness of fit of the model
- Simple models versus complex models
- A single forecast result versus multiple results
- Reliability of forecast conclusions: three points are very important: realistic (authoritative) initial data, accuracy of the mathematical models, and correctness of the forecast procedures.
3.5 Quantitative Forecast Methods
- Notes on forecasting experience
- Data processing, actual cases and the researcher's imagination
- To improve forecasts, establishing an information system is very important.