Title: Statistics: An Introduction
1 Statistics: An Introduction
- LIR 832
- Class 1
- January 8, 2007
2 Topics of the Day
- 1. Some really nice pictures (graphical display of quantitative data and more)
- 2. Why do we teach statistics, other than to make your life miserable for a semester?
- 3. What is in it for me as a future HR/LR practitioner?
- 4. Fundamental issues in statistics (what really matters)
- 5. The structure of the course
3 The Use of Statistics: Classic Examples
- Tufte, The Visual Display of Quantitative Information
- "For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled" (Richard Feynman's conclusion on the explosion of the space shuttle).
4 The Use of Statistics: Classic Examples
- Data map: John Snow (1854), deaths from cholera in Central London
- Before knowledge of bacterial sources of illness (Holmes' father in 1847; Louis Pasteur later on).
- Deaths are dots; x's are water pumps
- Deaths are clustered around the Broad St. pump
- Removal of the pump handle ended an epidemic which killed more than 500
5 The Use of Statistics: Classic Examples
- Data map: national cancer rates by county
- From darkest (in highest decile of cancer rates) to lightest (lower than US as a whole).
- High death rates from cancer in the northeast part of the country and around the Great Lakes (high levels of air pollution and dense concentration of industry).
- Low rates in an east-west band across the middle of the country.
- Higher rates for men than for women in the South, particularly Louisiana (cancers likely caused by occupational exposure from working with asbestos in shipyards).
- Can you find the counties which are downwind from the Nevada test range?
- Can you find the central locations for the chemical industry in the US?
6 The Use of Statistics: Classic Examples
- Data map, space and time: Charles Minard (1861), Napoleon's March
- Width of line varies continuously with size of the army.
- The line establishes the longitude and latitude of the army.
- The lines show the direction of movement of the army.
- The location of the army with respect to certain dates is marked.
- The temperature along the path of march is marked.
7 The Use of Statistics: Classic Examples
- Computer graphics, space and time: concentration of pollutants over L.A., July 22, 1979
- Two-dimensional surface covering 6 southern California counties
- Nitrogen oxides: power plants, refineries, vehicles
- Refineries and Kaiser Steel produce post-midnight peaks.
- Traffic and power plants produce daytime peaks
- Carbon monoxide
- Reactive hydrocarbons
8 The Use of Statistics: Classic Examples
- Election maps: the difficulty of portraying information accurately while making your point. (http://www-personal.umich.edu/~mejn/election/)
9 The Use of Statistics: Classic Examples
- Tufte's Principles of Graphic Excellence
- The efficient communication of complex quantitative ideas
- Show the data
- Avoid distorting what the data have to say
- Encourage the eye to compare different pieces of data
- Make large data sets coherent
- Induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else
10 The Use of Statistics: Truck Driver Retention
- Factors Affecting Over-the-Road Truck Driver Retention: A More Traditional Application of Statistics to a Complex Relationship
11 The Use of Statistics: Truck Driver Retention
- Background
- An ongoing shortage of truck drivers makes trucking firms very concerned, at least rhetorically, about driver retention
- We have excellent data from a survey of truck drivers and would like to sort out the factors affecting driver retention so firm policy can focus on those factors
- Problem: retention is multi-causal. Many factors are likely to affect the retention of truck drivers, and we need an approach that allows for all of these effects.
12 The Use of Statistics: Truck Driver Retention
- It is always good to start an inquiry with a little theory. This sets a question or questions that we structure our inquiry around.
- The following is from Freeman and Medoff, "Two Faces of Unionism"
13 The Use of Statistics: Truck Driver Retention
- Monopoly face: unions raise wages and improve benefits
- Exit-voice face.
- The typical means of employees registering dissatisfaction with a job is to quit and find a new job.
- Unions provide employees with an alternative route: voice
- Improve communications because employees are protected against bad consequences of communicating their views to management
- Allow employees a means to communicate and decide on issues among themselves rather than being mediated by management. Employees rather than management decide on hard issues such as the allocation of benefits
- Solves public goods problem at work
14 The Use of Statistics: Truck Driver Retention
- Lower quit rates and longer tenure are a potential source of advantage to organized firms, as they
- Save hiring and training costs
- Have greater depth of human capital
- Research on quits and employee tenure shows a strong positive association between tenure (years of service with employer) and unionism and a strong negative association between unionism and quits.
15 The Use of Statistics: Truck Driver Retention
- This might be explained by the union voice effect, but it might also be an example of the monopoly face (syllogism)
- Unions raise wages and increase benefits
- All else constant, employees tend to stay with firms which provide better wages and benefits
- To be better assured of union voice effects, we need to distinguish the effect of the monopoly face of unions on compensation from that of voice
- Consistent with the monopoly argument, Delery finds only a compensation effect, no distinct union effect
16 The Use of Statistics: Truck Driver Retention
- Current research draws on a UMTIP survey of truck drivers.
- Interviewed 1,000 drivers in truck stops between 1997 and 1999
- Includes data on tenure and quits along with union membership and compensation.
- Consider the descriptive statistics (abbreviated)
18 The Use of Statistics: Truck Driver Retention
- Build a series of models that look at the months drivers have spent with their current employer (tenure)
- (We don't have pre-quit information on those who quit their jobs in the last year.)
- Models (working from simplest to most complete)
- Model 1: Tenure models with extensive controls
- Controls serve to eliminate the effects of factors which would otherwise confuse our estimates, such as personal characteristics which might affect tenure (age, race, gender ...), segment of the industry, size of the firm, and characteristics of the work.
- Model 2: Add union measure
- The coefficient on union membership is 38.78, interpreted as indicating that union members stay with the firm an additional 39 months.
20 The Use of Statistics: Truck Driver Retention
- Model 3
- Add a measure of weekly pay. The coefficient on union membership is now not the earnings (monopoly) effect, as we have removed effects related to weekly earnings.
- Members remain an additional 36 months, so controlling for pay has not much effect
- Model 4
- Allow for the effect of benefits including paid days off, employer-provided health insurance, pensions and deferred compensation
- Union effect declines to an additional 22 months
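The way a union coefficient shrinks as compensation controls are added can be sketched with simulated data. This is only an illustration under stated assumptions: it treats the tenure models as ordinary least-squares regressions, the data are randomly generated (they are not the UMTIP survey), every number in the data-generating process is made up, and `ols` is a hypothetical helper written for this sketch.

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


random.seed(832)  # fixed seed so the sketch is repeatable
union, pay, tenure = [], [], []
for _ in range(2000):
    u = 1 if random.random() < 0.3 else 0
    # Assumed (made-up) process: union membership raises weekly pay
    # (measured in $100s) and also adds 15 months of tenure directly.
    p = 6.0 + 1.5 * u + random.gauss(0, 1)
    t = 20 + 15 * u + 8 * p + random.gauss(0, 10)
    union.append(u)
    pay.append(p)
    tenure.append(t)

# Model A: tenure on union membership only.
coef_a = ols([[1.0, u] for u in union], tenure)
# Model B: add weekly pay as a control.
coef_b = ols([[1.0, u, p] for u, p in zip(union, pay)], tenure)

union_only = coef_a[1]
union_with_pay = coef_b[1]
print(f"union coefficient without pay control: {union_only:.1f} months")
print(f"union coefficient with pay control:    {union_with_pay:.1f} months")
```

In this simulation the first coefficient mixes the direct union effect with the pay effect, while the second comes close to the direct effect alone. The slides report the same kind of comparison, with different magnitudes, as pay and benefits are added.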
21 The Use of Statistics: Truck Driver Retention
- Conclusions
- Union membership has a strong effect on employee retention. While part of this effect is due to unions improving wages and benefits, even after controlling for such effects, unions continue to be associated with longer employee retention.
- Time off work is also very important to driver retention, as are earnings.
22 Why Quantitative Methods?
- What will we learn?
- Master fundamental knowledge about construction and application of statistical models.
- Develop, operationalize and interpret models of social interaction using modern statistical software.
- Learn to evaluate and critique others' research.
- In essence, become knowledgeable users of statistical analyses.
23 What Issues Does Statistics Address?
- Human beings are very good storytellers.
- Human beings have always been very good at
developing stories which explain the world out
of small amounts of information. This behavior
may have been necessary for survival when early
man competed for food with large predators, but
it often leads us to misunderstandings about
causal relations.
24 What Issues Does Statistics Address?
- Tom Peters, In Search of Excellence: Lessons from America's Best-Run Companies
- How 10 firms became top performers. Exciting reading with many important insights into successful management. AT&T, IBM, Digital Equipment, 3M, Allen-Bradley, Delta Airlines
- Fortune magazine returns three years later: half of the firms are no longer top performers
25 What Issues Does Statistics Address?
- Beardstown Ladies
- Successful investment club in Illinois. Produces a book of investment tips with recipes. Some problems later on with how they figured their profits, but let's put that aside. Their claim to fame was that they outguessed the stock market ten years in a row. Did this reflect brilliant thinking on their part, or might it simply be luck (chance or random event)?
- Suppose in 1980 there were 1,000 women's investment clubs in Illinois. Each year we would expect one half of those clubs to do better than the stock market and one half to do worse. How many clubs would have a record of straight wins in the 1980s?
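The question can be worked through with a few lines of arithmetic. The sketch below assumes, as the slide does, 1,000 clubs that each have an independent one-half chance of beating the market in any year. The function names are hypothetical and chosen only for this illustration.

```python
def expected_streak_clubs(n_clubs, p_win, n_years):
    """Expected number of clubs that beat the market every year by luck alone."""
    return n_clubs * p_win ** n_years


def prob_at_least_one(n_clubs, p_win, n_years):
    """Chance that at least one club has an unbroken winning record."""
    p_streak = p_win ** n_years
    return 1 - (1 - p_streak) ** n_clubs


# Each year roughly half of the surviving "winners" drop out.
for year in range(1, 11):
    remaining = expected_streak_clubs(1000, 0.5, year)
    print(f"after year {year:2d}: about {remaining:7.2f} clubs still unbeaten")

print(f"chance of at least one ten-year streak: {prob_at_least_one(1000, 0.5, 10):.2f}")
```

Under these assumptions we expect about one club out of the 1,000 to win ten years in a row, and the chance that at least one does so is better than even. One such record therefore tells us little about skill.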
26 What Issues Does Statistics Address?
27 What Issues Does Statistics Address?
- Stories are essentially anecdotes, interesting and potentially insightful, but it's difficult to separate what is useful from what is bullshit. Much of what we consider theory in social and behavioral science, whether it be economic theory, psychological theory, sociological theory, management theory, physics, etc.
28 What Issues Does Statistics Address?
- Statistics provides a method of examining these stories to determine if they are consistent with the facts (data) generated from a large number of cases.
- For example, the questions we might be interested in might be:
- Does a particular absence policy actually reduce absences?
- What is the response to a pay-for-performance system?
- How does greater workforce diversity influence plant-level performance?
29 What Can We Do With Statistics?
- Compactly summarize large bodies of data
- Using measures of central tendency, dispersion and probability distributions, we can compactly describe and understand these large bodies of data.
- CPS file with 150,000 observations on earners
- Compustat file with annual data on 1,000s of firms at the divisional level
- Personnel files from medium or large size corporations.
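A minimal sketch of such a summary, using Python's standard `statistics` module. The eight hourly wages are hypothetical values invented for the example (they are not drawn from the CPS or any other file named above), and `summarize` is an illustrative helper.

```python
import statistics


def summarize(values):
    """Central tendency and dispersion for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Hypothetical hourly wages, including one unusually high earner.
hourly_wages = [12.5, 14.0, 14.0, 15.5, 16.0, 18.0, 21.0, 45.0]
summary = summarize(hourly_wages)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

Here the single high earner pulls the mean (19.50) well above the median (15.75), which is one reason to report more than one measure of central tendency.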
30 What Can We Do With Statistics?
- Determine if there are meaningful relationships in the data
- Test theories or ideas about social or other inter-relationships
- An attendance award program will reduce absenteeism
- Piece rate systems will increase output but quality will suffer (Dodge Brothers machining plant in Detroit in 1904, for example)
- Increases in the minimum wage reduce employment
- Training programs improve output
- What is a theory? (evolution v. intelligent design)
31 What Can We Do With Statistics?
- Determine the magnitude of the relationship. Answer the essential question: How big?
- If you are going to calculate the ROI on a training program, you need to know the magnitude of the effect of that program. So you will want to be able to answer questions such as:
- Following training program X, productivity rose by Y%
- If a firm is going to invest in a program, it needs to know the rate of return, and this will, in turn, be determined by the improvement in productivity.
- An A% increase in the minimum wage is associated with a B% decline in teenage employment.
- Piece rate workers produce H% more output than hourly workers.
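To see why the size of the effect matters, consider a rough ROI calculation. The sketch assumes the common convention that ROI is net benefit divided by cost. The dollar figures and the function names are hypothetical, chosen only to show how the answer depends on the estimated productivity gain.

```python
def training_roi(output_value, productivity_gain, program_cost):
    """ROI as net benefit over cost; the gain is a fraction (0.04 means 4%)."""
    benefit = output_value * productivity_gain
    return (benefit - program_cost) / program_cost


def break_even_gain(output_value, program_cost):
    """Smallest productivity gain at which the program pays for itself."""
    return program_cost / output_value


# Hypothetical unit: $500,000 of annual output, $15,000 training program.
for gain in (0.02, 0.03, 0.04):
    roi = training_roi(500_000, gain, 15_000)
    print(f"productivity gain {gain:.0%}: ROI {roi:.0%}")

print(f"break-even gain: {break_even_gain(500_000, 15_000):.1%}")
```

With these made-up figures, a gain of 2% loses money and a gain of 4% returns about a third of the program's cost, so an estimate that is off by a couple of percentage points reverses the decision.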
32 What Can We Do With Statistics?
- This is all very nice, but what is in it for you as an IR/HR professional?
- IR/HR students do not, typically, believe that numbers are their friends.
- Alas, HR managers are expected to use numeric and statistical information to understand and guide their decisions.
- As HR moves from a transactional to a strategic position within the firm, HR managers are more and more expected to use numeric and statistical methods to evaluate operations and guide their decisions.
- Organizational HR performance is monitored using HR metrics. If correctly chosen, these metrics can provide a compact summary of a unit's performance.
33 What Can We Do With Statistics?
- A Few HR Metrics
- Number of interviews to hires
- Total recruiting cost per hire
- Hiring manager satisfaction
- Turnover Rate
- Turnover Cost
- Absence Rate
- Health Care Cost per Employee
- HR expense factor
- Human Capital Value Added
- Workers' Compensation Cost per Employee
- HR ROI
34 What Can We Do With Statistics?
- Some issues in using these metrics
- What do these measure?
- What is a good performance?
- Are deviations from good performance due to problems or chance?
- What are the sources of good or bad performance?
35 What Can We Do With Statistics?
- With the availability of HR metrics, it becomes possible to use descriptive and analytic statistics to evaluate programs.
- Consider a program to control health care costs. You are going to be interested in some relatively simple measures, such as whether there was a reduction in direct health care costs. You will also be interested in determining whether there are indirect costs such as increased absenteeism, lower employee satisfaction, or increased turnover, and whether there is a change in employee behavior or simply cost shifting.
36 What Can We Do With Statistics?
- You will regularly be presented with reports and memos incorporating numeric and statistical materials. You need to understand and evaluate the work of others.
- You hire a consultant to suggest or evaluate a program. You need to be able to understand and interpret what they have done in order to determine the quality of the work, ask good questions, and reach your own conclusions about the report.
- Example: EEOC and OFCCP's standards for establishing a pattern and practice of discrimination
37 What Can We Do With Statistics?
- You should be facile with statistical measures and data to be able to play with professions for which this is required knowledge. You can also shine relative to your peers if you are the one who does the statistical work and drafts those reports.
38 Fundamental Issues in Statistics
- The world is multi-causal; meaningful models need to reflect multiple sources of causation.
- There are many random elements in the outcomes we are concerned with. Simple observation is not enough to reveal underlying relationships. We need multiple observations to be able to establish the presence of a relationship.
- Why anecdotes are suspect.
39 Fundamental Issues in Statistics
- We use samples to learn about populations
- We seldom observe the populations we want to know about. Because we have to use samples, we engage in inference from samples to populations. However, because of sampling variability, samples are not little mirror images of the population of interest. Given that samples are imperfect replications of populations, we have to use techniques such as hypothesis testing to determine if statements about populations are reasonable given our observed sample.
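Sampling variability is easy to see in a small simulation. The sketch below draws repeated random samples from a made-up population (20,000 hypothetical weekly-hours values) and compares how much the sample means move around at two sample sizes. The population, the sample sizes, and the helper name are all assumptions made for illustration.

```python
import random
import statistics

random.seed(832)  # fixed seed so the sketch is repeatable

# Hypothetical population: weekly hours worked by 20,000 employees.
population = [random.gauss(40, 8) for _ in range(20_000)]


def sample_means(pop, n, draws):
    """Mean of each of `draws` simple random samples of size n."""
    return [statistics.fmean(random.sample(pop, n)) for _ in range(draws)]


small = sample_means(population, 25, 300)   # 300 samples of 25 people
large = sample_means(population, 400, 300)  # 300 samples of 400 people

pop_mean = statistics.fmean(population)
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

print(f"population mean:               {pop_mean:.2f}")
print(f"spread of sample means, n=25:  {spread_small:.2f}")
print(f"spread of sample means, n=400: {spread_large:.2f}")
```

No single sample reproduces the population mean exactly, and small samples miss by more than large ones. Hypothesis testing formalizes how large a miss is still consistent with a given claim about the population.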
40 Fundamental Issues in Statistics
- Few events have only one or two causes. As we want to avoid reductionist approaches, our methods must allow for multiple causation.
- The foundation of model building is not statistical but theoretical and practical knowledge of an issue.
- Evaluation of the usefulness of models then rests on both statistical knowledge and broader understanding of an issue. Good statistical technique is a necessary but not sufficient condition for building a useful model.
- Toward the end of the course we will evaluate literature using statistics so that we can bring all of the diverse elements together.
- Successfully modeling this multi-causal world requires careful application of statistical technique.
- The traditional course ends with a brief smattering of multivariate statistics, but we need more.
41 The Structure of the Course