Title: Experimental Methods
1Experimental Methods
2Ch 1 Introduction
- 1.1 Economics as an experimental discipline
- 1.2 the engine of scientific progress
- 1.3 Data sources
- 1.4 Purpose of experiments
3Introduction ch. 1
- Experimental sciences
- Physics: Galileo
- Biology: Mendel, Pasteur
- Economics: Vernon Smith and Charles Plott
- Circle of theory and empirics
4Empirical strategies
- Field data
- Econometric experiments
- Natural experiment
- Field experiment with field task
- Pinochet's Chile, income maintenance experiments in Denver, Seattle
- Field experiment with artefactual task
- Danish elicitation of risk and time preferences
- Virtual experiment
- Lab experiment with artefactual task
- Simulations
5Confounding explanations
- Leamer's Luminists vs. Aviophiles parable
- Higher crop yields explained by presence of bird droppings (Aviophiles)
- Higher crop yields explained by presence of sunlight (Luminists)
- The two are always present simultaneously in the field.
6Purpose of experiments
- Theory testing
- Theory extensions
- Boundaries of theories
- Phenomena discovery, verification and robustness check
- Mechanism design and test beds
- Policy influence, persuasion and demonstration
- Preference elicitation
7Ch 2 Principles of economics experiments
- 2.1 Realism and models
- 2.2 Controlled economic environments
- 2.3 Induced value theory
- 2.4 Parallelism
- 2.5 Practical implications
- 2.6 Application: The Hayek hypothesis
8Ch 2 Induced value theory
- 3 necessary conditions
- Monotonicity
- Prefer more rewards to less and not be satiated
- Salience
- Actions have consequences in the reward and subjects understand this
- Dominance
- Reward medium has bigger impact on subjects than other possible incentives present
- Hypothetical vs. real incentives
- Debate
- Pros and cons
9Simple guidance
- Pay in cash
- Use subjects that pay attention, learn fast and have low opportunity cost
- Simple is best
- Avoid uncontrolled context
- Maintain anonymity
- Do not deceive
10Competitive Equilibrium experiments
- Example of induced value experiments
- Testing central theory of economics
- Chamberlin and Smith experiments
- Discuss design details
- How the institutional design affects the efficiency of the market
- ZI trader markets
11Ch 3 Experimental Design
- 3.1 Direct experimental control: Constants and Treatments
- 3.2 Indirect control: Randomization
- 3.3 The within-subjects design as an example of blocking and randomization
- 3.4 Other efficient designs
- 3.5 Practical advice
- 3.6 Application: New market institutions
12Ch 3 Experimental Design
- Focus and nuisance variables
- Avoiding confounding effects of several variables
- Example: voluntary contribution mechanisms and public goods experiments
- Terminology: cohort, session, round, trial
- Condition: control and treatment
- Control condition replicates some previous experiment or implements a simple benchmark design
- Treatment condition executes a comparative statics exercise
13Control in experiments
- Avoid nuisance influence on behavior by holding a variable constant
- Example: holding constant the return to public investments or holding constant the temperature in the room
- Or by observing its changes precisely and controlling for these influences in the analysis
- Demographic variables
- Degree of control here depends on accuracy of specification of structural model (interaction effects)
- Treatment
- Controlled change of exogenous variable
- Return to public investments, initial wealth
- Designing comparative statics effects
14Hard to control variables
- Variables not under our control
- Unobservable variables
- Personal characteristics
- Patience, cognitive capital, intelligence, linguistic abilities, risk attitudes
- Beliefs and expectations
- What is the purpose of the experiment, what should I focus on, what will others do?
- Are other participants free riders or cooperators?
- Randomization to treatment
- With large enough samples the unobserved variables will be distributed approximately equally across treatments
- Low and high return treatments and distribution of those who believe others are free riders or cooperators
15Example of randomization
- 20% of subjects are smart, 80% are dumb
- Assign 50% of subjects to treatment 1 and 50% to treatment 2
- We cannot observe who is smart and who is dumb (ex ante)
- If registering for treatment 1 is a more difficult process than registering for treatment 2 we expect fewer dumb people to attend treatment 1 than treatment 2
- This is a confound
- Randomization requires that assignment to treatment does not separate out the dumb from the smart
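The randomization example can be illustrated with a small simulation. This is only a sketch: the population size, the sign-up completion probabilities (`p_smart`, `p_dumb`) and the function names are assumptions made for illustration, not part of the slides.

```python
import random


def random_assignment(types, rng):
    """Shuffle subjects and split them evenly into two treatments."""
    shuffled = list(types)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def self_selected_assignment(types, rng, p_smart=0.8, p_dumb=0.4):
    """Hypothetical confound: treatment 1 has a harder sign-up that
    'smart' subjects complete more often than 'dumb' subjects."""
    t1, t2 = [], []
    for t in types:
        p = p_smart if t == "smart" else p_dumb
        (t1 if rng.random() < p else t2).append(t)
    return t1, t2


def share_smart(group):
    """Fraction of a treatment group that is of the 'smart' type."""
    return sum(t == "smart" for t in group) / len(group)


rng = random.Random(0)  # fixed seed so the run is reproducible
population = ["smart"] * 2000 + ["dumb"] * 8000  # 20% / 80% as on the slide

r1, r2 = random_assignment(population, rng)
s1, s2 = self_selected_assignment(population, rng)

print("random:        ", round(share_smart(r1), 3), round(share_smart(r2), 3))
print("self-selected: ", round(share_smart(s1), 3), round(share_smart(s2), 3))
```

Under random assignment both treatments should contain roughly 20% smart subjects, while the self-selected assignment concentrates them in treatment 1, so a treatment effect could not be separated from a type effect.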
16Within and Between Subject designs
- Between subject design
- Each subject experiences only one treatment: low or high return to public investment
- Within subject design
- Each subject experiences two or more treatments
- Randomization in between subject designs
- At start of session
- Pre-assign roles or treatments to different stations and let subjects be assigned to stations randomly
- Do not assign all of one role first and then all of the other role
- However, sometimes it is more convenient to run all of one treatment in the same session
- Randomization in within subject designs
- Pre-assign each sequence of roles or treatments to stations
- Do not change treatment across all rounds randomly: subjects need time to learn and adjust in each treatment
- Do not alternate high and low return treatment across periods
17Within subject design and order effect
- Within subject designs control for individual idiosyncrasies and learning
- 10 rounds of low return to public investment followed by 10 rounds of high return to public investment
- Also need a treatment with the opposite order since subjects' learning and perception formation may be affected by the order of the experiences
- AB or ABA (cross-over) or AB and BA
- Dual: two simultaneous treatments within-subject (p. 26)
- n x m factorial designs in k trials (replications)
- Sometimes a trial is a round, sometimes a session
- Complete factorial but randomize order across trials
18Example VCM
- High return, Low return
- With punishment, without punishment
- With reputation, without reputation
- 2 x 2 x 2 design
- One-shot game or repeated game
- Game theoretic predictions
- One-shot with replicated experience
- Random matching: some probability of multiple encounters
- Deterministic one-shot matching: zero probability of multiple encounters
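One way to obtain deterministic one-shot matching for two-person groups is a round-robin rotation (the "circle method"), in which every pair of subjects meets exactly once. The sketch below assumes pairs and an even number of subjects; the function name `one_shot_schedule` is hypothetical.

```python
def one_shot_schedule(n_subjects):
    """Return a list of rounds, each a list of pairs, such that no two
    subjects are ever matched more than once (round-robin rotation)."""
    if n_subjects % 2 != 0:
        raise ValueError("this sketch assumes an even number of subjects")
    players = list(range(n_subjects))
    rounds = []
    for _ in range(n_subjects - 1):
        # Pair the first with the last, the second with the second-to-last, ...
        pairs = [(players[i], players[n_subjects - 1 - i])
                 for i in range(n_subjects // 2)]
        rounds.append(pairs)
        # Keep the first player fixed and rotate everyone else by one seat.
        players = [players[0], players[-1]] + players[1:-1]
    return rounds


schedule = one_shot_schedule(6)
for r, pairs in enumerate(schedule, start=1):
    print("round", r, pairs)
```

With n subjects matched in pairs this protocol supports at most n - 1 rounds; beyond that, repeated encounters are unavoidable and only random matching remains.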
19- Between subject, one-shot replicated
- Recruit 30-50 subjects for each condition
- Assign subjects to markets: a market can be 2 or more subjects
- Matching protocol for re-assignment across rounds
- Each session can have multiple or single cohorts, thus multiple or single treatments
- Within subject: return
- Between subject: punishment, reputation
- Session 1: cohort 1 HRP, LRP; cohort 2 LRP, HRP
- Session 2: cohort 3 HRNP, LRNP; cohort 4 LRNP, HRNP
20fractional factorial design
- High and Low return
- High and Low punishment points
- High and Low reputation
- 2 x 2 x 2 = 8 factorial design
- Reduce the dimension
- HHH HHL HLH HLL is bad because return is always High
- HHH HLH LHH LLH is bad because reputation is always High
- Make the third the product of the first two (assume e.g. H = + and L = -)
- HHH (+++), HLL (+--), LHL (-+-), LLH (--+)
- See graph
- If interaction effects are expected then a full factorial design is necessary
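The half-fraction construction described above (third factor set to the product of the first two) can be sketched in a few lines. The coding H = +1 and L = -1 follows the slide; the function names are assumptions for illustration.

```python
from itertools import product


def half_fraction():
    """Build the 4-run half fraction of a 2 x 2 x 2 design by setting
    the third factor equal to the product of the first two."""
    runs = []
    for a, b in product((+1, -1), repeat=2):
        runs.append((a, b, a * b))
    return runs


def label(run):
    """Translate +1/-1 codes back into H/L labels."""
    return "".join("H" if level > 0 else "L" for level in run)


runs = half_fraction()
print([label(r) for r in runs])

# Each factor is balanced (two High, two Low) and every pair of factors
# is orthogonal (the products of their codes sum to zero).
for i in range(3):
    print("factor", i, "sum of codes:", sum(r[i] for r in runs))
```

The price of halving the number of treatments is aliasing: in this fraction the main effect of the third factor cannot be separated from the interaction of the first two, which is why a full factorial is needed when interactions are expected.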
21Nuisance variables
- Experience and learning
- Extra-lab interaction influences
- Boredom, fatigue
- Selection bias
- Subject or group idiosyncrasies
22- Control all controllable variables
- Otherwise you may have confounds
- Observability solves the confounding problem of uncontrolled variables but uses up degrees of freedom
- Focus variables define treatments
- Statistical power requires widely separated levels
- Linear or non-linear effects
- Be aware of interactions between focus and nuisance variables
- Keep nuisance variables constant
- Use orthogonal variations in focus variables
23Applications New Institutions
- Experiments as test beds
- Grether and Plott (1984)
- Uncompetitive prices through institutional practices in gas additive market (tetraethyl lead)
- 3 x 2 x 2 x 2 institutional treatments
- 3 levels of price publication
- 2 levels of price access
- 2 levels of advance notice
- MFN or no MFN
- But some of these interactions were uninteresting so used only 8 treatments
- Constant supply and demand curves, basic exchange institution
- Found some support for institutional practices leading to uncompetitive prices
24- Hong and Plott (1982)
- Test of posted offer vs. telephone markets
- Railroads were required to post prices but dry bulk cargo barges were not
- Telephone market involved bilateral contacts between shippers and carriers
- Posted offer markets showed less competitive prices, less market efficiency, and lower profits for smaller carriers
- Contrary to railroad companies' claims
25Development testing
- Grether et al. (1981): demonstration experiments comparing allocation of airline landing slots through markets or committees
- McCabe et al. (1991): studies on electric power and natural gas networks
- Uniform price double auction with continuous feedback
- Uniform pricing as in call markets
- Continuous feedback as in DA markets
- Non-commercial television station programming
- NASA resource allocation at space station
26Ch 4 Human Subjects
- Who should your subjects be?
- Subjects' attitudes towards risk
- How many subjects?
- Trading commissions and rewards
- Instructions
- Recruitment and maintaining subject history
- Human subject committees and ethics
- Application to bargaining experiments
27Who should your subjects be?
- Students
- Convenience, low opportunity cost, experience in processing written information
- Professionals
- Field experience, proven success
- Your own students?
- Heterogeneity of socio-demographics
- What role in theory?
28Risk attitudes of subjects
- Important role in theory
- Heterogeneity
- Induce risk neutrality
- Pay in lottery tickets
- Assumes independence axiom
- Evidence indicates it does not work well
- Elicit risk attitudes
- BDM, MPL
29How many subjects?
- How large is large enough for competitiveness?
- Statistical power
- Idiosyncratic effects and uncontrolled nuisance
variables
30Rewards
- Money or course credit?
- Points with conversion to dollars?
- How much is enough for incentive at the margin?
- Multi-session experiments and IOUs
- Asymmetric payoffs
- Pay by ranking in each role
- Tournaments increase variance in payoffs
- Bankruptcy
- Risk seeking behavior
31Instructions
- Statement of purpose
- Danger with numeric examples
- Importance of privacy and anonymity
- Story or artefactual
- Duration of session
32Recruitment
- Social distance
- Sample selection
- No shows and stand-bys
33Human subjects committees and ethics
- Deception
- A public bad
- IRB approval process
34Application Bargaining
- Siegel and Fouraker (1960), Fouraker and Siegel (1963)
- Structured alternating series of written price-quantity messages with information treatments
- Roth and others
- Free-form bargaining over computers
- Induced risk neutrality
- Binmore and others
- Instructions told subjects to make as much money as possible (demand effects)
- Roth et al. (1991)
- Subject pool effects (also Botelho et al. 2005)
35Ch 5 Laboratory facilities
- Choosing between manual and computer modes
- Manual laboratory facilities
- Computerized laboratory facilities
- Random number generation
- Applications: Monetary overlapping generations economies
36Applications
- Lim, Prescott and Sunder (1994)
- Overlapping generations model
- Convergence is slow and computerizing the design allowed more periods
- Marimon and Sunder (1993)
- Moving from partially to fully computerized system allowed them to double the number of periods
- Further allowing them to spot a phenomenon
- Also allowed them larger cohort size, which enabled the observation of lab generated sunspots
37- Computer assisted decisions
- Computer can solve for certain inputs into decisions such as optimal supply functions
- OLG experiment could then focus on studying expectations formation
38Ch 6 Conducting an experiment
- Lab log
- Pilot experiments
- Lab setup
- Registration
- Conductors
- Assistants or researchers
- Monitors
- Instructions
- Handling queries from subjects
- Dry-run periods
- Termination
- Known or unknown termination point
- Modeling infinite horizons
- Debriefing
- Payment (anonymity or double blind)
- Bankruptcy
- Bailout plan
39Ch 7 Data analysis
- Graphs and summary statistics
- Statistical inference: Preliminaries
- Reference distributions and hypothesis tests
- Practical advice
- Application: First-price auctions
40Describing the data
- Description of data more central in experimental than in other economics
- Many data sets describe entirely new types of phenomena
- Line graphs, pie charts, scatter plots
- See figure 7.3 p 92
- Descriptive statistics, means, medians, standard deviations
- Description should allow you to discover both expected and unexpected phenomena
- Interocular trauma test (Savage)
- It is blindingly obvious from the graph
- Simple hypothesis tests should still be included
41Statistical inference
- Experimental error
- Measurement error
- Sampling error
- Ideal samples
- Random sample
- Balanced sample
- Sample selection bias
- Care in recruiting, test and correct for bias in statistical analysis
- Multicollinearity
- Vary variables orthogonally in experimental design
- Omitted variables
- Collect observations if possible
42Individual heterogeneity and omitted variables
nuisance variables
- Demographics: cultural influences on behavior
- Cognition: cognitive capital investments
- Heuristics and ecological rationality
- Physical and emotional health and stability
- Risk attitudes
43- Example of estimation with and without control
for demographics
44Panel data
- Most experimental data sets collect several observations from the same cohort
- Observations on same subject are correlated: individual specific idiosyncrasies, omitted variables
- Learning: later observations depend on consequences of earlier choices, so observations are not independent
- Group effects: omitted variables on group dynamics
45Hypothesis tests
- Are the treatment differences observed due to sampling error or due to differences in population distributions?
- Assume a reference distribution for the population distribution of choices
- Parametric (normal, Student's t, lognormal, beta)
- Nonparametric
46Tests with parametric reference distributions
- Difference in means across two treatments
- Reference distribution assumption
- The sample mean $\bar{x}$ is normally distributed with unknown mean $\mu$ and variance $s^2/n$, where $n$ is the sample size and $s^2$ is the estimated variance
47Example
- Sample mean is 0.6
- Hypothesized population mean is 0.5
- Null hypothesis: sample mean is equal to population mean
- Alternative hypothesis: sample mean is not equal to (is greater than) population mean
- Two-tailed (one-tailed) t test
- n = 36, s = 0.2, t = 3.0, n - 1 degrees of freedom
- Probabilities from t-tables
- Two-sided test p ≈ 0.005, one-sided test p ≈ 0.0025
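The numbers in this example can be checked without statistical tables. The sketch below uses only the standard library and integrates the Student t density numerically with Simpson's rule; the helper names (`t_statistic`, `t_pdf`, `upper_tail`) are hypothetical, and a statistics package would normally be used instead.

```python
import math


def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / math.sqrt(n))


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: one half minus the Simpson integral on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


t = t_statistic(sample_mean=0.6, mu0=0.5, s=0.2, n=36)
one_sided = upper_tail(t, df=36 - 1)
two_sided = 2 * one_sided

print("t =", round(t, 3))
print("one-sided p =", round(one_sided, 4))
print("two-sided p =", round(two_sided, 4))
```

The t statistic is (0.6 - 0.5) / (0.2 / 6) = 3.0, and the tail probabilities come out close to the values read from t-tables on the slide.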
48Pooled t-statistic of difference across treatments
- $n_A + n_B - 2$ degrees of freedom
- Between subject design
- Within subject design: matched t statistic
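For reference, the two statistics named on this slide are usually written as follows. These are the standard textbook forms; the slides do not spell them out, so the symbols ($s_p$ for the pooled standard deviation, $d_i$ for a subject's paired difference) are notation assumed here.

```latex
% Pooled (between-subject) t statistic with n_A + n_B - 2 degrees of freedom
t = \frac{\bar{x}_A - \bar{x}_B}{s_p \sqrt{\frac{1}{n_A} + \frac{1}{n_B}}},
\qquad
s_p^2 = \frac{(n_A - 1)\, s_A^2 + (n_B - 1)\, s_B^2}{n_A + n_B - 2}

% Matched (within-subject) t statistic with n - 1 degrees of freedom,
% where d_i is subject i's difference between treatments A and B
t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad d_i = x_{A,i} - x_{B,i}
```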
49Pooled or matched?
- Matched pairs controls for nuisance variables
- Better able to detect true treatment effects
50Nonparametric tests
- Using the observed data to construct reference distributions
- No need to make assumptions about the distribution
- Use the observations and construct sample distributions by assigning them to treatments in all possible ways
51Example
- A: 1, 2
- B: 3, 2
- A: (1,2), (1,3), (1,2), (2,1), (2,3), (2,2), (3,1), (3,2), (3,2), (2,1), (2,2), (2,3)
- And corresponding assignments to B
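The twelve assignments in this example can be enumerated mechanically. The sketch below reassigns the four pooled observations to A in every ordered way, as the slide does, and tabulates the resulting difference in means; the function name is hypothetical.

```python
from collections import Counter
from itertools import permutations


def assignment_differences(a, b):
    """For every ordered way of assigning len(a) pooled observations to A
    (the rest go to B), return mean(B) - mean(A)."""
    pooled = list(a) + list(b)
    diffs = []
    for idx in permutations(range(len(pooled)), len(a)):
        in_a = [pooled[i] for i in idx]
        in_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        diffs.append(sum(in_b) / len(in_b) - sum(in_a) / len(in_a))
    return diffs


diffs = assignment_differences([1, 2], [3, 2])
actual = (3 + 2) / 2 - (1 + 2) / 2  # observed mean(B) - mean(A)

print(len(diffs), "assignments")
print(Counter(diffs))
print("share at least as large as actual:",
      sum(d >= actual for d in diffs) / len(diffs))
```

With only four observations a difference as large as the observed one arises in a third of all assignments, so this toy sample cannot reject the hypothesis of no treatment effect.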
52Wilcoxon Mann-Whitney U test
- Rank the pooled data, keeping track of correspondence between each observation and which treatment it came from (1(A), 2(A), 2(B), 3(B))
- Sum the ranks across all observations from A treatment (1 + 2 = 3) or B (3 + 4 = 7), compute the probability that these rank sums would arise from a random ranking
- Only uses ordinal relationships and ignores quantitative sample information (the magnitude of difference across each rank)
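A rank-sum calculation for the same four observations might look like the sketch below. One assumption differs from the slide: the two tied observations (the 2 from A and the 2 from B) are given the average of their ranks (2.5 each), which is the usual convention, so the rank sums come out as 3.5 and 6.5 rather than 3 and 7. The function names are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        positions = [i + 1 for i, x in enumerate(ordered) if x == v]
        rank_of[v] = sum(positions) / len(positions)
    return [rank_of[v] for v in values]


def rank_sum_test(a, b):
    """Return A's rank sum and the exact probability of a rank sum this
    small or smaller if ranks were assigned to A at random."""
    ranks = midranks(list(a) + list(b))
    w_a = sum(ranks[:len(a)])
    sums = [sum(c) for c in combinations(ranks, len(a))]
    p_lower = sum(s <= w_a for s in sums) / len(sums)
    return w_a, p_lower


w_a, p_lower = rank_sum_test([1, 2], [3, 2])
print("rank sum for A:", w_a)
print("P(rank sum <= observed):", p_lower)
```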
53Sign test for matched pair data
- r = number of positive paired differences
- w = number of negative paired differences
- Is the actual frequency of positive differences different from 50%?
- Ignores all sample information except the sign
- Ignoring information reduces the power of the test: the ability to detect treatment differences when they are there
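Under the null hypothesis each paired difference is positive with probability 50%, so the sign test reduces to a binomial calculation. The sketch below is a minimal two-sided version; the counts r = 8 and w = 2 are made-up illustration values, not data from the slides.

```python
from math import comb


def sign_test(r, w):
    """Two-sided sign test p-value for r positive and w negative paired
    differences (ties are assumed to have been dropped beforehand)."""
    n = r + w
    k = max(r, w)
    # Probability of k or more signs of one kind out of n fair coin flips.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(sign_test(8, 2))  # hypothetical counts: 8 positive, 2 negative
```

Even an 8 to 2 split does not reach the 5% level with ten pairs, which illustrates the low power that comes from discarding the magnitudes of the differences.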
54Bootstrap
- Take all permutations of the data, assigning data to the two treatments A and B
- One of these permutations is the actual data
- To test if the actual treatment difference is not zero compare the simulated differences to the actual difference
- If 95% (or more) of the simulated differences are smaller in absolute value than the actual difference we can reject the null hypothesis of no difference at the 5% significance level
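The decision rule on this slide can be sketched as an exact two-sided test. Note that reassigning the pooled data to treatments without replacement is usually called a permutation (randomization) test, whereas a bootstrap in the strict sense resamples with replacement; the code follows the procedure as the slide describes it. The data values and the function name are hypothetical.

```python
from itertools import combinations


def permutation_p_value(a, b):
    """Share of all reassignments of the pooled data whose absolute
    difference in means is at least as large as the observed one."""
    pooled = list(a) + list(b)
    n, k = len(pooled), len(a)
    actual = abs(sum(a) / len(a) - sum(b) / len(b))
    extreme = total = 0
    for idx in combinations(range(n), k):
        chosen = set(idx)
        in_a = [pooled[i] for i in range(n) if i in chosen]
        in_b = [pooled[i] for i in range(n) if i not in chosen]
        diff = abs(sum(in_a) / k - sum(in_b) / (n - k))
        extreme += diff >= actual - 1e-12  # tolerance for float rounding
        total += 1
    return extreme / total


# Hypothetical contributions under two treatments (illustration only).
treatment_a = [12, 14, 15, 17, 18]
treatment_b = [5, 6, 8, 9, 11]

p = permutation_p_value(treatment_a, treatment_b)
print("p-value:", round(p, 4))
print("reject at 5%:", p <= 0.05)
```

For larger samples the number of reassignments grows too quickly to enumerate, and a random subset of them is typically drawn instead.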
55Other tests
- Chi square test of contingency table of frequencies of categorical behavior
- Fisher's exact test (also contingency table)
- ANOVA (multivariate)
- Multiple regression with dummy variables for treatments
- Bayesian techniques
56Ch 8 Reporting your results
- Coverage
- Can't cover everything
- Keep readers' attention and aid their retention
- Focus on single issue
- Describe how you select the data you report if not reporting all data
- Organization
- Section on experimental design and lab procedures
- Documentation and replicability: document everything needed
- Project management: start analyzing and organizing data early
57Appendices
- Readings
- Instructions and procedures samples
- Forms
- Checklist
- Recruitment
- Consent form
- Receipt
- IOU
- Econometrica guidelines
58Application
- US-Russian Ultimatum Bargaining experiment