Title: MIS 650 Data Collection
Slide 1: MIS 650 Data Collection
Slide 2: Chapter 3 Methodology
- Chapter Outline
- 3.1 Methodological Issues (usually Validity and Reliability, sometimes Ethics)
- 3.2 Sampling Methods
- 3.3 Data Collection Techniques
- 3.4 Data Integrity Issues
- 3.5 Analysis Look-ahead
Slide 3
[Diagram labels: Idea; Theory, Model; Test Plan Methodology; hypotheses; conclusions; Research methods; Conclusions about Idea; Physical Test of Hypotheses using Methodology; data. Annotations: What you say; What the theory says; What the data say; What the world says.]
Slide 4: 3.1 Methodological Issues
- State of Theory in your area (well developed, speculative)
- Ability to generalize
- Role of data in your research: is it empirical?
- Formal or informal index of goodness of your methodology within a general critique
Slide 5: State of Theory
- Theory vs. Experience: Is theory well developed, or are we still experiencing rather than thinking about this area?
- Role of Language: Are there well-defined terms and measures?
- Proof vs. Communication: Role of paper
- Qualitative vs. Quantitative Research: Do strong theories already exist?
Slide 6: Role of Data
- Data are Instances of Abstractions
- These instances have relationships which test relationships among abstractions
- The abstraction relationships are the theory
- We use DATA (measurements) to demonstrate the theoretical relationships among the abstractions
Slide 7
Our theories are the scripts; the world, our stage; researchers, the stage managers; and data, the film of the players' performances. Our goal is to create excitement, sell tickets, and satisfy the public.
Slide 8: Classes of Problems
- Sampling Problems (Cases, Companies, Individuals, Times, Tasks)
- Observer Errors (Creating the wrong stimuli)
- Subject Errors (Getting wrong responses)
- Recording Errors (Losing the data)
- Ethical Problems (Not deserving the data)
Slide 9: Where Students Often Fail
- Lack of Theory to Guide Method
- Poor Operationalization of Concepts
- Convenience Samples
- Measurement Errors
- Sloppy Data Collection
- Too little data
Slide 10: 3.2 Sampling
- Discuss how sample was obtained
- What was used as the sampling frame? Why?
- Were there any problems with representativeness?
- Were there any potential ethical problems?
Slide 11: Sampling Issues
- Representativeness
  - Usually assured by random sampling
  - Not always an issue or an issue to the same degree
- Procedure
  - Topic/Hypotheses → Universe → Sampling Frame → Research Sample → Actual Sample
Slide 12: Representativeness
- Data points must be unbiased
- This means that qualities of the source of the data should not (apparently) affect the content of the data
- Generally this means that every potential data source has the same probability of being in the research sample
Slide 13: Representativeness, Cont'd
- The question is then: do the sources of data in the research sample represent all those data points not present?
- If YES, then conclusions drawn from the data can be generalized to the whole universe.
- If NO, then such conclusions will be deemed to apply only to the research sample.
Slide 14: Representativeness, Cont'd
- Representativeness works in two ways
- 1. Generalizability
  - Do the data represent the universe?
- 2. Confidence
  - How well do the data represent it?
Slide 15: Representativeness, Cont'd
- Confidence
- This becomes an issue because of random variation rather than bias. Random variation is only an accumulation of unknown biases. Systematic bias pushes qualities of the data source in particular directions, thus increasing the possibility of a wrong conclusion. Random variation pushes qualities of the data source in many random directions, thus lowering confidence in conclusions.
[Figure labels: Systematic Bias; Random Variation]
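The contrast between systematic bias and random variation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the universe of 10,000 numeric "qualities", the normal distribution, and the rule that only above-average sources are reachable are all hypothetical and chosen for illustration.

```python
import random
import statistics


def simulate(n_samples=200, sample_size=30, seed=0):
    """Compare sample means under random variation and under systematic bias."""
    rng = random.Random(seed)
    # Hypothetical universe: one numeric quality per data source.
    universe = [rng.gauss(50.0, 10.0) for _ in range(10_000)]
    true_mean = statistics.fmean(universe)

    # Random variation only: every source has the same chance of selection.
    random_means = [
        statistics.fmean(rng.sample(universe, sample_size))
        for _ in range(n_samples)
    ]

    # Systematic bias (assumed): only above-average sources can be reached.
    reachable = [x for x in universe if x > true_mean]
    biased_means = [
        statistics.fmean(rng.sample(reachable, sample_size))
        for _ in range(n_samples)
    ]

    return {
        "true_mean": true_mean,
        "random_center": statistics.fmean(random_means),
        "random_spread": statistics.stdev(random_means),
        "biased_center": statistics.fmean(biased_means),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the random samples scatter around the true mean (lower confidence, but no shift), while the biased samples centre on the wrong value no matter how many are drawn.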
Slide 16: Procedure-1
- Topic/Hypotheses → Universe
- Topic applies to a particular part of the world, and your hypotheses can only be tested in a particular world
- The universe is what your ideas are eventually going to apply to
Slide 17: Procedure-2
- Universe → Sampling Frame
- Sampling Frame is a systematic way to get to data sources in your universe.
- Examples include phone directories, databases, printed lists, physical inventory
- All real sampling frames are inaccurate, out of date and incomplete. Problems must be addressed and discussed.
Slide 18: Procedure-3
- Sampling Frame → Research Sample
- Research sample is the actual list of your data sources. For generalization, the research sample should be representative.
- Research sample should be drawn randomly if possible, or sometimes in a stratified manner.
- Taking every nth item is common, or using a random number table.
- Not every item selected is real!
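Drawing a research sample from a sampling frame, either randomly or in a stratified manner, might look like the sketch below. The function names, the `(name, department)` frame layout, and the use of a seeded pseudo-random generator in place of a random number table are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def simple_random_sample(frame, k, seed=0):
    """Draw k sources; each source has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(frame, k)


def stratified_sample(frame, stratum_of, fraction, seed=0):
    """Draw the same fraction from every stratum (at least one per stratum)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in frame:
        strata[stratum_of(item)].append(item)

    sample = []
    for name in sorted(strata):  # sorted so the draw is reproducible
        members = strata[name]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


if __name__ == "__main__":
    # Hypothetical frame: 100 employees, one quarter of them in IT.
    frame = [(f"emp{i}", "IT" if i % 4 == 0 else "Ops") for i in range(100)]
    print(simple_random_sample(frame, 5))
    print(len(stratified_sample(frame, lambda e: e[1], 0.2)))
```

Stratifying keeps small groups in the sample in proportion to their share of the frame, which a single random draw does not guarantee.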
Slide 19: Procedure-4
- Research Sample → Actual Sample
- Actual sample is smaller than research sample
- Sources may not be available
- Scheduling is hard
- Interruptions, lost data, accidents, etc.
- Sampling frame may be inaccurate or out of date.
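Because the actual sample ends up smaller than the research sample, one common planning step is to draw more sources than are finally needed. The sketch below assumes a single hypothetical "expected yield" (the share of drawn sources that end up providing usable data); the function names and that parameter are illustrative and do not come from the slides.

```python
import math


def research_sample_size(target_actual, expected_yield):
    """Sources to draw so that about `target_actual` remain after losses."""
    if not 0 < expected_yield <= 1:
        raise ValueError("expected_yield must be in (0, 1]")
    return math.ceil(target_actual / expected_yield)


def actual_yield(drawn, completed):
    """Share of the research sample that became the actual sample."""
    return completed / drawn


if __name__ == "__main__":
    # Assumed: half of the selected sources will be unavailable or lost.
    print(research_sample_size(100, 0.5))
    print(actual_yield(200, 90))
```

Reporting the observed yield alongside the reasons for the losses helps readers judge whether the actual sample is still representative.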
Slide 20: Sampling Issues
- Level of Aggregation Issues
  - Organization, Group, Individual, Task
- Sampling Entity Issues
  - Site, Individual, Task, Time, Measurements
- Sample Size Issues
  - Parameterisation, Inference, Description
Slide 21: Sample Structure
- Universe (all possible things)
- Sampling Frame (Systematic Division into Allowable/not Allowable)
- Sample Situation
Slide 22: Ex-sample
- Universe: Users
- Sampling Frame: Firm phone directory
- Sample: Every 3rd Situation
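Taking every 3rd entry from a directory is systematic sampling. A minimal sketch follows, assuming the directory is an ordered list; the extension labels and the `start` offset are hypothetical.

```python
def every_nth(frame, n, start=0):
    """Systematic sample: take every n-th entry, beginning at index `start`."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0 <= start < n:
        raise ValueError("start must be in the range 0..n-1")
    return frame[start::n]


if __name__ == "__main__":
    # Hypothetical firm phone directory with ten entries.
    directory = [f"ext-{i:03d}" for i in range(1, 11)]
    print(every_nth(directory, 3))
```

Choosing `start` at random between 0 and n-1 gives every entry the same chance of selection; the result can still be unrepresentative if the directory's ordering has a cycle that matches n.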
Slide 23: Problems in Sampling
- Convenience Sampling -- unrepresentative
- Lack of a Sampling Frame -- can't sample
- Too small a sample size -- low confidence
- Too large a sample size -- wasted effort
- Sampling the wrong thing -- useless
- Non-representative Sampling -- cannot generalise
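The trade-off between too small and too large a sample can be made concrete with the margin of error of a sample mean. This sketch assumes simple random sampling, an approximately normal sampling distribution, and a 95% level (z = 1.96); the standard deviation of 10 is a made-up value.

```python
import math


def margin_of_error(std_dev, n, z=1.96):
    """Approximate half-width of a confidence interval for a sample mean."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return z * std_dev / math.sqrt(n)


if __name__ == "__main__":
    # Assumed population standard deviation of 10 units.
    for n in (25, 100, 400, 1600):
        print(n, round(margin_of_error(10.0, n), 2))
```

Because the margin shrinks with the square root of n, quadrupling the sample only halves the uncertainty. Small samples therefore give low confidence, and very large ones add effort for little gain.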
Slide 24: 3.3 Data Collection Techniques
- What were the possible choices for data collection technique?
- Why did you choose the method you did?
- Describe the method in detail
- Was there a role for observers, coders, interpretation?
- Show how you handled problems with the technique you selected.
Slide 25: General Data Collection Methods
                                  Survey   Expt.   Obsvn   Case
Real-time vs. retrospective       Ret      RT      RT      RT/Ret
Projective vs. subjective         Pro/Sub  Sub     N/A     Pro/Sub
Researcher- vs. subject-driven    Res      Res     Subj    Subj/Pro
Empirical vs. non-empirical       Emp      Emp     Emp     Emp
- Dimensions
  - Real-time vs. retrospective: observed now, or subject recalls from past
  - Projective vs. subjective: others' experience or the subject's own
  - Researcher-driven vs. subject-driven: researcher creates the stimulus, or the subject does
- Most common methods are case studies, surveys and experiments
- Empirical vs. non-empirical
Slide 26: Data Collection Model
[Diagram: twelve numbered elements linking three actors: Observer, Subject, Interpreter/Coder]
1. Theory
2. Stimulus formulation
3. Stimulus / Question
4. (unlabelled in diagram)
5. Perceived Stimulus
6. Knowledge
7. Ideas
8. Response formulation
9. Response / Answer
10. (unlabelled in diagram)
11. Perceived Response
12. Recorded Response
Slide 27: Data Collection Model
[Same diagram, step 1 annotated]
1. Theory (actually H1): Prior experience with one application influences perception of innovation.
Slide 28: Data Collection Model
[Same diagram, steps 2 and 3 annotated]
2-3. Stimulus / Question: Which of the following applications have you used in the past 12 months?
Slide 29: Data Collection Model
[Same diagram, steps 4 and 5 annotated]
4-5. Perceived Stimulus: Do you know how to do your job?
Slide 30: Data Collection Model
[Same diagram, steps 6 and 7 annotated]
6-7. Knowledge, Ideas: <Hmmm, maybe I look like I don't know what I'm doing here... better deny!>
Slide 31: Data Collection Model
[Same diagram, steps 8 and 9 annotated]
8-9. Response / Answer: Nope, haven't used any of them
Slide 32: Data Collection Model
[Same diagram, steps 10 and 11 annotated]
10-11. Perceived Response: <Hmm, he must be an idiot not to have used these applications>
Slide 33: Data Collection Model
[Same diagram, step 12 annotated]
12. Recorded Response: Don't Know
Slide 34: Observer Errors
- Mistakes that observers commit, usually not observing the right phenomenon or masking subjects' behaviour
[Figure labels: Observer Behaviour; Subject Behaviour]
Slide 35: Observer Errors
- Intrusion, leading questions
- Setting up the situation to give a predetermined answer, interfering with the subject's ability to select an answer by supplying it, assuming an answer, not respecting silence
Slide 36: Observer Errors
- Intrusion, leading questions
- Expectation management problems
- Creating a situation in which subject tries to
guess correct answer or tries to please the
researcher by giving socially mandated or
desirable responses
Slide 37: Observer Errors
- Intrusion, leading questions
- Expectation management problems
- Consultant effect
- Interfering with normal behavior by changing
the situation to favor socially-facilitated
responses or by focusing attention on behavior
under study
Slide 38: Observer Errors
- Intrusion, leading questions
- Expectation management problems
- Consultant effect
- Hawthorne effect
- A consultant-related effect in which behavior is
enhanced because attention has been drawn to it.
Slide 39: Subject Errors
- Many things can influence the subject in his or her responses. Here are some of the sources:
- Memory effects
- Protocol Intrusion effects
- Subject Context and Limitation effects
- Researcher-Subject Interaction effects
- Subject Cognition effects
- Instrument-Subject Interactions
Slide 40: Subject Errors
Generally, these errors are most noticeable and problematic when subjects are used in a retrospective manner. However, any task requiring cognition or performance of any type is subject to most of these problems.
[Diagram labels: Context / Protocol; Subject; Memory; Instrument; Events; Cognition; Response; Researcher. Annotation: Subject is the source of variance we desire.]
Slide 41: Memory Effects
- Memory for events changes over time and under the influence of other events
- Recency
- Primacy
- Von Restorff
- "I don't remember"
- "I used to know"
- Clustering
[Graph: recall/recognition plotted against time since the event remembered]
Slide 42: Protocol Intrusion Effects
- Responses are conditioned not only by what the respondent might know, think or feel, but also by the presence of words or concepts in the stimulus or stimulus situation
- Sequence
- Positive Halo
- Negative Halo
- Demand characteristics
Slide 43: Subject Context and Limitation Effects
- How the subject feels about you, your questions, everything, determines the responses and how the responses are presented.
- Stupidity
- Ignorance
- Ill will towards you, the organization or system, research, or any group you are imagined to be part of or represent
- Resistance
Slide 44: Researcher-Subject Interaction Effects
- Because you are present (or not), your being around may affect what the respondent does and hence how the respondent replies.
- Social facilitation
Slide 45: Subject Cognition Effects
- The subject is not just a machine that reacts. He or she engages in games, strategizes, and tries to understand the situation while working as a response machine.
- Intrusion effects (halo (+/-), sequence)
- Experimenter expectancy
- Evaluation apprehension
- Gamesmanship
- Face games, one-upmanship
- The problem of the in-group (technicians, managers)
Slide 46: Instrument-Subject Interactions
- The instrument may prompt, provoke or prevent response because of its design
- Poor scales for response
- Too many responses, fatigue
- Aesthetic reactions
Slide 47: Recording Errors
- Failure to listen
- Categorization errors
- General carelessness
- Privacy problems
- Too little room on medium
- Over-reliance on tape or technology
- Poor scales
Slide 48: Interpretation Errors
- Misunderstanding
- Poor conceptualization of constructs
- Poor scales
Slide 49: 3.4 Data Integrity Issues
- How data will be recorded
- Potential problems with recording
- How data will be maintained
- Potential problems with maintenance
- How data will be stored, accessed
- Potential problems with storage, access
- Are data confidential?