Logic in Scientific Reasoning

Slides: 169

Provided by: roxanne5

1
Logic in Scientific Reasoning

Definition of Science
Broadly speaking, science can be defined as a
systematic study of nature and the rules which
govern nature (and human behaviour or phenomena),
to identify general statements of fact through
which we can understand and interpret
information.

2
Logic in Scientific Reasoning

Definition of Science
Science derives from the Latin word scientia,
which means knowing.
The term does not refer to particular subject
areas but to an orientation or way of looking at
an area: a systematic approach in the pursuit of
knowledge.
Basically any subject matter can be studied as
science; the essential point is the method used.

3
Scientific Method

The method distinguishes scientific from
non-scientific enterprise.
Here, one uses the tools of logic to assist in
assessing whether a systematic and logical
sequence has been followed in arriving at truth
claims.
The scientific method can be described as the
logic of science, i.e., the principles that can be
used to arrive at an explanation of facts and so
gain knowledge of the world. It is a systematic
approach that is used in the attempt to gain
knowledge.

4
Scientific Method

The method combines both aspects of logic, i.e.,
the inductive process and the deductive process.
The process is cyclical, with the method
beginning and ending with what can be observed or
data that can be gathered through means such as
experiments, that is, through the use of
induction.

5
Steps of the Scientific Method

Identify a problem (remember that problem
here means that which requires explanation, not
the negative sense the word generally has):
nothing can be done before a problem is identified.
Formulate a hypothesis: this cannot be done without
having collected some data that could be used to
try to understand the events and so formulate a
means of explanation.

6
Steps of the Scientific Method

Collect additional data: this generally involves
preparing a research design to identify the
sources of data and how that data will be
collected. The problem at that stage is
identifying what counts as relevant
information/data. Data may be relevant to a
hypothesis that you have not actually formulated,
but not relevant to the one that you are
presently using.

7
Steps of the Scientific Method

Test the hypothesis: how effectively does the
hypothesis explain the facts that it was
formulated to explain, as well as other facts?
Additionally, how well can it predict future
occurrences?
Draw conclusions: as to the efficiency and
effectiveness of the theory in explaining and
predicting events, and how the theory can be used
to achieve some human goals (since that is often
what is being aimed at, or results from
scientific exploration).

8
Laws of Nature

The scientific method should yield a body of
general statements of fact through which we can
understand and interpret information.
These general statements of fact that we
formulate are actually universal statements,
because they should apply to all instances
covered in the statement.

9
Laws of Nature

However, they are only provisional, in that there
is always the possibility of finding evidence to
disprove the statement.
These statements are also empirically based in
that they depend on data gathered from
observation or experiment.

10
Laws of Nature

These general statements are called laws of
nature.
Laws of nature are derived from hypotheses that
are formulated to understand an event's
occurrence.
Before we can have natural laws, we therefore
need to have good hypotheses.

11
Evaluating Hypotheses

Relevance: the hypothesis must not stray from
the phenomenon which needs to be explained.
Testability: it must always be possible to test
whether the hypothesis is plausible by applying it
to the data that you need to understand.
Additionally, the test that is
being done must be relevant to the hypothesis
that is being tested.

12
Evaluating Hypotheses

Predictive and explanatory power: the distinction
between the two is generally seen as temporal,
explanation referring to events that have already
occurred and prediction to those that have not yet
occurred. Yet at the same time, prediction
presupposes that an explanation has already occurred.

13
Evaluating Hypotheses

Compatibility with experimental laws/theories:
if two new hypotheses are offered, we want to accept
the one that accords more with what has already
been established. The idea here is that there is
a constant accumulation of knowledge.
Simplicity: you may find that you have two theories
that both function effectively; when this
happens, use the simpler of the two. The simpler the
explanation, the fewer factors you need
to take into account to give an account
of the event.

14
Social Surveys

Research projects which use a questionnaire to
collect standardized data from a large number of
individuals.
Can be either Population or Sample surveys.
Sample surveys are the most common
The collection of standardized data requires that
the same questions be given to all respondents in
the same order.

15
Types of Surveys

Factual Surveys: Used to collect descriptive
information. Examples: the Population Census, the
Survey of Living Conditions and the Labour Force
Survey.
Attitude Surveys: Carried out by opinion poll
organizations, market researchers, etc.
Explanatory Surveys: Used to test hypotheses or
to test and develop theories.
Common to all types is the use of the
questionnaire as the instrument of data
collection.

16
Survey Design
17
The term Research Design can refer to

the planning of scientific inquiry
the design strategy for finding out something
the arrangement of the conditions of observations

18
All designs require

a precise determination of what you want to find
out
the detailed specification of the most
appropriate and effective way of doing so

19
Factors determining Design

a. The purpose of the Study
b. The Time Dimension
d. Approach to the collection and handling of
the data

20
The purpose of the Study

A particular research project can serve any
one or a combination of the following purposes
Exploration
Description
Explanation

21
Classification According to Time

The time dimension refers to the number of times
participants will be observed in relation to a
particular study.
There are two approaches:
- Cross-sectional: researchers take a
snapshot, a one-time effort in gathering data.
- Longitudinal: permits the researcher to
observe the phenomenon more than once.

22
Correlational Design

The standard research design used in surveys is
the correlational design
In this design constructs are measured
independently of each other and then tested for
associations.
Extraneous variables are controlled by including
them in the study.

23
Causality

The Demonstration of Cause and Effect

24
Causation

Mill argued that every event that occurs has a
cause; this is called the Principle of Universal
Causation.
But how do we identify causal relationships? If
we see two events occurring together
consistently, what right have we to assume that
one is the cause of the other?

25
Explanation

Explanation implies delineating the causal links
between and among variables.
To explain therefore involves asking two types of
questions about any phenomena i.e. how? and/or
why?
How questions are more easily answered than the
why questions because of the nature of the types
of answers required.
How questions can usually be answered using
chronological accounts i.e. the sequence of
events.

26
Explanation (contd)

Answering how questions allows us to decompose
the structure of relationships, i.e. to simplify,
e.g. traditional voting patterns.
Why questions usually require introspection,
rationalization and motivational answers.
Answers to why questions exist outside of both
variables.
Although one type of question is easier to answer
than the other, both are necessary for adequate
explanation.

27
Causality Assessment Criteria

There are four general criteria
1. association
2. time priority
3. non-spuriousness
4. rationale
interpretivists also rely on how they think
things should be ordered.

28
Causality Assessment Criteria: Association

For a causal relationship to exist there must be
co-variation.
Although association is necessary for causality,
the strength of the association does not increase
or decrease causality, e.g. smoking and lung
cancer.
However, if there is consistency, i.e. if the
association is present across different samples
and contexts (established through replication),
there is a greater chance that there is a causal
relationship.
This might be confounded by conditional
relationships.

29
Causality Assessment Criteria: Time Priority

Cause must always precede effect.
This is not clear cut in all cases; it is
difficult at times to determine which variable
should precede the other, e.g. social class and
educational attainment. How do we decide?

30
Causality Assessment Criteria: Non-spuriousness

For two variables to be causally related the
co-variation must not be the function of a third
variable, i.e. a variable that is related to both
variables, e.g. age and religiosity.

31
Causality Assessment Criteria: Rationale

Causality can only be established if theory is
supported by empirical evidence. If there is any
disconnect, causality is questionable.

32
Causation: Meaning of Cause

Necessary Condition: can be stated in various
ways
It has to be present for the event to occur.
If it is absent, the event will not take place.
If the phenomenon (effect) is present, the
condition (cause) is present.
A is a necessary condition for B only if whenever
B is present, A is present.

33
Causation: Meaning of Cause

Necessary Condition
Several necessary conditions will usually be
needed for a phenomenon to occur.
Example: Plants will only grow if they are
exposed to light.

34
Causation: Meaning of Cause

Sufficient Condition: can be stated in various
ways
F is sufficient for H if whenever F is present, H
is present.
If the condition is present, the phenomenon is
present.
Once the condition is present, the event will
occur.
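
The definitions of necessary and sufficient conditions can be checked mechanically against a table of observations. The sketch below is only an illustration: the observations are invented, and the names is_necessary and is_sufficient are hypothetical helpers, not part of the original slides.

```python
# Invented observations: was each condition present, and did the plant grow?
observations = [
    {"light": True,  "water": True,  "growth": True},
    {"light": True,  "water": False, "growth": False},
    {"light": False, "water": True,  "growth": False},
    {"light": False, "water": False, "growth": False},
]

def is_necessary(condition, effect, data):
    """A is necessary for B if, whenever B is present, A is present."""
    return all(row[condition] for row in data if row[effect])

def is_sufficient(condition, effect, data):
    """F is sufficient for H if, whenever F is present, H is present."""
    return all(row[effect] for row in data if row[condition])

print(is_necessary("light", "growth", observations))   # True
print(is_sufficient("light", "growth", observations))  # False
```

In this invented data, light is necessary for growth (no growth occurs without it) but not sufficient (light without water gives no growth), which matches the point that several necessary conditions are usually needed together.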

35
Causation: Meaning of Cause

Sufficient Condition
Example:
The rain falling heavily is a sufficient
condition for the road being wet, but one can
conceive of other ways in which the road could
have become wet without the rain falling.

36
Causation: Meaning of Cause

Remote vs. proximate cause
Remote cause: looking at the more distant
conditions that resulted in the event's
occurrence.
Proximate cause: the conditions that are in
place just before the event's occurrence.
Remote and proximate causes will be linked by
intermediate causes in what can be labelled a
causal chain.

37
Causal Claims/Statements

These may be singular or general (we are usually
more interested in the general)
Singular
The slippery road surface caused the accident
today.
The tsunami in Asia on December 26, 2004 claimed
500 000 lives.
General
Absence makes the heart grow fonder
Exercising reduces the risk of a heart attack.
Dirty campaigning wins elections.

38
Identification of Causal Relationships

Universality of causation: if an event resulted
from certain conditions being in place, we expect
that this will happen again in the future, if the
same conditions are in place.
This principle is the underlying basis through
which we can develop general causal statements.

39
Identification of Causal Relationships

It is never possible to see causal
relationships directly; we have to infer these
relationships from observations.
Conclusions drawn from observations are
potentially flawed, as correlation does not imply
causation.
Correlation: looking at the rate at which a
property is seen among two populations/groups of
events.
The identification of a causal relationship
generally starts with a correlation, but goes
further by identifying a set relationship between
the two, where the presence of one results in the
presence of the other, or the absence of one
results in the absence of the other.

40
Common Mistakes In Assigning Causal Relationships

One may identify two events that are correlated, but
reverse the causal factors.
One may identify a correlation between two events
when both events are the result of a common cause.
One may have correlations that are purely accidental.
Note: These errors will result in the Fallacy of
False Cause.

41
Identification of Causal Relationships

Once a correlation has been identified, there
needs to be a further exploration as to whether
the events in question are in fact causally
linked and, if they are, which is the cause and
which is the effect.
This is usually the most difficult to achieve,
because one needs to have strong support for the
causal claim being made.

42
Identification of Causal Relationships

Several methods have been developed to ensure
that when a causal statement is developed, it
will actually hold true in most, if not all,
cases.
John Stuart Mill developed five such methods,
called Mill's Methods of Experimental Inquiry,
namely:
Method of Agreement
Method of Difference
Joint Method of Agreement and Difference
Method of Residues
Method of Concomitant Variation

43
Mill's Methods: Method of Agreement

Looking for instances where two events
consistently occur together, one being an
antecedent to the other, the antecedent condition
being identified as the cause.
Example:
In the case of chicken pox, you may notice that
each time someone has chicken pox and someone
touches her, the person also gets chicken pox.
When you have seen this on several occasions, you
can develop a general causal statement.

44
Mill's Methods: Method of Agreement

Each occurrence of this event with the conditions
laid out would provide confirmation for the
causal statement. In a case where you already
have the cause, you are seeing a sufficient
condition being identified.
If you already have the effect, you are then trying
to identify the one condition that is
present in all cases where the phenomenon occurs,
so you are looking for the necessary condition.
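
One way to picture the Method of Agreement is as a set intersection: list the antecedent conditions for every case in which the phenomenon occurred and keep only what they all share. This is a minimal sketch with invented case records; method_of_agreement is a hypothetical helper name.

```python
# Invented records: antecedent conditions noted in each case where
# the phenomenon (catching chicken pox) occurred.
cases_with_phenomenon = [
    {"touched infected person", "shared toys", "same classroom"},
    {"touched infected person", "ate lunch together"},
    {"touched infected person", "shared toys"},
]

def method_of_agreement(cases):
    """Return the conditions common to every case where the phenomenon occurred."""
    common = set(cases[0])
    for conditions in cases[1:]:
        common &= conditions  # keep only conditions seen in this case too
    return common

candidates = method_of_agreement(cases_with_phenomenon)
print(candidates)  # {'touched infected person'}
```

If the intersection contains more than one condition, or none, the method alone cannot single out a cause.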

45
Mill's Methods: Method of Agreement

Limitation
It may not be possible to narrow it down to one
differentiating but common condition for all
instances of the phenomenon.
Another method may therefore be needed to
supplement it.

46
Mill's Methods: Method of Difference

You are looking at the conditions that are
present when an event occurs and when it does not
occur.
If there is one condition that is present when
the event occurs and is absent when it does not
occur, that condition can be seen as the cause of
the event.
This is the method that is generally used in the
controlled experiment. The aim is to control every
other variable and see what differs in
the absence of one variable.
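
The Method of Difference can be sketched as a set difference between one case where the event occurred and one where it did not. The case contents below are invented, and method_of_difference is a hypothetical helper name.

```python
def method_of_difference(case_with_event, case_without_event):
    """Return conditions present when the event occurs and absent when it does not."""
    return case_with_event - case_without_event

# Two invented cases that are alike in every respect except one.
with_event = {"played together", "same room", "touched infected child"}
without_event = {"played together", "same room"}

difference = method_of_difference(with_event, without_event)
print(difference)  # {'touched infected child'}
```

The result is only as trustworthy as the assumption that the two cases really do match on every other condition.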

47
Mill's Methods: Method of Difference

Example:
For the chicken pox, you may notice that you
have three children playing with someone who has
chicken pox. You watch carefully and no child
touches the child that has chicken pox, and no
child gets the virus, so in the absence of
touching, the virus is not spread. So touching is
the cause of the virus transmission.
A necessary condition is what is usually
identified, or the basis for the necessary
condition.

48
Mill's Methods: Method of Difference

Limitations
You cannot control every other variable under normal
conditions. Even a minor factor could change,
which would make the use of this method
problematic.
Very often, the cause of an event's occurrence is
various factors being combined; you can then have
more than one condition being identified as the
cause, when all are actually required.

49
Mill's Methods: Joint Method of Agreement and
Difference

If you can find both instances when the condition
and the phenomenon are present and when both are
absent, there are stronger grounds for making a
causal connection than using solely Agreement or
Difference. This amounts to using both methods at
the same time.

50
Mill's Methods: Joint Method of Agreement and
Difference

For the chicken pox, you may notice that you have
three children playing with someone who has
chicken pox. You watch carefully and two of the
children touch the child that has chicken pox,
and they get the virus. One child does not touch
the infected child and does not get chicken pox.
You can therefore conclude that chicken pox is
transmitted through touching the infected person.
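
The three-children example can be written as a small check that the condition is present in every case with the phenomenon and absent in every case without it. This is a sketch only; joint_method_holds is a hypothetical helper name.

```python
# The example above as (touched infected child?, caught chicken pox?) pairs.
children = [
    (True, True),    # child 1: touched, infected
    (True, True),    # child 2: touched, infected
    (False, False),  # child 3: did not touch, not infected
]

def joint_method_holds(records):
    """True if the condition is present in every positive case and absent in every negative one."""
    positives = [cond for cond, effect in records if effect]
    negatives = [cond for cond, effect in records if not effect]
    if not positives or not negatives:
        return False  # the joint method needs both kinds of instance
    return all(positives) and not any(negatives)

print(joint_method_holds(children))  # True
```

A single child who touched the infected child without falling ill, or fell ill without touching, would make the check fail.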

51
Mill's Methods: Method of Residues

There is a complex mix of events occurring.
You already know the cause of some of the
phenomena that are being exhibited. If you remove
those causes and their phenomena, then you will
have left some conditions and some effects.
A B C -> g h i
A has previously been identified as the cause of
g
B has previously been identified as the cause of
h
Therefore C is the cause of i
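
The A B C / g h i schema amounts to subtracting what is already explained and seeing what is left over. A minimal sketch, using the same letters as the schema; method_of_residues is a hypothetical helper name.

```python
conditions = {"A", "B", "C"}
effects = {"g", "h", "i"}
known_causes = {"A": "g", "B": "h"}  # established by earlier inductions

def method_of_residues(conditions, effects, known):
    """Remove the known cause/effect pairs and return what remains unexplained."""
    remaining_conditions = conditions - set(known)
    remaining_effects = effects - set(known.values())
    return remaining_conditions, remaining_effects

residual = method_of_residues(conditions, effects, known_causes)
print(residual)  # ({'C'}, {'i'})
```

The inference to "C causes i" is only clean when exactly one condition and one effect remain.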

52
Mill's Methods: Method of Residues

You eat lunch, after which you start to feel ill. You
have indigestion, vomiting and a rash. You had a
tuna salad (which has mayonnaise), coconut water
and a banana. If you know from past experience that
mayonnaise will give you a rash if it is combined
with black pepper, and that coconut water can cause
indigestion, then the banana is the probable
cause of the vomiting.

53
Mill's Methods: Method of Residues

Limitations
You may not be able to identify all the probable
causes of the event in question.
You would have to go to another source to
identify which of the possible causes that you've
identified is the one that gave you the rash.
The causal links that are used to exclude the other
occurrences from the equation have to be well
established from other inductions. How do you do
this?

54
Mill's Methods: Method of Concomitant Variation

If you see a consistent variation between two
phenomena, one increasing while the other
decreases OR both increasing at the same time OR
both decreasing at the same time, then you have a
basis for arguing that there is a causal
relationship between the two events.
Example:
Smoking causes lung cancer.
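
A rough way to see what "consistent variation" means is to order the cases by one phenomenon and check whether the other always moves in the same direction. The sketch below uses invented number pairs, and variation_pattern is a hypothetical helper; it detects a pattern of co-variation only and says nothing about whether the link is causal.

```python
def variation_pattern(pairs):
    """Classify how y moves as x increases: 'direct', 'inverse' or 'none'."""
    if len(pairs) < 3:
        raise ValueError("need several cases, not just one or two observations")
    ordered = sorted(pairs)                      # order cases by x
    ys = [y for _, y in ordered]
    steps = [later - earlier for earlier, later in zip(ys, ys[1:])]
    if all(step > 0 for step in steps):
        return "direct"    # both increase (or both decrease) together
    if all(step < 0 for step in steps):
        return "inverse"   # one increases while the other decreases
    return "none"

print(variation_pattern([(1, 3), (2, 5), (3, 6), (4, 9)]))  # direct
print(variation_pattern([(1, 9), (2, 7), (3, 4), (4, 2)]))  # inverse
print(variation_pattern([(1, 3), (2, 5), (3, 4), (4, 6)]))  # none
```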

55
Mill's Methods: Method of Concomitant Variation

Limitations
Concomitant variation may be seen in situations
where there is actually no causal connection
between the two events.
You cannot simply use two observations or instances;
you need to look at several cases.
When one mistakes correlation for cause, it is
generally because of the misuse of this method.

56
Mill's Methods: Concluding Remarks

Mill set out to try to show that we can arrive at
certainty using induction by formulating his five
methods. However, this cannot be achieved,
because certainty, in the strictest sense, is not
possible in induction.
He also searched in each of his methods to
identify one cause, but many events have several
factors which work together to lead to the
occurrence of an event.

57
Mill's Methods: Concluding Remarks

Mill's Methods can be best seen as a means for
testing hypotheses that we have formulated to
explain particular phenomena we have observed or
certain questions we want answered. This is
because we will be limited in terms of the number
of possible instances that we can identify to
explain a particular occurrence: you've already
set limits once you start to use Mill's Methods;
they cannot be used in a vacuum.

58
MEASUREMENT
59
What is measurement?

The process of assigning numbers or labels to
units of analysis in order to represent
conceptual or variable categories.
Scientific norms require that we fully describe
our methods and procedures so that others can
repeat our observations and judge the quality of
our measurements

60
Steps in the measurement process

1. Conceptualization
2. Operationalization
3. Specification of levels of measurement
4. Test for reliability and validity

61
1. Conceptualization

Creating factual or constitutive definitions of
concepts
Define concepts in relation to other concepts.
Ordinary language

62
2. Operationalization

Describing concepts in the language of
measurement
Creating measurable definitions of concepts

63
3. Levels of Measurement

set of rules used in the labeling or quantifying
of variables.
There are 4 levels, each assuming different
interpretation of the numbers or labels assigned
to the variables
Nominal Level
Ordinal Level
Interval Level
Ratio Level

64
Nominal variable

the term nominal means to name
measurement simply involves attaching names or
labels to the variables
observations are merely classified into
categories.
indicate whether things are the same or are
different
numbers or labels are assigned to categories as
codes to facilitate data collection and analysis
the only mathematical relationship that can be
assumed is that of equivalence.

65
Nominal (cont.)

Cases placed in a given category must all be the
same.
Also, the categories must be
exhaustive -sufficient categories so that all
the items can fit into one of the categories
mutually exclusive -the items being classified
must not fit in more than one category

66
Ordinal

Variables classified as ordinal also have the
characteristics of mutually exclusive and
exhaustive categories
In addition, this level of measurement has the
additional feature of logical ordering of the
attributes of the variables.
In other words, you can order or rank the
variables under consideration

67
Interval level

This level has the qualities of nominal and
ordinal measurements.
In addition, there are equal distances (intervals)
between the categories.
Example: The difference between 20°C and 30°C is
the same as the difference between 90°C and
100°C (i.e. 10°C).
We can infer not only that 100°C is hotter than
90°C
but also by how much, because of a standard
measurement.

68
RATIO LEVEL

This level includes all the features of the other
levels of measurement
In addition, there is an absolute zero point.
Hence, values can be multiplied and divided.
Example: income measured in dollars can be
divided one into another to form a ratio.
Zero means absolutely nothing in this level of
measurement
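
The practical difference between the interval and ratio levels shows up when you try to form a ratio. The sketch below uses temperature as an illustration: Celsius has an arbitrary zero, while Kelvin has an absolute zero. The variable names are illustrative only.

```python
# Two temperatures on the Celsius scale (interval level: arbitrary zero).
c_low, c_high = 20.0, 40.0

# Differences are meaningful at the interval level.
difference = c_high - c_low            # 20.0 degrees

# A ratio can be computed, but "twice as hot" is an artefact of where
# the Celsius zero happens to sit.
ratio_celsius = c_high / c_low         # 2.0

# On the Kelvin scale (ratio level: absolute zero) the ratio is meaningful.
k_low, k_high = c_low + 273.15, c_high + 273.15
ratio_kelvin = k_high / k_low          # about 1.07, not 2

print(difference, ratio_celsius, round(ratio_kelvin, 2))
```

The same two temperatures give a ratio of 2 on one scale and about 1.07 on the other, which is why ratios are reserved for measures with an absolute zero, such as income.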

69
Reliability and Validity
70
Reliability

Reliability is concerned with issues of
stability and consistency
It is the extent to which a measuring instrument
produces the same result on repeated applications
under similar conditions.
When repeated measures of the same thing give
identical or very similar results, the
measurement instrument is said to be reliable.

71
Reliability Assessment Techniques

1. The Test-Retest Method
2. The Alternate-Form Method
3. The Split-Half Method
4. The Established Measures Method
5. Inter-coder/Research Workers Reliability

72
1. The Test-Retest Method

This is the simplest approach to assessing
reliability
It involves testing/measuring the same persons or
units on two separate occasions and then checking
for statistical correlation between the two sets
of scores.

73
Procedures

Test - Administer the same test to some individuals
on more than one occasion.
Compare - Scores of each individual on first
testing are related to scores of second testing
to provide a reliability coefficient.
Results - Coefficient can vary from 0 (zero),
indicating no relationship between the sets of
scores to 1 (one), indicating perfect
relationship
Interpret - High coefficient close to 1 is
desirable, since it is an indication of a strong
relation between the scores or an indication that
instrument is, indeed, measuring stable /enduring
characteristics.
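
The Compare step can be sketched as a correlation between first and second scores. The slides do not name the coefficient, so Pearson's r is an assumption here, and the scores are invented. Note that Pearson's r can also be negative, although a usable instrument would be expected to fall between 0 and 1 as described above.

```python
import math

def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_1 = sum(first) / len(first)
    mean_2 = sum(second) / len(second)
    covariation = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    spread_1 = sum((a - mean_1) ** 2 for a in first)
    spread_2 = sum((b - mean_2) ** 2 for b in second)
    return covariation / math.sqrt(spread_1 * spread_2)

# Invented scores for five people tested on two occasions.
first_testing = [12, 15, 9, 20, 17]
second_testing = [13, 14, 10, 19, 18]

reliability = pearson_r(first_testing, second_testing)
print(round(reliability, 2))  # close to 1, suggesting stable scores
```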

74
Advantages and Disadvantages

Advantages
Requires only one form of a test
Provides information as to test consistency over
time.
Disadvantages
Affected by practice and memory
Influenced by events that might occur between
testing sessions.
Requires the administration of two tests

75
2. The Alternate-Form Method

This approach reduces the likelihood that
practice and memory will inflate reliability
coefficient
Involves the use of two tests, the second being a
parallel form of the first

76
Procedures

Test - Administer alternate forms of a test to
same people
Compare - Compute the relationship between each
person's score on the two forms
Result - as above
Interpret- as above
It must be noted that this approach requires two
forms of a test, which parallel one another in
content and the mental operations required. In
addition, items on one form must match items on
the other form, with corresponding items
measuring same quality or characteristic.

77
The Split-Half Method

a quick way of determining internal consistency
Procedures
1. Split test in two halves (odd versus even
or by random selection)
2. Administer to two groups
3. Relate scores of both groups
This approach determines whether each half of
the test is measuring same quality or
characteristic

78
The Established Measures Method

Use a standardized test/scale to measure the
quality or characteristics of interest.
These are instruments for which reliability has
already been established.

79
Inter-coder/Research Workers Reliability

Assesses the extent to which different
interviewers, observers or coders, using the same
instrument get equivalent results.
Involves the independent assessment and
comparison of selected interviewers, coders or
observers to determine consistency in judgments.
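
The simplest comparison of two coders is the share of items on which they assign the same code. This is a minimal sketch with invented codes; percent_agreement is a hypothetical helper, and simple agreement is only one of several possible measures.

```python
def percent_agreement(coder_a, coder_b):
    """Proportion of items that two coders placed in the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

# Invented codes assigned independently to the same five open-ended answers.
coder_a = ["positive", "negative", "negative", "positive", "neutral"]
coder_b = ["positive", "negative", "positive", "positive", "neutral"]

agreement = percent_agreement(coder_a, coder_b)
print(agreement)  # 0.8
```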

80
Factors Contributing to unreliability of a
test

Familiarity with the particular test
Fatigue
Stress.
Physical conditions of the room in which the test
is given
Health of the test taker.
Fluctuation of human memory
Amount of practice or experience by the test
taker of the specific skill being measured.
Specific knowledge that has been gained outside
of experience being evaluated by the test.
A test that is overly sensitive to the above
items is not reliable.

81
Validity

The extent to which a measuring instrument
measures what it purports to measure
the truthfulness or accuracy of a measure.
Types
Criterion-related validity
Construct validity
Content validity

82
Criterion-related validity

Check instrument for its predictive power.
Relate performance on the test to some actual
behavior it is supposed to indicate.
Example: Driver's test. Relate performance on
the written exam to use of the road: use of turn
signals, observation of signs, etc.
That is, relate test to performance criterion.
83
Construct Validity

Established by relating a presumed measure of a
construct or hypothetical quality to some
behaviour or behaviour manifestation it is
assumed to indicate
Relate performance on the test to some construct
that the test score is assumed to indicate
Example: Self-esteem scale. A person who scores
high on the test is assumed to have high
self-esteem. Is that person also extroverted,
gregarious, highly motivated, etc.?

84
Content validity

To what extent do the items on a test adequately
represent all facets of the concept being
measured?
Determined by ensuring that sample set (of items
on test) is representative of actual set
A test in Social Research should cover all areas
on course outline to be content valid

85
Dimensioning
86
Quality of life

Quality of Life: There are many components to
well-being. A large part is standard of living,
that is, the amount of money and access to goods
and services that a person has (easily measured).
Additionally, the concept refers to freedom,
happiness, and satisfaction with life, which are far
harder to measure and could be more important.

87
Quality of life

The concept of quality of life incorporates two
major dimensions
Objective Living Conditions: This dimension
concerns the ascertainable living circumstances
of individuals, such as working conditions, state
of health or standard of living.
Subjective Well-Being: This dimension covers
perceptions, evaluations and appreciation of life
and living conditions by the individual citizens.
Examples are measures of satisfaction or
happiness.

88
Questionnaire Design
89
The Questionnaire

A questionnaire is a collection of questions and
/or statements that is designed to collect
information on a particular topic.
It is an instrument used by researchers to
convert into data, information directly given by
respondents.
In essence, it provides access to what is inside
the person's head

The questionnaire facilitates the
measurement of what a person
knows - knowledge, information
likes and dislikes - values, preferences
thinks - attitudes, beliefs
experiences - past and present
It is a useful alternative when direct
observation is not possible.

This approach to data collection requires
that the respondent
co-operates in the completion of questionnaire
tells what is, instead of what he thinks ought to
be, or what he imagines the researcher would
like to hear.
knows how he feels or thinks in order to report.
It is possible therefore for the questionnaire to
measure not necessarily what a person likes,
believes or thinks but what he/she indicates in
these regards.

The researcher must, therefore, pay attention to
the following factors
The respondent may not be able to provide answers to
the questions posed, out of ignorance etc.
Respondent bias
Acquiescence: the tendency to agree to statements
regardless of their content
Social desirability: respondents give
the socially or culturally correct answer rather
than what they actually believe or feel.
Practice effects

93
Questionnaire construction

The structure of any questionnaire is determined
by
Theoretical considerations
Method of data analysis
A questionnaire for a correlational or
explanatory survey typically has the following
types of items
Measures for the dependent variable(s)
Measures for the independent variable(s)
Background measures

94
Questionnaire construction (contd)

Question content
There are five types of question content
Behaviour: what people do
Belief: what people think is true
Knowledge: items that measure respondents'
knowledge of knowable facts or the accuracy of
their beliefs
Attitude: what people desire or find desirable
Attributes: characteristics of people
Each type of question is specific to the
characteristics that it is supposed to measure.

95
Questionnaire construction (contd)

Principles of item design
All items must achieve the following
Reliability
Validity
Adequate Discrimination
High response rates
Consistent interpretation across respondents
Relevance to the overall research endeavour

96
Questionnaire construction (contd)

Wording items
In wording items it is important to consider the
following
Language simplicity
Length of the item
Avoiding double barrelled questions
Avoiding leading items
Avoiding negatively worded items
Avoiding ambiguous items
Avoiding prestige bias
Avoiding words (qualifiers) that will influence
responses
The level of precision required to answer the
item
The level of precision of the item and the
knowledge required
Time and space requirements
The use of personal or impersonal wording

97
Types of Questions

Direct versus indirect (Specific vs. Non
Specific)
a. Do you like your job? - direct (specific)
b. How do you feel about your job? -
indirect (non-specific)
a. How do you feel about teacher A? - direct
(specific)
b. How do you feel about the class taught by teacher
A? - indirect (non-specific)
Direct or specific questions may cause the respondent
to become guarded or cautious and give less than
honest answers. Non-specific ones lead to the desired
information with less alarm.

98
Types of Questions (contd).

Questions versus Statements - Can be a direct
question as those types above (requiring a direct
answer) or a statement requiring an optional
response.
Predetermined versus Response Keyed Questions -
Answer all vs. answer those that are relevant.

5. Do you drink alcoholic beverages?

1. Never 2. Occasionally
3. Frequently 4. Always
(If never, go to 6 and then terminate.
Otherwise, skip to 7 and continue)
6. Why don't you drink alcoholic beverages?
1. Religious reasons 2. Health reasons 3.
Others (Specify) ______
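
The routing instruction in this response-keyed item can be sketched as a small function that returns the next question number. The question numbers and answer labels come from the example above; next_question is a hypothetical helper, and treating every other question as simply leading to the next one is an assumption.

```python
def next_question(current, answer):
    """Return the number of the next question, or None to end the interview."""
    if current == 5:
        # "If never, go to 6 ... Otherwise, skip to 7 and continue"
        return 6 if answer == "Never" else 7
    if current == 6:
        # "... go to 6 and then terminate"
        return None
    return current + 1  # assumption: all other questions are asked in order

print(next_question(5, "Never"))            # 6
print(next_question(5, "Occasionally"))     # 7
print(next_question(6, "Health reasons"))   # None
```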

100
RESPONSE MODES
101
Structured Response (Close-ended)

Provide respondent with possible answers and ask
him/her to choose the most appropriate option.
When the closed-ended format is used, the
researcher should be guided by the following
- Response categories provided should be
exhaustive
- Response options should be mutually
exclusive
- There should be clear instruction to
select the best answer
This format is respondent-friendly and makes data
processing easier, since responses can be transferred
directly to a computer. It does, however, limit the
possible answers to those thought of by the researcher.

102
Structured Response (Closed-ended)

For closed-ended items, the responses must satisfy the
following requirements
Exhaustiveness
Exclusiveness
Balanced categories

103
Unstructured Response (Open-ended)

Researchers ask questions and allow respondents
to provide answers
Exert control only in regard to the questions
asked and the time and space provided.
Respondents give own answer, rather than just
agreeing with those given.
Format offers the respondent more flexibility

104
Disadvantages of Open-ended Format

Responses must be coded before processing - The
coding process can be time consuming and can be
quite technical. It requires the researcher to
accurately interpret the meaning respondents give
to their responses. There is always the possibility
of misunderstanding and researcher bias.
Respondents quite often provide answers that are
irrelevant to the researcher's intent.

105
Fill-in Responses.

This is a transitional mode between the structured and
unstructured modes.
Respondents generate, rather than choose, answers
Responses are, however, limited in range and
length - often a single word or short phrase
Example: What is your father's occupation?
The very wording of the question restricts the
number of possible responses and the number of
words.

106

Tabular Responses - Fill response into a table. A
very convenient way of organizing complex
responses.
Scaled Response - A structured response form.
Respondents are asked to express endorsement or
rejection of a given statement.
Numerical rating scales
These scales require respondents to give one
response for each item
The resulting variable responses can be ordered
from high to low
The numbers represent the intensity of the
sentiment being expressed
Likert
Horizontal rating scale
Semantic differential scales
Vertical rating ladder

107
Fill-in responses (contd)

Scoring - Out of 10
Ranking response - Respondents are given some
statements, etc. and asked to rank them according to
some criteria.
Checklist Response (Multiple response format) -
Respondents choose all applicable answers from a
number of options given to them

108
Fill-in responses (contd)

Binary choice formats
Dichotomous
Paired comparisons - given these two choices, which
do you consider to be more important?
Multiple choice formats
The respondent is asked to select one response
from a set of responses.
Multiple nominal categories
Multiple ordinal categories
Multiple ordered attitude statements

109
Fill-in responses (contd)

Non-committal responses
No opinion
Don't know
Middle non-committal answer

110
Fill-in responses (contd)

The number of response categories
The fundamental concern is that there are
response categories with which respondents can
comfortably represent themselves. Secondly, the number
depends on the ability of respondents to
answer the item and the extent to which they can
do so.
Dichotomous
Five-point scales - information about intensity,
extremity and direction
Longer scales - greater discrimination and
therefore finer detail

111
Development Issues

In constructing the questionnaire, the researcher
should always consider the following factors
Format: Wording
Precision - Questions should be clear and
unambiguous
Concision - Items should be as short as possible
Relevance - Questions should all be relevant and
necessary
Double-barreled Questions - Each question
should attempt to measure only one variable at a
time
Biased Items/Terms - Leading questions should not
be used
Negative Items - Questions should be in positive
form
Abbreviations and Jargon - These should always
be avoided

112
Development Issues (contd).

Format: Layout
Uncluttered - Items should be well spaced out
Order - Items should flow in a logical order. The
ordering of questions affects the quality of
responses
Length - There should not be too many items; the
instrument shouldn't be too long
Personal Information - Request only when required
Instructions - Always provide adequate
instructions, both general and specific.

113
Pilot or Pre-testing

Questionnaire development

114
Why Pre-test?

It is human to err: irrespective of how
systematic or careful we are in the questionnaire
design process, we will more than likely make
errors.
The standardized questionnaire is inflexible.
Once the instrument is developed and data
collection has started, mistakes cannot
be corrected. If modifications to the
questionnaire are made during data collection,
any data collected before the changes will become
useless.
To determine the length of time the questionnaire
takes to complete.

115
Conducting the Pre-test

Use respondents that are part of the population
but not part of the sample. The basic requirement
is that the questionnaire should be relevant to
those responding.
The interview should be conducted in conditions
as similar as possible to those under which the
actual interviews will be conducted. That is, similar
settings and the interviewers who will
eventually conduct the interviews should be used.
Test in waves: after each set of
revisions to the instrument, the revised
questionnaire should be re-tested on a new set of
respondents. These waves of testing should
continue until the instrument is as clean as
possible.

116
Sampling and

Sampling distributions

117
What is Sampling?

A process for identifying and selecting elements
for observation (Babbie, 2000)
The selection of units of observation in such a
manner that a researcher can make relatively few
observations yet still generalize to the
larger population
A systematic way of deciding what or whom to
observe when limited resources dictate that the
few instead of the many be observed

118
Why Sample?

Factors justifying the observation of a sample
rather than the entire population
Cost constraint
Time constraint
Timely Results
Accuracy - Planning and logistics are more manageable
Human resource constraint
Practicability - e.g. the population may be
inaccessible

119
Some Important Concepts/Terms

Representativeness - typical of the population. A
sample is representative of the population if the
aggregate characteristics of the sample closely
approximate those same aggregate characteristics
of the population
Equi-probability - the probability or chance of
being selected is the same for all members of the
population. Sampling methods that ensure
equi-probability are classified as Random
Sampling Techniques

120
Some Important Concepts/Terms

Bias - giving a quality or characteristic
more/less attention or emphasis than it merits.
A biased sample is not typical/representative of
the population
Factors contributing to bias in a sample
- Convenience: tendency to include units that
are easily accessed
- Personal biases
- Ignorance about composition or description
of population
- Flawed Method

121
Some Important Concepts/Terms

Population - Totality of items or units about
which the researcher wants information
Sample Frame - An accurate specification of all
units of interest to a particular study. It is an
operational definition of the population under
consideration. A list of all units of interest.
Sample - A subset of the population drawn from
the sample frame. A representative smaller group
that is systematically selected from the
population
Element - A unit of the sample drawn from the sample
frame of a particular population. A member of the
sample

122
Some Important Concepts/Terms

Sampling error - the difference between
the value of a statistic and the value of the
corresponding population parameter.
Periodicity - Where there are periodic/cyclical
patterns in the population that correspond to the
sampling intervals (a problem specific to
systematic random sampling, which results in a
biased sample).
This occurs when the arrangement of the items in
the population is defined by some characteristic.

123
Two types of Sampling

Probability
- Elements are drawn based on chance/random
procedures
- Every member of the population has a known
(non-zero) probability of being selected
Non-probability
- Elements are not chosen by chance/randomly
- Determined on the basis of expertise,
personal judgment, knowledge or convenience

124
Non-probability Sampling

In many research situations, the enumeration
of the population elements (a basic requirement
of probability sampling) is difficult or
impossible
At other times, a representative sample is not
appropriate given the aim/purpose of the research
In these instances, a non-probability technique
will suffice

125
Probability Sampling

There are four main types of probability samples.
The choice among them depends on
The nature of the research problem
The availability of good sampling frames
Money
The desired level of accuracy in the sample
The method by which the data is to be collected
Four Types
Simple random sampling
Stratified random sampling
Systematic sampling
Multistage Area/Cluster sampling

126
Simple Random Sampling

All members of the population have an equal and
independent chance of being included
Define population
List accessible members of population (complete
sample frame)
Decide on the required sample size
Select sample by employing chance procedure (e.g.
table of random numbers)

127
SRS Table of Random Numbers

Table containing columns of digits that have been
computer generated.
Assign each member of the population a distinct
identification number
Use the table to select, systematically, the subjects
to be included in the sample
It is customary to determine by chance the point at
which the table is entered
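
As a rough illustration of the simple random sampling steps, the sketch below uses Python's pseudorandom generator in place of a printed table of random numbers. The function name, the frame of 150 identification numbers, the sample size and the seed are all assumptions made for the example.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sample frame, each equally likely."""
    rng = random.Random(seed)    # a seed only makes the draw repeatable
    return rng.sample(frame, n)  # sampling without replacement


# Hypothetical frame: members numbered 1 to 150
frame = list(range(1, 151))
sample = simple_random_sample(frame, 15, seed=1)
print(sorted(sample))
```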

128
SRS

Disadvantages
Requires an unbiased sample frame
Impractical when surveying populations in diverse
geographical areas

129
Systematic Sampling

Involves drawing a sample by taking every
kth case from a list of the population
First decide on the number of subjects in the
sample (n)
Since the total number in the population (N) is
known, divide N by n to determine the
sampling interval (k) to apply to the list

130
Systematic Sampling

The first number is randomly selected from the
first k members of the list (N.B. if this is
not done the list will be exhausted before the
sample is selected) and every kth member of the
population thereafter is selected for the sample
If a fraction is obtained, truncate the number and
proceed. If the number is rounded up instead, the
list may be exhausted before the sample is drawn.
Population N = 500, desired sample n = 50
k = N/n
= 500/50 = 10

131
Systematic Sampling

Start near the top of the list and select the
first case randomly from the first 10 cases and
then every 10th case thereafter
Differs from simple random sampling because
choices are not independent. Once the first
case is chosen, all subsequent cases are
automatically determined.
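
The systematic procedure (interval k = N/n truncated, random start within the first k cases, then every kth case) might be sketched as follows. The function name and the numbered frame are illustrative assumptions; the 500/50 figures are those of the slide example.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every kth unit after a random start within the first k units."""
    k = len(frame) // n        # truncate, so the list is never exhausted early
    rng = random.Random(seed)
    start = rng.randrange(k)   # random position among the first k units
    return [frame[start + i * k] for i in range(n)]


# Slide example: population of 500, desired sample of 50, so k = 10
frame = list(range(1, 501))
sample = systematic_sample(frame, 50, seed=3)
print(sample[:5])
```

Only the starting point is random here, which is why the choices are not independent.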

132
Systematic Sampling

If the original population list is in random
order, then systematic sampling would yield a
sample that could reasonably be substituted for a
random sample
If list is alphabetical or otherwise structured,
(e.g. people are positioned in population
according to a given characteristic) the sample
may be biased (Periodicity).
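
To see how periodicity can bias a systematic sample, consider a made-up list in which every 10th unit differs from the rest (say, corner houses on a street plan). The list, the labels and the interval below are invented purely for illustration.

```python
# Hypothetical frame of 500 units in which every 10th unit is a "corner" unit
frame = ["corner" if i % 10 == 0 else "interior" for i in range(500)]
k = 10  # sampling interval that happens to match the cycle in the list

# Share of corner units in the sample for each possible starting position
share_by_start = {}
for start in range(k):
    sample = frame[start::k]  # every kth unit from this start
    share_by_start[start] = sample.count("corner") / len(sample)

true_share = frame.count("corner") / len(frame)
print(true_share)      # share of corner units in the population
print(share_by_start)  # share in each possible systematic sample
```

In this setup every possible sample contains either only corner units or none, so no starting point reproduces the population share.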

133
Systematic Sampling

Advantage
Simplest method to select a random sample
Disadvantage
Requires an unbiased sample frame
Periodicity

134
Exercise

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75,
76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105,
106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120,
121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135,
136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150.

135
Stratified Sampling

Use when population contains a number of
subgroups/strata that may differ in the
characteristics being studied
First identify the strata of interest and then
draw a specified number of subjects from each
stratum
This approach improves representativeness and
facilitates the studying of differences between
subgroups

136
Stratified Sampling

To stratify a sample
Select the stratifying variable
Divide the sampling frame into separate lists, one
for each category of the stratifying variable
Draw a systematic sample or an SRS from each list
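
A minimal sketch of these stratification steps, assuming proportionate allocation (the same sampling fraction in every stratum, which the slides do not prescribe) and an SRS within each list. The function name, the two-group frame and the 10% fraction are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Split the frame by stratum, then draw an SRS of the same fraction from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)  # one list per category
    sample = []
    for label in sorted(strata):
        members = strata[label]
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical frame: 60 units in stratum "A" and 40 in stratum "B"
frame = [("A", i) for i in range(60)] + [("B", i) for i in range(40)]
sample = stratified_sample(frame, stratum_of=lambda unit: unit[0], fraction=0.1, seed=5)
print(sample)
```

Note that the frame must carry the stratifying variable for each unit, otherwise the lists cannot be built.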

137
Stratified Sampling

Advantages
It produces more representative samples
Disadvantages
More complicated than SRS and systematic random
sampling
Sample frame must contain information on the
stratifying variable
Requires an unbiased sample frame

138
Multistage Cluster Sampling

Using this method a (final) sample is obtained by
drawing several different (intermediate) samples.
Procedure
Divide the population into broad groups (clusters)
Select an SRS of these groups
Obtain a sample frame for each of the selected
groups
Select an SRS within each of these groups, etc.
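
The procedure can be illustrated with two stages only: an SRS of clusters, then an SRS of units inside each chosen cluster. The district and household names, the cluster sizes and the function name are invented for the sketch; real designs may add further stages.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: SRS of clusters. Stage 2: SRS of units within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    # A sample frame is only needed for the clusters that were selected
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}


# Hypothetical population: 8 districts with 30 households each
clusters = {
    f"district_{d}": [f"d{d}_household_{h}" for h in range(30)]
    for d in range(8)
}
sample = two_stage_cluster_sample(clusters, n_clusters=3, n_per_cluster=5, seed=7)
for district, households in sample.items():
    print(district, households)
```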

139
Multistage Cluster Sampling

Advantages
Does not require an unbiased sample frame
Can be used to sample geographically diverse
populations
Disadvantages
Complex
Expensive

140
Steps in the Sampling Process

Define the population - Identify all the
elements about which the researcher wants
information
Determine Sample Frame - Identify the accessible
population
Select Sample - Systematically select a
representative smaller group from the sample frame.
Observe - Make observations of this smaller group
Generalize - Generalize the findings to the
larger population

141
Sample Size

Sample size is determined by two factors
The degree of accuracy that is required
The level of variation in the population across
the variables being studied

142
Sampling distributions

Up to this point, the discussion of
statistics has concerned those from a sample. In
our estimation of the various means and standard
deviations, it was assumed that these sample
statistics were similar to those in the
population. However, this assumption is not
necessarily true. To see why, we need to
understand some basic statistical terms and their
relevance to sampling distributions.

143
Sampling distributions

A population is any set of measurements, real or
hypothetical, that can be made of a random
variable.
A sample is a subset of these (actual)
measurements.
To summarize the characteristics of samples,
various numerical descriptive measures
called sample statistics were calculated from the
sample measurements and used as described earlier.
Since sample statistics are single numerical
values, they are also referred to as point
estimates of population parameters.

144
Sampling distributions

Population characteristics are also summarized
by similar numerical descriptive measures, called
population parameters. However, these are usually
unknown because it is too expensive to measure
every possible value of a random variable, or it
is impractical to do so for some other reason.
Consequently sample statistics are used as
estimates of population parameters because it is
cheaper to make measurements of a few of the
values that a random variable can take.

145
Sampling distributions

Specifically, sample statistics are used to make
inferences about population parameters, either
(1) to estimate the actual
population values or (2) to formulate a decision
about the value of a population parameter.
However, this is problematic because sample
statistics vary from sample to sample, depending
on the values of the random variable that are
sampled.

146
Sampling distributions

The solution is to create a probability
distribution (sampling distribution) of all the
values of the statistic produced from the various
samples that can be drawn from the population of
measurements of the random variable.
This probability distribution could then be used
to evaluate the reliability of the inferences
made about the population parameters. Of course,
we are assuming that each sample consists of
values typically found for the random variable; in
other words, that each sample is representative of
the population of values it is drawn from. The most
common mechanism used to ensure
representativeness is to select the values of
each sample by a random method.
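
A small simulation can make the idea of a sampling distribution concrete: draw many random samples from one population, compute the mean of each, and look at how those means are distributed. The uniform population, the sample size of 30 and the 2,000 repetitions are arbitrary choices for illustration, not values from the slides.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of 10,000 measurements of a random variable
population = [rng.uniform(0, 100) for _ in range(10_000)]

# Draw many random samples of size 30 and record each sample mean
sample_means = [
    statistics.mean(rng.sample(population, 30)) for _ in range(2_000)
]

pop_mean = statistics.mean(population)
mean_of_means = statistics.mean(sample_means)       # centre of the sampling distribution
spread_of_means = statistics.stdev(sample_means)    # its standard deviation
expected_se = statistics.pstdev(population) / 30 ** 0.5  # population sd / sqrt(n)

print(round(pop_mean, 2), round(mean_of_means, 2))
print(round(spread_of_means, 2), round(expected_se, 2))
```

The individual sample means differ from one another, yet their average should sit close to the population mean and their spread close to the population standard deviation divided by the square root of the sample size.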

147
Sampling distributions

Understanding sampling distributions is therefore
important because it facilitates the
comprehension of the process of statistical
inference in particular an intuitive appreciation
of the reliability of the inferences made using
sample statistics.

148
Sampling distributions

Properties of Sampling Distributions
The sampling distribution of the mean is normally
distributed when the population is normal, and
approximately so for large samples.
The mean of the sampling distribution of the mean is
equal to the population mean

149
Sampling distributions

If the mean of the sampling distribution is not
equal to the mean of the population then the
sample statistic is said to be biased.
The standard deviation (or standard error) of the
sampling distribution of the mean is
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$

150
Sampling distributions

The sampling distribution of the mean can be converted
to the standard normal distribution by using
$z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$, with $\bar{x}$ the sample mean,

151
Sampling distributions

where µ = the mean of the random variable
n = the sample size
σ = the standard deviation of the random variable
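
As a worked illustration, assuming the usual formulas (standard error = σ/√n and z = (sample mean − µ) / standard error), the conversion can be computed as below. The figures µ = 50, σ = 10, n = 25 and a sample mean of 52 are made up for the example.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean."""
    return sigma / math.sqrt(n)


def z_score(sample_mean, mu, sigma, n):
    """Number of standard errors the sample mean lies from the population mean."""
    return (sample_mean - mu) / standard_error(sigma, n)


# Hypothetical values: mu = 50, sigma = 10, n = 25, observed sample mean = 52
print(standard_error(10, 25))   # 10 / 5 = 2.0
print(z_score(52, 50, 10, 25))  # (52 - 50) / 2.0 = 1.0
```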

152
Sampling distributions

From the properties listed above, two theorems
have been derived. The first says that if a
random sample of n observations is selected from
a population of measurements of a random
variable that is normally distributed, the
sampling distribution of the mean produced from
this population will also be normally distributed.

153
Sampling distributions
154
Sampling

Sample size
Before sample size selection, the following must
be considered
Level of confidence - this reflects the risk of error
the researcher is willing to accept in the study
(for example, a 95% confidence level accepts a 5%
risk of error).
This in turn depends on
Time
Money/resources
Consequences associated with drawing incorrect
conclusions

155
Determining Sample Size

The most common levels of confidence used are 95%
and 99%.
Confidence interval - this is the level of
sampling accuracy the researcher will have.
The type of variable or variables being studied

156
Sample Size

Categorical variables require different sample
sizes than do metric level variables. Generally
categorical variables require smaller sample
sizes than metric variables.
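Additionally, it is easier to estimate sample
sizes for categorical variables, because calculating
the sample size for metric variables requires the
standard deviation of a previous sample.
For categorical variables
$n = z^2 pq / e^2$, where $e = z \sqrt{pq/n}$
(p = the estimated proportion, q = 1 - p, z = the
standard score for the chosen level of confidence,
e = the margin of error)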
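
A short sketch of the sample size calculation for a proportion, n = z²pq/e². The inputs are assumptions for the example: z = 1.96 for 95% confidence, p = 0.5 (the most conservative guess when no prior estimate exists) and a margin of error of 0.05. Rounding up is a common convention so that the margin of error is not exceeded.

```python
import math


def sample_size_proportion(z, p, e):
    """Sample size needed to estimate a proportion p within margin of error e."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / e ** 2)  # round up to a whole respondent


# Hypothetical inputs: 95% confidence (z = 1.96), p = 0.5, e = 0.05
print(sample_size_proportion(1.96, 0.5, 0.05))  # 384.16 rounds up to 385
```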

157
Sample Size

This standard deviation is usually obtained from
one of two sources: either from previous research
literature using the same population or from the
sample used in the pre-test of the instrument.
$n = (zs/e)^2$
where $e = z (s / \sqrt{n})$
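
For a metric variable, the corresponding calculation n = (zs/e)² can be sketched in the same way. The standard deviation of 15 (imagined as coming from a pre-test), the margin of error of 3 and z = 1.96 are hypothetical values.

```python
import math


def sample_size_mean(z, s, e):
    """Sample size needed to estimate a mean within margin of error e."""
    return math.ceil((z * s / e) ** 2)  # round up to a whole respondent


# Hypothetical inputs: 95% confidence (z = 1.96), s = 15, e = 3
print(sample_size_mean(1.96, 15, 3))  # (9.8)^2 = 96.04 rounds up to 97
```

Halving the margin of error roughly quadruples the required sample, since e enters the formula squared.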

158
Sample Size

The size of the population
If the population has fewer than 100,000 elements
(or if n/N > 0.05), it is considered to be a finite
population and the finite population correction
factor must be applied to the calculation of the
standard error. The resulting samples are smaller
than those selected from larger populations.
Finite population correction factor
$\sqrt{(N - n) / (N - 1)}$
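
The effect of the correction factor √((N − n)/(N − 1)) on the standard error can be sketched as follows. The population of 1,000, the sample of 100 and the standard deviation of 20 are invented for the example, and the function names are illustrative.

```python
import math


def finite_population_correction(N, n):
    """Correction factor for a sample of n drawn from a population of N."""
    return math.sqrt((N - n) / (N - 1))


def corrected_standard_error(s, n, N):
    """Standard error of the mean multiplied by the correction factor."""
    return (s / math.sqrt(n)) * finite_population_correction(N, n)


# Hypothetical case: N = 1,000, n = 100 (n/N = 0.10), s = 20
fpc = finite_population_correction(1000, 100)
print(round(fpc, 4))                                    # about 0.9492
print(20 / math.sqrt(100))                              # uncorrected standard error
print(round(corrected_standard_error(20, 100, 1000), 4))
```

Because the factor is below 1 whenever n is greater than 1, the corrected standard error is smaller, which is why a smaller sample achieves the same accuracy in a finite population.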

159
Exercise

An alumni association wants to estimate the mean
debt of this year's college graduates. It is known
that the population standard deviation of the
debts of this year's college graduates is 11,800.
How large a sample should be selected so that the
estimate with a 99% confidence level is within
800 of the population mean?
A political party wants to estimate the
proportion of voters who disapprove of a
candidate they have put to run in a particular
constituency. The party wants the estimate to be