Title: Logic in Scientific Reasoning
1Logic in Scientific Reasoning
- Definition of Science
- Broadly speaking, science can be defined as a
systematic study of nature and the rules which
govern nature (and human behaviour or phenomena),
to identify general statements of fact through
which we can understand and interpret
information.
2Logic in Scientific Reasoning
- Definition of Science
- The word science derives from the Latin word scientia, which means knowing.
- Science here does not refer to particular subject areas, but to an orientation or way of looking at an area: a systematic approach in the pursuit of knowledge.
- Basically, any subject matter can be studied as a science; the essential point is the method used.
3Scientific Method
- The method distinguishes scientific from non-scientific enterprise.
- Here, one uses the tools of logic to assess whether a systematic and logical sequence has been followed in arriving at truth claims.
- The scientific method can be described as the logic of science, i.e., the principles that can be used to arrive at explanations of facts and so gain knowledge of the world. It is a systematic approach used in the attempt to gain knowledge.
4Scientific Method
- The method combines both aspects of logic, i.e., the inductive process and the deductive process.
- The process is cyclical, with the method beginning and ending with what can be observed or data that can be gathered through means such as experiments, that is, through the use of induction.
5Steps of the Scientific Method
- Identify a problem (remember that problem here means that which requires explanation, not the negative sense the word generally has). Nothing can be done before a problem is identified.
- Formulate a hypothesis. This can't be done without having collected some data that could be used to try to understand the events and so formulate a means of explanation.
6Steps of the Scientific Method
- Collect additional data. This generally involves preparing a research design to identify the sources of data and how that data will be collected. The problem at this stage is identifying what counts as relevant information/data. Some data may be relevant to a hypothesis that you have not actually formulated, but not relevant to the one that you are presently using.
7Steps of the Scientific Method
- Test the hypothesis. How effectively does the hypothesis explain the facts that it was formulated to explain, as well as other facts? Additionally, how well can it predict future occurrences?
- Draw conclusions as to the efficiency and effectiveness of the theory in explaining and predicting events, and how the theory can be used to achieve some human goals (since that is often what is being aimed at, or what results from scientific exploration).
8Laws of Nature
- The scientific method should yield a body of general statements of fact through which we can understand and interpret information.
- These general statements of fact that we formulate are actually universal statements, because they should apply to all instances covered in the statement.
9Laws of Nature
- However, they are only provisional, in that there is always the possibility of finding evidence to disprove the statement.
- These statements are also empirically based, in that they depend on data gathered from observation or experiment.
10Laws of Nature
- These general statements are called laws of nature.
- Laws of nature are derived from hypotheses that are formulated to understand an event's occurrence.
- Before we can have natural laws, we therefore need to have good hypotheses.
11Evaluating Hypotheses
- Relevance: the hypothesis must not stray from the phenomenon which needs to be explained.
- Testability: it must always be possible to test whether the hypothesis is plausible by applying it to the data that you are trying to understand. Additionally, the test being done must be relevant to the hypothesis being tested.
12Evaluating Hypotheses
- Predictive and explanatory power: the distinction between the two is generally seen as temporal, with explanation referring to events that have already occurred and prediction to those that have not yet occurred. At the same time, prediction presupposes that an explanation has already been given.
13Evaluating Hypotheses
- Compatibility with experimental laws/theories: if two new hypotheses are offered, we want to accept the one that accords more with what has already been established. The idea here is that there is a constant accumulation of knowledge.
- Simplicity: you may find two theories that both function effectively; when this happens, use the simpler of the two. The simpler the explanation, the fewer factors you need to take into account to give an account of the event.
14Social Surveys
- Research projects which use a questionnaire to collect standardized data from a large number of individuals.
- Can be either population or sample surveys. Sample surveys are the most common.
- The collection of standardized data requires that the same questions be given to all respondents in the same order.
15Types of Surveys
- Factual Surveys: used to collect descriptive information. Examples: the Population Census, the Survey of Living Conditions and the Labour Force Survey.
- Attitude Surveys: carried out by opinion poll organizations, market researchers, etc.
- Explanatory Surveys: used to test hypotheses or to test and develop theories.
- Common to all types is the use of the questionnaire as the instrument of data collection.
16Survey Design
17The term Research Design can refer to
- the planning of scientific inquiry
- the design strategy for finding out something
- the arrangement of the conditions of observations
18All designs require
- a precise determination of what you want to find out
- the detailed specification of the most appropriate and effective way of doing so
19Factors determining Design
- a. The purpose of the Study
- b. The Time Dimension
- d. Approach to the collection and handling of
the data
20The purpose of the Study
- A particular research project can serve any one or a combination of the following purposes:
- Exploration
- Description
- Explanation
21Classification According to Time
- The time dimension speaks to the number of times participants will be observed in relation to a particular study.
- There are two approaches:
- Cross-sectional: researchers take a snapshot, a one-time effort in gathering data.
- Longitudinal: permits the researcher to observe the phenomenon more than once.
22Correlational Design
- The standard research design used in surveys is the correlational design.
- In this design, constructs are measured independently of each other and then tested for associations.
- Extraneous variables are controlled by including them in the study.
23Causality
- The Demonstration of Cause and Effect
24Causation
- Mill argued that every event that occurs has a cause; this is called the Principle of Universal Causation.
- But how do we identify causal relationships? If we see two events occurring together consistently, what right have we to assume that one is the cause of the other?
25Explanation
- Explanation implies delineating the causal links between and among variables.
- To explain therefore involves asking two types of questions about any phenomenon, i.e. how? and/or why?
- How questions are more easily answered than why questions because of the nature of the answers required.
- How questions can usually be answered using chronological accounts, i.e. the sequence of events.
26Explanation (cont'd)
- Answering how questions allows us to decompose the structure of relationships, i.e. to simplify, e.g. traditional voting patterns.
- Why questions usually require introspection, rationalization and motivational answers.
- Answers to why questions exist outside of both variables.
- Although one type of question is easier to answer than the other, both are necessary for adequate explanation.
27Causality Assessment Criteria
- There are four general criteria
- 1. association
- 2. time priority
- 3. non-spuriousness
- 4. rationale
- Interpretivists also rely on how they think things should be ordered.
28Causality Assessment Criteria: Association
- For a causal relationship to exist there must be co-variation.
- Although association is necessary for causality, the strength of the association does not increase or decrease causality, e.g. smoking and lung cancer.
- However, if there is consistency, i.e. if the association is present across different samples and contexts (established through replication), there is a greater chance that the relationship is causal.
- This might be confounded by conditional relationships.
29Causality Assessment Criteria: Time Priority
- Cause must always precede effect.
- This is not clear-cut in all cases. It is difficult at times to determine which variable should precede the other, e.g. social class and educational attainment. How do we decide?
30Causality Assessment Criteria: Non-spuriousness
- For two variables to be causally related, the co-variation must not be the function of a third variable, i.e. a variable that is related to both variables, e.g. age and religiosity.
31Causality Assessment Criteria: Rationale
- Causality can only be established if theory is supported by empirical evidence. If there is any disconnect, causality is questionable.
32Causation: Meaning of Cause
- Necessary Condition: can be stated in various ways.
- It has to be present for the event to occur.
- If it is absent, the event will not take place.
- If the phenomenon (effect) is present, the condition (cause) is present.
- A is a necessary condition for B only if whenever B is present, A is present.
33Causation: Meaning of Cause
- Necessary Condition
- Several necessary conditions will usually be needed for a phenomenon to occur.
- Example: Plants will only grow if they are exposed to light.
34Causation: Meaning of Cause
- Sufficient Condition: can be stated in various ways.
- F is sufficient for H if whenever F is present, H is present.
- If the condition is present, the phenomenon is present.
- Once the condition is present, the event will occur.
35Causation: Meaning of Cause
- Sufficient Condition
- Example: Heavy rainfall is a sufficient condition for the road being wet, but we can conceive of other ways in which the road could have become wet without rain falling.
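The two definitions above can be checked mechanically against a set of observations. The following is only a minimal sketch: the observation records and the names is_necessary and is_sufficient are invented for illustration, and a small sample can at best fail to contradict a claim of necessity or sufficiency, never prove it.

```python
# Each observation records which conditions were present and whether the
# phenomenon (plant growth) occurred. The data are invented for illustration.
observations = [
    {"conditions": {"light", "water"}, "occurred": True},
    {"conditions": {"light", "water", "fertilizer"}, "occurred": True},
    {"conditions": {"water"}, "occurred": False},
    {"conditions": {"light"}, "occurred": False},
]

def is_necessary(condition, observations):
    """True if the condition is present whenever the phenomenon occurred."""
    return all(condition in o["conditions"] for o in observations if o["occurred"])

def is_sufficient(condition, observations):
    """True if the phenomenon occurred whenever the condition was present."""
    return all(o["occurred"] for o in observations if condition in o["conditions"])

print(is_necessary("light", observations))   # True: no growth without light
print(is_sufficient("light", observations))  # False: light alone did not suffice
```

In this invented sample, light behaves as a necessary but not a sufficient condition, which matches the plant example given earlier.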
36Causation: Meaning of Cause
- Remote vs. proximate cause
- Remote cause: the more distant conditions that resulted in the event's occurrence.
- Proximate cause: the conditions that are in place just before the event's occurrence.
- Remote and proximate causes will be linked by intermediate causes in what can be labelled a causal chain.
37Causal Claims/Statements
- These may be singular or general (we are usually more interested in the general).
- Singular:
- The slippery road surface caused the accident today.
- The tsunami in Asia on December 26, 2004 claimed 500 000 lives.
- General:
- Absence makes the heart grow fonder.
- Exercising reduces the risk of a heart attack.
- Dirty campaigning wins elections.
38Identification of Causal Relationships
- Universality of causation: if an event resulted from certain conditions being in place, we expect that this will happen again in the future if the same conditions are in place.
- This principle is the underlying basis through which we can develop general causal statements.
39Identification of Causal Relationships
- It is never possible to see causal relationships directly; we have to infer these relationships from observations.
- Conclusions drawn from observations are potentially flawed, as correlation does not imply causation.
- Correlation: looking at the rate at which a property is seen among two populations/groups of events.
- The identification of a causal relationship generally starts with a correlation, but goes further by identifying a set relationship between the two, where the presence of one results in the presence of the other, or the absence of one results in the absence of the other.
40Common Mistakes In Assigning Causal Relationships
- May identify two events that are correlated, but may reverse the causal factors.
- May identify a correlation between two events, but both events are the result of a common cause.
- May have correlations that are purely accidental.
- Note: These errors will result in the Fallacy of False Cause.
41Identification of Causal Relationships
- Once a correlation has been identified, there needs to be a further exploration as to whether the events in question are in fact causally linked and, if they are, which is the cause and which is the effect.
- This is usually the most difficult to achieve, because one needs to have strong support for the causal claim being made.
42Identification of Causal Relationships
- Several methods have been developed to ensure that when a causal statement is developed, it will actually hold true in most, if not all, cases.
- John Stuart Mill developed five such methods, called Mill's Methods of Experimental Inquiry, namely:
- Method of Agreement
- Method of Difference
- Joint Method of Agreement and Difference
- Method of Residues
- Method of Concomitant Variation
43Mill's Methods: Method of Agreement
- Looking for instances where two events consistently occur together, one being an antecedent to the other; the antecedent condition is identified as the cause.
- Example: In the case of chicken pox, you may notice that each time someone has chicken pox and another person touches her, that person also gets chicken pox. When you have seen this on several occasions, you can develop a general causal statement.
44Mill's Methods: Method of Agreement
- Each occurrence of this event under the conditions laid out would provide confirmation for the causal statement. In a case where you already have the cause, you are identifying a sufficient condition.
- If you already have the effect, you are trying to identify the one condition that is present in all cases where the phenomenon occurs, and so are looking for the necessary condition.
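Searching for the one condition present in all cases where the phenomenon occurs amounts to intersecting the sets of antecedent conditions. The sketch below assumes each case can be written as a set of condition labels; the function name method_of_agreement and the case data are hypothetical.

```python
def method_of_agreement(cases):
    """Return the conditions common to every case in which the effect occurred."""
    positive = [set(c["conditions"]) for c in cases if c["effect"]]
    if not positive:
        return set()
    return set.intersection(*positive)

# Invented observations of children who all caught chicken pox.
cases = [
    {"conditions": {"touched patient", "shared toys", "same classroom"}, "effect": True},
    {"conditions": {"touched patient", "same classroom"}, "effect": True},
    {"conditions": {"touched patient", "shared toys"}, "effect": True},
]

print(method_of_agreement(cases))  # {'touched patient'}
```

If the intersection contains more than one condition, the method has not isolated a single candidate cause.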
45Mill's Methods: Method of Agreement
- Limitation
- May not be able to narrow it down to one differentiating but common condition for all instances of the phenomenon.
- Therefore, another method may be needed to supplement it.
46Mill's Methods: Method of Difference
- You are looking at the conditions that are present when an event occurs and when it does not occur.
- If there is one condition that is present when the event occurs and is absent when it does not occur, that condition can be seen as the cause of the event.
- This is the method that is generally used in the controlled experiment. In trying to control every other variable, we want to see what will differ in the absence of one variable.
47Mill's Methods: Method of Difference
- Example: For the chicken pox, you may notice that three children are playing with someone who has chicken pox. You watch carefully: no child touches the child that has chicken pox, and no child gets the virus, so in the absence of touching, the virus is not spread. So touching is the cause of the virus transmission.
- A necessary condition, or the basis for one, is what is usually identified.
48Mill's Methods: Method of Difference
- Limitations
- We can't control every other variable under normal conditions. Even a minor factor could change, which would make the use of this method problematic.
- Very often, an event occurs because various factors combine. More than one condition can then be identified as the cause, when all are actually required.
49Mill's Methods: Joint Method of Agreement and Difference
- If you can find both instances when the condition and the phenomenon are present and instances when both are absent, there are stronger grounds for making a causal connection than when using Agreement or Difference alone. You are therefore using both methods at the same time.
50Mill's Methods: Joint Method of Agreement and Difference
- For the chicken pox, you may notice that you have
three children playing with someone who has
chicken pox. You watch carefully and two of the
children touch the child that has chicken pox,
and they get the virus. One child does not touch
the infected child and does not get chicken pox.
You can therefore conclude that chicken pox is
transmitted through touching the infected person.
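The joint method can be sketched as a two-step filter: keep the conditions shared by every case where the phenomenon occurred, then discard any that also appear where it did not occur. The function name and the encoding of the three children as sets of condition labels are assumptions made for illustration.

```python
def joint_method(cases):
    """Conditions present in every positive case and absent from every negative case."""
    positive = [set(c["conditions"]) for c in cases if c["effect"]]
    negative = [set(c["conditions"]) for c in cases if not c["effect"]]
    if not positive:
        return set()
    common_when_present = set.intersection(*positive)
    seen_when_absent = set().union(*negative)
    return common_when_present - seen_when_absent

# The three children from the example: two touched the infected child and
# fell ill, one did not touch and stayed well.
cases = [
    {"conditions": {"played together", "touched"}, "effect": True},
    {"conditions": {"played together", "touched"}, "effect": True},
    {"conditions": {"played together"}, "effect": False},
]

print(joint_method(cases))  # {'touched'}
```

Playing together is ruled out because it was also present for the child who stayed well; only touching survives both filters.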
51Mill's Methods: Method of Residues
- There is a complex mix of events occurring.
- You already know the cause of some of the phenomena that are being exhibited. If you remove those causes and their phenomena, then you will have left some conditions and some effects.
- A, B and C occur together with g, h and i.
- A has previously been identified as the cause of g.
- B has previously been identified as the cause of h.
- Therefore C is the cause of i.
52Mill's Methods: Method of Residues
- Example: You eat lunch, after which you start to feel ill. You have indigestion, vomiting and a rash. You had a tuna salad (which has mayonnaise), coconut water and a banana. You know from past experience that mayonnaise will give you a rash if it is combined with black pepper, and that coconut water can cause indigestion. The banana is then the probable cause of the vomiting.
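The subtraction step in the Method of Residues can be written out directly. This is only a sketch under the assumption that antecedents and effects can be listed as labels and that the already-established links are given as a mapping; the names used here are hypothetical.

```python
def method_of_residues(antecedents, effects, known_causes):
    """Remove already-explained cause/effect pairs and return what is left over."""
    remaining_antecedents = set(antecedents) - set(known_causes)
    remaining_effects = set(effects) - set(known_causes.values())
    return remaining_antecedents, remaining_effects

antecedents = {"tuna salad with mayonnaise", "coconut water", "banana"}
effects = {"rash", "indigestion", "vomiting"}
# Links assumed to be established from earlier inductions.
known_causes = {
    "tuna salad with mayonnaise": "rash",
    "coconut water": "indigestion",
}

print(method_of_residues(antecedents, effects, known_causes))
# ({'banana'}, {'vomiting'})
```

The conclusion is only as strong as the known links: if one of them is wrong, the residue is wrong too.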
53Mill's Methods: Method of Residues
- Limitations
- May not be able to identify all the probable causes of the event in question.
- You would have to go to another source to identify which of the possible causes that you've identified is the one that gave you the rash.
- The causal links that are used to exclude the other occurrences from the equation have to be well established from other inductions. How do you do this?
54Mill's Methods: Method of Concomitant Variation
- If you see a consistent variation between two phenomena, one increasing while the other decreases OR both increasing at the same time OR both decreasing at the same time, then you have a basis for arguing that there is a causal relationship between the two events.
- Example: Smoking causes lung cancer.
55Mill's Methods: Method of Concomitant Variation
- Limitations
- Concomitant variation may be seen in situations where there is actually no causal connection between the two events.
- We can't simply use two observations or instances; we need to look at several cases.
- When one mistakes correlation for cause, it is generally because of the misuse of this method.
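One common way to quantify concomitant variation is Pearson's correlation coefficient. That statistic is not named in these notes, so treat the sketch below as one possible illustration; the data are invented and the minimum of three cases is an arbitrary choice reflecting the warning against relying on two observations.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a variable that never varies cannot co-vary")
    return cov / sqrt(var_x * var_y)

dose = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases with dose
falling = [10, 8, 6, 4, 2]   # decreases as dose increases

print(pearson_r(dose, rising))   # 1.0
print(pearson_r(dose, falling))  # -1.0
```

A coefficient near +1 or -1 shows consistent variation only. As the limitations above note, it does not by itself establish a causal connection.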
56Mill's Methods: Concluding Remarks
- Mill set out to try to show that we can arrive at certainty using induction by formulating his five methods. However, this cannot be achieved, because certainty, in the strictest sense, is not possible in induction.
- He also searched in each of his methods to identify one cause, but many events have several factors which work together to lead to the occurrence of an event.
57Mill's Methods: Concluding Remarks
- Mill's Methods can best be seen as a means for testing hypotheses that we have formulated to explain particular phenomena we have observed or certain questions we want answered. This is because we will be limited in terms of the number of possible instances that we can identify to explain a particular occurrence. You have already set limits once you start to use Mill's Methods; they cannot be used in a vacuum.
58MEASUREMENT
59What is measurement?
- The process of assigning numbers or labels to units of analysis in order to represent conceptual or variable categories.
- Scientific norms require that we fully describe our methods and procedures so that others can repeat our observations and judge the quality of our measurements.
60Steps in the measurement process
- 1. Conceptualization
- 2. Operationalization
- 3. Specification of levels of measurement
- 4. Test for reliability and validity
611. Conceptualization
- Creating factual or constitutive definitions of concepts.
- Define concepts in relation to other concepts, using ordinary language.
622. Operationalization
- Describing concepts in the language of measurement.
- Creating measurable definitions of concepts.
633. Levels of Measurement
- A set of rules used in the labeling or quantifying of variables.
- There are 4 levels, each assuming a different interpretation of the numbers or labels assigned to the variables:
- Nominal Level
- Ordinal Level
- Interval Level
- Ratio Level
64Nominal variable
- The term nominal means to name.
- Measurement simply involves attaching names or labels to the variables.
- Observations are merely classified into categories.
- Categories indicate whether things are the same or are different.
- Numbers or labels are assigned to categories as codes to facilitate data collection and analysis.
- The only mathematical relationship that can be assumed is that of equivalence.
65Nominal (cont.)
- Cases placed in a given category must all be the same.
- Also, the categories must be:
- exhaustive: there are sufficient categories so that all the items can fit into one of the categories
- mutually exclusive: the items being classified must not fit in more than one category
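Both requirements can be checked by confirming that every value falls into exactly one category. The sketch below is illustrative only: the age bands are deliberately flawed, and the helper name check_categories and the representation of a category as a test function are assumptions.

```python
def check_categories(categories, values):
    """Return the values that do not fall into exactly one category.

    categories maps a label to a function that says whether a value belongs.
    An empty match list signals a gap (not exhaustive); more than one match
    signals an overlap (not mutually exclusive).
    """
    problems = {}
    for value in values:
        matches = [label for label, fits in categories.items() if fits(value)]
        if len(matches) != 1:
            problems[value] = matches
    return problems

# Deliberately flawed age bands: 30 is in both, under-18s and over-50s in none.
age_bands = {
    "18-30": lambda age: 18 <= age <= 30,
    "30-50": lambda age: 30 <= age <= 50,
}

print(check_categories(age_bands, [17, 25, 30, 51]))
# {17: [], 30: ['18-30', '30-50'], 51: []}
```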
66Ordinal
- Variables classified as ordinal also have the characteristics of mutually exclusive and exhaustive categories.
- In addition, this level of measurement has the feature of logical ordering of the attributes of the variables.
- In other words, you can order or rank the variables under consideration.
67Interval level
- This level has the qualities of nominal and ordinal measurements.
- In addition, there are equal distances (intervals) between the categories.
- Example: The difference between 20°C and 30°C is the same as the difference between 90°C and 100°C (i.e. 10°C).
- We can infer not only that 100°C is hotter than 90°C, but also by how much, because of a standard unit of measurement.
68RATIO LEVEL
- This level includes all the features of the other levels of measurement.
- In addition, there is an absolute zero point.
- Hence, values can be multiplied and divided. For example, incomes measured in dollars can be divided one into another to form a ratio.
- Zero means absolutely nothing at this level of measurement.
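A short calculation may help show why ratios are meaningful only when there is an absolute zero. The Kelvin conversion is brought in from outside these notes as an illustration: Celsius has an arbitrary zero, so the apparent ratio between two temperatures changes when the same temperatures are expressed on a scale with a true zero, whereas an income ratio does not depend on such a choice.

```python
def celsius_to_kelvin(celsius):
    """Convert to the Kelvin scale, whose zero is an absolute zero."""
    return celsius + 273.15

# Interval level: 40°C looks like "twice" 20°C, but only because of where
# the Celsius zero happens to sit.
ratio_celsius = 40 / 20
ratio_kelvin = celsius_to_kelvin(40) / celsius_to_kelvin(20)

# Ratio level: an income of 40000 really is twice an income of 20000.
ratio_income = 40000 / 20000

print(ratio_celsius)           # 2.0
print(round(ratio_kelvin, 3))  # 1.068
print(ratio_income)            # 2.0
```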
69Reliability and Validity
70Reliability
- Reliability is concerned with issues of stability and consistency.
- It is the extent to which a measuring instrument produces the same result on repeated applications under similar conditions.
- When repeated measures of the same thing give identical or very similar results, the measurement instrument is said to be reliable.
71Reliability Assessment Techniques
- 1. The Test-Retest Method
- 2. The Alternate-Form Method
- 3. The Split-Half Method
- 4. The Established Measures Method
- 5. Inter-coder/Research Workers Reliability
721. The Test-Retest Method
- This is the simplest approach to assessing reliability.
- It involves testing/measuring the same persons or units on two separate occasions and then checking for statistical correlation between the two sets of scores.
73Procedures
- Test: Administer the same test to some individuals on more than one occasion.
- Compare: Scores of each individual on the first testing are related to scores on the second testing to provide a reliability coefficient.
- Results: The coefficient can vary from 0 (zero), indicating no relationship between the sets of scores, to 1 (one), indicating a perfect relationship.
- Interpret: A high coefficient close to 1 is desirable, since it is an indication of a strong relation between the scores, or an indication that the instrument is, indeed, measuring stable/enduring characteristics.
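The compare step can be sketched by correlating first-occasion and second-occasion scores. These notes do not say which statistic yields the reliability coefficient; Pearson's r is assumed here as one common choice, and the scores are invented. Pearson's r can in principle be negative, although a usable instrument is expected to fall between 0 and 1.

```python
from math import sqrt

def reliability_coefficient(first, second):
    """Pearson correlation between scores from two administrations of a test."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need paired scores for at least two people")
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    return cov / sqrt(var_1 * var_2)

# Invented scores for five people tested on two occasions.
first_testing = [12, 15, 9, 20, 17]
second_testing = [13, 14, 10, 19, 18]

r = reliability_coefficient(first_testing, second_testing)
print(round(r, 2))  # close to 1, suggesting stable scores
```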
74Advantages and Disadvantages
- Advantages
- Requires only one form of a test.
- Provides information as to test consistency over time.
- Disadvantages
- Affected by practice and memory.
- Influenced by events that might occur between testing sessions.
- Requires administering the test twice.
752. The Alternate-Form Method
- This approach reduces the likelihood that practice and memory will inflate the reliability coefficient.
- Involves the use of two tests, the second being a parallel form of the first.
76Procedures
- Test: Administer alternate forms of a test to the same people.
- Compare: Compute the relationship between each person's scores on the two forms.
- Result: as above.
- Interpret: as above.
- It must be noted that this approach requires two forms of a test which parallel one another in content and in the mental operations required. In addition, items on one form must match items on the other form, with corresponding items measuring the same quality or characteristic.
77The Split-Half Method
- A quick way of determining internal consistency.
- Procedures:
- 1. Split the test into two halves (odd versus even items, or by random selection).
- 2. Administer to two groups.
- 3. Relate the scores of both groups.
- This approach determines whether each half of the test is measuring the same quality or characteristic.
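The sketch below illustrates an odd versus even split. It assumes a frequently used variant in which every respondent answers all items and that respondent's two half-scores are correlated, which may differ from the two-group procedure listed above. The responses are invented and Pearson's r is assumed as the measure of relationship.

```python
from math import sqrt

def split_half_scores(responses):
    """Sum each respondent's odd-numbered and even-numbered items separately."""
    odd = [sum(items[0::2]) for items in responses]   # items 1, 3, 5, ...
    even = [sum(items[1::2]) for items in responses]  # items 2, 4, 6, ...
    return odd, even

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented answers of five respondents to a six-item scale (1 to 5 each).
responses = [
    [4, 5, 4, 5, 3, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 1, 2, 1],
]

odd_scores, even_scores = split_half_scores(responses)
print(odd_scores, even_scores)  # [11, 5, 10, 15, 4] [14, 5, 9, 14, 4]
print(round(pearson_r(odd_scores, even_scores), 2))
```

A high correlation between the halves suggests that both are measuring the same quality or characteristic.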
78The Established Measures Method
- Use a standardized test/scale to measure the quality or characteristics of interest.
- These are instruments for which reliability has already been established.
79Inter-coder/Research Workers Reliability
- Assesses the extent to which different interviewers, observers or coders, using the same instrument, get equivalent results.
- Involves the independent assessment and comparison of selected interviewers, coders or observers to determine consistency in judgments.
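The simplest comparison of two coders is the proportion of items on which they agree. Percent agreement is used here as an assumed, basic measure; it is not prescribed by these notes and it does not correct for agreement that would occur by chance. The codes below are invented.

```python
def percent_agreement(coder_a, coder_b):
    """Share of items to which two coders independently gave the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same, non-empty set of items")
    matches = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    return matches / len(coder_a)

# Invented codes assigned by two coders to the same five open-ended answers.
coder_a = ["positive", "negative", "negative", "positive", "neutral"]
coder_b = ["positive", "negative", "positive", "positive", "neutral"]

print(percent_agreement(coder_a, coder_b))  # 0.8
```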
80Factors Contributing to Unreliability of a Test
- Familiarity with the particular test
- Fatigue
- Stress
- Physical conditions of the room in which the test is given
- Health of the test taker
- Fluctuation of human memory
- Amount of practice or experience the test taker has with the specific skill being measured
- Specific knowledge that has been gained outside of the experience being evaluated by the test
- A test that is overly sensitive to the above items is not reliable.
81Validity
- The extent to which a measuring instrument measures what it purports to measure.
- The truthfulness or accuracy of a measure.
- Types:
- Criterion-related validity
- Construct validity
- Content validity
82Criterion-related validity
- Check the instrument for its predictive power.
- Relate performance on the test to some actual behaviour it is supposed to indicate.
- Example: Driver's test. Relate performance on the written exam to use of the road: use of turn signals, observation of signs, etc.
- That is, relate the test to a performance criterion.
83Construct Validity
- Established by relating a presumed measure of a construct or hypothetical quality to some behaviour or behavioural manifestation it is assumed to indicate.
- Relate performance on the test to some construct that the test score is assumed to indicate.
- Example: Self-esteem scale. A person who scores high on the test is assumed to have high self-esteem. Is that person extroverted, gregarious, highly motivated, etc.?
84Content validity
- To what extent do the items on a test adequately represent all facets of the concept being measured?
- Determined by ensuring that the sample set (of items on the test) is representative of the actual set.
- A test in Social Research should cover all areas on the course outline to be content valid.
85Dimensioning
86Quality of life
- Quality of Life: There are many components to well-being. A large part is standard of living, that is, the amount of money and access to goods and services that a person has (easily measured). Additionally, the concept refers to freedom, happiness, and satisfaction with life, which are far harder to measure and could be more important.
87Quality of life
- The concept of quality of life incorporates two major dimensions:
- Objective Living Conditions: This dimension concerns the ascertainable living circumstances of individuals, such as working conditions, state of health or standard of living.
- Subjective Well-Being: This dimension covers perceptions, evaluations and appreciation of life and living conditions by the individual citizens. Examples are measures of satisfaction or happiness.
88Questionnaire Design
89The Questionnaire
- A questionnaire is a collection of questions and/or statements that is designed to collect information on a particular topic.
- It is an instrument used by researchers to convert information given directly by respondents into data.
- In essence, it provides access to what is inside the person's head.
90- The questionnaire facilitates the measurement of what a person:
- knows (knowledge, information)
- likes and dislikes (values, preferences)
- thinks (attitudes, beliefs)
- experiences (past and present)
- It is a useful alternative when direct observation is not possible.
91- This approach to data collection requires that the respondent:
- co-operates in the completion of the questionnaire
- tells what is, instead of what he/she thinks ought to be, or what he/she imagines the researcher would like to hear
- knows how he/she feels or thinks in order to report it
- It is possible, therefore, for the questionnaire to measure not necessarily what a person likes, believes or thinks, but what he/she indicates in these regards.
92- The researcher must, therefore, pay attention to the following factors:
- The respondent may not be able to provide answers to the questions posed, e.g. out of ignorance.
- Respondent bias
- Acquiescence: the tendency to agree with statements regardless of their content.
- Social desirability: respondents give the socially or culturally correct answer rather than what they actually believe or feel.
- Practice effects
93Questionnaire construction
- The structure of any questionnaire is determined by:
- Theoretical considerations
- Method of data analysis
- A questionnaire for a correlational or explanatory survey typically has the following types of items:
- Measures for the dependent variable(s)
- Measures for the independent variable(s)
- Background measures
94Questionnaire construction (cont'd)
- Question content
- There are five types of question content:
- Behaviour: what people do
- Belief: what people think is true
- Knowledge: items that measure respondents' knowledge of knowable facts or the accuracy of their beliefs
- Attitude: what people desire or find desirable
- Attributes: characteristics of people
- Each type of question is specific to the characteristics that it is supposed to measure.
95Questionnaire construction (cont'd)
- Principles of item design
- All items must achieve the following:
- Reliability
- Validity
- Adequate Discrimination
- High response rates
- Consistent interpretation across respondents
- Relevance to the overall research endeavour
96Questionnaire construction (cont'd)
- Wording items
- In wording items it is important to consider the following:
- Language simplicity
- Length of the item
- Avoiding double-barrelled questions
- Avoiding leading items
- Avoiding negatively worded items
- Avoiding ambiguous items
- Avoiding prestige bias
- Avoiding words (qualifiers) that will influence responses
- The level of precision required to answer the item
- The level of precision of the item and the knowledge required
- Time and space requirements
- The use of personal or impersonal wording
97Types of Questions
- Direct versus indirect (specific vs. non-specific)
- a. Do you like your job? - direct (specific)
- b. How do you feel about your job? - indirect (non-specific)
- a. How do you feel about teacher A? - direct (specific)
- b. How do you feel about the class taught by teacher A? - indirect (non-specific)
- Direct or specific questions may cause the respondent to become guarded or cautious and give less than honest answers. Non-specific ones lead to the desired information with less alarm.
98Types of Questions (cont'd)
- Questions versus Statements: An item can be a direct question of the types above (requiring a direct answer) or a statement requiring an optional response.
- Predetermined versus Response-Keyed Questions: Answer all vs. answer those that are relevant.
99- 5. Do you drink alcoholic beverages?
  1. Never  2. Occasionally  3. Frequently  4. Always
- (If never, go to 6 and then terminate. Otherwise, skip to 7 and continue.)
- 6. Why don't you drink alcoholic beverages?
  1. Religious reasons  2. Health reasons  3. Others (Specify) ______
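The routing instruction in this response-keyed example can be sketched as a small function. The function name and the last_question parameter are hypothetical, since the excerpt does not show how many questions follow question 7.

```python
def questions_after_q5(answer, last_question=10):
    """Return the question numbers a respondent should see after question 5.

    last_question is a hypothetical value; the real questionnaire length
    is not given.
    """
    if answer == "Never":
        return [6]  # answer question 6, then terminate
    return list(range(7, last_question + 1))  # skip 6 and continue from 7

print(questions_after_q5("Never"))         # [6]
print(questions_after_q5("Occasionally"))  # [7, 8, 9, 10]
```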
100RESPONSE MODES
101Structured Response (Closed-ended)
- Provide the respondent with possible answers and ask him/her to choose the most appropriate option.
- When the closed-ended format is used, the researcher should be guided by the following:
- Response categories provided should be exhaustive.
- Response options should be mutually exclusive.
- There should be a clear instruction to select the best answer.
- This format is respondent friendly and facilitates greater ease in the processing of data, since it can be transferred directly to computer. It, however, limits the possible answers to those thought of by the researcher.
102Structured Response (Close-ended)
- For closed-ended items the responses must satisfy the following requirements:
- Exhaustiveness
- Exclusiveness
- Balanced categories
103 Unstructured Response (Open-ended)
- Researchers ask questions and allow respondents to provide answers
- Exert control only in regard to the questions asked and the time and space provided.
- Respondents give their own answers, rather than just agreeing with those given.
- Format offers the respondent more flexibility
104Disadvantages of Open-ended Format
- Responses must be coded before processing - The coding process can be time consuming and can be quite technical. It requires the researcher to accurately interpret the meaning respondents give to their responses. There is always the possibility of misunderstanding and researcher bias.
- Respondents quite often provide answers that are irrelevant to the researcher's intent.
105Fill-in Responses.
- This is a transitional mode between the structured and unstructured modes.
- Respondents generate, rather than choose, answers
- Responses are, however, limited in range and length - often a single word or short phrase
- Example: What is your father's occupation?
- The very wording of the question restricts the number of possible responses and the number of words.
106- Tabular Responses - Fill response into a table. A very convenient way of organizing complex responses.
- Scaled Response - A structured response form. Respondents are asked to express endorsement or rejection of a given statement.
- Numerical rating scales
- These scales require respondents to give one response for each item
- The resulting variable responses can be ordered from high to low
- The numbers represent the intensity of the sentiment being expressed
- Likert
- Horizontal rating scale
- Semantic differential scales
- Vertical rating ladder
107Fill-in responses (contd)
- Scoring: Out of 10
- Ranking response: Respondents are given some statements, etc. and asked to rank them according to some criteria.
- Checklist Response (Multiple response format) - Respondents choose all applicable answers from a number of options given to them
108Fill-in responses (contd)
- Binary choice formats
- Dichotomous
- Paired comparisons: given these two choices, which do you consider to be more important?
- Multiple choice formats
- The respondent is asked to select one response from a set of responses.
- Multiple nominal categories
- Multiple ordinal categories
- Multiple ordered attitude statements
109Fill-in responses (contd)
- Non-committal responses
- No opinion
- Don't know
- Middle non-committal answer
110Fill-in responses (contd)
- The number of response categories
- The fundamental concern is that there are response categories with which respondents can comfortably represent themselves. Secondly, the number depends on the respondents' ability to answer the item and the extent to which they can do so.
- Dichotomous
- Five-point scales: information about intensity, extremity and direction
- Longer scales: greater discrimination and therefore finer detail
111Development Issues
- In constructing the questionnaire, the researcher should always consider the following factors
- Format: Wording
- Precision: Questions should be clear and unambiguous
- Concision: Items should be as short as possible
- Relevance: Questions should all be relevant and necessary
- Double-barreled Questions: Each question should attempt to measure only one variable at a time
- Biased Items/Terms: Should not use leading questions
- Negative Items: Questions should be in positive form
- Abbreviations and Jargon: These should always be avoided
112Development Issues (contd).
- Format: Layout
- Uncluttered: Items should be well-spaced/spread-out
- Order: Items should flow in a logical order. The ordering of questions affects the quality of responses
- Length: There should not be too many items. The instrument shouldn't be too long
- Personal Information: Request only when required
- Instructions: Always provide adequate instructions, both general and specific.
113Pilot or Pre-testing
- Questionnaire development
114Why Pre-test?
- It is human to err. Irrespective of how systematic or careful we are in the questionnaire design process, we will more than likely make errors.
- The standardized questionnaire is inflexible. Once the instrument is developed and data collection has started, any mistakes/errors cannot be corrected. If modifications to the questionnaire are made during data collection, any data collected before the changes will become useless.
- To determine the length of time the questionnaire takes to be completed.
115Conducting the Pre-test
- Use respondents that are part of the population but not part of the sample. The basic requirement is that the questionnaire should be relevant to those responding.
- The interview should be conducted in conditions as similar as possible to those under which the actual interviews will be conducted. That is, similar settings and the interviewers that will eventually conduct the interviews should be used.
- Test in waves: after each set of revisions to the instrument, the revised questionnaire should be re-tested on a new set of respondents. These waves of testing should be continued until the instrument is as clean as possible.
116Sampling and
117What is Sampling?
- A process for identifying and selecting elements for observation (Babbie, 2000)
- The selection of units of observation in such a manner that a researcher can make relatively few observations yet still be able to generalize to the larger population
- A systematic way of deciding what or whom to observe when limited resources dictate that the few instead of the many be observed
118Why Sample?
- Factors justifying the observation of a sample
rather than the entire population
- Cost constraint
- Time constraint
- Timely Results
- Accuracy: planning and logistics are more manageable
- Human Resource constraint
- Practicability, e.g. the population may be inaccessible
119Some Important Concepts/Terms
- Representativeness: typical of the population. A sample is representative of the population if the aggregate characteristics of the sample closely approximate those same aggregate characteristics of the population
- Equi-probability: the probability or chance of being selected is the same for all members of the population. Sampling methods which ensure equi-probability are classified as Random Sampling Techniques
120Some Important Concepts/Terms
- Bias: giving a quality or characteristic more/less attention or emphasis than it merits.
- A biased sample is not typical/representative of the population
- Factors contributing to bias in a sample
- - Convenience: tendency to include units that are easily accessed
- - Personal biases
- - Ignorance about composition or description of population
- - Flawed Method
121Some Important Concepts/Terms
- Population: Totality of items or units about which the researcher wants information
- Sample Frame: An accurate specification of all units of interest to a particular study. It is an operational definition of the population under consideration. A list of all units of interest.
- Sample: A subset of the population drawn from the sample frame. A representative smaller group that is systematically selected from the population
- Element: A unit of the sample drawn from the sample frame of a particular population. A member of the sample
122Some Important Concepts/Terms
- Sampling error: the difference between the value of a statistic and the value of the corresponding population parameter.
- Periodicity: where there are periodic/cyclical patterns in the population that correspond to the sampling interval (a problem specific to systematic random sampling, which results in a biased sample).
- This occurs when the arrangement of the items in the population is defined by some characteristic.
123Two types of Sampling
- Probability
- - Elements are drawn based on chance/random procedures
- - Every member of the population has a known (non-zero) probability of being selected
- Non-probability
- - Elements are not chosen by chance/randomly
- - Determined on the basis of expertise, personal judgment, knowledge or convenience
124Non-probability Sampling
- In many research situations, the enumeration of the population elements (a basic requirement of probability sampling) is difficult or impossible
- At other times, a representative sample is not appropriate given the aim/purpose of the research
- In these instances, a non-probability technique will suffice
125Probability Sampling
- There are four main types of probability samples. The choice between these depends on:
- The nature of the research problem
- The availability of good sampling frames
- Money
- The desired level of accuracy in the sample
- The method by which the data is to be collected
- Four Types
- Simple random sampling
- Stratified random sampling
- Systematic sampling
- Multistage Area/Cluster sampling
126Simple Random Sampling
- All members of the population have an equal and independent chance of being included
- Define population
- List accessible members of population (complete sample frame)
- Decide on the required sample size
- Select sample by employing a chance procedure (e.g. table of random numbers)
127SRS Table of Random Numbers
- Table containing columns of digits that have been computer generated.
- Assign each member of the population a distinct identification number
- Use the table to select, systematically, the subjects to be included in the sample
- It is customary to determine by chance the point at which the table is entered
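The procedure above can be sketched in code, with a pseudo-random number generator standing in for the printed table of random numbers. This is a minimal illustration only: the function name `simple_random_sample`, the frame of 150 identification numbers and the sample size of 15 are assumptions made for the example, not part of the slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sample frame; every unit has an equal chance."""
    frame = list(frame)
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the sample frame")
    rng = random.Random(seed)    # seed plays the role of the chance entry point into the table
    return rng.sample(frame, n)  # selection without replacement

# Hypothetical sample frame: identification numbers 1..150
frame = list(range(1, 151))
sample = simple_random_sample(frame, 15, seed=1)
print(sorted(sample))
```

Fixing the seed makes the draw repeatable, which is convenient for teaching; in a real study the seed (like the table entry point) would itself be chosen by chance.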
128SRS
- Disadvantages
- Requires an unbiased sample frame
- Impractical when surveying populations in diverse
geographical areas
129Systematic Sampling
- Involves drawing a sample by taking every kth case from a list of the population
- First decide on the number of subjects in the sample (n)
- Since the total number in the population (N) is known, divide N by n to determine the sample interval (k) to apply to the list
130Systematic Sampling
- The first number is randomly selected from the first k members of the list (N.B. if the start falls outside the first k members, the list will be exhausted before the sample is selected), and every kth member of the population thereafter is selected for the sample
- If k is a fraction, truncate the number and proceed. If the number is rounded up instead, the list may be exhausted before the sample is drawn.
- Population = 500, desired sample = 50
- k = N/n
- k = 500/50 = 10
131Systematic Sampling
- Start near the top of the list and select the first case randomly from the first 10 cases and then every 10th case thereafter
- Differs from simple random sampling because choices are not independent. Once the first case is chosen, all subsequent cases are automatically determined.
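The systematic procedure (interval k = N/n truncated, random start within the first k cases, then every kth case) can be sketched as follows. The function name `systematic_sample` and the frame of 500 numbered cases are assumptions for illustration; the sketch follows the slide's 500/50 example.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units by taking every kth case after a random start in the first k cases."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the size of the frame")
    k = N // n                  # truncate the interval, never round up
    rng = random.Random(seed)
    start = rng.randrange(k)    # 0-based position within the first k cases
    return [frame[start + i * k] for i in range(n)]

# Example from the slides: population of 500, desired sample of 50, so k = 10
frame = list(range(1, 501))
sample = systematic_sample(frame, 50, seed=3)
print(sample[:5])
```

Because the start is at most position k - 1 and k is truncated, the last selected position never runs past the end of the list.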
132Systematic Sampling
- If the original population list is in random order, then systematic sampling would yield a sample that could reasonably be substituted for a random sample
- If the list is alphabetical or otherwise structured (e.g. people are positioned in the population according to a given characteristic), the sample may be biased (Periodicity).
133Systematic Sampling
- Advantage
- Simplest method to select a random sample
- Disadvantage
- Requires an unbiased sample frame
- Periodicity
134Exercise
- 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75,
76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105,
106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120,
121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135,
136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150.
135Stratified Sampling
- Use when the population contains a number of subgroups/strata that may differ in the characteristics being studied
- First identify the strata of interest and then draw a specified number of subjects from each stratum
- This approach improves representativeness and facilitates the study of differences between subgroups
136Stratified Sampling
- To stratify a sample
- Select the stratifying variable
- Divide the sampling frame into separate lists, one for each category of the stratifying variable
- Draw a systematic sample or an SRS from each list
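The three stratification steps can be sketched as below: pick a stratifying variable, split the frame into one list per category, then draw an SRS from each list. The function name `stratified_sample`, the made-up frame of 90 students and the faculty labels are all hypothetical, and drawing an equal number per stratum is only one possible allocation.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Split the frame by a stratifying variable, then draw an SRS from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)   # one list per category
    return {
        label: rng.sample(units, min(n_per_stratum, len(units)))
        for label, units in strata.items()
    }

# Hypothetical frame: (student id, faculty); faculty is the stratifying variable
frame = [(i, "Arts" if i % 3 == 0 else "Science") for i in range(1, 91)]
sample = stratified_sample(frame, stratum_of=lambda unit: unit[1], n_per_stratum=5, seed=7)
for label, units in sample.items():
    print(label, units)
```

Note that the frame must carry the stratifying variable for every unit, which is the requirement listed among the disadvantages of this method.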
137Stratified Sampling
- Advantages
- It produces more representative samples
- Disadvantages
- More complicated than SRS and systematic random sampling
- Sample frame must contain information on the stratifying variable
- Requires an unbiased sample frame
138Multistage Cluster Sampling
- Using this method a (final) sample is obtained by drawing several different (intermediate) samples.
- Procedure
- Divide the population into broad groups (clusters)
- Select an SRS of these groups
- Within these selected groups, obtain a sample frame for each
- Select an SRS within these groups, etc.
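A two-stage version of this procedure can be sketched as follows: first an SRS of clusters, then an SRS of units within each selected cluster. The function name `two_stage_cluster_sample`, the district names and the household numbers are invented for the illustration; real designs may add further stages.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: SRS of clusters. Stage 2: SRS of units within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)       # stage 1
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen                                   # stage 2
    }

# Hypothetical population: households grouped by district
clusters = {
    "District 1": list(range(100, 140)),
    "District 2": list(range(200, 230)),
    "District 3": list(range(300, 350)),
    "District 4": list(range(400, 425)),
}
sample = two_stage_cluster_sample(clusters, n_clusters=2, n_per_cluster=5, seed=11)
print(sample)
```

Only the selected districts need a list of their households, which is why a frame for the whole population is not needed up front.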
139Multistage Cluster Sampling
- Advantages
- Does not require an unbiased sample frame
- Can be used to sample geographically diverse populations
- Disadvantages
- Complex
- Expensive
140Steps in the Sampling Process
- Define the population - Identify all the elements about which the researcher wants information
- Determine Sample Frame - Identify the accessible population
- Select Sample - Systematically select a representative smaller group from the sample frame.
- Observe - Make observations of this smaller group
- Generalize - Generalize the findings to the larger population
141Sample Size
- Sample size is determined by two factors
- The degree of accuracy that is required
- The level of variation in the population across
the variables being studied
142Sampling distributions
- Up until this point the discussion about
statistics has been about those from a sample. In
our estimation of the various means and standard
deviations it was assumed that these sample
statistics were similar to those in the
population. However this assumption is not
necessarily true. To begin our discussion about
how this is the case we need to understand some
basic statistical terms and their relevance to
sampling distributions.
143Sampling distributions
- A population is any set of measurements that can be made of a random variable, real or hypothetical.
- A sample is a subset of these (actual) measurements.
- In an effort to summarize the characteristics of samples, various numerical descriptive measures called sample statistics were calculated from the sample measurements and used as aforementioned.
- Since sample statistics are singular numerical values, they are also referred to as point estimates of population parameters.
144Sampling distributions
- Population characteristics are also summarized by similar numerical descriptive measures called population parameters. However, these are usually unknown because it is too expensive to measure every possible value of a random variable or it is impractical to do so for some other reason.
- Consequently, sample statistics are used as estimates of population parameters because it is cheaper to make measurements of a few of the values that a random variable can take.
145Sampling distributions
- Specifically, sample statistics are used to make inferences about population parameters, either in the form of (1) estimates of the actual population values or (2) decisions about the value of a population parameter.
- However, this is problematic because the sample statistics vary from sample to sample depending on the values of the random variable that are sampled.
146Sampling distributions
- The solution is to create a probability distribution (sampling distribution) of all the values of the statistic produced from the various samples that can be drawn from the population of measurements of the random variable.
- This probability distribution could then be used to evaluate the reliability of the inferences made about the population parameters. Of course, we are assuming that each sample consists of values typically found for the random variable; in other words, each sample is representative of the population of values it is drawn from. The most common mechanism used to ensure representativeness is to select the values of each sample by a random method.
147Sampling distributions
- Understanding sampling distributions is therefore
important because it facilitates the
comprehension of the process of statistical
inference in particular an intuitive appreciation
of the reliability of the inferences made using
sample statistics.
148Sampling distributions
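- Properties of Sampling Distributions
- The mean sampling distribution (the sampling distribution of the sample mean) is normally distributed: exactly so when the random variable itself is normally distributed, and approximately so for large samples.
- The mean of the mean sampling distribution is equal to the population mean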
149Sampling distributions
- If the mean of the sampling distribution is not equal to the mean of the population then the sample mean is said to be biased.
- The standard deviation (or standard error) of the mean sampling distribution is $\sigma_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
150Sampling distributions
- The mean distribution can be converted to the standard normal distribution by using $z = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$, with $\bar{x}$ the sample mean,
151Sampling distributions
- where µ = the mean of the random variable
- n = the sample size
- s = the standard deviation of the random variable
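A small simulation can make these properties concrete. The sketch below builds a made-up population, draws many random samples of size n, and compares the spread of the sample means with s/√n. The population (10,000 values with mean near 50 and standard deviation near 10), the sample size of 25 and the 2,000 repetitions are all assumptions chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of measurements of a random variable
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)   # population mean
s = statistics.pstdev(population)   # population standard deviation

n = 25  # sample size
# Draw many random samples and record the mean of each one
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(2_000)]

mean_of_means = statistics.fmean(sample_means)   # should sit close to mu
observed_se = statistics.stdev(sample_means)     # spread of the sample means
theoretical_se = s / math.sqrt(n)                # standard error s / sqrt(n)

def z_score(x_bar):
    """Convert a sample mean to a standard normal value."""
    return (x_bar - mu) / theoretical_se

print(f"population mean {mu:.2f}, mean of sample means {mean_of_means:.2f}")
print(f"observed SE {observed_se:.2f}, theoretical SE {theoretical_se:.2f}")
```

With these settings the two standard errors should agree closely, though the exact figures depend on the seed and on the number of repetitions.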
152Sampling distributions
- From the properties listed above two theorems have been derived. The first says that if a random sample of n observations is selected from a population of measurements of a random variable that is normally distributed, the mean sampling distribution produced from this population will also be normally distributed.
153Sampling distributions
154Sampling
- Sample size
- Before sample size selection, the following must be considered
- Level of confidence: this reflects the risk of error the researcher is willing to accept in the study (the higher the confidence level, the lower the accepted risk). This in turn depends on
- Time
- Money/resources
- Consequences associated with drawing incorrect conclusions
155 Determining Sample Size
- The most common levels of confidence used are 95% and 99%.
- Confidence interval: this is the level of sampling accuracy the researcher will have.
- The type of variable or variables being studied
156Sample Size
- Categorical variables require different sample sizes than do metric level variables. Generally categorical variables require smaller sample sizes than metric variables.
- Additionally, it is easier to estimate sample sizes for categorical variables, because calculating the sample size for metric variables requires the standard deviation of a previous sample.
- $n = \dfrac{z^2 pq}{e^2}$, where $e = z\sqrt{pq/n}$ (p = the estimated proportion, q = 1 - p)
157Sample Size
- This standard deviation is usually obtained from one of two sources: either from previous research literature using the same population or from the sample used in the pre-test of the instrument.
- $n = \left(\dfrac{zs}{e}\right)^2$
- where $e = z\,(s/\sqrt{n})$
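The two sample size formulas, n = z²pq/e² for a categorical variable and n = (zs/e)² for a metric variable, can be sketched as small helper functions. The function names and the input values below (z = 1.96 for 95% confidence, p = 0.5, a margin of error of 0.05, and s = 15 with a margin of 2) are illustrative assumptions, not figures from the slides. Results are rounded up, since a fractional respondent is not possible.

```python
import math

def sample_size_proportion(z, p, e):
    """n = z^2 * p * q / e^2 for a categorical (proportion) variable, with q = 1 - p."""
    q = 1 - p
    return math.ceil(z**2 * p * q / e**2)

def sample_size_mean(z, s, e):
    """n = (z * s / e)^2 for a metric variable with standard deviation s."""
    return math.ceil((z * s / e) ** 2)

# Hypothetical inputs: 95% confidence (z = 1.96)
n_prop = sample_size_proportion(z=1.96, p=0.5, e=0.05)
n_mean = sample_size_mean(z=1.96, s=15, e=2)
print(n_prop, n_mean)
```

Using p = 0.5 gives the largest value of pq and therefore the most conservative sample size when no prior estimate of the proportion is available.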
158Sample Size
- The size of the population
- If the population has fewer than 100,000 elements, it is considered to be a finite population (or if $n/N > 0.05$) and the finite population correction factor must be applied to the calculation of the standard error. The resulting samples are smaller than those selected from larger populations.
- Finite population correction factor $= \sqrt{\dfrac{N - n}{N - 1}}$
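As a rough illustration of the correction, the sketch below multiplies the usual standard error s/√n by the factor √((N − n)/(N − 1)). The function names and the figures used (a population of 2,000, a sample of 400 and a standard deviation of 15) are hypothetical.

```python
import math

def finite_population_correction(N, n):
    """Return sqrt((N - n) / (N - 1)) for population size N and sample size n."""
    return math.sqrt((N - n) / (N - 1))

def standard_error_with_fpc(s, n, N):
    """Standard error of the mean, s / sqrt(n), scaled by the correction factor."""
    return s / math.sqrt(n) * finite_population_correction(N, n)

# Hypothetical figures: n/N = 400/2000 = 0.20, well above 0.05
N, n, s = 2000, 400, 15
fpc = finite_population_correction(N, n)
se_plain = s / math.sqrt(n)
se_corrected = standard_error_with_fpc(s, n, N)
print(f"fpc {fpc:.4f}, SE without {se_plain:.4f}, SE with {se_corrected:.4f}")
```

The factor is always at most 1, so the corrected standard error is never larger than the uncorrected one, and it approaches 1 as the population grows relative to the sample.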
159Exercise
- An alumni association wants to estimate the mean debt of this year's college graduates. It is known that the population standard deviation of the debts of this year's college graduates is 11,800. How large a sample should be selected so that the estimate with a 99% confidence level is within 800 of the population mean?
- A political party wants to estimate the proportion of voters who disapprove of a candidate they have put to run in a particular
constituency. The party wants the estimate to be