Dr.S.Nishan Silva - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Dr.S.Nishan Silva


1
Research Statistics
  • Dr.S.Nishan Silva
  • (MBBS)

2
Statistics
The collection, evaluation, and interpretation of
data
3
Statistics
Inferential Statistics: Generalize and evaluate a
population based on sample data
Descriptive Statistics: Describe collected data
4
Graphic Data Representation
Histogram: Frequency distribution graph
Frequency Polygons: Frequency distribution graph
Bar Chart: Categorical data graph
Pie Chart: Categorical data graph
5
Levels of Measurement
  • Qualitative data
  • Nominal Measurement
  • Ex: Give a number coding for the data. The number
    value is not considered (the codes are only labels)
  • Ordinal Measurement
  • Ex: Number coding, but the order of the numbers matters
  • Quantitative data
  • Interval Measurement
  • No absolute zero. Values sit on a scale of equal
    intervals, so differences are meaningful but ratios are not
  • Ratio Measurement
  • Absolute zero, with values continuing from it, so
    ratios are meaningful (e.g., weight)

6
Discussion of Examples
7
Example Research
  • The effect of food from the IIHS canteen on weight
    gain
  • Population: IIHS (students and staff)
  • Further divisions: Students (Nursing and
    Physiotherapy)
  • Data collection: Questionnaire
  • Food from home/outside vs. from canteen
  • Weight change over one month

8
Master Data Sheet
Question  Option        Sheet 1  Sheet 2  Sheet 3  Sheet 4  Sheet 5  Sheet 6  Sheet 7  Sheet 8
Gender    M
Gender    F
Job       Nurse
Job       Physio
Job       Other
Food      From Canteen
Food      From Other
Weight    Gain
Weight    Lost or Same
9
Master Table
Gender Job Food Weight Value
Male Nurses Canteen Gain
Male Nurses Canteen No
Male Nurses Other Gain
Male Nurses Other No
Male Physio Canteen Gain
Male Physio Canteen No
Male Physio Other Gain
Male Physio Other No
Male Other Canteen Gain
Male Other Canteen No
Male Other Other Gain
Male Other Other No
Female Nurses Canteen Gain
Female Nurses Canteen No
Female Nurses Other Gain
Female Nurses Other No
Female Physio Canteen Gain
Female Physio Canteen No
Female Physio Other Gain
Female Physio Other No
Female Other Canteen Gain
Female Other Canteen No
Female Other Other Gain
Female Other Other No
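The 24 rows of the master table are every combination of gender, job, food source, and weight outcome. The sketch below shows one way the questionnaire sheets could be tallied into that table. The `sheets` list is made-up illustration data, not survey results, and all names are assumptions.

```python
from collections import Counter
from itertools import product

GENDERS = ["Male", "Female"]
JOBS = ["Nurses", "Physio", "Other"]
FOODS = ["Canteen", "Other"]
WEIGHTS = ["Gain", "No"]

# Hypothetical questionnaire sheets (illustration only, not real data).
sheets = [
    ("Male", "Nurses", "Canteen", "Gain"),
    ("Female", "Physio", "Other", "No"),
    ("Male", "Nurses", "Canteen", "Gain"),
]

counts = Counter(sheets)

# One row per combination, in the same order as the master table.
master_table = [
    (combo, counts[combo]) for combo in product(GENDERS, JOBS, FOODS, WEIGHTS)
]

for combo, value in master_table:
    print(" ".join(combo), value)
```

Combinations that nobody reported get a value of 0, so the table always has all 24 rows.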
10
Graphs - Draw
  • Pie charts
  • Weight gain from canteen in males
  • Weight gain from home in females
  • Bar charts / Graphs
  • Weight gain from Canteen

11
Discussion of YOUR Examples
12
Describing the Data with Numbers
  • Measures of Central Tendency
  • MEAN -- average
  • MEDIAN -- middle value
  • MODE -- most frequently observed value(s)

13
Measures of Central Tendency
Mean
Arithmetic average: Sum of all data values divided
by the number of data values within the array
Most frequently used measure of central
tendency
Strongly influenced by outliers (very
large or very small values)
14
Measures of Central Tendency
Determine the mean value of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
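As a check on this exercise, here is a short sketch that computes the mean of the slide's data array. The second calculation is an added illustration of how the two very small values pull the mean down.

```python
from statistics import mean

data = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]

# Sum of all data values divided by the number of data values.
average = sum(data) / len(data)
print(round(average, 2))  # 47.64

# Dropping the two very small values (2 and 5) shows how strongly
# outliers influence the mean.
trimmed = [x for x in data if x > 5]
print(round(mean(trimmed), 2))  # 57.44
```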
15
Mean of a Group of Data
  • Page 78

16
Measures of Central Tendency
Median
Data value that divides a data array into two
equal groups
Data values must be ordered from lowest to highest
Useful in situations with skewed data and
outliers (e.g., wealth management)
17
Measures of Central Tendency
Determine the median value of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
Organize the data array from lowest to highest
value.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
Select the data value that splits the data set
evenly.
Median = 58
What if the data array had an even number of
values?
5, 48, 49, 55, 58, 59, 60, 62, 63, 63
Take the mean of the two middle values.
Median = (58 + 59) / 2 = 58.5
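A minimal sketch of both median cases, using Python's standard `statistics.median`. The even-length array drops the lowest value, matching the ten values listed on the slide.

```python
from statistics import median

odd_array = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]

# median() sorts the data itself; sorted() here is only for display.
print(sorted(odd_array))
print(median(odd_array))  # 58

# Even number of values: the median is the mean of the two middle values.
even_array = sorted(odd_array)[1:]  # drop the lowest value (2)
print(median(even_array))  # 58.5
```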
18
Measures of central tendency
Mode
Most frequently occurring response within a data
array
  • Usually the highest point of curve

May not be typical
May not exist at all
Mode, bimodal, and multimodal
19
Measures of Central Tendency
Determine the mode of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
Mode = 63
Determine the mode of
48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55
Mode = 63 and 59 (bimodal)
Determine the mode of
48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55
Mode = 63, 59, and 48 (multimodal)
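The three cases can be checked with `statistics.multimode` (Python 3.8 or later), which returns every value tied for the highest frequency. This is a sketch; the variable names are assumptions.

```python
from statistics import multimode

unimodal = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]
bimodal = [48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55]
multimodal = [48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55]

# multimode() lists all most-frequent values, so ties are not hidden.
for name, data in [("unimodal", unimodal),
                   ("bimodal", bimodal),
                   ("multimodal", multimodal)]:
    print(name, sorted(multimode(data)))
```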
20
Measures of Dispersion
  • RANGE
  • highest to lowest values
  • STANDARD DEVIATION
  • how closely do values cluster around the mean
    value
  • SKEWNESS
  • refers to symmetry of curve

21
Range
Calculate by subtracting the lowest value from
the highest value.
Calculate the range for the data array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
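A one-line check of the range for this array, as a sketch:

```python
data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

# Range: highest value minus lowest value.
data_range = max(data) - min(data)
print(data_range)  # 61
```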
22
Standard Deviation
  1. Calculate the mean.
  2. Subtract the mean from each value.
  3. Square each difference.
  4. Sum all squared differences.
  5. Divide the summation by the number of values
    in the array minus 1.
  6. Calculate the square root of the quotient.

23
Standard Deviation
Calculate the standard deviation for the data
array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
1.
Mean = 524 / 11 = 47.64
2.
2 - 47.64 = -45.64
5 - 47.64 = -42.64
48 - 47.64 = 0.36
49 - 47.64 = 1.36
55 - 47.64 = 7.36
58 - 47.64 = 10.36
59 - 47.64 = 11.36
60 - 47.64 = 12.36
62 - 47.64 = 14.36
63 - 47.64 = 15.36
63 - 47.64 = 15.36
24
Standard Deviation
Calculate the standard deviation for the data
array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
3.
(-45.64)² = 2083.01
(-42.64)² = 1818.17
0.36² = 0.13
1.36² = 1.85
7.36² = 54.17
10.36² = 107.33
11.36² = 129.05
12.36² = 152.77
14.36² = 206.21
15.36² = 235.93
15.36² = 235.93
25
Standard Deviation
Calculate the standard deviation for the data
array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
4.
2083.01 + 1818.17 + 0.13 + 1.85 + 54.17 + 107.33
+ 129.05 + 152.77 + 206.21 + 235.93 + 235.93
= 5,024.55
5.
5,024.55 / (11 - 1) = 5,024.55 / 10 = 502.455
6.
s = √502.455 = 22.42
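The six steps can be followed in code as a sketch. The result is compared with `statistics.stdev`, which also divides by n - 1; variable names are assumptions.

```python
from math import sqrt
from statistics import stdev

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

n = len(data)
mean = sum(data) / n                         # step 1
differences = [x - mean for x in data]       # step 2
squares = [d ** 2 for d in differences]      # step 3
sum_of_squares = sum(squares)                # step 4
variance = sum_of_squares / (n - 1)          # step 5
s = sqrt(variance)                           # step 6

print(round(sum_of_squares, 2))  # 5024.55
print(round(s, 2))               # 22.42
print(round(stdev(data), 2))     # same value from the standard library
```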
26
Variance
Average of the square of the deviations
  1. Calculate the mean.
  2. Subtract the mean from each value.
  3. Square each difference.
  4. Sum all squared differences.
  5. Divide the summation by the number of values in
    the array minus 1.

27
Variance
Calculate the variance for the data array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
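A short sketch of the variance for this array, using `statistics.variance` (sample variance, dividing by n - 1). It is the square of the standard deviation.

```python
from statistics import stdev, variance

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

sample_variance = variance(data)
print(round(sample_variance, 2))   # 502.45

# Variance is the standard deviation squared.
print(round(stdev(data) ** 2, 2))  # 502.45
```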
28
Standard Deviation
(Figure: Curve A and Curve B, with standard deviations σA and σB)
29
Skewness
(Figure: Curve A and Curve B; positions of the mode, median, and mean on a curve with negative skew)
30
A Simple Method for estimating standard error
Standard error is the calculated standard
deviation divided by the square root of the sample
size (the number of observations in the sample).
Standard error of the mean is used to judge the
reliability of the data.
Example: If there are 10 corn plants with a
standard deviation of 0.2:
SE = 0.2 / √10 = 0.2 / 3.16 = 0.063
0.063 represents one standard deviation of the
sample mean for samples of 10 plants.
If there were 100 plants, the standard error would
drop to 0.2 / √100 = 0.02.
Why? Because when we take larger samples, our
sample means get closer to the true mean value of
the population. Thus, the distribution of the
sample means would be less spread out and would
have a lower standard deviation.
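Computing the formula directly for the corn plant example is a useful check on the arithmetic. This is a sketch; `standard_error` is an assumed helper name.

```python
from math import sqrt

def standard_error(std_dev, sample_size):
    """Standard deviation divided by the square root of the sample size."""
    return std_dev / sqrt(sample_size)

print(round(standard_error(0.2, 10), 3))   # 0.063
print(round(standard_error(0.2, 100), 3))  # 0.02
```

Going from 10 to 100 plants divides the standard error by √10, about 3.16.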
31
Coefficient of Variation
  • Percentage CV is
  • (Standard Deviation / Mean) × 100
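As an illustration, the percentage CV can be computed for the data array used in the earlier slides. This is a sketch; applying the formula to that array is an added example, not part of the slide.

```python
from statistics import mean, stdev

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

# Percentage CV = (standard deviation / mean) x 100
cv_percent = stdev(data) / mean(data) * 100
print(round(cv_percent, 1))  # 47.1
```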

32
Discussion of Examples
33
Probability
  • It is the numerical measure of the likelihood
    that a specific event would occur.
  • (Page 92)
  • Sum of the probabilities of all possible outcomes of one event = 1
  • Probability is always between 0 and 1

34
Probability
  • Probability of independent events
  • Chance of one single event happening (against not
    happening)
  • Marginal and conditional probabilities
  • (Page 92-94)

35
The Normal Distribution
  • Mean = median = mode
  • Skew is zero
  • 68% of values fall within ±1 SD
  • 95% of values fall within ±2 SDs

(Figure: normal curve with the mean, median, and mode at the center and 1σ and 2σ marked)
36
The Normal Curve and Standard Deviation
A normal curve: Each vertical line is a unit of
standard deviation
68% of values fall within +1 or -1 of the mean
95% of values fall within +2 or -2 units
Nearly all members (>99%) fall within 3
std dev units
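These percentages can be reproduced from the standard normal curve. The sketch below uses `statistics.NormalDist` (Python 3.8 or later); `share_within` is an assumed helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of values within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 2))
```

The three printed shares are about 68.27, 95.45, and 99.73 percent.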
37
Example
  • (Theory)

38
My weight
Plot weight as a function of the time the data was acquired
39
Comments:
Background is white (less ink)
Font size is larger than Excel default (use 14 or 16)
Do not use curved lines to connect data points;
that assumes you know more about the relationship
of the data than you really do
40
Assume my weight measurements are a single random set of
similar data
Make a frequency chart (histogram) of the data
Create a model of my weight and determine the
average weight and how consistent my weight is
41
s = 1.4 lbs
s = standard deviation:
measure of the consistency, or similarity, of
weights
42
Width is measured at the inflection point: s
W1/2
Triangulated peak: base width is 2s < W < 4s
43
pp = peak to peak, or largest separation of
measurements
Area within +/- 1s = 68.3%
Area within +/- 2s = 95.4%
Area within +/- 3s = 99.73%
Peak to peak is sometimes easier to see on the
data vs time plot
44
(Calculated s = 1.4)
Peak to peak: from 139.5 to 144.9
s ≈ pp/6 = (144.9 - 139.5)/6 = 0.9
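The peak-to-peak shortcut rests on nearly all values lying within 3 standard deviations either side of the mean, so the full spread is roughly 6 standard deviations. A minimal sketch with the two readings from the slide:

```python
highest = 144.9  # largest weight reading on the slide
lowest = 139.5   # smallest weight reading on the slide

# Peak to peak spans about 6 standard deviations (+/- 3s).
peak_to_peak = highest - lowest
s_estimate = peak_to_peak / 6
print(round(s_estimate, 1))  # 0.9
```

The estimate (0.9) is a rough guide only; the slide's calculated value is 1.4.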
45
Read
  • Correlation between variables: page 99 and beyond

48
Inferential Statistics
  • Used to determine the likelihood that a
    conclusion based on data from a sample is true

49
Terms
  • p value: the probability that an observed
    difference could have occurred by chance

50
Terms
  • confidence interval
  • The range of values we can be reasonably certain
    includes the true value.

51
The Use of the Null Hypothesis
  • Is the difference in two sample populations due
    to chance or a real statistical difference?
  • The null hypothesis assumes that there will be no
    difference or no change or no effect of the
    experimental treatment.
  • If treatment A is no better than treatment B then
    the null hypothesis is not rejected.
  • If there is a significant difference between A
    and B then the null hypothesis is rejected.

52
T-test or Chi Square? Testing the validity of the
null hypothesis
  • Use the t-test (also called Student's t-test) if
    using continuous variables from normally
    distributed sample populations (e.g., height)
  • Use the Chi Square (χ²) if using discrete
    variables (if you are evaluating the differences
    between experimental data and expected or
    hypothetical data). Examples: genetics
    experiments, expected distribution of organisms.

53
T-test
  • T-test determines the probability that the null
    hypothesis concerning the means of two small
    samples is correct
  • The probability that two samples are
    representative of a single population (supporting
    null hypothesis) OR two different populations
    (rejecting null hypothesis)

54
Use the t-test to determine whether sample
populations A and B came from the same or
different populations
t = (x1 - x2) / s(x1-x2)
x1 (bar x) = mean of A
x2 (bar x) = mean of B
sx1 = std error of A
sx2 = std error of B
s(x1-x2) = std error of the difference = √(sx1² + sx2²)
Example: Sample A mean = 8, Sample B mean = 12,
std error of the difference of populations = 1
|t| = |8 - 12| / 1 = 4: the means are 4 standard errors apart
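A minimal sketch of this calculation. The helper names are assumptions, and the individual standard errors passed to `se_of_difference` (0.6 and 0.8) are hypothetical values chosen only so that the combined standard error equals 1, as in the example.

```python
from math import sqrt

def se_of_difference(se_a, se_b):
    """Standard error of the difference between two independent means."""
    return sqrt(se_a ** 2 + se_b ** 2)

def t_statistic(mean_a, mean_b, se_diff):
    """Difference between the means, in units of its standard error."""
    return (mean_a - mean_b) / se_diff

# Slide example: mean of A = 8, mean of B = 12, std error of difference = 1.
t = t_statistic(8, 12, 1)
print(abs(t))  # 4.0

# Hypothetical individual standard errors (not from the slide).
print(round(se_of_difference(0.6, 0.8), 6))  # 1.0
```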
55
  • The z test
  • Used if your population samples are greater than
    30
  • Also used for normally distributed populations
    with continuous variables
  • Formula note: σ (sigma) is used instead of
    the letter s
  • z = (mean of pop 1 - mean of pop 2) /
    √(variance of pop 1/n1 + variance of pop 2/n2)
  • Also note that if you only had the standard
    deviation you can square that value and
    substitute for variance

56
Example z-test
  • You are looking at two methods of learning
    geometry proofs, one teacher uses method 1, the
    other teacher uses method 2, they use a test to
    compare success.
  • Teacher 1 has 75 students, mean = 85, stdev = 3
  • Teacher 2 has 60 students, mean = 83, stdev = 2

  • z = (85 - 83) / √(3²/75 + 2²/60)

  • z = 2/0.4320 = 4.629
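The same calculation as a sketch in code. `z_statistic` is an assumed helper name, and the standard deviation of 2 for Teacher 2 is read from the 2²/60 term in the slide's formula.

```python
from math import sqrt

def z_statistic(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference in means divided by sqrt(var1/n1 + var2/n2)."""
    standard_error = sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error

# Teacher 1: 75 students, mean 85, stdev 3.
# Teacher 2: 60 students, mean 83, stdev 2 (taken from the formula line).
z = z_statistic(85, 3, 75, 83, 2, 60)
print(round(z, 3))  # 4.629
```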

57
Example continued
Z = 4.6291
H0 (null hypothesis): Method 1 is not better than
Method 2
HA (alternative hypothesis): Method 1 is better than
Method 2
This is a one-tailed z test (the null hypothesis
does not simply predict no difference; the test is
directional).
At a probability of 0.05 (5% significance or 95%
confidence), the chart value is Zα = 1.645.
4.629 is greater than 1.645. The null hypothesis
(Method 1 is not better) would stand only if the
value were less than 1.645. It is not, so reject
the null hypothesis: Method 1 is better.
Z table (sample table with 3 probabilities)
α      Zα (one tail)   Zα/2 (two tails)
0.1    1.28            1.645
0.05   1.645           1.96
0.01   2.33            2.576
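The chart values come from the inverse of the standard normal curve, so they can be reproduced rather than looked up. A sketch using `statistics.NormalDist` (Python 3.8 or later); the helper names are assumptions.

```python
from statistics import NormalDist

def z_one_tail(alpha):
    """Critical z with probability alpha in one tail."""
    return NormalDist().inv_cdf(1 - alpha)

def z_two_tail(alpha):
    """Critical z with probability alpha split across both tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.1, 0.05, 0.01):
    print(alpha, round(z_one_tail(alpha), 3), round(z_two_tail(alpha), 3))

# Decision for the geometry example: one-tailed test at alpha = 0.05.
z = 4.629
reject_null = z > z_one_tail(0.05)
print(reject_null)  # True
```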
58
Chi square
  • Used with discrete values
  • Phenotypes, choice chambers, etc.
  • Not used with continuous variables (like height;
    use the t-test for samples less than 30 and the z-test
    for samples greater than 30)
  • O = observed values
  • E = expected values

59
http://course1.winona.edu/sberg/Equation/chi-squ2.gif
60
Interpreting a chi square
  • Calculate degrees of freedom
  • Number of events, trials, or phenotypes - 1
  • Example: 2 phenotypes - 1 = 1
  • Generally use the column labeled 0.05 (which
    means there is a 95% chance that any difference
    between what you expected and what you observed
    is within accepted random chance).
  • Any calculated value that is larger than the chart
    value means you reject your null hypothesis: there
    is a difference between observed and expected values.
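A minimal sketch of the whole procedure, assuming the usual chi-square formula χ² = Σ (O - E)² / E. The observed and expected counts are hypothetical illustration values, and 3.841 is the standard chart value for 1 degree of freedom in the 0.05 column.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for 2 phenotypes (illustration only).
observed = [60, 40]
expected = [50, 50]

statistic = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1

# Chart value for 1 degree of freedom, 0.05 column.
CRITICAL_VALUE_DF1 = 3.841

print(statistic, degrees_of_freedom)   # 4.0 1
print(statistic > CRITICAL_VALUE_DF1)  # True: reject the null hypothesis
```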

61
How to use a chi square chart
http://faculty.southwest.tn.edu/jiwilliams/probab2.gif