DATA ANALYSIS FOR RESEARCH PROJECTS - PowerPoint PPT Presentation

Slides: 47
Provided by: PhilR164
1
DATA ANALYSIS FOR RESEARCH PROJECTS
2
TYPES OF DATA
  • Quantitative data
  • measurements use a scale with equal
    intervals
  • examples include mass (g), length (cm),
    volume (mL), temperature (°C or K)
  • Qualitative data
  • non-standard scales with unequal
    intervals or discrete categories
  • examples include gender, choice, color
    scales

3
Quantitative Scales of Measure
Scale | Properties | Example
Interval (equal) | Numerical value indicates rank and meaningfully reflects relative distance between points on a scale | Temperature (°C or °F)
Ratio (equal) | Has all the properties of an interval scale, and in addition has a true zero point (proportional scale) | Length, Weight, Temperature (K)
4
Qualitative Scales of Measure
Scale | Properties | Example
Nominal (to name) | Data represents qualitative or equivalent categories (not numerical, cannot be rank ordered) | Eye color, hair color, Gender, Race
Ordinal (to order) | Numerically ranked, but has no implication about how far apart ranks are | Grades, Rating Scales
5
Sample Data
  • An experiment was conducted to measure the
    tensile strength of each of twelve pieces of two
    types of steel. The data from this experiment
    are given in the table to the right.
  • Is there a significant difference in tensile
    strength between the two types of steel?

6
  • Is there a better way to compare the data from
    these groups?
  • What have you used before to compare data from
    two different groups?

7
  • It is difficult to decide (consistently) whether
    differences between experimental groups are
    significant
  • We need a rigorous procedure that includes a
    clear operational definition of dissimilarity.

8
Statistics: Statistical Analysis
  • Statistical hypothesis-testing methods give us
    the ability to say with confidence that
    differences between groups are real and not just
    due to random chance, sampling errors, or other
    mistakes in data collection.

9
Sample data for consideration
  • For the following sets of data, discuss:
  • What was the IV and DV tested?
  • How should the data be processed to determine if
    the IV affects the DV?
  • How will you decide if the IV has a significant
    effect on the DV?

10
Sample Data Set 1: Effect of Temperature on the
Pressure of a Sample of Gas above Water
Temperature of Water (°C) | Pressure (mmHg)
50 | 90
55 | 120
60 | 145
65 | 180
70 | 219
75 | 264
80 | 310
11
Graphing data
  • Correlation coefficient gives a measure of how
    strong the relationship is between the graphed
    variables.
  • Multiple trials can and should all be analyzed at
    the same time.
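
As an illustration of the correlation coefficient, the sketch below computes Pearson's r for Sample Data Set 1. The function name `pearson_r` and the variable names are assumptions made for this example; the slides themselves use a graphing calculator or Excel rather than code.

```python
from math import sqrt

# Sample Data Set 1 from the slides
temperature_c = [50, 55, 60, 65, 70, 75, 80]
pressure_mmhg = [90, 120, 145, 180, 219, 264, 310]


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(temperature_c, pressure_mmhg)
print(f"r = {r:.3f}")  # a value near +1 indicates a strong positive relationship
```

A value of r close to +1 or -1 indicates a strong linear relationship; a value close to 0 indicates a weak one.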

12
Sample Data Set 2: Effect of Stress on the Height
of Bean Plants after 30 Days
Stressed Plants (cm) | Unstressed Plants (cm)
55.0 | 48.0
65.0 | 65.0
50.0 | 59.0
57.0 | 57.0
59.0 | 51.0
73.0 | 63.0
57.0 | 65.0
54.0 | 58.0
62.0 | 44.0
68.0 | 50.0
13
Comparing levels of IV
  • If graphing the data is not appropriate, the
    different groups of the IV can be compared.
  • These types of statistics are called Descriptive
    Statistics since they
  • describe the data sets
  • summarize groups of measurements

14
Descriptive Statistics
  • Measure of Central Tendency
  • attempt to provide one value that is most
    typical of the entire set of data
  • What are some examples of measures of central
    tendency?
  • Variation
  • describes the spread within the data set
  • two sets of data with the same mean may have
    quite different spread within the data

15
Appropriate Measures of Central Tendency and
Variation for Types of Data
Measurement | Quantitative Data | Qualitative Data (Nominal) | Qualitative Data (Ordinal)
Central Tendency | Mean, Median or Mode | Mode | Median
Variation | Standard Deviation or Range | Frequency Distribution | Frequency Distribution
16
What is standard deviation?
  • The standard deviation is a statistic that tells
    you how tightly the values in a set of data are
    clustered around the mean. It describes the
    variation in a set of data.
  • When the data points are close to the mean
    (little variation), the bell-shaped curve is
    steep and the standard deviation is small.
  • When there is greater variation in the data, the
    bell curve is relatively flat, which tells you
    that the standard deviation is relatively large.

17
Displaying Variation: Box-and-Whisker Plot
  • First Quartile (Q1): smaller than 75% of ranked
    values
  • Median (Q2): smaller than 50% and larger than
    50%
  • Third Quartile (Q3): smaller than 25% of ranked
    values
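
A minimal sketch of finding the three quartiles for the stressed bean plant heights, assuming Python 3.8 or later for `statistics.quantiles`. Calculators, Excel, and other tools use slightly different interpolation rules, so Q1 and Q3 may differ a little between tools.

```python
from statistics import median, quantiles

# Stressed plant heights (cm) from Sample Data Set 2
stressed = [55.0, 65.0, 50.0, 57.0, 59.0, 73.0, 57.0, 54.0, 62.0, 68.0]

# n=4 splits the ranked values into four equal parts and
# returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = quantiles(stressed, n=4)

print("Minimum:", min(stressed))
print("Q1:", q1)
print("Median (Q2):", q2)
print("Q3:", q3)
print("Maximum:", max(stressed))
```

These five values (minimum, Q1, median, Q3, maximum) are the ones drawn in a box-and-whisker plot.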

18
Illustrating Distributions for qualitative data:
Histograms
  • Symmetrical: mean equals median
  • Left-skewed: mean < median
  • Right-skewed: mean > median

19
Statistical Hypothesis Testing
  • A trend is apparent in the graph of the data. Is
    this trend significant?
  • The means of the groups are different. Is the
    difference significant?
  • Statistical hypothesis testing is needed to
    determine the significance in the results of your
    data analysis.
  • The results of these tests provide Inferential
    Statistics. We make inferential decisions based
    on the data we collect from a sample population.

20
Sample Data: Effect of Stress on the Height of
Bean Plants after 30 Days
Stressed Plants (cm) | Unstressed Plants (cm)
55.0 | 48.0
65.0 | 65.0
50.0 | 59.0
57.0 | 57.0
59.0 | 51.0
73.0 | 63.0
57.0 | 65.0
54.0 | 58.0
62.0 | 44.0
68.0 | 50.0
21
Example for comparing means: t Test for
Quantitative Data
  • Equal Sample Size
  • t = (x̄1 - x̄2) / sqrt((s1² + s2²) / n)
  • x̄1 = mean of Group 1
  • x̄2 = mean of Group 2
  • s1² = variance of Group 1
  • s2² = variance of Group 2
  • n = number of items or measurements in each group
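
A minimal sketch of the equal-sample-size t calculation, assuming the standard form t = (mean 1 - mean 2) / sqrt((variance 1 + variance 2) / n), which uses exactly the quantities listed above. The function name `t_equal_n` is an assumption for this example, and the data are the bean plant heights from the preceding slide.

```python
from math import sqrt
from statistics import mean, variance

# Bean plant heights (cm) from the sample data slide
stressed = [55.0, 65.0, 50.0, 57.0, 59.0, 73.0, 57.0, 54.0, 62.0, 68.0]
unstressed = [48.0, 65.0, 59.0, 57.0, 51.0, 63.0, 65.0, 58.0, 44.0, 50.0]


def t_equal_n(group1, group2):
    """t statistic for two independent groups of equal size."""
    if len(group1) != len(group2):
        raise ValueError("this form of the t test needs equal sample sizes")
    n = len(group1)
    # variance() is the sample variance (divides by n - 1)
    pooled_term = (variance(group1) + variance(group2)) / n
    return (mean(group1) - mean(group2)) / sqrt(pooled_term)


t = t_equal_n(stressed, unstressed)
print(f"t = {t:.2f}")
```

The calculated t is then compared with a critical value to decide whether the difference between the means is significant.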

22
Statistical calculations
  • Use the TI-84 or TI-83 calculator OR
  • Use Microsoft Excel Data Analysis
  • Calculate the t-test for the stressed plants data
    on the next slide, using the graphing calculator

23
Level of Significance
  • Establish a level of significance
  • In this class, use 0.05.
  • this means the probability of error in
    rejecting the null hypothesis is 5/100
  • OR
  • we can be 95% confident that the null
    hypothesis may be rejected

24
Results from the calculator
  • t: value for the t-test
  • x̄1: mean from List 1
  • x̄2: mean from List 2
  • Sx1: standard deviation for List 1
  • Sx2: standard deviation for List 2
  • df: degrees of freedom
  • n1: number of values in List 1
  • n2: number of values in List 2

25
t-Test Results from Excel
26
Statistical Hypotheses (different from your
research hypothesis)
  • Null Hypothesis
  • suggests any observed difference between two
    sample means occurred by chance and is NOT
    significant
  • states that there is no relationship between
    variables, i.e. two means are equal OR they are
    not statistically different
  • Claim / Alternative Hypothesis
  • derived from literature, research hypothesis
  • suggests outcome of experiment if I.V. affects
    D.V.

27
Null Hypothesis
  • What would be the null hypothesis for this set of
    data?

The mean height of stressed plants is not
significantly different from the mean height of
unstressed plants.
28
Confidence Levels
  • Probability that findings are repeatable
  • Infers that results of sample are the same as
    results of the whole population
  • If we reject the null hypothesis at the 95%
    confidence level
  • 95% certainty that difference between groups is
    NOT due to chance
  • 95% certainty that results will be the same with
    further testing

29
Confidence levels
  • Probability of error: the error that occurs if the
    null hypothesis is rejected when it is true and
    should not be rejected
  • Identified by Greek lowercase alpha, α
  • Researchers usually select α < 0.05
  • If confidence level is 95%, then probability of
    error (α) is 5%, or 0.05

30
Statistical Tests: Test Values and Critical Values
  • Test value: the result of a statistical test on
    your data.
  • Critical value: this is a reference value for
    each statistical test.
  • Your calculated statistical test value must
    exceed this value for you to reject the null
    hypothesis
  • You can find the critical value for each
    statistical test in publications and university
    websites. (links available on my website)
  • If you use Microsoft Excel for your statistics,
    the critical value will be given with the results.

31
Significance of t value
Determine the degrees of freedom:
df = (number in experimental group - 1) + (number in control group - 1)
df = (10 - 1) + (10 - 1) = 18
Determine significance of calculated t by looking
at table for critical t values:
Calculated t < critical t → not significant
Calculated t > critical t → is significant
At df = 18, critical t = 2.101. Calculated t of 1.24 <
2.101 and is not significant at the 0.05 level.
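
The decision rule on this slide can be sketched as two small helpers. The function names are assumptions for this example; the numbers (group sizes of 10, calculated t of 1.24, critical t of 2.101 at df = 18) are taken from the slide.

```python
def degrees_of_freedom(n_experimental, n_control):
    """df for a two-group t test: (n1 - 1) + (n2 - 1)."""
    return (n_experimental - 1) + (n_control - 1)


def is_significant(calculated_t, critical_t):
    """The result is significant only if |t| exceeds the critical value."""
    return abs(calculated_t) > critical_t


df = degrees_of_freedom(10, 10)
# Critical t of 2.101 is the table value quoted on the slide for df = 18
significant = is_significant(1.24, 2.101)

print("df =", df)
print("Reject the null hypothesis" if significant else "Do not reject the null hypothesis")
```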
32
Rejecting Null Hypothesis
  • If test value is not significant →
  • null hypothesis is NOT REJECTED
  • If test value is significant →
  • null hypothesis is REJECTED

33
Do Statistical Findings Support the Research
Hypothesis?
  • Null hypothesis was rejected
  • Research hypothesis was supported
  • (unless research hypothesis IS a null
    hypothesis)
  • Null hypothesis was not rejected
  • Research hypothesis was not supported

34
Summary: Steps of Hypothesis Testing
  1. State the null hypothesis and alternative
    hypothesis (claim)
  2. Choose the confidence level (95%) and sample size
  3. Collect the data and calculate the appropriate
    statistics
  4. Make the proper statistical inference

35
Populations of Study: Be careful what you claim!
  • Sample
  • specific portion of the population that is
    selected for the study (100 bean seedlings used
    in the study)
  • Sampled Population
  • population from which the sample was drawn (all
    the bean seedlings in the nursery from which the
    experimenter obtained their bean seedlings)
  • Target Population
  • ALL units (persons, things, experimental
    outcomes) of the specific group whose
    characteristics are being studied (all the bean
    seedlings of the same species)

36
Communicating Statistics: Effect of Stress on the
Mean Height of Bean Plants after 30 Days
Statistic | Stressed Group | Unstressed Group
Mean | 60.0 cm | 56.0 cm
Variance | 49.1 cm² | 60.7 cm²
Standard Deviation | 7.0 cm | 7.8 cm
1SD | 53.0 to 67.0 cm | 48.2 to 63.8 cm
2SD | 46.0 to 74.0 cm | 40.4 to 71.6 cm
Number | 10 | 10
Results of t test: t = 1.3, df = 18; t of 1.3 < 2.101; p > 0.10
37
(No Transcript)
38
Types of Tests
  • For Quantitative Data
  • Linear Regression
  • One-Way Analysis of Variance (ANOVA)
  • t Test
  • For Qualitative Data
  • Chi-Squared Test
  • Z Test

39
Linear Regression
  • Determines a linear relationship between two
    variables based on a correlation coefficient
  • H0: The number of yellow M&Ms is not related to
    the total number of M&Ms in the package.

40
ANOVA Test
  • Compares the means of more than two groups
  • H0: There is no significant difference between
    the numbers of M&Ms in plain packages, almond
    packages and peanut packages
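
A rough sketch of how the one-way ANOVA F statistic is calculated. The package counts below are hypothetical values made up purely for illustration (they are not from the presentation), and the function name `one_way_anova_f` is an assumption for this example.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(group) * (group_mean - grand_mean) ** 2
        for group, group_mean in zip(groups, group_means)
    )
    # Variation of each value around its own group mean
    ss_within = sum(
        (value - group_mean) ** 2
        for group, group_mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# HYPOTHETICAL counts of candies per package, for illustration only
plain = [51, 52, 53]
almond = [52, 53, 54]
peanut = [55, 56, 57]

f_value, df_between, df_within = one_way_anova_f(plain, almond, peanut)
print(f"F = {f_value:.2f}, df = ({df_between}, {df_within})")
```

As with the t test, the calculated F is compared with a critical value for the given degrees of freedom before the null hypothesis is rejected.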

41
t-Test
  • Compares the means of two independent groups
  • H0: There is no significant difference between
    the numbers of M&Ms in plain and peanut packages
  • Two-tail test determines if populations are not
    equal / the same (more difficult to support)
  • One-tail test determines if one mean is greater
    than the other (easier to support)

42
Chi-Squared Test
  • Determines if a proportion within a sample is
    larger than expected; can be used for more than
    two groups
  • H0: There are equal numbers of each color of M&M
    in a package.
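
A minimal sketch of the chi-squared calculation for this kind of null hypothesis. The color counts are hypothetical values made up for illustration, the function name `chi_squared` is an assumption, and 11.070 is the standard table critical value for 5 degrees of freedom at the 0.05 level.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL counts of six colors in one package, for illustration only
observed = [10, 8, 12, 9, 11, 10]

# Under H0 every color is equally likely, so each expected count is the same
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_squared(observed, expected)
df = len(observed) - 1
critical_value = 11.070  # chi-squared table value for df = 5, alpha = 0.05

print(f"chi-squared = {chi2:.2f}, df = {df}")
print("Reject H0" if chi2 > critical_value else "Do not reject H0")
```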

43
Z-Test
  • Compares proportions between two groups
  • H0: There are equal proportions of red M&Ms
    in plain and peanut packages
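
A minimal sketch of a two-proportion Z calculation using the pooled proportion. The counts are hypothetical values made up for illustration, the function name `two_proportion_z` is an assumption, and 1.96 is the standard two-tailed critical value at the 0.05 level.

```python
from math import sqrt


def two_proportion_z(count1, n1, count2, n2):
    """Z statistic comparing two sample proportions (pooled standard error)."""
    p1 = count1 / n1
    p2 = count2 / n2
    pooled = (count1 + count2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# HYPOTHETICAL counts, for illustration only:
# 12 red out of 60 in plain packages, 18 red out of 60 in peanut packages
z = two_proportion_z(12, 60, 18, 60)
critical_value = 1.96  # two-tailed critical Z at the 0.05 level

print(f"z = {z:.2f}")
print("Reject H0" if abs(z) > critical_value else "Do not reject H0")
```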

44
Selecting a Statistical Test
  • Things to consider
  • Number of groups of data
  • Type of data: Quantitative or Qualitative
  • Type of variable: numerical or categorical
  • The relationship in the null hypothesis being
    tested

45
Statistical Tests Review
  • Comparison of two variables for correlation →
    correlation coefficient test
  • Comparing means of more than two groups/levels →
    ANOVA test
  • Comparing two means → t-test
  • Comparison of proportions within a population →
    χ² (chi-squared) test
  • Comparison of proportions between populations → Z
    test

46
Key Questions for your Research
  • What kind of data will you need to collect to
    test your hypothesis? (Qualitative or
    Quantitative)
  • What kind of scale will you use?
  • How do you plan on analyzing this data?
  • Comparison of groups? What will you compare?
  • Look for a trend? What will you graph?
  • How many different levels will you need data for?
  • How many trials?
  • What relevant qualitative data will you look for
    that may also help you interpret results?