Numbers, statistics, and Biology - PowerPoint PPT Presentation

About This Presentation
Title:

Numbers, statistics, and Biology

Slides: 43
Provided by: nbo1

Transcript and Presenter's Notes

Title: Numbers, statistics, and Biology


1
BIODIVERSITY and ENVIRONMENTAL MANAGEMENT
Dr Ole R. Vetaas, ole.vetaas_at_global.uib.no
UNIFOB - Global, University of Bergen, www.global.uib.no
Ecology and Environmental Change Research Group, Dept. of Biology, UiB, www.eecrg.uib.no
2
Numbers, statistics, and Biology
  • WHY statistical analyses?

3
SCIENCE-BASED MANAGEMENT
  • The value of the degree you may obtain depends on
  • A GOOD THESIS
  • A good research question
  • A good research question relies on theory and
    observations
  • Facts, or factual observations
  • Theory: facts linked in a causal, logical
    framework

4
STRUCTURE OF A THESIS
  • INTRODUCTION
  • MATERIALS AND METHODS
  • RESULTS: text, tables, and figures
  • DISCUSSION
  • References
  • Appendices, raw data

5
STRUCTURE OF A THESIS as a basic introduction
to philosophy of science
  • INTRODUCTION
  • Other researchers' observations
  • Theory
  • Models
  • Deduction
  • Hypotheses
  • Aims: to test or evaluate hypotheses about nature

6
Hypothesis formulation
  • Everything is connected to everything
  • YES, but the strength of the connections differs
  • From strong causal links to indifferent ones
  • Science is to find the strongest connections and
    evaluate the plausible causations
  • If we know the causal links, we understand the
    system better

7
HYPOTHESIS
  • Hypotheses are potential statements that answer
    research questions
  • Example
  • Are there more birds in the mid-hills than in
    the tropical lowland?
  • Hypothesis: there are more birds in the mid-hills
    than in the tropical lowland.
  • Where is the theory?

8
THEORY
9
THEORY
  • Habitat diversity or heterogeneity theory
  • Heterogeneity in topography increases surface
    area
  • Microclimate: south- and north-exposed slopes
  • Creates many habitats
  • Mixture of forest and open meadows suitable for
    birds
  • Model: topography → microclimate → many habitats
    → DEDUCE a HYPOTHESIS
  • Hypothesis: MORE BIRDS in the mid-hills than in
    FLAT AREAS
  • (H0 (null hypothesis): no difference in bird
    richness due to habitat diversity)

10
TESTING the HYPOTHESIS
  • Hypothesis: there are more birds in the mid-hills
    than in the tropical lowland
  • Prediction: more birds in the mid-hills than in
    the tropical lowland
  • Collection of data with a scientific method
    (repeatable for others)
  • FIGURE: B represents bird richness along the
    elevation gradient
  • Our hypothesis is falsified or rejected
  • IS THIS GOOD!!?

[Figure: bird richness along the elevation gradient, from Low to High]
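One way to confront such a prediction with field data is a one-sided permutation test. The sketch below is only an illustration: the species counts are invented, the function name `permutation_test_greater` is hypothetical, and the test makes no claim about the data behind the slide's figure (where the hypothesis was rejected).

```python
import random
from statistics import mean

def permutation_test_greater(group_a, group_b, n_perm=5000, seed=0):
    """One-sided permutation test: is mean(group_a) > mean(group_b)?

    Returns the observed difference in means and the share of random
    relabellings whose difference is at least as large (the p-value).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so p is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical bird species counts per plot (invented for illustration).
mid_hills = [31, 28, 35, 30, 33, 29, 36, 32]
lowland = [22, 25, 19, 24, 27, 21, 23, 26]

diff, p = permutation_test_greater(mid_hills, lowland)
print(f"difference in mean richness: {diff:.2f}, one-sided p = {p:.4f}")
```

A small p-value here means H0 (no difference) is hard to maintain for these plots; a large one means the prediction failed, which is exactly the outcome a falsifiable hypothesis must allow.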
11
Hypothetico-deductive method
  • Hypotheses should be deduced from a model
  • The model is based on rational theory that
    describes a certain phenomenon
  • It must be possible to falsify these hypotheses
  • That is, they can be wrong
  • A hypothesis that is always correct is not a
    good hypothesis

12
What is science?
Karl Popper
  • Science and non-science
  • Real science should be able to formulate testable
    hypotheses
  • It must be possible to falsify these hypotheses
  • Falsification criteria!!

13
History and philosophy of science
  • The 20th century
  • Hypothesis testing program
  • Numerical methods aimed to test hypotheses
    were developed by
  • Pearson (correlation)
  • R. A. Fisher (t-test)
  • Karl Popper laid the theoretical basis for
    hypothesis testing, which utilised the
    developments in statistical science.

14
STRUCTURE OF A THESIS
  • INTRODUCTION
  • MATERIALS AND METHODS
  • RESULTS: text, tables, and figures
  • DISCUSSION
  • References
  • Appendices, raw data

15
Field study
  • Several different causes
  • Several interactions
  • Feedback loops: cause and effect reinforcing the
    original cause
  • True interdisciplinary work is a new paradigm

16
Data that may indicate causation
  • Isolate the potential causes
  • Effect of Land use on Biodiversity
  • Same slope inclination
  • Same aspect
  • Same type of soil

17
STRUCTURE OF A THESIS
  • INTRODUCTION
  • MATERIALS AND METHODS
  • RESULTS: text, tables, and figures illustrating
    statistical results
  • DISCUSSION
  • References
  • Appendices, raw data

18
NUMERICAL METHODS: What is statistical analysis?
  • Statistics is a branch of mathematics, which is a
    special language
  • This special language is international
  • Statistical methods should be viewed as tools
    for analysing and presenting data in a
    standardised way

19
Statistical expressions elucidate the results
for an international audience
  • repeatable for other researchers
  • the procedure is open to argument
  • part of the progressive scientific process
  • enhances the degree of objectivity in the analysis

20
STRUCTURE OF A THESIS
  • INTRODUCTION
  • MATERIALS AND METHODS
  • RESULTS: text, tables, figures and graphs that
    illustrate statistical results
  • DISCUSSION
  • References
  • Appendices, raw data

21
Discussion
  • What did I find?
  • This agrees with other findings
  • Other researchers found contrasting results
  • What is the reason for agreement and disagreement?
  • What is the main causal factor?
  • INTERPRETATION
  • Ecologically sound reasoning
  • Statistical significance may differ from
    biological significance (species richness example)
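The last point can be made concrete with a small simulation. This is a sketch with invented numbers, not the slide's species richness example: two hypothetical sites whose true mean richness differs by only 0.3 species, each sampled with an unrealistically large number of plots. The large-sample z-test used here is an assumption chosen to keep the example within the Python standard library.

```python
import math
import random
from statistics import NormalDist, mean, variance

def two_sample_z(a, b):
    """Large-sample z-test for a difference in means (two-sided p-value)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, p

rng = random.Random(1)
# Hypothetical richness per plot; the true means differ by only 0.3 species.
site_a = [rng.gauss(20.3, 3.0) for _ in range(20000)]
site_b = [rng.gauss(20.0, 3.0) for _ in range(20000)]

diff, p = two_sample_z(site_a, site_b)
print(f"difference = {diff:.2f} species, p = {p:.2g}")
```

With this many plots the p-value is tiny, yet a difference of a fraction of one species is unlikely to matter ecologically. Whether it matters is a question for the discussion, not for the test.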

22
INTERPRETATION
  • QUANTITATIVE METHODS DO NOT NECESSARILY GIVE
    CLEAR-CUT RESULTS
  • INTERPRETATION
  • QUALITATIVE EVALUATION OF PLAUSIBLE EXPLANATIONS
  • YOU WILL ALSO LEARN QUALITATIVE METHODS IN BERGEN

23
Universities, faculties and tradition
Natural Sciences
FIELD SCIENCES: BIOLOGY, GEOLOGY, GEOGRAPHY
Arts and Humanities
24
The Scientific Field-method
  • RESEARCH PROCEDURE IN MOST FIELD STUDIES
  • 1. THEORY: state of the art, what do we know?
  • 2. Research question
  • 3. Hypothesis or hypotheses
  • 4. Collect data in the field
  • 5. Qualitative and quantitative analyses
    confronting the hypothesis with the field data
  • 6. Interpret the results and explain them to the
    scientific community and the public

25
Good results depend on
  • Testable hypothesis,
  • Good sampling design
  • Reduced set of potential causal factors

26
Numerical methods aimed to test hypotheses
  • must fulfil certain assumptions in order to be
    valid tests
  • these are difficult to fulfil when the hypothesis
    relates to processes in the real world,
  • i.e. the field situation

27
THE DIFFERENCE BETWEEN IDEAL AND REAL WORLD
IDEAL WORLD (LAB.)                     | REAL WORLD
ASSUMPTIONS                            | IN NATURE / LANDSCAPE
---------------------------------------+---------------------------------------------
ONE RESPONSE VARIABLE                  | MANY INTERACTIVE RESPONSE VARIABLES
ONE OR A SET OF FEW INDEPENDENT        | MANY INTERACTIVE EXPLANATORY VARIABLES
EXPLANATORY VARIABLES                  |
NORMAL DISTRIBUTION OF VARIABLES       | SKEWED DISTRIBUTION OF VARIABLES
AND ERROR                              | AND ERROR
REPRESENTATIVE SAMPLE                  | GEOGRAPHICAL LIMITATION
INDEPENDENT SAMPLES                    | SPATIAL OR TEMPORAL AUTOCORRELATED SAMPLES,
                                       | DEPENDENT
INDEPENDENT VARIABLES                  | PHYLOGENETICAL RELATIONSHIPS, DEPENDENT
28
Statistics originally made for experimental
situations
Most species on the north or south slope?
29
A test of the null hypothesis by inferential
statistics is in a strict sense only valid if
  • There is only one causal factor that causes the
    investigated changes in the response variable, or
    there is a set of few independent causal factors
    that cause the response.

30
Statistics originally made for experimental
situations
Test tubes with green algae: add a toxic element
to test its effect
Add the toxic element in different doses
Control
31
Real world
Many factors influence species richness
32
Normal or Gaussian distribution
  • The variables of interest have a normal
    distribution, or the residual error after a
    regression should have a normal distribution.
  • Biological variables may take many forms of
    distribution, e.g. skewed or bimodal.

33
NORMAL DISTRIBUTION
  • MISCONCEPTION REGARDING THE EXPLANATORY VARIABLE
  • EXPERIMENTAL DESIGN: UNIFORM DISTRIBUTION
  • RESPONSE VARIABLES HAVE TO BE NORMAL IF WE ARE
    COMPARING MEANS OF TWO OR MORE POPULATIONS BY
    t-TEST OR ANOVA
  • IN CLASSICAL ANOVA AND REGRESSION, RESIDUALS HAVE
    TO BE NORMALLY DISTRIBUTED
  • GENERALIZED LINEAR MODELS OR GENERALIZED ADDITIVE
    MODELS CAN COPE WITH VARIOUS DISTRIBUTIONS
34
[Figure: dry weight of wheat (g per m2) against mean growing season temperature]
35
ALWAYS CHECK RESIDUALS AFTER REGRESSION!!
[Figure: two panels of residuals after regression, axes from -2 to 2; one panel labelled "ok", the other "Missing factor or wrong factor"]
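A minimal sketch of such a residual check, assuming a straight line is fitted to a response that is actually humped. The data are invented (a yield-against-temperature curve in the spirit of the wheat figure) and the helper names `fit_line` and `residuals` are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def residuals(x, y):
    """Observed minus fitted values for the straight-line fit."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical data: a humped response, peaking at 16 degrees.
temperature = [8, 10, 12, 14, 16, 18, 20, 22, 24]
yield_humped = [100 - (t - 16) ** 2 for t in temperature]

res = residuals(temperature, yield_humped)
third = len(res) // 3
low, mid, high = res[:third], res[third:-third], res[-third:]
print("mean residual (low, mid, high):",
      round(mean(low), 1), round(mean(mid), 1), round(mean(high), 1))
```

Residuals that are negative at both ends and positive in the middle show a systematic pattern, which corresponds to the "missing factor or wrong factor" panel: the straight line is the wrong model, whatever its p-value says.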
36
IN CLASSICAL ANOVA AND REGRESSION, RESIDUALS HAVE
TO BE NORMALLY DISTRIBUTED
GENERALIZED LINEAR MODELS OR GENERALIZED ADDITIVE
MODELS CAN COPE WITH VARIOUS DISTRIBUTIONS
37
NORMALITY: t-TEST, DIFFERENCE OF MEANS
Equal variance
[Figure: frequency of plots with X number of species on south and north slopes, with the mean number of species marked for each slope]
38
NORMALITY: t-TEST, DIFFERENCE OF MEANS
Non-equal variance
[Figure: frequency of plots with X number of species on south and north slopes, with the mean number of species marked for each slope]
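The contrast between the equal-variance and non-equal-variance cases can be sketched by computing both the classical pooled t statistic and Welch's version. The slides do not name Welch's test; it is brought in here as a common choice when variances differ. The species counts are invented, and only the statistic and degrees of freedom are computed, since the standard library has no t distribution.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t statistic and df, assuming equal variances."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite df (unequal variances)."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical species counts per plot: south slope very variable, north slope not.
south = [12, 30, 8, 27, 15, 33, 10, 25]
north = [14, 15, 16, 15, 14, 16, 15, 15, 14, 16, 15, 15]

t_p, df_p = pooled_t(south, north)
t_w, df_w = welch_t(south, north)
print(f"pooled: t = {t_p:.2f}, df = {df_p}")
print(f"Welch:  t = {t_w:.2f}, df = {df_w:.1f}")
```

When the variances differ, Welch's degrees of freedom fall below the pooled value, so the same difference in means is judged more cautiously.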
39
Representative sample
The objects sampled are a representative sample
of the total population. Normally, a very small
fraction of the total population of the target
organism is sampled in biological field studies.
Thus in a strict sense the analyses will be site
conditional, and the result may not be inferred
to be valid for the total population.
40
REPRESENTATIVE SAMPLE
  • > Random sampling
  • > State how big the total population is that one
    aims to infer about
  • ESTIMATE THE NICHE OF AN ORGANISM, WITH DATA
    FROM A LIMITED REGION
  • CAN WE INFER THAT THE RESULT IS VALID FOR THE
    SPECIES?
  • HYPOTHESIS ABOUT THE TOTAL POPULATION, I.E. THE
    SPECIES
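Random sampling can be sketched as drawing plots from a grid laid over the study area. The 10 x 10 grid, the sample size and the function name `sample_grid_cells` are all assumptions made for illustration.

```python
import random

def sample_grid_cells(n_rows, n_cols, n_samples, seed=None):
    """Draw grid cells at random without replacement.

    Every cell has the same chance of selection, which makes the sample
    representative of the gridded area, and only of that area.
    """
    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    return random.Random(seed).sample(cells, n_samples)

# Hypothetical study area divided into a 10 x 10 grid; 15 plots are drawn.
plots = sample_grid_cells(n_rows=10, n_cols=10, n_samples=15, seed=42)
print(sorted(plots))
```

The grid defines the population one can infer about. A result from these plots says nothing certain about the species outside the gridded region.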

41
Popperian falsification
The classical Popperian falsification cannot be
done without some ad hoc adjustment of the
degrees of freedom
MAJOR PROBLEM IN CAUSAL ANALYSES, NOT IN
DESCRIPTIVE ANALYSES BY MEANS OF NUMERICAL METHODS
SOFTWARE AND NUMERICAL OPTIONS EXIST TO COPE WITH
THIS PROBLEM
SAMPLING MOST IMPORTANT: MAKE A GRID!
Comparative biology: phylogenetic relationships
among the objects
42
SPATIAL AUTOCORRELATION, DISTANCE DECAY
All objects have a geographical distribution;
objects located close to each other are on
average more similar than those located further
apart.
Hence the similarity among objects is a function
of their distance to each other: spatial
dependency.
Hence the objects are not independent of each
other.
The degrees of freedom are not equal to the
number of samples minus one (df = n - 1 is not
true).
Thus the statistical p-value cannot be used to
evaluate whether the hypothesis is falsified or
not.
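How far the degrees of freedom shrink can be sketched with a common approximation, n_eff = n(1 - r)/(1 + r), where r is the lag-1 autocorrelation. This formula assumes a first-order autoregressive structure along a transect and is not taken from the slides; the richness values are invented.

```python
def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of values ordered along a transect."""
    n = len(values)
    m = sum(values) / n
    num = sum((values[i] - m) * (values[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in values)
    return num / den

def effective_sample_size(n, r1):
    """Approximate number of independent samples under an AR(1) assumption."""
    r1 = max(r1, 0.0)  # negative estimates are treated as no autocorrelation here
    return n * (1 - r1) / (1 + r1)

# Hypothetical species counts in 12 neighbouring plots along a transect.
richness = [10, 11, 12, 14, 15, 15, 16, 18, 19, 19, 21, 22]

r1 = lag1_autocorrelation(richness)
n_eff = effective_sample_size(len(richness), r1)
print(f"n = {len(richness)}, lag-1 r = {r1:.2f}, effective n = {n_eff:.1f}")
```

Neighbouring plots that resemble each other carry less information than the same number of independent plots, so a test that uses df = n - 1 overstates the evidence.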