Title: Laboratory in Oceanography: Data and Methods
Computing Basic Statistics
- MAR599, Spring 2009
- Miles A. Sundermeyer
Computing Basic Statistics (see also http://www.mathworks.com/access/helpdesk/help/toolbox/stats/)
- Basic Statistics
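- Let $a = (a_1, a_2, \ldots, a_n)$ be a data sample of $n$ observations.
- min, max - absolute minimum and maximum values of the data (see also nanmin, nanmax)
- mean - the arithmetic average of the data, $\bar{a} = \frac{1}{n} \sum_{i=1}^{n} a_i$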
- median - the value separating the higher half of a sample, population, or probability distribution from the lower half
- mode - the value that has the largest number of observations
- Note: The mode function is most useful with discrete or coarsely rounded data. The mode for a continuous probability distribution is defined as the peak of its density function. Application to a sample from such a distribution is unlikely to provide a good estimate of the peak; it would be better to compute a histogram or density estimate and calculate the peak of that estimate. Also, the mode function is not suitable for finding peaks in distributions with multiple modes.
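The functions listed above can be tried on a small vector. The following is a minimal sketch, assuming a made-up temperature sample with one missing value (the data and all variable names are hypothetical, not from the course). It also illustrates the note on the mode by taking the peak of a histogram rather than calling mode on continuous data.

```matlab
% Hypothetical sample: temperatures (deg C) with one missing value (NaN).
a = [12.1 11.8 NaN 12.4 12.1 13.0 11.8 12.1];

% mean and median return NaN if any element is NaN, so drop NaNs first.
good = a(~isnan(a));

amin  = min(good);      % smallest observation
amax  = max(good);      % largest observation
amean = mean(good);     % arithmetic mean, sum(good)/numel(good)
amed  = median(good);   % middle value of the sorted sample
amode = mode(good);     % most frequently occurring value

% For continuous data, estimate the mode from a histogram instead.
[counts, centers] = hist(good, 5);   % 5 equal-width bins
[cmax, imax] = max(counts);          % most populated bin
peak = centers(imax);                % its center approximates the mode
```

The number of bins (5 here) is an arbitrary choice; the estimated peak will shift with the bin width.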
- Variance and Standard Deviation
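- var - variance of a, $\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (a_i - \bar{a})^2$, where $\bar{a}$ is the mean of a and $n$ is the number of observations
- Note: The above calculation of variance is biased low; for the unbiased variance, one must normalize by (n-1). By default, Matlab computes the unbiased variance, i.e., $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (a_i - \bar{a})^2$
- std - standard deviation of a, sqrt(var(a))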
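The difference between the two normalizations can be checked directly. This is a short sketch on a hypothetical eight-point sample (the data and variable names are illustrative assumptions); the second argument of var selects the normalization.

```matlab
a = [2 4 4 4 5 5 7 9];      % hypothetical sample, n = 8
n = numel(a);

v_unbiased = var(a);        % default: normalizes by (n-1)
v_biased   = var(a, 1);     % second argument 1: normalizes by n

% The same quantities written out from the definitions.
v_manual_unbiased = sum((a - mean(a)).^2) / (n - 1);
v_manual_biased   = sum((a - mean(a)).^2) / n;

s = std(a);                 % equals sqrt(var(a))
```

For large n the two estimates converge, since they differ only by the factor n/(n-1).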
- Percentiles
- prctile - A percentile (or centile) is the value of a variable below which a certain percent of observations fall; e.g., the 20th percentile is the value (or score) below which 20 percent of the observations may be found.
- cdfplot - Plots the cumulative distribution function (CDF) of the observations in the data sample vector.
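As a quick check of the definition, the sketch below computes the 20th percentile of a hypothetical random sample and confirms that roughly 20 percent of the observations fall below it. It assumes the Statistics Toolbox is installed (prctile and cdfplot are toolbox functions); the hand-built empirical CDF at the end uses only base Matlab.

```matlab
a = randn(1000, 1);             % hypothetical sample of 1000 observations

p20 = prctile(a, 20);           % 20th percentile (Statistics Toolbox)
frac_below = mean(a < p20);     % fraction of observations below p20

% Empirical CDF by hand: sorted values against cumulative fraction.
x = sort(a);
F = (1:numel(a))' / numel(a);

% To view the result, either of the following could be used:
% plot(x, F)      % hand-built empirical CDF
% cdfplot(a)      % toolbox equivalent
```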
- Normality
- Gaussian - The Gaussian or Normal distribution is the probability distribution function given by $f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \bar{x})^2}{2\sigma^2} \right)$, where $\bar{x}$ is the mean and $\sigma$ is the standard deviation
- Noteworthy features of Gaussian distributions:
- symmetry about its mean
- the mode and median both equal the mean
- the inflection points of the distribution curve occur one standard deviation away from the mean, i.e., at $(\bar{x} - \sigma)$ and $(\bar{x} + \sigma)$
- Numerous tests for normality exist, e.g., Lilliefors, Kolmogorov-Smirnov, Jarque-Bera, and others, as well as many more general tests for comparing distributions.
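The Gaussian density and the tests named above can be explored with a few lines of Matlab. This is a sketch under stated assumptions: the mean, standard deviation, and sample size are hypothetical, and lillietest, jbtest, and kstest require the Statistics Toolbox. Each test returns h = 0 when normality cannot be rejected at the default 5% significance level.

```matlab
xbar = 0; sigma = 2;            % hypothetical mean and standard deviation

% Evaluate the Gaussian density on a grid spanning +/- 6 standard deviations.
x = linspace(xbar - 6*sigma, xbar + 6*sigma, 2001);
f = exp(-(x - xbar).^2 / (2*sigma^2)) / (sigma*sqrt(2*pi));

area = trapz(x, f);             % numerical integral, should be close to 1
[fmax, imax] = max(f);
xpeak = x(imax);                % the peak (mode) sits at the mean

% Draw a hypothetical sample and apply three normality tests.
a = xbar + sigma*randn(500, 1);
h_lillie = lillietest(a);       % Lilliefors test
h_jb     = jbtest(a);           % Jarque-Bera test
% kstest compares against a standard normal, so standardize first.
% Estimating the mean and std from the same data makes this test
% approximate; the Lilliefors test corrects for that.
h_ks     = kstest((a - mean(a)) / std(a));
```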
Computing Basic Statistics: Useful Tidbits
- Useful Tidbits
- if <expression>
-   statements
- elseif <expression>
-   statements
- else
-   statements
- end
e.g.,
    n = rand(1);
    if (n > 0.5)
        disp('heads')
    elseif (n < 0.5)
        disp('tails')
    else
        disp('neither!')
    end
- while <expression>
-   statements
- end
e.g.,
    n = rand(1);
    while n >= 0.5
        if (n > 0.5)
            disp('heads')
        elseif (n < 0.5)
            disp('tails')
        else
            disp('neither!')
            break
        end
        n = rand(1);
    end
Gaussian Distribution