Title: Basic Statistics for Engineers.
1Basic Statistics for Engineers.
- Collection, presentation, interpretation and
decision making.
- Prof. Dudley S. Finch
2Statistics
- Four steps
- Data collection including sampling techniques
- Data presentation
- Data analysis
- Conclusions and decisions based on the analysis
3Data types
- Discrete
- Defined as
- A variable consisting of separate values, for
example the number of bolts in a packet. There
may be 8 or 9, but there cannot be 8.5.
- Continuous
- Defined as
- A variable which may have any value, for example
the diameter of steel bars after machining. Any
diameter is possible within the allowable
tolerance to which the machine is set.
4Sampling
- It is often not practical to examine every
component, so sampling techniques are used.
- The sample should be representative of the complete
set (the population) of values from which it has
been chosen.
- Although this cannot be guaranteed, we attempt to
choose an unbiased sample.
- To be unbiased, every possible sample must have an
equal chance of being chosen. This is satisfied if
the sample is chosen at random, that is, if there
is no order in the way the sample is chosen. This
is called a random sample.
5Random samples
- The larger the random sample, the more
representative of the population it is likely to
be.
- Random sampling can be carried out by allocating
a number to each member of the population and
then drawing numbered balls from a bag or using a
random number generator.
- Sampling techniques involve probability theory
(will be dealt with later).
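As a minimal sketch of the numbering approach described above, Python's standard `random` module can stand in for the bag of numbered balls. The population of 500 components and the sample size of 20 are hypothetical figures chosen for illustration; they do not come from the slides.

```python
import random

def draw_random_sample(population_size, sample_size, seed=None):
    """Return sample_size distinct member numbers, drawn at random without replacement."""
    rng = random.Random(seed)  # the seed only makes this illustration repeatable
    member_numbers = range(1, population_size + 1)  # one number per member
    return rng.sample(member_numbers, sample_size)

# Hypothetical figures, for illustration only
sample = draw_random_sample(population_size=500, sample_size=20, seed=1)
print(sorted(sample))
```

Drawing without replacement mirrors taking balls out of the bag one at a time: no member can be picked twice.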
6Data presentation
51.4 55.3 56.1 50.5 55.5
52.8 55.6 55.3 50.2 56.1
52.1 54.8 49.6 57.0 52.0
56.5 55.3 54.0 51.6 52.1
57.3 53.9 53.5 56.1 57.2
54.6 55.4 55.9 56.0 52.9
54.1 55.0 54.2 54.2 54.5
53.0 52.7 54.5 54.7 58.4
56.2 55.8 54.1 56.0 55.1
55.1 54.4 57.2 53.2 55.4
53.9 50.9 54.5 56.9 54.0
56.4 53.1 51.8 52.8 50.5
53.7 52.8 54.0 56.4 55.0
53.8
Measured weights of a casting (lbs).
7Frequency distribution
The class interval should be one that emphasizes
any pattern in the data. Typically between 8 and
15 class intervals should be chosen. In the
example used, a class interval of 1 lb is chosen.
The 50 lb class therefore includes 49.5 to 50.4 lb.
We can therefore compile a frequency distribution
table.
Mass of casting (lb)              50  51  52  53  54  55  56  57  58
Number of castings (frequency) f   2   4   5   8  13  15  12   6   1
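The frequency table can be reproduced from the 66 measured weights. The sketch below assumes each 1 lb class is centred on a whole pound, so that 49.5 up to 50.4 falls in the 50 lb class; the names `weights`, `class_of` and `frequency` are illustrative.

```python
import math
from collections import Counter

# Measured weights of the castings (lb), copied from the data slide
weights = [
    51.4, 55.3, 56.1, 50.5, 55.5,
    52.8, 55.6, 55.3, 50.2, 56.1,
    52.1, 54.8, 49.6, 57.0, 52.0,
    56.5, 55.3, 54.0, 51.6, 52.1,
    57.3, 53.9, 53.5, 56.1, 57.2,
    54.6, 55.4, 55.9, 56.0, 52.9,
    54.1, 55.0, 54.2, 54.2, 54.5,
    53.0, 52.7, 54.5, 54.7, 58.4,
    56.2, 55.8, 54.1, 56.0, 55.1,
    55.1, 54.4, 57.2, 53.2, 55.4,
    53.9, 50.9, 54.5, 56.9, 54.0,
    56.4, 53.1, 51.8, 52.8, 50.5,
    53.7, 52.8, 54.0, 56.4, 55.0,
    53.8,
]

def class_of(weight):
    """Assign a weight to its 1 lb class, e.g. 49.5 up to (not including) 50.5 -> 50."""
    # floor(w + 0.5) is used instead of round(), which rounds halves to the
    # nearest even number in Python and would put 50.5 in the 50 lb class.
    return math.floor(weight + 0.5)

frequency = Counter(class_of(w) for w in weights)

for mass in sorted(frequency):
    print(mass, frequency[mass])
```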
8Bar chart
9Histogram
10Frequency polygon
11Frequency curve
12Pie chart showing relative frequency
Relative frequency = class frequency / total
frequency of the sample, e.g. the relative
frequency of the 53 lb class is 8/66 or 0.121
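The same calculation for every class, as a short sketch using the frequency table of the castings example (the variable names are illustrative):

```python
# Frequency table from the castings example: class mass (lb) -> number of castings
frequency = {50: 2, 51: 4, 52: 5, 53: 8, 54: 13, 55: 15, 56: 12, 57: 6, 58: 1}

total = sum(frequency.values())  # total frequency of the sample

# Relative frequency = class frequency / total frequency
relative_frequency = {mass: f / total for mass, f in frequency.items()}

for mass, rel in relative_frequency.items():
    print(f"{mass} lb: {rel:.3f}")
```

The relative frequencies sum to 1, which is why they map directly onto the sectors of a pie chart.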
13Numerical methods of a distribution
- A frequency distribution can be represented by
two numerical quantities
- Central tendency or average value of the
distribution
- Dispersion or scatter of variables about the
average value
14Numerical measures of central tendency
- Mid point of range
- The value halfway between the smallest and largest
values of the variable, i.e. (smallest + largest) / 2
- Generally a poor measure of central tendency since
it depends only on the extreme values of the
variable and is not influenced by the form of the
distribution.
- Mode
- The most frequently occurring value of the
variable
- Easily obtained from the frequency table. For the
castings the mode is 55 lb.
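Both measures can be read off the frequency table, as in this sketch. The mid point of the range is taken as halfway between the smallest and largest values. Working from class values rather than the raw weights is an assumption made here for brevity.

```python
# Frequency table from the castings example: class mass (lb) -> number of castings
frequency = {50: 2, 51: 4, 52: 5, 53: 8, 54: 13, 55: 15, 56: 12, 57: 6, 58: 1}

# Mid point of range: depends only on the two extreme classes
smallest, largest = min(frequency), max(frequency)
mid_point_of_range = (smallest + largest) / 2

# Mode: the class with the highest frequency
mode = max(frequency, key=frequency.get)

print("mid point of range:", mid_point_of_range)
print("mode:", mode)
```

The mid point of the range would not change if the castings between the extremes were distributed quite differently, which is why it is a poor summary.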
15- Arithmetic mean
- Determined by adding all the values of the
variable and dividing this by the total number of
values. If x1, x2, x3, ..., xN are the N values then
the mean is
$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_N}{N}$
17For frequency distribution tables
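The formula from this slide is not in the transcript. A reasonable reading is the usual grouped-data mean, in which each class value x is weighted by its frequency f and the total is divided by the sum of the frequencies. The sketch below applies that assumption to the castings table.

```python
# Frequency table from the castings example: class mass x (lb) -> frequency f
frequency = {50: 2, 51: 4, 52: 5, 53: 8, 54: 13, 55: 15, 56: 12, 57: 6, 58: 1}

sum_f = sum(frequency.values())                    # total number of castings
sum_fx = sum(x * f for x, f in frequency.items())  # sum of f times x over all classes

mean = sum_fx / sum_f
print(f"mean = {sum_fx}/{sum_f} = {mean:.1f} lb")
```

This gives 54.3 lb to one decimal place, the sample mean quoted later in the castings example.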
18To calculate standard deviation
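The formula from this slide is not in the transcript either. The sketch below assumes the standard deviation is computed from the frequency table as the square root of the frequency-weighted mean squared deviation. Dividing by N reproduces the 1.83 lb quoted in the castings example; dividing by N - 1, the usual sample form, is shown for comparison.

```python
import math

# Frequency table from the castings example: class mass x (lb) -> frequency f
frequency = {50: 2, 51: 4, 52: 5, 53: 8, 54: 13, 55: 15, 56: 12, 57: 6, 58: 1}

n = sum(frequency.values())
mean = sum(x * f for x, f in frequency.items()) / n

# Sum of squared deviations from the mean, weighted by class frequency
sum_sq_dev = sum(f * (x - mean) ** 2 for x, f in frequency.items())

sd_divide_by_n = math.sqrt(sum_sq_dev / n)                # assumed form; matches 1.83
sd_divide_by_n_minus_1 = math.sqrt(sum_sq_dev / (n - 1))  # sample form, for comparison

print(f"dividing by N:     {sd_divide_by_n:.2f} lb")
print(f"dividing by N - 1: {sd_divide_by_n_minus_1:.2f} lb")
```

With 66 values the two forms differ only in the second decimal place, so the choice matters little here but grows in importance for small samples.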
20Estimation
- Addresses the problem of obtaining information
about the population from which the sample was
drawn, and of setting up a mathematical model to
describe this population.
- Two components: estimation, and testing of
hypotheses about the chosen model.
21Two types of estimates
- Point estimate
- Estimate of a population parameter expressed as a
single number
- This method gives no indication as to the
accuracy of the estimate
- Interval estimate
- Estimate of a population parameter expressed as
two numbers
- This method is preferable as it gives an
indication as to where the population parameter
is expected to lie
22Confidence intervals
- In practice, the true standard deviation, σ, is
unknown and the sample standard deviation, s, is
used to estimate σ.
- If a random sample of size n is drawn, an estimate
of the standard error of the sample mean is
given by $s / \sqrt{n}$.
- We need to determine the confidence interval for
the true mean, μ.
- For n > 30 a good approximation can be obtained.
For small samples a wider interval is used.
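As a quick check of the standard error estimate, the sketch below divides the sample standard deviation by the square root of the sample size, using the figures quoted for the castings example (s = 1.83 lb, n = 66). The function name is illustrative.

```python
import math

def standard_error(s, n):
    """Estimated standard error of the sample mean: s divided by the square root of n."""
    return s / math.sqrt(n)

# Figures quoted in the castings example
se = standard_error(s=1.83, n=66)
print(f"standard error = {se:.3f} lb")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.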
23Use of Student t-distribution tables
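- Look up the value of t for (n-1) degrees of
freedom at the desired confidence level (tail area
0.01 = 98%, 0.005 = 99%, 0.001 = 99.8%, etc.).
- Find
- The true mean μ = sample mean ± t½α,n-1 × standard
error of the mean, i.e.
$\mu = \bar{x} \pm t_{\alpha/2,\,n-1} \, \frac{s}{\sqrt{n}}$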
24For castings example
Sample mean = 54.3 lb. Standard deviation, s =
1.83 lb. n = 66. Using t0.005,65 the true mean μ
is given by 54.3 ± 2.66 × 0.225 = 54.3 ± 0.599.
Thus we can be 99% confident that the true mean
lies between 53.7 and 54.9 lb.
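The arithmetic of the castings example can be sketched as follows. The t value of 2.66 is taken from the slide as given, not computed; looking it up programmatically would need a statistics library and is left out of this sketch. The function name is illustrative.

```python
import math

def confidence_interval(sample_mean, s, n, t_value):
    """Return (lower, upper, half_width) for the true mean using a tabulated t value."""
    half_width = t_value * s / math.sqrt(n)  # t times the standard error of the mean
    return sample_mean - half_width, sample_mean + half_width, half_width

# Figures from the castings example; t_value is the tabulated figure used on the slide
lower, upper, half_width = confidence_interval(
    sample_mean=54.3, s=1.83, n=66, t_value=2.66
)
print(f"54.3 ± {half_width:.3f} -> {lower:.1f} to {upper:.1f} lb")
```

A higher confidence level means a larger t value and therefore a wider interval for the same sample.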