Title: Two-stage Cluster Sampling When Clusters are of Unequal Size
1Cluster Analysis
2- Cluster analysis, a term first used by Tryon (1939), encompasses a
number of different algorithms and methods for
grouping objects of similar kind into respective
categories.
3 ???? ??
????????????,?????????(???)????? ????????????????
??,???????????????????(homogeneity),??????????????
????
4 ???????????????????????,???????????????
5- ??????
- ????
- ???
- ???
- ???
- ???
- ???
6- ???????
- ???????????????????????????????????,??????????????
,????????????? - ??????????????????????,?????????N?????????????????
????????? -
7??????
- ????????????,??????????????(Euclidean Distance)
- ??N????,??????M???,??X?NM?????,???????????
8dij
????????????,????????????????????????,??????0,????
?1?
9??????
- ??????????????????,???????????,??????????????????
- ??????????????????(matching coefficient)???
10Ex ?i?j???????(1???????,0????????)
11??????
12- ?????????
- ???? (non-hierarchical)????????????????????,?????
??
a. ?????? (sequential threshold)
?????,???????????,????????,???????????????????????
????????????????????,???????????????????,????????
?
13b.??????(paralleled threshold)
???????????????????????,???????,???????????????,??
???????????????,?????(???)???????????
c.?????(optimizing partitioning) ????????
(???????????) ???,????????,????? (criterion
measure) ????????
14d.????(K-means Method) ??????????????,??????????
??K???,????????????????,??????????????????????????
??????????????,??????????????????????,????????????
??????
15??? (hierarchical)????? ??????????,?????????????
???????,?????????????????
????????,?????????????,?????
16??????????????????,??????????,????????????,???????
??????????????????????? ???????????????????????,?
????????????,????????,??????????
17?K-means???????
1.?????????K????? 2.?????????????(???)??(????????)
,?????????????????????????????????????????????????
??? 3.??????,?????????????????????
18Ex????????????????
??????????????????,????1,2?????3,4?,??????????????
???
???1,2? ?? ?3,4?
19X2
X2
????????????????????,??????????????
?D21?1,2?(12-2)2(8-6)2104
??????????4????3,4?????,??????????2????3,4??????
,?????????3,4???????????1???2,3,4?,????????
20???1? ???2,3,4?
X112 X1
X28 X2
????????????1?????2,3,4??????
21????????1????1????????2,3,4????2,3,4??????,?????
??????,???K2???,??????1?????2,3,4??
22Two-stage Cluster Sampling When Clusters are of
Unequal Size
- p = desired sample proportion = n/N
- a = desired number of clusters selected in the 1st stage
- A = total number of clusters
- b = sample size within each cluster selected
- Ni = number of elements in cluster i
23Simple Two-stage Cluster Sampling
- The first-stage prob.: p1 = a/A
- The second-stage prob.: p2 = p/(a/A)
- Sample size in cluster i: ni = p2 * Ni
24Probability Proportional to Size
where
25Example
- Draw a sample of 1,000 households from a city
that contains about 200,000 households
distributed among 2000 blocks of unequal but
known size.
- The desired sample proportion = 1/200
- The desired number of clusters selected in the 1st
stage = 100
- How do we conduct the two-stage cluster sampling?
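One way to work the example is sketched below. The simple two-stage rule follows the formulas on the earlier slide (p1 = a/A, p2 = p/p1). The equations on the probability-proportional-to-size slide did not survive, so the PPS version here assumes the common textbook form (p1 = a*Ni/N, p2 = b/Ni with b = n/a); the function names and the block size of 150 are hypothetical.

```python
from fractions import Fraction

def simple_two_stage(p, a, A):
    """Equal-probability first stage: return (p1, p2) with p = p1 * p2."""
    p1 = Fraction(a, A)      # chance that any block is selected
    p2 = p / p1              # sampling rate inside a selected block
    return p1, p2

def pps_two_stage(n, a, N, N_i):
    """PPS first stage (assumed textbook form): return (p1, p2, b)."""
    b = Fraction(n, a)            # fixed number of households per block
    p1 = Fraction(a * N_i, N)     # selection chance proportional to block size
    p2 = b / N_i                  # households drawn / households in block
    return p1, p2, b

n, N, A, a = 1000, 200_000, 2000, 100
p = Fraction(n, N)                        # desired proportion, 1/200

p1, p2 = simple_two_stage(p, a, A)
print("simple:", p1, p2)                  # a block of Ni households yields Ni * p2

block_size = 150                          # hypothetical block
q1, q2, b = pps_two_stage(n, a, N, block_size)
print("PPS:", q1, q2, b, q1 * q2 == p)
```

Under the simple design the sample size varies with block size (ni = Ni/10), while under the assumed PPS design every selected block contributes the same b = 10 households and each household still has overall probability 1/200.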
26What is Cluster Analysis?
- Cluster Analysis is a class of statistical
techniques that can be applied to data that
exhibit natural groupings.
- CA is an interdependence technique that makes no
distinction between dependent and independent
variables.
- There is NO statistical significance testing in
CA.
- CA is more a group of different algorithms that
put objects into clusters following well-defined
similarity rules.
27What is A Cluster?
- A cluster is a group of relatively homogeneous
cases or observations.
- Clusters exhibit high internal homogeneity and
high external heterogeneity.
28A Cluster Diagram Drinkers Perceptions of
Alcohol
29Characteristics of CA
- Cluster Analysis is a tool of discovery.
- It discovers structures in data but does NOT
explain why they exist.
- CA is used when we do not have an a priori
hypothesis and are still in the exploratory
phase.
30How does CA differ?
- From Discriminant Analysis
- Discriminant analysis is a dependence technique.
- It predicts the probability that an object will fall
into one of two or more mutually exclusive
categories based on several independent
variables.
- It finds a linear combination of independent
variables.
- Cluster analysis instead finds natural groupings based on distances among
objects.
31- From Factor Analysis
- Similar to cluster analysis in that it is an
interdependence technique.
- The primary difference lies in the focus on objects
versus variables.
- Factor analysis reduces variables to a few
factors. Cluster analysis reduces objects to a
few clusters.
32Cluster Analysis Methods
- Three Cluster Analysis Methods
- Joining (Tree Clustering)
- Two-way Joining
- k-Means Clustering
33Joining (Tree Clustering)
- A type of hierarchical clustering -- agglomerative
- Each unit starts as its own cluster.
- Dendrogram
- Many other methods
34The first level shows all samples xi as singleton
clusters. As the level increases, more samples are
clustered together in a hierarchical manner.
35It is based on sets where each cluster level may
contain sets that are subclusters as shown in the
Venn diagram.
36Two-way Joining Hartigan (1975)
- Two-way Joining tries to cluster both variables
and objects.
- Only useful if you think clustering along BOTH
lines will be useful.
- Very rare in application.
37k-Means Clustering
- Begin with a preconception about the number of
clusters (k).
- Thought of as ANOVA in reverse.
- ANOVA evaluates between-group variance against within-group
variance when computing the statistical significance of the
hypothesis that groups are different.
- In k-Means, the computer tries to move objects
in and out of the groups to get the most
significant ANOVA results.
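The "ANOVA in reverse" idea can be sketched as follows. This is a minimal k-means loop in plain Python (not the routine used by any particular package), followed by the between/within variance ratio that an ANOVA would report. The six points, the two seeds, and all function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(pts):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(v) / len(pts) for v in zip(*pts))

def kmeans(points, seeds, max_iter=20):
    """Assign each point to its nearest centroid, recompute, repeat until stable."""
    centroids = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:      # no object moved: converged
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:               # keep the old centroid if a cluster empties
                centroids[c] = mean_point(members)
    return labels, centroids

def f_ratio(points, labels, centroids):
    """Between-group mean square divided by within-group mean square."""
    n, k = len(points), len(centroids)
    grand = mean_point(points)
    within = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    total = sum(sq_dist(p, grand) for p in points)
    return ((total - within) / (k - 1)) / (within / (n - k))

# Hypothetical data: two visibly separated groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
labels, centroids = kmeans(points, seeds=[points[0], points[3]])
print(labels, f_ratio(points, labels, centroids))
```

A partition that separates the groups well gives a large ratio; moving an object into the wrong group lowers it, which is the sense in which k-means looks for the "most significant" grouping.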
38It's all about distance
- Distance Measures
- Euclidean Distance
- Squared Euclidean Distance
- Manhattan Distance
- Chebychev Distance
- Power Distance
39EQUATION Euclidean Distance
- Basic equation for determining distance measure.
- $\mathrm{distance}(x, y) = \left[ \sum_i (x_i - y_i)^2 \right]^{1/2}$
- A standard formula for determining the distance
between two points on a plane
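The five distance measures listed on the previous slide can be written out in a few lines. This is a sketch with hypothetical function names; the power distance is assumed to take the general form (sum of |xi - yi|^p)^(1/r), which reduces to the Euclidean distance when p = r = 2.

```python
def euclidean(x, y):
    """Square root of the summed squared differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def squared_euclidean(x, y):
    """Euclidean distance without the square root; weights far objects more."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def manhattan(x, y):
    """City-block distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def chebychev(x, y):
    """Largest absolute difference on any single dimension."""
    return max(abs(a - b) for a, b in zip(x, y))

def power_distance(x, y, p=2.0, r=2.0):
    """Assumed general form: (sum |xi - yi|**p) ** (1/r)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / r)

x, y = (0, 0), (3, 4)
print(euclidean(x, y), squared_euclidean(x, y), manhattan(x, y), chebychev(x, y))
```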
40Fairly simple, right?
41In other words, how do we get from this
42To this
43To this
44How to Determine Clusters.
- Use a computer.
- Call a professional.
45- Clusters in the
- Real World
46Why is Cluster Analysis Important?
- Relatively new/evolving technique
- Highly useful for market segmentation
- Segmentation: identifying groupings of customers
using statistical multivariate analysis, often
based on perceptions and attitudes as well as
demographics and behavior.
- Segmentation is helpful to small companies
attempting to carve out a niche
- It also helps large companies trying to tailor their
products/services to different segments
47In addition to segmentation, clusters are used to
- Design products and establish brands
- Target direct mail
- Make decisions about customer conversion and
retention
- Decide on marketing cost levels
48Ex Luxury Car Customers
- Demographic examples easier to illustrate
- Demographics
- Gender
- Education
- Age
- 149 customers (objects) of a luxury car dealership
49Using SPSS for Clustering
- Chose TwoStep Cluster Analysis
- Basically, the agglomerative technique
(dendrogram).
- Step One: Creates very small (individual)
sub-clusters.
- Step Two: Clusters the sub-clusters into the desired
number of clusters.
- Automatically finds the optimum number of clusters.
50Two-Step CA Output
What are these clusters?
51Two-Step CA Output
54What does this mean?
- Cluster 5
- Age: 36 - 65
- Education: High School graduate or above
- Gender: Female
- Could have used k-Means, which would have generated
different results.
- Clustering is a powerful marketing research tool.
55Claritas Clustering Experts
- Example Claritas Corporation
- Claritas founded the U.S. geodemographic industry
when it launched the first PRIZM segmentation
system in 1974.
- PRIZM (Potential Rating Index for Zip Markets)
categorizes every U.S. neighborhood into 1 of 62
clusters.
- Descriptive Names
- Money and Brains
- Young Literati
- Shotguns and Pickups
56Money and Brains
- Sophisticated Urban Fringe Couples
- Cluster is a mix of family types: singles,
married couples with children and married couples
without children. These families own their homes
in upscale neighborhoods near cities. Dual
incomes provide luxuries, travel and
entertainment.
- Demographics
- Affluent
- Age Groups: 55-64, 65
- Predominantly White, High Asian
57Clusters Work!
- At a conservative estimate, more than 20,000
companies in the United States and Canada alone
used clusters as part of their marketing
information mix last year.
58Web Sources
- http://cwis.livjm.ac.uk/bus/busrmccl/ae230/lect10.ppt
- http://www.clusterbigip1.claritas.com/claritas/Default.jsp?main3submenusegsubcatsegprizm
- http://www.clusterbigip1.claritas.com/claritas/Default.jsp?main3submenusegsubcatsegprizmne
- http://www.insightsc.ie/newsletter7.htm
- http://www.directionsmag.com/article.asp?article_id12
- http://fun.supereva.it/scoleri.freeweb/cern/biografie/hawking.jpg
- http://www.statsoft.com/textbook/stcluan.html
- http://www-db.stanford.edu/ullman/mining/cluster1.pdf
- http://www.snr.missouri.edu/multivariate/ClusterAnalysis.pdf
59Print Sources
- Recent Developments in Clustering and Data
Analysis. Edited by Chikio Hayashi, Edwin Diday,
Michel Jambu, Noboru Ohsumi. Academic Press,
Inc. 1988.
- Finding Groups in Data: An Introduction to
Cluster Analysis. Leonard Kaufman, Peter J.
Rousseeuw. John Wiley and Sons, Inc. 1990.
- Marketing Research: An Aid to Decision Making.
Dr. Alan T. Shao. South-Western. 2002.
- Exploring Marketing Research. William G. Zikmund.
South-Western. 2003.
60Ex 7 Hypothetical Data
Subject Id. Income (1000) Education (years)
S1 5 5
S2 6 6
S3 15 14
S4 16 15
S5 25 20
S6 30 19
61Similarity Matrix (Squared Euclidean Distances)
Id   S1    S2    S3    S4    S5    S6
S1   0     2     181   221   625   821
S2   2     0     145   181   557   745
S3   181   145   0     2     136   250
S4   221   181   2     0     106   212
S5   625   557   136   106   0     26
S6   821   745   250   212   26    0
d(S1, S3) = (15-5)^2 + (14-5)^2 = 181
d(S1, S2) = 2 is the smallest distance (the most similar pair), so S1 and S2 are merged first.
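The matrix above can be reproduced directly from the six subjects in the hypothetical data table. The sketch below uses squared Euclidean distance, since that is what the tabulated entries correspond to (for example 10^2 + 9^2 = 181 for S1 and S3); the variable and function names are illustrative only.

```python
# Income (1000) and education (years) for the six hypothetical subjects.
subjects = {
    "S1": (5, 5), "S2": (6, 6), "S3": (15, 14),
    "S4": (16, 15), "S5": (25, 20), "S6": (30, 19),
}

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# Full symmetric matrix keyed by (row id, column id).
matrix = {(a, b): sq_dist(subjects[a], subjects[b])
          for a in subjects for b in subjects}

# Print in the same layout as the slide.
ids = list(subjects)
print("Id  " + "".join(f"{i:>6}" for i in ids))
for a in ids:
    print(f"{a:<4}" + "".join(f"{matrix[(a, b)]:>6}" for b in ids))
```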
62Centroid Method: Five Clusters
Data For Five Clusters
Cluster   Cluster Members      Income (1000)      Education (years)
1         S1 S2: (5,5) (6,6)   5.5 = (5+6)/2      5.5 = (5+6)/2
2         S3                   15                 14
3         S4                   16                 15
4         S5                   25                 20
5         S6                   30                 19
63Similarity Matrix (Squared Euclidean Distances)
Id      S1S2    S3      S4      S5      S6
S1S2    0       162.5   200.5   590.5   782.5
S3      162.5   0       2       136     250
S4      200.5   2       0       106     212
S5      590.5   136     106     0       26
S6      782.5   250     212     26      0
d(S1S2, S3) = (5.5-15)^2 + (5.5-14)^2 = 162.5
d(S3, S4) = 2 is the smallest distance (the most similar pair), so S3 and S4 are merged next.
64Centroid Method: Four Clusters
Data For Four Clusters
Cluster   Cluster Members          Income (1000)        Education (years)
1         S1 S2: (5,5) (6,6)       5.5 = (5+6)/2        5.5 = (5+6)/2
2         S3 S4: (15,14) (16,15)   15.5 = (15+16)/2     14.5 = (14+15)/2
3         S5                       25                   20
4         S6                       30                   19
65Similarity Matrix (Squared Euclidean Distances)
Id      S1S2    S3S4    S5      S6
S1S2    0       181     590.5   782.5
S3S4    181     0       120.5   230.5
S5      590.5   120.5   0       26
S6      782.5   230.5   26      0
d(S1S2, S5) = (5.5-25)^2 + (5.5-20)^2 = 590.5
d(S5, S6) = 26 is the smallest distance (the most similar pair), so S5 and S6 are merged next.
66Centroid Method: Three Clusters
Data For Three Clusters
Cluster   Cluster Members          Income (1000)        Education (years)
1         S1 S2: (5,5) (6,6)       5.5 = (5+6)/2        5.5 = (5+6)/2
2         S3 S4: (15,14) (16,15)   15.5 = (15+16)/2     14.5 = (14+15)/2
3         S5 S6: (25,20) (30,19)   27.5 = (25+30)/2     19.5 = (20+19)/2
67Similarity Matrix (Squared Euclidean Distances)
Id      S1S2    S3S4    S5S6
S1S2    0       181     680
S3S4    181     0       169
S5S6    680     169     0
d(S1S2, S5S6) = (5.5-27.5)^2 + (5.5-19.5)^2 = 680
d(S3S4, S5S6) = 169 is the smallest distance (the most similar pair), so S3S4 and S5S6 are merged next.
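The merge sequence worked through on these slides can be sketched as a small agglomerative loop: at each step, find the two clusters whose centroids are closest in squared Euclidean distance, merge them, and recompute the centroid. This is an illustrative plain-Python sketch, not the SAS procedure; the function names and the "+" labelling of merged clusters are made up for the example, and ties are broken by taking the first pair found.

```python
subjects = {
    "S1": (5, 5), "S2": (6, 6), "S3": (15, 14),
    "S4": (16, 15), "S5": (25, 20), "S6": (30, 19),
}

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(members, points):
    """Mean of the member points on each variable."""
    coords = [points[m] for m in members]
    return tuple(sum(c) / len(coords) for c in zip(*coords))

def centroid_linkage(points):
    """Return the merge history as (cluster A, cluster B, squared distance)."""
    clusters = {name: [name] for name in points}   # label -> member ids
    history = []
    while len(clusters) > 1:
        labels = list(clusters)
        best = None
        for i in range(len(labels)):
            for j in range(i + 1, len(labels)):
                d = sq_dist(centroid(clusters[labels[i]], points),
                            centroid(clusters[labels[j]], points))
                if best is None or d < best[0]:    # strict <: first tie wins
                    best = (d, labels[i], labels[j])
        d, a, b = best
        clusters[a + "+" + b] = clusters.pop(a) + clusters.pop(b)
        history.append((a, b, d))
    return history

history = centroid_linkage(subjects)
for a, b, d in history:
    print(f"merge {a} and {b}: squared distance {d}, distance {d ** 0.5:.4f}")
```

The square roots of the squared distances (1.4142, 1.4142, 5.0990, 13.0000, 19.7041) are the centroid distances reported in the SAS cluster history in Exhibit 7-1.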
68Exhibit 7-1SAS Output for cluster analysis on
data in Table 7.1
1
???????????
Simple statistics
          Mean      Std Dev   Skewness   Kurtosis   Bimodality
INCOME    16.1667   9.9883     0.2684    -1.4015    0.2211
EDUC      13.1667   6.3692    -0.4510    -1.8108    0.2711
Root-Mean-Square Total-Sample Standard Deviation = 8.376555
69Root-Mean-Square Total-Sample Standard
Deviation = 8.376555 (RMSSTD)
RMSSTO?????????????(?????????)
Step     Number of   Clusters    Frequency of   RMS STD of    Semipartial   R-Squared   Centroid
Number   Clusters    Joined      New Cluster    New Cluster   R-Squared                 Distance
1        5           S1   S2     2              0.707107      0.001425      0.998575    1.4142
2        4           S3   S4     2              0.707107      0.001425      0.997150    1.4142
3        3           S5   S6     2              2.549510      0.018527      0.978622    5.0990
4        2           CL4  CL3    4              5.522681      0.240855      0.737767    13.0000
5        1           CL5  CL2    6              8.376555      0.737767      0.000000    19.7041
?????,?R2????
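The RMS STD and semipartial R-squared columns of the cluster history can be checked by hand. The sketch below assumes the usual definitions: RMSSTD is the square root of the pooled variance over all variables in a cluster, and the semipartial R-squared of a merge is the increase in within-cluster sum of squares divided by the total sum of squares. The function names are illustrative, not SAS identifiers.

```python
import math

subjects = {
    "S1": (5, 5), "S2": (6, 6), "S3": (15, 14),
    "S4": (16, 15), "S5": (25, 20), "S6": (30, 19),
}

def within_ss(members):
    """Sum of squared deviations from the cluster mean, over all variables."""
    coords = [subjects[m] for m in members]
    means = [sum(c) / len(coords) for c in zip(*coords)]
    return sum((v - m) ** 2 for row in coords for v, m in zip(row, means))

def rmsstd(members):
    """Root-mean-square standard deviation of a cluster (pooled over variables)."""
    n_vars = len(next(iter(subjects.values())))
    return math.sqrt(within_ss(members) / (n_vars * (len(members) - 1)))

TOTAL_SS = within_ss(list(subjects))

def semipartial_r2(a, b):
    """Loss of homogeneity caused by merging clusters a and b."""
    return (within_ss(a + b) - within_ss(a) - within_ss(b)) / TOTAL_SS

print(rmsstd(["S1", "S2"]))                          # step 1 in the table
print(rmsstd(list(subjects)))                        # whole sample
print(semipartial_r2(["S3", "S4"], ["S5", "S6"]))    # step 4 in the table
```

A large jump in semipartial R-squared (here from 0.0185 at step 3 to 0.2409 at step 4) signals that the merge joined clusters that are not very similar, which is one common argument for stopping at three clusters.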
70- CLUSTER=1              CLUSTER=2              CLUSTER=3
    OBS SID INCOME EDUC    OBS SID INCOME EDUC    OBS SID INCOME EDUC
    1   S1  5      5       3   S3  15     14      5   S5  25     20
    2   S2  6      6       4   S4  16     15      6   S6  30     19
71Exhibit 7.2Non-hierarchical Clustering On Data
- Replace=FULL Radius=0 Maxclusters=3 Maxiter=20 Converge=0.02

Initial Seeds
Cluster   INCOME    EDUC
1          5.0000    5.0000
2         30.0000   19.0000
3         16.0000   15.0000

The initial seeds are S1, S6, and S4.
72Exhibit 7-2 (continued)
Minimum Distance Between Seeds = 14.56022

Iteration   Change in Cluster Seeds
            1          2          3
1           0.707107   2.54951    0.707107
2           0          0          0

Statistics for Variables
Variable   Total STD   Within STD   R-Squared   RSQ/(1-RSQ)
INCOME     9.988327    2.121320     0.972937    35.950617
EDUC       6.369197    0.707107     0.992605    134.222222
OVER-ALL   8.376555    1.581139     0.978622    45.777778
73Exhibit 7-2 (continued)
Pseudo F Statistic = 68.67
Approximate Expected Over-All R-Squared = .
Cubic Clustering Criterion = .
WARNING: The two above values are invalid for correlated variables.

Cluster Means
Cluster   INCOME    EDUC
1          5.5000    5.5000
2         27.5000   19.5000
3         15.5000   14.5000
???????(?????)
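The "Statistics for Variables" table and the pseudo F statistic in Exhibit 7-2 can be recomputed from the final three-cluster assignment. The sketch below assumes the usual definitions: Total STD uses n - 1 degrees of freedom, Within STD uses n - k, R-squared is one minus the within/total sum-of-squares ratio, and pseudo F = [R2/(k - 1)] / [(1 - R2)/(n - k)]. Variable names are illustrative.

```python
import math

data = {
    "INCOME": [5, 6, 15, 16, 25, 30],   # S1..S6
    "EDUC":   [5, 6, 14, 15, 20, 19],
}
# Final clusters as row indices: {S1, S2}, {S5, S6}, {S3, S4}.
clusters = [[0, 1], [4, 5], [2, 3]]
n, k = 6, len(clusters)

def ss(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

stats = {}
total_all = within_all = 0.0
for name, col in data.items():
    total = ss(col)
    within = sum(ss([col[i] for i in c]) for c in clusters)
    stats[name] = (math.sqrt(total / (n - 1)),     # Total STD
                   math.sqrt(within / (n - k)),    # Within STD
                   1 - within / total)             # R-Squared
    total_all += total
    within_all += within

overall_r2 = 1 - within_all / total_all
pseudo_f = (overall_r2 / (k - 1)) / ((1 - overall_r2) / (n - k))

for name, (tot, wit, r2) in stats.items():
    print(f"{name:<8} {tot:.6f} {wit:.6f} {r2:.6f} {r2 / (1 - r2):.6f}")
print(f"OVER-ALL R-Squared {overall_r2:.6f}, pseudo F {pseudo_f:.2f}")
```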
74Exhibit 7.4 Hierarchical Cluster Analysis For
Food Data
SINGLE LINKAGE CLUSTER ANALYSIS

SIMPLE STATISTICS
           MEAN      STD DEV   SKEWNESS   KURTOSIS   BIMODALITY
CALORIES   207.407   101.208    0.542     -0.675     0.478
PROTEIN     19.000     4.252   -0.824      1.327     0.357
FAT         13.481    11.257    0.790     -0.624     0.589
CALCIUM     43.963    78.034    3.159     11.345     0.746
IRON         2.381     1.461    1.230      1.469     0.518
75Exhibit 7.4 (continued)
(?????)
COMPLETE LINKAGE CLUSTER ANALYSIS

NUMBER OF   CLUSTERS JOINED            FREQUENCY OF   RMS STD OF    SEMIPARTIAL   R-SQUARED   MAXIMUM
CLUSTERS                               NEW CLUSTER    NEW CLUSTER   R-SQUARED                 DISTANCE
10          CL15   CANNED CRABMEAT     4              11.32324      0.003476      0.985594    50.6665
9           CL17   ROAST LAMB SHOUL    3              12.59929      0.003226      0.982367    55.6611
8           CL14   CANNED SHRIMP       3              16.10565      0.005231      0.977136    71.1677
7           CL13   ROAST BEEF          6              14.34190      0.009755      0.967381    80.9343
6           CL10   CL8                 7              22.14096      0.023782      0.943599    108.1758
5           CL9    CL11                11             20.22234      0.039103      0.904496    141.7814
4           CL6    CL12                9              30.07489      0.048662      0.855835    154.4447
3           CL7    CL5                 17             38.73570      0.220433      0.635402    262.5666
2           CL4    CANNED SARDINES     10             51.36181      0.192623      0.442779    364.8934
1           CL3    CL2                 27             57.40958      0.442779      0.000000    433.7617
76Exhibit 7.4 (continued)
ROOT-MEAN-SQUARE TOTAL-SAMPLE STANDARD DEVIATION = 57.4096

NUMBER OF   CLUSTERS JOINED                        FREQUENCY OF   RMS STD OF    SEMIPARTIAL   R-SQUARED   MINIMUM
CLUSTERS                                           NEW CLUSTER    NEW CLUSTER   R-SQUARED                 DISTANCE
10          CANNED MACKEREL   CANNED SALMON        2              11.16786      0.001455      0.973438    35.3159
9           CL14              ROAST LAMB SHOULDER  3              12.59929      0.003226      0.970211    35.4131
8           CL11              CANNED CRABMEAT      12             16.80697      0.014701      0.955510    39.5267
7           CL15              CL9                  8              20.48901      0.028341      0.927169    40.1627
6           CL7               CL8                  20             40.04817      0.285060      0.642109    40.2746
5           CL12              CANNED SHRIMP        3              16.10565      0.005231      0.636878    44.8504
4           CL6               ROAST BEEF           21             43.49500      0.085924      0.550954    45.7642
3           CL4               CL5                  24             48.72189      0.189548      0.361406    48.7139
2           CL3               CL10                 26             50.53988      0.106595      0.254811    62.2624
1           CL2               CANNED               27             57.40958      0.254811      0.000000    211.5691
77Exhibit 7.4 (continued)
(???)
CENTROID HIERARCHICAL CLUSTER ANALYSIS

NUMBER OF   CLUSTERS JOINED               FREQUENCY OF   RMS STD OF    SEMIPARTIAL   R-SQUARED   CENTROID
CLUSTERS                                  NEW CLUSTER    NEW CLUSTER   R-SQUARED                 DISTANCE
10          CL15   CANNED CRABMEAT        4              11.32324      0.003476      0.985594    44.5633
9           CL16   ROAST LAMB SHOULDER    3              12.59929      0.003226      0.982367    45.5370
8           CL14   CANNED SHRIMP          3              16.10565      0.005231      0.977136    57.9815
7           CL13   CL10                   12             16.80697      0.026857      0.950279    65.6901
6           CL12   ROAST BEEF             6              14.34190      0.009755      0.940524    70.8222
5           CL6    CL9                    9              24.36751      0.039727      0.900797    92.2533
4           CL8    CL11                   5              26.85628      0.026158      0.874639    96.6423
3           CL7    CL4                    17             31.36108      0.113709      0.760930    117.4906
2           CL5    CL3                    26             50.53988      0.506119      0.254811    191.9655
1           CL2    CANNED SARDINES        27             57.40958      0.254811      0.000000    336.7134
78Exhibit 7.4 (continued)
(???)
WARD'S MINIMUM VARIANCE CLUSTER ANALYSIS

NUMBER OF   CLUSTERS JOINED           FREQUENCY OF   RMS STD OF    SEMIPARTIAL   R-SQUARED   BETWEEN-CLUSTER
CLUSTERS                              NEW CLUSTER    NEW CLUSTER   R-SQUARED                 SUM OF SQUARES
10          CL14   CANNED CRABMEAT    4              11.32324      0.003476      0.985908    1489.42
9           CL16   CL20               8              7.75641       0.003541      0.982367    1517.12
8           CL15   CANNED SHRIMP      3              16.10565      0.005231      0.977136    2241.24
7           CL12   ROAST BEEF         6              14.34190      0.009755      0.967381    4179.83
6           CL10   CL8                7              22.14096      0.023782      0.943599    10189.5
5           CL11   CL9                11             20.22234      0.039103      0.904496    16754.1
4           CL6    CL13               9              30.07489      0.048662      0.855835    20849.7
3           CL5    CL4                20             36.22080      0.158726      0.697109    68007.8
2           CL3    CANNED SARDINES    21             47.72546      0.240715      0.456394    103137
1           CL7    CL2                27             57.40958      0.456394      0.000000    195548
79Exhibit 7.5 Non-Hierarchical Analysis For
Food-Nutrient Data
- INITIAL SEEDS (??????)
CLUSTER   CALORIES   PROTEIN   FAT      CALCIUM   IRON
1         331.111    19.000    27.556   8.778     2.467
2         161.667    20.500    7.500    14.250    1.925
3         100.000    14.800    3.400    114.000   3.000
80Exhibit 7.5 (continued)
MINIMUM DISTANCE BETWEEN SEEDS = 117.4876

ITERATION   CHANGE IN CLUSTER SEEDS
            1         2         3
1           10.8475   6.46446   0.3
2           0         6.85281   12.7855
3           0         0         0
81- CLUSTER SUMMARY

CLUSTER   FREQUENCY   RMS STD     MAXIMUM DISTANCE FROM   NEAREST   CENTROID
NUMBER                DEVIATION   SEED TO OBSERVATION     CLUSTER   DISTANCE
1         8           20.8936     78.8882                 2         168.5
2         12          16.3651     70.9576                 3         117.9
3         6           27.8059     79.6672                 2         117.9
- ????? ?2?????? ??? ?????
- ??? ???
-
82?????(??)???,?????RMSSTD.????,???? Within
SD/Total SD
STATISTICS FOR VARIABLES

VARIABLE   TOTAL STD   WITHIN STD   R-SQUARED   RSQ/(1-RSQ)
CALORIES   103.06085   39.89286     0.86216     6.25453
PROTEIN    4.29257     3.58590      0.35798     0.55758
FAT        11.44357    4.52989      0.85584     5.93681
CALCIUM    44.70188    22.76009     0.76150     3.19291
IRON       1.49005     1.51663      0.04688     0.04919
OVER-ALL   50.53988    20.71299     0.84547     5.47135

PSEUDO F STATISTIC = 62.92
APPROXIMATE EXPECTED OVER-ALL R-SQUARED = 0.78678
CUBIC CLUSTERING CRITERION = 2.186
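The R-squared column can be recovered from the Total STD and Within STD columns alone, which shows how the ratio Within STD / Total STD drives each variable's contribution. The sketch assumes Total STD uses n - 1 and Within STD uses n - k degrees of freedom, with n = 26 observations (the cluster frequencies 8 + 12 + 6) and k = 3 clusters; the function names are illustrative.

```python
def r_squared(total_std, within_std, n, k):
    """R-squared implied by the total and pooled within-cluster std devs."""
    within_ss = (n - k) * within_std ** 2
    total_ss = (n - 1) * total_std ** 2
    return 1 - within_ss / total_ss

def pseudo_f(r2, n, k):
    """Pseudo F statistic from the overall R-squared."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))

n, k = 26, 3
reported = {                      # (TOTAL STD, WITHIN STD) from the exhibit
    "CALORIES": (103.06085, 39.89286),
    "PROTEIN":  (4.29257, 3.58590),
    "FAT":      (11.44357, 4.52989),
    "CALCIUM":  (44.70188, 22.76009),
    "IRON":     (1.49005, 1.51663),
    "OVER-ALL": (50.53988, 20.71299),
}
r2 = {name: r_squared(t, w, n, k) for name, (t, w) in reported.items()}
for name, value in r2.items():
    print(f"{name:<9} R-squared {value:.5f}  RSQ/(1-RSQ) {value / (1 - value):.5f}")
print(f"pseudo F {pseudo_f(r2['OVER-ALL'], n, k):.2f}")
```

Variables whose within-cluster spread is almost as large as their total spread (IRON, and to a lesser extent PROTEIN) have R-squared near zero, meaning they contribute little to separating the three clusters.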
83Exhibit 7.5 (continued)
CLUSTER MEANS

CLUSTER   CALORIES   PROTEIN   FAT      CALCIUM   IRON
1         341.875    18.750    28.875   8.750     2.437
2         174.583    21.083    8.750    11.833    2.083
3         98.333     14.667    3.167    101.333   2.883
Cluster 1?????? Cluster 2??????? Cluster 3?????
84Exhibit 7.5 (continued)
?????????(????,?????,??) (8 Cases)
CLUSTER=1

OBS   NAME             CLUSTER   DISTANCE   CALORIES   PROTEIN   FAT   CALCIUM   IRON
1     BRAISED BEEF     1         2.4357     340        20        28    9         2.6
2     ROAST BEEF       1         78.8882    420        15        39    7         2.0
3     BEEF STEAK       1         33.2744    375        19        32    9         2.6
4     ROAST LAMB LEG   1         77.3963    265        20        20    9         2.6
5     ROAST LAMB       1         42.0616    300        18        25    9         2.3
6     SMOKED HAM       1         2.4311     340        20        28    9         2.5
7     PORK ROAST       1         1.9132     340        19        29    9         2.5
8     PORK SIMMERED    1         13.1779    355        19        30    9         2.4
85Exhibit 7.5 (continued)
?????????(????,?????,??) (12 Cases)
CLUSTER=2

OBS   NAME               CLUSTER   DISTANCE   CALORIES   PROTEIN   FAT   CALCIUM   IRON
9     HAMBURGER          2         70.9576    245        21        17    9         2.7
10    CANNED BEEF        2         7.8135     180        22        10    17        3.7
11    BROILED CHICKEN    2         59.9964    115        20        3     8         1.4
12    CANNED CHICKEN     2         6.3070     170        25        7     12        1.5
13    BEEF HEART         2         16.4369    160        26        5     14        5.9
14    BEEF TONGUE        2         31.3971    205        18        14    7         2.5
15    VEAL CUTLET        2         10.9841    185        23        9     9         2.7
16    BAKED BLUEFISH     2         42.0215    135        22        4     25        0.6
17    FRIED HADDOCK      2         40.2403    135        16        5     15        0.5
18    BROILED MACKEREL   2         26.7634    200        19        13    5         1.0
19    FRIED PERCH        2         21.2850    195        16        11    14        1.3
20    CANNED TUNA        2         7.9719     170        25        7     7         1.2
86Exhibit 7.5 (continued)
???????????? (6 Cases)
CLUSTER=3

OBS   NAME              CLUSTER   DISTANCE   CALORIES   PROTEIN   FAT   CALCIUM   IRON
21    RAW CLAMS         3         34.7046    70         11        1     82        6.0
22    CANNED CLAMS      3         60.5092    45         7         1     74        5.4
23    CANNED CRABMEAT   3         63.9273    90         14        2     38        0.8
24    CANNED MACKEREL   3         79.6672    155        16        9     157       1.8
25    CANNED SALMON     3         61.7127    120        17        5     159       0.7
26    CANNED SHRIMP     3         14.8809    110        23        1     98        2.6