1
Game Trees-Clustering
"All human beings desire to know." (Aristotle, Metaphysics, I.1)
Lecture 10
  • Prof. Sin-Min Lee

2
Decision Tree
  • A decision tree is a predictive model.
  • Each interior node corresponds to a variable.
  • An arc to a child represents a possible value of that variable.
  • A leaf represents the predicted value of the target variable given the values of the variables on the path from the root.

3
  • A decision tree can be learned by splitting the source set into subsets based on an attribute-value test.
  • This process is repeated on each derived subset in a recursive manner.
  • The recursion is complete when splitting yields a single classification that can be applied to every element of the derived subset (see the sketch below).
  • Decision trees can also be used for calculating conditional probabilities.
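A minimal sketch of this recursive splitting, written as a toy ID3-style learner in Python. The entropy-based attribute choice, the helper names, and the tiny weather-style data set are illustrative assumptions, not taken from the slides.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def learn_tree(rows, labels, attributes):
    # Recursion stops when the subset has a single classification
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:                       # no attribute left: majority class
        return Counter(labels).most_common(1)[0][0]

    def split_score(attr):
        # Weighted entropy of the subsets produced by splitting on attr
        score = 0.0
        for value in {row[attr] for row in rows}:
            subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
            score += len(subset) / len(rows) * entropy(subset)
        return score

    best = min(attributes, key=split_score)  # attribute-value test for this node
    branches = {}
    for value in {row[best] for row in rows}:
        sub_rows = [row for row, lab in zip(rows, labels) if row[best] == value]
        sub_labels = [lab for row, lab in zip(rows, labels) if row[best] == value]
        rest = [a for a in attributes if a != best]
        branches[value] = learn_tree(sub_rows, sub_labels, rest)   # recurse on each subset
    return {best: branches}

# Tiny illustrative data set (assumed, not from the slides)
rows = [{"outlook": "sunny"}, {"outlook": "rain"}, {"outlook": "sunny"}]
labels = ["no", "yes", "no"]
print(learn_tree(rows, labels, ["outlook"]))
# -> {'outlook': {'sunny': 'no', 'rain': 'yes'}} (branch order may vary)
```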

4
A decision tree has three other names
  • Classification tree analysis is used when the predicted outcome is the class to which the data belongs.
  • Regression tree analysis is used when the predicted outcome can be considered a real number.
  • CART (Classification And Regression Tree) analysis refers to both of the above procedures.

5
(No Transcript)
6
Advantages of Decision Trees
  • Simple to understand and interpret.
  • Require little data preparation.
  • Able to handle nominal and categorical data.
  • Perform well on large data sets in a short time.
  • The conditions that lead to a prediction are easily explained with Boolean logic.

7-25
(No Transcript)
26
AprioriTid Algorithm
  • After the first pass, the database is not used at all for counting the support of candidate itemsets.
  • The candidate itemsets are generated the same way as in the Apriori algorithm.
  • Instead, a set C is generated in which each entry holds the TID of a transaction and the large itemsets present in that transaction. This set is used to count the support of each candidate itemset (see the sketch below).
  • The advantage is that the number of entries in C may be smaller than the number of transactions in the database, especially in the later passes.
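A rough Python sketch of the counting step just described, under the assumption that candidate generation and the per-TID set from the previous pass are supplied by the caller; the function and variable names are illustrative.

```python
from collections import defaultdict

def aprioritid_pass(c_bar_prev, candidates):
    """One AprioriTid counting pass.

    c_bar_prev maps TID -> set of (k-1)-itemsets (frozensets) found in that
    transaction on the previous pass; candidates is an iterable of
    k-itemsets (frozensets). Returns the support counts and the new per-TID
    set, without touching the original database.
    """
    candidates = list(candidates)
    support = defaultdict(int)
    c_bar_next = {}
    for tid, prev_itemsets in c_bar_prev.items():
        present = set()
        for cand in candidates:
            items = sorted(cand)
            # cand occurs in the transaction iff the two (k-1)-subsets made by
            # dropping its last and its second-to-last item were recorded
            # for this TID on the previous pass
            without_last = frozenset(items[:-1])
            without_second_last = frozenset(items[:-2] + items[-1:])
            if without_last in prev_itemsets and without_second_last in prev_itemsets:
                support[cand] += 1
                present.add(cand)
        if present:                 # TIDs supporting no candidate are dropped
            c_bar_next[tid] = present
    return support, c_bar_next
```

Because TIDs that support no candidate are dropped, the per-TID set can shrink from pass to pass, which is the advantage mentioned above.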

27
Apriori Algorithm
  • Candidate itemsets are generated using only the large itemsets of the previous pass, without considering the transactions in the database.
  • The large itemsets of the previous pass are joined with themselves to generate all itemsets whose size is larger by one.
  • Each generated itemset that has a subset which is not large is deleted. The remaining itemsets are the candidates (see the sketch below).
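A compact sketch of this join-and-prune step, assuming itemsets are represented as Python frozensets; the function name apriori_gen follows the usual description of the algorithm but is otherwise illustrative.

```python
from itertools import combinations

def apriori_gen(large_prev):
    """Candidate generation: join L_{k-1} with itself, then prune any
    candidate that has a (k-1)-subset which is not large.

    large_prev is a set of frozensets, all of size k-1.
    """
    large_prev = set(large_prev)
    k = len(next(iter(large_prev))) + 1
    # Join step: union two (k-1)-itemsets that differ in exactly one item
    candidates = set()
    for a in large_prev:
        for b in large_prev:
            union = a | b
            if len(union) == k:
                candidates.add(union)
    # Prune step: every (k-1)-subset of a candidate must itself be large
    return {
        c for c in candidates
        if all(frozenset(s) in large_prev for s in combinations(c, k - 1))
    }

# Using L2 from the example on the next slides: {1 3}, {2 3}, {2 5}, {3 5}
L2 = {frozenset(x) for x in [(1, 3), (2, 3), (2, 5), (3, 5)]}
print(apriori_gen(L2))   # {frozenset({2, 3, 5})}
```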

28
Example
Database (TID : Items)
  100 : 1 3 4
  200 : 2 3 5
  300 : 1 2 3 5
  400 : 2 5
L1 (Itemset : Support)
  {1} : 2
  {2} : 3
  {3} : 3
  {5} : 3
C2 (Itemset : Support)
  {1 3} : 2
  {1 4} : 1
  {3 4} : 1
  {2 3} : 2
  {2 5} : 3
  {3 5} : 2
  {1 2} : 1
  {1 5} : 1
C3 (Itemset : Support)
  {1 3 4} : 1
  {2 3 5} : 2
  {1 3 5} : 1
29
Example
Database (TID : Items)
  100 : 1 3 4
  200 : 2 3 5
  300 : 1 2 3 5
  400 : 2 5
L1 (Itemset : Support)
  {1} : 2
  {2} : 3
  {3} : 3
  {5} : 3
C2 (Itemset : TID)
  {1 3} : 100
  {1 4} : 100
  {3 4} : 100
  {2 3} : 200
  {2 5} : 200
  {3 5} : 200
  {1 2} : 300
  {1 3} : 300
  {1 5} : 300
  {2 3} : 300
  {2 5} : 300
  {3 5} : 300
  {2 5} : 400
C3 (Itemset : TID)
  {1 3 4} : 100
  {2 3 5} : 200
  {1 3 5} : 300
  {2 3 5} : 300
30
Example
Database (TID : Items)
  100 : 1 3 4
  200 : 2 3 5
  300 : 1 2 3 5
  400 : 2 5
L1 (Itemset : Support)
  {1} : 2
  {2} : 3
  {3} : 3
  {5} : 3
C2 (Itemset : Support)
  {1 2} : 1
  {1 3} : 2
  {1 5} : 1
  {2 3} : 2
  {2 5} : 3
  {3 5} : 2
C3
  {1 2 3}, {1 3 5}, {2 3 5}
C3 (Itemset : Support)
  {2 3 5} : 2
31
Example
Database (TID : Items)
  100 : 1 3 4
  200 : 2 3 5
  300 : 1 2 3 5
  400 : 2 5
L1 (Itemset : Support)
  {1} : 2
  {2} : 3
  {3} : 3
  {5} : 3
C2 (Itemset : Support)
  {1 2} : 1
  {1 3} : 2
  {1 5} : 1
  {2 3} : 2
  {2 5} : 3
  {3 5} : 2
C2 (TID : itemsets present)
  100 : {1 3}
  200 : {2 3}, {2 5}, {3 5}
  300 : {1 2}, {1 3}, {1 5}, {2 3}, {2 5}, {3 5}
  400 : {2 5}
C3 (TID : itemsets present)
  200 : {2 3 5}
  300 : {2 3 5}
C3 (Itemset : Support)
  {2 3 5} : 2
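The following sketch re-runs the Apriori passes on this four-transaction database, assuming a minimum support of 2 (the slides do not state the threshold explicitly); it reproduces L1, the frequent pairs, and {2 3 5} with support 2.

```python
from itertools import combinations
from collections import defaultdict

# The four-transaction database from the slides; a minimum support of 2 is assumed
transactions = {
    100: {1, 3, 4},
    200: {2, 3, 5},
    300: {1, 2, 3, 5},
    400: {2, 5},
}
MINSUP = 2

def count_support(candidates):
    support = defaultdict(int)
    for items in transactions.values():
        for cand in candidates:
            if cand <= items:
                support[cand] += 1
    return {c: s for c, s in support.items() if s >= MINSUP}

# Pass 1: large 1-itemsets
L = count_support({frozenset([i]) for t in transactions.values() for i in t})
print(dict(L))   # {1}: 2, {2}: 3, {3}: 3, {5}: 3

# Later passes: join, prune, count
while L:
    k = len(next(iter(L))) + 1
    joined = {a | b for a in L for b in L if len(a | b) == k}
    candidates = {c for c in joined
                  if all(frozenset(s) in L for s in combinations(c, k - 1))}
    L = count_support(candidates)
    if L:
        print(dict(L))   # pass 2: the four frequent pairs; pass 3: {2 3 5}: 2
```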
32
  • No practicable methodology has been demonstrated for reliable prediction of large earthquakes on time scales of decades or less.
  • Some scientists question whether such predictions will be possible even with much improved observations.
  • Pessimism comes from repeated cycles in which public promises that reliable predictions are just around the corner are followed by equally public failures of specific prediction methodologies. Bad for science!

33
COMPLEX PLATE BOUNDARY ZONE IN SOUTHEAST ASIA
  • Northward motion of India deforms all of the region
  • Many small plates (microplates) and blocks
(Molnar and Tapponnier, 1977)
34
(No Transcript)
35
(No Transcript)
36
Mission District, San Francisco Earthquake, 1906
  • Short-term prediction (forecast)
  • Frequency and distribution pattern of foreshocks
  • Deformation of the ground surface: tilting, elevation changes
  • Emission of radon gas
  • Seismic gap along faults
  • Abnormal animal activities

37
(No Transcript)
38
                                                
(No Transcript)
39
Freeway Damage 1994 CA Earthquake
40
Sand Boils after Loma Prieta Earthquake
41
California Earthquake Probabilities Map
42
Clustering
  • Group data into clusters
  • Similar to one another within the same cluster
  • Dissimilar to the objects in other clusters
  • Unsupervised learning: no predefined classes

43
(No Transcript)
44
What is Cluster Analysis?
  • Cluster analysis
  • Grouping a set of data objects into clusters
  • Clustering is unsupervised classification: no predefined classes
  • Typical applications
  • to get insight into data
  • as a preprocessing step

45
What Is A Good Clustering?
  • High intra-class similarity and low inter-class similarity
  • Quality depends on the similarity measure used
  • The ability to discover some or all of the hidden patterns

46
General Applications of Clustering
  • Pattern Recognition
  • Spatial Data Analysis
  • create thematic maps in GIS by clustering feature
    spaces
  • detect spatial clusters and explain them in
    spatial data mining
  • Image Processing
  • Economic Science (especially market research)
  • WWW
  • Document classification
  • Cluster Weblog data to discover groups of similar
    access patterns

47
Examples of Clustering Applications
  • Marketing: help marketers discover distinct groups in their customer bases, and then use this knowledge to develop targeted marketing programs
  • Land use: identification of areas of similar land use in an earth observation database
  • Insurance: identifying groups of motor insurance policy holders with a high average claim cost
  • City planning: identifying groups of houses according to their house type, value, and geographical location
  • Earthquake studies: observed earthquake epicenters should be clustered along continent faults

48
What Is Good Clustering?
  • A good clustering method will produce high
    quality clusters with
  • high intra-class similarity
  • low inter-class similarity
  • The quality of a clustering result depends on
    both the similarity measure used by the method
    and its implementation.
  • The quality of a clustering method is also
    measured by its ability to discover some or all
    of the hidden patterns.

49
Data Structures in Clustering
  • Data matrix (two modes)
  • Dissimilarity matrix (one mode); a sketch of computing one from the other follows
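A brief sketch, assuming numeric data and SciPy's distance utilities (not mentioned on the slides), of turning an n-by-p data matrix into an n-by-n dissimilarity matrix.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Data matrix: n objects x p variables (two modes: rows and columns differ)
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [8.0, 9.0]])

# Dissimilarity matrix: n x n, symmetric, zeros on the diagonal (one mode)
D = squareform(pdist(X, metric="euclidean"))
print(D.round(2))
```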

50
Measuring Similarity
  • Dissimilarity/similarity metric: similarity is expressed in terms of a distance function d(i, j), which is typically a metric.
  • There is a separate quality function that measures the goodness of a cluster.
  • The definitions of distance functions are usually very different for interval-scaled, boolean, categorical, ordinal and ratio variables.
  • Weights should be associated with different variables based on applications and data semantics.
  • It is hard to define "similar enough" or "good enough"; the answer is typically highly subjective.

51
Notion of a Cluster can be Ambiguous
52
  • Hierarchical algorithms
  • Agglomerative: each object is a cluster; merge clusters to form larger ones
  • Divisive: all objects are in one cluster; split it up into smaller clusters

53
Types of Clusters: Well-Separated
  • Well-Separated Clusters
  • A cluster is a set of points such that any point
    in a cluster is closer (or more similar) to every
    other point in the cluster than to any point not
    in the cluster.

3 well-separated clusters
54
Types of Clusters: Center-Based
  • Center-based
  • A cluster is a set of objects such that an object in a cluster is closer (more similar) to the center of that cluster than to the center of any other cluster
  • The center of a cluster is often a centroid, the average of all the points in the cluster, or a medoid, the most representative point of a cluster

4 center-based clusters
55
Types of Clusters: Contiguity-Based
  • Contiguous Cluster (Nearest neighbor or
    Transitive)
  • A cluster is a set of points such that a point in
    a cluster is closer (or more similar) to one or
    more other points in the cluster than to any
    point not in the cluster.

8 contiguous clusters
56
Types of Clusters: Density-Based
  • Density-based
  • A cluster is a dense region of points, which is
    separated by low-density regions, from other
    regions of high density.
  • Used when the clusters are irregular or
    intertwined, and when noise and outliers are
    present.

6 density-based clusters
57
Types of Clusters: Conceptual Clusters
  • Shared Property or Conceptual Clusters
  • Finds clusters that share some common property or
    represent a particular concept.

2 Overlapping Circles
58
Hierarchical Clustering
Traditional Hierarchical Clustering
Traditional Dendrogram
Non-traditional Hierarchical Clustering
Non-traditional Dendrogram
59
Hierarchical Clustering
  • Produces a set of nested clusters organized as a
    hierarchical tree
  • Can be visualized as a dendrogram
  • A tree-like diagram that records the sequences of merges or splits (a sketch using SciPy follows)
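A minimal sketch, assuming SciPy is available (the slides do not name a library), of building an agglomerative clustering and the data behind its dendrogram.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

# Six 2-D points forming two obvious groups (assumed toy data)
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [4.0, 4.0], [4.1, 4.2], [4.2, 3.9]])

# Agglomerative clustering; "single" is MIN linkage, "complete" and "average" also work
Z = linkage(points, method="single")

# Cut the hierarchy into two flat clusters
print(fcluster(Z, t=2, criterion="maxclust"))   # e.g. [1 1 1 2 2 2]

# dendrogram(Z) draws the tree of merges when matplotlib is available
```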

60
Starting Situation
  • Start with clusters of individual points and a
    proximity matrix

Proximity Matrix
61
Intermediate Situation
  • After some merging steps, we have some clusters

(Figure: clusters C1-C5 and their proximity matrix)
62
  • We want to merge the two closest clusters (C2 and
    C5) and update the proximity matrix.

(Figure: clusters C1-C5 and their proximity matrix; C2 and C5 are the closest pair)
63
After Merging
  • The question is: how do we update the proximity matrix?

(Figure: proximity matrix after merging C2 and C5; the row and column for C2 U C5 are still unknown. A sketch of one update rule follows.)
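One common answer, sketched here with NumPy under the single-link (MIN) rule, where the merged cluster's distance to each remaining cluster is the minimum of the two old distances; complete link would take the maximum and group average a size-weighted mean. The function name and the small matrix are illustrative.

```python
import numpy as np

def merge_clusters(D, i, j):
    """Single-link update: remove rows/columns i and j from the proximity
    matrix D and append one row/column for the merged cluster, whose
    distance to every remaining cluster is min(D[i, k], D[j, k])."""
    keep = [k for k in range(len(D)) if k not in (i, j)]
    merged_row = np.minimum(D[i, keep], D[j, keep])
    new_D = D[np.ix_(keep, keep)]
    new_D = np.vstack([new_D, merged_row])
    new_D = np.column_stack([new_D, np.append(merged_row, 0.0)])
    return new_D

# Illustrative 5x5 matrix for C1..C5; C2 and C5 (indices 1 and 4) are closest
D = np.array([[0.0,  2.0, 6.0, 10.0, 3.0],
              [2.0,  0.0, 5.0,  9.0, 1.0],
              [6.0,  5.0, 0.0,  4.0, 5.0],
              [10.0, 9.0, 4.0,  0.0, 8.0],
              [3.0,  1.0, 5.0,  8.0, 0.0]])
print(merge_clusters(D, 1, 4))   # 4x4 matrix; last row/column is C2 U C5
```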
64
How to Define Inter-Cluster Similarity
  • MIN
  • MAX
  • Group Average
  • Distance Between Centroids
  • Other methods driven by an objective function
65
(Figures: MIN and MAX inter-cluster distances illustrated on the proximity matrix)
66
(Figures: Group Average and Distance Between Centroids illustrated)
67
Cluster Similarity: MIN or Single Link
  • Similarity of two clusters is based on the two most similar (closest) points in the different clusters.
  • Determined by one pair of points, i.e., by one link in the proximity graph.
68
Hierarchical Clustering: MIN
Nested Clusters
Dendrogram
69
Cluster Similarity: MAX or Complete Linkage
  • Similarity of two clusters is based on the two least similar (most distant) points in the different clusters.
  • Determined by all pairs of points in the two clusters.

70
Hierarchical Clustering: MAX
Nested Clusters
Dendrogram
71
Cluster Similarity: Group Average
  • Proximity of two clusters is the average of pairwise proximities between points in the two clusters.
  • Need to use average connectivity for scalability, since total proximity favors large clusters (a comparison sketch follows).
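A small sketch contrasting the three criteria directly on two toy clusters; the points and the use of SciPy's cdist are illustrative assumptions.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Two toy clusters of 2-D points
A = np.array([[0.0, 0.0], [0.0, 1.0]])
B = np.array([[3.0, 0.0], [4.0, 1.0]])

pairwise = cdist(A, B)          # all pairwise distances between the two clusters

print(pairwise.min())           # MIN / single link: closest pair
print(pairwise.max())           # MAX / complete link: farthest pair
print(pairwise.mean())          # group average: mean of all pairwise distances
```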

72
Hierarchical Clustering: Group Average
Nested Clusters
Dendrogram
73
Hierarchical Clustering: Time and Space Requirements
  • O(N²) space, since it uses the proximity matrix (N is the number of points).
  • O(N³) time in many cases: there are N steps, and at each step the N² proximity matrix must be updated and searched.
  • Complexity can be reduced to O(N² log N) time for some approaches.
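As a quick illustration of the space cost, assuming double-precision entries and a full (non-condensed) matrix:

```python
# Rough memory estimate for the full proximity matrix (8-byte floats assumed)
N = 10_000
bytes_needed = N * N * 8
print(f"{bytes_needed / 1e9:.1f} GB")   # 0.8 GB for 10,000 points
```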

74
Hierarchical Clustering: Problems and Limitations
  • Once a decision is made to combine two clusters, it cannot be undone.
  • No objective function is directly minimized.
  • Different schemes have problems with one or more of the following:
  • Sensitivity to noise and outliers
  • Difficulty handling different sized clusters and convex shapes
  • Breaking large clusters