Mining spatiotemporal cascade patterns - PowerPoint PPT Presentation

About This Presentation
Title:

Mining spatiotemporal cascade patterns

Slides: 35
Provided by: pradee6

Transcript and Presenter's Notes


1
Mining spatio-temporal cascade patterns
2
Outline
  • Motivation
  • Modeling Spatio-Temporal Cascade Pattern
  • Spatio Temporal Cascade Pattern Mining Problem
  • Challenges
  • Related Work
  • Contributions
  • Key Concepts
  • Proposed Approach
  • Validation Methodology
  • Validation of Interest Measures
  • Analytical Validation of Algorithms
  • Conclusion and Further Work

3
Motivation
  • Application domains: crime (crime linkage),
    military (insurgent attack patterns), ecology
    (preservation of endangered species), power
    systems (cascading blackouts)
  • Example: Military
  • Understanding insurgent attack patterns.
  • Understanding global and local trends in attacks.
  • Predicting future locations of attacks.
  • Example: Crime
  • Crime linkage analysis.
  • Relationships between different types of crime
    (drunk driving, hit and run, homicide, shop
    breaking).
  • Example: Power Systems
  • Transmission data analysis.
  • Identifying sequences of faults for a potential
    cascading blackout.
  • Preventing blackouts.

Source: http://www.ferc.gov/EventCalendar/Files/20040414101846-blackout.pps
4
Motivation: Tactical Crime Analysis
  • Geographic profiling: step of scenario
    selection (Rossmo 2006)
  • Scenario selection: the optimal subset of crime
    sites to be profiled.
  • Performs crime series linkage analysis.
  • Requires human intervention and manually
    building separate scenarios.
  • No existing measure for the validity of crime
    linkage.
  • Outliers complicate the problem.
  • Tactical crime analyst questions
  • Can we prune unnecessary data in an unbiased
    way?
  • Are a particular set of crimes part of a single
    series?
  • Are these serial crimes committed by one
    individual or by a group of individuals operating
    together?
  • How can we identify and eliminate outliers?

Source: http://pt.wikipedia.org/wiki/Ataques_a_Tiros_em_Beltway
5
Modeling Spatio-Temporal Cascade Patterns: Iraq
Insurgency Example
  • Cascade patterns
  • Series of attacks by US troops (A).
  • Series of attacks on US troops (B).
  • Series of attacks on civilian facilities (C).
  • Series of suicide attacks (D).
  • Series of communal clashes (E).
  • Attacks on Iraqi troops (F).

Source: www.tribuneindia.com
6
Modeling Spatio-Temporal Cascade Patterns:
Lincoln Crime Sample Dataset
[Figure: snapshots of the dataset at 01:00 AM, 01:30 AM, 02:00 AM and 02:30 AM]
7
Sample Data Records
8
Modeling Spatio-Temporal Cascade Patterns: An
Example Cascade Pattern
  • Output: cascade pattern

Event types: Bar Closing (B), Auto Crime (A),
Assault (C), Vandalism (D), Other Crimes (E)
[Figure: a cascade pattern over event types A to E and its continuous sub-cascades]
9
Spatio-Temporal Cascade Pattern Mining Problem
  • Given
  • A spatio-temporal event database.
  • A set of M spatio-temporal event types.
  • A spatial neighborhood relation S? and a temporal
    neighborhood relation T?.
  • A time window (interval) Tw (> T?).
  • A spatio-temporal co-occurrence prevalence
    threshold.
  • A spatio-temporal link prevalence threshold.
  • A cascade prevalence threshold CP?.
  • Find
  • All spatio-temporal cascade (ST-Cascade)
    patterns with cascade prevalence > CP?,
    where every component spatio-temporal
    co-occurrence has prevalence above the
    co-occurrence prevalence threshold and every
    inter-component link has spatio-temporal link
    prevalence above the link prevalence threshold.
  • Objective
  • Statistical significance of interest measures.
  • Minimize computational cost.
  • Constraints
  • Correctness
  • Completeness
  • Monotonic composite multi-dimensional interest
    measure

10
Challenges
  • Conceptual challenges
  • The spatial footprint of events over time changes
    with the problem setting.
  • A cascade pattern is non-linear (example:
    cascading power failure).
  • Concurrent processes can together form a
    cascade (example: a group of serial criminals
    working together and committing different crimes).
  • Different timing considerations are required to
    capture concurrency, prolonged influence and
    lack of influence.
  • Statistical challenges
  • Cascading behavior is multidimensional (involves
    both space and time): interest measures need to
    capture this aspect.
  • Timing constraints and neighborhood definitions
    vary across problem domains: interest measures
    need to be flexible and correct.
  • Identifying the right statistical measure from
    spatial statistics against which to compare the
    proposed interest measures.
  • Non-linearity and concurrent processes across
    space and time require composite
    multi-dimensional interest measures.
  • Risk of generating spurious patterns.
  • Desirable computational properties such as
    monotonicity.
  • Computational challenges
  • Composite multi-dimensional interest measures are
    computationally complex.
  • The number of patterns is exponential in the
    number of event types.
  • The number of patterns also grows exponentially
    with the timing constraints and the maximum time
    span of the dataset.

11
Classification of Related Work and Proposed
Pattern
12
Modeling ST-Cascade Patterns: Related ST-Patterns
13
Related Work: Topological Patterns (Wang et al.
2005)
  • Topological pattern: co-occurrence of m feature
    types over a spatio-temporal neighborhood.

Example: Bar Closing, Drunk Driving, Hit and
Run, Accident
  • Properties
  • All events occur in the same spatio-temporal
    neighborhood.
  • No concept of separate time intervals.
  • Concept of a space and time neighborhood.
  • Interest measure: prevalence (participation
    index over a space-time neighborhood).

Topological Patterns (Wang et al. 2005) vs. ST Cascade
  • Limitation
  • Does not recognize sequences of different
    spatio-temporal co-occurrences (no ordering
    between different co-occurrences).
  • Example
  • Can catch crime patterns co-occurring in a
    spatio-temporal neighborhood.
  • Cannot identify an ordering over different
    spatio-temporal co-occurrences.

[Figure: an ST Cascade contains spatio-temporal co-occurrences; each spatio-temporal co-occurrence is a topological pattern]
14
Related Work: Sequential Patterns from
Spatio-temporal Event Datasets (Huang et al. 2008)
  • Spatio-temporal sequential pattern: a sequence of
    spatio-temporal events based on a temporal
    ordering.

Example
  • Properties
  • Defines a Follow predicate for time ordering
    using a particular neighborhood N(e).
  • Defines a During predicate for intervals
    (possibly subsets); another neighborhood N1(e)
    must be defined to find subsets.
  • Interest measure: weakly anti-monotonic.
  • Limitation
  • Sequence Index and Density Ratio can use only a
    single type of neighborhood function at a time
    for a particular pattern.
  • Requires a priori specification of the
    respective neighborhood definitions between
    different event types to obtain sequences of
    subsets.
  • For these reasons, Sequence Index cannot be
    used as the significance measure for capturing
    cascades.
  • No notion of a continuous sub-sequence, where
    continuity refers to the presence of common
    subsets of event types.
  • Rules out the presence of both the Follow and
    During relationships between a pair of event
    types.
  • Example
  • Can identify ordered sets of spatio-temporal
    event types, that is, links between different
    types.
  • Can identify links, but cannot identify
    co-occurrences of crime types and hence cannot
    identify an ordering between different
    co-occurring sets.

15
Sequential Patterns from Spatio-temporal Event
Datasets vs. Spatio-temporal Cascade Patterns
[Figure: the pattern found by each approach]
Cannot find both of these from the same dataset.
D and D in the above pattern correspond to
different event instance sets in this case; in
cascades they do not, and in crime data they do
not either.
16
A distinguishing Example
  • Three criteria
  • Neighborhood Relationship
  • Interest Measure
  • Pattern Semantics

17
A distinguishing Example
During
Follow
Sequence Index: different values for During and Follow.
Hence it cannot capture patterns where both exist.
18
Related Work: Collocation Episodes (Cao et al.
2006)
  • Collocation episodes: finding inter-movement
    regularities of different object types.

Example: (Bar Closing, DrunkDriving) → (DrunkDriving,
HitRun)
  • Properties
  • A reference feature is used for defining the
    sequence of episodes.
  • Defined for moving object types.
  • Limitation
  • A reference feature type is required for finding
    such patterns.
  • Trajectories must be known a priori.
  • Example
  • Can catch patterns which have a common event type.
  • Cannot identify patterns that do not have a common
    event type.
  • Assumes that a sequence of episodes is
    connected by a common reference type.

19
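Collocation Episodes vs. Spatio-temporal Cascade Patterns
  • Collocation Episodes
  • Inputs: trajectories of moving objects
  • Constraints: reference feature type D
  • Spatio-temporal Cascade Patterns
  • Inputs: spatio-temporal event database

[Figure: the patterns each approach finds, and a pattern that cannot be found]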
20
Related Work: Spatio-Temporal Cross Correlation
Function (Ma et al. 2006) (ST Cross K Function)
  • Temporal extension to the Cross K Function:
    a statistical measure for quantifying cause-effect
    relationships between different event types.
  • Properties
  • Extends Ripley's K function (Ripley et al. 1976)
    by adding time.
  • One-tail case: events of the other type occurring
    only after a current event.
  • Two-tail case: events before and after the
    current event.
  • Limitations
  • Defined only for pairs of event types.
  • High computational cost for computing the K
    function.
  • Example
  • Can identify only pairs of spatio-temporally
    correlated event types.

[Figure: ST Cross K Function (two-tail and one-tail effects between pairs of event types) compared with ST Cascades over event types A to E]
21
Contributions
  • Modeled spatio-temporal cascade patterns.
  • Defined a monotonic composite interest
    measure: Cascade Prevalence.
  • Proved that the interest measures preserve the
    anti-monotone property of cascades and statistical
    significance.
  • Developed a novel ST Cascade Miner algorithm.

22
Key Concepts -2 Spatio-Temporal Co-occurrence
  • Spatio-Temporal Participation Ratio
  • Spatio-Temporal Prevalence (Spatio-temporal
    Participation Index)
  • A spatio-temporal co-occurrence is prevalent if its
    spatio-temporal prevalence exceeds the co-occurrence
    prevalence threshold.
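
A minimal sketch of how the participation ratio and participation index might be computed, assuming the usual co-location definitions carry over to the spatio-temporal setting: the participation ratio of an event type is the fraction of its instances that take part in at least one instance of the co-occurrence, and the participation index is the smallest ratio over the types in the pattern. The function name, the input layout and the instance ids are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict


def participation_index(pattern_instances, type_counts):
    """Return (ratios, index) for one co-occurrence pattern.

    pattern_instances: list of dicts mapping event type -> instance id,
        one dict per instance of the co-occurrence (assumed layout).
    type_counts: dict mapping event type -> total number of instances
        of that type in the database.
    """
    participating = defaultdict(set)
    for instance in pattern_instances:
        for event_type, instance_id in instance.items():
            participating[event_type].add(instance_id)

    # Participation ratio: distinct participating instances / all instances.
    ratios = {
        event_type: len(ids) / type_counts[event_type]
        for event_type, ids in participating.items()
    }
    # Participation index: the weakest ratio bounds the whole pattern.
    index = min(ratios.values()) if ratios else 0.0
    return ratios, index


# Hypothetical example: three instances of the co-occurrence {A, D}.
instances = [
    {"A": "a1", "D": "d1"},
    {"A": "a1", "D": "d2"},
    {"A": "a2", "D": "d3"},
]
ratios, index = participation_index(instances, {"A": 4, "D": 5})
```

Here two of four A instances and three of five D instances participate, so the index is the smaller of 2/4 and 3/5.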

23
Key Concepts -3
  • A Spatio-Temporal Link Relationship is an
    ordering between spatio-temporal co-occurrences
    or spatio-temporal event types.
  • B and A are event types which
  • Satisfy a spatial neighborhood relation S?
  • Are within a time window Tw
  • Do not satisfy a time neighborhood relation T?
  • Definition 1: Spatio-temporal Link Prevalence of
    a pattern

min( (# of instances of D satisfying a spatial
neighborhood relationship with B and occurring
after B) / (total # of instances of D), (# of
instances of B in the spatial neighborhood of D and
occurring before D) / (total # of instances of B) )
min(4/4, 3/5) = 3/5
  • Definition 2: Prevalent Spatio-temporal Cascade

[Figure: ST cascades annotated with the value 3/5; the prevalent ST cascades are marked]
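
The link prevalence in Definition 1 can be sketched as follows, assuming the instance-level links (a D instance in the spatial neighborhood of a B instance and occurring after it) have already been found. The function name and the instance ids are hypothetical; the counts are chosen so that the result matches the 4/4 and 3/5 values of the slide's example.

```python
def link_prevalence(links, n_source, n_target):
    """Spatio-temporal link prevalence of a link source -> target.

    links: iterable of (source_id, target_id) pairs, one per
        instance-level link (assumed to be precomputed).
    n_source, n_target: total number of instances of each event type.
    """
    links = list(links)
    linked_sources = {source for source, _ in links}
    linked_targets = {target for _, target in links}
    return min(len(linked_targets) / n_target,
               len(linked_sources) / n_source)


# Hypothetical instances: 5 of type B, 4 of type D.
b_to_d = [("b1", "d1"), ("b1", "d2"), ("b2", "d3"), ("b3", "d4")]
prevalence = link_prevalence(b_to_d, n_source=5, n_target=4)
```

All four D instances follow some B instance (4/4), but only three of the five B instances are followed by a D instance (3/5), so the minimum is 3/5.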
24
Key Concepts -4
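  • Definition 3: Cascade Prevalence of an ST Cascade

[Figure: two sub-patterns intersecting at their common instances of B]
Cascade Prevalence = (instances of B in [pattern figure]) / (total
instances of B),
if prevalent([pattern figure]) and prevalent([pattern figure])

[Figure: two sub-patterns intersecting at their common instances of E]
Cascade Prevalence = (instances of E in [pattern figure]) / (total
instances of E),
if prevalent([pattern figure]) and prevalent([pattern figure])

[Figure: sub-patterns intersecting at common instances of both E and B]
Cascade Prevalence = min( (instances of E in [pattern figure]) /
(total instances of E), (instances of B in [pattern figure]) / (total
instances of B) ),
if prevalent([pattern figure]) and prevalent([pattern figure])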
In all of the above cases Tw > T?. What happens if Tw = T? instead?
25
Key Concepts -4 (contd , Examples)
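If Tw = T?, then we have the following (or the patterns are just
connected by a common type):

[Figure: sub-patterns over event types A, D, B and C sharing D]
If prevalent([pattern figure]) and prevalent([pattern figure]):
Cascade Prevalence = (instances of D in [pattern figure]) / (total
instances of D)

[Figure: sub-patterns over event types E and B sharing B]
If prevalent([pattern figure]) and prevalent([pattern figure]):
Cascade Prevalence = (instances of B in [pattern figure]) / (total
instances of B)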
26
Key Concepts -4 (contd, Examples)
[Figure: sub-patterns over event types A, D, B and C]
Cascade Prevalence = min( (# instances of D in [pattern figure]) /
(total instances of D), (# instances of B in [pattern figure]) /
(total instances of B) )
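
One possible reading of Definition 3, sketched under stated assumptions: for each event type shared by two prevalent sub-patterns, take the instances of that type that appear in both sub-patterns, divide by the total number of instances of the type, and take the minimum over the shared types. The function names, the input layout and the instance ids are hypothetical and not taken from the slides.

```python
def shared_type_ratio(ids_in_first, ids_in_second, total):
    """Fraction of a shared type's instances present in both sub-patterns."""
    common = set(ids_in_first) & set(ids_in_second)
    return len(common) / total


def cascade_prevalence(shared):
    """Minimum shared-type ratio over all event types shared by two sub-patterns.

    shared: dict mapping event type ->
        (ids in first sub-pattern, ids in second sub-pattern, total count).
    """
    return min(
        shared_type_ratio(first, second, total)
        for first, second, total in shared.values()
    )


# Hypothetical cascade whose two sub-patterns share types B and D.
shared = {
    "B": ({"b1", "b2", "b3"}, {"b2", "b3", "b4"}, 5),   # 2 common of 5
    "D": ({"d1", "d2", "d3"}, {"d1", "d2", "d3", "d4"}, 4),  # 3 common of 4
}
cp = cascade_prevalence(shared)
```

With a single shared type the minimum reduces to the single-ratio cases of the definition.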


27
Evaluation of Interest Measures
Properties
  • Monotonicity
  • Spatio-temporal prevalence (PI)
  • Link prevalence.
  • Cascade Prevalence (CP)
  • Statistical significance with respect to the ST
    Cross K Function (Ma et al. 2006)
  • Spatio-temporal prevalence (PI)
  • Link prevalence is defined only between 2 types.
  • Cascade Prevalence (CP)

28
Evaluation of Interest Measures
Properties
  • Lemma 1
  • Spatio-temporal Prevalence is Monotonic.

[Figure: illustration with values 2/5, 2/5, 1 and 2/5]
Spatio-temporal Link Prevalence is similar to
Spatio-temporal Prevalence with a different time
window and hence has the same properties.
  • Lemma 2
  • Cascade Prevalence is monotonic.

[Figure: illustration with values 3/5, 3/5 and 2/5]
29
Spatiotemporal prevalence is an upper bound to ST
K-Function - Illustration
30
Spatio-Temporal Link Prevalence is an upper bound
to ST K-Function - Illustration
31
Cascade Prevalence is an upper bound to ST
K-Function - Illustration
32
Proposed Approach: Naïve Algorithm
  • Step 1: Preprocessing. Create the instance-level
    graph.
  • Step 2: Candidate generation
  • Candidate co-occurrences
  • Candidate spatio-temporal links
  • Candidate continuous single-link cascades
  • Step 3: Repeatedly merge all continuous (k-1)-link
    cascades to form continuous k-link
    cascades.
  • Step 4: Pruning step
  • Limitations
  • Step 2: no pruning.
  • Step 3: merges all (k-1)-link cascades.
  • Step 4: looks at all candidates.
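
Step 1 could look roughly like the sketch below. It assumes a distance-based spatial neighborhood (`s_radius`), a temporal neighborhood given as a maximum time gap (`t_nbr`), and a larger time window (`t_window`); these parameter names, the tuple layout and the sample events are illustrative assumptions. Pairs that are close in space and time become co-occurrence edges; pairs that are close in space, outside the temporal neighborhood, but still inside the window become directed link edges from the earlier event to the later one.

```python
from math import hypot


def build_instance_graph(events, s_radius, t_nbr, t_window):
    """Return (co_edges, link_edges) of an instance-level graph.

    events: list of (event_id, event_type, x, y, t) tuples.
    co_edges: undirected pairs that are spatial and temporal neighbors.
    link_edges: directed pairs (earlier, later) that are spatial neighbors,
        lie within the time window, but are not temporal neighbors.
    """
    co_edges, link_edges = [], []
    for i, (id_a, _, xa, ya, ta) in enumerate(events):
        for id_b, _, xb, yb, tb in events[i + 1:]:
            if hypot(xa - xb, ya - yb) > s_radius:
                continue  # not spatial neighbors
            gap = abs(ta - tb)
            if gap <= t_nbr:
                co_edges.append((id_a, id_b))
            elif gap <= t_window:
                # Orient the link from the earlier to the later event.
                link_edges.append((id_a, id_b) if ta < tb else (id_b, id_a))
    return co_edges, link_edges


# Hypothetical records: times in minutes, coordinates in arbitrary units.
events = [
    ("b1", "B", 0.0, 0.0, 60),
    ("a1", "A", 0.5, 0.0, 80),
    ("d1", "D", 0.5, 0.0, 150),
    ("c1", "C", 10.0, 10.0, 70),
]
co_edges, link_edges = build_instance_graph(events, s_radius=1.0, t_nbr=30, t_window=120)
```

The pairwise loop is quadratic in the number of events, which is in keeping with a naive baseline; a spatial index would be the obvious refinement.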

33
Proposed Approach: Naïve Algorithm
  • Inputs
  • All of the inputs in the problem definition
  • Output
  • All prevalent cascade patterns
  • Pseudocode
  • preprocessing step
  • generate_all_instance_co_occurrences_applying
    S? and T?.
  • generate_all_instance_links_applying
    S? and Tw.
  • mining step
  • candidate generation
  • generate all event type
    co-occurrences
  • generate all event type
    links
  • generate all continuous sub-cascades
    with one link and add them to set S.
  • merging step
  • initialize k = 2
  • repeat while S != null set
  • C = all (k-1)-linked
    continuous sub-cascades
  • repeat while C != null
    set
  • merge all
    continuous (k-1)-linked sub-cascades
    with one another
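
The merging and pruning steps of the naive approach can be sketched as below. This is a simplified illustration, not the authors' implementation: a cascade is represented as a frozenset of directed links between hashable node labels, "continuous" is taken to mean that the links form one connected piece, and `cascade_prevalence` is a caller-supplied (hypothetical) scoring function. As in the naive algorithm, nothing is pruned until every candidate has been generated.

```python
from itertools import combinations


def is_continuous(links):
    """True if the links form a single connected piece (direction ignored)."""
    links = list(links)
    if not links:
        return False
    seen = set(links[0])
    remaining = links[1:]
    changed = True
    while changed:
        changed = False
        for link in list(remaining):
            if seen & set(link):
                seen |= set(link)
                remaining.remove(link)
                changed = True
    return not remaining


def naive_cascade_miner(links, cascade_prevalence, threshold):
    """Generate every continuous cascade level by level, then prune once."""
    level = {frozenset([link]) for link in links}  # single-link cascades
    candidates = set(level)
    while level:
        next_level = set()
        # Merge every pair of (k-1)-link cascades that differ by one link.
        for first, second in combinations(level, 2):
            merged = first | second
            if len(merged) == len(first) + 1 and is_continuous(merged):
                next_level.add(merged)
        candidates |= next_level
        level = next_level
    # Pruning step: looks at all candidates, as the slide notes.
    return {c for c in candidates if cascade_prevalence(c) > threshold}


# Hypothetical links between event types and a toy prevalence function.
links = [("B", "A"), ("A", "D"), ("E", "C")]
result = naive_cascade_miner(links, lambda c: 1.0 / len(c), threshold=0.4)
```

Because each level adds exactly one link and the set of links is finite, the loop terminates; the cost is that the candidate set can grow exponentially before any pruning happens, which is the limitation the previous slide lists.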

34
Conclusions and Further Work
  • Optimized algorithm for Mining ST Cascades
  • Proof of Correctness and Completeness.
  • Experimental Evaluation.