Title: Mining geospatial temporal patterns from soccer games
1Mining geo-spatial temporal patterns from soccer
games
- Paul Lesov G2
- 8715 Spatial Databases
- University of Minnesota
- Fall 2007
2Outline
- Motivation
- Problem Statement
- Challenges
- Scope/Limitations
- Contribution
- Spatial Indexing
- Relational Model
- Temporal event handling
- Demonstration
- Future Work
- Conclusion
3Motivation
- Importance
- Soccer is the most popular game in the world
- Soccer has a great impact on many regional
economies
- Soccer data is used for coaching, marketing,
medicine, AI simulations and betting odds
- Finding hidden trends in the collected data can
lead to improved coaching, injury prevention,
targeted marketing, better prediction and more
realistic game simulations
- Goal: Provide a database framework that supports
spatial autocorrelation analysis of a soccer game,
which would in turn allow applicable statistical
methods to find useful patterns and nuggets
contained within
4Problem Statement
- Current Data Mining Limitations
- Today trends are data mined from the non-spatial
aspects of the game (player and team statistics)
- Spatial-centric game play information is obtained
as needed by reviewing game video
- Data mining and aggregation of spatial aspects
over multiple games is not possible
- Example: A team owner may be considering
expanding the size of the pitch and may want to
know whether team performance will suffer as a
result of such a move. A question such as "During
the last 3 seasons, was our team more successful at
(passing/dribbling/shooting/scoring) when
attacking through the middle or the flanks?"
cannot be easily answered quantitatively today.
5Challenges
- Manual data collection is time consuming
- Advances in obtaining spatial data from video
footage
- GPS device embedded in the ball and, in the
future, possibly in players' clothes
- Spatial-Temporal Aspects
- Lack of a defined standard for capturing temporal
relations between spatial entities
- Implementation Issues
- OGIS extensions to popular free databases are not
fully standardized, and support for individual
functions varies to a large degree
- Development was done using both MySQL and
PostgreSQL so that the overlap between the
two providers covers all needed functionality
6Contribution Spatial Indexing
- Key Concepts
- Distance Preserving Fixed Spatial Grid
- well defined, static boundaries
- data is evenly distributed
- column-based ordering for 60 10x10 m quadrants
emphasizes the importance of horizontal movement
down/up the field
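As a rough illustration of the fixed grid, the sketch below maps a ball position to a cell number. It is only one plausible reading of the slide: the 100 m x 60 m pitch size (which yields 60 cells of 10x10 m), the coordinate origin at a corner, and the function name `cell_id` are all assumptions, and "column-based ordering" is interpreted here as numbering the cells column by column.

```python
CELL_SIZE_M = 10       # from the slide: 10x10 m cells
PITCH_LENGTH_M = 100   # assumption: a 100 m x 60 m pitch gives 60 cells
PITCH_WIDTH_M = 60     # assumption
N_COLS = PITCH_LENGTH_M // CELL_SIZE_M  # 10 columns along the pitch length
N_ROWS = PITCH_WIDTH_M // CELL_SIZE_M   # 6 rows across the pitch width


def cell_id(x, y):
    """Return the column-major cell number (0..59) for a point given in metres.

    x runs along the pitch length, y across its width (assumed convention).
    """
    if not (0 <= x <= PITCH_LENGTH_M and 0 <= y <= PITCH_WIDTH_M):
        raise ValueError("point lies outside the pitch")
    # Points exactly on the far boundary are assigned to the last cell.
    col = min(int(x // CELL_SIZE_M), N_COLS - 1)
    row = min(int(y // CELL_SIZE_M), N_ROWS - 1)
    return col * N_ROWS + row
```

Because the boundaries are static, the cell number can be computed once when a position is stored and then used as an ordinary indexed column.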
7Contribution ER
[ER diagram; entities are grouped as Spatial Non-Temporal, Spatial Temporal and Non-Spatial Temporal]
8Contribution Temporal
- The Event entity maintains all the temporal aspects
for a single game. It has no spatial
representation and serves as an identifier for the
spatial entities Pass, Dribble and Shot. A spatial
Possession entity is a collection of the Events'
spatial derivatives.
INPUT: EVENT E, PASS P, SHOT S, DRIBBLE D
OUTPUT: POSSESSION PO
Initialize id = 0, geo_collection_array(), j = 0
For each E
  If id.time_end = (id+1).time_start
    If p.eventid
      Add p.geom to geo_collection_array
    else if d.eventid
      Add d.geom to geo_collection_array
    else if s.eventid
      Add s.geom to geo_collection_array
  else
    j++
    copy geo_collection_array to PO
    PO.id = j
    update possession with PO.id
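The grouping idea behind the pseudocode above can be sketched in Python. This is a minimal sketch under stated assumptions, not the prototype's implementation: the `Event` class, its field names, the time unit, and the choice to store each geometry directly on the event (rather than in separate Pass, Dribble and Shot tables) are hypothetical simplifications. It assumes that consecutive events belong to the same possession when one ends exactly as the next begins, and it starts a fresh geometry collection for every new possession.

```python
from dataclasses import dataclass


@dataclass
class Event:
    id: int
    time_start: float  # assumed unit: seconds from kickoff
    time_end: float
    geom: tuple        # hypothetical: ((x1, y1), (x2, y2)) of the pass, dribble or shot


def group_possessions(events):
    """Group events into possessions by temporal adjacency.

    Returns a list of (possession_id, [geom, ...]) pairs, numbered from 1.
    """
    possessions = []
    current = []       # geometries collected for the possession in progress
    prev_end = None
    for ev in sorted(events, key=lambda e: e.time_start):
        # A gap between the previous end and this start closes the possession.
        if current and ev.time_start != prev_end:
            possessions.append((len(possessions) + 1, current))
            current = []
        current.append(ev.geom)
        prev_end = ev.time_end
    if current:
        possessions.append((len(possessions) + 1, current))
    return possessions
```

For example, three events spanning 0-3 s, 3-5 s and 8-10 s would produce two possessions: the first holding two geometries, the second holding one.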
9Scope/Limitations
- Manual Data Collection
- Limited data set prevented us from multi-game
pattern mining
- Only the ball trajectory is collected, not 22 players
and 3 officials
- Limitation on knowing kinematics of each player
and players' spatial relations to each other
- Game play elements whose identification is not
supported by the database schema
- Fouls
- Injuries
- Set pieces
- Temporal data resides independently of spatial
possession data
- Similarity identification queries are not
supported (such as "show all possessions that
followed the same path as this possession")
10Validation/Demonstration
A prototype is accessible at http://65.41.192.13/g
soccer.htm
It utilizes colors, shapes and
different line types for the resulting visuals.
Example: Show all long (over 20 meters) passes
into the penalty area between the 60th and 70th minute.
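To make the example query concrete, the following sketch filters a list of passes held in memory. It does not reproduce the prototype's database queries. The pitch dimensions, the tuple layout `(minute, start, end)`, the function names, and the reading of "into" as "starting outside and ending inside a penalty area" are assumptions; the penalty-area size is the standard 16.5 m x 40.32 m.

```python
import math

PITCH_LENGTH_M, PITCH_WIDTH_M = 100.0, 60.0     # assumed pitch size
PENALTY_DEPTH_M, PENALTY_WIDTH_M = 16.5, 40.32  # standard penalty area


def in_penalty_area(point):
    """Return True if the point lies inside either penalty area."""
    x, y = point
    y_min = (PITCH_WIDTH_M - PENALTY_WIDTH_M) / 2
    in_width = y_min <= y <= y_min + PENALTY_WIDTH_M
    in_depth = x <= PENALTY_DEPTH_M or x >= PITCH_LENGTH_M - PENALTY_DEPTH_M
    return in_width and in_depth


def long_passes_into_box(passes, min_length=20.0, from_minute=60, to_minute=70):
    """Select passes longer than min_length that end inside a penalty area,
    start outside it, and fall within the given minute window."""
    selected = []
    for minute, start, end in passes:
        length = math.dist(start, end)  # straight-line pass length in metres
        if (from_minute <= minute <= to_minute
                and length > min_length
                and in_penalty_area(end)
                and not in_penalty_area(start)):
            selected.append((minute, start, end))
    return selected
```

In a spatial database the same conditions would typically be expressed as a length predicate, a containment predicate against the penalty-area polygon, and a range condition on the event time.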
11Validation/Demonstration
Example: Show all possessions for team Germany
that resulted in loss of possession from
dribbles into the opponent's penalty area.
12Future Work
- Automate data collection
- Utilize emerging video stream player
disambiguation technology or GPS tracking
- Extend database schema to support
- Fouls, injuries and set pieces (easy)
- Kinematics of 22 players and 3 officials (harder, as
player actions are not easily identifiable and
players have complex trajectories whose temporal
aspects must be preserved)
- Allow similarity identification queries
- Most possessions can be approximated as
polynomial curves and indexed with a
PA-Tree
13Conclusion
- We have created a prototype for storing
spatial-temporal data contained within a soccer
game by providing
- ER model
- distance preserving fixed grid
- functional approach to temporal issues
- working demonstration
- Mining this data over the course of many games
may yield much insight into the game and can
lead to improved coaching, injury prevention,
targeted marketing, better prediction and more
realistic game simulations
14Questions?
---THANK YOU!