Title: Long Term Data Base
1 Long Term Data Base and Data Navigator II
- ESPON SEMINAR
- 14-15 November 2006
- Dipoli Conference Centre, Espoo, Finland
- Joël Boulier, Claude Grasland
  - Laboratoire Géographie-cités (Paris), HYPERCARTE Research Group
- Marc Guerrien, Nicolas Lambert
  - CNRS UMS RIATE (Paris), HYPERCARTE Research Group
- Jérôme Gensel, Bogdan Moisuc, Marlène Villanova-Oliver
  - Laboratoire LSR-IMAG (Grenoble), HYPERCARTE Research Group
2 Context
- ESPON 3.2 Programme (European Commission 2002-2006)
  - Long Term (ESPON) DataBase
    - Provide scenario builders with quantitative inputs on selected topics at regional level for the years 1980, 2000 and 2020
    - Try to establish a sustainable framework for the ESPON Database in the future ESPON II, taking into account the various problems encountered (missing values, changing territorial units, etc.)
- HYPERCARTE Research Group
  - 2 research labs in Geography: Géographie-Cités and UMS RIATE
  - 2 research labs in Computer Science: LSR-IMAG and ID-IMAG
  - Goals: advanced methods and tools for spatial analysis
  - Involved in ESPON 3.1, 3.4.3 and 3.2
  - Software
    - Available: HyperAtlas (application: ESPON HyperAtlas, ESPON 3.1)
    - Soon to come: HyperAdmin, HyperSmooth, LTDB
3 LTDB Objectives
- Objective 1
  - Provide a framework for long-term storage of thematic and geometric data for the territorial units composing a given area, at different levels
  - This implies tackling several issues
    - Evolvability: rely on a flexible schema
    - Data quality: keep track of the quality of the data
    - Usability: make it usable by people other than its designers, possibly as a shared resource
- Objective 2
  - Provide a framework for a reliable estimation of missing indicator values
    - To fill informational gaps
    - To simulate hypothetical past and/or future situations
  - This implies designing several components
    - A set of generalized estimation methods
    - A set of generalized estimation strategies
    - A mechanism for evaluating the quality of the estimated data
4 ESTImate
- Postulate: all statistical information managed by the LTDB can be described according to four dimensions E, S, T and I
  - (E)space: the spatial unit to which the statistical information is attached
  - (S)ource: the statistical institute which has produced the information
  - (T)ime: a period or an instant which dates the information
  - (I)ndicator: a thematic definition of the variable
- And then come more general problems
  - Instability of the administrative structures: the name and/or the boundary of E may have changed at some point
    - W. Germany + E. Germany → Germany
    - Czechoslovakia → Czech Republic + Slovakia
    - Côtes du Nord → Côtes d'Armor
    - Isère 1960 ≠ Isère 2006 and Rhône 1960 ≠ Rhône 2006
  - Heterogeneity of the sources: the source S does not provide any value for the given (E,T,I)
  - Missing values: at time T, no value for E and I, whatever S
- How to cope with reality?
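The four-dimension postulate can be pictured with a minimal Python sketch. It is not the LTDB implementation: the `Hypercube` class, the labels `source_gap` and `missing_value`, and the unit, source and value used below are all hypothetical, chosen only to show how a value is addressed by (E, S, T, I) and how the two kinds of gaps listed above differ.

```python
from typing import Dict, List, Optional, Tuple

# A cell of the hypercube is addressed by (E, S, T, I):
# spatial unit, source, time and indicator.
Key = Tuple[str, str, int, str]


class Hypercube:
    """Sparse store of indicator values; absent keys are the holes."""

    def __init__(self) -> None:
        self._cells: Dict[Key, float] = {}

    def put(self, e: str, s: str, t: int, i: str, value: float) -> None:
        self._cells[(e, s, t, i)] = value

    def get(self, e: str, s: str, t: int, i: str) -> Optional[float]:
        return self._cells.get((e, s, t, i))

    def sources_for(self, e: str, t: int, i: str) -> List[str]:
        """Sources that do provide a value for the given (E, T, I)."""
        return sorted(cs for (ce, cs, ct, ci) in self._cells
                      if (ce, ct, ci) == (e, t, i))

    def gap_kind(self, e: str, s: str, t: int, i: str) -> str:
        """Classify a cell as known, a source gap, or a missing value."""
        if (e, s, t, i) in self._cells:
            return "known"
        if self.sources_for(e, t, i):
            # Another source has the value: heterogeneity of the sources.
            return "source_gap"
        # No source at all has it: a true missing value.
        return "missing_value"


cube = Hypercube()
cube.put("unit_A", "source_1", 2000, "population", 100.0)  # placeholder value
print(cube.gap_kind("unit_A", "source_2", 2000, "population"))  # source_gap
print(cube.gap_kind("unit_A", "source_1", 1980, "population"))  # missing_value
```

Keeping the store sparse means a hole is simply an absent key, so the estimation step only has to look at cells that are actually requested.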
5 LTDB General Architecture
6 LTDB Architecture Components
- Geographic Ontology: a gazetteer containing names of geographic entities and some relations between them
- Indicator Ontology: a classification hierarchy of the themes and indicators with some relations between them (aggregation, broader term, etc.)
- Indicator Formulae Knowledge Base: a set of mathematical rules for calculating new indicators using existing ones
- Method Hierarchy: a classification of estimation methods
- Estimation Strategy Knowledge Base: a set of rules allowing the system to choose the most appropriate estimation method in a given situation
- Spatio-Temporal Database: a relational database containing the whole set of geographic entities with their known indicator values
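As an illustration of the Indicator Formulae Knowledge Base, the sketch below stores rules as plain Python data and derives a new indicator from existing ones. The rule table `FORMULAE`, the function `derive` and the indicator names are hypothetical; the slides state that this knowledge base is to be built in AROM, which this sketch does not attempt to reproduce.

```python
from typing import Callable, Dict, Optional, Tuple

# Each rule: derived indicator -> (names of input indicators, formula).
# The rules are assumed to be acyclic and the denominators non-zero.
Formula = Tuple[Tuple[str, ...], Callable[..., float]]

FORMULAE: Dict[str, Formula] = {
    "density": (("population", "area"), lambda pop, area: pop / area),
}


def derive(indicator: str, known: Dict[str, float]) -> Optional[float]:
    """Return a known value, or compute it from a rule, or None."""
    if indicator in known:
        return known[indicator]
    if indicator not in FORMULAE:
        return None
    inputs, formula = FORMULAE[indicator]
    values = [derive(name, known) for name in inputs]
    if any(v is None for v in values):
        return None  # an input is missing: the rule cannot be applied
    return formula(*values)


# Placeholder values, not real statistics.
print(derive("density", {"population": 100.0, "area": 50.0}))  # 2.0
print(derive("density", {"population": 100.0}))                # None
```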
7 LTDB Schema
8 Estimation Methods
- E, S, T and I define a hypercube of information with holes (missing values)
- We need ESTImation methods
  - To fill in missing values of the past
  - To predict future values
- So far, (simple) ESTImation methods have been proposed
  - Estimation methods based on one dimension: E, S, T or I
  - Estimation methods based on two (or more) dimensions: ES, SI, ET
- The Method Hierarchy and the Estimation Strategy Knowledge Base will be designed to extend this set of methods
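A one-dimension method along T can be sketched as follows. The slides do not say which methods were actually proposed, so linear interpolation (and linear extrapolation from the two nearest known years) is an assumption made here for illustration, as are the function name `estimate_along_time` and the sample values.

```python
from typing import Dict, Optional


def estimate_along_time(series: Dict[int, float], year: int) -> Optional[float]:
    """Estimate a value for `year` from the known years of one (E, S, I).

    Interpolates between the two surrounding known years, or extrapolates
    from the two nearest known years when `year` lies outside their range.
    """
    if year in series:
        return series[year]
    years = sorted(series)
    if len(years) < 2:
        return None  # not enough information along T alone
    earlier = [y for y in years if y < year]
    later = [y for y in years if y > year]
    if earlier and later:
        y0, y1 = earlier[-1], later[0]    # fill a hole in the past
    elif earlier:
        y0, y1 = years[-2], years[-1]     # predict a future value
    else:
        y0, y1 = years[0], years[1]       # extend backwards
    slope = (series[y1] - series[y0]) / (y1 - y0)
    return series[y0] + slope * (year - y0)


known = {1980: 100.0, 2000: 120.0}         # placeholder values
print(estimate_along_time(known, 1990))    # 110.0
print(estimate_along_time(known, 2020))    # 140.0
```

A method based on two dimensions (for example ET) would additionally use the series of neighbouring or similar spatial units; that case is not covered by this sketch.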
9 LTDB First and Future Developments
- A first prototype has been developed
  - Implementation of the database schema in the open-source POSTGRES DBMS
  - Data acquisition mechanisms in Java
  - The LTDB framework imports and exports data files in various formats (Excel, DBF...)
10 LTDB First and Future Developments
- Short term
  - Estimation methods hierarchy using AROM (an Object-Based Knowledge Representation System)
  - Indicator formulae knowledge base in AROM
  - Estimation strategy knowledge base with AROMTasks
  - Test and validation through an incremental approach: start with an example with a small set of indicators and some estimation methods, adding more indicators later
- Mid term
  - LTDB, as a research project of the HYPERCARTE Research Group, will be integrated into the HyperAtlas and HyperAdmin software
  - In the case of the ESPON HyperAtlas, this will allow the visualization (by simply moving a cursor on a time line, for instance) of the evolution of the ratio of two indicators through past, present, and also future time
- Long term
  - In order to perform simulations that validate different scenarios, LTDB will integrate estimation methods relying on different parameters which convey the tendencies, hypotheses and assumptions corresponding to these scenarios
11 Data Navigator II
- General objective: produce a handbook on data acquisition and harmonization, with a focus on the themes investigated by the ESPON Program
- Applied research
  - This project can be seen as an application and a validation test for the LTDB structure through European databases (ESPON DB, ...)
12 Work in Progress
- Three work packages have been defined
  - WP1: Use and practices of data collection in ESPON I
    - Short survey on practices of some TPGs (IGEAT)
    - Problem of national data collections (TIGRIS)
  - WP2: Choice of data model for ESPON II
    - Practical example of data integration between environmental and socio-economic data (Géographie-cités, LSR-IMAG)
    - Choice of the best solution for data modeling and data integration (Géographie-cités, LSR-IMAG)
  - WP3: Handbook for data collection
    - Practical rules for harmonization (time and space, thematic harmonization)
    - Practical rules for the use of national sources
    - Recommendations for ESPON II: one or two databases
13 Integration of environmental and socio-economic data
- Question: how many m² of forest are accessible for a European citizen?
- Map layers: NUTS23_99, CLC00_forest
14 Integration of environmental and socio-economic data
Forest area per inhabitant in 2000
15 Integration of environmental and socio-economic data
Potential of forest area per inhabitant within a 10 km radius in 2000
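The difference between the two maps (a plain ratio per unit versus a potential within a 10 km radius) can be sketched in Python. The slides do not describe the computation, so this sketch makes several assumptions: each unit is reduced to a centroid in planar kilometre coordinates, and the neighbourhood is a uniform disc. The function name `potential_ratio` and the sample units are hypothetical.

```python
import math
from typing import Dict, Tuple

# (x_km, y_km, forest_m2, population) for each spatial unit.
Unit = Tuple[float, float, float, float]


def potential_ratio(units: Dict[str, Unit],
                    radius_km: float = 10.0) -> Dict[str, float]:
    """Forest area per inhabitant, both summed over a disc around each unit."""
    result: Dict[str, float] = {}
    for name, (x, y, _, _) in units.items():
        forest = population = 0.0
        for ox, oy, other_forest, other_pop in units.values():
            if math.hypot(ox - x, oy - y) <= radius_km:
                forest += other_forest
                population += other_pop
        result[name] = forest / population if population > 0 else float("nan")
    return result


# Placeholder units: "b" has no forest of its own but lies 5 km from "a".
units = {
    "a": (0.0, 0.0, 1000.0, 10.0),
    "b": (5.0, 0.0, 0.0, 10.0),
    "c": (50.0, 0.0, 300.0, 10.0),
}
print(potential_ratio(units))  # {'a': 50.0, 'b': 50.0, 'c': 30.0}
```

With the plain ratio, unit "b" would show 0 m² per inhabitant. The potential assigns it the forest reachable in the neighbourhood, which is closer to the question of accessibility.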
16 How far do we get?
- The ESPON Database in its current structure is a repository for a huge set of European indicators
- The Long Term Database relies on a structured schema designed for the import of different kinds of indicators (different sources, different grids, different census times, ...)
- Two different approaches (from philosophical and technical points of view)
- During the development of LTDB, we have found that extracting and importing indicators from a data source is not a trivial task (ESPON Database included)
17 Coupling LTDB and ESPON Database?
- The acquisition process (values from the ESPON Database) has to be automated
  - A wrapper dedicated to the ESPON Database, which will enable the import of incomplete sets of indicators from the ESPON Database into LTDB
  - This wrapper exploits a meta-description of the schema of the imported data source (in this case, the ESPON Database)
- To update the LTDB with ESPON data, a complete description of the structure of the ESPON Database is needed
- Cooperation between the authors of the LTDB and the ESPON Database will be beneficial for both tools in the future
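The idea of a wrapper driven by a meta-description can be sketched as follows. The real structure of the ESPON Database is not given in the slides, so the column names in `META`, the missing-value markers and the sample rows are all hypothetical; the point is only that the import code reads the layout from a description instead of hard-coding it.

```python
import csv
import io
from typing import Dict, List, Tuple

# Hypothetical meta-description of one export of the source database.
META: Dict[str, object] = {
    "source": "ESPON Database",
    "unit_column": "nuts_code",
    "year_column": "year",
    "indicator_columns": ["population", "gdp"],
    "missing_markers": {"", "NA"},
}

Record = Tuple[str, str, int, str, float]  # (E, S, T, I, value)


def import_rows(text: str, meta: Dict[str, object]) -> List[Record]:
    """Turn a CSV export into (E, S, T, I, value) records, skipping holes."""
    records: List[Record] = []
    for row in csv.DictReader(io.StringIO(text)):
        for indicator in meta["indicator_columns"]:
            raw = (row.get(indicator) or "").strip()
            if raw in meta["missing_markers"]:
                continue  # incomplete sets are accepted; the hole is kept
            records.append((row[meta["unit_column"]], meta["source"],
                            int(row[meta["year_column"]]), indicator,
                            float(raw)))
    return records


sample = "nuts_code,year,population,gdp\nX1,2000,100,NA\nX2,2000,,5.5\n"
for record in import_rows(sample, META):
    print(record)
```

Importing another source would then mean writing a new description rather than a new wrapper, provided that source fits the same one-row-per-unit-and-year layout.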
18 Long Term Data Base and Data Navigator
- Thank You for Your Attention
- Questions?
- Joël Boulier, Claude Grasland
  - joel.boulier, claude.grasland_at_parisgeo.cnrs.fr
- Marc Guerrien, Nicolas Lambert
  - marc.guerrien, nicolas.lambert_at_ums-riate.org
- Jérôme Gensel, Bogdan Moisuc, Marlène Villanova-Oliver
  - jerome.gensel, bogdan.moisuc, marlene.villanova_at_imag.fr
19 LTDB Schema
Annotations on the schema diagram:
- Temporal name
- Temporal object with an internal identifier
- Proximity/similarity measure matrix
- Temporal value of some indicator for some GU
- Indicators
- Temporal code system
- Reliability of the source
- Temporal spatial representation
- Temporal splitting and merging of GUs
- Composition relation between GUs, depends on the hierarchy
- Code depending on a code system
- Database or process genealogy of a value
- Temporal hierarchical organization of GUs
- Providing organism
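Recording the merging of GUs makes it possible to express older values in the current geometry. The sketch below is a simplified illustration, not the LTDB schema: the `MERGES` table and the function `value_in_current_geometry` are hypothetical, the values are placeholders, and summing the parts is only valid for additive indicators such as counts. Splits, which need apportioning weights, are not covered.

```python
from typing import Dict, List, Optional

# Hypothetical record of merge events: current unit -> former units.
MERGES: Dict[str, List[str]] = {
    "Germany": ["W. Germany", "E. Germany"],
}


def value_in_current_geometry(unit: str,
                              values: Dict[str, float]) -> Optional[float]:
    """Value of an additive indicator for `unit`, rebuilt from its parts."""
    if unit in values:
        return values[unit]
    parts = MERGES.get(unit)
    if parts is None:
        return None
    part_values = [values.get(part) for part in parts]
    if any(v is None for v in part_values):
        return None  # one part is unknown: leave the hole for estimation
    return sum(part_values)


# Placeholder values for a count indicator, not real statistics.
old_values = {"W. Germany": 3.0, "E. Germany": 2.0}
print(value_in_current_geometry("Germany", old_values))  # 5.0
```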
20 Open Questions
- Evolution of the conceptual model of the LTDB?
  - Terminology (semantic units, spatial inclusion or aggregation, ...)
  - Partial spatial inclusion: what does it mean?
- Data importation from different sources?
  - Data from different sources will be imported
  - Needs an interface to facilitate the import of heterogeneous data (tools developed for HyperAdmin could serve)
- Sources?
  - ESPON database (BBR)
  - Corine Land Cover
  - United Nations