Title: Semantic Web Techniques for Personalization of eGovernment Services
1. Semantic Web Techniques for Personalization of eGovernment Services
SemWAT 2006: 1st International ER Workshop on Semantic Web Applications: Theory and Practice
Tucson, AZ, November 2006
Federica Mandreoli, Riccardo Martoglia, Enrico Ronchetti, Paolo Tiberio (Università degli Studi di Modena e Reggio Emilia)
Fabio Grandi, Maria Rita Scalas (Università degli Studi di Bologna)
2. Overview
- Our research activities concern the implementation of Web information systems for eGovernment applications
- Due to the development of eGovernment initiatives, more and more on-line resources and services are being made available by Public Administrations (PAs)
- We make use of temporal database and semantic Web techniques to provide personalized access to such resources and services
- In particular, we consider multi-version norm texts (stored in XML format) available in Web repositories
3. Importance of versioning
- Temporal concerns are ubiquitous in the law domain
- Each normative text changes over time due to successive modifications, but keeps its identity
- The ability to model temporal dimensions is essential for the management of evolving norms
  - it is crucial to be able to reconstruct the consolidated version of a norm
  - past versions also remain important
4. Importance of versioning
- Applicability (semantic) versioning also plays an important role
  - some norms, or some of their parts, have or acquire a limited applicability
  - personalized version of the norm: a version containing only the provisions which are applicable to a citizen's personal case
[Figure: a norm whose articles carry applicability labels, Art. 1 (unemployed), Art. 2 (self-employed), Art. 3 (retired), shown next to a "Self-employed" citizen]
5. Motivation
- Large XML collections of norms are made available by the PA on the Web, but personalization is
  - Absent, e.g. http://www.normeinrete.it (temporal versioning partially supported)
  - Predefined in the Website structure and contents, e.g. http://www.italia.gov.it (hardwired by human experts following the life-events approach)
- Lack of an effective, flexible, on-demand (intelligent, efficient) personalization facility
6. Objectives
- Development of an effective and efficient Web information system where
  - norms are represented as XML documents
  - the dynamics of norms over time is captured
  - limited applicability of norms (and their parts) is captured
  - selective access and reconstruction of versions is supported by a query engine
- Aimed at
  - enabling citizens to access personalized versions of multi-version resources
  - improving and optimizing the involvement of citizens in the eGovernance process
7. The Technological Infrastructure
[Figure: architecture diagram with the components Public Administration DB, Web services of Public Administration (1), Web services with ontology OC (2), simple elaboration unit, creation/update, class Cx]
1. Identification phase: on-the-fly reconstruction of the digital identity of the authenticated user
2. Classification phase: use of the collected digital identity to classify the citizen with respect to the civic ontology OC
3. Querying phase: access and reconstruction of all and only the norms which are applicable to the class Cx
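The three phases above can be read as a small pipeline. The sketch below is only an illustration under stated assumptions: the record store, the attribute-to-class mapping, the norm identifiers and all function names are hypothetical, and applicability is modelled as flat set membership rather than through the ontology hierarchy. Only the code 7 for a self-employed citizen comes from the search example in this deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigitalIdentity:
    """Facts about a citizen collected from PA web services (hypothetical fields)."""
    citizen_id: str
    employment: str


# Hypothetical stand-in for the PA web services consulted in phase 1.
PA_RECORDS = {"jsmith": {"employment": "self-employed"}}

# Hypothetical mapping from an identity attribute to a civic class code.
CLASS_BY_EMPLOYMENT = {"unemployed": 3, "self-employed": 7, "retired": 8}

# Hypothetical repository: norm id -> civic class codes the norm applies to.
NORM_APPLICABILITY = {"norm-A": {7}, "norm-B": {3, 8}, "norm-C": {7, 8}}


def identify(user: str) -> DigitalIdentity:
    """Phase 1: rebuild the digital identity of the authenticated user."""
    record = PA_RECORDS[user]
    return DigitalIdentity(citizen_id=user, employment=record["employment"])


def classify(identity: DigitalIdentity) -> int:
    """Phase 2: map the identity to a civic class code Cx."""
    return CLASS_BY_EMPLOYMENT[identity.employment]


def query(class_code: int) -> list:
    """Phase 3: return all and only the norms applicable to the class."""
    return sorted(n for n, classes in NORM_APPLICABILITY.items() if class_code in classes)


def personalized_norms(user: str) -> list:
    """Chain the three phases for one authenticated user."""
    return query(classify(identify(user)))
```

Calling `personalized_norms("jsmith")` walks through identification, classification and querying in order, which mirrors the numbered steps on the slide.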
8. The Civic Ontology
- Embodies a classification of citizens based on the distinctions introduced by successive norms that imply some limitations in their applicability (founding acts)
- At this stage of the project, we manage tree-like ontologies (i.e. class taxonomies induced by the IS-A relationship)
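A tree-like taxonomy induced by IS-A can be sketched as a child-to-parent map. The class names below are hypothetical examples, not the project's actual civic ontology; the point is only that each class has a single parent, so IS-A checks reduce to walking up the tree.

```python
# Hypothetical taxonomy: child class -> parent class (None marks the root).
PARENT = {
    "Citizen": None,
    "Employed": "Citizen",
    "Unemployed": "Citizen",
    "Retired": "Citizen",
    "Self-employed": "Employed",
    "Subordinate": "Employed",
}


def is_a(cls: str, ancestor: str) -> bool:
    """Return True if cls equals ancestor or descends from it via IS-A links."""
    current = cls
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]
    return False
```

For instance, `is_a("Self-employed", "Citizen")` follows two IS-A links before succeeding, while `is_a("Retired", "Employed")` reaches the root without a match.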
9. The modeling approach
- Extension of a previous temporal XML model (DKE 2005) including
  - a temporal multi-version XML schema
    - is based on the hierarchical organization of normative texts: contents-section-article-paragraph
    - at each level of the hierarchy, the history of changes is represented by the (time-stamped) versions produced
    - it supports ancestor-descendant inheritance
  - temporal manipulation operations
- Addition of applicability annotations in order to support semantic versioning
10. The temporal XML schema
[Figure: schema diagram of the element hierarchy Law, Contents, Section, Article, Paragraph, with Type, Title and Heading elements, Ver (version) elements at each level, and the attributes Num (marked R) and An_ref (marked O)]
- 4 Temporal Dimensions
  - Publication time: time of publication on the Official Journal
  - Validity time: time the norm is in force
  - Efficacy time: time the norm can be applied
  - Transaction time: time the norm is stored in the system
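As an illustration of the four temporal dimensions, a version's timestamps could be held in a small record like the one below. This is a sketch, not the schema used by the authors: the class and field names are hypothetical, publication time is assumed to be a single date, and the other three dimensions are assumed to be closed periods of days.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Period:
    """A closed interval of days (assumption: both endpoints are included)."""
    start: date
    end: date

    def contains(self, day: date) -> bool:
        return self.start <= day <= self.end


@dataclass(frozen=True)
class VersionTimestamps:
    publication: date    # publication on the Official Journal
    validity: Period     # when the norm is in force
    efficacy: Period     # when the norm can be applied
    transaction: Period  # when the version is stored in the system

    def in_force_and_effective_on(self, day: date) -> bool:
        """True if the version is both valid and efficacious on the given day."""
        return self.validity.contains(day) and self.efficacy.contains(day)


# Hypothetical version: published in late 2001, in force and applicable in 2002-2003.
example = VersionTimestamps(
    publication=date(2001, 12, 15),
    validity=Period(date(2002, 1, 1), date(2003, 12, 31)),
    efficacy=Period(date(2002, 1, 1), date(2003, 12, 31)),
    transaction=Period(date(2001, 12, 20), date.max),
)
```

Using `date.max` as an open-ended upper bound is a convenience of this sketch; a real system would need its own convention for "until changed".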
11. Semantic versioning
- A pre-order and post-order numbering scheme is introduced in the tree-like ontology
  - Classes are identified by means of their pre-order code
  - Encoding is exploited in query processing for quick ancestor-descendant checking
- Applicability annotations (AA) are added to semantic versions of document parts as references to the ontology classes
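The pre-order/post-order encoding mentioned above can be sketched as follows. The taxonomy is a hypothetical example and the function names are illustrative; the ancestor test itself is the standard property of such numberings: a node is a proper ancestor of another exactly when its pre-order rank is smaller and its post-order rank is larger.

```python
def number_tree(children: dict, root: str) -> dict:
    """Assign (pre, post) ranks to every node of a tree given as node -> child list."""
    codes = {}
    pre_counter = 0
    post_counter = 0

    def visit(node: str) -> None:
        nonlocal pre_counter, post_counter
        pre_counter += 1
        pre = pre_counter              # rank on first visit (pre-order)
        for child in children.get(node, []):
            visit(child)
        post_counter += 1              # rank after all descendants (post-order)
        codes[node] = (pre, post_counter)

    visit(root)
    return codes


def is_ancestor(codes: dict, a: str, d: str) -> bool:
    """a is a proper ancestor of d iff pre(a) < pre(d) and post(a) > post(d)."""
    pre_a, post_a = codes[a]
    pre_d, post_d = codes[d]
    return pre_a < pre_d and post_a > post_d


# Hypothetical class taxonomy, for illustration only.
TAXONOMY = {
    "Citizen": ["Employed", "Unemployed", "Retired"],
    "Employed": ["Self-employed", "Subordinate"],
}
CODES = number_tree(TAXONOMY, "Citizen")
```

With the codes computed once, each ancestor-descendant check costs two integer comparisons and needs no tree traversal, which is what makes the encoding attractive during query processing.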
12. Semantic versioning
- Applicability is inherited by descendant nodes unless locally redefined
- By means of redefinitions we can also introduce, for each part of a document, complex applicability properties
  - Restrictions with respect to ancestors
  - Extensions with respect to ancestors
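Inheritance with local redefinition can be sketched as a top-down pass over the document tree. The `Part` type, the part names and the class codes below are hypothetical, and the sketch assumes that a local annotation simply replaces the inherited one, which covers both restrictions and extensions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Part:
    """A document part; aa=None means 'inherit applicability from the parent'."""
    name: str
    aa: Optional[frozenset] = None
    children: list = field(default_factory=list)


def effective_applicability(part: Part, inherited: frozenset = frozenset()) -> dict:
    """Map each part name to the class codes that apply after inheritance."""
    own = part.aa if part.aa is not None else inherited
    result = {part.name: own}
    for child in part.children:
        result.update(effective_applicability(child, own))
    return result


# Hypothetical norm: class 1 stands for the root class (all citizens).
norm = Part("Norm", frozenset({1}), [
    Part("Art. 1"),                                # inherits {1}
    Part("Art. 2", frozenset({3}), [               # restriction w.r.t. the norm
        Part("Par. 1"),                            # inherits {3}
        Part("Par. 2", frozenset({3, 8})),         # extension w.r.t. Art. 2
    ]),
])
EFFECTIVE = effective_applicability(norm)
```

Here `Art. 1` and `Par. 1` carry no annotation of their own and pick up their parent's classes, while `Art. 2` and `Par. 2` redefine applicability locally.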
13. Example of full search
- John Smith is a self-employed citizen.
- He is interested in the text of all the norms ...
- ... which contain paragraphs dealing with health care, ...
- ... which were valid and in effect between 2002 and 2004, ...
- ... and which are applicable to his case (civic class 7).
4 orthogonal constraints: structural, textual, temporal, semantic
14. Example of full search
    FOR a IN norms
    WHERE textConstr (a//paragraph//text(), health AND care)
      AND tempConstr (vTime OVERLAPS PERIOD(2002-01-01,2004-12-31))
      AND tempConstr (eTime OVERLAPS PERIOD(2002-01-01,2004-12-31))
      AND applConstr (class 7)
    RETURN a
4 orthogonal constraints: structural, textual, temporal, semantic
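To make the textual and temporal parts of the query concrete, the sketch below evaluates them over a few toy paragraph versions. It is not the prototype's query engine: the helper names and the periods are hypothetical, periods are assumed to be closed intervals of days, and the structural and applicability constraints are left out. The three sample texts are the labels that appear in the following figure.

```python
from datetime import date


def overlaps(start_a: date, end_a: date, start_b: date, end_b: date) -> bool:
    """True if the closed periods [start_a, end_a] and [start_b, end_b] share a day."""
    return start_a <= end_b and start_b <= end_a


def contains_all(text: str, keywords: list) -> bool:
    """Keyword conjunction: every keyword must occur as a word of the text."""
    words = set(text.lower().split())
    return all(k.lower() in words for k in keywords)


QUERY_PERIOD = (date(2002, 1, 1), date(2004, 12, 31))

# Toy versions: (text, validity period, efficacy period); all periods are made up.
VERSIONS = [
    ("Health care text X",
     (date(2001, 1, 1), date(2003, 12, 31)), (date(2001, 1, 1), date(2003, 12, 31))),
    ("Public health text Y",
     (date(2002, 1, 1), date(2004, 12, 31)), (date(2002, 1, 1), date(2004, 12, 31))),
    ("Health care text Z",
     (date(2005, 1, 1), date(2006, 12, 31)), (date(2005, 1, 1), date(2006, 12, 31))),
]

HITS = [
    text
    for text, v_time, e_time in VERSIONS
    if contains_all(text, ["health", "care"])
    and overlaps(*v_time, *QUERY_PERIOD)
    and overlaps(*e_time, *QUERY_PERIOD)
]
```

Because the constraints are orthogonal, each one is an independent predicate and the query is their conjunction; adding the applicability check would be one more `and` clause.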
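15. Example of full search
[Figure: evaluation of the query against the civic ontology and the normative DB. The path norm//paragraph//text() is matched against a norm with Article 1 and Article 2, paragraphs Par 1 and Par 2, and their versions (Ver 1, Ver 2), each carrying TA and AA labels (AA3, AA4, AA3,8) that are compared with class 7; the paragraph texts shown are "Health care text X", "Public health text Y" and "Health care text Z"]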
16. Our prototype system (native approach)
- The query engine is able to access and retrieve only the strictly necessary data
  - selection relies on ad-hoc data structures supporting multi-versioning
  - storage granularity is finer than the entire documents used by standard XML engines (including our previous prototype, the stratum approach)
- Only the parts which satisfy the temporal and applicability constraints are used for the reconstruction of the retrieved documents
- There is no need to retrieve whole XML documents and build space-consuming structures such as DOM trees
17. Evaluation benchmark
- Variable document size
  - min 2KB
  - avg 24KB
  - max 125KB
- Three XML document sets
  - 5000 documents (120MB)
  - 10000 documents (240MB)
  - 20000 documents (480MB)
- Five different query types
  - Queries on keywords (structural and textual constraints)
    - Q1: keywords in contents
    - Q2: keywords in type and contents
  - Temporal queries (structural and temporal constraints)
    - Q3: conditions on publication, validity and transaction time
  - Mixed queries (structural, textual and temporal constraints)
    - Q4, Q5: with keywords and temporal conditions
- Five variants with semantic constraints
  - Qx-A: with additional applicability constraints
18. Performance evaluation
- The new system outperforms its predecessor (stratum approach) as far as temporal queries are concerned
- The new system showed very high efficiency in personalization query processing
  - selection of qualifying versions is improved by a technique based on simple comparisons of pre-post encodings
  - 0.5-1% more time than for the original versions
  - 3-4% storage space overhead
- The new system showed good scalability figures in every type of query context
  - the computing time grows sublinearly with the number of documents (it depends mainly on the size of the results)
19. Conclusions
- We presented our research work concerning the design and implementation of efficient Web-based information systems for eGovernment applications
- We introduced support for personalized access to resources on the basis of the digital identity of citizens (relying on semantic versioning and ontology mapping)
- We developed an efficient platform (native approach) for which a specialized Multi-version XML Query Processor has been designed and implemented
- We showed our approach to be very efficient in a large set of experimental situations, with good scale-up figures under growing load configurations
20. Future Work
- Extensions of the current framework
  - more advanced application requirements may include a more sophisticated ontology definition (graph-like), possibly versioned, and more advanced reasoning services
- Completion of the technological infrastructure usable in a large Web-based eGovernment scenario, including
  - identification and classification services
- Assessment of our prototype systems in a concrete working environment
  - with real users and with a large repository of real norms
- Extension to a more general application domain (Web personalization via ontology-based user profiling)