Title: BOEMIE: Bootstrapping Ontology Evolution with Multimedia Information Extraction
1 BOEMIE: Bootstrapping Ontology Evolution with Multimedia Information Extraction
- Vasileios Papastathis
- Centre for Research and Technology Hellas (CERTH)
- Informatics and Telematics Institute
- Multimedia Knowledge Group
- 3rd Know-How Transfer Event
- Thessaloniki, 8 March 2007
2 Presentation Overview
- A short presentation of CERTH-ITI and the Multimedia Knowledge Group (MKG)
- BOEMIE project: an FP6 success story
- FP7 Challenge 4: Digital Libraries and Content
3 History-Scope
- Founded in 1998 as a non-profit organisation under the auspices of the General Secretariat of Research and Technology of the Greek Ministry of Development
- Since March 2000, it is part of the Centre for Research and Technology Hellas as one of its four constituent institutes
- Set up to constitute a major research and development centre, with continuous interaction with the academic community, the National and European Informatics and Telematics Industry, the international scientific community and the Public Sector
- Play a key role in the development of the Greek Information Society as a National Center of Excellence in Informatics and Telematics
- Set up spin-off companies aiming at the commercial exploitation of ITI's research results
4 Structure and Organization
- Virtual Reality Research Unit
- Advanced e-Services for the Knowledge Society Research Unit
- Telecommunications and Telematics Research Unit
- Intelligent Systems and Software Engineering Research Unit
- Business Information Systems Research Unit
5 Structure and Organization
- Multimedia Knowledge Group
  - Semantic Multimedia Analysis
  - Multimedia Indexing and Retrieval
  - Multimedia and the Semantic Web
  - Knowledge Structures, Languages and Tools for Multimedia
  - Reasoning and Personalization for Multimedia Applications
  - MPEG-7 and MPEG-21 Standards
6 Personnel
- 13 Professors
- 7 Researchers Grade C and D
- 5 Post-Doctoral Researchers
- 20 PhD Candidates (Postgraduate Research Fellows)
- 45 Research Assistants (all University Graduates, MSc)
- 2 Technicians (University Graduates)
- 3 Administration Staff
- 20 Undergraduate Students
7 Publications and Projects
- Since 2000
- 140 publications in peer-reviewed international journals
- 46 book chapters
- 370 publications in international and national conferences
- More than 450 citations (not by authors or coauthors)
- 70 R&D projects funded by European Commission Programmes (8.8 MEuro)
- 31 R&D projects funded by National Programmes (2.2 MEuro)
- 50 Industrial Contracts and Subcontracts (2.45 MEuro)
- 27 5th FP European projects
- Coordinator in 4 EC IST projects (SCHEMA NoE, LAURA, EU-PUBLI.COM, KOD Knowledge on Demand)
- Financial coordinator of 2 EC IST projects (INTERVUSE, P2People)
8 Funding (2002-2006)
9 FP6 R&D Projects
- aceMedia: Integrating knowledge, semantics and content for user centred intelligent media services, IP 2004-2007.
- KnowledgeWeb: Realizing the Semantic Web, funded by the DG XIII, NoE 2004-2007.
- MESH: Multimedia Semantic Syndication for Enhanced News Services, IST IP, 2006-2008.
- X-Media: Knowledge Sharing and Reuse Across Media, IST IP, 2006-2009.
- BOEMIE: Bootstrapping Ontology Evolution with Multimedia Information Extraction, IST-STREP, 2006-2008.
- K-Space: Knowledge Space of Semantic Inference for Automatic Annotation and Retrieval of Multimedia Content, 6th FP IST NoE, 2006-2008.
- Since 2000
- 106 publications in peer-reviewed international journals
- 46 book chapters
- 342 (326 + 16) publications in international and national conferences
- More than 450 citations (not by authors or coauthors)
- 40 R&D projects funded by European Commission Programmes (8.8 MEuro)
- 31 R&D projects funded by National Programmes (2.2 MEuro)
- 50 Industrial Contracts and Subcontracts (2.45 MEuro)
10 BOEMIE: Bootstrapping Ontology Evolution with Multimedia Information Extraction
11 Specific Targeted Research Projects (STREP)
- Aims and Objectives
  - An RTD project designed to gain knowledge or improve existing products, processes or services
  - A demonstration project designed to prove the viability of new technologies, but which cannot be commercialized directly
- Number of participants
  - Minimum of 3 partners from three different Member States
- Duration
  - Typically between 2 and 3 years
- Projects Management
  - Requires overall management and coordination of the consortium
12 The facts
- Specific Targeted Research Projects (STREP), IST 2004 2.4.7 Semantic-based Knowledge and Content Systems
- Start: March 1, 2006
- End: February 28, 2009
- Budget: 5.075.678 Euro
- EU Funding: 3.150.000 Euro
- More than 30 people already active in the project
- Project portal: http://www.boemie.org/
13 Consortium
- Inst. of Informatics & Telecommunications, NCSR Demokritos, Greece (Coordinator)
- Fraunhofer Institute for Media Communication (NetMedia), Germany
- Dip. di Informatica e Comunicazione, University of Milano, Italy
- Centre for Research and Technology Hellas (CERTH) - Informatics & Telematics Institute (ITI), Greece
- Hamburg University of Technology, Germany
- TeleAtlas SA, the Netherlands
14 Vision
- Pave the way towards automation of the knowledge acquisition from multimedia content.
- Break new ground by introducing and implementing the concept of evolving multimedia ontologies.
- Make domain-specific semantic webs feasible with limited human effort.
15 Objectives
- Providing technology to represent and evolve domain-specific multimedia ontologies.
- Moving from low-level, general-purpose, single-modality feature extraction towards semantic, multimedia analysis.
- Robust and scalable ontology-driven multimedia content extraction through ontology evolution.
16 Approach
- Driven by domain-specific multimedia ontologies, BOEMIE information extraction systems will be able to identify high-level semantic features in image, video, audio and text and fuse these features for optimal extraction.
- The ontologies will be continuously populated and enriched using the extracted semantic content.
- This is a bootstrapping process, since the enriched ontologies will in turn be used to drive the multimedia information extraction system.
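The bootstrapping idea above can be illustrated with a toy loop. This is only a sketch under strong assumptions, not the BOEMIE implementation: the `Ontology` class, the co-occurrence rule in `extract`, and the sample data are all hypothetical, and each "document" is reduced to a set of already detected labels.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology (hypothetical): concept name -> set of known instance labels."""
    concepts: dict[str, set[str]] = field(default_factory=dict)


def extract(documents: list[set[str]], ontology: Ontology) -> list[tuple[str, str]]:
    """Hypothetical ontology-driven extractor.

    A label is proposed as a new instance of a concept when it co-occurs,
    in the same document, with an instance the ontology already knows.
    """
    found = []
    for labels in documents:
        for concept, instances in ontology.concepts.items():
            if labels & instances:  # the ontology recognises something here
                found.extend((concept, label) for label in sorted(labels - instances))
    return found


def bootstrap(documents: list[set[str]], ontology: Ontology, max_rounds: int = 5) -> int:
    """Alternate extraction and ontology population until nothing new is found.

    Returns the number of rounds that added new knowledge.
    """
    for round_no in range(1, max_rounds + 1):
        new_facts = extract(documents, ontology)
        if not new_facts:
            return round_no - 1
        for concept, instance in new_facts:
            ontology.concepts[concept].add(instance)  # populate the ontology
    return max_rounds


# Illustrative data only: one seed instance, three "documents".
ontology = Ontology({"pole_vault": {"pole"}})
documents = [{"pole", "crossbar"}, {"crossbar", "landing_mat"}, {"ball", "goal"}]
rounds = bootstrap(documents, ontology)
```

In this toy run the second document only becomes interpretable after the first round has added "crossbar" to the ontology, which is the sense in which the enriched ontology drives the next extraction pass.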
17 Approach
[Diagram: Semantics Extraction Toolkit (Visual Extraction Tools, V; Text Extraction Tools, T; Audio Extraction Tools, A; Information Fusion Tools, F); Ontology Evolution Process; Evolved Ontology]
18 Semantics extraction: Objectives
- No single modality is powerful enough to support robust and large-scale extraction.
- Emphasis on fusion of multiple modalities, using reasoning and uncertainty handling.
- Contribution to the state-of-the-art in visual content analysis, due to its richness and the difficulty of extracting semantics.
- Non-visual content will provide supportive evidence, to improve precision.
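As a rough illustration of how non-visual evidence might support a visual detection, the sketch below fuses per-modality confidence scores with a weighted average. The fusion rule, the weights, and the threshold are assumptions chosen for this example; the slides do not say which fusion or uncertainty-handling method BOEMIE uses.

```python
# Hypothetical modality weights: visual evidence dominates, text and audio support it.
WEIGHTS = {"visual": 0.5, "text": 0.3, "audio": 0.2}


def fuse(scores: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted average of confidences over the modalities that produced a score."""
    available = [m for m in scores if m in weights]
    if not available:
        raise ValueError("no known modality produced a score")
    total_weight = sum(weights[m] for m in available)
    return sum(weights[m] * scores[m] for m in available) / total_weight


def accept(scores: dict[str, float], threshold: float = 0.7) -> bool:
    """Accept a candidate concept only if the fused confidence reaches the threshold."""
    return fuse(scores) >= threshold


# Illustrative confidences for one candidate concept in one media item.
visual_only = {"visual": 0.6}                 # uncertain visual detection
supported = {"visual": 0.6, "text": 0.9}      # the caption agrees
contradicted = {"visual": 0.6, "text": 0.1}   # the caption disagrees
```

With these numbers, the uncertain visual detection is accepted only when the text modality agrees with it, and it is pushed further below the threshold when the text disagrees.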
19 Multimedia semantic model: Objectives
- A multimedia ontology describes the structure of multimedia content and visual characteristics of content objects in terms of low-level features.
- One or more domain ontologies, e.g. about athletics.
- A geographic ontology, e.g. about landmarks.
- An event ontology, e.g. about athletic events.
- Potential contribution
  - Uncertainty in concept descriptions
  - Spatial and temporal relations
20 Ontology evolution: Objectives
- Ontology population and enrichment, i.e. addition of concepts, relations, properties and instances.
- Coordination of homogeneous ontologies (same domain) and heterogeneous ontologies (e.g. domain and multimedia ontologies).
- Potential contribution
  - Ontology population from multimedia content.
  - Coordination of different types of reasoning for enrichment and coordination.
  - Matching, coordination and versioning of the integrated semantic model.
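The difference between population and enrichment can be sketched in a few lines. The class below is hypothetical, and the rule that only enrichment creates a new version is an assumption made for the example, not something stated in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedOntology:
    """Toy model (hypothetical): concept name -> list of instance labels."""
    version: int = 1
    concepts: dict[str, list[str]] = field(default_factory=dict)

    def populate(self, concept: str, instance: str) -> None:
        """Population: add an instance to an existing concept; the schema is unchanged."""
        if concept not in self.concepts:
            raise KeyError(f"unknown concept: {concept}")
        self.concepts[concept].append(instance)

    def enrich(self, concept: str) -> None:
        """Enrichment: add a new concept; assumed here to produce a new version."""
        if concept not in self.concepts:
            self.concepts[concept] = []
            self.version += 1


# Illustrative usage with made-up athletics concepts.
onto = VersionedOntology(concepts={"HighJump": []})
onto.populate("HighJump", "jump_event_001")   # population only, version stays 1
onto.enrich("PoleVault")                      # enrichment, version becomes 2
onto.populate("PoleVault", "vault_event_001")
```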
21 7th FP
22 Challenge 4
- Digital Libraries and Content
- Make content and knowledge abundant, accessible, interactive and usable over time by humans and machines alike.
- Content must be made available through digital libraries and its long term usability, accessibility and preservation must be ensured
- Effective technologies need to be developed for intelligent content creation and management, and for supporting the capture of knowledge and its sharing and reuse
- Individuals, organisations and communities must find new ways to acquire and exploit knowledge, and thereby learn
- Political framework: "i2010 - Digital Libraries"
23 Intelligent Content & Semantics
- Make digital resources that embody creativity and semantics easier and more cost effective to produce, organize, search, personalise, distribute and use across the value chain.
- CREATORS: Design more communicative and participative forms of content (media professionals, enterprise designers, talented amateurs).
- PUBLISHERS: Increase productivity in creative industries, enterprises and professional sectors (e.g. health, law, etc.).
- SCIENTISTS: Automate link between data analysis, theory and experimental validation.
- ORGANISATIONS & COMMUNITIES: Automate collection and distribution of digital content and machine-tractable knowledge, and their sharing in collaborative environments.
24 Target socio-economic sectors
- key features
  - ICT based, high growth & innovation potential
  - pronounced international character
  - sophisticated users
  - very large data volumes
  - well defined flows & protocols
- obvious candidates (in addition to ICT!)
  - creative industries (film, TV, games, advertising ...)
  - enterprises in information bound industries
  - utilities, e.g. energy
  - manufacturing & process industries
  - construction & engineering, financial services
  - eScience, e.g. life sciences
25 Do NOT do
- In 2007-08 we do NOT intend to support research into
  - basic research with no identifiable by-products within 10 years
  - domain specific applications that are not portable/replicable in other socio-economic sectors
  - developments addressing immediate commercial imperatives (e.g. content protection & monetisation)
  - issues covered by other Challenges and Objectives, e.g. media networking, peer to peer, technology enabled learning
  - topics well covered by on-going FP6 projects & networks (see our website)
26 Schedule of 1st call (provisional)
- 51 MEuro in total, of which
  - 46 MEuro for IP & STR projects
  - 5 MEuro for NoEs & CSAs
- first call expected to close late April (?)
- evaluation/selection mid-May to late Jun (?)
- negotiations until Nov
- contract awarding in Dec
- projects due to start Q1 2008
- highly demanding process
27 CERTH-ITI in FP7
- Continue research in Multimedia and Knowledge Technologies
- Expand to new areas and applications (Health, Industry, Cognition, Robotics, Environment, Security, Surveillance, ...)
- Challenges in IST
  - Networked media
  - Cognitive systems, interaction, robotics
  - Digital libraries and technology-enhanced learning
  - Intelligent content and semantics
  - Personal health systems for monitoring and point-of-care diagnostics
  - Advanced ICT for risk assessment and patient safety
28 Thank you!
- Mr. Vasileios Papastathis
- vkpapa_at_iti.gr
- Multimedia Knowledge Group
- http://mkg.iti.gr