Real-Time Generation of Topic Maps from Speech Streams



1
Real-Time Generation of Topic Maps from Speech
Streams
  • TMRA'05
  • International Workshop on Topic Maps Research and
    Applications, 06.10.2005
  • Karsten Böhm, Lutz Maicher
  • University of Leipzig
  • boehmmaicher_at_informatik.uni-leipzig.de

2
Introduction
  • Topic Maps are a means of
    • representing (powerful) indexes
    • of any information collection
    • for semantic information integration
  • Our goal:
    • real-time generation of conceptual indexes of
      speech streams,
    • represented as Topic Maps,
    • for integration with other information systems

3
How to create Topic Maps
  • Topic Maps are a semantic technology ...
  • ... but only from the perspective of information
    integration,
  • where the Co-location objective must always hold:
  • Subject Proxies indicating identical Subjects
    have to be viewed as merged
    • Subject Equality Decision Approach
    • Subject Viewing Approach
  • We have to represent the created indexes so that
    the Co-location objective holds from the perspective
    of the creator ...
  • ... and therefore we need a theoretical
    foundation.
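The Co-location objective can be illustrated with a small sketch. It assumes that two Subject Proxies indicate the same Subject exactly when they share at least one subject identifier; that rule, the `SubjectProxy` class and the `merge_colocated` function are hypothetical choices made for this example, not part of the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class SubjectProxy:
    """Hypothetical, minimal stand-in for a topic."""
    names: set = field(default_factory=set)
    identifiers: set = field(default_factory=set)  # subject indicators


def merge_colocated(proxies):
    """Merge proxies until no two of them share a subject identifier."""
    merged = []  # invariant: identifier sets are pairwise disjoint
    for proxy in proxies:
        current = SubjectProxy(set(proxy.names), set(proxy.identifiers))
        remaining = []
        for other in merged:
            if current.identifiers & other.identifiers:
                # Assumed equality rule: a shared identifier means the
                # same Subject, so both proxies are viewed as one.
                current.names |= other.names
                current.identifiers |= other.identifiers
            else:
                remaining.append(other)
        merged = remaining + [current]
    return merged


proxies = [
    SubjectProxy({"Fisichella"}, {"urn:x:a"}),
    SubjectProxy({"G. Fisichella"}, {"urn:x:a", "urn:x:b"}),
    SubjectProxy({"Leipzig"}, {"urn:x:c"}),
]
result = merge_colocated(proxies)
```

Other Subject Equality Decision Approaches would replace only the overlap test inside the loop; the merging itself stays the same.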

4
Subject Equality Decision Chain
From the child's perspective ("Elgs are
sweet."): I always caught the same Subject, an
elg.
From the zoologist's perspective ("Elgs are
loners."): I caught two deer and three elgs.
From the ranger's perspective ("Bernd needs a
cow."): I caught Lisa, Ud (fighting), and Bernd
(in summer, in winter and as a calf).
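The three perspectives can be read as three different equality keys applied to the same five Subject Stages. The sketch below is only an illustration: which animal counts as a deer and which as an elg is an assumption made here, and `STAGES`, `PERSPECTIVES` and `group_stages` are hypothetical names.

```python
from collections import defaultdict

# Five Subject Stages. The "kind" values are assumed for this example;
# the slide only gives the totals (two deer, three elgs).
STAGES = [
    {"individual": "Lisa", "kind": "deer", "note": "seen once"},
    {"individual": "Ud", "kind": "deer", "note": "fighting"},
    {"individual": "Bernd", "kind": "elg", "note": "in summer"},
    {"individual": "Bernd", "kind": "elg", "note": "in winter"},
    {"individual": "Bernd", "kind": "elg", "note": "as a calf"},
]

# Each perspective decides Subject Equality with its own key.
PERSPECTIVES = {
    "child": lambda stage: "elg",                     # everything is an elg
    "zoologist": lambda stage: stage["kind"],         # equality by kind
    "ranger": lambda stage: stage["individual"],      # equality by individual
}


def group_stages(stages, key):
    """Group Subject Stages that the given key treats as one Subject."""
    groups = defaultdict(list)
    for stage in stages:
        groups[key(stage)].append(stage["note"])
    return dict(groups)


subjects_per_perspective = {
    name: group_stages(STAGES, key) for name, key in PERSPECTIVES.items()
}
```

The same observations yield one Subject for the child, two for the zoologist and three for the ranger, which is the perspective dependence the slide describes.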
5
Subject Equality Decision Chain
2. Sensory systems come on stage, catching
Subject Stages
3. Documenting the impressions (from the ranger's
perspective)
  (1) Subjectness: I'm only interested in Lisa, Ud,
  and Bernd, not in snow or trees.
  (2) Creating Subject Proxies for the current
  Subject Stages of Lisa, Ud and Bernd
  (3) Try to document the decision about the
  Subject Identity of the current Subject
  Stage by the given means of the governing SMD
  ontology, TMV ontology and TMV vocabulary.
  The Subject Identity of Subject Stages is
  mapped to the Subject Indication of the
  Subject Proxy.
  (4) Document all further information observed
  about the Subject Stage. (Documenting means
  modelling, and modelling means losing information.)
4. Subject Equality is decided according to the
governing SMD
6
Subject Equality Decision Chain
  • Co-location Objective: Subject Proxies indicating
    identical Subjects have to be viewed as merged
  • World without any sensory system
    • How to make a qualified assertion about the very
      nature of Subjects?
  • Sensory systems come on stage, catching Subject
    Stages
    • Never Subjects, only Subject Stages (see Quine)
      are observed
  • Subject Identity: Subject Stages caught on
    different occasions belong to the same Subject
    (see Vatant's hubjects)
    • perspective dependent (see Biezunski)
    • decision process under uncertainty
  • Documenting the impressions from a perspective
    • Subjectness in the current perspective
    • observations are documented, restricted by the
      available vocabulary (SMD ontology, TMV ontology,
      TMV vocabulary)
    • the decision about Subject Identity is documented
      according to the governing Subject Indication
      Approach
  • Subject Equality is decided according to an SMD

7
The Observation Principle
... or how to create Topic Maps from digital
domains?
(1.) Observe the information collections of
interest (texts, video streams, etc.) and
detect Subject Stages of Subjects of interest
from the current perspective.
(2.) Decide on the Subject Identity of the
observed Subject Stages.
(3.) Create a Subject Proxy for each Subject
Stage of interest.
(4.) Document the decision about the Subject
Identity of the current Subject Stage by
the given means of the governing SMD ontology,
TMV ontology and TMV vocabulary
(... and with respect to all Subject
Equality Decision Approaches expected to be
applied later to this Subject Proxy).
(5.) Document all further information observed
about the Subject Stage by the given means
of the governing SMD ontology, TMV ontology and
TMV vocabulary.
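The five steps can be sketched for a stream of recognised speech segments. This is a minimal illustration under stated assumptions, not the Semantic Talk implementation: the "perspective" is reduced to a fixed set of terms of interest, Subject Identity is decided by case-insensitive term matching, and the `BASE` namespace, `SubjectProxy` and `observe` are hypothetical names.

```python
import re
from dataclasses import dataclass

BASE = "http://example.org/st/term/"  # hypothetical identifier namespace


@dataclass
class SubjectProxy:
    identifier: str  # step 4: documents the Subject Identity decision
    label: str       # step 5: the term as it was observed
    segment: int     # step 5: where the Subject Stage was observed


def observe(segments, terms_of_interest):
    """Create one Subject Proxy per observed Subject Stage."""
    proxies = []
    for position, segment in enumerate(segments):
        for word in re.findall(r"\w+", segment):       # step 1: observe
            key = word.lower()
            if key not in terms_of_interest:
                continue                               # not of interest
            identifier = BASE + key                    # step 2: decide identity
            proxies.append(                            # step 3: create a proxy
                SubjectProxy(identifier, word, position)
            )
    return proxies


segments = [
    "Fisichella leads the race",
    "the race is close",
    "fisichella wins",
]
proxies = observe(segments, {"fisichella"})
```

Both proxies carry the same identifier, so a Subject Equality Decision Approach applied later can treat them as one Subject while the segment numbers remain available as further information.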
8
The Semantic Talk System
  • Focuses on the support of group-oriented
    conversation
  • Implementation of a minimally invasive
    IT solution
  • Application for interviewing scenarios,
    innovation processes and early stages of product
    development
  • Semantic Talk creates powerful, conceptual
    indexes of speech streams in real time
  • Combines speech recognition (LinguaTec's
    VoicePro) with text mining algorithms
  • Provides dynamic visualization (extended version
    of TouchGraph)
  • Networked application with multiple clients
  • Provides a generic RDF export
  • Cooperation with the University of Duisburg-Essen
    and ISA Informationssysteme GmbH

9
Semantic Talk: Speech Recognition and Text Mining
  • Overview window (bird's-eye view)
  • Window for additional information (documents,
    pictures)
  • Sliders for configuration parameters (zoom)
  • Local context window
10
The Semantic Talk System
11
Semantic Talk creates indexes of speech streams;
we have to represent them as Topic Maps
and use them for semantic information integration.
12
From RDF-output to LTM
ST observed a noticeable usage of the term
"Fisichella" in the speech stream ...

    <st:node rdf:ID="node_Fisichella">
      <st:ID>160615</st:ID>
      <st:label>Fisichella</st:label>
      <st:nodelevel>1</st:nodelevel>
      <st:ref_wort_nr rdf:resource="http://www.tt.de/dtd/st/papnode_160615"/>
      <st:variant st:index="3" st:type="4" st:weight="0.3176"/>
    </st:node>

Semantic mapping between RDF output and Topic Map
using the Omnigator ...

    [id7406 : id7276 = "Fisichella"
        @"http://www.texttech.de/dtd/st/papnode_160615"
        @"http://www.texttech.de/dtd/st/papnode_Fisichella"]
    {id7406, id3670, [[1]]}
    {id7406, id7650, [[160615]]}
    id7549( id7406 : id463, id464 : id2195 )
    [id464]
    {id464, id1636, [[0.31766722453166335]]}
    {id464, id4378, [[3]]}
    {id464, id787, [[4]]}

... and this 'noticeable usage of the term
Fisichella' becomes the Subject in the Topic
Map. (Subject Identity => the same algorithm
observes the 'noticeable usage' twice.)
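A mapping of this kind can also be sketched in a few lines of Python. The presentation performs it with the Omnigator; the code below is a hand-written stand-in, and several things in it are assumptions: the `st` namespace URI, the `base` address for subject identifiers, and the LTM identifiers `term`, `id` and `nodelevel` are all hypothetical, and only the label and two simple properties are mapped.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ST_NS = "http://example.org/st#"  # assumption: the real namespace is not given

SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:st="{ST_NS}">
  <st:node rdf:ID="node_Fisichella">
    <st:ID>160615</st:ID>
    <st:label>Fisichella</st:label>
    <st:nodelevel>1</st:nodelevel>
    <st:variant st:index="3" st:type="4" st:weight="0.3176"/>
  </st:node>
</rdf:RDF>"""


def node_to_ltm(node, base="http://example.org/st/"):
    """Render one st:node element as LTM topic and occurrence statements."""
    topic_id = node.get(f"{{{RDF_NS}}}ID")
    label = node.findtext(f"{{{ST_NS}}}label")
    # Topic with a type, a base name and a subject identifier.
    lines = [f'[{topic_id} : term = "{label}" @"{base}{topic_id}"]']
    # Simple literal properties become inline occurrences.
    for prop in ("ID", "nodelevel"):
        value = node.findtext(f"{{{ST_NS}}}{prop}")
        if value is not None:
            lines.append(f"{{{topic_id}, {prop.lower()}, [[{value}]]}}")
    return "\n".join(lines)


root = ET.fromstring(SAMPLE)
node = root.find(f"{{{ST_NS}}}node")
ltm = node_to_ltm(node)
print(ltm)
```

Because the subject identifier is derived from the node's `rdf:ID`, a second observation of the same term by the same algorithm yields the same identifier and can therefore be merged.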
13
Integration with other Topic Maps ...
Starting point: integration with another Topic
Map created by the
observation principle (for
example a motor-sport Topic Map)
- a mapping Topic Map is needed (which should
be created under the observation principle, too)
... to allow more accurate mapping decisions,
it seems necessary that the creation
process of a Topic Map is
documented, too.
14
Discussion