Title: DIVA - University of Fribourg - Switzerland Seminar presentation, jan. 2005 Lawrence Michel, MSc Student lawrence.michel@unifr.ch
1 DIVA - University of Fribourg - Switzerland. Seminar presentation, Jan. 2005. Lawrence Michel, MSc Student, lawrence.michel@unifr.ch
- Portable Meeting Recorder
- A multimodal meeting recorder solution designed by Ricoh
Dar-Shyang Lee, Berna Erol, Jamey Graham, Jonathan J. Hull, Norihiko Murata
2 Concept 1/3
- Intended goal: a methodology to enable full multimodal (A/V, metadata) recording and browsing of a meeting, under the strong constraints of minimal hardware intrusion, portability, and maximal data extraction capability
3 Concept 2/3
The Portable Meeting Recorder system
- Hardware specifications
- Minimally intrusive A/V capture component
- 4 microphones
- 1 360° video camera
- Meeting Recorder interface
- Touchscreen browsing
- A common PC for processing data
8 Sound Localization
An interesting algorithm: 360° sound localization using 4 microphones
[Figure: elevation computing and azimuth computing, with angles α and β]
- The method is based on the phase properties of the 4 input signals: it computes the differences between them and estimates the corresponding angle.
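To make the idea of "differences between signals" concrete, here is a minimal sketch of azimuth estimation. It is not the Ricoh implementation: it swaps in plain time-domain cross-correlation for the phase-based comparison, and it assumes a far-field source and four microphones placed east, west, north and south of the array centre. The names `shift`, `best_lag` and `azimuth_degrees` are hypothetical.

```python
import math

def shift(signal, delay):
    """Delay a signal by `delay` samples (negative = advance), zero-padded."""
    if delay >= 0:
        return [0.0] * delay + signal[:len(signal) - delay]
    return signal[-delay:] + [0.0] * (-delay)

def best_lag(a, b, max_lag):
    """Delay of b relative to a, in samples, found by cross-correlation."""
    n = len(a)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += a[i] * b[j]
        if score > best_score:
            best, best_score = lag, score
    return best

def azimuth_degrees(east, west, north, south, max_lag=10):
    """Estimate the source azimuth (0 deg = east, 90 deg = north).

    Assumption: the microphone closer to the source hears it earlier, so
    the west-vs-east delay is proportional to cos(azimuth) and the
    south-vs-north delay is proportional to sin(azimuth).
    """
    lag_x = best_lag(east, west, max_lag)
    lag_y = best_lag(north, south, max_lag)
    return math.degrees(math.atan2(lag_y, lag_x))
```

With equal delays on both microphone pairs, the estimate is 45°, that is, a source to the north-east. The resolution of such a scheme is limited by the sampling rate, since delays are measured in whole samples.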
9 Sound Localization 2/
Properties
- The method runs in real time during meeting recording (30-40% CPU load on a 933 MHz PC)
- It permits maximum data extraction while requiring minimal hardware (which took some serious brainwork to achieve!)
- These data are mainly needed for the view selection and face extraction processes
- Accuracy is highly dependent on several factors, such as room characteristics (e.g. reflective surfaces that lead to high signal reverberation), signal amplitude, speech overlap, particular angles, etc.
- Hardware dependency: accuracy is strongly correlated with the signal sampling rate, the sensitivity of the input devices, etc.
10 Meeting Location Recognition 1/
Another interesting method: recognizing the meeting location with adaptive background modeling
The process is as follows:
1. Analyze the frame by comparing its histogram with the template
2. Apply foreground extraction
3. Set the resulting background image as the newest template
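The three steps above can be sketched as follows. This is only an illustration under simple assumptions, not the method used in the recorder: frames are small grayscale images stored as lists of rows, histograms are compared by histogram intersection, and foreground is any pixel that differs from the template by more than a fixed threshold. All function names and parameter values are hypothetical.

```python
def histogram(image, bins=8, max_value=256):
    """Normalized grayscale histogram of an image given as rows of pixels."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[pixel * bins // max_value] += 1
            total += 1
    return [c / total for c in counts]

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def recognize(frame, templates):
    """Step 1: return the name of the template whose histogram matches best."""
    h = histogram(frame)
    return max(templates, key=lambda name: similarity(h, histogram(templates[name])))

def update_template(template, frame, threshold=30):
    """Steps 2 and 3: drop foreground pixels, keep the rest as the new template.

    A pixel that differs from the template by more than `threshold` is
    treated as foreground, so the old template value is kept there.
    """
    return [
        [t if abs(f - t) > threshold else f for t, f in zip(t_row, f_row)]
        for t_row, f_row in zip(template, frame)
    ]
```

Because the template is refreshed from each recognized frame, slow changes such as lighting drift are absorbed over time, while people moving in front of the camera are kept out of the background model.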
11 Searching and Browsing with Visual and Audio Content
How are the audio files, video files and XML metadata efficiently exploited?
12 Searching and Browsing with Visual and Audio Content 1/
Introduction
Searching and browsing audiovisual information is a time-consuming task. The Audio and Video Recorder is, at its current state of development, unable to transcribe audio files automatically. Instead, searching and browsing within a meeting document is based on visual and audio content activity.
13 Searching and Browsing with Visual and Audio Content 2/
Visual activity analysis
In most meeting sequences, there is minimal motion most of the time.
- High-motion segments therefore correspond to significant events
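One simple way to find such high-motion segments is frame differencing. The sketch below is an assumption for illustration and not necessarily what the recorder does: frames are grayscale images stored as lists of rows, motion is the mean absolute pixel difference between consecutive frames, and the threshold value is arbitrary. The function names are hypothetical.

```python
def motion_score(previous, current):
    """Mean absolute pixel difference between two frames of equal size."""
    diffs = [
        abs(a - b)
        for p_row, c_row in zip(previous, current)
        for a, b in zip(p_row, c_row)
    ]
    return sum(diffs) / len(diffs)

def high_motion_segments(frames, threshold=10.0):
    """Return (start, end) frame index ranges where motion exceeds threshold."""
    segments = []
    start = None
    for i in range(1, len(frames)):
        if motion_score(frames[i - 1], frames[i]) > threshold:
            if start is None:
                start = i          # a new high-motion run begins here
        elif start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:          # run continues until the last frame
        segments.append((start, len(frames) - 1))
    return segments
```

The returned index ranges could then serve as jump points in a browser timeline, so that a user skips the long static stretches of a meeting.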
14 Searching and Browsing with Visual and Audio Content 3/
Audio activity analysis
The system, which relies heavily on audio analysis, makes it possible to navigate through a meeting document in various ways, such as:
- Speaker segmentation using audio data
- Efficiency drops when poor audio-based tracking data are present (resulting from speech overlap, hardware specifications, bad angle positioning, ...).
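As a rough illustration of speaker segmentation from audio tracking data, the sketch below groups consecutive sound-direction estimates into segments. It assumes, hypothetically, that each audio frame already carries one azimuth estimate in degrees and that a speaker change shows up as a jump in direction larger than a fixed tolerance. The names and the tolerance value are illustrative only.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def segment_speakers(azimuths, tolerance=20.0):
    """Group consecutive frames with a similar direction into segments.

    Returns a list of (first_frame, last_frame, reference_azimuth), where
    the reference is the direction of the first frame in the segment.
    """
    segments = []
    for i, azimuth in enumerate(azimuths):
        if segments and angle_diff(azimuth, segments[-1][2]) <= tolerance:
            start, _, reference = segments[-1]
            segments[-1] = (start, i, reference)   # extend current segment
        else:
            segments.append((i, i, azimuth))       # direction jumped: new segment
    return segments
```

This also shows why poor tracking data hurts: a noisy azimuth track, for example during speech overlap, splits one speaker turn into many short segments.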
15 Searching and Browsing with Visual and Audio Content 4/
Screenshot of the Meeting Browser using the Muvie Client
17 Thank you