Multimodal multilingual information processing for automatic subtitle generation: Resources, Methods and System Architecture (MUSA)


1
Multimodal multilingual information processing for automatic subtitle generation: Resources, Methods and System Architecture (MUSA)
  • S. Piperidis, I. Demiros, P. Prokopidis
  • spip, iason, prokopis_at_ilsp.gr

2
Objectives
  • explore the degree to which subtitling can be automated by using the appropriate technologies
  • focus on human language technologies
  • explore the degree to which speech and language technologies can be integrated
  • try out system architectures simulating the underlying cognitive processes

3
Challenges of Subtitling
  • the challenge in automated generation is that there must be agreement between subtitles, the spoken source language and the corresponding image
  • generated subtitles must meet a set of constraints imposed by the visual context of the text and spatio-temporal factors
  • subtitle text is no longer normal written text but rather oral text

4
Experiments in MUSA
  • experiments on monolingual and multilingual subtitle generation
  • Languages: English (source and target); French and Greek (target)
  • Technologies used:
      • English ASR component for the transcription of audio streams into text
      • Subtitling component producing English subtitles from English audio transcriptions
      • Translation component integrating machine translation and translation memory, for EN-FR and EN-EL

5
Architecture

6
Resources for subtitling
  • in order to train and evaluate system components, an array of application-specific resources is needed
  • primary audiovisual data from BBC World Service: documentaries and newsy current affairs
  • for each programme, the following parallel data are sourced:
      • the actual video of the programme
      • its script or hand-made transcript
      • English, Greek and French subtitles
  • topically relevant newspaper and web-sourced texts

7
Resources overview
            Scripts   Transcripts   Scripts + Transcripts   EN subtitles   EL subtitles   FR subtitles
Horizon     110.452   55.224        165.676                 121.036        106.668        38.875
Panorama    -         87.039        87.039                  43.981         35.623         25.891
Misc        563.155   -             563.155                 408.214        351.857        64.381
DVDs        89.882    -             89.882                  77.629         58.427         -
Totals      763.489   142.263       905.752                 650.860        552.575        129.147

8
Speech recognition component
  • Use of a parallel corpus of BBC programs (audio and hand-made transcripts), as well as topically relevant newspaper texts
  • Tuning of the acoustic and language models of the KUL/ESAT recogniser
  • Background noise and non-native speech hinder the process
  • Aligning audio with hand-made transcripts proved to be a working solution, helping to overcome the problems caused by noise and non-native speakers

9
Speech recognition component (2)

10
Constraints & Requirements
  • subtitling conventions in various EU countries
  • constraints entail that compression of transcript segments is required
  • compression rate expressed in terms of the words and characters to delete
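
A minimal sketch of how such a compression target could be computed for one transcript segment. The function name, the character limit passed in, and the choice to scale the word target in proportion to the character target are all assumptions made for illustration; the slides do not specify the actual formula.

```python
import math


def compression_needed(transcript: str, max_chars: int) -> dict:
    """Estimate how much of a transcript segment must be deleted to fit max_chars.

    Hypothetical helper: max_chars stands in for whatever limit a given
    subtitling convention imposes.
    """
    n_chars = len(transcript)
    n_words = len(transcript.split())
    chars_to_delete = max(0, n_chars - max_chars)
    rate = chars_to_delete / n_chars if n_chars else 0.0
    # Assumption: words are removed in proportion to characters, rounded up.
    words_to_delete = math.ceil(rate * n_words)
    return {
        "chars_to_delete": chars_to_delete,
        "words_to_delete": words_to_delete,
        "percent_to_delete": round(100 * rate, 1),
    }


if __name__ == "__main__":
    print(compression_needed("It was clear that the results were good", 30))
```

A segment that already fits the limit yields zero for all three values, so no compression step would be triggered.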

11
Subtitling engine resources
  • Use of a parallel corpus of BBC programs featuring hand-validated program transcripts and their hand-made subtitles
  • Align sentences and words in the parallel corpus
  • Extract a table of paraphrases to aid compression
  • Example:
      • Within the next few years -> Soon
      • During the years when -> While
      • It was clear that -> Clearly
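
The paraphrase table lends itself to a simple lookup-and-replace step. The sketch below uses only the three example pairs from the slide; the function name, the longest-match-first ordering and the capitalisation handling are assumptions, not a description of the MUSA engine.

```python
import re

# The three example pairs from the slide; a real table would be extracted
# from the aligned transcript/subtitle corpus.
PARAPHRASES = {
    "within the next few years": "soon",
    "during the years when": "while",
    "it was clear that": "clearly",
}


def apply_paraphrases(segment: str, table: dict = PARAPHRASES) -> str:
    """Replace known long phrases with their shorter paraphrases."""
    # Assumption: try longer source phrases first so they win over shorter ones.
    for source in sorted(table, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(source) + r"\b", re.IGNORECASE)

        def repl(match, target=table[source]):
            # Keep a leading capital if the replaced phrase had one.
            return target.capitalize() if match.group(0)[0].isupper() else target

        segment = pattern.sub(repl, segment)
    return segment


if __name__ == "__main__":
    print(apply_paraphrases(
        "It was clear that prices would rise within the next few years."
    ))
```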

12
Subtitling engine resources (2)
  • If the compression rate is not reached by paraphrasing, apply syntactic rules to delete low-importance units (e.g. adverbs, adjectives, etc.)
  • Hand-crafted deletion rules making use of:
      • a shallow parse of the segments
      • surprise values for each word, computed on the basis of a large text corpus
  • If more deletable segments than necessary exist, start by deleting the least important segments.
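
One way to read the deletion step is sketched below. It assumes that the "surprise value" of a word is its unigram surprisal, -log2 P(word), estimated from corpus counts, and that low-surprisal (highly predictable) words are the least important. Both are assumptions; the slides do not define the measure. The shallow parser is replaced by a hand-supplied list of deletable token positions.

```python
import math
from collections import Counter


def surprisal_table(corpus_tokens):
    """Build a surprisal function from unigram counts of a reference corpus."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts)

    def surprisal(word):
        # Add-one smoothing so unseen words get a finite (high) surprisal.
        p = (counts[word] + 1) / (total + vocab + 1)
        return -math.log2(p)

    return surprisal


def delete_least_important(tokens, deletable, n_delete, surprisal):
    """Drop the n_delete deletable tokens with the lowest surprisal.

    `deletable` holds token indices that the (hypothetical) syntactic rules
    marked as removable, e.g. adverbs and adjectives.
    """
    ranked = sorted(deletable, key=lambda i: surprisal(tokens[i].lower()))
    drop = set(ranked[:n_delete])
    return [t for i, t in enumerate(tokens) if i not in drop]


if __name__ == "__main__":
    corpus = "the very very very big dog saw the extremely small cat".split()
    s = surprisal_table(corpus)
    print(delete_least_important(
        ["the", "very", "extremely", "big", "dog"], [1, 2, 3], 1, s
    ))
```

Ranking by surprisal gives a deletion order, so the engine can stop as soon as the required compression is reached.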

13
Translation component
  • integrate TM (TrAID) and MT (Systran)
  • align EN hand-made subtitles with FR and EL hand-made subtitles
  • build a translation memory database (translation units are largely unique, which is not unexpected)
  • perform term extraction on the parallel corpus
  • hand-validate automatically extracted terms and use them for translation customisation purposes
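
A common way to combine a translation memory with machine translation is to consult the memory first and fall back to MT when no sufficiently close match exists. The sketch below illustrates that pattern only; it does not use or reproduce the TrAID or Systran interfaces, and the similarity measure, the 0.8 threshold and the `mt_fallback` callback are assumptions.

```python
from difflib import SequenceMatcher


def translate(segment, memory, mt_fallback, threshold=0.8):
    """Return (translation, origin), where origin is "TM" or "MT".

    memory: dict mapping source subtitles to their stored translations.
    mt_fallback: any callable standing in for a machine translation engine.
    """
    best_score, best_target = 0.0, None
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_score, best_target = score, target
    if best_target is not None and best_score >= threshold:
        # Simplification: a fuzzy match is reused as-is, without post-editing.
        return best_target, "TM"
    return mt_fallback(segment), "MT"


if __name__ == "__main__":
    tm = {"Good evening.": "Bonsoir."}
    fake_mt = lambda text: "<MT>" + text  # placeholder, not a real engine
    print(translate("Good evening.", tm, fake_mt))
    print(translate("The volcano erupted.", tm, fake_mt))
```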

14
Subtitle editing
  • responsible for textual operations: tokenisation, subtitle text splitting, and calculation of cue-in/cue-out timecodes
  • requirement: subtitled text should be segmented at the highest syntactic nodes possible
  • hand-crafted rules, e.g. cut after punctuation, cut after personal pronouns following a verb phrase
  • For EN: use of available shallow parse information
  • For FR and EL: use of part-of-speech information did not produce worse results
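
The "cut after punctuation" rule and the timecode calculation can be sketched as follows. This is an illustration under stated assumptions: the character limit is a parameter rather than a MUSA setting, only the punctuation rule is implemented (no syntactic information), and cue times are spread in proportion to line length, which the slides do not confirm.

```python
def split_subtitle(text, max_chars=37):
    """Split text into lines of at most max_chars, preferring cuts after punctuation."""
    lines = []
    rest = text.strip()
    while len(rest) > max_chars:
        window = rest[: max_chars + 1]
        # Last punctuation mark followed by a space inside the window, if any.
        cut = max(window.rfind(p + " ") for p in ",;:.!?")
        if cut != -1:
            cut += 1  # keep the punctuation mark on the current line
        else:
            cut = window.rfind(" ")  # otherwise fall back to the last space
        if cut <= 0:
            cut = max_chars  # no break point at all: hard cut
        lines.append(rest[:cut].strip())
        rest = rest[cut:].strip()
    if rest:
        lines.append(rest)
    return lines


def assign_cues(lines, start, end):
    """Assign (cue_in, cue_out, line) tuples, sharing time by character count."""
    total = sum(len(line) for line in lines)
    cues, t = [], start
    for line in lines:
        duration = (end - start) * len(line) / total
        cues.append((round(t, 2), round(t + duration, 2), line))
        t += duration
    return cues


if __name__ == "__main__":
    parts = split_subtitle(
        "When the storm passed, the villagers returned to their homes.", 40
    )
    for cue in assign_cues(parts, 0.0, 6.0):
        print(cue)
```

Cutting at the last punctuation mark can leave a short first line; a fuller implementation would weigh candidate cut points using the parse information mentioned above.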

15
Evaluation
  • so far, relatively poor ASR results for subtitling
  • alignment mode of ASR yielded >97% accuracy
  • grammaticality and semantic acceptability of subtitles with targeted compression reached >70%
  • acceptability of translated subtitles in the range of 45-55%
  • evaluation of the integrated prototype very encouraging, entailing considerable productivity gains

16
The MUSA prototype
Musa_EN_Demo.asx Musa_FR_Demo.asx Musa_EL_Demo.asx

17
Conclusions
  • human subtitling is an extremely complex process
  • a simplified computational model is feasible
  • an architecture for a multilingual subtitling system is implementable
  • useful arrays of resources can be sourced and processed at different levels, yielding useful derivative resources

18
What's next for today
  • the session "eTools and Translation II", after the break, is dedicated to MUSA
  • the MUSA team will be around, available for demonstrations of the system and further discussions
  • MUSA on the web: http://sifnos.ilsp.gr/musa