Language and Speech Technology: a personal perspective

1
Language and Speech Technology: a personal perspective
  • Jan Odijk
  • 10 July 2006
  • Utrecht EMasters Summer School in Language and
    Speech 2006

2
Introduction
  • Perspective on Language and Speech Technology
  • personal
  • Europe / Netherlands centered
  • historical (last two decades)
  • focus on
  • Machine Translation for language technology
  • Speech Recognition for speech technology

3
Introduction
  • Two decades
  • 80s: 1980-1994
  • 90s: 1990-2006

4
Overview
  • 80s Language Technology
  • 80s Speech Technology
  • 90s Language and Speech Technology
  • 90s Commercial Activity
  • 90s Importance of Data
  • Current Status and Directions

5
80s Language Technology
  • Focus on MT (in Europe)
  • Eurotra (Europe)
  • Rosetta (Philips, Netherlands)
  • Distributed Translation (BSO, Netherlands)

6
80s Language Technology
  • Linguistic Research Approach
  • Focus on Research
  • not/less on Technology Development
  • Knowledge-based approach
  • hand-crafted lexicons and rules
  • based on a theory / grammatical formalism
  • Focus on linguistically interesting complex
    phenomena
  • less on phenomena that occur often
  • not strongly data-driven

7
80s Language Technology
  • Focus on an idealized language
  • not on actual language use
  • no focus on robustness
  • Computational approach seen (in research) as a
    way to gain insight into language, grammar and
    grammar formalisms
  • no focus on developing a working system
  • no pragmatic solutions

8
80s Language Technology
  • Little formal (quantitative) evaluation
  • only with test suites
  • constructed sentences illustrating linguistic
    phenomena
  • Teaching tools for deductive parsing, Michael
    Moortgat and Willemijn Vermaat

9
80s Language Technology
  • Major Problems (from a technology point of view)
  • Ambiguity
  • Real
  • Temporary
  • Computational Complexity
  • computation-intensive grammar formalisms
  • Complexity of language
  • handcrafting lexicons and rules
  • requires linguistic and computational expertise
  • requires a lot of effort and time

10
80s Language Technology
  • Major problems (cont.)
  • Idealized Language v. Language Use
  • Requires large and rich lexicons suited to the
    application domain: difficult and a large effort
    to make them, and to tune (adapt) them to
    specific domains

11
80s Speech Technology
  • Automatic Speech Recognition (ASR)
  • Statistical Engineering Approach
  • approach based on Noisy Channel Model
  • derive acoustic models from a lot of annotated
    speech examples
  • derive statistical language models from large
    text corpora (n-gram probabilities)
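
In the noisy channel view, the recognizer picks the word sequence W that maximizes P(A|W) · P(W), where the acoustic model supplies P(A|W) and the language model supplies P(W). The language-model half can be sketched as a bigram model estimated by relative frequency. This is a minimal illustration, not the method of any particular system: the toy corpus, the function names and the sentence-boundary markers are assumptions, and real systems add smoothing for unseen n-grams.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Estimate P(w2 | w1) by relative frequency (no smoothing)."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # <s> and </s> are assumed sentence-boundary markers.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        # Every token except the last one serves as a history.
        history_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))
    return {
        (w1, w2): count / history_counts[w1]
        for (w1, w2), count in bigram_counts.items()
    }


def sentence_probability(model, sentence):
    """Multiply bigram probabilities; unseen bigrams get probability 0."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    probability = 1.0
    for bigram in zip(tokens[:-1], tokens[1:]):
        probability *= model.get(bigram, 0.0)
    return probability


# Hypothetical toy corpus standing in for a large text corpus.
corpus = ["call home", "call the office", "dial home"]
model = train_bigram_model(corpus)
seen = sentence_probability(model, "call home")
unseen = sentence_probability(model, "dial the office")
```

Here "dial the office" receives probability zero because the bigram (dial, the) never occurs in the toy corpus, which is the reason practical language models smooth their estimates.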

12
80s Speech Technology
  • Focus on making (small) working systems
  • Statistical approach: the system uses
    probabilities derived from data
  • Focus initially on limited, simple tasks (e.g.
    digit recognition), and increasingly on more
    complex tasks

13
80s Speech Technology
  • Focus on real language use under realistic
    conditions
  • Progress made by making concrete systems and
    evaluating them rigorously

14
80s Speech Technology
  • Speech recognition based on Hidden Markov Models,
    Jan Cernocky and Pavel Matejka
  • Phoneme posterior estimation and acoustic
    keyword-spotting, Igor Szoke
  • Vocal Dialogue Management Frame-Based
    Approaches, Ivan Kopecek and Pavel Cenek, Martin
    Rajman and Miroslav Melichar

15
90s Language Technology
  • Statistical MT
  • derive language models from monolingual corpora
    (probabilities of words and word sequences)
  • align sentences with their translations
  • derive translation model from parallel corpora
  • estimate translation probabilities for words and
    word sequences from the aligned sentences
  • use these probabilities to compute translations
    for new sentences
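
One way to estimate word translation probabilities from sentence-aligned data is expectation-maximization in the style of IBM Model 1. The slide does not name a specific model, so this is only an illustrative sketch under stated assumptions: the three sentence pairs are toy data, the function name is hypothetical, and the NULL source word of the full model is left out for brevity.

```python
from collections import defaultdict


def train_ibm_model1(sentence_pairs, iterations=10):
    """Estimate t(target_word | source_word) with EM (no NULL word)."""
    pairs = [(src.split(), tgt.split()) for src, tgt in sentence_pairs]
    target_vocab = {word for _, tgt in pairs for word in tgt}
    uniform = 1.0 / len(target_vocab)
    # t[(target_word, source_word)], initialized uniformly.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) link counts
        total = defaultdict(float)  # expected link counts per source word
        # E-step: distribute each target word over the source words.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    fraction = t[(tw, sw)] / norm
                    count[(tw, sw)] += fraction
                    total[sw] += fraction
        # M-step: renormalize expected counts per source word.
        t = defaultdict(
            float,
            {(tw, sw): c / total[sw] for (tw, sw), c in count.items()},
        )
    return t


# Hypothetical toy parallel corpus (source sentence, target sentence).
toy_pairs = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]
t = train_ibm_model1(toy_pairs, iterations=20)
```

Although "haus" co-occurs equally often with "the" and "house", the iterations shift probability towards "house", because "the" is better explained by "das" in the other sentence pairs.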

16
90s Language Technology
  • Ambiguity resolved by probabilities based on
    statistics
  • Computational Complexity
  • computationally feasible formalisms
  • proven in speech recognition
  • Complexity of language
  • language and translation model automatically
    derived from data
  • Strong focus on language use
  • data driven
  • Lexicons can be simpler and are derived
    automatically from the data; adaptation to
    specific domains is easy once the data are
    available

17
90s Language Technology
  • Rise of Internet
  • increasing need for information retrieval
  • approximated by search for word and word sequence
    strings
  • Information Retrieval
  • strongly statistically based
  • no or hardly any linguistics
  • formal evaluation (recall, precision, F-score)
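
The three evaluation measures can be computed from the set of retrieved documents and the set of relevant documents. The sketch below uses the standard definitions; the function name and the document identifiers are made up for illustration.

```python
def precision_recall_f(retrieved, relevant, beta=1.0):
    """Return (precision, recall, F-score) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    # F-score: weighted harmonic mean; beta=1 weighs both equally.
    beta_sq = beta * beta
    f_score = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
    return precision, recall, f_score


# Hypothetical query result: 4 documents retrieved, 3 relevant, 2 in common.
p, r, f = precision_recall_f(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d2", "d4", "d5"],
)
```

For this example precision is 2/4, recall is 2/3, and the balanced F-score is 4/7.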

18
90s Language Technology
  • Resulted in
  • strongly data-driven approach in language
    technology
  • increasing use of machine learning techniques
  • explicit focus on formal, esp. quantitative
    evaluation
  • re-examination of simpler/computationally less
    intensive formalisms (finite-state) for syntax

19
90s Language Technology
  • Introduction to Statistical Machine Translation
    by Rafael Banchs
  • Finite State techniques, Gosse Bouma

20
90s Speech Technology
  • Continued working under the established paradigm
  • increasingly improving performance and extending
    environments and application areas

21
90s Companies
  • many companies active in Speech technology
  • IBM, Microsoft, Siemens, Nokia, Philips,
    Motorola, Matra Nortel, Nortel, ...
  • Dragon, Kurzweil, Lernout & Hauspie, SpeechWorks,
    Nuance, Babel, Loquendo, Rhetorical, Vocalis,
    Telisma, Elan, ...

22
90s Companies
  • many companies in Language technology
  • IBM, Microsoft, INSO, Novell, ...
  • GMS, Apptek, Globalink, Lernout & Hauspie,
    Systran, LANT (Xplanation), ...

23
90s Companies
  • MT systems
  • knowledge based systems,
  • developed under an engineering approach
  • simple grammatical formalism, or pruning of the
    search space
  • to reduce ambiguity
  • to reduce computational resource requirements
  • to reduce hand-crafting of rules

24
90s Companies
  • resulted in low quality MT systems
  • still useful in many circumstances
  • Differentiating factors
  • rapid adaptation to (multi-word) terms /
    vocabulary of new domain
  • good performance on named entity recognition

25
90s Data
  • Knowledge-based NLP: realized that cooperation on
    lexicons was required
  • ASR: methodology requires a lot of data
  • "There is no data like more data"
  • This led to
  • Data creation projects
  • Set-up of data distribution centers
  • Projects for developing standards for data

26
90s Data
  • Projects
  • Lexicon projects
  • Multilex
  • Genelex http://perso.orange.fr/laurence.zaysser/llc94.html
  • Acquilex http://www.cl.cam.ac.uk/Research/NL/acquilex/
  • Parole http://www.elda.org/catalogue/en/text/doc/parole.html
  • (WordNet), EuroWordNet
  • SpeechDat projects http://www.speechdat.org/
  • SpeechDat, SpeechDat-Car, SpeechDat-East,
    SPEECON, Orientel
  • National / Local projects
  • Spoken Dutch Corpus (Netherlands and Flanders)

27
90s Data
  • Data distribution Centers are set up
  • LDC http://www.ldc.upenn.edu/ (1993)
  • ELRA http://www.elra.info (1995)
  • Standards
  • TEI http://www.tei-c.org/
  • CES, XCES http://www.xml-ces.org/
  • Eagles, ISLE http://www.ilc.cnr.it/EAGLES96/isle/ISLE_Home_Page.htm

28
Efficient Data Production
  • Methodologies and tools for efficiently producing
    data are being developed
  • See
  • A dictionary writing system and development of a
    selected small dictionary, A. Horak and A.
    Rambousek
  • Scripting in Praat, David Weenink

29
Automating Data Production
  • Automatic or semi-automatic data production
  • Semi-automatic Construction of Arbitrary Domain
    Ontologies, Vit Novacek, Karel Pala

30
Current Status
  • My personal interest
  • Will statistical MT yield better results than
    knowledge-based MT?
  • If so/not, why?
  • Can we further improve MT by combining the good
    aspects of each, and if so, how?

31
Current Status
  • My personal interest
  • Knowledge-based approach
  • strongly data-driven
  • based on real-world data (language use)
  • to guarantee relevant coverage
  • Pay attention to robustness
  • always yield a result
  • Use state-of-the-art techniques to adapt lexicons
    to specific domains

32
Current Status
  • My personal interest
  • reviving Rosetta MT system
  • with assistance of GridLine
  • knowledge-based
  • based on Montague and transformational grammar
  • working in IRME project with Van Dale and
    Groningen on acquisition and lexical
    representation of multi-word expressions

33
Current Status
  • My personal interest
  • Can it be done?
  • Yes! Groningen (Alpino) has proven it for a Dutch
    robust knowledge-based parser

34
Current Status
  • Language and Speech Technology in 2006
  • Exciting area!
  • A lot of commercial activity, and expanding
  • A lot of interesting topics are open for research

35
Commercial Activity
  • many companies in Language technology
  • IBM, Microsoft, ...
  • Apptek, Linguatec, Systran, Knowledge Concepts,
    Q-go, ...
  • applications
  • MT, content management, information retrieval,
    dealing with customer questions,...

36
Commercial Activity
  • many companies in Speech technology
  • IBM, Microsoft, HarmanBecker, Motorola, Nokia,
    ...
  • Nuance, Loquendo, Acapela, SVOX, Telisma, ...
  • even more in application development and system
    integration

37
Commercial Activity
  • applications
  • Network IVR applications (Call centers, banking,
    information services,...)
  • Embedded applications
  • in-car applications, e.g. voice activated
    dialing, navigation (voice destination entry)
  • mobile phone/PDA applications
  • multimodal output e.g. for navigation
  • command and control
  • (SMS) dictation coming soon

38
Commercial Activity
  • applications
  • Office Applications
  • Dictation, horizontal and vertical (medical,
    legal)
  • Language learning
  • Audiomining
  • information retrieval from recorded speech
    (possibly incl. other modalities)
    Radio/TV-broadcasts, parliamentary sessions, ...

39
Research Topics?
  • Speech Technology (Recognition)
  • new paradigms?
  • cf. FLaVoR project http://www.esat.kuleuven.be/psi/spraak/projects/FLaVoR/
  • Combination with other modalities
  • AMI http://www.amiproject.org
  • The AMI project: Insights into the processing of
    multimodal meeting recordings, Vincent Wan,
    Stuart Wrigley and Simon Tucker
  • CHIL http://chil.server.de/servlet/is/101/
  • IMIX (Interactive Multimodal Information
    eXtraction)

40
Research Topics?
  • Speech Technology (Recognition)
  • robustness against noise and other speakers
  • increasing use in car and in public places on
    PDAs and mobile phones
  • Auditory processing of speech in noise, Robert
    Mill
  • pronunciation of names
  • Autonomata project (incl. Nuance, Ghent, Nijmegen
    and Utrecht)

41
Research Topics?
  • Speech technology (Text-to-Speech)
  • better control over prosody in corpus-based TTS?
  • Unit Selection Speech Synthesis with BOSS, Stefan
    Breuer and Petra Wagner
  • Combination with other modalities

42
Research Topics?
  • Language Technology
  • Semantic Annotation of Corpora
  • OntoNotes http://www.isi.edu/natural-language/people/hovy/papers/06HLT-NAACL-OntoNotes-short.pdf
  • Utrecht contribution to STEVIN D-COI project
  • How to use this semantic annotation in practical
    systems?

43
Research Topics?
  • Language Technology
  • (Semi-)automatic lexicon creation/adaptation
  • Sophisticated information retrieval
  • Language based information retrieval, Petr Sojka
    and Jan Pomikalek
  • Information extraction, summarization and merging

44
Research Topics?
  • Language And Speech Technology
  • Speech to Speech Translation
  • TC-STAR http://www.tc-star.org/

45
Research Topics?
  • Dutch-Flemish STEVIN programme
  • running from 2004-2009
  • 11.4M budget
  • resources
  • research
  • applications
  • demonstration projects
  • many projects are running
  • several calls still to be launched
  • http://www.taalunieversum.nl/stevin

46
  • Enjoy the Summerschool!