NICE: Native language Interpretation and Communication Environment

1
NICE: Native language Interpretation and Communication Environment
  • Jaime Carbonell, Lori Levin, Alon Lavie, Language
    Technologies Institute
  • Carnegie Mellon University
  • jgc, lsl, alavie_at_cs.cmu.edu

2
Machine Translation of Indigenous Languages
  • Policy makers have access to information about
    indigenous people.
      • Epidemics, crop failures, etc.
  • Indigenous people can participate, without giving
    up their languages, in:
      • Health care
      • Education
      • Government
      • Internet

3
History of NICE
  • Arose from a series of joint workshops of NSF and
    OAS.
  • Workshop recommendations
      • Create multinational projects using information
        technology to:
          • provide immediate benefits to governments and
            citizens
          • develop critical infrastructure for
            communication and collaborative research
          • train researchers and engineers
          • advance science and technology

4
Architecture Diagram
  • [Diagram not reproduced. Component labels: SL Input,
    Run-Time Module, Learning Module, SL Parser, EBMT
    Engine, Elicitation Process, Learning Process,
    Transfer Rules, Transfer Engine, TL Generator, User,
    Unifier Module, TL Output]

5
EBMT Example
  • English: I would like to meet her.
    Mapudungun: Ayükefun trawüael fey engu.
  • English: The tallest man is my father.
    Mapudungun: Chi doy fütra chi wentru fey ta inche ñi
    chaw.
  • English: I would like to meet the tallest man
    Mapudungun (new): Ayükefun trawüael Chi doy fütra chi
    wentru
    Mapudungun (correct): Ayüken ñi trawüael chi doy fütra
    wentruengu.
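
The recombination in this example can be sketched in a few lines of Python. This is only a toy illustration, not the NICE EBMT engine: the fragment table, the function name `translate_by_fragments`, and the greedy longest-match strategy are all assumptions made for the sketch, and the two English/Mapudungun fragment alignments are inferred from the "new" output on the slide.

```python
# Toy fragment memory. These alignments are assumptions inferred from the
# slide's "new" output; they are not the project's actual data.
FRAGMENTS = {
    "i would like to meet": "Ayükefun trawüael",
    "the tallest man": "Chi doy fütra chi wentru",
}


def translate_by_fragments(sentence, fragments=FRAGMENTS):
    """Cover the input greedily with the longest stored fragments."""
    words = sentence.lower().rstrip(".").split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest span starting at position i first.
        for j in range(len(words), i, -1):
            key = " ".join(words[i:j])
            if key in fragments:
                output.append(fragments[key])
                i = j
                break
        else:
            # No stored fragment covers this word; mark it untranslated.
            output.append(f"<{words[i]}>")
            i += 1
    return " ".join(output)


naive = translate_by_fragments("I would like to meet the tallest man")
print(naive)
```

Under these assumptions the sketch reproduces the slide's "new" line by simply concatenating the two stored fragments. The "correct" line on the slide differs from it, which suggests that fragment concatenation alone does not yield the right Mapudungun form for this sentence.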

6
NICE Partners

7
Agreement between LTI and Institute of Indigenous
Studies (IEI), Universidad de La Frontera, Chile
  • Contributions of IEI
      • Native language knowledge and linguistic
        expertise in Mapudungun
      • Experience in bicultural, bilingual education
      • Data collection: recording, transcribing,
        translating
      • Orthographic normalization of Mapudungun

8
Agreement between LTI and Institute of Indigenous
Studies (IEI), Universidad de La Frontera, Chile
  • Contributions of LTI
      • Develop MT technology for indigenous languages
      • Training for data collection and transcription
      • Partial support for the data collection effort,
        pending funding from the Chilean Ministry of
        Education
      • International coordination, technical and
        project management

9
LTI/IEI Agreement
  • Continue collaboration on data collection and
    machine translation technology.
  • Pursue focused areas of mutual interest, such as
    bilingual education.
  • Seek additional funding sources in Chile and the
    US.

10
The IEI Team
  • Coordinator (leader of a bilingual and
    multicultural education project)
      • Eliseo Canulef
  • Distinguished native speaker
      • Rosendo Huisca
  • Linguists (one native speaker, one near-native)
      • Juan Hector Painequeo
      • Hugo Carrasco
  • Typists/Transcribers
  • Recording assistants
  • Translators
  • Native speaker linguistic informants

11
MINEDUC/IEI Agreement: Highlights
  • Based on the LTI/IEI agreement, the Chilean
    Ministry of Education agreed to fund the data
    collection and processing team for the year 2001.
    This agreement will be renewed each year, as
    needed.

12
MINEDUC/IEI Agreement: Objectives
  • To evaluate the NICE/Mapudungun proposal for
    orthography and spelling
  • To collect an oral corpus that represents the four
    Mapudungun dialects spoken in Chile. The main
    domain is primary health care, both traditional and
    Western.

13
MINEDUC/IEI Agreement: Deliverables
  • An oral corpus of 800 hours recorded,
    proportional to the demography of each currently
    spoken dialect
  • 120 hours transcribed and translated from
    Mapudungun to Spanish
  • A refined proposal for writing Mapudungun

14
NICE/Mapudungun Database
  • Writing conventions (Grafemario)
  • Glossary Mapudungun/Spanish
  • Bilingual newspaper, 4 issues
  • Ultimas Familias memoirs
  • Memorias de Pascual Coña
  • Publishable product with new Spanish translation
  • 35 hours transcribed speech
  • 80 hours recorded speech

15
NICE/Mapudungun: Other Products
  • Standardization of orthography: Linguists at UFRO
    have evaluated the competing orthographies for
    Mapudungun and written a report detailing their
    recommendations for a standardized orthography
    for NICE.
  • Training for spoken language collection: In
    January 2001, native speakers of Mapudungun were
    trained in the recording and transcription of
    spoken data.

16
Underfunded Activities
  • Data collection
      • Colombia (unfunded)
      • Chile (partially funded)
  • Travel
      • More contact between CMU and Chile (UFRO) and
        Colombia.
  • Training
      • Train Mapuche linguists in language technologies
        at CMU.
      • Extend training to Colombia
  • Refine MT system for Mapudungun and Siona
      • Current funding covers research on the MT engine
        and data collection, but not detailed linguistic
        analysis