Title: NSF Funding of LT resources
 1NSF Funding of LT resources 
- Tanya Korelsky, Program Director 
- Robust Intelligence Cluster 
- Division of Information and Intelligent Systems 
- Directorate for Computer and Information Science 
 and Engineering
- National Science Foundation 
- tkorelsk_at_nsf.gov 
- http://www.nsf.gov/ 
2How NSF is organized
Office of the Director
 Biological Sciences
Geosciences
Computer and Information Science and Engineering
Mathematical and Physical Sciences
Education and Human Resources
Social, Behavioral and Economic Sciences
Engineering 
 3How CISE is organized
Office of the Director
Office of the Assistant Director for CISE
CCF Computing and Communications Foundations
CNS Computer and Network Systems
IIS Information and Intelligent Systems
OCI Office of Cyberinfrastructure
(formerly SCI, now with NSF-wide mission, 
reporting to Director of NSF)
Clusters
Crosscutting Emphasis Areas 
 5CISE Proposal/Award Statistics 
FY    Proposals  Awards  Funding Rate (%)  CGIs   Supplements
2005  4,962      1,086   23                1,398  581
2004  6,266      1,017   16                1,297  400
2003  5,346      1,174   22                1,023  354
2002  4,314      1,038   24                918    308
2001  3,579      885     25                768    231
2000  2,853      903     32                547    210
1999  2,209      746     34                493    301
1998  1,885      667     35                476    211
1997  1,894      684     36                527    219
1996  1,760      601     34                610    183
1995  1,941      708     36                631    215
ADJUSTED 
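
As a rough cross-check of the statistics above, the funding rate appears to be awards divided by proposals, expressed as a percentage. The sketch below recomputes it for a few rows copied from the table. The formula and the helper name `funding_rate` are assumptions, not something the slides state. Under this formula the listed 2005 rate (23) is not reproduced exactly, while the other rows shown agree after rounding.

```python
# Rows copied from the table: fiscal year -> (proposals, awards, listed rate in %).
ROWS = {
    2005: (4962, 1086, 23),
    2004: (6266, 1017, 16),
    2000: (2853, 903, 32),
    1995: (1941, 708, 36),
}


def funding_rate(proposals: int, awards: int) -> float:
    """Assumed definition: awards as a percentage of proposals."""
    return 100.0 * awards / proposals


# Print the recomputed rate next to the rate listed on the slide.
for year, (proposals, awards, listed) in sorted(ROWS.items(), reverse=True):
    computed = funding_rate(proposals, awards)
    print(f"FY{year}: computed {computed:.1f}%, listed {listed}%")
```

Running the loop over all eleven rows is a quick way to spot entries where the listed rate and the raw counts disagree.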
 6CISE Budget 2003-2007
Requested 6.1% increase includes $20M for cybersecurity, $10M for GENI
[Chart: CISE budget in millions of dollars by fiscal year, 2003 through the 2007 request; axis from 475 to 525, with labeled values 496M and 527M]
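
The requested increase can be sanity-checked with simple arithmetic. The sketch below assumes that the 496M label belongs to FY 2006 and the 527M label to the FY 2007 request; the chart residue does not say which bar each value labels, so this pairing is an assumption, as are the constant and function names.

```python
def percent_increase(previous: float, requested: float) -> float:
    """Percentage change from the previous budget to the requested one."""
    return 100.0 * (requested - previous) / previous


# Assumed pairing of the chart labels (in millions of dollars).
FY2006_BUDGET = 496.0
FY2007_REQUEST = 527.0

increase = percent_increase(FY2006_BUDGET, FY2007_REQUEST)
print(f"Requested increase: {increase:.2f}%")
```

With these rounded labels the result is roughly 6%, which is in line with the stated 6.1 increase; the small gap is expected if the slide's figure was computed from unrounded budget amounts.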
 7The Human Language and Communication Program (HLC)
- Initiated by Dr. Mary Harper 
- This HLC program emphasizes innovative advances 
 in computer and information sciences relating to
 all forms of human communication.
- High-level human communication topics 
- Text Processing 
- Speech Processing 
- Multimodal Communication Processing 
- HLC is attempting to strengthen current research 
 while broadening future research directions of
 the language processing research community (e.g.,
 multimodal communication).
8HLC/ITR LT recent resource, annotation and 
evaluation metrics awards
- ITR '03: Collaborative effort on Interlingual 
 Annotation
- HLC '04: Constructing an Enhanced Version of 
 WordNet, $100K (12 months)
- HLC '05: 
- Rapid Development of Frame Semantic lexicon, to 
 ICSI, UC Berkeley, $400K (36 months)
- SGER: Learning Syntax-based Evaluation Metrics 
 for Machine Translation, Dr. Rebecca Hwa,
 University of Pittsburgh, $200K (24 months)
- A Framework for Learning High Accuracy Evaluation 
 Metrics for NLP Applications, Dr. Alon Lavie,
 CMU, $150K (24 months)
9CISE CRI (Computing Research Infrastructure) 
Program
- Funds community resources; for IIS programs, 
 reviewers are supplied by the technical program
 directors
- '04 LT resource planning award to Vassar 
 College: An Open Linguistic Infrastructure for
 American English, $50K (12 months)
- '05 LT resource/annotation awards: 
- Towards a Comprehensive Linguistic Annotation of 
 Language (Brandeis, UColorado, Pitt, Penn, NYU),
 $850K, 24 months; goals include achieving an
 international consensus on a meta-specification
 framework
- Another planning award ($100K) to Vassar College 
 and Princeton University: An Open Linguistic
 Infrastructure for American English; goals
 include annotation of semantic categories using
 WordNet and FrameNet
10Information and Intelligent Systems 
Reorganization into Clusters
- Robust Intelligence 
-  Artificial Intelligence, Human Language and 
 Communication, Robotics, Computer Vision,
 Computational Neuroscience
 
- Human-centered Computing 
-  Human Computer Interaction, Social Informatics, 
 Universal Access
- Information Integration and Informatics 
-  Data, Information, and Knowledge Management; 
 Information Integration; Science and Engineering
 Informatics; Digital Libraries; Digital
 Government
11Information and Intelligent Systems
- New Cluster-oriented Solicitation 
- Scheduled to be published in May, with submission 
 deadline in late October to early November
- One of the cross-cutting threads: Human-Robot 
 Interaction
- Implications for HLC area: renewed attention to 
- dialogue (human-human, machine-human) 
- ASR of imperfect and affected speech 
- Speech-to-concept understanding; 
 concept-to-speech generation
- Need corpora to support these research areas! 
12One Small Current Effort
- SGER (Small Grant for Exploratory Research) 
- Creation of a Goal-Oriented, Human-Machine Spoken 
 Corpus
- ICSI (UC Berkeley), Dr. Dilek Hakkani-Tür 
- Building a spoken mixed-initiative dialogue 
 system for conference services
- Deploying the system for the IEEE SLT Workshop 
 (December 2006)
- Collecting and annotating the dialogue corpus
13Digital Tools Summit at Michigan State University 
(June 2006)
- Funded jointly by the Linguistics Program and 
 (former) HLC program
- Addresses a functionality gap between the tools 
 that documentary linguists and typologists need
 and the ability of existing tools to annotate
 partially-understood linguistic data
- Existing methods and tools presuppose a 
 regularized digital corpus of a well-understood
 language and require a high degree of
 computational sophistication
- Aims to develop a roadmap for creating regional 
 and national language archives and the tools to
 achieve it
- Brings together theoretical computational 
 linguists and data-driven linguists to
 brainstorm the challenging issues
14NSF perspective on funding LT resources
- New corpora for dialogue research 
- New corpora for ASR research 
- mixed language (English-Spanish) 
- affected speech (911 calls); senior speech 
- New general corpora (ANC), both text and speech 
- Dependency treebanks and parsers 
- Harmonization of existing semantic resources 
 (WordNet and FrameNet)
- Basic research on semantic annotation; ambivalent 
 attitude to standardization
15NSF perspective on funding LT resources 
(international resources)
- Parallel corpora for new MT research on 
 statistical methods applied to syntactic and
 semantic representations
- Research on MT for minority languages (pending 
 award to CMU for Inupiaq and Aymara)
- Corpora for research on language identification 
- International collaboration on speech processing 
 (NYU-EBIRE-CNRS) and on unified linguistic
 annotation
- International workshop on dependency 
 representations (2007 ACL in Prague)
16Thank you
- Tanya Korelsky 
- Robust Intelligence 
- Human Language and Communication 
- Division of Information and Intelligent Systems 
- Directorate for Computer and Information Science 
 and Engineering
- National Science Foundation 
- tkorelsk_at_nsf.gov 
- http://www.nsf.gov/ 
17Digital Living 2010
People across the globe will have access to each 
other and information provided by pervasive 
devices, embedded sensors and systems because all 
will be connected to the Internet.
Thanks to David Kotz at Dartmouth 
 18Global Environment for Networking Innovations 
(GENI) 
- Limitations of the Internet 
- Security mechanisms not included in the IP layer 
- End-to-end robustness cannot be assumed or 
 assured
- Scaling limitations 
- Quality of service mechanisms have not diffused 
 widely in the public Internet
- Support for new technologies difficult (e.g., 
 wireless, mobility, sensors)
19Global Environment for Networking Innovations 
- New networking and distributed system 
 architectures
- Build in security and robustness 
- Enabling pervasive computing, bridging the gap 
 between the physical and virtual worlds by
 including mobile, wireless and sensor networks
- Enable control and management of other critical 
 infrastructures
- Include ease of operation and usability 
- New classes of societal-level services and 
 applications
20Global Environment for Networking Innovations
- Research Program 
- Supports research, design, and development of new 
 networking and distributed systems
- Builds on many years of knowledge and experience, 
 but reexamines all networking assumptions and
 reinvents where needed
- Design for intended capabilities; deploy and 
 validate architectures; build new services and
 applications
- Encourage users to participate in experimentation 
- Take a system-wide approach to the synthesis of 
 new architectures
21Global Environment for Networking Innovations
- Facility 
- Shared use through slicing and virtualization 
 (where "slice" denotes the subset of resources
 bound to a particular experiment)
- Access to physical facilities through 
 programmable platforms (e.g., via customized
 protocol stacks)
- Large-scale user participation by "user opt-in" 
 and IP tunnels
- Protection and collaboration among researchers by 
 controlled isolation and connection among slices
- A broad range of investigations using new classes 
 of platforms and networks, a variety of access
 circuits and technologies, and global control and
 management software
- Interconnection of independent facilities via 
 federated design.
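
The facility ideas above (slices as subsets of resources bound to one experiment, with controlled isolation and connection among slices) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not a GENI interface: the `Facility` class, its methods, and the resource names are all hypothetical, and each resource name is assumed to stand for an already virtualized share of a physical platform.

```python
from dataclasses import dataclass, field


@dataclass
class Facility:
    """Hypothetical model of a shared facility; not a GENI API."""

    resources: set[str]
    slices: dict[str, set[str]] = field(default_factory=dict)
    links: set[frozenset[str]] = field(default_factory=set)

    def create_slice(self, experiment: str, wanted: set[str]) -> None:
        """Bind a subset of resources to one experiment, refusing overlaps."""
        if not wanted <= self.resources:
            raise ValueError("slice requests resources the facility lacks")
        for other, held in self.slices.items():
            if held & wanted:
                raise ValueError(f"resources already bound to {other}")
        self.slices[experiment] = set(wanted)

    def connect(self, a: str, b: str) -> None:
        """Explicitly allow two existing slices to exchange traffic."""
        if a not in self.slices or b not in self.slices:
            raise KeyError("both slices must exist")
        self.links.add(frozenset((a, b)))

    def can_communicate(self, a: str, b: str) -> bool:
        """Slices are isolated unless a connection was set up."""
        return a == b or frozenset((a, b)) in self.links


# Illustrative usage with made-up resource and experiment names.
facility = Facility(resources={"node1/vm0", "node1/vm1", "node2/vm0"})
facility.create_slice("exp-a", {"node1/vm0"})
facility.create_slice("exp-b", {"node1/vm1", "node2/vm0"})
isolated_by_default = not facility.can_communicate("exp-a", "exp-b")
facility.connect("exp-a", "exp-b")
connected_after_link = facility.can_communicate("exp-a", "exp-b")
print(isolated_by_default, connected_after_link)
```

The design choice here is that isolation is the default and connection is an explicit act, which mirrors the "controlled isolation and connection among slices" wording; a real facility would also need scheduling, authentication, and federation, none of which this sketch attempts.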
22Global Environment for Networking Innovations
- Outreach 
- CISE has supported numerous community workshops 
 in support of GENI
- CISE is supporting on-going planning efforts, 
 including needs assessment and requirements for
 the GENI Facility.
- CISE will hold town meetings and continue to 
 support future workshops to broaden community
 participation.
- CISE will work with industry, other US agencies, 
 and international groups to broaden participation
 in GENI beyond NSF and the US government.