Title: Nuance Speech Technology ______________________ Developing Applications
1Nuance Speech Technology ______________________ Developing Applications
2Speech Recognition Applications
- Retrieve information
- Perform an action
- Examples
- Banking applications
- Stock quotes applications
- Travel reservations
- Auto attendant systems
3Benefits
- Existing applications are more useful
- Say what you mean: "what's my checking balance" instead of "63"
- Larger menus
- Complex, information-rich queries
- New applications are possible
- Stock quotes require a very large vocabulary
- Travel reservations would be too tedious with
isolated speech or touch tones
4Nuance Software Architecture
5RecClient
- Answers the phone
- Plays prompts
- Detects touch tones
- Performs endpointing
- Transmits speech to server
6RecServer
- Recognition
- Interpretation
- Front-end processing
- Multiple simultaneous client connections
- Serves multiple applications, packages, languages
7Resource Manager
- Connects clients to servers on a per-utterance
basis
8Client/Server Architecture
- Load balancing and resource sharing
- Multiple callers handled by the same recognition server
- Client connects to least-loaded server for each utterance
- Distributed
- Clients can be on less expensive hardware than servers
- Robust
- Clients automatically connect to available server
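The per-utterance load balancing described above can be pictured with a minimal Python sketch. This is not the Resource Manager's actual algorithm or API; the `RecServer` class, its fields, and `pick_server` are hypothetical names chosen only to illustrate "connect to the least-loaded available server for each utterance".

```python
from dataclasses import dataclass


@dataclass
class RecServer:
    # Hypothetical stand-in for a recognition server known to the resource manager.
    name: str
    active_utterances: int = 0
    available: bool = True


def pick_server(servers):
    """Return the available server with the fewest active utterances."""
    candidates = [s for s in servers if s.available]
    if not candidates:
        raise RuntimeError("no recognition server available")
    return min(candidates, key=lambda s: s.active_utterances)


servers = [
    RecServer("rec-a", active_utterances=3),
    RecServer("rec-b", active_utterances=1),
    RecServer("rec-c", active_utterances=0, available=False),  # down, so skipped
]

# One selection is made per utterance, not per call.
chosen = pick_server(servers)
chosen.active_utterances += 1
```

Because the choice is repeated for every utterance, a server that fails between utterances simply drops out of the candidate list, which is one way to read the "Robust" bullet.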
9Application
- Created by the developer
- Manages the dialog
- Interfaces with
- The dialog construction API (e.g., Nuance's Dialog Builder)
- Lower-level APIs as required
- E.g., RecClient APIs for control of individual recognition steps, while shielding the application from handling of raw audio data
- E.g., VirtualRecServer API for low-level control of raw audio data
10Nuance APIs
11Nuance APIs
- Dialog Builder
- High-level API for writing telephony applications
- Written on top of RCAPI
- RCAPI
- Event-driven interface between application or application development tool and RecClient
- Written in C
- Being replaced by RCEngine API
12Nuance APIs
- RCEngine
- Written in C++
- Same functionality as RCAPI
- VRSAPI
- Primarily for integrating hardware that can't be integrated at RecClient audio provider level
- Interfaces to RecServer through Resource Manager
13Nuance Recognition and Understanding Engine
- [Diagram; labels: Speech, Front End, Features, Grammar, Word String, Interpretation, Meaning]
14Specifying, Compiling, and Testing Grammars
15Overview
- Grammar specification basics
- Natural language (NL) specification
- Compiling grammars
- Testing grammars
16What is a Grammar?
- Specifies what can be said: all the possible sentences and phrases that can be recognized
- File is called application.grammar, where application is the name of the recognition package
- Written in Grammar Specification Language (GSL)
17Grammar Basics
- ; This is a simple grammar
- .Sentence ( good morning )
- A semicolon (;) indicates a comment
- .Sentence is the name of the grammar
- Grammar names must contain at least one uppercase character
- Words are lowercase
- ( A B C ... D ): A and B and C and ... and D, in that order
18OR Construction
- [ A B C ... D ]: A or B or C or ... or D
- .Sentence ( good [ morning afternoon evening ] )
19Optional Words and Repetition
- ?A: A is optional
- .Command ( tell me my balance in checking ?please )
- +A: One or more repetitions of A
- .Sentence ( thanks +very much )
- *A: Zero or more repetitions of A
- .Sentence ( thanks *very much )
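To make the sequence, OR, optional, and repetition operators concrete, here is a small Python sketch that enumerates the phrases such constructions accept. It is only an illustration, not how the Nuance compiler works: the tuple encoding (`"seq"`, `"or"`, `"opt"`, `"plus"`, `"star"`) and the `max_repeat` cap on repetitions are assumptions made so the enumeration stays finite.

```python
from itertools import product


def expand(node, max_repeat=2):
    """Return the word tuples accepted by a GSL-like node (repetition capped)."""
    if isinstance(node, str):          # a single lowercase word
        return [(node,)]
    op, *args = node
    if op == "seq":                    # ( A B C )
        parts = [expand(a, max_repeat) for a in args]
        return [sum(combo, ()) for combo in product(*parts)]
    if op == "or":                     # [ A B C ]
        return [p for a in args for p in expand(a, max_repeat)]
    if op == "opt":                    # ?A
        return [()] + expand(args[0], max_repeat)
    if op in ("plus", "star"):         # +A and *A
        base = expand(args[0], max_repeat)
        start = 1 if op == "plus" else 0
        phrases = []
        for n in range(start, max_repeat + 1):
            phrases.extend(sum(combo, ()) for combo in product(base, repeat=n))
        return phrases
    raise ValueError(f"unknown operator: {op}")


def phrases(node, max_repeat=2):
    return [" ".join(words) for words in expand(node, max_repeat)]


# .Sentence ( good [ morning afternoon evening ] )
greeting = ("seq", "good", ("or", "morning", "afternoon", "evening"))
# .Sentence ( thanks *very much )
thanks_star = ("seq", "thanks", ("star", "very"), "much")
# .Sentence ( thanks +very much )
thanks_plus = ("seq", "thanks", ("plus", "very"), "much")

greeting_phrases = phrases(greeting)
star_phrases = phrases(thanks_star)
plus_phrases = phrases(thanks_plus)
```

The only difference between the two "thanks" grammars is that the `*` version also accepts "thanks much", while the `+` version requires at least one "very".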
20Writing a Good Grammar
- Broad coverage
- People express themselves in a variety of ways
- Recognizer cannot recognize anything not in the grammar
- But not too broad
- Recognition accuracy can be adversely affected
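One rough way to check coverage is to measure how many collected test utterances fall inside the grammar. The Python sketch below assumes the grammar's phrases have already been listed as plain strings; `in_grammar_rate` is a hypothetical helper, not a Nuance tool, and it says nothing about recognition accuracy itself.

```python
def normalize(text):
    """Lowercase and collapse whitespace so comparisons are word-based."""
    return " ".join(text.lower().split())


def in_grammar_rate(grammar_phrases, utterances):
    """Fraction of utterances that exactly match some phrase in the grammar."""
    if not utterances:
        raise ValueError("need at least one utterance")
    allowed = {normalize(p) for p in grammar_phrases}
    hits = sum(1 for u in utterances if normalize(u) in allowed)
    return hits / len(utterances)


grammar_phrases = ["good morning", "good afternoon", "good evening"]
collected = ["good morning", "Good  evening", "good day", "hello"]
rate = in_grammar_rate(grammar_phrases, collected)
```

A low rate suggests the grammar is too narrow; widening it should be weighed against the accuracy cost noted above.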
21Natural Language Interpretation
- NL interpretation assigns meaning to word strings
- Many utterances ...
- withdraw fifteen hundred bucks from savings
- take fifteen hundred out of savings
- give me one thousand five hundred dollars from my savings account
- ... may express the same meaning
- <action "withdrawal">
- <source_account "savings">
- <amount 1500>
22Interpretation
- Define the relevant slots for the domain
- Slot Value
- command "transfer"
- source-account "savings"
- destination-account "checking"
- amount 125.10
- Transfer one twenty five ten from savings to checking
- I want to transfer to checking from savings one hundred twenty five dollars and ten cents
- Please put a hundred twenty five dollars ten cents in checking from my savings account
23Interpretation
- The Nuance NL engine uses a slot and filler representation of meaning
- Slots are ...
- Defined for the domain
- command
- amount
- source
- Associated with word strings in the grammar
- Filled with values when the associated word
string is recognized by NL Interpretation
24Slot-Filling Commands
- NL commands go between curly braces
- Commands attach to the preceding item: either a word or a grammar construction
- NL commands are part of the grammar file
- .Command ( withdraw from [ checking {<source_account "checking">}
-                            savings {<source_account "savings">}
-                          ]
-          ) {<action "withdrawal">}
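The effect of the slot-filling commands in the .Command example can be mimicked with a short Python sketch. This is an illustration of the idea only, not the NL engine: `interpret_withdrawal` and the dictionary layout are hypothetical. Each account word carries its own slot assignment, and the action slot is attached to the whole construction.

```python
# Slot assignments attached to individual words, as in
#   checking {<source_account "checking">}
ACCOUNT_SLOTS = {
    "checking": {"source_account": "checking"},
    "savings": {"source_account": "savings"},
}


def interpret_withdrawal(utterance):
    """Return the filled slots for 'withdraw from <account>', or None if no match."""
    words = utterance.lower().split()
    if len(words) != 3 or words[:2] != ["withdraw", "from"]:
        return None
    if words[2] not in ACCOUNT_SLOTS:
        return None
    slots = dict(ACCOUNT_SLOTS[words[2]])   # command attached to the matched word
    slots["action"] = "withdrawal"          # command attached to the whole phrase
    return slots


result = interpret_withdrawal("withdraw from savings")
```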
25The Slot Definitions File
- In conjunction with the grammar file, a slot definitions file defines the slots for the application
- The slot definitions file is simply a list of all the slot names
- account
- source-account
- dest-account
- The slot definitions file must be called
application.slot_definitions
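Since every slot used in the grammar needs an entry in the slot definitions file, a quick cross-check can catch omissions before compiling. The Python sketch below is a hypothetical helper, not part of the Nuance toolkit; it assumes slot-filling commands have the form `<slot-name value>` and that the definitions file holds one slot name per line.

```python
import re


def undeclared_slots(grammar_text, slot_definitions_text):
    """Return slot names used in the grammar but absent from the definitions."""
    declared = {
        line.strip()
        for line in slot_definitions_text.splitlines()
        if line.strip()
    }
    # Capture the name that follows each '<' in a slot-filling command.
    used = set(re.findall(r"<\s*([A-Za-z_][\w-]*)", grammar_text))
    return sorted(used - declared)


grammar_text = """
.Command ( withdraw from [ checking {<source_account "checking">}
                           savings {<source_account "savings">}
                         ]
         ) {<action "withdrawal">}
"""
slot_definitions_text = "source_account\n"

missing = undeclared_slots(grammar_text, slot_definitions_text)
```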
26More About Grammars
- Subgrammars
- Return Commands
- NL Functions
- Adding New Words
27Subgrammars
- Subgrammars match a part of an utterance
- Account ( [ savings
-             checking
-             ( money market )
-           ]
-           ?account
-         )
- Top-level grammars are prefaced by a period (.)
- Subgrammars reduce redundancy
- .Command [ ( tell me the balance in Account )
-            ( transfer from Account to Account )
-            ( withdraw from Account )
-          ]
28Return Commands and Variables
- To associate a return value with a grammar:
- {return("checking")}
- return is like other commands except no slot is filled; only the value is defined
- Assignment: A higher-level grammar can store the returned value in a variable
- <Sub-grammar>:<variable_name>
- Example: Account:acct results in the variable acct being set to the value returned by the grammar Account
- Dereferencing: To access a variable's value, preface the variable name with $
29Return Commands and Variables
- .Command [ ( tell me the balance in Account:acct )
-              {<account $acct>}
-            ( transfer from Account:src to Account:dest ) {<source-account $src>
-              <dest-account $dest>}
-            ( withdraw from Account:src ) {<source-account $src>}
-          ]
- Account ( [ checking {return("checking")}
-             savings {return("savings")}
-             ( money market ) {return("money_market")}
-           ]
-           ?account
-         )
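The interplay of return values, variable assignment, and dereferencing can be traced in a Python sketch. It is a hand-written matcher for just these three command shapes, not a GSL interpreter; `match_account` and `interpret` are hypothetical names. The value returned by `match_account` plays the role of `{return(...)}`, and the local variables `src`, `dest`, and `acct` play the role of the NL variables.

```python
# Account ( [ checking savings ( money market ) ] ?account ), with return values
ACCOUNT = {
    ("checking",): "checking",
    ("savings",): "savings",
    ("money", "market"): "money_market",
}


def match_account(words, i):
    """Match the Account subgrammar at index i; return (value, next_index) or None."""
    for phrase, value in ACCOUNT.items():
        j = i + len(phrase)
        if tuple(words[i:j]) == phrase:
            if j < len(words) and words[j] == "account":   # ?account
                j += 1
            return value, j
    return None


def interpret(utterance):
    """Fill slots for the three .Command alternatives, or return None."""
    words = utterance.lower().split()
    n = len(words)
    if words[:5] == ["tell", "me", "the", "balance", "in"]:
        m = match_account(words, 5)
        if m and m[1] == n:
            acct = m[0]                                    # Account:acct
            return {"account": acct}                       # <account $acct>
    if words[:2] == ["transfer", "from"]:
        m = match_account(words, 2)
        if m and words[m[1]:m[1] + 1] == ["to"]:
            src = m[0]                                     # Account:src
            m2 = match_account(words, m[1] + 1)
            if m2 and m2[1] == n:
                dest = m2[0]                               # Account:dest
                return {"source-account": src, "dest-account": dest}
    if words[:2] == ["withdraw", "from"]:
        m = match_account(words, 2)
        if m and m[1] == n:
            return {"source-account": m[0]}
    return None


transfer = interpret("transfer from savings to money market account")
```

Note that the two-word phrase "money market" still yields the single value "money_market", which is the point of giving the subgrammar an explicit return value.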
30NL Functions
- Slot values and return values can be function calls
- Available functions
- add: returns the sum of two integers
- sub: returns the result of subtracting the second integer from the first
- mul: returns the product of two integers
- div: returns the truncated integer result of dividing the first integer by the second (e.g., div(9 5) returns 1)
- neg: returns the negation of an integer
- strcat: returns the concatenation of two strings
- Arguments separated by whitespace, not commas
- No space between function name and parenthesis
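The function-call syntax listed above (whitespace-separated arguments, no space before the parenthesis) can be explored with a small Python evaluator. It is a sketch for experimenting with the notation, not the NL engine's implementation: `evaluate` is a hypothetical name, and truncation toward zero for negative operands of div is an assumption, since the source only gives the positive example div(9 5).

```python
import re


def _div(a, b):
    # Truncated integer division; the sign handling for negatives is an assumption.
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


FUNCS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": _div,
    "neg": lambda a: -a,
    "strcat": lambda a, b: a + b,
}

# A token is a function name immediately followed by '(', an integer, or a quoted string.
TOKEN = re.compile(r'\s*(?:([a-z]+)\(|(-?\d+)|"([^"]*)")')
CLOSE = re.compile(r"\s*\)")


def _parse(expr, pos):
    m = TOKEN.match(expr, pos)
    if not m:
        raise ValueError(f"unexpected input at position {pos}")
    name, number, string = m.groups()
    if number is not None:
        return int(number), m.end()
    if string is not None:
        return string, m.end()
    if name not in FUNCS:
        raise ValueError(f"unknown function: {name}")
    args, pos = [], m.end()
    while True:
        close = CLOSE.match(expr, pos)
        if close:
            return FUNCS[name](*args), close.end()
        arg, pos = _parse(expr, pos)   # arguments are separated by whitespace only
        args.append(arg)


def evaluate(expr):
    value, pos = _parse(expr, 0)
    if expr[pos:].strip():
        raise ValueError("trailing input")
    return value


quotient = evaluate("div(9 5)")
```

Writing `add (1 2)` with a space before the parenthesis is rejected by this sketch, mirroring the last rule in the list.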
31NL Functions
- Example
- Digit [ one {return(1)}
-         two {return(2)}
-         three {return(3)}
-         ...
-       ]
- Decade [ twenty {return(20)}
-          thirty {return(30)}
-          forty {return(40)}
-          ...
-        ]
- .Number ( Decade:d1 Digit:d2 ) {<number add($d1 $d2)>}
- Matching the top-level grammar .Number fills the slot number with the sum of NL variables d1 and d2
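As a worked instance of the .Number example, the Python sketch below maps a decade word and a digit word to their return values and fills the number slot with their sum. Only the entries shown on the slide are included; the elided ones are left out. `interpret_number` is a hypothetical name, and the code is an illustration rather than the engine's behavior.

```python
# Return values of the Digit and Decade subgrammars (only the entries listed above).
DIGIT = {"one": 1, "two": 2, "three": 3}
DECADE = {"twenty": 20, "thirty": 30, "forty": 40}


def interpret_number(utterance):
    """Mimic .Number ( Decade:d1 Digit:d2 ) {<number add($d1 $d2)>}."""
    words = utterance.lower().split()
    if len(words) != 2:
        return None
    decade_word, digit_word = words
    if decade_word not in DECADE or digit_word not in DIGIT:
        return None
    d1 = DECADE[decade_word]        # Decade:d1
    d2 = DIGIT[digit_word]          # Digit:d2
    return {"number": d1 + d2}      # add($d1 $d2)


interpretation = interpret_number("twenty three")
```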
32Compiling Grammars
- nuance-compile application-name model-set
- application-name is the name used for the grammar and slot definitions files
- model-set is the set of acoustic models for recognition.
- Successful compilation produces a recognition package in a directory called application-name. The package provides all the necessary files for recognition and understanding.
33Testing Recognition and NL
- Xapp is a graphical application for exercising a grammar and NL specification
- Xapp -package recognition_package
- You must start a recognition server before running Xapp
- recserver -package recognition_package
34Testing Recognition and NL
- parse-tool lets you type phrases to see if they match a particular grammar
- parse-tool -package recognition_package
- The optional parameter -print-trees prints the entire parse tree: the full set of grammars and subgrammars traversed to match the sentence
- nl-tool lets you type sentences to see the interpretations that are produced
- nl-tool -package recognition_package
- For parse-tool or nl-tool, specify -grammar grammar if you want to use just one grammar to process the phrase
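If these tools are driven from a test script, the command lines can be assembled programmatically. The Python sketch below only builds argument lists from the options named above; it does not run anything. `tool_command` is a hypothetical helper, the package and grammar values are placeholders, and the exact form the tools expect for a grammar name is not specified here.

```python
def tool_command(tool, package, grammar=None, print_trees=False):
    """Build an argument list for parse-tool or nl-tool from the options above."""
    if tool not in ("parse-tool", "nl-tool"):
        raise ValueError(f"unsupported tool: {tool}")
    argv = [tool, "-package", package]
    if grammar is not None:
        argv += ["-grammar", grammar]      # restrict processing to one grammar
    if print_trees:
        if tool != "parse-tool":
            raise ValueError("-print-trees is described only for parse-tool")
        argv.append("-print-trees")
    return argv


# Placeholder package and grammar names, for illustration only.
command = tool_command("parse-tool", "banking", grammar="Command", print_trees=True)
```

A list such as this could be handed to `subprocess.run` on a machine where the toolkit is installed.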
35Adding New Words
- The Nuance toolkit includes a dictionary with pronunciations for more than 100,000 words.
- Some applications may require words that are not in the default dictionary, for example, unusual names.
- If the grammar contains words not in the dictionary, nuance-compile prints an error message saying so and lists the words in a file called application-name.missing.
36Adding Words Phonetically
- Create application-name.dictionary
- Use normal text editor
- Copy application-name.missing
- Add new words directly
- For each word, supply a phonetic spelling
- (see the manual, Nuance Grammar Developer's Guide\7.Creating application-specific dictionaries\The Nuance phoneme set).
- Multiple pronunciations
- Repeat the word on multiple lines.
- General rule: Limit to 3
37Adding Words Phonetically
- Examples
- apple a p l
- book b U k
- hotel h o t E l
- You can use the pronounce tool to find pronunciations of similar words
- pronounce English.America telephone
- telephone t E l f o n
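The dictionary entries above follow a simple "word, then phonemes" line layout, with one line per pronunciation. The Python sketch below generates such lines and enforces the rule of thumb of at most three pronunciations per word. `dictionary_lines` is a hypothetical helper; the second pronunciation of "hotel" is invented purely to show a repeated word and is not taken from the Nuance phoneme documentation.

```python
def dictionary_lines(pronunciations, max_per_word=3):
    """Return one 'word phoneme phoneme ...' line per pronunciation."""
    lines = []
    for word, variants in pronunciations.items():
        if not variants:
            raise ValueError(f"no pronunciation given for {word!r}")
        if len(variants) > max_per_word:
            raise ValueError(f"{word!r} has more than {max_per_word} pronunciations")
        for phonemes in variants:
            # A word with several pronunciations is repeated on multiple lines.
            lines.append(f"{word} {' '.join(phonemes)}")
    return lines


pronunciations = {
    "apple": [["a", "p", "l"]],
    "book": [["b", "U", "k"]],
    # Second variant is a made-up example of an alternate pronunciation.
    "hotel": [["h", "o", "t", "E", "l"], ["o", "t", "E", "l"]],
}

lines = dictionary_lines(pronunciations)
```

Joining `lines` with newlines gives text that could be pasted into application-name.dictionary after checking each phoneme against the phoneme set in the manual.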
38Summary of package files
39Nuance sample grammars
- You can find sample grammars for
- date.grammar
- money.grammar
- number.grammar
- time.grammar
- At
- Nuance\data\lang\English.America\grammars