MRCPv2

Provided by: cis949

1
MRCPv2

2
Status

At version 00.
Edits from the last meeting have been added.
Awaiting addition of SI/SV functionality.
A proposal for SI, SV, Enrollment, Hotword Recording is currently out as a draft for MRCPv1 support.
It will soon be integrated into MRCPv2.

3
Open Issues

Proxy support.
Need to add call flows describing how MRCP proxies would work.
Need to allow an MRCP server/proxy to do a SIP re-INVITE and redirect the media as necessary.
Starting/stopping media from the client to the server for a RECOGNIZE method.
Recording support.
Need support for a separate Recording resource.
Does this Recording resource relate to the Recognizer's capability to record utterances?
Does this Recording resource relate to a Speaker Verification engine's capability to record/buffer utterances?
Does the utterance recording capability of Recognizers relate to Speaker Verification's need to record utterances or buffer speech?

4
Open Issues

Resource Types or Profiles
Need to support separate resource types for things like a DTMF recognizer, an audio player, or a poor man's TTS.
These may not require completely different resource state machines or methods and can be addressed by the same methods as the Recognizer and Synthesizer resources. We could address them with resource sub-types or profiles. Any suggestions?
NLSML vs. EMMA
Proposal to add EMMA as a SHOULD for MRCPv2.
We would continue to support NLSML for backward compatibility.
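
One way to picture the resource sub-type or profile idea from this slide is a table that maps each profile onto an existing base resource, so that it inherits that resource's methods rather than defining its own. This is only a sketch under assumptions: the profile names (`dtmf-recognizer`, `audio-player`, `basic-tts`) and the helper `methods_for` are hypothetical, and the method sets are abbreviated to a few well-known MRCP method names rather than a complete list.

```python
from dataclasses import dataclass
from enum import Enum


class BaseResource(Enum):
    RECOGNIZER = "recognizer"
    SYNTHESIZER = "synthesizer"


# Abbreviated, illustrative method sets for the two base resources.
BASE_METHODS = {
    BaseResource.RECOGNIZER: frozenset({"DEFINE-GRAMMAR", "RECOGNIZE", "STOP"}),
    BaseResource.SYNTHESIZER: frozenset({"SPEAK", "PAUSE", "RESUME", "STOP"}),
}


@dataclass(frozen=True)
class ResourceProfile:
    name: str            # hypothetical profile label, not defined by any draft
    base: BaseResource   # base resource whose state machine and methods are reused


# Hypothetical profiles for the resource types named on the slide.
PROFILES = {
    "dtmf-recognizer": ResourceProfile("dtmf-recognizer", BaseResource.RECOGNIZER),
    "audio-player": ResourceProfile("audio-player", BaseResource.SYNTHESIZER),
    "basic-tts": ResourceProfile("basic-tts", BaseResource.SYNTHESIZER),
}


def methods_for(profile_name: str) -> frozenset:
    """Return the methods a profile accepts: exactly those of its base resource."""
    return BASE_METHODS[PROFILES[profile_name].base]
```

Under this arrangement a client would address an audio player with the same SPEAK/PAUSE/RESUME/STOP methods as a full synthesizer, and the profile would only signal which capabilities the server actually offers.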

5
Open Issues

Multiple media sessions under a single MRCP session.
Do we need support for this?
If there is a need for separate media sessions for Recognition, SI, SV, and Recording, why have them under a single MRCP session? Would that be a reason to have separate MRCP sessions?
Do we need to support multiple active SPEAK requests, with the capability to switch between them by pausing and resuming the different SPEAK requests?
Do we need to rename the START-OF-SPEECH event to reflect that it applies to DTMF as well as speech? Should it be START-OF-INPUT?
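
The renaming question can be made concrete with a small sketch: a single START-OF-INPUT event that carries the kind of input that triggered it, instead of a speech-only event. The class names, the `input_type` field, and the `describe` text format are assumptions for illustration only; they are not the wire format of any MRCP draft.

```python
from dataclasses import dataclass
from enum import Enum


class InputType(Enum):
    SPEECH = "speech"
    DTMF = "dtmf"


@dataclass(frozen=True)
class StartOfInput:
    request_id: int        # id of the RECOGNIZE request the event belongs to
    input_type: InputType  # hypothetical field: what kind of input was detected

    def describe(self) -> str:
        # Human-readable summary only; not an MRCP message encoding.
        return (f"START-OF-INPUT request={self.request_id} "
                f"input-type={self.input_type.value}")


def on_input_detected(request_id: int, is_dtmf: bool) -> StartOfInput:
    """Build the single event a recognizer would raise for either input kind."""
    kind = InputType.DTMF if is_dtmf else InputType.SPEECH
    return StartOfInput(request_id, kind)
```

With one event name, a client that supports barge-in can react the same way whether the caller starts speaking or presses a key, and inspect the input type only when the distinction matters.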

6
Open Issues

Content Management
If we add support for Recorders, we may need mechanisms to fetch the recorded audio.
Can we do this content management using HTTP instead of adding new methods to MRCPv2?
Do we need to add support for grammar management at the server level?
Do we need the capability to delete grammars within a session?
Do we need to support the management of speech content?
Is there a need for intermediate recognition results?
Cookies?
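
As a rough illustration of the HTTP option for fetching recorded audio, the sketch below retrieves a recording with a plain GET, so no new MRCPv2 method is involved. It assumes the server hands the client a URI for the recording; how that URI is conveyed is not specified here, and the example URI and function names are hypothetical.

```python
import urllib.request


def build_recording_request(recording_uri: str) -> urllib.request.Request:
    """Prepare a plain HTTP GET for a recording the server has exposed by URI."""
    return urllib.request.Request(
        recording_uri,
        method="GET",
        headers={"Accept": "audio/*"},
    )


def fetch_recording(recording_uri: str, timeout: float = 10.0) -> bytes:
    """Download the recorded audio bytes (performs a real network request)."""
    request = build_recording_request(recording_uri)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Hypothetical URI; in practice it would come from the server.
example = build_recording_request("http://media.example.com/recordings/utt-0001.wav")
```

The same approach would extend to deleting or replacing stored content with other standard HTTP methods, which is the appeal of keeping content management out of the MRCPv2 method set.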