SpeechPlayer is a freely redistributable
runtime system that coordinates complex interactions between voice recognition technologies, voice-enabled application programs,
and the end user.
SpeechPlayer is application-intelligent: it uses
information that SpeechStudio embeds in the target application to
focus the voice recognition engine
on a limited vocabulary and syntax. SpeechPlayer displays
context-sensitive prompts and feedback from the voice recognition engine.
It provides a standard interface for dictation, confirmation,
and system configuration. SpeechPlayer runs iconified, popping up as needed
to display prompts or request confirmation.
SpeechPlayer manages the speech engines and telephony hardware,
so application developers can test and tune their applications
independently of any particular speech recognition engine and document a single user
interface regardless of the end user’s choice of voice recognition
software.
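As a rough illustration of why focusing the engine on a limited vocabulary helps, the sketch below matches a heard phrase against only the phrases allowed in the current application context. Everything in it is an assumption made for illustration: the `CONTEXT_VOCABULARY` table, the context names, and the `what_can_i_say` and `best_match` functions are hypothetical and do not reflect SpeechPlayer's or SpeechStudio's actual data formats or interfaces. Plain string similarity stands in for a real speech engine.

```python
from difflib import SequenceMatcher

# Hypothetical per-context phrase lists. The real information that
# SpeechStudio embeds in an application is not described in this document.
CONTEXT_VOCABULARY = {
    "main_menu": ["open file", "save file", "print", "exit"],
    "confirm_dialog": ["yes", "no", "cancel"],
}


def what_can_i_say(context):
    """Return the phrases that are valid in the given context."""
    return list(CONTEXT_VOCABULARY[context])


def best_match(heard, context):
    """Return (phrase, score) for the closest phrase allowed in this context.

    Only phrases from the active context are considered, so a noisy
    input can only resolve to something the application expects.
    """
    heard = heard.lower().strip()
    scored = [
        (SequenceMatcher(None, heard, phrase).ratio(), phrase)
        for phrase in CONTEXT_VOCABULARY[context]
    ]
    score, phrase = max(scored)
    return phrase, score


if __name__ == "__main__":
    print(what_can_i_say("confirm_dialog"))
    print(best_match("safe file", "main_menu"))
```

Because the candidate set is small, a slightly misheard input such as "safe file" still resolves to the nearest valid command instead of an arbitrary word from a large dictionary.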
Key Features of SpeechPlayer
- Freely redistributable, royalty-free.
- Includes Microsoft SAPI 5 text-to-speech
and speech recognition engines, royalty-free.
- Controls any SAPI 5-compliant speech engine.
- Controls any TAPI-compliant telephony
hardware.
- Compensates for variations among the
voice recognition engines.
- Manages sound sources, including
microphones, telephony devices, and recorded audio data files.
- Provides standard engine status display,
including a histogram of recent sound input.
- Provides controls to adjust engine
parameters and swap engines.
- Provides an application-intelligent
"What Can I Say" display.
- Provides standard dictation and edit for
mini-dictation of a few words, phrases, or sentences.
- Provides standard confirmation, as
advised by SpeechStudio, to deter dangerous misrecognitions.
- Allows multiple client programs to
cooperate or compete for voice attention.
- Automatically reconfigures to minimize
screen footprint.
- Logs high-level information for
SpeechRunner test-level analyses.
- Logs low-level information to support
SpeechRunner phrase-level analyses.
- Implements recognition refinements such
as variable-threshold recognition.
- Implements press-to-mute, press-to-talk,
and focus control.
- Recovers from application crashes and
unexpected system behavior.
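The confirmation and variable-threshold features listed above can be pictured with a small sketch. It assumes, purely for illustration, that each command carries its own acceptance threshold and that a recognition falling between a global rejection floor and that threshold triggers a confirmation prompt. The `Command` class, the `decide` function, and all numeric values are hypothetical and are not taken from SpeechPlayer.

```python
from dataclasses import dataclass


@dataclass
class Command:
    phrase: str
    # Hypothetical per-command acceptance threshold; a riskier command
    # would be given a higher value.
    threshold: float


def decide(command, confidence, reject_below=0.4):
    """Return 'accept', 'confirm', or 'reject' for a recognition result.

    Assumed policy for this sketch:
      - confidence under the global floor: reject outright
      - confidence under the command's own threshold: ask the user
      - otherwise: accept without prompting
    """
    if confidence < reject_below:
        return "reject"
    if confidence < command.threshold:
        return "confirm"
    return "accept"


if __name__ == "__main__":
    delete_all = Command("delete all", threshold=0.9)
    scroll_down = Command("scroll down", threshold=0.5)
    for cmd in (delete_all, scroll_down):
        print(cmd.phrase, decide(cmd, confidence=0.8))
```

With the same confidence score, the harmless command is accepted while the destructive one asks for confirmation, which is one way a per-command threshold could deter dangerous misrecognitions.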