What is New in SpeechStudio Suite
The version 3.8* series release
makes Speech Interfaces even easier and more effective. You can review
all the major changes for our various versions in our
Release Notes. Even better,
download SpeechStudio Suite as a free trial.
Some Highlights of version 3.8 series:
Some highlights of version 3.6 / 3.7 series:
Dragon Naturally Speaking (version 8), supported by ScanSoft and
now Nuance, is generally accepted as the best engine for desktop
dictation. It does not do as well as the Microsoft engine for
command and control, however, because it is very slow to change
contexts.
SpeechStudio Suite v3.8.0 introduces a new cooperation between Dragon
and Microsoft Engines: now you can develop your application so that your
end user can dictate with Dragon while using the nimble Microsoft engine
for context-dependent command and control.
The best of both worlds!
SpeechStudio has developed a remote microphone for use with speech
recognition through the Microsoft engine. It uses TCP/IP
communications to ship sound from one machine to another. The
microphone can be on any machine on your LAN, for example, or on a
wireless device running a Microsoft OS.

Remote microphone lets your end user roam from machine to machine, or
access speech from a tablet or wireless device without having the
speech engine on the local device.
This support is separately licensed and must be requested from
info@speechstudio.com.
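To make the idea of shipping sound over TCP/IP concrete, here is a minimal Python sketch of one way audio chunks could be framed on a stream socket. It is an illustration only: the 4-byte length prefix and the helper names `send_chunk` and `recv_chunk` are assumptions made for this example and say nothing about the wire format SpeechStudio actually uses.

```python
import socket
import struct


def send_chunk(sock, pcm_bytes):
    """Send one audio chunk, prefixed with its length (4 bytes, big-endian)."""
    sock.sendall(struct.pack("!I", len(pcm_bytes)) + pcm_bytes)


def _recv_exact(sock, n):
    """Read exactly n bytes; TCP may deliver a chunk in several pieces."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("socket closed in the middle of a chunk")
        buf.extend(part)
    return bytes(buf)


def recv_chunk(sock):
    """Receive one length-prefixed audio chunk sent by send_chunk."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)
```

The length prefix matters because TCP is a byte stream: without it, the machine hosting the engine could not tell where one chunk of sound ends and the next begins.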
Remote Program Support allows your program to reside on a remote
machine (usually a web server) while performing speech recognition with
an engine and user profile on the user's local machine. The user's
local engine is trained to recognize that user's voice, and can do so
without the delay of shipping bulky voice data around the net.

This support is separately licensed and must be requested from
info@speechstudio.com.
Microsoft is urging its Independent Software Vendors (ISVs) to undergo
Microsoft's testing in order to ensure high quality.
We are proud that SpeechStudio Suite was able to pass Certification
with no issues and in fact without any modifications. SpeechStudio is
now a Microsoft Certified Partner.
SpeechStudio Suite continues to improve the effectiveness of the SAPI
5 engine. In version 3.6, SpeechPlayer is able to stop the engine if it
gets into trouble and then start a new engine, so that to the end user
the engine appears to pause briefly and then continue just where the
application left off. This also allows SpeechPlayer to manage multiple
sessions, so that users can, for example, switch users on XP.
SpeechStudio products include about
600,000 unique source lines, and a like amount of documentation lines.
That's 1,200,000 chances for bugs!
We're happy to report that fewer than
5% of our users have reported any bug at all in our code.
Of course, our users are happy to make
suggestions, and many do. The enhancements we add are often a direct
response to your requests.
SpeechPlayer now supports Press-to-talk. Select press-to-talk behavior
for your hot key by going to Tools | Options and looking on the Sleeping
tab for the Press-to-talk radio button. Note that the hot key will
reverse the current mute state of SpeechPlayer, so if SpeechPlayer is
Listening, holding the hot key will MUTE.
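The "reverse the current mute state" rule can be modeled in a few lines. The Python sketch below is hypothetical: the class `PressToTalk` and its method names are invented for illustration and are not part of the SpeechPlayer API.

```python
class PressToTalk:
    """Hypothetical model of a hot key that inverts the mute state while held."""

    def __init__(self, muted=True):
        self.base_muted = muted  # state in effect when the hot key is not held
        self.key_held = False

    def key_down(self):
        self.key_held = True

    def key_up(self):
        self.key_held = False

    @property
    def muted(self):
        # While the key is held, the effective state is the reverse of the base state.
        return self.base_muted != self.key_held
```

Starting from a muted state, holding the key makes the model listen (classic press-to-talk). Starting from a listening state, holding the key mutes instead, which is the case the note above warns about.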
Version 3.6 reorganized the entire thresholding
process. The problem was that the user could drop the mike or
otherwise generate a loud noise, which would set the maximum possible
volume very high. That in turn pushed the threshold too high,
so everything was suppressed thereafter. In the new version,
SpeechPlayer calls in whenever there is a successful recognition; the
audio object keeps track of the last few seconds of input and calculates
a new max from that history, on the assumption that if the engine could
recognize it, then it wasn't a hyper-loud noise such as a mike drop.
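The history-based recalculation can be sketched as follows. This is a minimal Python illustration, not SpeechStudio's implementation: the class name `AdaptiveThreshold`, the window length, and the rule that the threshold is a fixed fraction of the max are all assumptions made for the example.

```python
from collections import deque


class AdaptiveThreshold:
    """Hypothetical sketch: derive the threshold from recent, recognized audio."""

    def __init__(self, window_frames=50, ratio=0.1):
        # Keep only the most recent peak levels (the "last few seconds" of input).
        self.history = deque(maxlen=window_frames)
        self.ratio = ratio  # assumed: threshold is a fixed fraction of the max
        self.max_level = 0.0

    def add_frame(self, peak_level):
        """Record the peak level of one incoming audio frame."""
        self.history.append(peak_level)

    def on_recognition(self):
        """Call after a successful recognition to recompute the max from history.

        If the engine recognized the audio, the recent history is assumed to
        hold speech rather than a hyper-loud noise such as a mike drop.
        """
        if self.history:
            self.max_level = max(self.history)

    @property
    def threshold(self):
        return self.max_level * self.ratio
```

Because the history is bounded, a single loud spike ages out of the window, and the next successful recognition resets the max to a level typical of real speech.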
The gotosleep and wakeup specifications are now exposed as
user-accessible properties, so they can be set programmatically.
Added properties to tell or set when SpeechPlayer is sleeping and when
it is muted.
The user can now see false recognition information, if desired. This
info can help train the user to speak in an even cadence, not too loud
or too soft, and can help explain what might be happening.
Added support to have the profiles listed and restored directly from
SpeechPlayer (Profile Manager must still be purchased separately). The
profile file and auto-restore settings are connected to registry options
and so persist.
SpeechStudio Suite has supported Microsoft's SAPI 5 in the
expectation that other vendors would integrate their best products with
SAPI 5. That hasn't happened. While the Microsoft SAPI 5.1 engine is
excellent for command and control, the Dragon Naturally Speaking engines
from ScanSoft are still better for dictation. However, the Naturally
Speaking engines are not easy to use for the more advanced command and
control supported by SpeechStudio. So with version 3.6 SpeechStudio is
introducing integrated support for BOTH the SAPI 5 engines like
Microsoft and the SAPI 4 engines like Naturally Speaking.
What do you have to do to develop for one or the other? Nothing!
You write one application. You can then test it with several
different engines. In fact, you can deploy your application without
worrying about what engine your end users might have or prefer - or even
in most cases what they might get in the future.
Dictation Manager will now support separate modes for edit and
dictation. Users reported that dictation accuracy was substantially
reduced when complex grammars were presented along with the dictation
engine, so in the new version dictation mode has only a very limited
command grammar: correct that (enter correction mode), scratch that
(delete the last chunk), start number mode, start spell mode, start edit
mode, stop dictation mode.
Dictation mode now supports a "stop dictation" command. Also,
dictation is turned off during edit mode.
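As a rough model of how such a small command grammar behaves, the Python sketch below treats every utterance in dictation mode as either one of the few commands or as dictated text. The class `DictationSession`, its mode names, and the exact phrase strings are assumptions for illustration; they are not the Dictation Manager API.

```python
class DictationSession:
    """Hypothetical sketch of dictation mode with a very small command grammar."""

    # Commands that leave dictation mode, mapped to the mode they enter.
    MODE_COMMANDS = {
        "correct that": "correction",
        "start number mode": "number",
        "start spell mode": "spell",
        "start edit mode": "edit",
        "stop dictation": "stopped",
    }

    def __init__(self):
        self.mode = "dictation"
        self.chunks = []  # dictated chunks, in order

    def hear(self, utterance):
        if self.mode != "dictation":
            return  # dictation is off; the other modes' grammars are not modeled
        if utterance == "scratch that":
            if self.chunks:
                self.chunks.pop()  # delete the last chunk
        elif utterance in self.MODE_COMMANDS:
            self.mode = self.MODE_COMMANDS[utterance]
        else:
            self.chunks.append(utterance)  # anything else is dictated text

    @property
    def text(self):
        return " ".join(self.chunks)
```

Keeping the command set this small means almost everything the user says is treated as dictation, which is the accuracy trade-off described above.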
Dictation Manager now uses What Can You Say to support organized
online help during the edit, spell and number modes. These grammars and
usages were complex for end users.
SpeechStudio can now suspend dictation while still maintaining focus
control of its managed windows. The app developer can use this suspended
mode to perform window-related grammar or mouse activities without
complications from dictation recognitions.
Our users have told us that the only really hard part of building a
desktop telephony application with SpeechStudio is in getting a modem to
work. SpeechStudio works with TAPI, but many modems have incomplete or
buggy implementations of TAPI. Further, modems change often, or have new
drivers, which may improve their performance. To make things easier for
you, we've built in some specific support for some good modem choices.
If you have a:
- Zoom modem (3025-00-00C0 or 3049-00-00C)
- U.S. Robotics 56K Voice Faxmodem Internal ISA
- Intel Dialogic Cards (D4PCI, D/41E, D/41ESC, ProLine/2V, D/41H)
- Way2Call (Hi-Phone DeskTop)
then you can look forward to a specific tutorial to help you get up
and running with all of SpeechStudio's Telephony support, including:
- Call generation and call answering
- DTMF support - know when the caller presses buttons
- Voice recognition and prompt support
- Call recording support
- Dictation support
Now there are no excuses! AND THERE ARE NO ROYALTIES!
Array microphones represent the future of computer voice input. They
allow computers to listen continuously, to differentiate voices from
background noises, to track speakers as they move about, and someday
even track multiple speakers, just like we do. Array microphone users
can move freely, without the discomforts and annoyances of close-talk
headsets.
The expanding capabilities of these new mikes add some new
complexities for voice interface software. The most immediate problem
is that users tend to forget that the microphone is listening. That
means a user may start a conversation that can inadvertently trigger
recognitions for a program that may be feet or yards away - BUT
SpeechStudio's new release can prevent this.
SpeechPlayer now supports automatic sleeping and automatic muting.
Sleeping means that SpeechPlayer will stop all application grammars and
listen only for a single "wake up command". Muting means that
SpeechPlayer will stop listening altogether, so the end user will need
to use a non-voice operation like the mute key to turn the microphone
back on. The end user can control if, when and how this happens.
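The difference between sleeping and muting can be modeled as a three-state machine. The Python sketch below is hypothetical: the class `Listener`, the phrase strings, and the method names are invented for illustration and do not describe the SpeechPlayer API.

```python
from enum import Enum


class State(Enum):
    LISTENING = "listening"
    SLEEPING = "sleeping"
    MUTED = "muted"


class Listener:
    """Hypothetical model of sleeping (voice can wake) vs. muting (voice cannot)."""

    def __init__(self, wake_phrase="wake up", sleep_phrase="go to sleep"):
        self.wake_phrase = wake_phrase    # assumed phrase; configurable in practice
        self.sleep_phrase = sleep_phrase  # assumed phrase; configurable in practice
        self.state = State.LISTENING

    def hear(self, phrase):
        """Return the phrase if it should reach application grammars, else None."""
        if self.state is State.MUTED:
            return None  # not listening at all; only a non-voice action helps
        if self.state is State.SLEEPING:
            if phrase == self.wake_phrase:
                self.state = State.LISTENING
            return None  # application grammars are stopped while sleeping
        if phrase == self.sleep_phrase:
            self.state = State.SLEEPING
            return None
        return phrase

    def press_mute_key(self):
        """Non-voice operation: toggle between muted and listening."""
        if self.state is State.MUTED:
            self.state = State.LISTENING
        else:
            self.state = State.MUTED
```

In this model a stray conversation near a sleeping listener can at most trigger the wake-up phrase, and near a muted listener it can trigger nothing at all.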
SpeechPlayer is also set up to use "remote controls". End users can
use a remote control for press-to-talk or mute-unmute. That way the
array microphone will be on if and only if the end user turns it on.
A Tutorial is now available for those who are using or designing for
the Voice Tracker array microphone.
SpeechStudio Suite is filled with up-to-date and powerful tutorials:
SpeechStudio Tutorials

| Tutorial | What it covers |
| --- | --- |
| Introduction | Use SpeechPlayer to build a VUI |
| SpeechRunner Introductory | Use SpeechRunner to debug a VUI |
| SpeechRunner Phrase Library | Use sound files to test a VUI |
| SpeechRunner Changing Grammars | Use SpeechRunner to understand issues of changing grammars dynamically |
| SpeechRunner Analysis | Use SpeechRunner logs to understand your VUI and assess Engine performance |
| Telephony | Use SpeechStudio Suite to build a modem-based Telephony application |
| Dictation | Use SpeechStudio Suite to build a dictation application |
| Foreign Windows | Use SpeechStudio Suite to build a VUI for unknown windows or applications (windows for which you do not have or cannot change source code). |
| Grammars | Use advanced sub-grammars to simplify and refine your VUI. |
| Using Voice Tracker | NEW! Use the Voice Tracker array microphone to build a hands-free, eyes-free VUI. |
| Using Zoom Modems | NEW! Set up your Zoom modem so that it works seamlessly with your SpeechStudio VUI. |

In most cases each SpeechStudio Tutorial is
available in three versions: one for .NET (C# and VB.NET), one for VB6
and one for C++.
Samples are included for each of the above
tutorials, and more.