SpeechStudio RELEASE NOTES
SpeechStudio products were first deployed in November 2000.
Here are some highlights of their progress, by (selected) version.
- Version 3.8.1 (December 31, 2005)
- Version 3.8.0 (December 8, 2005)
- Version 3.7.3 (November 18, 2005)
- Version 3.7.2 (February 15, 2005)
- Version 3.7.1 (January 6, 2005)
- Version 3.7.0 (December 10, 2004)
- Version 3.6.6 (October 1, 2004)
- Version 3.6.5 (April 27, 2004)
- Version 3.6.4 (November 25, 2003)
- Version 3.6.3 (July 22, 2003)
- Version 3.6.2 (June 20, 2003)
- Version 3.6.1 (June 7, 2003)
- Version 3.5.0 (February 14, 2003)
- Version 3.0.2 (October 28, 2002)
- Version 2.0.5 (March 19, 2002)
- Version 2.0.1 (November 21, 2001)
- Version 1.7.1 (April 11, 2001)
- Version 1.5.1 (December 14, 2000)
- Version 1.0 (November 7, 2000)
General
- Version 3.8.1 is a bug fix release of SpeechStudio Suite v3.8.0. It is
upward compatible with v 3.5.* thru 3.8.0.
- Samples are now supplied, as part of the <language>\Events sample, to show
how to program in Dragon Dictation in cooperation with SpeechPlayer.
Incompatibilities with Version 3.8.0
- You are advised to rewrite your Dragon control code to work through the new
SpeechPlayer Dragon properties listed below for the SpeechStudio Voice Control.
- Message2U event name Message2U_DragonMutingChanged has been changed to
Message2U_DragonStatusChanged, and now covers a variety of changes in the
status of Dragon, not just its muting. This change reflects the broadened
capability of SpeechPlayer to control Dragon Dictation.
SpeechPlayer
- [Fix] In some cases, SpeechPlayer's LocateControl would not find controls
properly if they were within a C# or VB.NET dialog group box; that is repaired
in 3.8.1.
- You can now tell what status wav is used by going to SpeechPlayer's Tools |
Options and clicking into the "Feedback Options". A "*" will be in front
of the current state.
- When directed to establish a "dragon target window" (via GetPropertyNum),
SpeechPlayer will try to move Windows focus into that target window and will
bring its parent Window to the top. This is because Dragon only dictates
to the topmost window.
SpeechStudio Voice Control
- [Fix] The SpeechStudio Voice Control now allows control of ScanSoft/Nuance
Dragon through SpeechPlayer, via property values. This allows Dragon to
be controlled through a single API instance, instead of having one API instance
in SpeechPlayer and then others in various user applications. Dragon was
unstable when controlled from more than one API instance, and would often hang.
This approach also allows development of basic Dragon dictation support without
programmer involvement with the Dragon API.
- Set.../GetPropertyNum now supports "dragon connected", so you can
programmatically begin connecting to Dragon in advance of requesting dictation;
Dragon is slow to initialize, and so this creates a better user experience.
It also controls Dragon disconnect.
- Set.../GetPropertyNum now supports "dragon target window", so you can
programmatically direct Dragon dictation to any window that supports text
capabilities.
- Set.../GetPropertyNum now supports "dragon microphone on", so you can
programmatically control Dragon listening. Note that Dragon can be slow
to respond, so that turning off the microphone may not take effect for many
seconds.
AudioTeeControl
- [fix] If SpeechPlayer's Tools | Options | Sound Input "Allow standaside
recording" was checked, then some explicit or implicit engine restarts could
result in a very large initial threshold value, that caused blocking of further
input. The symptom was that the SpeechPlayer sound level histogram would
say "Listening" but show no histogram values. The workaround is to turn off
"Allow standaside...", exit options to force a restart, and then turn "allow
standaside..." back on. This bug is fixed in v3.8.1.
General
- Version 3.8.0 is a major new release of SpeechStudio Suite. It is
upward compatible with v 3.5.* thru 3.7.*. It introduces support for remote web clients
and Dragon interoperation.
Incompatibilities with Versions Before 3.8.0
- The SpeechUtilities.cs in the Samples/CSharp area has changed its namespace
name from "SpeechUtilities" to "SpeechStudio". Public class Speech has
changed names to public class SpeechUtilities.
SpeechPlayer
- SpeechPlayer now supports parallel use of Nuance/ScanSoft's Dragon for Dictation
with Microsoft's SAPI 5/6 engine for Command & Control. Dragon Dictation
is excellent but its command and control is rather rigid and slow.
Microsoft's command and control is fast and flexible, but it is weak for
dictation. The combination is strong for both. SpeechPlayer Tools |
Options allow end users to set up their own hotkeys to switch between engines.
New Message2U events alert your applications to changes in Microphone
control to/from Dragon, allowing them to set focus for dictation, for example.
A Message2U event is now provided for changes in SpeechPlayer muting as well.
- SpeechPlayer now provides a "Reconnect to Dragon" button in its Advanced
Settings under the "Sleeping" tab of Tools | Options. This button allows
you to shut down Dragon while your SpeechStudio apps are running, and then
later reconnect to Dragon if you bring it back up.
- SpeechPlayer uses multi-threading to protect your Apps from the sometimes
long delays inherent in Dragon's dictation, even when natspeak.exe hangs.
- SpeechPlayer now can generate a background audio reminder periodically to
let the user know it is listening for something. The prompt sounds are
configurable by the end user, as is their frequency. Separate sounds can be
issued depending on the state of SpeechPlayer, such as "listening for commands"
or "listening for commands and Dragon is listening for dictation at the same
time".
- SpeechPlayer will now look for a focus-less client (one started with an
Init call with a 0 hWnd) if no focus-driven process should have speech focus.
This allows background grammars to persist and recover after focus-driven apps
give up control.
- SpeechPlayer now allows the user to set up a remote microphone.
- Some audio devices will buffer up substantial amounts of sound, so that
when muting ends, several seconds of buffered sound could be sent to the
engine. SpeechPlayer now cleans out buffered sound so that actual listening
starts about the same time as muting is clicked off.
- If SpeechPlayer was dealing with an engine (say one named "English
Recognizer") and that engine was de-installed, then SpeechPlayer reported that
it could not find a SAPI engine when in fact there may be other engines
available. It now explains that its old engine is not available, and asks the
user to choose a new engine.
- SpeechPlayer can now send you a "Message2U" event with a code of
Message2U_FalseRecognition indicating a false recognition. Its "text" field has
a guess at what was heard, followed by "??" and a string explaining why the
recognition failed:
"NOISE", "MIKE", "LOUD", "SOFT", "FAST", "SLOW"
You'll need to set the "Deliver False Recognitions" property to 1 to receive
these messages.
- SpeechPlayer can now generate TTS to a file, with its SpeakToFile command.
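The Message2U_FalseRecognition "text" field described above can be split into its guess and its reason. The following is only a sketch in Python, not SpeechStudio sample code: it assumes the field is laid out as guess, then "??", then one of the listed reason strings. The names parse_false_recognition and FalseRecognition are hypothetical.

```python
from typing import NamedTuple, Optional

# Reason strings listed in the release note.
KNOWN_REASONS = {"NOISE", "MIKE", "LOUD", "SOFT", "FAST", "SLOW"}


class FalseRecognition(NamedTuple):
    guess: str              # what the engine thought it heard
    reason: Optional[str]   # None when no known reason string is present


def parse_false_recognition(text: str) -> FalseRecognition:
    """Split '<guess>??<reason>' (assumed layout) into its two parts."""
    guess, sep, reason = text.rpartition("??")
    if not sep:
        # No "??" marker at all: treat the whole text as the guess.
        return FalseRecognition(text.strip(), None)
    reason = reason.strip().strip('"').upper()
    return FalseRecognition(guess.strip(),
                            reason if reason in KNOWN_REASONS else None)
```

An application might use the reason to coach the user, for example asking them to speak more softly when the reason is "LOUD".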
SpeechStudio Voice Control
- ListenForSPProxy command allows a licensed user to tell SpeechPlayer to
listen for a remote proxy.
- UseRemoteMic command allows a licensed user to tell SpeechPlayer to connect
to a remote microphone object - as supplied by SpeechStudio. SpeechPlayer
will then get its sound from that remote device.
- For the Speak() method, when using SPEAKFLAGS the first character after the
closing ']' was ignored. For example, "[SPEAKFLAGS:SPF_IS_XML]what's up" would
be spoken as "hat's up". The workaround is to put in a space, for example,
"[SPEAKFLAGS:SPF_IS_XML] what's up".
- Set.../GetPropertyNum now supports "enable sound feedback", so you can
programmatically disable sound feedback.
- Set.../GetPropertyNum now supports "deliver false recognitions", so you can
programmatically get information about poor recognitions delivered to your App.
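The SPEAKFLAGS workaround noted above (put a space after the closing ']') can be applied automatically before text is handed to Speak(). This is a minimal sketch in Python; pad_speakflags is a hypothetical helper, and it assumes the flags prefix always starts the string.

```python
def pad_speakflags(text: str) -> str:
    """Insert a space after a leading [SPEAKFLAGS:...] prefix if one is missing."""
    if not text.startswith("[SPEAKFLAGS:"):
        return text                      # no flags prefix, nothing to do
    close = text.find("]")
    if close == -1 or close + 1 >= len(text):
        return text                      # unterminated prefix, or nothing after it
    if text[close + 1] == " ":
        return text                      # workaround already applied
    return text[: close + 1] + " " + text[close + 1:]
```

The helper is idempotent, so it is safe to call on text that already contains the space.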
SpeechStudio Remote Microphone
- The Remote Microphone allows the speech engine on one machine to do speech
recognition on sound from a remote machine. The Remote Microphone can run
on any Windows or Windows CE device.
- The remote microphone can be told a range of ports, allowing it to support
firewall tunneling.
SpeechStudio Proxy
- The Proxy allows SpeechStudio clients on one machine to control a
SpeechPlayer on a remote machine. Typical applications include web
programs and terminal server applications.
- Tracks window activation, destruction, focus and keypress actions on the
client side, passing them on to the remote SpeechPlayer so it can track focus.
That way grammars on the SpeechPlayer will automatically stay in sync with
changes in the remote application.
- The Proxy can be told a range of ports, allowing it to support firewall
tunneling.
SpeechStudio
- SpeechStudio can now handle more structural changes to VB projects without
complaints.
SpeechRunner
- You can now specify arguments for the programs that you launch from
SpeechRunner. An argument edit box is included in the Setup Dialog and in
the SQR Command dialog.
UI (User Interface) Control
- SpeechStudio's UI Control provides the WindowCreationParameters command, allowing you to specify some important attributes of predefined SAPI 5 dialogs and wizards before they are launched. These dialogs can now be launched asynchronously, so that your code can maintain control while the dialog is active. You can specify their initial position and to some extent their names. And you can control their behavior such as, for example, having them be always on top.
- You can now use the StandardUserTraining command to launch user training with the engine vendor's training texts. You can use the UserTraining command to launch training with your own specific texts.
AudioEditControl
- [fix] Truncation sometimes happened on odd bytes, if the user specified it.
Any appended sound then had upper and lower bytes reversed - generating noise.
We now prevent odd byte splitting on truncation.
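The odd-byte problem above can be illustrated with a short Python sketch. It assumes 16-bit (2-byte) samples and that the truncation point is rounded down to a sample boundary; the release note does not say which direction the control rounds, and the function names are hypothetical.

```python
def align_truncation(offset: int, bytes_per_sample: int = 2) -> int:
    """Round a byte offset down to the nearest whole-sample boundary."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return offset - (offset % bytes_per_sample)


def truncate_and_append(data: bytes, offset: int, extra: bytes,
                        bytes_per_sample: int = 2) -> bytes:
    """Drop sound after 'offset' (aligned), then append new sound data."""
    cut = align_truncation(min(offset, len(data)), bytes_per_sample)
    return data[:cut] + extra
```

Without the alignment step, a cut at an odd offset would leave half a sample behind, so every appended sample would be read with its upper and lower bytes swapped.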
AudioFilesControl
- The value passed to SaveAs is no longer required to be VARIANT_TRUE or
VARIANT_FALSE. Now simple true or false is accepted. Previously, using "true"
could sometimes fail to delete temp wav files.
Grammars
- The SpeechStudio.2Digits grammars now include In_20_to_29, In_30_to_39 and
so on to In_90_to_99, so that users can compose their own grammars more easily.
- The SpeechStudio.DateTime grammars now include time_to_the_minute (hours,
and optionally 0, 1, 2, ...59 minutes), and minute_phrase (0, 1, 2, ...59
minutes) in addition to the approx_minutes versions.
Version 3.7.3 is a support release of SpeechStudio contract work and future
development. It has minor bug fixes and improvements for general users.
Version 3.7.2 is a support release of SpeechStudio contract work and future
development. It has minor bug fixes and improvements for general users.
SpeechPlayer
- (Bug Fix) In some cases, dictation with non-base logging levels could cause
a fault in SpeechPlayer.
- (Bug Fix) Dictation focus would not always track with changes in edit window
focus for multiple dictation windows in a single client.
- Logging output was reduced and refined.
- SpeechPlayer now issues Message2U callbacks to let the app know when muting
has changed. Users can now create their own notifications of mute status, such
as greying their application.
SpeechStudio Voice Control
SpeechPlayer
- (Bug Fix) Microsoft XP Upgrade SP 2 has a bug in
its get_hFont routine; it showed up as a crash in SpeechPlayer,
typically in Microsoft's oleaut32.dll module. The workaround was to
turn off "Show Misrecognitions..." on the Recognitions tab of
SpeechPlayer's Tools | Options. The bug was triggered when a
misrecognition report tried to change font. Version 3.7 avoids all
calls to get_hFont.
- (Fix) If the user selected Tools | Options twice,
SpeechPlayer asserted. Now it will just reshow the existing options box.
- SpeechPlayer no longer stops because an internal parameter is out of
range; it can correct and continue with just a log message.
SpeechStudio Voice Control
- (Bug Fix) The Control Log Name property could
cause an infinite loop crash: if two program instances set the control
log to the same name, then the second opener might loop trying to report
the error by opening the log file - thus causing the error again. The
workaround was to avoid using the same log name in two programs.
Version 3.7 suppresses logging for the second program.
- (Fix) For the SetPropertyNum function, Auto
Sleep should have been Auto Sleep Seconds. Auto Shutdown should have
been Auto Sleep Shutdown. Neither of these properties worked correctly
in v3.6.4.
SpeechRunner
- SpeechRunner can now launch its test programs with
arguments. In previous versions, a command file was used as an
intermediary to provide arguments. This had the drawback that the
program under test was the command file invocation rather than the
program itself. In Version 3.7, the Default Settings command and the
Settings dialog allow an argument string. The target program can then
be launched directly.
Incompatibilities with Versions before 3.6.6
- The WatchForWindow command to the
SpeechStudioControl now has an additional required parameter:
bEverywhere. If this parameter is true, SpeechPlayer will report
windows matching the given pattern regardless of what process they arise
in. If it is false, only windows in the same process are considered -
this was the only choice in versions before 3.6.6.
SpeechPlayer
- (Bug Fix) The Transparency Pattern was being
applied to windows even when they have their own client. This bug
showed up when clicking from a non-client "transparent" window back into
the current client window: SpeechPlayer might stop listening.
- SpeechPlayer's Tools | Options now support a Sound
Input tab page, on which you can set up connection to a SpeechStudio
Remote Microphone or other instance of the SpeechStudio AudioSendControl.
When selected, sound will come from that remote microphone.
- (Fix) A window can be activated many times.
SpeechPlayer incorrectly sent a WindowDestroyed message for each
activation; it now sends just one WindowDestroyed message.
SpeechRunner
- SpeechRunner can now simulate the basic mouse
commands. The command "DoMouseOp: Click, (24,543)" will cause the mouse
to reposition to x=24 and y=543 relative to the rectangle for the
current SetTestFocus window; the current SetTestFocus window will be
brought to the top (if possible) and the events (down and then up) of a
"click of the left mouse button" will be simulated. Move, DoubleClick,
Down, Up, RightClick, RightDoubleClick, RightDown and RightUp are also
supported. These commands may be used to test graphics or to push
old-style buttons that do not have window substructure.
- The SetTestFocus command now allows the selection
of an unnamed window. The window name pattern can be used to select
some named relative of the desired window; then a prefix of the form [child.sibling.sibling.child]
can walk to the desired unnamed window. Besides sibling and child,
owner and parent relationships are available.
- If the user starts another test while one is still
running, SpeechRunner now disconnects the older test, recovers, and
continues with the new test.
- If the user tries to start a test which does not
exist or is not usable, SpeechRunner now complains, recovers and
continues; previously, SpeechRunner just waited for the test to become
usable.
- (Fix) In rare cases test disconnect happened while
cleanup was incomplete, causing the end of test message to be
suppressed.
- (Fix) The DoMenuClick, DoMouseOp, AwaitWindow,
AwaitWindowGone, AwaitLogText, and
AwaitSpeak commands no longer require that if an engine is present that
it be listening. These commands now operate independently of the engine
state.
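The DoMouseOp command line quoted in the SpeechRunner notes above has a simple shape that a test-script tool could validate before running. The Python sketch below is an illustration only: it assumes every listed operation takes the same "(x,y)" argument form as the Click example, which the notes do not confirm, and the names parse_do_mouse_op and MouseOp are hypothetical.

```python
import re
from typing import NamedTuple

# Operation names listed in the release note.
MOUSE_OPS = {"Click", "Move", "DoubleClick", "Down", "Up",
             "RightClick", "RightDoubleClick", "RightDown", "RightUp"}

# Matches lines such as: DoMouseOp: Click, (24,543)
_PATTERN = re.compile(
    r"^\s*DoMouseOp:\s*(\w+)\s*,\s*\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)\s*$")


class MouseOp(NamedTuple):
    op: str   # operation name, e.g. "Click"
    x: int    # x relative to the current SetTestFocus window rectangle
    y: int    # y relative to the current SetTestFocus window rectangle


def parse_do_mouse_op(line: str) -> MouseOp:
    """Parse one DoMouseOp line; raise ValueError if it is malformed."""
    match = _PATTERN.match(line)
    if match is None:
        raise ValueError(f"not a DoMouseOp command: {line!r}")
    op = match.group(1)
    if op not in MOUSE_OPS:
        raise ValueError(f"unknown mouse operation: {op!r}")
    return MouseOp(op, int(match.group(2)), int(match.group(3)))
```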
SpeechStudio Voice Control
- The Control Log Name property can be set to a
simple root file name in order to capture logging in your application’s
controls. All logging messages for control in your app will go to a
file named MyControlLog.txt located with the other SpeechStudio logs, in
a directory defined in SpeechPlayer’s help. Give only the file name
root; the extension is always .txt and the directory is always the
SpeechStudio log directory. A file name of “” will turn off logging.
Each log line will be prefixed by the number of the control that issued
the log message.
- The WatchForWindow command can look at all
windows, not just those in your own process. The bEverywhere parameter
controls this behavior.
- SetPropertyString("engine", "new recognizer name") is now supported; the
special engine name "SuppressEngine" allows
the user to disconnect SpeechPlayer from any engine. This allows the
user to compensate for engine bugs where the engine crashes if it is
left running too long.
General
- Version 3.6.5 is an upgrade for v3.6.4, and offers
improved support for Audio Recording. It is upward compatible with v
3.5.*, 3.6.1, 3.6.2 and 3.6.3.
- Audio recording can now be continued independently
of the engine listening status.
- Recording can now elide sub-threshold sound.
- Introduced the Audio Edit Control: delete sound
data after a given point to accomplish "rewind" functionality. See the
Recorder sample under Samples/Cpp or Samples/VB for usage.
Profile Developer
- Profile Developer users will now be able to
control licensing on remote machines. Ask SpeechStudio Support for
details.
SpeechStudio Voice Control
- The "DisplayMode", "Auto Sleep Seconds" and "Auto
Sleep Shutdown" properties did not work correctly in v3.6.4.
General
- Version 3.6.4 is an upgrade for v3.6.3, and offers
tuned support for some popular telephony devices. It is upward
compatible with v 3.5.*, 3.6.1, 3.6.2 and 3.6.3.
- Some suggestions are now available for those who
are using or designing for PC Telephony, for example with the Zoom 3025
modem family or Intel Dialogic cards. See Start | Programs |
SpeechStudio | Help | Hooking Up Phones.
- Added a modem test program for end users, to help
them qualify and set up a modem. See Start | Programs | SpeechStudio |
Tools | Test Modem.
- A bug was reported in the order of speechstudio3.h
declarations. The SetRemoter (unreleased) operation was out of order.
The effect is that WatchForWindow, LocateControl, ListenToFile, and
DictationActivatedFor methods would sometimes fail.
SpeechPlayer
- SpeechPlayer now allows the end user to adjust
rate and volume of TTS. Pitch is controllable only in SAPI 4. A TTS
pause hotkey is displayed but not implemented.
- SpeechPlayer's TTS support for SAPI 4 has been
improved to allow display and selection of voices as well as control of
speed and volume.
- SpeechPlayer now allows WhatCanYouSay to be null.
- SpeechPlayer no longer goes into sleep mode if the
user has muted SpeechPlayer.
- SpeechPlayer will no longer sleep in cases where
we are hiding a speech client with a non-speech-enabled window from the
same process. Now SpeechPlayer will stop listening entirely until a
voice client regains control. Exception: when the telephone is used
for input and the developer has set the property "GiveFocus to
RequestLine app" then control will immediately switch to the RequestLine
master app.
- SpeechPlayer and its engine management have been
accelerated for large grammars.
- You now may allow a non-speech window to get
control while listening for another (speech-enabled) window:
SetPropertyString("transparency pattern", pattern).
- Message2U codes now reserve 1..100 for
SpeechPlayer messages for odd synchronizations, error and warning
reports. You can tell quickly about error-type messages.
- SpeechPlayer now issues Message2U callbacks to let
the app know when the ListenToFile call has completed. In previous
versions, the user could only guess at this, and if a mistake was made
it could hang the engine.
- SpeechPlayer now pads wav file inputs (SAPI 5
only) with silence at their front and back to give the engine time to
see a pause at the beginning and end. This adjustment improved
recognitions from directly recorded files by about 30%, to bring it up
to about the levels seen from the initial live input.
- The SimPhone panel (to simulate phone calls) now
displays when the user selects SimPhone option - and is removed when it
is unselected. Now users can try SimPhone without having to program a
special trigger into their app.
- If a client's dynamic grammars can accept a phrase
ambiguously, SpeechPlayer now dumps out each parse so the user can see
where the ambiguity arises.
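The silence padding of wav file inputs described above can be reproduced offline when preparing recordings for file-based recognition. This Python sketch uses only the standard wave module; the half-second pad length is an assumption, since the notes do not state how much silence SpeechPlayer adds, and pad_wav_with_silence is a hypothetical name.

```python
import io
import wave


def pad_wav_with_silence(wav_bytes: bytes, pad_seconds: float = 0.5) -> bytes:
    """Return a copy of a PCM wav with silence added at the front and back."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    frame_size = params.sampwidth * params.nchannels
    # 8-bit PCM is unsigned (silence is 0x80); wider samples are signed (0x00).
    silent_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    pad = silent_byte * (int(params.framerate * pad_seconds) * frame_size)

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(params.framerate)
        dst.writeframes(pad + frames + pad)
    return out.getvalue()
```

The padding gives the engine a pause to detect at each end of the recording, which is the effect the note credits for the improved recognition rate.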
Dictation Manager
- Dictation Manager will now accept a "point" in its
number editor, allowing decimal numbers.
- Dictation Manager will now restore focus correctly
after spelling or number entry, and will maintain focus during those
operations so that the cursor and selection is visible.
- Dictation Manager has improved its text
formatting.
Profile Manager
- Profile Manager is now included in SpeechStudio
Suite. It will have rights based on its Free Software license, unless
the full license is purchased separately.
SpeechStudio Voice Control
- The EnableDictation method can now create managed
or unmanaged dictation that is disconnected from Windows focus. You
control its enable/disable directly. The user can be dictating while
doing other actions on Windows.
- You can now control the rate and volume of TTS programmatically
through the properties "TTS rate" and "TTS volume". A
"TTS pitch" parameter is effective only for SAPI 4 engines.
- You can now pause and resume TTS through the
property "TTS paused".
- You can now get the list of available SAPI 4
voices in the "voice list" property.
- You can now control when grammars are refreshed,
using the "Do not refresh" property. When this property is set true,
then incidental refreshes, such as those normally generated on focus
switches, will be suppressed for that specific client. This can improve
performance of very large grammars.
Incompatibilities with Versions before 3.6.4
- Windows that are part of the same process are now
considered to hide the current speech-client even if they are not direct
descendants. If you want a client to keep listening when a hiding
window (of this process) gets focus, then use the "transparency pattern"
parameter to describe the hiding window as transparent.
SpeechRunner
- SpeechRunner can now simulate the dictation
commands for text formatting and punctuation. The command "Say: new
paragraph", for example, now has the expected semantics, following the
effects of typical dictation engines.
General
- Version 3.6.3 is a bug fix release for v3.6.2 and
a tuning release for the Voice Tracker array microphone (see
www.acousticmagic.com). It is upward compatible with v 3.5.*, 3.6.1 and
3.6.2.
- A Tutorial is now available for those who are
using or designing for the Voice Tracker array microphone. See Start |
Product Files | SpeechStudio | Tutorials | Using Voice Tracker.
- The threshold calculations were found to be in
error - the effect is that noise was not suppressed unless the threshold
was set very high.
- This version of SpeechPlayer supports user
"sessions". The SAPI 5 engines do not appear to resume correctly after
switching users (e.g. on XP). In this version, SpeechPlayer shuts down
the underlying engine for one user and restarts it after a new user
session is selected, as needed. The transition should appear seamless,
except that on restart the engine will be muted.
- Version 3.6.3 includes some additional support for
SAPI 4 engines such as Dragon/ScanSoft Naturally Speaking. If you try a
SAPI4 engine, please do not use the thresholding Audio object - a
warning will remind you. The audio object formatting is not compatible
with at least the Dragon engine and hurts accuracy.
General
- Version 3.6.2 is a bug fix release for v3.6.1. It
is upward compatible with v 3.5.* and 3.6.1.
- No bugs were found in the v3.6.1 beta test. Some
documentation errors have been fixed and some clarifications added.
- Version 3.6.2 includes some internal tuning in
support of the AcousticMagic Voice Tracker(TM) array microphone.
General
- Version 3.6 is primarily an end user upgrade for
v3.5*. It is upward compatible with v 3.5.0, 3.5.1 and 3.5.2.
- Version 3.6 is the first SpeechStudio product to
find and allow limited use of SAPI 4.1 engines as well as SAPI 5
engines. SAPI 4.1 performance and correctness is not warranted. You may
switch between engines by using SpeechPlayer's Tools | Options on the
Recognition tab, but you'll need to select that same tab again to set up
the profile, since the profile combo box does not reset for the new
engine until that engine is actually selected - i.e., you've closed that
page with "ok". SAPI 4 support is limited: it does not support
dictation, for example.
Incompatibilities with Versions before 3.6.1
- Users of Dictation Manager will find its behavior
changed slightly. Most edit commands are now available only in edit
mode. In edit mode, dictation is suspended. This change was made to
remedy user complaints of degraded dictation accuracy.
SpeechStudio Voice Control
- The SpeechStudio Control can now compile grammars
on the fly for either SAPI 5 XML format or SAPI 4
cfg format. This retargeting should be invisible to you and to
your end user.
- Added support for the WindowDestroyed event. This event is generated once
for any window that is previously reported by a NewWindowActivating event, at
the time that window dies. The NewWindowActivating event is restricted by the
WatchForWindow pattern, but WindowDestroyed is independent
of later changes to the window name. Together these facilities allow you
to get voice focus back and clean up after a foreign window is closed.
- Added support for the DictationActivatedFor call. This call can be used to
suspend
dictation for a client, while still receiving the managed window focus
events.
SpeechPlayer
- SpeechPlayer now supports Press-to-talk. Select
press-to-talk behavior for your hot key by going to Tools | Options and
looking on the Sleeping tab for the Press-to-talk radio button. Note
that the hot key will reverse the current mute state of SpeechPlayer, so
if SpeechPlayer is Listening, holding the hot key will MUTE.
- SpeechPlayer now supports automatic sleeping.
Sleeping means that SpeechPlayer will stop all application grammars and
listen only for a single "wake up command". You can control if, when and
how this happens. Just go to the Sleeping tab under Tools | Options.
Sleeping is different from muting, in that when SpeechPlayer is sleeping
it is still listening, although just for a single command; that command
will not do anything drastic, it just restores the previous listeners.
The idea of sleep is to keep SpeechPlayer from recognizing some
dangerous command by accident, while still allowing voice control.
Muting, on the other hand, turns off voice; muting cannot be ended by
voice, since voice is turned off.
- SpeechPlayer now supports automatic muting. For
example, if you want SpeechPlayer to mute itself after 30 seconds of
inactivity, then go to the Sleeping tab under Tools | Options; check the
box for "Automatically sleep (or mute)..." and put "30" for seconds. Then
clear the "wake up command" edit box.
- Bug fix: Restore the registry's value for the
voice threshold when starting a new engine.
- Version 3.6 reorganized the entire thresholding process. The problem was
basically that the user could drop the mike or otherwise generate a loud
noise which then would cause the max possible volume to be set very high
- that caused the threshold to go too high, so everything was suppressed
thereafter. The new version has SpeechPlayer call in whenever there is a
successful recognition; the audio object keeps track of the last few
seconds of input and calculates a new max from that history, on the
assumption that if the engine could recognize it, then it wasn't a
hyper-loud noise such as a mike drop.
- The gotosleep and wakeup specifications are now exposed as user-accessible
properties so they can be set programmatically.
- Add properties to tell/set when we are sleeping
and when we are muting.
- Bug fix: Do not SetPrompts
for a client that is not listening.
- Do not overwhelm the log with useless "SAPIEVENT"
lines, since there are many.
- Added "auto shutdown seconds", which controls the time
SpeechPlayer waits before it will shut itself down; any new client will
inhibit the shutdown. Autoshutdown allows
SpeechStudio programs to be distributed to end-users without having a
"sticky" instance of the engine and SpeechPlayer retained on their
desktop.
- Allow the user to see false recognition
information, if desired. This info can help train the user to speak in
an even cadence, not too loud or too soft, as well as explain what might
be happening.
- Added support to have the profiles listed and
restored directly from SpeechPlayer (Profile Manager must still be
purchased separately.) The Profile file and auto-restore settings are
connected to registry options and so persist.
- Added an option to allow SpeechPlayer to turn off
screen and power saving modes. The .NET version will crash hard in some
cases where SpeechPlayer is listening, gets a
recognition, and tries to work with a control such as a button on
a program that is asleep.
- Bug fix: On some occasions a client would
"disappear" immediately after its creation, if it immediately followed
the death of another speech-client window. This was because of a misuse
of the same memory from one client to the next. This is now prevented.
- Bug fix: In cases where the user did not have
auto-Answer selected and had no program listening to answer the phone or
elected not to answer the phone, an incoming call was ignored until it
disconnected; the disconnection caused the engine to switch to
FileInput mode. This was wrong in any case,
since any SpeechRunner test would have been stopped by the call if it
was answered.
- The version 3.6 log is cleaner, in that it does
not print out grammars that are not active.
- Tell the user immediately if the microphone can't
be connected.
- SpeechStudio can now suspend dictation while still
maintaining focus control of its managed windows. The app developer can
use this suspended mode to perform window-related grammar or mouse
activities without complications from dictation recognitions.
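The reorganized thresholding described earlier in this list (recompute the maximum volume from recent history whenever a recognition succeeds) can be sketched as follows. This is an illustration of the idea in Python, not SpeechStudio's implementation: the class name, the three-second window and the threshold ratio are all assumptions.

```python
from collections import deque


class AdaptiveThreshold:
    """Track recent audio peaks and derive a noise threshold from them."""

    def __init__(self, history_seconds: float = 3.0, ratio: float = 0.25):
        self.history_seconds = history_seconds  # how much input to remember
        self.ratio = ratio                      # threshold as a fraction of max
        self._levels = deque()                  # (timestamp, peak) pairs
        self.max_level = 0.0

    def add_audio(self, timestamp: float, peak: float) -> None:
        """Record a peak level and forget anything older than the window."""
        self._levels.append((timestamp, peak))
        cutoff = timestamp - self.history_seconds
        while self._levels and self._levels[0][0] < cutoff:
            self._levels.popleft()

    def on_recognition(self) -> None:
        """Called on a successful recognition: reset max from recent history."""
        if self._levels:
            self.max_level = max(peak for _, peak in self._levels)

    @property
    def threshold(self) -> float:
        return self.max_level * self.ratio
```

Because the maximum is only refreshed from input the engine actually recognized, a single loud event such as a dropped microphone ages out of the history instead of pinning the threshold high.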
Dictation Manager
- Dictation Manager will now support separate modes
for edit and dictation. Users reported that dictation accuracy was
substantially reduced when complex grammars are presented along with the
dictation engine, so in the new version dictation mode has only a very
limited command grammar: correct that (enter correction mode), scratch
that (delete the last chunk), start number mode, start spell mode, start
edit mode, stop dictation mode.
- Dictation mode now supports a "stop dictation"
command. Also, dictation is turned off during edit mode.
- Dictation Manager now uses What Can You Say to
support organized help during the edit, spell and number modes. These
grammars and usages were complex for end users.
Profile Manager
- Profile Manager will now share information with
SpeechPlayer directly. Starting in version 3.6.1, you may launch Profile
Manager from SpeechPlayer's Tools | Options under the Profiles tab.
Profile Manager will automatically hand information back to SpeechPlayer
about your available profile backup choices.
SpeechRunner
- Bug fix: On some occasions SpeechRunner tests
would cause SpeechPlayer to produce a complaint box saying "Sorry, your
phrase is too complex for SpeechPlayer to analyze" even though the
phrase might be trivial. This was because of uninitialized memory. This
is now repaired.
General
- Version 3.5.2 is primarily a bug fix release for
version 3.5.0, and is compatible with v 3.5.0 and 3.5.1.
- The biggest remaining issue area is the .NET
version. Microsoft changed message
delivery to basic Windows controls in .NET: basic controls like buttons
will not receive messages such as BN_CLICKED unless the control is "mouse
visible". As a result, SpeechStudio Control will try to make its
target/client window mouse-visible before sending on any Windows control
messages. You will see this as your speech-enabled windows popping up.
SpeechStudio Voice Control
- SetForegroundWindow is
now called prior to attempting control message delivery; in .NET,
controls such as buttons are not given their messages unless they are in
the foreground - i.e., visible for a potential mouse click. They do not
have to be the active window.
- Add new control method: ListenToFile(fileName).
Causes the engine to be restarted (if necessary) in unshared mode with a
file audio input, and loads the given file. Any activity, such as
recognitions, will happen thereafter as though the microphone had been
in use. Upon completion, or if fileName is "", the engine is restarted
listening to the microphone.
- Recent additions to send
LogText messages from the control - designed to inform the user
about control issues - could potentially hang. The problem was that the
rare messages could be requested during a COM interaction with
SpeechPlayer.
SpeechPlayer
- SpeechPlayer now sets the foreground lock timeout (the
SPI_GETFOREGROUNDLOCKTIMEOUT system parameter) to zero, which disables
it. This timeout normally prevents window switches when
certain programs, such as Notepad, lock the keyboard to their input.
SpeechPlayer restores the old timeout when it is shut down. This change
allows voice action messages to reach "hidden" controls.
- Add an option to allow SpeechPlayer to turn off
screen and power saving modes. The .NET Speech engine will crash hard
(require unplugging!) in some cases where SpeechPlayer is listening,
gets a recognition, and tries to work with a control such as a button on
a program that is asleep. This option is "on" by default, so the screen
saver will not be enabled if SpeechPlayer is listening for an active
client.
- The Tools | Options | Recognitions voice threshold
parameter was not saved. The voice threshold parameter is now saved
to the registry.
- Two new read-only properties are supplied:
GetPropertyString("loaded grammars", strProperty)
GetPropertyString("active grammars", strProperty)
Each returns a string value: a newline-separated
list of the grammars.
- In rare cases an OnSpeaking(audio stop) event could try to send a
message via a dead client, causing a crash.
- If you destroy a control on your speech-enabled
frame then in some cases the client of that frame would Stop Listening.
(Bug introduced in v3.5.0, 3.5.1; fixed in v3.5.2)
- If the user changed a speaker or other engine
attribute while SpeechPlayer was muted, the engine reset would start it
listening again even though SpeechPlayer's mute still showed as on.
- A crash sometimes occurred when a window doing
active dictation correction was closed. This problem and any related
problem of delayed message action have been fixed.
- Try to explain to the user when the SAPI engine
has hung in the most common case of it being unable to activate or
deactivate a grammar.
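
The "loaded grammars" and "active grammars" properties listed above each return a single newline-separated string. The following is a minimal C++ sketch of splitting such a value into individual names. SplitGrammarList is a hypothetical helper and is not part of the SpeechStudio API; the sketch assumes only the newline-separated format stated above and also tolerates "\r\n" endings.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the SpeechStudio API): splits the
// newline-separated value returned for "loaded grammars" or
// "active grammars" into individual grammar names.
std::vector<std::string> SplitGrammarList(const std::string& propertyValue) {
    std::vector<std::string> names;
    std::istringstream in(propertyValue);
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate "\r\n" line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        // Skip blank lines, such as one left by a trailing separator.
        if (!line.empty()) names.push_back(line);
    }
    return names;
}
```

A caller would pass the string filled in by GetPropertyString and then iterate over the returned vector, for example to log which grammars are currently active.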
SpeechStudio Grammar Editor
- A menu item that has a name with a number in it,
such as "menu item 1", caused a crash in SpeechStudio in some cases when
Insert Pattern was used. The workaround is to use the control number of
the item and create the pattern by hand. (v3.5.0, 3.5.1; fixed in
v3.5.2)
SpeechRunner
- If a problem occurs during its diff operation,
SpeechRunner will now dump out the highlights and the timestamp-stripped
data that it was working on, so you can inspect it by hand to correct
the problem.
General
- Version 3.5 has added support for Visual Studio
.NET: Visual Basic .NET (VB.NET), C++ and C#.
- Version 3.5 has added support for Foreign
Windows. A foreign window is one for which you have no code or for
which you have chosen to make no direct or derivation-based VUI
changes. You can now add voice control to manipulate any components
that you call or that are called indirectly.
- The biggest source of issues in Version 3.0.0 and
3.0.1 was in attempts to use non-compliant TAPI modems. We could not
fix this but we have made the error results more clear and have provided
greater instruction on avoiding the problems.
Incompatibilities with Versions before 3.5.0
- The Refresh event has been renamed Refreshing.
This change was required for support of .NET,
since the old name collided with the Refresh method. The Refresh
method remains unchanged. Impact: you must find all uses of the
name Refresh and, if they apply to the event, change them to
Refreshing. Also change the data structure name to
REFRESHING_EVENT_TYPE and the message number to
DISPID_Refreshing. This is the change that's refreshing!
- The semantics of RequestLine have changed. Now, the boolean
return value indicates whether or not SpeechPlayer found a telephony
device. A "true" return value does not mean that
RequestLine succeeded. A new TelephonyEvent, TelephoneLine, is
delivered when the RequestLine succeeds or fails.
- For TreeView, we now provide regularized navigation:
{Up | Down} {Out | Over | Into}. The DownInto command will now open
the tree to the next level if it can.
SpeechStudio Grammar Editor
- The LocateControl(descriptor, class) prefix function has been added
to allow dynamic search for a control on a given frame. The descriptor
may name the target control directly, may name a neighbor control and a
direction to the target control, or may give an X,Y position
for the target control. The class gives the general control
class, such as "Button" or "TreeView". For example, the action
LocateControl("OK", Button).Press() would press the OK button.
- LocateControl(...) may be used for actions or for variables, for
dynamic grammar creation.
- For LocateControl(descriptor, class) the descriptor may be a pattern
variable, and so may map to a control name at runtime.
- Added a special dialog to help users build their
LocateControl calls.
- Fixed bug where the Grammar Editor sometimes
crashed if opened without resources.
- You can now double-click on a grammar (.grm)
file to launch SpeechStudio. It will locate the co-resident
sqz file automatically.
SpeechStudio Voice Control
- Don't refresh the grammar if the window is dead.
This can happen now with greater frequency because of
ForeignWindows, which can disappear under
the grammar. Note: refresh doesn't crash, it
just puts out warnings about any LocateControls
that failed.
- MapInsert(map, "",
"written form") will now be construed to mean that the spoken form is
the same as the written form.
- Implement LocateControl
as a method of our SpeechStudioControl. This
method parallels LocateControl in the grammar
editor, and allows our user to access foreign window controls using the
same syntax as for SpeechRunner and SpeechStudio actions.
- Add WatchForWindow
method to watch for the activation of a window whose name matches a
given pattern. Add the NewWindowActivating
event so that the user app can be signaled when a matching window
is activated.
- Add GetWhatCanYouSay() to fetch prompts and grammar samples
programmatically.
- Add SetWhatCanYouSay() to control SpeechPlayer's prompts in its
WhatCanYouSay window.
- Add Message2U event to allow SpeechPlayer to
provide error and informational messages asynchronously.
- Tell the user what dat file was used in a SpeechPlayer log message.
- Define a flexible lookup to find the .dat file. Use the given
directory/exe name as a path to find the dat file. If the exe is in
/a/b/c/bin/foo.exe and we are told to find "foo.dat",
then look first for /a/b/c/bin/foo.dat, then
for /a/b/c/foo.dat, then for /a/b/foo.dat
and so on. This allows C++ to be in any of the subdirectories (such as
Debug or Release). More importantly, it allows C# users to not worry
that their exe goes into ../bin/Debug or
wherever. Users that care can give a full path name.
- If no suffix is given for the dat file name, then add .dat.
- Changed the Speak method to allow users to set up
SPEAKFLAGS in SAPI. Users can now pass through XML commands to control
pitch, volume and so forth, in TTS.
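
The flexible .dat lookup described above (start in the exe's directory, then try each parent directory in turn, adding .dat when no suffix is given) can be sketched in C++ as follows. This is an illustration under stated assumptions, not the control's actual code: WithDatSuffix, CandidateDatPaths and FindDatFile are hypothetical names, the file-existence check is supplied by the caller, and treating any name that contains a directory separator as a full path is an assumption made for the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: appends ".dat" when the file name has no suffix.
std::string WithDatSuffix(const std::string& name) {
    std::string::size_type slash = name.find_last_of("/\\");
    std::string::size_type dot = name.find_last_of('.');
    bool hasSuffix = dot != std::string::npos &&
                     (slash == std::string::npos || dot > slash);
    return hasSuffix ? name : name + ".dat";
}

// Hypothetical helper: lists the places to look, nearest directory first.
// For exePath "/a/b/c/bin/foo.exe" and datName "foo.dat" the list starts
// with /a/b/c/bin/foo.dat, /a/b/c/foo.dat, /a/b/foo.dat.
std::vector<std::string> CandidateDatPaths(const std::string& exePath,
                                           const std::string& datName) {
    std::vector<std::string> candidates;
    const std::string file = WithDatSuffix(datName);
    // Assumption: a name with a directory part is a full path; use it as is.
    if (file.find_first_of("/\\") != std::string::npos) {
        candidates.push_back(file);
        return candidates;
    }
    std::string dir = exePath;
    for (;;) {
        std::string::size_type slash = dir.find_last_of("/\\");
        if (slash == std::string::npos) break;
        dir.erase(slash);                        // drop the last path component
        candidates.push_back(dir + "/" + file);  // "/" is accepted on Windows too
        if (dir.empty()) break;                  // reached the root
    }
    return candidates;
}

// Returns the first candidate for which exists(path) is true, or "".
template <typename ExistsFn>
std::string FindDatFile(const std::string& exePath,
                        const std::string& datName, ExistsFn exists) {
    const std::vector<std::string> candidates =
        CandidateDatPaths(exePath, datName);
    for (std::vector<std::string>::size_type i = 0; i < candidates.size(); ++i)
        if (exists(candidates[i])) return candidates[i];
    return std::string();
}
```

Passing the existence check as a parameter keeps the search order testable without touching the disk; a real caller would supply a function that checks the file system.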
SpeechPlayer
- ResetEngine() had not actually been resetting in most cases. It was
"optimized" to avoid reset if
it judged that the engine was already in the desired mode. By forcing it
to restart the engine, users can change engine options, such as the use
of the audio tee object, on the fly.
- Fixed PRN 301: If the user called Dial without calling RequestLine,
we crashed.
- We now do more to explain any grammar error to our
user.
- SimPhone variable is
now a parameter, accessible via GetParameterNum.
- Only deactivate a dictation grammar if it was
started. Otherwise the log will be sprinkled with misleading deactivate
messages.
- The user can see the current state of a grammar's prompts:
GetWhatCanYouSay(grammar_name)
- The user can control a grammar's prompt display:
SetWhatCanYouSay(grammar_name, string_of_new_prompts)
- Added the QueryPrompt
event: it is fired when the end user double clicks on a prompt in
SpeechPlayer's WhatCanYouSay window. You
can handle this event to explain or expand on prompts, thus providing
dynamic control of the SpeechPlayer WhatCanYouSay
prompt display.
- Changed the handling of
HalfDuplex so that we stop listening but continue to process the
message loop. We restore the system to listening when the sound output
is over (as indicated by the AudioStop
event). Speak and SpeakWav will now queue
so the engine will not have to switch back and forth from listening to
not listening, which in previous implementations caused a noticeable
pause (.5-2 sec) between Speak calls.
- DTMF can now interrupt (barge-in) during Speak and
SpeakWav calls, even in HalfDuplex mode.
- The Microsoft engine will occasionally die/hang
after responding "BUSY". We will now repeatedly try the rejected
operation to see if the engine stops being busy. The
problem is usually that the engine refuses to listen. This happens when
the profile is corrupted or, on some occasions, when the number of
active grammars goes to zero temporarily (for example, when we are
switching grammars): SAPI may change the state to SPRST_INACTIVE.
Thereafter SAPI
will keep changing the state back to SPRST_INACTIVE no matter what we
do, effectively hanging the engine. To work around this problem, we
reset the engine under these circumstances, then
restore the dynamic state.
- We no longer complain about
LocateSibling unless we actually need to use that result.
Otherwise it confused the log.
- If the program answers a call, we must unmute; otherwise
the phone caller will get dead air. We then restore the state of muting
after the call is over.
- Add an option so that the
AudioTeeObject is by default not used, and can be used by setting
an option.
- Implemented Message2U event so that
errors/conditions can be reported back to the user program. It reports
either to the control program if we are managing a
telephone, or to the current client otherwise.
- Changed many error/assert
conditions to instead report their error as Message2U events, reducing
MessageBox annoyance and assertion checks.
- SpeechPlayer no longer crashes if the user
configures out the AudioTeeObject.
- Allow users to set/unset the
AudioTeeObject thresholds both from
SpeechRunner and from their code.
- Allow the voice level threshold and file in/out
formats (for the phone) to be controlled via options and
programmatically. This change specifically supports older Dialogic
boards and some modems, where the "supported formats" query returns
incorrect results so the format must be forced to a specific value, at
least for now.
- Improved reporting to help user understand what to
do if their telephony device setup fails.
- Tell user why format/stream creation failed.
- Provide DoNotAnswer
property to allow users to control whether SpeechPlayer will answer the
phone at all. SpeechPlayer can be set to not answer the phone anymore
via a program, so that a program can honor a request, say over the
phone, to stop answering the phone. That way if SpeechPlayer programs
are causing a problem, there is a remote way to disengage them.
DoNotAnswer applies not just to
AutoAnswer, but to all calls.
- Fixed PRN 287: Before changing the telephony
device, make sure telephony will be initialized. The first
initialization (on the IBM laptop) failed preventing any further change
to telephony.
- Provide a property to see who's got control of the
phone line.
- Only answer the phone if an appropriate client is
waiting for a connection.
- Don't answer the phone if we are in the middle of
testing.
- Hang up the phone when we're done. In
AutoAnswer we were leaving the connection
going, causing the line to be locked until SpeechPlayer was killed.
Naturally this annoyed anyone who wanted to
call or get a call.
- Unify the call id number for SimPhone and TAPI. It turns out that
TAPI can get complicated about how it sets up telephone input, so let
the call id number reflect calls to
Telephone_Connect, and return 0 if the most recent event is a
Telephone_Disconnect.
- Added "View Log" to the SpeechPlayer menu
available from the system tray.
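
The BUSY handling described above (retry the rejected operation, and fall back to an engine reset followed by restoring the dynamic state) can be sketched in C++ roughly as follows. This is a simplified model, not SpeechPlayer's code: EngineResult, RetryWhileBusy and the two callbacks are hypothetical names, the attempt limit is an assumption, and combining the retry loop with the reset fallback in one routine is a choice made for the illustration.

```cpp
#include <functional>

// Hypothetical result codes for an engine operation.
enum class EngineResult { Ok, Busy, Failed };

// Retries `operation` while it reports Busy, up to maxAttempts times.
// If the engine is still busy, calls `resetAndRestore` once (standing in
// for "reset the engine, then restore the dynamic state") and makes one
// final attempt. A real implementation would likely pause between attempts.
EngineResult RetryWhileBusy(const std::function<EngineResult()>& operation,
                            int maxAttempts,
                            const std::function<void()>& resetAndRestore) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        EngineResult result = operation();
        if (result != EngineResult::Busy) return result;  // done, or hard failure
    }
    resetAndRestore();   // engine appears hung: reset and rebuild state
    return operation();  // final attempt after the reset
}
```

The caller decides what the final result means; for example, a Busy result after the reset could be reported to the client program rather than retried forever.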
SpeechRunner
- Add support for ignoring selected lines that would
otherwise be matched as highlights. This is required because some lines
are produced asynchronously and therefore are susceptible to race
conditions.
- Add directive support for control designation,
mirroring the directives for LocateControl:
[DownFrom], [UpFrom],
[RightFrom], [LeftFrom],
and [On].
- Add the [@x,y]
notation for control directives.
- Adapted the command templates for C#.
- Allow arbitrary blanks between the SPEAK> and the
phrase in AwaitSpeak.
- Clean up after a simulation from an SQR. We
especially must honor the ForceTermination.
- Allow users to use relative path names for sqr args. The file names
will default to the same dir as the sqr.
- A command line of the form "cmd: //comment" caused a complaint that
there was a phantom arg. This fix eliminates an empty arg in this case.
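
The relative path rule for sqr args noted above (a relative file name defaults to the same directory as the sqr) can be sketched in C++ as below. IsAbsolutePath and ResolveScriptArg are hypothetical names and this is not SpeechRunner's implementation; the sketch assumes that a leading slash, a leading backslash, or a drive prefix such as "C:" marks a path that should be left alone.

```cpp
#include <string>

// Hypothetical check: treats "/x", "\x" and "C:..." style names as absolute.
bool IsAbsolutePath(const std::string& path) {
    if (path.empty()) return false;
    if (path[0] == '/' || path[0] == '\\') return true;
    return path.size() >= 2 && path[1] == ':';
}

// Hypothetical helper: resolves a script argument against the directory
// that holds the script file. Absolute arguments are returned unchanged.
std::string ResolveScriptArg(const std::string& scriptPath,
                             const std::string& arg) {
    if (IsAbsolutePath(arg)) return arg;
    std::string::size_type slash = scriptPath.find_last_of("/\\");
    if (slash == std::string::npos) return arg;  // script has no directory part
    // Keep the script's own separator by including it in the prefix.
    return scriptPath.substr(0, slash + 1) + arg;
}
```

For example, with a script at C:\tests\run.sqr, the argument golden.log would resolve to C:\tests\golden.log under these assumptions.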
Samples
- Added complete set of samples for C#.
- Made at least one sqt test available in each VUI sample program, to
serve as a use case as well as a test.
- Supplied Speech Utilities code, so users don't
have to write their own common glue code.
- The samples have been upgraded and further
integrated into tutorials.
- Added samples to illustrate Foreign Window use in
each language.
- Added samples to illustrate control of What Can
You Say use in each language.
- Added samples to illustrate Speak prefix for
SPEAKFLAGS use in each language.
· Fixed PRN 286: Speech Collector: "Record" doesn't start recording.
· Fixed PRN 242: Menu text is missing on Windows NT.
SpeechStudio Grammar Editor
- The grammar compiler used to be included with the
grammar editor. Now the grammar compiler exists as a standalone program,
compile.exe. Visual Basic projects are unaffected; however, this change
affects how grammar files are compiled in Visual C++ projects. The
changes to Visual C++ projects are described in the Quick Start section
of the SpeechStudio Help on-line manual.
- When grammars are compiled as part of building a
Visual Basic project, a new set of dialogs helps ease the process of
fixing and recompiling the grammar files.
- When grammars are compiled with custom build rules
in Visual C++ projects, any errors or messages
from the grammar compiler are directed to the Visual Studio log window.
- The "Find in File" command is only implemented to
search grammars in a project with case sensitive or insensitive
searches. In particular, regular expressions are not yet implemented.
- SpeechStudio has a /display=file option to
initialize the contents of the log window, for example, with error
messages from a compilation.
- For Visual Basic projects, SpeechStudio grammars
are no longer stored in resource DLLs. The grammars are now compiled
into a new file, SpeechStudio.dat, which is
loaded at run-time. Visual Basic users no longer have to load the
resource compiler, nor are there problems caused by read-only resource
files.
- For Visual C++ projects, SpeechStudio grammars are
no longer included in the resource script (.rc file). The grammars are
now compiled into a new file, SpeechStudio.dat, which is loaded at
run-time. This change simplifies project setup and fixes a problem where
Visual Studio would not recompile grammar files.
- New built-in methods include LogText and StopSpeaking.
SpeechStudio Voice Control
- The SpeechStudio Voice Control API has been
upgraded in many ways. The control is documented in the SpeechStudio
on-line help file in the section Programming Reference under
SpeechStudio Control API Reference. The voice control encapsulates most
of the functionality available for an application program.
- A new call, Init(), is
used to associate a controlling window with each instance of the voice
control as well as to provide a search path for the
SpeechStudio.dat file. Init() must be
called before any other API commands are called.
- Start() now accepts
just the grammar name. It no longer includes a reference to an hWnd,
which is now set by the Init() call.
- A new call, StartEx(), is used to override the search path set by
Init().
- VBStart() has been replaced by Start(). The dll parameter to
VBStart() has been moved to Init() and StartEx().
- The event associated with audio and
text-to-speech is now called Speaking.
- The event (and callback)
GetPhraseList has been replaced by the combination of the new
Refresh event and phrase maps. This new event provides better control
over dynamic grammar generation; in
particular, it supports initialization, caching, and optimization.
- Literal strings with colons (":") passed to action
routines used to cause incorrect arguments to be passed.
- The control now maintains grammars on a
one-to-one basis with the application's grammars. In the past, the
control built a single big grammar from all of the app's grammars. Now,
each app grammar is loaded, activated, deactivated, and unloaded
separately.
- The internal command
Deactivate() has been replaced by Deactivate(grammarName).
This change improves the performance of SpeechPlayer because grammars
are now smaller and dynamic grammars are evaluated independently of
other grammars in the system.
- SpeechStudio.dat files
have a redistribution key to prevent redistribution of programs created
with evaluation copies of SpeechStudio Suite.
- SetProperty can be
used to change the current speaker profile.
- The Speaking type was updated for C++ programs.
- The control creates an intermediate window to
handle COM events. This window's class was not being correctly
removed. This was causing VB programs to get an exception the second
time they ran under the VB IDE (run the program once, kill it by
clicking the x in its upper right corner, then run it again).
- The SpeakWav parameter
StreamID has been renamed Cookie to reflect
that it's really not connected to the stream
but is a chunk of data for the user.
- SpeakWav now correctly
maintains the StreamId parameter.
- The Speaking event now returns the name of the wav
file in question. The old version returned an integer that needed to be
converted to a file name.
- The Dictation API can be connected to an edit
control or text box. When an edit control or text box gets control, the
Dictation API pops up a small text box to track dictation. When a phrase
is recognized, the Dictating event is fired and the application must
deal with the text.
- The Dictation event now returns the hWnd of its
associated edit control or text box.
- The dictation event now returns a list of
dictation alternatives.
- A new method,
UseAlternate(),
is used to inform the speech engine which dictation alternative to
accept as correct (for training).
- A set of methods and events for Telephony has been
added to the control.
- A crash on exit has been fixed.
- Several unused properties have been removed.
- Most parameter names have been changed to be more
like Visual Basic names.
- The C++ event mechanism, sending Windows messages,
has been simplified. The number of messages has been reduced from four
to one, and the message parameters have been changed. The wParam is the
event type. The lParam is a pointer to a structure containing
information about the event.
- For C++ programs, a new include file,
SpeechStudio3.h, is available. It is very similar to the definitions
that you would see with #import, except the types are
friendlier and it does not throw exceptions.
- For phrase maps, the parameters and documentation
now use Spoken and Written instead of Key and Value.
- The audio recording interface now uses the
RecordingMode property instead of start and
stop recording.
- Removed SQZGrammar.h
which was used to compile C++ rc2 files. It is no longer needed.
- Changed the name of SpeechStudioControl.dll to
SpeechStudio3.dll.
- Characters that do not have virtual key codes,
like '>' and '<', were not handled correctly (because they are shifted
versions of other virtual keys). This new code handles these cases using
the built-in Windows functions.
- Implement
MapClear(),
so users can rebuild maps without having to remember what used to be in
them.
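
The simplified C++ event mechanism described above delivers one Windows message in which wParam carries the event type and lParam points to a structure describing the event. The sketch below shows the general shape of a handler for that pattern. Everything named here is a stand-in chosen for illustration: the real event type values and structure layouts come from SpeechStudio3.h and are not reproduced, and the integer typedefs replace WPARAM and LPARAM so the sketch compiles without <windows.h>.

```cpp
#include <cstdint>
#include <string>

// Stand-ins for WPARAM and LPARAM so the sketch builds without <windows.h>.
typedef std::uintptr_t SketchWParam;
typedef std::intptr_t SketchLParam;

// Hypothetical event type values; the real ones are defined by the product.
enum SketchEventType { kSketchRecognized = 1, kSketchSpeaking = 2 };

// Hypothetical per-event structures passed by pointer in lParam.
struct SketchRecognizedInfo { const char* phrase; };
struct SketchSpeakingInfo { const char* wavFile; };

// Dispatches on the event type in wParam, then interprets lParam as a
// pointer to the structure that belongs to that event type.
std::string DescribeEvent(SketchWParam wParam, SketchLParam lParam) {
    switch (wParam) {
    case kSketchRecognized: {
        const SketchRecognizedInfo* info =
            reinterpret_cast<const SketchRecognizedInfo*>(lParam);
        return std::string("recognized: ") + info->phrase;
    }
    case kSketchSpeaking: {
        const SketchSpeakingInfo* info =
            reinterpret_cast<const SketchSpeakingInfo*>(lParam);
        return std::string("speaking: ") + info->wavFile;
    }
    default:
        return "unhandled";  // unknown event types are ignored by this sketch
    }
}
```

In a real window procedure the same switch would sit inside the case for the single SpeechStudio message, with the cast applied only after the event type has been checked.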
SpeechPlayer
- Add telephone "auto answer" capability to
SpeechPlayer.
- Improve behavior in cases of: no SAPI, no SR
engines, no TTS engines.
- Improve options dialog and logging for the case
where there are no supported TAPI devices.
- Improve telephony logging and error reporting.
- Move "View Log" and "View Summary" menu items from
File to Tools.
- Record audio for dictation results.
- Reduce unnecessary redrawing.
- Grammars are created in the
inactive state, and then enabled or disabled according to the
muting/input state.
- Changes to allow SpeechPlayer to work with a
half-duplex modem.
- Change "hot" image for the microphone-mute toolbar
button.
- Double-clicking on the icon in the system tray no
longer leaves a bunch of junk in the system tray.
- The prompts window now displays its horizontal
scroll bar only when needed.
- The buttons that control the scrolling of the
prompts window now react much faster. The auto-scroll is faster and
smoother as well.
- Fix two confirmation bugs: confirmation button
panel was not shown immediately when going into confirmation mode, and a
freed confirmation grammar could be referenced if the client was
destroyed before completing confirmation.
- Default to using the inproc
recognizer for audio stream recording.
- Disable mute controls while running a test.
- Work around SAPI compilation (gc.exe) bug passing
properties.
- Empty rules cause the
SAPI XML grammar compiler to fail with a non-specific error so now the
XML grammar is created empty.
- Draw volume bars with a dimmer color when not
listening.
- Do not shorten phrase lists that are directly
reached from <start> as stand-alone lists.
- If the user clicks the microphone button, the
input device is reset to the microphone.
- Add an internal routine
Canonical() to clean up recognized text, for example removing
doubled spaces.
- SAPI, by default, loads grammars as active.
SpeechPlayer wants grammars to start inactive. SpeechPlayer uses grammar
state to do muting and top-level rule state to show active & inactive
grammars. In order to create the grammar without a window of activity,
and in keeping with SpeechPlayer's usage model, we create a grammar like
this:
1) Create grammar.
2) Set grammar to inactive.
3) Load grammar (rules)
4) Set rules to inactive
5) Set grammar to active
- Add "StartInTray"
option for keeping SpeechPlayer hidden in the system tray.
- Implement pop-to-top, pop-on-confirm display
modes.
- Consolidate Autosize
and Show/Hide Prompt options into one "prompt mode" option: show, hide,
or auto-show.
- Add an easy way to restore the default logging
levels: the "Default Levels" button on the Logging tab of the Options
dialog.
- Add telephony BlindTransfer
method.
- Add DTMF detection support.
- Add context help strings for Options dialog
controls.
- Add logging to track how many grammars are loaded
into the engine.
- Reflect changes in the TTS options immediately.
- Add a simulated telephone interface,
SimPhone. It acts like a telephony device,
sending the same events but using the microphone.
- Implement full-time audio stream recording.
- Better updating of GUI state.
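
The five-step grammar creation order listed earlier in this SpeechPlayer section (create, set grammar inactive, load rules, set rules inactive, set grammar active) exists to avoid any moment in which freshly loaded rules can be recognized. The C++ sketch below models that ordering with a small stand-in object. SketchGrammar is hypothetical and is not the SAPI interface; the model assumes a new grammar starts active and that loaded rules start active, and it records whether the rules were ever live.

```cpp
// Hypothetical stand-in for an engine grammar; not the SAPI interface.
struct SketchGrammar {
    bool grammarActive = true;  // per the note above, grammars start active
    bool rulesLoaded = false;
    bool rulesActive = false;
    bool everLive = false;      // true once loaded rules could be recognized

    void SetGrammarActive(bool on) { grammarActive = on; Check(); }
    void LoadRules() {
        rulesLoaded = true;
        rulesActive = true;     // assumption: rules come up active when loaded
        Check();
    }
    void SetRulesActive(bool on) { rulesActive = on; Check(); }

    // Records any moment in which the engine could hear the rules.
    void Check() {
        if (grammarActive && rulesLoaded && rulesActive) everLive = true;
    }
};

// Follows the five steps so the rules are never live during creation.
SketchGrammar CreateQuietGrammar() {
    SketchGrammar g;            // 1) Create grammar.
    g.SetGrammarActive(false);  // 2) Set grammar to inactive.
    g.LoadRules();              // 3) Load grammar (rules).
    g.SetRulesActive(false);    // 4) Set rules to inactive.
    g.SetGrammarActive(true);   // 5) Set grammar to active.
    return g;
}
```

In this model, loading rules into a grammar that is still active sets everLive immediately, which is the window of activity the ordering is meant to avoid. After creation, the grammar state is free for muting and the top-level rule state is free for showing active and inactive grammars.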
SpeechRunner
- Add MultiprocessTest
command to allow tests to control multiple processes instead of just the
process under test.
- Diff patterns now keep "ReportTestError"
messages in the filtered log file by default.
- If the right panel of a diff window is empty,
expanding and hiding differences caused a crash.
- Fix bug: the SpeechRunner tools weren't being
hidden correctly in batch mode.
- Update the MRU list for Phrase Libraries to
include those libraries encountered in OpenPhraseLib commands.
- The regular expression compiler now reports bad
patterns.
- Check that a regexp
pattern is well-formed, and complain if it isn't.
- If the user is informed of a simulation error, ask
if s/he wants to proceed. If the user quits a test, the problem (e.g.
await window) may conclude after the tester has disconnected. This no
longer asserts.
- Implement basic telephone simulation support for
SpeechRunner: ring, caller id, connect,
dtmf and disconnect commands.
- Let the user know that the simulation is
completed.
- Lock in the order of speak
events for golden logs.
- Add logger message for test start up.
- Rename the Insert Say/Paraphrase menu items to
remind the user that they use side-by-side simulation.
- Add new insert commands for Say/Paraphrase so that
side-by-side simulation is not required.
- Add the template support for the Speak and the
DTMF commands.
- Show marker-type errors in the log window.
- Speak() and
SpeakWav() now have distinctive log
messages, which are considered highlights.
- Standardize error messages so they can be
displayed nicely by SpeechRunner.
- More errors are reported through SpeechRunner
instead of stopping SpeechPlayer with message boxes.
- SpeechRunner now waits for a short duration for a
new grammar to become active, reducing the need for too many calls to
AwaitWindowListening.
- Add signal value for "too busy to simulate now".
- SpeechRunner diff: Add support for coloring the
log bar lines.
- Add the PhraseLibSpeaker
command so that SpeechRunner can use phrase lib files recorded under a
name that is different from the current engine name.
- Allow apostrophes in phrases.
- Allow more time if we are waiting for a timeout
operation.
- Allow the user to provide defaults for choosing
the exe name or the window name.
- Change Await* logging so that the command is shown
as PENDING> until it either succeeds or fails. That way it will be
synchronized with the APP messages.
- Close the log file after a simulation, so the user
can use the diff operation on it.
Grammars
- Add checks for floating point as grouped numbers.
- Add a time and duration grammar.
- Add ordinals.
- Add small_float and
tiny_float.
- Allow grouped military letters.
- Allow negative exponents.
- Disallow the word "decimal" in front of "point".
- Allow a simple number without a decimal point.
- Allow input of numbers in dollar or decimal
format.
Samples
- Add recorder samples.
- Add telephone samples.
- Add dictation samples.
- Add the Problem Call application.
- Fix mouse up event handler in Scribble.
- Change the name of Scribble's "Pen
Widths.grm" to "PenWidths.grm".
Other changes
- Add licensing to Phrase Explorer.
- Add licensing to
SpeechCollector.
- Change the GUIDs and
version numbers to reflect version 3.
- Change the name of the
eula for Suite to Suite. Move the
release notes to the top-level directory.
- The release notes & installation instructions are
combined.
- Converted the logging directory to
.../SpeechStudio from .../SQZ.
- Break out the tutorials and collect them in a
tutorial directory.
- No longer distributing
ComVector.h with our product.
· New Feature: Added a SpeechPlayer registry entry,
StartInTray, that tells SpeechPlayer not to
bring its window up.
· Bug fix: SpeechPlayer changed the message numbers of all messages for
which it listens into a safer range. Other (non-malicious) programs should
not be able to interfere with SpeechPlayer now.
· Bug fix: An intermediate window class was not being correctly
removed, causing an error in a rare case.
· New Feature: SpeechPlayer redistributable no longer requires speech
recognition components if the application is only doing TTS.
· New Feature: SpeechPlayer no longer complains if
SpeechRecognition is not installed in the case where the application
is only doing TTS.
Version 2.0.1 brings the Visual C++ and Visual Basic
support in sync and creates an Enterprise product containing support for
both.
· Changed the C++ Sample name "SQZedCalc"
to "SQZCalc with VUI".
· Bug fix: recording audio and confirmation didn't mix, because the
audio for the recognition-to-be-confirmed was not being kept around.
· The help for SpeechStudio has been combined into one manual with both
VB and VC++ information. The same is true for SpeechRunner.
· SpeechStudio is now licensed by language (VB or VC++ or both).
· Color scribble example for VC++ can now tune colors on the fly.
· Bug fix: Dictation could only be started by a recognition. That is,
if you pressed a button to do dictation, it didn't work.
· The company name has changed from "SQZ Inc." to "SpeechStudio Inc."
To reflect this change we have changed the title bars,
icons, program names, and most other user-visible locations where SQZ used
to appear.
· The SpeechStudio product is now installed in C:\Program
Files\SpeechStudio instead of C:\Program Files\SQZ.
· In SpeechRunner, if the path to the .exe file is "./", then "." is
defined to be the location of the current .sqt
file.
· A better Voice User Interface has been added to the VB calculator
example to create Samples/VB/vbCalc with better
VUI. In particular the spelling, define, options, and
spellcheck dialogs have been unified.
· In SpeechRunner, allow the use of the Static control "neighbor" to
identify an Edit box for the DoSetText and
DoEditBoxCheck commands.
· Include the SpeechPlayer redistributable as part of the product. See
the Redist folder in the product (there is a
readme file there as well).
· Changed Calculator so that variables and functions are now distinct.
· The VB
add-in now watches for events that require the SpeechStudio.dll to be
up-to-date, events such as execute, file | make, etc. If the DLL is not
up-to-date, then it invokes SpeechStudio to bring it up to date.
· SpeechPlayer will now always come to the top when confirming, even if
it is iconified.
· Fixed some SpeechPlayer redrawing bugs when going from minimized to
normal size.
Release Notes for Version 2.0.0
(October 21, 2001)
Version 2.0.0 is a Visual Basic only release.
SpeechStudio Suite for Visual Basic
Improvements to the SpeechStudio Resource Manager
Add-In for Visual Basic include:
· The maintenance dialog, brought up by pressing the eyeglasses button,
allows for the selection of the SQZ project file and, if the VB project file
is not read-only, this selection will be recorded so the user won't be
prompted in the future.
· In a VB
project that contains SpeechStudio voice recognition components, prior to
executing a VB program from the IDE or building a .exe (i.e., file |
make...), the add-in will check that the SpeechStudio.dll is up-to-date and,
if not, will bring it up to date by invoking SpeechStudio to compile the .sqz
file.
· Better error reporting.
The dictation component of SpeechRunner has been improved to put blanks
between the groups of words as they are recognized.
There is a new StopSpeaking()
method in the SpeechStudio control which terminates audio output immediately
(TTS or .wav output).
Version 1.7.8 is a minor upgrade to 1.7.7 that adds example
SpeechStudio.dll files to two sample directories in Samples\VB\:
Events and vbCalc with VUI.
The SpeechStudio help document has also been upgraded.
Release Notes for Version 1.7.7 (July
19, 2001)
SpeechStudio Suite for Visual Basic
Version 1.7.7 is the beta
release of SpeechStudio Suite for Visual Basic.
The non-VB-specific bug fixes and feature changes for the product components
will be documented in the full product release.
·
A new component, called the VB Add-In, is part of the VB
product. Installation of the product will automatically enable the Add-In's
toolbar, called the SpeechStudio toolbar, in Visual Basic. Use of the
buttons on the toolbar is described in the SpeechStudio Help.
·
SpeechStudio can now get its resource descriptions from an .xml
file, which is what the VB Add-In creates.
Known problems in the VB Beta
·
None of the Updown control actions in a
grammar will work.
·
The actions of the slider control (aka
msctls_trackbar) work; however, they do not cause
the event procedures in a VB program to be triggered.
·
None of the scrolling actions for list and combo boxes are working.
You can work around most problems such as
these by calling the FireRecognized
action in the grammar instead of the control-specific action. Then just
perform the desired action in the Recognized event procedure.
·
There is no way in the beta to voice activate the three system menu
buttons usually appearing in the upper right hand corner of the window
(minimize, maximize, and quit). As described above, you can use
FireRecognized to work around this problem if
desired.
·
SpeechStudio Inc. recently changed its name from SQZ Inc. There
are still some vestigial references to SQZ in our help documents, error
messages, etc. For those new to SpeechStudio Inc., this message explains
where that "SQZ" you see comes from.
SpeechStudio Suite C++
·
Saving a read-only file sometimes resulted in an error message that
displayed the "Help About" message.
·
Log files for all SpeechStudio programs are now written to the "SQZ"
folder in the system's Application Data folder, instead of the root
directory.
·
Sped up string handling for XML parsing.
·
Support hyphenated words like forty-one. Previously, the hyphen was
treated as a confidence operator; now it is a hyphen if an ID character
immediately follows it. Numbers must now be given in hyphenated form.
·
Improved SpeechRunner and SpeechPlayer cooperation in cleaning up
killed processes.
·
Improved handling of audio recording. Audio recording now works
correctly when multiple clients are recording simultaneously, in the same or
different directories.
·
Copyright messages have been updated to include the year 2001.
·
Several additional predefined grammars are now available in addition
to <integer/>. They are natural, digit, digits2, and digits3.
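The hyphen rule described in this list can be illustrated with a small
sketch. It is not SpeechStudio's parser. It assumes that an "ID character"
means a letter, digit, or underscore, and the name classify_hyphens is
hypothetical.

```python
import string

# Assumption: "ID characters" are letters, digits, and underscore.
ID_CHARS = set(string.ascii_letters + string.digits + "_")


def classify_hyphens(text):
    """Return a list of (index, kind) pairs, one for each '-' in text.

    kind is "hyphen" when an ID character immediately follows the '-',
    and "operator" otherwise (the older confidence-operator reading).
    """
    result = []
    for i, ch in enumerate(text):
        if ch != "-":
            continue
        # Look at the character right after the '-', if there is one.
        following = text[i + 1] if i + 1 < len(text) else ""
        kind = "hyphen" if following in ID_CHARS else "operator"
        result.append((i, kind))
    return result
```

Under these assumptions, the '-' in "forty-one" is classified as a hyphen,
while a '-' followed by a space or by the end of the text is left to the
confidence-operator reading.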
SpeechStudio C++
·
In batch mode, SpeechStudio did not write error messages or return a
bad error status. Now in batch mode, errors are written to standard output
and the number of errors is returned in the exit status.
·
Implemented Save As... for grammar files.
·
SpeechStudio would, in rare cases, crash immediately after briefly
flashing its splashscreen.
·
Changed the SpeechStudio grammar-attribute "disable" to be
"disabled".
· In
SpeechStudio grammars, add "true" and "false" as Boolean values, joining
"yes", "no", "1", and "0".
·
Added a new tutorial program, "Color Scribble with Playback".
·
File | Refresh now marks the project file as written when resources
attached to the project are changed.
·
File formats are automatically upgraded from version 1.5 to version
1.7.
·
SpeechStudio now supports grammars that are not associated with
resources. The grammars, which may be started and stopped from anywhere in
an application, can communicate with the application using
GetPhraseList and
FireRecognized.
·
SpeechStudio now associates grammar names, not grammar files, with
resources. Grammar files can be shared between grammar names. Multiple
grammar names can be associated with a single resource. Taken together,
these new features combine to allow reuse of primitive functions, such as
common control dialogs.
·
Grammars can now access the value dictated to the wildcard patterns
"*" (dictation) and "..." (junk).
·
Reconciliation has been simplified so that only grammar files must be
reconciled, and then only when the resources associated with the grammar
file undergo a significant change.
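As an illustration of the six Boolean spellings listed above for grammar
attributes, here is a minimal sketch of how a tool reading grammar files
might normalize them. The function name parse_grammar_bool is hypothetical,
and case-sensitive matching is an assumption; the notes do not say how
SpeechStudio itself treats case.

```python
# The six spellings named in the release notes.
# Assumption: matching is exact and case-sensitive.
_TRUE_VALUES = {"true", "yes", "1"}
_FALSE_VALUES = {"false", "no", "0"}


def parse_grammar_bool(value):
    """Convert a grammar attribute string to a Python bool."""
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    # Anything else is rejected rather than guessed at.
    raise ValueError("not a recognized Boolean value: %r" % (value,))
```

For example, an attribute such as "disabled" could be read through
parse_grammar_bool so that "yes", "true", and "1" are treated alike.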
SpeechPlayer
·
Internal debugging routines no longer fail with an assertion error if
the output message is too long.
·
SpeechPlayer now uses the shared recognizer when possible, so that
SpeechStudio-enabled programs can run alongside Office XP and other programs
that use SAPI 5 directly.
·
When the current SpeechPlayer client is killed abnormally (for
example, by the Win32 API function TerminateProcess),
stop listening immediately.
·
SpeechPlayer has more efficient grammar loading. Grammars are only
written to a temporary file if an error occurs, or if the user has requested
debugging information by setting a high logging level.
·
SpeechPlayer has improved logging for speech recognition engine
events and grammar state changes.
·
SpeechPlayer sometimes terminated a test immediately upon starting
it. This rare event occurred when the previous test had just been
terminated abnormally and left timer routines running, so that these
stale timers triggered during the new test.
·
Put in score tuning for the Microsoft SAPI 5 engine. Numeric scores
between -100 and 102 can now be used for judging confidence for the SAPI 5
engines. Values less than zero are false recognitions. Values above 100
indicate that the result was simulated or forced.
·
The Properties | Set Logging Levels dialog now supports a button to
reset to defaults, resetting to the original logging levels.
·
Added menu item File | View Log to quickly access the most recent
part of the SpeechPlayer log.
·
The Microsoft SAPI 5 (or engine) has a bug returning properties for
grammars that use multiword string literals as default
VALSTRs. Occasionally only the last literal word is returned as the
VALSTR property. Work around this problem by simulating the correct result
construction.
·
If a SpeechPlayer client starts without registering an hWnd, an
additional message may be displayed later when the client registers its hWnd.
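The SAPI 5 score ranges described in this list can be summarized in a short
sketch. The function name classify_score and the label "recognized" for
scores from 0 to 100 are assumptions made for illustration; the notes only
name the meaning of values below zero and above 100.

```python
def classify_score(score):
    """Interpret a numeric confidence score on the -100..102 scale."""
    if not -100 <= score <= 102:
        raise ValueError("score outside the documented range -100..102")
    if score < 0:
        # Values less than zero are false recognitions.
        return "false recognition"
    if score > 100:
        # Values above 100 mean the result was simulated or forced.
        return "simulated or forced"
    # Assumption: everything from 0 to 100 is an ordinary recognition.
    return "recognized"
```

An application could use such a helper to ignore false recognitions and to
tell simulated test input apart from live speech.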
SpeechStudio Control
·
The control is incompatible with version 1.5.
·
Fixed built-in actions affecting Listbox
selection: for a listbox with multiple selection
style (LBS_MULTIPLESEL), speaking the name of a list item that is selected
will unselect it.
·
Built-in actions affecting Listbox
selection always send change notifications, even if the actual selection
does not change.
·
The API Speak method was limited to a short, fixed number of
characters to speak. Now this number is unlimited; however, speaking cannot
be interrupted, so speaking in short phrases is preferred.
·
Built-in Actions now have limited support for the Windows common
control ListView.
·
SpeechStudio Control maps, which are used to handle abbreviations or
dynamic wordlists, are no longer associated with controls. Instead, maps
have unique identifiers. These identifiers are used throughout map-related
methods (such as GetMappedWordlist and MapInsert).
·
A new built-in action, GetPhraseList, is
used to dynamically create grammar elements. When the grammar is evaluated,
GetPhraseList causes the control to send the
GetPhraseList event (or Windows message) to the
application.
·
SpeechStudio Control signatures are no longer used. The Signature
property has been removed.
·
The built-in action FireRecognized now
takes a variable number of arguments.
·
The Recognized event (or Windows message) has a new parameter
profile. In particular, the event now has an associated id and score. If the
phrase is being recorded, the id number of the recording is set. The
parameters to FireRecognized are passed in a
single SafeArray.
·
Built-in actions which appear in quoted strings (for example,
GetPhraseList), will implicitly and
automatically have their first argument quoted to avoid the XML quoting
syntax, e.g. "id".
·
The control no longer tries to get dictation text from a null hWnd.
·
The Recorded event (or Windows message) has a new parameter profile.
·
The Recorded event (or Windows message) for a false recognition is
sent to all clients that have registered to receive false recognitions.
·
The built-in action Dictate that was associated with edit (and
richedit) windows has been replaced by
StartDictation.
·
The API implements a new interface for examining and changing SAPI
engine properties, SetProperty and
GetProperty.
·
The API methods Speak and SpeakWav now
take an additional parameter, a stream id. The stream id is communicated back
to the application with the Speaking event (or Windows message).
·
The API methods and events are now suitable for automation (and
compatible with Visual Basic).
·
The control always sends the Dictated event (or Windows message)
after dictation, even if the id is omitted.
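The two Listbox selection notes in this list can be modeled with a small
sketch. This is a plain Python model, not the control's code: the function
name speak_item is hypothetical, and two behaviors are assumptions that the
notes do not state, namely that speaking an unselected item in a
multiple-selection listbox selects it, and that a single-selection listbox
replaces its selection.

```python
def speak_item(selected, item, multiple_sel):
    """Model the selection change when a list item's name is spoken.

    selected     -- set of currently selected item names
    item         -- the item name that was spoken
    multiple_sel -- True for a listbox with the LBS_MULTIPLESEL style

    Returns (new_selection, notified).
    """
    new_selection = set(selected)
    if multiple_sel:
        if item in new_selection:
            # From the notes: a selected item is unselected.
            new_selection.remove(item)
        else:
            # Assumption: an unselected item becomes selected.
            new_selection.add(item)
    else:
        # Assumption: single selection replaces the current selection.
        new_selection = {item}
    # From the notes: a change notification is always sent, even if
    # the selection did not actually change.
    return new_selection, True
```

Note that the second element of the result is always True, which reflects
the rule that notifications are sent even when the selection is unchanged.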
SpeechRunner C++
·
SpeechRunner now waits until the test program's log file has been
closed before checking the test results. If an error occurs during the
check, it now reports more useful information.
·
SpeechRunner sometimes could not start a test without waiting for a
client: SpeechPlayer complained that the engine had no grammar.
·
Introduced the SetParameter:
StartWithoutClient switch to tell SpeechPlayer
that we're starting the test even though the engine or process may not be
ready, so it doesn't complain. This parameter is set automatically by
SpeechRunner, under normal circumstances, when "Start
Without Client" is selected in the Run dialog.
·
In some cases quotes from incoming string arguments were not removed.
·
Introduced the AwaitLogText command for
SQT files. This command allows SpeechPlayer to monitor log messages as they
arrive from the app, and delay the next test action until a log message
matches a given pattern.
·
Show Differences did not work correctly if the output line had a "|"
character in it.
·
Standardized the "Test File Completed" message so that it comes out
whenever the test is stopped, as long as the SQT file has been completely
processed.
·
SpeechRunner will no longer complain about missing phrase libraries
when simulating.
·
In rare cases, an infinite loop of repeated errors could occur if an
SQT test was edited to name bad phrase libraries.
·
Implemented SpeechRunner DoSetText
command, so that a user can push specific text into a control from an SQT
test.
·
Replaced the "spreadsheet" grammar entry display with the XML smart
editor.
·
Minor changes and bug-fixes in SpeechStudio.
·
Major changes switching from SAPI 4.0a to SAPI 5.
·
The documentation was improved. Tutorials were enhanced for
SpeechStudio and SpeechRunner.
·
Introduced SpeechCollector and Phrase
Explorer for test suite management
·
Introduced SpeechStudio Suite for SAPI 4.0a
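The idea behind the AwaitLogText command noted in the SpeechRunner C++ list
above (delay the next test action until a log message matches a given
pattern) can be sketched as follows. This is not the SQT implementation. The
function name await_log_text is hypothetical, and because the notes do not
give the pattern syntax, a regular expression is assumed.

```python
import re


def await_log_text(log_lines, pattern):
    """Consume log lines until one matches pattern, then return it.

    log_lines -- an iterator yielding log messages as they arrive
    pattern   -- assumed here to be a regular expression

    Returns None if the log ends without a matching message.
    """
    regex = re.compile(pattern)
    for line in log_lines:
        if regex.search(line):
            # The caller can now proceed with the next test action.
            return line
    return None
```

Because log_lines is consumed lazily, messages that arrive after the match
remain available to later steps of the test.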