I've made an app that uses the SpeechRecognizer class to set up a simple grammar and recognize simple words.
When I run it on Win7 I notice two things.
1) The first time I start the app, the Speech Recognition bar (thingy) comes up, but the UI of my app is not shown (it is running, as I can see in Task Manager).
When I start the app a 2nd time (after killing the first instance), it displays normally (with the Windows Speech Recognition toolbar already running).
2) When I speak one of the words I'm recognizing in my app a 2nd time, it does not trigger an event; instead, it selects the text in my app's history listbox, where I print out the recognized words.
Note: When I remove the history listbox from the main screen, it works as expected. Apparently Win7 tries to find the word in my UI first, and only when it cannot find it does it trigger my programmatic event...??
Both problems seem very weird to me.
More info on the app: It's a VS2008/.NET 3.0 WPF app written in C#. The application allows a user to edit setting groups (patches) for sending Midi commands. Each Patch is tagged with a phrase. When that Phrase is spoken (recognized by the app), all configured Midi commands are sent to the output(s). The history of patches that were recalled by the user is printed in a 'history' list on the app's main screen.
I hope someone can help me with this. Any suggestions are most welcome.
Thanx,
Marc Jacobi
I think you are using the shared speech recognizer (SpeechRecognizer). When you instantiate
SpeechRecognizer you get a recognizer that can be shared by other applications and is typically used for building applications to control Windows and applications running on the desktop.
It sounds like you want to use your own private recognition engine (SpeechRecognitionEngine). So instantiate a SpeechRecognitionEngine instead.
See http://msdn.microsoft.com/en-us/library/system.speech.recognition.speechrecognizer(v=vs.90).aspx
The questions "What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?" and "Disable built-in speech recognition commands?" may also have some helpful info.
I got it working, thanx!
The main differences between using the SpeechRecognizer and the SpeechRecognitionEngine are:
Construct the SpeechRecognitionEngine using a RecognizerInfo from InstalledRecognizers.
Call one of the SetInputToXxxx methods.
Call RecognizeAsync(RecognizeMode.Multiple) to mimic the SpeechRecognizer (SpeechRecognized) events.
Call RecognizeAsyncCancel/RecognizeAsyncStop to quit.
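For anyone landing here later, the steps above can be sketched roughly as follows. This is only a minimal console sketch, not the actual app: the phrases are hypothetical stand-ins for the patch names, it assumes at least one recognizer is installed and simply takes the first one, and it needs a reference to System.Speech.dll.

```csharp
using System;
using System.Speech.Recognition; // needs a reference to System.Speech.dll (Windows only)

class InProcRecognizerSketch
{
    static void Main()
    {
        // Step 1: construct the engine from an installed recognizer.
        // Assumption: we just take the first installed recognizer.
        var installed = SpeechRecognitionEngine.InstalledRecognizers();
        if (installed.Count == 0)
        {
            Console.WriteLine("No speech recognizer is installed.");
            return;
        }

        using (var engine = new SpeechRecognitionEngine(installed[0]))
        {
            // Hypothetical phrases standing in for the patch names.
            var builder = new GrammarBuilder(new Choices("patch one", "patch two", "patch three"));
            // The grammar culture must match the recognizer culture.
            builder.Culture = engine.RecognizerInfo.Culture;
            engine.LoadGrammar(new Grammar(builder));

            // Same event name as on the shared SpeechRecognizer.
            engine.SpeechRecognized += (sender, e) =>
                Console.WriteLine("Recognized: " + e.Result.Text);

            // Step 2: pick an input; here the default microphone.
            engine.SetInputToDefaultAudioDevice();

            // Step 3: keep recognizing until told to stop.
            engine.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening; press Enter to quit.");
            Console.ReadLine();

            // Step 4: cancel the asynchronous recognition before disposing.
            engine.RecognizeAsyncCancel();
        }
    }
}
```

Because the engine runs in-process, the Windows Speech Recognition toolbar should not appear and the desktop's own voice commands should not interfere with the app's UI.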
Hope it helps.
A few questions:
What is the difference between the SpeechRecognizer and the
SpeechRecognitionEngine classes? Why use one over the other for
speech recognition?
Does the speech recognition widget that I see in Windows 10 when I
start my program have to be shown?
I loaded the SpeechRecognizer object with a simple grammar such as
"a", "b", "a r". It recognizes it perfectly, but the time it
takes is not ideal for my program. I would like it to be faster; is there any
way to do that?
I think this has been answered in the past. See "Using System.Speech.Recognition opens Windows Speech Recognition"; does this help?
In general, you can use System.Speech as inproc or shared. When shared, you see a recognizer "widget" on the screen. If you use an inproc recognizer, you control the recognizer and Windows does not add a UI. See "good Speech recognition API" for some more background.
So I created a UWP app that can record several audio lines and save the recordings to MP3 files for in-game multi-line recording that I can later edit separately (game audio, microphone, game comms, voice comms), as NVidia ShadowPlay/Share does not support this yet. I achieve this multi-line setup with VAC.
I have a version of this tool written in regular Windows WPF C# and I have a system-wide HotKey Ctrl+Alt+R that starts/stops recording so when I'm in a full screen game, I can start/stop recording without exiting full screen mode (switching window focus).
Can a global (system wide, app window not in focus) HotKey that triggers some in-App event be achieved in a UWP App? I know the functionality is not supported for other platforms. But I only need it to run on Windows 10 Desktop and the HotKey support is mandatory. Or can I achieve my goal in any other way for UWP Apps?
GOAL: A system-wide key combination that triggers an in-app event in a UWP app without switching window focus and messing with full-screen games.
At the moment it is not possible to solve this task completely.
You are facing two limitations of UWP, which can only be partially worked around:
Lifecycle: UWP apps go into a suspended state when they are not focused. They just "block" to consume fewer resources (and less battery). This is a great feature for mobile devices, but it is bad news for your project. You can solve this by requesting an "ExtendedExecutionSession", which will guarantee that your app never falls asleep when out of focus if it is "attached to wall power".
Detect input without focus: It's clearly stated on MSDN that UWP doesn't support keyboard HOOKS (this refers to SetWindowsHookEx). "GetAsyncKeyState" was reinvented and now works only while the window is focused; you can find it as CoreWindow.GetAsyncKeyState().
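For the lifecycle limitation, requesting the extended execution session could look roughly like the sketch below. The class name BackgroundKeepAlive and the description string are made up for illustration; whether the request is granted (and for how long) is up to the system, so treat this as a starting point rather than a guarantee.

```csharp
using System;
using System.Threading.Tasks;
using Windows.ApplicationModel.ExtendedExecution;

// Hypothetical helper class; only the ExtendedExecution types are real UWP API.
public sealed class BackgroundKeepAlive : IDisposable
{
    private ExtendedExecutionSession session;

    // Returns true if the system allowed the extended execution request.
    public async Task<bool> TryStartAsync()
    {
        var newSession = new ExtendedExecutionSession
        {
            Reason = ExtendedExecutionReason.Unspecified,
            Description = "Keep recording while the app is not in the foreground"
        };
        newSession.Revoked += OnRevoked;

        ExtendedExecutionResult result = await newSession.RequestExtensionAsync();
        if (result != ExtendedExecutionResult.Allowed)
        {
            // Request denied: clean up and let the caller decide what to do.
            newSession.Revoked -= OnRevoked;
            newSession.Dispose();
            return false;
        }

        session = newSession;
        return true;
    }

    // The system can revoke the session (for example due to system policy).
    private void OnRevoked(object sender, ExtendedExecutionRevokedEventArgs args)
    {
        Dispose();
    }

    public void Dispose()
    {
        if (session != null)
        {
            session.Revoked -= OnRevoked;
            session.Dispose();
            session = null;
        }
    }
}
```

You would call TryStartAsync() once when recording starts and Dispose() when it stops. Note that this only addresses suspension; it does nothing for detecting keys while the window is unfocused.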
If you only need to use F keys as hotkeys you can still do something, like "press F2 when the app is minimized to activate a function".
See Stefan Wick's example. He solved part of the problem.
If instead you need to listen to lots of keys (or mouse events), there isn't a way. You can't right now.
Curiosity
UWP has restricted capabilities, one of which is called "InputObservation".
At the moment it is not documented and impossible to implement (unless you are a select Microsoft Partner), but it should allow apps to access system input (keyboard/mouse..) without any limitation and regardless of its final destination.
I think this feature is the key to system-wide input detection.
I am not able to find a way to implement it.
Kind Regards
I'm trying to get Bing Speech by following this link:
www.microsoft.com/cognitive-services/en-us/speech-api it says: "Show Quota is temporarily unavailable. We are working hard to bring an improved version back in the middle of October." and I can't see it in the list, there is only Bing Search.
I found this www.microsoft.com/cognitive-services/en-us/speech-api/documentation/overview I've pressed Get started for free and got two keys.
Now I have to find basic code to use it. I found this: www.github.com/Microsoft/Cognitive-Speech-STT-Windows. Actually, I've just moved my code from a Windows Forms application to WPF with the same interface and it works without any special changes, but the fact is I'm new to WPF, so I'm not sure what I have to use to get the needed output.
Maybe I just have to follow the instructions here: https://www.microsoft.com/cognitive-services/en-us/Speech-api/documentation/GetStarted/GetStartedCSharpWin10
Assuming you're interested in a C# sample, maybe this can help: https://github.com/Microsoft/BotBuilder-Samples/tree/master/CSharp/intelligence-SpeechToText. It is a sample speech-to-text bot, which uses the Bing Speech API. The bot reads the audio file that the user uploaded, and then converts it into text. Obviously, you can leverage the same code to run speech-to-text for non-bot applications.
I'm using the SpeechRecognizer in C# to create a basic grammar and listen for commands.
Since I'm developing this grammar/system to be used for a small game/app I'm making, I'd
love to start the speech recognizer in listen mode and invisible (or at the very least condense it
down to the taskbar). My biggest concern is that if the game ever gets any recognition (lol),
people will be turned off by the obtrusiveness of the Windows SR.
So, can I start the speech recognizer in listen mode, and invisible to the user?
And can I close it when the app is finished so that the user doesn't ever actually deal with
the SR program, just my program/game?
Thanks.
There is a SpeechRecognitionEngine class that allows you to do exactly what I was wanting.
Just wanted to let everyone know. Here is the documentation for that class from Microsoft's website:
http://msdn.microsoft.com/en-us/library/microsoft.speech.recognition.speechrecognitionengine.aspx
Enjoy.
I am making a robot that responds to a few voice commands. I am using Windows XP and C# to achieve that. My only problem is that I don't know how to use speech recognition with C#.
I've been searching Google and MSDN, but I have not found any beginner-friendly tutorial yet.
Any suggestions??
Also, I know (from my experience with Windows' speech recognition in MS Word) that I need to train the computer before starting the speech recognition application. This may cause a big problem for me because I may need to present my robot using different computers, or different people may be the presenters.
So is there any way to make a predefined list of words that any user can say to the application without having to train it first?
Thanks for help!
Yes, you'll have to train anything that uses pattern recognition to respond to things. In Philadelphia, they pronounce "water" as "wudder". How could an algorithm figure that out? A predefined list would require you to have a working knowledge of every accent in the target sales countries.
SAPI 5.4 in Windows 7 does a very good job of recognizing limited command & control grammars without training.
If you keep your command set (grammar) small (say, no more than 10-15 commands), you should be able to get good results.
Dictation or a large command set requires training; there's just too much uncertainty.
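To illustrate the small command set idea, here is a rough sketch using System.Speech. The command phrases and the 0.7 confidence threshold are assumptions for illustration only (tune the threshold for your microphone and speakers), and it assumes a default recognizer is installed on the machine.

```csharp
using System;
using System.Speech.Recognition; // needs a reference to System.Speech.dll (Windows only)

class RobotCommandsSketch
{
    // Hypothetical command set, deliberately kept small.
    static readonly string[] Commands =
        { "forward", "back", "turn left", "turn right", "stop" };

    // Assumed threshold; results below it are ignored.
    const float MinConfidence = 0.7f;

    static void Main()
    {
        // The parameterless constructor uses the system's default recognizer.
        using (var engine = new SpeechRecognitionEngine())
        {
            var builder = new GrammarBuilder(new Choices(Commands));
            builder.Culture = engine.RecognizerInfo.Culture; // must match the recognizer
            engine.LoadGrammar(new Grammar(builder));
            engine.SetInputToDefaultAudioDevice();

            engine.SpeechRecognized += (sender, e) =>
            {
                if (e.Result.Confidence >= MinConfidence)
                    Console.WriteLine("Command: " + e.Result.Text);
                else
                    Console.WriteLine("Ignored low-confidence result: " + e.Result.Text);
            };

            // Raised when the audio did not match any phrase in the grammar.
            engine.SpeechRecognitionRejected += (sender, e) =>
                Console.WriteLine("Not a known command.");

            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Say a command; press Enter to quit.");
            Console.ReadLine();
            engine.RecognizeAsyncStop();
        }
    }
}
```

Since the grammar only contains a handful of phrases, the recognizer only has to pick between those, which is why no per-user training step appears here.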