Few questions,
What is the difference between the SpeechRecognizer and the
SpeechRecognitionEngine classes? why use one over the other for
speech recognition?
Is the speech recognition widget that I see in Windows 10 when I
start my program has to be shown?
I loaded the SpeechRecognizer object with simple grammar such as
"a", "b", "a r". it recognizes it perfectly but the time it
takes is not ideal for my program, I would like it to be faster, any
way to do that?
I think this has been answered in the past. See Using System.Speech.Recognition opens Windows Speech Recognition, does this help?
In general, you can use System.Speech as inproc or shared. When shared, you see a recognizer "widget" on the screen. If you use an inproc recognizer, you control the recognizer and windows does not add a UI. See good Speech recognition API for some more background.
Related
I'm not sure if it is possible but anyway,
I use using System.Speech.Recognition; in winform C# app.
I'm wondering if it is possible not only to recognize speech, but also recognize voice, somehow recognize difference between different voices
to get something near to reading of multiply content from each separate voice, for example from two simultaneously or separately speaking users as different two.
Or at least maybe some method to control background loudness, for example if AudioLevelUpdated event allows me to see input volume, but maybe also exist some specific way to separate loud voice from extra noise or voices in background
System.Speech.Recognition will not help you in voice recognition.
System.Speech.Recognition is intended for speech to text. Adding grammar to it improves its efficiency. You can train the Windows desktop for better conversion. Refer Speech Recognition in Control Panel.
There are couple of 3rd party libraries available for voice recognition.
For removal of noise, you can refer Sound visualizer in C#.
You can find an interesting discussion at msdn forum.
I think you should take a look at CRIS which is part of Microsoft Cognitive Services, at least for you question about noise.
CRIS is a Custom Speech Service and its basic use is to improve the quality of Speech-to-text using custom acoustics models (like background noise) and learning vocabulary using samples.
You can import :
Acoustic Datasets
Language Datasets
Pronunciation Datasets
For example in acoustic models you have:
Microsoft Conversational Model for recognizing speech spoken in a conversational style (i.e. speech directed at another person).
Microsoft Search and Dictation Model for speech directed to an application, such as commands, search queries or dictation.
There is also a Speaker Recognition API available in preview
Hi, is there a way to use a PromptBuilder to develop a Windows Phone speech recognition app? I would like to make an app that can recognize my voice and do something with it in code, but I have to do it on Windows Phone. Can anyone help expand on that? If I can't use a PromptBuilder, is there a alternative?
The System.Speech.Synthesis namespace only deals with speech synthesis, i.e. speech generation, not speech recognition, as per the docs:
The N:System.Speech.Synthesis namespace contains classes for initializing and configuring a speech synthesis engine, for creating prompts, for generating speech, for responding to events, and for modifying voice characteristics.
There's also System.Speech.Recognition namespace, that deals with speech recognition, but I'm not sure about it's availability on Windows Phone.
For working with speech on Windows Phone you should start here: Speech for Windows Phone 8. They have their own Windows.Phone.Speech.Recognition namespace for this purpose.
I'm using the speechrecognizer in C# to create a basic grammar and listen for commands.
Since I'm developing this grammar/system to be used for a small game/app I'm making, I'd
love to start the speech recognizer in listen mode and invisible (or at the very least condense it
down to the taskbar). My biggest concern is that if the game ever gets any recognition (lol)
that people will be turned off by the obtrusiveness of the windows SR.
So, Can I start the speech recognizer in listen mode, and invisible to the user?
And can I close it when the app is finished so that the user doesn't ever actually deal with
the SR program, just my program/game?
Thanks.
There is a SpeechRecognitionEngine class that allows you to do exactly what I was wanting.
Just wanted to let everyone know. Here is the documentation for that class from microsofts website:
http://msdn.microsoft.com/en-us/library/microsoft.speech.recognition.speechrecognitionengine.aspx
Enjoy.
I know the first comment will be that am duplicating previous threads, but the codes I found (from MSDN) uses window's speech recognition... I'm doing my graduation project and speech recognition is part of it! and I cant include this code,I have to try and do it from scratch, am doing some researches about it and I would be really thankful if someone have already done this and can give me a link for a paper or a code I can benefit from !
Thanks in advance!
Microsoft Server Speech Platform 10.1 (SR and TTS in 26 languages)
http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=24003
The basic operations that speech recognition applications perform:
Starting the speech recognizer.
Creating a recognition grammar.
Loading the grammar into a speech recognizer.
Registering for speech recognition event notification.
Creating a handler for the speech recognition event.
Language Packs
http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=3971
Runtime Download
http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=24974
These libraries could, at least, give you a starter to understand the makeup of the interfaces, and a starter of the core/base code to copy/steal or emulate ;)
There's also this paper:
http://www.cs.nyu.edu/~mohri/pub/hbka.pdf
Best of Luck!
Writing the code that implements the basic recognition algorithm (Hidden Markov Model based recognizers are the norm these days) is only part of your challenge. Virtually every speech recognition system is trained on actual speech data, so you also have to identify a corpus (collection of audio files and transcriptions) to train your mathematical models.
Have a look at the open source Sphinx speech recognizer (and related tools) from CMU if you are still interested in doing it all by yourself.
I've made an app that uses the SpeechRecognizer class to setup a simple grammar and recognize simple words.
When I run it on Win7 I notice two things.
1) The first time I start the app the Speech recognition bar (thingy) comes up, but the UI of my app is not shown (it is running as I can see in the Task Manager).
When I start the app for the 2nd time (after killing the first instance) it display normally (with the windows speech recognition toolbar already running).
2) When I speak one of the words I'm recognizing in my app a 2nd time, it does not trigger an event -instead- it selects the text on my app where I print out in a listbox the history of the recognized words.
Note: When I remove the history listbox from the main screen, it works as expected. Apperently Win7 tries to find the word in my UI first and when it cannot find it - only then does it trigger my programmatic event...??
Both problems seem very weird to me.
More info on the app: Its a VS2008/.NET 3.0 WPF app written in C#. The application allows a user to edit setting groups (patches) for sending Midi commands. Each Patch is tagged with a phrase. When that Phrase is spoken (recognized by the app) all configured Midi commands are sent to the output(s). The history of patches that were recalled by the user are printed in a 'history' list on the apps main screen.
I hope someone can help me with this. Any suggestions are most welcome.
Thanx,
Marc Jacobi
I think you are using the shared speech recognizer (SpeechRecognizer). When you instantiate
SpeechRecognizer you get a recognizer that can be shared by other applications and is typically used for building applications to control windows and applications running on the desktop.
It sounds like you want to use your own private recognition engine (SpeechRecognitionEngine). So instantiate a SpeechRecognitionEngine instead.
see http://msdn.microsoft.com/en-us/library/system.speech.recognition.speechrecognizer(v=vs.90).aspx
What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition? and Disable built-in speech recognition commands? may also have some helpful info.
I got it working, thanx!
The main difference between using the SpeechRecognizer and the SpeechRecognitionEngine are:
Construct the SpeechRecognitionEngine using a RecognizerInfo from InstalledRecognizers.
Call one of the SetInputToXxxx methods
Call RecognizeAsync(RecognizeMode.Multiple) to mimic the SpeechRecognizer (SpeechRecognized) events.
Call RecognizeCancel/Stop to quit.
Hope it helps.