Microsoft Speech Recognition comes with a Speech Reference Card, which consists of a set of predefined words and commands that are recognized.
I want to know if it's possible to disable it. Is it?
EDIT:
I want to remove all pre-defined commands.
These ones:
http://windows.microsoft.com/en-us/windows-vista/Common-commands-in-Speech-Recognition
EDIT2:
I'm using SpeechLib!
You probably want an in-process recognizer, instead of a shared recognizer.
Since you're using C# with System.Speech.Recognition, you need the SpeechRecognitionEngine class.
In particular, you also need to set the audio input of the recognizer using SetInputToDefaultAudioDevice, so that the in-proc recognizer knows where to get the audio from.
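Putting those two pieces together, a minimal sketch of an in-process recognizer might look like this (the "open"/"close" command words are just placeholders for your own grammar):

```csharp
using System;
using System.Speech.Recognition;

class InProcRecognizer
{
    static void Main()
    {
        // In-process recognizer: its grammars are private to this app,
        // unlike the shared recognizer used by Windows Speech Recognition,
        // so none of the built-in system commands are active.
        using (var recognizer = new SpeechRecognitionEngine())
        {
            // Load only your own grammar (placeholder command words).
            var commands = new Choices("open", "close");
            recognizer.LoadGrammar(new Grammar(new GrammarBuilder(commands)));

            // Tell the in-proc engine where to get its audio from.
            recognizer.SetInputToDefaultAudioDevice();

            recognizer.SpeechRecognized += (s, e) =>
                Console.WriteLine("Recognized: " + e.Result.Text);

            recognizer.RecognizeAsync(RecognizeMode.Multiple);
            Console.ReadLine(); // keep the process alive while listening
        }
    }
}
```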
Trying to change the code to use what you said, I discovered what I needed!
With this command:
recGrammar.SetGrammarState(SPGRAMMARSTATE.SPGS_EXCLUSIVE);
all worked!
You can find more info here!
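For context, a sketch of how that call fits into a SpeechLib setup (this uses the automation interfaces of the "Microsoft Speech Object Library" COM reference; the object and property names are the automation-model equivalents of the raw SAPI call above):

```csharp
using SpeechLib; // COM reference: "Microsoft Speech Object Library"

class ExclusiveGrammar
{
    static void Main()
    {
        // Shared recognizer context (the one Windows Speech Recognition uses).
        var context = new SpSharedRecoContext();
        var recGrammar = context.CreateGrammar(0);

        // Activate dictation for this grammar.
        recGrammar.DictationSetState(SpeechRuleState.SGDSActive);

        // Exclusive state: only this grammar is active, which suppresses
        // the built-in Speech Reference Card commands. On the raw
        // ISpRecoGrammar interface this is the equivalent of
        // SetGrammarState(SPGRAMMARSTATE.SPGS_EXCLUSIVE).
        recGrammar.State = SpeechGrammarState.SGSExclusive;

        System.Console.ReadLine(); // keep the process alive while listening
    }
}
```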
Related
I'm actually creating a homemade assistant, and the default speech synthesizer (in C#) doesn't have a nice voice. I would like to know if it is possible, and how, to use Cortana's voice and pronunciation.
You can see how to tune Cortana's responses via this link:
https://learn.microsoft.com/en-us/cortana/skills/speech-synthesis-markup-language
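As a rough illustration, an SSML document that requests a specific voice looks like this (the voice name here is an assumption; the available names depend on what's installed on the machine, as in the superuser link):

```xml
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-US">
  <!-- "Microsoft Eva Mobile" is an assumed voice name for illustration -->
  <voice name="Microsoft Eva Mobile">
    Hello, this is a test of the Cortana voice.
  </voice>
</speak>
```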
And you can output text-to-speech (TTS) this way:
https://learn.microsoft.com/en-us/azure/cognitive-services/speech/api-reference-rest/bingvoiceoutput
You can also change the Windows default voice to Cortana:
https://superuser.com/questions/1141840/how-to-enable-microsoft-eva-cortanas-voice-on-windows-10
Does this help?
I'm not sure if it is possible, but anyway:
I use System.Speech.Recognition in a WinForms C# app.
I'm wondering if it is possible not only to recognize speech, but also to recognize voices, i.e. to somehow tell different voices apart,
so that I could read separate content from each voice, for example treating two users speaking simultaneously or separately as two different sources.
Or at least maybe there is some method to control background loudness: the AudioLevelUpdated event lets me see the input volume, but maybe there is also some specific way to separate a loud voice from extra noise or voices in the background.
System.Speech.Recognition will not help you with voice (speaker) recognition.
System.Speech.Recognition is intended for speech-to-text. Adding grammars to it improves its accuracy, and you can train the Windows desktop recognizer for better conversion; see Speech Recognition in the Control Panel.
There are a couple of third-party libraries available for speaker recognition.
For removal of noise, you can refer to the Sound visualizer in C#.
You can find an interesting discussion on the MSDN forum.
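Regarding the background-loudness part of the question, a minimal sketch of using the AudioLevelUpdated event to monitor input volume (the threshold value is an arbitrary example; the event reports level but cannot by itself separate one voice from another):

```csharp
using System;
using System.Speech.Recognition;

class LevelMonitor
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine())
        {
            recognizer.LoadGrammar(new DictationGrammar());
            recognizer.SetInputToDefaultAudioDevice();

            // AudioLevelUpdated reports the input volume (0-100). You can
            // use it to warn the user when background noise is high.
            recognizer.AudioLevelUpdated += (s, e) =>
            {
                if (e.AudioLevel > 80) // arbitrary example threshold
                    Console.WriteLine("Input is very loud: " + e.AudioLevel);
            };

            recognizer.RecognizeAsync(RecognizeMode.Multiple);
            Console.ReadLine(); // keep the process alive while listening
        }
    }
}
```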
I think you should take a look at CRIS, which is part of Microsoft Cognitive Services, at least for your question about noise.
CRIS is the Custom Speech Service; its basic use is to improve the quality of speech-to-text using custom acoustic models (e.g. for background noise) and by learning vocabulary from samples.
You can import:
- Acoustic Datasets
- Language Datasets
- Pronunciation Datasets
For example, among the acoustic models you have:
- the Microsoft Conversational Model, for recognizing speech spoken in a conversational style (i.e. speech directed at another person);
- the Microsoft Search and Dictation Model, for speech directed at an application, such as commands, search queries, or dictation.
There is also a Speaker Recognition API, available in preview.
I'm trying to get Bing Speech. I followed this link:
www.microsoft.com/cognitive-services/en-us/speech-api. It says: "Show Quota is temporarily unavailable. We are working hard to bring an improved version back in the middle of October." and I can't see it in the list; there is only Bing Search.
I found www.microsoft.com/cognitive-services/en-us/speech-api/documentation/overview, pressed "Get started for free", and got two keys.
Now I have to find basic code to use it. I found this: www.github.com/Microsoft/Cognitive-Speech-STT-Windows. Actually, I've just moved my code from a Windows Forms application to WPF with the same interface, and it works without any special changes, but the fact is I'm new to WPF, so I'm not sure what I have to use to get the needed output.
Maybe I just have to follow the instructions here: https://www.microsoft.com/cognitive-services/en-us/Speech-api/documentation/GetStarted/GetStartedCSharpWin10
Assuming you're interested in a C# sample, maybe this can help: https://github.com/Microsoft/BotBuilder-Samples/tree/master/CSharp/intelligence-SpeechToText. It is a sample speech-to-text bot, which uses the Bing Speech API. The bot reads the audio file that the user uploaded, and then converts it into text. Obviously, you can leverage the same code to run speech-to-text for non-bot applications.
The Narrator API (System.Speech) is not available for Windows Store apps.
Is there an alternative API or method for using this?
I found that there is a text-to-speech function as part of the Microsoft Translator service; however, since the Narrator is already available, it would be silly to have to use that service for this, so I'd rather use it directly if possible.
I thought of maybe running a command through cmd.exe to speak some text, but Windows Store apps can't launch external processes, and I haven't found a command-line interface for the Narrator anyway.
Does anybody know of any method of doing this?
Windows 8.1 has offline text-to-speech support. Here's an MSDN sample.
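The relevant WinRT API is Windows.Media.SpeechSynthesis. A minimal sketch, assuming a page with a MediaElement named "mediaElement" in its XAML:

```csharp
using Windows.Media.SpeechSynthesis;
using Windows.UI.Xaml.Controls;

// Inside a Windows Store (WinRT) app, e.g. in a page's code-behind.
// Assumes a MediaElement named "mediaElement" exists in the XAML.
private async void SpeakAsync(string text)
{
    using (var synthesizer = new SpeechSynthesizer())
    {
        // Synthesize the text to an audio stream (works offline on 8.1+).
        SpeechSynthesisStream stream =
            await synthesizer.SynthesizeTextToStreamAsync(text);

        // Play the stream through the MediaElement.
        mediaElement.SetSource(stream, stream.ContentType);
        mediaElement.Play();
    }
}
```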
Are you trying to implement accessibility? If so, then this may be what you're looking for:
https://msdn.microsoft.com/library/windows/apps/windows.ui.xaml.automation.automationproperties
You can set the Name attached property to whatever you would like to have read.
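For example, in XAML (the button content and name text here are just placeholders):

```xaml
<!-- Narrator reads the AutomationProperties.Name for this element -->
<Button Content="▶"
        AutomationProperties.Name="Play the selected track" />
```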
I am writing my first speech recognition application by using the System.Speech namespace of .NET Framework 4.0.
I am using the shared speech recognition, loading a default dictation grammar and custom grammars I've done.
I also capture the text recognized by the Windows Speech Recognizer (WSR) by implementing a handler for the event "SpeechRecognized".
I would like to change the recognized text (e.g. to replace "two" with "2"), but if I do that, the output is not written into the current app (e.g. MS Word).
I know I can do something SIMILAR by using the SendKeys method, but I think it's not a good idea because the quality of the output is lower. For example, if you use WSR as a standard user, you will see that after a "." or a new line, the following sentence starts with an uppercase character. There are tons of things you must take into account if you want to write your own output formatter, so I would like to use the one WSR uses when you don't handle the SpeechRecognized event. But... HOW?
(I wouldn't mind using SAPI if necessary.)
Thanks!!
The short answer is that you can't. WSR doesn't have a hook that allows 3rd parties to connect into its dictation pipeline.
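Since there is no hook into WSR's dictation pipeline, the SendKeys fallback mentioned in the question is the usual workaround; a minimal sketch, using an in-process recognizer so the shared recognizer doesn't also emit its own text (with all the formatting caveats the question notes, e.g. no automatic capitalization):

```csharp
using System;
using System.Speech.Recognition;
using System.Windows.Forms; // for SendKeys

class DictationRelay
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine())
        {
            recognizer.LoadGrammar(new DictationGrammar());
            recognizer.SetInputToDefaultAudioDevice();

            recognizer.SpeechRecognized += (s, e) =>
            {
                // Post-process the recognized text, e.g. "two" -> "2".
                string text = e.Result.Text.Replace("two", "2");

                // Send it to the foreground application. Note: this bypasses
                // WSR's own output formatting (capitalization, spacing).
                SendKeys.SendWait(text + " ");
            };

            recognizer.RecognizeAsync(RecognizeMode.Multiple);
            Console.ReadLine(); // keep the process alive while listening
        }
    }
}
```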