C# - Free speech recognition Engine library (SDK)
System.Speech.Recognition gives very poor results. I want another SDK that gives good results and works with C# in Visual Studio.
It also has to work OFFLINE, not online like the Google API.
Thanks
I got quite good results in the past using pocketsphinx, or Sphinx if you have more resources available. Check it here:
https://cmusphinx.github.io/
I'm building a Windows Universal App that shows live captions of the lecture a student is currently watching or attending in person. I'm looking for a built-in, free solution for audio-to-text.
macOS has the Speech library https://developer.apple.com/documentation/speech , which we're going to use, but I cannot find a similar one on Windows. I found docs on the Windows.Media package, but cannot figure out whether it actually has an audio-to-text API or only command recognition https://learn.microsoft.com/en-us/uwp/api/windows.media.speechrecognition?view=winrt-22621
Maybe someone has experience building this kind of capability on Windows?
Yes, you can use the Windows.Media.SpeechRecognition API for general speech recognition, not only for command recognition.
You could make a simple test with the official Speech Recognition Sample here: SpeechRecognitionAndSynthesis. Just remember to enable the Online speech recognition (Settings -> Privacy -> Speech).
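To illustrate the dictation (non-command) use of that API, here is a minimal sketch of continuous recognition in a UWP app. The class name LiveCaptioner and the onText callback are made up for this example; it assumes the Microphone capability is declared in the app manifest and that online speech recognition is enabled as described above.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Media.SpeechRecognition;

// Hypothetical helper class, not part of the official sample.
public sealed class LiveCaptioner
{
    private SpeechRecognizer recognizer;

    // onText is a hypothetical callback that receives each recognized phrase.
    public async Task StartAsync(Action<string> onText)
    {
        recognizer = new SpeechRecognizer();

        // A dictation topic constraint asks for free-form text instead of fixed commands.
        recognizer.Constraints.Add(
            new SpeechRecognitionTopicConstraint(SpeechRecognitionScenario.Dictation, "dictation"));

        SpeechRecognitionCompilationResult compiled = await recognizer.CompileConstraintsAsync();
        if (compiled.Status != SpeechRecognitionResultStatus.Success)
        {
            throw new InvalidOperationException("Could not compile constraints: " + compiled.Status);
        }

        // Raised once per finished phrase; this handler runs off the UI thread.
        recognizer.ContinuousRecognitionSession.ResultGenerated +=
            (session, args) => onText(args.Result.Text);

        await recognizer.ContinuousRecognitionSession.StartAsync();
    }

    public async Task StopAsync()
    {
        await recognizer.ContinuousRecognitionSession.StopAsync();
        recognizer.Dispose();
    }
}
```

Because ResultGenerated is not raised on the UI thread, a caption control would have to be updated through the page's dispatcher.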
I am trying to develop a humble AI to help the daily routines of my family.
Voice recognition is a must, and commands cannot be limited to a command library.
So command library mode is off the table.
I tried dictation mode, which already has terrible recognition with a headset; it won't be able to understand anything with a room mic.
So I am trying to use Microsoft Cognitive Services: Bing Speech Recognition:
I downloaded the documentation and the example, and I see everything is in XAML form. I don't understand why.
I am asking for some guidance from those who are experienced in this: is it possible to do this in a console app or Windows Forms? (I am using .NET 4.6.)
If not, do you have any suggestions for solving my problem?
Thank you for your time and patience.
You can use the NuGet packages to achieve the same. Find a sample here: https://github.com/Azure-Samples/Cognitive-Speech-STT-Windows
Further details about the Bing Speech SDK: https://learn.microsoft.com/en-us/azure/cognitive-services/Speech/GetStarted/GetStartedCSharpDesktop
You can use the same for a console app as well; you just need to define the input segment from MicIn.
I would like to write a program in C# that includes limited vocabulary speech recognition of languages such as Finnish or Polish. Microsoft's Speech SDK works great for English but can it support other languages like those? If not, what other (hopefully affordable) software tools are available?
Have a look at Microsoft Server Speech Platform 10.2. It supports both STT and TTS
for 26 languages, including Finnish and Polish!
Here's a link that will get you started.
http://www.codeproject.com/KB/audio-video/TTSandSR.aspx
A bit late post, sorry for that.
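As a rough sketch of limited-vocabulary recognition with that platform: the code below assumes the Speech Platform runtime, its SDK (which provides the Microsoft.Speech assembly) and a Polish recognizer language pack are installed. The word list is made up for illustration.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition; // from the Speech Platform SDK (assumed installed)

class LimitedVocabularyDemo
{
    static void Main()
    {
        // List the recognizer languages that are actually installed on this machine.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine("{0} ({1})", info.Name, info.Culture);
        }

        // Throws if no pl-PL recognizer is installed.
        using (var engine = new SpeechRecognitionEngine(new CultureInfo("pl-PL")))
        {
            // Hypothetical small vocabulary.
            var words = new Choices("tak", "nie", "start", "stop");
            var builder = new GrammarBuilder(words) { Culture = engine.RecognizerInfo.Culture };

            engine.LoadGrammar(new Grammar(builder));
            engine.SetInputToDefaultAudioDevice();
            engine.SpeechRecognized += (sender, e) =>
                Console.WriteLine("Heard: {0} ({1:F2})", e.Result.Text, e.Result.Confidence);

            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Listening; press Enter to quit.");
            Console.ReadLine();
        }
    }
}
```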
So, you've all probably seen Iron Man where Tony interacts with an AI system called Jarvis. Demo clip here (Sorry it's a commercial).
I'm very familiar with C#, C++ and Visual Basic, but I am unsure what options I have available for me to program something like this. Ideally, I'd like to have it assist me while working on some projects by automating a few things.
After doing a bit of research, I saw that a lot of people were using AppleScript. Well, I'm a Windows developer and I work on Windows, so that won't work.
Microsoft has a Speech SDK, but I hear that I can't program it to learn custom words, as in it just uses its standard library. Is this true? What are the other limitations of speech recognition with the SDK? Is there something else I could use instead?
Also, which language would be better to use for a project like this? C# or VB?
The .NET 3.0 System.Speech.Recognition namespace has very elegant .NET wrapper classes around the SAPI SDK, including the Grammar class to customize the recognition. As usual, any .NET-enabled language can take advantage of it, so the specific language doesn't matter.
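A minimal sketch of what customizing recognition with the Grammar class can look like. The phrases are made up for illustration, and the code assumes a reference to System.Speech.dll, an installed desktop recognizer and a default microphone.

```csharp
using System;
using System.Speech.Recognition; // reference System.Speech.dll (Windows only)

class GrammarDemo
{
    static void Main()
    {
        // Uses the system default recognizer; throws if none is installed.
        using (var engine = new SpeechRecognitionEngine())
        {
            // Hypothetical command phrases, including a custom word.
            var commands = new Choices("open project", "build solution", "run tests", "Jarvis");

            // Match the grammar culture to the recognizer to avoid a culture mismatch error.
            var builder = new GrammarBuilder { Culture = engine.RecognizerInfo.Culture };
            builder.Append(commands);

            engine.LoadGrammar(new Grammar(builder) { Name = "assistant-commands" });
            engine.SetInputToDefaultAudioDevice();

            engine.SpeechRecognized += (sender, e) =>
                Console.WriteLine("{0} (confidence {1:F2})", e.Result.Text, e.Result.Confidence);

            // Keep recognizing until the program exits.
            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Listening; press Enter to quit.");
            Console.ReadLine();
        }
    }
}
```

The same classes are usable from VB, since they are ordinary .NET types.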
We have an application that we were planning to use the Microsoft Speech API for. We tested it on Windows XP using the Microsoft Sam voice, and frankly it sounds terrible. It's almost impossible to hear what the voice is trying to say.
Are there other, better voices? Are there any updates or newer versions out there that are better? Are there other products, open source projects, etc. that can work as an alternative?
Just to clarify - It needs to have some sort of API so I actually can program against it.
On Windows about the best I have found was using the speech API and voices from AT&T Natural Voices: https://nextup.com/attnv.html
They are however VERY expensive if available at all. I have run into projects where the usage/business model was so far from what AT&T was thinking of that they wouldn't even sell a license.
There is a free software alternative, Festival: http://festvox.org/ . The quality, though, is horrible. It is about 10 years behind the current sound quality of commercial systems. It is, however, free.
A third alternative which has worked well for me was to shift the voice synthesis part of a few projects to OS X. OS X has a decent set of tools and speech APIs and a fairly decent set of stock voices. The downside, of course, is that programs written for these APIs run only under OS X, which runs only on Apple hardware.
The AT&T Natural Voices engine produces great speech, but it's not free.
There is also NeoSpeech, which is also good, but not free either.
You don't describe your licensing needs, so I don't know if any of these will be suitable in that regard, but all of the following are sources of SAPI 5 compatible voices:
Ivona (http://www.ivona.com/) - I'm using their Kendra voice on a SAPI project.
AT&T Natural Voices (http://www2.research.att.com/~ttsweb/tts/)
Loquendo (http://www.loquendo.com/)
Acapela (http://www.acapela-group.com/products/products.asp)
Cepstral (http://www.cepstral.com/)
fonix (http://www.fonixspeech.com/tts.php) - only if you loved the original Speak & Spell.
Nuance RealSpeak (I'm not sure about this one...)
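Once a SAPI 5 voice is installed, it can be selected through System.Speech.Synthesis. A minimal sketch follows; the voice name is expected as the first command-line argument (a made-up convention for this example), and the names printed depend entirely on what is installed on the machine.

```csharp
using System;
using System.Speech.Synthesis; // reference System.Speech.dll (Windows only)

class VoiceDemo
{
    static void Main(string[] args)
    {
        using (var synth = new SpeechSynthesizer())
        {
            // Print every voice visible to this process.
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                VoiceInfo info = voice.VoiceInfo;
                Console.WriteLine("{0} [{1}] enabled={2}", info.Name, info.Culture, voice.Enabled);
            }

            // Optionally switch to a voice by its exact name, as printed above.
            // SelectVoice throws ArgumentException if the name is not installed or not enabled.
            if (args.Length > 0)
            {
                synth.SelectVoice(args[0]);
            }

            synth.SetOutputToDefaultAudioDevice();
            synth.Speak("This is a test of the selected voice.");
        }
    }
}
```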
You can use the free and open source Festival. The default Festival voice sounds a little like Stephen Hawking, but you can use other, much better HTS voices. For example, try selecting the Peter HTS 2011 voice on this demo page: http://www.cstr.ed.ac.uk/projects/festival/morevoices.html. Most of the HTS voices for Festival that I've seen are not allowed for commercial use; however, this one seems to be free: http://homepages.inf.ed.ac.uk/jyamagis/software/page54/page54.html
You can check this YouTube tutorial: http://www.youtube.com/watch?v=MmcLFJQpv2o