Train voice recognition - c#

I have built a voice recognition system in C# using the Microsoft Speech Platform 11.0 (with the Swedish language packs). I use a WAV file as input for the SpeechRecognitionEngine.
The problem is that some of the words (40%) are not recognized at all.
I would like to record some commands (a limited number of Swedish words and/or numbers) to a sound file and import them so that the SpeechRecognitionEngine can understand them.
For example:
Record a user saying the word “Katt” (Swedish for cat) and then tell the SpeechRecognitionEngine that this recording means “Katt”.
Is this possible, or are there better solutions?

Yes, it's possible. Create a lookup file containing your words, implement Soundex in C#, and load the lookup as a grammar. The recognizer will then return whichever word you pronounced, provided that word is in the lookup.
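A minimal sketch of the Soundex part, assuming the classic American Soundex rules. The class and method names (Soundex, Encode, FindMatch) are made up for illustration. Note that these rules were designed for English: the sketch simply skips å, ä and ö, so a Swedish vocabulary would likely need an adapted code table.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class Soundex
{
    // Digit for each letter A..Z; '0' marks letters that carry no code (vowels, H, W, Y).
    private const string Codes = "01230120022455012623010202";

    // Encodes a word as one letter followed by three digits, e.g. "Robert" gives "R163".
    public static string Encode(string word)
    {
        var result = new StringBuilder(4);
        char previous = '0';
        foreach (char letter in word.ToUpperInvariant())
        {
            if (letter < 'A' || letter > 'Z') continue; // assumption: ignore å, ä, ö and punctuation
            char code = Codes[letter - 'A'];
            if (result.Length == 0)
            {
                // The first letter is kept as-is, but its code still suppresses a repeat.
                result.Append(letter);
                previous = code;
                continue;
            }
            if (letter == 'H' || letter == 'W') continue; // H and W do not separate equal codes
            if (code != '0' && code != previous) result.Append(code);
            previous = code; // a vowel resets 'previous', so a repeated code after it counts again
            if (result.Length == 4) break;
        }
        return result.Length == 0 ? string.Empty : result.ToString().PadRight(4, '0');
    }

    // Returns the first vocabulary word that sounds like 'heard', or null if none does.
    public static string FindMatch(IEnumerable<string> vocabulary, string heard)
    {
        string target = Encode(heard);
        foreach (string candidate in vocabulary)
        {
            if (Encode(candidate) == target) return candidate;
        }
        return null;
    }
}
```

With this, Soundex.FindMatch(new[] { "Katt", "Hund" }, "Kat") returns "Katt", because both spellings encode to K300. The matching only helps once the recognizer has produced some text, so it complements a restricted grammar rather than replacing it.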

Related

speech recognition using HMM or MFCC

Please help me with speech recognition using HMMs (hidden Markov models) or MFCCs (mel-frequency cepstral coefficients) in C# or C++.
I want to recognize the words "one", "two", ... up to "ten".
When I say "one", a MessageBox should show the text "one".
You should use a toolkit for this purpose like HTK, Kaldi, etc. which are open-source or you could use a free API like Google Speech API, Microsoft Speech API (SAPI), etc.
It is not really easy to do speech recognition using HMM from scratch. BTW, MFCC is not a machine learning tool like HMM. MFCC is a method of feature extraction which is used to prepare observations for HMM training and decoding.
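To make the "MFCC is feature extraction" point a bit more concrete, here is a small sketch of one step in that pipeline: placing the triangular filters evenly on the mel scale. It assumes the common 2595 * log10(1 + f / 700) mel formula. The names (MelScale, FilterCenters) are illustrative, and the rest of the pipeline (framing, FFT, log energies, DCT) is not shown.

```csharp
using System;

public static class MelScale
{
    // Converts a frequency in Hz to mels using the widely used 2595 * log10(1 + f / 700) formula.
    public static double HzToMel(double hz) => 2595.0 * Math.Log10(1.0 + hz / 700.0);

    // Inverse of HzToMel.
    public static double MelToHz(double mel) => 700.0 * (Math.Pow(10.0, mel / 2595.0) - 1.0);

    // Center frequencies (in Hz) of 'filterCount' filters spaced evenly on the mel scale
    // between lowHz and highHz. The two band edges themselves are not included.
    public static double[] FilterCenters(int filterCount, double lowHz, double highHz)
    {
        double lowMel = HzToMel(lowHz);
        double highMel = HzToMel(highHz);
        var centers = new double[filterCount];
        for (int i = 0; i < filterCount; i++)
        {
            double mel = lowMel + (i + 1) * (highMel - lowMel) / (filterCount + 1);
            centers[i] = MelToHz(mel);
        }
        return centers;
    }
}
```

The resulting centers are packed densely at low frequencies and spread out at high ones, which roughly mirrors how human hearing resolves pitch. The feature vectors built from those filters are what an HMM would then be trained on.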

Search an Audio inside another Audio in C# or C++

Suppose there is a sample audio file that contains up to 10 simple words
"One Two Three .... Ten"
and there is 1 second silence between each number in the audio file.
I want to check to see if the audio file contains keyword "Two" for example.
Please note that I have a voice file of the keyword "Two", and it is the exact same voice as in the master voice file, but it may contain some noise.
Is there a way for me to search for the sound of "Two" inside the bigger audio file and find the time at which it occurs?
Since no code was provided, I'll just give you an idea of how to proceed; I hope it helps.
First, you have to split your file into 10 separate audio files at the silences (I'm sure there are libraries that will help you do that).
Then you can send each file to the Google voice recognition API and get back a string containing the text that was spoken in that file.
EDIT: Please refer to:
https://googlespeechtotext.codeplex.com/
How to use google speech recognition api in c#?
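The splitting step from the answer above could look roughly like this. It is a sketch that assumes the audio has already been decoded into mono samples in the range -1..1 (decoding is not shown), and it uses a plain amplitude threshold. SilenceSplitter, threshold and minSilenceSamples are illustrative names, and the values would need tuning for noisy recordings.

```csharp
using System;
using System.Collections.Generic;

public static class SilenceSplitter
{
    // Returns (Start, End) sample ranges, End exclusive, for stretches of sound that are
    // separated by at least minSilenceSamples consecutive samples quieter than threshold.
    public static List<(int Start, int End)> Split(float[] samples, float threshold, int minSilenceSamples)
    {
        var segments = new List<(int Start, int End)>();
        int start = -1;    // first loud sample of the current segment, -1 when outside a segment
        int lastLoud = -1; // most recent loud sample seen so far
        for (int i = 0; i < samples.Length; i++)
        {
            bool loud = Math.Abs(samples[i]) >= threshold;
            if (loud)
            {
                if (start < 0) start = i;
                lastLoud = i;
            }
            else if (start >= 0 && i - lastLoud >= minSilenceSamples)
            {
                // Enough silence has passed: close the segment at the last loud sample.
                segments.Add((start, lastLoud + 1));
                start = -1;
            }
        }
        if (start >= 0) segments.Add((start, lastLoud + 1)); // audio ended inside a segment
        return segments;
    }
}
```

Dividing Start by the sample rate gives the time at which each word begins. With one second of silence between the numbers, setting minSilenceSamples to about half the sample rate should be a reasonable starting point.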
Why don't you convert both audio samples into raw sample data (a common bit depth and signal format) and check whether the keyword's signal occurs somewhere in the larger one?
Some of the links you should check before going any further just to work out with audio in .Net:
http://crsouza.com/2009/08/converting-audio-bit-depths-in-c/
https://cscore.codeplex.com/
http://www.codeproject.com/Articles/501521/How-to-convert-between-most-audio-formats-in-NET
Let me know if you can work this out.
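One concrete way to compare the two signals is normalized cross-correlation. That is my own reading of the suggestion above, not something the answer spells out. The sketch below assumes both files have already been decoded to mono float samples at the same sample rate; AudioSearch and FindBestOffset are illustrative names.

```csharp
using System;

public static class AudioSearch
{
    // Slides 'needle' over 'haystack' and returns the sample offset with the highest
    // normalized cross-correlation. 'score' is that correlation (1.0 means identical shape).
    // Returns -1 if the needle does not fit or no offset correlates positively.
    public static int FindBestOffset(float[] haystack, float[] needle, out double score)
    {
        score = 0.0;
        int best = -1;
        if (needle.Length == 0 || needle.Length > haystack.Length) return -1;

        double needleEnergy = 0.0;
        foreach (float s in needle) needleEnergy += s * s;

        for (int offset = 0; offset + needle.Length <= haystack.Length; offset++)
        {
            double dot = 0.0, windowEnergy = 0.0;
            for (int i = 0; i < needle.Length; i++)
            {
                double h = haystack[offset + i];
                dot += h * needle[i];
                windowEnergy += h * h;
            }
            // Normalizing by both energies makes the score independent of volume.
            double denominator = Math.Sqrt(needleEnergy * windowEnergy);
            double current = denominator > 0.0 ? dot / denominator : 0.0;
            if (current > score)
            {
                score = current;
                best = offset;
            }
        }
        return best;
    }
}
```

The occurrence time in seconds is the returned offset divided by the sample rate. This brute-force loop costs needle length times haystack length operations, so for long recordings an FFT-based correlation would be the usual next step. A low score suggests the keyword is not really present.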

Add new word to windows speech recognition using C#

I know how to use speech recognition in C#, but how can I add a special word or name to the Windows speech dictionary?
In Windows 7 and 8 you can do it easily by hand:
Open the Speech Dictionary > Add a new word > Enter the text of the word > Record the pronunciation of the word with the microphone
After that, the word is added to the dictionary.
Words can also be edited later using the Speech Dictionary.
Does anyone know how to perform these steps programmatically with .NET?
EDIT:
To put it simply: the Windows speech dictionary has a limited vocabulary. How can we add other words to it via .NET and C#?
For example, the name "Salad" doesn't exist in the Windows speech dictionary. How can I add this word and its pronunciation?
Sorry, I'm a bit new to this great site.
You'll need to use the SAPI Automation APIs (aka SpeechLib) to access the ISpLexicon interfaces.
In particular, ISpLexicon::AddPronunciation will add a new word (and its associated pronunciation) to the user lexicon.
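A rough sketch of what that call can look like from C#. It assumes the project has a COM reference to the "Microsoft Speech Object Library" (which provides the SpeechLib interop namespace) and runs on Windows with SAPI installed. The UserLexicon wrapper, the language ID 1033 (US English) and the phoneme string are illustrative assumptions, not verified values for your setup.

```csharp
using System;
using SpeechLib; // assumption: COM reference "Microsoft Speech Object Library" is added to the project

public static class UserLexicon
{
    // Adds 'word' with the given SAPI phoneme string to the current user's lexicon.
    // langId 1033 is US English; use the language ID that matches your recognizer.
    public static void AddWord(string word, string sapiPhonemes, int langId = 1033)
    {
        var lexicon = new SpLexicon();
        lexicon.AddPronunciation(word, langId, SpeechPartOfSpeech.SPSNoun, sapiPhonemes);
    }
}
```

A call such as UserLexicon.AddWord("Salad", "s ae l ax d") would then register the word. The phoneme string here is only my guess at a plausible SAPI transcription, so check it against the phoneme table for your language before relying on it.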

Extract word timings from MP3 audio file

I want to capture the timing of each word in an audio file.
For example, suppose one MP3 file (page1.mp3) contains the words "Introduction Hi Bangalore". How can I extract the starting time of each word,
i.e., the starting times of "Introduction", "Hi" and "Bangalore"?
Is this possible? If so, how can I do it?
Is there any software available for this?
I have tried to extract the timings using the System.Speech API and its BookmarkReached event, but the result is not as accurate as I want.

Specifying a pronunciation of a word in Microsoft Speech API

I'm working on a small application in C# which performs speech recognition using Microsoft Speech API.
I need to add some non-English words to the grammar, whose pronunciations don't follow English pronunciation rules.
Is it possible to specify their pronunciation using the International Phonetic Alphabet?
If so, which methods should be used?
The way to achieve custom pronunciation here is by passing an SrgsDocument to the Grammar constructor. This allows specification per http://www.w3.org/TR/speech-grammar/.
I have not done this and it looks non-trivial, but this ought to allow you to do what you want.
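For what it's worth, a sketch of how this might look with the System.Speech SRGS classes. It assumes a Windows project referencing System.Speech. The PronunciationGrammar helper, the rule id and the example word are hypothetical, and any IPA string you pass must use phones that the installed recognizer actually supports for its language.

```csharp
using System;
using System.Speech.Recognition;
using System.Speech.Recognition.SrgsGrammar;

public static class PronunciationGrammar
{
    // Builds a one-word grammar in which 'word' is recognized by the given IPA pronunciation.
    public static Grammar Build(string word, string ipaPronunciation)
    {
        var document = new SrgsDocument();
        document.PhoneticAlphabet = SrgsPhoneticAlphabet.Ipa; // pronunciations below are IPA

        // The token carries both the text returned on recognition and how it should sound.
        var token = new SrgsToken(word) { Pronunciation = ipaPronunciation };

        var rule = new SrgsRule("customWord", token); // "customWord" is an arbitrary rule id
        document.Rules.Add(rule);
        document.Root = rule;

        return new Grammar(document);
    }
}
```

The returned Grammar can be passed to SpeechRecognitionEngine.LoadGrammar like any other grammar. If the pronunciation contains phones the engine does not know, loading is likely to fail, so it is worth testing each word individually.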