How can I perform speech recognition on speech coming from an audio file (.mp3, .wav) instead of the microphone?
I want to be able to do that from C#.NET and Delphi.
This article answers your question specifically:
Using WAV File Input with SR Engines
http://msdn.microsoft.com/en-us/library/ms717071(VS.85).aspx
See the following articles for general info:
http://msdn.microsoft.com/en-us/magazine/cc163663.aspx
http://en.wikipedia.org/wiki/Speech_Application_Programming_Interface
http://msdn.microsoft.com/en-us/library/ms723627(VS.85).aspx
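For the C# side, the managed System.Speech wrapper around SAPI exposes the same idea through SpeechRecognitionEngine.SetInputToWaveFile. The following is a minimal sketch, not a definitive implementation. It assumes a Windows machine with an en-US recognizer installed, a project reference to System.Speech, and a hypothetical input file named speech.wav. An MP3 file would first have to be converted to WAV.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

class WavFileRecognition
{
    static void Main(string[] args)
    {
        // Hypothetical default path; pass a real WAV file as the first argument.
        string wavPath = args.Length > 0 ? args[0] : "speech.wav";

        // Assumption: an en-US recognizer is installed on this machine.
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Free-form dictation; a custom Grammar could be loaded instead.
            recognizer.LoadGrammar(new DictationGrammar());

            // Read audio from the file rather than the default microphone.
            recognizer.SetInputToWaveFile(wavPath);

            // Recognize() handles one phrase per call and returns null when
            // nothing more is recognized (end of input or a timeout).
            RecognitionResult result;
            while ((result = recognizer.Recognize()) != null)
            {
                Console.WriteLine(result.Text);
            }
        }
    }
}
```

Because the loop stops at the first null result, a long silence in the recording may end it early. The event-based RecognizeAsync API is the usual alternative for longer files.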
Related
I found the NAudio library while making a simple audio editor program.
I want to output a WAV (*.wav) file to the microphone input stream.
This is not something natively supported by Windows Audio APIs, and so NAudio isn't able to help you with this. You need an audio 'loopback' device driver like this, or physically connect an output of your soundcard to an input with an audio cable.
I just want to know if there are any built-in or external libraries in Java or C# that allow me to take an audio file, parse it, and extract the text from it.
You can use the built-in .NET speech recognition APIs to accomplish this. MSDN has a set of complete samples that read an audio file and then write the recognised speech to the console. With a little bit of work they can be modified to output to a plain text file:
http://msdn.microsoft.com/en-us/library/system.speech.recognition.speechrecognitionengine.setinputtoaudiostream.aspx
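As a rough sketch of that modification, the code below feeds a file stream to SetInputToAudioStream and writes each recognized phrase to a text file instead of the console. The file names speech.wav and transcript.txt are hypothetical, and the 44.1 kHz, 16-bit, mono format is an assumption that has to match the actual recording.

```csharp
using System;
using System.IO;
using System.Speech.AudioFormat;
using System.Speech.Recognition;
using System.Threading;

class AudioFileToText
{
    static void Main()
    {
        // Hypothetical file names used only for illustration.
        const string inputPath = "speech.wav";
        const string outputPath = "transcript.txt";

        using (var done = new ManualResetEvent(false))
        using (var writer = new StreamWriter(outputPath))
        using (var audio = File.OpenRead(inputPath))
        using (var recognizer = new SpeechRecognitionEngine())
        {
            recognizer.LoadGrammar(new DictationGrammar());

            // Assumed format: 44.1 kHz, 16-bit, mono PCM. It must match the file.
            recognizer.SetInputToAudioStream(
                audio,
                new SpeechAudioFormatInfo(44100, AudioBitsPerSample.Sixteen, AudioChannel.Mono));

            // Write every recognized phrase as one line of the transcript.
            recognizer.SpeechRecognized += (s, e) => writer.WriteLine(e.Result.Text);

            // Raised when the recognizer reaches the end of the stream.
            recognizer.RecognizeCompleted += (s, e) => done.Set();

            // Multiple mode keeps recognizing until the input ends.
            recognizer.RecognizeAsync(RecognizeMode.Multiple);
            done.WaitOne();
        }
    }
}
```

The wait handle keeps Main alive until recognition completes, so the writer is only disposed after the last phrase has been written.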
I have a Wav file that I'm creating via the Microsoft Kinect that I'm saving to the desktop. I need to convert it to FLAC format so I can send it up to the Google Cloud to be processed from Speech2Text.
I haven't found any WAV to FLAC encoders in C#.
Is there any way to convert a WAV file to a FLAC file in C#?
Try LibFlac (hosted on SourceForge). The FLAC encoder is an open-source C/C++ project, so to use it from a C# application you have to call the API exported by LibFlac.dll through PInvoke. Check out this blog post, which explores processing uncompressed audio data with the FLAC API in C#:
Encoding uncompressed audio with FLAC in C#
How can I open different audio formats (other than .mp3 and .wav) using NAudio and C#?
How can I do audio synthesis using NAudio and C#? (I mean, how do I get the sound frequency and the other data necessary for audio synthesis?)
P.S. I've looked at this tutorial series
http://opensebj.blogspot.com/2009/02/introduction-to-using-naudio.html
and this one
http://www.giawa.com/tutorials/?p=19o
In addition to MP3 and WAV, NAudio can also open AIFF and WMA files with the AiffFileReader and WmaFileReader classes. Beyond that, you will have to write your own WaveStream derived class to read other formats.
See my tutorial on how to play a sine wave in NAudio, which will show you the basics of how to get started with audio synthesis in NAudio.
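For a sense of what sine-wave synthesis involves, here is a minimal sketch. It is not the code from the tutorial: it assumes a reasonably recent NAudio release that ships the SignalGenerator class, and it writes to a hypothetical file sine.wav rather than playing through a sound card. Each sample is essentially Gain * sin(2π * Frequency * n / sampleRate), so frequency, amplitude, and sample rate are the basic inputs.

```csharp
using System;
using NAudio.Wave;
using NAudio.Wave.SampleProviders;

class SineWaveDemo
{
    static void Main()
    {
        // 440 Hz sine at 44.1 kHz, mono. Gain is kept low to avoid a loud tone.
        var sine = new SignalGenerator(44100, 1)
        {
            Type = SignalGeneratorType.Sin,
            Frequency = 440,
            Gain = 0.2
        };

        // The generator never ends on its own; Take() limits it to two seconds.
        var twoSeconds = sine.Take(TimeSpan.FromSeconds(2));

        // Hypothetical output path; the samples are written as 16-bit PCM.
        WaveFileWriter.CreateWaveFile16("sine.wav", twoSeconds);
    }
}
```

To hear the tone instead of saving it, the same sample provider can be passed to an output device such as WaveOutEvent.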
I need to convert an AMR (Adaptive Multi-Rate) audio file recorded in a phone (as a Stream object) to a PCM uncompressed wav audio Stream so it can be processed afterwards for speech recognition. The Speech Recognition doesn't like the AMR format. This is going to be a server application using the Microsoft Speech Platform. I am not sure about using the ffdshow or similar libraries in a .
Right now I am researching NAudio and DirectShowNet to see if they can help me accomplish this, but I was hoping someone could point me in the right direction.
After a lot of searching for a solution, I am going to use ffmpeg. It provides the AMR-NB (NB = Narrow Band) decoder. There are a lot of C# wrappers for ffmpeg around; most of them are abandoned efforts, and the one that is up to date is not free. Just running ffmpeg with the basic parameters provides what I need, and it is really fast.
I don't like the idea of calling an external process to do the conversion, and I have to save the AMR stream to a file before it can be converted to a WAV file, but I believe I can make it work efficiently.
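A sketch of what calling ffmpeg from C# could look like is shown below. The exact parameters used above are not stated, so the flags here are an assumption: 16 kHz, mono, 16-bit PCM is a common input format for speech recognizers. The file names are hypothetical, and ffmpeg is assumed to be on the PATH with AMR-NB decoding available.

```csharp
using System;
using System.Diagnostics;

class AmrToWav
{
    static void Main()
    {
        // Hypothetical file names; the AMR stream must already be saved to disk.
        const string input = "recording.amr";
        const string output = "recording.wav";

        var startInfo = new ProcessStartInfo
        {
            FileName = "ffmpeg",
            // -y overwrites the output, -ar sets the sample rate, -ac the channel
            // count, and -acodec pcm_s16le selects 16-bit little-endian PCM.
            Arguments = $"-y -i \"{input}\" -ar 16000 -ac 1 -acodec pcm_s16le \"{output}\"",
            UseShellExecute = false,
            RedirectStandardError = true,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // ffmpeg logs to stderr; drain it so the process cannot block on a full pipe.
            string log = process.StandardError.ReadToEnd();
            process.WaitForExit();

            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException("ffmpeg failed: " + log);
            }
        }
    }
}
```

Checking the exit code and keeping the stderr log makes failures visible, which matters in a server application where nobody is watching a console window.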