Suppose there is a sample audio file that contains up to 10 simple words
"One Two Three .... Ten"
with 1 second of silence between each number in the audio file.
I want to check whether the audio file contains the keyword "Two", for example.
Please note that I have a voice file of the keyword "Two", and it is the exact same voice as in the master voice file, but it may contain some noise.
Is there a way for me to search the voice "Two" inside that bigger audio file and find the occurrence time?
Since you haven't provided any code, I'll just outline how to proceed. I hope it helps.
First, split your file into 10 separate audio files at the silences (I'm sure there are libraries that will help you do that).
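The splitting step can be sketched without any library once the audio is decoded to mono PCM samples. The following is a minimal Python illustration, not a library API: `split_on_silence`, `threshold` and `min_silence_s` are names and defaults assumed here, and the samples are assumed to be floats in [-1.0, 1.0]. The same loop ports directly to C#.

```python
def split_on_silence(samples, sample_rate, threshold=0.02, min_silence_s=0.5):
    """Return (start, end) sample-index pairs, one per non-silent segment.

    samples: mono PCM as floats in [-1.0, 1.0] (decoding is assumed done elsewhere).
    A quiet gap only separates two segments if it lasts at least
    min_silence_s seconds; shorter pauses stay inside the segment.
    """
    min_gap = int(min_silence_s * sample_rate)
    segments = []
    start = None      # index where the current segment began
    last_loud = None  # index of the most recent sample above the threshold
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            elif i - last_loud > min_gap:
                # The gap was long enough: close the previous segment.
                segments.append((start, last_loud + 1))
                start = i
            last_loud = i
    if start is not None:
        segments.append((start, last_loud + 1))
    return segments
```

Dividing a segment's start index by `sample_rate` gives its start time in seconds, which is also the occurrence time the question asks for. With 1 second of silence between words, a `min_silence_s` of about 0.5 should keep each word in one piece, but both thresholds need tuning on real recordings.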
Then you can send each file to the Google speech recognition API and get back a string containing the words spoken in that file.
EDIT: Please refer to:
https://googlespeechtotext.codeplex.com/
How to use google speech recognition api in c#?
Why don't you convert both audio samples into a raw bit or signal representation and check whether they share a common sequence?
Here are some links you should check before going any further, just to get familiar with working with audio in .NET:
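One common way to compare two signals like this is normalized cross-correlation: slide the short clip over the long recording and keep the offset where the two match best. The sketch below is a plain Python illustration under the assumption that both files are already decoded to mono samples at the same sample rate; `find_template` and `step` are hypothetical names. It is the brute-force form, so long recordings would need an FFT-based correlation instead.

```python
import math

def find_template(signal, template, step=1):
    """Return (best_offset, best_score) for template inside signal.

    The score is the normalized cross-correlation in [-1, 1]; values near 1
    mean the window is almost a scaled copy of the template.
    """
    n = len(template)
    t_mean = sum(template) / n
    t = [x - t_mean for x in template]
    t_norm = math.sqrt(sum(x * x for x in t))
    best_offset, best_score = -1, -1.0
    for off in range(0, len(signal) - n + 1, step):
        window = signal[off:off + n]
        w_mean = sum(window) / n
        w = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w))
        if t_norm == 0 or w_norm == 0:
            continue  # flat window (pure silence): nothing to correlate
        score = sum(a * b for a, b in zip(t, w)) / (t_norm * w_norm)
        if score > best_score:
            best_offset, best_score = off, score
    return best_offset, best_score
```

The occurrence time is `best_offset / sample_rate`. Because the score is normalized, mild additive noise lowers it slightly but does not move the peak, which fits the case where the keyword clip is the same voice plus some noise.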
http://crsouza.com/2009/08/converting-audio-bit-depths-in-c/
https://cscore.codeplex.com/
http://www.codeproject.com/Articles/501521/How-to-convert-between-most-audio-formats-in-NET
Let me know if you can work this out.
Related
I want to extract features of input audio files in C# (frequencies, length, etc.).
For this task I was trying to use the Accord.Audio NuGet library, but I have failed to find a how-to guide or working example that suits my needs.
Can you please show me how to extract audio features from a file using the Accord.Audio NuGet library?
For example, when I input a "song.mp3" file, I want a frequency array, a decibel array, the length, and similar features of "song.mp3".
If you are looking for tutorials on how to do things with Accord.NET, I think this is all you need.
Also, I don't think C# is the best choice for sound preprocessing. If you are willing to use Python, you can use Librosa for all your sound preprocessing and feature extraction, dump the results into a JSON file or similar, and use them from C#.
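To make the requested features concrete, here is a small pure-Python sketch of three of them: length, level in decibels, and a frequency array. It does not use Accord or Librosa; the function names are made up for illustration, and it assumes the MP3 has already been decoded to mono float samples in [-1.0, 1.0]. The naive DFT is only meant to show the idea, and a real pipeline would use an FFT routine on short frames.

```python
import cmath
import math

def duration_seconds(samples, sample_rate):
    """Length of the clip in seconds."""
    return len(samples) / sample_rate

def rms_dbfs(samples):
    """RMS level in dB relative to full scale (0 dBFS = amplitude 1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def magnitude_spectrum(frame, sample_rate):
    """Naive O(n^2) DFT of one frame.

    Returns a list of (frequency_hz, magnitude) for bins 0..n/2.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spectrum.append((k * sample_rate / n, abs(acc)))
    return spectrum
```

Calling `magnitude_spectrum` on consecutive frames gives a frequency array over time, and calling `rms_dbfs` on the same frames gives the matching decibel array. Both are plain lists, so they can be dumped to JSON and read from C# as suggested above.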
I want to capture the timing of each word in an audio file.
For example, consider an MP3 file (page1.mp3) that contains "Introduction Hi Bangalore". How can I extract the starting time of each word,
i.e., the starting times of "Introduction", "Hi", and "Bangalore"?
Is this possible? If so, how can I do it?
Is there any software available for this?
I have tried to extract the timings using the System.Speech API with the BookmarkReached event, but it is not as accurate as I want.
I need to convert audio files, specifically MP3 files, to plain text, and then convert that text back to MP3 again. This is one of the features of my mobile application. Is it possible? If yes, how? Guidance please.
You can find the answer for audio to text here (second post).
For text to audio, I would use any text-to-speech engine. In C# you could also use System.Speech.
I have a WAV file, and using the NAudio library I have been able to get the raw data out of it.
Does anyone know how to loop through the data in chunks, detecting DTMF tones?
The NuGet package DtmfDetection.NAudio provides extension methods and wrappers to detect DTMF tones in live (captured) audio and pre-recorded audio files.
On the GitHub site of the project you can find a sample program.
Well, the top result on Google is this:
http://sourceforge.net/projects/dtmf-cs/
But if you want to use heavy artillery, you can always run an FFT over your samples and check which two frequencies are the strongest.
BTW, do some searching before you post anything, and you'll come up with:
Detect a specific frequency/tone from raw wave-data
or even
Is it possible to detect DTMF tones using C#
I've gone with http://www.tapiex.com/ToneDecoder.Net.htm
It's cheap and does a good job at detection. All the others I found don't seem to do the job or have no documentation.
DTMF stands for dual-tone multi-frequency signaling, so you have to detect the two frequencies used to send each signal.
To do that, you transform your time-based audio material into the frequency domain, typically with an FFT algorithm.
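Since only eight known frequencies matter for DTMF, a full FFT is not strictly required. A common alternative is the Goertzel algorithm, which measures the power at one chosen frequency per pass. The sketch below is a swapped-in technique, not what any of the libraries mentioned here necessarily do: the function names, the 205-sample block size and the `min_power` threshold are all assumptions for illustration, and samples are assumed to be mono floats in [-1.0, 1.0].

```python
import math

LOW = (697, 770, 852, 941)       # row frequencies in Hz
HIGH = (1209, 1336, 1477, 1633)  # column frequencies in Hz
KEYS = ("123A", "456B", "789C", "*0#D")

def goertzel_power(block, sample_rate, freq):
    """Signal power at one frequency, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in block:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def detect_key(block, sample_rate, min_power=100.0):
    """Return the DTMF key present in this block, or None."""
    low = {f: goertzel_power(block, sample_rate, f) for f in LOW}
    high = {f: goertzel_power(block, sample_rate, f) for f in HIGH}
    lf = max(low, key=low.get)
    hf = max(high, key=high.get)
    # Both the row tone and the column tone must be clearly present.
    if min(low[lf], high[hf]) < min_power:
        return None
    return KEYS[LOW.index(lf)][HIGH.index(hf)]

def detect_dtmf(samples, sample_rate, block_size=205):
    """Walk the samples in fixed-size chunks.

    Returns a list of (start_seconds, key), collapsing consecutive
    blocks that report the same key into one event.
    """
    events = []
    prev = None
    for start in range(0, len(samples) - block_size + 1, block_size):
        key = detect_key(samples[start:start + block_size], sample_rate)
        if key is not None and key != prev:
            events.append((start / sample_rate, key))
        prev = key
    return events
```

A block of 205 samples at 8000 Hz is a conventional choice because it separates the eight tones well. The fixed `min_power` only makes sense for that block size and amplitude range, so a production detector would normally compare tone power against the total block energy instead.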
I found a very old VB5 program online, with source, which I think does exactly what you want: http://www.qsl.net/kb5ryo/dtmf.htm
EDIT: OK, maybe it's better to take a look at the suggested C# library.
I'm building a sample that plays MP3 files selected by the user. I want to calculate the playing time of the file (e.g. 00:05:32). How can I calculate the playing time?
You could use TagLib Sharp
It exposes TagLib.AudioProperties.Duration
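For the Alvas.Audio library, see the code below:
using System;
using System.IO;
using Alvas.Audio;
// Open the MP3 file and read its total duration in milliseconds.
using (Mp3Reader mr = new Mp3Reader(File.OpenRead("len.mp3")))
{
    int durationMS = mr.GetDurationInMS();
    // Convert to a TimeSpan so it can be shown as hh:mm:ss.
    TimeSpan durationTS = TimeSpan.FromMilliseconds(durationMS);
}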
You suggest in the tag that you're doing this in C#. This question deals with it:
Finding MP3 length in C#
And there's some code for reading the MP3 header and extracting relevant information (like the length) here:
http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=79
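As a rough illustration of what reading the header involves, here is a Python sketch that is not taken from the linked tutorial. It only handles constant-bitrate MPEG-1 Layer III files: it skips an ID3v2 tag if present, finds the first frame header, reads the bitrate and divides the audio size by it. The names `cbr_mp3_duration` and `format_hms` are made up here, and variable-bitrate files (which need the Xing/VBRI header or a full frame scan) will give wrong results.

```python
# MPEG-1 Layer III bitrates in kbit/s, indexed by the 4-bit header field.
# Index 0 is "free format" and index 15 is invalid, so both are None.
BITRATES_KBPS = (None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None)

def cbr_mp3_duration(data):
    """Estimate the duration in seconds of a CBR MPEG-1 Layer III file.

    data: the whole file as bytes. Returns None if no frame header is found.
    """
    start = 0
    if len(data) >= 10 and data[:3] == b"ID3":
        # The ID3v2 size is a 28-bit "synchsafe" integer (7 bits per byte).
        size = (data[6] << 21) | (data[7] << 14) | (data[8] << 7) | data[9]
        start = 10 + size
    end = len(data)
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128  # trailing ID3v1 tag is not audio
    for i in range(start, end - 3):
        # Sync bits + MPEG-1 + Layer III, ignoring the CRC protection bit.
        if data[i] != 0xFF or (data[i + 1] & 0xFE) != 0xFA:
            continue
        kbps = BITRATES_KBPS[data[i + 2] >> 4]
        if kbps is None:
            continue
        return (end - i) * 8 / (kbps * 1000)
    return None

def format_hms(seconds):
    """Format a duration as hh:mm:ss, e.g. 00:05:32."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

In practice a tag library is the safer route, since it also deals with VBR files and damaged headers. The sketch is mainly useful for seeing where the number comes from.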
I believe the Windows Media API (or windows mixer api or something, I can't recall the exact name) has a way to open and read sound files like mp3 and maybe get the time from it too. As an added bonus, by using that API you can open any audio format that will work in say Windows Media Player, so you're not limited to just mp3's.