I am using nAudio Library to capture microphone input. But I have run into a problem.
I am using the code(which I have modified slightly) from an nAudio sample app.
The codes generates a WAV file based on mic input and renders it as a wave. Here is the code for that.
private void RenderFile()
{
SampleAggregator.RaiseRestart();
using (WaveFileReader reader = new WaveFileReader(this.voiceRecorderState.ActiveFile))
{
this.samplesPerSecond = reader.WaveFormat.SampleRate;
SampleAggregator.NotificationCount = reader.WaveFormat.SampleRate/10;
//Sample rate is 44100
byte[] buffer = new byte[1024];
WaveBuffer waveBuffer = new WaveBuffer(buffer);
waveBuffer.ByteBufferCount = buffer.Length;
int bytesRead;
do
{
bytesRead = reader.Read(waveBuffer, 0, buffer.Length);
int samples = bytesRead / 2;
double sum = 0;
for (int sample = 0; sample < samples; sample++)
{
if (bytesRead > 0)
{
sampleAggregator.Add(waveBuffer.ShortBuffer[sample] / 32768f);
double sample1 = waveBuffer.ShortBuffer[sample] / 32768.0;
sum += (sample1 * sample1);
}
}
double rms = Math.Sqrt(sum / (SampleAggregator.NotificationCount));
var decibel = 20 * Math.Log10(rms);
System.Diagnostics.Debug.WriteLine(decibel.ToString() + " in dB");
} while (bytesRead > 0);
int totalSamples = (int)reader.Length / 2;
TotalWaveFormSamples = totalSamples / sampleAggregator.NotificationCount;
SelectAll();
}
audioPlayer.LoadFile(this.voiceRecorderState.ActiveFile);
}
Below is a little chunk from a result of a 2second WAV file with no sound but only mic noise.
-54.089102453893 in dB
-51.9171950072361 in dB
-53.3478098666891 in dB
-53.1845794096928 in dB
-53.8851764055102 in dB
-57.5541358628342 in dB
-54.0121140454216 in dB
-55.5204248291508 in dB
-54.9012326746571 in dB
-53.6831017096011 in dB
-52.8728852678309 in dB
-55.7021600863786 in dB
As we can see, the db level hovers around -55 when there is no input sound, only silence. if I record saying "Hello" in mic in a normal tone, the db value will goto -20 or so. I read somewhere that average human talk is around 20dB and -3dB to -6dB is the ZERO value range for mic.
Question: Am I calculating the dB value correctly? (i used a formula proposed here by someone else)...Why dB is always coming up in negative? Am i missing a crucial concept or a mechanism?
I searched nAudio documentation at codeplex and didn't find an answer. In my observation, the documentation there needs to be more explanatory then just a bunch of Q&A [no offense nAudio :)]
If I understood the formula correctly, the actual value you're calculating is dBm, and that's absolutely ok since dB is just a unit to measure amplification and can't be used for measuring signal strength/amplitude (i.e. you can say I amplified the signal by 3 db, but can't say my signal strength is 6 dB).
The negative values are there just because of the logarithmic conversion part of the formula (converting watts/miliWatts to db) and since the signals you're dealing with are relativly weak.
So in conclusion, looks, like you've done everything right.
Hope it helps.
EDIT: BTW, as you can see, there really is ~23-25dbm difference between silence and human speech
Related
I'm writing educational musical app for my students.
I need to pass audio input to pitchAnalyzer; It works fine with 44100 samplerate but showing wrong results with 8000.
I read a lot of examples: how that can be done reading and writing to the file. But i don't have file to read and don't need to write audio.
I already did conversion from 16 bit byte array to IEEE float ( format that pitchAnalyzer algorithm requires) which comes with the WaveInOnDataAvailable event;
void WaveInOnDataAvailable(object sender, WaveInEventArgs waveInEventArgs)
var buffer = waveInEventArgs.Buffer;
float[] samples = new float[buffer.Length/2];
for (int n = 0; n < buffer.Length; n += 2)
{
samples[n / 2] = BitConverter.ToInt16(resampled_buffer, n) / 32768f;
}
WaveIn setup:
waveIn = new WaveIn();
waveIn.DeviceNumber = comboBx_RecordingDevices.SelectedIndex;
waveIn.DataAvailable += WaveInOnDataAvailable;
waveIn.StartRecording();
So how this can be resampled directly to the buffer without writing it to a file?
Is there is a way to change WaveIn input samplerate? Property shows that it can't be set, but probably there is another way?
Thanks a lot in advance
I am using IBM Watson Unity SDK
There are some examples in the web on how to send file to IBM Watson.
But no exact examples how to stream a long file split to parts. So what I want to do:
I have a log audio file (about 1-3min) and want to sent it to Watson to recognize speech.
IBM Watson accepts only <5mb files, but my file is larger, so I need to split it and send as parts.
Here is my code:
private void OnAudioLoaded (AudioClip clip)
{
Debug.Log ("Audio was loaded and starting to stream...");
_chunksCount = 0;
float[] clipData = new float[(int)(clip.length * CHUNK_SIZE)];
clip.GetData (clipData, 1);
try {
_speechToText.StartListening (OnRecognize);
for (int i = 0; i < Math.Ceiling (clip.length / SECONDS_TO_SPLIT); i++) {
Debug.Log ("Iteration of recognition #" + i);
_chunksCount++;
// creating array of floats from clip array
float[] chunkData = new float[SECONDS_TO_SPLIT * (int)CHUNK_SIZE];
Array.Copy (clipData, i * SECONDS_TO_SPLIT * (int)CHUNK_SIZE, chunkData, 0, clipData.Length - i * SECONDS_TO_SPLIT * CHUNK_SIZE < SECONDS_TO_SPLIT * CHUNK_SIZE ? (int)(clipData.Length - i * SECONDS_TO_SPLIT * CHUNK_SIZE) : SECONDS_TO_SPLIT * (int)CHUNK_SIZE);
// creating audioclip from floats array
AudioClip chunk = AudioClip.Create ("ch", clip.frequency * SECONDS_TO_SPLIT, clip.channels, clip.frequency, false);
chunk.SetData (chunkData, 0);
AudioData audioData = new AudioData (chunk, chunk.samples);
// sending recognition request
_speechToText.OnListen (audioData);
}
} catch (OutOfMemoryException e) {
DialogBoxes.CallErrorBox ("Audio Recognition Error", e.Message);
}
}
The problem is:
On line _speechToText.StartListening (OnRecognize); I assign a Callback function OnRecognize, that should be called when something is recognized, but it is never called.
This file i am testing on has been recognized, on online website and it is definitely ok.
Any suggestions?
So the figure was that the data chunks where too small for Watson to recognize, so my solution to this specific problem was to send longer audio chunks, a few seconds long, about half a minute, and the recognition was working properly.
The longer audio file I was sending the better result I received, but still I had to be under 5mb.
This solution is very old, but it can help someone running into the same problem.
I'm currently trying to implement my own single track MIDI file output. It turns an 8x8 grid of colours stored in multiple frames, into a MIDI file that can be imported into a digital audio interface and played through a Novation Launchpad. Some more context here.
I've managed to output a file that programs recognize as MIDI, but the resultant MIDI does not play, and its not matching files generated via the same frame data. I've been doing comparisions by recording my programs live MIDI messages through a dedicated MIDI program, and then spitting out a MIDI file via that. I then compare my generated file to that properly generated file via a hex editor. Things are correct as far as the headers, but that seems to be it.
I've been slaving over multiple renditions of the MIDI specification and existing Stack Overflow questions with no 100% solution.
Here is my code, based on what I have researched. I can't help but feel I'm missing something simple. I'm avoiding the use of existing MIDI libraries, as I only need this one MIDI function to work (and want the learning experience of doing this from scratch). Any guidance would be very helpful.
/// <summary>
/// Outputs an MIDI file based on frames for the Novation Launchpad.
/// </summary>
/// <param name="filename"></param>
/// <param name="frameData"></param>
/// <param name="bpm"></param>
/// <param name="ppq"></param>
public static void WriteMidi(string filename, List<FrameData> frameData, int bpm, int ppq) {
decimal totalLength = 0;
using (FileStream stream = new FileStream(filename, FileMode.Create, FileAccess.Write)) {
// Output midi file header
stream.WriteByte(77);
stream.WriteByte(84);
stream.WriteByte(104);
stream.WriteByte(100);
for (int i = 0; i < 3; i++) {
stream.WriteByte(0);
}
stream.WriteByte(6);
// Set the track mode
byte[] trackMode = BitConverter.GetBytes(Convert.ToInt16(0));
stream.Write(trackMode, 0, trackMode.Length);
// Set the track amount
byte[] trackAmount = BitConverter.GetBytes(Convert.ToInt16(1));
stream.Write(trackAmount, 0, trackAmount.Length);
// Set the delta time
byte[] deltaTime = BitConverter.GetBytes(Convert.ToInt16(60000 / (bpm * ppq)));
stream.Write(deltaTime, 0, deltaTime.Length);
// Output track header
stream.WriteByte(77);
stream.WriteByte(84);
stream.WriteByte(114);
stream.WriteByte(107);
for (int i = 0; i < 3; i++) {
stream.WriteByte(0);
}
stream.WriteByte(12);
// Get our total byte length for this track. All colour arrays are the same length in the FrameData class.
byte[] bytes = BitConverter.GetBytes(frameData.Count * frameData[0].Colours.Count * 6);
// Write our byte length to the midi file.
stream.Write(bytes, 0, bytes.Length);
// Cycle through frames and output the necessary MIDI.
foreach (FrameData frame in frameData) {
// Calculate our relative delta for this frame. Frames are originally stored in milliseconds.
byte[] delta = BitConverter.GetBytes((double) frame.TimeStamp / 60000 / (bpm * ppq));
for (int i = 0; i < frame.Colours.Count; i++) {
// Output the delta length to MIDI file.
stream.Write(delta, 0, delta.Length);
// Get the respective MIDI note based on the colours array index.
byte note = (byte) NoteIdentifier.GetIntFromNote(NoteIdentifier.GetNoteFromPosition(i));
// Check if the current color signals a MIDI off event.
if (!CheckEqualColor(frame.Colours[i], Color.Black) && !CheckEqualColor(frame.Colours[i], Color.Gray) && !CheckEqualColor(frame.Colours[i], Color.Purple)) {
// Signal a MIDI on event.
stream.WriteByte(144);
// Write the current note.
stream.WriteByte(note);
// Check colour and write the respective velocity.
if (CheckEqualColor(frame.Colours[i], Color.Red)) {
stream.WriteByte(7);
} else if (CheckEqualColor(frame.Colours[i], Color.Orange)) {
stream.WriteByte(83);
} else if (CheckEqualColor(frame.Colours[i], Color.Green) || CheckEqualColor(frame.Colours[i], Color.Aqua) || CheckEqualColor(frame.Colours[i], Color.Blue)) {
stream.WriteByte(124);
} else if (CheckEqualColor(frame.Colours[i], Color.Yellow)) {
stream.WriteByte(127);
}
} else {
// Calculate the delta that the frame had.
byte[] offDelta = BitConverter.GetBytes((double) (frameData[frame.Index - 1].TimeStamp / 60000 / (bpm * ppq)));
// Write the delta to MIDI.
stream.Write(offDelta, 0, offDelta.Length);
// Signal a MIDI off event.
stream.WriteByte(128);
// Write the current note.
stream.WriteByte(note);
// No need to set our velocity to anything.
stream.WriteByte(0);
}
}
}
}
}
BitConverter.GetBytes returns the bytes in the native byte order, but MIDI files use big-endian values. If you're running on x86 or ARM, you must reverse the bytes.
The third value in the file header is not called "delta time"; it is the number of ticks per quarter note, which you already have as ppq.
The length of the track is not 12; you must write the actual length.
Due to the variable-length encoding of delta times (see below), this is usually not possible before collecting all bytes of the track.
You need to write a tempo meta event that specifies the number of microseconds per quarter note.
A delta time is not an absolute time; it specifies the interval starting from the time of the previous event.
A delta time specifies the number ticks; your calculation is wrong.
Use TimeStamp * bpm * ppq / 60000.
Delta times are not stored as a double floating-point number but as a variable-length quantity; the specification has example code for encoding it.
The last event of the track must be an end-of-track meta event.
Another approach would be to use one of the .NET MIDI libraries to write the MIDI file. You just have to convert your frames into Midi objects and pass them to the library to save. The library will take care of all the MIDI details.
You could try MIDI.NET and C# Midi Toolkit. Not sure if NAudio does writing MIDI files and at what abstraction level that is...
Here is more info on the MIDI File Format specification:
http://www.blitter.com/~russtopia/MIDI/~jglatt/tech/midifile.htm
Hope it helps,
Marc
We are using the NAudio Stack written in c# and trying to capture the audio in Exclusive mode with PCM 8kHZ and 16bits per sample.
In the following function:
private void InitializeCaptureDevice()
{
if (initialized)
return;
long requestedDuration = REFTIMES_PER_MILLISEC * 100;
if (!audioClient.IsFormatSupported(AudioClientShareMode.Shared, WaveFormat) &&
(!audioClient.IsFormatSupported(AudioClientShareMode.Exclusive, WaveFormat)))
{
throw new ArgumentException("Unsupported Wave Format");
}
var streamFlags = GetAudioClientStreamFlags();
audioClient.Initialize(AudioClientShareMode.Shared,
streamFlags,
requestedDuration,
requestedDuration,
this.waveFormat,
Guid.Empty);
int bufferFrameCount = audioClient.BufferSize;
this.bytesPerFrame = this.waveFormat.Channels * this.waveFormat.BitsPerSample / 8;
this.recordBuffer = new byte[bufferFrameCount * bytesPerFrame];
Debug.WriteLine(string.Format("record buffer size = {0}", this.recordBuffer.Length));
initialized = true;
}
We configured the WaveFormat before calls this function to (8000,1) and also a period of 100 ms.
We expected the system to allocate 1600 bytes for the buffer and interval of 100 ms as requested.
But we noticed following occured:
1. the system allocated audioClient.BufferSize to be 4800 and "this.recordBuffer" an array of 9600 bytes (which means a buffer for 600ms and not 100ms).
2. the thread is going to sleep and then getting 2400 samples (4800 bytes) and not as expected frames of 1600 bytes
Any idea what is going there?
You say you are capturing audio in exclusive mode, but in the example code you call the Initialize method with AudioClientMode.Shared. It strikes me as very unlikely that shared mode will let you work at 8kHz. Unlike the wave... APIs, WASAPI does no resampling for you of playback or capture, so the soundcard itself must be operating at the sample rate you specify.
I've been learning C# by creating an app and i've hit a snag i'm really struggling with.
Basicly i have the code below which is what im using to read from a network stream I have setup. It works but as its only reading 1 packet for each time the sslstream.Read() unblocks. It's causes a big backlog of messages.
What im looking at trying to do is if the part of the stream read contains multiple packets read them all.
I've tried multiple times to work it out but i just ended up in a big mess of code.
If anyone could help out I'd appreciate it!
(the first 4bytes of each packet is the size of the packet.. packets range between 8 bytes and 28,000 bytes)
SslStream _sslStream = (SslStream)_sslconnection;
int bytes = -1;
int nextread = 0;
int byteslefttoread = -1;
byte[] tmpMessage;
byte[] buffer = new byte[3000000];
do
{
bytes = _sslStream.Read(buffer, nextread, 8192);
int packetSize = BitConverter.ToInt32(buffer, 0);
nextread += bytes;
byteslefttoread = packetSize - nextread;
if (byteslefttoread <= 0)
{
int leftover = Math.Abs(byteslefttoread);
do
{
tmpMessage = new byte[packetSize];
Buffer.BlockCopy(buffer, 0, tmpMessage, 0, packetSize);
PacketHolder tmpPacketHolder = new PacketHolder(tmpMessage, "in");
lock (StaticMessageBuffers.MsglockerIn)
{
//puts message into the message queue.. not very oop... :S
MessageInQueue.Enqueue(tmpPacketHolder);
}
}
while (leftover > 0);
Buffer.BlockCopy(buffer, packetSize , buffer, 0, leftover);
byteslefttoread = 0;
nextread = leftover;
}
} while (bytes != 0);
If you are using .Net 3.5 or later I would highly suggest you look into Windows Communication Foundation (wcf). It will simply anything you are trying to do over a network.
On the other hand, if you are doing this purely for educational purposes.
Take a look at this link. Your best bet is to read from the stream in somewhat smaller increments, and feed that data into another stream. Once you can identify the length of data you need for a message, you can cut the second stream off into a message. You can setup an outside loop where available bytes are being checked and wait until its value is > 0 to start the next message. Also should note, that any network code should be running on its own thread, so as to not block the UI thread.