Essentially, I'm trying to write a piece of software that can display the pitch of any incoming audio. I have a C# console app, and I'm using NAudio as a way to take in audio. Here's the code I have to do that:
int SampleRate = 44100;
int BufferMS = 1000;
NAudio.Wave.WaveInEvent Wave = new()
{
DeviceNumber = 0,
WaveFormat = new NAudio.Wave.WaveFormat(SampleRate, 1),
BufferMilliseconds = BufferMS
};
Wave.DataAvailable += OnWaveInData;
Wave.StartRecording();
void OnWaveInData(object? sender, NAudio.Wave.WaveInEventArgs e)
{
// get the frequency of the audio
var freq = GetFrequency(e.Buffer, e.BytesRecorded);
// print freq in hz
Console.WriteLine($"\r{freq} Hz");
}
I used GitHub Copilot to help me write this next part, as I know I need to use some math to get a frequency out of this, but it seemed quite advanced.
double GetFrequency(byte[] buffer, int bytesRecorded)
{
double max = 0;
int maxN = 0;
for (int i = 0; i < bytesRecorded / 2; i++)
{
var real = buffer[2 * i];
var imag = buffer[2 * i + 1];
var magnitude = Math.Sqrt(real * real + imag * imag);
if (magnitude > max)
{
max = magnitude;
maxN = i;
}
}
double freqN = maxN;
if (maxN > 0 && maxN < bytesRecorded / 2 - 1)
{
var dL = buffer[2 * (maxN - 1)];
var dR = buffer[2 * (maxN + 1)];
freqN += 0.5 * (dR * dR - dL * dL) / (dR + dL - 2 * buffer[2 * maxN]);
}
return freqN * SampleRate / bytesRecorded;
}
It works somewhat okay, but the issue is the frequency reported can be quite sporadic and jump around which is not desireable. Should I just write a filter function to remove these "outliers", or am I approaching this completely incorrectly?
Thanks for any help!
I implemented your procedure but I get some weird values like 218525442.36 in the freqN variable after the second block of calculations in the GetFrequency procedure.
It is from the variables dL and dr that values of the type 1 560 501 425 are returned to me and I don't understand why
You have suggestions
Thank you
Mimmo
Related
I'm developping a simple program that analyses frequencies of audio files.
Using an fft length of 8192, samplerate of 44100, if I use as input a constant frequency wav file - say 65Hz, 200Hz or 300Hz - the output is a constant graph at that value.
If I use a recording of someone speaking, the frequencies has peaks as high as 4000Hz, with an average at 450+ish on a 90 seconds file.
At first I thought it was because of the recording being stereo sound, but converting it to mono with the exact same bitrate as the test files doesn't change much. (average goes down from 492 to 456 but that's still way too high)
Has anyone got an idea as to what could cause this ?
I think I shouldn't find the highest value but perhaps take either an average or a median value ?
EDIT : using the average of the magnitudes per 8192 bytes buffer and getting the index that's closest to that magnitude messes everything up.
This is the code for the handler of the event the Sample Aggregator fires when it has calculated fft for current buffer
void FftCalculated(object sender, FftEventArgs e)
{
int length = e.Result.Length;
float[] magnitudes = new float[length];
for (int i = 0; i < length / 2; i++)
{
float real = e.Result[i].X;
float imaginary = e.Result[i].Y;
magnitudes[i] = (float)(10 * Math.Log10(Math.Sqrt((real * real) + (imaginary * imaginary))));
}
float max_mag = float.MinValue;
float max_index = -1;
for (int i = 0; i < length / 2; i++)
if (magnitudes[i] > max_mag)
{
max_mag = magnitudes[i];
max_index = i;
}
var currentFrequency = max_index * SAMPLERATE / 8192;
Console.WriteLine("frequency be " + currentFrequency);
}
ADDITION : this is the code that reads and sends the file to the analysing part
using (var rdr = new WaveFileReader(audioFilePath))
{
var newFormat = new WaveFormat(Convert.ToInt32(SAMPLERATE/*44100*/), 16, 1);
byte[] buffer = new byte[8192];
var audioData = new AudioData(); //custom class for project
using (var conversionStream = new WaveFormatConversionStream(newFormat, rdr))
{
// Used to send audio in realtime, it's a timestamps issue for the graphs
// I'm working on fixing this, but it has lower priority so disregard it :p
TimeSpan audioDuration = conversionStream.TotalTime;
long audioLength = conversionStream.Length;
int waitTime = (int)(audioDuration.TotalMilliseconds / audioLength * 8192);
while (conversionStream.Read(buffer, 0, buffer.Length) != 0)
{
audioData.AudioDataBase64 = Utils.Base64Encode(buffer);
Thread.Sleep(waitTime);
SendMessage("AudioData", Utils.StringToAscii(AudioData.GetJSON(audioData)));
}
Console.WriteLine("Reached End of File");
}
}
This is the code that receives the audio data
{
var audioData = new AudioData();
audioData =
AudioData.GetStateFromJSON(Utils.AsciiToString(receivedMessage));
QueueAudio(Utils.Base64Decode(audioData.AudioDataBase64)));
}
followed by
var waveFormat = new WaveFormat(Convert.ToInt32(SAMPLERATE/*44100*/), 16, 1);
_bufferedWaveProvider = new BufferedWaveProvider(waveFormat);
_bufferedWaveProvider.BufferDuration = new TimeSpan(0, 2, 0);
{
void QueueAudio(byte[] data)
{
_bufferedWaveProvider.AddSamples(data, 0, data.Length);
if (_bufferedWaveProvider.BufferedBytes >= fftLength)
{
byte[] buffer = new byte[_bufferedWaveProvider.BufferedBytes];
_bufferedWaveProvider.Read(buffer, 0, _bufferedWaveProvider.BufferedBytes);
for (int index = 0; index < buffer.Length; index += 2)
{
short sample = (short)((buffer[index] | buffer[index + 1] << 8));
float sample32 = (sample) / 32767f;
sampleAggregator.Add(sample32);
}
}
}
}
And then the SampleAggregator fires the event above when it's done with the fft.
I am having trouble getting a smoothed RSI. The below picture is from freestockcharts.com. The calculation uses this code.
public static double CalculateRsi(IEnumerable<double> closePrices)
{
var prices = closePrices as double[] ?? closePrices.ToArray();
double sumGain = 0;
double sumLoss = 0;
for (int i = 1; i < prices.Length; i++)
{
var difference = prices[i] - prices[i - 1];
if (difference >= 0)
{
sumGain += difference;
}
else
{
sumLoss -= difference;
}
}
if (sumGain == 0) return 0;
if (Math.Abs(sumLoss) < Tolerance) return 100;
var relativeStrength = sumGain / sumLoss;
return 100.0 - (100.0 / (1 + relativeStrength));
}
https://stackoverflow.com/questions/...th-index-using-some-programming-language-js-c
This seems to be the pure RSI with no smoothing. How does a smoothed RSI get calculated? I have tried changing it to fit the definitions of the these two sites however the output was not correct. It was barely smoothed.
(I don't have enough rep to post links)
tc2000 -> Indicators -> RSI_and_Wilder_s_RSI (Wilder's smoothing = Previous MA value + (1/n periods * (Close - Previous MA)))
priceactionlab -> wilders-cutlers-and-harris-relative-strength-index (RS = EMA(Gain(n), n)/EMA(Loss(n), n))
Can someone actually do the calculation with some sample data?
Wilder's RSI vs RSI
In order to calculate the RSI, you need a period to calculate it with. As noted on Wikipedia, 14 is used quite often.
So the calculation steps would be as follows:
Period 1 - 13, RSI = 0
Period 14:
AverageGain = TotalGain / PeriodCount;
AverageLoss = TotalLoss / PeriodCount;
RS = AverageGain / AverageLoss;
RSI = 100 - 100 / (1 + RS);
Period 15 - to period (N):
if (Period(N)Change > 0
AverageGain(N) = ((AverageGain(N - 1) * (PeriodCount - 1)) + Period(N)Change) / PeriodCount;
else
AverageGain(N) = (AverageGain(N - 1) * (PeriodCount - 1)) / PeriodCount;
if (this.Change < 0)
AverageLoss(N) = ((AverageLoss(N - 1) * (PeriodCount - 1)) + Math.Abs(Period(N)Change)) / PeriodCount;
else
AverageLoss(N) = (AverageLoss(N - 1) * (PeriodCount - 1)) / PeriodCount;
RS = AverageGain / AverageLoss;
RSI = 100 - (100 / (1 + RS));
Thereafter, to smooth the values, you need to apply a moving average of a certain period to your RSI values. To do that, traverse your RSI values from the last index to the first and calculate your average for the current period based on the preceding x smoothing periods.
Once done, just reverse the list of values to get the expected order:
List<double> SmoothedRSI(IEnumerable<double> rsiValues, int smoothingPeriod)
{
if (rsiValues.Count() <= smoothingPeriod)
throw new Exception("Smoothing period too large or too few RSI values passed.");
List<double> results = new List<double>();
List<double> reversedRSIValues = rsiValues.Reverse().ToList();
for (int i = 1; i < reversedRSIValues.Count() - smoothingPeriod - 1; i++)
results.Add(reversedRSIValues.Subset(i, i + smoothingPeriod).Average());
return results.Reverse().ToList();
}
The Subset method is just a simple extension method as follows:
public static List<double> Subset(this List<double> values, int start, int end)
{
List<double> results = new List<double>();
for (int i = start; i <= end; i++)
results.Add(values[i]);
return results;
}
Disclaimer, I did not test the code, but it should give you an idea of how the smoothing is applied.
You can't get accurate values without buffers / global variables to store data.
This is a smoothed indicator, meaning it doesn't only use 14 bars but ALL THE BARS:
Here's a step by step article with working and verified source codes generating exactly the same values if prices and number of available bars are the same, of course (you only need to load the price data from your source):
Tested and verified:
using System;
using System.Data;
using System.Globalization;
namespace YourNameSpace
{
class PriceEngine
{
public static DataTable data;
public static double[] positiveChanges;
public static double[] negativeChanges;
public static double[] averageGain;
public static double[] averageLoss;
public static double[] rsi;
public static double CalculateDifference(double current_price, double previous_price)
{
return current_price - previous_price;
}
public static double CalculatePositiveChange(double difference)
{
return difference > 0 ? difference : 0;
}
public static double CalculateNegativeChange(double difference)
{
return difference < 0 ? difference * -1 : 0;
}
public static void CalculateRSI(int rsi_period, int price_index = 5)
{
for(int i = 0; i < PriceEngine.data.Rows.Count; i++)
{
double current_difference = 0.0;
if (i > 0)
{
double previous_close = Convert.ToDouble(PriceEngine.data.Rows[i-1].Field<string>(price_index));
double current_close = Convert.ToDouble(PriceEngine.data.Rows[i].Field<string>(price_index));
current_difference = CalculateDifference(current_close, previous_close);
}
PriceEngine.positiveChanges[i] = CalculatePositiveChange(current_difference);
PriceEngine.negativeChanges[i] = CalculateNegativeChange(current_difference);
if(i == Math.Max(1,rsi_period))
{
double gain_sum = 0.0;
double loss_sum = 0.0;
for(int x = Math.Max(1,rsi_period); x > 0; x--)
{
gain_sum += PriceEngine.positiveChanges[x];
loss_sum += PriceEngine.negativeChanges[x];
}
PriceEngine.averageGain[i] = gain_sum / Math.Max(1,rsi_period);
PriceEngine.averageLoss[i] = loss_sum / Math.Max(1,rsi_period);
}else if (i > Math.Max(1,rsi_period))
{
PriceEngine.averageGain[i] = ( PriceEngine.averageGain[i-1]*(rsi_period-1) + PriceEngine.positiveChanges[i]) / Math.Max(1, rsi_period);
PriceEngine.averageLoss[i] = ( PriceEngine.averageLoss[i-1]*(rsi_period-1) + PriceEngine.negativeChanges[i]) / Math.Max(1, rsi_period);
PriceEngine.rsi[i] = PriceEngine.averageLoss[i] == 0 ? 100 : PriceEngine.averageGain[i] == 0 ? 0 : Math.Round(100 - (100 / (1 + PriceEngine.averageGain[i] / PriceEngine.averageLoss[i])), 5);
}
}
}
public static void Launch()
{
PriceEngine.data = new DataTable();
//load {date, time, open, high, low, close} values in PriceEngine.data (6th column (index #5) = close price) here
positiveChanges = new double[PriceEngine.data.Rows.Count];
negativeChanges = new double[PriceEngine.data.Rows.Count];
averageGain = new double[PriceEngine.data.Rows.Count];
averageLoss = new double[PriceEngine.data.Rows.Count];
rsi = new double[PriceEngine.data.Rows.Count];
CalculateRSI(14);
}
}
}
For detailed step-by-step instructions, I wrote a lengthy article, you can check it here: https://turmanauli.medium.com/a-step-by-step-guide-for-calculating-reliable-rsi-values-programmatically-a6a604a06b77
P.S. functions only work for simple indicators (Simple Moving Average), even Exponential Moving Average, Average True Range absolutely require global variables to store previous values.
I successfully captured sound from Wasapi using the following code:
IWaveIn waveIn = new WasapiLoopbackCapture();
waveIn.DataAvailable += OnDataReceivedFromWaveOut;
What I need to do now, is to resample the in-memory data to pcm with a sample rate of 8000 and 16 bits per sample mono.
I can't use ACMStream to resample the example, because the recorded audio is 32 bits per second.
I have tried this code to convert bytes from 32 to 16 bits, but all I get every time is just blank audio.
byte[] newArray16Bit = new byte[e.BytesRecorded / 2];
short two;
float value;
for (int i = 0, j = 0; i < e.BytesRecorded; i += 4, j += 2)
{
value = (BitConverter.ToSingle(e.Buffer, i));
two = (short)(value * short.MaxValue);
newArray16Bit[j] = (byte)(two & 0xFF);
newArray16Bit[j + 1] = (byte)((two >> 8) & 0xFF);
}
source = newArray16Bit;
I use this routine to resample on the fly from WASAPI IeeeFloat to the format I need in my app, which is 16kHz, 16 bit, 1 channel. My formats are fixed, so I'm hardcoding the conversions I need, but it can be adapted as needed.
private void ResampleWasapi(object sender, WaveInEventArgs e)
{
//the result of downsampling
var resampled = new byte[e.BytesRecorded / 12];
var indexResampled = 0;
//a variable to flag the mod 3-ness of the current sample
var arity = -1;
var runningSamples = new short[3];
for(var offset = 0; offset < e.BytesRecorded; offset += 8)
{
var float1 = BitConverter.ToSingle(e.Buffer, offset);
var float2 = BitConverter.ToSingle(e.Buffer, offset + 4);
//simple average to collapse 2 channels into 1
float mono = (float)((double)float1 + (double)float2) / 2;
//convert (-1, 1) range int to short
short sixteenbit = (short)(mono * 32767);
//the input is 48000Hz and the output is 16000Hz, so we need 1/3rd of the data points
//so save up 3 running samples and then mix and write to the file
arity = (arity + 1) % 3;
//record the value
runningSamples[arity] = sixteenbit;
//if we've hit the third one
if (arity == 2)
{
//simple average of the 3 and put in the 0th position
runningSamples[0] = (short)(((int)runningSamples[0] + (int)runningSamples[1] + (int)runningSamples[2]) / 3);
//copy that short (2 bytes) into the result array at the current location
Buffer.BlockCopy(runningSamples, 0, resampled, indexResampled, 2);
//for next pass
indexResampled += 2;
}
}
//and tell listeners that we've got data
RaiseDataEvent(resampled, resampled.Length);
}
I am a newcomer to the sound programming. I have a real-time sound visualizer(http://www.codeproject.com/Articles/20025/Sound-visualizer-in-C). I downloaded it from codeproject.com.
In AudioFrame.cs class there is an array as below:
_fftLeft = FourierTransform.FFTDb(ref _waveLeft);
_fftLeft is a double array. _waveLeft is also a double array. As above they applied
FouriorTransform.cs class's FFTDb function to a _waveLeft array.
Here is FFTDb function:
static public double[] FFTDb(ref double[] x)
{
n = x.Length;
nu = (int)(Math.Log(n) / Math.Log(2));
int n2 = n / 2;
int nu1 = nu - 1;
double[] xre = new double[n];
double[] xim = new double[n];
double[] decibel = new double[n2];
double tr, ti, p, arg, c, s;
for (int i = 0; i < n; i++)
{
xre[i] = x[i];
xim[i] = 0.0f;
}
int k = 0;
for (int l = 1; l <= nu; l++)
{
while (k < n)
{
for (int i = 1; i <= n2; i++)
{
p = BitReverse(k >> nu1);
arg = 2 * (double)Math.PI * p / n;
c = (double)Math.Cos(arg);
s = (double)Math.Sin(arg);
tr = xre[k + n2] * c + xim[k + n2] * s;
ti = xim[k + n2] * c - xre[k + n2] * s;
xre[k + n2] = xre[k] - tr;
xim[k + n2] = xim[k] - ti;
xre[k] += tr;
xim[k] += ti;
k++;
}
k += n2;
}
k = 0;
nu1--;
n2 = n2 / 2;
}
k = 0;
int r;
while (k < n)
{
r = BitReverse(k);
if (r > k)
{
tr = xre[k];
ti = xim[k];
xre[k] = xre[r];
xim[k] = xim[r];
xre[r] = tr;
xim[r] = ti;
}
k++;
}
for (int i = 0; i < n / 2; i++)
decibel[i] = 10.0 * Math.Log10((float)(Math.Sqrt((xre[i] * xre[i]) + (xim[i] * xim[i]))));
return decibel;
}
When I play a music note in a guitar i wanted to know it's frequency in a numerical format. I wrote a foreach loop to know what is the output of a _fftLeft array as below,
foreach (double myarray in _fftLeft)
{
Console.WriteLine(myarray );
}
This output's contain lots of real-time values as below .
41.3672743963389
,43.0176034462662,
35.3677383746087,
42.5968946936404,
42.0600935794783,
36.7521669642071,
41.6356709559342,
41.7189032845742,
41.1002451261724,
40.8035583510188,
45.604366914128,
39.645552593115
I want to know what are those values (frequencies or not)? if the answer is frequencies then why it contains low frequency values? And when I play a guitar note I want to detect a frequency of that particular guitar note.
Based on the posted code, FFTDb first computes the FFT then computes and returns the magnitudes of the frequency spectrum in the logarithmic decibels scale. In other words, the _fftLeft then contains magnitudes for a discreet set of frequencies. The actual values of those frequencies can be computed using the array index and sampling frequency according to this answer.
As an example, if you were plotting the _fftLeft output for a pure sinusoidal tone input you should be able to see a clear spike in the index corresponding to the sinusoidal frequency. For a guitar note however you are likely going to see multiple spikes in magnitude corresponding to the harmonics. To detect the note's frequency aka pitch is a more complicated topic and typically requires the use of one of several pitch detection algorithms.
I had a look at the wave format today and created a little wave generator. I create a sine sound like this:
public static Wave GetSine(double length, double hz)
{
int bitRate = 44100;
int l = (int)(bitRate * length);
double f = 1.0 / bitRate;
Int16[] data = new Int16[l];
for (int i = 0; i < l; i++)
{
data[i] = (Int16)(Math.Sin(hz * i * f * Math.PI * 2) * Int16.MaxValue);
}
return new Wave(false, Wave.MakeInt16WaveData(data));
}
MakeInt16WaveData looks like this:
public static byte[] MakeInt16WaveData(Int16[] ints)
{
int s = sizeof(Int16);
byte[] buf = new byte[s * ints.Length];
for(int i = 0; i < ints.Length; i++)
{
Buffer.BlockCopy(BitConverter.GetBytes(ints[i]), 0, buf, i * s, s);
}
return buf;
}
This works as expected! Now I wanted to swoop from one frequency to another like this:
public static Wave GetSineSwoop(double length, double hzStart, double hzEnd)
{
int bitRate = 44100;
int l = (int)(bitRate * length);
double f = 1.0 / bitRate;
Int16[] data = new Int16[l];
double hz;
double hzDelta = hzEnd - hzStart;
for (int i = 0; i < l; i++)
{
hz = hzStart + ((double)i / l) * hzDelta * 0.5; // why *0.5 ?
data[i] = (Int16)(Math.Sin(hz * i * f * Math.PI * 2) * Int16.MaxValue);
}
return new Wave(false, Wave.MakeInt16WaveData(data));
}
Now, when I swooped from 200 to 100 Hz, the sound played from 200 to 0 hertz. For some reason I had to multiply the delta by 0.5 to get the correct output. What might be the issue here ? Is this an audio thing or is there a bug in my code ?
Thanks
Edit by TaW: I take the liberty to add screenshots of the data in a chart which illustrate the problem, the first is with the 0.5 factor, the 2nd with 1.0 and the 3rd & 4th with 2.0 and 5.0:
Edit: here is an example, a swoop from 200 to 100 hz:
Debug values:
Wave clearly does not end at 100 hz
Digging out my rusty math I think it may be because:
Going in L steps from frequency F1 to F2 you have a frequency of
Fi = F1 + i * ( F2 - F1 ) / L
or with
( F2 - F1 ) / L = S
Fi = F1 + i * S
Now to find out how far we have progressed we need the integral over i, which would be
I(Fi) = i * F1 + 1/2 * i ^ 2 * S
Which give or take resembles the term inside your formula for the sine.
Note that you can gain efficiency by moving the constant part (S/2) out of the loop..