I had a look at the wave format today and created a little wave generator. I create a sine sound like this:
public static Wave GetSine(double length, double hz)
{
int bitRate = 44100;
int l = (int)(bitRate * length);
double f = 1.0 / bitRate;
Int16[] data = new Int16[l];
for (int i = 0; i < l; i++)
{
data[i] = (Int16)(Math.Sin(hz * i * f * Math.PI * 2) * Int16.MaxValue);
}
return new Wave(false, Wave.MakeInt16WaveData(data));
}
MakeInt16WaveData looks like this:
public static byte[] MakeInt16WaveData(Int16[] ints)
{
int s = sizeof(Int16);
byte[] buf = new byte[s * ints.Length];
for(int i = 0; i < ints.Length; i++)
{
Buffer.BlockCopy(BitConverter.GetBytes(ints[i]), 0, buf, i * s, s);
}
return buf;
}
This works as expected! Now I wanted to swoop from one frequency to another like this:
public static Wave GetSineSwoop(double length, double hzStart, double hzEnd)
{
int bitRate = 44100;
int l = (int)(bitRate * length);
double f = 1.0 / bitRate;
Int16[] data = new Int16[l];
double hz;
double hzDelta = hzEnd - hzStart;
for (int i = 0; i < l; i++)
{
hz = hzStart + ((double)i / l) * hzDelta * 0.5; // why *0.5 ?
data[i] = (Int16)(Math.Sin(hz * i * f * Math.PI * 2) * Int16.MaxValue);
}
return new Wave(false, Wave.MakeInt16WaveData(data));
}
Now, when I swooped from 200 to 100 Hz, the sound played from 200 to 0 hertz. For some reason I had to multiply the delta by 0.5 to get the correct output. What might be the issue here ? Is this an audio thing or is there a bug in my code ?
Thanks
Edit by TaW: I take the liberty to add screenshots of the data in a chart which illustrate the problem, the first is with the 0.5 factor, the 2nd with 1.0 and the 3rd & 4th with 2.0 and 5.0:
Edit: here is an example, a swoop from 200 to 100 hz:
Debug values:
Wave clearly does not end at 100 hz
Digging out my rusty math I think it may be because:
Going in L steps from frequency F1 to F2 you have a frequency of
Fi = F1 + i * ( F2 - F1 ) / L
or with
( F2 - F1 ) / L = S
Fi = F1 + i * S
Now to find out how far we have progressed we need the integral over i, which would be
I(Fi) = i * F1 + 1/2 * i ^ 2 * S
Which give or take resembles the term inside your formula for the sine.
Note that you can gain efficiency by moving the constant part (S/2) out of the loop..
Related
Essentially, I'm trying to write a piece of software that can display the pitch of any incoming audio. I have a C# console app, and I'm using NAudio as a way to take in audio. Here's the code I have to do that:
int SampleRate = 44100;
int BufferMS = 1000;
NAudio.Wave.WaveInEvent Wave = new()
{
DeviceNumber = 0,
WaveFormat = new NAudio.Wave.WaveFormat(SampleRate, 1),
BufferMilliseconds = BufferMS
};
Wave.DataAvailable += OnWaveInData;
Wave.StartRecording();
void OnWaveInData(object? sender, NAudio.Wave.WaveInEventArgs e)
{
// get the frequency of the audio
var freq = GetFrequency(e.Buffer, e.BytesRecorded);
// print freq in hz
Console.WriteLine($"\r{freq} Hz");
}
I used GitHub Copilot to help me write this next part, as I know I need to use some math to get a frequency out of this, but it seemed quite advanced.
double GetFrequency(byte[] buffer, int bytesRecorded)
{
double max = 0;
int maxN = 0;
for (int i = 0; i < bytesRecorded / 2; i++)
{
var real = buffer[2 * i];
var imag = buffer[2 * i + 1];
var magnitude = Math.Sqrt(real * real + imag * imag);
if (magnitude > max)
{
max = magnitude;
maxN = i;
}
}
double freqN = maxN;
if (maxN > 0 && maxN < bytesRecorded / 2 - 1)
{
var dL = buffer[2 * (maxN - 1)];
var dR = buffer[2 * (maxN + 1)];
freqN += 0.5 * (dR * dR - dL * dL) / (dR + dL - 2 * buffer[2 * maxN]);
}
return freqN * SampleRate / bytesRecorded;
}
It works somewhat okay, but the issue is the frequency reported can be quite sporadic and jump around which is not desireable. Should I just write a filter function to remove these "outliers", or am I approaching this completely incorrectly?
Thanks for any help!
I implemented your procedure but I get some weird values like 218525442.36 in the freqN variable after the second block of calculations in the GetFrequency procedure.
It is from the variables dL and dr that values of the type 1 560 501 425 are returned to me and I don't understand why
You have suggestions
Thank you
Mimmo
I'm writing a class which should calculate angles (in degrees), after long trying I don't get it to work.
When r = 45 & h = 0 sum should be around 77,14°
but it returns NaN - Why? I know it must be something with the Math.Atan-Method and a not valid value.
My code:
private int r = 0, h = 0;
public Schutzwinkelberechnung(int r, int h)
{
this.r = r;
this.h = h;
}
public double getSchutzwinkel()
{
double sum = 0;
sum = 180 / Math.PI * Math.Atan((Math.Sqrt(2 * h * r - (h * h)) - r * Math.Sin(Math.Asin((Math.Sqrt(2 * h * r)) / (2 * r)))) / (h - r + r * (Math.Cos(Math.Asin(Math.Sqrt(2 * h * r) / (2 * r))))));
return sum;
}
Does someone see my mistake? I got the formula from an excel sheet.
EDIT: Okay, my problem was that I had a parsing error while creating the object or getting the user input, apparently solved it by accident. Ofc I have to add a simple exception, as Nick Dechiara said. Thank you very much for the fast reply, I appreciate it.
EDIT2: The exception in my excel sheet is:
if(h < 2) {h = 2}
so that's explaining everything and I wasn't paying attention at all. Thanks again for all answers.
int r = 45, h = 2;
sum = 77.14°
A good approach to debugging these kinds of issues is to break the equation into smaller pieces, so it is easier to debug.
double r = 45;
double h = 0;
double sqrt2hr = Math.Sqrt(2 * h * r);
double asinsqrt2hr = Math.Asin((sqrt2hr) / (2 * r));
double a = (Math.Sqrt(2 * h * r - (h * h)) - r * Math.Sin(asinsqrt2hr));
double b = (h - r + r * (Math.Cos(asinsqrt2hr)));
double sum = 180 / Math.PI * Math.Atan(a / b);
Now if we put a breakpoint at sum and let the code run, we see that both a and b are equal to zero. This gives us a / b = 0 / 0 = NaN in the final line.
Now we can ask, why is this happening? Well in the case of b you have h - r + r which is 0 - 45 + 45, evaluates to 0, so b becomes 0. You probably have an error in your math there.
In the case of a, we have 2 * h * r - h * h, which also evaluates to 0.
You probably either A) have an error in your equation, or B) need to include a special case for when h = 0, as that is breaking your math here.
Definitely break up the expression something like
var a = Asin(Sqrt(2 * h * r) / (2 * r));
var b = Sqrt(2 * h * r - h * h) - r * Sin(a);
var c = h - r + r * Cos(a);
var sum = 180 / PI * Atan(b / c);
and you will find b=0 and c=0. You might consider changing the last expression into
var sum = 180 / PI * Atan2(b , c);
which will return a value when b=0 and c=0.
PS. Also, use using static System.Math; in the beginning of the code to shorten such math expressions.
I am about to program a visualizer with pretty good results. I have got an array with the size of 1500, with the magnitude of the frequencys in it. Now I want to convert this array in an array with 100 values. For example in the 1st index of the 2nd array should be the average of the first two values in the first array. On the 2nd index of the 2nd array should be the values of index 3-6. But i don't know how to calculate this properly. So how can I convert the first array into the second one?
I have found an answer in the rainmeter source code. Maybe it will now be clearer what I wanted to do here is the c# code:
To get an array with an specific length, log scaled with min. and max. frequencies.
private float[] getFrequencies(int min, int max, int nBands)
{
float[] returnVal = new float[nBands];
double step = (Math.Log(max / min) / nBands) / Math.Log(2.0);
returnVal[0] = (float)(min * Math.Pow(2.0, step / 2.0));
for (int iBand = 1; iBand < nBands; ++iBand)
{
returnVal[iBand] = (float)(returnVal[iBand - 1] * Math.Pow(2.0, step));
}
return returnVal;
}
And to fill the output array:
private double[] getLogArray(double[] data, int nBands, int minFreq, int maxFreq)
{
float[] bandFreq = getFrequencies(minFreq, maxFreq, nBands);
float df = (float)sampleRate / samples;
float scalar = 1.0f / sampleRate;
double[] bandOut = new double[nBands];
int iBin = 0;
int iBand = 0;
float f0 = 0.0f;
while (iBin <= (samples / 2) && iBand < nBands)
{
float fLin1 = ((float)iBin + 0.5f) * df;
float fLog1 = bandFreq[iBand];
float x = (float)data[iBin];
if (fLin1 <= fLog1)
{
bandOut[iBand] += (fLin1 - f0) * x * scalar;
f0 = fLin1;
iBin += 1;
}
else
{
bandOut[iBand] += (fLog1 - f0) * x * scalar;
f0 = fLog1;
iBand += 1;
}
}
return bandOut;
}
Have a nice day and sorry for the late response.
I am a newcomer to the sound programming. I have a real-time sound visualizer(http://www.codeproject.com/Articles/20025/Sound-visualizer-in-C). I downloaded it from codeproject.com.
In AudioFrame.cs class there is an array as below:
_fftLeft = FourierTransform.FFTDb(ref _waveLeft);
_fftLeft is a double array. _waveLeft is also a double array. As above they applied
FouriorTransform.cs class's FFTDb function to a _waveLeft array.
Here is FFTDb function:
static public double[] FFTDb(ref double[] x)
{
n = x.Length;
nu = (int)(Math.Log(n) / Math.Log(2));
int n2 = n / 2;
int nu1 = nu - 1;
double[] xre = new double[n];
double[] xim = new double[n];
double[] decibel = new double[n2];
double tr, ti, p, arg, c, s;
for (int i = 0; i < n; i++)
{
xre[i] = x[i];
xim[i] = 0.0f;
}
int k = 0;
for (int l = 1; l <= nu; l++)
{
while (k < n)
{
for (int i = 1; i <= n2; i++)
{
p = BitReverse(k >> nu1);
arg = 2 * (double)Math.PI * p / n;
c = (double)Math.Cos(arg);
s = (double)Math.Sin(arg);
tr = xre[k + n2] * c + xim[k + n2] * s;
ti = xim[k + n2] * c - xre[k + n2] * s;
xre[k + n2] = xre[k] - tr;
xim[k + n2] = xim[k] - ti;
xre[k] += tr;
xim[k] += ti;
k++;
}
k += n2;
}
k = 0;
nu1--;
n2 = n2 / 2;
}
k = 0;
int r;
while (k < n)
{
r = BitReverse(k);
if (r > k)
{
tr = xre[k];
ti = xim[k];
xre[k] = xre[r];
xim[k] = xim[r];
xre[r] = tr;
xim[r] = ti;
}
k++;
}
for (int i = 0; i < n / 2; i++)
decibel[i] = 10.0 * Math.Log10((float)(Math.Sqrt((xre[i] * xre[i]) + (xim[i] * xim[i]))));
return decibel;
}
When I play a music note in a guitar i wanted to know it's frequency in a numerical format. I wrote a foreach loop to know what is the output of a _fftLeft array as below,
foreach (double myarray in _fftLeft)
{
Console.WriteLine(myarray );
}
This output's contain lots of real-time values as below .
41.3672743963389
,43.0176034462662,
35.3677383746087,
42.5968946936404,
42.0600935794783,
36.7521669642071,
41.6356709559342,
41.7189032845742,
41.1002451261724,
40.8035583510188,
45.604366914128,
39.645552593115
I want to know what are those values (frequencies or not)? if the answer is frequencies then why it contains low frequency values? And when I play a guitar note I want to detect a frequency of that particular guitar note.
Based on the posted code, FFTDb first computes the FFT then computes and returns the magnitudes of the frequency spectrum in the logarithmic decibels scale. In other words, the _fftLeft then contains magnitudes for a discreet set of frequencies. The actual values of those frequencies can be computed using the array index and sampling frequency according to this answer.
As an example, if you were plotting the _fftLeft output for a pure sinusoidal tone input you should be able to see a clear spike in the index corresponding to the sinusoidal frequency. For a guitar note however you are likely going to see multiple spikes in magnitude corresponding to the harmonics. To detect the note's frequency aka pitch is a more complicated topic and typically requires the use of one of several pitch detection algorithms.
I have run this FFT algorithm on a 440Hz audio file. But I get an unexpected sound frequency: 510Hz.
Is the byteArray containing .wav correctly converted into 2 double arrays (Re & Im parts)? The imaginary array contains only 0.
I assume that the highest sound frequency is the maximum of xRe array: please look at the very end of the run() function? Maybe that is my mistake: is it average or something like that?
What is the problem then?
UPDATED: The biggest sum Re+Im is at index = 0 so I get frequency = 0;
Whole project: contains .wav -> just open and run.
using System;
using System.Net;
using System.IO;
namespace FFT {
/**
* Performs an in-place complex FFT.
*
* Released under the MIT License
*
* Copyright (c) 2010 Gerald T. Beauregard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to
* deal in the Software without restriction, including without limitation the
* rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
* sell copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
public class FFT2 {
// Element for linked list in which we store the
// input/output data. We use a linked list because
// for sequential access it's faster than array index.
class FFTElement {
public double re = 0.0; // Real component
public double im = 0.0; // Imaginary component
public FFTElement next; // Next element in linked list
public uint revTgt; // Target position post bit-reversal
}
private static int sampleRate;
private uint m_logN = 0; // log2 of FFT size
private uint m_N = 0; // FFT size
private FFTElement[] m_X; // Vector of linked list elements
/**
*
*/
public FFT2() {
}
/**
* Initialize class to perform FFT of specified size.
*
* #param logN Log2 of FFT length. e.g. for 512 pt FFT, logN = 9.
*/
public void init(uint logN) {
m_logN = logN;
m_N = (uint)(1 << (int)m_logN);
// Allocate elements for linked list of complex numbers.
m_X = new FFTElement[m_N];
for (uint k = 0; k < m_N; k++)
m_X[k] = new FFTElement();
// Set up "next" pointers.
for (uint k = 0; k < m_N - 1; k++)
m_X[k].next = m_X[k + 1];
// Specify target for bit reversal re-ordering.
for (uint k = 0; k < m_N; k++)
m_X[k].revTgt = BitReverse(k, logN);
}
/**
* Performs in-place complex FFT.
*
* #param xRe Real part of input/output
* #param xIm Imaginary part of input/output
* #param inverse If true, do an inverse FFT
*/
public void run(double[] xRe, double[] xIm, bool inverse = false) {
uint numFlies = m_N >> 1; // Number of butterflies per sub-FFT
uint span = m_N >> 1; // Width of the butterfly
uint spacing = m_N; // Distance between start of sub-FFTs
uint wIndexStep = 1; // Increment for twiddle table index
// Copy data into linked complex number objects
// If it's an IFFT, we divide by N while we're at it
FFTElement x = m_X[0];
uint k = 0;
double scale = inverse ? 1.0 / m_N : 1.0;
while (x != null) {
x.re = scale * xRe[k];
x.im = scale * xIm[k];
x = x.next;
k++;
}
// For each stage of the FFT
for (uint stage = 0; stage < m_logN; stage++) {
// Compute a multiplier factor for the "twiddle factors".
// The twiddle factors are complex unit vectors spaced at
// regular angular intervals. The angle by which the twiddle
// factor advances depends on the FFT stage. In many FFT
// implementations the twiddle factors are cached, but because
// array lookup is relatively slow in C#, it's just
// as fast to compute them on the fly.
double wAngleInc = wIndexStep * 2.0 * Math.PI / m_N;
if (inverse == false)
wAngleInc *= -1;
double wMulRe = Math.Cos(wAngleInc);
double wMulIm = Math.Sin(wAngleInc);
for (uint start = 0; start < m_N; start += spacing) {
FFTElement xTop = m_X[start];
FFTElement xBot = m_X[start + span];
double wRe = 1.0;
double wIm = 0.0;
// For each butterfly in this stage
for (uint flyCount = 0; flyCount < numFlies; ++flyCount) {
// Get the top & bottom values
double xTopRe = xTop.re;
double xTopIm = xTop.im;
double xBotRe = xBot.re;
double xBotIm = xBot.im;
// Top branch of butterfly has addition
xTop.re = xTopRe + xBotRe;
xTop.im = xTopIm + xBotIm;
// Bottom branch of butterly has subtraction,
// followed by multiplication by twiddle factor
xBotRe = xTopRe - xBotRe;
xBotIm = xTopIm - xBotIm;
xBot.re = xBotRe * wRe - xBotIm * wIm;
xBot.im = xBotRe * wIm + xBotIm * wRe;
// Advance butterfly to next top & bottom positions
xTop = xTop.next;
xBot = xBot.next;
// Update the twiddle factor, via complex multiply
// by unit vector with the appropriate angle
// (wRe + j wIm) = (wRe + j wIm) x (wMulRe + j wMulIm)
double tRe = wRe;
wRe = wRe * wMulRe - wIm * wMulIm;
wIm = tRe * wMulIm + wIm * wMulRe;
}
}
numFlies >>= 1; // Divide by 2 by right shift
span >>= 1;
spacing >>= 1;
wIndexStep <<= 1; // Multiply by 2 by left shift
}
// The algorithm leaves the result in a scrambled order.
// Unscramble while copying values from the complex
// linked list elements back to the input/output vectors.
x = m_X[0];
while (x != null) {
uint target = x.revTgt;
xRe[target] = x.re;
xIm[target] = x.im;
x = x.next;
}
//looking for max IS THIS IS FREQUENCY
double max = 0, index = 0;
for (int i = 0; i < xRe.Length; i++) {
if (xRe[i] + xIm[i] > max) {
max = xRe[i]*xRe[i] + xIm[i]*xIm[i];
index = i;
}
}
max = Math.Sqrt(max);
/* if the peak is at bin index i then the corresponding
frequency will be i * Fs / N whe Fs is the sample rate in Hz and N is the FFT size.*/
//DONT KNOW WHY THE BIGGEST VALUE IS IN THE BEGINNING
Console.WriteLine("max "+ max+" index " + index + " m_logN" + m_logN + " " + xRe[0]);
max = index * sampleRate / m_logN;
Console.WriteLine("max " + max);
}
/**
* Do bit reversal of specified number of places of an int
* For example, 1101 bit-reversed is 1011
*
* #param x Number to be bit-reverse.
* #param numBits Number of bits in the number.
*/
private uint BitReverse(
uint x,
uint numBits) {
uint y = 0;
for (uint i = 0; i < numBits; i++) {
y <<= 1;
y |= x & 0x0001;
x >>= 1;
}
return y;
}
public static void Main(String[] args) {
// BinaryReader reader = new BinaryReader(System.IO.File.OpenRead(#"C:\Users\Duke\Desktop\e.wav"));
BinaryReader reader = new BinaryReader(File.Open(#"440.wav", FileMode.Open));
//Read the wave file header from the buffer.
int chunkID = reader.ReadInt32();
int fileSize = reader.ReadInt32();
int riffType = reader.ReadInt32();
int fmtID = reader.ReadInt32();
int fmtSize = reader.ReadInt32();
int fmtCode = reader.ReadInt16();
int channels = reader.ReadInt16();
sampleRate = reader.ReadInt32();
int fmtAvgBPS = reader.ReadInt32();
int fmtBlockAlign = reader.ReadInt16();
int bitDepth = reader.ReadInt16();
if (fmtSize == 18) {
// Read any extra values
int fmtExtraSize = reader.ReadInt16();
reader.ReadBytes(fmtExtraSize);
}
int dataID = reader.ReadInt32();
int dataSize = reader.ReadInt32();
// Store the audio data of the wave file to a byte array.
byte[] byteArray = reader.ReadBytes(dataSize);
/* for (int i = 0; i < byteArray.Length; i++) {
Console.Write(byteArray[i] + " ");
}*/
byte[] data = byteArray;
double[] arrRe = new double[data.Length];
for (int i = 0; i < arrRe.Length; i++) {
arrRe[i] = data[i] / 32768.0;
}
double[] arrI = new double[data.Length];
for (int i = 0; i < arrRe.Length; i++) {
arrI[i] = 0;
}
/**
* Initialize class to perform FFT of specified size.
*
* #param logN Log2 of FFT length. e.g. for 512 pt FFT, logN = 9.
*/
Console.WriteLine();
FFT2 fft2 = new FFT2();
uint logN = (uint)Math.Log(data.Length, 2);
fft2.init(logN);
fft2.run(arrRe, arrI);
// After this you have to split that byte array for each channel (Left,Right)
// Wav supports many channels, so you have to read channel from header
Console.ReadLine();
}
}
}
There are a few things that you need to address:
you're not applying a window function prior to the FFT - this will result in spectral leakage in the general case and you may get misleading results, particularly when looking for peaks, as there will be "smearing" of the spectrum.
when looking for peaks you should be looking at the magnitude of FFT output bins, not the individual real and imaginary parts - magnitude = sqrt(re^2 +im^2) (although you don't need to worry about the sqrt if you're just looking for peaks).
having identified a peak you need to convert the bin index into a frequency - if the peak is at bin index i then the corresponding frequency will be i * Fs / N where Fs is the sample rate in Hz and N is the FFT size.
for a real-to-complex FFT you can ignore the second N / 2 output bins as they are just the complex conjugate mirror image of the first N / 2 bins
(See also this answer for fuller explanations of the above.)