DirectSound logarithmic volume to linear volume slider - C#

I am developing a music player with DirectX.DirectSound. I have a problem with the volume: the DirectSound volume scale is logarithmic. This means that at low volumes it is much more sensitive to small variations in amplitude than at high volumes. It also means that a linear volume slider produces a logarithmic sensation of volume changes, and that just doesn't feel right. My question is: how can I make it feel linear?
My code so far is:
if (trkBalance.Value == trkBalance.Minimum)
{
    foreGroundSound.Volume = (int)DS.Volume.Min;
}
else if (trkBalance.Value == trkBalance.Maximum)
{
    foreGroundSound.Volume = (int)DS.Volume.Max;
}
else
{
    foreGroundSound.Volume = (int)(-5000 * Math.Log10(100 - trkBalance.Value));
}

There is a rule of thumb for perceived loudness:
A difference of 10 dB (call it doubleValue) results in a sound that is perceived as twice or half as loud as the original source.
With that in mind, we can create a formula that maps slider position to attenuation.
But first we have to calculate the minimum attenuation (as a fraction). DirectSound can attenuate a sound by 100 dB, which is an attenuation factor of 1/2^(100/doubleValue). This is the value for the minimum trackbar position. The maximum value is 1 (no change). So overall:
doubleValue = 10
minimumAttenuation = 1 / 2^(100 / doubleValue)
attenuation = minimumAttenuation + trkBalance.Value / 100 * (1 - minimumAttenuation)
Now we have a value within the valid range. Next we need to find the sound pressure level (in dB) for this attenuation.
We know that the loudness doubles every 10 dB (doubleValue):
attenuation = 2^(db / doubleValue)              // take ln of both sides
ln(attenuation) = db / doubleValue * ln(2)
db = doubleValue * ln(attenuation) / ln(2)
And since DirectSound takes hundredths of a dB, you can use
foreGroundSound.Volume = (int)(db * 100);
Those are just some theoretical thoughts based on Wikipedia information. It might or might not work; just try it.
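Putting the steps together, a minimal C# sketch of that mapping might look like this (untested; the method name and the 0..100 slider range are assumptions to match the trackbar above):

private static int SliderToDirectSoundVolume(int sliderValue)
{
    const double doubleValue = 10.0;                            // +10 dB is roughly twice as loud
    double minAttenuation = Math.Pow(2, -100.0 / doubleValue);  // 1/1024 at -100 dB
    // Map the linear slider onto the attenuation fraction [minAttenuation, 1].
    double attenuation = minAttenuation
        + sliderValue / 100.0 * (1.0 - minAttenuation);
    // Convert the fraction back to decibels: db = doubleValue * log2(attenuation).
    double db = doubleValue * Math.Log(attenuation) / Math.Log(2.0);
    return (int)(db * 100.0);                                   // DirectSound wants hundredths of a dB
}

A quick sanity check: sliderValue = 100 gives attenuation 1 and volume 0 (full), while sliderValue = 0 gives attenuation 1/1024 and volume -10000 (-100 dB, silence).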

Related

How to set a Mixer's volume to a slider's volume in Unity?

I'm trying to make some audio settings. Here is my script:
public AudioMixer masterMixer;
// The Slider fields below are referenced by the methods but were missing
// from the original snippet (UnityEngine.UI.Slider):
public Slider masterVolumeSlider;
public Slider musicVolumeSlider;
public Slider sfxVolumeSlider;
public float masterLvl;
public float musicLvl;
public float sfxLvl;

public void SetMasterVolume()
{
    masterLvl = masterVolumeSlider.value;
    masterMixer.SetFloat("masterVol", masterLvl);
}

public void SetMusicVolume()
{
    musicLvl = musicVolumeSlider.value;
    masterMixer.SetFloat("musicVol", musicLvl);
}

public void SetSfxVolume()
{
    sfxLvl = sfxVolumeSlider.value;
    masterMixer.SetFloat("sfxVol", sfxLvl);
}
It has all the OnValueChanged() handlers hooked up on the sliders. I just want to know why this doesn't work. Thanks.
EDIT: So the thing is that it changes the dB, not the perceived volume. The new question is: how do I make it change the volume instead of the dB?
EDIT 2: (screenshot omitted)
I think your problem is that the mixer value (-80 dB to +20 dB) is on a logarithmic scale while the slider value is linear. For example, half volume is actually about -10 dB, but if you connect it to a linear scale like the slider, then half volume ends up being -40 dB, which is why it sounds basically silent at that point.
There's an easy way to fix it:
Instead of setting the slider min / max values to -80 and 20, set them to min 0.0001 and max 1.
In the script to set the value of the exposed parameter, use this to convert the linear value to an attenuation level:
masterMixer.SetFloat("musicVol", Mathf.Log10(musicVolumeSlider.value) * 20);
It's important to set the min value to 0.0001; otherwise dropping it all the way to zero breaks the calculation and puts the volume back up again.
Post:
https://forum.unity.com/threads/changing-audio-mixer-group-volume-with-ui-slider.297884/
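As a complete sketch of that approach (the class and field names are assumptions; the exposed parameter "musicVol" matches the question's script):

using UnityEngine;
using UnityEngine.Audio;
using UnityEngine.UI;

public class VolumeControl : MonoBehaviour
{
    public AudioMixer masterMixer;
    public Slider volumeSlider; // min = 0.0001, max = 1 in the inspector

    public void SetMusicVolume()
    {
        // log10(1) * 20 = 0 dB (full volume); log10(0.0001) * 20 = -80 dB (silence).
        masterMixer.SetFloat("musicVol", Mathf.Log10(volumeSlider.value) * 20f);
    }
}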
Here's a formula that sounds subjectively closer to how a linear volume control usually behaves:
// zeroVolume is the dB floor treated as silence; -80f matches the mixer's minimum.
// (This field was not declared in the original snippet.)
private const float zeroVolume = -80f;

private float ValueToVolume(float value, float maxVolume)
{
    return Mathf.Log10(Mathf.Clamp(value, 0.0001f, 1f)) * (maxVolume - zeroVolume) / 4f + maxVolume;
}
maxVolume can be used if your AudioMixerGroup's "max" volume is not 0 dB but, say, -20 dB.
You will have to deal with dB to set the volume of a mixer. Set your slider's lower limit to -80 and upper limit to 20 and it will work fine with the mixer. If you do not want to deal with dB, you can change the volume of the AudioListener or the AudioSource instead.
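For example, an AudioSource can be driven by the slider directly (a sketch; the field names are assumptions):

// AudioSource.volume is linear gain in [0, 1]; no dB conversion required.
audioSource.volume = volumeSlider.value;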

GPS lap & Segment timer

I've been searching for a while but haven't found exactly what I'm looking for.
I'm working on an app that will go in a race car. It will give the driver the ability to press a button to mark a Start/Finish line. It will also have a button to allow a driver to set segment times.
Keep in mind a track can be an oval, which I'm working on first. It could be a road course, or it could be an autocross course where the start and finish line aren't in the exact same location. They could be within 50 feet of each other or so, but the car never crosses where it starts.
I have my GPS data coming in, and I convert the NMEA messages to my classes and store Lat, Lon, Speed, Course, etc. In my research I've run across this article, which is interesting. The GPS will be mounted outside the roof for better signal. It generates 10 hits per second (Garmin Glo).
http://www.drdobbs.com/windows/gps-programming-net/184405690?pgno=1
It's old, but it talks about UTM and the Cartesian coordinate system. So using DecDeg2UTM, I convert Lat & Lon to X & Y coordinates as well.
I've also been trying to use the intersect formula I found here, which I took and tried to convert to C# (I'll post it at the end). However, feeding it coordinates of an oval track, it doesn't seem to be working. Also, I'm not sure exactly what it's supposed to be doing, but the coordinates it returns are something like -35.xxx & 98.xxxx, which is out in an ocean somewhere, thousands of miles from the track.
I'm looking for answers to the following.
I assume I need to take the location recorded when a button is pressed for Start/Finish or Segment and calculate a line perpendicular to the direction the car is traveling, so that I can do some sort of line intersection calculation. The Cartesian coordinates seem to give the bearing fairly well. But the question here is: how do you get the "left and right" coordinates of that line? Also, keep in mind an oval track may be 60 feet wide, while, as mentioned, an autocross track may only be 20 ft wide, and parts of the track may be within 50 ft of each other. Note I'm fine with requiring the car to be going slowly or stopped at the points to get an accurate coordinate; on some tracks they will have to be set while walking the track.
Based on this, should I be using decimal lat/lon, or would the Cartesian coordinate system based on UTM be a more accurate method for what I'm trying to do?
Either way, is there a .NET library, or a C-based library with source code, that has methods for making these calculations?
How can this be handled accurately? (I'm not that great with math; links to code samples would help tremendously.)
Next, after I have the lines (or whatever is needed) for start/finish and segments: as I get GPS hits from the car racing, I need to figure out the most accurate way to tell when the car has crossed them. Again, if I'm lucky I'll get 10 hits per second, but it will probably be lower, and vehicle speeds can vary significantly depending on the type of track and vehicle. So a GPS hit could be many feet left or right of a segment line, and it could be many feet before or after it.
Again, if there is a GIS library out there that I can feed coordinates and have all this calculated, that would work as well, as long as it's performant. If not, I'm trying to decide whether it's best to break the coordinates down to X/Y or to use geometry formulas on coordinates in decimal degrees. Mods: I assume there is hard data to support an answer either way, so responses shouldn't be purely subjective opinions.
Here is the C# code I came up with from the script page above. I'm starting to feel UTM and the Cartesian coordinate system would be better for accuracy and performance, but again, I'm open to evidence to the contrary if it exists.
Thanks
P.S. GeoCoordinate is from the .NET System.Device.Location assembly. GpsData is just a class I use to convert NMEA messages into Lat, Lon, Course, NumSats, DateTime, etc.
The degree/radian methods are extension methods, as follows:
public static class AngleExtensions // extension methods need a static class; the wrapper was missing from the snippet
{
    public static double DegreeToRadians(this double angle)
    {
        return Math.PI * angle / 180.0;
    }

    public static double RadianToDegree(this double angle)
    {
        return angle * (180.0 / Math.PI);
    }
}
public static GeoCoordinate CalculateIntersection(GpsData p1, double brng1, GpsData p2, double brng2)
{
    // see http://williams.best.vwh.net/avform.htm#Intersection
    double _p1LatRadians = p1.Latitude.DegreeToRadians();
    double _p1LonRadians = p1.Longitude.DegreeToRadians();
    double _p2LatRadians = p2.Latitude.DegreeToRadians();
    double _p2LonRadians = p2.Longitude.DegreeToRadians();
    double _brng1Radians = brng1.DegreeToRadians();
    double _brng2Radians = brng2.DegreeToRadians();
    double _deltaLat = _p2LatRadians - _p1LatRadians;
    double _deltaLon = _p2LonRadians - _p1LonRadians;

    // angular distance between p1 and p2 (haversine)
    var _dist12 = 2 * Math.Asin(Math.Sqrt(Math.Sin(_deltaLat / 2) * Math.Sin(_deltaLat / 2)
        + Math.Cos(_p1LatRadians) * Math.Cos(_p2LatRadians) * Math.Sin(_deltaLon / 2) * Math.Sin(_deltaLon / 2)));
    if (_dist12 == 0) return null;

    // initial/final bearings between the two points
    var _brngA = Math.Acos((Math.Sin(_p2LatRadians) - Math.Sin(_p1LatRadians) * Math.Cos(_dist12))
        / (Math.Sin(_dist12) * Math.Cos(_p1LatRadians)));
    if (double.IsNaN(_brngA)) _brngA = 0; // protect against rounding
    var _brngB = Math.Acos((Math.Sin(_p1LatRadians) - Math.Sin(_p2LatRadians) * Math.Cos(_dist12))
        / (Math.Sin(_dist12) * Math.Cos(_p2LatRadians)));

    var θ12 = Math.Sin(_deltaLon) > 0 ? _brngA : 2 * Math.PI - _brngA;
    var θ21 = Math.Sin(_deltaLon) > 0 ? 2 * Math.PI - _brngB : _brngB;

    var α1 = (_brng1Radians - θ12 + Math.PI) % (2 * Math.PI) - Math.PI; // angle 2-1-3
    var α2 = (θ21 - _brng2Radians + Math.PI) % (2 * Math.PI) - Math.PI; // angle 1-2-3

    if (Math.Sin(α1) == 0 && Math.Sin(α2) == 0) return null; // infinite intersections
    if (Math.Sin(α1) * Math.Sin(α2) < 0) return null;        // ambiguous intersection

    α1 = Math.Abs(α1);
    α2 = Math.Abs(α2);
    // ... Ed Williams takes abs of α1/α2, but that seems to break the calculation?

    var α3 = Math.Acos(-Math.Cos(α1) * Math.Cos(α2) + Math.Sin(α1) * Math.Sin(α2) * Math.Cos(_dist12));
    var δ13 = Math.Atan2(Math.Sin(_dist12) * Math.Sin(α1) * Math.Sin(α2),
        Math.Cos(α2) + Math.Cos(α1) * Math.Cos(α3));

    var _lat3Radians = Math.Asin(Math.Sin(_p1LatRadians) * Math.Cos(δ13)
        + Math.Cos(_p1LatRadians) * Math.Sin(δ13) * Math.Cos(_brng1Radians));
    var _deltaLon13 = Math.Atan2(Math.Sin(_brng1Radians) * Math.Sin(δ13) * Math.Cos(_p1LatRadians),
        Math.Cos(δ13) - Math.Sin(_p1LatRadians) * Math.Sin(_lat3Radians));
    var _lon3Radians = _p1LonRadians + _deltaLon13;

    var _returnLat = _lat3Radians.RadianToDegree();
    var _returnLon = (_lon3Radians.RadianToDegree() + 540) % 360 - 180; // normalise to -180..+180°
    return new GeoCoordinate(_returnLat, _returnLon);
}
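For what it's worth, once the marked point and the GPS hits are converted to UTM (metres on a locally flat plane, a good approximation over a track-sized area), both open questions reduce to simple 2D geometry: build a gate segment perpendicular to the car's course at the marked point, then test whether the segment between two consecutive GPS hits crosses it. Below is a sketch of that approach; Point2D, MakeGate, CrossesGate, and the half-width parameter are illustrative names, not part of any library.

public struct Point2D
{
    public double X, Y; // UTM easting/northing in metres
    public Point2D(double x, double y) { X = x; Y = y; }
}

// Build a gate of the given half-width, perpendicular to the course
// (course in degrees, 0 = north, 90 = east) at the marked point.
public static (Point2D Left, Point2D Right) MakeGate(
    Point2D mark, double courseDegrees, double halfWidthMetres)
{
    double courseRad = courseDegrees * Math.PI / 180.0;
    // Direction of travel in UTM: east = sin(course), north = cos(course).
    // Rotating that by 90 degrees gives the perpendicular (cos(course), -sin(course)).
    double px = Math.Cos(courseRad), py = -Math.Sin(courseRad);
    return (new Point2D(mark.X - px * halfWidthMetres, mark.Y - py * halfWidthMetres),
            new Point2D(mark.X + px * halfWidthMetres, mark.Y + py * halfWidthMetres));
}

// Standard 2D segment-segment intersection: returns true if the segment
// from the previous hit to the current hit crosses the gate.
public static bool CrossesGate(Point2D prevFix, Point2D currFix,
                               Point2D gateLeft, Point2D gateRight)
{
    double Cross(Point2D o, Point2D a, Point2D b) =>
        (a.X - o.X) * (b.Y - o.Y) - (a.Y - o.Y) * (b.X - o.X);

    double d1 = Cross(gateLeft, gateRight, prevFix);
    double d2 = Cross(gateLeft, gateRight, currFix);
    double d3 = Cross(prevFix, currFix, gateLeft);
    double d4 = Cross(prevFix, currFix, gateRight);
    // Endpoints on opposite sides of each other's supporting line => the segments cross.
    return d1 * d2 < 0 && d3 * d4 < 0;
}

With hits arriving at up to 10 Hz, the crossing moment can then be interpolated between the two hits' timestamps in proportion to where the intersection splits the prev-to-current segment.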

How to get the fundamental frequency using Harmonic Product Spectrum?

I'm trying to get the pitch from the microphone input. First I decompose the signal from the time domain to the frequency domain through an FFT, applying a Hamming window to the signal beforehand. That gives me the complex FFT results. I then pass the results to a Harmonic Product Spectrum, where they get downsampled and the downsampled spectra multiplied together, giving a value as a complex number. What should I do next to get the fundamental frequency?
public float[] HarmonicProductSpectrum(Complex[] data)
{
    Complex[] hps2 = Downsample(data, 2);
    Complex[] hps3 = Downsample(data, 3);
    Complex[] hps4 = Downsample(data, 4);
    Complex[] hps5 = Downsample(data, 5);
    float[] array = new float[hps5.Length];
    for (int i = 0; i < array.Length; i++)
    {
        checked
        {
            array[i] = data[i].X * hps2[i].X * hps3[i].X * hps4[i].X * hps5[i].X;
        }
    }
    return array;
}

public Complex[] Downsample(Complex[] data, int n)
{
    Complex[] array = new Complex[Convert.ToInt32(Math.Ceiling(data.Length * 1.0 / n))];
    for (int i = 0; i < array.Length; i++)
    {
        array[i].X = data[i * n].X;
    }
    return array;
}
I have tried to get the magnitude using,
magnitude[i] = (float)Math.Sqrt(array[i] * array[i] + (data[i].Y * data[i].Y));
inside the for loop in HarmonicProductSpectrum method. Then tried to get the maximum bin using,
float max_mag = float.MinValue;
float max_index = -1;
for (int i = 0; i < array.Length / 2; i++)
{
    if (magnitude[i] > max_mag)
    {
        max_mag = magnitude[i];
        max_index = i;
    }
}
and then I tried to get the frequency using,
var frequency = max_index * 44100 / 1024;
But I was getting garbage values like 1248.926, 1205.859, 2454.785 for the A4 note (440 Hz), and those values don't look like harmonics of A4.
Any help would be greatly appreciated.
I implemented harmonic product spectrum in Python to make sure your data and algorithm were working nicely.
Here's what I see when applying harmonic product spectrum to the full dataset, Hamming-windowed, with five downsample-multiply stages (plot omitted): it shows just the bottom kilohertz, and the spectrum is pretty much dead above 1 kHz.
If I chunk the long audio clip into 8192-sample chunks (with 4096-sample, 50% overlap), Hamming-window each chunk, and run HPS on it, I get a matrix of HPS results (plot omitted), kind of a movie of the HPS spectrum over the entire dataset. The fundamental frequency seems to be quite stable.
The full source code is here; there's a lot of code that helps chunk the data and visualize the output of HPS running on the chunks, but the core HPS function, starting at def hps(..., is short. It does have a couple of tricks in it, though.
Given the strange frequencies that you're finding the peak at, it could be that you're operating on the full spectrum, from 0 to 44.1 kHz? You want to keep only the "positive" frequencies, i.e., from 0 to 22.05 kHz, and apply the HPS algorithm (downsample and multiply) to that.
But assuming you start out with a positive-frequency-only spectrum and take its magnitude properly, it looks like you should get reasonable results. Try saving the output of your HarmonicProductSpectrum to see if it's anything like the above.
Again, the full source code is at https://gist.github.com/fasiha/957035272009eb1c9eb370936a6af2eb. (There I try out a couple of other spectral estimators, Welch's method from SciPy and my port of the Blackman-Tukey spectral estimator. I'm not sure if you are set on implementing HPS or whether you would consider other pitch estimators, so I'm leaving the Welch/Blackman-Tukey results there.)
Original answer: I wrote this as a comment but had to keep revising it because it was confusing, so here it is as a mini-answer.
Based on my brief reading of this intro to HPS, I don’t think you’re taking the magnitudes correctly after you find the four decimated responses.
You want:
array[i] = sqrt(data[i] * Complex.Conjugate(data[i]) *
                hps2[i] * Complex.Conjugate(hps2[i]) *
                hps3[i] * Complex.Conjugate(hps3[i]) *
                hps4[i] * Complex.Conjugate(hps4[i]) *
                hps5[i] * Complex.Conjugate(hps5[i])).X;
This uses the sqrt(x * Complex.Conjugate(x)) trick to find x's magnitude, and then multiplies all five magnitudes.
(Actually, it moves the sqrt outside the product, so you only do one sqrt, which saves some time but gives the same result. So maybe that's another trick.)
Final trick: it takes the result's real part because sometimes, due to floating-point accuracy issues, a tiny imaginary component like 1e-15 survives.
After you do this, array should contain just real floats, and you can apply the max-bin-finding.
If there’s no Conjugate method, then the old-fashioned way should work:
public float mag2(Complex c) { return c.X * c.X + c.Y * c.Y; }

// in HarmonicProductSpectrum:
array[i] = (float)Math.Sqrt(mag2(data[i]) * mag2(hps2[i]) * mag2(hps3[i]) * mag2(hps4[i]) * mag2(hps5[i]));

There are algebraic flaws with the two approaches you suggested in the comments below, but the above should be correct. I'm not sure what C# does when you assign a Complex to a float; maybe it uses the real component? I'd have thought that would be a compiler error. With the above code, you're doing the right thing with the complex data and only assigning a float to array[i].
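Putting those pieces together with the original method, a corrected HarmonicProductSpectrum might look like this (a sketch, assuming the same Complex type with float X/Y fields and an input spectrum that already contains only the positive-frequency half):

public float[] HarmonicProductSpectrum(Complex[] data)
{
    Complex[] hps2 = Downsample(data, 2);
    Complex[] hps3 = Downsample(data, 3);
    Complex[] hps4 = Downsample(data, 4);
    Complex[] hps5 = Downsample(data, 5);

    // Squared magnitude keeps everything real; accumulate in double to
    // avoid float overflow, then take a single sqrt at the end.
    double Mag2(Complex c) => (double)c.X * c.X + (double)c.Y * c.Y;

    float[] array = new float[hps5.Length];
    for (int i = 0; i < array.Length; i++)
    {
        array[i] = (float)Math.Sqrt(Mag2(data[i]) * Mag2(hps2[i])
            * Mag2(hps3[i]) * Mag2(hps4[i]) * Mag2(hps5[i]));
    }
    return array;
}

The peak bin of the returned array then converts to Hz exactly as before, maxIndex * sampleRate / fftSize, where fftSize is the full FFT length (1024 in the question's setup).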
To get a pitch estimate, you have to divide your summed bin frequency estimate by the downsampling ratio used for that sum.
Added: You should also sum the magnitudes (abs()), not take the magnitude of the complex sum.
But the harmonic product spectrum algorithm (HPS), especially when using only integer downsampling ratios, doesn't usually provide better pitch estimation resolution. Instead, it provides a more robust rough pitch estimate (less likely to be fooled by a harmonic) than a single bare FFT magnitude peak, for overtone-rich timbres that have weak or missing fundamental spectral content.
If you know how to downsample a spectrum by fractional ratios (using interpolation, etc.), you can try finer-grained downsampling to get a better pitch estimate out of HPS. Or you can use the HPS result to narrow the frequency range in which to search with another pitch or frequency estimation method.
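For the fractional-ratio idea, a minimal sketch of downsampling a magnitude spectrum by linear interpolation might look like this (my own helper, not a library routine; it mirrors the integer Downsample above but accepts any ratio r > 1):

// Downsample a magnitude spectrum by a fractional ratio r using linear
// interpolation: output bin i reads the input at position i * r.
public static float[] DownsampleFractional(float[] spectrum, double r)
{
    int length = (int)Math.Ceiling(spectrum.Length / r);
    float[] result = new float[length];
    for (int i = 0; i < length; i++)
    {
        double pos = i * r;
        int k = (int)pos;
        double frac = pos - k;
        float next = k + 1 < spectrum.Length ? spectrum[k + 1] : spectrum[k];
        result[i] = (float)((1 - frac) * spectrum[k] + frac * next);
    }
    return result;
}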

Synthesizer Slide from One Frequency to Another

I'm writing a synthesizer in C# using NAudio. I'm trying to make it slide smoothly between frequencies, but I have a feeling I'm not understanding something about the math involved: it slides wildly at a high pitch before settling on the correct next pitch.
What's the mathematically correct way to slide from one pitch to another?
Here's the code:
public override int Read(float[] buffer, int offset, int sampleCount)
{
    int sampleRate = WaveFormat.SampleRate;
    for (int n = 0; n < sampleCount; n++)
    {
        if (nextFrequencyQueue.Count > 0)
        {
            nextFrequency = nextFrequencyQueue.Dequeue();
        }
        if (nextFrequency > 0 && Frequency != nextFrequency)
        {
            if (Frequency == 0) // special case for first note
            {
                Frequency = nextFrequency;
            }
            else // slide up or down to next frequency
            {
                if (Frequency < nextFrequency)
                {
                    Frequency = Clamp(Frequency + frequencyStep, nextFrequency, Frequency);
                }
                if (Frequency > nextFrequency)
                {
                    Frequency = Clamp(Frequency - frequencyStep, Frequency, nextFrequency);
                }
            }
        }
        buffer[n + offset] = (float)(Amplitude * Math.Sin(2 * Math.PI * time * Frequency));
        try
        {
            time += (double)1 / (double)sampleRate;
        }
        catch
        {
            time = 0;
        }
    }
    return sampleCount;
}
You are using absolute time to determine the wave function, so when you change the frequency even slightly, the next sample is what it would have been had you started the run at that new frequency.
I don't know the established best approach, but a simple approach that's probably good enough is to compute the current phase as a fraction of a cycle (φ = (t mod 1/f_old) * f_old) and adjust t to preserve that phase under the new frequency (t = φ / f_new).
A smoother approach would be to preserve the first derivative as well. This is more difficult because, unlike for the wave itself, the amplitude of the first derivative varies with frequency, which means that preserving the phase isn't sufficient. In any event, this added complexity is almost certainly overkill, given that you are varying the frequency smoothly.
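In code, the time remapping described above could be as small as this (a sketch; oldFrequency and newFrequency stand for the values just before and after the change):

// Fractional phase of the old wave, carried over so that
// sin(2 * pi * newFrequency * time) continues exactly where the old wave left off.
double phase = (time * oldFrequency) % 1.0;
time = phase / newFrequency;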
One approach is to use wavetables. You construct a full cycle of a sine wave in an array; then, in your Read function, you can simply look it up. For each sample you output, you advance the read position by an amount calculated from the desired output frequency. When you want to glide to a new frequency, you calculate the new delta for lookups into the table, and instead of jumping straight there you adjust the delta incrementally toward the new value over a set period of time (the 'glide' or portamento time). A sketch follows.
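This is one possible shape of that idea (the table size, glide handling, and names are my own assumptions, not the answerer's code):

public class WavetableOscillator
{
    private readonly float[] table;   // one cycle of a sine wave
    private readonly int sampleRate;
    private double index;             // current read position in the table
    private double delta;             // table samples to advance per output sample
    private double targetDelta;       // delta for the frequency we're gliding to
    private double deltaStep;         // per-sample change in delta during a glide

    public WavetableOscillator(int sampleRate, int tableSize = 4096)
    {
        this.sampleRate = sampleRate;
        table = new float[tableSize];
        for (int i = 0; i < tableSize; i++)
            table[i] = (float)Math.Sin(2 * Math.PI * i / tableSize);
    }

    public void GlideTo(double frequency, double glideSeconds)
    {
        targetDelta = frequency * table.Length / sampleRate;
        deltaStep = (targetDelta - delta) / (glideSeconds * sampleRate);
    }

    public float NextSample()
    {
        // Move delta a little closer to the target each sample (the glide).
        if (Math.Abs(targetDelta - delta) > Math.Abs(deltaStep))
            delta += deltaStep;
        else
            delta = targetDelta;

        float sample = table[(int)index];
        index += delta;
        if (index >= table.Length) index -= table.Length;
        return sample;
    }
}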
Frequency = Clamp(Frequency + frequencyStep, nextFrequency, Frequency);
The human ear doesn't work like that; it is highly non-linear. Nature is logarithmic. The frequency of middle C is 261.626 Hz. The next note, C#, is related to the previous one by a factor of Math.Pow(2, 1/12.0), or about 1.0594631. So C# is 277.183 Hz, an increment of 15.557 Hz.
The next C up the scale has double the frequency, 523.252 Hz, and the C# after that is 554.366 Hz, an increment of 31.114 Hz. Note how the increment doubled. So the frequencyStep in your code snippet should not be an addition; it should be a multiplication.
buffer[n + offset] = (float)(Amplitude * Math.Sin(2 * Math.PI * time * Frequency));
That's a problem as well. Your calculated samples do not smoothly transition from one frequency to the next; there is a step when Frequency changes. You have to apply an offset to time so that the new wave produces the exact same sample value at the previous sample time as the one you calculated with the previous value of Frequency. These steps produce high-frequency artifacts with many harmonics that are gratingly obvious to the human ear.
Background info is available in this Wikipedia article. It also helps to visualize the waveform you generate; you would have easily diagnosed the step problem that way.
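Putting both answers together, a phase-accumulator version of the Read method might look like this (a sketch reusing the question's fields; the 10 ms-per-semitone glide factor is an arbitrary choice):

private double phase; // current phase in cycles, [0, 1); replaces absolute time

public override int Read(float[] buffer, int offset, int sampleCount)
{
    int sampleRate = WaveFormat.SampleRate;
    // Per-sample ratio that covers one semitone (a factor of 2^(1/12)) in ~10 ms.
    double glideFactor = Math.Pow(2.0, 1.0 / (12.0 * 0.010 * sampleRate));
    for (int n = 0; n < sampleCount; n++)
    {
        if (nextFrequencyQueue.Count > 0)
        {
            nextFrequency = nextFrequencyQueue.Dequeue();
        }
        if (nextFrequency > 0 && Frequency != nextFrequency)
        {
            if (Frequency == 0) // first note: no slide
                Frequency = nextFrequency;
            else if (Frequency < nextFrequency) // multiplicative slide up
                Frequency = Math.Min(Frequency * glideFactor, nextFrequency);
            else // multiplicative slide down
                Frequency = Math.Max(Frequency / glideFactor, nextFrequency);
        }
        buffer[n + offset] = (float)(Amplitude * Math.Sin(2 * Math.PI * phase));
        // Advance the phase by the instantaneous frequency. Because time never
        // multiplies Frequency directly, a frequency change cannot cause a
        // jump in the output value.
        phase += Frequency / sampleRate;
        if (phase >= 1) phase -= 1;
    }
    return sampleCount;
}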

How to simulate a harmonic oscillator driven by a given signal (not driven by sine wave)

I've got a table of values telling me how the signal level changes over time and I want to simulate a harmonic oscillator driven by this signal. It does not matter if the simulation is not 100% accurate.
I know the frequency of the oscillator.
I found lots of formulas, but they all use a sine wave as the driver.
I guess you want to perform a time-discrete simulation. The well-known formulae require analytic input (see Green's function); if all you have is a table of forces at points in time, the typical analytical formulae won't help you much.
The idea is this: at each point in time t0, the oscillator has some given position, velocity, and acceleration. Now a force acts on it, according to the table you were given, which changes its acceleration (F = m * a). For the next time step t1 we assume the acceleration stays constant, so we can apply the simple Newtonian equations (Δv = a * dt) with dt = t1 - t0 for this time frame. Iterate until the desired range of time is simulated.
The most important parameter of this simulation is dt, that is, how fine-grained the calculation is. For example, you might want 10 steps per second, but that completely depends on your input parameters. What we're doing here, in essence, is Euler integration of the equations of motion; see the sketch after the links below.
This, of course, isn't all there is; such simulations can get quite complicated, especially in not-so-well-behaved cases with extreme accelerations. In those cases you need to perform numerical sanity checks within a frame, because something 'extreme' can happen inside a single frame. More sophisticated numerical integration, e.g. the Runge-Kutta method, may also become necessary, but that leads too far at this point.
EDIT: Just after I posted this, somebody posted a comment on the original question pointing to the "Verlet algorithm", which is basically an implementation of what I described above.
http://en.wikipedia.org/wiki/Simple_harmonic_motion
http://en.wikipedia.org/wiki/Hooke's_Law
http://en.wikipedia.org/wiki/Euler_method
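To make the scheme concrete, here is a minimal C# sketch of the loop described above (my own names and a unit-mass default; velocity is updated before position, the semi-implicit variant of Euler, which is a bit more stable than the plain explicit form):

using System;

public static class OscillatorSim
{
    // Simulate x'' = -omega^2 * x - damping * x' + F(t) / mass, with the
    // driving signal supplied as a table of force samples taken every dt seconds.
    public static double[] Simulate(double[] forceTable, double dt,
                                    double frequencyHz, double damping = 0.0, double mass = 1.0)
    {
        double omega = 2 * Math.PI * frequencyHz; // natural angular frequency
        double x = 0, v = 0;
        var positions = new double[forceTable.Length];
        for (int i = 0; i < forceTable.Length; i++)
        {
            // acceleration from the spring, the damper, and the tabulated drive
            double a = -omega * omega * x - damping * v + forceTable[i] / mass;
            v += a * dt; // assume a stays constant across the frame
            x += v * dt; // advance position with the updated velocity
            positions[i] = x;
        }
        return positions;
    }
}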
OK, I finally figured it out and wrote a GUI app to test it until it worked. But my PC is not very happy with doing it 1000 * 44100 times per second, even without the GUI ^^
Whatever, here is my test code (which worked quite well):
double lastTime;
const double deltaT = 1 / 44100.0;   // length of a frame in seconds
double rFreq;
Stopwatch sw = Stopwatch.StartNew(); // System.Diagnostics; not shown in the original snippet
double mouseY;                       // the driving signal, fed from mouse input

private void InitPendulum()
{
    double freq = 2; // frequency in hertz
    rFreq = FToRSpeed(freq);
    damp = Math.Pow(0.8, freq * deltaT);
}

private static double FToRSpeed(double p)
{
    p *= 2;
    p = Math.PI * p;
    return p * p; // (2 * pi * f)^2
}

double damp;
double bHeight;
double bSpeed;
double lastchange;

private void timer1_Tick(object sender, EventArgs e)
{
    double now = sw.ElapsedTicks / (double)Stopwatch.Frequency;
    while (lastTime + deltaT <= now)
    {
        bHeight += bSpeed * deltaT;
        double prevSpeed = bSpeed;
        bSpeed += (mouseY - bHeight) * (rFreq * deltaT);
        bSpeed *= damp;
        if ((bSpeed > 0) != (prevSpeed > 0))
        {
            Console.WriteLine(lastTime - lastchange);
            lastchange = lastTime;
        }
        lastTime += deltaT;
    }
    Invalidate(); // no, I am not using GDI ^^
}
