Audio stream for multiple outputs (single producer, multi-consumer) - C#

I am attempting to propagate a single sound source to multiple outputs (such as one microphone input to multiple sound cards or channels). The outputs do not have to be synced (a few ms of delay is acceptable), but it would be nice if they could be.
I have successfully written code that loops a microphone input to an output using a WaveIn, a BufferedWaveProvider, and a WaveOut. However, when I try to read one BufferedWaveProvider with two instances of WaveOut, the two outputs create this odd 'interleaved' choppy sound. Here is a code snippet for the output portion:
private void CreateWaveOutDevice()
{
    waveProvider = new BufferedWaveProvider(waveIn.WaveFormat);

    waveOut = new WaveOut();
    waveOut.DeviceNumber = 0; //Sound card 1
    waveOut.DesiredLatency = 100;
    waveOut.Init(waveProvider);
    waveOut.PlaybackStopped += wavePlayer_PlaybackStopped;

    waveOut2 = new WaveOut();
    waveOut2.DeviceNumber = 1; //Sound card 2
    waveOut2.DesiredLatency = 100;
    waveOut2.Init(waveProvider);
    waveOut2.PlaybackStopped += wavePlayer_PlaybackStopped;

    waveOut.Play();
    waveOut2.Play();
}
I think this is happening because data is removed from the waveProvider's circular buffer when it is read, so the two read calls are 'fighting' over the data, which results in the choppy sound.
So I really have two questions:
1.) I see the NAudio library contains many types of WaveStreams (RawSourceWaveStream is particularly interesting). However, I have been unable to find a good example of how to read a single stream with multiple WaveOut instances, and I have also been unable to get working code using a WaveStream with multiple outputs. Is anyone familiar with WaveStreams, and can this be done?
2.) If the NAudio wave streams cannot be used in a single-producer, multiple-consumer situation, then I believe I would need to make a circular buffer that is not cleared on a read, but only when the buffer is full and new data is pushed in. The code won't care whether the data was read or not; it just keeps filling the buffer. Would this be the correct approach?
I've spent days searching, so hopefully this hasn't already been asked. Thanks for reading.

If you're just reading from a microphone and want two WaveOuts to play it, then the simple option is to create two BufferedWaveProviders, one for each WaveOut, and then, when audio is received, send it to both.
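A minimal sketch of that approach (device numbers, latencies and the method name are illustrative; waveProvider1/waveProvider2 are assumed to be fields like in your snippet):
private void CreateWaveOutDevices()
{
    // One provider per output; both are fed from the same WaveIn.
    waveProvider1 = new BufferedWaveProvider(waveIn.WaveFormat);
    waveProvider2 = new BufferedWaveProvider(waveIn.WaveFormat);

    waveIn.DataAvailable += (s, e) =>
    {
        // Copy the same captured bytes into both providers.
        waveProvider1.AddSamples(e.Buffer, 0, e.BytesRecorded);
        waveProvider2.AddSamples(e.Buffer, 0, e.BytesRecorded);
    };

    waveOut = new WaveOut { DeviceNumber = 0, DesiredLatency = 100 };
    waveOut.Init(waveProvider1);

    waveOut2 = new WaveOut { DeviceNumber = 1, DesiredLatency = 100 };
    waveOut2.Init(waveProvider2);

    waveIn.StartRecording();
    waveOut.Play();
    waveOut2.Play();
}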
Likewise if you were playing from an audio file to two soundcards, the easiest way is to use two reader objects and start them both separately.
There is unfortunately no easy way to synchronize, short of starting and stopping both players at the same time.
There are a few more advanced ways to try to split off an audio stream to two readers, but there can be complications especially if the two readers are not able to read at roughly the same rate.

Related

C# NAudio - Keep Last X Seconds Of Audio

I have a project where audio is recorded at several sources, and the goal is to process X (user-defined) seconds of it through various methods (DSP as well as speech-to-text).
I'm using a MixingSampleProvider to collect all the sources into one "provider". I'm passing this to a NotifyingSampleProvider because it raises an event for each sample, and then passing that sample to my class that does the processing. I'm appending the float the NotifyingSampleProvider produces to the end of my "X-second window" array (using Array.Copy to create a temp array with everything but the oldest value, adding the new value, and copying the temp array back to the original), which I use for processing.
The obvious problem here is that it notifies (and that I'm locking and adding to the "X second window" array) for every single sample, or 44100 times a second. This leads to the audio being pretty much constantly locked so it can't be processed. There's got to be a more performant way to deal with this.
The one idea I had was a BufferedWaveProvider that doesn't get read from anywhere else, so it's always full (with DiscardOnOverflow = true of course). However A) this still requires a NotifyingSampleProvider to add to it periodically (you can't pass a provider to the BufferedWaveProvider for it to automatically read from) and I'd like to get away from such frequent (44100 Hz) function calls, and B) I understand that there's a limit to the BufferDuration length, which might be too small of a window (I don't recall what the limit is and I can't find anything online saying what it is).
How might I solve this problem? Using NAudio, how do I keep the last X seconds of audio accessible to a separate class at any time, without using the NotifyingSampleProvider?
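For illustration, the kind of block-based ring buffer hinted at above might look roughly like this (a hypothetical class, not an NAudio type; it assumes something downstream is already pulling samples through the chain, and requires using System; and using NAudio.Wave;):
// Keep the last X seconds by copying whole Read() blocks into a circular
// float buffer, instead of reacting to an event per sample.
public class RollingWindowProvider : ISampleProvider
{
    private readonly ISampleProvider source;
    private readonly float[] window;   // holds the last X seconds of samples
    private readonly object sync = new object();
    private int writePos;

    public RollingWindowProvider(ISampleProvider source, int seconds)
    {
        this.source = source;
        this.window = new float[source.WaveFormat.SampleRate *
                                source.WaveFormat.Channels * seconds];
    }

    public WaveFormat WaveFormat { get { return source.WaveFormat; } }

    public int Read(float[] buffer, int offset, int count)
    {
        int read = source.Read(buffer, offset, count);
        lock (sync)
        {
            for (int i = 0; i < read; i++)
            {
                window[writePos] = buffer[offset + i];
                writePos = (writePos + 1) % window.Length;
            }
        }
        return read;
    }

    // Snapshot of the last X seconds, oldest sample first.
    public float[] GetWindow()
    {
        lock (sync)
        {
            float[] copy = new float[window.Length];
            Array.Copy(window, writePos, copy, 0, window.Length - writePos);
            Array.Copy(window, 0, copy, window.Length - writePos, writePos);
            return copy;
        }
    }
}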

Getting Audio Data From MP3 File using NAudio

I want to be able to get audio data from an MP3 file with NAudio, average out the data in the left and right channels to create one data set, and then resample the averaged 44.1 kHz audio data to 8 kHz, but I am having trouble understanding how data is represented in an NAudio WaveStream.
If I had 1 second's worth of MP3 audio, how many bytes would I have in the WaveStream? From a few code samples it seems one sample is 4 bytes, audio is sampled at 44100 Hz, and there are 2 channels, so would that mean we would have (44100 * 4 * 2) bytes in the WaveStream? Is that right?
Which of the following 3 streams - AStream, PCM and inputStream - should I get the audio data from? And how do I access the left and right channel data separately?
var AStream = new MP3FileReader(myFilePath);
var PCM = new WaveConversionStream.Createpcm(AStream);
var inputStream = new WaveChannel32(new BlockAlignStream(PCM));
I have been thinking of converting the WaveStream using the WaveFormatConversionStream but the code below throws a NAudio.MmException with a message saying "AcmNotPossible calling Acmstreamopen".
var targetFormat = new WaveFormat(8000,1);
var resampled = new WaveFormatConversionStream(targetFormat, inputStream);
The above code doesn't even work if targetFormat is equal to inputStream's format, so I don't know what I am doing wrong here.
//Still throws NAudio.MmException
var resampled = new WaveFormatConversionStream(inputStream.WaveFormat, inputStream);
Other Info: VS2012, WPF, NAudio 1.6.
You seem to have copied a code sample that belongs to a much earlier version of NAudio. The Mp3FileReader class will emit 16-bit samples, and uses the ACM MP3 frame decompressor by default. If you'd prefer your samples directly in floating point, then you can make use of the AudioFileReader.
Resampling 44.1 kHz straight down to 8 kHz is not a particularly good idea, as you'd end up with a lot of aliasing, so a low-pass filter would ideally be applied first. Left and right channels are stored interleaved, so you get a left sample, followed by a right sample, and so on.
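As a rough sketch (not from the answer above), reading floats with AudioFileReader and averaging the interleaved channels down to mono might look like this; the low-pass filtering and resampling steps are not shown, and it requires using NAudio.Wave; and using System.Collections.Generic;:
// Assumes a stereo source; produces one averaged (mono) float per frame.
using (var reader = new AudioFileReader(myFilePath))
{
    int channels = reader.WaveFormat.Channels;                            // 2 for stereo
    float[] buffer = new float[reader.WaveFormat.SampleRate * channels];  // ~1 second
    List<float> mono = new List<float>();
    int read;
    while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Samples are interleaved: left, right, left, right, ...
        for (int i = 0; i + 1 < read; i += channels)
        {
            mono.Add((buffer[i] + buffer[i + 1]) / 2f);
        }
    }
    // mono now holds 44.1 kHz averaged samples, ready for filtering/resampling.
}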

Can't get smooth playback on NAudio waveOut, WaveProvider being emptied too quickly

I am attempting to create an application similar to the network chat demo in NAudio.
On the receiving/listening end, however, I am buffering my received audio packets until I have 2 of them before calling WaveProvider.AddSamples(). I have experimented with not buffering at all, buffering a large amount, etc., but the issue remains, so I don't think this is the explicit cause.
The issue I am having is that no matter how fast I try to add samples, the WaveProvider is getting exhausted (BufferedBytes == 0) repeatedly while the audio is playing. I ran some tests, and my debug output goes something like this:
0.690s WaveProvider Empty
0.695s Samples are added to WaveProvider
0.871s WaveProvider Empty
0.875s Samples added to WaveProvider
1.013s WaveProvider Empty
... etc.
The WaveProvider only reports "empty" every 150 ms or so, which makes sense, as each AddSamples call adds data corresponding to 100 ms of audio.
The interesting part is the 4 to 5 ms gap between the moment the WaveProvider is empty and the next AddSamples call. No change in buffer size or effort to add samples faster seems to decrease this gap, and thus the played-back audio always has these tiny gaps of silence, which sound like pops/breaks.
I was wondering whether this tiny delay might be caused by the WaveProvider being locked while the waveOut device is playing from it, so that I am locked out from adding samples to the WaveProvider until it has finished playing? Is there any way to rectify this?
My initialization code is as follows (it should be fairly standard, I hope):
WaveProvider = new BufferedWaveProvider(new WaveFormat(16000, 1));
WaveProvider.BufferDuration = new TimeSpan(0, 0, 10);
AudioPlayer = new WaveOut(WaveCallbackInfo.FunctionCallback());
AudioPlayer.Init(WaveProvider);
I would appreciate any help on the matter.
Thank you
If you are not receiving data as fast as you are playing it, you will have gaps in playback and there is not much you can do about it. However, you can improve the experience by automatically going into pause while you buffer up (say) a second of audio. Then, if the buffer runs out, you go into pause again. That way the audio will sound a lot less choppy.
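A rough sketch of that idea, assuming the WaveProvider/AudioPlayer fields from the question (the one-second threshold and method names are arbitrary):
// Called whenever a network packet of audio arrives.
void OnAudioPacketReceived(byte[] packet, int length)
{
    WaveProvider.AddSamples(packet, 0, length);

    // If we previously ran dry and paused, only resume once ~1 second is buffered.
    if (AudioPlayer.PlaybackState == PlaybackState.Paused &&
        WaveProvider.BufferedDuration >= TimeSpan.FromSeconds(1))
    {
        AudioPlayer.Play();
    }
}

// Called periodically (e.g. from a timer) to detect buffer underruns.
void CheckForUnderrun()
{
    if (AudioPlayer.PlaybackState == PlaybackState.Playing &&
        WaveProvider.BufferedBytes == 0)
    {
        AudioPlayer.Pause();   // wait for the buffer to refill before resuming
    }
}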

How to combine SoundEffectInstances into a new sound file (MP3 or WAV)

I'm working on the new Windows Phone platform. I have a few instances of a SoundEffectInstance that I would like to combine into a single new sound file (either SoundEffectInstance, SoundEffect or MediaElement, it does not matter). I then want to save that file as an MP3 to the phone.
How do I do that? Normally I would try to send all the files to a byte array, but I'm not sure if that is the correct method here, or how to convert the byte array into an MP3-format sound.
So for example I have SoundEffectInstance soundBackground playing from 0 to 5 seconds. I then have SoundEffectInstance chime playing from 3 to 4 seconds, and SoundEffectInstance foreground playing from 3.5 to 7 seconds. I want to combine all of these into a single MP3 file that lasts 7 seconds.
There are two tasks that you are trying to accomplish here:
Combine several sound files into a single sound file
Save the resulting file as an MP3.
As far as I have found so far, you will have a good bit of challenge with item 2. To date I have not found a pure .NET MP3 encoder. All the ones I've found rely on P/Invoke calls to native code (which of course won't work on the phone).
As for combining the files, you don't want to treat them as SoundEffectInstances. That class is only meant for playing, and it abstracts away most of the details of the sound file. Instead you will need to treat the sound files as arrays of samples. I'm going to assume that the sample rate on all three sound files is exactly the same, that these are 16-bit recordings, and that the wave files are recorded in mono. I'm keeping the scenario simple for now; you can extend it to stereo and various sample rates once you've mastered this simpler scenario.
The first 48 bytes of the wave files are nothing but header. Skip past that (for now) and read the contents of the wave files into their own arrays. Once they are all read we can start mixing them together. Ignoring the time offsets at which you want to start playing these sounds, if we wanted to produce a sample that is the combined result of all three, we could do it by adding the values in the sound file arrays together and writing the result out to another array. But there's a problem: 16-bit numbers can only go up to 32,767 (and down to -32,768). If the combined value of all three sounds goes beyond these limits you'll get really bad distortion. The easiest (though not necessarily the best) way to handle this is to consider the maximum number of simultaneous sounds that will play and scale the values down accordingly. From the 3.5 second mark to the 4 second mark you will have all three sounds playing, so we would scale by dividing by three. Another way is to sum up the sound samples using a data type that can go beyond this range and then normalize the values back into range when you are done mixing them together.
Let's define some parameters.
int SamplesPerSecond = 22000;
int ResultRecordingLength = 7;
short[] Sound01;
short[] Sound02;
short[] Sound03;
int[] ResultantSound;
//Insert code to populate the sound arrays here.
// Sound01.Length will equal 5.0*SamplesPerSecond
// Sound02.Length will equal 1.0*SamplesPerSecond
// Sound03.Length will equal 3.5*SamplesPerSecond
ResultantSound = new int[ResultRecordingLength*SamplesPerSecond];
Once you've got your sound files read and the array prepared for receiving the resulting file, you can start rendering. There are several ways we could go about this. Here is one:
void InitResultArray(int[] resultArray)
{
    for(int i=0;i<resultArray.Length;++i)
    {
        resultArray[i]=0;
    }
}

void RenderSound(short[] sourceSound, int[] resultArray, double timeOffset)
{
    int startIndex = (int)(timeOffset*SamplesPerSecond);
    for(int readIndex=0;(readIndex<sourceSound.Length)&&(readIndex+startIndex<resultArray.Length);++readIndex)
    {
        resultArray[readIndex+startIndex] += (int)sourceSound[readIndex];
    }
}
void RangeAdjust(int[] resultArray)
{
    int max = int.MinValue;
    int min = int.MaxValue;
    for(int i=0;i<resultArray.Length;++i)
    {
        max = Math.Max(max, resultArray[i]);
        min = Math.Min(min, resultArray[i]);
    }
    //I want the range normalized to [-32,768..32,767];
    //you may want to normalize differently.
    double scale = 65535d/(double)(max-min);
    double offset = 32767-(max*scale);
    for(int i=0;i<resultArray.Length;++i)
    {
        resultArray[i] = (int)(scale*resultArray[i]+offset);
    }
}
You would call InitResultArray to ensure the result array is filled with zeros (I believe it is by default, but I still prefer to explicitly set it to zero) and then call RenderSound() for each sound that you want in your result. After you've rendered your sounds, call RangeAdjust to normalize the sound. All that's left is to write it to a file. You'll need to convert from ints back to shorts.
short[] writeBuffer = new short[ResultantSound.Length];
for(int i=0;i<writeBuffer.Length;++i)
    writeBuffer[i]=(short)ResultantSound[i];
Now the mixed sound is ready to write to a file. There is just one thing missing: you need to write the wave header before the sample data. I've written code showing how to do that here: http://www.codeproject.com/KB/windows-phone-7/WpVoiceMemo.aspx
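For reference, a sketch of writing a minimal canonical PCM header with a BinaryWriter (16-bit mono; real files may carry extra chunks, which is why header sizes vary). This is not the code from the linked article, and it requires using System.IO; and using System.Text;:
void WriteWaveFile(Stream output, short[] samples, int sampleRate)
{
    using (var writer = new BinaryWriter(output))
    {
        int dataSize = samples.Length * 2;     // 16-bit mono: 2 bytes per sample
        int byteRate = sampleRate * 2;

        writer.Write(Encoding.ASCII.GetBytes("RIFF"));
        writer.Write(36 + dataSize);           // file size minus the first 8 bytes
        writer.Write(Encoding.ASCII.GetBytes("WAVE"));
        writer.Write(Encoding.ASCII.GetBytes("fmt "));
        writer.Write(16);                      // fmt chunk size for plain PCM
        writer.Write((short)1);                // format tag 1 = PCM
        writer.Write((short)1);                // channel count
        writer.Write(sampleRate);
        writer.Write(byteRate);
        writer.Write((short)2);                // block align (channels * bytes per sample)
        writer.Write((short)16);               // bits per sample
        writer.Write(Encoding.ASCII.GetBytes("data"));
        writer.Write(dataSize);

        for (int i = 0; i < samples.Length; ++i)
        {
            writer.Write(samples[i]);
        }
    }
}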

How to produce precisely-timed tone and silence?

I have a C# project that plays Morse code for RSS feeds. I wrote it using Managed DirectX, only to discover that Managed DirectX is old and deprecated. The task I have is to play pure sine wave bursts interspersed with silence periods (the code) which are precisely timed as to their duration. I need to be able to call a function which plays a pure tone for so many milliseconds, then Thread.Sleep(), then play another, etc. At its fastest, the tones and spaces can be as short as 40 ms.
It's working quite well in Managed DirectX. To get the precisely timed tone I create 1 sec. of sine wave into a secondary buffer, then to play a tone of a certain duration I seek forward to within x milliseconds of the end of the buffer then play.
I've tried System.Media.SoundPlayer. It's a loser [edit - see my answer below] because you have to Play(), Sleep(), then Stop() for arbitrary tone lengths. The result is a tone that is too long, variable by CPU load. It takes an indeterminate amount of time to actually stop the tone.
I then embarked on a lengthy attempt to use NAudio 1.3. I ended up with a memory-resident stream providing the tone data, and again seeking forward leaving the desired length of tone remaining in the stream, then playing. This worked OK with the DirectSoundOut class for a while (see below), but the WaveOut class quickly dies with an internal assert saying that buffers are still on the queue despite PlayerStopped = true. This is odd since I play to the end, then put a wait of the same duration between the end of the tone and the start of the next. You'd think that 80 ms after starting Play of a 40 ms tone it wouldn't have buffers on the queue.
DirectSoundOut works well for a while, but its problem is that for every tone burst Play() it spins off a separate thread. Eventually (5 min or so) it just stops working. You can see thread after thread after thread exiting in the Output window while running the project in VS2008 IDE. I don't create new objects during playing, I just Seek() the tone stream then call Play() over and over, so I don't think it's a problem with orphaned buffers/whatever piling up till it's choked.
I'm out of patience on this one, so I'm asking in the hopes that someone here has faced a similar requirement and can steer me in a direction with a likely solution.
I can't believe it... I went back to System.Media.SoundPlayer and got it to do just what I want... no giant dependency library with 95% unused code and/or quirks waiting to be discovered :-). Furthermore, it runs on MacOSX under Mono (2.6)!!! [wrong - no sound, will ask separate question]
I used a MemoryStream and BinaryWriter to crib a WAV file, complete with the RIFF header and chunking. No "fact" chunk needed, this is 16-bit samples at 44100Hz. So now I have a MemoryStream with 1000ms of samples in it, and wrapped by a BinaryReader.
In a RIFF file there are two 4-byte/32-bit lengths, the "overall" length which is 4 bytes into the stream (right after "RIFF" in ASCII), and a "data" length just before the sample data bytes. My strategy was to seek in the stream and use the BinaryWriter to alter the two lengths to fool the SoundPlayer into thinking the audio stream is just the length/duration I want, then Play() it. Next time, the duration is different, so once again overwrite the lengths in the MemoryStream with the BinaryWriter, Flush() it and once again call Play().
When I tried this, I couldn't get the SoundPlayer to see the changes to the stream, even if I set its Stream property. I was forced to create a new SoundPlayer... every 40 milliseconds??? No.
Well, I went back to that code today and started looking at the SoundPlayer members. I saw "SoundLocation" and read about it. It said that a side effect of setting SoundLocation would be to null the Stream property, and vice versa for Stream. So I added a line of code to set the SoundLocation property to something bogus, "x", then set the Stream property to my (just modified) MemoryStream. Damn if it didn't pick that up and play a tone precisely as long as I asked for. There don't seem to be any crazy side effects like dead time afterward or increasing memory, or ??? It does take 1-2 milliseconds to do that tweaking of the WAV stream and then load/start the player, but it's very small and the price is right!
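A rough sketch of that sequence, assuming the canonical 44-byte header layout and 16-bit mono samples at 44100 Hz (names are illustrative; requires using System.IO; and using System.Media;):
// Patch the two RIFF length fields so SoundPlayer thinks the stream is only
// 'milliseconds' long, then force it to re-read the stream.
void PlayTone(SoundPlayer player, MemoryStream wavStream, BinaryWriter writer, int milliseconds)
{
    int bytesWanted = 44100 * 2 * milliseconds / 1000;   // 16-bit mono

    writer.Seek(4, SeekOrigin.Begin);                    // overall RIFF length
    writer.Write(36 + bytesWanted);
    writer.Seek(40, SeekOrigin.Begin);                   // "data" chunk length
    writer.Write(bytesWanted);
    writer.Flush();

    wavStream.Position = 0;
    player.SoundLocation = "x";     // side effect: clears the Stream property...
    player.Stream = wavStream;      // ...so this assignment is picked up again
    player.Play();
}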
I also implemented a Frequency property which re-generates the samples and uses the Seek/BinaryWriter trick to overlay the old data in the RIFF/WAV MemoryStream with the same number of samples but for a different frequency, and again did the same thing for an Amplitude property.
This project is on SourceForge. You can get to the C# code for this hack in SPTones.CS from this page in the SVN browser. Thanks to everyone who provided info on this, including #arke whose thinking was close to mine. I do appreciate it.
It's best to just generate the sine waves and silence together into a buffer which you play. That is, always play something, but write whatever you need next into that buffer.
You know the sample rate, and given the sample rate, you can calculate the number of samples you need to write.
uint numSamples = timeWantedInSeconds * sampleRate;
That's the number of samples you need to generate, whether sine wave or silence. Then just fill the buffer as needed. That way, you get the most accurate possible timing.
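For example, a sketch of generating one tone-plus-gap block of 16-bit samples (the method name and the 0.8 amplitude are arbitrary):
short[] GenerateToneAndSilence(double frequency, int toneMs, int silenceMs, int sampleRate)
{
    int toneSamples = sampleRate * toneMs / 1000;
    int silenceSamples = sampleRate * silenceMs / 1000;
    short[] block = new short[toneSamples + silenceSamples];

    for (int n = 0; n < toneSamples; ++n)
    {
        block[n] = (short)(0.8 * short.MaxValue *
                           Math.Sin(2 * Math.PI * frequency * n / sampleRate));
    }
    // The remaining samples are already zero, i.e. silence, so the tone and the
    // gap that follows it are both sample-accurate in length.
    return block;
}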
Try using XNA.
You will have to provide a file, or a stream to a static tone, that you can loop. You can then change the pitch and volume of that tone.
Since XNA is made for games, it will have no problem at all with 40 ms delays.
It should be pretty easy to convert from ManagedDX to SlimDX ...
Edit: What stops you, by the way, from just pre-generating 'n' samples of sine wave (where n corresponds as closely as possible to the number of milliseconds you want)? It really doesn't take all that long to generate the data. Beyond that, if you have a 22 kHz buffer and you want the final 100 samples, why not just submit 'buffer + 21950' and set the buffer length to 100 samples?
