C# NAudio - Keep Last X Seconds Of Audio


I have a project where audio is recorded at several sources, and the goal is to process X (user-defined) seconds of it through various methods (DSP as well as speech-to-text).
I'm using a MixingSampleProvider to collect all the sources into one "provider." I'm passing this to a NotifyingSampleProvider because it raises an event at each sample, and then passing that sample to my class that does the processing. I'm adding the float the NotifyingSampleProvider produces to the end of my "X second window" array (using Array.Copy to create a temp array with all but that last value, adding that last value, and copying the temp array back to the original) which I use for processing.
The obvious problem here is that it notifies (and that I'm locking and adding to the "X second window" array) for every single sample, or 44100 times a second. This leads to the audio being pretty much constantly locked so it can't be processed. There's got to be a more performant way to deal with this.
The one idea I had was a BufferedWaveProvider that doesn't get read from anywhere else, so it's always full (with DiscardOnOverflow = true of course). However A) this still requires a NotifyingSampleProvider to add to it periodically (you can't pass a provider to the BufferedWaveProvider for it to automatically read from) and I'd like to get away from such frequent (44100 Hz) function calls, and B) I understand that there's a limit to the BufferDuration length, which might be too small of a window (I don't recall what the limit is and I can't find anything online saying what it is).
How might I solve this problem? Using NAudio, how do I keep the last X seconds of audio accessible to a separate class at any time without using the NotifyingSampleProvider?
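One possible direction, sketched under assumptions: instead of handling one sample per event, copy whole blocks into a fixed-size circular buffer. The SampleHistory class below is hypothetical and has no NAudio dependency. The assumption is that a small pass-through ISampleProvider wrapper (not shown) would call Write from inside its Read(float[] buffer, int offset, int count) method, so the lock is taken once per block rather than 44100 times a second.

```csharp
using System;

// Hypothetical helper: keeps the most recent N seconds of float samples.
public sealed class SampleHistory
{
    private readonly float[] _ring;
    private readonly object _gate = new object();
    private int _writePos; // index where the next sample will be stored
    private int _count;    // number of valid samples, at most _ring.Length

    public SampleHistory(int sampleRate, int channels, double seconds)
    {
        int capacity = (int)(sampleRate * channels * seconds);
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(seconds));
        _ring = new float[capacity];
    }

    public int Capacity => _ring.Length;

    // Appends one block of samples, overwriting the oldest data when full.
    public void Write(float[] source, int offset, int count)
    {
        lock (_gate)
        {
            // If the block is longer than the window, only its tail can be kept.
            if (count > _ring.Length)
            {
                offset += count - _ring.Length;
                count = _ring.Length;
            }
            int first = Math.Min(count, _ring.Length - _writePos);
            Array.Copy(source, offset, _ring, _writePos, first);
            Array.Copy(source, offset + first, _ring, 0, count - first);
            _writePos = (_writePos + count) % _ring.Length;
            _count = Math.Min(_ring.Length, _count + count);
        }
    }

    // Returns a copy of the stored samples, oldest first.
    public float[] Snapshot()
    {
        lock (_gate)
        {
            var result = new float[_count];
            int start = (_writePos - _count + _ring.Length) % _ring.Length;
            int first = Math.Min(_count, _ring.Length - start);
            Array.Copy(_ring, start, result, 0, first);
            Array.Copy(_ring, 0, result, first, _count - first);
            return result;
        }
    }
}
```

The processing class can call Snapshot() whenever it needs the window. Because the copy is two Array.Copy calls at most, the lock is held only briefly.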

Related

Process very large XML file

I need to process an XML file with the following structure:
<FolderSizes>
<Version></Version>
<DateTime Un=""></DateTime>
<Summary>
<TotalSize Bytes=""></TotalSize>
<TotalAllocated Bytes=""></TotalAllocated>
<TotalAvgFileSize Bytes=""></TotalAvgFileSize>
<TotalFolders Un=""></TotalFolders>
<TotalFiles Un=""></TotalFiles>
</Summary>
<DiskSpaceInfo>
<Drive Type="" Total="" TotalBytes="" Free="" FreeBytes="" Used=""
UsedBytes=""><![CDATA[ ]]></Drive>
</DiskSpaceInfo>
<Folder ScanState="">
<FullPath Name=""><![CDATA[ ]]></FullPath>
<Attribs Int=""></Attribs>
<Size Bytes=""></Size>
<Allocated Bytes=""></Allocated>
<AvgFileSz Bytes=""></AvgFileSz>
<Folders Un=""></Folders>
<Files Un=""></Files>
<Depth Un=""></Depth>
<Created Un=""></Created>
<Accessed Un=""></Accessed>
<LastMod Un=""></LastMod>
<CreatedCalc Un=""></CreatedCalc>
<AccessedCalc Un=""></AccessedCalc>
<LastModCalc Un=""></LastModCalc>
<Perc><![CDATA[ ]]></Perc>
<Owner><![CDATA[ ]]></Owner>
<!-- Special element; see paragraph below -->
<Folder></Folder>
</Folder>
</FolderSizes>
The <Folder> element is special in that it repeats within the <FolderSizes> element but can also appear within itself; I reckon up to about 5 levels.
The problem is that the file is really big at a whopping 11GB so I'm having difficulty processing it - I have experience with XML documents, but nothing on this scale.
What I would like to do is to import the information into a SQL database because then I will be able to process the information in any way necessary without having to concern myself with this immense, impractical file.
Here are the things I have tried:
Simply load the file and attempt to process it with a simple C# program using an XmlDocument or XDocument object
Before I even started I knew this would not work, as I'm sure everyone would agree, but I tried it anyway, and ran the application on a VM (since my notebook only has 4GB RAM) with 30GB memory. The application ended up using 24GB memory, and taking very, very long, so I just cancelled it.
Attempt to process the file using an XmlReader object
This approach worked better in that it didn't use as much memory, but I still had a few problems:
It was taking really long because I was reading the file one line at a time.
Processing the file one line at a time makes it difficult to really work with the data contained in the XML, because now you have to detect the start of a tag, and then the end of that tag (hopefully), and then create a document from that information, read the info, attempt to determine which parent tag it belongs to because we have multiple levels... It sounds prone to problems and errors.
Did I mention it takes really long reading the file one line at a time; and that still without actually processing that line - literally just reading it.
Import the information using SQL Server
I created a stored procedure using XQuery and running it recursively within itself, processing the <Folder> elements. This went quite well - I think better than the other two approaches - until one of the <Folder> elements ended up being rather big, producing the error "An XML operation resulted an XML data type exceeding 2GB in size. Operation aborted." I read up about it and I don't think it's an adjustable limit.
Here are more things I think I should try:
Re-write my C# application to use unmanaged code
I don't have much experience with unmanaged code, so I'm not sure how well it will work and how to make it as unmanaged as possible.
I once wrote a little application that works with my webcam, receiving the image, inverting the colours, and painting it to a panel. Using normal managed code didn't work - the result was about 2 frames per second. Re-writing the colour inversion method to use unmanaged code solved the problem. That's why I thought that unmanaged might be a solution.
Rather go for C++ instead of C#
Not sure if this is really a solution. Would it necessarily be better than C#? Better than unmanaged C#?
The problem here is that I haven't actually worked with C++ before, so I'll need to get to know a few things about C++ before I can really start working with it, and then probably not very efficiently yet.
I thought I'd ask for some advice before I go any further, possibly wasting my time.
Thanks in advance for your time and assistance.
EDIT
So before I start processing the file I run through it and check the size in an attempt to provide the user with feedback as to how long the processing might take; I made a screenshot of the calculation:
That's about 1500 lines per second; if the average line length is about 50 characters, that's 50 bytes per line, that's 75 kilobytes per second, for an 11GB file should take about 40 hours, if my maths is correct. But this is only stepping each line. It's not actually processing the line or doing anything with it, so when that starts, the processing rate drops significantly.
This is the method that runs during the size calculation:
private int _totalLines = 0;
// set to true when the cancel button is clicked; volatile because another thread writes it
private volatile bool _cancel = false;
private void CalculateFileSize()
{
    xmlStream = new StreamReader(_filePath);
    try
    {
        xmlReader = new XmlTextReader(xmlStream);
        while (xmlReader.Read())
        {
            if (_cancel)
                return; // the finally block still disposes the stream
            if (xmlReader.LineNumber > _totalLines)
                _totalLines = xmlReader.LineNumber;
            // Note: both labels are updated for every node that is read
            InterThreadHelper.ChangeText(
                lblLinesRemaining,
                string.Format("{0} lines", _totalLines));
            // Elapsed time formatted as dd:hh:mm:ss
            string elapsed = string.Format(
                "{0}:{1}:{2}:{3}",
                timer.Elapsed.Days.ToString().PadLeft(2, '0'),
                timer.Elapsed.Hours.ToString().PadLeft(2, '0'),
                timer.Elapsed.Minutes.ToString().PadLeft(2, '0'),
                timer.Elapsed.Seconds.ToString().PadLeft(2, '0'));
            InterThreadHelper.ChangeText(lblElapsed, elapsed);
        }
    }
    finally
    {
        xmlStream.Dispose();
    }
}
Still running, 27 minutes in :(

you can read an XML as a logical stream of elements instead of trying to read it line-by-line and piece it back together yourself. see the code sample at the end of this article
also, your question has already been asked here
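Reading the file as a stream of elements, as suggested above, might look roughly like the sketch below. It is not taken from the linked article. The FolderInfo and FolderXml names are hypothetical, only FullPath and Size are captured, and the element names are assumed to match the structure shown in the question. A stack tracks the currently open <Folder> elements, so nesting is handled without loading the document.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical record for one <Folder> element.
public sealed class FolderInfo
{
    public int Depth;
    public string FullPath;
    public long SizeBytes;
}

public static class FolderXml
{
    // Yields each <Folder> when its end tag is reached (children before parents).
    public static IEnumerable<FolderInfo> ReadFolders(TextReader input)
    {
        var settings = new XmlReaderSettings { IgnoreWhitespace = true, IgnoreComments = true };
        var open = new Stack<FolderInfo>(); // folders whose end tag has not been seen yet
        string current = null;              // name of the element most recently opened
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        current = reader.Name;
                        if (current == "Folder" && !reader.IsEmptyElement)
                        {
                            open.Push(new FolderInfo { Depth = open.Count + 1 });
                        }
                        else if (current == "Size" && open.Count > 0)
                        {
                            long.TryParse(reader.GetAttribute("Bytes"), out long bytes);
                            open.Peek().SizeBytes = bytes;
                        }
                        break;
                    case XmlNodeType.CDATA:
                    case XmlNodeType.Text:
                        if (current == "FullPath" && open.Count > 0)
                            open.Peek().FullPath = reader.Value;
                        break;
                    case XmlNodeType.EndElement:
                        if (reader.Name == "Folder")
                            yield return open.Pop();
                        current = null;
                        break;
                }
            }
        }
    }
}
```

For the real file, passing new StreamReader(path) as the input keeps memory use roughly constant, and each yielded FolderInfo could be batched into a bulk insert. That part is left out here.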

Drawing signal with a lot of samples

I need to display a set of signals. Each signal is defined by millions of samples. Just processing the collection (for converting samples to points according to bitmap size) of samples takes a significant amount of time (especially during scrolling).
So I implemented some kind of downsampling. I just skip some points: take every 2nd, every 3rd, every 50th point depending on signal characteristics. It increases speed very much but significantly distorts signal form.
Are there any smarter approaches?

We've had a similar issue in a recent application. Our visualization (a simple line graph) became too cluttered when zoomed out to see the full extent of the data (about 7 days of samples with a sample taken every 6 seconds more or less), so down-sampling was actually the way to go. If we didn't do that, zooming out wouldn't have much meaning, as all you would see was just a big blob of lines smeared out over the screen.
It all depends on how you are going to implement the down-sampling. There are two (simple) approaches: down-sample at the moment you get your sample or down-sample at display time.
What really gives a huge performance boost in both of these cases is the proper selection of your data-sources.
Let's say you have 7 million samples, and your viewing window is just interested in the last million points. If your implementation depends on an IEnumerable, this means that the IEnumerable will have to MoveNext 6 million times before actually starting. However, if you're using something which is optimized for random reads (a List comes to mind), you can implement your own enumerator for that, more or less like this:
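public IEnumerator<T> GetEnumerator(int start, int count, int skip)
{
    // assume we have a field in the class which contains the data as a List<T>, named _data
    // count is the number of source samples in the window; skip is the step and must be at least 1
    int end = Math.Min(start + count, _data.Count);
    for (int i = start; i < end; i += skip)
    {
        yield return _data[i];
    }
}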
Obviously this is a very naive implementation, but you can do whatever you want within the for-loop (use an algorithm based on the surrounding samples to average?). However, averaging will usually smooth out any extreme spikes in your signal, so be wary of that.
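If spikes matter, one commonly used alternative (not what the answer above implements) is to keep the minimum and maximum of each bucket of samples instead of a single picked or averaged value, and to draw a vertical line between them for each pixel column. A minimal sketch, assuming the samples are doubles; the Downsampler name and its parameters are illustrative only:

```csharp
using System;
using System.Collections.Generic;

public static class Downsampler
{
    // Splits data[start .. start+count) into `buckets` ranges and returns
    // the (Min, Max) pair of each range, so short spikes stay visible.
    public static List<(double Min, double Max)> MinMax(
        IReadOnlyList<double> data, int start, int count, int buckets)
    {
        var result = new List<(double Min, double Max)>(Math.Max(buckets, 0));
        int end = Math.Min(start + count, data.Count);
        int length = end - start;
        if (length <= 0 || buckets <= 0) return result;

        for (int b = 0; b < buckets; b++)
        {
            // long arithmetic avoids overflow with millions of samples
            int from = start + (int)((long)b * length / buckets);
            int to = start + (int)((long)(b + 1) * length / buckets);
            if (to <= from) continue; // more buckets than samples

            double min = data[from], max = data[from];
            for (int i = from + 1; i < to; i++)
            {
                if (data[i] < min) min = data[i];
                if (data[i] > max) max = data[i];
            }
            result.Add((min, max));
        }
        return result;
    }
}
```

With buckets set to the bitmap width in pixels, the work per redraw is proportional to the number of visible samples, and the envelope of the signal is preserved.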
Another approach would be to create some generalized versions of your dataset for different ranges, which update themselves whenever you receive a new signal. You usually don't need to update the complete dataset; just updating the end of your set is probably good enough. This allows you to do a bit more advanced processing of your data, but it will cost more memory. You will have to cache the distinct 'layers' of detail in your application.
However, reading your (short) explanation, I think a display-time optimization might be good enough. You will always get a distortion in your signal if you generalize. You always lose data. It's up to the algorithm you choose on how this distortion will occur, and how noticeable it will be.

You need a better sampling algorithm. You can also employ the parallel processing features of C#; refer to the Task Parallel Library.

PeekRange on a stack in C#?

I have a program that needs to store data values and periodically get the last 'x' data values.
I initially thought a stack was the way to go, but I need to be able to see more than just the top value - something like a PeekRange method where I can peek the last 'x' number of values.
At the moment I'm just using a list and get the last, say, 20 values like this:
var last20 = myList.Skip(myList.Count - 20).ToList();
The list grows all the time the program runs, but I only ever want the last 20 values. Could someone give some advice on a better data structure?

I'd probably be using a ring buffer. It's not hard to implement one on your own; AFAIK there's no implementation provided by the Framework.
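A minimal sketch of such a ring buffer, to show the idea; the class and the PeekRange name (borrowed from the question) are illustrative, not an existing Framework type:

```csharp
using System;
using System.Collections.Generic;

// Fixed-capacity buffer that overwrites its oldest item when full.
public sealed class RingBuffer<T>
{
    private readonly T[] _items;
    private int _next;  // slot that the next Add will write to
    private int _count; // number of stored items, at most the capacity

    public RingBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _items = new T[capacity];
    }

    public int Count => _count;

    public void Add(T item)
    {
        _items[_next] = item;
        _next = (_next + 1) % _items.Length;
        if (_count < _items.Length) _count++;
    }

    // Returns the newest n items (or fewer if not enough are stored), oldest first.
    public List<T> PeekRange(int n)
    {
        n = Math.Max(0, Math.Min(n, _count));
        var result = new List<T>(n);
        int start = (_next - n + _items.Length) % _items.Length;
        for (int i = 0; i < n; i++)
            result.Add(_items[(start + i) % _items.Length]);
        return result;
    }
}
```

Add is O(1) and never allocates, so memory stays bounded no matter how long the program runs.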

Well since you mentioned the stack, I guess you only need modifications at the end of the list?
In that case the list is actually a nice solution (cache efficient and with fast insertion/removal at the end). However, your way of extracting the last few items is somewhat inefficient, because IEnumerable<T> won't expose the random access provided by the List. So the Skip() implementation has to scan the whole List until it reaches the end (or do a runtime type check first to detect that the container implements IList<T>). It is more efficient to either access the items directly by index, or (if you need a second array) to use List<T>.CopyTo().
If you need fast removal/insertion at the beginning, you may want to consider a ring buffer or (doubly) linked list (see LinkedList<T>). The linked list will be less cache-efficient, but it is easy and efficient to navigate and alter from both directions. The ring buffer is a bit harder to implement, but will be more cache- and space-efficient. So it's probably better if only small value types or reference types are stored, especially when the buffer's size is fixed.
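For the List<T>.CopyTo() route mentioned above, a small helper could look like this. The ListTail and LastItems names are made up for the example; the CopyTo overload taking a source index and a count is part of List<T>.

```csharp
using System;
using System.Collections.Generic;

public static class ListTail
{
    // Copies the last n items (or fewer, if the list is shorter) into a new array.
    public static T[] LastItems<T>(List<T> list, int n)
    {
        int count = Math.Min(Math.Max(n, 0), list.Count);
        var tail = new T[count];
        // CopyTo(sourceIndex, destination, destinationIndex, count)
        list.CopyTo(list.Count - count, tail, 0, count);
        return tail;
    }
}
```

This touches only the items that are returned, regardless of how long the list has grown.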

You could just call RemoveAt(0) after each Add (if the list is longer than 20), so the list will never be longer than 20 items.

You said stack, but you also said you only ever want the last 20 items. I don't think these two requirements really go together.
I would say that Johannes is right about a ring buffer. It is VERY easy to implement this yourself in .NET; just use a Queue<T> and once you reach your capacity (20) start dequeuing (popping) on every enqueue (push).
If you want your PeekRange to enumerate from the most recent to least recent, you can define GetEnumerator to do something like return _queue.Reverse().GetEnumerator();
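Put together, the Queue<T> idea could be sketched as follows. RecentItems is a hypothetical name; Reverse() here is the LINQ extension method, since Queue<T> has no Reverse of its own.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Keeps at most `capacity` items; enumerates newest first.
public sealed class RecentItems<T> : IEnumerable<T>
{
    private readonly Queue<T> _queue;
    private readonly int _capacity;

    public RecentItems(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
        _queue = new Queue<T>(capacity);
    }

    public void Add(T item)
    {
        // Drop the oldest item once the capacity is reached.
        if (_queue.Count == _capacity) _queue.Dequeue();
        _queue.Enqueue(item);
    }

    public IEnumerator<T> GetEnumerator() => _queue.Reverse().GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Enumerating copies the queue contents once per call, which is fine for 20 items but worth keeping in mind for much larger capacities.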

Whoops, .Take() won't do it.
Here's an implementation of .TakeLast()
http://www.codeproject.com/Articles/119666/LINQ-Introducing-The-Take-Last-Operators.aspx

How to produce precisely-timed tone and silence?

I have a C# project that plays Morse code for RSS feeds. I wrote it using Managed DirectX, only to discover that Managed DirectX is old and deprecated. The task I have is to play pure sine wave bursts interspersed with silence periods (the code) which are precisely timed as to their duration. I need to be able to call a function which plays a pure tone for so many milliseconds, then Thread.Sleep() then play another, etc. At its fastest, the tones and spaces can be as short as 40ms.
It's working quite well in Managed DirectX. To get the precisely timed tone I create 1 sec. of sine wave into a secondary buffer, then to play a tone of a certain duration I seek forward to within x milliseconds of the end of the buffer then play.
I've tried System.Media.SoundPlayer. It's a loser [edit - see my answer below] because you have to Play(), Sleep(), then Stop() for arbitrary tone lengths. The result is a tone that is too long, variable by CPU load. It takes an indeterminate amount of time to actually stop the tone.
I then embarked on a lengthy attempt to use NAudio 1.3. I ended up with a memory resident stream providing the tone data, and again seeking forward leaving the desired length of tone remaining in the stream, then playing. This worked OK on the DirectSoundOut class for a while (see below) but the WaveOut class quickly dies with an internal assert saying that buffers are still on the queue despite PlayerStopped = true. This is odd since I play to the end then put a wait of the same duration between the end of the tone and the start of the next. You'd think that 80ms after starting Play of a 40 ms tone that it wouldn't have buffers on the queue.
DirectSoundOut works well for a while, but its problem is that for every tone burst Play() it spins off a separate thread. Eventually (5 min or so) it just stops working. You can see thread after thread after thread exiting in the Output window while running the project in VS2008 IDE. I don't create new objects during playing, I just Seek() the tone stream then call Play() over and over, so I don't think it's a problem with orphaned buffers/whatever piling up till it's choked.
I'm out of patience on this one, so I'm asking in the hopes that someone here has faced a similar requirement and can steer me in a direction with a likely solution.

I can't believe it... I went back to System.Media.SoundPlayer and got it to do just what I want... no giant dependency library with 95% unused code and/or quirks waiting to be discovered :-). Furthermore, it runs on MacOSX under Mono (2.6)!!! [wrong - no sound, will ask separate question]
I used a MemoryStream and BinaryWriter to crib a WAV file, complete with the RIFF header and chunking. No "fact" chunk needed, this is 16-bit samples at 44100Hz. So now I have a MemoryStream with 1000ms of samples in it, and wrapped by a BinaryReader.
In a RIFF file there are two 4-byte/32-bit lengths, the "overall" length which is 4 bytes into the stream (right after "RIFF" in ASCII), and a "data" length just before the sample data bytes. My strategy was to seek in the stream and use the BinaryWriter to alter the two lengths to fool the SoundPlayer into thinking the audio stream is just the length/duration I want, then Play() it. Next time, the duration is different, so once again overwrite the lengths in the MemoryStream with the BinaryWriter, Flush() it and once again call Play().
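To illustrate the two length fields, here is a rough sketch of building a mono 16-bit PCM WAV in a MemoryStream and then overwriting both lengths in place. It is not the code from the project; the WavPatcher name, the 0.8 amplitude factor and the fixed 44-byte header (no extra chunks) are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavPatcher
{
    public const int SampleRate = 44100;
    private const int RiffSizeOffset = 4;  // "overall" length, right after "RIFF"
    private const int DataSizeOffset = 40; // "data" length, right before the samples
    private const int HeaderSize = 44;

    // Builds a complete WAV (mono, 16-bit) holding a sine tone of the given length.
    public static MemoryStream CreateTone(double frequency, int milliseconds)
    {
        int sampleCount = (int)((long)SampleRate * milliseconds / 1000);
        int dataBytes = sampleCount * 2;
        var stream = new MemoryStream(HeaderSize + dataBytes);
        var w = new BinaryWriter(stream); // not disposed: that would close the stream

        w.Write(Encoding.ASCII.GetBytes("RIFF"));
        w.Write(36 + dataBytes);          // bytes following this field
        w.Write(Encoding.ASCII.GetBytes("WAVE"));
        w.Write(Encoding.ASCII.GetBytes("fmt "));
        w.Write(16);                      // size of the fmt chunk
        w.Write((short)1);                // format tag 1 = PCM
        w.Write((short)1);                // channels
        w.Write(SampleRate);
        w.Write(SampleRate * 2);          // bytes per second
        w.Write((short)2);                // block align
        w.Write((short)16);               // bits per sample
        w.Write(Encoding.ASCII.GetBytes("data"));
        w.Write(dataBytes);
        for (int i = 0; i < sampleCount; i++)
        {
            double s = Math.Sin(2 * Math.PI * frequency * i / SampleRate);
            w.Write((short)(s * short.MaxValue * 0.8));
        }
        w.Flush();
        stream.Position = 0;
        return stream;
    }

    // Overwrites both length fields so a player sees only the first N milliseconds.
    public static void SetDuration(MemoryStream wav, int milliseconds)
    {
        int maxBytes = (int)wav.Length - HeaderSize;
        int wanted = (int)((long)SampleRate * milliseconds / 1000) * 2;
        int dataBytes = Math.Min(wanted, maxBytes);
        var w = new BinaryWriter(wav);
        wav.Position = RiffSizeOffset;
        w.Write(36 + dataBytes);
        wav.Position = DataSizeOffset;
        w.Write(dataBytes);
        w.Flush();
        wav.Position = 0;
    }
}
```

After SetDuration the stream is rewound, so it can be handed to a player again. The samples beyond the shortened length stay in memory and become audible again if a longer duration is set later.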
When I tried this, I couldn't get the SoundPlayer to see the changes to the stream, even if I set its Stream property. I was forced to create a new SoundPlayer... every 40 milliseconds??? No.
Well I went back to that code today and started looking at the SoundPlayer members. I saw "SoundLocation" and read it. There it said that a side effect of setting SoundLocation would be to null the Stream property, and vice versa for Stream. So I added a line of code to set the SoundLocation property to something bogus, "x", then set the Stream property to my (just modified) MemoryStream. Damn if it didn't pick that up and play a tone precisely as long as I asked for. There don't seem to be any crazy side effects like dead time afterward or increasing memory, or ??? It does take 1-2 milliseconds to do that tweaking of the WAV stream and then load/start the player, but it's very small and the price is right!
I also implemented a Frequency property which re-generates the samples and uses the Seek/BinaryWriter trick to overlay the old data in the RIFF/WAV MemoryStream with the same number of samples but for a different frequency, and again did the same thing for an Amplitude property.
This project is on SourceForge. You can get to the C# code for this hack in SPTones.CS from this page in the SVN browser. Thanks to everyone who provided info on this, including @arke whose thinking was close to mine. I do appreciate it.

It's best to just generate the sine waves and silence together into a buffer which you play. That is, always play something, but write whatever you need next into that buffer.
You know the samplerate, and given the samplerate, you can calculate the amount of samples you need to write.
uint numSamples = timeWantedInSeconds * sampleRate;
That's the amount of samples you need to generate a sine wave or silence, whichever. Then just fill the buffer as needed. That way, you get the most accurate possible timing.
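As a rough sketch of that idea, the helper below renders a whole sequence of tones and silences into one 16-bit sample buffer, so the timing is fixed by sample counts rather than by Thread.Sleep(). The ToneSequence name, the tuple-based segment list and the 0.8 amplitude factor are assumptions for the example; how the buffer is then played is left open.

```csharp
using System;

public static class ToneSequence
{
    // Renders tone/silence segments back to back into one mono 16-bit buffer.
    public static short[] Render(
        int sampleRate, double frequency, params (bool Tone, int Milliseconds)[] segments)
    {
        // First pass: total length in samples.
        int total = 0;
        foreach (var seg in segments)
            total += SamplesFor(sampleRate, seg.Milliseconds);

        var buffer = new short[total]; // zero-initialised, so silence needs no work
        int pos = 0;
        foreach (var seg in segments)
        {
            int n = SamplesFor(sampleRate, seg.Milliseconds);
            if (seg.Tone)
            {
                // Each tone starts at phase zero.
                for (int i = 0; i < n; i++)
                {
                    double s = Math.Sin(2 * Math.PI * frequency * i / sampleRate);
                    buffer[pos + i] = (short)(s * short.MaxValue * 0.8);
                }
            }
            pos += n;
        }
        return buffer;
    }

    // numSamples = duration in seconds * sample rate
    private static int SamplesFor(int sampleRate, int milliseconds)
    {
        return (int)((long)sampleRate * milliseconds / 1000);
    }
}
```

For example, Render(44100, 700, (true, 40), (false, 40), (true, 120)) would produce a dot, a gap and a dash at 700 Hz. Tones cut off abruptly may click; a short fade at each edge is a common refinement, not shown here.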

Try using XNA.
You will have to provide a file, or a stream to a static tone, that you can loop. You can then change the pitch and volume of that tone.
Since XNA is made for games, it will have no problem at all with 40 ms delays.

It should be pretty easy to convert from ManagedDX to SlimDX ...
Edit: What stops you, btw, just pre-generating 'n' samples of sine wave? (Where n is the closest to the number of milliseconds you want.) It really doesn't take all that long to generate the data. Beyond that, if you have a 22 kHz buffer and you want the final 100 samples, why don't you just submit 'buffer + 21950' and set the buffer length to 100 samples?

How to count the data sent or received by my PC (all processes/programs)?

I need to count the amount (in B/kB/MB/whatever) of data sent and received by my PC, by every running program/process.
Let's say I click "Start counting" and I get the sum of everything sent/received by my browser, FTP client, system updates etc. from that moment till I choose "Stop".
To make it simpler, I want to count data transferred via TCP only - if it matters.
For now, I got the combo list of NICs in the PC (based on the comment in the link below).
I tried to change the code given here but I failed, getting strange out-of-nowhere values in dataSent/dataReceived.
I also read the answer at the question 442409 but as I can see it is about the data sent/received by the same program, which doesn't fit my requirements.

Perfmon should have counters for this type of thing that you want to do, so look there first.

Alright, I think I've found the solution, but maybe someone will suggest something better...
I made a timer (tested it with a 10ms interval), which gets the "Bytes Received/sec" PerformanceCounter value, adds it to a global "temporary" variable and also increments the sample counter (in case there is any lag). Then I made a second timer with a 1s interval, which takes the temporary sum, divides it by the counter and adds the result to the overall amount (also global). Then it resets the temporary sum and the counter.
I'm just not sure if it is the right method, because I don't know how the values of the "Bytes Received/sec" PerformanceCounter vary during one second. Maybe I should make some kind of histogram and get the average value?
For now, downloading 8.6MB file gave me 9.2MB overall amount - is it possible the other processes would generate that amount of net activity in less than 20 seconds?
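A different technique that avoids averaging a rate, offered only as a sketch: read the cumulative byte counters that System.Net.NetworkInformation exposes per interface, once at "Start" and once at "Stop", and subtract. The TrafficSnapshot type is hypothetical. The counters cover everything that passes through the interface rather than TCP only, so some excess over the downloaded file size should still be expected.

```csharp
using System;
using System.Net.NetworkInformation;

// Hypothetical helper: cumulative bytes summed over all active, non-loopback interfaces.
public readonly struct TrafficSnapshot
{
    public TrafficSnapshot(long received, long sent)
    {
        Received = received;
        Sent = sent;
    }

    public long Received { get; }
    public long Sent { get; }

    // Reads the current cumulative counters from the operating system.
    public static TrafficSnapshot Take()
    {
        long received = 0, sent = 0;
        foreach (NetworkInterface nic in NetworkInterface.GetAllNetworkInterfaces())
        {
            if (nic.OperationalStatus != OperationalStatus.Up) continue;
            if (nic.NetworkInterfaceType == NetworkInterfaceType.Loopback) continue;
            IPv4InterfaceStatistics stats = nic.GetIPv4Statistics();
            received += stats.BytesReceived;
            sent += stats.BytesSent;
        }
        return new TrafficSnapshot(received, sent);
    }

    // Traffic that happened between the baseline and this snapshot.
    public TrafficSnapshot Since(TrafficSnapshot baseline)
    {
        return new TrafficSnapshot(Received - baseline.Received, Sent - baseline.Sent);
    }
}
```

Usage would be: var start = TrafficSnapshot.Take(); on "Start counting", then TrafficSnapshot.Take().Since(start) on "Stop". No timer is needed, because the operating system keeps the running totals.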
