UWP AudioGraph : Garbage Collector causes clicks in the audio output

I have a C# UWP application that uses the AudioGraph API.
I use a custom effect on a MediaSourceAudioInputNode.
I followed the sample on this page :
https://learn.microsoft.com/en-us/windows/uwp/audio-video-camera/custom-audio-effects
It works but I can hear multiple clicks per second in the speakers when the custom effect is running.
Here is the code for my ProcessFrame method :
public unsafe void ProcessFrame(ProcessAudioFrameContext context)
{
if (context == null)
{
throw new ArgumentNullException(nameof(context));
}
AudioFrame frame = context.InputFrame;
using (AudioBuffer inputBuffer = frame.LockBuffer(AudioBufferAccessMode.Read))
using (IMemoryBufferReference inputReference = inputBuffer.CreateReference())
{
((IMemoryBufferByteAccess)inputReference).GetBuffer(out byte* inputDataInBytes, out uint inputCapacity);
Span<float> samples = new Span<float>(inputDataInBytes, (int)inputCapacity / sizeof(float));
for (int i = 0; i < samples.Length; i++)
{
float sample = samples[i];
// sample processing...
samples[i] = sample;
}
}
}
I used the Visual Studio profiler to identify the cause of the problem.
It is clear that there is a memory problem. Garbage collection runs several times each second, and at each garbage collection I can hear a click.
The Visual Studio profiler shows that the garbage-collected objects are of type ProcessAudioFrameContext.
These objects are created by the AudioGraph API before entering the ProcessFrame method and passed as a parameter to the method.
Is there something that I can do to avoid these frequent garbage collections ?

The problem is not specific to custom effects; it is a general problem with AudioGraph (the current SDK is 1809).
A garbage collection can pause the AudioGraph thread for too long (more than 10 ms, which is the default size of the audio buffers). The result is audible clicks in the audio output.
Custom effects put a lot of pressure on the garbage collector.
I found a good workaround that uses the GC.TryStartNoGCRegion method.
After this method is called, the clicks disappear completely. However, the app's memory use keeps growing until GC.EndNoGCRegion is called.
// at the beginning of playback...
// 240 MB is the amount of memory that can be allocated before a GC occurs
GC.TryStartNoGCRegion(240 * 1024 * 1024, true);
// ... at the end of playback
GC.EndNoGCRegion();
MSDN doc :
https://learn.microsoft.com/fr-fr/dotnet/api/system.gc.trystartnogcregion?view=netframework-4.7.2
And a good article :
https://mattwarren.org/2016/08/16/Preventing-dotNET-Garbage-Collections-with-the-TryStartNoGCRegion-API/
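One detail worth guarding against: GC.TryStartNoGCRegion returns a bool, and GC.EndNoGCRegion throws an InvalidOperationException if the no-GC region is not in effect any more (for example because more memory was allocated than the requested budget). Below is a minimal sketch of a wrapper, assuming a hypothetical helper class named NoGcPlayback; the name and structure are mine, not part of any API.

```csharp
using System;
using System.Runtime;

// Hypothetical helper: wraps the no-GC region used during playback.
public static class NoGcPlayback
{
    // Returns true only if the runtime actually entered the no-GC region.
    public static bool TryBegin(long budgetBytes)
    {
        try
        {
            return GC.TryStartNoGCRegion(budgetBytes, true);
        }
        catch (ArgumentOutOfRangeException)
        {
            return false; // budget is larger than the runtime allows
        }
        catch (InvalidOperationException)
        {
            return false; // a no-GC region is already active
        }
    }

    // Ends the region only if it is still in effect, so this never throws.
    public static void End()
    {
        if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
        {
            GC.EndNoGCRegion();
        }
    }
}
```

Call NoGcPlayback.TryBegin(240L * 1024 * 1024) at the beginning of playback and NoGcPlayback.End() at the end; if TryBegin returns false, playback still works but the clicks may come back.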

The garbage collector is probably reacting to you allocating temporary memory for the samples on every frame, which is then released after the frame. Try allocating the memory that holds the samples once in your start-up code and just reuse it every frame.
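A minimal sketch of what "allocate once, reuse every frame" could look like. The GainEffect class and its members are hypothetical names for illustration, and the gain is only a stand-in for the real sample processing.

```csharp
using System;

// Hypothetical effect: the scratch buffer is allocated once in the constructor
// and reused for every frame, so Process itself allocates nothing on the heap.
public sealed class GainEffect
{
    private readonly float[] scratch;

    public GainEffect(int maxSamplesPerFrame)
    {
        scratch = new float[maxSamplesPerFrame];
    }

    public void Process(Span<float> samples, float gain)
    {
        if (samples.Length > scratch.Length)
        {
            throw new ArgumentException("Frame is larger than the preallocated buffer.", nameof(samples));
        }

        // Work on the preallocated buffer instead of a new array per frame.
        Span<float> work = scratch.AsSpan(0, samples.Length);
        samples.CopyTo(work);
        for (int i = 0; i < work.Length; i++)
        {
            work[i] *= gain;
        }
        work.CopyTo(samples);
    }
}
```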

Related

Improve multi-threaded code design to prevent race condition

I'm running into an issue that I'm not sure is solvable in the way I want to solve it. I have a problem with a race condition.
I have one project running as a C++ dll (the main engine).
Then I have a second C# process that uses C++/CLI to communicate with the main engine (the editor).
The editor is hosting the engine window as a child window. The result of this is that the child window receives input messages async (see RiProcessMouseMessage()). Normally this only happens when I call window->PollEvents();.
main engine loop {
RiProcessMouseMessage(); // <- Called by the default windows message poll function from the child window
foreach(inputDevice)
inputDevice->UpdateState();
otherCode->UseCurrentInput();
}
The main editor loop is the WPF loop which I don't control. Basically it does this:
main editor loop {
RiProcessMouseMessage(); // <- This one is called by the editor (parent) window, but is using the message loop of the (child) engine window
}
The RawInput processor which is called sync by the engine and async by the editor:
void Win32RawInput::RiProcessMouseMessage(const RAWMOUSE& rmouse, HWND hWnd) {
MouseState& state = Input::mouse._GetGatherState();
// Check Mouse Position Relative Motion
if (rmouse.usFlags == MOUSE_MOVE_RELATIVE) {
vec2f delta((float)rmouse.lLastX, (float)rmouse.lLastY);
delta *= MOUSE_SCALE;
state.movement += delta;
POINT p;
GetCursorPos(&p);
state.cursorPosGlobal = vec2i(p.x, p.y);
ScreenToClient(hWnd, &p);
state.cursorPos = vec2i(p.x, p.y);
}
// Check Mouse Wheel Relative Motion
if (rmouse.usButtonFlags & RI_MOUSE_WHEEL)
state.scrollMovement.y += ((float)(short)rmouse.usButtonData) / WHEEL_DELTA;
if (rmouse.usButtonFlags & RI_MOUSE_HWHEEL)
state.scrollMovement.x += ((float)(short)rmouse.usButtonData) / WHEEL_DELTA;
// Store Mouse Button States
for (int i = 0; i < 5; i++) {
if (rmouse.usButtonFlags & maskDown_[i]) {
state.mouseButtonState[i].pressed = true;
state.mouseButtonState[i].changedThisFrame = true;
} else if (rmouse.usButtonFlags & maskUp_[i]) {
state.mouseButtonState[i].pressed = false;
state.mouseButtonState[i].changedThisFrame = true;
}
}
}
UpdateState() is called only by the engine. It basically swaps the RawInput to the currently used input. This is to prevent input updating in the middle of a frame loop (aka. during otherCode->UseCurrentInput();)
void UpdateState() {
currentState = gatherState; // Copy gather state to current
Reset(gatherState); // Reset the gather buffer so it starts clean the next time it is used
// Use current state to check stuff
// For the rest of this frame currentState should be used
}
MouseState& _GetGatherState() { return gatherState; }
void Reset(MouseState& state) { // Might need a lock around gatherState :(
state.movement = vec2f::zero;
state.scrollMovement = vec2f::zero;
for (int i = 0; i < 5; ++i)
state.mouseButtonState[i].changedThisFrame = false;
}
So as you can see, the race condition happens when RiProcessMouseMessage() is called while Reset() is running in the main engine loop. If it wasn't clear: the Reset() function is required to reset the state back to its per-frame default data so that the data is read correctly every frame.
Now I'm very much aware I can fix this easily by adding a mutex around the gatherState updates but I would like to avoid this if possible. Basically I'm asking is it possible to redesign this code to be lock free?

You are asking for a lock-free design, which is not really possible if both ends modify the same buffer. If what you want is locking that is optimized and almost instantaneous, you can use FIFO logic instead: write the updates to a queue such as .NET's ConcurrentQueue (https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentqueue-1?view=net-5.0) and poll the updates from that queue.
If you really want to get rid of the lock, look at lock-free circular arrays, also known as lock-free ring buffers. Within limits, a lock-free ring buffer can work when one end only writes and the other end only reads, within known intervals/boundaries. A similar question has been asked here:
Circular lock-free buffer
If you want to dig down to the hardware level to understand the logic behind this, see https://electronics.stackexchange.com/questions/317415/how-to-allow-thread-and-interrupt-safe-writing-of-incoming-usart-data-on-freerto, which gives an idea of concurrency at the low level as well.
Boost has well-known lock-free implementations: https://www.boost.org/doc/libs/1_65_1/doc/html/lockfree.html
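To illustrate the FIFO idea on the .NET side, here is a minimal sketch using ConcurrentQueue. MouseDelta and MouseInputQueue are hypothetical names, not taken from the question: the message handler only enqueues, and the engine loop drains the queue once per frame, so neither side ever resets state the other is writing to.

```csharp
using System.Collections.Concurrent;

// Hypothetical event type: one relative mouse movement.
public readonly struct MouseDelta
{
    public readonly float Dx;
    public readonly float Dy;

    public MouseDelta(float dx, float dy)
    {
        Dx = dx;
        Dy = dy;
    }
}

public sealed class MouseInputQueue
{
    private readonly ConcurrentQueue<MouseDelta> pending = new ConcurrentQueue<MouseDelta>();

    // Called from whichever thread receives the raw input message.
    public void Post(float dx, float dy)
    {
        pending.Enqueue(new MouseDelta(dx, dy));
    }

    // Called once per frame by the engine: sums everything queued so far.
    // Events that arrive during the drain are picked up now or next frame.
    public MouseDelta DrainFrame()
    {
        float x = 0f;
        float y = 0f;
        while (pending.TryDequeue(out MouseDelta d))
        {
            x += d.Dx;
            y += d.Dy;
        }
        return new MouseDelta(x, y);
    }
}
```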

Memory leak analysis and help requested

I've been using the methodology outlined by Shivprasad Koirala to check for memory leaks from code running inside a C# application (VoiceAttack). It basically involves using the Performance Monitor to track an application's private bytes as well as bytes in all heaps and compare these counters to assess if there is a leak and what type (managed/unmanaged). Ideally I need to test outside of Visual Studio, which is why I'm using this method.
The following portion of code generates the below memory profile (bear in mind the code has a little different format compared to Visual Studio because this is a function contained within the main C# application):
public void main()
{
string FilePath = null;
using (FileDialog myFileDialog = new OpenFileDialog())
{
myFileDialog.Title = "this is the title";
myFileDialog.FileName = "testFile.txt";
myFileDialog.Filter = "txt files (*.txt)|*.txt|All files (*.*)|*.*";
myFileDialog.FilterIndex = 1;
if (myFileDialog.ShowDialog() == DialogResult.OK)
{
FilePath = myFileDialog.FileName;
var extension = Path.GetExtension(FilePath);
var compareType = StringComparison.InvariantCultureIgnoreCase;
if (extension.Equals(".txt", compareType) == false)
{
FilePath = null;
VA.WriteToLog("Selected file is not a text file. Action canceled.");
}
else
VA.WriteToLog(FilePath);
}
else
VA.WriteToLog("No file selected. Action canceled.");
}
VA.WriteToLog("done");
}
You can see that after running this code the private bytes don't come back to the original count and the bytes in all heaps are roughly constant, which implies that there is a portion of unmanaged memory that was not released. Running this same inline function a few times consecutively doesn't cause further increases to the maximum observed private bytes or the unreleased memory. Once the main C# application (VoiceAttack) closes all the related memory (including the memory for the above code) is released. The bad news is that under normal circumstances the main application may be kept running indefinitely by the user, causing the allocated memory to remain unreleased.
For good measure I threw this same code into VS (with a pair of Thread.Sleep(5000) added before and after the using block for better graphical analysis) and built an executable to track with the Performance Monitor method, and the result is the same. There is an initial unmanaged memory jump for the OpenFileDialog and the allocated unmanaged memory never comes back down to the original value.
Does the memory and leak tracking methodology outlined above make sense? If YES, is there anything that can be done to properly release the unmanaged memory?

Does the memory and leak tracking methodology outlined above make sense?
No. You shouldn't expect unmanaged committed memory (Private Bytes) to always be released. For instance, processes have an unmanaged heap, which is kept around to allow for subsequent allocations. And since Windows can page your committed memory, it isn't critical to minimize each process's committed memory.

If repeated calls don't increase memory use, you don't have a memory leak, you have delayed initialization. Some components aren't initialized until you use them, so their memory usage isn't being taken into account when you establish your baseline.
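A rough way to tell the two apart in code: run the action several times, sample private bytes after each run, and check whether usage keeps climbing after the first (warm-up) run. This is only a sketch; the LeakCheck class, its method names, and the tolerance are my own assumptions.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for distinguishing a leak from delayed initialization.
public static class LeakCheck
{
    public static long PrivateBytesNow()
    {
        using (Process p = Process.GetCurrentProcess())
        {
            p.Refresh();
            return p.PrivateMemorySize64;
        }
    }

    // Runs the action 'runs' times and records private bytes after each run.
    public static long[] SampleAfterEachRun(Action action, int runs)
    {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++)
        {
            action();
            samples[i] = PrivateBytesNow();
        }
        return samples;
    }

    // samples[0] already includes one-time initialization, so only growth
    // between later runs counts. True if every later run grew by more than
    // the tolerance.
    public static bool KeepsGrowing(long[] samples, long toleranceBytes)
    {
        if (samples.Length < 2)
        {
            return false;
        }
        for (int i = 1; i < samples.Length; i++)
        {
            if (samples[i] - samples[i - 1] <= toleranceBytes)
            {
                return false;
            }
        }
        return true;
    }
}
```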

CPU-greedy loop when streaming music

To give some context, I'm working on an open-source alternative desktop Spotify client, with accessibility at its core. You'll also see some NAudio in here.
I'm noticing pretty intense CPU usage as soon as playback starts. Even when paused, the CPU usage is high.
I ran Visual Studio's built-in profiler to try and shed some light on any resource hogs that might be occurring. As I suspected, the problem was in my playback manager's streaming loop.
The code that the profiler flags as one of the most sample-rich is as follows:
const int secondsToBuffer = 3;
private void GetStreaming(object state)
{
this.fullyDownloaded = false;
// secondsToBuffer is an integer to represent how many seconds we should buffer up at once to prevent choppy playback on slow connections
try
{
do
{
if (bufferedWaveProvider == null)
{
this.bufferedWaveProvider = new BufferedWaveProvider(new WaveFormat(44100, 2));
this.bufferedWaveProvider.BufferDuration = TimeSpan.FromSeconds(20); // allow us to get well ahead of ourselves
Logger.WriteDebug("Creating buffered wave provider");
this.gatekeeper.MinimumSampleSize = bufferedWaveProvider.WaveFormat.AverageBytesPerSecond * secondsToBuffer;
}
// this bit in particular seems to be the hot point
if (bufferedWaveProvider != null && bufferedWaveProvider.BufferLength - bufferedWaveProvider.BufferedBytes < bufferedWaveProvider.WaveFormat.AverageBytesPerSecond / 4)
{
Logger.WriteDebug("Buffer getting full, taking a break");
Thread.Sleep(500);
}
// do we have at least double the buffered sample's size in free space, just in case
else if (bufferedWaveProvider.BufferLength - bufferedWaveProvider.BufferedBytes > bufferedWaveProvider.WaveFormat.AverageBytesPerSecond * (secondsToBuffer * 2))
{
var sample = gatekeeper.Read();
if (sample != null)
{
bufferedWaveProvider.AddSamples(sample, 0, sample.Length);
}
}
} while (playbackState != StreamingPlaybackState.Stopped);
Logger.WriteDebug("Playback stopped");
}
finally
{
// no post-processing work here, right?
}
}
An NAudio sample was the inspiration for my way of handling streaming in this method. To find the full file's source code, you can view it here: http://blindspot.codeplex.com/SourceControl/latest#Blindspot.Playback/PlaybackManager.cs
I'm a newbie to profiling and I'm no expert on streaming either (both might be obvious).
Is there any way I can make this loop less resource intensive? Would increasing the sleep amount in the if block where the buffer is full help? Or am I barking up the wrong tree here? It seems like it would, but I'd have thought half a second would be sufficient.
Any help gratefully received.

Basically, you've created an infinite loop until the buffer gets full. The section you've marked with
// this bit in particular seems to be the hot point
probably appears hot because the calculations in the if statement are being repeated over and over again; can any of them be moved outside of the loop?
I'd put a Thread.Sleep(50) before the while statement to prevent thrashing and see if that makes a difference (I suspect it will).
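To make the spinning case explicit: with the two conditions from the question there is a middle zone (free space between a quarter of a second and twice secondsToBuffer) where neither branch runs, so the loop goes around again immediately. The sketch below isolates that decision into a pure function; StreamStep and StreamingPolicy are hypothetical names, and the thresholds are copied from the question's code.

```csharp
// Hypothetical names; the thresholds mirror the conditions in the question.
public enum StreamStep
{
    SleepLong,  // buffer nearly full: the original Thread.Sleep(500) branch
    ReadSample, // plenty of room: read from the gatekeeper and add samples
    SleepShort  // in between: the original loop spins here without sleeping
}

public static class StreamingPolicy
{
    public static StreamStep Decide(int bufferLength, int bufferedBytes, int bytesPerSecond, int secondsToBuffer)
    {
        int free = bufferLength - bufferedBytes;
        if (free < bytesPerSecond / 4)
        {
            return StreamStep.SleepLong;
        }
        if (free > bytesPerSecond * (secondsToBuffer * 2))
        {
            return StreamStep.ReadSample;
        }
        return StreamStep.SleepShort;
    }
}
```

In the loop, SleepShort would map to something like Thread.Sleep(50), and the same short sleep makes sense when gatekeeper.Read() returns null.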

Why is my C# program faster in a profiler?

I have a relatively large system (~25000 lines so far) for monitoring radio-related devices. It shows graphs and such using latest version of ZedGraph.
The program is coded using C# on VS2010 with Win7.
The problem is:
when I run the program from within VS, it runs slow
when I run the program from the built EXE, it runs slow
when I run the program through Performance Wizard / CPU Profiler, it runs Blazing Fast.
when I run the program from the built EXE, and then start VS and Attach a profiler to ANY OTHER PROCESS, my program speeds up!
I want the program to always run that fast!
Every project in the solution is set to RELEASE, Debug unmanaged code is DISABLED, Define DEBUG and TRACE constants is DISABLED, Optimize Code - I tried either, Warning Level - I tried either, Suppress JIT - I tried either,
in short I tried all the solutions already proposed on StackOverflow - none worked. Program is slow outside profiler, fast in profiler.
I don't think the problem is in my code, because it becomes fast if I attach the profiler to other, unrelated process as well!
Please help!
I really need it to be that fast everywhere, because it's a business critical application and performance issues are not tolerated...
UPDATES 1 - 8 follow
--------------------Update1:--------------------
The problem seems to Not be ZedGraph related, because it still manifests after I replaced ZedGraph with my own basic drawing.
--------------------Update2:--------------------
Running the program in a Virtual machine, the program still runs slow, and running profiler from the Host machine doesn't make it fast.
--------------------Update3:--------------------
Starting screen capture to video also speeds the program up!
--------------------Update4:--------------------
If I open the Intel graphics driver settings window (this thing: http://www.intel.com/support/graphics/sb/img/resolution_new.jpg)
and just constantly hover with the cursor over buttons, so they glow, etc, my program speeds up!.
It doesn't speed up if I run GPUz or Kombustor though, so there is no downclocking on the GPU - it stays at a steady 850 MHz.
--------------------Update5:--------------------
Tests on different machines:
-On my Core i5-2400S with Intel HD2000, UI runs slow and CPU usage is ~15%.
-On a colleague's Core 2 Duo with Intel G41 Express, UI runs fast, but CPU usage is ~90% (which isn't normal either)
-On Core i5-2400S with dedicated Radeon X1650, UI runs blazing fast, CPU usage is ~50%.
--------------------Update6:--------------------
A snip of code showing how I update a single graph (graphFFT is an encapsulation of ZedGraphControl for ease of use):
public void LoopDataRefresh() //executes in a new thread
{
while (true)
{
while (!d.Connected)
Thread.Sleep(1000);
if (IsDisposed)
return;
//... other graphs update here
if (signalNewFFT && PanelFFT.Visible)
{
signalNewFFT = false;
#region FFT
bool newRange = false;
if (graphFFT.MaxY != d.fftRangeYMax)
{
graphFFT.MaxY = d.fftRangeYMax;
newRange = true;
}
if (graphFFT.MinY != d.fftRangeYMin)
{
graphFFT.MinY = d.fftRangeYMin;
newRange = true;
}
List<PointF> points = new List<PointF>(2048);
int tempLength = 0;
short[] tempData = new short[2048];
int i = 0;
lock (d.fftDataLock)
{
tempLength = d.fftLength;
tempData = (short[])d.fftData.Clone();
}
foreach (short s in tempData)
points.Add(new PointF(i++, s));
graphFFT.SetLine("FFT", points);
if (newRange)
graphFFT.RefreshGraphComplete();
else if (PanelFFT.Visible)
graphFFT.RefreshGraph();
#endregion
}
//... other graphs update here
Thread.Sleep(5);
}
}
SetLine is:
public void SetLine(String lineTitle, List<PointF> values)
{
IPointListEdit ip = zgcGraph.GraphPane.CurveList[lineTitle].Points as IPointListEdit;
int tmp = Math.Min(ip.Count, values.Count);
int i = 0;
while(i < tmp)
{
if (values[i].X > peakX)
peakX = values[i].X;
if (values[i].Y > peakY)
peakY = values[i].Y;
ip[i].X = values[i].X;
ip[i].Y = values[i].Y;
i++;
}
while(ip.Count < values.Count)
{
if (values[i].X > peakX)
peakX = values[i].X;
if (values[i].Y > peakY)
peakY = values[i].Y;
ip.Add(values[i].X, values[i].Y);
i++;
}
while(values.Count > ip.Count)
{
ip.RemoveAt(ip.Count - 1);
}
}
RefreshGraph is:
public void RefreshGraph()
{
if (!explicidX && autoScrollFlag)
{
zgcGraph.GraphPane.XAxis.Scale.Max = Math.Max(peakX + grace.X, rangeX);
zgcGraph.GraphPane.XAxis.Scale.Min = zgcGraph.GraphPane.XAxis.Scale.Max - rangeX;
}
if (!explicidY)
{
zgcGraph.GraphPane.YAxis.Scale.Max = Math.Max(peakY + grace.Y, maxY);
zgcGraph.GraphPane.YAxis.Scale.Min = minY;
}
zgcGraph.Refresh();
}
--------------------Update7:--------------------
Just ran it through the ANTS profiler. It tells me that the ZedGraph refresh counts when the program is fast are precisely two times higher compared to when it's slow.
Here are the screenshots:
I find it VERY strange that, considering the small difference in the length of the sections, performance differs twice with mathematical precision.
Also, I updated the GPU driver, that didn't help.
--------------------Update8:--------------------
Unfortunately, for a few days now, I'm unable to reproduce the issue... I'm getting constant acceptable speed (which still appear a bit slower than what I had in the profiler two weeks ago) which isn't affected by any of the factors that used to affect it two weeks ago - profiler, video capturing or GPU driver window. I still have no explanation of what was causing it...

Luaan posted the solution in the comments above: it's the system-wide timer resolution. The default resolution is 15.6 ms; the profiler sets the resolution to 1 ms.
I had the exact same problem, very slow execution that would speed up when the profiler was opened. The problem went away on my PC but popped back up on other PCs seemingly at random. We also noticed the problem disappeared when running a Join Me window in Chrome.
My application transmits a file over a CAN bus. The app loads a CAN message with eight bytes of data, transmits it and waits for an acknowledgment. With the timer set to 15.6ms each round trip took exactly 15.6ms and the entire file transfer would take about 14 minutes. With the timer set to 1ms round trip time varied but would be as low as 4ms and the entire transfer time would drop to less than two minutes.
You can verify your system timer resolution as well as find out which program increased the resolution by opening a command prompt as administrator and entering:
powercfg -energy duration 5
The output file will have the following in it somewhere:
Platform Timer Resolution:Platform Timer Resolution
The default platform timer resolution is 15.6ms (15625000ns) and should be used whenever the system is idle. If the timer resolution is increased, processor power management technologies may not be effective. The timer resolution may be increased due to multimedia playback or graphical animations.
Current Timer Resolution (100ns units) 10000
Maximum Timer Period (100ns units) 156001
My current resolution is 1 ms (10,000 units of 100 ns) and is followed by a list of the programs that requested the increased resolution.
This information as well as more detail can be found here: https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted/
Here is some code to increase the timer resolution (originally posted as the answer to this question: how to set timer resolution from C# to 1 ms?):
public static class WinApi
{
/// <summary>TimeBeginPeriod(). See the Windows API documentation for details.</summary>
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Interoperability", "CA1401:PInvokesShouldNotBeVisible"), System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Security", "CA2118:ReviewSuppressUnmanagedCodeSecurityUsage"), SuppressUnmanagedCodeSecurity]
[DllImport("winmm.dll", EntryPoint = "timeBeginPeriod", SetLastError = true)]
public static extern uint TimeBeginPeriod(uint uMilliseconds);
/// <summary>TimeEndPeriod(). See the Windows API documentation for details.</summary>
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Interoperability", "CA1401:PInvokesShouldNotBeVisible"), System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Security", "CA2118:ReviewSuppressUnmanagedCodeSecurityUsage"), SuppressUnmanagedCodeSecurity]
[DllImport("winmm.dll", EntryPoint = "timeEndPeriod", SetLastError = true)]
public static extern uint TimeEndPeriod(uint uMilliseconds);
}
Use it like this to increase resolution: WinApi.TimeBeginPeriod(1);
And like this to return to the default: WinApi.TimeEndPeriod(1);
The parameter passed to TimeEndPeriod() must match the parameter that was passed to TimeBeginPeriod().
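If you want to see the effect from inside your own program, you can measure how long a short sleep really takes. This is just a probe I would use, not part of the fix; SleepProbe is a hypothetical name. With the default resolution I would expect the average for Thread.Sleep(1) to land near 15.6 ms, and much closer to 1 or 2 ms once something has requested 1 ms resolution, but the exact numbers depend on the machine and Windows version.

```csharp
using System.Diagnostics;
using System.Threading;

// Hypothetical probe: measures the real average duration of Thread.Sleep.
public static class SleepProbe
{
    public static double AverageSleepMs(int requestedMs, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            Thread.Sleep(requestedMs);
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds / iterations;
    }
}
```

Call SleepProbe.AverageSleepMs(1, 100) once before and once after WinApi.TimeBeginPeriod(1) and compare the two values.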

There are situations when slowing down a thread can speed up other threads significantly, usually when one thread is polling or locking some common resource frequently.
For instance (this is a windows-forms example) when the main thread is checking overall progress in a tight loop instead of using a timer, for example:
private void SomeWork() {
// start the worker thread here
while(!PollDone()) {
progressBar1.Value = PollProgress();
Application.DoEvents(); // keep the GUI responsive
}
}
Slowing it down could improve performance:
private void SomeWork() {
// start the worker thread here
while(!PollDone()) {
progressBar1.Value = PollProgress();
System.Threading.Thread.Sleep(300); // give the polled thread some time to work instead of responding to your poll
Application.DoEvents(); // keep the GUI responsive
}
}
To do it correctly, one should avoid the DoEvents call altogether:
private Timer tim = new Timer(){ Interval=300 };
private void SomeWork() {
// start the worker thread here
tim.Tick += tim_Tick;
tim.Start();
}
private void tim_Tick(object sender, EventArgs e){
tim.Enabled = false; // prevent timer messages from piling up
if(PollDone()){
tim.Tick -= tim_Tick;
return;
}
progressBar1.Value = PollProgress();
tim.Enabled = true;
}
Calling Application.DoEvents() can potentially cause a lot of headaches when the GUI has not been disabled and the user kicks off other events, or the same event a second time, while the first is still running. The new handler then runs on top of the first one on the stack, so the first action is effectively queued behind the new one. But I'm going off topic.
That example is probably too WinForms-specific, so I'll try to make a more general one. If you have a thread that is filling a buffer that is processed by other threads, be sure to leave some System.Threading.Thread.Sleep() slack in the loop to allow the other threads to do some processing before checking whether the buffer needs to be filled again:
public class WorkItem {
// populate with something useful
}
public static object WorkItemsSyncRoot = new object();
public static Queue<WorkItem> workitems = new Queue<WorkItem>();
public void FillBuffer() {
while(!done) {
lock(WorkItemsSyncRoot) {
if(workitems.Count < 30) {
workitems.Enqueue(new WorkItem(/* load a file or something */ ));
}
}
}
}
The worker threads will have difficulty obtaining anything from the queue since it's constantly being locked by the filling thread. Adding a Sleep() (outside the lock) could significantly speed up the other threads:
public void FillBuffer() {
while(!done) {
lock(WorkItemsSyncRoot) {
if(workitems.Count < 30) {
workitems.Enqueue(new WorkItem(/* load a file or something */ ));
}
}
System.Threading.Thread.Sleep(50);
}
}
Hooking up a profiler could in some cases have the same effect as the sleep function.
I'm not sure if I've given representative examples (it's quite hard to come up with something simple) but I guess the point is clear, putting sleep() in the correct place can help improve the flow of other threads.
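As an alternative to tuning the Sleep() by hand (this is a different technique from the one above, not a refinement of it), .NET's BlockingCollection with a bounded capacity makes the filling thread block while the buffer is full instead of polling it under a lock. A minimal sketch, with BoundedBufferDemo as a hypothetical name and plain ints standing in for WorkItem:

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class BoundedBufferDemo
{
    // Produces the numbers 1..itemCount through a bounded buffer and
    // returns the sum computed by the consumer.
    public static int ProduceAndConsume(int itemCount, int capacity)
    {
        using (var buffer = new BlockingCollection<int>(capacity))
        {
            Task producer = Task.Run(() =>
            {
                try
                {
                    for (int i = 1; i <= itemCount; i++)
                    {
                        buffer.Add(i); // blocks while the buffer is full
                    }
                }
                finally
                {
                    buffer.CompleteAdding(); // lets the consumer loop end
                }
            });

            int sum = 0;
            // Blocks while the buffer is empty; ends after CompleteAdding.
            foreach (int item in buffer.GetConsumingEnumerable())
            {
                sum += item;
            }

            producer.Wait();
            return sum;
        }
    }
}
```

With a capacity of 30 this matches the "workitems.Count < 30" check above, but neither side burns CPU while it waits.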
---------- Edit after Update7 -------------
I'd remove that LoopDataRefresh() thread altogether. Rather put a timer in your window with an interval of at least 20 (which would be 50 frames a second if none were skipped):
private void tim_Tick(object sender, EventArgs e) {
tim.Enabled = false; // skip frames that come while we're still drawing
if(IsDisposed) {
tim.Tick -= tim_Tick;
return;
}
// Your code follows, I've tried to optimize it here and there, but no guarantee that it compiles or works, not tested at all
if(signalNewFFT && PanelFFT.Visible) {
signalNewFFT = false;
#region FFT
bool newRange = false;
if(graphFFT.MaxY != d.fftRangeYMax) {
graphFFT.MaxY = d.fftRangeYMax;
newRange = true;
}
if(graphFFT.MinY != d.fftRangeYMin) {
graphFFT.MinY = d.fftRangeYMin;
newRange = true;
}
int tempLength = 0;
short[] tempData;
int i = 0;
lock(d.fftDataLock) {
tempLength = d.fftLength;
tempData = (short[])d.fftData.Clone();
}
graphFFT.SetLine("FFT", tempData);
if(newRange) graphFFT.RefreshGraphComplete();
else if(PanelFFT.Visible) graphFFT.RefreshGraph();
#endregion
}
// End of your code
tim.Enabled = true; // Drawing is done (or there was nothing to draw), allow new frames to come in.
}
Here's the optimized SetLine() which no longer takes a list of points but the raw data:
public class GraphFFT {
public void SetLine(String lineTitle, short[] values) {
IPointListEdit ip = zgcGraph.GraphPane.CurveList[lineTitle].Points as IPointListEdit;
int tmp = Math.Min(ip.Count, values.Length);
int i = 0;
peakX = values.Length;
while(i < tmp) {
if(values[i] > peakY) peakY = values[i];
ip[i].X = i;
ip[i].Y = values[i];
i++;
}
while(ip.Count < values.Length) { // values is an array, so use Length
if(values[i] > peakY) peakY = values[i];
ip.Add(i, values[i]);
i++;
}
while(ip.Count > values.Length) { // trim points left over from a longer data set
ip.RemoveAt(ip.Count - 1);
}
}
}
I hope you get that working. As I commented before, I haven't had the chance to compile or check it, so there could be some bugs in there. There's more to be optimized, but those optimizations should be marginal compared to the boost from skipping frames and only collecting data when we have the time to actually draw the frame before the next one comes in.
If you closely study the graphs in the video at iZotope, you'll notice that they too are skipping frames, and sometimes are a bit jumpy. That's not bad at all, it's a trade-off you make between the processing power of the foreground thread and the background workers.
If you really want the drawing to be done in a separate thread, you'll have to draw the graph to a bitmap (calling Draw() and passing the bitmaps device context). Then pass the bitmap on to the main thread and have it update. That way you do lose the convenience of the designer and property grid in your IDE, but you can make use of otherwise vacant processor cores.
---------- edit answer to remarks --------
Yes there is a way to tell what calls what. Look at your first screen-shot, you have selected the "call tree" graph. Each next line jumps in a bit (it's a tree-view, not just a list!). In a call-graph, each tree-node represents a method that has been called by its parent tree-node (method).
In the first image, WndProc was called about 1800 times, it handled 872 messages of which 62 triggered ZedGraphControl.OnPaint() (which in turn accounts for 53% of the main threads total time).
The reason you don't see another root node is that the 3rd dropdown box has "[604] Main Thread" selected, which I didn't notice before.
As for the more fluent graphs, I have second thoughts on that now after looking more closely at the screenshots. The main thread has clearly received more (double) update messages, and the CPU still has some headroom.
It looks like the threads are out of sync and in sync at different times, where the update messages arrive just too late (when WndProc was done and went to sleep for a while), and then suddenly in time for a while. I'm not very familiar with ANTS, but does it have a side-by-side thread timeline including sleep time? You should be able to see what's going on in such a view. Microsoft's threads view tool would come in handy for this:

When I have never heard of or seen something similar, I'd recommend the common-sense approach of commenting out sections of code or injecting returns at the tops of functions until you find the logic that's producing the side effect. You know your code and likely have an educated guess about where to start chopping. Otherwise, chop out almost everything as a sanity test and start adding blocks back. I'm often amazed how fast one can track down those seemingly impossible bugs. Once you find the related code, you will have more clues to solve your issue.

There is an array of potential causes. Without claiming completeness, here is how you could approach your search for the actual cause:
Environment variables: the timer issue in another answer is only one example. There might be modifications to the Path and to other variables, and new variables could be set by the profiler. Write the current environment variables to a file and compare both configurations. Try to find suspicious entries and unset them one by one (or in combinations) until you get the same behavior in both cases.
Processor frequency. This can easily happen on laptops. Potentially, the energy saving system sets the frequency of the processor(s) to a lower value to save energy. Some apps may 'wake' the system up, increasing the frequency. Check this via performance monitor (perfmon).
If the app runs slower than possible, there must be some inefficient resource utilization. Use the profiler to investigate this! You can attach the profiler to the (slow) running process to see which resources are under- or over-utilized. Mostly, there are two major categories of causes for slow execution: memory-bound and compute-bound execution. Both can give more insight into what is triggering the slow-down.
If, however, your app actually changes its efficiency when a profiler is attached, you can still use your favorite monitor app to see which performance indicators actually change. Again, perfmon is your friend.

If you have a method which throws a lot of exceptions, it can run slowly in debug mode and fast in CPU Profiling mode.
As detailed here, debug performance can be improved by using the DebuggerNonUserCode attribute. For example:
[DebuggerNonUserCode]
public static bool IsArchive(string filename)
{
bool result = false;
try
{
//this calls an external library, which throws an exception if the file is not an archive
result = ExternalLibrary.IsArchive(filename);
}
catch
{
}
return result;
}

Process Memory Size - Different Counters

I'm trying to find out how much memory my own .NET server process is using (for monitoring and logging purposes).
I'm using:
Process.GetCurrentProcess().PrivateMemorySize64
However, the Process object has several different properties that let me read the memory space used:
Paged, NonPaged, PagedSystem, NonPagedSystem, Private, Virtual, WorkingSet
and then the "peaks", which I'm guessing just store the maximum value each of these has ever reached.
Reading through the MSDN definition of each property hasn't proved too helpful for me. I have to admit my knowledge regarding how memory is managed (as far as paging and virtual goes) is very limited.
So my question is obviously "which one should I use?", and I know the answer is "it depends".
This process will basically hold a bunch of lists in memory of things that are going on, while other processes communicate with it and query it for stuff. I expect the server this runs on to require lots of RAM, so I'm sampling this data over time to estimate RAM requirements relative to the sizes of the lists the process keeps.
So... Which one should I use and why?

If you want to know how much the GC uses try:
GC.GetTotalMemory(true)
If you want to know what your process uses from Windows (VM Size column in TaskManager) try:
Process.GetCurrentProcess().PrivateMemorySize64
If you want to know what your process has in RAM (as opposed to in the pagefile) (Mem Usage column in TaskManager) try:
Process.GetCurrentProcess().WorkingSet64
See here for more explanation on the different sorts of memory.
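To see the three values side by side, a minimal sketch could look like this. The MemoryReport class and its Describe method are hypothetical names used only for illustration. Note that it passes false to GC.GetTotalMemory, on the assumption that a monitoring call should not force a full collection each time it samples.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: reports the three counters discussed above.
public static class MemoryReport
{
    public static string Describe()
    {
        // A fresh Process object holds a snapshot; if you keep one around,
        // call Refresh() on it before reading the properties again.
        using (Process process = Process.GetCurrentProcess())
        {
            // false: report the current estimate without forcing a collection.
            long managedBytes = GC.GetTotalMemory(false);
            long privateBytes = process.PrivateMemorySize64;
            long workingSetBytes = process.WorkingSet64;

            return $"Managed heap: {managedBytes / 1024} KB, " +
                   $"Private bytes: {privateBytes / 1024} KB, " +
                   $"Working set: {workingSetBytes / 1024} KB";
        }
    }
}
```

Logging the result of MemoryReport.Describe() at a fixed interval gives a simple time series for estimating RAM requirements.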

OK, I found through Google the same page that Lars mentioned, and I believe it's a great explanation for people that don't quite know how memory works (like me).
http://shsc.info/WindowsMemoryManagement
My short conclusion was:
Private Bytes = The Memory my process has requested to store data. Some of it may be paged to disk or not. This is the information I was looking for.
Virtual Bytes = The Private Bytes, plus the space shared with other processes for loaded DLLs, etc.
Working Set = The portion of ALL the memory of my process that has not been paged to disk. So the amount paged to disk should be (Virtual - Working Set).
Thanks all for your help!

If you want to use the "Memory (Private Working Set)" as shown in Windows Vista task manager, which is the equivalent of Process Explorer "WS Private Bytes", here is the code. Probably best to throw this infinite loop in a thread/background task for real-time stats.
using System.Threading;
using System.Diagnostics;
//namespace...class...method
Process thisProc = Process.GetCurrentProcess();
PerformanceCounter PC = new PerformanceCounter();
PC.CategoryName = "Process";
PC.CounterName = "Working Set - Private";
PC.InstanceName = thisProc.ProcessName;
while (true)
{
String privMemory = (PC.NextValue()/1000).ToString()+"KB (Private Bytes)";
//Do something with string privMemory
Thread.Sleep(1000);
}

To get the value that Task Manager gives, my hat's off to Mike Regan's solution above. However, one change: it is not: perfCounter.NextValue()/1000; but perfCounter.NextValue()/1024; (i.e. a real kilobyte). This gives the exact value you see in Task Manager.
Following is a full solution for displaying the 'memory usage' (Task manager's, as given) in a simple way in your WPF or WinForms app (in this case, simply in the title). Just call this method within the new Window constructor:
private void DisplayMemoryUsageInTitleAsync()
{
origWindowTitle = this.Title; // set WinForms or WPF Window Title to field
BackgroundWorker wrkr = new BackgroundWorker();
wrkr.WorkerReportsProgress = true;
wrkr.DoWork += (object sender, DoWorkEventArgs e) => {
Process currProcess = Process.GetCurrentProcess();
PerformanceCounter perfCntr = new PerformanceCounter();
perfCntr.CategoryName = "Process";
perfCntr.CounterName = "Working Set - Private";
perfCntr.InstanceName = currProcess.ProcessName;
while (true)
{
int value = (int)perfCntr.NextValue() / 1024;
string privateMemoryStr = value.ToString("n0") + "KB [Private Bytes]";
wrkr.ReportProgress(0, privateMemoryStr);
Thread.Sleep(1000);
}
};
wrkr.ProgressChanged += (object sender, ProgressChangedEventArgs e) => {
string val = e.UserState as string;
if (!string.IsNullOrEmpty(val))
this.Title = string.Format(@"{0} ({1})", origWindowTitle, val);
};
wrkr.RunWorkerAsync();
}

Is this a fair description? I'd like to share this with my team so please let me know if it is incorrect (or incomplete):
There are several ways in C# to ask how much memory my process is using.
Allocated memory can be managed (by the CLR) or unmanaged.
Allocated memory can be virtual (stored on disk) or loaded (into RAM pages)
Allocated memory can be private (used only by the process) or shared (e.g. belonging to a DLL that other processes are referencing).
Given the above, here are some ways to measure memory usage in C#:
1) Process.VirtualMemorySize64: returns all the memory used by a process - managed or unmanaged, virtual or loaded, private or shared.
2) Process.PrivateMemorySize64: returns all the private memory used by a process - managed or unmanaged, virtual or loaded.
3) Process.WorkingSet64: returns all the private, loaded memory used by a process - managed or unmanaged.
4) GC.GetTotalMemory(): returns the amount of managed memory being watched by the garbage collector.

Working set isn't a good property to use. From what I gather, it includes everything the process can touch, even libraries shared by several processes, so you're seeing double-counted bytes in that counter. Private memory is a much better counter to look at.

I'd suggest also monitoring how often page faults happen. A page fault happens when you try to access data that has been moved from physical memory to the swap file, so the system has to read the page back from disk before you can access it.

UWP AudioGraph : Garbage Collector causes clicks in the audio output - c#

The garbage collector is probably reacting to the temporary sample memory you initialize every frame, which is then released after the frame. Try allocating the memory that holds the samples in your startup code and reuse it every frame.
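A minimal sketch of that idea, assuming the per-sample processing needs a temporary working buffer: the GainEffect class, its scratch field and the gain parameter are hypothetical and only stand in for the elided "sample processing" in the question. The buffer is allocated once in the constructor and reused on every call, so the steady-state path creates no garbage.

```csharp
using System;

// Hypothetical effect: scales samples by a gain using a reusable scratch buffer.
public sealed class GainEffect
{
    // Allocated once at startup and reused for every frame.
    private float[] scratch;

    public GainEffect(int maxSamplesPerFrame)
    {
        scratch = new float[maxSamplesPerFrame];
    }

    public void Process(Span<float> samples, float gain)
    {
        // Grow only if a frame is larger than expected; this should be rare.
        if (samples.Length > scratch.Length)
        {
            scratch = new float[samples.Length];
        }

        // Work on a slice of the preallocated buffer instead of a new array.
        Span<float> work = scratch.AsSpan(0, samples.Length);
        samples.CopyTo(work);
        for (int i = 0; i < work.Length; i++)
        {
            work[i] *= gain;
        }
        work.CopyTo(samples);
    }
}
```

This only helps if the processing code itself allocates per frame. The question reports that the collected objects are ProcessAudioFrameContext instances created by the AudioGraph API, and reusing your own buffers would not remove those allocations.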
