Continuing the discussion from Understanding VS2010 C# parallel profiling results but more to the point:
I have many threads that work in parallel (using Parallel.For/Each) and perform many memory allocations for small classes.
This creates contention on the global memory allocator thread.
Is there a way to instruct .NET to preallocate a memory pool for each thread and do all allocations from this pool?
Currently my solution is my own implementation of memory pools (globally allocated arrays of objects of type T, which are recycled among the threads). This helps a lot but is not efficient because:
I can't instruct .NET to allocate from a specific memory slice.
I still need to call new many times to allocate the memory for the pools.
Thanks,
Haggai
I searched for two days trying to find an answer to the same issue you had. The answer is that you need to set the garbage collection mode to Server mode. By default, the garbage collection mode is set to Workstation mode.
Setting garbage collection to Server mode causes the managed heap to be split into separately managed sections, one per CPU.
To do this, you need to add a config setting to your app.config file (the runtime element goes inside the configuration root element):
<configuration>
  <runtime>
    <gcServer enabled="true"/>
  </runtime>
</configuration>
The speed difference on my 12-core Opteron 6172 was dramatic!
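To confirm that the setting actually took effect, the running process can report its own GC configuration. The following is a minimal sketch, not part of the original answer; the class name GcModeReport is hypothetical, while GCSettings.IsServerGC and GCSettings.LatencyMode are existing .NET APIs.

```csharp
using System;
using System.Runtime;

public static class GcModeReport
{
    // Builds a one-line description of the GC configuration of the current process.
    public static string Describe()
    {
        // IsServerGC is true only when server GC is enabled AND in effect
        // (on a single-CPU machine the runtime falls back to workstation GC).
        string mode = GCSettings.IsServerGC ? "Server" : "Workstation";
        return string.Format(
            "GC mode: {0}, latency mode: {1}, logical CPUs: {2}",
            mode, GCSettings.LatencyMode, Environment.ProcessorCount);
    }
}
```

Calling Console.WriteLine(GcModeReport.Describe()) at startup is usually enough to see whether the app.config entry was picked up.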
The garbage collector does not allocate memory.
It sounds more like you're allocating lots of small temporary objects and a few long-lived objects, and the garbage collector is spending a lot of time garbage-collecting the temporary objects so your app doesn't have to request more memory from the OS. From .NET Framework 4 Advanced Development - Garbage Collection:
As long as address space is available in the managed heap, the runtime continues to allocate space for new objects. However, memory is not infinite. Eventually the garbage collector must perform a collection in order to free some memory.
The solution: Don't allocate lots of small temporary objects. The page on Garbage Collection and Performance might also be helpful.
You could pre-allocate a bunch of objects, and keep them in groups intended for separate threads. However, it's likely that you won't get any better performance from this.
The garbage collector is specially designed to handle small short-lived objects efficiently. If you keep the objects in a pool, they are long-lived and will survive a garbage collection, which in turn means that they will be copied to the second generation heap. This copying will be more expensive than just allocating new objects.
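The idea of keeping pre-allocated objects in groups intended for separate threads can be sketched with the thread-local overload of Parallel.For, which creates one piece of state per worker task and passes it to every iteration that task runs. This is only an illustration under assumptions: the class name PerWorkerBuffers and the sum-of-squares workload are made up for the example, and whether it beats plain allocation has to be measured.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PerWorkerBuffers
{
    // Sums the squares of 0..count-1. Each worker task gets one reusable
    // scratch buffer instead of allocating inside the loop body.
    public static long SumOfSquares(int count)
    {
        long total = 0;
        Parallel.For(
            0, count,
            () => new long[1],                  // localInit: runs once per worker task
            (i, loopState, scratch) =>
            {
                scratch[0] += (long)i * i;      // no allocation per iteration
                return scratch;                 // handed to the next iteration of this task
            },
            scratch => Interlocked.Add(ref total, scratch[0])); // localFinally: merge once per task
        return total;
    }
}
```

Because the buffer is never shared between tasks, the loop body needs no locking; the only synchronization is the single Interlocked.Add per task at the end.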
Using the Visual Studio Concurrency Visualizer I now see why I don't get any benefit from switching to Parallel.For: the machine is busy executing the code only 9% of the time; the rest is 71% synchronization and 17% memory management (1).
Checking all the orange stripes on the diagram below I discovered that GC is always involved (2).
After reading all these interesting topics...
Why do I have a lock here?
https://blog.marcgravell.com/2011/10/assault-by-gc.html
Prevent .NET Garbage collection for short period of time
https://devblogs.microsoft.com/premier-developer/understanding-different-gc-modes-with-concurrency-visualizer/
... am I right in assuming that all these threads contend for a single memory management object, and that removing the need to allocate objects on the heap will therefore improve my scenario considerably? For example, using structs instead of classes, arrays instead of dynamic lists, etc.?
I have a lot of work to do to bend my code in this direction. Just wanted to be sure before starting.
From your screenshot it seems like memory allocation is blocked while waiting for GC to complete. There are server and workstation GC modes, and GC may be concurrent or not, but all options need to block threads for at least a little while. I would check in more detail how often GC runs, how much time you are spending in it, and how often gen 0/1 and gen 2 collections are running.
I believe that each thread has a separate ephemeral segment it uses for allocations, so that it would not need to synchronize allocations, unless it needs a new segment, or the allocation is on the large object heap. But I'm unable to find a reference for this.
In any case, you will likely benefit from reducing the number and size of allocations. If possible, use an object pool or memory pool to reuse memory. You might also benefit from increasing the amount of memory and checking the application for memory leaks. A general recommendation for memory is that there should be two types of allocations:
Small temporary allocations that only live for a short duration, like a temporary object that lives for the duration of a method call.
Long lived allocations of any size that live for the duration of the "application".
If this pattern is followed almost all garbage should be collected in Gen 0/1, and gen 2 collections should be fairly rare.
It also depends a bit on whether you are allocating many small objects or large chunks of memory. If the former, you may consider using structs, since these are value types that are stored inline (on the stack for locals, or directly inside the containing array or object) instead of as separate heap objects. If the latter, you also need to consider memory fragmentation, and this should also improve by using a memory pool that only allocates fixed-size chunks of memory.
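The difference the struct suggestion makes for the garbage collector can be seen in how arrays are laid out. The sketch below uses hypothetical types (PointStruct, PointClass, PointArrays) that are not from the question: an array of structs is a single heap object with the values stored inline, while an array of class instances is one array plus one separate heap object per element.

```csharp
public struct PointStruct { public double X; public double Y; }
public sealed class PointClass { public double X; public double Y; }

public static class PointArrays
{
    // One heap allocation: the n points are stored inline in the array itself.
    public static PointStruct[] CreateStructs(int n)
    {
        return new PointStruct[n];
    }

    // n + 1 heap allocations: the array of references plus one object per element.
    public static PointClass[] CreateClasses(int n)
    {
        var points = new PointClass[n];
        for (int i = 0; i < n; i++)
        {
            points[i] = new PointClass();
        }
        return points;
    }
}
```

Note that a struct array large enough to reach the large object heap threshold brings its own costs, so this is a trade-off to measure, not a guaranteed win.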
Edit:
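At the very simplest, an object pool could be something like this:
using System;
using System.Collections.Concurrent;
public class ObjectPool<T>
{
    // Thread-safe bag holding the instances that are currently idle.
    private readonly ConcurrentBag<T> pool = new ConcurrentBag<T>();
    // Reuse an idle instance if one is available; otherwise create a new one.
    public T Get(Func<T> constructor) => pool.TryTake(out var result) ? result : constructor();
    // Hand an instance back so that a later Get can reuse it.
    public void Return(T obj) => pool.Add(obj);
}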
This assumes that the objects represent identical resources, like byte arrays of some fixed size. But there are also existing implementations:
.Net core MemoryPool
asp.Net core object pool
stack overflow question regarding object pools
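As a rough illustration of the first item, the shared MemoryPool<byte> from System.Buffers can lend out a buffer that is returned when the owner is disposed. This sketch assumes .NET Core 2.1 or later (or the System.Memory package); the class PooledChecksum and its byte-summing workload are hypothetical and only there to give the buffer something to do.

```csharp
using System;
using System.Buffers;

public static class PooledChecksum
{
    // Copies the input into a rented scratch buffer and sums its bytes.
    public static int Sum(byte[] data)
    {
        // Rent may return a buffer LARGER than requested, so slice it to size.
        using (IMemoryOwner<byte> owner = MemoryPool<byte>.Shared.Rent(data.Length))
        {
            Span<byte> scratch = owner.Memory.Span.Slice(0, data.Length);
            data.AsSpan().CopyTo(scratch);

            int sum = 0;
            foreach (byte b in scratch)
            {
                sum += b;
            }
            return sum;
        } // Dispose returns the buffer to the pool.
    }
}
```

The using block matters: a rented buffer that is never disposed is simply lost to the pool and ends up being garbage collected like any other array.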
Memory Management
The Memory Management report shows the calls where memory management blocks occurred, along with the total blocking times of each call stack. Use this information to identify areas that have excessive paging or garbage collection issues.
Furthermore
Memory management time
These segments in the timeline are associated with blocking times that
are categorized as Memory Management. This scenario implies that a
thread is blocked by an event that is associated with a memory
management operation such as Paging. During this time, a thread has
been blocked in an API or kernel state that the Concurrency Visualizer
is counting as memory management. These include events such as paging
and memory allocation. Examine the associated call stacks and profile
reports to better understand the underlying reasons for blocks that
are categorized as Memory Management.
Yes, allocating less will likely have a large benefit on your resources and efficiency, but that is almost always the case on hot paths and in thrashed applications.
Heap allocations, and in particular Large Object Heap (LOH) allocations, are costly. They also create extra work for the garbage collector and can fragment your memory, causing even more inefficiency. The less you allocate, and the more you reuse memory or use the stack, the better off you are (in general).
This is also where you would learn to use a good memory profiler and get to know your garbage collector.
That said, this would not be the only tool you would use to make your application less allocatey. A good memory profiler will go a long way, combined with learning how to read the results and effect changes based on them.
Creating minimal-allocation code is an art form, and one worth learning.
Also, as @mjwills pointed out in the comments, you should run any change through your benchmark software as well; removing allocations at the cost of CPU time won't make sense. There are a lot of ways to speed up code, and low allocation is just one of many approaches that may help.
Lastly, I would suggest following Marc Gravell and his blogs as a start (Mr DeAllocation), getting to know your garbage collector and how the generations work, and learning tools like memory profilers and benchmarkers for performant, silky smooth production code.
I have a console program written in VB.NET (.NET 4.5.2) that acts as a service. A continuous loop runs, and then waits for a message on an MSMQ queue, and processes the message. Somehow this program has a substantial memory leak. I have gone through all the code and done everything I could to use Using statements, but yet the problem persists. The more times through the loop, the higher the memory used by the program, and this memory is never reclaimed by the garbage collector.
I ended up putting a GC.Collect() at the bottom of my loop, and was able to free up most of the memory. However, I realize that this is bad practice and could cause issues. Just wondering if there is a way to inspect what variables the GC.Collect() is getting rid of, so I can find the root of the problem?
Do While (True)
    ' Code to wait for message on a queue
    ' Code to process message (includes calls to class library)
    GC.Collect()
Loop
If the garbage collector is freeing memory when you manually invoke it, you are not leaking memory. A leak in a managed memory environment is memory that the GC can't free because it is referenced somewhere on the object graph.
It could be that the way your code uses object instances causes them to be promoted to gen 1, gen 2 or the large object heap. Instances in these generations are collected less frequently than those in gen 0. The Windows Performance Monitor includes a number of performance counters that can be used to profile the behavior of the managed heap. I would guess that you might have objects being promoted into gen 2. Tracking the "Gen 1 Promoted Bytes/Sec" counter would give you insight as to whether this is what's happening.
In a managed memory environment the GC runs when there is memory pressure, not when object instances are no longer needed, so the mere presence of increased memory use is not necessarily the sign of a leak.
If you take the Collect out, does memory usage always increase (say over a number of minutes or messages being processed from the queue), or does it go up and down somewhat like a sine wave? If it's the latter, just let the GC do its thing; you don't have a leak.
Visual Studio has a number of memory analysis features: https://msdn.microsoft.com/en-us/library/dn342825.aspx
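One low-tech way to see which of the two patterns applies is to log the managed heap size and the per-generation collection counts every so often. The helper below is a sketch in C# (the question's code is VB.NET, but the same GC members are available there); the class name GcSnapshot is hypothetical, while GC.GetTotalMemory and GC.CollectionCount are existing APIs.

```csharp
using System;

public static class GcSnapshot
{
    // Formats the current managed heap size and how many collections
    // each generation has had since the process started.
    public static string Take()
    {
        return string.Format(
            "heap={0:N0} bytes, gen0={1}, gen1={2}, gen2={3}",
            GC.GetTotalMemory(false),   // false: do not force a collection first
            GC.CollectionCount(0),
            GC.CollectionCount(1),
            GC.CollectionCount(2));
    }
}
```

Logging GcSnapshot.Take() every few hundred messages, with the manual GC.Collect() removed, shows whether the heap keeps climbing or rises and falls as collections happen.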
The SOS managed debugger extension is a very powerful tool for sifting through the managed heap, though it is not for the faint of heart. https://learn.microsoft.com/en-us/dotnet/framework/tools/sos-dll-sos-debugging-extension
When to go for object pooling using C#? Any good ex...
What are the pros and cons of maintaining a pool of frequently used objects and grabbing one from the pool instead of creating a new one?
There are only two types of resources I can think of that are commonly pooled: Threads and Connections (i.e. to a database).
Both of these have one overarching concern: Scarcity.
If you create too many threads, context-switching will waste away all of your CPU time.
If you create too many network connections, the overhead of maintaining those connections becomes more work than whatever the connections are supposed to do.
Also, for a database, connection count may be limited for licensing reasons.
So the main reason you'd want to create a resource pool is if you can only afford to have a limited number of them at any one time.
I would add memory fragmentation to the list. That can occur when using objects that encapsulate native resources which after they are allocated can not be moved by the Garbage Collector and could potentially fragment the heap.
One real-life example is when you create and destroy lots of sockets. The buffers they use to read/write data have to be pinned in order to be transferred to the native WinSock API, which means that when garbage collection occurs, even though some of the memory is reclaimed for the sockets that were destroyed, it could leave the memory in a fragmented state since the GC can't compact the heap around pinned buffers. Thus the read/write buffers are prime candidates for pooling. Also, if you're using SocketAsyncEventArgs objects, those would also be good candidates.
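A common way to pool such read/write buffers is to allocate one large block up front and hand out fixed-size slices of it, so that pinning touches a single long-lived array instead of many short-lived ones. The following is a minimal sketch under that assumption; SegmentPool and its members are hypothetical names, not an existing framework class.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class SegmentPool
{
    private readonly byte[] block;      // single backing array, allocated once
    private readonly int segmentSize;
    private readonly ConcurrentStack<int> freeOffsets = new ConcurrentStack<int>();

    public SegmentPool(int segmentSize, int segmentCount)
    {
        this.segmentSize = segmentSize;
        block = new byte[segmentSize * segmentCount];
        // Push in reverse so that offset 0 is handed out first.
        for (int i = segmentCount - 1; i >= 0; i--)
        {
            freeOffsets.Push(i * segmentSize);
        }
    }

    // Hands out a free slice of the block, or returns false if all are in use.
    public bool TryRent(out ArraySegment<byte> segment)
    {
        int offset;
        if (freeOffsets.TryPop(out offset))
        {
            segment = new ArraySegment<byte>(block, offset, segmentSize);
            return true;
        }
        segment = default(ArraySegment<byte>);
        return false;
    }

    // Makes the slice available again; the caller must stop using it.
    public void Return(ArraySegment<byte> segment)
    {
        freeOffsets.Push(segment.Offset);
    }
}
```

The sketch does no validation: returning a segment twice, or one that came from a different pool, would corrupt the free list, so a production version would need to guard against that.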
Here's a good article that talks about the garbage collection process, memory compacting and why object pooling helps.
When object creation is expensive
When you potentially can experience memory pressure - way too many objects (for instance - flyweight pattern)
For games, GC may introduce an unwanted delay in some situations. If this is the case, reusing objects may be a good idea. There are some useful considerations on the topic in this thread.
There is an excellent MSDN magazine article called Rediscover the Lost Art of Memory Optimization in Your Managed Code by Erik Brown: http://msdn.microsoft.com/en-us/magazine/cc163856.aspx. It includes a general purpose object pool with a test program. This object pool does support minimum and maximum sizes. I can't find any references to people using this in production. Has anyone done so?
Also, having dealt with memory fragmentation in an ASP.NET application, I can attest to the value in Miky Dinescu's answer.
Also, elaborating slightly on Vitaly's answer, consider the case of large objects (i.e. > 85K) which are expensive to create. Large objects only participate in Gen 2 garbage collection. This means that they won't be collected as quickly as objects that fully participate in garbage collection in Gen 0 and Gen 1. The article Large Object Heap Uncovered by Maoni Stephens at http://msdn.microsoft.com/en-us/magazine/cc534993.aspx explains the large object heap in detail.
5% of execution time spent on GC? 10%? 25%?
Thanks.
This blog post has an interesting investigation into this area.
The poster's conclusion? That the overhead was negligible for his example.
So the GC heap is so fast that in a real program, even in tight loops, you can use closures and delegates without even giving it a second’s thought (or even a few nanosecond’s thought). As always, work on a clean, safe design, then profile to find out where the overhead is.
It depends entirely on the application. The garbage collection is done as required, so the more often you allocate large amounts of memory which later becomes garbage, the more often it must run.
It could even go as low as 0% if you allocate everything up front and then never allocate any new objects.
In typical applications I would think the answer is very close to 0% of the time is spent in the garbage collector.
The overhead varies widely. It's not really practical to reduce the problem domain into "typical scenarios" because the overhead of GC (and related functions, like finalization) depends on several factors:
The GC flavor your application uses (impacts how your threads may be blocked during a GC).
Your allocation profile, including how often you allocate (GC triggers automatically when an allocation request needs more memory) and the lifetime profile of objects (gen 0 collections are fastest, gen 2 collections are slower, if you induce a lot of gen 2 collections your overhead will increase).
The lifetime profile of finalizable objects, because they must have their finalizers complete before they will be eligible for collection.
The impact of various points on each of those axes of relevancy can be analyzed (and there are probably more relevant areas I'm not recalling off the top of my head) -- so the problem is really "how can you reduce those axes of relevancy to a 'common scenario?'"
Basically, as others said, it depends. Or, "low enough that you shouldn't worry about it until it shows up on a profiler report."
In native C/C++ there is sometimes a large cost of allocating memory due to finding a block of free memory that is of the right size. There is also a non-zero cost of freeing memory due to having to link the freed memory into the correct list of blocks and combine small blocks into large blocks.
In .NET it is very quick to allocate a new object, but you pay the cost when the garbage collector runs. However, the cost of garbage-collecting short-lived objects is as close to free as you can get.
I have always found that if the cost of garbage collection is a problem for you, then you are likely to have other, bigger problems with the design of your software. Paging can be a big issue with any GC if you don’t have enough physical RAM, so you may not be able to just put all your data in RAM and depend on the OS to provide virtual memory as needed.
It really can vary. Look at this demonstration short-but-complete program that I wrote:
http://nomorehacks.wordpress.com/2008/11/27/forcing-the-garbage-collector/
that shows the effect of large gen2 garbage collections.
Yes, the garbage collector will spend some X% of time collecting when averaged over all applications everywhere. But that doesn't necessarily mean that time is overhead. For overhead, you can really only count the time that would be left after subtracting the time needed to release an equivalent amount of memory on an unmanaged platform.
With that in mind, the actual overhead can be negative, because the garbage collector saves time by releasing several chunks of memory in batches. That means fewer context switches and an overall improvement in efficiency.
Additionally, starting with .NET 4 the garbage collector does a lot of its work on a different thread that doesn't interrupt your currently running code as much. As we work more and more with multi-core machines where a core might even be sitting idle now and then, this is a big deal.
I have a large 2GB file with 1.5 million listings to process. I am running a console app that performs some string manipulation then uploads each listing to the database.
I created a LINQ object and clear the object by assigning it to a new LinqObject() for each listing (loop).
When the object is complete, I add it to a list.
When the list reaches 100 objects, I submitAll on the entire list, clear the list, then repeat.
My memory usage continues to grow as the program runs. Is there anything I should be doing to keep memory usage down? I tried GC.Collect. I think I want to use Dispose.
Thanks in advance for looking.
It's normal for the memory usage of a program to increase when it's working. You should not try to force the garbage collector to reduce the memory usage to try to save resources, this will most likely waste resources instead.
Contrary to one's first reaction, high memory usage is not a performance problem as long as there is any free memory left at all. Having a lot of unused memory doesn't increase the performance a bit. If you try to reduce the memory usage only to keep it down, you are just wasting CPU time doing cleanup that is not needed.
If you are running out of free memory or if some other application needs it, the garbage collector will do the appropriate cleanup. In almost every situation the garbage collector will know much more about the current memory situation than you can possibly anticipate when writing the code.
If you are using objects that implement the IDisposable interface, you should call the Dispose method to free unmanaged resources, but all other objects are handled by the garbage collector. Managed objects normally don't leak memory at all.
Do you need your memory usage to stay low? Absent an actual functional problem, high memory usage in and of itself is not an issue.
How large does the memory usage grow? It may be that .NET is just "settling" effectively.
It's not really clear exactly how you're doing this, but the general principle sounds okay. I suggest you take the database work out of the equation - just comment out whichever line would actually submit to the database. See how much memory that uses. Other than the StreamReader (or whatever) you shouldn't have anything else that needs disposing if you're not touching the database - just building batches of transformed objects and throwing them away.
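The experiment suggested above, building batches of transformed objects without touching the database, could look roughly like this. It is a sketch under assumptions: BatchReader is a hypothetical helper, Trim() stands in for the real string manipulation, and the onBatch callback is where the database submit would normally go (it can be left empty for the test).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BatchReader
{
    // Reads lines one at a time and passes them to onBatch in groups of
    // batchSize. Returns the number of batches handed out.
    public static int Process(TextReader reader, int batchSize, Action<List<string>> onBatch)
    {
        var batch = new List<string>(batchSize);
        int batches = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            batch.Add(line.Trim());         // placeholder for the real string manipulation
            if (batch.Count == batchSize)
            {
                onBatch(batch);
                batch.Clear();              // reuse the same list; onBatch must not keep it
                batches++;
            }
        }
        if (batch.Count > 0)                // final, partially filled batch
        {
            onBatch(batch);
            batches++;
        }
        return batches;
    }
}
```

With a real file this would be called as using (var reader = new StreamReader(path)) { BatchReader.Process(reader, 100, batch => { }); } so that the reader, the only disposable object involved, is released when the loop ends. If memory stays flat in this mode, the growth comes from the database side.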