Profiling memory usage of all variables during runtime in C# app

To reduce the memory footprint in my C# app to below the limit (around 1-2GB), I would love to see a list of all variables in realtime (during runtime), along with how much memory they eat up (and maybe even the contents).
From what I can see, this seemingly simple request seems to have escaped the attention of the memory profilers out there. .NET Memory Profiler for instance shows the memory for each given type (e.g. Int32[] or String), but doesn't seem to allow finer granularity to show the memory for each named variable.
Although I haven't tried dotTrace or ANTS Memory Profiler, scanning the FAQ, videos and screenshots draws a blank too.
Apart from my own variables, the desired profiler would probably include 'overhead' memory usage typical for any .NET app, though to me, that's less important.
Is there any program (preferably free or under $100) which can do this?
------------- EDIT
For variables which reference each other (as shown by Jon Skeet), or for variables passed by reference to a method, the profiler could maybe either group them to show that they're really the same object (and therefore 'share' the same memory), or just show the original variable name and omit the references.

"...but doesn't seem to allow finer granularity to show the memory for each named variable."
That's probably because it doesn't make much sense. Variables themselves don't generally take up much memory - it's objects which take up memory, and variables just prevent those objects from being garbage collected.
So for example, consider the following code:
byte[] array = new byte[1024 * 1024]; // 1MB
byte[] array2 = array;
byte[] array3 = array;
Here we have three variables, all referring to the same array. How much memory would your desired tool show each of them taking? 1MB, because each one refers to a 1MB array? That would be confusing, as the total memory shown would be 3MB despite only 1MB actually being used. 0.3333MB? Surely more confusing. 1MB? Sort of accurate, but unhelpful.
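To make the point concrete, here is a minimal sketch (the class and method names are made up for illustration) showing that the three variables are just three references to one object, so there is only one 1MB allocation to attribute:

```csharp
using System;

public static class SharedReferenceDemo
{
    // Returns true when all three variables refer to one and the same array object.
    public static bool AllReferToSameObject()
    {
        byte[] array = new byte[1024 * 1024]; // one 1MB object on the heap
        byte[] array2 = array;                // copies the reference, not the array
        byte[] array3 = array;

        array2[0] = 42; // a write through one variable is visible through the others

        return ReferenceEquals(array, array2)
            && ReferenceEquals(array, array3)
            && array[0] == 42
            && array3[0] == 42;
    }

    public static void Main()
    {
        Console.WriteLine(AllReferToSameObject()); // prints "True"
    }
}
```

A per-variable memory figure would have to pick one of these names arbitrarily, which is why profilers report per object and per type instead.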
You should concentrate on which objects stay alive longer than you want them to, then work out what's keeping those objects alive.
Usually, if a "seemingly simple request" seems to have "escaped the attention" of people who specialize in the area, it's because it's not a simple request after all.

Related

Block allocation

Is it better to pre-allocate (for example) 100KB of memory (in the heap) but then only go on to use 60KB, or is it better to allocate each byte as you need it?
My question arises from reading this blog:
http://deplinenoise.wordpress.com/2012/10/20/toollibrary-memory-management-youre-doing-it-wrong/

This really depends on the intricate memory details of your application. However, the author's fundamental point is absolutely accurate: pre-allocation and memory regions are obscenely efficient. new and delete are the most general tools possible, and if you have a more specific problem, you can find a much more efficient solution. Fixed-size object pools are another example.

It is. In some cases the operating system does not actually give you all that space anyway. Take Linux for example. Java tends to request large amounts of memory and never use it, so what actually happens is that the OS keeps track of the ranges you requested but never maps them into the page tables (and therefore never allocates a frame for them) until you use them. So in terms of virtual memory it looks like you're using a lot, but really you're only using the pages that you ever access (the 60KB in your example that you actually used). You can see this in the difference between virtual and physical usage of memory (assuming your processes aren't swapping out).

setnewhandler in C#

For creating my own memory management in C#, I need to be able to intercept the new command before it returns null or throws an exception. When using the new command I want to call the original handler first. If this handler fails to return a block of memory, I want to tell all my mappable objects to write themselves to disk and free their memory.
In C++ it is possible to intercept the new command by assigning a different new handler. In C# I couldn't find anything which shows the same behaviour.
Has anyone seen a way to do this?
Thanks
Martin

You can't do what you're after in C#, or in any managed language. Nor should you try. The .NET runtime manages allocations and garbage collection. It's impossible for you to instruct your objects to free memory, as you have no guarantee when (or, technically, even if) a particular object will be collected once it's no longer rooted. Even eliminating all references and manually calling GC.Collect() is not an absolute guarantee. If you're looking for granular memory management, you need to be using a lower-level environment.
As an important point, it is not possible for the new operator to return a null reference. It can only return either a reference to the specified type or throw an exception.
If you want to do your own management of how and when objects are allocated, you'll have to use something along the lines of a factory pattern.
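As a rough sketch of what "something along the lines of a factory pattern" could look like: the BufferFactory class below and its byte budget are purely hypothetical, not an existing API. It only centralises where buffers are created so that your own policy runs before the new operator does. It is not thread-safe, and it does not control when the GC reclaims anything.

```csharp
using System;

// Hypothetical example: a factory that is the single place where buffers are
// created, so the application can apply its own policy (here, a byte budget).
public sealed class BufferFactory
{
    private readonly long _budgetBytes;
    private long _allocatedBytes;

    public BufferFactory(long budgetBytes)
    {
        _budgetBytes = budgetBytes;
    }

    public long AllocatedBytes => _allocatedBytes;

    // Returns false instead of allocating when the request would exceed the
    // budget, giving the caller a chance to persist or drop data first.
    public bool TryCreate(int size, out byte[] buffer)
    {
        if (size < 0 || _allocatedBytes + size > _budgetBytes)
        {
            buffer = null;
            return false;
        }
        buffer = new byte[size];
        _allocatedBytes += size;
        return true;
    }

    // The caller reports that it has dropped a buffer. This only updates the
    // bookkeeping; the garbage collector still decides when memory is reclaimed.
    public void Release(byte[] buffer)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        _allocatedBytes -= buffer.Length;
    }

    public static void Main()
    {
        var factory = new BufferFactory(100);
        Console.WriteLine(factory.TryCreate(60, out byte[] first));  // True
        Console.WriteLine(factory.TryCreate(60, out byte[] second)); // False, over budget
    }
}
```

When TryCreate returns false, that is the point where the application could write its mappable objects to disk, call Release for each, and retry.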

I think you're approaching this from the wrong angle; the whole point of using a runtime with managed memory is so that you don't have to worry about memory. The tradeoff is that you can't do this type of low-level trickery.
As an aside, you can 'override new' for a limited class of objects (those descending from ContextBoundObject) by creating a custom ProxyAttribute, though this likely does not address what you're intending.

I believe that you are not understanding the side-effects of what you're asking for. Even in C++, you can't really do what you think you can do. The reason is simple, if you have run out of memory, you can't even make your objects serialize to disk because you have no memory to accomplish that. By the time memory is exhausted, the only real thing you can do is either discard memory (without saving or doing anything else first) or abend the program.
Now, what you're talking about will still work 95% of the time because your memory allocation will likely be sufficiently large that when it fails, you have a little room to play with, but you can't guarantee that this will be the case.
Example: If you have only 2MB of memory left, and you try to allocate 10MB, then it will fail, and you still have 2MB to play with to try and free up some memory, which will allow you to allocate small chunks of memory needed to serialize objects to disk.
But, if you only have 10 bytes of memory left, then you don't even have enough memory to create a new exception object (unless it comes from a reserved pool). So, in essence, you're creating a very poor situation that will likely crash at some point.
Even in C++ low memory conditions are almost impossible to get right, and it's almost impossible to recover from every case unless you have very carefully planned, and pre-allocated memory for your recovery routines.
Now, when you're talking about a garbage-collected runtime, you have no control over how memory is allocated or freed. At best, all you can do is give hints. There is very little you can reliably do here, by the nature of garbage collection. It's non-deterministic.

.net collections memory optimization - will this method work?

Just like almost any other big .NET application, my current C# project contains many .NET collections.
Sometimes I don't know, from the beginning, what the size of a Collection (List/ObservableCollection/Dictionary/etc.) is going to be.
But there are many times when I do know what it is going to be.
I often get an OutOfMemoryException, and I've been told it can happen not only because of process size limits, but also because of fragmentation.
So my question is this: will setting a collection's size (using the capacity argument in the constructor) every time I know its expected size help me prevent at least some of the fragmentation problems?
This quote is from MSDN:
"If the size of the collection can be estimated, specifying the initial capacity eliminates the need to perform a number of resizing operations while adding elements to the List."
But still, I don't want to start changing big parts of my code for something that might not be the real problem.
Has it ever helped any of you to solve out-of-memory problems?

Specifying an initial size will rarely if ever get rid of an OutOfMemory issue, unless your collection holds millions of objects, in which case you should really not keep such a collection.
Resizing a collection involves allocating a completely new, larger array and then copying the contents over. If you are already close to running out of memory, then yes, this can cause an OutOfMemoryException, since the new array cannot be allocated.
However, 99 times out of 100 you have a memory leak in your app, and the collection resizing issues are only a symptom of it.

If you are hitting OOM, then you may be being overly aggressive with the data, but to answer the question:
Yes, this may help some. If it has to keep growing the collections by doubling, it could end up allocating and copying twice as much memory for the underlying array (or more precisely, for the earlier smaller copies that are discarded). Most of these intermediate arrays will be collected promptly, but when they get big you are using the "large object heap", which is harder to compact.
Starting with the correct size prevents all the intermediate copies of the array.
However, it also matters what is in the array. Typically, for classes, there is more data in each object (plus overheads for references etc.), meaning the list is not necessarily the biggest culprit for memory use; you might be burning up most of the memory on objects.
Note that x64 will allow more overall space, but arrays are limited to 2GB - and if each reference doubles in size this halves the maximum effective length of the array.
Personally I would look at breaking the huge sets into smaller chains of lists; jagged lists, for example.

.NET has a compacting garbage collector, so you probably won't run into fragmentation problems on the normal .NET heap. You can however get memory fragmentation if you're using lots of unmanaged memory (e.g. through GDI+, COM, etc.). Also, the large object heap isn't compacted, so that can get fragmented, too. IIRC an object is put into the LOH if it's 85,000 bytes or larger. So if you have many collections that contain more than about 20k objects, you might get fragmentation problems.
But instead of guessing where the problem might be, it might be better to narrow the problem down some more: When do you get the OutOfMemoryExceptions? How much memory is the application using at that time? Using a tool like WinDbg or memory profilers you should be able to find out how much of that memory is on the LOH.
That said, it's always a good idea to set the capacity of List and other data structures in advance if you know it. Otherwise, the List will double its capacity every time you add an item and hit the capacity limit, which means lots of unnecessary allocation and copy operations.
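The effect of the capacity argument can be observed directly. The sketch below (CapacityDemo and CountResizes are names invented for this example) counts how often List<int>.Capacity changes while items are added; each change means a new backing array was allocated and the old contents copied into it.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Adds `count` items and returns how many times the list's Capacity changed,
    // i.e. how many times a new backing array was allocated and copied into.
    public static int CountResizes(List<int> list, int count)
    {
        int resizes = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }
        return resizes;
    }

    public static void Main()
    {
        const int n = 1000000;
        Console.WriteLine(CountResizes(new List<int>(), n));  // more than zero
        Console.WriteLine(CountResizes(new List<int>(n), n)); // prints 0
    }
}
```

The exact number of resizes in the first case depends on the growth policy of the runtime you are on, so treat it as something to measure rather than assume.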

In order to solve this, you have to understand the basics and pinpoint the problem in your code.
It is always a good idea to set the initial capacity, if you have a sensible estimate. If you only have an approximate guess, allocate more.
Fragmentation can only occur on the LOH (objects of 85,000 bytes or more). To prevent it, try to allocate blocks of the same size. Paradoxically, the solution might be to sometimes allocate more memory than you actually need.
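If you are unsure whether a particular allocation ends up on the LOH, one quick check is GC.GetGeneration. This is a sketch that assumes the Microsoft .NET runtimes, where LOH objects are reported as generation 2 as soon as they are allocated. The LohCheck helper is hypothetical, and the check is only meaningful on a freshly allocated object, because long-lived small objects are eventually promoted to generation 2 as well.

```csharp
using System;

public static class LohCheck
{
    // For a freshly allocated object, a result equal to GC.MaxGeneration
    // suggests the object was placed on the large object heap.
    public static bool LooksLikeLoh(object obj)
    {
        return GC.GetGeneration(obj) == GC.MaxGeneration;
    }

    public static void Main()
    {
        var small = new byte[1000];   // far below the 85,000-byte threshold
        var large = new byte[200000]; // well above it
        Console.WriteLine(LooksLikeLoh(small));
        Console.WriteLine(LooksLikeLoh(large));
    }
}
```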

The answer is that, yes pre-defining a size on collections will increase performance and memory optimization and reduce fragmentation. See my answer here to see why - If I set the initial size of a .NET collection and then add some items OVER this initial size, how does the collection determine the next resize?
However, without analyzing a memory dump or memory profiling on the app, it's impossible to say exactly what the cause of the OOM is. Thus, impossible to conjecture if this optimization will solve the problem.

Deallocate memory from large data structures in C#

I have a few SortedList<> and SortedDictionary<> structures in my simulation code and I add millions of items to them over time. The problem is that the garbage collector does not deallocate memory quickly enough, so there is a huge hit on the application's performance. My last option was to call the GC.Collect() method so that I can reclaim that memory. Has anyone got a different idea? I am aware of the Flyweight pattern, which is another option, but I would appreciate other suggestions that would not require huge refactoring of my code.

You are fighting the "There's no free lunch" principle. You cannot assume that stuffing millions of items in a list isn't going to affect perf. Only the SortedList<> should be a problem, it is going to start allocating memory in the Large Object Heap. That allocation isn't going to be freed soon, it takes a gen #2 collection to chuck stuff out of the LOH again. This delay should not otherwise affect the perf of your program.
One thing you can do is avoid the multiple copies of the internal array that SortedList<> will jam into the LOH when it keeps growing. Try to guess a good value for Capacity so it pre-allocates the large array up front.
Next, use Perfmon.exe or TaskMgr.exe and look at the page fault delta of your program. It should be quite busy while you're allocating. If you see low values (100 or less) then you might have a problem with the paging file being fragmented, a common scourge on older machines that run XP. Defragging the disk and using SysInternals' PageDefrag utility can do extraordinary wonders.

I think SortedList uses an array as its backing field, which means that a large SortedList gets allocated on the large object heap. The large object heap can get fragmented, which can cause an out-of-memory exception while in principle there is still enough memory available.
See this link.
This might be your problem, as intermediate calls to GC.Collect prevent the LOH from getting badly fragmented in some scenarios, which explains why calling it helps you reduce the problem.
The problem can be mitigated by splitting large objects into smaller fragments.

I'd start with doing some memory profiling on your application to make sure that the items you remove from those lists (which I assume is happening from the way your post is written) are actually properly released and not hanging around places.
What sort of performance hit are we talking and on what operating system? If I recall, GC will run when it's needed, not immediately or even "soon". So task manager showing high memory allocated to your application is not necessarily a problem. What happens if you put the machine under higher load (e.g. run several copies of your application)? Does memory get reclaimed faster in that scenario or are you starting to run out of memory?
I hope answers to these questions will help point you in a right direction.

Well, if you keep all of the items in those structures, the GC will never collect the resources because they still have references to them.
If you need the items in the structures to be collected, you must remove them from the data structure.
To clear the entire data structure, try using Clear() and setting the data structure reference to null. If the data is still not getting collected fast enough, call GC.Collect().

Large Arrays, and LOH Fragmentation. What is the accepted convention?

I have another active question HERE regarding some hopeless memory issues that possibly involve LOH fragmentation, among possibly other unknowns.
What my question now is, what is the accepted way of doing things?
If my app needs to be done in Visual C#, and needs to deal with large arrays to the tune of int[4000000], how can I not be doomed by the garbage collector's refusal to deal with the LOH?
It would seem that I am forced to make any large arrays global, and never use the word "new" around any of them. So, I'm left with ungraceful global arrays with "maxindex" variables instead of neatly sized arrays that get passed around by functions.
I've always been told that this was bad practice. What alternative is there?
Is there some kind of function to the tune of System.GC.CollectLOH("Seriously") ?
Is there possibly some way to outsource garbage collection to something other than System.GC?
Anyway, what are the generally accepted rules for dealing with large (>85Kb) variables?

Firstly, the garbage collector does collect the LOH, so do not be immediately scared by its presence. The LOH gets collected when generation 2 gets collected.
The difference is that the LOH does not get compacted, which means that if you have an object in there that has a long lifetime then you will effectively be splitting the LOH into two sections — the area before and the area after this object. If this behaviour continues to happen then you could end up with the situation where the space between long-lived objects is not sufficiently large for subsequent assignments and .NET has to allocate more and more memory in order to place your large objects, i.e. the LOH gets fragmented.
Now, having said that, the LOH can shrink in size if the area at its end is completely free of live objects, so the only problem is if you leave objects in there for a long time (e.g. the duration of the application).
Starting from .NET 4.5.1, the LOH can be compacted; see the GCSettings.LargeObjectHeapCompactionMode property.
Strategies to avoid LOH fragmentation are:
Avoid creating large objects that hang around. Basically this just means large arrays, or objects which wrap large arrays (such as the MemoryStream which wraps a byte array), as nothing else is that big (components of complex objects are stored separately on the heap so are rarely very big). Also watch out for large dictionaries and lists as these use an array internally.
Watch out for double arrays: the threshold for these going into the LOH is much, much smaller. I can't remember the exact figure but it's only a few thousand.
If you need a MemoryStream, consider making a chunked version that backs onto a number of smaller arrays rather than one huge array. You could also make custom versions of IList and IDictionary which use chunking to avoid stuff ending up in the LOH in the first place.
Avoid very long Remoting calls, as Remoting makes heavy use of MemoryStreams which can fragment the LOH during the length of the call.
Watch out for string interning — for some reason these are stored as pages on the LOH and can cause serious fragmentation if your application continues to encounter new strings to intern, i.e. avoid using string.Intern unless the set of strings is known to be finite and the full set is encountered early on in the application's life. (See my earlier question.)
Use Son of Strike to see what exactly is using the LOH memory. Again see this question for details on how to do this.
Consider pooling large arrays.
Edit: the LOH threshold for double arrays appears to be 8k.

It's an old question, but I figure it doesn't hurt to update answers with changes introduced in .NET. It is now possible to defragment the Large Object Heap. Clearly the first choice should be to make sure the best design choices were made, but it is nice to have this option now.
https://msdn.microsoft.com/en-us/library/xe0c2357(v=vs.110).aspx
"Starting with the .NET Framework 4.5.1, you can compact the large object heap (LOH) by setting the GCSettings.LargeObjectHeapCompactionMode property to GCLargeObjectHeapCompactionMode.CompactOnce before calling the Collect method, as the following example illustrates."
GCSettings can be found in the System.Runtime namespace
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();

The first thing that comes to mind is to split the array up into smaller ones, so they don't reach the size at which the GC puts them in the LOH. You could split the arrays into smaller ones of, say, 10,000 elements, and build an object which would know which array to look in based on the index you pass.
Now I haven't seen the code, but I would also question why you need an array that large. I would potentially look at refactoring the code so all of that information doesn't need to be stored in memory at once.

You have it wrong. You do NOT need to have an array of size 4000000 and you definitely do not need to call the garbage collector.
Write your own IList implementation, something like "PagedList".
Store items in arrays of 65536 elements.
Create an array of arrays to hold the pages.
This allows you to access basically all your elements with ONE redirection only. And, as the individual arrays are smaller, fragmentation is not an issue...
...and if it is, then REUSE pages. Don't throw them away on dispose; put them on a static "PageList" and pull them from there first. All this can be done transparently within your class.
The really good thing is that this list is pretty dynamic in its memory usage. You may want to resize the holder array (the redirector). Even if you don't, it is only about 512 KB of data per page.
Each second-level array holds 64K elements: at 8 bytes per reference for a class that is 512 KB per page (256 KB on 32-bit), and for a struct it is 64 KB per byte of struct size.
Technically:
Turn
int[]
into
int[][]
Decide whether 32 or 64 bit is better, as you want ;) Both have advantages and disadvantages.
Dealing with ONE large array like that is unwieldy in any language. If you have to, then basically allocate it at program start and never recreate it. That is the only solution.
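A minimal sketch of the paged idea described above could look like the following. It is not a full IList implementation; the PagedList<T> name comes from the answer, but every member shown here is an assumption made for illustration. It only supports Add, Count and an indexer, and it does not pool or reuse pages.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal paged list: items live in fixed-size pages, so no single
// backing array grows with the total item count.
public sealed class PagedList<T>
{
    private const int PageShift = 16;            // 2^16 = 65536 items per page
    private const int PageSize = 1 << PageShift;
    private const int PageMask = PageSize - 1;

    private readonly List<T[]> _pages = new List<T[]>(); // the "redirector"
    private int _count;

    public int Count => _count;

    public void Add(T item)
    {
        int page = _count >> PageShift;
        if (page == _pages.Count)
            _pages.Add(new T[PageSize]); // allocate one more page on demand
        _pages[page][_count & PageMask] = item;
        _count++;
    }

    public T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_count)
                throw new ArgumentOutOfRangeException(nameof(index));
            return _pages[index >> PageShift][index & PageMask];
        }
        set
        {
            if ((uint)index >= (uint)_count)
                throw new ArgumentOutOfRangeException(nameof(index));
            _pages[index >> PageShift][index & PageMask] = value;
        }
    }
}

public static class PagedListDemo
{
    public static void Main()
    {
        var list = new PagedList<int>();
        for (int i = 0; i < 4000000; i++) list.Add(i);
        Console.WriteLine(list.Count);     // prints 4000000
        Console.WriteLine(list[3999999]);  // prints 3999999
    }
}
```

One caveat worth checking for your own element type: a page of 65536 ints is 256 KB, which is still above the 85,000-byte LOH threshold. The benefit then comes from all pages having the same size, so a freed page leaves a hole that the next page fits into exactly. If the goal is to stay off the LOH entirely, a smaller page size would be needed.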

This is an old question, but with .NET Standard 1.1 (.NET Core, .NET Framework 4.5.1+) there is another possible solution:
Using ArrayPool<T> in the System.Buffers package, we can pool arrays to avoid this problem.
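For illustration, renting and returning a buffer from the shared pool looks roughly like this. ArrayPoolDemo and SumWithPooledBuffer are names invented for this sketch; ArrayPool<T>.Shared, Rent and Return are the actual System.Buffers API. Note that Rent may hand back an array longer than requested, so always work with the length you asked for rather than buffer.Length.

```csharp
using System;
using System.Buffers;

public static class ArrayPoolDemo
{
    // Sums the values 0..count-1 using a rented scratch buffer instead of
    // allocating a fresh array on every call.
    public static long SumWithPooledBuffer(int count)
    {
        int[] buffer = ArrayPool<int>.Shared.Rent(count); // may be longer than count
        try
        {
            for (int i = 0; i < count; i++) buffer[i] = i;

            long sum = 0;
            for (int i = 0; i < count; i++) sum += buffer[i];
            return sum;
        }
        finally
        {
            ArrayPool<int>.Shared.Return(buffer); // hand the array back for reuse
        }
    }

    public static void Main()
    {
        Console.WriteLine(SumWithPooledBuffer(100000)); // prints 4999950000
    }
}
```

Rented arrays are not cleared by default, so they may contain data from a previous user. Whether very large requests are actually served from the pool depends on the runtime version, so it is worth measuring with your real sizes.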

I'm adding an elaboration to the answer above, in terms of how the issue can arise. Fragmentation of the LOH does not only depend on the objects being long-lived. If you have multiple threads and each of them is creating big lists that go onto the LOH, then the first thread may need to grow its List while the next contiguous bit of memory is already taken up by a List from a second thread. The runtime will then allocate new memory for the first thread's List, leaving behind a rather big hole. This is what's happening currently on one project I've inherited, and so even though the LOH is approx 4.5 MB, the runtime has got a total of 117MB free memory but the largest free memory segment is 28MB.
Another way this could happen without multiple threads, is if you've got more than one list being added to in some kind of loop and as each expands beyond the memory initially allocated to it then each leapfrogs the other as they grow beyond their allocated spaces.
A useful link is: https://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/
Still looking for a solution for this, one option may be to use some kind of pooled objects and request from the pool when doing the work. If you're dealing with large arrays then another option is to develop a custom collection e.g. a collection of collections, so that you don't have just one huge list but break it up into smaller lists each of which avoid the LOH.
