Related
The general advice is that you should not call GC.Collect from your code, but what are the exceptions to this rule?
I can only think of a few very specific cases where it may make sense to force a garbage collection.
One example that springs to mind is a service, that wakes up at intervals, performs some task, and then sleeps for a long time. In this case, it may be a good idea to force a collect to prevent the soon-to-be-idle process from holding on to more memory than needed.
Are there any other cases where it is acceptable to call GC.Collect?
If you have good reason to believe that a significant set of objects - particularly those you suspect to be in generations 1 and 2 - are now eligible for garbage collection, and that now would be an appropriate time to collect in terms of the small performance hit.
A good example of this is if you've just closed a large form. You know that all the UI controls can now be garbage collected, and a very short pause as the form is closed probably won't be noticeable to the user.
UPDATE 2.7.2018
As of .NET 4.5 - there is GCLatencyMode.LowLatency and GCLatencyMode.SustainedLowLatency. When entering and leaving either of these modes, it is recommended that you force a full GC with GC.Collect(2, GCCollectionMode.Forced).
As of .NET 4.6 - there is the GC.TryStartNoGCRegion method (used to set the read-only value GCLatencyMode.NoGCRegion). This can itself, perform a full blocking garbage collection in an attempt to free enough memory, but given we are disallowing GC for a period, I would argue it is also a good idea to perform full GC before and after.
Source: Microsoft engineer Ben Watson's: Writing High-Performance .NET Code, 2nd Ed. 2018.
See:
https://msdn.microsoft.com/en-us/library/system.runtime.gclatencymode(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/dn906204(v=vs.110).aspx
I use GC.Collect only when writing crude performance/profiler test rigs; i.e. I have two (or more) blocks of code to test - something like:
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
TestA(); // may allocate lots of transient objects
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
TestB(); // may allocate lots of transient objects
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
...
So that TestA() and TestB() run with as similar state as possible - i.e. TestB() doesn't get hammered just because TestA left it very close to the tipping point.
A classic example would be a simple console exe (a Main method sort-enough to be posted here for example), that shows the difference between looped string concatenation and StringBuilder.
If I need something precise, then this would be two completely independent tests - but often this is enough if we just want to minimize (or normalize) the GC during the tests to get a rough feel for the behaviour.
During production code? I have yet to use it ;-p
The best practise is to not force a garbage collection in most cases. (Every system I have worked on that had forced garbage collections, had underlining problems that if solved would have removed the need to forced the garbage collection, and speeded the system up greatly.)
There are a few cases when you know more about memory usage then the garbage collector does. This is unlikely to be true in a multi user application, or a service that is responding to more then one request at a time.
However in some batch type processing you do know more then the GC. E.g. consider an application that.
Is given a list of file names on the command line
Processes a single file then write the result out to a results file.
While processing the file, creates a lot of interlinked objects that can not be collected until the processing of the file have complete (e.g. a parse tree)
Does not keep match state between the files it has processed.
You may be able to make a case (after careful) testing that you should force a full garbage collection after you have process each file.
Another cases is a service that wakes up every few minutes to process some items, and does not keep any state while it’s asleep. Then forcing a full collection just before going to sleep may be worthwhile.
The only time I would consider forcing
a collection is when I know that a lot
of object had been created recently
and very few objects are currently
referenced.
I would rather have a garbage collection API when I could give it hints about this type of thing without having to force a GC my self.
See also "Rico Mariani's Performance Tidbits"
These days I consider same of the above cases would be better to use a short lived worker process to do each batch of work and let the OS do the resource recovery.
One case is when you are trying to unit test code that uses WeakReference.
In large 24/7 or 24/6 systems -- systems that react to messages, RPC requests or that poll a database or process continuously -- it is useful to have a way to identify memory leaks. For this, I tend to add a mechanism to the application to temporarily suspend any processing and then perform full garbage collection. This puts the system into a quiescent state where the memory remaining is either legitimately long lived memory (caches, configuration, &c.) or else is 'leaked' (objects that are not expected or desired to be rooted but actually are).
Having this mechanism makes it a lot easier to profile memory usage as the reports will not be clouded with noise from active processing.
To be sure you get all of the garbage, you need to perform two collections:
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
As the first collection will cause any objects with finalizers to be finalized (but not actually garbage collect these objects). The second GC will garbage collect these finalized objects.
You can call GC.Collect() when you know something about the nature of the app the garbage collector doesn't.
As the author, it's often tempting to think this is likely or normal. However, the truth is the GC amounts to a pretty well-written and tested expert system, and it's rare you'll know something about the low level code paths it doesn't.
The best example I can think of where you might have some extra information is an app that cycles between idle periods and very busy periods. You want the best performance possible for the busy periods and therefore want to use the idle time to do some clean up.
However, most of the time the GC is smart enough to do this anyway.
One instance where it is almost necessary to call GC.Collect() is when automating Microsoft Office through Interop. COM objects for Office don't like to automatically release and can result in the instances of the Office product taking up very large amounts of memory. I'm not sure if this is an issue or by design. There's lots of posts about this topic around the internet so I won't go into too much detail.
When programming using Interop, every single COM object should be manually released, usually though the use of Marshal.ReleseComObject(). In addition, calling Garbage Collection manually can help "clean up" a bit. Calling the following code when you're done with Interop objects seems to help quite a bit:
GC.Collect()
GC.WaitForPendingFinalizers()
GC.Collect()
In my personal experience, using a combination of ReleaseComObject and manually calling garbage collection greatly reduces the memory usage of Office products, specifically Excel.
As a memory fragmentation solution.
I was getting out of memory exceptions while writing a lot of data into a memory stream (reading from a network stream). The data was written in 8K chunks. After reaching 128M there was exception even though there was a lot of memory available (but it was fragmented). Calling GC.Collect() solved the issue. I was able to handle over 1G after the fix.
Have a look at this article by Rico Mariani. He gives two rules when to call GC.Collect (rule 1 is: "Don't"):
When to call GC.Collect()
I was doing some performance testing on array and list:
private static int count = 100000000;
private static List<int> GetSomeNumbers_List_int()
{
var lstNumbers = new List<int>();
for(var i = 1; i <= count; i++)
{
lstNumbers.Add(i);
}
return lstNumbers;
}
private static int[] GetSomeNumbers_Array()
{
var lstNumbers = new int[count];
for (var i = 1; i <= count; i++)
{
lstNumbers[i-1] = i + 1;
}
return lstNumbers;
}
private static int[] GetSomeNumbers_Enumerable_Range()
{
return Enumerable.Range(1, count).ToArray();
}
static void performance_100_Million()
{
var sw = new Stopwatch();
sw.Start();
var numbers1 = GetSomeNumbers_List_int();
sw.Stop();
//numbers1 = null;
//GC.Collect();
Console.WriteLine(String.Format("\"List<int>\" took {0} milliseconds", sw.ElapsedMilliseconds));
sw.Reset();
sw.Start();
var numbers2 = GetSomeNumbers_Array();
sw.Stop();
//numbers2 = null;
//GC.Collect();
Console.WriteLine(String.Format("\"int[]\" took {0} milliseconds", sw.ElapsedMilliseconds));
sw.Reset();
sw.Start();
//getting System.OutOfMemoryException in GetSomeNumbers_Enumerable_Range method
var numbers3 = GetSomeNumbers_Enumerable_Range();
sw.Stop();
//numbers3 = null;
//GC.Collect();
Console.WriteLine(String.Format("\"int[]\" Enumerable.Range took {0} milliseconds", sw.ElapsedMilliseconds));
}
and I got OutOfMemoryException in GetSomeNumbers_Enumerable_Range method the only workaround is to deallocate the memory by:
numbers = null;
GC.Collect();
You should try to avoid using GC.Collect() since its very expensive. Here is an example:
public void ClearFrame(ulong timeStamp)
{
if (RecordSet.Count <= 0) return;
if (Limit == false)
{
var seconds = (timeStamp - RecordSet[0].TimeStamp)/1000;
if (seconds <= _preFramesTime) return;
Limit = true;
do
{
RecordSet.Remove(RecordSet[0]);
} while (((timeStamp - RecordSet[0].TimeStamp) / 1000) > _preFramesTime);
}
else
{
RecordSet.Remove(RecordSet[0]);
}
GC.Collect(); // AVOID
}
TEST RESULT: CPU USAGE 12%
When you change to this:
public void ClearFrame(ulong timeStamp)
{
if (RecordSet.Count <= 0) return;
if (Limit == false)
{
var seconds = (timeStamp - RecordSet[0].TimeStamp)/1000;
if (seconds <= _preFramesTime) return;
Limit = true;
do
{
RecordSet[0].Dispose(); // Bitmap destroyed!
RecordSet.Remove(RecordSet[0]);
} while (((timeStamp - RecordSet[0].TimeStamp) / 1000) > _preFramesTime);
}
else
{
RecordSet[0].Dispose(); // Bitmap destroyed!
RecordSet.Remove(RecordSet[0]);
}
//GC.Collect();
}
TEST RESULT: CPU USAGE 2-3%
In your example, I think that calling GC.Collect isn't the issue, but rather there is a design issue.
If you are going to wake up at intervals, (set times) then your program should be crafted for a single execution (perform the task once) and then terminate. Then, you set the program up as a scheduled task to run at the scheduled intervals.
This way, you don't have to concern yourself with calling GC.Collect, (which you should rarely if ever, have to do).
That being said, Rico Mariani has a great blog post on this subject, which can be found here:
http://blogs.msdn.com/ricom/archive/2004/11/29/271829.aspx
One useful place to call GC.Collect() is in a unit test when you want to verify that you are not creating a memory leak (e. g. if you are doing something with WeakReferences or ConditionalWeakTable, dynamically generated code, etc).
For example, I have a few tests like:
WeakReference w = CodeThatShouldNotMemoryLeak();
Assert.IsTrue(w.IsAlive);
GC.Collect();
GC.WaitForPendingFinalizers();
Assert.IsFalse(w.IsAlive);
It could be argued that using WeakReferences is a problem in and of itself, but it seems that if you are creating a system that relies on such behavior then calling GC.Collect() is a good way to verify such code.
There are some situations where it is better safe than sorry.
Here is one situation.
It is possible to author an unmanaged DLL in C# using IL rewrites (because there are situations where this is necessary).
Now suppose, for example, the DLL creates an array of bytes at the class level - because many of the exported functions need access to such. What happens when the DLL is unloaded? Is the garbage collector automatically called at that point? I don't know, but being an unmanaged DLL it is entirely possible the GC isn't called. And it would be a big problem if it wasn't called. When the DLL is unloaded so too would be the garbage collector - so who is going to be responsible for collecting any possible garbage and how would they do it? Better to employ C#'s garbage collector. Have a cleanup function (available to the DLL client) where the class level variables are set to null and the garbage collector called.
Better safe than sorry.
The short answer is: never!
using(var stream = new MemoryStream())
{
bitmap.Save(stream, ImageFormat.Png);
techObject.Last().Image = Image.FromStream(stream);
bitmap.Dispose();
// Without this code, I had an OutOfMemory exception.
GC.Collect();
GC.WaitForPendingFinalizers();
//
}
Another reason is when you have a SerialPort opened on a USB COM port, and then the USB device is unplugged. Because the SerialPort was opened, the resource holds a reference to the previously connected port in the system's registry. The system's registry will then contain stale data, so the list of available ports will be wrong. Therefore the port must be closed.
Calling SerialPort.Close() on the port calls Dispose() on the object, but it remains in memory until garbage collection actually runs, causing the registry to remain stale until the garbage collector decides to release the resource.
From https://stackoverflow.com/a/58810699/8685342:
try
{
if (port != null)
port.Close(); //this will throw an exception if the port was unplugged
}
catch (Exception ex) //of type 'System.IO.IOException'
{
System.GC.Collect();
System.GC.WaitForPendingFinalizers();
}
port = null;
If you are creating a lot of new System.Drawing.Bitmap objects, the Garbage Collector doesn't clear them. Eventually GDI+ will think you are running out of memory and will throw a "The parameter is not valid" exception. Calling GC.Collect() every so often (not too often!) seems to resolve this issue.
i am still pretty unsure about this.
I am working since 7 years on an Application Server. Our bigger installations take use of 24 GB Ram. Its hightly Multithreaded, and ALL calls for GC.Collect() ran into really terrible performance issues.
Many third party Components used GC.Collect() when they thought it was clever to do this right now.
So a simple bunch of Excel-Reports blocked the App Server for all threads several times a minute.
We had to refactor all the 3rd Party Components in order to remove the GC.Collect() calls, and all worked fine after doing this.
But i am running Servers on Win32 as well, and here i started to take heavy use of GC.Collect() after getting a OutOfMemoryException.
But i am also pretty unsure about this, because i often noticed, when i get a OOM on 32 Bit, and i retry to run the same Operation again, without calling GC.Collect(), it just worked fine.
One thing i wonder is the OOM Exception itself...
If i would have written the .Net Framework, and i can't alloc a memory block, i would use GC.Collect(), defrag memory (??), try again, and if i still cant find a free memory block, then i would throw the OOM-Exception.
Or at least make this behavior as configurable option, due the drawbacks of the performance issue with GC.Collect.
Now i have lots of code like this in my app to "solve" the problem:
public static TResult ExecuteOOMAware<T1, T2, TResult>(Func<T1,T2 ,TResult> func, T1 a1, T2 a2)
{
int oomCounter = 0;
int maxOOMRetries = 10;
do
{
try
{
return func(a1, a2);
}
catch (OutOfMemoryException)
{
oomCounter++;
if (maxOOMRetries > 10)
{
throw;
}
else
{
Log.Info("OutOfMemory-Exception caught, Trying to fix. Counter: " + oomCounter.ToString());
System.Threading.Thread.Sleep(TimeSpan.FromSeconds(oomCounter * 10));
GC.Collect();
}
}
} while (oomCounter < maxOOMRetries);
// never gets hitted.
return default(TResult);
}
(Note that the Thread.Sleep() behavior is a really App apecific behavior, because we are running a ORM Caching Service, and the service takes some time to release all the cached objects, if RAM exceeds some predefined values. so it waits a few seconds the first time, and has increased waiting time each occurence of OOM.)
one good reason for calling GC is on small ARM computers with little memory, like the Raspberry PI (running with mono).
If unallocated memory fragments use too much of the system RAM, then the Linux OS can get unstable.
I have an application where I have to call GC every second (!) to get rid of memory overflow problems.
Another good solution is to dispose objects when they are no longer needed. Unfortunately this is not so easy in many cases.
This isn't that relevant to the question, but for XSLT transforms in .NET (XSLCompiledTranform) then you might have no choice. Another candidate is the MSHTML control.
If you are using a version of .net less than 4.5, manual collection may be inevitable (especially if you are dealing with many 'large objects').
this link describes why:
https://blogs.msdn.microsoft.com/dotnet/2011/10/03/large-object-heap-improvements-in-net-4-5/
Since there are Small object heap(SOH) and Large object heap(LOH)
We can call GC.Collect() to clear de-reference object in SOP, and move lived object to next generation.
In .net4.5, we can also compact LOH by using largeobjectheapcompactionmode
The following code is a simplified example of an issue I am seeing. This application consumes approx 4GB of memory before throwing an exception as the dictionary is too big.
class Program
{
static void Main(string[] args)
{
Program program = new Program();
while(true)
{
program.Method();
Console.ReadLine();
}
}
public void Method()
{
WasteOfMemory memory = new WasteOfMemory();
Task tast = new Task(memory.WasteMemory);
tast.Start();
}
}
public class WasteOfMemory
{
public void WasteMemory()
{
Dictionary<string, string> aMassiveList = new Dictionary<string, string>();
try
{
long i = 0;
while (true)
{
aMassiveList.Add(i.ToString(), "I am a line of text designed to waste space.... I am exceptionally useful........");
i++;
}
}
catch(Exception e)
{
Console.WriteLine("I have broken myself");
}
}
}
This is all as expected, although what we cannot currently work out is when this memory should be released from the CLR.
We have let the task complete and then simulated a memory overload situation, but the memory consumed by the dictionary is not released. As the OS is running out of memory, is it not putting pressure on the CLR to release the memory?
However and even more confusing, if we wait until the task has completed, then hit enter to run the task again the memory is released, so obviously the previous dictionary has been garbage collected (hasn't it?).
So, why is the memory not being released? And how can we get the CLR to release the memory?
Any explanations or solutions would be greatly appreciated.
EDIT: Following replies, particularly Beska's, it is obvious my description of the issue is not the the clearest, so I will try to clarify.
The code may not be the best example, sorry! It was a quick crude piece of code to try to replicate the issue.
The dictionary is used here to replicate the fact we have a large custom data object, which fills a large chunk of our memory and it is not then released after the task has completed.
In the example, the dictionary fills up to the limit of the dictionary and then throws an exception, it does NOT keep filling forever! This is well before our memory is full, and it does not cause an OutOfMemoryException. Hence the result is a large object in memory, and then the task completes.
At this point we would expect the dictionary to be out of scope, as both the task and the method 'Method' have completed. Hence, we would expect the dictionary to be garbage collected and the memory reclaimed. In reality, the memory is not freed until 'Method' is called again, creating a new WasteOfMemory instance and starting a new task.
Hopefully that will clarify the issue a bit
The garbage collector only frees locations in memory that are no longer in use that are objects which have no pointer pointing to them.
(1) your program runs infinitely without termination and
(2) you never change the pointer to your dictionary, so the GC has certainly no reason to touch the dictionary.
So for me your program is doing exactly what it is supposed to do.
Okay, I've been following this...I think there are a couple issues, some of which people have touched on, but I think not answering the real question (which, admittedly, took me a while to recognize, and I'm not sure I'm answering what you want even now.)
This is all as expected, although what we cannot currently work out is when this memory should be released from the CLR.
As others have said, while the task is running, the dictionary will not be released. It's being used. It gets bigger until you run out of memory. I'm pretty sure you understand this.
We have let the task complete and then simulated a memory overload situation, but the memory consumed by the dictionary is not released. As the OS is running out of memory, is it not putting pressure on the CLR to release the memory?
Here, I think, is the real question.
If I understand you correctly, you're saying you set this up to fill up memory. And then, after it crashes (but before you hit return to start a new task) you're trying other things outside of this program, such as running other programs in Windows to try to get the GC to collect the memory, right? Hoping that the OS would talk to the GC, and start pressuring it to do it's thing.
However and even more confusing, if we wait until the task has completed, then hit enter to run the task again the memory is released, so obviously the previous dictionary has been garbage collected (hasn't it?).
I think you answered your own question...it has not been necessarily been released until you hit return to start a new task. The new task needs memory, so it goes to the GC, and the GC happily collects the memory from the previous task, which has now ended (after throwing from full memory).
So, why is the memory not being released? And how can we get the CLR to release the memory?
I don't know that you can force the GC to release memory. Generally speaking, it does it when it wants (though some hacker types might know some slick way to force its hand.) Of course, .NET decides when to run the GC, and since nothing is happening while the program is just sitting there, it may well be deciding that it doesn't need to. As to whether the OS can pressure the GC to run, it seems from your tests the answer is "no". A bit counter-intuitive perhaps.
Is that what you were trying to get at?
The memory is not being released because the scope aMassiveList is never finished. When a function returns, it releases all non-referenced resources created inside it.
In your case, aMassiveList never leaves context. If you want your function never to return you have to find a way to 'process' your info and release it instead of storing all of them forever.
If you create a function that increasingly allocates resources and never release it you will end up consuming all the memory.
GC will only release unreferenced objects, so as the dictionary is being referenced by your program it can't be released by the GC
The way you've written the WasteMemory method, it will never exit (unless the variable "i" overflows, which won't happen this year) and BECAUSE IT WILL NEVER EXIT it will keep IN USE the reference to the internal Dictionary.
Daniel White is right, you should read about how GC works.
If the references are in use, GC will not collect the referenced memory. Otherwise, how would any program work?
I don't see what you expect the CLR/GC to do here. There's nothing to garbage-collect inside one run of your WasteMemory method.
However and even more confusing, if we wait until the task has completed, then hit enter to run the task again the memory is released, so obviously the previous dictionary has been garbage collected (hasn't it?).
When you press Enter, a new task is created and started. It's not the same task, it's a new task - a new object holding a reference to a new WasteOfMemory instance.
The old task will keep running and the memory it uses will NOT be collected because the old task keeps running in background and it keeps USING that memory.
I'm not sure why - and most importantly HOW - you observe the memory of the old task being released.
Change your method to be a using statement
Example:
Using (WateOfMemory memory = new WateOfMemory())
{
Task tast = new Task(memory.WasteMemory);
tast.Start();
}
And add disposible WateOfMemoryClass (by the way your constructor is WasteOfMemory)
#region Dispose
private IntPtr handle;
private Component component = new Component();
private bool disposed = false;
public WateOfMemory()
{
}
public WateOfMemory(IntPtr handle)
{
this.handle = handle;
}
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
private void Dispose(bool disposing)
{
if(!this.disposed)
{
if(disposing)
{
component.Dispose();
}
CloseHandle(handle);
handle = IntPtr.Zero;
}
disposed = true;
}
[System.Runtime.InteropServices.DllImport("Kernel32")]
private extern static Boolean CloseHandle(IntPtr handle);
~WateOfMemory()
{
Dispose(false);
}
#endregion
I have a stored procedure that returns several resultsets and within (some) resultsets 1000's of rows. I am executing this stored proc using x threads at the same time, with a maximum of y concurrently.
When i just run through the data:
using (SqlDataReader reader = command.ExecuteReader(CommandBehavior.CloseConnection))
{
do
{
while (reader.Read())
{
for (int i = 0; i < reader.FieldCount; i++)
{
var value = reader.GetValue(i);
}
}
}
while (reader.NextResult());
}
I get reasonable throughput - and CPU. Now obviously this is useless, so i need some objects! OK, now i change it to do something like this:
using (SqlDataReader reader = command.ExecuteReader(CommandBehavior.CloseConnection))
{
while(reader.Read())
{
Bob b = new Bob(reader);
this.bobs.Add(b);
}
reader.NextResult();
while(reader.Read())
{
Clarence c = new Clarence(reader);
this.clarences.Add(c);
}
// More fun
}
And my implementation of data classes:
public Bob(SqlDataReader reader)
{
this.some = reader.GetInt32(0);
this.parameter = reader.GetInt32(1);
this.value = reader.GetString(2);
}
And this performs worse. Which is not surprising. What is surprising, is that the CPU drops, and by about 20 - 25% of total (i.e. not 25% of what it was using; it's close to 50% of what it was using)! Why would doing more work, drop the CPU? I don't get it... it looks like there's some locking somewhere - but i don't understand where? i want to get the most out of the machine!
EDIT - changed code due to bad demo code example. Oops.
ALSO: to test the constructor theory, i changed the implementation of the one that creates the object. This time it also does a for loop over the fields, and does a getValue, and will create an empty object. So it doesn't do what i want yes, but i wanted to see if creating the large numbers of objects is the issue. It isn't.
SECOND EDIT: looks like adding the objects to the list is the issue - CPU drops immediately as soon as i add these objects to the list. Not sure how to improve this... (or if it's worth investigating; the first scenario is obviously stupid)
I can see the performance issue with your methodology, but I am not sure I can pinpoint the exact way to explain the cause. But essentially, you are adding the flow of the cursor to instantiation rather setting properties. If I were to take a guess, you are causing some context switching when you make a pull from a reader as part of your constructor.
If you think of a reader as a firehose cursor (it is) and think of the difference between the firehose being controlled by the user holding the hose (normal method) rather than the container being filled, you start to get a picture of the problem.
Not sure how the threading relates? But, if you have multiple clients and you are stopping the flow of the hose a bit more by moving bits over to a constructor rather than setting properties on a constructed object, and then contending for thread time across multiple requests, I can envision an even bigger issue under load.
Well, the reason the CPU isn't maxed out is simply that it's waiting on something else. Either the disk or the network.
Considering your first code block simply reads and immediately discards a value there are several possibilities: One is that the compiler could be optimizing out the allocation of the value variable because it is limited in scope and never used. This would mean the processor doesn't have to wait for memory allocation. Instead it would be limited mainly by how fast you can get the data off of the network wire.
Normally memory allocation ought to be super fast, however, if the amount of memory you are allocating causes windows to push stuff out to disk then this be held back by your hard drive speeds.
Looking at your second code block, you are constructing lots of objects (more memory usage) and keeping them around. Neither of which allows the compiler to optimize the calls out; and both of which put increased pressure on local system resources beyond just the processor.
To determine if the above is even close, you need to put watch counters on the amount of memory used by the app vs the amount of available RAM. Also you'll want counters on your disk system to see if it's being heavily accessed under either scenario.
Point is, try and watch EVERYTHING that is going on in the machine while the code blocks are running. That will give you a better idea of why the processor is waiting.
Maybe its just a matter of incorporating the Bob and Clarence calls that are inherently less CPU intensive- and that part of the execution just takes some time because of I/O bottlenecks or something along those lines.
Your best bet is to run this through a profiler and take a look at the report.
I have a C# service which listens to a queue for XML messages, receives them, processing them using XSLTs and writing them in the database.
It does process about 60K messages a day of about 1Mb each. The memory when is idle is going down to 100MB which is really good.
However recently I have started processing messages of 12 MB in size. It does blow the memory up and even when is idle it has memory of about 500MB. Any suggestions why this might be a problem? I don't think there is a memory leak as it would have surfaced after processing so many (60K messages of 1MB).
That looks perfectly fine. Why do you think this is a problem?
The garbage collector will eventually release the unused memory, but this doesn't mean this is going to happen as soon as your service is idle.
Raymond Chen has written a good article explaining the basic idea of garbage collection:
Everybody thinks about garbage collection the wrong way
However - but this is pure speculation with the information given in your questions - there might be memory leaks related to extension methods in your XSLT. Extension methods may lead to problems in case you recompile the stylesheet every time a new XML document is transformed. The fix is simple: Once compiled, cache the stylesheet.
SIR! PUT DOWN TASK MANAGER AND WALK AWAY. Seriously. Memory management in .NET is not tuned for smallest footprint. It is tuned for efficiency. It will hold onto memory rather than release it back to the system. Resist the temptation to babysit your memory unless there is an actual problem (OOM exceptions, system issues).
What measure are you using for memory? There are lots of possible measures, and none of them really mean "used memory". With virtual memory, sharing of images, etc. things are not simple.
even when is idle it has memory of about 500MB
Most processes (and I think this includes .NET) will not release memory back to the OS once allocated to the process. But if it is not in use it will be paged from physical memory allowing other processes to make use of it.
started processing messages of 12 MB in size
If an XML document is expanded into an object model (e.g. XmlDocument) it needs a lot more memory (lots of links to other nodes). Either look at a different model (both XDocument and XPathDocument are lighter weight) or, better, process the XML with XmlReader and thus never have a fully expanded object model.
First of all, add something to your system to allow you to manually invoke garbage collection for the purpose of investigation. The CLR will not necessarily return memory to the O/S unless it is in short supply. You should use this mechanism to manually invoke garbage collection when you have the system in a quiescent state so that you can see if the memory drops back down. (You should invoke GC.Collect(2) twice to ensure objects with finalizers are actually collected and not just finalized.)
If the memory goes down to the quiescent level after a full GC then you do not have a problem: simply that .NET is not being proactive at deallocating memory.
If the memory does not go down, then you have some kind of a leak. As your messages are large, it means their text representations are likely ending up on the Large Object Heap (LOH). This particular heap is not compacted which means that it is easier to leak this memory than with the other heaps.
One thing to watch out for is string interning: if you are manually interning strings this can cause LOH fragmentation.
Very hard to say what may be the problem. I've had success using Ants Memory Profiler when investigating memory issues.
Contrary to the thought I don't think there is a memory leak; I would profile the memory.
I would check your event handlers. If you're not careful detaching those when you're done, it seems like it's easy to create object references that won't be collected by GC.
I don't know that it's the greatest practice (since the onus of beginning and cancelling event handling should fall to the subscriber), but I've gone the route of implementing IDisposable and creating my events with an explicit delegate field instance. On disposal, the field can be set to null, which effectively detaches all subscriptions.
something like this:
public class Publisher
: IDisposable
{
private EventHandler _somethingHappened;
public event EventHandler SomethingHappened
{
add { _somethingHappened += value; }
remove { _somethingHappened -= value; }
}
protected void OnSomethingHappened(object sender, EventArgs e)
{
if (_somethingHappened != null)
_somethingHappened(sender, e);
}
public void Dispose()
{
_somethingHappened = null;
}
}
Barring that (I don't know how much control you have over the publisher), you might need to be careful to detach unneeded handlers on your subscriber:
public class Subscriber
: IDisposable
{
private Publisher _publisher;
public Publisher Publisher
{
get { return _publisher; }
set {
// Detach from the old reference
DetachEvents(_publisher);
_publisher = value;
// Attach to the new
AttachEvents(_publisher);
}
}
private void DetachEvents(Publisher publisher)
{
if (publisher != null)
{
publisher.SomethingHappened -= new EventHandler(publisher_SomethingHappened);
}
}
private void AttachEvents(Publisher publisher)
{
if (publisher != null)
{
publisher.SomethingHappened += new EventHandler(publisher_SomethingHappened);
}
}
void publisher_SomethingHappened(object sender, EventArgs e)
{
// DO STUFF
}
public void Dispose()
{
// Detach from reference
DetachEvents(Publisher);
}
}
I have a crawler application (with C#) that downloads pages from web .
The application take more virtual memory ,
even i dispose every object and even use GC.Collect() .
This , have 10 thread and each thread has a socket that downloads pages .
In each thread , i have a byte[] buffer that store content of page , and have
an string str_content that i store in it , content of page in string .
I have a Sytem.Timer that every 3 second chech , if each thread has been stopped ,
assign it new thread and start it.
I use dispose method and even use GC.Collect() in my application , but in 3 hour my application take
500 MB on virtual memory (500 MB on private bytes in Process explorer) . Then my system
will be hang and i should restart my pc .
would it be rude , If i assign my byte[] and string to null ?
Is there any way that i use to free virtual memory ?
Thanks .
First of all, you SHOULDN'T be calling gc.collect() anyway, as it is a costly call, and shouldn't be necessary.
If you are seeing growth AND you are still calling gc.collect() you have resources that still have references, thus they can't be collected.
I would start looking at your code and making sure that all of your objects are declared at the proper scope, that you are using the Using Statement syntax to ensure that items implementing IDisposable are properly cleaned up and do a full review of your code.
The next step would be to take a tool like ANTS profiler or the like and look at what is actually stored in memory.
It would take exploration of your code to see where your memory leaks are. It's likely in events that you're tying into (either manually or automatically) that cause apparently-out-of-scope objects to not get properly disposed.
In C# (and in Java) as long as you have a reference to the object, the program environment assumes you are still using the object. Calls to free memory will only free unused objects. The key is to stop using the object.
Odds are excellent you have something like:
Object mine = new Object();
the key is that you also need something like:
mine = null;
to signal that the "mine" object is no longer being used. Typically these problems don't happen in code blocks like this because once you leave the block, the variables aren't accessible anymore:
public void process() {
Object mine = new Object();
}
Typically these problems happen in code blocks like this, because the Collection accumulates objects over time:
static List tasks = new ArrayList();
public void process(String item) {
tasks.add(item);
}
The key is that without a corresponding tasks.remove(item) the list will hold references to the items forever thwarting garbage collection efforts.