C# Memory leak - XDocument

I am working on code written by an ex-colleague. We have several image files and we are converting them to XAML. The code uses XDocument to load each image file (not of huge size, but there are quite a lot of them) and does the processing on multiple threads. I have tried to dispose every object I could think of once each iteration completes, but the issue is still there: if I keep the process running it consumes the RAM fully and then Visual Studio crashes. What surprises me most is that once this has happened I am unable to open anything on my PC; every single thing complains that memory is full, including Visual Studio.
I am unable to upload the image here.
What I have tried is running it on a single thread; I still see GC pressure, but I am able to run the code and the memory stays stable until the end.
I know I need to look for an alternative to XDocument, but that is out of scope at the moment and I need to work with the existing code.
Can you please help me or give me some pointers?
The code below is how I load the XML before sending it to the API for processing:
XDocument doc;
using (var fileStream = new MemoryStream(System.Text.Encoding.ASCII.GetBytes(Image1.sv.ToString())))
{
    doc = XDocument.Load(fileStream);
}
The API then uses multi-threading to process the image file and convert it to XAML using different methods; each of these uses XDocument, loading via a MemoryStream, saving in memory, and continuing the processing.
I have used the Diagnostic Tools within Visual Studio to identify the memory leak.
Kind regards

The new MemoryStream(Encoding.ASCII.GetBytes(someString)) step seems very redundant, so we can shave a lot of things by just... not doing that, and using XDocument.Parse(someString):
var doc = XDocument.Parse(Image1.sv.ToString());
This also avoids losing data by going via ASCII, which is almost always the wrong choice.
More savings may be possible if we knew what Image1.sv was here - i.e. it may be possible to avoid allocating a single large string in the first place.
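For completeness, a minimal sketch of the change (the file-based variant below is an assumption that only applies if the XML behind Image1.sv is ultimately available on disk):
// Instead of new MemoryStream(Encoding.ASCII.GetBytes(...)), parse the
// string directly; this also preserves any non-ASCII characters.
XDocument doc = XDocument.Parse(Image1.sv.ToString());

// Hypothetical alternative: if the XML already lives in a file, loading it
// directly avoids materializing a single large string at all.
XDocument docFromFile = XDocument.Load(@"C:\images\image1.xml");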

Related

MVC memory issue, memory not getting cleared after controller call is finished (Example project included)

My issue: after calling a controller which generates CSV file content and returns it, the memory is not cleared up. This is a problem because memory slowly fills up and you end up using 100% of the memory on your server after a while. The issue can be hastened by generating larger CSV files; the example project below simply generates 3 million lines of CSV content, which fills memory quite a bit.
What I want: simply that the memory is cleared once the content has finished being used.
I've tried a few things with using statements and IDisposable, but none of them fixed the issue.
The only way I can think of is using pointers and clearing the RAM manually, since the garbage collector for some reason is not removing the old data that is no longer being used.
The RAM usage problem is mostly contained around the "test" function and GetDymmyData.
I have not added code examples in this post since I tried to do that in Issues with list not getting garbage collected after use in mvc, but it ended up being too much code.
Example project: https://github.com/jespper/Memory-Issue
The GitHub version is a fixed version for this issue, so you can explore what I did in the changeset.
Notes:
After generating a large file, you might need to download smaller files before C# releases the memory.
Adding forced garbage collection helped a lot (a sketch of what that might look like follows these notes).
Adding a few using statements also helped a lot.
Smaller remaining issues:
If your object can't fit in RAM and it starts filling the pagefile, the pagefile will not shrink after use (restarting your PC helps a little, but won't clear the pagefile entirely).
I couldn't get it below 400 MB of RAM usage no matter what I tried, but it didn't matter whether I had 5 GB or 1 GB of RAM; it would still get reduced down to ~400 MB.
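For reference, a forced collection after the large CSV content has been handed off might look like the sketch below. This is an assumption about where such calls were added, not the project's actual code; it matters here because multi-million-line strings land on the Large Object Heap, which is not compacted by default.
// Hypothetical cleanup after the response content is no longer referenced.
// GCSettings lives in System.Runtime (available since .NET 4.5.1).
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();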

C# How to investigate an OutOfMemoryException in production

We have a WCF service developed in C#, running in a production environment, where it crashes every few hours with no observable pattern. Memory usage hovers at ~250 MB for a while, then all of a sudden it starts climbing until the service crashes with an OutOfMemoryException at 4 GB (it's a 32-bit process).
We are having a hard time identifying the problem; the exceptions we log come from different places in the code, presumably from other requests trying to allocate memory and receiving the exception.
We have taken a memory dump while the process is at 4 GB, and a list of ~750k database objects is in memory when the crash occurs. We have looked at the queries behind those objects but can't pinpoint the one that loads the entire table. The service makes calls to the database using EF6.
Another thing to note: this problem has never occurred in our preproduction environment, even though the data in that database would be sufficient for it to occur if the entire table were being loaded there as well. It's probably a specific call with a specific parameter that triggers the issue, but we can't pinpoint it.
I am out of ideas about what to try next to solve this. Is there a tool that can help us in this situation?
Thanks
If you want to capture all your SQL and are using Entity Framework, you can print out queries like this:
Context.Database.Log = s => Debug.Print(s);
If you play around with that a bit, you can get it to output to a variable and save the result to a text file or the DB. You would have to wrap it around all DB calls - not sure how big your project is?
Context.Database.Log = null;
turns it off
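As a sketch (the context type, entity set and file path below are placeholders, not part of the original answer), the log hook can be pointed at a StreamWriter so every query EF6 generates ends up in a file you can inspect later:
// MyDbContext, Orders and the log path are hypothetical placeholders.
using (var context = new MyDbContext())
using (var logWriter = new StreamWriter(@"C:\logs\ef-queries.log", true))
{
    // Database.Log is an Action<string>; EF6 calls it with each generated statement.
    context.Database.Log = sql => logWriter.WriteLine(sql);

    var rows = context.Orders.Where(o => o.CustomerId == 42).ToList();
} // Disposing the writer flushes the captured queries to disk.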

Basic understanding of StreamWriter

Hi,
My question has to do with a very basic understanding of writing data using a StreamWriter.
If you consider the following code:
StreamWriter writer = new StreamWriter(@"C:\TEST.XML");
writer.WriteLine("somestring");
writer.Flush();
writer.Close();
When the writer object is initialized with the filename, all it has is a pointer to the file.
However, when we write any string to the writer object, does it actually LOAD the whole file, read its contents, append the string at the end and then close the handle?
I hope it's not a silly question.
I ask this because I came across an application that writes to a file frequently, probably every half a second; the file size increased to about 1 GB and it still continued to write to the file (logging).
Do you think this could have resulted in a CPU usage of 100%?
Please let me know if my question is unclear.
Thanks in advance.
does it actually LOAD the whole file, read its contents
After the framework opens the file, it will perform a FileStream.Seek operation to position the file pointer to the end of the file. This is supported by the operating system, and does not require reading or writing any file data.
and then close the handle
The handle is closed when you call Close or Dispose. Both are equivalent. (Note for convenience that you can take advantage of the C# using statement to create a scope where the call to Dispose is handled by the compiler on exiting the scope.)
every half a second to a file
That doesn't sound frequent enough to load the machine at 100%, especially since disk I/O mostly consists of waiting on the disk, and that kind of wait does not contribute to CPU usage. Use a profiler to see where your application is spending its time. Alternatively, a simple technique you might try is to run under the debugger, click pause, and examine the call stacks of your threads; there is a good chance that a method consuming a lot of time will be on a stack when you randomly pause the application.
The code you provided above will overwrite the content of the file, so it has no need to load the file upfront.
Nonetheless, you can append to a file by writing:
StreamWriter writer = new StreamWriter(@"C:\TEST.XML", true);
The true parameter tells it to append to the file.
And still, it does not load the entire file into memory before it appends to it.
That's what makes this a "stream": you move through it in one direction only.
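Putting the two points together, a minimal sketch of an append-mode writer wrapped in a using block (the path is just an example):
// The second constructor argument (true) opens the file in append mode;
// the using block guarantees Dispose (and therefore flush/close) runs
// even if WriteLine throws.
using (var writer = new StreamWriter(@"C:\TEST.XML", true))
{
    writer.WriteLine("somestring");
}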

Memory management in C#

Good afternoon,
I have some text files containing a list of (2-gram, count) pairs collected by analysing a corpus of newspaper articles which I need to load into memory when I start a given application I am developing. To store those pairs, I am using a structure like the following one:
private static Dictionary<String, Int64>[] ListaDigramas = new Dictionary<String, Int64>[27];
The idea of having an array of dictionaries is due to efficiency concerns, since I read somewhere that one very large dictionary has a negative impact on performance. That said, every 2-gram goes into the dictionary that corresponds to its first character's ASCII code minus 97 (or into dictionary 26 if the first character is not in the range 'a' to 'z').
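For illustration, inserting a pair under that scheme might look like the following sketch (the helper name is hypothetical, not part of the actual application):
// Hypothetical helper: buckets 0-25 hold 2-grams starting with 'a'-'z',
// bucket 26 holds everything else.
private static void AddDigram(string digram, Int64 count)
{
    char first = digram[0];
    int index = (first >= 'a' && first <= 'z') ? first - 97 : 26;
    if (ListaDigramas[index] == null)
        ListaDigramas[index] = new Dictionary<String, Int64>();
    ListaDigramas[index][digram] = count;
}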
When I load the (2-gram, count) pairs into memory, the application takes an overall 800 MB of RAM and stays like that until I use a program called Memory Cleaner to free up memory. After this, the memory taken by the program goes down to the range of 7 MB-100 MB, without losing functionality (I think).
Is there any way I can free up memory like this but without using an external application? I tried to use GC.Collect() but it doesn't work in this case.
Thank you very much.
You are using a static field, so chances are that once it is loaded it never gets garbage collected; unless you call the .Clear() method on those dictionaries, they probably won't be subject to garbage collection.
It is fairly mysterious to me how utilities like that ever make it onto somebody's machine. All they do is call EmptyWorkingSet(). Maybe it looks good in Taskmgr.exe, but it is otherwise just a way to keep the hard drive busy unnecessarily. You'll get the exact same thing by minimizing the main window of your app.
I don't know the details of how Memory Cleaner works, but given that it's unlikely to know the inner workings of a program's memory allocations, the best it can probably do is cause pages to be swapped out to disk, reducing the apparent memory usage of the program.
Garbage collection won't help unless you actually have objects you aren't using any more. If you are using your dictionaries, which the GC considers you are since they are referenced by a static field, then all the objects in them are considered in use and must belong to the active memory of the program. There's no way around this.
What you are seeing is the total usage of the application. This is 800 MB and will stay that way. As the comments say, the memory cleaner only makes it look like the application uses less memory. What you can try is accessing all values in the dictionary after you've run the memory cleaner; you'll see that the memory usage goes up again (it's read back in from swap).
What you probably want is to not load all this data into memory. Is there a way you can get the same results using an algorithm?
Alternatively, and this would probably be the best option if you are actually storing information here, you could use a database. If it's cumbersome to use a normal database like SQLExpress, you could always go for SQLite.
About the only other idea I could come up with, if you really want to keep your memory usage down, would be to store the dictionary in a stream and compress it. Factors to consider would be how often you're accessing/inflating this data and how compressible the data is. Text from newspaper articles would compress extremely well, and the performance hit might be less than you'd think.
Using an open-source library like SharpZipLib ( http://www.icsharpcode.net/opensource/sharpziplib/ ), your code would look something like:
MemoryStream stream = new MemoryStream();
BinaryFormatter formatter = new BinaryFormatter();
formatter.Serialize(stream, ListaDigramas);
byte[] dictBytes = stream.ToArray();
// Keep a reference to the target stream so the compressed bytes stay reachable,
// and close the deflater so it flushes its final block.
MemoryStream compressed = new MemoryStream();
Stream zipStream = new DeflaterOutputStream(compressed);
zipStream.Write(dictBytes, 0, dictBytes.Length);
zipStream.Close();
Inflating requires an InflaterInputStream and a loop to inflate the stream in chunks, but is fairly straightforward.
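As a rough sketch (compressedBytes below stands in for whatever buffer holds the deflated data), the inflate side could look like:
// Inflate the compressed buffer back into the original serialized bytes.
using (var inflater = new InflaterInputStream(new MemoryStream(compressedBytes)))
using (var output = new MemoryStream())
{
    byte[] buffer = new byte[4096];
    int read;
    while ((read = inflater.Read(buffer, 0, buffer.Length)) > 0)
        output.Write(buffer, 0, read);
    // output.ToArray() now holds the bytes to hand back to BinaryFormatter.Deserialize.
}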
You'd have to play with the app to see if the performance was acceptable. Keep in mind, of course, that you'll still need enough memory to hold the dictionary when you inflate it for use (unless someone has a clever idea for working with the object in its compressed state).
Honestly, though, keeping it as-is in memory and letting Windows swap it to the page file is probably your best/fastest option.
Edit
I've never tried it, but you might be able to serialize directly to the compression stream, meaning the memory overhead of the compression step is minimal (you'd still have the serialization overhead):
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream compressed = new MemoryStream();
Stream zipStream = new DeflaterOutputStream(compressed);
// Serialize straight into the deflater - no intermediate byte[] is needed.
formatter.Serialize(zipStream, ListaDigramas);
zipStream.Close(); // flush; 'compressed' now holds the compressed bytes
Thank you very much for all the answers. The data actually needs to be loaded during the whole running time of the application, so based on your answers I think there is nothing better to do... I could perhaps try an external database, but since I already need to deal with two other databases at the same time, I don't think it is a good idea.
Do you think it is possible to be dealing with three databases at the same time and not lose performance?
If you are disposing of your application's resources correctly, then the actual memory used may not be what you are seeing (if you are verifying through Task Manager).
The garbage collector will free up the unused memory at the best possible time. It usually isn't a good idea to force a collection either... see this post
"data actually needs to be loaded during the whole running time of the application" - why?

Reading lines from a .NET text/scintilla box without using too much memory?

I have to create a C# program that deals well with reading in huge files.
For example, I have a 60+ MB file. I read all of it into a Scintilla box; let's call it sci_log. The program uses roughly 200 MB of memory with this and other features. This is still acceptable (and less than the amount of memory used by Notepad++ to open this file).
I have another Scintilla box, sci_splice. The user inputs a search term and the program searches through the file (or sci_log if the file is small enough - it doesn't matter, because it happens both ways) looking for a regex match. When it finds a match, it concatenates that line onto a string holding the previous matches and increments a temporary count variable. When count reaches 100 (or 150, or 200, any number really), I put the output in sci_splice, call GC.Collect(), and repeat for the next 100 lines (setting count = 0 and nulling the string).
I don't have the code on me right now, as I'm writing this from my home laptop, but the issue is that it uses a LOT of memory. The 200 MB of memory usage jumps up to well over 1 GB with no end in sight. This only happens on a search with a lot of regex matches, so it's something to do with the string. But wouldn't the GC free up that memory? Also, why does it go up so high? It doesn't make sense for it to more than triple (in the worst possible case). Even if all of that 200 MB were just the log in memory, all it's doing is reading each line and storing it (at worst).
After some more testing, it looks like there's something wrong with Scintilla using a lot of memory when adding lines. The initial read of the lines has a memory spike up to 850 MB for a fraction of a second. I guess I need to just page the output.
Don't call GC.Collect. In this case I don't think it matters, because I think this memory is going to end up on the Large Object Heap (LOH). But the point is that .NET knows a lot more about memory management than you do; leave it alone.
I suspect you are looking at this using Task Manager, just by the way you are describing it. You need to use at least Perfmon instead. Anticipating that you have not used it before, go here and do pretty much what Tess does up to where it says Get a Memory Dump. Not sure you are ready for WinDbg, but that may be your next step.
Without seeing code there is almost no way to know what is going on. The problem could be inside Scintilla too, but I would check through what you are doing first. By running Perfmon you may at least get more information to figure out what to do next.
If you are using System.String to store your matching lines, I suggest you try replacing it with System.Text.StringBuilder and see if this makes any difference.
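A minimal sketch of that swap, assuming the matches are currently accumulated with string concatenation (the variable names, the 100-line batch size, and the AppendText call below are placeholders based on the question's description):
// Accumulate matched lines in a StringBuilder instead of repeated string
// concatenation, which allocates a new, larger string on every append.
var matches = new StringBuilder();
int count = 0;
foreach (string line in File.ReadLines(logPath)) // logPath is a placeholder
{
    if (Regex.IsMatch(line, searchPattern)) // searchPattern is a placeholder
    {
        matches.AppendLine(line);
        if (++count == 100)
        {
            sci_splice.AppendText(matches.ToString()); // or however text is pushed into the box
            matches.Clear(); // reuse the builder's buffer instead of nulling a string
            count = 0;
        }
    }
}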
Try http://msdn.microsoft.com/en-us/library/system.io.memorymappedfiles.memorymappedfile(VS.100).aspx
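For the memory-mapped file suggestion, a rough sketch of streaming the log through a mapped view (the path is a placeholder) might look like:
// MemoryMappedFile lives in System.IO.MemoryMappedFiles (.NET 4.0+).
// Map the file and read it line by line without loading it all into memory.
using (var mmf = MemoryMappedFile.CreateFromFile(@"C:\logs\big.log", FileMode.Open))
using (var view = mmf.CreateViewStream())
using (var reader = new StreamReader(view))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // search/process the line here
    }
}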
