StackOverflowException - but obviously NO recursion/endless loop - c#

I've been blocked by this problem the entire day and have read thousands of Google results, but nothing seems to reflect my problem or even come near it... I hope one of you can give me a push in the right direction.
I wrote a client-server application (so really two applications) - the client collects data about its system, as well as a screenshot, serializes all of this into an XML stream (the picture as a byte[] array) and sends it to the server at regular intervals.
The server receives the stream (via TCP), deserializes the XML to an information object and shows the information on a Windows Form.
This process runs stably for about 20-25 minutes at a submission interval of 3 seconds. When observing the memory usage there's nothing significant to see; it is also fairly stable. But after these 20-25 minutes the server throws a StackOverflowException at the point where it deserializes the TCP stream, specifically when setting the Image property from the byte[] array.
I thoroughly searched for recursive and endless loops, and given that the exception occurs only after thousands of successful intervals, I can hardly imagine that's the cause.
public byte[] ImageBase
{
    get
    {
        MemoryStream ms = new MemoryStream();
        _screen.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg);
        return ms.GetBuffer();
    }
    set
    {
        if (_screen != null) _screen.Dispose(); // preventing the well-known image memory leak
        MemoryStream ms = new MemoryStream(value);
        try
        {
            _screen = Image.FromStream(ms); // << EXCEPTION THROWN HERE
        }
        catch (StackOverflowException ex) // thanks to the new CLR management this won't work anymore -.-
        {
            Console.WriteLine(ex.Message + Environment.NewLine + ex.StackTrace);
        }
        ms.Dispose();
        ms = null;
    }
}
I hope more code is unnecessary, as it could get very complex...
Please help, I have no clue at all anymore.
Thanks,
Chris

I suspect that it's not the code you posted but the code that reads from the TCP stream that's growing the stack. The fact that the straw that breaks the camel's back lands during Image.FromStream is probably irrelevant. I've often seen people write socket-processing code that calls itself, sometimes indirectly (like A -> B -> A -> B). You should inspect that code and post it here for us to look at.
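One shape this can take is sketched below (all names are illustrative, not from your code): a "read next message" routine that calls itself for each incoming message instead of looping, so every interval adds a few frames to the stack until, hundreds of messages later, it overflows.

// Hypothetical antipattern to look for: receive code that re-enters itself.
private void ReadNextMessage()
{
    var info = DeserializeFromStream(_tcpStream); // A: read + deserialize
    ShowOnForm(info);                             // B: update the UI
    ReadNextMessage();                            // back to A: the stack grows
}                                                 // with every received message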

You might want to read this: Loading an image from a stream without keeping the stream open.
It seems possible that streams are being maintained on the stack, or by some other object, and are eventually blowing the stack.
My suggestion would be to hold onto the byte[] and wait until the last possible moment to decode it and draw it, then dispose of the Image immediately. Your getter/setter would then just get/set the byte[]. You would implement a custom drawing routine that decodes the current byte[] and draws it, making sure not to hold onto any more resources than necessary; see the sketch below.
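A minimal sketch of that idea, assuming a WinForms form with a Panel named screenPanel (the names here are illustrative, not taken from the original code):

private byte[] _imageBytes; // raw JPEG bytes; no Image object is kept alive

public byte[] ImageBase
{
    get { return _imageBytes; }
    set { _imageBytes = value; screenPanel.Invalidate(); }
}

private void screenPanel_Paint(object sender, PaintEventArgs e)
{
    if (_imageBytes == null) return;
    // Decode at the last possible moment and dispose immediately after drawing.
    using (var ms = new MemoryStream(_imageBytes))
    using (var img = Image.FromStream(ms))
    {
        e.Graphics.DrawImage(img, screenPanel.ClientRectangle);
    }
}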
Update
If there is a way you can get us a full stack trace, we might be able to help further. I'm beginning to think it isn't the problem I described. I created a sample program that created 10,000 images just like you do in your setter and there wasn't a problem. If you send an image every 3 seconds, that's 20 images a minute times 20 minutes, which is only 400 images.
I'm very interested in the solution to this. I'll come back to it later.
There are a few possibilities, however remote:
Image.FromStream is trying to process an invalid/corrupt byte[] and that method somehow uses recursion to decode a bitmap. Highly unlikely.
The exception isn't being thrown where you think it is. A full stack trace, if possible, would be very helpful. As you stated, you cannot catch a StackOverflowException. I believe there are provisions for this if you are running under the debugger, though.

I'm not sure if it's relevant but the MSDN documentation for Image.FromStream states that
You must keep the stream open for the lifetime of the Image.
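If keeping the stream open for the image's whole lifetime is impractical, one common workaround (a sketch of the setter body, reusing _screen and value from the question's property) is to copy the decoded image into a new Bitmap, which owns its own pixel data, so the stream can then be closed safely:

using (var ms = new MemoryStream(value))
using (var decoded = Image.FromStream(ms))
{
    if (_screen != null) _screen.Dispose();
    _screen = new Bitmap(decoded); // independent copy; the stream is no longer needed
}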

Related

MimeKit Iterator with FileStreams

I just rewrote some code that should now be able to process large MIME files meant for printing.
For this we used MimeKit in the following manner:
var message = MimeMessage.Load(_fileStream, true);
var iter = new MimeIterator(message);
while (iter.MoveNext())
{
    ...
}
What is interesting here is that before I used fileStream, we used to copy the inputStream of an HttpListenerRequest object, which is a NetworkStream (non-seekable) and cannot be loaded by MimeKit.
Since switching to fileStream, for a 2GB job, I spend around 40 seconds inside the while loop iterating over each part of the request, which is a lot if you ask me... with memoryStream it was 6 seconds.
Is this behavior normal? Is there a chance to get better times with MimeKit, or should I implement my own parser?
Thanks in advance!
What is interesting here is that before I used fileStream, we used to copy the inputStream of an HttpListenerRequest object, which is a NetworkStream (non-seekable) and cannot be loaded by MimeKit.
I'm not quite sure what you mean above. Were you doing something like this?
var memory = new MemoryStream ();
httpResponse.Stream.CopyTo (memory);
memory.Position = 0;
var message = MimeMessage.Load(memory, true);
Since switching to fileStream, for a 2GB job, I spend around 40 seconds inside the while loop iterating over each part of the request, which is a lot if you ask me... with memoryStream it was 6 seconds.
Your use of MimeMessage.Load() passes true as the persistent argument. This improves parsing performance (because it no longer needs to load MIME content into RAM as it parses), but it negatively impacts the performance of reading the content of individual MIME parts later, because it needs to seek within _fileStream to each part's starting offset as you request its content.
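If the per-part read cost is what dominates, it may be worth timing the non-persistent mode for comparison; a sketch of the alternative call (the named argument just makes the flag explicit):

// persistent: false decodes MIME content into memory during the parse,
// trading a slower Load() for cheaper reads of each part's content later.
var message = MimeMessage.Load (_fileStream, persistent: false);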
Is this behavior normal?
40 seconds does seem like a lot, but without knowing more details of your code, I can't make any suggestions.
Is there a chance to get better times with MimeKit or should I implement my own parser?
Well, I can pretty much guarantee that if you implement your own parser, it will very likely be orders of magnitude slower than MimeKit (seeing as how literally every other attempt to write a MIME parser in C# is orders of magnitude slower than MimeKit) ;-)
What you need to do here is run your code under a profiler to see where the problem is.
If the performance problem is, as you suggest, inside the loop, then it is NOT a performance problem in the parser; it is a performance problem within your loop. That's not to say that MimeKit code doesn't play a role in the slowness.
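As a cruder first cut than a full profiler, a Stopwatch split can at least tell you whether the time goes to the parse or to the loop body (a sketch; the per-part processing stub is yours to fill in):

var sw = System.Diagnostics.Stopwatch.StartNew ();
var message = MimeMessage.Load (_fileStream, true);
Console.WriteLine ("parse:   {0}", sw.Elapsed);

sw.Restart ();
var iter = new MimeIterator (message);
while (iter.MoveNext ()) {
    // ... your per-part processing here ...
}
Console.WriteLine ("iterate: {0}", sw.Elapsed);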
If the problem is in MimeKit, I would suspect this logic inside of BoundStream.Read():
// make sure that the source stream is in the expected position
if (BaseStream.Position != StartBoundary + position)
    BaseStream.Seek (StartBoundary + position, SeekOrigin.Begin);
In your case, BaseStream should be your _fileStream. It's been a while since I've looked at the code for FileStream.Position, but it may require a syscall to get that value (and syscalls are not free).
40 seconds, though, is a LOT of time - way more than I would expect unless each MIME part's content is massive (streaming a MimePart.Content will read ~4K per loop, and if there is one lseek() per loop, that can add up).
Hope that helps you figure this out.

Lossless rotation of a JPG image with Image.Save and EncoderParameters fails

I have to rotate JPG images lossless in .net (90°|180°|270°). The following articles show how to do it:
https://learn.microsoft.com/en-us/dotnet/api/system.drawing.imaging.encoder.transformation?view=netframework-4.7.2
https://www.codeproject.com/tips/64116/Using-GDIplus-code-in-a-WPF-application-for-lossle.aspx
The examples seem quite straightforward; however, I had no luck getting this to work. My source data comes as a byte array (various JPG files, from cameras, from the internet, etc.) and so I want to return the rotated images as a byte array as well. Here is the (simplified) code:
Image image;
using (var ms = new MemoryStream(originalImageData)) {
    image = System.Drawing.Image.FromStream(ms);
}

// If I don't copy the image into a new bitmap, every attempt to save the
// image fails with a general GDI+ exception. This seems to be another bug of GDI+.
var bmp = new Bitmap(image);

// Creating the parameters for saving
var encParameters = new EncoderParameters(1);
encParameters.Param[0] = new EncoderParameter(Encoder.Transformation, (long)EncoderValue.TransformRotate90);

using (var ms = new MemoryStream()) {
    // Now saving the image, which always fails with an ArgumentException from GDI+.
    // There is no difference if I try to save to a file or to a stream.
    bmp.Save(ms, GetJpgEncoderInfo(), encParameters);
    return ms.ToArray();
}
I always get an ArgumentException from GDI+ without any useful information:
The operation failed with the final exception [ArgumentException].
Source: System.Drawing
I tried an awful lot of things but never got it working.
The main code seems right, since if I change the EncoderParameter to Encoder.Quality, the code works fine:
encParameters.Param[0] = new EncoderParameter(Encoder.Quality, 50L);
I found some interesting posts about this problem on the internet, but no real solution. One in particular contains a statement from Hans Passant that this really seems to be a bug, with a response from an MS employee, which I don't understand or which may also simply be wrong:
https://social.msdn.microsoft.com/Forums/vstudio/en-US/de74ec2e-643d-41c7-9d04-254642a9775c/imagesave-quotparameter-is-not-validquot-in-windows-7?forum=netfxbcl
However, this post is 10 years old and I cannot believe that this is not fixed, especially since the transformation has an explicit example in the MSDN docs.
Does anyone have a hint as to what I'm doing wrong, or, if this really is a bug, how I can circumvent it?
Please note that I have to make the transformation lossless (as far as the pixel size allows it). Therefore, Image.RotateFlip is not an option.
Windows version is 10.0.17763, .NET is 4.7.2.
using (var ms = new MemoryStream(originalImageData)) {
    image = System.Drawing.Image.FromStream(ms);
}
This is the root of all evil and made the first attempt fail. It violates the rule stipulated in the Remarks section of the documentation: you must keep the stream open for the lifetime of the Image. Violating the rule does not cause consistent trouble; note how the Save() call failed but the Bitmap(image) constructor succeeded. GDI+ is somewhat lazy, and you have very nice evidence that the JPEG codec indeed tries to avoid recompressing the image. But that can't work: the raw data in the stream is no longer accessible since the stream got disposed. The exception is lousy because the native GDI+ code doesn't know beans about a MemoryStream. The fix is simple: just move the closing } bracket to after the Save() call.
From there it went wrong another way, triggered primarily by the new bmp object. Neither the image nor the bmp objects are being disposed. This consumes address space in a hurry; the GC can't run often enough to keep you out of trouble, since the data for the bitmap is stored in unmanaged memory. Eventually the Save() call fails when the MemoryStream can't allocate memory anymore.
You must use the using statement on these objects so this can't happen.
That ought to solve the problems. Do get rid of the Bitmap workaround, since it forces the JPEG to be recompressed. Technically you can still get into trouble when the images are large, suffering from address-space fragmentation in a 32-bit process. Keep an eye on the "Private bytes" memory counter for the process; ideally it stays below a gigabyte. If not, use Project > Properties > Build tab and untick "Prefer 32-bit". A sketch with these fixes applied is below.
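Putting the fixes above together, a sketch of the corrected flow (GetJpgEncoderInfo() is the question's own helper; the stream now stays open until after Save(), everything is disposed, and the Bitmap copy is gone):

using (var input = new MemoryStream(originalImageData))
using (var image = System.Drawing.Image.FromStream(input))
using (var encParameters = new EncoderParameters(1))
{
    encParameters.Param[0] = new EncoderParameter(
        Encoder.Transformation, (long)EncoderValue.TransformRotate90);

    using (var output = new MemoryStream())
    {
        // The source stream is still open here, so the JPEG codec can
        // perform the lossless transform without recompressing.
        image.Save(output, GetJpgEncoderInfo(), encParameters);
        return output.ToArray();
    }
}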

Why is my encoding showing twice?

byte[] lengthBytes = new byte[4];
serverStream.Read(lengthBytes, 0, 4);
MessageBox.Show("'>>" + System.Text.Encoding.UTF8.GetString(lengthBytes) + "<<'");
MessageBox.Show("Hello");
This is the code I used for debugging. I get 2 message boxes now; if I use Debug.WriteLine, the output is also printed twice.
Msgbox 1: '>>/ (note that this is still 4 characters long; the last 3 bytes are null)
Msgbox 2: '>>{"ac<<'
Msgbox 3: Hello
I'm trying to send 4 bytes containing an integer, the length of the message. This is going fine ('/' is ASCII for 47). The problem is that the first 4 bytes of the message are also being read ('{"ac'). I totally don't know how this happens; I've already been debugging this for several hours and I just can't get my head around it. One of my friends suggested making an account on StackOverflow, so here I am :p
Thanks for all the help :)
EDIT: The real code for the people who asked
My code http://kutj.es/2ah-j9
You are making traditional programmer mistakes; everybody has to make them once to learn how to avoid them and do it right. This primarily went off the rails because the debugging code you wrote is itself buggy, which made it a lot harder to find your mistake:
Never write debugging code that uses MessageBox.Show(). It is a very, very evil function: it causes re-entrancy. An expensive word that means that it only freezes the user interface; it doesn't freeze your program. Your program continues to run, and one of the things that can go wrong is that the code you posted is executed again. Re-entered. You'll see two message boxes. And you'll have a completely corrupted program state, because your code was never written to assume it could be re-entered. Which is why you complained that 4 bytes of data were swallowed.
The proper tool to use here is the feature that really freezes your program: a debugger breakpoint.
Never assume that binary data can be converted to text. Those 4 bytes you received contain binary zeros. There is no character for them. Worse, a zero acts as a string terminator in many operating system calls, the kind used by the debugger, Debug.WriteLine(), etc. This is why you can't see the "<<".
The proper tool to use here is a debugger watch or tooltip, which lets you look into the array directly. If you absolutely have to generate a diagnostic string, then use BitConverter.ToString().
Never assume that a stream's Read() method will always return the number of bytes you asked for. Using the return value in your code is a hard requirement. This is the real bug in your program, the only one you are actually trying to fix.
The proper solution is to continue to call Read() until you have received the number of bytes the length prefix promised; a sketch of such a loop is below. You'll need somewhere, such as a MemoryStream, to store the chunks of byte[]s you get.
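Something along these lines (a sketch that allocates the full buffer up front instead of accumulating into a MemoryStream; ReadExactly is an illustrative helper name, not a framework method):

static byte[] ReadExactly(Stream stream, int count)
{
    var buffer = new byte[count];
    int offset = 0;
    while (offset < count)
    {
        // Read() may return fewer bytes than requested; honor its return value.
        int read = stream.Read(buffer, offset, count - offset);
        if (read == 0)
            throw new EndOfStreamException("Connection closed prematurely.");
        offset += read;
    }
    return buffer;
}

// Usage: the 4-byte length prefix first, then exactly that many payload bytes.
byte[] lengthBytes = ReadExactly(serverStream, 4);
int length = BitConverter.ToInt32(lengthBytes, 0);
byte[] payload = ReadExactly(serverStream, length);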
Perhaps this link regarding Encoding.GetString() will help you out a bit. The part to pay attention to:
If the data to be converted is available only in sequential blocks (such as data read from a stream) or if the amount of data is so large that it needs to be divided into smaller blocks, you should use the Decoder object returned by the GetDecoder method of a derived class.
The problem was that I started the getMessage method twice, which started the while loop twice (in different threads).
Elgonzo helped me find the problem; he is a great guy :)

Process very large XML file

I need to process an XML file with the following structure:
<FolderSizes>
<Version></Version>
<DateTime Un=""></DateTime>
<Summary>
<TotalSize Bytes=""></TotalSize>
<TotalAllocated Bytes=""></TotalAllocated>
<TotalAvgFileSize Bytes=""></TotalAvgFileSize>
<TotalFolders Un=""></TotalFolders>
<TotalFiles Un=""></TotalFiles>
</Summary>
<DiskSpaceInfo>
<Drive Type="" Total="" TotalBytes="" Free="" FreeBytes="" Used=""
UsedBytes=""><![CDATA[ ]]></Drive>
</DiskSpaceInfo>
<Folder ScanState="">
<FullPath Name=""><![CDATA[ ]]></FullPath>
<Attribs Int=""></Attribs>
<Size Bytes=""></Size>
<Allocated Bytes=""></Allocated>
<AvgFileSz Bytes=""></AvgFileSz>
<Folders Un=""></Folders>
<Files Un=""></Files>
<Depth Un=""></Depth>
<Created Un=""></Created>
<Accessed Un=""></Accessed>
<LastMod Un=""></LastMod>
<CreatedCalc Un=""></CreatedCalc>
<AccessedCalc Un=""></AccessedCalc>
<LastModCalc Un=""></LastModCalc>
<Perc><![CDATA[ ]]></Perc>
<Owner><![CDATA[ ]]></Owner>
<!-- Special element; see paragraph below -->
<Folder></Folder>
</Folder>
</FolderSizes>
The <Folder> element is special in that it repeats within the <FolderSizes> element but can also appear within itself; I reckon up to about 5 levels deep.
The problem is that the file is really big, a whopping 11GB, so I'm having difficulty processing it - I have experience with XML documents, but nothing on this scale.
What I would like to do is to import the information into a SQL database because then I will be able to process the information in any way necessary without having to concern myself with this immense, impractical file.
Here are the things I have tried:
Simply load the file and attempt to process it with a simple C# program using an XmlDocument or XDocument object
Before I even started I knew this would not work, as I'm sure everyone would agree, but I tried it anyway, and ran the application on a VM (since my notebook only has 4GB RAM) with 30GB memory. The application ended up using 24GB memory, and taking very, very long, so I just cancelled it.
Attempt to process the file using an XmlReader object
This approach worked better in that it didn't use as much memory, but I still had a few problems:
It was taking really long because I was reading the file one line at a time.
Processing the file one line at a time makes it difficult to really work with the data contained in the XML, because now you have to detect the start of a tag, and then the end of that tag (hopefully), and then create a document from that information, read the info, and attempt to determine which parent tag it belongs to, because we have multiple levels... Sounds prone to problems and errors.
Did I mention it takes really long reading the file one line at a time - and that's still without actually processing the lines; literally just reading them.
Import the information using SQL Server
I created a stored procedure using XQuery, running it recursively within itself to process the <Folder> elements. This went quite well - I think better than the other two approaches - until one of the <Folder> elements ended up being rather big, producing an "An XML operation resulted an XML data type exceeding 2GB in size. Operation aborted." error. I read up about it and I don't think it's an adjustable limit.
Here are more things I think I should try:
Re-write my C# application to use unmanaged code
I don't have much experience with unmanaged code, so I'm not sure how well it will work and how to make it as unmanaged as possible.
I once wrote a little application that works with my webcam, receiving the image, inverting the colours, and painting it to a panel. Using normal managed code didn't work - the result was about 2 frames per second. Re-writing the colour inversion method to use unmanaged code solved the problem. That's why I thought that unmanaged might be a solution.
Rather go for C++ instead of C#
Not sure if this is really a solution. Would it necessarily be better than C#? Better than unmanaged C#?
The problem here is that I haven't actually worked with C++ before, so I'll need to get to know a few things about C++ before I can really start working with it, and even then probably not very efficiently yet.
I thought I'd ask for some advice before I go any further, possibly wasting my time.
Thanks in advance for your time and assistance.
EDIT
So before I start processing the file, I run through it and check the size in an attempt to provide the user with feedback as to how long the processing might take; I took a screenshot of the calculation:
That's about 1,500 lines per second; if the average line length is about 50 characters, that's 50 bytes per line, which is 75 kilobytes per second, so an 11GB file should take about 40 hours, if my maths is correct. And this is only stepping through each line, not actually processing it or doing anything with it, so when that starts, the processing rate will drop significantly.
This is the method that runs during the size calculation:
private int _totalLines = 0;
private bool _cancel = false; // set to true when the cancel button is clicked

private void CalculateFileSize()
{
    xmlStream = new StreamReader(_filePath);
    xmlReader = new XmlTextReader(xmlStream);

    while (xmlReader.Read())
    {
        if (_cancel)
            return;

        if (xmlReader.LineNumber > _totalLines)
            _totalLines = xmlReader.LineNumber;

        InterThreadHelper.ChangeText(
            lblLinesRemaining,
            string.Format("{0} lines", _totalLines));

        string elapsed = string.Format(
            "{0}:{1}:{2}:{3}",
            timer.Elapsed.Days.ToString().PadLeft(2, '0'),
            timer.Elapsed.Hours.ToString().PadLeft(2, '0'),
            timer.Elapsed.Minutes.ToString().PadLeft(2, '0'),
            timer.Elapsed.Seconds.ToString().PadLeft(2, '0'));
        InterThreadHelper.ChangeText(lblElapsed, elapsed);

        if (_cancel)
            return;
    }

    xmlStream.Dispose();
}
Still running, 27 minutes in :(
You can read the XML as a logical stream of elements instead of trying to read it line-by-line and piecing it back together yourself; see the code sample at the end of this article, and the sketch below.
Also, your question has already been asked here.
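A minimal sketch of that approach, adapted to the schema above (requires System.Xml and System.Xml.Linq; the SQL insert is left as a stub, and the element names come from the question's XML):

using (var reader = XmlReader.Create(_filePath))
{
    reader.MoveToContent();
    while (!reader.EOF)
    {
        if (reader.NodeType == XmlNodeType.Element && reader.Name == "Folder")
        {
            // Materialize one <Folder> subtree at a time; XNode.ReadFrom advances
            // the reader past the element, so no extra Read() is needed here.
            var folder = (XElement)XNode.ReadFrom(reader);
            var path = (string)folder.Element("FullPath");
            var size = (long?)folder.Element("Size")?.Attribute("Bytes");
            // ... INSERT path/size/etc. into the SQL database here ...
        }
        else
        {
            reader.Read();
        }
    }
}

Note that nested <Folder> elements ride along inside their parent's subtree, so if your top-level folders are themselves huge you may need to descend a level or two before materializing.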

File.ReadAllText leading to memory leak [duplicate]

Possible Duplicate:
TextBox.Text Leaking Memory in WPF Application
I've got an application tailing a logfile. Every time the logfile updates (which is usually a series of updates in a row) the memory use balloons out of control.
I've tracked down the problem to this call:
if (File.Exists(Path + "\\logfile.txt"))
    Data = File.ReadAllText(Path + "\\logfile.txt");
This is being called from within LoadAllData, here.
private void FileChangeNotificationHandler(object source, FileSystemEventArgs e)
{
    this.Dispatcher.BeginInvoke(new Action(delegate()
    {
        Logfile.GetPath();
        Logfile.LoadAllData();
        LogText.Clear();
        LogText.Text = Logfile.Data;
        if (CheckFollowTail.IsChecked == true) LogText.ScrollToEnd();
    }));
}
Does anyone have insight on why this occurring? I assume it's related to the delegate or the handler.
It's probably just down to the amount and frequency of log file data you are loading into memory.
GC takes time, so if you are repeating this in quick succession, chances are you'll have several files' worth of data in memory until the next GC. This seems very inefficient. You should consider using a stream-based reader to avoid keeping all the data in memory. If you do use a stream reader, make sure you dispose of it afterwards to avoid introducing another leak.
Another thing to check is that you're not subscribing to a static event somewhere and thereby preventing your object tree from being collected. Is it a web app?
First of all, checking if the file exists is wrong. This is because the file system is volatile and because there is more than just existence at play (permissions, for example). The correct way to do this is to just open the file, and then handle the exception if it fails.
Now, on to your stated problem. What I suspect is happening is that the log is growing large enough to land on the Large Object Heap (85,000 bytes is all that's needed, IIRC, and remember that .NET uses UTF-16 (2-byte) characters). A 43K ASCII log file is all you'll need to start causing problems, because at that size your .NET string is no longer garbage collected in the normal way. Every time you read the file you end up adding another instance of the entire log file to memory.
To best recommend how to get around this, it will be helpful to know what kind of component you use for your LogText variable. But pending that information, I can at least suggest a few pointers:
Ideally, you would just keep the file open (using FileShare.ReadWrite) and read from the stream every time you get a change notification. But that's not always possible.
If you have to re-open the file each time, at least read the text line by line (using a StreamReader) rather than pulling it all at once with File.ReadAllText(). This will help you keep your log file broken up into smaller pieces that won't end up on the Large Object Heap.
Unfortunately, I suspect that in the end you're stuck building one big string to assign to a plain textbox. If this is the case, I strongly recommend that you only ever build and show the last part of the log (less than 85,000 bytes' worth; see the sketch below) or that you search for a Large Object Heap-safe textbox component to use.
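A sketch of that tail-only idea (the 32 KB cap and the names are illustrative; FileShare.ReadWrite lets the logging process keep writing while we read):

static string ReadTail(string path, int maxBytes = 32 * 1024)
{
    // 32 KB of ASCII becomes a ~64 KB UTF-16 string, safely under the
    // 85,000-byte Large Object Heap threshold.
    using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {
        fs.Seek(Math.Max(0, fs.Length - maxBytes), SeekOrigin.Begin);
        using (var reader = new StreamReader(fs))
            return reader.ReadToEnd();
    }
}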
