Writing huge amounts of text to a textbox - c#

I am writing a log of lots and lots of formatted text to a textbox in a .net windows form app.
It is slow once the data gets over a few megs. Since I am appending the string has to be reallocated every time right? I only need to set the value to the text box once, but in my code I am doing line+=data tens of thousands of times.
Is there a faster way to do this? Maybe a different control? Is there a linked list string type I can use?

StringBuilder will not help if the text box is added to incrementally, like log output for example.
But, if the above is true and if your updates are frequent enough it may behoove you to cache some number of updates and then append them in one step (rather than appending constantly). That would save you many string reallocations... and then StringBuilder would be helpful.
Notes:
Create a class-scoped StringBuilder member (_sb)
Start a timer (or use a counter)
Append text updates to _sb
When timer ticks or certain counter reached reset and append to
text box
restart process from #1

No one has mentioned virtualization yet, which is really the only way to provide predictable performance for massive volumes of data. Even using a StringBuilder and converting it to a string every half a second will be very slow once the log gets large enough.
With data virtualization, you would only hold the necessary data in memory (i.e. what the user can see, and perhaps a little more on either side) whilst the rest would be stored on disk. Old data would "roll out" of memory as new data comes in to replace it.
In order to make the TextBox appear as though it has a lot of data in it, you would tell it that it does. As the user scrolls around, you would replace the data in the buffer with the relevant data from the underlying source (using random file access). So your UI would be monitoring a file, not listening for logging events.
Of course, this is all a lot more work than simply using a StringBuilder, but I thought it worth mentioning just in case.

Build your String together with a StringBuilder, then convert it to a String using toString(), and assign this to the textbox.

I have found that setting the textbox's WordWrap property to false greatly improves performance, as long as you're ok with having to scroll to the right to see all of your text. In my case, I wanted to paste a 20-50 MB file into a MultiLine textbox to do some processing on it. That took several minutes with WordWrap on, and just several seconds with WordWrap off.

Related

Fastest way to draw a large text file in C# winforms

I have a large text file (~100MB) that I keep it's lines in a list of strings.
my Winform requires occasionally to show a part of it, for example 500,000 lines.
I have tried using a ListBox, RichTextBox and TextBox, but the drawing takes too much time.
for example, TextBox takes 25 seconds to show 500,000 lines,
whereas notepad opens a text file of this size immediately.
what will be the fastest solution for this purpose?
Why not open a file stream and just read the first few lines. You can use seek as the user scrolls in the file and display the appropriate lines. The point is - reading the whole file into memory takes to long so don't do that!
Starter Code
The following is a short code snippet that isn't complete but it should at least get you started:
// estimate the average line length in bytes somehow:
int averageLineLengthBytes = 100;
// also need to store the current scroll location in "lines"
int currentScroll = 0;
using (var binaryReader = new StreamReader(new FileStream(fileName, FileAccess.Read)))
{
if (binaryReader.BaseStream.CanSeek)
{
// seek the location to read:
binaryReader.BaseStream.Seek(averageLineLengthBytes * currentScroll, SeekOrigin.Begin);
// read the next few lines using this command
binaryReader.ReadLine();
}
else
{
// revert to a slower implementation here!
}
}
The biggest trick is going to be estimating how long the scroll bar needs to be (how many lines are in the file). For that you are going to have to either alter the scroll bar as the user scrolls or you can use prior knowledge of how long typical lines are in this file and estimate the length based on the total number of bytes. Either way, hope this helps!
A Note About Virtual Mode
Virtual mode is a method of using a ListBox or similar list control to load the items on an as needed basis. The control will execute a callback to retrieve the items based on an index when the user scrolls within the control. This is a viable solution only if your data meets the following criteria:
You must know (up front) the number of data items that you wish to present. If you need to read the entire file to get this total, it isn't going to work for you!
You must be able to retrieve a specific data item based an index for that item without reading the entire file.
You must be willing to present the data in an icon, small details, details or other supported format (or be willing to go to a ton of extra work to write a custom list view).
If you cannot meet these criteria, then virtual mode is not going to be particularly helpful. The answer I presented with seek will work regardless of whether or not you can perform these actions. Of course, if you can meet these minimum criteria, then by all means - look up virtual mode for list views and you should find some really useful information!
ListView have a Virtual Mode property. It allows you to only load the data are in view using the Retrieve Virtual Item Event. So when that event is trigger for item number 40,000 for example, you would perform a seek on the file read in the line.
You can also find example of a virtual list box on Microsoft. It really old, but it gives you a basic idea.

Why does appending to TextBox.Text during a loop take up more memory with each iteration?

Short Question
I have a loop that runs 180,000 times. At the end of each iteration it is supposed to append the results to a TextBox, which is updated real-time.
Using MyTextBox.Text += someValue is causing the application to eat huge amounts of memory, and it runs out of available memory after a few thousand records.
Is there a more efficient way of appending text to a TextBox.Text 180,000 times?
Edit I really don't care about the result of this specific case, however I want to know why this seems to be a memory hog, and if there is a more efficient way to append text to a TextBox.
Long (Original) Question
I have a small app which reads a list of ID numbers in a CSV file and generates a PDF report for each one. After each pdf file is generated, the ResultsTextBox.Text gets appended with the ID Number of the report that got processed and that it was successfully processed. The process runs on a background thread, so the ResultsTextBox gets updated real-time as items get processed
I am currently running the app against 180,000 ID numbers, however the memory the application is taking up is growing exponentially as time goes by. It starts by around 90K, but by about 3000 records it is taking up roughly 250MB and by 4000 records the application is taking up about 500 MB of memory.
If I comment out the update to the Results TextBox, the memory stays relatively stationary at roughly 90K, so I can assume that writing ResultsText.Text += someValue is what is causing it to eat memory.
My question is, why is this? What is a better way of appending data to a TextBox.Text that doesn't eat memory?
My code looks like this:
try
{
report.SetParameterValue("Id", id);
report.ExportToDisk(ExportFormatType.PortableDocFormat,
string.Format(#"{0}\{1}.pdf", new object[] { outputLocation, id}));
// ResultsText.Text += string.Format("Exported {0}\r\n", id);
}
catch (Exception ex)
{
ErrorsText.Text += string.Format("Failed to export {0}: {1}\r\n",
new object[] { id, ex.Message });
}
It should also be worth mentioning that the app is a one-time thing and it doesn't matter that it is going to take a few hours (or days :)) to generate all the reports. My main concern is that if it hits the system memory limit, it will stop running.
I'm fine with leaving the line updating the Results TextBox commented out to run this thing, but I would like to know if there is a more memory efficient way of appending data to a TextBox.Text for future projects.
I suspect the reason the memory usage is so large is because textboxes maintain a stack so that the user can undo/redo text. That feature doesn't seem to be required in your case, so try setting IsUndoEnabled to false.
Use TextBox.AppendText(someValue) instead of TextBox.Text += someValue. It's easy to miss since it's on TextBox, not TextBox.Text. Like StringBuilder, this will avoid creating copies of the entire text each time you add something.
It would be interesting to see how this compares to the IsUndoEnabled flag from keyboardP's answer.
Don't append directly to the text property. Use a StringBuilder for the appending, then when done, set the .text to the finished string from the stringbuilder
Instead of using a text box I would do the following:
Open up a text file and stream the errors to a log file just in case.
Use a list box control to represent the errors to avoid copying potentially massive strings.
Personally, I always use string.Concat* . I remember reading a question here on Stack Overflow years ago that had profiling statistics comparing the commonly-used methods, and (seem) to recall that string.Concat won out.
Nonetheless, the best I can find is this reference question and this specific String.Format vs. StringBuilder question, which mentions that String.Format uses a StringBuilder internally. This makes me wonder if your memory hog lies elsewhere.
**based on James' comment, I should mention that I never do heavy string formatting, as I focus on web-based development.*
Maybe reconsider the TextBox? A ListBox holding string Items will probably perform better.
But the main problem seem to be the requirements, Showing 180,000 items cannot be aimed at a (human) user, neither is changing it in "Real Time".
The preferable way would be to show a sample of the data or a progress indicator.
When you do want to dump it at the poor User, batch string updates. No user could descern more than 2 or 3 changes per second. So if you produce 100/second, make groups of 50.
Some responses have alluded to it, but nobody has outright stated it which is surprising.
Strings are immutable which means a String cannot be modified after it is created. Therefore, every time you concatenate to an existing String, a new String Object needs to be created. The memory associated with that String Object also obviously needs to be created, which can get expensive as your Strings become larger and larger. In college, I once made the amateur mistake of concatenating Strings in a Java program that did Huffman coding compression. When you're concatenating extremely large amounts of text, String concatenation can really hurt you when you could have simply used StringBuilder, as some in here have mentioned.
Use the StringBuilder as suggested.
Try to estimate the final string size then use that number when instantiating the StringBuilder. StringBuilder sb = new StringBuilder(estSize);
When updating the TextBox just use assignment eg: textbox.text = sb.ToString();
Watch for cross-thread operations as above. However use BeginInvoke. No need to block
the background thread while the UI updates.
A) Intro: already mentioned, use StringBuilder
B) Point: don't update too frequently, i.e.
DateTime dtLastUpdate = DateTime.MinValue;
while (condition)
{
DoSomeWork();
if (DateTime.Now - dtLastUpdate > TimeSpan.FromSeconds(2))
{
_form.Invoke(() => {textBox.Text = myStringBuilder.ToString()});
dtLastUpdate = DateTime.Now;
}
}
C) If that's one-time job, use x64 architecture to stay within 2Gb limit.
StringBuilder in ViewModel will avoid string rebindings mess and bind it to MyTextBox.Text. This scenario will increase performance many times over and decrease memory usage.
Something that has not been mentioned is that even if you're performing the operation in the background thread, the update of the UI element itself HAS to happen on the main thread itself (in WinForms anyway).
When updating your textbox, do you have any code that looks like
if(textbox.dispatcher.checkAccess()){
textbox.text += "whatever";
}else{
textbox.dispatcher.invoke(...);
}
If so, then your background op is definitely being bottlenecked by the UI Update.
I would suggest that your background op use StringBuilder as noted above, but instead of updating the textbox every cycle, try updating it at regular intervals to see if it increases performance for you.
EDIT NOTE:have not used WPF.
You say memory grows exponentially. No, it is a quadratic growth, i.e. a polynomial growth, which is not as dramatic as an exponential growth.
You are creating strings holding the following number of items:
1 + 2 + 3 + 4 + 5 ... + n = (n^2 + n) /2.
With n = 180,000 you get total memory allocation for 16,200,090,000 items, i.e. 16.2 billion items! This memory will not be allocated at once, but it is a lot of cleanup work for the GC (garbage collector)!
Also, bear in mind, that the previous string (which is growing) must be copied into the new string 179,999 times. The total number of copied bytes goes with n^2 as well!
As others have suggested, use a ListBox instead. Here you can append new strings without creating a huge string. A StringBuild does not help, since you want to display the intermediate results as well.

File.ReadAllText leading to memory leak [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
TextBox.Text Leaking Memory in WPF Application
I've got an application trailing a logfile. Every time the logfile updates (which is usually a series of updates in a row) the memory use balloons out of control.
I've tracked down the problem to this call:
if (File.Exists(Path + "\\logfile.txt"))
Data = File.ReadAllText(Path + "\\logfile.txt");
This is being called from within LoadAllData, here.
private void FileChangeNotificationHandler(object source, FileSystemEventArgs e)
{
this.Dispatcher.BeginInvoke
(new Action(delegate()
{
Logfile.GetPath();
Logfile.LoadAllData();
LogText.Clear();
LogText.Text = Logfile.Data;
if (CheckFollowTail.IsChecked == true) LogText.ScrollToEnd();
}));
}
Does anyone have insight on why this occurring? I assume it's related to the delegate or the handler.
It's probably just down to the amount and frequency with which you are loading log file data into memory.
GC takes time, so it you are repeating this in quick succession, then chances are you'll have several files worth of data in memory until the next GC. This seems very inefficient. You should consider the use of a stream based reader, to avoid keeping all the data in memory. If you do use a stream reader, make sure you dispose of it afterwards to avoid introducing another leak.
The another thing to check it that your not subscribing to a static event somewhere and therefore preventing your object tree from being disposed. Is it a web app?
First of all, checking if the file exists is wrong. This is because the file system is volatile and because there is more than just existence at play (permissions, for example). The correct way to do this is to just open the file, and then handle the exception if it fails.
Now, on to your stated problem. What I suspect is happening is that the log is growing large enough to use the Large Object Heap (85000 bytes is all that's needed, iirc, and remember that .Net uses utf16 (2-byte) characters). A 43K ascii log file is all you'll need to start causing problems, because at that size your .Net string is no longer garbage collected in the normal way. Every time you read the file you end up adding another instance of the entire log file to memory.
To best recommend how to get around this, it will be helpful to know what kind of component you use for your LogText variable. But pending that information, I can at least suggest a few pointers:
Ideally, you would just keep the file open (using FileShare.ReadWrite) and read from the stream every time you get a change notification. But that's not always possible.
If you have to re-open the file each time, at least read the text line by line (using a StreamReader) rather than pulling it all at once using File.ReadAllLines(). This will help you keep your log file broken up into smaller pieces that won't end up on the large object heap.
Unfortunately, I suspect that in the end you're stuck building one big string to assign to a plain textbox. If this is the case, I strongly recommend that you either only ever build and show the last part of the log (less than 85000 bytes worth) or that you search for a Large Object Heap-safe Textbox component to use.

Problem with C# multiline textbox memory usage

I am using a multiline text box in C# to just log some trace information. I simply use AppendText("text-goes-here\r\n") as I need to add lines.
I've let this program run for a few days (with a lot of active trace) and I noticed it was using a lot of memory. Long story short, it appears that even with the maxlength value to something very small (256) the content of the text box just keeps expanding.
I thought it worked like a FIFO (throwing away the oldest text that exceeds the maxlength size). It doesn't, it just keeps increasing in size. This is apparently the cause of my memory waste. Anybody know what I'm doing wrong?
Added a few hours after initial question...
Ok, I tried the suggested code below. To quickly test it, I simply added a timer to my app and from that timer tick I now call a method that does essentially the same thing as the code below. The tick rate is high so that I can observe the memory usage of the process and quickly determine if there is a leak. There wasn't. That was good; however, I put this in my application and memory usage did not change (still leaking). That sure seems to imply that I have a leak somwehere else :-( however, if I simply add a return at the top of that method, the usage drops back to stable. Any thoughts on this? The timer-tick-invoked code did not accumulate memory but my real code (same method) does. The difference is that I'm calling the method from a variety of different places in the real code. Can the context of the call affect this somehow? (note, if it isn't already obvious, I'm not a .NET expert by any means)...
TextBox will allow you to append text regardless of MaxLength value - it's only used to control user entry. You can create a method that will be adding new text after verifying that maxlength is not reached, and if it is, just remove x lines from the beginning.
You could use a simple function to append text:
int maxLength = 256;
private void AppendText(string text)
{
textBox1.AppendText(text);
if(textBox1.Text.Length > maxLength)
textBox1.Text = textBox1.Text.Substring(textBox1.Text.Length - maxLength);
}

C# Application Becomes Slow and Unresponsive as String in Multiline Textbox Grows

I have a C# application in which a LOT of information is being added to a Textbox for display to the user. Upon processing of the data, almost immediately, the application becomes very slow and unresponsive. This is how I am currently attempting to handle this:
var saLines = textBox1.Lines;
var saNewLines = saLines.Skip(50);
textBox1.Lines = saNewLines.ToArray();
This code is run from a timer every 100mS. Is there a better way to handle this? I am using Microsoft Visual C# 2008 Express Edition. Thanks.
The simple answer is TextBox.AppendText().
You get much better performance initially.
I tested writing a 500 char message every 20 ms for 2 mins (with BackgroundWorker) and the UI remained responsive and CPU minimal. At some point, of course, it will become unresponsive but it was good enough for my needs.
Try by having in memory a list with the content, and removing the first 50 elements by RemoveRange and then going with ToArray();
Like this :
lst.RemoveRange(0,50);
textBox1.Lines = lst.ToArray();
It should be a lot faster.
I'd say your main problem here is that you are using the TextBox as your primary storage for your text. Everytime you call TextBox.Lines, the string is split on Environment.NewLine.
Try turning it around:
Store the text in a new List<String>(maxLines)
In your AddLine method, check the length of your text buffer and use RemoveRange(0, excessCount)
Update your display TextBox by calling String.Join(Environment.NewLine, textBuffer.ToArray())
That last call is a bit expensive, but it should stop your slowdowns. To get it any faster you'd need to use a statically sized string array and move the references around yourself.
The most efficient way to trim an array is to create a new array of the desired size, then use Array.Copy to copy the desired portion of the old array.
I would recommend that you maintain a List<string> containing all of your lines.
You should use a StringBuilder to build a string containing the lines you're looking for, and set the textbox's Text proeprty to the StringBuilder's string. For added performance, set the StringBuilder's capacity to a reasonable guess of the final sie of the string. (Or to list.Skip(...).Take(...).Sum(s => s.Length))
If you're concerned about memory, you can trim the List<string> by calling RemoveRange.
As long as you don't put too much in the textbox at once, doing it this way should be extremely fast. All of the manipulation of the List<string> and the StringBuilder can be done in a background thread, and you can pass the completed string to the UI thread.
The TextBox.Lines property simply concatenates the array you give it using a StringBuilder, so there's no point in using it (and making a needless array).
Instead of splitting the text and then re-joining it, just get the sub-string from the 51st line:
int i = textBox1.GetFirstCharIndexFromLine(50);
if (i > 0) textBox1.Text = textBox1.Text.Substring(i);

Categories

Resources