C# XML Father-Son update cannot save

I'm trying to update an existing app.
I was previously asked simply to clean escape characters out of the XML files that were coming to us, before they were pulled through to the company system. Doing this let us avoid writing inside an app that was written 7 years ago and works fine (but has ZERO documentation).
It actually worked fine with
foreach (string d in Directory.GetFiles(test, "*.xml", SearchOption.AllDirectories))
{
    String[] lines = File.ReadAllLines(d);
    for (int i = 0; i < lines.Length; i++)
    {
        if (lines[i].Contains("&amp;"))
        {
            i++;
        }
        //Replace incorrect characters
        else if (lines[i].Contains("&"))
        {
            log.Info(saveName);
            log.Error("Incorrect '&' Detected: Changing to '&amp;'");
            lines[i] = lines[i].Replace("&", "&amp;");
            log.Info(lines[i]);
        }
    }
    System.IO.File.WriteAllLines(d, lines);
}
Maybe it worked too well, as I have now been asked to integrate this into the main app so that the operators no longer have to do the pre-clean.
I know (well, I believe) that I am missing the equivalent of System.IO.File.WriteAllLines(d, lines); in the following code, but I cannot get it or anything else to work.
The replace is working, as the WriteLine shows the corrected line(s), but I cannot get the system to hold the changes.
MemoryStream ms = new MemoryStream();
ms.Position = 0;
List<string> rows = new List<string>();
using (var reader = new StreamReader(ms))
{
    string line;
    var sw = new StreamWriter(ms);
    while ((line = reader.ReadLine()) != null)
    {
        if (line.Contains("&"))
        {
            Console.WriteLine(line);
            line = line.Replace("&", "&amp;");
            sw.Write(line);
            Console.WriteLine(line);
        }
    }

Not sure how important writing a log is for you, but it seems you can do the same using something like this:
string text = File.ReadAllText("test.xml");
text = Regex.Replace(text, "&(?!amp;)", "&amp;");
File.WriteAllText("test.xml", text);
It should also cover the case where there is more than one & symbol in one line. The original code does not handle that: a line such as '&amp;hello&' already contains '&amp;', so it is skipped and the bare '&' is left as it is.
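If the log still matters, the same regex idea can count what it changes. This is only a sketch: the class and method names are made up for illustration, and it assumes the other predefined XML entities and numeric character references should be left alone too (the pattern in the answer only protects the ampersand entity itself).

```csharp
using System;
using System.Text.RegularExpressions;

static class AmpersandFixer
{
    // Matches '&' unless it starts a predefined entity or a numeric character reference.
    // Assumption: these are the only entities that occur in the incoming files.
    static readonly Regex BareAmpersand =
        new Regex(@"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Escapes every bare '&' and reports how many were replaced.
    public static string EscapeBareAmpersands(string text, out int replaced)
    {
        int count = 0;
        string result = BareAmpersand.Replace(text, m => { count++; return "&amp;"; });
        replaced = count;
        return result;
    }

    static void Main()
    {
        int n;
        string fixedLine = EscapeBareAmpersands("<a>&amp;hello& R&D &lt;x&gt;</a>", out n);
        Console.WriteLine(fixedLine); // <a>&amp;hello&amp; R&amp;D &lt;x&gt;</a>
        Console.WriteLine(n);         // 2
    }
}
```

The count can be passed to whatever logger the app already uses, so a line is only logged when something actually changed.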

The lesson is: when amending a large app, make sure to read all of it.
For some reason the original developer decided to dip back into the zip file (where these files were received) and extract the whole thing again for the Stream.
I changed that, and it all works and runs much faster as a result.

Related

C# Error: OutOfMemoryException - Reading a large text file and replacing from dictionary

I'm new to C# and object-oriented programming in general. I have an application which parses a text file.
The objective of the application is to read the contents of the provided text file and replace the matching values.
When a file of about 800 MB to 1.2 GB is provided as the input, the application crashes with the error System.OutOfMemoryException.
On researching, I came across a couple of answers which recommend changing the Target Platform to x64.
The same issue exists after changing the target platform.
Following is the code:
// Reading the text file
var _data = string.Empty;
using (StreamReader sr = new StreamReader(logF))
{
    _data = sr.ReadToEnd();
    sr.Dispose();
    sr.Close();
}
foreach (var replacement in replacements)
{
    _data = _data.Replace(replacement.Key, replacement.Value);
}
//Writing The text File
using (StreamWriter sw = new StreamWriter(logF))
{
    sw.WriteLine(_data);
    sw.Dispose();
    sw.Close();
}
The error points to
_data = sr.ReadToEnd();
replacements is a dictionary. The Key contains the original word and the Value contains the word to replace it with.
The Key elements are replaced with the Value elements of the KeyValuePair.
The approach being followed is reading the file, replacing and writing.
I tried using a StringBuilder instead of a string, yet the application still crashed.
Can this be overcome by reading the file one line at a time, replacing and writing? What would be an efficient and fast way of doing this?
Update: The system memory is 8 GB and, on monitoring the performance, it spikes up to 100% memory usage.
@Tim Schmelter's answer works well.
However, the memory utilization spikes over 90%. It could be due to the following code:
String[] arrayofLine = File.ReadAllLines(logF);
// Generating Replacement Information
Dictionary<int, string> _replacementInfo = new Dictionary<int, string>();
for (int i = 0; i < arrayofLine.Length; i++)
{
    foreach (var replacement in replacements.Keys)
    {
        if (arrayofLine[i].Contains(replacement))
        {
            arrayofLine[i] = arrayofLine[i].Replace(replacement, masking[replacement]);
            if (_replacementInfo.ContainsKey(i + 1))
            {
                _replacementInfo[i + 1] = _replacementInfo[i + 1] + "|" + replacement;
            }
            else
            {
                _replacementInfo.Add(i + 1, replacement);
            }
        }
    }
}
//Creating Replacement Information
StringBuilder sb = new StringBuilder();
foreach (var Replacement in _replacementInfo)
{
    foreach (var replacement in Replacement.Value.Split('|'))
    {
        sb.AppendLine(string.Format("Line {0}: {1} ---> \t\t{2}", Replacement.Key, replacement, masking[replacement]));
    }
}
// Writing the replacement information
if (sb.Length!=0)
{
    using (StreamWriter swh = new StreamWriter(logF_Rep.txt))
    {
        swh.WriteLine(sb.ToString());
        swh.Dispose();
        swh.Close();
    }
}
sb.Clear();
It finds the line number on which each replacement was made. Can this be captured using Tim's code, in order to avoid loading the data into memory multiple times?
If you have very large files you should try MemoryMappedFile, which is designed for this purpose (files > 1 GB) and lets you read "windows" of a file into memory. But it's not easy to use.
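To give an idea of what reading in "windows" looks like, here is a rough sketch that maps a file piece by piece and counts one byte value. The class name, method name and window size are invented for the example. It also sidesteps the hard part of doing text replacement this way: a word or a multi-byte character can straddle two windows, which is part of why this is not easy to use.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class WindowedReader
{
    // Counts occurrences of one byte value while mapping at most windowSize bytes at a time.
    public static long CountByte(string path, byte target, int windowSize)
    {
        long fileLength = new FileInfo(path).Length;
        if (fileLength == 0) return 0; // an empty file cannot be mapped
        long count = 0;
        var buffer = new byte[windowSize];
        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        {
            for (long offset = 0; offset < fileLength; offset += windowSize)
            {
                // The last window is usually shorter than the others.
                int size = (int)Math.Min(windowSize, fileLength - offset);
                using (var view = mmf.CreateViewStream(offset, size))
                {
                    int total = 0, read;
                    while (total < size && (read = view.Read(buffer, total, size - total)) > 0)
                        total += read;
                    for (int i = 0; i < total; i++)
                        if (buffer[i] == target) count++;
                }
            }
        }
        return count;
    }

    static void Main()
    {
        // Tiny demo file; a real use would pick a window of several megabytes.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "a&b&c");
        Console.WriteLine(CountByte(path, (byte)'&', 2));
        File.Delete(path);
    }
}
```

Note that this overload of CreateFromFile opens the file for reading and writing, so it needs write access to the file even though the sketch only reads.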
A simple optimization would be to read and replace line by line:
int lineNumber = 0;
var _replacementInfo = new Dictionary<int, List<string>>();
using (StreamReader sr = new StreamReader(logF))
{
    using (StreamWriter sw = new StreamWriter(logF_Temp))
    {
        while (!sr.EndOfStream)
        {
            string line = sr.ReadLine();
            lineNumber++;
            foreach (var kv in replacements)
            {
                bool contains = line.Contains(kv.Key);
                if (contains)
                {
                    List<string> lineReplaceList;
                    if (!_replacementInfo.TryGetValue(lineNumber, out lineReplaceList))
                        lineReplaceList = new List<string>();
                    lineReplaceList.Add(kv.Key);
                    _replacementInfo[lineNumber] = lineReplaceList;
                    line = line.Replace(kv.Key, kv.Value);
                }
            }
            sw.WriteLine(line);
        }
    }
}
At the end you can use File.Copy(logF_Temp, logF, true); if you want to overwrite the old file.
Read the file line by line and append each changed line to another file. At the end, replace the source file with the new one (with or without a backup).
var tmpFile = Path.GetTempFileName();
using (StreamReader sr = new StreamReader(logF))
{
    using (StreamWriter sw = new StreamWriter(tmpFile))
    {
        string line;
        while ((line = sr.ReadLine()) != null)
        {
            foreach (var replacement in replacements)
                line = line.Replace(replacement.Key, replacement.Value);
            sw.WriteLine(line);
        }
    }
}
File.Replace(tmpFile, logF, null); // you can pass a backup file name instead of null if you want a backup of the logF file
An OutOfMemoryException is thrown whenever the application tries and fails to allocate memory to perform an operation. According to Microsoft's documentation, the following operations can potentially throw an OutOfMemoryException:
Boxing (i.e., wrapping a value type in an Object)
Creating an array
Creating an object
If you try to create an infinite number of objects, then it's pretty reasonable to assume that you're going to run out of memory sooner or later.
(Note: don't forget about the garbage collector. Depending on the lifetimes of the objects being created, it will delete some of them if it determines they're no longer in use.)
What I suspect is this loop:
foreach (var replacement in replacements)
{
    _data = _data.Replace(replacement.Key, replacement.Value);
}
Each call to Replace that changes anything allocates another copy of the whole string, so sooner or later you will run out of memory. Have you counted how many times it loops?
Try to:
Increase the available memory.
Reduce the amount of data you are retrieving.

How to remove all lines in a file, then rewrite the file in Compact Framework 3.5 c#

In the .net framework using a Windows Forms app I can purge a file, then write the data that I want back to into that file.
Here is the code that I use in Windows Forms:
var openFile = File.OpenText(fullFileName);
var fileEmpty = openFile.ReadLine();
if (fileEmpty != null)
{
    var lines = File.ReadAllLines(fullFileName).Skip(4); //Will skip the first 4 then rewrite the file
    openFile.Close();//Close the reading of the file
    File.WriteAllLines(fullFileName, lines); //Reopen the file to write the lines
    openFile.Close();//Close the rewriting of the file
}
openFile.Close();
openFile.Dispose();
I am trying to do the same thing in the Compact Framework. I can keep the lines that I want and then delete all the lines in the file. However, I am not able to rewrite the file.
Here is my compact framework code:
var sb = new StringBuilder();
using (var sr = new StreamReader(fullFileName))
{
// read the first 4 lines but do nothing with them; basically, skip them
for (int i = 0; i < 4; i++)
sr.ReadLine();
string line1;
while ((line1 = sr.ReadLine()) != null)
{
sb.AppendLine(line1);
}
}
string allines = sb.ToString();
openFile.Close();//Close the reading of the file
openFile.Dispose();
//Reopen the file to write the lines
var writer = new StreamWriter(fullFileName, false); //Don't append!
foreach (char line2 in allines)
{
writer.WriteLine(line2);
}
openFile.Close();//Close the rewriting of the file
}
openFile.Close();
openFile.Dispose();
Your code
foreach (char line2 in allines)
{
    writer.WriteLine(line2);
}
is writing out the characters of the original file, each on a separate line.
Remember, allines is a single string that happens to have Environment.NewLine between the original strings of the file.
What you probably intend to do is simply
writer.WriteLine(allines);
UPDATE
You are closing openFile a number of times (you should only do this once), but you are not flushing or closing your writer.
Try
using (var writer = new StreamWriter(fullFileName, false)) //Don't append!
{
    writer.WriteLine(allines);
}
to ensure the writer is disposed and therefore flushed.
If you plan to do this to get something like a "rotating" buffer for a log file, consider that most Windows CE devices use flash as their storage medium, and your approach rewrites the whole file (everything except 4 lines) every time. If this happens quite often (every few seconds), it may wear out the flash, reaching its maximum number of erase cycles quickly (quickly may mean a few weeks or months).
An alternative approach would be to rename the old log file when it has reached the maximum size (deleting any existing file with the same name) and create a new one.
In this way your logging info would be split across two files, but you'll always append to the existing files, limiting the number of writes you perform. Also, renaming or deleting a file is not a heavy operation from the point of view of a flash file system.
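The rename approach could look roughly like this. It is a sketch only: the size limit, the ".old" suffix and the class name are assumptions for illustration, and it has not been tried on the Compact Framework itself.

```csharp
using System;
using System.IO;

static class RotatingLog
{
    // Hypothetical size limit; tune it for the device.
    const long MaxBytes = 64 * 1024;

    public static void Append(string logPath, string message)
    {
        string oldPath = logPath + ".old"; // assumed name for the previous log
        var info = new FileInfo(logPath);
        if (info.Exists && info.Length >= MaxBytes)
        {
            // Drop the previous generation, then rename the full log.
            if (File.Exists(oldPath))
                File.Delete(oldPath);
            File.Move(logPath, oldPath);
        }
        // Always append; the file is created if it does not exist yet.
        using (var writer = new StreamWriter(logPath, true))
        {
            writer.WriteLine(message);
        }
    }

    static void Main()
    {
        Append(Path.Combine(Path.GetTempPath(), "app.log"), "started");
    }
}
```

With this layout the existing content is never rewritten; each call only appends one line, plus an occasional delete and rename.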

Cut and paste line of text from text file c#

Hi everyone, beginner here, looking for some advice on a program I'm writing in C#. I need to be able to open a text document, read the first line of text (that is not blank), save this line of text to another text document and finally overwrite the read line with an empty line.
This is what I have so far. Everything works fine until the last part, where I need to write a blank line to the original text document: I just get a completely blank document. Like I mentioned above, I'm new to C#, so I'm sure there is an easy solution to this, but I can't figure it out. Any help appreciated:
try
{
    StreamReader sr = new StreamReader(@"C:\Users\Stephen\Desktop\Sample.txt");
    line = sr.ReadLine();
    while (line == "")
    {
        line = sr.ReadLine();
    }
    sr.Close();
    string path = (@"C:\Users\Stephen\Desktop\new.txt");
    if (!File.Exists(path))
    {
        File.Create(path).Dispose();
        TextWriter tw = new StreamWriter(path);
        tw.WriteLine(line);
        tw.Close();
    }
    else if (File.Exists(path))
    {
        TextWriter tw = new StreamWriter(path, true);
        tw.WriteLine(line);
        tw.Close();
    }
    StreamWriter sw = new StreamWriter(@"C:\Users\Stephen\Desktop\Sample.txt");
    int cnt1 = 0;
    while (cnt1 < 1)
    {
        sw.WriteLine("");
        cnt1 = 1;
    }
    sw.Close();
}
catch (Exception e)
{
    Console.WriteLine("Exception: " + e.Message);
}
finally
{
    Console.WriteLine("Executing finally block.");
}
else
    Console.WriteLine("Program Not Installed");
Console.ReadLine();
Unfortunately, you do have to go through the painstaking process of rewriting the file. In most cases, you could get away with loading it into memory and just doing something like:
string contents = File.ReadAllText(oldFile);
contents = contents.Replace("bad line!", "good line!");
File.WriteAllText(newFile, contents);
Remember that you'll have to deal with line breaks here, since string.Replace doesn't innately pay attention only to whole lines. But that's certainly doable. You could also use a regex with that approach. You can also use File.ReadLines(string), which reads the lines lazily as an IEnumerable<string>, and test each one while you write them back to the new file. It just depends on what exactly you want to do and how precise you want to be about it.
using (var writer = new StreamWriter(newFile))
{
    foreach (var line in File.ReadLines(oldFile))
    {
        if (shouldInsert(line))
            writer.WriteLine(line);
    }
}
That, of course, depends on the predicate shouldInsert, but you can modify that as you see fit. Because File.ReadLines streams the file rather than loading it all at once (unlike File.ReadAllLines, which returns a string[]), this should be relatively light on resources. You could also use a StreamReader for slightly lower-level control.
using (var writer = new StreamWriter(newFile))
using (var reader = new StreamReader(oldFile))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        if (shouldInsert(line))
            writer.WriteLine(line);
    }
}
Recall, of course, that this could leave you with an extra, empty line at the end of the file. I'm too tired to say that with the certainty I should be able to, but I'm pretty sure that's the case. Just keep an eye out for that, if it really matters. Of course, it normally won't.
That all said, the best way to do it would be to have a bit of fun and do it without wasting the memory, by writing a function to read the FileStream in and write out the appropriate bytes to your new file. That's, of course, the most complicated and likely over-kill way, but it'd be a fun undertaking.
See: Append lines to a file using a StreamWriter
Add true to the StreamWriter constructor to set it to "Append" mode. Note that this adds a line at the bottom of the document, so you may have to fiddle a bit to insert or overwrite it at the top instead.
And see: Edit a specific Line of a Text File in C#
Apparently, it's not that easy to just insert or overwrite a single line and the usual method is just to copy all lines while replacing the one you want and writing every line back to the file.
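For the task in the question, the copy-everything-back approach might be sketched like this. The names are made up for the example, and it assumes the file is small enough to hold in memory and that a whitespace-only line should count as blank (the question only checks for "").

```csharp
using System;
using System.IO;

static class LineMover
{
    // Moves the first non-blank line of sourcePath to the end of targetPath,
    // leaving an empty line in its place. Returns false if there is no such line.
    public static bool MoveFirstLine(string sourcePath, string targetPath)
    {
        string[] lines = File.ReadAllLines(sourcePath);
        int index = Array.FindIndex(lines, l => l.Trim().Length > 0);
        if (index < 0) return false;

        // AppendAllText creates the target file if it does not exist yet.
        File.AppendAllText(targetPath, lines[index] + Environment.NewLine);

        // Blank out only that line, then write every line back.
        lines[index] = string.Empty;
        File.WriteAllLines(sourcePath, lines);
        return true;
    }

    static void Main()
    {
        string source = Path.GetTempFileName();
        string target = Path.GetTempFileName();
        File.WriteAllLines(source, new[] { "", "first", "second" });
        Console.WriteLine(MoveFirstLine(source, target));
        File.Delete(source);
        File.Delete(target);
    }
}
```

The question's code ended up with a blank document because the StreamWriter opened the original file in overwrite mode and wrote only the one empty line; here all the other lines are written back as well.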

Displaying Text Content

Pretty simple one I hope. I have an article of text that I want to display in a window. Now rather than have this massive load of text in the centre of my code, can I add it as a Resource and read it out to the window somehow?
For those asking why, it's simply because it is a massive article and would be very ugly looking stuck in the middle of my code.
UPDATE FOR H.B.
I have tried a number of different approaches to this and am currently looking into the GetManifestResourceStream and using an embeddedResource (txt file) and writing that out to screen. Haven't finished testing it yet but if it works it would be a heck of a lot nicer than copying and pasting the entire text txtbox1.Text = "...blah blah blah".
_textStreamReader = new StreamReader(Assembly.GetExecutingAssembly().GetManifestResourceStream("Problem.Explaination.txt"));
try
{
    if (_textStreamReader.Peek() != -1)
    {
        txtBlock.Text = _textStreamReader.ReadLine();
    }
}
catch
{
    MessageBox.Show("Error writing text!");
}
My query remains: is there a better way of achieving this (assuming this is even successful)?
Thanks
NOTE
In my example above I only want one line of text. If you were adapting this to read a number of lines from a file, you would change it like so:
StreamReader _textStreamReader;
_textStreamReader = new StreamReader(Assembly.GetExecutingAssembly().GetManifestResourceStream("Problem.Explaination.txt"));
var fileContents = _textStreamReader.ReadToEnd();
_textStreamReader.Close();
String[] lines = fileContents.Split("\n"[0]);
String[] lines2;
Int16 count;
foreach (string line in lines)
{
    txtBlock.Text += line;
}
Add the file as a resource and, in your code, load it into a string.
StringBuilder sb = new StringBuilder();
using (var stream = this.GetType().Assembly.GetManifestResourceStream("MyNamespace.TextFile.txt"))
using(var reader = new StreamReader(stream))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        sb.AppendLine(line);
    }
}
ViewModel.Text = sb.ToString();
You could place that text in a text file, and read it out in code
http://msdn.microsoft.com/en-us/library/db5x7c0d.aspx
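If the embedded resource route is used, a small helper keeps the stream handling in one place. This is a sketch with an invented class name; it assumes the text file's Build Action is set to Embedded Resource and that the caller knows the full resource name (normally default namespace plus file name).

```csharp
using System;
using System.IO;
using System.Reflection;

static class ResourceText
{
    // Reads a whole embedded text resource into a string.
    public static string Read(Assembly assembly, string resourceName)
    {
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        {
            // GetManifestResourceStream returns null when the name does not match.
            if (stream == null)
                throw new FileNotFoundException("Embedded resource not found: " + resourceName);
            using (var reader = new StreamReader(stream))
                return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Lists the names actually embedded, which helps when a lookup returns null.
        foreach (string name in Assembly.GetExecutingAssembly().GetManifestResourceNames())
            Console.WriteLine(name);
    }
}
```

In the question's window this would be used as txtBlock.Text = ResourceText.Read(Assembly.GetExecutingAssembly(), "Problem.Explaination.txt");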

Best way to write huge string into a file

In C#, I'm reading a moderate size of file (100 KB ~ 1 MB), modifying some parts of the content, and finally writing to a different file. All contents are text. Modification is done as string objects and string operations. My current approach is:
Read each line from the original file by using StreamReader.
Open a StringBuilder for the contents of the new file.
Modify the string object and call AppendLine of the StringBuilder (until the end of the file)
Open a new StreamWriter, and write the StringBuilder to the write stream.
However, I've found that StreamWriter.Write truncates at 32768 bytes (2^15), but the length of the StringBuilder is greater than that. I could write a simple loop to guarantee the entire string is written to the file. But I'm wondering what would be the most efficient way in C# of doing this task?
To summarize, I'd like to modify only some parts of a text file and write to a different file. But, the text file size could be larger than 32768 bytes.
== Answer == I'm sorry for causing confusion! It was just that I didn't call Flush. StreamWriter.Write does not have a short (e.g., 2^16) limitation.
StreamWriter.Write does not truncate the string and has no limitation.
Internally it uses String.CopyTo, which in turn uses unsafe code (using fixed) to copy chars, so it is very efficient.
The problem is most likely related to not closing the writer. See http://msdn.microsoft.com/en-us/library/system.io.streamwriter.flush.aspx.
But I would suggest not loading the whole file in memory if that can be avoided.
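A quick way to convince yourself that the writer was the problem is to write a string well past 32768 characters and let a using block dispose (and therefore flush) the writer. A minimal sketch, with a temp file and an ASCII encoding chosen only so that the byte count is easy to check:

```csharp
using System;
using System.IO;
using System.Text;

static class FlushDemo
{
    static void Main()
    {
        string path = Path.GetTempFileName();
        string big = new string('x', 100000); // well past 32768 characters

        // Disposing the writer at the end of the using block flushes its buffer to the file.
        using (var writer = new StreamWriter(path, false, Encoding.ASCII))
        {
            writer.Write(big);
        }

        Console.WriteLine(new FileInfo(path).Length); // 100000
        File.Delete(path);
    }
}
```

Without the using block (or an explicit Flush or Close), whatever is still in the writer's internal buffer may never reach the file, which looks like truncation.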
Can you try this:
void Test()
{
    using (var inputFile = File.OpenText(@"c:\in.txt"))
    {
        using (var outputFile = File.CreateText(@"c:\out.txt"))
        {
            string current;
            while ((current = inputFile.ReadLine()) != null)
            {
                outputFile.WriteLine(Process(current));
            }
        }
    }
}
string Process(string current)
{
    return current.ToLower();
}
It avoids having the full file loaded in memory, by processing it line by line and writing each line out directly.
Well, that entirely depends on what you want to modify. If your modifications of one part of the text file are dependent on another part of the text file, you obviously need to have both of those parts in memory. If however, you only need to modify the text file on a line-by-line basis then use something like this :
using (StreamReader sr = new StreamReader(@"test.txt"))
{
    using (StreamWriter sw = new StreamWriter(@"modifiedtest.txt"))
    {
        while (!sr.EndOfStream)
        {
            string line = sr.ReadLine();
            //do some modifications
            sw.WriteLine(line);
            sw.Flush(); //force line to be written to disk
        }
    }
}
Instead of running through the whole document, I would use a regex to find what you are looking for. Sample:
public List<string> GetAllProfiles()
{
    List<string> profileNames = new List<string>();
    using (StreamReader reader = new StreamReader(_folderLocation + "profiles.pg"))
    {
        string profiles = reader.ReadToEnd();
        var regex = new Regex("\nname=([^\r]{0,})", RegexOptions.IgnoreCase);
        var regexMatchs = regex.Matches(profiles);
        profileNames.AddRange(from Match regexMatch in regexMatchs select regexMatch.Groups[1].Value);
    }
    return profileNames;
}
