C# text creation issue - c#

Here is what's going on. I have a huge text file that is supposed to have one entry per line. The issue is that sometimes an entry is broken by a stray newline.
I process the entire file, and wherever a line doesn't begin with ("\"A) I need to append it to the previous line (replacing \n with " "). Everything I come up with keeps putting the line on a new line. Any help is appreciated...
CODE:
public void step1a()
{
    string begins = ("\"A");
    string betaFilePath = @"C:\ext.txt";
    string[] lines = File.ReadAllLines(betaFilePath);
    foreach (string line in lines)
    {
        if (line.StartsWith(begins))
        {
            File.AppendAllText(@"C:\xt2.txt",line);
            File.AppendAllText(@"C:\xt2.txt", "\n");
        }
        else
        {
            string line2 = line.Replace(Environment.NewLine, " ");
            File.AppendAllText(@"C:\xt2.txt",line2);
        }
    }
}
Example:
Orig:
"\"A"Hero|apple|orange|for the fun of this
"\"A"Hero|apple|mango|lots of fun always
"\"A"Her|apple|fruit|no
pain is the way
"\"A"Hero|love|stackoverflowpeople|more fun
Resulting:
"\"A"Hero|apple|orange|for the fun of this
"\"A"Hero|apple|mango|lots of fun always
"\"A"Her|apple|fruit|no pain is the way
"\"A"Hero|love|stackoverflowpeople|more fun
My problem isn't the if (line.StartsWith(begins)) check; it's the else branch, which puts line2 on a new line.

It seems your string is not well formed.
Try "\"\\\"A\"" instead:
public void step1a()
{
    string begins = ("\"\\\"A\"");
    string betaFilePath = @"C:\ext.txt";
    string[] lines = File.ReadAllLines(betaFilePath);
    foreach (string line in lines)
    {
        if (line.StartsWith(begins))
        {
            File.AppendAllText(@"C:\xt2.txt",line);
            File.AppendAllText(@"C:\xt2.txt", "\n");
        }
        else
        {
            string line2 = line.Replace(Environment.NewLine, " ");
            File.AppendAllText(@"C:\xt2.txt",line2);
        }
    }
}

This does what you want:
CopyFileRemovingStrayNewlines(@"C:\ext.txt", @"C:\xt2.txt", @"""\""A");
With this method:
public static void CopyFileRemovingStrayNewlines(string sourcePath, string destinationPath, string linePrefix)
{
string[] lines = File.ReadAllLines(sourcePath);
bool firstLine = true;
foreach (string line in lines)
{
if (line.StartsWith(linePrefix))
{
if (!firstLine)
File.AppendAllText(destinationPath, Environment.NewLine);
else
firstLine = false;
File.AppendAllText(destinationPath, line);
}
else
{
File.AppendAllText(destinationPath, " ");
File.AppendAllText(destinationPath, line);
}
}
}
It does have the problem of appending to an existing file, though. I suggest using a StreamWriter rather than AppendAllText. Like this:
public static void CopyFileRemovingStrayNewlines(string sourcePath, string destinationPath, string linePrefix)
{
string[] lines = File.ReadAllLines(sourcePath);
bool firstLine = true;
using (StreamWriter writer = new StreamWriter(destinationPath, false))
{
foreach (string line in lines)
{
if (line.StartsWith(linePrefix))
{
if (!firstLine)
writer.WriteLine();
else
firstLine = false;
writer.Write(line);
}
else
{
writer.Write(" ");
writer.Write(line);
}
}
}
}

Your problem is that the \ is a C# escape code.
Your string is parsed as "A, because \" is the escape code for a single ".
You should make the begins string an @-string (a verbatim string literal), which does not use escape codes.
You will then need to escape the " by doubling it up.
For example:
const string begins = @"\""A";
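To make the difference concrete, here is a minimal sketch comparing the two literal forms. The class name LiteralDemo and the sample line are made up for illustration; only the two literals come from the question and this answer.

```csharp
using System;

public static class LiteralDemo
{
    // Regular literal: \" is an escape sequence that yields one double quote.
    public const string Regular = "\"A";

    // Verbatim literal: the backslash is kept as-is and "" yields one double quote.
    public const string Verbatim = @"\""A";

    public static void Main()
    {
        Console.WriteLine(Regular.Length);   // 2 characters: " and A
        Console.WriteLine(Verbatim.Length);  // 3 characters: \, " and A

        // Hypothetical line whose first characters are backslash, quote, A.
        string sample = @"\""A|apple|orange";
        Console.WriteLine(sample.StartsWith(Verbatim, StringComparison.Ordinal)); // True
        Console.WriteLine(sample.StartsWith(Regular, StringComparison.Ordinal));  // False
    }
}
```

Which of the two is the right prefix depends on the characters that are really in the file, so it is worth printing the first few characters of a line before choosing.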
Note that the best way to do this is to use a StreamWriter, like this:
using (StreamWriter writer = File.CreateText(@"C:\xt2.txt"))
{
    bool first = true;
    foreach (string line in lines)
    {
        if (!line.StartsWith(begins))
            writer.Write(" ");      // Continuation: join it to the previous line
        else if (!first)
            writer.WriteLine();     // Close the previous line
        writer.Write(line);
        first = false;
    }
}

Based on @SLaks's example, here is some code that should do the trick:
public static void step1a()
{
    const string begins = @"\""A";
    string betaFilePath = @"C:\ext.txt";
    string[] lines = File.ReadAllLines(betaFilePath);
    using (StreamWriter writer = new StreamWriter(File.Create(@"C:\xt2.txt")))
    {
        string buffer = null;
        foreach (string line in lines)
        {
            if (line.StartsWith(begins) || buffer == null)
            {
                // A new entry starts: flush the previous one first
                if (buffer != null)
                    writer.WriteLine(buffer);
                buffer = line;
            }
            else
            {
                // Continuation line: join it to the buffered entry
                buffer += " " + line;
            }
        }
        // Flush the last entry to the file
        if (buffer != null)
            writer.WriteLine(buffer);
    }
}

Related

OutOfMemoryException while trying to read a txt file that is over 2GB

In my code I read the file, extract the text, manipulate it, and write the modified string back to the file. I have not had any problems until now, but the file I had to manipulate today is over 2GB, with more than 1 million lines.
public static void ModifyFile(string directory, string filename)
{
string input = string.Empty;
using (StreamReader reader = new StreamReader(directory + filename))
{
input = reader.ReadToEnd();
}
string output = Manipulate(input);
File.WriteAllText($"{directory}{filename}", String.Empty);
WriteFile(directory, filename, output);
}
private static void WriteFile(string directory, string filename, string output)
{
using (StreamWriter writer = new StreamWriter(directory + filename, true))
{
{
writer.Write(output);
}
writer.Close();
}
}
private static string Manipulate(string input)
{
var counter = 1;
StringBuilder output = new StringBuilder();
string[] subs = input.Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
foreach (var x in subs)
{
if (subs[subs.Length - 1] != x && subs[subs.Length - 2] != x)
{
var column = x.Substring(121, 2);
if (column.Equals("NA"))
{
var c = x.Substring(22, 9);
output.Append(ManipulateStringElement(x, counter, 22)
.Replace("\r\n", "\n").Replace("\r", "\n").Replace("\n", "\r\n"));
output.Append("\n");
counter++;
}
}
else if (subs[subs.Length - 2] == x)
{
output.Append(ManipulateStringElement(x, counter, 22)
.Replace("\r\n", "\n").Replace("\r", "\n").Replace("\n", "\r\n"));
}
}
return output.ToString();
}
private static string ManipulateStringElement(string item, int counter, int start)
{
return item.Replace(item.Substring(start, 9), GenerateProgressive(counter));
}
private static string GenerateProgressive(int counter)
{
return $"{counter}".PadLeft(9, '0');
}
But while running reader.ReadToEnd() I get an "OutOfMemoryException" error, which makes me think the file is too big.
The application targets .NET Framework 4.6.1 and the operating system is 64-bit (I had read that this could have an effect).
You need to do this in a streaming fashion in order to reduce memory consumption.
Open an input and an output file at the same time, and immediately output the result of a single line from Manipulate(). Ensure it ends with your custom newline character.
Finally replace the original file with the new one.
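public static void ModifyFile(string directory, string filename)
{
    string inputFile = Path.Combine(directory, filename);
    string outputFile = Path.Combine(directory, filename + ".new");
    using (var reader = new StreamReader(inputFile))
    using (var writer = new StreamWriter(outputFile, false)) // overwrite any stale .new file
    {
        string input;
        while ((input = reader.ReadLine()) != null)
        {
            // Manipulate must now handle a single line, not the whole file
            string output = Manipulate(input);
            writer.Write(output);
        }
    }
    // File.Move has no overwrite parameter on .NET Framework 4.6.1,
    // so delete the original before renaming the new file
    File.Delete(inputFile);
    File.Move(outputFile, inputFile);
}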
You may also want to do this using async code, which could improve responsiveness.
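As a rough sketch of what the async variant could look like: the class name AsyncLineCopy, the method TransformLinesAsync and the transform delegate are all hypothetical stand-ins, with transform playing the role of a per-line Manipulate. It assumes .NET Framework 4.5 or later, where StreamReader.ReadLineAsync and StreamWriter.WriteLineAsync are available.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncLineCopy
{
    // Reads inputFile one line at a time, applies transform to each line and
    // writes the result to outputFile, so only one line is held in memory.
    public static async Task TransformLinesAsync(
        string inputFile, string outputFile, Func<string, string> transform)
    {
        using (var reader = new StreamReader(inputFile))
        using (var writer = new StreamWriter(outputFile, false))
        {
            string line;
            while ((line = await reader.ReadLineAsync()) != null)
            {
                // WriteLineAsync appends the platform line terminator.
                await writer.WriteLineAsync(transform(line));
            }
        }
    }
}
```

A caller would await it, for example await AsyncLineCopy.TransformLinesAsync(inputFile, outputFile, line => line.TrimEnd());. This mainly keeps a UI or request thread free while the file is processed; it is not expected to make the copy itself faster.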
I note that you are also retrieving the last two lines of the file. I suggest you do this separately, using this answer for example.
There are also other performance improvements you can make. For example:
private static string GenerateProgressive(int counter)
{
return counter.ToString("D9");
}
as well as:
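private static string ManipulateStringElement(string item, int counter, int start)
{
    // Overwrite the 9 characters at position start instead of searching with Replace
    return item.Substring(0, start) + GenerateProgressive(counter) + item.Substring(start + 9);
}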

How do I delete all lines in text file below certain text?

I have code which iterates through the entire text file searching for the specific text "[names]" and tries to delete all the lines below it. I tried File.WriteAllText(INILoc, string.Empty);, but this just deletes everything in the entire text file. How do I make it so that only the lines below "[names]" get deleted?
I have set up the iteration like this:
string[] lines = File.ReadAllLines(INILoc);
bool containsSearchResul = false;
foreach (string line in lines)
{
if (containsSearchResul)
{
File.WriteAllText(INILoc, string.Empty);
}
if (line.Contains("[names]"))
{
containsSearchResul = true;
}
}
You need to store the lines up to the "[names]" text in a string variable, and when the condition (line.Contains("[names]")) is satisfied, break out of the loop and write the stored string to the same file.
Something like:
string[] lines = File.ReadAllLines(INILoc); //Considering INILoc is a string variable which contains file path.
StringBuilder newText = new StringBuilder();
bool containsSearchResul = false;
foreach (string line in lines)
{
newText.Append(line);
newText.Append(Environment.NewLine); //This will add \n after each line so all lines will be well formatted
//Adding line into newText before if condition check will add "name" into file
if (line.Contains("[names]"))
break;
}
File.WriteAllText(INILoc, newText.ToString());
//^^^^^^^ due to string.Empty it was storing empty string into file.
Note: If you are using the StringBuilder class, do not forget to add using System.Text; to your program.
Use StreamReader as it will give you the best performance as you don't need to read the whole file. Swap 'PATH TO INPUT FILE' with your file path and the result will be stored at the path you provide for 'PATH TO OUTPUT FILE'.
using (var sr = new StreamReader("PATH TO INPUT FILE"))
using (var sw = new StreamWriter("PATH TO OUTPUT FILE"))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        sw.WriteLine(line);
        if (line.Contains("[names]"))
        {
            // Stop copying; the using blocks close both files
            break;
        }
    }
}
If you need to write to the same file:
var sb = new StringBuilder();
using (var sr = new StreamReader("PATH TO INPUT FILE"))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        sb.AppendLine(line);
        if (line.Contains("[names]"))
        {
            // Stop reading; the using block closes the file
            break;
        }
    }
}
File.WriteAllText("PATH TO INPUT FILE", sb.ToString());
Based on the requested code I have put together modifications.
string[] lines = File.ReadAllLines(INILoc);
//create a list to hold the lines
List<string> output = new List<string>();
//loop through each line
foreach (string line in lines)
{
    //add current line to output.
    output.Add(line);
    //check to see if our line includes the searched text;
    if (line.Contains("[names]"))
    {
        //output to the file and then exit loop causing all lines below this
        //one to be skipped
        File.WriteAllLines(INILoc, output);
        break;
    }
}
The problem with your code is that it deletes the lines before [names], not after (more exactly, it writes only for the lines after that text). Also, each call rewrites the entire file content and so removes every previously written line. It will work as follows:
string[] lines = File.ReadAllLines(INILoc);
using (StreamWriter writer = new StreamWriter(INILoc)) // https://learn.microsoft.com/en-us/dotnet/standard/io/how-to-write-text-to-a-file
{
    bool containsSearchResul = false;
    foreach (string line in lines)
    {
        if (!containsSearchResul)
        {
            writer.WriteLine(line); // Keep lines up to and including [names]
        }
        if (line.Contains("[names]"))
        {
            containsSearchResul = true;
        }
    }
}
You have another, better option to do this with break:
string[] lines = File.ReadAllLines(INILoc);
using (StreamWriter writer = new StreamWriter(INILoc)) // https://learn.microsoft.com/en-us/dotnet/standard/io/how-to-write-text-to-a-file
{
    foreach (string line in lines)
    {
        writer.WriteLine(line);
        if (line.Contains("[names]"))
        {
            break; // [names] has been written; skip everything below it
        }
    }
}
But you can do this in the preferred, more readable way, by using LINQ:
using System.Linq;
// ...
string[] lines = File.ReadAllLines(INILoc);
// Index of the first line containing [names], or -1 if there is none
int namesIndex = Array.FindIndex(lines, l => l.Contains("[names]"));
string[] linesTillNames = namesIndex < 0
    ? lines                                  // No [names] line: keep everything
    : lines.Take(namesIndex + 1).ToArray();  // Keep up to and including [names]
File.WriteAllLines(INILoc, linesTillNames);
You can also use: WriteAllLines(string path, IEnumerable<string> contents) like this:
string[] lines = File.ReadAllLines(INILoc);
List<string> linesToWrite = new List<string>();
foreach(string line in lines)
{
linesToWrite.Add(line);
if (line.Contains("[names]")) break;
}
File.WriteAllLines(INILoc, linesToWrite);

C# detect quotes in a text file

I try to detect quotes in a loaded text file but it is not working. I have tried with '"' and '\"' without success. Any suggestion? thanks
void read()
{
txt = File.ReadAllText("txt/txttst");
for(int i=0;i<txt.Length;i++)
{
if(txt[i]=='"')
{
Debug.Log("Quotes at "+i);
}
}
}
How about this
string[] lines = File.ReadAllLines(@"txt/txttst");
for (int i=0;i<lines.Length;i++)
{
string line = lines[i];
// ASCII Code of Quotes is 34
var bytes = Encoding.UTF8.GetBytes(line.ToCharArray()).ToList();
if(bytes.Count(b=> b.ToString()=="34")>0)
Console.WriteLine("\"" + "at line " + (i + 1));
}
This is how you can do it; please see the code below. Hope it helps.
namespace TestConsoleApp
{
class Program
{
static void Main(string[] args)
{
string txt = File.ReadAllText(@"C:\Users\Public\TestFolder\test.txt");
string[] lines = File.ReadAllLines(@"C:\Users\Public\TestFolder\test.txt");
var reg = new Regex("\"");
Console.WriteLine("Contents of test.txt are; ");
foreach (string line in lines)
{
Console.WriteLine(line);
var matches = reg.Matches(line);
foreach (var item in matches)
{
Console.WriteLine("Quotes at "+ ((System.Text.RegularExpressions.Capture)item).Index);
}
}
}
}
}
Ok I found the problem, my text editor did a subtle auto-correct from " to “ . Cheers.
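For anyone hitting the same thing, here is a small sketch that treats the typographic double quotes as quotes too. The class QuoteFinder and the method FindQuoteIndexes are made-up names, and the set of characters checked (U+0022, U+201C, U+201D, U+201E) is only an assumption about which variants an editor might substitute.

```csharp
using System;
using System.Collections.Generic;

public static class QuoteFinder
{
    // Straight quote, left and right curly quotes, and the low-9 double quote.
    private static readonly char[] QuoteChars = { '"', '\u201C', '\u201D', '\u201E' };

    // Returns the index of every character in text that is one of QuoteChars.
    public static List<int> FindQuoteIndexes(string text)
    {
        var indexes = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            if (Array.IndexOf(QuoteChars, text[i]) >= 0)
                indexes.Add(i);
        }
        return indexes;
    }
}
```

Calling QuoteFinder.FindQuoteIndexes(txt) on the loaded text would then report the auto-corrected “ and ” as well as the plain ".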

Read file, check correctness of column, write file C#

I need to check certain columns of data to make sure there are no trailing blank spaces. At first thought I thought it would be very easy, but after attempting to achieve the goal I have got stuck.
I know that there should be 6 digits in the column I need to check. If there are fewer I will reject the row; if there are more characters I will trim the blank spaces. After doing that for the entire file, I want to write it back to the file with the same delimiters.
This is my attempt:
Everything seems to be working correctly except for writing the file.
if (File.Exists(filename))
{
using (StreamReader sr = new StreamReader(filename))
{
string lines = sr.ReadLine();
string[] delimit = lines.Split('|');
while (delimit[count] != "COLUMN_DATA_TO_CHANGE")
{
count++;
}
string[] allLines = File.ReadAllLines(@filename);
foreach(string nextLine in allLines.Skip(1)){
string[] tempLine = nextLine.Split('|');
if (tempLine[count].Length == 6)
{
checkColumn(tempLine);
writeFile(tempLine);
}
else if (tempLine[count].Length > 6)
{
tempLine[count] = tempLine[count].Trim();
checkColumn(tempLine);
}
else
{
throw new Exception("Not enough numbers");
}
}
}
}
}
public static void checkColumn(string[] str)
{
for (int i = 0; i < str[count].Length; i++)
{
char[] c = str[count].ToCharArray();
if (!Char.IsDigit(c[i]))
{
throw new Exception("A non-digit is contained in data");
}
}
}
public static void writeFile(string[] str)
{
string temp;
using (StreamWriter sw = new StreamWriter(filename+ "_tmp", false))
{
StringBuilder builder = new StringBuilder();
bool firstColumn = true;
foreach (string value in str)
{
if (!firstColumn)
{
builder.Append('|');
}
if (value.IndexOfAny(new char[] { '"', ',' }) != -1)
{
builder.AppendFormat("\"{0}\"", value.Replace("\"", "\"\""));
}
else
{
builder.Append(value);
}
firstColumn = false;
}
temp = builder.ToString();
sw.WriteLine(temp);
}
}
If there is a better way to go about this, I would love to hear it. Thank you for looking at the question.
edit:
file structure-
country| firstname| lastname| uniqueID (column I am checking)| address| etc
USA|John|Doe|123456 |5 main street|
notice the blank space after the 6
var oldLines = File.ReadAllLines(filePath);
var newLines = oldLines.Select(FixLine).ToArray();
File.WriteAllLines(filePath, newLines);
string FixLine(string oldLine)
{
string fixedLine = ....
return fixedLine;
}
The main problem with writing the file is that you're opening the output file for each output line, and you're opening it with append=false, which causes the file to be overwritten every time. A better approach would be to open the output file one time (probably right after validating the input file header).
Another problem is that you're opening the input file a second time with .ReadAllLines(). It would be better to read the existing file one line at a time in a loop.
Consider this modification:
using (StreamWriter sw = new StreamWriter(filename+ "_tmp", false))
{
string nextLine;
while ((nextLine = sr.ReadLine()) != null)
{
string[] tempLine = nextLine.Split('|');
...
writeFile(sw, tempLine);
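Since the modification above is only a fragment, here is one way the whole idea could be put together, as a sketch under assumptions rather than a drop-in replacement. The names ColumnFixer and FixColumn are hypothetical, header names are trimmed before comparison (the question's header has spaces after each |), and the result goes to filename + "_tmp" as in the question.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ColumnFixer
{
    // Trims the named column on every data row, checks that it is exactly
    // six digits, and writes all rows to filename + "_tmp".
    public static void FixColumn(string filename, string columnName)
    {
        using (var sr = new StreamReader(filename))
        using (var sw = new StreamWriter(filename + "_tmp", false)) // opened once, not per row
        {
            string header = sr.ReadLine();
            if (header == null)
                throw new InvalidDataException("File is empty");

            // Locate the column by its (trimmed) header name.
            string[] names = header.Split('|').Select(h => h.Trim()).ToArray();
            int index = Array.IndexOf(names, columnName);
            if (index < 0)
                throw new InvalidDataException("Column not found: " + columnName);
            sw.WriteLine(header);

            string nextLine;
            while ((nextLine = sr.ReadLine()) != null)
            {
                string[] fields = nextLine.Split('|');
                if (index >= fields.Length)
                    throw new InvalidDataException("Row has too few columns: " + nextLine);

                fields[index] = fields[index].Trim();
                if (fields[index].Length != 6 || !fields[index].All(char.IsDigit))
                    throw new InvalidDataException("Expected six digits: " + nextLine);

                // Re-join with the same delimiter; empty trailing fields are preserved.
                sw.WriteLine(string.Join("|", fields));
            }
        }
    }
}
```

Opening the writer once and streaming the rows avoids both problems described above: the output is no longer overwritten per row, and the input is read only once.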

Array element is maybe not set

In the method below I set the string smdrext to tmp[3]. However, tmp[3] sometimes seems not to exist, because I get "Index was outside the bounds of the array.". Before I set it, can I check that it really exists, to make sure the program does not halt again because of this?
public void WriteToCSV(string line, string path)
{
string[] tmp = line.Split(',');
string smdrext = tmp[3];
if (ext.Contains(Convert.ToString(smdrext)))
{
File.AppendAllText(path, line + "\n");
}
}
Please try with the below code snippet.
public void WriteToCSV(string line, string path)
{
if (!string.IsNullOrEmpty(line))
{
string[] tmp = line.Split(',');
if (tmp.Length > 3)
{
string smdrext = tmp[3];
if (ext.Contains(Convert.ToString(smdrext)))
{
File.AppendAllText(path, line + "\n");
}
}
}
}
Let me know if any concern.
