Alternative to File.AppendAllText for newline - c#

I am trying to read characters from a file and then append them in another file after removing the comments (which are followed by semicolon).
sample data from parent file:
Name- Harly Brown ;Name is Harley Brown
Age- 20 ;Age is 20 years
Desired result:
Name- Harley Brown
Age- 20
I am trying the following code-
StreamReader infile = new StreamReader(floc + "G" + line + ".NC0");
while (infile.Peek() != -1)
{
letter = Convert.ToChar(infile.Read());
if (letter == ';')
{
infile.ReadLine();
}
else
{
System.IO.File.AppendAllText(path, Convert.ToString(letter));
}
}
But the output i am getting is-
Name- Harley Brown Age-20
Its because AppendAllText is not working for the newline. Is there any alternative?

Sure, why not use File.AppendAllLines. See documentation here.
Appends lines to a file, and then closes the file. If the specified file does not exist, this method creates a file, writes the specified lines to the file, and then closes the file.
It takes in any IEnumerable<string> and adds every line to the specified file. So it always adds the line on a new line.
Small example:
const string originalFile = #"D:\Temp\file.txt";
const string newFile = #"D:\Temp\newFile.txt";
// Retrieve all lines from the file.
string[] linesFromFile = File.ReadAllLines(originalFile);
List<string> linesToAppend = new List<string>();
foreach (string line in linesFromFile)
{
// 1. Split the line at the semicolon.
// 2. Take the first index, because the first part is your required result.
// 3. Trim the trailing and leading spaces.
string appendAbleLine = line.Split(';').FirstOrDefault().Trim();
// Add the line to the list of lines to append.
linesToAppend.Add(appendAbleLine);
}
// Append all lines to the file.
File.AppendAllLines(newFile, linesToAppend);
Output:
Name- Harley Brown
Age- 20
You could even change the foreach-loop into a LINQ-expression, if you prefer LINQ:
List<string> linesToAppend = linesFromFile.Select(line => line.Split(';').FirstOrDefault().Trim()).ToList();

Why use char by char comparison when .NET Framework is full of useful string manipulation functions?
Also, don't use a file write function multiple times when you can use it only one time, it's time and resources consuming!
StreamReader stream = new StreamReader("file1.txt");
string str = "";
while ((string line = infile.ReadLine()) != null) { // Get every line of the file.
line = line.Split(';')[0].Trim(); // Remove comment (right part of ;) and useless white characters.
str += line + "\n"; // Add it to our final file contents.
}
File.WriteAllText("file2.txt", str); // Write it to the new file.

You could do this with LINQ, System.File.ReadLines(string), and System.File.WriteAllLines(string, IEnumerable<string>). You could also use System.File.AppendAllLines(string, IEnumerable<string>) in a find-and-replace fashion if that was, in fact, the functionality you were going for. The difference, as the names suggest, is whether it writes everything out as a new file or if it just appends to an existing one.
System.IO.File.WriteAllLines(newPath, System.IO.File.ReadLines(oldPath).Select(c =>
{
int semicolon = c.IndexOf(';');
if (semicolon > -1)
return c.Remove(semicolon);
else
return c;
}));
In case you aren't super familiar with LINQ syntax, the idea here is to loop through each line in the file, and if it contains a semicolon (that is, IndexOf returns something that is over -1) we cut that off, and otherwise, we just return the string. Then we write all of those to the file. The StreamReader equivalent to this would be:
using (StreamReader reader = new StreamReader(oldPath))
using (StreamWriter writer = new StreamWriter(newPath))
{
string line;
while ((line = reader.ReadLine()) != null)
{
int semicolon = line.IndexOf(';');
if (semicolon > -1)
line = c.Remove(semicolon);
writer.WriteLine(line);
}
}
Although, of course, this would feed an extra empty line at the end and the LINQ version wouldn't (as far as I know, it occurs to me that I'm not one hundred percent sure on that, but if someone reading this does know I would appreciate a comment).
Another important thing to note, just looking at your original file, you might want to add in some Trim calls, since it looks like you can have spaces before your semicolons, and I don't imagine you want those copied through.

Related

How Can I Read a Multiline Field from a CSV Without Altering It?

I have a CSV that looks like this. My goal is to extract each entry (notice I said entry, not line), where an entry starts from the first column and stretches to the last column, and may span multiple lines. I'd like to extract an entry without ruining the formatting. For example, I do not want the following to be considered four seperate lines,
Eg. 1, One Column Multiple Lines
...,"1. copy ctor
2. copy ctor
3. declares function
4. default ctor",... // Where ... represents the columns before and after
but rather a column in one entry that can be represented as such
Eg. 2, One Column Single Line
"1. copy ctor\n2.copy ctor\ndeclares function\n4.default ctor"
When I iterate over the CSV, as such, I get Eg. 1. I'm not sure why splitting on a comma is treating a new line as a comma.
using (var streamReader = new StreamReader("results-survey111101.csv"))
{
string line;
while ((line = streamReader.ReadLine()) != null)
{
string[] splitLine = line.Split(',');
foreach (var column in splitLine)
Console.WriteLine(column);
}
}
If someone can show me what I need to do to get these multi line CSV columns into one line that maintains the formatting (e.g. adds \t or \n where necessary) that would be great. Thanks!
Assuming your source file is valid CSV, variability in the data is really hard to account for. That's all I'll say, but I'll link you to another SO answer if you need convincing that writing your own CSV parser is a horrible task. Reading CSV files using C#
Let's assume you are going to take advantage of an existing CSV reader library. I'll use TextFieldParser from the Microsoft.VisualBasic library as is used in the example answer I linked.
Your task is to read your source file line by line, and validate whether the line is a complete CSV entry on it's own, or if it forms part of a broken line.
If it forms part of a broken line, we need to remember the line and add the next line to it before attempting validation again.
For this we need to know one thing:
What is the expected number of fields each data entry row should have?
int expectedFieldCount = 7;
string brokenLine = "";
using (var streamReader = new StreamReader("results-survey111101.csv"))
{
string line;
while ((line = streamReader.ReadLine()) != null) // read the next line
{
// if the previous line was incomplete, add it to the current line,
// otherwise use the current line
string csvLineData = (brokenLine.Length > 0) ? brokenLine + line : line;
try
{
using (StringReader stringReader = new StringReader(csvLineData ))
using (TextFieldParser parser = new TextFieldParser(stringReader))
{
parser.SetDelimiters(",");
while (!parser.EndOfData)
{
string[] fields = parser.ReadFields(); // tests if the line is valid csv
if (expectedFieldCount == fields.Length)
{
// do whatever you want with the fields now.
foreach (var field in fields)
{
Console.WriteLine(field);
}
brokenLine = ""; // reset the brokenLine
}
else // it was valid csv, but we don't have the required number of fields yet
{
brokenLine += line + #"\r\n";
break;
}
}
}
}
catch (Exception ex) // the current line is NOT valid csv, update brokenLine
{
brokenLine += (line + #"\r\n");
}
}
}
I am replacing the line breaks that broken lines contain with \r\n literals. You can display these in your resulting one-liner field however you want. But you shouldn't expect to be able to copy paste the result into notepad and see line breaks.
One assumes you have the same number of columns in each record. Therefore in your code where you do your Split you can merely sum the length of splitLine into a running columnsReadCount until they equal the desired columnsPerRecordCount. At that point you have read all the record and can reset the running columnsReadCount back to zero ready for the next record to read.

Read second line and save it from txt C#

What I have to do is read only the second line in a .txt file and save it as a string, to use later in the code.
The file name is "SourceSetting". In line 1 and 2 I have some words
For line 1, I have this code:
string Location;
StreamReader reader = new StreamReader("SourceSettings.txt");
{
Location = reader.ReadLine();
}
ofd.InitialDirectory = Location;
And that works out great but how do I make it so that it only reads the second line so I can save it as for example:
string Text
You can skip the first line by doing nothing with it, so call ReadLine twice:
string secondLine:
using(var reader = new StreamReader("SourceSettings.txt"))
{
reader.ReadLine(); // skip
secondLine = reader.ReadLine();
}
Another way is the File class that has handy methods like ReadLines:
string secondLine = File.ReadLines("SourceSettings.txt").ElementAtOrDefault(1);
Since ReadLines also uses a stream the whole file must not be loaded into memory first to process it. Enumerable.ElementAtOrDefault will only take the second line and don't process more lines. If there are less than two lines the result is null.
Update I'd advice to go with Tim Schmelter solution.
When you call ReadLine - it moves the carret to next line. So on second call you'll read 2nd line.
string Location;
using(var reader = new StreamReader("SourceSettings.txt"))
{
Location = reader.ReadLine(); // this call will move caret to the begining of 2nd line.
Text = reader.ReadLine(); //this call will read 2nd line from the file
}
ofd.InitialDirectory = Location;
Don't forget about using.
Or an example how to do this vi ReadLines of File class if you need just one line from file. But solution with ElementAtOrDefault is the best one as Tim Schmelter points.
var Text = File.ReadLines(#"C:\Projects\info.txt").Skip(1).First()
The ReadLines and ReadAllLines methods differ as follows: When you use
ReadLines, you can start enumerating the collection of strings before
the whole collection is returned; when you use ReadAllLines, you must
wait for the whole array of strings be returned before you can access
the array. Therefore, when you are working with very large files,
ReadLines can be more efficient.
So it doesn't read all lines into memory in comparison with ReadAllLines.
The line could be read using Linq as follows.
var SecondLine = File.ReadAllLines("SourceSettings.txt").Skip(1).FirstOrDefault();
private string GetLine(string filePath, int line)
{
using (var sr = new StreamReader(filePath))
{
for (int i = 1; i < line; i++)
sr.ReadLine();
return sr.ReadLine();
}
}
Hope this will help :)
If you know that your second line is unique, because it contains a specific keyword that does not appear anywhere else in your file, you also could use linq, the benefit is that the "second" line could be any line in future.
var myLine = File.ReadLines("SourceSettings.txt")
.Where(line => line.Contains("The Keyword"))
.ToList();

Read and extract from file

I have a huge file with ~3 mill rows. Every line contains record like this:
1|2|3|4|5|6|7|8|9
Exactly 8 separators like '|' on every line. I am looking for a way to read this file then extract last '9' number only from every line and store it into another file.
edit:
Ok here is what i done already.
using (StreamReader sr = new StreamReader(filepath))
using (StreamWriter sw = new StreamWriter(filepath1))
{
string line = null;
while ((line = sr.ReadLine()) != null)
sw.WriteLine(line.Split('|')[8]);
}
File.WriteAllLines("filepath", File.ReadAllLines(filepath).Where(l => !string.IsNullOrWhiteSpace(l)));
Read file, extract last digits then write in new file and clear blank lines. Last digit is 10-15 symbols and I want to extract first 6. I continue to read and try some and when I'm done or have some question I'll edit again.
Thanks
Edit 2:
Ok, here I take first 8 digits from the number:
sw.WriteLine(line.Substring(0, Math.Min(line.Length, 8)));
Edit 3:
I have no idea how can I match now every numbers that left in file. I want to match them and to see witch number how many times is in the file.
Any help?
I am looking for a way to read this file then extract last [..] number only from every line and store it into another file.
What part exactly are you having trouble with? In psuedo code, this is what you want:
fileReader = OpenFile("input")
fileWriter = OpenFile("output")
while !fileReader.EndOfFile
line = fileReader.ReadLine
records[] = line.Split('|')
value = records[8]
fileWriter.WriteLine(value)
do
So start implementing it and feel free to ask a question on any specific line you're having trouble with. Each line of code I posted contains enough pointers to figure out the C# code or the terms to do a web search for it.
You don't say where you are stuck. Break the problem down:
Write and run minimal C# program
Read lines from file
Break up one line
write result line to a file
Are you stuck on any one of those? Then ask a specific question about that. This decomposition technique is key to many programming tasks, and indeed complex tasks in general.
You might find the string split capability useful.
Because it's a huge file you must read it line by line!
public IEnumerable ReadFileIterator(String filePath)
{
using (StreamReader streamReader = new StreamReader(filePath, Encoding.Default))
{
String line;
while ((line = streamReader.ReadLine()) != null)
{
yield return line;
}
yield break;
}
}
public void WriteToFile(String inputFilePath, String outputFilePath)
{
using (StreamWriter streamWriter = new StreamWriter(outputFilePath, true, Encoding.Default))
{
foreach (String line in ReadFileIterator(inputFilePath))
{
String[] subStrings = line.Split('|');
streamWriter.WriteLine(subStrings[8]);
}
streamWriter.Flush();
streamWriter.Close();
}
}
using (StreamReader sr = new StreamReader("input"))
using (StreamWriter sw = new StreamWriter("output"))
{
string line = null;
while ((line=sr.ReadLine())!=null)
sw.WriteLine(line.Split('|')[8]);
}
Some pointer to start from: StreamReader.Readline() and String.Split(). There are examples on both pages.
With LINQ you could do a thing like the following to filter the numbers:
var numbers = from l in File.ReadLines(fileName)
let p = l.Split('|')
select p[8];
and then write them into a new file like that:
File.WriteAllText(newFileName, String.Join("\r\n", numbers));
Use String.Split() to get the line inside an array and get the last element and store it into another file. Repeat the process for each line.
Try this...
// Read the file and display it line by line.
System.IO.StreamReader file =
new System.IO.StreamReader("c:\\test.txt");
while((line = file.ReadLine()) != null)
{
string[] words = s.Split('|');
string value = words [8]
Console.WriteLine (value);
}
file.Close();

Assign line in text file to a string

I'm making a simple text adventure in C# and I was wondering if it was possible to read certain lines from a .txt file and assign them to a string.
I am aware of how to read all the text from a .txt file but how exactly would I assign the contents of certain lines to a string?
Have you considered the ReadAllLines method?
It returns an array of lines from which you can choose your desired line.
So for eg, if you wish to choose the 3rd line (Assuming you have 3 lines in the file):
string[] lines = File.ReadAllLines(path);
string myThirdLine= lines[2];
Probably the easiest (and cheapest in terms of memory consumption) is File.ReadLines:
String stringAtLine10 = File.ReadLines(path).ElementAtOrDefault(9);
Note that it is null if there are less than 10 lines in the file. See: ElementAtOrDefault.
It's just the concise version of a StreamReader and a counter variable which increases on every line.
As an advanced alternative: ReadLines plus some LINQ:
var lines = File.ReadLines(myFilePath).Where(MyCondition).ToArray();
where MyCondition:
bool MyCondition(string line)
{
if (line == "something")
{
return true;
}
return false;
}
In case you don't want to load all lines atonce
using(StreamReader reader=new StreamReader(path))
{
String line;
while((line=reader.ReadLine())!=null)//process temp
}
Here's a example how you can assign the lines to a string, you can't decide which line is which via fields, you have to select them yourself.
which is the line of the string you want to assign.
For example, you want line one, you define which as one and not zero, you want line eight, you define which with eight.
string getWord(int which)
{
string readed = "";
using (Systen.IO.StreamReader read = new System.IO.StreamReader("PATH HERE"))
{
readed = read.ReadToEnd();
}
string[] toReturn = readed.Split('\n');
return toReturn[which - 1];
}

is there any way to ignore reading in certain lines in a text file?

I'm trying to read in a text file in a c# application, but I don't want to read the first two lines, or the last line. There's 8 lines in the file, so effectivly I just want to read in lines, 3, 4, 5, 6 and 7.
Is there any way to do this?
example file
_USE [Shelley's Other Database]
CREATE TABLE db.exmpcustomers(
fName varchar(100) NULL,
lName varchar(100) NULL,
dateOfBirth date NULL,
houseNumber int NULL,
streetName varchar(100) NULL
) ON [PRIMARY]_
EDIT
Okay, so, I've implemented Callum Rogers answer into my code and for some reason it works with my edited text file (I created a text file with the lines I didn't want to use omitted) and it does exactly what it should, but whenever I try it with the original text file (above) it throws an exception. I display this information in a DataGrid and I think that's where the exception is being thrown.
Any ideas?
The Answer by Rogers is good, I am just providing another way of doing this.
Try this,
List<string> list = new List<string>();
using (StreamReader reader = new StreamReader(FilePath))
{
string text = "";
while ((text = reader.ReadLine()) != null)
{
list.Add(text);
}
list.RemoveAt(0);
list.RemoveAt(0);
}
Hope this helps
Why do you want to ignore exactly the first two and the last line?
Depending on what your file looks like you might want to analyze the line, e.g. look at the first character whether it is a comment sign, or ignore everything until you find the first empty line, etc.
Sometimes, hardcoding "magic" numbers isn't such a good idea. What if the file format needs to be changed to contain 3 header lines?
As the other answers demonstrate: Nothing keeps you from doing what you ever want with a line you have read, so of course, you can ignore it, too.
Edit, now that you've provided an example of your file: For your case I'd definitely not use the hardcoded numbers approach. What if some day the SQL statement should contain another field, or if it appears on one instead of 8 lines?
My suggestion: Read in the whole string at once, then analyze it. Safest way would be to use a grammar, but if you presume the SQL statement is never going to be more complicated, you can use a regular expression (still much better than using line numbers etc.):
string content = File.ReadAllText(filename);
Regex r = new Regex(#"CREATE TABLE [^\(]+\((.*)\) ON");
string whatYouWant = r.Match(content).Groups[0].Value;
Why not just use File.ReadAllLines() and then remove the first 2 lines and the last line? With such a small file speed differences will not be noticeable.
string[] allLines = File.ReadAllLines("file.ext");
string[] linesWanted = new string[allLines.Length-3];
Array.Copy(allLines, 2, linesWanted, 0, allLines.Length-3);
If you have a TextReader object wrapping the filestream you could just call ReadLine() two times.
StreamReader inherits from TextReader, which is abstract.
Non-fool proof example:
using (var fs = new FileStream("blah", FileMode.Open))
using (var reader = new StreamReader(fs))
{
reader.ReadLine();
reader.ReadLine();
// Do stuff.
}
string filepath = #"C:\whatever.txt";
using (StreamReader rdr = new StreamReader(filepath))
{
rdr.ReadLine(); // ignore 1st line
rdr.ReadLine(); // ignore 2nd line
string fileContents = "";
while (true)
{
string line = rdr.ReadLine();
if (rdr.EndOfStream)
break; // finish without processing last line
fileContents += line + #"\r\n";
}
Console.WriteLine(fileContents);
}
How about a general solution?
To me, the first step is to enumerate over the lines of a file (already provided by ReadAllLines, but that has a performance cost due to populating an entire string[] array; there's also ReadLines, but that's only available as of .NET 4.0).
Implementing this is pretty trivial:
public static IEnumerable<string> EnumerateLines(this FileInfo file)
{
using (var reader = file.OpenText())
{
while (!reader.EndOfStream)
{
yield return reader.ReadLine();
}
}
}
The next step is to simply skip the first two lines of this enumerable sequence. This is straightforward using the Skip extension method.
The last step is to ignore the last line of the enumerable sequence. Here's one way you could implement this:
public static IEnumerable<T> IgnoreLast<T>(this IEnumerable<T> source, int ignoreCount)
{
if (ignoreCount < 0)
{
throw new ArgumentOutOfRangeException("ignoreCount");
}
var buffer = new Queue<T>();
foreach (T value in source)
{
if (buffer.Count < ignoreCount)
{
buffer.Enqueue(value);
continue;
}
T buffered = buffer.Dequeue();
buffer.Enqueue(value);
yield return buffered;
}
}
OK, then. Putting it all together, we have:
var file = new FileInfo(#"path\to\file.txt");
var lines = file.EnumerateLines().Skip(2).IgnoreLast(1);
Test input (contents of file):
This is line number 1.
This is line number 2.
This is line number 3.
This is line number 4.
This is line number 5.
This is line number 6.
This is line number 7.
This is line number 8.
This is line number 9.
This is line number 10.
Output (of Skip(2).IgnoreLast(1)):
This is line number 3.
This is line number 4.
This is line number 5.
This is line number 6.
This is line number 7.
This is line number 8.
This is line number 9.
You can do this:
var valid = new int[] { 3, 4, 5, 6, 7 };
var lines = File.ReadAllLines("file.txt").
Where((line, index) => valid.Contains(index + 1));
Or the opposite:
var invalid = new int[] { 1, 2, 8 };
var lines = File.ReadAllLines("file.txt").
Where((line, index) => !invalid.Contains(index + 1));
If you're looking for a general way to remove the last and the first 2, you can use this:
var allLines = File.ReadAllLines("file.txt");
var lines = allLines
.Take(allLines.Length - 1)
.Skip(2);
But from your example it seems that you're better off looking for the string pattern that you want to read from the file. Try using regexes.

Categories

Resources