Seeking a line in a text file - C#

I need some assistance. I am writing a method that reads a text file, and if any exception occurs I append a marker line to the file, e.g. "**".
What I need to know is how to check for that specific line in the file without reading every line, with something like a peek method.
Any help would be appreciated.
Thanks in advance.

You can use File.ReadLines in combination with Any:
bool isExcFile = System.IO.File.ReadLines(path).Any(l => l == "**");
The ReadLines and ReadAllLines methods differ as follows: When you use
ReadLines, you can start enumerating the collection of strings before
the whole collection is returned; when you use ReadAllLines, you must
wait for the whole array of strings be returned before you can access
the array. Therefore, when you are working with very large files,
ReadLines can be more efficient.
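A minimal sketch of that check wrapped in a reusable helper. The names MarkerCheck and ContainsLine are hypothetical and not part of the original answer. Because Any stops at the first match, the remainder of the file is not read once the marker is found, although a file without the marker is still read to the end.

```csharp
using System;
using System.IO;
using System.Linq;

public static class MarkerCheck
{
    // Hypothetical helper: returns true as soon as a line equal to
    // 'marker' is encountered. File.ReadLines enumerates lazily, so
    // lines after the first match are never read.
    public static bool ContainsLine(string path, string marker)
    {
        return File.ReadLines(path).Any(line => line == marker);
    }
}
```

A call such as MarkerCheck.ContainsLine(path, "**") would then replace the inline expression.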

I have found a solution: the line I append is always the last line in the file, so I created a method to read the last line. See below:
public string ReadLastLine(string path)
{
    string returnValue = "";
    // The using block closes the stream on every exit path.
    using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
    {
        // Walk backwards from the end until a newline is found.
        // pos >= 0 so that a single-line file is returned in full.
        // Note: this assumes the file does not end with a trailing newline.
        for (long pos = fs.Length - 2; pos >= 0; --pos)
        {
            fs.Seek(pos, SeekOrigin.Begin);
            StreamReader ts = new StreamReader(fs);
            returnValue = ts.ReadToEnd();
            int eol = returnValue.IndexOf("\n");
            if (eol >= 0)
            {
                return returnValue.Substring(eol + 1);
            }
        }
    }
    return returnValue;
}

You will need to maintain a separate file with indexes (such as a comma-delimited list) of where your special markers are. Then you can read only those indexes and use the Seek method to jump to that point in the FileStream.
If your file is relatively small, let's say <50MB, this is overkill. Above that size you can consider maintaining the index file. You basically have to weigh the cost of an extra IO call (reading the index file) against that of simply reading each line from the FileStream.

From what I understand you want to process some files and after the processing find out which files contain the "**" symbol, without reading every line of the file.
If you append the "**" to the end of the file you could do something like:
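using (StreamReader sr = File.OpenText(fileName))
{
    // Assumes the file ends with "**" plus a single '\n' (3 bytes);
    // with Windows "\r\n" endings, seek -4 and compare against "**\r\n".
    sr.BaseStream.Seek(-3, SeekOrigin.End);
    string endToken = sr.ReadToEnd();
    if (endToken == "**\n")
    {
        // if needed, go back to start of file:
        sr.BaseStream.Seek(0, SeekOrigin.Begin);
        sr.DiscardBufferedData(); // drop the reader's buffered data after seeking
        // do something with the file
    }
}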
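A sketch of a more tolerant end-of-file check, for cases where the line ending is not known in advance. TailCheck, EndsWithMarker and the maxBytes default are hypothetical names and values chosen for illustration. It assumes the marker is plain ASCII and that maxBytes is comfortably larger than the marker plus its line break.

```csharp
using System;
using System.IO;
using System.Text;

public static class TailCheck
{
    // Hypothetical helper: reads at most maxBytes from the end of the file
    // and reports whether the last line, ignoring trailing CR/LF
    // characters, is exactly 'marker'.
    public static bool EndsWithMarker(string path, string marker, int maxBytes = 64)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int count = (int)Math.Min((long)maxBytes, fs.Length);
            fs.Seek(-count, SeekOrigin.End);

            byte[] buffer = new byte[count];
            int read = 0;
            while (read < count)
            {
                int n = fs.Read(buffer, read, count - read);
                if (n == 0) break; // unexpected end of file
                read += n;
            }

            // Assumption: the marker is ASCII, so decoding the tail as ASCII is safe.
            string tail = Encoding.ASCII.GetString(buffer, 0, read).TrimEnd('\r', '\n');
            if (!tail.EndsWith(marker, StringComparison.Ordinal)) return false;

            // The marker must be the whole last line: either the tail is the
            // marker alone, or the character just before it is a newline.
            int before = tail.Length - marker.Length - 1;
            return before < 0 || tail[before] == '\n';
        }
    }
}
```

Only the last few bytes are read, so the cost does not grow with the size of the file.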

Related

Read second line and save it from txt C#

What I have to do is read only the second line in a .txt file and save it as a string, to use later in the code.
The file name is "SourceSetting". In line 1 and 2 I have some words
For line 1, I have this code:
string Location;
StreamReader reader = new StreamReader("SourceSettings.txt");
{
Location = reader.ReadLine();
}
ofd.InitialDirectory = Location;
And that works out great but how do I make it so that it only reads the second line so I can save it as for example:
string Text
You can skip the first line by doing nothing with it, so call ReadLine twice:
string secondLine;
using(var reader = new StreamReader("SourceSettings.txt"))
{
reader.ReadLine(); // skip
secondLine = reader.ReadLine();
}
Another way is the File class that has handy methods like ReadLines:
string secondLine = File.ReadLines("SourceSettings.txt").ElementAtOrDefault(1);
Since ReadLines also uses a stream, the whole file does not have to be loaded into memory first to process it. Enumerable.ElementAtOrDefault takes only the second line and does not process any more lines. If there are fewer than two lines, the result is null.
Update: I'd advise going with Tim Schmelter's solution.
When you call ReadLine, it moves the caret to the next line, so the second call reads the 2nd line.
string Location;
string Text;
using (var reader = new StreamReader("SourceSettings.txt"))
{
    Location = reader.ReadLine(); // this call moves the caret to the beginning of the 2nd line
    Text = reader.ReadLine(); // this call reads the 2nd line from the file
}
ofd.InitialDirectory = Location;
Don't forget about using.
Or here is an example of how to do this via ReadLines of the File class if you need just one line from the file. But the solution with ElementAtOrDefault is the best one, as Tim Schmelter points out.
var Text = File.ReadLines(@"C:\Projects\info.txt").Skip(1).First();
The ReadLines and ReadAllLines methods differ as follows: When you use
ReadLines, you can start enumerating the collection of strings before
the whole collection is returned; when you use ReadAllLines, you must
wait for the whole array of strings be returned before you can access
the array. Therefore, when you are working with very large files,
ReadLines can be more efficient.
So it doesn't read all lines into memory in comparison with ReadAllLines.
The line could be read using Linq as follows.
var SecondLine = File.ReadAllLines("SourceSettings.txt").Skip(1).FirstOrDefault();
private string GetLine(string filePath, int line)
{
using (var sr = new StreamReader(filePath))
{
for (int i = 1; i < line; i++)
sr.ReadLine();
return sr.ReadLine();
}
}
Hope this will help :)
If you know that your second line is unique because it contains a specific keyword that does not appear anywhere else in your file, you could also use LINQ. The benefit is that the "second" line could be any line in the future.
var myLine = File.ReadLines("SourceSettings.txt")
.Where(line => line.Contains("The Keyword"))
.ToList();

Alternative to File.AppendAllText for newline

I am trying to read characters from a file and then append them in another file after removing the comments (which are followed by semicolon).
sample data from parent file:
Name- Harley Brown ;Name is Harley Brown
Age- 20 ;Age is 20 years
Desired result:
Name- Harley Brown
Age- 20
I am trying the following code-
StreamReader infile = new StreamReader(floc + "G" + line + ".NC0");
while (infile.Peek() != -1)
{
letter = Convert.ToChar(infile.Read());
if (letter == ';')
{
infile.ReadLine();
}
else
{
System.IO.File.AppendAllText(path, Convert.ToString(letter));
}
}
But the output I am getting is:
Name- Harley Brown Age-20
It's because AppendAllText is not working for the newline. Is there any alternative?
Sure, why not use File.AppendAllLines. See documentation here.
Appends lines to a file, and then closes the file. If the specified file does not exist, this method creates a file, writes the specified lines to the file, and then closes the file.
It takes in any IEnumerable<string> and adds every line to the specified file. So it always adds the line on a new line.
Small example:
const string originalFile = @"D:\Temp\file.txt";
const string newFile = @"D:\Temp\newFile.txt";
// Retrieve all lines from the file.
string[] linesFromFile = File.ReadAllLines(originalFile);
List<string> linesToAppend = new List<string>();
foreach (string line in linesFromFile)
{
// 1. Split the line at the semicolon.
// 2. Take the first index, because the first part is your required result.
// 3. Trim the trailing and leading spaces.
string appendAbleLine = line.Split(';').FirstOrDefault().Trim();
// Add the line to the list of lines to append.
linesToAppend.Add(appendAbleLine);
}
// Append all lines to the file.
File.AppendAllLines(newFile, linesToAppend);
Output:
Name- Harley Brown
Age- 20
You could even change the foreach-loop into a LINQ-expression, if you prefer LINQ:
List<string> linesToAppend = linesFromFile.Select(line => line.Split(';').FirstOrDefault().Trim()).ToList();
Why use char-by-char comparison when the .NET Framework is full of useful string manipulation functions?
Also, don't call a file write function many times when you can call it just once; repeated writes cost time and resources!
string str = "";
using (StreamReader infile = new StreamReader("file1.txt"))
{
    string line;
    while ((line = infile.ReadLine()) != null) // Get every line of the file.
    {
        line = line.Split(';')[0].Trim(); // Remove the comment (right part of ;) and surrounding whitespace.
        str += line + "\n"; // Add it to our final file contents.
    }
}
File.WriteAllText("file2.txt", str); // Write it to the new file.
You could do this with LINQ, System.IO.File.ReadLines(string), and System.IO.File.WriteAllLines(string, IEnumerable<string>). You could also use System.IO.File.AppendAllLines(string, IEnumerable<string>) in a find-and-replace fashion if that was, in fact, the functionality you were going for. The difference, as the names suggest, is whether it writes everything out as a new file or just appends to an existing one.
System.IO.File.WriteAllLines(newPath, System.IO.File.ReadLines(oldPath).Select(c =>
{
int semicolon = c.IndexOf(';');
if (semicolon > -1)
return c.Remove(semicolon);
else
return c;
}));
In case you aren't super familiar with LINQ syntax, the idea here is to loop through each line in the file, and if it contains a semicolon (that is, IndexOf returns something that is over -1) we cut that off, and otherwise, we just return the string. Then we write all of those to the file. The StreamReader equivalent to this would be:
using (StreamReader reader = new StreamReader(oldPath))
using (StreamWriter writer = new StreamWriter(newPath))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        int semicolon = line.IndexOf(';');
        if (semicolon > -1)
            line = line.Remove(semicolon); // cut off everything from the semicolon onward
        writer.WriteLine(line);
    }
}
Both versions should produce the same output: File.WriteAllLines writes each string followed by a line terminator, just as WriteLine does, so every line, including the last, ends with a newline in either case.
Another important thing to note, just looking at your original file, you might want to add in some Trim calls, since it looks like you can have spaces before your semicolons, and I don't imagine you want those copied through.
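As a sketch of that suggestion, the comment removal and the trimming can be combined in one small helper. CommentStripper and StripComment are hypothetical names introduced here for illustration. Only trailing whitespace is trimmed, on the assumption that leading indentation should be preserved.

```csharp
using System;

public static class CommentStripper
{
    // Hypothetical helper: removes everything from the first ';' onward,
    // then trims the whitespace left in front of the removed comment.
    public static string StripComment(string line)
    {
        int semicolon = line.IndexOf(';');
        string content = semicolon >= 0 ? line.Substring(0, semicolon) : line;
        return content.TrimEnd();
    }
}
```

It could be passed to Select in the LINQ version, or called on each line inside the StreamReader loop before WriteLine.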

How to add linebreaks to a stream reader if conditions are met

So I have code that needs to check if the file has already been split every 50 characters. 99% of the time it will come to me already split, where each line is 50 characters, however there is an off chance that it may come to me as a single line, and I need to add a linebreak every 50 characters. This file will always come to me as a stream.
Once I have the properly formatted file, I process it as needed.
However, I am uncertain how I can check if the stream is properly formatted.
Here is the code I have to check if the first line is larger than 50 characters (an indicator it may need to be split).
var streamReader = new StreamReader(s);
var firstLineCount = streamReader.ReadLine().Length;
if(firstLineCount > 50)
{
//code to add line breaks
}
//once the file is good
using(var trackReader = new TrackingTextReader(streamReader))
{
//do biz logic
}
How can I add linebreaks to a stream reader?
I would add all lines to a List<string>, line by line.
Check each item in the list (using for, not foreach, because we will be inserting items).
If an item has more than 50 characters:
Insert a new item at the next index using item.Substring(50) (everything after the 50th character).
Then cut the item at the current index with YourList[i] = YourList[i].Substring(0, 50).
A comment someone made also helps here:
You can also create a StreamWriter that writes the stream you're reading, with the corrections applied.
Then take the resulting stream and pass it on to whatever needs it.
You can't write anything to TextReader, because... it is a reader. The option here is to make a well-formed copy of data:
private IEnumerable<string> GetWellFormedData(Stream s)
{
using (var reader = new StreamReader(s))
{
while (!reader.EndOfStream)
{
var nextLine = reader.ReadLine();
if (nextLine.Length > 50)
{
// break the line into 50-chars fragments and yield return fragments
}
else
yield return nextLine;
}
}
}
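The placeholder comment in that loop (breaking a long line into 50-character fragments) could be filled in along these lines. LineSplitter and SplitIntoChunks are hypothetical names, and this is only one possible sketch; it assumes size is a positive number.

```csharp
using System;
using System.Collections.Generic;

public static class LineSplitter
{
    // Hypothetical helper: yields consecutive fragments of at most 'size'
    // characters. The last fragment may be shorter, and an empty input
    // line yields no fragments at all. Assumes size > 0.
    public static IEnumerable<string> SplitIntoChunks(string line, int size)
    {
        for (int i = 0; i < line.Length; i += size)
        {
            yield return line.Substring(i, Math.Min(size, line.Length - i));
        }
    }
}
```

Inside the loop, the long-line branch would then become a foreach over LineSplitter.SplitIntoChunks(nextLine, 50) that yield returns each fragment.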

Most efficient way of removing lines that contain more than one string from a file?

I want to find the most efficient way of removing string 1 and string 2 when reading a file (host file) and remove the entire lines that contains string 1 or string 2.
Currently I have the following, which is obviously sluggish. What better methods are there?
using(StreamReader sr = File.OpenText(path)){
while ((stringToRemove = sr.ReadLine()) != null)
{
if (!stringToRemove.Contains("string1"))
{
if (!stringToRemove.Contains("string2"))
{
emptyreplace += stringToRemove + Environment.NewLine;
}
}
}
sr.Close();
File.WriteAllText(path, emptyreplace);
hostFileConfigured = false;
UInt32 result = DnsFlushResolverCache();
MessageBox.Show(removeSuccess, windowOffline);
}
The primary problem that you have is that you are constantly using large regular strings and appending data onto the end. This is re-creating the strings each time and consumes a lot of time and particularly memory. By using string.Join it will avoid the (very large number of) intermediate string values being created.
You can also shorten the code to get the lines of text by using File.ReadLines instead of using the stream directly. It's not really any better or worse, just prettier.
var lines = File.ReadLines(path)
.Where(line => !line.Contains("string1") && !line.Contains("string2"));
File.WriteAllText(path, string.Join(Environment.NewLine, lines));
Another option would be to stream the writing of the output as well. Since there is no good library method for writing out a IEnumerable<string> without eagerly evaluating the input, we'll have to write our own (which is simple enough):
public static void WriteLines(string path, IEnumerable<string> lines)
{
using (var stream = File.CreateText(path))
{
foreach (var line in lines)
stream.WriteLine(line);
}
}
Also note that if we're streaming our output then we'll need a temporary file, since we don't want to be reading and writing to the same file at the same time.
//same code as before
var lines = File.ReadLines(path)
    .Where(line => !line.Contains("string1") && !line.Contains("string2"));
//get a temp file path that won't conflict with any other files
string tempPath = Path.GetTempFileName();
//use the method from above to write the lines to the temp file
WriteLines(tempPath, lines);
//File.Move throws if the destination already exists, so delete the
//old file first, then rename the temp file to take its place
File.Delete(path);
File.Move(tempPath, path);
The primary advantage of this option, as opposed to the first, is that it will consume far less memory. In fact, it only ever needs to hold one line of the file in memory at a time, rather than the whole file. It does take up a bit of extra space on disk (temporarily) though.
The first thing that stands out to me is the inefficient use of a string variable (emptyreplace) inside a while loop. Use StringBuilder instead and it will be much more memory efficient.
For example:
StringBuilder emptyreplace = new StringBuilder();
using (StreamReader sr = File.OpenText(path))
{
    while ((stringToRemove = sr.ReadLine()) != null)
    {
        if (!stringToRemove.Contains("string1"))
        {
            if (!stringToRemove.Contains("string2"))
            {
                //USE StringBuilder.AppendLine, and NOT string concatenation
                //(AppendLine already adds the line terminator)
                emptyreplace.AppendLine(stringToRemove);
            }
        }
    }
    ...
}
The rest seems good enough.
There are a number of ways to improve this:
Compile the array of words you're searching for into a regex (eg, word1|word2; beware of special characters) so that you'll only need to loop over the string once. (this would also allow you to use \b to only match words)
Write each line through a StreamWriter to a new file so that you don't need to store the whole thing in memory while building it. (after you finish, delete the original file & rename the new one)
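The first suggestion could look roughly like the sketch below. LineFilter, BuildWordRegex and RemoveMatching are hypothetical names. Regex.Escape handles the special characters mentioned above, and the sketch assumes a non-empty word list whose entries start and end with word characters, since \b only behaves as expected in that case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LineFilter
{
    // Hypothetical helper: builds one alternation such as \b(?:word1|word2)\b,
    // escaping each word so regex metacharacters are matched literally.
    public static Regex BuildWordRegex(IEnumerable<string> words)
    {
        string pattern = @"\b(?:" + string.Join("|", words.Select(Regex.Escape)) + @")\b";
        return new Regex(pattern, RegexOptions.Compiled);
    }

    // Lazily yields only the lines that contain none of the words,
    // scanning each line once with the combined regex.
    public static IEnumerable<string> RemoveMatching(IEnumerable<string> lines, Regex regex)
    {
        return lines.Where(line => !regex.IsMatch(line));
    }
}
```

Because of the word boundaries, a line containing "string10" is kept when only "string1" is in the list, which differs from a plain Contains check.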
Is your hosts file really so big that you need to bother with reading it line by line? Why not simply do this?
var badWords = new[] { "string1", "string2" };
var lines = File.ReadAllLines(path)
    .Where(x => !badWords.Any(y => x.Contains(y)))
    .ToArray();
File.WriteAllLines(path, lines);
Two suggestions:
Create an array of strings to detect (I'll call them stopWords) and use Linq's Any extension method.
Rather than building the file up and writing it all at once, write each line to an output file one at a time while you're reading the source file, and replace the source file once you're done.
The resulting code:
string[] stopWords = new string[]
{
    "string1",
    "string2"
};
string stringToRemove;
using (StreamReader sr = File.OpenText(srcPath))
using (StreamWriter sw = new StreamWriter(outPath))
{
    while ((stringToRemove = sr.ReadLine()) != null)
    {
        if (!stopWords.Any(s => stringToRemove.Contains(s)))
        {
            sw.WriteLine(stringToRemove);
        }
    }
}
// File.Move throws if the destination exists, so remove the original first.
File.Delete(srcPath);
File.Move(outPath, srcPath);
Update: I just realized that you are actually talking about the "hosts file". Assuming you mean %windir%\system32\drivers\etc\hosts, it is very unlikely that this file has a truly significant size (like more than a couple of KBs). So personally, I would go with the most readable approach. Like, for example, the one by @servy.
In the end you will have to read every line, and write every line that does not match your criteria. So you will always have the basic IO overhead, which you cannot avoid. Depending on the actual (average) size of your files, that might overshadow every other optimization technique you use in your code to actually filter the lines.
That said, you can be a little less wasteful on the memory side of things by not collecting all output lines in a buffer, but writing them directly to the output file as you read them (again, this might be pointless if your files are not very big).
using (var reader = new StreamReader(inputFile))
{
    using (var writer = new StreamWriter(outputFile))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.IndexOf("string1") == -1 && line.IndexOf("string2") == -1)
            {
                writer.WriteLine(line);
            }
        }
    }
}
// File.Move throws if the destination exists, so remove the original first.
File.Delete(inputFile);
File.Move(outputFile, inputFile);

Reading stream with 2 different readers

I have a text file that contains a fixed-length table that I am trying to parse. However, the beginning of the file is general information about when this table was generated (i.e. time, date, etc.).
To read this I have attempted to make a FileStream, then read the first part of this file with a StreamReader. I parse out what I need from the top part of the document, and then when I am done, set the stream's position to the first line of the structured data.
Then I attach a TextFieldParser to the stream (with appropriate settings for the fixed length table), and then attempt to read the file. On the first row, it fails, and in the ErrorLine property, it lists off the last half of the third row of the table. I stepped through it and it was on the first row to read, yet the ErrorLine property suggests otherwise.
When debugging, I found that if I called StreamReader.ReadLine() after I had attached the TextFieldParser to the stream, the first 2 rows showed up fine. When I read the third row, however, it returned a line that starts with the first half of the third row (and stops right where the text in ErrorLine would be) and then appends some part from much later in the document. If I try this before I attach the TextFieldParser, it reads all 3 rows fine.
I have a feeling this has to do with my tying 2 readers to the same stream. I'm not sure how to read this with a structured part and an unstructured part, without just tokenizing the lines myself. I can do that but I assume I am not the first person to want to read part of a stream one way, and a later part of a stream in another.
Why is it skipping like this, and how would you read a text file with different formats?
Example:
Date: 3/1/2013
Time: 3:00 PM
Sensor: Awesome Thing
Seconds X Y Value
0 5.1 2.8 55
30 4.9 2.5 33
60 5.0 5.3 44
Code tailored for this simplified example:
Boolean setupInfo = true;
DataTable result = new DataTable();
String[] fields;
Double[] dFields;
FileStream stream = File.Open(filePath,FileMode.Open);
StreamReader reader = new StreamReader(stream);
String tempLine;
for(int j = 1; j <= 7; j++)
{
result.Columns.Add(("Column" + j));
}
//Parse the unstructured part
while(setupInfo)
{
tempLine = reader.ReadLine();
if( tempLine.StartsWith("Date: "))
{
result.Rows.Add(tempLine);
}
else if (tempLine.StartsWith("Time: "))
{
result.Rows.Add(tempLine);
}
else if (tempLine.StartsWith("Seconds"))
{
//break out of this loop because the
//next line to be read is the structured part
setupInfo = false;
}
}
//Parse the structured part
TextFieldParser parser = new TextFieldParser(stream);
parser.TextFieldType = FieldType.FixedWidth;
parser.HasFieldsEnclosedInQuotes = false;
parser.SetFieldWidths(10, 10, 10, 10);
while (!parser.EndOfData)
{
if (reader.Peek() == '*')
{
break;
}
else
{
fields = parser.ReadFields();
if (parseStrings(fields, out dFields))
{
result.Rows.Add(dFields);
}
}
}
return result;
The reason it's skipping is that the StreamReader is reading blocks of data from the FileStream, rather than reading character-by-character. For example, the StreamReader might read 4 kilobytes from the FileStream and then parse out the lines as required to respond to ReadLine() calls. So when you attach the TextFieldParser to the FileStream, it's going to read from the current file position -- which is where the StreamReader left it.
The solution should be pretty simple: just connect the TextFieldParser to the StreamReader:
TextFieldParser parser = new TextFieldParser(reader);
See TextFieldParser(TextReader reader)
Generally speaking, most streams are consuming: once data has been read, it is no longer available. You could fork off to multiple streams by writing an intermediary class that derives from Stream and either raises an event or republishes to other streams.
In your case you don't need the StreamReader. The best way to check the file contents is to use the File.ReadLines method instead. It will not load the whole file content, just the lines up to the point where you've found everything you need:
foreach (string line in File.ReadLines(filePath))
{
if( line.StartsWith("Date: "))
{
result.Rows.Add(line);
}
else if (line.StartsWith("Time: "))
{
result.Rows.Add(line);
}
else if (line.StartsWith("Seconds"))
{
break;
}
}
EDIT
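You can do it even more simply using LINQ:
var d = from line in File.ReadLines(filePath) where line.Contains("Date: ") select line;
// d is a sequence of lines, so add each matching line as its own row
foreach (string dateLine in d) result.Rows.Add(dateLine);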
