I have code that needs to check whether a file has already been split every 50 characters. 99% of the time it will come to me already split, with each line 50 characters long; however, there is an off chance that it may come to me as a single line, in which case I need to add a line break every 50 characters. This file will always come to me as a stream.
Once I have the properly formatted file, I process it as needed.
However, I am uncertain how I can check if the stream is properly formatted.
Here is the code I have to check if the first line is longer than 50 characters (an indicator that the file may need to be split).
var streamReader = new StreamReader(s);
// Read the first line and check its length (note: this consumes the line).
var firstLine = streamReader.ReadLine();
if (firstLine != null && firstLine.Length > 50)
{
    //code to add line breaks
}
//once the file is good
using (var trackReader = new TrackingTextReader(streamReader))
{
    //do biz logic
}
How can I add linebreaks to a stream reader?
I would add all lines to a List<string>, line by line.
Do the check for each item in the list (using for, not foreach, because we will be inserting items).
If some item in the list has more than 50 characters:
Add an item at the next index of the list using item.Substring(50) (everything after the 50th character).
Then cut the item at the current index down to 50 characters using YourList[i] = YourList[i].Substring(0, 50). See the sketch below.
A comment someone made also helped here:
You can also create a StreamWriter to write out the Stream you're reading, with the corrections applied.
Then you take the produced Stream and pass it forward to whatever needs it.
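A minimal sketch of that approach might look like this (assuming the input is already wrapped in a StreamReader called streamReader, as in the question; this is an illustration, not a drop-in implementation):
var lines = new List<string>();
string current;
while ((current = streamReader.ReadLine()) != null)
    lines.Add(current);

// Use for (not foreach), because we insert items while iterating.
for (int i = 0; i < lines.Count; i++)
{
    if (lines[i].Length > 50)
    {
        lines.Insert(i + 1, lines[i].Substring(50)); // everything after the 50th character
        lines[i] = lines[i].Substring(0, 50);        // keep only the first 50 characters
    }
}

// Write the corrected lines back out to a new Stream and pass that forward.
var corrected = new MemoryStream();
var writer = new StreamWriter(corrected);
foreach (var line in lines)
    writer.WriteLine(line);
writer.Flush();
corrected.Position = 0; // rewind so the next reader starts from the beginning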
You can't write anything to TextReader, because... it is a reader. The option here is to make a well-formed copy of data:
private IEnumerable<string> GetWellFormedData(Stream s)
{
    using (var reader = new StreamReader(s))
    {
        while (!reader.EndOfStream)
        {
            var nextLine = reader.ReadLine();
            if (nextLine.Length > 50)
            {
                // break the line into 50-char fragments and yield return each fragment
                for (int i = 0; i < nextLine.Length; i += 50)
                    yield return nextLine.Substring(i, Math.Min(50, nextLine.Length - i));
            }
            else
            {
                yield return nextLine;
            }
        }
    }
}
What I have to do is read only the second line in a .txt file and save it as a string, to use later in the code.
The file name is "SourceSetting". In lines 1 and 2 I have some words.
For line 1, I have this code:
string Location;
using (StreamReader reader = new StreamReader("SourceSettings.txt"))
{
    Location = reader.ReadLine();
}
ofd.InitialDirectory = Location;
And that works out great, but how do I make it read only the second line, so I can save it as, for example:
string Text
You can skip the first line by doing nothing with it, so call ReadLine twice:
string secondLine;
using (var reader = new StreamReader("SourceSettings.txt"))
{
    reader.ReadLine(); // skip the first line
    secondLine = reader.ReadLine();
}
Another way is the File class, which has handy methods like ReadLines:
string secondLine = File.ReadLines("SourceSettings.txt").ElementAtOrDefault(1);
Since ReadLines also uses a stream, the whole file does not have to be loaded into memory first to process it. Enumerable.ElementAtOrDefault will take only the second line and will not process any further lines. If there are fewer than two lines, the result is null.
Update: I'd advise going with Tim Schmelter's solution.
When you call ReadLine, it moves the caret to the next line. So on the second call you'll read the 2nd line.
string Location;
string Text;
using (var reader = new StreamReader("SourceSettings.txt"))
{
    Location = reader.ReadLine(); // this call will move the caret to the beginning of the 2nd line.
    Text = reader.ReadLine();     // this call will read the 2nd line from the file.
}
ofd.InitialDirectory = Location;
Don't forget about using.
Or, an example of how to do this via the ReadLines method of the File class, if you need just one line from the file. But the solution with ElementAtOrDefault is the best one, as Tim Schmelter points out.
var Text = File.ReadLines(@"C:\Projects\info.txt").Skip(1).First();
The ReadLines and ReadAllLines methods differ as follows: When you use
ReadLines, you can start enumerating the collection of strings before
the whole collection is returned; when you use ReadAllLines, you must
wait for the whole array of strings be returned before you can access
the array. Therefore, when you are working with very large files,
ReadLines can be more efficient.
So, unlike ReadAllLines, it doesn't read all lines into memory.
The line could be read using LINQ as follows.
var SecondLine = File.ReadAllLines("SourceSettings.txt").Skip(1).FirstOrDefault();
private string GetLine(string filePath, int line)
{
    using (var sr = new StreamReader(filePath))
    {
        // Skip the lines before the requested (1-based) line number.
        for (int i = 1; i < line; i++)
            sr.ReadLine();
        // The next line is the one we want; returns null if the file is too short.
        return sr.ReadLine();
    }
}
Hope this will help :)
If you know that your second line is unique because it contains a specific keyword that does not appear anywhere else in your file, you could also use LINQ. The benefit is that the "second" line could be any line in the future.
var myLine = File.ReadLines("SourceSettings.txt")
                 .Where(line => line.Contains("The Keyword"))
                 .ToList();
I am trying to read characters from a file and then append them to another file after removing the comments (which start with a semicolon).
sample data from parent file:
Name- Harley Brown ;Name is Harley Brown
Age- 20 ;Age is 20 years
Desired result:
Name- Harley Brown
Age- 20
I am trying the following code:
StreamReader infile = new StreamReader(floc + "G" + line + ".NC0");
while (infile.Peek() != -1)
{
    letter = Convert.ToChar(infile.Read());
    if (letter == ';')
    {
        infile.ReadLine();
    }
    else
    {
        System.IO.File.AppendAllText(path, Convert.ToString(letter));
    }
}
But the output I am getting is:
Name- Harley Brown Age-20
It's because AppendAllText is not writing the newline. Is there any alternative?
Sure, why not use File.AppendAllLines? See the documentation:
Appends lines to a file, and then closes the file. If the specified file does not exist, this method creates a file, writes the specified lines to the file, and then closes the file.
It takes in any IEnumerable<string> and appends every line to the specified file, so each entry always ends up on its own line.
Small example:
const string originalFile = @"D:\Temp\file.txt";
const string newFile = @"D:\Temp\newFile.txt";
// Retrieve all lines from the file.
string[] linesFromFile = File.ReadAllLines(originalFile);
List<string> linesToAppend = new List<string>();
foreach (string line in linesFromFile)
{
    // 1. Split the line at the semicolon.
    // 2. Take the first index, because the first part is your required result.
    // 3. Trim the trailing and leading spaces.
    string appendAbleLine = line.Split(';').FirstOrDefault().Trim();
    // Add the line to the list of lines to append.
    linesToAppend.Add(appendAbleLine);
}
// Append all lines to the file.
File.AppendAllLines(newFile, linesToAppend);
Output:
Name- Harley Brown
Age- 20
You could even change the foreach-loop into a LINQ-expression, if you prefer LINQ:
List<string> linesToAppend = linesFromFile.Select(line => line.Split(';').FirstOrDefault().Trim()).ToList();
Why use char-by-char comparison when the .NET Framework is full of useful string manipulation functions?
Also, don't call a file write function multiple times when you can call it only once; doing so wastes time and resources!
StreamReader stream = new StreamReader("file1.txt");
string str = "";
string line;
while ((line = stream.ReadLine()) != null) // Get every line of the file.
{
    line = line.Split(';')[0].Trim(); // Remove the comment (the part after ';') and useless white characters.
    str += line + "\n";               // Add it to our final file contents.
}
File.WriteAllText("file2.txt", str);  // Write it to the new file.
You could do this with LINQ, System.IO.File.ReadLines(string), and System.IO.File.WriteAllLines(string, IEnumerable<string>). You could also use System.IO.File.AppendAllLines(string, IEnumerable<string>) in a find-and-replace fashion if that was, in fact, the functionality you were going for. The difference, as the names suggest, is whether it writes everything out as a new file or just appends to an existing one.
System.IO.File.WriteAllLines(newPath, System.IO.File.ReadLines(oldPath).Select(c =>
{
    int semicolon = c.IndexOf(';');
    if (semicolon > -1)
        return c.Remove(semicolon);
    else
        return c;
}));
In case you aren't super familiar with LINQ syntax, the idea here is to loop through each line in the file, and if it contains a semicolon (that is, IndexOf returns something that is over -1) we cut that off, and otherwise, we just return the string. Then we write all of those to the file. The StreamReader equivalent to this would be:
using (StreamReader reader = new StreamReader(oldPath))
using (StreamWriter writer = new StreamWriter(newPath))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        int semicolon = line.IndexOf(';');
        if (semicolon > -1)
            line = line.Remove(semicolon);
        writer.WriteLine(line);
    }
}
Although, of course, this would write an extra empty line at the end, and the LINQ version wouldn't (as far as I know; I'm not one hundred percent sure on that, so if someone reading this does know, I would appreciate a comment).
Another important thing to note, just looking at your original file, you might want to add in some Trim calls, since it looks like you can have spaces before your semicolons, and I don't imagine you want those copied through.
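For illustration only, a Trim call could be slotted into the LINQ version above roughly like this (same hypothetical oldPath/newPath variables as before; a sketch, not a definitive fix):
System.IO.File.WriteAllLines(newPath, System.IO.File.ReadLines(oldPath).Select(c =>
{
    int semicolon = c.IndexOf(';');
    // Cut at the semicolon if there is one, then drop the spaces that sat before it.
    return (semicolon > -1 ? c.Remove(semicolon) : c).TrimEnd();
}));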
I have a text file that contains a fixed length table that I am trying to parse. However, the beginning of the file is general information about when this table was generated (IE Time, Data, etc).
To read this I have attempted to make a FileStream, then read the first part of this file with a StreamReader. I parse out what I need from the top part of the document, and then when I am done, set the stream's position to the first line of the structured data.
Then I attach a TextFieldParser to the stream (with appropriate settings for the fixed length table), and then attempt to read the file. On the first row, it fails, and in the ErrorLine property, it lists off the last half of the third row of the table. I stepped through it and it was on the first row to read, yet the ErrorLine property suggests otherwise.
When debugging, I found that if I tried using my StreamReader.ReadLine() method after I had attached the TextFieldParser to the stream, the first 2 rows show up fine. When I read the third row, however, it returns a line that starts with the first half of the third row (stopping right where the text in ErrorLine would begin) and then appends some content from much later in the document. If I try this before I attach the TextFieldParser, it reads all 3 rows fine.
I have a feeling this has to do with my tying 2 readers to the same stream. I'm not sure how to read this with a structured part and an unstructured part, without just tokenizing the lines myself. I can do that but I assume I am not the first person to want to read part of a stream one way, and a later part of a stream in another.
Why is it skipping like this, and how would you read a text file with different formats?
Example:
Date: 3/1/2013
Time: 3:00 PM
Sensor: Awesome Thing
Seconds   X         Y         Value
0         5.1       2.8       55
30        4.9       2.5       33
60        5.0       5.3       44
Code tailored for this simplified example:
Boolean setupInfo = true;
DataTable result = new DataTable();
String[] fields;
Double[] dFields;
FileStream stream = File.Open(filePath, FileMode.Open);
StreamReader reader = new StreamReader(stream);
String tempLine;
for (int j = 1; j <= 7; j++)
{
    result.Columns.Add(("Column" + j));
}
//Parse the unstructured part
while (setupInfo)
{
    tempLine = reader.ReadLine();
    if (tempLine.StartsWith("Date: "))
    {
        result.Rows.Add(tempLine);
    }
    else if (tempLine.StartsWith("Time: "))
    {
        result.Rows.Add(tempLine);
    }
    else if (tempLine.StartsWith("Seconds"))
    {
        //break out of this loop because the
        //next line to be read is the structured part
        setupInfo = false;
    }
}
//Parse the structured part
TextFieldParser parser = new TextFieldParser(stream);
parser.TextFieldType = FieldType.FixedWidth;
parser.HasFieldsEnclosedInQuotes = false;
parser.SetFieldWidths(10, 10, 10, 10);
while (!parser.EndOfData)
{
    if (reader.Peek() == '*')
    {
        break;
    }
    else
    {
        fields = parser.ReadFields();
        if (parseStrings(fields, out dFields))
        {
            result.Rows.Add(dFields);
        }
    }
}
return result;
The reason it's skipping is that the StreamReader is reading blocks of data from the FileStream, rather than reading character-by-character. For example, the StreamReader might read 4 kilobytes from the FileStream and then parse out the lines as required to respond to ReadLine() calls. So when you attach the TextFieldParser to the FileStream, it's going to read from the current file position -- which is where the StreamReader left it.
The solution should be pretty simple: just connect the TextFieldParser to the StreamReader:
TextFieldParser parser = new TextFieldParser(reader);
See TextFieldParser(TextReader reader)
Generally speaking, most streams are consuming; that is, once read, the data is no longer available. You could fork off to multiple streams by writing an intermediary class that derives from Stream and either raises an event, republishes data to other streams, etc.
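As a rough sketch of that intermediary idea (an assumption of one possible shape, not an existing class: a read-only pass-through, here called TeeStream, that copies everything it reads into a second stream):
public class TeeStream : Stream
{
    private readonly Stream _source;
    private readonly Stream _copy;

    public TeeStream(Stream source, Stream copy)
    {
        _source = source;
        _copy = copy;
    }

    // Republish every byte consumed from the source into the copy stream.
    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = _source.Read(buffer, offset, count);
        if (read > 0)
            _copy.Write(buffer, offset, read);
        return read;
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { return _source.Length; } }
    public override long Position
    {
        get { return _source.Position; }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { _copy.Flush(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}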
In your case you don't need the StreamReader. The best choice is to check the file contents using the File.ReadLines method instead. It will not load the whole file content, just the lines, until you've found everything you need:
foreach (string line in File.ReadLines(filePath))
{
    if (line.StartsWith("Date: "))
    {
        result.Rows.Add(line);
    }
    else if (line.StartsWith("Time: "))
    {
        result.Rows.Add(line);
    }
    else if (line.StartsWith("Seconds"))
    {
        break;
    }
}
EDIT
You can do it even more simply using LINQ:
var d = from line in File.ReadLines(filePath) where line.Contains("Date: ") select line;
result.Rows.Add(d);
I need some assistance. I am writing a method to read a text file, and if any exception occurs I append a line to the text file, e.g. "**".
So what I need to know is: how can I check for that specific line of text in the text file without reading every line of the file, like a peek method or something?
Any help would be appreciated.
Thanks in advance.
You can use File.ReadLines in combination with Any:
bool isExcFile = System.IO.File.ReadLines(path).Any(l => l == "**");
The ReadLines and ReadAllLines methods differ as follows: When you use
ReadLines, you can start enumerating the collection of strings before
the whole collection is returned; when you use ReadAllLines, you must
wait for the whole array of strings be returned before you can access
the array. Therefore, when you are working with very large files,
ReadLines can be more efficient.
I have found a solution. The line I have appended to the file will always be the last line in the file, so I created a method to read the last line. See below:
public string ReadLastLine(string path)
{
    string returnValue = "";
    FileStream fs = new FileStream(path, FileMode.Open);
    for (long pos = fs.Length - 2; pos > 0; --pos)
    {
        fs.Seek(pos, SeekOrigin.Begin);
        StreamReader ts = new StreamReader(fs);
        returnValue = ts.ReadToEnd();
        int eol = returnValue.IndexOf("\n");
        if (eol >= 0)
        {
            fs.Close();
            return returnValue.Substring(eol + 1);
        }
    }
    fs.Close();
    return returnValue;
}
You will need to maintain a separate file with indexes (such as comma delimited) of where your special markers are. Then you can read only those indexes and use the Seek method to jump to that point in the filestream.
If your file is relatively small, let's say under 50 MB, this is overkill. Beyond that, you can consider maintaining the index file. You basically have to weigh the cost of an extra IO call (reading the index file) against simply reading each line from the filestream.
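If you did go that route, a hedged sketch might look something like this (markers.idx and data.txt are hypothetical file names; the index is assumed to hold comma-delimited byte offsets):
// Read the comma-delimited byte offsets from the hypothetical index file.
long[] offsets = File.ReadAllText("markers.idx")
    .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
    .Select(long.Parse)
    .ToArray();

using (var fs = new FileStream("data.txt", FileMode.Open, FileAccess.Read))
using (var reader = new StreamReader(fs))
{
    foreach (long offset in offsets)
    {
        fs.Seek(offset, SeekOrigin.Begin);
        reader.DiscardBufferedData();          // reset the reader after moving the underlying stream
        string markedLine = reader.ReadLine(); // the line that starts at the marker
        // ... inspect markedLine ...
    }
}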
From what I understand you want to process some files and after the processing find out which files contain the "**" symbol, without reading every line of the file.
If you append the "**" to the end of the file you could do something like:
using (StreamReader sr = new StreamReader(File.OpenRead(fileName)))
{
    sr.BaseStream.Seek(-3, SeekOrigin.End);
    string endToken = sr.ReadToEnd();
    if (endToken == "**\n")
    {
        // if needed, go back to start of file:
        sr.BaseStream.Seek(0, SeekOrigin.Begin);
        // do something with the file
    }
}
I'm trying to parse a text file that has a heading and the body. In the heading of this file, there are line number references to sections of the body. For example:
SECTION_A 256
SECTION_B 344
SECTION_C 556
This means, that SECTION_A starts in line 256.
What would be the best way to parse this heading into a dictionary and then when necessary read the sections.
Typical scenarios would be:
Parse the header and read only section SECTION_B
Parse the header and read the first paragraph of each section.
The data file is quite large and I definitely don't want to load all of it to the memory and then operate on it.
I'd appreciate your suggestions. My environment is VS 2008 and C# 3.5 SP1.
You can do this quite easily.
There are three parts to the problem.
1) How to find where a line in the file starts. The only way to do this is to read the lines from the file, keeping a list that records the start position in the file of each line, e.g.:
List<long> lineMap = new List<long>();
lineMap.Add(0); // Line 0 starts at location 0 in the data file (just a dummy entry)
lineMap.Add(0); // Line 1 starts at location 0 in the data file
using (StreamReader sr = new StreamReader("DataFile.txt"))
{
    String line;
    while ((line = sr.ReadLine()) != null)
        lineMap.Add(sr.BaseStream.Position);
}
2) Read and parse your index file into a dictionary.
Dictionary<string, int> index = new Dictionary<string, int>();
using (StreamReader sr = new StreamReader("IndexFile.txt"))
{
    String line;
    while ((line = sr.ReadLine()) != null)
    {
        string[] parts = line.Split(' '); // Break the line into the name & line number
        index.Add(parts[0], Convert.ToInt32(parts[1]));
    }
}
Then to find a line in your file, use:
int lineNumber = index["SECTION_B"];         // Convert section name into the line number
long offsetInDataFile = lineMap[lineNumber]; // Convert line number into file offset
Then open a new FileStream on DataFile.txt, Seek(offsetInDataFile, SeekOrigin.Begin) to move to the start of the line, and use a StreamReader (as above) to read line(s) from it.
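A sketch of that last step, using the offsetInDataFile computed above (DataFile.txt as in the earlier snippet):
using (var fs = new FileStream("DataFile.txt", FileMode.Open, FileAccess.Read))
using (var sr = new StreamReader(fs))
{
    fs.Seek(offsetInDataFile, SeekOrigin.Begin);
    sr.DiscardBufferedData();                // make sure the reader doesn't reuse stale buffered data
    string firstSectionLine = sr.ReadLine(); // first line of SECTION_B
    // keep calling sr.ReadLine() for as much of the section as you need
}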
Well, obviously you can store the name + line number into a dictionary, but that's not going to do you any good.
Well, sure, it will allow you to know which line to start reading from, but the problem is, where in the file is that line? The only way to know is to start from the beginning and start counting.
The best way would be to write a wrapper that decodes the text contents (if you have encoding issues) and can give you a line number to byte position type of mapping, then you could take that line number, 256, and look in a dictionary to know that line 256 starts at position 10000 in the file, and start reading from there.
Is this a one-off processing situation? If not, have you considered stuffing the entire file into a local database, like a SQLite database? That would allow you to have a direct mapping between line number and its contents. Of course, that file would be even bigger than your original file, and you'd need to copy data from the text file to the database, so there's some overhead either way.
Just read the file one line at a time and ignore the data until you get to the ones you need. You won't have any memory issues, but performance probably won't be great. You can do this easily in a background thread though.
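A minimal sketch of that, assuming a filePath and a targetLineNumber you already know, and using ThreadPool for the background thread:
ThreadPool.QueueUserWorkItem(state =>
{
    using (var reader = new StreamReader(filePath))
    {
        string line;
        int currentLine = 0;
        while ((line = reader.ReadLine()) != null)
        {
            currentLine++;
            if (currentLine < targetLineNumber)
                continue; // ignore everything before the part we care about
            // ... process the line(s) you actually need, then stop ...
            break;
        }
    }
});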
Read the file until the end of the header, assuming you know where that is. Split the strings you've stored on whitespace, like so:
Dictionary<string, int> sectionIndex = new Dictionary<string, int>();
List<string> headers = new List<string>(); // fill these with ReadLine
foreach (string header in headers)
{
    var s = header.Split(new[] { ' ' });
    sectionIndex.Add(s[0], Int32.Parse(s[1]));
}
Find the dictionary entry you want, keep a count of the number of lines read in the file, and loop until you hit that line number, then read until you reach the next section's starting line. I don't know if you can guarantee the order of keys in the Dictionary, so you'd probably need the current and next section's names.
Be sure to do some error checking to make sure the section you're reading to isn't before the section you're reading from, and any other error cases you can think of.
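A rough sketch of that loop (sectionIndex as above; fromSection and toSection are hypothetical names for the current and next section, and "DataFile.txt" is a placeholder file name):
int startLine = sectionIndex[fromSection];
int endLine = sectionIndex[toSection]; // the section that follows the one you want

var sectionLines = new List<string>();
using (var reader = new StreamReader("DataFile.txt"))
{
    string line;
    int lineNumber = 0;
    while ((line = reader.ReadLine()) != null)
    {
        lineNumber++;
        if (lineNumber < startLine)
            continue; // still before the section we want
        if (lineNumber >= endLine)
            break;    // reached the next section, stop
        sectionLines.Add(line);
    }
}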
You could read line by line until all the heading information is captured and stop (assuming all section pointers are in the heading). You would have the section and line numbers for use in retrieving the data at a later time.
string dataRow = "";
try
{
    TextReader tr = new StreamReader("filename.txt");
    while (true)
    {
        dataRow = tr.ReadLine();
        if (dataRow == null || !dataRow.StartsWith("SECTION_"))
            break;
        else
            //Parse line for section code and line number and log values
            continue;
    }
    tr.Close();
}
catch (Exception ex)
{
    MessageBox.Show(ex.Message);
}