Exclude last line from StreamReader in C#

I'm saving a DataGridView's data (the content of each cell) into a text file, and on the last line of the file I store the number of rows in the DataGridView, prefixed with a "-".
Now, I can read my data into my DGV perfectly, but how can I exclude the last line, which contains my row count, and also read the int stored in it?
The code I use for reading the data into the DGV:
void LoadDVGData()
{
StreamReader sr = new StreamReader(open.FileName);
foreach (DataGridViewRow row in editDGV.Rows)
{
foreach (DataGridViewCell cell in row.Cells)
{
string content = sr.ReadLine();
cell.Value = content;
}
}
sr.Close();
}
Thanks in Advance. - CCB

You just need to maintain one token of look-ahead so that you can answer the question, "Is this the last line?" (If you have a current line but no next line, you're positioned at the last line.)
Here's one approach, which is probably about as simple as it gets:
void LoadDVGData()
{
using ( StreamReader sr = new StreamReader(open.FileName) )
{
string currentLine = sr.ReadLine() ;
string nextLine = sr.ReadLine() ;
foreach (DataGridViewRow row in editDGV.Rows)
{
foreach (DataGridViewCell cell in row.Cells)
{
if ( currentLine == null ) throw new InvalidOperationException("Hmph. We seem to have run out of input data");
cell.Value = currentLine ;
currentLine = nextLine ;
nextLine = sr.ReadLine() ;
} //end cols loop
} // end rows loop
} // end using block
return;
}
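The loop above carries the look-ahead along but does not show where the last line ends up. A minimal sketch of separating it out, assuming the trailer is always the final line and looks like "-3" as described in the question. The names TrailerReader, ReadWithTrailer and ParseRowCount are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TrailerReader
{
    // Reads every line from the reader. All lines except the last are
    // appended to dataLines; the last line is returned as the trailer
    // (null when the input is empty).
    public static string ReadWithTrailer(TextReader reader, List<string> dataLines)
    {
        string currentLine = reader.ReadLine();
        if (currentLine == null) return null;

        string nextLine;
        while ((nextLine = reader.ReadLine()) != null)
        {
            // A next line exists, so currentLine cannot be the last one.
            dataLines.Add(currentLine);
            currentLine = nextLine;
        }
        return currentLine;
    }

    // Parses a trailer such as "-3" into 3 (assumption: the "-" is only a marker).
    public static int ParseRowCount(string trailer)
    {
        return int.Parse(trailer.TrimStart('-'));
    }
}
```

The data lines can then be assigned to the cells in order, and the parsed count compared against editDGV.Rows.Count before anything is loaded.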

I assume sr is the StreamReader that reads your text file - you should wrap that in a using or try..finally block to ensure correct cleanup in case of errors.
Your current setup doesn't work very well for cases where the text file doesn't match the number of rows / cells that you have in the DataGridView (DGV): you try reading a value from the file for each cell and if the wrong file is specified, you may attempt to read beyond the end of the file. You should have more error handling around this code because there may be various problem spots during reading the file: too few rows, too few cells for a row, invalid value for a particular cell, etc.
Your setup doesn't lend itself very well to error handling. If you control the format of the text file, I'd recommend changing it so that the very first row is the count and each following row in the text file describes a complete row for the DGV, using TAB as the delimiter. It'd be far simpler to load such a file. (You can even have the row count and column count on the first row to give you even more information about the data; you can never have too much metadata about input data.)
So your new textfile format would look something like:
3{TAB}5
R1C1{TAB}R1C2{TAB}R1C3{TAB}R1C4{TAB}R1C5{CRLF}
R2C1{TAB}R2C2{TAB}R2C3{TAB}R2C4{TAB}R2C5{CRLF}
R3C1{TAB}R3C2{TAB}R3C3{TAB}R3C4{TAB}R3C5{CRLF}
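A loader for this layout could look roughly like the following sketch. It assumes cell values never contain a tab or a line break; GridFile and Load are hypothetical names, and the jagged array stands in for whatever is later copied into the DGV:

```csharp
using System;
using System.IO;

public static class GridFile
{
    // Parses "rows<TAB>cols" on the first line, then one tab-delimited
    // line per grid row. Throws InvalidDataException on any mismatch.
    public static string[][] Load(TextReader reader)
    {
        string header = reader.ReadLine();
        if (header == null) throw new InvalidDataException("Missing header line.");

        string[] counts = header.Split('\t');
        if (counts.Length != 2) throw new InvalidDataException("Header must be rows<TAB>cols.");
        int rowCount = int.Parse(counts[0]);
        int colCount = int.Parse(counts[1]);

        string[][] rows = new string[rowCount][];
        for (int r = 0; r < rowCount; r++)
        {
            string line = reader.ReadLine();
            if (line == null)
                throw new InvalidDataException("Too few rows: expected " + rowCount + ".");

            string[] cells = line.Split('\t');
            if (cells.Length != colCount)
                throw new InvalidDataException("Row " + (r + 1) + " has " + cells.Length + " cells; expected " + colCount + ".");
            rows[r] = cells;
        }
        return rows;
    }
}
```

Because the counts come first, a wrong or truncated file is rejected before any cell in the grid is touched.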
If you cannot change the text file format, then just read the file twice: first seek to the end of the file and scan backwards for the newline that ends the line before the last one, then read one line from that point. It should be the row count. Then seek to the beginning of the file and read that many rows from it.
Edit:
As @vesan commented under the question, if your input file is not huge, you can just call File.ReadAllLines() to get all lines in a string[]. It's trivial to process it then. A more convenient file format, by contrast, would work for any file size.
Edit 2:
For example (error handling omitted for brevity):
var sLines = File.ReadAllLines ( open.FileName );
var nRowCount = int.Parse ( sLines[sLines.Length - 1].TrimStart('-') ); // the count line is stored with a leading "-"
var nIndex = 0;
foreach (DataGridViewRow row in editDGV.Rows)
{
foreach (DataGridViewCell cell in row.Cells)
{
cell.Value = sLines[nIndex];
nIndex++;
}
}
If your text file is correct, nRowCount will have the correct number of rows and you'll never actually read the last element of your array into the DGV.

Related

How Can I Read a Multiline Field from a CSV Without Altering It?

I have a CSV that looks like this. My goal is to extract each entry (notice I said entry, not line), where an entry starts from the first column and stretches to the last column, and may span multiple lines. I'd like to extract an entry without ruining the formatting. For example, I do not want the following to be considered four separate lines,
Eg. 1, One Column Multiple Lines
...,"1. copy ctor
2. copy ctor
3. declares function
4. default ctor",... // Where ... represents the columns before and after
but rather a column in one entry that can be represented as such
Eg. 2, One Column Single Line
"1. copy ctor\n2. copy ctor\n3. declares function\n4. default ctor"
When I iterate over the CSV, as such, I get Eg. 1. I'm not sure why splitting on a comma is treating a new line as a comma.
using (var streamReader = new StreamReader("results-survey111101.csv"))
{
string line;
while ((line = streamReader.ReadLine()) != null)
{
string[] splitLine = line.Split(',');
foreach (var column in splitLine)
Console.WriteLine(column);
}
}
If someone can show me what I need to do to get these multi line CSV columns into one line that maintains the formatting (e.g. adds \t or \n where necessary) that would be great. Thanks!
Assuming your source file is valid CSV, variability in the data is really hard to account for. That's all I'll say, but I'll link you to another SO answer if you need convincing that writing your own CSV parser is a horrible task. Reading CSV files using C#
Let's assume you are going to take advantage of an existing CSV reader library. I'll use TextFieldParser from the Microsoft.VisualBasic library as is used in the example answer I linked.
Your task is to read your source file line by line, and validate whether the line is a complete CSV entry on its own, or if it forms part of a broken line.
If it forms part of a broken line, we need to remember the line and add the next line to it before attempting validation again.
For this we need to know one thing:
What is the expected number of fields each data entry row should have?
int expectedFieldCount = 7;
string brokenLine = "";
using (var streamReader = new StreamReader("results-survey111101.csv"))
{
string line;
while ((line = streamReader.ReadLine()) != null) // read the next line
{
// if the previous line was incomplete, add it to the current line,
// otherwise use the current line
string csvLineData = (brokenLine.Length > 0) ? brokenLine + line : line;
try
{
using (StringReader stringReader = new StringReader(csvLineData ))
using (TextFieldParser parser = new TextFieldParser(stringReader))
{
parser.SetDelimiters(",");
while (!parser.EndOfData)
{
string[] fields = parser.ReadFields(); // tests if the line is valid csv
if (expectedFieldCount == fields.Length)
{
// do whatever you want with the fields now.
foreach (var field in fields)
{
Console.WriteLine(field);
}
brokenLine = ""; // reset the brokenLine
}
else // it was valid csv, but we don't have the required number of fields yet
{
brokenLine += line + @"\r\n";
break;
}
}
}
}
catch (Exception ex) // the current line is NOT valid csv, update brokenLine
{
brokenLine += (line + @"\r\n");
}
}
}
I am replacing the line breaks that broken lines contain with \r\n literals. You can display these in your resulting one-liner field however you want. But you shouldn't expect to be able to copy paste the result into notepad and see line breaks.
This assumes you have the same number of columns in each record. In your code where you do the Split, you can then add the length of splitLine to a running columnsReadCount until it equals the desired columnsPerRecordCount. At that point you have read the whole record and can reset columnsReadCount to zero, ready for the next record. Note that a field continuing onto the next line is counted on both lines, so subtract one per continuation line, and that this only works if no field contains a comma.
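Both approaches above need the field count up front. A different technique, sketched here under the assumption that the file uses standard CSV quoting (a field containing a line break is wrapped in double quotes, and embedded quotes are doubled), is to keep joining physical lines while the number of quote characters seen so far is odd. CsvRecords and ReadRecords are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvRecords
{
    // Yields one logical CSV record per item. A physical line that leaves a
    // quoted field open (odd number of '"' so far) is joined to the lines
    // that follow, with each swallowed line break written as the two
    // characters backslash + n.
    public static IEnumerable<string> ReadRecords(TextReader reader)
    {
        StringBuilder record = new StringBuilder();
        int quoteCount = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (char c in line)
            {
                if (c == '"') quoteCount++;
            }
            record.Append(line);

            if (quoteCount % 2 == 0)
            {
                // All quotes are balanced: the record is complete.
                yield return record.ToString();
                record.Length = 0;
                quoteCount = 0;
            }
            else
            {
                // Still inside a quoted field: mark the line break and go on.
                record.Append(@"\n");
            }
        }
        // Input ended inside an open quote; hand back what was collected.
        if (record.Length > 0) yield return record.ToString();
    }
}
```

Each returned record is a single line that can then be handed to a real CSV parser for splitting into fields, rather than to Split(',').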

Issue renaming two columns in a CSV file instead of one

I need to be able to rename the column in a spreadsheet from 'idn_prod' to 'idn_prod1', but there are two columns with this name.
I have tried implementing code from similar posts, but I've only been able to update both columns. Below you'll find the code I have that just renames both columns.
//locate and edit column in csv
string file1 = @"C:\Users\username\Documents\AppDevProjects\import.csv";
string[] lines = System.IO.File.ReadAllLines(file1);
System.IO.StreamWriter sw = new System.IO.StreamWriter(file1);
foreach(string s in lines)
{
sw.WriteLine(s.Replace("idn_prod", "idn_prod1"));
}
I expect only the 2nd column to be renamed, but the actual output is that both are renamed.
Here are the first couple rows of the CSV:
I'm assuming that you only need to update the column header, the actual rows need not be updated.
var file1 = @"test.csv";
var lines = System.IO.File.ReadAllLines(file1);
var columnHeaders = lines[0];
var textToReplace = "idn_prod";
var newText = "idn_prod1";
var indexToReplace = columnHeaders
.LastIndexOf("idn_prod");//LastIndex ensures that you pick the second idn_prod
columnHeaders = columnHeaders
.Remove(indexToReplace,textToReplace.Length)
.Insert(indexToReplace, newText);//I'm removing the second idn_prod and replacing it with the updated value.
using (System.IO.StreamWriter sw = new System.IO.StreamWriter(file1))
{
sw.WriteLine(columnHeaders);
foreach (var str in lines.Skip(1))
{
sw.WriteLine(str);
}
sw.Flush();
}
Replace the foreach(string s in lines) loop with a for loop over the lines count and rename only the 2nd column.
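One way to read that suggestion, sketched for the header line only: split it, count matches, and rename just the n-th one. This assumes the header has no quoted fields containing the delimiter; HeaderRename and RenameNth are hypothetical names:

```csharp
using System;

public static class HeaderRename
{
    // Renames the n-th (1-based) column called oldName in a delimited
    // header line. Returns the header unchanged if there are fewer than
    // n columns with that name.
    public static string RenameNth(string header, char delimiter, string oldName, string newName, int n)
    {
        string[] parts = header.Split(delimiter);
        int seen = 0;
        for (int i = 0; i < parts.Length; i++)
        {
            if (parts[i] == oldName && ++seen == n)
            {
                parts[i] = newName;
                break;
            }
        }
        return string.Join(delimiter.ToString(), parts);
    }
}
```

The result would be written out as the first line, followed by lines[1] onwards unchanged.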
I believe the only way to handle this properly is to crack the header line (first string that has column names) into individual parts, separated by commas or tabs or whatever, and run through the columns one at a time yourself.
Your loop would consider the first line from the file, use the Split function on the delimiter, and look for the column you're interested in:
bool headerSeen = false;
foreach (string s in lines)
{
if (!headerSeen)
{
// special: this is the header
string [] parts = s.Split('\t'); // char overload; Split(string) is not available on .NET Framework
for (int i = 0; i < parts.Length; i++)
{
if (parts[i] == "idn_prod")
{
// only fix the *first* one seen
parts[i] = "idn_prod1";
break;
}
}
sw.WriteLine( string.Join("\t", parts));
headerSeen = true;
}
else
{
sw.WriteLine( s );
}
}
The only reason this is even remotely possible is that it's the header and not the individual lines; headers tend to be more predictable in format, and you worry less about quoting and fields that contain the delimiter, etc.
Trying this on the individual data lines will rarely work reliably: if your delimiter is a comma, what happens if an individual field contains a comma? Then you have to worry about quoting, and this enters all kinds of fun.
For doing any real CSV work in C#, it's really worth looking into a package that specializes in this, and I've been thrilled with CsvHelper from Josh Close. Highly recommended.

How to add linebreaks to a stream reader if conditions are met

So I have code that needs to check if the file has already been split every 50 characters. 99% of the time it will come to me already split, where each line is 50 characters, however there is an off chance that it may come to me as a single line, and I need to add a linebreak every 50 characters. This file will always come to me as a stream.
Once I have the properly formatted file, I process it as needed.
However, I am uncertain how I can check if the stream is properly formatted.
Here is the code I have to check if the first line is longer than 50 characters (an indicator it may need to be split).
var streamReader = new StreamReader(s);
var firstLineCount = streamReader.ReadLine().Length; // length of the first line; StreamReader has no ReadLines()
if(firstLineCount > 50)
{
//code to add line breaks
}
//once the file is good
using(var trackReader = new TrackingTextReader(streamReader))
{
//do biz logic
}
How can I add linebreaks to a stream reader?
I would read the stream line by line into a List<string>.
Then check each item in the list (using for, not foreach, because we will be inserting items).
If an item in the list has more than 50 characters:
Insert a new item at the next index of the list containing item.Substring(50) (everything after the 50th character).
Truncate the item at the current index using YourList[i] = YourList[i].Substring(0, 50).
A comment someone left helped with this:
You can also create a StreamWriter and write the corrected contents of the stream you're reading to a new stream.
Then you pass the resulting stream on to whatever needs it.
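A rough sketch of that StreamWriter idea, assuming the data is small enough to hold in a MemoryStream and is UTF-8 text. LineWrapper and Rewrap are hypothetical names:

```csharp
using System;
using System.IO;
using System.Text;

public static class LineWrapper
{
    // Copies input into a new MemoryStream, breaking every line longer than
    // width into pieces of at most width characters. The returned stream is
    // rewound to position 0 so it can be read straight away.
    public static Stream Rewrap(Stream input, int width)
    {
        MemoryStream output = new MemoryStream();
        StreamReader reader = new StreamReader(input);
        // The writer is flushed but not disposed: disposing it would also
        // close the MemoryStream we want to return.
        StreamWriter writer = new StreamWriter(output, new UTF8Encoding(false));

        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0) writer.WriteLine(); // keep empty lines
            for (int i = 0; i < line.Length; i += width)
            {
                writer.WriteLine(line.Substring(i, Math.Min(width, line.Length - i)));
            }
        }
        writer.Flush();
        output.Position = 0;
        return output;
    }
}
```

The returned stream can be wrapped in a new StreamReader and passed to the existing processing code; lines that are already short enough pass through unchanged.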
You can't write anything to TextReader, because... it is a reader. The option here is to make a well-formed copy of data:
private IEnumerable<string> GetWellFormedData(Stream s)
{
using (var reader = new StreamReader(s))
{
while (!reader.EndOfStream)
{
var nextLine = reader.ReadLine();
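if (nextLine.Length > 50)
{
// break the line into fragments of at most 50 characters
for (int i = 0; i < nextLine.Length; i += 50)
yield return nextLine.Substring(i, Math.Min(50, nextLine.Length - i));
}
else
yield return nextLine;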
}
}
}

Reading stream with 2 different readers

I have a text file that contains a fixed length table that I am trying to parse. However, the beginning of the file is general information about when this table was generated (i.e. time, date, etc.).
To read this I have attempted to make a FileStream, then read the first part of this file with a StreamReader. I parse out what I need from the top part of the document, and then when I am done, set the stream's position to the first line of the structured data.
Then I attach a TextFieldParser to the stream (with appropriate settings for the fixed length table), and then attempt to read the file. On the first row, it fails, and in the ErrorLine property, it lists off the last half of the third row of the table. I stepped through it and it was on the first row to read, yet the ErrorLine property suggests otherwise.
When debugging, I found that if I called StreamReader.ReadLine() after I had attached the TextFieldParser to the stream, the first 2 rows show up fine. When I read the third row, however, it returns a line that starts with the first half of the third row (and stops right where the text in ErrorLine would be) and then appends some part from much later in the document. If I try this before I attach the TextFieldParser, it reads all 3 rows fine.
I have a feeling this has to do with my tying 2 readers to the same stream. I'm not sure how to read this with a structured part and an unstructured part, without just tokenizing the lines myself. I can do that but I assume I am not the first person to want to read part of a stream one way, and a later part of a stream in another.
Why is it skipping like this, and how would you read a text file with different formats?
Example:
Date: 3/1/2013
Time: 3:00 PM
Sensor: Awesome Thing
Seconds X Y Value
0 5.1 2.8 55
30 4.9 2.5 33
60 5.0 5.3 44
Code tailored for this simplified example:
Boolean setupInfo = true;
DataTable result = new DataTable();
String[] fields;
Double[] dFields;
FileStream stream = File.Open(filePath,FileMode.Open);
StreamReader reader = new StreamReader(stream);
String tempLine;
for(int j = 1; j <= 7; j++)
{
result.Columns.Add(("Column" + j));
}
//Parse the unstructured part
while(setupInfo)
{
tempLine = reader.ReadLine();
if( tempLine.StartsWith("Date: "))
{
result.Rows.Add(tempLine);
}
else if (tempLine.StartsWith("Time: "))
{
result.Rows.Add(tempLine);
}
else if (tempLine.StartsWith("Seconds"))
{
//break out of this loop because the
//next line to be read is the structured part
setupInfo = false;
}
}
//Parse the structured part
TextFieldParser parser = new TextFieldParser(stream);
parser.TextFieldType = FieldType.FixedWidth;
parser.HasFieldsEnclosedInQuotes = false;
parser.SetFieldWidths(10, 10, 10, 10);
while (!parser.EndOfData)
{
if (reader.Peek() == '*')
{
break;
}
else
{
fields = parser.ReadFields();
if (parseStrings(fields, out dFields))
{
result.Rows.Add(dFields);
}
}
}
return result;
The reason it's skipping is that the StreamReader is reading blocks of data from the FileStream, rather than reading character-by-character. For example, the StreamReader might read 4 kilobytes from the FileStream and then parse out the lines as required to respond to ReadLine() calls. So when you attach the TextFieldParser to the FileStream, it's going to read from the current file position -- which is where the StreamReader left it.
The solution should be pretty simple: just connect the TextFieldParser to the StreamReader:
TextFieldParser parser = new TextFieldParser(reader);
See TextFieldParser(TextReader reader)
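The buffering described above is easy to observe. This small sketch uses a MemoryStream in place of the FileStream (an assumption made only to keep it self-contained); BufferingDemo and PositionAfterFirstLine are hypothetical names:

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferingDemo
{
    // Returns the underlying stream's position after a StreamReader has
    // handed back only the first line of the text.
    public static long PositionAfterFirstLine(string text)
    {
        MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(text));
        StreamReader reader = new StreamReader(stream);
        reader.ReadLine();
        // The reader has pulled a whole block into its own buffer, so the
        // stream is already positioned well past the end of the first line.
        return stream.Position;
    }
}
```

For a short input like the example file, the reported position is beyond the first line break even though only one line has been returned, which is why a second reader attached to the raw stream starts in the wrong place.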
Generally speaking, most streams are consuming: once data has been read, it is no longer available. You could fork off to multiple streams by writing an intermediary class that derives from Stream and either raises an event, republishes to other streams, etc.
In your case you don't need the StreamReader. The best way to check the file contents is to use the File.ReadLines method instead. It will not load the whole file content, just the lines until you've found all that you need:
foreach (string line in File.ReadLines(filePath))
{
if( line.StartsWith("Date: "))
{
result.Rows.Add(line);
}
else if (line.StartsWith("Time: "))
{
result.Rows.Add(line);
}
else if (line.StartsWith("Seconds"))
{
break;
}
}
EDIT
You can make it even simpler using LINQ:
var d = (from line in File.ReadLines(filePath) where line.Contains("Date: ") select line).FirstOrDefault(); // first matching line, or null
if (d != null) result.Rows.Add(d);

Parsing a textfile in C# with skipping some contents

I'm trying to parse a text file that has a heading and the body. In the heading of this file, there are line number references to sections of the body. For example:
SECTION_A 256
SECTION_B 344
SECTION_C 556
This means, that SECTION_A starts in line 256.
What would be the best way to parse this heading into a dictionary and then when necessary read the sections.
Typical scenarios would be:
Parse the header and read only section SECTION_B
Parse the header and read the first paragraph of each section.
The data file is quite large and I definitely don't want to load all of it to the memory and then operate on it.
I'd appreciate your suggestions. My environment is VS 2008 and C# 3.5 SP1.
You can do this quite easily.
There are three parts to the problem.
1) How to find where a line in the file starts. The only way to do this is to read the lines from the file, keeping a list that records the start position in the file of each line, e.g.
List<long> lineMap = new List<long>();
lineMap.Add(0); // Line 0 starts at location 0 in the data file (just a dummy entry)
lineMap.Add(0); // Line 1 starts at location 0 in the data file
using (StreamReader sr = new StreamReader("DataFile.txt"))
{
String line;
long position = 0;
while ((line = sr.ReadLine()) != null)
{
// sr.BaseStream.Position cannot be used here: StreamReader reads ahead in blocks.
// Count bytes instead (assumes "\r\n" line endings and no byte order mark).
position += sr.CurrentEncoding.GetByteCount(line) + 2;
lineMap.Add(position); // start of the following line
}
}
2) Read and parse your index file into a dictionary.
Dictionary<string, int> index = new Dictionary<string, int>();
using (StreamReader sr = new StreamReader("IndexFile.txt"))
{
String line;
while ((line = sr.ReadLine()) != null)
{
string[] parts = line.Split(' '); // Break the line into the name & line number
index.Add(parts[0], Convert.ToInt32(parts[1]));
}
}
Then to find a line in your file, use:
int lineNumber = index["SECTION_B"]; // Convert section name into the line number
long offsetInDataFile = lineMap[lineNumber]; // Convert line number into file offset
Then open a new FileStream on DataFile.txt, Seek(offsetInDataFile, SeekOrigin.Begin) to move to the start of the line, and use a StreamReader (as above) to read line(s) from it.
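The seek step might look like the following sketch. It builds the line offsets by scanning raw bytes, which assumes lines end in '\n' (optionally preceded by '\r') and an encoding such as ASCII or UTF-8 where byte 0x0A can only be a line feed. LineIndex, Build and ReadLines are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineIndex
{
    // Scans the stream once and returns the byte offset at which each line
    // starts; element 0 is line 1.
    public static List<long> Build(Stream stream)
    {
        List<long> starts = new List<long>();
        starts.Add(0);
        stream.Position = 0;
        long position = 0;
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            position++;
            if (b == '\n') starts.Add(position);
        }
        // A trailing newline would otherwise register an empty final line.
        if (starts.Count > 1 && starts[starts.Count - 1] == position)
            starts.RemoveAt(starts.Count - 1);
        return starts;
    }

    // Seeks to the given 1-based line and reads up to lineCount lines.
    public static List<string> ReadLines(Stream stream, List<long> starts, int lineNumber, int lineCount)
    {
        stream.Seek(starts[lineNumber - 1], SeekOrigin.Begin);
        // A fresh reader is created after the seek so no stale buffer is used.
        StreamReader reader = new StreamReader(stream, Encoding.UTF8);
        List<string> lines = new List<string>();
        string line;
        while (lines.Count < lineCount && (line = reader.ReadLine()) != null)
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

Only the offsets are kept in memory (one long per line), so the data file itself is never loaded as a whole.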
Well, obviously you can store the name + line number into a dictionary, but that's not going to do you any good.
Well, sure, it will allow you to know which line to start reading from, but the problem is, where in the file is that line? The only way to know is to start from the beginning and start counting.
The best way would be to write a wrapper that decodes the text contents (if you have encoding issues) and can give you a line number to byte position type of mapping, then you could take that line number, 256, and look in a dictionary to know that line 256 starts at position 10000 in the file, and start reading from there.
Is this a one-off processing situation? If not, have you considered stuffing the entire file into a local database, like a SQLite database? That would allow you to have a direct mapping between line number and its contents. Of course, that file would be even bigger than your original file, and you'd need to copy data from the text file to the database, so there's some overhead either way.
Just read the file one line at a time and ignore the data until you get to the ones you need. You won't have any memory issues, but performance probably won't be great. You can do this easily in a background thread though.
Read the file until the end of the header, assuming you know where that is. Split the strings you've stored on whitespace, like so:
Dictionary<string, int> sectionIndex = new Dictionary<string, int>();
List<string> headers = new List<string>(); // fill these with readline
foreach(string header in headers) {
var s = header.Split(new[]{' '});
sectionIndex.Add(s[0], Int32.Parse(s[1]));
}
Find the dictionary entry you want, keep a count of the number of lines read in the file, and loop until you hit that line number, then read until you reach the next section's starting line. Dictionary<TKey, TValue> does not guarantee the order of its keys, so you'd probably need to look up both the current and the next section's names.
Be sure to do some error checking to make sure the section you're reading to isn't before the section you're reading from, and any other error cases you can think of.
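The counting loop described above could be sketched like this, assuming the start line of the wanted section and the start line of the next one have already been looked up in the dictionary. SectionReader and ReadRange are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SectionReader
{
    // Returns lines firstLine..lastLine (1-based, inclusive) by reading
    // sequentially and discarding everything before firstLine.
    public static List<string> ReadRange(TextReader reader, int firstLine, int lastLine)
    {
        if (lastLine < firstLine)
            throw new ArgumentException("lastLine must not precede firstLine.");

        List<string> section = new List<string>();
        string line;
        int lineNumber = 0;
        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            if (lineNumber < firstLine) continue; // not there yet
            if (lineNumber > lastLine) break;     // past the section
            section.Add(line);
        }
        return section;
    }
}
```

With the header from the question, SECTION_B would be read as ReadRange(reader, 344, 555), since SECTION_C starts at line 556. Only the requested section is held in memory, at the cost of reading through everything before it.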
You could read line by line until all the heading information is captured and stop (assuming all section pointers are in the heading). You would have the section and line numbers for use in retrieving the data at a later time.
string dataRow = "";
try
{
TextReader tr = new StreamReader("filename.txt");
while (true)
{
dataRow = tr.ReadLine();
if (dataRow == null || !dataRow.StartsWith("SECTION_")) // stop at end of file or at the first non-header line
break;
else
//Parse line for section code and line number and log values
continue;
}
tr.Close();
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
