.CSV file write by line, checking for errors

First question, so hopefully it'll be relevant. While I'm not terribly new to coding, I'm pretty new to 'good' coding, so apologies if I'm completely unaware of obvious best practice.
I'm currently writing to a CSV file row by row (using C#).
That was all well and good until I was told this morning that the department I'm writing the code for actually wants all the lines in one file. (I was writing one file per batch and error checking at batch level, so if anything was wrong with a batch, none of it was exported.)
They still need it to be all or nothing at batch level. I'm checking the 'cell' values before trying to write them to file, so that stage should work, but I'm acutely aware that the one part of my code I currently can't verify before beginning the file write is the line writes themselves. How likely is it for individual line writes to fail, assuming the file was created and opened properly for the first line?
Secondly, if this is likely to be a problem, is there any 'best practice' I should be aware of to ease (or eliminate) it?
The code below is what I call for each row in the batch that needs writing.
public void WriteRow(CsvRow row)
{
    StringBuilder builder = new StringBuilder();
    bool firstColumn = true;
    foreach (string value in row)
    {
        // Add separator if this isn't the first value
        if (!firstColumn)
            builder.Append(',');
        // Implement special handling for values that contain comma or quote
        // Enclose in quotes and double up any double quotes
        if (value.IndexOfAny(new char[] { '"', ',' }) != -1)
            builder.AppendFormat("\"{0}\"", value.Replace("\"", "\"\""));
        else
            builder.Append(value);
        firstColumn = false;
    }
    row.LineText = builder.ToString();
    WriteLine(row.LineText);
}
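One possible way to keep the all-or-nothing behaviour when every batch goes into the same file is sketched below. It is only a sketch under assumptions: the class BatchCsvWriter and the method AppendBatch are hypothetical names, the rows are assumed to be already formatted as CSV lines (for example from the StringBuilder logic above), and the file is assumed to be UTF-8. The idea is to build the whole batch in memory, remember the file length, append, and truncate back to the old length if the write throws.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BatchCsvWriter
{
    // Hypothetical helper: appends a whole batch, or rolls the file back to its previous length.
    public static void AppendBatch(string path, IEnumerable<string> formattedLines)
    {
        // Build the complete batch in memory first, so formatting problems surface before any I/O.
        var batch = new StringBuilder();
        foreach (string line in formattedLines)
            batch.AppendLine(line);
        byte[] bytes = new UTF8Encoding(false).GetBytes(batch.ToString());

        using (var stream = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None))
        {
            long originalLength = stream.Length;
            try
            {
                stream.Seek(0, SeekOrigin.End);
                stream.Write(bytes, 0, bytes.Length);
                stream.Flush(true); // ask the OS to push the data to disk
            }
            catch (IOException)
            {
                // Undo a partial write by cutting the file back to where the batch started.
                stream.SetLength(originalLength);
                throw;
            }
        }
    }
}
```

Once the file is open, individual writes rarely fail, but a full disk or a dropped network share can still interrupt them, which is the case the rollback is meant to cover. It does not protect against a process crash or power loss in the middle of the write.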

Related

Delete specific row from File

I have this little project in C# where I am manipulating files. Now my task is to delete specific rows from files.
For example my file looks like this:
1-this is the first line
2-this is the second line
3-this is the third line
4-this is the fourth line
Now how can I keep only the first two rows and delete only the last two rows?
Note: this is how I read the file from my local machine:
string[] lines = File.ReadAllLines(@"C:\Users\admin\Desktop\COMMANDS.dat");
I have tried something like this, but I think it's not so "efficient":
string text = File.ReadAllText(@"C:\Users\admin\Desktop\COMMANDS.dat");
text = text.Replace(lines[2], "");
text = text.Replace(lines[3], "");
File.WriteAllText(@"C:\Users\admin\Desktop\COMMANDS.dat", text);
So this actually does the job: it replaces those lines with empty strings. But when I look at the file, it still has 4 lines, 2 with real text and 2 that are just empty. I don't want the empty lines there. Can I do this another way?

Try removing each line together with the newline that precedes it:
string text = File.ReadAllText(@"C:\Users\admin\Desktop\COMMANDS.dat");
// Remove the line and the newline before it, so no empty line is left behind.
// Assumes the file uses Environment.NewLine as its line ending.
text = text.Replace(Environment.NewLine + lines[2], "");
text = text.Replace(Environment.NewLine + lines[3], "");
File.WriteAllText(@"C:\Users\admin\Desktop\COMMANDS.dat", text);
If my answer is useful, please mark it as accepted, and upvote it.

async Task Example()
{
    var inputLines = await File.ReadAllLinesAsync("path/to/file.txt");
    var outputLines = inputLines.Where((l, i) => i < 2);
    await File.WriteAllLinesAsync("target/file.txt", outputLines);
}
What it does:
Reads the data not as one string but as a collection of lines
Creates a new collection containing only the lines you want in your output
Writes the filtered lines
Notes:
This example is not optimized for memory usage: it reads all lines at once, which will fail for very large files (e.g. multiple GB). See the existing answers for a memory-optimized version. But it's totally fine to do it this way if you know you have just a few thousand lines (and it's faster).
Try not to "modify" strings. This will always create a copy and needs a lot of memory.
In this "LINQ style" (functional) approach, we treat data as immutable. That means we have one variable that represents the input file and one variable that represents the result. We use declarative LINQ to describe what the output should look like: "output is input where the line index is < 2" instead of "if xy remove line" in an imperative style.
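For files too large to load at once, a streaming variant might look like the sketch below. The names LineFilter and KeepFirstLines and the ".tmp" suffix are assumptions made for illustration. File.ReadLines enumerates the file lazily, so only the current line is held in memory, and the filtered copy replaces the original only after it has been written completely.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LineFilter
{
    // Hypothetical helper: keeps only the first `count` lines of the file at `path`.
    public static void KeepFirstLines(string path, int count)
    {
        string tempPath = path + ".tmp"; // assumed temp-file naming

        // File.ReadLines is lazy: lines are read one at a time as WriteAllLines consumes them.
        IEnumerable<string> kept = File.ReadLines(path).Take(count);
        File.WriteAllLines(tempPath, kept);

        // Swap the filtered copy in only after it was written completely.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

Calling LineFilter.KeepFirstLines on the four-line example file with a count of 2 leaves only the first two lines in it.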

Shift text file rows by one deleting the first line

How do I shift the rows of a text file up by one after deleting a particular line, based on a condition (xyz == true) for that line? I want to use C#.
using (StreamWriter writer = new StreamWriter(file))
{
    while ((line = rows[rowIndex++]) != null)
    {
        if (line contains xyz) // pseudocode; I know the logic to find this
        {
            // delete this line and shift all lines below it up by one
        }
    }
}

A simple way to do this for small files (files which can be completely loaded into memory) is to use File.ReadAllLines() to get a string[] of the lines. With this, you can loop through the array of strings and test each one to determine if it should be written to the new file or not.
A similar question came up recently here: C# Delete line i text file if it is different than specific term
If you're writing the output to a new file, you don't need to worry about moving the later lines up: leaving out an entire line also leaves out its line-ending characters.
Hope this helps
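A minimal sketch of that approach, assuming the file fits in memory. LineRemover, RemoveMatchingLines and the marker parameter are hypothetical names, and the Contains check merely stands in for whatever the real "xyz" condition is.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LineRemover
{
    // Hypothetical helper: copies every line that does NOT contain `marker` to a new file.
    public static void RemoveMatchingLines(string inputPath, string outputPath, string marker)
    {
        string[] lines = File.ReadAllLines(inputPath);

        // Skipped lines simply never reach the output, so nothing has to be shifted up.
        IEnumerable<string> kept = lines.Where(line => !line.Contains(marker));

        File.WriteAllLines(outputPath, kept);
    }
}
```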

Removing text above real content of CSV file

I have a CSV whose author, annoyingly enough, has decided to 'introduce' the file before the contents themselves. So in all, I have a CSV that looks like:
This file was created by XXXXYY and represents the crossover between YY and QQQ.
Additional information can be found through the website GG, blah blah blah...
Jacob, Hybrid
Dan, Pure
Lianne, Hybrid
Jack, Hatchback
So the problem here is that I want to get rid of the first few lines before the 'real content' of the CSV file begins. I'm looking for robustness here, so using StreamReader and removing all content before the 4th line, for example, is not ideal (plus the length of the text can vary).
Is there a way in which one can read only what matters and write a new CSV into a directory path?
Regards,
genesis
(edit - I'm looking for C sharp code)

The solution depends on the files you have to parse. You need to look for a reliable pattern that distinguishes data from comment.
In your example, there are some possibilities that might be the same in other files:
There are 4 lines of text. But you say this isn't consistent across files.
The text lines may not contain the same number of commas as the data table. But that is unlikely to be reliable for all files.
There is a blank/whitespace-only line between the text and the data.
The data appears to be in the form word-comma-word. If this is true, it should be easy to identify non-data lines (any line which doesn't contain exactly one comma, or has multiple words, etc.).
You may be able to use a combination of these heuristics to more reliably detect the data.

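You could scan line by line (splitting on \r\n) and ignore lines whose comma count doesn't match your CSV.
You should be able to read the file into a string pretty easily unless it is really massive.
e.g.
var csv = "some test\r\nsome more text\r\na,b,c\r\nd,e,f\r\n";
// Split needs a string[] separator for the two-character "\r\n" sequence
var lines = csv.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
// Keep only lines with exactly two commas (three columns); requires using System.Linq
var csvLines = lines.Where(l => l.Count(c => c == ',') == 2);
// now csvLines contains only the lines you are after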

List<string> info = new List<string>();
int counter = 0;
// Open the file to read from.
info = System.IO.File.ReadAllLines(path).ToList();
// Find the lines up until (& including) the empty one
foreach (string s in info)
{
    counter++;
    if (string.IsNullOrEmpty(s))
        break; // exit from the loop
}
// Remove the lines including the blank one.
info.RemoveRange(0, counter);
Something like this should work. You should probably add a check that a blank line was actually found (otherwise counter ends up equal to the number of lines and everything is removed), plus other tests to handle errors.
You could adapt this code so that it just finds the empty line's index using LINQ or something, but I don't like the overhead of LINQ (yeah, ironic considering I'm using C#).
Regards,
Slipoch
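For comparison, the same idea with the missing-blank-line check built in could be sketched as follows. PreambleStripper and StripPreamble are hypothetical names, and the sketch assumes, as the answers above do, that a blank or whitespace-only line separates the introduction from the data.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PreambleStripper
{
    // Hypothetical helper: returns the lines that follow the first blank line.
    // If no blank line exists, the input is returned unchanged instead of being emptied.
    public static List<string> StripPreamble(IEnumerable<string> lines)
    {
        List<string> all = lines.ToList();
        int blankIndex = all.FindIndex(string.IsNullOrWhiteSpace);
        return blankIndex < 0 ? all : all.Skip(blankIndex + 1).ToList();
    }
}
```

The result can be written to a new CSV with File.WriteAllLines(outputPath, PreambleStripper.StripPreamble(File.ReadLines(inputPath))), where outputPath and inputPath are placeholders.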

Simple csv comma delimited html table gen

I know the question has been asked before, but I wasn't quite satisfied with the answer (considering it didn't explain what was going on).
My specific question is: how do I open a comma-delimited CSV/text file whose rows are separated by returns and put it into an HTML table using C#?
I understand how to do this in PHP, but I have just started learning ASP.NET/C#. If anyone has some free resources for C# and/or is able to provide me with a snippet of code with some explanation of what's going on, I would appreciate it.
I have this code, but I'm not sure how I would use it because A) I don't know how C# arrays work and B) I'm not sure how to open files in ASP.NET C#:
var lines =File.ReadAllLines(args[0]);
using (var outfs = File.AppendText(args[1]))
{
outfs.Write("<html><body><table>");
foreach (var line in lines)
outfs.Write("<tr><td>" + string.Join("</td><td>", line.Split(',')) + "</td></tr>");
outfs.Write("</table></body></html>");
}
I apologize for my glaring inexperience here.

The code sample you posted does exactly what you're asking:
1. Opens a file.
2. Writes a string of HTML for a table with the contents of the file from step 1 to another file.
Let's break it down:
var lines =File.ReadAllLines(args[0]);
This opens the file specified in args[0] and reads all the lines into a string array, one line per element. See File.ReadAllLines Method (String).
using (var outfs = File.AppendText(args[1]))
{
File.AppendText Method creates a StreamWriter to append text to an existing file (or creates it if it doesn't exist). The filename (and path, possibly) are in args[1]. The using statement puts the StreamWriter into what is called a using block, to ensure the stream is correctly disposed once the using block is left. See using Statement (C# Reference) for more information.
outfs.Write("<html><body><table>");
outfs.Write calls the Write method of the StreamWriter (StreamWriter.Write Method (String)). Note that the text is not written straight to the file: it goes into a buffer first. For a small snippet like yours, nothing may reach the file until you exit the using block, which flushes the buffer and writes it to the file.
foreach (var line in lines)
This statement starts a loop through all the elements in the string array lines, starting with the first element (index 0). See foreach, in (C# Reference) for more information if you need it.
outfs.Write("<tr><td>" + string.Join("</td><td>", line.Split(',')) + "</td></tr>");
String.Join is the key part here, where most of the work is done. String.Join Method (String, String[]) has the technical details, but essentially what is happening here is that the second argument (line.Split(',')) is passing in an array of strings, and the strings in that array are then being concatenated together with the first argument (</td><td>) as the separator, and the table row is being opened and closed.
For example, if the line is "1,2,3,4,5,6", the Split gives you a 6 element array. This array is then concatenated with </td><td> as the separator by String.Join, so you have "1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6". "<tr><td>" is added to the front and "</td></tr>" is added to the end, and the final line is "<tr><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td></tr>".
outfs.Write("</table></body></html>");
}
This writes the end of the HTML to the buffer, which is then flushed and written to the specified text file.
A couple of things to note. args[0] and args[1] are used to hold command line arguments (i.e., MakeMyTable.exe InFile.txt OutFile.txt), which aren't (in my experience) applicable to ASP.NET applications. You'll need to either code the files (and paths) necessary, or allow the user to specify the input file and/or output file. The ASP.NET application will need to be running under an account that has permission to access those files as well.
If you have quoted values in the CSV file, you'll need to handle those (this is very common when dealing with monetary amounts, for example), as splitting on the , may cause an incorrect split. I recommend taking a look at TextFieldParser, as it can handle quoted fields quite easily.
Unless you're sure that each line in the file has the same number of fields, you run the risk of having poorly formed HTML in your table and no guarantees on how it will render.
Additionally, it would be advisable to test that the file you're opening exists. There's probably more, but these are the basics (and may already be beyond the scope of Stack Overflow).
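To illustrate the point about quoted values, here is a rough sketch that uses TextFieldParser (from the Microsoft.VisualBasic.FileIO namespace) instead of Split. CsvHtmlTable and ToHtmlTable are hypothetical names. The HTML encoding of each field is an addition not discussed above, and the project is assumed to reference the Microsoft.VisualBasic assembly.

```csharp
using System.IO;
using System.Net;
using System.Text;
using Microsoft.VisualBasic.FileIO;

public static class CsvHtmlTable
{
    // Hypothetical helper: turns comma-delimited text into an HTML table string.
    public static string ToHtmlTable(TextReader csv)
    {
        var html = new StringBuilder("<table>");
        using (var parser = new TextFieldParser(csv))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // "1,200" stays one field

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                html.Append("<tr>");
                foreach (string field in fields)
                {
                    // Encode the value so characters like < or & cannot break the markup.
                    html.Append("<td>").Append(WebUtility.HtmlEncode(field)).Append("</td>");
                }
                html.Append("</tr>");
            }
        }
        return html.Append("</table>").ToString();
    }
}
```

Returning a string rather than writing to a file keeps the sketch usable from an ASP.NET page as well as from a console program. A file can be passed in with new StreamReader(path).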

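Hopefully this will help point you in the right direction:
// Requires using System.Text; lines is the string[] read from the file.
var html = new StringBuilder("<table>");
foreach (string line in lines)
{
    // Strings are immutable and the loop variable can't be reassigned,
    // so append the value that Replace returns instead of modifying line.
    html.Append("<tr><td>");
    html.Append(line.Replace(",", "</td><td>"));
    html.Append("</td></tr>");
}
html.Append("</table>");
Response.Write(html.ToString());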
Good luck with your learning!

Reading line by line

I have a program that generates a plain text file. The structure (layout) is always the same. Example:
Text File:
LinkLabel
"Hello, this text will appear in a LinkLabel once it has been
added to the form. This text may not always cover more than one line. But will always be surrounded by quotation marks."
240, 780
So, to explain what is going on in that file:
Control
Text
Location
When a button on the Form is clicked and the user opens one of these files from the OpenFileDialog dialog, I need to be able to read each line. Starting from the top, I want to check which control it is. Then, starting on the second line, I need to get all the text inside the quotation marks (regardless of whether it is one line of text or more). On the next line (after the closing quotation mark), I need to extract the location (240, 780). I have thought of a few ways of going about this, but when I write them down and put them into practice, they don't make much sense and I end up figuring out ways they won't work.
Has anybody ever done this before? Would anybody be able to provide any help, suggestions or advice on how I'd go about doing this?
I have looked up CSV files but that seems too complicated for something that seems so simple.
Thanks
jase

You could use a regular expression to get the lines from the text:
MatchCollection lines = Regex.Matches(File.ReadAllText(fileName), @"(.+?)\r\n""([^""]+)""\r\n(\d+), (\d+)\r\n");
foreach (Match match in lines) {
    string control = match.Groups[1].Value;
    string text = match.Groups[2].Value;
    int x = Int32.Parse(match.Groups[3].Value);
    int y = Int32.Parse(match.Groups[4].Value);
    Console.WriteLine("{0}, \"{1}\", {2}, {3}", control, text, x, y);
}

I'll try and write down the algorithm, the way I solve these problems (in comments):
// while not at end of file
//     read control
//     read line of text
//     while last char in line is not "
//         read line of text
//     read location
Try and write code that does what each comment says and you should be able to figure it out.
HTH.

You are trying to implement a parser and the best strategy for that is to divide the problem into smaller pieces. And you need a TextReader class that enables you to read lines.
You should separate your ReadControl method into three methods: ReadControlType, ReadText, ReadLocation. Each method is responsible for reading only the item it should read and leave the TextReader in a position where the next method can pick up. Something like this.
public Control ReadControl(TextReader reader)
{
string controlType = ReadControlType(reader);
string text = ReadText(reader);
Point location = ReadLocation(reader);
... return the control ...
}
Of course, ReadText is the most interesting one, since it spans multiple lines. In fact it's a loop that calls TextReader.ReadLine until the line ends with a quotation mark:
private string ReadText(TextReader reader)
{
    string line = reader.ReadLine();
    string text = line.Substring(1); // Strip first quotation mark.
    while (!text.EndsWith("\""))
    {
        line = reader.ReadLine();
        if (line == null) // End of file reached: stop instead of looping forever.
            throw new InvalidDataException("Closing quotation mark not found.");
        text += line;
    }
    return text.Substring(0, text.Length - 1); // Strip last quotation mark.
}
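Putting the three readers together, a self-contained sketch might look like this. Everything here is an assumption made for illustration: the ControlRecord type (used instead of a real Control and Point so the example has no UI dependency), the helper names, and the choice to join wrapped text lines with a single space.

```csharp
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical container for one parsed record.
public sealed class ControlRecord
{
    public string ControlType;
    public string Text;
    public int X;
    public int Y;
}

public static class ControlFileParser
{
    public static ControlRecord ReadControl(TextReader reader)
    {
        var record = new ControlRecord();
        record.ControlType = ReadRequiredLine(reader).Trim();
        record.Text = ReadText(reader);
        ReadLocation(reader, out record.X, out record.Y);
        return record;
    }

    private static string ReadRequiredLine(TextReader reader)
    {
        string line = reader.ReadLine();
        if (line == null)
            throw new InvalidDataException("Unexpected end of file.");
        return line;
    }

    private static string ReadText(TextReader reader)
    {
        string first = ReadRequiredLine(reader).Trim();
        if (first.Length == 0 || first[0] != '"')
            throw new InvalidDataException("Text must start with a quotation mark.");

        var text = new StringBuilder(first.Substring(1)); // drop the opening quote
        // Keep reading until the accumulated text ends with the closing quote.
        while (text.Length == 0 || text[text.Length - 1] != '"')
            text.Append(' ').Append(ReadRequiredLine(reader).Trim());

        text.Length--; // drop the closing quote
        return text.ToString();
    }

    private static void ReadLocation(TextReader reader, out int x, out int y)
    {
        string[] parts = ReadRequiredLine(reader).Split(',');
        if (parts.Length != 2)
            throw new InvalidDataException("Location must look like \"240, 780\".");
        x = int.Parse(parts[0].Trim(), CultureInfo.InvariantCulture);
        y = int.Parse(parts[1].Trim(), CultureInfo.InvariantCulture);
    }
}
```

Each reader leaves the TextReader positioned at the start of the next item, so ReadControl can be called in a loop if a file ever holds more than one record.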

This kind of stuff gets irritating: it's conceptually simple, but you can end up with gnarly code. You've got a comparatively simple case: one record per file. It gets much harder if you have lots of records and you want to deal nicely with badly formed records (consider writing a parser for a language such as C#).
For large scale problems one might use a grammar driven parser such as this: link text
Much of your complexity comes from the lack of regularity in the file. The first field is terminated by a newline, the second is delimited by quotes, the third is terminated by a comma ...
My first recommendation would be to adjust the format of the file so that it's really easy to parse. You write the file, so you're in control. For example, just don't have new lines in the text, and put each item on its own line. Then you can just read four lines, job done.
