Parsing CSV data - c#

I am trying to parse a CSV file, with no luck. I have tried a bunch of online tools and none of them has been able to parse the file correctly. I am baffled that I am here asking for help, as one would think parsing CSV data would be super easy.
The format of the CSV data is like this:
",95,54070,3635,""Test Reservation"",0,102,0.00,0.00,2014-12-31,""Name of customer"",""$12.34 + $10, special price"",""extra information"",,CustomerName,,,,,1234567890,youremail#domain.com,CustomerName,2014-12-31,23:59:59,16,0,60,2,120,0,NULL,NULL,NULL,"
Current code:
private void btnOpenFileDialog_Click(object sender, EventArgs e)
{
DialogResult result = openFileDialog1.ShowDialog();
if (result == DialogResult.OK)
{
using (StreamReader reader = new StreamReader(openFileDialog1.FileName))
{
string line;
while ((line = reader.ReadLine()) != null)
{
ParseCsvLine(line);
}
}
}
}
private void ParseCsvLine(string line)
{
if (line != string.Empty)
{
string[] result;
using (var csvParser = new TextFieldParser(new StringReader(line)))
{
csvParser.Delimiters = new string[] { "," };
result = csvParser.ReadFields();
}
foreach (var item in result)
{
Console.WriteLine(item + Environment.NewLine);
}
}
}
The result variable only has one item, and it is:
,95,54070,3635,"Test Reservation",0,102,0.00,0.00,2014-12-31,"Name of customer","$12.34 + $10, special price","extra information",,CustomerName,,,,,1234567890,youremail#domain.com,CustomerName,2014-12-31,23:59:59,16,0,60,2,120,0,NULL,NULL,NULL,

// Add Microsoft.VisualBasic.dll to References.
using Microsoft.VisualBasic.FileIO;
// input is your original line from csv.
// Remove starting and ending quotes.
input = input.Remove(0, 1);
input = input.Remove(input.Length - 1);
// Collapse each doubled double quote ("") into a single double quote (").
input = input.Replace("\"\"", "\"");
string[] result;
using (var csvParser = new TextFieldParser(new StringReader(input)))
{
csvParser.Delimiters = new string[] { "," };
result = csvParser.ReadFields();
}

You can check out a previous post that deals with those pesky commas in csv files. I'm linking it here.
Also Mihai, your solution works well for just the one line but will fail once there are many lines to parse.
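For many lines, one option is to let TextFieldParser do the unwrapping itself. Each line in the question is a single quoted CSV field, so parsing it once yields the inner text (which is exactly what the question observed), and parsing that text a second time yields the real fields. The following is a minimal sketch, assuming every row is wrapped this way and no field contains a line break; the names WrappedCsv, ParseFields, ParseWrappedLine and ParseFile are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class WrappedCsv
{
    // Parses one line of comma-delimited text into fields, honouring quotes.
    public static string[] ParseFields(string text)
    {
        using (var parser = new TextFieldParser(new StringReader(text)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.Delimiters = new[] { "," };
            parser.HasFieldsEnclosedInQuotes = true;
            // ReadFields returns null when there is nothing left to read.
            return parser.ReadFields() ?? new string[0];
        }
    }

    // First pass strips the outer quotes and un-doubles the inner ones;
    // second pass splits the unwrapped text into the real fields.
    public static string[] ParseWrappedLine(string line)
    {
        string[] outer = ParseFields(line);
        return outer.Length == 1 ? ParseFields(outer[0]) : outer;
    }

    // Streams a file line by line (assumption: one record per physical line).
    public static IEnumerable<string[]> ParseFile(string path)
    {
        foreach (string line in File.ReadLines(path))
        {
            if (line.Length > 0)
                yield return ParseWrappedLine(line);
        }
    }
}
```

Calling WrappedCsv.ParseFile(openFileDialog1.FileName) from the click handler would then replace the manual Remove/Replace calls.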

Related

How to delete a certain row in Excel?

I have watched a lot of tutorials on how to delete a certain row in Excel.
Please help me delete a row in Excel using C#.
The fileReader, fileWriter and stringSplitter methods are already working. My only problem now is how to delete a certain row in Excel.
Class Variable
public static string fileName = @".\Contestant.csv";
public static string[,] contestant;
Main Method
List<string> lines = fileReader(fileName);
while (i < lines.Count)
{
string[] temp = stringSplitter(lines[i], new char[] { ',' });
// a contains how many elements in the array
a = temp.Count();
// divides a and plus by 1 to know how many arrays there should be in the 2d array
d = (a / 2) + 1;
contestant = new string[a, d];
This is my code for FileReader
static List<string> fileReader(string filePath)
{
List<string> lines = new List<string>();
try
{
using (StreamReader sr = new StreamReader(filePath))
{
string line = "";
while ((line = sr.ReadLine()) != null)
{
lines.Add(line);
}
}
}
catch (Exception e)
{
Console.WriteLine("Error Message: Please close the file and try again");
//Console.WriteLine(e); for more detailed errors
}
return lines;
}
Here's my code for FileWriter
static void fileWriter(string filePath, bool appendFlag, string message)
{
using (StreamWriter sr = new StreamWriter(filePath, appendFlag))
{
sr.WriteLine(message);
}
}
This is for Splitter String
static string[] stringSplitter(string stringToSplit, char[] splitChars)
{
return stringToSplit.Split(splitChars);
}
I would recommend doing all of the data manipulation inside the lists, then replacing the whole document with the new content. So: read everything -> manipulate -> replace your document with the new content.
Also don't forget to close your FileStreams after reading/writing.
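The read -> manipulate -> replace idea could look roughly like the sketch below. It assumes the row to delete is identified by its zero-based index in the file; CsvRowRemover, RemoveRow and RemoveRowFromFile are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvRowRemover
{
    // Returns a copy of the lines with the row at rowIndex (zero-based) removed.
    public static List<string> RemoveRow(IEnumerable<string> lines, int rowIndex)
    {
        var result = new List<string>(lines);
        if (rowIndex < 0 || rowIndex >= result.Count)
            throw new ArgumentOutOfRangeException(nameof(rowIndex));
        result.RemoveAt(rowIndex);
        return result;
    }

    // Reads the whole file, drops one row, then overwrites the file.
    public static void RemoveRowFromFile(string filePath, int rowIndex)
    {
        List<string> remaining = RemoveRow(File.ReadAllLines(filePath), rowIndex);
        File.WriteAllLines(filePath, remaining);
    }
}
```

File.ReadAllLines and File.WriteAllLines open and close the file themselves, so there is no stream left to close by hand.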

Read file, check correctness of column, write file C#

I need to check certain columns of data to make sure there are no trailing blank spaces. At first I thought it would be very easy, but after attempting it I have got stuck.
I know that there should be 6 digits in the column I need to check. If there are fewer, I will reject the row; if there are more characters, I will trim the blank spaces. After doing that for the entire file, I want to write it back to the file with the same delimiters.
This is my attempt:
Everything seems to be working correctly except for writing the file.
if (File.Exists(filename))
{
using (StreamReader sr = new StreamReader(filename))
{
string lines = sr.ReadLine();
string[] delimit = lines.Split('|');
while (delimit[count] != "COLUMN_DATA_TO_CHANGE")
{
count++;
}
string[] allLines = File.ReadAllLines(filename);
foreach(string nextLine in allLines.Skip(1)){
string[] tempLine = nextLine.Split('|');
if (tempLine[count].Length == 6)
{
checkColumn(tempLine);
writeFile(tempLine);
}
else if (tempLine[count].Length > 6)
{
tempLine[count] = tempLine[count].Trim();
checkColumn(tempLine);
}
else
{
throw new Exception("Not enough numbers");
}
}
}
}
}
public static void checkColumn(string[] str)
{
for (int i = 0; i < str[count].Length; i++)
{
char[] c = str[count].ToCharArray();
if (!Char.IsDigit(c[i]))
{
throw new Exception("A non-digit is contained in data");
}
}
}
public static void writeFile(string[] str)
{
string temp;
using (StreamWriter sw = new StreamWriter(filename+ "_tmp", false))
{
StringBuilder builder = new StringBuilder();
bool firstColumn = true;
foreach (string value in str)
{
if (!firstColumn)
{
builder.Append('|');
}
if (value.IndexOfAny(new char[] { '"', ',' }) != -1)
{
builder.AppendFormat("\"{0}\"", value.Replace("\"", "\"\""));
}
else
{
builder.Append(value);
}
firstColumn = false;
}
temp = builder.ToString();
sw.WriteLine(temp);
}
}
If there is a better way to go about this, I would love to hear it. Thank you for looking at the question.
edit:
file structure-
country| firstname| lastname| uniqueID (column I am checking)| address| etc
USA|John|Doe|123456 |5 main street|
notice the blank space after the 6
var oldLines = File.ReadAllLines(filePath);
var newLines = oldLines.Select(FixLine).ToArray();
File.WriteAllLines(filePath, newLines);
string FixLine(string oldLine)
{
string fixedLine = ....
return fixedLine;
}
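One possible shape for a FixLine helper is sketched below. It is only an illustration: it assumes the file is pipe-delimited as in the question, takes the index of the column to check as a second parameter (in the sample row the unique ID is at index 3), and rejects a row by throwing. The class name ColumnFixer is hypothetical.

```csharp
using System;
using System.Linq;

public static class ColumnFixer
{
    // Trims the checked column and verifies it holds exactly six ASCII digits.
    public static string FixLine(string oldLine, int columnIndex)
    {
        string[] fields = oldLine.Split('|');
        string value = fields[columnIndex].Trim();

        bool sixDigits = value.Length == 6 && value.All(c => c >= '0' && c <= '9');
        if (!sixDigits)
            throw new FormatException("Expected six digits but found '" + value + "'.");

        fields[columnIndex] = value;
        // Re-join with the same delimiter so the rest of the row is untouched.
        return string.Join("|", fields);
    }
}
```

With a two-argument helper like this, the Select call would become oldLines.Skip(1).Select(line => ColumnFixer.FixLine(line, 3)), with the header line written back unchanged.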
The main problem with writing the file is that you're opening the output file for each output line, and you're opening it with append=false, which causes the file to be overwritten every time. A better approach would be to open the output file one time (probably right after validating the input file header).
Another problem is that you're opening the input file a second time with .ReadAllLines(). It would be better to read the existing file one line at a time in a loop.
Consider this modification:
using (StreamWriter sw = new StreamWriter(filename+ "_tmp", false))
{
string nextLine;
while ((nextLine = sr.ReadLine()) != null)
{
string[] tempLine = nextLine.Split('|');
...
writeFile(sw, tempLine);
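A complete version of the open-once idea might look like the following sketch. It works on a TextReader and TextWriter so the writer is created a single time by the caller; PipeFileCleaner and Clean are hypothetical names, and the digit validation from the question is left out to keep the example short.

```csharp
using System;
using System.IO;

public static class PipeFileCleaner
{
    // Copies input to output, trimming one named column in every data row.
    public static void Clean(TextReader input, TextWriter output, string columnName)
    {
        string header = input.ReadLine();
        if (header == null)
            return; // empty input, nothing to do

        // Header names in the sample have stray spaces, so compare trimmed names.
        int column = Array.FindIndex(header.Split('|'), h => h.Trim() == columnName);
        if (column < 0)
            throw new InvalidDataException("Column not found: " + columnName);

        output.WriteLine(header);

        string line;
        while ((line = input.ReadLine()) != null)
        {
            string[] fields = line.Split('|');
            fields[column] = fields[column].Trim();
            output.WriteLine(string.Join("|", fields));
        }
    }
}
```

A caller would wrap it as using (var sr = new StreamReader(filename)) using (var sw = new StreamWriter(filename + "_tmp", false)) { PipeFileCleaner.Clean(sr, sw, "COLUMN_DATA_TO_CHANGE"); } so that both files are opened exactly once.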

Extract data from text file

I need to extract some data from a text file and insert to columns in excel sheet. I know how to do this if the rows and the length of the string is known.
try
{
using (System.IO.StreamReader sr = new System.IO.StreamReader("test.txt"))
{
string line;
while ((line = sr.ReadLine()) != null)
{
listSNR.Items.Add(line.Substring (78,4));
}
}
}
But the particular text file is complex and the starting index or the length cannot be provided. But the starting word (PCPU01) of the row is known.
Eg: PCPU01,T2716,0.00,0.01,0.00,0.00
output:
T2716 0 0.01 0 0
In that case can somebody please let me know how to extract the texts?
using(System.IO.StreamReader sr = new System.IO.StreamReader("test.txt"))
{
string line;
while((line = sr.ReadLine()) != null)
{
string[] split = line.Split(',');
//...
}
}
split[0] will return "PCPU01", split[1] "T2716" and so on.
You can split one string into an array of strings, separated by a given character. This way, you could split the source string by a comma and use the resulting strings to build your output. Example:
string source = "PCPU01,T2716,0.00,0.01,0.00,0.00";
string[] parts = source.Split(',');
StringBuilder result = new StringBuilder();
result.Append(parts[1]); // The second element in the array, i.e. T2716
result.Append(" ");
result.Append(parts[2]); // 0.00
... // And so on...
return result.ToString(); // return a string, not a StringBuilder
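Since only the starting word of the row is known, the split can be combined with a check on the first field. The sketch below assumes the fields after the second one are numbers written with a dot as the decimal separator, and reformats them so that 0.00 becomes 0 as in the question's expected output; RowExtractor and ExtractRow are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class RowExtractor
{
    // Returns the space-separated output for a matching row, or null otherwise.
    public static string ExtractRow(string line, string firstWord)
    {
        string[] parts = line.Split(',');
        if (parts.Length < 2 || parts[0] != firstWord)
            return null; // not the row we are looking for

        var output = new List<string> { parts[1] };
        foreach (string raw in parts.Skip(2))
        {
            // Parse and re-format so trailing zeros are dropped (0.00 -> 0).
            double number = double.Parse(raw, CultureInfo.InvariantCulture);
            output.Add(number.ToString(CultureInfo.InvariantCulture));
        }
        return string.Join(" ", output);
    }
}
```

Inside the ReadLine loop, each non-null result of RowExtractor.ExtractRow(line, "PCPU01") is one row to send to the sheet.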
I hope this helps a little bit. You might have to tweak it to your needs, but this higher-level code gives you a general idea of how to extract data from a text file.
DialogResult result = openFileDialog.ShowDialog();
Collection<Info> _infoCollection = new Collection<Info>();
Collection<string> listOfSubDomains = new Collection<string>();
string[] row;
string line;
// READ THE FILE, STORE EACH ROW IN AN INFO OBJECT AND STORE THAT INFO OBJECT IN THE COLLECTION
try
{
using (StreamReader reader = new StreamReader(openFileDialog.FileName))
{
while((line = reader.ReadLine()) != null)
{
Info _info = new Info();
row = line.Split(' ');
_info.FirstName = row[0];
_info.LastName = row[1];
_info.Email = row[2];
_info.Id = Convert.ToInt32(row[3]);
_infoCollection.Add(_info);
}
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
Thanks for the answers. What I wanted was to identify a particular line in the text file and split that line into columns. I was able to do this by calling a GetLine method:
string line15 = GetLine(@"test.txt", 15);
public string GetLine(string fileName, int line)
{
using (System.IO.StreamReader ssr = new System.IO.StreamReader(fileName))
//using (var ssr = new StreamReader(fileName))
{
for (int i = 1; i < line; i++)
ssr.ReadLine();
return ssr.ReadLine();
}
}
Then I split this line using the delimiter (,).
This was my approach in C#. It takes a string input (which you can get from a text file) and an int for the line you want. It splits the string at the separator character ('\n') into a list. If the requested zero-based line number is lower than the count of that list, the entry is returned; otherwise an empty string is returned.
public string GetLine(string multiline,int line)
{
List<string> lines = new List<string>();
lines = multiline.Split('\n').ToList<string>();
return lines.Count > line ? lines[line] : "";
}

How to use textfieldParser to edit a CSV file?

I wrote a small function that reads a CSV file line by line using TextFieldParser, edits a specific field, then writes it back to a CSV file.
Here is the code :
private void button2_Click(object sender, EventArgs e)
{
String path = @"C:\file.csv";
String dpath = @"C:\file_processed.csv";
List<String> lines = new List<String>();
if (File.Exists(path))
{
using (TextFieldParser parser = new TextFieldParser(path))
{
String line;
parser.HasFieldsEnclosedInQuotes = true;
parser.Delimiters = new string[] { "," };
while ((line = parser.ReadLine()) != null)
{
string[] parts = parser.ReadFields();
if (parts == null)
{
break;
}
if ((parts[12] != "") && (parts[12] != "0"))
{
parts[12] = parts[12].Substring(0, 3);
//MessageBox.Show(parts[12]);
}
lines.Add(line);
}
}
using (StreamWriter writer = new StreamWriter(dpath, false))
{
foreach (String line in lines)
writer.WriteLine(line);
}
MessageBox.Show("CSV file successfully processed ! ");
}
}
The field I want to edit is the 12th one (parts[12]):
for example : if parts[12] = 000,000,234 then change to 000
The file is created, but the problem is that the field is not edited and half the records are missing. I am hoping someone can point out the mistake.
You call both parser.ReadLine() and parser.ReadFields(). Each of them advances the cursor by one line. That's why you're missing half the rows. Change the while to:
while(!parser.EndOfData)
Then keep string[] parts = parser.ReadFields(); as the only read inside the loop. Your edit isn't being seen because the unmodified line, not the edited parts, is what gets added to lines.
You can also remove:
if (parts == null)
{
break;
}
Since you no longer have line, you'll need to use the fields to keep track of your results:
lines.Add(string.Join(",", parts));//handle string escaping fields if needed.
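Putting those changes together, the loop could be sketched as below. This is an illustration, not the asker's final code: it reads from a TextReader and writes to a TextWriter, takes the field index as a parameter, and re-quotes fields on output with a small Escape helper. CsvFieldEditor, Escape and TruncateField are hypothetical names.

```csharp
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

public static class CsvFieldEditor
{
    // Quotes a field when it contains a comma, a quote or a line break.
    public static string Escape(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Keeps only the first three characters of one field in every row.
    public static void TruncateField(TextReader input, TextWriter output, int fieldIndex)
    {
        using (var parser = new TextFieldParser(input))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.Delimiters = new[] { "," };
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                // The only read per iteration, so no rows are skipped.
                string[] parts = parser.ReadFields();

                if (fieldIndex < parts.Length && parts[fieldIndex].Length > 3)
                    parts[fieldIndex] = parts[fieldIndex].Substring(0, 3);

                // Write the edited fields, not the original raw line.
                output.WriteLine(string.Join(",", parts.Select(Escape)));
            }
        }
    }
}
```

The length check replaces the empty-string and "0" tests from the question, since neither of those values is longer than three characters.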

How to read values from a comma separated file?

I want to read the words in a line of a text file, separated by commas, in C#.
For example, I want to read this line:
9/10/2011 10:05,995.4,998.8,995.4,997.5,118000
and get the values: 9/10/2011 10:05, 995.4, 998.8, 995.4, 997.5 and 118000.
Next, I also need to change the format of the date to MMddYYYY, and of the time to HHmmss (e.g. 100500).
I am using this code for reading. Is there anything wrong with it?
private void button1_Click(object sender, EventArgs e)
{
StreamReader reader1 = File.OpenText(Path1);
string str = reader1.ReadToEnd();
reader1.Close();
reader1.Dispose();
// File.Delete(Path1);
string[] Strarray = str.Split(new char[] { Strings.ChrW(7) });
int abc = Strarray.Length - 1;
int xyz = 0;
bool status = true;
while (xyz <= abc)
{
try
{
status = true;
string[] strarray1 = Strarray[xyz].Split(",".ToCharArray());
string SecName = strarray1[0];
int a2 = 0;
while (status) //If the selected list is empty or the text file has selected name this will execute
{
status = false;
string SecSym = strarray1[1];
int DT = int.Parse(strarray1[2]);
int TM = int.Parse(strarray1[3]);
float O = float.Parse(strarray1[2]);
float H = float.Parse(strarray1[3]);
float L = float.Parse(strarray1[4]);
float C = float.Parse(strarray1[5]);
double OI = double.Parse(Convert.ToString(0));
float V = float.Parse(strarray1[6]);
// string a = string.Concat(SecName, ",",SecSym,",", DT, ",", TM, ",", O, ",", H, ",", L);
//writer.WriteLine(a);
}
}
catch
{ }
}
}
}
.NET comes with a ready-made CSV parser you can use to get your data. It's part of VB.NET, but you can easily use it in C# by adding a reference to the assembly Microsoft.VisualBasic (it's OK, honestly), and a using statement: using Microsoft.VisualBasic.FileIO;.
The code should be simple to understand:
List<String[]> fileContent = new List<string[]>();
using(FileStream reader = File.OpenRead(@"data.csv")) // mind the encoding - UTF8
using(TextFieldParser parser = new TextFieldParser(reader))
{
parser.TrimWhiteSpace = true; // if you want
parser.Delimiters = new[] { "," };
parser.HasFieldsEnclosedInQuotes = true;
while (!parser.EndOfData)
{
string[] line = parser.ReadFields();
fileContent.Add(line);
}
}
CSV is fairly simple, but it may contain quoted values with commas and newlines, so using Split isn't the best option.
Use the String.Split method to get an array of strings:
string[] ar = line.Split(',');
This should get you started.
using System;
using System.IO;
public class Sample
{
public static void Main() {
using (StreamReader reader = new StreamReader("yourfile.txt")) {
string line = null;
while (null != (line = reader.ReadLine())) {
string[] values = line.Split(',');
DateTime date = DateTime.Parse(values[0]);
float[] numbers = new float[values.Length - 1];
for (int i = 1; i < values.Length; i++)
numbers[i - 1] = float.Parse(values[i]);
// do stuff with date and numbers
}
}
}
}
I have also posted a simple CsvReader class which may be helpful for you.
For a quick and simple solution, you can stream through the file and parse each line using String.Split like this:
using (var sr = File.OpenText("myfile.csv"))
{
string line;
while ((line = sr.ReadLine()) != null)
{
var fields = line.Split(',');
var date = DateTime.Parse(fields[0].Trim());
var value1 = fields[1].Trim();
}
}
However, this approach is fairly error prone. For a more robust solution check out the CsvReader project; it's excellent for parsing CSV files like this. It will handle field values with commas, trimming spaces before and after fields, and much more.
If you need to parse a date string from a format not recognised by DateTime.Parse, try using DateTime.ParseExact instead.
For the first part of your task, use the Split method with , as the separator. To convert a date-time string from one format to another, first convert the string to a DateTime (DateTime.Parse, DateTime.ParseExact) and then convert it to the final format using the DateTime.ToString method.
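The conversion step could be sketched as follows. The input pattern "M/d/yyyy H:mm" is an assumption: the question does not say whether 9/10/2011 is month/day or day/month, so swap it for "d/M/yyyy H:mm" if the file uses day first. TimestampFormatter and SplitTimestamp are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class TimestampFormatter
{
    // Returns { date as MMddyyyy, time as HHmmss } for a value like "9/10/2011 10:05".
    public static string[] SplitTimestamp(string text)
    {
        // Assumption: month comes first and the time has no seconds.
        DateTime stamp = DateTime.ParseExact(
            text.Trim(), "M/d/yyyy H:mm", CultureInfo.InvariantCulture);

        return new[]
        {
            stamp.ToString("MMddyyyy", CultureInfo.InvariantCulture),
            stamp.ToString("HHmmss", CultureInfo.InvariantCulture)
        };
    }
}
```

Passing the first element of line.Split(',') to SplitTimestamp gives the two reformatted strings; the remaining elements can go through float.Parse as before.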
rows contain your output
StreamReader sStreamReader = new StreamReader("File Path");
string AllData = sStreamReader.ReadToEnd();
string[] rows = AllData.Split(",".ToCharArray());
