Proofreading a .csv file line by line - C#

CsvHelper and FileHelpers are not an option.
I have a .csv export that I need to check for consistency, structured like the below:
Reference,Date,EntryID
ABC123,08/09/2015,123
ABD234,08/09/2015,124
XYZ987,07/09/2015,125
QWE456,08/09/2016,126
I can use ReadLine or ReadAllLines and .Split, which give me entire rows/columns, BUT I need to select each row and then go through each attribute (separated by ',') for format checking.
I am running into problems here. I can not single out each value in a row for this check.
It is probably either something simple in the code below:
class Program
{
    static void Main(string[] args)
    {
        string csvFile = @"proof.csv";
        string[] lines = File.ReadAllLines(csvFile);
        var values = lines.Skip(1).Select(l => new { FirstRow = l.Split('\n').First(), Values = l.Split('\n').Select(v => int.Parse(v)) });
        foreach (var value in values)
        {
            Console.WriteLine(string.Format("{0}", value.FirstRow));
        }
    }
}
Or I am going down the wrong path; my searches only turn up ways of pulling specific rows or columns (as opposed to checking the individual values in them).
The sample data above contains an example: the date in the last row is next year, and I would like to be able to proof that value (just an example, as errors could appear in any column).

I can not single out each value in a row
That's because you split on \n twice. The values within a row are separated by comma (,).
I'm not sure what all that LINQ is supposed to do, but it's as simple as this:
string[] lines = File.ReadAllLines(csvFile);
foreach (var line in lines.Skip(1))
{
    var values = line.Split(',');
    // access values[0], values[1] ...
}
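To proof each value rather than just access it, every field can be checked with the TryParse family of methods. The following is a minimal sketch, assuming the three columns from the question, dates written as dd/MM/yyyy, and some illustrative rules (non-empty Reference, date not in the future, integer EntryID). CsvProofer and CheckRow are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CsvProofer
{
    // Returns one message per problem found; an empty list means the row passed.
    // The rules below are illustrative assumptions, not taken from the question.
    public static List<string> CheckRow(string line, int lineNumber, DateTime today)
    {
        var errors = new List<string>();
        string[] values = line.Split(',');

        if (values.Length != 3)
        {
            errors.Add($"Line {lineNumber}: expected 3 values, found {values.Length}");
            return errors;
        }

        if (string.IsNullOrWhiteSpace(values[0]))
            errors.Add($"Line {lineNumber}: Reference is empty");

        // Assumption: dates use the dd/MM/yyyy format.
        if (!DateTime.TryParseExact(values[1], "dd/MM/yyyy", CultureInfo.InvariantCulture,
                DateTimeStyles.None, out DateTime date))
            errors.Add($"Line {lineNumber}: '{values[1]}' is not a valid date");
        else if (date > today)
            errors.Add($"Line {lineNumber}: date {values[1]} is in the future");

        if (!int.TryParse(values[2], out _))
            errors.Add($"Line {lineNumber}: EntryID '{values[2]}' is not an integer");

        return errors;
    }
}
```

CheckRow could be called for each line of lines.Skip(1), passing DateTime.Today, and the returned messages written to the console or a log.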

Instead of reading it as text, read it through OLEDB, so the data of the CSV file arrives in a DataTable and you do not need to split it.
To read the CSV file you can use these OLEDB objects:
System.Data.OleDb.OleDbCommand
System.Data.OleDb.OleDbDataAdapter
System.Data.OleDb.OleDbConnection
and
System.Data.DataTable

Related

Parse CSV File section wise

I am new to C# and need help writing a parser for the CSV data file below:
[INFO]
LINE_NAME,MACHINE_SN,MACHINE_NAME,OPERATOR_ID
LineName,ParmiMachineSN,PARMI_AOI_1,engineer
[INFO_END]
[PANEL_INSP_RESULT]
MODEL_NAME,MODEL_CODE,PANEL_SIDE,INDEX,BARCODE,DATE,START_TIME,END_TIME,DEFECT_NAME,DEFECT_CODE,RESULT
E11-03356-0388-A-TOP CNG,,BOTTOM,47,MLT0388A03358CSNSOF1232210200052-0001,20201023,12:46:57,12:47:04,,,OK
[PANEL_INSP_RESULT_END]
[BOARD_INSP_RESULT]
BOARD_NO,BARCODE,DEFECT_NAME,DEFECT_CODE,BADMARK,RESULT
1,MLT0388A03358CSNSOF1232210200052-0001,,,NO,OK
2,MLT0388A03358CSNSOF1232210200052-0004,,,NO,OK
3,MLT0388A03358CSNSOF1232210200052-0003,,,NO,OK
4,MLT0388A03358CSNSOF1232210200052-0002,,,NO,OK
[BOARD_INSP_RESULT_END]
[COMPONENT_INSP_RESULT]
BOARD_NO,LOCATION_NAME,PIN_NUMBER,POS_X,POS_Y,DEFECT_NAME,DEFECT_CODE,RESULT
[COMPONENT_INSP_RESULT_END]
I need to parse the above file
To parse the above CSV file in C#, you can use the following steps:
Read the entire file into a string using the File.ReadAllText method.
string fileText = File.ReadAllText("file.csv");
Split the file into individual sections on the section markers ("[INFO]", "[INFO_END]", and so on), and then use a loop to process each section.
string[] sections = fileText.Split(new string[] { "[INFO]", "[INFO_END]", "[PANEL_INSP_RESULT]", "[PANEL_INSP_RESULT_END]", "[BOARD_INSP_RESULT]", "[BOARD_INSP_RESULT_END]", "[COMPONENT_INSP_RESULT]", "[COMPONENT_INSP_RESULT_END]" }, StringSplitOptions.RemoveEmptyEntries);
foreach (string section in sections)
{
    // Process each section
}
Within the loop, use the String.Split method to split each section into rows on the line breaks. Splitting on both "\r\n" and "\n" avoids a stray '\r' at the end of each row in files with Windows line endings.
string[] rows = section.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
Use the String.Split method again to split each row into cells on the comma.
foreach (string row in rows)
{
    string[] cells = row.Split(',');
    // Process each cell
}
Now you can process each cell as you need. Because Split discards the marker text, identify each section from its header row (the first row of the section) or from its position in the file, and then process the cells according to their type and position in the row.
You can use a switch statement on the current section to select the appropriate parsing logic.
Please be aware that this is a simplified example; you may need to add error handling and validation, and handle edge cases such as empty rows and empty cells, depending on your use case.
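The steps above can also be done in a single pass over the lines, which keeps track of the section each row belongs to. The following is a minimal sketch, assuming every section is opened by [NAME] and closed by [NAME_END] as in the sample, and that no field contains a quoted comma. SectionParser and Parse are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SectionParser
{
    // Maps each section name to its rows (header row first), each row split into cells.
    // Assumption: markers look like [NAME] and [NAME_END], as in the sample file.
    public static Dictionary<string, List<string[]>> Parse(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, List<string[]>>();
        string current = null; // section currently being read; null outside any section

        foreach (string raw in lines)
        {
            string line = raw.Trim(); // also removes a stray '\r'
            if (line.Length == 0) continue;

            if (line.StartsWith("[") && line.EndsWith("]"))
            {
                string tag = line.Substring(1, line.Length - 2);
                if (tag.EndsWith("_END"))
                {
                    current = null; // closing marker
                }
                else
                {
                    current = tag; // opening marker starts a new section
                    result[current] = new List<string[]>();
                }
            }
            else if (current != null)
            {
                result[current].Add(line.Split(','));
            }
        }
        return result;
    }
}
```

With the sample file, SectionParser.Parse(File.ReadLines("file.csv"))["BOARD_INSP_RESULT"] would hold the header row followed by the four board rows.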
The following reads all lines, projects each one into an anonymous object holding the line and its index, and then loops through a list of section names. For each name it looks up the start and end markers and, in this case, writes the lines between them to the console.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
internal partial class Program
{
    static void Main(string[] args)
    {
        // Pair every line with its index so markers can be located by position.
        var items = File.ReadAllLines("YourFileNameGoesHere")
            .Select((line, index) => new { Line = line, Index = index })
            .ToList();
        List<string> sections = new List<string>()
        {
            "INFO",
            "PANEL_INSP_RESULT",
            "BOARD_INSP_RESULT",
            "COMPONENT_INSP_RESULT"
        };
        foreach (var section in sections)
        {
            Console.WriteLine($"{section}");
            var startItem = items.FirstOrDefault(x => x.Line == $"[{section}]");
            var endItem = items.FirstOrDefault(x => x.Line == $"[{section}_END]");
            if (startItem is not null && endItem is not null)
            {
                // The first line after the start marker is the header row.
                bool header = false;
                for (int index = startItem.Index + 1; index < endItem.Index; index++)
                {
                    if (header == false)
                    {
                        Console.WriteLine($"\t{items[index].Line}");
                        header = true;
                    }
                    else
                    {
                        Console.WriteLine($"\t\t{items[index].Line}");
                    }
                }
            }
            else
            {
                Console.WriteLine("\tFailed to read this section");
            }
        }
    }
}

How to get all rows for specific column from .csv file

In my project, I have a .csv file with many columns.
I need to extract all rows for only first column. I've managed to read all lines, but got stuck on how to extract rows from first column to another .csv file.
string filePath = @"C:\Users\BP185150\Desktop\OTC.csv";
string[] OTC_Output = File.ReadAllLines(@"C:\Users\BP185150\Desktop\OTC.csv");
foreach (string line in OTC_Output)
{
    Console.WriteLine(line);
    Console.Read();
}
Console.ReadLine();
Depending on which separator your CSV uses, you can use the string.Split() method.
e.g.
string firstItem = line.Split(',')[0];
Console.WriteLine(firstItem);
Adding them to a collection:
ICollection<string> firstItems = new List<string>();
string[] OTC_Output = File.ReadAllLines(@"C:\Users\BP185150\Desktop\OTC.csv");
foreach (string line in OTC_Output)
{
    firstItems.Add(line.Split(',')[0]);
}
Well if you want to use File.ReadAllLines, the best way to get the first column is to split the line with a delimiter that your csv is using. Then just add the first item of every line to a collection.
var column = OTC_Output.Select(line => line.Split(';').First()).ToList();
In lineItems, you'll have all the columns split:
var lineItems = line.Split(';');
Then take the value of only the first of them:
string firstItem = lineItems[0];
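Since the goal in the question is to write the first column to another .csv file, the pieces above can be combined. The following is a minimal sketch, assuming a single-character separator (comma by default) and no quoted fields that contain the separator. ColumnExtractor and its method names are hypothetical, and the paths are supplied by the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ColumnExtractor
{
    // Returns the first field of every line.
    // Assumption: fields are not quoted, so a plain Split is enough.
    public static List<string> FirstColumn(IEnumerable<string> lines, char separator = ',')
    {
        return lines.Select(line => line.Split(separator)[0]).ToList();
    }

    // Reads sourcePath and writes one first-column value per line to targetPath.
    public static void CopyFirstColumn(string sourcePath, string targetPath, char separator = ',')
    {
        File.WriteAllLines(targetPath, FirstColumn(File.ReadLines(sourcePath), separator));
    }
}
```

For a semicolon-separated file, the separator argument would be ';' instead of the default.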

Saving listbox items into a csv file

I have a list box which contains student information: a student ID and a student mark. When I write to the file, I want the student ID and the student mark to be in separate columns (for example, the first student's ID in A1 and their mark in B1, and so on until all the students have been written to the file).
I have this code, but it only writes the student information into one column. How would I go about splitting the data and putting it in two columns?
if (lstMarks.Items.Count > 0)
{
    using (TextWriter outputfile = new StreamWriter("StudentRecords.csv"))
    {
        foreach (string data in lstMarks.Items)
        {
            outputfile.WriteLine(data);
        }
        MessageBox.Show("Student Information inserted successfully");
    }
}
This will do what you need...
outputfile.WriteLine(string.Join(",", data.Split(':')));
What this does is first split the string data wherever it finds a colon, which returns an array. string.Join() then joins that array back together as a string, using a comma as the separator.
You could alternatively use data.Replace(":", ",").
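Both variants turn every colon into a comma. If the text after the ID could itself contain a colon, splitting on the first colon only keeps it in the second column. The following is a minimal sketch, assuming each list item looks like studentID:studentMark. MarkFormatter and ToCsvLine are hypothetical names.

```csharp
using System;

public static class MarkFormatter
{
    // Turns "studentId:mark" into "studentId,mark".
    // Assumption: the first colon separates the ID from the rest of the item.
    // Splitting into at most 2 parts leaves any later colons untouched.
    public static string ToCsvLine(string item)
    {
        string[] parts = item.Split(new[] { ':' }, 2);
        return string.Join(",", parts);
    }
}
```

Inside the foreach loop this would be used as outputfile.WriteLine(MarkFormatter.ToCsvLine(data)).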
You can build a string and call WriteLine inside your foreach loop.
You will have to split your string so that you have studentId and studentMark in an array. Then you can build a comma-separated string.
Finally, Flush writes any buffered data to the file. The using block also does this when the writer is disposed, so the explicit call is optional.
using (TextWriter outputfile = new StreamWriter("StudentRecords.csv"))
{
    foreach (string data in lstMarks.Items)
    {
        // Split "studentId:studentMark" into its two parts.
        var dataArray = data.Split(':');
        var line = string.Format("{0},{1}", dataArray[0], dataArray[1]);
        outputfile.WriteLine(line);
        outputfile.Flush();
    }
    MessageBox.Show("Student Information inserted successfully");
}

string.Join Linq query to merge two strings from an array and output as single comma delimited string

As part of a data cleansing exercise I need to correct the formatting of a csv file.
Due to poor formatting/lack of quotes an extra comma in a description field is breaking my DTS package.
So, to get around this I have created a simple C# script to find any line in the csv that contains more columns than the header row.
When the row contains more columns than the header I want to merge array item [10] and [11] into one column and then write the line to my new file - keeping all the other existing columns as they are.
Code:
var columns = splitExpression.Split(line).Where(s => s != delimiter).ToArray();
if (headers == null) headers = new string[columns.Length];
if (columns.Length != headers.Length)
{
    // TODO - LINQ to write comma separated string but merge column 10 and 11 of the array
    // writer.WriteLine(string.Join(delimiter, columns));
}
else
{
    writer.WriteLine(string.Join(delimiter, columns));
}
Unfortunately, my LINQ writing skills are somewhat lacking. Can someone please help me fill in the TODO?
Simply use a list for the columns instead of an array. That lets you remove the leftover column after the merge:
var columns = splitExpression.Split(line).Where(s => s != delimiter).ToList();
if (headers == null) headers = new string[columns.Count];
if (columns.Count != headers.Length)
{
    columns[10] = columns[10] + columns[11]; // combine columns here
    columns.RemoveAt(11);
}
writer.WriteLine(string.Join(delimiter, columns));
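Plain concatenation drops the comma that originally split the description. If that comma should survive, the merged field can be quoted instead. The following is a sketch under the assumption that the delimiter is a comma and that whatever consumes the file understands double-quoted fields. ColumnMerger and MergeQuoted are hypothetical names.

```csharp
using System.Collections.Generic;

public static class ColumnMerger
{
    // Merges columns[index] and columns[index + 1] into one quoted field,
    // restoring the delimiter between them and doubling any embedded quotes.
    // Assumption: the consumer of the file accepts double-quoted fields.
    public static List<string> MergeQuoted(IEnumerable<string> columns, int index, string delimiter = ",")
    {
        var result = new List<string>(columns);
        string merged = result[index] + delimiter + result[index + 1];
        result[index] = "\"" + merged.Replace("\"", "\"\"") + "\"";
        result.RemoveAt(index + 1);
        return result;
    }
}
```

In the code above, the two lines inside the if block could then become columns = ColumnMerger.MergeQuoted(columns, 10, delimiter), assuming delimiter is a string.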

FileHelpers - Column mapping

Quick question regarding the FileHelpers library:
I have used the FileHelpers engine to read a stream and do my validation. If the CSV file has no header, we need to map its columns to my model, i.e.
id, name, age, phone, sex,
but the CSV might not always come in this format/order, so we need to match the columns using a drop-down list for each column.
Is there any way I can do this?
Thanks,
The short answer: no. BUT you can create a dependent class dynamically:
Since you have the list of possible fields in your JSON file, I would recommend reading the first row with a basic System.IO ReadLine and then splitting it on your delimiter to get the individual headers, i.e.:
string headerString;
var headers = new List<String>();
var file = new System.IO.StreamReader("C:\\myFile.txt");
headerString = file.ReadLine();
file.Close();
headers = headerString.Split(',').ToList();
Now you have the list of strings from the first row to match against your JSON file. Then you can create your dependent class using System.Reflection.Emit (see the link referenced below):
typeBuilder.SetParent(typeof(MyFileHelperBaseClass));
// can place the property definitions in a for loop against your headers
foreach(string h in headers){
typeBuilder.DefineProperty("<header/col#>", ..., typeof(System.Int32), null);
}
stackoverflow article 14724822: How Can I add properties to a class on runtime in C#?
FileHelpers gets a little finicky at times, so it will take some tweaking.
Hope this helps
You can use File.ReadLines(@"C:\myfile.txt").First() to read the first line and get the headers.
Then you can just use a FileHelpers class builder to build your runtime class. From the example for a delimited CSV file:
DelimitedClassBuilder cb = new DelimitedClassBuilder("Customers", ",");
cb.IgnoreFirstLines = 1;
cb.IgnoreEmptyLines = true;
cb.AddField("BirthDate", typeof(DateTime));
cb.LastField.TrimMode = TrimMode.Both;
cb.LastField.FieldNullValue = DateTime.Today;
cb.AddField("Name", typeof(string));
cb.LastField.FieldQuoted = true;
cb.LastField.QuoteChar = '"';
cb.AddField("Age", typeof(int));
var engine = new FileHelperEngine(cb.CreateRecordClass());
DataTable dt = engine.ReadFileAsDT("testCustomers.txt");
Then you can traverse the resulting data table.
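If a runtime class is more than the job needs, the mapping chosen in the drop-down lists can also be applied directly to each row. The following is a plain C# sketch that does not use FileHelpers at all. It assumes the user's selection is available as a list of field names in column order and that fields contain no quoted delimiters. ColumnMapper and MapRow are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnMapper
{
    // columnOrder[i] is the field name the user picked for column i
    // (assumption: collected from the drop-down lists, one per column).
    public static Dictionary<string, string> MapRow(string line, IList<string> columnOrder, char delimiter = ',')
    {
        string[] cells = line.Split(delimiter);
        if (cells.Length < columnOrder.Count)
            throw new FormatException($"Expected {columnOrder.Count} columns, found {cells.Length}");

        var record = new Dictionary<string, string>();
        for (int i = 0; i < columnOrder.Count; i++)
            record[columnOrder[i]] = cells[i].Trim();
        return record;
    }
}
```

The resulting dictionary can then be validated or copied into the model by field name, whatever order the columns arrived in.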
