How to split CSV file - c#

"0.0.0.0,""0.255.255.255"",""ZZ"""
"1.0.0.0,""1.0.0.255"",""AU"""
"1.0.1.0,""1.0.3.255"",""CN"""
"1.0.4.0,""1.0.7.255"",""AU"""
"1.0.8.0,""1.0.15.255"",""CN"""
"1.0.16.0,""1.0.31.255"",""JP"""
"1.0.32.0,""1.0.63.255"",""CN"""
"1.0.64.0,""1.0.127.255"",""JP"""
"1.0.128.0,""1.0.255.255"",""TH"""
"1.1.0.0,""1.1.0.255"",""CN"""
"1.1.1.0,""1.1.1.255"",""AU"""
"1.1.2.0,""1.1.63.255"",""CN"""
"1.1.64.0,""1.1.127.255"",""JP"""
"1.1.128.0,""1.1.255.255"",""TH"""
In Excel:
0.0.0.0,"0.255.255.255","ZZ"
1.0.0.0,"1.0.0.255","AU"
1.0.1.0,"1.0.3.255","CN"
1.0.4.0,"1.0.7.255","AU"
1.0.8.0,"1.0.15.255","CN"
1.0.16.0,"1.0.31.255","JP"
1.0.32.0,"1.0.63.255","CN"
1.0.64.0,"1.0.127.255","JP"
1.0.128.0,"1.0.255.255","TH"
1.1.0.0,"1.1.0.255","CN"
1.1.1.0,"1.1.1.255","AU"
1.1.2.0,"1.1.63.255","CN"
1.1.64.0,"1.1.127.255","JP"
1.1.128.0,"1.1.255.255","TH"
1.2.0.0,"1.2.2.255","CN"
1.2.3.0,"1.2.3.255","AU"
1.2.4.0,"1.2.127.255","CN"
1.2.128.0,"1.2.255.255","TH"
1.3.0.0,"1.3.255.255","CN"
1.4.0.0,"1.4.0.255","AU"
1.4.1.0,"1.4.127.255","CN"
1.4.128.0,"1.4.255.255","TH"
How can I split this CSV file?
For example, the first row should give 0.0.0.0, 0.255.255.255 and ZZ. And how can I load the result into a DataGridView with 3 columns?

You can do it in the following way:
using System.Collections.Generic;
using System.IO;
static void Main(string[] args)
{
using(var reader = new StreamReader(@"C:\test.csv"))
{
// Collect the first two columns of every line
List<string> listA = new List<string>();
List<string> listB = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(','); // or whatever delimiter your file uses
listA.Add(values[0]);
listB.Add(values[1]);
}
}
}
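The plain Split(',') above leaves the quotes in the values. A minimal sketch of one way to strip them and build a three-column table, assuming every line has one of the two shapes shown in the question and that no field contains a comma. The names IpRangeCsv, ParseLine and LoadTable and the column names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class IpRangeCsv
{
    // Turns "0.0.0.0,""0.255.255.255"",""ZZ""" (or the unwrapped Excel form)
    // into { "0.0.0.0", "0.255.255.255", "ZZ" }.
    // Assumption: fields never contain commas, so a plain Split is enough.
    public static string[] ParseLine(string line)
    {
        line = line.Trim();
        // If the whole line is wrapped in quotes, drop them and unescape "" to ".
        if (line.Length >= 2 && line[0] == '"' && line[line.Length - 1] == '"')
        {
            line = line.Substring(1, line.Length - 2).Replace("\"\"", "\"");
        }
        // Split into fields and remove the quotes around each field.
        return line.Split(',').Select(f => f.Trim('"')).ToArray();
    }

    // Builds a DataTable with three string columns from the given lines.
    // Blank lines and lines without exactly three fields are skipped.
    public static DataTable LoadTable(IEnumerable<string> lines)
    {
        var table = new DataTable();
        table.Columns.Add("StartIp");
        table.Columns.Add("EndIp");
        table.Columns.Add("Country");
        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            var fields = ParseLine(line);
            if (fields.Length == 3)
            {
                table.Rows.Add(fields[0], fields[1], fields[2]);
            }
        }
        return table;
    }
}
```

In a WinForms project the result could then be bound with something like dataGridView1.DataSource = IpRangeCsv.LoadTable(File.ReadLines(@"C:\test.csv")); where dataGridView1 is whatever grid control your form contains.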

A CSV file is usually either tab delimited or comma delimited. You have to read the file line by line and then split each line on the delimiter character. The first line of a CSV file usually contains the headers, which you can use as keys so that each value can be looked up by its column name. For example:
Dictionary<int, Dictionary<String, String>> values = new Dictionary<int, Dictionary<String,String>>();
using(FileStream fileStream = new FileStream(@"D:\MyCSV.csv", FileMode.Open, FileAccess.Read, FileShare.Read)) {
using(StreamReader streamReader = new StreamReader(fileStream)){
//You can skip this line if there is no header
// Then instead of Dictionary<String,String> you use List<String>
var headers = streamReader.ReadLine().Split(',');
String[] line = null;
int lineNumber = 1;
while(!streamReader.EndOfStream){
line = streamReader.ReadLine().Split(',');
// Only keep lines that have one value per header
if(line.Length == headers.Length){
var temp = new Dictionary<String, String>();
for(int i = 0; i < headers.Length; i++){
// You can remove '"' character by line[i].Replace("\"", "") or through using the Substring method
temp.Add(headers[i], line[i]);
}
values.Add(lineNumber, temp);
}
lineNumber++;
}
}
}
In case the data structure of your CSV is constant and it will not change in the future, you can develop a strongly typed data model and get rid of the Dictionary type. This approach will be more elegant and more efficient.
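As a rough illustration of such a strongly typed model for the data in the question, here is a minimal sketch. The class name IpRange, its property names and the FromCsvLine helper are assumptions made for this example, and it expects the unwrapped three-field form (start,"end","country") with no commas inside fields:

```csharp
using System;
using System.Linq;
using System.Net;

// Hypothetical model for one line of the file: start address, end address, country code.
public sealed class IpRange
{
    public IPAddress Start { get; private set; }
    public IPAddress End { get; private set; }
    public string Country { get; private set; }

    public IpRange(IPAddress start, IPAddress end, string country)
    {
        Start = start;
        End = end;
        Country = country;
    }

    // Parses a line such as: 1.0.0.0,"1.0.0.255","AU"
    // Throws FormatException when the line does not have exactly three fields
    // or when an address cannot be parsed.
    public static IpRange FromCsvLine(string line)
    {
        var parts = line.Split(',').Select(p => p.Trim().Trim('"')).ToArray();
        if (parts.Length != 3)
        {
            throw new FormatException("Expected 3 fields: " + line);
        }
        return new IpRange(IPAddress.Parse(parts[0]), IPAddress.Parse(parts[1]), parts[2]);
    }
}
```

A List<IpRange> filled this way replaces the nested dictionaries, and a typo in a column name becomes a compile error instead of a missing key at run time.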

First of all, your CSV lines are surrounded by quotes. Is that a copy/paste mistake? If not, you will need to sanitize the file into valid CSV.
You can try Cinchoo ETL, an open source library, to load the CSV file into a DataTable, which you can then assign as the DataSource of your DataGridView.
I'll show both approaches:
Valid CSV: (test.csv)
0.0.0.0,"0.255.255.255","ZZ"
1.0.0.0,"1.0.0.255","AU"
1.0.1.0,"1.0.3.255","CN"
1.0.4.0,"1.0.7.255","AU"
1.0.8.0,"1.0.15.255","CN"
1.0.16.0,"1.0.31.255","JP"
1.0.32.0,"1.0.63.255","CN"
1.0.64.0,"1.0.127.255","JP"
1.0.128.0,"1.0.255.255","TH"
1.1.0.0,"1.1.0.255","CN"
1.1.1.0,"1.1.1.255","AU"
1.1.2.0,"1.1.63.255","CN"
1.1.64.0,"1.1.127.255","JP"
1.1.128.0,"1.1.255.255","TH"
Read CSV:
using (var p = new ChoCSVReader("test.csv"))
{
var dt = p.AsDataTable();
//Assign dt to DataGridView
}
Next approach
Invalid CSV: (test.csv)
"0.0.0.0,""0.255.255.255"",""ZZ"""
"1.0.0.0,""1.0.0.255"",""AU"""
"1.0.1.0,""1.0.3.255"",""CN"""
"1.0.4.0,""1.0.7.255"",""AU"""
"1.0.8.0,""1.0.15.255"",""CN"""
"1.0.16.0,""1.0.31.255"",""JP"""
"1.0.32.0,""1.0.63.255"",""CN"""
"1.0.64.0,""1.0.127.255"",""JP"""
"1.0.128.0,""1.0.255.255"",""TH"""
"1.1.0.0,""1.1.0.255"",""CN"""
"1.1.1.0,""1.1.1.255"",""AU"""
"1.1.2.0,""1.1.63.255"",""CN"""
"1.1.64.0,""1.1.127.255"",""JP"""
"1.1.128.0,""1.1.255.255"",""TH"""
Read CSV:
using (var p = new ChoCSVReader("test.csv"))
{
p.SanitizeLine += (o, e) =>
{
string line = e.Line as string;
if (line != null)
{
// Drop the outer quotes, then unescape the doubled quotes
line = line.Substring(1, line.Length - 2);
line = line.Replace(@"""""", @"""");
}
e.Line = line;
};
var dt = p.AsDataTable();
//Assign dt to DataGridView
}
Hope it helps.

Related

Why do I only get results from my first txt?

I am trying to read and use all lines from several txt files. I iterate through them with an async method and try to extract the data. My problem is that the output looks like it only contains the data from the first txt file. I just can't find where the problem is. I would appreciate any help.
Here's my code:
string[] files = Directory.GetFiles("C:/DPS-EDPWB05/forlogsearch", "*", SearchOption.AllDirectories);
//data I need from the txts
List<string> num = new List<string>();
List<string> date = new List<string>();
List<string> time = new List<string>();
List<string> sip = new List<string>();
List<string> csmethod = new List<string>();
List<string> csuristem = new List<string>();
List<string> csuriquery = new List<string>();
List<string> sport = new List<string>();
List<string> csusername = new List<string>();
List<string> cip = new List<string>();
List<string> csuseragent = new List<string>();
List<string> csreferer = new List<string>();
List<string> scstatus = new List<string>();
List<string> scsubstatus = new List<string>();
List<string> cswin32status = new List<string>();
List<string> timetaken = new List<string>();
int x = 0;
int y = 0;
int i = 0;
int filesCount = 0;
string v = "";
//Taking the data from the Log, getting a list of string[]
//items with the lines from the txts
List<string[]> lines = new List<string[]>();
while (i < files.Length)
{
lines.Add(ReadAllLinesAsync(files[i]).Result);
i++;
}
//Trying to get the data from the string[]s
do
{
string line;
int f = 0;
string[] linesOfTxt = lines[filesCount];
do
{
line = linesOfTxt[f];
string[] splittedLine = { };
splittedLine = line.Split(' ', 15, StringSplitOptions.None);
y = splittedLine.Count();
if (y == 15)
{
num.Add(x.ToString());
date.Add(splittedLine[0]);
time.Add(splittedLine[1]);
sip.Add(splittedLine[2]);
csmethod.Add(splittedLine[3]);
csuristem.Add(splittedLine[4]);
csuriquery.Add(splittedLine[5]);
sport.Add(splittedLine[6]);
csusername.Add(splittedLine[7]);
cip.Add(splittedLine[8]);
csuseragent.Add(splittedLine[9]);
csreferer.Add(splittedLine[10]);
scstatus.Add(splittedLine[11]);
scsubstatus.Add(splittedLine[12]);
cswin32status.Add(splittedLine[13]);
timetaken.Add(splittedLine[14]);
x++;
}
f++;
} while (f < linesOfTxt.Length);
filesCount++;
}
while (filesCount < files.Count());
After all this I group the values and so on, but that happens AFTER the lists of data I need are filled, so the problem must be here somewhere. Also, here is my async reader (which I found here on Stack Overflow):
public static Task<string[]> ReadAllLinesAsync(string path)
{
return ReadAllLinesAsync(path, Encoding.UTF8);
}
public static async Task<string[]> ReadAllLinesAsync(string path, Encoding encoding)
{
var lines = new List<string>();
// Open the FileStream with the same FileMode, FileAccess
// and FileShare as a call to File.OpenText would've done.
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, DefaultBufferSize, DefaultOptions))
using (var reader = new StreamReader(stream, encoding))
{
string line;
while ((line = await reader.ReadLineAsync()) != null)
{
lines.Add(line);
}
}
return lines.ToArray();
}
There are several problems with the example code, though I'm unsure which one is responsible for the actual issue.
Instead of writing your own ReadAllLinesAsync, just use File.ReadAllLinesAsync
You should in general avoid .Result since this is a blocking operation, and this has the potential to cause deadlocks. In this case you should just call the synchronous version, File.ReadAllLines, instead.
When possible, use foreach or for loops. They make it much easier to see if your code is correct, and avoids spreading the loop logic over multiple lines.
As far as I can see, you are processing each line identically, so you should be able to just merge all lines from all files by using List<string> lines and AddRange
Instead of keeping 15 different lists of properties, a more common approach would be to store one list of objects of a class with 15 different properties.
Whenever you need to store data, you should seriously consider using an existing serialization format, like json, xml, csv, protobuf etc. This lets you use existing, well tested, libraries for writing and reading data and converting it to your own types.
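The suggestions above (synchronous reads, foreach loops, one list of objects) could be combined roughly as follows. This is only a sketch: LogEntry, LogReader and the exposed property names are invented for this example, and the field positions are taken from the order of the lists in the question:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// One parsed log line. Only a few of the 15 fields are exposed to keep the sketch short;
// the indexes follow the order used in the question (date, time, ..., time-taken).
public sealed class LogEntry
{
    private readonly string[] _fields;
    public LogEntry(string[] fields) { _fields = fields; }
    public string Date { get { return _fields[0]; } }
    public string Time { get { return _fields[1]; } }
    public string ClientIp { get { return _fields[8]; } }
    public string Status { get { return _fields[11]; } }
    public string TimeTaken { get { return _fields[14]; } }
}

public static class LogReader
{
    // Keeps only the lines that split into exactly 15 space-separated parts,
    // which is the same filter the original loop applied.
    public static List<LogEntry> ParseLines(IEnumerable<string> lines)
    {
        var entries = new List<LogEntry>();
        foreach (var line in lines)
        {
            var parts = line.Split(new[] { ' ' }, 15, StringSplitOptions.None);
            if (parts.Length == 15)
            {
                entries.Add(new LogEntry(parts));
            }
        }
        return entries;
    }

    // Reads every file below the folder synchronously and merges all entries into one list.
    public static List<LogEntry> ReadFolder(string folder)
    {
        var entries = new List<LogEntry>();
        foreach (var file in Directory.GetFiles(folder, "*", SearchOption.AllDirectories))
        {
            entries.AddRange(ParseLines(File.ReadLines(file)));
        }
        return entries;
    }
}
```

With a single foreach per level there are no manually maintained counters left, so it is easier to check that every file actually contributes its lines.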

C# Reading CSV to DataTable and Invoke Rows/Columns

I am currently working on a small project and I got stuck on a problem that I cannot manage to solve.
I have multiple ".CSV" files I want to read. They all have the same layout, just with different values.
Header1;Value1;Info1
Header2;Value2;Info2
Header3;Value3;Info3
While reading the first file I need to create the headers. The problem is that they are not split into columns but into rows (as you can see above, Header1 to Header3).
Then it needs to read Value1 to Value3 (they are listed in the 2nd column), and on top of that I need to create another header, Header4, with the data of "Info2", which is always placed in column 3 and row 2 (I can ignore the other values of column 3).
So the outcome after the first file should look like this:
Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Info2;
And after multiple files it should look like this:
Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Value4;
Value1b;Value2b;Value3b;Value4b;
Value1c;Value2c;Value3c;Value4c;
I tried it with OleDb but I get the error "missing ISAM", which I can't manage to fix. The code I used is the following:
public DataTable ReadCsv(string fileName)
{
DataTable dt = new DataTable("Data");
/* using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" +
Path.GetDirectoryName(fileName) + "\";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
*/
using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
Path.GetDirectoryName(fileName) + ";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
{
using(OleDbCommand cmd = new OleDbCommand(string.Format("select *from [{0}]", new FileInfo(fileName).Name,cn)))
{
cn.Open();
using(OleDbDataAdapter adapter = new OleDbDataAdapter(cmd))
{
adapter.Fill(dt);
}
}
}
return dt;
}
Another attempt I made was using StreamReader. But the headers are in the wrong place and I don't know how to change this and do it for every file. The code I tried is the following:
public static DataTable ReadCsvFilee(string path)
{
DataTable oDataTable = new DataTable();
var fileNames = Directory.GetFiles(path);
foreach (var fileName in fileNames)
{
//initialising a StreamReader type variable and will pass the file location
StreamReader oStreamReader = new StreamReader(fileName);
// CONTROLS WHETHER WE SKIP A ROW OR NOT
int RowCount = 0;
// CONTROLS WHETHER WE CREATE COLUMNS OR NOT
bool hasColumns = false;
string[] ColumnNames = null;
string[] oStreamDataValues = null;
//using while loop read the stream data till end
while (!oStreamReader.EndOfStream)
{
String oStreamRowData = oStreamReader.ReadLine().Trim();
if (oStreamRowData.Length > 0)
{
oStreamDataValues = oStreamRowData.Split(';');
//Bcoz the first row contains column names, we will poluate
//the column name by
//reading the first row and RowCount-0 will be true only once
// CHANGE TO CHECK FOR COLUMNS CREATED
if (!hasColumns)
{
ColumnNames = oStreamRowData.Split(';');
//using foreach looping through all the column names
foreach (string csvcolumn in ColumnNames)
{
DataColumn oDataColumn = new DataColumn(csvcolumn.ToUpper(), typeof(string));
//setting the default value of empty.string to newly created column
oDataColumn.DefaultValue = string.Empty;
//adding the newly created column to the table
oDataTable.Columns.Add(oDataColumn);
}
// SET COLUMNS CREATED
hasColumns = true;
// SET RowCount TO 0 SO WE KNOW TO SKIP COLUMNS LINE
RowCount = 0;
}
else
{
// IF RowCount IS 0 THEN SKIP COLUMN LINE
if (RowCount++ == 0) continue;
//creates a new DataRow with the same schema as of the oDataTable
DataRow oDataRow = oDataTable.NewRow();
//using foreach looping through all the column names
for (int i = 0; i < ColumnNames.Length; i++)
{
oDataRow[ColumnNames[i]] = oStreamDataValues[i] == null ? string.Empty : oStreamDataValues[i].ToString();
}
//adding the newly created row with data to the oDataTable
oDataTable.Rows.Add(oDataRow);
}
}
}
//close the oStreamReader object
oStreamReader.Close();
//release all the resources used by the oStreamReader object
oStreamReader.Dispose();
}
return oDataTable;
}
I am thankful for everyone who is willing to help. And Thanks for reading this far!
Sincerely yours
If I understood you right, there is a strict parsing there like this:
string OpenAndParse(string filename, bool firstFile=false)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
var header = $"{parsed[0][0]};{parsed[1][0]};{parsed[2][0]};{parsed[1][0]}\n";
var data = $"{parsed[0][1]};{parsed[1][1]};{parsed[2][1]};{parsed[1][2]}\n";
return firstFile
? $"{header}{data}"
: $"{data}";
}
Where it would return - if first file:
Header1;Header2;Header3;Header2
Value1;Value2;Value3;Value4
if not first file:
Value1;Value2;Value3;Value4
If I am correct, the rest is about running this against a list of files and joining the results in an output file.
EDIT: Against a directory:
void ProcessFiles(string folderName, string outputFileName)
{
bool firstFile = true;
foreach (var f in Directory.GetFiles(folderName))
{
File.AppendAllText(outputFileName, OpenAndParse(f, firstFile));
firstFile = false;
}
}
Note: I missed you want a DataTable and not an output file. Then you could simply create a list and put the results into that list making the list the datasource for your datatable (then why would you use semicolons in there? Probably all you need is to simply attach the array values to a list).
(Adding as another answer just to make it uncluttered)
void ProcessMyFiles(string folderName)
{
List<MyData> d = new List<MyData>();
var files = Directory.GetFiles(folderName);
foreach (var file in files)
{
OpenAndParse(file, d);
}
string[] headers = GetHeaders(files[0]);
DataGridView dgv = new DataGridView {Dock=DockStyle.Fill};
dgv.DataSource = d;
dgv.ColumnAdded += (sender, e) => {e.Column.HeaderText = headers[e.Column.Index];};
Form f = new Form();
f.Controls.Add(dgv);
f.Show();
}
string[] GetHeaders(string filename)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
return new string[] { parsed[0][0], parsed[1][0], parsed[2][0], parsed[1][0] };
}
void OpenAndParse(string filename, List<MyData> d)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
var data = new MyData
{
Col1 = parsed[0][1],
Col2 = parsed[1][1],
Col3 = parsed[2][1],
Col4 = parsed[1][2]
};
d.Add(data);
}
public class MyData
{
public string Col1 { get; set; }
public string Col2 { get; set; }
public string Col3 { get; set; }
public string Col4 { get; set; }
}
I don't know if this is the best way to do this. But what I would have done in your case is to rewrite the CSVs in the conventional layout while reading all the files, then create a stream containing the new CSV.
It would look something like this:
var csv = new StringBuilder();
csv.AppendLine("Header1;Header2;Header3;Header4");
foreach (var item in file)
{
// Use the same ';' delimiter as the header line
var newLine = string.Format("{0};{1};{2};{3}", item.value1, item.value2, item.value3, item.value4);
csv.AppendLine(newLine);
}
//Create a stream over the CSV text that was just built
MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(csv.ToString()));
StreamReader reader = new StreamReader(stream);
//Fill your data table here with your values
Hope this will help.

Import two CSV, add specific columns from one CSV and import changes to new CSV (C#)

I have to import 2 CSVs.
CSV 1 [49]: contains about 50 tab-separated columns.
CSV 2 [2]: contains 3 columns, which should replace columns [3], [6] and [11] of my first CSV.
So here's what I do:
1) Import the CSV and split it into an array.
string employeedatabase = "MYPATH";
List<String> status = new List<String>();
StreamReader file2 = new System.IO.StreamReader(filename);
string line = file2.ReadLine();
while ((line = file2.ReadLine()) != null)
{
string[] ud = line.Split('\t');
status.Add(ud[0]);
}
String[] ud_status = status.ToArray();
PROBLEM 1: I have about 50 columns to handle and ud_status is just the first, so do I need 50 lists and 50 string arrays?
2) Import the second CSV and split it into an array.
List<String> vorname = new List<String>();
List<String> nachname = new List<String>();
List<String> username = new List<String>();
StreamReader file = new System.IO.StreamReader(employeedatabase);
string line3 = file.ReadLine();
while ((line3 = file.ReadLine()) != null)
{
string[] data = line3.Split(';');
vorname.Add(data[0]);
nachname.Add(data[1]);
username.Add(data[2]);
}
String[] db_vorname = vorname.ToArray();
String[] db_nachname = nachname.ToArray();
String[] db_username = username.ToArray();
PROBLEM 2: After loading these two CSVs I don't know how to combine them and change the columns as mentioned above.
Something like this?
mynewArray = ud_status + "/t" + ud_xy[..n] + "/t" + changed_colum + ud_xy[..n];
Then save "mynewarray" into a tab-separated CSV with encoding "utf-8".
To read the file into a meaningful format, you should set up a class that defines the format of your CSV:
public class CsvRow
{
public string vorname { get; set; }
public string nachname { get; set; }
public string username { get; set; }
public CsvRow (string[] data)
{
vorname = data[0];
nachname = data[1];
username = data[2];
}
}
Then populate a list of this:
List<CsvRow> rows = new List<CsvRow>();
StreamReader file = new System.IO.StreamReader(employeedatabase);
string line3 = file.ReadLine();
while ((line3 = file.ReadLine()) != null)
{
rows.Add(new CsvRow(line3.Split(';')));
}
Similarly format your other CSV and include unused properties for the new fields. Once you have loaded both, you can populate the new properties from this list in a loop, matching the records by whatever common field the CSVs hopefully share. Then finally output the resulting data to a new CSV file.
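The final merge step might look roughly like the sketch below. To stay short it works on the raw lines instead of row classes, and it makes two assumptions that the question does not confirm: the two files are paired by line position (no shared key is mentioned), and the places [3], [6] and [11] are zero-based column indexes. CsvMerge and Merge are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public static class CsvMerge
{
    // Overwrites columns 3, 6 and 11 of each tab-separated main line with the three
    // ';'-separated values of the line at the same position in the second file.
    // Lines that are too short are passed through unchanged.
    public static List<string> Merge(IList<string> mainLines, IList<string> replacementLines)
    {
        var result = new List<string>();
        int count = Math.Min(mainLines.Count, replacementLines.Count);
        for (int i = 0; i < count; i++)
        {
            string[] cols = mainLines[i].Split('\t');
            string[] repl = replacementLines[i].Split(';');
            if (cols.Length > 11 && repl.Length >= 3)
            {
                cols[3] = repl[0];
                cols[6] = repl[1];
                cols[11] = repl[2];
            }
            result.Add(string.Join("\t", cols));
        }
        return result;
    }
}
```

The merged lines can then be written as UTF-8 with File.WriteAllLines(outputPath, merged, new UTF8Encoding(false)), where outputPath is a path of your choosing. If both files start with a header line, skip it in each before merging.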
Your solution is not to use string arrays to do this. That will just drive you crazy. It's better to use the System.Data.DataTable object.
I didn't get a chance to test the LINQ lambda expression at the end of this (or really any of it, I wrote this on a break), but it should get you on the right track.
using (var ds = new System.Data.DataSet("My Data"))
{
ds.Tables.Add("File0");
ds.Tables.Add("File1");
string[] line;
using (var reader = new System.IO.StreamReader("FirstFile"))
{
//first we get columns for table 0
foreach (string s in reader.ReadLine().Split('\t'))
ds.Tables["File0"].Columns.Add(s);
// ReadLine returns null at the end of the file, so use ?. before Split
while ((line = reader.ReadLine()?.Split('\t')) != null)
{
//and now the rest of the data.
var r = ds.Tables["File0"].NewRow();
for (int i = 0; i < line.Length; i++)
{
r[i] = line[i];
}
ds.Tables["File0"].Rows.Add(r);
}
}
//we could probably do these in a loop or a second method,
//but you may want subtle differences, so for now we just do it the same way
//for file1
using (var reader2 = new System.IO.StreamReader("SecondFile"))
{
foreach (string s in reader2.ReadLine().Split('\t'))
ds.Tables["File1"].Columns.Add(s);
// ReadLine returns null at the end of the file, so use ?. before Split
while ((line = reader2.ReadLine()?.Split('\t')) != null)
{
//and now the rest of the data.
var r = ds.Tables["File1"].NewRow();
for (int i = 0; i < line.Length; i++)
{
r[i] = line[i];
}
ds.Tables["File1"].Rows.Add(r);
}
}
//you now have these in functioning datatables. Because we named columns,
//you can call them by name specifically, or by index, to replace in the first datatable.
string[] columnsToReplace = new string[] { "firstColumnName", "SecondColumnName", "ThirdColumnName" };
for(int i = 0; i < ds.Tables[0].Rows.Count; i++)
{
//you didn't give a sign of any relation between the two tables
//so this is just by row, and assumes the row count is equivalent.
//This is also not advised.
//if there is a key these sets of data share
//you should join on them instead.
// ItemArray holds cell values, not rows, so take the row itself
DataRow dr = ds.Tables[0].Rows[i];
dr[3] = ds.Tables[1].Rows[i][columnsToReplace[0]];
dr[6] = ds.Tables[1].Rows[i][columnsToReplace[1]];
dr[11] = ds.Tables[1].Rows[i][columnsToReplace[2]];
}
//ds.Tables[0] now has the output you want.
string output = String.Empty;
foreach (var s in ds.Tables[0].Columns)
output = String.Concat(output, s ,"\t");
output = String.Concat(output, Environment.NewLine); // columns ready, now the rows.
foreach (DataRow r in ds.Tables[0].Rows)
output = string.Concat(output, string.Join("\t", r.ItemArray), Environment.NewLine);
if(System.IO.File.Exists("MYPATH"))
using (System.IO.StreamWriter file = new System.IO.StreamWriter("MYPATH")) //or a variable instead of string literal
{
file.Write(output);
}
}
With Cinchoo ETL, an open source file helper library, you can merge the CSV files as shown below. This assumes the two CSV files contain the same number of lines.
// Note: the fields must be separated by tab characters to match WithDelimiter("\t")
string CSV1 = @"Id Name City
1 Tom New York
2 Mark FairFax";
string CSV2 = @"Id City
1 Las Vegas
2 Dallas";
dynamic rec1 = null;
dynamic rec2 = null;
StringBuilder csv3 = new StringBuilder();
using (var csvOut = new ChoCSVWriter(new StringWriter(csv3))
.WithFirstLineHeader()
.WithDelimiter("\t")
)
{
using (var csv1 = new ChoCSVReader(new StringReader(CSV1))
.WithFirstLineHeader()
.WithDelimiter("\t")
)
{
using (var csv2 = new ChoCSVReader(new StringReader(CSV2))
.WithFirstLineHeader()
.WithDelimiter("\t")
)
{
while ((rec1 = csv1.Read()) != null && (rec2 = csv2.Read()) != null)
{
rec1.City = rec2.City;
csvOut.Write(rec1);
}
}
}
}
Console.WriteLine(csv3.ToString());
Hope it helps.
Disclaimer: I'm the author of this library.

Given string array of column names, how do I read a .csv file to a DataTable?

Assume I have a .csv file with 70 columns, but only 5 of the columns are what I need. I want to be able to pass a method a string array of the columns names that I want, and for it to return a datatable.
private void method(object sender, EventArgs e) {
string[] columns =
{
@"Column21",
@"Column48"
};
DataTable myDataTable = Get_DT(columns);
}
public DataTable Get_DT(string[] columns) {
DataTable ret = new DataTable();
if (columns.Length > 0)
{
foreach (string column in columns)
{
ret.Columns.Add(column);
}
string[] csvlines = File.ReadAllLines(@"path to csv file");
csvlines = csvlines.Skip(1).ToArray(); //ignore the columns in the first line of the csv file
//this is where i need help... i want to use linq to read the fields
//of the each row with only the columns name given in the string[]
//named columns
}
return ret;
}
Read the first line of the file, line.Split(',') (or whatever your delimiter is), then get the index of each column name and store that.
Then for each other line, again do a var values = line.Split(','), then get the values from the columns.
Quick and dirty version:
string[] csvlines = File.ReadAllLines(@"path to csv file");
//select the indices of the columns we want
var cols = csvlines[0].Split(',').Select((val,i) => new { val, i }).Where(x => columns.Any(c => c == x.val)).Select(x => x.i).ToList();
//now go through the remaining lines
foreach (var line in csvlines.Skip(1))
{
var line_values = line.Split(',').ToList();
//pick the values by index; IndexOf would return the wrong position when a value occurs twice in a line
var dt_values = cols.Select(i => line_values[i]);
//now do something with the values you got for this row, add them to your datatable
}
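Filled in completely, the same idea could look like the sketch below. It assumes a comma-delimited file whose first line is the header and whose fields contain no quoted commas. CsvColumns and ToDataTable are names invented for this example; they are not part of the question's code:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class CsvColumns
{
    // Builds a DataTable that contains only the requested columns, in the requested order.
    // A requested column that is missing from the header is filled with empty strings.
    public static DataTable ToDataTable(IEnumerable<string> csvLines, string[] columns)
    {
        var table = new DataTable();
        foreach (var column in columns)
        {
            table.Columns.Add(column);
        }

        int[] indices = null;
        foreach (var line in csvLines)
        {
            var fields = line.Split(',');
            if (indices == null)
            {
                // First line: find the position of each requested column (-1 if absent).
                indices = columns.Select(c => Array.IndexOf(fields, c)).ToArray();
                continue;
            }
            var row = table.NewRow();
            for (int i = 0; i < indices.Length; i++)
            {
                int idx = indices[i];
                row[i] = idx >= 0 && idx < fields.Length ? fields[idx] : string.Empty;
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```

Inside Get_DT this would be called as CsvColumns.ToDataTable(File.ReadLines(path), columns), with path pointing at your CSV file. Looking up the indexes once from the header keeps the per-line work to a simple array access.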
You can look at https://joshclose.github.io/CsvHelper/
I think "Reading individual fields" is what you are looking for:
var csv = new CsvReader( textReader );
while( csv.Read() )
{
var intField = csv.GetField<int>( 0 );
var stringField = csv.GetField<string>( 1 );
var boolField = csv.GetField<bool>( "HeaderName" );
}
We can easily do this without writing much code.
ExcelDataReader is a great DLL for that; it can return a DataTable directly from the Excel sheet with a single method call.
Here are some links with examples: http://www.c-sharpcorner.com/blogs/using-iexceldatareader1
http://exceldatareader.codeplex.com/
I hope this was useful. Please let me know your thoughts or feedback.
Thanks
Karthik
var data = File.ReadAllLines(@"path to csv file");
// the expenses row, split into its fields
var query = data.Select(l => l.Split(',')).Single(d => d[0] == "Expenses");
// zero-based index of the wanted column
int column21 = 3;
return query[column21];
As others have stated, a library like CsvHelper (with its CsvReader class) can be used for this. As for LINQ, I don't think it's suitable for this kind of job.
I haven't tested this but it should get you through
using (TextReader textReader = new StreamReader(filePath))
{
using (var csvReader = new CsvReader(textReader))
{
var headers = csvReader.FieldHeaders;
for (int rowIndex = 0; csvReader.Read(); rowIndex++)
{
var dataRow = dataTable.NewRow();
for (int chosenColumnIndex = 0; chosenColumnIndex < columns.Count(); chosenColumnIndex++)
{
for (int headerIndex = 0; headerIndex < headers.Length; headerIndex++)
{
if (headers[headerIndex] == columns[chosenColumnIndex])
{
dataRow[chosenColumnIndex] = csvReader.GetField<string>(headerIndex);
}
}
}
dataTable.Rows.InsertAt(dataRow, rowIndex);
}
}
}

Different CSV files based on value

I'm having a question about my CSV.
I export a CSV and read it in C#.
The last column of each line in the CSV is A, B, C, D, E or G.
Now I want to cut my CSV into pieces: for example, one new CSV with the lines that contain A or D, and another one with the lines that contain B or C.
Can anyone point me in the right direction? I'm stuck.
This is a part of my code
StreamReader debtors = new StreamReader(@"C:\CSV\Debtors.csv");
StreamWriter debtorsMetaal = new StreamWriter(@"C:\CSV\DebtorsMetaal.csv");
StreamWriter debtorsSystemen = new StreamWriter(@"C:\CSV\DebtorsSystemen.csv");
StreamWriter debtorsHolding = new StreamWriter(@"C:\CSV\DebtorsHolding.csv");
while(debtors.Peek() >=0)
{
string line = debtors.ReadLine();
try
{
string[] rowsArray = line.Split(';');
//..... etc
Now the lines are in pieces, but how can I select the last column in my line and create a new CSV file based on its value?
debtorsMetaal, debtorsSystemen and debtorsHolding will be the new CSV files.
For example;
In a line in the CSV I have the following info
number - name- description - type
Where type can be A, B, C, D , E or G.
Now I want the lines where type = A and the lines where type = D together in one CSV file.
Is this even possible?
The values A, B, C, D, E or G are always in column AJ in Excel.
I would use a loop like this:
var adLines = new List<string>();
var bcLines = new List<string>();
var unknownLines = new List<string>();
var adList = new[]{"A", "D"};
var bcList = new[]{"B", "C"};
using(var debtors = new StreamReader(@"C:\CSV\Debtors.csv"))
{
string line = null;
while((line = debtors.ReadLine()) != null)
{
string[] columns = line.Split(';'); // you should check if columns.Length is correct
string lastColumn = columns.Last().Trim();
if(adList.Contains(lastColumn, StringComparer.CurrentCultureIgnoreCase))
adLines.Add(line);
else if(bcList.Contains(lastColumn, StringComparer.CurrentCultureIgnoreCase))
bcLines.Add(line);
else
unknownLines.Add(line);
}
}
File.WriteAllLines(@"C:\CSV\DebtorsSystemen.csv", adLines);
File.WriteAllLines(@"C:\CSV\DebtorsHolding.csv", bcLines);
However, in general you should not reinvent the wheel; use an available CSV parser like:
http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader
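If more groups are added later, the if/else chain in the loop above can be replaced by a lookup table. The following is only a sketch under assumptions: DebtorSplitter and GroupByType are invented names, the mapping from type to target is supplied by the caller, and lines are split on ';' as in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DebtorSplitter
{
    // Groups lines by the target name that `targets` assigns to the value in the last column.
    // Lines whose type has no entry in `targets` are collected under the key "unknown".
    public static Dictionary<string, List<string>> GroupByType(
        IEnumerable<string> lines, IDictionary<string, string> targets)
    {
        var groups = new Dictionary<string, List<string>>();
        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            // The type is the trimmed value of the last ';'-separated column.
            string type = line.Split(';').Last().Trim();

            string target;
            if (!targets.TryGetValue(type, out target))
            {
                target = "unknown";
            }

            List<string> bucket;
            if (!groups.TryGetValue(target, out bucket))
            {
                bucket = new List<string>();
                groups[target] = bucket;
            }
            bucket.Add(line);
        }
        return groups;
    }
}
```

Each entry of the result can then be written with File.WriteAllLines, using the key as the file name. The lookup is case-sensitive by default; passing a Dictionary created with StringComparer.OrdinalIgnoreCase as `targets` would make it ignore case, similar to the comparer used in the loop above.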
