Adding dictionary values to csv file - c#

I'm playing about with a dictionary and adding its contents to an existing csv file. This is what I have so far:
List<string> files = new List<string>();
files.Add("test1");
files.Add("test2");
Dictionary<string, List<string>> data = new Dictionary<string, List<string>>();
data.Add("Test Column", files.ToList());
foreach ( var columnData in data.Keys)
{
foreach (var rowData in data[columnData])
{
var csv = File.ReadLines(filePath.ToString()).Select((line, index) => index == 0
? line + "," + columnData.ToString()
: line + "," + rowData.ToString()).ToList();
File.WriteAllLines(filePath.ToString(), csv);
}
}
This sort of works but not the way I'm intending. What I would like the output to be is something along the lines
but what I'm actually getting is:
as you'll be able to see I'm getting 2 columns instead of just 1 with a column each for both list values and the values repeating on every single row. How can I fix it so that it's like how I've got in the first image? I know it's something to do with my foreach loop and the way I'm inputting the data into the file but I'm just not sure how to fix it
Edit:
So I have the read, write and AddToCsv methods and when I try it like so:
File.WriteAllLines("file.csv", new string[] { "Col0,Col1,Col2", "0,1,2", "1,2,3", "2,3,4", "3,4,5" });
var filePath = "file.csv";
foreach (var line in File.ReadLines(filePath))
Console.WriteLine(line);
Console.WriteLine("\n\n");
List<string> files = new List<string>() { "test1", "test2" };
List<string> numbers = new List<string>() { "one", "two", "three", "four", "five" };
Dictionary<string, List<string>> newData = new Dictionary<string, List<string>>() {
{"Test Column", files},
{"Test2", numbers}
};
var data1 = ReadCsv(filePath);
AddToCsv(data1, newData);
WriteCsv(filePath.ToString(), data1);
It works perfectly but when I have the file path as an already created file like so:
var filePath = exportFile.ToString();
I get the error:
Message :Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')
Source :System.Private.CoreLib
Stack : at System.Collections.Generic.List`1.get_Item(Int32 index) at HMHExtract.Runner.ReadCsv(String path) in C:\tfs\Agility\Client\HMH Extract\HMHExtract\Runner.cs:line 194 at HMHExtract.Runner.Extract(Nullable`1 ct) in C:\tfs\Agility\Client\HMH Extract\HMHExtract\Runner.cs:line 68
Target Site :Void ThrowArgumentOutOfRange_IndexException()
The lines in question are:
line 194 - var col = colNames[i]; of the ReadCsv method
line 68 - var data1 = ReadCsv(filePath);
Edit:
So after debugging I've figured out where the issue has come from.
In the csv I am trying to update there are 17 columns, so each row should have 17 values. So the colNames count is 17, csvRecord Count = 0 and i goes up to 16.
However, when it reaches a row where one of the fields contains 2 values separated by a comma, it counts it as 2 row values instead of just 1, so the row becomes string{18} instead of string{17}, and that causes the out of range error.
To clarify, in the row that causes the error one of the fields has the value Chris Jones, Malcolm Clark. Instead of counting it as 1 value, the method counts it as 2 separate ones. How can I change it so it doesn't split that field in two?

The best way is to read the csv file first into a list of records, and then add columns to each record. A record is a single row of the csv file, read as a Dictionary<string, string>. The keys of this dict are the column names, and the values are the elements of the row in that column.
public static void AddToCsv(string path, Dictionary<string, List<string>> newData)
{
var fLines = File.ReadLines(path);
var colNames = fLines.First().Split(',').ToList(); // col names in first line
List<Dictionary<string, string>> rowData = new List<Dictionary<string, string>>(); // A list of records for all other rows
foreach (var line in fLines.Skip(1)) // Iterate over second through last lines
{
var row = line.Split(',');
Dictionary<string, string> csvRecord = new Dictionary<string, string>();
// Add everything from this row to the record dictionary
for (int i = 0; i < row.Length; i++)
{
var col = colNames[i];
csvRecord[col] = row[i];
}
rowData.Add(csvRecord);
}
// Now, add new data
foreach (var newColName in newData.Keys)
{
var colData = newData[newColName];
for (int i = 0; i < colData.Count; i++)
{
if (i < rowData.Count) // If the row record already exists, add the new column to it
rowData[i].Add(newColName, colData[i]);
else // Add a row record with only this column
rowData.Add(new Dictionary<string, string>() { {newColName, colData[i]} });
}
colNames.Add(newColName);
}
// Now, write all the data
StreamWriter sw = new StreamWriter(path);
// Write header
sw.WriteLine(String.Join(",", colNames));
foreach (var row in rowData)
{
var line = new List<string>();
foreach (var colName in colNames) // Iterate over columns
{
if (row.ContainsKey(colName)) // If the row contains this column, add it to the line
line.Add(row[colName]);
else // Else add an empty string
line.Add("");
}
// Join all elements in the line with a comma, then write to file
sw.WriteLine(String.Join(",", line));
}
sw.Close();
}
To use this, let's create the following CSV file file.csv:
Col0,Col1,Col2
0,1,2
1,2,3
2,3,4
3,4,5
List<string> files = new List<string>() {"test1", "test2"};
List<string> numbers = new List<string>() {"one", "two", "three", "four", "five"};
Dictionary<string, List<string>> newData = new Dictionary<string, List<string>>() {
{"Test Column", files},
{"Test2", numbers}
};
AddToCsv("file.csv", newData);
And this results in file.csv being modified to:
Col0,Col1,Col2,Test Column,Test2
0,1,2,test1,one
1,2,3,test2,two
2,3,4,,three
3,4,5,,four
,,,,five
To make this more organized, I defined a struct CsvData to hold the column names and row records, and a function ReadCsv() that reads the file into this struct, and WriteCsv() that writes the struct to a file. Then separate responsibilities -- ReadCsv() only reads the file, WriteCsv() only writes the file, and AddToCsv() only adds to the file.
public struct CsvData
{
public List<string> ColNames;
public List<Dictionary<string, string>> RowData;
}
public static CsvData ReadCsv(string path)
{
List<string> colNames = new List<string>();
List<Dictionary<string, string>> rowData = new List<Dictionary<string, string>>(); // A list of records for all other rows
if (!File.Exists(path)) return new CsvData() {ColNames = colNames, RowData = rowData };
var fLines = File.ReadLines(path);
var firstLine = fLines.FirstOrDefault(); // Read the first line
if (firstLine != null) // Only try to parse the file if the first line actually exists.
{
colNames = firstLine.Split(',').ToList(); // col names in first line
foreach (var line in fLines.Skip(1)) // Iterate over second through last lines
{
var row = line.Split(',');
Dictionary<string, string> csvRecord = new Dictionary<string, string>();
// Add everything from this row to the record dictionary
for (int i = 0; i < row.Length; i++)
{
var col = colNames[i];
csvRecord[col] = row[i];
}
rowData.Add(csvRecord);
}
}
return new CsvData() {ColNames = colNames, RowData = rowData};
}
public static void WriteCsv(string path, CsvData data)
{
StreamWriter sw = new StreamWriter(path);
// Write header
sw.WriteLine(String.Join(",", data.ColNames));
foreach (var row in data.RowData)
{
var line = new List<string>();
foreach (var colName in data.ColNames) // Iterate over columns
{
if (row.ContainsKey(colName)) // If the row contains this column, add it to the line
line.Add(row[colName]);
else // Else add an empty string
line.Add("");
}
// Join all elements in the line with a comma, then write to file
sw.WriteLine(String.Join(",", line));
}
sw.Close();
}
public static void AddToCsv(CsvData data, Dictionary<string, List<string>> newData)
{
foreach (var newColName in newData.Keys)
{
var colData = newData[newColName];
for (int i = 0; i < colData.Count; i++)
{
if (i < data.RowData.Count) // If the row record already exists, add the new column to it
data.RowData[i].Add(newColName, colData[i]);
else // Add a row record with only this column
data.RowData.Add(new Dictionary<string, string>() { {newColName, colData[i]} });
}
data.ColNames.Add(newColName);
}
}
Then, to use this, you do:
var data = ReadCsv(path);
AddToCsv(data, newData);
WriteCsv(path, data);
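Regarding the second edit in the question: line.Split(',') treats every comma as a separator, so a field such as Chris Jones, Malcolm Clark becomes two values. Below is a minimal sketch of a splitter that respects double quotes. It assumes the field is enclosed in double quotes in the file (the usual CSV convention) and that no field spans several lines; SplitCsvLine is a hypothetical helper name, not part of the code above. It is written with top-level statements (C# 9 or later) so it runs as-is.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Splits one CSV line on commas, but treats commas inside double quotes as data.
// A doubled quote ("") inside a quoted field is read as one literal quote.
static List<string> SplitCsvLine(string line)
{
    var fields = new List<string>();
    var current = new StringBuilder();
    bool inQuotes = false;

    for (int i = 0; i < line.Length; i++)
    {
        char c = line[i];
        if (inQuotes)
        {
            if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
            {
                current.Append('"'); // escaped quote inside a quoted field
                i++;
            }
            else if (c == '"')
            {
                inQuotes = false;    // closing quote
            }
            else
            {
                current.Append(c);
            }
        }
        else if (c == '"')
        {
            inQuotes = true;         // opening quote
        }
        else if (c == ',')
        {
            fields.Add(current.ToString()); // separator outside quotes ends the field
            current.Clear();
        }
        else
        {
            current.Append(c);
        }
    }
    fields.Add(current.ToString());  // last field has no trailing comma
    return fields;
}

var parts = SplitCsvLine("7,\"Chris Jones, Malcolm Clark\",done");
Console.WriteLine(parts.Count); // 3
Console.WriteLine(parts[1]);    // Chris Jones, Malcolm Clark
```

ReadCsv could call this in place of line.Split(','). WriteCsv would then also need to put quotes back around any value that contains a comma, otherwise the next read breaks in the same way. A library such as CsvHelper handles both directions.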

I managed to figure out a way that worked for me. It might not be the most efficient, but it does work. It involves using CsvHelper:
public static void AppendFile(FileInfo fi, List<string> newColumns, DataTable newRows)
{
var settings = new CsvConfiguration(new CultureInfo("en-GB"))
{
Delimiter = ";"
};
var dt = new DataTable();
using (var reader = new StreamReader(fi.FullName))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
using (var dataReader = new CsvDataReader(csv))
{
dt.Load(dataReader);
foreach (var title in newColumns)
{
dt.Columns.Add(title);
}
dt.Rows.Clear();
foreach (DataRow row in newRows.Rows)
{
dt.Rows.Add(row.ItemArray);
}
}
}
using var streamWriter = new StreamWriter(fi.FullName);
using var csvWriter = new CsvWriter(streamWriter, settings);
// Write columns
foreach (DataColumn column in dt.Columns)
{
csvWriter.WriteField(column.ColumnName);
}
csvWriter.NextRecord();
// Write row values
foreach (DataRow row in dt.Rows)
{
for (var i = 0; i < dt.Columns.Count; i++)
{
csvWriter.WriteField(row[i]);
}
csvWriter.NextRecord();
}
}
I start by getting the contents of the csv file into a data table and then adding in the new columns that I need. I then clear all the rows in the datatable and add new ones in (the data that is removed is added back in via the newRows parameter) and then write the datatable to the csv file

Related

Remove all columns from datatable except for 25

I have 500 Columns in my DataTable and I want to remove all of them except for 25 columns.
Is there any way to do this faster to save time and lines of code?
This is what I already tried:
private static void DeleteUselessColumns()
{
//This is example data!
List<DataColumn> dataColumnsToDelete = new List<DataColumn>();
DataTable bigData = new DataTable();
bigData.Columns.Add("Harry");
bigData.Columns.Add("Konstantin");
bigData.Columns.Add("George");
bigData.Columns.Add("Gabriel");
bigData.Columns.Add("Oscar");
bigData.Columns.Add("Muhammad");
bigData.Columns.Add("Emily");
bigData.Columns.Add("Olivia");
bigData.Columns.Add("Isla");
List<string> columnsToKeep = new List<string>();
columnsToKeep.Add("Isla");
columnsToKeep.Add("Oscar");
columnsToKeep.Add("Konstantin");
columnsToKeep.Add("Gabriel");
//This is the code i want to optimize------
foreach (DataColumn column in bigData.Columns)
{
bool keepColumn = false;
foreach (string s in columnsToKeep)
{
if (column.ColumnName.Equals(s))
{
keepColumn = true;
}
}
if (!keepColumn)
{
dataColumnsToDelete.Add(column);
}
}
foreach(DataColumn dataColumn in dataColumnsToDelete)
{
bigData.Columns.Remove(dataColumn);
}
//------------------------
}
var columnsToKeep = new List<string>() { "Isla", "Oscar", "Konstantin", "Gabriel"};
var toRemove = new List<DataColumn>();
foreach(DataColumn column in bigData.Columns)
{
if (!columnsToKeep.Any(name => column.ColumnName == name ))
{
toRemove.Add(column);
}
}
toRemove.ForEach(col => bigData.Columns.Remove(col));
Test1...test9 same code could be made a loop. No need to add the columns to delete in a list, just delete them in the first while loop. As for performance, not sure how to improve it.
You could try to use a DataView that selects the desired columns then copy to table. You need to experiment.
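On the idea of deleting inside the first loop: removing from bigData.Columns while a foreach is enumerating it throws, so the loop has to run by index instead. A minimal sketch, assuming the names to keep are compared case-sensitively as in the question; KeepOnlyColumns is a hypothetical helper name. It is written with top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Removes every column whose name is not in 'keep'.
// Iterating from the last index down means a removal never shifts
// the position of a column that has not been visited yet.
static void KeepOnlyColumns(DataTable table, IEnumerable<string> keep)
{
    var keepSet = new HashSet<string>(keep); // constant-time lookup per column
    for (int i = table.Columns.Count - 1; i >= 0; i--)
    {
        if (!keepSet.Contains(table.Columns[i].ColumnName))
            table.Columns.RemoveAt(i);
    }
}

var bigData = new DataTable();
foreach (var name in new[] { "Harry", "Konstantin", "George", "Gabriel", "Oscar", "Isla" })
    bigData.Columns.Add(name);

KeepOnlyColumns(bigData, new[] { "Isla", "Oscar", "Konstantin", "Gabriel" });
Console.WriteLine(bigData.Columns.Count); // 4
```

This drops the second list and the nested name loop. Whether it is noticeably faster on 500 columns would need measuring.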
if they have different names create an array of string
var columns = new string[] { "Harry", "Konstantin","John"};
var columnsToKeep = new string[] { "John", "Konstantin"};
var columnsToDelete = from item in columns
where !columnsToKeep.Contains(item)
select item;
or using lambda
var columnsToDelete = columns
.Where (i=> !columnsToKeep.Contains(i))
.ToList();
toDelete
Harry

loading data from csv file into key and value dictionary list

I have a file consisting of a list of text which looks as follows:
Example csv file
The csv file consists of 3 columns. The first column will always have a length of 5. So I want to loop through the file content and store those first 5 characters as the key and the remaining columns as the value. I am removing the commas between them and using Substring as follows to store them.
static string line;
static Dictionary<string, string> stations = new Dictionary<string, string>();
static void Main(string[] args)
{
// Dictionary<string, List<KeyValuePair<string, string>>> stations = new Dictionary<string, List<KeyValuePair<string, string>>>();
var lines = File.ReadAllLines(".\\ariba_sr_header_2017122816250.csv");
foreach (var l in lines)
{
line = l.Replace(",", "");
stations.Add(line.Substring(14),line.Substring(14, line.Length-14));
}
//read all key and value in file
foreach (KeyValuePair<string, string> item in stations)
{
Console.WriteLine(item.Key);
Console.WriteLine(item.Value);
}
Console.ReadLine();
}
After debug, the output is
Output
My Expected Result is as follow:
Expected Result
I cannot see any KeyValuePair here. You have
00021,00014,Ordered
00021,00026,Ordered
00024,00036,Ordered
...
and you want
00021
00021
00024
000014Ordered
000026Ordered
000036Ordered
...
outcome which seems to be IEnumerable<string>. You can try Linq for this
var result = File
.ReadLines(".\\ariba_sr_header_2017122816250.csv")
.Select(line => line.Split(','))
.SelectMany(items => new string[] {
items[0],
$"0{items[1]}{items[2]}" })
.OrderBy(item => item.Length);
foreach (var item in result)
Console.WriteLine(item);
Here we Split each line like 00021,00014,Ordered into separate items: {00021, 00014, Ordered} and then combine them back with the help of SelectMany. We want
00021 which is items[0]
000014Ordered which is 0 + items[1] + items[2]
Finally we want to have short items first - OrderBy(item => item.Length)
Here you go:
var stations = new Dictionary<string, string>();
var lines = File.ReadAllLines(@"C:\temp\22.txt");
foreach (var l in lines)
{
var lsplit = l.Split(',');
if (lsplit.Length > 2) // lsplit[2] is read below, so three fields are needed
{
var newkey = lsplit[0];
var newval = lsplit[1] + lsplit[2];
stations[newkey] = newval;
}
}
//read all key and value in file
foreach (KeyValuePair<string, string> item in stations)
{
Console.WriteLine(item.Key + " = " + item.Value);
}
Console.ReadLine();
Not exactly the output you expected, but hopefully it helps.
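One thing to watch in the sample data is that the key 00021 appears twice, so a Dictionary<string, string> either throws on Add or keeps only the last value when assigned through the indexer. A minimal sketch that keeps every value per key, assuming the three-column layout shown above; GroupByFirstColumn is a hypothetical helper name. It is written with top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;

// Groups the second and third columns under the first column.
// Lines with fewer than three fields are skipped.
static Dictionary<string, List<string>> GroupByFirstColumn(IEnumerable<string> lines)
{
    var result = new Dictionary<string, List<string>>();
    foreach (var line in lines)
    {
        var parts = line.Split(',');
        if (parts.Length < 3) continue;

        // Create the list the first time a key is seen, then append to it.
        if (!result.TryGetValue(parts[0], out var values))
        {
            values = new List<string>();
            result[parts[0]] = values;
        }
        values.Add(parts[1] + parts[2]);
    }
    return result;
}

var sample = new[] { "00021,00014,Ordered", "00021,00026,Ordered", "00024,00036,Ordered" };
foreach (var pair in GroupByFirstColumn(sample))
    Console.WriteLine(pair.Key + " = " + string.Join(" | ", pair.Value));
```

With a real file the argument would be File.ReadLines(path) instead of the sample array.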

C# Reading CSV to DataTable and Invoke Rows/Columns

I am currently working on a small project and I am stuck on a problem I cannot manage to solve...
I have multiple .csv files I want to read; they all have the same layout, just with different values.
Header1;Value1;Info1
Header2;Value2;Info2
Header3;Value3;Info3
While reading the first file I need to create the headers. The problem is that they are not split into columns but into rows (as you can see above, Header1 to Header3).
Then it needs to read Value1 to Value3 (they are listed in the 2nd column), and on top of that I need to create another header, Header4, with the data of "Info2", which is always placed in column 3, row 2 (the other values of column 3 can be ignored).
So the Outcome after the first File should look like this:
Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Info2;
And after multiple files it should look like this:
Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Value4;
Value1b;Value2b;Value3b;Value4b;
Value1c;Value2c;Value3c;Value4c;
I tried it with OleDb but I get the error "missing ISAM", which I can't manage to fix. The code I used is the following:
public DataTable ReadCsv(string fileName)
{
DataTable dt = new DataTable("Data");
/* using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" +
Path.GetDirectoryName(fileName) + "\";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
*/
using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
Path.GetDirectoryName(fileName) + ";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
{
using(OleDbCommand cmd = new OleDbCommand(string.Format("select *from [{0}]", new FileInfo(fileName).Name,cn)))
{
cn.Open();
using(OleDbDataAdapter adapter = new OleDbDataAdapter(cmd))
{
adapter.Fill(dt);
}
}
}
return dt;
}
Another attempt was using StreamReader, but the headers are in the wrong place and I don't know how to change this and do it for every file. The code I tried is the following:
public static DataTable ReadCsvFilee(string path)
{
DataTable oDataTable = new DataTable();
var fileNames = Directory.GetFiles(path);
foreach (var fileName in fileNames)
{
//initialising a StreamReader type variable and will pass the file location
StreamReader oStreamReader = new StreamReader(fileName);
// CONTROLS WHETHER WE SKIP A ROW OR NOT
int RowCount = 0;
// CONTROLS WHETHER WE CREATE COLUMNS OR NOT
bool hasColumns = false;
string[] ColumnNames = null;
string[] oStreamDataValues = null;
//using while loop read the stream data till end
while (!oStreamReader.EndOfStream)
{
String oStreamRowData = oStreamReader.ReadLine().Trim();
if (oStreamRowData.Length > 0)
{
oStreamDataValues = oStreamRowData.Split(';');
//Bcoz the first row contains column names, we will poluate
//the column name by
//reading the first row and RowCount-0 will be true only once
// CHANGE TO CHECK FOR COLUMNS CREATED
if (!hasColumns)
{
ColumnNames = oStreamRowData.Split(';');
//using foreach looping through all the column names
foreach (string csvcolumn in ColumnNames)
{
DataColumn oDataColumn = new DataColumn(csvcolumn.ToUpper(), typeof(string));
//setting the default value of empty.string to newly created column
oDataColumn.DefaultValue = string.Empty;
//adding the newly created column to the table
oDataTable.Columns.Add(oDataColumn);
}
// SET COLUMNS CREATED
hasColumns = true;
// SET RowCount TO 0 SO WE KNOW TO SKIP COLUMNS LINE
RowCount = 0;
}
else
{
// IF RowCount IS 0 THEN SKIP COLUMN LINE
if (RowCount++ == 0) continue;
//creates a new DataRow with the same schema as of the oDataTable
DataRow oDataRow = oDataTable.NewRow();
//using foreach looping through all the column names
for (int i = 0; i < ColumnNames.Length; i++)
{
oDataRow[ColumnNames[i]] = oStreamDataValues[i] == null ? string.Empty : oStreamDataValues[i].ToString();
}
//adding the newly created row with data to the oDataTable
oDataTable.Rows.Add(oDataRow);
}
}
}
//close the oStreamReader object
oStreamReader.Close();
//release all the resources used by the oStreamReader object
oStreamReader.Dispose();
}
return oDataTable;
}
I am thankful for everyone who is willing to help. And Thanks for reading this far!
Sincerely yours
If I understood you right, there is a strict parsing there like this:
string OpenAndParse(string filename, bool firstFile=false)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
var header = $"{parsed[0][0]};{parsed[1][0]};{parsed[2][0]};{parsed[1][0]}\n";
var data = $"{parsed[0][1]};{parsed[1][1]};{parsed[2][1]};{parsed[1][2]}\n";
return firstFile
? $"{header}{data}"
: $"{data}";
}
Where it would return - if first file:
Header1;Header2;Header3;Header2
Value1;Value2;Value3;Value4
if not first file:
Value1;Value2;Value3;Value4
If I am correct, the rest is about running this against a list of files and joining the results in an output file.
EDIT: Against a directory:
void ProcessFiles(string folderName, string outputFileName)
{
bool firstFile = true;
foreach (var f in Directory.GetFiles(folderName))
{
File.AppendAllText(outputFileName, OpenAndParse(f, firstFile));
firstFile = false;
}
}
Note: I missed you want a DataTable and not an output file. Then you could simply create a list and put the results into that list making the list the datasource for your datatable (then why would you use semicolons in there? Probably all you need is to simply attach the array values to a list).
(Adding as another answer just to make it uncluttered)
void ProcessMyFiles(string folderName)
{
List<MyData> d = new List<MyData>();
var files = Directory.GetFiles(folderName);
foreach (var file in files)
{
OpenAndParse(file, d);
}
string[] headers = GetHeaders(files[0]);
DataGridView dgv = new DataGridView {Dock=DockStyle.Fill};
dgv.DataSource = d;
dgv.ColumnAdded += (sender, e) => {e.Column.HeaderText = headers[e.Column.Index];};
Form f = new Form();
f.Controls.Add(dgv);
f.Show();
}
string[] GetHeaders(string filename)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
return new string[] { parsed[0][0], parsed[1][0], parsed[2][0], parsed[1][0] };
}
void OpenAndParse(string filename, List<MyData> d)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
var data = new MyData
{
Col1 = parsed[0][1],
Col2 = parsed[1][1],
Col3 = parsed[2][1],
Col4 = parsed[1][2]
};
d.Add(data);
}
public class MyData
{
public string Col1 { get; set; }
public string Col2 { get; set; }
public string Col3 { get; set; }
public string Col4 { get; set; }
}
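Since the question asks for a DataTable rather than a grid bound to a list, here is a minimal sketch of the same transposition that fills a DataTable directly. It assumes every file has the Header;Value;Info layout from the question and that the extra column takes the third field of the second line; AppendTransposed and its extraHeader parameter are hypothetical names. It is written with top-level statements (C# 9 or later).

```csharp
using System;
using System.Data;
using System.Linq;

// Adds one row to 'table' from a file laid out as "Header;Value;Info" lines.
// Columns are created from the first file only; the extra column takes its
// value from the third field of the second line.
static void AppendTransposed(DataTable table, string[] lines, string extraHeader)
{
    var parsed = lines.Where(l => l.Trim().Length > 0)
                      .Select(l => l.Split(';'))
                      .ToArray();

    if (table.Columns.Count == 0)
    {
        foreach (var fields in parsed)
            table.Columns.Add(fields[0]);
        table.Columns.Add(extraHeader);
    }

    var row = table.NewRow();
    // Second field of each line becomes the cell in the matching column.
    for (int i = 0; i < parsed.Length && i < table.Columns.Count - 1; i++)
        row[i] = parsed[i].Length > 1 ? parsed[i][1] : string.Empty;
    // Third field of the second line goes into the extra column.
    if (parsed.Length > 1 && parsed[1].Length > 2)
        row[table.Columns.Count - 1] = parsed[1][2];
    table.Rows.Add(row);
}

var result = new DataTable("Data");
AppendTransposed(result, new[] { "Header1;Value1;Info1", "Header2;Value2;Info2", "Header3;Value3;Info3" }, "Header4");
AppendTransposed(result, new[] { "Header1;Value1b;Info1b", "Header2;Value2b;Info2b", "Header3;Value3b;Info3b" }, "Header4");
Console.WriteLine(result.Rows.Count);         // 2
Console.WriteLine(result.Rows[0]["Header4"]); // Info2
```

For a folder, the lines argument would come from File.ReadAllLines(file) for each file returned by Directory.GetFiles(folderName).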
I don't know if this is the best way to do this, but what I would have done in your case is rewrite the CSVs the conventional way while reading all the files, then create a stream containing the new CSV.
It would look something like this:
var csv = new StringBuilder();
csv.AppendLine("Header1;Header2;Header3;Header4");
foreach (var item in file)
{
var newLine = string.Format("{0};{1};{2};{3}", item.value1, item.value2, item.value3, item.value4); // same delimiter as the header line
csv.AppendLine(newLine);
}
//Create Stream
MemoryStream stream = new MemoryStream();
StreamReader reader = new StreamReader(stream);
//Fill your data table here with your values
Hope this will help.

Import two CSV, add specific columns from one CSV and import changes to new CSV (C#)

I have to import 2 CSVs.
CSV 1 [49]: Contains about 50 tab-separated columns.
CSV 2 [2]: Contains 3 columns which should replace columns [3], [6] and [11] of my first CSV.
So here's what I do:
1) Importing the CSV and splitting it into an array.
string employeedatabase = "MYPATH";
List<String> status = new List<String>();
StreamReader file2 = new System.IO.StreamReader(filename);
string line = file2.ReadLine();
while ((line = file2.ReadLine()) != null)
{
string[] ud = line.Split('\t');
status.Add(ud[0]);
}
String[] ud_status = status.ToArray();
PROBLEM 1: I have about 50 columns to handle and ud_status is just the first, so do I need 50 lists and 50 string arrays?
2) Importing the second CSV and splitting it into an array.
List<String> vorname = new List<String>();
List<String> nachname = new List<String>();
List<String> username = new List<String>();
StreamReader file = new System.IO.StreamReader(employeedatabase);
string line3 = file.ReadLine();
while ((line3 = file.ReadLine()) != null)
{
string[] data = line3.Split(';');
vorname.Add(data[0]);
nachname.Add(data[1]);
username.Add(data[2]);
}
String[] db_vorname = vorname.ToArray();
String[] db_nachname = nachname.ToArray();
String[] db_username = username.ToArray();
PROBLEM 2: After loading these two CSVs I don't know how to combine them and change the columns as mentioned above.
Something like this?
mynewArray = ud_status + "/t" + ud_xy[..n] + "/t" + changed_colum + ud_xy[..n];
Then save "mynewarray" into a tab-separated CSV with UTF-8 encoding.
To read the file into a meaningful format, you should set up a class that defines the format of your CSV:
public class CsvRow
{
public string vorname { get; set; }
public string nachname { get; set; }
public string username { get; set; }
public CsvRow (string[] data)
{
vorname = data[0];
nachname = data[1];
username = data[2];
}
}
Then populate a list of this:
List<CsvRow> rows = new List<CsvRow>();
StreamReader file = new System.IO.StreamReader(employeedatabase);
string line3 = file.ReadLine();
while ((line3 = file.ReadLine()) != null)
{
rows.Add(new CsvRow(line3.Split(';')));
}
Similarly format your other CSV and include unused properties for the new fields. Once you have loaded both, you can populate the new properties from this list in a loop, matching the records by whatever common field the CSVs hopefully share. Then finally output the resulting data to a new CSV file.
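The question does not show a field the two files share, so the sketch below falls back on the weaker assumption that row N of the second file belongs to row N of the first. It keeps each row as a string[] so that 50 columns need no 50 lists, and overwrites the target positions in place. ReplaceColumns and its parameter names are hypothetical. It is written with top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;

// Overwrites the cells at 'targetIndexes' in each main row with the values
// of the replacement row at the same position in the list.
// Rows without a counterpart, and indexes past the end of a row, are left alone.
static void ReplaceColumns(List<string[]> mainRows, List<string[]> replacementRows, int[] targetIndexes)
{
    int rowCount = Math.Min(mainRows.Count, replacementRows.Count);
    for (int r = 0; r < rowCount; r++)
    {
        for (int c = 0; c < targetIndexes.Length; c++)
        {
            int target = targetIndexes[c];
            if (target < mainRows[r].Length && c < replacementRows[r].Length)
                mainRows[r][target] = replacementRows[r][c];
        }
    }
}

var main = new List<string[]> { "a\tb\tc\td".Split('\t') };
var extra = new List<string[]> { "X;Y".Split(';') };
ReplaceColumns(main, extra, new[] { 1, 3 });
Console.WriteLine(string.Join("|", main[0])); // a|X|c|Y
```

For the files in the question the index array would be { 3, 6, 11 }, and the result could be saved with File.WriteAllLines(path, lines, Encoding.UTF8) after joining each row with "\t". If the files do share a key, matching on that key is safer than matching by position.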
Your solution is not to use string arrays to do this. That will just drive you crazy. It's better to use the System.Data.DataTable object.
I didn't get a chance to test the LINQ lambda expression at the end of this (or really any of it, I wrote this on a break), but it should get you on the right track.
using (var ds = new System.Data.DataSet("My Data"))
{
ds.Tables.Add("File0");
ds.Tables.Add("File1");
string[] line;
using (var reader = new System.IO.StreamReader("FirstFile"))
{
//first we get columns for table 0
foreach (string s in reader.ReadLine().Split('\t'))
ds.Tables["File0"].Columns.Add(s);
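string rawLine;
while ((rawLine = reader.ReadLine()) != null) // ReadLine returns null at end of file, so check before splitting
{
line = rawLine.Split('\t');
//and now the rest of the data.
var r = ds.Tables["File0"].NewRow();
for (int i = 0; i < line.Length; i++)
{
r[i] = line[i];
}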
ds.Tables["File0"].Rows.Add(r);
}
}
//we could probably do these in a loop or a second method,
//but you may want subtle differences, so for now we just do it the same way
//for file1
using (var reader2 = new System.IO.StreamReader("SecondFile"))
{
foreach (string s in reader2.ReadLine().Split('\t'))
ds.Tables["File1"].Columns.Add(s);
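string rawLine2;
while ((rawLine2 = reader2.ReadLine()) != null) // ReadLine returns null at end of file, so check before splitting
{
line = rawLine2.Split('\t');
//and now the rest of the data.
var r = ds.Tables["File1"].NewRow();
for (int i = 0; i < line.Length; i++)
{
r[i] = line[i];
}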
ds.Tables["File1"].Rows.Add(r);
}
}
//you now have these in functioning datatables. Because we named columns,
//you can call them by name specifically, or by index, to replace in the first datatable.
string[] columnsToReplace = new string[] { "firstColumnName", "SecondColumnName", "ThirdColumnName" };
for(int i = 0; i < ds.Tables[0].Rows.Count; i++)
{
//you didn't give a sign of any relation between the two tables
//so this is just by row, and assumes the row count is equivalent.
//This is also not advised.
//if there is a key these sets of data share
//you should join on them instead.
var dr = ds.Tables[0].Rows[i]; // the row itself; ItemArray would only give a copy of its values
dr[3] = ds.Tables[1].Rows[i][columnsToReplace[0]];
dr[6] = ds.Tables[1].Rows[i][columnsToReplace[1]];
dr[11] = ds.Tables[1].Rows[i][columnsToReplace[2]];
}
//ds.Tables[0] now has the output you want.
string output = String.Empty;
foreach (var s in ds.Tables[0].Columns)
output = String.Concat(output, s ,"\t");
output = String.Concat(output, Environment.NewLine); // columns ready, now the rows.
foreach (DataRow r in ds.Tables[0].Rows)
output = string.Concat(output, string.Concat(r.ItemArray.Select(t => t + "\t")), Environment.NewLine);
if(System.IO.File.Exists("MYPATH"))
using (System.IO.StreamWriter file = new System.IO.StreamWriter("MYPATH")) //or a variable instead of string literal
{
file.Write(output);
}
}
With Cinchoo ETL, an open source file helper library, you can merge the CSV files as below. This assumes the 2 CSV files contain the same number of lines.
string CSV1 = @"Id Name City
1 Tom New York
2 Mark FairFax";
string CSV2 = @"Id City
1 Las Vegas
2 Dallas";
dynamic rec1 = null;
dynamic rec2 = null;
StringBuilder csv3 = new StringBuilder();
using (var csvOut = new ChoCSVWriter(new StringWriter(csv3))
.WithFirstLineHeader()
.WithDelimiter("\t")
)
{
using (var csv1 = new ChoCSVReader(new StringReader(CSV1))
.WithFirstLineHeader()
.WithDelimiter("\t")
)
{
using (var csv2 = new ChoCSVReader(new StringReader(CSV2))
.WithFirstLineHeader()
.WithDelimiter("\t")
)
{
while ((rec1 = csv1.Read()) != null && (rec2 = csv2.Read()) != null)
{
rec1.City = rec2.City;
csvOut.Write(rec1);
}
}
}
}
Console.WriteLine(csv3.ToString());
Hope it helps.
Disclaimer: I'm the author of this library.

Given string array of column names, how do I read a .csv file to a DataTable?

Assume I have a .csv file with 70 columns, but only 5 of the columns are what I need. I want to be able to pass a method a string array of the columns names that I want, and for it to return a datatable.
private void method(object sender, EventArgs e) {
string[] columns =
{
@"Column21",
@"Column48"
};
DataTable myDataTable = Get_DT(columns);
}
public DataTable Get_DT(string[] columns) {
DataTable ret = new DataTable();
if (columns.Length > 0)
{
foreach (string column in columns)
{
ret.Columns.Add(column);
}
string[] csvlines = File.ReadAllLines(@"path to csv file");
csvlines = csvlines.Skip(1).ToArray(); //ignore the columns in the first line of the csv file
//this is where i need help... i want to use linq to read the fields
//of the each row with only the columns name given in the string[]
//named columns
}
return ret;
}
Read the first line of the file, line.Split(',') (or whatever your delimiter is), then get the index of each column name and store that.
Then for each other line, again do a var values = line.Split(','), then get the values from the columns.
Quick and dirty version:
string[] csvlines = File.ReadAllLines(@"path to csv file");
//select the indices of the columns we want
var cols = csvlines[0].Split(',').Select((val,i) => new { val, i }).Where(x => columns.Any(c => c == x.val)).Select(x => x.i).ToList();
//now go through the remaining lines
foreach (var line in csvlines.Skip(1))
{
var line_values = line.Split(',').ToList();
var dt_values = cols.Select(i => line_values[i]); // pick by position; IndexOf would return the wrong index when a value repeats in the row
//now do something with the values you got for this row, add them to your datatable
}
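Putting the two steps together (find the header positions once, then pick those positions from every line), a minimal sketch of a complete method could look like this. It assumes a plain comma-delimited file with no quoted fields; BuildTable is a hypothetical name and takes the lines as a parameter so it can be tried without a file. It is written with top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Builds a DataTable holding only the requested columns.
// 'lines' is the whole file, header line first.
static DataTable BuildTable(IReadOnlyList<string> lines, string[] columns)
{
    var table = new DataTable();
    foreach (var column in columns)
        table.Columns.Add(column);
    if (lines.Count == 0) return table;

    // Position of each requested column in the header (-1 if absent).
    var header = lines[0].Split(',');
    var indexes = columns.Select(c => Array.IndexOf(header, c)).ToArray();

    foreach (var line in lines.Skip(1))
    {
        var values = line.Split(',');
        var row = table.NewRow();
        for (int i = 0; i < indexes.Length; i++)
        {
            int source = indexes[i];
            // Cells for missing columns or short lines stay DBNull.
            if (source >= 0 && source < values.Length)
                row[i] = values[source];
        }
        table.Rows.Add(row);
    }
    return table;
}

var csvLines = new[] { "A,B,C", "1,2,3", "4,5,6" };
var dt = BuildTable(csvLines, new[] { "C", "A" });
Console.WriteLine(dt.Rows[1]["C"]); // 6
```

In Get_DT the first argument would be File.ReadAllLines(path), without the Skip(1) call, since the header line is needed to find the positions.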
You can look at https://joshclose.github.io/CsvHelper/
Think Reading individual fields is what you are looking for
var csv = new CsvReader( textReader );
while( csv.Read() )
{
var intField = csv.GetField<int>( 0 );
var stringField = csv.GetField<string>( 1 );
var boolField = csv.GetField<bool>( "HeaderName" );
}
We can easily do this without writing much code.
ExcelDataReader is a handy DLL for that; it can load an Excel sheet directly into a DataTable with just one method call.
Here are some links with examples: http://www.c-sharpcorner.com/blogs/using-iexceldatareader1
http://exceldatareader.codeplex.com/
I hope this was useful. Kindly let me know your thoughts or feedback.
Thanks
Karthik
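var data = File.ReadAllLines(@"path to csv file").Select(l => l.Split(',')); // split each line so d[0] is the first field, not the first character
// the expenses row
var query = data.Single(d => d[0] == "Expenses");
// column at zero-based index 3
int column21 = 3;
return query[column21];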
As others have stated, a library like CsvReader can be used for this. As for LINQ, I don't think it's suitable for this kind of job.
I haven't tested this but it should get you through
using (TextReader textReader = new StreamReader(filePath))
{
using (var csvReader = new CsvReader(textReader))
{
var headers = csvReader.FieldHeaders;
for (int rowIndex = 0; csvReader.Read(); rowIndex++)
{
var dataRow = dataTable.NewRow();
for (int chosenColumnIndex = 0; chosenColumnIndex < columns.Count(); chosenColumnIndex++)
{
for (int headerIndex = 0; headerIndex < headers.Length; headerIndex++)
{
if (headers[headerIndex] == columns[chosenColumnIndex])
{
dataRow[chosenColumnIndex] = csvReader.GetField<string>(headerIndex);
}
}
}
dataTable.Rows.InsertAt(dataRow, rowIndex);
}
}
}
