Custom file parser slows down with every next file - c#

I have built a simple file parser that reads a CSV file line by line and adds each row to the DB.
I don't commit changes to the DB until after the file is completely parsed.
It works fine, but for some reason parsing becomes slower and slower with every additional file.
Here is the code; any suggestions on how to speed it up are very welcome.
using Microsoft.VisualBasic.FileIO;
using System;
using System.IO;
namespace CsvToSQL
{
internal class Program
{
private static void Main(string[] args)
{
TransactionsEntities entities = new TransactionsEntities();
string targetFolderPath = "C:\\Transactions\\";
string[] allFiles = Directory.GetFiles(targetFolderPath);
//Loop through files in folder
foreach (var file in allFiles)
{
//parse file
Console.WriteLine(file);
using (TextFieldParser parser = new TextFieldParser(file))
{
parser.TextFieldType = FieldType.Delimited;
parser.SetDelimiters(",");
int lineNo = 0;
while (!parser.EndOfData)
{
TransactionList transaction = new TransactionList();
//processing row
string[] fields = parser.ReadFields();
try
{
if(lineNo % 20 == 0)
{
Console.WriteLine(file + " Parsed line no: " + lineNo);
}
transaction.Account = fields[0];
transaction.Timestamp = fields[1];
transaction.TransactionType = fields[2];
transaction.Status = fields[3];
transaction.Product = fields[4];
transaction.Price = fields[5];
transaction.BuySell = fields[6];
transaction.Series = fields[7];
transaction.Volume = fields[8];
transaction.FillVolume = fields[9];
transaction.OrderID = fields[10];
transaction.BestBid = fields[11];
transaction.BestAsk = fields[12];
entities.TransactionLists.Add(transaction);
lineNo++;
}
catch(Exception e)
{
Console.WriteLine(e.ToString());
Console.ReadKey();
}
}
try
{
entities.SaveChanges();
}catch(Exception e)
{
Console.WriteLine(e.ToString());
Console.ReadKey();
}
}
}
}
}
}

Hi all, I found the memory issue with this, in case anyone comes across a similar problem.
TransactionsEntities entities = new TransactionsEntities();
This is what was causing it to slow down so badly: it uses the same context (and the same connection to the DB) for every file. I replaced it with:
using (TransactionsEntities entities = new TransactionsEntities()){
//Transaction parsing code for 1 file
}
and the application is now flying through the files at roughly 100x the speed it was before :)

Your problem is:
TransactionsEntities entities = new TransactionsEntities();
Entity Framework was not designed with bulk data in mind. It caches tracked entries in memory to minimize requeries and so on, but adding lots of entries to a single context makes maintaining and checking that cached data slow, which is what is happening in your case.
If you're using SQL Server, you would be better off with SqlBulkCopy. It can speed up the process dramatically, possibly by around 100x.
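As a rough sketch of the bulk-load idea, assuming SQL Server and that every column stays a string as in the question: each file can be parsed into a DataTable instead of tracking one entity per row in the context. TransactionTableBuilder, BuildTable, and the three-column list are hypothetical names for illustration only; the real table has thirteen columns.

```csharp
using System;
using System.Data;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class TransactionTableBuilder
{
    // Hypothetical column subset; the question's table has 13 string columns.
    private static readonly string[] Columns = { "Account", "Timestamp", "TransactionType" };

    // Parses comma-delimited text into a DataTable, one DataRow per CSV record.
    public static DataTable BuildTable(TextReader csv)
    {
        var table = new DataTable("TransactionList");
        foreach (var name in Columns)
        {
            table.Columns.Add(name, typeof(string));
        }

        using (var parser = new TextFieldParser(csv))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                // Skip records that are too short instead of throwing.
                if (fields == null || fields.Length < Columns.Length)
                {
                    continue;
                }
                DataRow row = table.NewRow();
                for (int i = 0; i < Columns.Length; i++)
                {
                    row[i] = fields[i];
                }
                table.Rows.Add(row);
            }
        }
        return table;
    }
}
```

The resulting table could then be handed to SqlBulkCopy.WriteToServer(DataTable) with DestinationTableName set, once per file. Whether that beats a fresh context per file depends on the row counts, so it is worth measuring both.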

Related

Replacing a special character with a \n but keeping the text in the same 'column' [duplicate]

I am trying to write to a CSV file row by row using C#. Here is my function:
string first = reader[0].ToString();
string second=image.ToString();
string csv = string.Format("{0},{1}\n", first, second);
File.WriteAllText(filePath, csv);
The whole function runs inside a loop, and every row should be written to the CSV file. In my case, each row overwrites the previous one, and in the end I get only a single record in the CSV file, which is the last one. How can I write all the rows to the CSV file?
UPDATE
Back in my naïve days, I suggested doing this manually (it was a simple solution to a simple question), however due to this becoming more and more popular, I'd recommend using the library CsvHelper that does all the safety checks, etc.
CSV is way more complicated than what the question/answer suggests.
Original Answer
As you already have a loop, consider doing it like this:
//before your loop
var csv = new StringBuilder();
//in your loop
var first = reader[0].ToString();
var second = image.ToString();
//Suggestion made by KyleMit
var newLine = string.Format("{0},{1}", first, second);
csv.AppendLine(newLine);
//after your loop
File.WriteAllText(filePath, csv.ToString());
Or something to this effect.
My reasoning is: you won't need to write to the file for every item; you will only be opening the stream once and then writing to it.
You can replace
File.WriteAllText(filePath, csv.ToString());
with
File.AppendAllText(filePath, csv.ToString());
if you want to keep previous versions of csv in the same file
C# 6
If you are using C# 6.0 then you can do the following:
var newLine = $"{first},{second}";
EDIT
Here is a link to a question that explains what Environment.NewLine does.
I would highly recommend you to go the more tedious route. Especially if your file size is large.
using(var w = new StreamWriter(path))
{
for( /* your loop */)
{
var first = yourFnToGetFirst();
var second = yourFnToGetSecond();
var line = string.Format("{0},{1}", first, second);
w.WriteLine(line);
w.Flush();
}
}
File.AppendAllText() opens the file, writes the content, and then closes the file. Opening a file is much more resource-heavy than writing data to an already open stream, so opening and closing a file inside a loop will degrade performance.
The approach suggested by Johan solves that problem by storing all the output in memory and then writing it once. However, with big files your program will consume a large amount of RAM and may even crash with an OutOfMemoryException.
Another advantage of my solution is that you can implement pausing/resuming by saving the current position in the input data.
Update: moved the using statement to the right place.
Writing csv files by hand can be difficult because your data might contain commas and newlines. I suggest you use an existing library instead.
This question mentions a few options.
Are there any CSV readers/writer libraries in C#?
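To make the comma and newline problem concrete, here is a minimal sketch of the usual quoting rule (quote a field only when needed, and double any embedded quotes). CsvQuoting, Escape, and ToLine are hypothetical helper names, and this covers only the escaping step, not everything a real CSV library handles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvQuoting
{
    // Quotes a field only when it contains a comma, quote, CR or LF,
    // doubling any embedded quotes (the common RFC 4180 convention).
    public static string Escape(string field)
    {
        if (field == null)
        {
            return string.Empty;
        }
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Escapes each field and joins them into one CSV record (no trailing newline).
    public static string ToLine(IEnumerable<string> fields)
    {
        return string.Join(",", fields.Select(f => Escape(f)));
    }
}
```

A naive string.Format("{0},{1}", ...) would turn the value b,c into two columns; with the helper it stays one quoted field.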
I use a two-pass solution as it's very easy to maintain:
// Prepare the values
var allLines = (from trade in proposedTrades
select new object[]
{
trade.TradeType.ToString(),
trade.AccountReference,
trade.SecurityCodeType.ToString(),
trade.SecurityCode,
trade.ClientReference,
trade.TradeCurrency,
trade.AmountDenomination.ToString(),
trade.Amount,
trade.Units,
trade.Percentage,
trade.SettlementCurrency,
trade.FOP,
trade.ClientSettlementAccount,
string.Format("\"{0}\"", trade.Notes),
}).ToList();
// Build the file content
var csv = new StringBuilder();
allLines.ForEach(line =>
{
csv.AppendLine(string.Join(",", line));
});
File.WriteAllText(filePath, csv.ToString());
Instead of calling AppendAllText() every time, you could open the file once and then write the whole content:
var file = @"C:\myOutput.csv";
using (var stream = File.CreateText(file))
{
for (int i = 0; i < reader.Count(); i++)
{
string first = reader[i].ToString();
string second = image.ToString();
string csvRow = string.Format("{0},{1}", first, second);
stream.WriteLine(csvRow);
}
}
You can use AppendAllText instead:
File.AppendAllText(filePath, csv);
As the documentation of WriteAllText says:
If the target file already exists, it is overwritten
Also, note that your current code is not using proper new lines, for example in Notepad you'll see it all as one long line. Change the code to this to have proper new lines:
string csv = string.Format("{0},{1}{2}", first, image, Environment.NewLine);
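The difference between the two calls can be seen in a small side-by-side sketch. OverwriteVersusAppend and its two methods are hypothetical names; the rows and the path are whatever you pass in.

```csharp
using System;
using System.IO;

public static class OverwriteVersusAppend
{
    // Writes each row with File.WriteAllText; every call replaces the file,
    // so only the last row survives.
    public static string[] WriteAllTextInLoop(string path, string[] rows)
    {
        foreach (var row in rows)
        {
            File.WriteAllText(path, row + Environment.NewLine);
        }
        return File.ReadAllLines(path);
    }

    // Writes each row with File.AppendAllText; rows accumulate in the file.
    public static string[] AppendAllTextInLoop(string path, string[] rows)
    {
        File.Delete(path); // start from an empty file for the comparison
        foreach (var row in rows)
        {
            File.AppendAllText(path, row + Environment.NewLine);
        }
        return File.ReadAllLines(path);
    }
}
```

Note that AppendAllText still opens and closes the file on every call, so for many rows a single StreamWriter is likely the better fit.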
Instead of reinventing the wheel, a library could be used. CsvHelper is great for creating and reading CSV files. Its read and write operations are stream based and therefore also support large amounts of data.
You can write your csv like the following.
using(var textWriter = new StreamWriter(@"C:\mypath\myfile.csv"))
{
var writer = new CsvWriter(textWriter, CultureInfo.InvariantCulture);
writer.Configuration.Delimiter = ",";
foreach (var item in list)
{
writer.WriteField( "a" );
writer.WriteField( 2 );
writer.WriteField( true );
writer.NextRecord();
}
}
As the library is using reflection it will take any type and parse it directly.
public class CsvRow
{
public string Column1 { get; set; }
public bool Column2 { get; set; }
public CsvRow(string column1, bool column2)
{
Column1 = column1;
Column2 = column2;
}
}
IEnumerable<CsvRow> rows = new [] {
new CsvRow("value1", true),
new CsvRow("value2", false)
};
using(var textWriter = new StreamWriter(@"C:\mypath\myfile.csv"))
{
var writer = new CsvWriter(textWriter, CultureInfo.InvariantCulture);
writer.Configuration.Delimiter = ",";
writer.WriteRecords(rows);
}
value1,true
value2,false
If you want to read more about the library's configuration and possibilities you can do so here.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Data;
using System.Configuration;
using System.Data.SqlClient;
public partial class CS : System.Web.UI.Page
{
protected void ExportCSV(object sender, EventArgs e)
{
string constr = ConfigurationManager.ConnectionStrings["constr"].ConnectionString;
using (SqlConnection con = new SqlConnection(constr))
{
using (SqlCommand cmd = new SqlCommand("SELECT * FROM Customers"))
{
using (SqlDataAdapter sda = new SqlDataAdapter())
{
cmd.Connection = con;
sda.SelectCommand = cmd;
using (DataTable dt = new DataTable())
{
sda.Fill(dt);
//Build the CSV file data as a Comma separated string.
string csv = string.Empty;
foreach (DataColumn column in dt.Columns)
{
//Add the Header row for CSV file.
csv += column.ColumnName + ',';
}
//Add new line.
csv += "\r\n";
foreach (DataRow row in dt.Rows)
{
foreach (DataColumn column in dt.Columns)
{
//Add the Data rows.
csv += row[column.ColumnName].ToString().Replace(",", ";") + ',';
}
//Add new line.
csv += "\r\n";
}
//Download the CSV file.
Response.Clear();
Response.Buffer = true;
Response.AddHeader("content-disposition", "attachment;filename=SqlExport.csv");
Response.Charset = "";
Response.ContentType = "application/text";
Response.Output.Write(csv);
Response.Flush();
Response.End();
}
}
}
}
}
}
Handling Commas
For handling commas inside of values when using string.Format(...), the following has worked for me:
var newLine = string.Format("\"{0}\",\"{1}\",\"{2}\"",
first,
second,
third
);
csv.AppendLine(newLine);
So to combine it with Johan's answer, it'd look like this:
//before your loop
var csv = new StringBuilder();
//in your loop
var first = reader[0].ToString();
var second = image.ToString();
//Suggestion made by KyleMit
var newLine = string.Format("\"{0}\",\"{1}\"", first, second);
csv.AppendLine(newLine);
//after your loop
File.WriteAllText(filePath, csv.ToString());
Returning CSV File
If you simply wanted to return the file instead of writing it to a location, this is an example of how I accomplished it:
From a Stored Procedure
public FileContentResult DownloadCSV()
{
// I have a stored procedure that queries the information I need
SqlConnection thisConnection = new SqlConnection("Data Source=sv12sql;User ID=UI_Readonly;Password=SuperSecure;Initial Catalog=DB_Name;Integrated Security=false");
SqlCommand queryCommand = new SqlCommand("spc_GetInfoINeed", thisConnection);
queryCommand.CommandType = CommandType.StoredProcedure;
StringBuilder sbRtn = new StringBuilder();
// If you want headers for your file
var header = string.Format("\"{0}\",\"{1}\",\"{2}\"",
"Name",
"Address",
"Phone Number"
);
sbRtn.AppendLine(header);
// Open Database Connection
thisConnection.Open();
using (SqlDataReader rdr = queryCommand.ExecuteReader())
{
while (rdr.Read())
{
// rdr["COLUMN NAME"].ToString();
var queryResults = string.Format("\"{0}\",\"{1}\",\"{2}\"",
rdr["Name"].ToString(),
rdr["Address"].ToString(),
rdr["Phone Number"].ToString()
);
sbRtn.AppendLine(queryResults);
}
}
thisConnection.Close();
return File(new System.Text.UTF8Encoding().GetBytes(sbRtn.ToString()), "text/csv", "FileName.csv");
}
From a List
/* To help illustrate */
public static List<Person> list = new List<Person>();
/* To help illustrate */
public class Person
{
public string name;
public string address;
public string phoneNumber;
}
/* The important part */
public FileContentResult DownloadCSV()
{
StringBuilder sbRtn = new StringBuilder();
// If you want headers for your file
var header = string.Format("\"{0}\",\"{1}\",\"{2}\"",
"Name",
"Address",
"Phone Number"
);
sbRtn.AppendLine(header);
foreach (var item in list)
{
var listResults = string.Format("\"{0}\",\"{1}\",\"{2}\"",
item.name,
item.address,
item.phoneNumber
);
sbRtn.AppendLine(listResults);
}
return File(new System.Text.UTF8Encoding().GetBytes(sbRtn.ToString()), "text/csv", "FileName.csv");
}
Hopefully this is helpful.
This is a simple tutorial on creating csv files using C# that you will be able to edit and expand on to fit your own needs.
First you’ll need to create a new Visual Studio C# console application, there are steps to follow to do this.
The example code will create a csv file called MyTest.csv in the location you specify. The contents of the file should be 3 named columns with text in the first 3 rows.
https://tidbytez.com/2018/02/06/how-to-create-a-csv-file-with-c/
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace CreateCsv
{
class Program
{
static void Main()
{
// Set the path and filename variable "path", filename being MyTest.csv in this example.
// Change SomeGuy for your username.
string path = @"C:\Users\SomeGuy\Desktop\MyTest.csv";
// Set the variable "delimiter" to ", ".
string delimiter = ", ";
// This text is added only once to the file.
if (!File.Exists(path))
{
// Create a file to write to.
string createText = "Column 1 Name" + delimiter + "Column 2 Name" + delimiter + "Column 3 Name" + delimiter + Environment.NewLine;
File.WriteAllText(path, createText);
}
// This text is always added, making the file longer over time
// if it is not deleted.
string appendText = "This is text for Column 1" + delimiter + "This is text for Column 2" + delimiter + "This is text for Column 3" + delimiter + Environment.NewLine;
File.AppendAllText(path, appendText);
// Open the file to read from.
string readText = File.ReadAllText(path);
Console.WriteLine(readText);
}
}
}
public static class Extensions
{
public static void WriteCSVLine(this StreamWriter writer, IEnumerable<string> fields)
{
const string q = @"""";
writer.WriteLine(string.Join(",",
fields.Select(
v => (v.Contains(',') || v.Contains('"') || v.Contains('\n') || v.Contains('\r')) ? $"{q}{v.Replace(q, q + q)}{q}" : v
)));
}
public static void WriteCSVLine(this StreamWriter writer, params string[] fields) => WriteCSVLine(writer, (IEnumerable<string>)fields);
}
This should allow you to write a csv file quite simply. Usage:
StreamWriter writer = new ("myfile.csv");
writer.WriteCSVLine("A", "B"); // A,B
Here is another open source library to create CSV file easily, Cinchoo ETL
List<dynamic> objs = new List<dynamic>();
dynamic rec1 = new ExpandoObject();
rec1.Id = 10;
rec1.Name = @"Mark";
rec1.JoinedDate = new DateTime(2001, 2, 2);
rec1.IsActive = true;
rec1.Salary = new ChoCurrency(100000);
objs.Add(rec1);
dynamic rec2 = new ExpandoObject();
rec2.Id = 200;
rec2.Name = "Tom";
rec2.JoinedDate = new DateTime(1990, 10, 23);
rec2.IsActive = false;
rec2.Salary = new ChoCurrency(150000);
objs.Add(rec2);
using (var parser = new ChoCSVWriter("emp.csv").WithFirstLineHeader())
{
parser.Write(objs);
}
For more information, please read the CodeProject article on usage.
One simple way to get rid of the overwriting issue is to use File.AppendText to append a line at the end of the file:
void Main()
{
using (System.IO.StreamWriter sw = System.IO.File.AppendText("file.txt"))
{
string first = reader[0].ToString();
string second=image.ToString();
// WriteLine already appends the line terminator, so no "\n" is needed here.
string csv = string.Format("{0},{1}", first, second);
sw.WriteLine(csv);
}
}
string string_value= string.Empty;
for (int i = 0; i < ur_grid.Rows.Count; i++)
{
for (int j = 0; j < ur_grid.Rows[i].Cells.Count; j++)
{
if (!string.IsNullOrEmpty(ur_grid.Rows[i].Cells[j].Text.ToString()))
{
if (j > 0)
string_value= string_value+ "," + ur_grid.Rows[i].Cells[j].Text.ToString();
else
{
if (string.IsNullOrEmpty(string_value))
string_value= ur_grid.Rows[i].Cells[j].Text.ToString();
else
string_value= string_value+ Environment.NewLine + ur_grid.Rows[i].Cells[j].Text.ToString();
}
}
}
}
string where_to_save_file = @"d:\location\Files\sample.csv";
File.WriteAllText(where_to_save_file, string_value);
string server_path = "/site/Files/sample.csv";
Response.ContentType = ContentType;
Response.AppendHeader("Content-Disposition", "attachment; filename=" + Path.GetFileName(server_path));
Response.WriteFile(server_path);
Response.End();
You might just have to add a line break ("\r\n").

Using MySqlBulkLoader to upload a DataSet content : Issue with the filename

I'd like to test the performance of MySqlBulkLoader, given that the Adapter.update() method I'm using takes roughly 30 minutes to run.
I understand you have to go through a file to do it, so here is my code:
private void button14_Click(object sender, EventArgs e)
{
string fileName = @"C:\Users\Utilisateur\ds.txt";
if (File.Exists(fileName))
{
File.Delete(fileName);
}
using (StreamWriter sw = File.CreateText(fileName))
{
foreach (DataRow row in Globals.ds.Tables[0].Rows)
{
foreach (object item in row.ItemArray)
{
string itemstr = item.ToString();
sw.Write((string)itemstr + "\t");
}
sw.WriteLine();
}
}
using (var conn = new MySqlConnection(Globals.connString))
{
conn.Open();
MySqlCommand comm = new MySqlCommand("TRUNCATE Song",conn);
comm.ExecuteNonQuery();
var bl = new MySqlBulkLoader(conn)
{
TableName = Globals.ds.Tables[0].ToString(),
Timeout = 600,
FieldTerminator = "\t",
LineTerminator = "\n",
FileName = fileName
};
var numberOfInsertedRows = bl.Load();
Console.WriteLine(numberOfInsertedRows);
}
}
The file is generated OK, but at the var numberOfInsertedRows = bl.Load(); line I get the following error at run time:
MySql.Data.MySqlClient.MySqlException: 'Can't get stat of '/var/packages/MariaDB10/target/mysql/disk/C:\Users\Utilisateur\ds.txt' (Errcode: 2 "No such file or directory")'
I tried using "/" instead of "\" in the fileName but I get the same error.
I have no idea what's going on. Can anyone help?
Thanks
By default, MySqlBulkLoader loads a file from the server's file system. To use a local file, set bl.Local = true; before calling bl.Load().
To enable this, you will need to set AllowLoadLocalInfile = True in your connection string; see https://mysqlconnector.net/troubleshooting/load-data-local-infile/
Finally, if you switch to MySqlConnector, you can use its MySqlBulkCopy API to load data directly from a DataTable, instead of first saving it to a local CSV file, then loading that file.

Dumping SQL table to .csv C#

I am trying to implement a routine in my application that dumps the entire contents of a SQL database (running MS SQL Server Express 2014) to a .csv file. It dumps everything for now, but I am trying to write the code so that I can easily customize it to grab only certain columns.
Here is the code I have written currently:
public void doCsvWrite(string timeStamp){
try {
//specify file name of log file (csv).
string newFileName = "C:/TestDirectory/DataExport-" + timeStamp + ".csv";
//check to see if file exists, if not create an empty file with the specified file name.
if (!File.Exists(newFileName)) {
FileStream fs = new FileStream(newFileName, FileMode.CreateNew);
fs.Close();
//define header of new file, and write header to file.
string csvHeader = "ITEM1,ITEM2,ITEM3,ITEM4,ITEM5";
using (FileStream fsWHT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using(StreamWriter swT = new StreamWriter(fsWHT))
{
swT.WriteLine(csvHeader.ToString());
}
}
//set up connection to database.
SqlConnection myDEConnection;
String cDEString = "Data Source=localhost\\NAMEDPIPE;Initial Catalog=db;User Id=user;Password=pwd";
String strDEStatement = "SELECT * FROM table";
try
{
myDEConnection = new SqlConnection(cDEString);
}
catch (Exception ex)
{
//error handling here.
return;
}
try
{
myDEConnection.Open();
}
catch (Exception ex)
{
//error handling here.
return;
}
SqlDataReader reader = null;
SqlCommand myDECommand = new SqlCommand(strDEStatement, myDEConnection);
try
{
reader = myDECommand.ExecuteReader();
while (reader.Read())
{
for (int i = 0; i < reader.FieldCount; i++)
{
if(reader["Column1"].ToString() == "") {
//does nothing if the current line is "bugged" (containing no values at all, typically happens after reboot of 3rd party equipment).
}
else {
//grab relevant tag data and set the csv line for the current row.
string csvDetails = reader["Column1"] + "," + reader["Column2"] + "," + String.Format("{0:0.0}", reader["Column3"]) + "," + String.Format("{0:0.000}", reader["Column4"]) + "," + reader["Column5"];
using (FileStream fsWDT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using(StreamWriter swDT = new StreamWriter(fsWDT))
{
//write csv line to file.
swDT.WriteLine(csvDetails.ToString());
}
}
}
}
}
catch (Exception ex)
{
//error handling here.
myDEConnection.Close();
return;
}
myDEConnection.Close();
}
catch (Exception ex)
{
//error handling here.
MessageBox.Show(ex.Message);
}
}
Now, this was working fine when I was using it with a 3rd party SQLite-based database, but the output I'm getting after modifying it for my MSSQL db looks something like this (ITEM1 is the primary key, a standard auto-incrementing ID field):
ITEM1,ITEM2,ITEM3,ITEM4,ITEM5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
....
So it seems that it writes several entries of the same row, where I would just like one single line each row. Any suggestions?
Thanks in advance.
edit: Thanks everyone for your answers!
The for loop isn't needed in the section below. Because it loops from 0 to FieldCount I assume the loop was originally meant to append the text from each column together but inside the loop there's a single line that concatenates the text and assigns it to csvDetails.
try
{
reader = myDECommand.ExecuteReader();
while (reader.Read())
{
for (int i = 0; i < reader.FieldCount; i++)
{
if(reader["Column1"].ToString() == "") {
//does nothing if the current line is "bugged" (containing no values at all, typically happens after reboot of 3rd party equipment).
}
else {
//grab relevant tag data and set the csv line for the current row.
string csvDetails = reader["Column1"] + "," + reader["Column2"] + "," + String.Format("{0:0.0}", reader["Column3"]) + "," + String.Format("{0:0.000}", reader["Column4"]) + "," + reader["Column5"];
using (FileStream fsWDT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using(StreamWriter swDT = new StreamWriter(fsWDT))
{
//write csv line to file.
swDT.WriteLine(csvDetails.ToString());
}
}
}
}
}
Usually, we use specially designed export/import utilities for dumping data.
However, if you have to implement your own routine, I suggest decomposing it.
private static IEnumerable<IDataRecord> SourceData(String sql) {
using (SqlConnection con = new SqlConnection(ConnectionStringHere)) {
con.Open();
using (SqlCommand q = new SqlCommand(sql, con)) {
using (var reader = q.ExecuteReader()) {
while (reader.Read()) {
//TODO: you may want to add additional conditions here
yield return reader;
}
}
}
}
}
private static IEnumerable<String> ToCsv(IEnumerable<IDataRecord> data) {
foreach (IDataRecord record in data) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < record.FieldCount; ++i) {
// Read the i-th column (index 0 would repeat the first column in every position).
String chunk = Convert.ToString(record.GetValue(i));
if (i > 0)
sb.Append(',');
if (chunk.Contains(',') || chunk.Contains(';'))
chunk = "\"" + chunk.Replace("\"", "\"\"") + "\"";
sb.Append(chunk);
}
yield return sb.ToString();
}
}
Having SourceData and ToCsv you can easily implement
private static void WriteMyCsv(String fileName) {
var source = SourceData("SELECT * FROM table");
File.WriteAllLines(fileName, ToCsv(source));
}
You have a for loop which is looping over the fieldcount.
for (int i = 0; i < reader.FieldCount; i++)
I think it will work if you remove the loop as you don't need to iterate through the columns.
It happens because the output is placed inside the for loop
for (int i = 0; i < reader.FieldCount; i++)
so every record is repeated FieldCount times.
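A minimal sketch of the corrected shape, assuming any IDataReader (a SqlDataReader would work the same way): the inner loop only collects the columns of the current record, and the line is produced once per record, after that loop. ReaderToCsv and ToCsvLines are hypothetical names, and no quoting of commas inside values is attempted here.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

public static class ReaderToCsv
{
    // Produces exactly one CSV line per record: the inner loop walks the columns
    // of the current record, and the line is added once, after that loop.
    public static List<string> ToCsvLines(IDataReader reader)
    {
        var lines = new List<string>();
        while (reader.Read())
        {
            var fields = new string[reader.FieldCount];
            for (int i = 0; i < reader.FieldCount; i++)
            {
                fields[i] = Convert.ToString(reader.GetValue(i), CultureInfo.InvariantCulture);
            }
            lines.Add(string.Join(",", fields));
        }
        return lines;
    }
}
```

The lines can then be written in one go, for example with File.WriteAllLines, instead of reopening the file for every row.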
Complete example. Verified working on .NET 4.8, May 22. Code simplified for demo.
Why the DataTable? Under some circumstances it is useful. If you are converting hundreds of files at once with multithreading, it works as a large buffer, and you can do fairly complex data manipulation at the same time should you need it.
Unfortunately, Microsoft tries to detect the column types, and if your data does not comply with that mechanism you end up with hard-to-correct errors. In that case use the second solution.
// Get the data from SQLite
SqliteConnection SQLiDataCon = new SqliteConnection(@"Data Source=c:\sqlite.db3");
SQLiDataCon.Open();
SqliteDataReader SQLiDtaReader = new SqliteCommand(@"SELECT * FROM stats;", SQLiDataCon).ExecuteReader();
// Load data to DataTable
DataTable csvTable = new DataTable();
csvTable.Load(SQLiDtaReader);
// Get "one" string with column names
string csvFields = @"""" + String.Join(@""",""",csvTable.Columns.Cast<DataColumn>().Select(dc => dc.ColumnName).ToArray()) + @"""";
// Prep "in memory the entire content of the CSV"
StringBuilder csvString = new StringBuilder();
// Write the header in
csvString.AppendLine(csvFields);
// Write the rows in
foreach (DataRow dr in csvTable.Rows)
{
csvString.AppendLine(@"""" + String.Join(@""",""", dr.ItemArray) + @"""");
}
// Save to file
// Dispose the writer so the buffered content is flushed to disk.
using (StreamWriter csvFile = new StreamWriter(@"c:\stats.csv"))
{
csvFile.Write(csvString);
}
Without DataTable.
// SQLITE
SqliteConnection SQLiDataCon = new SqliteConnection(@"Data Source=c:\sqlite.db3");
SQLiDataCon.Open();
StringBuilder csvString = new StringBuilder();
StreamWriter csvFile;
Object[] csvRow;
SqliteDataReader SQLiDtaReader = new SqliteCommand(@"SELECT * FROM sometable;", SQLiDataCon).ExecuteReader();
// CSV HEADER
csvString.AppendLine(@"""" + String.Join(@""",""", SQLiDtaReader.GetSchemaTable().AsEnumerable().Select(dr => dr.Field<string>("ColumnName")).ToArray<string>()) + @"""");
// CSV BODY
while (SQLiDtaReader.Read())
{
SQLiDtaReader.GetValues(csvRow = new Object[SQLiDtaReader.FieldCount]);
csvString.AppendLine(@"""" + String.Join(@""",""",csvRow ) + @"""");
}
// WRITE IT
// Dispose the writer so the buffered content is flushed to disk.
using (csvFile = new StreamWriter(@"C:\somecsvfile.csv"))
{
csvFile.Write(csvString);
}

Parsing CSV data

I am trying to parse a CSV file, with no luck. I have tried a bunch of tools online and none has been able to parse the file correctly. I am baffled to be asking for help here, as one would think parsing CSV data would be super easy.
The format of the CSV data is like this:
",95,54070,3635,""Test Reservation"",0,102,0.00,0.00,2014-12-31,""Name of customer"",""$12.34 + $10, special price"",""extra information"",,CustomerName,,,,,1234567890,youremail@domain.com,CustomerName,2014-12-31,23:59:59,16,0,60,2,120,0,NULL,NULL,NULL,"
Current code:
private void btnOpenFileDialog_Click(object sender, EventArgs e)
{
DialogResult result = openFileDialog1.ShowDialog();
if (result == DialogResult.OK)
{
using (StreamReader reader = new StreamReader(openFileDialog1.FileName))
{
string line;
while ((line = reader.ReadLine()) != null)
{
ParseCsvLine(line);
}
}
}
}
private void ParseCsvLine(string line)
{
if (line != string.Empty)
{
string[] result;
using (var csvParser = new TextFieldParser(new StringReader(line)))
{
csvParser.Delimiters = new string[] { "," };
result = csvParser.ReadFields();
}
foreach (var item in result)
{
Console.WriteLine(item + Environment.NewLine);
}
}
}
The result variable only has one item, and it's:
,95,54070,3635,"Test Reservation",0,102,0.00,0.00,2014-12-31,"Name of customer","$12.34 + $10, special price","extra information",,CustomerName,,,,,1234567890,youremail@domain.com,CustomerName,2014-12-31,23:59:59,16,0,60,2,120,0,NULL,NULL,NULL,
// Add Microsoft.VisualBasic.dll to References.
using Microsoft.VisualBasic.FileIO;
// input is your original line from csv.
// Remove starting and ending quotes.
input = input.Remove(0, 1);
input = input.Remove(input.Length - 1);
// Replace double quotes with single quotes.
input = input.Replace("\"\"", "\"");
string[] result;
using (var csvParser = new TextFieldParser(new StringReader(input)))
{
csvParser.Delimiters = new string[] { "," };
result = csvParser.ReadFields();
}
You can check out a previous post that deals with those pesky commas in csv files. I'm linking it here.
Also Mihai, your solution works well for just the one line but will fail once there are many lines to parse.

How to use textfieldParser to edit a CSV file?

I wrote a small function that reads a CSV file line by line using TextFieldParser, edits a specific field, then writes the result back to a CSV file.
Here is the code :
private void button2_Click(object sender, EventArgs e)
{
String path = #"C:\file.csv";
String dpath = #"C:\file_processed.csv";
List<String> lines = new List<String>();
if (File.Exists(path))
{
using (TextFieldParser parser = new TextFieldParser(path))
{
String line;
parser.HasFieldsEnclosedInQuotes = true;
parser.Delimiters = new string[] { "," };
while ((line = parser.ReadLine()) != null)
{
string[] parts = parser.ReadFields();
if (parts == null)
{
break;
}
if ((parts[12] != "") && (parts[12] != "0"))
{
parts[12] = parts[12].Substring(0, 3);
//MessageBox.Show(parts[12]);
}
lines.Add(line);
}
}
using (StreamWriter writer = new StreamWriter(dpath, false))
{
foreach (String line in lines)
writer.WriteLine(line);
}
MessageBox.Show("CSV file successfully processed ! ");
}
}
The field I want to edit is the one at index 12 (parts[12]).
For example: if parts[12] = 000,000,234 then change it to 000.
The file is created, but the problem is that the field is not edited and half the records are missing. I am hoping someone can point out the mistake.
You call both parser.ReadFields() and parser.ReadLine(). Each of them advances the cursor by one line, which is why you're missing half the rows. Change the while to:
while(!parser.EndOfData)
Then add parts = parser.ReadFields(); to the end of the loop. Not having this is why your edit isn't being seen.
You can also remove:
if (parts == null)
{
break;
}
Since you no longer have line, you'll need to use the fields to keep track of your results:
lines.Add(string.Join(",", parts));//handle string escaping fields if needed.
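Putting those points together, the loop might look roughly like the sketch below. It assumes the goal is to keep only the first three characters of one field and to write the edited fields back rather than the raw line. CsvFieldEditor, TruncateField, and Quote are hypothetical names, and the re-quoting is a simple guess at what the output file needs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

public static class CsvFieldEditor
{
    // Reads every record exactly once with ReadFields, shortens the field at
    // fieldIndex to at most 3 characters, and returns the rewritten lines.
    public static List<string> TruncateField(TextReader input, int fieldIndex)
    {
        var lines = new List<string>();
        using (var parser = new TextFieldParser(input))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.HasFieldsEnclosedInQuotes = true;
            parser.SetDelimiters(",");
            while (!parser.EndOfData)
            {
                string[] parts = parser.ReadFields();
                if (parts.Length > fieldIndex)
                {
                    string value = parts[fieldIndex];
                    if (value != "" && value != "0" && value.Length > 3)
                    {
                        parts[fieldIndex] = value.Substring(0, 3);
                    }
                }
                // Store the edited fields, not the original line.
                lines.Add(string.Join(",", parts.Select(p => Quote(p))));
            }
        }
        return lines;
    }

    // Re-quotes a field if it contains a comma, quote or line break.
    private static string Quote(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }
}
```

For the file in the question this would be called with a StreamReader over the input path and fieldIndex 12, and the returned lines written out with a StreamWriter as before.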
