I am using the code below to export data from a CSV file to a DataTable.
Because the values are mixed text, i.e. both numbers and letters, some of the columns are not getting exported to the DataTable.
I did some research here and found that we need to set ImportMixedType = Text and TypeGuessRows = 0 in the registry, but even that did not solve the problem.
The code below works for some files, even with mixed text.
Could someone tell me what is wrong with it? Am I missing something here?
if (isFirstRowHeader)
{
    header = "Yes";
}
using (OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathOnly +
    ";Extended Properties=\"text;HDR=" + header + ";FMT=Delimited\";"))
{
    using (OleDbCommand command = new OleDbCommand(sql, connection))
    using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
    {
        adapter.Fill(table);
    }
}
For a comma-delimited file, this worked for me:
public DataTable CSVtoDataTable(string inputpath)
{
    DataTable csvdt = new DataTable();
    if (File.Exists(inputpath))
    {
        using (StreamReader sr = new StreamReader(inputpath))
        {
            string fulltext = sr.ReadToEnd(); //read the full content
            string[] rows = fulltext.Split('\n'); //split the content to get the rows
            var regex = new Regex("\\\"(.*?)\\\"");
            for (int i = 0; i < rows.Length - 1; i++) //the empty entry after the trailing newline is skipped
            {
                //escape commas inside quotes, and trim the \r left over from CRLF line endings
                var output = regex.Replace(rows[i].TrimEnd('\r'), m => m.Value.Replace(",", "\\c"));
                string[] rowValues = output.Split(','); //split the row on commas to get the column values
                if (i == 0)
                {
                    for (int j = 0; j < rowValues.Length; j++)
                    {
                        csvdt.Columns.Add(rowValues[j].Replace("\\c", ",")); //headers
                    }
                }
                else
                {
                    try
                    {
                        DataRow dr = csvdt.NewRow();
                        for (int k = 0; k < rowValues.Length; k++)
                        {
                            if (k >= csvdt.Columns.Count) //more columns may exist than headers
                            {
                                csvdt.Columns.Add("clmn" + k); //the new column is also usable on the pending row
                            }
                            dr[k] = rowValues[k].Replace("\\c", ",");
                        }
                        csvdt.Rows.Add(dr); //add the data row
                    }
                    catch
                    {
                        Console.WriteLine("error");
                    }
                }
            }
        }
    }
    return csvdt;
}
The main thing that would probably help is to stop using the OleDb objects for reading a delimited file. I suggest using the TextFieldParser, which I have used successfully for over 2 years now for a client.
http://www.dotnetperls.com/textfieldparser
There may be other issues, but without seeing your .CSV file, I can't tell you where your problem may lie.
The TextFieldParser is specifically designed to parse delimited files; the OleDb objects are not. So start there, and then we can determine what the problem may be, if it persists.
If you look at the example on the link I provided, they merely write lines to the console. You can alter that code portion to add rows to a DataTable object instead, as I do, for sorting purposes.
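A minimal sketch of that alteration, assuming the first line holds the headers (the file path handling is illustrative only):
using System.Data;
using Microsoft.VisualBasic.FileIO; // reference Microsoft.VisualBasic.dll

public static DataTable ReadCsvIntoTable(string path)
{
    var table = new DataTable();
    using (var parser = new TextFieldParser(path))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        parser.HasFieldsEnclosedInQuotes = true; // copes with commas inside quoted fields

        bool isHeader = true;
        while (!parser.EndOfData)
        {
            string[] fields = parser.ReadFields();
            if (isHeader)
            {
                foreach (string column in fields)
                    table.Columns.Add(column); // everything stays text, so mixed values are safe
                isHeader = false;
            }
            else
            {
                table.Rows.Add(fields);
            }
        }
    }
    return table;
}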
I am using the EPPlus library to load data from an Excel file. The code I am using works perfectly for Excel files with the standard layout, i.e. the first row holds the column names and all the remaining rows hold the data for those columns. But nowadays I regularly receive Excel files with a different structure, and I am not able to read them.
The Excel file is laid out as described below.
What I want is: on the third row, only Region and Location Id and their values; then the 7th row holds column names whose values are in rows 8 to 15; finally, the 17th row holds column names for rows 18 to 20. How do I load all of this data into separate DataTables?
The code I used is shown below.
I created an extension method
public static DataSet Exceltotable(this string path)
{
DataSet ds = null;
using (var pck = new OfficeOpenXml.ExcelPackage())
{
try
{
using (var stream = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
pck.Load(stream);
}
ds = new DataSet();
var wss = pck.Workbook.Worksheets;
            foreach (var ws in wss)
            {
                System.Data.DataTable tbl = new System.Data.DataTable();
                tbl.TableName = ws.Name;
                if (ws.Dimension == null) continue; //skip empty worksheets
                bool hasHeader = true; //adjust accordingly (this is a simple approach)
                foreach (var firstRowCell in ws.Cells[1, 1, 1, ws.Dimension.End.Column])
                {
                    tbl.Columns.Add(hasHeader ? firstRowCell.Text : string.Format("Column {0}", firstRowCell.Start.Column));
                }
                var startRow = hasHeader ? 2 : 1;
                for (var rowNum = startRow; rowNum <= ws.Dimension.End.Row; rowNum++)
                {
                    var wsRow = ws.Cells[rowNum, 1, rowNum, ws.Dimension.End.Column];
                    var row = tbl.NewRow();
                    foreach (var cell in wsRow)
                    {
                        //modified by faras
                        if (cell.Text != null)
                        {
                            row[cell.Start.Column - 1] = cell.Text;
                        }
                    }
                    tbl.Rows.Add(row);
                }
                DataTable dt = RemoveEmptyRows(tbl);
                ds.Tables.Add(dt);
            }
}
        catch (Exception exp)
        {
            //swallowing the exception hides failures; log or rethrow in real code
            System.Diagnostics.Debug.WriteLine(exp.Message);
        }
return ds;
}
}
If you're providing the template for users to upload, you can mitigate this somewhat by using named ranges in your spreadsheet. That's a good idea anyway when working with Excel programmatically, because it helps when you modify your own spreadsheet, not just when the user does.
You probably know how to name a range, but for the sake of completeness, here's how to name a range.
When you're working with the spreadsheet in code, you can get a reference to the range using [yourworkbook].Names["yourNamedRange"]. If it's just a single cell and you need to reference the row or column index, you can use .Start.Row or .Start.Column.
I add named ranges for anything: cells containing particular values, columns, header rows, rows where sets of data begin. If I need row or column indexes, I assign useful variable names. That protects you from having all sorts of "magic numbers" in your code. You (or your users) can move quite a bit around without breaking anything.
If they modify the structure too much, then it won't work. You can also use protection on the workbook and worksheet to ensure that they can't accidentally modify the structure: tabs, rows, columns. A sketch of that follows.
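This is an assumed sketch against the EPPlus 4.x protection API, with a hypothetical file, sheet name, and password:
using OfficeOpenXml;
using System.IO;

using (var package = new ExcelPackage(new FileInfo("PriceQuoteTemplate.xlsx")))
{
    // stop users from adding, deleting, or moving tabs
    package.Workbook.Protection.LockStructure = true;

    // lock the cells of one sheet so rows and columns can't be reshuffled accidentally
    var sheet = package.Workbook.Worksheets["Quote"]; // hypothetical sheet name
    sheet.Protection.IsProtected = true;
    sheet.Protection.SetPassword("secret"); // hypothetical password

    package.Save();
}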
This is loosely taken from a test I was working with last weekend while learning this. It was just a "hello world", so I wasn't trying to make it all streamlined and perfect. (I was working on populating a spreadsheet, not reading one, so I'm just learning the properties as I go.)
// Open the workbook
using (var package = new ExcelPackage(new FileInfo("PriceQuoteTemplate.xlsx")))
{
    // Get the worksheet I'm looking for
    var quoteSheet = package.Workbook.Worksheets["Quote"];
    //If I wanted to get the text from one named range
    var cellText = quoteSheet.Workbook.Names["myNamedRange"].Text;
    //If I wanted to get the cell's value as some other type
    var cellValue = quoteSheet.Workbook.Names["myNamedRange"].GetValue<int>();
    //If I had a named range and I wanted to loop through the rows and get
    //values from certain columns
    var myRange = quoteSheet.Workbook.Names["rangeContainingRows"];
    //This is a named range used to mark a column. So instead of using a
    //magic number, I'll read from whatever column has this named range.
    var someColumn = quoteSheet.Workbook.Names["columnLabel"].Start.Column;
    for (var rowNumber = myRange.Start.Row; rowNumber < myRange.Start.Row + myRange.Rows; rowNumber++)
    {
        var getTheTextForTheRowAndColumn = quoteSheet.Cells[rowNumber, someColumn].Text;
    }
}
There might be a more elegant way to go about it; I just started using this myself. But the idea is that you tell it to find a certain named range on the spreadsheet, and then you use the row or column number of that range instead of a magic row or column number.
Even though a range might be one cell, one row, or one column, it can potentially be a larger area. That's why I use .Start.Row: in other words, give me the row of the first cell in the range. If a range has more than one row, the .Rows property indicates the number of rows, so I know how many there are. That means someone could even insert rows without breaking the code.
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.OleDb;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
namespace ReadData
{
public partial class ImportExelDataInGridView : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
}
protected void btnUpload_Click(object sender, EventArgs e)
{
            //connection string, empty by default
            string ConStr = "";
            //extension of the uploaded file, saved into ext because
            //there are two kinds of Excel extensions, .xls and .xlsx
            string ext = Path.GetExtension(FileUpload1.FileName).ToLower();
            //build the server path for the file
            string path = Server.MapPath("~/MyFolder/" + FileUpload1.FileName);
            //save the file into the MyFolder folder on the server
            FileUpload1.SaveAs(path);
            Label1.Text = FileUpload1.FileName + "'s data is shown in the GridView below";
            //check whether the extension is .xls or .xlsx
if (ext.Trim() == ".xls")
{
                //connection string for a file with the .xls extension
ConStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=2\"";
}
else if (ext.Trim() == ".xlsx")
{
                //connection string for a file with the .xlsx extension
ConStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=2\"";
}
            //build the query
            string query = "SELECT * FROM [Sheet1$]";
            //create the connection
            OleDbConnection conn = new OleDbConnection(ConStr);
            //open the connection if it is closed
            if (conn.State == ConnectionState.Closed)
            {
                conn.Open();
            }
            //create the command object
            OleDbCommand cmd = new OleDbCommand(query, conn);
            //create a data adapter and get the data into it
            OleDbDataAdapter da = new OleDbDataAdapter(cmd);
            DataSet ds = new DataSet();
            //fill the data set with the excel data
            da.Fill(ds);
            if (ds.Tables != null && ds.Tables.Count > 0)
            {
                //set the data source of the grid view
                gvExcelFile.DataSource = ds.Tables[0];
                //bind the grid view
                gvExcelFile.DataBind();
            }
            //close the connection
            conn.Close();
}
}
}
try
{
    //kill any stray Excel processes (note: this kills every Excel instance on the machine)
    System.Diagnostics.Process[] process = System.Diagnostics.Process.GetProcessesByName("Excel");
    foreach (System.Diagnostics.Process p in process)
    {
        if (!string.IsNullOrEmpty(p.ProcessName))
        {
            try
            {
                p.Kill();
            }
            catch { } //the process may have already exited
        }
    }
    REF_User oREF_User = (REF_User)Session["LoggedUser"];
string pdfFilePath = Server.MapPath("~/FileUpload/" + oREF_User.USER_ID + "");
if (Directory.Exists(pdfFilePath))
{
System.IO.DirectoryInfo di = new DirectoryInfo(pdfFilePath);
foreach (FileInfo file in di.GetFiles())
{
file.Delete();
}
Directory.Delete(pdfFilePath);
}
Directory.CreateDirectory(pdfFilePath);
string path = Server.MapPath("~/FileUpload/" + oREF_User.USER_ID + "/");
if (Path.GetExtension(FileUpload1.FileName) == ".xlsx")
{
string fullpath1 = path + Path.GetFileName(FileUpload1.FileName);
if (FileUpload1.FileName != "")
{
FileUpload1.SaveAs(fullpath1);
}
    FileStream Stream = new FileStream(fullpath1, FileMode.Open);
    IExcelDataReader ExcelReader = ExcelReaderFactory.CreateOpenXmlReader(Stream);
    DataSet oDataSet = ExcelReader.AsDataSet();
    Stream.Close();
foreach (System.Data.DataTable oDataTable in oDataSet.Tables)
{
        //TODO: process each table's rows here
}
oBL_PlantTransactions.InsertList(oListREF_PlantTransactions, null);
ShowMessage("Successfully saved!", REF_ENUM.MessageType.Success);
}
else
{
ShowMessage("File Format Incorrect", REF_ENUM.MessageType.Error);
}
}
catch (Exception ex)
{
ShowMessage("Please check the details and submit again!", REF_ENUM.MessageType.Error);
System.Diagnostics.Process[] process = System.Diagnostics.Process.GetProcessesByName("Excel");
foreach (System.Diagnostics.Process p in process)
{
if (!string.IsNullOrEmpty(p.ProcessName))
{
try
{
p.Kill();
}
catch { }
}
}
}
I found this article to be very helpful. It lists various libraries you can choose from. One of the libraries I used is EPPlus, as shown below.
NuGet: EPPlus Library
Running the sample below prints the data of both sheets: for sheet 1, the A2 value and font color and the B2 formula, value, and border style; for sheet 2, the A2 formula and value.
static void Main(string[] args)
{
using(var package = new ExcelPackage(new FileInfo("Book.xlsx")))
{
var firstSheet = package.Workbook.Worksheets["First Sheet"];
Console.WriteLine("Sheet 1 Data");
Console.WriteLine($"Cell A2 Value : {firstSheet.Cells["A2"].Text}");
Console.WriteLine($"Cell A2 Color : {firstSheet.Cells["A2"].Style.Font.Color.LookupColor()}");
Console.WriteLine($"Cell B2 Formula : {firstSheet.Cells["B2"].Formula}");
Console.WriteLine($"Cell B2 Value : {firstSheet.Cells["B2"].Text}");
Console.WriteLine($"Cell B2 Border : {firstSheet.Cells["B2"].Style.Border.Top.Style}");
Console.WriteLine("");
var secondSheet = package.Workbook.Worksheets["Second Sheet"];
Console.WriteLine($"Sheet 2 Data");
Console.WriteLine($"Cell A2 Formula : {secondSheet.Cells["A2"].Formula}");
Console.WriteLine($"Cell A2 Value : {secondSheet.Cells["A2"].Text}");
}
}
I am trying to implement a script in my application that will dump the entire contents of a SQL database (running MS SQL Server Express 2014) to a .csv file. For now it grabs everything, but I am trying to write the code so that I can easily customize it to grab only certain columns.
Here is the code I have written currently:
public void doCsvWrite(string timeStamp){
try {
//specify file name of log file (csv).
string newFileName = "C:/TestDirectory/DataExport-" + timeStamp + ".csv";
//check to see if file exists, if not create an empty file with the specified file name.
if (!File.Exists(newFileName)) {
FileStream fs = new FileStream(newFileName, FileMode.CreateNew);
fs.Close();
//define header of new file, and write header to file.
string csvHeader = "ITEM1,ITEM2,ITEM3,ITEM4,ITEM5";
using (FileStream fsWHT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using(StreamWriter swT = new StreamWriter(fsWHT))
{
swT.WriteLine(csvHeader.ToString());
}
}
//set up connection to database.
SqlConnection myDEConnection;
String cDEString = "Data Source=localhost\\NAMEDPIPE;Initial Catalog=db;User Id=user;Password=pwd";
String strDEStatement = "SELECT * FROM table";
try
{
myDEConnection = new SqlConnection(cDEString);
}
catch (Exception ex)
{
//error handling here.
return;
}
try
{
myDEConnection.Open();
}
catch (Exception ex)
{
//error handling here.
return;
}
SqlDataReader reader = null;
SqlCommand myDECommand = new SqlCommand(strDEStatement, myDEConnection);
try
{
reader = myDECommand.ExecuteReader();
while (reader.Read())
{
for (int i = 0; i < reader.FieldCount; i++)
{
if(reader["Column1"].ToString() == "") {
//does nothing if the current line is "bugged" (containing no values at all, typically happens after reboot of 3rd party equipment).
}
else {
//grab relevant tag data and set the csv line for the current row.
string csvDetails = reader["Column1"] + "," + reader["Column2"] + "," + String.Format("{0:0.0}", reader["Column3"]) + "," + String.Format("{0:0.000}", reader["Column4"]) + "," + reader["Column5"];
using (FileStream fsWDT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using(StreamWriter swDT = new StreamWriter(fsWDT))
{
//write csv line to file.
swDT.WriteLine(csvDetails.ToString());
}
}
}
}
}
catch (Exception ex)
{
//error handling here.
myDEConnection.Close();
return;
}
myDEConnection.Close();
}
catch (Exception ex)
{
//error handling here.
MessageBox.Show(ex.Message);
}
}
Now, this was working fine when I was using it with a 3rd party SQLite-based database, but the output I'm getting after modifying it for my MSSQL db looks something like this (ITEM1 is the primary key, a standard auto-incrementing ID field):
ITEM1,ITEM2,ITEM3,ITEM4,ITEM5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
1,row1_item2,row1_item3,row1_item4,row1_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
2,row2_item2,row2_item3,row2_item4,row2_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
3,row3_item2,row3_item3,row3_item4,row3_item5
....
So it seems that it writes several copies of the same row, where I would just like one single line per row. Any suggestions?
Thanks in advance.
Edit: Thanks everyone for your answers!
The for loop isn't needed in the section below. Because it loops from 0 to FieldCount, I assume it was originally meant to append the text from each column, but inside the loop a single statement already concatenates the text of all the columns and assigns it to csvDetails, so the same line gets written FieldCount times.
try
{
reader = myDECommand.ExecuteReader();
while (reader.Read())
{
for (int i = 0; i < reader.FieldCount; i++)
{
if(reader["Column1"].ToString() == "") {
//does nothing if the current line is "bugged" (containing no values at all, typically happens after reboot of 3rd party equipment).
}
else {
//grab relevant tag data and set the csv line for the current row.
string csvDetails = reader["Column1"] + "," + reader["Column2"] + "," + String.Format("{0:0.0}", reader["Column3"]) + "," + String.Format("{0:0.000}", reader["Column4"]) + "," + reader["Column5"];
using (FileStream fsWDT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
using(StreamWriter swDT = new StreamWriter(fsWDT))
{
//write csv line to file.
swDT.WriteLine(csvDetails.ToString());
}
}
}
}
}
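A sketch of the corrected section with the loop removed, using the same names as the code above:
reader = myDECommand.ExecuteReader();
while (reader.Read())
{
    //skip "bugged" rows that contain no values at all
    if (reader["Column1"].ToString() == "")
        continue;

    //grab the relevant tag data and build the csv line for the current row once
    string csvDetails = reader["Column1"] + "," + reader["Column2"] + ","
        + String.Format("{0:0.0}", reader["Column3"]) + ","
        + String.Format("{0:0.000}", reader["Column4"]) + ","
        + reader["Column5"];

    using (FileStream fsWDT = new FileStream(newFileName, FileMode.Append, FileAccess.Write))
    using (StreamWriter swDT = new StreamWriter(fsWDT))
    {
        swDT.WriteLine(csvDetails); //one line per row
    }
}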
Usually, we use specially designed export/import utilities for dumping data.
However, if you have to implement your own routine, I suggest decomposing it.
private static IEnumerable<IDataRecord> SourceData(String sql) {
using (SqlConnection con = new SqlConnection(ConnectionStringHere)) {
con.Open();
using (SqlCommand q = new SqlCommand(sql, con)) {
using (var reader = q.ExecuteReader()) {
while (reader.Read()) {
        //TODO: you may want to add additional conditions here
        //note: the same reader instance is yielded each time, so consume each record before advancing
yield return reader;
}
}
}
}
}
private static IEnumerable<String> ToCsv(IEnumerable<IDataRecord> data) {
    foreach (IDataRecord record in data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < record.FieldCount; ++i) {
            String chunk = Convert.ToString(record.GetValue(i)); //GetValue(i), not GetValue(0)
            if (i > 0)
                sb.Append(',');
            //quote fields that contain separators or quotes, doubling any embedded quotes
            if (chunk.Contains(',') || chunk.Contains(';') || chunk.Contains('"'))
                chunk = "\"" + chunk.Replace("\"", "\"\"") + "\"";
            sb.Append(chunk);
        }
        yield return sb.ToString();
    }
}
Having SourceData and ToCsv, you can easily implement:
private static void WriteMyCsv(String fileName) {
var source = SourceData("SELECT * FROM table");
File.WriteAllLines(fileName, ToCsv(source));
}
You have a for loop which loops over the field count.
for (int i = 0; i < reader.FieldCount; i++)
I think it will work if you remove the loop, as you don't need to iterate through the columns.
It happens because the output is placed inside the for loop
for (int i = 0; i < reader.FieldCount; i++)
so every record is repeated FieldCount times.
Complete example. Verified working on .NET 4.8, May '22. Code simplified for the demo.
Why the DataTable? Under some circumstances it is useful: if you are converting hundreds of files at once with multi-threading, it works as a large buffer, and you can do fairly complex data mangling at the same time, should you need it.
Unfortunately, Microsoft tries to detect the column types, and if your data does not comply with that mechanism, it ends in hard-to-correct errors. In that case, use the second solution.
// Get the data from SQLite
SqliteConnection SQLiDataCon = new SqliteConnection(@"Data Source=c:\sqlite.db3");
SQLiDataCon.Open();
SqliteDataReader SQLiDtaReader = new SqliteCommand(@"SELECT * FROM stats;", SQLiDataCon).ExecuteReader();

// Load the data into a DataTable
DataTable csvTable = new DataTable();
csvTable.Load(SQLiDtaReader);

// Get "one" string with the column names
string csvFields = @"""" + String.Join(@""",""", csvTable.Columns.Cast<DataColumn>().Select(dc => dc.ColumnName).ToArray()) + @"""";

// Prep the entire content of the CSV in memory
StringBuilder csvString = new StringBuilder();

// Write the header in
csvString.AppendLine(csvFields);

// Write the rows in
foreach (DataRow dr in csvTable.Rows)
{
    csvString.AppendLine(@"""" + String.Join(@""",""", dr.ItemArray) + @"""");
}

// Save to file
using (StreamWriter csvFile = new StreamWriter(@"c:\stats.csv"))
{
    csvFile.Write(csvString);
}
Without the DataTable:
// SQLITE
SqliteConnection SQLiDataCon = new SqliteConnection(@"Data Source=c:\sqlite.db3");
SQLiDataCon.Open();
StringBuilder csvString = new StringBuilder();
Object[] csvRow;
SqliteDataReader SQLiDtaReader = new SqliteCommand(@"SELECT * FROM sometable;", SQLiDataCon).ExecuteReader();

// CSV HEADER
csvString.AppendLine(@"""" + String.Join(@""",""", SQLiDtaReader.GetSchemaTable().AsEnumerable().Select(dr => dr.Field<string>("ColumnName")).ToArray<string>()) + @"""");

// CSV BODY
while (SQLiDtaReader.Read())
{
    SQLiDtaReader.GetValues(csvRow = new Object[SQLiDtaReader.FieldCount]);
    csvString.AppendLine(@"""" + String.Join(@""",""", csvRow) + @"""");
}

// WRITE IT
using (StreamWriter csvFile = new StreamWriter(@"C:\somecsvfile.csv"))
{
    csvFile.Write(csvString);
}
How can I use OLEDB to parse and import a CSV file in which each cell is encased in double quotes, because some rows contain commas inside them? I am unable to change the format, as it is coming from a vendor.
I am trying the following, and it fails with an IO error:
public DataTable ConvertToDataTable(string fileToImport, string fileDestination)
{
    string fullImportPath = fileDestination + @"\" + fileToImport;
OleDbDataAdapter dAdapter = null;
DataTable dTable = null;
try
{
if (!File.Exists(fullImportPath))
return null;
string full = Path.GetFullPath(fullImportPath);
string file = Path.GetFileName(full);
string dir = Path.GetDirectoryName(full);
//create the "database" connection string
string connString = "Provider=Microsoft.Jet.OLEDB.4.0;"
+ "Data Source=\"" + dir + "\\\";"
+ "Extended Properties=\"text;HDR=No;FMT=Delimited\"";
//create the database query
string query = "SELECT * FROM " + file;
//create a DataTable to hold the query results
dTable = new DataTable();
//create an OleDbDataAdapter to execute the query
dAdapter = new OleDbDataAdapter(query, connString);
//fill the DataTable
dAdapter.Fill(dTable);
}
catch (Exception ex)
{
throw new Exception(CLASS_NAME + ".ConvertToDataTable: Caught Exception: " + ex);
}
finally
{
if (dAdapter != null)
dAdapter.Dispose();
}
return dTable;
}
When I use a normal CSV it works fine. Do I need to change something in the connString?
Use a dedicated CSV parser.
There are many out there; a popular one is FileHelpers, though there is also one hidden in the Microsoft.VisualBasic.FileIO namespace: TextFieldParser.
Have a look at FileHelpers.
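A minimal sketch of the FileHelpers approach; the record layout below is a made-up example, not your vendor's actual format:
using FileHelpers;

//hypothetical record layout matching a three-column vendor file
[DelimitedRecord(",")]
public class VendorRow
{
    [FieldQuoted('"', QuoteMode.OptionalForBoth)]
    public string Name;

    [FieldQuoted('"', QuoteMode.OptionalForBoth)]
    public string Address; //may contain embedded commas

    [FieldQuoted('"', QuoteMode.OptionalForBoth)]
    public string Phone;
}

//usage
var engine = new FileHelperEngine<VendorRow>();
VendorRow[] rows = engine.ReadFile(@"C:\vendor.csv");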
You can use this code; the Microsoft Text ODBC driver is required:
private void ConvertCSVtoExcel(string filePath = @"E:\nucc_taxonomy_140.csv", string tableName = "TempTaxonomyCodes")
{
    string tempPath = System.IO.Path.GetDirectoryName(filePath);
    string strConn = @"Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + tempPath + @"\;Extensions=asc,csv,tab,txt";
    OdbcConnection conn = new OdbcConnection(strConn);
    OdbcDataAdapter da = new OdbcDataAdapter("Select * from " + System.IO.Path.GetFileName(filePath), conn);
    DataTable dt = new DataTable();
    da.Fill(dt);
    //ConfigurationManager requires a reference to System.Configuration
    using (SqlBulkCopy bulkCopy = new SqlBulkCopy(ConfigurationManager.AppSettings["dbConnectionString"]))
    {
        bulkCopy.DestinationTableName = tableName;
        bulkCopy.BatchSize = 50;
        bulkCopy.WriteToServer(dt);
    }
}
There is a lot to consider when handling CSV files. However you extract them from the file, you should know how you are handling the parsing. There are classes out there that can get you part of the way, but most don't handle the nuances that Excel does with embedded commas, quotes, and line breaks. Loading Excel or the MS classes, though, seems like a lot of overhead if you just want to parse a text file like a CSV.
One thing you can consider is doing the parsing with your own Regex, which will also make your code a little more platform independent, in case you need to port it to another server or application at some point. Using regex has the benefit of being accessible in virtually every language. That said, there are some good regex patterns out there that handle the CSV puzzle. Here is my shot at it, which covers embedded commas, quotes, and line breaks. The regex code/pattern and an explanation are here:
http://www.kimgentes.com/worshiptech-web-tools-page/2008/10/14/regex-pattern-for-parsing-csv-files-with-embedded-commas-dou.html
Hope that is of some help.
Try the code from my answer here:
Reading CSV files in C#
It handles quoted CSV just fine.
private static string[] Mubashir_CSVParser(string s)
{
    // extract the fields: split on commas that are outside quoted sections
    Regex RegexCSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
    String[] Fields = RegexCSVParser.Split(s);

    // clean up the fields (remove " and leading spaces)
    for (int i = 0; i < Fields.Length; i++)
    {
        Fields[i] = Fields[i].TrimStart(' ', '"');
        Fields[i] = Fields[i].TrimEnd('"'); // this removes the trailing quotes
    }
    return Fields;
}
Just in case anyone has a similar issue, I wanted to post the code I used. I did end up using TextFieldParser to read the file and parse out the columns, but I am using recursion and substrings to get the rest done.
/// <summary>
/// Parses each string passed in as a "row".
/// This routine currently accounts for both double quotes
/// and commas, but can be extended.
/// </summary>
/// <param name="row">string or row to be parsed</param>
/// <returns>the column values of the row</returns>
private List<String> ParseRowToList(String row)
{
List<String> returnValue = new List<String>();
if (row[0] == '\"')
{// Quoted String
if (row.IndexOf("\",") > -1)
{// There are more columns
returnValue = ParseRowToList(row.Substring(row.IndexOf("\",") + 2));
returnValue.Insert(0, row.Substring(1, row.IndexOf("\",") - 1));
}
else
{// This is the last column
returnValue.Add(row.Substring(1, row.Length - 2));
}
}
else
{// Unquoted String
if (row.IndexOf(",") > -1)
{// There are more columns
returnValue = ParseRowToList(row.Substring(row.IndexOf(",") + 1));
returnValue.Insert(0, row.Substring(0, row.IndexOf(",")));
}
else
{// This is the last column
returnValue.Add(row.Substring(0, row.Length));
}
}
return returnValue;
}
Then the code for the TextFieldParser is:
// string pathFile = #"C:\TestFTP\TestCatalog.txt";
string pathFile = #"C:\TestFTP\SomeFile.csv";
List<String> stringList = new List<String>();
TextFieldParser fieldParser = null;
DataTable dtable = new DataTable();
/* Set up TextFieldParser
* use the correct delimiter provided
* and path */
fieldParser = new TextFieldParser(pathFile);
/* Set that there are quotes in the file for fields and or column names */
fieldParser.HasFieldsEnclosedInQuotes = true;
/* delimiter by default to be used first */
fieldParser.SetDelimiters(new string[] { "," });
// Build Full table to be imported
dtable = BuildDataTable(fieldParser, dtable);
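BuildDataTable isn't shown in the original post; a plausible sketch of what it does, using the first parsed line as the column headers, might look like this:
//hypothetical sketch; the real BuildDataTable may differ
private DataTable BuildDataTable(TextFieldParser fieldParser, DataTable dtable)
{
    bool isHeader = true;
    while (!fieldParser.EndOfData)
    {
        string[] fields = fieldParser.ReadFields();
        if (isHeader)
        {
            //use the first parsed line as the column names
            foreach (string name in fields)
                dtable.Columns.Add(name);
            isHeader = false;
        }
        else
        {
            //each remaining line becomes one row
            dtable.Rows.Add(fields);
        }
    }
    return dtable;
}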
This is what I used in a project; it parses a single line of data.
private string[] csvParser(string csv, char separator = ',')
{
    List<string> parsed = new List<string>();
    string[] temp = csv.Split(separator);
    int counter = 0;
    string data = string.Empty;
    while (counter < temp.Length)
    {
        data = temp[counter].Trim();
        if (data.StartsWith("\""))
        {
            //a quoted field was split on its embedded separators; rejoin the pieces
            bool isLast = data.EndsWith("\"") && data.Length > 1;
            while (!isLast && counter + 1 < temp.Length)
            {
                counter++;
                data += separator.ToString() + temp[counter];
                isLast = temp[counter].Trim().EndsWith("\"");
            }
        }
        parsed.Add(data);
        counter++;
    }
    return parsed.ToArray();
}
http://zamirsblog.blogspot.com/2013/09/c-csv-parser-csvparser.html
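For example (note this parser keeps the surrounding quotes on rejoined fields):
string[] fields = csvParser("1,\"Smith, John\",42");
// fields[0] == "1"
// fields[1] == "\"Smith, John\""
// fields[2] == "42"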
So I'm looking for an easy way to export data from a SQL Server 2000 database and write it to a comma-delimited text file. It's one table and only about 1,000 rows. I'm new to C#, so please excuse me if this is a stupid question.
This is a very easy task, but you need to learn the SqlClient namespace and the different objects you have at your disposal. Note, though, that for SQL Server 2000 and lower, asynchronous methods are not supported, so they will all be blocking.
Mind you, this is a very sketchy example and I did not test it, but it would be one general approach.
string connectionString = "<yourconnectionstringhere>";
using (SqlConnection connection = new SqlConnection(connectionString)) {
try {
connection.Open();
}
catch (System.Data.SqlClient.SqlException ex) {
// handle
return;
}
string selectCommandText = "SELECT * FROM <yourtable>";
using (SqlDataAdapter adapter = new SqlDataAdapter(selectCommandText, connection)) {
using (DataTable table = new DataTable("<yourtable>")) {
adapter.Fill(table);
StringBuilder commaDelimitedText = new StringBuilder();
commaDelimitedText.AppendLine("col1,col2,col3"); // optional if you want column names in first row
foreach (DataRow row in table.Rows) {
string value = string.Format("{0},{1},{2}", row[0], row[1], row[2]); // how you format is up to you (spaces, tabs, delimiter, etc)
commaDelimitedText.AppendLine(value);
}
File.WriteAllText("<pathhere>", commaDelimitedText.ToString());
}
}
}
Some resources you will want to look into:
SqlConnection Class; http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection(v=VS.90).aspx
SqlDataAdapter Class; http://msdn.microsoft.com/en-us/library/ds404w5w(v=VS.90).aspx
SqlDataAdapter.Fill Method; http://msdn.microsoft.com/en-us/library/905keexk(v=VS.90).aspx
DataTable Class; http://msdn.microsoft.com/en-us/library/system.data.datatable.aspx
I'm also not sure what your requirements are, or why you have to do this task, but there are also probably quite a lot of tools out there that can already do this for you (if this is a one time thing), because this is not an uncommon task.
This is no different in C# than in any other language with the tools you need:
1) Query the DB and store the data in a collection
2) Write the collection to a file in CSV format
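A bare-bones sketch of those two steps, with placeholder connection string, table, and column names:
using System.Collections.Generic;
using System.Data.SqlClient;
using System.IO;

// 1) query the DB and store the data in a collection
var lines = new List<string> { "col1,col2,col3" }; // header row
using (var connection = new SqlConnection("<yourconnectionstringhere>"))
using (var command = new SqlCommand("SELECT col1, col2, col3 FROM mytable", connection))
{
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
            lines.Add(string.Format("{0},{1},{2}", reader[0], reader[1], reader[2]));
    }
}

// 2) write the collection to a file in CSV format
File.WriteAllLines(@"C:\export.csv", lines);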
Are you doing this in a Windows Forms application? If you have bound data to a control such as a DataGridView, this is pretty easy to do. You can just loop through the control and write the file that way. I like this because whatever a user filtered through your application's filtering mechanisms can be written to the file. This is how I have done it before. You should be able to tweak this without much trouble if you are using some sort of collection.
private void exportCsv()
{
    SaveFileDialog saveFile = new SaveFileDialog();
    createSaveDialog(saveFile, ".csv", "CSV (*.csv)|*.csv");
    TextWriter writer = null;
    try
    {
        if (saveFile.FileName != "")
        {
            writer = new StreamWriter(saveFile.FileName);
            int row = dataGridView1.Rows.Count;
            int col = dataGridView1.Columns.Count;
            for (int i = 0; i < col; i++)
            {
                writer.Write(dataGridView1.Columns[i].HeaderText + ",");
            }
            writer.Write("\r\n");
            //row - 1 skips the DataGridView's empty "new row" placeholder
            for (int j = 0; j < row - 1; j++)
            {
                for (int k = 0; k < col; k++)
                {
                    writer.Write(dataGridView1.Rows[j].Cells[k].Value.ToString() + ",");
                }
                writer.Write("\r\n");
            }
            MessageBox.Show("File Successfully Created!", "File Saved", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
        }
    }
    catch
    {
        MessageBox.Show("File could not be created.", "Save Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
    }
    finally
    {
        if (writer != null)
            writer.Close();
        saveFile.Dispose();
    }
}
This has been killing me: I have a massive file that I need to read in as a DataTable.
After a lot of messing about, I am using this:
using (OleDbConnection connection = new OleDbConnection(connString))
{
using (OleDbCommand command = new OleDbCommand(sql, connection))
{
using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
{
dataTable = new DataTable();
dataTable.Locale = CultureInfo.CurrentCulture;
adapter.Fill(dataTable);
}
}
}
which works if the text file is comma separated, but does not work if it is tab delimited. Can anyone please help?
My connection string looks like:
string connString = #"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathOnly + #";Extended Properties='text;HDR=YES'";
I've tried to set the FMT property, with no luck.
Here is a good library; use it:
http://www.codeproject.com/KB/database/CsvReader.aspx
Here is code which uses the library:
TextReader tr = new StreamReader(HttpContext.Current.Server.MapPath(Filename));
string data = tr.ReadToEnd();
tr.Close();
For comma delimited:
CachedCsvReader cr = new CachedCsvReader(new StringReader(data), true);
For tab delimited:
CachedCsvReader cr = new CachedCsvReader(new StringReader(data), true, '\t');
And here you can load it into a DataTable with this code:
DataTable dt = new DataTable();
dt.Load(cr);
Hope you find it helpful. Thanks.
Manually: you can use the String.Split() method to split your entire file. Here is an example of what I use in my code. In this example, I read the data line by line, split it, and place the data directly into columns.
//Open and read the file
System.IO.FileStream fs = new System.IO.FileStream("myfilename", System.IO.FileMode.Open);
System.IO.StreamReader sr = new System.IO.StreamReader(fs);
string line = "";
line = sr.ReadLine(); //read (and skip) the header line
string[] colVal;
try
{
//clear of previous data
//myDataTable.Clear();
//for each reccord insert it into a row
while (!sr.EndOfStream)
{
line = sr.ReadLine();
colVal = line.Split('\t');
DataRow dataRow = myDataTable.NewRow();
//associate values with the columns
dataRow["col1"] = colVal[0];
dataRow["col2"] = colVal[1];
dataRow["col3"] = colVal[2];
//add the row to the table
myDataTable.Rows.Add(dataRow);
}
//close the stream
sr.Close();
//binds the dataset tothe grid view.
BindingSource bs = new BindingSource();
bs.DataSource = myDataSet;
bs.DataMember = myDataTable.TableName;
myGridView.DataSource = bs;
}
catch (Exception ex)
{
    //handle or log the error; a try block needs a catch or finally
    Console.WriteLine(ex.Message);
}
You could modify it with some loops over the columns if you have many and they are numbered. Also, I recommend checking the integrity first, by checking that the number of columns read is correct, as sketched below.
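A possible shape for that check, placed inside the while loop right after the split (assuming the table's columns are already defined):
colVal = line.Split('\t');
if (colVal.Length != myDataTable.Columns.Count)
{
    //skip (or log) malformed lines instead of failing on a missing index
    continue;
}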
This should work (from http://www.hotblue.com/article0000.aspx?a=0006).
To make it tab delimited, just replace the comma-matching part with:
if ((postdata || !quoted) && (c == ',' || c == '\t'))
using System.Collections;
using System.Data;
using System.IO;
using System.Text.RegularExpressions;
public DataTable ParseCSV(string inputString) {
DataTable dt=new DataTable();
// declare the Regular Expression that will match versus the input string
Regex re=new Regex("((?<field>[^\",\\r\\n]+)|\"(?<field>([^\"]|\"\")+)\")(,|(?<rowbreak>\\r\\n|\\n|$))");
ArrayList colArray=new ArrayList();
ArrayList rowArray=new ArrayList();
int colCount=0;
int maxColCount=0;
string rowbreak="";
string field="";
MatchCollection mc=re.Matches(inputString);
foreach(Match m in mc) {
// retrieve the field and replace two double-quotes with a single double-quote
field=m.Result("${field}").Replace("\"\"","\"");
rowbreak=m.Result("${rowbreak}");
if (field.Length > 0) {
colArray.Add(field);
colCount++;
}
if (rowbreak.Length > 0) {
// add the column array to the row Array List
rowArray.Add(colArray.ToArray());
// create a new Array List to hold the field values
colArray=new ArrayList();
if (colCount > maxColCount)
maxColCount=colCount;
colCount=0;
}
}
if (rowbreak.Length == 0) {
// this is executed when the last line doesn't
// end with a line break
rowArray.Add(colArray.ToArray());
if (colCount > maxColCount)
maxColCount=colCount;
}
// create the columns for the table
for(int i=0; i < maxColCount; i++)
dt.Columns.Add(String.Format("col{0:000}",i));
// convert the row Array List into an Array object for easier access
Array ra=rowArray.ToArray();
for(int i=0; i < ra.Length; i++) {
// create a new DataRow
DataRow dr=dt.NewRow();
// convert the column Array List into an Array object for easier access
Array ca=(Array)(ra.GetValue(i));
// add each field into the new DataRow
for(int j=0; j < ca.Length; j++)
dr[j]=ca.GetValue(j);
// add the new DataRow to the DataTable
dt.Rows.Add(dr);
}
// in case no data was parsed, create a single column
if (dt.Columns.Count == 0)
dt.Columns.Add("NoData");
return dt;
}
Now that we have a parser for converting a string into a DataTable, all we need is a function that will read the content from a CSV file and pass it to our ParseCSV function:
public DataTable ParseCSVFile(string path) {
string inputString="";
// check that the file exists before opening it
if (File.Exists(path)) {
StreamReader sr = new StreamReader(path);
inputString = sr.ReadToEnd();
sr.Close();
}
return ParseCSV(inputString);
}
And now you can easily fill a DataGrid with the data coming from the CSV file:
protected System.Web.UI.WebControls.DataGrid DataGrid1;
private void Page_Load(object sender, System.EventArgs e) {
// call the parser
DataTable dt=ParseCSVFile(Server.MapPath("./demo.csv"));
// bind the resulting DataTable to a DataGrid Web Control
DataGrid1.DataSource=dt;
DataGrid1.DataBind();
}