I need to download an Excel file (.xlsx) and save it in .CSV format from a .NET Core console application.
Since the Microsoft.Interop packages are not compatible with .NET Core 3.1, what other approach can I use to save an Excel file as .CSV?
I'd appreciate any suggestions.
This is a combination of multiple existing answers on SO.
The first step, taken from another answer, converts the xlsx to a DataTable using ClosedXML:
using System;
using System.Data;
using ClosedXML.Excel;
...
public static DataTable GetDataFromExcel(string path, dynamic worksheet)
{
//Create a new DataTable to hold the worksheet contents.
DataTable dt = new DataTable();
//Open the Excel file using ClosedXML.
using (XLWorkbook workBook = new XLWorkbook(path))
{
//Read the requested sheet; worksheet can be a sheet name or a 1-based position.
IXLWorksheet workSheet = workBook.Worksheet(worksheet);
//Loop through the Worksheet rows.
bool firstRow = true;
foreach (IXLRow row in workSheet.Rows())
{
//Use the first row to add columns to DataTable.
if (firstRow)
{
foreach (IXLCell cell in row.Cells())
{
if (!string.IsNullOrEmpty(cell.Value.ToString()))
{
dt.Columns.Add(cell.Value.ToString());
}
else
{
break;
}
}
firstRow = false;
}
else
{
int i = 0;
DataRow toInsert = dt.NewRow();
foreach (IXLCell cell in row.Cells(1, dt.Columns.Count))
{
try
{
toInsert[i] = cell.Value.ToString();
}
catch (Exception ex)
{
//Handle this, or don't.
}
i++;
}
dt.Rows.Add(toInsert);
}
}
return dt;
}
}
If you need to do any data transformations, do it while the data is in a DataTable.
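As a rough illustration of that step, the sketch below trims string values and drops rows that are entirely blank before the table is exported. The names DataTableCleanup and TrimAndDropBlankRows are made up for this example, and it assumes the table was filled with strings, as in the method above.

```csharp
using System;
using System.Data;

public static class DataTableCleanup
{
    // Hypothetical helper (not part of ClosedXML or CsvHelper):
    // trims every string value and removes rows whose values are all blank.
    public static void TrimAndDropBlankRows(DataTable table)
    {
        // Walk backwards so removing a row does not shift the rows still to visit.
        for (int r = table.Rows.Count - 1; r >= 0; r--)
        {
            DataRow row = table.Rows[r];
            bool allBlank = true;

            foreach (DataColumn column in table.Columns)
            {
                object value = row[column];
                if (value is string text)
                {
                    string trimmed = text.Trim();
                    row[column] = trimmed;
                    if (trimmed.Length > 0) allBlank = false;
                }
                else if (value != DBNull.Value)
                {
                    // A non-string, non-null value counts as data.
                    allBlank = false;
                }
            }

            if (allBlank) table.Rows.RemoveAt(r);
        }
    }
}
```

Calling DataTableCleanup.TrimAndDropBlankRows(dt) between the import and the export keeps trailing empty spreadsheet rows out of the CSV.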
Then use CsvHelper to export the DataTable as a CSV. (The SO answer I found did not pass a CultureInfo, which the library has required since a few updates ago.)
using System.Data;
using System.Globalization;
using System.IO;
using CsvHelper;
....
public static void SaveCSV(DataTable records)
{
string newFile = @"C:\somePath.csv";
using (StreamWriter writer = new StreamWriter(newFile))
{
using (CsvWriter csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
//add headers
foreach (DataColumn dc in records.Columns)
{
csv.WriteField(dc.ColumnName);
}
csv.NextRecord();
foreach(DataRow dr in records.Rows)
{
for (int i = 0; i< records.Columns.Count; i++)
{
csv.WriteField(dr[i]);
}
csv.NextRecord();
}
}
}
}
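If adding CsvHelper is not an option, the same export can be approximated with the base class library alone. The following is a minimal sketch, not a replacement for a real CSV library: PlainCsvWriter is a made-up name, and the quoting covers only the common rules (quote a field that contains a comma, a quote, or a line break, and double any embedded quotes).

```csharp
using System;
using System.Data;
using System.Globalization;
using System.IO;
using System.Linq;

public static class PlainCsvWriter
{
    // Quote a field only when it contains a comma, a quote, or a line break;
    // embedded quotes are doubled.
    private static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Hypothetical helper: writes the header row and then every data row.
    public static void Write(DataTable table, TextWriter writer)
    {
        writer.WriteLine(string.Join(",",
            table.Columns.Cast<DataColumn>().Select(c => Escape(c.ColumnName))));

        foreach (DataRow row in table.Rows)
        {
            // DBNull converts to an empty string; InvariantCulture keeps
            // numbers and dates independent of the machine's locale.
            writer.WriteLine(string.Join(",",
                row.ItemArray.Select(v =>
                    Escape(Convert.ToString(v, CultureInfo.InvariantCulture) ?? string.Empty))));
        }
    }
}
```

It takes a TextWriter rather than a path, so it can be used as: using (var writer = new StreamWriter(newFile)) { PlainCsvWriter.Write(records, writer); }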
I want to load an Excel file into my DataGrid, using ClosedXML.
I have this method:
public static DataTable ImportExceltoDataTable(string filePath, string sheetName) {
using (XLWorkbook wb = new(filePath)) {
IXLWorksheet ws = wb.Worksheet(1);
DataTable dt = new();
bool firstRow = true;
foreach (IXLRow row in ws.Rows()) {
if (firstRow) {
foreach (IXLCell cell in row.Cells()) {
dt.Columns.Add(cell.CachedValue.ToString());
}
firstRow = false;
} else {
dt.Rows.Add();
int i = 0;
foreach (IXLCell cell in row.Cells(row.FirstCellUsed().Address.ColumnNumber, row.LastCellUsed().Address.ColumnNumber)) {
dt.Rows[dt.Rows.Count - 1][i] = cell.CachedValue.ToString();
i++;
}
}
}
return dt;
}
}
And on a click-event, I am trying to pick my file using OpenFileDialog, see below:
OpenFileDialog of = new();
of.Filter = "Excel Files | *.xlsx;";
of.Title = "Import Excel file.";
if (of.ShowDialog()==true) {
dataGrid.ItemsSource = ImportExceltoDataTable("...", "...").DefaultView;
}
But I do not know how to pass the file I picked in the OpenFileDialog to the method that builds the DataTable.
At the first line of that method, I get the following exception:
System.ArgumentException: 'Empty extension is not supported'
Which makes sense... How can I tell it what file I've picked?
You may want to rethink your approach to reading the Excel file. One possible issue is the if (firstRow) { ... } branch, which makes a dangerous assumption: that every column of data has a header cell. In other words, the number of columns added to the DataTable is determined by the number of cells with text found on the first row. What if a column of data does not have a header cell?
If any row has data to the right of the last header cell in the first row, the DataTable will not have enough columns, and when the code reaches those cells in the else branch it will most likely crash as i goes beyond dt's column count.
The code needs to guarantee that dt has the correct number of columns to avoid this problem. One way to do that is to find the two cells that bound the data: the top-left cell where the data begins (which is not necessarily the first cell in the worksheet) and the bottom-right cell where the last cell with data is located.
Once we have these two cells, we can determine how many columns the DataTable needs, and we can almost guarantee that all the data in the worksheet will fit in it.
Below is one possible solution using the ideas described above. The click handler passes the path picked in the dialog (of.FileName) to the import method, which is how the method learns which file was chosen. Note that the code does not use a particular worksheet name; it simply uses the first worksheet in the given workbook.
private void Button_Click(object sender, RoutedEventArgs e) {
OpenFileDialog of = new OpenFileDialog();
of.Filter = "Excel Files | *.xlsx;";
of.Title = "Import Excel file.";
if (of.ShowDialog() == true) {
dataGrid.ItemsSource = ImportExceltoDataTable(of.FileName).DefaultView;
}
}
public static DataTable ImportExceltoDataTable(string filePath) {
using (XLWorkbook wb = new XLWorkbook(filePath)) {
IXLWorksheet ws = wb.Worksheet(1);
int tl_Row = ws.FirstCellUsed().Address.RowNumber;
int tl_Col = ws.FirstCellUsed().Address.ColumnNumber;
int br_Row = ws.LastCellUsed().Address.RowNumber;
int br_Col = ws.LastCellUsed().Address.ColumnNumber;
DataTable dt = new DataTable();
// add dt columns using the first row of data
for (int i = tl_Col; i <= br_Col; i++) {
dt.Columns.Add(ws.Cell(tl_Row, i).CachedValue.ToString());
}
IXLRow currentRow;
// add data from the worksheet to dt - we already used the first row of data for the columns
for (int dtRow = 0; dtRow < br_Row - tl_Row; dtRow++) {
currentRow = ws.Row(tl_Row + dtRow + 1);
dt.Rows.Add();
for (int dtCol = 0; dtCol < br_Col - tl_Col + 1; dtCol++) {
dt.Rows[dtRow][dtCol] = currentRow.Cell(tl_Col + dtCol).CachedValue;
}
}
return dt;
}
}
I hope this makes sense and helps.
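Two edge cases are worth guarding against with this bounding-cell approach: a worksheet with no data at all (FirstCellUsed() then returns null) and header cells that are blank or repeated (DataTable rejects duplicate column names). The sketch below is one way to handle both, using ClosedXML's RangeUsed() to get the bounding range in a single call. The class and method names are hypothetical, and the fallback column naming is just an assumption about what would be acceptable.

```csharp
using System.Data;
using ClosedXML.Excel;

public static class ExcelImport
{
    // Hypothetical variant of ImportExceltoDataTable: tolerates an empty sheet
    // and blank or repeated header cells.
    public static DataTable ImportUsedRange(string filePath)
    {
        DataTable dt = new DataTable();
        using (XLWorkbook wb = new XLWorkbook(filePath))
        {
            // RangeUsed() spans the top-left to bottom-right used cells,
            // and is null when the sheet has no data.
            IXLRange used = wb.Worksheet(1).RangeUsed();
            if (used == null) return dt;

            int columnCount = used.ColumnCount();
            int rowCount = used.RowCount();

            // First row of the range supplies the column names.
            for (int c = 1; c <= columnCount; c++)
            {
                string name = used.Row(1).Cell(c).CachedValue.ToString();
                if (string.IsNullOrWhiteSpace(name)) name = "Column" + c;
                // Make the name unique if an earlier column already uses it.
                while (dt.Columns.Contains(name)) name += "_" + c;
                dt.Columns.Add(name);
            }

            // Remaining rows are data; cell indexes are relative to the range.
            for (int r = 2; r <= rowCount; r++)
            {
                DataRow row = dt.NewRow();
                for (int c = 1; c <= columnCount; c++)
                {
                    row[c - 1] = used.Row(r).Cell(c).CachedValue.ToString();
                }
                dt.Rows.Add(row);
            }
        }
        return dt;
    }
}
```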
I would like to read the contents of an Excel worksheet into a C# DataTable. The Excel worksheet could have a variable number of columns and rows. The first row in the Excel worksheet will always contain the column names, but other rows may be blank.
All of the suggestions I have seen here on SO assume the presence of Microsoft.ACE.OLEDB. I do not seem to have this library installed on my system, because when I try some of these solutions I get this error:
The 'Microsoft.ACE.OLEDB.12.0' provider is not registered on the local machine.
Strange considering I have Office 2016 installed.
For this reason I was hoping to use the ClosedXML library via NuGet, but I do not see any examples in their wiki of reading an Excel worksheet into a DataTable in C#.
This example is not mine. I cannot remember where I got it from, as it was in my archives. However, it works for me. The only issue I ran into was with blank cells. According to a discussion on the ClosedXML GitHub wiki page, it has something to do with Excel not tracking empty cells that are not bounded by data. I found that if I added data to the cells and then removed the same data, the process worked.
public static DataTable ImportExceltoDatatable(string filePath, string sheetName)
{
// Open the Excel file using ClosedXML.
// Keep in mind the Excel file cannot be open when trying to read it
using (XLWorkbook workBook = new XLWorkbook(filePath))
{
//Read the first Sheet from Excel file.
IXLWorksheet workSheet = workBook.Worksheet(1);
//Create a new DataTable.
DataTable dt = new DataTable();
//Loop through the Worksheet rows.
bool firstRow = true;
foreach (IXLRow row in workSheet.Rows())
{
//Use the first row to add columns to DataTable.
if (firstRow)
{
foreach (IXLCell cell in row.Cells())
{
dt.Columns.Add(cell.Value.ToString());
}
firstRow = false;
}
else
{
//Add rows to DataTable.
dt.Rows.Add();
int i = 0;
foreach (IXLCell cell in row.Cells(row.FirstCellUsed().Address.ColumnNumber, row.LastCellUsed().Address.ColumnNumber))
{
dt.Rows[dt.Rows.Count - 1][i] = cell.Value.ToString();
i++;
}
}
}
return dt;
}
}
You need to add:
using System.Data;
using ClosedXML.Excel;
as well as the ClosedXML NuGet package.
For columns that hold date/time values, something like this could be helpful:
if (cell.Address.ColumnLetter=="J") // Column with date datatype
{
DateTime dtime = DateTime.FromOADate(double.Parse(cell.Value.ToString()));
dt.Rows[dt.Rows.Count - 1][i] = dtime;
}
else
{
dt.Rows[dt.Rows.Count - 1][i] = cell.Value.ToString();
}
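The snippet above relies on double.Parse, which throws on text that is not a number and depends on the current culture's decimal separator. A slightly more defensive version is sketched below. TryParseOADate is a made-up helper name, and it assumes the cell text really is an OLE Automation date serial (whether it is depends on the ClosedXML version and the cell's data type).

```csharp
using System;
using System.Globalization;

public static class OADateParsing
{
    // Hypothetical helper: interprets raw cell text as an OLE Automation date serial.
    // Returns false instead of throwing when the text is not a usable serial.
    public static bool TryParseOADate(string raw, out DateTime result)
    {
        result = default(DateTime);

        double serial;
        if (!double.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out serial))
        {
            return false; // not a number at all
        }

        try
        {
            result = DateTime.FromOADate(serial);
            return true;
        }
        catch (ArgumentException)
        {
            return false; // number is outside the range FromOADate accepts
        }
    }
}
```

For example, TryParseOADate("44197", out var d) succeeds and gives 1 January 2021, while a text cell such as "n/a" simply returns false so the caller can fall back to storing the string.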
With this code you can read the contents of an Excel sheet. You can specify the sheet by name or by number, and a DataTable is returned with the contents of the sheet.
public static DataTable GetDataFromExcel(string path, dynamic worksheet)
{
//Create a new DataTable to hold the worksheet contents.
DataTable dt = new DataTable();
//Open the Excel file using ClosedXML.
using (XLWorkbook workBook = new XLWorkbook(path))
{
//Read the requested sheet; worksheet can be a sheet name or a 1-based position.
IXLWorksheet workSheet = workBook.Worksheet(worksheet);
//Loop through the Worksheet rows.
bool firstRow = true;
foreach (IXLRow row in workSheet.Rows())
{
//Use the first row to add columns to DataTable.
if (firstRow)
{
foreach (IXLCell cell in row.Cells())
{
if (!string.IsNullOrEmpty(cell.Value.ToString()))
{
dt.Columns.Add(cell.Value.ToString());
}
else
{
break;
}
}
firstRow = false;
}
else
{
int i = 0;
DataRow toInsert = dt.NewRow();
foreach (IXLCell cell in row.Cells(1, dt.Columns.Count))
{
try
{
toInsert[i] = cell.Value.ToString();
}
catch (Exception ex)
{
}
i++;
}
dt.Rows.Add(toInsert);
}
}
return dt;
}
}
I have already added the EPPlus library to my solution. I just can't seem to figure out how to get my Excel data into a DataTable that will allow my bulk copy to work. The code below doesn't work. Can anyone help me massage this into place? Thank you in advance for your assistance. I have edited this after comments from 'mason' below.
try
{
//// open file
var excel = Request.Files[0];
var file = Path.Combine(Server.MapPath("~/Uploads/"), excel.FileName);
var sqlConnectionString = ConfigurationManager.ConnectionStrings["MyDB"].ToString();
// Get the datatable from procedure on Utility.cs page
var datapush = Utility.ImportToDataTable(file, "Sheet1");
// open connection to sql and use bulk copy to write excelData to my table
using (var destinationConnection = new SqlConnection(sqlConnectionString))
{
destinationConnection.Open();
using (var bulkCopy = new SqlBulkCopy(destinationConnection))
{
bulkCopy.DestinationTableName = "MYTABLE";
bulkCopy.ColumnMappings.Add("CODE", "code");
bulkCopy.ColumnMappings.Add("TITLE", "title");
bulkCopy.ColumnMappings.Add("LAST_NAME", "last_name");
bulkCopy.ColumnMappings.Add("FIRST_NAME", "first_name");
bulkCopy.WriteToServer(datapush);
}
}
}
and here is the code on the Utility.cs page based on Mason's suggested link:
public class Utility
{
public static DataTable ImportToDataTable(string FilePath, string SheetName)
{
DataTable dt = new DataTable();
FileInfo fi = new FileInfo(FilePath);
// Check if the file exists
if (!fi.Exists)
throw new Exception("File " + FilePath + " does not exist");
using (ExcelPackage xlPackage = new ExcelPackage(fi))
{
// get the first worksheet in the workbook
ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets[SheetName];
// Fetch the WorkSheet size
ExcelCellAddress startCell = worksheet.Dimension.Start;
ExcelCellAddress endCell = worksheet.Dimension.End;
// create all the needed DataColumn
for (int col = startCell.Column; col <= endCell.Column; col++)
dt.Columns.Add(col.ToString());
// place all the data into DataTable
for (int row = startCell.Row; row <= endCell.Row; row++)
{
DataRow dr = dt.NewRow();
int x = 0;
for (int col = startCell.Column; col <= endCell.Column; col++)
{
dr[x++] = worksheet.Cells[row, col].Value;
}
dt.Rows.Add(dr);
}
}
return dt;
}
}
Currently, when I run the code and step through it with F11, the failure is in Utility.cs. Right after "// get the first worksheet in the workbook", the line
ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets[SheetName];
returns null, and the next line of code
ExcelCellAddress startCell = worksheet.Dimension.Start;
stops everything and throws the following error: "Object reference not set to an instance of an object."
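No resolution is recorded for this one. One plausible cause, offered as an assumption and not as a confirmed diagnosis, is that the uploaded workbook has no worksheet named exactly "Sheet1": the EPPlus Worksheets indexer returns null for a name it cannot find, which matches the symptom. A separate trap is that Dimension is itself null when a worksheet contains no cells. The sketch below checks for both and reports the sheet names that do exist; WorksheetLookup and GetWorksheetOrThrow are hypothetical names.

```csharp
using System;
using System.Linq;
using OfficeOpenXml;

public static class WorksheetLookup
{
    // Hypothetical helper: returns the named worksheet or fails with a message
    // that says what went wrong, instead of a NullReferenceException later on.
    public static ExcelWorksheet GetWorksheetOrThrow(ExcelPackage package, string sheetName)
    {
        ExcelWorksheet worksheet = package.Workbook.Worksheets[sheetName];
        if (worksheet == null)
        {
            // List the names that are actually present to make typos obvious.
            string available = string.Join(", ",
                package.Workbook.Worksheets.Select(ws => ws.Name));
            throw new ArgumentException(
                "Worksheet '" + sheetName + "' not found. Available: " + available);
        }

        if (worksheet.Dimension == null)
        {
            // An empty sheet has no used range, so Start/End cannot be read.
            throw new InvalidOperationException(
                "Worksheet '" + sheetName + "' contains no data.");
        }

        return worksheet;
    }
}
```

In ImportToDataTable this would replace the direct indexer call, so the error message names the missing sheet before worksheet.Dimension.Start is ever touched.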
My program can export some data and a DataTable to an Excel file (a template).
In the template I insert the data into some placeholders. That works very well, but I need to insert a DataTable too...
My sample code:
using (Stream OutStream = new MemoryStream())
{
// read template
using (var fileStream = File.OpenRead(templatePath))
fileStream.CopyTo(OutStream);
// exporting
Exporting(OutStream);
// to start
OutStream.Seek(0L, SeekOrigin.Begin);
// out
using (var resultFile = File.Create(resultPath))
OutStream.CopyTo(resultFile);
}
And this is the method that does the exporting:
private void Exporting(Stream template)
{
using (var workbook = SpreadsheetDocument.Open(template, true, new OpenSettings { AutoSave = true }))
{
// Replace shared strings
SharedStringTablePart sharedStringsPart = workbook.WorkbookPart.SharedStringTablePart;
IEnumerable<Text> sharedStringTextElements = sharedStringsPart.SharedStringTable.Descendants<Text>();
DoReplace(sharedStringTextElements);
// Replace inline strings
IEnumerable<WorksheetPart> worksheetParts = workbook.GetPartsOfType<WorksheetPart>();
foreach (var worksheet in worksheetParts)
{
DoReplace(worksheet.Worksheet.Descendants<Text>());
}
int z = 40;
foreach (System.Data.DataRow row in ExcelWorkXLSX.ToOut.Rows)
{
for (int i = 0; i < row.ItemArray.Count(); i++)
{
ExcelWorkXLSX.InsertText(workbook, row.ItemArray.ElementAt(i).ToString(), getColumnName(i), Convert.ToUInt32(z));
}
z++;
}
}
}
}
But the fragment that writes out the DataTable is very, very slow...
How can I export a DataTable to Excel quickly and correctly?
I wrote this quick example. It works for me. I only tested it with one dataset with one table inside, but I guess that may be enough for you.
Take into consideration that I treated all cells as String (not even SharedStrings). If you want to use SharedStrings you might need to tweak my sample a bit.
Edit: To make this work, it is necessary to add WindowsBase and DocumentFormat.OpenXml references to the project.
Enjoy,
private void ExportDataSet(DataSet ds, string destination)
{
using (var workbook = SpreadsheetDocument.Create(destination, DocumentFormat.OpenXml.SpreadsheetDocumentType.Workbook))
{
var workbookPart = workbook.AddWorkbookPart();
workbook.WorkbookPart.Workbook = new DocumentFormat.OpenXml.Spreadsheet.Workbook();
workbook.WorkbookPart.Workbook.Sheets = new DocumentFormat.OpenXml.Spreadsheet.Sheets();
foreach (System.Data.DataTable table in ds.Tables) {
var sheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();
var sheetData = new DocumentFormat.OpenXml.Spreadsheet.SheetData();
sheetPart.Worksheet = new DocumentFormat.OpenXml.Spreadsheet.Worksheet(sheetData);
DocumentFormat.OpenXml.Spreadsheet.Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<DocumentFormat.OpenXml.Spreadsheet.Sheets>();
string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);
uint sheetId = 1;
if (sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Count() > 0)
{
sheetId =
sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Select(s => s.SheetId.Value).Max() + 1;
}
DocumentFormat.OpenXml.Spreadsheet.Sheet sheet = new DocumentFormat.OpenXml.Spreadsheet.Sheet() { Id = relationshipId, SheetId = sheetId, Name = table.TableName };
sheets.Append(sheet);
DocumentFormat.OpenXml.Spreadsheet.Row headerRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
List<String> columns = new List<string>();
foreach (System.Data.DataColumn column in table.Columns) {
columns.Add(column.ColumnName);
DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(column.ColumnName);
headerRow.AppendChild(cell);
}
sheetData.AppendChild(headerRow);
foreach (System.Data.DataRow dsrow in table.Rows)
{
DocumentFormat.OpenXml.Spreadsheet.Row newRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
foreach (String col in columns)
{
DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(dsrow[col].ToString()); //
newRow.AppendChild(cell);
}
sheetData.AppendChild(newRow);
}
}
}
}
eburgos, I've modified your code slightly because when you have multiple datatables in your dataset it was just overwriting them in the spreadsheet so you were only left with one sheet in the workbook. I basically just moved the part where the workbook is created out of the loop. Here is the updated code.
private void ExportDSToExcel(DataSet ds, string destination)
{
using (var workbook = SpreadsheetDocument.Create(destination, DocumentFormat.OpenXml.SpreadsheetDocumentType.Workbook))
{
var workbookPart = workbook.AddWorkbookPart();
workbook.WorkbookPart.Workbook = new DocumentFormat.OpenXml.Spreadsheet.Workbook();
workbook.WorkbookPart.Workbook.Sheets = new DocumentFormat.OpenXml.Spreadsheet.Sheets();
uint sheetId = 1;
foreach (DataTable table in ds.Tables)
{
var sheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();
var sheetData = new DocumentFormat.OpenXml.Spreadsheet.SheetData();
sheetPart.Worksheet = new DocumentFormat.OpenXml.Spreadsheet.Worksheet(sheetData);
DocumentFormat.OpenXml.Spreadsheet.Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<DocumentFormat.OpenXml.Spreadsheet.Sheets>();
string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);
if (sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Count() > 0)
{
sheetId =
sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Select(s => s.SheetId.Value).Max() + 1;
}
DocumentFormat.OpenXml.Spreadsheet.Sheet sheet = new DocumentFormat.OpenXml.Spreadsheet.Sheet() { Id = relationshipId, SheetId = sheetId, Name = table.TableName };
sheets.Append(sheet);
DocumentFormat.OpenXml.Spreadsheet.Row headerRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
List<String> columns = new List<string>();
foreach (DataColumn column in table.Columns)
{
columns.Add(column.ColumnName);
DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(column.ColumnName);
headerRow.AppendChild(cell);
}
sheetData.AppendChild(headerRow);
foreach (DataRow dsrow in table.Rows)
{
DocumentFormat.OpenXml.Spreadsheet.Row newRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
foreach (String col in columns)
{
DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(dsrow[col].ToString()); //
newRow.AppendChild(cell);
}
sheetData.AppendChild(newRow);
}
}
}
}
I also wrote a C#/VB.Net "Export to Excel" library, which uses OpenXML and (more importantly) also uses OpenXmlWriter, so you won't run out of memory when writing large files.
Full source code, and a demo, can be downloaded here:
Export to Excel
It's dead easy to use.
Just pass it the filename you want to write to, and a DataTable, DataSet or List<>.
CreateExcelFile.CreateExcelDocument(myDataSet, "MyFilename.xlsx");
And if you're calling it from an ASP.Net application, pass it the HttpResponse to write the file out to.
CreateExcelFile.CreateExcelDocument(myDataSet, "MyFilename.xlsx", Response);
I wrote my own export to Excel writer because nothing else quite met my needs. It is fast and allows for substantial formatting of the cells. You can review it at
https://openxmlexporttoexcel.codeplex.com/
I hope it helps.
You could try taking a look at this library. I've used it for one of my projects and found it very easy to work with, reliable and fast (I only used it for exporting data).
http://epplus.codeplex.com/
You can have a look at my library here. Under the documentation section, you will find how to import a data table.
You just have to write
using (var doc = new SpreadsheetDocument(@"C:\OpenXmlPackaging.xlsx")) {
Worksheet sheet1 = doc.Worksheets.Add("My Sheet");
sheet1.ImportDataTable(ds.Tables[0], "A1", true);
}
Hope it helps!
I tried the accepted answer and got a message saying the generated Excel file was corrupted when I tried to open it. I was able to fix it with a few modifications, such as adding the line below at the end of the code.
workbookPart.Workbook.Save();
I have posted the full code at: Export DataTable to Excel with Open XML in c#
I wanted to add this answer because I used the primary answer from this question as my basis for exporting from a datatable to Excel using OpenXML but then transitioned to OpenXMLWriter when I found it to be much faster than the above method.
You can find the full details in my answer in the link below. My code is in VB.NET though, so you'll have to convert it.
How to export DataTable to Excel
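For readers who want a C# starting point for the OpenXmlWriter approach mentioned in the answers above, here is a minimal sketch. It is not the code from any of the linked answers or libraries. StreamingExcelExport and its parameters are made-up names, every value is written as an inline string, and the caller is assumed to pass a valid sheet name. The idea is that OpenXmlWriter emits the worksheet XML forward-only, row by row, instead of building the whole sheet as an in-memory element tree first.

```csharp
using System;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class StreamingExcelExport
{
    // Hypothetical helper: writes one DataTable to one worksheet using OpenXmlWriter.
    public static void Export(System.Data.DataTable table, string destination, string sheetName)
    {
        using (SpreadsheetDocument document =
            SpreadsheetDocument.Create(destination, SpreadsheetDocumentType.Workbook))
        {
            WorkbookPart workbookPart = document.AddWorkbookPart();
            WorksheetPart worksheetPart = workbookPart.AddNewPart<WorksheetPart>();

            using (OpenXmlWriter writer = OpenXmlWriter.Create(worksheetPart))
            {
                writer.WriteStartElement(new Worksheet());
                writer.WriteStartElement(new SheetData());

                // Header row from the column names.
                writer.WriteStartElement(new Row());
                foreach (System.Data.DataColumn column in table.Columns)
                {
                    writer.WriteElement(TextCell(column.ColumnName));
                }
                writer.WriteEndElement(); // Row

                // One spreadsheet row per DataRow, written as it is visited.
                foreach (System.Data.DataRow dataRow in table.Rows)
                {
                    writer.WriteStartElement(new Row());
                    foreach (object value in dataRow.ItemArray)
                    {
                        writer.WriteElement(TextCell(Convert.ToString(value)));
                    }
                    writer.WriteEndElement(); // Row
                }

                writer.WriteEndElement(); // SheetData
                writer.WriteEndElement(); // Worksheet
            }

            // Register the sheet in the workbook and save the workbook part explicitly.
            workbookPart.Workbook = new Workbook(
                new Sheets(
                    new Sheet
                    {
                        Id = workbookPart.GetIdOfPart(worksheetPart),
                        SheetId = 1U,
                        Name = sheetName
                    }));
            workbookPart.Workbook.Save();
        }
    }

    // Builds a cell that stores its text inline (no shared string table).
    private static Cell TextCell(string text)
    {
        return new Cell
        {
            DataType = CellValues.InlineString,
            InlineString = new InlineString(new Text(text ?? string.Empty))
        };
    }
}
```

Inline strings keep the sketch short. For files with many repeated values a shared string table would produce a smaller file, at the cost of more bookkeeping.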