How to update Excel Cell based on validation using ClosedXML? - c#

I have an input Excel file that contains 6 columns and n rows. Three of the columns are mandatory. If any mandatory column is empty, I need to write some custom text into the Remarks column. The column names in the input file are fixed, but their positions are not.
For example, in the table below I have set the Remarks value of rows 2 and 3 to Fail, since the CCode and ID values are blank.
Name  | ID | ComapanyName | CCode | Address  | Remarks
Anto  | 12 | ABC Corp Cmp | ABCCo | AvenueSt |
Anuj  | 13 | XYZ Corp Cmp |       | AvenueSt | Fail
Kathy |    | CTF Corp Cmp | CTFCo | AvenueSt | Fail
ClosedXML logic:
var workbook = new XLWorkbook(IPPath);
var rows = workbook.Worksheet(1).RangeUsed().RowsUsed().Skip(1);
foreach (var row in rows)
{
// update logic
}

First read the Excel data into a DataTable; you can then update the Excel data by iterating over each record.
private static System.Data.DataTable ReadExcelData(string filePath)
{
try
{
System.Data.DataTable dataTable = new System.Data.DataTable();
using (Stream stream = new FileStream(filePath, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite))
{
using (XLWorkbook workBook = new XLWorkbook(stream))
{
var workSheet = workBook.Worksheet(1);
dataTable.TableName = workSheet.Name;
int lastRowIndex = workSheet.LastRowUsed().RowNumber();
int lastColumnIndex = workSheet.LastColumnUsed().ColumnNumber();
bool header = false;
foreach (IXLRow row in workSheet.Rows(1, lastRowIndex))
{
if (!header)
{
foreach (IXLCell cell in row.Cells(1, lastColumnIndex))
{
dataTable.Columns.Add(cell.GetFormattedString());
}
header = true;
}
dataTable.Rows.Add();
int i = 0;
foreach (IXLCell cell in row.Cells(1, lastColumnIndex))
{
dataTable.Rows[dataTable.Rows.Count - 1][i] = cell.GetFormattedString();
i++;
}
}
dataTable.Rows.RemoveAt(0);
}
}
return dataTable;
}
catch (Exception ex)
{
return null;
}
}
public static void UpdateData(string filePath, DataTable dataTable)
{
using (Stream stream = new FileStream(filePath, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite))
{
using (XLWorkbook workBook = new XLWorkbook(stream))
{
var workSheet = workBook.Worksheet(1);
if (workSheet != null)
{
int ColumnIndex = 0;
int rowNumber = 1;
foreach (DataColumn column in dataTable.Columns)
{
if (column.ColumnName.Contains("Column Name"))
{
ColumnIndex = column.Ordinal + 1;
}
}
string colAddress = ColumnAddress(ColumnIndex);
foreach (DataRow row in dataTable.Rows)
{
rowNumber++;
workSheet.Cell(rowNumber, ColumnIndex).Value = "Update Value";
}
}
workBook.Save();
}
}
}
private static string ColumnAddress(int col)
{
if (col <= 26)
{
return Convert.ToString(Convert.ToChar(col + 64));
}
int div = col / 26;
int mod = col % 26;
if (mod == 0) { mod = 26; div--; }
return ColumnAddress(div) + ColumnAddress(mod);
}
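The answer above writes a fixed value into one column. For the validation the question actually asks about, here is a minimal sketch. It assumes the sheet has already been read into a DataTable with string columns (for instance by ReadExcelData above). RemarksValidator and MarkMissing are hypothetical names, and the list of mandatory columns is passed in because the question only implies that ID and CCode are mandatory. Columns are looked up by header name, so their position in the sheet does not matter.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class RemarksValidator
{
    // Writes failText into the Remarks column of every row that has an empty
    // mandatory column. Returns the 0-based indexes of the rows that failed.
    public static List<int> MarkMissing(
        DataTable table,
        IEnumerable<string> mandatoryColumns,
        string remarksColumn = "Remarks",
        string failText = "Fail")
    {
        if (!table.Columns.Contains(remarksColumn))
            throw new ArgumentException("Missing column: " + remarksColumn);

        // Resolve the mandatory columns by name once, wherever they sit.
        var mandatory = new List<DataColumn>();
        foreach (string name in mandatoryColumns)
        {
            DataColumn column = table.Columns[name]; // null when the name is absent
            if (column == null)
                throw new ArgumentException("Missing column: " + name);
            mandatory.Add(column);
        }

        var failed = new List<int>();
        for (int i = 0; i < table.Rows.Count; i++)
        {
            DataRow row = table.Rows[i];
            foreach (DataColumn column in mandatory)
            {
                // Convert.ToString turns DBNull into an empty string.
                if (string.IsNullOrWhiteSpace(Convert.ToString(row[column])))
                {
                    row[remarksColumn] = failText;
                    failed.Add(i);
                    break; // one empty mandatory value is enough to fail the row
                }
            }
        }
        return failed;
    }
}
```

If the header is on worksheet row 1, DataTable row i corresponds to worksheet row i + 2, so each returned index can be written back in the same way UpdateData does, with workSheet.Cell(i + 2, ColumnIndex).Value = "Fail".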

Related

How to adjust Excel columns count to project into DataTable in C#?

I am trying to fetch data from an Excel sheet and fill it into a DataTable in C# using EPPlus, with this code:
ExcelPackage.LicenseContext = LicenseContext.NonCommercial;
using (var excel = new ExcelPackage(new FileInfo(fileAddress)))
{
var data = excel.Workbook.Worksheets[0].Cells.ToDataTable(options =>
{
options.FirstRowIsColumnNames = true;
options.EmptyRowStrategy = EmptyRowsStrategy.StopAtFirst;
});
}
But when I run this code I get this error:
first row contains an empty cell at index 4
where 4 is the index of the first empty cell after the data ends.
I tried to set a column mapping but made no progress. Is there any way to determine the table dimensions to convert to a DataTable?
Add the ClosedXML NuGet package to the solution.
using ClosedXML.Excel;
Call this function wherever you need it in your code:
string strFileName = @"C:\VSTSInput\InputFileVSTS_File1.xlsx";
string strSheetName = "Sheet 1";
DataTable DT_InputData = GetDataFromExcel(strFileName, strSheetName);
Create the following as a method, to reuse it:
public static DataTable GetDataFromExcel(string path, dynamic worksheet)
{
//Create a new DataTable.
DataTable dt = new DataTable();
//Open the Excel file using ClosedXML.
using (XLWorkbook workBook = new XLWorkbook(path))
{
//Read the requested sheet (by name or by 1-based position) from the Excel file.
IXLWorksheet workSheet = workBook.Worksheet(worksheet);
//Loop through the Worksheet rows.
bool firstRow = true;
foreach (IXLRow row in workSheet.Rows())
{
//Use the first row to add columns to DataTable.
if (firstRow)
{
foreach (IXLCell cell in row.Cells())
{
if (!string.IsNullOrEmpty(cell.Value.ToString()))
{
dt.Columns.Add(cell.Value.ToString());
}
else
{
break;
}
}
firstRow = false;
}
else
{
int i = 0;
DataRow toInsert = dt.NewRow();
foreach (IXLCell cell in row.Cells(1, dt.Columns.Count))
{
try
{
toInsert[i] = cell.Value.ToString();
}
catch (Exception ex)
{
Console.WriteLine("Failed at: " + System.Reflection.MethodBase.GetCurrentMethod().Name);
Console.WriteLine(ex.Message);
}
i++;
}
dt.Rows.Add(toInsert);
}
}
return dt;
}
}
The data from the Excel file is added to the DataTable, and you can access the columns with the following code:
foreach (DataRow dtRow in DT_InputData.Rows)
{
var dataColumn1 = dtRow[0].ToString(); //Data of 1st column in excel
var dataColumn2 = dtRow[1].ToString(); //Data of 2nd column in excel
var dataColumn3 = dtRow["Your Excel Column Name"].ToString();
}

How to enable read-only permission for specified cells and sheets in Excel using Open XML in C#

I read XML data into a DataSet, then create a spreadsheet and copy the data to sheets in it. Now I want to make some sheets and cells read-only, to prevent changes to the headers and data in some sheets. Below is the code I use to convert the XML to Excel using Open XML. I need to prevent writes to some whole sheets, and to some cells in other sheets.
public void ExportDSToExcel(DataSet ds, string dest)
{
using (var workbook = SpreadsheetDocument.Create(dest, DocumentFormat.OpenXml.SpreadsheetDocumentType.Workbook))
{
var workbookPart = workbook.AddWorkbookPart();
workbook.WorkbookPart.Workbook = new DocumentFormat.OpenXml.Spreadsheet.Workbook();
workbook.WorkbookPart.Workbook.Sheets = new DocumentFormat.OpenXml.Spreadsheet.Sheets();
uint sheetId = 1;
foreach (DataTable table in ds.Tables)
{
var sheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();
var sheetData = new DocumentFormat.OpenXml.Spreadsheet.SheetData();
sheetPart.Worksheet = new DocumentFormat.OpenXml.Spreadsheet.Worksheet(sheetData);
DocumentFormat.OpenXml.Spreadsheet.Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<DocumentFormat.OpenXml.Spreadsheet.Sheets>();
string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);
if (sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Count() > 0)
{
sheetId =
sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Select(s => s.SheetId.Value).Max() + 1;
}
DocumentFormat.OpenXml.Spreadsheet.Sheet sheet = new DocumentFormat.OpenXml.Spreadsheet.Sheet() { Id = relationshipId, SheetId = sheetId, Name = table.TableName };
sheets.Append(sheet);
DocumentFormat.OpenXml.Spreadsheet.Row headerRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
List<String> columns = new List<string>();
foreach (DataColumn column in table.Columns)
{
columns.Add(column.ColumnName);
DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(column.ColumnName);
headerRow.AppendChild(cell);
}
sheetData.AppendChild(headerRow);
foreach (DataRow dsrow in table.Rows)
{
DocumentFormat.OpenXml.Spreadsheet.Row newRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
foreach (String col in columns)
{
DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(dsrow[col].ToString()); //
newRow.AppendChild(cell);
}
sheetData.AppendChild(newRow);
}
}
}
}
protected void Button1_Click(object sender, EventArgs e)
{
if(txtname.Text != null)
{
if (FileUpload1.HasFile == true)
{
string myXMLfile = "/uploads/" + FileUpload1.FileName;
FileUpload1.SaveAs(Server.MapPath(myXMLfile));
string dest = "D:/uploads/" + txtname.Text+".xlsx";
DataSet ds = new DataSet();
try
{
ds.ReadXml(myXMLfile);
}
catch (Exception ex)
{
lblstatus.Text=(ex.ToString());
}
ExportDSToExcel(ds, dest);
}
else
{
lblstatus.Text = "Please Upload the file ";
}
}
else {
lblstatus.Text = "Please enter the name ";
}
}
}
Thanks in advance; please help me find a solution for this code.
Protection with a password can be customised in different ways: you can make a specified area, column, or row read-only, or make a whole sheet read-only by protecting the sheet with a password. To protect the whole sheet, use this code:
PageMargins pageM = sheetPart.Worksheet.GetFirstChild<PageMargins>();
SheetProtection sheetProtection = new SheetProtection();
sheetProtection.Password = "admin";
sheetProtection.Sheet = true;
sheetProtection.Objects = true;
sheetProtection.Scenarios = true;
ProtectedRanges pRanges = new ProtectedRanges();
ProtectedRange pRange = new ProtectedRange();
ListValue<StringValue> lValue = new ListValue<StringValue>();
lValue.InnerText = ""; //set cell which you want to make it editable
pRange.SequenceOfReferences = lValue;
pRange.Name = "not allow editing";
pRanges.Append(pRange);
sheetPart.Worksheet.InsertBefore(sheetProtection, pageM);
sheetPart.Worksheet.InsertBefore(pRanges, pageM);
To make only a specific sheet read-only, first filter the sheets by name and then apply this code. In this code lValue.InnerText = ""; is empty, so no cells are exempted from the protection. If you list cells there, those cells remain editable.

How to convert each column in a new table in DocumentFormat.OpenXML

I have an issue converting an Excel file to an XML file. When I convert the data, it ends up in repeated table tags, <table1></table1> <table1></table1> <table1></table1>, but I need each column converted into its own tag: <table1></table1><table2></table2>...
The code is the following.
private DataTable ReadExcelFile(string filename)
{
// Initialize an instance of DataTable
DataTable dt = new DataTable();
try
{
// Use SpreadSheetDocument class of Open XML SDK to open excel file
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(filename, false))
{
// Get Workbook Part of Spread Sheet Document
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
// Get all sheets in spread sheet document
// IEnumerable Interface
// Exposes an enumerator, which supports a simple iteration over a non-generic collection.
IEnumerable<Sheet> sheetcollection = spreadsheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
// Get relationship Id
string relationshipId = sheetcollection.First().Id.Value;
// Get sheet1 Part of Spread Sheet Document
WorksheetPart worksheetPart = (WorksheetPart)spreadsheetDocument.WorkbookPart.GetPartById(relationshipId);
// Get Data in Excel file
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
IEnumerable<Row> rowcollection = sheetData.Descendants<Row>();
if (rowcollection.Count() == 0)
{
return dt;
}
// Add columns
foreach (Cell cell in rowcollection.ElementAt(0))
{
dt.Columns.Add(GetValueOfCell(spreadsheetDocument, cell));
}
// Add rows into DataTable
foreach (Row row in rowcollection)
{
DataRow temprow = dt.NewRow();
int columnIndex = 0;
foreach (Cell cell in row.Descendants<Cell>())
{
// Get Cell Column Index
int cellColumnIndex = GetColumnIndex(GetColumnName(cell.CellReference));
if (columnIndex < cellColumnIndex)
{
do
{
temprow[columnIndex] = string.Empty;
columnIndex++;
}
while (columnIndex < cellColumnIndex);
}
temprow[columnIndex] = GetValueOfCell(spreadsheetDocument, cell);
columnIndex++;
}
// Add the row to DataTable
// the rows include header row
dt.Rows.Add(temprow);
}
}
// Here remove header row
dt.Rows.RemoveAt(0);
return dt;
}
catch (IOException ex)
{
throw new IOException(ex.Message);
}
}
private static string GetValueOfCell(SpreadsheetDocument spreadsheetdocument, Cell cell)
{
// Get value in Cell
SharedStringTablePart sharedString = spreadsheetdocument.WorkbookPart.SharedStringTablePart;
if (cell.CellValue == null)
{
return string.Empty;
}
string cellValue = cell.CellValue.InnerText;
// The condition that the Cell DataType is SharedString
if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
{
return sharedString.SharedStringTable.ChildElements[int.Parse(cellValue)].InnerText;
}
else
{
return cellValue;
}
}
private string GetColumnName(string cellReference)
{
// Create a regular expression to match the column name of cell
Regex regex = new Regex("[A-Za-z]+");
Match match = regex.Match(cellReference);
return match.Value;
}
private int GetColumnIndex(string columnName)
{
int columnIndex = 0;
int factor = 1;
// From right to left
for (int position = columnName.Length - 1; position >= 0; position--)
{
// For letters
if (Char.IsLetter(columnName[position]))
{
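// Treat the letters as base-26 digits with A = 1 (1-based column number)
columnIndex += factor * ((char.ToUpperInvariant(columnName[position]) - 'A') + 1);
factor *= 26;
}
}
// Convert to a zero-based index once, so that "AA" maps to 26 rather than 25
return columnIndex - 1;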
}
public string GetXML(string filename)
{
using (DataSet ds = new DataSet())
{
ds.Tables.Add(this.ReadExcelFile(filename));
return ds.GetXml();
}
}
Could somebody help me out with this issue?
Or does anyone have an idea how to create custom tags?
Thanks in advance.
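DataSet.GetXml() emits one element per row, named after the table, which is why every row comes out as the same tag. One possible reading of the request is that each column should become its own table. A minimal sketch of that reading follows; ColumnSplitter, SplitByColumn, and the "table" prefix are hypothetical names chosen for illustration, and the exact XML layout wanted by the asker is an assumption.

```csharp
using System.Data;

public static class ColumnSplitter
{
    // Builds a DataSet holding one single-column table per source column,
    // named table1, table2, ... in column order.
    public static DataSet SplitByColumn(DataTable source, string tablePrefix = "table")
    {
        var result = new DataSet();
        foreach (DataColumn column in source.Columns)
        {
            var table = new DataTable(tablePrefix + (column.Ordinal + 1));
            table.Columns.Add(column.ColumnName, column.DataType);

            // Copy this column's value from every source row.
            foreach (DataRow row in source.Rows)
            {
                table.Rows.Add(new object[] { row[column] });
            }
            result.Tables.Add(table);
        }
        return result;
    }
}
```

With the code from the question, this could be used as ColumnSplitter.SplitByColumn(this.ReadExcelFile(filename)).GetXml() in place of adding the single table to the DataSet in GetXML.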

EPPlus - How to use a template

I have recently discovered EPPlus (http://epplus.codeplex.com/).
I have an excel .xlsx file in my project with all the styled column headers.
I read on their site that you can use templates.
Does anyone know how or can provide code sample of how to use my template.xlsx file with EPPlus? I would like to be able to simply load my data into the rows without messing with the headings.
The solution I ended up going with:
using System.IO;
using System.Reflection;
using OfficeOpenXml;
//Create a stream of .xlsx file contained within my project using reflection
Stream stream = Assembly.GetExecutingAssembly().GetManifestResourceStream("EPPlusTest.templates.VendorTemplate.xlsx");
//EPPlusTest = Namespace/Project
//templates = folder
//VendorTemplate.xlsx = file
//ExcelPackage has a constructor that only requires a stream.
ExcelPackage pck = new OfficeOpenXml.ExcelPackage(stream);
After that you can use all the methods of ExcelPackage that you want on an .xlsx file loaded from a template.
To create a new package, you can provide a stream template:
// templateName = the name of .xlsx file
// result = stream to write the resulting xlsx to
using (var source = System.IO.File.OpenRead(templateName))
using (var excel = new OfficeOpenXml.ExcelPackage(result, source)) {
// Fill cells here
// Leave headers etc as is
excel.Save();
}
// This is my implementation for EPPlus; maybe it helps.
class EPPlus
{
FileInfo newFile;
FileInfo templateFile;
DataSet _ds;
ExcelPackage xlPackage;
public string _ErrorMessage;
public EPPlus(string filePath, string templateFilePath)
{
newFile = new FileInfo(filePath);
templateFile = new FileInfo(templateFilePath);
_ds = GetDataTables(); /* DataTables */
_ErrorMessage = string.Empty;
CreateFileWithTemplate();
}
private bool CreateFileWithTemplate()
{
try
{
_ErrorMessage = string.Empty;
using (xlPackage = new ExcelPackage(newFile, templateFile))
{
int i = 1;
foreach (DataTable dt in _ds.Tables)
{
AddSheetWithTemplate(xlPackage, dt, i);
i++;
}
///* Set title, Author.. */
//xlPackage.Workbook.Properties.Title = "Title: Office Open XML Sample";
//xlPackage.Workbook.Properties.Author = "Author: Muhammad Mubashir.";
////xlPackage.Workbook.Properties.SetCustomPropertyValue("EmployeeID", "1147");
//xlPackage.Workbook.Properties.Comments = "Sample Record Details";
//xlPackage.Workbook.Properties.Company = "TRG Tech.";
///* Save */
xlPackage.Save();
}
return true;
}
catch (Exception ex)
{
_ErrorMessage = ex.Message.ToString();
return false;
}
}
/// <summary>
/// This AddSheet method generates a .xlsx Sheet with your provided Template file, //DataTable and SheetIndex.
/// </summary>
public static void AddSheetWithTemplate(ExcelPackage xlApp, DataTable dt, int SheetIndex)
{
string _SheetName = string.Format("Sheet{0}", SheetIndex.ToString());
ExcelWorksheet worksheet;
/* WorkSheet */
if (SheetIndex == 0)
{
worksheet = xlApp.Workbook.Worksheets[SheetIndex + 1]; // use the first worksheet of the template
}
else
{
worksheet = xlApp.Workbook.Worksheets[SheetIndex]; // use the template worksheet at this position
}
if (worksheet == null)
{
worksheet = xlApp.Workbook.Worksheets.Add(_SheetName); // add a new worksheet to the empty workbook
}
else
{
}
/* Load the datatable into the sheet, starting from cell A1. Print the column names on row 1 */
worksheet.Cells["A1"].LoadFromDataTable(dt, true);
}
private static void AddSheet(ExcelPackage xlApp, DataTable dt, int Index, string sheetName)
{
string _SheetName = string.Empty;
if (string.IsNullOrEmpty(sheetName) == true)
{
_SheetName = string.Format("Sheet{0}", Index.ToString());
}
else
{
_SheetName = sheetName;
}
/* WorkSheet */
ExcelWorksheet worksheet = xlApp.Workbook.Worksheets[_SheetName]; // look up an existing worksheet by name
if (worksheet == null)
{
worksheet = xlApp.Workbook.Worksheets.Add(_SheetName); // add a new worksheet to the empty workbook
}
else
{
}
/* Load the datatable into the sheet, starting from cell A1. Print the column names on row 1 */
worksheet.Cells["A1"].LoadFromDataTable(dt, true);
int rowCount = dt.Rows.Count;
int colCount = dt.Columns.Count;
#region Set Column Type to Date using LINQ.
/*
IEnumerable<int> dateColumns = from DataColumn d in dt.Columns
where d.DataType == typeof(DateTime) || d.ColumnName.Contains("Date")
select d.Ordinal + 1;
foreach (int dc in dateColumns)
{
xlSheet.Cells[2, dc, rowCount + 1, dc].Style.Numberformat.Format = "dd/MM/yyyy";
}
*/
#endregion
#region Set Column Type to Date using LOOP.
/* Set Column Type to Date. */
for (int i = 0; i < dt.Columns.Count; i++)
{
if ((dt.Columns[i].DataType).FullName == "System.DateTime" && (dt.Columns[i].DataType).Name == "DateTime")
{
//worksheet.Cells[2,4] .Style.Numberformat.Format = "yyyy-mm-dd h:mm"; //OR "yyyy-mm-dd h:mm" if you want to include the time!
worksheet.Column(i + 1).Style.Numberformat.Format = "dd/MM/yyyy h:mm"; //OR "yyyy-mm-dd h:mm" if you want to include the time!
worksheet.Column(i + 1).Width = 25;
}
}
#endregion
//(from DataColumn d in dt.Columns select d.Ordinal + 1).ToList().ForEach(dc =>
//{
// //background color
// worksheet.Cells[1, 1, 1, dc].Style.Fill.PatternType = ExcelFillStyle.Solid;
// worksheet.Cells[1, 1, 1, dc].Style.Fill.BackgroundColor.SetColor(System.Drawing.Color.LightYellow);
// //border
// worksheet.Cells[1, dc, rowCount + 1, dc].Style.Border.Top.Style = ExcelBorderStyle.Thin;
// worksheet.Cells[1, dc, rowCount + 1, dc].Style.Border.Right.Style = ExcelBorderStyle.Thin;
// worksheet.Cells[1, dc, rowCount + 1, dc].Style.Border.Bottom.Style = ExcelBorderStyle.Thin;
// worksheet.Cells[1, dc, rowCount + 1, dc].Style.Border.Left.Style = ExcelBorderStyle.Thin;
// worksheet.Cells[1, dc, rowCount + 1, dc].Style.Border.Top.Color.SetColor(System.Drawing.Color.LightGray);
// worksheet.Cells[1, dc, rowCount + 1, dc].Style.Border.Right.Color.SetColor(System.Drawing.Color.LightGray);
// worksheet.Cells[1, dc, rowCount + 1, dc].Style.Border.Bottom.Color.SetColor(System.Drawing.Color.LightGray);
// worksheet.Cells[1, dc, rowCount + 1, dc].Style.Border.Left.Color.SetColor(System.Drawing.Color.LightGray);
//});
/* Format the header: Prepare the range for the column headers */
string cellRange = "A1:" + Convert.ToChar('A' + colCount - 1) + 1;
using (ExcelRange rng = worksheet.Cells[cellRange])
{
rng.Style.Font.Bold = true;
rng.Style.Fill.PatternType = ExcelFillStyle.Solid; //Set Pattern for the background to Solid
rng.Style.Fill.BackgroundColor.SetColor(Color.FromArgb(79, 129, 189)); //Set color to dark blue
rng.Style.Font.Color.SetColor(Color.White);
}
/* Header Footer */
worksheet.HeaderFooter.OddHeader.CenteredText = "Header: Tinned Goods Sales";
worksheet.HeaderFooter.OddFooter.RightAlignedText = string.Format("Footer: Page {0} of {1}", ExcelHeaderFooter.PageNumber, ExcelHeaderFooter.NumberOfPages); // add the page number to the footer plus the total number of pages
}
}// class End.
I use VB.NET; here is what I did:
VB
Imports OfficeOpenXml
Dim existingFile As New FileInfo("C:\OldFileLocation\File.xlsx")
Dim fNewFile As New FileInfo("C:\NewFileLocation\File.xlsx")
Using MyExcel As New ExcelPackage(existingFile)
Dim MyWorksheet As ExcelWorksheet = MyExcel.Workbook.Worksheets("ExistingSheetName")
MyWorksheet.Cells("A1").Value = "Hello"
'Add additional info here
MyExcel.SaveAs(fNewFile)
End Using
Possible C# (I did not test it):
FileInfo existingFile = new FileInfo("C:\\OldFileLocation\\File.xlsx");
FileInfo fNewFile = new FileInfo("C:\\NewFileLocation\\File.xlsx");
using (ExcelPackage MyExcel = new ExcelPackage(existingFile)) {
ExcelWorksheet MyWorksheet = MyExcel.Workbook.Worksheets["ExistingSheetName"];
MyWorksheet.Cells["A1"].Value = "Hello";
//Add additional info here
MyExcel.SaveAs(fNewFile);
}
Response.Clear();
Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
Response.AddHeader("content-disposition", "attachment;filename=" + HttpUtility.UrlEncode("Logs.xlsx", System.Text.Encoding.UTF8));
using (ExcelPackage pck = new ExcelPackage())
{
ExcelWorksheet ws = pck.Workbook.Worksheets.Add("Logs");
ws.Cells["A1"].LoadFromDataTable(dt, true);
var ms = new System.IO.MemoryStream();
pck.SaveAs(ms);
ms.WriteTo(Response.OutputStream);
}

From Excel to DataTable in C# with Open XML

I'm using Visual Studio 2008 and I need to create a DataTable from an Excel sheet using the Open XML SDK 2.0. The DataTable columns should come from the first row of the sheet, and the rest of the values should fill the rows.
Does anyone have example code or a link that can help me do this?
I think this should do what you're asking. The other function is there just to deal with if you have shared strings, which I assume you do in your column headers. Not sure this is perfect, but I hope it helps.
static void Main(string[] args)
{
DataTable dt = new DataTable();
using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(@"..\..\example.xlsx", false))
{
WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
string relationshipId = sheets.First().Id.Value;
WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
Worksheet workSheet = worksheetPart.Worksheet;
SheetData sheetData = workSheet.GetFirstChild<SheetData>();
IEnumerable<Row> rows = sheetData.Descendants<Row>();
foreach (Cell cell in rows.ElementAt(0))
{
dt.Columns.Add(GetCellValue(spreadSheetDocument, cell));
}
foreach (Row row in rows) //this will also include your header row...
{
DataRow tempRow = dt.NewRow();
for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
{
tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));
}
dt.Rows.Add(tempRow);
}
}
dt.Rows.RemoveAt(0); //...so i'm taking it out here.
}
public static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
string value = cell.CellValue.InnerXml;
if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
{
return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
}
else
{
return value;
}
}
The above code works fine except for one change.
Replace this line of code:
tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));
with
tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));
If you use (i-1) it will throw an exception:
Specified argument was out of the range of valid values. Parameter name: index
This solution works for spreadsheets without empty cells.
To handle empty cells, you will need to replace this line:
tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));
with something like this:
Cell cell = row.Descendants<Cell>().ElementAt(i);
int index = CellReferenceToIndex(cell);
tempRow[index] = GetCellValue(spreadSheetDocument, cell);
And add this method:
private static int CellReferenceToIndex(Cell cell)
{
int index = -1;
string reference = cell.CellReference.ToString().ToUpper();
foreach (char ch in reference)
{
if (Char.IsLetter(ch))
{
int value = (int)ch - (int)'A';
index = (index + 1) * 26 + value;
}
else
return index;
}
return index;
}
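CellReferenceToIndex reads the column letters as base-26 digits where A counts as 1, then shifts to a zero-based index. A minimal sketch of that conversion in both directions is below; ColumnRef, ToIndex, and ToLetters are hypothetical names, and it assumes references use only the letters A to Z.

```csharp
using System;

public static class ColumnRef
{
    // Zero-based column index of a cell reference: "A1" -> 0, "Z9" -> 25, "AA10" -> 26.
    public static int ToIndex(string cellReference)
    {
        int number = 0; // 1-based column number built from the leading letters
        foreach (char ch in cellReference.ToUpperInvariant())
        {
            if (ch < 'A' || ch > 'Z') break; // stop at the row digits
            number = number * 26 + (ch - 'A' + 1);
        }
        if (number == 0)
            throw new ArgumentException("No column letters in: " + cellReference);
        return number - 1;
    }

    // Inverse of ToIndex for the column part: 0 -> "A", 25 -> "Z", 26 -> "AA".
    public static string ToLetters(int index)
    {
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        string letters = string.Empty;
        int number = index + 1;
        while (number > 0)
        {
            int remainder = (number - 1) % 26;
            letters = (char)('A' + remainder) + letters;
            number = (number - 1) / 26;
        }
        return letters;
    }
}
```

Subtracting 1 before taking the remainder is what handles the fact that there is no zero digit in column names: Z is followed by AA, not by A0.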
This is my complete solution, where empty cells are also taken into account.
public static class ExcelHelper
{
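//Gets the value of the cell in the given column, or an empty string if the row has no such cell. Cells cannot be fetched by list position because empty cells are omitted from the row.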
private static string GetCellValue(WorkbookPart wbPart, List<Cell> theCells, string cellColumnReference)
{
Cell theCell = null;
string value = "";
foreach (Cell cell in theCells)
{
if (cell.CellReference.Value.StartsWith(cellColumnReference))
{
theCell = cell;
break;
}
}
if (theCell != null)
{
value = theCell.InnerText;
// If the cell represents an integer number, you are done.
// For dates, this code returns the serialized value that represents the date. The code handles strings and
// Booleans individually. For shared strings, the code looks up the corresponding value in the shared string table. For Booleans, the code converts the value into the words TRUE or FALSE.
if (theCell.DataType != null)
{
switch (theCell.DataType.Value)
{
case CellValues.SharedString:
// For shared strings, look up the value in the shared strings table.
var stringTable = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
// If the shared string table is missing, something is wrong. Return the index that is in the cell. Otherwise, look up the correct text in the table.
if (stringTable != null)
{
value = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
}
break;
case CellValues.Boolean:
switch (value)
{
case "0":
value = "FALSE";
break;
default:
value = "TRUE";
break;
}
break;
}
}
}
return value;
}
private static string GetCellValue(WorkbookPart wbPart, List<Cell> theCells, int index)
{
return GetCellValue(wbPart, theCells, GetExcelColumnName(index));
}
private static string GetExcelColumnName(int columnNumber)
{
int dividend = columnNumber;
string columnName = String.Empty;
int modulo;
while (dividend > 0)
{
modulo = (dividend - 1) % 26;
columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
dividend = (int)((dividend - modulo) / 26);
}
return columnName;
}
//Only xlsx files
public static DataTable GetDataTableFromExcelFile(string filePath, string sheetName = "")
{
DataTable dt = new DataTable();
try
{
using (SpreadsheetDocument document = SpreadsheetDocument.Open(filePath, false))
{
WorkbookPart wbPart = document.WorkbookPart;
IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
string sheetId = sheetName != "" ? sheets.Where(q => q.Name == sheetName).First().Id.Value : sheets.First().Id.Value;
WorksheetPart wsPart = (WorksheetPart)wbPart.GetPartById(sheetId);
SheetData sheetdata = wsPart.Worksheet.Elements<SheetData>().FirstOrDefault();
int totalHeaderCount = sheetdata.Descendants<Row>().ElementAt(0).Descendants<Cell>().Count();
//Get the header
for (int i = 1; i <= totalHeaderCount; i++)
{
dt.Columns.Add(GetCellValue(wbPart, sheetdata.Descendants<Row>().ElementAt(0).Elements<Cell>().ToList(), i));
}
foreach (Row r in sheetdata.Descendants<Row>())
{
if (r.RowIndex > 1)
{
DataRow tempRow = dt.NewRow();
//Always get from the header count, because the index of the row changes where empty cell is not counted
for (int i = 1; i <= totalHeaderCount; i++)
{
tempRow[i - 1] = GetCellValue(wbPart, r.Elements<Cell>().ToList(), i);
}
dt.Rows.Add(tempRow);
}
}
}
}
catch (Exception ex)
{
}
return dt;
}
}
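One caveat about GetCellValue above: it matches cells with StartsWith, so a lookup for column A would also accept a cell such as AA1 when the row has no cell in column A. This only matters for sheets wider than 26 columns. A minimal sketch of an exact comparison is below; CellColumn, LettersOf, and IsInColumn are hypothetical helper names, not part of the Open XML SDK.

```csharp
using System;

public static class CellColumn
{
    // Returns the letter part of a cell reference: "AB12" -> "AB".
    public static string LettersOf(string cellReference)
    {
        int length = 0;
        while (length < cellReference.Length && char.IsLetter(cellReference[length]))
        {
            length++;
        }
        return cellReference.Substring(0, length);
    }

    // True only when the reference sits in exactly the given column,
    // so "AA1" is not treated as belonging to column "A".
    public static bool IsInColumn(string cellReference, string columnLetters)
    {
        return string.Equals(LettersOf(cellReference), columnLetters,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

In the loop over theCells, the StartsWith test could then be swapped for CellColumn.IsInColumn(cell.CellReference.Value, cellColumnReference).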
First add ExcelUtility.cs to your project:
ExcelUtility.cs
using System.Data;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
namespace Core_Excel.Utilities
{
static class ExcelUtility
{
public static DataTable Read(string path)
{
var dt = new DataTable();
using (var ssDoc = SpreadsheetDocument.Open(path, false))
{
var sheets = ssDoc.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
var relationshipId = sheets.First().Id.Value;
var worksheetPart = (WorksheetPart) ssDoc.WorkbookPart.GetPartById(relationshipId);
var workSheet = worksheetPart.Worksheet;
var sheetData = workSheet.GetFirstChild<SheetData>();
var rows = sheetData.Descendants<Row>().ToList();
foreach (var row in rows) //this will also include your header row...
{
var tempRow = dt.NewRow();
var colCount = row.Descendants<Cell>().Count();
foreach (var cell in row.Descendants<Cell>())
{
var index = GetIndex(cell.CellReference);
// Add Columns
for (var i = dt.Columns.Count; i <= index; i++)
dt.Columns.Add();
tempRow[index] = GetCellValue(ssDoc, cell);
}
dt.Rows.Add(tempRow);
}
}
return dt;
}
private static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
var stringTablePart = document.WorkbookPart.SharedStringTablePart;
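// A cell can exist without a value (for example, a styled but empty cell)
if (cell.CellValue == null)
return string.Empty;
var value = cell.CellValue.InnerXml;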
if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
return stringTablePart.SharedStringTable.ChildElements[int.Parse(value)].InnerText;
return value;
}
public static int GetIndex(string name)
{
if (string.IsNullOrWhiteSpace(name))
return -1;
int index = 0;
foreach (var ch in name)
{
if (char.IsLetter(ch))
{
int value = ch - 'A' + 1;
index = value + index * 26;
}
else
break;
}
return index - 1;
}
}
}
Usage :
var path = "D:\\Documents\\test.xlsx";
var dt = ExcelUtility.Read(path);
then enjoy it!
Public Shared Function ExcelToDataTable(filename As String) As DataTable
Try
Dim dt As New DataTable()
Using doc As SpreadsheetDocument = SpreadsheetDocument.Open(filename, False)
Dim workbookPart As WorkbookPart = doc.WorkbookPart
Dim sheets As IEnumerable(Of Sheet) = doc.WorkbookPart.Workbook.GetFirstChild(Of Sheets)().Elements(Of Sheet)()
Dim relationshipId As String = sheets.First().Id.Value
Dim worksheetPart As WorksheetPart = DirectCast(doc.WorkbookPart.GetPartById(relationshipId), WorksheetPart)
Dim workSheet As Worksheet = worksheetPart.Worksheet
Dim sheetData As SheetData = workSheet.GetFirstChild(Of SheetData)()
Dim rows As IEnumerable(Of Row) = sheetData.Descendants(Of Row)()
For Each cell As Cell In rows.ElementAt(0)
dt.Columns.Add(GetCellValue(doc, cell))
Next
For Each row As Row In rows
'this will also include your header row...
Dim tempRow As DataRow = dt.NewRow()
For i As Integer = 0 To row.Descendants(Of Cell)().Count() - 1
tempRow(i) = GetCellValue(doc, row.Descendants(Of Cell)().ElementAt(i))
Next
dt.Rows.Add(tempRow)
Next
End Using
dt.Rows.RemoveAt(0)
Return dt
Catch ex As Exception
Throw
End Try
End Function
Public Shared Function GetCellValue(document As SpreadsheetDocument, cell As Cell) As String
Try
If IsNothing(cell.CellValue) Then
Return ""
End If
Dim value As String = cell.CellValue.InnerXml
If cell.DataType IsNot Nothing AndAlso cell.DataType.Value = CellValues.SharedString Then
Dim stringTablePart As SharedStringTablePart = document.WorkbookPart.SharedStringTablePart
Return stringTablePart.SharedStringTable.ChildElements(Int32.Parse(value)).InnerText
Else
Return value
End If
Catch ex As Exception
Return ""
End Try
End Function
I know it has been a long time since this thread started. However, none of the solutions above really worked for me, because of the empty-cells issue and other problems.
I found a very good solution with 'MIT' license on GitHub:
https://github.com/ExcelDataReader/ExcelDataReader
This worked for me in both C# and VB.NET applications.
Sample call from VB.NET (the sample code for C# is on GitHub):
Using stream As FileStream = New FileStream(DataPath & "\" & fName.Name, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
Using reader As IExcelDataReader = ExcelReaderFactory.CreateReader(stream)
ds = reader.AsDataSet(New ExcelDataSetConfiguration() With {
.UseColumnDataType = False,
.ConfigureDataTable = Function(tableReader) New ExcelDataTableConfiguration() With {
.UseHeaderRow = True
}
})
End Using
End Using
The result was a dataset with one table for each sheet in the workbook.
I also prefer to compile the C# DLL myself rather than use a prebuilt DLL, so that I can control what I am delivering to customers.
If a cell value is null or empty, the values come out wrong.
It works correctly only when all columns are filled with data, which may not be the case for every row.
