Reading Excel using Open XML SDK in C#

I am trying to read an Excel file with the Open XML SDK in C#, using Visual Studio 2013.
I followed these links:
http://msdn.microsoft.com/en-us/library/office/hh298534(v=office.15).aspx
http://msdn.microsoft.com/en-us/library/office/gg575571.aspx
The code is
using (SpreadsheetDocument document = SpreadsheetDocument.Open(excelPath, true))
{
WorkbookPart workbookPart = document.WorkbookPart;
foreach (WorksheetPart workSheetPart in workbookPart.WorksheetParts)
{
SheetData sheetData = workSheetPart.Worksheet.Elements<SheetData>().First();
IEnumerable<Row> rows = sheetData.Elements<Row>().Where(x => x.RowIndex > 1);
foreach (Row r in rows)
{
IEnumerable<string> textValues = from cell in r.Descendants<Cell>() where cell.CellValue != null select cell.CellValue.Text;
foreach (var cell in textValues)
{
string str = cell.ToString();
}
}
}
}
I also tried the following code:
using (SpreadsheetDocument document = SpreadsheetDocument.Open(excelPath, true))
{
WorkbookPart workbookPart = document.WorkbookPart;
foreach (WorksheetPart workSheetPart in workbookPart.WorksheetParts)
{
SheetData sheetData = workSheetPart.Worksheet.Elements<SheetData>().First();
IEnumerable<Row> rows = sheetData.Elements<Row>().Where(x => x.RowIndex > 1);
foreach (Row r in rows)
{
List<Cell> cells = r.Descendants<Cell>().ToList();
foreach (var cell in cells)
{
if (cell != null)
{
string value = cell.CellValue.Text;
if (cell.DataType != null)
{
switch (cell.DataType.Value)
{
case CellValues.SharedString:
var stringTable = workSheetPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
if (stringTable != null)
{
value = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
}
break;
}
}
}
}
}
}
}
But both return only numeric values, not text. Can anyone please help me read Excel text using the Open XML SDK in C#?

I have not used the Open XML SDK directly, but have you tried ClosedXML? It's a wrapper around the SDK that makes reading and writing Excel documents a breeze.
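If you would rather stay with the SDK: the numeric values are most likely shared string indexes. A cell whose DataType is SharedString stores only an index, and the table it points into is a part of the workbook, not of the worksheet, so looking it up on workSheetPart finds nothing. Below is a minimal sketch of the lookup, assuming the file is a regular .xlsx; the class and method names (CellTextReader, GetCellText, DumpWorkbook) are made up for illustration.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class CellTextReader
{
    // Hypothetical helper: returns the text of a cell, resolving shared string indexes.
    public static string GetCellText(WorkbookPart workbookPart, Cell cell)
    {
        if (cell == null || cell.CellValue == null)
            return string.Empty;

        string raw = cell.CellValue.Text;

        if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
        {
            // The shared string table is attached to the workbook part.
            SharedStringTablePart tablePart = workbookPart.SharedStringTablePart;
            if (tablePart != null && tablePart.SharedStringTable != null)
            {
                return tablePart.SharedStringTable
                    .Elements<SharedStringItem>()
                    .ElementAt(int.Parse(raw))
                    .InnerText;
            }
        }

        // Numbers and other stored values are returned as written in the file.
        return raw;
    }

    // Hypothetical driver: prints every stored cell value of every sheet.
    public static void DumpWorkbook(string excelPath)
    {
        using (SpreadsheetDocument document = SpreadsheetDocument.Open(excelPath, false))
        {
            WorkbookPart workbookPart = document.WorkbookPart;
            foreach (WorksheetPart worksheetPart in workbookPart.WorksheetParts)
            {
                foreach (Row row in worksheetPart.Worksheet.Descendants<Row>())
                {
                    foreach (Cell cell in row.Elements<Cell>())
                    {
                        Console.WriteLine(GetCellText(workbookPart, cell));
                    }
                }
            }
        }
    }
}
```

The file is opened read-only here (second argument false), since nothing is written back.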

Related

Optimize reading of an excel file into a DataTable object

I use the code below to read data from an Excel file into a DataTable object for further use. Since it processes 100k to 500k entries, reading can get a bit slow. Is there something I could change in my code to optimize the process? The code is below.
public static DataTable ReadAsDataTable(string filePath)
{
DataTable dataTable = new DataTable();
using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(filePath, false))
{
WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
string relationshipId = sheets.First().Id.Value;
WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
Worksheet workSheet = worksheetPart.Worksheet;
SheetData sheetData = workSheet.GetFirstChild<SheetData>();
IEnumerable<Row> rows = sheetData.Descendants<Row>();
foreach (Cell cell in rows.ElementAt(0))
{
dataTable.Columns.Add(GetCellValue(spreadSheetDocument, cell));
}
foreach (Row row in rows)
{
DataRow dataRow = dataTable.NewRow();
for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
{
dataRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));
}
dataTable.Rows.Add(dataRow);
}
}
dataTable.Rows.RemoveAt(0);
return dataTable;
}
private static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
string value = cell.CellValue.InnerXml;
if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
{
return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
}
else
{
return value;
}
}
I'm not sure how the compiler treats this or what the performance characteristics of that API are, but would it help to call row.Descendants<Cell>() only once per row? It looks like something the compiler could optimize, but because the call may have side effects, it probably does not.
foreach (Row row in rows)
{
var cells = row.Descendants<Cell>().ToArray();
DataRow dataRow = dataTable.NewRow();
for (int i = 0; i < cells.Length; i++)
{
dataRow[i] = GetCellValue(spreadSheetDocument, cells[i]);
}
dataTable.Rows.Add(dataRow);
}
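Another thing that may be worth trying, in the same spirit: GetCellValue goes back to the shared string table for every cell, and indexing into ChildElements may walk the child list on each call. The sketch below copies the shared strings into an array once and enumerates each row's cells a single time. It is only an assumption-laden rewrite of the method from the question (the FastSheetReader class name is made up), not a measured fix, so profile it on your own files.

```csharp
using System.Data;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class FastSheetReader
{
    public static DataTable ReadAsDataTable(string filePath)
    {
        var dataTable = new DataTable();
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(filePath, false))
        {
            WorkbookPart workbookPart = doc.WorkbookPart;

            // Materialize the shared strings once so each lookup is an array index.
            SharedStringTablePart tablePart = workbookPart.SharedStringTablePart;
            string[] sharedStrings = tablePart == null
                ? new string[0]
                : tablePart.SharedStringTable.Elements<SharedStringItem>()
                      .Select(item => item.InnerText)
                      .ToArray();

            Sheet firstSheet = workbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>().First();
            var worksheetPart = (WorksheetPart)workbookPart.GetPartById(firstSheet.Id.Value);
            SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();

            bool isHeader = true;
            foreach (Row row in sheetData.Elements<Row>())
            {
                // Enumerate the cells of this row exactly once.
                string[] values = row.Elements<Cell>()
                    .Select(c => GetCellValue(sharedStrings, c))
                    .ToArray();

                if (isHeader)
                {
                    // The first row supplies the column names.
                    foreach (string name in values)
                        dataTable.Columns.Add(name);
                    isHeader = false;
                    continue;
                }

                DataRow dataRow = dataTable.NewRow();
                // Ignore cells beyond the number of header columns.
                for (int i = 0; i < values.Length && i < dataTable.Columns.Count; i++)
                    dataRow[i] = values[i];
                dataTable.Rows.Add(dataRow);
            }
        }
        return dataTable;
    }

    private static string GetCellValue(string[] sharedStrings, Cell cell)
    {
        if (cell.CellValue == null)
            return string.Empty;

        string value = cell.CellValue.Text;
        if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
            return sharedStrings[int.Parse(value)];
        return value;
    }
}
```

Like the original, this fills columns by position, so a row with empty cells in the middle (which the file simply omits) will shift its values to the left.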

Open XML Reading Excel file does not enter loop to read excel sheet

I need to read an Excel file in an MVC application. This has to run on the server, so I am using Open XML. My problem is that the code never enters the loop over the rows in the sheet. Please see my code below and advise on how I can fix it.
if (file.ContentLength > 0)
{
string path = file.FileName;
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(path, false))
{
WorkbookPart workbookPart = doc.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
foreach (Row r in sheetData.Elements<Row>())
{
foreach (Cell c in r.Elements<Cell>())
{
string text = c.CellValue.Text;
}
}
}
}
Any ideas? Your help will be greatly appreciated. I have tried multiple approaches, but for some odd reason the code never enters the foreach loop.
I am using Excel 2013; please see the image of my workbook below.
I have been using the following code, which works fine for me.
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(sFileNameWithPath, false))
{
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart worksheetPart = GetWorksheetPart(workbookPart, sSheetName);
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
bool bHasChildren = sheetData.HasChildren;
if (bHasChildren)
{
for (int iCounter1 = 1; iCounter1 < sheetData.Elements<Row>().Count(); iCounter1++)
{
Row rDataRow = sheetData.Elements<Row>().ElementAt(iCounter1);
for (int iCounter = 0; iCounter < rDataRow.ChildElements.Count; iCounter++)
{
Cell oCell = (Cell)rDataRow.ChildElements[iCounter];
}
}
}
}
Let me know if this helps.
Or you can use your own code with the following change:
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(sFileNameWithPath, false))
{
WorkbookPart workbookPart = doc.WorkbookPart;
string relId = workbookPart.Workbook.Descendants<Sheet>().First(s => "Claims".Equals(s.Name)).Id;
WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(relId);
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
foreach (Row r in sheetData.Elements<Row>())
{
foreach (Cell c in r.Elements<Cell>())
{
string text = c.CellValue.Text;
}
}
}
Note that I have hard-coded the sheet name "Claims". Check whether it works, and if it does, move the lookup into a separate function to make it generic.
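A generic version of that lookup might look like the sketch below. The GetWorksheetPart helper called in the first snippet was not shown, so this is a guess at what such a function could do, not the original implementation; the WorksheetLookup class name and the case-insensitive comparison are assumptions.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class WorksheetLookup
{
    // Hypothetical helper: finds a worksheet part by the sheet name shown on the tab.
    // Returns null when no sheet with that name exists.
    public static WorksheetPart GetWorksheetPart(WorkbookPart workbookPart, string sheetName)
    {
        Sheet sheet = workbookPart.Workbook
            .Descendants<Sheet>()
            .FirstOrDefault(s => s.Name != null &&
                string.Equals(s.Name.Value, sheetName, StringComparison.OrdinalIgnoreCase));

        if (sheet == null)
            return null;

        // The sheet's Id is a relationship id that resolves to the worksheet part.
        return (WorksheetPart)workbookPart.GetPartById(sheet.Id.Value);
    }
}
```

Resolving the part through the sheet's relationship id avoids relying on the order of workbookPart.WorksheetParts, which does not have to match the tab order in Excel.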

Why cannot read excel file version older than 2013 with XML SDK 2.0 C#

I wrote some code to read an Excel file using the Open XML SDK, in C# on Visual Studio 2010. The code is at the bottom. The problem is that it reads any Excel file saved with the 2013 version, but no file from an older Excel version. More specifically, the program does not enter the foreach loop when the file comes from a version older than 2013. Any ideas why?
static void ReadExcelFile(string fileName)
{
//open the file
using (SpreadsheetDocument myDoc = SpreadsheetDocument.Open(fileName, true))
{
//workbook part capture
WorkbookPart workbookPart = myDoc.WorkbookPart;
//then access to the worksheet part
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
//find sheet data
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
foreach (Row r in sheetData.Elements<Row>())
{
foreach (Cell c in r.Elements<Cell>())
{
string text = c.CellValue.Text;
Console.WriteLine(text);
}
}
Console.ReadKey();
}
}
This should do the trick. Here is a self-explanatory code sample.
I have tested this with Excel 2010. As far as I can tell, MSDN states that your version is only applicable to Excel 2013 and only works when the cells contain numbers. In this example, rather than using LINQ to get the elements, I walk through the parts manually.
// Open the file
using (SpreadsheetDocument myDoc = SpreadsheetDocument.Open(path, true))
{
    // Get the workbook part
    WorkbookPart workbookPart = myDoc.WorkbookPart;
    // Get the shared string table part (null if the workbook has no shared strings)
    var stringtable = workbookPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
    // Then access the worksheet parts
    IEnumerable<WorksheetPart> worksheetPart = workbookPart.WorksheetParts;
    foreach (WorksheetPart WSP in worksheetPart)
    {
        // Find the sheet data
        IEnumerable<SheetData> sheetData = WSP.Worksheet.Elements<SheetData>();
        foreach (SheetData SD in sheetData)
        {
            foreach (Row row in SD.Elements<Row>())
            {
                // For each cell we need to identify the type
                foreach (Cell cell in row.Elements<Cell>())
                {
                    if (cell.CellValue == null)
                    {
                        // No stored value at all
                        Console.WriteLine("A broken book");
                    }
                    else if (cell.DataType == null)
                    {
                        // Pure numbers carry no data type
                        Console.WriteLine(cell.CellValue.Text);
                    }
                    else if (cell.DataType.Value == CellValues.Boolean)
                    {
                        // Booleans
                        Console.WriteLine(cell.CellValue.Text);
                    }
                    else if (cell.DataType.Value == CellValues.SharedString && stringtable != null)
                    {
                        // Cell value holds the shared string location
                        Console.WriteLine(stringtable.SharedStringTable.ElementAt(int.Parse(cell.CellValue.Text)).InnerText);
                    }
                    else
                    {
                        // Any other typed value: print it as stored
                        Console.WriteLine(cell.CellValue.Text);
                    }
                }
            }
        }
    }
}

How to open an Excel document and read values using Open XML in C#

I would like to be able to open an excel document and loop through the values in the cells using Open XML in C#.
I tried the code below, but it never gets past the foreach (Row ...) loop. The comments also state that it works for numeric values only, not alphanumeric values.
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(Filedirectory, false))
{
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
string text;
foreach (Row r in sheetData.Elements<Row>())
{
foreach (Cell c in r.Elements<Cell>())
{
text = c.CellValue.Text;
}
}
}
A coworker came up with the following solution.
FileInfo uploadedFile = new FileInfo(FileDirectory);
using (ASC.ExcelPackage.ExcelPackage xlPackage = new ASC.ExcelPackage.ExcelPackage(uploadedFile))
{
ASC.ExcelPackage.ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets[1];
Int32 currentRow = 1;
while (worksheet.Cell(currentRow, 1) != null && !string.IsNullOrEmpty(worksheet.Cell(currentRow, 1).Value))
{
string value = worksheet.Cell(currentRow, 1).Value;
currentRow++;
}
}

How to read xlsx with Open XML SDK based on columns in each row in C#?

I am trying to read some .xlsx files with the Open XML SDK, but I'm really struggling to find any good examples.
What I want to do is read the entire XLSX file, loop through all of the rows, and extract the cell value/cell text from the columns I specify.
Like the following:
GetCellText(rowId, ColumnLetter)
Is this possible?
Helpers:
private static string GetColumnName(string cellReference)
{
if (ColumnNameRegex.IsMatch(cellReference))
return ColumnNameRegex.Match(cellReference).Value;
throw new ArgumentOutOfRangeException(cellReference);
}
private static readonly Regex ColumnNameRegex = new Regex("[A-Za-z]+");
Code:
using (var document = SpreadsheetDocument.Open(stream, true))
{
var sheets = document.WorkbookPart.Workbook.Descendants<Sheet>();
foreach (Sheet sheet in sheets)
{
WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id);
Worksheet worksheet = worksheetPart.Worksheet;
var rows = worksheet.GetFirstChild<SheetData>().Elements<Row>();
foreach (var row in rows)
{
var cells = row.Elements<Cell>();
foreach (var cell in cells)
{
if(GetColumnName(cell.CellReference) == "A")
{
var str = cell.CellValue.Text;
// do whatever you want
}
}
}
}
}
Your question is similar to this one: "Open xml excel read cell value".
You can get the row by ID and look-up the value by column name.
Hope that helps
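To make that concrete, here is a rough sketch of a GetCellText(rowId, ColumnLetter) style lookup. The CellLookup class, the parameter list and the null return for missing cells are assumptions for illustration; it also resolves shared strings, since cell.CellValue.Text alone returns only an index for text cells.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class CellLookup
{
    // Matches the leading letters of a cell reference such as "AB12".
    private static readonly Regex ColumnNameRegex = new Regex("^[A-Za-z]+");

    // Hypothetical helper: returns the text at (rowIndex, columnLetter), or null if absent.
    public static string GetCellText(WorkbookPart workbookPart, Worksheet worksheet,
                                     uint rowIndex, string columnLetter)
    {
        SheetData sheetData = worksheet.GetFirstChild<SheetData>();
        if (sheetData == null)
            return null;

        // Find the row by its 1-based row index.
        Row row = sheetData.Elements<Row>()
            .FirstOrDefault(r => r.RowIndex != null && r.RowIndex.Value == rowIndex);
        if (row == null)
            return null;

        // Find the cell whose reference starts with the requested column letters.
        Cell cell = row.Elements<Cell>().FirstOrDefault(c =>
            c.CellReference != null &&
            string.Equals(ColumnNameRegex.Match(c.CellReference.Value).Value,
                          columnLetter, StringComparison.OrdinalIgnoreCase));
        if (cell == null || cell.CellValue == null)
            return null;

        string raw = cell.CellValue.Text;
        if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
        {
            // Text cells store an index into the workbook's shared string table.
            SharedStringTablePart tablePart = workbookPart.SharedStringTablePart;
            if (tablePart != null)
            {
                return tablePart.SharedStringTable.Elements<SharedStringItem>()
                    .ElementAt(int.Parse(raw)).InnerText;
            }
        }
        return raw;
    }
}
```

Each call scans the sheet from the top, which is fine for a handful of lookups; for reading every row it is probably better to loop over the rows once, as in the code above, and pick out the columns you need.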
