C# EPPlus OpenXML count rows - c#

With EPPlus and OpenXML, does anyone know the syntax for counting the rows?
Say my worksheet is called "worksheet":
int numberRows = worksheet.rows.count()? or worksheet.rows.dimension
I'm certainly interested in the answer, but how to find the answer would be cool too, like "Go to definition" and look for this or that, etc.

With a worksheet object called worksheet, worksheet.Dimension.Start.Row and worksheet.Dimension.End.Row should give you the information you need.
worksheet.Dimension.Address will give you a string containing the worksheet dimensions in the traditional Excel range format (e.g. 'A1:I5' for rows 1-5, columns 1-9).
There is a documentation file available. In many cases it might be just as quick to play around with the library and find the answer that way. EPPlus seems to be well designed - everything seems to be logically named, at least.

Thanks for that tip, Quppa. I used it to populate a DataTable from a worksheet, as below:
    /// <summary>
    /// Converts a Worksheet to a DataTable
    /// </summary>
    /// <param name="worksheet">The worksheet to read.</param>
    /// <returns>A DataTable with one row per worksheet row.</returns>
    private static DataTable WorksheetToDataTable(ExcelWorksheet worksheet)
    {
        // Vars
        var dt = new DataTable();
        // +1 on both counts so the "<" loops below include the last row and column
        var rowCnt = worksheet.Dimension.End.Row + 1;
        var colCnt = worksheet.Dimension.End.Column + 1;
        // Loop through Columns
        for (var c = 1; c < colCnt; c++)
        {
            // Add Column
            dt.Columns.Add(new DataColumn());
            // Loop through Rows
            for (var r = 1; r < rowCnt; r++)
            {
                // Add Row
                if (dt.Rows.Count < (rowCnt - 1)) dt.Rows.Add(dt.NewRow());
                // Populate Row with the cell's value (not the ExcelRange object itself)
                dt.Rows[r - 1][c - 1] = worksheet.Cells[r, c].Value ?? DBNull.Value;
            }
        }
        // Return
        return dt;
    }

I am working with version 4.1, and it looks like they have added some properties (mentioned in comments on previous answers) to make this easier.
    string filePath = @"c:\excelfile.xlsx";
    FileInfo importFileInfo = new FileInfo(filePath);
    using (var excelPackage = new ExcelPackage(importFileInfo))
    {
        ExcelWorksheet worksheet = excelPackage.Workbook.Worksheets[1];
        int rowCount = worksheet.Dimension.Rows;
        int colCount = worksheet.Dimension.Columns;
    }

Quite easy with:
    private int GetDimensionRows(ExcelWorksheet sheet)
    {
        var startRow = sheet.Dimension.Start.Row;
        var endRow = sheet.Dimension.End.Row;
        // Both ends are inclusive, so add 1 (rows 1-5 are 5 rows, not 4)
        return endRow - startRow + 1;
    }
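One detail the answers above do not cover: Dimension is null when the worksheet contains no cells, so reading Dimension.End.Row on an empty sheet throws. A minimal sketch of a defensive row count follows; the class and method names are made up for illustration, and the bounds are treated as inclusive.

```csharp
using OfficeOpenXml;

public static class WorksheetInfo
{
    // Returns the number of rows in the used range, or 0 for an empty sheet.
    // (Hypothetical helper, not part of EPPlus.)
    public static int CountRows(ExcelWorksheet sheet)
    {
        // Dimension is null when no cell in the worksheet holds a value.
        if (sheet.Dimension == null) return 0;

        // Start.Row and End.Row are both inclusive, hence the + 1.
        return sheet.Dimension.End.Row - sheet.Dimension.Start.Row + 1;
    }
}
```

Note that this counts the rows spanned by the used range, so blank rows between the first and last used row are included.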

Related

Excel file not being written to

I am trying to write to an Excel file using Open XML. I followed Microsoft's example to do it. The only small difference in my code is that I want to write number values instead of text.
See the code below:
    public void SetSpreadSheet(string filePath)
    {
        sprdSheet = SpreadsheetDocument.Open(filePath, true);
        wrkBook = sprdSheet.WorkbookPart;
        wrkSheet = wrkBook.AddNewPart<WorksheetPart>();
        wrkSheet.Worksheet = new Worksheet(new SheetData());
        wrkSheet.Worksheet.Save();
        sheets = wrkBook.Workbook.AppendChild(new Sheets());
        sheet = new Sheet() { Id = wrkBook.GetIdOfPart(wrkSheet), SheetId = 1, Name = "test" };
        sheets.Append(sheet);
    }
Here I use SpreadsheetDocument.Open instead of creating a new one because I aim to write to an existing Excel template in the future.
    public void AddData(float data, string columnName, UInt32 rowIndex)
    {
        var row = new Row() { RowIndex = rowIndex };
        wrkSheet.Worksheet.AppendChild(row);
        Cell cell = InsertCellInWorksheet(columnName, rowIndex, wrkSheet);
        cell.CellValue = new CellValue(data);
        cell.DataType = new EnumValue<CellValues>(CellValues.Number);
        wrkSheet.Worksheet.Save();
    }
There is another method called InsertCellInWorksheet, which I took from the MSDN example (and decided not to include since it is quite long).
I tested these methods using the following program:
    int column = 10;
    int row = 5;
    var rnd = new Random();
    var xl = new Xl();
    xl.SetSpreadSheet(@"path to the file .xlsx");
    for (int c = 1; c <= column; c++)
    {
        for (int r = 1; r <= row; r++)
        {
            var columnName = Number2String(c, true);
            xl.AddData(rnd.Next(10), columnName, (uint)r);
        }
    }
    Console.WriteLine("FINISHED");
    Console.ReadLine();
And the result that I get is an empty Excel spreadsheet.
In the MSDN example they use a SharedStringTablePart to write to the Excel spreadsheet. Is there an object akin to it that I should use to write numerical data?
UPDATE
Following mcalex's comment, I tried using ClosedXML and succeeded in doing just what I wanted to do. However, I'm still interested in understanding why my file is not being written to, so feel free to answer the question.
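The thread does not include a confirmed answer, but two things in the code above look like plausible causes. AddData appends each Row directly to the Worksheet element, whereas in SpreadsheetML rows belong inside SheetData. The SpreadsheetDocument is also never closed or disposed in the code shown, so changes may never be flushed to the file. The sketch below shows one way to write a numeric cell with the Open XML SDK under those assumptions. The class and method names are hypothetical, and it assumes the file already has a worksheet with a SheetData element and no row at the given index.

```csharp
using System.Globalization;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class NumberWriter
{
    // Appends one row holding a single numeric cell to a worksheet in the file.
    // Hypothetical helper: assumes the worksheet has SheetData and that no row
    // with this index exists yet (rows must stay in ascending order).
    public static void AppendNumber(string filePath, uint rowIndex, string column, double value)
    {
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(filePath, true))
        {
            WorksheetPart part = doc.WorkbookPart.WorksheetParts.First();
            SheetData sheetData = part.Worksheet.GetFirstChild<SheetData>();

            var cell = new Cell
            {
                CellReference = column + rowIndex,   // e.g. "A" + 1 -> "A1"
                DataType = CellValues.Number,
                // Numbers are stored as text using the invariant culture
                CellValue = new CellValue(value.ToString(CultureInfo.InvariantCulture))
            };

            var row = new Row { RowIndex = rowIndex };
            row.Append(cell);

            // Rows go inside SheetData, not directly under Worksheet
            sheetData.Append(row);
            part.Worksheet.Save();
        } // leaving the using block disposes the document and writes the package
    }
}
```

No SharedStringTablePart is needed for numbers: the shared string table only holds text values, while numeric cells carry their value inline.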

How can I pass and format an SQL table into an excel using C#

Currently I have a table with 6 rows and 14 columns.
I'm trying to pass that table to my Excel document, and I have no problem doing that.
My problem is that I can't format it the way I want.
Ideally I want to have 3 rows, a blank row, and then 3 rows again, but I can't do that.
This is the function I'm currently using to write the SQL table. Basically it writes all the rows to Excel consecutively.
Instead of doing that, I want it to leave a blank row between row 3 and row 4.
If someone could help I'd be very thankful.
    private int Export_putDataGeneric(Excel.Worksheet sh, DataTable ds, String D_ReferenceDate, int starting_row = 5, int[] column_mapping = null, bool isNumber = true)
    {
        int curr_row = 0;
        if (column_mapping == null)
        {
            column_mapping = new int[ds.Columns.Count];
            int start_char = 2;
            for (int c = 0; c < ds.Columns.Count; c++)
            {
                column_mapping[c] = start_char;
                start_char++;
            }
        }
        var data = new Object[ds.Rows.Count, column_mapping[ds.Columns.Count - 1] - column_mapping[0] + 1];
        foreach (DataRow row in ds.Rows)
        {
            for (int c = 0; c < ds.Columns.Count; c++)
            {
                data[curr_row, column_mapping[c] - column_mapping[0]] = row[c];
            }
            curr_row++;
        }
        int end_row = starting_row + ds.Rows.Count - 1;
        Excel.Range beginWrite = sh.Cells[starting_row, column_mapping[0]] as Excel.Range;
        Excel.Range endWrite = sh.Cells[end_row, column_mapping[ds.Columns.Count - 1]] as Excel.Range;
        Excel.Range sheetData = sh.Range[beginWrite, endWrite];
        sheetData.Value2 = data;
        if (isNumber) sheetData.NumberFormat = "#,##0.00";
        Marshal.ReleaseComObject(beginWrite);
        Marshal.ReleaseComObject(endWrite);
        Marshal.ReleaseComObject(sheetData);
        beginWrite = null;
        endWrite = null;
        sheetData = null;
        return end_row;
    }
You can try using Range.Offset.
Check out the Microsoft Documentation
This question on SO might also help.
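As an alternative to Range.Offset, the gap can be built into the row arithmetic before the array is written. This is only a sketch, not the approach from the answer above: the helper names are hypothetical, and it assumes one blank sheet row is wanted after every fixed-size group of data rows.

```csharp
public static class RowLayout
{
    // Maps a zero-based data row index to a one-based sheet row, leaving one
    // blank sheet row after every `groupSize` data rows. (Hypothetical helper.)
    public static int SheetRowFor(int dataIndex, int startingRow, int groupSize)
    {
        int gapsBefore = dataIndex / groupSize; // blank rows already inserted above
        return startingRow + dataIndex + gapsBefore;
    }

    // Total number of sheet rows needed for `dataRows` rows plus the blank rows
    // between groups (no trailing blank row after the last group).
    public static int TotalRows(int dataRows, int groupSize)
    {
        if (dataRows == 0) return 0;
        return dataRows + (dataRows - 1) / groupSize;
    }
}
```

With a starting row of 5 and a group size of 3, data rows 0 to 5 land on sheet rows 5, 6, 7, 9, 10 and 11, so row 8 stays empty. In the function from the question, the object array could be sized with TotalRows and filled at index SheetRowFor(i, 0, 3); entries left null should come out as empty cells, though that is worth verifying against your Excel version.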

Optimize performance of data processing method

I am using the following code to take some data (in an XML-like format, not well formed) from a .txt file and then write it to an .xlsx file using EPPlus after doing some processing. StreamElements is basically a modified XmlReader. My question is about performance: I have made a couple of changes but don't see what else I can do. I'm going to use this for large datasets, so I'm trying to make it as efficient and fast as possible. Any help will be appreciated!
I tried using p.SaveAs() to do the Excel writing, but I did not really see a performance difference. Are there better, faster ways to do the writing? Any suggestions are welcome.
    using (ExcelPackage p = new ExcelPackage())
    {
        ExcelWorksheet ws = p.Workbook.Worksheets[1];
        ws.Name = "data1";
        int rowIndex = 1; int colIndex = 1;
        foreach (var element in StreamElements(pa, "XML"))
        {
            var values = element.DescendantNodes().OfType<XText>()
                .Select(v => Regex.Replace(v.Value, "\\s+", " "));
            string[] data = string.Join(",", values).Split(',');
            data[2] = toDateTime(data[2]);
            for (int i = 0; i < data.Count(); i++)
            {
                if (rowIndex < 1000000)
                {
                    var cell1 = ws.Cells[rowIndex, colIndex];
                    cell1.Value = data[i];
                    colIndex++;
                }
            }
            rowIndex++;
        }
        ws.Cells[ws.Dimension.Address].AutoFitColumns();
        Byte[] bin = p.GetAsByteArray();
        using (FileStream fs = File.OpenWrite("C:\\test.xlsx"))
        {
            fs.Write(bin, 0, bin.Length);
        }
    }
Currently, processing the data and writing 1 million lines into an Excel worksheet takes about 30-35 minutes.
I've run into this issue before, and Excel has a huge overhead when you modify worksheet cells individually, one by one.
The solution is to create an object array and populate the worksheet by writing the whole range in one assignment.
    using(ExcelPackage p = new ExcelPackage()) {
        ExcelWorksheet ws = p.Workbook.Worksheets[1];
        ws.Name = "data1";
        //Starting cell
        int startRow = 1;
        int startCol = 1;
        //Needed for 2D object array later on
        int maxColCount = 0;
        int maxRowCount = 0;
        //Queue data
        Queue<string[]> dataQueue = new Queue<string[]>();
        //Tried not to touch this part
        foreach(var element in StreamElements(pa, "XML")) {
            var values = element.DescendantNodes().OfType<XText>()
                .Select(v => Regex.Replace(v.Value, "\\s+", " "));
            //Removed unnecessary split and join, use ToArray instead
            string[] eData = values.ToArray();
            eData[2] = toDateTime(eData[2]);
            //Push the data to queue and increment counters (if needed)
            dataQueue.Enqueue(eData);
            if(eData.Length > maxColCount)
                maxColCount = eData.Length;
            maxRowCount++;
        }
        //We now have the dimensions needed for our object array
        object[,] excelArr = new object[maxRowCount, maxColCount];
        //Dequeue data from Queue and populate object matrix
        int i = 0;
        while(dataQueue.Count > 0){
            string[] eData = dataQueue.Dequeue();
            for(int j = 0; j < eData.Length; j++){
                excelArr[i, j] = eData[j];
            }
            i++;
        }
        //Write data to range in a single assignment.
        //Note: Excel.Range and Value2 belong to Office Interop, not EPPlus, so
        //wsh must be an Interop Excel.Worksheet rather than the EPPlus ws above.
        Excel.Range c1 = (Excel.Range)wsh.Cells[startRow, startCol];
        Excel.Range c2 = (Excel.Range)wsh.Cells[startRow + maxRowCount - 1, startCol + maxColCount - 1];
        Excel.Range range = wsh.Range[c1, c2];
        range.Value2 = excelArr;
        //Tried not to touch this stuff
        ws.Cells[ws.Dimension.Address].AutoFitColumns();
        Byte[] bin = p.GetAsByteArray();
        using(FileStream fs = File.OpenWrite("C:\\test.xlsx")) {
            fs.Write(bin, 0, bin.Length);
        }
    }
I didn't try compiling this code, so double-check the indexing used and check for any small syntax errors.
A few extra pointers to consider for performance:
Try to parallelize the population of the object array, since it is primarily index based (maybe keep a Dictionary<int, string[]> as an index tracker and look rows up there for faster population of the object array). You would likely have to trade space for time.
See if you are able to hardcode the column and row counts, or figure them out quickly. In my code fix, I've set counters to count the maximum rows and columns on the fly; I wouldn't recommend it as a permanent solution.
AutoFitColumns is very costly, especially if you're dealing with over a million rows.
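The range write in the answer above uses Office Interop types, while the question uses EPPlus. In EPPlus itself, a comparable bulk write can be sketched with LoadFromArrays, which takes a sequence of object[] rows. This is an assumption-laden sketch rather than a drop-in fix: it assumes EPPlus 4.x, the class and method names are hypothetical, and whether it is actually faster for a given dataset should be measured.

```csharp
using System.Collections.Generic;
using System.IO;
using OfficeOpenXml;

public static class BulkWriter
{
    // Writes all rows starting at A1 in one call and saves the workbook.
    // (Hypothetical helper; `rows` would come from the parsing loop.)
    public static void Write(IEnumerable<object[]> rows, string path)
    {
        using (var p = new ExcelPackage())
        {
            // A new package has no worksheets, so one must be added first
            ExcelWorksheet ws = p.Workbook.Worksheets.Add("data1");

            // Each object[] becomes one worksheet row
            ws.Cells["A1"].LoadFromArrays(rows);

            p.SaveAs(new FileInfo(path));
        }
    }
}
```

AutoFitColumns is deliberately left out here, in line with the last pointer above.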

Optimal Column Width OpenOffice Calc

I'm entering data from a CSV file into an OpenOffice spreadsheet.
This code gets a sheet from a spreadsheet:
    public XSpreadsheet getSpreadsheet(int sheetIndex, XComponent xComp)
    {
        XSpreadsheets xSheets = ((XSpreadsheetDocument)xComp).getSheets();
        XIndexAccess xSheetIA = (XIndexAccess)xSheets;
        XSpreadsheet xSheet = (XSpreadsheet)xSheetIA.getByIndex(sheetIndex).Value;
        return xSheet;
    }
I then have a method that enters a list into a cell range one cell at a time. I want to be able to automatically set the column size for these cells. The method is something like:
    string finalDataCell;
    XSpreadsheet newSheet = getSpreadsheet(sheetIndex, xComp);
    int numberOfRecords = numberOfColumns * numberOfRows;
    for (int cellNumber = 0; cellNumber < numberOfRecords; cellNumber++)
    {
        XCell tableData = newSheet.getCellByPosition(columnValue, rowValue);
        ((XText)tableData).setString(finalDataCell);
        columnValue++;
        if (columnValue >= numberOfColumns)
        {
            rowValue++;
            columnValue = 0;
        }
    }
After googling I found the property:
columns.OptimalWidth = True on http://forum.openoffice.org/en/forum/viewtopic.php?f=20&t=31292
but I'm unsure how to use it. Could anyone explain this further or think of another way to have the cells autofit?
I understand the comments in the code are in French, I think, but the code is in English. I ran the comments through Google Translate so now they are in English. I copied it from here:
    //Auto-fit column width
    private void largeurAuto(string NomCol)
    {
        XCellRange Range = null;
        Range = Sheet.getCellRangeByName(NomCol + "1"); //Get the range (a single cell)
        XColumnRowRange RCol = (XColumnRowRange)Range; //Treat it as a column/row range
        XTableColumns LCol = RCol.getColumns(); //Retrieve the list of columns
        uno.Any Col = LCol.getByIndex(0); //Extract the first column
        XPropertySet xPropSet = (XPropertySet)Col.Value;
        xPropSet.setPropertyValue("OptimalWidth", new uno.Any((bool)true));
    }
What this does is this: first it gets the range by name and then gets the first column. The real work, though, is done through the XPropertySet, which is explained REALLY well here.
    public void optimalWidth(XSpreadsheet newSheet)
    {
        // gets the used range of the sheet
        XSheetCellCursor XCursor = newSheet.createCursor();
        XUsedAreaCursor xUsedCursor = (XUsedAreaCursor)XCursor;
        xUsedCursor.gotoStartOfUsedArea(true);
        xUsedCursor.gotoEndOfUsedArea(true);
        XCellRangeAddressable nomCol = (XCellRangeAddressable)xUsedCursor;
        XColumnRowRange RCol = (XColumnRowRange)nomCol;
        XTableColumns LCol = RCol.getColumns();
        // loops round all of the columns in the used range
        // (getCount() includes the last column; EndColumn is a zero-based index)
        for (int i = 0; i < LCol.getCount(); i++)
        {
            XPropertySet xPropSet = (XPropertySet)LCol.getByIndex(i).Value;
            xPropSet.setPropertyValue("OptimalWidth", new uno.Any(true));
        }
    }

How to set the width of the cell in the xlsx file created programmatically?

I have this C# code which converts a dataset to xlsx. Is there a way to set the cell or column width of the sheet in the xlsx file that is created?
    //Get the filename
    String filepath = args[0].ToString();
    //Convert the file to dataset
    DataSet ds = Convert(filepath.ToString(), "tblCustomers", "\t");
    //Create the Excel object
    Excel.Application excel = new Excel.Application();
    //Create the workbook
    Excel.Workbook workBook = excel.Workbooks.Add();
    //Set the active sheet
    Excel.Worksheet sheet = workBook.ActiveSheet;
    int i = 0;
    foreach (DataRow row in ds.Tables[0].Rows)
    {
        for (int j = 0; j < row.ItemArray.Length; j++)
        {
            sheet.Cells[i + 1, j + 1] = row[j];
        }
        i++;
    }
    workBook.SaveAs(@"C:\fromCsv.xlsx");
    workBook.Close();
sheet.Columns["D:D"].ColumnWidth = 17.57;
or
sheet.Columns[1].ColumnWidth = 17.57;
You can record macros in Excel and then look at the generated code (the object model is the same).
To automatically set all column widths to the right size for their contents, you can call AutoFit, like so:
_xlSheet.Columns.AutoFit();
However, sometimes one or two "rogue" values in a column make that column go ultra-wide, and you have to drag the column way over to the left so as to see more of the data. You can overcome this Catch-22 by using both AutoFit and then, afterwards, specifying the width of any problematic columns. Here's the code for how to do that, which assumes column 1 is the one to be reined in, and 42 is the width you want it to assume:
private Worksheet _xlSheet;
private static readonly int ITEMDESC_COL = 1;
private static readonly int WIDTH_FOR_ITEM_DESC_COL = 42;
. . .
_xlSheet.Columns.AutoFit();
// Now take back the wider-than-the-ocean column
((Range)_xlSheet.Cells[ITEMDESC_COL, ITEMDESC_COL]).EntireColumn.ColumnWidth = WIDTH_FOR_ITEM_DESC_COL;
Note: As an added nicety, you can have the over-long content wrap (especially useful if they are in a Merged (multi-row) range) like so (where "range" is the Range you defined when populating the column):
range.WrapText = true;
Note: You need to add a reference to the Microsoft.Office.Interop.Excel assembly for this code to work.
