Multiple Pivot tables ClosedXML - c#

I'm using the latest ClosedXML (0.76) on .NET 4.5.1.
I created a worksheet with a table as follows:
DataTable Table = ...
var DataWorkSheet = Workbook.Worksheets.Any(x => x.Name == "Data")
    ? Workbook.Worksheets.First(x => x.Name == "Data")
    : Workbook.Worksheets.Add("Data");
int Start = ... // calculate cell start
var Source = DataWorkSheet
    .Cell(Start, 1)
    .InsertTable(Table, Name, true);
var Range = Source.DataRange;
This is done inside a loop (i.e. multiple tables in the "Data" sheet). A problem arises where the generated Excel document can't be opened if multiple pivot tables are created in a separate sheet.
var PivotWorkSheet = Workbook.Worksheets.Add(Name);
var Pivot = PivotWorkSheet
    .PivotTables
    .AddNew(Name, PivotWorkSheet.Cell(1, 1), DataRange);
Any ideas why and how to debug?

This is the same issue as in "ClosedXML - Creating multiple pivot tables".
For the record, it's caused by a ClosedXML bug, and fixing it requires a source code modification, as described in my answer to the linked question.

Related

ClosedXML PivotTable ReportFilter multiple values

I am working on a piece of code to generate a pivot table in Excel.
This is the code:
using (XL.XLWorkbook workbook = new XL.XLWorkbook(sourceFile))
{
    var outSheet = workbook.Worksheets.Add("output table");
    outSheet.Cell(1, 1).InsertTable(dt, "out table", true);
    var datarange = outSheet.RangeUsed();
    var pivotSheet = workbook.Worksheets.Add("PivotTable");
    var pivotTable = pivotSheet.PivotTables.AddNew("Pivot Table", pivotSheet.Cell(3, 1), datarange);
    pivotTable.ReportFilters.Add("Filter1");
    pivotTable.ReportFilters.Add("Filter2");
    pivotTable.RowLabels.Add("RLabel");
    pivotTable.ColumnLabels.Add("CLabel");
    pivotTable.Values.Add("Value").SummaryFormula = XL.XLPivotSummary.Sum;
    workbook.SaveAs(@"C:\Temp\Test.xlsx");
}
How would I go about filtering the values in "Filter1"?
For example, selecting only the values "Unknown" and "Gcom".
In Excel, the pivot filter looks like this:
[Image: Excel Pivot Table Report Filter]
I have checked all the ClosedXML documentation for pivots (the source code wiki example); the ReportFilters functionality is not mentioned.
Is this functionality even available?
Any advice or help is much appreciated.
Not sure when the functionality was added, but I got it to work with the following additions to your code:
using (XL.XLWorkbook workbook = new XL.XLWorkbook(sourceFile))
{
    var outSheet = workbook.Worksheets.Add("output table");
    outSheet.Cell(1, 1).InsertTable(dt, "out table", true);
    var datarange = outSheet.RangeUsed();
    var pivotSheet = workbook.Worksheets.Add("PivotTable");
    var pivotTable = pivotSheet.PivotTables.AddNew("Pivot Table", pivotSheet.Cell(3, 1), datarange);
    // I was not sure how to retrieve the filter after adding, but found Add() returns it for you.
    var filter1 = pivotTable.ReportFilters.Add("Filter1");
    // Now add your filter selection.
    filter1.AddSelectedValue("Unknown");
    filter1.AddSelectedValue("GCom");
    pivotTable.ReportFilters.Add("Filter2");
    pivotTable.RowLabels.Add("RLabel");
    pivotTable.ColumnLabels.Add("CLabel");
    pivotTable.Values.Add("Value").SummaryFormula = XL.XLPivotSummary.Sum;
    workbook.SaveAs(@"C:\Temp\Test.xlsx");
}

C# EPPlus create many tabs causes a null reference exception

I found a strange error when generating an Excel file with the EPPlus library. The scenario is simple: I need many worksheets in a single Excel file. But when I invoke the GetAsByteArray() method, I get a null reference exception:
using (ExcelPackage xml = new ExcelPackage())
{
    foreach (var mainValueItem in values)
    {
        using (ExcelWorksheet worksheet = xml.Workbook.Worksheets.Add($"sheet {mainValueItem.ID}"))
        {
            worksheet.Cells[1, 1].Value = "Date";
        }
    }
    return ctr.File(xml.GetAsByteArray(), MediaTypeNames.Application.Octet);
}
I can see that the Cells property is not loaded in either worksheet.
So how do I create many worksheets?
I've found an answer: in this scenario we shouldn't wrap the worksheet in a using statement like this:
using (ExcelWorksheet worksheet = xml.Workbook.Worksheets.Add($"sheet {mainValueItem.ID}"))
The using block disposes each worksheet as soon as the block ends, so presumably its cell data is already gone by the time GetAsByteArray() runs. Instead, just declare a variable:
var worksheet = xml.Workbook.Worksheets.Add($"sheet {mainValueItem.ID}");
Now it works, and I can see multiple tabs in the generated file.
Happy coding!

Get Tables (workparts) of a sheet of excel by OpenXML SDK

I have 3 tables in one sheet of an Excel file,
and I use the OpenXML SDK to read the file like this:
SpreadsheetDocument document = SpreadsheetDocument.Open(/*read it*/);
foreach (Sheet sheet in document.WorkbookPart.Workbook.Sheets)
{
    // I need each table or work part of the sheet here
}
As you can see, I can get each sheet of the Excel file, but how can I get the work parts in each sheet, such as my 3 tables? I should be able to iterate over these tables. Does anyone know how to do this? Any suggestions?
Does this help?
// true for editable
using (SpreadsheetDocument xl = SpreadsheetDocument.Open("yourfile.xlsx", true))
{
    foreach (WorksheetPart wsp in xl.WorkbookPart.WorksheetParts)
    {
        foreach (TableDefinitionPart tdp in wsp.TableDefinitionParts)
        {
            // for example
            // tdp.Table.AutoFilter = new AutoFilter() { Reference = "B2:D3" };
        }
    }
}
Note that the actual cell data is not in the Table object, but in SheetData (under Worksheet of the WorksheetPart). Just so you know.
You can also get a specific table from the Excel file. Adding to @Vincent's answer:
using (SpreadsheetDocument document = SpreadsheetDocument.Open("yourfile.xlsx", true))
{
    var workbookPart = document.WorkbookPart;
    // Compare names case-insensitively; upper-casing one side and comparing it
    // with a mixed-case literal would never match.
    var relationshipId = workbookPart.Workbook.Descendants<Sheet>()
        .FirstOrDefault(s => string.Equals(s.Name.Value.Trim(), "your sheetName", StringComparison.OrdinalIgnoreCase))?.Id;
    var worksheetPart = (WorksheetPart)workbookPart.GetPartById(relationshipId);
    TableDefinitionPart tableDefinitionPart = worksheetPart.TableDefinitionParts
        .FirstOrDefault(r => string.Equals(r.Table.Name.Value, "your Table Name", StringComparison.OrdinalIgnoreCase));
    QueryTablePart queryTablePart = tableDefinitionPart.QueryTableParts.FirstOrDefault();
    Table excelTable = tableDefinitionPart.Table;
    var newCellRange = excelTable.Reference;
    var startCell = newCellRange.Value.Split(':')[0]; // apply your own logic to derive the row and column from these values
    var endCell = newCellRange.Value.Split(':')[1];   // then use them to extract values with regular Open XML
}
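The "own logic" mentioned in the comments above could look something like the following. This is only a minimal sketch that uses plain string handling and nothing from the Open XML SDK; the class name CellReference and its Parse method are hypothetical helpers invented for this illustration, and it assumes references in the usual A1 style (optionally with $ markers).

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class CellReference
{
    // Splits an A1-style reference such as "B2" into a 1-based column index and a row number.
    public static (int Column, int Row) Parse(string cell)
    {
        // Drop absolute-reference markers so "$B$2" parses like "B2".
        string cleaned = cell.Replace("$", "").ToUpperInvariant();
        string letters = new string(cleaned.TakeWhile(c => c >= 'A' && c <= 'Z').ToArray());
        string digits = cleaned.Substring(letters.Length);

        int column = 0;
        foreach (char c in letters)
        {
            // Column letters are bijective base 26: A=1 ... Z=26, AA=27.
            column = column * 26 + (c - 'A' + 1);
        }

        // Throws FormatException if the row part is missing or not numeric.
        return (column, int.Parse(digits));
    }
}
```

With a table reference of "B2:D3", CellReference.Parse(startCell) would give column 2, row 2 and CellReference.Parse(endCell) would give column 4, row 3, which is enough to loop over the matching Row and Cell elements in SheetData.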

OpenXML (SAX Method) - Adding row to existing tab

I am trying to create an Excel document using OpenXML (SAX method). When my method is called, I want to check whether a tab has already been created for a given key. If it has, I would like to just append a row to the bottom of that tab. If no tab has been created for the key, I create a new one like this:
part = wbPart.AddNewPart<WorksheetPart>();
string worksheetName = row.Key[i].ToString();
Sheet sheet = new Sheet() { Id = document.WorkbookPart.GetIdOfPart(part), SheetId = sheetNumber, Name = worksheetName };
sheets.Append(sheet);
writer = OpenXmlWriter.Create(part);
writer.WriteStartElement(new Worksheet());
writer.WriteStartElement(new SheetData());
currentrow = 1;
string header = Header + "\t" + wrapper.GetHeaderString(3, 2, -1); //need to fix
WriteDataToExcel(header, currentrow, 0, writer);
currentrow++;
writer.WriteEndElement();
writer.WriteEndElement();
writer.Close();
If a tab has already been created, I retrieve the sheet using the following code:
private static WorksheetPart GetWorksheetPartByName(SpreadsheetDocument document, string sheetName)
{
    IEnumerable<Sheet> sheets =
        document.WorkbookPart.Workbook.GetFirstChild<Sheets>().
        Elements<Sheet>().Where(s => s.Name == sheetName);
    if (sheets.Count() == 0)
    {
        // The specified worksheet does not exist.
        return null;
    }
    string relationshipId = sheets.First().Id.Value;
    WorksheetPart worksheetPart = (WorksheetPart)
        document.WorkbookPart.GetPartById(relationshipId);
    return worksheetPart;
}
When the correct WorksheetPart is returned, I try to add the new row by pointing my OpenXmlWriter at that part and then adding the row:
part = GetWorksheetPartByName(document, row.Key[i].ToString());
writer = OpenXmlWriter.Create(part);
writer.WriteStartElement(part.Worksheet);
writer.WriteStartElement(part.Worksheet.GetFirstChild<SheetData>());
SheetData sheetData = part.Worksheet.GetFirstChild<SheetData>();
Row lastRow = sheetData.Elements<Row>().LastOrDefault();
The code runs; however, I always end up with just one row (the initial one I added when first creating the tab). No subsequent rows show up in the spreadsheet.
I will be adding a lot of rows (50,000+) and would prefer not to have to create a new file and copy the information over each time.
From my experience, using the SAX method to write (i.e., with OpenXmlWriter) works best for new things (parts, worksheets, whatnot). When you use OpenXmlWriter.Create(), that's like overwriting the existing data for the part (the WorksheetPart in this case), even though in effect it's not. It's complicated.
As far as my experiments went, if there's existing data, you can't edit it using OpenXmlWriter, not even if you use the Save() function or close the OpenXmlWriter correctly. For some reason, the SDK will ignore your efforts. Hence you only see the original row that you added.
If you're writing 50,000 rows, it's best to do so all in one go. Then the SAX method will be useful. Besides, if you're writing one row at a time, the speed benefits of using SAX versus the DOM method are negligible.
According to this thread, "work with exist Excel with OpenXMLWriter":
OpenXMLWriter can only operate on a new worksheet, not on an existing document. So I'm afraid you cannot insert values into particular cells of an existing spreadsheet using OpenXMLWriter.
You could read all the data from your existing Excel file. Since you need to add a lot of rows (50,000+), I recommend using OpenXmlWriter to write the old and new data to a new Excel file in one pass. The DOM approach might cause memory problems once you append that many rows.

How to find the data source of a Pivot Table using OpenXML

I am using EPPlus to open and edit an existing Excel document.
The document contains 2 sheets - one with a pivot table (named Pivot) and one with the data (Data!$A$1:$L$9899).
I have a reference to the ExcelPivotTable with the code below, but can't find any properties that relate to the data source.
ExcelPackage package = new ExcelPackage(pivotSpreadsheet);
foreach (ExcelWorksheet worksheet in package.Workbook.Worksheets)
{
    if (worksheet.PivotTables.Count > 0)
    {
        pivotWorkSheetName = worksheet.Name;
        pivotTable = worksheet.PivotTables[0];
    }
}
How do I get the name and range of the source data? Is there an obvious property that I'm missing or do I have to go hunting through some xml?
Pivot tables use a data cache as their data store, for performance and abstraction reasons. Remember, you can have a pivot that points to a web service call. The cache itself is what stores that reference. For pivots that refer to data elsewhere in a workbook, you can access it in EPPlus like this:
worksheet.PivotTables[0].CacheDefinition.SourceRange.FullAddress;
If anyone is interested in updating the data source with OpenXML SDK 2.5, here is the code I used.
using (var spreadsheet = SpreadsheetDocument.Open(filepath, true))
{
    PivotTableCacheDefinitionPart ptp = spreadsheet.WorkbookPart.PivotTableCacheDefinitionParts.First();
    ptp.PivotCacheDefinition.RefreshOnLoad = true; // refresh the pivot table on document load
    ptp.PivotCacheDefinition.RecordCount = Convert.ToUInt32(ds.Tables[0].Rows.Count);
    // Cell range used as the data source: header row plus one row per record
    ptp.PivotCacheDefinition.CacheSource.WorksheetSource.Reference = "A1:" + IntToLetters(ds.Tables[0].Columns.Count) + (ds.Tables[0].Rows.Count + 1);
    ptp.PivotTableCacheRecordsPart.PivotCacheRecords.RemoveAllChildren(); // rebuilt when the pivot table is refreshed
    ptp.PivotTableCacheRecordsPart.PivotCacheRecords.Count = 0; // rebuilt when the pivot table is refreshed
}
// Converts a 1-based column number to its letters (1 -> A, 27 -> AA); copied from another Stack Overflow post
public string IntToLetters(int value)
{
    string result = string.Empty;
    while (--value >= 0)
    {
        result = (char)('A' + value % 26) + result;
        value /= 26;
    }
    return result;
}
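To make the Reference string above easier to follow, here is a small standalone sketch of how it is composed. The class name RangeReference and the method BuildReference are hypothetical names made up for this illustration; the column conversion is the same loop as IntToLetters above, and the sketch assumes the data starts at A1 with a single header row.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the Open XML SDK.
public static class RangeReference
{
    // Same algorithm as IntToLetters above: 1 -> "A", 26 -> "Z", 27 -> "AA".
    public static string ColumnLetters(int value)
    {
        string result = string.Empty;
        while (--value >= 0)
        {
            result = (char)('A' + value % 26) + result;
            value /= 26;
        }
        return result;
    }

    // Builds the A1-style range for a block that starts at A1 and has one
    // header row followed by dataRowCount data rows.
    public static string BuildReference(int columnCount, int dataRowCount)
    {
        return "A1:" + ColumnLetters(columnCount) + (dataRowCount + 1);
    }
}
```

For a source with 12 columns and 9898 data rows, RangeReference.BuildReference(12, 9898) yields "A1:L9899", the same shape as the Data!$A$1:$L$9899 range mentioned in the question.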
