Get Tables (workparts) of a sheet of excel by OpenXML SDK - c#

I have 3 tables in a sheet of excel file,
and I use OpenXML SDK to read the Excel file, like this:
SpreadsheetDocument document = SpreadsheetDocument.Open(/*read it*/);
foreach(Sheet sheet in document.WorkbookPart.Workbook.Sheets)
{
//I need each table or work part of sheet here
}
So as you can see, I can get each sheet of the Excel file, but how can I get the work parts in each sheet? For example, I should be able to iterate over my 3 tables. Does anyone know how to do this? Any suggestions?

Does this help?
// true for editable
using (SpreadsheetDocument xl = SpreadsheetDocument.Open("yourfile.xlsx", true))
{
foreach (WorksheetPart wsp in xl.WorkbookPart.WorksheetParts)
{
foreach (TableDefinitionPart tdp in wsp.TableDefinitionParts)
{
// for example
// tdp.Table.AutoFilter = new AutoFilter() { Reference = "B2:D3" };
}
}
}
Note that the actual cell data is not in the Table object, but in SheetData (under Worksheet of the WorksheetPart). Just so you know.

You can also get a specific table from the Excel file. Adding to @Vincent's answer:
using (SpreadsheetDocument document= SpreadsheetDocument.Open("yourfile.xlsx", true))
{
var workbookPart = document.WorkbookPart;
// Compare names case-insensitively; comparing ToUpper() output to a mixed-case literal would never match
var relationshipId = workbookPart.Workbook.Descendants<Sheet>()
.FirstOrDefault(s => string.Equals(s.Name.Value.Trim(), "your sheetName", StringComparison.OrdinalIgnoreCase))?.Id;
var worksheetPart = (WorksheetPart)workbookPart.GetPartById(relationshipId);
TableDefinitionPart tableDefinitionPart = worksheetPart.TableDefinitionParts
.FirstOrDefault(r =>
string.Equals(r.Table.Name.Value, "your Table Name", StringComparison.OrdinalIgnoreCase));
QueryTablePart queryTablePart = tableDefinitionPart.QueryTableParts.FirstOrDefault();
Table excelTable = tableDefinitionPart.Table;
var newCellRange = excelTable.Reference;
var startCell = newCellRange.Value.Split(':')[0]; // apply your own logic to derive the row and column from these values
var endCell = newCellRange.Value.Split(':')[1]; // then use them to extract the cell values with the regular Open XML API
}
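As a minimal sketch of the "own logic" mentioned in the comments above, the start and end cells can be split into numeric column and row indices. The `CellReference` class and its `Parse` method are hypothetical helpers, not part of the Open XML SDK, and the sketch assumes plain A1-style references without `$` signs, which is what `Table.Reference` normally contains.

```csharp
using System;

static class CellReference
{
    // Splits an A1-style reference such as "B2" into a 1-based (column, row) pair.
    public static (int Column, int Row) Parse(string cellReference)
    {
        int column = 0;
        int i = 0;

        // The leading letters encode the column in base 26 (A = 1, Z = 26, AA = 27).
        while (i < cellReference.Length && char.IsLetter(cellReference[i]))
        {
            column = column * 26 + (char.ToUpperInvariant(cellReference[i]) - 'A' + 1);
            i++;
        }

        // The remaining digits are the row number.
        int row = int.Parse(cellReference.Substring(i));
        return (column, row);
    }
}
```

For a table whose reference is "B2:D3", `CellReference.Parse(startCell)` would give column 2, row 2 and `CellReference.Parse(endCell)` would give column 4, row 3, which is enough to filter the rows and cells in SheetData.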

Related

Handling large Excel files with shared strings

Using OpenXML, Microsoft recommends using the SAX approach:
https://msdn.microsoft.com/en-us/library/office/gg575571.aspx
So rather than loading the whole document DOM in memory, you can read the file serially with OpenXmlReader. For example:
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
string text;
while (reader.Read())
{
if (reader.ElementType == typeof(CellValue))
{
text = reader.GetText();
Console.Write(text + " ");
}
}
But this kinda falls down when you have cells with the SharedString data type. Those are stored separate from the sheet data in the shared string table and, as far as I can see, there's no real way to avoid having to load the entire shared string table. For example, I can do this:
var sharedStrings = wbPart.SharedStringTablePart.SharedStringTable.Cast<SharedStringItem>()
.Select(i => i.Text.Text).ToArray();
And then I can do something like:
var row = reader.LoadCurrentElement() as Row;
var cells = row.Descendants<Cell>();
var cellValues = cells.Select(c => c.DataType != null
&& c.DataType == CellValues.SharedString ?
sharedStrings[int.Parse(c.CellValue.Text)] : c.CellValue.Text).ToArray();
This works, but I had to load the entire shared string table, which could be very large if the file has a lot of unique strings. Is there a more efficient way to handle looking up the shared strings as you process each row of the file?
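One possible direction, offered only as a sketch: the shared string part can itself be streamed with OpenXmlReader, so that only the plain strings are kept rather than the whole DOM of the table. This does not avoid holding every unique string in memory; it only avoids the per-element object overhead. The `SharedStringLoader` class is a hypothetical helper, and `InnerText` concatenates all text inside an item, including rich-text runs and any phonetic text.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class SharedStringLoader
{
    // Streams the shared string part and returns the strings in index order.
    public static List<string> Load(WorkbookPart workbookPart)
    {
        var strings = new List<string>();
        SharedStringTablePart part = workbookPart.SharedStringTablePart;
        if (part == null)
        {
            return strings; // the workbook has no shared strings
        }

        using (OpenXmlReader reader = OpenXmlReader.Create(part))
        {
            while (reader.Read())
            {
                if (reader.ElementType == typeof(SharedStringItem) && reader.IsStartElement)
                {
                    // Only one <si> element is materialized at a time.
                    var item = (SharedStringItem)reader.LoadCurrentElement();
                    strings.Add(item.InnerText);
                }
            }
        }

        return strings;
    }
}
```

The returned list can then stand in for the `sharedStrings` array in the row-processing code above.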

How to aggregate two worksheets to one workbook using openXML?

I want to combine two worksheets from different workbooks into one workbook, but I don't know how to do that using OpenXML. I only want to create one workbook containing both worksheets; I don't need to merge them into a single sheet. How can I do this using OpenXML?
Copying a worksheet from one workbook to another is easy with EPPlus, which is available free on NuGet.
Something like this example would copy a worksheet and all its data from one workbook to another in one go, without needing a separate function to loop over the rows and copy the data:
FileInfo fInfoSrc = new FileInfo(@"C:\Temp\Source.xlsx");
FileInfo fInfoDest = new FileInfo(@"C:\Temp\Destination.xlsx");
using (var source = new ExcelPackage(fInfoSrc))
{
using (var destination = new ExcelPackage(fInfoDest))
{
var srcWorksheet = source.Workbook.Worksheets["SourceWorksheet"];
var destWorksheet = destination.Workbook.Worksheets.Add("destinationWorksheetName", srcWorksheet);
destination.Save();
}
}
You need a reference to the OpenXml SDK.
Here is a small example of how to create a workbook.
Call the second method, AddWorksheet, once for each worksheet you need.
private static SpreadsheetDocument CreateWorkbook(Stream stream)
{
// Create the Excel workbook
var spreadSheet = SpreadsheetDocument.Create(stream, SpreadsheetDocumentType.Workbook, false);
// Create the parts and the corresponding objects
// Workbook
spreadSheet.AddWorkbookPart();
spreadSheet.WorkbookPart.Workbook = new Workbook();
spreadSheet.WorkbookPart.Workbook.Save();
// Shared string table
var sharedStringTablePart = spreadSheet.WorkbookPart.AddNewPart<SharedStringTablePart>();
sharedStringTablePart.SharedStringTable = new SharedStringTable();
sharedStringTablePart.SharedStringTable.Save();
// Sheets collection
spreadSheet.WorkbookPart.Workbook.Sheets = new Sheets();
spreadSheet.WorkbookPart.Workbook.Save();
// Stylesheet
var workbookStylesPart = spreadSheet.WorkbookPart.AddNewPart<WorkbookStylesPart>();
workbookStylesPart.Stylesheet = new Stylesheet();
workbookStylesPart.Stylesheet.Save();
return spreadSheet;
}
private static WorksheetPart AddWorksheet(SpreadsheetDocument spreadsheet, string name)
{
// Add the worksheetpart
var worksheetPart = spreadsheet.WorkbookPart.AddNewPart<WorksheetPart>();
worksheetPart.Worksheet = new Worksheet(new SheetData());
uint sheetId = 1;
var sheets = spreadsheet.WorkbookPart.Workbook.GetFirstChild<Sheets>();
if (sheets.Elements<Sheet>().Any())
{
sheetId = sheets.Elements<Sheet>().Select(s => s.SheetId.Value).Max() + 1;
}
// Add the sheet and make relation to workbook
var sheet = new Sheet
{
Id = spreadsheet.WorkbookPart.GetIdOfPart(worksheetPart),
SheetId = sheetId,
Name = name
};
sheets.Append(sheet);
worksheetPart.Worksheet.Save();
spreadsheet.WorkbookPart.Workbook.Save();
return worksheetPart;
}
The approach I settled on is to open the destination file and then, in a loop, open each source file containing a worksheet to be copied. I then clone each row from the source into the destination file using the clone node method in deep mode, and insert the cloned row at a specific index in the destination worksheet.
Using EPPlus, as in the other answer to this question, is a good option, but when I used it with an Excel file that contains defined names (named ranges) it did not work correctly.
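The row-cloning step described above could look roughly like this. It is a sketch, not the answerer's actual code: `RowCopier` and `CopyRows` are hypothetical names, and only `CloneNode(true)` and `InsertAt` come from the Open XML SDK. Note that a deep clone keeps the source row's `RowIndex`, cell references, style indexes and shared string indexes as they were, so those still have to be adjusted to be valid in the destination workbook.

```csharp
using DocumentFormat.OpenXml.Spreadsheet;

static class RowCopier
{
    // Deep-clones every row of the source sheet data and inserts the clones
    // into the destination, starting at the given child index.
    public static void CopyRows(SheetData source, SheetData destination, int insertIndex)
    {
        foreach (Row row in source.Elements<Row>())
        {
            // CloneNode(true) copies the row together with all of its cells.
            var clone = (Row)row.CloneNode(true);
            destination.InsertAt(clone, insertIndex);
            insertIndex++;
        }
    }
}
```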

How to add values to a spreadsheet from a dictionary?

I have a template spreadsheet document that has two columns, Server Name and IP Address.
How can I populate the spreadsheet so that each dictionary key goes in its own cell in the Server column and the corresponding value goes in the cell next to it in the IP column?
I am using the EPPlus library but couldn't find anything on the topic.
Below is what I found and tried, but it's for lists:
using (ExcelPackage package = new ExcelPackage(_fileInfo))
{
ExcelWorksheet worksheet = package.Workbook.Worksheets[1];
for (int i = 0; i < listOfIPs.Count; i++)
{
worksheet.Cells[i + 2, 1].Value = listOfIPs[i];
}
package.Save();
}
I am not familiar with EPPlus, so I am not sure how you get a reference to the active sheet; you will need to figure that bit out. Once that's done, pure C# plus a bit of knowledge of the VBA object model lets you write the contents of your dictionary to a spreadsheet without any iteration:
// create a sample dictionary and fill it
Dictionary<string, string> myCol = new Dictionary<string, string>();
myCol.Add("server1", "ip1");
myCol.Add("server2", "ip2");
// grab a reference to the active Sheet
// this may vary depending on what framework you are using
Worksheet ws = Globals.ThisWorkbook.ActiveSheet as Worksheet;
// create a Range variable
Range myRange;
// Transpose the keys to column A
myRange = ws.get_Range("A1");
myRange.get_Resize(myCol.Keys.ToArray().Count(),1).Value =
ws.Parent.Parent.Transpose(myCol.Keys.AsEnumerable().ToArray());
// transpose the Values to column B
myRange = ws.get_Range("B1");
myRange.get_Resize(myCol.Values.ToArray().Count(), 1).Value =
ws.Parent.Parent.Transpose(myCol.Values.AsEnumerable().ToArray());
Debugging results as expected
With EPPlus I think you can do it like this (untested)
using (ExcelPackage package = new ExcelPackage(file))
{
ExcelWorksheet worksheet = package.Workbook.Worksheets.Add("test");
worksheet.Cells["A1"].LoadFromCollection(myCol, true, OfficeOpenXml.Table.TableStyles.Medium1);
package.Save();
}
More details on iterating VBA Collections and printing to a sheet at vba4all.com
You can just access the keys of a dictionary and iterate over them just like you are iterating over the list.
var foo = new Dictionary<string, string>(); // populate your dictionary
foreach(var key in foo.Keys)
{
var value = foo[key];
// do stuff with 'key' and 'value', like put them into the excel doc
}
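Combining that loop with the cell indexer from the question, a sketch for EPPlus might look like the following. The `DictionaryWriter` class, its `Write` method and the `firstRow` parameter are hypothetical names; the only EPPlus call used is `worksheet.Cells[row, column].Value`, as in the question. It assumes row 1 of the template holds the Server Name and IP Address headers.

```csharp
using System.Collections.Generic;
using OfficeOpenXml;

static class DictionaryWriter
{
    // Writes each key into column 1 and its value into column 2, one entry per row.
    public static void Write(ExcelWorksheet worksheet,
                             IDictionary<string, string> servers,
                             int firstRow = 2)
    {
        int row = firstRow; // assumption: row 1 contains the column headers
        foreach (KeyValuePair<string, string> entry in servers)
        {
            worksheet.Cells[row, 1].Value = entry.Key;   // Server Name column
            worksheet.Cells[row, 2].Value = entry.Value; // IP Address column
            row++;
        }
    }
}
```

You would call it inside the `using (ExcelPackage package = ...)` block from the question, before `package.Save()`.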

Using OpenXML, how can I associate a list for data validation

I am processing an .xlsm file and need to know how to use a list on another sheet for data validation using openXML and C#.
To start, I have a .xlsm file with two empty sheets and macros in it. In my program I open the file, Create the column header on Sheet1 then create the validation list on sheet2. So, after I run my program Sheet1 "A1" contains the text "Color" and Sheet2 "A1:A4" contains "Blue","Green","Red","Yellow". I get this far just fine.
I would like to make it so there is a dropdown list in all cells of column "A" on sheet1 that contains each of the 4 colors and enforces them as the only input. In Microsoft Excel this is done by going to the "Data" tab, selecting "Data Validation" selecting "List" and highlighting the cells you want to use. I need to make this association programmatically.
The (Desired) XML that Microsoft Excel creates if I do it manually is this:
<extLst>
<ext uri="{CCE6A557-97BC-4b89-ADB6-D9C93CAAB3DF}" xmlns:x14="http://schemas.microsoft.com/office/spreadsheetml/2009/9/main">
<x14:dataValidations count="1" xmlns:xm="http://schemas.microsoft.com/office/excel/2006/main">
<x14:dataValidation type="list" allowBlank="1" showInputMessage="1" showErrorMessage="1">
<x14:formula1>
<xm:f>'Validation Data'!$A$1:$A$4</xm:f>
</x14:formula1>
<xm:sqref>A1:A1048576</xm:sqref>
</x14:dataValidation>
</x14:dataValidations>
</ext>
</extLst>
The following method and results is something I tried. It may give a better Idea of what I'm trying to do.
Here, I pass in "'Sheet2'!$A$1:$A$4" as the "validationListCells" parameter. This represents the cells in "Sheet2" that, in this example, would contain the color names "Red", "Green"...etc.
I pass in "A2:A1048576" as the "cellsToValidate" parameter. This represents all cells of Sheet1 column "A", on which I want to enforce validation.
I pass "Sheet1" as the worksheetName parameter.
private void InsertValidation(String worksheetName, String validationListCells, String cellsToValidate)
{
DataValidations dataValidations1 = new DataValidations() { Count = (UInt32Value)1U };
DataValidation dataValidation1 = new DataValidation()
{
Formula1 = new Formula1(validationListCells),
Type = DataValidationValues.List,
ShowInputMessage = true,
ShowErrorMessage = true,
SequenceOfReferences = new ListValue<StringValue>() { InnerText = cellsToValidate }
};
dataValidations1.Append(dataValidation1);
using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(_documentPath, true))
{
WorksheetPart worksheetPart = GetWorksheetPartByName(spreadSheet, worksheetName);
worksheetPart.Worksheet.Append(dataValidations1);
worksheetPart.Worksheet.Save();
}
}
It results in the following XML in Sheet1.xml, which causes an error in Excel:
<x:dataValidations count="1">
<x:dataValidation type="list" showInputMessage="1" showErrorMessage="1" sqref="A2: A1048576">
<x:formula1>'Sheet2'!$A$1:$A$5</x:formula1>
</x:dataValidation>
</x:dataValidations>
It looks like I may be on the right track since it is beginning to resemble the xml created by Excel, but I'm completely new to openXML and I'm finding little about this topic on the net.
Thanks in advance!
For anyone else who needs this, the code below worked for me.
I used user3251089's variable names in it.
In general, when I want to create an Excel "feature" programmatically, I first make a really basic Excel file by hand that contains only that feature (deleting any extra sheets too). Then I reflect the code and try to tidy it up.
Hope it helps someone!
using Excel = DocumentFormat.OpenXml.Office.Excel;
using X14 = DocumentFormat.OpenXml.Office2010.Excel;
.....
Worksheet worksheet = worksheetPart.Worksheet;
WorksheetExtensionList worksheetExtensionList = new WorksheetExtensionList();
WorksheetExtension worksheetExtension = new WorksheetExtension() { Uri = "{CCE6A557-97BC-4b89-ADB6-D9C93CAAB3DF}" };
worksheetExtension.AddNamespaceDeclaration("x14", "http://schemas.microsoft.com/office/spreadsheetml/2009/9/main");
X14.DataValidations dataValidations = new X14.DataValidations() { Count = (UInt32Value)1U }; // one validation is appended below
dataValidations.AddNamespaceDeclaration("xm", "http://schemas.microsoft.com/office/excel/2006/main");
//sites validation
dataValidations.Append(new X14.DataValidation()
{
Type = DataValidationValues.List,
AllowBlank = true,
ShowInputMessage = true,
ShowErrorMessage = true,
DataValidationForumla1 = new X14.DataValidationForumla1() { Formula = new Excel.Formula(validationListCells) },
ReferenceSequence = new Excel.ReferenceSequence(cellsToValidate)
});
worksheetExtension.Append(dataValidations);
worksheetExtensionList.Append(worksheetExtension);
worksheet.Append(worksheetExtensionList);
worksheet.Save();

OpenXML (SAX Method) - Adding row to existing tab

I am trying to create an Excel document using OpenXML (SAX method). When my method is called, I want to check whether a tab has already been created for a given key. If it has, I would like to just append a row to the bottom of that tab. If the tab hasn't been created for the key, I create a new tab like this:
part = wbPart.AddNewPart<WorksheetPart>();
string worksheetName = row.Key[i].ToString();
Sheet sheet = new Sheet() { Id = document.WorkbookPart.GetIdOfPart(part), SheetId = sheetNumber, Name = worksheetName };
sheets.Append(sheet);
writer = OpenXmlWriter.Create(part);
writer.WriteStartElement(new Worksheet());
writer.WriteStartElement(new SheetData());
currentrow = 1;
string header = Header + "\t" + wrapper.GetHeaderString(3, 2, -1); //need to fix
WriteDataToExcel(header, currentrow, 0, writer);
currentrow++;
writer.WriteEndElement();
writer.WriteEndElement();
writer.Close();
If a tab has already been created, I retrieve the sheet using the following code:
private static WorksheetPart GetWorksheetPartByName(SpreadsheetDocument document, string sheetName)
{
IEnumerable<Sheet> sheets =
document.WorkbookPart.Workbook.GetFirstChild<Sheets>().
Elements<Sheet>().Where(s => s.Name == sheetName);
if (sheets.Count() == 0)
{
// The specified worksheet does not exist.
return null;
}
string relationshipId = sheets.First().Id.Value;
WorksheetPart worksheetPart = (WorksheetPart)
document.WorkbookPart.GetPartById(relationshipId);
return worksheetPart;
}
When the correct worksheet part is returned, I try to add the new row by pointing my OpenXmlWriter at that part and then adding the row:
part = GetWorksheetPartByName(document, row.Key[i].ToString());
writer = OpenXmlWriter.Create(part);
writer.WriteStartElement(part.Worksheet);
writer.WriteStartElement(part.Worksheet.GetFirstChild<SheetData>());
SheetData sheetData = part.Worksheet.GetFirstChild<SheetData>();
Row lastRow = sheetData.Elements<Row>().LastOrDefault();
The code runs; however, I always end up with just one row (the initial one I added when first creating the tab). No subsequent rows show up in the spreadsheet.
I will be adding a lot of rows (50,000+) and would prefer not to have to create a new file and copy the information over each time.
From my experience, using the SAX method to write (ie, with OpenXmlWriter) works best for new things (parts, worksheets, whatnot). When you use OpenXmlWriter.Create(), that's like overwriting the original existing data for the part (WorksheetPart in this case). Even though in effect, it's not. It's complicated.
As far as my experiments went, if there's existing data, you can't edit data using OpenXmlWriter. Not even if you use the Save() function or close the OpenXmlWriter correctly. For some reason, the SDK will ignore your efforts. Hence the original one row that you added.
If you're writing 50,000 rows, it's best to do so all in one go. Then the SAX method will be useful. Besides, if you're writing one row (at a time?), the speed benefits of using SAX versus the DOM method are negligible.
According to this site on working with an existing Excel file using OpenXmlWriter:
OpenXmlWriter can only operate on a new worksheet, not on an existing document. So I'm afraid you cannot insert values into particular cells of an existing spreadsheet using OpenXmlWriter.
You could read all the data from your existing Excel file. Then, since you need to add a lot of rows (50,000+), I recommend using OpenXmlWriter to write the old and new data to a new Excel file in one pass. If you use the DOM approach, it might cause memory problems after you append that many rows.
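A minimal sketch of writing all rows in one pass, assuming the rows are available as string arrays and that `part` is a freshly added WorksheetPart with no existing content. `BulkSheetWriter` and `WriteAllRows` are hypothetical names, and cells are written as inline strings to keep the example independent of the shared string table.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class BulkSheetWriter
{
    // Streams every row into a new worksheet part in a single pass.
    public static void WriteAllRows(WorksheetPart part, IEnumerable<string[]> rows)
    {
        using (OpenXmlWriter writer = OpenXmlWriter.Create(part))
        {
            writer.WriteStartElement(new Worksheet());
            writer.WriteStartElement(new SheetData());

            foreach (string[] values in rows)
            {
                writer.WriteStartElement(new Row());
                foreach (string value in values)
                {
                    // Inline strings avoid a lookup in the shared string table.
                    writer.WriteElement(new Cell(new InlineString(new Text(value)))
                    {
                        DataType = CellValues.InlineString
                    });
                }
                writer.WriteEndElement(); // Row
            }

            writer.WriteEndElement(); // SheetData
            writer.WriteEndElement(); // Worksheet
        }
    }
}
```

Because the rows are passed as an `IEnumerable`, the old data read from the existing file and the new rows can be concatenated lazily, so neither set has to be held in memory as a DOM.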
