Open XML SDK: Excel formula not recalculating after removing the cached cell value (C#)

I have a formula cell, C4, that needs to recalculate after I enter a value in another cell, C2, but C4 keeps returning its old cached value.
I have asked this question multiple times on SO without getting any help, and I have tried everything I can think of. Here is what I found on the MSDN site:
With the methods from the previous code listing in place, generating
the report is now a process of getting the portfolio data and
repeatedly calling UpdateValue to create the report. Indeed, if you
add the necessary code to do this, things seem to work fine except for
one problem - any cell that contains a formula that refers to a cell
whose value was changed via Open XML manipulation does not show the
correct result. This is because Excel caches the result of a formula
within the cell. Because Excel thinks it has the correct value cached,
it does not recalculate the cell. Even if you have auto calculation
turned on or if you press F9 to force a manual recalculation, Excel
does not recalculate the cell. The solution to this is to remove the
cached value from these cells so that Excel recalculates the value as
soon as the file is opened in Excel. Add the RemoveCellValue method
shown in the following example to the PortfolioReport class to provide
this functionality.
Based on the MSDN explanation above, I have tried removing the cached value before I update the cell, after I update the cell, before I read the formula cell, and after I read the formula cell, but I keep getting the following error when I read the formula cell:
System.NullReferenceException: Object reference not set to an instance
of an object.
Here is my code...
string filename = Server.MapPath("/") + "MyExcelData.xlsx";
using (SpreadsheetDocument document = SpreadsheetDocument.Open(filename, true))
{
Sheet sheet = document.WorkbookPart.Workbook.Descendants<Sheet>().SingleOrDefault(s => s.Name == "myRange1");
if (sheet == null)
{
throw new ArgumentException(
String.Format("No sheet named {0} found in spreadsheet {1}", "myRange1", filename), "sheetName");
}
WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id);
Worksheet ws = worksheetPart.Worksheet; // ((WorksheetPart)(worksheetPart.GetPartById(sheet.Id))).Worksheet;
Cell cell = InsertCellInWorksheet(ws, "C4");
// If there is a cell value, remove it to force a recalculation
// on this cell.
if (cell.CellValue != null)
{
cell.CellValue.Remove();
}
// Save the worksheet.
ws.Save();
document.Close();
}
// getting 2 numbers in excel sheet, saving, and closing it.
using (SpreadsheetDocument document = SpreadsheetDocument.Open(filename, true))
{
Sheet sheet = document.WorkbookPart.Workbook.Descendants<Sheet>().SingleOrDefault(s => s.Name == "myRange1");
if (sheet == null)
{
throw new ArgumentException(
String.Format("No sheet named {0} found in spreadsheet {1}", "myRange1", filename), "sheetName");
}
WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id);
int rowIndex = int.Parse("C3".Substring(1));
Row row = worksheetPart.Worksheet.GetFirstChild<SheetData>().
Elements<Row>().FirstOrDefault(r => r.RowIndex == rowIndex);
Cell cell3 = row.Elements<Cell>().FirstOrDefault(c => "C3".Equals(c.CellReference.Value));
if (cell3 != null)
{
cell3.CellValue = new CellValue("16");
cell3.DataType = new DocumentFormat.OpenXml.EnumValue<CellValues>(CellValues.Number);
}
worksheetPart.Worksheet.Save();
document.Close();
}
// getting the result out of excel.
using (SpreadsheetDocument document = SpreadsheetDocument.Open(filename, false))
{
document.WorkbookPart.Workbook.CalculationProperties.ForceFullCalculation = true;
document.WorkbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
Sheet sheet = document.WorkbookPart.Workbook.Descendants<Sheet>().SingleOrDefault(s => s.Name == "myRange1");
if (sheet == null)
{
throw new ArgumentException(
String.Format("No sheet named {0} found in spreadsheet {1}", "myRange1", filename), "sheetName");
}
WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id);
int rowIndex = int.Parse("C4".Substring(1));
Row row = worksheetPart.Worksheet.GetFirstChild<SheetData>().
Elements<Row>().FirstOrDefault(r => r.RowIndex == rowIndex);
Cell cell = row.Elements<Cell>().FirstOrDefault(c => "C4".Equals(c.CellReference.Value));
d.Average = Convert.ToDouble(cell.CellValue.InnerText);
}

The problem seems to be that you are directly modifying an Excel data file without Excel being open. Excel can only track formula dependencies while it is open, so when you change data behind its back it does not know that it needs to recalculate.
3 possible solutions are:
1) remove the calculation chain part from the file (not tested)
2) after making the changes to the file use interop/automation to open Excel and request a full calculation (or full calculation with dependency rebuild if you are also altering/creating formulas)
3) set the fullcalculationonload property to true : this should cause Excel to do a full calculation when it opens the file
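
Options 1 and 3 can be sketched with the Open XML SDK roughly as follows. This is a minimal sketch, not tested against the asker's file: the class and method names (RecalcHelper, RequestFullRecalculation) are made up for illustration, and it assumes the document is opened for editing, since a document opened read-only cannot be changed.

```csharp
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class RecalcHelper
{
    // Hypothetical helper: asks Excel to recalculate everything the next
    // time the workbook is opened.
    public static void RequestFullRecalculation(string path)
    {
        // Open editable (true); a read-only document cannot be modified.
        using (SpreadsheetDocument document = SpreadsheetDocument.Open(path, true))
        {
            WorkbookPart workbookPart = document.WorkbookPart;

            // Option 1: drop the calculation chain part if the file has one.
            CalculationChainPart chain = workbookPart.CalculationChainPart;
            if (chain != null)
            {
                workbookPart.DeletePart(chain);
            }

            // Option 3: CalculationProperties can be absent, so create it
            // before setting the flag to avoid a NullReferenceException.
            Workbook workbook = workbookPart.Workbook;
            if (workbook.CalculationProperties == null)
            {
                workbook.CalculationProperties = new CalculationProperties();
            }
            workbook.CalculationProperties.FullCalculationOnLoad = true;
            workbook.Save();
        }
    }
}
```

Note that the Open XML SDK does not evaluate formulas itself. Once the cached value has been removed, the formula cell's CellValue stays null until Excel (or another calculation engine) opens and saves the file, so reading cell.CellValue.InnerText straight afterwards throws the NullReferenceException shown in the question.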

I think you have deleted the CellValue of C4. You first have to create the CellValue again before you can perform any operation on it.

Related

Using NPOI to Retrieve the Value of a Merged Cell from an Excel Spreadsheet

I'm using NPOI to retrieve data from Excel into a text file. Based on the Excel sheet I'm supposed to show the data in this manner.
The cell for 13/3/19 in the Excel sheet is merged across two rows and I don't know how I can retrieve the merge cell value for May and display it. Does anyone have any ideas?
In Excel, if a cell is merged with other cells, the first cell in the merged region is the one that has the actual value. The other cells in the region are blank. The merged regions are kept in the worksheet object, since they can span multiple rows and columns.
To get the value, you need to:
1) Check whether the current cell is merged by looking at the IsMergedCell property on the cell itself.
2) If the cell is merged, loop through the merged regions on the worksheet to find the one containing that cell.
3) Once the containing region is found, get the first cell from the region.
4) Get the value from that cell.
Here is a helper method I wrote which should do the trick:
public static ICell GetFirstCellInMergedRegionContainingCell(ICell cell)
{
if (cell != null && cell.IsMergedCell)
{
ISheet sheet = cell.Sheet;
for (int i = 0; i < sheet.NumMergedRegions; i++)
{
CellRangeAddress region = sheet.GetMergedRegion(i);
if (region.ContainsRow(cell.RowIndex) &&
region.ContainsColumn(cell.ColumnIndex))
{
IRow row = sheet.GetRow(region.FirstRow);
ICell firstCell = row?.GetCell(region.FirstColumn);
return firstCell;
}
}
return null;
}
return cell;
}
Then as you are looping through your cells you can just call this method for every cell. If the cell is merged, it will return the cell that has the value for that merged region, otherwise it will just return the original cell back. So then you don't have to think about it anymore.
cell = GetFirstCellInMergedRegionContainingCell(cell);
if (cell != null)
{
// get the value
}

Is there a way to get the last filled row cell value from column using OpenXml

I am working on a console application that simply grabs the value from a specified cell and displays it on the console. I would like to modify the code to get the value of the last filled cell in a column. I am able to get the value of cells I specify, but I want only the last filled cell (because the addressName may change when the sheet is updated with more rows). I am currently using the code below to get values by addressName. Can someone point me in the right direction or show an example? Please and thank you.
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
namespace ConsoleApp5
{
class Program
{
static void Main(string[] args)
{
string fileName = @"C:\Temp\myTempDoc\bigexcel.xlsx";
string sheetName = "sheet1";
string addressName = "B25";
var cellVall =GetCellValue(fileName, sheetName, addressName);
Console.WriteLine(cellVall);
Console.ReadKey();
}
public static string GetCellValue(string fileName,string sheetName,string addressName)
{
string value = null;
// Open the spreadsheet document for read-only access.
using (SpreadsheetDocument document =
SpreadsheetDocument.Open(fileName, false))
{
// Retrieve a reference to the workbook part.
WorkbookPart wbPart = document.WorkbookPart;
// Find the sheet with the supplied name, and then use that
// Sheet object to retrieve a reference to the first worksheet.
Sheet theSheet = wbPart.Workbook.Descendants<Sheet>().
Where(s => s.Name == sheetName).FirstOrDefault();
// Throw an exception if there is no sheet.
if (theSheet == null)
{
throw new ArgumentException("sheetName");
}
// Retrieve a reference to the worksheet part.
WorksheetPart wsPart =
(WorksheetPart)(wbPart.GetPartById(theSheet.Id));
// Use its Worksheet property to get a reference to the cell
// whose address matches the address you supplied.
Cell theCell = wsPart.Worksheet.Descendants<Cell>().
Where(c => c.CellReference ==addressName ).FirstOrDefault();
// If the cell does not exist, return an empty string.
if (theCell != null)
{
value = theCell.InnerText;
// If the cell represents an integer number, you are done.
// For dates, this code returns the serialized value that
// represents the date. The code handles strings and
// Booleans individually. For shared strings, the code
// looks up the corresponding value in the shared string
// table. For Booleans, the code converts the value into
// the words TRUE or FALSE.
if (theCell.DataType != null)
{
switch (theCell.DataType.Value)
{
case CellValues.SharedString:
// For shared strings, look up the value in the
// shared strings table.
var stringTable =
wbPart.GetPartsOfType<SharedStringTablePart>()
.FirstOrDefault();
// If the shared string table is missing, something
// is wrong. Return the index that is in
// the cell. Otherwise, look up the correct text in
// the table.
if (stringTable != null)
{
value =
stringTable.SharedStringTable
.ElementAt(int.Parse(value)).InnerText;
}
break;
case CellValues.Boolean:
switch (value)
{
case "0":
value = "FALSE";
break;
default:
value = "TRUE";
break;
}
break;
}
}
}
}
return value;
}
}
}
I'm not quite clear whether you are after the last Cell in a Row of your choosing or the last Cell of the last Row. Both approaches are very similar, though, so I'll show how to do both.
The basic principle is to find the Row you are after first and then to grab the child Cells from that Row.
If you want the very last Cell of the sheet then we just want the last Row:
//grab the last row
Row row = wsPart.Worksheet.Descendants<Row>().LastOrDefault();
However, if you would like to be able to pass in a row number and grab the last Cell of that Row then something like this will do the trick (here the variable rowIndex denotes the index of the Row for which you want the last Cell):
//find the row that matches the rowNumber we're after
Row row = wsPart.Worksheet.Descendants<Row>()
.Where(r => r.RowIndex == rowIndex).FirstOrDefault();
Once you have the Row, it's just a case of grabbing the last Cell of that Row using similar code to the above:
Cell theCell = null;
//find the row that matches the rowNumber we're after
Row row = wsPart.Worksheet.Descendants<Row>().Where(r => r.RowIndex == rowIndex).FirstOrDefault();
if (row != null)
{
//now grab the last cell of that row
theCell = row.Descendants<Cell>().LastOrDefault();
}
// If the cell does not exist, return an empty string.
if (theCell != null)
...
I had the same issue, and I found that an empty row in Excel (Open XML) still reports one descendant cell:
while(++rowIndex < rows.Count())
{
currentRow = rows.ElementAt(rowIndex);
int descendants = currentRow.Descendants<Cell>().Count();
if(descendants <= 1)
{
continue;
}
// your code
}
This way you can keep reading the rows and skip the empty ones until you reach the last row.
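
For the "last filled cell in a column" reading of the question, one possible sketch is to filter all cells by their column letters and take the last one that has a stored value. The names ColumnHelper and GetLastFilledCellInColumn are hypothetical, and the sketch assumes that rows appear in ascending order in the sheet XML and that values are stored in CellValue (inline strings, which use a different element, would be skipped).

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml.Spreadsheet;

public static class ColumnHelper
{
    // Hypothetical helper: returns the last cell in the given column
    // (for example "B") that has a stored value, or null if there is none.
    public static Cell GetLastFilledCellInColumn(Worksheet worksheet, string columnName)
    {
        string wanted = columnName.ToUpperInvariant();
        return worksheet.Descendants<Row>()
            .SelectMany(r => r.Elements<Cell>())
            .Where(c => c.CellReference != null
                && c.CellValue != null
                && !string.IsNullOrEmpty(c.CellValue.Text)
                && GetColumnName(c.CellReference.Value) == wanted)
            .LastOrDefault();
    }

    // Extracts the column letters from a reference such as "B25".
    private static string GetColumnName(string cellReference)
    {
        return Regex.Match(cellReference, "[A-Za-z]+").Value.ToUpperInvariant();
    }
}
```

The returned cell could then go through the same shared-string and Boolean handling as in GetCellValue from the question, for example by using its CellReference as the addressName.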

How do I insert Excel cells without creating a corrupt file?

I'm using the OpenXML SDK to update the contents of an Excel spreadsheet. When inserting cells into an Excel row they must be inserted in the correct order or the file will not open properly in Excel. I'm using the following code to find the first cell that will be after the cell I am inserting. This code comes almost directly from the OpenXML SDK documentation
public static Cell GetFirstFollowingCell(Row row, string newCellReference)
{
Cell refCell = null;
foreach (Cell cell in row.Elements<Cell>())
{
if (string.Compare(cell.CellReference.Value, newCellReference, true) > 0)
{
refCell = cell;
break;
}
}
return refCell;
}
When I edit files with this code and then open them in Excel, Excel reports that the file is corrupted. Excel is able to repair the file, but most of the data is removed from the workbook. Why does this result in file corruption?
Side note: I tried two different .NET Excel libraries before turning to the painfully low-level OpenXML SDK. NPOI created spreadsheets with corruption and EPPlus threw an exception whenever I tried to save. I was using the most recent version of each.
The code you are using is seriously flawed. This is very unfortunate, seeing as it comes from the documentation. It may work acceptably for spreadsheets that only use the first 26 columns but will fail miserably when confronted with "wider" spreadsheets. The first 26 columns are named alphabetically, A-Z. Columns 27-52 are named AA-AZ. Columns 53-78 are named BA-BZ. (You should notice the pattern.)
Cell "AA1" should come after all cells with a single character column name (i.e. "A1" - "Z1"). Let's examine the current code comparing cell "AA1" with cell "B1".
string.Compare("B1", "AA1", true) returns the value 1
The code interprets this to mean that "AA1" should be placed before cell "B1".
The calling code will insert "AA1" before "B1" in the XML.
At this point the cells will be out of order and the Excel file is corrupted. Clearly, string.Compare by itself is not a sufficient test to determine the proper order of cells in a row. A more sophisticated comparison is required.
// Requires: using System.Text.RegularExpressions;
// Returns true when the existing cell's column comes after the new cell's column.
public static bool IsCurrentCellAfterNewCell(string currentCellReference, string newCellReference)
{
    var columnNameRegex = new Regex("[A-Za-z]+");
    var currentCellColumn = columnNameRegex.Match(currentCellReference).Value;
    var newCellColumn = columnNameRegex.Match(newCellReference).Value;
    var currentCellColumnLength = currentCellColumn.Length;
    var newCellColumnLength = newCellColumn.Length;
    if (currentCellColumnLength == newCellColumnLength)
    {
        // Same length: plain alphabetical order is correct.
        var comparisonValue = string.Compare(currentCellColumn, newCellColumn, StringComparison.OrdinalIgnoreCase);
        return comparisonValue > 0;
    }
    // Different lengths: the longer column name ("AA") comes after the shorter one ("Z").
    return currentCellColumnLength > newCellColumnLength;
}
If you wanted to place a new cell in column "BC" and you were comparing to cell "D5" you would call IsCurrentCellAfterNewCell("D5", "BC5"), which returns false because column D comes before column BC. Substituting the new comparison function into the original code and simplifying with LINQ:
public static Cell GetFirstFollowingCell(Row row, string newCellReference)
{
    var rowCells = row.Elements<Cell>();
    // The first existing cell that sorts after the new cell, or null if the new cell goes last.
    return rowCells.FirstOrDefault(c => IsCurrentCellAfterNewCell(c.CellReference.Value, newCellReference));
}
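
An alternative to comparing column names by length and then alphabetically is to convert the column letters to a number and compare the numbers. The sketch below is a variation on the comparison above, not the answerer's code; the names CellReferenceOrder, ColumnIndex and IsAfter are made up for illustration, and it assumes plain references such as "BC5" without "$" markers.

```csharp
using System;

public static class CellReferenceOrder
{
    // Converts the column letters of a reference such as "BC5" into a
    // 1-based column number (A = 1, Z = 26, AA = 27, ...).
    public static int ColumnIndex(string cellReference)
    {
        int index = 0;
        foreach (char ch in cellReference)
        {
            if (!char.IsLetter(ch))
            {
                break; // the row digits start here
            }
            index = index * 26 + (char.ToUpperInvariant(ch) - 'A' + 1);
        }
        return index;
    }

    // True when the candidate cell's column comes after the new cell's column.
    public static bool IsAfter(string candidateReference, string newCellReference)
    {
        return ColumnIndex(candidateReference) > ColumnIndex(newCellReference);
    }
}
```

With this, CellReferenceOrder.IsAfter("B1", "AA1") is false, so "AA1" would not be inserted before "B1".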

Read Excel Cell Format

I'm working on this program that will read the data in excel file and put it into our database. The program is written in Visual Studio 2010 using C#, and I'm using the NPOI library.
In the past, I was able to read the spreadsheet row by row and cell by cell to get the data, but the new format of the excel file will not allow me to do this easily. (The excel is given by another user, so I can't really make big changes to it).
There are several "tables" in one sheet (using borders and headers for each column name), and I will need to get data mainly from the tables but sometimes outside the tables too.
I was wondering: if I read the spreadsheet row by row (which is what I'm a bit more familiar with), is there a way to tell that I have reached a table? Is there a way to read the "format" of the cell?
What I mean is, for example, "this cell has borders around it so starting this row is a table." or "the text in this cell is bold, so this row is the header row for this new table."
In the past I was only able to read the "text" for the spreadsheet and not the format/style. I've been searching on the internet and I can only find how to set the style for output excel but not how to read the format from input.
Any help is appreciated, thanks!
It would be better to have the various tables in your source workbook defined as named ranges with known names. Then you can get the associated area like this -
using System.IO;
using System.Windows;
using NPOI.SS.UserModel;
using NPOI.XSSF.UserModel;
// ...
using (var file = new FileStream(workbookLocation, FileMode.Open, FileAccess.Read))
{
var workbook = new XSSFWorkbook(file);
var nameInfo = workbook.GetName("TheTable");
var tableRange = nameInfo.RefersToFormula;
// Do stuff with the table
}
If you have no control over the source spreadsheet and cannot define the tables as named ranges, you can read the cell formats as you suggest. Here is an example of reading the TopBorder style -
using (var file = new FileStream(workbookLocation, FileMode.Open, FileAccess.Read))
{
var workbook = new XSSFWorkbook(file);
var sheet = workbook.GetSheetAt(0);
for (int rowNo = 0; rowNo <= sheet.LastRowNum; rowNo++)
{
var row = sheet.GetRow(rowNo);
if (row == null) // null is when the row only contains empty cells
continue;
for (int cellNo = 0; cellNo < row.LastCellNum; cellNo++) // LastCellNum is the last cell index plus one
{
var cell = row.GetCell(cellNo);
if (cell == null) // null is when the cell is empty
continue;
var topBorderStyle = cell.CellStyle.BorderTop;
if (topBorderStyle != BorderStyle.None)
{
MessageBox.Show(string.Format("Cell row: {0} column: {1} has TopBorder: {2}", cell.Row.RowNum, cell.ColumnIndex, topBorderStyle));
}
}
}
}

How to set cells' background?

How to set the background of several cells within a row (or of a whole row) in OpenXml?
Having read several articles:
Coloring cells in excel sheet using openXML in C#
Advanced styling in Excel Open XML
I still cannot make it work.
At first glance my task seems somewhat easier than, and a little different from, what is described in those articles. The tutorials mentioned mostly show how to create a new document and style it, whereas I need to change the styling of an existing one.
That is, I have an existing xlsx document (a report template). I populate the report with the necessary values (I managed to do it thanks to SO open xml excel read cell value and MSDN Working with sheets (Open XML SDK)). But next I need to mark several rows with, say, a red background.
I am not sure whether to use CellStyle, CellFormat, or something else. This is what I have so far:
SpreadsheetDocument doc = SpreadsheetDocument.Open("ole.xlsx", true);
Sheet sheet = (Sheet)doc.WorkbookPart
.Workbook
.Sheets
.FirstOrDefault();
WorksheetPart worksheetPart = (WorksheetPart)doc.WorkbookPart
.GetPartById(sheet.Id);
Worksheet worksheet = worksheetPart.Worksheet;
CellStyle cs = new CellStyle();
cs.Name = StringValue.FromString("Normal");
cs.FormatId = 0;
cs.BuiltinId = 0;
//where are the style values?
WorkbookStylesPart wbsp = doc.WorkbookPart
.GetPartsOfType<WorkbookStylesPart>()
.FirstOrDefault();
wbsp.Stylesheet.CellStyles.Append(cs);
wbsp.Stylesheet.Save();
Cell cell = GetCell(worksheet, "A", 20);
cell.StyleIndex = 1U;//get the new cellstyle index somehow
doc.Close();
I would greatly appreciate a simpler, more lightweight example of how to style, say, cell A20 or the range from A20 to J20, or a link to a more coherent tutorial.
In the end I changed my mind about using a cell background and used fonts instead. Thanks to the answer by foson in SO Creating Excel document with OpenXml sdk 2.0, I managed to add a new Font and a new CellFormat while preserving the original cell's formatting (i.e. changing only the font color):
SpreadsheetDocument doc = SpreadsheetDocument.Open("1.xlsx", true);
Sheet sheet = (Sheet)doc.WorkbookPart.Workbook.Sheets.FirstOrDefault();
WorksheetPart worksheetPart = (WorksheetPart)doc.WorkbookPart
.GetPartById(sheet.Id);
Worksheet worksheet = worksheetPart.Worksheet;
WorkbookStylesPart styles = doc.WorkbookPart.WorkbookStylesPart;
Stylesheet stylesheet = styles.Stylesheet;
CellFormats cellformats = stylesheet.CellFormats;
Fonts fonts = stylesheet.Fonts;
UInt32 fontIndex = fonts.Count;
UInt32 formatIndex = cellformats.Count;
Cell cell = GetCell(worksheet, "A", 19);
cell.CellValue = new CellValue(DateTime.Now.ToLongTimeString());
cell.DataType = new EnumValue<CellValues>(CellValues.String);
CellFormat f = (CellFormat)cellformats.ElementAt((int)cell.StyleIndex.Value);
var font = (Font)fonts.ElementAt((int)f.FontId.Value);
var newfont = (Font)font.Clone();
newfont.Color = new Color() { Rgb = new HexBinaryValue("ff0000") };
fonts.Append(newfont);
CellFormat newformat = (CellFormat)f.Clone();
newformat.FontId = fontIndex;
cellformats.Append(newformat);
stylesheet.Save();
cell.StyleIndex = formatIndex;
doc.Close();
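
The same clone-and-append pattern should also work for the background that the question originally asked about: add a Fill to the stylesheet, clone the cell's current CellFormat, and point the clone at the new fill. The following is a minimal sketch under those assumptions, not a tested solution; FillHelper and AddSolidFillFormat are hypothetical names, and the stylesheet is assumed to already contain Fills and CellFormats, as files saved by Excel normally do.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Spreadsheet;

public static class FillHelper
{
    // Hypothetical helper: returns the index of a new cell format that equals
    // the cell's current format plus a solid fill in the given ARGB color
    // (for example "FFFF0000" for opaque red).
    public static uint AddSolidFillFormat(Stylesheet stylesheet, Cell cell, string argb)
    {
        Fills fills = stylesheet.Fills;
        CellFormats cellFormats = stylesheet.CellFormats;

        // The new fill's index is the number of fills before the append.
        uint fillIndex = (uint)fills.Elements<Fill>().Count();
        fills.Append(new Fill(
            new PatternFill(new ForegroundColor { Rgb = new HexBinaryValue(argb) })
            {
                PatternType = PatternValues.Solid
            }));
        fills.Count = fillIndex + 1;

        // Clone the existing format so font, border and number format survive.
        uint oldIndex = cell.StyleIndex != null ? cell.StyleIndex.Value : 0U;
        CellFormat newFormat = (CellFormat)cellFormats.Elements<CellFormat>()
            .ElementAt((int)oldIndex).CloneNode(true);
        newFormat.FillId = fillIndex;
        newFormat.ApplyFill = true;

        uint formatIndex = (uint)cellFormats.Elements<CellFormat>().Count();
        cellFormats.Append(newFormat);
        cellFormats.Count = formatIndex + 1;
        return formatIndex;
    }
}
```

Usage would look like cell.StyleIndex = FillHelper.AddSolidFillFormat(stylesheet, cell, "FFFF0000"); followed by stylesheet.Save(). For a solid pattern the visible color is the foreground color of the pattern fill, which is why the sketch sets ForegroundColor rather than BackgroundColor.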
You have 3 options:
Use MS lib ExcelDataReader which requires your server installing Office and usually does not work if your program is running in IIS.
Use closed source libs.
Use OpenXML.
Try my code using pure OpenXML:
https://stackoverflow.com/a/59806422/6782249
Simple solution:
Try this: (it works with the nuget package ClosedXML v 0.95.4)
using ClosedXML.Excel;
XLWorkbook wb = new XLWorkbook();
IXLWorksheet ws = wb.Worksheets.Add("Test Background Color");
ws.Cell("A1").Style.Fill.BackgroundColor = XLColor.LightBlue;
ws.Cell("A1").Value = "This cell should have light blue background";
wb.SaveAs(@"c:\Test\test.xlsx");
