Formula causing .xlsx to need repair on open - c#

I'm generating an .xlsx file using the EPPlus library.
I create several worksheets, each with multiple rows. Some rows have a cell reference in column J which I am inserting using the following:
for (int i = 2; i < rowCount; i++) // start on row 2 (header)
{
var formula = GetCellRefFormula(i);
worksheet.Cells[$"J{i}"].Formula = formula;
}
// save worksheet/workbook
private string GetCellRefFormula(int i)
{
return $"\"Row \"&ROW(D{i})";
}
When I open the workbook I get the following errors:
Removed Records: Formula from /xl/worksheets/sheet1.xml part
Removed Records: Formula from /xl/worksheets/sheet7.xml part
The errors are certainly caused by the string returned from GetCellRefFormula(): if I don't set these formulas, or if GetCellRefFormula() simply returns an empty string, I get no errors.
I have also tried setting the formula to have an equals sign in front, with the same result.
private string GetCellRefFormula(int i)
{
return $"=\"Row \"&ROW(D{i})";
}
Should I be setting the formula field like this?
Is there a way to see specifically which formulas are incorrect in the Excel repair log?
As far as I can see it only gives the errors I've copied above.
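Since the repair log only names the worksheet part, one way to narrow things down is to read that part directly. An .xlsx file is a ZIP package, and each formula is stored in an `<f>` element inside its cell's `<c>` element. The following is a minimal sketch, not an EPPlus feature: the class and method names (FormulaDump, ListFormulas) are made up for illustration, and it should be run on the file as EPPlus wrote it, before Excel repairs and re-saves it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

static class FormulaDump
{
    // Main SpreadsheetML namespace used by worksheet parts.
    static readonly XNamespace Main =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Hypothetical helper: returns "cellRef: formula" for every cell in the
    // given part (e.g. "xl/worksheets/sheet1.xml") that carries a formula.
    public static List<string> ListFormulas(Stream xlsx, string partName)
    {
        using (var zip = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry(partName);
            if (entry == null) return new List<string>();

            using (Stream part = entry.Open())
            {
                XDocument doc = XDocument.Load(part);
                return doc.Descendants(Main + "c")
                    .Where(c => c.Element(Main + "f") != null)
                    .Select(c => (string)c.Attribute("r") + ": " + (string)c.Element(Main + "f"))
                    .ToList();
            }
        }
    }
}
```

Calling it with a stream from File.OpenRead on the generated workbook and the part name from the repair log (without the leading slash) lists the formulas exactly as they were written, which makes a malformed one easier to spot.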

Related

"Exception from HRESULT: 0x800A03EC" while using Excel OpenFile()

I receive this exception every time I try to open a CSV file (whether a .csv or .txt). Because the exception is so generic, I haven't been able to figure out what is wrong.
My program is a WinForms application that FTPs data from an IBM mainframe, parses it, and generates custom objects which are ideally viewed in a spreadsheet, for easy navigation and filtering. First I tried writing my data directly to a spreadsheet, but that took a really long time for large data sets. For example, for 100 of my custom objects, my program took about 20 seconds to write them to a spreadsheet. It's not unusual for my end user to need spreadsheets for thousands to tens of thousands of custom objects.
Since I don't want the user to sit idly for an hour while the spreadsheet is created, I looked into the cause of the slowness. I read on StackExchange that Interop.Excel calls are very slow, so to limit the number of them I'm trying a different solution: first I write the data to a CSV, then I use Interop.Excel to open the CSV, do some light formatting such as coloring the header and freezing the top row, and save it as a .xls or .xlsx. But I'm having trouble at the very first step: opening the CSV!
Here is the code:
using Excel = Microsoft.Office.Interop.Excel;
...
private void BuildSpreadsheetFromCsvFiles(string spreadsheetPath, string csvPathTrimmedTranlog, string csvPathFullTranlog)
{
Excel.Application xlApp = new Excel.Application();
try
{
// Set up the arrays of XlColumnDataTypes
Excel.XlColumnDataType[] trimmedDataTypes = new Excel.XlColumnDataType[trimmedTranlog.TagsPresent.Count];
Excel.XlColumnDataType[] fullDataTypes = new Excel.XlColumnDataType[fullTranlog.TagsPresent.Count];
for(int i = 0; i < trimmedDataTypes.Length; i++)
{
trimmedDataTypes[i] = Excel.XlColumnDataType.xlTextFormat;
}
for(int i = 0; i < fullDataTypes.Length; i++)
{
fullDataTypes[i] = Excel.XlColumnDataType.xlTextFormat;
}
xlApp.Workbooks.OpenText(Filename: csvPathTrimmedTranlog, // THROWS EXCEPTION
Origin: Excel.XlPlatform.xlWindows,
DataType: Excel.XlTextParsingType.xlDelimited,
TextQualifier: Excel.XlTextQualifier.xlTextQualifierNone,
Semicolon: true,
FieldInfo: trimmedDataTypes);
Excel.Workbook xlWorkbookTrimmed = xlApp.Workbooks[1];
xlApp.Workbooks.OpenText(Filename: csvPathFullTranlog, // ALSO THROWS EXCEPTION
DataType: Excel.XlTextParsingType.xlDelimited,
Origin: Excel.XlPlatform.xlWindows,
TextQualifier: Excel.XlTextQualifier.xlTextQualifierNone,
Semicolon: true,
FieldInfo: fullDataTypes);
Excel.Workbook xlWorkbookFull = xlApp.Workbooks[2]; // the second file opened is the second workbook
Excel.Worksheet xlWorksheetTrimmed = xlWorkbookTrimmed.Worksheets[1];
Excel.Worksheet xlWorksheetFull = xlWorkbookFull.Worksheets[1];
xlWorksheetTrimmed.Copy(Before: xlWorksheetFull);
xlApp.Visible = true;
}
catch(Exception e)
{
xlApp.Quit();
}
}
I tried opening the files with Open() instead of OpenText(), and that does technically work. However, for my purposes, I cannot use Open() since in doing so all the columns will be read as General format. My data contains long strings of numbers (20 digits or so) which need to be displayed as text, and Open() will display those numbers with scientific notation.
More information:
When the workbook is created, it is created in Excel 2016.
I trimmed the .txt file to contain 20 rows and 6 columns, and the problem persists.
The .txt file can be opened in Excel 2016 manually, but not programmatically within my WinForms application.
UPDATE
I've narrowed the issue to the last parameter in OpenText(), the FieldInfo: parameter. With that parameter omitted, the file opens successfully. Unfortunately, as I said earlier, the data must be formatted as Text, instead of the default General.
The problem is the FieldInfo parameter. According to the API:
When the data is delimited, this argument is an array of two-element
arrays, with each two-element array specifying the conversion options
for a particular column. The first element is the column number
(1-based), and the second element is one of the XlColumnDataType
constants specifying how the column is parsed.
I rewrote the "Set up the arrays of XlColumnDataTypes" section in the original code to the following, and now it works as intended.
// Set up the arrays of XlColumnDataTypes
int[,] trimmedDataTypes = new int[trimmedTranlog.TagsPresent.Count, 2];
for(int i = 1; i <= trimmedDataTypes.Length / 2; i++)
{
trimmedDataTypes[i - 1, 0] = i;
trimmedDataTypes[i - 1, 1] = (int)Excel.XlColumnDataType.xlTextFormat;
}
int[,] fullDataTypes = new int[fullTranlog.TagsPresent.Count, 2];
for(int i = 1; i <= fullDataTypes.Length / 2; i++)
{
fullDataTypes[i - 1, 0] = i;
fullDataTypes[i - 1, 1] = (int)Excel.XlColumnDataType.xlTextFormat;
}
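
Because both arrays are built the same way, the setup could be pulled into one small helper. This is only a sketch of the "array of two-element arrays" layout quoted above: FieldInfoBuilder and BuildTextFieldInfo are names invented here, and the constant 2 is the numeric value of Excel.XlColumnDataType.xlTextFormat, inlined so the snippet compiles without the interop assembly.

```csharp
using System;

static class FieldInfoBuilder
{
    // Numeric value of Excel.XlColumnDataType.xlTextFormat (assumption: inlined
    // here only so the sketch does not need a reference to the interop assembly).
    const int XlTextFormat = 2;

    // Hypothetical helper: builds a FieldInfo array that marks every one of
    // the first columnCount columns as text.
    public static int[,] BuildTextFieldInfo(int columnCount)
    {
        var fieldInfo = new int[columnCount, 2];
        for (int col = 0; col < columnCount; col++)
        {
            fieldInfo[col, 0] = col + 1;      // 1-based column number
            fieldInfo[col, 1] = XlTextFormat; // parse this column as text
        }
        return fieldInfo;
    }
}
```

With it, the two loops above would reduce to FieldInfoBuilder.BuildTextFieldInfo(trimmedTranlog.TagsPresent.Count) and the same call for fullTranlog.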

How to get current selected excel sheet data without using oledb connection in c#

I am working on a VSTO application and have one open workbook. I want to read the data of the selected sheet from that workbook without using an OLE DB connection. Is there any way to read the data and store it in a DataTable?
The tricky part is figuring out if the current selection is valid for what you want to do. In Excel's VBA world you'd work with the VBA information function TypeName to determine whether the current Selection is a Range object. C# doesn't have a direct equivalent, so you have to work around it. If all you're interested in is a Range, then you can check whether a direct conversion to an Excel.Range is valid and proceed from there. A Range object will return an array, which you can put in a data set.
The following code sample shows how to test the Selection and work with the resulting array. It doesn't do anything with a dataset - that would be a different question.
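object oSel = Globals.ThisAddIn.Application.Selection;
Excel.Range rngSelection = oSel as Excel.Range;
if (rngSelection != null)
{
    // For a multi-cell range, Value2 returns a 1-based two-dimensional array.
    // (A single-cell selection returns a scalar instead and is not handled here.)
    object[,] data = rngSelection.Value2;
    // Dimension 0 holds the rows, dimension 1 the columns.
    for (int r = data.GetLowerBound(0); r <= data.GetUpperBound(0); r++)
    {
        for (int c = data.GetLowerBound(1); c <= data.GetUpperBound(1); c++)
        {
            // Empty cells come back as null, so convert defensively.
            System.Diagnostics.Debug.Print(Convert.ToString(data[r, c]));
        }
    }
}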
An alternative to using the cast test involves working with the COM APIs. If you needed to take various actions depending on the type of Selection this approach might be more effective. It's described here: https://www.add-in-express.com/creating-addins-blog/2011/12/20/type-name-system-comobject/
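For the DataTable part of the question, the two-dimensional array returned by Value2 could be copied over along these lines. This is a rough sketch rather than a finished solution: SelectionToTable and ToDataTable are invented names, the columns get generic names (Column1, Column2, ...) because the selection may not include a header row, and every column is typed as object.

```csharp
using System;
using System.Data;

static class SelectionToTable
{
    // Hypothetical helper: copies a 2-D array, such as the 1-based array
    // returned by Range.Value2, into a DataTable with generic column names.
    public static DataTable ToDataTable(object[,] data)
    {
        var table = new DataTable();
        int firstRow = data.GetLowerBound(0), lastRow = data.GetUpperBound(0);
        int firstCol = data.GetLowerBound(1), lastCol = data.GetUpperBound(1);

        for (int c = firstCol; c <= lastCol; c++)
            table.Columns.Add("Column" + (c - firstCol + 1), typeof(object));

        for (int r = firstRow; r <= lastRow; r++)
        {
            DataRow row = table.NewRow();
            for (int c = firstCol; c <= lastCol; c++)
                row[c - firstCol] = data[r, c] ?? DBNull.Value; // empty cells are null
            table.Rows.Add(row);
        }
        return table;
    }
}
```

Using the array bounds instead of hard-coded indexes means the same helper works for the 1-based arrays Excel returns and for ordinary 0-based arrays.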

How to get only selected cells value from filtered cells from excel

I have a simple excel sheet:
Now, I filter it such that cell value > 1. Now my data looks like:
Now, I select the data that I require:
Note that I have selected all the Mobile Numbers.
Now in my code, I am trying to retrieve all the selected data as follows:
Range selection = (Range)Globals.ThisAddIn.Application.ActiveWindow.Selection;
But this gives me every cell from the start to the end of the selection. I think Excel also selects the non-visible rows, because row 4, which contains 0, is also retrieved. Look at the image below:
So, now I created another Range and tried to add all the values of cells that are visible as follows:
Range onlyFilteredSelection = selection.Cells.SpecialCells(XlCellType.xlCellTypeVisible);
Now I can see that C# shows me only two rows. Why is it not displaying the last row, which comes after the hidden row? Take a look at the values here:
Update:
After posting this question, it occurred to me that I might be getting multiple ranges instead of one, so I started exploring. It turns out I was right: I get multiple ranges.
Here is the code that I have tried:
Range selection = (Range)Globals.ThisAddIn.Application.ActiveWindow.Selection;
List<Range> onlyFilteredSelection = new List<Range>();
foreach (Range range in selection.Cells.SpecialCells(XlCellType.xlCellTypeVisible))
{
onlyFilteredSelection.Add(range);
}
Now I get 4 items in the selection variable, and onlyFilteredSelection has 3 items.
This leaves me with another problem:
Previously, I was getting a single Range, so I could easily convert it to a comma-separated string using the code below:
string[] AllRecepientMobileNumberValues = ((Array)(selection.Cells.Value2)).OfType<object>().Select(o => o.ToString()).ToArray();
string RecepientMobileNumberValue = AllRecepientMobileNumberValues.Aggregate((a, x) => a + ", " + x);
But now I get a List. So my question is: how do I convert a List to a comma-separated string?
You can use one more Select to get the values out of a list.
string[] AllRecepientMobileNumberValues = onlyFilteredSelection.Select(x => x.Cells.Value2).OfType<object>().Select(o => o.ToString()).ToArray();
string RecepientMobileNumberValue = AllRecepientMobileNumberValues.Aggregate((a, x) => a + ", " + x);
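As a side note on the last step, string.Join can replace Aggregate here. Aggregate without a seed throws an InvalidOperationException on an empty sequence, while string.Join simply returns an empty string. A minimal sketch, with CsvJoin and JoinValues as made-up names and the cell values passed in as plain objects:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CsvJoin
{
    // Hypothetical helper: joins cell values into "a, b, c",
    // skipping empty cells (which arrive as null).
    public static string JoinValues(IEnumerable<object> values)
    {
        return string.Join(", ", values.Where(v => v != null).Select(v => v.ToString()));
    }
}
```

It could be fed from the list above with something like onlyFilteredSelection.Select(x => (object)x.Value2), assuming each item is a single cell.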
I experimented a bit and ran into problems too, but I think I may have found a possible workaround for your situation. First, some things I found out:
I was able to reproduce your behaviour.
Selecting all visible cells fails because a filter does not seem to hide the rows; instead it sets the row height to 0!
I could not find any other useful method or property among the Application or Range members.
I also recorded a macro to see what VBA code is generated when copying your selection. The strange thing is that there is nothing special in the code, as you can see.
vba macro code
Range("A3:A5").Select ' correct as three lines are selected,
' but only two of them have a rowHeight > 0
Selection.Copy
Range("F8").Select
ActiveSheet.Paste ' and here is the magic?? why does vba only paste 2 cells??
So I decided to come up with a workaround: why not simulate what VBA seemingly does, and handle only those cells whose row height is greater than 0?
exampleCode.cs
private static void readFilteredCells()
{
Excel.Application xlApp = (Excel.Application)Marshal.GetActiveObject("Excel.Application");
Workbook xlBook = (Excel.Workbook)xlApp.ActiveWorkbook;
Worksheet wrkSheet = xlBook.Worksheets[1];
Range selection = xlApp.Selection;
for (int rowIndex = selection.Row; rowIndex < selection.Row + selection.Rows.Count; rowIndex++)
{
if (wrkSheet.Rows[rowIndex].EntireRow.Height!=0)
{
// do something special
}
}
}
I hope my answer is of any use for you. If you need any further assistance please let me know.

Optimized way of adding multiple hyperlinks in excel file with C#

I wanted to ask if there is a practical way of adding multiple hyperlinks to an Excel worksheet with C#. I want to generate a list of websites and anchor hyperlinks to them, so the user can click a hyperlink and get to that website.
So far I have come up with a simple nested for loop, which goes through every cell in a given Excel range and adds a hyperlink to that cell:
for (int i = 0; i < _range.Rows.Count; i++)
{
Microsoft.Office.Interop.Excel.Range row = _range.Rows[i];
for (int j = 0; j < row.Cells.Count; j++)
{
Microsoft.Office.Interop.Excel.Range cell = row.Cells[j];
cell.Hyperlinks.Add(cell, adresses[i, j], _optionalValue, _optionalValue, _optionalValue);
}
}
The code works as intended, but it is extremely slow because of the thousands of calls to the Hyperlinks.Add method.
One thing that intrigues me is that the method set_Value from Office.Interop.Excel can add thousands of strings with one simple call, but there is no similar method for adding hyperlinks (Hyperlinks.Add can add just one hyperlink).
So my question is: is there some way to optimize adding hyperlinks to an Excel file in C# when you need to add a large number of them?
Any help would be appreciated.
I am using VS2010 and MS Excel 2010.
I had the very same problem (adding 300 hyperlinks via Range.Hyperlinks.Add took approx. 2 min).
The slowness comes from the many Range instances.
Solution:
Use a single range instance and add Hyperlinks with the "=HYPERLINK(target, [friendlyName])" Excel-Formula.
Example:
List<string> urlsList = new List<string>();
urlsList.Add("http://www.gin.de");
// ^^ n times ...
// create a two-dimensional array (one row per URL, one column) holding the formulas
object[,] content = new object[urlsList.Count, 1];
for (int i = 0; i < urlsList.Count; i++)
{
    content[i, 0] = string.Format("=HYPERLINK(\"{0}\")", urlsList[i]);
}
// get a range with one row per URL (Excel rows are 1-based, so this is A1:A<count>)
string rangeDescription = string.Format("A1:A{0}", urlsList.Count);
Xl.Range xlRange = worksheet.Range[rangeDescription, XlTools.missing];
// set all formulas with a single call
xlRange.Value2 = content;
... takes just 1 sec ...

ExtremeML Negative Exponent Export Issue

In C# I have a DataTable which has valid values, some of which are in exponential format. I export this DataTable to an xlsx file using ExtremeML (which I believe is based on OpenXML) and the following code:
if (!Directory.Exists(savePath.Text))
Directory.CreateDirectory(savePath.Text);
using (var package = SpreadsheetDocumentWrapper.Create(savePath.Text + fileName + ".xlsx"))
{
for (int i = 0; i < dataset.Tables.Count; i++)
{
// declares worksheet
var part = package.WorkbookPart.WorksheetParts.Add(dataset.Tables[i].TableName.ToString()); // second worksheet filled by second collection
bool first = true;
int position = 0;
for (int row = 0; row < dataset.Tables[i].Rows.Count; row++)
{
for (int col = 0; col < dataset.Tables[i].Columns.Count; col++)
{
// adds to file
if (first)
part.Worksheet.SetCellValue(new GridReference(row, col), dataset.Tables[i].Columns[col].ColumnName.ToString());
else
part.Worksheet.SetCellValue(new GridReference(row, col), dataset.Tables[i].Rows[position][col]);
}
if (first)
{
first = false;
}
else
position++;
}
}
}
The resulting Excel file is almost acceptable, except that negative numbers with a negative exponent (for example -7.45E-05) treat the negative sign as a digit (the previous example becomes 0.000-745). The columns are all set to be doubles in the DataTable, so the Excel file should be formatting them as such.
Any ideas as to the cause of this problem?
(Also, I know the code is a touch painful to grok- the code is not mine, and I haven't gotten around to cleaning up the program.)
Your sample code seems fine.
This looks like a bug in the way ExtremeML converts doubles. As you say, it only appears to affect negative numbers with a negative exponent.
I am currently debugging it and creating additional unit tests to cover these scenarios, after which I will commit the updated code to CodePlex and post an update here.
Edit: This bug has now been fixed. You'll need to download and build the ExtremeML source code from CodePlex, as the runtime release is not currently up-to-date.
