I receive this exception every time I try to open a CSV file (whether a .csv or .txt). Because the exception is so generic, I haven't been able to figure out what is wrong.
My program is a WinForms application that FTPs data from an IBM mainframe, parses it, and generates custom objects which are ideally viewed in a spreadsheet, for easy navigation and filtering. First I tried writing my data directly to a spreadsheet, but that took a really long time for large data sets. For example, for 100 of my custom objects, my program took about 20 seconds to write them to a spreadsheet. It's not unusual for my end user to need spreadsheets for thousands to tens of thousands of custom objects. Since I don't want the user to sit idly for an hour while the spreadsheet is created, I looked into the cause of the slowness. I read on StackExchange that Interop.Excel calls are very slow, so in order to limit the number of them, I'm trying a different solution: first I write the data to a CSV, then I use Interop.Excel to open the CSV, maybe do some light formatting like coloring the header and freezing the top row, then save as a .xls or .xlsx. But I'm having trouble at the very first step: opening the CSV!
Here is the code:
using Excel = Microsoft.Office.Interop.Excel;
...
private void BuildSpreadsheetFromCsvFiles(string spreadsheetPath, string csvPathTrimmedTranlog, string csvPathFullTranlog)
{
Excel.Application xlApp = new Excel.Application();
try
{
// Set up the arrays of XlColumnDataTypes
Excel.XlColumnDataType[] trimmedDataTypes = new Excel.XlColumnDataType[trimmedTranlog.TagsPresent.Count];
Excel.XlColumnDataType[] fullDataTypes = new Excel.XlColumnDataType[fullTranlog.TagsPresent.Count];
for(int i = 0; i < trimmedDataTypes.Length; i++)
{
trimmedDataTypes[i] = Excel.XlColumnDataType.xlTextFormat;
}
for(int i = 0; i < fullDataTypes.Length; i++)
{
fullDataTypes[i] = Excel.XlColumnDataType.xlTextFormat;
}
xlApp.Workbooks.OpenText(Filename: csvPathTrimmedTranlog, // THROWS EXCEPTION
Origin: Excel.XlPlatform.xlWindows,
DataType: Excel.XlTextParsingType.xlDelimited,
TextQualifier: Excel.XlTextQualifier.xlTextQualifierNone,
Semicolon: true,
FieldInfo: trimmedDataTypes);
Excel.Workbook xlWorkbookTrimmed = xlApp.Workbooks[1];
xlApp.Workbooks.OpenText(Filename: csvPathFullTranlog, // ALSO THROWS EXCEPTION
DataType: Excel.XlTextParsingType.xlDelimited,
Origin: Excel.XlPlatform.xlWindows,
TextQualifier: Excel.XlTextQualifier.xlTextQualifierNone,
Semicolon: true,
FieldInfo: fullDataTypes);
Excel.Workbook xlWorkbookFull = xlApp.Workbooks[1];
Excel.Worksheet xlWorksheetTrimmed = xlWorkbookTrimmed.Worksheets[1];
Excel.Worksheet xlWorksheetFull = xlWorkbookFull.Worksheets[1];
xlWorksheetTrimmed.Copy(Before: xlWorksheetFull);
xlApp.Visible = true;
}
catch(Exception e)
{
xlApp.Quit();
}
}
I tried opening the files with Open() instead of OpenText(), and that does technically work. However, for my purposes, I cannot use Open() since in doing so all the columns will be read as General format. My data contains long strings of numbers (20 digits or so) which need to be displayed as text, and Open() will display those numbers with scientific notation.
More information:
When the workbook is created, it is created in Excel 2016.
I trimmed the .txt file to contain 20 rows and 6 columns, and the problem persists.
The .txt file can be opened in Excel 2016 manually, but not programmatically within my WinForms application.
UPDATE
I've narrowed the issue to the last parameter in OpenText(), the FieldInfo: parameter. With that parameter omitted, the file opens successfully. Unfortunately, as I said earlier, the data must be formatted as Text, instead of the default General.
The problem is the FieldInfo parameter. According to the API:
When the data is delimited, this argument is an array of two-element
arrays, with each two-element array specifying the conversion options
for a particular column. The first element is the column number
(1-based), and the second element is one of the XlColumnDataType
constants specifying how the column is parsed.
I rewrote the "Set up the arrays of XlColumnDataTypes" section in the original code to the following, and now it works as intended.
// Set up the arrays of XlColumnDataTypes
int[,] trimmedDataTypes = new int[trimmedTranlog.TagsPresent.Count, 2];
for(int i = 1; i <= trimmedDataTypes.Length / 2; i++)
{
trimmedDataTypes[i - 1, 0] = i;
trimmedDataTypes[i - 1, 1] = (int)Excel.XlColumnDataType.xlTextFormat;
}
int[,] fullDataTypes = new int[fullTranlog.TagsPresent.Count, 2];
for(int i = 1; i <= fullDataTypes.Length / 2; i++)
{
fullDataTypes[i - 1, 0] = i;
fullDataTypes[i - 1, 1] = (int)Excel.XlColumnDataType.xlTextFormat;
}
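Since both arrays are built the same way, the loop can be pulled into a small helper. This is only a sketch: the class and method names (FieldInfoBuilder, AllColumnsAs) are made up for illustration, and the constant XlTextFormat = 2 is an assumption that mirrors the numeric value of Excel.XlColumnDataType.xlTextFormat so the snippet compiles without the interop assembly.

```csharp
using System;

public static class FieldInfoBuilder
{
    // Assumption: mirrors the numeric value of Excel.XlColumnDataType.xlTextFormat,
    // so this sketch does not need a reference to the interop assembly.
    public const int XlTextFormat = 2;

    // Builds the N x 2 array that OpenText expects for delimited data:
    // element [i, 0] is the 1-based column number, element [i, 1] the data type.
    public static int[,] AllColumnsAs(int columnCount, int dataType)
    {
        if (columnCount < 0)
            throw new ArgumentOutOfRangeException(nameof(columnCount));

        var fieldInfo = new int[columnCount, 2];
        for (int i = 0; i < columnCount; i++)
        {
            fieldInfo[i, 0] = i + 1;     // Excel column numbers start at 1
            fieldInfo[i, 1] = dataType;  // same parsing type for every column
        }
        return fieldInfo;
    }
}
```

With that in place, the call would look something like FieldInfo: FieldInfoBuilder.AllColumnsAs(trimmedTranlog.TagsPresent.Count, (int)Excel.XlColumnDataType.xlTextFormat).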
The Excel spreadsheet should be read by .NET. Reading all values from a range through the Value property is very efficient: it transfers all values as a two-dimensional array in a single call to Excel.
However, reading strings this way is not possible for a range that contains more than a single cell. Therefore we have to iterate over all cells and use the Text property, which performs very poorly for larger documents.
The reason for using strings rather than values is to obtain the displayed format (for instance for dates or the number of digits).
Here is a sample code written in C# to demonstrate the approach.
static void Main(string[] args)
{
Excel.Application xlApp = (Excel.Application)System.Runtime.InteropServices.Marshal.GetActiveObject("Excel.Application");
var worksheet = xlApp.ActiveSheet;
var cells = worksheet.UsedRange; // UsedRange is a property, not a method
// read all values in array -> fast
object[,] arrayValues = cells.Value;
// create an array for the text with the same lengths and lower bounds
object[,] arrayText = (object[,])Array.CreateInstance(typeof(object),
new int[] { arrayValues.GetLength(0), arrayValues.GetLength(1) },
new int[] { arrayValues.GetLowerBound(0), arrayValues.GetLowerBound(1) });
// read text for each cell -> slow
for (int row = arrayValues.GetLowerBound(0); row <= arrayValues.GetUpperBound(0); ++row)
{
for (int col = arrayValues.GetLowerBound(1); col <= arrayValues.GetUpperBound(1); ++col)
{
object obj = cells[row, col].Text;
arrayText[row, col] = obj;
}
}
}
The question is whether there is a more efficient way to read the complete string content from an Excel document. One idea was to use cells.Copy to copy the content to the clipboard and get it from there. However, this has some restrictions and could of course interfere with users who are working with the clipboard at the same time. So I wonder if there are better approaches to solve this performance issue.
You can use the code below:
using (MSExcel.Application app = MSExcel.Application.CreateApplication())
{
MSExcel.Workbook book1 = app.Workbooks.Open( this.txtOpen_FilePath.Text);
MSExcel.Worksheet sheet = (MSExcel.Worksheet)book1.Worksheets[1];
MSExcel.Range range = sheet.GetRange("A1", "F13");
object value = range.Value; //the value is boxed two-dimensional array
}
The code is provided from this post. It should be much more efficient than your code, but may not be the best.
I am working on a VSTO application and I have one open workbook. I want to read the selected sheet data from that workbook without using any OleDb connection. Is there any way to read the data and store it in a DataTable?
The tricky part is figuring out if the current selection is valid for what you want to do. In Excel's VBA world you'd work with the VBA information function TypeName to determine whether the current Selection is a Range object. C# doesn't have a direct equivalent, so you have to work around it. If all you're interested in is a Range, then you can check whether a direct conversion to an Excel.Range is valid and proceed from there. A Range object will return an array, which you can put in a data set.
The following code sample shows how to test the Selection and work with the resulting array. It doesn't do anything with a dataset - that would be a different question.
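object oSel = Globals.ThisAddIn.Application.Selection;
// "as" yields null when the selection is not a Range (e.g. a chart or shape)
Excel.Range rngSelection = oSel as Excel.Range;
if (rngSelection != null)
{
// Value2 returns a 1-based object[,] for a multi-cell range;
// a single cell returns a scalar, so the cast below yields null
object[,] data = rngSelection.Value2 as object[,];
if (data != null)
{
for (int row = data.GetLowerBound(0); row <= data.GetUpperBound(0); row++)
{
for (int col = data.GetLowerBound(1); col <= data.GetUpperBound(1); col++)
{
// empty cells come back as null, which Convert.ToString turns into ""
System.Diagnostics.Debug.Print(System.Convert.ToString(data[row, col]));
}
}
}
}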
An alternative to using the cast test involves working with the COM APIs. If you needed to take various actions depending on the type of Selection this approach might be more effective. It's described here: https://www.add-in-express.com/creating-addins-blog/2011/12/20/type-name-system-comobject/
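For the DataTable part of the question, which the sample above leaves out, the array returned by the Range could be copied into a table roughly as follows. This is a minimal sketch under a few assumptions: the names RangeData and ToDataTable are invented for illustration, every column is typed as object, and the firstRowIsHeader switch is my own addition for the common case where the first selected row holds column titles.

```csharp
using System;
using System.Data;

public static class RangeData
{
    // Copies a two-dimensional array (such as the 1-based array returned by
    // Range.Value2 for a multi-cell range) into a DataTable.
    public static DataTable ToDataTable(object[,] data, bool firstRowIsHeader)
    {
        var table = new DataTable();
        int firstRow = data.GetLowerBound(0), lastRow = data.GetUpperBound(0);
        int firstCol = data.GetLowerBound(1), lastCol = data.GetUpperBound(1);

        // One column per array column; fall back to generated names when the
        // header cell is empty, and pad with "_" to keep names unique.
        for (int col = firstCol; col <= lastCol; col++)
        {
            string name = firstRowIsHeader ? Convert.ToString(data[firstRow, col]) : null;
            if (string.IsNullOrWhiteSpace(name))
                name = "Column" + (col - firstCol + 1);
            while (table.Columns.Contains(name))
                name += "_";
            table.Columns.Add(name, typeof(object));
        }

        int columnCount = lastCol - firstCol + 1;
        for (int row = firstRowIsHeader ? firstRow + 1 : firstRow; row <= lastRow; row++)
        {
            var values = new object[columnCount];
            for (int c = 0; c < columnCount; c++)
                values[c] = data[row, firstCol + c] ?? DBNull.Value; // empty cell
            table.Rows.Add(values);
        }
        return table;
    }
}
```

Using the bounds of the array rather than fixed indices means the same method works for the 1-based arrays Excel hands back and for ordinary 0-based arrays.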
I'm generating an .xlsx file using the EPPlus library.
I create several worksheets, each with multiple rows. Some rows have a cell reference in column J which I am inserting using the following:
for (int i = 2; i < rowCount; i++) // start on row 2 (header)
{
var formula = GetCellRefFormula(i);
worksheet.Cells[$"J{i}"].Formula = formula;
}
// save worksheet/workbook
private string GetCellRefFormula(int i)
{
return $"\"Row \"&ROW(D{i})";
}
When I open the workbook I get the following errors:
Removed Records: Formula from /xl/worksheets/sheet1.xml part
Removed Records: Formula from /xl/worksheets/sheet7.xml part
The errors are certainly caused by the string returned from GetCellRefFormula(): if I don't set these formulas, or if GetCellRefFormula() simply returns an empty string, I get no errors.
I have also tried setting the formula to have an equals sign in front, with the same result.
private string GetCellRefFormula(int i)
{
return $"=\"Row \"&ROW(D{i})";
}
Should I be setting the formula field like this?
Is there a way to see specifically which formulas are incorrect in the Excel repair log?
As far as I can see it only gives the errors I've copied above.
I'm looking to write a large 2d array to an Excel worksheet using C#. If the array is 500 x 500, the code that I would use to write this is as follows:
var startCell = Worksheet.Cells[1, 1];
var endCell = Worksheet.Cells[500, 500];
var writeRange = (Excel.Range)Worksheet.Cells[startCell, endCell];
writeRange.Value = myArray;
I get an exception on this line:
var endCell = Worksheet.Cells[500, 500];
As anybody who has used C# and Excel via COM can testify, the error message received is pretty much useless. I think that the issue is that the underlying data structure used for the worksheet is not of sufficient size to index cell 500,500 when I first create the sheet.
Does anybody know how to achieve the desired result? I'm hoping that there is a simple way to re-size the underlying data structure before creating the Range.
Thanks.
Edit: Error message is:
{"Exception from HRESULT: 0x800A03EC"}
With an Excel error code of -2146827284.
Update: The link supplied in the comments below alluded to an issue with opening the Excel sheet in compatibility mode. This does seem to be the problem. If I save the document in .xlsx or .xlsm format before running my code, it seems to work. My issue is that I cannot expect my users to do this each time. Is there a programmatic way of achieving this? Would it simply be a case of opening the file, checking the extension and then saving it in the new format if needed?
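The compatibility-mode explanation fits the grid limits of the two formats: a worksheet in the Excel 97-2003 (.xls) format has 65,536 rows and 256 columns, so column 500 does not exist there, while the Open XML formats (.xlsx/.xlsm) allow 1,048,576 rows and 16,384 columns. A rough sketch of a bounds check follows; the class and method names are invented for illustration and it does not touch Excel at all.

```csharp
public static class SheetLimits
{
    // Grid size of the Excel 97-2003 binary format (.xls).
    public const int XlsMaxRows = 65536;
    public const int XlsMaxColumns = 256;

    // Grid size of the Open XML formats (.xlsx, .xlsm).
    public const int OpenXmlMaxRows = 1048576;
    public const int OpenXmlMaxColumns = 16384;

    // True when the 1-based cell address exists in a legacy .xls sheet.
    public static bool FitsLegacyXls(int row, int column)
    {
        return row >= 1 && row <= XlsMaxRows
            && column >= 1 && column <= XlsMaxColumns;
    }

    // True when the 1-based cell address exists in an .xlsx/.xlsm sheet.
    public static bool FitsOpenXml(int row, int column)
    {
        return row >= 1 && row <= OpenXmlMaxRows
            && column >= 1 && column <= OpenXmlMaxColumns;
    }
}
```

Here SheetLimits.FitsLegacyXls(500, 500) is false and SheetLimits.FitsOpenXml(500, 500) is true. Inside the interop code itself, the same information should be available from the open sheet via Worksheet.Rows.Count and Worksheet.Columns.Count, and saving with Workbook.SaveAs and the XlFileFormat.xlOpenXMLWorkbook file format is the usual way to convert, though I have not verified how that behaves for a workbook that is currently open in compatibility mode.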
Found a solution.
Instead of using Worksheet.Cells[x, x], use Worksheet.get_Range(x, x).
I just wrote a small example that worked for me, adapted from an answer originally found on SO. I had to adapt it because in my Interop assembly (Excel 14 Object Library) there is no longer a Worksheet.get_Range(.., ..) method.
var startCell = ws.Cells[1, 1];
int row = 500, col = 500;
var endCell = ws.Cells[row, col];
try
{
// access range by Property and cells indicating start and end
var writeRange = ws.Range[startCell, endCell];
writeRange.Value = myArray;
}
catch (COMException ex)
{
Debug.WriteLine(ex.Message);
Debugger.Break();
}
I wanted to ask if there is a practical way of adding multiple hyperlinks to an Excel worksheet with C#. I want to generate a list of websites and attach hyperlinks to them, so the user can click a hyperlink and get to that website.
So far I have come up with a simple nested for statement, which loops through every cell in a given Excel range and adds a hyperlink to that cell:
for (int i = 0; i < _range.Rows.Count; i++)
{
Microsoft.Office.Interop.Excel.Range row = _range.Rows[i];
for (int j = 0; j < row.Cells.Count; j++)
{
Microsoft.Office.Interop.Excel.Range cell = row.Cells[j];
cell.Hyperlinks.Add(cell, adresses[i, j], _optionalValue, _optionalValue, _optionalValue);
}
}
The code is working as intended, but it is extremely slow due to thousands of calls to the Hyperlinks.Add method.
One thing that intrigues me is that the set_Value method from Office.Interop.Excel can add thousands of strings with one simple call, but there is no similar method for adding hyperlinks (Hyperlinks.Add can add just one hyperlink).
So my question is: is there some way to optimize adding hyperlinks to an Excel file in C# when you need to add a large number of them?
Any help would be appreciated.
I am using VS2010 and MS Excel 2010.
I had the very same problem (adding 300 hyperlinks via Range.Hyperlinks.Add takes approx. 2 min).
The slow runtime is caused by the many Range instances.
Solution:
Use a single range instance and add Hyperlinks with the "=HYPERLINK(target, [friendlyName])" Excel-Formula.
Example:
List<string> urlsList = new List<string>();
urlsList.Add("http://www.gin.de");
// ^^ n times ...
// create shaped array with content (one row per URL, one column)
object[,] content = new object[urlsList.Count, 1];
for (int i = 0; i < urlsList.Count; i++)
{
// .NET arrays are 0-based; Excel maps the array onto the range by position
content[i, 0] = string.Format("=HYPERLINK(\"{0}\")", urlsList[i]);
}
// get Range
// Excel rows start at 1, so N URLs fill A1:AN
string rangeDescription = string.Format("A1:A{0}", urlsList.Count);
Xl.Range xlRange = worksheet.Range[rangeDescription, XlTools.missing];
// set value finally
xlRange.Value2 = content;
... takes just 1 sec ...
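If the links also need display text, the same single-assignment idea works with the optional second argument of HYPERLINK. Below is a small sketch of building the formula array, with the names HyperlinkFormulas and BuildColumn made up for illustration. It doubles any embedded quotation marks, which is how Excel escapes them inside a string literal.

```csharp
using System;
using System.Collections.Generic;

public static class HyperlinkFormulas
{
    // Wraps a value in quotes and doubles embedded quotes for use in a formula.
    private static string Quote(string value)
    {
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Builds an N x 1 array of =HYPERLINK(...) formulas, ready to be assigned
    // to a single-column range in one call. friendlyNames may be null.
    public static object[,] BuildColumn(IList<string> urls, IList<string> friendlyNames)
    {
        if (urls == null) throw new ArgumentNullException(nameof(urls));
        if (friendlyNames != null && friendlyNames.Count != urls.Count)
            throw new ArgumentException("One friendly name per URL is expected.");

        var content = new object[urls.Count, 1];
        for (int i = 0; i < urls.Count; i++)
        {
            content[i, 0] = friendlyNames == null
                ? "=HYPERLINK(" + Quote(urls[i]) + ")"
                : "=HYPERLINK(" + Quote(urls[i]) + "," + Quote(friendlyNames[i]) + ")";
        }
        return content;
    }
}
```

The resulting array would then be assigned to a range of matching size, as with xlRange.Value2 = content above.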