Writing a large 2d array to Excel - c#

I'm looking to write a large 2d array to an Excel worksheet using C#. If the array is 500 x 500, the code that I would use to write this is as follows:
var startCell = Worksheet.Cells[1, 1];
var endCell = Worksheet.Cells[500, 500];
var writeRange = (Excel.Range)Worksheet.Cells[startCell, endCell];
writeRange.Value = myArray;
I get an exception on this line:
var endCell = Worksheet.Cells[500, 500];
As anybody who has used C# and Excel via COM can testify, the error message received is pretty much useless. I think that the issue is that the underlying data structure used for the worksheet is not of sufficient size to index cell 500,500 when I first create the sheet.
Does anybody know how to achieve the desired result? I'm hoping that there is a simple way to re-size the underlying data structure before creating the Range.
Thanks.
Edit: Error message is:
{"Exception from HRESULT: 0x800A03EC"}
With an Excel error code of -2146827284.
Update: The link supplied in the comments below alluded to an issue with opening the Excel sheet in compatibility mode. This does seem to be the problem. If I save the document in .xlsx or .xlsm format before running my code, it works. My issue is that I cannot expect my users to do this each time. Is there a programmatic way of achieving this? Would it simply be a case of opening the file, checking the extension and then saving it in the new format if needed?

Found a solution.
Use Worksheet.get_Range(x, x) instead of Worksheet.Cells[x, x].

I just wrote a small example that worked for me, based on an answer originally found on SO. I had to adapt that answer because my Interop assembly (Excel 14 Object Library) no longer has a Worksheet.get_Range(.., ..) method.
var startCell = ws.Cells[1, 1];
int row = 500, col = 500;
var endCell = ws.Cells[row, col];
try
{
    // access range by Property and cells indicating start and end
    var writeRange = ws.Range[startCell, endCell];
    writeRange.Value = myArray;
}
catch (COMException ex)
{
    Debug.WriteLine(ex.Message);
    Debugger.Break();
}
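The compatibility-mode explanation in the question's update matches Excel's sheet limits: a workbook in the old .xls format has 65,536 rows and 256 columns per sheet, so cell (500, 500) does not exist there, while .xlsx/.xlsm sheets have 1,048,576 rows and 16,384 columns. Below is a minimal sketch of a pre-check under that assumption. `SheetLimits` is a hypothetical helper, not part of the Interop API, and the sheet size is assumed to be read from the live worksheet (for example `ws.Rows.Count` and `ws.Columns.Count`).

```csharp
using System;

// Hypothetical helper, not part of Microsoft.Office.Interop.Excel.
public static class SheetLimits
{
    // Returns true when a 2D array of the given shape fits on a sheet of
    // sheetRows x sheetColumns cells, written from the 1-based cell
    // (startRow, startColumn).
    public static bool Fits(Array data, int sheetRows, int sheetColumns,
                            int startRow = 1, int startColumn = 1)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (data.Rank != 2) throw new ArgumentException("Expected a 2D array.", nameof(data));

        int lastRow = startRow + data.GetLength(0) - 1;
        int lastColumn = startColumn + data.GetLength(1) - 1;
        return startRow >= 1 && startColumn >= 1
            && lastRow <= sheetRows && lastColumn <= sheetColumns;
    }
}
```

With Interop this could be called as `SheetLimits.Fits(myArray, ws.Rows.Count, ws.Columns.Count)` before building the range. If it returns false for a compatibility-mode workbook, saving the workbook in the newer format first, as the update suggests, is one option.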

Related

Showing #value! before enable editing on excel if I write formula using epplus

Using C# on .NET Core, I am updating an existing Excel template with data and formulas using the EPPlus library, version 4.5.3.3.
As the screenshots below show, all formula cells contain '#VALUE!' even after calling the Calculate method in the C# code (for reference, I attached a screenshot of the XML taken just after downloading the Excel file, before opening it). Automatic calculation is also enabled in Excel.
One blog post suggested checking the XML info.
My requirement is to upload this Excel file to a SharePoint site through code and read the formula cells for other operations without opening the file manually.
Is there any other way to calculate the formula cells from code and update the cell values?
I also went through "Why won't this formula calculate unless i double click a cell?", but no luck.
using (ExcelPackage p = new ExcelPackage())
{
    MemoryStream stream = new MemoryStream(byteArray);
    p.Load(stream);
    ExcelWorksheet worksheet = p.Workbook.Worksheets.FirstOrDefault(a => a.Name == "InputTemplate");
    // Only touch the worksheet after the null check; FirstOrDefault returns null if the sheet is missing
    if (worksheet != null)
    {
        worksheet.Cells["A3"].Value = company.CompanyName; // company name
        worksheet.Cells["B3"].Value = product.Name; // product name
        worksheet.Cells["C3"].Value = product.NetWeight;
        worksheet.Cells["D3"].Value = product.ServingSize;
        worksheet.Cells["E3"].Value = 0;
        var produceAndIngredientDetailsForExcelList = await GetProduceAndIngredientDetails(companyId, productId);
        // rowIndex will be 3
        WriteProduceAndIngredientDetailsInExcel(worksheet, produceAndIngredientDetailsForExcelList);
        // rowIndex is updated based on the number of produce rows, then the aggregates
        StageWiseAggregate(worksheet, produceAndIngredientDetailsForExcelList);
        // write the Total Impacts row
        TotalImpactsFormulaSection(worksheet);
        // recalculate once, after all values and formulas have been written
        worksheet.Calculate();
    }
    byte[] bin = p.GetAsByteArray();
    return bin;
}
Formula Code
var columnIndex = 22;///"V" Column
for (; columnIndex <= 27; columnIndex++)
{
var columnName = GetExcelColumnName(columnIndex);
worksheet.Cells[currentRowIndex, columnIndex].Formula = $"=SUBTOTAL(109,{columnName}{firstRowIndex}:{columnName}{currentRowIndex - 1})";
}
Found the solution for this issue from my architect (kudos to him).
I was writing formulas the wrong way by blindly following tutorials like
https://riptutorial.com/epplus/example/26433/add-formulas-to-a-cell
Note: don't follow the link shown above.
EPPlus formulas should not start with "=". I just removed it and it worked like a charm:
var columnIndex = 22;///"V" Column
for (; columnIndex <= 27; columnIndex++)
{
var columnName = GetExcelColumnName(columnIndex);
worksheet.Cells[currentRowIndex, columnIndex].Formula = $"SUBTOTAL(109,{columnName}{firstRowIndex}:{columnName}{currentRowIndex - 1})";
}
Here is the official tutorial, which shows this correctly:
https://www.epplussoftware.com/en/Developers/ (check the second slide)
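The snippets above call GetExcelColumnName without showing it. As a rough sketch of what such a helper might look like, together with a formula builder that never emits a leading "=", consider the following. The class name `FormulaHelper` and the method `SubtotalSum` are hypothetical, and this `GetExcelColumnName` is an assumed implementation, not necessarily the one used in the question.

```csharp
using System;

// Hypothetical helper class; not part of EPPlus.
public static class FormulaHelper
{
    // Converts a 1-based column index to its letter name,
    // e.g. 1 -> "A", 22 -> "V", 27 -> "AA".
    public static string GetExcelColumnName(int columnIndex)
    {
        if (columnIndex < 1) throw new ArgumentOutOfRangeException(nameof(columnIndex));
        string name = string.Empty;
        while (columnIndex > 0)
        {
            int remainder = (columnIndex - 1) % 26;
            name = (char)('A' + remainder) + name;
            columnIndex = (columnIndex - 1) / 26;
        }
        return name;
    }

    // Builds the SUBTOTAL formula text for one column, without a leading "=".
    public static string SubtotalSum(int columnIndex, int firstRow, int lastRow)
    {
        string col = GetExcelColumnName(columnIndex);
        return $"SUBTOTAL(109,{col}{firstRow}:{col}{lastRow})";
    }
}
```

The result would then be assigned as in the answer, for example `worksheet.Cells[currentRowIndex, columnIndex].Formula = FormulaHelper.SubtotalSum(columnIndex, firstRowIndex, currentRowIndex - 1);`.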

How to efficiently retrieve all strings from a large Excel document

The Excel spreadsheet should be read by .NET. It is very efficient to read all values from the active range by using the Value property. This transfers all values into a two-dimensional array with a single call to Excel.
However, reading strings is not possible for a range that contains more than a single cell. Therefore we have to iterate over all cells and use the Text property. This performs very poorly for larger documents.
The reason for using strings rather than values is to obtain the text in its displayed format (for instance for dates or the number of digits).
Here is a sample code written in C# to demonstrate the approach.
static void Main(string[] args)
{
    // attach to the Excel instance that is already running
    Excel.Application xlApp = (Excel.Application)System.Runtime.InteropServices.Marshal.GetActiveObject("Excel.Application");
    var worksheet = xlApp.ActiveSheet;
    var cells = worksheet.UsedRange;
    // read all values in array -> fast
    object[,] arrayValues = cells.Value;
    // create array for text with the same dimensions and lower bounds
    object[,] arrayText = (object[,])Array.CreateInstance(typeof(object),
        new int[] { arrayValues.GetLength(0), arrayValues.GetLength(1) },
        new int[] { arrayValues.GetLowerBound(0), arrayValues.GetLowerBound(1) });
    // read text for each cell -> slow
    for (int row = arrayValues.GetLowerBound(0); row <= arrayValues.GetUpperBound(0); ++row)
    {
        for (int col = arrayValues.GetLowerBound(1); col <= arrayValues.GetUpperBound(1); ++col)
        {
            object obj = cells[row, col].Text;
            arrayText[row, col] = obj;
        }
    }
}
The question is whether there is a more efficient way to read the complete string content from an Excel document. One idea was to use cells.Copy to copy the content to the clipboard and read it from there. However, this has some restrictions and could of course interfere with users who are working with the clipboard at the same time. So I wonder if there are better approaches to solve this performance issue.
You can use the code below:
using (MSExcel.Application app = MSExcel.Application.CreateApplication())
{
MSExcel.Workbook book1 = app.Workbooks.Open( this.txtOpen_FilePath.Text);
MSExcel.Worksheet sheet = (MSExcel.Worksheet)book1.Worksheets[1];
MSExcel.Range range = sheet.GetRange("A1", "F13");
object value = range.Value; //the value is boxed two-dimensional array
}
The code is taken from this post. It should be much more efficient than your code, but may not be the best.

"Exception from HRESULT: 0x800A03EC" while using Excel OpenFile()

I receive this exception every time I try to open a CSV file (whether a .csv or .txt). Because the exception is so generic, I haven't been able to figure out what is wrong.
My program is a WinForms application that FTPs data from an IBM mainframe, parses it, and generates custom objects which are ideally viewed in a spreadsheet, for easy navigation and filtering. First I tried writing my data directly to a spreadsheet, but that took a really long time for large data sets. For example, for 100 of my custom objects, my program took about 20 seconds to write them to a spreadsheet. It's not unusual for my end user to need spreadsheets for thousands to tens of thousands of custom objects.
Since I don't want the user to sit idly for an hour while the spreadsheet is created, I looked into the cause of the slowness. I read on StackExchange that Interop.Excel calls are very slow, so in order to limit the number of them, I'm trying a different solution: first I write the data to a CSV, then I use Interop.Excel to open the CSV, maybe do some light formatting like coloring the header and freezing the top row, and then save it as a .xls or .xlsx. But I'm having trouble at the very first step: opening the CSV!
Here is the code:
using Excel = Microsoft.Office.Interop.Excel;
...
private void BuildSpreadsheetFromCsvFiles(string spreadsheetPath, string csvPathTrimmedTranlog, string csvPathFullTranlog)
{
Excel.Application xlApp = new Excel.Application();
try
{
// Set up the arrays of XlColumnDataTypes
Excel.XlColumnDataType[] trimmedDataTypes = new Excel.XlColumnDataType[trimmedTranlog.TagsPresent.Count];
Excel.XlColumnDataType[] fullDataTypes = new Excel.XlColumnDataType[fullTranlog.TagsPresent.Count];
for(int i = 0; i < trimmedDataTypes.Length; i++)
{
trimmedDataTypes[i] = Excel.XlColumnDataType.xlTextFormat;
}
for(int i = 0; i < fullDataTypes.Length; i++)
{
fullDataTypes[i] = Excel.XlColumnDataType.xlTextFormat;
}
xlApp.Workbooks.OpenText(Filename: csvPathTrimmedTranlog, // THROWS EXCEPTION
Origin: Excel.XlPlatform.xlWindows,
DataType: Excel.XlTextParsingType.xlDelimited,
TextQualifier: Excel.XlTextQualifier.xlTextQualifierNone,
Semicolon: true,
FieldInfo: trimmedDataTypes);
Excel.Workbook xlWorkbookTrimmed = xlApp.Workbooks[1];
xlApp.Workbooks.OpenText(Filename: csvPathFullTranlog, // ALSO THROWS EXCEPTION
DataType: Excel.XlTextParsingType.xlDelimited,
Origin: Excel.XlPlatform.xlWindows,
TextQualifier: Excel.XlTextQualifier.xlTextQualifierNone,
Semicolon: true,
FieldInfo: fullDataTypes);
Excel.Workbook xlWorkbookFull = xlApp.Workbooks[1];
Excel.Worksheet xlWorksheetTrimmed = xlWorkbookTrimmed.Worksheets[1];
Excel.Worksheet xlWorksheetFull = xlWorkbookFull.Worksheets[1];
xlWorksheetTrimmed.Copy(Before: xlWorksheetFull);
xlApp.Visible = true;
}
catch(Exception e)
{
xlApp.Quit();
}
}
I tried opening the files with Open() instead of OpenText(), and that does technically work. However, for my purposes, I cannot use Open() since in doing so all the columns will be read as General format. My data contains long strings of numbers (20 digits or so) which need to be displayed as text, and Open() will display those numbers with scientific notation.
More information:
When the workbook is created, it is created in Excel 2016.
I trimmed the .txt file to contain 20 rows and 6 columns, and the problem persists.
The .txt file can be opened in Excel 2016 manually, but not programmatically within my WinForms application.
UPDATE
I've narrowed the issue to the last parameter in OpenText(), the FieldInfo: parameter. With that parameter omitted, the file opens successfully. Unfortunately, as I said earlier, the data must be formatted as Text, instead of the default General.
The problem is the FieldInfo parameter. According to the API:
"When the data is delimited, this argument is an array of two-element arrays, with each two-element array specifying the conversion options for a particular column. The first element is the column number (1-based), and the second element is one of the XlColumnDataType constants specifying how the column is parsed."
I rewrote the "Set up the arrays of XlColumnDataTypes" section in the original code to the following, and now it works as intended.
// Set up the arrays of XlColumnDataTypes
int[,] trimmedDataTypes = new int[trimmedTranlog.TagsPresent.Count, 2];
for(int i = 1; i <= trimmedDataTypes.Length / 2; i++)
{
trimmedDataTypes[i - 1, 0] = i;
trimmedDataTypes[i - 1, 1] = (int)Excel.XlColumnDataType.xlTextFormat;
}
int[,] fullDataTypes = new int[fullTranlog.TagsPresent.Count, 2];
for(int i = 1; i <= fullDataTypes.Length / 2; i++)
{
fullDataTypes[i - 1, 0] = i;
fullDataTypes[i - 1, 1] = (int)Excel.XlColumnDataType.xlTextFormat;
}
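Since the same loop is written twice, it could be folded into one small helper. The following is only a sketch: `FieldInfoBuilder` and `AllColumnsAs` are hypothetical names, and the data type is passed as a plain int so the helper has no dependency on the Interop assembly.

```csharp
using System;

// Hypothetical helper; builds the N x 2 array shape described in the API quote.
public static class FieldInfoBuilder
{
    // Column 0 of each row holds the 1-based column number,
    // column 1 holds the numeric value of the XlColumnDataType constant.
    public static int[,] AllColumnsAs(int columnCount, int dataType)
    {
        if (columnCount < 0) throw new ArgumentOutOfRangeException(nameof(columnCount));
        var fieldInfo = new int[columnCount, 2];
        for (int i = 0; i < columnCount; i++)
        {
            fieldInfo[i, 0] = i + 1;
            fieldInfo[i, 1] = dataType;
        }
        return fieldInfo;
    }
}
```

In the code above it would be called as `FieldInfoBuilder.AllColumnsAs(trimmedTranlog.TagsPresent.Count, (int)Excel.XlColumnDataType.xlTextFormat)` and the result passed as the FieldInfo argument.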

How to get current selected excel sheet data without using oledb connection in c#

I am working on a VSTO application and have one open workbook. I want to read the selected sheet's data from that workbook without using any OLE DB connection. Is there any way to read the data and store it in a DataTable?
The tricky part is figuring out if the current selection is valid for what you want to do. In Excel's VBA world you'd work with the VBA information function TypeName to determine whether the current Selection is a Range object. C# doesn't have a direct equivalent, so you have to work around it. If all you're interested in is a Range, then you can check whether a direct conversion to an Excel.Range is valid and proceed from there. A multi-cell Range returns its values as an array, which you can put in a data set.
The following code sample shows how to test the Selection and work with the resulting array. It doesn't do anything with a dataset; that would be a different question.
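object oSel = Globals.ThisAddIn.Application.Selection;
// "as" yields null when the selection is not a Range (e.g. a chart or shape)
Excel.Range rngSelection = oSel as Excel.Range;
if (rngSelection != null)
{
    // Value2 is a 1-based 2D array for a multi-cell range, but a single value for one cell
    object[,] data = rngSelection.Value2 as object[,];
    if (data != null)
    {
        // dimension 0 = rows, dimension 1 = columns
        for (int r = data.GetLowerBound(0); r <= data.GetUpperBound(0); r++)
        {
            for (int c = data.GetLowerBound(1); c <= data.GetUpperBound(1); c++)
            {
                // empty cells come back as null, so avoid calling ToString() on them directly
                System.Diagnostics.Debug.Print(System.Convert.ToString(data[r, c]));
            }
        }
    }
}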
An alternative to using the cast test involves working with the COM APIs. If you needed to take various actions depending on the type of Selection this approach might be more effective. It's described here: https://www.add-in-express.com/creating-addins-blog/2011/12/20/type-name-system-comobject/
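The question also asks about storing the data in a DataTable, which the answer leaves open. One possible way to do that, once the array has been read, is sketched below. `RangeData.ToDataTable` is a hypothetical helper, and the column names Column1..ColumnN are an arbitrary choice; it assumes nothing about the array except that it is two-dimensional.

```csharp
using System;
using System.Data;

// Hypothetical helper; works on any 2D object array, including the
// 1-based arrays that Excel returns from Range.Value2.
public static class RangeData
{
    public static DataTable ToDataTable(object[,] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));

        var table = new DataTable();
        int rowLo = data.GetLowerBound(0), rowHi = data.GetUpperBound(0);
        int colLo = data.GetLowerBound(1), colHi = data.GetUpperBound(1);

        // One untyped column per array column: Column1, Column2, ...
        for (int c = colLo; c <= colHi; c++)
            table.Columns.Add("Column" + (c - colLo + 1), typeof(object));

        for (int r = rowLo; r <= rowHi; r++)
        {
            DataRow row = table.NewRow();
            for (int c = colLo; c <= colHi; c++)
                row[c - colLo] = data[r, c] ?? DBNull.Value; // empty cells become DBNull
            table.Rows.Add(row);
        }
        return table;
    }
}
```

A single-cell selection returns a scalar rather than an array from Value2, so that case would need to be handled before calling the helper.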

How to get all the rows which contain data in a particular column in Excel from C#

I have a data set in Excel and am using C# to open the worksheet and access some of the data.
I am trying to get all the rows that contain data from a particular column. For example in column B starting from cell 'B3' going down I want to store all the rows that contain data in a collection like an Array.
This is what I have so far:
Application excelApplication;
_Workbook workbook;
_Worksheet sheet;
excelApplication = new Excel.Application
{
Visible = true,
ScreenUpdating = true
};
workbook = excelApplication.Workbooks.Open(@"C:\Documents and Settings\user\Desktop\Book1.xls");
sheet = (Worksheet)workbook.Worksheets[2];
Excel.Range range = sheet.Range["b3:b145"];
foreach (Range cell in range)
{
// Do something with rows which contain data
}
As you can see above, I have hard-coded the range from B3 to B145, which I don't want. I want to get all the rows in column B which contain data, starting from B3.
How would I achieve this?
In general, when I get stuck in these situations I record a macro and convert the VBA code to C#. The object model in VSTO is pretty much exactly the same (remember this, it's a great tip), and from .NET 4.0 onwards optional parameters save a lot of code.
In your particular instance, I expect that the larger the spreadsheet, the longer it will take to read the cells in column B one by one using VSTO. My advice is to use this technique to read them all at once:
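// Work out the last used row on the sheet (sheet is the worksheet from the question):
Excel.Range used = sheet.UsedRange;
int lastRow = used.Row + used.Rows.Count - 1;
// Get all the column B values from B3 down in a single call
// (empty cells come back as null entries in the array):
object[,] objectArray = sheet.Range["B3:B" + lastRow.ToString()].Value2;
// An array can be written back to a range in one call the same way, e.g. rngName.Value2 = objectArray;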
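Once the column has been read into an array, picking out the rows that actually contain data needs no further calls to Excel. A minimal sketch, assuming a hypothetical helper `ColumnData.RowsWithData` and treating both null and empty strings as "no data":

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; operates on the array only, not on Excel objects.
public static class ColumnData
{
    // columnValues: single-column array read from a range whose first cell
    // is on worksheet row firstSheetRow (3 for a range starting at B3).
    // Returns the worksheet row numbers of the non-empty cells.
    public static List<int> RowsWithData(object[,] columnValues, int firstSheetRow)
    {
        if (columnValues == null) throw new ArgumentNullException(nameof(columnValues));

        var rows = new List<int>();
        int lo = columnValues.GetLowerBound(0), hi = columnValues.GetUpperBound(0);
        int col = columnValues.GetLowerBound(1);
        for (int i = lo; i <= hi; i++)
        {
            object value = columnValues[i, col];
            if (value != null && value.ToString().Length > 0)
                rows.Add(firstSheetRow + (i - lo));
        }
        return rows;
    }
}
```

For the array read above this would be `ColumnData.RowsWithData(objectArray, 3)`.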
