C# Excel VSTO: cannot convert type 'string' to 'System.Array'

I am new to VSTO C# Excel add-ins. I want to find the total count of non-null/non-empty rows in a range. My code looks at the range "A4:E4" and counts the total number of rows.
This is the code:
private void button1_Click(object sender, RibbonControlEventArgs e)
{
Workbook workbook = Globals.ThisAddIn.GetWorkBook("c:\\temp\\testfile.xlsx");
Worksheet mergeSheet = workbook.Worksheets["Data"];
Excel.Range mergeCells = mergeSheet.Range["A4:E4"];
var colValues = (System.Array)mergeCells.Columns[1].Cells.Value;
var strArray = colValues.OfType<object>().Select(o => o.ToString()).ToArray();
var rowCount = strArray.Length;
}
public Excel.Workbook GetWorkBook(string pathName)
{
return (Excel.Workbook)Application.Workbooks.Open(pathName);
}
I get the following error on the line var colValues = (System.Array)mergeCells.Columns[1].Cells.Value;:
Microsoft.CSharp.RuntimeBinder.RuntimeBinderException: 'Cannot convert type 'string' to 'System.Array''
It works when I have two rows in my range. I have hardcoded the range A4:E4 to reproduce the error. My Excel sheet (testfile.xlsx) looks like below:
Any ideas how I can resolve this?
The same line of code works when the range has two rows, e.g. when the following line is changed to:
Excel.Range mergeCells = mergeSheet.Range["A4:E5"];

The problem is that Range.Value can return different types of objects. Among others, it can return
a single value of type String, if the range contains a single cell containing a string or
an array of values, if the range contains more than one cell.
The simplest solution would be to count the number of cells and "wrap" the special "single value" case in an array:
var range = mergeCells.Columns[1].Cells;
var values = (range.Count == 1)
? new object[] { range.Value }
: ((System.Array)range.Value).Cast<object>();
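The same idea can be pulled out of the Excel call so it is easy to check in isolation. The following is only a sketch: RangeValues, Flatten and CountNonEmpty are hypothetical names that are not part of the interop library, and the argument is assumed to be whatever Range.Value returned (null, a single value, or an array).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeValues
{
    // Normalizes whatever Range.Value handed back into a flat list of cell values.
    public static List<object> Flatten(object value)
    {
        if (value == null)
            return new List<object>();             // single empty cell
        if (value is Array array)
            return array.Cast<object>().ToList();  // multi-cell range (array of cells)
        return new List<object> { value };         // single cell holding a value
    }

    // Counts the cells that are neither null nor an empty string.
    public static int CountNonEmpty(object value)
    {
        return Flatten(value).Count(v => v != null && v.ToString().Length > 0);
    }
}
```

With this in place, the question's count could be written as RangeValues.CountNonEmpty(range.Value), and it would not matter whether the range has one cell or many.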

This line is causing me trouble, var colValues = (System.Array)mergeCells.Columns[1].Cells.Value
This row has only one value. Note that same line works when mergeCells range has two rows.
Why it does not work for a single-cell range:
The Value of a single cell is not an array (it's a Long, String, DateTime, etc.) and can't be cast to an array in that manner. You can see this by testing a cast like the one below, which throws an InvalidCastException at run time:
var myArray = (System.Array)(object)"hello";
The same failure occurs for other types.
Why it works for a multi-cell range:
The Value of a multi-cell range is a variant array of the individual cell values, which can be cast to System.Array.
There may be a better resolution, but at least you should be able to do like:
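// 'var' needs an initializer, so declare the type explicitly
System.Array colValues;
if (mergeCells.Columns[1].Cells.Count > 1)
{
colValues = (System.Array)mergeCells.Columns[1].Cells.Value;
}
else
{
// NB: Excel cell indexes are 1-based, so the only cell is Cells[1], not Cells[0].
// Wrapping the value in a one-element array avoids Split(), which only works
// on strings and would break a string containing spaces into several elements.
colValues = new object[] { mergeCells.Columns[1].Cells[1].Value };
}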

Related

ExcelDNA throwing exception accessing Range.Value2

I am porting an Excel add-in (which used a shim loader) to ExcelDNA. I have seen the other SO (and off-SO) questions, but nothing resolves my problem, and I'm hoping there are newer solutions.
The code is simple.
[ExcelFunction(Name="DoSomething")]
string DoSomething()
{
var xl = ExcelDna.Application;
var callerCell = xl.Caller;
var row = getRow(excelReference.RowFirst+1, callerCell.WorkSheet) ;
}
In GetRow():
var row = (Range)worksheet.Rows[row];
var cell = (Range)bracketRow.Columns[4];
When I check debugger, I can see the retrieved cell is 100% correct because cell.FormulaLocal matches the excel row and column formula.
The value in FormulaLocal is "OtherSheet!A12".
But for some reason, whenever I try cell.Value2, it throws a COMException and nothing else. This is not a multithreaded application and I can't understand why this is happening.
Any ideas?
EDIT:
When I modify the formula to the value it should have gotten had the sheet reference been successful, it doesn't throw.
EDIT 2:
I got around this by adding IsMacroType=true attribute to the excel function. But now xl.Caller returns null, argh
Two issues needed solving:
range.Value2 threw a COMException if the cell has an invalid value e.g. #VALUE in excel.
range.Value2 threw a COMException if the cell referenced another worksheet in the same workbook e.g. "OtherSheet!A2"
To solve this, I set the IsMacroType attribute to true:
[ExcelFunction(Name="DoSomething",IsMacroType=true)]
string DoSomething()
{
var xl = ExcelDna.Application;
var callerCell = xl.Caller;
var row = getRow(excelReference.RowFirst+1, callerCell.WorkSheet) ;
}
The problem now, though, is that IsMacroType causes xl.Caller to return null.
I got around this by:
ExcelReference reference = (ExcelReference)XlCall.Excel(XlCall.xlfCaller);
string sheetName = (string)XlCall.Excel(XlCall.xlSheetNm,reference);
int index = sheetName.IndexOf(']', 0) + 1;
int endIndex = sheetName.Length - index;
sheetName = sheetName.Substring(index, endIndex);
var worksheet = (Worksheet)xl.ActiveWorkbook.Sheets[sheetName];
This is my first rodeo in the Excel world. Are there any side effects to enabling IsMacroType? I saw @Govert expressing some concerns about undefined behavior...
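The sheet-name handling in the workaround above can be isolated into a small helper. This is a sketch that assumes xlSheetNm returns the name in the form "[Book1.xlsx]Sheet1", which is what the IndexOf(']') logic above implies. SheetNames and StripWorkbookPrefix are hypothetical names, not ExcelDNA APIs.

```csharp
using System;

public static class SheetNames
{
    // Strips a leading "[WorkbookName]" prefix, if present, from a sheet reference.
    // A name without a closing bracket is returned unchanged.
    public static string StripWorkbookPrefix(string fullName)
    {
        if (fullName == null)
            throw new ArgumentNullException(nameof(fullName));

        int close = fullName.IndexOf(']');
        return close < 0 ? fullName : fullName.Substring(close + 1);
    }
}
```

Keeping this as a separate function lets the string handling be checked without Excel running, for example with "[Book1.xlsx]Sheet1" as input.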

How do I read in a single column from an Excel spreadsheet?

I'm trying to read a single column from an Excel document. I'd like to read the entire column, but obviously only store the cells that have data. I'd also like to handle the case where one or more cells in the column are empty but there are more values farther down the column. For example:
| Column1 |
|---------|
|bob |
|tom |
|randy |
|travis |
|joe |
| |
|jennifer |
|sam |
|debby |
If I had that column, I don't mind having a value of "" for the row after joe, but I do want it to keep getting values after the blank cell. However, I do not want it to go on for 35,000 lines past debby assuming debby is the last value in the column.
It is also safe to assume that this will always be the first column.
So far, I have this:
Excel.Application myApplication = new Excel.Application();
myApplication.Visible = true;
Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);
foreach (Excel.Range r in myRange)
{
MessageBox.Show(r.Text);
}
I've found lots of examples from older versions of .NET that do similar things, but not exactly this, and wanted to make sure I did something that's more modern (assuming the method one would use to do this has changed some amount).
My current code reads the entire column, but includes blank cells after the last value.
EDIT1
I liked Isedlacek's answer below, but I do have a problem with it, that I'm not certain is specific to his code. If I use it in this way:
Excel.Application myApplication = new Excel.Application();
myApplication.Visible = true;
Excel.Workbook myWorkbook = myApplication.Workbooks.Open("C:\\aFileISelect.xlsx");
Excel.Worksheet myWorksheet = myWorkbook.Sheets["aSheet"] as Excel.Worksheet;
Excel.Range myRange = myWorksheet.get_Range("A:A", Type.Missing);
var nonEmptyRanges = myRange.Cast<Excel.Range>()
.Where(r => !string.IsNullOrEmpty(r.Text));
foreach (var r in nonEmptyRanges)
{
MessageBox.Show(r.Text);
}
MessageBox.Show("Finished!");
the Finished! MessageBox never shows. I'm not sure why that happens, but it appears to never actually finish searching. I tried adding a counter to the loop to see if it was just continuously searching through the column, but it doesn't appear to be ... it appears to just stop.
Where the Finished! MessageBox is, I tried to just close the workbook and spreadsheet, but that code never ran (as expected, since the MessageBox never ran).
If I close the Excel spreadsheet manually, I get a COMException:
COMException was unhandled by user code
Additional information: Exception from HRESULT: 0x803A09A2
Any ideas?
The answer depends on whether you want to get the bounding range of the used cells or if you want to get the non-null values from a column.
Here's how you can efficiently get the non-null values from a column. Note that reading the entire tempRange.Value2 property at once is MUCH faster than reading cell by cell, but the tradeoff is that the resulting array can use a lot of memory.
private static IEnumerable<object> GetNonNullValuesInColumn(_Application application, _Worksheet worksheet, string columnName)
{
// get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);
// if there is no intersection, there are no values in the column
if (tempRange == null)
yield break;
// get complete set of values from the temp range (potentially memory-intensive)
var value = tempRange.Value2;
// if value is NULL, it's a single cell with no value
if (value == null)
yield break;
// if value is not an array, the temp range was a single cell with a value
if (!(value is Array))
{
yield return value;
yield break;
}
// otherwise, the value is a 2-D array
var value2 = (object[,]) value;
var rowCount = value2.GetLength(0);
for (var row = 1; row <= rowCount; ++row)
{
var v = value2[row, 1];
if (v != null)
yield return v;
}
}
Here's an efficient way to get the minimum range that contains the non-empty cells in a column. Note that I am still reading the entire set of tempRange values at once, and then I use the resulting array (if multi-cell range) to determine which cells contain the first and last values. Then I construct the bounding range after having figured out which rows have data.
private static Range GetNonEmptyRangeInColumn(_Application application, _Worksheet worksheet, string columnName)
{
// get the intersection of the column and the used range on the sheet (this is a superset of the non-null cells)
var tempRange = application.Intersect(worksheet.UsedRange, (Range) worksheet.Columns[columnName]);
// if there is no intersection, there are no values in the column
if (tempRange == null)
return null;
// get complete set of values from the temp range (potentially memory-intensive)
var value = tempRange.Value2;
// if value is NULL, it's a single cell with no value
if (value == null)
return null;
// if value is not an array, the temp range was a single cell with a value
if (!(value is Array))
return tempRange;
// otherwise, the temp range is a 2D array which may have leading or trailing empty cells
var value2 = (object[,]) value;
// get the first and last rows that contain values
var rowCount = value2.GetLength(0);
int firstRowIndex;
for (firstRowIndex = 1; firstRowIndex <= rowCount; ++firstRowIndex)
{
if (value2[firstRowIndex, 1] != null)
break;
}
int lastRowIndex;
for (lastRowIndex = rowCount; lastRowIndex >= firstRowIndex; --lastRowIndex)
{
if (value2[lastRowIndex, 1] != null)
break;
}
// if there are no first and last used row, there is no used range in the column
if (firstRowIndex > lastRowIndex)
return null;
// return the range
return worksheet.Range[tempRange[firstRowIndex, 1], tempRange[lastRowIndex, 1]];
}
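The first/last-row scan used above can be tried without Excel by building a 1-based two-dimensional array by hand, since the arrays returned by Value2 for a multi-cell range start at index 1. The sketch below makes that assumption explicit. ColumnBounds, FindBounds and OneBasedColumn are hypothetical helper names.

```csharp
using System;

public static class ColumnBounds
{
    // Returns the first and last row indexes holding a non-null value in the
    // first column, or null when every cell is empty. Uses the array's own
    // bounds, so it works for 1-based interop arrays and 0-based arrays alike.
    public static (int First, int Last)? FindBounds(object[,] values)
    {
        int lo = values.GetLowerBound(0);
        int hi = values.GetUpperBound(0);
        int col = values.GetLowerBound(1);

        int first = lo;
        while (first <= hi && values[first, col] == null) first++;
        if (first > hi) return null;   // nothing but empty cells

        int last = hi;
        while (values[last, col] == null) last--;  // stops at 'first' at the latest
        return (first, last);
    }

    // Builds a 1-based, single-column array shaped like a Value2 result.
    public static object[,] OneBasedColumn(params object[] cells)
    {
        var result = (object[,])Array.CreateInstance(
            typeof(object), new[] { cells.Length, 1 }, new[] { 1, 1 });
        for (int i = 0; i < cells.Length; i++)
            result[i + 1, 1] = cells[i];
        return result;
    }
}
```

Reading the bounds from the array instead of hardcoding 1 keeps the scan correct even if the values come from a regular zero-based array in a unit test.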
If you don't mind losing the empty rows completely:
var nonEmptyRanges = myRange.Cast<Excel.Range>()
.Where(r => !string.IsNullOrEmpty(r.Text));
foreach (var r in nonEmptyRanges)
{
// handle the r
MessageBox.Show(r.Text);
}
/// <summary>
/// Generic method which reads a column from the <paramref name="workSheetToReadFrom"/> sheet provided.<para />
/// The <paramref name="dumpVariable"/> is the variable upon which the column to be read is going to be dumped.<para />
/// The <paramref name="workSheetToReadFrom"/> is the sheet from which the column is going to be read.<para />
/// The <paramref name="initialCellRowIndex"/>, <paramref name="finalCellRowIndex"/> and <paramref name="columnIndex"/> specify the length of the list to be read and the concrete column of the file from which to perform the reading. <para />
/// Note that the type of data which is going to be read needs to be specified as a generic type argument. The method constrains the generic type arguments which can be passed to it to the types which implement the IConvertible interface provided by the framework (e.g. int, double, string, etc.).
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="dumpVariable"></param>
/// <param name="workSheetToReadFrom"></param>
/// <param name="initialCellRowIndex"></param>
/// <param name="finalCellRowIndex"></param>
/// <param name="columnIndex"></param>
static void ReadExcelColumn<T>(ref List<T> dumpVariable, Excel._Worksheet workSheetToReadFrom, int initialCellRowIndex, int finalCellRowIndex, int columnIndex) where T: IConvertible
{
dumpVariable = ((object[,])workSheetToReadFrom.Range[workSheetToReadFrom.Cells[initialCellRowIndex, columnIndex], workSheetToReadFrom.Cells[finalCellRowIndex, columnIndex]].Value2).Cast<object>().ToList().ConvertAll(e => (T)Convert.ChangeType(e, typeof(T)));
}
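One caveat with the generic method above: Convert.ChangeType throws for a null cell when T is a value type, and the (object[,]) cast fails when the range is a single cell. The variant below is a sketch of one way to handle both cases. It takes the already-read Value2 object as input, and ColumnConversion and ConvertCells are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class ColumnConversion
{
    // Converts every non-empty cell to T. Accepts null (empty single cell),
    // a single value, or the array that a multi-cell range produces.
    public static List<T> ConvertCells<T>(object value) where T : IConvertible
    {
        var result = new List<T>();
        if (value == null)
            return result;

        IEnumerable<object> cells = value is Array array
            ? array.Cast<object>()
            : new[] { value };

        foreach (object cell in cells)
        {
            if (cell == null)
                continue;  // skip empty cells instead of letting ChangeType throw
            result.Add((T)Convert.ChangeType(cell, typeof(T), CultureInfo.InvariantCulture));
        }
        return result;
    }
}
```

Skipping empty cells is a design choice. If row positions matter, the caller would need a nullable element type or a placeholder value instead.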

C# Cannot implicitly convert type System.DBNull to string

Cannot implicitly convert type 'System.DBNull' to 'string'
I am trying to pull data from a worksheet and I get the above error. It happens on the string Text line. What can I do to convert this or ignore the null?
var excel = new Microsoft.Office.Interop.Excel.Application();
Workbook workbook = excel.Workbooks.Open(@"C:\Documents\ANIs.xlsx");
Worksheet worksheet = workbook.Worksheets[1];
Range a1 = worksheet.get_Range("A1","B2");
object rawValue = a1.Value;
string Text = a1.Text; //<--Error Occurs here.
for (int i = 0; i < a1.Count; i++)
{
if (a1.Text != null)
Console.WriteLine("{1}", rawValue, Text);
}
Console.ReadLine();
Basically, you need a conditional statement. Calling ToString() on certain values throws an exception if the value is null. The easiest remedy would be:
if(!(a1 is DBNull))
{
// Do Something
}
Hopefully this clarifies a bit.
// Sample:
var range = worksheet.get_Range("A1","B2");
if(!(range is DBNull))
{
object raw = range.Value;
string text = range.Text;
// Loop here
}
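Also, consider renaming the local variable Text to lowercase text. Text is not a reserved word in C#, but the capitalized name is easily confused with the Range.Text property and breaks the usual camelCase convention for locals. Note what @RonBeyer said in the comments about Text being used.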
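Since the error message names System.DBNull, the check has to be made on the value returned by Text rather than on the range object itself. As far as I know, Text on a multi-cell range such as A1:B2 can come back as DBNull when the cells do not all show the same text, so treat that as an assumption to verify against your sheet. A minimal sketch of a guard, with CellText and OrDefault as hypothetical names:

```csharp
using System;

public static class CellText
{
    // Returns the text as a string, or the fallback when the value handed
    // back by the interop layer is null or DBNull.
    public static string OrDefault(object text, string fallback = "")
    {
        if (text == null || text is DBNull)
            return fallback;
        return Convert.ToString(text);
    }
}
```

In the question's code this would replace the direct assignment, for example string text = CellText.OrDefault(a1.Text);. Reading Text cell by cell is another way to avoid the DBNull case.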

Read data from combined Excel columns/rows using C#

I'm trying to read data from an Excel document in C# using Microsoft's COM Interop.
So far, I'm able to load the document and read some data from it. However, I need to read data from two different columns and output these as json (for a jquery ajax call)
I've made a quick prototype of how my Excel document is structured with the hope that it's a bit easier to explain ;-)
The method I have is called GetExcelDataByCategory(string categoryName) where the categoryName parameter would be used to find which column to get the data from.
So, i.e., if I'm making the call with "Category 2" as parameter, I need to get all the values in the C columns rows and the corresponding dates from the A column, so the output will look like this:
Which then needs to be transformed/parsed into JSON.
I've searched high and low on how to achieve this, but with no luck so far :-( I'm aware that I can use the get_Range() method to select a range, but it seems you need to explicitly tell the method which row and which column to get the data from. I.e.: get_Range("A1, C1")
This is my first experience with reading data from an Excel document, so I guess there's a lot to learn ;-) Is there a way to get the output on my second image?
Any help/hint is greatly appreciated! :-)
Thanks in advance.
All the best,
Bo
This is what I would do:
using Excel = Microsoft.Office.Interop.Excel;
Excel.Application xlApp = new Excel.Application();
Excel.Workbook xlWorkbook = xlApp.Workbooks.Open("path to book");
Excel.Worksheet xlSheet = xlWorkbook.Sheets[1]; // get first sheet
Excel.Range xlRange = xlSheet.UsedRange; // get the entire used range
int numberOfRows = xlRange.Rows.Count;
int numberOfCols = xlRange.Columns.Count;
List<int> columnsToRead = new List<int>();
// find the columns that correspond with the string columnName which
// would be passed into your method
for(int i=1; i<=numberOfCols; i++)
{
if(xlRange.Cells[1,i].Value2 != null) // ADDED IN EDIT
{
if(xlRange.Cells[1,i].Value2.ToString().Equals(categoryName))
{
columnsToRead.Add(i);
}
}
}
List<string> columnValue = new List<string>();
// loop over each column number and add results to the list
foreach(int c in columnsToRead)
{
// start at 2 because the first row is 1 and the header row
for(int r = 2; r <= numberOfRows; r++)
{
if(xlRange.Cells[r,c].Value2 != null) // ADDED IN EDIT
{
columnValue.Add(xlRange.Cells[r,c].Value2.ToString());
}
}
}
This is the code I would use to read the Excel. Right now it reads every column that has the heading (designated by whatever is in the first row) and then all the rows there. It isn't exactly what you asked (it doesn't format into JSON) but I think it is enough to get you over the hump.
EDIT: It looks like a few blank cells are causing problems. A blank cell's Value2 is null in the Interop, so we get errors if we try to call Value2.ToString() on it. I added a check to make sure that the value isn't null before doing anything with it. This prevents the errors.
For Excel parsing and creation you can use ExcelDataReader: http://exceldatareader.codeplex.com/
and for JSON you can use Json.NET: http://json.codeplex.com/
Both are fairly easy to use. Just have a look at the websites.
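For the JSON step, here is a sketch of pairing the dates from column A with the values of the selected category column. It uses the built-in System.Text.Json serializer rather than the Json.NET library mentioned above (on .NET Framework it comes as a NuGet package). It also assumes the dates were read through Value2 and therefore arrive as OLE Automation serial numbers (doubles). CategoryJson, ToJson and the "date"/"value" property names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

public static class CategoryJson
{
    // Pairs each date with the value from the same row and serializes the
    // result as a JSON array. Rows with a missing date or value are skipped.
    public static string ToJson(IReadOnlyList<object> dates, IReadOnlyList<object> values)
    {
        var rows = new List<Dictionary<string, object>>();
        int count = Math.Min(dates.Count, values.Count);

        for (int i = 0; i < count; i++)
        {
            if (dates[i] == null || values[i] == null)
                continue;

            // Value2 returns dates as OLE Automation serial numbers (doubles).
            DateTime date = dates[i] is double serial
                ? DateTime.FromOADate(serial)
                : Convert.ToDateTime(dates[i], CultureInfo.InvariantCulture);

            rows.Add(new Dictionary<string, object>
            {
                ["date"] = date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture),
                ["value"] = values[i]
            });
        }
        return JsonSerializer.Serialize(rows);
    }
}
```

The two lists would come from the loops in the answer above: one pass over column 1 for the dates and one over the matched category column for the values.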

Writing string, numeric data to Excel via C# works, but Excel does not treat numeric data correctly

I'm getting result sets from Sybase that I return to a C# client.
I use the below function to write the result set data to Excel:
private static void WriteData(Excel.Worksheet worksheet, string cellRef, ref string[,] data)
{
Excel.Range range = worksheet.get_Range(cellRef, Missing.Value);
if (data.GetLength(0) != 0)
{
range = range.get_Resize(data.GetLength(0), data.GetLength(1));
range.set_Value(Missing.Value, data);
}
}
The data gets written correctly.
The issue is that since I'm using a string array to write the data (which is a mixture of strings and floats), Excel highlights every cell that contains numeric data with the message "Number Stored as Text".
How do I get rid of this issue?
Many thanks,
Chapax
Try the following: replace your array of string by an array of object.
var data = new object[2,2];
data[0, 0] = "A";
data[0, 1] = 1.2;
data[1, 0] = null;
data[1, 1] = "B";
var theRange = theSheet.get_Range("D4", "E5");
theRange.Value2 = data;
If I use this code, equivalent to yours:
var data = new string[2,2];
I get the same symptom as you.
As a side benefit, you don't have to cast anything to string: you can fill your array with whatever you want to see displayed.
Try setting the NumberFormat property on the range object.
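If the data already arrives as a string[,], one way to apply the object-array suggestion above is to convert it just before writing, parsing the cells that look numeric. This is a sketch under the assumption that the numbers use invariant-culture formatting ("1.2", not "1,2"). CellTyping and ToTypedCells are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class CellTyping
{
    // Copies a string grid into an object grid, storing cells that parse as
    // numbers as doubles so Excel receives them as numeric values.
    public static object[,] ToTypedCells(string[,] data)
    {
        int rows = data.GetLength(0);
        int cols = data.GetLength(1);
        var typed = new object[rows, cols];

        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                string s = data[r, c];
                if (s != null && double.TryParse(s, NumberStyles.Float,
                        CultureInfo.InvariantCulture, out double number))
                    typed[r, c] = number;   // numeric cell
                else
                    typed[r, c] = s;        // text (or null) stays as-is
            }
        }
        return typed;
    }
}
```

The result can then be passed to range.set_Value(Missing.Value, typed) in place of the string array. Be aware that identifiers such as "007" would also be turned into numbers, so columns that must stay text need to be excluded.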
