C# Microsoft Excel interop - c#

I'm trying to insert the RoomType array into an Excel workbook. The RoomType range runs from D22 to D25. The problem is that this code only puts the first value into the range. If I move RoomType.set_Value into the for loop, the range is filled with the last array item. Can anyone help me?
Object[,] RoomtypeArray = new object[1, _RoomType.Count];
for (int i = 0; i < _RoomType.Count; i++)
{
RoomtypeArray[0, i] = _RoomType[i];
}
RoomType.set_Value(Type.Missing, RoomtypeArray);

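This is what you need:
//Microsoft.Office.Interop.Excel.Range RoomType;
//List<double> _RoomType;
// D22:D25 is a single column, so the array needs one row per item and one column
object[,] roomTypeArray = (object[,])Array.CreateInstance(
typeof(object),
new int[] { _RoomType.Count, 1 },
new int[] { 1, 1 });
for (int i = 0; i < _RoomType.Count; i++)
{
roomTypeArray[i + 1, 1] = _RoomType[i];
}
RoomType.Value2 = roomTypeArray;
The array has to match the shape of the range: D22:D25 is four rows by one column, so the array needs _RoomType.Count rows and a single column. A 1 x N array assigned to a column range repeats its first element down the column, which is the behaviour described in the question. Arrays that Excel returns are 1-based rather than 0-based, so this example uses Array.CreateInstance to build a matching 1-based array instead of the 0-based one that the new statement in C# produces.
(See also the accepted answer of "How can I quickly up-cast object[,] into double[,]?" for a neat trick for going between object[,] and double[,] when working with Excel Interop.)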
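A minimal sketch of the array shape involved, runnable without Excel. The class and method names (RangeArrays, ToColumnArray) are hypothetical and not part of the Interop API; the helper only builds the 1-based, N-rows-by-1-column object[,] that a vertical range such as D22:D25 expects.

```csharp
using System;
using System.Collections.Generic;

public static class RangeArrays
{
    // Builds a 1-based object[count, 1] array: one row per item, a single column.
    // ToColumnArray is an illustrative name, not an Interop method.
    public static object[,] ToColumnArray<T>(IReadOnlyList<T> items)
    {
        var result = (object[,])Array.CreateInstance(
            typeof(object),
            new[] { items.Count, 1 },   // lengths: N rows, 1 column
            new[] { 1, 1 });            // lower bounds: both dimensions start at 1
        for (int i = 0; i < items.Count; i++)
        {
            result[i + 1, 1] = items[i];
        }
        return result;
    }
}
```

Under those assumptions, the assignment in the answer would become something like RoomType.Value2 = RangeArrays.ToColumnArray(_RoomType);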

Related

Optimize performance of data processing method

I am using the following code to take some data (in an XML-like format, not well formed) from a .txt file and then write it to an .xlsx using EPPlus after doing some processing. StreamElements is basically a modified XmlReader. My question is about performance: I have made a couple of changes but don't see what else I can do. I'm going to use this for large datasets, so I'm trying to make it as efficient and fast as possible. Any help will be appreciated!
I tried using p.SaveAs() to do the Excel writing, but I did not really see a performance difference. Are there better, faster ways to do the writing? Any suggestions are welcome.
using (ExcelPackage p = new ExcelPackage())
{
ExcelWorksheet ws = p.Workbook.Worksheets[1];
ws.Name = "data1";
int rowIndex = 1; int colIndex = 1;
foreach (var element in StreamElements(pa, "XML"))
{
var values = element.DescendantNodes().OfType<XText>()
.Select(v => Regex.Replace(v.Value, "\\s+", " "));
string[] data = string.Join(",", values).Split(',');
data[2] = toDateTime(data[2]);
for (int i = 0; i < data.Count(); i++)
{
if (rowIndex < 1000000)
{
var cell1 = ws.Cells[rowIndex, colIndex];
cell1.Value = data[i];
colIndex++;
}
}
rowIndex++;
}
}
ws.Cells[ws.Dimension.Address].AutoFitColumns();
Byte[] bin = p.GetAsByteArray();
using (FileStream fs = File.OpenWrite("C:\\test.xlsx"))
{
fs.Write(bin, 0, bin.Length);
}
}
}
Currently, for it to do the processing and then write 1 Million lines into an Excel worksheet, it takes about ~30-35 Minutes.
I've run into this issue before: Excel has a huge overhead when you modify worksheet cells individually, one by one.
The solution is to collect the data in an object array first and then write it to the worksheet in a single bulk operation.
using(ExcelPackage p = new ExcelPackage()) {
ExcelWorksheet ws = p.Workbook.Worksheets[1];
ws.Name = "data1";
//Starting cell
int startRow = 1;
int startCol = 1;
//Needed for 2D object array later on
int maxColCount = 0;
int maxRowCount = 0;
//Queue data
Queue<string[]> dataQueue = new Queue<string[]>();
//Tried not to touch this part
foreach(var element in StreamElements(pa, "XML")) {
var values = element.DescendantNodes().OfType<XText>()
.Select(v => Regex.Replace(v.Value, "\\s+", " "));
//Removed unnecessary split and join, use ToArray instead
string[] eData = values.ToArray();
eData[2] = toDateTime(eData[2]);
//Push the data to queue and increment counters (if needed)
dataQueue.Enqueue(eData);
if(eData.Length > maxColCount)
maxColCount = eData.Length;
maxRowCount++;
}
//We now have the dimensions needed for our object array
object[,] excelArr = new object[maxRowCount, maxColCount];
//Dequeue data from Queue and populate object matrix
int i = 0;
while(dataQueue.Count > 0){
string[] eData = dataQueue.Dequeue();
for(int j = 0; j < eData.Length; j++){
excelArr[i, j] = eData[j];
}
i++;
}
//Write data in one bulk call; ws is an EPPlus worksheet, so load the rows with LoadFromArrays
var rows = Enumerable.Range(0, maxRowCount)
.Select(r => Enumerable.Range(0, maxColCount).Select(c => excelArr[r, c]).ToArray());
ws.Cells[startRow, startCol].LoadFromArrays(rows);
//Tried not to touch this stuff
ws.Cells[ws.Dimension.Address].AutoFitColumns();
Byte[] bin = p.GetAsByteArray();
using(FileStream fs = File.OpenWrite("C:\\test.xlsx")) {
fs.Write(bin, 0, bin.Length);
}
}
I didn't try compiling this code, so double-check the indexing used and check for any small syntax errors.
A few extra pointers to consider for performance:
Try to parallelize the population of the object array, since it is primarily index based (maybe keep a Dictionary<int, string[]> as an index tracker) and look rows up in there for faster population of the object array. You would likely have to trade space for time.
See if you are able to hardcode the column and row counts, or figure them out quickly. In my code fix, I've set counters to count the maximum rows and columns on the fly; I wouldn't recommend it as a permanent solution.
AutoFitColumns is very costly, especially if you're dealing with over a million rows.

Convert Excel Range to C# Array

I would like to convert an Excel Range to a C# Array with this code:
System.Array MyRange = (System.Array)range.cells.value;
for (int k = 0; k <= MyRange.Length; k++)
{
List<service_name> _ml = new List<service_name>();
for (int j = 1; j < dataitems.Count; j++)
{
// enter code here
}
}
And then iterate over it like in the above loop.
But this code does not work, and throws this Exception:
"Unable to cast object of type 'System.String' to type 'System.Array'."
Based on the help provided by Microsoft here, this is how I read and write an array in Excel.
var xlApp=new Microsoft.Office.Interop.Excel.Application();
var wb=xlApp.Workbooks.Open(fn, ReadOnly: false);
xlApp.Visible=true;
var ws=wb.Worksheets[1] as Worksheet;
var r=ws.Range["A2"].Resize[100, 1];
var array=r.Value;
// array is object[1..100,1..1]
for(int i=1; i<=100; i++)
{
var text=array[i, 1] as string;
Debug.Print(text);
}
// to create an [1..100,1..1] array use
var array2=Array.CreateInstance(
typeof(object),
new int[] {100, 1},
new int[] {1, 1}) as object[,];
// fill array2
for(int i=1; i<=100; i++)
{
array2[i, 1] = string.Format("Text{0}",i);
}
r.Value2=array2;
wb.Close(SaveChanges: true);
xlApp.Quit();
The error message in the original post occurs when the range consists of exactly one cell, because the resulting value's type is variant, and actually can be array, double, string, date, and null.
One solution can be that you check the cell count and act differently in case of exactly one cell.
My solution creates an array of cells. This works even if one or more cells are empty, which could otherwise cause a null object. (When all cells of the range are empty, Range.Cells.Value is null.)
System.Array cellArray = range.Cells.Cast<Excel.Range>().ToArray<Excel.Range>();
If you prefer Lists over Arrays (like I do), you can use this code:
List<Excel.Range> listOfCells = range.Cells.Cast<Excel.Range>().ToList<Excel.Range>();
Range can be one- or two-dimensional.
Finally, if you definitely need a string array, here it is:
string[] strArray = range.Cells.Cast<Excel.Range>().Select(Selector).ToArray<string>();
where Selector function looks like this:
public string Selector(Excel.Range cell)
{
if (cell.Value2 == null)
return "";
if (cell.Value2.GetType().ToString() == "System.Double")
return ((double)cell.Value2).ToString();
else if (cell.Value2.GetType().ToString() == "System.String")
return ((string)cell.Value2);
else if (cell.Value2.GetType().ToString() == "System.Boolean")
return ((bool)cell.Value2).ToString();
else
return "unknown";
}
I included "unknown" for the case I don't remember all data types that Value2 can return. Please improve this function if you find any other type(s).
NOTE: This example only works with a range that is MORE THAN ONE CELL. If the Range is only a single cell (1x1), Excel will treat it in a special way, and the range.Value2 will NOT return a 2-dimensional array, but instead will be a single value. It's these types of special cases that will drive you nuts, as well as zero and non-zero array lower bounds:
using Excel = Microsoft.Office.Interop.Excel;
private static void Test()
{
Excel.Range range = Application.ActiveWorkbook.ActiveSheet.Range["A1:B2"]; // 2x2 array
range.Cells[1, 2] = "Foo"; // Sets cell B1 (row 1, column 2) to "Foo"
dynamic[,] excelArray = range.Value2 as dynamic[,]; // This is a very fast operation
Console.Out.WriteLine(excelArray[1, 2]); // => Foo
excelArray[1, 2] = "Bar";
range.Value2 = excelArray; // Sets cell B1 to "Bar", again a fast operation even for large arrays
Console.Out.WriteLine(range.Cells[1, 2].Value2); // => Bar
Note that excelArray will have row and column lower bounds of 1:
Console.Out.WriteLine("RowLB: " + excelArray.GetLowerBound(0)); // => RowLB: 1
Console.Out.WriteLine("ColLB: " + excelArray.GetLowerBound(1)); // => ColLB: 1
BUT, if you declare a newArray in C# and assign it, then the lower bounds will be 0, but it will still work:
dynamic[,] newArray = new dynamic[2, 2]; // Same dimensions as "A1:B2" (2x2)
newArray[0, 1] = "Foobar";
range.Value2 = newArray; // Sets cell B1 to "Foobar"
Console.Out.WriteLine(range.Cells[1, 2].Value2); // => Foobar
Fetching the value back out of the range does not return that array, though. Excel hands back a new array whose lower bounds are 1 again:
range.Cells[1, 2] = "Fubar";
dynamic[,] lastArray = range.Value2 as dynamic[,];
Console.Out.WriteLine(lastArray[1, 2]); // => Fubar
Console.Out.WriteLine("RowLB: " + lastArray.GetLowerBound(0)); // => RowLB: 1
Console.Out.WriteLine("ColLB: " + lastArray.GetLowerBound(1)); // => ColLB: 1
}
Working with Excel Interop can be daunting as there are many special cases like this in the codebase, but I hope this helps clarify at least this one.
The error means that the value is a string, so you can't convert it directly to array.
If the value is, for example, a comma-delimited string, you can use Split to get an array:
string[] MyRange = (range.Cells.Value + "").Split(',');
for (int k = 0; k < MyRange.Length; k++)
{
//...loop here...
}
Also fixed your loop, you were going to get Index Out of Bounds error.
Late to the conversation, but here is a method to do this:
static string[][] GetStringArray(Object rangeValues)
{
string[][] stringArray = null;
Array array = rangeValues as Array;
if (null != array)
{
int rank = array.Rank;
if (rank > 1)
{
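int rowCount = array.GetLength(0);
int columnCount = array.GetLength(1);
// Arrays returned by Excel are 1-based; read the bounds instead of assuming them
int rowBase = array.GetLowerBound(0);
int columnBase = array.GetLowerBound(1);
stringArray = new string[rowCount][];
for (int index = 0; index < rowCount; index++)
{
stringArray[index] = new string[columnCount];
for (int index2 = 0; index2 < columnCount; index2++)
{
Object obj = array.GetValue(index + rowBase, index2 + columnBase);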
if (null != obj)
{
string value = obj.ToString();
stringArray[index][index2] = value;
}
}
}
}
}
return stringArray;
}
called with:
string[][] rows = GetStringArray(range.Cells.Value2);
The issue arises when your range is a single cell: range.Value (or range.Cells.Value) is then a single value such as a string, and that string cannot be cast to an object array.
So check whether your range has one cell or more, and handle each case accordingly.
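The single-cell special case described in the answers above can be isolated in one place. The following is a sketch only: RangeValues and AsMatrix are hypothetical names, and the helper takes whatever Range.Value2 returned as a plain object, so it runs without Excel. It always returns a 0-based object[,].

```csharp
using System;

public static class RangeValues
{
    // Normalizes a value read from a range into a 0-based object[,].
    // AsMatrix is an illustrative helper name, not an Interop method.
    public static object[,] AsMatrix(object value)
    {
        var source = value as Array;
        if (source == null || source.Rank != 2)
        {
            // Single cell (a scalar such as string or double, or null): wrap it in a 1x1 array.
            return new object[,] { { value } };
        }

        int rows = source.GetLength(0), cols = source.GetLength(1);
        int rowLb = source.GetLowerBound(0), colLb = source.GetLowerBound(1);
        var result = new object[rows, cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                // Offset by the source lower bounds so 1-based input is handled too.
                result[r, c] = source.GetValue(r + rowLb, c + colLb);
            }
        }
        return result;
    }
}
```

A caller would then write something like object[,] cells = RangeValues.AsMatrix(range.Value2); and loop from 0 regardless of how many cells the range has.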

Using Excel in C# to transfer Cell Values to a 2D string array

I need a method that will get all the cell values in an Excel spreadsheet into a 2D array.
I'm building it using ribbons in C# to work with Excel, but I just can't get it to work.
private string[,] GetSpreadsheetData ()
{
try
{
Excel.Application exApp =
Globals.TSExcelAddIn.Application as Excel.Application;
Excel.Worksheet ExWorksheet = exApp.ActiveSheet as Excel.Worksheet;
Excel.Range xlRange = ExWorksheet.get_Range("A1","F188000");
object[,] values = (object[,])xlRange.Value2;
string[,] tsReqs = new string[xlRange.Rows.Count, 7];
char[] alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray();
for (int i = 0; i < tsReqs.GetLength(0); i++)
{
for (int z = 0; z < tsReqs.GetLength(1); z++)
{
if(values[i+1,z+1] != null)
tsReqs[i, z] = values[i + 1, z + 1].ToString();
}
}
return tsReqs;
}
catch
{
MessageBox.Show
("Excel has encountered an error. \nSaving work and exitting");
return null;
}
}
Also if anyone has a more efficient way of doing this I would greatly appreciate it.
Excel.Range xlRange = ExWorksheet.get_Range("A1","F188000");
Reads all the way till F188000 cell from A1, I just want it to keep reading until it reaches a row with absolutely no data.
- Caught: "Index was outside the bounds of the array." (System.IndexOutOfRangeException) Exception Message = "Index was outside the bounds of the array.", Exception Type = "System.IndexOutOfRangeException"
You could consider using ExWorksheet.UsedRange instead of ExWorksheet.get_Range("A1","F188000")
EDIT: Also, I think that if you use the .Text property of a Range, the value is automatically converted to a string, so there is no need to use .Value2 here
Excel.Range rng = ExWorksheet.UsedRange;
int rowCount = rng.Rows.Count;
int colCount = rng.Columns.Count;
string[,] tsReqs = new string[rowCount, colCount];
for (int i = 1; i <= rowCount; i++)
{
for (int j = 1; j <= colCount; j++)
{
string str = rng.Cells[i, j].Text;
tsReqs[i - 1, j - 1] = str;
}
}
Your arrays are different sizes. Your values come from A1:F188000 which is 6 columns, but you're looping through the columns of tsReqs which is 7 columns. It looks like you're trying to account for the fact that values will be a 1-based array but aren't accounting for that properly.
Change the declaration of tsReqs to:
string[,] tsReqs = new string[xlRange.Rows.Count, 6];
and you should be fine.
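The question also asks to stop reading at the first row that has no data at all. A hedged sketch of that step, working on the object[,] already read from the range; SheetData and ToStringsUntilBlankRow are hypothetical names, and the code reads the array's lower bounds rather than assuming 0 or 1:

```csharp
public static class SheetData
{
    // Copies rows from a 2-D array (any lower bounds) into a 0-based string[,],
    // stopping at the first row whose cells are all null or empty.
    // ToStringsUntilBlankRow is an illustrative helper name.
    public static string[,] ToStringsUntilBlankRow(object[,] values)
    {
        int rowLb = values.GetLowerBound(0), colLb = values.GetLowerBound(1);
        int rows = values.GetLength(0), cols = values.GetLength(1);

        // Count the rows that come before the first completely blank row.
        int used = 0;
        for (; used < rows; used++)
        {
            bool blank = true;
            for (int c = 0; c < cols && blank; c++)
            {
                object cell = values[used + rowLb, c + colLb];
                blank = cell == null || cell.ToString().Length == 0;
            }
            if (blank) break;
        }

        var result = new string[used, cols];
        for (int r = 0; r < used; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                object cell = values[r + rowLb, c + colLb];
                result[r, c] = cell == null ? null : cell.ToString();
            }
        }
        return result;
    }
}
```

Because the column count comes from the array itself, the 6-versus-7 mismatch described above cannot occur with this approach.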
Try this
private static void printExcelValues(Worksheet xSheet)
{
Range xRng =xSheet.Cells.SpecialCells(
XlCellType.xlCellTypeConstants);
var arr = new string[xRng.Rows.Count,xRng.Columns.Count];
foreach (Range item in xRng)
{
arr[item.Row-1,item.Column-1]=item.Value.ToString();
}
}
You can use the | to specify more cell types like XlCellType.xlCellTypeFormulas or any other type you'd like.
I had to do something very similar. Reading an excel file takes a while... I took a different route but I got the job done with instant results.
Firstly I gave each column its own array. I saved each column from the spreadsheet as a CSV(Comma delimited) file. After opening that file with notePad I was able to copy the 942 values separated with a comma to the initialization of the array
int[] Column1 = {300, 305, 310, ..., 5000};
[0] [1] [2] ... [941]
Now... If you do this for each Column, the position will resemble the 'row' of each 'column' of your 'spreadsheet' .
This odd method worked PERFECT for me, as I needed to compare user input sizes to the values in the "Width" column in the spreadsheet to get information regarding each respective size found in the spreadsheet.
NOTE: If your cells contain strings and not integers like mine, you can use enumeration method:
enum column1 { thing_1, thing_2, ..., thing_n,}
[0] [1] [n]
Here's my code If you want to have a look:
(This is NOT the code converting a spreadsheet into some arrays --that is only a few lines of code, depending on your amount of columns-- this is the whole LookUp method I wrote)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace PalisadeWorld
{
//struct to store all the positions of each matched panel size in an array 'Identities'
struct ID
{
public int[] Identities;
public ID(int[] widths, int rows)
{
//Column1 from spreadsheet
int[] allWidths = { 300, 305, 310, 315, 320, 325, 330, ..., 5000 };
int i,j;
int[] Ids = new int[rows];
for (i = 0; i < rows; i++)
{
for (j = 0; j < allWidths.Length; j++)
{
if (widths[i] == allWidths[j])
{
Ids[i] = j;
break;
}
}
}
this.Identities = Ids;
}
public override string ToString()
{
string data = String.Join(", ", this.Identities);
return data;
}
}
class LookUpSheet
{
//retrieve user input from another class
public int[] lookUp_Widths {get; set;}
public int lookUp_Rows { get; set; }
//Method returning desired values from Column2
public int[] GetNumPales1()
{
//column2
int[] all_numPales = { 2, 2, 2, 2, 2, 2, 2, 2, 2, ..."goes on till [941]"...};
int[] numPales = new int[lookUp_Rows];
ID select = new ID(lookUp_Widths, lookUp_Rows);
for (int i = 0; i < lookUp_Rows; i++)
{
numPales[i] = all_numPales[select.Identities[i]];
}
return numPales;
}
//Method returning desired values from Column3
public int[] GetBlocks1()
{
//column3
int[] all_blocks = { 56, 59, 61, 64, 66, 69, 71, 74, "goes on till [941]"...};
int[] blocks = new int[lookUp_Rows];
ID select = new ID(lookUp_Widths, lookUp_Rows);
for (int i = 0; i < lookUp_Rows; i++)
{
blocks[i] = all_blocks[select.Identities[i]];
}
return blocks;
}
...
Goes on through each column of my spreadsheet
Really hope this helps someone. Cheers

'System.Collections.Generic.List<string>' to System.Collections.Generic.List<string>[*,*]

I am using C# with ASP.NET in my project. I have a 2D array named roomno, and I want to remove one row from it, so I am trying to convert the array into a list.
static string[,] roomno = new string[100, 14];
List<string>[,] lst = new List<string>[100, 14];
lst = roomno.Cast<string>[,]().ToList();
Error 1 Invalid expression term 'string' in this line...
if i try below code,
lst = roomno.Cast<string>().ToList();
I got
Error 3 Cannot implicitly convert type 'System.Collections.Generic.List<string>' to 'System.Collections.Generic.List<string>[*,*]'
lst = roomno.Cast().ToList();
What is the mistake in my code?
After that, I plan to remove the row from the list with lst.RemoveAt(array_qty);
This:
List<string>[,] lst = new List<string>[100, 14];
is declaring a 2-D array of List<string> values.
This:
roomno.Cast<string>[,]().ToList();
... simply doesn't make sense due to the position of the [,] between the type argument and the () for the method invocation. If you changed it to:
roomno.Cast<string[,]>().ToList();
then it would be creating a List<string[,]> but it's still not the same as a List<string>[,].
Additionally, roomno is just a 2-D array of strings - which is actually a single sequence of strings as far as LINQ is concerned - so why are you trying to convert it into an essentially 3-dimensional type?
It's not clear what you're trying to do or why you're trying to do it, but hopefully this at least helps to explain why it's not working...
To be honest, I would try to avoid mixing 2-D arrays and lists within the same type. Would having another custom type help?
EDIT: LINQ isn't going to be much use with a 2-D array. It's designed for single sequences really. I suspect you'll need to do it "manually" - here's a short but complete program as an example:
using System;
class Program
{
static void Main(string[] args)
{
string[,] values = {
{"x", "y", "z"},
{"a", "b", "c"},
{"0", "1", "2"}
};
values = RemoveRow(values, 1);
for (int row = 0; row < values.GetLength(0); row++)
{
for (int column = 0; column < values.GetLength(1); column++)
{
Console.Write(values[row, column]);
}
Console.WriteLine();
}
}
private static string[,] RemoveRow(string[,] array, int row)
{
int rowCount = array.GetLength(0);
int columnCount = array.GetLength(1);
string[,] ret = new string[rowCount - 1, columnCount];
Array.Copy(array, 0, ret, 0, row * columnCount);
Array.Copy(array, (row + 1) * columnCount,
ret, row * columnCount, (rowCount - row - 1) * columnCount);
return ret;
}
}
If you need a 2-dimensional list, try List<List<string>>. Converting it to an array could then be done with list.Select(x => x.ToArray()).ToArray(), but note that this gives a string[][], not a string[,].
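A small sketch of the List<List<string>> route, assuming a rectangular grid. Grid, ToRows and ToArray2D are hypothetical names; the round trip goes from string[,] to a list of rows (where RemoveAt works) and back to string[,].

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Grid
{
    // Converts a rectangular string[,] into a list of rows.
    public static List<List<string>> ToRows(string[,] array)
    {
        return Enumerable.Range(0, array.GetLength(0))
            .Select(r => Enumerable.Range(0, array.GetLength(1))
                .Select(c => array[r, c])
                .ToList())
            .ToList();
    }

    // Converts rows back into a string[,]; assumes every row has the same length.
    public static string[,] ToArray2D(List<List<string>> rows)
    {
        int cols = rows.Count == 0 ? 0 : rows[0].Count;
        var result = new string[rows.Count, cols];
        for (int r = 0; r < rows.Count; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                result[r, c] = rows[r][c];
            }
        }
        return result;
    }
}
```

With these helpers, removing a row from the question's array might look like: var rows = Grid.ToRows(roomno); rows.RemoveAt(index); roomno = Grid.ToArray2D(rows);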

How to specify format for individual cells with Excel.Range.set_Value()

When I write a whole table into an excel worksheet, I know to work with a whole Range at once instead of writing to individual cells. However, is there a way to specify format as I'm populating the array I'm going to export to Excel?
Here's what I do now:
object MissingValue = System.Reflection.Missing.Value;
Excel.Application excel = new Excel.Application();
int rows = 5;
int cols = 5;
int someVal;
Excel.Worksheet sheet = (Excel.Worksheet)excel.Workbooks.Add(MissingValue).Sheets[1];
Excel.Range range = sheet.Range["A1", sheet.Cells[rows, cols]];
object[,] rangeData = new object[rows,cols];
for(int r = 0; r < rows; r++)
{
for(int c = 0; c < cols; c++)
{
someVal = r + c;
rangeData[r,c] = someVal.ToString();
}
}
range.set_Value(MissingValue, rangeData);
Now suppose that I want some of those numbers to be formatted as percentages. I know I can go back on a cell-by-cell basis and change the formatting, but that seems to defeat the whole purpose of using a single Range.set_Value() call. Can I make my rangeData[,] structure include formatting information, so that when I call set_Value(), the cells are formatted in the way I want them?
To clarify, I know I can set the format for the entire Excel.Range object. What I want is to have a different format specified for each cell, specified in the inner loop.
So here's the best "solution" I've found so far. It isn't the nirvana I was looking for, but it's much, much faster than setting the format for each cell individually.
// 0-based indexes
static string RcToA1(int row, int col)
{
string letters = "";
// Column letters are a bijective base-26 number (A..Z, AA..AZ, BA, ...)
for (int n = col + 1; n > 0; n = (n - 1) / 26)
{
letters = (char)('A' + (n - 1) % 26) + letters;
}
return letters + (row + 1).ToString();
}
static Random rand = new Random(DateTime.Now.Millisecond);
static string RandomExcelFormat()
{
switch ((int)Math.Round(rand.NextDouble(),0))
{
case 0: return "0.00%";
default: return "0.00";
}
}
struct ExcelFormatSpecifier
{
public object NumberFormat;
public string RangeAddress;
}
static void DoWork()
{
List<ExcelFormatSpecifier> NumberFormatList = new List<ExcelFormatSpecifier>(0);
object[,] rangeData = new object[rows,cols];
for(int r = 0; r < rows; r++)
{
for(int c = 0; c < cols; c++)
{
someVal = r + c;
rangeData[r,c] = someVal.ToString();
NumberFormatList.Add(new ExcelFormatSpecifier
{
NumberFormat = RandomExcelFormat(),
RangeAddress = RcToA1(r, c)
});
}
}
range.set_Value(MissingValue, rangeData);
int max_format = 50;
foreach (string formatSpecifier in NumberFormatList.Select(p => p.NumberFormat).Distinct())
{
List<string> addresses = NumberFormatList.Where(p => formatSpecifier.Equals(p.NumberFormat)).Select(p => p.RangeAddress).ToList();
while (addresses.Count > 0)
{
string addressSpecifier = string.Join(",", addresses.Take(max_format).ToArray());
range.get_Range(addressSpecifier, MissingValue).NumberFormat = formatSpecifier;
addresses = addresses.Skip(max_format).ToList();
}
}
}
Basically what is happening is that I keep a list of the format information for each cell in NumberFormatList (each element also holds the A1-style address of the range it applies to). The original idea was that for each distinct format in the worksheet, I should be able to construct an Excel.Range of just those cells and apply the format to that range in a single call. This would reduce the number of accesses to NumberFormat from (potentially) thousands down to just a few (however many different formats you have).
I ran into an issue, however, because you apparently can't construct a range from an arbitrarily long list of cells. After some testing, I found that the limit is somewhere between 50 and 100 cells that can be used to define an arbitrary range (as in range.get_Range("A1,B1,C1,A2,AA5,....."). So once I've gotten the list of all cells to apply a format to, I have one final while() loop that applies the format to 50 of those cells at a time.
This isn't ideal, but it still reduces the number of accesses to NumberFormat by a factor of up to 50, which is significant. Constructing my spreadsheet without any format info (only using range.set_Value()) takes about 3 seconds. When I apply the formats 50 cells at a time, that is lengthened to about 10 seconds. When I apply the format info individually to each cell, the spreadsheet takes over 2 minutes to finish being constructed!
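The grouping and batching step described above can be sketched separately from Excel. FormatBatches, Build and its parameters are hypothetical names; the input maps each A1-style address to its number format, and the output gives, per format, comma-separated address strings of at most batchSize cells each.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FormatBatches
{
    // Groups cell addresses by number format, then joins them into
    // comma-separated chunks of at most batchSize addresses.
    // Build is an illustrative helper name.
    public static Dictionary<string, List<string>> Build(
        IEnumerable<KeyValuePair<string, string>> addressToFormat, int batchSize)
    {
        return addressToFormat
            .GroupBy(p => p.Value, p => p.Key)
            .ToDictionary(
                g => g.Key,
                g => g.Select((address, i) => new { address, i })
                      .GroupBy(x => x.i / batchSize, x => x.address)
                      .Select(chunk => string.Join(",", chunk))
                      .ToList());
    }
}
```

Each resulting string could then be passed to range.get_Range(chunk, MissingValue) and its NumberFormat set once per chunk, as in the while loop above, with a batchSize of 50 matching the limit found in testing.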
You can apply formatting to the range and then populate it with values. You cannot specify formatting in your object[,] array.
You apply the formatting to each individual cell within the inner loop via
for(int r = 0; r < rows; r++)
{
for(int c = 0; c < cols; c++)
{
Excel.Range r2 = (Excel.Range)sheet.Cells[r + 1, c + 1]; // Cells is 1-based
r2.xxxx = "";
}
}
Once you have r2, you can change the cell format any way you want.
