I have an application that reads Excel files and extracts all the data. The xlsx or xls files are not standardized and vary in column size, column data type, and row size. My application can currently read all files of this kind, but when I use Value2, all the dates come back as OLE Automation dates (32442, 45322, ...). After some research, it's clear that Value2 returns the underlying data as a double that counts days from Dec 30, 1899.
I'm currently calling Value2 only once and moving everything into an object array:
object[,] data = xlRange.Value2;
Then, depending on the row size, the column size, and whether there is a header, I read everything into a new temp table that gets bulk copied into the file's matching table in the db.
Before the file is read, I perform a setup step that specifies how many rows to skip at the top, the column names, whether it has a header, the header column names, and so on.
Now, is there a way for me to automatically detect which columns are dates or currency without having to explicitly tell the application which columns to handle differently? I was able to solve this problem by labeling the date columns as DATES and using the FromOADate function to load each value as a date string, but I'm not really happy with this solution.
for (int r = startRow; r <= rowCount - rowsAfter; r++) // populates the raw table with all file rows
{
    //if (i % 100 == 0)
    //MessageBox.Show("Message" + DateTime.Now.ToLongTimeString());
    //Application.DoEvents();
    object[] rowData = new object[colCount];
    for (int c = 1; c <= colCount; c++)
    {
        bool isDateColumn = false;
        string tempGridFieldName = fieldListcount_excel2[c - 1]; // checks the field name for the keyword DATE
        if (tempGridFieldName.Contains("DATE"))
            isDateColumn = true;
        int maxFieldSize = Int32.Parse(fieldListlength_excel2[c - 1]); // changed this to version 2
        if (data[r, c] != null)
        {
            int cellSize = data[r, c].ToString().Length; // length of the cell content
            if (cellSize <= maxFieldSize)
            {
                if (isDateColumn) // column is a date field, so convert the Excel serial date to a DateTime
                {
                    object o = data[r, c];
                    double d = Convert.ToDouble(o);
                    DateTime dt = DateTime.FromOADate(d);
                    rowData[c - 1] = dt.ToShortDateString();
                }
                else
                {
                    rowData[c - 1] = data[r, c];
                }
            }
            else
            {
                MessageBox.Show($"Cell [{r},{c}] content size larger than field size");
                cellOversizeFlag = true;
                eof = false;
                break;
            }
        }
        else
        {
            rowData[c - 1] = data[r, c];
        }
    }
    rawExcelFileDT.Rows.Add(rowData);
}
I also tried checking the columns with GetType, but the column names were always strings and the dates came back as doubles. I cannot assume that all doubles are dates, so how can I approach this problem?
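One way to avoid labeling columns by hand is to look at each cell's number format (Range.NumberFormat in the Interop object model) rather than its value type, since Value2 reports dates and plain numbers alike as doubles. The sketch below is a heuristic under stated assumptions, not a complete parser for Excel format codes: the CellFormatSniffer class, its CellKind enum, and the rule "a d or y token outside brackets and quotes means date" are all hypothetical, and time-only formats such as h:mm fall through to Other.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CellFormatSniffer
{
    public enum CellKind { Other, Date, Currency }

    // Classifies an Excel number format string such as "m/d/yyyy" or "$#,##0.00".
    // Assumption: the string was read from Range.NumberFormat for the cell or column.
    public static CellKind Classify(string numberFormat)
    {
        if (string.IsNullOrEmpty(numberFormat)) return CellKind.Other;

        // Drop quoted literals ("text") and backslash-escaped characters.
        string fmt = Regex.Replace(numberFormat, "\"[^\"]*\"|\\\\.", "");
        // Remember currency markers such as [$€-407] before brackets are removed.
        bool bracketCurrency = Regex.IsMatch(fmt, @"\[\$[^\-\]]+");
        // Drop bracketed sections such as [Red] or the locale tag [$-409].
        string bare = Regex.Replace(fmt, @"\[[^\]]*\]", "");

        if (Regex.IsMatch(bare, "[dy]", RegexOptions.IgnoreCase)) return CellKind.Date;
        if (bracketCurrency || Regex.IsMatch(bare, "[$€£¥]")) return CellKind.Currency;
        return CellKind.Other;
    }

    // Converts a Value2 double to a DateTime only when the format marks it as a date.
    public static object Normalize(object value, string numberFormat)
    {
        if (value is double d && Classify(numberFormat) == CellKind.Date)
            return DateTime.FromOADate(d);
        return value;
    }
}
```

With this, Classify("m/d/yyyy") yields Date and Classify("$#,##0.00") yields Currency, so the isDateColumn flag could be derived from the sheet instead of from the column label. Reading NumberFormat once per column (for example from the first data row) keeps the number of Interop calls small.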
I made a C# winforms project which allows the user to pick a cell in a spreadsheet and write an integer value "total" into the selected cell. This takes place on an existing spreadsheet and thus this action changes the value of the cell. What is the best way to change some more cells with particular similarities?
My code is below:
private void button5_Click(object sender, EventArgs e)
{
    MySheet = (Excel.Worksheet)MyBook.Sheets[1]; // Explicit cast is not required here
    bpRow = excelApp.ActiveCell.Row;
    bpColumn = excelApp.ActiveCell.Column;
    MySheet.Cells[bpRow, bpColumn] = total;

    // Search column R (column 18) for the value held in the selected row
    Excel.Range colRangeH = MySheet.Columns["R:R"];
    Excel.Range resultRangeH = colRangeH.Find(
        What: MySheet.Cells[bpRow, 18].Value2, // pass the cell's value, not the Range object
        LookIn: Excel.XlFindLookIn.xlValues,
        LookAt: Excel.XlLookAt.xlPart,
        SearchOrder: Excel.XlSearchOrder.xlByRows,
        SearchDirection: Excel.XlSearchDirection.xlNext);

    if (resultRangeH != null) // Find returns null when nothing matches
    {
        int foundRow = resultRangeH.Row;
        textBox3.Text = " " + foundRow;

        // Compare cell values rather than Range objects
        if (object.Equals((object)MySheet.Cells[foundRow, 16].Value2, (object)MySheet.Cells[bpRow, 16].Value2))
        {
            string description = Convert.ToString((object)MySheet.Cells[foundRow, 22].Value2);
            double baseValue = Convert.ToDouble((object)MySheet.Cells[bpRow, bpColumn].Value2);
            if (description == "T.PLATE")
                MySheet.Cells[foundRow, bpColumn] = 3 * baseValue;
            else if (description == "STUD")
                MySheet.Cells[foundRow, bpColumn] = (int)Math.Ceiling(baseValue / 1.33);
        }
    }
    MyBook.Save();
}
The spreadsheet has about 90 entries, and each row has 21 numbers and a written description. I added a Find call to get the row number of an entry whose 18th number equals the 18th number of the entry for the selected cell. I then wrote a few if statements to check another number value and the written description, so that I can change the values in the selected cell's column accordingly.
Creating and customizing an Excel file through Interop with Office 2013 installed is extremely slow for me (more than 5 minutes).
The same thing works very well with the Excel 2010 interop (just 50 seconds for exactly the same process). (Code snippet below.)
It would be nice to know if there is a faster way to do this. I know there are other libraries for this, but I would like to stick with Interop since everything else already uses it.
I create the Excel file first, then check for cells that are empty or contain a specific string and change the color of those cells.
To create the file, I fill an object array and write it out in one step, which is really fast. The main thing slowing it down is searching for the cells and changing their color.
// Check for empty cell and make interior silver color
for (int row = 0; row < rowNo; row++)
{
    for (int col = 0; col < columnNo; col++)
    {
        if (string.IsNullOrEmpty(objData[row, col].ToString()))
        {
            // Access that cell in Excel now and change interior color
            Range cell = (Range)activeSheet.Cells[row + 2, col + 1];
            cell.Interior.Color = System.Drawing.Color.Silver;
        }
    }
}
// Check for cells contains "column header string#"
for (int col = 16; col < columnNo; col++)
{
    // Get column header - only once and use it for all rows in the same column
    string cellValue = activeSheet.Cells[1, col + 1].Value2.ToString();
    for (int row = 0; row < rowNo; row++)
    {
        string value = objData[row, col].ToString();
        if (string.IsNullOrEmpty(value) || value.Contains(cellValue + "#"))
        {
            Range cell = (Range)activeSheet.Cells[row + 2, col + 1];
            cell.Interior.Color = System.Drawing.Color.Silver;
            cell.Font.Color = System.Drawing.Color.Red;
        }
    }
}
I'd look into selecting the whole range and setting conditional formatting so that blank cells get the color you want. Sorry, I don't have any code for you; it's been a while since I've played in that realm.
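Conditional formatting is one route. A different technique that stays close to the existing loops is to cut the number of Interop calls by coloring each run of adjacent blank cells with a single Range call. Below is a minimal sketch of the grouping step only, which runs without Excel. BlankRunFinder, ColumnName and BlankRuns are hypothetical names, and the sheet offsets mirror the row + 2 / col + 1 mapping used in the question.

```csharp
using System;
using System.Collections.Generic;

public static class BlankRunFinder
{
    // Converts a 1-based column number to Excel letters (1 -> A, 27 -> AA).
    public static string ColumnName(int column)
    {
        string name = "";
        while (column > 0)
        {
            int rem = (column - 1) % 26;
            name = (char)('A' + rem) + name;
            column = (column - 1) / 26;
        }
        return name;
    }

    // Returns A1-style addresses, one per horizontal run of blank cells.
    // firstSheetRow / firstSheetCol are the sheet coordinates of data[0, 0].
    public static List<string> BlankRuns(object[,] data, int firstSheetRow, int firstSheetCol)
    {
        var runs = new List<string>();
        int cols = data.GetLength(1);
        for (int r = 0; r < data.GetLength(0); r++)
        {
            int runStart = -1;
            // Iterate one past the last column so a run ending at the edge is closed.
            for (int c = 0; c <= cols; c++)
            {
                bool blank = c < cols && string.IsNullOrEmpty(Convert.ToString(data[r, c]));
                if (blank && runStart < 0)
                {
                    runStart = c;
                }
                else if (!blank && runStart >= 0)
                {
                    int row = firstSheetRow + r;
                    runs.Add(ColumnName(firstSheetCol + runStart) + row + ":" +
                             ColumnName(firstSheetCol + c - 1) + row);
                    runStart = -1;
                }
            }
        }
        return runs;
    }
}
```

Each returned address, for example B2:C2, could then be passed to activeSheet.get_Range(address, Type.Missing) and colored once. Whether this brings Office 2013 back to the Excel 2010 timings is untested here.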
I am trying to get the last row of an Excel sheet programmatically using the Microsoft.Office.Interop.Excel library and C#. I want to do that because I am charged with looping through all the records of an Excel spreadsheet and performing some kind of operation on them. Specifically, I need the actual number of the last row, as I will pass this number into a function. Does anybody have any idea how to do that?
A couple of ways:
using Excel = Microsoft.Office.Interop.Excel;
Excel.ApplicationClass excel = new Excel.ApplicationClass();
Excel.Application app = excel.Application;
Excel.Range all = app.get_Range("A1:H10", Type.Missing);
OR
Excel.Range last = sheet.Cells.SpecialCells(Excel.XlCellType.xlCellTypeLastCell, Type.Missing);
Excel.Range range = sheet.get_Range("A1", last);
int lastUsedRow = last.Row;
int lastUsedColumn = last.Column;
This is a common issue in Excel.
Here is some C# code:
// Find the last real row
nInLastRow = oSheet.Cells.Find("*", System.Reflection.Missing.Value,
    System.Reflection.Missing.Value, System.Reflection.Missing.Value,
    Excel.XlSearchOrder.xlByRows, Excel.XlSearchDirection.xlPrevious, false,
    System.Reflection.Missing.Value, System.Reflection.Missing.Value).Row;
// Find the last real column
nInLastCol = oSheet.Cells.Find("*", System.Reflection.Missing.Value,
    System.Reflection.Missing.Value, System.Reflection.Missing.Value,
    Excel.XlSearchOrder.xlByColumns, Excel.XlSearchDirection.xlPrevious, false,
    System.Reflection.Missing.Value, System.Reflection.Missing.Value).Column;
or using SpecialCells
Excel.Range last = sheet.Cells.SpecialCells(Excel.XlCellType.xlCellTypeLastCell, Type.Missing);
Excel.Range range = sheet.get_Range("A1", last);
[EDIT] Similar threads:
VB.NET - Reading ENTIRE content of an excel file
How to get the range of occupied cells in excel sheet
Pryank's answer came closest to working for me. I added .Row at the end so that I get an integer back instead of a range.
int lastRow = wkSheet.Cells.SpecialCells(XlCellType.xlCellTypeLastCell, Type.Missing).Row;
The only way I could get it to work in ALL scenarios (except Protected sheets):
It supports:
Scanning Hidden Row / Columns
Ignores formatted cells with no data / formula
Code:
// Unhide All Cells and clear formats
sheet.Columns.ClearFormats();
sheet.Rows.ClearFormats();
// Detect Last used Row - Ignore cells that contains formulas that result in blank values
int lastRowIgnoreFormulas = sheet.Cells.Find(
    "*",
    System.Reflection.Missing.Value,
    InteropExcel.XlFindLookIn.xlValues,
    InteropExcel.XlLookAt.xlWhole,
    InteropExcel.XlSearchOrder.xlByRows,
    InteropExcel.XlSearchDirection.xlPrevious,
    false,
    System.Reflection.Missing.Value,
    System.Reflection.Missing.Value).Row;
// Detect Last Used Column - Ignore cells that contains formulas that result in blank values
int lastColIgnoreFormulas = sheet.Cells.Find(
    "*",
    System.Reflection.Missing.Value,
    System.Reflection.Missing.Value,
    System.Reflection.Missing.Value,
    InteropExcel.XlSearchOrder.xlByColumns,
    InteropExcel.XlSearchDirection.xlPrevious,
    false,
    System.Reflection.Missing.Value,
    System.Reflection.Missing.Value).Column;
// Detect Last used Row / Column - Including cells that contain formulas that result in blank values
int lastColIncludeFormulas = sheet.UsedRange.Columns.Count;
int lastRowIncludeFormulas = sheet.UsedRange.Rows.Count;
For questions involving the Excel object model, it's often easier to try it out in VBA first, then translating to C# is fairly trivial.
In this case one way to do it in VBA is:
Worksheet.UsedRange.Row + Worksheet.UsedRange.Rows.Count - 1
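Translated to C#, the expression above adds the used range's first row to its row count and subtracts one. Here is a minimal sketch with the two inputs passed as plain integers so that it runs without Excel. In Interop code they would presumably come from sheet.UsedRange.Row and sheet.UsedRange.Rows.Count. The UsedRangeMath class and LastRow method are hypothetical names.

```csharp
using System;

public static class UsedRangeMath
{
    // firstUsedRow: 1-based row where the used range starts (UsedRange.Row).
    // usedRowCount: number of rows in the used range (UsedRange.Rows.Count).
    public static int LastRow(int firstUsedRow, int usedRowCount)
    {
        return firstUsedRow + usedRowCount - 1;
    }
}
```

For a used range that starts at row 3 and spans 10 rows, LastRow(3, 10) gives 12, whereas Rows.Count alone would report 10. The two only agree when the used range starts at row 1.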
ActiveSheet.UsedRange.Value returns a two-dimensional object array indexed as [row, column]. The length of each dimension gives the row and column counts, which equal the last row and last column indexes when the used range starts at cell A1. The example below is in C#.
Excel.Worksheet activeSheet;
Excel.Range activeRange; // e.g. activeSheet.UsedRange
public virtual object[,] RangeArray
{
    get { return (object[,])activeRange.Value; }
}
public virtual int ColumnCount
{
    get { return RangeArray.GetLength(1); }
}
public virtual int RowCount
{
    get { return RangeArray.GetLength(0); }
}
public virtual int LastRow
{
    get { return RowCount; }
}
This issue is even worse when there are possibly empty cells. But you have to read a row even if only one value is filled. It can take a while when there are a lot of unfilled cells but if the input is close to correct it is rather fast.
My solution ignores completely empty rows and returns the longest column's row count:
private static int GetLastRow(Worksheet worksheet)
{
    int lastUsedRow = 1;
    Range range = worksheet.UsedRange;
    for (int i = 1; i <= range.Columns.Count; i++) // scan every column of the used range
    {
        int lastRow = range.Rows.Count;
        // Walk upwards from the bottom until a filled cell or the best row so far is reached
        for (int j = range.Rows.Count; j > 0; j--)
        {
            if (lastUsedRow < lastRow)
            {
                lastRow = j;
                if (!String.IsNullOrWhiteSpace(Convert.ToString((worksheet.Cells[j, i] as Range).Value)))
                {
                    if (lastUsedRow < lastRow)
                        lastUsedRow = lastRow;
                    if (lastUsedRow == range.Rows.Count)
                        return lastUsedRow; // cannot get any larger, so stop early
                    break;
                }
            }
            else
                break;
        }
    }
    return lastUsedRow;
}
For those who use the SpecialCells method (I'm not sure about the others), please note: if your last cell is merged, you won't be able to get the last row and column numbers from Range.Row and Range.Column.
You first need to unmerge the range and then get the last cell again.
This cost me a lot.
private int[] GetLastRowCol(Ex.Worksheet ws)
{
    Ex.Range last = ws.Cells.SpecialCells(Ex.XlCellType.xlCellTypeLastCell, Type.Missing);
    bool isMerged = (bool)last.MergeCells;
    if (isMerged)
    {
        last.UnMerge();
        last = ws.Cells.SpecialCells(Ex.XlCellType.xlCellTypeLastCell, Type.Missing);
    }
    return new int[2] { last.Row, last.Column };
}
As previously discussed, the techniques above (xlCellTypeLastCell etc.) do not always provide expected results. Although it's not difficult to iterate down through a column checking for values, sometimes you may find that there are empty cells or rows with data that you want to consider in subsequent rows. When using Excel directly, a good way of finding the last row is to press CTRL + Down Arrow a couple of times (you'll end up at row 1048576 for an XLSX worksheet) and then press CTRL + Up Arrow which will select the last populated cell. If you do this within Excel while recording a Macro you'll get the code to replicate this, and then it's just a case of tweaking it for C# using the Microsoft.Office.Interop.Excel libraries. For example:
private int GetLastRow()
{
    Excel.Application ExcelApp;
    ExcelApp = new Excel.Application();
    ExcelApp.Selection.End(Excel.XlDirection.xlDown).Select();
    ExcelApp.Selection.End(Excel.XlDirection.xlDown).Select();
    ExcelApp.Selection.End(Excel.XlDirection.xlDown).Select();
    ExcelApp.Selection.End(Excel.XlDirection.xlUp).Select();
    return ExcelApp.ActiveCell.Row;
}
It may not be the most elegant solution (I guess instead you could navigate to the final row within the spreadsheet first directly before using XlUp) but it seems to be more reliable.
As CtrlDot and Leo Guardian say, that method is not very accurate: in some files, formatting affects what SpecialCells returns.
So I combined it with a while loop.
Range last = sheet.Cells.SpecialCells(XlCellType.xlCellTypeLastCell, Type.Missing);
Range range = sheet.get_Range("A1", last);
int lastrow = last.Row;
// Confirm the last row: keep advancing while the next row in column A still holds a value
while (((Range)sheet.Cells[lastrow + 1, 1]).Value2 != null)
{
    lastrow++;
}
If you are using OfficeOpenXml (EPPlus) nowadays:
using OfficeOpenXml;
using System.IO;
FileInfo excelFile = new FileInfo(filename);
ExcelPackage package = new ExcelPackage(excelFile);
ExcelWorksheet sheet = package.Workbook.Worksheets[1];
int lastRow = sheet.Dimension.End.Row;
int lastColumn = sheet.Dimension.End.Column;
I don't know whether Microsoft.Office.Interop.Excel is still state of the art or more of a legacy library. In my opinion, replacing it with OfficeOpenXml has worked well for me, so this answer might be useful for future search results.
This is probably simple once I see the correct code, but what is the best way to loop through a specific column in a worksheet until the end?
It's pretty simple. Create a range object that points to the cell you want to start at, then loop through each offset from that range until you reach a blank cell.
int i = 0;
// Offset is an indexed property in C#; an empty cell returns null from Value2
while (!string.IsNullOrEmpty(Convert.ToString((object)target_range.Offset[i, 0].Value2)))
{
    i++;
}
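A variation on the same idea, sketched under the assumption that the column has already been read into an object[,] in one call (for example via Value2 on a range, which in Interop typically yields a 1-based array): walk the array until the first blank entry instead of querying Excel cell by cell. ColumnWalker and CountUntilBlank are hypothetical names.

```csharp
using System;

public static class ColumnWalker
{
    // Counts consecutive non-blank cells from the top of one column.
    // Works for 0-based and 1-based arrays by reading the bounds from the array itself.
    public static int CountUntilBlank(object[,] cells, int columnOffset = 0)
    {
        int firstRow = cells.GetLowerBound(0);
        int lastRow = cells.GetUpperBound(0);
        int col = cells.GetLowerBound(1) + columnOffset;
        int count = 0;
        for (int r = firstRow; r <= lastRow; r++)
        {
            object v = cells[r, col];
            // Treat both null and the empty string as a blank cell.
            if (v == null || (v is string s && s.Length == 0)) break;
            count++;
        }
        return count;
    }
}
```

The returned count plays the role of i in the loop above. Reading the range once and looping in memory usually avoids one COM round trip per cell.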