C# range.Rows.Count doesn't count right - Excel formatting?

I'm using C# to get data from Excel.
To read the data I use this loop:
for (int rCnt = 2; rCnt <= range.Rows.Count; rCnt++)
The sheet is 80 rows long, but range.Rows.Count reports 135 rows.
I have this problem with two Excel files.
The Excel files are generated from SharePoint and have filters and some other formatting.
When I copy the data into an empty Excel file (with Ctrl + A, not manually selected), it counts the right number of rows.
With a third Excel file (also from SharePoint) there is no problem.
One solution might be to change the Excel file first; it is only needed for my program and nothing else, so that would be OK.
Any ideas?
Edit:
I paused the code and saw that after row 80 all the entries in the object are null, so there is no hidden data.
Edit 2:
I deleted all the data from that sheet and now it counts 137 rows, so some formatting must be what is being counted.

First of all, you mention that the Excel file has some filtering and formatting. Could it be that some of the formatting is applied to the first 135 rows in the file, and the select therefore returns them all?
Secondly, what do you use to read the Excel file? Do you use OleDb?
And are the rest of the returned rows empty?
If so, you can use: SELECT * FROM SHEET WHERE [Column] IS NOT NULL
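If the file is read through Excel interop instead of OleDb, a similar effect can be had by ignoring trailing rows whose cells are all empty. The following is only a sketch: the class and method names are made up for illustration, and it assumes the range values have already been pulled into a two-dimensional array (for a multi-cell range, range.Value2 returns a 1-based object[,]).

```csharp
using System;

public static class UsedRangeHelper
{
    // Returns the index of the last row that holds at least one non-empty
    // cell. When every cell is empty, returns (lower bound - 1).
    // Works with 0-based arrays and with the 1-based arrays interop returns.
    public static int LastNonEmptyRow(object[,] values)
    {
        int firstRow = values.GetLowerBound(0);
        int lastRow = values.GetUpperBound(0);
        int firstCol = values.GetLowerBound(1);
        int lastCol = values.GetUpperBound(1);

        // Walk upwards from the bottom so formatted-but-empty rows are skipped.
        for (int r = lastRow; r >= firstRow; r--)
        {
            for (int c = firstCol; c <= lastCol; c++)
            {
                object cell = values[r, c];
                if (cell != null && !string.IsNullOrWhiteSpace(cell.ToString()))
                {
                    return r;
                }
            }
        }
        return firstRow - 1;
    }
}
```

The loop from the question could then run up to UsedRangeHelper.LastNonEmptyRow((object[,])range.Value2) instead of range.Rows.Count.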

Related

Select, Copy, and Re-Insert a range of Excel Rows Using C#

I am using C# to replace a now-defunct legacy routine that worked well in Delphi 7. The routine opens each of my 30 Excel files, selects the last 22 rows in each, copies them, and reinserts them into the same file.
For example, I find the last row, August of 2019 has 22 business days so I begin my range 22 rows back from the last row.
Using the Excel file, I want to select the rows that range from A4157 to A4178. Then, I want to take the rows I’ve copied and insert them after row A4156 or before row A4157. I then take a separate list of August dates and write them to each of the lower 22 from A4178 to A4200. July is then complete and rows for August are ready to go.
I retain the last row because additional calculations on other worksheets in the same workbook refer to the last row for data and they will index automatically from A4178 to A4200 doing it this way.
As each entry is made during the month, its data are copied down to the last row, so the last row is always up to date. Another sheet in the workbook uses that up-to-date data in its summary.
So far, I can open the Excel workbook and get the right sheet.
The following does highlight the proper rows, but I get a runtime error and I'm not sure what it means.
I've commented out the "select" line and run the "copy" line with the same result. I'm now into my second week of trying to work out this problem. I need an example of a functioning routine if possible. A link to a text on interfacing C# with MS Office would also be a great help.
Excel.Range selectRange = excelWorkSheet.Range[excelWorkSheet.Cells[A4157], excelWorkSheet.Cells[A4178].select;
Excel.Range copyRange = excelWorkSheet.Range[excelWorkSheet.Cells[A4157], excelWorkSheet.Cells[A4178]].select;
Microsoft.CSharp.RuntimeBinder.RuntimeBinderException: 'Cannot implicitly convert type 'bool' to 'Microsoft.Office.Interop.Excel.Range''
I'm not 100% clear on what you want in your final worksheet. But I think you're trying to do something like this.
// In your example startRow = 4157 and lastRow = 4178.
// Get the rows to copy (rows 4157 to 4178 in your example).
Excel.Range copyRange = excelWorkSheet.Rows[startRow + ":" + lastRow];
// Insert enough new rows to fit the rows we're copying.
copyRange.Insert(Excel.XlInsertShiftDirection.xlShiftDown);
// The copied data will be put in the same place (starting at row 4157 in
// your example).
Excel.Range dest = excelWorkSheet.Rows[startRow + ":" + lastRow];
copyRange.Copy(dest);
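The startRow and lastRow values used above follow from the "22 rows back from the last row" rule in the question. A small sketch of that arithmetic, with a hypothetical helper name that is not part of any Excel API:

```csharp
using System;

public static class RowBlock
{
    // Builds a whole-row address such as "4157:4178" covering the last
    // `count` rows that end at `lastRow` (1-based, inclusive).
    public static string LastRowsAddress(int lastRow, int count)
    {
        if (count < 1 || count > lastRow)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }
        int startRow = lastRow - count + 1;
        return startRow + ":" + lastRow;
    }
}
```

With lastRow = 4178 and count = 22 this gives "4157:4178", which can be passed to excelWorkSheet.Rows[...]. As for the exception in the question, it most likely comes from assigning the result of the select call to an Excel.Range variable: Select does not return a Range, so the range should be obtained first and Select() called on it as a separate statement.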
Take a look at EPPlus: https://github.com/JanKallman/EPPlus. It's available as a NuGet package.
There is a tutorial in the GitHub wiki and a sample app that exercises many of the features.

Search text in particular column and retrieve data from another column in excel using C#

I have an Excel file with 20 columns and about 150 records. I need to search for a particular string in a particular column with the header "DESCRIPTION" (usually column B). The search string and the column header value come from an INI file. Once the search string is found, I need to copy the value in column J (the column also comes from the INI file) to the output file.
I am new to C#. Can somebody help me here? I tried Range.Find but I got confused.
Usually OLEDB with SQL-like statements can be used to retrieve specific column or row data. Please go through the following solutions:
http://www.c-sharpcorner.com/UploadFile/6b8651/read-excel-file-in-windows-application-using-C-Sharp/
https://www.codeproject.com/articles/1088970/read-write-excel-file-with-oledb-in-csharp-without
Hope this gives you some ideas. But try to show some of your code so that members can better understand your approach. Thanks
Apart from an OLEDB connection you can also use ClosedXML. ClosedXML is a wrapper around OpenXML that allows you to easily work with .XLSX files.
https://github.com/ClosedXML/ClosedXML
The documentation will help you understand how to search for text.
If you want to search .XLS files, you can use the Microsoft Excel interop libraries. These libraries cannot be used in a web app and are slower than ClosedXML, but they support all kinds of Excel files.
You can use the free version of GemBox.Spreadsheet to search for text in both the XLS and XLSX file formats. As for the INI file, you can use MadMilkman.Ini.
Here is an example that you could try:
// Load INI file.
IniFile ini = new IniFile();
ini.Load("Sample.ini");
// Get INI values.
string header = ini.Sections["SampleSection"].Keys["ColumnHeaderValue"].Value;
string search = ini.Sections["SampleSection"].Keys["SearchTextValue"].Value;
string j = ini.Sections["SampleSection"].Keys["JColumnValue"].Value;
// Load XLSX file.
ExcelFile excel = ExcelFile.Load("Sample.xlsx");
ExcelWorksheet sheet = excel.Worksheets[0];
// Find column header value in first row.
ExcelColumn searchColumn = sheet.Rows[0].Cells
.First(cell => cell.ValueType == CellValueType.String && cell.StringValue == header)
.Column;
// Find search value in column.
int r, c;
searchColumn.Cells.FindText(search, false, false, out r, out c);
ExcelCell searchCell = sheet.Cells[r, c];
// Get cell from column "J" that is in the same row as cell that has search text.
ExcelCell jCell = sheet.Cells[r, ExcelColumnCollection.ColumnNameToIndex("J")];
// Set cell value.
jCell.Value = j;
// Save XLSX file.
excel.Save("Sample.xlsx");
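Whichever library loads the sheet, the lookup itself is small. Below is a library-independent sketch, assuming the cell values are already in a two-dimensional array whose first row holds the headers and whose first column is column A. SheetLookup, ColumnNumber and FindValue are illustrative names, not part of any of the libraries mentioned above.

```csharp
using System;

public static class SheetLookup
{
    // Converts a column name such as "J" or "AA" to its 1-based number.
    public static int ColumnNumber(string letters)
    {
        if (string.IsNullOrEmpty(letters)) throw new ArgumentException("Empty column name.");
        int n = 0;
        foreach (char ch in letters.ToUpperInvariant())
        {
            if (ch < 'A' || ch > 'Z') throw new ArgumentException("Invalid column name.");
            n = n * 26 + (ch - 'A' + 1);
        }
        return n;
    }

    // Finds the column whose header equals `header`, scans it for the first
    // cell containing `search`, and returns the value from `resultColumn`
    // in that row. Returns null when nothing matches.
    public static object FindValue(object[,] cells, string header, string search, string resultColumn)
    {
        int r0 = cells.GetLowerBound(0), r1 = cells.GetUpperBound(0);
        int c0 = cells.GetLowerBound(1), c1 = cells.GetUpperBound(1);

        int headerCol = -1;
        for (int c = c0; c <= c1; c++)
        {
            string name = Convert.ToString(cells[r0, c]);
            if (name != null && string.Equals(name.Trim(), header, StringComparison.OrdinalIgnoreCase))
            {
                headerCol = c;
                break;
            }
        }
        if (headerCol < 0) return null;

        int resultCol = c0 + ColumnNumber(resultColumn) - 1;
        if (resultCol > c1) return null;

        for (int r = r0 + 1; r <= r1; r++)
        {
            string text = Convert.ToString(cells[r, headerCol]);
            if (text != null && text.IndexOf(search, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return cells[r, resultCol];
            }
        }
        return null;
    }
}
```

The matching here is case-insensitive and by substring; that is a choice made for the sketch, and an exact match may suit the INI values better.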

ODBC driver cannot read rows added in Excel

I create an xls/xlsx file from C# using ODBC (with Provider=Microsoft.ACE.OLEDB.12.0). The resulting table has, for example, 4 rows. I open the file in Excel, add a 5th row, and save the file. When I try to read it from C# over ODBC with SELECT * FROM [table], I get only the original 4 rows, without the 5th. It seems ODBC stores the number of rows somewhere in the XLS file and later reads only that many, ignoring new data entered from Excel or LibreOffice. Is this a known problem, and can I solve it? If I create a new spreadsheet in Excel, all its rows are read from C#.
EDIT: I found some useful information. When the XLS file is first created from C#/ODBC, it contains 2 tables (sheets). If the table name is TABLE, DataTable sheets = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null) will contain sheets.Rows[0] == "TABLE" and sheets.Rows[1] == "TABLE$". Excel shows only one sheet, "TABLE". After editing, the changes (the 5th row) exist only in the "TABLE$" sheet.
Are you adding the 5th row by code? If yes, could you please share the lines of code you use to do it? Your code might have one of the following issues:
The save is not committed properly.
The connection is not refreshed before the file is read.
I think I found the problem. It seems that the internal spreadsheet names created by Excel have a "$" sign at the end, while the sheet names generated by ODBC are the exact string given in CREATE TABLE. Excel (and LibreOffice), on the other hand, show only one sheet for both the TABLE and TABLE$ sheets. If I edit the table in Excel and save, the changes are only in TABLE$; the other sheet, TABLE, is unchanged. When I do SELECT * FROM [TABLE], the result comes from the original ODBC-generated table, without the Excel changes. I now enumerate the available sheets inside the XLS file, and if the first sheet name does not end with "$" and there is more than one sheet, I append "$" to the first sheet name and open the correct table. I suppose the ODBC connection string may include an option to work with "$"-ending tables...
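The sheet-name rule described above can be sketched as a small helper. The names are hypothetical; the list of sheet names is assumed to come from the TABLE_NAME column of the schema table returned by GetOleDbSchemaTable.

```csharp
using System;
using System.Collections.Generic;

public static class SheetNames
{
    // Applies the rule from the post: when there is more than one sheet and
    // the first name does not end with "$", append "$" so the query reads
    // the sheet that Excel actually edits.
    public static string ResolveFirstSheet(IReadOnlyList<string> sheetNames)
    {
        if (sheetNames == null || sheetNames.Count == 0)
        {
            throw new ArgumentException("No sheets found.", nameof(sheetNames));
        }
        string first = sheetNames[0];
        if (sheetNames.Count > 1 && !first.EndsWith("$", StringComparison.Ordinal))
        {
            return first + "$";
        }
        return first;
    }
}
```

The query would then be built as "SELECT * FROM [" + SheetNames.ResolveFirstSheet(names) + "]".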

How to dump data in to Excel file beyond its limitation?

I have more than 2 million rows of data and I want to dump them into an Excel file, but according to the specification an Excel file can contain only 1,048,576 rows.
Suppose I have 40 million rows in the database and I want to dump this data into an Excel file.
I ran one test and got the same result: 1,048,576 rows were written successfully, and after that I got this error:
Exception from HRESULT: 0x800A03EC Error
Code:
for (int i = 1; i <= 1200000; i++)
{
oSheet.Cells[i, 1] = i;
}
I thought of using a CSV file, but I can't, because a CSV file cannot carry colors and styles (as per this answer), and my Excel file is going to contain many colors and styles.
Is there any third-party tool, or anything else, that lets me dump more than 2 million rows into an Excel file? I don't mind whether it is paid or free.
As you said, the current Excel specification allows a maximum of 1,048,576 rows per sheet. But the number of sheets is limited only by memory.
Maybe splitting the content across multiple sheets would be a solution.
Or, if you want to do some analysis on the data, you could aggregate the information before loading it into the Excel file.
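Splitting across sheets comes down to mapping each record to a sheet and a row. A minimal sketch, assuming no header rows and using made-up names (SheetSplitter, Locate, SheetsNeeded):

```csharp
using System;

public static class SheetSplitter
{
    // Row limit of a single worksheet in current Excel versions.
    public const int MaxRowsPerSheet = 1048576;

    // Maps a 0-based record index to a 0-based sheet index and a 1-based row.
    public static (int SheetIndex, int Row) Locate(long recordIndex, int rowsPerSheet = MaxRowsPerSheet)
    {
        if (recordIndex < 0) throw new ArgumentOutOfRangeException(nameof(recordIndex));
        if (rowsPerSheet < 1 || rowsPerSheet > MaxRowsPerSheet)
        {
            throw new ArgumentOutOfRangeException(nameof(rowsPerSheet));
        }
        int sheet = (int)(recordIndex / rowsPerSheet);
        int row = (int)(recordIndex % rowsPerSheet) + 1;
        return (sheet, row);
    }

    // Number of sheets needed to hold `totalRecords` records.
    public static int SheetsNeeded(long totalRecords, int rowsPerSheet = MaxRowsPerSheet)
    {
        return (int)((totalRecords + rowsPerSheet - 1) / rowsPerSheet);
    }
}
```

For the 40 million rows mentioned in the question, SheetsNeeded returns 39. Whether a workbook of that size is practical to open is a separate matter.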

Bulk Copy of multiple Excel files to database

I need to read the data in a particular range of an Excel file and upload it to a database.
The required data does not start at cell A1; it starts at A15, and A14 is the header row for the columns. There are seven columns with headers.
(I tried to read the cells via the "get_Range" option.)
We need to read the data in each cell and do a row-by-row update in the database.
There are thousands of files of the same type in a specific folder.
I am trying to do this as a C# console app because it is just a one-time job.
Here is the answer I found.
Step 1 : Loop through each file in the source directory.
Step 2 : Add the Excel Interop reference and create the Excel Application object, plus objects for the Workbook and the Range (for the used range).
Step 3 : Use the get_Range() function and read the rows. (Since this solution is specific to one problem, the start and end ranges of rows and columns are well known.)
Step 4 : Each row read can be appended to a string until the end of the file is reached.
OR
The insert can be done after reading each row.
Step 5 : Get the connection string and create a SqlConnection object to perform the insert. It is better to wrap the inserts in a transaction and commit.
Done. Thanks to all.
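As a variation on steps 3 to 5, the cells read from each file could be collected into a DataTable first. This is only a sketch under assumptions: the block of cells starts at the header row (A14 in the question), the seven header names are unique, and RangeToTable is a made-up name.

```csharp
using System;
using System.Data;

public static class RangeToTable
{
    // Converts a block of cell values into a DataTable. The first row of the
    // block supplies the column names; rows that are entirely empty are skipped.
    public static DataTable FromCells(object[,] cells)
    {
        int r0 = cells.GetLowerBound(0), r1 = cells.GetUpperBound(0);
        int c0 = cells.GetLowerBound(1), c1 = cells.GetUpperBound(1);

        var table = new DataTable();
        for (int c = c0; c <= c1; c++)
        {
            string name = Convert.ToString(cells[r0, c]);
            // Fall back to a generated name when a header cell is blank.
            table.Columns.Add(
                string.IsNullOrWhiteSpace(name) ? "Column" + (c - c0 + 1) : name.Trim(),
                typeof(object));
        }

        for (int r = r0 + 1; r <= r1; r++)
        {
            DataRow row = table.NewRow();
            bool hasData = false;
            for (int c = c0; c <= c1; c++)
            {
                object value = cells[r, c];
                if (value != null)
                {
                    row[c - c0] = value;
                    hasData = true;
                }
            }
            if (hasData) table.Rows.Add(row);
        }
        return table;
    }
}
```

The resulting table can then be inserted row by row inside a transaction as in step 5, or handed to SqlBulkCopy.WriteToServer, which is a different technique from the row-by-row insert described above.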
