I'm trying to read data from an Excel sheet using the Microsoft.Office.Interop.Excel namespace. The first row of the sheet contains the headers, and I'd like to read that row without specifying its start and end cells, because I won't know when a new column has been added to the sheet.
Microsoft.Office.Interop.Excel.Application excelObj = new Application();
Microsoft.Office.Interop.Excel.Workbook myBook = excelObj.Workbooks.Open(@"D:\myFile.xlsx", 0, true, 5, "", "", true, Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 0, 0);
Microsoft.Office.Interop.Excel.Worksheet mySheet = (Worksheet)myBook.Sheets.get_Item(1);
Range range = mySheet.Cells.EntireRow;
Here, range covers every row of the sheet instead of being limited to the header columns. I also have a large amount of data to process, about 10,000 rows.
If your requirement doesn't involve writing back to the Excel file, I would suggest using Excel Data Reader (http://exceldatareader.codeplex.com/). It's a lot easier to use, doesn't require Excel on the server, and it's faster.
I think you're looking for this:
Range headers = (Range)mySheet.UsedRange.Rows[1];
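As a rough sketch of how the header names might be pulled out once that row has been read: for a multi-cell range, Value2 returns a 1-based object[,]. The helper below works on such an array, so it can be tried without Excel. HeaderReader and ReadHeaders are hypothetical names used only for illustration; they are not part of the interop API.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderReader
{
    // Collects the non-empty cells of the first row of a 2-D array, such as
    // the one Range.Value2 returns for a multi-cell range.
    public static List<string> ReadHeaders(object[,] cells)
    {
        var headers = new List<string>();
        int row = cells.GetLowerBound(0); // 1 for arrays coming from interop
        for (int col = cells.GetLowerBound(1); col <= cells.GetUpperBound(1); col++)
        {
            object cell = cells[row, col];
            if (cell == null) continue;
            string text = cell.ToString().Trim();
            if (text.Length > 0) headers.Add(text);
        }
        return headers;
    }
}
```

It would be called with something like HeaderReader.ReadHeaders((object[,])headers.Value2). Note that if the used range is a single cell, Value2 returns a scalar rather than an array, so that case would need separate handling.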
I just answered another Excel reading question here: C# converting .xls to .csv without Excel
The FileHelpers library is perfect for your task. I use it myself for those numbers of rows and above.
I don't know what you are doing with the rows once they are read from Excel, but if you are looking at processing that could be broken down into steps, have a look at Rhino.Etl. It's a really powerful way to process large amounts of data.
I'm not sure if "dynamic" is the correct word. Anyway, I have a basic understanding of Microsoft.Office.Interop.Excel. The problem is that I have about 100 Excel files in a folder, and each file has a different sheet name, number of rows, and number of columns.
As far as I understand, you need to specify the range and sheet name, i.e.:
Excel.Worksheet sheet = someExcelFiles.Sheets["SomeSheetName"] as Excel.Worksheet;
Excel.Range range = sheet.get_Range("A1:A5");
Is there any way for my application to read all the data in all of the Excel files without having to specify the sheet name and range (rows and columns)?
Short answer: yes. Long answer: the following is from DotNetPerls, which also shows how to get the number of sheets programmatically.
Range excelRange = sheet.UsedRange;
object[,] valueArray = (object[,])excelRange.get_Value(
XlRangeValueDataType.xlRangeValueDefault);
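Once valueArray has been read, it can be walked without knowing the sheet's size in advance by asking the array for its own bounds. The sketch below is one way to do that; GridReader and ToRows are hypothetical names, and the code assumes only that the input is a 2-D object array (interop arrays are 1-based, which is why the bounds are queried rather than hard-coded).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class GridReader
{
    // Converts a 2-D cell array into a list of string rows. Empty cells
    // (null) become empty strings.
    public static List<string[]> ToRows(object[,] values)
    {
        int firstRow = values.GetLowerBound(0), lastRow = values.GetUpperBound(0);
        int firstCol = values.GetLowerBound(1), lastCol = values.GetUpperBound(1);
        var rows = new List<string[]>();
        for (int r = firstRow; r <= lastRow; r++)
        {
            var row = new string[lastCol - firstCol + 1];
            for (int c = firstCol; c <= lastCol; c++)
            {
                row[c - firstCol] =
                    Convert.ToString(values[r, c], CultureInfo.InvariantCulture) ?? string.Empty;
            }
            rows.Add(row);
        }
        return rows;
    }
}
```

The same call works for every sheet in every workbook, because nothing in it depends on a sheet name or a cell address.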
I searched for examples but never found one for inserting data into an empty Excel file.
Insert into [Sheet1$] (columnname1, columnName2) values ("somevalue","somevalue");
If I understand correctly, you want a simple way to create a file that can be read in Excel. The simple solution I often use, when I don't need any advanced features of Excel sheets, is a CSV (comma-separated values) file.
You format your data like this:
COLUMN1,COLUMN2,COLUMN3
ROW1_VALUE1,ROW1_VALUE2,ROW1_VALUE3
ROW2_VALUE1,ROW2_VALUE2,ROW2_VALUE3
Between the lines there are linebreaks. On Windows use \r\n.
You can construct the file any way that you wish, for example :
File.WriteAllText("test.csv","product,price\r\nbook,100\r\ncoffee,500");
This will produce a CSV that can be read in excel.
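The one-liner above works as long as no value contains a comma, a double quote, or a line break. A minimal sketch of the usual quoting convention for those cases follows (wrap the field in double quotes and double any embedded quotes). CsvWriter, Escape, and ToCsv are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvWriter
{
    // Quotes a field when it contains a comma, a double quote, or a line break.
    public static string Escape(string field)
    {
        if (field == null) return string.Empty;
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Joins fields with commas and rows with Windows line breaks.
    public static string ToCsv(IEnumerable<string[]> rows)
    {
        return string.Join("\r\n",
            rows.Select(r => string.Join(",", r.Select(f => Escape(f)))));
    }
}
```

The result can be saved the same way as before, for example File.WriteAllText("test.csv", CsvWriter.ToCsv(rows)).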
Excel.Worksheet oSheet;
//------
oSheet.Cells[Row,Column] = "Some Info";
// --- Row & Column starts with 1
I have an Excel template in which rows 1-100 are filled in with just an ID; the other columns, Name and Email, are empty.
ID Name Email
1
2
3
.
.
100
There is no data in the Name or Email columns, yet when I get the row count I get 100. I want the count of rows in which either Name or Email is filled in. How can I do that?
Excel.Application xlApp;
Excel.Workbook xlWorkBook;
Excel.Worksheet xlWorkSheet;
Excel.Range range;
xlApp = new Excel.ApplicationClass();
xlWorkBook = xlApp.Workbooks.Open(fileName, 0, true, 5, "", "", true, Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
xlWorkSheet = (Excel.Worksheet)xlWorkBook.Worksheets.get_Item(1);
range = xlWorkSheet.UsedRange;
int iRowCount = xlWorkSheet.UsedRange.Rows.Count;
You'll have to iterate over the rows, because Excel will always report the maximum used row and column number for the worksheet.
Also, I would suggest not using Excel to read the data. Instead, use a library that can read the files directly (and does not depend on Excel being installed). I have used http://exceldatareader.codeplex.com/ quite successfully (at least for xlsx files). You may also want to download the latest sources and build them yourself, because the release is not very recent and there have been a lot of fixes since.
PS: By not using Excel you will also solve the performance problem, because using Excel as shown in your code is very slow, and this will really matter when you start iterating over rows.
ExcelDataReader, on the other hand, gives you a DataTable of the worksheet's data, which you can then parse in memory however you want. That should be 100-1000 times quicker than working with Excel.
You will have to access each Cell in your Column and check if it contains data using:
((Excel.Range)xlWorkSheet.Cells[x, y]).Value2
One easy way is to check which rows contain a value for either Name or Email.
Use
System.Array values = (System.Array)range.Cells.Value;
and then loop through the array, counting the rows where string.IsNullOrWhiteSpace returns false for either Name or Email.
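That loop might look roughly like the sketch below. It assumes the values have been cast to object[,], that the first row holds the ID / Name / Email headers, and that nameCol and emailCol are the column indices inside the array (2 and 3 for the template above, since interop arrays are 1-based). RowCounter and CountFilledRows are hypothetical names.

```csharp
using System;

public static class RowCounter
{
    private static bool HasText(object cell)
    {
        return cell != null && !string.IsNullOrWhiteSpace(cell.ToString());
    }

    // Counts data rows in which at least one of the two columns has text.
    public static int CountFilledRows(object[,] values, int nameCol, int emailCol)
    {
        int count = 0;
        // Skip the first row, assumed to hold the headers.
        for (int r = values.GetLowerBound(0) + 1; r <= values.GetUpperBound(0); r++)
        {
            if (HasText(values[r, nameCol]) || HasText(values[r, emailCol]))
                count++;
        }
        return count;
    }
}
```

Because the whole range is read in one call and checked in memory, this avoids a separate interop round trip per cell.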
The data is stored in a Dictionary<DateTime, string> and arranged by month. (The string is what needs to be inserted.)
I have a preexisting Excel file: it has all the dates of the year in the first column, and various entries in the following columns. This data needs to be preserved.
My task is basically to insert the Dictionary into this Excel file. What complicates things is that the date in the dictionary needs to correspond to the date in the Excel column (day and month). To illustrate: ("xxxx", 1980-05-12) needs to be inserted into the row whose first cell is "12-May" (this column was generated via Fill->Series).
And I have no idea. I'm spluttering by on bits of programming I'd picked up a couple of years ago. I've already extracted the data from the web page, and sorted it and all - automating some of the boring manual work. But I am faltering at the last mile, and seriously do not want to manually enter a couple of thousand data points when I know a simple script would suffice.
So, any help would be appreciated.
private void FindAndSetDate(Worksheet ws, Dictionary<DateTime, string> dict)
{
    Range find = null;
    foreach (KeyValuePair<DateTime, string> kvp in dict)
    {
        // Look for a cell whose value matches the dictionary's date key.
        find = ws.Cells.Find(kvp.Key, Type.Missing,
            Microsoft.Office.Interop.Excel.XlFindLookIn.xlValues, Microsoft.Office.Interop.Excel.XlLookAt.xlWhole,
            Microsoft.Office.Interop.Excel.XlSearchOrder.xlByRows, Microsoft.Office.Interop.Excel.XlSearchDirection.xlNext, false,
            Type.Missing, Type.Missing);
        // Write the string into the cell one column to the right of the match.
        if (find != null) find.Offset[0, 1].Value = kvp.Value;
    }
}
Take a look at this thread: How to iterate through a column in an Excel application through C# Console?
If you have Excel installed you could use excel interop to fill the excel sheet. Formatting the date is easy with the DateTime string format options. In your case myDateTime.ToString("dd-MMM") would result in the desired format.
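Since the year in the dictionary differs from the year in the sheet, one possible approach is to compare on day and month only, using that "dd-MMM" string as the key. The sketch below assumes English month abbreviations (hence the invariant culture) and assumes the date cells are read through Range.Value2, which returns dates as OLE Automation doubles. DateKeys and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class DateKeys
{
    // Formats a date as day and month only, for example "12-May".
    public static string DayMonthKey(DateTime date)
    {
        return date.ToString("dd-MMM", CultureInfo.InvariantCulture);
    }

    // Re-keys the dictionary by day and month so the year is ignored.
    // If two dates share a day and month, the later entry overwrites the earlier one.
    public static Dictionary<string, string> ByDayMonth(Dictionary<DateTime, string> source)
    {
        var result = new Dictionary<string, string>();
        foreach (KeyValuePair<DateTime, string> kvp in source)
            result[DayMonthKey(kvp.Key)] = kvp.Value;
        return result;
    }

    // Builds the same key from a date cell's Value2 (an OLE Automation date).
    public static string KeyForCell(double cellValue2)
    {
        return DayMonthKey(DateTime.FromOADate(cellValue2));
    }
}
```

While walking the first column, each cell's key can be looked up in the re-keyed dictionary, and the matching string written to the neighbouring column.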
Duplicate of: What’s the simplest way to import a System.Data.DataSet into Excel?
Using C# under VS2008, we can create an Excel application, workbook, and worksheet by doing this:
Application excelApp = new Application();
Workbook excelWb = excelApp.Workbooks.Add(template);
Worksheet excelWs = (Worksheet)this.Application.ActiveSheet;
Then we can access each cell by "excelWs.Cells[i,j]" and write/save without problems. However with large numbers of rows/columns, we are expecting a loss in efficiency.
Is there a way to "data bind" from a DataSet object into the worksheet without using the cell-by-cell approach? Most of the methods we have seen at some point revert to the cell-by-cell approach. Thanks for any suggestions.
Why not use ADO.NET using the OLEDB provider?
See: Tips for reading Excel spreadsheets using ADO.NET
This is a duplicate question.
See: What's the simplest way to import a DataSet into Excel
(The answer is, use the OleDbConnection and read the dataset from Excel).
Instead of going cell by cell, it is faster to set a 2-dimensional array of objects to an excel range.
object[,] values = ...
ws.Range["A1:F2000"].Value2 = values;
However, the OleDb way is still much faster than above.
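For the array approach, the DataSet's tables first have to be copied into a rectangular object[,], since a jagged object[][] cannot be assigned to a range. A minimal sketch of that conversion follows; TableToArray and ToArray are hypothetical names, and writing the column names as the first row is a choice made for this example.

```csharp
using System;
using System.Data;

public static class TableToArray
{
    // Copies a DataTable into a rectangular array: one header row followed
    // by the data rows. DBNull values become null so the cells stay empty.
    public static object[,] ToArray(DataTable table)
    {
        int rows = table.Rows.Count, cols = table.Columns.Count;
        var values = new object[rows + 1, cols];
        for (int c = 0; c < cols; c++)
            values[0, c] = table.Columns[c].ColumnName;
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                object cell = table.Rows[r][c];
                values[r + 1, c] = cell is DBNull ? null : cell;
            }
        }
        return values;
    }
}
```

The target range should have the same number of rows and columns as the array, so in practice the range address would be computed from the array's dimensions rather than hard-coded.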
You could also render a GridView to a string and save the results to the client as HTML or XLS. If you use the XLS file extension to associate the data with Excel, you'll see an error when you open the file in Excel. It doesn't hurt anything and your HTML data will open and look perfect in Excel...