Slow Performance When Reading Excel - c#

I want to read an Excel file, but the approach below is too slow. What pattern should I use to read Excel files faster? Should I try CSV instead?
I am using the following code:
ApplicationClass excelApp = new ApplicationClass();
Workbook myWorkBook = excelApp.Workbooks.Open(@"C:\Users\OWNER\Desktop\Employees.xlsx");
Worksheet mySheet = (Worksheet)myWorkBook.Sheets["Sheet1"];
for (int row = 1; row <= mySheet.UsedRange.Rows.Count; row++)
{
    for (int col = 1; col <= mySheet.UsedRange.Columns.Count; col++)
    {
        // One COM round trip per cell; this is the slow part
        Range dataRange = (Range)mySheet.Cells[row, col];
        // Convert.ToString handles empty cells, where Value2 is null
        Console.Write(Convert.ToString(dataRange.Value2) + " ");
    }
    Console.WriteLine();
}
excelApp.Quit();

Your program is slow because you are using Excel itself to open your Excel files. Everything you do with the file goes through COM interop, which is extremely slow because every call has to pass data between two different processes.
Microsoft does not recommend automating Excel from unattended or server-side code, and it released the Open XML SDK so that .xlsx files can be read without Excel.
I suggest you use a wrapper library around OpenXML, since the raw API is pretty hairy. See the Stack Overflow question "open xml reading from excel file" for how to use it correctly.

You're accessing the Excel file through Excel interop. Reading cell by cell makes a lot of cross-process COM calls, which performs poorly.
You can read data in ranges instead of cell by cell. This loads the data into memory, where you can iterate over it much faster. (E.g. try to load column by column.)
By the way, you could instead use a library such as http://epplus.codeplex.com, which reads Excel files directly.
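A minimal sketch of the range-based read suggested here, assuming a reference to Microsoft.Office.Interop.Excel and the file path and sheet name from the question. The class name is hypothetical, and cleanup is reduced to the essentials.

```csharp
using System;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

class BulkReadSketch
{
    static void Main()
    {
        var excelApp = new Excel.Application();
        Excel.Workbook workbook = null;
        try
        {
            workbook = excelApp.Workbooks.Open(
                @"C:\Users\OWNER\Desktop\Employees.xlsx", ReadOnly: true);
            var sheet = (Excel.Worksheet)workbook.Sheets["Sheet1"];

            // A single COM call fetches every value in the used range.
            object raw = sheet.UsedRange.Value2;

            // A one-cell range returns a scalar instead of an array.
            object[,] values = raw as object[,];
            if (values == null)
            {
                Console.WriteLine(Convert.ToString(raw));
                return;
            }

            // Arrays returned by Excel are 1-based, so use the array bounds
            // rather than assuming the loop starts at 0.
            for (int row = values.GetLowerBound(0); row <= values.GetUpperBound(0); row++)
            {
                for (int col = values.GetLowerBound(1); col <= values.GetUpperBound(1); col++)
                {
                    // Empty cells come back as null; Convert.ToString copes with that.
                    Console.Write(Convert.ToString(values[row, col]) + " ");
                }
                Console.WriteLine();
            }
        }
        finally
        {
            if (workbook != null) workbook.Close(false);
            excelApp.Quit();
            Marshal.ReleaseComObject(excelApp);
        }
    }
}
```

The loops now run entirely over an in-memory array, so the number of interop calls no longer grows with the number of cells.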

Excel Data Reader
Lightweight and very fast if reading is your only concern.
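A short sketch of what reading with this library can look like, assuming the ExcelDataReader NuGet package (version 3 or later), .NET Core or later, and the file path from the question. The class name is hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using ExcelDataReader;

class DataReaderSketch
{
    static void Main()
    {
        // Assumption: running on .NET Core or later, where the legacy code
        // pages used by old .xls files must be registered explicitly.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        using (var stream = File.Open(@"C:\Users\OWNER\Desktop\Employees.xlsx",
                                      FileMode.Open, FileAccess.Read))
        using (var reader = ExcelReaderFactory.CreateReader(stream))
        {
            do
            {
                // Each Read() advances one row within the current sheet.
                while (reader.Read())
                {
                    for (int col = 0; col < reader.FieldCount; col++)
                    {
                        Console.Write(Convert.ToString(reader.GetValue(col)) + " ");
                    }
                    Console.WriteLine();
                }
            } while (reader.NextResult()); // move on to the next sheet, if any
        }
    }
}
```

Excel does not need to be installed for this; the file is parsed directly from the stream.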

Related

Exporting Data to Excel very Slow

I am trying to export data from my C# code to MS Excel 2007, but it takes 30 seconds to insert the data into an Excel file. The code looks like this:
Excel.Application excelapp = new Excel.Application();
Excel.Workbook excelworkbook = excelapp.Workbooks.Open(fileTest);
Excel.Sheets excelsheets = excelworkbook.Worksheets;
Excel.Worksheet mysheets = (Excel.Worksheet)excelsheets.get_Item("Sheet1");
Excel.Range mycells = mysheets.Cells;
mycells.Item[destroyer, "A"].Value = s[2];
mycells.Item[destroyer, "B"].Value = s[1];
mycells.Item[destroyer, "C"].Value = s[3];
mycells.Item[destroyer, "D"].Value = dbl_standard.Text;
mycells.Item[destroyer, "E"].Value = s[4];
mycells.Item[destroyer, "F"].Value = s[7];
mycells.Item[destroyer, "G"].Value = s[5];
mycells.Item[destroyer, "H"].Value = s[6];
excelworkbook.Save();
excelworkbook.Close();
excelapp.Quit();
Marshal.ReleaseComObject(mycells);
Marshal.ReleaseComObject(mysheets);
Marshal.ReleaseComObject(excelsheets);
Marshal.ReleaseComObject(excelworkbook);
Marshal.ReleaseComObject(excelapp);
I am inserting barely 25 columns. What am I doing wrong? How can I make it faster?
Thanks in Advance
You have two issues going on here. The first is that Excel interop actually starts Excel.exe and interoperates with that process. You cannot remove the overhead of starting Excel, which is probably the bulk of your processing time.
The other is that every cell you edit triggers a lot of calls "under the hood" to the interop layer. You can vectorize these calls.
For reading:
https://stackoverflow.com/a/42604291/3387223
For writing (VB example):
https://stackoverflow.com/a/23503305/3387223
That way you only create one interop operation for the whole range of values. For 25 values that is roughly 25 times fewer interop calls than setting each cell individually.
But as I stated above, starting Excel is probably what takes most of your time.
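A minimal C# sketch of the vectorized write, assuming a reference to Microsoft.Office.Interop.Excel and C# 4 or later. The helper name `WriteRow` and its class are hypothetical; it writes one row starting at column A.

```csharp
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

static class BulkWriteSketch
{
    // Writes rowValues into the given row, starting at column A,
    // with a single assignment to the target range.
    public static void WriteRow(Excel.Worksheet sheet, int row, object[] rowValues)
    {
        // Excel expects a two-dimensional array: 1 row x N columns.
        var block = new object[1, rowValues.Length];
        for (int i = 0; i < rowValues.Length; i++)
        {
            block[0, i] = rowValues[i];
        }

        Excel.Range first = (Excel.Range)sheet.Cells[row, 1];
        Excel.Range last = (Excel.Range)sheet.Cells[row, rowValues.Length];
        Excel.Range target = sheet.Range[first, last];
        try
        {
            target.Value2 = block; // one interop call for the whole row
        }
        finally
        {
            Marshal.ReleaseComObject(target);
            Marshal.ReleaseComObject(last);
            Marshal.ReleaseComObject(first);
        }
    }
}
```

With the variables from the question, the eight assignments would become one call along the lines of `BulkWriteSketch.WriteRow(mysheets, destroyer, new object[] { s[2], s[1], s[3], dbl_standard.Text, s[4], s[7], s[5], s[6] });`.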
You can read and write Excel sheets faster with OpenXML, but maybe you'll run into some formatting issues, and you won't get instant updates of other formulas in your Excel sheet (if that's what you need).
https://msdn.microsoft.com/en-us/us-en/library/office/bb448854.aspx
Here's an example on generating Excel sheets:
https://msdn.microsoft.com/en-us/library/office/hh180830(v=office.14).aspx
And if you want an easier time dealing with OpenXml there is ClosedXml:
https://github.com/closedxml/closedxml
Which will make OpenXml about as easy as standard interop.
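As a rough illustration, writing one row with ClosedXML could look like the sketch below. It assumes the ClosedXML NuGet package and a sheet named "Sheet1" as in the question; the class and method names are hypothetical.

```csharp
using ClosedXML.Excel;

static class ClosedXmlWriteSketch
{
    // Opens an existing .xlsx file, fills one row starting at column A,
    // and saves the file in place. Excel itself is never started.
    public static void WriteRow(string path, int row, string[] rowValues)
    {
        using (var workbook = new XLWorkbook(path))
        {
            IXLWorksheet sheet = workbook.Worksheet("Sheet1");
            for (int i = 0; i < rowValues.Length; i++)
            {
                // ClosedXML rows and columns are 1-based.
                sheet.Cell(row, i + 1).Value = rowValues[i];
            }
            workbook.Save();
        }
    }
}
```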

Interop Excel is slow

I am writing an application to open an Excel sheet and read it
MyApp = new Excel.Application();
MyBook = MyApp.Workbooks.Open(filename);
MySheet = (Excel.Worksheet)MyBook.Sheets[1]; // Explicit cast is not required here
lastRow = MySheet.Cells.SpecialCells(Excel.XlCellType.xlCellTypeLastCell).Row;
MyApp.Visible = false;
It takes about 6-7 seconds for this to run. Is this normal with Excel interop?
Also, is there a quicker way to read an Excel file than this?
string[] xx = new string[lastRow];
for (int index = 1; index <= lastRow; index++)
{
    int maxCol = endCol - startCol;
    for (int j = 1; j <= maxCol; j++)
    {
        try
        {
            xx[index - 1] += (MySheet.Cells[index, j] as Excel.Range).Value2.ToString();
        }
        catch
        {
        }
        if (j != maxCol) xx[index - 1] += "|";
    }
}
MyApp.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject(MySheet);
System.Runtime.InteropServices.Marshal.ReleaseComObject(MyBook);
System.Runtime.InteropServices.Marshal.ReleaseComObject(MyApp);
Adding to the answer of @RvdK: yes, COM interop is slow.
Why is it slow?
It is slow because of how it works. Every call made from .NET must be marshaled to a local COM proxy. From there it is marshaled from one process (your app) to the COM server (Excel) through IPC inside the Windows kernel. On the server side it is dispatched into native code, where the arguments are converted from OLE Automation compatible types into native types and validated, and then the function is performed. The result travels back approximately the same way, through several layers between two different processes.
So every single call is quite expensive, and the more calls you make, the slower the whole process becomes. You can find plenty of documentation on the web, since COM is an old and well-established standard (although it has been fading since Visual Basic 6).
One example of such article is here: http://www.codeproject.com/Articles/990/Understanding-Classic-COM-Interoperability-With-NE
Is there a quicker way to read?
ClosedXML can both read and write Excel xlsx files (even formulas, formatting and stuff) using Microsoft's OpenXml SDK, see here: https://closedxml.codeplex.com/wikipage?title=Finding%20and%20extracting%20the%20data&referringTitle=Documentation
Excel data reader claims to be able to read both legacy and new Excel data files, I did not try it myself, take a look here: https://exceldatareader.codeplex.com/
Another way to read data faster is to use Excel automation to convert the sheet into a data file that you can parse easily and batch process without the interop layer (e.g. XML or CSV). This answer shows how to do it.
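A hedged sketch of the CSV route, assuming a reference to Microsoft.Office.Interop.Excel and an already opened workbook. The class and method names are hypothetical. Note that saving as CSV exports only the active sheet, and that the workbook is closed afterwards because Excel keeps the exported file locked while it is open.

```csharp
using System;
using System.IO;
using Excel = Microsoft.Office.Interop.Excel;

static class CsvExportSketch
{
    // Exports the active sheet of an open workbook to a temporary CSV file,
    // closes the workbook, and returns the raw CSV lines.
    public static string[] ReadActiveSheetViaCsv(Excel.Application app, Excel.Workbook book)
    {
        string csvPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid() + ".csv");

        bool oldAlerts = app.DisplayAlerts;
        app.DisplayAlerts = false; // suppress the "features may be lost" prompt
        try
        {
            book.SaveAs(csvPath, Excel.XlFileFormat.xlCSV);
            book.Close(false); // releases Excel's lock on the CSV file
        }
        finally
        {
            app.DisplayAlerts = oldAlerts;
        }

        try
        {
            // Plain file I/O from here on; no further interop calls.
            // Assumption: the content is ASCII or otherwise decodes correctly
            // with the default encoding detection of File.ReadAllLines.
            return File.ReadAllLines(csvPath);
        }
        finally
        {
            File.Delete(csvPath);
        }
    }
}
```

Only two interop calls touch the data here, regardless of how many rows the sheet has. Splitting each line into fields still needs a CSV parser that understands quoted values.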
Short answer: correct, interop is slow. (I had the same problem: it took a couple of seconds to read 300 lines.)
Use a library for this:
http://epplus.codeplex.com/
http://npoi.codeplex.com/
This answer is only about the second part of your question.
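You are using lots of single-cell ranges there, which is not how the API is intended to be used and is indeed very slow.
First read the complete range, and then iterate over the result like so:
// Value2 of a multi-cell range is a two-dimensional object array
object[,] xx = (object[,])MySheet.Range["A1", "XX100"].Value2;
// Arrays returned by Excel are 1-based, so loop over the array bounds
for (int i = xx.GetLowerBound(0); i <= xx.GetUpperBound(0); i++)
{
    for (int j = xx.GetLowerBound(1); j <= xx.GetUpperBound(1); j++)
    {
        // Empty cells are null; Convert.ToString turns them into ""
        Console.WriteLine(Convert.ToString(xx[i, j]));
    }
}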
This will be much faster!
You can use this free library, xls & xlsx supported,
Workbook wb = new Workbook();
wb.LoadFromFile(ofd.FileName);
https://freenetexcel.codeplex.com/

read/write a simple excel file using c#

I'm trying to find a simple way of writing an excel file in c#, but everything that I've found on thank you for your help.
You have two options available to you.
The first is to use the interop assemblies. Here is a link to some sample code on how to do that: Write Data to Excel using C#
The second option is to use OLEDB. There is some information on Stack Overflow on that here.
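For the OLEDB option, a minimal sketch might look like this. It assumes the Microsoft ACE OLEDB 12.0 provider is installed, that the workbook has a sheet named "Sheet1", and that its first row contains the headers "Name" and "Amount"; those headers, the class, and the method are hypothetical.

```csharp
using System.Data.OleDb;

static class OleDbWriteSketch
{
    // Appends one row below the existing data of Sheet1.
    public static void AppendRow(string path, string name, double amount)
    {
        // HDR=YES tells the provider that the first row holds column names.
        string connectionString =
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
            ";Extended Properties=\"Excel 12.0 Xml;HDR=YES\"";

        using (var connection = new OleDbConnection(connectionString))
        using (var command = connection.CreateCommand())
        {
            connection.Open();
            // A sheet is addressed like a table, with a trailing $ in its name.
            command.CommandText =
                "INSERT INTO [Sheet1$] ([Name], [Amount]) VALUES (?, ?)";
            // OLEDB parameters are positional; the names are for readability only.
            command.Parameters.AddWithValue("@name", name);
            command.Parameters.AddWithValue("@amount", amount);
            command.ExecuteNonQuery();
        }
    }
}
```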
Use EPPlus as mentioned above; it makes this really simple. This is the code for a spreadsheet I created with it today.
using (ExcelPackage pck = new ExcelPackage())
{
    //Give the worksheet a name
    ExcelWorksheet ws = pck.Workbook.Worksheets.Add("Inventory as of " + DateTime.Now.ToString("MM/dd/yyyy"));
    //dt is a DataTable that I am turning into an Excel document
    ws.Cells["A1"].LoadFromDataTable(dt, true);
    //Format the header columns (color, pattern, etc.)
    using (ExcelRange rng = ws.Cells["A1:AA1"])
    {
        rng.Style.Font.Bold = true;
        rng.Style.Fill.PatternType = ExcelFillStyle.Solid; //Set pattern for the background to solid
        rng.Style.Fill.BackgroundColor.SetColor(Color.FromArgb(79, 129, 189)); //Set color to dark blue
        rng.Style.Font.Color.SetColor(Color.White);
    }
    //End format the header columns
    //Give the file details (i.e. filename, etc.)
    Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    Response.AddHeader("content-disposition", "attachment;filename=Inventory Report " + DateTime.Now.ToString("MM/dd/yyyy") + ".xlsx");
    //Write the file; GetAsByteArray saves the package, so no separate Save() call is needed
    Response.BinaryWrite(pck.GetAsByteArray());
    Response.End();
}
What you need is EPPlus. It will help you create Excel 2007+ files; it is not compatible with Excel 2003 and earlier.
There is no really easy way; it depends on the version of the Excel file you want to write. If you want to write xls, you don't have many options other than Excel interop, which also requires Excel to be installed.
The newer format offers more options because it is just XML in the background. You can choose how to create it: by hand, with a library, or again with interop.
If you just want to display a table without any styling, you can (as far as I remember) write a CSV file, which Excel opens quite well (depending on the data types you use in it).
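A small sketch of the CSV approach using only the .NET base class library. The class and method names are hypothetical. It quotes fields in the usual RFC 4180 style; depending on regional settings, Excel may expect a semicolon instead of a comma as the separator.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

static class CsvSketch
{
    // Quotes a field when it contains a comma, a quote, or a line break,
    // and doubles any embedded quotes.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Joins the escaped fields of one row into a single CSV line.
    public static string ToCsvLine(IEnumerable<string> fields)
    {
        return string.Join(",", fields.Select(Escape));
    }

    // Writes all rows to the given path, one CSV line per row.
    public static void Write(string path, IEnumerable<IEnumerable<string>> rows)
    {
        // UTF-8 with a byte order mark helps Excel detect the encoding.
        File.WriteAllLines(path, rows.Select(ToCsvLine), new UTF8Encoding(true));
    }
}
```

A call such as `CsvSketch.Write("report.csv", new[] { new[] { "Name", "Amount" }, new[] { "Widget, large", "12.5" } });` would produce a file that Excel can open directly, without styling.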

Saving Excel 2007 documents

In .NET C# I'm trying to open an Excel template, add some data and save it as a new document. I'm trying to use the OpenXML document format. I can't seem to find any guidance on how to do this. Seems like all the documentation talks about how to write various parts to the Package but I can't find anything on what to do when you're done and want to save it.
Anyone know where I can find this information? I must be thinking about this incorrectly because I'm not finding anything useful on what seems to be very basic.
Thanks
ExcelPackage works pretty well for that. I don't think the primary author has worked on it for a little while, but it has a good following of people on its forum who work out any issues.
FileInfo template = new FileInfo(Path.GetDirectoryName(Application.ExecutablePath) + "\\Template.xlsx");
using (ExcelPackage xlPackage = new ExcelPackage(strFileName, template))
{
    //Enable DEBUG mode to create the xl folder (equivalent to expanding a xlsx.zip file)
    //xlPackage.DebugMode = true;
    ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets["Sheet1"];
    worksheet.Name = WorkSheetName;
    // startRow is the first data row in the template; it is defined elsewhere
    int r = startRow;
    foreach (DataRow row in dt.Rows)
    {
        int c = 1;
        if (r > startRow) worksheet.InsertRow(r);
        // our query has the columns in the right order, so simply
        // iterate through the columns
        foreach (DataColumn col in dt.Columns)
        {
            // skip database NULLs
            if (row[col] != DBNull.Value)
            {
                worksheet.Cell(r, c).Value = row[col].ToString();
                worksheet.Column(c).Width = 10;
            }
            c++;
        }
        r++;
    }
    // keep the sheet in normal view rather than page layout view
    worksheet.View.PageLayoutView = false;
    // save our new workbook and we are done!
    xlPackage.Save();
} // the using block disposes the package
Accessing Open XML / SpreadsheetML documents is far from a trivial exercise. The specification is large and complex. The "Open XML SDK" (google it) definitely helps, but still requires some knowledge of the Open XML standard to get much done.
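To illustrate the "what to do when you're done" part with the Open XML SDK itself, here is a minimal sketch. It assumes the DocumentFormat.OpenXml NuGet package and a template whose first sheet already contains a sheet data element; the class, method, and parameter names are hypothetical. It copies the template, appends one text cell below the existing rows, and saves.

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class TemplateFillSketch
{
    public static void Fill(string templatePath, string outputPath, string text)
    {
        // Work on a copy so the template itself is never modified.
        File.Copy(templatePath, outputPath, true);

        using (SpreadsheetDocument document = SpreadsheetDocument.Open(outputPath, true))
        {
            WorkbookPart workbookPart = document.WorkbookPart;
            Sheet sheet = workbookPart.Workbook.Descendants<Sheet>().First();
            var worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);
            SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();

            // Find the row index just below the last existing row.
            uint nextRow = sheetData.Elements<Row>()
                .Select(r => r.RowIndex != null ? r.RowIndex.Value : 0u)
                .DefaultIfEmpty(0u)
                .Max() + 1;

            // An inline string avoids having to touch the shared string table.
            var cell = new Cell
            {
                CellReference = "A" + nextRow,
                DataType = CellValues.InlineString,
                InlineString = new InlineString(new Text(text))
            };
            var row = new Row { RowIndex = nextRow };
            row.Append(cell);
            sheetData.Append(row);

            // Write the modified worksheet part back into the package.
            worksheetPart.Worksheet.Save();
        } // disposing the document closes the package at outputPath
    }
}
```

The key point is that there is no separate "save as" step: the package is opened in editable mode on a copy of the template, each changed part is saved, and disposing the document closes the file.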
SpreadsheetGear for .NET has an API similar to Excel and can read and write Excel Open XML (xlsx) documents as well as Excel 97-2003 (xls) documents.
You can see some SpreadsheetGear samples here and download a free trial here.
Disclaimer: I own SpreadsheetGear LLC

Read excel file from a stream

I need a way to read an Excel file from a stream. It doesn't seem to work with the ADO.NET way of doing things.
The scenario is that a user uploads a file through a FileUpload control, and I need to read some values from the file and import them into a database.
For several reasons I can't save the file to disk, and there is no reason to do so either.
So, does anyone know of a way to read an Excel file from a FileUpload stream?
It seems I found a solution to the problem myself.
http://www.codeplex.com/ExcelDataReader
This library seems to work nicely, and it takes a stream to read the Excel file.
ExcelDataReader reader = new ExcelDataReader(ExcelFileUpload.PostedFile.InputStream);
This can be done easily with EPPlus.
//the Excel sheet as a byte array (for example from a FileUpload control)
byte[] bin = FileUpload1.FileBytes;
//load the byte array into a memory stream
using (MemoryStream ms = new MemoryStream(bin))
using (ExcelPackage package = new ExcelPackage(ms))
{
    //get the first sheet from the Excel file
    ExcelWorksheet sheet = package.Workbook.Worksheets[1];
    //loop over all rows in the sheet
    for (int i = sheet.Dimension.Start.Row; i <= sheet.Dimension.End.Row; i++)
    {
        //loop over all columns in a row
        for (int j = sheet.Dimension.Start.Column; j <= sheet.Dimension.End.Column; j++)
        {
            //do something with the current cell value; empty cells yield ""
            string currentCellValue = Convert.ToString(sheet.Cells[i, j].Value);
        }
    }
}
SpreadsheetGear can do it:
SpreadsheetGear.IWorkbook workbook = SpreadsheetGear.Factory.GetWorkbookSet().Workbooks.OpenFromStream(stream);
You can try it for yourself with the free evaluation.
Disclaimer: I own SpreadsheetGear LLC
Infragistics has an excel component that can read an excel file from a stream.
I'm using it in a project here and it works well.
Also, the open source myXls component could easily be modified to support this. The XlsDocument constructor only supports loading from a file given by a file name, but it works by creating a FileStream and then reading the stream, so changing it to support loading from streams should be trivial.
Edit:
I see that you found a solution but I just wanted to note that I updated the source code for the component so that it now can read an excel file directly from a stream. :-)
I use the ClosedXML NuGet package to read Excel content from a stream. The XLWorkbook class has a constructor overload that takes a stream pointing to an Excel file (i.e. a workbook).
Import the namespace at the top of your code file:
using ClosedXML.Excel;
Source code:
var stream = /*obtain the stream from your source*/;
if (stream.Length != 0)
{
    //handle the stream here
    using (XLWorkbook excelWorkbook = new XLWorkbook(stream))
    {
        var name = excelWorkbook.Worksheet(1).Name;
        //do more things whatever you like as you now have a handle to the entire workbook.
        var firstRow = excelWorkbook.Worksheet(1).Row(1);
    }
}
