I'm attempting Excel automation through C#. I have followed all the instructions from Microsoft on how to go about this, but I'm still struggling to discard the final reference(s) to Excel for it to close and to enable the GC to collect it.
A code sample follows. When I comment out the code block that contains lines similar to:
Sheet.Cells[iRowCount, 1] = data["fullname"].ToString();
then the file saves and Excel quits. Otherwise the file saves but Excel is left running as a process. The next time this code runs it creates a new instance and they eventually build up.
Any help is appreciated. Thanks.
This is the barebones of my code:
Excel.Application xl = null;
Excel._Workbook wBook = null;
Excel._Worksheet wSheet = null;
Excel.Range range = null;
object m_objOpt = System.Reflection.Missing.Value;
try
{
// open the template
xl = new Excel.Application();
wBook = (Excel._Workbook)xl.Workbooks.Open(excelTemplatePath + _report.ExcelTemplate, false, false, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt);
wSheet = (Excel._Worksheet)wBook.ActiveSheet;
int iRowCount = 2;
// enumerate and drop the values straight into the Excel file
while (data.Read())
{
wSheet.Cells[iRowCount, 1] = data["fullname"].ToString();
wSheet.Cells[iRowCount, 2] = data["brand"].ToString();
wSheet.Cells[iRowCount, 3] = data["agency"].ToString();
wSheet.Cells[iRowCount, 4] = data["advertiser"].ToString();
wSheet.Cells[iRowCount, 5] = data["product"].ToString();
wSheet.Cells[iRowCount, 6] = data["comment"].ToString();
wSheet.Cells[iRowCount, 7] = data["brief"].ToString();
wSheet.Cells[iRowCount, 8] = data["responseDate"].ToString();
wSheet.Cells[iRowCount, 9] = data["share"].ToString();
wSheet.Cells[iRowCount, 10] = data["status"].ToString();
wSheet.Cells[iRowCount, 11] = data["startDate"].ToString();
wSheet.Cells[iRowCount, 12] = data["value"].ToString();
iRowCount++;
}
DirectoryInfo saveTo = Directory.CreateDirectory(excelTemplatePath + _report.FolderGuid.ToString() + "\\");
_report.ReportLocation = saveTo.FullName + _report.ExcelTemplate;
wBook.Close(true, _report.ReportLocation, m_objOpt);
wBook = null;
}
catch (Exception ex)
{
LogException.HandleException(ex);
}
finally
{
NAR(wSheet);
if (wBook != null)
wBook.Close(false, m_objOpt, m_objOpt);
NAR(wBook);
xl.Quit();
NAR(xl);
GC.Collect();
}
private void NAR(object o)
{
try
{
System.Runtime.InteropServices.Marshal.ReleaseComObject(o);
}
catch { }
finally
{
o = null;
}
}
Update
No matter what I try, the 'clean' method or the 'ugly' method (see answers below), the excel instance still hangs around as soon as this line is hit:
wSheet.Cells[iRowCount, 1] = data["fullname"].ToString();
If I comment that line out (and the other similar ones below it, obviously) the Excel app exits gracefully. As soon as one line per above is uncommented, Excel sticks around.
I think I'm going to have to check if there's a running instance prior to assigning the xl variable and hook into that instead. I forgot to mention that this is a windows service, but that shouldn't matter, should it?
UPDATE (November 2016)
I've just read a convincing argument by Hans Passant that using GC.Collect is actually the right way to go. I no longer work with Office (thank goodness), but if I did I'd probably want to give this another try - it would certainly simplify a lot of the (thousands of lines) of code I wrote trying to do things the "right" way (as I saw it then).
I'll leave my original answer for posterity...
As Mike says in his answer, there is an easy way and a hard way to deal with this. Mike suggests using the easy way because... it's easier. I don't personally believe that's a good enough reason, and I don't believe it's the right way. It smacks of "turn it off and on again" to me.
I have several years experience of developing an Office automation application in .NET, and these COM interop problems plagued me for the first few weeks & months when I first ran into the issue, not least because Microsoft are very coy about admitting there's a problem in the first place, and at the time good advice was hard to find on the web.
I have a way of working that I now use virtually without thinking about it, and it's years since I had a problem. It's still important to be alive to all the hidden objects that you might be creating - and yes, if you miss one, you might have a leak that only becomes apparent much later. But it's no worse than things used to be in the bad old days of malloc/free.
I do think there's something to be said for cleaning up after yourself as you go, rather than at the end. If you're only starting Excel to fill in a few cells, then maybe it doesn't matter - but if you're going to be doing some heavy lifting, then that's a different matter.
Anyway, the technique I use is to use a wrapper class that implements IDisposable, and which in its Dispose method calls ReleaseComObject. That way I can use using statements to ensure that the object is disposed (and the COM object released) as soon as I'm finished with it.
Crucially, it'll get disposed/released even if my function returns early, or there's an Exception, etc. Also, it'll only get disposed/released if it was actually created in the first place - call me a pedant but the suggested code that attempts to release objects that may not actually have been created looks to me like sloppy code. I have a similar objection to using FinalReleaseComObject - you should know how many times you caused the creation of a COM reference, and should therefore be able to release it the same number of times.
A typical snippet of my code might look like this (or it would, if I was using C# v2 and could use generics :-)):
using (ComWrapper<Excel.Application> application = new ComWrapper<Excel.Application>(new Excel.Application()))
{
try
{
using (ComWrapper<Excel.Workbooks> workbooks = new ComWrapper<Excel.Workbooks>(application.ComObject.Workbooks))
{
using (ComWrapper<Excel.Workbook> workbook = new ComWrapper<Excel.Workbook>(workbooks.ComObject.Open(...)))
{
using (ComWrapper<Excel.Worksheet> worksheet = new ComWrapper<Excel.Worksheet>(workbook.ComObject.ActiveSheet))
{
FillTheWorksheet(worksheet);
}
// Close the workbook here (see edit 2 below)
}
}
}
finally
{
application.ComObject.Quit();
}
}
Now, I'm not about to pretend that that isn't wordy, and the indentation caused by object creation can get out of hand if you don't divide stuff into smaller methods. This example is something of a worst case, since all we're doing is creating objects. Normally there's a lot more going on between the braces and the overhead is much less.
Note that as per the example above I would always pass the 'wrapped' objects between methods, never a naked COM object, and it would be the responsibility of the caller to dispose of it (usually with a using statement). Similarly, I would always return a wrapped object, never a naked one, and again it would be the responsibility of the caller to release it. You could use a different protocol, but it's important to have clear rules, just as it was when we used to have to do our own memory management.
The ComWrapper<T> class used here hopefully requires little explanation. It simply stores a reference to the wrapped COM object, and releases it explicitly (using ReleaseComObject) in its Dispose method. The ComObject property simply returns a typed reference to the wrapped COM object.
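For readers who want something concrete, here is a minimal sketch of what such a wrapper could look like. It is not the class from my own code base: the null check, the ObjectDisposedException and the IsComObject guard are assumptions added for illustration, and it needs generics (C# 2 or later).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sketch of the wrapper described above: it owns exactly one
// COM reference and releases it exactly once, when disposed.
public sealed class ComWrapper<T> : IDisposable where T : class
{
    private T _comObject;

    public ComWrapper(T comObject)
    {
        if (comObject == null) throw new ArgumentNullException("comObject");
        _comObject = comObject;
    }

    // Typed access to the wrapped object; fails loudly after disposal.
    public T ComObject
    {
        get
        {
            if (_comObject == null) throw new ObjectDisposedException(GetType().Name);
            return _comObject;
        }
    }

    public void Dispose()
    {
        if (_comObject == null) return; // already released, nothing to do

        // ReleaseComObject throws for ordinary managed objects, so only
        // call it when the wrapped object really is a COM object (RCW).
        if (Marshal.IsComObject(_comObject))
        {
            Marshal.ReleaseComObject(_comObject);
        }
        _comObject = null;
    }
}
```

Because Dispose decrements the reference count once and then forgets the object, calling it a second time does nothing, which is what makes it safe to combine with using blocks and early returns.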
Hope this helps!
EDIT: I've only now followed the link over to Mike's answer to another question, and I see that another answer to that question there has a link to a wrapper class, much as I suggest above.
Also, with regard to Mike's answer to that other question, I have to say I was very nearly seduced by the "just use GC.Collect" argument. However, I was mainly drawn to that on a false premise; it looked at first glance like there would be no need to worry about the COM references at all. However, as Mike says you do still need to explicitly release the COM objects associated with all your in-scope variables - and so all you've done is reduce rather than remove the need for COM-object management. Personally, I'd rather go the whole hog.
I also note a tendency in lots of answers to write code where everything gets released at the end of a method, in a big block of ReleaseComObject calls. That's all very well if everything works as planned, but I would urge anyone writing serious code to consider what would happen if an exception were thrown, or if the method had several exit points (the code would not be executed, and thus the COM objects would not be released). This is why I favor the use of "wrappers" and usings. It's wordy, but it does make for bulletproof code.
EDIT2: I've updated the code above to indicate where the workbook should be closed with or without saving changes. Here's the code to save changes:
object saveChanges = Excel.XlSaveAction.xlSaveChanges;
workbook.ComObject.Close(saveChanges, Type.Missing, Type.Missing);
...and to not save changes, simply change xlSaveChanges to xlDoNotSaveChanges.
What is happening is that your call to:
Sheet.Cells[iRowCount, 1] = data["fullname"].ToString();
Is essentially the same as:
Excel.Range cell = Sheet.Cells[iRowCount, 1];
cell.Value = data["fullname"].ToString();
By doing it this way, you can see that you are creating an Excel.Range object, and then assigning a value to it. This way also gives us a named reference to our range variable, the cell variable, that allows us to release it directly if we wanted. So you could clean up your objects one of two ways:
(1) The difficult and ugly way:
while (data.Read())
{
Excel.Range cell = Sheet.Cells[iRowCount, 1];
cell.Value = data["fullname"].ToString();
Marshal.FinalReleaseComObject(cell);
cell = Sheet.Cells[iRowCount, 2];
cell.Value = data["brand"].ToString();
Marshal.FinalReleaseComObject(cell);
cell = Sheet.Cells[iRowCount, 3];
cell.Value = data["agency"].ToString();
Marshal.FinalReleaseComObject(cell);
// etc...
}
In the above, we are releasing each range object via a call to Marshal.FinalReleaseComObject(cell) as we go along.
(2) The easy and clean way:
Leave your code exactly as you currently have it, and then at the end you can clean up as follows:
GC.Collect();
GC.WaitForPendingFinalizers();
if (wSheet != null)
{
Marshal.FinalReleaseComObject(wSheet);
}
if (wBook != null)
{
wBook.Close(false, m_objOpt, m_objOpt);
Marshal.FinalReleaseComObject(wBook);
}
xl.Quit();
Marshal.FinalReleaseComObject(xl);
In short, your existing code is extremely close. If you just add calls to GC.Collect() and GC.WaitForPendingFinalizers() before your 'NAR' calls, I think it should work for you. (Both Jamie's code and Ahmad's code are correct. Jamie's is cleaner, but Ahmad's is an easier "quick fix" for you because you would only have to add the calls to GC.Collect() and GC.WaitForPendingFinalizers() to your existing code.)
Jamie and Ahmad also listed links to the .NET Automation Forum that I participate in (thanks guys!). Here are a couple of related posts that I've made here on StackOverflow:
(1) How to properly clean up Excel interop objects in C#
(2) C# Automate PowerPoint Excel -- PowerPoint does not quit
I hope this helps, Sean...
Mike
Add the following before your call to xl.Quit():
GC.Collect();
GC.WaitForPendingFinalizers();
You can also use Marshal.FinalReleaseComObject() in your NAR method instead of ReleaseComObject. ReleaseComObject decrements the reference count by 1 while FinalReleaseComObject releases all references so the count is 0.
So your finally block would look like:
finally
{
GC.Collect();
GC.WaitForPendingFinalizers();
NAR(wSheet);
if (wBook != null)
wBook.Close(false, m_objOpt, m_objOpt);
NAR(wBook);
xl.Quit();
NAR(xl);
}
Updated NAR method:
private void NAR(object o)
{
try
{
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(o);
}
catch { }
finally
{
o = null;
}
}
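One thing to be aware of with NAR: the o = null in the finally block only clears the method's own copy of the reference, so the caller's variable still points at the released object. If you want the caller's variable cleared as well, the parameter has to be passed by ref. A minimal sketch, where ComCleanup and Release are names made up for this example:

```csharp
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the original code.
public static class ComCleanup
{
    // Releases the COM object (if there is one) and nulls the caller's
    // variable, which a by-value parameter cannot do.
    public static void Release<T>(ref T comObject) where T : class
    {
        if (comObject != null && Marshal.IsComObject(comObject))
        {
            Marshal.FinalReleaseComObject(comObject);
        }
        comObject = null;
    }
}
```

It would be called as ComCleanup.Release(ref wSheet); and, because of the null check, it is safe to call on variables that were never assigned.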
I researched this a while ago, and in the examples I found, the GC-related calls were usually at the end, after closing the app. However, there's an MVP (Mike Rosenblum) who mentions that they ought to be called at the beginning. I've tried both ways and both have worked. I also tried it without WaitForPendingFinalizers and it worked, although including it shouldn't hurt anything. YMMV.
Here are the relevant links by the MVP I mentioned (they're in VB but it's not that different):
http://www.xtremevbtalk.com/showthread.php?p=1157109#post1157109
http://www.xtremevbtalk.com/showthread.php?s=bcdea222412c5cbfa7f02cfaf8f7b33f&p=1156479#post1156479
As others have already covered interop, I would suggest that if you deal with Excel files with the XLSX extension you should use EPPlus, which will make your Excel nightmares go away.
I have just answered this question here:
Killing excel process by its main window hWnd
It's over 4 years since this was posted, but I came across the same problem and was able to solve it. Apparently just accessing the Cells array creates a COM object. So if you were to do:
wSheet = (Excel._Worksheet)wBook.ActiveSheet;
Microsoft.Office.Interop.Excel.Range cells = wSheet.Cells;
int iRowCount = 2;
// enumerate and drop the values straight into the Excel file
while (data.Read())
{
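Microsoft.Office.Interop.Excel.Range cell = (Microsoft.Office.Interop.Excel.Range)cells[iRowCount, 1];
// assign to the cell's value, not to the variable holding the Range
cell.Value2 = data["fullname"].ToString();
Marshal.FinalReleaseComObject(cell);
iRowCount++;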
}
Marshal.FinalReleaseComObject(cells);
and then the rest of your cleanup it should fix the problem.
What I ended up doing to solve a similar problem was get the process Id and kill that as a last resort...
[DllImport("user32.dll", SetLastError = true)]
static extern uint GetWindowThreadProcessId(IntPtr hWnd, out uint lpdwProcessId);
...
objApp = new Excel.Application();
uint processID;
// Hwnd is exposed as an int, and the process id is a 32-bit DWORD
GetWindowThreadProcessId(new IntPtr(objApp.Hwnd), out processID);
_excel = Process.GetProcessById((int)processID);
...
objApp.Application.Quit();
Marshal.FinalReleaseComObject(objApp);
_excel.Kill();
Here's the contents of my hasn't-failed-yet finally block for cleaning up Excel automation. My application leaves Excel open so there's no Quit call. The reference in the comment was my source.
finally
{
// Cleanup -- See http://www.xtremevbtalk.com/showthread.php?t=160433
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
GC.WaitForPendingFinalizers();
// Calls are needed to avoid memory leak
Marshal.FinalReleaseComObject(sheet);
Marshal.FinalReleaseComObject(book);
Marshal.FinalReleaseComObject(excel);
}
Have you considered using a pure .NET solution such as SpreadsheetGear for .NET? Here is what your code might look like using SpreadsheetGear:
// open the template
using (IWorkbookSet workbookSet = SpreadsheetGear.Factory.GetWorkbookSet())
{
IWorkbook wBook = workbookSet.Workbooks.Open(excelTemplatePath + _report.ExcelTemplate);
IWorksheet wSheet = wBook.ActiveWorksheet;
int iRowCount = 2;
// enumerate and drop the values straight into the Excel file
while (data.Read())
{
wSheet.Cells[iRowCount, 1].Value = data["fullname"].ToString();
wSheet.Cells[iRowCount, 2].Value = data["brand"].ToString();
wSheet.Cells[iRowCount, 3].Value = data["agency"].ToString();
wSheet.Cells[iRowCount, 4].Value = data["advertiser"].ToString();
wSheet.Cells[iRowCount, 5].Value = data["product"].ToString();
wSheet.Cells[iRowCount, 6].Value = data["comment"].ToString();
wSheet.Cells[iRowCount, 7].Value = data["brief"].ToString();
wSheet.Cells[iRowCount, 8].Value = data["responseDate"].ToString();
wSheet.Cells[iRowCount, 9].Value = data["share"].ToString();
wSheet.Cells[iRowCount, 10].Value = data["status"].ToString();
wSheet.Cells[iRowCount, 11].Value = data["startDate"].ToString();
wSheet.Cells[iRowCount, 12].Value = data["value"].ToString();
iRowCount++;
}
DirectoryInfo saveTo = Directory.CreateDirectory(excelTemplatePath + _report.FolderGuid.ToString() + "\\");
_report.ReportLocation = saveTo.FullName + _report.ExcelTemplate;
wBook.SaveAs(_report.ReportLocation, FileFormat.OpenXMLWorkbook);
}
If you have more than a few rows, you might be shocked at how much faster it runs. And you will never have to worry about a hanging instance of Excel.
You can download the free trial here and try it for yourself.
Disclaimer: I own SpreadsheetGear LLC
The easiest way to paste code is through a question - this doesn't mean that I have answered my own question (unfortunately).
Apologies to those trying to help me - I was not able to get back to this until now. It still has me stumped...
I have completely isolated the Excel code into one function as per
private bool GenerateDailyProposalsReport(ScheduledReport report)
{
// start of test
Excel.Application xl = null;
Excel._Workbook wBook = null;
Excel._Worksheet wSheet = null;
Excel.Range xlrange = null;
object m_objOpt = System.Reflection.Missing.Value;
xl = new Excel.Application();
wBook = (Excel._Workbook)xl.Workbooks.Open(@"E:\Development\Romain\APN\SalesLinkReportManager\ExcelTemplates\DailyProposalReport.xls", false, false, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt);
wSheet = (Excel._Worksheet)wBook.ActiveSheet;
xlrange = wSheet.Cells[2, 1] as Excel.Range;
// PROBLEM LINE ************
xlrange.Value2 = "fullname";
//**************************
wBook.Close(true, @"c:\temp\DailyProposalReport.xls", m_objOpt);
xl.Quit();
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
GC.WaitForPendingFinalizers();
Marshal.FinalReleaseComObject(xlrange);
Marshal.FinalReleaseComObject(wSheet);
Marshal.FinalReleaseComObject(wBook);
Marshal.FinalReleaseComObject(xl);
xlrange = null;
wSheet = null;
wBook = null;
xl = null;
// end of test
return true;
}
If I comment out the PROBLEM LINE above, the instance of Excel is released from memory. As it stands, it does not.
I'd appreciate any further help on this as time is fleeting and a deadline looms (don't they all).
Please ask if you need more information.
Thanks in anticipation.
Addendum
A bit more information that may or may not shed more light on this. I have resorted to killing the process (stopgap measure) after a certain time lapse (5-10 seconds to give Excel time to finish its processes). I have two reports scheduled - the first report is created and saved to disk and the Excel process is killed, then emailed. The second is created, saved to disk, the process is killed but suddenly there is an error when attempting the email. The error is:
The process cannot access the file'....' etc
So even when the Excel app has been killed, the actual Excel file is still being held by the windows service. I have to kill the service to delete the file...
I'm afraid I am running out of ideas here Sean. :-(
Gary could have some thoughts, but although his wrapper approach is very solid, it won't actually help you in this case because you are already doing everything pretty much correctly.
I'll list a few thoughts here. I don't see how any of them will actually work because your mystery line
xlrange.Value2 = "fullname";
would not seem to be impacted by any of these ideas, but here goes:
(1) Don't make use of the _Workbook and _Worksheet interfaces. Use Workbook and Worksheet instead. (For more on this see: Excel interop: _Worksheet or Worksheet?.)
(2) Any time you have two dots (".") on the same line when accessing an Excel object, break it up into two lines, assigning each object to a named variable. Then, within the cleanup section of your code, explicitly release each variable using Marshal.FinalReleaseComObject().
For example, your code here:
wBook = (Excel._Workbook)xl.Workbooks.Open(@"E:\Development\Romain\APN\SalesLinkReportManager\ExcelTemplates\DailyProposalReport.xls", false, false, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt);
could be broken up to:
Excel.Workbooks wBooks = xl.Workbooks;
wBook = wBooks.Open(@"E:\Development\...\DailyProposalReport.xls", etc...);
And then later, within the cleanup section, you would have:
Marshal.FinalReleaseComObject(xlrange);
Marshal.FinalReleaseComObject(wSheet);
Marshal.FinalReleaseComObject(wBook);
Marshal.FinalReleaseComObject(wBooks); // <-- Added
Marshal.FinalReleaseComObject(xl);
(3) I am not sure what is going on with your Process.Kill approach. If you call wBook.Close() and then xl.Quit() before calling Process.Kill(), you should have no troubles. Workbook.Close() does not return execution to you until the workbook is closed, and Excel.Quit() will not return execution until Excel has finished shutting down (although it might still be hanging).
After calling Process.Kill(), you can check the Process.HasExited property in a loop, or, better, call the Process.WaitForExit() method which will pause until it has exited for you. I would guess that this will generally take well under a second to occur. It is better to wait less time and be certain than to wait 5 - 10 seconds and only be guessing.
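As a rough illustration of the kill-then-wait idea, something along these lines should do. ProcessCleanup and KillAndWait are names invented for this sketch, and the timeout is whatever you are prepared to tolerate:

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: kill the process if it is still running, then wait
// (up to a timeout) until the operating system reports that it has exited.
public static class ProcessCleanup
{
    public static bool KillAndWait(Process process, int timeoutMilliseconds)
    {
        if (process == null) throw new ArgumentNullException("process");

        try
        {
            if (!process.HasExited)
            {
                process.Kill();
            }
        }
        catch (InvalidOperationException)
        {
            // The process exited between the HasExited check and Kill().
        }

        // Returns true if the process has exited within the timeout.
        return process.WaitForExit(timeoutMilliseconds);
    }
}
```

A return value of false means the process was still alive when the timeout elapsed, so the caller can log that instead of assuming the file is free.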
(4) You should try these cleanup ideas that I've listed above, but I am starting to suspect that you might have an issue with other processes that might be working with Excel, such as an add-in or anti-virus program. These add-ins can cause Excel to hang if they are not done correctly. If this occurs, it can be very difficult or impossible to get Excel to release. You would need to figure out the offending program and then disable it. Another possibility is that operating as a Windows Service somehow is an issue. I don't see why it would be, but I do not have experience automating Excel via a Windows Service, so I can't say. If your problems are related to this, then using Process.Kill will likely be your only resort here.
This is all I can think of off-hand, Sean. I hope this helps. Let us know how it goes...
-- Mike
Sean,
I'm going to re-post your code again with my changes (below). I've avoided changing your code too much, so I haven't added any exception handling, etc. This code is not robust.
private bool GenerateDailyProposalsReport(ScheduledReport report)
{
Excel.Application xl = null;
Excel.Workbooks wBooks = null;
Excel.Workbook wBook = null;
Excel.Worksheet wSheet = null;
Excel.Range xlrange = null;
Excel.Range xlcell = null;
xl = new Excel.Application();
wBooks = xl.Workbooks;
wBook = wBooks.Open(@"DailyProposalReport.xls", false, false, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
wSheet = (Excel.Worksheet)wBook.ActiveSheet;
xlrange = wSheet.Cells;
xlcell = xlrange[2, 1] as Excel.Range;
xlcell.Value2 = "fullname";
Marshal.ReleaseComObject(xlcell);
Marshal.ReleaseComObject(xlrange);
Marshal.ReleaseComObject(wSheet);
wBook.Close(true, @"c:\temp\DailyProposalReport.xls", Type.Missing);
Marshal.ReleaseComObject(wBook);
Marshal.ReleaseComObject(wBooks);
xl.Quit();
Marshal.ReleaseComObject(xl);
return true;
}
Points to note:
The Workbooks property of the Application class returns a Workbooks object which holds a reference to the corresponding COM object, so we need to ensure we subsequently release that reference, which is why I added the variable wBooks and the corresponding call to ReleaseComObject.
Similarly, the Cells property of the Worksheet object returns a Range object with another COM reference, so we need to clean that up too. Hence the need for 2 separate Range variables.
I've released the COM references (using ReleaseComObject) as soon as they're no longer needed, which I think is good practice even if it isn't strictly necessary. Also (and this may be superstition), I've released all the objects owned by the workbook before closing the workbook, and released the workbook before closing Excel.
I'm not calling GC.Collect etc. because it shouldn't be necessary. Really!
I'm using ReleaseComObject rather than FinalReleaseComObject, because it should be perfectly sufficient.
I'm not null-ing the variables after use; once again, it's not doing anything worthwhile.
Not relevant here, but I'm using Type.Missing instead of System.Reflection.Missing.Value for convenience. Roll on C# v4, where optional parameters will be supported by the compiler!
I've not been able to compile or run this code, but I'm pretty confident it'll work. Good luck!
There's no need to use the Excel COM objects from C#. You can use OleDb to modify the sheets.
http://www.codeproject.com/KB/office/excel_using_oledb.aspx
I was having a similar problem.
I removed _Worksheet and _Workbook and all was well.