I have been developing office solutions in VBA for a while now and have fairly complete knowledge regarding office development in VBA. I have decided it is time to learn some real programming with .Net and am having some teething problems.
Having looked through a bunch of articles and forums (here and elsewhere), there seems to be some mixed information regarding memory management in .Net when using COM objects.
Some people say I should always deterministically release COM objects and others say I should almost never do it.
People saying I should do it:
The book 'Professional Excel Development' on page 861.
This stack exchange question has been answered by saying "every reference you make to a COM object must be released. If you don't, the process will stay in memory"
This blog suggests using it solved his problems.
People saying I should not do it:
This MSDN blog by Eric Carter states "In VSTO scenarios, you typically don't ever have to use ReleaseCOMObject."
The book 'VSTO for Office 2007' which is co-authored by Eric Carter seems to make no mention whatsoever of memory management or ReleaseComObject.
This MSDN blog by Paul Harrington says don't do it.
Someone with mixed advice:
Jake Ginnivan says I should always do it on COM objects that do not leave the method scope. If a COM object leaves the method scope then forget about it. Why can't I just forget about it all the time then?
The blog by Paul Harrington seems to suggest that the advice from MS has changed sometime in the past. Is it the case that calling ReleaseCOMObject used to be best practice but is not anymore? Can I leave the finer details of memory management to MS and assume that everything will be mostly fine?
I try to adhere to the following rule in my interop development regarding ReleaseComObject.
If my managed object implements some kind of shutdown protocol similar to IDisposable, I call ReleaseComObject on any child COM objects I hold references to. Some examples of the shutdown protocols I'm talking about:
IObjectWithSite.SetSite(null)
IOleObject.SetClientSite(null)
IOleObject.Close()
IDTExtensibility2.OnDisconnection
IDTExtensibility2.OnBeginShutdown
IDisposable.Dispose itself
This helps breaking potential circular references between .NET and native COM objects, so the managed garbage collector can do its job unobstructively.
Perhaps, there's something similar which can be used in your VSTO interop scenario (AFAIR, IDTExtensibility2 is relevant there).
If the interop scenario involves IPC COM calls (e.g., when you pass a managed event sink object to an out-of-proc COM server like Excel), there's another option to track external references to the managed object: IExternalConnection interface. IExternalConnection::AddConnection/ReleaseConnection are very similar to IUnknown::AddRef/Release, but they get called when a reference is added from another COM appartment (including apartments residing in separate processes).
IExternalConnection provides a way to implement an almost universal shutdown mechanism for out-of-proc scenarios. When the external reference count reaches zero, you should call ReleaseComObject on any external Excel objects you may be holding references to, effectively breaking any potential circular COM references between your process and Excel process. Perhaps, something like this has already been implemented by VSTO runtime (I don't have much experience with VSTO).
That said, if there is no clear shutdown mechanism, I don't call ReleaseComObject. Also, I never use FinalReleaseComObject.
You should not ignore it, if you are working with the Office GUI! Like your second link states:
Every reference you make to a COM object must be released. If you don't, the process will stay in memory.
This means, that your objects will remain in memory if you do not explicitly release them. Since they are COM objects, the garbage collector is responsible for releasing them. However, Excel and the other fancy tools are implemented with no knowledge of the garbage collector in .NET. They relate on deterministic release of memory. If you are requesting an object from excel and do not properly release it, it might be that your application does not close correctly, because it wait's until your resources are released. If your objects live long enought to get into gen1 or gen2, then this can take hours or even days!
All considerations about not releasing COM objects are targeting multithreaded scenarios or scenarios where you are forced to push around many com objects over multiple instances. As a good advice, allways create your com objects as late as possible and release them as soon as possible in the opposite order than created. Also you should think about keeping your interop instances private wherever possible. This reduces the possibility that other threads or instances are accessing the object while you have already released it.
Related
This question already has answers here:
Clean up Excel Interop Objects with IDisposable
(2 answers)
Closed 3 years ago.
This is a very common question and I decided to ask it because this question may have a different answer as of today. Hopefully, the answers will help to understand what is the right way to work with COM objects.
Personally, I feel very confuse after getting different opinions on this subject.
The last 5 years, I used to work with COM objects and the rules were very clear for me:
Use a single period in lines of code. Using more than one period create temporary objects behind the scene that cannot be explictly released.
Do not use foreach, use a for loop instead and release each item on each iteration
Do not call FInalReleaseComObject, use ReleaseComObject instead.
Do not use GC for releasing COM objects. GC intent is mainly for debugging usage.
Release objects in reverse order of their creation.
Some of you may be frustrated after reading those last lines, this is what I knew about how to properly create/release Com Object, I hope getting answers that will make it clearer and uncontested.
Following, are some links I found on this topic. Some of them telling that it is needed to call ReleaseComObject and some of them not.
How to properly release Excel COM objects (Nov. 2013)
Proper Way of Releasing COM Objects in .NET (Aug. 2011)
Marshal.ReleaseComObject Considered Dangerous (Mar. 2010)
ReleaseCOMObject (Apr. 2004)
"... In VSTO scenarios, you typically don’t ever have to use ReleaseCOMObject. ..."
MSDN - Marshal.ReleaseComObject Method (current .NET Framework version):
"...You should use this method to free the underlying COM object that holds references..."
UPDATE:
This question has been marked as too broad. As requested, I will try to simplify and ask simpler questions.
Does ReleaseComObject is required when working with COM Objects or calling GC is the correct way?
Does VSTO approach change the way we used to work with COM Objects?
Which of the above rules I wrote are required and which are wrong? Is there any others?
The .NET / COM interop is well designed, and works correctly. In particular, the .NET Garbage Collector correctly tracks COM references, and will correctly release COM objects when they have no remaining runtime references. Interfering with the reference counts of COM object by calling Marshal.ReleaseComObject(...) or Marshal.FinalReleaseComObject(...) is a dangerous but common anti-pattern. Unfortunately, some of the bad advice came out of Microsoft.
Your .NET code can correctly interact with COM while ignoring all 5 of your rules.
If you do need to trigger deterministic clean-up of COM objects that are no longer referenced from the runtime, you can safely force a GC (and possibly wait for finalizers to complete). Otherwise, you don't have to do anything special in your code, to deal with COM objects.
There is one important caveat, that might have contributed to confusion about role of the garbage collector. When debugging .NET code, local variables artificially have their lifetime extended to the end of the method, in order to support watching the variabled under the debugger. That means you might still have managed references to a COM object (and hence the GC won't clean up) later than expect form just looking at the code. A good workaround for this issue (which only occurs under the debugger) is to split the scope of COM calls from the GC cleanup calls.
As an example, here is some C# code that interacts with Excel, and cleans up properly. You can paste into a Console application (just add a reference to Microsoft.Office.Interop.Excel):
using System;
using System.Runtime.InteropServices;
using Microsoft.Office.Interop.Excel;
namespace TestCsCom
{
class Program
{
static void Main(string[] args)
{
// NOTE: Don't call Excel objects in here...
// Debugger would keep alive until end, preventing GC cleanup
// Call a separate function that talks to Excel
DoTheWork();
// Now let the GC clean up (repeat, until no more)
do
{
GC.Collect();
GC.WaitForPendingFinalizers();
}
while (Marshal.AreComObjectsAvailableForCleanup());
}
static void DoTheWork()
{
Application app = new Application();
Workbook book = app.Workbooks.Add();
Worksheet worksheet = book.Worksheets["Sheet1"];
app.Visible = true;
for (int i = 1; i <= 10; i++) {
worksheet.Cells.Range["A" + i].Value = "Hello";
}
book.Save();
book.Close();
app.Quit();
// NOTE: No calls the Marshal.ReleaseComObject() are ever needed
}
}
}
You'll see that the Excel process properly shuts down, indicating that all the COM objects were properly cleaned up.
VSTO does not change any of these issues - it is just a .NET library that wraps and extends the native Office COM object model.
There is a lot of false information and confusion about this issue, including many posts on MSDN and on StackOverflow.
What finally convinced me to have a closer look and figure out the right advice was this post https://blogs.msdn.microsoft.com/visualstudio/2010/03/01/marshal-releasecomobject-considered-dangerous/ together with finding the issue with references kept alive under the debugger on a StackOverflow answer.
One exception to this general guidance is when the COM object model requires interfaces to be released in a particular order. The GC approach described here does not give you control over the order in which the COM objects are released by the GC.
I don't have any reference to indicate whether this would violate the COM contract. In general, I would expect COM hierarchies to use internal references to ensure any dependencies on the sequence are properly managed. E.g. in the case of Excel, one would expect a Range object to keep an internal reference to the parent Worksheet object, so that a user of the object model need not explicitly keep both alive.
There may be cases where even the Office applications are sensitive to the sequence in which COM objects are released. One case seems to be when the OLE embedding is used - see https://blogs.msdn.microsoft.com/vsofficedeveloper/2008/04/11/excel-ole-embedding-errors-if-you-have-managed-add-in-sinking-application-events-in-excel-2/
So it would be possible to create a COM object model that fails if objects are released in the wrong sequence, and the interop with such a COM model would then require some more care, and might need manual interference with the references.
But for general interop with the Office COM object models, I agree with the VS blog post calling "Marshal.ReleaseComObject – a problem disguised as a solution".
There are lots of questions on SO regarding the releasing COM objects and garbage collection but nothing I could find that address this question specifically.
When releasing COM objects (specifically Excel Interop in this case), in what order should I be releasing the reference and calling garbage collection?
In some places (such as here) I have seen this:
Marshall.FinalReleaseComObject(obj);
GC.Collect();
GC.WaitForPendingFinalizers();
And in others (such as here) this:
GC.Collect();
GC.WaitForPendingFinalizers();
Marshall.FinalReleaseComObject(obj);
Or doesn't it matter and I'm worrying about nothing?
Marshal.FinalReleaseComObject() releases the underlying COM interface pointer.
GC.Collect() and GC.WaitForPendingFinalizers() causes the finalizer for a COM wrapper to be called, which calls FinalReleaseComObject().
So what makes no sense is to do it both ways. Pick one or the other.
The trouble with explicitly calling FinalReleaseComObject() is that it will only work when you call it for all the interface pointers. The Office program will keep running if you miss just one of them. That's very easy to do, especially the syntax sugar allowed in C# version 4 makes it likely. An expression like range = sheet.Cells[1, 1], very common in Excel interop code. There's a hidden Range interface reference there that you never explicitly store anywhere. So you can't release it either.
That's not a problem with GC.Collect(), it can see them. It is however not entirely without trouble either, it will only collect and run the finalizer when your program has no reference to the interface anymore. Which is definitely what's wrong with your second snippet. And which tends to go wrong when you debug your program, the debugger extends the lifetime of local object references to the end of the method. Also the time you look at Taskmgr and yell "die dammit!"
The usual advice for GC.Collect() applies here as well. Keep your program running and perform work. The normal thing happens, you'll trigger a garbage collection and that releases the COM wrappers as well. And the Office program will exit. It just doesn't happen instantly, it will happen eventually.
Reference counting mechanism that is used by COM is another way of automatic memory management but with slightly different impact on memory and behavior.
Any reference counting implementation provide deterministic behavior for resource cleanup. This means that right after call to Marshal.FinalReleaseComObject() all resources (memory and other resources) related to the COM object would be reclaimed.
This means that if we have additional managed objects and you want to reclaim them as quickly as possible, you should release COM object first and only after that call GC.Collect method.
I am developing an application that relies heavily on multiple Microsoft Office products including Access, Excel, Word, PowerPoint and Outlook among others. While doing research on interop I found out that starting with VS2010 and .NET 4, we thankfully no longer have to go through the nightmares of PIAs.
Furthermore, I have been reading a lot of articles on proper disposal of objects, the most sensible one seemed to be this one.
However, the article is 5 years old and there are not many authoritative publications on the subject AFAIK. Here is a sample of code from the above link:
' Cleanup:
GC.Collect()
GC.WaitForPendingFinalizers()
GC.Collect()
GC.WaitForPendingFinalizers()
Marshal.FinalReleaseComObject(worksheet)
oWB.Close(SaveChanges:=False)
Marshal.FinalReleaseComObject(workbook)
oApp.Quit()
Marshal.FinalReleaseComObject(application)
What I want to know is by today's standards, how accurate is this and what should I look out for if I expect to support my application for the coming few years?
UPDATE: A link to some credible articles would be highly appreciated. By the way, this is not a server side application. This will be running in computer labs where we have users interact with office products that we instantiate for them.
FOUND IT: This three-part article is probably the closest to an authoritative account I would expect to find.
Objects should be disposed automatically by the GC after the object goes out of scope. If you need to release them sooner, you can use Marshal.ReleaseComObject http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.marshal.releasecomobject.aspx
Depending on the version of Office you are controlling through interop, and the circumstances that occur on the server, yes, you may need to do some serious convolutions to get rid of a running instance of one of the Office applications. There are unpredictable (and unnecessary) dialogs which, for example, Word will open when it is not interactive (None of the Office applications are particularly well designed for unattended execution, thus the recommendation that even with the latest versions that they not be installed for server-side use).
For server cases, consider using Aspose's stack rather than Office, or another of the alternatives. The interactive, local use cases, you may need that "extra vigorous killing" on occasion since the Office automation servers are particularly badly behaved unmanaged objects.
If you want to release COM objects fully especially in MS Office COM Objects, its very highly recommended that you release sub objects that you must have used which are inside the parent objects.
In your example, I would say release all Cell, Range any other objects that you may have used before releasing the worksheet that the cell, range or any other object belongs to.
When I read a few articles about memory management in C#, I was confused by Finalizer methods.
There are so many complicated rules which related with them.
For instance, nobody knows when the finalizers will be called, they called even if code in ctor throws, CLR doesn't guarantee that all finalizers be called when programs shutdowt, etc.
For what finalizers can be used in real life?
The only one example which I found was program which beeps when GC starts.
Do you use Finalizers in your code and may have some good samples ?
UPD:
Finalizers can be used when developpers want to make sure that some class always disposed correctly through IDisposable. (link ; Thanks Steve Townsend)
There is an exhaustive discussion of Finalizer usage, with examples, here. Link courtesy of #SLaks at a related answer.
See also here for a more concise summary of when you need one (which is "not very often").
There's a nice prior answer here with another good real-world example.
To summarize with a pertinent extract:
Finalizers are needed to guarantee the
release of scarce resources back into
the operating system like file handles, sockets,
kernel objects, etc.
For more correct real-world examples, browse around affected classes in .Net:
https://learn.microsoft.com/en-us/search/?terms=.Finalize&scope=.NET
One valid reason I can think of when you might need to use a finalizer is if you wrap a third-party native code API in a managed wrapper, and the underlying native code API library requires the timely release of used operating system resources.
The best practice known to me is plain simple don't use them. There might however be some corner cases when you want to use a finalizer, particularly when dealing with unmanaged objects and you can't implement Dispose pattern (I do not know legacy issues) then you can implement Finalize method with caution (and it could reduce the performance of your system, make your objects undead and other possibly weird scenarios, minding the exceptions as they are uncatchable:)).
In 99% of cases just write the use Dispose pattern and use this method to clean after yourself and everything will be fine.
This is a follow on question to
How to properly clean up excel interop objects in c#.
The gyst is that using a chaining calls together (eg. ExcelObject.Foo.Bar() ) within the Excel namespace prevents garbage collection for COM objects. Instead, one should explicitly create a reference to each COM object used, and explicitly release them using Marhsal.ReleaseComObject().
Is the behaviour of not releasing COM objects after a chained call specific to Excel COM objects only? Is it overkill to apply this kind of pattern whenever a COM object is in use?
It's definitely more important to handle releases properly when dealing with Office applications than many other COM libraries, for two reasons.
The Office apps run as out of process servers, not in-proc libraries. If you fail to clean up properly, you leave a process running.
Office apps (especially Excel IIRC) don't terminate properly even if you call Application.Quit, if there are outstanding references to its COM objects.
For regular, in-proc COM libraries, the consequences of failing to clean up properly are not so dramatic. When your process exits, all in-proc libraries goes away with it. And if you forget to call ReleaseComObject on an object when you no longer need it, it will still be taken care of eventually when the object gets finalized.
That said, this is no excuse to write sloppy code.
COM objects are essentially unmanaged code - and as soon as you start calling unmanaged code from a managed application, it becomes your responsibility to clean up after that unmanaged code.
In short, the pattern linked in the above post is necessary for all COM objects.