Excel interop: _Worksheet or Worksheet?

I'm currently writing about dynamic typing, and I'm giving an example of Excel interop. I've hardly done any Office interop before, and it shows. The MSDN Office Interop tutorial for C# 4 uses the _Worksheet interface, but there's also a Worksheet interface. I've no idea what the difference is.
In my absurdly simple demo app (shown below) either works fine - but if best practice dictates one or the other, I'd rather use it appropriately.
using System;
using System.Linq;
using Excel = Microsoft.Office.Interop.Excel;

class DynamicExcel
{
    static void Main()
    {
        var app = new Excel.Application { Visible = true };
        app.Workbooks.Add();
        // Can use Excel._Worksheet instead here. Which is better?
        Excel.Worksheet workSheet = app.ActiveSheet;
        Excel.Range start = workSheet.Cells[1, 1];
        Excel.Range end = workSheet.Cells[1, 20];
        workSheet.get_Range(start, end).Value2 = Enumerable.Range(1, 20)
                                                           .ToArray();
    }
}
I'm trying to avoid doing a full deep-dive into COM or Office interoperability, just highlighting the new features of C# 4 - but I don't want to do anything really, really dumb.
(There may be something really, really dumb in the code above as well, in which case please let me know. Using separate start/end cells instead of just "A1:T1" is deliberate - it's easier to see that it's genuinely a range of 20 cells. Anything else is probably accidental.)
So, should I use _Worksheet or Worksheet, and why?

If I recall correctly -- and my memory on this is a bit fuzzy, it has been a long time since I took the Excel PIA apart -- it's like this.
An event is essentially a method that an object calls when something happens. In .NET, events are delegates, plain and simple. But in COM, it is very common to organize a whole bunch of event callbacks into interfaces. You therefore have two interfaces on a given object -- the "incoming" interface, the methods you expect other people to call on you, and the "outgoing" interface, the methods you expect to call on other people when events happen.
In the unmanaged metadata -- the type library -- for a creatable object there are definitions for three things: the incoming interface, the outgoing interface, and the coclass, which says "I'm a creatable object that implements this incoming interface and this outgoing interface".
Now when the type library is automatically translated into metadata, those relationships are, sadly, preserved. It would have been nicer to have a hand-generated PIA that made the classes and interfaces conform more to what we'd expect in the managed world, but sadly, that didn't happen. Therefore the Office PIA is full of these seemingly odd duplications, where every creatable object seems to have two interfaces associated with it, with the same stuff on them. One of the interfaces represents the interface to the coclass, and one of them represents the incoming interface to that coclass.
The _Workbook interface is the incoming interface on the workbook coclass. The Workbook interface is the interface which represents the coclass itself, and therefore inherits from _Workbook.
Long story short, I would use Workbook if you can do so conveniently; _Workbook is a bit of an implementation detail.

If you look at the PIA assembly (Microsoft.Office.Interop.Excel) in Reflector, the Workbook interface has this definition ...
public interface Workbook : _Workbook, WorkbookEvents_Event
Workbook is _Workbook but adds events. Same for Worksheet (sorry, just noticed you were not talking about Workbooks) ...
public interface Worksheet : _Worksheet, DocEvents_Event
DocEvents_Event ...
[ComVisible(false), TypeLibType((short) 0x10), ComEventInterface(typeof(DocEvents),
typeof(DocEvents_EventProvider))]
public interface DocEvents_Event
{
// Events
event DocEvents_ActivateEventHandler Activate;
event DocEvents_BeforeDoubleClickEventHandler BeforeDoubleClick;
event DocEvents_BeforeRightClickEventHandler BeforeRightClick;
event DocEvents_CalculateEventHandler Calculate;
event DocEvents_ChangeEventHandler Change;
event DocEvents_DeactivateEventHandler Deactivate;
event DocEvents_FollowHyperlinkEventHandler FollowHyperlink;
event DocEvents_PivotTableUpdateEventHandler PivotTableUpdate;
event DocEvents_SelectionChangeEventHandler SelectionChange;
}
I would say your best bet is to use Worksheet, but that's the difference.
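As a rough illustration of the layout described above, here is a minimal sketch that needs no Excel installation. The types `_Sheet`, `SheetEvents_Event`, `Sheet` and `SheetClass` are hypothetical stand-ins invented for this example; they only mimic the shape of `_Worksheet`, `DocEvents_Event` and `Worksheet` and are not the real PIA types.

```csharp
using System;

// Hypothetical stand-in for _Worksheet: the "incoming" members callers invoke.
public interface _Sheet
{
    string Name { get; set; }
}

// Hypothetical stand-in for DocEvents_Event: the "outgoing" side as .NET events.
public interface SheetEvents_Event
{
    event EventHandler Change;
}

// Hypothetical stand-in for Worksheet: adds nothing of its own, only combines the two.
public interface Sheet : _Sheet, SheetEvents_Event { }

// Hypothetical stand-in for the coclass implementation.
public sealed class SheetClass : Sheet
{
    public string Name { get; set; }
    public event EventHandler Change;

    public void RaiseChange()
    {
        Change?.Invoke(this, EventArgs.Empty);
    }
}

public static class InterfaceShapeDemo
{
    // Returns how many Change notifications the handler received.
    public static int CountChanges()
    {
        var impl = new SheetClass { Name = "Sheet1" };
        Sheet full = impl;          // members and events
        _Sheet membersOnly = impl;  // members only

        int changes = 0;
        full.Change += (sender, e) => changes++;
        // membersOnly.Change += ... would not compile: _Sheet declares no events.

        membersOnly.Name = "Renamed";
        impl.RaiseChange();
        return changes;
    }
}
```

Typing a variable as the combined interface costs nothing and keeps the events reachable, which matches the advice in the answers here.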

Classes and Interfaces for Internal Use Only
Avoid directly using any of the following classes and interfaces, which are used internally and are typically not used directly.
Class/Interface : Examples
classid Class : ApplicationClass (Word or Excel), WorksheetClass (Excel)
classid Events x _SinkHelper : ApplicationEvents4_SinkHelper (Word), WorkbookEvents_SinkHelper (Excel)
_classid : _Application (Word or Excel), _Worksheet (Excel)
classid Events x : ApplicationEvents4 (Word), AppEvents (Excel)
I classid Events x : IApplicationEvents4 (Word), IAppEvents (Excel)
http://msdn.microsoft.com/en-gb/library/ms247299(office.11).aspx

I have seen and written quite a bit of C# / Excel COM Interop code over the last few years and I've seen Worksheet used in almost every case. I have never seen anything definitive from Microsoft on the subject.

MSDN shows that the Worksheet interface simply inherits from the _Worksheet and DocEvents_Event interfaces. It would seem that one simply provides the events that a worksheet object might raise in addition to everything else. As far as I can see, Worksheet doesn't provide any other members of its own. So yeah, you might as well just go with using the Worksheet interface in all cases, since you don't lose anything by it, and potentially might need the events it exposes.

Related

PowerPoint VSTO - how to lock Shapes (manually you would do this in the Selection Pane)?

The following answer explains how to do this in VBA by setting the Locked property on the PowerPoint Shape. However, when trying to do this in C# as a VSTO Addon the Locked property is not available.

The ShapeRange class doesn't expose the Locked property. If that works in VBA then you may try using the late-binding technology to call hidden or private members. Use the Type.InvokeMember method to make such calls in .net applications.

Thanks to @Eugene Astafieve.
The following line achieves what I wanted to do:
var LockedVar = myGroup.GetType().InvokeMember("Locked", System.Reflection.BindingFlags.SetProperty, null, myGroup, new object[] { OfficeCore.MsoTriState.msoTrue });
Where: myGroup is a PPT Shape (in my case a Grouped Shape)
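The same `Type.InvokeMember` call shape can be tried without PowerPoint. The sketch below is only an illustration: `FakeShape` is a hypothetical plain .NET class standing in for the COM shape, and the `LateBinding` helper is invented for this example. On an ordinary .NET object the `Public` and `Instance` flags are spelled out explicitly alongside `SetProperty` or `GetProperty`.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the COM shape; not a PowerPoint type.
public class FakeShape
{
    public int Locked { get; set; }
}

public static class LateBinding
{
    // Sets a property by name at run time instead of through a compile-time member.
    public static void SetProperty(object target, string name, object value)
    {
        target.GetType().InvokeMember(
            name,
            BindingFlags.SetProperty | BindingFlags.Public | BindingFlags.Instance,
            null,
            target,
            new[] { value });
    }

    // Reads a property by name at run time.
    public static object GetProperty(object target, string name)
    {
        return target.GetType().InvokeMember(
            name,
            BindingFlags.GetProperty | BindingFlags.Public | BindingFlags.Instance,
            null,
            target,
            null);
    }
}
```

For example, `LateBinding.SetProperty(shape, "Locked", 1)` followed by `LateBinding.GetProperty(shape, "Locked")` round-trips the value on a `FakeShape` instance.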

Release COM when developing an Excel Addin?

I understand that I should release COM objects when using interop. Are things a bit different when developing an Add-In, for Excel for example? Here is a loop I have, and I was curious to know whether the Marshal.ReleaseComObject call is necessary.
foreach (var sheet in results.Sheets)
{
    var newSheet = workbook.AddSheet();
    newSheet.SetSheetTabColor(sheet.TabColor);
    newSheet.SetSheetName(sheet.TabName);
    newSheet.Cells.SetFont("Calibri", 8);
    newSheet.FreezeRow(1);
    var endRow = sheet.Data.GetUpperBound(0) + 1;
    var endColumn = sheet.Data.GetUpperBound(1) + 1;
    var writeRange = newSheet.SetWriteRange(1, 1, endRow, endColumn);
    writeRange.Value2 = sheet.Data;
    newSheet.AutoFitColumns();
    newSheet.RemoveColumn(1);
    Marshal.ReleaseComObject(newSheet);
}
Also, I created a library with extension methods. One example is workbook.AddSheet(). AddSheet looks like this:
public static Worksheet AddSheet(this Microsoft.Office.Interop.Excel.Workbook workbook)
{
    var sheets = workbook.Sheets;
    return sheets.Add(After: sheets[sheets.Count]);
}
Since I am accessing sheets from workbook.Sheets, do I have to release this object? If so, where, since I am returning a Worksheet? I can't release before I return?
This may be a dumb question, but if the Marshal.ReleaseComObject was not necessary in the foreach scope, does it hurt even if it is still there?

Marshal.ReleaseComObject is needed only if you have to control the lifetime of a COM object in a timely manner, or in a specific order. For casual usage of COM objects I would not advise you to use this method at all.
From the MSDN documentation:
This method is used to explicitly control the lifetime of a COM object
used from managed code. You should use this method to free the
underlying COM object that holds references to resources in a timely
manner or when objects must be freed in a specific order.
About your second question:
Since I am accessing sheets from workbook.Sheets, do I have to release
this object? If so, where, since I am returning a Worksheet? I can't
release before I return?
The .NET runtime wraps each COM object in a runtime callable wrapper (RCW), which holds the COM reference and releases it when the wrapper is garbage collected, so no, you don't have to release anything explicitly. The framework is responsible for this.
NOTE:
Use the appropriate methods, .Close() for a Workbook and .Quit() for the Application, to release resources properly. Calling .Quit() on the application object closes the Excel process in Windows (which releases all of its resources), and calling .Close() on a Workbook releases the file locks and similar resources held for that specific Excel file.
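If deterministic release is wanted for an intermediate object such as workbook.Sheets, in the sense of the quoted documentation, one possible shape is a small helper that releases the intermediate after the result has been obtained. This is only a sketch: `ComScope` and `UseAndRelease` are hypothetical names invented here, while `Marshal.IsComObject` and `Marshal.ReleaseComObject` are the real framework calls.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComScope
{
    // Runs 'use' against an intermediate object and, if that object is a COM
    // wrapper, releases it afterwards. Plain .NET objects are left untouched.
    public static TResult UseAndRelease<T, TResult>(T intermediate, Func<T, TResult> use)
        where T : class
    {
        try
        {
            return use(intermediate);
        }
        finally
        {
            if (intermediate != null && Marshal.IsComObject(intermediate))
            {
                Marshal.ReleaseComObject(intermediate);
            }
        }
    }
}

// Possible use inside the AddSheet extension method from the question
// (assumes the Excel interop types; not compiled as part of this sketch):
//
//   return ComScope.UseAndRelease(workbook.Sheets,
//       sheets => (Worksheet)sheets.Add(After: sheets[sheets.Count]));
//
// Note that sheets[sheets.Count] creates a further intermediate wrapper that
// this sketch does not release.
```

The returned Worksheet stays valid because only the Sheets wrapper is released, which answers the "I can't release before I return" concern without changing the caller.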

C# Reflection inconsistency with COM objects

Having spent the last few days reading everything I can find about C# reflection on COM objects, trying many experiments in code, and analyzing sample code to try and improve my understanding, I am now forced to admit that I just don't know enough, so I am requesting help from the Community.
I need to be able to access and update the properties of a late-bound COM object that is wrapped as System.__ComObject.
I tried all the standard reflection stuff without success and I looked into using IDispatch, but I'm not comfortable with using the pointers involved, so I'm hoping I have missed something pretty simple in the normal interface. I found papers on MSDN that DO show how to do what I need, but all the examples are in C++ and it is over my head.
It would be really helpful if someone could explain why the following simple C# code just doesn't work as expected:
try
{
    // late binding:
    // localCB is a COM object (System.__ComObject) created by Activator.CreateInstance() from
    // the ProgID of a registered COM .DLL.
    //
    // The original .DLL has a string PROPERTY called
    // "TESTEXTERNAL1". localCB has an IDispatch Interface. The original COM .DLL has a separate Typelib,
    // and, although I did not register the typelib separately, I can see from OLEView on the COM object
    // that the GUID for the typelib is included in it.
    // Here's the code that is puzzling me...
    var vv = localCB.GetType().InvokeMember("TESTEXTERNAL1", BindingFlags.GetProperty,
        null, localCB, null);
    string rt = vv.ToString();
    // works exactly as expected and returns the value of TESTEXTERNAL1 - OK.
    // now try and update the SAME PROPERTY, on the SAME COM object...
    object[] Parameters = new object[1];
    Parameters[0] = "Hello, World!";
    localCB.GetType().InvokeMember("TESTEXTERNAL1", BindingFlags.SetProperty,
        null, localCB, Parameters);
    // throws an (inner) exception: HRESULT 0x80020003 DISP_E_MEMBERNOTFOUND !!!
}
catch (Exception xa)
{
    string xam = xa.Message;
}
Is it unreasonable to expect an object that has already found and provided a property, to be able to update the same property? Is there some "alternative update" strategy that I am not aware of?
Many thanks for any help,
Pete.
UPDATE:
in response to Jon's request, here are snippets of the OleView:
(I had to use images because Oleview would not let me cut & paste, sorry...)
[Image: OleView of the COM .DLL]
[Image: OLEView typelib view]
Jon, I think you have correctly identified that the problem is with a setter method. The DLL is written in Fujitsu COBOL and provides an "under the covers" GET and SET for fields identified as PROPERTY. Accessing the COM component from C# or COBOL, it works fine, but, as you can see, it doesn't work when I try and access it for SET with reflection. Because I am unfamiliar with using reflection I was doubtful whether I had the syntax right, so I tried to make the SET as close as possible to the GET. I think I will need to generate my own SET methods (for each PROPERTY) into the COBOL and then change my "BindingFlags.SetProperty" to be "BindingFlags.InvokeMember". (I did the homework on BindingFlags and found that if you specify "SetProperty" it automatically implies the other 2 flags you mentioned.)
I think the key to it all is in recognizing that the problem is with the Fujitsu *COM Class SET, and it took your experienced eye to see that. Many thanks. If you have any other comments after seeing the OLEView, or can suggest any alternative approach in order to get the properties set, I'd be very interested. (I'm not looking forward to having to generate SETter methods for every property; it smacks of brute force... :-))
Thanks again,
Pete.

Hans was correct. The problem was with the setter method. I have written code to generate a setter for each of the properties, back in the original COBOL COM component. It wasn't as tedious or ugly as I thought it would be (about 7 lines of COBOL for each PROPERTY) and it is all working very well now. Many thanks to the community and particularly Hans Passant for support.

How to properly clean up Excel interop object in C#, 2012 edition

I am in the process of writing an application in C# which will open an Excel spreadsheet (2007, for now) via interop, do some magic, then close. The "magic" part is non-trivial, so this application will contain many references to many COM objects spawned by Excel.
I have written this kind of application before (too many times, in fact) but I've never found a comfortable, "good smell" approach to interacting with COM objects. The problem is partly that, despite significant study, I still don't perfectly understand COM and partly that the interop wrappers hide much that probably shouldn't be hidden. The fact that there are so many different, conflicting suggestions from the community only makes matters worse.
In case you can't tell from the title, I've done my research. The title alludes to this post:
How do I properly clean up Excel interop objects?
First asked in 2008, the advice was really helpful and solid at the time (especially the "Never use 2 dots with com objects" bit) but now seems out of date. In March of 2010, the Visual Studio team posted a blog article warning fellow programmers that Marshal.ReleaseComObject [is] Considered Dangerous. The article referred to two articles, cbrumme's WebLog > ReleaseComObject and The mapping between interface pointers and runtime callable wrappers (RCWs), suggesting that people have been using ReleaseComObject incorrectly all along (cbrumme: "If you are a client application using a modest number of COM objects that are passed around freely in your managed code, you should not use ReleaseComObject").
Does anyone have an example of a moderately complex application, preferably using multiple threads, that is able to successfully navigate between memory leaks (Excel continues running in the background after the application has closed) and InvalidComObjectExceptions? I'm looking for something which will allow a COM object to be used outside of the context in which it was created but can still be cleaned up once the application is finished with it: a hybrid of memory management strategies which can effectively straddle the managed/unmanaged divide.
A reference to an article or tutorial that discusses a correct approach to this problem would be a much appreciated alternative. My best Google-fu efforts have returned only the apparently incorrect ReleaseComObject approach.
UPDATE:
(This is not an answer)
I discovered this article not long after posting:
VSTO and COM Interop by Jake Ginnivan
I've been able to implement his strategy of wrapping COM objects in "AutoCleanup" classes via an extension method, and I'm pretty happy with the result. Though it does not provide a solution to allow COM objects to cross the boundaries of the context in which they were created and still makes use of the ReleaseComObject function, it does at least provide a neat and easy-to-read solution.
Here's my implementation:
// Requires: using System; using System.Runtime.InteropServices;
class AutoCleanup<T> : IDisposable {
    public T Resource {
        get;
        private set;
    }
    public AutoCleanup( T resource ) {
        this.Resource = resource;
    }
    ~AutoCleanup() {
        this.Dispose();
    }
    private bool _disposed = false;
    public void Dispose() {
        if ( !_disposed ) {
            _disposed = true;
            // Release COM wrappers outright; fall back to Dispose() for ordinary disposables.
            if ( this.Resource != null &&
                 Marshal.IsComObject( this.Resource ) ) {
                Marshal.FinalReleaseComObject( this.Resource );
            } else if ( this.Resource is IDisposable ) {
                ( (IDisposable) this.Resource ).Dispose();
            }
            this.Resource = default( T );
            // Cleanup is done, so the finalizer no longer needs to run.
            GC.SuppressFinalize( this );
        }
    }
}
static class ExtensionMethods {
    public static AutoCleanup<T> WithComCleanup<T>( this T target ) {
        return new AutoCleanup<T>( target );
    }
}

Do you know the NetOffice concept for COM proxy management?
NetOffice uses wrapper classes for COM proxies together with the IDisposable pattern.
NetOffice keeps track of the parent->child relationship between proxies: dispose a worksheet, and all child proxies created from that instance (cells, styles, etc.) are disposed as well. You can also use a special event or a static property to observe the count of open proxies in your application.
Take a look at this documentation snippet:
http://netoffice.codeplex.com/wikipage?title=Tec_Documentation_English_Management
You will find some sample projects for COM proxy management in the tutorials folder.

Recommended program structure

As a beginner, I have formulated some ideas, but wanted to ask the community about the best way to implement the following program:
It decodes 8 different types of data file. They are all different, but most are similar (contain a lot of similar fields). In addition, there are 3 generations of system which can generate these files. Each is slightly different, but generates the same types of files.
I need to make a visual app which can read in any one of these, plot the data in a table (using datagridview via datatable at the moment) before plotting on a graph.
There is a bit more to it, but my question is regarding the basic structure.
I would love to learn more about making best use of object oriented techniques if that would suit well.
I am using c# (unless there are better recommendations) largely due to my lacking experience and quick development time.
I am currently using one class called 'log' that knows the generation and log type of the file that is open. It controls reading and exporting to a datatable. A form can then give it a path, wait for it to process the file and request the datatable to display.
Any obvious improvements?

As you have realised there is a great deal of potential in creating a very elegant OOP application here.
Your basic needs - as far as I can see from the information you have shared - are:
1) A module that recognises the type of file
2) A module that can read the file and load the data into a common structure (is it going to be common structure??) this consists of handlers
3) A module that can visualise the data
For the first one, I would recommend one of two patterns:
1a) Factory pattern: The file is passed to a common factory, which parses it just far enough to decide on the handler.
1b) Chain-of-responsibility: The file is passed to each handler in turn, and each handler knows whether it can support the file. If it cannot, it passes the file to the next one. In the end either one handler picks the file up, or an error occurs if the last handler cannot process it.
For the second one, I recommend to design a common interface and each handler implements common tasks such as load, parse... If visualisation is different and specific to handlers then you would have those set of methods as well.
Without knowing more about the data structure I cannot comment on the visualisation part.
Hope this helps.
UPDATE
This is the factory one - a very rough pseudocode:
Factory f = new Factory();
ILogParser parser = f.GetParser(fileName); // pass the file name so that factory inspects the content and returns appropriate handler
CommonDataStructure data = parser.Parse(fileName); // parse the file and return CommonDataStructure.
Visualiser v = new Visualiser(form1); // perhaps you want to pass the reference of your form
v.Visualise(data); // draw pretty stuff now!

Ok, first thing - make one class for every file structure type, as a parser. Use inheritance as needed to combine common functionality.
Every file parser should have a method to identify whether it can parse a file, so you can take a file name, and just ask the parsers which thinks it can handle the data.
.NET 4.0 and the extensibility framework can allow dynamic integration of the parsers without a known determined selection graph.
The rest depends mostly on how similar the data is etc.
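The "ask each parser whether it can handle the file" idea might be sketched as follows. Everything here is an assumption made for illustration: the `ILogParser` members, the `GEN1`/`GEN2` header signatures, the delimiters and the class names are invented, since the real file formats are not described in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface ILogParser
{
    bool CanParse(string header);                      // does this parser recognise the file?
    IList<string[]> Parse(IEnumerable<string> lines);  // rows of fields, header line skipped
}

// Shared behaviour lives in a base class; each generation only states its differences.
public abstract class DelimitedParserBase : ILogParser
{
    protected abstract string Signature { get; }  // hypothetical header marker
    protected abstract char Delimiter { get; }

    public bool CanParse(string header) =>
        header != null && header.StartsWith(Signature, StringComparison.Ordinal);

    public IList<string[]> Parse(IEnumerable<string> lines) =>
        lines.Skip(1).Select(line => line.Split(Delimiter)).ToList();
}

public sealed class Gen1Parser : DelimitedParserBase
{
    protected override string Signature => "GEN1";
    protected override char Delimiter => ',';
}

public sealed class Gen2Parser : DelimitedParserBase
{
    protected override string Signature => "GEN2";
    protected override char Delimiter => ';';
}

public sealed class ParserFactory
{
    private readonly IReadOnlyList<ILogParser> _parsers;

    public ParserFactory(params ILogParser[] parsers)
    {
        _parsers = parsers;
    }

    // Asks each registered parser in turn; the first one that accepts the header wins.
    public ILogParser GetParser(string header)
    {
        var parser = _parsers.FirstOrDefault(p => p.CanParse(header));
        if (parser == null)
        {
            throw new NotSupportedException("No parser recognises this file header.");
        }
        return parser;
    }
}
```

A form would then read the first line of the file, call `GetParser` with it, and hand the parsed rows to whatever fills the DataTable, so supporting a new generation means adding one small class rather than editing a central 'log' class.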

Okay, so the basic concept of OOP is thinking of classes and the like as objects from the outset. Object-oriented programming can be a tricky subject to pick up at first, but the more practice you get, the easier you will find it to implement programs using OOP.
Take a look here: http://msdn.microsoft.com/en-us/beginner/bb308750.aspx
So you can have a Decoder class and interface, something like this.
interface IDecoder
{
    // 'args' is a placeholder for whatever parameters each decoder really needs.
    void DecodeTypeA(params object[] args);
    void DecodeTypeB(params object[] args);
}
class FileDecoder : IDecoder
{
    public void DecodeTypeA(params object[] args)
    {
        // Some Code Here
    }
    public void DecodeTypeB(params object[] args)
    {
        // Some Code Here
    }
}
