We have a .NET 4 based service that self hosts a WCF service with callbacks. We encapsulate this service in a .NET 4 dll that exposes COM objects. This service is used by a large variety of clients, the majority being .NET based.
Unfortunately we have some VB6 clients that we cannot change, and we are getting AccessViolationExceptions when some callback methods are invoked.
The way the service uses callbacks is illustrated below.
1. MethodA is invoked by the VB6 client and proxied through the .NET dll to the WCF service (it has not returned yet).
2. WCF CallbackA is invoked, providing Enum status values.
3. Possibly WCF CallbackB will be invoked, requiring further input from the VB6 client (this information cannot be obtained at the start of MethodA and impacts the outcome of MethodA).
4. MethodA returns.
CallbackA (works great, no exceptions!) is a OneWay operation that provides an Enum; the VB application currently writes this value to a RichTextBox.
CallbackB (causes an AccessViolationException) is a method that provides an object and expects a different object back with two value-based properties.
I feel this is some sort of issue with attempting to create COM objects on a thread other than the main thread (since it's currently blocked in MethodA). Unfortunately I am not sure how to correct this. We have control over the code within the service and the encapsulating dll, and we can advise on code within the VB6 client.
We have our own VB6 testing application and we can bypass the AccessViolation error, but it involves commenting out any code within the callback method (see code below). I have highlighted the lines that cause the exception if left in with "<----- Causes Exception". Any help is much appreciated; please let me know if you need more information.
    Private Function ITerminalCallbackComClient_VerifySignature() As Long
        Dim result As Long
        'Not-Authorized = 0 and Authorized = 1'
        result = 0
        Dim msgResponse As Long
        msgResponse = MsgBox("Signature Accepted?", vbYesNo + vbQuestion, "Signature Verification")
        If msgResponse = vbYes Then
            result = 1
        End If
        ITerminalCallbackComClient_VerifySignature = result
    End Function
UPDATE 2014-11-13
The callback works while debugging in Visual Studio 6. But as soon as we "Make" the sample project it crashes when the callback is executed. If we remove the reference to the MsgBox and just map back a static value, it works as expected.
We have updated the signature of our COM interop to remove all object references and are now just returning a 0 or 1 to avoid object naming issues.
I have updated the VB6 callback code above.
Callback Contract
    [CallbackBehaviorAttribute(ConcurrencyMode = ConcurrencyMode.Reentrant, UseSynchronizationContext = false)]
    public abstract class PS_Terminal_Link_Callback : ITerminalCallback
    {
        public abstract long VerifySignature();
    }
Service Contract
    [ServiceContract(CallbackContract=typeof(ITerminalCallback))]
    public interface ITerminal
    {
        *MethodA*
    }
If that is VB6 code, a Long is a 32-bit integer in VB6. For compatibility with 16-bit Basic, Integers are 16 bit in VB6. Longs are 32 bit. You are putting a VB6 Long (32 bit) into a VB6 Integer (16 bit) in the message box line.
Your other object is private to you, so we can't look up its specs and docs.
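The same width question applies on the .NET side of the COM boundary: a VB6 Long is 32 bits, which lines up with a C# int, whereas a C# long is 64 bits. Below is a minimal sketch of that mapping. The interface name ITerminalCallbackCom is hypothetical (it is not the asker's actual declaration), and this is something worth checking rather than a confirmed cause of the crash.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical COM-visible callback interface, for illustration only.
[ComVisible(true)]
public interface ITerminalCallbackCom
{
    // C# int is 32 bits, the same width as "As Long" in VB6.
    // A C# long (64 bits) would not match a VB6 Long.
    int VerifySignature();
}

public static class WidthCheck
{
    public static void Main()
    {
        Console.WriteLine("short (VB6 Integer): {0} bytes", sizeof(short)); // 2
        Console.WriteLine("int   (VB6 Long):    {0} bytes", sizeof(int));   // 4
        Console.WriteLine("long  (64-bit, no VB6 integer of this width): {0} bytes", sizeof(long)); // 8
    }
}
```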
You can also start the program in a debugger.
Use windbg or ntsd (ntsd is a console program and may already be installed). Both are part of Debugging Tools for Windows.
Download and install Debugging Tools for Windows
http://msdn.microsoft.com/en-us/windows/hardware/hh852363
Install the Windows SDK but just choose the debugging tools.
Create a folder called Symbols in C:\
Start Windbg. File menu - Symbol File Path and enter
srv*C:\symbols*http://msdl.microsoft.com/download/symbols
then
windbg -o -g -G c:\windows\system32\cmd.exe /k batfile.bat
You can press F12 to stop it, and kb will show the call stack (g continues the program). If there are errors it will also stop and show them.
Type lm to list loaded modules, x *!* to list the symbols, and bp symbolname to set a breakpoint.
If programming in VB6, then the environment variable link=/pdb:none stores the symbols in the dll rather than in separate files. Make sure you compile the program with No Optimisations and tick the box for Create Symbolic Debug Info. Both are on the Compile tab in the Project's Properties.
Also CoClassSyms (microsoft.com/msj/0399/hood/hood0399.aspx) can make symbols from type libraries.
Put a breakpoint on the COM call (use x *!* to find it).
Now you can examine parameters and/or see detailed exception info.
I have set up a local WCF service, self hosted in a console application using NetNamedPipeBinding for client access.
To make calls to the service I reference a library.dll where I have the following method:
    public static string GetLevel(Point p)
    {
        ChannelFactory<IService> pipeFactory = new ChannelFactory<IService>(new NetNamedPipeBinding(), new EndpointAddress("net.pipe://localhost/PTS_Service"));
        IService pipeProxy = pipeFactory.CreateChannel();
        string result = pipeProxy.GetLevel(p);
        ((IClientChannel)pipeProxy).Close();
        pipeFactory.Close();
        return result;
    }
The GetLevel() command returns a string from a list stored in the service, based on the Z coordinate of Point(X,Y,Z) p.
This works and gives speeds of 8ms total if the method is called from the above console app.
However, when the same method from the library.dll is called from another app.exe or plugin.dll (loaded by an external program), the times increase drastically. I've timed the five statements above with a stopwatch (cumulative milliseconds after each one):
consoleHost.exe : 0 - 3 - 6 - 7 - 8
app.exe : 89 - 155 - 248 - 259 - 271
plugin.dll : 439 - 723 - 1210 - 1229 - 1245
Shouldn't the times be the same, not dependent on who makes the call to library.dll?
EDIT
Since I've cancelled out all methods to just retrieving a string from a running service, I believe the problem lies in the first creation run of the channelFactory, all subsequent calls in the same app/plugin run are equal in time.
I understand the first call is slower, but as I see this is around 30ms in a new app and around 900ms in my plugins, I believe there is another thing causing this.
I have found a question with similar delays:
"First WCF connection made in new AppDomain is very slow", to which the solution was to set LoaderOptimizationAttribute to MultiDomain. Could it be that every time the plugin runs it has to JIT-compile instead of using native code?
I tried adding this attribute above Main in consoleHost.exe but see no gain in the plugin run time. Could this be because of the external program in between, and is there a way around this? Say, could my plugin create a new AppDomain whenever it wants to access the service and call the above method from my library.dll from within this new AppDomain, or does this make no sense?
EDIT2
I recorded the time spent in JIT compiling with a profiling program as suggested in the comments, this gives 700ms for JIT compiling and total execution time of 800ms for the plugin.
I used ngen to precompile the library.dll to create a native image .ni.dll. I see in process explorer that this image is loaded by the external program, though there is no time gain in the plugin? As I understand there shouldn't be a reason the plugin would still JIT compile or am I doing something wrong?
I also noticed when debugging in VS that the console and app only do some loading of assemblies, while the plugin loads and unloads every time it creates or modifies a plugin instance. I believe this is the way plugins work and should not explain the difference in first execution time?
The communication should not depend on the caller, but on the way the calls are made.
The most expensive operation is creating the channel.
Once the proxy has been created, every subsequent call should take roughly the same time (provided the callers reach the service from the same place, e.g. the same machine on the network; that holds in your case, since you use localhost).
Some performance gain can also be achieved through service configuration (a single instance should be faster than PerCall).
Another point worth attention is to examine possible locks in your service method. It can happen that some service clients are waiting for a call while the service is busy.
If the service call is not an async one, try to make it async and use that.
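To illustrate the point that channel setup is the expensive part, here is a minimal sketch that creates the ChannelFactory once per process and reuses it for every call. The Point and IService types are placeholders so the sketch is self-contained; the real ones live in the asker's library and may differ. The LevelClient class name is an assumption as well. This only addresses factory creation, not any JIT cost.

```csharp
using System;
using System.Runtime.Serialization;
using System.ServiceModel;

// Placeholder contract types (assumptions); the real ones come from library.dll.
[DataContract]
public class Point
{
    [DataMember] public double X { get; set; }
    [DataMember] public double Y { get; set; }
    [DataMember] public double Z { get; set; }
}

[ServiceContract]
public interface IService
{
    [OperationContract]
    string GetLevel(Point p);
}

public static class LevelClient
{
    // Built lazily, once per process, in a thread-safe way; later calls reuse it.
    private static readonly Lazy<ChannelFactory<IService>> Factory =
        new Lazy<ChannelFactory<IService>>(() =>
            new ChannelFactory<IService>(
                new NetNamedPipeBinding(),
                new EndpointAddress("net.pipe://localhost/PTS_Service")));

    public static string GetLevel(Point p)
    {
        IService proxy = Factory.Value.CreateChannel();
        var channel = (IClientChannel)proxy;
        try
        {
            string result = proxy.GetLevel(p);
            channel.Close();
            return result;
        }
        catch (CommunicationException)
        {
            channel.Abort(); // a faulted channel cannot be closed cleanly
            throw;
        }
        catch (TimeoutException)
        {
            channel.Abort();
            throw;
        }
    }
}
```

Only the per-call channel is closed here; the factory stays open for the lifetime of the process.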
After some further investigating: the external program prevented the sharing of loaded assemblies/JIT compilations through a setting in a .ini file when the process is started. Fortunately this could be disabled so sharing also becomes possible in the plugin.
After altering this setting (1 line in the .ini to No instead of Yes!) the time reduced to 30ms and every next call 3ms or less.
This question is not about how to restart an application. I am already achieving that by using a Mutex and a secondary starter application. I had to resort to that after facing some problems using Application.Restart.
In any case, not being fluent with IL, I was wondering if someone could explain how Application.Restart works in the first place. It is a call to the runtime but what exactly does the runtime do? How does it close the existing instance and how does it know when to launch a new one?
... not being fluent with IL, ...
Have you considered using a decompiler (Reflector, dotPeek) or, even better, the reference source code of the .NET framework?
Anyway.
At a casual look it does the following:
In all below cases, the current instance is terminated using Application.ExitInternal(). That is the gist of the public Application.Exit() method, eliding some security checks/asserts.
See if it can determine the Assembly.GetEntryAssembly(). If that is null, the call to Application.Restart() was most likely made from unmanaged code and the operation throws a NotSupportedException to the caller.
See if the current process is ieexec.exe; if so, use it to restart the application (for more information about ieexec.exe see here). Actually that is pretty much also a Process.Start() call to ieexec.exe, but the command line arguments are not gathered by Environment.GetCommandLineArgs() (see below), but by reading the APP_LAUNCH_URL application domain data.
See if the application is a click-once application (ApplicationDeployment.IsNetworkDeployed); if so, call CLR-internal native code to (re)launch it: CorLaunchApplication. The only publicly available source code that somewhat resembles the native parts of the CLR is the shared source CLI (sscli), which is based on the .NET 2.0 framework and is also partly incomplete. It contains a definition for that function (clr\src\vm\hosting.cpp), but it is only a stub. In the end it will use some means to restart the process (e.g. Win32's CreateProcess API).
Else: the application is a "regular" .NET application. Environment.GetCommandLineArgs() is used to recreate the original command line and Process.Start(Application.ExecutablePath) is used to restart the application.
The use of the Application.Exit mechanism to try to end the current instance is probably the reason why you find it unreliable. Forms that cancel the closing event it sends can interrupt it. Also see this SO question.
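The "regular" application path described above can be sketched roughly as follows. This is an approximation under stated assumptions, not the framework's code: RestartSketch and BuildArguments are hypothetical names, and the public Application.Exit() stands in for the internal ExitInternal().

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Windows.Forms;

public static class RestartSketch
{
    // Rebuilds a quoted argument string from the original command line.
    // args[0] is the executable itself, so it is skipped.
    public static string BuildArguments(string[] args)
    {
        var sb = new StringBuilder();
        for (int i = 1; i < args.Length; i++)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append('"').Append(args[i]).Append('"');
        }
        return sb.ToString();
    }

    public static void Restart()
    {
        var startInfo = new ProcessStartInfo(Application.ExecutablePath)
        {
            Arguments = BuildArguments(Environment.GetCommandLineArgs())
        };

        // Ask the current instance to shut down, then launch the new one.
        // A form that cancels its closing event can keep the old instance alive.
        Application.Exit();
        Process.Start(startInfo);
    }
}
```

Nothing in this sequence waits for the old instance to finish exiting, which is one reason a mutex plus a separate starter process can be more dependable.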
My application is a mix of C# and C++ code. The startup module, written in C#, loads a C++ module through COM (Component Object Model) during the initialization phase. Everything worked correctly until I decided to add a WCF service to the C# part. All WCF service calls are routed to the C++ code using COM. After adding some new methods I noticed memory leaks reported in the output window, so I added a breakpoint to the destructor of a C++ class, as can be seen from the screenshot. From this point on weird things started to happen. After the program reaches the breakpoint it unexpectedly crashes. The first weird thing is that when I run the program without the breakpoint set, it ends gracefully. The second weird thing is that the program crashes as if it were running without a debugger. After clicking the "Open in debugger" button (or something like this) I get the error message: "Program is already opened under debugger." There is no message in the output window that could point me to the source of the error, and no suspicious code.
When I add a message box at the beginning of the destructor, it displays for a fraction of a second and then the whole application closes (without giving the user the opportunity to read what is displayed in the message box). I am desperately searching for any clue.
P.S. The problem occurs only when a WCF method has been called at least once. It doesn't matter whether the program flow in that particular call was routed to the C++ level or not.
When calling C# from C++ sometimes the garbage collector doesn't properly get called before program end. Try forcing garbage collection at the end of your C# code.
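Forcing a collection at the end of the managed code usually looks like the sketch below. ShutdownHelper is a hypothetical name, and whether this is enough depends on which references are still reachable at that point.

```csharp
using System;

public static class ShutdownHelper
{
    // Call this as the last step of the managed shutdown path.
    public static void ForceFullCollection()
    {
        GC.Collect();                  // collect unreachable objects
        GC.WaitForPendingFinalizers(); // let finalizers run, including those of COM wrappers
        GC.Collect();                  // reclaim objects that were only kept alive for finalization
    }
}
```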
Resolved by following code:
    public void Dispose()
    {
        Marshal.Release(internal_interface_ptr);
        internal_interface_ptr = IntPtr.Zero;
        Marshal.ReleaseComObject(internal_interface);
        Marshal.ReleaseComObject(internal_interface);
        internal_interface = null;
    }
Besides this, one other reference was hanging in the C++ code. So, to conclude, the main mistake on my part was forgetting to explicitly release the COM object in the C# code. Even though the garbage collector takes on the task of managing memory, this isn't true for modules written in other programming languages. The COM destructor was called very late, when the particular dynamic link library was about to be unloaded from memory, and this caused problems. I hope I explained it sufficiently clearly.
I think I have a curly one here... I have a WinForms application that crashes fairly regularly every hour or so when running as an x64 process. I suspect this is due to stack corruption and would like to know if anyone has seen a similar issue or has some advice for diagnosing and detecting the issue.
The program in question has no visible UI. It's just a message window that sits in the background and acts as a sort of 'middleware' between our other client programs and a server.
It dies in different ways on different machines. Sometimes it's an 'APPCRASH' dialog that reports a fault in ntdll.dll. Sometimes it's an 'APPCRASH' that reports our own dll as the culprit. Sometimes it's just a silent death. Sometimes our unhandled exception hook logs the error, sometimes it doesn't.
In the cases where Windows Error Reporting kicks in, I've examined memory dumps from several different crash scenarios and found the same managed exception in memory each time. This is the same exception I see reported as an unhandled exception in the cases where it logs before it dies.
I've also been lucky (?) enough to have the application crash while I was actively debugging with Visual Studio - and saw that same exception take down the program.
Now here's the kicker. This particular exception was thrown, caught and swallowed in the first few seconds of the program's life. I have verified this with additional trace logging and I have taken memory dumps of the application a couple of minutes after application startup and verified that exception is still sitting there in the heap somewhere. I've also run a memory profiler over the application and used that to verify that no other .NET object had a reference to it.
The code in question looks a bit like this (vastly simplified, but maintains the key points of flow control)
    public class AClass
    {
        public object FindAThing(string key)
        {
            object retVal = null;
            Collection<Place> places = GetPlaces();
            foreach (Place place in places)
            {
                try
                {
                    retVal = place.FindThing(key);
                    break;
                }
                catch {} // Guaranteed to only be a 'NotFound' exception
            }
            return retVal;
        }
    }
    public class Place
    {
        public object FindThing(string key)
        {
            bool found = InternalContains(key); // <snip> some complex if/else logic
            if (found)
                return InternalFetch(key);
            throw new NotFoundException(/*UsefulInfo*/);
        }
    }
The stack trace I see, both in the event log and when looking at the heap with windbg looks a bit like this.
Company.NotFoundException:
Place.FindThing()
AClass.FindAThing()
Now... to me that reeks of something like stack corruption. The exception is thrown and caught while the application is starting up. But the pointer to it survives on the stack for an hour or more, like a bullet in the brain, and then suddenly breaches a crucial artery, and the application dies in a puddle.
Extra clues:
The code within 'InternalFetch' uses some Marshal.[Alloc/Free]CoTask and pinvoke code. I have run FxCop over it looking for portability issues, and found nothing.
This particular manifestation of the issue is only affecting x64 code built in release mode (with code optimization on). The code I listed for the 'Place.Find' method reflects the optimized .NET code. The unoptimized code returns the found object as the last statement, not 'throw exception'.
We make some COM calls during startup before the above code is run... and in a scenario where the above problem is going to manifest, the very first COM call fails. (Exception is caught and swallowed). I have commented out that particular COM call, and it does not stop the exception sticking around on the heap.
The problem might also affect 32 bit systems, but if it does - then the problem does not manifest in the same spot. I was only sent (typical users!) a few pixels worth of a screen shot of an 'APP CRASH' dialog, but the one thing I could make out was 'StackHash_2264' in the faulting module field.
EDIT:
Breakthrough!
I have narrowed down the problem to a particular call to SetTimer.
The pInvoke looks like this:
    [DllImport("user32")]
    internal static extern IntPtr SetTimer(IntPtr hwnd, IntPtr nIDEvent, int uElapse, TimerProc CB);
    internal delegate void TimerProc(IntPtr hWnd, uint nMsg, IntPtr nIDEvent, int dwTime);
There is a particular class that starts a timer in its constructor. Any timers set before that object is constructed work. Any timers set after that object is constructed work. Any timer set during that constructor causes the application to crash, more often than not. (I have a laptop that crashes maybe 95% of the time, but my desktop only crashes 10% of the time).
Whether the interval is set to 1 hour or 1 second seems to make no difference. The application dies when the timer is due, usually by throwing some previously handled exception as described above. The callback does not actually get executed. If I set the same timer on the very next line of managed code after the constructor returns, all is fine and happy.
I have had a debugger attached when the bad timer was about to fire, and it caused an access violation in 'DispatchMessage'. The timer callback was never called. I have enabled the MDAs that relate to managed callbacks being garbage collected, and it isn't triggering. I have examined the objects with sos and verified that the callback still existed in memory, and that the address it pointed to was the correct callback function.
If I run '!analyze -v' at this point, it usually (but not always) reports something along the lines of 'ERROR_SXS_CORRUPT_ACTIVATION_STACK'
Replacing the call to SetTimer with Microsoft's 'System.Windows.Forms.Timer' class also stops the crash. I've used Reflector on the class and can see that internally it still calls SetTimer, but does not register a procedure. Instead it has a native window that receives the callback. Its pInvoke definition actually looks wrong... it uses 'ints' for the eventId, where the MSDN documentation says it should be a UIntPtr.
Our own code originally also used 'int' for nIDEvent rather than IntPtr - I changed it during the course of this investigation - but the crash continued both before and after this declaration change. So the only real difference that I can see is that we are registering a callback, and the Windows class is not.
So... at this stage I can 'fix' the problem by shuffling one particular call to SetTimer to a slightly different spot. But I am still no closer to actually understanding what is so special about starting the timer within that constructor that causes this error. And I dearly would like to understand the root cause of this issue.
Just briefly thinking about it, it sounds like an x64 interop issue (i.e., calling 32-bit native functions from x64 managed code is fraught with danger). Does the problem go away if you force your application to compile for the x86 platform from within project properties?
You can read suggestions on forcing x86 compilation during x86/x64 development on Dotnetrocks. Richard Campbell's suggestion is that Visual Studio should default to the x86 platform and not AnyCPU.
http://www.dotnetrocks.com/default.aspx?showNum=341 (transcript).
With regard to advanced debugging, I have not had a chance to debug x64 interop code, but I hear that this book is a great resource: Advanced .NET Debugging.
Finally, one thing you might try is force Visual Studio to break when an exception is thrown.
Use something like DebugDiag for x64 or Windbg to write a dump on Kernel32!TerminateProcess and on second-chance .NET exceptions, which should give you the actual exception context frame (.ecxr) of the exception that occurred.
This should help you in identifying the call-stack for the process terminate.
IMO it could be mostly because of PInvoke calls. You could use Managed Debugging Assistants to debug these issues.
If MDAs are used along with Windbg, they emit messages that are helpful in debugging.
I have also found the tools from the http://clrinterop.codeplex.com/ team extremely handy when dealing with interop.
EDIT
This should explain why it is not working in 64 bit: "Issue with callback method in SetTimer Windows API called from C# code".
This does sound like a corruption issue. I would go through all of your interop calls and ensure that all of the parameters to the DllImport'ed functions are the correct types. For example, using an int in place of an IntPtr will work in 32-bit code but can crash in 64-bit code.
I would use a site like PInvoke.net to verify all of the signatures.
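As an illustration of matching parameter widths, here is a sketch of declarations that follow the documented Win32 prototypes for SetTimer and TIMERPROC, using pointer-sized types where the native side uses UINT_PTR. The NativeTimer class, its Start helper, and the OnTimer callback are hypothetical names. Since the question reports that the crash persisted after switching int to IntPtr, treat this as a signature check rather than an established root cause.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class NativeTimer
{
    // void CALLBACK TimerProc(HWND hwnd, UINT uMsg, UINT_PTR idEvent, DWORD dwTime)
    [UnmanagedFunctionPointer(CallingConvention.Winapi)]
    internal delegate void TimerProc(IntPtr hWnd, uint uMsg, UIntPtr idEvent, uint dwTime);

    // UINT_PTR SetTimer(HWND hWnd, UINT_PTR nIDEvent, UINT uElapse, TIMERPROC lpTimerFunc)
    [DllImport("user32.dll", SetLastError = true)]
    internal static extern UIntPtr SetTimer(IntPtr hWnd, UIntPtr nIDEvent, uint uElapse, TimerProc lpTimerFunc);

    // BOOL KillTimer(HWND hWnd, UINT_PTR uIDEvent)
    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    internal static extern bool KillTimer(IntPtr hWnd, UIntPtr uIDEvent);

    // Held in a static field so the delegate cannot be garbage collected
    // while the native timer still holds a pointer to it.
    private static readonly TimerProc Callback = OnTimer;

    private static void OnTimer(IntPtr hWnd, uint uMsg, UIntPtr idEvent, uint dwTime)
    {
        Console.WriteLine("timer {0} fired", idEvent);
    }

    // The callback is only delivered if the calling thread pumps messages.
    internal static UIntPtr Start(IntPtr hWnd, UIntPtr timerId, uint intervalMs)
    {
        return SetTimer(hWnd, timerId, intervalMs, Callback);
    }
}
```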
The system I'm working with consists of:
- A front-end application most likely written in VB, or else VC++ (I don't know, and I don't and can't have the sources for it)
- An unmanaged VC++ .dll
- A C# .dll
The application calls the first dll, and the first dll calls different methods from the second one.
In order to make the first dll able to see and call the C# code I followed this guide:
http://support.microsoft.com/kb/828736
The only difference is that I am not compiling with /clr:OldSyntax; if I do, then changing the other dependent compiler options makes the first dll load incorrectly from the application.
Everything compiles smoothly; the whole setup even worked fine initially, however after completely developing my code across the two dlls I now get an error in the application. The error is:
Run-time error '-2147417848 (80010108)':
Automation Error
The object invoked has disconnected from its clients.
And occurs when the following line is executed in the first dll:
MyManagedInterfacePtr ptrName(__uuidof(MyManagedClass));
I tried reproducing a fully working setup but without success.
Any ideas on how the heck I managed to do it in the first place?
Or alternatively on other approaches for making the two dlls work together?
Thanks in advance!
It is a low-level COM error, associated with RPC. RPC is normally used with out-of-process servers, but that doesn't sound like your setup. It would also be used if you make calls on a COM interface from another thread. One possible cause is that the thread that created the COM object was allowed to exit, calling CoUninitialize and tearing down the COM object. A subsequent call made from another thread would generate this error. Getting reference counting wrong (calling Release too often) could cause this too.
Tackle this by carefully tracing which threads create a COM object and how long they survive.
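On the managed side, one low-effort way to start that tracing is to log the thread and apartment each time the C# class is constructed or called. A minimal sketch, assuming a hypothetical helper named ComThreadTrace that you would call from the constructor and COM-exposed methods of the managed class:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class ComThreadTrace
{
    // Records which thread (and which COM apartment) is executing at this point.
    public static string Describe(string where)
    {
        Thread t = Thread.CurrentThread;
        string line = string.Format(
            "{0}: managed thread {1}, apartment {2}, background={3}",
            where, t.ManagedThreadId, t.GetApartmentState(), t.IsBackground);
        Trace.WriteLine(line);
        return line;
    }
}
```

Comparing the thread id logged at construction with the ids logged on later calls shows whether the object is being used from a thread other than the one that created it.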