On several occasions I have received the following error from a .Net (C#, 4.0) application out of the blue on sending a message thru a producer:
CWSMQ0082E: Failed to send to CompCode: 2, Reason: 2009. A problem was encountered whilst sending a message. See the linked exception for more information.
Of course, the LinkedException (why not use the InnerException IBM???) is null i.e. no more information available.
Code I'm using (pretty straightforward):
var m = _session.CreateBytesMessage();
m.WriteBytes(mybytearray);
m.JMSReplyTo = myreplytoqueue;
m.SetIntProperty(XMSC.JMS_IBM_MSGTYPE, MQC.MQMT_DATAGRAM);
m.SetIntProperty(XMSC.JMS_IBM_REPORT_COA, MQC.MQRO_COD);
m.SetIntProperty(XMSC.JMS_IBM_REPORT_COD, MQC.MQRO_COA);
myproducer.Send(m, DeliveryMode.Persistent, mypriority, myttl);
(Offtopic: I hate the SetIntProperty way of setting properties. Which <expletive deleted> came up with that idea? It takes ages to look up all sorts of constants all over the place and its allowed values.)
The exception is thrown on the .Send method. I'm using XMS.Net (IA9H / 2.0.0.7). The only Google result that turns up turns out to have a different reason code (and even if it were the same, it should be fixed in my version if I understand correctly). This occurs randomly (though it seems to happen more often when it's been a while since a message has been sent/received) and I have no way to reproduce this.
I have ab-so-lute-ly no idea how to troubleshoot this or even where to start looking. Is this something caused by the server-side? Is it caused by XMS.net or some underlying IBM WebSphere MQ infrastructure?
Some results that I found that seem similar are suggesting to set SHARECNV to any value higher than 0 or to "true" / "yes" but the documentation explicitly tells me the default is 10. Also; I have no idea if this is the cause so changing it to another value feels like a shotgun approach.
Anybody any idea on how to go about solving this? I could of course just catch the exception, tear everything (channels, sessions, whatever) down and restart but that's just plain ugly IMHO.
The 2009 return code means "Connection Broken." Basically, the underlying TCP socket is gone and the client finds out about it at the time of the API call. It is possible to tune the channels using heartbeat and keepalive so that WMQ tries harde to keep the socket alive. However if the socket is timed out by the underlying infrastructure, nothing WMQ can do will help. Examples we've seen are that firewalls and load balancers are often set to detect idle connections and sever them.
Modern versions of WMQ client will attempt to reconnect transparently. The application just blocks a bit longer when this occurs.
Short of using the automatic reconnect, the only solution is in fact to rebuild the connection. Since it will get a new connection handle, all the object handles must be rebuilt as well.
Many of the tuning functions described here are available through the client configuration file, available in v7.0 and greater clients. In particular, the TCP stanza of that file enables keepalive. (The TCP spec says that if keepalive is provided, it must be disabled by default.) The QMgr has a similar ini file with configuration stanzas, including one for keepalive. The latest WMQ client is available as SupportPac MQC71 if you need that.
In cases where the main exception is sufficient enough to indicate the error, the inner exception will be null. In your case it's MQ reason code 2009 which means a connection to queue manager has been broken. The socket through which your application and queue manager were communicating was closed for some reason. The reason for socket close could be a network blip.
Along with suggestions T.Rob noted above, You could also run a XMS and Queue manager trace to understand the problem further. Please see the Troubleshooting chapter in XMS InfoCenter.
HTH
Related
I have a Lync 2013-based application which:
connects to a UserEndpoint (hereinafter CallCenter)
redirects calls made to CallCenter according to bla bla bla business logic.
At times, a user will see CallCenter in their standard Lync 2013 Client as Online, but if that user attempts to start an IM call with CallCenter, the user receives the message "We couldn't send this message because CallCenter is unavailable or offline."
I haven't been able to identify the process that leads up to this, but if it's happened to one user, then all of the other users experience the same problem when attempting to call CallCenter. The only way I have been able to recover CallCenter has been to restart my application. Regular interaction with CallCenter then resumes without a problem.
If CallCenter is indeed "unavailable or offline", then why does it's Presence appear as "Online"? Is there a need to renew / keep CallCenter's connection alive every so often?
For reference, I connect CallCenter like so:
UserEndpointSettings settings = new UserEndpointSettings(userURI, _ProxyHost, _ProxyPort);
settings.AutomaticPresencePublicationEnabled = true;
settings.Presence.UserPresenceState = PresenceState.UserAvailable;
_userEndpoint = new UserEndpoint(_Platform.CollabPlatform, settings);
_userEndpoint.BeginEstablish(res =>
{
try
{
_userEndpoint.EndEstablish(res);
_userEndpoint.StateChanged += new EventHandler<LocalEndpointStateChangedEventArgs>(_userEndpoint_StateChanged);
}
catch (Exception ex)
{
LogError(ex, ErrorReference.EndpointEstablishFailed);
}
}, null);
In the client, when you go offline or experience an error, your presence reflects that (most of the time, that is). This can lead you to believe that the status portion of presence [1] is somehow tied to actual availability.
When you're working with UCMA, you are given ultimate control over everything related to your endpoint. As you've seen, you can make your UCMA application do things that would otherwise be impossible in the regular client. You don't have to publish any presence status (leaving you "offline" to your users), yet the service can still send/receive IMs. And, as you've seen, your service can be "Available" and yet ... have no capability to do anything but publish its status [2].
If you fail to wire up the appropriate modality (in your case IM), or your application encounters an exception which results in a particular modality no longer working (I suspect this may be your actual problem), the status of your service will still be available.
Begin/EndTerminate on the UserEndpoint should publish Offline for you automatically and publishing a presence other than Available is the only way to guarantee the presence won't be "Available" for the lifetime of your application (and even after the application ends/dies prematurely, though this is sometimes rectified by the server -- sometimes).
Here's how I'd attack resolving this issue. Ignore the presence problem and ignore the error. They're red herrings. Many problems result in the "unavailable or offline" message that have nothing to do with the service actually being stopped.
Instead, figure out why your calls aren't connecting.
If the call takes a while before you receive the error, check for deadlocks or circumstances where the Thread Pool has no room for another thread. Troubleshooting involves reviewing your code for race conditions and the myriad of other things that multi-threaded applications throw your way. If the IMCall fails instantly, check around the parts that handle incoming calls. In the latter case, your subscription may be gone (too many causes to list here, most of which are .Net related, not UCMA related), or your service may be dead.
If the importance of presence to your application is only to show it as "available" or "offline" when it is actually able to send/receive an IM, you're going to want to ensure your application terminates the endpoint properly during tear-down (including in the case of a critical failure: catch-terminate-rethrow or whatever is appropriate in your case).
[1] Be careful when thinking about the term "presence" as it relates to Lync. Presence contains availability status, modality specific states, capabilities (IM/Voice, etc), the "note" and contact information.
[2] This seems like a bizarre thing to do, however, it gave me the ability to use an ApplicationEndpoint to report on the availability of a web service (unrelated to Lync) that I wanted to be able to view in the Mobile client without connecting via VPN. When doing something like this, it's really important to publish the capabilities of your endpoint -- this will explicitly signal to your connected clients what your service can and cannot do.
[Final Footnote] There are a few ways to publish presence. The mechanism you're using to publish is the simplest and most logical to use if you're just interested in telling your users that the "service is here"/"service is not here" which is documented rather well here: Simplified Presence Publication for Endpoints
I'm trying to write an asynch socket application which transfering complex objects over across sides..
I used the example here...
Everything is fine till i try send multi package data. When the transferred data requires multiple package transfer server application is suspending and server is going out of control without any errors...
After many hours later i find a solution; if i close client sender socket after each EndSend callback, the problem is solving. But i couldn't understand why this is necessary? Or are there any other solution for the situation?
My (2) projects is same with example above only i changed EndSend callback method like following:
public void EndSendCallback(IAsyncResult result)
{
Status status = (Status)result.AsyncState;
int size = status.Socket.EndSend(result);
status.Socket.Close(); // <--------------- This line solved the situation
Console.Out.WriteLine("Send data: " + size + " bytes.");
Console.ReadLine();
allDone.Set();
}
Thanks..
This is due to the example code given not handling multiple packages (and being broken).
A few observations:
The server can only handle 1 client at a time.
The server simply checks whether the data coming in is in a single read smaller than the data requested and if so, assumes that's the last part.
The server then ignores the client socket while leaving the connection open. This puts the responsibility of closing the connection on the client side which can be confusing and which will waste resources on the server.
Now the first observation is an implementation detail and not really relevant in your case. The second observation is relevant for you since it will likely result in unexplained bugs- probably not in development- but when this code is actually running somewhere in a real scenario. Sockets are not streamlined. When the client sents over 1000 bytes. This might require 1 call to read on the server or 10. A call to read simply returns as soon as there is 'some' data available. What you need to do is implement some sort of protocol that communicates either how much data is being sent over- or when all the data has been sent over. I really recommend just to stick with the HTTP protocol since this is a well tested and well supported protocol that suits most scenario's.
The third observation might also cause bugs where the server is running out of resources since it leaves all connections open.
If i have a client that is connected to a server and if the server crashes, how can i determine, form my client, if the connection is off ? the idea is that if in my client's while i await to read a line from my server ( String a = sr.ReadLine(); ) and while the client is waiting to recieve that line , the server crashes , how do i close that thread that contains my while ?
Many have told me that in that while(alive) { .. } I should just change the alive value to true , but if my program is currently awaiting for a line to read, it won't get to exit the while because it will be trapped at sr.ReadLine() .
I was thinking that if i can't send a line to the server i should just close the client thread with .abort() . Any Ideas ?
Have a TimeOut parameter in ReadLine method which takes a TimeSpan value and times out after that interval if the response is not received..
public string ReadLine(TimeSpan timeout)
{
// ..your logic.
)
For an example check these SO posts -
Implementing a timeout on a function returning a value
Implement C# Generic Timeout
Is the server app your own, or something off the shelf?
If it's yours, send a "heart beat" every couple of seconds to let the clients know that the connection and service are still alive. (This is a bit more reliable than just seeing if the connection is closed since it may be possible for the connection to remain open while the server app is locked.)
That the server crashes has nothing to do with your clients. There are several external factors that can make the connection go down: The client is one of them, internet/lan problems is another one.
It doesn't matter why something fails, the server should handle it anyway. Servers going down will make your users scream ;)
Regarding multi threading, I suggest that you look at the BeginXXX/EndXXX asynchronous methods. They give you much more power and a more robust solution.
Try to avoid any strategy that relies on thread abort(). If you cannot avoid it, make sure you understand the idiom for that mechanism, which involves having a separate appdomain and catching ThreadAbortException
If the server crashes I imagine you will have more problems than just fixing a while loop. Your program may enter an unstable state for other reasons. State should not be overlooked. That being said, a nice "server timed out" message may suffice. You could take it a step further and ping, then give a slightly more advanced message "server appears to be down".
I get a CommunicationException while using WCF service. The message is:
The remote endpoint no longer recognizes this sequence. This is most likely due to an abort on the remote endpoint. The value of wsrm:Identifier is not a known Sequence identifier. The reliable session was faulted.
The exception is thrown in a moment after a contract method was called. Before calling contract method the channel state is Opened. I restore my service client after catching this exception and for some time it works fine. But then this error occures again. It seems like some timeout is exceeded, but I can't understand which one exactly.
I use wsHttpBinding with reliableSession enabled. The InactivityTimeout is set to half an hour and I'm sure it's not exceeded, because exception is thrown earlier.
I solved the problem. The reason was RecieveTimeout on a server side. It was set to 1 minute, so after having no requests during 1 minute server used to close a channel, and when client tried to call a contract, channel was already crashed due to the timeout.
I found the solution after reading this article:
http://msdn.microsoft.com/en-us/library/system.servicemodel.reliablesession.inactivitytimeout.aspx
I received this error while setting up a new WCF service which returned a list of objects.
My understanding is that WCF services can only pass very simple objects back n forth.
So objects with anything other than public properties will not be transferable.
The object had a read only property doing a bit of logic.
Once I got rid of this, rebuilt, and updated the web references, the error went away.
Tip:
If you're returning a object and it has properties check the gets and sets of each one.
We had a problem around it.
I have seen this happen when an application pool gets recycled.
Look at the very last section of this blog about service recycling .
In developing a relatively simple web service, that takes the data provided by a post and records it in a database table, we're getting this error:
Exception caught: The remote server returned an error: (500) Internal Server Er
or.
Stack trace: at System.Net.HttpWebRequest.GetResponse()
on some servers, but no others. The ones that are getting this are the physical machines, the others are virtual, and obviously the physical servers are far more powerful.
As far as we can tell, the problem is that the DB connections aren't being released back to the pools after each query. I'm using the using pattern below:
using (VoteDaoDataContext dao = new VoteDaoDataContext())
{
dao.insert_response_and_update_count(answerVal, swid, agent, geo, DateTime.Now, ip);
dao.SubmitChanges();
msg += "Thank you for your vote.";
dao.Dispose();
}
I added the dao.Dispose() call to ensure that connections are released when the method finishes, but I don't know whether or not it's necessary.
Am I using this pattern correctly? Is there something else I need to do to ensure that connections get returned to the pools correctly?
Thanks!
Your diagnostic information is not good enough. An HTTP/500 isn't enough detail to really tell if your theory is correct. You're going to need to capture a full stack trace in your logging if you want to get to the problem. I think you've jumped to a conclusion here. And no, you do not need that Dispose() before the end of your using{} block. That's what using{} does.
I thought that dispose() call was redundant, but I wanted to be sure.
We're seeing the connection pools saturating in the SQL logs (I can't look at the directly, I'm just a developer, and this stuff's running in a prod environment), and my ops guy said he's seeing connections timing out... and once they time out, the server starts running again, until the next time it saturates the connection pool.
We're going through the process of tweaking the connection pool settings at the moment... I wanted to be certain that I wasn't doing anything wrong, since this is my first time using Linq.
Thanks!