We are currently developing a software solution which has a client and a number of WCF services that it consumes. The issue we are having is that the WCF services time out after a period of inactivity. As far as I understand, there are a few ways to resolve this:
Increase timeouts (as far as I understand, this is generally not recommended, e.g. setting the timeout to infinite/weeks is considered bad practice)
Periodically ping the WCF services from the client (I'm not sure that I'm a huge fan of this, as it will add redundant, periodic calls)
Handle timeout issues and attempt to reconnect (this is slow and requires a lot of manual code)
Reliable Sessions - some sources mention that this is the in-built WCF pinging and message reliability mechanism, but other sources mention that this will still time out.
What is the recommended/best way of resolving this issue? Is there any official reading material on this? I could not find much information myself.
Thanks!
As far as I can see, you have to use a combination of the points you listed.
You are right, increasing the timeouts is bad practice and can give you a lot of problems.
If you don't want to use Reliable Sessions, then Ping is the only applicable way to hold the connection.
You need to handle these things regardless of whether a timeout occurs, the connection is lost, or an exception is thrown. There are plenty of ways your connection can fault.
Reliable Sessions are a good way to avoid implementing a ping yourself, but technically they do nearly the same thing: WCF automatically sends an "I am still here" request.
The conclusion is that you need point 3 and either point 2 or point 4. To reduce the manual code for point 3, you can use proxies or a wrapper around your ServiceClient that establishes a new connection if the old one has faulted during a request. Point 4 is easy to implement, because you only need some small additions to the binding in your config, and the traffic overhead is not that big. Point 2 is the most expensive way: you need to manage a Thread/Task that only pings the server, and the service needs to be extended. But as you stated before, Reliable Sessions can fail, and pings should put you on the safe side.
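To illustrate the wrapper idea for point 3, here is a minimal, language-neutral sketch in Python (it is not WCF code). `ChannelFaulted`, `ReconnectingClient` and the `factory` callable are hypothetical names made up for this example; in a real client the factory would create a fresh ServiceClient and the caught exception would be whatever your stack raises for a faulted channel.

```python
import time


class ChannelFaulted(Exception):
    """Hypothetical error raised when the underlying connection is unusable."""


class ReconnectingClient:
    """Wraps a client factory and rebuilds the client when a call faults."""

    def __init__(self, factory, max_attempts=3, delay_seconds=0.0):
        self._factory = factory            # callable that opens a fresh connection
        self._max_attempts = max_attempts
        self._delay_seconds = delay_seconds
        self._client = None

    def call(self, method_name, *args, **kwargs):
        last_error = None
        for _ in range(self._max_attempts):
            if self._client is None:
                self._client = self._factory()   # (re)connect lazily
            try:
                return getattr(self._client, method_name)(*args, **kwargs)
            except ChannelFaulted as error:
                last_error = error
                self._client = None              # discard the faulted client
                time.sleep(self._delay_seconds)
        raise last_error
```

A caller would use something like `ReconnectingClient(make_client).call("GetData", 42)`, where `make_client` is again an assumed name. Note that a blind retry is only safe for operations that can be repeated without side effects.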
You should ask yourself what your WCF endpoint is doing. Is the way you have your command set up optimal?
Perhaps it would be better to base the long-running endpoint on a polling system, so the client can issue a quick query instead of waiting on the result of the endpoint's actions.
You should also consider data transfer as a possible issue. Are you transferring a lot of data back?
To get a more pointed answer, we'd need to know more about the specific endpoint as well as any other responsibilities there are for the service.
We have an asp.net webapi application that needs to issue a lot of calls to other web applications (it's basically a reverse proxy). To do this we use the async methods of the HttpClient.
Yes, we have seen the hints about using only one HttpClient instance and not to dispose of it.
Yes, we have seen the hints about setting configuration values, especially the problem with the lease timeout. Currently we set ConnectionLimit = CPU*12, ConnectionLeaseTimeout = 5min and MaxIdleTime = 30s.
We can see that the connections behave as desired. The throughput in a load test was also very good. However we are facing issues where occasionally the connections stop working. It seems to happen when a lot of requests are coming in (and, being a reverse proxy, cause new requests to be issued) and it happens mostly (but not only) with the slowest of all backend applications. The behaviour is then that it takes forever to finish the requests to this endpoint or they simply end in a timeout.
An IISReset of the server hosting our reverse proxy application terminates the problems (for a while).
We have investigated in several areas already:
Performance issues of the remote web application: Although it behaves exactly as if this were the case, the performance is good when the same requests are issued locally on the remote server. Also the values for CPU / network etc. are low.
Network issues (bandwidth, router, firewall, load balancers): Possible but rather unlikely since everything else runs stable and our hoster is involved in the analysis too.
Threadpool starvation: Not impossible but rather theoretical - sure we have a lot of async calls but shouldn't that help regarding this issue?
HttpCompletionOption.ResponseHeadersRead: Not a problem by itself but maybe one piece of the puzzle?
The best explanation so far focuses on the ConnectionLimit: We started setting the values mentioned above only recently and this seems to have triggered the problems. But why would it? Shouldn't it be an improvement to reuse the connections instead of opening a new one for every request? And the values we set seem to be rather conservative?
We have started to experiment with these values lately to see their impact in production. Yet it is still unclear to us whether this is the only cause, and we'd appreciate a more straightforward approach for analysis. Unfortunately a memory dump and netstat printouts did not help any further.
Some suggestions about how to analyze or hints about possible causes would be highly appreciated.
***** EDIT *****
Setting the connection limit to 1000 is solving the issue! So the question remains: why is that the case? From what we know the default connection limit is 2 in a non-web and 1000 in a web application. MS is suggesting a default value of CPU*12 (but they didn't implement it like that?!) so our change was basically to go from 1000 to 48. Still we can see that only a handful of connections are open. Is there anyone who can shed some light on this? What is the exact behaviour regarding opening new connections, reusing existing ones, pipelining etc.? Is there any source of information for this?
By ConnectionLimit, do you mean ServicePointManager.DefaultConnectionLimit? Yes, it matters. When the value is X and there are already X requests waiting for a response, a new request will not be sent until one of the previous requests has finished.
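The queuing effect described in that answer can be imitated with a small simulation. This is only a sketch of the general "X slots, everything else waits" behaviour using a semaphore in Python; it does not reproduce the internals of HttpClient or ServicePointManager, and `run_requests` is a made-up helper.

```python
import asyncio


async def run_requests(limit, request_count, request_seconds):
    """Simulate `request_count` requests sharing `limit` connection slots."""
    slots = asyncio.Semaphore(limit)     # stands in for the connection limit
    in_flight = 0
    peak_in_flight = 0

    async def one_request():
        nonlocal in_flight, peak_in_flight
        async with slots:                # waits here while all slots are busy
            in_flight += 1
            peak_in_flight = max(peak_in_flight, in_flight)
            await asyncio.sleep(request_seconds)   # simulated slow backend
            in_flight -= 1

    loop = asyncio.get_running_loop()
    started = loop.time()
    await asyncio.gather(*(one_request() for _ in range(request_count)))
    return peak_in_flight, loop.time() - started


if __name__ == "__main__":
    for limit in (2, 8):
        peak, elapsed = asyncio.run(run_requests(limit, 8, 0.05))
        print(f"limit={limit} peak={peak} elapsed={elapsed:.2f}s")
```

With a low limit and a slow backend, the total time grows because later requests sit in the queue, which would look like "requests taking forever" from the caller's side.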
I posted a follow up question here: How to disable pipelining for the .NET HttpClient
Unfortunately there were no real answers to any of my questions. We ended up leaving the ConnectionLimit at 1000 (which is a workaround only but the only solution we were able to find).
Here's a problem I'm currently facing:
A WCF service exposes a large number of methods, some of which can take a longer amount of time.
The client is a WinRT (Metro-style) application (so some .NET classes are unavailable).
The timeout on the client has already been increased to 1.5 minutes.
Despite the increased timeout, some operations can take longer still (but not always).
If a timeout happens, the service continues on its merry way. The result of the requested operation is lost. Even worse, if the operation is a success, then the client won't get the data required, and the server won't "rollback".
All operations are already implemented using the async pattern on the client. I could use an event-based implementation but, as far as I'm aware, the timeouts will still occur then.
Increasing the timeout value is definitely an option, but it feels like a very dirty solution - it feels like pushing the problem away rather than solving it.
Implementing a WS transaction flow on the server seems impossible - I don't have access to TransactionScope class when designing WinRT apps.
WS Atomic seems like overkill as well (it also requires a lot more set up, and I'm willing to bet the limited capabilities of WinRT applications will prove a big hassle to overcome).
So far my only idea (albeit one with a lot more moving parts, which sort of feels like reinventing the wheel) is to create two service methods - one which begins some long-running operation and returns some kind of "task ID", then runs the operation in the background, and saves the result of the operation (be it error or success) into a DB / storage with that task ID. The client can then poll for the operations result using that task ID via the second service method every once in a while until such a result is available (be it a success or an error).
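The two-method idea described above could be sketched roughly as follows. This is a language-neutral illustration in Python, not WCF or DAX code: `JobStore` is a hypothetical in-memory stand-in for the DB / storage, `start` plays the role of the first service method and `poll` the second, and `wait_for_result` is what the client would do.

```python
import threading
import time
import uuid


class JobStore:
    """In-memory stand-in for the DB / storage that holds results by task ID."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}

    def start(self, operation, *args):
        """First service method: kick off the work and return a task ID at once."""
        task_id = str(uuid.uuid4())
        with self._lock:
            self._results[task_id] = {"status": "running"}
        worker = threading.Thread(
            target=self._run, args=(task_id, operation, args), daemon=True
        )
        worker.start()
        return task_id

    def _run(self, task_id, operation, args):
        try:
            outcome = {"status": "succeeded", "value": operation(*args)}
        except Exception as error:   # store failures too, so the client sees them
            outcome = {"status": "failed", "error": str(error)}
        with self._lock:
            self._results[task_id] = outcome

    def poll(self, task_id):
        """Second service method: a quick lookup that never waits for the work."""
        with self._lock:
            return dict(self._results[task_id])


def wait_for_result(store, task_id, interval_seconds=0.01, max_polls=500):
    """Client side: poll until the job is no longer running."""
    for _ in range(max_polls):
        state = store.poll(task_id)
        if state["status"] != "running":
            return state
        time.sleep(interval_seconds)
    raise TimeoutError(f"job {task_id} did not finish in time")
```

Each individual call stays short, so the per-call timeout no longer has to cover the whole operation; the polling interval is the price paid in extra latency.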
This approach also has its drawbacks:
long operations become even longer, as the client needs to poll for the results
lots of new moving parts, potentially making the whole thing less stable
What else could I possibly try to solve this issue?
PS. The actual service side is also not without limitations - it's an MS DAX service, which likely comes with its own set of potential pitfalls and traps.
EDIT:
It appears my question has some similarity to this SO question... however, given the WinRT nature of the client and the MS DAX nature of the service I'm not sure anything in the answer is really useful to me.
Is there any way (preferably in C#) to regularly measure connection-layer latency (round trip) without changing the application protocol and without creating a separate dedicated connection - e.g. using a SYN-ACK trick similar to what tcping does, but without closing/opening the connection?
I'm connecting to the servers via given ASCII based protocol (and always using TCP_NODELAY). Servers send me large amount of discrete messages and I'm regularly sending 'heartbeat' payload (but there is no response payload to the heartbeat).
I cannot change the protocol and in many cases I also cannot create more than one physical connection to the server.
Keep in mind that TCP does windowing, so this could cause issues when trying to implement an elegant SEQ/ACK solution. (you would want sequence, not synchronize)
[EDIT: Snipped a very overcomplicated and confusing explanation.]
I'd have to say the best way is to use a simple stopwatch method: start a timer, make a very thin request or poll, and measure the time until the reply comes back. If that query really is the lightest you can make it, then that should give you the minimum amount of time you can reasonably expect to wait, which is sometimes more valuable than the ping (which can be misleading).
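As a rough sketch of that stopwatch approach (in Python rather than C#, and assuming the protocol has some thin request that the server actually answers, which the heartbeat in the question does not), one could time a fixed-size request/reply exchange on the existing socket. `measure_round_trip` and its parameters are hypothetical names for this example.

```python
import socket
import time


def measure_round_trip(sock, request, response_size, samples=5):
    """Time `samples` request/reply exchanges; return a list of durations in seconds."""
    timings = []
    for _ in range(samples):
        started = time.perf_counter()
        sock.sendall(request)
        received = 0
        while received < response_size:        # read until the full reply arrived
            chunk = sock.recv(response_size - received)
            if not chunk:
                raise ConnectionError("peer closed the connection")
            received += len(chunk)
        timings.append(time.perf_counter() - started)
    return timings
```

The numbers include the server's processing time for the request, so they are an upper bound on pure network latency rather than a replacement for a ping.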
If you really absolutely need just the network time to machine and back, just use an ICMP ping.
I suppose similar questions have already been asked, but I was unable to find any. Please feel free to point me to an existing solution.
I'll explain my scenario. I'd like to create a server application. There are many clients (currently only a few dozens, but it should scale up to 1000+) that connect to the server (which is running on a single machine).
Each client periodically sends a small amount of data to the server to process (processing is quick). The server can also send small amounts of data to each client on a regular basis. The response time should be low (<100 ms), but realtime or anything like that is not required.
My first idea was back from when I was still programming in VB6: Create a server socket to listen to incoming requests, then create a client socket for each possible client (singlethreaded). I doubt this scales well. It is also difficult to implement the communication.
So I figured I'd create a listener thread to accept new client connections and a different thread to actually read the incoming data by the clients. Since there are going to be many clients, I don't want to create a thread for each client. Instead, I'd prefer to use a single thread to read all incoming data in a loop, then either processing these data directly or creating work items for a different thread to process. I guess this approach would scale well enough. Any comments on this idea are most welcome.
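The "one thread reads all clients in a loop" idea is usually built on readiness notification rather than on blocking reads. Below is a minimal sketch using Python's standard `selectors` module; the function names and the `"listener"` / `"client"` tags are made up for this example, and message framing is left out (TCP is a byte stream, so one `recv` is not necessarily one message).

```python
import selectors
import socket


def accept_client(selector, listener):
    """Accept one pending connection and watch it for incoming data."""
    conn, _ = listener.accept()
    conn.setblocking(False)
    selector.register(conn, selectors.EVENT_READ, data="client")


def read_client(selector, conn, handle_message):
    """Read what is available; an empty read means the client disconnected."""
    data = conn.recv(4096)
    if data:
        handle_message(conn, data)   # process directly or hand off to a worker
    else:
        selector.unregister(conn)
        conn.close()


def poll_once(selector, handle_message, timeout=1.0):
    """Handle every socket that is ready right now (one pass of the loop)."""
    for key, _ in selector.select(timeout):
        if key.data == "listener":
            accept_client(selector, key.fileobj)
        else:
            read_client(selector, key.fileobj, handle_message)
```

A server would register its non-blocking listening socket with `data="listener"` and then call `poll_once` in a loop. Here the accepting and the reading share one thread, which is a slight simplification of the two-thread layout described above.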
The remaining problem I'm worried about is ease of communication. The above solution seems to require a manual protocol, possibly sending ASCII commands via TCP. While this would work, I think there should be a better way nowadays.
Some interface/proxyish way seems reasonable. I worked a bit with Java RMI before. From my point of understanding, .NET Remoting serves a similar purpose. Is Remoting a feasible solution to the scenario I described (many clients)? Is there an even better way I don't know of yet?
Edit:
This is not in LAN, but internet, if that matters.
If possible, it should also run under Linux.
As AresnMkrt pointed out, you should try WCF.
Just take it as is (with netTcpBinding, but don't forget to switch security off) and create a Tracer Bullet - measure if performance meets your requirements.
If not, you can try to tune WCF - WCF is very extensible, and you can modify message serialization to send ASCII messages as you want.
Are you sure you need a binary protocol? Rather, are you sure you need to invent a whole new protocol where plain RESTful service with JSON/XML will suffice? WCF can help you in this regard a lot.
We're developing a .NET app that must make up to tens of thousands of small webservice calls to a 3rd party webservice. We would prefer a more 'chunky' call, but the 3rd party does not support it. We've designed the client to use a configurable number of worker threads, and through testing have code that is fairly well optimized for one multicore machine. However, we still want to improve the speed, and are looking at spreading the work across multiple machines. We're well versed in typical client/server/database apps, but new to designing for multiple machines. So, a few questions related to that:
Is there any other client-side optimization, besides multithreading, that we should look at that could improve speed of a http request/response? (I should note this is a non-standard webservice, so is implemented using WebClient, not a WCF or SOAP client)
Our current thinking is to use WCF to publish chunks of work to MSMQ, and run clients on one or more machines to pull work off of the queue. We have experience with WCF + MSMQ, but want to be sure we're not missing better options. Are there other, better ways to do this today?
I've seen some 3rd party tools like DigiPede and Microsoft's HPC offerings, but these seem like overkill. Any experience with those products or reasons we should consider them over roll-our-own?
Sounds like your goal is to execute all these web service calls as quickly as you can, and get the results tabulated. Given that, your greatest efficiency control is going to be through scaling the number of concurrent requests you can make.
Be sure to look at your client-side connection limits. By default, I think the system default is 2 connections. I haven't tried this myself, but by upping the number of connections with this property, you should theoretically see a multiplier effect in terms of generating more requests by generating more connections from a single machine. There's more info on MS forums.
The MSMQ option works well. I'm running that configuration myself. ActiveMQ is also a fine solution, but MSMQ is already on the server.
You have a good starting point. Get that in operation, then move on to performance and throughput.
At CodeMash this year, Wesley Faler did an interesting presentation on this sort of problem. His solution was to store "jobs" in a DB, then use clients to pull down work and mark status when complete.
He then pushed the whole infrastructure up to Amazon's EC2.
Here are his slides from the presentation - they should give you the basic idea:
I've done something similar with multiple PCs locally - the basics of managing the workload were similar to Faler's approach.
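For readers who want a feel for the "jobs in a DB" pattern, here is a minimal sketch using SQLite from Python. It is not taken from Faler's presentation; the table layout, the status values and the function names are all assumptions made for this example.

```python
import sqlite3


def create_queue(connection):
    connection.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending', worker TEXT)"
    )


def add_job(connection, payload):
    with connection:
        connection.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def claim_job(connection, worker):
    """Claim the oldest pending job.

    Returns (id, payload), or None if nothing is pending or another
    worker claimed the same row first.
    """
    with connection:   # commit on success, roll back on error
        row = connection.execute(
            "SELECT id, payload FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        updated = connection.execute(
            "UPDATE jobs SET status = 'running', worker = ? "
            "WHERE id = ? AND status = 'pending'",
            (worker, row[0]),
        )
        return row if updated.rowcount == 1 else None


def complete_job(connection, job_id):
    with connection:
        connection.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
```

Each worker machine would loop over `claim_job`, do the web service call, and then call `complete_job`. The conditional `UPDATE ... AND status = 'pending'` is what keeps two workers from taking the same job.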
If you have optimized the code, you could look into optimizing the network side to minimize the number of packets sent:
reuse HTTP sessions (i.e.: multiple transactions into one session by keeping the connection open, reduces TCP overhead)
reduce the number of HTTP headers to the minimum in the request to save bandwidth
if supported by server, use gzip to compress the body of the request (need to balance CPU usage to do the compression, and the bandwidth you save)
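As a small illustration of the gzip point, this sketch builds a compressed request body plus the headers that announce it. It uses only Python's standard library; `build_compressed_request` is a made-up helper, and whether the third-party service accepts a `Content-Encoding: gzip` request body is an assumption you would need to verify.

```python
import gzip
import json


def build_compressed_request(payload):
    """Return (headers, body) for a gzip-compressed JSON request."""
    raw = json.dumps(payload).encode("utf-8")
    body = gzip.compress(raw)
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",        # tells the server the body is compressed
        "Content-Length": str(len(body)),
    }
    return headers, body
```

For very small bodies the gzip framing can make the request larger, so this only pays off when the payload is reasonably big and compressible.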
You might want to consider Rhino Service Bus instead of MSMQ. The source is available here.