We're using ServiceStack 3.9.71.0 and we're currently experiencing unexplained latency issues with clients over a WAN connection.
A reply with a very small payload (<100 bytes) is received after 200ms+.
The round-trip-time (RTT) on the link is about 40ms due to the geographical distance. This has been verified by pinging the other host and using a simple echo service to test the latency of a TCP connection.
Both ping and echo test show latencies which are in line with expectations. Getting a reply from our ServiceStack host takes much longer than expected.
We've verified that:
The WAN link is only running at 25% of capacity (no congestion)
No QoS is employed on the WAN link
The same host gives a fast reply to the same request from a different host on the local network
The delay is not caused by our code processing the request
We've now stumbled across Nagle's algorithm and learned that it can cause delays for small requests on WAN networks (http://blogs.msdn.com/b/windowsazurestorage/archive/2010/06/25/nagle-s-algorithm-is-not-friendly-towards-small-requests.aspx).
In .NET it can be disabled by setting TcpClient.NoDelay = true (https://msdn.microsoft.com/en-us/en-US/library/system.net.sockets.tcpclient.nodelay(v=vs.110).aspx).
How can this be disabled for ServiceStack's TCP handling?
EDIT: I don't think that this is a duplicate of HttpWebRequest is slow with chunked data. The mentioned question covers HttpWebRequest which isn't used by ServiceStack. ServiceStack uses HttpListener which also happens to be controlled / managed by the mentioned ServicePointManager. We're going to conduct a test to see whether setting ServicePointManager.UseNagleAlgorithm = false solves the issue.
I think you provided the answer yourself in your update: UseNagleAlgorithm = false should solve this issue. But be careful, because ServicePointManager.UseNagleAlgorithm = false is a global setting, which means it turns the algorithm off for every endpoint and every request in the entire AppDomain. When you call more than one service endpoint (which is usually the case) with requests of mixed sizes, it will bite back. So you should consider setting this for one specific ServicePoint only, which you can acquire like this:
// disable Nagle only for the service point that handles this endpoint
ServicePoint sp = ServicePointManager.FindServicePoint(<uri>);
sp.UseNagleAlgorithm = false;
rather than setting it globally.
Here is an article about it: https://technet2.github.io/Wiki/blogs/windowsazurestorage/nagles-algorithm-is-not-friendly-towards-small-requests.html
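Putting it together for the calling side, a minimal sketch, assuming requests are made with ServiceStack's JsonServiceClient (the base URL is hypothetical); the 3.x C# clients are built on HttpWebRequest, so the ServicePoint settings for the host apply:
using System;
using System.Net;
using ServiceStack.ServiceClient.Web; // ServiceStack 3.x client namespace

// Hypothetical base URL; disable Nagle for this endpoint only
var baseUrl = "http://remote-host:8080/api";
ServicePoint sp = ServicePointManager.FindServicePoint(new Uri(baseUrl));
sp.UseNagleAlgorithm = false;

var client = new JsonServiceClient(baseUrl);
// ... issue requests as usual; small replies should no longer wait on Nagle delays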
I read this post: C# (429) Too Many Requests
and I understand the response code, but... why is this status code returned only when the call is made from the server side (backend) in production (hosted)? The service never returns this code when I call the same service from Chrome's address bar, or when I make the call from the server side (backend) on my localhost.
CASE 1 (works fine from localhost; the service URL is not localhost, it is hosted)
App A (localhost) calls App B (hosted) --> works fine
for (int i = 0; i < 1000; i++)
{
    // a new HttpClient is created and disposed for every request
    HttpClient client = new HttpClient();
    client.BaseAddress = new Uri(url);
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
    String response = client.GetStringAsync(urlParameters).Result;
    client.Dispose();
}
CASE 2 (works fine)
Chrome browser calls App B (hosted) --> works fine
CASE 3 (similar to case 1 but with far fewer requests - DOES NOT WORK)
App A (hosted) calls App B (hosted) --> 429
Why? What is the problem? How can I solve it?
What's Happening
The HTTP 429 response code indicates you have been rate limited. The idea is to prevent one caller from overwhelming a service, making it less available to other callers.
Most Common
That limiting can be based on many things. The most common are:
Number of calls per unit time (usually per second)
Number of concurrent calls
The General Case
A rate limiter may also forgive a short burst of calls that happens occasionally, may allow more calls before hitting the brakes based on who you are (using your IP or an API key for example), dynamically adjust its limits based on total system load, or do other things.
Probably Happening Here
Based on your description, I would guess the number of concurrent calls could be causing production rate limiting. Rather than hitting the external API hard trying to guess what the rules are, try reaching out to them to ask. If that is not an option, running multiple requests in parallel could validate this theory.
Handling
A great way to deal with this is to back off your requests when you receive an HTTP 429.
The service should return a Retry-After header indicating how many seconds you should wait before trying again. If it does, wait that long before resubmitting your request.
If the service does not provide that header (I work with a major one that does not), use exponential backoff instead.
Depending on your needs, you may want to tell your own caller to try again later (return an HTTP 429 yourself) or you may want to queue up pending requests and work off the queue to submit them all.
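A minimal sketch of that backoff strategy, assuming HttpClient; the helper name and attempt cap are illustrative, and the Retry-After handling covers the delta-seconds form of the header:
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: retries on 429, honoring Retry-After when present,
// otherwise falling back to exponential backoff.
static async Task<HttpResponseMessage> GetWithBackoffAsync(HttpClient client, string url)
{
    const int maxAttempts = 5; // illustrative
    for (int attempt = 0; ; attempt++)
    {
        HttpResponseMessage response = await client.GetAsync(url);
        if ((int)response.StatusCode != 429 || attempt == maxAttempts - 1)
            return response;

        TimeSpan delay;
        if (response.Headers.RetryAfter != null && response.Headers.RetryAfter.Delta.HasValue)
            delay = response.Headers.RetryAfter.Delta.Value;     // server told us how long to wait
        else
            delay = TimeSpan.FromSeconds(Math.Pow(2, attempt));  // exponential backoff fallback

        await Task.Delay(delay);
    }
}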
Preventing
If you know the rate limits, you can pre-emptively limit your outbound call rate so you get into this situation less often.
For call-per-second limits, you can use a counter variable that you reset (in a thread-safe way) every second. If the known call limit would be exceeded, calculate when the counter will reset (store a timestamp when it does) and delay processing that long.
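A rough sketch of that counter approach (the limit value is illustrative; the reset and check happen under a lock for thread safety, as described above):
using System;
using System.Threading.Tasks;

// Hypothetical self-limiter: at most `limit` calls per one-second window
class PerSecondLimiter
{
    private readonly int limit;
    private int count;
    private DateTime windowStart = DateTime.UtcNow;
    private readonly object gate = new object();

    public PerSecondLimiter(int limit) { this.limit = limit; }

    public async Task WaitAsync()
    {
        while (true)
        {
            TimeSpan wait;
            lock (gate)
            {
                DateTime now = DateTime.UtcNow;
                if (now - windowStart >= TimeSpan.FromSeconds(1))
                {
                    windowStart = now; // reset the counter each second
                    count = 0;
                }
                if (count < limit) { count++; return; }
                wait = windowStart + TimeSpan.FromSeconds(1) - now; // delay until the reset
            }
            await Task.Delay(wait);
        }
    }
}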
For a concurrent-call limit, a SemaphoreSlim works nicely. Set the maximum count to whatever your concurrent rate limit is. Acquire the semaphore before making a request and release it (in a finally block) after your call completes.
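And a sketch of the SemaphoreSlim approach, assuming a known concurrent-call limit of 4 (the limit and the call placeholder are illustrative):
using System.Threading;

// Hypothetical concurrency cap matching a known concurrent-call limit of 4
static readonly SemaphoreSlim throttle = new SemaphoreSlim(4, 4);

static void CallService()
{
    throttle.Wait();        // acquire before making the request
    try
    {
        // ... make the HTTP call here ...
    }
    finally
    {
        throttle.Release(); // always release, even on failure
    }
}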
If you have multiple servers subject to the same rate limit (e.g. if rate limiting is based on an API key rather than IP address), it gets harder to self-limit, but you can set self-limiting parameters (calls per second and concurrent calls) in a configuration file, and tune them over time to maximize your throughput without hitting excessive HTTP 429's.
I have a C# application in which the client uses WCF to talk to the server. In the background, every X seconds, the client calls a Ping method on the server (through WCF). The following error has been reproduced a couple of times (for different method calls):
System.ServiceModel.ProtocolException: A reply message was received for operation 'MyMethodToServer' with action 'http://tempuri.org/IMyInterface/PingServerResponse'. However, your client code requires action 'http://tempuri.org/IMyInterface/MyMethodToServerResponse'.
'MyMethodToServer' is not consistent; the error occurs for different methods.
How can this happen that a request receives a different response?
I think you have quite a messy problem with asynchronous communication. My main suggestion (as your question isn't very clear) is to identify every request, capture the calls and wait for them, make the communication properly asynchronous, and sort out how the work is split across threads.
As you present it, this is a typical architecture problem.
If you post more code, I can suggest specific fixes in my answer, and I'll be glad to update it.
If this occurs randomly rather than consistently, you might be running in a load-balanced setup and have deployed an update to only one of the servers.
Wild guess: your client uses same connection to do two requests in parallel. So what happens is:
Thread 1 sends request ARequest
Thread 2 sends request BRequest
Server sends reply BReply
Thread 1 receives reply BReply while expecting AReply
If you have request logs on the server, it'll be easy to confirm: you'll likely see two requests arriving in quick succession from the client host experiencing the issue.
I think MaxConcurrentCalls and ConcurrencyMode may be relevant here (although I have not touched WCF in a long while).
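For reference, a sketch of where those two knobs live (the contract is hypothetical and the values are examples, not recommendations):
using System.ServiceModel;
using System.ServiceModel.Description;

[ServiceContract]
public interface IMyInterface
{
    [OperationContract]
    void Ping();
}

// ConcurrencyMode is set per service implementation
[ServiceBehavior(ConcurrencyMode = ConcurrencyMode.Multiple)]
public class MyService : IMyInterface
{
    public void Ping() { }
}

// MaxConcurrentCalls is a host-side throttle, applied via a behavior:
// host.Description.Behaviors.Add(new ServiceThrottlingBehavior { MaxConcurrentCalls = 16 });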
I'm using C# to connect to a webservice to grab data. However, I'm currently having problems getting the code to run on a remote server; when I say problems, I mean it's running, but the connection speed between client and server is ridiculously slow (through no fault of mine: the client is providing a slow result set via a webservice, and they have all timeouts turned off on their side in order to do so).
if (endpointConfiguration == EndpointConfiguration.SFFService)
{
    System.ServiceModel.BasicHttpBinding result = new System.ServiceModel.BasicHttpBinding();
    result.MaxBufferSize = int.MaxValue;
    result.ReaderQuotas = System.Xml.XmlDictionaryReaderQuotas.Max;
    result.MaxReceivedMessageSize = int.MaxValue;
    result.AllowCookies = true;
    // all timeouts effectively disabled
    result.OpenTimeout = TimeSpan.MaxValue;
    result.CloseTimeout = TimeSpan.MaxValue;
    result.SendTimeout = TimeSpan.MaxValue;
    return result;
}
So, not a great start: Open, Close and Send timeouts all set to the maximum.
Anyway, I've matched their long timeouts on my side, and a few of the smaller webservice requests finish and succeed OK on the server. The biggest, slowest one, however, just hangs indefinitely, probably because I've told it to never time out.
However, I'm pretty sure there's some other problem happening, as I left it overnight and it just sat there. Locally, on my development machine, although slow, it works.
My question is: does anyone have any ideas about additional things to check in the environment that could potentially be in play here? I thought perhaps a firewall, but given that the small requests succeed (and connect), it is very difficult to debug the slow requests, as I've no idea how long to wait before accepting that the program isn't going to do anything.
FWIW I've tried connecting via a browser, and again, the browser just sits there waiting for the request to finish which it never does (most likely due to the timeout being turned off on the server). If there was any way to see even how much of the request was left to finish (like a percentage download) that may help give me some guidance as to if the code is doing anything other than waiting.
There's no way to get a progress of the remote call even when you are attached to the remote process. Try using a local Visual Studio on the server machine (preferably on a non-production VM) and attach to the local process rather than using the Remote Debugger.
I am not sure exactly what the question is but the first step I'd take while debugging a slow application would be to test a local connection (local client and local server) to eliminate the network from the equation. If that works well, try hosting the server on a different place (public cloud maybe?) and try again there, if it works well then there's definitely something en-route or on that server.
If you're interested in tracking how long web service calls take, you could record the start time in HttpContext.Current.Items or OperationContext.Current.Items on BeginRequest/EndRequest in Global.asax, or in a MessageInspector if you use WCF (you can pass the timestamp between the two methods by returning it from the Before method and reading it from the correlationState parameter in the After method).
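A minimal sketch of the WCF variant using an IClientMessageInspector (type names are illustrative); the stopwatch returned from the Before method comes back as correlationState in the After method:
using System.Diagnostics;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;

// Hypothetical inspector that measures the wall-clock time of each call
class TimingInspector : IClientMessageInspector
{
    public object BeforeSendRequest(ref Message request, IClientChannel channel)
    {
        return Stopwatch.StartNew(); // handed back to us as correlationState
    }

    public void AfterReceiveReply(ref Message reply, object correlationState)
    {
        Stopwatch sw = (Stopwatch)correlationState;
        Trace.WriteLine("WCF call took " + sw.ElapsedMilliseconds + " ms");
    }
}

// Endpoint behavior that installs the inspector on a client proxy
class TimingBehavior : IEndpointBehavior
{
    public void ApplyClientBehavior(ServiceEndpoint endpoint, ClientRuntime clientRuntime)
    {
        clientRuntime.MessageInspectors.Add(new TimingInspector());
    }
    public void AddBindingParameters(ServiceEndpoint endpoint, BindingParameterCollection parameters) { }
    public void ApplyDispatchBehavior(ServiceEndpoint endpoint, EndpointDispatcher dispatcher) { }
    public void Validate(ServiceEndpoint endpoint) { }
}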
I have an ASP.NET 3.5 server application written in C#. It makes outbound requests to a REST API using HttpWebRequest and HttpWebResponse.
I have setup a test application to send these requests on separate threads (to vaguely mimic concurrency against the server).
Please note this is more of a Mono/Environment question than a code question; so please keep in mind that the code below is not verbatim; just a cut/paste of the functional bits.
Here is some pseudo-code:
// threaded client piece
int numThreads = 1;
int pendingCalls = numThreads; // decremented instead of numThreads, so the loop bound is not mutated
ManualResetEvent doneEvent;

using (doneEvent = new ManualResetEvent(false))
{
    for (int i = 0; i < numThreads; i++)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(Test), random_url_to_same_host);
    }
    doneEvent.WaitOne();
}

void Test(object some_url)
{
    // set up the service point here just to show which config settings I'm using
    ServicePoint lgsp = ServicePointManager.FindServicePoint(new Uri(some_url.ToString()));

    // set these to optimal for MONO and .NET
    lgsp.Expect100Continue = false;
    lgsp.ConnectionLimit = 100;
    lgsp.UseNagleAlgorithm = true;
    lgsp.MaxIdleTime = 100000;

    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(some_url.ToString());
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    {
        // do stuff
    } // releases the response object

    // close out threading stuff
    if (Interlocked.Decrement(ref pendingCalls) == 0)
    {
        doneEvent.Set();
    }
}
If I run the application on my local development machine (Windows 7) in the Visual Studio web server, I can up the numThreads and receive the same avg response time with minimal variation whether it's 1 "user" or 100.
Publishing and deploying the application to Apache2 on a Mono 2.10.2 environment, the response times scale almost linearly (i.e., 1 thread = 300 ms, 5 threads = 1500 ms, 10 threads = 3000 ms). This happens regardless of the server endpoint (different hostname, different network, etc.).
Using IPTRAF (and other network tools), it appears as though the application only opens 1 or 2 ports to route all connections through and the remaining responses have to wait.
We have built a similar PHP application and deployed in Mono with the same requests and the responses scale appropriately.
I have run through every single configuration setting I can think of for Mono and Apache and the ONLY setting that is different between the two environments (at least in code) is that sometimes the ServicePoint SupportsPipelining=false in Mono, while it is true from my machine.
It seems as though the ConnectionLimit (default of 2) is not being changed in Mono for some reason but I am setting it to a higher value both in code and the web.config for the specified host(s).
Either my team and I are overlooking something significant, or this is some sort of bug in Mono.
I believe that you're hitting a bottleneck in the HttpWebRequest. The web requests each use a common service point infrastructure within the .NET framework. This appears to be intended to allow requests to the same host to be reused, but in my experience results in two bottlenecks.
First, the service points allow only two concurrent connections to a given host by default, in order to be compliant with the HTTP specification. This can be overridden by setting the static property ServicePointManager.DefaultConnectionLimit to a higher value. See this MSDN page for more details. It looks as if you're already addressing this for the individual service point itself, but due to the concurrency locking scheme at the service point level, doing so may be contributing to the bottleneck.
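For reference, a one-line sketch of that override (the value is illustrative); it needs to run before the first request so that newly created service points pick it up:
// illustrative: raise the per-host connection cap before any request is made
ServicePointManager.DefaultConnectionLimit = 100;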
Second, there appears to be an issue with lock granularity in the ServicePoint class itself. If you decompile and look at the source for the lock keyword, you'll find that it uses the instance itself to synchronize and does so in many places. With the service point instance being shared among web requests for a given host, in my experience this tends to bottleneck as more HttpWebRequests are opened and causes it to scale poorly. This second point is mostly personal observation and poking around the source, so take it with a grain of salt; I wouldn't consider it an authoritative source.
Unfortunately, I did not find a reasonable substitute at the time that I was working with it. Now that the ASP.NET Web API has been released, you may wish to give the HttpClient a look. Hope that helps.
I know this is pretty old but I'm putting this here in case it might help somebody else who runs into this issue. We ran into the same problem with parallel outbound HTTPS requests. There are a few issues at play.
The first issue is that ServicePointManager.DefaultConnectionLimit did not change the connection limit as far as I can tell. Setting this to 50, creating a new connection, and then checking the connection limit on the service point for the new connection says 2. Setting it on that service point to 50 once appears to work and persist for all connections that will end up going through that service point.
The second issue we ran into was with threading. The current implementation of the mono thread pool appears to create at most 2 new threads per second. This is an eternity if you are doing many parallel requests that start at exactly the same time. To counteract this, we tried setting ThreadPool.SetMinThreads to a higher number. It appears that Mono only creates up to 1 new thread when you make this call, regardless of the delta between the current number of threads and the desired number. We were able to work around this by calling SetMinThreads in a loop until the thread pool had the desired number of idle threads.
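A sketch of that loop (the target number is illustrative, and the one-step increment reflects the observation above that Mono only seemed to add one thread per call):
using System.Threading;

// Hypothetical workaround: keep raising the minimum one step at a time
int desiredMinThreads = 50; // illustrative target
int worker, io;
ThreadPool.GetMinThreads(out worker, out io);
while (worker < desiredMinThreads)
{
    ThreadPool.SetMinThreads(worker + 1, io);
    ThreadPool.GetMinThreads(out worker, out io);
}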
I opened a bug about the latter issue because that's the one I'm most confident is not working as intended: https://bugzilla.xamarin.com/show_bug.cgi?id=7055
If #jake-moshenko is right about ServicePointManager.DefaultConnectionLimit not having any effect if changed in Mono, please file this as a bug in http://bugzilla.xamarin.com/.
However I would try some things before discarding this completely as a Mono issue:
Try using the SGen garbage collector instead of the old boehm one, by passing --gc=sgen as a flag to mono.
If the above doesn't help, upgrade to Mono 3.2 (which, BTW, defaults to the SGen GC too), because there have been a lot of fixes since you asked the question.
If the above doesn't help, build your own Mono (master branch), as this important pull request about threading has been merged recently.
If the above doesn't help, build your own Mono with this pull request added. If it fixes your problem, please add a "+1" to the pull request. It might be a fix for bug 7055.
I have a WCF client (running on Win7) pointing to a WebSphere service.
All is good from a test harness (a little test fixture outside my web app), but when my calls to the service originate from my web project, one of the calls (and only that one) is extremely slow to deserialize (it takes minutes vs. seconds), and not just the first time.
I can see from Fiddler that the response comes back quickly, but then the WCF client hangs on the response itself for more than a minute before the next line of code is hit by the debugger, almost as if the client were having trouble deserializing. This happens only if the response contains a particular PDF string (the operation generates a PDF), base64 encoded and chunked. If, for example, the service raises a fault (so the PDF string is absent), the response is deserialized immediately.
Again, If I send the exact same envelope through Soap-UI or from outside the web project all is good.
I am at a loss. What should I be looking for, and is there some config setting that might do the trick?
Any help appreciated!
EDIT:
I coded a stub against the same service contract. Using the exact same basicHttpBinding and returning the exact same pdf string there is no delay registered. I think this rules out the string and the binding as a possible cause. What's left?
Changing transferMode="Buffered" into transferMode="Streamed" on the binding did the trick!
So the payload was apparently being chunked in small bits the size of the buffer.
I thought the same could have been achieved by increasing the buffer size (maxBufferSize="1000000"), but I had that in place already and it did not help.
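For reference, a minimal sketch of the same change made in code rather than config (other binding values omitted; the message-size cap is illustrative):
using System.ServiceModel;

// streamed transfer avoids re-buffering the reply in buffer-sized chunks
BasicHttpBinding binding = new BasicHttpBinding();
binding.TransferMode = TransferMode.Streamed;
binding.MaxReceivedMessageSize = int.MaxValue; // illustrative; size the cap to your payloads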
I have had this bite me many times. Check in your WCF client configuration that you are not trying to use the windows web proxy, that step to check on the proxy (even if there is not one configured) will eat up a lot of time during your connection.
If the tips of the other users don't help, you might want to Enable WCF Tracing and open it in the Service Trace Viewer. The information is detailed, but it has enabled me to fix a number of hard-to-identity problems in the past.
More information on WCF Tracing can be found on MSDN.
Two things you can try:
Adjust the readerQuotas settings for your client. See http://msdn.microsoft.com/en-us/library/ms731325.aspx
Disable "Just My Code" in the debugging options (Tools -> Options -> Debugging -> General -> "Enable Just My Code (Managed only)") and see if you can catch internal WCF exceptions.
//huusom
I had the very same issue... The problem with WCF, IMO, lies in the deserialization of the base64 string returned by the service into a byte[] on the client side.
The easiest way to solve this, if you cannot change your service configuration (e.g. to use transferMode="Streamed"), is to adapt your DataContract/ServiceContract on the client side: replace the type byte[] with string in the response DataContract.
Next simply decode the returned string yourself with a piece of code such as:
byte[] file = Convert.FromBase64String(pdfBase64String);
To download a PDF of 70 KB used to require ~6 sec. With the change suggested above, it now takes < 1 sec.
V.
PS: Regarding the transfer mode, I did try changing only the client side (transferMode="StreamedResponse"), but saw no improvement...
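A minimal sketch of the client-side contract change described above (the type and member names are hypothetical):
using System;
using System.Runtime.Serialization;

[DataContract]
public class GetPdfResponse
{
    // Was: byte[] Pdf. Receiving the raw base64 text as a string skips
    // WCF's slow base64-to-byte[] deserialization; decode it yourself afterwards.
    [DataMember]
    public string Pdf { get; set; }
}

// after the call:
// byte[] file = Convert.FromBase64String(response.Pdf);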
First things to check:
Is the config the same in the web project and the test project?
When you test from SoapUI, are you doing it from the same server and in the same security context as when the code runs from the web project?
Is there any spike in memory when the PDF comes back?
Edit
From your comments, the one-minute wait appears to be a timeout. You also mention transactions.
I am wondering if the problem is somewhere else. The call to the WCF service goes OK, but if the call is inside a transaction and there is no Complete or Dispose on the transaction (I am guessing here), then the transaction/code will hang for a minute while it waits to time out.
Edit 2
Next things to check:
Is there any difference between the test and the web project in how the service is being called?
Is there any difference in the framework version, for example is one 3.0 and the other 3.5?
Could it be that the client side is trying to analyse what type of content is coming from the server side? Try specifying the MIME type of the service response explicitly on the server side, e.g. Response.ContentType = "application/pdf". EDIT: By client side I mean any possible mediator, like a firewall or a security suite.