I read this post: C# (429) Too Many Requests
and I understood the response code, but why is this status code returned only when the call is made server-side (backend) in production mode (hosted)? The service never returns this code when I call the same service from Chrome's address bar, or when I make the call server-side (backend) from my localhost.
CASE 1 (works fine from localhost - the service URL is not localhost, it is hosted)
App A (localhost) calls App B (hosted) --> works fine
for (int i = 0; i < 1000; i++)
{
    HttpClient client = new HttpClient();
    client.BaseAddress = new Uri(url);
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
    String response = client.GetStringAsync(urlParameters).Result;
    client.Dispose();
}
CASE 2 (works fine)
Chrome browser calls App B (hosted) --> works fine
CASE 3 (similar to case 1 but with far fewer requests - DOES NOT WORK)
App A (hosted) calls App B (hosted) --> 429
Why? What is the problem? How can I solve it?
What's Happening
The HTTP 429 response code indicates you have been rate limited. The idea is to prevent one caller from overwhelming a service and making it less available to other callers.
Most Common
That limiting can be based on many things. The most common are:
Number of calls per unit time (usually per second)
Number of concurrent calls
The General Case
A rate limiter may also forgive a short burst of calls that happens occasionally, may allow more calls before hitting the brakes based on who you are (using your IP or an API key for example), dynamically adjust its limits based on total system load, or do other things.
Probably Happening Here
Based on your description, I would guess the number of concurrent calls could be causing production rate limiting. Rather than hitting the external API hard trying to guess what the rules are, try reaching out to them to ask. If that is not an option, running multiple requests in parallel could validate this theory.
Handling
A great way to deal with this is to back off your requests when you receive an HTTP 429.
The service should return a Retry-After header indicating how many seconds you should wait before trying again. If it does, wait that long before resubmitting your request.
If the service does not provide that header (I work with a major one that does not), use exponential backoff instead.
Depending on your needs, you may want to tell your own caller to try again later (return an HTTP 429 yourself) or you may want to queue up pending requests and work off the queue to submit them all.
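A minimal sketch of that back-off logic, under some assumptions: the names RetryHelper, GetDelay and GetWithRetryAsync are made up for illustration, and the 1-second base, 60-second cap and 5-attempt limit are arbitrary choices rather than anything the service prescribes. It honors Retry-After when the header is present and falls back to exponential backoff when it is not.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class RetryHelper
{
    // How long to wait before retrying, for a 0-based attempt number.
    public static TimeSpan GetDelay(HttpResponseMessage response, int attempt)
    {
        var retryAfter = response.Headers.RetryAfter;

        // Retry-After given as a number of seconds.
        if (retryAfter?.Delta != null)
            return retryAfter.Delta.Value;

        // Retry-After given as an absolute date.
        if (retryAfter?.Date != null)
        {
            var wait = retryAfter.Date.Value - DateTimeOffset.UtcNow;
            return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
        }

        // No header: exponential backoff (1s, 2s, 4s, ...), capped at 60s (assumed cap).
        return TimeSpan.FromSeconds(Math.Min(60, Math.Pow(2, attempt)));
    }

    // Retries only on HTTP 429; any other response is returned to the caller as-is.
    public static async Task<HttpResponseMessage> GetWithRetryAsync(
        HttpClient client, string url, int maxAttempts = 5)
    {
        for (int attempt = 0; ; attempt++)
        {
            var response = await client.GetAsync(url);
            if ((int)response.StatusCode != 429 || attempt >= maxAttempts - 1)
                return response;

            var delay = GetDelay(response, attempt);
            response.Dispose();
            await Task.Delay(delay);
        }
    }
}
```

Adding some random jitter to the fallback delay is a common refinement when several callers back off at the same time.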
Preventing
If you know the rate limits, you can pre-emptively limit your outbound call rate so you get into this situation less often.
For call-per-second limits, you can use a counter variable that you reset (in a thread-safe way) every second. If the known call limit would be exceeded, calculate when the counter will reset (store a timestamp when it does) and delay processing that long.
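One way the counter approach could look, as a rough sketch: PerSecondLimiter and its members are hypothetical names, and it uses a simple fixed one-second window guarded by a lock. Reserve either grants the call or reports how long remains until the counter resets.

```csharp
using System;
using System.Threading.Tasks;

public sealed class PerSecondLimiter
{
    private readonly int _maxCallsPerSecond;
    private readonly object _lock = new object();
    private DateTime _windowStart = DateTime.UtcNow; // timestamp of the last counter reset
    private int _count;

    public PerSecondLimiter(int maxCallsPerSecond)
    {
        _maxCallsPerSecond = maxCallsPerSecond;
    }

    // Returns TimeSpan.Zero if the call may proceed now,
    // otherwise the time remaining until the counter resets.
    public TimeSpan Reserve(DateTime now)
    {
        lock (_lock)
        {
            if (now - _windowStart >= TimeSpan.FromSeconds(1))
            {
                _windowStart = now; // start a new one-second window
                _count = 0;
            }

            if (_count < _maxCallsPerSecond)
            {
                _count++;
                return TimeSpan.Zero;
            }

            return _windowStart + TimeSpan.FromSeconds(1) - now;
        }
    }

    // Delays the caller until a slot in the current or next window is free.
    public async Task WaitAsync()
    {
        while (true)
        {
            var delay = Reserve(DateTime.UtcNow);
            if (delay <= TimeSpan.Zero) return;
            await Task.Delay(delay);
        }
    }
}
```

A caller would await WaitAsync() immediately before each outbound request.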
For a concurrent-call limit, a SemaphoreSlim works nicely. Set the maximum count to whatever your concurrent rate limit is. Acquire the semaphore before making a request and release it (in a finally block) after your call completes.
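The SemaphoreSlim pattern might be wrapped like this. ConcurrencyGate and RunAsync are illustrative names, not part of any library, and the limit passed to the constructor is whatever you believe the service allows.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class ConcurrencyGate
{
    private readonly SemaphoreSlim _gate;

    public ConcurrencyGate(int maxConcurrent)
    {
        // Initial count and maximum count are both the concurrency limit.
        _gate = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    // Number of calls that could start right now without waiting.
    public int AvailableSlots => _gate.CurrentCount;

    public async Task<T> RunAsync<T>(Func<Task<T>> call)
    {
        await _gate.WaitAsync(); // acquire a slot without blocking a thread
        try
        {
            return await call();
        }
        finally
        {
            _gate.Release(); // always give the slot back, even if the call throws
        }
    }
}
```

Usage would be along the lines of `await gate.RunAsync(() => client.GetStringAsync(url))`, with one gate instance shared by all callers of the same service.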
If you have multiple servers subject to the same rate limit (e.g. if rate limiting is based on an API key rather than IP address), it gets harder to self-limit, but you can set self-limiting parameters (calls per second and concurrent calls) in a configuration file, and tune them over time to maximize your throughput without hitting excessive HTTP 429s.
I have put Flurl under high load, using the DownloadFileAsync method to download files over a private network from one server to another. After several hours the method starts throwing "Get TimeOut" exceptions. The only way to fix it is to restart the application.
downloadUrl.DownloadFileAsync(Helper.CreateTempFolder()).Result;
I have added a second method as a failover, using HttpClient, and it downloads the files fine after Flurl fails, so it is not a server problem.
private void DownloadFile(string fileUri, string locationToStoreTo)
{
    using (var client = new HttpClient())
    using (var response = client.GetAsync(new Uri(fileUri)).Result)
    {
        response.EnsureSuccessStatusCode();
        var stream = response.Content.ReadAsStreamAsync().Result;
        using (var fileStream = File.Create(locationToStoreTo))
        {
            stream.CopyTo(fileStream);
        }
    }
}
Do you have any idea why the Get TimeOut error starts to appear under high load when using this method?
public static Task<string> DownloadFileAsync(this string url, string localFolderPath, string localFileName = null, int bufferSize = 4096, CancellationToken cancellationToken = default(CancellationToken));
The two download approaches differ only in that Flurl reuses one HttpClient instance for all requests, while my code creates and destroys a new HttpClient object for every request. I know that creating and destroying HttpClient instances is time- and resource-consuming, so I would rather use Flurl if it worked.
As others point out, you're trying to use Flurl synchronously by calling .Result. This is not supported, and under highly concurrent workloads you're most likely witnessing deadlocks.
The HttpClient solution is using a new instance for every call, and since instances aren't shared at all it's probably less prone to deadlocks. But it's inviting a whole new problem: port exhaustion.
In short, if you want to continue using Flurl then go ahead and do so, especially since you're getting smart HttpClient reuse "for free". Just use it asynchronously (with async/await) as intended. See the docs for more information and examples.
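For illustration, here is what a fully asynchronous download with a shared client can look like. This sketch deliberately uses plain HttpClient rather than Flurl so that it depends only on the base class library; Downloader, CopyToFileAsync and the 4096-byte buffer are assumptions made for the example. With Flurl, the equivalent change is simply to await DownloadFileAsync instead of reading .Result.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class Downloader
{
    // One shared instance for the whole process, so connections are reused.
    private static readonly HttpClient Client = new HttpClient();

    // Copies a stream to a file without blocking a thread.
    public static async Task CopyToFileAsync(Stream source, string path)
    {
        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write,
                                         FileShare.None, 4096, useAsync: true))
        {
            await source.CopyToAsync(file);
        }
    }

    // Downloads fileUri to path; no .Result or .Wait() anywhere in the chain.
    public static async Task DownloadFileAsync(string fileUri, string path)
    {
        using (var response = await Client.GetAsync(
                   fileUri, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (var stream = await response.Content.ReadAsStreamAsync())
            {
                await CopyToFileAsync(stream, path);
            }
        }
    }
}
```

The callers need to be async as well ("async all the way"), otherwise the blocking just moves one level up.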
I can think of two or three possibilities (I'm sure there are others that I can't think of as well)
Server IP address has changed.
You wrote that Flurl reuses an HttpClient. I've never used, or even heard of, Flurl, so I have no idea how it works. But an HttpClient reuses a pool of connections, which is why it's efficient to reuse a single instance and why it's critical to do so in a high-volume microservice application. Otherwise you're likely to exhaust all ports, but that gives a different error message, not a timeout, so I know you haven't hit that case. However, while it's important to reuse an HttpClient in the short term, a long-lived HttpClient keeps its pooled connections open and so never looks up the DNS entry again, which means it's important to dispose and create new HttpClients periodically. In short-lived processes, you can use a static or singleton instance. But in long-running processes, you should create a new instance periodically. If you only use it to access one server, that server's DNS TTL is a good value to use.
So, what might be happening is that the server changed IP addresses a few hours after your program started, and because Flurl keeps reusing the same HttpClient, it doesn't get the new IP address from the DNS entry. One way to check whether this is the problem is to write the server's IP address to a log at the beginning of the process and, when you encounter the problem, check whether the IP address is still the same.
If this is the problem, you can look into ASP.NET Core 2.1's HttpClientFactory. It's a bit awkward to use outside of ASP.NET, but I did once. It gives you reuse of HttpClients, to avoid the TCP port exhaustion problem of using more than 32k HttpClients in 120 seconds, but also avoids DNS caching issues. It does this by pooling the underlying message handlers and recycling them periodically; the default handler lifetime is 2 minutes.
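A small sketch of that check, assuming a hypothetical helper named DnsCheck: it resolves the host and returns the addresses as one sorted string that can be written to a log at startup and again when the timeouts begin.

```csharp
using System;
using System.Linq;
using System.Net;

public static class DnsCheck
{
    // Resolves the host and returns its addresses as a stable, comma-separated string.
    public static string DescribeAddresses(string host)
    {
        IPAddress[] addresses = Dns.GetHostAddresses(host);
        return string.Join(", ", addresses.Select(a => a.ToString()).OrderBy(s => s));
    }
}
```

If the string logged at startup differs from the one logged when the errors appear, a stale connection to the old address becomes a much more likely explanation.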
Reaching the maximum connections per server
ServicePointManager.DefaultConnectionLimit sets the maximum number of HTTP connections that a client will open to a server. If your code tries to use more than this simultaneously, the requests that exceed the limit wait until an existing request finishes and then use the newly available connection. However, in the past when I was looking into this, the HTTP timeout started from when the HttpClient's method was called, not from when the HttpClient sent the request to the server over a connection. This means that if your limit is 2 and both connections are used for longer than the timeout period (for example if downloading 2 large files), other requests to download from the same server will time out, even though no HTTP request was ever sent to the server.
So, depending on your application and server, you may be able to use a higher connection limit, otherwise you need to implement request queuing in your app.
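Raising the limit is a one-line setting. The sketch below wraps it in a hypothetical helper (ConnectionLimits.Raise) that also returns the previous value; the number you pass is something to tune, not a recommendation. Note that this setting applies to the .NET Framework HTTP stack. On .NET Core and later, HttpClient ignores ServicePointManager and the corresponding knob is HttpClientHandler.MaxConnectionsPerServer.

```csharp
using System.Net;

public static class ConnectionLimits
{
    // Sets the process-wide default connection limit and returns the old value.
    // It should be called at startup, before the first request is made.
    public static int Raise(int newLimit)
    {
        int previous = ServicePointManager.DefaultConnectionLimit;
        ServicePointManager.DefaultConnectionLimit = newLimit;
        return previous;
    }
}
```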
Thread pool exhaustion
Async code is awesome for performance when used correctly in highly concurrent, IO-bound workloads. I sometimes think it's a bad idea to use it anywhere else because it has such huge potential for causing weird problems when used incorrectly. Like Crowcoder wrote in a comment on the question, you shouldn't use .Result, or any code that blocks a running thread, when in an async context. Although the code sample you provided says private void DownloadFile(..., if it's actually async Task DownloadFile(..., or if DownloadFile is called from an async method, then there's a real risk of issues. If DownloadFile is not called from an async method, but is called on the thread pool, there's the same risk of errors.
Understanding async is a huge topic, unfortunately with a lot of misinformation on the internet as well, so I can't possibly cover it in detail here. A key thing to note is that async tasks run on the thread pool. So, if you call ThreadPool.QueueUserWorkItem and block the thread that your code runs on, or if you have async tasks that you block on (for example by calling .Result), what could happen is that you block every thread in the thread pool, and when an HTTP response comes back from the network, the .NET run time has no threads available to complete the task. The problem with this idea is that there are also no threads available to signal the timeout, so I don't believe you're exhausting the thread pool (if you were, I would expect a deadlock), but I don't know how timeouts are implemented. If timeouts/timers use a dedicated thread it could be possible for a cancellation token (the thing that signals a timeout) to be set by the timer's thread, and then any code on a blocking wait for either the HTTP response or the cancellation token could be triggered. But thread pool exhaustion generally causes deadlocks, so if you're getting an error back, it's probably not this.
To check whether you're having thread pool exhaustion issues, get a memory dump of your app (for example using Task Manager) when your program starts getting the timeout errors. If you have the Enterprise or Ultimate SKU of Visual Studio, you can open/debug the memory dump in VS. Otherwise you'll need to learn how to use windbg (or find another tool). When debugging the memory dump, check the number of threads. If there's a very large number of threads, that's a hint you might be on the right track. Check where each thread was at the time of the memory dump. If they're all in blocking calls like WaitForObject, or something similar, then there's a real risk you've exhausted the thread pool. I've never debugged an async task deadlock/thread pool exhaustion issue before, so I'm not sure if there's a way to get a list of tasks and see from their run state whether they're likely to be deadlocked. If you ever see more tasks in the running state than you have cores on your CPU, however, you almost certainly have blocking in an async task.
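A lighter-weight complement to the memory dump (a different technique from the dump analysis above, offered only as a rough indicator) is to log how many thread pool worker threads are busy. ThreadPoolProbe is a made-up name for this sketch.

```csharp
using System.Threading;

public static class ThreadPoolProbe
{
    // Worker threads currently in use = configured maximum minus those still available.
    public static int BusyWorkerThreads()
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);
        ThreadPool.GetAvailableThreads(out int freeWorkers, out _);
        return maxWorkers - freeWorkers;
    }
}
```

Logging this value periodically and seeing it climb steadily until the timeouts begin would support the exhaustion theory. A flat, low number would argue against it.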
In summary, you haven't given us enough details to give you an answer that will work with 100% certainty. You need to keep investigating to understand the problem until you can either solve it yourself, or provide us with more information. I've given you some of the most likely causes, but it could very easily be something else completely.
I'm running the latest version of the WindowsAzure.Storage library, 6.1.1. This was previously a known issue but is supposed to have been fixed back in .NET 4.5.1. It's exactly the issue I'm having.
I'm hitting a table in Azure Storage with 100m+ rows to insert. I've focused on making the code fast and scalable, it maxes out an Azure D12 VM running Datacenter 2012 R2. I'm seeing 5,000 - 10,000 entities processed per second (read file, process, upload).
Update: This ONLY happens on Azure VMs. On my home system it doesn't occur.
The process always crashes out at ~16,384 batches (around 320,000 records) with a classic port exhaustion error: Only one usage of each socket address (protocol/network address/port) is normally permitted.
I've done the usual things: increased MaxUserPort (64434) and decreased TcpTimedWaitDelay (15 seconds). MaxUserPort seems to be ignored, given the suspiciously logical 16,384 it fails at.
Netstat shows that the ports are never being closed in the first place. The state on all of them remains 'Established' until the process itself is closed, then they disappear.
The actual connection code comes down to:
var acx = CloudStorageAccount.Parse(conn);
var client = acx.CreateCloudTableClient();
var table = client.GetTableReference("Test");
var op = new TableBatchOperation();
foreach (var record in batch) //Batch is just a bunch of entity objects
    op.InsertOrReplace(record);
try
{
    await table.ExecuteBatchAsync(op, opsConfig, null);
    Interlocked.Add(ref totalUploaded, batch.Count);
}
catch...
I've tried every variation I can think of - reusing a pool of TableBatchOperations, having a single table/client/account to creating an object for every hit, and every combination in between.
The problem seems to be lower level than I can get to. When the issue was supposedly fixed two years ago the connections were staying open because the response stream wasn't being read properly.
Grateful for any suggestions! Please just ask if you need more information or clarification.
We're using ServiceStack 3.9.71.0 and we're currently experiencing unexplained latency issues with clients over a WAN connection.
A reply with a very small payload (<100 bytes) is received after 200ms+.
The round-trip-time (RTT) on the link is about 40ms due to the geographical distance. This has been verified by pinging the other host and using a simple echo service to test the latency of a TCP connection.
Both ping and echo test show latencies which are in line with expectations. Getting a reply from our ServiceStack host takes much longer than expected.
We've verified that:
WAN link is only running at 25% of capacity (no congestion)
No QOS is employed on the WAN link
same host gives fast reply to same request from a different host on local network
delay is not caused by our code processing the request
We've now stumbled across Nagle's algorithm and that it can mean delays for small requests on WAN networks (http://blogs.msdn.com/b/windowsazurestorage/archive/2010/06/25/nagle-s-algorithm-is-not-friendly-towards-small-requests.aspx).
In .NET it can be disabled by setting TcpClient.NoDelay = true (https://msdn.microsoft.com/en-us/en-US/library/system.net.sockets.tcpclient.nodelay(v=vs.110).aspx).
How can this be disabled for ServiceStack's TCP handling?
EDIT: I don't think that this is a duplicate of HttpWebRequest is slow with chunked data. The mentioned question covers HttpWebRequest which isn't used by ServiceStack. ServiceStack uses HttpListener which also happens to be controlled / managed by the mentioned ServicePointManager. We're going to conduct a test to see whether setting ServicePointManager.UseNagleAlgorithm = false solves the issue.
I think you answered this yourself in your edit: setting UseNagleAlgorithm = false should solve the issue. But be careful, because ServicePointManager.UseNagleAlgorithm = false; is a global setting, which means it turns off the algorithm for all of your endpoints and all of your requests in the entire App Domain. When you call more than one service endpoint (which is usually the case) with requests of mixed sizes, it will bite back. So you should consider setting it only for one specific ServicePoint, which you can acquire with:
ServicePoint sp = ServicePointManager.FindServicePoint(<uri>);
sp.UseNagleAlgorithm = false;
rather than setting it globally.
Here is an article about it: https://technet2.github.io/Wiki/blogs/windowsazurestorage/nagles-algorithm-is-not-friendly-towards-small-requests.html
I'm using a 3rd party library that makes a number of HTTP calls. By decompiling the code, I've determined that it is creating and using raw HttpWebRequests, all going to a single URL. The issue is that some of the requests don't get closed properly. After some time, all new HttpWebRequests block forever when the library calls GetRequestStream()* on them. I've determined this blocking is due to the ConnectionLimit on the ServicePoint for that particular host, which has the default value of 2. In other words, the library has opened 2 requests, and then tries to open a 3rd, which blocks.
I want to protect against this blocking. The library is fairly resilient and will reconnect itself, so it's okay if I kill the existing connections it has made. The problem is that I don't have access to any of the HttpWebRequest or HttpWebResponses this library makes. However I do know the URL it accesses and therefore I can access the ServicePoint for it.
var sp = ServicePointManager.FindServicePoint(new Uri("http://UrlThatIKnowAbout.com"));
(Note: KeepAlive is enabled on these HttpWebRequests)
This worked, though I'm not sure it's the best way to solve the problem.
Get the service point object for the url
var sp = ServicePointManager.FindServicePoint(new Uri("http://UrlThatIKnowAbout.com"));
Increase the ConnectionLimit to int.MaxValue
Create a background thread that periodically checks the connection count on the service point (the CurrentConnections property). If it goes above 5, call CloseConnectionGroup()
Set MaxIdleTime to 1 hour (instead of default)
Setting the ConnectionLimit should prevent the blocking. The monitor thread will ensure that too many connections are never active at the same time. Setting MaxIdleTime should serve as a fall back.
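The steps above can be sketched roughly as follows. Several things here are assumptions rather than part of the original solution: the names ServicePointGuard, Configure and StartMonitor are invented, a System.Threading.Timer stands in for the dedicated background thread, and the library is assumed to use the default (unnamed) connection group, which is why an empty string is passed to CloseConnectionGroup. This targets the .NET Framework HTTP stack. On newer runtimes ServicePoint is obsolete and has no effect on HttpClient.

```csharp
using System;
using System.Net;
using System.Threading;

public static class ServicePointGuard
{
    // Lifts the connection limit and sets the idle timeout for one host.
    public static ServicePoint Configure(Uri uri)
    {
        ServicePoint sp = ServicePointManager.FindServicePoint(uri);
        sp.ConnectionLimit = int.MaxValue;                              // never block on the limit
        sp.MaxIdleTime = (int)TimeSpan.FromHours(1).TotalMilliseconds;  // value is in milliseconds
        return sp;
    }

    // Periodically closes the connections if too many have piled up.
    // Keep a reference to the returned Timer so it is not garbage collected.
    public static Timer StartMonitor(ServicePoint sp, int maxConnections, TimeSpan interval)
    {
        return new Timer(_ =>
        {
            if (sp.CurrentConnections > maxConnections)
            {
                // Assumption: the requests use the default connection group.
                sp.CloseConnectionGroup(string.Empty);
            }
        }, null, interval, interval);
    }
}
```

With the numbers from the steps above, the monitor would be started with maxConnections set to 5 and whatever polling interval suits the application.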