Throttling connections on 429 errors in WebClient mechanism, best practice?


I know there's a better way to do this. I'm fairly sure my mechanism is wrong and would crash if the request kept failing. Is there a better practice for a retry mechanism than the way I've done it?
Since I rely on the response the web client gives, I never want to miss a response from it. I'm converting a list on a new system a website has. I know flooding it with traffic will consistently result in a 429 error (Too Many Requests), so the correct thing to do is throttle, right?
Here is my mechanism.
public static string GetUsernameFromId(long userId)
{
using (var client = new WebClient())
{
try
{
// removed business logic, minimal example
}
catch (WebException we)
{
if (we.Message.Contains("429"))
{
return ThrottleConnections(userId);
}
throw;
}
}
}
public static string ThrottleConnections(long userId)
{
System.Threading.Thread.Sleep(1 * 60 * 1000);
return GetUsernameFromId(userId);
}

Yeah, I wouldn't do this with recursion; it's asking for trouble.
This would probably be better as a while loop with a retry count, a limit, and some asynchrony. I have left a lot to the imagination here, and you probably want to throw on max retries.
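public static async Task<string> GetUsernameFromId(long userId)
{
    var retries = 0;
    while (retries++ < MaxRetries)
    {
        using (var client = new WebClient())
        {
            try
            {
                // await client.OpenReadTaskAsync(...);
                // ...read the response and return the username here
            }
            catch (WebException we)
            {
                // Only a 429 (Too Many Requests) is worth waiting for and retrying.
                if (we.Message.Contains("429"))
                {
                    await Task.Delay(waitTime);
                    continue;
                }
                // Anything else is a real failure, so let it propagate.
                throw;
            }
        }
    }
    // Every attempt was throttled (MaxRetries and waitTime are defined elsewhere).
    throw new InvalidOperationException("Still throttled after " + MaxRetries + " attempts.");
}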

Using Thread.Sleep is almost never a good idea; prefer a timer or await Task.Delay(...), which does not block the thread.
Your best bet is to use a library that provides retries in a well-thought-out manner. Polly, for example, is a well-known library. It supports time-based retrying as well; see the docs:
// Retry, waiting a specified duration between each retry
Policy
.Handle<SomeExceptionType>()
.WaitAndRetry(new[]
{
TimeSpan.FromSeconds(1),
TimeSpan.FromSeconds(2),
TimeSpan.FromSeconds(3)
});
If the response has a Retry-After header you can use that as well:
Some systems specify how long to wait before retrying as part of the fault response returned. This is typically expressed as a Retry-After header with a 429 response code.
This can be handled by using WaitAndRetry/Forever/Async(...) overloads where the sleepDurationProvider takes the handled fault/exception as an input parameter (example overload; discussion and sample code).
Some SDKs wrap RetryAfter in custom responses or exceptions: for example, the underlying Azure CosmosDB architecture sends a 429 response code (too many requests) with a x-ms-retry-after-ms header, but the Azure client SDK expresses this back to calling code by throwing a DocumentClientException with RetryAfter property. The same overloads can be used to handle these.
If you do not want to use an external library, then at least you can browse its source to get an idea of how to deal with retries.
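If you go without a library, the Retry-After idea above can be sketched with HttpClient alone. This is only a minimal sketch under assumptions: RetryAfterExample, GetRetryDelay, GetWithRetryAsync and the 5 second fallback are made-up names and values for illustration, not part of Polly or of the original code.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class RetryAfterExample
{
    // Hypothetical helper: picks the wait time from a throttled response.
    // Retry-After may carry a number of seconds (Delta) or an absolute date (Date).
    public static TimeSpan GetRetryDelay(HttpResponseMessage response, TimeSpan fallback)
    {
        var retryAfter = response.Headers.RetryAfter;
        if (retryAfter == null) return fallback;
        if (retryAfter.Delta.HasValue) return retryAfter.Delta.Value;
        if (retryAfter.Date.HasValue)
        {
            var wait = retryAfter.Date.Value - DateTimeOffset.UtcNow;
            return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
        }
        return fallback;
    }

    // Retries only on 429, waits without blocking a thread, gives up after maxRetries.
    public static async Task<string> GetWithRetryAsync(HttpClient client, string url, int maxRetries)
    {
        for (var attempt = 0; ; attempt++)
        {
            var delay = TimeSpan.Zero;
            using (var response = await client.GetAsync(url))
            {
                if ((int)response.StatusCode != 429)
                {
                    response.EnsureSuccessStatusCode();
                    return await response.Content.ReadAsStringAsync();
                }
                if (attempt >= maxRetries)
                    throw new HttpRequestException("Still rate limited after " + maxRetries + " retries.");
                // The 5 second fallback is an arbitrary assumption for servers that send no header.
                delay = GetRetryDelay(response, TimeSpan.FromSeconds(5));
            }
            await Task.Delay(delay);
        }
    }
}
```

The loop replaces the recursion from the question, so a long run of 429 responses cannot grow the call stack, and the caller gets an exception instead of an endless wait.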

Related

Polly.Contrib.WaitAndRetry to "funnel" all requests when hitting rate limit

We're using the Dropbox API wrapped in Polly to handle retries.
We have it set up as an exponential back-off, as explained here.
The issue we have is that we make plenty of concurrent calls.
When the API starts throwing rate limit exceptions, each individual caller backs off
but new callers will still call the API and "steal" the retry of callers that are waiting.
That means that on high load we are experiencing failed API calls and errors.
What we would like to achieve is that on rate limit errors all calls (including new callers) to the API are synchronized and wait for the rate limit to expire.
Then calls can resume (ideally in sequence to make sure the calls don't return rate limit exceptions anymore).
Is there a Polly-supported way of achieving that?

According to my understanding you want to have the following:
1. The downstream system can throttle incoming requests
1.1 The system is smart enough to provide a RetryAfter time span
2. You want to avoid flooding the downstream system if you already know that you are throttled
3. But you don't want to lose any incoming request; you would rather process all of them eventually
Let's put together a working example
#1 - Downstream system
Here we will implement a super simple mock which can mimic throttling.
Let's start with the exception
public class DownstreamServiceException: Exception
{
public TimeSpan RetryAfter { get; set; }
}
Now, let's see the service code
public class DownstreamService
{
private readonly CancellationTokenSource initCompletionSignal;
private readonly TimeSpan initDuration;
private bool isAvailable = false;
private DateTime initEstimatedEnd;
public DownstreamService()
{
initDuration = TimeSpan.FromSeconds(10);
initCompletionSignal = new CancellationTokenSource(initDuration);
initCompletionSignal.Token.Register(() => isAvailable = true);
initEstimatedEnd = DateTime.UtcNow.Add(initDuration);
}
public Task<string> GetAsync()
{
if (!isAvailable) throw new DownstreamServiceException { RetryAfter = initEstimatedEnd - DateTime.UtcNow };
return Task.FromResult("Available");
}
}
For the sake of simplicity I've made the service unavailable for the first 10 seconds
I've used a CancellationTokenSource as a timer to make the service available
If GetAsync is called while the service is not available (we are throttled) it throws an exception, otherwise it returns the "Available" string
#2 - Avoid flooding if downstream is not available
Here we will define a Circuit Breaker to short-circuit the requests if the downstream is not available (we are throttled)
var throttledPolicy = Policy<string>
.Handle<DownstreamServiceException>()
.CircuitBreakerAsync(1, TimeSpan.FromSeconds(0),
onBreak: (result, state, _, __) => {
if (state == CircuitState.Open) return;
Console.WriteLine("onBreak");
throw result.Exception;
},
onReset: (_) => Console.WriteLine("onReset"),
onHalfOpen: () => { });
The Circuit Breaker will transition from Closed to Open when we receive the first DownstreamServiceException
The duration of break (TimeSpan.FromSeconds(0)) does not matter here
We will control the Circuit Breaker's state from the Retry logic
if (state == CircuitState.Open): This will be explained under the retry section
And finally re-throw the original exception (I know, I know ... it should be avoided, but it keeps our example application simple)
#3 - Retry until eventually processed
This is the most complicated part of the solution, because this retry policy handles multiple exceptions (DownstreamServiceException, IsolatedCircuitException) in a different way
CancellationTokenSource throttlingEndSignal;
var retryPolicy = Policy<string>
.Handle<DownstreamServiceException>()
.Or<IsolatedCircuitException>()
.WaitAndRetryForeverAsync(_ => TimeSpan.FromSeconds(3),
onRetry: (dr, __) =>
{
Console.WriteLine($"onRetry caused by {dr.Exception.GetType().Name}");
if (dr.Exception is DownstreamServiceException dse)
{
throttledPolicy.Isolate();
throttlingEndSignal = new(dse.RetryAfter);
throttlingEndSignal.Token.Register(() => throttledPolicy.Reset());
}
});
Let's start with the DownstreamServiceException
We will receive this exception because we are going to chain together the two policies and Circuit Breaker's onBreak delegate re-throws the received exception
Inside the onRetry we have a guard expression for DownstreamServiceException
Here we call Isolate on the Circuit Breaker, which moves it from the Open state to the Isolated state and, in doing so, calls the onBreak delegate
That is why we had the if (state == CircuitState.Open) return; guard there: it avoids an infinite loop
We do the same timer trick here with the CancellationTokenSource: whenever the throttling ends we push the Circuit Breaker back to the Closed state (Reset)
The IsolatedCircuitException case is much simpler
We receive this exception whenever we try to perform a retry attempt while the Circuit Breaker is in the Isolated state
So, the CB short-circuits the execution and, because of the WaitAndRetryForever call, we will eventually succeed
Put things together
var combinedPolicy = Policy.WrapAsync(retryPolicy, throttledPolicy);
var result = await combinedPolicy.ExecuteAsync(async () => await service.GetAsync());
Please note the following:
This solution works well with multiple requests as well because the Circuit Breaker is shared
This solution is a workaround, because we cannot set the duration of break dynamically
I hope you found this little sample application useful :)
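For comparison, the same "funnel everybody through one shared gate" idea can be sketched without Polly. This is not the Circuit Breaker approach above, just a hand-rolled alternative under assumptions: RateLimitGate, Block, Remaining and WaitAsync are hypothetical names, and every caller is assumed to await WaitAsync before calling the API and to call Block when it receives a RetryAfter value.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical shared gate: one instance is shared by all callers of the throttled API.
public sealed class RateLimitGate
{
    private readonly object sync = new object();
    private DateTime blockedUntilUtc = DateTime.MinValue;

    // Called by whoever receives a throttling response. The gate only ever
    // extends the block, so a shorter RetryAfter cannot reopen it early.
    public void Block(TimeSpan retryAfter)
    {
        var until = DateTime.UtcNow + retryAfter;
        lock (sync)
        {
            if (until > blockedUntilUtc) blockedUntilUtc = until;
        }
    }

    // How long callers still have to wait; zero when the gate is open.
    public TimeSpan Remaining()
    {
        lock (sync)
        {
            var remaining = blockedUntilUtc - DateTime.UtcNow;
            return remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
        }
    }

    // Awaited by every caller (old and new) before it touches the API.
    public async Task WaitAsync(CancellationToken cancellationToken = default(CancellationToken))
    {
        TimeSpan remaining;
        // Loop because the block may have been extended while we were waiting.
        while ((remaining = Remaining()) > TimeSpan.Zero)
        {
            await Task.Delay(remaining, cancellationToken);
        }
    }
}
```

Note that this releases all waiting callers at once when the block ends; if the calls should resume in sequence, the gate would have to be combined with something that limits concurrency.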

HttpClient Error too many requests rate limit

I am calling HTTP requests to an API which has a limit on how many requests can be made.
These HTTP requests are done in loops and are done very quickly resulting in the HttpClient sometimes throwing a '10030 App Rate Limit Exceeded' exception error which is a 429 HTTP error for too many requests.
I can solve this by doing a Thread.Sleep between each call, however this slows down the application and is not reasonable.
Here is the code I am using:
public static async Task<List<WilliamHillData.Event>> GetAllCompetitionEvents(string compid)
{
string res = "";
try
{
using (HttpClient client = new HttpClient())
{
client.DefaultRequestHeaders.TryAddWithoutValidation("Content-Type", "application/json");
client.DefaultRequestHeaders.TryAddWithoutValidation("apiKey", "KEY");
using (HttpResponseMessage response = await client.GetAsync("https://gw.whapi.com/v2/sportsdata/competitions/" + compid + "/events/?&sort=startDateTime"))
{
res = await response.Content.ReadAsStringAsync();
}
}
JObject jobject = JObject.Parse(res);
List<WilliamHillData.Event> list = jobject["events"].ToObject<List<WilliamHillData.Event>>();
return list;
}
catch (Exception ex)
{
throw ex;
}
}
Is there a way I can find out how many requests can be made per second/minute? And possibly, once the limit has been reached, throttle down or do a Thread.Sleep until the limit has gone down, instead of doing a Thread.Sleep each time it is called, so that I only slow down the app when it is required?
Cheers.

Is there a way I can find out how many requests can be made per second/minute?
Asking the owner of the API or reading the API docs would help. You can also test it, but you never know whether there are other limits, or even a ban.
I can solve this by doing a Thread.Sleep between each call, however this slows down the application and is not reasonable.
You need to. You can catch the 429 error instead of letting it propagate. You have to find out which is better and faster: slowing down your API calls so that you stay within the limit, or going full speed until you hit the limit and then waiting out that time.
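The "slow down so you stay within the limit" option can be sketched as a small pacer that hands out evenly spaced time slots. This is a sketch under assumptions: RequestPacer, Reserve and WaitForTurnAsync are hypothetical names, and the requests-per-second value has to come from the API owner or docs as described above.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical pacer: spaces calls evenly so a known requests-per-second limit is not exceeded.
public sealed class RequestPacer
{
    private readonly object sync = new object();
    private readonly TimeSpan minInterval;
    private DateTime nextSlotUtc = DateTime.MinValue;

    public RequestPacer(int requestsPerSecond)
    {
        if (requestsPerSecond <= 0) throw new ArgumentOutOfRangeException("requestsPerSecond");
        minInterval = TimeSpan.FromSeconds(1.0 / requestsPerSecond);
    }

    // Reserves the next free slot and returns how long the caller must wait for it.
    // The current time is a parameter so the logic can be checked without real delays.
    public TimeSpan Reserve(DateTime nowUtc)
    {
        lock (sync)
        {
            var slot = nextSlotUtc > nowUtc ? nextSlotUtc : nowUtc;
            nextSlotUtc = slot + minInterval;
            return slot - nowUtc;
        }
    }

    // Await this right before each API call; it only delays when calls arrive too fast.
    public Task WaitForTurnAsync()
    {
        return Task.Delay(Reserve(DateTime.UtcNow));
    }
}
```

Unlike a fixed Thread.Sleep between calls, this costs nothing when the loop is already slower than the limit, and it does not block a thread while waiting.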

How to wait for an external response with callbacks?

I've been searching quite a bit on this topic, but I think I'm not using the right words for searching any of this stuff, because I've not found an answer yet.
I'm looking for a way to make a process wait for a (specific) response of an external source.
In more detail, in a standard socket connection, I ask my remote endpoint for a certain value via a socket.send, how can I "catch" their reply? The idea that I already had was to send some sort of identifier along to determine what request this belongs to.
Is there a way to efficiently achieve this? (Performance is rather important). I'm currently using .NET2.0 if that's relevant information.
Some example code:
public void AskForReply()
{
//Send to connected endpoint
}
public void ReceiveReply(IAsyncResult response)
{
//Do stuff with the response
}
I've been working out several ideas in my head. But they all feel really messy and not very efficient. Is there a design pattern for this? Are there standards for this behavior?
Any help is greatly appreciated!

For anyone who runs into a similar problem, I have found a way to make an asynchronous call synchronous (which is essentially what you are trying to achieve).
EventWaitHandle waitHandler;
string replyMessage;
string AskForReply()
{
    // Already requesting something...
    if (waitHandler != null) { return null; }
    waitHandler = new EventWaitHandle(false, EventResetMode.AutoReset);
    // Send a request to a remote service
    waitHandler.WaitOne(timeout);
    // Will return null (or the default value) if the timeout passes.
    return replyMessage;
}
void ReceiveReply(string message)
{
    // We never asked for a reply? (Optional)
    if (waitHandler == null) { return; }
    replyMessage = message;
    // Process your reply
    waitHandler.Set();
    waitHandler = null;
}
It's probably a good idea to put the EventWaitHandle and the reply message in a class for better and cleaner management. You can then even put this object in a dictionary along with a key that you can use to handle multiple requests at once (do keep in mind they are synchronous and will block your thread until the timeout or the wait handle is set).
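The dictionary idea could look roughly like the following. It is a sketch under assumptions: PendingReplies, Register, WaitForReply and Complete are hypothetical names, the request id is assumed to be sent along with the request and echoed back by the remote endpoint, and on .NET 2.0 the WaitOne(TimeSpan) call would need the WaitOne(TimeSpan, bool) overload instead.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical registry that pairs each outstanding request id with its own wait handle.
public sealed class PendingReplies
{
    private sealed class Pending
    {
        public readonly ManualResetEvent Done = new ManualResetEvent(false);
        public string Reply;
    }

    private readonly Dictionary<int, Pending> pending = new Dictionary<int, Pending>();
    private int nextId;

    // Registers a request and returns the id to send along with it.
    public int Register()
    {
        lock (pending)
        {
            int id = ++nextId;
            pending.Add(id, new Pending());
            return id;
        }
    }

    // Blocks until the reply for this id arrives or the timeout passes (then returns null).
    public string WaitForReply(int id, TimeSpan timeout)
    {
        Pending entry;
        lock (pending)
        {
            if (!pending.TryGetValue(id, out entry)) return null;
        }
        try
        {
            return entry.Done.WaitOne(timeout) ? entry.Reply : null;
        }
        finally
        {
            // Remove first, so Complete can no longer find (and signal) a closed handle.
            lock (pending) { pending.Remove(id); }
            entry.Done.Close();
        }
    }

    // Called from the receive callback; returns false if nobody is waiting for this id.
    public bool Complete(int id, string reply)
    {
        lock (pending)
        {
            Pending entry;
            if (!pending.TryGetValue(id, out entry)) return false;
            entry.Reply = reply;
            entry.Done.Set();
            return true;
        }
    }
}
```

Usage would be: call Register, send the request with the returned id, then call WaitForReply on the asking thread while the receive callback calls Complete with the id it parsed from the reply.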

Consuming a web service in multiple threads c#

I am consuming a web service provided to me by a vendor in c# application. This application calls a web method in a loop and that slows down the performance. To get the complete set of results, it takes more than an hour.
Can I apply multi threading on my side to consume this web service in multiple threads and combine the results together?
Is there any better approach to retrieve data in minutes instead of hours?

First of all you have to make sure your vendor does indeed support this or does not prohibit it (which is very probable too).
The code itself to do this is fairly straightforward, using a method such as Parallel.For
Simple example:
Parallel.For(0, norequests,
i => {
//Code that does your request goes here
} );
Explanation:
In a Parallel.For loop, the iterations are executed in parallel (as the name implies), which could potentially provide a very significant increase in performance.
Further reading:
MSDN on Parallel.For loops

You should really ask your vendor. We can only speculate about why it takes that long or if firing multiple requests will actually yield the same results as the one that takes long.
Basically, sending one request and getting one response should beat the multi-threaded variant, because it should be easier to optimize on the server side.
If you want to know why this is not the case with the current version of the service, ask the vendor.

This is only a sample of calling web services in parallel:
private void TestParallelForeach()
{
string[] uris = {"http://192.168.1.2", "http://192.168.1.3", "http://192.168.1.4"};
var results = new List<string>();
var syncObj = new object();
Parallel.ForEach(uris, uri =>
{
using (var webClient = new WebClient())
{
webClient.Encoding = Encoding.UTF8;
try
{
var result = webClient.DownloadString(uri);
lock (syncObj)
{
results.Add(result);
}
}
catch (Exception ex)
{
// Do error handling here...
}
}
});
// Do with "results" here....
}
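Since the vendor may not tolerate unlimited parallel requests, here is a sketch of running the calls concurrently with a cap, using async calls instead of blocking threads. It is only a sketch under assumptions: BoundedParallel and RunAsync are hypothetical names, and the right maxConcurrency value is something only the vendor can tell you.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParallel
{
    // Runs work for every item, but never more than maxConcurrency at the same time.
    // Results come back in the same order as the input items.
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items, int maxConcurrency, Func<TItem, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();   // wait for a free slot
                try { return await work(item); }
                finally { gate.Release(); }
            }).ToList();

            // All tasks have finished before the semaphore is disposed.
            return await Task.WhenAll(tasks);
        }
    }
}
```

With an HttpClient this could be called as BoundedParallel.RunAsync(uris, 4, uri => client.GetStringAsync(uri)), where 4 is an arbitrary example value.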

Why does HttpClient appear to deadlock here?

I have an API made in a portable class library which needs to reach out to platform specific APIs for sending HTTP requests. Here is the method I wrote to do an HTTP POST on WinRT:
public bool Post(IEnumerable<KeyValuePair<string, string>> headers, string data)
{
bool success = false;
HttpClient client = new HttpClient(new HttpClientHandler {AllowAutoRedirect = false});
foreach (var header in headers)
{
client.DefaultRequestHeaders.Add(header.Key, header.Value);
}
try
{
var task=client.PostAsync(endpoint, new StringContent(data, Encoding.UTF8, "text/xml")).ContinueWith( postTask =>
{
try
{
postTask.Wait(client.Timeout); //Don't wait longer than the client timeout.
success = postTask.Result.IsSuccessStatusCode;
}catch {}
}, TaskContinuationOptions.LongRunning);
task.ConfigureAwait(false);
task.Wait(client.Timeout);
}
catch
{
success = false;
}
return success;
}
This exhibits an interesting problem when put under any kind of stress, though. It appears to deadlock internally. For example, if I create 5 threads and send POST requests from them, this method gets to the point where it does nothing but time out. Content never reaches the server, and the ContinueWith code is never executed. However, if I run it serially, or maybe even with 2 or 3 threads, it works OK. It seems that the more threads I throw at it, the exponentially worse the performance gets.
Exactly what am I doing wrong here?

I don't think this is where your problem is, but it could be, and it's really easy to implement and test. By default, the .NET Framework limits the number of concurrent connections per host to 2, so with more than 2 threads you could be blocking on the connection pool. You can add this to your app config
<system.net>
<connectionManagement>
<add address="*" maxconnection="300" />
</connectionManagement>
</system.net>
or in code you can do this
ServicePointManager.DefaultConnectionLimit = 300;
I'd also consider commenting out the wait in the ContinueWith. I don't think it's necessary.
try
{
//Comment this line out; you're handling it in the outside task already
//postTask.Wait(client.Timeout); //Don't wait longer than the client timeout.
success = postTask.Result.IsSuccessStatusCode;
}catch {}
And finally, if the two things above don't work, I'd try commenting out this code.
//task.ConfigureAwait(false);
It could be that the combination of Task.Wait plus Task.ConfigureAwait(false) is causing some kind of deadlock, but I'm no expert on why. I just know that I have some really similar code that runs multi-threaded just fine and I don't have Task.ConfigureAwait(false) in my code, mostly because I tried out the HttpClient library but didn't upgrade to .NET 4.5, so await isn't available.

Here are some things that stick out to me in the current code:
ContinueWith queues a delegate to run when the task is complete. So there's no need to wait for it.
LongRunning is not needed here; it will decrease performance because your continuation is very fast, not long running at all.
ConfigureAwait is meaningless because there's no await (and the return value is discarded anyway).
The timeout doesn't need to be passed to Task.Wait because the task will already be completed after that timeout anyway.
I have an API made in a portable class library which needs to reach out to platform specific APIs for sending HTTP requests.
I recommend making your API asynchronous since it's doing HTTP. You can use Microsoft.Bcl.Async if you want full async/await support in PCLs.
public async Task<bool> Post(IEnumerable<KeyValuePair<string, string>> headers, string data)
{
HttpClient client = new HttpClient(new HttpClientHandler {AllowAutoRedirect = false});
foreach (var header in headers)
{
client.DefaultRequestHeaders.Add(header.Key, header.Value);
}
try
{
var result = await client.PostAsync(endpoint, new StringContent(data, Encoding.UTF8, "text/xml")).ConfigureAwait(false);
return result.IsSuccessStatusCode;
}
catch
{
return false;
}
}

I have observed this HttpClientHandler issue as well when multiple requests are issued concurrently. (.NET Framework 4.7.2)
I was able to resolve the issue by backporting the .NET Core 2.1 SocketsHttpHandler to .NET Framework, and the backported implementation significantly improved performance when dozens of requests are issued concurrently.
