I am making HTTP requests to an API that limits how many requests can be made.
The requests are made in loops, in quick succession, so the HttpClient sometimes throws a '10030 App Rate Limit Exceeded' error, which is an HTTP 429 (Too Many Requests) response.
I can solve this by doing a Thread.Sleep between each call, however this slows down the application and is not reasonable.
Here is the code I am using:
public static async Task<List<WilliamHillData.Event>> GetAllCompetitionEvents(string compid)
{
string res = "";
try
{
using (HttpClient client = new HttpClient())
{
client.DefaultRequestHeaders.TryAddWithoutValidation("Content-Type", "application/json");
client.DefaultRequestHeaders.TryAddWithoutValidation("apiKey", "KEY");
using (HttpResponseMessage response = await client.GetAsync("https://gw.whapi.com/v2/sportsdata/competitions/" + compid + "/events/?&sort=startDateTime"))
{
res = await response.Content.ReadAsStringAsync();
}
}
JObject jobject = JObject.Parse(res);
List<WilliamHillData.Event> list = jobject["events"].ToObject<List<WilliamHillData.Event>>();
return list;
}
catch (Exception)
{
throw; // rethrow without resetting the stack trace ("throw ex;" would reset it)
}
}
Is there a way I can find out how many requests can be made per second/minute? And, once the limit has been reached, can I throttle down or do a Thread.Sleep until the limit resets, instead of doing a Thread.Sleep on every call, so that I only slow the app down when it is required?
Cheers.
Is there a way I can find out how many requests can be made per second/minute?
Asking the owner of the API or reading the API docs would help. You can also find the limit by testing, but you never know whether there are other limits, or whether you might even get banned.
I can solve this by doing a Thread.Sleep between each call, however this slows down the application and is not reasonable.
You need to throttle one way or another. You can catch the 429 error instead of letting it propagate. Then measure which approach is faster overall: slowing your API calls down so you stay within the limit, or running at full speed until you are rate limited and then waiting out that time.
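As a rough sketch of the "go full speed, then wait" option in the answer above: the helper below retries only when the server answers 429, and waits for the time given in the Retry-After header when one is present. Whether this particular API sends Retry-After is an assumption; when it does not, the code falls back to a simple exponential delay. The names RateLimitedHttp, GetWithBackoffAsync and GetRetryDelay are made up for this example.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class RateLimitedHttp
{
    // One shared client for the whole application instead of one per call.
    private static readonly HttpClient Client = new HttpClient();

    public static async Task<string> GetWithBackoffAsync(string url, int maxAttempts = 5)
    {
        for (int attempt = 1; ; attempt++)
        {
            using (HttpResponseMessage response = await Client.GetAsync(url).ConfigureAwait(false))
            {
                // Anything other than 429, or the final attempt, is handled normally.
                if ((int)response.StatusCode != 429 || attempt == maxAttempts)
                {
                    response.EnsureSuccessStatusCode(); // throws for non-2xx responses
                    return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
                }

                // Rate limited: wait without blocking a thread, then try again.
                await Task.Delay(GetRetryDelay(response, attempt)).ConfigureAwait(false);
            }
        }
    }

    // Prefers the server's Retry-After hint; otherwise waits 1s, 2s, 4s, ...
    public static TimeSpan GetRetryDelay(HttpResponseMessage response, int attempt)
    {
        RetryConditionHeaderValue retryAfter = response.Headers.RetryAfter;
        if (retryAfter?.Delta != null)
        {
            return retryAfter.Delta.Value;
        }
        if (retryAfter?.Date != null)
        {
            TimeSpan untilDate = retryAfter.Date.Value - DateTimeOffset.UtcNow;
            return untilDate > TimeSpan.Zero ? untilDate : TimeSpan.Zero;
        }
        return TimeSpan.FromSeconds(Math.Pow(2, attempt - 1));
    }
}
```

With something like this in place, normal calls run at full speed and the application only pauses when the server actually reports that the limit was hit.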
Related
I know there's a better way to do this. I'm sure my mechanism is actually wrong and would crash if it consistently failed. Is there a better practice for a retry mechanism than the way I've done it?
Since I'm relying on the response the web client gives, I never want to miss a response from this web client. I'm converting a list on a new system a website has. I know flooding it with traffic will consistently result in a 429 error (Too Many Requests), so the correct thing to do is throttle, right?
Here is my mechanism.
public static string GetUsernameFromId(long userId)
{
using (var client = new WebClient())
{
try
{
// removed business logic, minimal example
}
catch (WebException we)
{
if (we.Message.Contains("429"))
{
return ThrottleConnections(userId);
}
throw;
}
}
}
public static string ThrottleConnections(long userId)
{
System.Threading.Thread.Sleep(1 * 60 * 1000);
return GetUsernameFromId(userId);
}
Yeah, I wouldn't be doing this with recursion; it's asking for trouble.
This would probably be better done in a while loop with a retry count and a limit, and some nice asynchrony. I have also left a lot to the imagination; you probably want to throw on max retries.
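public static async Task<string> GetUsernameFromId(long userId)
{
    // MaxRetries and waitTime are assumed to be defined elsewhere.
    var retries = 0;
    while (retries++ < MaxRetries)
    {
        using (var client = new WebClient())
        {
            try
            {
                ///await client.OpenReadTaskAsync();
                ///blah
                ///
                break;
            }
            catch (WebException we)
            {
                // Only a 429 is worth waiting for and retrying.
                if (we.Message.Contains("429"))
                {
                    await Task.Delay(waitTime);
                    continue;
                }
                // Anything else is a real failure.
                throw;
            }
        }
    }
    // Placeholder: return the real result here, or throw once MaxRetries is exhausted.
    return null;
}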
Using Thread.Sleep is almost never a good idea. Prefer a timer or await Task.Delay(...), since neither blocks the thread.
Your best bet is to use a library that provides retries and similar policies in a well-thought-out manner. Polly, for example, is a well-known library. It supports time-based retrying as well; see the docs:
// Retry, waiting a specified duration between each retry
Policy
.Handle<SomeExceptionType>()
.WaitAndRetry(new[]
{
TimeSpan.FromSeconds(1),
TimeSpan.FromSeconds(2),
TimeSpan.FromSeconds(3)
});
If the response has a Retry-After header you can use that as well:
Some systems specify how long to wait before retrying as part of the fault response returned. This is typically expressed as a Retry-After header with a 429 response code.
This can be handled by using WaitAndRetry/Forever/Async(...) overloads where the sleepDurationProvider takes the handled fault/exception as an input parameter (example overload; discussion and sample code).
Some SDKs wrap RetryAfter in custom responses or exceptions: for example, the underlying Azure CosmosDB architecture sends a 429 response code (too many requests) with a x-ms-retry-after-ms header, but the Azure client SDK expresses this back to calling code by throwing a DocumentClientException with RetryAfter property. The same overloads can be used to handle these.
If you do not want to use an external library, then at least you can browse its source to get an idea of how to deal with retries.
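For readers who want to avoid the dependency, the wait-and-retry idea can be sketched in a few lines. This is not Polly's implementation, only a minimal stand-in under the assumption that the caller supplies the delay schedule and a predicate that decides which exceptions are retryable; SimpleRetry and ExecuteAsync are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SimpleRetry
{
    // Runs the action; after each retryable failure waits delays[0], delays[1], ...
    // Once the delays are used up, the last exception propagates to the caller.
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> action,
        IReadOnlyList<TimeSpan> delays,
        Func<Exception, bool> shouldRetry)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await action().ConfigureAwait(false);
            }
            catch (Exception ex) when (attempt < delays.Count && shouldRetry(ex))
            {
                // Non-blocking wait, unlike Thread.Sleep.
                await Task.Delay(delays[attempt]).ConfigureAwait(false);
            }
        }
    }
}
```

A call could look like `SimpleRetry.ExecuteAsync(() => FetchAsync(id), new[] { TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(2), TimeSpan.FromSeconds(3) }, ex => ex is WebException)`, where FetchAsync stands for whatever request method you already have.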
I am currently running a script that hits an API roughly 3000 times over an extended period of time. I am using Parallel.ForEach() so that I can send a few requests at a time to speed up the process. I make two requests in each ForEach iteration, so I should have no more than 10 requests in flight at any time. My problem is that I keep receiving 429 errors from the server. I spoke to someone who manages the server, and they said they are seeing requests in bursts of 40+. With my current understanding of my code, I don't believe this is even possible. Can someone let me know if I am missing something here?
public static List<Requests> GetData(List<Requests> requests)
{
ParallelOptions para = new ParallelOptions();
para.MaxDegreeOfParallelism = 5;
Parallel.ForEach(requests, para, request =>
{
WeatherAPI.GetResponseForDay(request);
});
return requests;
}
public static Request GetResponseForDay(Request request)
{
    // Local renamed so it no longer shadows the "request" parameter.
    var webRequest = WebRequest.Create(request);
    webRequest.Timeout = 3600000;
    HttpWebResponse response = (HttpWebResponse)webRequest.GetResponse();
    StreamReader myStreamReader = new StreamReader(response.GetResponseStream());
    string responseData = myStreamReader.ReadToEnd();
    response.Close();
    var request2 = WebRequest.Create(requestthesecond);
    HttpWebResponse response2 = (HttpWebResponse)request2.GetResponse();
    StreamReader myStreamReader2 = new StreamReader(response2.GetResponseStream());
    string responseData2 = myStreamReader2.ReadToEnd();
    response2.Close();
    DoStuffWithData(responseData, responseData2);
    return request;
}
As smartobelix pointed out, setting MaxDegreeOfParallelism to 5 does not prevent you from sending more requests than the server-side policy allows. All it does is cap the number of threads making those requests on your side. So what you need to do is talk to the server's owner, get familiar with their limits, and change your code so that it never hits them.
The variables involved are:
number of concurrent requests you will send (parallelism)
average time one request takes
maximum requests per unit time allowed by server
So, for example, if your average request time is 200 ms and your max degree of parallelism is 5, you can expect to send 25 requests per second on average (5 × 1000 / 200). If each request takes 500 ms, you will send only 10.
Factor the server's allowed rate into that and you will see how to tune your numbers.
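To make the arithmetic above concrete, and to show one way of staying under a server-side rate regardless of how fast each request returns, here is a small sketch. RequestPacer is a hypothetical helper, not something from the question: it spaces calls evenly so that no more than maxRequests slots are handed out per interval, and every worker awaits WaitForSlotAsync() before sending.

```csharp
using System;
using System.Threading.Tasks;

public sealed class RequestPacer
{
    private readonly object _sync = new object();
    private readonly TimeSpan _minInterval;
    private DateTime _nextSlotUtc = DateTime.MinValue;

    public RequestPacer(int maxRequests, TimeSpan per)
    {
        // e.g. 100 requests per minute -> one slot every 600 ms
        _minInterval = TimeSpan.FromTicks(per.Ticks / maxRequests);
    }

    // Expected send rate when nothing throttles: parallelism / average latency.
    public static double EstimatedRequestsPerSecond(int parallelism, double averageRequestMs)
    {
        return parallelism * 1000.0 / averageRequestMs;
    }

    // Reserves the next free time slot and waits (without blocking) until it arrives.
    public async Task WaitForSlotAsync()
    {
        TimeSpan wait;
        lock (_sync)
        {
            DateTime now = DateTime.UtcNow;
            DateTime slot = now > _nextSlotUtc ? now : _nextSlotUtc;
            _nextSlotUtc = slot + _minInterval;
            wait = slot - now;
        }
        if (wait > TimeSpan.Zero)
        {
            await Task.Delay(wait).ConfigureAwait(false);
        }
    }
}
```

Even spacing is a deliberately conservative choice. A server that counts requests per fixed window would also tolerate short bursts, but spacing avoids having to guess where the window boundaries are.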
Both answers above are essentially correct. The problem I had was that the server manager had set a rate limit of 100 requests/min where there previously wasn't one, and didn't inform me. As soon as I hit this limit, I started receiving 429 errors almost instantly, and because those failed responses came back so quickly, my loop fired off further requests just as fast, which also received 429 errors.
I am consuming a web service provided to me by a vendor in a C# application. The application calls a web method in a loop, and that slows down performance. Getting the complete set of results takes more than an hour.
Can I apply multi threading on my side to consume this web service in multiple threads and combine the results together?
Is there any better approach to retrieve data in minutes instead of hours?
First of all you have to make sure your vendor does indeed support this or does not prohibit it (which is very probable too).
The code itself to do this is fairly straightforward, using a method such as Parallel.For
Simple Example (google.com):
Parallel.For(0, norequests,
i => {
//Code that does your request goes here
} );
Explanation:
In a Parallel.For loop, the iterations are scheduled across multiple thread-pool threads rather than run one after another (hence the name), which could potentially provide a very significant increase in performance.
Further reading:
MSDN on Parallel.For loops
You should really ask your vendor. We can only speculate about why it takes that long or if firing multiple requests will actually yield the same results as the one that takes long.
Basically, sending one request getting one response should beat the multi-threaded variant because it should be easier to optimize on the servers side.
If you want to know why this is not the case with the current version of the service, ask the vendor.
This is only a sample of how you could call web services in parallel:
private void TestParallelForeach()
{
string[] uris = {"http://192.168.1.2", "http://192.168.1.3", "http://192.168.1.4"};
var results = new List<string>();
var syncObj = new object();
Parallel.ForEach(uris, uri =>
{
using (var webClient = new WebClient())
{
webClient.Encoding = Encoding.UTF8;
try
{
var result = webClient.DownloadString(uri);
lock (syncObj)
{
results.Add(result);
}
}
catch (Exception ex)
{
// Do error handling here...
}
}
});
// Do with "results" here....
}
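Because web service calls are I/O-bound, an asynchronous variant with an explicit cap on in-flight requests is a common alternative to Parallel.ForEach. The following is only a sketch: BoundedConcurrency.RunAsync is a hypothetical helper, and the work delegate stands for whatever asynchronous request method you have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedConcurrency
{
    // Runs work(item) for every item with at most maxConcurrency calls in flight.
    // Results come back in the same order as the input items.
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items,
        int maxConcurrency,
        Func<TItem, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            List<Task<TResult>> tasks = items.Select(async item =>
            {
                await gate.WaitAsync().ConfigureAwait(false); // wait for a free slot
                try
                {
                    return await work(item).ConfigureAwait(false);
                }
                finally
                {
                    gate.Release(); // free the slot even if the call failed
                }
            }).ToList();

            return await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

Unlike the lock around a shared list in the sample above, the results array is assembled by Task.WhenAll, so no manual synchronization is needed.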
We are building a highly concurrent web application, and recently we have started using asynchronous programming extensively (using TPL and async/await).
We have a distributed environment, in which apps communicate with each other through REST APIs (built on top of ASP.NET Web API). In one specific app, we have a DelegatingHandler that after calling base.SendAsync (i.e., after calculating the response) logs the response to a file. We include the response's basic information in the log (status code, headers and content):
public static string SerializeResponse(HttpResponseMessage response)
{
var builder = new StringBuilder();
var content = ReadContentAsString(response.Content);
builder.AppendFormat("HTTP/{0} {1:d} {1}", response.Version.ToString(2), response.StatusCode);
builder.AppendLine();
builder.Append(response.Headers);
if (!string.IsNullOrWhiteSpace(content))
{
builder.Append(response.Content.Headers);
builder.AppendLine();
builder.AppendLine(Beautified(content));
}
return builder.ToString();
}
private static string ReadContentAsString(HttpContent content)
{
return content == null ? null : content.ReadAsStringAsync().Result;
}
The problem is this: when the code reaches content.ReadAsStringAsync().Result under heavy server load, the request sometimes hangs on IIS. When it does, it sometimes returns a response (but still appears hung on IIS as if it hadn't), and at other times it never returns.
I have also tried reading the content using ReadAsByteArrayAsync and then converting it to String, with no luck.
When I convert the code to use async throughout I get even weirder results:
public static async Task<string> SerializeResponseAsync(HttpResponseMessage response)
{
var builder = new StringBuilder();
var content = await ReadContentAsStringAsync(response.Content);
builder.AppendFormat("HTTP/{0} {1:d} {1}", response.Version.ToString(2), response.StatusCode);
builder.AppendLine();
builder.Append(response.Headers);
if (!string.IsNullOrWhiteSpace(content))
{
builder.Append(response.Content.Headers);
builder.AppendLine();
builder.AppendLine(Beautified(content));
}
return builder.ToString();
}
private static Task<string> ReadContentAsStringAsync(HttpContent content)
{
return content == null ? Task.FromResult<string>(null) : content.ReadAsStringAsync();
}
Now HttpContext.Current is null after the call to content.ReadAsStringAsync(), and it keeps being null for all the subsequent requests! I know this sounds unbelievable -- and it took me some time and the presence of three coworkers to accept that this was really happening.
Is this some kind of expected behavior? Am I doing something wrong here?
I had this problem. Although I haven't fully tested it yet, using CopyToAsync instead of ReadAsStringAsync seems to fix the problem:
var ms = new MemoryStream();
await response.Content.CopyToAsync(ms);
ms.Seek(0, SeekOrigin.Begin);
var sr = new StreamReader(ms);
responseContent = sr.ReadToEnd();
With regard to your second issue, async/await is syntactic sugar for the compiler building a state machine, where the call to a function preceded by "await" returns immediately on the current thread...one that contains HttpContext.Current in its thread local storage. The completion of that async call can occur on a different thread...one that does NOT have HttpContext.Current in its thread local storage.
If you want the completion to execute on the same thread (thus having the same objects in thread local storage like HttpContext.Current), then you need to be aware of this behavior. This is especially important on calls from the main UI thread (if you're building a Windows application) or in ASP.NET, calls from an ASP.NET request thread where you are dependent on HttpContext.Current.
See reference docs on ConfigureAwait(false). Also, view some Channel 9 tutorials on TPL. Once the "easy" stuff is grokked, the presenter will invariably talk about this issue as it causes subtle problems that are not easily understood unless you know what the TPL is doing underneath the covers.
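The effect described above can be observed outside ASP.NET with a toy synchronization context. HttpContext.Current itself needs a running ASP.NET request, so this sketch substitutes a hypothetical RecordingContext that simply re-installs itself around each continuation. It only illustrates the mechanism: with ConfigureAwait(true) the continuation is posted back to the captured context, with ConfigureAwait(false) it is not.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Toy context: runs continuations on the thread pool, with itself set as Current.
public sealed class RecordingContext : SynchronizationContext
{
    public int PostCount;

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref PostCount);
        ThreadPool.QueueUserWorkItem(_ =>
        {
            SynchronizationContext previous = Current;
            SetSynchronizationContext(this);
            try { d(state); }
            finally { SetSynchronizationContext(previous); }
        });
    }
}

public static class ContextDemo
{
    // Returns true when the code after the await still sees the context
    // that was current before the await.
    public static async Task<bool> ResumesOnCapturedContextAsync(bool continueOnCapturedContext)
    {
        SynchronizationContext before = SynchronizationContext.Current;
        await Task.Delay(50).ConfigureAwait(continueOnCapturedContext);
        return before != null && ReferenceEquals(before, SynchronizationContext.Current);
    }
}
```

In this toy setup, anything that depends on the ambient context after the await, as HttpContext.Current does in the question, is only safe on the path that resumes on the captured context.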
Good luck.
With regards to your first problem, if the caller gets a result, I'm not convinced that IIS has not completed the request. How are you determining that the ASP.NET request thread initiated by this caller is hung in IIS?
I have an API made in a portable class library which needs to reach out to platform specific APIs for sending HTTP requests. Here is the method I wrote to do an HTTP POST on WinRT:
public bool Post(IEnumerable<KeyValuePair<string, string>> headers, string data)
{
bool success = false;
HttpClient client = new HttpClient(new HttpClientHandler {AllowAutoRedirect = false});
foreach (var header in headers)
{
client.DefaultRequestHeaders.Add(header.Key, header.Value);
}
try
{
var task=client.PostAsync(endpoint, new StringContent(data, Encoding.UTF8, "text/xml")).ContinueWith( postTask =>
{
try
{
postTask.Wait(client.Timeout); //Don't wait longer than the client timeout.
success = postTask.Result.IsSuccessStatusCode;
}catch {}
}, TaskContinuationOptions.LongRunning);
task.ConfigureAwait(false);
task.Wait(client.Timeout);
}
catch
{
success = false;
}
return success;
}
This exhibits an interesting problem when put under any kind of stress, though. It appears to deadlock internally. For example, if I create 5 threads and send POST requests from them, this method gets to the point where it does nothing but time out. Content never reaches the server, and the ContinueWith code is never executed. However, if I run it serially, or maybe even with 2 or 3 threads, it works OK. It seems that the more threads I throw at it, the worse the performance gets, exponentially.
Exactly what am I doing wrong here?
I don't think this is where your problem is, but it could be, and it's really easy to implement and test. By default, .NET limits the number of concurrent connections to a single host to 2 (ServicePointManager.DefaultConnectionLimit), so with more than 2 threads you could be blocking on the connection pool. You can add this to your app config
<system.net>
<connectionManagement>
<add address="*" maxconnection="300" />
</connectionManagement>
</system.net>
or in code you can do this
ServicePointManager.DefaultConnectionLimit = 300;
I'd also consider commenting out the wait inside the ContinueWith. I don't think it's necessary.
try
{
//Comment this line out; you're handling it in the outside task already
//postTask.Wait(client.Timeout); //Don't wait longer than the client timeout.
success = postTask.Result.IsSuccessStatusCode;
}catch {}
And finally, if the two things above don't work, I'd try commenting out this code.
//task.ConfigureAwait(false);
It could be that the combination of Task.Wait plus setting Task.ConfigureAwait(false) is causing some kind of deadlock but I'm no expert on why. I just know that I have some really similar code that runs multi-threaded just fine and I don't have Task.ConfigureAwait(false) in my code, mostly because I tried out the HttpClient library but didn't upgrade to .NET 4.5 so await isn't available.
Here's some things that stick out to me with the current code:
ContinueWith queues a delegate to run when the task is complete. So there's no need to wait for it.
LongRunning is not needed here; it will decrease performance because your continuation is very fast, not long running at all.
ConfigureAwait is meaningless because there's no await (and the return value is discarded anyway).
The timeout doesn't need to be passed to Task.Wait because the task will already have completed after that timeout anyway.
I have an API made in a portable class library which needs to reach out to platform specific APIs for sending HTTP requests.
I recommend making your API asynchronous since it's doing HTTP. You can use Microsoft.Bcl.Async if you want full async/await support in PCLs.
public async Task<bool> Post(IEnumerable<KeyValuePair<string, string>> headers, string data)
{
HttpClient client = new HttpClient(new HttpClientHandler {AllowAutoRedirect = false});
foreach (var header in headers)
{
client.DefaultRequestHeaders.Add(header.Key, header.Value);
}
try
{
var result = await client.PostAsync(endpoint, new StringContent(data, Encoding.UTF8, "text/xml")).ConfigureAwait(false);
return result.IsSuccessStatusCode;
}
catch
{
return false;
}
}
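One further variation on the asynchronous version, offered as a sketch rather than as part of the original answer: it reuses a single HttpClient across calls and attaches headers to each HttpRequestMessage, so concurrent callers do not mutate the shared DefaultRequestHeaders. The Poster class, its constructor parameters and the injected handler are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public sealed class Poster
{
    private readonly HttpClient _client;
    private readonly Uri _endpoint;

    // The handler is injected so the same code can run against a test double.
    public Poster(HttpMessageHandler handler, Uri endpoint)
    {
        _client = new HttpClient(handler);
        _endpoint = endpoint;
    }

    public async Task<bool> PostAsync(IEnumerable<KeyValuePair<string, string>> headers, string data)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Post, _endpoint))
        {
            request.Content = new StringContent(data, Encoding.UTF8, "text/xml");
            foreach (var header in headers)
            {
                // Per-request headers: nothing shared is modified.
                request.Headers.TryAddWithoutValidation(header.Key, header.Value);
            }

            try
            {
                using (HttpResponseMessage response = await _client.SendAsync(request).ConfigureAwait(false))
                {
                    return response.IsSuccessStatusCode;
                }
            }
            catch (HttpRequestException)
            {
                return false; // connection-level failure
            }
            catch (OperationCanceledException)
            {
                return false; // HttpClient reports timeouts as cancellations
            }
        }
    }
}
```

In production the handler could be `new HttpClientHandler { AllowAutoRedirect = false }`, matching the original code, with one Poster instance shared by all threads.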
I have observed this HttpClientHandler issue as well when multiple requests are issued concurrently (.NET Framework 4.7.2).
I was able to resolve the issue by backporting the .NET Core 2.1 SocketsHttpHandler to .NET Framework. The backported implementation significantly improved performance when dozens of requests are issued concurrently.