Parallel.ForEach() with WebRequest: 429 error - C#

I am currently running a script that hits an API roughly 3000 times over an extended period. I am using Parallel.ForEach() so that I can send a few requests at a time to speed up the process. Each iteration creates two requests, so with a degree of parallelism of 5 I should never have more than 10 requests in flight. The problem is that I keep receiving 429 errors from the server. I spoke to the person who manages the server and they said they are seeing my requests arrive in bursts of 40+. With my current understanding of my code, I don't believe that is even possible. Can someone let me know if I am missing something here?
public static List<Request> GetData(List<Request> requests)
{
    ParallelOptions para = new ParallelOptions();
    para.MaxDegreeOfParallelism = 5;

    Parallel.ForEach(requests, para, request =>
    {
        WeatherAPI.GetResponseForDay(request);
    });

    return requests;
}
public static Request GetResponseForDay(Request request)
{
    // First request for the day.
    var webRequest = WebRequest.Create(request);
    webRequest.Timeout = 3600000;
    HttpWebResponse response = (HttpWebResponse)webRequest.GetResponse();
    StreamReader myStreamReader = new StreamReader(response.GetResponseStream());
    string responseData = myStreamReader.ReadToEnd();
    response.Close();

    // Second request for the same day ("requestthesecond" is a placeholder for the second URL).
    var webRequest2 = WebRequest.Create(requestthesecond);
    HttpWebResponse response2 = (HttpWebResponse)webRequest2.GetResponse();
    StreamReader myStreamReader2 = new StreamReader(response2.GetResponseStream());
    string responseData2 = myStreamReader2.ReadToEnd();
    response2.Close();

    DoStuffWithData(responseData, responseData2);
    return request;
}

As smartobelix pointed out, setting MaxDegreeOfParallelism to 5 does not prevent you from sending more requests than the server-side policy allows. All it does is prevent you from exhausting the number of threads needed to make those requests on your side. So what you need to do is talk to the server's owner, get familiar with their limits, and change your code so that it never hits them.
The variables involved are:
the number of concurrent requests you will send (parallelism)
the average time one request takes
the maximum number of requests per unit of time allowed by the server
For example, if your average request time is 200 ms and your max degree of parallelism is 5, you can expect to send about 25 requests per second on average; if each request takes 500 ms, you'll send only 10.
Factor the server's allowed numbers into that and you'll see how to fine-tune yours.
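To make that concrete, here is a minimal sketch of one way to pace the work so the server's limit is never hit. It assumes a 100 requests/minute policy and reuses the Request and WeatherAPI.GetResponseForDay names from the question; since each item issues two HTTP calls, items are spaced at least 1.2 seconds apart. Once the server's rate limit is the bottleneck, extra parallelism no longer buys anything, so a paced sequential loop is enough:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PacedDownloader
{
    // Assumed server policy: 100 HTTP requests per minute.
    // Each item triggers two HTTP calls, so allow at most one item per 1.2 s.
    private static readonly TimeSpan MinInterval = TimeSpan.FromSeconds(1.2);

    public static async Task<List<Request>> GetDataAsync(List<Request> requests)
    {
        foreach (var request in requests)
        {
            var started = DateTime.UtcNow;

            // Stand-in for the question's worker method.
            await Task.Run(() => WeatherAPI.GetResponseForDay(request));

            // Pad the iteration so the overall rate stays under the limit,
            // even when individual requests complete quickly.
            var elapsed = DateTime.UtcNow - started;
            if (elapsed < MinInterval)
                await Task.Delay(MinInterval - elapsed);
        }

        return requests;
    }
}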

Both answers above are essentially correct. The problem I had was that the server manager had set a rate limit of 100 requests/min where there previously wasn't one, and didn't inform me. As soon as I hit that limit, the server started returning 429 errors almost instantly, and because those failures came back so quickly my loop began firing off new requests at the same speed, all of which also received 429 errors.

Related

HttpClient Error too many requests rate limit

I am making HTTP requests to an API which limits how many requests can be made.
These requests are issued in loops and are sent very quickly, so the HttpClient sometimes throws a '10030 App Rate Limit Exceeded' exception, which corresponds to HTTP 429 (too many requests).
I can work around this by doing a Thread.Sleep between each call, but that slows down the application and is not reasonable.
Here is the code I am using:
public static async Task<List<WilliamHillData.Event>> GetAllCompetitionEvents(string compid)
{
    string res = "";
    try
    {
        using (HttpClient client = new HttpClient())
        {
            client.DefaultRequestHeaders.TryAddWithoutValidation("Content-Type", "application/json");
            client.DefaultRequestHeaders.TryAddWithoutValidation("apiKey", "KEY");
            using (HttpResponseMessage response = await client.GetAsync("https://gw.whapi.com/v2/sportsdata/competitions/" + compid + "/events/?&sort=startDateTime"))
            {
                res = await response.Content.ReadAsStringAsync();
            }
        }
        JObject jobject = JObject.Parse(res);
        List<WilliamHillData.Event> list = jobject["events"].ToObject<List<WilliamHillData.Event>>();
        return list;
    }
    catch (Exception ex)
    {
        throw ex;
    }
}
Is there a way I can find out how many requests can be made per second/minute? And, once the limit has been reached, can I throttle down or sleep until the limit resets, instead of sleeping on every call, so that the app only slows down when it actually has to?
Cheers.
Is there a way I can find out how many requests can be made per second/minute?
Asking the owner of the API or reading the API docs would help. You can also test it yourself, but you never know whether there are other limits, or even a ban.
I can solve this by doing a Thread.Sleep between each call, however this slows down the application and is not reasonable.
You will need something along those lines. You can catch the 429 error rather than letting it throw. You have to work out which is better and faster: slowing down your API calls so you stay within the limit, or going full speed until you get rejected and then waiting out that time.
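A minimal sketch of the "go full speed and wait when you get a 429" option, assuming the API sends a Retry-After header (if it doesn't, a flat fallback delay is used); the URL, key header and JSON parsing from the question are left out:

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class RateLimitedClient
{
    private static readonly HttpClient Client = new HttpClient();

    public static async Task<string> GetWithRetryAsync(string url, int maxAttempts = 5)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            using (var response = await Client.GetAsync(url))
            {
                if (response.StatusCode != (HttpStatusCode)429)
                {
                    response.EnsureSuccessStatusCode();
                    return await response.Content.ReadAsStringAsync();
                }

                // Rate limited: prefer the server's Retry-After hint,
                // otherwise back off for a flat 10 seconds.
                var wait = response.Headers.RetryAfter?.Delta ?? TimeSpan.FromSeconds(10);
                await Task.Delay(wait);
            }
        }

        throw new HttpRequestException("Still rate limited after " + maxAttempts + " attempts: " + url);
    }
}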

Improving simultaneous HttpWebRequest performance in c#

I have an application that batches web requests to a single endpoint using the HttpWebRequest mechanism; the goal of the application is to revise large collections of product listings (specifically their descriptions).
Here is an example of the code I use to make these requests:
static class SomeClass
{
    static RequestCachePolicy cachePolicy;

    public static string DoRequest(string requestXml)
    {
        string responseXml = string.Empty;
        Uri ep = new Uri(API_ENDPOINT);

        HttpWebRequest theRequest = (HttpWebRequest)WebRequest.Create(ep);
        theRequest.ContentType = "text/xml;charset=\"utf-8\"";
        theRequest.Accept = "text/xml";
        theRequest.Method = "POST";
        theRequest.Headers[HttpRequestHeader.AcceptEncoding] = "gzip";
        theRequest.Proxy = null;

        if (cachePolicy == null)
        {
            cachePolicy = new RequestCachePolicy(RequestCacheLevel.BypassCache);
        }
        theRequest.CachePolicy = cachePolicy;

        using (Stream requestStream = theRequest.GetRequestStream())
        {
            using (StreamWriter requestWriter = new StreamWriter(requestStream))
            {
                requestWriter.Write(requestXml);
            }
        }

        WebResponse theResponse = theRequest.GetResponse();
        using (Stream responseStream = theResponse.GetResponseStream())
        {
            using (MemoryStream ms = new MemoryStream())
            {
                responseStream.CopyTo(ms);
                byte[] resultBytes = GzCompressor.Decompress(ms.ToArray());
                responseXml = Encoding.UTF8.GetString(resultBytes);
            }
        }

        return responseXml;
    }
}
My question is this: if I thread the task, I can call and complete at most 3 requests per second (based on the average sent data length), and this is over a gigabit connection to a router running business-grade fibre internet. However, if I divide the task into 2 sets and run the second set in a second process, I can double the number of requests completed per second.
The same holds if I divide the task into 3 or 4 sets (after that, performance seems to plateau unless I grab another machine to do the same). Why is this, and can I change something in the first process so that running multiple processes (or computers) is no longer needed?
Things I have tried so far include the following:
Implementing GZip compression (as seen in the example above).
Re-using the RequestCachePolicy (as seen in the example above).
Setting Expect100Continue to false.
Setting DefaultConnectionLimit to a larger number before the ServicePoint is created.
Reusing the HttpWebRequest (does not work as remote host does not support it).
Increasing the ReceiveBufferSize on the ServicePoint both before and after creation.
Disabling proxy detection in Internet Explorer's Lan Settings.
My suspicion is not with the remote host, since I can clearly wring far more performance out by the means I described; rather, some mechanism seems to be capping the amount of data that can be sent through the HttpWebRequest (maybe something to do with the ServicePoint?). Thanks in advance, and please let me know if there is anything else you need clarified.
--
Just to expand on the topic: my colleague and I ran the same code on a system running Windows Server Standard 2016 64-bit, and requests made with this method run significantly faster and in greater numbers there. That points to some kind of software-imposed bottleneck. The slow behaviour is observed on Windows 10 Home/Pro 64-bit and below, on faster hardware than the server is running on.
Scaling
I do not have a better solution for your problem, but I think I know why your performance seems to peak and why it is machine dependent.
Usually a program performs best when the number of threads or processes exactly matches the number of cores, because the system can run them independently and the overhead of scheduling and context switching is minimized.
You reached your peak performance at 3 or 4 separate processes. From that I would conclude your machine has 2 or 4 cores, which would match this explanation exactly.
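If you want to check that reasoning on your own machine, a minimal sketch is to read the logical core count and size the worker count to match it (the work item below is only a placeholder for the question's DoRequest call, not the original code):

using System;
using System.Threading.Tasks;

class ParallelismSizing
{
    static void Main()
    {
        // Logical processors visible to this process.
        int cores = Environment.ProcessorCount;
        Console.WriteLine("Logical processors: " + cores);

        var options = new ParallelOptions { MaxDegreeOfParallelism = cores };
        Parallel.For(0, 100, options, i =>
        {
            // Placeholder for SomeClass.DoRequest(requestXml) from the question.
        });
    }
}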

.NET HttpClient GET request very slow after ~100 seconds idle

The first request or a request after idling roughly 100 seconds is very slow and takes 15-30 seconds. Any request without idling takes less than a second. I am fine with the first request taking time, just not the small idle time causing the slowdown.
The slowdown is not tied to a single client: if I keep making requests from one client, the others stay quick as well. Only when all of them are idle for 100 seconds does the slowdown occur.
Here are some changes that I have tried:
Making the HttpClient a singleton rather than disposing it in a using() block
Setting ServicePointManager.MaxServicePointIdleTime to a higher value since by default it is 100 seconds. Since the time period is the same as mine I thought this was the issue but it did not solve it.
Setting a higher ServicePointManager.DefaultConnectionLimit
Default proxy settings set via web.config
using await instead of httpClient.SendAsync(request).Result
It is not related to IIS application pool recycling, since the default there is 20 minutes and the rest of the application remains quick.
The requests are to a web service which communicates with AWS S3 to get files. I am at a loss for ideas at this point and all my research has led me to the above points that I already tried. Any ideas would be appreciated!
Here is the method:
//get httpclient singleton or create
var httpClient = HttpClientProvider.FileServiceHttpClient;
var queryString = string.Format("?key={0}", key);
var request = new HttpRequestMessage(HttpMethod.Get, queryString);
var response = httpClient.SendAsync(request).Result;
if (response.IsSuccessStatusCode)
{
    var metadata = new Dictionary<string, string>();
    foreach (var header in response.Headers)
    {
        //grab tf headers
        if (header.Key.StartsWith(_metadataHeaderPrefix))
        {
            metadata.Add(header.Key.Substring(_metadataHeaderPrefix.Length), header.Value.First());
        }
    }
    var virtualFile = new VirtualFile
    {
        QualifiedPath = key,
        FileStream = response.Content.ReadAsStreamAsync().Result,
        Metadata = metadata
    };
    return virtualFile;
}
return null;
The default idle timeout is about 1-2 minutes. After that, the client has to re-establish the connection (a new handshake with the server), which is why requests become slow after roughly 100 seconds of idling.
You can use SocketsHttpHandler to extend the idle timeout:
var socketsHandler = new SocketsHttpHandler
{
    // Actually 5 minutes can be idle at maximum. Note that timeouts longer than
    // the TCP timeout may be ignored if no keep-alive TCP message is set at the
    // transport level.
    PooledConnectionIdleTimeout = TimeSpan.FromHours(27),
    MaxConnectionsPerServer = 10
};
client = new HttpClient(socketsHandler);
As you can see, although I set the idle timeout to 27 hours, the connection actually stays alive for at most 5 minutes.
So in the end I just call the target endpoint with the same HttpClient every minute. That way there is always an established connection (you can verify this with netstat), and it works fine.
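A minimal sketch of that once-a-minute keep-alive call, assuming a shared HttpClient and a placeholder ping URL (the real endpoint would be whatever lightweight URL the service exposes):

using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class ConnectionKeepAlive
{
    // Shared client, as in the answer above.
    private static readonly HttpClient Client = new HttpClient();

    // Fires a cheap request every minute so the pooled connection never sits
    // idle long enough to be closed. "https://example.com/ping" is a placeholder.
    public static void Start(CancellationToken token)
    {
        Task.Run(async () =>
        {
            while (!token.IsCancellationRequested)
            {
                try
                {
                    using (var response = await Client.GetAsync("https://example.com/ping", token))
                    {
                        // The response body is irrelevant; the point is keeping the socket warm.
                    }
                }
                catch (HttpRequestException)
                {
                    // Ignore transient failures; the next tick will try again.
                }

                await Task.Delay(TimeSpan.FromMinutes(1), token);
            }
        }, token);
    }
}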

How do I get reasonable performance out of C# WebClient UploadString

I have a Windows Service that sends json data to a MVC5 WebAPI using the WebClient. Both the Windows Service and the WebClient are currently on the same machine.
After it has run for about 15 minutes at roughly 10 requests per second, each post takes unreasonably long to complete. It can start out at about 3 ms per request and build up to about 5 seconds, which is far too much for my application.
This is the code I'm using:
private WebClient GetClient()
{
    var webClient = new WebClient();
    webClient.Headers.Add("Content-Type", "application/json");
    return webClient;
}

public string Post<T>(string url, T data)
{
    var sw = new Stopwatch();
    try
    {
        var json = JsonConvert.SerializeObject(data);
        sw.Start();
        var result = GetClient().UploadString(GetAddress(url), json);
        sw.Stop();
        if (Log.IsVerboseEnabled())
            Log.Verbose(String.Format("json: {0}, time(ms): {1}", json, sw.ElapsedMilliseconds));
        return result;
    }
    catch (Exception)
    {
        sw.Stop();
        Log.Debug(String.Format("Failed to send to webapi, time(ms): {0}", sw.ElapsedMilliseconds));
        return "Failed to send to webapi";
    }
}
The result of the request isn't really of importance to me.
The serialized data size varies from just a few bytes to about 1 kB but that does not seem to affect the time it takes to complete the request.
The API controllers that receive the requests complete their execution almost instantly (0-1 ms).
From various questions here on SO and some blog posts, I've seen people suggest using HttpWebRequest instead, to get more control over the request's options.
Using HttpWebRequest, I've tried the following, none of which worked:
Setting the proxy to an empty proxy.
Setting the proxy to null
Setting the ServicePointManager.DefaultConnectionLimit to an arbitrary large number.
Disable KeepAlive (I don't want to but it was suggested).
Not opening the response stream at all (had some impact but not enough).
Why are the requests taking so long? Any help is greatly appreciated.
It turned out that another part of the program was taking all the available connections; in other words, I was out of sockets and had to wait for one to free up.
I found this out by monitoring ASP.NET Applications\Requests/Sec in Performance Monitor.
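For reference, the usual way to keep socket usage bounded in situations like this is to reuse one client instead of constructing one per call. A minimal sketch using a shared HttpClient (switching away from the question's WebClient is my choice for the example, not part of the original answer):

using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class ApiClient
{
    // A single shared HttpClient reuses pooled connections instead of opening
    // a new socket per call, which is how code typically ends up waiting for
    // free sockets.
    private static readonly HttpClient Client = new HttpClient();

    public static async Task<string> PostJsonAsync(string url, string json)
    {
        using (var content = new StringContent(json, Encoding.UTF8, "application/json"))
        using (var response = await Client.PostAsync(url, content))
        {
            return await response.Content.ReadAsStringAsync();
        }
    }
}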

Timeout in web request doesn't seem to be honored

I want my HTTP GET request to fail by timing out if it takes more than 10 seconds.
I have this:
var request = (HttpWebRequest)WebRequest.Create(myUrl);
request.Method = "GET";
request.Timeout = 1000 * 10; // 10 seconds
HttpStatusCode httpStatusCode = HttpStatusCode.ServiceUnavailable;
using (var webResponse = (HttpWebResponse)request.GetResponse())
{
    httpStatusCode = webResponse.StatusCode;
}
It doesn't seem to time out when I put a bad URL in the request; it just keeps going and going for a long time (it seems like minutes).
Why is this?
If you are doing it in a web project, make sure the debug attribute of the system.web/compilation tag in the Web.Config file is set to "false".
If it is a console application or such, compile it in "Release" mode.
A lot of timeouts are ignored in "Debug" mode.
Your code is probably stuck performing a DNS lookup on the bad URL, which can take 15 seconds or more before it fails.
According to the documentation for HttpWebRequest.Timeout:
A Domain Name System (DNS) query may take up to 15 seconds to return or time out. If your request contains a host name that requires resolution and you set Timeout to a value less than 15 seconds, it may take 15 seconds or more before a WebException is thrown to indicate a timeout on your request.
You can perform a DNS lookup yourself using Dns.GetHostEntry, but it looks like that will take about 5 seconds by default.
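If you need a hard 10-second ceiling that covers DNS resolution as well, one option (a sketch, not from the original answers) is to impose the deadline yourself and abort the request when it is exceeded:

using System;
using System.Net;
using System.Threading.Tasks;

public static class HardTimeoutRequest
{
    // Enforces an overall 10-second deadline, including DNS resolution,
    // by racing the request against a delay and aborting it if the delay wins.
    public static async Task<HttpStatusCode> GetStatusAsync(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        var responseTask = request.GetResponseAsync();
        var completed = await Task.WhenAny(responseTask, Task.Delay(TimeSpan.FromSeconds(10)));

        if (completed != responseTask)
        {
            // Deadline hit: cancel the outstanding request. GetResponseAsync
            // will then fault with a WebException (RequestCanceled).
            request.Abort();
            throw new TimeoutException("Request did not complete within 10 seconds: " + url);
        }

        using (var response = (HttpWebResponse)await responseTask)
        {
            return response.StatusCode;
        }
    }
}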
