I am calling an external API which is slow. Currently, if I haven't called the API to get orders for a while, the call can be broken up into pages (pagination).
So fetching orders could mean making multiple calls rather than one. Each call can take around 10 seconds, so the total could be about a minute, which is far too long.
GetOrdersCall getOrders = new GetOrdersCall();
getOrders.DetailLevelList.Add(DetailLevelCodeType.ReturnSummary);
getOrders.CreateTimeFrom = lastOrderDate;
getOrders.CreateTimeTo = DateTime.Now;
PaginationType paging = new PaginationType();
paging.EntriesPerPage = 20;
paging.PageNumber = 1;
getOrders.Pagination = paging;
getOrders.Execute();
var response = getOrders.ApiResponse;
OrderTypeCollection orders = new OrderTypeCollection();
while (response != null && response.OrderArray.Count > 0)
{
    eBayConverter.ConvertOrders(response.OrderArray, 1);
    // collect this page (including the first one)
    orders.AddRange(response.OrderArray);
    if (!response.HasMoreOrders)
    {
        break; // last page reached; without this the loop never ends
    }
    getOrders.Pagination.PageNumber++;
    getOrders.Execute();
    response = getOrders.ApiResponse;
}
This is a summary of my code. The API call fires at getOrders.Execute().
After the first getOrders.Execute() there is a pagination result which tells me how many pages of data there are. My thinking is that I should be able to start an asynchronous call for each page and populate the OrderTypeCollection. When all the calls have completed and the collection is fully loaded, I will commit to the database.
I have never made asynchronous calls in C# before. I can more or less follow async/await, but I think my scenario falls outside the reading I have done so far.
Questions:
I think I can set it up to fire off the multiple calls asynchronously but I'm not sure how to check when all tasks have been completed i.e. ready to commit to db.
I've read somewhere that I want to avoid combining the API call and the db write to avoid locking in SQL server - Is this correct?
If someone can point me in the right direction - It would be greatly appreciated.
I think I can set it up to fire off the multiple calls asynchronously
but I'm not sure how to check when all tasks have been completed i.e.
ready to commit to db.
Yes, you can break this up.
The problem is that the eBay SDK doesn't have an async, Task-returning Execute method, so you are left with blocking threaded calls and no I/O-optimised async/await pattern. If there were one, you could take advantage of a TPL Dataflow pipeline, which is async-aware (and fun for the whole family to play). You could use Dataflow anyway, though I propose a vanilla TPL solution.
However, all is not lost: just fall back to Parallel.For and a ConcurrentBag<OrderType>.
Example
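var concurrentBag = new ConcurrentBag<OrderType>();
// make the first call (page 1)
// add its results to concurrentBag
// read the total page count from the first response
int pageCount = ...;
// page 1 is already done; the upper bound of Parallel.For is exclusive
Parallel.For(2, pageCount + 1,
    page =>
    {
        // set up a new call for this iteration (don't share one across threads)
        // set the page number to `page`
        // make the call
        foreach (OrderType order in getOrders.ApiResponse.OrderArray)
            concurrentBag.Add(order);
    });
// Parallel.For has returned, so all orders have been downloaded
// save to db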
Note: ParallelOptions has a MaxDegreeOfParallelism property that you can configure. You could set it to 50, though it won't really matter how high you set it: the task scheduler is not going to hand you threads aggressively. Expect maybe 10 or so initially, growing slowly.
The other way to do this is to create your own TaskScheduler, or just spin up your own threads with the old-fashioned Thread class.
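As a rough, self-contained sketch of the Parallel.For approach above: the eBay call is replaced here by a hypothetical `fetchPage` delegate (an assumption, standing in for "create a new call, set the page number, Execute(), return that page's orders"), and the names `PagedDownloadSketch` and `DownloadAllPages` are made up for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class PagedDownloadSketch
{
    // firstPage: items already returned by the first (blocking) call.
    // pageCount: total number of pages reported by that first call.
    // fetchPage: hypothetical stand-in for one blocking API call; it must be
    //            safe to invoke from several threads at once.
    public static List<T> DownloadAllPages<T>(
        IEnumerable<T> firstPage,
        int pageCount,
        Func<int, IEnumerable<T>> fetchPage,
        int maxParallel = 4)
    {
        var bag = new ConcurrentBag<T>(firstPage);
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        // Page 1 is already in the bag. The upper bound is exclusive,
        // so pageCount + 1 makes the loop cover pages 2..pageCount.
        Parallel.For(2, pageCount + 1, options, page =>
        {
            foreach (T item in fetchPage(page))
                bag.Add(item);
        });

        // Parallel.For returns only after every iteration has finished,
        // so the bag is complete here and can be saved in one go.
        return bag.ToList();
    }
}
```

Because ConcurrentBag<T> is unordered, the returned list is in no particular page order; sort it before saving if order matters.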
I've read somewhere that I want to avoid combining the API call and
the db write to avoid locking in SQL server - Is this correct?
If you mean locking as in slow DB inserts, use SQL bulk insert and update tools.
If you mean locking as in the DB deadlock error message, that is an entirely different thing and worthy of its own question.
Additional Resources
For(Int32, Int32, ParallelOptions, Action)
Executes a for (For in Visual Basic) loop in which iterations may run
in parallel and loop options can be configured.
ParallelOptions Class
Stores options that configure the operation of methods on the Parallel
class.
MaxDegreeOfParallelism
Gets or sets the maximum number of concurrent tasks enabled by this
ParallelOptions instance.
ConcurrentBag Class
Represents a thread-safe, unordered collection of objects.
Yes, the ConcurrentBag<T> class can serve the purpose of one of your questions, which was: "I think I can set it up to fire off the multiple calls asynchronously but I'm not sure how to check when all tasks have been completed i.e. ready to commit to db."
This generic class is thread-safe, so every task can add its results to it during parallel processing. The waiting itself is done by Parallel.For, which returns only after all iterations have completed; at that point the bag holds every result and you can do your further processing.
I'm fairly new to programming (< 3 years exp), so I don't have a great understanding of the subjects in this post. Please bear with me.
My team is developing an integration with a third party system, and one of the third party's endpoints lacks a meaningful way to get a list of entities matching a condition.
We have been fetching these entities by looping over the collection of requests and adding the result of each awaited call to a list. This works just fine, but it takes a lot longer than getting entities from other endpoints that let us fetch a list of entities by providing a list of ids.
.NET 6.0 introduced Parallel.ForEachAsync(), which lets us execute multiple awaitable tasks asynchronously in parallel.
For example:
public async Task<List<TEntity>> GetEntitiesInParallelAsync<TEntity>(List<IRestRequest> requests)
    where TEntity : IEntity
{
    var entities = new ConcurrentBag<TEntity>();
    // Create a function that takes a RestRequest and returns the
    // result of the request's execution, for each request
    var requestExecutionTasks = requests.Select(i =>
        new Func<Task<TEntity>>(() => GetAsync<TEntity>(i)));
    // Execute each of the functions asynchronously in parallel,
    // and add the results to the aggregate as they come in
    await Parallel.ForEachAsync(requestExecutionTasks, new ParallelOptions
    {
        // This lets us limit the number of threads to use. -1 is unlimited
        MaxDegreeOfParallelism = -1
    }, async (func, _) => entities.Add(await func()));
    return entities.ToList();
}
Using this code rather than the simple foreach loop cut the time it takes to get the ~30 entities on my test instance by 91% on average. That's awesome. However, we are worried about the rate limiting that is likely to occur when we use it on a client's system with possibly thousands of entities. We have a system in place that detects the "you are rate limited" message from their API and queues the requests for a second or so before trying again, but this is more of a safety measure than a good solution.
If we were just looping over the requests, we could throttle the calls by doing something like await Task.Delay(minimumDelay) in each iteration of the loop. Correct me if I'm wrong, but from what I understand this wouldn't actually work when executing the requests in a parallel foreach, as it would make all requests wait the same amount of time before execution. Is there a way to make each individual request wait a certain amount of time before execution, only if we are close to being rate limited? If at all possible, I would like to do this without limiting the number of threads to use.
Edit
I wanted to let this question sit a little so more people could answer. Since no new answers or comments have been added, I'm marking the one answer I got as correct. That being said, the answer suggests a different approach than using Parallel.ForEachAsync.
If I understand the current answer correctly, the answer to my original question of whether or not it's possible to throttle Parallel.ForEachAsync, would be: "no, it's not".
My suggestion is to ditch the Parallel.ForEachAsync approach, and use instead the new Chunk LINQ operator in combination with the Task.WhenAll method. You can launch 100 asynchronous operations every second like this:
public async Task<List<TEntity>> GetEntitiesInParallelAsync<TEntity>(
    List<IRestRequest> requests) where TEntity : IEntity
{
    var tasks = new List<Task<TEntity>>();
    foreach (var chunk in requests.Chunk(100))
    {
        tasks.AddRange(chunk.Select(request => GetAsync<TEntity>(request)));
        await Task.Delay(TimeSpan.FromSeconds(1.0));
    }
    return (await Task.WhenAll(tasks)).ToList();
}
It is assumed that the time required to launch an asynchronous operation (to invoke the GetAsync method) is negligible.
This approach has the inherent disadvantage that in case of an exception, the failure will not be propagated before all operations are completed. For comparison the Parallel.ForEachAsync method stops invoking the async delegate and completes ASAP, after the first failure is detected.
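For experimenting with the chunk size and the pause, the same pattern can be written against an arbitrary operation. This is only a sketch: `ChunkedLauncher`, `RunInChunksAsync` and the `operation` delegate are illustrative names, not part of any library, and unlike the code above it pauses before every chunk except the first, so there is no trailing delay. Enumerable.Chunk requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ChunkedLauncher
{
    // Starts `operation` for every item, chunkSize items at a time, pausing
    // between chunks, then waits for all started operations to finish.
    public static async Task<List<TResult>> RunInChunksAsync<TSource, TResult>(
        IEnumerable<TSource> items,
        Func<TSource, Task<TResult>> operation,
        int chunkSize,
        TimeSpan pause)
    {
        var tasks = new List<Task<TResult>>();
        foreach (TSource[] chunk in items.Chunk(chunkSize))
        {
            // Pause between chunks, but not before the first one.
            if (tasks.Count > 0)
                await Task.Delay(pause);

            // AddRange enumerates the Select immediately, which is what
            // actually starts the operations of this chunk.
            tasks.AddRange(chunk.Select(operation));
        }

        // Task.WhenAll returns the results in the order the tasks were added.
        TResult[] results = await Task.WhenAll(tasks);
        return results.ToList();
    }
}
```

A call might look like `await ChunkedLauncher.RunInChunksAsync(requests, r => GetAsync<TEntity>(r), 100, TimeSpan.FromSeconds(1))`.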
C# offers multiple ways to perform asynchronous execution such as threads, futures, and async.
In what cases is async the best choice?
I have read many articles about the how and what of async, but so far I have not seen any article that discusses the why.
Initially I thought async was a built-in mechanism to create a future. Something like
async int foo(){ return ..complex operation..; }
var x = await foo();
do_something_else();
bar(x);
where the call to 'await foo' would return immediately, and the use of 'x' would wait on the return value of 'foo'. async does not do this. If you want this behavior you can use the futures library: https://msdn.microsoft.com/en-us/library/Ff963556.aspx
The above example would instead be something like
int foo(){ return ..complex operation..; }
var x = Task.Factory.StartNew<int>(() => foo());
do_something_else();
bar(x.Result);
Which isn't as pretty as I would have hoped, but it works nonetheless.
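For operations that spend their time waiting rather than computing, a similar future-like effect can be had with async alone by starting the operation and awaiting its task later. The sketch below assumes hypothetical names (`FooAsync`, `RunAsync`) and uses Task.Delay to simulate the wait; CPU-bound work would still need Task.Run or Task.Factory.StartNew as in the example above.

```csharp
using System;
using System.Threading.Tasks;

public static class DeferredAwaitSketch
{
    // Stand-in for a slow operation; Task.Delay simulates the wait.
    public static async Task<int> FooAsync()
    {
        await Task.Delay(50);
        return 42;
    }

    public static async Task<int> RunAsync(Action doSomethingElse)
    {
        Task<int> pending = FooAsync(); // starts the operation, does not wait
        doSomethingElse();              // runs while FooAsync is still pending
        return await pending;           // wait only when the value is needed
    }
}
```

Here the Task<int> plays the role of the future: the await is placed at the point where the value is used rather than at the call.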
So if you have a problem where you want to have multiple threads operate on the work then use futures or one of the parallel operations, such as Parallel.For.
async/await, then, is probably not meant for the use case of performing work in parallel to increase throughput.
async solves the problem of scaling an application for a large number of asynchronous events, such as I/O, when creating many threads is expensive.
Imagine a web server where requests are processed immediately as they come in. The processing happens on a single thread where every function call is synchronous. Fully processing a request might take a few seconds, which means that an entire thread is consumed until the processing is complete.
A naive approach to server programming is to spawn a new thread for each request. In this way it does not matter how long each thread takes to complete because no thread will block any other. The problem with this approach is that threads are not cheap. The underlying operating system can only create so many threads before running out of memory, or some other kind of resource. A web server that uses 1 thread per request will probably not be able to scale past a few hundred/thousand requests per second. The c10k challenge asks that modern servers be able to scale to 10,000 simultaneous users. http://www.kegel.com/c10k.html
A better approach is to use a thread pool where the number of threads in existence is more or less fixed (or at least, does not expand past some tolerable maximum). In that scenario only a fixed number of threads are available for processing the incoming requests. If there are more requests than there are threads available for processing then some requests must wait. If a thread is processing a request and has to wait on a long running I/O process then effectively the thread is not being utilized to its fullest extent, and the server throughput will be much less than it otherwise could be.
The question is now, how can we have a fixed number of threads but still use them efficiently? One answer is to 'cut up' the program logic so that when a thread would normally wait on an I/O process, instead it will start the I/O process but immediately become free for any other task that wants to execute. The part of the program that was going to execute after the I/O will be stored in a thing that knows how to keep executing later on.
For example, the original synchronous code might look like
void process(){
string name = get_user_name();
string address = look_up_address(name);
string tax_forms = find_tax_form(address);
render_tax_form(name, address, tax_forms);
}
Where look_up_address and find_tax_form have to talk to a database and/or make requests to other websites.
The asynchronous version might look like
void process(){
    string name = get_user_name();
    invoke_after(() => look_up_address(name), (address) => {
        invoke_after(() => find_tax_form(address), (tax_forms) => {
            render_tax_form(name, address, tax_forms);
        });
    });
}
This is continuation-passing style, where the next thing to do is passed as the second lambda to a function that will not block the current thread when the blocking operation (in the first lambda) is invoked. This works, but it quickly becomes very ugly and makes the program logic hard to follow.
What the programmer has done manually in splitting up the program can be done automatically by async/await. Any time there is a call to an I/O function, the programmer can mark that call with await to indicate that the current thread may go and do other things instead of just waiting.
async void process(){
string name = get_user_name();
string address = await look_up_address(name);
string tax_forms = await find_tax_form(address);
render_tax_form(name, address, tax_forms);
}
The thread that executes process will break out of the function when it reaches the await on look_up_address and continue to do other work, such as processing other requests. When look_up_address has completed and process is ready to continue, some thread (possibly the same one) will pick up where the last thread left off and execute the next line, find_tax_form(address).
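In compilable C#, the awaited helpers have to return Task<T>. A minimal sketch, with made-up helper names and Task.Delay standing in for the database or web call:

```csharp
using System;
using System.Threading.Tasks;

public static class TaxFormSketch
{
    // Hypothetical I/O-bound helpers; Task.Delay stands in for a database
    // query or a request to another website.
    static async Task<string> LookUpAddressAsync(string name)
    {
        await Task.Delay(20);
        return $"address of {name}";
    }

    static async Task<string> FindTaxFormAsync(string address)
    {
        await Task.Delay(20);
        return $"tax form for {address}";
    }

    // Returning Task<string> instead of void lets the caller await the
    // result and observe any exception.
    public static async Task<string> ProcessAsync(string name)
    {
        // At each await the current thread is released until the awaited
        // task completes; the method then resumes on the next line.
        string address = await LookUpAddressAsync(name);
        string taxForm = await FindTaxFormAsync(address);
        return $"{name} | {address} | {taxForm}";
    }
}
```

The method body reads top to bottom like the synchronous version, while the compiler generates the continuation plumbing that the invoke_after version spelled out by hand.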
Since my current belief is that async is about managing threads, I don't believe that async makes a lot of sense for UI programming. Generally UIs will not have that many simultaneous events that need to be processed. The use case for async with UIs is preventing the UI thread from being blocked. Even though async can be used with a UI, I would find it dangerous, because omitting an await on some long-running function, whether by accident or forgetfulness, would cause the UI to block.
async void button_callback(){
await do_something_long();
....
}
This code won't block the UI because it uses an await for the long running function that it invokes. If later on another function call is added
async void button_callback(){
do_another_thing();
await do_something_long();
...
}
Where it wasn't clear to the programmer who added the call to do_another_thing just how long it would take to execute, the UI will now be blocked. It seems safer to just always execute all processing in a background thread.
void button_callback(){
    new Thread(() => {
        do_another_thing();
        do_something_long();
        ....
    }).Start();
}
Now there is no possibility that the UI thread will be blocked, and the chances that too many threads will be created is very small.
I am writing a very, very simple query which just gets a document from a collection according to its unique Id. After some frustration (I am new to mongo and the async / await programming model), I figured this out:
IMongoCollection<TModel> collection = // ...
FindOptions<TModel> options = new FindOptions<TModel> { Limit = 1 };
IAsyncCursor<TModel> task = await collection.FindAsync(x => x.Id.Equals(id), options);
List<TModel> list = await task.ToListAsync();
TModel result = list.FirstOrDefault();
return result;
It works, great! But I keep seeing references to a "Find" method, and I worked this out:
IMongoCollection<TModel> collection = // ...
IFindFluent<TModel, TModel> findFluent = collection.Find(x => x.Id == id);
findFluent = findFluent.Limit(1);
TModel result = await findFluent.FirstOrDefaultAsync();
return result;
As it turns out, this too works, great!
I'm sure that there's some important reason that we have two different ways to achieve these results. What is the difference between these methodologies, and why should I choose one over the other?
The difference is mostly one of syntax.
Find and FindAsync both let you build an asynchronous query, with the same performance. The differences are:
FindAsync returns a cursor, which doesn't load all documents at once and gives you an interface for retrieving documents from the DB cursor in batches. This is helpful when the query result is huge.
Find provides a simpler, fluent syntax: methods such as ToListAsync retrieve the documents from the cursor internally and return them all at once.
Imagine that you execute this code in a web request. If you run the query with a synchronous call (for example, Find(...).ToList()), the request's thread is blocked until the database returns the results. If it is a long database operation that takes seconds to complete, one of the threads available for serving web requests sits idle, simply waiting for the database and wasting valuable resources (the number of threads in the thread pool is limited).
With an asynchronous call such as FindAsync, the thread of your web request is released while the database is working, so it is free to serve another web request in the meantime. When the database returns the result, the code continues executing.
For long operations such as file-system reads and writes, database operations, or communication with other services, it's a good idea to use async calls, because the threads remain available to serve other web requests while you wait for the results. This is more scalable.
Take a look at this Microsoft article: https://msdn.microsoft.com/en-us/magazine/dn802603.aspx.
I've got a database entity type Entity, a long list of Thingy and method
private Task<Entity> MakeEntity(Thingy thingy) {
...
}
MakeEntity does lots of stuff, and is CPU bound. I would like to convert all my thingies to entities, and save them in a db.context. Considering that
I don't want to finish as fast as possible
The number of entities is large, and I want to use the database effectively, so I want to start saving changes while waiting for the remote database to do its thing
how can I do this performantly? What I would really like is to loop while waiting for the database to do its thing, and offer it all the entities made so far, until the database has processed them all. What's the best route there? I've run into SaveChanges throwing if it's called concurrently, so I can't do that. What I'd really like is a thread pool of eight threads (or rather, as many threads as I have cores) to do the CPU-bound work, and a single thread doing the SaveChanges().
This is a kind of "asynchronous stream", which is always a bit awkward.
In this case (assuming you really do want to multithread on ASP.NET, which is not recommended in general), I'd say TPL Dataflow is your best option. You can use a TransformBlock with MaxDegreeOfParallelism set to 8 (or unbounded, for that matter), and link it to an ActionBlock that does the SaveChanges.
Remember, use synchronous signatures (not async/await) for CPU-bound code, and asynchronous methods for I/O-bound code (i.e., SaveChangesAsync).
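A minimal sketch of such a pipeline, under a few assumptions: `makeEntity` is a synchronous version of MakeEntity, `saveBatchAsync` is a hypothetical delegate that would add the batch to the context and call SaveChangesAsync, and the BatchBlock in the middle is an addition of this sketch (it is not part of the suggestion above) so that each save covers many entities.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class EntityPipelineSketch
{
    public static async Task RunAsync<TThingy, TEntity>(
        IEnumerable<TThingy> thingies,
        Func<TThingy, TEntity> makeEntity,     // synchronous, CPU-bound
        Func<TEntity[], Task> saveBatchAsync,  // I/O-bound, e.g. SaveChangesAsync
        int batchSize = 100)
    {
        // CPU-bound stage: up to one worker per core.
        var transform = new TransformBlock<TThingy, TEntity>(
            makeEntity,
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = Environment.ProcessorCount
            });

        // Groups entities into arrays of batchSize (the last may be smaller).
        var batch = new BatchBlock<TEntity>(batchSize);

        // An ActionBlock processes one message at a time by default,
        // so two saves never run concurrently.
        var save = new ActionBlock<TEntity[]>(saveBatchAsync);

        var link = new DataflowLinkOptions { PropagateCompletion = true };
        transform.LinkTo(batch, link);
        batch.LinkTo(save, link);

        foreach (TThingy thingy in thingies)
            transform.Post(thingy);
        transform.Complete();

        // Completes once every batch has been saved, or faults if a stage threw.
        await save.Completion;
    }
}
```

The single-consumer ActionBlock is what addresses the "SaveChanges throws if called concurrently" problem: the CPU stage runs in parallel, but saves are strictly sequential.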
You could set up a pipeline of N CPU workers feeding into a database worker. The database worker could batch items up.
Since MakeEntity is CPU bound there is no need to use async and await there. await does not create tasks or threads (a common misconception).
var thingies = ...;
var entities = thingies.AsParallel().WithDegreeOfParallelism(8).Select(MakeEntity);
var batches = CreateBatches(entities, batchSize: 100);
foreach (var batch in batches) {
    Insert(batch);
}
You need to provide a method that creates batches from an IEnumerable. This is available on the web.
If you don't need batching for the database part you can delete that code.
For the database part you probably don't need async IO because it seems to be a low-frequency operation.
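One possible shape for the CreateBatches helper is sketched below; the class name `BatchingSketch` is made up, and on .NET 6 or later the built-in Enumerable.Chunk does much the same job.

```csharp
using System;
using System.Collections.Generic;

public static class BatchingSketch
{
    // Lazily groups a sequence into lists of at most batchSize items.
    public static IEnumerable<List<T>> CreateBatches<T>(
        IEnumerable<T> source, int batchSize)
    {
        // Validate eagerly; an iterator body would defer this check
        // until the first item is requested.
        if (source == null)
            throw new ArgumentNullException(nameof(source));
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));
        return Iterate();

        IEnumerable<List<T>> Iterate()
        {
            var batch = new List<T>(batchSize);
            foreach (T item in source)
            {
                batch.Add(item);
                if (batch.Count == batchSize)
                {
                    yield return batch;
                    batch = new List<T>(batchSize);
                }
            }
            // Emit the final, possibly smaller, batch.
            if (batch.Count > 0)
                yield return batch;
        }
    }
}
```

Because the helper is lazy, each batch is handed to the consumer as soon as enough items have been pulled from the source, rather than after the whole sequence has been materialised.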
I'm just beginning to learn C# threading and concurrent collections, and am not sure of the proper terminology to pose my question, so I'll describe briefly what I'm trying to do. My grasp of the subject is rudimentary at best at this point. Is my approach below even feasible as I've envisioned it?
I have 100,000 urls in a Concurrent collection that must be tested--is the link still good? I have another concurrent collection, initially empty, that will contain the subset of urls that an async request determines to have been moved (400, 404, etc errors).
I want to spawn as many of these async requests concurrently as my PC and our bandwidth will allow, and was going to start at 20 async-web-request-tasks per second and work my way up from there.
Would it work if a single async task handled both things: it would make the async request and then add the url to the BadUrls collection if it encountered a 4xx error? A new instance of that task would be spawned every 50ms:
class TestArgs {
    public ConcurrentBag<UrlInfo> myCollection { get; set; }
    public System.Uri currentUrl { get; set; }
}
ConcurrentQueue<UrlInfo> Urls = new ConcurrentQueue<UrlInfo>();
// populate the Urls queue
<snip>
// initialize the bad urls collection
ConcurrentBag<UrlInfo> BadUrls = new ConcurrentBag<UrlInfo>();
// timer fires every 50ms, whereupon a new args object is created
// and the timer callback spawns a new task; an autoEvent would
// reset the timer and dispose of it when the queue was empty
void SpawnNewUrlTask(){
    // if queue is empty then reset the timer
    // otherwise:
    TestArgs args = new TestArgs {
        myCollection = BadUrls,
        currentUrl = getNextUrl() // take an item from the queue
    };
    Task.Factory.StartNew( asyncWebRequestAndConcurrentCollectionUpdater, args);
}
public async Task asyncWebRequestAndConcurrentCollectionUpdater(TestArgs args)
{
    //make the async web request
    // add the url to the bad collection if appropriate.
}
Feasible? Way off?
The approach seems fine, but there are some issues with the specific code you've shown.
But before I get to that, there have been suggestions in the comments that Task Parallelism is the way to go. I think that's misguided. There's a common misconception that if you want to have lots of work going on in parallel, you necessarily need lots of threads. That's only true if the work is compute-bound. But the work you're doing will be IO bound - this code is going to spend the vast majority of its time waiting for responses. It will do very little computation. So in practice, even if it only used a single thread, your initial target of 20 requests per second doesn't seem like a workload that would cause a single CPU core to break into a sweat.
In short, a single thread can handle very high levels of concurrent IO. You only need multiple threads if you need parallel execution of code, and that doesn't look likely to be the case here, because there's so little work for the CPU in this particular job.
(This misconception predates await and async by years. In fact, it predates the TPL - see http://www.interact-sw.co.uk/iangblog/2004/09/23/threadless for a .NET 1.1 era illustration of how you can handle thousands of concurrent requests with a tiny number of threads. The underlying principles still apply today because Windows networking IO still basically works the same way.)
Not that there's anything particularly wrong with using multiple threads here, I'm just pointing out that it's a bit of a distraction.
Anyway, back to your code. This line is problematic:
Task.Factory.StartNew( asyncWebRequestAndConcurrentCollectionUpdater, args);
While you've not given us all your code, I can't see how that will be able to compile. The overloads of StartNew that accept two arguments require the first to be either an Action, an Action<object>, a Func<TResult>, or a Func<object,TResult>. In other words, it has to be a method that either takes no arguments, or accepts a single argument of type object (and which may or may not return a value). Your 'asyncWebRequestAndConcurrentCollectionUpdater' takes an argument of type TestArgs.
But the fact that it doesn't compile isn't the main problem. That's easily fixed. (E.g., change it to Task.Factory.StartNew(() => asyncWebRequestAndConcurrentCollectionUpdater(args));) The real issue is that what you're doing is a bit weird: you're using Task.Factory.StartNew to invoke a method that already returns a Task.
Task.Factory.StartNew is a handy way to take a synchronous method (i.e., one that doesn't return a Task) and run it in a non-blocking way. (It'll run on the thread pool.) But if you've got a method that already returns a Task, then you didn't really need to use Task.Factory.StartNew. The weirdness becomes more apparent if we look at what Task.Factory.StartNew returns (once you've fixed the compilation error):
Task<Task> t = Task.Factory.StartNew(
() => asyncWebRequestAndConcurrentCollectionUpdater(args));
That Task<Task> reveals what's happening. You've decided to wrap a method that was already asynchronous with a mechanism that is normally used to make non-asynchronous methods asynchronous. And so you've now got a Task that produces a Task.
One of the slightly surprising upshots of this is that if you were to wait for the task returned by StartNew to complete, the underlying work would not necessarily be done:
t.Wait(); // doesn't wait for asyncWebRequestAndConcurrentCollectionUpdater to finish!
All that will actually do is wait for asyncWebRequestAndConcurrentCollectionUpdater to return a Task. And since asyncWebRequestAndConcurrentCollectionUpdater is already an async method, it will return a task more or less immediately. (Specifically, it'll return a task the moment it performs an await that does not complete immediately.)
If you want to wait for the work you've kicked off to finish, you'll need to do this:
t.Result.Wait();
or, potentially more efficiently, this:
t.Unwrap().Wait();
That says: get me the Task that my async method returned, and then wait for that. This may not be usefully different from this much simpler code:
Task t = asyncWebRequestAndConcurrentCollectionUpdater(args);
... maybe queue up some other tasks ...
t.Wait();
You may not have gained anything useful by introducing Task.Factory.StartNew.
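The difference between waiting on the outer task and on the unwrapped one can be observed with a small experiment. This is a sketch with made-up names (`UnwrapSketch`, `WorkAsync`); a TaskCompletionSource is used as a gate so that the async method cannot finish until the code explicitly lets it.

```csharp
using System.Threading.Tasks;

public static class UnwrapSketch
{
    // Reports whether the inner task was complete after each kind of wait.
    public static (bool AfterOuterWait, bool AfterUnwrapWait) Run()
    {
        var gate = new TaskCompletionSource<bool>();

        // An async method that cannot finish until the gate is opened.
        async Task WorkAsync() => await gate.Task;

        Task<Task> outer = Task.Factory.StartNew(() => WorkAsync());

        // Only waits for WorkAsync to *return* its Task.
        outer.Wait();
        bool afterOuterWait = outer.Result.IsCompleted;  // work still pending

        gate.SetResult(true);    // let the async work finish
        outer.Unwrap().Wait();   // waits for the inner Task itself
        bool afterUnwrapWait = outer.Result.IsCompleted;

        return (afterOuterWait, afterUnwrapWait);
    }
}
```

Run() returns (false, true): after the plain Wait() the inner work is still pending, and only the unwrapped wait observes its completion.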
I say "may" because there's an important qualification: it depends on the context in which you start the work. C# generates code which, by default, attempts to ensure that when an async method continues after an await, it does so in the same context in which the await was initially performed. E.g., if you're in a WPF app and you await while on the UI thread, when the code continues it will arrange to do so on the UI thread. (You can disable this with ConfigureAwait.)
So if you're in a situation in which the context is essentially serialized (either because it's single-threaded, as will be the case in a GUI app, or because it uses something resembling a rental model, e.g. the context of a particular ASP.NET request), it may actually be useful to kick an async task off via Task.Factory.StartNew because it enables you to escape the original context. However, you have just made your life harder: tracking your tasks to completion is somewhat more complex. And you might have been able to achieve the same effect simply by using ConfigureAwait inside your async method.
And it may not matter anyway - if you're only attempting to manage 20 requests a second, the minimal amount of CPU effort required to do that means that you can probably manage it entirely adequately on one thread. (Also, if this is a console app, the default context will come into play, which uses the thread pool, so your tasks will be able to run multithreaded in any case.)
But to get back to your question, it seems entirely reasonable to me to have a single async method that picks a url off the queue, makes the request, examines the response, and if necessary, adds an entry to the bad url collection. And kicking the things off from a timer also seems reasonable - that will throttle the rate at which connections are attempted without getting bogged down with slow responses (e.g., if a load of requests end up attempting to talk to servers that are offline). It might be necessary to introduce a cap for the maximum number of requests in flight if you hit some pathological case where you end up with tens of thousands of URLs in a row all pointing to a server that isn't responding. (On a related note, you'll need to make sure that you're not going to hit any per-client connection limits with whichever HTTP API you're using - that might end up throttling the effective throughput.)
You will need to add some sort of completion handling - just kicking off asynchronous operations and not doing anything to handle the results is bad practice, because you can end up with exceptions that have nowhere to go. (In .NET 4.0, these used to terminate your process, but as of .NET 4.5, by default an unhandled exception from an asynchronous operation will simply be ignored!) And if you end up deciding that it is worth launching via Task.Factory.StartNew remember that you've ended up with an extra layer of wrapping, so you'll need to do something like myTask.Unwrap().ContinueWith(...) to handle it correctly.
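As one way to combine a cap on in-flight requests with completion handling, here is a sketch that swaps the timer for a SemaphoreSlim and keeps every task so that Task.WhenAll can observe failures. The `isBadAsync` delegate is a hypothetical stand-in for "make the web request and report whether it came back as a 4xx"; the class and method names are illustrative only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class UrlCheckSketch
{
    public static async Task<List<Uri>> FindBadUrlsAsync(
        IEnumerable<Uri> urls,
        Func<Uri, Task<bool>> isBadAsync,  // hypothetical request + status check
        int maxInFlight)
    {
        var bad = new ConcurrentBag<Uri>();
        using var gate = new SemaphoreSlim(maxInFlight);

        async Task CheckAsync(Uri url)
        {
            await gate.WaitAsync();        // wait for a free slot
            try
            {
                if (await isBadAsync(url))
                    bad.Add(url);
            }
            finally
            {
                gate.Release();            // free the slot even on failure
            }
        }

        // Keep every task so that none is fire-and-forget.
        Task[] all = urls.Select(CheckAsync).ToArray();

        // Completes when all checks are done and rethrows a failure, if any,
        // so exceptions have somewhere to go.
        await Task.WhenAll(all);
        return bad.ToList();
    }
}
```

Unlike the timer, the semaphore limits how many requests are outstanding rather than how often new ones start, so a run of unresponsive servers cannot pile up an unbounded number of connections.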
Of course you can. Concurrent collections are called 'concurrent' because they can be used... concurrently by multiple threads, with some guarantees about their behaviour.
A ConcurrentQueue will ensure that each element inserted in it is extracted exactly once (concurrent threads will never extract the same item by mistake, and once the queue is empty, all the items have been extracted by a thread).
EDIT: the only thing that could go wrong is that 50ms is not enough to complete the request, so more and more tasks accumulate in the task queue. If that happens, your memory could fill up, but the thing would work anyway. So yes, it is feasible.
Anyway, I would like to underline the fact that a task is not a thread. Even if you create 100 tasks, the framework will decide how many of them will actually be executed concurrently.
If you want more control over the level of parallelism, you should use asynchronous requests.
In your comments, you wrote "async web request", but I can't understand if you wrote async just because it's on a different thread or because you intend to use the async API.
If you were using the async API, I'd expect to see some handler attached to the completion event, but I couldn't see it, so I assumed you're using synchronous requests issued from an asynchronous task.
If you're using asynchronous requests, then it's pointless to use tasks, just use the timer to issue the async requests, since they are already asynchronous.
When I say "asynchronous request" I'm referring to methods like WebRequest.GetResponseAsync and WebRequest.BeginGetResponse.
EDIT2: if you want to use asynchronous requests, you can just make the requests from the timer handler. The BeginGetResponse method takes two arguments. The first is a callback procedure, which will be called to report the status of the request. You can pass the same procedure for all the requests. The second is a user-provided object, which stores state about the request; you can use this argument to tell different requests apart. You can even do it without the timer. Something like:
private readonly int desiredConcurrency = 20;
struct RequestData
{
public UrlInfo url;
public HttpWebRequest request;
}
/// Handles the completion of an asynchronous request
/// When a request has been completed,
/// tries to issue a new request to another url.
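private void AsyncRequestHandler(IAsyncResult ar)
{
    if (ar.IsCompleted)
    {
        RequestData data = (RequestData)ar.AsyncState;
        try
        {
            // EndGetResponse returns WebResponse, so a cast is needed
            using (var resp = (HttpWebResponse)data.request.EndGetResponse(ar))
            {
                if (resp.StatusCode != HttpStatusCode.OK)
                {
                    BadUrls.Add(data.url);
                }
            }
        }
        catch (WebException)
        {
            // 4xx/5xx responses and network failures are thrown, not returned
            BadUrls.Add(data.url);
        }
        //A request has been completed, try to start a new one
        TryIssueRequest();
    }
}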
/// If urls is not empty, dequeues a url from it
/// and issues a new request to the extracted url.
private bool TryIssueRequest()
{
RequestData rd;
if (urls.TryDequeue(out rd.url))
{
rd.request = CreateRequestTo(rd.url); //TODO implement
rd.request.BeginGetResponse(AsyncRequestHandler, rd);
return true;
}
else
{
return false;
}
}
//Called by a button handler, or something like that
void StartTheRequests()
{
for (int requestCount = 0; requestCount < desiredConcurrency; ++requestCount)
{
if (!TryIssueRequest()) break;
}
}