What is the use case for async/await? [duplicate] - c#

C# offers multiple ways to perform asynchronous execution such as threads, futures, and async.
In what cases is async the best choice?
I have read many articles about the how and what of async, but so far I have not seen any article that discusses the why.
Initially I thought async was a built-in mechanism to create a future. Something like
async int foo(){ return ..complex operation..; }
var x = await foo();
do_something_else();
bar(x);
Where the call to 'await foo' would return immediately, and the use of 'x' would wait on the return value of 'foo'. async does not do this. If you want this behavior you can use the futures library: https://msdn.microsoft.com/en-us/library/Ff963556.aspx
The above example would instead be something like
int foo(){ return ..complex operation..; }
var x = Task.Factory.StartNew<int>(() => foo());
do_something_else();
bar(x.Result);
Which isn't as pretty as I would have hoped, but it works nonetheless.
So if you have a problem where you want to have multiple threads operate on the work then use futures or one of the parallel operations, such as Parallel.For.
async/await, then, is probably not meant for the use case of performing work in parallel to increase throughput.
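The future-style behaviour described above can also be written with Task.Run and a deferred await. This is only a minimal sketch: Foo is a hypothetical CPU-bound stand-in for the "complex operation", and FutureSketch and RunAsync are illustrative names.

```csharp
using System;
using System.Threading.Tasks;

static class FutureSketch
{
    // Hypothetical stand-in for the "..complex operation.." in the question.
    static int Foo()
    {
        int sum = 0;
        for (int i = 1; i <= 1000; i++) sum += i;
        return sum;
    }

    public static async Task<int> RunAsync()
    {
        // Task.Run returns a hot task: Foo starts on a thread-pool thread right away.
        Task<int> x = Task.Run(() => Foo());

        // Other work can happen here while Foo is running.
        Console.WriteLine("doing something else while Foo runs");

        // Unlike x.Result, await does not block the calling thread while waiting.
        return await x;
    }

    static async Task Main()
    {
        Console.WriteLine(await RunAsync()); // prints 500500 (the sum of 1..1000)
    }
}
```

The task acts as the future; await is only the point where the value is consumed, which is why the await is placed after the other work rather than on the call itself.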

async solves the problem of scaling an application for a large number of asynchronous events, such as I/O, when creating many threads is expensive.
Imagine a web server where requests are processed immediately as they come in. The processing happens on a single thread where every function call is synchronous. Fully processing a request might take a few seconds, which means that an entire thread is consumed until the processing is complete.
A naive approach to server programming is to spawn a new thread for each request. In this way it does not matter how long each thread takes to complete because no thread will block any other. The problem with this approach is that threads are not cheap. The underlying operating system can only create so many threads before running out of memory, or some other kind of resource. A web server that uses 1 thread per request will probably not be able to scale past a few hundred/thousand requests per second. The c10k challenge asks that modern servers be able to scale to 10,000 simultaneous users. http://www.kegel.com/c10k.html
A better approach is to use a thread pool where the number of threads in existence is more or less fixed (or at least, does not expand past some tolerable maximum). In that scenario only a fixed number of threads are available for processing the incoming requests. If there are more requests than there are threads available for processing then some requests must wait. If a thread is processing a request and has to wait on a long running I/O process then effectively the thread is not being utilized to its fullest extent, and the server throughput will be much less than it otherwise could be.
The question is now, how can we have a fixed number of threads but still use them efficiently? One answer is to 'cut up' the program logic so that when a thread would normally wait on an I/O process, instead it will start the I/O process but immediately become free for any other task that wants to execute. The part of the program that was going to execute after the I/O will be stored in a thing that knows how to keep executing later on.
For example, the original synchronous code might look like
void process(){
string name = get_user_name();
string address = look_up_address(name);
string tax_forms = find_tax_form(address);
render_tax_form(name, address, tax_forms);
}
Where look_up_address and find_tax_form have to talk to a database and/or make requests to other websites.
The asynchronous version might look like
void process(){
string name = get_user_name();
invoke_after(() => look_up_address(name), (address) => {
invoke_after(() => find_tax_form(address), (tax_forms) => {
render_tax_form(name, address, tax_forms);
});
});
}
This is continuation passing style, where the next thing to do is passed as the second lambda to a function that will not block the current thread when the blocking operation (in the first lambda) is invoked. This works, but it quickly becomes very ugly and makes the program logic hard to follow.
What the programmer has manually done in splitting up their program can be automatically done by async/await. Any time there is a call to an I/O function the program can mark that function call with await to inform the caller of the program that it can continue to do other things instead of just waiting.
async void process(){
string name = get_user_name();
string address = await look_up_address(name);
string tax_forms = await find_tax_form(address);
render_tax_form(name, address, tax_forms);
}
The thread that executes process will break out of the function when it gets to look_up_address and continue to do other work, such as processing other requests. When look_up_address has completed and process is ready to continue, some thread (or the same thread) will pick up where the last thread left off and execute the next line, find_tax_form(address).
Since my current understanding of async is that it is about managing threads, I don't believe that async makes a lot of sense for UI programming. Generally UIs will not have that many simultaneous events that need to be processed. The use case for async with UIs is preventing the UI thread from being blocked. Even though async can be used with a UI, I would find it dangerous because omitting an await on some long running function, due to either an accident or forgetfulness, would cause the UI to block.
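The hand-off between threads can be observed in a small console program. This is a sketch under assumptions: LookUpAddressAsync and FindTaxFormAsync are hypothetical lookups simulated with Task.Delay, and the method returns Task instead of void so that the caller can await it.

```csharp
using System;
using System.Threading.Tasks;

static class ProcessSketch
{
    // Hypothetical I/O-bound lookups, simulated with Task.Delay.
    static async Task<string> LookUpAddressAsync(string name)
    {
        await Task.Delay(50); // no thread is blocked during the delay
        return name + "'s address";
    }

    static async Task<string> FindTaxFormAsync(string address)
    {
        await Task.Delay(50);
        return "tax form for " + address;
    }

    // Returns a Task rather than void so callers can await it and observe exceptions.
    public static async Task<string> ProcessAsync(string name)
    {
        Console.WriteLine("before awaits: thread " + Environment.CurrentManagedThreadId);
        string address = await LookUpAddressAsync(name);
        string taxForms = await FindTaxFormAsync(address);
        // In a console app there is no synchronization context, so this line
        // may run on a different thread-pool thread than the first one.
        Console.WriteLine("after awaits: thread " + Environment.CurrentManagedThreadId);
        return taxForms;
    }

    static async Task Main()
    {
        Console.WriteLine(await ProcessAsync("alice"));
    }
}
```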
async void button_callback(){
await do_something_long();
....
}
This code won't block the UI because it uses an await for the long running function that it invokes. If later on another function call is added
async void button_callback(){
do_another_thing();
await do_something_long();
...
}
Where it wasn't clear to the programmer who added the call to do_another_thing just how long it would take to execute, the UI will now be blocked. It seems safer to just always execute all processing in a background thread.
void button_callback(){
// Run everything on a dedicated background thread.
new Thread(() => {
do_another_thing();
do_something_long();
....
}).Start();
}
Now there is no possibility that the UI thread will be blocked, and the chances that too many threads will be created is very small.

Related

Are there performance differences between await Task.Delay() and Task.Delay().Wait() in a separate Task?

I have a construct like this:
Func<object> func = () =>
{
foreach(var x in y)
{
...
Task.Delay(100).Wait();
}
return null;
};
var t = new Task<object>(func);
t.Start(); // t is never awaited or waited for, the main thread does not care when it's done.
...
So basically I create a function func that calls Task.Delay(100).Wait() quite a few times. I know the use of .Wait() is discouraged in general.
But I want to know if there are concrete performance losses for the displayed case.
The .Wait() calls happen in a separate Task that is completely independent from the main thread, i.e. it is never awaited or waited for. I am curious what happens when I call Wait() this way, and what happens on the processor of my machine. During the 100 ms that are waited, can the processor core execute another thread and return to the Task after the time has passed? Or did I just produce a busy waiting procedure, where my CPU "actively does nothing" for 100 ms, thus slowing down the rest of my program?
Does this approach have any practical downsides when compared to a solution where I make it an async function and call await Task.Delay(100)? That would be an option for me, but I would rather not go for it if it is reasonably avoidable.
There are concrete efficiency losses because of the inefficient use of threads. Each thread requires at least 1 MB for its stack, so the more threads that are created in order to do nothing, the more memory is allocated for unproductive purposes.
It is also possible for concrete performance losses to appear, in case the demand for threads surpasses the ThreadPool availability. In this case the ThreadPool becomes saturated, and new threads are injected in the pool at a conservative (slow) rate. So the tasks you create and Start will not start immediately, but instead they will be entered in an internal queue, waiting for a free thread, either one that completed some previous work, or a newly injected one.
Regarding your concerns about creating busy waiting procedures, no, that's not what's happening. A sleeping thread does not consume CPU resources.
As a side note, creating cold Tasks using the Task constructor is an advanced technique that's only used in special occasions. The common way of creating delegate-based tasks is through the convenient Task.Run method, that returns hot (already started) tasks.
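For comparison, here is a hedged sketch of the asynchronous variant started with Task.Run. ProcessAllAsync and its parameters are illustrative names, not part of the original code; the loop body is reduced to a counter.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class DelayLoopSketch
{
    // Visits each item, pausing between items without holding a thread.
    public static async Task<int> ProcessAllAsync(IEnumerable<int> items, int delayMs)
    {
        int processed = 0;
        foreach (int item in items)
        {
            processed++;
            // The thread goes back to the pool here; a timer resumes the method later.
            await Task.Delay(delayMs);
        }
        return processed;
    }

    static async Task Main()
    {
        // Task.Run returns a hot task; for a Task-returning delegate it hands
        // back a Task<int> directly instead of a nested Task<Task<int>>.
        Task<int> t = Task.Run(() => ProcessAllAsync(new[] { 1, 2, 3 }, 100));
        Console.WriteLine(await t); // prints 3
    }
}
```

While each delay is pending, no thread-pool thread is occupied by this work, which is the efficiency difference described above.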

Usage of ConfigureAwait in .NET

I've read about ConfigureAwait in various places (including SO questions), and here are my conclusions:
ConfigureAwait(true): Runs the rest of the code on the same thread the code before the await was run on.
ConfigureAwait(false): Runs the rest of the code on the same thread the awaited code was run on.
If the await is followed by a code that accesses the UI, the task should be appended with .ConfigureAwait(true). Otherwise, an InvalidOperationException will occur due to another thread accessing UI elements.
My questions are:
Are my conclusions correct?
When does ConfigureAwait(false) improve performance, and when doesn't it?
If I am writing a GUI application, but the next lines don't access the UI elements, should I use ConfigureAwait(false) or ConfigureAwait(true)?
To answer your questions more directly:
ConfigureAwait(true): Runs the rest of the code on the same thread the code before the await was run on.
Not necessarily the same thread, but the same synchronization context. The synchronization context can decide how to run the code. In a UI application, it will be the same thread. In ASP.NET, it may not be the same thread, but you will have the HttpContext available, just like you did before.
ConfigureAwait(false): Runs the rest of the code on the same thread the awaited code was run on.
This is not correct. ConfigureAwait(false) tells it that it does not need the context, so the code can be run anywhere. It could be any thread that runs it.
If the await is followed by a code that accesses the UI, the task should be appended with .ConfigureAwait(true). Otherwise, an InvalidOperationException will occur due to another thread accessing UI elements.
It is not correct that it "should be appended with .ConfigureAwait(true)". ConfigureAwait(true) is the default. So if that's what you want, you don't need to specify it.
When does ConfigureAwait(false) improve performance, and when doesn't it?
Returning to the synchronization context might take time, because it may have to wait for something else to finish running. In reality, this rarely happens, or that waiting time is so minuscule that you'd never notice it.
If I am writing a GUI application, but the next lines don't access the UI elements, should I use ConfigureAwait(false) or ConfigureAwait(true)?
You could use ConfigureAwait(false), but I suggest you don't, for a few reasons:
I doubt you would notice any performance improvement.
It can introduce parallelism that you may not expect. If you use ConfigureAwait(false), the continuation can run on any thread, so you could have problems if you're accessing non-thread-safe objects. It is not common to have these problems, but it can happen.
You (or someone else maintaining this code) may add code that interacts with the UI later and exceptions will be thrown. Hopefully the ConfigureAwait(false) is easy to spot (it could be in a different method than where the exception is thrown) and you/they know what it does.
I find it's easier to not use ConfigureAwait(false) at all (except in libraries). In the words of Stephen Toub (a Microsoft employee) in the ConfigureAwait FAQ:
When writing applications, you generally want the default behavior (which is why it is the default behavior).
Edit: I've written an article of my own on this topic: .NET: Don’t use ConfigureAwait(false)
ConfigureAwait(false) may improve performance if there are not many worker threads available and if the thread that it would need to wait for is constantly busy.
ConfigureAwait(false) is recommended everywhere that coming back to the same SynchronizationContext (which is usually linked to a thread) is not needed, especially in libraries that await something internally: https://medium.com/bynder-tech/c-why-you-should-use-configureawait-false-in-your-library-code-d7837dce3d7f.
ConfigureAwait(true) (which is the default) is needed when you require the same context, but it may also lead to a deadlock in certain situations.
Consider this code:
void Main()
{
// creating a windows form attaches a synchronization context to the current thread
new System.Windows.Forms.Form();
var task = DoSth();
Console.WriteLine(task.Result);
}
async Task<int> DoSth()
{
await Task.Delay(1000);
return 1;
}
In this example, the task returned by DoSth is not awaited: the main UI thread blocks waiting for task.Result, while at the same time DoSth cannot finish because its continuation wants to come back to the UI thread after the delay. This leads to a deadlock, and the code never executes to the end. Adding .ConfigureAwait(false) to the await inside DoSth solves the problem in this case.
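A sketch of where the ConfigureAwait(false) call goes. It is written as a plain console program, which has no synchronization context and so would not deadlock either way; the point is only the placement of the call on the awaited task.

```csharp
using System;
using System.Threading.Tasks;

static class DeadlockFixSketch
{
    public static async Task<int> DoSth()
    {
        // ConfigureAwait(false): the continuation does not need the captured
        // context, so it can run on a pool thread even while the caller blocks.
        await Task.Delay(100).ConfigureAwait(false);
        return 1;
    }

    static void Main()
    {
        // Blocking on .Result is still discouraged; it is kept to mirror the example.
        Console.WriteLine(DoSth().Result); // prints 1
    }
}
```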
Using ConfigureAwait(false) in application code is normally not going to boost your application's performance in any meaningful way, because normally you don't await inside loops in application code. For example, let's consider the case where your app has a button, an async operation is started every time the user clicks the button, and the async operation includes a single await. By typing the 22 characters .ConfigureAwait(false) after this await, you have already spent an amount of your life comparable to the time you can hope to save for 10 users who click this button once every minute, 8 hours per day, for 20 years each (~35,000,000 context switches in total = some seconds of CPU processing time).
And this is before taking into account the time you need to think about whether you can safely include this configuration (depending on whether the continuation contains UI-related code), the time you'll need in order to reconfirm your previous assessment every time you have to maintain/modify the code, and the time you'll lose on debugging in case your assessment was wrong.
On the other hand if your Button_Click handler contains code like this:
private async void Button_Click(object sender, EventArgs e)
{
var client = new WebClient();
using var stream = await client.OpenReadTaskAsync("someUrl");
var buffer = new byte[1024];
while ((await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
{
//...
}
}
...then by all means do spend the extra time to ConfigureAwait(false) the ReadAsync task. Also do consider refactoring the code by moving the stream-reading part to a separate asynchronous method, so that you can safely access UI elements anywhere inside the Button_Click handler, without being distracted by technicalities that don't belong to this layer of the app.
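One possible shape of that refactoring, as a sketch: CountBytesAsync is a hypothetical helper that owns the read loop, and a MemoryStream stands in for the network stream so the example runs on its own.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class StreamReadingSketch
{
    // Hypothetical helper: reads the stream to the end and returns the byte count.
    public static async Task<long> CountBytesAsync(Stream stream)
    {
        var buffer = new byte[1024];
        long total = 0;
        int read;
        // No UI access happens in this method, so each await can skip the captured context.
        while ((read = await stream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
        {
            total += read;
        }
        return total;
    }

    static async Task Main()
    {
        using (var stream = new MemoryStream(new byte[5000]))
        {
            // The caller awaits normally, so a UI handler would resume on the UI thread.
            Console.WriteLine(await CountBytesAsync(stream)); // prints 5000
        }
    }
}
```

The handler itself then contains a single plain await, and the ConfigureAwait(false) detail stays inside the helper.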

Can Task.Delay cause thread switching? [duplicate]

I have a long running process that sends data to the other machine. But this data is received in chunks (like a set of 100 packets, then delay of minimum 10 seconds).
I started the sending function on a separate thread using
Task.Run(() => { SendPackets(); });
The packets to be sent are queued in a Queue<Packet> object by some other function.
In SendPackets() I am using a while loop to retrieve and send (asynchronously) all items available in the queue.
void SendPackets()
{
while(isRunning)
{
while(thePacketQueue.Count > 0)
{
Packet pkt = thePacketQueue.Dequeue();
BeginSend(pkt, callback); // Actual code to send data over asynchronously
}
Task.Delay(1000); // <---- My question lies here
}
}
All the locks are in place!
My question is, when I use Task.Delay, is it possible then the next cycle may be executed by a thread different from the current one?
Is there any other approach where, instead of specifying a delay of 1 second, I can use something like ManualResetEvent, and what would the respective code be? (I have no idea how to use ManualResetEvent etc.)
Also, I am new to async/await and TPL, so kindly bear with my ignorance.
TIA.
My question is, when I use Task.Delay, is it possible then the next cycle may be executed by a thread different from the current one?
Not with the code you've got, because that code is buggy. It won't actually delay between cycles at all. It creates a task that will complete in a second, but then ignores that task. Your code should be:
await Task.Delay(1000);
or potentially:
await Task.Delay(1000).ConfigureAwait(false);
With the second piece of code, that can absolutely run each cycle on a different thread. With the first piece of code, it will depend on the synchronization context. If you were running in a synchronization context with thread affinity (e.g. you're calling this from the UI thread of a WPF or WinForms app) then the async method will continue on the same thread after the delay completes. If you're running without a synchronization context, or in a synchronization context that doesn't just use one thread, then again it could run each cycle in a different thread.
As you're starting this code with Task.Run, you won't have a synchronization context - but it's worth being aware that the same piece of code could behave differently when run in a different way.
As a side note, it's not clear what's adding items to thePacketQueue, but unless that's a concurrent collection (e.g. ConcurrentQueue), you may well have a problem there too.
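Putting both points together, the loop could look roughly like the sketch below. The 'send' delegate is a hypothetical stand-in for BeginSend, string replaces the Packet type, and a CancellationToken replaces the isRunning flag; none of these come from the original code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class SendLoopSketch
{
    // Drains the queue, then pauses, until cancellation is requested.
    public static async Task SendPacketsAsync(
        ConcurrentQueue<string> queue, Action<string> send,
        int pauseMs, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            // TryDequeue is safe even while another thread is enqueuing.
            while (queue.TryDequeue(out string pkt))
            {
                send(pkt);
            }
            try
            {
                // Awaited, so the loop really pauses; the next cycle may resume on another thread.
                await Task.Delay(pauseMs, token).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                break; // cancellation was requested during the pause
            }
        }
    }

    static async Task Main()
    {
        var queue = new ConcurrentQueue<string>(new[] { "a", "b", "c" });
        using (var cts = new CancellationTokenSource(500)) // stop after about 500 ms
        {
            await SendPacketsAsync(queue, p => Console.WriteLine("sent " + p), 100, cts.Token);
        }
    }
}
```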

Async/Await regarding system resources consumption and efficiency

Short version: how do async calls scale when async methods are called thousands and thousands of times in a loop, and these methods might call other async methods? Will my threadpool explode?
I've been reading and experimenting with the TPL and Async and after reading a lot of material I'm still confused about some aspects that I could not find much information about, like how async calls scale. I will try to go straight to the point.
Async calls
For IO, I read it is better to use async than a new thread/start a task, but from what I understand, performing an async operation without using a different thread is impossible, which means async must use other threads/start tasks at some point.
So my question is: how would code A be better than code B regarding system resources?
Code A
// an array with 5000 urls.
var urls = new string[5000];
// list of awaitable tasks.
var tasks = new List<Task<string>>(5000);
var httpClient = new HttpClient();
foreach (string url in urls)
{
tasks.Add(httpClient.GetStringAsync(url));
}
await Task.WhenAll(tasks);
Code B
...same variables as code A...
foreach (string url in urls)
{
tasks.Add(
Task.Factory.StartNew(() =>
{
// This method represents a
// synchronous version of the GetStringAsync.
return httpClient.GetString(url);
})
);
}
await Task.WhenAll(tasks);
Which leads me to the questions:
1 - should async calls be avoided in a loop?
2 - Is there a reasonable max of async calls that should be fired at a time, or is firing any number of async calls ok? How does this scale?
3 - Do async methods, under the hood, start a task for each call?
I tested this with 1000 urls and the number of used threadpool worker threads never even reached 30, and the number of IO completion threads is always about 5.
My Practical Experiment
I created a web application with a simple async controller.
The page is composed of a single form with a textarea where the user enters all urls he wishes to request/do some work with.
Upon submission, the urls are requested in a loop using the HttpClient.GetStringAsync method, just like code A above.
An interesting point is that if I submit 1000 urls, it takes about 3 minutes to finish all requests.
On the other hand, if I submit 3 forms from 3 different tabs (i.e. clients), each with 1000 urls, it takes much longer to get the result (about 10 minutes). This really got me confused because, as per the MSDN definition, it should not take much longer than 3 minutes, especially since, even while processing all the requests at the same time, the number of threads used from the threadpool is approx. 25, which means resources are not being well exploited at all!
The way it is working now, this type of application is far from scalable (say I had about 5000 clients requesting a bunch of urls all the time), and I fail to see how async is the way to fire multiple IO requests.
Further explanation about the application
Client side:
1. user enters the site
2. types 1000 urls in the text area
3. submits the urls
Server side:
1. receive urls as an array
2. perform the code
foreach (string url in urls)
{
tasks.Add(GetUrlAsync(url));
}
await Task.WhenAll(tasks);
//at this point the thread is
// returned to the pool to receive
// further requests.
3. notify the client that the work is done
Please, enlighten me!
Thank you.
from what I understand, performing an async operation without using a different thread is impossible, which means async must use other threads/start tasks at some point.
Nope. As I describe on my blog, pure async methods do not block threads.
So my question is: how would code A be better than code B regarding system resources?
A uses fewer threads than B.
(On a side note, do not use StartNew. It's horribly out-of-date and has very dangerous default parameter values. Use Task.Run instead. If you got this idea/code from a blog post or article, please pass the word along. StartNew is a cancer that seems to be taking over the Internet.)
should async calls be avoided in a loop?
Nope, that's fine.
Is there a reasonable max of async calls that should be fired at a time, or is firing any number of async calls ok?
Any number of them are fine, as long as your backend resource can handle it.
How does this scale?
Asynchronous I/O on .NET almost always uses IOCPs (I/O Completion Ports) underneath, which is generally considered the most scalable form of I/O available on Windows.
Do async methods, under the hood, start a task for each call?
Yes and no. The execution of every asynchronous method is represented by a Task instance, but these do not represent running tasks - they don't represent a thread.
I call async tasks Promise Tasks, as opposed to Delegate Tasks (tasks that actually do run on the thread pool).
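The distinction can be illustrated with TaskCompletionSource, which produces a task that has no delegate behind it. This is only a sketch; PromiseTaskSketch and Demo are made-up names.

```csharp
using System;
using System.Threading.Tasks;

static class PromiseTaskSketch
{
    public static TaskStatus[] Demo()
    {
        // A promise task: no delegate and no thread behind it, just a completion signal.
        var tcs = new TaskCompletionSource<int>();
        Task<int> promise = tcs.Task;
        TaskStatus before = promise.Status;

        // Whoever owns the source completes it, e.g. an I/O completion callback.
        tcs.SetResult(42);
        TaskStatus after = promise.Status;

        return new[] { before, after };
    }

    static void Main()
    {
        // Prints WaitingForActivation, then RanToCompletion.
        foreach (TaskStatus status in Demo()) Console.WriteLine(status);

        // A delegate task, by contrast, has code that runs on a thread-pool thread.
        Console.WriteLine(Task.Run(() => 6 * 7).Result); // prints 42
    }
}
```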
really got me confused
One thing to be aware of when you're testing URL requests is that there's automatic throttling for URL requests built-in to .NET. Try setting ServicePointManager.DefaultConnectionLimit to int.MaxValue.
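If the backend cannot handle every request at once, the number of operations in flight can be capped on the caller's side. Below is a minimal sketch using SemaphoreSlim; RunThrottledAsync is a hypothetical helper, and Task.Delay stands in for the real httpClient.GetStringAsync(url) call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ThrottleSketch
{
    // Runs 'work' for every item with at most 'maxInFlight' operations pending at once.
    public static async Task<TResult[]> RunThrottledAsync<T, TResult>(
        IEnumerable<T> items, int maxInFlight, Func<T, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxInFlight))
        {
            List<Task<TResult>> tasks = items.Select(async item =>
            {
                await gate.WaitAsync(); // waits asynchronously, without blocking a thread
                try { return await work(item); }
                finally { gate.Release(); }
            }).ToList();

            // Results come back in the same order as the input items.
            return await Task.WhenAll(tasks);
        }
    }

    static async Task Main()
    {
        IEnumerable<string> urls = Enumerable.Range(1, 20).Select(i => "url" + i);

        // Simulated download: the delay stands in for a real HTTP request.
        int[] lengths = await RunThrottledAsync(urls, 5, async url =>
        {
            await Task.Delay(50);
            return url.Length;
        });
        Console.WriteLine(lengths.Length); // prints 20
    }
}
```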

Can many instances of an async task share a reference to a concurrent collection and add items concurrently to it in C#?

I'm just beginning to learn C# threading and concurrent collections, and am not sure of the proper terminology to pose my question, so I'll describe briefly what I'm trying to do. My grasp of the subject is rudimentary at best at this point. Is my approach below even feasible as I've envisioned it?
I have 100,000 urls in a Concurrent collection that must be tested--is the link still good? I have another concurrent collection, initially empty, that will contain the subset of urls that an async request determines to have been moved (400, 404, etc errors).
I want to spawn as many of these async requests concurrently as my PC and our bandwidth will allow, and was going to start at 20 async-web-request-tasks per second and work my way up from there.
Would it work if a single async task handled both things: it would make the async request and then add the url to the BadUrls collection if it encountered a 4xx error? A new instance of that task would be spawned every 50ms:
class TestArgs {
public ConcurrentBag<UrlInfo> myCollection { get; set; }
public System.Uri currentUrl { get; set; }
}
ConcurrentQueue<UrlInfo> Urls = new ConcurrentQueue<UrlInfo>();
// populate the Urls queue
<snip>
// initialize the bad urls collection
ConcurrentBag<UrlInfo> BadUrls = new ConcurrentBag<UrlInfo>();
// timer fires every 50ms, whereupon a new args object is created
// and the timer callback spawns a new task; an autoEvent would
// reset the timer and dispose of it when the queue was empty
void SpawnNewUrlTask(){
// if queue is empty then reset the timer
// otherwise:
TestArgs args = new TestArgs {
myCollection = BadUrls,
currentUrl = getNextUrl() // take an item from the queue
};
Task.Factory.StartNew( asyncWebRequestAndConcurrentCollectionUpdater, args);
}
public async Task asyncWebRequestAndConcurrentCollectionUpdater(TestArgs args)
{
//make the async web request
// add the url to the bad collection if appropriate.
}
Feasible? Way off?
The approach seems fine, but there are some issues with the specific code you've shown.
But before I get to that, there have been suggestions in the comments that Task Parallelism is the way to go. I think that's misguided. There's a common misconception that if you want to have lots of work going on in parallel, you necessarily need lots of threads. That's only true if the work is compute-bound. But the work you're doing will be IO bound - this code is going to spend the vast majority of its time waiting for responses. It will do very little computation. So in practice, even if it only used a single thread, your initial target of 20 requests per second doesn't seem like a workload that would cause a single CPU core to break into a sweat.
In short, a single thread can handle very high levels of concurrent IO. You only need multiple threads if you need parallel execution of code, and that doesn't look likely to be the case here, because there's so little work for the CPU in this particular job.
(This misconception predates await and async by years. In fact, it predates the TPL - see http://www.interact-sw.co.uk/iangblog/2004/09/23/threadless for a .NET 1.1 era illustration of how you can handle thousands of concurrent requests with a tiny number of threads. The underlying principles still apply today because Windows networking IO still basically works the same way.)
Not that there's anything particularly wrong with using multiple threads here, I'm just pointing out that it's a bit of a distraction.
Anyway, back to your code. This line is problematic:
Task.Factory.StartNew( asyncWebRequestAndConcurrentCollectionUpdater, args);
While you've not given us all your code, I can't see how that will be able to compile. The overloads of StartNew that accept two arguments require the first to be either an Action, an Action<object>, a Func<TResult>, or a Func<object,TResult>. In other words, it has to be a method that either takes no arguments, or accepts a single argument of type object (and which may or may not return a value). Your 'asyncWebRequestAndConcurrentCollectionUpdater' takes an argument of type TestArgs.
But the fact that it doesn't compile isn't the main problem. That's easily fixed. (E.g., change it to Task.Factory.StartNew(() => asyncWebRequestAndConcurrentCollectionUpdater(args));) The real issue is what you're doing is a bit weird: you're using Task.Factory.StartNew to invoke a method that already returns a Task.
Task.Factory.StartNew is a handy way to take a synchronous method (i.e., one that doesn't return a Task) and run it in a non-blocking way. (It'll run on the thread pool.) But if you've got a method that already returns a Task, then you didn't really need to use Task.Factory.StartNew. The weirdness becomes more apparent if we look at what Task.Factory.StartNew returns (once you've fixed the compilation error):
Task<Task> t = Task.Factory.StartNew(
() => asyncWebRequestAndConcurrentCollectionUpdater(args));
That Task<Task> reveals what's happening. You've decided to wrap a method that was already asynchronous with a mechanism that is normally used to make non-asynchronous methods asynchronous. And so you've now got a Task that produces a Task.
One of the slightly surprising upshots of this is that if you were to wait for the task returned by StartNew to complete, the underlying work would not necessarily be done:
t.Wait(); // doesn't wait for asyncWebRequestAndConcurrentCollectionUpdater to finish!
All that will actually do is wait for asyncWebRequestAndConcurrentCollectionUpdater to return a Task. And since asyncWebRequestAndConcurrentCollectionUpdater is already an async method, it will return a task more or less immediately. (Specifically, it'll return a task the moment it performs an await that does not complete immediately.)
If you want to wait for the work you've kicked off to finish, you'll need to do this:
t.Result.Wait();
or, potentially more efficiently, this:
t.Unwrap().Wait();
That says: get me the Task that my async method returned, and then wait for that. This may not be usefully different from this much simpler code:
Task t = asyncWebRequestAndConcurrentCollectionUpdater(args);
... maybe queue up some other tasks ...
t.Wait();
You may not have gained anything useful by introducing Task.Factory.StartNew.
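The difference between waiting on the outer task and on the unwrapped one can be seen in a small self-contained sketch. WorkAsync is a hypothetical stand-in for the async method, with a Task.Delay in place of the web request.

```csharp
using System;
using System.Threading.Tasks;

static class UnwrapSketch
{
    static volatile bool finished;

    // Hypothetical async method; the delay stands in for the web request.
    static async Task WorkAsync()
    {
        await Task.Delay(200);
        finished = true;
    }

    public static bool[] Demo()
    {
        finished = false;
        Task<Task> outer = Task.Factory.StartNew(() => WorkAsync());

        outer.Wait();               // only waits until WorkAsync has returned its Task
        bool afterOuter = finished; // expected to still be false at this point

        outer.Unwrap().Wait();      // waits for the work inside WorkAsync
        bool afterInner = finished; // true: the async method has run to the end

        return new[] { afterOuter, afterInner };
    }

    static void Main()
    {
        bool[] flags = Demo();
        Console.WriteLine("after outer.Wait(): " + flags[0]);
        Console.WriteLine("after Unwrap().Wait(): " + flags[1]);
    }
}
```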
I say "may" because there's an important qualification: it depends on the context in which you start the work. C# generates code which, by default, attempts to ensure that when an async method continues after an await, it does so in the same context in which the await was initially performed. E.g., if you're in a WPF app and you await while on the UI thread, when the code continues it will arrange to do so on the UI thread. (You can disable this with ConfigureAwait.)
So if you're in a situation in which the context is essentially serialized (either because it's single-threaded, as will be the case in a GUI app, or because it uses something resembling a rental model, e.g. the context of a particular ASP.NET request), it may actually be useful to kick an async task off via Task.Factory.StartNew because it enables you to escape the original context. However, you just made your life harder - tracking your tasks to completion is somewhat more complex. And you might have been able to achieve the same effect simply by using ConfigureAwait inside your async method.
And it may not matter anyway - if you're only attempting to manage 20 requests a second, the minimal amount of CPU effort required to do that means that you can probably manage it entirely adequately on one thread. (Also, if this is a console app, the default context will come into play, which uses the thread pool, so your tasks will be able to run multithreaded in any case.)
But to get back to your question, it seems entirely reasonable to me to have a single async method that picks a url off the queue, makes the request, examines the response, and if necessary, adds an entry to the bad url collection. And kicking the things off from a timer also seems reasonable - that will throttle the rate at which connections are attempted without getting bogged down with slow responses (e.g., if a load of requests end up attempting to talk to servers that are offline). It might be necessary to introduce a cap for the maximum number of requests in flight if you hit some pathological case where you end up with tens of thousands of URLs in a row all pointing to a server that isn't responding. (On a related note, you'll need to make sure that you're not going to hit any per-client connection limits with whichever HTTP API you're using - that might end up throttling the effective throughput.)
You will need to add some sort of completion handling - just kicking off asynchronous operations and not doing anything to handle the results is bad practice, because you can end up with exceptions that have nowhere to go. (In .NET 4.0, these used to terminate your process, but as of .NET 4.5, by default an unhandled exception from an asynchronous operation will simply be ignored!) And if you end up deciding that it is worth launching via Task.Factory.StartNew remember that you've ended up with an extra layer of wrapping, so you'll need to do something like myTask.Unwrap().ContinueWith(...) to handle it correctly.
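One way to attach such completion handling is a continuation that inspects the finished task. The sketch below is only an illustration: CheckAsync is a hypothetical URL check that fails for URLs containing "bad", and Observe is a made-up helper name.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class CompletionHandlingSketch
{
    // Hypothetical check that throws for some inputs.
    static async Task CheckAsync(string url)
    {
        await Task.Delay(10);
        if (url.Contains("bad")) throw new InvalidOperationException("failed: " + url);
    }

    // Attaches a continuation that observes failures so they are not silently dropped.
    public static Task Observe(Task task, Action<Exception> onError)
    {
        return task.ContinueWith(t =>
        {
            // t.Exception is an AggregateException; report the underlying cause.
            if (t.IsFaulted) onError(t.Exception.GetBaseException());
        }, TaskScheduler.Default);
    }

    public static async Task<int> DemoAsync()
    {
        var errors = new ConcurrentQueue<string>();
        Task a = Observe(CheckAsync("http://good"), e => errors.Enqueue(e.Message));
        Task b = Observe(CheckAsync("http://bad"), e => errors.Enqueue(e.Message));
        await Task.WhenAll(a, b);
        return errors.Count;
    }

    static async Task Main()
    {
        Console.WriteLine(await DemoAsync()); // prints 1
    }
}
```

For a task started through Task.Factory.StartNew, the same helper would be applied to the unwrapped task, as described above.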
Of course you can. Concurrent collections are called 'concurrent' because they can be used... concurrently by multiple threads, with some guarantees about their behaviour.
A ConcurrentQueue will ensure that each element inserted in it is extracted exactly once (concurrent threads will never extract the same item by mistake, and once the queue is empty, all the items have been extracted by a thread).
EDIT: the only thing that could go wrong is that 50ms is not enough to complete the request, and so more and more tasks accumulate in the task queue. If that happens, your memory could get filled, but the thing would work anyway. So yes, it is feasible.
Anyway, I would like to underline the fact that a task is not a thread. Even if you create 100 tasks, the framework will decide how many of them will be actually executed concurrently.
If you want to have more control over the level of parallelism, you should use asynchronous requests.
In your comments, you wrote "async web request", but I can't understand if you wrote async just because it's on a different thread or because you intend to use the async API.
If you were using the async API, I'd expect to see some handler attached to the completion event, but I couldn't see it, so I assumed you're using synchronous requests issued from an asynchronous task.
If you're using asynchronous requests, then it's pointless to use tasks, just use the timer to issue the async requests, since they are already asynchronous.
When I say "asynchronous request" I'm referring to methods like WebRequest.GetResponseAsync and WebRequest.BeginGetResponse.
EDIT2: if you want to use asynchronous requests, then you can just make requests from the timer handler. The BeginGetResponse method takes two arguments. The first one is a callback procedure that will be called to report the status of the request. You can pass the same procedure for all the requests. The second one is a user-provided object, which will store status about the request; you can use this argument to differentiate among different requests. You can even do it without the timer. Something like:
private readonly int desiredConcurrency = 20;
struct RequestData
{
public UrlInfo url;
public HttpWebRequest request;
}
/// Handles the completion of an asynchronous request
/// When a request has been completed,
/// tries to issue a new request to another url.
private void AsyncRequestHandler(IAsyncResult ar)
{
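if (ar.IsCompleted)
{
RequestData data = (RequestData)ar.AsyncState;
// EndGetResponse returns a WebResponse, so cast it. Note that it throws a
// WebException for 4xx/5xx responses, which needs handling as well.
using (var resp = (HttpWebResponse)data.request.EndGetResponse(ar))
{
if (resp.StatusCode != HttpStatusCode.OK)
{
BadUrls.Add(data.url);
}
}
//A request has been completed, try to start a new one
TryIssueRequest();
}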
}
/// If urls is not empty, dequeues a url from it
/// and issues a new request to the extracted url.
private bool TryIssueRequest()
{
RequestData rd;
if (urls.TryDequeue(out rd.url))
{
rd.request = CreateRequestTo(rd.url); //TODO implement
rd.request.BeginGetResponse(AsyncRequestHandler, rd);
return true;
}
else
{
return false;
}
}
//Called by a button handler, or something like that
void StartTheRequests()
{
for (int requestCount = 0; requestCount < desiredConcurrency; ++requestCount)
{
if (!TryIssueRequest()) break;
}
}
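The same fixed-concurrency idea can be expressed with async/await instead of Begin/End callbacks. This is a sketch under assumptions: isGoodAsync is a hypothetical probe (a real one might wrap an HTTP request), strings replace UrlInfo, and the names are illustrative.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

static class UrlCheckerSketch
{
    public static async Task<ConcurrentBag<string>> FindBadUrlsAsync(
        ConcurrentQueue<string> urls, int desiredConcurrency,
        Func<string, Task<bool>> isGoodAsync)
    {
        var badUrls = new ConcurrentBag<string>();

        // Each worker keeps pulling URLs until the queue is empty,
        // so at most desiredConcurrency requests are in flight.
        async Task WorkerAsync()
        {
            while (urls.TryDequeue(out string url))
            {
                if (!await isGoodAsync(url).ConfigureAwait(false))
                {
                    badUrls.Add(url);
                }
            }
        }

        await Task.WhenAll(Enumerable.Range(0, desiredConcurrency).Select(_ => WorkerAsync()));
        return badUrls;
    }

    static async Task Main()
    {
        var urls = new ConcurrentQueue<string>(new[] { "ok1", "bad1", "ok2", "bad2" });

        // Simulated probe: URLs starting with "ok" count as good.
        ConcurrentBag<string> bad = await FindBadUrlsAsync(urls, 2, async url =>
        {
            await Task.Delay(20);
            return url.StartsWith("ok");
        });
        Console.WriteLine(bad.Count); // prints 2
    }
}
```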
