Adding List of tasks without executing - c#

I have a method which returns a task, which I want to call multiple times and wait for any 1 of them to be successful. The issue I am facing is as soon as I add the task to the List, it executes and since I added delay to simulate the work, it just block there.
Is there a way to add the the tasks to the list without really executing it and let whenAny execute the tasks.
The below code is from Linqpad editor.
async void Main()
{
var t = new Test();
List<Task<int>> tasks = new List<Task<int>>();
for( int i =0; i < 5; i++)
{
tasks.Add(t.Getdata());
}
var result = await Task.WhenAny(tasks);
result.Dump();
}
public class Test
{
public Task<int> Getdata()
{
"In Getdata method".Dump();
Task.Delay(90000).Wait();
return Task.FromResult(10);
}
}
Update :: Below one makes it clear, I was under the impression that if GetData makes a call to network it will get blocked during the time it actually completes.
async void Main()
{
OverNetwork t = new OverNetwork();
List<Task<string>> websitesContentTask = new List<Task<string>>();
websitesContentTask.Add(t.GetData("http://www.linqpad.net"));
websitesContentTask.Add(t.GetData("http://csharpindepth.com"));
websitesContentTask.Add(t.GetData("http://www.albahari.com/nutshell/"));
Task<string> completedTask = await Task.WhenAny(websitesContentTask);
string output = await completedTask;
Console.WriteLine(output);
}
public class OverNetwork
{
private HttpClient client = new HttpClient();
public Task<string> GetData(string uri)
{
return client.GetStringAsync(uri);
}
}

since I added delay to simulate the work, it just block there
Actually, your problem is that your code is calling Wait, which blocks the current thread until the delay is completed.
To properly use Task.Delay, you should use await:
public async Task<int> Getdata()
{
"In Getdata method".Dump();
await Task.Delay(90000);
return 10;
}
Is there a way to add the the tasks to the list without really executing it and let whenAny execute the tasks.
No. WhenAny never executes tasks. Ever.
It's possible to build a list of asynchronous delegates, i.e., a List<Func<Task>> and execute them later, but I don't think that's what you're really looking for.

There are multiple tasks in your Getdata method. First does delay, but you are returning finished task which returned 10. Try to change your code like this
return Task.Delay(90000).ContinueWith(t => 10)

Related

How to make a task more independent?

Originally I wrote my C# program using Threads and ThreadPooling, but most of the users on here were telling me Async was a better approach for more efficiency. My program sends JSON objects to a server until status code of 200 is returned and then moves on to the next task.
The problem is once one of my Tasks retrieves a status code of 200, it waits for the other Tasks to get 200 code and then move onto the next Task. I want each task to continue it's next task without waiting for other Tasks to finish (or catch up) by receiving a 200 response.
Below is my main class to run my tasks in parallel.
public static void Main (string[] args)
{
var tasks = new List<Task> ();
for (int i = 0; i < 10; i++) {
tasks.Add (getItemAsync (i));
}
Task.WhenAny (tasks);
}
Here is the getItemAsync() method that does the actual sending of information to another server. Before this method works, it needs a key. The problem also lies here, let's say I run 100 tasks, all 100 tasks will wait until every single one has a key.
public static async Task getItemAsync (int i)
{
if (key == "") {
await getProductKey (i).ConfigureAwait(false);
}
Uri url = new Uri ("http://www.website.com/");
var content = new FormUrlEncodedContent (new[] {
...
});
while (!success) {
using (HttpResponseMessage result = await client.PostAsync (url, content)) {
if (result.IsSuccessStatusCode) {
string resultContent = await result.Content.ReadAsStringAsync ().ConfigureAwait(false);
Console.WriteLine(resultContent);
success=true;
}
}
}
await doMoreAsync (i);
}
Here's the function that retrieves the keys, it uses HttpClient and gets parsed.
public static async Task getProductKey (int i)
{
string url = "http://www.website.com/key";
var stream = await client.GetStringAsync (url).ConfigureAwait(false);
var doc = new HtmlDocument ();
doc.LoadHtml (stream);
HtmlNode node = doc.DocumentNode.SelectNodes ("//input[#name='key']") [0];
try {
key = node.Attributes ["value"].Value;
} catch (Exception e) {
Console.WriteLine ("Exception: " + e);
}
}
After every task receives a key and has a 200 status code, it runs doMoreAsync(). I want any individual Task that retrieved the 200 code to run doMoreAsync() without waiting for other tasks to catch up. How do I go about this?
Your Main method is not waiting for a Task to complete, it justs fires off a bunch of async tasks and returns.
If you want to await an async task you can only do that from an async method. Workaround is to kick off an async task from the Main method and wait for its completion using the blocking Task.Wait():
public static void Main(string[] args) {
Task.Run(async () =>
{
var tasks = new List<Task> ();
for (int i = 0; i < 10; i++) {
tasks.Add (getItemAsync (i));
}
var finishedTask = await Task.WhenAny(tasks); // This awaits one task
}).Wait();
}
When you are invoking getItemAsync() from an async method, you can remove the ConfigureAwait(false) as well. ConfigureAwait(false) just makes sure some code does not execute on the UI thread.
If you want to append another task to a task, you can also append it to the previous Task directly by using ContinueWith():
getItemAsync(i).ContinueWith(anotherTask);
Your main problem seems to be that you're sharing the key field across all your concurrently-running asynchronous operations. This will inevitably lead to race hazards. Instead, you should alter your getProductKey method to return each retrieved key as the result of its asynchronous operation:
// ↓ result type of asynchronous operation
public static async Task<string> getProductKey(int i)
{
// retrieval logic here
// return key as result of asynchronous operation
return node.Attributes["value"].Value;
}
Then, you consume it like so:
public static async Task getItemAsync(int i)
{
string key;
try
{
key = await getProductKey(i).ConfigureAwait(false);
}
catch
{
// handle exceptions
}
// use local variable 'key' in further asynchronous operations
}
Finally, in your main logic, use a WaitAll to prevent the console application from terminating before all your tasks complete. Although WaitAll is a blocking call, it is acceptable in this case since you want the main thread to block.
public static void Main (string[] args)
{
var tasks = new List<Task> ();
for (int i = 0; i < 10; i++) {
tasks.Add(getItemAsync(i));
}
Task.WaitAll(tasks.ToArray());
}

Async Await Few Confusions

As experiencing the new async & Await features of 4.5 I want to clear some confusions before going any further. I have been reading different article and also different question on SO and it help me undertands how Async and Await works. I will just try to put my understanding and confusions here and will appreciate if someone code educate me and other people who are looking for same things. I am discussing this in very simple wordings.
So Async is used so that compiler know that method marked by Async contains Await operation (Long operation). Latest framework contains different new builtin methods for Async Operations.
The builtin Async functions like connection.OpenAsync, ExecuteScalarAsync etc are used with Await keyword. I don't know the inner working of these Async Methods but my strong guess is that under the hood they are using Tasks.
Can I make this as general rule that Await will be with any method which implements Task. So if I need to create my own method which is performing long operation then will I create it as Task and when it is called I will use Await Keyword with it?
Second most important thing is that what is the rule of thumb of creating a method as Async or creating it as task. For example,
public void SampleMain()
{
for (int i = 1; i <= 100; i++)
{
DataTable dt = ReadData(int id);
}
}
public DataTable ReadData(int id)
{
DataTable resultDT = new DataTable();
DataTable dt1 = new DataTable();
// Do Operation to Fill DataTable from first connection string
adapter.Fill(dt1);
DataTable dt2 = new DataTable();
// Do Operation to Fill DataTable from first connection string
adapter.Fill(dt2);
// Code for combining datatable and returning the resulting datatable
// Combine DataTables
return resultDT;
}
public string GetPrimaryConnectionString()
{
// Retrieve connection string from some file io operations
return "some primary connection string";
}
public string GetSecondaryConnectionString()
{
// Retrieve connection string from some file io operations
return "some secondaryconnection string";
}
Now this is a very simple scenario that I have created based on some real world application I worked in past. So I was just wondering how to make this whole process Async.
Should I make GetPrimaryConnectionString and GetSecondaryConnectionString as Tasks and Await them in ReadData. Will ReadData be also a Task? How to call ReadData in the SampleMain function?
Another way could be to create a Task for ReadData in SampleMain and run that Task and skip converting other methods as Task. Is this the good approach? Will it be truly Asynchronous?
So Async is used so that compiler know that method marked by Async
contains Await operation
The async is used so that the compiler will have an indication to create a state-machine out of the method. An async method can have no await, and still work, though it will execute completely synchronously.
The builtin Async functions like connection.OpenAsync,
ExecuteScalarAsync etc are used with Await keyword. I don't know the
inner working of these Async Methods but my strong guess is that under
the hood they are using Tasks.
Task is a promise of work to be completed in the future. There are a couple of ways to create a Task. But, Task isn't the only thing that can be represent an asynchronous operation. You can create an awaitable yourself if you wanted, all it needs it to implement a GetAwaiter method which returns a type implementing INotifyCompletion.
If you want to know how a method is implemented in the framework, you can view the source. In this particular case, they use TaskCompletionSource<T>.
Should I make GetPrimaryConnectionString and
GetSecondaryConnectionString as Tasks and Await them in ReadData. Will
ReadData be also a Task? How to call ReadData in the SampleMain
function?
There is nothing inherently asynchronous about retrieving a connection string. You usually (not always) use async-await with naturally async IO operations. In this particular case, the only actual async operation is ReadData, and if you want to make it asynchronous, you can use SqlDataReader, which exposes async methods.
An example, taken from the ADO.NET teams blog:
public static async Task<Product> GetProductAndReviewsAsync(
int productID, int reviewsToGet)
{
using (SqlConnection connection = new SqlConnection(ConnectionString))
{
await connection.OpenAsync();
const string commandString = GetProductByIdCommand + ";"
+ GetProductReviewsPagedById;
using (SqlCommand command = new SqlCommand(commandString, connection))
{
command.Parameters.AddWithValue("productid", productID);
command.Parameters.AddWithValue("reviewStart", 0);
command.Parameters.AddWithValue("reviewCount", reviewsToGet);
using (SqlDataReader reader = await command.ExecuteReaderAsync())
{
if (await reader.ReadAsync())
{
Product product = GetProductFromReader(reader, productID);
if (await reader.NextResultAsync())
{
List<Review> allReviews = new List<Review>();
while (await reader.ReadAsync())
{
Review review = GetReviewFromReader(reader);
allReviews.Add(review);
}
product.Reviews = allReviews.AsReadOnly();
return product;
}
else
{
throw new InvalidOperationException(
"Query to server failed to return list of reviews");
}
}
else
{
return null;
}
}
}
}
}
how to make this whole process Async
If there are asynchronous versions of adapter.Fill, then simply await for it in ReadData, which in turn also become async and you can await for it in the caller method:
// in async button click event
button.Enabled = false;
var dt = await ReadData(int id);
button.Enabled = true;
... // do something with dt
public async Task<DataTable> ReadData(int id)
{
...
var job1 = adapter.AsyncFill(dt1);
var job2 = adapter.Fill(dt2);
// wait for all of them to finish
Task.WaitAll(new[] {job1, job2});
...
return Task.FromResult(resultDT); // dump approach
}
If there are no asynchronous version then you have to create them (by using Task):
// in async button click event
button.Enabled = false;
// run synchronous task asynchronously
var dt = await Task.Run(() => ReadData(int id));
button.Enabled = true;
... // do something with dt
async/await shines when it comes to UI, otherwise (if no UI is involved) just create task and run synchronous operation there.
The only reason to use async-await is if your main thread might do something useful while another thread is doing the length operation. If the main thread would start the other thread and only wait for the other thread to finish, it is better to let the main thread do the action.
One of the things a main thread quite often does is keep the UI responsive.
You are right, under the hood async-await uses Task, hence you see that an async function returns a Task.
The rules:
If a function would return void, the async version returns Task. If the function would return TResult, the async version should return Task<TResult>.
There is one exception: the async event handler returns void.
The return value of await Task is void. The return value of await Task<TResult> is TResult.
Only async functions can call other async functions.
If you have a non-async function you can still use the async function. However you cannot use await. Use the Task return value of the async function and the System.Threading.Tasks.Task methods to wait for the results.
If you have an async function and want to start a non-async function in a separate thread, use:
private int SlowCalculation(int a, int b)
{
System.Threading.Thread.Sleep(TimeSpan.FromSeconds(5));
return a + b;
}
private async Task CalculateAsync(int a, int b)
{
Task myTask = Task.Run( () => SlowCalculation(a, b);
// while SlowCalcuation is calculating slowly, do other useful things
// after a while you need the answer
int sum = await myTask;
return sum;
}
See that the return of await Task<int> is int.
Some people used to use functions like Task.ContinueWith. Because of the await statement that is not needed anymore. Await makes sure that the task is finished. The statement after the await is what you'd normally do in the ContinueWith.
In Task.ContinueWith you could say: "do this only if the task failed". The async-await equivalent for this is try-catch.
Remember: if your thread has nothing useful to do (like keeping your UI responsive), don't use async-await
Starting several tasks in async-await and waiting for them to finish is done as follows:
private async Task MyAsyncFunction(...)
{
var tasks = new List<Task<int>>();
for (int i=0; i<10; ++i)
{
tasks.Add(CalculateAsync(i, 2*i);
}
// while all ten tasks are slowly calculating do something useful
// after a while you need the answer, await for all tasks to complete:
await Task.WhenAll(tasks);
// the result is in Task.Result:
if (task[3].Result < 5) {...}
}
The async-await version of Task.Waitall is Task.WhenAll. WhenAll returns a Task instead of void, so you can await for it. The main thread remains responsive even while awaiting.
The main thread is not the case when using Task.WaitAll, because you don't await.

Task List with parameters

I'm needing to create a list of tasks to execute a routine that takes one parameter and then wait for those tasks to complete before continuing with the rest of the program code. Here is an example:
List<Task> tasks = new List<Task>();
foreach (string URL in LIST_URL_COLLECTION)
{
tasks[i] = Task.Factory.StartNew(
GoToURL(URL)
);
}
//wait for them to finish
Console.WriteLine("Done");
I've have googled and searched this site but I just keep hitting a dead end, I did this once but can't remember how.
The Task Parallel Library exposes a convinent way to asynchronously wait for the completion of all tasks via the Task.WhenAll method. The method returns a Task by itself which is awaitable and should be awaited:
public async Task QueryUrlsAsync()
{
var urlFetchingTasks = ListUrlCollection.Select(url => Task.Run(url));
await Task.WhenAll(urlFetchingTasks);
Console.WriteLine("Done");
}
Note that in order to await, your method must be marked with the async modifier in the method signature and return either a Task (if it has no return value) or a Task<T> (if it does have a return value, which type is T).
As a side note, your method looks like it's fetching urls, which i am assuming is generating a web request to some endpoint. In order to do that, there's no need to use extra threads via Task.Factory.StartNew or Task.Run, as these operations are naturally asynchronous. You should look into HttpClient as a starting point. For example, your method could look like this:
public async Task QueryUrlsAsync()
{
var urlFetchingTasks = ListUrlCollection.Select(url =>
{
var httpClient = new HttpClient();
return httpClient.GetAsync(url);
});
await Task.WhenAll(urlFetchingTasks);
Console.WriteLine("Done");
}

C# tasks are executed before Task.WhenAll

Why the tasks are executed before Task.WhenAll??
If you see here, from the below code snippet, first Console.WriteLine("This should be written first.."); should be printed because I am awaiting the tasks beneath to it..
But if you see the output result, the Tasks method result is being printed before the above statement. Ideally, the tasks method should be executed when I await them, but it seems that- the tasks methods are executed the moment I add them in tasks list. Why is it so?
Would you please do let me know why is this happening??
Code:
public static async Task Test()
{
var tasks = new List<Task>();
tasks.Add(PrintNumber(1));
tasks.Add(PrintNumber(2));
tasks.Add(PrintNumber(3));
Console.WriteLine("This should be written first..");
// This should be printed last..
await Task.WhenAll(tasks);
}
public static async Task PrintNumber(int number)
{
await Task.FromResult(0);
Console.WriteLine(number);
}
Output
When you call an async method you get a "hot" task in return. That means that the task already started running (and maybe even completed) before you get to await them. That means that it's quite possible for the tasks to run and complete before the call to Task.WhenAll.
In your case however, while the PrintNumber is marked async it isn't asynchronous at all since you're using Task.FromResult. The synchronous part of an asynchronous method (which is the part until you await an asynchronous task) is always executed synchronously on the calling thread and is done before the call returns. When you use Task.FromResult you get a completed task so all your method is just the synchronous part and is completed before the call returns.
When you await a completed task (as is created by Task.FromResult, it completes synchronously. This means that in your example, nothing is actually happening asynchronously, which explains the order of execution.
If instead, you were to
await Task.Yield();
you'd see output more in line with your expectations.
Task.FromResult won't cause yield and the task will be executed on the same thread. To achieve what you want you can do this:
public static async Task Test()
{
var tasks = new List<Task>();
tasks.Add(PrintNumber(1));
tasks.Add(PrintNumber(2));
tasks.Add(PrintNumber(3));
Console.WriteLine("This should be written first..");
// This should be printed last..
await Task.WhenAll(tasks);
}
public static async Task PrintNumber(int number)
{
await Task.Yield();
Console.WriteLine(number);
}
If you want a Task or tasks to run after something else, its easiest to write your code accordingly.
public static async Task Test()
{
Console.WriteLine("This should be written first..");
// These should be printed last..
await Task.WhenAll(new[]
{
PrintNumber(1),
PrintNumber(2),
PrintNumber(3)
});
}
following on from your comment.
So we have some functions,
async Task<Customer> GetRawCustomer()
{
...
}
async Task<string> GetCity(Customer customer)
{
...
}
async Task<string> GetZipCode(Customer customer)
{
...
}
We could use them like this
var rawCustomer = await GetRawCustomer();
var populationWork = new List<Task>();
Task<string> getCity;
if (string.IsNullOrWhiteSpace(rawCustomer.City))
{
getCity = GetCity(rawCustomer);
populationWork.Add(getCity);
}
Task<string> getZipCode;
if (string.IsNullOrWhiteSpace(rawCustomer.City))
{
getZipCode = GetZipCode(rawCustomer);
populationWork.Add(getZipCode);
}
...
await Task.WhenAll(populationWork);
if (getCity != null)
rawCustomer.City = getCity.Result;
if (getZipCode != null)
rawCustomer.ZipCode = getZipCode.Result;

Calling an async method from a synchronous method

I am attempting to run async methods from a synchronous method. But I can't await the async method since I am in a synchronous method. I must not be understanding TPL as this is the fist time I'm using it.
private void GetAllData()
{
GetData1()
GetData2()
GetData3()
}
Each method needs the previous method to finish as the data from the first is used for the second.
However, inside each method I want to start multiple Task operations in order to speed up the performance. Then I want to wait for all of them to finish.
GetData1 looks like this
internal static void GetData1 ()
{
const int CONCURRENCY_LEVEL = 15;
List<Task<Data>> dataTasks = new List<Task<Data>>();
for (int item = 0; item < TotalItems; item++)
{
dataTasks.Add(MyAyncMethod(State[item]));
}
int taskIndex = 0;
//Schedule tasks to concurency level (or all)
List<Task<Data>> runningTasks = new List<Task<Data>>();
while (taskIndex < CONCURRENCY_LEVEL && taskIndex < dataTasks.Count)
{
runningTasks.Add(dataTasks[taskIndex]);
taskIndex++;
}
//Start tasks and wait for them to finish
while (runningTasks.Count > 0)
{
Task<Data> dataTask = await Task.WhenAny(runningTasks);
runningTasks.Remove(dataTask);
myData = await dataTask;
//Schedule next concurrent task
if (taskIndex < dataTasks.Count)
{
runningTasks.Add(dataTasks[taskIndex]);
taskIndex++;
}
}
Task.WaitAll(dataTasks.ToArray()); //This probably isn't necessary
}
I am using await here but get an Error
The 'await' operator can only be used within an async method. Consider
marking this method with the 'async' modifier and changing its return
type to 'Task'
However, if I use the async modifier this will be an asynchronous operation. Therefore, if my call to GetData1 doesn't use the await operator won't control go to GetData2 on the first await, which is what I am trying to avoid? Is it possible to keep GetData1 as a synchronous method that calls an asynchronous method? Am I designing the Asynchronous method incorrectly? As you can see I'm quite confused.
This could be a duplicate of How to call asynchronous method from synchronous method in C#? However, I'm not sure how to apply the solutions provided there as I'm starting multiple tasks, want to WaitAny, do a little more processing for that task, then wait for all tasks to finish before handing control back to the caller.
UPDATE
Here is the solution I went with based on the answers below:
private static List<T> RetrievePageTaskScheduler<T>(
List<T> items,
List<WebPageState> state,
Func<WebPageState, Task<List<T>>> func)
{
int taskIndex = 0;
// Schedule tasks to concurency level (or all)
List<Task<List<T>>> runningTasks = new List<Task<List<T>>>();
while (taskIndex < CONCURRENCY_LEVEL_PER_PROCESSOR * Environment.ProcessorCount
&& taskIndex < state.Count)
{
runningTasks.Add(func(state[taskIndex]));
taskIndex++;
}
// Start tasks and wait for them to finish
while (runningTasks.Count > 0)
{
Task<List<T>> task = Task.WhenAny(runningTasks).Result;
runningTasks.Remove(task);
try
{
items.AddRange(task.Result);
}
catch (AggregateException ex)
{
/* Throwing this exception means that if one task fails
* don't process any more of them */
// https://stackoverflow.com/questions/8853693/pattern-for-implementing-sync-methods-in-terms-of-non-parallel-task-translating
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(
ex.Flatten().InnerExceptions.First()).Throw();
}
// Schedule next concurrent task
if (taskIndex < state.Count)
{
runningTasks.Add(func(state[taskIndex]));
taskIndex++;
}
}
return items;
}
Task<TResult>.Result (or Task.Wait() when there's no result) is similar to await, but is a synchronous operation. You should change GetData1() to use this. Here's the portion to change:
Task<Data> dataTask = Task.WhenAny(runningTasks).Result;
runningTasks.Remove(dataTask);
myData = gameTask.Result;
First, I recommend that your "internal" tasks not use Task.Run in their implementation. You should use an async method that does the CPU-bound portion synchronously.
Once your MyAsyncMethod is an async method that does some CPU-bound processing, then you can wrap it in a Task and use parallel processing as such:
internal static void GetData1()
{
// Start the tasks
var dataTasks = Enumerable.Range(0, TotalItems)
.Select(item => Task.Run(() => MyAyncMethod(State[item]))).ToList();
// Wait for them all to complete
Task.WaitAll(dataTasks);
}
Your concurrency limiting in your original code won't work at all, so I removed it for simpilicity. If you want to apply a limit, you can either use SemaphoreSlim or TPL Dataflow.
You can call the following:
GetData1().Wait();
GetData2().Wait();
GetData3().Wait();

Categories

Resources