Task.Run(), passing parameter to - c#

Consider the following code:
attempt = 0;
for (int counter = 0; counter < 8; counter++)
{
if (attempt < totalitems)
{
Tasklist<output>.Add(Task.Run(() =>
{
return someasynctask(inputList[attempt]);
}));
}
else
{
break;
}
attempt++;
}
await Task.WhenAll(Tasklist).ConfigureAwait(false);
I want to have for example 8 concurrent tasks, each working on different inputs concurrently, and finally check the result, when all of them have finished.
Because I'm not awaiting the completion of Task.Run(), attempt is incremented before the tasks start, and by the time a task runs, some items in inputList may be skipped or processed two or more times (because the value of attempt is uncertain).
How can I do that?

The problem lies in the use of a lambda: when Task.Run(() => someasynctask(inputList[attempt])) is reached during execution, the variable attempt is captured, not its value (i.e. it is a "closure"). Consequently, when the lambda gets executed, the value of the variable at that specific moment will be used.
Just add a temporary copy of the variable before your lambda, and use that. E.g.
if (attempt < totalitems)
{
int localAttempt = attempt;
Tasklist<output>.Add(Task.Run(() =>
{
return someasynctask(inputList[localAttempt]);
}));
}
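To make the fix concrete, here is a minimal, self-contained sketch of the whole loop with the per-iteration copy. ProcessAsync is a hypothetical stand-in for someasynctask, and the int inputs and the RunAllAsync name are assumptions made only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class CaptureDemo
{
    // Hypothetical stand-in for the question's someasynctask.
    private static async Task<int> ProcessAsync(int input)
    {
        await Task.Delay(10);
        return input * 2;
    }

    // Starts at most maxConcurrent tasks, one per input, and waits for all of them.
    public static async Task<int[]> RunAllAsync(IReadOnlyList<int> inputList, int maxConcurrent)
    {
        var tasks = new List<Task<int>>();
        for (int attempt = 0; attempt < maxConcurrent && attempt < inputList.Count; attempt++)
        {
            int localAttempt = attempt; // fresh variable for each iteration
            tasks.Add(Task.Run(() => ProcessAsync(inputList[localAttempt])));
        }
        // Results come back in the same order as the tasks were added.
        return await Task.WhenAll(tasks).ConfigureAwait(false);
    }

    public static async Task Main()
    {
        int[] results = await RunAllAsync(new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }, 8);
        Console.WriteLine(string.Join(", ", results)); // 2, 4, 6, 8, 10, 12, 14, 16
    }
}
```

Because each lambda closes over its own localAttempt, it no longer matters when the thread pool actually starts the work.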

Thanks to @gobes for his answer:
Try this:
attempt = 0;
for (int counter = 0; counter < 8; counter++)
{
if (attempt < totalitems)
{
int tmpAttempt = attempt; // copy taken outside the lambda, once per iteration
Tasklist<output>.Add(Task.Run(() =>
{
return someasynctask(inputList[tmpAttempt]);
}));
}
else
{
break;
}
attempt++;
}
await Task.WhenAll(Tasklist).ConfigureAwait(false);
Actually, what the compiler is doing is extracting your lambda into a method, located in an automagically generated class, which holds the attempt variable. This is the important point: the generated code only references the variable stored in that class; it doesn't copy it. So every change to attempt is seen by the method.
What happens during the execution is roughly this:
enter the loop with attempt = 0
add a call of the lambda-like-method to your tasklist
increase attempt
repeat
After the loop, you have some method calls waiting (no pun intended) to be executed, but each one references THE SAME VARIABLE and therefore sees the same value: the last one assigned to it.
For more details, I really recommend reading C# in depth, or some book of the same kind - there are a lot of resources about closure in C# on the web :)
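As a rough illustration of the generated class described above, here is a hand-written approximation. The Captured class and its member names are invented for this sketch; the real compiler output uses different, unspeakable names, but the idea of a field shared by all delegates is the same.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureSketch
{
    // Hand-written approximation of the compiler-generated closure class.
    private sealed class Captured
    {
        public int attempt;           // the captured variable lives here as a field
        public int Read() => attempt; // the lambda body becomes a method
    }

    // One Captured instance for the whole loop: every delegate reads the same field.
    public static List<int> SharedVariable()
    {
        var shared = new Captured();
        var calls = new List<Func<int>>();
        for (shared.attempt = 0; shared.attempt < 3; shared.attempt++)
        {
            calls.Add(shared.Read);
        }
        var results = new List<int>();
        foreach (var call in calls) results.Add(call());
        return results; // 3, 3, 3
    }

    // One Captured instance per iteration: this is what the local copy achieves.
    public static List<int> CopyPerIteration()
    {
        var calls = new List<Func<int>>();
        for (int attempt = 0; attempt < 3; attempt++)
        {
            var perIteration = new Captured { attempt = attempt };
            calls.Add(perIteration.Read);
        }
        var results = new List<int>();
        foreach (var call in calls) results.Add(call());
        return results; // 0, 1, 2
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", SharedVariable()));   // 3, 3, 3
        Console.WriteLine(string.Join(", ", CopyPerIteration())); // 0, 1, 2
    }
}
```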

Related

starting tasks with lambda expressions in loops in C#

In the preparation for a C# exam at university I found the following multiple choice question:
Client applications call your library by passing a set of operations
to perform. Your library must ensure that system resources are most
effectively used. Jobs may be scheduled in any order, but your
library must log the position of each operation. You have declared this
code:
public IEnumerable<Task> Execute(Action[] jobs)
{
var tasks = new Task[jobs.Length];
for (var i = 0; i < jobs.Length; i++)
{
/* COMPLETION NEEDED */
}
return tasks;
}
public void RunJob(Action job, int index)
{
// implementation omitted
}
Complete the method by inserting code in the for loop. Choose the
correct answer.
1.)
tasks[i] = new Task((idx) => RunJob(jobs[(int)idx], (int)idx), i);
tasks[i].Start();
2.)
tasks[i] = new Task(() => RunJob(jobs[i], i));
tasks[i].Start();
3.)
tasks[i] = Task.Run(() => RunJob(jobs[i], i));
I have opted for answer 3 since Task.Run() queues the specified work on the thread pool and returns a Task object that represents the work.
But the correct answer was 1, using the Task(Action, Object) constructor. The explanation says the following:
In answer 1, the second argument to the constructor is passed as the
only argument to the Action delegate. The current value of the
i variable is captured when the value is boxed and passed to the Task
constructor.
Answer 2 and 3 use a lambda expression that captures the i variable
from the enclosing method. The lambda expression will probably return
the final value of i, in this case 10, before the operating system
preempts the current thread and begins every task delegate created by
the loop. The exact value cannot be determined because the OS
schedules thread execution based on many factors external to your
program.
While I perfectly understand the explanation of answer 1, I don't get the point in the explanations for answer 2 and 3. Why would the lambda expression return the final value?
In options 2 and 3 the lambda captures the original i variable used in the for loop. There is no guarantee about when the tasks will run on the thread pool. So one possible behavior is: the for loop finishes with i equal to jobs.Length (10 in the explanation's example), and only then do the tasks start executing, so all of them use that final value.
You can see similar behavior here:
void Do()
{
var actions = new List<Action>();
for (int i = 0; i < 3; i++)
{
actions.Add(() => Console.WriteLine(i));
}
//actions executed after loop is finished
foreach(var a in actions)
{
a();
}
}
Output is:
3
3
3
You can fix it like this:
for (int i = 0; i < 3; i++)
{
var local = i;
actions.Add(() => Console.WriteLine(local));
}
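For comparison, option 1 from the exam question avoids the capture altogether by handing the index to the task as its state object. The sketch below assumes a ConcurrentBag just to record which values the tasks saw; RunWithState is a made-up name for the demo.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class StateArgumentDemo
{
    // Starts count tasks and returns the sorted list of indexes they observed.
    public static int[] RunWithState(int count)
    {
        var seen = new ConcurrentBag<int>();
        var tasks = new Task[count];
        for (var i = 0; i < count; i++)
        {
            // The current value of i is boxed and stored in the task right here;
            // the lambda receives it as a parameter and captures no loop variable.
            tasks[i] = new Task(state => seen.Add((int)state), i);
            tasks[i].Start();
        }
        Task.WaitAll(tasks);
        return seen.OrderBy(x => x).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunWithState(5))); // 0, 1, 2, 3, 4
    }
}
```

Every index shows up exactly once, regardless of when the scheduler gets around to running each task.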

Is there really a valid code-path where this function won't return a value?

I have the below function that iterates over a list of workers, invoking their DoStuff() method. If the first worker fails, I try the next one until I'm out of workers. If they all fail, I re-throw the last exception.
// workers is an IList<>.
public object TryDoStuff()
{
for (int i = 0; i < workers.Count; i++)
{
try
{
return workers[i].DoStuff();
}
catch
{
if (i == workers.Count - 1)
{
throw; // This preserves the stack trace
}
else
{
continue; // Try the next worker
}
}
}
}
When compiling, I get an error that "not all code paths return a value" for this function. Although I can silence the error by adding an explicit return after the for loop, I'm doubting the compiler is accurate here as I don't see how the for loop will be escaped without either returning or re-throwing an exception. And if an exception is re-thrown, not returning a value is valid.
What am I missing? Is csc unable to reason about the conditional in the catch block?
Yes, there is.
If an exception is thrown on the last index and Count isn't what you expect it to be (unlikely, yet possible), the loop ends without returning or throwing.
Or, as RAM pointed out, if Count is zero the loop body never runs at all.
In these cases, the static analysis and the resulting compiler error are fully justified.
As previously mentioned, if workers is empty (Count is 0), there's no valid return path.
There's also another race condition (depending on the full context, obviously) where workers is not empty, an exception is thrown on an element, there are still elements to iterate in workers, but after evaluating if (i == workers.Count - 1) and before the continue statement executes, another thread removes elements from workers (or changes the entire workers variable to a new instance).
In that scenario, the for condition will return false on the next iteration unexpectedly and you'll fall out of the loop with no return statement for the method.
public object TryDoStuff()
{
for (int i = 0; i < workers.Count; i++)
{
try
{
return workers[i].DoStuff();
}
catch
{
if (i == workers.Count - 1)
{
throw; // This preserves the stack trace
}
else
{
// XXX If workers is changed by another thread here. XXX
continue; // Try the next worker
}
}
}
}
As I wrote in a comment:
What will happen if the workers list contains zero items?
The compiler does not analyze your code any more deeply than that. :)
That possibility alone is enough for the compiler to report the error below:
not all code paths return a value
When the compiler encounters a loop whose condition is not a compile-time constant, it assumes the condition may be false on the first check, so the loop body may never run, and it therefore expects a return (or throw) after the loop as well.
This holds even when we write the condition so that the loop body is certain to execute!
Proof:
With error:
public static object TryDoStuff()
{
var result =0;
for (int i = 0; i < 3; i++)
{
Console.WriteLine("Add 100 unit");
result += 100;
return result;
}
//Console.WriteLine("last line");
// return result;
}
Without error:
public static object TryDoStuff()
{
var result =0;
for (int i = 0; i < 3; i++)
{
Console.WriteLine("Add 100 unit");
result += 100;
// return result; you can un-comment this line too
}
Console.WriteLine("last line");
return result;
}
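One way to restructure the original TryDoStuff so that every path visibly returns or throws is sketched below. It models the workers as Func<object> delegates, which is an assumption (the question's worker type is not shown), and it uses ExceptionDispatchInfo to keep the original stack trace when rethrowing outside the catch block.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.ExceptionServices;

public static class WorkerRunner
{
    // Returns the first successful result; rethrows the last failure if all workers fail.
    public static object TryDoStuff(IList<Func<object>> workers)
    {
        ExceptionDispatchInfo lastFailure = null;
        for (int i = 0; i < workers.Count; i++)
        {
            try
            {
                return workers[i]();
            }
            catch (Exception ex)
            {
                // Remember the failure and move on to the next worker.
                lastFailure = ExceptionDispatchInfo.Capture(ex);
            }
        }
        if (lastFailure != null)
        {
            lastFailure.Throw(); // rethrows with the original stack trace
        }
        // Reached only when the list was empty; this also satisfies the compiler.
        throw new InvalidOperationException("No workers were available.");
    }

    public static void Main()
    {
        var workers = new List<Func<object>>
        {
            () => throw new InvalidOperationException("first worker failed"),
            () => "second worker succeeded",
        };
        Console.WriteLine(TryDoStuff(workers)); // second worker succeeded
    }
}
```

The loop no longer depends on comparing i with Count inside the catch block, so an empty list or a list modified by another thread both end in a well-defined exception.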

Task.Run() in console application print weird result [duplicate]

This question already has answers here:
Captured variable in a loop in C#
I am trying to run several tasks at the same time and I came across an issue I can't seem to be able to understand nor solve.
I used to have a function like this :
private async void DoThings(int index, bool b) {
await SomeAsynchronousTasks();
var item = items[index];
item.DoSomeProcessing();
if(b)
AVolatileList[index] = item; //volatile or not, it does not work
else
AnotherVolatileList[index] = item;
}
That I wanted to call in a for loop using Task.Run(). However I could not find a way to send parameters to this Action<int, bool> and everyone recommends using lambdas in similar cases:
for(int index = 0; index < MAX; index++) { //let's say that MAX equals 400
bool b = CheckSomething();
Task.Run(async () => {
await SomeAsynchronousTasks();
var item = items[index]; //here, index is always evaluated at 400
item.DoSomeProcessing();
if(b)
AVolatileList[index] = item; //volatile or not, it does not work
else
AnotherVolatileList[index] = item;
});
}
I thought using local variables in lambdas would "capture" their values but it looks like it does not; it will always take the value of index as if the value would be captured at the end of the for loop. The index variable is evaluated at 400 in the lambda at each iteration so of course I get an IndexOutOfRangeException 400 times (items.Count is actually MAX).
I am really not sure about what is happening here (though I am really curious about it) and I don't know how to do what I am trying to achieve either. Any hints are welcome!
Make a local copy of your index variable:
for(int index = 0; index < MAX; index++) {
bool b = CheckSomething();
var localIndex = index; // fresh variable for each iteration
Task.Run(async () => {
await SomeAsynchronousTasks();
var item = items[localIndex];
item.DoSomeProcessing();
if(b)
AVolatileList[localIndex] = item;
else
AnotherVolatileList[localIndex] = item;
});
}
This is due to the way C# does a for loop: there is only one index variable that is updated, and all your lambdas are capturing that same variable (with lambdas, variables are captured, not values).
As a side note, I recommend that you:
Avoid async void. You can never know when an async void method completes, and they have difficult error handling semantics.
await all of your asynchronous operations. I.e., don't ignore the task returned from Task.Run. Use Task.WhenAll or the like to await them. This allows exceptions to propagate.
For example, here's one way to use WhenAll:
var tasks = Enumerable.Range(0, MAX).Select(index => {
bool b = CheckSomething();
return Task.Run(async () => {
await SomeAsynchronousTasks();
var item = items[index]; // index is a lambda parameter, so each call has its own copy
item.DoSomeProcessing();
if(b)
AVolatileList[index] = item;
else
AnotherVolatileList[index] = item;
});
});
await Task.WhenAll(tasks);
All your lambdas capture the same variable, which is your loop variable. However, the lambdas typically run only after the loop has finished. At that point the loop variable holds its final value (MAX), hence all your lambdas use it.
Stephen Cleary shows in his answer how to fix it.
Eric Lippert wrote a detailed two-part series about this.

Index out of range error when using a list as a parameter for thread start routine

I am writing a C# program that needs to pass parameters to a function so that it runs properly on a separate thread. Specifically, one of the parameters is the name of a file that the function is supposed to access. I store the file names in a list and read the value from the list. However, when I do this I get an index out of range error after one or two threads are created. I think that this list of strings is my issue, but I know that the index is not out of range.
I am not sure if I am doing something wrong with the way I am passing in the parameters or what else could be wrong.
Here is a sample of my C# code (excluding the code for the functions called):
for (int i = 0; i < 5; i++)
{
surfaceGraphDataNames.Add(String.Format(surfacePlotDataLocation+"ThreadData{0}.txt", i));
try
{
generateInputFile(masterDataLocation);
}
catch
{
MessageBox.Show("Not enough data remaining to create an input file");
masterDataLocation = masterDataSet.Count - ((graphData.NumRootsUsed + 1) * (graphData.Polynomial + 1) - 1);
this.dataSetLabel.Text = String.Format("Current Data Set: {0}", masterDataLocation + 1);
return;
}
try
{
//creates the data in a specific text file I hope
createSurfaceGraph(surfaceGraphDataNames[i]);
//start threads
threadsRunning.Add(new Thread(() => runGnuplotClicks(surfaceGraphDataNames[i], masterDataLocation)));
threadsRunning[i].Start();
}
catch
{
this.graphPictureBox1.Image = null;//makes image go away if data fails
MessageBox.Show("Gridgen failed to generate good data");
}
masterDataLocation++;
}
It looks like you have to do something like this:
threadsRunning.Add(new Thread(() => {
var k = i;
runGnuplotClicks(surfaceGraphDataNames[k], masterDataLocation);
}
));
The reason is that reading the shared variable i from the thread is not safe: the threads run nearly simultaneously with the loop, so by the time a thread reads i it may already have been incremented, while the matching item has not yet been added to surfaceGraphDataNames, and the exception is thrown.
Here is the sequence that leads to the exception:
for(int i = 0; i < 5; i++){
//Suppose i is increased to 3 at here
//Here is where your Thread running code which accesses to the surfaceGraphDataNames[i]
//That means it's out of range at this time because
//the surfaceGraphDataNames has not been added with new item by the code below
surfaceGraphDataNames.Add(String.Format(surfacePlotDataLocation+"ThreadData{0}.txt", i));
//....
}
UPDATE
It looks like the code above cannot work either, because i may be incremented before the thread's start routine actually runs; the copy has to be made outside the lambda. I think you can do this to make it safer:
var j = i;
threadsRunning.Add(new Thread(() => {
var k = j;
runGnuplotClicks(surfaceGraphDataNames[k], masterDataLocation);
}
));
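Synchronization attempt (note that Queue<T> is not thread-safe, so the concurrent Dequeue calls would need a lock or a ConcurrentQueue<int>, and the dequeued value is not guaranteed to match the thread's own iteration):
Queue<int> q = new Queue<int>();
for(int i = 0; i < 5; i++){
//.....
q.Enqueue(i);
threadsRunning.Add(new Thread(() => {
runGnuplotClicks(surfaceGraphDataNames[q.Dequeue()], masterDataLocation);
}
));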
threadsRunning[i].Start();
}
I had a problem like this when I used Thread. I was sure the index was not out of range, and the problem did not occur when I paused at a breakpoint and then continued.
Try using Task instead of Thread; that worked for me.
The most obvious problem is that you are closing over the loop variable. When you construct a lambda expression any variable references are to the variable itself and not its value. Consider the following code taken from your example.
for (int i = 0; i < 5; i++)
{
// Code omitted for brevity.
new Thread(() => runGnuplotClicks(surfaceGraphDataNames[i], masterDataLocation));
// Code omitted for brevity.
}
What this is actually doing is capturing the variable i. But by the time the thread starts executing, i could have been incremented several times, possibly (and even likely) to the point where its value is now 5. It is possible that the IndexOutOfRangeException is being thrown because surfaceGraphDataNames does not have 6 slots. Never mind the fact that your thread is not using the value of i that you thought it was.
To fix this you need to create a special capturing variable.
for (int i = 0; i < 5; i++)
{
// Code omitted for brevity.
int capture = i;
new Thread(() => runGnuplotClicks(surfaceGraphDataNames[capture], masterDataLocation));
// Code omitted for brevity.
}
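Another option, sketched here under some assumptions, is to skip the capture and pass the index as the thread's start argument via Thread.Start(object). The string processing below is only a placeholder for runGnuplotClicks, and ThreadStateDemo.Run is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ThreadStateDemo
{
    // Starts one thread per name and returns what each thread produced, by index.
    public static string[] Run(IReadOnlyList<string> names)
    {
        var results = new string[names.Count];
        var threads = new List<Thread>();
        for (int i = 0; i < names.Count; i++)
        {
            // The lambda receives its index as a parameter (ParameterizedThreadStart),
            // so it never reads the loop variable i.
            var thread = new Thread(state =>
            {
                int index = (int)state;
                results[index] = "processed " + names[index]; // placeholder for the real work
            });
            threads.Add(thread);
            thread.Start(i); // the current value of i is boxed and handed over here
        }
        foreach (var t in threads) t.Join();
        return results;
    }

    public static void Main()
    {
        var output = Run(new[] { "ThreadData0.txt", "ThreadData1.txt", "ThreadData2.txt" });
        Console.WriteLine(string.Join("; ", output));
        // processed ThreadData0.txt; processed ThreadData1.txt; processed ThreadData2.txt
    }
}
```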

For loop goes out of range [duplicate]

This question already has answers here:
Captured variable in a loop in C#
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
MyClass myClass = new MyClass();
myClass.StartTasks();
}
}
class MyClass
{
int[] arr;
public void StartTasks()
{
arr = new int[2];
arr[0] = 100;
arr[1] = 101;
for (int i = 0; i < 2; i++)
{
Task.Factory.StartNew(() => WorkerMethod(arr[i])); // IndexOutOfRangeException: i==2!!!
}
}
void WorkerMethod(int i)
{
}
}
}
It seems that i++ gets executed one more time before the loop iteration is finished. Why do I get the IndexOutOfRangeException?
You are closing over loop variable. When it's time for WorkerMethod to get called, i can have the value of two, not the value of 0 or 1.
When you use closures it's important to understand that you are not using the value that the variable has at the moment, you use the variable itself. So if you create lambdas in loop like so:
for(int i = 0; i < 2; i++) {
actions[i] = () => { Console.WriteLine(i); };
}
and later execute the actions, they all will print "2", because that's what the value of i is at the moment.
Introducing a local variable inside the loop will solve your problem:
for (int i = 0; i < 2; i++)
{
int index = i;
Task.Factory.StartNew(() => WorkerMethod(arr[index]));
}
<Resharper plug> That's one more reason to try Resharper - it gives a lot of warnings that help you catch the bugs like this one early. "Closing over a loop variable" is amongst them </Resharper plug>
The reason is that you are using a loop variable inside a parallel task. Because tasks can execute concurrently the value of the loop variable may be different to the value it had when you started the task.
You started the task inside the loop. By the time the task comes to querying the loop variable the loop has ended because the variable i is now beyond the stop point.
That is:
i = 2 and the loop exits.
The task uses variable i (which is now 2)
You should use Parallel.For to perform a loop body in parallel.
Alternatively, if you want to maintain your current structure, you can copy i into a loop-local variable; the local copy keeps its value inside the parallel task.
e.g.
for (int i = 0; i < 2; i++)
{
int localIndex = i;
Task.Factory.StartNew(() => WorkerMethod(arr[localIndex]));
}
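The Parallel.For suggestion above could look roughly like the following sketch. The doubling is a placeholder for WorkerMethod, and DoubleAll is a name invented for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelForDemo
{
    // Processes every element in parallel and returns the results by index.
    public static int[] DoubleAll(int[] arr)
    {
        var results = new int[arr.Length];
        // Parallel.For passes the index to the body as a parameter, so there is
        // no shared loop variable to capture. The call returns when all iterations are done.
        Parallel.For(0, arr.Length, i =>
        {
            results[i] = arr[i] * 2; // placeholder for the real work
        });
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", DoubleAll(new[] { 100, 101 }))); // 200, 202
    }
}
```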
Using foreach does not throw:
foreach (var i in arr)
{
Task.Factory.StartNew(() => WorkerMethod(i));
}
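But it doesn't work either (this applies to compilers before C# 5, where foreach reused a single loop variable; from C# 5 on, each foreach iteration gets a fresh variable):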
101
101
It executes WorkerMethod with the last entry in the array. Why is nicely explained in the other answers.
This does work:
Parallel.ForEach(arr,
item => Task.Factory.StartNew(() => WorkerMethod(item))
);
Note
This actually is my first hands-on experience with System.Threading.Tasks. I found this question, my naive answer and especially some of the other answers useful for my personal learning experience. I'll leave my answer up here because it might be useful for others.
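To see the difference between for and foreach captures without any threading involved, here is a small deterministic sketch. It assumes a C# 5 or later compiler, where foreach declares a fresh variable per iteration; the method names are made up for the demo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LoopCaptureDemo
{
    // All lambdas share the single for-loop variable i and run after the loop ends.
    public static int[] CaptureInFor(int[] arr)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < arr.Length; i++)
        {
            funcs.Add(() => i);
        }
        return funcs.Select(f => f()).ToArray();
    }

    // With C# 5 or later, each foreach iteration has its own item variable.
    public static int[] CaptureInForeach(int[] arr)
    {
        var funcs = new List<Func<int>>();
        foreach (var item in arr)
        {
            funcs.Add(() => item);
        }
        return funcs.Select(f => f()).ToArray();
    }

    public static void Main()
    {
        var arr = new[] { 100, 101 };
        Console.WriteLine(string.Join(", ", CaptureInFor(arr)));     // 2, 2
        Console.WriteLine(string.Join(", ", CaptureInForeach(arr))); // 100, 101
    }
}
```

The for version returns the final value of i for every lambda, which is exactly the index that causes the IndexOutOfRangeException in the question.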
