I have an issue with passing a long by value to a Task.
I have a list of IDs that I loop through; for each one I assign it to a local variable and then pass that as a parameter to a new Task. I do not wait for the task to complete before looping round and processing the next ID. I keep an array of Tasks, but this is irrelevant.
loop
long ID = list[index];
task[index] = Task.Factory.StartNew(() => doWork(ID));
end loop
If the list contained, for example, 100 and 200, I would want the first task called with 100 and then the second task called with 200. But that is not what happens: doWork receives 200 for both tasks, so there is an issue when the value is copied.
I can demonstrate with some simple console code
class Program
{
static void Main(string[] args)
{
long num = 100;
Task one = Task.Factory.StartNew(() => doWork(num));
num = 200;
Console.ReadKey();
}
public static void doWork(long val)
{
Console.WriteLine("Method called with {0}", val);
}
}
The above code will always display
Method called with 200
I modified the code to wait for the task status to switch from WaitingToRun
static void Main(string[] args)
{
long num = 100;
Task one = Task.Factory.StartNew(() => doWork(num));
while(one.Status == TaskStatus.WaitingToRun)
{}
num = 200;
Console.ReadKey();
}
This improves things but is not 100% reliable; after a few runs I still got Method called with 200
Also tried the following
while (true)
{
if (one.Status == TaskStatus.Running | one.IsCompleted == true)
break;
}
but again got 200 displayed.
Any ideas how you can guarantee the value passed to the task without waiting for the task to complete?
Sure - just create a separate variable which isn't modified anywhere else. You can use a new scope to make that clear:
long num = 100;
Task one;
{
// Nothing can change copyOfNum!
long copyOfNum = num;
one = Task.Factory.StartNew(() => doWork(copyOfNum));
}
You can't change the C# compiler to capture "the value of the variable when delegate is created" rather than capturing the variable, but you can make sure the variable isn't changed afterwards, which accomplishes the same thing.
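Applied to the loop from the question, the same idea could look like the sketch below. The names CaptureDemo, RunAll and copyOfId are made up for illustration, and a ConcurrentBag stands in for doWork so that the values the tasks actually received can be inspected.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class CaptureDemo
{
    // Starts one task per ID and returns the values the tasks saw, sorted.
    public static List<long> RunAll(IReadOnlyList<long> ids)
    {
        var seen = new ConcurrentBag<long>();
        var tasks = new Task[ids.Count];
        long id = 0; // shared variable, reassigned on every iteration

        for (int index = 0; index < ids.Count; index++)
        {
            id = ids[index];

            // A fresh variable for each iteration that is never reassigned,
            // so the lambda below cannot observe a later value.
            long copyOfId = id;
            tasks[index] = Task.Factory.StartNew(() => seen.Add(copyOfId));
        }

        Task.WaitAll(tasks);
        return seen.OrderBy(v => v).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunAll(new long[] { 100, 200 })));
    }
}
```

Capturing id directly instead of copyOfId would let every lambda read whatever id holds at the moment the task runs, which is the behaviour described in the question.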
I have a function that generates database layers for a map application but since the computation takes too long I put it into a thread. However, obviously, but not to me 5 seconds ago, the function returns without it finishing the computation so the function returns nothing. Is there any way I can still have the work be sent to where it needs to go?
public static int calcNum()
{
int value = 0;
Thread thread1 = new Thread(() => {
value = 5;
Thread.Sleep(5000); //Simulates work being done
});
return value; //Returns 0 when I want it to be 5
}
A "computation" doesn't get any faster at all just because you run it on a different thread.
If you don't want your method to return before the computation has finished, you should implement your method like this without involving any additional thread:
public static int calcNum()
{
int value = 0;
value = 5;
Thread.Sleep(5000); //Simulates work being done
return value;
}
If you want to prevent the calling thread from being blocked while executing the "computation", you could implement an async method that executes the long-running operation in a Task:
public static async Task<int> CalcNumAsync()
{
int value = 0;
await Task.Run(() =>
{
value = 5;
Thread.Sleep(5000); //Simulates work being done
});
return value;
}
You will then have to await the method when you call it:
int num = await CalcNumAsync();
The latter approach is useful, for example, in a UI application, where you want the application to stay responsive while your long-running operation is being executed on a background thread.
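As a variation on the async version, the task can hand its result back directly instead of writing to a captured local. This is only a sketch: the class name CalcDemo is made up, and the short sleep stands in for the real work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CalcDemo
{
    // Task.Run with a lambda that returns int produces a Task<int>,
    // so no shared "value" variable is needed.
    public static Task<int> CalcNumAsync()
    {
        return Task.Run(() =>
        {
            Thread.Sleep(100); // simulates work being done
            return 5;
        });
    }

    public static async Task Main()
    {
        int num = await CalcNumAsync();
        Console.WriteLine(num);
    }
}
```

The caller still awaits the method in the same way; the difference is that the result travels through the task rather than through a variable shared between threads.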
I am using the code below for creating multiple tasks in C#:
private static List<Task> _taskList = new List<Task>();
private static ConcurrentQueue<string> cred = new ConcurrentQueue<string>();
private static void TaskMethod1(string usercred)
{
// I am doing a bunch of operations here, all of them can be replaced with
// a sleep for 25 minutes.
// After all operations are done, enqueue again.
cred.Enqueue("usercred");
}
private static void TaskMethod()
{
while(runningService)
{
string usercred;
// This will create more than one task in parallel to run,
// and each task can take up to 30 minutes to finish.
while(cred.TryDequeue(out usercred))
{
_taskList.Add(Task.Run(() => TaskMethod1(usercred)));
}
}
}
internal static void Start()
{
runningService = true;
cred.Enqueue("user1");
cred.Enqueue("user2");
cred.Enqueue("user3");
Task1 = Task.Run(() => TaskMethod());
}
I am encountering a strange behaviour in the code above. By putting a breakpoint at line _taskList.Add(Task.Run(() => TaskMethod1(usercred)));, I am checking value of usercred every time TaskMethod1 is called and it is not null while being called but in one of the cases the value of usercred is null inside TaskMethod1. I have no clue how this could be happening.
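The lambda you give to Task.Run uses a variable from the while loop instead of receiving the value as an argument. The lambda captures the variable itself, so by the time the task executes, its value may already have changed. In particular, a failed TryDequeue sets usercred back to null.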
You should use
while (runningService)
{
string usercred;
// This will create more than one task in parallel to run,
// and each task can take up to 30 minutes to finish.
while (cred.TryDequeue(out usercred))
{
_taskList.Add(Task.Factory.StartNew((data) => TaskMethod1(data.ToString()), usercred));
}
}
You should declare the usercred variable inside the inner while loop, so that the lambda inside the Task.Run captures a separate variable for each loop, and not the same variable for all loops.
while(runningService)
{
while(cred.TryDequeue(out var usercred))
{
_taskList.Add(Task.Run(() => TaskMethod1(usercred)));
}
}
As a side note, I would consider using a BlockingCollection instead of the ConcurrentQueue, to have a way to block the current thread until an item is available, so that I don't have to worry about inadvertently creating a tight loop.
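A rough sketch of the BlockingCollection idea, under some assumptions: the class BlockingDemo and the method Process are made up, the work done by TaskMethod1 is reduced to recording the name, and CompleteAdding is called right away so that the example terminates. A long-running service would call it only at shutdown.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BlockingDemo
{
    public static List<string> Process(IEnumerable<string> users)
    {
        var cred = new BlockingCollection<string>();
        var handled = new ConcurrentBag<string>();
        var workers = new List<Task>();

        foreach (var user in users) cred.Add(user);
        cred.CompleteAdding(); // no more items will arrive; lets the loop below end

        // GetConsumingEnumerable blocks while the collection is empty
        // instead of spinning in a tight loop.
        foreach (var usercred in cred.GetConsumingEnumerable())
        {
            // The foreach variable is a separate variable per iteration,
            // so each lambda captures its own value.
            workers.Add(Task.Run(() => handled.Add(usercred)));
        }

        Task.WaitAll(workers.ToArray());
        return handled.OrderBy(u => u, StringComparer.Ordinal).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Process(new[] { "user1", "user2", "user3" })));
    }
}
```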
The following change solved the problem.
private static void TaskMethod()
{
while(runningService)
{
string usercred;
// This will create more than one task in parallel to run,
// and each task can take up to 30 minutes to finish.
while(cred.TryDequeue(out usercred))
{
var uc = usercred;
_taskList.Add(Task.Run(() => TaskMethod1(uc)));
}
}
}
I am actually reading some topics about the Task Parallel Library and the asynchronous programming with async and await. The book "C# 5.0 in a Nutshell" states that when awaiting an expression using the await keyword, the compiler transforms the code into something like this:
var awaiter = expression.GetAwaiter();
awaiter.OnCompleted (() =>
{
var result = awaiter.GetResult();
// ... the rest of the method runs here, as the continuation
});
Let's assume, we have this asynchronous function (also from the referred book):
async Task DisplayPrimeCounts()
{
for (int i = 0; i < 10; i++)
Console.WriteLine (await GetPrimesCountAsync (i*1000000 + 2, 1000000) +
" primes between " + (i*1000000) + " and " + ((i+1)*1000000-1));
Console.WriteLine ("Done!");
}
The call of the 'GetPrimesCountAsync' method will be enqueued and executed on a pooled thread. In general invoking multiple threads from within a for loop has the potential for introducing race conditions.
So how does the CLR ensure that the requests will be processed in the order they were made? I doubt that the compiler simply transforms the code into the above manner, since this would decouple the 'GetPrimesCountAsync' method from the for loop.
Just for the sake of simplicity, I'm going to replace your example with one that's slightly simpler, but has all of the same meaningful properties:
async Task DisplayPrimeCounts()
{
for (int i = 0; i < 10; i++)
{
var value = await SomeExpensiveComputation(i);
Console.WriteLine(value);
}
Console.WriteLine("Done!");
}
The ordering is all maintained because of the definition of your code. Let's imagine stepping through it.
1. This method is first called.
2. The first line of code is the for loop, so i is initialized.
3. The loop check passes, so we go to the body of the loop.
4. SomeExpensiveComputation is called. It should return a Task<T> very quickly, but the work that it's doing will keep going on in the background.
5. The rest of the method is added as a continuation to the returned task; it will continue executing when that task finishes.
6. After the task returned from SomeExpensiveComputation finishes, we store the result in value.
7. value is printed to the console.
8. GOTO 3; note that the existing expensive operation has already finished before we get to step 4 for the second time and start the next one.
As far as how the C# compiler actually accomplishes step 5, it does so by creating a state machine. Basically every time there is an await there's a label indicating where it left off, and at the start of the method (or after it's resumed after any continuation fires) it checks the current state, and does a goto to the spot where it left off. It also needs to hoist all local variables into fields of a new class so that the state of those local variables is maintained.
Now, this transformation isn't actually done in C# code; it's done in IL, but the code below is sort of the moral equivalent of the method I showed above, written as a state machine. Note that this isn't valid C# (you cannot goto into a for loop like this), but that restriction doesn't apply to the IL code that is actually used. There are also going to be differences between this and what C# actually does, but it should give you a basic idea of what's going on here:
internal class Foo
{
public int i;
public long value;
private int state = 0;
private Task<int> task;
int result0;
public Task Bar()
{
var tcs = new TaskCompletionSource<object>();
Action continuation = null;
continuation = () =>
{
try
{
if (state == 1)
{
goto state1;
}
for (i = 0; i < 10; i++)
{
Task<int> task = SomeExpensiveComputation(i);
var awaiter = task.GetAwaiter();
if (!awaiter.IsCompleted)
{
awaiter.OnCompleted(() =>
{
result0 = awaiter.GetResult();
continuation();
});
state = 1;
return;
}
else
{
result0 = awaiter.GetResult();
}
state1:
value = result0;
Console.WriteLine(value);
}
Console.WriteLine("Done!");
tcs.SetResult(true);
}
catch (Exception e)
{
tcs.SetException(e);
}
};
continuation();
return tcs.Task;
}
}
Note that I've ignored task cancellation for the sake of this example, I've ignored the whole concept of capturing the current synchronization context, there's a bit more going on with error handling, etc. Don't consider this a complete implementation.
The call of the 'GetPrimesCountAsync' method will be enqueued and executed on a pooled thread.
No. await does not initiate any kind of background processing. It waits for existing processing to complete. It is up to GetPrimesCountAsync to do that (e.g. using Task.Run). It's more clear this way:
var myRunningTask = GetPrimesCountAsync();
await myRunningTask;
The loop only continues when the awaited task has completed. There is never more than one task outstanding.
So how does the CLR ensure that the requests will be processed in the order they were made?
The CLR is not involved.
I doubt that the compiler simply transforms the code into the above manner, since this would decouple the 'GetPrimesCountAsync' method from the for loop.
The transform that you show is basically right, but notice that the next loop iteration is not started right away but in the callback. That's what serializes execution.
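A small experiment can make the serialization visible. In this sketch, SomeExpensiveComputation is a hypothetical stand-in for GetPrimesCountAsync that starts its own background work with Task.Run and records when that work begins and ends; the log parameter and the class name OrderDemo exist only for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class OrderDemo
{
    // The method itself starts the background work; await does not.
    static Task<int> SomeExpensiveComputation(int i, List<string> log)
    {
        return Task.Run(() =>
        {
            log.Add("start " + i);
            Thread.Sleep(20); // simulated work
            log.Add("end " + i);
            return i * i;
        });
    }

    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        for (int i = 0; i < 3; i++)
        {
            // The next iteration cannot begin until this await completes,
            // so only one computation is ever outstanding.
            int value = await SomeExpensiveComputation(i, log);
            log.Add("value " + value);
        }
        return log;
    }

    public static async Task Main()
    {
        foreach (string line in await RunAsync())
            Console.WriteLine(line);
    }
}
```

Every "start" entry appears only after the "value" entry of the previous iteration, because each computation is created inside the continuation of the one before it.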
The main idea here is to fetch some data from somewhere, when it's fetched start writing it, and then prepare the next batch of data to be written, while waiting for the previous write to be complete.
I know that a Task cannot be restarted or reused (nor should it be), although I am trying to find a way to do something like this :
//The "WriteTargetData" method should take the "data" variable
//created in the loop below as a parameter
//WriteData basically does a shedload of mongodb upserts in a separate thread,
//it takes approx. 20-30 secs to run
var task = new Task(() => WriteData(somedata));
//GetData also takes some time.
foreach (var data in queries.Select(GetData))
{
if (task.Status != TaskStatus.Running)
{
//start task with "data" as a parameter
//continue the loop to prepare the next batch of data to be written
}
else
{
//wait for task to be completed
//"restart" task
//continue the loop to prepare the next batch of data to be written
}
}
Any suggestion appreciated ! Thanks. I don't necessarily want to use Task, I just think it might be the way to go.
This may be oversimplifying your requirements, but would simply "waiting" for the previous task to complete work for you? You can use Task.WaitAny and Task.WaitAll to wait for previous operations to complete.
pseudo code:
// Method that makes calls to fetch and write data.
public async Task DoStuff()
{
Task currTask = null;
object somedata = await FetchData();
while (somedata != null)
{
// Wait for previous task.
if (currTask != null)
Task.WaitAny(currTask);
currTask = WriteData(somedata);
somedata = await FetchData();
}
}
// Whatever method fetches data.
public Task<object> FetchData()
{
var data = new object();
return Task.FromResult(data);
}
// Whatever method writes data.
public Task WriteData(object somedata)
{
return Task.Factory.StartNew(() => { /* write data */});
}
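Since Task.WaitAny blocks the calling thread, one possible variation inside an async method is to await the previous write instead. The sketch below assumes made-up GetData and WriteData stand-ins (integers and short sleeps in place of real queries and MongoDB upserts) and a class name PipelineDemo chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class PipelineDemo
{
    // Hypothetical fetch: turns a query into a batch of data.
    static Task<int> GetData(int query) =>
        Task.Run(() => { Thread.Sleep(10); return query * 10; });

    // Hypothetical write: takes longer than the fetch.
    static Task WriteData(int data, List<string> log) =>
        Task.Run(() =>
        {
            Thread.Sleep(30);
            lock (log) log.Add("wrote " + data);
        });

    public static async Task<List<string>> RunAsync(IEnumerable<int> queries)
    {
        var log = new List<string>();
        Task previousWrite = Task.CompletedTask;

        foreach (int query in queries)
        {
            // Prepare the next batch while the previous write is still running.
            int data = await GetData(query);

            // Allow at most one write in flight at a time.
            await previousWrite;
            previousWrite = WriteData(data, log);
        }

        await previousWrite; // do not return before the last write has finished
        return log;
    }

    public static async Task Main()
    {
        foreach (string line in await RunAsync(new[] { 1, 2, 3 }))
            Console.WriteLine(line);
    }
}
```

No task is restarted here: each batch gets a new write task, and only the reference to the most recent one is kept.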
The Task class is not designed to be restarted, so you need to create a new task and run the body with the same parameters. Also, I do not see where you start the task that has the WriteData call in its body. Creating and starting a new task each time will probably eliminate the check if (task.Status != TaskStatus.Running). AFAIK there are only the Task and Thread classes for this, and a Task is only an abstraction of an action that is scheduled by a TaskScheduler and executed on some thread. With the default scheduler (TaskScheduler.Default) that is a thread-pool thread, and the pool's default minimum number of worker threads equals the number of processor cores.
As for your business app: why do you wait for WriteData to finish at all? Would it not be a lot easier to gather all the data and then submit it in one big write?
Something like this?
public void Do()
{
var task = StartTask(500);
var array = new[] {1000, 2000, 3000};
foreach (var data in array)
{
if (task.IsCompleted)
{
task = StartTask(data);
}
else
{
task.Wait();
task = StartTask(data);
}
}
}
private Task StartTask(int data)
{
var task = new Task(DoSmth, data);
task.Start();
return task;
}
private void DoSmth(object time)
{
Thread.Sleep((int) time);
}
You can use a thread and an AutoResetEvent. I have code like this for several different threads in my program:
These are variable declarations that belong to the main program.
public AutoResetEvent StartTask = new AutoResetEvent(false);
public bool IsStopping = false;
public Thread RepeatingTaskThread;
Somewhere in your initialization code:
RepeatingTaskThread = new Thread( new ThreadStart( RepeatingTaskProcessor ) ) { IsBackground = true };
RepeatingTaskThread.Start();
Then the method that runs the repeating task would look something like this:
private void RepeatingTaskProcessor() {
// Keep looping until the program is going down.
while (!IsStopping) {
// Wait to receive notification that there's something to process.
StartTask.WaitOne();
// Exit if the program is stopping now.
if (IsStopping) return;
// Execute your task
PerformTask();
}
}
If there are several different tasks you want to run, you can add a variable that would indicate which one to process and modify the logic in PerformTask to pick which one to run.
I know that it doesn't use the Task class, but there's more than one way to skin a cat & this will work.
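What the snippets above leave out is the signalling side: something has to call Set on the event to trigger a run, and again after setting the stop flag so that the thread wakes up and exits. The sketch below puts the pieces together. The class RepeatingWorker, the second event used to wait for a run to finish, and the counter standing in for PerformTask are all additions for illustration.

```csharp
using System;
using System.Threading;

public sealed class RepeatingWorker
{
    private readonly AutoResetEvent _startTask = new AutoResetEvent(false);
    private readonly AutoResetEvent _taskDone = new AutoResetEvent(false);
    private readonly Thread _thread;
    private volatile bool _isStopping;
    private int _runs;

    public RepeatingWorker()
    {
        _thread = new Thread(RepeatingTaskProcessor) { IsBackground = true };
        _thread.Start();
    }

    public int Runs => Volatile.Read(ref _runs);

    // Signals the worker thread and waits until it has finished one run.
    public void RunOnce()
    {
        _startTask.Set();
        _taskDone.WaitOne();
    }

    // Sets the flag, then wakes the thread so that it can notice the flag.
    public void Stop()
    {
        _isStopping = true;
        _startTask.Set();
        _thread.Join();
    }

    private void RepeatingTaskProcessor()
    {
        while (!_isStopping)
        {
            _startTask.WaitOne();              // wait for a notification
            if (_isStopping) return;           // exit if the program is stopping
            Interlocked.Increment(ref _runs);  // stands in for PerformTask()
            _taskDone.Set();
        }
    }

    public static void Main()
    {
        var worker = new RepeatingWorker();
        worker.RunOnce();
        worker.RunOnce();
        worker.Stop();
        Console.WriteLine(worker.Runs);
    }
}
```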
I have a static field of type ConcurrentQueue:
static readonly ConcurrentQueue<int> q = new ConcurrentQueue<int>();
and an async method:
static async Task<int?> NextNum()
{
int? n = await Task.Run<int?>(() =>
{
int i = 0;
if (q.TryDequeue(out i)) return i;
return null;
});
return n;
}
Then I execute this code:
var nt = NextNum();
q.Enqueue(10);
nt.Wait();
Console.WriteLine("{0}", nt.Result.HasValue ? nt.Result.Value : -1);
And the output is 10.
Now I add MethodImpl attribute to my async method:
[System.Runtime.CompilerServices.MethodImpl(System.Runtime.CompilerServices.MethodImplOptions.AggressiveInlining)]
static async Task<int?> NextNum()
{
int? n = await Task.Run<int?>(() =>
{
int i = 0;
if (q.TryDequeue(out i)) return i;
return null;
});
return n;
}
And when I execute the previously mentioned code I get -1.
Question: Does this mean in an async method the returned Task does not start immediately? And if we add MethodImpl (with AggressiveInlining) attribute it starts immediately?
I want to know if a method decorated with AggressiveInlining has any effect on task scheduler behavior.
Your test is nondeterministic, so the results may be different based on changes in timings / thread switches / load on the machine / number of cores / etc.
E.g., if you change your test to:
var nt = NextNum();
Thread.Sleep(1000);
q.Enqueue(10);
then the output is most likely -1 even without AggressiveInlining.
Question: Does this mean in an async method the returned Task does not start immediately? And if we add MethodImpl (with AggressiveInlining) attribute it starts immediately?
Not at all. The task returned by NextNum always starts immediately. However, the task queued to the thread pool by Task.Run may not. That's where you're seeing the difference in behavior.
In your original test, the task queued by Task.Run happens to take long enough that q.Enqueue gets executed before it does. In your second test, the task queued by Task.Run happens to run before q.Enqueue. Both are nondeterministic, and AggressiveInlining just changes the timings.
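If the goal is a test whose outcome does not depend on timing, one option is to enqueue before the task is created, so the item is already in the queue by the time the delegate runs. A minimal sketch, with the made-up names QueueDemo and EnqueueThenRead:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class QueueDemo
{
    static readonly ConcurrentQueue<int> q = new ConcurrentQueue<int>();

    public static Task<int?> NextNum()
    {
        return Task.Run<int?>(() => q.TryDequeue(out int i) ? i : (int?)null);
    }

    // Enqueues first, so the dequeue inside the task cannot run too early.
    public static int EnqueueThenRead(int value)
    {
        q.Enqueue(value);
        Task<int?> nt = NextNum();
        return nt.Result ?? -1;
    }

    public static void Main()
    {
        Console.WriteLine(EnqueueThenRead(10));
    }
}
```

With this ordering the result no longer depends on how quickly the thread pool picks up the work item, with or without AggressiveInlining.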
Update from comments:
I want to know if a method decorated with AggressiveInlining has any effect on task scheduler behavior.
No, it does not.