Parallel Tasks Sharing a Global Variable - c#

Hi, I am new to parallel tasks. I have a function that I need to run multiple times in parallel. The dummy code below shows this:
public static MyClass GlobalValue;
static void Main(string[] args)
{
Task task1 = Task.Factory.StartNew(() => SaveValue());
Task task2 = Task.Factory.StartNew(() => SaveValue());
Task task3 = Task.Factory.StartNew(() => SaveValue());
}
public static void SaveValue()
{
string val = GetValueFromDB();
if (GlobalValue == null)
{
GlobalValue = new MyClass(val);
}
else if (GlobalValue.Key != val)
{
GlobalValue = new MyClass(val);
}
string result = GlobalValue.GetData();
}
Now the line GlobalValue = new MyClass(val) is called every time. Kindly help me with this. I think there is a problem with the global variable.

You need to synchronize access to the shared data. Without synchronization, each thread can read GlobalValue at the same time, see that it is null, and allocate its own instance.
Note that if you synchronize with lock, the three threads will likely run the guarded section sequentially, because only one thread can hold a lock at a time.
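As a minimal sketch of the lock-based approach described above: MyClass, GetValueFromDB and the allocation counter below are hypothetical stand-ins, since the question does not show them. The database call stays outside the lock so that only the check-and-assign step is serialized.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the asker's MyClass.
public sealed class MyClass
{
    public MyClass(string key) { Key = key; }
    public string Key { get; }
    public string GetData() => "data for " + Key;
}

public static class LockSketch
{
    private static readonly object Gate = new object();
    private static MyClass _globalValue;
    private static int _allocations;

    // Number of times a new MyClass was created (for illustration only).
    public static int Allocations => Volatile.Read(ref _allocations);

    // Stub standing in for GetValueFromDB(); it always returns the same key here.
    private static string GetValueFromDB() => "key-1";

    public static string SaveValue()
    {
        string val = GetValueFromDB();   // slow work stays outside the lock
        MyClass current;
        lock (Gate)
        {
            // Only one thread at a time can check and replace the shared instance.
            if (_globalValue == null || _globalValue.Key != val)
            {
                _globalValue = new MyClass(val);
                _allocations++;
            }
            current = _globalValue;      // take a local copy while still holding the lock
        }
        return current.GetData();        // use the local copy outside the lock
    }

    public static void Main()
    {
        Task<string>[] tasks =
        {
            Task.Run(() => SaveValue()),
            Task.Run(() => SaveValue()),
            Task.Run(() => SaveValue())
        };
        Task.WaitAll(tasks);
        Console.WriteLine(Allocations);
    }
}
```

Because the stub always returns the same key, only the first task to enter the lock should allocate; the other two reuse that instance.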

well, why not do
static void Main()
{
var tasks = new[]
{
Task.Factory.StartNew(() => YourFunction()),
Task.Factory.StartNew(() => YourFunction()),
Task.Factory.StartNew(() => YourFunction())
};
Task.WaitAll(tasks);
}
public static string YourFunction()
{
var yourClass = new MyClass(GetValueFromDB());
return yourClass.GetData();
}
I don't see why you need GlobalValue. Is MyClass expensive to instantiate? More notably, you don't do anything with the results so all is moot.
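Since the features are available, assuming you're using .NET 4.5 (C# 5.0), you could do
static void Main()
{
// await is not allowed in a non-async Main (async Main needs C# 7.1+), so block on the combined task here.
Task.WhenAll(YourFunction(), YourFunction(), YourFunction()).Wait();
}
public static Task<string> YourFunction()
{
// Task.Run moves the work to the thread pool so the three calls can overlap.
return Task.Run(() => new MyClass(GetValueFromDB()).GetData());
}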
For the sake of illustration, you could still use a global variable, but it would largely negate the benefits of parallelization. You just have to make sure you serialize access to shared state, or use thread-safe types that do it for you.
Adapted from your example:
private readonly SemaphoreSlim globalLock = new SemaphoreSlim(1);
...
public void SaveValue()
{
string val = GetValueFromDB();
MyClass thisValue;
globalLock.Wait();
try
{
if (this.GlobalValue == null)
{
this.GlobalValue = new MyClass(val);
}
else if (this.GlobalValue.Key != val)
{
this.GlobalValue = new MyClass(val);
}
thisValue = this.GlobalValue;
}
finally
{
globalLock.Release();
}
string result = thisValue.GetData();
}

Related

C# Design pattern for periodic execution of multiple Threads

I have the following requirement in my C# Windows Service.
1. At startup, the service fetches a collection of data from the database and keeps it in memory.
2. Business logic has to be executed periodically from 3 different threads.
3. Each thread executes the same business logic with a different subset of the data collection mentioned in step 1. Each thread produces a different result set.
4. All 3 threads run periodically if any change has happened to the data collection.
5. When a client calls the service, the service should be able to return the status of the thread execution.
I know C# has different mechanisms to implement periodic thread execution: timers, threads with Sleep, EventWaitHandle, etc.
I am trying to understand which threading mechanism or design pattern is the best fit for this requirement.
A more modern approach would be to use tasks, but have a look at the principles:
namespace Test {
public class Program {
public static void Main() {
System.Threading.Thread main = new System.Threading.Thread(() => new Processor().Startup());
main.IsBackground = false;
main.Start();
System.Console.ReadKey();
}
}
public class ProcessResult { /* add your result state */ }
public class ProcessState {
public ProcessResult ProcessResult1 { get; set; }
public ProcessResult ProcessResult2 { get; set; }
public ProcessResult ProcessResult3 { get; set; }
public string State { get; set; }
}
public class Processor {
private readonly object _Lock = new object();
private readonly DataFetcher _DataFetcher;
private ProcessState _ProcessState;
public Processor() {
_DataFetcher = new DataFetcher();
_ProcessState = null;
}
public void Startup() {
_DataFetcher.DataChanged += DataFetcher_DataChanged;
}
private void DataFetcher_DataChanged(object sender, DataEventArgs args) => StartProcessingThreads(args.Data);
private void StartProcessingThreads(string data) {
lock (_Lock) {
_ProcessState = new ProcessState() { State = "Starting", ProcessResult1 = null, ProcessResult2 = null, ProcessResult3 = null };
System.Threading.Thread one = new System.Threading.Thread(() => DoProcess1(data)); // manipulate the data to a subset
one.IsBackground = true;
one.Start();
System.Threading.Thread two = new System.Threading.Thread(() => DoProcess2(data)); // manipulate the data to a subset
two.IsBackground = true;
two.Start();
System.Threading.Thread three = new System.Threading.Thread(() => DoProcess3(data)); // manipulate the data to a subset
three.IsBackground = true;
three.Start();
}
}
public ProcessState GetState() => _ProcessState;
private void DoProcess1(string dataSubset) {
// do work
ProcessResult result = new ProcessResult(); // this object contains the result
// on completion
lock (_Lock) {
_ProcessState = new ProcessState() { State = (_ProcessState?.State ?? string.Empty) + ", 1 done", ProcessResult1 = result, ProcessResult2 = _ProcessState?.ProcessResult2, ProcessResult3 = _ProcessState?.ProcessResult3 };
}
}
private void DoProcess2(string dataSubset) {
// do work
ProcessResult result = new ProcessResult(); // this object contains the result
// on completion
lock (_Lock) {
_ProcessState = new ProcessState() { State = (_ProcessState?.State ?? string.Empty) + ", 2 done", ProcessResult1 = _ProcessState?.ProcessResult1, ProcessResult2 = result, ProcessResult3 = _ProcessState?.ProcessResult3 };
}
}
private void DoProcess3(string dataSubset) {
// do work
ProcessResult result = new ProcessResult(); // this object contains the result
// on completion
lock (_Lock) {
_ProcessState = new ProcessState() { State = (_ProcessState?.State ?? string.Empty) + ", 3 done", ProcessResult1 = _ProcessState?.ProcessResult1, ProcessResult2 = _ProcessState?.ProcessResult2, ProcessResult3 = result };
}
}
}
public class DataEventArgs : System.EventArgs {
// data here is string, but could be anything -- just think of thread safety when accessing from the 3 processors
private readonly string _Data;
public DataEventArgs(string data) {
_Data = data;
}
public string Data => _Data;
}
public class DataFetcher {
// watch for data changes and fire when data has changed
public event System.EventHandler<DataEventArgs> DataChanged;
}
}
The simplest solution would be to define the scheduled logic as methods that return Task, execute them using Task.Run(), and have the main thread wait for one to finish using Task.WaitAny(). When a task finishes, you can call Task.WaitAny() again, but in place of the finished task you pass Task.Delay(timeUntilNextSchedule).
This way the tasks do not block the main thread, and you avoid spinning the CPU just to wait. In general, you can avoid managing threads directly in modern .NET.
Depending on other requirements, such as standardized error handling, monitoring, and management of the scheduled tasks, you could also rely on a more robust solution like Hangfire.
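A rough sketch of the task-based idea, in a slightly simpler variant than the Task.WaitAny approach described above: each job loops on its own and waits with Task.Delay between runs, and the caller awaits all loops. The helper name RunPeriodicallyAsync, the intervals, and the one-second timeout are assumptions made for illustration only.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PeriodicSketch
{
    // Runs 'work' repeatedly, waiting 'interval' between runs, until the token is cancelled.
    // Returns how many runs completed. (Hypothetical helper, not a framework API.)
    public static async Task<int> RunPeriodicallyAsync(
        Func<Task> work, TimeSpan interval, CancellationToken token)
    {
        int runs = 0;
        while (!token.IsCancellationRequested)
        {
            await work();
            runs++;
            try
            {
                // Task.Delay waits without blocking a thread or spinning the CPU.
                await Task.Delay(interval, token);
            }
            catch (OperationCanceledException)
            {
                break; // cancellation requested while waiting
            }
        }
        return runs;
    }

    public static async Task Main()
    {
        // Stop all loops after one second (arbitrary value for the demo).
        using (var cts = new CancellationTokenSource(TimeSpan.FromSeconds(1)))
        {
            Task<int>[] loops =
            {
                RunPeriodicallyAsync(() => Task.Run(() => Console.WriteLine("job 1")), TimeSpan.FromMilliseconds(200), cts.Token),
                RunPeriodicallyAsync(() => Task.Run(() => Console.WriteLine("job 2")), TimeSpan.FromMilliseconds(300), cts.Token),
                RunPeriodicallyAsync(() => Task.Run(() => Console.WriteLine("job 3")), TimeSpan.FromMilliseconds(400), cts.Token)
            };
            int[] runs = await Task.WhenAll(loops);
            Console.WriteLine(string.Join(", ", runs));
        }
    }
}
```

In a Windows Service the token would typically be cancelled from the stop handler rather than by a timeout, and the run counts could be replaced by whatever status object the service needs to report.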

Waiting for all jobs to be finished with async await

I'm trying to understand the usage of async-await in C# 5. If I have 2 jobs started in a method, is there a best way to wait for their completion in C# 5+? I've written the example below, but I fail to see what the async and await keywords bring here besides the free documentation that the async keyword provides.
In the following example, I want "FINISHED !" to be printed last. That is not the case, however. What did I miss? How can I make the async method wait until all jobs are finished? Is there a point in using async-await here? I could just call Task.WaitAll from a non-async method. I don't really understand what async brings when you want to wait.
class Program
{
static void Main(string[] args)
{
var fooWorker = new FooWorker();
var barWorker = new BarWorker();
var test = new Class1(fooWorker, barWorker);
test.SomeWork();
Console.ReadLine();
}
}
public class Foo
{
public Foo(string bar) => Bar = bar;
public string Bar { get; }
}
public class Class1
{
private IEnumerable<Foo> _foos;
private readonly FooWorker _fooWorker;
private readonly BarWorker _barWorker;
public Class1(FooWorker fooWorker, BarWorker barWorker)
{
_fooWorker = fooWorker;
_barWorker = barWorker;
}
public void SomeWork()
{
_foos = ProduceManyFoo();
MoreWork();
Console.WriteLine("FINISHED !");
}
private async void MoreWork()
{
if (_foos == null || !_foos.Any()) return;
var fooList = _foos.ToList();
Task fooWorkingTask = _fooWorker.Work(fooList);
Task barWorkingTask = _barWorker.Work(fooList);
await Task.WhenAll(fooWorkingTask, barWorkingTask);
}
private IEnumerable<Foo> ProduceManyFoo()
{
int i = 0;
while (++i < 100) yield return new Foo(DateTime.Now.ToString(CultureInfo.InvariantCulture));
}
}
public abstract class AWorker
{
protected virtual void DoStuff(IEnumerable<Foo> foos)
{
foreach (var foo in foos)
{
Console.WriteLine(foo.Bar);
}
}
public Task Work(IEnumerable<Foo> foos) => Task.Run(() => DoStuff(foos));
}
public class FooWorker : AWorker { }
public class BarWorker : AWorker { }
You are firing off tasks and just forgetting them, while the thread continues running. This fixes it.
Main:
static async Task Main(string[] args)
{
var fooWorker = new FooWorker();
var barWorker = new BarWorker();
var test = new Class1(fooWorker, barWorker);
await test.SomeWork();
Console.ReadLine();
}
SomeWork:
public async Task SomeWork()
{
_foos = ProduceManyFoo();
await MoreWork();
Console.WriteLine("FINISHED !");
}
MoreWork signature change:
private async Task MoreWork()
The obvious code smell, which should help make the problem clear, is the use of async void. It should be avoided unless it is required, which in practice means event handlers.
When using async and await, you'll usually want to chain the await calls up to the top level (in this case Main).
await is non-blocking, so anything that calls an async method needs to care about the Task being returned.

How to best prevent running async method again before it completes?

I've got this pattern for preventing a call into an async method before a previous call has had a chance to complete.
My solution, which needs a flag and then a lock around the flag, feels pretty verbose. Is there a more natural way of achieving this?
public class MyClass
{
private object SyncIsFooRunning = new object();
private bool IsFooRunning { get; set;}
public async Task FooAsync()
{
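// Check the flag before entering try, so an early return does not run the finally block and clear it.
lock(SyncIsFooRunning)
{
if(IsFooRunning)
return;
IsFooRunning = true;
}
try
{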
// Use a semaphore to enforce maximum number of Tasks which are able to run concurrently.
var semaphoreSlim = new SemaphoreSlim(5);
var trackedTasks = new List<Task>();
for(int i = 0; i < 100; i++)
{
await semaphoreSlim.WaitAsync();
trackedTasks.Add(Task.Run(() =>
{
// DoTask();
semaphoreSlim.Release();
}));
}
// Using await makes try/catch/finally possible.
await Task.WhenAll(trackedTasks);
}
finally
{
lock(SyncIsFooRunning)
{
IsFooRunning = false;
}
}
}
}
As noted in the comments, you can use Interlocked.CompareExchange() if you prefer:
public class MyClass
{
private int _flag;
public async Task FooAsync()
{
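// Check the flag outside the try block: a caller that loses the race must not reset it in finally.
if (Interlocked.CompareExchange(ref _flag, 1, 0) == 1)
{
return;
}
try
{
// do stuff
}
finally
{
Interlocked.Exchange(ref _flag, 0);
}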
}
}
That said, I think it's overkill. Nothing wrong with using lock in this type of scenario, especially if you don't expect a lot of contention on the method. What I do think would be better is to wrap the method so that the caller can always await on the result, whether a new asynchronous operation was started or not:
public class MyClass
{
private readonly object _lock = new object();
private Task _task;
public Task FooAsync()
{
lock (_lock)
{
return _task != null ? _task : (_task = FooAsyncImpl());
}
}
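private async Task FooAsyncImpl()
{
try
{
// do async stuff (this must really await: if it completes synchronously, FooAsync stores the finished task after finally has cleared _task)
}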
finally
{
lock (_lock) _task = null;
}
}
}
Finally, in the comments, you say this:
"Seems a bit odd that all the return types are still valid for Task?"
It's not clear to me what you mean by that. In your method, the only valid return types would be void and Task. If your return statement(s) returned an actual value, you'd have to use Task<T>, where T is the type returned by the return statement(s).
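To illustrate the point about return types, here is a small sketch; the method names LogAsync and CountAsync are made up for the example. An async method with no return value is declared as Task, and one whose return statement yields a value of type T is declared as Task<T>.

```csharp
using System;
using System.Threading.Tasks;

public static class ReturnTypeSketch
{
    // No value is returned, so the method is declared as Task.
    public static async Task LogAsync(string message)
    {
        await Task.Delay(10);           // stands in for real asynchronous work
        Console.WriteLine(message);
    }

    // The return statement yields an int, so the method is declared as Task<int>.
    public static async Task<int> CountAsync(string text)
    {
        await Task.Delay(10);
        return text.Length;
    }

    public static async Task Main()
    {
        await LogAsync("started");
        int length = await CountAsync("hello"); // await unwraps Task<int> to int
        Console.WriteLine(length);
    }
}
```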

Is it safe to pass Task as method parameter?

Consider the following code:
public class Program {
static void Main(string[] args) {
Generate();
}
static void Generate() {
Task t = null;
t = Task.Run(() => {
MyClass myClass = new MyClass();
myClass.ContinueTask(t);
});
Console.ReadLine();
}
}
public class MyClass {
public void ContinueTask(Task t) {
t.ContinueWith(x => {
Console.WriteLine("Continue here...");
});
}
}
Is it safe to pass t as a parameter like this, or is it better to start a new task directly inside MyClass?
This is unsafe because t might not be assigned at the point where it is used. In fact this is a data race.
Even if you fixed that, it would be bad architecture. Why does ContinueTask need to know that it is continuing with something? This is not a concern that should be located here. ContinueTask should perform its work assuming that its antecedent has already completed.
It's hard to tell what you are trying to accomplish. What's wrong with sequencing your code like this?
static async Task Generate() {
var t = Task.Run(() => {
//... other code ...
});
MyClass myClass = new MyClass();
await t;
myClass.ContinueTask();
Console.ReadLine();
}
await is perfect for sequencing tasks.
"reusing the Task object"
What do you mean by that? A task cannot be reused. It cannot run twice. All that your ContinueWith does is logically wait for the antecedent and then run the lambda. Here, the task basically serves as an event.
ContinueWith does not modify the task it is being called on. It creates a new task.
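A short sketch of that last point: the task returned by ContinueWith is a different object from the antecedent, and the antecedent keeps its own result. The helper name Doubled is hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinueWithSketch
{
    // Returns a new task that completes after 'antecedent' and doubles its result.
    public static Task<int> Doubled(Task<int> antecedent)
    {
        return antecedent.ContinueWith(t => t.Result * 2);
    }

    public static void Main()
    {
        Task<int> first = Task.Run(() => 21);
        Task<int> second = Doubled(first);

        Console.WriteLine(ReferenceEquals(first, second)); // False: ContinueWith created a new task
        Console.WriteLine(second.Result);                  // 42
        Console.WriteLine(first.Result);                   // 21: the antecedent is unchanged
    }
}
```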
I've reduced your code down to this example:
public Task<int> Parse()
{
Task<int> t = null;
t = Task.Run(() => this.Read(t));
return t;
}
public Task<int> Read(Task<int> t)
{
return t.ContinueWith(v => 42);
}
I think that has the same underlying structure.
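This causes a deadlock: the task returned by Task.Run cannot complete until the continuation completes, and the continuation cannot start until that same task completes. I suspect your code does too. So I think it's unsafe.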

given a Task instance, can I tell if ContinueWith has been called on it?

Given a Task instance, how can I tell if ContinueWith has been called on it? I want to know if I'm the last task executing in a chain.
Task task = Task.FromResult(0);
void SomeMethod(object x) {
task = task.ContinueWith(previous => {
// pseudocode: if (ContinueWith has been called on task) return;
// do something with x...
});
}
If you meant multiple continuations, a possible solution could look like this.
class Program
{
public class TaskState
{
public bool Ended { get; set; }
}
static void Main(string[] args)
{
var task = Task.FromResult("Stackoverflow");
var state = new TaskState();
task.ContinueWith((result, continuationState) =>
{
Console.WriteLine("in first");
}, state).ContinueWith((result, continuationState) =>
{
if (!state.Ended)
Console.WriteLine("in second");
state.Ended = true;
}, state).ContinueWith((result, continuationState) =>
{
if (!state.Ended)
Console.WriteLine("in third");
state.Ended = true;
}, state);
Console.ReadLine();
}
}
You can declare a static variable (a dictionary object) on the parent and update it with unique keys and values when your tasks are triggered. You can monitor this static variable to see whether all the other tasks have completed execution.
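One way the dictionary idea could be sketched, assuming a ConcurrentDictionary keyed by a caller-chosen task id; the names StartTracked and AllFinished are hypothetical. Each task registers itself as not finished when it is started and flips its entry when its work is done.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class TaskTrackerSketch
{
    // Maps a task id to whether that task has finished. ConcurrentDictionary is thread-safe.
    private static readonly ConcurrentDictionary<int, bool> Completed =
        new ConcurrentDictionary<int, bool>();

    // Registers the id as "not finished", then runs the work and marks it finished.
    public static Task StartTracked(int id, Action work)
    {
        Completed[id] = false;
        return Task.Run(() =>
        {
            work();
            Completed[id] = true;
        });
    }

    // True when every registered task has marked itself finished.
    public static bool AllFinished()
    {
        return Completed.Values.All(done => done);
    }

    public static void Main()
    {
        Task[] tasks = Enumerable.Range(1, 3)
            .Select(id => StartTracked(id, () => Thread.Sleep(50)))
            .ToArray();

        Task.WaitAll(tasks);
        Console.WriteLine(AllFinished());
    }
}
```

Note that an entry stays false if the work throws, so a real implementation would probably also record failures.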
