recursively calling method (for object reuse purpose) - c#

I have a rather large class which contains plenty of fields (10+), a huge array (100 KB) and some unmanaged resources. Let me explain by example:
class ResourceIntensiveClass
{
    private object unmanagedResource; // let it be the expensive resource
    private byte[] buffer = new byte[1024 * 100]; // let it be the huge managed memory
    private Action<ResourceIntensiveClass> OnComplete;

    private void DoWork(object state)
    {
        // do long running task
        OnComplete(this); // notify the caller that the task completed so it can reuse the same object for another task
    }

    public void Start(object dataRequiredForCurrentTask)
    {
        ThreadPool.QueueUserWorkItem(DoWork, dataRequiredForCurrentTask); // initiate long running work
    }
}
The problem is that the Start method never returns; after the 10000th iteration this causes a stack overflow. I could execute the OnComplete delegate on another thread, giving the Start method a chance to return, but that requires extra CPU time and resources, as you know. So what is the best option for me?

Is there a good reason for doing your calculations recursively? This seems like a simple loop would do the trick, thus obviating the need for incredibly deep stacks. This design seems especially problematic as you are relying on main() to setup your recursion.
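A minimal sketch of the loop-based approach, assuming the tasks can be queued up front and processed one after another. The names Worker, LoopDemo and ProcessAll are hypothetical and do not come from the question; the single byte write stands in for the long-running work. One object (and its large buffer) is reused for every task, and because nothing re-enters Start from a callback, the stack depth stays constant.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ResourceIntensiveClass: one instance, one buffer.
class Worker
{
    private readonly byte[] buffer = new byte[1024 * 100]; // allocated once, reused for every task

    public int CompletedCount { get; private set; }

    // Processes a single task synchronously and returns when it is done.
    public void DoWork(int taskData)
    {
        buffer[0] = (byte)taskData; // placeholder for the long-running work
        CompletedCount++;
    }
}

static class LoopDemo
{
    // Drains the queue with a plain loop instead of calling Start again from OnComplete.
    public static int ProcessAll(Queue<int> tasks)
    {
        var worker = new Worker();
        while (tasks.Count > 0)
        {
            worker.DoWork(tasks.Dequeue());
        }
        return worker.CompletedCount;
    }

    static void Main()
    {
        var tasks = new Queue<int>();
        for (int i = 0; i < 10000; i++)
        {
            tasks.Enqueue(i);
        }
        Console.WriteLine(ProcessAll(tasks)); // prints 10000
    }
}
```

If the work has to stay off the calling thread, the same loop could run inside a single thread-pool work item, so only one extra thread is used for all tasks.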

Recursive methods can get out of hand quite fast. Have you looked into using Parallel LINQ (PLINQ)?
You could do something like:
(your Array).AsParallel().ForAll(item => item.CallMethod());
You could also look into the Task Parallel Library (TPL).
With tasks, you can define an action and a continuation that runs when the action completes (ContinueWith).
Reactive Extensions (Rx), on the other hand, could handle these on-complete events in an asynchronous manner.
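As a rough illustration of the task-plus-continuation idea, here is a small sketch. The class name ContinuationDemo and the values are made up for the example; the first task stands in for the long-running work and the continuation stands in for the OnComplete notification.

```csharp
using System;
using System.Threading.Tasks;

static class ContinuationDemo
{
    public static int Run()
    {
        // The "long running" work: runs on a thread-pool thread and returns a result.
        Task<int> work = Task.Factory.StartNew(() => 21);

        // The continuation runs after the work completes, instead of an OnComplete callback
        // that calls back into the object that started the work.
        Task<int> completed = work.ContinueWith(t => t.Result * 2);

        return completed.Result; // blocks until the continuation has finished
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints 42
    }
}
```

Because the continuation is scheduled rather than called directly from the worker method, the call stack does not grow with each completed task.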

Where are you changing the value of taskData so that its length can ever equal currentTaskIndex? Since the tasks you are assigning to the data are never changing, they are being carried out forever...

I would guess that the problem arises from using the pre-increment operator here:
if(c.CurrentCount < 10000)
c.Start(++c.CurrentCount);
I am not sure of the semantics of pre-increment in C#, perhaps the value passed to a method call is not what you expect.
But since your Start(int) method assigns the value of the input to this.CurrentCount as its first step anyway, you should be safe replacing this with:
if(c.CurrentCount < 10000)
c.Start(c.CurrentCount + 1);
There is no point in assigning to c.CurrentCount twice.

If using the threadpool, I assume you are protecting the counters (c.CurrentCount), otherwise concurrent increments will cause more activity, not just 10000 executions.
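One way to protect such a counter is Interlocked.Increment. The sketch below is only an illustration under the assumption that the counter is a plain int field; CounterDemo and currentCount are hypothetical names, and Parallel.For is used here merely to generate concurrent increments.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CounterDemo
{
    private static int currentCount;

    // Increments the shared counter from many threads; each increment is atomic.
    public static int Run(int iterations)
    {
        currentCount = 0;
        Parallel.For(0, iterations, i => Interlocked.Increment(ref currentCount));
        return currentCount;
    }

    static void Main()
    {
        Console.WriteLine(Run(10000)); // prints 10000
    }
}
```

With a plain currentCount++ in place of Interlocked.Increment, concurrent increments can be lost and the final value may come out lower than expected.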

There's a neat tool called a ManualResetEvent that could simplify life for you.
Place a ManualResetEvent in your class and add a public OnComplete event.
When you declare your class, you can wire up the OnComplete event to some spot in your code or not wire it up and ignore it.
This would help your custom class to have more correct form.
When your long process is complete (I'm guessing this is in a thread), simply call the Set method of the ManualResetEvent.
As for running your long method, it should be in a thread that uses the ManualResetEvent in a way similar to below:
private void DoWork(object state)
{
    ManualResetEvent mre = new ManualResetEvent(false);
    Thread thread1 = new Thread(
        () =>
        {
            // do long running task
            mre.Set(); // signal that the work is finished
        });
    thread1.IsBackground = true;
    thread1.Name = "Screen Capture";
    thread1.Start();
    mre.WaitOne(); // block until the worker thread signals completion
    OnComplete(this); // notify the caller that the task completed so it can reuse the same object for another task
}

Related

Are non-thread-safe functions async safe?

Consider the following async function that modifies a non-thread-safe list:
async Task AddNewToList(List<Item> list)
{
// Suppose load takes a few seconds
Item item = await LoadNextItem();
list.Add(item);
}
Simply put: Is this safe?
My concern is that one may invoke the async method, and then while it's loading (either on another thread, or as an I/O operation), the caller may modify the list.
Suppose that the caller is partway through the execution of list.Clear(), for example, and suddenly the Load method finishes! What will happen?
Will the task immediately interrupt and run the list.Add(item); code? Or will it wait until the main thread is done with all scheduled CPU tasks (ie: wait for Clear() to finish), before running the code?
Edit: Since I've basically answered this for myself below, here's a bonus question: Why? Why does it immediately interrupt instead of waiting for CPU bound operations to complete? It seems counter-intuitive to not queue itself up, which would be completely safe.
Edit: Here's a different example I tested myself. The comments indicate the order of execution. I am disappointed!
TaskCompletionSource<bool> source;
private async void buttonPrime_click(object sender, EventArgs e)
{
source = new TaskCompletionSource<bool>(); // 1
await source.Task; // 2
source = null; // 4
}
private void buttonEnd_click(object sender, EventArgs e)
{
source.SetResult(true); // 3
MessageBox.Show(source.ToString()); // 5 and exception is thrown
}
No, it's not safe. However, also consider that the caller might have spawned a thread and passed the List to its child thread before calling your code, even in a non-async environment, which would have the same detrimental effect.
So, although it is not safe, there is nothing inherently thread-safe about receiving a List from a caller anyway: there is no way of knowing whether the list is actually being processed from threads other than your own.
Short answer
You always need to be careful using async.
Longer answer
It depends on your SynchronizationContext and TaskScheduler, and what you mean by "safe."
When your code awaits something, it creates a continuation and wraps it in a task, which is then posted to the current SynchronizationContext's TaskScheduler. The context will then determine when and where the continuation will run. The default scheduler simply uses the thread pool, but different types of applications can extend the scheduler and provide more sophisticated synchronization logic.
If you are writing an application that has no SynchronizationContext (for example, a console application, or anything in .NET core), the continuation is simply put on the thread pool, and could execute in parallel with your main thread. In this case you must use lock or synchronized objects such as ConcurrentDictionary<> instead of Dictionary<>, for anything other than local references or references that are closed with the task.
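To make the no-SynchronizationContext case concrete, here is a hedged sketch of the question's AddNewToList with a lock around the shared list. LoadNextItemAsync, gate and AsyncListDemo are hypothetical names, the item type is simplified to int, and ConfigureAwait(false) is used only to make it explicit that continuations may resume on any thread-pool thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class AsyncListDemo
{
    private static readonly object gate = new object();

    // Hypothetical loader standing in for LoadNextItem().
    private static async Task<int> LoadNextItemAsync(int id)
    {
        await Task.Delay(10).ConfigureAwait(false);
        return id;
    }

    private static async Task AddNewToListAsync(List<int> list, int id)
    {
        int item = await LoadNextItemAsync(id).ConfigureAwait(false);
        lock (gate) // continuations may run in parallel, so guard the non-thread-safe list
        {
            list.Add(item);
        }
    }

    public static async Task<int> RunAsync(int count)
    {
        var list = new List<int>();
        var pending = new List<Task>();
        for (int i = 0; i < count; i++)
        {
            pending.Add(AddNewToListAsync(list, i));
        }
        await Task.WhenAll(pending).ConfigureAwait(false);
        lock (gate)
        {
            return list.Count;
        }
    }

    static void Main()
    {
        Console.WriteLine(RunAsync(100).GetAwaiter().GetResult()); // prints 100
    }
}
```

Every other piece of code that touches the list (for example a caller invoking Clear()) would have to take the same lock for this to be safe.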
If you are writing a WinForms application, the continuations are put in the message queue, and will all execute on the main thread. This makes it safe to use non-synchronized objects. However, there are other worries, such as deadlocks. And of course if you spawn any threads, you must make sure they use lock or Concurrent objects, and any UI invocations must be marshaled back to the UI thread. Also, if you are nutty enough to write a WinForms application with more than one message pump (this is highly unusual) you'd need to worry about synchronizing any common variables.
If you are writing an ASP.NET application, the SynchronizationContext will ensure that, for a given request, no two threads are executing at the same time. Your continuation might run on a different thread (due to a performance feature known as thread agility), but they will always have the same SynchronizationContext and you are guaranteed that no two threads will access your variables at the same time (assuming, of course, they are not static, in which case they span across HTTP requests and must be synchronized). In addition, the pipeline will block parallel requests for the same session so that they execute in series, so your session state is also protected from threading concerns. However you still need to worry about deadlocks.
And of course you can write your own SynchronizationContext and assign it to your threads, meaning that you specify your own synchronization rules that will be used with async.
See also How do yield and await implement flow of control in .NET?
Assuming the "invalid access" occurs in LoadNextItem(): the Task will throw an exception. Since the context is captured, the exception will pass on to the caller's thread, so list.Add will not be reached.
So, no, it's not thread-safe.
Yes, I think that could be a problem.
I would return the item and add it to the list on the main thread.
private async void GetIntButton(object sender, RoutedEventArgs e)
{
List<int> Ints = new List<int>();
Ints.Add(await GetInt());
}
private async Task<int> GetInt()
{
await Task.Delay(100);
return 1;
}
But you have to call it from an async method, so I do not think this would work either.

Under what conditions can a thread enter a lock (Monitor) region more than once concurrently?

(question revised): So far, the answers all include a single thread re-entering the lock region linearly, through things like recursion, where you can trace the steps of a single thread entering the lock twice. But is it possible somehow for a single thread (perhaps from the ThreadPool, perhaps as a result of timer events or async events or a thread going to sleep and being awakened/reused in some other chunk of code separately) to be spawned in two different places independently of each other, and hence run into the lock re-entrance problem when the developer didn't expect it from simply reading their own code?
In the ThreadPool class documentation, the Remarks seem to suggest that sleeping threads should be reused when they're not in use, or otherwise wasted by sleeping.
But the Monitor.Enter reference page says "It is legal for the same thread to invoke Enter more than once without it blocking." So I figure there must be something I'm supposed to be careful to avoid. What is it? How is it even possible for a single thread to enter the same lock region twice?
Suppose you have some lock region that takes an unfortunately long time. This might be realistic, for example, if you access some memory that has been paged out (or whatever.) The thread in the locked region might go to sleep or something. Does the same thread become eligible to run more code, which might accidentally step into the same lock region? The following does NOT, in my testing, get multiple instances of the same thread to run into the same lock region.
So how does one produce the problem? What exactly do you need to be careful to avoid?
class myClass
{
private object myLockObject;
public myClass()
{
this.myLockObject = new object();
int[] myIntArray = new int[100]; // Just create a bunch of things so I may easily launch a bunch of Parallel things
Array.Clear(myIntArray, 0, myIntArray.Length); // Just create a bunch of things so I may easily launch a bunch of Parallel things
Parallel.ForEach<int>(myIntArray, i => MyParallelMethod());
}
private void MyParallelMethod()
{
lock (this.myLockObject)
{
Console.Error.WriteLine("ThreadId " + Thread.CurrentThread.ManagedThreadId.ToString() + " starting...");
Thread.Sleep(100);
Console.Error.WriteLine("ThreadId " + Thread.CurrentThread.ManagedThreadId.ToString() + " finished.");
}
}
}
Suppose you have a queue that contains actions:
public static Queue<Action> q = whatever;
Suppose Queue<T> has a method Dequeue that returns a bool indicating whether the queue could be successfully dequeued.
And suppose you have a loop:
static void Main()
{
    q.Enqueue(M);
    q.Enqueue(M);
    Action action;
    while (q.Dequeue(out action))
        action();
}

static object lockObject = new object();

static void M()
{
    Action action;
    lock (lockObject)
    {
        if (q.Dequeue(out action))
            action();
    }
}
Clearly the main thread enters the lock in M twice; this code is re-entrant. That is, it enters itself, through an indirect recursion.
Does this code look implausible to you? It should not. This is how Windows works. Every window has a message queue, and when a message queue is "pumped", methods are called corresponding to those messages. When you click a button, a message goes in the message queue; when the queue is pumped, the click handler corresponding to that message gets invoked.
It is therefore extremely common, and extremely dangerous, to write Windows programs where a lock contains a call to a method which pumps a message loop. If you got into that lock as a result of handling a message in the first place, and if the message is in the queue twice, then the code will enter itself indirectly, and that can cause all manner of craziness.
The way to eliminate this is (1) never do anything even slightly complicated inside a lock, and (2) when you are handling a message, disable the handler until the message is handled.
Re-Entrance is possible if you have a structure like so:
Object lockObject = new Object();

void Foo(bool recurse)
{
    lock (lockObject)
    {
        Console.WriteLine("In Lock");
        if (recurse) { Foo(false); } // same thread re-enters the lock it already holds
    }
}
While this is a pretty simplistic example, it's possible in many scenarios where you have interdependent or recursive behaviour.
For example:
ComponentA.Add(): locks a common 'ComponentA' object, adds new item to ComponentB.
ComponentB.OnNewItem(): new item triggers data-validation on each item in list.
ComponentA.ValidateItem(): locks a common 'ComponentA' object to validate the item.
Same-thread re-entry on the same lock is needed to ensure you don't get deadlocks occurring with your own code.
One of the more subtle ways you can recurse into a lock block is in GUI frameworks. For example, you can asynchronously invoke code on a single UI thread (a Form class)
private object locker = new Object();
public void Method(int a)
{
lock (locker)
{
this.BeginInvoke((MethodInvoker) (() => Method(a)));
}
}
Of course, this also creates an infinite loop; in practice you'd recurse only under some condition, at which point the loop would no longer be infinite.
Using lock is not a good way to sleep/awaken threads. I would simply use an existing framework like the Task Parallel Library (TPL) to create abstract tasks (see Task); the underlying framework handles creating new threads and putting them to sleep when needed.
IMHO, Re-entering a lock is not something you need to take care to avoid (given many people's mental model of locking this is, at best, dangerous, see Edit below). The point of the documentation is to explain that a thread cannot block itself using Monitor.Enter. This is not always the case with all synchronization mechanisms, frameworks, and languages. Some have non-reentrant synchronization in which case you have to be careful that a thread doesn't block itself. What you do need to be careful about is always calling Monitor.Exit for every Monitor.Enter call. The lock keyword does this for you automatically.
A trivial example with re-entrance:
private object locker = new object();
public void Method()
{
lock(locker)
{
lock(locker) { Console.WriteLine("Re-entered the lock."); }
}
}
The thread has entered the lock on the same object twice so it must be released twice. Usually it is not so obvious and there are various methods calling each other that synchronize on the same object. The point is that you don't have to worry about a thread blocking itself.
That said you should generally try to minimize the amount the time you need to hold a lock. Acquiring a lock is not computationally expensive, contrary to what you may hear (it is on the order of a few nanoseconds). Lock contention is what is expensive.
Edit
Please read Eric's comments below for additional details, but the summary is that when you see a lock your interpretation of it should be that "all activations of this code block are associated with a single thread", and not, as it is commonly interpreted, "all activations of this code block execute as a single atomic unit".
For example:
public static void Main()
{
Method();
}
private static int i = 0;
private static object locker = new object();
public static void Method()
{
lock(locker)
{
int j = ++i;
if (i < 2)
{
Method();
}
if (i != j)
{
throw new Exception("Boom!");
}
}
}
Obviously, this program blows up. Without the lock, it is the same result. The danger is that the lock leads you into a false sense of security that nothing could modify state on you between initializing j and evaluating the if. The problem is that you (perhaps unintentionally) have Method recursing into itself and the lock won't stop that. As Eric points out in his answer, you might not realize the problem until one day someone queues up too many actions simultaneously.
ThreadPool threads cannot be reused elsewhere just because they went to sleep; they need to finish before they're reused. A thread that is taking a long time in a lock region does not become eligible to run more code at some other independent point of control. The only way to experience lock re-entry is by recursion or executing methods or delegates inside a lock that re-enter the lock.
Let's think about something other than recursion.
In some business logic, you may want to control the synchronization behavior yourself.
In one such pattern, Monitor.Enter is invoked in one place and Monitor.Exit is invoked elsewhere later. Here is some code that illustrates the idea:
public partial class Infinity: IEnumerable<int> {
IEnumerator IEnumerable.GetEnumerator() {
return this.GetEnumerator();
}
public IEnumerator<int> GetEnumerator() {
for(; ; )
yield return ~0;
}
public static readonly Infinity Enumerable=new Infinity();
}
public partial class YourClass {
void ReleaseLock() {
for(; lockCount-->0; Monitor.Exit(yourLockObject))
;
}
void GetLocked() {
Monitor.Enter(yourLockObject);
++lockCount;
}
void YourParallelMethod(int x) {
GetLocked();
Debug.Print("lockCount={0}", lockCount);
}
public static void PeformTest() {
new Thread(
() => {
var threadCurrent=Thread.CurrentThread;
Debug.Print("ThreadId {0} starting...", threadCurrent.ManagedThreadId);
var intanceOfYourClass=new YourClass();
// Parallel.ForEach(Infinity.Enumerable, intanceOfYourClass.YourParallelMethod);
foreach(var i in Enumerable.Range(0, 123))
intanceOfYourClass.YourParallelMethod(i);
intanceOfYourClass.ReleaseLock();
Monitor.Exit(intanceOfYourClass.yourLockObject); // here SynchronizationLockException thrown
Debug.Print("ThreadId {0} finished. ", threadCurrent.ManagedThreadId);
}
).Start();
}
object yourLockObject=new object();
int lockCount;
}
If you invoke YourClass.PeformTest() and get a lockCount greater than 1, you've re-entered the lock; the re-entry is not necessarily concurrent.
If the lock were not re-entrant, you would get stuck in the foreach loop.
The line Monitor.Exit(intanceOfYourClass.yourLockObject) throws a SynchronizationLockException because we are trying to invoke Exit more times than Enter was invoked. If you use the lock keyword, you probably will not encounter this situation except through direct or indirect recursive calls. I guess that's why the lock keyword was provided: it prevents Monitor.Exit from being carelessly omitted.
I commented out the call to Parallel.ForEach; if you are interested, you can test it for fun.
To test the code, .NET Framework 4.0 is the minimum requirement, and the following additional namespaces are required, too:
using System.Threading.Tasks;
using System.Diagnostics;
using System.Threading;
using System.Collections;
Have fun.

Thread racing, why do threads work so?

I get two different results when I swap two lines of code (done = true and the Console.WriteLine() call).
If I put done = true first, the result is:
True
If I put Console.WriteLine() first, the result is:
False
False
Why? (Note that the bool variable is static!)
using System;
using System.Threading;
class Program
{
static bool done;
static void Main(string[] args)
{
new Thread(test).Start();
test();
}
static void test()
{
if (!done)
{
done = true;
Console.WriteLine(done);
}
}
}
My bet is that the Console.WriteLine will be enough work to keep the thread busy while the second call to test() has a chance to execute.
So basically the call to WriteLine delays the setting of done long enough for the second call to test to be able to test done and find it is still set as false.
If you leave it as shown, with done = true; before the write to the console then this will be set almost instantly and thus the second call to test will find done set to true and will therefore not perform the Console.WriteLine.
Hope that all makes sense.
I just found this which contains code very much like your question. If you didn't get your question from this page already, then I would suggest having a read as it explains in much more detail the cause of this effect.
Here is the key extract:
On a single-processor computer, a thread scheduler performs
time-slicing — rapidly switching execution between each of the active
threads. Under Windows, a time-slice is typically in the
tens-of-milliseconds region — much larger than the CPU overhead in
actually switching context between one thread and another (which is
typically in the few-microseconds region).
So essentially the call to Console.WriteLine is taking long enough for the processor to decide that it is time for the main thread to have another go before your extra thread is permitted to continue (and ultimately set the done flag).
Your code isn't thread safe, and the results will be unpredictable.
You need to lock access when reading / writing to the static boolean, like so:
static bool done;
static readonly object _mylock = new object();
static void Main()
{
//Application.EnableVisualStyles();
//Application.SetCompatibleTextRenderingDefault(false);
//Application.Run(new Form1());
new Thread(test).Start();
test();
Console.ReadKey();
}
static void test()
{
lock (_mylock)
{
if (!done)
{
Console.WriteLine(done);
done = true;
}
}
}
Edit : readonly thanks #d4wn
Looks like the scheduler just cut the CPU time from one thread after its call to Console.WriteLine and then gave it to the other thread, all before done was set to true.
Are you certain that it always prints False\nFalse when you call Console.WriteLine before assigning done = true;? To my understanding, this should be quite random.
Every access to a shared variable by one of the sharing threads must be explicitly protected by one of the synchronization techniques. The environment (the CLR) doesn't do it for us, because given the full complexity of multithreading that would be impossible. So this responsible and difficult task must be done by the developer writing the multithreaded code.
I guess you can find a great deal of the necessary information here:
Thread Synchronization (C# Programming Guide)
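As one example of such a technique (swapped in here as an alternative to the lock shown earlier, not something the question used), a check-and-set on a flag can be made atomic with Interlocked.CompareExchange. In this sketch the flag is an int rather than a bool, and OnceDemo, runs and Run are hypothetical names added for the illustration.

```csharp
using System;
using System.Threading;

static class OnceDemo
{
    private static int done; // 0 = not done yet, 1 = done
    private static int runs; // counts how many callers got past the check

    static void Test()
    {
        // Atomically change done from 0 to 1; only the caller that observed 0 proceeds.
        if (Interlocked.CompareExchange(ref done, 1, 0) == 0)
        {
            Interlocked.Increment(ref runs);
        }
    }

    public static int Run()
    {
        done = 0;
        runs = 0;
        var t = new Thread(Test);
        t.Start();
        Test();
        t.Join(); // wait for the second thread before reading the result
        return runs;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints 1
    }
}
```

Whichever thread wins the race, exactly one of them passes the check, so the order of the statements inside the if block no longer matters.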

Control.BeginInvoke Execution Order

When calling BeginInvoke(), will the delegates be executed in the same order in which BeginInvoke() was called, or is there no guarantee which delegate will run first?
public Form1()
{
InitializeComponent();
for (int i = 0; i < 100; i++)
{
Thread t = new Thread(DisplayCount);
t.Start(i);
}
}
public void DisplayCount(object count)
{
if (InvokeRequired)
{
BeginInvoke(new Action<object>(DisplayCount), count);
return;
}
listBox1.Items.Add(count);
}
The list of integers comes back out of order.
Control.BeginInvoke() will execute the action asynchronously, but on the UI thread.
If you call BeginInvoke() multiple times with different actions, they will come back in order of whichever ones complete the fastest.
As a side note, you should probably use some sort of synchronization mechanism around your listBox1.Items.Add(count) calls, perhaps locking on its SyncRoot property.
From MSDN - ListBox.ObjectCollection Class
Any public static (Shared in Visual Basic) members of this type are
thread safe. Any instance members are not guaranteed to be thread
safe.
(Emphasis added)
If you call the same function multiple times, then they should come back in the same order, maybe! If you have one function analysing a 1 TB dataset and another function just doing some logging, then I don't think they will come back in the same order.
It also depends on the DispatcherPriority you have set for BeginInvoke. A low priority like SystemIdle will be executed later than a higher priority like Send.
If you start a thread using Thread.Start() then the execution of the Thread-Function happens asynchronously at a random time after that call.
That's why you get random numbers in my opinion.

Thread-safe asynchronous code in C#

I asked the question below a couple of weeks ago. Now, when reviewing my question and all the answers, a very important detail jumped out at me: In my second code example, isn't DoTheCodeThatNeedsToRunAsynchronously() executed in the main (UI) thread? Doesn't the timer just wait a second and then post an event to the main thread? That would mean that the code that needs to run asynchronously isn't run asynchronously at all?!
Original question:
I have recently faced a problem multiple times and solved it in different ways, always being uncertain on whether it is thread safe or not: I need to execute a piece of C# code asynchronously. (Edit: I forgot to mention I'm using .NET 3.5!)
That piece of code works on an object that is provided by the main thread code. (Edit: Let's assume that object is thread-safe in itself.) I'll present you two ways I tried (simplified) and have these four questions:
What is the best way to achieve what I want? Is it one of the two or another approach?
Is one of the two ways not thread-safe (I fear both...) and why?
The first approach creates a thread and passes it the object in the constructor. Is that how I'm supposed to pass the object?
The second approach uses a timer which doesn't provide that possibility, so I just use the local variable in the anonymous delegate. Is that safe or is it possible in theory that the reference in the variable changes before it is evaluated by the delegate code? (This is a very generic question whenever one uses anonymous delegates). In Java you are forced to declare the local variable as final (i.e. it cannot be changed once assigned). In C# there is no such possibility, is there?
Approach 1: Thread
new Thread(new ParameterizedThreadStart(
delegate(object parameter)
{
Thread.Sleep(1000); // wait a second (for a specific reason)
MyObject myObject = (MyObject)parameter;
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
})).Start(this.MyObject);
There is one problem I had with this approach: My main thread might crash, but the process still persists in the memory due to the zombie thread.
Approach 2: Timer
MyObject myObject = this.MyObject;
System.Timers.Timer timer = new System.Timers.Timer();
timer.Interval = 1000;
timer.AutoReset = false; // i.e. only run the timer once.
timer.Elapsed += new System.Timers.ElapsedEventHandler(
delegate(object sender, System.Timers.ElapsedEventArgs e)
{
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
});
DoSomeStuff();
myObject = that.MyObject; // hypothetical second assignment.
The local variable myObject is what I'm talking about in question 4. I've added a second assignment as an example. Imagine the timer elapses after the second assignment: will the delegate code operate on this.MyObject or that.MyObject?
Whether or not either of these pieces of code is safe has to do with the structure of MyObject instances. In both cases you are sharing the myObject variable between the foreground and background threads. There is nothing stopping the foreground thread from modifying myObject while the background thread is running.
This may or may not be safe and depends on the structure of MyObject. However if you haven't specifically planned for it then it's most certainly an unsafe operation.
I recommend using Task objects, and restructuring the code so that the background task returns its calculated value rather than changing some shared state.
I have a blog entry that discusses five different approaches to background tasks (Task, BackgroundWorker, Delegate.BeginInvoke, ThreadPool.QueueUserWorkItem, and Thread), with the pros and cons of each.
To answer your questions specifically:
What is the best way to achieve what I want? Is it one of the two or another approach? The best solution is to use the Task object instead of a specific Thread or timer callback. See my blog post for all the reasons why, but in summary: Task supports returning a result, callbacks on completion, proper error handling, and integration with the universal cancellation system in .NET.
Is one of the two ways not thread-safe (I fear both...) and why? As others have stated, this totally depends on whether MyObject.ChangeSomeProperty is threadsafe. When dealing with asynchronous systems, it's easier to reason about threadsafety when each asynchronous operation does not change shared state, and rather returns a result.
The first approach creates a thread and passes it the object in the constructor. Is that how I'm supposed to pass the object? Personally, I prefer using lambda binding, which is more type-safe (no casting necessary).
The second approach uses a timer which doesn't provide that possibility, so I just use the local variable in the anonymous delegate. Is that safe or is it possible in theory that the reference in the variable changes before it is evaluated by the delegate code? Lambdas (and delegate expressions) bind to variables, not to values, so the answer is yes: the reference may change before it is used by the delegate. If the reference may change, then the usual solution is to create a separate local variable that is only used by the lambda expression,
as such:
MyObject myObject = this.MyObject;
...
timer.AutoReset = false; // i.e. only run the timer once.
var localMyObject = myObject; // copy for lambda
timer.Elapsed += new System.Timers.ElapsedEventHandler(
delegate(object sender, System.Timers.ElapsedEventArgs e)
{
DoTheCodeThatNeedsToRunAsynchronously();
localMyObject.ChangeSomeProperty();
});
// Now myObject can change without affecting timer.Elapsed
Tools like ReSharper will try to detect whether local variables bound in lambdas may change, and will warn you if it detects this situation.
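The variable-versus-value distinction can be seen in a few lines. This is only an illustrative sketch: strings stand in for the MyObject references, and ClosureDemo, capturesVariable and capturesCopy are hypothetical names.

```csharp
using System;

static class ClosureDemo
{
    public static string Run()
    {
        string myObject = "this";

        // Binds to the variable myObject itself, not to its current value.
        Func<string> capturesVariable = () => myObject;

        // A separate local that is never reassigned acts as a snapshot.
        string localCopy = myObject;
        Func<string> capturesCopy = () => localCopy;

        myObject = "that"; // the hypothetical second assignment

        return capturesVariable() + "," + capturesCopy();
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "that,this"
    }
}
```

The first delegate sees the reassigned value, while the second keeps operating on the original one.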
My recommended solution (using Task) would look something like this:
var ui = TaskScheduler.FromCurrentSynchronizationContext();
var localMyObject = this.MyObject;
Task.Factory.StartNew(() =>
{
// Run asynchronously on a ThreadPool thread.
Thread.Sleep(1000); // TODO: review if you *really* need this
return DoTheCodeThatNeedsToRunAsynchronously();
}).ContinueWith(task =>
{
// Run on the UI thread when the ThreadPool thread returns a result.
if (task.IsFaulted)
{
// Do some error handling with task.Exception
}
else
{
localMyObject.ChangeSomeProperty(task.Result);
}
}, ui);
Note that since the UI thread is the one calling MyObject.ChangeSomeProperty, that method doesn't have to be threadsafe. Of course, DoTheCodeThatNeedsToRunAsynchronously still does need to be threadsafe.
"Thread-safe" is a tricky beast. With both of your approches, the problem is that the "MyObject" your thread is using may be modified/read by multiple threads in a way that makes the state appear inconsistent, or makes your thread behave in a way inconsistent with actual state.
For example, say your MyObject.ChangeSomeProperty() MUST be called before MyObject.DoSomethingElse(), or it throws. With either of your approaches, there is nothing to stop any other thread from calling DoSomethingElse() before the thread that will call ChangeSomeProperty() finishes.
Or, if ChangeSomeProperty() happens to be called by two threads, and it (internally) changes state, the thread context switch may happen while the first thread is in the middle of its work, and the end result is that the actual new state after both threads is "wrong".
However, by itself, neither of your approaches is inherently thread-unsafe, they just need to make sure that changing state is serialized and that accessing state is always giving a consistent result.
Personally, I wouldn't use the second approach. If you're having problems with "zombie" threads, set IsBackground to true on the thread.
Your first attempt is pretty good, but the thread keeps running (and keeps the process alive) even after the application exits, because you didn't set the IsBackground property to true. Here is a simplified (and improved) version of your code:
MyObject myObject = this.MyObject;
Thread t = new Thread(()=>
{
Thread.Sleep(1000); // wait a second (for a specific reason)
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
});
t.IsBackground = true;
t.Start();
With regards to the thread safety: it's difficult to tell if your program functions correctly when multiple threads execute simultaneously, because you're not showing us any points of contention in your example. It's very possible that you will experience concurrency issues if your program has contention on MyObject.
Java has the final keyword and C# has a corresponding keyword called readonly, but neither final nor readonly ensures that the state of the object you're modifying will be consistent between threads. The only thing these keywords do is ensure that the variable cannot be reassigned to refer to a different object. If two threads have read/write contention on the same object, then you should perform some type of synchronization or use atomic operations on that object in order to ensure thread safety.
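A minimal sketch of that distinction, assuming hypothetical Stats and Holder classes that are not part of the question: the readonly field cannot be reassigned, yet the object behind it is still mutable, so the update itself has to be atomic. Interlocked.Increment is one way to do that for a simple counter.

```csharp
using System;
using System.Threading;

// Hypothetical type: mutable state that lives behind a readonly reference.
class Stats
{
    public int Hits;
}

// Hypothetical type: readonly protects the reference, not the Stats object.
class Holder
{
    private readonly Stats stats = new Stats();

    public void Record()
    {
        // Atomic read-modify-write; a plain stats.Hits++ could lose updates.
        Interlocked.Increment(ref stats.Hits);
    }

    public int Hits
    {
        get { return Volatile.Read(ref stats.Hits); }
    }
}

static class InterlockedDemo
{
    // Starts several threads that all update the same Holder, then returns the total.
    public static int Run(int threads, int perThread)
    {
        var holder = new Holder();
        var workers = new Thread[threads];
        for (int i = 0; i < threads; i++)
        {
            workers[i] = new Thread(() =>
            {
                for (int j = 0; j < perThread; j++)
                    holder.Record();
            });
            workers[i].IsBackground = true;
            workers[i].Start();
        }
        foreach (Thread t in workers)
            t.Join();
        return holder.Hits;
    }

    public static void Main()
    {
        Console.WriteLine(Run(4, 100000)); // 400000
    }
}
```

Interlocked only covers single-variable updates; state that spans several fields generally needs a lock instead.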
Update
OK, if you modify the reference that myObject points to, then your contention is now on myObject. I'm sure that my answer will not match your actual situation 100%, but given the example code you've provided I can tell you what will happen:
You will not be guaranteed which object gets modified: it can be that.MyObject or this.MyObject. That's true regardless of whether you're working with Java or C#. The scheduler may schedule your thread/timer to be executed before, after or during the second assignment. If you're counting on a specific order of execution, then you have to do something to ensure that order of execution. Usually that something is communication between the threads in the form of a signal: a ManualResetEvent, Join or something else.
Here is a join example:
MyObject myObject = this.MyObject;
Thread task = new Thread(()=>
{
Thread.Sleep(1000); // wait a second (for a specific reason)
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
});
task.IsBackground = true;
task.Start();
task.Join(); // blocks the main thread until the task thread is finished
myObject = that.MyObject; // the assignment will happen after the task is complete
Here is a ManualResetEvent example:
ManualResetEvent done = new ManualResetEvent(false);
MyObject myObject = this.MyObject;
Thread task = new Thread(()=>
{
Thread.Sleep(1000); // wait a second (for a specific reason)
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
done.Set();
});
task.IsBackground = true;
task.Start();
done.WaitOne(); // blocks the main thread until the task thread signals it's done
myObject = that.MyObject; // the assignment will happen after the task is done
Of course, in this case it's pointless to even spawn multiple threads, since you're not going to allow them to run concurrently. One way to avoid this is by not changing the reference to myObject after you've started the thread, then you won't need to Join or WaitOne on the ManualResetEvent.
So this leads me to a question: why are you assigning a new object to myObject? Is this a part of a for-loop which is starting multiple threads to perform multiple asynchronous tasks?
What is the best way to achieve what I want? Is it one of the two or another approach?
Both look fine, but...
Is one of the two ways not thread-safe (I fear both...) and why?
...they are not thread safe unless MyObject.ChangeSomeProperty() is thread safe.
The first approach creates a thread and passes it the object in the constructor. Is that how I'm supposed to pass the object?
Yes. Using a closure (as in your second approach) is fine as well, with the additional advantage that you don't need to do a cast.
The second approach uses a timer which doesn't provide that possibility, so I just use the local variable in the anonymous delegate. Is that safe or is it possible in theory that the reference in the variable changes before it is evaluated by the delegate code? (This is a very generic question whenever one uses anonymous delegates).
Sure, if you add myObject = null; directly after setting timer.Elapsed, then the code in your thread will fail. But why would you want to do that? Note that changing this.MyObject will not affect the variable captured in your thread.
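The difference between capturing the local copy and reading this.MyObject inside the delegate can be sketched as follows. The Owner class and the Name member are hypothetical, added only so the example runs; delegates are invoked directly here instead of on a timer or thread, since the capture semantics are the same.

```csharp
using System;

// Hypothetical stand-in for the question's MyObject.
class MyObject
{
    public string Name;
    public MyObject(string name) { Name = name; }
}

// Hypothetical owner of the MyObject reference.
class Owner
{
    public MyObject MyObject = new MyObject("first");

    // Copies the current reference into a local; the lambda captures that local.
    public Func<string> CaptureLocalCopy()
    {
        MyObject myObject = this.MyObject;
        return () => myObject.Name;
    }

    // Captures 'this'; the field is read only when the delegate runs.
    public Func<string> CaptureThroughThis()
    {
        return () => this.MyObject.Name;
    }
}

static class CaptureDemo
{
    public static void Main()
    {
        var owner = new Owner();
        Func<string> viaLocal = owner.CaptureLocalCopy();
        Func<string> viaThis = owner.CaptureThroughThis();

        owner.MyObject = new MyObject("second"); // reassigned after capture

        Console.WriteLine(viaLocal()); // first
        Console.WriteLine(viaThis());  // second
    }
}
```

Note that the captured local is still a variable: assigning to myObject itself after creating the delegate would be visible inside the lambda.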
So, how to make this thread-safe? The problem is that myObject.ChangeSomeProperty(); might run in parallel with some other code that modifies the state of myObject. There are basically two solutions to that:
Option 1: Execute myObject.ChangeSomeProperty() in the main UI thread. This is the simplest solution if ChangeSomeProperty is fast. You can use the Dispatcher (WPF) or Control.Invoke (WinForms) to jump back to the UI thread, but the easiest way is to use a BackgroundWorker:
MyObject myObject = this.MyObject;
var bw = new BackgroundWorker();
bw.DoWork += (sender, args) => {
// this will happen in a separate thread
Thread.Sleep(1000);
DoTheCodeThatNeedsToRunAsynchronously();
};
bw.RunWorkerCompleted += (sender, args) => {
// We are back in the UI thread here.
if (args.Error != null) // if an exception occurred during DoWork,
MessageBox.Show(args.Error.ToString()); // do your error handling here
else
myObject.ChangeSomeProperty();
};
bw.RunWorkerAsync(); // start the background worker
Option 2: Make the code in ChangeSomeProperty() thread-safe by using the lock keyword (inside ChangeSomeProperty as well as inside any other method modifying or reading the same backing field).
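A minimal sketch of Option 2, assuming ChangeSomeProperty simply increments a private counter (the question does not show its body, so the someProperty field and the SomeProperty getter are hypothetical). Every method that reads or writes the backing field takes the same lock.

```csharp
using System;
using System.Threading;

// Hypothetical MyObject whose state is guarded by a private lock object.
class MyObject
{
    private readonly object sync = new object();
    private int someProperty; // assumed backing field

    public void ChangeSomeProperty()
    {
        lock (sync)
        {
            someProperty++;
        }
    }

    public int SomeProperty
    {
        get
        {
            // Readers take the same lock so they never see a torn or stale update.
            lock (sync)
            {
                return someProperty;
            }
        }
    }
}

static class LockDemo
{
    // Calls ChangeSomeProperty from several threads and returns the final value.
    public static int RunConcurrently(int threads, int callsPerThread)
    {
        var myObject = new MyObject();
        var workers = new Thread[threads];
        for (int i = 0; i < threads; i++)
        {
            workers[i] = new Thread(() =>
            {
                for (int j = 0; j < callsPerThread; j++)
                    myObject.ChangeSomeProperty();
            });
            workers[i].IsBackground = true;
            workers[i].Start();
        }
        foreach (Thread t in workers)
            t.Join();
        return myObject.SomeProperty;
    }

    public static void Main()
    {
        Console.WriteLine(RunConcurrently(4, 100000)); // 400000
    }
}
```

Locking on a private object rather than on this keeps outside code from taking the same lock by accident.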
The bigger thread-safety concern here, in my mind, may be the 1 second Sleep. If this is required in order to synchronize with some other operation (giving it time to complete), then I strongly recommend using a proper synchronization pattern rather than relying on the Sleep. Monitor.Pulse or AutoResetEvent are two common ways to achieve synchronization. Both should be used carefully, as it's easy to introduce subtle race conditions. However, using Sleep for synchronization is a race condition waiting to happen.
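As a rough sketch of signalling instead of sleeping, the worker below blocks on an AutoResetEvent until the operation it depends on has finished. The "prepared" string is a hypothetical stand-in for whatever the one-second Sleep was waiting for.

```csharp
using System;
using System.Threading;

static class SignalDemo
{
    public static string Run()
    {
        string prepared = null; // written by the main thread
        string observed = null; // written by the worker thread

        using (var ready = new AutoResetEvent(false))
        {
            var worker = new Thread(() =>
            {
                ready.WaitOne();     // replaces Thread.Sleep(1000)
                observed = prepared; // safe: the signal was set after the write
            });
            worker.IsBackground = true;
            worker.Start();

            prepared = "prepared";   // the operation the worker depends on
            ready.Set();             // tell the worker it may continue
            worker.Join();           // wait so the result can be returned
        }
        return observed;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prepared
    }
}
```

Unlike a fixed Sleep, the worker resumes as soon as the data is ready and never proceeds before it is.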
Also, if you want to use a thread (and don't have access to the Task Parallel Library in .NET 4.0), then ThreadPool.QueueUserWorkItem is preferable for short-running tasks. Thread pool threads are background threads, so they won't keep the application alive when it exits, as long as there is not some deadlock preventing a non-background thread from finishing.
One thing not mentioned so far: The choice of threading methods depends heavily on specifically what DoTheCodeThatNeedsToRunAsynchronously() does.
Different .NET threading approaches are suitable for different requirements. One very large concern is whether this method will complete quickly, or take some time (is it short-lived or long-running?).
Some .NET threading mechanisms, like ThreadPool.QueueUserWorkItem(), are meant for short-lived tasks. They avoid the expense of creating a thread by using "recycled" threads, but the number of threads the pool will recycle is limited, so a long-running task shouldn't hog the ThreadPool's threads.
The options to consider are:
ThreadPool.QueueUserWorkItem() is a convenient means to fire-and-forget small tasks on a ThreadPool thread.
System.Threading.Tasks.Task is a new feature in .NET 4 which makes small tasks easy to run in async/parallel mode.
Delegate.BeginInvoke() and Delegate.EndInvoke(): BeginInvoke() will run the code asynchronously, but it's crucial that you ensure EndInvoke() is called as well to avoid potential resource leaks. It's also based on ThreadPool threads, I believe.
System.Threading.Thread as shown in your example. Threads provide the most control but are also more expensive than the other methods--so they are ideal for long-running tasks or detail-oriented multithreading.
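Three of the options above can be compared side by side in a small sketch. Work is a hypothetical stand-in for DoTheCodeThatNeedsToRunAsynchronously(), and each method blocks for the result only so the example can return it. Delegate.BeginInvoke() is left out because it is not supported on .NET Core and later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ThreadingOptions
{
    // Hypothetical workload standing in for the asynchronous code.
    static int Work(int input)
    {
        return input * 2;
    }

    // Fire-and-forget on a pool thread; a ManualResetEvent reports completion.
    public static int ViaThreadPool(int input)
    {
        int result = 0;
        using (var done = new ManualResetEvent(false))
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                result = Work(input);
                done.Set();
            });
            done.WaitOne();
        }
        return result;
    }

    // Task carries the result (and any exception) back to the caller.
    public static int ViaTask(int input)
    {
        Task<int> task = Task.Factory.StartNew(() => Work(input));
        return task.Result;
    }

    // Dedicated thread: most control, highest creation cost.
    public static int ViaThread(int input)
    {
        int result = 0;
        var thread = new Thread(() => { result = Work(input); });
        thread.IsBackground = true;
        thread.Start();
        thread.Join();
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(ViaThreadPool(21)); // 42
        Console.WriteLine(ViaTask(21));       // 42
        Console.WriteLine(ViaThread(21));     // 42
    }
}
```

In real code you would not block right after starting the work; the waits are here only to make the three variants directly comparable.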
Overall my personal preference has been to use Delegate.BeginInvoke()/EndInvoke() -- it seems to strike a good balance between control and ease of use.
