Are non-thread-safe functions async safe? - c#

Consider the following async function that modifies a non-thread-safe list:
async Task AddNewToList(List<Item> list)
{
// Suppose load takes a few seconds
Item item = await LoadNextItem();
list.Add(item);
}
Simply put: Is this safe?
My concern is that one may invoke the async method, and then while it's loading (either on another thread, or as an I/O operation), the caller may modify the list.
Suppose that the caller is partway through the execution of list.Clear(), for example, and suddenly the Load method finishes! What will happen?
Will the task immediately interrupt and run the list.Add(item); code? Or will it wait until the main thread is done with all scheduled CPU tasks (ie: wait for Clear() to finish), before running the code?
Edit: Since I've basically answered this for myself below, here's a bonus question: Why? Why does it immediately interrupt instead of waiting for CPU bound operations to complete? It seems counter-intuitive to not queue itself up, which would be completely safe.
Edit: Here's a different example I tested myself. The comments indicate the order of execution. I am disappointed!
TaskCompletionSource<bool> source;
private async void buttonPrime_click(object sender, EventArgs e)
{
source = new TaskCompletionSource<bool>(); // 1
await source.Task; // 2
source = null; // 4
}
private void buttonEnd_click(object sender, EventArgs e)
{
source.SetResult(true); // 3
MessageBox.Show(source.ToString()); // 5 and exception is thrown
}

No, its not safe. However also consider that the caller might also have spawned a thread and passed the List to its child thread before calling your code, even in a non async environment, which will have the same detrimental effect.
So; although not safe, there is nothing inherently thread-safe about receiving a List from a caller anyway - there is no way of knowing whether the list is actually being processed from other threads that your own.

Short answer
You always need to be careful using async.
Longer answer
It depends on your SynchronizationContext and TaskScheduler, and what you mean by "safe."
When your code awaits something, it creates a continuation and wraps it in a task, which is then posted to the current SynchronizationContext's TaskScheduler. The context will then determine when and where the continuation will run. The default scheduler simply uses the thread pool, but different types of applications can extend the scheduler and provide more sophisticated synchronization logic.
If you are writing an application that has no SynchronizationContext (for example, a console application, or anything in .NET core), the continuation is simply put on the thread pool, and could execute in parallel with your main thread. In this case you must use lock or synchronized objects such as ConcurrentDictionary<> instead of Dictionary<>, for anything other than local references or references that are closed with the task.
If you are writing a WinForms application, the continuations are put in the message queue, and will all execute on the main thread. This makes it safe to use non-synchronized objects. However, there are other worries, such as deadlocks. And of course if you spawn any threads, you must make sure they use lock or Concurrent objects, and any UI invocations must be marshaled back to the UI thread. Also, if you are nutty enough to write a WinForms application with more than one message pump (this is highly unusual) you'd need to worry about synchronizing any common variables.
If you are writing an ASP.NET application, the SynchronizationContext will ensure that, for a given request, no two threads are executing at the same time. Your continuation might run on a different thread (due to a performance feature known as thread agility), but they will always have the same SynchronizationContext and you are guaranteed that no two threads will access your variables at the same time (assuming, of course, they are not static, in which case they span across HTTP requests and must be synchronized). In addition, the pipeline will block parallel requests for the same session so that they execute in series, so your session state is also protected from threading concerns. However you still need to worry about deadlocks.
And of course you can write your own SynchronizationContext and assign it to your threads, meaning that you specify your own synchronization rules that will be used with async.
See also How do yield and await implement flow of control in .NET?

Assuming the "invalid acces" occures in LoadNextItem(): The Task will throw an exception. Since the context is captured it will pass on to the callers thread so list.Add will not be reached.
So, no it's not thread-safe.

Yes I think that could be a problem.
I would return item and add to the list on the main tread.
private async void GetIntButton(object sender, RoutedEventArgs e)
{
List<int> Ints = new List<int>();
Ints.Add(await GetInt());
}
private async Task<int> GetInt()
{
await Task.Delay(100);
return 1;
}
But you have to call from and async so I do not this this would work either.

Related

Usage of ConfigureAwait in .NET

I've read about ConfigureAwait in various places (including SO questions), and here are my conclusions:
ConfigureAwait(true): Runs the rest of the code on the same thread the code before the await was run on.
ConfigureAwait(false): Runs the rest of the code on the same thread the awaited code was run on.
If the await is followed by a code that accesses the UI, the task should be appended with .ConfigureAwait(true). Otherwise, an InvalidOperationException will occur due to another thread accessing UI elements.
My questions are:
Are my conclusions correct?
When does ConfigureAwait(false) improves performance, and when it doesn't?
If writing for a GUI application, but the next lines doesn't access the UI elements. Should I use ConfigureAwait(false) or ConfigureAwait(true) ?
To answer your questions more directly:
ConfigureAwait(true): Runs the rest of the code on the same thread the code before the await was run on.
Not necessarily the same thread, but the same synchronization context. The synchronization context can decide how to run the code. In a UI application, it will be the same thread. In ASP.NET, it may not be the same thread, but you will have the HttpContext available, just like you did before.
ConfigureAwait(false): Runs the rest of the code on the same thread the awaited code was run on.
This is not correct. ConfigureAwait(false) tells it that it does not need the context, so the code can be run anywhere. It could be any thread that runs it.
If the await is followed by a code that accesses the UI, the task should be appended with .ConfigureAwait(true). Otherwise, an InvalidOperationException will occur due to another thread accessing UI elements.
It is not correct that it "should be appended with .ConfigureAwait(true)". ConfigureAwait(true) is the default. So if that's what you want, you don't need to specify it.
When does ConfigureAwait(false) improves performance, and when it doesn't?
Returning to the synchronization context might take time, because it may have to wait for something else to finish running. In reality, this rarely happens, or that waiting time is so minuscule that you'd never notice it.
If writing for a GUI application, but the next lines doesn't access the UI elements. Should I use ConfigureAwait(false) or ConfigureAwait(true) ?
You could use ConfigureAwait(false), but I suggest you don't, for a few reasons:
I doubt you would notice any performance improvement.
It can introduce parallelism that you may not expect. If you use ConfigureAwait(false), the continuation can run on any thread, so you could have problems if you're accessing non-thread-safe objects. It is not common to have these problems, but it can happen.
You (or someone else maintaining this code) may add code that interacts with the UI later and exceptions will be thrown. Hopefully the ConfigureAwait(false) is easy to spot (it could be in a different method than where the exception is thrown) and you/they know what it does.
I find it's easier to not use ConfigureAwait(false) at all (except in libraries). In the words of Stephen Toub (a Microsoft employee) in the ConfigureAwait FAQ:
When writing applications, you generally want the default behavior (which is why it is the default behavior).
Edit: I've written an article of my own on this topic: .NET: Don’t use ConfigureAwait(false)
ConfigureAwait(false) may improve performance if there are not many worker threads available and if the thread that it would need to wait for is constantly busy.
ConfigureAwait(false) is recommended everywhere where coming back to same SynchronizationContext (which usualy is linked with thread) is not needed, especially in libraries that awaits something internally: https://medium.com/bynder-tech/c-why-you-should-use-configureawait-false-in-your-library-code-d7837dce3d7f.
ConfigureAwait(true) (which is the default) is needed when you require same context but may also lead to a dead lock in certain situations.
Consider this code:
void Main()
{
// creating a windows form attaches a synchronization context to the current thread
new System.Windows.Forms.Form();
var task = DoSth();
Console.WriteLine(task.Result);
}
async Task<int> DoSth()
{
await Task.Delay(1000);
return 1;
}
in this example because of not awaited task DoSth, the main UI thread is blocked by waiting for task.Result - at the same time DoSth is blocked because it wants to come back to the UI thread after a delay. This will lead to a deadlock and this code will never execute to the end. Adding .ConfigureAwait(false) solves the problem in this case.
Using ConfigureAwait(false) in application code is normally not going to boost your application's performance in any meaningful way, because normally you don't await inside loops in application code. For example lets consider the case that your app has a button, and an async operation is started everytime the user clicks the button, and the async operation includes a single await. By typing the 22 characters .ConfigureAwait(false) after this await you have already lost comparable time of your life, with the time you can hope to save from 10 users that click this button once every minute, 8 hours per day, for 20 years each (~35,000,000 context switchings in total = some seconds of CPU processing time).
And this before taking into account the time you need to think about whether you can safely include this configuration (depending on whether the continuation contains UI-related code), the time you'll need in order to reconfirm you previous assessment every time you have to maintain/modify the code, and the time you'll lose on debugging in case your assessment was wrong.
On the other hand if your Button_Click handler contains code like this:
private async void Button_Click(object sender, EventArgs e)
{
var client = new WebClient();
using var stream = await client.OpenReadTaskAsync("someUrl");
var buffer = new byte[1024];
while ((await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
{
//...
}
}
...then by all means do spend the extra time to ConfigureAwait(false) the ReadAsync task. Also do consider refactoring the code by moving the stream-reading part to a separate asynchronous method, so that you can safely access UI elements anywhere inside the Button_Click handler, without been distracted by technicalities that don't belong to this layer of the app.

Async method not running in parallel

In the following code, in the B method, the code Trace.TraceInformation("B - Started"); never gets called.
Should the method be running in parallel?
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
private static async Task A()
{
for (;;)
{
}
}
private static async Task B()
{
Trace.TraceInformation("B - Started");
}
static void Main(string[] args)
{
var tasks = new List<Task> { A(), B() };
Task.WaitAll(tasks.ToArray());
}
}
}
Short answer
No, as you wrote your two async methods, they are indeed not running in parallel. Adding await Task.Yield(); to your first method (e.g. inside the loop) would allow them to do so, but there are more reasonable and straightforward methods, highly depending on what you actually need (interleaved execution on a single thread? Actual parallel execution on multiple threads?).
Long answer
First of all, declaring functions as async does not inherently make them run asynchronously or something. It rather simplifies the syntax to do so - read more about the concepts here: Asynchronous Programming with Async and Await
Effectively A is not asynchronous at all, as there is not a single await inside its method body. Instructions up to the first use of await run synchronously like a regular method would.
From then on, the object that you await determines what happens next, i.e. the context that the remaining method runs in.
To force execution of a task to happen on another thread, use Task.Run or similar.
In this scenario, adding await Task.Yield() does the trick since the current synchronization context is null and this happens to indeed cause the task scheduler (should be ThreadPoolTaskScheduler) to execute the remaining instuctions on a thread-pool thread - some environment or configuration might cause you to only have one of them, so things would still not run in parallel.
Summary
The moral of the story is: Be aware of the differences between two concepts:
concurrency (which is enabled by using async/await reasonably) and
parallelism (which only happens when concurrent tasks get scheduled the right way or if you enforce it using Task.Run, Thread, etc. in which case the use of async is completely irrelevant anyway)
The async modifier is not a magic spawn-a-thread-here marker. It's sole purpose is to let the compiler know a method might depend on some asynchronous operation (A complex data-processing thread, I/O...) so it has to setup a state machine to coordinate the callbacks resulting from those asynchronous operations.
To make A run on another thread you would invoke it using Task.Run which wraps the invocation on a new thread with a Task object, which you can await. Be aware that await-ing a method does not mean your code runs in parallel to A's execution all by itself: It will until the very line you await the Task object, telling the compiler you need the value that the Task object returns. In this case await-ing Task.Run(A) will effectively make your program run forever, waiting for A to return, something that will never happen (barring computer malfunction).
Do have in mind that marking a method as async but not actually awaiting anything will only have the effect of a compiler warning. If you await something that is not truly async (It returns immediately on the calling thread with something like Task.FromResult) it will mean your program takes a runtime speed penalty. It is very slight, however.
No, the methods shown are not expected to "run in parallel".
Why B is never called - you have list of tasks tasks constructed via essentially series of .Add calls - and first is result of A() is added. Since the A method does not have any await it will run to the completion synchronously on the same thread. And after that B() would be called.
Now A will never complete (it is sitting in infinite loop) so really code will not even reach call to B.
Note that even if creation would succeed code never finish WaitAll as A still sits in infinite loop.
If you want methods to "run in parallel" you need to either run them implicitly/explicitly on new threads (i.e. with Task.Run or Thread.Start) or for I/O bound calls let method to release thread with await.

What is the use case for async/await? [duplicate]

This question already has answers here:
Brief explanation of Async/Await in .Net 4.5
(3 answers)
Closed 7 years ago.
C# offers multiple ways to perform asynchronous execution such as threads, futures, and async.
In what cases is async the best choice?
I have read many articles about the how and what of async, but so far I have not seen any article that discusses the why.
Initially I thought async was a built-in mechanism to create a future. Something like
async int foo(){ return ..complex operation..; }
var x = await foo();
do_something_else();
bar(x);
Where call to 'await foo' would return immediately, and the use of 'x' would wait on the the return value of 'foo'. async does not do this. If you want this behavior you can use the futures library: https://msdn.microsoft.com/en-us/library/Ff963556.aspx
The above example would instead be something like
int foo(){ return ..complex operation..; }
var x = Task.Factory.StartNew<int>(() => foo());
do_something_else();
bar(x.Result);
Which isn't as pretty as I would have hoped, but it works nonetheless.
So if you have a problem where you want to have multiple threads operate on the work then use futures or one of the parallel operations, such as Parallel.For.
async/await, then, is probably not meant for the use case of performing work in parallel to increase throughput.
async solves the problem of scaling an application for a large number of asynchronous events, such as I/O, when creating many threads is expensive.
Imagine a web server where requests are processed immediately as they come in. The processing happens on a single thread where every function call is synchronous. To fully process a thread might take a few seconds, which means that an entire thread is consumed until the processing is complete.
A naive approach to server programming is to spawn a new thread for each request. In this way it does not matter how long each thread takes to complete because no thread will block any other. The problem with this approach is that threads are not cheap. The underlying operating system can only create so many threads before running out of memory, or some other kind of resource. A web server that uses 1 thread per request will probably not be able to scale past a few hundred/thousand requests per second. The c10k challenge asks that modern servers be able to scale to 10,000 simultaneous users. http://www.kegel.com/c10k.html
A better approach is to use a thread pool where the number of threads in existence is more or less fixed (or at least, does not expand past some tolerable maximum). In that scenario only a fixed number of threads are available for processing the incoming requests. If there are more requests than there are threads available for processing then some requests must wait. If a thread is processing a request and has to wait on a long running I/O process then effectively the thread is not being utilized to its fullest extent, and the server throughput will be much less than it otherwise could be.
The question is now, how can we have a fixed number of threads but still use them efficiently? One answer is to 'cut up' the program logic so that when a thread would normally wait on an I/O process, instead it will start the I/O process but immediately become free for any other task that wants to execute. The part of the program that was going to execute after the I/O will be stored in a thing that knows how to keep executing later on.
For example, the original synchronous code might look like
void process(){
string name = get_user_name();
string address = look_up_address(name);
string tax_forms = find_tax_form(address);
render_tax_form(name, address, tax_forms);
}
Where look_up_address and find_tax_form have to talk to a database and/or make requests to other websites.
The asynchronous version might look like
void process(){
string name = get_user_name();
invoke_after(() => look_up_address(name), (address) => {
invoke_after(() => find_tax_form(address), (tax_forms) => {
render_tax_form(name, address, tax_forms);
}
}
}
This is continuation passing style, where next thing to do is passed as the second lambda to a function that will not block the current thread when the blocking operation (in the first lambda) is invoked. This works but it quickly becomes very ugly and hard to follow the program logic.
What the programmer has manually done in splitting up their program can be automatically done by async/await. Any time there is a call to an I/O function the program can mark that function call with await to inform the caller of the program that it can continue to do other things instead of just waiting.
async void process(){
string name = get_user_name();
string address = await look_up_address(name);
string tax_forms = await find_tax_form(address);
render_tax_form(name, address, tax_forms);
}
The thread that executes process will break out of the function when it gets to look_up_address and continue to do other work: such as processing other requests. When look_up_address has completed and process is ready to continue, some thread (or the same thread) will pick up where the last thread left off and execute the next line find_tax_forms(address).
Since my current belief of async is about managing threads, I don't believe that async makes a lot of sense for UI programming. Generally UI's will not have that many simultaneous events that need to be processed. The use case for async with UI's is preventing the UI thread from being blocked. Even though async can be used with a UI, I would find it dangerous because ommitting an await on some long running function, due to either an accident or forgetfulness, would cause the UI to block.
async void button_callback(){
await do_something_long();
....
}
This code won't block the UI because it uses an await for the long running function that it invokes. If later on another function call is added
async void button_callback(){
do_another_thing();
await do_something_long();
...
}
Where it wasn't clear to the programmer who added the call to do_another_thing just how long it would take to execute, the UI will now be blocked. It seems safer to just always execute all processing in a background thread.
void button_callback(){
new Thread(){
do_another_thing();
do_something_long();
....
}.start();
}
Now there is no possibility that the UI thread will be blocked, and the chances that too many threads will be created is very small.

Is it Safe to use ReaderWriterLockSlim in an async method

Since the ReaderWriterLockSlim class uses Thread ID to see who owns the lock is it safe to use with async methods where there is no guarantee that all the method will be executed on the same thread.
For example.
System.Threading.ReaderWriterLockSlim readerwriterlock = new System.Threading.ReaderWriterLockSlim();
private async Task Test()
{
readerwriterlock.EnterWriteLock();
await Task.Yield(); //do work that could yield the task
readerwriterlock.ExitWriteLock(); //potentailly exit the lock on a different thread
}
Is it Safe to use ReaderWriterLockSlim in an async method
Yes and no. It can be safe to use this in an async method, but it is likely not safe to use it in an async method where you enter and exit the lock spanning an await.
In this case, no, this is not necessarily safe.
ExitWriteLock must be called from the same thread that called EnterWriteLock. Otherwise, it throws a SynchronizationLockException. From the documentation, this exception is thrown when:
The current thread has not entered the lock in write mode.
The only time this would be safe is if this was used in an async method which was always in an environment where there was a current SynchronizationContext in place which will move things back to the same thread (ie: Windows Forms, WPF, etc), and wasn't used by a nested async call where a "parent" up the call chain setup a Task with ConfigureAwait(false) (which would prevent the Task from capturing the synchronization context). If you are in that specific scenario, you'd know the thread would be maintained, as the await call would marshal you back onto the calling context.
No. Thread-affine coordination primitives should not be used as in your example.
You correctly identified the problem where a different thread can be used to resume after the await. There is another issue due to the the way async methods return early: the caller is not aware that the lock is held.
ReaderWriterLockSlim by default is a non-recursive lock, so if another async method attempts to take the same lock, you'll get a deadlock. Even if you make the lock recursive, you'll still end up with a problem: arbitrary end-user code should never be called while holding a lock, and that's essentially what you're doing when you use an await.
The SemaphoreSlim type is async-aware (via its WaitAsync method), and Stephen Toub has a series of async coordination primitives also available in my AsyncEx library.

Thread-safe asynchronous code in C#

I asked the question below couple of weeks ago. Now, when reviewing my question and all the answers, a very important detail jumped into my eyes: In my second code example, isn't DoTheCodeThatNeedsToRunAsynchronously() executed in the main (UI) thread? Doesn't the timer just wait a second and then post an event to the main thread? This would mean then that the code-that-needs-to-run-asynchronously isn't run asynchronously at all?!
Original question:
I have recently faced a problem multiple times and solved it in different ways, always being uncertain on whether it is thread safe or not: I need to execute a piece of C# code asynchronously. (Edit: I forgot to mention I'm using .NET 3.5!)
That piece of code works on an object that is provided by the main thread code. (Edit: Let's assume that object is thread-safe in itself.) I'll present you two ways I tried (simplified) and have these four questions:
What is the best way to achieve what I want? Is it one of the two or another approach?
Is one of the two ways not thread-safe (I fear both...) and why?
The first approach creates a thread and passes it the object in the constructor. Is that how I'm supposed to pass the object?
The second approach uses a timer which doesn't provide that possibility, so I just use the local variable in the anonymous delegate. Is that safe or is it possible in theory that the reference in the variable changes before it is evaluated by the delegate code? (This is a very generic question whenever one uses anonymous delegates). In Java you are forced to declare the local variable as final (i.e. it cannot be changed once assigned). In C# there is no such possibility, is there?
Approach 1: Thread
new Thread(new ParameterizedThreadStart(
delegate(object parameter)
{
Thread.Sleep(1000); // wait a second (for a specific reason)
MyObject myObject = (MyObject)parameter;
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
})).Start(this.MyObject);
There is one problem I had with this approach: My main thread might crash, but the process still persists in the memory due to the zombie thread.
Approach 2: Timer
MyObject myObject = this.MyObject;
System.Timers.Timer timer = new System.Timers.Timer();
timer.Interval = 1000;
timer.AutoReset = false; // i.e. only run the timer once.
timer.Elapsed += new System.Timers.ElapsedEventHandler(
delegate(object sender, System.Timers.ElapsedEventArgs e)
{
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
});
DoSomeStuff();
myObject = that.MyObject; // hypothetical second assignment.
The local variable myObject is what I'm talking about in question 4. I've added a second assignment as an example. Imagine the timer elapses after the second assigment, will the delegate code operate on this.MyObject or that.MyObject?
Whether or not either of these pieces of code is safe has to do with the structure of MyObject instances. In both cases you are sharing the myObject variable between the foreground and background threads. There is nothing stopping the foreground thread from modifying myObject while the background thread is running.
This may or may not be safe and depends on the structure of MyObject. However if you haven't specifically planned for it then it's most certainly an unsafe operation.
I recommend using Task objects, and restructuring the code so that the background task returns its calculated value rather than changing some shared state.
I have a blog entry that discusses five different approaches to background tasks (Task, BackgroundWorker, Delegate.BeginInvoke, ThreadPool.QueueUserWorkItem, and Thread), with the pros and cons of each.
To answer your questions specifically:
What is the best way to achieve what I want? Is it one of the two or another approach? The best solution is to use the Task object instead of a specific Thread or timer callback. See my blog post for all the reasons why, but in summary: Task supports returning a result, callbacks on completion, proper error handling, and integration with the universal cancellation system in .NET.
Is one of the two ways not thread-safe (I fear both...) and why? As others have stated, this totally depends on whether MyObject.ChangeSomeProperty is threadsafe. When dealing with asynchronous systems, it's easier to reason about threadsafety when each asynchronous operation does not change shared state, and rather returns a result.
The first approach creates a thread and passes it the object in the constructor. Is that how I'm supposed to pass the object? Personally, I prefer using lambda binding, which is more type-safe (no casting necessary).
The second approach uses a timer which doesn't provide that possibility, so I just use the local variable in the anonymous delegate. Is that safe or is it possible in theory that the reference in the variable changes before it is evaluated by the delegate code? Lambdas (and delegate expressions) bind to variables, not to values, so the answer is yes: the reference may change before it is used by the delegate. If the reference may change, then the usual solution is to create a separate local variable that is only used by the lambda expression,
as such:
MyObject myObject = this.MyObject;
...
timer.AutoReset = false; // i.e. only run the timer once.
var localMyObject = myObject; // copy for lambda
timer.Elapsed += new System.Timers.ElapsedEventHandler(
delegate(object sender, System.Timers.ElapsedEventArgs e)
{
DoTheCodeThatNeedsToRunAsynchronously();
localMyObject.ChangeSomeProperty();
});
// Now myObject can change without affecting timer.Elapsed
Tools like ReSharper will try to detect whether local variables bound in lambdas may change, and will warn you if it detects this situation.
My recommended solution (using Task) would look something like this:
var ui = TaskScheduler.FromCurrentSynchronizationContext();
var localMyObject = this.myObject;
Task.Factory.StartNew(() =>
{
// Run asynchronously on a ThreadPool thread.
Thread.Sleep(1000); // TODO: review if you *really* need this
return DoTheCodeThatNeedsToRunAsynchronously();
}).ContinueWith(task =>
{
// Run on the UI thread when the ThreadPool thread returns a result.
if (task.IsFaulted)
{
// Do some error handling with task.Exception
}
else
{
localMyObject.ChangeSomeProperty(task.Result);
}
}, ui);
Note that since the UI thread is the one calling MyObject.ChangeSomeProperty, that method doesn't have to be threadsafe. Of course, DoTheCodeThatNeedsToRunAsynchronously still does need to be threadsafe.
"Thread-safe" is a tricky beast. With both of your approches, the problem is that the "MyObject" your thread is using may be modified/read by multiple threads in a way that makes the state appear inconsistent, or makes your thread behave in a way inconsistent with actual state.
For example, say your MyObject.ChangeSomeproperty() MUST be called before MyObject.DoSomethingElse(), or it throws. With either of your approaches, there is nothing to stop any other thread from calling DoSomethingElse() before the thread that will call ChangeSomeProperty() finishes.
Or, if ChangeSomeProperty() happens to be called by two threads, and it (internally) changes state, the thread context switch may happen while the first thread is in the middle of it's work and the end result is that the actual new state after both threads is "wrong".
However, by itself, neither of your approaches is inherently thread-unsafe, they just need to make sure that changing state is serialized and that accessing state is always giving a consistent result.
Personally, I wouldn't use the second approach. If you're having problems with "zombie" threads, set IsBackground to true on the thread.
Your first attempt is pretty good, but the thread continued to exist even after the application exits, because you didn't set the IsBackground property to true... here is a simplified (and improved) version of your code:
MyObject myObject = this.MyObject;
Thread t = new Thread(()=>
{
Thread.Sleep(1000); // wait a second (for a specific reason)
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
});
t.IsBackground = true;
t.Start();
With regards to the thread safety: it's difficult to tell if your program functions correctly when multiple threads execute simultaneously, because you're not showing us any points of contention in your example. It's very possible that you will experience concurrency issues if your program has contention on MyObject.
Java has the final keyword and C# has a corresponding keyword called readonly, but neither final nor readonly ensure that the state of the object you're modifying will be consistent between threads. The only thing these keywords do is ensure that you do not change the reference the object is pointing to. If two threads have read/write contention on the same object, then you should perform some type of synchronization or atomic operations on that object in order to ensure thread safety.
Update
OK, if you modify the reference to which myObject is pointing to, then your contention is now on myObject. I'm sure that my answer will not match your actual situation 100%, but given the example code you've provided I can tell you what will happen:
You will not be guaranteed which object gets modified: it can be that.MyObject or this.MyObject. That's true regardless if you're working with Java or C#. The scheduler may schedule your thread/timer to be executed before, after or during the second assignment. If you're counting on a specific order of execution, then you have to do something to ensure that order of execution. Usually that something is a communication between the threads in the form of a signal: a ManualResetEvent, Join or something else.
Here is a join example:
MyObject myObject = this.MyObject;
Thread task = new Thread(()=>
{
Thread.Sleep(1000); // wait a second (for a specific reason)
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
});
task.IsBackground = true;
task.Start();
task.Join(); // blocks the main thread until the task thread is finished
myObject = that.MyObject; // the assignment will happen after the task is complete
Here is a ManualResetEvent example:
ManualResetEvent done = new ManualResetEvent(false);
MyObject myObject = this.MyObject;
Thread task = new Thread(()=>
{
Thread.Sleep(1000); // wait a second (for a specific reason)
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
done.Set();
});
task.IsBackground = true;
task.Start();
done.WaitOne(); // blocks the main thread until the task thread signals it's done
myObject = that.MyObject; // the assignment will happen after the task is done
Of course, in this case it's pointless to even spawn multiple threads, since you're not going to allow them to run concurrently. One way to avoid this is by not changing the reference to myObject after you've started the thread, then you won't need to Join or WaitOne on the ManualResetEvent.
So this leads me to a question: why are you assigning a new object to myObject? Is this a part of a for-loop which is starting multiple threads to perform multiple asynchronous tasks?
What is the best way to achieve what I want? Is it one of the two or another approach?
Both look fine, but...
Is one of the two ways not thread-safe (I fear both...) and why?
...they are not thread safe unless MyObject.ChangeSomeProperty() is thread safe.
The first approach creates a thread and passes it the object in the constructor. Is that how I'm supposed to pass the object?
Yes. Using a closure (as in your second approach) is fine as well, with the additional advantage that you don't need to do a cast.
The second approach uses a timer which doesn't provide that possibility, so I just use the local variable in the anonymous delegate. Is that safe or is it possible in theory that the reference in the variable changes before it is evaluated by the delegate code? (This is a very generic question whenever one uses anonymous delegates).
Sure, if you add myObject = null; directly after setting timer.Elapsed, then the code in your thread will fail. But why would you want to do that? Note that changing this.MyObject will not affect the variable captured in your thread.
So, how to make this thread-safe? The problem is that myObject.ChangeSomeProperty(); might run in parallel with some other code that modifies the state of myObject. There are basically two solutions to that:
Option 1: Execute myObject.ChangeSomeProperty() in the main UI thead. This is the simplest solution if ChangeSomeProperty is fast. You can use the Dispatcher (WPF) or Control.Invoke (WinForms) to jump back to the UI thread, but the easiest way is to use a BackgroundWorker:
MyObject myObject = this.MyObject;
var bw = new BackgroundWorker();
bw.DoWork += (sender, args) => {
// this will happen in a separate thread
Thread.Sleep(1000);
DoTheCodeThatNeedsToRunAsynchronously();
}
bw.RunWorkerCompleted += (sender, args) => {
// We are back in the UI thread here.
if (args.Error != null) // if an exception occurred during DoWork,
MessageBox.Show(args.Error.ToString()); // do your error handling here
else
myObject.ChangeSomeProperty();
}
bw.RunWorkerAsync(); // start the background worker
Option 2: Make the code in ChangeSomeProperty() thread-safe by using the lock keyword (inside ChangeSomeProperty as well as inside any other method modifying or reading the same backing field).
The bigger thread-safety concern here, in my mind, may be the 1 second Sleep. If this is required in order to synchronize with some other operation (giving it time to complete), then I strongly recommend using a proper synchronization pattern rather than relying on the Sleep. Monitor.Pulse or AutoResetEvent are two common ways to achieve synchronization. Both should be used carefully, as it's easy to introduce subtle race conditions. However, using Sleep for synchronization is a race condition waiting to happen.
Also, if you want to use a thread (and don't have access to the Task Parallel Library in .NET 4.0), then ThreadPool.QueueUserWorkItem is preferable for short-running tasks. The thread pool threads also won't hang up the application if it dies, as long as there is not some deadlock preventing a non-background thread from dying.
One thing not mentioned so far: The choice of threading methods depends heavily on specifically what DoTheCodeThatNeedsToRunAsynchronously() does.
Different .NET threading approaches are suitable for different requirements. One very large concern is whether this method will complete quickly, or take some time (is it short-lived or long-running?).
Some .NET threading mechanisms, like ThreadPool.QueueUserWorkItem(), are for use by short-lived threads. They avoid the expense of creating a thread by using "recycled" threads--but the number of threads it will recycle is limited, so a long-running task shouldn't hog the ThreadPool's threads.
Other options to consider are using:
ThreadPool.QueueUserWorkItem() is a convienient means to fire-and-forget small tasks on a ThreadPool thread
System.Threading.Tasks.Task is a new feature in .NET 4 which makes small tasks easy to run in async/parallel mode.
Delegate.BeginInvoke() and Delegate.EndInvoke() (BeginInvoke() will run the code asynchronously, but it's crucial that you ensure EndInvoke() is called as well to avoid potential resource-leaks. It's also based on ThreadPool threads I believe.
System.Threading.Thread as shown in your example. Threads provide the most control but are also more expensive than the other methods--so they are ideal for long-running tasks or detail-oriented multithreading.
Overall my personal preference has been to use Delegate.BeginInvoke()/EndInvoke() -- it seems to strike a good balance between control and ease of use.

Categories

Resources