I am learning multithreading concepts (in general and as applied to C#/.NET). After reading various articles, I still could not fully understand a few basic concepts.
I posted this question. "Hans Passant" explained it well, but I was not able to understand some parts of his answer. So I started googling.
I read this question, which has no answers.
Are multithreading and MTA the same?
Suppose I write a WinForms application which is STA (as marked above its Main() method); I can still create multiple threads in my application. I can safely say my application is "multi-threaded". Does that also mean my application is MTA?
While talking about STA/MTA, most of the articles (like this) talk about COM/DCOM/Automation/ActiveX. Does that mean .NET has nothing to do with STA/MTA?
No. MTA is a property of a single thread, just like STA. You now make the exact opposite promise, you declare that the thread does absolutely nothing to keep external code thread-safe. So no need to have a dispatcher and you can block as much and as long as you like.
This has consequences of course and they can be quite unpleasant. It is deadly if the UI thread of your program is in the MTA since it uses so many external components that are fundamentally thread-unsafe. The clipboard won't work, drag+drop doesn't work, OpenFileDialog typically just hangs your program, WebBrowser won't fire its events.
Some components check for this and raise an exception but this check isn't consistently implemented. WPF is notable, while apartment state normally matters only to unmanaged code, WPF borrowed the concept and raises "The calling thread must be STA, because many UI components require this." Which is a bit misleading, what it really means is that the thread must have a dispatcher to allow its controls to work. But otherwise consistent with the STA promise.
It can work when the component uses COM and the author has provided a proxy. The COM infrastructure now steps in to make the component thread-safe: it creates a new thread that is STA to give it a safe home. And every method call is automatically marshaled so it runs on that thread, thus providing thread-safety. The exact equivalent of Dispatcher.Invoke() but done entirely automatically. The consequence however is that this is slow; a simple property access that normally takes a few nanoseconds can now take multiple microseconds.
You'd be lucky if the component supports MTA as well as STA. This is not common, only somebody like Microsoft goes the extra thousand miles to keep their libraries thread-safe.
I should perhaps emphasize that the concept of apartments is entirely missing in the .NET Framework, other than the basics of stating the apartment type, which is necessary since .NET programs often need to interop with unmanaged code. So writing a Winforms app with worker threads is just fine, and those worker threads are always in the MTA; you do however get to deal with thread-safety yourself and nothing is automatic.
This is generally well-understood, just about everybody knows how to use the lock keyword, the Task and BackgroundWorker classes and knows that the Control.Begin/Invoke() method is required to update UI from a worker thread. With an InvalidOperationException to remind you when you get it wrong. Leaving it up to the programmer instead of the system taking care of thread-safety does make it harder to use threads. But gives you lots of opportunities to do it better than the system can. Which was necessary, this system-provided thread-safety got a serious black eye when Java punched it in the face during the middleware wars of the late 90s.
There are some questions but first let's start by this:
An apartment is a context in which a COM object is initialized and executed. It can be either single-threaded (STA), normally used for objects that are not thread-safe, or multithreaded (MTA).
the term apartment, which describes the constructs in which COM objects are created
From: https://msdn.microsoft.com/en-us/library/ms809971.aspx
So Multithreading and MTA are not the same, but MTA is Multithreaded.
We can say that STA and MTA are related to COM objects.
You can read more here: https://msdn.microsoft.com/en-us/library/ms693344(v=vs.85).aspx
So, for your second question: the fact that your WinForms application is "multi-threaded" does not mean it is "MTA".
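To make the "apartment state belongs to a thread, not to the application" point concrete, here is a minimal console sketch. The class and method names are made up for illustration. On Windows I would expect the main thread to report STA (because of the attribute) and the worker to report MTA; other platforms do not implement COM apartments and may report something else, so treat the output as platform-dependent.

```csharp
using System;
using System.Threading;

public static class ApartmentDemo
{
    // Returns the apartment state observed from inside a freshly started worker thread.
    public static ApartmentState WorkerApartmentState()
    {
        ApartmentState observed = ApartmentState.Unknown;
        var worker = new Thread(() => observed = Thread.CurrentThread.GetApartmentState());
        worker.Start();
        worker.Join();
        return observed;
    }

    [STAThread] // the same attribute a WinForms project template puts above Main()
    public static void Main()
    {
        // Apartment state is reported per thread, not per process.
        Console.WriteLine("Main thread:   " + Thread.CurrentThread.GetApartmentState());
        Console.WriteLine("Worker thread: " + WorkerApartmentState());
    }
}
```

The application here is multi-threaded either way; the worker simply never made the STA promise.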
Finally, the MTA/STA concepts are older than .NET, but we cannot say that the two are unrelated, because .NET supports COM in both STA and MTA.
I hope my answer helps you understand the difference between apartments and threading.
More interesting reading here: Could you explain STA and MTA?
This is possibly related to ProgressBar updates in blocked UI thread but is a little different.
While troubleshooting a crashing WinForms control (a DevExpress treelist) a member of our team encountered an unusual situation, and I'm wondering if anyone can help us understand what's going on. English is not his first language, so I'm posting on his behalf.
Please see this screenshot from Visual Studio 2005.
Note the following points:
The main UI thread is stopped and is currently in a DevExpress control draw method.
The code shown on screen is from a point earlier in the same call-stack. This code is in the data layer and was called in response to the control's request for an image to display for the tree node. (perhaps also originating from a Paint handler)
The displayed code is from earlier in the callstack, on the main UI thread, and is currently waiting on a lock! Since remote systems can send events which are processed on background threads in the data model (i.e., data models are sync'd between client and servers), we lock to keep the data collections thread safe.
As the callstack shows, we continued to process paint messages on the UI thread, while we would expect the thread to be blocked.
This is very difficult to replicate, and I have not been able to do so using a simpler test project on my own box. When this situation arises, however, the result is that the DevExpress control's internal state can be messed up, causing the control to crash. This doesn't really seem like a bug in the control, since it was no doubt written with the assumption that these paint methods are running only on the UI thread. What we see here makes it look like the UI thread is acting like two threads.
It would seem possible that this is merely a Visual Studio bug in the presentation of the callstack, except that this whole endeavor is resulting from an effort to troubleshoot the occasional crash of the control in the released app (in which case it shows as a big red X in the UI), so it seems the problem is not isolated to the debug environment.
Alright, that was complicated, but hopefully made sense. Any ideas?
I would strongly recommend against locking the UI to wait for background processing. Consider something like multiple buffering. You can probably get this behavior fairly easily by utilizing thread-safe collections in .NET 4, but if that's not an option there are versions of those in the Parallel Extensions released prior to v4.
What about altering the synchronization scheme so that you don't need to acquire exclusive locks to read data?
In situations where you can be sure that a read will always produce consistent data even when it happens while the data is also being written, you might be able to get away with having no lock statements for getters. Otherwise there's ReaderWriterLockSlim, which permits multiple concurrent readers but still allows you to stop the presses for write operations.
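As a rough sketch of the ReaderWriterLockSlim suggestion: the cache class below is hypothetical (it is not the asker's data layer), and it only shows the locking shape, with readers sharing the lock and a writer taking it exclusively.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical data-model class; the name and members are illustrative only.
public sealed class NodeImageCache
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<int, string> _images = new Dictionary<int, string>();

    // Several readers (e.g. the UI thread asking for a node image) may hold the read lock at once.
    public bool TryGetImage(int nodeId, out string image)
    {
        _lock.EnterReadLock();
        try { return _images.TryGetValue(nodeId, out image); }
        finally { _lock.ExitReadLock(); }
    }

    // A writer (e.g. a background sync thread) gets exclusive access.
    public void SetImage(int nodeId, string image)
    {
        _lock.EnterWriteLock();
        try { _images[nodeId] = image; }
        finally { _lock.ExitWriteLock(); }
    }
}

public static class ReaderWriterDemo
{
    public static void Main()
    {
        var cache = new NodeImageCache();
        cache.SetImage(1, "folder.png");
        string image;
        Console.WriteLine(cache.TryGetImage(1, out image) ? image : "(none)"); // prints "folder.png"
    }
}
```

A reader still blocks while a writer holds the lock, so this reduces contention rather than removing it.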
It doesn't fix everything, but it does at least reduce the number of opportunities for deadlocks.
We see something similar in our project. The stack trace looks like the message pump's event loop is called while the UI thread is waiting on a lock. This could happen if Monitor.Enter has some special behavior when called on the UI thread. I believe this is what's happening, but I haven't found any documentation to back it up yet.
Probably has something to do with synchronization contexts :)
This is a two part question:
I am working on a big project where multiple plugins developed by different teams are loaded inside one common container shell. At times I can see that my UI updates are blocked because there are multiple parallel UI updates. I want to know if there is a way to find which component is blocking the UI thread.
In .NET, how can I create a separate UI thread dedicated to UI-intensive work?
I much appreciate your help. Thanks.
Use the debugger. Debug + Break All when you notice it blocking. Then Debug + Windows + Threads and select the main thread. The call stack window shows you what it is doing.
A corner case is where these plugins are using a lot of calls to Control.Begin/Invoke or Dispatcher.Begin/Invoke. Your UI thread is not blocked in this case, it is just being overwhelmed by requests to dispatch the delegate targets. And doesn't get around to doing its normal duties anymore, like repainting the windows and responding to mouse and keyboard events. There's little you can do about this beyond working with the plugin authors to get them to mend their ways.
You've already got a UI thread, the thread that created the first window. Creating additional threads that have their own windows is possible but causes unsolvable problems with window Z-order (a window will disappear underneath the window of another app) and generous helpings of window interop threading misery.
Visual Studio 2010 (in the higher SKUs) includes features to check for this. If you run your program under the Concurrency Profiler, you can see exactly which threads are waiting on which locks when the deadlock occurs. In addition, it will highlight the deadlock (I believe in bright red) to make it easy to track down.
One approach you can take (though it may require a bit of redesign) is to disallow all plugin logic from running in the UI thread. All operations that require updates to the UI must be routed through well-defined service interfaces that can interpret, dispatch and perhaps even throttle the UI updates. This is only practical if your plugins are not deeply UI-centric and you have a service model that allows you to isolate the data being manipulated by the plugins from the visualization of that data. Without knowing more about your application, I can't give more concrete recommendations.
Here are two possible solutions to the problem that I came up with quickly. I am sure there are other equally valid solutions though.
Option 1: Instead of using the push model (via the ISynchronizeInvoke methods) switch to a pull (or poll) model in which the UI queries the plugin for updates. This has the following advantages.
It breaks the tight coupling between the UI and worker/plugin threads that Control.Invoke imposes.
It puts the responsibility of updating the UI thread on the UI thread where it should belong anyway.
The UI thread gets to dictate when and how often the update should take place.
There is no risk of the UI message pump being overrun as would be the case with the marshaling techniques initiated by the worker/plugin thread.
The worker/plugin thread does not have to wait for an acknowledgement that the update was performed before proceeding with its next steps (ie. you get more throughput on both the UI and worker/plugin threads).
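A minimal sketch of the pull model described in Option 1, assuming a hypothetical UpdateMailbox type: workers post into a thread-safe queue and never touch the UI, while the UI thread drains the queue at a rate it chooses (in a real application, from a timer tick). The console calls below only stand in for that tick.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical plugin-side mailbox: workers post, the UI polls.
public sealed class UpdateMailbox
{
    private readonly ConcurrentQueue<string> _pending = new ConcurrentQueue<string>();

    // Called from any worker/plugin thread; never waits on the UI.
    public void Post(string update) { _pending.Enqueue(update); }

    // Called by the UI thread at a rate it chooses.
    // Applies at most maxPerTick updates and returns how many were applied.
    public int Drain(Action<string> apply, int maxPerTick)
    {
        int applied = 0;
        string update;
        while (applied < maxPerTick && _pending.TryDequeue(out update))
        {
            apply(update);
            applied++;
        }
        return applied;
    }
}

public static class PullModelDemo
{
    public static void Main()
    {
        var mailbox = new UpdateMailbox();
        for (int i = 1; i <= 5; i++) mailbox.Post("update " + i);

        // One simulated "timer tick": apply at most three updates, leave the rest queued.
        int applied = mailbox.Drain(Console.WriteLine, 3);
        Console.WriteLine("applied " + applied); // prints "applied 3"
    }
}
```

The per-tick cap is what keeps the message pump from being overrun: the backlog waits in the mailbox instead of in the UI's message queue.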
Option 2: Have the plugin accept an ISynchronizeInvoke instance instead of an actual Form or Control. This special synchronizing object will be implemented using a dedicated thread and a queue that acts as buffer between the plugin and the UI. It will accept update messages via the normal Invoke or BeginInvoke methods, which means you can keep the plugin architecture and interfaces mostly intact, and then forward those messages on to the UI after some type of filtering, merging, and throttling operations have occurred. The number of update messages existing in the synchronizing object will ebb and flow as the UI and plugin threads work load changes. It could be smart enough to change its forwarding strategy as the rate of messages increase.
In a previous question, I made a bit of a faux pas. You see, I'd been reading about threads and had got the impression that they were the tastiest things since kiwi jello.
Imagine my confusion then, when I read stuff like this:
[T]hreads are A Very Bad Thing. Or, at least, explicit management of threads is a bad thing
and
Updating the UI across threads is usually a sign that you are abusing threads.
Since I kill a puppy every time something confuses me, consider this your chance to get your karma back in the black...
How should I be using threads?
Enthusiasm for learning about threading is great; don't get me wrong. Enthusiasm for using lots of threads, by contrast, is symptomatic of what I call Thread Happiness Disease.
Developers who have just learned about the power of threads start asking questions like "how many threads can I possibly create in one program?" This is rather like an English major asking "how many words can I use in a sentence?" Typical advice for writers is to keep your sentences short and to the point, rather than trying to cram as many words and ideas into one sentence as possible. Threads are the same way; the right question is not "how many can I get away with creating?" but rather "how can I write this program so that the number of threads is the minimum necessary to get the job done?"
Threads solve a lot of problems, it's true, but they also introduce huge problems:
Performance analysis of multi-threaded programs is often extremely difficult and deeply counterintuitive. I've seen real-world examples in heavily multi-threaded programs in which making a function faster without slowing down any other function or using more memory makes the total throughput of the system smaller. Why? Because threads are often like streets downtown. Imagine taking every street and magically making it shorter without re-timing the traffic lights. Would traffic jams get better, or worse? Writing faster functions in multi-threaded programs drives the processors towards congestion faster.
What you want is for threads to be like interstate highways: no traffic lights, highly parallel, intersecting at a small number of very well-defined, carefully engineered points. That is very hard to do. Most heavily multi-threaded programs are more like dense urban cores with stoplights everywhere.
Writing your own custom management of threads is insanely difficult to get right. The reason is that when you are writing a regular single-threaded program that is well designed, the amount of "global state" you have to reason about is typically small. Ideally you write objects that have well-defined boundaries, and that do not care about the control flow that invokes their members. You want to invoke an object in a loop, or a switch, or whatever, you go right ahead.
Multi-threaded programs with custom thread management require global understanding of everything that a thread is going to do that could possibly affect data that is visible from another thread. You pretty much have to have the entire program in your head, and understand all the possible ways that two threads could be interacting in order to get it right and prevent deadlocks or data corruption. That is a large cost to pay, and highly prone to bugs.
Essentially, threads make your methods lie. Let me give you an example. Suppose you have:
if (!queue.IsEmpty) queue.RemoveWorkItem().Execute();
Is that code correct? If it is single threaded, probably. If it is multi-threaded, what is stopping another thread from removing the last remaining item after the call to IsEmpty is executed? Nothing, that's what. This code, which locally looks just fine, is a bomb waiting to go off in a multi-threaded program. Basically that code is actually:
if (queue.WasNotEmptyAtSomePointInThePast) ...
which obviously is pretty useless.
So suppose you decide to fix the problem by locking the queue. Is this right?
lock(queue) {if (!queue.IsEmpty) queue.RemoveWorkItem().Execute(); }
That's not right either, necessarily. Suppose the execution causes code to run which waits on a resource currently locked by another thread, but that thread is waiting on the lock for queue - what happens? Both threads wait forever. Putting a lock around a hunk of code requires you to know everything that code could possibly do with any shared resource, so that you can work out whether there will be any deadlocks. Again, that is an extremely heavy burden to put on someone writing what ought to be very simple code. (The right thing to do here is probably to extract the work item in the lock and then execute it outside the lock. But... what if the items are in a queue because they have to be executed in a particular order? Now that code is wrong too because other threads can then execute later jobs first.)
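The parenthetical suggestion above (take the item inside the lock, execute it outside) could look roughly like this. WorkQueue and its members are illustrative names, not the queue type from the snippets, and as the answer warns, this shape gives up ordering guarantees once more than one thread calls TryRunOne.

```csharp
using System;
using System.Collections.Generic;

public sealed class WorkQueue
{
    private readonly object _gate = new object();
    private readonly Queue<Action> _items = new Queue<Action>();

    public void Add(Action item)
    {
        lock (_gate) { _items.Enqueue(item); }
    }

    // Returns false when there was nothing to run.
    public bool TryRunOne()
    {
        Action item;
        lock (_gate)
        {
            // The emptiness check and the removal happen under one lock,
            // so no other thread can take the last item in between.
            if (_items.Count == 0) return false;
            item = _items.Dequeue();
        }
        // Run outside the lock, so whatever the item does cannot deadlock on _gate.
        item();
        return true;
    }
}

public static class WorkQueueDemo
{
    public static void Main()
    {
        var queue = new WorkQueue();
        queue.Add(() => Console.WriteLine("first"));
        queue.Add(() => Console.WriteLine("second"));
        while (queue.TryRunOne()) { }
    }
}
```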
It gets worse. The C# language spec guarantees that a single-threaded program will have observable behaviour that is exactly as the program is specified. That is, if you have something like "if (M(ref x)) b = 10;" then you know that the code generated will behave as though x is accessed by M before b is written. Now, the compiler, jitter and CPU are all free to optimize that. If one of them can determine that M is going to be true and if we know that on this thread, the value of b is not read after the call to M, then b can be assigned before x is accessed. All that is guaranteed is that the single-threaded program seems to work like it was written.
Multi-threaded programs do not make that guarantee. If you are examining b and x on a different thread while this one is running then you can see b change before x is accessed, if that optimization is performed. Reads and writes can logically be moved forwards and backwards in time with respect to each other in single threaded programs, and those moves can be observed in multi-threaded programs.
This means that in order to write multi-threaded programs where there is a dependency in the logic on things being observed to happen in the same order as the code is actually written, you have to have a detailed understanding of the "memory model" of the language and the runtime. You have to know precisely what guarantees are made about how accesses can move around in time. And you cannot simply test on your x86 box and hope for the best; the x86 chips have pretty conservative optimizations compared to some other chips out there.
That's just a brief overview of just a few of the problems you run into when writing your own multithreaded logic. There are plenty more. So, some advice:
Do learn about threading.
Do not attempt to write your own thread management in production code.
Use higher-level libraries written by experts to solve problems with threads. If you have a bunch of work that needs to be done in the background and want to farm it out to worker threads, use a thread pool rather than writing your own thread creation logic. If you have a problem that is amenable to solution by multiple processors at once, use the Task Parallel Library. If you want to lazily initialize a resource, use the lazy initialization class rather than trying to write lock free code yourself.
Avoid shared state.
If you can't avoid shared state, share immutable state.
If you have to share mutable state, prefer using locks to lock-free techniques.
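To illustrate the "use higher-level libraries" advice, here is a small sketch that leans on Lazy<T> for one-time initialization and on Parallel.For from the Task Parallel Library for the fan-out, with no hand-written thread creation. The table and the sum-of-squares workload are made up for the example.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class LibraryDemo
{
    // Lazy<T> with this constructor runs the factory at most once,
    // even if several threads ask for Value at the same time.
    private static readonly Lazy<int[]> Table =
        new Lazy<int[]>(() => Enumerable.Range(0, 1000).ToArray());

    public static long SumOfSquares()
    {
        long total = 0;
        object gate = new object();
        int[] table = Table.Value;

        Parallel.For(0, table.Length,
            () => 0L,                                               // per-worker subtotal
            (i, state, local) => local + (long)table[i] * table[i], // no shared state touched here
            local => { lock (gate) { total += local; } });          // one short lock per worker
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares()); // prints 332833500
    }
}
```

The only shared mutable state is the final total, and it is guarded by a plain lock rather than anything lock-free, in line with the last two points of the list.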
Explicit management of threads is not intrinsically a bad thing, but it's fraught with dangers and shouldn't be done unless absolutely necessary.
Saying threads are absolutely a good thing would be like saying a propeller is absolutely a good thing: propellers work great on airplanes (when jet engines aren't a better alternative), but wouldn't be a good idea on a car.
You cannot appreciate what kind of problems threading can cause unless you've debugged a three-way deadlock. Or spent a month chasing a race condition that happens only once a day. So, go ahead and jump in with both feet and make all the kind of mistakes you need to make to learn to fear the Beast and what to do to stay out of trouble.
There's no way I could offer a better answer than what's already here. But I can offer a concrete example of some multithreaded code that we actually had at my work that was disastrous.
One of my coworkers, like you, was very enthusiastic about threads when he first learned about them. So there started to be code like this throughout the program:
Thread t = new Thread(LongRunningMethod);
t.Start(GetThreadParameters());
Basically, he was creating threads all over the place.
So eventually another coworker discovered this and told the developer responsible: don't do that! Creating threads is expensive, you should use the thread pool, etc. etc. So a lot of places in the code that originally looked like the above snippet started getting rewritten as:
ThreadPool.QueueUserWorkItem(LongRunningMethod, GetThreadParameters());
Big improvement, right? Everything's sane again?
Well, except that there was a particular call in that LongRunningMethod that could potentially block -- for a long time. Suddenly every now and then we started seeing it happen that something our software should have reacted to right away... it just didn't. In fact, it might not have reacted for several seconds (clarification: I work for a trading firm, so this was a complete catastrophe).
What had ended up happening was that the thread pool was actually filling up with long-blocking calls, leading to other code that was supposed to happen very quickly getting queued up and not running until significantly later than it should have.
The moral of this story is not, of course, that the first approach of creating your own threads is the right thing to do (it isn't). It's really just that using threads is tough, and error-prone, and that, as others have already said, you should be very careful when you use them.
In our particular situation, many mistakes were made:
Creating new threads in the first place was wrong because it was far more costly than the developer realized.
Queuing all background work on the thread pool was wrong because it treated all background tasks indiscriminately and did not account for the possibility of asynchronous calls actually being blocked.
Having a long-blocking method by itself was the result of some careless and very lazy use of the lock keyword.
Insufficient attention was given to ensuring that the code that was being run on background threads was thread-safe (it wasn't).
Insufficient thought was given to the question of whether making a lot of the affected code multithreaded was even worth doing to begin with. In plenty of cases, the answer was no: multithreading just introduced complexity and bugs, made the code less comprehensible, and (here's the kicker): hurt performance.
I'm happy to say that today, we're still alive and our code is in a much healthier state than it once was. And we do use multithreading in plenty of places where we've decided it's appropriate and have measured performance gains (such as reduced latency between receiving a market data tick and having an outgoing quote confirmed by the exchange). But we learned some pretty important lessons the hard way. Chances are, if you ever work on a large, highly multithreaded system, you will too.
Unless you are on the level of being able to write a fully-fledged kernel scheduler, you will always get explicit thread management wrong.
Threads can be the most awesome thing since hot chocolate, but parallel programming is incredibly complex. However, if you design your threads to be independent then you can't shoot yourself in the foot.
As a rule of thumb, if a problem is decomposed into threads, they should be as independent as possible, with as few but well-defined shared resources as possible, and with the most minimalistic management concept.
I think the first statement is best explained as such: with the many advanced APIs now available, manually writing your own thread code is almost never necessary. The new APIs are a lot easier to use, and a lot harder to mess up! Whereas, with the old-style threading, you have to be quite good to not mess up. The old-style APIs (Thread et al.) are still available, but the new APIs (Task Parallel Library, Parallel LINQ, and Reactive Extensions) are the way of the future.
The second statement is from more of a design perspective, IMO. In a design that has a clean separation of concerns, a background task should not really be reaching directly into the UI to report updates. There should be some separation there, using a pattern like MVVM or MVC.
I would start by questioning this perception:
I'd been reading about threads and had got the impression that they were the tastiest things since kiwi jello.
Don’t get me wrong – threads are a very versatile tool – but this degree of enthusiasm seems weird. In particular, it indicates that you might be using threads in a lot of situations where they simply don’t make sense (but then again, I might just mistake your enthusiasm).
As others have indicated, thread handling is also quite complex. Wrappers for threads exist, and only on rare occasions do threads have to be handled explicitly. For most applications, threads can be implied.
For example, if you just want to push a computation to the background while leaving the GUI responsive, a better solution is often either to use a callback (which makes it seem as though the computation is done in the background while it is really executed on the same thread) or to use a convenience wrapper such as BackgroundWorker, which takes over and hides all the explicit thread handling.
One last thing: creating a thread is actually very expensive. Using a thread pool mitigates this cost because the runtime creates a number of threads that are subsequently reused. When people say that explicit management of threads is bad, this may be all they are referring to.
Many advanced GUI Applications usually consist of two threads, one for the UI, one (or sometimes more) for Processing of data (copying files, making heavy calculations, loading data from a database, etc).
The processing threads shouldn't update the UI directly, the UI should be a black box to them (check Wikipedia for Encapsulation).
They just say "I'm done processing" or "I completed task 7 of 9" and call an Event or other callback method. The UI subscribes to the event, checks what has changed and updates the UI accordingly.
If you update the UI from the Processing Thread you won't be able to reuse your code and you will have bigger problems if you want to change parts of your code.
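A bare-bones sketch of that arrangement, with made-up names: the processing class only raises an event and knows nothing about the UI. Note the assumption in the comments: the handler runs on the processing thread, so a real UI subscriber would still have to marshal to the UI thread (for example with Control.BeginInvoke) before touching controls.

```csharp
using System;
using System.Threading;

public sealed class Processor
{
    // Raised on the processing thread after each step: (completed, total).
    public event Action<int, int> StepCompleted;

    public void Run(int totalSteps)
    {
        for (int step = 1; step <= totalSteps; step++)
        {
            // ... the real work for this step would go here ...
            Action<int, int> handler = StepCompleted;
            if (handler != null) handler(step, totalSteps);
        }
    }
}

public static class EventDemo
{
    public static void Main()
    {
        var processor = new Processor();

        // The "UI" subscribes; here it is just the console.
        processor.StepCompleted += (done, total) =>
            Console.WriteLine("completed task " + done + " of " + total);

        var worker = new Thread(() => processor.Run(3));
        worker.Start();
        worker.Join();
    }
}
```

Because Processor has no reference to any form or control, it can be reused or tested without a UI at all.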
I think you should experiment as much as possible with threads and get to know the benefits and pitfalls of using them. Only by experimentation and usage will your understanding of them grow. Read as much as you can on the subject.
When it comes to C# and the user interface (which is single-threaded: you can only modify user interface elements from code executing on the UI thread), I use the following utility to keep myself sane and sleep soundly at night.
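// Requires: using System.Windows.Forms;
public static class UIThreadSafe {
    // Runs inv on the thread that owns control c.
    public static void Perform(Control c, MethodInvoker inv) {
        if (c == null)
            return;
        if (c.InvokeRequired) {
            // Called from a non-UI thread: marshal the call and wait for it to finish.
            c.Invoke(inv);
        }
        else {
            // Already on the UI thread: call directly.
            inv();
        }
    }
}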
You can use this in any thread that needs to change a UI element, like this:
UIThreadSafe.Perform(myForm, delegate() {
    myForm.Text = "I Love Threads!";
});
A huge reason to try to keep the UI thread and the processing thread as independent as possible is that if the UI thread freezes, the user will notice and be unhappy. Having the UI thread be blazing fast is important. If you start moving UI stuff out of the UI thread or moving processing stuff into the UI thread, you run a higher risk of having your application become unresponsive.
Also, a lot of the framework code is deliberately written with the expectation that you will separate the UI and processing; programs will just work better when you separate the two out, and will hit errors and problems when you don't. I don't recall any specific issues that I encountered as a result of this, though I have vague recollections of trying in the past to set certain properties of things the UI was responsible for from outside the UI and having the code refuse to work; I don't recall whether it didn't compile or it threw an exception.
Threads are a very good thing, I think. But working with them is very hard and needs a lot of knowledge and training. The main problem arises when we want to access shared resources from two different threads, which can cause undesirable effects.
Consider a classic example: you have two threads which get some items from a shared list and, after doing something, remove the item from the list.
The thread method that is called periodically could look like this:
void Thread()
{
    if (list.Count > 0)
    {
        // Do stuff
        list.RemoveAt(0);
    }
}
Remember that the threads, in theory, can switch at any line of your code that is not synchronized. So if the list contains only one item, one thread could pass the list.Count condition, then just before list.RemoveAt the threads switch and another thread passes the list.Count check (the list still contains one item). Now the first thread continues to list.RemoveAt, and after that the second thread continues to list.RemoveAt, but the last item has already been removed by the first thread, so the second one crashes. That's why it has to be synchronized using the lock statement, so that there can't be a situation where two threads are inside the if statement at the same time.
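A sketch of the lock-based fix, using hypothetical SharedList and demo names: the count check and the removal sit inside the same lock, so two threads can never both be between the check and the RemoveAt.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class SharedList
{
    private readonly object _gate = new object();
    private readonly List<int> _items = new List<int>();

    public void Add(int item) { lock (_gate) { _items.Add(item); } }

    // Check and removal form one critical section.
    public bool TryTakeFirst(out int item)
    {
        lock (_gate)
        {
            if (_items.Count > 0)
            {
                item = _items[0];
                _items.RemoveAt(0);
                return true;
            }
        }
        item = 0;
        return false;
    }
}

public static class SharedListDemo
{
    // Two threads drain the same list; returns how many items were taken in total.
    public static int TakeAllWithTwoThreads(int itemCount)
    {
        var list = new SharedList();
        for (int i = 0; i < itemCount; i++) list.Add(i);

        int taken = 0;
        ThreadStart drain = () =>
        {
            int item;
            while (list.TryTakeFirst(out item)) Interlocked.Increment(ref taken);
        };
        var a = new Thread(drain);
        var b = new Thread(drain);
        a.Start(); b.Start();
        a.Join(); b.Join();
        return taken;
    }

    public static void Main()
    {
        Console.WriteLine(TakeAllWithTwoThreads(1000)); // prints 1000
    }
}
```

Every item is taken exactly once, and neither thread ever calls RemoveAt on an empty list.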
So that is the reason why UI which is not synchronized must always run in a single thread and no other thread should interfere with UI.
In earlier versions of .NET, if you wanted to update the UI from another thread, you had to synchronize using the Invoke methods. Because that was hard enough to get right, newer versions of .NET come with the BackgroundWorker class, which simplifies things by wrapping all of that and letting you do the asynchronous work in a DoWork event handler and update the UI in a ProgressChanged event handler.
A couple of things are important to note when updating the UI from a non-UI thread:
If you use "Invoke" frequently, the performance of your non-UI thread may be severely adversely affected if other stuff makes the UI thread run sluggishly. I prefer to avoid using "Invoke" unless the non-UI thread needs to wait for the UI-thread action to be performed before it continues.
If you use "BeginInvoke" recklessly for things like control updates, an excessive number of invocation delegates may get queued, some of which may well be pretty useless by the time they actually occur.
My preferred style in many cases is to have each control's state encapsulated in an immutable class, and then have a flag which indicates whether an update is not needed, pending, or needed but not pending (the latter situation may occur if a request is made to update a control before it is fully created). The control's update routine should, if an update is needed, start by clearing the update flag, grabbing the state, and drawing the control. If the update flag is set again by then, it should re-loop. To request an update from another thread, a routine should use Interlocked.Exchange to set the update flag to "update pending" and, if it wasn't already pending, try to BeginInvoke the update routine; if the BeginInvoke fails, it should set the update flag to "needed but not pending".
If an attempt to update the control occurs just after the control's update routine checks and clears its update flag, it may well happen that the first update will reflect the new value but the update flag will have been set anyway, forcing an extra screen redraw. On the occasions when this happens, it will be relatively harmless. The important thing is that the control will end up being drawn in the correct state, without there ever having been more than one BeginInvoke pending.
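Here is a simplified sketch of that flag scheme. It is not the full design described above: it uses only two states (idle and pending) and leaves out the "needed but not pending" case and the re-loop. The post delegate is a stand-in for Control.BeginInvoke so the example can run in a console; all names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class CoalescingUpdater
{
    private int _pending;                   // 0 = no update queued, 1 = update queued
    private readonly Action<Action> _post;  // stand-in for Control.BeginInvoke
    private readonly Action _draw;          // redraws from the latest state

    public CoalescingUpdater(Action<Action> post, Action draw)
    {
        _post = post;
        _draw = draw;
    }

    // Safe to call from any thread. Only the caller that flips 0 -> 1 queues a call,
    // so at most one update is ever waiting in the UI queue.
    public void RequestUpdate()
    {
        if (Interlocked.Exchange(ref _pending, 1) == 0)
            _post(Update);
    }

    // Runs on the UI thread. The flag is cleared before drawing, so a request that
    // arrives during the draw queues one more (possibly redundant) update.
    private void Update()
    {
        Interlocked.Exchange(ref _pending, 0);
        _draw();
    }
}

public static class CoalescingDemo
{
    public static void Main()
    {
        var posted = new Queue<Action>(); // plays the role of the UI message queue
        int draws = 0;
        var updater = new CoalescingUpdater(posted.Enqueue, () => draws++);

        updater.RequestUpdate();
        updater.RequestUpdate();
        updater.RequestUpdate();
        Console.WriteLine("queued: " + posted.Count); // prints "queued: 1"

        while (posted.Count > 0) posted.Dequeue()();
        Console.WriteLine("draws: " + draws);         // prints "draws: 1"
    }
}
```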
My current project is a WPF application with an SQL Server back end.
In WPF, the UI can only be modified by the UI thread. If a UI modification needs to be done on another thread, then the dispatcher object can be called and given an action. Effectively, this is mapping my Delegate to a WM_ message.
Since the LINQ DataContexts for SQL Server are also single-threaded, how could I copy this "Dispatcher" idea from WPF and create a similar object that I can use to marshal requests to my public DataContext so that they always come from the "public SQL thread"?
I'm guessing I'd need to create a thread at start up which initialises the data contexts and then sleeps until woken by the SqlThread.Invoke() method.
Does anyone know of anything similar to this idea or any materials that may help me do this?
If you mean a LINQ-to-SQL DataContext, I would advise against this; use DataContexts as a short-lived unit-of-work, then Dispose() it; don't keep it for lots of different purposes (there are issues with stale data, cache growth, threading, concurrency, plus (importantly) how to handle failure / rollback).
Re the bigger picture:
Essentially you are describing a work queue, such as a producer/consumer queue. There are plenty of such around, or they are relatively easy to write (for example, see here or here; just add a loop to dequeue+process items). IIRC .NET 4.0 also includes (in the parallel extensions) such constructs pre-canned.
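For what it's worth, a producer/consumer queue with a dedicated consumer thread can be sketched with BlockingCollection<T> from .NET 4.0. SingleThreadDispatcher and its members are names invented for this example, and it only shows the queueing mechanics; it does not change the advice above about keeping a DataContext short-lived.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class SingleThreadDispatcher : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public SingleThreadDispatcher()
    {
        // The loop to dequeue and process items; it blocks while the queue is empty
        // and ends once CompleteAdding has been called and the queue is drained.
        _thread = new Thread(() =>
        {
            foreach (Action work in _queue.GetConsumingEnumerable()) work();
        });
        _thread.IsBackground = true;
        _thread.Start();
    }

    public int ThreadId { get { return _thread.ManagedThreadId; } }

    // Fire and forget.
    public void BeginInvoke(Action work) { _queue.Add(work); }

    // Runs work on the dedicated thread and waits for its result.
    // Do not call this from the dedicated thread itself: it would wait forever.
    // A failure inside work surfaces as an AggregateException from Task.Result.
    public T Invoke<T>(Func<T> work)
    {
        var completion = new TaskCompletionSource<T>();
        _queue.Add(() =>
        {
            try { completion.SetResult(work()); }
            catch (Exception ex) { completion.SetException(ex); }
        });
        return completion.Task.Result;
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _thread.Join();
        _queue.Dispose();
    }
}

public static class DispatcherDemo
{
    public static void Main()
    {
        using (var dispatcher = new SingleThreadDispatcher())
        {
            int first = dispatcher.Invoke(() => Thread.CurrentThread.ManagedThreadId);
            int second = dispatcher.Invoke(() => Thread.CurrentThread.ManagedThreadId);
            Console.WriteLine(first == second && first == dispatcher.ThreadId); // prints True
        }
    }
}
```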