I am learning multithreading concepts (in general, and specifically for C#/.NET). After reading different articles, I still could not fully understand a few basic concepts.
I posted this question. "Hans Passant" explained it well, but I was not able to understand some parts of his answer, so I started googling.
I read this question, which has no answers.
Are multithreading and MTA the same?
Suppose I write a WinForm application which is STA (as mentioned above its Main() method), still I can create multiple threads in my application. I can safely say my application is "multi-threaded". Does that also mean my application is MTA?
While talking about STA/MTA, most of the articles (like this) talk about COM/DCOM/Automation/ActiveX. Does that mean .NET has nothing to do with STA/MTA?
No. MTA is a property of a single thread, just like STA. You now make the exact opposite promise, you declare that the thread does absolutely nothing to keep external code thread-safe. So no need to have a dispatcher and you can block as much and as long as you like.
This has consequences of course and they can be quite unpleasant. It is deadly if the UI thread of your program is in the MTA since it uses so many external components that are fundamentally thread-unsafe. The clipboard won't work, drag+drop doesn't work, OpenFileDialog typically just hangs your program, WebBrowser won't fire its events.
Some components check for this and raise an exception but this check isn't consistently implemented. WPF is notable, while apartment state normally matters only to unmanaged code, WPF borrowed the concept and raises "The calling thread must be STA, because many UI components require this." Which is a bit misleading, what it really means is that the thread must have a dispatcher to allow its controls to work. But otherwise consistent with the STA promise.
It can work when the component uses COM and the author has provided a proxy. The COM infrastructure now steps in to make the component thread-safe, it creates a new thread that is STA to give it a safe home. And every method call is automatically marshaled so it runs on that thread, thus providing thread-safety. The exact equivalent of Dispatcher.Invoke() but done entirely automatically. The consequence however is that this is slow, a simple property access that normally takes a few nanoseconds can now take multiple microseconds.
You'd be lucky if the component supports MTA as well as STA. This is not common, only somebody like Microsoft goes the extra thousand miles to keep their libraries thread-safe.
I should perhaps emphasize that the concept of apartments is entirely missing in the .NET Framework. Other than the basics of stating the apartment type, necessary since .NET programs often need to interop with unmanaged code. So writing a Winforms app with worker threads is just fine, and those worker threads are always in the MTA, you do however get to deal with thread-safety yourself and nothing is automatic.
This is generally well-understood, just about everybody knows how to use the lock keyword, the Task and BackgroundWorker classes and knows that the Control.Begin/Invoke() method is required to update UI from a worker thread. With an InvalidOperationException to remind you when you get it wrong. Leaving it up to the programmer instead of the system taking care of thread-safety does make it harder to use threads. But gives you lots of opportunities to do it better than the system can. Which was necessary, this system-provided thread-safety got a serious black eye when Java punched it in the face during the middleware wars of the late 90s.
There are several questions here, but first let's start with this:
An apartment is a context in which a COM object is initialized and executed. It can be either single-threaded (STA), normally used for objects that are not thread-safe, or multi-threaded (MTA).
the term apartment, which describes the constructs in which COM
objects are created
From: https://msdn.microsoft.com/en-us/library/ms809971.aspx
So Multithreading and MTA are not the same, but MTA is Multithreaded.
We can say that STA and MTA are related to COM objects.
You can read more here: https://msdn.microsoft.com/en-us/library/ms693344(v=vs.85).aspx
So, for your second question: the fact that your WinForm application is "multi-threaded" does not mean it is "MTA".
Finally, the MTA/STA concepts are older than .NET, but we cannot say that they have nothing to do with it, because .NET supports COM in both STA and MTA.
I hope my answer helps you to understand the difference between apartments and threading.
More interesting reading here: Could you explain STA and MTA?
While looking for a memory leak in a VB.NET web service, I detected that finalizers were blocked, so several objects were never released (e.g. System.Threading.ReaderWriterLock).
Google told me that this might be because the STAThread attribute is set on my main method.
It took a long while until I found out that VB.NET uses STA as the default, while C# uses MTA.
When I added the MTAThread attribute to my Main method, everything worked fine and the objects were released.
So if I understand it right, the finalizer thread is blocked in STA mode.
So far so good, but to be honest, I heard about STA and MTA today for the first time.
Can I switch between STA and MTA without any thoughts?
UPDATE
I'm still not sure if I can switch between MTA and STA without breaking my code.
Here are some more thoughts
I do not use COM Objects in my code.
But some other libraries I'm using seem to use them under the hood, for example OracleCommand
My application is written in VB.NET, so by chance it is set to the STA apartment, since this is the VB.NET default, which I did not know at development time
If I had written my application in C#, it would be set to MTA by default
So do I need to care about the COM Objects that are used under the hood or not?
because the STAThread Attribute is set on my main method
Yes, that's a regrettable practice that VB.NET inherited from VB6. A strong goal in COM (the original underpinning of VB6 and what you use in your web service) was to hide the complexities of threading and dealing with thread-unsafe code automatically without the client programmer having to know anything about it. A COM object tells the COM runtime what kind of threading it supports. By far the most common choice is "Apartment", a confuzzling word that means that it is not thread-safe.
COM solves thread-safety issues by automatically marshaling a call of the COM method from a worker thread to the thread on which the COM object was created. Thus guaranteeing thread-safety for the COM object. The equivalent in .NET is Dispatcher.Invoke() or Control.Invoke(). Methods that you have to call explicitly in a .NET program to keep the thread-unsafe user interface working, it is done entirely automagically for a COM object.
That kind of marshaling is pretty expensive, it inevitably involves two thread context switches plus the overhead of serializing the method arguments, tens of thousands of CPU cycles at a minimum.
A thread can tell COM that it is a friendly home for a thread-unsafe COM object and will take care of the marshaling requirements, it marks itself as a Single Threaded Apartment. STA. Any calls it makes to a COM method do not have to be marshaled and run at full speed. If a call is made from a worker thread then the STA thread takes care of actually making the call.
An STA thread however has to abide by two very important rules. Breaking one of those rules causes very hard to diagnose runtime failures. Deadlock will occur if you break those rules, like you observed for your finalizer thread. They are:
An STA thread must pump a message loop. The equivalent of Application.Run() in a .NET program. It is the message loop that implements the universal solution to the producer-consumer problem. Required to be able to marshal a call from one thread to a specific other thread. If it doesn't pump then the call made on a worker thread cannot complete and will deadlock.
An STA thread is not allowed to block. Blocking greatly increases the odds for deadlock, a blocked thread isn't pumping messages. The lesser problem in a .NET program, the CLR has a great deal of support for pumping itself on calls like WaitHandle.WaitOne() and Thread.Join().
Sometimes the COM component itself will make hard assumptions about being owned by an STA thread. And use PostMessage() internally, usually to raise events. So even though you never actually make any calls on a worker thread, the component will still malfunction. WebBrowser is the most notorious example of that, its DocumentCompleted event won't fire when the thread doesn't pump.
Your web service no doubt violated the first bullet. You only get a message loop automatically in a Winforms or WPF application. And yes, poison to the finalizer thread since its final release call on the COM object must be marshaled to keep the object thread-safe. Deadlock is the inevitable outcome since the STA thread isn't pumping. A ratty problem that's pretty hard to diagnose, the only hint you get is that the program's memory usage explodes.
By marking the thread as MTA, you explicitly promise to not provide a safe home for an apartment-threaded COM server. COM is now forced to deal with the hard case, it must create a thread by itself to provide safety. That thread always pumps. While that can solve the problem with your web server, it should be noted that this is not a panacea. Those extra threads do not come for free and the calls are always marshaled so always slow. Getting too many of those helper threads is a ratty problem that's pretty hard to diagnose, the only hint you get is that the program's memory usage explodes :)
Automatic thread-safety is a very nice feature. It works 99% of the time without any hassles. Getting rid of the 1% failure mode is however a very major headache. Ultimately it boils down to the universal truth, threading is complicated and error prone. One approach is to not leave it up to COM but take the threading bull by the horns yourself. The code in this post could be helpful with that.
I've recently encountered a STA-related error in my program when I tried to launch an OpenFileDialog in a WinForm. I've done some reading, and before I add the [STAThread] attribute to my main thread I want to know how it will affect my program's execution.
I am a foreigner to COM so not everything I read made sense to me. Some points that stuck with me are:
The [STAThread] attribute defines the application as using a single-threaded apartment model.
More specifically, it changes the state of the application thread to be single-threaded.
http://www.a2zdotnet.com/View.aspx?Id=93
The STA architecture can impose significant performance penalties when an object is accessed by many threads. Each thread's access to the object is serialized and so each thread must wait in line for its turn to have a go with the object.
http://www.codeproject.com/Articles/9190/Understanding-The-COM-Single-Threaded-Apartment-Pa
I understand the need for thread-safety but I still don't understand what STAThread does. In my program (which I inherited from another developer) the main thread launches several other threads, one of which initializes the UI forms - and I think this is where the problem arises. With [STAThread] added what happens to the new threads? Does this affect multi-thread communication for non-Windows objects?
The error occurs when I try to open an OpenFileDialog in one of my forms. I added the dialog to the form using the VS designer: it didn't work. I then attempted to create a dialog box in a global file which is run by the main thread and call that instance from my form. It had no effect.
[STAThread] or Thread.SetApartmentState() are a really, really big deal. You make a promise to the operating system that you write code that is well-behaved. It matters to lots and lots of code inside Windows as well as components you use that are not thread-safe. Standard examples of such code are the Clipboard, Drag + Drop, the shell dialogs (like OpenFileDialog), components like WebBrowser and many Windows sub-components that are wrapped by .NET classes.
Thread-safety is always a big deal, writing truly thread-safe code is very, very difficult. The .NET Framework itself accomplishes it very rarely. Very basic classes like List<> are not thread-safe.
By making the promise to behave well, you must abide by the rules of writing code in a thread that reports itself to be an STA thread. You must do two basic things:
You must pump a message loop. Aka Application.Run() in a Winforms or WPF app. A message loop is a basic mechanism by which you can get code to run on a specific thread. It is the universal solution to the producer-consumer problem. Which solves the thread-safety problem, if you call thread-unsafe code always from the same thread then it isn't unsafe anymore.
You must never block your thread. Blocking an STA thread is very likely to cause deadlock. Because it stops those chunks of code that are not thread-safe from being called. There is core support for this in the CLR, blocking an STA thread with WaitOne() causes it to pump a message loop itself.
These requirements are easily met in a Winforms or WPF app. They are class libraries that were completely designed to help you implement them. Almost every single aspect about the way they behave was affected by it.
You must mark the Main() method in a GUI app as [STAThread]. Rock-hard requirement when it creates windows.
Creating another thread that displays a window is supported and possible. This time you must call SetApartmentState() to switch to STA, it cannot be a thread-pool thread. Getting this right is very difficult, in Winforms you'll get bitten badly by the SystemEvents class if you use certain kinds of controls. It has a knack for raising its events on the wrong thread. Debugging such a problem requires black-belt skills that look like this. That's supposed to scare you.
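As a rough sketch of the "dedicated STA thread" idea above (not code from the answer): the helper name RunOnDedicatedThread is made up for illustration, and the message loop a real UI thread needs is only indicated by a comment. It uses TrySetApartmentState() rather than SetApartmentState() so that it returns false instead of throwing when the state cannot be applied, which I believe is what happens on non-Windows platforms.

```csharp
using System;
using System.Threading;

public static class StaThreadSketch
{
    // Starts a dedicated (non-pool) thread, requests STA before it starts,
    // runs the work, and reports the apartment state the thread observed.
    public static ApartmentState RunOnDedicatedThread(Action work)
    {
        ApartmentState observed = ApartmentState.Unknown;

        var thread = new Thread(() =>
        {
            observed = Thread.CurrentThread.GetApartmentState();
            // A real UI thread would start its message loop here, for
            // example Application.Run(...) in a Winforms app.
            work();
        });

        // The apartment state has to be requested before Start().
        // Thread-pool threads cannot be switched this way, hence new Thread.
        thread.TrySetApartmentState(ApartmentState.STA);
        thread.IsBackground = true;
        thread.Start();
        thread.Join();
        return observed;
    }
}
```

On Windows I would expect the returned state to be STA; the point is only that the request is made on a thread you created yourself, before it starts.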
I'm running a multithreaded Windows service that needs to call a VB6 DLL. There's no documentation for this VB6 DLL, and this legacy system supports a very critical business process.
On the first call (1st thread), the DLL performs well. As other threads need access, it starts to return wrong results.
I read one person saying:
"Just be careful of one thing if you are using VB6. Your threading
model is going to have to change to support apartments if you are
running a multithreaded service. VB only supports multiple
single-threaded apartments, but .NET runs fully free threaded
normally. The thread that calls into the VB6 DLL needs to be
compatible with the DLL."
Another guy on the team gave me the idea of putting this DLL in a separate application domain. But I'm not sure.
How can we work with VB6 dll called from a multithreaded c# windows service application?
When the threads come in, are you saving objects and reusing them later on new threads? If you can, create the objects fresh for every thread. We have a situation like this with a data layer dll we use. If you create a connection on one thread, it can't be used from another. If you create a new connection on each thread, it works fine.
If it's slow to create your objects, look at the ThreadPool class and the ThreadStatic attribute. Threadpools recycle the same set of threads over and over to do work, and ThreadStatic lets you create an object that exists for one thread only. eg
[ThreadStatic]
public static LegacyComObject myObject;
As a request comes in, turn it into a job and queue it in your thread pool. When the job starts, check if the static object is initialised:
void DoWork()
{
    if (myObject == null)
    {
        // slow initialisation process
        myObject = new ...
    }
    // now do the work against myObject
    myObject.DoGreatStuff();
}
You say
I'm running a multithreaded windows
service that need to call a VB6 dll.
There's no documentation about this
VB6 dll and this legacy system
supports a very critical business
process.
and at the same time you say
At first time (1st thread), this dll
performs well. As other threads need
access, it start provide wrong
results.
I'd make very certain that Management is aware of the failure you're seeing because the code supporting the critical business process is old and undocumented, and is being used in a way it was never intended to be used, and was never tested to be used. I bet it's also never been tested to be used from .NET before, has it?
Here's my suggestion, and this is similar to something I've actually implemented:
The VB6 DLL expects to be called on a single thread. Do not disappoint it! When your service starts, have it start up a thread of the appropriate type (I can't say, since I've deliberately forgotten all that STA/MTA stuff). Queue up requests to that thread for access to the VB6 DLL. Have all such access go through the single thread.
That way, as far as the VB6 DLL is concerned, it's running exactly as it was tested to run.
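A minimal sketch of that "one thread owns the DLL" idea, under my own assumptions: SingleThreadDispatcher and InvokeAsync are made-up names, and the queue is a plain BlockingCollection. The STA request is a guess at "the appropriate type" for a VB6 component. Whether a thread that blocks on a queue pumps enough messages for a particular COM object is something you would have to verify.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Funnels every call to a thread-unsafe component through one dedicated thread.
public sealed class SingleThreadDispatcher : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _worker;

    public SingleThreadDispatcher()
    {
        _worker = new Thread(() =>
        {
            // Blocks until work arrives; ends after CompleteAdding()
            // once the queue has been drained.
            foreach (Action job in _queue.GetConsumingEnumerable())
                job();
        });
        _worker.IsBackground = true;
        // Assumption: the legacy component wants an STA thread.
        // Returns false (no exception) if the state cannot be applied.
        _worker.TrySetApartmentState(ApartmentState.STA);
        _worker.Start();
    }

    // Queues a call and returns a task that completes with its result.
    public Task<T> InvokeAsync<T>(Func<T> call)
    {
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        _queue.Add(() =>
        {
            try { tcs.SetResult(call()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        return tcs.Task;
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}
```

Each service thread would then call something like dispatcher.InvokeAsync(() => legacy.DoGreatStuff()) instead of touching the object directly, so the component only ever sees the one thread that created it.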
BTW, this is slightly different from what I've implemented. I had a web service, not a Windows Service. I had a C DLL, not VB6, and it wasn't COM. I just refactored all access to the thing into a single class, then put lock statements around each of the public methods.
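For the lock-based variant, something along these lines would do as a sketch. LegacyCalculator is a hypothetical stand-in for the thread-unsafe library (not the C DLL mentioned above), and SynchronizedCalculator is the single class that all access goes through.

```csharp
// Hypothetical stand-in for a library that is not thread-safe.
public sealed class LegacyCalculator
{
    private int _total;

    public int Add(int amount)
    {
        int snapshot = _total;      // read-modify-write: unsafe without a lock
        _total = snapshot + amount;
        return _total;
    }
}

// All access goes through this class; every public method takes the same lock.
public sealed class SynchronizedCalculator
{
    private readonly object _gate = new object();
    private readonly LegacyCalculator _inner = new LegacyCalculator();

    public int Add(int amount)
    {
        lock (_gate)
        {
            return _inner.Add(amount);
        }
    }
}
```

This serializes the calls but does not pin them to one thread, so it only helps when the library merely needs "one caller at a time" and has no thread affinity.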
This article on multithreading Visual Basic 6 DLL's provides some insight. It says:
To make an ActiveX DLL project
multithreaded, select the desired
threading options on the General tab
of the Project Properties dialog box.
This article says there are three possible models to choose from:
One thread of execution
Thread pool with round-robin thread assignment
Every externally created object is on its own thread
I assume that the default is one thread of execution, and that one of the other two options needs to be selected.
You might want to take a look at this: linky
And here is a snippet that caught my attention:
VB6 COM objects are STA objects, that means they must run on an STA thread.
You did create two instances of the object from two MTA threads, but the object itself will run on a single (COM (OLE) created) STA
thread, and access from the two MTA threads will be marshaled and synchronized.
So what you should do is, initialize the threads as STA so that each objects runs on his own STA thread without marshaling and you
will be fine.
Anyway, VB style COM objects are always STA. Now in order to prevent apartment marshaling and thread switching you need to create
instances in STA initialized apartments.
Note also that when you set the [MTAThread] attribute on Main, you effectively initialize the main thread as MTA, when you create
instances of STA objects from MTA threads COM will create a separate (unmanaged) thread and initialize it as STA (this is called the
default STA), all calls to STA objects from MTA threads will be marshaled (and incur thread switches), in some cases Idispatch calls
will fail due to IP marshaling failures.
So the advise is use STA (and therefore VB6) objects from compatible apartments only.
I've read that threads are very problematic. What alternatives are available? Something that handles blocking and stuff automatically?
A lot of people recommend the background worker, but I've no idea why.
Anyone care to explain "easy" alternatives? The user will be able to select the number of threads to use (depending on their speed needs and computer power).
Any ideas?
To summarize the problems with threads:
if threads share memory, you can get race conditions
if you avoid races by liberally using locks, you can get deadlocks (see the dining philosophers problem)
An example of a race: suppose two threads share access to some memory where a number is stored. Thread 1 reads from the memory address and stores it in a CPU register. Thread 2 does the same. Now thread 1 increments the number and writes it back to memory. Thread 2 then does the same. End result: the number was only incremented by 1, while both threads tried to increment it. The outcome of such interactions depends on timing. Worse, your code may seem to work bug-free but once in a blue moon the timing is wrong and bad things happen.
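The lost-update race described above can be sketched in C# like this (the class and method names are mine). UnsafeCount does a plain read-increment-write on a shared counter, so it may come back short by a different amount on each run. SafeCount uses Interlocked.Increment, which makes the read-modify-write atomic.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class RaceSketch
{
    // Shared counter incremented without synchronization: updates can be lost.
    public static int UnsafeCount(int iterations)
    {
        int counter = 0;
        Parallel.For(0, iterations, i => { counter++; });
        return counter;
    }

    // Same loop, but each increment is a single atomic operation.
    public static int SafeCount(int iterations)
    {
        int counter = 0;
        Parallel.For(0, iterations, i => { Interlocked.Increment(ref counter); });
        return counter;
    }
}
```

On a multi-core machine UnsafeCount(1000000) will usually return less than a million, but not always, which is exactly the "once in a blue moon" problem.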
To avoid these problems, the answer is simple: avoid sharing writable memory. Instead, use message passing to communicate between threads. An extreme example is to put the threads in separate processes and communicate via TCP/IP connections or named pipes.
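A small in-process sketch of the message-passing idea, assuming a BlockingCollection as the "mailbox" (my choice; pipes or sockets follow the same shape). Only the consumer ever touches the running total, so no lock is needed around it. The names are made up for illustration.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class MessagePassingSketch
{
    // Several producers send numbers; one consumer owns the running total.
    public static long SumViaMessages(int producers, int messagesPerProducer)
    {
        using (var mailbox = new BlockingCollection<int>(boundedCapacity: 1024))
        {
            Task<long> consumer = Task.Run(() =>
            {
                long total = 0; // private to the consumer, never shared
                foreach (int message in mailbox.GetConsumingEnumerable())
                    total += message;
                return total;
            });

            var senders = new Task[producers];
            for (int p = 0; p < producers; p++)
            {
                senders[p] = Task.Run(() =>
                {
                    for (int i = 1; i <= messagesPerProducer; i++)
                        mailbox.Add(i); // blocks while the mailbox is full
                });
            }

            Task.WaitAll(senders);
            mailbox.CompleteAdding(); // lets the consumer's loop finish
            return consumer.Result;
        }
    }
}
```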
Another approach is to share only read-only data structures, which is why functional programming languages can work so well with multiple threads.
This is a somewhat higher-level answer, but it may be useful if you want to consider other alternatives to threads. Most of the answers discussed solutions based on threads (or thread pools) or maybe tasks from .NET 4.0, but there is one more alternative, which is called message-passing. This has been successfully used in Erlang (a functional language used by Ericsson). Since functional programming is becoming more mainstream these days (e.g. F#), I thought I could mention it. In general:
Threads (or thread pools) can usually be used when you have some relatively long-running computation. When it needs to share state with other threads, it gets tricky (you have to correctly use locks or other synchronization primitives).
Tasks (available in TPL in .NET 4.0) are very lightweight - you can split your program into thousands of tasks and then let the runtime run them (it will use an optimal number of threads). If you can write your algorithm using tasks instead of threads, it sounds like a good idea - you can avoid some synchronization when you run the computation in smaller steps.
Declarative approaches (PLINQ in .NET 4.0 is a great option) - if you have some higher-level data processing operation that can be encoded using LINQ primitives, then you can use this technique. The runtime will automatically parallelize your code, because LINQ doesn't specify how exactly it should evaluate the results (you just say what results you want to get).
Message-passing allows you to write a program as concurrently running processes that perform some (relatively simple) tasks and communicate by sending messages to each other. This is great, because you can share some state (send messages) without the usual synchronization issues (you just send a message, then do other things or wait for messages). Here is a good introduction to message-passing in F# from Robert Pickering.
Note that the last three techniques are quite related to functional programming - in functional programming, you design programs differently - as computations that return a result (which makes it easier to use Tasks). You also often write declarative and higher-level code (which makes it easier to use Declarative approaches).
When it comes to actual implementation, F# has a wonderful message-passing library right in the core libraries. In C#, you can use Concurrency & Coordination Runtime, which feels a bit "hacky", but is probably quite powerful too (but may look too complicated).
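To make the Tasks and PLINQ options from the list above a bit more concrete, here is a small C# sketch of the same computation written both ways. The names are made up, and the chunking scheme in the task version is just one possible way to split the work.

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class TaskAndPlinqSketch
{
    // Task version: split the range into independent pieces, each piece
    // returns a partial result, so nothing is shared while they run.
    public static long SumOfSquaresWithTasks(int n, int chunks)
    {
        var tasks = new Task<long>[chunks];
        for (int c = 0; c < chunks; c++)
        {
            int chunk = c; // copy the loop variable before capturing it
            tasks[c] = Task.Run(() =>
            {
                long partial = 0;
                for (int i = 1 + chunk; i <= n; i += chunks)
                    partial += (long)i * i;
                return partial;
            });
        }
        // Reading Result waits for each task to finish.
        return tasks.Sum(t => t.Result);
    }

    // Declarative version: say what to compute, let PLINQ decide how.
    public static long SumOfSquaresWithPlinq(int n)
    {
        return Enumerable.Range(1, n).AsParallel().Sum(i => (long)i * i);
    }
}
```

Both versions return results instead of mutating shared state, which is the functional style the answer is pointing at.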
Won't the parallel programming options in .Net 4 be an "easy" way to use threads? I'm not sure what I'd suggest for .Net 3.5 and earlier...
This MSDN link to the Parallel Computing Developer Center has links to lots of info on Parallel Programming including links to videos, etc.
I can recommend this project. Smart Thread Pool
Project Description
Smart Thread Pool is a thread pool written in C#. It is far more advanced than the .NET built-in thread pool.
Here is a list of the thread pool features:
The number of threads dynamically changes according to the workload on the threads in the pool.
Work items can return a value.
A work item can be cancelled.
The caller thread's context is used when the work item is executed (limited).
Usage of minimum number of Win32 event handles, so the handle count of the application won't explode.
The caller can wait for multiple or all the work items to complete.
Work item can have a PostExecute callback, which is called as soon as the work item is completed.
The state object, that accompanies the work item, can be disposed automatically.
Work item exceptions are sent back to the caller.
Work items have priority.
Work items group.
The caller can suspend the start of a thread pool and work items group.
Threads have priority.
Can run COM objects that have single threaded apartment.
Support Action and Func delegates.
Support for WindowsCE (limited)
The MaxThreads and MinThreads can be changed at run time.
Cancel behavior is improved.
"Problematic" is not the word I would use to describe working with threads. "Tedious" is a more appropriate description.
If you are new to threaded programming, I would suggest reading this thread as a starting point. It is by no means exhaustive but has some good introductory information. From there, I would continue to scour this website and other programming sites for information related to specific threading questions you may have.
As for specific threading options in C#, here's some suggestions on when to use each one.
Use BackgroundWorker if you have a single task that runs in the background and needs to interact with the UI. The task of marshalling data and method calls to the UI thread is handled automatically through its event-based model. Avoid BackgroundWorker if (1) your assembly does not already reference the System.Windows.Forms assembly, (2) you need the thread to be a foreground thread, or (3) you need to manipulate the thread priority.
Use a ThreadPool thread when efficiency is desired. The ThreadPool helps avoid the overhead associated with creating, starting, and stopping threads. Avoid using the ThreadPool if (1) the task runs for the lifetime of your application, (2) you need the thread to be a foreground thread, (3) you need to manipulate the thread priority, or (4) you need the thread to have a fixed identity (aborting, suspending, discovering).
Use the Thread class for long-running tasks and when you require features offered by a formal threading model, e.g., choosing between foreground and background threads, tweaking the thread priority, fine-grained control over thread execution, etc.
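To illustrate the foreground/background distinction from the list above, here is a small sketch (names are mine) that reports what a thread-pool thread and a hand-made Thread look like from the inside. Pool threads are always background threads; a Thread you create yourself is a foreground thread unless you change it, and properties such as Priority are set the same way as IsBackground and Name here.

```csharp
using System.Threading;

public static class ThreadChoiceSketch
{
    // Runs a work item on the thread pool and reports the thread's flags.
    public static (bool IsPoolThread, bool IsBackground) DescribePoolThread()
    {
        bool isPool = false, isBackground = false;
        using (var done = new ManualResetEventSlim(false))
        {
            ThreadPool.QueueUserWorkItem(state =>
            {
                isPool = Thread.CurrentThread.IsThreadPoolThread;
                isBackground = Thread.CurrentThread.IsBackground;
                done.Set();
            });
            done.Wait();
        }
        return (isPool, isBackground);
    }

    // Runs the same check on an explicitly created foreground thread.
    public static (bool IsPoolThread, bool IsBackground) DescribeDedicatedThread()
    {
        bool isPool = false, isBackground = false;
        var thread = new Thread(() =>
        {
            isPool = Thread.CurrentThread.IsThreadPoolThread;
            isBackground = Thread.CurrentThread.IsBackground;
        });
        thread.IsBackground = false;      // the default, stated explicitly
        thread.Name = "dedicated-worker"; // a fixed identity you control
        thread.Start();
        thread.Join();
        return (isPool, isBackground);
    }
}
```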
Any time you introduce multiple threads, each running at once, you open up the potential for race conditions. To avoid these, you tend to need to add synchronization, which adds complexity, as well as the potential for deadlocks.
Many tools make this easier. .NET has quite a few classes specifically meant to ease the pain of dealing with multiple threads, including the BackgroundWorker class, which makes running background work and interacting with a user interface much simpler.
.NET 4 is going to do a lot to ease this even more. The Task Parallel Library and PLINQ dramatically ease working with multiple threads.
As for your last comment:
The user will be able to select the number of threads to use (depending on their speed needs and computer power).
Most of the routines in .NET are built upon the ThreadPool. In .NET 4, when using the TPL, the work load will actually scale at runtime, for you, eliminating the burden of having to specify the number of threads to use. However, there are ways to do this now.
Currently, you can use ThreadPool.SetMaxThreads to help limit the number of threads generated. In TPL, you can specify ParallelOptions.MaxDegreeOfParallelism, and pass an instance of the ParallelOptions into your routine to control this. The default behavior scales up with more threads as you add more processing cores, which is usually the best behavior in any case.
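As a sketch of how a user-selected thread count could be wired into the TPL (the method name and the Thread.Sleep stand-in for real work are my own), the loop below caps concurrency with ParallelOptions.MaxDegreeOfParallelism and measures the highest number of iterations that were actually running at the same time.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class DegreeOfParallelismSketch
{
    // Returns the highest number of loop bodies observed running at once.
    public static int PeakConcurrency(int workItems, int maxThreads)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };
        int running = 0;
        int peak = 0;

        Parallel.For(0, workItems, options, i =>
        {
            int now = Interlocked.Increment(ref running);

            // Record a new peak with a compare-and-swap retry loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen)
            {
            }

            Thread.Sleep(1); // stand-in for real work
            Interlocked.Decrement(ref running);
        });

        return peak;
    }
}
```

The setting is an upper bound, not a guarantee: the runtime may use fewer threads than the user asked for.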
Threads are not problematic if you understand what causes problems with them.
For example, if you avoid statics and know which APIs to use (e.g. synchronized streams), you will avoid many of the issues that come from using threads badly.
If threading is a problem (this can happen if you have unsafe/unmanaged 3rd party DLLs that cannot support multithreading), an option is to create a mechanism to queue the operations, i.e. store the parameters of the action in a database and just run through them one at a time. This can be done in a Windows service. Obviously this will take longer, but in some cases it is the only option.
Threads are indispensable tools for solving many problems, and it behooves the maturing developer to know how to effectively use them. But like many tools, they can cause some very difficult-to-find bugs.
Don't shy away from something so useful just because it can cause problems; instead, study and practice until you become the go-to guy for multi-threaded apps.
A great place to start is Joe Albahari's article: http://www.albahari.com/threading/.
All ThreadPool threads are in the
multithreaded apartment.
--As per the MSDN
What does that mean? I am really concerned with what the difference between the multi vs single threaded apartment model is. Or what does the apartment model mean? I have read the MSDN on it, and it doesn't really make sense to me. I think I may have an idea, but I was thinking someone on here could explain it in plain English.
Thanks,
Anthony D
Update 1
Found this
Could you explain STA and MTA?
Can anyone be more descriptive?
Update 2
I am also looking for an answer about how this applies to the thread pool, and what I need to watch out for because of this.
STA (single-threaded apartment) and MTA (multi-threaded apartment) are to do with COM. COM components can be designed to be accessed by a single thread, in which case they are hosted in an STA, or they can be made internally thread safe, and hosted in an MTA. A process can have only one MTA, but many STAs. If you're only going to consume COM components all that you really need to know is that you have to match the apartment to the component or nasty things will happen.
In actuality, STAs and MTAs have an impact on .NET code. See Chris Brumme's blog entry for way more detail than you probably need:
https://devblogs.microsoft.com/cbrumme/apartments-and-pumping-in-the-clr/
It's really important to understand how STAs pump messages in .NET. It does have consequences.
If your COM object needs to believe that it is in a single-threaded environment, use STA. You are guaranteed that the creation and all calls will be made by the same thread. You can safely use Thread local storage and you don't need to use critical sections.
If your COM object can be accessed by many threads simultaneously, use MTA -- there will be no guards put in place.
As others have pointed out, it generally has little impact on .NET applications.
However, be aware that the Microsoft test host used for unit tests is actually implemented in an STA, which means that there are limitations on what you can do in unit tests. For example, you cannot do a WaitAll on a WaitHandle in a unit test if you're using Microsoft's test host.
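The restriction comes from WaitHandle.WaitAll itself: called with more than one handle on an STA thread, it throws NotSupportedException. A common workaround, sketched below with a made-up helper name, is to wait on the handles one at a time. That gives the same "all signaled" result as long as the handles stay signaled (for example ManualResetEvent), though the timeout then applies per handle rather than to the whole set.

```csharp
using System.Threading;

public static class StaFriendlyWait
{
    // Waits for every handle in turn instead of calling WaitHandle.WaitAll.
    // Returns false as soon as one handle times out.
    public static bool WaitEach(WaitHandle[] handles, int timeoutPerHandleMs)
    {
        foreach (WaitHandle handle in handles)
        {
            if (!handle.WaitOne(timeoutPerHandleMs))
                return false;
        }
        return true;
    }
}
```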
You don't have to worry about it unless you're doing COM-interop, in which case there are marshalling issues. It has no ramifications for .net itself.