I have a program that receives a packet from the network on one thread and then notifies other threads that the packet has arrived. My current approach uses Thread.Interrupt, which seems to be a bit slow when transferring huge amounts of data. Would it be faster to use "lock" to avoid issuing too many interrupts, or is a lock really just calling Interrupt() in its implementation?
I don't understand why you would use Thread.Interrupt rather than a more traditional signalling method to notify waiting threads that data has been received. Thread.Interrupt requires the target thread to be in a wait state anyway, so why not just add an object that you can signal to the target thread's wait logic, and use that to kick it when new data arrives?
lock is used to protect critical code or data from concurrent access by other threads; it is ill-suited as a mechanism for actively signalling between threads.
Use WaitOne or WaitAll on suitable object(s) instead of either. System.Collections.Concurrent in .NET 4 also provides excellent means for queueing new data to a pool of target threads, along with other possible approaches to your problem.
Neither Thread.Interrupt nor lock is well suited for signaling other threads.
Thread.Interrupt is used to poke or unstick one of the blocking calls in the BCL.
lock is used to prevent simultaneous access to a resource or a block of code.
Signaling other threads is better accomplished with one of the following mechanisms (see the sketch after this list).
ManualResetEvent (or ManualResetEventSlim)
AutoResetEvent
EventWaitHandle
Barrier
CountdownEvent
WaitHandle.WaitAny or WaitHandle.WaitAll
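For example, with an AutoResetEvent the receiving thread can sleep until the network thread signals that a packet has arrived. A minimal sketch, with the actual queue handling elided:

static readonly AutoResetEvent packetArrived = new AutoResetEvent(false);

// Network thread, after enqueueing the packet:
packetArrived.Set();      // wakes exactly one waiting thread

// Worker thread:
packetArrived.WaitOne();  // blocks until Set() is called, then auto-resets
// ... dequeue and process the packet ...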
I usually use a standard queue and the lock keyword when reading or writing it. Alternatively, the Synchronized method on the queue removes the need to use lock. A System.Threading.Semaphore is the best tool for notifying worker threads when there is a new job to process.
Example of how to add to the queue:
lock (myQueue) { myQueue.Enqueue(workItem); }
mySemaphore.Release();
Example of how to process a work item:
mySemaphore.WaitOne();
object workItem;
lock (myQueue) { workItem = myQueue.Dequeue(); }
// process work item
Semaphore setup:
mySemaphore = new Semaphore(0, int.MaxValue);
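Putting those fragments together, a worker thread's loop would look roughly like this (ProcessWorkItem is a hypothetical handler; myQueue and mySemaphore are the shared fields above):

while (true)
{
    mySemaphore.WaitOne();     // sleeps until Release() signals a new item
    object workItem;
    lock (myQueue) { workItem = myQueue.Dequeue(); }
    ProcessWorkItem(workItem); // process outside the lock
}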
If this is too slow and synchronization overhead still dominates your application, you might want to look at dispatching more than one work item at a time.
Depending on what you're doing, the new parallelization features in .NET 4.0 may also be very helpful for your application (if that's an option).
I have an application that handles TCP connections, and when a connection is made, BeginRead is called on the stream to wait for data, and the main thread resumes waiting for new connections.
Under normal circumstances, there will be only a few connections at a time, and so generally the number of worker threads created by BeginRead is not an issue. However, in the theoretical situation that many, many connections exist at the same time, eventually when a new connection is made, the call to BeginRead causes an OutOfMemoryException. I would like to prevent the thread from being created in this situation (or be informed of a better way to wait for data from multiple streams).
What are some decent ways of accomplishing this? All I can think to do is to either
a) only allow a certain number of active connections at a time, or
b) attempt to use something called a MemoryFailPoint. After reading about it, I think this might be the better option, but how do I know how much memory a call to BeginRead needs in order to do its thing safely?
Take a look at this thread; it can give you many answers for that.
But you can read the current memory usage of your process like this:
using System.Diagnostics;

Process currentProcess = Process.GetCurrentProcess();
long memorySize = currentProcess.PrivateMemorySize64; // private bytes, in bytes
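If you want to try the MemoryFailPoint route, a sketch looks like this. Note that the size is an estimate you have to choose yourself; the 16 MB here is purely illustrative, and picking a good figure is exactly the hard part the question raises. The stream, buffer, OnReadComplete, and state names are placeholders:

// MemoryFailPoint lives in System.Runtime.
try
{
    using (new MemoryFailPoint(16))   // "I expect to need ~16 MB" (a guess)
    {
        stream.BeginRead(buffer, 0, buffer.Length, OnReadComplete, state);
    }
}
catch (InsufficientMemoryException)
{
    // Not enough memory to proceed safely; reject or queue the connection.
}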
You should be using the thread pool to handle these operations rather than creating whole new threads for every operation you need to perform. A thread pool that re-uses threads greatly reduces the cost of creating and tearing down threads (which isn't cheap), and the pool will only create new threads when it is beneficial to do so, simply letting requests queue up when adding more threads wouldn't help.
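For example, instead of new Thread(...).Start() per operation, work can be queued to the shared pool (ProcessRequest and request are placeholders):

ThreadPool.QueueUserWorkItem(state =>
{
    ProcessRequest(state);   // runs on a pooled thread, which is re-used afterwards
}, request);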
I have a situation where I have a polling thread for a TCPClient (is that the best plan for a discrete TCP device?) which aggregates messages and occasionally responds to those messages by firing off events. The event producer really doesn't care much if the thread is blocked for a long time, but the consumer's design is such that I'd prefer to have it invoke the handlers on a single worker thread that I've got for handling a state machine.
The question then is this. How should I best manage the creation, configuration (thread name, is background, etc.), lifetime, and marshaling of calls for these threads using the Task library? I'm somewhat familiar with doing this explicitly using the Thread type, but whenever possible my company prefers to do what we can just through the use of Task.
Edit: I believe what I need here will be based around a SynchronizationContext on the consumer's type that ensures that tasks are scheduled on a single thread tied to that context.
The question then is this. How should I best manage the creation, configuration (thread name, is background, etc.), lifetime, and marshaling of calls for these threads using the Task library?
This sounds like a perfect use case for BlockingCollection<T>. This class is designed specifically for producer/consumer scenarios, and allows you to have any threads add to the collection (which acts like a thread safe queue), and one (or more) thread or task call blockingCollection.GetConsumingEnumerable() to "consume" the items.
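A minimal sketch, assuming a Packet type and a HandlePacket method of your own:

var queue = new BlockingCollection<Packet>();   // System.Collections.Concurrent

// Producer (the TCP polling thread):
queue.Add(packet);

// Consumer (your single state-machine thread):
foreach (Packet p in queue.GetConsumingEnumerable())
{
    HandlePacket(p);   // items arrive in order, all on this one thread
}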
You could consider using TPL DataFlow, where you set up an ActionBlock<T> that you push messages into from your TCP thread; TPL DataFlow will then take care of the rest, scaling out the processing of the actions as much as your hardware can handle. You can also control exactly how much processing of the actions happens by configuring the ActionBlock<T> with a MaxDegreeOfParallelism.
Since processing sometimes can't keep up with the flow of incoming data, you might want to consider "linking" a BufferBlock<T> in front of the ActionBlock<T> to ensure that the TCP processing thread doesn't get too far ahead of what you can actually process. This would have the same effect as using BlockingCollection<T> with a bounded capacity.
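A minimal sketch of that setup, assuming a Message type and a Handle method of your own (the namespace is System.Threading.Tasks.Dataflow):

var worker = new ActionBlock<Message>(
    msg => Handle(msg),
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 });

// A bounded buffer in front throttles the producer: Post returns false when
// the buffer is full, and SendAsync awaits until space is available.
var buffer = new BufferBlock<Message>(
    new DataflowBlockOptions { BoundedCapacity = 1000 });
buffer.LinkTo(worker);

// From the TCP thread:
buffer.Post(message);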
Finally, note that I'm linking to .NET 4.5 documentation because it's easiest, but TPL DataFlow is available for .NET 4.0 via a separate download. Unfortunately they never made a NuGet package out of it.
Trying to figure out whether or not I should use async methods or not such as:
TcpListener.BeginAcceptTcpClient
TcpListener.EndAcceptTcpClient
and
NetworkStream.BeginRead
NetworkStream.EndRead
as opposed to their synchronous TcpListener.AcceptTcpClient and NetworkStream.Read versions. I've been looking at related threads but I'm still a bit unsure about one thing:
Question: The main advantage of using an asynchronous method is that the GUI is not locked up. However, these methods will be called on separate Task threads as it is, so there is no threat of that. Also, TcpListener.AcceptTcpClient blocks the thread until a connection is made, so there are no wasted CPU cycles. Since this is the case, why do so many always recommend using the async versions? It seems like in this case the synchronous versions would be superior?
Also, another disadvantage of using asynchronous methods is the increased complexity and constant casting of objects. For example, having to do this:
private void SomeMethod()
{
    // ...
    listener.BeginAcceptTcpClient(OnAcceptConnection, listener);
}

private void OnAcceptConnection(IAsyncResult asyn)
{
    TcpListener listener = (TcpListener)asyn.AsyncState;
    TcpClient client = listener.EndAcceptTcpClient(asyn);
}
As opposed to this:
TcpClient client = listener.AcceptTcpClient();
Also it seems like the async versions would have much more overhead due to having to create another thread. (Basically, every connection would have a thread and then when reading that thread would also have another thread. Threadception!)
Also, there is the casting of the TcpListener back and forth, and the overhead associated with creating, managing, and closing these additional threads.
Basically, where normally there would just be individual threads for handling individual client connections, now there is that and then an additional thread for each type of operation performed (reading/writing stream data and listening for new connections on the server's end)
Please correct me if I am wrong. I am still new to threading and I'm trying to understand this all. However, in this case it seems like using the normal synchronous methods and just blocking the thread would be the optimal solution?
TcpListener.AcceptTcpClient blocks the thread until a connection is made, so there are no wasted CPU cycles.
But there is also no work getting done. A Thread is a very expensive operating system object, about the most expensive there is. Your program is consuming a megabyte of memory (the thread's stack) that sits unused while the thread blocks on the connection request.
However, these methods will be called on separate Task threads as it is so there is no threat of that
A Task is not a good solution either; it uses a threadpool thread, but the thread will block. The threadpool manager tries to keep the number of running TP threads equal to the number of CPU cores on the machine. That won't work well when a TP thread blocks for a long time; it prevents other useful work from being done by TP threads that are waiting to get their turn.
BeginAcceptTcpClient() uses a so-called I/O completion callback. No system resources are consumed while the socket is listening. As soon as a connection request comes in, the operating system runs an APC (asynchronous procedure call) which grabs a threadpool thread to make the callback. The thread itself is in use for, typically, a few microseconds. Very efficient.
This kind of code will get a lot simpler in the next version of C# with the new async and await keywords. End of the year, maybe.
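For reference, a sketch of what that same accept loop looks like once async/await is available (AcceptTcpClientAsync is the .NET 4.5 wrapper around the Begin/End pair):

// Inside an async method; no thread is blocked while awaiting.
while (true)
{
    TcpClient client = await listener.AcceptTcpClientAsync();
    // hand the client off to a per-connection handler here
}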
If you call AcceptTcpClient() on any thread, that thread is useless until you get a connection.
If you call BeginAcceptTcpClient(), the calling thread can move on immediately; no thread is wasted on waiting.
This is particularly important when using the ThreadPool (or the TPL), since they use a limited number of pool threads.
If you have too many threads waiting on operations, you can run out of threadpool threads, so new work items will have to wait until one of the other threads finishes.
I have a need to create multiple processing threads in a new application. Each thread has the possibility of being "long running". Can someone comment on the viability of the built in .net threadpool or some existing custom threadpool for use in my application?
Requirements :
Works well within a windows service. (queued work can be removed from the queue, currently running threads can be told to halt)
Ability to spin up multiple threads.
Work needs to be started in sequential order, but multiple threads can be processing in parallel.
Hung threads can be detected and killed.
EDIT:
Comments seem to be leading towards manual threading. Unfortunately I am held to version 3.5 of the framework. The thread pool was appealing because it would let me queue up work and have threads created for me when resources were available. Is there a good 3.5-compatible pattern (producer/consumer perhaps) that would give me this aspect of the thread pool without actually using the thread pool?
Your requirements essentially rule out the use of the .NET ThreadPool:
It generally should not be used for long-running threads, due to the danger of exhausting the pool.
It does work well in Windows services, though, and you can spin up multiple threads - limited automatically by the pool's limits.
You cannot guarantee thread start times with the thread pool; it queues work items until it has free threads, and it does not even guarantee they will be started in the sequence you submit them.
There are no easy ways to detect and kill running threads in the ThreadPool.
So essentially, you will want to look outside the ThreadPool; I might recommend that perhaps you might need 'full' System.Threading.Thread instances just due to all of your requirements. As long as you handle concurrency issues (as you must with any threading mechanism), I don't find the Thread class to be all that difficult to manage myself, really.
Simple answer, but the Task class (Fx4) meets most of your requirements.
Cancellation is cooperative, i.e. your Task code has to check for it.
But detecting hung threads is difficult; that is a very demanding requirement anyway.
But I can also read your requirements as asking for a JobQueue, where the 'work' consists of mostly similar jobs. You could roll your own system that consumes that queue and monitors execution on a few threads.
I've done essentially the same thing with .NET 3.5 by creating my own thread manager (a simplified sketch follows this list):
Instantiate worker classes that know how long they've been running.
Create threads that run a worker method and add them to a Queue<Thread>.
A supervisor thread reads threads from the Queue and adds them to a Dictionary<int, Worker> as it launches them until it hits its maximum running threads. Add the thread as a property of the Worker instance.
As each worker finishes it invokes a callback method from the supervisor that passes back its ManagedThreadId.
The supervisor removes the thread from the Dictionary and launches another waiting thread.
Poll the Dictionary of running workers to see if any have timed out, or put timers in the workers that invoke a callback if they take too long.
Signal a long-running worker to quit, or abort its thread.
The supervisor invokes callbacks to your main thread to inform of progress, etc.
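As a rough .NET 3.5-compatible sketch of the producer/consumer core of that design (WorkerCount, the Action queue, and the names are my own simplifications, not the full supervisor described above):

Queue<Action> jobs = new Queue<Action>();
object gate = new object();

// Start a fixed number of workers.
for (int i = 0; i < WorkerCount; i++)
{
    Thread t = new Thread(() =>
    {
        while (true)
        {
            Action job;
            lock (gate)
            {
                while (jobs.Count == 0)
                    Monitor.Wait(gate);   // sleep until Pulse signals new work
                job = jobs.Dequeue();     // FIFO: preserves submission order
            }
            job();                        // run outside the lock
        }
    });
    t.IsBackground = true;
    t.Start();
}

// Producer side:
lock (gate)
{
    jobs.Enqueue(work);
    Monitor.Pulse(gate);                  // wake one waiting worker
}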
So my question is: how do you implement a cancel/interrupt feature for all (I mean ALL) thread workers in your application in the best and most elegant way?
It's not important whether it's an HttpWebRequest, an IO operation, or a calculation. The user should be able to cancel every action/thread at any moment.
Use .NET 4.0 Tasks with CancellationTokens - they are the new universal cancellation system.
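A minimal sketch of the pattern, where DoStep stands in for one small unit of your work:

var cts = new CancellationTokenSource();

Task.Factory.StartNew(() =>
{
    while (!cts.Token.IsCancellationRequested)
    {
        DoStep();   // keep each step short so cancellation stays responsive
    }
}, cts.Token);

// When the user clicks Cancel, from any thread:
cts.Cancel();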
The user should be able to cancel every action/thread at any moment.
Threading is a practice, not a design... and believe me it has been tried as a design, but it failed miserably. The basic problem with simply canceling any action at any moment is that in a multithreaded environment it's just evil! Imagine that you have a section of code guarded by a lock and you have two threads running in parallel:
Thread 1 acquires the lock.
Thread 2 waits until the lock is released so it can acquire it.
Thread 1 is canceled while it's holding the lock and it doesn't release the lock.
DEADLOCK: Thread 2 is waiting for the lock which will never be released.
This is the simplest example and technically we can take care of this situation in the design, i.e. automatically release any locks that the thread has acquired, but instead of locks think of object states, resource utilization, client dependencies, etc. If your thread is modifying a big object and it's canceled in the middle of the modification, then the state of the object may be inconsistent, the resource which you're utilizing might get hung up, the client depending on that thread might crash... there is a slew of things which can happen and there is simply no way to design for them. In this case you make it a practice to manage the threads: you ensure a safe cancellation of your threads.
Others have already mentioned various methods for starting threads that can be canceled, but I just wanted to touch on the principles. Even in the cases where there is a way to cancel your threads, you still have to keep in mind that you're responsible for determining the safest way to cancel your thread.
It's not important whether it's an HttpWebRequest, an IO operation, or a calculation.
I hope you now understand why this is the MOST important thing! Unless you specifically know what your thread is doing, there is no safe way to cancel it automatically.
P.S.
One thing to remember is that if you don't want hanging threads, you can set each thread's IsBackground flag to true, and they will automatically be terminated when your application exits.
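That is just one property on the thread:

Thread worker = new Thread(WorkLoop);   // WorkLoop is your thread method
worker.IsBackground = true;             // won't keep the process alive
worker.Start();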
Your worker threads need a way to check with your main thread to see if they should keep going. One way is to share a static volatile bool that's set by your UI and periodically checked by the worker threads.
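A sketch of that flag (the names are illustrative):

static volatile bool keepRunning = true;

// Worker thread:
while (keepRunning)
{
    DoUnitOfWork();   // hypothetical; keep units small so the flag is checked often
}

// UI thread, when the user cancels:
keepRunning = false;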
My preference is to create your own threads that run instances of a worker class that periodically invoke a callback method provided by your main thread. This callback returns a value that tells the worker to continue, pause, or stop.
Avoid the temptation to use Thread.Abort() to kill worker threads; see Manipulating a thread from a different thread.