I'm creating a server-type application at the moment which will do the usual listening for connections from external clients and, when they connect, handle requests, etc.
At the moment, my implementation creates a pair of threads every time a client connects. One thread simply reads requests from the socket and adds them to a queue, and the second reads the requests from the queue and processes them.
I'm basically looking for opinions on whether or not you think having all of these threads is overkill, and importantly whether this approach is going to cause me problems.
It is important to note that most of the time these threads will be idle - I use wait handles (ManualResetEvent) in both threads. The Reader thread waits until a message is available and if so, reads it and dumps it in a queue for the Process thread. The Process thread waits until the reader signals that a message is in the queue (again, using a wait handle). Unless a particular client is really hammering the server, these threads will be sat waiting. Is this costly?
I've done a bit of testing: I had 1,000 clients connected, continually nagging the server (so 2,000+ threads), and it seemed to cope quite well.
I think your implementation is flawed. This kind of design doesn't scale because creating threads is expensive and there is a limit on how many threads can be created.
That is the reason most implementations of this type use a thread pool. A pool makes it easy to cap the maximum number of threads while still handling new connections, and it reuses threads once their work is finished.
If all you are doing with your thread is putting items in a queue, then use the ThreadPool.QueueUserWorkItem method to use the default .NET thread pool.
You haven't given enough information in your question to say for certain, but perhaps you now only need one other thread, running constantly and clearing down the queue. You can use a wait handle to signal when something has been added.
Just make sure to synchronise access to your queue or things will go horribly wrong.
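As a rough illustration of the suggestion above, here is a minimal sketch of one lock-protected queue drained by a single long-lived thread that sleeps on a wait handle. The names RequestQueue, Enqueue and ConsumeLoop are hypothetical and not from the question, requests are assumed to be plain strings, and the 250 ms wait timeout is an arbitrary choice so that cancellation gets noticed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: many readers enqueue, one consumer thread drains.
class RequestQueue
{
    private readonly Queue<string> _queue = new Queue<string>();
    private readonly object _sync = new object();
    private readonly AutoResetEvent _itemAdded = new AutoResetEvent(false);

    // Called by any reader, for example from a thread-pool work item.
    public void Enqueue(string request)
    {
        lock (_sync) { _queue.Enqueue(request); }
        _itemAdded.Set(); // wake the consumer if it is waiting
    }

    // Runs on the single consumer thread until cancellation is requested.
    public void ConsumeLoop(Action<string> process, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            _itemAdded.WaitOne(250); // idle here instead of spinning

            // Drain everything that is queued before waiting again.
            while (true)
            {
                string request;
                lock (_sync)
                {
                    if (_queue.Count == 0) break;
                    request = _queue.Dequeue();
                }
                process(request); // run the handler outside the lock
            }
        }
    }
}

class Program
{
    static void Main()
    {
        var queue = new RequestQueue();
        var cts = new CancellationTokenSource();
        var consumer = new Thread(() => queue.ConsumeLoop(Console.WriteLine, cts.Token));
        consumer.Start();

        // Simulate readers handing requests over from pool threads.
        for (int i = 0; i < 3; i++)
        {
            int id = i;
            ThreadPool.QueueUserWorkItem(_ => queue.Enqueue("request " + id));
        }

        Thread.Sleep(500);
        cts.Cancel();
        consumer.Join();
    }
}
```

The lock is held only while touching the queue, so a slow handler never blocks the readers.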
I advise using the following pattern. First you need a thread pool, either the built-in one or a custom one. Have one thread that checks whether there is anything available to read; if there is, it picks a reader thread from the pool. The reader thread puts the message into a queue, and a thread from the pool of processing threads then picks it up. This minimizes both the number of threads and the time they spend in a waiting state.
I have an application that handles TCP connections, and when a connection is made, BeginRead is called on the stream to wait for data, and the main thread resumes waiting for new connections.
Under normal circumstances, there will be only a few connections at a time, and so generally the number of worker threads created by BeginRead is not an issue. However, in the theoretical situation that many, many connections exist at the same time, eventually when a new connection is made, the call to BeginRead causes an OutOfMemoryException. I would like to prevent the thread from being created in this situation (or be informed of a better way to wait for data from multiple streams).
What are some decent ways of accomplishing this? All I can think to do is to either
a) only allow a certain number of active connections at a time, or
b) attempt to use something called a MemoryFailPoint. After reading, I think this might be the better option, but how do I know how much memory a call to BeginRead will need to do its thing safely?
Look at this thread here. It can give you many answers for that.
But you can read your current memory usage of your process like this:
// Requires: using System.Diagnostics;
Process currentProcess = Process.GetCurrentProcess();
long memorySize = currentProcess.PrivateMemorySize64; // private memory of this process, in bytes
You should be using the thread pool to handle these operations, rather than creating whole new threads for every operation you need to perform. A thread pool that re-uses threads greatly reduces the effort spent creating and tearing down threads (which isn't cheap). It also works to ensure that threads are only created when doing so is beneficial, and simply lets work requests queue up when adding more threads wouldn't help.
I need some guidance on a project we are developing. When triggered, my program needs to contact 1,000 devices by TCP and exchange about 200 bytes of information. All the clients are wireless on a private network. The majority of the time the program will be sitting idle, but then needs to send these messages as quickly as possible. I have come up with two possible methods:
Method 1
Use thread pooling to establish a number of worker threads and have these threads process their way through the 1,000 conversations. One thread handles one conversation until completion. The number of threads in the thread pool would then be tuned for best use of resources.
Method 2
A number of threads would be used to handle multiple conversations per thread. For example a thread process would open 10 socket connections start the conversation and then use asynchronous methods to wait for responses. As a communication is completed, a new device would be contacted.
Method 2 looks like it would be more effective in that operations wouldn't have to wait while the remote device responds. It would also save on the overhead of starting and stopping all those threads.
Am I headed in the right direction here? What am I missing or not considering?
There is a well-established way to deal with this problem. Simply use async IO. There is no need to maintain any threads at all. Async IO uses no threads while the IO is in progress.
Thanks to await, doing this is quite easy.
The select/poll model is obsolete in .NET.
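To make the async IO suggestion concrete, here is a minimal sketch of contacting many devices with await. Everything specific in it is an assumption for illustration: the DevicePoller, ExchangeAsync and PollAllAsync names, the single write followed by a single read of up to 200 bytes, and the SemaphoreSlim used to cap how many conversations are in flight at once.

```csharp
using System;
using System.Linq;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: one async conversation per device, no dedicated threads.
static class DevicePoller
{
    // Sends the request and returns the number of bytes read back.
    // A single ReadAsync may return fewer bytes than the device sent;
    // a real protocol would loop until the full reply has arrived.
    static async Task<int> ExchangeAsync(string host, int port, byte[] request)
    {
        using (var client = new TcpClient())
        {
            await client.ConnectAsync(host, port);
            NetworkStream stream = client.GetStream();
            await stream.WriteAsync(request, 0, request.Length);

            var buffer = new byte[200];
            return await stream.ReadAsync(buffer, 0, buffer.Length);
        }
    }

    // Starts every conversation, but lets at most maxConcurrent run at once.
    public static async Task<int[]> PollAllAsync(
        string[] hosts, int port, byte[] request, int maxConcurrent)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            Task<int>[] tasks = hosts.Select(async host =>
            {
                await gate.WaitAsync();
                try { return await ExchangeAsync(host, port, request); }
                finally { gate.Release(); }
            }).ToArray();

            // Faults if any single conversation throws; real code would
            // probably catch per device and record the failure instead.
            return await Task.WhenAll(tasks);
        }
    }
}
```

While a connect, write or read is pending, no thread is blocked on it; a pool thread is borrowed only briefly to run the continuation after each await.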
I have a problem with scalability and processing and I want to get the opinion of the stack overflow community.
I basically have XML data coming down a socket and I want to process that data. For each XML line sent processing can include writing to a text file, opening a socket to another server and using various database queries; all of which take time.
At the minute my solution involves the following threads:
Thread 1
Accepts incoming sockets and generates child threads that handle each socket (there will only be a couple of incoming sockets from clients). When an XML line comes through (ReadLine() method on StreamReader) I put this line into a Queue, which is accessible via a static method on a class. This static method contains locking logic to ensure that the program is thread-safe (I could of course use ConcurrentQueue for this instead of manual locking).
Threads 2-5
Constantly take XML lines from the queue and process them one at a time (database queries, file writes etc).
This method seems to be working but I was curious if there is a better way of doing things because this seems very crude. If I take the processing that threads 2-5 do into thread 1 this results in extremely slow performance, which I expected, so I created my worker threads (2-5).
I appreciate I could replace threads 2-5 with a thread pool, but the thread pool would still be reading from the same Queue of XML lines, so I wondered if there is a more efficient way of processing these events instead of using the Queue?
A queue [1] is the right approach. But I would certainly move from manual thread control to the thread pool (so that you don't need to do thread management yourself) and let it manage the number of threads. [2]
But in the end there is only so much processing a single computer (however expensive) can do. At some point one of memory size, CPU-memory bandwidth, storage IO, network IO, … is going to be saturated. At that point using an external queuing system (MSMQ, WebSphere MQ, RabbitMQ, …) with each task being a separate message allows many workers on many computers to process the data ("competing consumers" pattern).
[1] I would move immediately to ConcurrentQueue: getting locking right is hard, and the less of it you have to do yourself the better.
[2] At some point you might find you need more control than the thread pool provides; that is the time to switch to a custom thread pool. But prototype and test: it is quite possible your implementation will actually be worse: see paragraph 2.
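The queue-plus-pool arrangement recommended above might look roughly like the sketch below, with several pool-scheduled workers competing for lines from one thread-safe queue. The XmlLinePipeline name, the Run signature and the processLine callback are placeholders invented for the example. BlockingCollection is used because, by default, it wraps a ConcurrentQueue and adds blocking consumption on top.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: several pool workers competing for lines from one queue.
static class XmlLinePipeline
{
    // processLine stands in for the database, file and socket work.
    // Returns the number of lines that were processed.
    public static int Run(string[] lines, int workerCount, Action<string> processLine)
    {
        // No manual locking: the collection is thread-safe.
        using (var queue = new BlockingCollection<string>())
        {
            int processed = 0;

            Task[] workers = Enumerable.Range(0, workerCount)
                .Select(_ => Task.Run(() =>
                {
                    // Blocks while the queue is empty and ends after CompleteAdding.
                    foreach (string line in queue.GetConsumingEnumerable())
                    {
                        processLine(line);
                        Interlocked.Increment(ref processed);
                    }
                }))
                .ToArray();

            foreach (string line in lines)
                queue.Add(line);

            queue.CompleteAdding(); // tell the workers no more lines are coming
            Task.WaitAll(workers);
            return processed;
        }
    }
}
```

Each worker here occupies a pool thread while it waits for lines, so workerCount should stay small. With several workers the lines are not guaranteed to finish in the order they arrived.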
When you're dealing with socket IO using BeginReceive/EndReceive, the callback is invoked by an IOCP thread.
Once you're done receiving you need to process the data.
Should you do it on the callback's calling thread?
Or should you run the task using ThreadPool.QueueUserWorkItem?
Normally samples do the work on the callback thread, which is a little bit confusing.
If you're dealing with several hundred active connections, running the processing on the IOCP thread ends up with a process with hundreds of threads. Would the ThreadPool help limit the number of concurrent threads?
I don't know that there's a generic 'right answer' for everyone - it will depend on your individual use case.
These are some things to consider if you do consider going down the ThreadPool path.
Is Out of Order / Concurrent message processing something you can support?
If Socket A receives messages 1, 2, and 3 in rapid succession - they may end up being processed concurrently, or out of order.
The .NET thread pool keeps a local work queue per worker thread in addition to its global queue; if one thread runs out of work, it may 'steal' items from another thread's queue. This means that your messages might be executed in an unpredictable order.
Don't forget to consider what out-of-order processing might do to the client - eg if they send three messages which require responses, sending the responses out of order may be a problem.
If you must process each message in order, then you'll probably have to implement your own thread pooling.
Do you need to be concerned about potential Denial of Service?
If one socket suddenly sends a swarm of data, accepting all of that for processing may clog up internal queues and memory. You may have to limit the amount of un-processed data you accept.
IOCP threads are meant for servicing IO operations. If the work executed by your IOCP callback can be long running, or takes locks or uses other facilities that may block execution, you had better hand it off to a worker thread.
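A hand-off of that kind could be sketched as follows. The ConnectionState class, the Receiver.Process placeholder and the 4096-byte buffer are assumptions made for the example; the Socket.BeginReceive, Socket.EndReceive and ThreadPool.QueueUserWorkItem calls are the ones named in the question. Error handling (SocketException, ObjectDisposedException) and message framing are left out.

```csharp
using System;
using System.Net.Sockets;
using System.Threading;

// Hypothetical per-connection state passed through the async calls.
class ConnectionState
{
    public Socket Socket;
    public byte[] Buffer = new byte[4096];
}

static class Receiver
{
    public static void StartReceive(ConnectionState state)
    {
        state.Socket.BeginReceive(state.Buffer, 0, state.Buffer.Length,
                                  SocketFlags.None, OnReceive, state);
    }

    // Invoked on an IO completion thread: keep it short.
    static void OnReceive(IAsyncResult ar)
    {
        var state = (ConnectionState)ar.AsyncState;
        int count = state.Socket.EndReceive(ar);
        if (count == 0)
        {
            state.Socket.Close(); // remote side closed the connection
            return;
        }

        // Copy the bytes out so the receive buffer can be reused at once.
        var message = new byte[count];
        Buffer.BlockCopy(state.Buffer, 0, message, 0, count);

        // Potentially slow or blocking work goes to a worker thread.
        ThreadPool.QueueUserWorkItem(_ => Process(message));

        StartReceive(state); // post the next receive immediately
    }

    // Placeholder for the real message handling.
    static void Process(byte[] message)
    {
        Console.WriteLine("processing " + message.Length + " bytes");
    }
}
```

Because each chunk is queued separately, two chunks from the same socket can be processed concurrently or out of order, which is the ordering concern raised above.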
I have a thread which fills a queue, and another thread which processes this queue. My problem is that the first thread fills the queue so fast that the other thread can't keep up, and my program keeps using more and more RAM. What is the best solution for this problem?
Sorry, I forgot to add something. I can't limit my queue or my producer thread. The producer thread can't wait because it's capturing network packets and I mustn't miss any packet. I have to process these packets faster than the producer thread adds them.
Well, assuming that the order of processing of items in the queue is not important, you can run two (or more) threads processing the queue.
Unless there's some sort of contention between them, that should enable faster processing. This is known as a multi-consumer model.
Another possibility is to have your producer thread monitor the size of the queue and refuse to add entries until it drops below some threshold. Standard C# queues don't provide a way to stop expansion of the capacity (even using a 1.0 growth factor will not inhibit growth).
You could define a maximum queue size (let's say 2000) which when hit causes the queue to only accept more items when it's down to a lower size (let's say 1000).
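A minimal sketch of that high/low threshold idea, assuming a hypothetical WatermarkQueue class and using Monitor.Wait/PulseAll for the signalling (a wait handle would work as well). The producer is paused once the queue reaches the high mark and is only released when the consumer has drained it down to the low mark.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: producers pause at 'high' and resume at 'low'.
class WatermarkQueue<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private readonly object _sync = new object();
    private readonly int _high;
    private readonly int _low;
    private bool _throttled;

    public WatermarkQueue(int high, int low)
    {
        if (low < 0 || low >= high)
            throw new ArgumentException("Expected 0 <= low < high.");
        _high = high;
        _low = low;
    }

    public void Enqueue(T item)
    {
        lock (_sync)
        {
            while (true)
            {
                if (_items.Count >= _high) _throttled = true;
                if (!_throttled) break;
                Monitor.Wait(_sync); // sleep until the consumer catches up
            }
            _items.Enqueue(item);
            Monitor.PulseAll(_sync); // wake a consumer waiting for items
        }
    }

    public T Dequeue()
    {
        lock (_sync)
        {
            while (_items.Count == 0) Monitor.Wait(_sync);
            T item = _items.Dequeue();

            // Release the producers only once the backlog is at the low mark.
            if (_throttled && _items.Count <= _low) _throttled = false;
            Monitor.PulseAll(_sync);
            return item;
        }
    }

    public int Count
    {
        get { lock (_sync) { return _items.Count; } }
    }
}
```

With the numbers from the text this would be created as new WatermarkQueue<T>(2000, 1000). The gap between the two marks avoids waking the producer after every single dequeue.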
I'd recommend using an EventWaitHandle or a ManualResetEvent in order not to busy-wait. http://msdn.microsoft.com/en-us/library/system.threading.manualresetevent.aspx
Unless you are already doing so, use BlockingCollection<T> as your queue and pass some reasonable limit to the boundedCapacity parameter of constructor (which is then reflected in BoundedCapacity property) - your producer will block on Add if this would make the queue too large and resume after consumer has removed some element from the queue.
According to MSDN documentation for BlockingCollection<T>.Add:
If a bounded capacity was specified when this instance of BlockingCollection<T> was initialized, a call to Add may block until space is available to store the provided item.
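The blocking behaviour described above can be seen in a small sketch. The capacity of 2 and the 50 ms timeout are arbitrary values chosen to keep the demonstration short, and BoundedQueueDemo is a made-up name.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class BoundedQueueDemo
{
    public static void Main()
    {
        using (var queue = new BlockingCollection<int>(boundedCapacity: 2))
        {
            queue.Add(1);
            queue.Add(2);

            // The queue is full and nothing is consuming yet, so TryAdd gives up
            // after the timeout and returns false rather than blocking forever.
            bool added = queue.TryAdd(3, TimeSpan.FromMilliseconds(50));
            Console.WriteLine("added while full: " + added);

            Task consumer = Task.Run(() =>
            {
                foreach (int item in queue.GetConsumingEnumerable())
                    Console.WriteLine("processed " + item);
            });

            // Add blocks here until the consumer has freed a slot.
            queue.Add(3);
            queue.CompleteAdding();
            consumer.Wait();
        }
    }
}
```

TryAdd with a timeout is one option when the producer must not stall indefinitely, although the caller then has to decide what to do with the item that was not accepted.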
Another method is to new() X inter-thread comms instances at startup, put them on a queue and never create any more. Thread A pops objects off this pool queue, fills them with data and queues them to thread B. Thread B gets the objects, processes them and then returns them to the pool queue.
This provides flow control - if thread A tries to post too fast, the pool will dry up and A will have to wait on the pool queue until B returns objects. It has the potential to improve performance since there are no mallocs and frees after the initial pool filling - the lock time on a queue push/pop will be less than that of a memory-manager call. There is no need for complex bounded queues - any old producer-consumer queue class will do. The pool can be used for inter-thread comms throughout a full app with many threads/threadPools, so flow-controlling them all. Shutdown problems can be mitigated - if the pool queue is created by the main thread at startup before any forms etc and never freed, it is often possible to avoid explicit background thread shutdowns on app close - a pain that would be nice to just forget about. Object leaks and/or double-releases are easily detected by monitoring the pool level ('detected', not 'fixed' :).
The inevitable downsides - all the inter-thread comms instance memory is permanently allocated even if the app is completely idle. An object popped off the pool will be full of 'garbage' from the previous use of it. If the 'slowest' thread gets an object before releasing one, it is possible for the app to deadlock with the pool empty and all objects queued to the slowest thread. A very heavy burst of loading may cause the app to throttle itself 'early' when a simpler 'new/queue/dispose' mechanism would just allocate more instances and so cope better with the burst of work.
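A rough sketch of the pool-queue scheme, under several assumptions: the PacketBuffer and PooledPipeline names are invented, the 2048-byte buffer size is arbitrary, and an unbounded BlockingCollection stands in for "any old producer-consumer queue class".

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical reusable message object; allocated once at startup.
class PacketBuffer
{
    public readonly byte[] Data = new byte[2048];
    public int Length;
}

class PooledPipeline
{
    private readonly BlockingCollection<PacketBuffer> _pool =
        new BlockingCollection<PacketBuffer>();
    private readonly BlockingCollection<PacketBuffer> _work =
        new BlockingCollection<PacketBuffer>();

    public PooledPipeline(int poolSize)
    {
        // Fill the pool once; no further allocations after this point.
        for (int i = 0; i < poolSize; i++)
            _pool.Add(new PacketBuffer());
    }

    // Thread A: blocks in Take() when the pool has run dry (flow control).
    // Assumes length is no larger than the buffer size.
    public void Produce(byte[] source, int length)
    {
        PacketBuffer buffer = _pool.Take();
        Buffer.BlockCopy(source, 0, buffer.Data, 0, length);
        buffer.Length = length;
        _work.Add(buffer);
    }

    // Thread B: processes one object, then returns it to the pool.
    public int ConsumeOne()
    {
        PacketBuffer buffer = _work.Take();
        int length = buffer.Length; // real processing would happen here
        buffer.Length = 0;          // reset state left over from this use
        _pool.Add(buffer);
        return length;
    }

    // Watching this level is how leaks or double releases would show up.
    public int PoolLevel
    {
        get { return _pool.Count; }
    }
}
```

If the pool level never returns to poolSize when the app is idle, an object has leaked; if it ever exceeds poolSize, one was released twice.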
Rgds,
Martin
The simplest possible solution would be for the producer thread to check whether the queue has reached a certain limit of pending items and, if so, go to sleep before pushing more work.
Other solutions depend on the actual problem you are trying to solve, for example whether the processing is more IO bound or CPU bound. Knowing that may even allow you to design a solution which doesn't need a queue at all. For example, the producer thread could generate, let's say, 10 items and then call a consumer "method" which processes them in parallel, and so on.