I have a thread that fills a queue, and another thread that processes it. My problem is that the first thread fills the queue very quickly and the second thread can't keep up, so my program keeps using more and more RAM. What is the optimum solution to this problem?
Sorry, I forgot to add something: I can't limit my queue or my producer thread. The producer can't wait because it's capturing network packets and I mustn't miss any of them, so I have to process the packets faster than the producer can add them.
Well, assuming that the order of processing of items in the queue is not important, you can run two (or more) threads processing the queue.
Unless there's some sort of contention between them, that should enable faster processing. This is known as a multi-consumer model.
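A minimal multi-consumer sketch (assuming a BlockingCollection<int> as the shared queue and a hypothetical Process method standing in for the per-item work):

using System.Collections.Concurrent;
using System.Threading.Tasks;

class MultiConsumerExample
{
    static void Main()
    {
        var queue = new BlockingCollection<int>();

        // Two consumers drain the same queue concurrently.
        var consumers = new[]
        {
            Task.Run(() => Consume(queue)),
            Task.Run(() => Consume(queue))
        };

        // Producer: enqueue work, then mark the queue complete.
        for (int i = 0; i < 100; i++) queue.Add(i);
        queue.CompleteAdding();

        Task.WaitAll(consumers);
    }

    static void Consume(BlockingCollection<int> queue)
    {
        // Blocks until an item is available; the loop ends once
        // CompleteAdding has been called and the queue is drained.
        foreach (var item in queue.GetConsumingEnumerable())
        {
            Process(item); // hypothetical per-item work
        }
    }

    static void Process(int item) { /* ... */ }
}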
Another possibility is to have your producer thread monitor the size of the queue and refuse to add entries until it drops below some threshold. Standard C# queues don't provide a way to stop expansion of the capacity (even using a 1.0 growth factor will not inhibit growth).
You could define a maximum queue size (let's say 2000) which when hit causes the queue to only accept more items when it's down to a lower size (let's say 1000).
I'd recommend using an EventWaitHandle or a ManualResetEvent in order not to busy-wait. http://msdn.microsoft.com/en-us/library/system.threading.manualresetevent.aspx
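A rough sketch of that high/low watermark idea, using the 2000/1000 limits above and a ManualResetEventSlim so the producer parks instead of busy-waiting (the wrapper class and its names are made up for illustration, and the Count checks are not race-free under heavy contention):

using System.Collections.Concurrent;
using System.Threading;

class ThrottledQueue<T>
{
    private readonly ConcurrentQueue<T> _queue = new ConcurrentQueue<T>();
    private readonly ManualResetEventSlim _canAdd = new ManualResetEventSlim(true); // signaled = producer may add
    private const int HighWatermark = 2000; // stop accepting items here
    private const int LowWatermark = 1000;  // resume accepting items here

    public void Add(T item)
    {
        _canAdd.Wait();                     // blocks (without spinning) while the queue is too full
        _queue.Enqueue(item);
        if (_queue.Count >= HighWatermark) _canAdd.Reset();
    }

    public bool TryTake(out T item)
    {
        bool taken = _queue.TryDequeue(out item);
        if (taken && _queue.Count <= LowWatermark) _canAdd.Set();
        return taken;
    }
}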
Unless you are already doing so, use BlockingCollection<T> as your queue and pass some reasonable limit to the boundedCapacity parameter of the constructor (which is then reflected in the BoundedCapacity property) - your producer will block on Add if adding would make the queue too large, and resume after the consumer has removed an element from the queue.
According to MSDN documentation for BlockingCollection<T>.Add:
If a bounded capacity was specified when this instance of BlockingCollection<T> was initialized, a call to Add may block until space is available to store the provided item.
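For example, a sketch with a limit of 1000 buffered items (the Packet type here is a stand-in for whatever the producer actually captures):

using System.Collections.Concurrent;
using System.Threading.Tasks;

class BoundedQueueExample
{
    class Packet { public byte[] Data = new byte[64]; } // hypothetical payload type

    static void Main()
    {
        // Add blocks once 1000 items are buffered and resumes when the consumer removes one.
        var queue = new BlockingCollection<Packet>(boundedCapacity: 1000);

        var consumer = Task.Run(() =>
        {
            foreach (var packet in queue.GetConsumingEnumerable())
            {
                // process packet
            }
        });

        for (int i = 0; i < 10000; i++)
        {
            queue.Add(new Packet()); // may block until space is available
        }

        queue.CompleteAdding();
        consumer.Wait();
    }
}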
Another method is to new() X inter-thread comms instances at startup, put them on a queue and never create any more. Thread A pops objects off this pool queue, fills them with data and queues them to thread B. Thread B gets the objects, processes them and then returns them to the pool queue.
This provides flow control - if thread A tries to post too fast, the pool will dry up and A will have to wait on the pool queue until B returns objects. It has the potential to improve performance since there are no mallocs and frees after the initial pool filling - the lock time on a queue push/pop will be less than that of a memory-manager call. There is no need for complex bounded queues - any old producer-consumer queue class will do. The pool can be used for inter-thread comms throughout a full app with many threads/threadPools, so flow-controlling them all. Shutdown problems can be mitigated - if the pool queue is created by the main thread at startup before any forms etc. and never freed, it is often possible to avoid explicit background thread shutdowns on app close - a pain that would be nice to just forget about. Object leaks and/or double-releases are easily detected by monitoring the pool level ('detected', not 'fixed' :).
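A minimal sketch of the pool idea, using a blocking queue as the pool and a made-up Buffer class for the comms objects:

using System.Collections.Concurrent;

class Buffer
{
    public byte[] Data = new byte[4096]; // hypothetical payload
    public int Length;
}

class BufferPool
{
    private readonly BlockingCollection<Buffer> _pool = new BlockingCollection<Buffer>();

    public BufferPool(int count)
    {
        // Allocate the fixed set of comms objects up front; no more are ever created.
        for (int i = 0; i < count; i++) _pool.Add(new Buffer());
    }

    // Thread A blocks here when the pool is empty - that block is the flow control.
    public Buffer Get() => _pool.Take();

    // Thread B returns the object after processing so A can reuse it.
    public void Return(Buffer buffer) => _pool.Add(buffer);
}

Thread A would call pool.Get(), fill the object, and push it onto the ordinary work queue; thread B calls pool.Return(buffer) when it has finished with it.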
The inevitable downsides - all the inter-thread comms instance memory is permanently allocated even if the app is completely idle. An object popped off the pool will be full of 'garbage' from the previous use of it. If the 'slowest' thread gets an object before releasing one, it is possible for the app to deadlock with the pool empty and all objects queued to the slowest thread. A very heavy burst of loading may cause the app to throttle itself 'early' when a simpler 'new/queue/dispose' mechanism would just allocate more instances and so cope better with the burst of work.
Rgds,
Martin
The simplest possible solution would be for the producer thread to check whether the queue has reached a certain limit of pending items and, if so, go to sleep before pushing more work.
Other solutions depend on the actual problem you are trying to solve - is the processing more IO-bound or CPU-bound, etc.? That may even let you design a solution that doesn't need a queue at all. For example: the producer thread can generate, let's say, 10 items and call a consumer "method" which processes them in parallel, and so on.
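A rough sketch of that batching variant, assuming CPU-bound work and hypothetical GenerateItem/ProcessItem helpers:

using System.Collections.Generic;
using System.Threading.Tasks;

class BatchExample
{
    class WorkItem { }                                 // hypothetical item type
    static WorkItem GenerateItem() => new WorkItem();  // stands in for the real producer
    static void ProcessItem(WorkItem item) { /* CPU-bound work */ }

    static void Main()
    {
        for (int batchNumber = 0; batchNumber < 100; batchNumber++)
        {
            // Produce a small batch...
            var batch = new List<WorkItem>();
            for (int i = 0; i < 10; i++) batch.Add(GenerateItem());

            // ...then process it in parallel before producing the next one,
            // so nothing unbounded ever builds up.
            Parallel.ForEach(batch, ProcessItem);
        }
    }
}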
I'm working on a project with the following workflow:
A background service consumes messages from a RabbitMQ queue
The background service uses a background task queue (like this and here) to process tasks in parallel
Each task executes queries to retrieve some data and caches it in a collection
If the collection size exceeds 1000 objects, I would like to read the collection and then clear it. Since the tasks run in parallel, I don't want another thread to add data to the collection until it has been cleared.
There are BlockingCollection and ConcurrentDictionary (thread-safe collections), but I don't know which mechanism to use.
What's the best way to achieve this?
The collection that seems more suitable for your case is the Channel<T>. This is an asynchronous version of the BlockingCollection<T>, and internally it's based on the same storage (the ConcurrentQueue<T> collection). The similarities are:
They both can be configured to be bounded or unbounded.
A consumer can take a message, even if none is currently available. In this case the Take/TakeAsync call will block either synchronously or asynchronously until a message can be consumed, or the collection completes, whatever comes first.
A producer can push a message, even if the collection is currently full. In this case the Add/WriteAsync call will block either synchronously or asynchronously until there is space available for the message, or the collection completes, whatever comes first.
A consumer can enumerate the collection in a consuming fashion, with a foreach/await foreach loop. Each message received in the loop is consumed by this loop, and will never be available to other consuming loops that might be active by other consumers in parallel.
Some features of the Channel<T> that the BlockingCollection<T> lacks:
A Channel<T> exposes two facades, a Writer and a Reader, that allow a better separation between the roles of the producer and the consumer. In practice this can be more of an annoyance than a useful feature IMHO, but nonetheless it's part of the experience of working with a channel.
A ChannelWriter<T> can be optionally completed with an error. This error is propagated to the consumers of the channel.
A ChannelReader<T> has a Completion property of type Task.
A bounded Channel<T> can be configured to be lossy, so that it drops old buffered messages automatically in order to make space for new incoming messages.
Some features of the BlockingCollection<T> that the Channel<T> lacks:
There is no direct support for timeout when writing/reading messages. This can be achieved indirectly (but precariously, see below) with timer-based CancellationTokenSources.
The contents of a channel cannot be enumerated in a non-consuming fashion.
Some auxiliary features like the BlockingCollection<T>.TakeFromAny method are not available.
A channel cannot be backed by other internal collections, other than the ConcurrentQueue<T>. So it can't have, for example, the behavior of a stack instead of a queue.
Caveat:
There is a nasty memory leak issue that is triggered when a channel is idle (empty with an idle producer, or full with an idle consumer), and the consumer or the producer attempts continuously to read/write messages with timer-based CancellationTokenSources. Each such canceled operation leaks about 800 bytes. The leak is resolved automatically when the first read/write operation completes successfully. This issue is known for more than two years, and Microsoft has not decided yet what to do with it.
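A minimal bounded-channel sketch, assuming a capacity of 1000 and plain strings as the message type:

using System.Threading.Channels;
using System.Threading.Tasks;

class ChannelExample
{
    static async Task Main()
    {
        // Bounded channel: WriteAsync waits once 1000 items are buffered.
        var channel = Channel.CreateBounded<string>(new BoundedChannelOptions(1000)
        {
            FullMode = BoundedChannelFullMode.Wait
        });

        var consumer = Task.Run(async () =>
        {
            // Consuming enumeration: ends when the writer is completed.
            await foreach (var message in channel.Reader.ReadAllAsync())
            {
                // process message
            }
        });

        for (int i = 0; i < 10000; i++)
        {
            await channel.Writer.WriteAsync($"message {i}");
        }

        channel.Writer.Complete();
        await consumer;
    }
}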
Check out ConcurrentQueue<T>. It appears to be suitable for the tasks you have mentioned in your question. Documentation here - https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentqueue-1?view=net-6.0
There are other concurrent collection types as well - https://learn.microsoft.com/en-us/dotnet/standard/collections/thread-safe/
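A minimal sketch of the ConcurrentQueue<T> API (note it has no blocking Take, only the Try* pattern):

using System.Collections.Concurrent;

var queue = new ConcurrentQueue<string>();

// Any number of threads can enqueue safely.
queue.Enqueue("item");

// TryDequeue returns false instead of blocking when the queue is empty.
if (queue.TryDequeue(out var item))
{
    // process item
}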
I have an application that handles TCP connections, and when a connection is made, BeginRead is called on the stream to wait for data, and the main thread resumes waiting for new connections.
Under normal circumstances, there will be only a few connections at a time, and so generally the number of worker threads created by BeginRead is not an issue. However, in the theoretical situation that many, many connections exist at the same time, eventually when a new connection is made, the call to BeginRead causes an OutOfMemoryException. I would like to prevent the thread from being created in this situation (or be informed of a better way to wait for data from multiple streams).
What are some decent ways of accomplishing this? All I can think to do is to either
a) only allow a certain number of active connections at a time, or
b) attempt to use something called a MemoryFailPoint. After reading, I think this might be the better option, but how do I know how much memory a call to BeginRead will need to do its thing safely?
Look at this thread here. It can give you many answers for that.
But you can read your current memory usage of your process like this:
using System.Diagnostics;

Process currentProcess = Process.GetCurrentProcess();
long memorySize = currentProcess.PrivateMemorySize64; // private memory allocated to this process, in bytes
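If you go the MemoryFailPoint route mentioned in the question, a rough sketch looks like the following; the 16 MB figure is an assumption for illustration, not a measured requirement for BeginRead:

using System;
using System.IO;
using System.Runtime;

static class GuardedRead
{
    public static bool TryBeginRead(Stream stream, byte[] buffer, AsyncCallback callback, object state)
    {
        try
        {
            // Checks whether roughly 16 MB could be allocated without an OutOfMemoryException.
            using (new MemoryFailPoint(sizeInMegabytes: 16))
            {
                stream.BeginRead(buffer, 0, buffer.Length, callback, state);
                return true;
            }
        }
        catch (InsufficientMemoryException)
        {
            // Not enough headroom - reject or defer the new connection instead of crashing.
            return false;
        }
    }
}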
You should be using the thread pool to handle these operations, rather than creating whole new threads for every operation you need to perform. Not only will a thread pool that re-uses threads greatly reduce the effort spent creating and tearing down threads (which isn't cheap), but the thread pool will ensure that threads are only created when it is beneficial to do so, and will simply let work requests queue up when adding more threads wouldn't help.
I've run into a problem while writing an async multi-server network app in c#. I have many jobs being taken care of by the thread pool and these include the writes to the network sockets. This ended up allowing for the case where more than one thread could write to the socket at the same time and discombobulate my outgoing messages. My idea for getting around this was to implement a queue system where whenever data got added to the queue, the socket would write it.
My problem is, I can't quite wrap my head around the architecture of something of this nature. I imagine having a queue object that fires an event whenever data gets added to the queue. The event then writes the data being held in the queue, but that won't work because if two threads come by and add to the queue simultaneously, even if the queue is made to be thread safe, events will still be fired for both and I'll run into the same problem. So then maybe some way to hold off an event if another is in progress, but then how do I continue that event once the first finishes without simply blocking the thread on some mutex or something? This wouldn't be so hard if I weren't trying to stay strict with my "block nothing" architecture, but this particular application requires that I allow the thread pool threads to keep doing their thing.
Any ideas?
While similar to Porges' answer, it differs a bit in implementation.
First, I usually don't queue the bytes to send, but the objects themselves, and serialize them in the sending thread - but I guess that's a matter of taste.
But the bigger difference is in the use of ConcurrentQueues (in addition to the BlockingCollection).
So I'd end up with code similar to
BlockingCollection<Packet> sendQueue = new BlockingCollection<Packet>(new ConcurrentQueue<Packet>());

while (true)
{
    var packet = sendQueue.Take(); // this blocks if there are no items in the queue
    SendPacket(packet);            // send your packet here
}
The key takeaway here is that you have one thread which loops this code, and all other threads can add to the queue in a thread-safe way (both BlockingCollection and ConcurrentQueue are thread-safe).
Have a look at Processing a queue of items asynchronously in C#, where I answered a similar question.
Sounds like you need one thread writing to the socket synchronously and a bunch of threads writing to a queue for that thread to process.
You can use a blocking collection (BlockingCollection<T>) to do the hard work:
// somewhere there is a queue:
BlockingCollection<byte[]> queue = new BlockingCollection<byte[]>();
// in socket-writing thread, read from the queue and send the messages:
foreach (byte[] message in queue.GetConsumingEnumerable())
{
    // just an example... obviously you'd need error handling and stuff here
    socket.Send(message);
}
// in the other threads, just enqueue messages to be sent:
queue.Add(someMessage);
The BlockingCollection will handle all synchronization. You can also enforce a maximum queue length and other fun things.
I don't know C#, but what I would do is have the event trigger the socket manager to start pulling from the queue and write things out one at a time. If it is already going the trigger won't do anything, and once there is nothing in the queue, it stops.
This solves the problem of two threads writing to the queue simultaneously because the second event would be a no-op.
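In C#, that "start only if not already running" trigger can be sketched with an interlocked flag (the class and method names here are made up):

using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class SocketSender
{
    private readonly ConcurrentQueue<byte[]> _queue = new ConcurrentQueue<byte[]>();
    private int _sending; // 0 = idle, 1 = a drain loop is running

    public void Enqueue(byte[] message)
    {
        _queue.Enqueue(message);
        // Only the first caller flips the flag and starts the drain; later callers no-op.
        if (Interlocked.CompareExchange(ref _sending, 1, 0) == 0)
        {
            Task.Run(Drain);
        }
    }

    private void Drain()
    {
        while (_queue.TryDequeue(out var message))
        {
            Send(message); // one writer at a time, so messages can't interleave
        }
        Volatile.Write(ref _sending, 0);
        // Note: a message enqueued exactly here could sit until the next Enqueue;
        // production code would re-check the queue after resetting the flag.
    }

    private void Send(byte[] message) { /* write to the socket */ }
}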
You could have a thread-safe queue that all your worker threads write their results to. Then have another thread that polls the queue and sends results when it sees them waiting.
I am using the ThreadPool to queue thousands of work items:
while (reading in data for processing)
{
    args = some data that has been read;
    ThreadPool.QueueUserWorkItem(new WaitCallback(threadRunner), args);
}
This is working very well, however, as the main thread queues the requests faster than they are processed memory is slowly eaten up.
I would like to do something akin to the following to throttle the queueing as the queue grows
Thread.Sleep(numberOfItemsCurrentlyQueued);
This would result in longer waits as the queue grows.
Is there any way to discover how many items are in the queue?
A more manageable abstraction for Producer/Consumer queue is BlockingCollection<T>. The example code there shows how to use Tasks to seed and drain the queue. The queue count is readily available via the Count property.
If you can, avoid using Sleep to delay production of more items. Have the producer wait on an Event or similar when queue gets too large, and have consumer(s) signal the Event when the queue backlog reaches a threshold where you are comfortable allowing more items to be produced. Always try to make things event-driven - Sleep is a bit of a guess.
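One way to make that event-driven is a sketch like the following, using a SemaphoreSlim in place of a raw event and an assumed limit of 1000 outstanding items:

using System.Threading;

// Allows at most 1000 items to be queued but not yet finished.
var slots = new SemaphoreSlim(1000);

// Producer side: blocks here once 1000 items are outstanding.
slots.Wait();
ThreadPool.QueueUserWorkItem(_ =>
{
    try
    {
        // process the work item
    }
    finally
    {
        slots.Release(); // the consumer signals that a slot is free again
    }
});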
I don't think there is a built-in way, but you can introduce a [static?] counter that would increase/decrease; for that you would have to create your own method that would wrap ThreadPool.QueueUserWorkItem() and take care of the counter.
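A sketch of that wrapper (the names are made up):

using System.Threading;

static class CountedPool
{
    private static int _pending;

    // How many queued work items have not finished yet.
    public static int Pending => Volatile.Read(ref _pending);

    public static void Queue(WaitCallback work, object state)
    {
        Interlocked.Increment(ref _pending);
        ThreadPool.QueueUserWorkItem(s =>
        {
            try { work(s); }
            finally { Interlocked.Decrement(ref _pending); }
        }, state);
    }
}

The producer can then throttle itself based on CountedPool.Pending, or better, wait on a signal as suggested in the other answer.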
By the way, just in case you are running .NET 4.0, you should use TaskFactory.StartNew instead of ThreadPool.QueueUserWorkItem() - it's said to have better memory/thread management.
I'm creating a server-type application at the moment which will do the usual listening for connections from external clients and, when they connect, handle requests, etc.
At the moment, my implementation creates a pair of threads every time a client connects. One thread simply reads requests from the socket and adds them to a queue, and the second reads the requests from the queue and processes them.
I'm basically looking for opinions on whether or not you think having all of these threads is overkill, and importantly whether this approach is going to cause me problems.
It is important to note that most of the time these threads will be idle - I use wait handles (ManualResetEvent) in both threads. The Reader thread waits until a message is available and if so, reads it and dumps it in a queue for the Process thread. The Process thread waits until the reader signals that a message is in the queue (again, using a wait handle). Unless a particular client is really hammering the server, these threads will be sat waiting. Is this costly?
I've done a bit of testing - had 1,000 clients connected, continually nagging the server (so, 2,000+ threads) - and it seemed to cope quite well.
I think your implementation is flawed. This kind of design doesn't scale because creating threads is expensive and there is a limit on how many threads can be created.
That is the reason that most implementations of this type use a thread pool. That makes it easy to put a cap on the maximum amount of threads while easily managing new connections and reusing the threads when the work is finished.
If all you are doing with your thread is putting items in a queue, then use the
ThreadPool.QueueUserWorkItem method to use the default .NET thread pool.
You haven't given enough information in your question to say for definite, but perhaps you now only need one other thread, constantly running and clearing down the queue; you can use a wait handle to signal when something has been added.
Just make sure to synchronise access to your queue or things will go horribly wrong.
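A sketch of that single-consumer arrangement, assuming a lock-protected Queue<T> and an AutoResetEvent as the signal (the class and its names are made up):

using System;
using System.Collections.Generic;
using System.Threading;

class SignalledQueue<T>
{
    private readonly Queue<T> _queue = new Queue<T>();
    private readonly AutoResetEvent _signal = new AutoResetEvent(false);

    public void Add(T item)
    {
        lock (_queue) _queue.Enqueue(item); // synchronised access, as noted above
        _signal.Set();                      // wake the consumer
    }

    public void ConsumeForever(Action<T> process)
    {
        while (true)
        {
            _signal.WaitOne();              // sleep until something is added
            while (true)
            {
                T item;
                lock (_queue)
                {
                    if (_queue.Count == 0) break;
                    item = _queue.Dequeue();
                }
                process(item);
            }
        }
    }
}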
I advise using the following pattern. First you need a thread pool - built-in or custom. Have a thread that checks whether there is something available to read and, if so, hands it to a reader thread. The reader thread puts the data into a queue, and a thread from the pool of processing threads then picks it up. This will minimize the number of threads and the time spent in a waiting state.