Basically I have multiple threads that add data into a queue via SQLite. I have one other thread that pulls items and processes them one at a time (processing several at once takes too many resources). The processing thread does this:
pull data from DB
foreach { process }
if count == 0 { thread.suspend() } (woken by thread.resume())
repeat
My worker thread does:
Validates data
Inserts into DB
call Queue.Poke(QueueName)
When I poke it, if the thread is suspended I .resume() it.
What I am worried about: the processing thread sees count == 0, my worker then inserts and pokes, and only afterwards does the processing thread fall through the if and suspend itself. It will never realize there is something new in the DB.
How should I write this so that I won't have a race condition?
Processing thread:
event.Reset
pull data from DB
foreach { process }
if count == 0 then event.Wait
repeat
And the other thread:
Validates data
Inserts into DB
event.Set()
You'll have extra wake-ups (wake on an empty queue, find nothing to process, go back to sleep), but you won't have missed inserts.
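For reference, a minimal C# sketch of that pattern, assuming an AutoResetEvent for the signal; PullBatch and Process are hypothetical placeholders for the SQLite read and the per-item work:
using System.Collections.Generic;
using System.Threading;
class QueueProcessor
{
    // Set() latches the signal; because the loop Resets before pulling,
    // a Set() that arrives any time after the pull stays latched and
    // WaitOne() returns immediately -- no poke can be lost.
    private readonly AutoResetEvent _signal = new AutoResetEvent(false);
    // Called by worker threads right after their INSERT (the "poke").
    public void Poke()
    {
        _signal.Set();
    }
    public void ProcessLoop()
    {
        while (true)
        {
            _signal.Reset();
            List<object> batch = PullBatch();   // hypothetical: SELECT pending rows
            foreach (object item in batch)
                Process(item);                  // hypothetical per-item work
            if (batch.Count == 0)
                _signal.WaitOne();              // sleep until the next Poke()
        }
    }
    private List<object> PullBatch() { return new List<object>(); } // placeholder
    private void Process(object item) { }                           // placeholder
}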
I think this may be the structure you need.
private readonly Queue<object> _Queue = new Queue<object>();
private readonly object _Lock = new object();
void FillQueue()
{
while (true)
{
// Placeholder for your actual DB read; Found indicates whether a row came back.
var dbData = new { Found = true, Data = new object() };
if (dbData.Found)
{
lock (_Lock)
{
_Queue.Enqueue(dbData.Data);
}
}
// If you have multiple threads filling the queue you
// probably want to throttle them a little so the thread
// processing the queue won't be starved of the lock.
// If 1 ms is too long, consider Thread.Sleep(TimeSpan.FromTicks(1000)),
// though sub-millisecond sleeps effectively round down to a yield.
Thread.Sleep(1);
}
}
void ProcessQueue()
{
object data = null;
while (true)
{
lock (_Lock)
{
data = _Queue.Count > 0 ? _Queue.Dequeue() : null;
}
if (data == null)
{
Thread.Sleep(1);
}
else
{
// Process
}
}
}
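For completeness, a sketch of how the two loops might be started, assuming the fields and methods above live in a class (QueueDemo is a hypothetical name):
var demo = new QueueDemo();
new Thread(demo.FillQueue) { IsBackground = true }.Start();
new Thread(demo.ProcessQueue) { IsBackground = true }.Start();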
Suppose we have many threads arriving at random times to access the same resource in parallel. To access the resource, a thread needs to acquire a lock. If we could pack N incoming requests into one, resource usage would be N times more efficient. We also need to answer each individual request as fast as possible. What is the best way/pattern to do that in C#?
Currently I have something like this:
//captured by the callback registered below (declared before this snippet)
bool done = false;
string error = null;
//batches lock
var ilock = ModifyBatch.GetTableDeleteBatchLock(table_info.Name);
lock (ilock)
{
// put the request into requests batch
if (!ModifyBatch._delete_batch.ContainsKey(table_info.Name))
{
ModifyBatch._delete_batch[table_info.Name] = new DeleteData() { Callbacks = new List<Action<string>>(), ids = ids };
}
else
{
ModifyBatch._delete_batch[table_info.Name].ids.UnionWith(ids);
}
//this callback will get called once the job is done by a thread that will acquire resource lock
ModifyBatch._delete_batch[table_info.Name].Callbacks.Add(f =>
{
done = true;
error = f;
});
}
bool lockAcquired = false;
int maxWaitMs = 60000;
DeleteData _delete_data = null;
//resource lock
var _write_lock = GetTableWriteLock(typeof(T).Name);
try
{
DateTime start = DateTime.Now;
while (!done)
{
lockAcquired = Monitor.TryEnter(_write_lock, 100);
if (lockAcquired)
{
if (done) //some other thread did our job
{
Monitor.Exit(_write_lock);
lockAcquired = false;
break;
}
else
{
break;
}
}
Thread.Sleep(100);
if ((DateTime.Now - start).TotalMilliseconds > maxWaitMs)
{
throw new Exception("Waited too long to acquire write lock?");
}
}
if (done) //some other thread did our job
{
if (!string.IsNullOrEmpty(error))
{
throw new Exception(error);
}
else
{
return;
}
}
//not done, but have write lock for the table
lock (ilock)
{
// atomically take ownership of the accumulated batch
ModifyBatch._delete_batch.TryRemove(table_info.Name, out _delete_data);
}
if (_delete_data.ids.Any())
{
//doing the work with resource
}
foreach (var cb in _delete_data.Callbacks)
{
cb(null);
}
}
catch (Exception ex)
{
if (_delete_data != null)
{
foreach (var cb in _delete_data.Callbacks)
{
cb(ex.Message);
}
}
throw;
}
finally
{
if (lockAcquired)
{
Monitor.Exit(_write_lock);
}
}
If it is OK to process the task outside the scope of the current request, i.e. to queue it for later, then you can use a sequence like this [1]:
1. Implement a resource lock (monitor) and a List of tasks.
2. For each request:
3. Lock the List, add the current task to it, remember the number of tasks in the List, unlock the List.
4. Try to acquire the resource lock. If unsuccessful:
- If the number of tasks in the List < threshold X, return (whoever holds the lock will run our task).
- Else acquire the lock (this will block).
5. Lock the List, move its contents to a temp list, unlock the List.
6. If the temp list is not empty, execute the tasks in it and repeat from step 5.
7. Release the lock.
The first request will go through the whole sequence. Subsequent requests, if the first is still executing, will short-circuit at step 4.
Tune for the optimal threshold X (or change it to a time-based threshold).
[1] If you need to wait for the task within the scope of the request, then you need to extend the process slightly:
Add two fields to the Task class: a completion flag and an exception.
At step 4, before returning, wait (Monitor.Wait) until the task's completion flag becomes true; if its exception is not null, throw it.
At step 6, for each task, set the completion flag and optionally the exception, then notify the waiters (Monitor.PulseAll).
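A minimal sketch of the basic sequence (without the waiting extension), with Action standing in for the task type and the threshold value picked arbitrarily:
using System;
using System.Collections.Generic;
using System.Threading;
class BatchingGate
{
    private readonly object _resourceLock = new object();        // step 1: the monitor
    private readonly object _listLock = new object();
    private readonly List<Action> _pending = new List<Action>(); // step 1: the List of tasks
    private const int ThresholdX = 10;                           // tune for your workload
    public void HandleRequest(Action task)                       // step 2: run for each request
    {
        int queued;
        lock (_listLock)                                         // step 3
        {
            _pending.Add(task);
            queued = _pending.Count;
        }
        if (!Monitor.TryEnter(_resourceLock))                    // step 4
        {
            if (queued < ThresholdX)
                return;                   // whoever holds the lock will run our task
            Monitor.Enter(_resourceLock); // backlog too large: block for the lock
        }
        try
        {
            while (true)
            {
                List<Action> batch;
                lock (_listLock)                                 // step 5
                {
                    batch = new List<Action>(_pending);
                    _pending.Clear();
                }
                if (batch.Count == 0)
                    break;
                foreach (Action t in batch)                      // step 6
                    t();
            }
        }
        finally
        {
            Monitor.Exit(_resourceLock);                         // step 7
        }
    }
}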
I need advice regarding variable lifetime inside an Action queue in a multi-threaded ASP.NET Web API environment.
One of my APIs receives requests that update the database and send out emails. However, sending the emails takes too long to process, so I decided to push that portion into a producer/consumer queue, using the example below.
using System;
using System.Threading;
using System.Collections.Generic;
public class PCQueue : IDisposable
{
private static readonly PCQueue instance = new PCQueue(50);
readonly object _locker = new object();
Thread[] _workers;
Queue<Action> _itemQ = new Queue<Action>();
public PCQueue (int workerCount)
{
_workers = new Thread [workerCount];
// Create and start a separate thread for each worker
for (int i = 0; i < workerCount; i++)
(_workers [i] = new Thread (Consume)).Start();
}
public static PCQueue Instance
{
get
{
return instance;
}
}
public void Dispose()
{
// Enqueue one null item per worker to make each exit.
foreach (Thread worker in _workers) EnqueueItem (null);
}
public void EnqueueItem (Action item)
{
lock (_locker)
{
_itemQ.Enqueue (item); // We must pulse because we're
Monitor.Pulse (_locker); // changing a blocking condition.
}
}
void Consume()
{
while (true) // Keep consuming until
{ // told otherwise.
Action item;
lock (_locker)
{
while (_itemQ.Count == 0) Monitor.Wait (_locker);
item = _itemQ.Dequeue();
}
if (item == null) return; // This signals our exit.
item(); // Execute item.
}
}
}
But the code below is where I am not sure if the garbage collector would clean up the values of email_address when UpdateTask finishes.
[HttpPost]
public async Task<IHttpActionResult> UpdateTask()
{
List<Email> email_address = new List<Email>(); // Assuming there are a few records here
// Update something here
PCQueue.Instance.EnqueueItem(() =>
{
SendEmail(email_address);
});
return Ok();
}
Since List is a reference type, it is only garbage collected once nothing holds a reference to it. The lambda you enqueue captures email_address, so the resulting closure keeps the list alive at least until SendEmail has finished (assuming you don't pass the list anywhere else or store it in a field/property somewhere).
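To make the capture concrete, here is a minimal illustration (SendEmail and Email are from the question; the comment describes what the compiler does):
private Action BuildEmailJob()
{
    List<Email> email_address = new List<Email>();
    // The compiler hoists email_address into a heap-allocated closure object;
    // the returned delegate references that object, so the list stays reachable
    // (and is not collected) until the delegate itself becomes unreachable.
    return () => SendEmail(email_address);
}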
Some more general pointers:
Beware of creating race conditions by accessing the list from more than one thread. Also, you might want to consider using ConcurrentQueue instead of Queue.
Finally, if this is part of ASP.NET app code, I would make damn sure to either finish the queue before responding to the request or offload the long-running work to a separate service (the job-server concept). I am not versed in the risks of running threads past the request lifetime, but I would imagine ASP.NET can tear them down, losing enqueued work before it completes.
public class ThreadDemo
{
Semaphore sem = new Semaphore(0, 1);//Semaphore with maxCount of 1
public ThreadDemo()
{
Thread worker = new Thread(WorkerThread);
worker.Start();
}
public void NotifyNewData()
{
//New data added to queue
//WorkerThread could be in one of two states
//1) still working or
//2) waiting for new data
if (sem.WaitOne(0) == false)//If worker thread is waiting for new data
sem.Release();//tell worker thread to process new data
}
private void WorkerThread()
{
while (true)
{
while(/*data in queue*/)
{
//process data in some queue
}
sem.WaitOne();//Worker thread processed all data, wait for more
}
}
}
See the above example.
I have a worker thread processing data from a queue, and I want it to start processing new data immediately when notified by another thread. I don't want to poll for new data.
Will the above solution work? Or could this code potentially leave the worker thread blocked even when there is data that could be processed, say if both threads call sem.WaitOne() at the same time?
What I am trying to achieve is a producer/consumer method. There can be many producers but only one consumer. There cannot be a dedicated consumer because of scalability, so the idea is that a producer starts the consuming process when there is data to be consumed and no consumer is currently active.
1. Many threads can be producing messages. (Asynchronous)
2. Only one thread can be consuming messages. (Synchronous)
3. We should only have a consumer running while there is data to be consumed.
4. A continuous consumer that waits for data would not be efficient once we add many of these classes.
In my example I have a set of methods that send data. Multiple threads can write data with Write(), but only one of those threads will loop and send data in SendNewData(). The reason only one loop may send data is that ordering must be preserved, and with an AsyncWrite() outside our control we can only guarantee order by running one AsyncWrite() at a time.
The problem: when a thread calls Write() (produces), it queues the data and checks Interlocked.CompareExchange to see whether there is already a consumer. If it sees that another thread is already in the consuming loop, it assumes that consumer will send the data. That is a problem if the looping consumer is at "Race Point A", because it has already checked that there are no more messages to send and is about to shut the consuming process down.
Is there a way to prevent this race condition without locking a large part of the code? The real scenario has many queues and is a bit more complex than this.
In the real code, List<INetworkSerializable> is actually a byte[] BufferPool; I used List in the example to make this block easier to read.
With thousands of these classes active at once, I cannot afford to have SendNewData looping continuously on a dedicated thread. The looping thread should only be active while there is data to send.
// Fields assumed from the surrounding class:
// private readonly ConcurrentQueue<INetworkSerializable> Queue = new ConcurrentQueue<INetworkSerializable>();
// private int RunningWrites;
public void Write(INetworkSerializable messageToSend)
{
Queue.Enqueue(messageToSend);
// Check if there are any current consumers. If not then we should instigate the consuming.
if (Interlocked.CompareExchange(ref RunningWrites, 1, 0) == 0)
{ //We are now the thread that consumes and sends data
SendNewData();
}
}
//Only one thread should be looping here to keep consuming and sending data synchronously.
private void SendNewData()
{
INetworkSerializable dataToSend;
List<INetworkSerializable> dataToSendList = new List<INetworkSerializable>();
while (true)
{
if (!Queue.TryDequeue(out dataToSend))
{
//Race Point A
if (dataToSendList.Count == 0)
{
//All data is sent, return so that another thread can take responsibility.
Interlocked.Decrement(ref RunningWrites);
return;
}
//We have data in the list to send but nothing more to consume so lets send the data that we do have.
break;
}
dataToSendList.Add(dataToSend);
}
//Async callback is WriteAsyncCallback()
WriteAsync(dataToSendList);
}
//Callback after WriteAsync() has sent the data.
private void WriteAsyncCallback()
{
//Data was written to sockets, now lets loop back for more data
SendNewData();
}
It sounds like you would be better off with the producer/consumer pattern, which is easily implemented with a BlockingCollection:
var toSend = new BlockingCollection<something>();
// producers
toSend.Add(something);
// when all producers are done
toSend.CompleteAdding();
// consumer -- this won't end until CompleteAdding is called
foreach(var item in toSend.GetConsumingEnumerable())
Send(item);
To address the comment of knowing when to call CompleteAdding, I would launch the 1000s of producers as tasks, wait for all those tasks to complete (Task.WaitAll), and then call CompleteAdding. There are good overloads taking in CancellationTokens that would give you better control, if needed.
Also, TPL is pretty good about scheduling off blocked threads.
More complete code:
var toSend = new BlockingCollection<int>();
Parallel.Invoke(() => Produce(toSend), () => Consume(toSend));
...
private static void Consume(BlockingCollection<int> toSend)
{
foreach (var value in toSend.GetConsumingEnumerable())
{
Console.WriteLine("Sending {0}", value);
}
}
private static void Produce(BlockingCollection<int> toSend)
{
Action<int> generateToSend = toSend.Add;
var producers = Enumerable.Range(0, 1000)
.Select(n => new Task(value => generateToSend((int) value), n))
.ToArray();
foreach(var p in producers)
{
p.Start();
}
Task.WaitAll(producers);
toSend.CompleteAdding();
}
Check this variant; there are descriptive comments in the code.
Note that WriteAsyncCallback still calls SendNewData: after an asynchronous write completes, the consumer loops back for anything that arrived in the meantime, and it releases the consumer role only when both the queue and the pending-message counter are drained.
private int _pendingMessages;
private int _consuming;
public void Write(INetworkSerializable messageToSend)
{
Interlocked.Increment(ref _pendingMessages);
Queue.Enqueue(messageToSend);
// Check if there is anyone consuming messages
// if not, we will have to become a consumer and process our own message,
// and any other further messages until we have cleaned the queue
if (Interlocked.CompareExchange(ref _consuming, 1, 0) == 0)
{
// We are now the thread that consumes and sends data
SendNewData();
}
}
// Only one thread should be looping here to keep consuming and sending data synchronously.
private void SendNewData()
{
INetworkSerializable dataToSend;
var dataToSendList = new List<INetworkSerializable>();
int messagesLeft = 1; // assume at least one pending message on entry
do
{
if (!Queue.TryDequeue(out dataToSend))
{
// there is one possibility that we get here while _pendingMessages != 0:
// some other thread has just increased _pendingMessages from 0 to 1 but hasn't put its message into the queue yet.
if (dataToSendList.Count == 0)
{
if (_pendingMessages == 0)
{
_consuming = 0;
// and if we have no data this means that we are safe to exit from the current thread.
return;
}
// a message is on its way; retry the dequeue.
continue;
}
// We have data in the list to send but nothing more to consume, so let's send the data that we do have.
break;
}
dataToSendList.Add(dataToSend);
messagesLeft = Interlocked.Decrement(ref _pendingMessages);
}
while (messagesLeft > 0);
// Async callback is WriteAsyncCallback()
WriteAsync(dataToSendList);
}
private void WriteAsync(List<INetworkSerializable> dataToSendList)
{
// some code
}
// Callback after WriteAsync() has sent the data.
private void WriteAsyncCallback()
{
// ...
SendNewData();
}
The race condition can be prevented by adding the following, which double-checks the Queue after we have declared that we are no longer the consumer:
if (dataToSendList.Count == 0)
{
//Declare that we are no longer the consumer.
Interlocked.Decrement(ref RunningWrites);
//Double check the queue to prevent race condition A
if (Queue.IsEmpty)
return;
else
{ //Race condition A occurred. There is data again.
//Let's try to become a consumer.
if (Interlocked.CompareExchange(ref RunningWrites, 1, 0) == 0)
continue;
//Another thread has nominated itself as the consumer. Our job is done.
return;
}
}
break;
I have an issue with an email-sending Windows service. The service wakes up every three minutes, gets the messages to send from the DB, and starts sending them. Here is how the code looks:
MessageFilesHandler MFHObj = new MessageFilesHandler();
List<Broadcostmsg> imidiateMsgs = Manager.GetImidiateBroadCastMsgs(conString);
if (imidiateMsgs.Count > 0)
{
// WriteToFileImi(strLog);
Thread imMsgThread = new Thread(new ParameterizedThreadStart(MFHObj.SendImidiatBroadcast));
imMsgThread.IsBackground = true;
imMsgThread.Start(imidiateMsgs);
}
This sends messages to large lists and takes long to complete for a larger list. The problem occurs when one message is still being sent and the service picks up a new message to send: the previous send is halted and the new send starts, even though I am using threads. Each time the service gets a message to send, it starts a new thread.
Can you please help me find where the mistake in my code is?
I think you are running your code inside a loop that WAITS for new messages. Did you manage those waits? Let's see:
while(imidiateMsgs.Count == 0)
{
//Wait for new Message
}
//Now you have a new message Here
//Make a new thread to process message
There are different methods for that wait; I suggest using a blocking queue:
In public area:
BlockingCollection<Broadcostmsg> imidiateMsgs = new BlockingCollection<Broadcostmsg>();
In your consumer (the thread that processes messages):
Broadcostmsg msg = imidiateMsgs.Take(); //this blocks until a new message arrives
//Now you have a new message here
//Hand the message to a pool thread for processing
In your producer (the thread that creates messages):
imidiateMsgs.Add(newMsg);
And use the thread pool for processing messages instead of initializing a new thread each time.
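For example, against the code from the question (SendImidiatBroadcast already matches the WaitCallback signature, since it was used with ParameterizedThreadStart):
List<Broadcostmsg> imidiateMsgs = Manager.GetImidiateBroadCastMsgs(conString);
if (imidiateMsgs.Count > 0)
{
    // Borrows a pool thread instead of paying thread-creation cost every cycle.
    ThreadPool.QueueUserWorkItem(MFHObj.SendImidiatBroadcast, imidiateMsgs);
}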
It looks like the requirement is to build a producer/consumer queue, in which producers keep adding messages to a list and a consumer picks items from that list and does some work with them.
My only worry is that you are creating a new Thread for each email send rather than drawing from the thread pool. If you keep creating more and more threads, your application's performance will degrade due to the overhead of context switching.
If you are using .NET Framework 4.0, the solution becomes pretty easy. You could use System.Collections.Concurrent.ConcurrentQueue for enqueuing and dequeuing your items; it is thread-safe, so no lock objects are required. Use Tasks to process your messages.
BlockingCollection takes an IProducerConsumerCollection in its constructor, or it will use a ConcurrentQueue by default if you call its empty constructor.
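For instance (illustrative only; the default FIFO queue is what you want here), backing it with a stack gives LIFO ordering instead:
var lifoCollection = new BlockingCollection<string>(new ConcurrentStack<string>());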
So to enqueue your messages.
//define a blocking collection
var blockingCollection = new BlockingCollection<string>();
int count = 0;
//Producer
Task.Factory.StartNew(() =>
{
while (true)
{
blockingCollection.Add("value" + count);
count++;
}
});
//consumer
//GetConsumingEnumerable blocks until it finds an item to work on;
//it replaces the while(true) polling loop you would otherwise write in the consumer
Task.Factory.StartNew(() =>
{
foreach (string value in blockingCollection.GetConsumingEnumerable())
{
Console.WriteLine("Worker 1: " + value);
}
});
UPDATE
Since you are using Framework 3.5, I suggest you have a look at Joseph Albahari's implementation of a producer/consumer queue. It is one of the best you will find.
Taking the code directly from the above link:
public class PCQueue
{
readonly object _locker = new object();
Thread[] _workers;
Queue<Action> _itemQ = new Queue<Action>();
public PCQueue (int workerCount)
{
_workers = new Thread [workerCount];
// Create and start a separate thread for each worker
for (int i = 0; i < workerCount; i++)
(_workers [i] = new Thread (Consume)).Start();
}
public void Shutdown (bool waitForWorkers)
{
// Enqueue one null item per worker to make each exit.
foreach (Thread worker in _workers)
EnqueueItem (null);
// Wait for workers to finish
if (waitForWorkers)
foreach (Thread worker in _workers)
worker.Join();
}
public void EnqueueItem (Action item)
{
lock (_locker)
{
_itemQ.Enqueue (item); // We must pulse because we're
Monitor.Pulse (_locker); // changing a blocking condition.
}
}
void Consume()
{
while (true) // Keep consuming until
{ // told otherwise.
Action item;
lock (_locker)
{
while (_itemQ.Count == 0) Monitor.Wait (_locker);
item = _itemQ.Dequeue();
}
if (item == null) return; // This signals our exit.
item(); // Execute item.
}
}
}
The advantage of this approach is that you can control the number of threads you create for optimal performance. With the thread-pool approach, although it is safe, you cannot control the number of threads that may run simultaneously.
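A short usage sketch for the email scenario (SendBroadcast is a hypothetical method that does the actual send):
// Two workers are usually plenty for I/O-bound email sending; tune as needed.
var queue = new PCQueue(2);
foreach (Broadcostmsg msg in imidiateMsgs)
{
    Broadcostmsg captured = msg; // per-iteration copy (pre-C# 5 foreach capture pitfall)
    queue.EnqueueItem(() => SendBroadcast(captured));
}
queue.Shutdown(true); // drain the queue and wait for the workers to finish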