Thread-safe buffer - c#

I'm implementing some kind of buffering mechanism:
private static readonly ConcurrentQueue<ProductDto> ProductBuffer = new ConcurrentQueue<ProductDto>();
private async void On_ProductReceived(object sender, ProductReceivedArgs e)
{
ProductBuffer.Enqueue(e.Product);
if (ProductBuffer.Count >= handlerConfig.ProductBufferSize)
{
var products = ProductBuffer.ToList();
ProductBuffer.Clear();
await SaveProducts(products);
}
}
And the question is: should I add some kind of lock to ensure no data is lost (e.g. some other thread adds a product after buffer.ToList() and before buffer.Clear(), hypothetically), or will ConcurrentQueue handle all the dirty work for me?

You can do it like this:
if (ProductBuffer.Count < handlerConfig.ProductBufferSize)
return;
var productsToSave = new List<ProductDto>();
ProductDto dequeued;
// Keep dequeuing until the queue reports that it is empty
while (ProductBuffer.TryDequeue(out dequeued))
{
productsToSave.Add(dequeued);
}
SaveProducts(productsToSave);
You never Clear the queue. You just keep taking things out until it's empty. Or you could stop taking things out when productsToSave reaches a certain size, process that list, and then start a new one if you don't want to save too many products at once.
This way it doesn't matter if new items are added to the queue. If they're added while you're reading from the queue, they get read too. If they're added just after you stop reading from the queue, they'll be there and get read the next time the queue gets full and you process it.
The point of a ConcurrentQueue is that you can add to it and read from it from multiple threads, with no need for lock.
If you were to do this:
productsToSave = ProductBuffer.ToList();
ProductBuffer.Clear();
then you would need the lock (which would defeat the purpose). Presumably you're using a ConcurrentQueue because multiple threads may be adding items to the queue. If that's the case, then it is entirely possible that something could go into the queue between the execution of those two statements. It wouldn't get added to the list, but it would be deleted by Clear. That item would be lost.
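The "stop taking things out when productsToSave reaches a certain size" variant mentioned above could look roughly like the following. This is only a sketch: the BufferDrain class, the DrainBatch method, and the maxBatchSize parameter are names invented for illustration and are not part of the original code.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class BufferDrain
{
    // Removes at most maxBatchSize items from the queue and returns them.
    // Items enqueued concurrently either end up in this batch or stay in
    // the queue for the next call; none are discarded.
    public static List<T> DrainBatch<T>(ConcurrentQueue<T> queue, int maxBatchSize)
    {
        var batch = new List<T>();
        while (batch.Count < maxBatchSize && queue.TryDequeue(out T item))
        {
            batch.Add(item);
        }
        return batch;
    }
}
```

A caller would invoke DrainBatch(ProductBuffer, someLimit) repeatedly and save each returned list until an empty list comes back.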

This is how I would implement it. I am assuming you do not need to be notified when the save has finished?
private void On_ProductReceived(object sender, ProductReceivedArgs e)
{
// Variable to hold potential list of products to save
List<ProductDto> productsToSave;
// Lock buffer
lock(ProductBuffer)
{
ProductBuffer.Enqueue(e.Product);
// If it is under size, return immediately
if (ProductBuffer.Count < handlerConfig.ProductBufferSize)
return;
// Otherwise save products, clear buffer, release lock.
productsToSave = ProductBuffer.ToList();
ProductBuffer.Clear();
}
// Save products outside the lock
SaveProducts(productsToSave);
}
What if you get 1 product, and don't get anything else, will you not want to save this after some timeout?
I would use something like Rx for your use case, especially IObservable<T>.Buffer(count)

Related

Multithreading and how to make sure that everything is perfectly synchronized

So first of all, my example code:
class Program
{
static List<string> queue = new List<string>();
static System.Threading.Thread queueWorkerThread;
static void Main(string[] args)
{
// Randomly call 'AddItemToQueue' at certain circumstances and user inputs (unpredictable)
}
static void AddItemToQueue(string item)
{
queue.Add(item);
// Check if the queue worker thread is active
if (queueWorkerThread == null || queueWorkerThread.IsAlive == false)
{
queueWorkerThread = new System.Threading.Thread(QueueWorker);
queueWorkerThread.Start();
Console.WriteLine("Added item to queue and started queue worker!");
}
else
{
Console.WriteLine("Added item to queue and queue worker is already active!");
}
}
static void QueueWorker()
{
do
{
string currentItem = queue[0];
// ...
// Do things with 'currentItem'
// ...
// Remove item from queue and process next one
queue.RemoveAt(0);
} while (queue.Count > 0);
// Reference Point (in my question) <----
}
}
What I am trying to create in my code is a QueueWorker()-method which is always active when there is something in the queue.
Items can be added to the queue via a AddItemToQueue()-method as you can see in the code example.
It basically adds the item to the queue and then checks whether the queue worker is active (i.e. there were other items in the queue previously) or not (i.e. the queue was completely empty previously).
What I am not fully sure about is this: let's say the queue worker thread is currently at the position marked "Reference Point" in the code (it has just left the while-loop), and of course the thread's IsAlive property is still true at this point.
So what if the AddItemToQueue() method checked the thread's IsAlive property at that exact moment?
That would mean the thread ends shortly afterwards and the new item is just left in the queue. Nothing would happen, because the AddItemToQueue() method didn't realize that the thread was just about to end.
How do I deal with this? (I want to make sure everything works 100%.)
(If there are any questions about my question or something is not clear, feel free to ask!)

Suggest data structure/synchronization method

I have a data source that generates ~1Million events per second from 15-20 threads.
The event callback handler implements a caching strategy, to record changes to objects from the events (it is guaranteed that updates for individual objects always originate from the same thread)
Every 100ms I want to pause/lock the event handler and publish a snapshot of the latest state of all modified objects.
A mock implementation of what I currently have looks like:
private static void OnHandleManyEvents(FeedHandlerSource feedHandlerSource, MyObject myObject, ChangeFlags flags)
{
if (objectsWithChangeFlags[myObject.ID] == ChangeFlags.None)
{
UpdateStorage updateStorage = feedHandlerSourceToUpdateStorage[(int)feedHandlerSource];
lock (updateStorage.MyOjectUpdateLock)
{
objectsWithChangeFlags[myObject.ID] = objectsWithChangeFlags[myObject.ID] | flags;
updateStorage.MyUpdateObjects.Add(myObject);
}
} else
objectsWithChangeFlags[myObject.ID] = objectsWithChangeFlags[myObject.ID] | flags;
}
// runs on separate thread
private static void MyObjectPump()
{
while (true)
{
foreach (UpdateStorage updateStorage in feedHandlerSourceToUpdateStorage)
{
lock (updateStorage.MyOjectUpdateLock)
{
if (updateStorage.MyUpdateObjects.Count == 0)
continue;
foreach (MyObject myObject in updateStorage.MyUpdateObjects)
{
// do some stuff
objectsWithChangeFlags[myObject.ID] = ChangeFlags.None;
}
updateStorage.MyUpdateObjects.Clear();
}
}
Thread.Sleep(100);
}
}
The problem with this code, while it shows good performance, is a potential race condition.
Specifically, it is possible for the ChangeFlags to be set to None for an object in the Pump thread while an event callback sets it back to an altered state without locking the resource (in which case the object would never be added to the MyObjectUpdates list and would remain stale forever).
The alternative is to lock on every event callback, which induces too much of a performance hit.
How would you solve this problem?
--- UPDATE ---
I believe I solved this problem now by introducing a "CacheItem" that is stored in the objectsWithChangeFlags array that tracks if an object is currently "Enqueued".
I've also tested ConcurrentQueue for enqueuing/dequeuing as Holger suggested below but it shows slightly lower throughput than just using a lock (I'm guessing because the contention rate is not very high and the overhead for a lock without contention is very low)
private class CacheItem
{
public ChangeFlags Flags;
public bool IsEnqueued;
}
private static void OnHandleManyEvents(MyObject myObject, ChangeFlags flags)
{
Interlocked.Increment(ref _countTotalEvents);
Interlocked.Increment(ref _countTotalEventsForInterval);
CacheItem f = objectsWithChangeFlags[myObject.Id];
if (!f.IsEnqueued)
{
Interlocked.Increment(ref _countEnqueue);
f.Flags = f.Flags | flags;
f.IsEnqueued = true;
lock (updateStorage.MyObjectUpdateLock)
updateStorage.MyObjectUpdates.Add(myObject);
}
else
{
Interlocked.Increment(ref _countCacheHits);
f.Flags = f.Flags | flags;
}
}
private static void QuotePump()
{
while (true)
{
lock (updateStorage.MyObjectUpdateLock)
{
foreach (var obj in updateStorage.MyObjectUpdates)
{
Interlocked.Increment(ref _countDequeue);
CacheItem f = objectsWithChangeFlags[obj.Id];
f.Flags = ChangeFlags.None;
f.IsEnqueued = false;
}
updateStorage.MyObjectUpdates.Clear();
}
_countQuotePumpRuns++;
Thread.Sleep(75);
}
}
In similar scenarios (a logging thread) I used the following strategy:
The events were enqueued to a ConcurrentQueue. The snapshot thread checks once in a while whether the queue is non-empty. If it is, the thread reads everything out of it until it is empty, applies the changes, and then takes the snapshot. After that it can either sleep for a while, or check again immediately whether there is more to process and sleep only if there is not.
With this approach your events are executed in batches and your snapshot is taken after every batch.
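A minimal sketch of that batching strategy is shown below. It assumes a single snapshot thread and any number of producer threads. The ChangeEvent and BatchingSnapshotter types, and the use of plain int values for object IDs and flags, are placeholders made up for this illustration; they do not come from the question's code.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical event record: which object changed and how.
public sealed class ChangeEvent
{
    public int ObjectId;
    public int Flags;
}

public sealed class BatchingSnapshotter
{
    private readonly ConcurrentQueue<ChangeEvent> _events =
        new ConcurrentQueue<ChangeEvent>();

    // Called from any producer thread; no explicit lock is taken.
    public void OnEvent(int objectId, int flags)
    {
        _events.Enqueue(new ChangeEvent { ObjectId = objectId, Flags = flags });
    }

    // Called from the single snapshot thread. Drains everything currently
    // queued and returns, per object, the OR of all flags seen in this batch.
    public Dictionary<int, int> DrainBatch()
    {
        var changed = new Dictionary<int, int>();
        while (_events.TryDequeue(out ChangeEvent e))
        {
            changed.TryGetValue(e.ObjectId, out int flags); // 0 when absent
            changed[e.ObjectId] = flags | e.Flags;
        }
        return changed;
    }
}
```

The snapshot thread would call DrainBatch in a loop, publish the returned dictionary, and sleep only when the batch comes back empty. Whether this beats a lightly contended lock depends on the workload, as the update in the question notes.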
About Caching:
I could imagine a (Concurrent)Dictionary in which you look up the object in the event handler. If it's not found, it's loaded (or fetched from wherever it comes from). AFTER event processing it is added to the dictionary (even if it was already found in there). The snapshot method removes all objects it snapshots from the dictionary BEFORE it snapshots them. Then either the event will be in the snapshot, or the object will still be in the dictionary after the event.
This should work with your premise that all changes to one object come from the same thread. The Dictionary will only contain the objects that are changed since the last snapshot run.
Could you have two objectsWithChangeFlags collections, and switch the reference every 100ms? That way you wouldn't have to lock anything as the pump thread would be working on an "offline" collection.

C# - how to implement an image preloading cache with threads

In my application, there is a list of images through which the user can step. Image loading is slow, so to improve user experience I would like to preload some images in the background (e.g. those images in the list succeeding the currently selected one).
I've never really used threads in C#, so I am looking for some kind of "best practice" advice how to implement the following behaviour:
public Image LoadCachedImage(string path)
{
// check if the cache (being operated in the background)
// has preloaded the image
Image result = TryGetFromCache(path);
if (result == null) { result = LoadSynchronously(path); }
// somehow get a list of images that should be preloaded,
// e.g. the successors in the list
string[] candidates = GetCandidates(path);
// trigger loading of "candidates" in the background, so they will
// be in the cache when queried later
EnqueueForPreloading(candidates);
return result;
}
I believe, a background thread should be monitoring the queue, and consecutively process the elements that are posted through EnqueueForPreloading(). I would like to know how to implement this "main loop" of the background worker thread (or maybe there is a better way to do this?)
If you really need sequential processing of the candidates, you can do one of the following:
Create a message queue data structure that has an AutoResetEvent. The class should spawn a thread that waits on the event and then processes everything in the queue. The class's Add or Enqueue method should add the item to the queue and then set the event. This releases the thread, which processes the items in the queue.
Create a class that starts an STA thread, creates a System.Windows.Forms.Control, and then enters Application.Run(). Every time you want to process an image asynchronously, call Control.BeginInvoke(...) and the STA thread will pick it up in its message queue.
There are probably other alternatives, but these two would be what I would try.
If you don't actually need sequential processing, consider using ThreadPool.QueueUserWorkItem(...). If there are free pool threads, it will use them, otherwise it will queue up the items. But you won't be guaranteed order of processing, and several may/will get processed concurrently.
Here's a (flawed) example of a message queue:
class MyBackgroundQueue<T>
{
private Queue<T> _queue = new Queue<T>();
private System.Threading.AutoResetEvent _event = new System.Threading.AutoResetEvent(false);
private System.Threading.Thread _thread;
public void Start()
{
_thread = new System.Threading.Thread(new System.Threading.ThreadStart(ProcessQueueWorker));
_thread.Start();
}
public class ItemEventArgs : EventArgs
{ public T Item { get; set; } }
public event EventHandler<ItemEventArgs> ProcessItem;
private void ProcessQueueWorker()
{
while (true)
{
_event.WaitOne();
while (_queue.Count > 0)
ProcessItem(this, new ItemEventArgs { Item = _queue.Dequeue() });
}
}
public void Enqueue(T item)
{
_queue.Enqueue(item);
_event.Set();
}
}
One flaw here, of course, is that _queue is not locked, so you'll run into race conditions. But I'll leave it to you to fix that (e.g. use the two-queue swap method). Also, the while(true) never breaks, but I hope the sample serves your purpose.
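One possible way to apply the two-queue swap mentioned above is sketched here. It is not the answerer's code: the SwappingBackgroundQueue name, the Action<T> callback (used instead of the ProcessItem event), and the Dispose-based shutdown are all assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class SwappingBackgroundQueue<T> : IDisposable
{
    private readonly object _gate = new object();
    private readonly AutoResetEvent _signal = new AutoResetEvent(false);
    private readonly Action<T> _process;
    private readonly Thread _thread;
    private Queue<T> _pending = new Queue<T>();
    private volatile bool _stopping;

    public SwappingBackgroundQueue(Action<T> process)
    {
        _process = process;
        _thread = new Thread(Worker) { IsBackground = true };
        _thread.Start();
    }

    public void Enqueue(T item)
    {
        lock (_gate) { _pending.Enqueue(item); }
        _signal.Set(); // wake the worker after the item is visible
    }

    private void Worker()
    {
        while (true)
        {
            _signal.WaitOne();
            while (true)
            {
                Queue<T> batch;
                lock (_gate)
                {
                    // Swap: take the filled queue, leave an empty one behind.
                    batch = _pending;
                    _pending = new Queue<T>();
                }
                if (batch.Count == 0) break;
                // Process outside the lock so producers are never blocked by work.
                while (batch.Count > 0) _process(batch.Dequeue());
            }
            if (_stopping) return;
        }
    }

    // Processes everything enqueued so far, then stops the worker thread.
    public void Dispose()
    {
        _stopping = true;
        _signal.Set();
        _thread.Join();
        _signal.Dispose();
    }
}
```

The lock is held only for the enqueue and for the reference swap, so producers wait very briefly. Calling Enqueue after Dispose is not supported in this sketch.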
This is what I call cheat caching. The operating system already caches files for you, but you have to access them first. So what you can do is just load the files but don't save a reference to them.
You can do this without multi-threading per-se, and without holding the images in a list. Just create a method delegate and invoke for each file you want to load in the background.
For example, pre-loading all the jpeg images in a directory.
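// dir is assumed to be a System.IO.DirectoryInfo.
// Note: delegate BeginInvoke is only supported on .NET Framework.
Action<string> d = (string file) => { System.Drawing.Image.FromFile(file); };
foreach (FileInfo file in dir.GetFiles("*.jpg"))
d.BeginInvoke(file.FullName, null, null);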
BeginInvoke() is a multi-threaded approach to this: the loop itself finishes very quickly, and the files are loaded on thread-pool threads. Or you could change that up a little and put the loop inside the delegate, i.e.
public void PreCache(List<string> files)
{
foreach(string file in files)
System.Drawing.Image.FromFile(file);
}
Then in your code
Action<List<string>> d = PreCache;
d.BeginInvoke(theList, null, null);
Then all the loading is done on just one worker thread.

Using Threads and .Invoke() and controls still remain inactive - C#

I am trying to populate a text box with some data, namely the names of several instruments a line at a time.
I have a class that will generate and return a list of instruments, I then iterate through the list and append a new line to the text box after each iteration.
Starting the Thread:
private void buttonListInstruments_Click(object sender, EventArgs e)
{
if (ins == null)
{
ins = new Thread(GetListOfInstruments);
ins.Start();
}
else if (ins != null)
{
textBoxLog.AppendText("Instruments still updating..");
}
}
Delegate to update textbox:
public delegate void UpdateLogWithInstrumentsCallback(List<Instrument> instruments);
private void UpdateInstruments(List<Instrument> instruments)
{
textBoxLog.AppendText("Listing available Instruments...\n");
foreach (var value in instruments)
{
textBoxLog.AppendText(value.ToString() + "\n");
}
textBoxLog.AppendText("End of list. \n");
ins = null;
}
Invoking the control:
private void GetListOfInstruments()
{
textBoxLog.Invoke(new UpdateLogWithInstrumentsCallback(this.UpdateInstruments),
new object[] { midiInstance.GetInstruments() });
}
Note: GetInstruments() returns a List of type Instrument.
I am implementing threads to try to keep the GUI functional whilst the text box updates.
For some reason the other UI controls on the WinForm, such as a separate combo box, remain inactive when pressed until the text box has finished updating.
Am I using threads correctly?
Thanks.
You haven't accomplished anything: the UpdateInstruments() method still runs on the UI thread, just like it did before. I'm not sure why you see such a long delay; that must be a large number of instruments. You can possibly make it less slow by first appending all of them to a StringBuilder and then appending its ToString() value to the TextBox. That cuts out the fairly expensive Windows call made for every line.
I would recommend using a SynchronizationContext in general:
From the UI thread, e.g. initialization:
// make sure a SC is created automatically
Forms.WindowsFormsSynchronizationContext.AutoInstall = true;
// a control needs to exist prior to getting the SC for WinForms
// (any control will do)
var syncControl = new Forms.Control();
syncControl.CreateControl();
SynchronizationContext winformsContext = System.Threading.SynchronizationContext.Current;
Later on, from any thread wishing to post to the above SC:
// later on -- no need to worry about Invoke/BeginInvoke! Whoo!
// Post will run async and will guarantee a post to the UI message queue
// that is, this returns immediately
// it is OKAY to call this from the UI thread or a non-UI thread
winformsContext.Post((state) => ..., someState);
As others have pointed out, either make the UI update action quicker (this is the better method!) or split it into multiple actions posted to the UI queue (if you post the work in small pieces, other messages in the queue won't be blocked). Here is an example of "chunking" the operations into small slices of time until it's all done. It assumes UpdateStuff is called after the data is collected, so it is not necessarily suitable when the collection itself takes noticeable time. It doesn't take "stopping" into account and is somewhat messy, as it uses a closure instead of passing the state. Anyway, enjoy.
void UpdateStuff (List<string> _stuff) {
var stuff = new Queue<string>(_stuff); // make copy
SendOrPostCallback fn = null; // silly so we can access in closure
fn = (_state) => {
// this is in UI thread
Stopwatch s = new Stopwatch();
s.Start();
while (s.ElapsedMilliseconds < 20 && stuff.Count > 0) {
var item = stuff.Dequeue();
// do stuff with item
}
if (stuff.Count > 0) {
// have more stuff. we may have run out of our "time-slice"
winformsContext.Post(fn, null);
}
};
winformsContext.Post(fn, null);
}
Happy coding.
Change this line:
textBoxLog.Invoke(new UpdateLogWithInstrumentsCallback(this.UpdateInstruments),
new object[] { midiInstance.GetInstruments() });
with this:
textBoxLog.BeginInvoke(new UpdateLogWithInstrumentsCallback(this.UpdateInstruments),
new object[] { midiInstance.GetInstruments() });
In terms of threading, you are feeding all the instruments into the textbox at once rather than one by one. The call to Invoke should be placed inside the for-loop rather than around it.
Nope: you start a thread and then use Invoke, which basically means you are going back to the UI thread to do the work... so your thread does nothing!
You might find that it's more efficient to build a string first and append to the textbox in one chunk, instead of line-by-line. The string concatenation operation could then be done on the helper thread as well.
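The "build the string first" idea suggested in the answers can be sketched without any UI code. The InstrumentLog class and the BuildLogText method are hypothetical names, and the instrument names are passed in as plain strings rather than Instrument objects to keep the example self-contained.

```csharp
using System.Collections.Generic;
using System.Text;

public static class InstrumentLog
{
    // Builds the complete log text in memory. This can run on the helper
    // thread because it does not touch any control.
    public static string BuildLogText(IEnumerable<string> instrumentNames)
    {
        var sb = new StringBuilder();
        sb.Append("Listing available Instruments...\n");
        foreach (string name in instrumentNames)
        {
            sb.Append(name).Append('\n');
        }
        sb.Append("End of list. \n");
        return sb.ToString();
    }
}
```

The helper thread would call BuildLogText and then marshal a single AppendText call to the UI thread, so the UI thread performs one update instead of one per instrument.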

.NET Working with Locking and Threads

I'm working on this small test application to learn threading/locking. I have the following code, and I would think that the line should only be written to the console once. However, it doesn't seem to be working as expected. Any thoughts on why? What I'm trying to do is add this Lot object to a List, so that if any other threads try to hit that list, they block. Am I completely misusing lock here?
class Program
{
static void Main(string[] args)
{
int threadCount = 10;
//spin up x number of test threads
Thread[] threads = new Thread[threadCount];
Work w = new Work();
for (int i = 0; i < threadCount; i++)
{
threads[i] = new Thread(new ThreadStart(w.DoWork));
}
for (int i = 0; i < threadCount; i++)
{
threads[i].Start();
}
// don't let the console close
Console.ReadLine();
}
}
public class Work
{
List<Lot> lots = new List<Lot>();
private static readonly object thisLock = new object();
public void DoWork()
{
Lot lot = new Lot() { LotID = 1, LotNumber = "100" };
LockLot(lot);
}
private void LockLot(Lot lot)
{
// i would think that "Lot has been added" should only print once?
lock (thisLock)
{
if(!lots.Contains(lot))
{
lots.Add(lot);
Console.WriteLine("Lot has been added");
}
}
}
}
The lock statement ensures that two pieces of code will not execute simultaneously.
If two threads try to enter a lock block at once, the second thread will wait until the first one finishes, then continue and execute the block.
In your code, lots.Contains(lot) is always false because the DoWork method creates a different Lot object in each thread. Therefore, each thread adds another Lot object after acquiring the lock.
You probably want to override Equals and GetHashCode in your Lot class and make it compare by value, so that lots.Contains(lot) will return true for different Lot objects with the same values.
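A value-comparing Lot might look like the sketch below. The Lot class itself is not shown in the question, so the property types (int LotID, string LotNumber) are inferred from the object initializer in DoWork and should be treated as assumptions.

```csharp
using System;
using System.Collections.Generic;

public sealed class Lot : IEquatable<Lot>
{
    public int LotID { get; set; }
    public string LotNumber { get; set; }

    // Two lots are equal when both property values match.
    public bool Equals(Lot other)
    {
        return other != null
            && LotID == other.LotID
            && LotNumber == other.LotNumber;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Lot);
    }

    // Must be consistent with Equals: equal lots produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            return (LotID * 397) ^ (LotNumber != null ? LotNumber.GetHashCode() : 0);
        }
    }
}
```

With this in place, List<Lot>.Contains finds a previously added lot that has the same values, so only the first thread to take the lock would print "Lot has been added". Because the hash code depends on mutable properties, a Lot should not be modified while it is stored in a hash-based collection.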
lock is essentially a critical section, and will only lock the object while the code within is executed. As soon as the code exits the lock block, the object will be unlocked. So... it makes sense that each thread would (eventually) print to console.
You are creating a new Lot object on each thread, so if you have not defined your own Equals method for the object it makes sense that lots.Contains(lot) will always return false.
You need the lock statement to protect shared data: variables that are read and written by more than one thread at the same time. The "lot" variable doesn't meet that requirement, because every thread creates its own instance of a Lot object. The reference is stored in a local variable ("lot"), and every thread has its own local variables.
The lots field does fit the requirement. There is only one instance of it, because there is only one instance of the Work class, and all threads access it. The threads both read from and write to the list, through the Contains method and the Add method respectively. Your lock statement prevents threads from accessing the list at the same time and is correct: Contains can never run at the same time as Add.
You are 95% there, you just missed that each thread has a unique "lot" object. One that cannot have been added to the list before. Every single thread will therefore get a false return from Contains.
If you want the Lot class to have identity, based on the LotID and LotNumber property values instead of just the object instance, then you'll need to give it identity by overriding the Equals() and GetHashCode() method. Check your favorite C# programming book, they all mention this. It doesn't otherwise have anything to do with threading.
Why would you only expect it to run once? You call DoWork in 10 different threads, each one creates its own "new Lot()" object. Were you expecting value comparison of Lot objects? Did you override Equals() and implement IEquatable?
