C# Blocking collection data handoff time intermittent - c#

Data handoff in a BlockingCollection sometimes takes too long.
Example code:
Producer:
BlockingCollection<byte[]> collection = new BlockingCollection<byte[]>(5000);
{
while (condition)
{
byte[] data = new byte[10240];
// fill data here.. Read from external source
collection.Add(data);
}
collection.CompleteAdding();
}
Consumer:
{
while (!collection.IsAddingCompleted)
{
byte[] data = collection.Take();
// write data to disk..
}
}
Both producer and consumer run on different tasks. It runs perfectly, but sometimes adding an array to the collection takes around 50 milliseconds, which is a deal breaker; it usually takes less than 1 millisecond to hand off data. Theoretically the consumer thread should not block when writing, as writing to disk is on a separate thread.

It's the boundedCapacity value you're passing to the constructor:
BlockingCollection<byte[]> collection =
new BlockingCollection<byte[]>(5000 /* <--- boundedCapacity */);
You are initializing a blocking collection with a queue limited to 5000 items. When there are 5000 items in the queue, any producer will be blocked until there's an empty slot again, which is what shows up as an occasional slow Add. Little's Law (average number of items in the queue = arrival rate × average time an item spends in the queue) is a useful guide for sizing this limit. You'll need to analyse your system to find the optimal bound, or you can leave the queue unbounded and write some tests to make sure it doesn't grow without limit.
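To see the bound in action, here is a minimal sketch that uses a tiny capacity so the effect is easy to reproduce. The names `BoundedDemo` and `FillUntilFull` are made up for illustration. It relies on `TryAdd`, which returns `false` at exactly the point where `Add` would block.

```csharp
using System;
using System.Collections.Concurrent;

public static class BoundedDemo
{
    // Adds small buffers without blocking until the bounded collection
    // refuses one, and returns how many were accepted.
    public static int FillUntilFull(BlockingCollection<byte[]> collection)
    {
        int accepted = 0;
        // TryAdd(item) returns false immediately when the collection is at
        // its bounded capacity; Add(item) would block at the same point.
        while (collection.TryAdd(new byte[16]))
        {
            accepted++;
        }
        return accepted;
    }
}
```

In the producer from the question, one option is to call `collection.TryAdd(data, TimeSpan.FromMilliseconds(1))` and log every `false` result. That should tell you whether the 50 ms stalls coincide with a full queue.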

Related

How can I read a constant stream of data and keep my ListView updated in real time?

I'm using a Task to start a while loop that runs and constantly collects data from a USB device. The data can come in very fast, multiple messages per millisecond. I want to display the data in real time using a ListView.
The goal is to have two options to display the data. The first way is to display the newest message at the top of the list view. I've tried calling a dispatcher and inserting the data at the beginning of an ObservableCollection. This works fine with a message every 20 ms or so.
Often the data coming in is the same message again and again with a consistent interval. The second way is to have a row in the listview for each unique message. As a new message comes in it either takes the place of the previous similar message or it is inserted into a new position. I accomplished this by inheriting from ObservableCollection and implementing a binarysearch function to get an index and then replace or insert the newest message. This also worked fine at about the same rate.
The problem is that updating the UI can't keep up with reading the data from the USB device when the traffic coming from the USB device is heavy. It worked fine with low volumes of data, but I'm stuck trying to make this thing more efficient.
I've tried creating my own methods in my ExtendedObservableCollection. I created some AddRange methods for each scenario, calling OnCollectionChanged after all the updates. The performance this way seems to be even worse than it was before, which is very confusing to me, since this seems like the way to go. I think the issue has something to do with my while loop, which is collecting the data, and the AddRange method not getting along.
I also tried calling BindingOperations.EnableCollectionSynchronization(MessageList, balanceLock); without using the dispatcher, and it didn't seem to help much. I put my AddRange methods inside a lock statement.
I also tried running the BatchUpdate method in its own while loop, in parallel with my loop collecting data, but it didn't work at all.
This is my loop for reading the messages from the USB device
int interval = 40;
private void BeginReading()
{
do
{
waitHandle.WaitOne(TimeSpan.FromMilliseconds(.5));
if (ReadOk)
{
MessageBuffer.Add(message);
}
if (Stopwatch.ElapsedMilliseconds > interval)
{
BatchUpdate();
MessageBuffer = new List<Message>();
interval += 40;
}
} while (ReceiveReady);
}
This is one of my AddRange Methods in my extended ObservableCollection
public void AddRangeScroll(List<Message> MessageList)
{
if (MessageList == null) throw new ArgumentNullException(nameof(MessageList));
foreach (Message message in MessageList)
{
Items.Insert(0, message);
}
OnCollectionChanged(new NotifyCollectionChangedEventArgs
(NotifyCollectionChangedAction.Reset));
}
I'm hoping I'll be able to read the data from the USB device and update the ListView in something that resembles real time.
The messages I'm reading are CAN messages and I'm using the PEAK PCANBasic API to connect to one of their gridconnect USB to CAN devices.
Your approach is absolutely bad. You are blocking threads and keeping them busy with useless, resource-consuming polling. (I don't know why you are making the thread wait, but consider a non-blocking wait such as SemaphoreSlim.WaitAsync or similar. Maybe the awaitable Task.Delay is sufficient here. No waiting at all would be best.)
If you wish to group messages of the same type, use a Dictionary<K, V>, e.g. Dictionary<string, IEnumerable<string>>, where the key is the type of the message and the value is a collection of grouped messages. The lookup is very fast and doesn't require any search. For this scenario, consider introducing a dedicated type for the key (e.g. an enum, or a struct that overrides GetHashCode and Equals) to improve performance. If the keys are guaranteed to be unique, hash tables perform best.
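As a rough sketch of the dictionary idea applied to the "one row per unique message" view: the table below keeps only the most recent message per ID. `CanMessage` and `LatestMessageTable` are hypothetical types invented for this example; the real PCANBasic message structure looks different.

```csharp
using System.Collections.Generic;

// Hypothetical message type; the real PCANBasic message struct differs.
public sealed class CanMessage
{
    public uint Id { get; }
    public byte[] Data { get; }

    public CanMessage(uint id, byte[] data)
    {
        Id = id;
        Data = data;
    }
}

public sealed class LatestMessageTable
{
    private readonly Dictionary<uint, CanMessage> latest =
        new Dictionary<uint, CanMessage>();

    public int Count => latest.Count;

    // Stores the message under its ID, replacing any previous message
    // with the same ID. Returns true when the ID was not seen before.
    public bool Update(CanMessage message)
    {
        bool isNew = !latest.ContainsKey(message.Id);
        latest[message.Id] = message;
        return isNew;
    }

    public bool TryGet(uint id, out CanMessage message)
    {
        return latest.TryGetValue(id, out message);
    }
}
```

The return value of `Update` tells the UI layer whether it has to insert a new row or can simply refresh an existing one, so no binary search is needed.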
First option is to use an event driven logic:
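private System.Timers.Timer timer; // kept in a field so it stays alive
public void ReadDataFromUsb()
{
timer = new System.Timers.Timer(40);
timer.Elapsed += ReadNextDataBlockAsync;
timer.Start(); // Elapsed is not raised until the timer is started
}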
// Will be invoked every 40 ms by the timer
public async void ReadNextDataBlockAsync(Object source, ElapsedEventArgs e)
{
await Task.Run(BeginReading);
}
private void BeginReading()
{
// Read from data source
}
Second option is to use yield return and make BeginRead return an IEnumerable. You can then use it like this:
foreach (var message in BeginRead())
{
// Process message
}
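A minimal sketch of what such an iterator could look like. The `TryRead` delegate is a hypothetical stand-in for the actual device read call, and `keepReading` stands in for the `ReceiveReady` flag from the question; neither is part of any real API.

```csharp
using System;
using System.Collections.Generic;

public static class MessageReader
{
    // Hypothetical stand-in for the device read: returns true and sets
    // the message when one is available.
    public delegate bool TryRead(out string message);

    // Lazily yields messages one at a time for as long as keepReading()
    // returns true. Nothing is read until the caller starts enumerating.
    public static IEnumerable<string> BeginRead(TryRead tryRead, Func<bool> keepReading)
    {
        while (keepReading())
        {
            if (tryRead(out string message))
            {
                yield return message;
            }
        }
    }
}
```

Because the iterator is lazy, the `foreach` body runs between reads on the same thread, so slow processing still delays the next read.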
Third option is to use Producer Consumer Pattern using a BlockingCollection<T>. It exposes a non-blocking TryTake method to get the items that are ready for consumption.
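For the UI side, a small helper like the following could drain whatever is ready on each refresh without ever blocking. This is only a sketch; `BatchDrain` and `DrainAvailable` are names made up for the example.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class BatchDrain
{
    // Takes up to maxItems that are already queued. TryTake without a
    // timeout returns false immediately when nothing is available,
    // so this never blocks the caller.
    public static List<T> DrainAvailable<T>(BlockingCollection<T> source, int maxItems)
    {
        var batch = new List<T>();
        while (batch.Count < maxItems && source.TryTake(out T item))
        {
            batch.Add(item);
        }
        return batch;
    }
}
```

The reading loop then only calls `Add` on the collection, and a UI timer calls `DrainAvailable` and applies the whole batch in one update.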
See Dataflow (Task Parallel Library) for more ways to handle data flow without blocking threads.
Since a 40 ms data refresh rate is very high, also consider showing snapshots that the user can expand when details are needed.

Is there a safe way for a quick exchange of data between threads?

I'm setting up an application that reads data from a load cell and, in real time, based on the data read, interrupts the thrust of a motor. It is essential to have a high frequency reading from the load cell.
I'm programming in c# and I decided to use a separate thread to acquire data from the load cell.
My question is this: how can I use the data acquired in the thread in a thread-safe way? For example to show them in a chart.
This is the thread I call to acquire data in the queue.
Thread t = new Thread(() =>
{
Thread.CurrentThread.IsBackground = true;
while (save_in_queue)
{
Thread.Sleep(1);
if (queue.Count <= 1000)
{
queue.Enqueue(Frm_main.ComPh1.LeggiAnalogica(this.Address));
}
else
{
queue.Dequeue();
queue.Enqueue(Frm_main.ComPh1.LeggiAnalogica(this.Address));
}
}
});
t.Name = "Queue " + this.name;
t.Start();
This is the method I use to associate the queue filled in the thread with the queue in the main application
public void SetData(Queue<int> q)
{
this.data = q;
}
This is the timer I use in the main application to set data for the series
private void timer1_Tick(object sender, EventArgs e)
{
List<int> dati = new List<int>();
lock (data)
{
dati = data.ToList();
}
grafico.Series[serie.Name].Points.Clear();
for (int x = 0; x < dati.Count; x++)
{
DataPoint pt = new DataPoint();
pt.XValue = x;
pt.YValues = new double[] { dati.ElementAt(x) };
grafico.Series[serie.Name].Points.Add(pt);
}
}
This code does not work because sometimes I receive the exception
"Collection was modified; enumeration operation may not execute" on the line dati = data.ToList();
For me it's pretty clear why I receive this exception. But how do I solve it?
I would like to avoid using too many "lock" or too many synchronization variables in order not to reduce the acquisition performance, which at the moment is excellent.
Don't do this in your consumer thread:
lock (data) {
dati = data.ToList();
}
You're using the queue for two different purposes; You're using it to pass data between the two threads, which is good; but you're also using it as a history buffer for previous data samples. That's bad.
What's doubly bad is that each time the timer ticks, you're locking the queue long enough to let the consumer copy maybe hundreds of samples that it had already copied on earlier ticks.
This is bad too:
if (queue.Count <= 1000) {
queue.Enqueue(Frm_main.ComPh1.LeggiAnalogica(this.Address));
}
else {
queue.Dequeue(); // <== THIS IS BAD!
queue.Enqueue(Frm_main.ComPh1.LeggiAnalogica(this.Address));
}
One problem with that is, you are making the producer manage the history buffer (e.g., by limiting the length of the queue), but it's the consumer who cares about the length.
Another problem is that the producer does not lock the queue. If any thread needs to lock a data structure, then every thread needs to lock it.
The producer should do just one thing: It should read data from the sensor, and stuff the data into a queue.
The queue should be used for just one purpose: To communicate new data between the threads.
The consumer should lock the queue just long enough to get the new data from the queue, and copy that into its own, private collection.
Multi-threaded programming can often be counter-intuitive. One example: if you can decrease the amount of time that threads spend accessing a shared object by increasing the amount of work that each thread has to do, that will often improve the overall performance of the program. That's because locking is expensive, and because accessing memory locations that have been touched by other threads is expensive.
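The split of responsibilities described above might look roughly like this. `SampleChannel` and `HistoryBuffer` are hypothetical names for this sketch: the channel only carries new samples between threads, and the history is private to the consumer.

```csharp
using System.Collections.Generic;

// Shared between the threads: carries only samples not yet consumed.
public sealed class SampleChannel
{
    private readonly object gate = new object();
    private readonly Queue<int> pending = new Queue<int>();

    // Called by the acquisition thread.
    public void Publish(int sample)
    {
        lock (gate)
        {
            pending.Enqueue(sample);
        }
    }

    // Called by the consumer: moves all new samples into the caller's
    // list while holding the lock only for the copy.
    public int DrainTo(List<int> destination)
    {
        lock (gate)
        {
            int moved = pending.Count;
            destination.AddRange(pending);
            pending.Clear();
            return moved;
        }
    }
}

// Owned by the consumer only, so it needs no locking.
public sealed class HistoryBuffer
{
    private readonly int capacity;
    private readonly List<int> samples = new List<int>();

    public HistoryBuffer(int capacity)
    {
        this.capacity = capacity;
    }

    public IReadOnlyList<int> Samples => samples;

    // Pulls new samples from the channel and drops the oldest ones
    // beyond the configured capacity.
    public void Append(SampleChannel channel)
    {
        channel.DrainTo(samples);
        int excess = samples.Count - capacity;
        if (excess > 0)
        {
            samples.RemoveRange(0, excess);
        }
    }
}
```

In the timer tick, the form would call `Append` and then redraw the chart from `Samples`, so each sample crosses the lock exactly once.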
You may want to check the System.Collections.Concurrent namespace, which provides thread-safe implementations of some collections:
The System.Collections.Concurrent namespace provides several
thread-safe collection classes that should be used in place of the
corresponding types in the System.Collections and
System.Collections.Generic namespaces whenever multiple threads are
accessing the collection concurrently.
https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent
So you can use System.Collections.Concurrent.ConcurrentQueue<T> instead of System.Collections.Generic.Queue<T> to get a thread-safe queue without adding your own locks.
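A small sketch of why this avoids the exception from the question: `ConcurrentQueue<T>.ToArray()` returns a snapshot, so it can be called while another thread keeps enqueuing. The class and method names here are illustrative only.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentQueueDemo
{
    // Enqueues 'count' items on a background task while the calling
    // thread repeatedly takes snapshots. Returns the final queue length.
    public static int ProduceWhileSnapshotting(int count)
    {
        var queue = new ConcurrentQueue<int>();
        Task producer = Task.Run(() =>
        {
            for (int i = 0; i < count; i++)
            {
                queue.Enqueue(i);
            }
        });

        int largestSnapshot = 0;
        while (!producer.IsCompleted)
        {
            // Safe during concurrent Enqueue calls; no
            // "Collection was modified" exception is thrown.
            int[] snapshot = queue.ToArray();
            if (snapshot.Length > largestSnapshot)
            {
                largestSnapshot = snapshot.Length;
            }
        }

        producer.Wait();
        return queue.Count;
    }
}
```

Note that a snapshot still copies every element, so the earlier advice about not re-copying old samples on every tick applies here too.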

BlockingCollection that doesn't try again within 10 seconds

I am using a BlockingCollection as a FIFO queue, but I am doing a lot of operations on files, where the consumer may easily encounter a file lock. So I have created a simple try/catch where the consumer re-queues itself. In a long FIFO queue with lots of other items, this is enough of a pause, but in an empty or very short FIFO queue it means the consumer perpetually hammers the queue with repeated re-queues of itself that are probably still going to be file locked.
i.e.
consumer busy -> requeue -> consumer busy -> requeue (ad infinitum)
Is there a way to get the BlockingCollection to not attempt to run the new consumer if it is less than 10 seconds old? I.e. potentially take the next one in the queue and carry on, and only take the next consumer if its createdDateTime is null (default for first attempt) or if it is more than 10 seconds old?
There's nothing built-in to help with that. Store with each work item the DateTime when it was last attempted (could be null if this is the first attempt). Then, in your processing function, wait for TimeSpan.FromSeconds(10) - (DateTime.UtcNow - lastAttemptDateTime) before making the next attempt.
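The wait calculation can be isolated in a small helper, sketched below. `RetryDelay` and `Remaining` are hypothetical names; the current time is passed in as a parameter so the logic is easy to test.

```csharp
using System;

public static class RetryDelay
{
    public static readonly TimeSpan MinimumGap = TimeSpan.FromSeconds(10);

    // Returns how long to wait before the next attempt. A null
    // lastAttemptUtc means the item has never been attempted.
    public static TimeSpan Remaining(DateTime? lastAttemptUtc, DateTime nowUtc)
    {
        if (lastAttemptUtc == null)
        {
            return TimeSpan.Zero;
        }

        TimeSpan remaining = MinimumGap - (nowUtc - lastAttemptUtc.Value);
        // Clamp to zero: a negative TimeSpan would make Thread.Sleep throw.
        return remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
    }
}
```

The consumer would then call something like `Thread.Sleep(RetryDelay.Remaining(item.LastAttemptUtc, DateTime.UtcNow))` before retrying, where `LastAttemptUtc` is an assumed property on your work item.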
Consider switching to a priority queue that stores items in the order of earliest next attempt datetime.
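One way to sketch the priority-queue variant, assuming .NET 6 or later, where `PriorityQueue<TElement, TPriority>` is available. `RetrySchedule` is a made-up wrapper for illustration. It is not thread-safe, so access from several threads would need a lock around it.

```csharp
using System;
using System.Collections.Generic;

// Not thread-safe: guard with a lock if several threads use it.
public sealed class RetrySchedule<T>
{
    private readonly PriorityQueue<T, DateTime> queue =
        new PriorityQueue<T, DateTime>();

    public int Count => queue.Count;

    // Items with the earliest next-attempt time are dequeued first.
    public void Schedule(T item, DateTime nextAttemptUtc)
    {
        queue.Enqueue(item, nextAttemptUtc);
    }

    // Returns true and removes the earliest item only if it is due.
    public bool TryTakeDue(DateTime nowUtc, out T item)
    {
        if (queue.TryPeek(out item, out DateTime due) && due <= nowUtc)
        {
            queue.Dequeue();
            return true;
        }

        item = default!;
        return false;
    }
}
```

A failed item is rescheduled with `nowUtc + 10 seconds`, while new items are scheduled with the current time so they are immediately due.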
You could keep two blocking collections: the main one and the "delayed" one. One worker thread would only work on the delayed one, re-adding its items to the main collection. The type of the delayed collection would be something like:
BlockingCollection<Tuple<DateTime, YourObject>>
Now, if the delay is fixed at 10 seconds, the delayed collection will be nearly sorted by DateTime (items added at nearly the same time may be out of order, but we are speaking of a few milliseconds' difference, which is not a problem).
public class MainClass
{
// The "main" BlockingCollection
// (the one you are already using)
BlockingCollection<Work> Works = new BlockingCollection<Work>();
// The "delayed" BlockingCollection
BlockingCollection<Tuple<DateTime, Work>> Delayed = new BlockingCollection<Tuple<DateTime, Work>>();
// This is a single worker that will work on the Delayed collection
// in a separate thread
public void DelayedWorker()
{
Tuple<DateTime, Work> tuple;
while (Delayed.TryTake(out tuple, -1))
{
var dt = DateTime.Now;
if (tuple.Item1 > dt)
{
Thread.Sleep(tuple.Item1 - dt);
}
Works.Add(tuple.Item2);
}
}
}

What's the most efficient way to handle infinite tasks in Producer/Consumer?

I have Gigabytes of data (stored in messages, each message is about 500KB) in a cloud queue (Azure) and data keeps coming.
I need to do some processing on each message. I've decided to create 2 background workers, one to get the data into memory, and the other to process that data:
void GetMessage(CloudQueue cloudQueue, LocalQueue localQueue)
{
lock (localQueue)
{
localQueue.Enqueue(cloudQueue.Dequeue());
}
}
void ProcessMessage(LocalQueue localQueue)
{
lock (localQueue)
{
Process(localQueue.Dequeue());
}
}
The issue is that data never stops coming, so I'll be spending a lot of time synchronizing on the local queue. Is there a known pattern for this type of problem?
You don't need to hold the lock while you process:
Item i;
lock (localQueue)
{
i = localQueue.Dequeue();
}
Process(i);
Hence there should be little contention. If necessary, reduce the frequency with which the producer takes the lock for enqueuing by batching the insertions: rather than holding individual items, have the queue hold batches. This reduces the number of lock acquisitions by a factor equal to the average batch size. You can use a simple batching model, say every 10 items, or by time, or some combination of time and threshold.
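A count-based version of that batching could be sketched as follows. `BatchingProducer` is a hypothetical helper, and the shared queue is assumed to be a plain `Queue<List<T>>` that the consumer also locks on before dequeuing.

```csharp
using System.Collections.Generic;

public sealed class BatchingProducer<T>
{
    private readonly Queue<List<T>> shared;
    private readonly int batchSize;
    private List<T> current = new List<T>();

    public BatchingProducer(Queue<List<T>> shared, int batchSize)
    {
        this.shared = shared;
        this.batchSize = batchSize;
    }

    // Buffers the item locally; the lock is taken only when a full
    // batch is handed over.
    public void Add(T item)
    {
        current.Add(item);
        if (current.Count >= batchSize)
        {
            Flush();
        }
    }

    // Hands over whatever is buffered, e.g. from a timer, so a partial
    // batch does not sit in the producer indefinitely.
    public void Flush()
    {
        if (current.Count == 0)
        {
            return;
        }

        lock (shared)
        {
            shared.Enqueue(current);
        }

        // Start a fresh list; the consumer now owns the old one.
        current = new List<T>();
    }
}
```

With a batch size of 10, the producer takes the lock once per 10 messages instead of once per message. Calling `Flush` periodically gives the time-based part of the combined model.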

Best Data Structure? - 2 Threads, 1 producer, 1 consumer

What is the best data structure to use to do the following:
2 Threads:
1 Produces (writes) to a data structure
1 Consumes (reads and then deletes) from the data structure.
Thread safe
Producer and Consumer can access data structure simultaneously
Efficient for large amounts of data
I wouldn't say that point 4 is impossible, but it is pretty hard, and actually, you should think hard if you really have that requirement.
...
Now that you realized that you don't, the Queue<T> would be what immediately springs to my mind when reading Producer/Consumer.
Let's say you have a thread running ProducerProc() and another running ConsumerProc(), and a method CreateThing() which produces, and a method HandleThing() which consumes, my solution would look something like this:
private readonly Queue<T> queue = new Queue<T>();
private void ProducerProc()
{
while (true) // real abort condition goes here
{
lock (this.queue)
{
this.queue.Enqueue(CreateThing());
Monitor.Pulse(this.queue);
}
Thread.Yield();
}
}
private void ConsumerProc()
{
while (true)
{
T thing;
lock (this.queue)
{
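// Wait only while the queue is empty; a Pulse sent before Wait is otherwise lost
while (this.queue.Count == 0)
{
Monitor.Wait(this.queue);
}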
thing = this.queue.Dequeue();
}
HandleThing(thing);
}
}
Seeing lock, you realize immediately, that the two threads do NOT access the data structure completely simultaneously. But then, they only keep the lock for the tiniest amount of time. And the Pulse/Wait thing makes the consumer thread immediately react to the producer thread. This should really be good enough.
