How to process (dynamically added) items at a given time?

I've got a (concurrent) priority queue with a timestamp (in the future) as the key and, as the value, a function that should be called (or an item that should be processed) when that time is reached. I don't want to attach a timer to each item, because there are a lot of them. I'd rather go with a single scheduler thread/task.
What would be a good strategy to do so?
With a thread running a scheduler... (pseudo-code follows)
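// scheduler
readonly object _threadLock = new object();
while (true)
{
    lock (_threadLock) // Monitor.Wait may only be called while holding the lock
    {
        if (queue.Empty)
        {
            // nothing scheduled: sleep until an element is enqueued
            Monitor.Wait(_threadLock);
        }
        else
        {
            var time = GetWaitingTimeForNextElement();
            if (time > 0)
            {
                // sleep until the next element is due or a new first element arrives
                Monitor.Wait(_threadLock, time);
            }
            else
            {
                // dequeue and process element
            }
        }
    }
}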
...and pulsing when adding elements (to an empty queue or adding a new first element)?
// element enqueued
lock (_threadLock) // Monitor.Pulse also requires the lock
{
    Monitor.Pulse(_threadLock);
}
Or with somehow chained (Task.ContinueWith(...)) Tasks using Task.Delay(int, CancellationToken)? This would need some logic to abort the waiting if a new first element is enqueued, or to create a new task if none is running. It feels like there is a simpler solution I'm not getting right now. :)
Or using a timer (very-pseudo-code, just to get the idea)...
var x = new System.Timers.Timer();
x.Elapsed += (sender, args) =>
{
    // dequeue and process item(s)
    x.Interval = GetWaitingTimeForNextElement(); // does this reset the timer anyway?
};
x.Start();
...and updating the interval when adding elements (like above).
// element enqueued
x.Interval = updatedTime;
I'm also concerned about the precision of the wait methods / timers: millisecond resolution is quite coarse (although it might work). Is there a better alternative?
Ergo...
That's again a bunch of questions/thoughts (sorry for that), but there are so many options and concerns that it's hard to get an overview. So to summarize: what is the best way to implement a (precise) time scheduling system for dynamically incoming items?
I appreciate all hints and answers! Thanks a lot.

I would suggest doing it like this:
Create a class called TimedItemsConcurrentPriorityQueue<TKey, TValue> that inherits from ConcurrentPriorityQueue<TKey, TValue>.
Implement an event called ItemReady in your TimedItemsConcurrentPriorityQueue<TKey, TValue> class that is fired whenever an item is ready to be processed according to its timestamp. You can use a single timer and update it as needed by shadowing the Enqueue, Insert, Remove and other methods (or by modifying the source of ConcurrentPriorityQueue<TKey, TValue> and making those methods virtual so you can override them).
Instantiate a single instance of TimedItemsConcurrentPriorityQueue<TKey, TValue>; let's call that variable itemsWaitingToBecomeReady.
Instantiate a single object of BlockingCollection<T>; let's call that variable itemsReady. Use the constructor that takes an IProducerConsumerCollection<T> and pass it a new instance of ConcurrentPriorityQueue<TKey, TValue> (it implements IProducerConsumerCollection<KeyValuePair<TKey,TValue>>).
Whenever the ItemReady event is fired in itemsWaitingToBecomeReady, dequeue that item and enqueue it to itemsReady.
Process the items in itemsReady using the BlockingCollection<T>.GetConsumingEnumerable method in a new task, like this:
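Task.Factory.StartNew(() =>
{
    // blocks until an item is available; ends once CompleteAdding has been called and the collection is empty
    foreach (var item in itemsReady.GetConsumingEnumerable())
    {
        ...
    }
});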
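The single-timer idea can be sketched in a more compact form. This is a minimal sketch, not the answer's exact design: it assumes .NET 6 or later for the built-in PriorityQueue<TElement, TPriority>, it guards the queue with a plain lock instead of a concurrent priority queue, and the class name TimedQueue and its members are hypothetical. One System.Threading.Timer is re-armed for the earliest due time whenever the head of the queue changes.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: one timer serves every pending item.
public sealed class TimedQueue : IDisposable
{
    private readonly object _gate = new object();
    private readonly PriorityQueue<Action, DateTime> _pending =
        new PriorityQueue<Action, DateTime>();
    private readonly Timer _timer;

    public TimedQueue()
    {
        // Created disarmed; Enqueue arms it for the earliest due time.
        _timer = new Timer(_ => OnTimer(), null, Timeout.Infinite, Timeout.Infinite);
    }

    public void Enqueue(DateTime dueUtc, Action work)
    {
        lock (_gate)
        {
            _pending.Enqueue(work, dueUtc);
            Rearm(); // the new item may have become the earliest one
        }
    }

    private void OnTimer()
    {
        var ready = new List<Action>();
        lock (_gate)
        {
            // Collect everything that is due; if the timer fired early, nothing is collected.
            while (_pending.TryPeek(out _, out var due) && due <= DateTime.UtcNow)
                ready.Add(_pending.Dequeue());
            Rearm();
        }
        foreach (var work in ready)
            work(); // run outside the lock so Enqueue is never blocked by user code
    }

    // Must be called while holding _gate.
    private void Rearm()
    {
        if (!_pending.TryPeek(out _, out var due))
        {
            _timer.Change(Timeout.Infinite, Timeout.Infinite); // nothing to wait for
            return;
        }
        var wait = due - DateTime.UtcNow;
        if (wait < TimeSpan.Zero) wait = TimeSpan.Zero;
        _timer.Change(wait, Timeout.InfiniteTimeSpan); // one-shot, re-armed after each tick
    }

    public void Dispose() => _timer.Dispose();
}
```

Items are executed on the timer's thread-pool thread here; handing them to a BlockingCollection<T> instead, as the answer suggests, would decouple scheduling from processing.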

Related

Delaying a method call

I have a high rate of events that can occur for a specific entity, and I need to transfer them over a network. The problem is that those events can generate a high level of traffic and calculation, which is not desired.
So my question is: what would be the best way to delay the execution of the calculation function for a specific amount of time? In my case the events don't carry any actual data that I need to buffer, nor does their order of occurrence matter, so basically I would just start a timer once an event occurs and fire the calculation with the entity as parameter once the delay expires.
I could build my own implementation with a timer, but it seems that there are existing ones that should support this, e.g. Reactive Extensions?
In any case, if somebody can point me to an existing implementation or framework, it would be greatly appreciated.
Edit
OK, I have looked at the Rx observable pattern and it looks like it can do the job. I can see a simple implementation that I could use, e.g.
IDisposable handlers;
Subject<int> subject = new Subject<int>();
handlers = subject.AsObservable().Sample(TimeSpan.FromSeconds(10))
    .Subscribe(sample =>
    {
        Trace.WriteLine(sample);
    });
Now whenever I want to process an event I would call
subject.OnNext(someValue);
The Sample operator should delay the calls to subscribers.
Can somebody comment on whether I am correct with this usage?

Here is an example of what you can do:
public class ExpiryDictionary
{
    System.Timers.Timer timer; // will handle the expiry
    ConcurrentDictionary<string, string> state; // will be used to save the last event
    public ExpiryDictionary(int millisec)
    {
        state = new ConcurrentDictionary<string, string>();
        timer = new System.Timers.Timer(millisec);
        timer.Elapsed += new ElapsedEventHandler(Elapsed_Event);
        timer.Start();
    }
    private void Elapsed_Event(object sender, ElapsedEventArgs e)
    {
        foreach (var key in state.Keys)
        {
            // remove entries one by one so that events added during this loop are not lost
            string value;
            if (state.TryRemove(key, out value))
            {
                // fire the calculation for each event in the dictionary
            }
        }
    }
    public void Add(string key, string value)
    {
        state[key] = value; // adds the key or overwrites its previous value
    }
}
You can create a collection that stores the events you receive, and once the timer ticks you fire the calculation for everything in the collection. Because we are using a dictionary, only the last event per key is kept, so we don't have to store every event we get.

I suggest you look into the Proxy design pattern. Your clients will know only about a proxy and trigger events on the Proxy object. The Proxy object will contain the logic that determines when to send the actual request over the wire. This logic may depend on your requirements. From what I understood, having a boolean switch isEventRaised and checking it at a configurable interval may satisfy your requirements (you reset the flag to false at the end of each interval).
Also, you may check Throttling implementations first and try to figure out whether they suit your requirements. For example, here is a StackOverflow question about different Throttling methods, which references, among others, the Token bucket algorithm.
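The flag-and-interval idea can be sketched roughly as follows. This is only an illustration under assumptions: the class name ThrottlingProxy, the send delegate and the RaiseEvent method are hypothetical names, and the flag is stored as an int so that it can be read and reset atomically with Interlocked.Exchange.

```csharp
using System;
using System.Threading;

// Hypothetical proxy: callers raise events cheaply, at most one send happens per interval.
public sealed class ThrottlingProxy : IDisposable
{
    private readonly Action _send;
    private readonly Timer _timer;
    private int _eventRaised; // 0 = no event since the last tick, 1 = at least one event

    public ThrottlingProxy(Action send, TimeSpan interval)
    {
        _send = send;
        // Periodic timer: first tick after one interval, then every interval.
        _timer = new Timer(_ => Flush(), null, interval, interval);
    }

    // Called by clients for every event; only sets the flag.
    public void RaiseEvent() => Interlocked.Exchange(ref _eventRaised, 1);

    private void Flush()
    {
        // Reset the flag atomically and send only if something happened in this interval.
        if (Interlocked.Exchange(ref _eventRaised, 0) == 1)
            _send();
    }

    public void Dispose() => _timer.Dispose();
}
```

For per-entity throttling, one such flag per entity key would be needed, for example in a ConcurrentDictionary.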

Using TPL to batch/de-parallelise separate invocations

Maybe the TPL isn't the right tool, but at least from one not particularly familiar with it, it seems like it ought to have what I'm looking for. I'm open to answers that don't use it though.
Given a method like this:
public Task Submit(IEnumerable<WorkItem> work)
This can execute an expensive async operation on a collection of items. Normally the caller batches up these items and submits as many as it can at once, and there's a fairly long delay between such batches, so it executes fairly efficiently.
However there are some occasions where no external batching happens and Submit gets called for a small number of items (typically only one) many times in quick succession, possibly even concurrently from separate threads.
What I'd like to do is to defer processing (while accumulating the arguments) until there has been a certain amount of time with no calls, and then execute the operation with the whole batch, in the originally specified order.
Or in other words, each time the method is called it should add its arguments to the list of pending items and then restart the delay from zero, such that a certain idle time is required before anything is processed.
I don't want a size limit on the batch (so I don't think BatchBlock is the right answer), I just want a delay/timeout. I'm certain that the calling pattern is such that there will be an idle period at some point.
I'm not sure whether it's better to defer even the first call, or if it should start the operation immediately and only defer subsequent calls if the operation is still in progress.
If it makes the problem easier, I'm ok with making Submit return void instead of a Task (ie. not being able to observe when it completes).
I'm sure I can muddle together something that works like this, but it seems like the sort of thing that ought to already exist somewhere. Can anyone point me in the right direction? (I'd prefer not to use non-core libraries, though.)

Ok, so for lack of finding anything suitable I ended up implementing something myself. Seems to do the trick. (I implemented it a bit more generically than shown here in my actual code, so I could reuse it more easily, but this illustrates the concept.)
private readonly ConcurrentQueue<WorkItem> _Items
    = new ConcurrentQueue<WorkItem>();
private CancellationTokenSource _CancelSource;
public async Task Submit(IEnumerable<WorkItem> items)
{
    var cancel = ReplacePreviousTasks();
    foreach (var item in items)
    {
        _Items.Enqueue(item);
    }
    await Task.Delay(TimeSpan.FromMilliseconds(250), cancel.Token);
    if (!cancel.IsCancellationRequested)
    {
        await RunOperation();
    }
}
private CancellationTokenSource ReplacePreviousTasks()
{
    var cancel = new CancellationTokenSource();
    var old = Interlocked.Exchange(ref _CancelSource, cancel);
    if (old != null)
    {
        old.Cancel();
    }
    return cancel;
}
private async Task RunOperation()
{
    var items = new List<WorkItem>();
    WorkItem item;
    while (_Items.TryDequeue(out item))
    {
        items.Add(item);
    }
    // do the operation on items
}
If multiple submissions occur within 250ms, the earlier ones are cancelled, and the operation executes once on all of the items after the 250ms is up (counting from the latest submit).
If another submit occurs while the operation is running, it will continue to run without cancelling (there's a tiny chance it will steal some of the items from the later call, but that's ok).
(Technically checking cancel.IsCancellationRequested isn't really necessary, since the await above will throw an exception if it was cancelled during the delay. But it doesn't hurt, and there is a tiny window it might catch.)
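A more generic variant of the same idea might look like the sketch below. It is not the author's actual reusable code, which is not shown; the class name DebouncedBatcher<T> and its constructor parameters are assumptions made for illustration. Unlike the snippet above, it catches the cancellation exception, so a superseded Submit call completes normally instead of ending in the cancelled state.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical generic debouncer: runs the operation once after an idle period.
public sealed class DebouncedBatcher<T>
{
    private readonly ConcurrentQueue<T> _items = new ConcurrentQueue<T>();
    private readonly Func<IReadOnlyList<T>, Task> _operation;
    private readonly TimeSpan _idle;
    private CancellationTokenSource _cancelSource;

    public DebouncedBatcher(TimeSpan idle, Func<IReadOnlyList<T>, Task> operation)
    {
        _idle = idle;
        _operation = operation;
    }

    public async Task Submit(IEnumerable<T> items)
    {
        // Become the latest submission and cancel the delay of the previous one.
        var cancel = new CancellationTokenSource();
        Interlocked.Exchange(ref _cancelSource, cancel)?.Cancel();

        foreach (var item in items)
            _items.Enqueue(item);

        try
        {
            await Task.Delay(_idle, cancel.Token);
        }
        catch (OperationCanceledException)
        {
            return; // a later Submit took over and will process these items
        }

        // Drain everything accumulated so far, in submission order.
        var batch = new List<T>();
        while (_items.TryDequeue(out var next))
            batch.Add(next);

        if (batch.Count > 0)
            await _operation(batch);
    }
}
```

As in the original, a caller whose submission was superseded cannot observe when its items are actually processed; only the last Submit in a burst awaits the operation.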

Multi-threading list pattern advice

I have made an application which also contains a folder/file scanner. I'm coming across a problem with the threading structure.
How it works:
For each folder/file it finds, it starts a thread. A function inside each thread uses a list to check whether a similar item has already been found, so that it can add to the existing item. If none is found, it adds the item to that list. The threads execute in parallel (async).
Problem:
Because it's async, the list check sometimes fails. This happens because there is a time gap between the check and the add: the check can report that there is no similar item when another thread is just about to add one. The result is the same item occurring in the list more than once.
I have also made a version in which the threads wait on each other. I really like the effect this has on the frontend (items are nicely added to the list in real time), but it takes way too long for a lot of folders/files.
Now I'm thinking of mixing the two approaches; I would really like to combine the speed of async threads with the safety of waiting on each thread.
Does anybody have an idea?

You should lock the entire code part that checks the list and adds a value.
Something like this:
private void YourThreadMethod(object state)
{
    // long-running operation
    lock (dictionary)
    {
        // check and add happen under the same lock, so no other thread can slip in between
        if (!dictionary.ContainsKey(yourItemKey))
        {
            // construct object, long-running operation
            dictionary.Add(yourItemKey, createdObject);
        }
    }
}
In this way, every thread will have to wait until the list is free to use. If you want a more advanced solution, you could read into the ReaderWriterLockSlim class which gives a more fine grained solution.

The sleekest approach is to use a ConcurrentDictionary<string, byte> when yourItemKey is of type string (otherwise adapt TKey and use a proper IEqualityComparer or implement IEquatable):
private readonly ConcurrentDictionary<string, byte> _list = new ConcurrentDictionary<string, byte>();
private void Foo(object state)
{
    // looong operation
    this._list.TryAdd(yourItemKey, 0);
}
public void Bar()
{
    // this is how to query the content
    this._list.Keys...;
}
The trick is to avoid using an overly complex object as the key, one that may need disposal or holds external references (I'd prefer any string representation), and to use a small type for the value, which just acts as a marker.

I would consider using one of the thread safe collections in C#. For your case something like a ConcurrentBag will be more efficient than using a lock.
In case there is a time delay between checking and adding, you can use ConcurrentDictionary. It has a TryAdd method which will return false if an item with the same key is already in the dictionary.
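As a rough illustration of how TryAdd closes the gap between checking and adding, here is a small sketch. The ScanDemo class, the CountByKey method and the idea of merging by counting occurrences are assumptions made for the example; the real scanner would merge whatever data belongs to a "similar item".

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ScanDemo
{
    // Hypothetical merge rule: count how often each key was found.
    public static ConcurrentDictionary<string, int> CountByKey(string[] keys)
    {
        var found = new ConcurrentDictionary<string, int>();
        Parallel.ForEach(keys, key =>
        {
            // TryAdd checks and inserts atomically; false means another
            // thread (or an earlier item) already added this key.
            if (!found.TryAdd(key, 1))
            {
                // Merge into the existing entry; the update delegate may be retried
                // under contention, so it must be free of side effects.
                found.AddOrUpdate(key, 1, (_, count) => count + 1);
            }
        });
        return found;
    }
}
```

Calling ScanDemo.CountByKey(new[] { "a", "b", "a" }) yields one entry per distinct key, regardless of how the work is spread across threads.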

Disposing of thread with infinite loop

I have an infinite loop that is used to consume items from a BlockingCollection.
public class MessageFileLogger
{
    private BlockingCollection<ILogItem> _messageQueue;
    private Thread _worker;
    private bool _enabled = false;
    public MessageFileLogger()
    {
        _worker = new Thread(LogMessage);
        _worker.IsBackground = true;
        _worker.Start();
    }
    private void LogMessage()
    {
        while (_enabled)
        {
            if (_messageQueue.Count > 0)
            {
                var itm = _messageQueue.Take();
                processItem(itm);
            }
            else
            {
                Thread.Sleep(1000);
            }
        }
    }
}
which is referenced by another object that gets instantiated every minute or couple of minutes (could be moved out to 1 hour increments or such).
public class Helper
{
    MessageFileLogger _logger;
    public Helper(string logFilePath, LogMode logMode)
    {
        _logger = new MessageFileLogger(logFilePath, logMode);
        _logger.Enabled = true;
    }
    public void foo()
    {
    }
}
Question #1)
What can I do to ensure that the thread is exited when the object that references it is no longer needed?
Note: Helper only needs to call foo, so once it no longer needs to call foo, the object can be garbage collected. So, incorporating a using statement with Helper is certainly a possibility.
Question #2)
Does _messageQueue need to be disposed? If so, how do I dispose of it without it affecting the LogMessage thread? (I tried disposing of it while the thread was running and no surprise got an error).
I tried implementing IDisposable (in MessageFileLogger):
public void Dispose()
{
    _enabled = false;
    _messageQueue.Dispose();
}
and I haven't had any issues with this, but I'm not confident that it is actually correct rather than just not having failed yet. Also, would this mean that Helper also needs to implement IDisposable and that a using statement needs to be used with Helper?
Note: This question is based on the same code I had with another question of mine.

First off, your consumer shouldn't be calling Thread.Sleep. It also most certainly shouldn't be checking the count of the collection. The whole point of BlockingCollection is that when you call Take, it either gives you an item, or it waits until there is an item to give you, and then gives it to you. So you can just keep calling Take in a loop with nothing else. This prevents you from waiting some fraction of a second when there is already an item you could be processing.
Better still, you can simply use GetConsumingEnumerable to get a sequence of items.
Your consumer can now look like this:
foreach (var item in _messageQueue.GetConsumingEnumerable())
    processItem(item);
Additionally, BlockingCollection has built in support for indicating that the queue is done. Simply have the producer call CompleteAdding to indicate that no more items will be added. After doing that, once the queue is empty, the Enumerable will end, and the foreach loop will finish. The consumer can do any clean up it needs to at that point in time.
In addition to the fact that using the BlockingCollection to determine when you're done is just generally more convenient, it's also correct, unlike your code. Since _enabled isn't volatile, even though you're reading and writing to it from different threads, you're not introducing the proper memory barriers, so the consumer is likely to be reading a stale value of that variable for some time. When you use mechanisms from the BCL specifically designed to handle these types of multithreaded situations you can be sure that they'll be handled properly on your behalf, without you needing to think about them.
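Putting these pieces together, a disposable consumer could be sketched like this. It is a simplified illustration rather than a drop-in replacement for MessageFileLogger: the class name QueueWorker, the string message type and the process delegate are assumptions, and error handling is omitted.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical worker that drains a BlockingCollection on its own thread.
public sealed class QueueWorker : IDisposable
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly Action<string> _process;
    private readonly Thread _worker;

    public QueueWorker(Action<string> process)
    {
        _process = process;
        _worker = new Thread(Consume) { IsBackground = true };
        _worker.Start();
    }

    public void Add(string message) => _queue.Add(message);

    private void Consume()
    {
        // Blocks while the queue is empty; the loop ends once CompleteAdding
        // has been called and every remaining item has been taken.
        foreach (var message in _queue.GetConsumingEnumerable())
            _process(message);
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // signal that no more items will arrive
        _worker.Join();          // wait until the remaining items are processed
        _queue.Dispose();        // safe now: the consumer thread has exited
    }
}
```

Because Dispose waits for the worker to finish before disposing the collection, the consumer never touches a disposed queue. An owner such as Helper would then implement IDisposable itself and forward the call.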

Thread Synchronization

I have a list of objects. For each object I want to run a totally separate thread (thread safety). That is, I will pick an object from my list in a while loop and run a thread, then for the next object run the next thread, and so on. All threads should be synchronized such that resources (values, connections (open/close)) shared by them should not change.

Starting a thread per object is not necessarily wise; you should probably have a small number of worker threads picking items off the list (or better, a Queue<T>), synchronizing access to that list/queue. An example of a thread-safe queue can be found in this thread.
Once you have a work item, there is no magic bullet for making the rest of the code you write (to process it) thread-safe. A sensible approach that keeps things simple is immutability - either true immutability (the items can't change), or simply don't change the object. You can of course implement locking around the work item, but this only helps if all your code uses the same locking strategy, which is hard to enforce.
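A small fixed pool of workers pulling from a shared queue could be sketched as below. The WorkerPool class and the ProcessAll method are hypothetical names, and the sketch assumes all items are known up front so that an empty queue means the work is finished.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class WorkerPool
{
    // Processes all items using a fixed number of threads instead of one thread per item.
    public static void ProcessAll<T>(IEnumerable<T> items, int workerCount, Action<T> process)
    {
        var queue = new ConcurrentQueue<T>(items); // thread-safe queue of work items
        var workers = new Thread[workerCount];

        for (int i = 0; i < workerCount; i++)
        {
            workers[i] = new Thread(() =>
            {
                // Each worker keeps taking items until the queue is empty.
                while (queue.TryDequeue(out var item))
                    process(item);
            });
            workers[i].Start();
        }

        foreach (var worker in workers)
            worker.Join(); // wait for all workers to finish
    }
}
```

The queue only makes handing out the items thread-safe; whatever process does with shared state still needs its own synchronization, for example Interlocked operations or a lock.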

"I will pick an object from my list in a while loop and run a thread, then for the next object run the next thread"
If I really wanted a thread per object, which I probably wouldn't, I would create a class like this:
class ObjectProcessingThread
{
    private readonly Thread processingThread;
    public object TargetObject { get; set; }
    public ObjectProcessingThread()
    {
        // Thread has no parameterless constructor; it needs the method to run
        processingThread = new Thread(ThreadEntryPoint);
    }
    public void Start()
    {
        // start the processing thread with ThreadEntryPoint as the work the thread will do
        processingThread.Start();
    }
    private void ThreadEntryPoint()
    {
        // do stuff with TargetObject
    }
}
Then in the while loop, new up an ObjectProcessingThread for each object, set its TargetObject property, and call Start.
"All threads should be synchronized such that resources (values, connections (open/close)) shared by them should not change."
If you don't want values to change, don't change them.
