Can you make Quartz.Net use a priority queue? - c#

Quartz.Net by default only uses the priority of a job if it is triggered at the same time as another job. If two jobs are triggered at different times, however, the earliest one will always be run first on the thread pool. I have a scenario, though, where I need the highest-priority job in the queue to always run on the next available thread. Is there an easy way to do this with Quartz.Net, or will I have to implement my own scheduler (or move to another technology)?
The specific scenario I have in mind:
Periodically jobs will be triggered at a high priority that may produce some output for another process. Minimizing wait times here is reasonably important. However, I also have times where I may trigger batches of up to several thousand jobs at once at a much lower priority. I'm worried that when these "batches" are triggered, that the much more important work will wait too long to run.
Is there an easy way to do this with Quartz.Net or a rival technology?

Have you seen Trigger Priorities? I don't think it will run it on a different thread, but it will push the job to start before another job that was triggered before with lower priority.
ITrigger trigger = TriggerBuilder.Create()
    .WithIdentity("TestTrigger")
    .StartNow()
    .WithPriority(10)
    .Build();

As you mention, this isn't directly supported by Quartz.Net. You could create your own scheduler (see below) or you can work around it if you're using a database store by updating the NEXT_FIRE_TIME column in the QRTZ_TRIGGERS table. If you update the NEXT_FIRE_TIMEs so that they are ordered by the priority, then the next time the scheduler checks for a schedule change, your jobs will be executed in priority order.
There is a
public void SignalSchedulingChange(DateTimeOffset? candidateNewNextFireTimeUtc)
method on the QuartzScheduler that you could call after updating the fire times but unfortunately it's not exposed by StdScheduler. If you're going down the "my own scheduler" route, you would call this directly. Otherwise, calling the ScheduleJob method on the scheduler (and scheduling a job) calls this as well, so you could leverage that and schedule a job after updating your next fire times.

The easiest way to implement this behavior is to subclass StdAdoDelegate (or one of the DB specific implementations) and override GetSelectNextTriggerToAcquireSql so that it sorts by PRIORITY before NEXT_FIRE_TIME instead.
The following class fetches the default SQL query and swaps the order by clauses with the help of a temporary null character placeholder:
public class CustomAdoDelegate : StdAdoDelegate
{
    protected override string GetSelectNextTriggerToAcquireSql(int maxCount)
    {
        // Swap the two ORDER BY terms, using "\0" as a temporary placeholder.
        return base.GetSelectNextTriggerToAcquireSql(maxCount)
            .Replace(ColumnNextFireTime + " ASC", "\0")
            .Replace(ColumnPriority + " DESC", ColumnNextFireTime + " ASC")
            .Replace("\0", ColumnPriority + " DESC");
    }
}
Keep in mind that this could result in long delays for lower priority jobs if the scheduler is overwhelmed with higher priority ones.
I haven't tried it myself but it might be worth considering some kind of hybrid approach that balances the PRIORITY and NEXT_FIRE_TIME values somehow. For example, you could subtract the priority in seconds from the next fire time to create a kind of variable sized priority window.
order by dateadd(ss, -PRIORITY, NEXT_FIRE_TIME)
So a priority=10 trigger would only beat a priority=5 trigger if it was triggered no more than 5 seconds later. Just a thought.
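To make the window idea concrete, here is a minimal in-memory sketch of the same ordering using LINQ. The `PendingTrigger` class and `PriorityWindow.Order` helper are hypothetical names made up for illustration, not Quartz.NET types; in a real setup the ordering would live in the SQL returned by the ADO delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of QRTZ_TRIGGERS; not a Quartz.NET type.
public sealed class PendingTrigger
{
    public string Name { get; set; }
    public int Priority { get; set; }
    public DateTimeOffset NextFireTime { get; set; }
}

public static class PriorityWindow
{
    // Mirrors "order by dateadd(ss, -PRIORITY, NEXT_FIRE_TIME)":
    // every priority point moves a trigger one second earlier in the ordering.
    public static List<PendingTrigger> Order(IEnumerable<PendingTrigger> triggers)
    {
        return triggers
            .OrderBy(t => t.NextFireTime.AddSeconds(-t.Priority))
            .ToList();
    }
}
```

With this ordering, a priority 10 trigger that fires 3 seconds after a priority 5 trigger is picked first, while one that fires 8 seconds later is not.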

Related

What is the best approach to schedule events?

I have an application in which the user is able to create different notifications, like sticky notes and set their starting times. When he presses the start button a timer starts and these reminders should pop up at the time they were set for. I've searched for other answers, like this one, but the problem here is the notifications' times are different.
So what is the best way to schedule the events that activate the notifications?
I can think of two possible ways with their Pros and Cons:
Run a DispatcherTimer, that ticks every second and checks whether the time for a notification has come and pop it up. Pros: single DispatcherTimer instance. Cons: ticking every second and checking all notifications is an overhead.
Create a DispatcherTimer for each notification and let them handle time themselves. Pros: every timer ticks just once to pop the notification. Cons: too many timers is an overhead and may be hard to control.
Am I on the right track? Which of the two approaches is better, resource wise? Is there a third better way I am overlooking?
EDIT: If it makes any difference, the notifications should also auto close after some user-defined time and repeat at regular user-defined intervals.
I've used many methods to schedule events in C# applications (threads, timers, the Quartz API...), and I think the Quartz.NET API is the best tool you'll find (for me at least). It's easy and simple to use.
Example of your job class:
public class HelloJob : IJob
{
    public void Execute(IJobExecutionContext context)
    {
        Console.WriteLine("Greetings from HelloJob!");
    }
}
Example of scheduling it (adapted to the Quartz.NET 2.x builder API, to match the IJobExecutionContext signature above):
// Instantiate the Quartz.NET scheduler
var schedulerFactory = new StdSchedulerFactory();
var scheduler = schedulerFactory.GetScheduler();
// Describe the job by passing in the type of your class.
// Your class needs to implement the IJob interface.
var job = JobBuilder.Create<HelloJob>()
    .WithIdentity("job1", "group1")
    .Build();
// Create a trigger using the basic cron syntax.
// Example: run at 1 AM every Monday - Friday.
var trigger = TriggerBuilder.Create()
    .WithIdentity("trigger1", "group1")
    .WithCronSchedule("0 0 1 ? * MON-FRI")
    .Build();
// Register the job together with its trigger, then start the scheduler
scheduler.ScheduleJob(job, trigger);
scheduler.Start();
You'll find a helpful code example in the Quick Start guide.
Regards.
If the notification system is going to be used inside a single process, continue with a single dispatcher timer. Make sure the dispatcher timer is set to the nearest notification, and each time a new notification is created or the timer fires, change the interval to point at the next nearest notification.
That way you avoid polling on every tick.
E.g.: when somebody creates the first notification, point the timer at that time. If someone creates another notification before the first one fires and it is due earlier, change the timer to the second. If the second time is after the first, change the timer only after dispatching the first notification's callback. If it's threaded, you may need to work hard to get thread safety.
If notifications are needed across processes, use the Windows Task Scheduler, which already knows how to run timers and call your code on time. You may need some sort of IPC (WCF net.pipe, MSMQ, etc.) to deliver the notification.

Large Number of Timers

I need to write a component that receives an event (the event has a unique ID). Each event requires me to send out a request. The event specifies a timeout period during which to wait for a response to the request.
If the response comes before the timer fires, great, I cancel the timer.
If the timer fires first, then the request timed out, and I want to move on.
This timeout period is specified in the event, so it's not constant.
The expected timeout period is in the range of 30 seconds to 5 minutes.
I can see two ways of implementing this.
Create a timer for each event and put it into a dictionary linking the event to the timer.
Create an ordered list containing the DateTime of the timeout, and a new thread looping every 100ms to check if something timed out.
Option 1 would seem like the easiest solution, but I'm afraid that creating so many timers might not be a good idea because timers might be too expensive. Are there any pitfalls when creating a large number of timers? I suspect that in the background, the timer implementation might actually be an efficient implementation of Option 2. If this option is a good idea, which timer should I use: System.Timers.Timer or System.Threading.Timer?
Option 2 seems like more work, and may not be an efficient solution compared to Option 1.
Update
The maximum number of timers I expect is in the range of 10000, but more likely in the range of 100. Also, the normal case would be the timer being canceled before firing.
Update 2
I ran a test using 10K instances of System.Threading.Timer and System.Timers.Timer, keeping an eye on thread count and memory. System.Threading.Timer seems to be "lighter" compared to System.Timers.Timer judging by memory usage, and there was no creation of excessive number of threads for both timers (ie - thread pooling working properly). So I decided to go ahead and use System.Threading.Timer.
I do this a lot in embedded systems (pure c), where I can't burn a lot of resources (e.g. 4k of RAM is the system memory). This is one approach that has been used (successfully):
Create a single system timer (interrupt) that goes off on a periodic basis (e.g. every 10 ms).
A "timer" is an entry in a dynamic list that indicates how many "ticks" are left till the timer goes off.
Each time the system timer goes off, iterate the list and decrement each of the "timers". Each one that is zero is "fired". Remove it from the list and do whatever the timer was supposed to do.
What happens when the timer goes off depends on the application. It may be that a state machine gets run. It may be that a function gets called. It may be an enumeration telling the execution code what to do with the parameter sent in the "Create Timer" call. The information in the timer structure is whatever is necessary in the context of the design. The "tick count" is the secret sauce.
We also have created this returning an "ID" for the timer (usually the address of the timer structure, which is drawn from a pool) so it can be cancelled or status on it can be obtained.
Convenience functions convert "seconds" to "ticks" so the API of creating the timers is always in terms of "seconds" or "milliseconds".
You set the "tick" interval to a reasonable value for granularity tradeoff.
I have done other implementations of this in C++, C#, objective-C, with little change in the general approach. It is a very general timer subsystem design/architecture. You just need something to create the fundamental "tick".
I even did it once with a tight "main" loop and a stopwatch from the high-precision internal timer to create my own "simulated" tick when I did not have a timer. I do not recommend this approach; I was simulating hardware in a straight console app and did not have access to the system timers, so it was a bit of an extreme case.
Iterating over a list of hundreds of timers 10 times a second is not that big a deal on a modern processor. There are ways you can overcome this as well by inserting the items with "delta seconds" and putting them into the list in sorted order. This way you only have to check the ones at the front of the list. This gets you past scaling issues, at least in terms of iterating the list.
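As a rough C# illustration of the tick-count design described above (not the original embedded C code), here is a minimal sketch. The `TickTimers` class and its members are names invented for this example, and it is not thread-safe: it assumes `Tick`, `Create` and `Cancel` are all called from the same thread or under an external lock.

```csharp
using System;
using System.Collections.Generic;

// Minimal tick-driven timer list; all names here are illustrative.
public sealed class TickTimers
{
    private sealed class Entry
    {
        public int Id;
        public long TicksLeft;
        public Action Callback;
    }

    private readonly List<Entry> entries = new List<Entry>();
    private int nextId = 1;

    // Registers a timer that fires after the given number of ticks (at least one).
    // Returns an ID that can be passed to Cancel.
    public int Create(long ticks, Action callback)
    {
        var entry = new Entry { Id = nextId++, TicksLeft = Math.Max(1, ticks), Callback = callback };
        entries.Add(entry);
        return entry.Id;
    }

    // Removes a pending timer; returns false if it already fired or never existed.
    public bool Cancel(int id)
    {
        return entries.RemoveAll(e => e.Id == id) > 0;
    }

    // Call once per system tick: decrement every timer and fire those that reach zero.
    public void Tick()
    {
        var due = new List<Entry>();
        foreach (var e in entries)
        {
            if (--e.TicksLeft <= 0) due.Add(e);
        }
        entries.RemoveAll(e => e.TicksLeft <= 0);
        // Callbacks run after removal so they may safely create new timers.
        foreach (var e in due) e.Callback();
    }
}
```

The "fundamental tick" could come from any periodic source, for example a single System.Threading.Timer calling `Tick()` every 100 ms.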
You should do it the simplest way possible. If you are concerned about performance, you should run your application through a profiler and determine the bottlenecks. You might be very surprised to find out it was some code which you least expected, and you had optimized your code for no reason. I always write the simplest code possible as this is the easiest. See PrematureOptimization
I don't see why there would be any pitfalls with a large number of timers. Are we talking about a dozen, or 100, or 10,000? If it's very high you could have issues. You could write a quick test to verify this.
As for which of those Timer classes to use: I don't want to steal the answer of someone who probably did much more research, so check out this answer to that question.
The first option simply isn't going to scale; you are going to need to do something else if you have a lot of concurrent timeouts. (If you don't know whether you have enough of them to be a problem, feel free to try using timers and see if you actually have a problem.)
That said, your second option would need a bit of tweaking. Rather than having a tight loop in a new thread, just create a single timer and set its interval (each time it fires) to be the timespan between the current time and the "next" timeout time.
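A minimal sketch of that tweak follows: one System.Threading.Timer that is re-armed for whichever timeout is due next. The `TimeoutQueue` class and its members are hypothetical names for this example. It uses a Stopwatch as a monotonic clock and removes cancelled entries lazily, which is one possible design among several.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// Sketch: a single timer serving many timeouts. Names are illustrative.
public sealed class TimeoutQueue : IDisposable
{
    private readonly object gate = new object();
    private readonly Stopwatch clock = Stopwatch.StartNew();
    // Ordered by due time, then by ID so that equal due times stay distinct.
    private readonly SortedSet<Tuple<TimeSpan, long>> order = new SortedSet<Tuple<TimeSpan, long>>();
    private readonly Dictionary<long, Action> callbacks = new Dictionary<long, Action>();
    private readonly Timer timer;
    private long nextId = 1;
    private bool disposed;

    public TimeoutQueue()
    {
        timer = new Timer(_ => OnTimer(), null, Timeout.Infinite, Timeout.Infinite);
    }

    // Registers a timeout and returns an ID that can be passed to Cancel.
    public long Add(TimeSpan timeout, Action onTimeout)
    {
        lock (gate)
        {
            if (disposed) throw new ObjectDisposedException("TimeoutQueue");
            long id = nextId++;
            callbacks[id] = onTimeout;
            order.Add(Tuple.Create(clock.Elapsed + timeout, id));
            Rearm();
            return id;
        }
    }

    // Call when the response arrived in time; the stale queue entry is skipped later.
    public bool Cancel(long id)
    {
        lock (gate) { return callbacks.Remove(id); }
    }

    private void OnTimer()
    {
        var ready = new List<Action>();
        lock (gate)
        {
            if (disposed) return;
            var now = clock.Elapsed;
            while (order.Count > 0 && order.Min.Item1 <= now)
            {
                var head = order.Min;
                order.Remove(head);
                Action callback;
                if (callbacks.TryGetValue(head.Item2, out callback))
                {
                    callbacks.Remove(head.Item2);
                    ready.Add(callback);
                }
            }
            Rearm();
        }
        // Run callbacks outside the lock so they can call Add or Cancel.
        foreach (var callback in ready) callback();
    }

    // Points the timer at the earliest pending timeout. Must be called under the lock.
    private void Rearm()
    {
        if (order.Count == 0)
        {
            timer.Change(Timeout.Infinite, Timeout.Infinite);
            return;
        }
        var delay = order.Min.Item1 - clock.Elapsed;
        if (delay < TimeSpan.Zero) delay = TimeSpan.Zero;
        timer.Change(delay, Timeout.InfiniteTimeSpan);
    }

    public void Dispose()
    {
        lock (gate)
        {
            disposed = true;
            timer.Dispose();
        }
    }
}
```

Since the normal case in the question is cancellation before firing, `Cancel` is kept cheap here (a dictionary removal) and the sorted set is only cleaned up when the timer eventually fires.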
Let me propose a different architecture: for each event, just create a new Task, send the request, and wait [1] for the response there.
The ~1000 tasks should scale just fine, as shown in this early demo. I suspect ~10000 tasks would still scale, but I haven't tested that myself.
[1] Consider implementing the wait by attaching a continuation on Task.Delay (instead of just Thread.Sleep), to avoid under-subscription.
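One way to express "wait for the response or time out" without blocking a thread is to race the response task against Task.Delay. The sketch below assumes the pending response is represented as a Task (for example from a TaskCompletionSource that is completed when the reply arrives); `RequestTimeout` and `CompletesWithinAsync` are names invented for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RequestTimeout
{
    // Returns true if the response task finished within the timeout, false otherwise.
    public static async Task<bool> CompletesWithinAsync(Task response, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource())
        {
            var delay = Task.Delay(timeout, cts.Token);
            var winner = await Task.WhenAny(response, delay).ConfigureAwait(false);
            if (winner == delay)
            {
                return false; // timed out; the caller can move on
            }
            cts.Cancel();                          // release the timer behind Task.Delay early
            await response.ConfigureAwait(false);  // surface any exception from the response
            return true;
        }
    }
}
```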
I think Task.Delay is a really good option. Here is the test code for measuring how many concurrent tasks can be executed in different delay times. This code is also calculating error statistics for waiting time accuracy.
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    // Awaits Task.Delay and records how far the actual wait was from the requested one.
    static async Task Wait(int delay, double[] errors, int index)
    {
        var sw = new Stopwatch();
        sw.Start();
        await Task.Delay(delay);
        sw.Stop();
        errors[index] = Math.Abs(sw.ElapsedMilliseconds - delay);
    }

    static void Main(string[] args)
    {
        var trial = 100000;
        var minDelay = 1000;
        var maxDelay = 5000;
        var errors = new double[trial];
        var tasks = new Task[trial];
        var rand = new Random();
        var sw = new Stopwatch();
        sw.Start();
        // Start all delays without awaiting them, so they run concurrently.
        for (int i = 0; i < trial; i++)
        {
            var delay = rand.Next(minDelay, maxDelay);
            tasks[i] = Wait(delay, errors, i);
        }
        sw.Stop();
        Console.WriteLine($"{trial} tasks started in {sw.ElapsedMilliseconds} milliseconds.");
        Task.WaitAll(tasks);
        Console.WriteLine($"Avg Error: {errors.Average()}");
        Console.WriteLine($"Min Error: {errors.Min()}");
        Console.WriteLine($"Max Error: {errors.Max()}");
        Console.ReadLine();
    }
}
You may change the parameters to see different results. Here is one sample run (errors in milliseconds):
100000 tasks started in 9353 milliseconds.
Avg Error: 9.10898
Min Error: 0
Max Error: 110

Reactive Extensions Test Scheduler Simulating Time elapse

I am working with the Rx scheduler classes, using the .Schedule(DateTimeOffset, Action<Action<DateTimeOffset>>) overload. Basically, I have a scheduled action that can schedule itself again.
Code:
private readonly IScheduler sch;

public SomeObject(IScheduler sch, Action variableAmountofTime)
{
    this.sch = sch;
    sch.Schedule(GetNextTime(), (Action<DateTimeOffset> runAgain) =>
    {
        // Something that takes an unknown, variable amount of time.
        variableAmountofTime();
        runAgain(GetNextTime());
    });
}

public DateTimeOffset GetNextTime()
{
    // Return some time offset based on the scheduler's current time,
    // which is irregular based on other inputs that I have left out.
    return this.sch.Now.AddMinutes(1);
}
My Question is concerning simulating the amount of time variableAmountofTime might take and testing that my code behaves as expected and only triggers calling it as expected.
I have tried advancing the test scheduler's time inside the delegate, but that does not work. Here is an example of code I wrote that doesn't work. Assume GetNextTime() is just scheduling one minute out.
[Test]
public void TestCallsAppropriateNumberOfTimes()
{
    var sch = new TestScheduler();
    var timesCalled = 0;
    Action variableAmountOfTime = () =>
    {
        // Attempt to simulate 3 minutes of work (this is the part that does not behave as hoped).
        sch.AdvanceBy(TimeSpan.FromMinutes(3).Ticks);
        timesCalled++;
    };
    var someObject = new SomeObject(sch, variableAmountOfTime);
    sch.AdvanceTo(TimeSpan.FromMinutes(3).Ticks);
    Assert.That(timesCalled, Is.EqualTo(1));
}
Since I want to go 3 minutes into the future but the execution takes 3 minutes, I want to see this trigger only once. Instead it triggers 3 times.
How can I simulate something taking time during execution using the test scheduler?
Good question. Unfortunately, this is currently not supported in Rx v1.x and Rx v2.0 Beta (but read on). Let me explain the complication of nested Advance* calls to you.
Basically, Advance* implies starting the scheduler to run work till the point specified. This involves running the work in order on a single logical thread that represents the flow of time in the virtual scheduler. Allowing nested Advance* calls raises a few questions.
First of all, should a nested Advance* call cause a nested worker loop to be run? If that were the case, we're no longer mimicking a single logical thread of execution as the current work item would be interrupted in favor of running the inner loop. In fact, Advance* would lead to an implicit yield where the rest of the work (that was due now) after the Advance* call would not be allowed to run until all nested work has been processed. This leads to the situation where future work cannot depend on (or wait for) past work to finish its execution. One way out is to introduce real physical concurrency, which defeats various design points of the virtual time and historical schedulers to begin with.
Alternatively, should a nested Advance* call somehow communicate to the top-most worker loop dispatching call (Advance* or Start) that it may need to extend its due time because a nested invocation has asked to advance to a point beyond the original due time? Now all sorts of things are getting weird though. The clock doesn't reflect the changes after returning from Advance* and the top-most call no longer finishes at a predictable time.
For Rx v2.0 RC (coming next month), we took a look at this scenario and decided Advance* is not the right thing to emulate "time slippage" because it'd need an overloaded meaning depending on the context where it's invoked from. Instead, we're introducing a Sleep method that can be used to slip time forward from any context, without the side-effect of running work. Think of it as a way to set the Clock property but with safeguarding against going back in time. The name also reflects the intent clearly.
In addition to the above, to reduce the surprise factor of nested Advance* calls having no effect, we made it detect this situation and throw an InvalidOperationException in a nested context. Sleep, on the other hand, can be called from anywhere.
One final note. It turns out we needed exactly the same feature for work we're doing in Rx v2.0 RC with regards to our treatment of time. Several tests required a deterministic way to emulate slippage of time due to the execution of user code that can take arbitrarily long (think of the OnNext handler to e.g. Observable.Interval).
Hope this helps... Stay tuned for our Rx v2.0 RC release in the next few weeks!
-Bart (Rx team)

Job left incomplete when the timer event fires again in a Windows service (C#/.NET)

I have written a Windows service. Its job is to read files from one folder, send their content to a database, and move the files it has read to another folder.
The service has a timer event whose interval is set to 10000 ms (ten seconds).
If it processes between 100 and 1000 files, it finishes within 10 seconds and produces good output. If it processes 6000 to 9000 files, the service does not produce the correct output because it cannot finish the job within the ten-second interval. The job gets interrupted in the middle when the timer elapses again, whereas it should be allowed to complete.
Kindly give some suggestions; it would be appreciated.
Different approaches that can work:
Have the method called by the timer safe for re-entrance and then not worry about it. This tends to be either very easy to do (the task done is inherently re-entrant) or pretty tricky (if it's not naturally re-entrant you have to consider the effects of multiple threads upon every single thing hit during the task).
Have a lock in the operation, so different threads from different timer events just wait on each other. Note that you must know that there will not be several tasks in a row that take longer than the time interval, as otherwise you will have an ever-growing queue of threads waiting for their chance to run, and the amount of resources consumed just by waiting to do something will grow with it.
Have the timer not set to have a recurring interval, but rather re-set it at the end of each task, so the next task will happen X seconds after the current one finishes.
Have a lock and obtain it only if you don't have to block. If you would have to block, then a current task is still running, and we just let this time slot go by to stay out of its way.
After all, there'll be another timer event along in 10 seconds:
private static readonly object LockObject = new object();

private static void TimerHandler(object state)
{
    if (!Monitor.TryEnter(LockObject))
        return; // last timer still running
    try
    {
        // do useful stuff here.
    }
    finally
    {
        Monitor.Exit(LockObject);
    }
}
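The third approach in the list above (a one-shot timer that is re-set at the end of each run) could look roughly like the sketch below. `NonOverlappingTimer` is a hypothetical helper name; it assumes System.Threading.Timer and that the work delegate handles its own exceptions.

```csharp
using System;
using System.Threading;

// Sketch: runs the work, then waits "pause" before the next run, so runs never overlap.
public sealed class NonOverlappingTimer : IDisposable
{
    private readonly Timer timer;
    private readonly TimeSpan pause;
    private readonly Action work;
    private int disposed;

    public NonOverlappingTimer(TimeSpan pause, Action work)
    {
        this.pause = pause;
        this.work = work;
        // Create the timer disarmed, then arm it once (an infinite period means one-shot).
        timer = new Timer(_ => Run(), null, Timeout.Infinite, Timeout.Infinite);
        timer.Change(pause, Timeout.InfiniteTimeSpan);
    }

    private void Run()
    {
        try
        {
            work();
        }
        finally
        {
            // Re-arm only after the work has finished.
            if (Volatile.Read(ref disposed) == 0)
            {
                try { timer.Change(pause, Timeout.InfiniteTimeSpan); }
                catch (ObjectDisposedException) { /* stopped while the work was running */ }
            }
        }
    }

    public void Dispose()
    {
        if (Interlocked.Exchange(ref disposed, 1) == 0) timer.Dispose();
    }
}
```

With a ten-second pause, a run that takes 40 seconds simply delays the next run; nothing piles up behind it.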
Use a static boolean variable named something like IsProcessing.
When you start working on the file you set it to true.
When the timer is fired next check if the file is still in processing.
If it's still processing, do nothing.
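One caveat with a plain boolean is that checking it and then setting it are two separate steps, so two timer callbacks on different threads could both see "not processing". A minimal sketch of the same idea with an atomic check-and-set is shown below; `FileJob` and `TryRun` are names invented for this example.

```csharp
using System;
using System.Threading;

public static class FileJob
{
    private static int isProcessing; // 0 = idle, 1 = busy

    // Runs the work unless a previous run is still in progress.
    // Returns false when this call was skipped.
    public static bool TryRun(Action work)
    {
        // Atomically flip 0 -> 1; any other caller sees 1 and backs off.
        if (Interlocked.CompareExchange(ref isProcessing, 1, 0) != 0)
        {
            return false;
        }
        try
        {
            work();
            return true;
        }
        finally
        {
            Volatile.Write(ref isProcessing, 0);
        }
    }
}
```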

Implementing multithreading in C# (code review)

Greetings.
I'm trying to implement some multithreaded code in an application. The purpose of this code is to validate items that the database gives it. Validation can take quite a while (a few hundred ms to a few seconds), so this process needs to be forked off into its own thread for each item.
The database may give it 20 or 30 items a second in the beginning, but that begins to decline rapidly, eventually reaching about 65K items over 24 hours, at which point the application exits.
I'd like it if anyone more knowledgeable could take a peek at my code and see if there's any obvious problems. No one I work with knows multithreading, so I'm really just on my own, on this one.
Here's the code. It's kinda long but should be pretty clear. Let me know if you have any feedback or advice. Thanks!
public class ItemValidationService
{
    /// <summary>
    /// The object to lock on in this class, for multithreading purposes.
    /// </summary>
    private static object locker = new object();

    /// <summary>Items that have been validated.</summary>
    private HashSet<int> validatedItems = new HashSet<int>();

    /// <summary>Items that are currently being validated.</summary>
    private HashSet<int> validatingItems = new HashSet<int>();

    /// <summary>Remove an item from the index if its links are bad.</summary>
    /// <param name="id">The ID of the item.</param>
    public void ValidateItem(int id)
    {
        lock (locker)
        {
            if (!this.validatedItems.Contains(id) &&
                !this.validatingItems.Contains(id))
            {
                ThreadPool.QueueUserWorkItem(sender =>
                {
                    this.Validate(id);
                });
            }
        }
    } // method

    private void Validate(int itemId)
    {
        lock (locker)
        {
            this.validatingItems.Add(itemId);
        }

        // *********************************************
        // Time-consuming routine to validate an item...
        // *********************************************

        lock (locker)
        {
            this.validatingItems.Remove(itemId);
            this.validatedItems.Add(itemId);
        }
    } // method
} // class
The thread pool is a convenient choice if you have light weight sporadic processing that isn't time sensitive. However, I recall reading on MSDN that it's not appropriate for large scale processing of this nature.
I used it for something quite similar to this and regret it. I took a worker-thread approach in subsequent apps and am much happier with the level of control I have.
My favorite pattern in the worker-thread model is to create a master thread which holds a queue of task items. Then fork a bunch of workers that pop items off that queue to process. I use a blocking queue so that when there are no items to process, the workers just block until something is pushed onto the queue. In this model, the master thread produces work items from some source (db, etc.) and the worker threads consume them.
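On .NET 4 and later, the framework's BlockingCollection<T> can play the role of the blocking queue. The following is a minimal sketch of the master/worker pattern under that assumption; `WorkerPool.Run` and the capacity of 1000 are choices made up for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class WorkerPool
{
    // Feeds all items from the source to a fixed number of dedicated worker threads.
    public static void Run<T>(IEnumerable<T> source, int workerCount, Action<T> process)
    {
        // The bounded capacity makes the producer wait when the workers fall behind.
        using (var queue = new BlockingCollection<T>(boundedCapacity: 1000))
        {
            var workers = new Thread[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = new Thread(() =>
                {
                    // Blocks while the queue is empty; ends after CompleteAdding.
                    foreach (var item in queue.GetConsumingEnumerable())
                    {
                        process(item);
                    }
                });
                workers[i].IsBackground = true;
                workers[i].Start();
            }

            // The calling thread acts as the master/producer.
            foreach (var item in source)
            {
                queue.Add(item);
            }
            queue.CompleteAdding();

            foreach (var worker in workers)
            {
                worker.Join();
            }
        }
    }
}
```

Because `Run` returns only after every worker has drained the queue, it also gives a simple answer to "are all items done?".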
I second the idea of using a blocking queue and worker threads. Here is a blocking queue implementation that I've used in the past with good results:
https://www.codeproject.com/Articles/8018/Bounded-Blocking-Queue-One-Lock
What's involved in your validation logic? If it's mainly CPU bound then I would create no more than 1 worker thread per processor/core on the box. This will tell you the number of processors:
Environment.ProcessorCount
If your validation involves I/O such as File Access or database access then you could use a few more threads than the number of processors.
Be careful, QueueUserWorkItem might fail
There is a possible logic error in the code posted with the question, depending on where the item id in ValidateItem(int id) comes from. Why? Because although you correctly lock your validatingItems and validatedItems sets before queuing a work item, you do not add the item to the validatingItems set until the new thread spins up. That means there could be a time gap where another thread calls ValidateItem(id) with the same id (unless this is running on a single main thread).
I would add the item to the validatingItems set just before queuing the work item, inside the lock.
Edit: also, QueueUserWorkItem() returns a bool, so you should use the return value to make sure the item was queued and THEN add it to the validatingItems set.
ThreadPool may not be optimal for jamming so much at once into it. You may want to research the upper limits of its capabilities and/or roll your own.
Also, there is a race condition that exists in your code, if you expect no duplicate validations. The call to
this.validatingItems.Add(itemId);
needs to happen in the main thread (ValidateItem), not in the thread pool thread (Validate method). This call should occur a line before the queueing of the work item to the pool.
A worse bug is found by not checking the return of QueueUserWorkItem. Queueing can fail, and why it doesn't throw an exception is a mystery to us all. If it returns false, you need to remove the item that was added to the validatingItems list, and handle the error (probably by throwing an exception).
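Putting the two points above together, a possible shape for the fix is sketched below. It is a simplified stand-in for the class in the question rather than a drop-in replacement: the class name and the `validate` delegate (which represents the time-consuming routine) are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class ItemValidationServiceSketch
{
    private readonly object locker = new object();
    private readonly HashSet<int> validatedItems = new HashSet<int>();
    private readonly HashSet<int> validatingItems = new HashSet<int>();
    private readonly Action<int> validate; // stands in for the time-consuming routine

    public ItemValidationServiceSketch(Action<int> validate)
    {
        this.validate = validate;
    }

    // Returns true if the item was queued, false if it is already known.
    public bool ValidateItem(int id)
    {
        lock (locker)
        {
            // Claim the id before queuing, so a second caller cannot slip in.
            if (validatedItems.Contains(id) || !validatingItems.Add(id))
            {
                return false;
            }
        }

        bool queued = false;
        try
        {
            queued = ThreadPool.QueueUserWorkItem(_ => Validate(id));
            if (!queued)
            {
                throw new InvalidOperationException("Could not queue item " + id);
            }
        }
        finally
        {
            if (!queued)
            {
                // Roll back the claim so the item can be retried later.
                lock (locker) { validatingItems.Remove(id); }
            }
        }
        return true;
    }

    private void Validate(int id)
    {
        validate(id);
        lock (locker)
        {
            validatingItems.Remove(id);
            validatedItems.Add(id);
        }
    }
}
```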
I would be concerned about performance here. You indicated that the database may give it 20-30 items per second and an item could take up to a few seconds to be validated. That could be quite a large number of threads -- using your metrics, worst case 60-90 threads! I think you need to reconsider the design here. Michael mentioned a nice pattern. The use of the queue really helps keep things under control and organized. A semaphore could also be employed to control number of threads created -- i.e. you could have a maximum number of threads allowed, but under smaller loads, you wouldn't necessarily have to create the maximum number if fewer ended up getting the job done -- i.e. your own pool size could be dynamic with a cap.
When using the thread pool, I also find it more difficult to monitor the execution of threads from the pool as they perform the work. So, unless it's fire and forget, I am in favor of more controlled execution. I know you mentioned that your app exits after the 65K items are all completed. How are you monitoring your threads to determine whether they have completed their work, i.e. that all queued workers are done? Are you monitoring the status of all items in the HashSets? I think by queuing your items up and having your own worker threads consume off that queue, you can gain more control. Admittedly, this can come at the cost of more overhead in terms of signaling between threads to indicate when all items have been queued, allowing them to exit.
You could also try using the CCR - Concurrency and Coordination Runtime. It's buried inside Microsoft Robotics Studio, but provides an excellent API for doing this sort of thing.
You'd just need to create a "Port" (essentially a queue), hook up a receiver (method that gets called when something is posted to it), and then post work items to it. The CCR handles the queue and the worker thread to run it on.
Here's a video on Channel9 about the CCR.
It's very high-performance and is even being used for non-robotics stuff (Myspace.com uses it behind the scenes for their content-delivery network).
I would recommend looking into MSDN: Task Parallel Library - DataFlow. You can find examples of implementing Producer-Consumer there; in your case, the database would be the producer of items to validate and the validation routine would be the consumer.
Also recommend using ConcurrentDictionary<TKey, TValue> as a "Concurrent" hash set where you just populate the keys with no values :). You can potentially make your code lock-free.
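As a small sketch of that last suggestion: ConcurrentDictionary.TryAdd is atomic, so it can decide which caller gets to validate a given id without an explicit lock. The `SeenItems` wrapper and the dummy byte value are choices made for this example.

```csharp
using System.Collections.Concurrent;

// Sketch: a ConcurrentDictionary used as a concurrent set of item ids.
public sealed class SeenItems
{
    // The value is unused; byte keeps it small.
    private readonly ConcurrentDictionary<int, byte> seen = new ConcurrentDictionary<int, byte>();

    // Returns true only for the first caller that claims the id.
    public bool TryClaim(int id)
    {
        return seen.TryAdd(id, 0);
    }

    public bool Contains(int id)
    {
        return seen.ContainsKey(id);
    }
}
```

A caller would validate an item only when `TryClaim(id)` returns true, which covers both the "already validated" and the "currently validating" checks from the question in one step.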
