How to approximate job completion times in Hangfire - c#

I have an application that uses hangfire to do long-running jobs for me (I know the time the job takes and it is always roughly the same), and in my UI I want to give an estimate for when a certain job is done. For that I need to query hangfire for the position of the job in the queue and the number of servers working on it.
I know I can get the number of enqueued jobs (in the "DEFAULT" queue) by
public long JobsInQueue() {
var monitor = JobStorage.Current.GetMonitoringApi();
return monitor.EnqueuedCount("DEFAULT");
}
and the number of servers by
public int HealthyServers() {
var monitor = JobStorage.Current.GetMonitoringApi();
return monitor.Servers().Count(n => (n.Heartbeat != null) && (DateTime.Now - n.Heartbeat.Value).TotalMinutes < 5);
}
(BTW: I exclude older heartbeats, because if I turn off servers they sometimes linger in the hangfire database. Is there a better way?), but to give a proper estimate I need to know the position of the job in the queue. How do I get that?

The problem you have is that hangfire is asynchronous, queued, parallel, exhibits an at-least-once durability semantic, and basically non-deterministic.
To know with certainty the order in which an item will finish being processed in such a system is impossible. In fact, if the requirement was to enforce strict ordering, then many of the benefits of hangfire would go away.
There is a very good blog post by @odinserj (the author of Hangfire) that makes this point: http://odinserj.net/2014/05/10/are-your-methods-ready-to-run-in-background/
That said, a sensible estimate is still possible, as long as the order of execution is only approximated. I don't know of a proven algorithm, but something like this might work (though it probably won't be accurate):
Approximate seconds remaining until completion =
(
(average duration of job in seconds * queue depth)
/ (the lower of: number of hangfire threads OR queue depth)
)
- number of seconds already spent in queue
+ average duration of job in seconds
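The formula above can be written as a small helper. This is only a sketch of that arithmetic: the class name, method name and all four parameters are hypothetical, nothing here calls Hangfire, and the clamp to zero is my own addition so the estimate never goes negative.

```csharp
using System;

public static class JobEta
{
    // All inputs are assumptions supplied by the caller (for example from
    // EnqueuedCount and your own timing data); none are read from Hangfire here.
    public static double EstimateSecondsRemaining(
        double averageJobSeconds, long queueDepth, int workerCount, double secondsAlreadyQueued)
    {
        if (workerCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(workerCount));
        if (queueDepth <= 0)
            return averageJobSeconds; // empty queue: only the job itself remains

        // "the lower of: number of hangfire threads OR queue depth"
        long parallelism = Math.Min(workerCount, queueDepth);

        double queueSeconds = averageJobSeconds * queueDepth / parallelism;
        double estimate = queueSeconds - secondsAlreadyQueued + averageJobSeconds;

        // Not part of the original formula: never report a negative estimate.
        return Math.Max(estimate, 0);
    }
}
```

For example, with 60-second jobs, 10 jobs in the queue, 4 workers and 30 seconds already spent waiting, this gives (60 * 10 / 4) - 30 + 60 = 180 seconds.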

Related

Concurrently running multiple tasks C#

I have REST web API service in IIS which takes a collection of request objects. The user can enter more than 100 request objects.
I want to run this 100 request concurrently and then aggregate the result and send it back. This involves both I/O operation (calling to backend services for each request) and CPU bound operations (to compute few response elements)
Code snippet -
using System.Threading.Tasks;
....
var taskArray = new Task<FlightInformation>[multiFlightStatusRequest.FlightRequests.Count];
for (int i = 0; i < multiFlightStatusRequest.FlightRequests.Count; i++)
{
var z = i;
taskArray[z] = Task.Run(() =>
PerformLogic(multiFlightStatusRequest.FlightRequests[z],lite, fetchRouteByAnyLeg)
);
}
Task.WaitAll(taskArray);
for (int i = 0; i < taskArray.Length; i++)
{
flightInformations.Add(taskArray[i].Result);
}
public Object PerformLogic(Request,...)
{
//multiple IO operations each depends on the outcome of the previous result
//Computations after getting the result from all I/O operations
}
If i individually run the PerformLogic operation (for 1 object) it is taking 300 ms, now my requirement is when I run this PerformLogic() for 100 objects in a single request it should take around 2 secs.
PerformLogic() has the following steps - 1. Call a 3rd Party web service to get some details 2. Based on the details call another 3rd Party webservice 3. Collect the result from the webservice, apply few transformation
But with Task.Run() it takes around 7 secs. I would like to know the best approach to handle concurrency and achieve the desired NFR of 2 secs.
I can see that at any point in time 7-8 threads are working concurrently.
I am not sure whether spawning 100 threads or tasks would give better performance. Please suggest an approach to handle this efficiently.
Judging by this
public Object PerformLogic(Request,...)
{
//multiple IO operations each depends on the outcome of the previous result
//Computations after getting the result from all I/O operations
}
I'd wager that PerformLogic spends most of its time waiting on the IO operations. If so, there's hope with async. You'll have to rewrite PerformLogic and maybe even the IO operations: async needs to be present at all levels, from top to bottom. But if you can do it, the result should be a lot faster, because a request that is waiting on IO no longer holds a thread.
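A rough sketch of what the async version could look like. PerformLogicAsync here is a hypothetical stand-in that simulates the two dependent web-service calls with Task.Delay; the real method would await the actual HTTP calls instead. The request type is reduced to an int for brevity.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncFanOut
{
    // Hypothetical replacement for PerformLogic: each await releases the
    // thread while the (simulated) third-party call is in flight.
    public static async Task<string> PerformLogicAsync(int requestId)
    {
        await Task.Delay(50);          // first web-service call (simulated)
        await Task.Delay(50);          // second call, depends on the first
        return "result-" + requestId;  // transformation of the response
    }

    public static async Task<List<string>> RunAllAsync(IEnumerable<int> requestIds)
    {
        // Start every request without blocking; no Task.Run is needed
        // because the work is IO-bound.
        Task<string>[] tasks = requestIds.Select(id => PerformLogicAsync(id)).ToArray();

        // WhenAll returns the results in the same order as the tasks.
        string[] results = await Task.WhenAll(tasks);
        return results.ToList();
    }
}
```

With this shape, 100 requests that each wait about 100 ms on IO finish in roughly the time of one request rather than being limited by the 7-8 thread-pool threads observed in the question. The calling Web API action would need to be async as well and await RunAllAsync.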
Other than that - get faster hardware. If 8 cores take 7 seconds, then get 32 cores. It's pricey, but could still be cheaper than rewriting the code.
First, don't reinvent the wheel. PLINQ is perfectly capable of doing stuff in parallel, there is no need for manual task handling or result merging.
If you want 100 tasks each taking 300ms done in 2 seconds, you need at least 15 parallel workers, ignoring the cost of parallelization itself.
var results = multiFlightStatusRequest.FlightRequests
.AsParallel()
.WithDegreeOfParallelism(15)
.Select(flightRequest => PerformLogic(flightRequest, lite, fetchRouteByAnyLeg))
.ToList();
Now you have told PLINQ to use 15 concurrent workers to work on your queue of tasks. Are you sure your machine is up to the task? You can put any number you want in there, but that doesn't mean your computer magically gets the power to do it.
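The figure of 15 workers comes from dividing the total work by the time budget: 100 items * 300 ms = 30,000 ms of work, and 30,000 ms / 2,000 ms = 15. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and like the estimate above it ignores scheduling overhead.

```csharp
using System;

public static class WorkerMath
{
    // Smallest number of parallel workers that fits the total work into the
    // target time, assuming every item takes the same time and parallelism
    // itself costs nothing.
    public static long RequiredWorkers(long itemCount, long millisecondsPerItem, long targetMilliseconds)
    {
        if (itemCount < 0 || millisecondsPerItem < 0 || targetMilliseconds <= 0)
            throw new ArgumentOutOfRangeException();

        long totalWork = itemCount * millisecondsPerItem;

        // Integer ceiling division avoids floating-point rounding surprises.
        return (totalWork + targetMilliseconds - 1) / targetMilliseconds;
    }
}
```

WorkerMath.RequiredWorkers(100, 300, 2000) returns 15, which is the value passed to WithDegreeOfParallelism above.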
Another option is to look at your PerformLogic method and optimize that. You call it 100 times, maybe it's worth optimizing.

C# parallel.foreach starving for data

My application processes millions of pieces of data which vary in size. Small objects are processed quickly while others can take upwards of fifteen minutes.
My current code:
List<QueueRecords> queueRecords= Get500QueueRecords();
bool morefiles=true;
while(morefiles)
{
Parallel.ForEach(
queueRecords,parallelOptions,(record,loopstate)=>
{
//dowork
});
queueRecords = Get500QueueRecords();
if(queueRecords.Count() == 0)
{
morefiles = false;
}
}
The issue with this is that many times I will end up with one thread performing a long running task while there are still massive amounts of data to be processed.
Which pattern should I look into to resolve this issue?
Issues:
1) Get500QueueRecords could also take some time to execute, and during that time you aren't doing any processing;
2) If the last record in a set takes 15 minutes, only that one record is being processed for most of that time, because Parallel.ForEach waits for it to complete before the next batch is fetched.
You really should look at TPL DataFlow (https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/dataflow-task-parallel-library) or at least create a producer Task that pumps data into a BlockingCollection<T> and then launch multiple consumer Tasks that pull from the blocking collection until it's consumed.
Using a producer and a consumer with a finite-size BlockingCollection<T> between them allows you to control (i) how many items are buffered by the producer Task and (ii) how many Tasks you have consuming it.
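A minimal sketch of the BlockingCollection<T> variant described above. The class name, the int record type and the counter are placeholders for illustration; in the real application the producer would call Get500QueueRecords in a loop and the consumer body would do the actual work.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class QueuePipeline
{
    // Returns the number of records processed.
    public static int Run(IEnumerable<int> records, int consumerCount, int bufferSize)
    {
        int processed = 0;

        // Bounded capacity: Add blocks once bufferSize items are waiting,
        // so the producer cannot run arbitrarily far ahead.
        using (var buffer = new BlockingCollection<int>(boundedCapacity: bufferSize))
        {
            Task producer = Task.Run(() =>
            {
                try
                {
                    foreach (int record in records)
                        buffer.Add(record);
                }
                finally
                {
                    // Lets the consumers' loops end once the buffer drains.
                    buffer.CompleteAdding();
                }
            });

            Task[] consumers = Enumerable.Range(0, consumerCount)
                .Select(_ => Task.Run(() =>
                {
                    // Each consumer takes the next record as soon as it is free,
                    // so one 15-minute record never idles the other consumers.
                    foreach (int record in buffer.GetConsumingEnumerable())
                    {
                        // do work on record here
                        Interlocked.Increment(ref processed);
                    }
                }))
                .ToArray();

            Task.WaitAll(consumers);
            producer.Wait();
        }

        return processed;
    }
}
```

The design point is that work is handed out one record at a time instead of in batches of 500, so a slow record only occupies a single consumer.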

Parallel.ForEach slows down towards end of the iteration

I have the following issue :
I am using a Parallel.ForEach iteration for a pretty CPU-intensive workload (applying a method to a number of items), and it works fine for about the first 80% of the items, using all CPU cores very nicely.
As the iteration nears the end (around 80%, I would say), I see the number of threads drop core by core, and the last 5% or so of the items are processed by only two cores. So instead of using all cores until the end, it slows down considerably toward the end of the iteration.
Please note that the workload per item can vary a lot: one item can take 1-2 seconds, another can take 2-3 minutes to finish.
Any idea or suggestion is very welcome.
Code used:
var source = myList.ToArray();
var rangePartitioner = Partitioner.Create(0, source.Length);
using (SqlConnection connection = new SqlConnection(cnStr))
{
    connection.Open();
    try
    {
        Parallel.ForEach(rangePartitioner, (range, loopState) =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                CPUIntensiveMethod(source[i]);
            }
        });
    }
    catch (AggregateException ae)
    {
        // Exception handling
    }
}
This is an unavoidable consequence of the fact that the parallelism is per computation. The whole parallel batch cannot run any quicker than the time taken by the slowest single item in the work-set.
Imagine a batch of 100 items, 8 of which are slow (say 1000s to run) and the rest are quick (say 1s to run). You kick them off in a random order across 8 threads. Eventually each thread will be calculating one of your long-running items, and at this point you are seeing full utilisation. The thread(s) that hit their long op(s) first will then finish them and quickly finish any remaining short ops. At that point you ONLY have some of the long ops waiting to finish, so you will see the active utilisation drop off; i.e. at some point there are only 3 ops left to finish, so only 3 cores are in use.
Mitigation Tactics
Your long running items might be amenable to 'internal parallelism' allowing them to have a faster minimum limit runtime.
Your long running items may be able to be identified and prioritised to start first (which will ensure you get full CPU utilisation for a long as possible)
(See update below.) Don't use range partitioning when the loop body can be long-running, as it simply amplifies this effect (i.e. get rid of your rangePartitioner entirely). This will massively reduce the impact on your particular loop.
Either way, your batch run-time is bounded below by the run-time of the slowest item in the batch.
Update: I have also noticed you are using partitioning on your loop, which massively increases the scope of this effect; i.e. you are saying 'break this work-set down into N work-sets' and then parallelizing the running of those N work-sets. In the example above this could mean that (say) 3 of the long ops land in the same work-set, so they are processed one after another on the same thread. As such you should NOT be using partitioning if the inner body can be long-running. For example, the docs on partitioning at https://msdn.microsoft.com/en-us/library/dd560853(v=vs.110).aspx say it is aimed at short loop bodies.
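One way to hand out items one at a time instead of in ranges is sketched below. It uses Partitioner.Create with EnumerablePartitionerOptions.NoBuffering (available since .NET 4.5); the class name and the Func parameter standing in for CPUIntensiveMethod are hypothetical, and the running total exists only so the example has something to return.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class PerItemLoop
{
    public static long ProcessAll(IEnumerable<int> source, Func<int, long> cpuIntensiveMethod)
    {
        long total = 0;

        // NoBuffering: each worker takes exactly one item at a time, so a
        // thread stuck on a slow item does not also hold a chunk of pending items.
        var perItem = Partitioner.Create(source, EnumerablePartitionerOptions.NoBuffering);

        Parallel.ForEach(perItem, item =>
        {
            long value = cpuIntensiveMethod(item);
            Interlocked.Add(ref total, value); // thread-safe accumulation
        });

        return total;
    }
}
```

This trades a little extra synchronisation per item for better balance, which is a reasonable deal when items take seconds or minutes. If the slow items can be identified up front, ordering the source so they come first combines well with this.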
If you have multiple threads that process the same number of items each and each item takes varying amount of time, then of course you will have some threads that finish earlier.
If you use a collection whose size is not known, then the items will be taken one by one:
var source = myList.AsEnumerable();
Another approach can be a Producer-Consumer pattern
https://msdn.microsoft.com/en-us/library/dd997371

Hangfire recurring tasks under minute

Is there a way to set hangfire recurring jobs every few seconds?
I do not seek a solution where fire and forget task creates another fire and forget task, and if not, what are suggested alternatives?
Not sure when this became supported but tried this in ASP.NET Core 2.0 with Hangfire 1.7.0. The following code schedules a job every 20 seconds:
RecurringJob.AddOrUpdate<SomeJob>(
x => x.DoWork(),
"*/20 * * * * *");
If I am not mistaken, the 6-field form (as opposed to the standard 5 fields) works because Hangfire 1.7 parses cron expressions with the Cronos library, which accepts an optional leading seconds field and so gives second granularity instead of minute granularity.
The Hangfire dashboard also shows the short interval between runs (screenshot omitted).
I think anyone who is against allowing a recurring trigger of less than 1 min is short sighted. After all, is 55 secs any less efficient than 1 min ? It seems so arbitrary! As much as I love Hangfire, I've encountered situations where I've had to steer a client to Quartz.net simply because there was a business requirement for a job to run every 55 secs or alike.
Anyone who makes the counter-argument that a job configured to run every 1 sec would seriously hurt performance is again taking a closed view of things. Of course a trigger with a 1 sec interval is probably not a good idea, but do we disallow 55 sec or 45 sec because of the unlikely situation where someone chooses 1 sec?
In any case, performance is both subjective and dependent on the host platform and hardware. It really isn't up to the API to enforce opinion when it comes to performance. Just make the polling interval and trigger recurrence configurable. That way the user can determine the best result for themselves.
Although a background process which is orchestrating a job to run every 55 sec may be an option, it isn't very satisfactory. In this case, the process isn't visible via the Hangfire UI so it's hidden from the administrator. I feel this approach is circumventing one of the major benefits of Hangfire.
If Hangfire were a serious competitor to the likes of Quartz.net, it would at least match their basic functionality. If Quartz can support triggers with an interval below 1 min, then why can't Hangfire?
Although Hangfire doesn't allow you to schedule tasks for less than a minute, you can actually achieve this by having the function recursively scheduling itself; i.e. let's say you want some method to be hit every 2s you can schedule a background job that calls the method on Startup;
BackgroundJob.Schedule(() => PublishMessage(), TimeSpan.FromMilliseconds(2000));
And then in your PublishMessage method do your stuff and then schedule a job to call the same method;
public void PublishMessage()
{
/* do your stuff */
//then schedule a job to exec the same method
BackgroundJob.Schedule(() => PublishMessage(), TimeSpan.FromMilliseconds(2000));
}
The other thing you need to override is the default SchedulePollingInterval of 15s, otherwise your method will only be hit after every 15s. To do so just pass in an instance of BackgroundJobServerOptions to UseHangfireServer in your startup, like so;
var options = new BackgroundJobServerOptions
{
SchedulePollingInterval = TimeSpan.FromMilliseconds(2000)
};
app.UseHangfireServer(options);
I don't know how "foolproof" my solution is, but I managed to achieve my goal with it and everything is "happy" in production.
I had to do the same but with 5 seconds. The default schedule polling interval is set to 15s. So it requires 2 steps to achieve a 5s interval job.
in Startup.cs
var options = new BackgroundJobServerOptions
{
SchedulePollingInterval = TimeSpan.FromMilliseconds(5000)
};
app.UseHangfireDashboard();
app.UseHangfireServer(options);
Your job
RecurringJob.AddOrUpdate(() => YourJob(), "*/5 * * * * *");
Hangfire doesn't support intervals of less than a minute for recurring jobs.
Why? Imagine if they allowed less than a minute, let's say 1 sec. How frequently would Hangfire have to check recurring jobs in the database? This would cause a lot of database IO.
See this discussion on Hangfire for more information.
I faced the same problem, and here is my solution:
private void TimingExecuteWrapper(Action action, int sleepSeconds, int intervalSeconds)
{
DateTime beginTime = DateTime.UtcNow, endTime;
var interval = TimeSpan.FromSeconds(intervalSeconds);
while (true)
{
action();
Thread.Sleep(TimeSpan.FromSeconds(sleepSeconds));
endTime = DateTime.UtcNow;
if (endTime - beginTime >= interval)
break;
}
}
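intervalSeconds is the minimal cron interval, which is 1 minute (60 seconds). The wrapper itself is what Hangfire runs once a minute; inside it, action is repeated every sleepSeconds until the minute is used up.
action is our job code.
I also suggest using DisableConcurrentExecution to avoid concurrency collisions between overlapping runs.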
I had a similar requirement, in that I had a recurring job that needs running every 15 seconds.
What I did to try to get around this limitation was to just delay the creation of the scheduled jobs (set to 1 minute intervals), which seemed to do the trick.
However what I found was that, taking into account the polling intervals (set the schedulepolling interval to my frequency) and delays in picking up the new jobs this isn't always as accurate as it should be, but is doing the trick for the moment. However a better/proper solution would be good.
I feel a bit dirty having to resort to the approach below, but it helped me out...
so in essence I created 4 jobs doing the same thing, created 15 seconds apart.
along the lines of:
...
new Thread(() =>
{
//loop from {id=1 through 4}
// create job here with {id} in the name at interval of 1 minute
// sleep 15 seconds
//end loop
}).Start();
...

Threading Volume #9000

Ok, So, I just started screwing around with threading, now it's taking a bit of time to wrap my head around the concepts so i wrote a pretty simple test to see how much faster if faster at all printing out 20000 lines would be (and i figured it would be faster since i have a quad core processor?)
so first i wrote this, (this is how i would normally do the following):
System.DateTime startdate = DateTime.Now;
for (int i = 0; i < 10000; ++i)
{
Console.WriteLine("Producing " + i);
Console.WriteLine("\t\t\t\tConsuming " + i);
}
System.DateTime endtime = DateTime.Now;
Console.WriteLine(startdate.Second + ":" + startdate.Millisecond + " to " + endtime.Second + ":" + endtime.Millisecond);
And then with threading:
public class Test
{
static ProducerConsumer queue;
public System.DateTime startdate = DateTime.Now;
static void Main()
{
queue = new ProducerConsumer();
new Thread(new ThreadStart(ConsumerJob)).Start();
for (int i = 0; i < 10000; i++)
{
Console.WriteLine("Producing {0}", i);
queue.Produce(i);
}
Test a = new Test();
}
static void ConsumerJob()
{
Test a = new Test();
for (int i = 0; i < 10000; i++)
{
object o = queue.Consume();
Console.WriteLine("\t\t\t\tConsuming {0}", o);
}
System.DateTime endtime = DateTime.Now;
Console.WriteLine(a.startdate.Second + ":" + a.startdate.Millisecond + " to " + endtime.Second + ":" + endtime.Millisecond);
}
}
public class ProducerConsumer
{
readonly object listLock = new object();
Queue queue = new Queue();
public void Produce(object o)
{
lock (listLock)
{
queue.Enqueue(o);
Monitor.Pulse(listLock);
}
}
public object Consume()
{
lock (listLock)
{
while (queue.Count == 0)
{
Monitor.Wait(listLock);
}
return queue.Dequeue();
}
}
}
Now, For some reason i assumed this would be faster, but after testing it 15 times, the median of the results is ... a few milliseconds different in favor of non threading
Then i figured hey ... maybe i should try it on a million Console.WriteLine's, but the results were similar
am i doing something wrong ?
Writing to the console is internally synchronized. It is not parallel. It also causes cross-process communication.
In short: It is the worst possible benchmark I can think of ;-)
Try benchmarking something real, something that you actually would want to speed up. It needs to be CPU bound and not internally synchronized.
As far as I can see you have only got one thread servicing the queue, so why would this be any quicker?
I have an example for why your expectation of a big speedup through multi-threading is wrong:
Assume you want to upload 100 pictures. The single threaded variant loads the first, uploads it, loads the second, uploads it, etc.
The limiting part here is the bandwidth of your internet connection (assuming that every upload uses up all the upload bandwidth you have).
What happens if you create 100 threads to upload 1 picture only? Well, each thread reads its picture (this is the part that speeds things up a little, because reading the pictures is done in parallel instead of one after the other).
As the currently active thread uses 100% of the internet upload bandwidth to upload its picture, no other thread can upload a single byte while it is not active. As the number of bytes that needs to be transmitted is the same in both cases, the time that 100 threads need to upload one picture each is the same as the time one thread needs to upload 100 pictures one after the other.
You would only get a speedup if each upload were limited to, let's say, 50% of the available bandwidth. Then 100 threads would be done in 50% of the time it would take one thread to upload 100 pictures.
"For some reason i assumed this would be faster"
If you don't know why you assumed it would be faster, why are you surprised that it's not? Simply starting up new threads is never guaranteed to make any operation run faster. There has to be some inefficiency in the original algorithm that a new thread can reduce (and that is sufficient to overcome the extra overhead of creating the thread).
All the advice given by others is good advice, especially the mention of the fact that the console is serialized, as well as the fact that adding threads does not guarantee speedup.
What I want to point out and what it seems the others missed is that in your original scenario you are printing everything in the main thread, while in the second scenario you are merely delegating the entire printing task to the secondary worker. This cannot be any faster than your original scenario because you simply traded one worker for another.
A scenario where you might see speedup is this one:
for(int i = 0; i < largeNumber; i++)
{
// embarrassingly parallel task that takes some time to process
}
and then replacing that with:
int i = 0;
Parallel.For(i, largeNumber,
o =>
{
// embarrassingly parallel task that takes some time to process
});
This will split the loop among the workers such that each worker processes a smaller chunk of the original data. If the task does not need synchronization you should see the expected speedup.
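To make that concrete, here is a self-contained sketch with a CPU-bound body (counting primes by trial division) in both sequential and Parallel.For form. The workload and the class name are my own choice for illustration, not something from the question; the parallel version uses the thread-local overload of Parallel.For so that threads only synchronise once, when they finish.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PrimeCount
{
    // Deliberately simple CPU-bound work: trial division.
    static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    public static int Sequential(int limit)
    {
        int count = 0;
        for (int i = 0; i < limit; i++)
            if (IsPrime(i)) count++;
        return count;
    }

    public static int Parallelized(int limit)
    {
        int count = 0;
        Parallel.For(0, limit,
            () => 0,                                              // per-thread running count
            (i, state, local) => IsPrime(i) ? local + 1 : local,  // no shared state touched here
            local => Interlocked.Add(ref count, local));          // merge once per thread
        return count;
    }
}
```

Both methods return the same count; wrapping each call in a System.Diagnostics.Stopwatch with a large limit should show the parallel version ahead on a multi-core machine, unlike the Console.WriteLine test, because nothing inside the loop body is serialized.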
Cool test.
One thing to have in mind when dealing with threads is bottlenecks. Consider this:
You have a Restaurant. Your kitchen can make a new order every 10
minutes (your chef has a bladder problem so he's always in the
bathroom, but is your girlfriend's cousin), so he produces 6 orders an
hour.
You currently employ only one waiter, which can attend tables
immediately (he's probably on E, but you don't care as long as the
service is good).
During the first week of business everything is fine: you get
customers every ten minutes. Customers still wait for exactly ten
minutes for their meal, but that's fine.
However, after that week, you are getting as many as 2 customers every
ten minutes, and they have to wait as much as 20 minutes to get their
meal. They start complaining and making noises. And god, you have
noise. So what do you do?
Waiters are cheap, so you hire two more. Will the wait time change?
Not at all... waiters will get the order faster, sure (attend two
customers in parallel), but still some customers wait 20 minutes for
the chef to complete their orders. You need another chef, but as you
search, you discover they are lacking! Every one of them is on TV
doing some crazy reality show (except for your girlfriend's cousin who
actually, you discover, is a former drug dealer).
In your case, the waiters are the threads making calls to Console.WriteLine, but your chef is the Console itself. It can only service so many calls a second. Adding some threads might make things a bit faster, but the gains should be minimal.
You have multiple sources, but only 1 output. In that case multi-threading will not speed it up. It's like having a road where 4 lanes merge into 1 lane. Having 4 lanes will move traffic faster, but at the end it will slow back down when it merges into 1 lane.
