C# delayed (async) execution without reliability across restarts [duplicate] - c#

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
What is the correct way to delay the start of a Task in c#
I need to schedule small tasks to be executed in the future (the delay is always under 1 minute). The implementation is in .NET on the .NET 4.0 runtime. The Async CTP is an option, although I don't see the added value for the moment.
Requirements:
the scheduling needs to be async
the scheduling resolution is in seconds
the task execution is implicitly async (I think)
the number of scheduled tasks could be in the hundreds or even thousands
it is possible, however unlikely, that two tasks will be scheduled at the exact same time
My current solution is this:
internal class TimerState
{
    internal Timer Timer { get; set; }
    internal object Payload { get; set; }
    internal Action<object> Action { get; set; }
}
public class TimerModule
{
    public static void ScheduleTask(object input, Action<object> action, TimeSpan delay)
    {
        //create state to pass to timer method
        var state = new TimerState { Payload = input, Action = action };
        //schedule timer without firing
        var t = new Timer(HandleScheduleTimer, state, -1, -1);
        //add timer to state to be able to dispose it
        state.Timer = t;
        //schedule timer to fire in delay time
        t.Change(delay, TimeSpan.FromMilliseconds(-1));
    }
    private static void HandleScheduleTimer(object state)
    {
        var s = state as TimerState;
        Task.Factory.StartNew(s.Action, s.Payload, CancellationToken.None,
            TaskCreationOptions.PreferFairness, TaskScheduler.Current);
        //dispose the timer immediately
        if (s.Timer != null)
            s.Timer.Dispose();
    }
}
I've done some tests with performance counters (.NET physical threads), but I don't see that many threads running at the same time, even though I schedule thousands of tasks at approximately the same time.
Is there a better way to do this?
Are there any proven design patterns around this?
Most scheduling solutions I've found are built to be reliable across restarts, but I don't need that: after a crash I can replay the data in the system and compensate for the scheduled tasks that weren't executed.
Edit: I don't mean that I want to see thousands of threads running at the same time. I'm aware that this will probably be handled by the thread pool. As I say in the comments, a test with 5M tasks spread over 100 seconds shows an increase of only 20 physical threads.
My main question is this: is there a better way to delay task execution?

The solution you're looking for is Quartz.
http://quartznet.sourceforge.net/
It's a very robust scheduler capable of handling different jobs tied to different schedules. It is far more reliable than a timer, and it has failover if anything goes wrong.
Schedules either use a cron expression or run on specific dates. I've heard of instances having thousands of jobs. Quartz can be clustered if need be, and can run stateful (keeping knowledge of previous runs) and non-stateful jobs. Depending on how they're configured, jobs can run simultaneously or in a queue. Most threading and thread-safety issues have been taken care of for you, so you're free to write your job class without headache.
Setup is simple (install the service and configure it), and creating Quartz jobs is even simpler. Create a class that implements IJob, apply your logic, and add it to the Quartz config. The config can be an XML file, a SQL Server database, or you can create your own solution. I'm working on one for RavenDB.
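If a full scheduler is more than you need for sub-minute delays that don't have to survive a restart, a lighter option is to chain a continuation onto a delay task. The sketch below assumes .NET 4.5 or later, where Task.Delay is available (with the Async CTP on .NET 4.0 the equivalent is TaskEx.Delay). DelayedRunner and Schedule are illustrative names, not part of any library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: not part of the question's code or of any library.
public static class DelayedRunner
{
    // Runs action(payload) once, after 'delay', on a thread-pool thread.
    // Cancelling the token before the delay elapses cancels the action too.
    public static Task Schedule(object payload, Action<object> action, TimeSpan delay,
        CancellationToken token = default(CancellationToken))
    {
        return Task.Delay(delay, token)
            .ContinueWith(_ => action(payload), token,
                TaskContinuationOptions.OnlyOnRanToCompletion, TaskScheduler.Default);
    }
}
```

No timer object has to be tracked or disposed by the caller, and the returned Task can be awaited or observed for exceptions if needed.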

Related

Is Async Await in Console apps as useful as GUI and ASP.NET apps?

I am currently learning C# Async/Await feature and can see its usefulness in GUI and web apps but I am still trying to figure out its real usefulness in Console apps. Can you give an example that drives home the point?
Async lets more code run until a task is actually awaited, so if there is other work that does not depend on that task, it can start right away and run concurrently.
For example:
public async Task<string> GetUserFullNameAsync(string firstName)
{
    return await GetUserFullNameAsyncInner(firstName); // gets the user's full name from the db in an async fashion - takes 4 seconds
}
public async Task<DateTime> GetFlightTimeAsync(string flightName)
{
    return await GetFlightTimeAsyncInner(flightName); // gets the flight time from the db in an async fashion - takes 4 seconds
}
public async Task<UserDetails> GetUserDetailsAsync(string userFullName)
{
    return await GetUserDetailsAsyncInner(userFullName); // gets user details by full name from the db in an async fashion - takes 4 seconds
}
Let's look at this function:
public async Task<UserDetails> GetUserDetails(string firstName)
{
    var userFullName = await GetUserFullNameAsync(firstName);
    return await GetUserDetailsAsync(userFullName);
}
Notice how GetUserDetailsAsync depends on getting the full name first, via GetUserFullNameAsync.
So if you need the UserDetails object, you have to wait for GetUserFullNameAsync to finish. That may take some time, especially for heavier actions like video processing.
In this example: 4 seconds for the first function + 4 seconds for the second = 8 seconds.
Now let's look at this second function:
public async Task<FlightDetails> GetUserFlightDetails(string firstName, string flightName)
{
    var userFullNameTask = GetUserFullNameAsync(firstName);
    var flightTimeTask = GetFlightTimeAsync(flightName);
    await Task.WhenAll(userFullNameTask, flightTimeTask);
    return new FlightDetails(await userFullNameTask, await flightTimeTask);
}
Notice that GetFlightTimeAsync does not depend on any other function, so if you need, say, the user's full name and the flight time, you can start both calls before awaiting either. Both operations are then in flight at the same time, so the total wait is shorter than getting the full name and then getting the flight time.
4 seconds for the first function and 4 seconds for the second, running concurrently, take about 4 seconds in total rather than 8.
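The difference is easy to measure in a console app. The following is a minimal sketch in which FakeIoAsync stands in for the database calls above; the 200 ms delay and all names are assumptions made for the illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Stand-in for an I/O-bound call; the 200 ms delay is arbitrary.
    public static async Task<string> FakeIoAsync(string value)
    {
        await Task.Delay(200);
        return value;
    }

    // Awaits each call before starting the next: the delays add up.
    public static async Task<long> SequentialAsync()
    {
        var sw = Stopwatch.StartNew();
        await FakeIoAsync("name");
        await FakeIoAsync("flight");
        return sw.ElapsedMilliseconds;
    }

    // Starts both calls first, then awaits them together: the delays overlap.
    public static async Task<long> ConcurrentAsync()
    {
        var sw = Stopwatch.StartNew();
        Task<string> name = FakeIoAsync("name");
        Task<string> flight = FakeIoAsync("flight");
        await Task.WhenAll(name, flight);
        return sw.ElapsedMilliseconds;
    }
}
```

Calling both methods from Main and printing the results should show the sequential version taking roughly twice as long as the concurrent one.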
Let's look at asynchronous programming from a different angle than just a way of doing things in parallel. Yes, you can run tasks in parallel, but you will find plenty of code that uses async/await and still awaits every asynchronous call one at a time.
What is the point of doing so? There is no parallel execution there...
It is all about making better use of available system resources, especially threads.
Threads are a limited system resource. Once execution reaches an asynchronous operation, the thread can be released. By releasing the thread while it would otherwise idle waiting for I/O-bound work to complete, it can be used to serve another request. It also protects against usage bursts, since the scheduler doesn't suddenly find itself starved of threads to serve new requests.
Choosing an async operation instead of a synchronous one doesn't speed up the operation. It will take the same amount of time (or even more). It just lets that thread go on to execute other CPU-bound work instead of wasting resources.
If you have any I/O-bound needs (such as requesting data from a network, accessing a database, or reading and writing to a file system), you'll want to use asynchronous programming, whether the application is a console one or not.
Bonus: If you are wondering: "Ok, my application released the thread but there must be some other thread that is really doing the wait!" have a look at this article from Stephen Cleary

create multiple threads and communicate with them

I have a program that takes a long time to initialize, but its execution is rather fast.
It's becoming a bottleneck, so I want to start multiple instances of this program (like a pool), already initialized, and then just pass each one the arguments needed for its execution, saving all the initialization time.
The problem is that I have only found how to start new processes passing arguments:
How to pass parameters to ThreadStart method in Thread?
but I would like to start the process normally and then be able to communicate with it, sending each thread the parameters required for its execution.
The best approach I found was to create multiple threads where I would initialize the program and then use some communication mechanism (named pipes, for example, as it's all running on the same machine) to pass those arguments and trigger the execution of the program (one of the triggers could break an infinite loop, for example).
I'm asking if anyone can advise a better solution than the one I came up with.
I suggest you don't mess with direct Thread usage and use the TPL instead, something like this:
foreach (var data in YOUR_INITIALIZATION_LOGIC_METHOD_HERE)
{
    Task.Run(() => yourDelegate(data) /* , other params here */);
}
More about Task.Run on MSDN, Stephen Cleary blog
Process != Thread
A thread lives inside a process, while a process is an entire program or service in your OS.
If you want to speed up your app initialization you can still use threads, but nowadays we use them through the Task Parallel Library and the Task-based Asynchronous Pattern.
To communicate between tasks (which usually run on threads), you might need to implement some kind of state machine (a basic workflow) where you can detect when a task progresses and perform actions based on its state (running, failed, completed...).
Anyway, you don't need named pipes or anything like that to communicate between tasks/threads, as everything lives in the same parent process. That is, you use regular programming approaches: C#, thread synchronization mechanisms, and some kind of in-app messaging.
A very basic idea...
.NET has a List<T> collection class. You could design a coordinator class holding a list of message objects (a class designed by you), like this:
public enum OperationType { DataInitialization, Authentication, Caching }
public class Message
{
    public OperationType Operation { get; set; }
    public Task Task { get; set; }
}
Once you start all the parallel initialization tasks, add each of them to a list in the coordinator class:
Coordinator.Messages.AddRange
(
    new List<Message>
    {
        new Message
        {
            Operation = OperationType.DataInitialization,
            Task = dataInitTask
        },
        // ... more messages
    }
);
Once you've added all messages with pending initialization tasks, somewhere in your code you can wait until initialization ends asynchronously this way:
// You do a projection of each message to get an IEnumerable<Task>
// to give it as argument of Task.WhenAll
await Task.WhenAll(Coordinator.Messages.Select(message => message.Task));
While this line awaits the end of all initialization, your UI (i.e. the main thread) can continue to work and show some kind of loading animation or whatever you like.
Perhaps you can go a step further: instead of waiting for everything, wait only for the group of tasks that lets your users start using the app, while other non-critical tasks finish in the background...
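For the original scenario (slow initialization, fast execution, arguments arriving later), the in-app messaging mentioned above can be as simple as a shared BlockingCollection that pre-initialized workers consume from. This is only a sketch: WarmWorkerPool and its members are made-up names, it assumes .NET 4.0 or later, and it assumes the arguments and results are plain strings.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical pool: each worker pays the initialization cost once,
// then waits for arguments to arrive on a shared queue.
public sealed class WarmWorkerPool : IDisposable
{
    private readonly BlockingCollection<string> _requests = new BlockingCollection<string>();
    private readonly ConcurrentBag<string> _results = new ConcurrentBag<string>();
    private readonly List<Task> _workers = new List<Task>();

    // 'initialize' performs the slow setup and returns the fast execution function.
    public WarmWorkerPool(int workerCount, Func<Func<string, string>> initialize)
    {
        for (int i = 0; i < workerCount; i++)
        {
            _workers.Add(Task.Factory.StartNew(() =>
            {
                Func<string, string> execute = initialize(); // slow, once per worker
                foreach (string args in _requests.GetConsumingEnumerable())
                    _results.Add(execute(args));             // fast, once per request
            }, TaskCreationOptions.LongRunning));
        }
    }

    // Hands one set of arguments to whichever worker is free next.
    public void Submit(string args) { _requests.Add(args); }

    // Signals that no more work is coming, waits for the queue to drain,
    // and returns the collected results (in no particular order).
    public IEnumerable<string> Complete()
    {
        _requests.CompleteAdding();
        Task.WaitAll(_workers.ToArray());
        return _results;
    }

    public void Dispose() { _requests.Dispose(); }
}
```

Because the workers block inside GetConsumingEnumerable, there is no polling loop to break out of; calling Complete is what ends them.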

Is it correct to delay a Task using TimeSpan.FromTicks?

My program needs to constantly perform many repetitive calculations as fast as possible. There are many tasks running in parallel, which drives CPU utilisation to 100%. To let users reduce the processing load (to a little under 100% of CPU, depending on hardware), I added
await Task.Delay(TimeSpan.FromMilliseconds(doubleProcessingCycleIntervalMilliseconds));
to the heavy processing methods. This works perfectly as long as the value of doubleProcessingCycleIntervalMilliseconds is at least 1 ms.
For users who have high-end computers (where a calculation takes less than one millisecond), I wanted to add the same delay option but using ticks instead of milliseconds. So now the code looks like this:
if (ProcessingCycleIntervalOptionsMilliseconds == true)
{
    await Task.Delay(TimeSpan.FromMilliseconds(doubleProcessingCycleIntervalMilliseconds));
}
else
{
    await Task.Delay(TimeSpan.FromTicks(longProcessingCycleIntervalTicks));
}
When the value of longProcessingCycleIntervalTicks is at least 10000 ticks (= 1 ms) the program works perfectly. Unfortunately, when the value goes under 1 ms (0 for doubleProcessingCycleIntervalMilliseconds, which I can understand) or under 10000 ticks (e.g. 9999 for longProcessingCycleIntervalTicks), the program becomes unresponsive. So literally a difference of 1 tick below 1 ms hangs the program. I don't use MVVM. (Just in case: I checked that Stopwatch.IsHighResolution is true on the development computer.)
Is it possible/correct to use
await Task.Delay(TimeSpan.FromTicks(longProcessingCycleIntervalTicks));
in .NET 4.5.1? If yes, how can I determine when the user can use it?
Your intention is not to keep CPU utilization below 100%. Your intention is to keep the system responsive. Limiting CPU utilization is a misguided goal.
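The way you do this is by using low-priority threads. Use a custom task scheduler for your CPU-bound tasks.
Timing in Windows has limited accuracy. Thread.Sleep cannot work with fractional milliseconds; .NET rounds them away before handing over to Sleep. Task.Delay does the same: it truncates the TimeSpan to whole milliseconds, so 9999 ticks becomes a zero delay that completes immediately, and the loop never yields the thread.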
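The one-tick cliff described in the question can be checked directly: a sub-millisecond TimeSpan passed to Task.Delay yields a task that is already complete, so awaiting it continues synchronously. The sketch below shows the check plus one possible guard; DelayTruncation and PauseAsync are illustrative names, and falling back to Task.Yield is an assumption about what is acceptable here, not a guaranteed fix for UI responsiveness.

```csharp
using System;
using System.Threading.Tasks;

public static class DelayTruncation
{
    // True when Task.Delay hands back an already-completed task for this
    // interval, which is what happens once the interval truncates to 0 ms.
    public static bool CompletesImmediately(TimeSpan delay)
    {
        return Task.Delay(delay).IsCompleted;
    }

    // Hypothetical guard: use a real delay only when at least one whole
    // millisecond is requested; otherwise force an asynchronous continuation.
    public static async Task PauseAsync(TimeSpan delay)
    {
        if ((long)delay.TotalMilliseconds >= 1)
            await Task.Delay(delay);
        else
            await Task.Yield();
    }
}
```

Task.Yield at least forces the method to resume asynchronously; whether that is enough to keep a UI responsive depends on the synchronization context, which is one more reason to prefer the low-priority approach described above.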
You might be better off looking at the way you are performing the tasks rather than trying to make them sleep.
The best way I can think of is to use a task manager that manages each task independently (like a background worker) and then runs collections of tasks on threads.
This would let you manage how many tasks are running instead of trying to 'slow' them down.
For example:
public class Task<returnType>
{
    public delegate returnType funcTask(params object[] args);
    public delegate void returnCallback(returnType ret);
    public funcTask myTask;
    public event returnCallback Callback;
    public Task(funcTask myTask, returnCallback Callback)
    {
        this.myTask = myTask;
        this.Callback = Callback;
    }
    public void DoWork(params object[] args)
    {
        if (this.Callback != null)
        {
            this.Callback(myTask(args));
        }
        else
        {
            throw new Exception("no Callback!");
        }
    }
}
Then you need a manager that holds a Queue of the tasks you want to complete: call myQueue.Enqueue to queue a task and myQueue.Dequeue to fetch the next one to run. Basically you can use the built-in Queue to do this.
You can then create a Queue of task managers full of tasks and have them all run asynchronously; they stack nicely on the CPU as they are event driven, and the OS and .NET will sort out the rest.
EDIT:
To continuously run tasks you will need to create a class that inherits from the Queue class and raises an event when something is dequeued. The reason I say to use events is that they stack on the CPU.
For a never-ending stackable 'loop' something like this would work...
public class TaskManager<T> : Queue<T>
{
    public delegate void taskDequeued();
    public event taskDequeued OnTaskDequeued;
    // Queue<T>.Dequeue is not virtual, so it is hidden with 'new' rather than overridden.
    public new T Dequeue()
    {
        T ret = base.Dequeue();
        if (OnTaskDequeued != null) OnTaskDequeued();
        return ret;
    }
}
In your function that instantiates the 'loop' you need to do something like...
TaskManager<Task<int>> tasks = new TaskManager<Task<int>>();
Task<int> task = new Task<int>(args => 3 + 4, WriteIntToScreen); // WriteIntToScreen is a fake function to use as the callback
tasks.Enqueue(task);
// Re-queue the task every time it is dequeued, so the queue never runs dry.
// (Calling Dequeue again from inside this handler would recurse without end.)
tasks.OnTaskDequeued += delegate
{
    tasks.Enqueue(task);
};
// start the routine: each pass dequeues a task and runs it
while (tasks.Count > 0)
{
    tasks.Dequeue().DoWork(); // you could do some async threading here with BeginInvoke or something, but I am not going to write all that out as it would be pages...
}
To cancel you just empty the queue.
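If the goal is to control how many tasks run at once rather than to slow each one down, the built-in primitives may already be enough. The following is a rough sketch using SemaphoreSlim as a gate, assuming .NET 4.5 or later; Throttle, RunAsync and maxConcurrency are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs every work item on the thread pool, but never more than
    // maxConcurrency of them at the same time.
    public static async Task RunAsync(IEnumerable<Action> workItems, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var running = new List<Task>();
            foreach (Action work in workItems)
            {
                await gate.WaitAsync();          // wait for a free slot
                Action item = work;
                running.Add(Task.Run(() =>
                {
                    try { item(); }
                    finally { gate.Release(); }  // free the slot even if the item throws
                }));
            }
            await Task.WhenAll(running);
        }
    }
}
```

Setting maxConcurrency below Environment.ProcessorCount leaves headroom for the rest of the system without inserting any delays into the calculations themselves.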

What is the most efficient method for assigning threads based on the following scenario?

I can have a maximum of 5 threads running simultaneously at any one time, each making use of one of 5 separate hardware devices to speed up some complex calculations and return the result. The API (which contains only one method) for each of these devices is not thread safe and can only run on a single thread at any point in time. Once a computation is completed, the same thread can be reused to start another computation on either the same or a different device, depending on availability. Each computation is standalone and does not depend on the results of any other computation. Hence, up to 5 threads may complete their execution in any order.
What is the most efficient C# (using .NET Framework 2.0) coding solution for keeping track of which hardware is free/available and assigning a thread to the appropriate hardware API for performing the computation? Note that other than the limitation of 5 concurrently running threads, I do not have any control over when or how the threads are fired.
Please correct me if I am wrong, but a lock-free solution is preferred, as I believe it will be more efficient and more scalable.
Also note that this is not homework although it may sound like it...
.NET provides a thread pool that you can use. System.Threading.ThreadPool.QueueUserWorkItem() tells a thread in the pool to do some work for you.
Were I designing this, I'd not focus on mapping threads to your HW resources. Instead I'd expose a lockable object for each HW resource; this can simply be an array or queue of 5 objects. Then, for each bit of computation you have, call QueueUserWorkItem(). Inside the method you pass to QUWI, find the next available lockable object and lock it (i.e. dequeue it). Use the HW resource, then re-enqueue the object and exit the QUWI method.
It won't matter how many times you call QUWI; at most 5 locks can be held, and each lock guards access to one instance of your special hardware device.
The doc page for Monitor.Enter() shows how to create a safe (blocking) Queue that can be accessed by multiple workers. In .NET 4.0, you would use the built-in BlockingCollection, which does the same thing.
That's basically what you want. Except don't create the threads yourself with new Thread(); use the thread pool.
cite: Advantage of using Thread.Start vs QueueUserWorkItem
// assume the SafeQueue class from the cited doc page.
SafeQueue<SpecialHardware> q = new SafeQueue<SpecialHardware>();
// set up the queue with objects protecting the 5 magic stones
private void Setup()
{
    for (int i = 0; i < 5; i++)
    {
        q.Enqueue(GetInstanceOfSpecialHardware(i));
    }
}
// something like this gets called many times, by QueueUserWorkItem()
public void DoWork(WorkDescription d)
{
    d.DoPrepWork();
    // gain access to one of the special hardware devices
    SpecialHardware shw = q.Dequeue();
    try
    {
        shw.DoTheMagicThing();
    }
    finally
    {
        // ensure no matter what happens the HW device is released
        q.Enqueue(shw);
        // at this point another worker can use it.
    }
    d.DoFollowupWork();
}
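On .NET 4.0 or later, where BlockingCollection is available, the same check-out/check-in idea might look like the sketch below. ResourcePool and Use are hypothetical names, and T stands in for whatever type wraps one hardware device; none of this applies to the .NET 2.0 target in the question.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical pool of exclusive resources (one entry per hardware device).
public sealed class ResourcePool<T>
{
    private readonly BlockingCollection<T> _free = new BlockingCollection<T>();

    public ResourcePool(params T[] resources)
    {
        foreach (T resource in resources)
            _free.Add(resource);
    }

    // Blocks until a resource is free, runs the work with exclusive access,
    // then returns the resource to the pool even if the work throws.
    public TResult Use<TResult>(Func<T, TResult> work)
    {
        T resource = _free.Take();
        try
        {
            return work(resource);
        }
        finally
        {
            _free.Add(resource);
        }
    }
}
```

Callers never see a lock: however many threads call Use, at most as many as there are resources run at once, and no resource is ever shared by two callers.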
A lock free solution is only beneficial if the computation time is very small.
I would create a facade for each hardware thread where jobs are enqueued and a callback is invoked each time a job finishes.
Something like:
public class Job
{
    public string JobInfo { get; set; }
    public Action<Job> Callback { get; set; }
}
public class MyHardwareService
{
    Queue<Job> _jobs = new Queue<Job>();
    Thread _hardwareThread;
    ManualResetEvent _event = new ManualResetEvent(false);
    public MyHardwareService()
    {
        _hardwareThread = new Thread(WorkerFunc);
        _hardwareThread.IsBackground = true;
        _hardwareThread.Start();
    }
    public void Enqueue(Job job)
    {
        lock (_jobs)
        {
            _jobs.Enqueue(job);
            _event.Set();
        }
    }
    public void WorkerFunc()
    {
        while (true)
        {
            _event.WaitOne();
            Job currentJob;
            lock (_jobs)
            {
                currentJob = _jobs.Dequeue();
                // reset under the lock so a concurrent Enqueue cannot be missed
                if (_jobs.Count == 0)
                    _event.Reset();
            }
            //invoke hardware here.
            //trigger callback in a Thread Pool thread to be able
            // to continue with the next job ASAP
            ThreadPool.QueueUserWorkItem(state => currentJob.Callback(currentJob));
        }
    }
}
Sounds like you need a thread pool with 5 threads where each one relinquishes the HW once it's done and adds it back to some queue. Would that work? If so, .Net makes thread pools very easy.
Sounds a lot like the Sleeping barber problem. I believe the standard solution to that is to use semaphores

How would you change my Heartbeat process written in C#?

I'm looking at implementing a "Heartbeat" process to do a lot of repeated cleanup tasks throughout the day.
This seemed like a good chance to use the Command pattern, so I have an interface that looks like:
public interface ICommand
{
    void Execute();
    bool IsReady();
}
I've then created several tasks that I want to be run. Here is a basic example:
public class ProcessFilesCommand : ICommand
{
    private int secondsDelay;
    private DateTime? lastRunTime;
    public ProcessFilesCommand(int secondsDelay)
    {
        this.secondsDelay = secondsDelay;
    }
    public void Execute()
    {
        Console.WriteLine("Processing Pending Files...");
        Thread.Sleep(5000); // Simulate long running task
        lastRunTime = DateTime.Now;
    }
    public bool IsReady()
    {
        if (lastRunTime == null) return true;
        TimeSpan timeSinceLastRun = DateTime.Now.Subtract(lastRunTime.Value);
        return (timeSinceLastRun.TotalSeconds > secondsDelay);
    }
}
Finally, my console application runs in this loop looking for waiting tasks to add to the ThreadPool:
class Program
{
    static void Main(string[] args)
    {
        bool running = true;
        Queue<ICommand> taskList = new Queue<ICommand>();
        taskList.Enqueue(new ProcessFilesCommand(60)); // 1 minute interval
        taskList.Enqueue(new DeleteOrphanedFilesCommand(300)); // 5 minute interval
        while (running)
        {
            ICommand currentTask = taskList.Dequeue();
            if (currentTask.IsReady())
            {
                ThreadPool.QueueUserWorkItem(t => currentTask.Execute());
            }
            taskList.Enqueue(currentTask);
            Thread.Sleep(100);
        }
    }
}
I don't have much experience with multi-threading beyond some work I did in Operating Systems class. However, as far as I can tell none of my threads are accessing any shared state so they should be fine.
Does this seem like an "OK" design for what I want to do? Is there anything you would change?
This is a great start. We've done a bunch of things like this recently so I can offer a few suggestions.
Don't use the thread pool for long-running tasks. The thread pool is designed to run lots of tiny little tasks. If you're doing long-running tasks, use a separate thread. If you starve the thread pool (use up all its threads), everything that gets queued just waits for a thread-pool thread to become available, significantly hurting the effective performance of the thread pool.
Have the Main() routine keep track of when things ran and how long until each runs next. Instead of each command saying "yes I'm ready" or "no I'm not", with logic that will be the same for every command, just have LastRun and Interval fields which Main() can use to determine when each command needs to run.
Don't use a Queue. While it may seem like a Queue-type operation, since each command has its own interval it's really not a normal Queue. Instead, put all the commands in a List and sort the list by shortest time to next run. Sleep the thread until the first command needs to run. Run that command. Re-sort the list by next command to run. Sleep. Repeat.
Don't use multiple threads. If each command's interval is a minute or a few minutes, you probably don't need to use threads at all. You can simplify by doing everything on the same thread.
Error handling. This kind of thing needs extensive error handling to make sure a problem in one command doesn't make the whole loop fail, and so you can debug a problem when it occurs. You may also want to decide whether a command should be retried immediately on error, wait until its next scheduled run, or even be delayed more than normal. You may also want to avoid logging an error in a command if the error happens every time (an error in a command that runs often can easily create huge log files).
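Put together, the suggestions above (one thread, a list sorted by next run time, sleep until the first command is due, catch errors per command) could be sketched roughly as follows. ScheduledCommand and Heartbeat are invented names, the list is assumed to be non-empty, and the error handling is reduced to a single log line.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class ScheduledCommand
{
    public string Name;
    public TimeSpan Interval;
    public Action Execute;
    public DateTime NextRun;
}

public static class Heartbeat
{
    // Runs the earliest command if it is due, reschedules it, and returns
    // how long the caller should sleep before the next command is due.
    public static TimeSpan RunNext(List<ScheduledCommand> commands, DateTime now)
    {
        commands.Sort((a, b) => a.NextRun.CompareTo(b.NextRun));
        ScheduledCommand next = commands[0];
        if (next.NextRun <= now)
        {
            try
            {
                next.Execute();
            }
            catch (Exception ex)
            {
                // One failing command must not take the whole loop down.
                Console.Error.WriteLine("{0} failed: {1}", next.Name, ex.Message);
            }
            next.NextRun = now + next.Interval;
            commands.Sort((a, b) => a.NextRun.CompareTo(b.NextRun));
        }
        TimeSpan wait = commands[0].NextRun - now;
        return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
    }

    // The single-threaded loop: run what is due, then sleep until the next one.
    public static void Run(List<ScheduledCommand> commands, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            TimeSpan wait = RunNext(commands, DateTime.UtcNow);
            token.WaitHandle.WaitOne(wait); // wakes early if cancellation is requested
        }
    }
}
```

Passing the current time into RunNext keeps the scheduling logic testable without real waiting; only Run touches the clock.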
Instead of writing everything from scratch, you could choose to build your application using a framework that handles all of the scheduling and threading for you. The open-source library NCron is designed for exactly this purpose, and it is very easy to use.
Define your job like this:
class MyFirstJob : CronJob
{
    public override void Execute()
    {
        // Put your logic here.
    }
}
And create a main entry point for your application including scheduling setup like this:
class Program
{
    static void Main(string[] args)
    {
        Bootstrap.Init(args, ServiceSetup);
    }
    static void ServiceSetup(SchedulingService service)
    {
        service.Hourly().Run<MyFirstJob>();
        service.Daily().Run<MySecondJob>();
    }
}
This is all the code you will need to write if you choose to go down this path. You also get the option to do more complex schedules or dependency injection if needed, and logging is included out-of-the-box.
Disclaimer: I am the lead programmer on NCron, so I might just be a tad biased! ;-)
I would make all your Command classes immutable to ensure that you don't have to worry about changes to state.
Nowadays 'Parallel Extensions' from Microsoft is a viable option for writing concurrent code or doing any thread-related tasks. It provides a good abstraction on top of the thread pool and system threads, so you need not think in an imperative manner to get the task done.
In my opinion, consider using it. By the way, your code is clean.
Thanks.
The running variable will need to be marked as volatile if its state is going to be changed by another thread.
As to the suitability, why not just use a Timer?
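A timer-based version might look like the sketch below: one System.Threading.Timer per command, with a guard so that a slow run is not overlapped by the next tick. PeriodicJob is a made-up wrapper, and skipping overlapping ticks is a design assumption rather than something the question asks for.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper around System.Threading.Timer.
public sealed class PeriodicJob : IDisposable
{
    private readonly Timer _timer;
    private int _running; // 0 = idle, 1 = a run is in progress

    public PeriodicJob(Action work, TimeSpan interval)
    {
        // First run immediately, then once per interval.
        _timer = new Timer(_ =>
        {
            // Skip this tick if the previous run has not finished yet.
            if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
                return;
            try
            {
                work();
            }
            finally
            {
                Interlocked.Exchange(ref _running, 0);
            }
        }, null, TimeSpan.Zero, interval);
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}
```

Timer callbacks run on thread-pool threads, so the earlier advice about long-running work and error handling still applies: an exception that escapes work() would be unhandled on a pool thread.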
