So we have a bunch of different processes set up in code right now. We have a framework around this with a couple of classes that control when these pieces of code are kicked off, where they log to, which other processes they depend on, etc.
The way we currently work this is that all of these processes inherit from a base one that contains parameters, a Validate() method, and a Start() method.
I'd like to redo this. Right now the code is very difficult to deal with. I think each process in and of itself is set up fine, but I would like to know if there are any frameworks anyone has used to set up what is basically just a scheduler that kicks off certain processes at different times throughout the day.
Each process should have the ability to depend on another one, have its own set of parameters, a kick-off time, a frequency (daily, ad hoc, etc.), and the ability to log its messages and any exceptions to the UI. The reason we want to keep the interdependence is that a process shouldn't run if one it depends on fails.
Anyone know of a good framework to set something like this up?
Thanks.
You might want to have a look at Quartz.NET. Looking at the project page, it seems to be reasonably active. Disclaimer: I've personally never used it.
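For what it's worth, a rough sketch of what a job and a daily trigger look like in Quartz.NET (this follows the 2.x-style synchronous API from memory; newer versions are async/Task-based, and the job name, parameter and cron schedule below are made up for illustration):

```csharp
using Quartz;
using Quartz.Impl;

// One of your processes, expressed as a Quartz job.
public class NightlyImportJob : IJob
{
    public void Execute(IJobExecutionContext context)
    {
        // read per-job parameters, do the work, log messages/exceptions back to the UI, etc.
        string sourcePath = context.MergedJobDataMap.GetString("sourcePath");
    }
}

public static class SchedulerBootstrap
{
    public static void Start()
    {
        IScheduler scheduler = StdSchedulerFactory.GetDefaultScheduler();
        scheduler.Start();

        IJobDetail job = JobBuilder.Create<NightlyImportJob>()
            .WithIdentity("nightlyImport")
            .UsingJobData("sourcePath", @"C:\feeds")   // per-process parameters
            .Build();

        ITrigger trigger = TriggerBuilder.Create()
            .WithIdentity("nightlyImportTrigger")
            .WithCronSchedule("0 0 2 * * ?")           // kick off daily at 02:00
            .Build();

        scheduler.ScheduleJob(job, trigger);
    }
}
```

One caveat: job-to-job dependencies aren't first-class in Quartz; I believe the usual workarounds are a listener (its JobChainingJobListener) or having one job schedule the next on success.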
I understand that an application domain forms "an isolation boundary for security, versioning, reliability, and unloading of managed code", but so does a process.
Can someone please help me understand the practical benefits of an application domain?
I assumed an app domain provides a container to load one version of an assembly, but recently I discovered that multiple versions of a strong-named assembly can be loaded into a single app domain.
My concept of an application domain is still not clear, and I am struggling to understand why this concept was introduced when processes already exist.
Thank you.
I can't tell if you are talking in general or specifically about .NET's AppDomain.
I am going to assume you mean .NET's AppDomain and explain why it can be really useful when you need that isolation inside of a single process.
For instance:
Say you are dealing with a library that has certain worker classes, and you have no choice but to use those workers and can't modify the code. It's your job to build a Windows service that manages said workers, makes sure they all stay up and running, and runs them in parallel.
Easy enough, right? Well, so you hoped. It turns out your worker library is prone to throwing exceptions, uses static configuration, and is generally just a real PITA.
You could try to launch them each in their own process, but to monitor them you'll need to implement named pipes or try to thoughtfully parse the STDIN and STDOUT of each process.
What else can you do? Well, AppDomain actually solves this. I can spawn an AppDomain for each worker and give each its own configuration; they can't screw each other up by changing static properties because they are isolated, and on top of that, if the library bombs out and I failed to catch the exception, it doesn't bother the workers in their own domains. And during all of this, I can still communicate with those workers easily.
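To make that concrete, here's a minimal sketch of the pattern (classic .NET Framework only, since you can't create AppDomains on .NET Core/5+; the worker type and config file name are made up for illustration):

```csharp
using System;

// The host must derive from MarshalByRefObject so calls from the main domain
// are proxied across the AppDomain boundary instead of the object being copied.
public class WorkerHost : MarshalByRefObject
{
    public void Run()
    {
        // drive the problematic worker library here; any static state it
        // mutates stays inside this AppDomain
    }
}

public static class Program
{
    public static void Main()
    {
        var setup = new AppDomainSetup
        {
            ConfigurationFile = "worker1.config"   // hypothetical per-worker config
        };

        AppDomain domain = AppDomain.CreateDomain("Worker1", null, setup);

        var host = (WorkerHost)domain.CreateInstanceAndUnwrap(
            typeof(WorkerHost).Assembly.FullName,
            typeof(WorkerHost).FullName);

        try
        {
            host.Run();
        }
        catch (Exception ex)
        {
            // the library bombing out inside its own domain can be handled here
            // without disturbing the other workers or their static state
            Console.Error.WriteLine("Worker1 failed: " + ex.Message);
        }
    }
}
```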
Sadly, I have had to do this before.
EDIT: Started to write this as a comment response, but it got too large.
Individual processes can work great in many scenarios, however, there are just times where they can become a pain. I am not saying one should use an AppDomain over another process. I think it's uncommon you would need a separate process or AppDomain, but once you need it, you'll definitely know.
The main problem I see with processes in the scenario I've given above is that processes have their own downfalls that are easier to mitigate with the AppDomain.
A process can go rogue, become unresponsive, and crash or be killed at any point.
If you're managing processes, you need to keep track of the process ID and monitor the status of it. IPCs are great, but it does take time to get proper communication going back and forth as needed.
As an example, let's say your process just dies. Depending on the mechanism you chose for monitoring, maybe the communication thread died, or perhaps the work finished and you still show it as "processing". What do you do?
Now what happens when you have 20 processes and your management app dies? You don't have any real information; all you have is 20 instances of "myprocess.exe", and you may now have to start parsing the command-line arguments they were started with to see which workers you actually have. Obviously with an AppDomain all 20 would have died too, but did you really gain anything with the processes? You still have to code the ability to recover; however, now you also have to code all of the recovery for your processes instead of just firing the workers back up.
As with anything in programming, there's 1,000 different ways to achieve the same goal. It's up to you to decide which solution you feel is most appropriate.
Some practical benefits of using app domains:
Multiple app domains can run in a single process. You can also stop an individual app domain without stopping the entire process. This alone drastically increases server scalability.
Managing the app domain life cycle is done programmatically by runtime hosts (you can override it as well). For processes and threads, you have to manage the life cycle explicitly. Initialization, execution, termination, and inter-process/multithread communication are complex, which is why it's easier to defer that to CLR management.
Source: https://learn.microsoft.com/en-us/dotnet/framework/app-domains/application-domains
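For example, under the .NET Framework (AppDomain creation isn't supported on .NET Core/5+), stopping one app domain while the process and its siblings keep running looks roughly like this:

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        // several app domains living side by side in a single process
        AppDomain first = AppDomain.CreateDomain("TenantA");
        AppDomain second = AppDomain.CreateDomain("TenantB");

        // ... load and run code in each domain ...

        // Tear down only one of them; the process and the other domain survive.
        // Killing a process, by contrast, takes everything down with it.
        AppDomain.Unload(first);

        Console.WriteLine(second.FriendlyName + " is still running.");
    }
}
```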
I have a moderately sized, highly tangled, messy, big ball-of-mud project that I want to refactor. It is about 50k lines of code and I am the only maintainer. One class is 10k LOC long with line widths of 400 characters. Features are still being added, albeit slowly. I feel good about the refactoring part. Ripping apart code is very comfortable to me. However, getting a set of tests around the application to ensure that I don't break it seems daunting.

Huge object graphs and mocking out all the databases will be real buggers. I probably could get away with doing massive refactoring without breaking things horribly. But prudence dictates some level of tests to ensure at least some level of safety. At the same time, I don't want to spend any more than a minimal amount of time getting a set of "behavior lock-down" tests around the code. I fully intend to add a full set of unit tests once I get things a bit decoupled. A rewrite is a non-starter.
Does anyone have any pointers or techniques? Has anyone done this sort of thing before?
Mindset
Generally it can be tough going trying to write automated tests (even high-level ones) for applications that were not built with testability in mind.
The best thing is going to be to make sure you are disciplined in writing tests as you refactor (which it sounds like you are intending to be). This will slowly turn that ball of code into an elegant, dancing unicorn of well-encapsulated, testable classes.
Suggestions
Start by creating some manual high-level tests (e.g. the user goes to page one, clicks on the red button, then a textbox appears...) to have a starting point. Depending on the technology the app is built in, there are a few frameworks out there that can help automate these high-level (often UI-driven) tests:
For web apps Selenium is a great choice, for WPF apps you can use the UI Automation framework, and for other applications, AutoIt, while a bit rudimentary, can be a life saver.
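As an example, a Selenium smoke test for the "click the red button, a textbox appears" scenario looks roughly like this (the URL and element ids are made up, and you'd normally run it from a test framework rather than Main):

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

public static class SmokeTest
{
    public static void Main()
    {
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl("http://localhost/myapp/page-one");

            // user clicks the red button...
            driver.FindElement(By.Id("redButton")).Click();

            // ...then a textbox should appear
            bool textBoxShown = driver.FindElement(By.Id("resultTextBox")).Displayed;
            Console.WriteLine(textBoxShown ? "PASS" : "FAIL");
        }
    }
}
```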
Here's how I do it in C++ (and C).
Create a directory to house the tests, cd to this directory
Create a directory to house mock objects, say mock-objs.
Create a makefile to compile all object files of interest.
Add necessary include directories or mock .h files to make all object files compile.
Congratulate yourself: you are 90% done.
Add a test harness of your choice (e.g. CppUnit, ATF, Google Test...).
Add a null test case: just start, log, and declare success.
Add necessary libraries and/or mock .c/.cpp files until the link succeeds and the very first test passes. Note: all functions in these .c/.cpp mock files should contain only a primitive to fail the test when called.
Congratulate yourself: you are 99% done.
Add a primitive event scheduler: say, just a list of callbacks, so you can post a request and receive a response from an event callback.
Add a primitive timer: say, a timer wheel, or even a timer list if you need just a few timers.
Write an advance-time function: (a) process all queued events, (b) increment the current time to the next tick, (c) expire all timers waiting for this tick, (d) process all queued events again until none are left, (e) if the end of the time advance has not been reached, go to step (b). (A sketch of this loop follows the list.)
Congratulate yourself: now you can add tests with relative ease: add a test case, modify mock functions as required, repeat.
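A minimal sketch of that advance-time loop (written in C# here only to keep this page's snippets in one language; the shape is the same in C/C++, and the queue/timer representation and tick granularity are assumptions):

```csharp
using System;
using System.Collections.Generic;

public class SimClock
{
    private readonly Queue<Action> events = new Queue<Action>();                // queued event callbacks
    private readonly List<Tuple<long, Action>> timers = new List<Tuple<long, Action>>(); // (expiryTick, callback)

    public long CurrentTick { get; private set; }

    public void Post(Action callback) { events.Enqueue(callback); }

    public void StartTimer(long ticksFromNow, Action callback)
    {
        timers.Add(Tuple.Create(CurrentTick + ticksFromNow, callback));
    }

    // Steps (a)-(e) from the recipe above.
    public void AdvanceTime(long untilTick)
    {
        DrainEvents();                                     // (a) process all queued events
        while (CurrentTick < untilTick)                    // (e) repeat until the target time is reached
        {
            CurrentTick++;                                 // (b) increment current time to the next tick
            foreach (var timer in timers.FindAll(t => t.Item1 <= CurrentTick))
            {
                timers.Remove(timer);
                timer.Item2();                             // (c) expire all timers waiting for this tick
            }
            DrainEvents();                                 // (d) process queued events until none are left
        }
    }

    private void DrainEvents()
    {
        while (events.Count > 0) events.Dequeue()();
    }
}
```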
I have an NHibernate MVC application that is using ReadCommitted Isolation.
On the site, there is a certain process that the user can initiate which, depending on the input, may take several minutes. Because the session is per request, it stays open that entire time.
But while that runs, no other user can access the site (they can try, but their request won't go through until the long-running operation is finished).
What's more, I also need a console app that performs this same long-running function while connecting to the same database, and it causes the same issue.
I'm not sure what part of my setup is wrong, any feedback would be appreciated.
NHibernate is set up with fluent configuration and StructureMap.
Isolation level is set as ReadCommitted.
The session factory lifecycle is HybridLifeCycle (which on the web should be Session per request, but on the win console app would be ThreadLocal)
It sounds like your requests are waiting on database locks. Your options are really:
Break the long running process into a series of smaller transactions.
Use ReadUncommitted isolation level most of the time (this is appropriate in a lot of use cases).
Judicious use of the Snapshot isolation level (assuming you're using MS SQL 2005 or later).
(N.B. I'm assuming the long-running function does a lot of reads/writes and the requests being blocked are primarily doing reads.)
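For the snapshot option, NHibernate lets you pick the isolation level per transaction. A rough sketch (the Order entity and reader class are made up, and SQL Server needs ALLOW_SNAPSHOT_ISOLATION enabled on the database for this to work):

```csharp
using System.Collections.Generic;
using System.Data;
using NHibernate;

public class Order
{
    public virtual int Id { get; set; }
}

public class OrderReader
{
    private readonly ISessionFactory sessionFactory;

    public OrderReader(ISessionFactory sessionFactory)
    {
        this.sessionFactory = sessionFactory;
    }

    public IList<Order> LoadOrders()
    {
        using (ISession session = sessionFactory.OpenSession())
        // Read-mostly work runs under Snapshot so it reads a consistent version
        // of the rows instead of blocking behind the long-running writer.
        using (ITransaction tx = session.BeginTransaction(IsolationLevel.Snapshot))
        {
            IList<Order> orders = session.CreateCriteria<Order>().List<Order>();
            tx.Commit();
            return orders;
        }
    }
}
```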
As has been suggested, breaking your process down into multiple smaller transactions will probably be the solution.
I would suggest looking at something like Rhino Service Bus or NServiceBus (my preference is Rhino Service Bus; I personally find it much simpler to work with). What that allows you to do is break the functionality down into small chunks while maintaining the transactional nature. Essentially, with a service bus you send a message to initiate a piece of work; the work is enlisted in a distributed transaction along with receiving the message, so if something goes wrong the message doesn't just disappear and leave your system in a potentially inconsistent state.
Depending on what you need to do, you could send an initial message to start the processing and then, after each step, send a new message to initiate the next step. This can really help to break the transactions down into much smaller pieces of work (and simplify the code). The two service buses I mentioned (there is also MassTransit) also have things like retries and error handling built in, so if something goes wrong the message ends up in an error queue; you can investigate what went wrong, hopefully fix it, and reprocess the message, thus ensuring your system remains consistent.
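The rough shape this takes, regardless of which bus you pick (the IBus interface, message types and handler below are illustrative, not the exact Rhino Service Bus or NServiceBus API):

```csharp
// Each message represents one small, independently-transactional step.
public class BeginImport    { public int BatchId { get; set; } }
public class TransformBatch { public int BatchId { get; set; } }

// Illustrative bus abstraction; the real buses expose a similar send/handle pair.
public interface IBus
{
    void Send(object message);
}

public class BeginImportHandler
{
    private readonly IBus bus;

    public BeginImportHandler(IBus bus)
    {
        this.bus = bus;
    }

    // Do one small unit of work (enlisted in the transaction that received the
    // message), then send the message that kicks off the next step.
    public void Handle(BeginImport message)
    {
        // ... stage the batch ...
        bus.Send(new TransformBatch { BatchId = message.BatchId });
    }
}
```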
Of course whether this is necessary depends on the requirements of your system :)
Another, more complex solution would be:
You build a background robot application which runs on one of the machines.
This background worker robot can receive "worker jobs" (the ones initiated by the user).
Then the robot processes the jobs step by step in the background.
Pitfalls are:
- you have to program this robot to be very stable
- you need to watch the robot somehow
Sure, this involves more work; on the flip side you have the option to integrate more job types, enabling your system to process different things in the background.
I think the design of your application/SQL statements has a problem. Unless you are Facebook, I don't think any process should take this long; it is better to review your design and check where the bottleneck is, instead of trying to keep this long-running process going.
Also, sometimes an ORM is not a good fit for every scenario. Did you try using stored procedures?
I have a process that I need to make into a service. This process runs autonomously right now, so there are no concerns with user interaction; I just need to "turn" it into a service. I got to thinking about it and decided that I could just create a service that launches the process. This would give me the added benefit of having outside control of the process: I could watch it for an unexpected exit and re-launch it, and I could also watch its memory usage and kill it if it gets out of hand. I don't think I have seen many other applications do this, and I was thinking there must be a reason why, so...
It's going to add complexity.
Instead of just having the process exist, you'll now need to make a second executable to "launch and monitor" this process. This adds overhead (the service and process both running), adds complexity, and makes life as a whole a bit more difficult.
That being said, if you've got a .NET console application, turning it into a service is incredibly trivial. Your Main routine basically just gets moved into a method and launched in a thread. Once you do that, the service application is effectively done: it's just a matter of configuring the service (which can be done in a designer) and overriding OnStart to spin up a thread and call your routine.
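Roughly, the shell ends up looking like this (MyApp.Run/RequestStop stand in for whatever your console app's Main currently does; installing the service via an installer class or sc.exe is a separate step):

```csharp
using System.ServiceProcess;
using System.Threading;

public class WorkerService : ServiceBase
{
    private Thread workerThread;

    protected override void OnStart(string[] args)
    {
        // what used to be the console app's Main, moved into a method
        workerThread = new Thread(MyApp.Run) { IsBackground = true };
        workerThread.Start();
    }

    protected override void OnStop()
    {
        MyApp.RequestStop();      // ask the loop to finish
        workerThread.Join(5000);  // give it a few seconds to wind down
    }

    public static void Main()
    {
        ServiceBase.Run(new WorkerService());
    }
}

// Placeholder for your existing logic.
public static class MyApp
{
    private static volatile bool stop;
    public static void Run() { while (!stop) { /* the real work */ Thread.Sleep(1000); } }
    public static void RequestStop() { stop = true; }
}
```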
This is a good idea, but you've reinvented the wheel. What you're thinking of is essentially server monitoring. There are several high-quality open source implementations of what you want.
Pretty much anything that you can do this way you can do with less complexity by just putting the application logic in the service. Not to mention that you get Service Recovery for free by doing it in the service directly.
I need a method to run every so often that does some database processing. However, I may need it to be triggerable by an admin on the site. But I don't want this method being run more than once at the same time, as this could cause issues with the way it hits the database.
For example, could I...
Create a singleton class that runs the method on a timer, and instantiate it in the global.asax file. Then, since it's a singleton, I can call it from my normal .aspx pages and invoke the method whenever I want. I would probably need to use the lock keyword in C# to check whether the method is already running.
I heard some talk lately that Singletons are "evil", but this seems like the perfect fit for it. What do you think? Thanks in advance.
Timers and locks (that are intended to synchronize access to the database) are a bad idea on the web; you may have zero, one or many app pools on different servers. They may recycle at any time, and won't be spun up until needed. Basically, this won't prevent you from hammering the db from multiple sources.
Personally, I'd be tempted either to write a service to do this work (db-polling, or via WCF, etc.), or to use the db (a stored procedure or similar): set a flag in a table row to say "in progress", do the work at the db, and clear the flag (duplicate attempts exit immediately while it's in progress).
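A sketch of the flag idea in code (the JobControl table and its column names are made up):

```csharp
using System.Data.SqlClient;

public static class NightlyJob
{
    public static void RunIfNotAlreadyRunning(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // Atomically claim the flag; zero rows affected means another caller is already running.
            var claim = new SqlCommand(
                "UPDATE JobControl SET InProgress = 1 " +
                "WHERE JobName = 'NightlyJob' AND InProgress = 0", conn);
            if (claim.ExecuteNonQuery() == 0)
                return;   // duplicate attempt: exit immediately

            try
            {
                DoTheDatabaseWork(conn);   // the actual processing (placeholder)
            }
            finally
            {
                new SqlCommand(
                    "UPDATE JobControl SET InProgress = 0 " +
                    "WHERE JobName = 'NightlyJob'", conn).ExecuteNonQuery();
            }
        }
    }

    private static void DoTheDatabaseWork(SqlConnection conn) { /* ... */ }
}
```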
I would do it this way:
Build a normal ASP.NET page which does the processing.
Steal/borrow LFSR Consulting's idea of a flag in the DB, which does the work of checking whether the process is currently running.
Use a normal cron job or the Windows Task Scheduler to call the web page on a regular basis.
And singletons aren't evil; they just get abused easily.
Another option, which Joel Spolsky mentioned in one of the SO podcasts (I believe it was #20-something), is to insert an empty Cache object on application start with a certain expiration date, and in the CacheItemRemovedCallback make a call out to a page or do some work, then re-insert the empty cache object.
I'm probably horribly mis-quoting him, so I recommend you listen or look through the transcripts for yourself.
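From memory, the trick looks roughly like this (the interval and method names are made up; note the callback only fires while the app domain is alive, so an idle or recycled site won't run it):

```csharp
using System;
using System.Web;
using System.Web.Caching;

public static class CacheScheduler
{
    private const string DummyKey = "WorkTrigger";   // key name is arbitrary

    // Call this from Application_Start in global.asax.
    public static void Start()
    {
        // Insert a throwaway item that expires in 5 minutes; when it is removed,
        // the callback fires, does the work, and re-inserts the item.
        HttpRuntime.Cache.Insert(
            DummyKey,
            "dummy",
            null,
            DateTime.UtcNow.AddMinutes(5),
            Cache.NoSlidingExpiration,
            CacheItemPriority.NotRemovable,
            OnCacheItemRemoved);
    }

    private static void OnCacheItemRemoved(string key, object value, CacheItemRemovedReason reason)
    {
        DoSomeWork();   // placeholder for the actual processing
        Start();        // schedule the next run
    }

    private static void DoSomeWork() { /* ... */ }
}
```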
What about just setting up a flag in the database and checking that to determine if the job is running or not? Seems simpler IMO.
The canonical way to write a singleton ends up not being thread safe. Especially in a webby environment, where threads needn't even be on the same machine!
If you really want to do a "singleton", think of it as a service that you only ever deploy to one machine. Then use the transactional semantics of your database, as Marc Gravell suggests, to synchronize the locks.
We've done similar things by using a Web Service to do the backend processing, then writing a Desktop App to call it on whatever schedule we need. We can then run that app on a server, or an admin can run it directly from their PC to trigger the job.
Edit: After I saw your revision that you don't want them to run simultaneously: we have usually just controlled that with a database flag, like a few others have said. Nothing fancy, but it gets the job done.
Set an application-wide variable to denote that the process is running. That should be a little easier than storing the variable in the database, right?