I currently have a C# console app where multiple instances run at the same time. The app accesses values in a database and processes them. While a row is being processed, it is flagged so that no other instance attempts to process it at the same time. My question is: what is an efficient and graceful way to unflag those rows in the event that an instance of the program crashes? If an instance crashed, I would want to unflag only the rows that were being processed by that instance of the program.
Thanks
The potential solution will depend heavily on how you start the console applications.
In our case, the applications are started based on configuration records in the database. When one of these applications locks a record, it tags the lock with the primary key of its own configuration record.
When the application starts up, the first thing it does is release all locks on the records that it previously locked.
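A minimal sketch of that startup cleanup, assuming a WorkItems table whose LockedBy column holds the locking instance's configuration key (all names here are hypothetical):

    using System.Data.SqlClient;

    // On startup, release any locks left behind by a previous run of this
    // instance. Table and column names are assumptions.
    static void ReleaseOwnLocks(string connectionString, int instanceKey)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "UPDATE WorkItems SET LockedBy = NULL WHERE LockedBy = @key", conn))
        {
            cmd.Parameters.AddWithValue("@key", instanceKey);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }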
To control all of the child processes, we have a service that uses the information from the configuration tables to start the processes and then keeps an eye on them, restarting them when they fail.
Each of the processes is also responsible for updating a status table in the database with the last time it checked in, with a maximum allowed delay of 2 minutes (to allow for heavy processing). This status table is used by sysadmins to watch for problems, but it could also be used to manually release locks in case of a repeated failure in a given process.
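A sketch of that heartbeat, assuming a hypothetical InstanceStatus table; each process stamps its row well inside the 2-minute allowance:

    using System;
    using System.Data.SqlClient;
    using System.Threading;

    // Stamp this instance's status row every 30 seconds so a monitor (or a
    // sysadmin) can spot processes that have gone quiet for too long.
    static Timer StartHeartbeat(string connectionString, int instanceKey)
    {
        return new Timer(_ =>
        {
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(
                "UPDATE InstanceStatus SET LastSeenUtc = GETUTCDATE() " +
                "WHERE InstanceKey = @key", conn))
            {
                cmd.Parameters.AddWithValue("@key", instanceKey);
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }, null, TimeSpan.Zero, TimeSpan.FromSeconds(30));
    }

Keep a reference to the returned Timer alive for the life of the process, or it will be garbage collected and the heartbeat will stop.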
If you don't have a structured approach like this, it can be very difficult to unlock records automatically; you would need a solid profile of your application's performance to be able to say, for example, that any lock more than 5 minutes old must be invalid because a record takes on average 15 seconds to process, with a maximum of 2 minutes.
To be able to handle any kind of crash, even a power failure, I would suggest additionally timestamping the records and, after some reasonable timeout, treating them as unlocked even if they are still flagged.
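A sketch of claiming work under that scheme, reusing the hypothetical WorkItems table from above with an added LockedAtUtc column; a row is claimable if it is unlocked or its lock is older than the timeout (the 5 minutes here is an arbitrary assumption):

    using System.Data.SqlClient;

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        @"UPDATE TOP (1) WorkItems
          SET LockedBy = @key, LockedAtUtc = GETUTCDATE()
          OUTPUT inserted.Id
          WHERE LockedBy IS NULL
             OR LockedAtUtc < DATEADD(MINUTE, -5, GETUTCDATE())", conn))
    {
        cmd.Parameters.AddWithValue("@key", instanceKey);
        conn.Open();
        object claimedId = cmd.ExecuteScalar(); // null when nothing is claimable
    }

Because the UPDATE is a single atomic statement, two instances can never claim the same row, and a crashed instance's rows become claimable again on their own once the timeout passes.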
Ok, I guess I need to make myself a bit clearer, sorry.
I DO NOT have a problem with the app itself. It runs fine as a single instance or as multiple instances; we have mechanisms built in to prevent cross-instance interference, file/record blocking, etc.
The issue is that sometimes, when running several instances (someone goes and clicks on myapp.exe several times manually), one or more instances can crash, be it from bad data in the file, a lost DB connection, whatever. I am still trying to track down the sources of some of the unexplained crashes and hung instances.
What I am looking for is a way to set up a monitoring process that can:
A. identify each running instance of the app as a separate entity;
B. check whether a given instance is running rather than hung or crashed; this may be helped by me changing the code to force the app to quit when a fatal error is detected (an "I crashed and cannot recover, so I quit" kind of setup);
C. start additional instances up to the desired total count. That is, I want a minimum number of instances running at the same time; if the current count is lower, new instances are started until the desired count is reached.
What would be the best approach to set this up?
I have a custom in-house application written with a mix of C++ and C# APIs.
It is a file-processing parser for an EDI process: it takes data from flat files, or from flat-file representations in a DB table (we have other modules that store the contents of a flat file in a DB varchar field), and processes and stores the data in the appropriate DB tables.
It works fine for the most part, but several clients need to run multiple instances of it to speed up processing.
The problem is that when the app crashes, there is no automatic way I know of to identify the instance that crashed, shut it down, and restart it.
I need help identifying my options here:
how can I run the same EXE multiple times, yet monitor and manage each instance in the event one misbehaves?
A code change is possible, but C++ is not my main language, so any extensive changes would be a nightmare for me.
PS: the app is a mix of C++ and C# modules with an MSSQL DB backend.
thanks.
The process is monitored with MSSQL stored procedures; an issue can be identified by a lack of new records being processed, among other factors.
My needs are to be able to:
1. start several instances of an app at the same time;
2. monitor each instance of the running EXE and, if a crash is detected, kill and restart that one instance only. I have a kill switch in the app that can shut it down gracefully, but I need a way to monitor and restart it, and only it (a rough watchdog sketch follows).
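A rough sketch of such a watchdog, covering A and C; the EXE path and instance count are assumptions. For B, outright crashes show up as exited processes, while hang detection would need an extra liveness signal, such as the status checks your SPs already do:

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Threading;

    class Watchdog
    {
        static void Main()
        {
            const string exePath = @"C:\apps\myapp.exe"; // assumption
            const int desiredCount = 4;                  // assumption
            var instances = new List<Process>();

            while (true)
            {
                // A/B: each Process object tracks one instance; HasExited
                // catches crashed or gracefully self-terminated ones.
                instances.RemoveAll(p => p.HasExited);

                // C: top the pool back up to the desired count.
                while (instances.Count < desiredCount)
                    instances.Add(Process.Start(exePath));

                Thread.Sleep(5000);
            }
        }
    }

This needs no changes to the monitored app itself, as long as the app exits on fatal errors (your kill switch) rather than hanging.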
I've run into this a few times recently at work, where we have to develop an application that completes a series of items on a schedule. Sometimes this schedule is configurable by the end user; other times it's set in a config file. Either way, this task is something that should only be executed once, by a single machine. That isn't generally difficult until you introduce the need for SOA/geo-redundancy. In this particular case there are a total of 4 (could be 400) instances of this application running, two in each data center on opposite sides of the US.
I'm investigating successful patterns for this sort of thing. My current solution has each physical location determining whether it should be active or dormant. We do this by checking a Session object that is maintained on another server. If DataCenter A is the live setup, then the logic auto-magically prevents the instances in DataCenter B from performing any execution. (We don't want the work to traverse the MPLS between DCs.)
The two remaining instances in DC A will then query the database for any jobs that need to be executed in the next 3 hours and cache them. A separate timer runs every second checking for jobs that need to be executed.
If it finds one, it first executes a stored procedure that forces a full table lock, queries for the job that needs to be executed, and checks the "StartedByInstance" column for a value. If it doesn't find a value, it marks that record as being executed by InstanceX, and only then does it actually execute the job.
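In code terms, the claim step boils down to something like this sketch (names are placeholders, and a single atomic UPDATE stands in for the stored procedure; TABLOCKX mimics the full table lock, though a row-level lock would normally suffice):

    using System.Data.SqlClient;

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        @"UPDATE Jobs WITH (TABLOCKX)
          SET StartedByInstance = @instance
          WHERE JobId = @jobId AND StartedByInstance IS NULL", conn))
    {
        cmd.Parameters.AddWithValue("@instance", instanceName);
        cmd.Parameters.AddWithValue("@jobId", jobId);
        conn.Open();
        bool claimed = cmd.ExecuteNonQuery() == 1; // execute the job only if true
    }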
My direct questions are:
Is this a good pattern?
Are there any better patterns?
Are there any libraries/apis that would be of interest?
Thanks!
So I am creating a customer indexing program for my company, and I have basically everything coded and working, except that I want the indexing program to watch the user-specified indexed directories and update the underlying data store on the fly, to help eliminate the need for frequent full indexing.
I coded everything in WPF/C# with an underlying SQLite database, and I am sure the folder watchers would work well under non-heavy loads. The problem is that we use TortoiseSVN, and when the user does an SVN update or similar, it creates a heavy file load that the FileSystemWatcher and SQLite updates just can't keep up with (even with the max buffer size). Basically, I am doing a database insert every time a watcher event fires.
So my main question is...does anyone have suggestions on how to implement this file watcher to handle such heavy loads?
Some thoughts I had were: (1) create a staging collection for all the queries and use a timer and a thread to insert the data later on; (2) write the queries to a file and use a timer thread later on for the insert.
Help....
You want to buffer the data received from your file watch events in memory. When receiving events from your registered file watchers, accumulate them in memory as fast as possible during a burst of activity. Then, on a separate process or thread, read them from the in-memory buffer and do the persistent storage or whatever the more time-intensive processing is.
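A minimal sketch of that split, assuming a BlockingCollection as the in-memory buffer and an InsertIntoIndex method standing in for the existing SQLite insert logic:

    using System.Collections.Concurrent;
    using System.IO;
    using System.Threading.Tasks;

    var pending = new BlockingCollection<FileSystemEventArgs>();

    var watcher = new FileSystemWatcher(@"C:\indexed\dir"); // assumption
    watcher.Created += (s, e) => pending.Add(e);            // cheap: just enqueue
    watcher.Changed += (s, e) => pending.Add(e);
    watcher.EnableRaisingEvents = true;

    // Drain on a background thread, keeping the watcher callbacks fast so
    // its internal buffer doesn't overflow during bursts.
    Task.Run(() =>
    {
        foreach (var e in pending.GetConsumingEnumerable())
            InsertIntoIndex(e.FullPath); // your existing database insert
    });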
You can use a queue to take in all the requests. I have had good experience with the MS MessageQueue, which comes out of the box and is quite easy to use.
see http://www.c-sharpcorner.com/UploadFile/rajkpt/101262007012217AM/1.aspx
Then have a separate worker thread which grabs a predefined number of elements from the queue and inserts them into the database. Here I'd suggest merging the single inserts into one bulk insert.
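Whether the queue is MSMQ or just an in-process one, the worker's drain step might look like this sketch, which writes up to one batch of paths inside a single SQLite transaction (the FileIndex table and all names are assumptions):

    using System.Collections.Concurrent;
    using System.Data;
    using System.Data.SQLite;

    static void DrainBatch(ConcurrentQueue<string> queue,
                           SQLiteConnection conn, int batchSize)
    {
        using (var tx = conn.BeginTransaction())
        using (var cmd = new SQLiteCommand(
            "INSERT INTO FileIndex (Path) VALUES (@path)", conn, tx))
        {
            var param = cmd.Parameters.Add("@path", DbType.String);
            string path;
            // One transaction around the whole batch: one disk sync
            // instead of one per insert.
            for (int i = 0; i < batchSize && queue.TryDequeue(out path); i++)
            {
                param.Value = path;
                cmd.ExecuteNonQuery();
            }
            tx.Commit();
        }
    }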
If you want to be 100% sure, you can check CPU and I/O load before making the inserts.
Here's a code snippet for determining how much CPU time the current process has used (note that TotalProcessorTime is an instance property, so you need a Process object):

    Process.GetCurrentProcess().TotalProcessorTime.TotalMilliseconds / Environment.ProcessorCount
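Note that this figure is cumulative CPU time, not current utilization; to get an actual percentage you have to sample the counter twice over an interval, roughly like this:

    using System;
    using System.Diagnostics;
    using System.Threading;

    static double CpuUtilization(TimeSpan window)
    {
        var proc = Process.GetCurrentProcess();
        TimeSpan before = proc.TotalProcessorTime;
        Thread.Sleep(window);
        proc.Refresh();                    // re-read the process counters
        TimeSpan after = proc.TotalProcessorTime;

        // CPU time used divided by the wall time available on all cores.
        return (after - before).TotalMilliseconds
             / (window.TotalMilliseconds * Environment.ProcessorCount);
    }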
The easiest approach is to have each update kick off a timer (say, one minute). If another update comes in in the meantime, you queue the change and restart the timer. Only when a minute has gone by without activity do you start processing.
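A sketch of that debounce using System.Timers.Timer; QueueChange and ProcessQueuedChanges are placeholders for your own staging and insert logic:

    using System.IO;
    using System.Timers;

    var quietPeriod = new Timer(60000) { AutoReset = false }; // one minute
    quietPeriod.Elapsed += (s, e) => ProcessQueuedChanges();  // batch work here

    void OnFileChanged(object sender, FileSystemEventArgs e)
    {
        QueueChange(e.FullPath); // remember the change
        quietPeriod.Stop();      // any new event restarts the countdown
        quietPeriod.Start();
    }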
I'm developing a system that isn't real-time, but there's an intervening standalone server between the end-user machines and the database. The idea is that instead of burdening the database server every time a user sends something up, a Windows service on the database machine sweeps the relay server at regular intervals, updates the database, and deletes the temporary files on the relay box.
There is a scenario where the client software installed on thousands of machines sends up information at nearly the same time. This scenario won't occur often, but it could occur once every other week.
For each machine, 24 bytes of data (4 KB on disk) are written on the relay server, which we then want to pick up and use to update the database. So although it's fine while the user base is only a few thousand, it may grow to millions over time.
I was thinking of a batch operation that picks up only some 15,000 - 20,000 files at a time and runs at a configurable interval (amendable from app.config). The problem is that if the user base grows to a few million, that will take days to complete. Yes, it doesn't have to be real-time information, but waiting days for all the data to reach the database isn't ideal either.
I think there will always be a bottleneck if the relay box is hammered, but are there better ways to improve performance and get the data across in a reasonable time (a day, two tops)?
Regards,
F.
To avoid hammering the disk, you might consider having only one thread read the files and hand processing off to multiple threads that write to the database, returning to the disk thread to delete the files after each commit. The number of DB threads could be "amendable from app.config" to find the best value for your hardware configuration.
Just my 2 cents to get you thinking.
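A rough sketch of that pipeline: a single disk thread reads (and later deletes) files, while several tasks do the database writes; WriteToDatabase and the folder path are assumptions:

    using System;
    using System.Collections.Concurrent;
    using System.IO;
    using System.Linq;
    using System.Threading.Tasks;

    string relayFolder = @"D:\relay";  // assumption
    int dbThreads = 4;                 // "amendable from app.config"

    var toWrite = new BlockingCollection<Tuple<string, byte[]>>(1000);
    var toDelete = new BlockingCollection<string>();

    // N database writers; each hands a path back only after its commit.
    var writers = Enumerable.Range(0, dbThreads).Select(_ => Task.Run(() =>
    {
        foreach (var item in toWrite.GetConsumingEnumerable())
        {
            WriteToDatabase(item.Item2); // hypothetical: insert and commit
            toDelete.Add(item.Item1);
        }
    })).ToArray();

    Task.WhenAll(writers).ContinueWith(_ => toDelete.CompleteAdding());

    // The lone disk thread: enqueue reads, then clean up committed files.
    foreach (var path in Directory.EnumerateFiles(relayFolder))
        toWrite.Add(Tuple.Create(path, File.ReadAllBytes(path)));
    toWrite.CompleteAdding();

    foreach (var path in toDelete.GetConsumingEnumerable())
        File.Delete(path);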
There is a multi-threaded batch processing program that creates multiple worker threads to process each batch.
Now, to scale the application to handle 100 million records, we need a server farm to do the processing of each batch. Is there native support in C# for handling requests running on a server farm? Any thoughts on how to set up the C# executable to work with this setup?
You can either create a manager that distributes the work, as fejesjoco said, or you can make your apps smart enough to grab only a certain number of units of work to process. When they have completed processing those units, have them contact the DB server to get the next batch. Rinse and repeat until done.
As a side note, most distributed worker systems run like this:
1. Work is queued on the server in batches.
2. Worker processes check in with the server to get a batch to operate on; the available batch is marked as being processed by that worker (a checkout sketch follows this list).
3. (optional) Worker processes check back in with the server with a status report (e.g. 10% done, 20% done, etc.).
4. Worker process completes the work and submits the results.
5. Go to step 2.
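A sketch of step 2, the checkout, assuming SQL Server and a hypothetical Records table: one atomic statement marks a slice of pending rows with this worker's id and returns them in the same round trip, so two workers can never grab the same batch:

    using System.Data.SqlClient;

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        @"UPDATE TOP (@batchSize) Records
          SET WorkerId = @workerId, CheckedOutUtc = GETUTCDATE()
          OUTPUT inserted.*
          WHERE WorkerId IS NULL", conn))
    {
        cmd.Parameters.AddWithValue("@batchSize", 500); // assumption
        cmd.Parameters.AddWithValue("@workerId", workerId);
        conn.Open();
        using (var reader = cmd.ExecuteReader())
            while (reader.Read())
                ProcessRecord(reader); // hypothetical per-record work
    }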
Another option is to have 3 workers process the exact same data set. This lets you compare results: if 2 or more produce identical results, you accept those results; if all 3 differ, you know there is a problem and you need to inspect the data/code. Usually this only matters when the workers are outside of your control (like SETI) or you are running massive calculations and want to correct for potential hardware issues.
Sometimes there is a management app which displays the current number of workers and progress through the entire set. If you know roughly how long an individual batch takes, you can detect when a worker has died and let a new process pick up the same batch.
This allows you to add or remove as many individual workers as you want without having to recode anything.
I don't think there's built-in support for clustering. In the simplest case, you might try creating a simple manager application that divides the input among the servers; your processes won't need to know about each other, so nothing needs to be rewritten.
Why not deploy the app using a distributed framework? I'd recommend the CloudIQ Platform. You can use the platform to distribute your code to any number of servers. It also handles load balancing, so you would only need to submit your jobs to the framework, and it will handle job distribution to the individual machines. It also monitors application execution, so if one of the machines suffers a failure, the jobs running there will be restarted on another machine in the group.
Check out the Community link for downloads, forums, etc.