.NET - limiting the number of instances of an execution unit - c#

Let's say I have an application written in C# called EquipCtrl.exe which runs as a local process on a PC to control a piece of equipment.
Obviously, I would want only one instance of EquipCtrl to run on each PC. If I had two pieces of equipment to control per PC, I would then limit it to two instances per PC.
The way I have done this is with one of the following:
1. Process name. I name the process EqCtrl, and at start-up it counts the number of processes with the name "EqCtrl".
2. Executable name. At start-up, count the number of processes whose executable name is EquipCtrl.exe.
3. Registry record.
4. SQL Server database record.
To me, process-name or executable-name detection is the simplest and is what I do most (if not all) of the time. But both are susceptible to name clashes. Even if I go further and check the executable path, the limit could be circumvented by copying the executable to another folder.
What is the best way to perform instance limiting in .NET? Why? Is a registry record the best way?

Try a system-wide Semaphore:
http://msdn.microsoft.com/en-us/library/system.threading.semaphore.aspx

Scott Hanselman's Blog has a great post on this...
http://www.hanselman.com/blog/TheWeeklySourceCode31SingleInstanceWinFormsAndMicrosoftVisualBasicdll.aspx

Hmm... when we want to allow only a single instance of an application, we use a named mutex.
That doesn't exactly enable the scenario you desire, though.
Another point: to avoid collisions, in addition to the executable name and directory, use an MD5 hash of the executable file.

You can do this with a counting named mutex/semaphore. Also make sure the name you create starts with the Global\ or Local\ prefix; otherwise remote desktop sessions won't see these named objects and the count will be lost.
http://msdn.microsoft.com/en-us/library/ms682411(VS.85).aspx
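For example, here is a minimal sketch of the counting-semaphore approach. The name "Global\EquipCtrl" and the limit of two instances are assumptions; use whatever fits your deployment.

    using System;
    using System.Threading;

    class Program
    {
        // Hypothetical values: allow at most two instances per machine.
        const int MaxInstances = 2;
        const string SemaphoreName = @"Global\EquipCtrl"; // Global\ so all sessions share it

        static void Main()
        {
            bool createdNew;
            using (var limiter = new Semaphore(MaxInstances, MaxInstances, SemaphoreName, out createdNew))
            {
                // Try to take a slot without blocking; if none are free, the limit is reached.
                if (!limiter.WaitOne(TimeSpan.Zero))
                {
                    Console.Error.WriteLine("The maximum number of instances is already running.");
                    return;
                }

                try
                {
                    Console.WriteLine("Running. Press Enter to exit.");
                    Console.ReadLine();
                }
                finally
                {
                    limiter.Release(); // free the slot for the next instance
                }
            }
        }
    }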

Running and monitoring several instances of a custom C++/C# application automatically, possible?

OK, I guess I need to make myself a bit clearer, sorry.
I DO NOT have a problem with the app itself. It runs single or multiple instances just fine. We have mechanisms built in to prevent cross-instance interference, file/record locking, etc.
The issue is that sometimes, when running several instances (someone goes and clicks on myapp.exe several times manually), one or more instances can crash.
Be it from bad data in the file, a lost DB connection, whatever. I am still trying to figure out the sources of some of the unexplained crashes and hung instances.
What I am looking for is a way to set up a monitoring process that:
A. Can identify each running instance of the app as a separate entity.
B. Can check whether that one instance is running and not hung/crashed. This may be mediated by me changing the code to force the app to quit if a fatal error is detected, as in an "I crashed and cannot recover, so I quit" kind of setup.
C. Can start additional new instances up to the total desired count of running instances. That is, I want a minimum number of instances of the app running at the same time; if the current count is less, start new ones until the desired count is reached.
What would be the best approach to set this up?
I have a custom in-house application written with a mix of C++ and C#.
It is a file-processing parser for an EDI process. It takes data from flat files, or from a flat-file representation in a DB table (we have other modules that store the contents of a flat file in a DB varchar field), and processes and stores the data in the appropriate DB tables.
It works fine for the most part, but several clients need to run multiple instances of it to speed up the process.
The problem is that when the app crashes, there is no automatic way I know of to identify the instance that crashed, shut it down, and restart it.
I need help identifying my options here.
How can I run the same EXE multiple times, yet monitor and manage each instance in case one misbehaves?
A code change is possible, but C++ is not my main language, so any extensive changes will be a nightmare for me.
PS: the app is a mix of C++ and C# modules with an MSSQL DB backend.
Thanks.
The process is monitored with MSSQL stored procedures. An issue can be identified by a lack of new records processed, and by other factors.
My needs are to be able to:
1. Start several instances of an app at the same time.
2. Monitor each running instance of the EXE and, if a crash is detected, kill and restart that one instance (and only that one). I have a kill switch in the app that can shut it down gracefully, but I need a way to monitor and restart it; a rough sketch of one way to do this follows.
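One possible approach is a small watchdog that polls with System.Diagnostics.Process. This is only a sketch: the process name, path, and instance count are placeholders, and Process.Responding only detects hangs for processes with a UI message loop, so a console app would need a different health signal (for example, the SQL heartbeat you already monitor).

    using System;
    using System.Diagnostics;
    using System.Threading;

    class Watchdog
    {
        const string ExePath = @"C:\apps\myapp.exe"; // placeholder path
        const string ProcessName = "myapp";          // process name without .exe
        const int DesiredInstances = 4;              // placeholder count

        static void Main()
        {
            while (true)
            {
                // A. Each instance is identified by its process Id.
                foreach (Process p in Process.GetProcessesByName(ProcessName))
                {
                    // B. Kill instances that appear hung (works for UI apps; see note above).
                    if (!p.Responding)
                    {
                        Console.WriteLine("Killing unresponsive instance " + p.Id);
                        try { p.Kill(); } catch (Exception ex) { Console.WriteLine(ex.Message); }
                    }
                }

                // C. Top up to the desired number of running instances.
                int running = Process.GetProcessesByName(ProcessName).Length;
                for (int i = running; i < DesiredInstances; i++)
                {
                    Process.Start(ExePath);
                }

                Thread.Sleep(TimeSpan.FromSeconds(30)); // poll interval
            }
        }
    }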

Using Azure to process large amounts of data

We have an application that over time stores immense amounts of data for our users (we're talking hundreds of TB or more here). Due to new EU directives, should a user decide to discontinue using our services, all their data must be available for export for the next 80 days, after which it MUST be eradicated completely. The data is stored in Azure Storage block blobs, and the metadata in an SQL database.
Sadly, the data cannot be exported as-is (it is in a proprietary format), so it needs to be processed and converted to PDF for export. A file is approximately 240 KB in size, so imagine the number of PDFs for the TB figure stated above.
We tried using Azure Functions to split the job into tiny 50-value chunks, but at some point it went haywire, spun out of control, and created enormous costs.
So what we're looking for is a service that:
Can be run on demand from a web trigger/queue/db entry
Is pay-what-you-use as this will occur at random times and (so we hope) rarely.
Can process extreme amounts of data fairly effectively at minimum cost
Is easy to maintain and keep track of. The Functions jobs were just fire-and-pray (utter chaos) due to their number and parallel processing.
Does anyone know of a service fitting our requirements?
Here's a getting-started link for .NET, Python, or Node.js:
https://learn.microsoft.com/en-us/azure/batch/batch-dotnet-get-started
The concept in batch is pretty simple, although it takes a bit of fiddling to get it working the first time, in my experience. I'll try to explain what it involves, to the best of my ability. Any suggestions or comments are welcome.
The following concepts are important:
The Pool. This is an abstraction of all the nodes (i.e. virtual machines) that you provision to do work. These could be running Linux, Windows Server or any of the other offerings that Azure has. You can provision a pool through the API (a rough sketch of this step follows after these concepts).
The Jobs. This is an abstraction where you place the 'Tasks' you need executed. Each task is a command-line execution of your executable, possibly with some arguments.
Your tasks are picked up one by one by an available node in the pool, which executes the command that the task specifies. Available on the node are your executables and a file that you assigned to the task, containing data identifying, in your case, e.g. which users should be processed by that task.
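As a rough sketch of the pool-provisioning step, here is roughly what it looks like with the .NET client. The pool id, VM size, image, and node count are all assumptions; the getting-started sample linked above has the exact details for your SDK version.

    using System.Threading.Tasks;
    using Microsoft.Azure.Batch;

    static async Task CreatePoolAsync(BatchClient batchClient)
    {
        // Pool id, VM size, image, and node count below are placeholders.
        CloudPool pool = batchClient.PoolOperations.CreatePool(
            poolId: "processing-pool",
            virtualMachineSize: "STANDARD_A1_v2",
            virtualMachineConfiguration: new VirtualMachineConfiguration(
                imageReference: new ImageReference(
                    publisher: "MicrosoftWindowsServer",
                    offer: "WindowsServer",
                    sku: "2016-datacenter-smalldisk",
                    version: "latest"),
                nodeAgentSkuId: "batch.node.windows amd64"),
            targetDedicatedComputeNodes: 4);

        // Nothing is created in Azure until the pool is committed.
        await pool.CommitAsync();
    }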
So suppose in your case that you need to perform the processing for 100 users. Each individual processing job is an execution of some executable you create, e.g. ProcessUserData.exe.
As an example, suppose your executable takes, in addition to a userId, an argument specifying whether this should be performed in test or prod, so e.g.
ProcessUserData.exe "path to file containing user ids to process" --environment test.
We'll assume that your executable doesn't need other input than the user id and the environment in which to perform the processing.
You upload all the application files to a blob (named "application blob" in the following). This consists of your main executable along with any dependencies. It will all end up in a folder on each node (virtual machine) in your pool, once provisioned. The folder is identified through an environment variable created on each node in your pool so that you can find it easily.
See https://learn.microsoft.com/en-us/azure/batch/batch-compute-node-environment-variables
In this example, you create 10 input files, each containing 10 userIds (100 userIds in total) that should be processed: one file for each of the command-line tasks. Each file could contain 1 user id or 10 user ids; it's entirely up to you how you want your main executable to parse this file and process the input. You upload these to the 'input' blob container.
These will also end up in a directory identified by an environment variable on each node, so it is also easy to construct a path to them in the command line you run on each node.
When uploaded to the input container, you will receive a reference (ResourceFile) to each input file. One input file should be associated with one "Task" and each task is passed to an available node as the job executes.
The details of how to do this are clear from the getting started link, I'm trying to focus on the concepts, so I won't go into much detail.
You now create the tasks (CloudTask) to be executed, specify what it should run on the command line, and add them to the job. Here you reference the input file that each task should take as input.
An example (assuming Windows cmd):
cmd /c %AZ_BATCH_NODE_SHARED_DIR%\ProcessUserdata.exe %AZ_BATCH_TASK_DIR%\userIds1.txt --environment test
Here, userIds1.txt is the filename your first ResourceFile returned when you uploaded the input files. The next command will specify userIds2.txt, etc.
When you've created your list of CloudTask objects containing the commands, you add them to the job, e.g. in C#:
await batchClient.JobOperations.AddTaskAsync(jobId, tasks);
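Putting that together, here is a sketch of building the task list. It assumes you already hold an authenticated BatchClient, a job id, and the ResourceFile list returned by the upload step; all other names are illustrative.

    using System.Collections.Generic;
    using System.Threading.Tasks;
    using Microsoft.Azure.Batch;

    static async Task AddProcessingTasksAsync(
        BatchClient batchClient, string jobId, IList<ResourceFile> inputFiles)
    {
        var tasks = new List<CloudTask>();
        for (int i = 0; i < inputFiles.Count; i++)
        {
            // One task per input file; each runs the executable against its own file.
            string commandLine = string.Format(
                @"cmd /c %AZ_BATCH_NODE_SHARED_DIR%\ProcessUserdata.exe %AZ_BATCH_TASK_DIR%\userIds{0}.txt --environment test",
                i + 1);

            var task = new CloudTask("process-users-" + (i + 1), commandLine)
            {
                // The resource file is downloaded onto the node before the task starts.
                ResourceFiles = new List<ResourceFile> { inputFiles[i] }
            };
            tasks.Add(task);
        }

        // Submit the whole list of tasks to the job in one call.
        await batchClient.JobOperations.AddTaskAsync(jobId, tasks);
    }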
And now you wait for the job to finish.
What happens now is that Azure batch looks at the nodes in the pool and while there are more tasks in the tasks list, it assigns a task to an available (idle) node.
Once completed (which you can poll for through the API), you can delete the pool, the job and pay only for the compute that you've used.
One final note: Your tasks may depend on external packages, i.e. an execution environment that is not installed by default on the OS you've selected, so there are a few possible ways of resolving this:
1. Upload an application package, which will be distributed to each node as it enters the pool (again, there's an environment variable pointing to it). This can be done through the Azure Portal.
2. Use a command line tool to get what you need, e.g. apt-get install on Ubuntu.
Hope that gives you an overview of what Batch is. In my opinion the best way to get started is to do something very simple, i.e. print environment variables in a single task on a single node.
You can inspect the stdout and stderr of each node while the execution is underway, again through the portal.
There's obviously a lot more to it than this, but this is a basic guide. You can create linked tasks and a lot of other nifty things, but you can read up on that if you need it.
Assuming a lot of people are looking for a solution to this kind of requirement: the new release of ADLA (Azure Data Lake Analytics) now supports the Parquet format, alongside U-SQL. With less than 100 lines of code you can now combine these small files into large ones, and with fewer resources (vertices) you can compress the data into Parquet files. For example, you can store 3 TB of data in 10,000 Parquet files. Reading these files is also very simple, and you can create CSV files from them in no time as required. This will save you a great deal of cost and time.

Can I open one batch of files from Explorer in one instance of my app, and a separate batch in a second instance?

(First: I know this is very similar to a lot of other questions, but I have a secondary requirement that I haven't seen elsewhere.)
I'm creating an app where the user can do some work on a file (adjusting metadata, sending to another location, possibly some other things in the future). In a lot of cases, the same actions are going to happen to a lot of files at once, so what I want to do is, if the user has multiple files selected, they can right-click and use an action of some sort from that context menu, and all selected files & folders will be passed to my app for bulk processing.
Now here's the tricky bit - while the user's doing this bulk processing on one set of files, they might go off and select a different set of files and set up another bulk processing action on them, and I need to keep the two batches separate because the actions taken on the first batch are not likely to be appropriate for the second batch.
That second requirement is where I'm running into problems. I can't just do a normal singleton, because the two batches will get lumped together into one batch, which would be a Very Bad Thing(tm) for my users. But so far, I haven't found any elegant way to open one batch of files in one instance, and a later batch in a new instance.
I've come up with a couple options... First, when my app starts I could look up other instances, communicate with each of them in turn to find out when they were started, and if the times were close enough together, shut down this instance in favour of the older one. A second option would be to somehow add a unique operation identifier to the command line arguments; similarly to the first option, when the app starts it would look for other instances of itself and try to find the one that's the master (probably using the identifier as part of a mutex name to make it easy to figure out which one was first). Another option might be to create a shell extension to bundle up the list for me and pass the list to my app all at once (don't know for sure if this would work, to be honest...)
Of the ideas I've come up with, #2 is definitely my preference if it works. It keeps all the code in C#, which means anyone on my team could maintain it, and it also provides a way to definitively group the commands into batches. The only problem is that I don't know of a way to add some ID to the args for each instance that'll be consistent for all attempts to start my app from a single right-click action, but different for each time the user does a separate right-click + open (or whatever). Maybe I could do that through a shell extension...?
If anyone has any suggestions on how to make this happen, I would really appreciate it. I'd prefer a reasonably elegant, robust solution that's easy to maintain rather than the bull-in-a-china-shop type of code I'm liable to end up with if I have to just hack my way to a solution.
Here's what I've come up with...
I took the code sample from C++ Windows Shell context menu handler (CppShellExtContextMenuHandler) and adjusted it considerably. I started by making it act on * and Directory, because in my case the user should be able to perform this action on any combination of files and folders. Next, instead of popping up a message box with the command, I changed it to use CreateProcess with a redirected stdin (as per the example here). Once the process is created, I write the list of files to the pipe, one per line. (I'm going to try changing that to write the list of files to the pipe before starting the process, to avoid the potential race issue.) My target application simply reads in the console line by line until it runs out of data, and uses that list of files.
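On the receiving side, the C# app just drains its redirected stdin. Here is a minimal sketch of that part, assuming the extension writes one path per line and then closes its end of the pipe:

    using System;
    using System.Collections.Generic;

    class Program
    {
        static void Main()
        {
            var batch = new List<string>();
            string line;

            // ReadLine returns null once the shell extension closes the write end of the pipe.
            while ((line = Console.ReadLine()) != null)
            {
                if (line.Length > 0)
                    batch.Add(line);
            }

            Console.Error.WriteLine("Received " + batch.Count + " files in this batch.");
            // ... hand the batch off to the rest of the app ...
        }
    }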
It's a longer and more perilous process than I'd like - I crashed the shell a couple times during testing because of oversights in coding the extension! - but it gives me exactly what I want, which is the ability to right-click a set of files in Explorer, have them all open in the same instance of my app, and then right-click on a different set of files and have them open in a separate instance.
Of course, just because I have an answer doesn't mean it's the answer. I'm not going to mark this as the final answer for a little while, in case anyone else comes along with something better. Heck, even if you don't think your answer is better, I'd love to hear it! It's always fun to see how other people approach a problem.

Building C# console app for multiple instances

I'm building a console application which imports data into databases. It is to run every hour, depending on an input CSV file being present. The application also needs to be reused for other database imports on the same server; e.g. there could be up to 20 instances of the same .exe file, with each instance having its own separate configuration.
At the moment I have the base application, which takes the location of a config file via args, so it can be tweaked depending on which application needs to use it. It also performs the import via a transaction, which all works fine.
I'm concerned that having 20 instances of the same .exe file running on the same box, every hour, may cause the CPU to max out.
What can I do to resolve this? Would threading help?
Why not make a single instance that can handle multiple configurations? Seems a lot easier to maintain and control.
Each executable will be running in its own process and, therefore, with its own thread(s). Depending on how processor-intensive each task is, the CPU may well max out, but this is not necessarily something to be concerned about. If you are concerned about concurrent load, then the best way may be to stagger the scheduling of your processes so that you have the minimum number of them running simultaneously.
No, this isn't a threading issue.
Just create a system-wide named Mutex at the start of the application. When creating that Mutex, see if it already exists. If it does, it means that there is another instance of your application running. At this point you can give the user a message (via the console or message box) to say that another instance is already running, then you can terminate the application.
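A minimal sketch of that check follows. The mutex name is an assumption; if you want one instance per configuration rather than one overall, make the name unique per configuration.

    using System;
    using System.Threading;

    class Program
    {
        static void Main(string[] args)
        {
            bool createdNew;
            // Global\ makes the mutex visible across all sessions on the machine.
            using (var mutex = new Mutex(true, @"Global\MyImporter", out createdNew))
            {
                if (!createdNew)
                {
                    Console.WriteLine("Another instance is already running. Exiting.");
                    return;
                }

                // ... run the import; the mutex is held until the using block ends ...
            }
        }
    }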
I realize this thread is very old but I had the very same issues on my project. I suggest using MSMQ to process jobs in sequence.

Hints and tips for a Windows service I am creating in C# and Quartz.NET

I have an ongoing project at the moment, which is to create a Windows service that essentially moves files around multiple paths. A job may be, every 60 seconds, to get all files matching a regular expression from an FTP server and transfer them to a network path, and so on. These jobs are stored in an SQL database.
Currently, the service takes the form of a console application, for ease of development. Jobs are added using an ASP.NET page, and can be edited using another ASP.NET page.
I have some issues though, some relating to Quartz.NET and some general issues.
Quartz.NET:
1: This is the biggest issue I have. Seeing as I'm developing the application as a console application for the time being, I'm having to create a new Quartz.NET scheduler in all my files/pages. This is causing multiple confusing errors, and I just don't know how to instantiate the scheduler in one global file and access it from my ASP.NET pages (so I can get the details into a grid view to edit, for example).
2: My manager suggested I could look into having multiple 'configurations' inside Quartz.NET. By this, I mean that at any given time an administrator can change the application's configuration so that only specifically chosen jobs run. What'd be the easiest way of doing this in Quartz.NET?
General:
1: One thing that's crucial in this application is assurance that the file has been moved and is actually on the target path (after the move the original file is deleted, so it would be disastrous if the file were deleted when it hadn't actually been copied!). I also need to make sure that the file's contents match on the initial path and the target path, to give peace of mind that what has been copied is right. I'm currently doing this by MD5-hashing the initial file, copying the file, and, before deleting it, making sure that the file exists on the server; then I hash the file on the server and make sure the hashes match (a sketch of this check follows after these questions). Is there a simpler way of doing this? I'm concerned that the hashing may put a strain on the system.
2: This relates to the above question, but isn't as important, since not even my manager has any idea how I'd do this, but I'd love to implement it. An issue would arise if a job executes while a file is being written to: a half-written file might be transferred, making it totally useless, and it would also be bad because the initial file would be destroyed while it's still being written to! Is there a way of checking for this?
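For reference, here is a sketch of the hash check described in point 1. The class and method names are illustrative, and the paths are placeholders.

    using System.IO;
    using System.Linq;
    using System.Security.Cryptography;

    static class CopyVerifier
    {
        static byte[] HashFile(string path)
        {
            using (var md5 = MD5.Create())
            using (var stream = File.OpenRead(path))
            {
                return md5.ComputeHash(stream);
            }
        }

        // True only if the target exists and its contents hash to the same value as the source.
        public static bool CopyMatchesSource(string sourcePath, string targetPath)
        {
            return File.Exists(targetPath) &&
                   HashFile(sourcePath).SequenceEqual(HashFile(targetPath));
        }
    }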
As you've discovered, running the Quartz scheduler inside an ASP.NET application presents many problems. Check out Marko Lahma's response to your question about running the scheduler inside an ASP.NET web app:
Quartz.Net scheduler works locally but not on remote host
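If you do end up keeping the scheduler in-process, here is a rough sketch of sharing a single instance across pages instead of creating a new one everywhere. It is written against the older synchronous Quartz.NET API; newer versions expose GetScheduler and Start as async methods.

    using Quartz;
    using Quartz.Impl;

    public static class SchedulerProvider
    {
        private static readonly object SyncRoot = new object();
        private static IScheduler scheduler;

        // All pages/components ask this property for the one shared scheduler.
        public static IScheduler Scheduler
        {
            get
            {
                lock (SyncRoot)
                {
                    if (scheduler == null)
                    {
                        scheduler = new StdSchedulerFactory().GetScheduler();
                        scheduler.Start();
                    }
                    return scheduler;
                }
            }
        }
    }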
As far as preventing race conditions between your jobs goes (e.g. trying to delete a file that hasn't actually been copied to the file system yet), what you need to implement is some sort of job chaining:
http://quartznet.sourceforge.net/faq.html#howtochainjobs
In the past I've used the TriggerListeners and JobListeners to do something similar to what you need. Basically, you register event listeners that wait to execute certain jobs until after another job is completed. It's important that you test out those listeners, and understand what's happening when those events are fired. You can easily find yourself implementing a solution that seems to work fine in development (false positive) and then fails to work in production, without understanding how and when the scheduler does certain things with regards to asynchronous job execution.
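As a very rough sketch of the listener idea, written against the older synchronous Quartz.NET listener interface (the job names are hypothetical, and the follow-up trigger call is left as a comment because it varies by version):

    using Quartz;

    public class ChainAfterCopyListener : IJobListener
    {
        public string Name
        {
            get { return "ChainAfterCopyListener"; }
        }

        public void JobToBeExecuted(JobExecutionContext context) { }

        public void JobExecutionVetoed(JobExecutionContext context) { }

        public void JobWasExecuted(JobExecutionContext context, JobExecutionException jobException)
        {
            // Only kick off the follow-up step (e.g. the delete job) if the copy job
            // completed without throwing.
            if (jobException == null && context.JobDetail.Name == "CopyFileJob")
            {
                // e.g. trigger the delete job here via context.Scheduler
            }
        }
    }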
Good luck! Schedulers are fun!
