I have a process that I need to make into a service. The process runs autonomously right now, so there are no concerns with user interaction; I just need to "turn" it into a service. I got to thinking about it and decided I could create a service that launches the process. This would give me the added benefit of outside control of the process: I could watch it for an unexpected exit and re-launch it, and I could watch its memory usage and kill it if it gets out of hand. I don't think I have seen many other applications do this, and I was thinking there must be a reason why, so...
It's going to add complexity.
Instead of just having the process exist, you'll now need to make a second executable to "launch and monitor" this process. This adds overhead (the service and process both running), adds complexity, and makes life as a whole a bit more difficult.
That being said, if you've got a .NET Console application, turning it into a service is incredibly trivial. Your Main routine basically just gets moved into a method, and launched in a thread. Once you do that, the service application is effectively done - it's just configuring the service (which can be done in a designer) and overriding OnStart to spin up a thread and call your routine.
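For instance, a minimal sketch, assuming your existing logic lives in a hypothetical MyWorker.Run method:

    using System.ServiceProcess;
    using System.Threading;

    // Hypothetical stand-in for the logic that used to live in the console Main.
    public static class MyWorker
    {
        public static void Run()
        {
            // ... existing autonomous processing loop ...
        }
    }

    public class MyService : ServiceBase
    {
        private Thread _workerThread;

        protected override void OnStart(string[] args)
        {
            // Run the old Main logic on a background thread so OnStart
            // returns promptly and the SCM marks the service as started.
            _workerThread = new Thread(MyWorker.Run) { IsBackground = true };
            _workerThread.Start();
        }

        protected override void OnStop()
        {
            // Signal the worker to finish here (e.g. set an event it polls).
        }

        public static void Main()
        {
            ServiceBase.Run(new MyService());
        }
    }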
This is a good idea, but you've reinvented the wheel. What you're thinking of is essentially server monitoring. There are several high-quality open source implementations of what you want.
Pretty much anything that you can do this way you can do with less complexity by just putting the application logic in the service. Not to mention that you get Service Recovery for free by doing it in the service directly.
I understand that an application domain forms an isolation boundary for security, versioning, reliability, and unloading of managed code, but so does a process.
Can someone please help me understand the practical benefits of an application domain?
I assumed an app domain provides you a container to load one version of an assembly, but recently I discovered that multiple versions of a strongly named assembly can be loaded into a single app domain.
My concept of the application domain is still not clear, and I am struggling to understand why this concept was introduced when the concept of a process already exists.
Thank you.
I can't tell if you are talking in general or specifically .NET's AppDomain.
I am going to assume .NET's AppDomain and why it can be really useful when you need that isolation inside of a single process.
For instance:
Say you are dealing with a library that has certain worker classes, and you have no choice but to use those workers and can't modify the code. It's your job to build a Windows Service that manages those workers, makes sure they all stay up and running, and runs them in parallel.
Easy enough, right? Well, you hoped. It turns out your worker library is prone to throwing exceptions, uses a static configuration, and is generally just a real PITA.
You could try to launch each one in its own process, but to monitor them you'll need to implement named pipes or try to thoughtfully parse the STDIN and STDOUT of each process.
What else can you do? Well, AppDomain actually solves this. I can spawn an AppDomain for each worker and give them their own configuration; they can't screw each other up by changing static properties, because they are isolated, and on top of that, if the library bombs out and I fail to catch the exception, it doesn't bother the workers in their own domains. And during all of this, I can still communicate with those workers easily.
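A rough sketch of that pattern (WorkerHost and all names here are hypothetical; the wrapper must derive from MarshalByRefObject so that calls cross the domain boundary as proxies):

    using System;

    // Hypothetical wrapper around the library's problematic worker class.
    public class WorkerHost : MarshalByRefObject
    {
        public void Start()
        {
            // call into the library's worker here
        }
    }

    public static class Isolation
    {
        public static WorkerHost SpawnWorker(string name, string configFile)
        {
            // Each worker gets its own domain and its own configuration file,
            // so static state in the library cannot leak between workers.
            var setup = new AppDomainSetup { ConfigurationFile = configFile };
            AppDomain domain = AppDomain.CreateDomain(name, null, setup);

            // The returned object is a transparent proxy; calls from the
            // host domain are marshalled into the worker's domain.
            return (WorkerHost)domain.CreateInstanceAndUnwrap(
                typeof(WorkerHost).Assembly.FullName,
                typeof(WorkerHost).FullName);
        }
    }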
Sadly, I have had to do this before
EDIT: Started to write this as a comment response, but got too large
Individual processes can work great in many scenarios, however, there are just times where they can become a pain. I am not saying one should use an AppDomain over another process. I think it's uncommon you would need a separate process or AppDomain, but once you need it, you'll definitely know.
The main problem I see with processes in the scenario I've given above is that processes have their own downfalls that are easier to mitigate with the AppDomain.
A process can go rogue, become unresponsive, and crash or be killed at any point.
If you're managing processes, you need to keep track of the process ID and monitor the status of it. IPCs are great, but it does take time to get proper communication going back and forth as needed.
As an example, let's say your process just dies. What do you do? Depending on the mechanism you chose for monitoring, maybe only the communication thread died, or perhaps the work finished and you still show it as "processing". What do you do?
Now what happens when you have 20 processes and your management app dies? You don't have any real information; all you have is 20 instances of "myprocess.exe", and maybe you now have to start parsing the command-line arguments they were started with to see which workers you actually have. Obviously with an AppDomain all 20 would have died too, but did you really gain anything with the process? You still have to code the ability to recover; however, now you also have to code all of the recovery for your processes instead of just firing the workers back up.
As with anything in programming, there's 1,000 different ways to achieve the same goal. It's up to you to decide which solution you feel is most appropriate.
Some practical benefits of using app domains:
Multiple app domains can be run in a single process, and you can stop an individual app domain without stopping the entire process (as sketched below). This alone drastically increases server scalability.
Managing the app domain life cycle is done programmatically by runtime hosts (you can override it as well). For processes and threads, you have to explicitly manage their life cycle. Initialization, execution, termination, and inter-process/multithread communication are complex, and that's why it's easier to defer them to CLR management.
Source: https://learn.microsoft.com/en-us/dotnet/framework/app-domains/application-domains
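As a minimal illustration of the first point, unloading one domain while the process keeps running:

    using System;

    public static class DomainDemo
    {
        public static void RunIsolated()
        {
            AppDomain worker = AppDomain.CreateDomain("worker-1");
            try
            {
                // Execute a static method inside the new domain.
                worker.DoCallBack(PrintDomainName);
            }
            finally
            {
                // Unload just this domain; the process and any other
                // app domains keep running.
                AppDomain.Unload(worker);
            }
        }

        private static void PrintDomainName()
        {
            Console.WriteLine("Running in: " + AppDomain.CurrentDomain.FriendlyName);
        }
    }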
I have an NHibernate MVC application that is using ReadCommitted Isolation.
On the site, there is a certain process that the user could initiate, and depending on the input, may take several minutes. This is because the session is per request and is open that entire time.
But while that runs, no other user can access the site (they can try, but their request won't go through until the long-running process is finished).
What's more, I also have a need for a console app that performs this same long-running function while connecting to the same database. It is causing the same issue.
I'm not sure what part of my setup is wrong, any feedback would be appreciated.
NHibernate is set up with fluent configuration and StructureMap.
Isolation level is set as ReadCommitted.
The session factory lifecycle is HybridLifeCycle (which on the web should be Session per request, but on the win console app would be ThreadLocal)
It sounds like your requests are waiting on database locks. Your options are really:
Break the long-running process into a series of smaller transactions.
Use the ReadUncommitted isolation level most of the time (this is appropriate in a lot of use cases); a sketch of setting the isolation level per transaction follows below.
Judicious use of the Snapshot isolation level (assuming you're using MS SQL Server 2005 or later).
(N.B. I'm assuming the long-running function does a lot of reads/writes and the requests being blocked are primarily doing reads.)
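A minimal NHibernate sketch of the ReadUncommitted option, set per transaction rather than globally (Order is a hypothetical mapped entity):

    using System.Collections.Generic;
    using System.Data;
    using NHibernate;

    public class Order { }  // hypothetical mapped entity

    public static class ReportQueries
    {
        public static IList<Order> LoadRecentOrders(ISessionFactory factory)
        {
            using (ISession session = factory.OpenSession())
            // A looser isolation level for this read-mostly unit of work only,
            // so the global ReadCommitted default stays in place elsewhere.
            using (ITransaction tx = session.BeginTransaction(IsolationLevel.ReadUncommitted))
            {
                IList<Order> orders = session.QueryOver<Order>().Take(100).List();
                tx.Commit();
                return orders;
            }
        }
    }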
As has been suggested, breaking your process down into multiple smaller transactions will probably be the solution.
I would suggest looking at something like Rhino Service Bus or NServiceBus (my preference is Rhino Service Bus - I find it much simpler to work with personally). What that allows you to do is separate the functionality down into small chunks, but maintain the transactional nature. Essentially with a service bus, you send a message to initiate a piece of work, the piece of work will be enlisted in a distributed transaction along with receiving the message, so if something goes wrong, the message will not just disappear, leaving your system in a potentially inconsistent state.
Depending on what you need to do, you could send an initial message to start the processing, and then after each step send a new message to initiate the next step. This can really help to break the work down into much smaller transactions (and simplify the code). The two service buses I mentioned (there is also MassTransit) also have things like retries and error handling built in, so that if something goes wrong the message ends up in an error queue; you can investigate what went wrong, hopefully fix it, and reprocess the message, thus ensuring your system remains consistent.
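To make that concrete, here is a rough sketch in the style of NServiceBus's classic API (the message types, handler, and names are illustrative, not taken from the question):

    using NServiceBus;

    // Hypothetical messages: each represents one small, transactional step
    // of the formerly long-running process.
    public class BeginImport : IMessage { public int BatchId { get; set; } }
    public class ImportNextChunk : IMessage { public int BatchId { get; set; } }

    public class BeginImportHandler : IHandleMessages<BeginImport>
    {
        public IBus Bus { get; set; } // injected by the bus

        public void Handle(BeginImport message)
        {
            // ... do one small chunk of work against the database ...

            // Receiving the message, the DB work, and sending the next message
            // are enlisted in one transaction; a failure rolls all of it back
            // and the message is retried or moved to the error queue.
            Bus.SendLocal(new ImportNextChunk { BatchId = message.BatchId });
        }
    }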
Of course whether this is necessary depends on the requirements of your system :)
Another, more complex solution would be:
You build a background robot application which runs on one of the machines.
This background worker robot can receive "worker jobs" (the ones initiated by the user).
The robot then processes the jobs step by step in the background.
Pitfalls are:
- you have to program this robot to be very stable
- you need to watch the robot somehow
Sure, this involves more work; on the flip side, you will have the option to integrate more job types, enabling your system to process different things in the background.
I think the design of your application/SQL statements has a problem. Unless you are Facebook, I don't think any process should take this long; it is better to review your design and check where the bottleneck is, instead of trying to keep this long-running process going.
Also, sometimes an ORM is not a good fit for every scenario. Did you try using stored procedures?
I'm trying to determine the cause of a very long (imho) initial start up of an ASP.NET application.
The application uses various third-party libraries and lots of references that I'm sure could be consolidated; however, I'm trying to identify (and apportion blame for) the DLLs and how much each contributes to the extended startup process.
So far, the startup times vary from 2-5 minutes depending on usage of other things on the box. This is unacceptable in my opinion, given the complexity of the site, and I need to reduce it to something in the region of 30 seconds maximum.
To be clear on the scope of the performance I'm looking for, it's the time from first request to the initial Application_Start method being hit.
So where would I start with getting information on which DLLs are loaded and how long they take to load, so I can put together a cost/benefit analysis of which ones we need to tackle/consolidate?
From a tooling perspective, I've been using JetBrains dotTrace for a while, and I'm clear on how to benchmark the application once we're inside it, but it appears this is outside of the application code, and therefore outside of what I currently know.
What I'm looking for is methodologies on how to get visibility of what is happening before the first entry point into my code.
Note: I know that I can call the default page on recycle/upgrade to do an initial load, but I'd rather solve the actual problem rather than papering over it.
Note2: the hardware is more than sufficiently scaled and separated in terms of functionality, therefore I'm fairly sure that this isn't the issue.
Separate answer on profiling/debugging start up code:
w3wp is just a process that runs .NET code, so you can use all the profiling and debugging tools you would use for a normal .NET application.
One tricky point is that the w3wp process starts automatically on the first request to an application, and if your tools do not support attaching to a process whenever it starts, it is problematic to investigate your application's startup code.
The trick to solve this is to add another application to the same application pool. This way you can trigger w3wp creation by navigating to the other application, then attach/configure your tools against the already-running process. When you finally trigger your original application, the tools will see loading happening in the existing w3wp process.
With a 2-5 minute delay you may not even need a profiler: simply attach the Visual Studio debugger as suggested above and randomly trigger "Break All" several times during the load of your site. There is a good chance that the slowest portion of the code will be on the stack of one of the many threads. Also watch the debug output; it may give you some clues about what is going on.
You may also use WinDbg to capture the stacks of all threads in a similar way (it can be more lightweight than VS).
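A typical sequence, assuming a .NET 4 process (where the SOS managed-debugging extension loads from clr.dll):

    $$ attach WinDbg to the w3wp process, then:
    .loadby sos clr
    $$ dump the managed call stack of every thread
    ~*e !clrstack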
Your DLL references are loaded as needed, not all at once.
Do external references slow down my ASP.NET application? (VS: Add Reference dialog)
If startup is taking 2-5 minutes, I would look at what happens in Application_Start, and at what the DLLs do once loaded. Are they trying to connect to a remote service that is very slow? Is the machine far too small for what it's doing (e.g. running a DB with large amounts of data plus the web server on an AWS micro instance or similar)?
Since the load time is probably not the IIS worker process resolving references, I would turn to traditional application profilers (e.g. JetBrains dotTrace, Red Gate ANTS) to see where the time is being spent as the DLLs initialize, and in your Application_Start method.
Some things to check along with profiling:
profile everything, add time tracing to everything, and log the information
if you have many ASPX views that need to be compiled on startup (I think this is the default for the release configuration), it will take some time
references to web services or other XML-serialization-related code will need serialization assemblies to be compiled if none are present yet
access to remote services (including local SQL) may require those services to start up too
aggressive caching in the application/remote services may require pre-population of caches
Production:
What is the goal for startup time? Figure it out first; otherwise you will not be able to reach it.
What price are you willing to pay to decrease startup time? Adding 1-10 more servers may be cheaper than spending months of development/test time and delaying the product.
Consider multiple servers, rolling restarts with warm-up calls, and web gardens.
If caching of DB objects (or caching in general) is an issue, consider existing distributed in-memory caches...
Despite the large number of DLLs, I'm almost sure that for a reasonable application they cannot be the cause of the problem. Most of the time it is the initialization of static objects that causes a slow startup.
In C#, static variables are initialized when a type is first accessed. I would recommend using a SQL profiler to see which queries are performed during application startup, and from there identify which objects are expensive to initialize.
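For illustration, a sketch of the pattern (the type and the query are hypothetical):

    using System;
    using System.Collections.Generic;

    // A static field initializer runs the first time the type is touched,
    // which may be in the middle of your first request.
    public static class ReferenceData
    {
        // Lazy alternative: pay the cost only when (and if) the data is
        // needed, instead of eagerly in a static initializer such as
        //   public static readonly Dictionary<int, string> Codes = LoadCodesFromDb();
        private static readonly Lazy<Dictionary<int, string>> _codes =
            new Lazy<Dictionary<int, string>>(LoadCodesFromDb);

        public static Dictionary<int, string> Codes
        {
            get { return _codes.Value; }
        }

        private static Dictionary<int, string> LoadCodesFromDb()
        {
            // ... expensive query against the database ...
            return new Dictionary<int, string>();
        }
    }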
I am writing a c# windows service which will perform some background processing - basically it is a consumer for a work queue.
It needs to not go down (stop processing new items), and if it does go down I need to be notified.
What are some design guidelines and considerations for a) ensuring that such a service is as reliable as possible, and b) sending out a notification if something does go wrong? I have considered, for instance, creating a watcher thread whose only job is to make sure the worker thread is still processing jobs.
There are a number of things that you can do here to help improve the reliability, as well as gauge that you have a solution that is going to meet your needs.
Testing
First and foremost, the testing process that you go through will need to be a very solid one: test for those "unexpected" situations (loss of network connection, etc.), make sure you are exercising them, and see what happens. Notification on failure can be a bit of a "mixed bag". For example, you can't e-mail yourself if you don't have a network connection available.
Proper Code Design
In addition to setting up valid test scenarios, be sure that your code is as bullet-proof as possible. Since you are creating a Windows service, be sure that you are capturing, logging, and dealing with all possible errors; if an error bubbles up to the OS, your service will go down.
Monitoring
Consider putting monitoring in place. In my day job we use two types: errors are reported to the Windows Event Log in some cases, and Microsoft MOM is used to notify us of any/all issues that are going on in the environment. A second process that we use is a scheduled job that every X minutes validates that the critical job is in a "Started" state; if it isn't, it restarts it. Not elegant, but it works (a sketch follows below).
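A minimal sketch of such a watchdog job, using ServiceController ("MyCriticalService" is a placeholder name):

    using System;
    using System.ServiceProcess;

    // A tiny console program that the scheduler runs every X minutes.
    public static class Watchdog
    {
        public static void Main()
        {
            using (var sc = new ServiceController("MyCriticalService"))
            {
                if (sc.Status != ServiceControllerStatus.Running)
                {
                    // Restart the service and wait (briefly) for it to come up.
                    sc.Start();
                    sc.WaitForStatus(ServiceControllerStatus.Running,
                                     TimeSpan.FromSeconds(30));
                }
            }
        }
    }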
I would look at MOM and/or SolarWinds or some other monitoring application that your system administrator might be using to monitor the machine on which the service is deployed, and have it take the proper action (send email, ring phones :)
I hate asking questions like this - they're so undefined... and undefinable, but here goes.
Background:
I've got a DLL that is the guts of an application that is a timed process. My timer receives a configuration for the interval at which it runs and a delegate that should be run when the interval elapses. I've got another DLL that contains the process that I inject.
I created two applications, one Windows Service and one Console Application. Each of the applications read their own configuration file and load the same libraries pushing the configured timer interval and delegate into my timed process class.
Problem:
Yesterday, and for the last n weeks, everything was working fine in our production environment using the Windows Service. Today, the Windows Service runs for around 20-30 minutes and then hangs (with a timer interval of 30 seconds), but the console application runs without issue and has for the past 4 hours. Detailed logging doesn't indicate any failure. It's as if the Windows Service just...dies quietly - without stopping.
Given that my Windows Service and Console Applications are doing the exact same thing, I can only think that there is something that is causing the Windows Service process to hang - but I have no idea what could be causing that. I've checked the configuration files, and they're both identical - I even copied and pasted the contents of one into the other just to be sure. No dice.
Can anyone make suggestions as to what might cause a Windows Service to hang, when a counterpart Console Application using the same base libraries doesn't; or can anyone point me in the direction of tools that would allow me to diagnose what could be causing this issue?
Thanks for everyone's help - still digging.
You need to figure out what changed on the production server. At first the IT guys responsible will swear that nothing changed, but you have to be persistent. I've seen this happen so often I've lost count. Software doesn't spoil. Period. The change must have been to the environment.
Difference in execution: you have two apps running the same code. The most likely difference (and culprit) is that the service is running with a different set of security credentials than your console app and might fall victim to security vagaries. Check on that first. Which Windows account is running the service? What is its role and scope? Is there any third-party security software running on the server, perhaps killing errant apps? Do you have to register your service with a third-party security service? Are your .NET assemblies properly signed? Are they properly registered and configured on the server? Last but not least, don't forget that a debugger user, which you most likely are, gets away with a lot more than many other account types.
Another thought: Since timing seems to be part of the issues, check the scheduled tasks on the machine. Perhaps there's a process that is set to go off every 30 minutes that is interfering with your own.
You can debug a Windows service by running it interactively within Visual Studio. This may help you to isolate the problem by setting (perhaps conditional) breakpoints.
Alternatively, you can use the Visual Studio "Attach to process" dialog window to find the service process and attach to it with the "Debug CLR" option enabled. Again this allows you to set breakpoints as needed.
Are you using any assertions? If an assertion fires without being re-directed to write to a log file, your service will hang. If the code throws an unhandled exception, perhaps because of a memory leak, then your service process will crash. If you set the Service Control Manager (SCM) to restart your process in the event of a crash, you should be able to see that the service has been restarted. As you have identical code running in both environments, these two situations don't seem likely. But remember that your service is being hosted by the SCM, which means a very different environment to the one in which your console app is running.
I often use a "heartbeat", where each active thread in the service sends a regular (say every 30 seconds) message to a local MSMQ. This enables manual or automated monitoring, and should give you some clues when these heartbeat messages stop arriving.
Another possibility is some sort of permissions problem, because the service is probably running under a different local/domain user than the console app.
After the hang, can you use the SCM to stop the service? If you can't, then there is probably some sort of thread deadlock problem. After the service appears to hang, you can go to a command-line and type sc queryex servicename. This should give you the current STATE of the service.
I would probably put in some file logging just to see how far the program is getting. It may give you a better idea of what is looping/hanging/deadlocked/crashing.
You can try these techniques:
Logging: start logging the flow of the code in the service. Make this parameter-based so you don't have a deluge after you are done. You should log all function names, parameters, and timestamps (a minimal tracing sketch follows below).
Attach a debugger: locally or remotely attach a debugger to the running service and set appropriate breakpoints (these can be based on the data gathered from logging).
PerfMon: run this utility and gather information about the machine the service is running on for any additional clues (high CPU spikes, IO spikes, excessive paging, etc.).
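A minimal tracing sketch for the logging suggestion above (the log path is an assumption):

    using System;
    using System.IO;
    using System.Runtime.CompilerServices;

    public static class FlowLog
    {
        private static readonly object _lock = new object();

        // CallerMemberName fills in the calling method's name automatically.
        public static void Log(string details, [CallerMemberName] string member = "")
        {
            lock (_lock)
            {
                File.AppendAllText(@"C:\logs\service-trace.log",
                    string.Format("{0:O} {1} {2}{3}",
                        DateTime.UtcNow, member, details, Environment.NewLine));
            }
        }
    }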
Microsoft provides a good resource on debugging a Windows Service. That essentially sounds like what you'd have to do, given that your question is so generic. With that said, have any changes been made to the system over the last few days that could adversely affect the service? Have you made any updates to the code that change the way the service might work?
Again, I think you're going to have to do some serious debugging to find your problem.
What type of timer are you using in the Windows service? I've seen numerous people on SO have problems with timers and Windows services. Here is a good tutorial just to make sure you are setting it up correctly and using the right type of timer; a sketch of two common pitfalls follows below. Hope that helps.
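For reference (the 30-second interval mirrors the question; System.Timers.Timer is one common choice):

    using System;
    using System.Timers;

    // Two usual pitfalls: (1) a timer held only in a local variable can be
    // garbage collected; (2) System.Timers.Timer silently swallows exceptions
    // thrown from the Elapsed handler.
    public class PollingService
    {
        private Timer _timer; // a field, not a local, so the GC keeps it alive

        public void Start()
        {
            _timer = new Timer(30000); // 30-second interval
            _timer.Elapsed += OnElapsed;
            _timer.AutoReset = true;
            _timer.Start();
        }

        private void OnElapsed(object sender, ElapsedEventArgs e)
        {
            try
            {
                // ... do the periodic work ...
            }
            catch (Exception ex)
            {
                // log ex somewhere; otherwise the failure is invisible
            }
        }
    }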
Another potential problem, in reference to psasik's answer, is if your application is relying on something that is only available when run in user mode.
Running in service mode puts you on a different, non-interactive desktop (is it "Desktop0"?), which in my experience can cause issues if you are trying to determine the state of something that can only be seen in user mode.
Smells like a threading issue to me. Is there any threading or async work being done at all? One crucial question is "does the service hang on the same line of code or same method every time?" Use your logging to find out the last thing that happens before a hang, and if so, post the problem code.
One other tool you may consider is a good profiler. If it is .NET code, I believe Red Gate ANTS can monitor it and give you a good picture of any thread-lock scenarios.