I have a system in which an existing service for a specific process used to run in single-instance mode. The service ran a long process and could serve only one client at a time. The architecture is as follows:
Now I am trying to make this WCF service per-session, so that it can run the long operation for two or more clients simultaneously. Since the process usually takes time, I am also sending the percentage of completion back to the client over a callback channel. The new architecture looks like this:
The major differences between the two architectures are:
Previously only one user could run the process, for multiple objects. Now each user can run the long process, but for different objects.
We have added a callback facility to the new architecture with the per-session service.
We also plan to let the user terminate the process if they wish, and to terminate it when the client connection is closed.
But while trying to achieve the above we are facing the following issues.
The long-running operation happens in the database through multiple stored procedures, called one by one from the static data manager class.
Each SP adds around 500k rows across multiple tables.
Terminating the connection from the client removes the service instance, but because the database operations run in the static class, control gets stuck there and everything stops responding.
I know there is a DbCommand.Cancel() method that stops the operation associated with the DbCommand, but since the class is static, cancelling is not possible either.
Please suggest the architectural changes needed to solve this issue. I am ready to share more details.
From what I understand, you want to serve multiple clients at the same time, and that does not fit with the static class, which effectively forces singleton behavior.
I would correct that.
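One way to make that correction is to move the work out of the static class into an object owned by each session, so the session can cancel its own work. The sketch below is in Python purely as a language-neutral illustration (the thread itself is about WCF/.NET). `SessionJob`, `steps` and `on_progress` are hypothetical names: `steps` stands in for the stored-procedure calls and `on_progress` for the callback channel.

```python
import threading


class SessionJob:
    """One instance per client session; it owns its own cancellation state."""

    def __init__(self, steps, on_progress):
        self._steps = steps              # hypothetical callables, one per stored procedure
        self._on_progress = on_progress  # stands in for the client callback channel
        self._cancel = threading.Event()

    def cancel(self):
        # Called when the user asks to stop or the client connection closes.
        self._cancel.set()

    def run(self):
        total = len(self._steps)
        for done, step in enumerate(self._steps, start=1):
            if self._cancel.is_set():
                return False  # stopped before starting the next step
            step()
            self._on_progress(100 * done // total)
        return True
```

This only stops between steps. To interrupt a step that is already running, the per-session object would also have to hold the command it is executing, which is what makes calling DbCommand.Cancel() on it possible in the .NET version.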
Regards
I've created a BackgroundService in a WebAPI based on the code examples here: https://learn.microsoft.com/en-us/dotnet/architecture/microservices/multi-container-microservice-net-applications/background-tasks-with-ihostedservice . The article doesn't give any guidance for implementing this in a multi-server environment. My use-case involves a FileSystemWatcher monitoring a shared network folder for changes. It works great.
The issue is there will be multiple instances of this and I don't want all of the instances responding - just one. Is this feasible, and if so, what steps do I need to implement? I've read about using queues, but I can't see how that will help. Also, Hangfire or similar is not an option. Do I need to re-examine my logic?
I can think of multiple ways to achieve this, with pros and cons.
Individual service
If you need only one instance of this, implement it as a standalone service and deploy on one server only. True, you can't leverage background processes, but do you really need to?
Configuration
Have a config value indicating which servers should register the service, for example a comma-separated list of server names. This requires some deployment handling, though, to switch the value on for the server that runs the background service.
Persist value in db
If there is a single shared database, the instances can coordinate through it. Have a table that stores which server is executing the background service; once the first one locks it, the others just sleep. Some keep-alive logic is needed as well, so that another server can take over if the owner stops responding.
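A minimal sketch of this database option, written in Python with sqlite3 standing in for the shared database. The `lease` table, its columns and the `try_acquire` helper are all assumptions made for illustration. Each instance calls `try_acquire` periodically: the current owner renews its lease (the keep-alive), and another instance takes over only after the lease has expired.

```python
import sqlite3


def init(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lease ("
        "name TEXT PRIMARY KEY, owner TEXT NOT NULL, expires_at REAL NOT NULL)"
    )


def try_acquire(conn, name, owner, now, ttl):
    """Return True if `owner` holds the lease `name` until now + ttl."""
    with conn:  # one transaction for both statements
        # Make sure the row exists; an already-expired placeholder is harmless.
        conn.execute(
            "INSERT OR IGNORE INTO lease (name, owner, expires_at) VALUES (?, ?, 0)",
            (name, owner),
        )
        # Take or renew the lease only if we own it or it has expired.
        cur = conn.execute(
            "UPDATE lease SET owner = ?, expires_at = ? "
            "WHERE name = ? AND (owner = ? OR expires_at <= ?)",
            (owner, now + ttl, name, owner, now),
        )
    return cur.rowcount == 1
```

Only the instance for which `try_acquire` returns True would start its FileSystemWatcher; the others sleep and retry.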
I would honestly go with solution one. Individually scalable, deployable and no workaround needed.
A background service that is part of the application is expected to run on every instance of that application.
You need to go with a microservice architecture.
One microservice uses the file watcher and puts a message on a queue for each change.
Another microservice works through the queued messages (this one you can scale out to multiple instances).
You can also add another service/microservice that keeps an eye on the health of the file watcher and handles failover.
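The queue part of this design can be sketched in-process, in Python, with threads standing in for the scaled-out worker instances and `queue.Queue` standing in for a real message queue. `run_workers` and the worker names are hypothetical. The point it illustrates is that each message is taken by exactly one worker, however many workers are running.

```python
import queue
import threading


def run_workers(paths, worker_count):
    """Hand each queued path to exactly one of several competing workers."""
    work = queue.Queue()
    for path in paths:  # in the real design the file watcher would enqueue these
        work.put(path)

    handled = []
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                path = work.get_nowait()  # only one worker receives a given item
            except queue.Empty:
                return
            with lock:
                handled.append((name, path))

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled
```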
We have created a .NET Core Web API project that uses a SQL Server database. We now plan to deploy it to Microsoft Azure.
As part of deploying this application, we are also considering enabling autoscaling (horizontal scaling).
Before we do, we have some questions we would like to clarify.
Do we need to add any additional code to our application for autoscaling to work properly?
By "properly" I mean this: horizontal scaling can leave more than one instance of the application running. We use a database; if more than one instance is running, will it cause race conditions (i.e., two instances accessing the same data at the same time)? Can we add a transaction (or use locking) in our code to avoid these kinds of scenarios?
I would like to know whether there are any best practices to follow when implementing this kind of application.
Thank you and waiting for your answers!
Consider the following points when designing an autoscaling strategy:
The system must be designed to be horizontally scalable. Avoid making
assumptions about instance affinity; do not design solutions that
require that the code is always running in a specific instance of a
process. When scaling a cloud service or web site horizontally, do
not assume that a series of requests from the same source will always
be routed to the same instance. For the same reason, design services
to be stateless to avoid requiring a series of requests from an
application to always be routed to the same instance of a service.
When designing a service that reads messages from a queue and
processes them, do not make any assumptions about which instance of
the service handles a specific message because autoscaling could
start additional instances of a service as the queue length grows.
The Competing Consumers pattern describes how to handle this
scenario.
If the solution implements a long-running task, design this task to
support both scaling out and scaling in. Without due care, such a
task could prevent an instance of a process from being shutdown
cleanly when the system scales in, or it could lose data if the
process is forcibly terminated. Ideally, refactor a long-running task
and break up the processing that it performs into smaller, discrete
chunks. The Pipes and Filters pattern provides an example of how you
can achieve this. Alternatively, you can implement a checkpoint
mechanism that records state information about the task at regular
intervals, and save this state in durable storage that can be
accessed by any instance of the process running the task. In this
way, if the process is shutdown, the work that it was performing can
be resumed from the last checkpoint by using another instance.
For more information, see the doc: https://github.com/Huachao/azure-content/blob/master/articles/best-practices-auto-scaling.md
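The checkpoint mechanism described in the last point can be sketched as follows, in Python. A local JSON file stands in for the durable storage, and `process`, `handle` and `stop_after` are hypothetical names; `stop_after` only simulates an instance being shut down partway through.

```python
import json
import os


def process(items, checkpoint_path, handle, stop_after=None):
    """Process items in order, recording progress after each one.

    Returns True when all items are done, False if stopped early.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]  # resume from the last checkpoint

    for i in range(start, len(items)):
        if stop_after is not None and i - start >= stop_after:
            return False  # simulated scale-in: this instance goes away
        handle(items[i])
        with open(checkpoint_path, "w") as f:
            json.dump({"next_index": i + 1}, f)
    return True
```

If the process dies between `handle` and the checkpoint write, that item is handled again on resume, so the handler should be safe to repeat.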
Regarding this:
By "properly" I mean this: horizontal scaling can leave more than one instance of the application running. We use a database; if more than one instance is running, will it cause race conditions (i.e., two instances accessing the same data at the same time)? Can we add a transaction (or use locking) in our code to avoid these kinds of scenarios?
Please keep in mind that, even if the app is running on a single machine, requests will still be handled concurrently. This means that even on a single machine 2 requests can cause the same entry in the database to be updated. So the above questions about race conditions apply to single instance web apps as well.
Try to avoid locking: the whole point of (horizontal) scaling is to gain performance benefits. By using locks you effectively remove these benefits, as only one process at a time can use the locked resource.
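One common alternative to locking is optimistic concurrency: each row carries a version number, and an update succeeds only if the version is still the one the caller read. The sketch below uses Python with sqlite3 as a stand-in for SQL Server; the `account` table and the `update_balance` helper are assumptions made for illustration.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, balance INTEGER NOT NULL, version INTEGER NOT NULL)"
    )


def update_balance(conn, account_id, new_balance, expected_version):
    """Write only if nobody else changed the row since we read it."""
    with conn:
        cur = conn.execute(
            "UPDATE account SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_balance, account_id, expected_version),
        )
    return cur.rowcount == 1  # False means a conflicting update won
```

When the update returns False, the caller re-reads the row and retries or reports the conflict, so no instance ever waits on a lock held by another.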
Other points to consider are:
If you are using an in-memory cache you might want to swap it out for a distributed cache.
The guidance at the MS docs
I have a web service (asmx) that runs on 2 different servers that sit behind a load balancer. The service is called by multiple clients across our organization. To my current knowledge, none of the clients use multiple threads.
I'm investigating a production issue where some of the data in a few static variables is clearing or returning null or empty, causing db exceptions and foreign key constraint errors.
Upon investigation, I noticed that the singleton pattern is not implemented correctly, so it's definitely not multi thread safe.
I checked with my team whether there is any scenario where the service might run under multiple threads, but they all say no.
I don't know why, but I'm still convinced that it is running on multiple threads, as all the production issues I see are consistent with multi-threaded access. I can also force these errors when I use Parallel.Invoke in my unit tests, but I cannot find the scenario where it happens on a day-to-day basis.
I was wondering if there is any way to go through the IIS logs or anything on the windows servers itself that might clarify this situation whether the service or anything inside it is using multiple threads while it's running.
Is it possible that on each IIS, the service is in its own single thread but when it calls other classes and methods within itself, they start their own thread?
I apologize for not sharing any code yet. Given the sheer amount of code, I haven't had a chance to extract part of it, and I'll need to refactor quite a few things before I can post it here.
Many thanks in advance.
I have this scenario and I don't really know where to start. Suppose there's a web-service-like app (it might be an API, though) hosted on a server. That app receives a request to process some data (through a method we will call processData(data theData)).
On the other side, there's a robot (it might be installed on the same server) that processes the data. The web service inserts the request into a common database (both programs have access to it) and is supposed to wait for that row to change and then send the results back.
The robot periodically checks the database for new rows, processes the data, and sets some sort of flag on the row to indicate that the data was processed.
So the main problem is: what should processData(..) do to detect changes to the data row?
I know one way to do it: I can build a loop that checks the row every x seconds. But I don't want to do that. What I want is some sort of event listener that triggers when the row changes. I know it might involve some asynchronous programming.
I might be dreaming, but is that even possible in a web environment?
I've been reading about the SqlDependency class, async and await, etc.
Depending on how much control you have over design of this distributed system, it might be better for its architecture if you take a step back and try to think outside the domain of solutions you have narrowed the problem down to so far. You have identified the "main problem" to be finding a way for the distributed services to communicate with each other through the common database. Maybe that is a thought you should challenge.
There are many potential ways for these components to communicate, and if your design goal is to reduce latency and thus avoid polling, it might in fact be right for the service that needs to know about completion of a work item to be informed of it right away. However, if the throughput of this system has to increase in the future, processing work items in bulk and polling for the information instead might become the only feasible option. This is also why I have chosen to word my answer a bit more generically and discuss the design of this distributed system more abstractly.
If after this consideration your answer remains the same and you do want immediate notification, consider having the component that processes a work item notify the component(s) that need to know. As a general design principle for distributed systems, it is best to have the component that is most authoritative for a given set of data also be the component that answers requests about that data. In this case, the data is the completion status of your work items, so the best component to act on it is the component completing the work items. It might be better for that component to inform calling clients and components of that completion. Here it's also important to know whether you write this data to the database only for the sake of communication between components, or whether those rows have any value beyond the completion of a given work item, such as for reporting purposes or performance indicators (KPIs).
I think there can be valid reasons, though, why you would not want to have such a call, such as reducing coupling between components or lack of access to communicate with the other component in a direct manner. There are many communication primitives that allow such notification, such as MSMQ under Windows, or Queues in Windows Azure. There are also reasons against it, such as dependency on a third component for communication within your system, which could reduce the availability of your system and lead to outages. The questions you might want to ask yourself here are: "How much work can my component do when everything around it goes down?" and "What are my design priorities for this system in terms of reliability and availability?"
So I think the main problem you might really want to solve first is a bit more abstract: what should the interface through which the components of this distributed system communicate look like?
If after all of this you remain set on having the SQL database be the interface between those components, you could explore using INSERT and UPDATE triggers in SQL. You can easily look up the syntax of those commands and specify stored procedures that then get executed. In those stored procedures you would want to check the completion flag of any new rows and possibly restrict the number of rows you check by date, or keep an ID for the last processed work item. To then notify the other component, you could go as far as using the built-in stored procedure xp_cmdshell to execute command lines under Windows. The command you execute could be a simple tool that pings your service to signal completion of the task.
I'm sorry to have initially overlooked your suggestion to use SQL Query Notifications. That is also a feasible way and works through the Service Broker component. You would define a SqlCommand, as if normally querying your database, pass this to an instance of SqlDependency and then subscribe to the event called OnChange. Once you execute the SqlCommand, you should get calls to the event handler you added to OnChange.
I am not sure, however, how to get the exact changes to the database out of the SqlNotificationEventArgs object that will be passed to your event handler, so your query might need to be specific enough for the application to tell that the work item has completed whenever the query changes, or you might have to do another round-trip to the database from your application every time you are notified to be able to tell what exactly has changed.
Are you referring to a Message Queue? The .Net framework already provides this facility. I would say let the web service manage an application level queue. The robot will request the same web service for things to do. Assuming that the data needed for the jobs are small, you can keep the whole thing in memory. I would rather not involve a database, if you don't already have one.
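A minimal in-memory sketch of that idea, in Python rather than .NET and with hypothetical names throughout (`Job`, `JobBroker`, `submit`, `next_job`). The web service enqueues a job and blocks on it; the robot asks the same broker for work and completes the job, which wakes the waiting request without any polling of a database.

```python
import queue
import threading


class Job:
    """A unit of work whose submitter can wait for the result."""

    def __init__(self, data):
        self.data = data
        self.result = None
        self._done = threading.Event()

    def complete(self, result):
        self.result = result
        self._done.set()  # wakes whoever is blocked in wait()

    def wait(self, timeout=None):
        if not self._done.wait(timeout):
            raise TimeoutError("job not completed in time")
        return self.result


class JobBroker:
    """Application-level queue owned by the web service."""

    def __init__(self):
        self._pending = queue.Queue()

    def submit(self, data):
        job = Job(data)
        self._pending.put(job)
        return job

    def next_job(self, timeout=None):
        # Called by the robot; blocks until work is available.
        return self._pending.get(timeout=timeout)
```

In this sketch processData would call `submit(...)` and then `wait(...)` on the returned job. Everything is lost if the process restarts, which is the trade-off of keeping the whole thing in memory.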
I need to notify 1-many clients to perform a task (reload). The server may or may not be running at any given point in time. (For this reason, I have had some difficulty defining who is the client and who is the server.)
At any given time, the server may start running. When the server closes itself, it will notify all clients to perform their task.
I tried using a NamedPipeServerStream and running multiple instances on the "clients" (remember the relationship is odd, so bear with me). Unfortunately, I can only create one pipe server for any given server name, so this did not work. I could have the clients continuously check for a server, but if I'm going to start polling then I might as well poll the DB directly.
My situation is roughly like an observer pattern. I do not need to dynamically subscribe/unsubscribe. I do want the server to push a notification to all running clients to perform a task.
How can I achieve this? Please keep in mind I have to do this with IPC. The server/clients are running under different processes and always on the same machine.
To solve the polling problem you can create a named ManualResetEvent that the client processes will listen to. Each client spins up a thread and waits on the event; when the server starts, it signals the event and lets all of the clients start their listening code, which can open a named pipe like you do currently. Look at the EventWaitHandle.GetAccessControl MSDN page to see an example of how to make a named ManualResetEvent.
As for your "I can only create one Pipe server for any given server name" issue: if there are multiple servers running, how are the clients supposed to know which server to connect to? You said it was a 1 server to * client relationship. If you are going to run multiple servers, you will need a way to tell the client which server it should listen to.
Because you specified that the processes are all guaranteed to be on the same machine, I would probably look at using a Windows named event. Your client(s) and server can call OpenEvent and, if the event hasn't been created, call CreateEvent. This will give you some control (using PulseEvent, set/reset, etc.) over how many clients are released per event, and also allow you to open a client without the server already existing.
As Scott suggests, to easily do this in C#, use a .NET named EventWaitHandle by calling a constructor that takes a string (such as this one), which will create the system-wide synchronization object for you. That particular constructor will also tell you whether you were the first to ask for the event (you created it) or it was already in existence.
Using named shared memory would be one way to accomplish a one-to-many communication. The server can create the shared memory, and one or more of the clients can open it and read it. An example (in C) is shown here. In addition, a .NET example is shown here.
The shared memory will exist as long as at least one process has an open handle to it. From the OP, this sounds as if it would be a useful feature because the memory could continue its existence even after the server process was closed (as long as at least one client still had it open). The .NET example also shows how to persist the information, which would be useful if it has to outlive the processes.
Depending on the timing needed, clients could periodically read the memory for the necessary information. Or, if the situation is more time-critical, you could use a named semaphore to signal clients to perform the necessary operations. With a semaphore as the synchronization object, you can signal multiple clients by setting the release count to a value greater than 1.
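The release-count idea can be illustrated in Python, with threads standing in for the client processes and an in-process `threading.Semaphore` standing in for a named, system-wide semaphore (so this is only an analogy for the cross-process case). `notify_clients` and the client names are hypothetical.

```python
import threading


def notify_clients(client_count):
    """Wake every waiting client with a single release call."""
    reload_signal = threading.Semaphore(0)  # starts with nothing to acquire
    reloaded = []
    lock = threading.Lock()

    def client(name):
        # Each client blocks here until the server releases the semaphore.
        if reload_signal.acquire(timeout=5):
            with lock:
                reloaded.append(name)  # stands in for the reload task

    clients = [
        threading.Thread(target=client, args=(f"client-{i}",))
        for i in range(client_count)
    ]
    for c in clients:
        c.start()

    # Server side: a release count above 1 lets that many waiters through
    # (the count argument requires Python 3.9 or later).
    reload_signal.release(client_count)

    for c in clients:
        c.join()
    return sorted(reloaded)
```

The server has to know how many clients to release, and a client that loops back quickly could consume a second permit, so an event that wakes all waiters is often the simpler fit when the number of clients is not fixed.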