The title of the question may not be clear enough, so allow me to explain the background here:
I would like to design a web service that generates a PDF and submits it to a printer. Here is the workflow:
The user submits a request to the web service. The request should be fire-and-forget so that the user doesn't have to wait for the job to complete; they receive an HTTP 200 and continue their work.
Once the web service receives the request, it generates the PDF and submits it to the designated printer. This process could take some time and CPU resources. Since I don't want to drain all the resources on that server, I may use the producer/consumer pattern here: a queue to hold client jobs, processing them one by one.
My questions are:
I'm new to C#. What is the proper pattern to queue and process these jobs? Should I use ConcurrentQueue and the ThreadPool to achieve it?
What is the proper way to notify the user whether the job succeeded or failed? Instead of using a callback service, is async an ideal way? My concern is that there may be lots of jobs in the queue, and I don't want the client to have to wait for completion.
The web service is placed behind a load balancer. How can I maintain a process queue across instances? I've tried Hangfire and it seems okay, but I'm looking for alternatives.
How can I find out the number of jobs in the queue and how many threads are currently running? The web service will be deployed on IIS. Is there a native way to achieve this, or should I implement a web service call to obtain them?
Any help will be appreciated, thanks!
WCF supports the idea of fire-and-forget methods. You just mark your contract interface method as one-way, and there will be no waiting for a return:
[OperationContract( IsOneWay = true )]
void PrintPDF( PrintRequest request );
The only downside, of course, is that you won't get any notification from the server that your request was successful or even valid. You'd have to do some kind of periodic polling to see what's going on. I guess you could put a Guid into the PrintRequest, so you could interrogate for that job later.
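To support that kind of polling, the request contract itself could carry the job id. A minimal sketch of what that might look like (the PrintRequest member names here are assumptions for illustration, not from the original service):

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical request contract: the client generates the JobId up
// front, so it can poll for status later even though PrintPDF is
// one-way and returns nothing.
[DataContract]
public class PrintRequest
{
    [DataMember]
    public Guid JobId { get; set; }

    [DataMember]
    public string PrinterName { get; set; }

    [DataMember]
    public byte[] DocumentData { get; set; }
}
```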
If you're not married to WCF, you might consider SignalR... there's a comprehensive sample app of both a server and a simple WPF client here. It has the advantage that either party can initiate an exchange once the connection has been established.
If you need to stick with WCF, there's the possibility of doing dual HTTP. The client connects with an endpoint to call back to... and the server can then post notifications as work completes. You can get a feel for it from this sample.
Both SignalR and WCF dual HTTP are pretty straightforward. I guess my preference would be based on the experience of the folks doing the work. SignalR has the advantage of playing nicely with browser-based clients... if that ever turns into a concern for you.
As for the queue itself... and keeping with the WCF model, you want to make sure your requests are serializable... so if need be, you can drain the queue and restart it later. In WCF, that typically means making data contracts for queue items. As an aside, I never like to send a boatload of arguments to a service; I prefer instead to make a data contract for method parameters and return types.
Data contracts are typically just simple types marked up with attributes to control serialization. The WCF methods do the magic of serializing/deserializing your types over the wire without you having to do much thinking. The client sends a whizzy and the server receives a whizzy as its parameter.
There are caveats... in particular, deserialization doesn't call your constructor (the DataContractSerializer creates instances via FormatterServices.GetUninitializedObject)... so you can't rely on the constructor to initialize properties. To that end, you have to remember that, for example, collection types that aren't required might need to be lazily initialized. For example:
[DataContract]
public class ClientState
{
    private static readonly object sync = new object( );

    //--> and then somewhat later...

    [DataMember( Name = "UpdateProblems", IsRequired = false, EmitDefaultValue = false )]
    List<UpdateProblem> updateProblems;

    /// <summary>Problems encountered during previous Windows Update sessions</summary>
    public List<UpdateProblem> UpdateProblems
    {
        get
        {
            lock ( sync )
            {
                if ( updateProblems == null ) updateProblems = new List<UpdateProblem>( );
                return updateProblems;
            }
        }
    }

    //--> ...and so on...
}
Something I always do is to mark the backing variable as the serializable member, so deserialization doesn't invoke the property logic. I've found this to be an important "trick".
Producer/consumer is easy to write...and easy to get wrong. Look around on StackOverflow...you'll find plenty of examples. One of the best is here. You can do it with ConcurrentQueue and avoid the locks, or just go at it with a good ol' simple Queue as in the example.
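As a minimal sketch of the producer/consumer idea, here's one way to do it with BlockingCollection<T> (which wraps a ConcurrentQueue<T> by default). The PrintJob type and the single consumer thread are illustrative assumptions, not part of the original question:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical job type for illustration.
public class PrintJob
{
    public Guid Id { get; } = Guid.NewGuid();
    public string Document { get; set; }
}

public class PrintQueue : IDisposable
{
    private readonly BlockingCollection<PrintJob> jobs = new BlockingCollection<PrintJob>();
    private readonly Task consumer;
    public int Processed;

    public PrintQueue()
    {
        // A single consumer thread drains the queue one job at a time,
        // so at most one PDF is being generated concurrently.
        consumer = Task.Run(() =>
        {
            foreach (var job in jobs.GetConsumingEnumerable())
            {
                // ...generate the PDF and submit it to the printer here...
                Interlocked.Increment(ref Processed);
            }
        });
    }

    // Producers (e.g. web service request handlers) just add and return.
    public void Enqueue(PrintJob job) => jobs.Add(job);

    public void Dispose()
    {
        jobs.CompleteAdding();   // let the consumer drain remaining jobs and exit
        consumer.Wait();
        jobs.Dispose();
    }
}
```

The nice part is that GetConsumingEnumerable blocks when the queue is empty and wakes up when work arrives, so you don't have to write the wait/pulse logic yourself.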
But really...you're so much better off using some kind of service bus architecture and not rolling your own queue.
Being behind a load balancer means you probably want them all calling to one service instance that manages a single queue. You could roll your own, or you could let each instance manage its own queue; that might be more processing than you want going on on your server instances... that's your call. With WCF dual HTTP, you may need your load balancer to be configured for client affinity... so you can have session-oriented two-way communications. SignalR supports a message bus backed by SQL Server, Redis, or Azure Service Bus, so you don't have to worry about affinity with a particular server instance. It has performance implications that are discussed here.
I guess the most salient advice is...find out what's out there and try to avoid reinventing the wheel. By all means, go for it if you're in burning/learning mode and can afford the time. But, if you're getting paid, find and learn the tools that are already in the field.
Since you're using .Net on both sides, you might consider writing all your contracts (service contracts and data contracts) into a .DLL that you use on both the client and the service. The nice thing about that is it's easy to keep things in sync, and you don't have to use the (rather weak) generated data contract types that come through WSDL discovery or the service reference wizard, and you can spin up client instances using ChannelFactory<IYourServiceContract>.
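The ChannelFactory route looks roughly like this (the contract name, binding, and address below are placeholders, assuming the one-way PrintPDF contract from earlier lives in the shared DLL):

```csharp
using System.ServiceModel;

// IPrintService is assumed to be the shared contract interface from the
// common DLL; the endpoint address is a placeholder.
var factory = new ChannelFactory<IPrintService>(
    new BasicHttpBinding(),
    new EndpointAddress( "http://localhost:8080/print" ) );

IPrintService client = factory.CreateChannel( );
client.PrintPDF( new PrintRequest( ) );   // fire-and-forget, per the one-way contract

( (IClientChannel)client ).Close( );
factory.Close( );
```

Because both sides compile against the exact same interface, there's no generated proxy code to regenerate every time the contract changes.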
This is a high-level question about something I am currently architecting; I cannot seem to find the exact answer I am looking for.
Scenario:
I have a .Net Core REST API that will be receiving requests from an external application. These requests will be getting pushed into a RabbitMQ instance. These notifications will be thrown to an exchange, then fanned out to multiple queues for multiple consumers.
There is one consumer that I will be responsible for and I am looking for advice on best practices. Ultimately, there will be a REST API that will eventually need to react to these messages being pushed into the queue. This REST API in question is a containerized (Docker) app running on a Kubernetes cluster. It will be receiving a lot of request traffic outside of these notifications (queue messages), making SQL calls, etc.
My question is: should I have an external microservice (hosted service/background service) that subscribes to this queue with the intent of calling into said REST API? Kind of like a traffic cop, routing messages to the appropriate API method based on certain data points.
Or
Would it be OK to put this consumer directly into the high-traffic REST API in question?
Any advice around this? Thanks in advance!
There is no right or wrong. This is the whole dilemma around monolith vs. microservices and synchronous vs. asynchronous.
If you are looking at going with microservices and more asynchronous, you can start with these questions:
Do you want to split your system into different codebases?
Do you want to divide responsibilities among different teams?
Do you want to use different languages/projects for the different components?
Do you want some components of the system to respond faster to the user?
Can your app be ok with the fact that one decoupled component may fail completely?
should I have an external microservice (hosted service/background service) that subscribes to this queue with the intent of calling into said REST API. Kind of like a traffic cop; routing messages to the appropriate API method based on certain data points.
Yes, if you are thinking more on the microservices route and the answer is 'yes' for most of the above questions (and even more microservices related questions not mentioned).
If you are thinking more about the monolith route:
Are you ok with the same code base shared across the different teams?
Are you ok with a more unified programming language?
Do you want to have a monorepo? (although you can do micro-services with monorepos)
Is the codebase going to be mainly worked on by a few people who know it really well?
Is it easy to provide redundancy within the app? i.e If one component fails the application doesn't crash.
Would it be OK to put this consumer directly into the high-traffic REST API in question?
Yes, if your code can handle it and you are more in line with 'yes' on the answers above.
Say I have a scatter gather setup like this:
1) Web app
2) RabbitMQ
3) Scatter gather API 1
4) Scatter gather API 2
5) Scatter gather API x
Say each scatter-gather API (and any new ones added in future) needs to supply an image/update an image to the web app, so that when the web app displays the results on screen it also displays the image. What is the best way to do this?
1) A RESTful call from each API to the web app, adding/updating an image where necessary
2) Use message queue to send the image
I believe option two is best because I am using a microservices architecture. However, this would mean that the image could be processed by the web app after requests are made (if competing consumers are used). Therefore the image could be missing from the webpage?
The problem with option 1 is that the scatter-gather APIs are tightly coupled to the web app.
What is the appropriate way to approach this?
The short answer: There is no right way to do this.
The long answer: Because there's no right way to do this, there's a danger that any answer I give you will be an opinion. Rather than do that, I'm going to help clarify the ramifications of each option you've proposed.
First thing to note: Unless there is already an image available at the time of the HTTP request, then your HTTP response will not be able to include an image. This means that your front-end will need to be updated after the HTTP request/response cycle has concluded. There are two ways to do this: polling via AJAX requests, or pushing via sockets.
The advantage of polling is that it is probably easier to integrate into an existing web app. The advantage of pushing the image to the client via sockets is that the client won't need to spam your server with polling requests.
Second thing to note: Reporting back the image from the scatter/gather workers could happen either via an HTTP endpoint, or via the message queue, as you suggest.
The advantage of the HTTP endpoint is that it would likely be simpler to set up. The advantage of the message queue is that the worker would not have to wait for the HTTP response (which could take a while if you're writing a large image file to disk) before moving on to the next job.
One more thing to note: If you choose to use an HTTP endpoint to create/update the images, it is possible that multiple scatter/gather workers will be trying to do this at the same time. You'll need to handle this to prevent multiple workers from trying to write to the same file at the same time. You could handle this by using a mutex to lock the file while one process is writing to it. If you choose to use a message queue, you'll have several options for dealing with this: you could use a mutex, or you could use a FIFO queue that guarantees the order of execution, or you could limit the number of workers on the queue to one, to prevent concurrency.
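As a sketch of the mutex option in C# (the mutex naming scheme here is an assumption; a named Mutex is visible across processes on the same machine, so it serializes writers from different worker processes):

```csharp
using System.IO;
using System.Threading;

public static class ImageWriter
{
    // Serializes writes to a given image file across processes by taking
    // a machine-wide named mutex derived from the file name, so two
    // workers can't write the same file at the same time.
    public static void Write( string path, byte[] imageBytes )
    {
        using ( var mutex = new Mutex( false, "image-writer-" + Path.GetFileName( path ) ) )
        {
            mutex.WaitOne( );
            try
            {
                File.WriteAllBytes( path, imageBytes );
            }
            finally
            {
                mutex.ReleaseMutex( );
            }
        }
    }
}
```

Note this only protects workers on the same machine; if the workers are spread across hosts, you'd need a distributed lock or the queue-based options instead.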
I do have experience with a similar system. My team and I chose to use a message queue. It worked well for us, given our constraints. But, ultimately, you'll need to decide which will work better for you given your constraints.
EDIT
The constraints we considered in choosing a message queue over HTTP included:
Not wanting to add private endpoints to a public facing web app
Not wanting to hold up a worker to wait on an HTTP request/response
Not wanting to make synchronous that which was asynchronous
There may have been other reasons. Those are the ones I remember off the top of my head.
I don't know too much about services, so if I am trying to do something they are not intended for, please forgive me.
I am trying to write dispatching software for a family member. They plan on starting with 3 or 4 dispatchers, but it may scale in the future. I need the software to constantly (every 5 or 10 seconds at the very least) check whether a new unhandled call has been placed when not in a call, or, if they are in a call, whether another dispatcher has updated the call (due to a call-in with additional information).
Which option would be better for the above scenario
A) Have a table in a database that tracks updates to calls/ new calls and poll it every 5 - 10 seconds from every instance of the software.
B) Have a service running on the machine that has the database and have that service take care of all SQL. Create an instance of each call in the service and then just ask the service if there are any changes or unhandled call.
If B, is it possible to create a delegate in the service that the software on another (networked) machine can subscribe to? If so where might I find information on doing that, I could not find anything on google.
This is kind of broad.
However, you can use any of the following:
A DB trigger to watch for inserts etc., then do all the fabulous DB stuff when triggered.
Create a Windows Service that polls; that's not a problem at all.
You could even self-host a WCF server with a duplex contract that other software subscribes to; you could then send notifications etc. via that channel.
Or use SignalR for notifications, which would work just fine in this situation as well, and is a five-minute job to get working.
There are lots of approaches here, though; you really need to do some research to find what suits you best.
Solution B is better.
"If B, is it possible to create a delegate in the service that the software on another (networked) machine can subscribe to? If so where might I find information on doing that, I could not find anything on google."
It depends on your need and project type.
You can use SignalR in ASP.Net
If you work with sockets, you can keep the connection alive, store each client's context in a list, and notify them.
While evaluating queueing mechanisms in general, and Rebus in particular, I came up with the following questions about bus instance lifecycle:
1. When access to the bus instance (one-way client mode) is needed from several WCF services hosted in a Windows service, is Singleton mode the only option for instancing?
2. Is there a way to pause a bus (stop dispatching messages to the message handlers) and then start it again? Or is the only option to dispose it and create a new one?
A use case for this is when you connect to systems that have throughput limitations or transactions-per-hour limits.
3. Can sagas have multiple workers? If so, and assuming the events were sent in the correct order (initiator first), is there a way to guarantee that the initiator is handled first (thereby creating the saga) before the following events are handled by multiple workers?
4. If several bus instances are used in the same host, and inside a message handler we call Send on another bus instance based on the same configuration, the correlation ID won't be transmitted, and things like Reply won't work properly, right?
I'd prefer concrete answers on how Rebus could support this or not, with code references/examples.
1: It's really simple: The bus instance (i.e. the implementation of IBus that gets put into the container and is handed to you when you do the Configure.With(...) configuration spells) is supposed to be a singleton instance that you keep around for the entire duration of your application's lifetime.
You can easily create multiple instances though, but that would be useful only for hosting multiple Rebus endpoints in the same process.
IOW the bus is fully reentrant and can safely be shared among threads in your web application.
2: Not readily, no - at least not in a way that is supported by the public API. You can do this though: ((RebusBus)bus).SetNumberOfWorkers(0) (i.e. cast the IBus instance to RebusBus and change the number of worker threads), which will block until the number of workers has been adjusted to the desired number.
This way, you can actually achieve what you're after. It's just not an official feature of Rebus (yet), but it might be in the future. I can guarantee, though, that the ability to adjust the number of workers at runtime will not go away.
3: Yes, sagas are guarded by an optimistic concurrency scheme no matter which persistence layer you choose. If you're unsure which type of message will arrive first at your saga, you should make your saga tolerant to this - i.e. just implement IAmInitiatedBy<> for each potentially initiating message type and make the saga handle that properly.
Being (fairly) tolerant to out-of-order messages is a good general robustness principle that will serve you well also when messages are redelivered after having stayed a while in an error queue.
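As a sketch of what such an order-tolerant saga might look like (the message and data type names are illustrative, and the correlation API shown is from the newer Rebus versions, so the exact shape may differ in older releases):

```csharp
using System.Threading.Tasks;
using Rebus.Handlers;
using Rebus.Sagas;

// Hypothetical message types for illustration.
public class OrderPlaced { public string OrderId { get; set; } }
public class OrderShipped { public string OrderId { get; set; } }

public class OrderSagaData : SagaData
{
    public string OrderId { get; set; }
    public bool Placed { get; set; }
    public bool Shipped { get; set; }
}

// Implementing IAmInitiatedBy<> for BOTH message types means the saga
// instance gets created no matter which message happens to arrive first.
public class OrderSaga : Saga<OrderSagaData>,
    IAmInitiatedBy<OrderPlaced>,
    IAmInitiatedBy<OrderShipped>
{
    protected override void CorrelateMessages(ICorrelationConfig<OrderSagaData> config)
    {
        // Both message types correlate to the same saga instance by OrderId.
        config.Correlate<OrderPlaced>(m => m.OrderId, d => d.OrderId);
        config.Correlate<OrderShipped>(m => m.OrderId, d => d.OrderId);
    }

    public Task Handle(OrderPlaced message)
    {
        Data.OrderId = message.OrderId;
        Data.Placed = true;
        return Task.CompletedTask;
    }

    public Task Handle(OrderShipped message)
    {
        Data.OrderId = message.OrderId;
        Data.Shipped = true;
        return Task.CompletedTask;
    }
}
```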
4: Rebus will pick up the current message context even though you're using multiple bus instances because it uses an "ambient context" (i.e. a MessageContext instance mounted on the worker thread) to pick up the fact that you're sending a message from within a handler, which in turn will cause the correlation ID of the handled message to be copied to any outgoing messages.
Thus bus.Reply will work, too.
But as I stated in (1), the bus instance is fully reentrant and there's no need to have multiple instances around, unless they're actually logically different endpoints.
I hope this answers your questions :)
I have created a simple WCF (.NET 3.5) service which defines 10 contracts, which are basically calculations on the supplied data. At the moment I expect quite a few clients to call some of these contracts. How do I make the service more responsive? I have a feeling that the service will wait until it processes one request before going to the next one.
How can I use multithreading in WCF to speed things up ?
While I agree with Justin's answer, I believe some more light can be shed here on how WCF works.
You make the specific statement:
"I have a feeling that the service will wait until it process one request to go to the next one. How can I use multithreading in WCF to speed things up?"
The concurrency of a service (how many calls it can take simultaneously) depends on the ConcurrencyMode value for the ServiceBehavior attached to the service. By default, this value is ConcurrencyMode.Single, meaning, it will serialize calls one after another.
However, this might not be as much of an issue as you think. If your service's InstanceContextMode is equal to InstanceContextMode.PerCall then this is a non-issue; a new instance of your service will be created per call and not used for any other calls.
If you have a singleton or a session-based service object, however, then the calls to that instance of the service implementation will be serialized.
You can always change the ConcurrencyMode, but be aware that if you do, you will have to handle concurrency issues and access to your resources manually, as you have explicitly told WCF that you will do so.
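As a sketch of what that opt-in looks like (the service and contract names are placeholders; the point is that once you choose Multiple, guarding shared state is on you):

```csharp
using System.ServiceModel;

// Explicitly opting in to concurrent calls on a single shared service
// instance. WCF no longer serializes calls for you, so any shared
// state must be guarded manually (here, with a simple lock).
[ServiceBehavior(
    InstanceContextMode = InstanceContextMode.Single,
    ConcurrencyMode = ConcurrencyMode.Multiple )]
public class CalculatorService : ICalculatorService   // hypothetical contract
{
    private readonly object sync = new object( );
    private int callCount;

    public int Calculate( int input )
    {
        lock ( sync )
        {
            callCount++;   // shared state: must be protected
        }
        return input * 2;  // placeholder calculation
    }
}
```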
It's important not to change these just because you think that it will lead to increased performance. While not so much for concurrency, the instancing aspect of your service is very much a part of the identity of the service (if it is session or not session-based) and changing them impacts the clients consuming the service, so don't do it lightly.
Of course, this speaks nothing to whether or not the code that is actually implementing the service is efficient. That is definitely something that needs looking into, when you indicate that is the case.
This is definitely premature optimization. Implement your services first and see whether there's an issue or not.
I think you'll find that you are worrying about nothing. The server won't block on a single request as that request processes. IIS/WCF should handle things for you nicely as-is.
I'm not familiar with WCF, but can the process be async?
If you are expecting a huge amount of data and intensive calculations, one option could be to send an id, calculate the values in a separate thread and then provide a method to return the result using the initial id.
Something like:
int id = Service.CalculateX(...);
...
var y = Service.GetResultX(id);
By default, the instancing is PerSession.
See WCF Service defaults.
However, if you use a binding that doesn't support sessions (like BasicHttpBinding), or the channel/client does not create a session, then this behaves like PerCall.
See Binding type session support (https://learn.microsoft.com/en-us/dotnet/framework/wcf/system-provided-bindings).
Each WCF client object will create a session, and for each session there will be a server instance with a single thread that services all calls from that particular WCF client object synchronously.
Multiple clients therefore each have their own session by default, and thus their own server instance and thread, and will not block each other.
They will only affect each other through shared resources like the DB, CPU, etc.
See Using sessions.
Like others suggested you should make sure the implementation is efficient BEFORE you start playing with the Instancing and Concurrency modes.
You could also consider client side calculations if there is no real reason to make a call to the server.