RabbitMQ, REST API, Docker, and Kubernetes Best Practice Question

This is a high level question that I am asking for something I am currently architecting and cannot seem to find the exact answer I am looking for.
Scenario:
I have a .NET Core REST API that receives requests from an external application. These requests are pushed into a RabbitMQ instance: each notification is published to an exchange, then fanned out to multiple queues for multiple consumers.
There is one consumer that I will be responsible for, and I am looking for advice on best practices. Ultimately, there is a REST API that will need to react to the messages pushed into the queue. This REST API is a containerized (Docker) app running on a Kubernetes cluster. It will be receiving a lot of request traffic outside of these notifications (queue messages), making SQL calls, etc.
My question is: should I have an external microservice (hosted service/background service) that subscribes to this queue with the intent of calling into said REST API? It would act like a traffic cop, routing messages to the appropriate API method based on certain data points.
Or
Would it be OK to put this consumer directly into the high-traffic REST API in question?
Any advice around this? Thanks in advance!

There is no right or wrong. This is the whole dilemma around monolith vs. microservices and synchronous vs. asynchronous.
If you are looking at going with microservices and a more asynchronous design, you can start with these questions:
Do you want to split your system into different codebases?
Do you want to divide responsibilities among different teams?
Do you want to use different languages/projects for the different components?
Do you want some components of the system to respond faster to the user?
Can your app be ok with the fact that one decoupled component may fail completely?
"Should I have an external microservice (hosted service/background service) that subscribes to this queue with the intent of calling into said REST API? Kind of like a traffic cop; routing messages to the appropriate API method based on certain data points."
Yes, if you are leaning toward the microservices route and the answer is 'yes' to most of the above questions (and to other microservices-related questions not mentioned here).
If you are thinking more about the monolith route:
Are you ok with the same code base shared across the different teams?
Are you ok with a more unified programming language?
Do you want to have a monorepo? (although you can do micro-services with monorepos)
Is the codebase going to be worked on mainly by a few people who know it really well?
Is it easy to provide redundancy within the app? That is, if one component fails, the application doesn't crash.
"Would it be OK to put this consumer directly into the high-traffic REST API in question?"
Yes, if your code can handle it and you answered 'yes' to most of the monolith questions above.
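Whichever route you pick, the "traffic cop" part can be kept independent of where it is hosted. The following is a minimal sketch, not a definitive implementation: QueueMessage, its Kind field, and MessageRouter are hypothetical names, and the RabbitMQ subscription itself is left out so the example stays self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical message shape: a routing data point (Kind) plus an opaque payload.
public sealed record QueueMessage(string Kind, string Payload);

// Routes each message to the handler registered for its Kind.
public sealed class MessageRouter
{
    private readonly Dictionary<string, Func<QueueMessage, Task>> _handlers =
        new(StringComparer.OrdinalIgnoreCase);

    public void Register(string kind, Func<QueueMessage, Task> handler) =>
        _handlers[kind] = handler;

    // Returns false when no handler is registered, so the caller can
    // reject or dead-letter the message instead of silently dropping it.
    public async Task<bool> DispatchAsync(QueueMessage message)
    {
        if (!_handlers.TryGetValue(message.Kind, out var handler))
            return false;

        await handler(message);
        return true;
    }
}
```

In a separate worker, each handler would call the REST API over HTTP; inside the API itself, each handler would call the same application code the controllers use. The router does not change either way, which keeps the hosting decision reversible.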

Related

Communicating Rserve and C#

Most of the questions that somewhat relate to this link to old/deprecated resources, so I'm asking it once more.
I have a special tool that gets a lot of traffic and performs complex computations in R. We have opted to use Rserve so that we can handle a large influx of concurrent requests, but we cannot figure out how to connect our C# ASP.NET web application to Rserve. We have Rserve up and running, but how can we actually communicate and make requests straight to Rserve from a C# application? The documentation for https://github.com/konne/RserveCLI2 isn't great. Can someone help us understand how to call our functions?
Note: We have a plumber implementation up and running. It works great, but it seems to have issues with large amounts of concurrent requests as it simply queues them. The documentation talks about off-loading and creating parallel processes, but this may require a lot of parallel processes if each can only handle 1 request.

RESTful web service vs. message queue when using scatter-gather

Say I have a scatter gather setup like this:
1) Web app
2) RabbitMQ
3) Scatter gather API 1
4) Scatter gather API 2
5) Scatter gather API x
Say each scatter gather API (and any new ones added in the future) needs to supply an image to the web app, or update one, so that when the web app displays the results on screen it also displays the image. What is the best way to do this?
1) RESTful call from each API to the web app, adding/updating an image where necessary
2) Use a message queue to send the image
I believe option two is best because I am using a microservices architecture. However, this would mean that the image could be processed by the web app after requests are made (if competing consumers are used). Therefore, the image could be missing from the webpage?
The problem with option 1 is that the scatter gather APIs are tightly coupled to the web app.
What is the appropriate way to approach this?

The short answer: There is no right way to do this.
The long answer: Because there's no right way to do this, there is a danger that any answer I give you will be an opinion. Rather than do that, I'm going to help clarify the ramifications of each option you've proposed.
First thing to note: Unless there is already an image available at the time of the HTTP request, then your HTTP response will not be able to include an image. This means that your front-end will need to be updated after the HTTP request/response cycle has concluded. There are two ways to do this: polling via AJAX requests, or pushing via sockets.
The advantage of polling is that it is probably easier to integrate into an existing web app. The advantage of pushing the image to the client via sockets is that the client won't need to spam your server with polling requests.
Second thing to note: Reporting back the image from the scatter/gather workers could happen either via an HTTP endpoint, or via the message queue, as you suggest.
The advantage of the HTTP endpoint is that it would likely be simpler to set up. The advantage of the message queue is that the worker would not have to wait for the HTTP response (which could take a while if you're writing a large image file to disk) before moving on to the next job.
One more thing to note: If you choose to use an HTTP endpoint to create/update the images, it is possible that multiple scatter/gather workers will be trying to do this at the same time. You'll need to handle this to prevent multiple workers from trying to write to the same file at the same time. You could handle this by using a mutex to lock the file while one process is writing to it. If you choose to use a message queue, you'll have several options for dealing with this: you could use a mutex, or you could use a FIFO queue that guarantees the order of execution, or you could limit the number of workers on the queue to one, to prevent concurrency.
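As a rough illustration of the mutex option, here is a minimal sketch that serializes writes per file. It assumes all writes go through a single web app process (a cross-process or multi-server setup would need a different lock), and ImageStore is a hypothetical name.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public sealed class ImageStore
{
    // One semaphore per target path; with a count of 1 it acts as an async-friendly mutex.
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _locks = new();

    public async Task SaveAsync(string path, byte[] imageBytes)
    {
        // Normalize the path so different spellings of the same file share one lock.
        var gate = _locks.GetOrAdd(Path.GetFullPath(path), _ => new SemaphoreSlim(1, 1));

        await gate.WaitAsync();
        try
        {
            // Only one writer per file gets here at a time.
            await File.WriteAllBytesAsync(path, imageBytes);
        }
        finally
        {
            gate.Release();
        }
    }
}
```

Writes to different files still run in parallel; only writes to the same file wait for each other. The last writer wins, so if ordering matters you would still want one of the queue-based options.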
I do have experience with a similar system. My team and I chose to use a message queue. It worked well for us, given our constraints. But, ultimately, you'll need to decide which will work better for you given your constraints.
EDIT
The constraints we considered in choosing a message queue over HTTP included:
Not wanting to add private endpoints to a public facing web app
Not wanting to hold up a worker to wait on an HTTP request/response
Not wanting to make synchronous that which was asynchronous
There may have been other reasons. Those are the ones I remember off the top of my head.

Design question about background processing web service [closed]

The title of the question may not be clear enough, so allow me to explain the background here:
I would like to design a web service that generates a PDF and submits it to a printer. Here is the workflow:
The user submits a request to the web service. The request will probably be a one-off, fire-and-forget call, so that the user doesn't have to wait for the job to complete. The user may receive an HTTP 200 and continue their work.
Once the web service receives the request, it generates the PDF and submits it to the designated printer. This process could take some time and CPU resources. As I don't want to drain all the resources on that server, I may use the producer-consumer pattern here: there would be a queue of client jobs that are processed one by one.
My questions are:
I'm new to C#. What is the proper pattern to queue and process the jobs? Should I use ConcurrentQueue and ThreadPool to achieve it?
What is the proper way to notify the user that a job succeeded or failed? Instead of using a callback service, is async an ideal way? My concern is that there may be lots of jobs in the queue, and I don't want the client to wait for them to complete.
The web service is placed behind a load balancer. How can I maintain a 'process queue' among the instances? I've tried Hangfire and it seems okay, but I'm looking for alternatives.
How can I know the number of jobs in the queue and how many threads are currently running? The web service will be deployed on IIS. Is there a native way to achieve this, or should I implement a web service call to obtain them?
Any help will be appreciated, thanks!

WCF supports the idea of a fire-and-forget methods. You just mark your contract interface method as one way, and there will be no waiting for a return:
[OperationContract( IsOneWay = true )]
void PrintPDF( PrintRequest request );
The only downside, of course, is that you won't get any notification from the server that your request was successful or even valid. You'd have to do some kind of periodic polling to see what's going on. I guess you could put a Guid into the PrintRequest, so you could interrogate for that job later.
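To make the Guid idea concrete, here is a minimal sketch of the server-side bookkeeping. JobState and JobTracker are hypothetical names, and an in-memory dictionary is assumed, so the state is lost on restart and is not shared between servers.

```csharp
using System;
using System.Collections.Concurrent;

public enum JobState { Queued, Succeeded, Failed }

public sealed class JobTracker
{
    private readonly ConcurrentDictionary<Guid, JobState> _jobs = new();

    // Called when the one-way request arrives; the id is the Guid the client
    // put into its request. Returns false if that id was already registered.
    public bool TryRegister(Guid jobId) => _jobs.TryAdd(jobId, JobState.Queued);

    // Called by the worker when the job finishes.
    public void Complete(Guid jobId, bool success) =>
        _jobs[jobId] = success ? JobState.Succeeded : JobState.Failed;

    // Called by the polling operation; null means the id is unknown.
    public JobState? GetState(Guid jobId) =>
        _jobs.TryGetValue(jobId, out var state) ? state : (JobState?)null;
}
```

The client generates the Guid, sends it with the one-way call, and later asks a second, ordinary (two-way) operation for the state of that id.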
If you're not married to WCF, you might consider SignalR...there's a comprehensive sample app of both a server and simple WPF client here. It has the advantage that either party can initiate an exchange once the connection has been established.
If you need to stick with WCF, there's the possibility of doing dual HTTP. The client connects with an endpoint to call back to...and the server can then post notifications as work completes. You can get a feel for it from this sample.
Both SignalR and WCF dual HTTP are pretty straightforward. I guess my preference would be based on the experience of the folks doing the work. SignalR has the advantage of playing nicely with browser-based clients...if that ever turns into a concern for you.
As for the queue itself...and keeping with the WCF model, you want to make sure your requests are serializable...so if need be, you can drain the queue and restart it later. In WCF, that typically means making data contracts for queue items. As an aside, I never like to send a boatload of arguments to a service; I prefer instead to make a data contract for method parameters and return types.
Data contracts are typically just simple types marked up with attributes to control serialization. The WCF methods do the magic of serializing/deserializing your types over the wire without you having to do much thinking. The client sends a whizzy and the server receives a whizzy as its parameter.
There are caveats...in particular, the deserialization doesn't call your constructor (I believe it uses FormatterServices.GetUninitializedObject instead)...so you can't rely on the constructor or on field initializers to initialize properties. To that end, you have to remember that, for example, collection types that aren't required might need to be lazily initialized. For example:
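using System.Collections.Generic;
using System.Runtime.Serialization;

[DataContract]
public class ClientState
{
    // Static, so it exists even though deserialization skips instance field initializers.
    private static readonly object sync = new object( );

    //--> and then somewhat later...

    // The backing field, not the property, is the serialized member.
    [DataMember( Name = "UpdateProblems", IsRequired = false, EmitDefaultValue = false )]
    List<UpdateProblem> updateProblems;

    /// <summary>Problems encountered during previous Windows Update sessions</summary>
    public List<UpdateProblem> UpdateProblems
    {
        get
        {
            lock ( sync )
            {
                // Still null if the member was absent from the serialized data.
                if ( updateProblems == null ) updateProblems = new List<UpdateProblem>( );
                return updateProblems;
            }
        }
    }

    //--> ...and so on...
}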
Something I always do is to mark the backing variable as the serializable member, so deserialization doesn't invoke the property logic. I've found this to be an important "trick".
Producer/consumer is easy to write...and easy to get wrong. Look around on Stack Overflow...you'll find plenty of examples. One of the best is here. You can do it with ConcurrentQueue and avoid the locks, or just go at it with a good ol' simple Queue as in the example.
But really...you're so much better off using some kind of service bus architecture and not rolling your own queue.
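If you do roll your own for learning purposes, here is a minimal sketch of a single-consumer producer/consumer queue. It uses BlockingCollection (which wraps a ConcurrentQueue) rather than the plain Queue mentioned above; JobQueue is a hypothetical name, and the queue is in-memory only, so jobs are lost if the process recycles.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class JobQueue<T> : IDisposable
{
    private readonly BlockingCollection<T> _items = new(new ConcurrentQueue<T>());
    private readonly Task _consumer;

    public JobQueue(Action<T> process)
    {
        // A single long-running consumer handles jobs one at a time, in FIFO order.
        _consumer = Task.Factory.StartNew(() =>
        {
            foreach (var item in _items.GetConsumingEnumerable())
            {
                try { process(item); }
                catch (Exception) { /* log here; one bad job should not stop the queue */ }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Number of jobs waiting to be processed.
    public int Pending => _items.Count;

    // Producers (for example, request threads) call this and return immediately.
    public void Enqueue(T item) => _items.Add(item);

    // Stops accepting work, then waits for the queued jobs to finish.
    public void Dispose()
    {
        _items.CompleteAdding();
        _consumer.Wait();
        _items.Dispose();
    }
}
```

The request handler only calls Enqueue, so the caller gets its response right away while the PDF work happens on the consumer thread. Pending gives a cheap answer to "how many jobs are in the queue".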
Being behind a load balancer means you probably want them all calling to a service instance to manage a single queue. You could roll your own, or you could let each instance manage its own queue. That might be more processing than you want going on on your server instances...that's your call. With WCF dual HTTP, you may need your load balancer to be configured to have client affinity...so you can have session-oriented two-way communications. SignalR supports a message bus backed by SQL Server, Redis, or Azure Service Bus, so you don't have to worry about affinity with a particular server instance. It has performance implications that are discussed here.
I guess the most salient advice is...find out what's out there and try to avoid reinventing the wheel. By all means, go for it if you're in burning/learning mode and can afford the time. But, if you're getting paid, find and learn the tools that are already in the field.
Since you're using .Net on both sides, you might consider writing all your contracts (service contracts and data contracts) into a .DLL that you use on both the client and the service. The nice thing about that is it's easy to keep things in sync, and you don't have to use the (rather weak) generated data contract types that come through WSDL discovery or the service reference wizard, and you can spin up client instances using ChannelFactory<IYourServiceContract>.

Is there a best practice for throttling service calls to Windows services?

My team ships a client API that allows applications to communicate with our Windows service. There is a concern that malicious apps could flood our service with requests, so we want to put some throttling logic in the client API to prevent DoS attacks like this.
Is there a best practice for implementing throttling logic for Windows services? All I can find online is throttling for web (which makes sense). I imagine the same ideas would apply, but I am wondering if there is an established mechanism to do this when it's all on the local system.

You might find an open source project like TokenBucket to be suitable to implement this.
See Wikipedia for a discussion of leaky bucket algorithms in general.
For your application you may need to consider how to prevent multiple instances of your API from being called from the same machine, same process or multiple threads and that might affect how you implement this.
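To show the general idea, here is a minimal token bucket sketch. It is not the API of the TokenBucket project mentioned above; the class and its parameters are hypothetical, and the clock is injected only to make the behavior easy to test.

```csharp
using System;

public sealed class TokenBucket
{
    private readonly double _capacity;
    private readonly double _refillPerSecond;
    private readonly Func<DateTime> _clock;
    private readonly object _sync = new object();
    private double _tokens;
    private DateTime _last;

    public TokenBucket(double capacity, double refillPerSecond, Func<DateTime> clock)
    {
        _capacity = capacity;
        _refillPerSecond = refillPerSecond;
        _clock = clock;
        _tokens = capacity;   // start full, which allows an initial burst
        _last = clock();
    }

    // Returns true if the call may proceed, false if it should be rejected or delayed.
    public bool TryConsume(double tokens = 1)
    {
        lock (_sync)
        {
            var now = _clock();
            // Guard against a clock that moves backwards.
            var elapsed = Math.Max(0, (now - _last).TotalSeconds);
            _last = now;

            // Refill for the elapsed time, never above capacity.
            _tokens = Math.Min(_capacity, _tokens + elapsed * _refillPerSecond);

            if (_tokens < tokens) return false;
            _tokens -= tokens;
            return true;
        }
    }
}
```

A bucket created with new TokenBucket(10, 5, () => DateTime.UtcNow) allows bursts of up to 10 calls and a sustained rate of 5 calls per second. Note that throttling inside the client API only restrains well-behaved callers; a malicious app can bypass the client library, so the service itself would need its own bucket (for example, one per calling process) to be protected.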

C# Thoughts on this design

I have this problem domain where I need to be able to run a background process that would:
Run a filter to get an obj collection (time consuming operation)
Pass the obj coll through a set of rules...maybe thru a rule interface
Be able to expose any changes that the rules caused to any interested listeners.
Each filter may have many rules and there can be more than one filter.
What would be a practical way to approach this? I'm thinking:
Have a WCF app hosted in a Windows Service that would expose a callback for rule changes
Let the service do the grunt work of running filter->rules. Will this need to run on a separate thread?
Any thoughts or references to existing frameworks, design patterns, etc. are welcome.
thanks
Sunit

If your background process needs to be instantly (24/7/365) accessible from remote machines, the Windows service makes a lot of sense to me. Assuming you are familiar with C#, it is trivial to create a Windows service. You can follow the step-by-step here. Once you've got the Windows service going, it's easy to host the WCF service by creating the System.ServiceModel.ServiceHost instance in the OnStart callback of the Windows service. As far as WCF patterns and good practices, I'll refer you to Juval Lowy's website, IDesign.net. The site has a lot of free WCF-related downloads that you can get just by providing your email address.

You have a couple of options. The two most obvious are that either the client calls a method that starts the job and then polls the server for status, or you set up a callback.
Either way, the job should be run on a separate thread so it doesn't block the service.
If you go with the poll-for-status route, put the actual result in the returned status.
If you go with the callback, use the WSDualHttpBinding and set up a callback. This looks a little scary to set up, but it's really not that bad.
I'll let someone else chime in for actual patterns or frameworks; I'm just not sure. Also, check out MSMQ; this might be another viable solution.
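Independently of how the service is hosted, the filter->rules work from the question can be sketched in plain C#. This is only an illustration under assumptions: IRule, DelegateRule, and FilterRunner are hypothetical names, a .NET event stands in for the WCF callback, and Task.Run supplies the separate thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical contract: a rule inspects an item and reports whether it changed it.
public interface IRule<T>
{
    bool Apply(T item);
}

// Convenience wrapper so a rule can be written as a lambda.
public sealed class DelegateRule<T> : IRule<T>
{
    private readonly Func<T, bool> _apply;
    public DelegateRule(Func<T, bool> apply) => _apply = apply;
    public bool Apply(T item) => _apply(item);
}

public sealed class FilterRunner<T>
{
    private readonly Func<IEnumerable<T>> _filter;
    private readonly IReadOnlyList<IRule<T>> _rules;

    // Raised once per item that at least one rule changed.
    public event Action<T> ItemChanged;

    public FilterRunner(Func<IEnumerable<T>> filter, IEnumerable<IRule<T>> rules)
    {
        _filter = filter;
        _rules = rules.ToList();
    }

    // Runs the slow filter and all rules on a thread-pool thread so the caller is not blocked.
    // The task result is the number of changed items.
    public Task<int> RunAsync() => Task.Run(() =>
    {
        var changedCount = 0;
        foreach (var item in _filter())
        {
            var changed = false;
            foreach (var rule in _rules)
                changed |= rule.Apply(item);   // non-short-circuit: every rule runs

            if (changed)
            {
                changedCount++;
                ItemChanged?.Invoke(item);
            }
        }
        return changedCount;
    });
}
```

With several filters, you would create one FilterRunner per filter, each with its own rule list. Listeners are invoked on the background thread, so a handler that touches UI or shared state has to marshal or synchronize on its own.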

You could use Windows Workflow Foundation (WF) to take care of the rules. You should be able to host WF in a service.
