I'm creating an application that I want to put into the cloud. This application has one main function.
It hosts socket CLIENT sessions on behalf of other users (think of Beejive IM for the iPhone, where it hosts IM sessions for clients to maintain state on those IM networks, allowing the client to connect/disconnect at will, without breaking the IM network connection).
As I've planned it, one 'worker instance' can likely handle only a finite number of client sessions (let's say 50,000 for argument's sake). Those sessions will be very long-lived worker tasks.
The issue I'm trying to get my head around is that I will sometimes need to perform tasks on specific client sessions (e.g. if I need to disconnect a client session). With Azure, would I be able to queue up a smaller task that only the instance hosting that specific client session would be able to dequeue?
Right now I'm contemplating GoGrid as my provider, and I solve this issue by using Apache ActiveMQ. My web app enqueues 'disconnect' tasks that are assigned to a specific instance ID. Each client session is therefore assigned to a specific instance ID. The instance then only dequeues 'disconnect' tasks that are assigned to it.
I'm wondering if it's feasible to do something similar on Azure, and how I would generally do it. I like the idea of not having to set up many different VMs to scale, but instead just deploying a single package. Also, it would be nice to make use of Azure's Queues instead of integrating a third-party product such as Apache ActiveMQ, or even MSMQ.
I'd be very concerned about building a production application on Azure until the feature set, pricing, and licensing terms are finalized. For starters, you can't even do a cost comparison between it and e.g. GoGrid or EC2 or Mosso. So I don't see how it could possibly end up a front-runner. Also, we know that all of these systems will have glitches as they mature. Amazon's services are in much wider use than any of the others, and have been publicly available for many years. IMHO choosing Azure is a recipe for pain as they stabilize.
Have you considered Amazon's Simple Queue Service for queueing?
I think you can absolutely use Windows Azure for this. My recommendation would be to create a queue for each session you're tracking. Then enqueue the disconnect message (for example) on the queue for that session. The worker instance that's handling that connection should be the only one polling that queue, so it should handle performing the task on that connection.
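To make the queue-per-session idea concrete, here is a minimal in-memory sketch in Python. It is not Azure code: `SessionQueues` is a hypothetical stand-in for whatever queue service you use, and the `Worker` class and the "disconnect" message format are assumptions made for illustration.

```python
import queue


class SessionQueues:
    """One in-memory queue per client session (stand-in for a cloud queue service)."""

    def __init__(self):
        self._queues = {}

    def create(self, session_id):
        self._queues[session_id] = queue.Queue()

    def enqueue(self, session_id, message):
        self._queues[session_id].put(message)

    def poll(self, session_id):
        """Return the next message for this session, or None if the queue is empty."""
        try:
            return self._queues[session_id].get_nowait()
        except queue.Empty:
            return None


class Worker:
    """A worker instance that polls only the queues of the sessions it hosts."""

    def __init__(self, name, queues):
        self.name = name
        self.queues = queues
        self.sessions = set()
        self.handled = []  # (session_id, message) pairs this worker acted on

    def host(self, session_id):
        self.queues.create(session_id)
        self.sessions.add(session_id)

    def poll_once(self):
        # sorted() copies the set, so removing sessions inside the loop is safe.
        for session_id in sorted(self.sessions):
            message = self.queues.poll(session_id)
            if message == "disconnect":
                self.sessions.discard(session_id)
                self.handled.append((session_id, message))
```

Because each worker polls only the queues of sessions it hosts, a message enqueued for a session can never be picked up by another instance. The cost is one poll per session per cycle, which may matter at tens of thousands of sessions; a single queue per instance would reduce that.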
Regarding the application hosting socket connections for clients to connect to, I'd double-check on what's allowed as I think only HTTP and HTTPS connections are allowed to be made with Azure.
We have a microservice based application/ website hosted currently in Azure, and we need to have a function where we press on a button, and it sends some data to another webservice currently hosted inside our corporate network.
Our IT bods are against being able to POST to a service hosted inside our network, and I am wondering how people normally deal with this problem.
I can think of 2 possible solutions, neither of which I like particularly:
Set up a VPN to the internal network, which feels a bit of a heavy solution to me
The internal network service polls the cloud application for changes of state continuously, and triggers an update process when a change is recorded. This will generate a lot more traffic than I would ideally want.
How do other people address this issue? Essentially I just want to send some data from the cloud into our network in a secure fashion. Pulls from our network are OK, but pushes into it are not.
Even sending a signal to get the internal network to initiate a pull would also work fine.
Both the solutions you came up with are fairly common patterns in Azure architecture. Of the two, the second would be the one I would generally choose for this particular scenario, but it does depend on how fast you need the push to happen. VPN is going to be the fastest as you have a direct connection between your Azure service and your internal one, but it is a bit more complex to set up for a single pipeline.
The second is generally accomplished through a messaging service like Service Bus as it adds a lot of resiliency to that sort of arrangement. You can configure your onprem service to ping Service Bus based on the interval you define: more often if you need the updates to happen quickly, less often if you want to reduce traffic. Depending on the size of the data, you can load it directly into Service Bus for pickup or the message can contain the location of the required data. Event Grid is another option for a messaging service. It sends notifications out instead of waiting for you to poll, so it would be a good choice if you wanted to ping your onprem service to reach out and pick up the changes.
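As a rough illustration of that polling arrangement, here is a sketch in Python. It does not use the Service Bus SDK: the queue is a plain `deque`, the "blob store" is a dict, and the names `make_message`, `resolve`, `poll` and the 256-byte inline limit are all assumptions. It shows the two ideas above: a configurable poll interval, and messages that carry either the data itself or only its location.

```python
import time
from collections import deque


def make_message(payload, blob_store, max_inline_bytes=256):
    """Inline small payloads; store large ones and send only their location."""
    if len(payload) <= max_inline_bytes:
        return {"kind": "inline", "data": payload}
    location = f"blob-{len(blob_store)}"  # hypothetical storage key
    blob_store[location] = payload
    return {"kind": "reference", "location": location}


def resolve(message, blob_store):
    """Return the payload, fetching it from storage if the message is a reference."""
    if message["kind"] == "inline":
        return message["data"]
    return blob_store[message["location"]]


def poll(pending, blob_store, handle, interval_seconds, max_polls):
    """Drain the queue, then sleep for the interval; repeat max_polls times."""
    for _ in range(max_polls):
        while pending:
            handle(resolve(pending.popleft(), blob_store))
        time.sleep(interval_seconds)
```

The on-premises side would run `poll` with whatever interval balances latency against traffic. Since the internal service initiates every connection, nothing has to be opened inbound on the corporate firewall.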
If you are open to using Logic Apps to do the push, it accesses onprem resources via a data gateway that you install inside your network. It does use Service Bus in the background to accomplish this so you will be using your second solution, but it would be a bit simpler from a development perspective.
I've tried researching this, but haven't found much that sounds similar to something I'm needing to implement. In short, we'll be running an ASP Website on a server that will be accessed by clients. Ideally, we have a function that we want to initialize upon the start of a user's session, and stop when the session ends. While the session is happening, this function sends and receives messages via socket communication, meaning we need to access the send/receive functions of this class from pages in order to move information. What's the best way to go about this?
Look into SignalR. That's probably what you're wanting. Its "hubs" are effectively what you're looking for to spin up on session initiation, and spin down when the user disappears. It has a client-side JS library that automatically chooses the best connection method available (e.g., websockets > server-sent-events > long-polling), and it allows you to send messages both from the client to the server, and from the server to the client.
http://www.asp.net/signalr
Another alternative that I've played around with in the past is XSockets:
https://xsockets.net/
It's similar to SignalR in many respects, but it's not free.
It's hard to tell from your description: are you looking to communicate with the client browser via sockets? Or are you trying to communicate with some other service via sockets?
Web applications are not ideally suited for deterministic types of actions. It's difficult for the web server to know whether or not the client has actually closed their browser. In most cases, sessions simply time out after a period of inactivity (20+ minutes in most cases). So you cannot reliably know when the user's session has actually ended.
To top it off, there are certain edge cases where Session_End will not fire. For instance, if the app pool recycles, then no Session_End event will fire. This may not be an issue, since if the app pool recycles your other connections would also recycle, but it's still an issue to keep in mind.
Finally, Web apps are not intended to be long running.
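To illustrate why session end is only ever inferred, here is a small Python sketch of inactivity-based expiry. It is not ASP.NET's session implementation; `SessionTracker` and its methods are hypothetical names, and the 20-minute default simply mirrors the typical timeout mentioned above.

```python
class SessionTracker:
    """Tracks last activity per session and expires sessions after inactivity."""

    def __init__(self, timeout_seconds=20 * 60):
        self.timeout = timeout_seconds
        self.last_seen = {}  # session_id -> timestamp of last request

    def touch(self, session_id, now):
        """Record activity; call this on every request from the session."""
        self.last_seen[session_id] = now

    def expire(self, now):
        """Remove and return sessions that have been idle for the full timeout."""
        expired = [
            session_id
            for session_id, seen in self.last_seen.items()
            if now - seen >= self.timeout
        ]
        for session_id in expired:
            del self.last_seen[session_id]
        return expired
```

Any socket tied to a session can only be torn down when `expire` reports it, which may be many minutes after the user actually left. If the process restarts, the in-memory state is lost without any expiry callback, much like the app pool recycle case.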
We have a number of different old school client-server C# WinForm client-side apps that are essentially front-ends for the database. Then there is a C# server-side windows service that waits on the client apps to submit orders and then it processes them.
The way the server-side service finds out whether there is work to do is that it polls the database. Over the years the logic of polling for waiting orders has gotten a lot more complicated due to the myriad of business rules. So because of this, the polling stored proc itself uses quite a bit of SQL Server resources even if there is nothing to do. Add to this the requirement that the orders be processed the moment they are submitted and you got yourself a performance problem, as the database is being polled constantly.
The setup actually works fine right now, but the load is about to go through the roof, and it is obvious that it won't hold up.
What are some effective ways to communicate between a bunch of different client-side apps and a server-side windows service, that will be more future-proof than the current method?
The database server is SQL Server 2005. I can probably get the powers that be to pony up for latest SQL Server if it really comes to that, but I'd rather not fight that battle.
There are numerous ways you can notify the clients.
You can use a ready-made solution like NServiceBus, to publish information from the server to the clients or other servers. NServiceBus uses MSMQ to publish one message to multiple subscribers in a very easy and durable way.
You can use MSMQ or another queuing product to publish messages from the server that will be delivered to the clients.
You can host a WCF service on the Windows service and connect to it from each client using a Duplex channel. Each time there is a change the service will notify the appropriate clients or even all of them. This is more complex to code but also much more flexible. You could probably send enough information back to the clients that they wouldn't need to poll the database at all.
You can have the service broadcast a UDP packet to all clients to notify them there are changes they need to pull. You can probably add enough information in the packet to allow the clients to decide whether they need to pull data from the server or not. This is very lightweight for the server and the network, but it assumes that all clients are on the same LAN.
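As a sketch of the UDP option, the Python below shows one possible shape for such a notification packet. The JSON layout, the `change_id` counter, port 50000 and the function names are all assumptions for illustration; the real apps in this thread are C#, where the same idea maps onto `UdpClient`.

```python
import json
import socket


def encode_notice(table, change_id):
    """Pack a small change notice into a datagram payload."""
    return json.dumps({"table": table, "change_id": change_id}).encode("utf-8")


def decode_notice(datagram):
    """Unpack a datagram produced by encode_notice."""
    notice = json.loads(datagram.decode("utf-8"))
    return notice["table"], notice["change_id"]


def broadcast_notice(table, change_id, port=50000, address="255.255.255.255"):
    """Send the notice to every listener on the local network segment."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(encode_notice(table, change_id), (address, port))


def needs_pull(datagram, last_seen_change_id):
    """Receiver-side check: pull only if the notice is newer than what we have."""
    _, change_id = decode_notice(datagram)
    return change_id > last_seen_change_id
```

UDP delivery is not guaranteed, so a receiver would still want an occasional fallback poll in case a notice is dropped.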
Perhaps you can leverage SqlDependency to receive notifications only when the data actually changes.
You can use any messaging middleware like MSMQ, JMS or TIBCO to communicate between your client and the service.
By far the easiest, and most likely the cheapest, answer is to simply buy a bigger server.
Barring that, you are in for a development effort that has a high probability of early failure. By failure I don't mean that you end up scrapping whatever it is you end up building. Rather, I mean you launch the changes and orders will be screwed up while you are debugging your myriad of business rules.
Quite frankly, I wouldn't consider approaching a communications change under pressure, presuming your statement about load going "through the roof" refers to the near term.
If your risk exposure is such that it has to be 100% functional on day one with no hiccups (which is normal when you are expecting a large increase in orders), then just upsize the DB server. Heck, I wouldn't even install the latest SQL Server on it. Instead, just buy a larger machine, install the exact same OS and DB server (and patch levels) and move your database.
Then look at your architecture to determine what needs to go away and what can be salvaged.
If everybody connects to SQL Server then there is also the option of Service Broker. Unlike the other messaging/queueing solutions recommended so far, it is entirely contained in your database (no separate product to deploy, administer and configure), it offers a single story vis-a-vis your backup/recovery and high availability needs (no separate backup for the message store, no separate DR/HA; whatever your DB solution is, it is also your messaging solution) and it offers a uniform programming API (SQL).
Even when everything is within one single SQL Server instance (i.e. there is no need to communicate over the network between multiple SQL Server instances) Service Broker still has an ace that no one can match: activation. With activation you completely eliminate the need to poll, because the system itself will launch your processing code (will 'activate') when there are events to process. The processing code can be internal (T-SQL procedure or SQLCLR .NET procedure) or external (see external activator).
So basically I am thinking about attempting load testing on my ASP.NET application using various features all at once. There are a lot of dependencies and AJAX requests being performed in this application, so it seems like a simple replay of captured HTTP requests will not suffice. Due to other requirements, like picking out random operations, performing them, and then verifying results across several machines, simple load testing software will not suffice either.
Also there is no budget to this project for spending, so commercial implementations can not be used. I'm debating on trying to use MSMQ (never used before) to handle communication between clients, but if that is really complicated to set up then I would either use a database table as a queue or a simple TCP server with each test machine as its clients.
Features I want are: immediate failure (one client crashes, then all clients should stop), each test run should start with a brand new scenario with no prior messages, and ability to publish a start and stop event. Also it would be nice if I don't have to worry about state management (leaning towards TCP server for this over database) or concurrency.
It doesn't sound like MSMQ is what you need. It is a message-passing asynchronous communication method, akin to email. You can send a message to another queue that no one is even listening to (i.e. the application isn't running). It seems to me you want a more "online" communication model.
How about creating agents (client applications that sit on many machines and create the load) that expose a WCF service where a controller program can connect to all of them and instruct the agents what to do? It can be a duplex contract, so that the agents can send the controller notifications. When one of them sends an error notification, the controller can instruct all the other agents to shut down. Also I'd go for a Net.TCP binding rather than an HTTP binding.
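The controller/agent arrangement with immediate failure can be sketched roughly as follows. This is plain Python with threads standing in for machines and a shared `threading.Event` standing in for the duplex WCF channel; `Controller`, `report_error` and `run_agent` are hypothetical names, not part of any WCF API.

```python
import threading


class Controller:
    """Collects error reports from agents and signals everyone to stop."""

    def __init__(self):
        self.stop_event = threading.Event()
        self.failures = []  # (agent_name, error description) pairs
        self._lock = threading.Lock()

    def report_error(self, agent_name, error):
        with self._lock:
            self.failures.append((agent_name, error))
        self.stop_event.set()  # one failure stops every agent


def run_agent(name, steps, controller):
    """Run each step in order; stop early if this or any other agent failed.

    Returns the number of steps that completed successfully.
    """
    completed = 0
    for step in steps:
        if controller.stop_event.is_set():
            break
        try:
            step()
        except Exception as exc:
            controller.report_error(name, repr(exc))
            break
        completed += 1
    return completed
```

Creating a fresh `Controller` per test run gives each run a clean slate with no leftover messages, which is harder to guarantee with a durable queue such as MSMQ.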
I am using WCF and I am putting a chatroom facility in my C# program. So I need to be able to send information from the server to the clients for two events -
When a user connects/disconnects I update the list of connected users and send that back to all clients for display in a TextBlock
When a user posts a message, I need the server to send that message out to all clients
So I am looking for advice on the best way of implementing this. I was going to use netTcpBinding for duplex callbacks to clients, but then I ran into some issues regarding not being able to call back the client if the connection is closed. I need to use percall instances for scalability. I was advised in this thread that I shouldn't leave connections open as it would 'significantly limit scalability' - WCF duplex callbacks, how do I send a message to all clients?
However I had a look through the book Programming WCF Services and the author seems to state that this is not an issue because 'In between calls, the client holds a reference on a proxy that doesn’t have an actual object at the end of the wire. This means that you can dispose of the expensive resources the service instance occupies long before the client closes the proxy'
So which is correct, is it fine to keep proxies open on clients?
But even if that is fine, it leads to another issue. If the service instances are destroyed between calls, how can they do duplex callbacks to update the clients? Regarding percall instances, the author of Programming WCF Services says 'Because the object will be discarded once the method returns, you should not spin off background threads or dispatch asynchronous calls back into the instance'
Would I be better off having clients poll the service for updates? I would have imagined that this is much more inefficient than duplex callbacks, clients could end up polling the service 50+ times as often as using a duplex callback. But maybe there is no other way? Would this be scalable? I envisage several hundred concurrent users.
Since I am guilty of telling you that server callbacks won't scale, I should probably explain a bit more. Let me start by addressing your questions:
Without owning the book in question, I can only assume that the author is either referring to HTTP-based transports or request-response only, with no callbacks. Callbacks require one of two things: either the server needs to maintain an open TCP connection to the client (meaning that there are resources in use on the server for each client), or the server needs to be able to open a connection to a listening port on the client. Since you are using netTcpBinding, your situation would be the former. wsDualHttpBinding is an example of the latter, but that introduces a lot of routing and firewall issues that make it unworkable over the internet (I am assuming that the public internet is your target environment here; if not, let us know).
You have intuitively figured out why server resources are required for callbacks. Again, wsDualHttpBinding is a bit different, because in that case the server is actually calling back to the client over a new connection in order to send the async reply. This basically requires ports to be opened on the client's side and punched through any firewalls, something that you can't expect of the average internet user. Lots more on that here: WSDualHttpBinding for duplex callbacks
You can architect this a few different ways, but it's understandable if you don't want the overhead (and potential for delay) of the clients constantly hammering the server for updates. Again, at several hundred concurrent users, you are likely still within the range that one good server could handle using callbacks, but I assume you'd like to have a system that can scale beyond that if needed (or at peak times). What I'd do is this:
Use callback proxies (I know, I told you not to)... Clients connecting create new proxies, which are stored in a thread-safe collection and occasionally checked for live-ness (and purged if found to be dead).
Instead of having the server post messages directly from one client to another, have the server post the messages to some Message Queue Middleware. There are tons of these out there: MSMQ is popular with Windows, ActiveMQ and RabbitMQ are FOSS (Free Open Source Software), and Tibco EMS is popular in big enterprises (but can be very expensive). What you probably want to use is a topic, not a queue (more on queues vs topics here).
Have a thread (or several threads) on the server dedicated to reading messages off of the topic, and if that message is addressed to a live session on that server, deliver that message to the proxy on the server.
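The three steps above can be sketched in Python as follows. This is an illustration, not WCF code: a plain callable stands in for each client's callback proxy, an iterable of `(session_id, message)` pairs stands in for the topic, and `CallbackRegistry` and `dispatch` are hypothetical names. Liveness is checked lazily here, by purging a proxy when a delivery to it fails.

```python
import threading


class CallbackRegistry:
    """Thread-safe map of live sessions on this server to their callback proxies."""

    def __init__(self):
        self._lock = threading.Lock()
        self._callbacks = {}  # session_id -> callable taking one message

    def register(self, session_id, callback):
        with self._lock:
            self._callbacks[session_id] = callback

    def deliver(self, session_id, message):
        """Deliver to a local session; purge its proxy if the call fails.

        Returns True only if the message was delivered.
        """
        with self._lock:
            callback = self._callbacks.get(session_id)
        if callback is None:
            return False  # session is not hosted on this server
        try:
            callback(message)  # called outside the lock so slow clients don't block others
            return True
        except Exception:
            with self._lock:
                self._callbacks.pop(session_id, None)  # treat as dead
            return False


def dispatch(topic_messages, registry):
    """Deliver every topic message addressed to a live local session."""
    return sum(
        1
        for session_id, message in topic_messages
        if registry.deliver(session_id, message)
    )
```

Messages addressed to sessions on other servers are simply ignored, which is what lets every server subscribe to the same topic.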
Here's a rough sketch of the architecture:
This architecture should allow you to automatically scale out by simply adding more servers, and load balancing new connections among them. The message queueing infrastructure would be the only limiting factor, and all of the ones I mentioned would scale beyond any likely use case you'd ever see. Because you'd be using topics and not queues, every message would be broadcast to each server, so you might need to figure out a better way of distributing the messages, like using hash-based partitioning.
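For the hash-based partitioning mentioned above, a minimal Python sketch might look like this. The function name `partition_for` and the choice of SHA-256 are assumptions; any stable hash works, as long as every server computes the same partition for the same session ID.

```python
import hashlib


def partition_for(session_id, partition_count):
    """Map a session ID to a stable partition number in [0, partition_count)."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a partition.
    return int.from_bytes(digest[:8], "big") % partition_count
```

Each server would then subscribe only to the topics (or partitions) for the sessions it hosts, instead of receiving every message. Python's built-in `hash()` is avoided because it is randomized per process for strings. Note that with plain modulo, changing `partition_count` remaps most sessions; consistent hashing is the usual remedy if servers are added and removed often.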