Chatroom functionality with WCF, duplex callbacks vs polling?

Chatroom functionality with WCF, duplex callbacks vs polling? - c#

I am using WCF and I am putting a chatroom facility in my C# program. So I need to be able to send information from the server to the clients for two events -
When a user connects/disconnects I update the list of connected users and send that back to all clients for display in a TextBlock
When a user posts a message, I need the server to send that message out to all clients
So I am looking for advice on the best way of implementing this. I was going to use netTcpBinding for duplex callbacks to clients but then I ran into some issues regarding not being able to call back the client if the connection is closed. I need to use percall instances for scalibility. I was advised in this thread that I shouldnt leave connections open as it would 'significantly limit scalibity' - WCF duplex callbacks, how do I send a message to all clients?
However I had a look through the book Programming WCF Services and the author seems to state that this is not an issue because 'In between calls, the client holds a reference on a proxy that doesn’t have an actual object at the end of the wire. This means that you can dispose of the expensive resources the service instance occupies long before the client closes the proxy'
So which is correct, is it fine to keep proxies open on clients?
But even if that is fine it leads to another issue. If the service instances are destroyed between call, how can they do duplex callbacks to update the clients? Regarding percall instances, the author of Programming WCF Services says 'Because the object will be discarded once the method returns, you should not spin off background threads or dispatch asynchronous calls back into the instance'
Would I be better off having clients poll the service for updates? I would have imagined that this is much more inefficient than duplex callbacks, clients could end up polling the service 50+ times as often as using a duplex callback. But maybe there is no other way? Would this be scalable? I envisage several hundred concurrent users.

Since I am guilty of telling you that server callbacks won't scale, I should probably explain a bit more. Let me start by addressing your questions:
Without owning the book in question, I can only assume that the author is either referring to http-based transports or request-response only, with no callbacks. Callbacks require one of two things- either the server needs to maintain an open TCP connection to the client (meaning that there are resources in use on the server for each client), or the server needs to be able to open a connection to a listening port on the client. Since you are using netTcpBinding, your situation would be the former. wsDualHttpBinding is an example of the latter, but that introduces a lot of routing and firewall issues that make it unworkable over the internet (I am assuming that the public internet is your target environment here- if not, let us know).
You have intuitively figured out why server resources are required for callbacks. Again, wsDualHttpBinding is a bit different, because in that case the server is actually calling back to the client over a new connection in order to send the async reply. This basically requires ports to be opened on the client's side and punched through any firewalls, something that you can't expect of the average internet user. Lots more on that here: WSDualHttpBinding for duplex callbacks
You can architect this a few different ways, but it's understandable if you don't want the overhead (and potential for delay) of the clients constantly hammering the server for updates. Again, at several hundred concurrent users, you are likely still within the range that one good server could handle using callbacks, but I assume you'd like to have a system that can scale beyond that if needed (or at peak times). What I'd do is this:
Use callback proxies (I know, I told you not to)... Clients connecting create new proxies, which are stored in a thread-safe collection and occasionally checked for live-ness (and purged if found to be dead).
Instead of having the server post messages directly from one client to another, have the server post the messages to some Message Queue Middleware. There are tons of these out there- MSMQ is popular with Windows, ActiveMQ and RabbitMQ are FOSS (Free Open Source Software), and Tibco EMS is popular in big enterprises (but can be very expensive). What you probably want to use is a topic, not a queue (more on queues vs topics here).
Have a thread (or several threads) on the server dedicated to reading messages off of the topic, and if that message is addressed to a live session on that server, deliver that message to the proxy on the server.
Here's a rough sketch of the architecture:
This architecture should allow you to automatically scale out by simply adding more servers, and load balancing new connections among them. The message queueing infrastructure would be the only limiting factor, and all of the ones I mentioned would scale beyond any likely use case you'd ever see. Because you'd be using topics and not queues, every message would be broadcast to each server- you might need to figure out a better way of distributing the messages, like using hash-based partitioning.

Related

Does Keep-Alive in WCF Use More Resources than Polling?

I could probably setup a couple test-bed applications and find out, but I'm hoping someone has already experienced this or just simply has a more intuitive understanding. I have three executables. Two different clients (call them Client1.exe and Client2.exe) and a WCF service host (call it Host.exe) that hosts what's more or less a message bus type service for the two clients. I won't get into the "why's" as that's a long story and not productive to this question.
The point is this, Client1 sends a request through this service to Client2. Client2 performs operations, then responds with results to Client1. Client1 will always be initiator of requests, so this order of operations will always be consistent this way. This also means that Client1 can open it's channels to communicate to this service as-needed, whereas due to the need of callback services, Client2 has to keep it's channels open. I've began by attempting to keep-alive. However, these are all three on the desktop and PC sleep events, or other issues (not sure) seem to interfere with it. And once it times out, everything has to be restarted which makes it a real pain. I have some ideas I may try to help the keep-alive approach, but this brought up a question that I don't have an answer to... is this the best use of my resources.
The way I figure it, there are two main approaches for Client2,
Keep-Alive with a lot of monitoring (timers and checking of connection states) and connection resetting code which would be faster since it could respond to requests immediately. The downside is this has to be kept alive throughout the time that the user keeps Client2 open on their desktop which could be short and sweet to crazy-long.
Poll periodically for a request which would allow the resources to only be used when checking or processing a request from Client1. This would be slower since poll requests would not be real-time, but would eliminate any external issue concerns disconnecting the service. This would also cause me to have to add more state to the service. It's already a PerSession service with a list of available instances of Client2 ID's so that Client1 knows which instance it's talking to, but it would add more.
Client2 performs many other functions and so still has to be very performant with this process, which makes me wonder which is most likely to cost in resources? Is the polling approach more costly in resources? Or attempting to keep-alive?

why is wcf duplex required?

WCF duplex performs a callback after a method has run on the server that then runs code on the client.
If i want to execute a method on the client from the server at the push of a button on the server then i don't think WCF duplex is appropriate.
Why would i not just create a client and a server at each end of my 2 applications?

I was one of the people that commented on your previous question so I probably owe you an answer here :o)
You have posted rather a lot of code and I have not looked at it in detail. However, in general terms, there is a reason for using wsDualHttpBinding and duplex contracts in general instead of more of a peer-to-peer approach where you have services on both sides, as follows:
The duplex approach is appropriate where you have a clearly defined server that is running permanently. This provides the hub of the interaction. The idea is that clients are in some way more transient than the server. The clients can start up and shut down or move location and the server does not need to be aware of them in advance. When the client starts up, it is pre-configured to know where the server is, so it can "register" itself with the server.
In contrast, the server does not need to be preconfigured to know where the clients are. It starts up and can run independently of any clients. It just accepts "registrations" from all clients that have valid credentials whenever they come online, and can continue to run after the client goes offline. Also, if the client moves, it just re-registers itself with the server at its new location.
So the server is in some sense a more "important" part of the system. No client can participate in the communication without the server, but the server can operate independently of any client.
To do this with WCF duplex service, you have to do some extra work yourself to implement the publish/subscribe behaviour. Fortunately, the MSFT Patterns and Practises team have provided some guidance on how to do it
http://msdn.microsoft.com/en-us/library/ms752254.aspx
This is fundamentally different from a genuine peer-to-peer approach where there is no well-defined hub (i.e. server) for the network and each node can come and go without affecting the overall functioning of the network.

WCF Duplex is used when you have a Publish/Subscribe setting (also known as the Observer Pattern). Let's say you have a service that subscribes for notifications of some sort (e.g. new email). Normally, you would need to check periodically for updates. Using WCF Duplex, the subscriber can be notified automatically by the publisher when there are updates.

message broker consumer/producer with reassign when client goes down?

I am looking for a message broker API to use it with c#.
Normally the things are quite simple. I have a server that knows what jobs are to do and I have some clients that need to get these jobs.
And here are the special requirements I have:
If a client got a job but fails to answer within a specific time, then another client should do the work.
More than one queue and priorities
If possible it needs to work with big message queues (this way I could just load all jobs sometimes a month and forget about it
secured communications would be good.
API for talking with the broker from c#. How much work is done? What is still to do?
Delete some jobs...
If available replication to another broker would be good.
The broker needs to run on windows
What is not an issue:
low latency (there is no problem when a message needs minutes)
Do you know such a message broker that is free to use?

RabbitMQ and several other AMQP implementations satisfy most of (if not all of) these requirements.
RabbitMQ allows clients to acknowledge receipt and/or processing of messages. As per http://www.rabbitmq.com/tutorials/amqp-concepts.html#message-acknowledge:
If a consumer dies without sending an acknowledgement the AMQP broker
will redeliver it to another consumer or, if none are available at the
time, the broker will wait until at least one consumer is registered
for the same queue before attempting redelivery.
Many queues (and in fact many brokers) are supported, in a variety of different configurations
It scales particularly well, even for very large message queues: http://www.rabbitmq.com/faq.html#performance
Encryption is supported: http://www.rabbitmq.com/faq.html#channel-encryption
There is a .NET Client Users Guide and API docs: http://www.rabbitmq.com/documentation.html
There is live failover if a broker dies: http://www.rabbitmq.com/clustering.html
It runs on Windows, Linux, and probably anything else that has an Erlang implementation

WCF or Custom Socket Architecture

I'm writing a client/server architecture where there are going to be possibly hundreds of clients over multiple virtual machines, mostly on the intranet but some in other locations.
Each client will be gathering data constantly and sending a message to a server every second or so. Each message will probably be about 128 characters or so in length.
My question is, for this architecture where I am writing both client/server in .NET is should I go with WCF or some socket code I've written previously. I need scalability (which the socket code has in mind), reliability and just the ability to handle that many messages.

I would not make final decision without peforming some proof of concept. Create very simple service, host it and use some stress test to get real performance results. Than validate results against your requirements. You have mentioned amount of messages but you didn't mentioned expected response time. There is currently discussed similar question on MSDN forum which complains about slow response time of WCF compared to sockets.
Other requirements are not directly mentioned in your post so I will make some assumption for best performance:
Use netTcpBinding - best performance, binary encoding, requires .NET server / clients. I guess you are going to use Net.Tcp because your other choice was direct socket programming.
Don't use security if you don't have to - reduces performance. Probably not possible for clients outside your intranet.
Reuse proxy on clients if possible. Openning TCP connection is expensive if you reuse the same proxy you will have single connection per proxy. This will affect instancing of you services - by default single service instance will handle all requests from single proxy.
Set service throttling so that your service host is ready for many clients
Also you should make some decisions about load balancing. Load balancing for WCF net.tcp connections requires sticky sessions (session affinity) so that after openning the channel client always calls the service on the same server (bacause instance of that service was created only on single server).

100 requests per second does not sound like much for a WCF service, especially with that little payload. But it should be quite quick to setup a simple setup with a WCF service with one echo method just returning the input and then hook up a client with a bunch of threads and a loop.
If you already have a working socket implementation you might keep it, but otherwise you can pick WCF and spend your precious development time elsewhere.

From my experience with WCF, i can tell you that it's performance on high load is very very nice. Especially you can chose between several bindings to achieve your requirements for the different scenarios (httpBinding for outside communication, netPeerTcpBinding in local network e.g.).

How would you notifiy clients about changed data on the server using .Net 2.0?

Imagine a WinForms client app that displays fairly complex calculated data fetched from a server app with .Net Remoting over a HTTPChannel.
Since the client app might be running for a whole workday, I need a method to notify the client that new data is available so the user is able to start a reload of the data when he needs to.
Currently I am using remoted .Net events, serializing the event to the client and then rethrowing the event on the side of the client.
I am not very happy with this setup and plan to reimplement it.
Important for me is:
.Net 2.0 based technology
easy of use
low complexity
robust enough to survive a server or client restart still functional
When limited to .Net 2.0, how would you implement such a feature? What technologies / libraries would you use?
I am looking for inspiration on how to attack the problem.
Edit:
The client and server exist in the same organisation, typically a LAN, perhaps a WAN/VPN situation.
This mechanism should only make the client aware that there is new data available. I'd like to keep remoting for getting the actual data to the client since that is working pretty well. MSMQ comes with windows, doesn't it? So it should be ok to use it, but I'm open to any alternative.

I've implemented a similar notification mechanism using MSMQ. The client machine opens a local, public queue, and then advises the server of it's queue name. When changes occur, the server pushes notifications into all the client queues that it's be made aware of. This way the client will know that data is ready, even if it wasn't running when the notification was sent.
The only downside is that it requires MSMQ on the clients, so this may not work if you don't have that kind of control over your client's machines.
For an extra level of redundancy (for example, if a client machine is completely down, and therefore the client queue is unavailable) you could queue notifications on the server prior to dissemination to clients. Notifications in the server queues are only removed when the client is successfully contacted (or perhaps after 3 failed attempts, etc.)
Also in that regard, if the server fails to deliver messages to a client a measured number of times, over a measured period of time, then support entities are notified, error alerts go out, and the client queue is removed from the list of destinations. When I say "measured" I mean a frequency/duration that makes sense to the setting. In my case, it was 5 retries with 5 minute intervals between attempts.
It might also make sense to have the client "renew" it's notification subscription at intervals. If a renewal doesn't occur, then eventually the client queue is removed from the destination list by a "groomer" process in the service.

It sounds as though you need to implement a message-queue based solution. Easy to implement, can survive reboots, and the technology is mature both on the server (MSMQ, MGQSeries) and on the client (System.Messaging)

If you can't find anything built-in and assuming you know the address of all the clients, you could send them a UDP message when data changes. Using UdpClient, this is very easy. The datagram doesn't even need to contain any data if the client app can assume that any UDP data on a certain port means it needs to get new data from the server.
If necessary, you can even make this a broadcast packet (if you don't know who the clients are and they are on the same subnet as the server), so long as the server isn't too "chatty".
Whatever solution you decide on, I would urge you to avoid having the clients poll. This will create a lot of unecessary network traffic and still won't perform all that well.

I would usually use a UI timer on the client to periodically hit the server to see if there was new or updated data. (Assuming you have a mechanism to identify that you have new data like time stamps for new rows, or file time stamps, or a table with last-calculated dates, etc)
That way the server doesn't have to know about the clients. The clients can check at their leisure, etc.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.