I could probably set up a couple of test-bed applications and find out, but I'm hoping someone has already experienced this or simply has a more intuitive understanding. I have three executables: two different clients (call them Client1.exe and Client2.exe) and a WCF service host (call it Host.exe) that hosts what's more or less a message-bus-type service for the two clients. I won't get into the whys, as that's a long story and not productive to this question.
The point is this: Client1 sends a request through this service to Client2. Client2 performs operations, then responds with results to Client1. Client1 will always be the initiator of requests, so this order of operations will always be consistent. This also means that Client1 can open its channels to the service as needed, whereas Client2, because it needs callback services, has to keep its channels open. I began by attempting to keep the connection alive. However, all three executables run on the desktop, and PC sleep events or other issues (I'm not sure which) seem to interfere with it. Once it times out, everything has to be restarted, which makes it a real pain. I have some ideas I may try to help the keep-alive approach, but this brought up a question that I don't have an answer to: is this the best use of my resources?
The way I figure it, there are two main approaches for Client2:
Keep-Alive with a lot of monitoring (timers and checking of connection states) and connection-resetting code, which would be faster since it could respond to requests immediately. The downside is that the connection has to be kept alive for as long as the user keeps Client2 open on their desktop, which could be anything from short and sweet to crazy-long.
Poll periodically for a request, which would allow the resources to be used only when checking for or processing a request from Client1. This would be slower since poll requests would not be real-time, but it would eliminate concerns about external issues disconnecting the service. It would also force me to add more state to the service. It's already a PerSession service with a list of available Client2 instance IDs so that Client1 knows which instance it's talking to, but polling would add more.
Client2 performs many other functions and so still has to be very performant with this process, which makes me wonder which approach is likely to cost more in resources. Is polling more costly, or is attempting to keep the connection alive?
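To make the keep-alive option more concrete, here is a minimal sketch of the "timers plus connection-resetting code" idea. ConnectionKeeper, its constructor parameters and the Backoff helper are hypothetical names made up for illustration, not WCF APIs. The two delegates stand in for whatever health check and reconnect logic Client2 would actually use (for example, checking the channel's State and recreating a faulted channel).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of WCF): probes a long-lived connection on a
// timer and rebuilds it, backing off exponentially while reconnects keep failing.
public sealed class ConnectionKeeper
{
    private readonly Func<bool> _isHealthy;  // assumption: e.g. channel state is Opened
    private readonly Action _reconnect;      // assumption: abort old channel, open a new one
    private readonly TimeSpan _probeInterval;
    private readonly TimeSpan _maxBackoff;

    public ConnectionKeeper(Func<bool> isHealthy, Action reconnect,
                            TimeSpan probeInterval, TimeSpan maxBackoff)
    {
        _isHealthy = isHealthy;
        _reconnect = reconnect;
        _probeInterval = probeInterval;
        _maxBackoff = maxBackoff;
    }

    // Delay before retry number `attempt` (0-based): baseDelay * 2^attempt, capped at maxDelay.
    public static TimeSpan Backoff(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));
        double ms = baseDelay.TotalMilliseconds * Math.Pow(2, Math.Min(attempt, 30));
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }

    public async Task RunAsync(CancellationToken token)
    {
        int failures = 0;
        while (!token.IsCancellationRequested)
        {
            TimeSpan wait = _probeInterval;
            if (!_isHealthy())
            {
                try
                {
                    _reconnect();
                    failures = 0;  // reconnected: go back to the normal probe interval
                }
                catch (Exception)
                {
                    // Reconnect failed: wait longer before the next attempt.
                    wait = Backoff(failures++, _probeInterval, _maxBackoff);
                }
            }

            try { await Task.Delay(wait, token); }
            catch (OperationCanceledException) { break; }
        }
    }
}
```

While the connection is healthy, the loop costs one cheap state check per probe interval, which is comparable to the cost of a polling timer but without a network round trip. On a desktop it may also be worth triggering a reconnect from the Microsoft.Win32.SystemEvents.PowerModeChanged event after a resume, rather than waiting for the next probe.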
Related
1- Client application sends a request to an HTTP server (ashx file, IHttpHandler).
Remark-1: Client is a .dll which will be hosted by other standalone applications.
Remark-2: Server was first developed as a web service, then for unknown reasons it became very slow, so we implemented it from scratch.
2- Server registers the request in database so that a long duration process is performed on data.
3- Client needs to get notified when the process is finished.
The first thing that crossed my mind was implementing a Timer in the client, although I'm not sure it is OK to do that inside a host application which is not aware of such usage.
Then I wondered whether TcpListener or the lower layers of socket programming might offer something more useful than a high-frequency timer that floods the server with update requests.
So, I appreciate any suggestion on proper way of doing this task.
UPDATE:
After giving some order to my code, I have updated the requirements like this:
1- Server "Broadcast"s the IDs of clients, like: "Client-a, read your instructions", then "Client-c", "Client-j", and so on. This is not mission critical; if a client misses the broadcast, it will come back after one minute on a timer tick and check its instructions.
2- This server is going to be hosted on a shared hosting plan, at least at first. The solution must be acceptable within the limits of shared hosting.
3- preferably all clients connect to the only one socket. No usage of extra resources.
Any recommendation is appreciated.
A while ago I came across an interesting article explaining that putting HttpClient in a using block will dispose of the object when the code has executed but will not close the TCP socket. The TCP state will eventually go to TIME_WAIT and stay in that state, listening for further activity, for 4 minutes (the default).
So basically using this multiple times:
    using (var client = new HttpClient())
    {
        // do something with http client
    }
results in many open TCP connections sitting in TIME_WAIT.
You can read the whole thing here:
You're using HttpClient wrong and it is destabilizing your software
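For contrast, the remedy usually suggested for this is to create one HttpClient and reuse it for the lifetime of the process. The following is only a minimal sketch of that idea. SharedHttp and GetTextAsync are hypothetical names chosen for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical wrapper: one HttpClient shared by the whole process, so
// connections to the same host can be kept open and reused between calls.
public static class SharedHttp
{
    // HttpClient is designed to be used for many requests, including concurrent ones.
    public static readonly HttpClient Client = new HttpClient();

    public static Task<string> GetTextAsync(Uri uri)
    {
        // No using block here: the client is deliberately never disposed per call.
        return Client.GetStringAsync(uri);
    }
}
```

Every caller goes through SharedHttp.Client, so repeated requests do not each leave a socket behind in TIME_WAIT.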
So I was wondering what would happen if I did the same with the ClientBase<TChannel>-derived service class that Visual Studio creates when you right-click a project and select Add Service Reference..., and implemented this:
    // SomeServiceOutThere inherits from ClientBase
    using (var service = new SomeServiceOutThere())
    {
        var serviceRequestParameter = txtInputBox.Text;
        var result = service.BaddaBingMethod(serviceRequestParameter);
        // do some magic (see Fred Brooks quote)
    }
However, I haven't been able to recreate exactly the same behavior, and I wonder why.
I created a small desktop app and added a reference to an IIS-hosted WCF service.
Next I added a button that calls the service via the using block shown above.
After hitting the service the first time, I ran netstat for the IP and this is the result:
So far so good. I clicked the button again, and sure enough, new connection established, while the first one went into TIME_WAIT state:
However, after this, every time I hit the service it would use the ESTABLISHED connection, and not open any more like in the HttpClient demo (even when passing different parameters to the service, but keeping the app running).
It seems that WCF is smart enough to realize there is already an established connection to the server, and uses that.
The interesting part is that, when I repeated the process above, but stopped and restarted the application between each call to the service, I did get the same behavior as with HttpClient:
There are some other potential problems with ClientBase (e.g. see here), and I know that temporarily open sockets may not be an issue at all if traffic to the service is relatively low or the server is set up for a large maximum number of connections. I would still like to be able to test reliably whether this could be a problem, and under what conditions (e.g. a running Windows service hitting the WCF service vs. a desktop application).
Any thoughts?
WCF does not use HttpClient internally. WCF probably uses HttpWebRequest because that API was available at the time and it's likely a bit faster since HttpClient is a wrapper around it.
WCF is meant for high performance use cases so they made sure that HTTP connections are reused. Not reusing connections by default is, in my mind, unacceptable. This is either a bug or a design problem with HttpClient.
The 4.6.2 Desktop .NET Framework contains this line in HttpClientHandler.Dispose:
ServicePointManager.CloseConnectionGroups(this.connectionGroupName);
Since this code is not in CoreCLR, there is no documentation for it. I don't know why it was added. It even has a bug, because the constructor sets this.connectionGroupName = RuntimeHelpers.GetHashCode(this).ToString(NumberFormatInfo.InvariantInfo); so two connectionGroupNames can clash. This is a terrible way of obtaining random numbers that are supposed to be unique.
If you restart the process, there is no way to reuse existing connections. That's why you are seeing the old connections in a TIME_WAIT state. The two processes are unrelated. For all the code in them (and the OS) knows, they are not cooperating in any way. It's also hard (but possible) to save a TCP connection across process restarts. No app that I know of does this.
Are you starting processes so often that this might become a problem? Unlikely, but if so, you can apply one of the general workarounds, such as reducing the TIME_WAIT duration.
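To illustrate the uniqueness point: RuntimeHelpers.GetHashCode makes no promise that two live objects get different values, whereas a process-wide counter does. The sketch below shows one collision-free way to generate such names. It is not what the framework does, and ConnectionGroupNames is a hypothetical class invented for this example.

```csharp
using System.Globalization;
using System.Threading;

// Hypothetical alternative: names derived from a process-wide counter
// cannot repeat until the 64-bit counter wraps around.
public static class ConnectionGroupNames
{
    private static long _next;

    public static string Create()
    {
        // Interlocked.Increment hands every caller a distinct value, even under contention.
        long id = Interlocked.Increment(ref _next);
        return id.ToString(NumberFormatInfo.InvariantInfo);
    }
}
```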
Replicating this is easy: Just start 100k test processes in a loop.
I can't deny the performance benefit of a duplex async call, but some things about it make me wary.
My concern is this: once a client object has been instantiated, will WCF be able to tell which particular client service instance should receive the callback argument?
Can anyone tell me if this is a good idea? If not why not?
    new DuplexChannelFactory<IServerWithCallback>(
        new ClientService(),
        new NetTcpBinding(),
        new EndpointAddress("net.tcp://localhost:1234/" + Guid.NewGuid()))
If the virtual path above is reserved, how can it be discarded? I want the client service lifetime to be fairly short, i.e. make a request, receive a response and, when done receiving, kill it. How bad is the performance penalty of making the client service lifetime short, as opposed to pooling it and keeping it alive longer?
The idea is to avoid timeout issues. When done receiving and sending, dispose ASAP. By convention, client services can't be passed around. If you need info, create a new one, simple, just like EF/L2S etc.
From inside the WCF service itself, how do I kill the session with the client? That is, I don't want the client ending the session. I know I can decorate my operation accordingly, but I want the service to terminate the session itself programmatically when certain conditions are met.
I can affix the port and forward accordingly to resolve any firewall issue, but what I'm worried about is if the client were to sit behind a load-balancer. How would the service know which particular server to call?
I think in the end duplex services are simply another failed architecture from Microsoft. This is one of those things that looked really good on paper but just falls apart upon closer examination.
There are too many weaknesses:
1) Reliance on a session for the server to establish the client listener. This session information is stored in memory, hence the server itself cannot be load balanced. Or, if it is load balanced, you need to turn IP affinity on, but then if one of the servers is bombarded you can't simply add another one and expect all these sessions to automagically migrate over to the new server.
2) For each client sitting behind a router/firewall/load balancer, a new endpoint with a specific port needs to be created. Otherwise the router will not be able to route the callback messages to the appropriate client. An alternative is a router that allows custom programming to redirect a specific path to a particular server, which is again a tall order. Another way is for the client with the callback to host its own database and share data via that database. This might work in some situations where licensing fees are not an issue, but it introduces a lot of complexity, is onerous on the client, and mixes the application and services layers together (which might be acceptable in some exceptional situations, but not on top of the huge setup cost).
3) All this basically says that duplex is practically useless. If you need callbacks, you will do well to set up a WCF host on the client end. It will be simpler and much more scalable. Plus there is less coupling between client and server.
The best duplex solution for scalable architecture is in the end not using one.
It will depend on how often you need new clients created and how long they will last. Pooling is not an option if you specifically need a new client each time, but if the clients keep doing the same thing, why not have a pool of them waiting to be used? If one faults, recreate that same client.
In reality, in a callback scenario where the service calls back to the client (really calling a function on the client) to pass information, the service is now the client and vice versa. You can have the service that's making the callback .Close() the connection, but it will stay open until the GC can dispose of it, and from my experience that can take longer than expected. So in short, the client (the one making the call to something) should be responsible for shutting itself down or disconnecting; the service should only give back answers or take data from a client.
In duplex callbacks, the service calling back to the client gets the address of the client abstracted behind the DuplexChannelFactory. If the service can't call back to the client, I don't think there's much that can be done. I would guess you'd have to ensure that the port your clients use to call the service is open to receive callbacks.
I am using WCF and I am putting a chatroom facility in my C# program. So I need to be able to send information from the server to the clients for two events -
When a user connects/disconnects I update the list of connected users and send that back to all clients for display in a TextBlock
When a user posts a message, I need the server to send that message out to all clients
So I am looking for advice on the best way of implementing this. I was going to use netTcpBinding for duplex callbacks to clients, but then I ran into some issues with not being able to call back the client if the connection is closed. I need to use PerCall instances for scalability. I was advised in this thread that I shouldn't leave connections open as it would 'significantly limit scalability' - WCF duplex callbacks, how do I send a message to all clients?
However I had a look through the book Programming WCF Services and the author seems to state that this is not an issue because 'In between calls, the client holds a reference on a proxy that doesn’t have an actual object at the end of the wire. This means that you can dispose of the expensive resources the service instance occupies long before the client closes the proxy'
So which is correct? Is it fine to keep proxies open on clients?
But even if that is fine, it leads to another issue. If the service instances are destroyed between calls, how can they do duplex callbacks to update the clients? Regarding PerCall instances, the author of Programming WCF Services says 'Because the object will be discarded once the method returns, you should not spin off background threads or dispatch asynchronous calls back into the instance'
Would I be better off having clients poll the service for updates? I would have imagined that this is much more inefficient than duplex callbacks, clients could end up polling the service 50+ times as often as using a duplex callback. But maybe there is no other way? Would this be scalable? I envisage several hundred concurrent users.
Since I am guilty of telling you that server callbacks won't scale, I should probably explain a bit more. Let me start by addressing your questions:
Without owning the book in question, I can only assume that the author is either referring to http-based transports or request-response only, with no callbacks. Callbacks require one of two things- either the server needs to maintain an open TCP connection to the client (meaning that there are resources in use on the server for each client), or the server needs to be able to open a connection to a listening port on the client. Since you are using netTcpBinding, your situation would be the former. wsDualHttpBinding is an example of the latter, but that introduces a lot of routing and firewall issues that make it unworkable over the internet (I am assuming that the public internet is your target environment here- if not, let us know).
You have intuitively figured out why server resources are required for callbacks. Again, wsDualHttpBinding is a bit different, because in that case the server is actually calling back to the client over a new connection in order to send the async reply. This basically requires ports to be opened on the client's side and punched through any firewalls, something that you can't expect of the average internet user. Lots more on that here: WSDualHttpBinding for duplex callbacks
You can architect this a few different ways, but it's understandable if you don't want the overhead (and potential for delay) of the clients constantly hammering the server for updates. Again, at several hundred concurrent users, you are likely still within the range that one good server could handle using callbacks, but I assume you'd like to have a system that can scale beyond that if needed (or at peak times). What I'd do is this:
Use callback proxies (I know, I told you not to)... Clients connecting create new proxies, which are stored in a thread-safe collection and occasionally checked for live-ness (and purged if found to be dead).
Instead of having the server post messages directly from one client to another, have the server post the messages to some Message Queue Middleware. There are tons of these out there- MSMQ is popular with Windows, ActiveMQ and RabbitMQ are FOSS (Free Open Source Software), and Tibco EMS is popular in big enterprises (but can be very expensive). What you probably want to use is a topic, not a queue (more on queues vs topics here).
Have a thread (or several threads) on the server dedicated to reading messages off of the topic, and if that message is addressed to a live session on that server, deliver that message to the proxy on the server.
Here's a rough sketch of the architecture:
This architecture should allow you to automatically scale out by simply adding more servers, and load balancing new connections among them. The message queueing infrastructure would be the only limiting factor, and all of the ones I mentioned would scale beyond any likely use case you'd ever see. Because you'd be using topics and not queues, every message would be broadcast to each server- you might need to figure out a better way of distributing the messages, like using hash-based partitioning.
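The first step above (callback proxies kept in a thread-safe collection and purged when dead) could look roughly like the following sketch. CallbackRegistry and its members are hypothetical names. In a WCF service, TCallback would be the callback contract obtained from OperationContext.Current.GetCallbackChannel<T>(), and the session ID would be whatever key the service already uses to identify clients. Here a failed send is treated as the liveness check.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical registry of live client callbacks, safe to use from many threads.
public sealed class CallbackRegistry<TCallback>
{
    private readonly ConcurrentDictionary<string, TCallback> _clients =
        new ConcurrentDictionary<string, TCallback>();

    public int Count { get { return _clients.Count; } }

    // Adds a client, or replaces the callback of a client that reconnected.
    public void Register(string sessionId, TCallback callback)
    {
        _clients[sessionId] = callback;
    }

    public bool Unregister(string sessionId)
    {
        TCallback ignored;
        return _clients.TryRemove(sessionId, out ignored);
    }

    // Invokes `send` for every registered client. A client whose call throws is
    // assumed to be dead and is purged; the IDs of purged clients are returned.
    public IReadOnlyList<string> Broadcast(Action<TCallback> send)
    {
        var purged = new List<string>();
        foreach (KeyValuePair<string, TCallback> entry in _clients)
        {
            try
            {
                send(entry.Value);
            }
            catch (Exception)
            {
                TCallback ignored;
                if (_clients.TryRemove(entry.Key, out ignored)) purged.Add(entry.Key);
            }
        }
        return purged;
    }
}
```

The thread that reads messages off the topic would call Broadcast (or look up a single session ID) for each message. Enumerating a ConcurrentDictionary while other threads add or remove entries is safe, which is why no explicit lock is needed.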
I have a web service slowdown.
My (web) service is in gsoap & managed C++. It's not IIS/apache hosted, but speaks xml.
My client is in .NET
The service computation time is light (<0.1s to prepare reply). I expect the service to be smooth, fast and have good availability.
I have about 100 clients; a 1 s response time is mandatory.
Clients have about 1 request per minute.
Clients are checking web service presence by tcp open port test.
So, to avoid possible congestion, I turned gSoap KeepAlive to false.
Up to this point everything runs fine: I barely see any connections in TCPView (Sysinternals).
A new, special synchronisation program now calls the service in a loop.
The load is higher, but everything is processed in less than 30 seconds.
With Sysinternals TCPView, I see that about a thousand connections are in TIME_WAIT.
They slow down the service, and it now takes seconds for the service to reply.
Could it be that I need to reset the SoapHttpClientProtocol connection?
Has anyone else seen TIME_WAIT ghosts when calling a web service in a loop?
Sounds like you aren't closing the connection after the call and opening new connections on each request. Either close the connection or reuse the open connections.
Be very careful with the implementations mentioned above. There are serious problems with them.
The implementation described in yakkowarner.blogspot.com/2008/11/calling-web-service-in-loop.html (COMMENT ABOVE):
PROBLEM: All your work will be wiped out the next time you regenerate the web service proxy using wsdl.exe, and you are going to forget what you did. On top of that, the fix is rather hacky, since it relies on a message string to take action.
The implementation described in forums.asp.net/t/1003135.aspx (COMMENT ABOVE):
PROBLEM: You are selecting an endpoint between 5000 and 65535 so on the surface this looks like a good idea. If you think about it there is no way (at least none I can think of) that you could reserve ports to be used later. How can you guarantee that the next port on your list is not currently used? You are sequentially picking up ports to use and if some other application picks a port that is next on your list then you are hosed. Or what if some other application running on your client machine starts using random ports for its connections - you would be hosed at UNPREDICTABLE points in time. You would RANDOMLY get an error message like "remote host can't be reached or is unavailable" - even harder to troubleshoot.
Although I can't give you the right solution to this problem, some things you can do are:
Try to minimize the number of web service requests, or spread them out over a longer period of time
For your type of app, maybe web services weren't the correct architecture - for something with a mandatory 1 s response time you should be using a messaging system, not a web service
Raise your OS's limit on the number of connections to 65K (on Windows, via the registry)
Lower the time your OS keeps sockets in TIME_WAIT (this presents its own list of problems)
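A quick back-of-the-envelope calculation helps when judging the last two suggestions. The sketch below assumes every call opens and closes its own connection and that the closing side keeps each socket in TIME_WAIT for a fixed period. TimeWaitEstimate is a hypothetical helper, and the figures used in the note after it (240 seconds, 16,384 ephemeral ports) are assumptions to be replaced with the values configured on the actual machine.

```csharp
using System;

// Hypothetical helper for estimating TIME_WAIT pressure.
public static class TimeWaitEstimate
{
    // Steady-state number of sockets in TIME_WAIT when each call uses a fresh
    // connection: connections opened per second times the TIME_WAIT duration.
    public static double SocketsInTimeWait(double connectionsPerSecond, TimeSpan timeWait)
    {
        return connectionsPerSecond * timeWait.TotalSeconds;
    }

    // Highest sustained connection rate before the closing side runs out of
    // ephemeral ports towards a single remote endpoint.
    public static double MaxConnectionsPerSecond(int ephemeralPorts, TimeSpan timeWait)
    {
        return ephemeralPorts / timeWait.TotalSeconds;
    }
}
```

With an assumed 240-second TIME_WAIT, about a thousand lingering sockets corresponds to roughly 4 new connections per second, and an assumed pool of 16,384 ephemeral ports would be exhausted at a little under 70 connections per second. Reusing connections removes the problem at its source, whereas the registry changes only move these limits.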