I've had a fairly good search on google and nothing has popped up to answer my question. As I know very little about web services (only started using them, not building them in the last couple of months) I was wondering whether I should be ok to call a particular web service as frequently as I wish (within reason), or should I build up requests to do in one go.
To give you an example, my app is designed to make job updates, which for certain types of updates will call the web service. It seems like my options are that I could create a datatable in my app of updates that require the web service and pass the whole datatable to the web service and then write a method in the web service to process the datatable's updates. Alternatively I could iterate through my entire table of updates (which includes other updates than those requiring the web service) and call the web service as when an update requires it.
At the moment it seems like it would be simpler for me to pass each update rather than a datatable to the web service.
In terms of data being passed to the web service each update would contain a small amount of data (3 strings, max 120 characters in length). In terms of numbers of updates there would probably be no more than 200.
I was wondering whether I should be ok to call a particular web service as frequently as I wish (within reason), or should I build up requests to do in one go.
Web services or not, any calls routed over the network would benefit from building up multiple requests, so that they could be processed in a single round-trip. In your case, building an object representing all the updates is going to be a clear winner, especially in setups with slower connections.
When you make a call over the network, these things need to happen when a client communicates to a server (again, web services or not):
The data associated with your call gets serialized on the client
Serialized data is sent to the server
Server deserializes the data
Server processes the data, producing a response
Server serializes the response
Server sends serialized response back to the client
The response is deserialized on the client
Steps 2 and 6 usually cause a delay due to network latency. For simple operations, latency often dominates the timing of the call.
The latency on fastest networks used for high-frequency trading is in microseconds; on regular ones it is in milliseconds. If you are sending 100 packages one by one on a network with 1ms lag (2ms per roundtrip), you are wasting 200ms just on the network latency! This one fifth of a second, a lot of time by the standards of today's CPUs. If you can eliminate it simply by restructuring your requests, it's a great reason to do it.
You should usually favor coarse-grained remote interfaces over a fine-grained ones.
Consider adding a 10ms network latency to each call - what would be the delay for 100 updates?
Related
If this has been asked before my apologies, and this is .NET 2.0 ASMX Web services, again my apologies =D
A .NET Application that only exposes web services. Roughly 10 million messages per day load balanced between multiple IIS Servers. Each incoming messages is XML, and an outgoing message is XML. (XMLElement) (we have beefy servers that run on steroids).
I have a SLA that all messages are processed in under X Seconds.
One function, Linking Methods, in the process is now taking 10-20 seconds, it is required for every transaction, however is not critical that it happens before the web service returns the results. Because of this I made a suggestion to throw it on another thread, but now realize that my words and the eager developers behind them might have not fully thought this through.
The below example shows on the left the current flow. On the right what is being attempted
Effectively what I'm looking for is to have a web service spawn a long running (10-20 second) thread that will execute even after the web service is completed.
This is what, effectively, is going on:
Thread linkThread= new Thread(delegate()
{
Linkmembers(GetContext(), ID1, ID2, SomeOtherThing, XMLOrSomething);
});
linkThread.Start();
Using this we've reduced the time from 19 seconds to 2.1 seconds on our dev boxes, which is quite substantial.
I am worried that with the amount of traffic we get, and if a vendor/outside party decides to throttle us, IIS might decide to recycle/kill those threads before they're done processing. I agree our solution might not be the "best" however we don't have the time to build in a Queue system or another Windows Service to handle this.
Is there a better way to do this? Any caveats that should be considered?
Thanks.
Apart from the issues you've described, I cannot think of any. That being said, there are ways to fix the problem that do not involve building your own solution from scratch.
Use MSMQ with WCF: Create a WCF service with an MSMQ endpoint that is IIS hosted (no need to use a windows service as long as WAS is enabled) and make calls to the service from within your ASMX service. You reap all the benefits of reliable queueing without having to build your own.
Plus, if your MSMQ service fails or throws an exception, it will reprocess automatically. If you use DTC and are hitting a database, you can even have the MSMQ transaction flow to the DB.
I'll have about 10 computers communicating with the server on a regular basis.
The packets won't be too data intensive. (updates on game data)
The server will be writing to a postgres database
My main concern is concurrent access by the clients. How can I handle the queing in of client requests?
Is WCF a good route? So for instance, rather then opening tcp/ip streams is it feasible to use GET/POST requests & such by WCF over a LAN?
Can anyone suggest any other areas of focus for possible problems?
With WCF you have so much flexibility choosing what to use depends on what to you want to do. Knowing what to choose comes from experimentation and experience.
You can use tools like Fiddler, Wireshark and ServiceTraceViewer to analyze the conversations between clients and the server - they should be used periodically to detect inefficient conversations.
You're producing a game that needs to communicate data via a central server...thus presumably you want that to be as efficient as possible and with the lowest latency.
Thus ideally any requests that come into the server you want to be processed as fast as possible...you want to minimize any overhead.
Do you need to guarantee that the message was received by the server and consequently you are guaranteed to receive a reply (i.e. reliable delivery)?
Do the messages need to be processed in the same order they were sent ?
Then consider does your server need to scale? will you only ever have small number of clients at a time ? You mention 10?.....will this ever increase to a million?
Then consider your environment....you mention a LAN...will clients always be on the same LAN...i.e. Intranet? If so then they can communicate directly with the server using TCP sockets....without going through an IIS stack.
With certain assumptions above with WCF you could choose Endpoints that uses a netTCPBinding with Binary encoding, and with the Service using:
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerSession)]
[ServiceBehavior(ConcurrencyMode = ConcurrencyMode.Single)].
This would establish sessions (as long as you use a Binding that supports sessions) between the clients and server (doesn't scale because each client gets their own server instance to manage their session), it would allow multiple clients to be handled in parallel, and the Binary encoding "might" reduce data size (you could choose to avoid that as you said your data isn't intensive, and Binary might actually add more overhead to small messages sizes).
If on the other-hand your service needs the ability to infinitely scale and keep response times low, doesn't mind if messages are lost, or delivered in a different order, or have to be deliverable over HTTP, etc, etc, then...then there are other Bindings, InstanceContextModes, message encodings, etc, that you can use. And if you really want to get sophisticated, you can get your Service to expose multiple Endpoints that are configured in different ways, and your client might prefer a particular recipe.
NOTE: (if you host the WCF Service using IIS and you are using netTCPBinding then you have to enable the net.tcp protocol....not needed if you self-host the WCF Service).
http://blogs.msdn.com/b/santhoshonline/archive/2010/07/01/howto-nettcpbinding-on-iis-and-things-to-remember.aspx
So that is "1" way to do it with WCF for a particular scenario.
Experiment with one way...look at its memory use, latency, resilience, etc. and if your needs change, then change the Binding/Encoding that you use...that's the beauty and nightmare of WCF.
If your clients must use HTTP and POST/GET requests then use a Http flavoured binding. If you need guaranteed message delivery, then you need a reliable session.
If your Service needs to infinitely scale then you might start looking at hosting your Service in the Cloud e.g. with Windows Azure.
http://blog.shutupandcode.net/?p=1085
http://msdn.microsoft.com/en-us/library/ms731092.aspx
http://kennyw.com/work/indigo/178
We have multiple services that do some heavy data processing that we'd like to put multiple copies of them across multiple servers. Basically the idea is this:
Create multiple copies of identical servers with the collection of services running on them
a separate server will have an executable stub that will be run to contact one of these servers (determined arbitrarily from a list) to begin the data processing
The first server to be contacted will become the "master" server and delegate the various data processing tasks to the other "slave" servers.
We've spent quite a bit of time figuring out how to architect this and I think the design should work quite well but I thought I'd see if anyone had any suggestions on how to improve this approach.
The solution is to use a load balancer..
I am bit biased here - since I am from WSO2 - the open source WSO2 ESB can be used as a load balancer - and it has the flexibility of load balancing and routing based on different criteria. Also it supports FO load balancing as well...
Here are few samples related to load balancing with WSO2 ESB...
You can download the product from here...
eBay is using WSO2 ESB to process more than 1 Billion transactions per day in their main stream API traffic...
The first server to be contacted will become the "master" server and
delegate the various data processing tasks to the other "slave"
servers.
That is definitely not how I would build this.
I build this with the intent to use cloud computing (regardless of whether it uses true cloud computing or not). I would have a service that would receive requests and save those requests to a queue. I would then have multiple worker applications that will take an item from the queue, mark it in process and do whatever needs done. Upon completion the queue item is updated as done.
At this point I would either notify the client that the work is done, or you could have the client poll the server for reading the status of the queue.
We have a large process in our application that runs once a month. This process typically runs in about 30 minutes and generates 342000 or so log events. Recently we updated our logging to a centralized model using WCF and are now having difficulty with performance. Whereas the previous solution would complete in about 30 minutes, with the new logging, it now takes 3 or 4 hours. The problem it seems is because the application is actually waiting for the WCF request to complete before execution continues. The WCF method is already configured as IsOneWay and I wrapped the call on the client side to that WCF method in a different thread to try to prevent this type of problem but it doesn't seem to have worked. I have thought about using the async WCF calls but thought before I tried something else I would ask here to see if there is a better way to handle this.
342000 log events in 30 minutes, if I did my math correctly, comes out to 190 log events per second. I think your problem may have to do with the default throttling settings in WCF. Even if your method is set to one-way, depending on if you're creating a new proxy for each logged event, calling the method will still block while the proxy is created, the channel is opened, and if you're using an HTTP-based binding, it will block until the message has been received by the service (an HTTP-based binding sends back a null response for a 1-way method call when the message is received). The default WCF throttling limits concurrent instances to 10 on the service side, which means only 10 requests will be handled at a time, and any further requests will get queued, so pair that with an HTTP binding, and anything after the first 10 requests are going to block at the client until it's one of the 10 requests getting handled. Without knowing how your services are configured (instance mode, etc.) it's hard to say more than that, but if you're using per-call instancing, I'd recommend setting MaxConcurrentCalls and MaxConcurrentInstances on your ServiceBehavior to something much higher (the defaults are 16 and 10, respectively).
Also, to build on what others have mentioned about aggregating multiple events and submitting them all at once, I've found it helpful to setup a static Logger.LogEvent(eventData) method. That way it's simple to use throughout your code, and you can control in your LogEvent method how you want logging to behave throughout your application, such as configuring how many events should get submitted at a time.
Making a call to another process or remote service (i.e. calling a WCF service) is about the most expensive thing you can do in an application. Doing it 342,000 times is just sheer insanity!
If you must log to a centralized service, you need to accumulate batches of log entries and then, only when you have say 1000 or so in memory, send them all to the service in one hit. This will give you a reasonable performance improvement.
log4net has a buffering system that exists outside the context of the calling thread, so it won't hold up your call while it logs. Its usage should be clear from the many appender config examples - search for the term bufferSize. It's used on many of the slower appenders (eg. remoting, email) to keep the source thread moving without waiting on the slower logging medium, and there is also a generic buffering meta-appender that may be used "in front of" any other appender.
We use it with an AdoNetAppender in a system of similar volume and it works wonderfully.
There's always the traditional syslog there are plenty of syslog daemons that run on Windows. Its designed to be a more efficient way of centralised logging than WCF, which is designed for less intensive opertions, especially if you're not using the tcpip WCF configuration.
In other words, have a go with this - the correct tool for the job.
When I first posted this question I had strong coupling between my web service and application controller where the controller needed to open multiple threads to the service and as it received back data it had to do a lot of processing on the returned data and merge it into one dataset. I did not like the fact that the client had to so much processing and merge the returned data before it was ready to be used and wanted to move that layer to the service and let the service open the asynchronous threads to the suppliers and merge the results before returning them to the client.
One challenge I had was that I could not wait till all threads were complete and results were merged, I had to start receiving data as it was available. That called me to implement an observer pattern on the service so that it would notify my application when new set of results are merged and ready to be used and send them to the application.
I was looking for how to do this using either on ASMX webservices or WCF and so far I have found implementing it using WCF but this thread is always open for suggestions and improvements.
OK the solution to my problem came from WCF
In addition to classic request-reply operation of ASMX web services, WCF supports additional operation types like; one-way calls, duplex callbacks and streaming.
Not too hard to guess, duplex callback was what I was looking for.
Duplex callbacks simply allow the service to do call backs to the client. A callback contract is defined on the server and client is required to provide the callback endpoint on every call. Then it is up to the service to decide when and how many times to use the callback reference.
Only bidirectiona-capable bindings support callback operations. WCF offers the WSDualHttpBinding to support callbacks over HTTP (Callback support also exists by NetNamedPipeBinding and NetTcpBinding as TCP and IPC protocols support duplex communication)
One very important thing to note here is that duplex callbacks are nonstandard and pure Microsoft feature. This is not creating a problem on my current task at hand as both my web service and application are running on Microsoft ASP.NET
Programming WCF Services gave me a good jump start on WCF. Being over 700 pages it delves deep into all WCF consepts and has a dedicated chapter on the Callback and other type of operations.
Some other good resources I found on the net are;
Windows Communication Foundation (WCF) Screencasts
MSDN Webcast: Windows Communication Foundation Top to Bottom
Web Service Software Factory
The Service Factory for WCF
This sounds like a perfect use case for Windows Workflow Foundation. You can easily create a workflow to get information from each supplier, then merge the results when ready. It's much cleaner, and WF will do all the async stuff for you.
I'm not so sure that duplex is needed here... IMO, a standard async call with a callback should be more than sufficient to get notification of data delivery.
What is the biggest problem? If you are talking about async etc, then usually we are talking about the time taken to get the data to the client. Is this due to sheer data volume? or complexity generating the data at the server?
If it is the data volume, then I can think of a number of ways of significantly improving performance - although most of them involve using DTO objects (not DataSet/DataTable, which seemed to be implied in the question). For example, protobuf-net significantly reduces the data volume and processing required to transfer data.
One of the ways to achieve this is by invoking your WS asynchronously (http://www.stardeveloper.com/articles/display.html?article=2001121901&page=1, http://www.ondotnet.com/pub/a/dotnet/2005/08/01/async_webservices.html), and then updating the GUI in the callback.
However, you could have timeout problems if the querying of data takes too long. For example, if one of the supplier's web site is down or very slow, this could mean that the whole query could fail. Maybe it would be better if your business logic on the client side does the merging instead of WS doing it.
Not sure if this solution fits your particular task, but anyway:
Add paging parameters to your WS API (int pageNumber, int pageSize, out int totalPages)
Add a short-living TTL cache that associates request details (maybe a hash value) with output data
When your application asks for the first page, return it as soon as it's ready and put the whole bunch of collected/merged data to cache so when the next page is required you may use what is already prepared.
But note that you won't get the most up-to-date data, configure cache reloading interval cautiously.
The absolute best way to archive in your scenario and technology would be having some kind of token between your web app / library against your web service and your controller needs to have a thread to check if there are new results etc. However please note that you will require to get the complete data back from your WS as it's merge can result in removed items from the initial response.
Or I still think that handling threads would be better from controller with the use of WCF Webservices