Given a batch of synchronous web requests executed sequentially, completing them takes N seconds while receiving B bytes per second. Doing exactly the same with asynchronous web requests, which makes it possible to execute all of them in parallel, no longer takes N seconds, yet it still receives B bytes per second.
Running a simple test with 12 web requests, using both the synchronous and the parallel approach, confirms that both receive B bytes per second (measured with Resource Monitor).
My question is therefore: shouldn't the approach that executes the web requests in parallel receive more than B bytes per second, to make up for finishing faster than the synchronous approach? Otherwise the synchronous approach both runs longer AND receives more bytes in total than the parallel approach.
These requests are not processed on your machine (unless connecting to localhost). This means that for each request to be fully processed, your machine will have to wait for a response.
Consider sending an invitation for your birthday party to friend #1, and after receiving a response, send one to friend #2, etcetera. It would be faster to send the invitation to all friends, and then wait for all of them to respond. Especially if friend #1 happens to be on holiday.
I don't know why the number of bytes per second is identical; perhaps some node in the network limits the speed. Either way, the parallel approach can at least send out every request up front and "parallel out" the total wait time.
I don't understand how the synchronicity could affect the total number of bytes received. You're talking about bytes/second, but not the number of seconds spent at that transfer speed.
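To make the waiting argument concrete, here is a minimal sketch in Python. It is not the asker's test: `fake_request`, the 50 ms latency and the 1000-byte response size are all made-up stand-ins. It only shows that both approaches receive the same total number of bytes, while the concurrent one overlaps the waits.

```python
import asyncio
import time

LATENCY = 0.05      # hypothetical time spent waiting for the server, in seconds
N_REQUESTS = 12     # same count as the test described in the question


async def fake_request(i: int) -> int:
    """Stand-in for a web request: wait for the 'server', then return a byte count."""
    await asyncio.sleep(LATENCY)
    return 1000  # assumption: every response is 1000 bytes


async def run_sequential():
    """Send one request, wait for its response, then send the next."""
    start = time.perf_counter()
    total = 0
    for i in range(N_REQUESTS):
        total += await fake_request(i)
    return total, time.perf_counter() - start


async def run_concurrent():
    """Send every request up front and wait for all responses together."""
    start = time.perf_counter()
    sizes = await asyncio.gather(*(fake_request(i) for i in range(N_REQUESTS)))
    return sum(sizes), time.perf_counter() - start


seq_bytes, seq_time = asyncio.run(run_sequential())
con_bytes, con_time = asyncio.run(run_concurrent())
print(f"sequential: {seq_bytes} bytes in {seq_time:.2f}s")
print(f"concurrent: {con_bytes} bytes in {con_time:.2f}s")
```

In this toy model the byte totals are identical and only the elapsed time differs, so the average rate over the whole run is higher for the concurrent version even if a monitoring tool shows the same peak rate.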
I have a gRPC service that accepts streaming messages from a client. The client sends a finite sequence of messages to the server at a high rate.
The result is that the server buffers a large number of messages (> 1 GB); its memory usage skyrockets and then slowly drains as it handles them.
I find that even if I await all async calls, the client just keeps pushing messages as fast as it can. I would like the client to slow down.
I have implemented an explicit ack response that the client waits for before sending the next message, but since HTTP/2 already has flow control semantics built in, I feel like I'm reinventing the wheel a bit.
I have two concrete questions.
Does the C# implementation automatically apply backpressure? For example, if the consuming side is slow to call MoveNext on the async stream, will the client side take longer to return from its calls to WriteAsync?
Does the C# implementation of gRPC have any configurable way of limiting the buffering of messages for a streaming RPC call? For example, capping the number of buffered messages or limiting the amount of space in the call's buffer.
As Jan commented, this question was answered on the gRPC GitHub Repo here.
flow control is expected to work in all gRPC languages as we consider it one of the key features for a scalable RPC system.
More specifically:
if one side is slow to request messages (MoveNext()), the sending side will eventually be "blocked" when sending (WriteAsync() will take longer to succeed - and you can only have one WriteAsync() operation in progress per call).
I believe such parameters can be configured via a C-core channel argument (ChannelOption in C#).
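The backpressure behaviour described in the quote can be illustrated with a bounded queue. This is only an analogy sketched in Python's asyncio, not gRPC code: `BUFFER_LIMIT`, `producer` and `consumer` are hypothetical names, `queue.put` plays the role of WriteAsync and `queue.get` the role of MoveNext.

```python
import asyncio

BUFFER_LIMIT = 4     # hypothetical cap on buffered messages
MESSAGE_COUNT = 20   # hypothetical length of the finite stream


async def producer(queue, peak):
    """Sender: each put suspends while the buffer is full (the backpressure)."""
    for i in range(MESSAGE_COUNT):
        await queue.put(i)
        peak[0] = max(peak[0], queue.qsize())  # record the largest backlog seen
    await queue.put(None)  # sentinel: end of stream


async def consumer(queue, received):
    """Slow receiver: it pulls the next message only after handling the last one."""
    while True:
        message = await queue.get()
        if message is None:
            break
        await asyncio.sleep(0.001)  # simulated slow handling
        received.append(message)


async def main():
    queue = asyncio.Queue(maxsize=BUFFER_LIMIT)
    peak = [0]
    received = []
    await asyncio.gather(producer(queue, peak), consumer(queue, received))
    return peak[0], received


peak_buffered, received = asyncio.run(main())
print(f"peak buffered messages: {peak_buffered}, received: {len(received)}")
```

The sender never gets more than `BUFFER_LIMIT` messages ahead of the receiver, which is the effect flow control is meant to have on a streaming call.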
I have a design question on how to best approach a process within an existing DotNet2 web service I have inherited.
At a high level the process at the moment is as follows
Client
User starts new request via web service call from client/server app
Request and tasks created and saved to database.
New thread created that begins processing the request
Request ID returned to client (client polls using ID).
Thread
Loads up request detail and the multiple tasks
For each task it requests XML via another service (approx 2 sec wait)
Passes XML to another service for processing (approx 3 sec wait)
Repeat until all tasks complete
Marks request completed (client will know it's finished)
Overall this takes approximately 30 seconds for a request of 6 tasks. With each task being performed sequentially it is clearly inefficient.
Would it be better to break out each task again on a separate thread and then when they are all complete mark the request as completed?
My reservation is that I am immediately multiplying the number of threads by up to a factor of 6-10 (the number of tasks), and I am concerned about how this would impact IIS. I estimate that I could cut a normal 30 second call down to around 5 seconds if each task were processed concurrently, but would this design suffer under load?
The current design is operating well and users have no problem with the time taken to process but I would prefer it work faster if possible.
Is this just a completely bad design and if so is there a better approach? I am limited by the current DotNet version at present.
Thanks
If you are worried about IIS performance you probably want to keep the jobs outside of IIS, so IMO I would consider queueing the tasks and creating a separate service to do the work. This approach would be more scalable in that you could add or remove front-end IIS servers or task processors to address a varying load. A large-scale system would almost certainly perform the processing off the front-end server.
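A rough sketch of the queue-plus-workers idea, written in Python with an in-process queue purely for illustration. In the design suggested above the queue would live outside IIS (a database table or a message queue) and the workers would be a separate service; `process_task`, `WORKER_COUNT` and the task IDs are hypothetical.

```python
import queue
import threading

WORKER_COUNT = 3  # hypothetical fixed pool size, independent of the task count


def process_task(task_id):
    """Placeholder for 'request the XML, then pass it on for processing'."""
    return f"task-{task_id}-done"


def worker(tasks, results, lock):
    """Pull tasks off the shared queue until a None sentinel arrives."""
    while True:
        task_id = tasks.get()
        if task_id is None:
            tasks.task_done()
            break
        outcome = process_task(task_id)
        with lock:
            results[task_id] = outcome
        tasks.task_done()


def run_request(task_ids):
    """Queue every task of one request and wait until all are processed."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(tasks, results, lock))
        for _ in range(WORKER_COUNT)
    ]
    for w in workers:
        w.start()
    for task_id in task_ids:
        tasks.put(task_id)
    for _ in workers:
        tasks.put(None)  # one sentinel per worker
    tasks.join()         # returns once every queued item has been marked done
    for w in workers:
        w.join()
    return results       # at this point the request could be marked completed


results = run_request(range(6))
print(results)
```

The point of the fixed pool is that the thread count stays bounded no matter how many requests or tasks arrive, which addresses the concern about multiplying threads under load.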
I am using a WebApi service controller, hosted by IIS,
and I'm trying to understand how this architecture really works:
When a web page client sends several async requests simultaneously, are all of these requests executed in parallel at the WebApi controller?
At the IIS app pool, I've noticed the queue size is set to a default value of 1,000. Does that mean that at most 1,000 threads can work in parallel at the WebApi server?
Or is this value only related to the IIS queue?
I've read that IIS maintains some kind of thread queue. Does this queue dispatch its work asynchronously, or are all the client requests that IIS passes to the WebApi service sent synchronously?
The queue size you're looking at specifies the maximum number of requests that will be queued for each application pool (which typically maps to one w3wp worker process). Once the queue length is exceeded, 503 "Server Too Busy" errors will be returned.
Within each worker process, a number of threads can/will run. Each request runs on a thread within the worker process (defaulting to a maximum of 250 threads per process, I believe).
So, essentially, each request is processed on its own thread (concurrently - at least, as concurrently as threads get) but all threads for a particular app pool are (typically) managed by a single process. This means that requests are, indeed, executed asynchronously as far as the requests themselves are concerned.
In response to your comment: if you have sessions enabled (which you probably do), then ASP.NET will queue the requests in order to maintain a lock on the session for each request. Try hitting your sleeping action in Chrome and then your quick-responding action in Firefox and see what happens. You should see that the two different sessions allow your requests to be executed concurrently.
Yes, all the requests will be executed in parallel using threads from the CLR thread pool, subject to limits. The queue size set against the app pool is the limit at which IIS starts rejecting requests with a 503 - Service Unavailable status code. Even before this happens, your requests will be queued by IIS/ASP.NET, because threads cannot be created at will. There is a limit to the number of concurrent requests that can run, which is set by MaxConcurrentRequestsPerCPU and a few other parameters. For 1000 threads to execute in parallel in a true sense, you would need 1000 CPU cores. Otherwise, threads need to be time-sliced, and that adds overhead to the system. Hence, there are limits to the number of threads. I believe it is very difficult to answer your questions comprehensively in a single answer here. You will probably need to read up a little, and a good place to start is http://blogs.msdn.com/b/tmarq/archive/2007/07/21/asp-net-thread-usage-on-iis-7-0-and-6-0.aspx.
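The idea of a cap on concurrently running requests, with the rest waiting their turn, can be sketched with a semaphore. This is a generic Python illustration, not how IIS or ASP.NET is implemented; `MAX_CONCURRENT` is a made-up stand-in for limits such as MaxConcurrentRequestsPerCPU.

```python
import asyncio

MAX_CONCURRENT = 3   # hypothetical cap on requests running at once
REQUESTS = 10        # hypothetical number of simultaneous incoming requests


async def handle(i, gate, state):
    """Run one request, but only once a slot under the cap is free."""
    async with gate:  # requests beyond the cap wait here, i.e. they are queued
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        await asyncio.sleep(0.01)  # simulated request work
        state["running"] -= 1
    return i


async def main():
    gate = asyncio.Semaphore(MAX_CONCURRENT)
    state = {"running": 0, "peak": 0}
    done = await asyncio.gather(*(handle(i, gate, state) for i in range(REQUESTS)))
    return state["peak"], done


peak_running, done = asyncio.run(main())
print(f"peak concurrent requests: {peak_running} of {len(done)} handled")
```

All ten requests complete, but never more than `MAX_CONCURRENT` are in flight at the same moment.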
For example, I can load the website 10 times consecutively with different pages (stackoverflow.com/questions/a , stackoverflow.com/questions/b , ...). The question is, will it be faster if I load the pages in 10 threads?
Most of the time spent loading a webpage goes to waiting for the HTTP response to come back from the server, and a large part of that time is taken up by setting up the TCP connection.
HTTP has supported the concept of pipelining since version 1.1. This allows multiple requests to be sent along the same TCP connection, and also allows them to be sent before the replies have come back from the previous requests.
So yes, using ten threads could speed up loading ten different pages, but equally one thread could do the same by using asynchronous calls and firing off ten requests before the replies come back.
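As a rough illustration of the "one thread, ten requests in flight" point, the Python sketch below fires ten simulated fetches from a single thread. `fetch` and the page paths are hypothetical, and the sleep merely stands in for network wait; no real HTTP is performed.

```python
import asyncio
import threading

# Hypothetical page paths, mirroring the /questions/a, /questions/b pattern.
PAGES = [f"/questions/{letter}" for letter in "abcdefghij"]


async def fetch(path, thread_ids):
    """Stand-in for an HTTP GET; records which thread ran it."""
    thread_ids.add(threading.get_ident())
    await asyncio.sleep(0.01)  # simulated wait for the server's reply
    return f"body of {path}"


async def main():
    thread_ids = set()
    # All ten requests are started before any reply has come back.
    bodies = await asyncio.gather(*(fetch(p, thread_ids) for p in PAGES))
    return bodies, thread_ids


bodies, thread_ids = asyncio.run(main())
print(f"{len(bodies)} pages fetched on {len(thread_ids)} thread(s)")
```

Because the time is dominated by waiting rather than computing, a single thread is enough to keep all ten requests outstanding at once.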
I am trying to simulate X number of concurrent requests for a WCF Service and measure the response time for each request. I want to have all the requests hit the Service at more or less the same time.
As the first step I spawned X number of Threads, using the Thread class, and have invoked the Start method. To synchronize all the requests, on the Thread callback I open the connection and have a Monitor.Wait to hold the request from being fired, till all the Threads are created and started. Once all the Threads are started, I call Monitor.PulseAll to trigger the method invocation on the WCF Client Proxy.
When I execute the requests this way, I see a huge delay in the response. A request that should take just a few milliseconds is taking about a second.
I also noticed a huge lag between the time the request is dispatched and the time it is received at the service method. I measured this by sending the client timestamp as a parameter value to the service method for each request.
I have the following settings. Assume "X" to be the number of concurrent requests I want to fire. Also note that with the following settings I don't get any Denial of Service issues.
The call chain is as follows: Client->Service1->Service2->Service3
All Services are PerCall with Concurrency set to Multiple.
Throttling set to X Concurrent calls, X Concurrent Instances.
MaxConnections, ListenBacklog on the Service to X.
Min/Max Threads of ThreadPool set to X on both Client and Server (I have applied the patch provided by Microsoft).
I am not sure if the response time I'm measuring is accurate. Am I missing something very trivial?
Any inputs on this would be of great help.
Thanks.
-Krishnan
I did find the answer by myself, the hard way. All this while, the way I was measuring the response time was wrong. One should spawn X threads, where X is the number of concurrent users one wants to simulate. In each thread, open the connection only once and have a while loop that only executes the WCF method you want to test, for a given duration. Measure the response time of each call, accumulate it, and average it over the number of calls that were executed within the given duration.
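The measurement approach described above could look roughly like this. It is a Python sketch rather than WCF code: `call_service` is a hypothetical stand-in for the proxy method under test, `USERS` plays the role of X, and the barrier is just one way of releasing all simulated users together.

```python
import threading
import time

USERS = 4        # X: hypothetical number of simulated concurrent users
DURATION = 0.2   # hypothetical length of the measurement window, in seconds


def call_service():
    """Stand-in for the service method being measured."""
    time.sleep(0.005)


def simulated_user(start_gate, timings, lock):
    # A real test would open the client connection once here, outside the loop.
    start_gate.wait()  # every user starts its loop at about the same moment
    deadline = time.perf_counter() + DURATION
    local = []
    while time.perf_counter() < deadline:
        began = time.perf_counter()
        call_service()
        local.append(time.perf_counter() - began)  # response time of this call
    with lock:
        timings.extend(local)


def run_load_test():
    start_gate = threading.Barrier(USERS)
    timings = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=simulated_user, args=(start_gate, timings, lock))
        for _ in range(USERS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Average over every call completed inside the window.
    return sum(timings) / len(timings), len(timings)


average, calls = run_load_test()
print(f"{calls} calls, average response time {average * 1000:.1f} ms")
```

Keeping connection setup outside the timed loop means the average reflects steady-state call latency rather than the one-off cost of opening the channel.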
If all your outgoing calls are coming from a single process, it is likely that the runtime is either consolidating multiple requests onto a single open channel or capping the number of concurrent requests to a single target service. You may have better results if you move each simulated client into its own process, use a named EventWaitHandle to synchronize them, and call #Set() to unleash them all at once.
There is also a limit to the number of simultaneous pending (TCP SYN or SYN/ACK state) outgoing TCP connections allowed by desktop builds of Windows; if you encounter this limit you will get event 4226 logged and the additional simultaneous connections will be deferred.