Why are the .NET HttpResponseMessage.Content read methods async?

I noticed that HttpResponseMessage.Content has a ReadAsStringAsync() method. What is the point of making it async when the operation is CPU-bound and creating a task only adds more CPU work?

It's not purely a CPU-bound operation. In a request with a large response, you can start reading the response body before the entire response body has been received over the network.
The first part of the response will likely already be in memory, but you could very well hit a point while reading the response where you need to wait for the rest of the data to be received over the network. That allows you to wait asynchronously and not block the thread.
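As a rough illustration of the point above, here is a minimal sketch that starts consuming the body as soon as the headers arrive and awaits each chunk instead of blocking. The class and method names (StreamingReadSketch, CountBytesAsync, DownloadLengthAsync) and the url parameter are hypothetical and not part of the original answer.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class StreamingReadSketch
{
    // Counts the bytes in a stream, awaiting each chunk rather than blocking a thread.
    public static async Task<long> CountBytesAsync(Stream body)
    {
        var buffer = new byte[8192];
        long total = 0;
        int read;
        // ReadAsync returns 0 once the end of the stream is reached.
        while ((read = await body.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }

    // Hypothetical helper: 'url' stands for whatever endpoint you are calling.
    public static async Task<long> DownloadLengthAsync(HttpClient client, string url)
    {
        // ResponseHeadersRead completes once the headers are available,
        // so the rest of the body may still be in flight while we read it.
        using (var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (var body = await response.Content.ReadAsStreamAsync())
            {
                return await CountBytesAsync(body);
            }
        }
    }
}
```

With the default HttpCompletionOption.ResponseContentRead the whole body is buffered before GetAsync completes, so the asynchronous wait described above mostly matters when you opt into ResponseHeadersRead as in this sketch.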

Related

c# long running task in background

We have a function app that builds a large JSON payload (roughly 2,000 lines) every day and posts it to the API to be mapped and saved into a database.
We are using CQRS with MediatR, and the API side seems to take exceptionally long to create and save all the necessary information.
The problem we have is that the function's postasjsonasync waits for the api response and times out after a few minutes.
Any idea how to run this as a background task or just post and forget? Our API is only concerned that it received data.
Function side:
using (var client = new HttpClient())
{
    client.Timeout = new TimeSpan(0, 10, 0); // 10 minutes
    // Times out waiting for the API
    var response = await client.PostAsJsonAsync($"{endpoint}/api/v1.0/BatchImport/Import", json);
    response.EnsureSuccessStatusCode();
}
API mediatr handle side:
public async Task<Unit> Handle(CreateBatchOrderCommand request, CancellationToken cancellationToken)
{
    // Takes a long time to process all the data
    foreach (var importOrder in request.Payload)
    {
        await PopulateImportDataAsync(importOrder, cancellationToken);
        await CreateOrderAsync(importOrder, cancellationToken);
    }
    return Unit.Value;
}
Cheers
"The problem we have is that the function's postasjsonasync waits for the api response and times out after a few minutes."
The easiest solution is going to be just increasing that timeout. If you are talking about Azure Functions, I believe you can increase the timeout to 10 minutes.
"Any idea how to run this as a background task or just post and forget? Our API is only concerned that it received data."
Any fire-and-forget solution is not going to end well; you'll end up with lost data. I recommend that you not use fire-and-forget at all, and this advice goes double as soon as you're in the cloud.
Assuming increasing the timeout isn't sufficient, your solution is to use a basic distributed architecture, as described on my blog:
Have your API place the incoming request into a durable queue.
Have a separate backend (e.g., Azure (Durable) Function) process that request from the queue.
Assuming you’re on .NET Core, you could stick incoming requests into a queued background task:
https://learn.microsoft.com/en-us/aspnet/core/fundamentals/host/hosted-services?view=aspnetcore-6.0&tabs=visual-studio#queued-background-tasks
Keep in mind that this takes resources away from servicing other web requests, so it will not scale well to millions of requests. The same basic principle, a message queue item plus offline processing, can also be distributed across multiple services to take some of the load off the web service.
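The queue-then-process idea can be sketched with System.Threading.Channels. This is only an in-memory illustration under stated assumptions: the BatchImportQueue class and its members are hypothetical names, the payload is treated as a plain string, and nothing here is durable, so queued items are lost if the process restarts. A durable queue, as recommended in the first answer, avoids that.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical in-memory queue; NOT durable, for illustration only.
public sealed class BatchImportQueue
{
    private readonly Channel<string> _channel;

    public BatchImportQueue(int capacity)
    {
        // A bounded channel makes producers wait when the queue is full.
        _channel = Channel.CreateBounded<string>(capacity);
    }

    // Called by the API endpoint: completes as soon as the payload is queued.
    public ValueTask EnqueueAsync(string payload, CancellationToken ct = default)
        => _channel.Writer.WriteAsync(payload, ct);

    // Signals that no more payloads will be written.
    public void Complete() => _channel.Writer.Complete();

    // Called by a background worker: handles payloads one at a time
    // until the queue is completed and drained.
    public async Task ProcessAllAsync(Func<string, Task> handler, CancellationToken ct = default)
    {
        await foreach (var payload in _channel.Reader.ReadAllAsync(ct))
        {
            await handler(payload);
        }
    }
}
```

In an ASP.NET Core app the worker loop would typically live in a hosted service, as the linked documentation describes, while the controller only calls EnqueueAsync and returns.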

Prevent hung workers in ASP.NET when reading the posted data

I have an issue caused by an external factor which is causing my app pool to queue up requests and hang. The issue seems to be caused when the client making the HTTP request somehow loses its TCP layer connection to my server, whilst the server is trying to read all the data POST'd to it.
I am using an Asynchronous HTTP handler, and I am using the following code to read all the posted data:
string post_data = new StreamReader(context.Request.InputStream).ReadToEnd();
I believe what is happening is that "ReadToEnd()" is blocking my worker thread and when the TCP layer is lost the thread is stuck there trying to read indefinitely. How can I prevent this from happening?
I am currently coding in .NET 2.0 but I can use a newer framework if this is required.
HttpRequest.InputStream will synchronously read the entire request, then it returns the Stream as one huge chunk. You'll want something like this instead (requires .NET 4.5):
string body = await new StreamReader(request.GetBufferlessInputStream()).ReadToEndAsync();
GetBufferlessInputStream() won't read the entire request eagerly; it returns a Stream that reads the request on-demand. Then ReadToEndAsync() will asynchronously read from this Stream without tying up a worker thread.
To create a .NET 4.5-style async handler, subclass the HttpTaskAsyncHandler class. See this blog post for more information on async / await in .NET 4.5. You must also target .NET 4.5 in Web.config to get this functionality.

How many simultaneous (concurrent) connections are actually active during many async requests?

My understanding is the point of Task is to abstract out threads, and that a new thread is not guaranteed per Task.
I'm debugging in VS2010, and I have something similar to this:
var request = WebRequest.Create(URL);
Task.Factory.FromAsync<WebResponse>(
    request.BeginGetResponse,
    request.EndGetResponse,
    null /* state */).ContinueWith(
        t => { /* ... Stuff to do with response ... */ });
If I make X calls to this, e.g. start up X async web requests, how am I to calculate how many simultaneous (concurrent) connections are actually being made at any given time during execution? I assume that somehow it is opening only the max it can (in the case X is very high), and the other Tasks are blocked while waiting?
Any insight into this, or into how I can check with the debugger how many active (open) connections exist at a given point in execution, would be great.
Basically, I'm wondering if it's handled for me, or if I have to take special consideration so that I do not appear to be attacking a server?
This won't really be specific to Task. The external connection is created as soon as you make your call to Task.Factory.FromAsync. The "task" that the Task is performing is simply waiting for the response to get back (not for it to be sent in the first place). Thus the call to BeginGetResponse will fail if your machine is unable to send any more requests, and the response will contain an error message if the server is rejecting your requests due to their belief that you are flooding them.
The only real place that Task comes into play here is the amount of time between when the response is actually received by the machine and when your continuation runs. If you are getting lots of responses, or otherwise have lots of work in the thread pool, it could take some time for it to get to your continuation.
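Since the answer says Task itself does not limit how many requests are started, one common way to cap them yourself is a SemaphoreSlim gate. The following is a minimal sketch, not something taken from the answer: ThrottleSketch, RunThrottledAsync, and the operation delegate are hypothetical names, and the delegate stands in for whatever starts a single web request.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs 'operation' for every item, with at most 'maxConcurrency' in flight at once.
    public static async Task<TResult[]> RunThrottledAsync<T, TResult>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task<TResult>> operation)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            // ToList() starts every wrapper task now; each one waits at the gate.
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();
                try
                {
                    return await operation(item);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            // Results come back in the same order as the input items.
            return await Task.WhenAll(tasks);
        }
    }
}
```

Waiting at the gate is asynchronous, so queued items do not hold thread pool threads while they wait their turn.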

Why do we need both BeginGetResponse AND BeginRead?

I'm looking at the following reference for making asynchronous web requests with C#:
http://msdn.microsoft.com/en-us/library/86wf6409%28v=vs.100%29.aspx
When I build the sample code with only BeginGetResponse and EndGetResponse, my "asynchronous call" still takes hundreds of milliseconds to complete.
Can someone explain why the reading requires another asynchronous call, when the BeginGetResponse should already be on a separate thread?
Because BeginGetResponse/EndGetResponse have to do with connecting to the HTTP endpoint (the server may take some time to respond), while BeginRead/EndRead have to do with reading a potentially long response from the response stream.
Imagine that your response takes 10 seconds to produce on the server and the amount of data it spits out is, say, 10MB.
Without the first pair of Begin/EndGetResponse calls, your thread would be blocked for at least 10 seconds waiting for the first byte of the response to come back.
Without the second pair of Begin/EndRead calls, your thread would be blocked while you read 10MB of data one network packet at a time (remember that TCP packets have a limited size, so it takes a while for all of them to arrive back on the client).
I think that is mapped to underlying socket operations. BeginGetResponse establishes connection to server (that's why it takes so long) and sends the request, while BeginRead waits for response data.

Does it make sense to queue send operations when using Socket.SendAsync?

I am using the .NET async send method (SendAsync) from the Socket class. Do I need to queue send operations so that each payload goes over the wire only after the previous transmission finishes?
I've noticed that SendAsync will accept happily any bytes I throw at it without complaining that the previous send has finished or not. The protocol I am using deals with out-of-order messages.
Does the Windows socket stack already do queuing internally?
The Socket class should do this internally. Check the documented return value of SendAsync:
Returns true if the I/O operation is pending.
Returns false if the I/O operation completed synchronously.
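The return value quoted above matters in practice because the Completed event is not raised when SendAsync returns false. A minimal sketch of handling both cases follows. SocketSendSketch, SendOnceAsync, and Finish are hypothetical names, and the sketch shows only how a single send is observed to finish, not a full queuing strategy.

```csharp
using System;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class SocketSendSketch
{
    // Sends one buffer and completes with the number of bytes transferred.
    public static Task<int> SendOnceAsync(Socket socket, byte[] payload)
    {
        var tcs = new TaskCompletionSource<int>();
        var args = new SocketAsyncEventArgs();
        args.SetBuffer(payload, 0, payload.Length);
        args.Completed += (sender, e) => Finish(e, tcs);

        // true  => the operation is pending; Completed will be raised later.
        // false => it completed synchronously; Completed is NOT raised,
        //          so the result has to be handled right here.
        if (!socket.SendAsync(args))
        {
            Finish(args, tcs);
        }
        return tcs.Task;
    }

    private static void Finish(SocketAsyncEventArgs e, TaskCompletionSource<int> tcs)
    {
        if (e.SocketError == SocketError.Success)
        {
            tcs.TrySetResult(e.BytesTransferred);
        }
        else
        {
            tcs.TrySetException(new SocketException((int)e.SocketError));
        }
        e.Dispose();
    }
}
```

Awaiting the returned task before issuing the next send is one simple way to serialize sends if your protocol ever needs ordering.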
