SendAsync and CopyToAsync not working when downloading a large file

I have a small app that receives a request from a browser, copies the received headers and the POST data (or GET path), and sends them to another endpoint.
It then waits for the result and sends it back to the browser. It works like a reverse proxy.
Everything works fine until it receives a request to download a large file. Something like a 30 MB file causes strange behaviour in the browser: when the download reaches around 8 MB, the browser stops receiving data from my app and, after some time, aborts the download. Everything else works just fine.
If I change the SendAsync line to use HttpCompletionOption.ResponseContentRead, it works just fine. I assume something is wrong with how I wait for the stream and/or task, but I can't figure out what is going on.
The application is written in C# on .NET Core (latest version available).
Here is the code (partial):
private async Task SendHTTPResponse(HttpContext context, HttpResponseMessage responseMessage)
{
context.Response.StatusCode = (int)responseMessage.StatusCode;
foreach (var header in responseMessage.Headers)
{
context.Response.Headers[header.Key] = header.Value.ToArray();
}
foreach (var header in responseMessage.Content.Headers)
{
context.Response.Headers[header.Key] = header.Value.ToArray();
}
context.Response.Headers.Remove("transfer-encoding");
using (var responseStream = await responseMessage.Content.ReadAsStreamAsync())
{
await responseStream.CopyToAsync(context.Response.Body);
}
}
public async Task ForwardRequestAsync(string toHost, HttpContext context)
{
var requestMessage = this.BuildHTTPRequestMessage(context);
var responseMessage = await _httpClient.SendAsync(requestMessage, HttpCompletionOption.ResponseHeadersRead, context.RequestAborted);
await this.SendHTTPResponse(context, responseMessage);
}
EDIT
Changed the SendHTTPResponse to wait for responseMessage.Content.ReadAsStreamAsync using await operator.

Just a guess but I believe the issue lies with the removal of the transfer encoding:
context.Response.Headers.Remove("transfer-encoding");
If the http request you are making with _httpClient returns the 30MB file using Chunked encoding (target server doesn't know the file size) then you would need to return the file to the browser with Chunked encoding as well.
When you buffer the response on your webservice (by passing HttpCompletionOption.ResponseContentRead) you know the exact message size you are sending back to the browser so the response works successfully.
I would check the response headers you get from responseMessage to see if the transfer encoding is chunked.
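A minimal sketch of that check, assuming you only want to know how the upstream response body is delimited before deciding which headers to forward. The class and method names (ResponseFraming, Describe) are hypothetical; TransferEncodingChunked and ContentLength are the standard HttpClient header properties.

```csharp
using System.Net.Http;

// Hypothetical helper, not part of the original proxy code.
public static class ResponseFraming
{
    // Describes how the upstream response body is delimited.
    public static string Describe(HttpResponseMessage response)
    {
        // TransferEncodingChunked is a bool? that is true when the
        // Transfer-Encoding header contains "chunked".
        if (response.Headers.TransferEncodingChunked == true)
            return "chunked";

        // ContentLength is a long? that is null when the length is not known.
        long? length = response.Content?.Headers.ContentLength;
        return length.HasValue ? $"content-length: {length.Value}" : "unknown";
    }
}
```

Logging ResponseFraming.Describe(responseMessage) just before copying the body would show whether the upstream server answered with chunked encoding or with a fixed Content-Length.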

You are trying to stream a file, but you are not doing it quite right. If you do not specify ResponseHeadersRead, SendAsync will not return until the server has finished sending the response, because it tries to read (and buffer) the response to the end.
HttpCompletionOption enumeration type has two members and one of them is ResponseHeadersRead which tells the HttpClient to only read the headers and then return back the result immediately.
var response = await httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
var stream = await response.Content.ReadAsStreamAsync();
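using (var reader = new StreamReader(stream)) {
    while (!reader.EndOfStream) {
        //Oh baby we are streaming
        // Read the next line; without a read the loop would never reach the end of the stream.
        var line = await reader.ReadLineAsync();
        //Do stuff, copy to response stream etc..
    }
}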
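For binary content such as a file download, a StreamReader is not a good fit, since it decodes text. The "copy to response stream" step can be sketched as a plain buffered byte copy. This is only an illustration under assumptions: StreamRelay and RelayAsync are hypothetical names, and the loop does roughly what Stream.CopyToAsync already does, written out so the individual reads and writes are visible.

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper illustrating a chunk-by-chunk copy.
public static class StreamRelay
{
    // Copies source to destination one buffer at a time and returns the byte count.
    public static async Task<long> RelayAsync(Stream source, Stream destination,
        int bufferSize = 81920, CancellationToken cancellationToken = default)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // ReadAsync returns 0 only when the end of the stream has been reached.
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read, cancellationToken);
            total += read;
        }
        await destination.FlushAsync(cancellationToken);
        return total;
    }
}
```

In the proxy from the question, the source would be the stream returned by ReadAsStreamAsync and the destination would be context.Response.Body.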

Figure 3 shows a simple example where one method blocks on the result of an async method. This code will work just fine in a console application but will deadlock when called from a GUI or ASP.NET context. This behavior can be confusing, especially considering that stepping through the debugger implies that it’s the await that never completes. The actual cause of the deadlock is further up the call stack when Task.Wait is called.
Figure 3 A Common Deadlock Problem When Blocking on Async Code
public static class DeadlockDemo
{
private static async Task DelayAsync()
{
await Task.Delay(1000);
}
// This method causes a deadlock when called in a GUI or ASP.NET context.
public static void Test()
{
// Start the delay.
var delayTask = DelayAsync();
// Wait for the delay to complete.
delayTask.Wait();
}
}
The root cause of this deadlock is due to the way await handles contexts. By default, when an incomplete Task is awaited, the current “context” is captured and used to resume the method when the Task completes. This “context” is the current SynchronizationContext unless it’s null, in which case it’s the current TaskScheduler. GUI and ASP.NET applications have a SynchronizationContext that permits only one chunk of code to run at a time. When the await completes, it attempts to execute the remainder of the async method within the captured context. But that context already has a thread in it, which is (synchronously) waiting for the async method to complete. They’re each waiting for the other, causing a deadlock.
Note that console applications don’t cause this deadlock. They have a thread pool SynchronizationContext instead of a one-chunk-at-a-time SynchronizationContext, so when the await completes, it schedules the remainder of the async method on a thread pool thread. The method is able to complete, which completes its returned task, and there’s no deadlock. This difference in behavior can be confusing when programmers write a test console program, observe the partially async code work as expected, and then move the same code into a GUI or ASP.NET application, where it deadlocks.
The best solution to this problem is to allow async code to grow naturally through the codebase. If you follow this solution, you’ll see async code expand to its entry point, usually an event handler or controller action. Console applications can’t follow this solution fully because the Main method can’t be async. If the Main method were async, it could return before it completed, causing the program to end. Figure 4 demonstrates this exception to the guideline: The Main method for a console application is one of the few situations where code may block on an asynchronous method.
Figure 4 The Main Method May Call Task.Wait or Task.Result
class Program
{
static void Main()
{
MainAsync().Wait();
}
static async Task MainAsync()
{
try
{
// Asynchronous implementation.
await Task.Delay(1000);
}
catch (Exception ex)
{
// Handle exceptions.
}
}
}

Try these:
using (HttpResponseMessage responseMessage = await client.SendAsync(request))
{
    await this.SendHTTPResponse(context, responseMessage);
}
or
using (HttpResponseMessage responseMessage = await _httpClient.SendAsync(requestMessage,
    HttpCompletionOption.ResponseHeadersRead, context.RequestAborted))
{
    await this.SendHTTPResponse(context, responseMessage);
}

Related

Any issue if no await on HttpClient().GetAsync().Result.Content.ReadAsStringAsync().Result?

Is there any issue with this C# method (.NET Framework 4.8) not using await? I am using ReadAsStringAsync() with no await. The caller of the RetrieveContent() method can't call it asynchronously, so I need to avoid making it an async method. This is a Windows console app that can also be installed as a Windows Service, with an endpoint listening for requests. It is supposed to do synchronous processing, one request at a time.
public string RetrieveContent()
{
return new HttpClient().GetAsync("https://www.google.com").Result.Content.ReadAsStringAsync().Result;
}
OR
public string RetrieveContent()
{
var response = new HttpClient().GetAsync("https://www.google.com").Result;
return response.Content.ReadAsStringAsync().Result;
}
Updates:
I can change it like this, so that the caller of this method doesn't need any information from the async method. Will this cause a deadlock too?
public void RetrieveContent() //this is just logging content
{
var response = new HttpClient().GetAsync("https://www.google.com").Result;
if (response.StatusCode == HttpStatusCode.OK)
_logger.LogInformation($"content: {response.Content.ReadAsStringAsync().Result} "); //_logger is logging to local disk I/O
else
_logger.LogError($"HttpStatusCode: {response.StatusCode} ");
}

Calling .Result from synchronous code blocks the calling thread and is considered unsafe: it may cause deadlocks, because the task might depend on other incomplete work that needs the blocked thread or its context in order to finish. Instead, you should usually strive to make the calling methods async as far as possible. In this specific case, however, it may be okay with a Task.Run wrapper.
See this question for more details: What's the "right way" to use HttpClient synchronously?
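The Task.Run wrapper mentioned above could look roughly like the following. This is a sketch under assumptions, not a recommendation: SyncBridge and RunSync are hypothetical names, and the calling thread is still blocked for the whole duration of the operation.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for calling async code from a method that must stay synchronous.
public static class SyncBridge
{
    // Runs an async operation on the thread pool and blocks until it finishes.
    public static T RunSync<T>(Func<Task<T>> operation)
    {
        // Task.Run starts the operation on a thread pool thread, which has no
        // captured SynchronizationContext, so its continuations do not need
        // the blocked calling thread in order to run.
        // GetAwaiter().GetResult() rethrows the original exception instead of
        // wrapping it in an AggregateException.
        return Task.Run(() => operation()).GetAwaiter().GetResult();
    }
}
```

With this helper, RetrieveContent could be written as SyncBridge.RunSync(() => client.GetStringAsync(url)), where client is a shared HttpClient instance and url is the address to fetch (both assumed here).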

Now that we have an "await" keyword, is there any benefit to using ContinueWith method?

Picture the following code:
var client = new HttpClient();
var response = await client.GetAsync("www.someaddress.yo");
string content = await response.Content.ReadAsStringAsync();
Is there any added benefit, other than possibly saving a single thread, by writing the above code the following way:
var client = new HttpClient();
string content = await client.GetAsync("www.someaddress.yo")
.ContinueWith(r => r.Result.Content.ReadAsStringAsync()).Result;
Correct me if I'm wrong, but I believe performance-wise both codes end up doing the same amount of work.

The second snippet has no benefits: it doesn't "save" any threads, it allocates another task object, and it makes debugging and exception handling harder by wrapping any exceptions in an AggregateException.
A task is a promise that something will produce some output in the future. That something may be :
A background operation running on a threadpool thread
A network/IO operation that doesn't run on any thread, waiting instead for the Network/IO driver to signal that the IO has finished.
A timer that signals a task after an interval. No kind of execution here.
A TaskCompletionSource that gets signalled after some time. No execution here either
HttpClient.GetAsync, HttpClient.GetStringAsync or Content.ReadAsStringAsync are such IO operations.
await doesn't make anything run asynchronously. It only awaits already executing tasks to complete without blocking.
Nothing is gained by using ContinueWith the way the second snippet does. This code simply allocates another task to wrap the task returned by ReadAsStringAsync. .Result returns the original task.
Should that method fail though, .Result will throw an AggregateException containing the original exception - or is it an AggregateException containing an AggregateException containing the original? I don't want to find out. Might as well have used Unwrap(). Finally, everything is still awaited.
The only difference is that the first snippet returns in the original synchronization context after each await. In a desktop application, that would be the UI. In many cases you want that - this allows you to update the UI with the response without any kind of marshalling. You can just write
var response = await client.GetAsync("www.someaddress.yo");
string content = await response.Content.ReadAsStringAsync();
textBox1.Text=content;
In other cases you may not want that, eg a library writer doesn't want the library to affect the client application. That's where ConfigureAwait(false) comes in :
var response = await client.GetAsync("www.someaddress.yo").ConfigureAwait(false);
string content = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
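The exception-handling difference mentioned above can be seen in a small sketch. All names here (ExceptionShapes, FailAsync and the two Observe methods) are hypothetical and exist only for illustration; FailAsync stands in for any asynchronous call that fails.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical demo of how a failure surfaces through .Result versus await.
public static class ExceptionShapes
{
    private static async Task<string> FailAsync()
    {
        // ConfigureAwait(false) keeps the blocking variant below from
        // depending on the caller's context.
        await Task.Delay(1).ConfigureAwait(false);
        throw new InvalidOperationException("boom");
    }

    // Blocks with .Result and reports the type of exception observed.
    public static string ObserveWithResult()
    {
        try { return FailAsync().Result; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // Awaits the task and reports the type of exception observed.
    public static async Task<string> ObserveWithAwaitAsync()
    {
        try { return await FailAsync(); }
        catch (Exception ex) { return ex.GetType().Name; }
    }
}
```

ObserveWithResult reports AggregateException, while ObserveWithAwaitAsync reports InvalidOperationException, which is why catch blocks written for the awaited version are simpler.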

UWP + IIS + async behaviour

We are working on a project developed in UWP(frontend) and REST-MVC-IIS(backend).
I was thinking about a theoretical scenario which might occur:
From what I know, there is no way to guarantee the order in which requests will be processed and served by IIS.
So in a simple scenario, let's just assume this:
UI:
SelectionChanged(productId=1);
SelectionChanged(productId=2);
private async void SelectionChanged(int productId)
{
await GetProductDataAsync(productId);
}
IIS:
GetProductDataAsync(productId=1) scheduled on thread pool
GetProductDataAsync(productId=2) scheduled on thread pool
GetProductDataAsync(productId=2) finishes first => send response to client
GetProductDataAsync(productId=1) finishes later => send response to client
As you can see, the request for productId=2, for whatever reason, finished faster than the first request for productId=1.
Because of the way async works, both calls will create continuation tasks on the UI, which will overwrite each other if they don't arrive in the correct order, since they update the same data.
This can be extrapolated to almost any master-detail scenario, where you can end up selecting a master item and getting the wrong details for it (because of the order in which the responses come back from IIS).
What I wanted to know is whether there are best practices for handling this kind of scenario. Lots of solutions come to mind, but I don't want to jump the gun and go for one implementation before I see what other options are on the table.

As you presented your code await GetProductDataAsync(productId=2); will always run after await GetProductDataAsync(productId=1); has completed. So, there is no race condition.
If your code was:
await Task.WhenAll(
    GetProductDataAsync(productId: 1),
    GetProductDataAsync(productId: 2));
Then there might be a race condition. And, if that's a problem, it's not particular to async-await but due to the fact that you are making concurrent calls.
If you wrap that code in another method and use ConfigureAwait(false) inside it, the only continuation on the UI thread is the caller's:
async Task GetProductDataAsync()
{
    await Task.WhenAll(
        GetProductDataAsync(productId: 1),
        GetProductDataAsync(productId: 2)
    ).ConfigureAwait(false);
}

I think I get what you're saying. Because of the async void eventhandler, nothing in the UI is awaiting the first call before the second. I am imagining a drop down of values and when it changes, it fetches the pertinent data.
Ideally, you would probably want to either lock out the UI during the call or implement a CancellationToken.
If you're just looking for a way to meter the calls, keep reading...
I use a singleton repository layer in the UWP application that handles whether or not to fetch the data from a web service, or a locally cached copy. Additionally, if you want to meter the requests to process one at a time, use SemaphoreSlim. It works like lock, but for async operations (oversimplified simile).
Here is an example that should illustrate how it works...
public class ProductRepository : IProductRepository
{
    // Initializing with (1, 1) allows only one caller at a time.
    static readonly SemaphoreSlim semaphoreLock = new SemaphoreSlim(1, 1);
    public async Task<IProductData> GetProductDataByIdAsync(int productId)
    {
        // If the semaphore is in use, subsequent requests will wait here.
        // Waiting outside the try block means Release only runs after a successful wait.
        await semaphoreLock.WaitAsync();
        try
        {
            using (var client = new HttpClient())
            {
                client.BaseAddress = new Uri("yourbaseurl");
                client.DefaultRequestHeaders.Accept.Clear();
                client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
                string url = "yourendpoint";
                HttpResponseMessage response = await client.GetAsync(url);
                if (response.IsSuccessStatusCode)
                {
                    var json = await response.Content.ReadAsStringAsync();
                    ProductData prodData = JsonConvert.DeserializeObject<ProductData>(json);
                    return prodData;
                }
                // Handle non-success.
                return null;
            }
        }
        catch (Exception e)
        {
            // Handle exception.
            return null;
        }
        finally
        {
            // If any requests are queued up, the next one will proceed here.
            semaphoreLock.Release();
        }
    }
}
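The cancellation-token option mentioned earlier in this answer can be sketched as a "latest request wins" helper. This is only an illustration under assumptions: LatestOnlyLoader, LoadAsync and the fetch delegate are hypothetical names, the delegate stands in for the real web call, and disposal of the CancellationTokenSource instances is left out for brevity.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: each new request cancels the one before it.
public sealed class LatestOnlyLoader
{
    private CancellationTokenSource _current;

    // Returns the fetched value, or null if a newer request superseded this one.
    public async Task<string> LoadAsync(int productId,
        Func<int, CancellationToken, Task<string>> fetch)
    {
        var cts = new CancellationTokenSource();
        // Publish this request as the current one and cancel the previous request.
        var previous = Interlocked.Exchange(ref _current, cts);
        previous?.Cancel();
        try
        {
            string result = await fetch(productId, cts.Token);
            // Discard the result if a newer request started while this one was in flight.
            cts.Token.ThrowIfCancellationRequested();
            return result;
        }
        catch (OperationCanceledException)
        {
            return null;
        }
    }
}
```

In the SelectionChanged handler from the question, the UI would be updated only when LoadAsync returns a non-null value, so a slow response for productId=1 can no longer overwrite the details for productId=2.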

How does the async/await return callchain work?

I had a situation recently where I had an ASP.NET WebAPI controller that needed to perform two web requests to another REST service inside its action method. I had written my code to have functionality separated cleanly into separate methods, which looked a little like this example:
public class FooController : ApiController
{
public IHttpActionResult Post(string value)
{
var results = PerformWebRequests();
// Do something else here...
}
private IEnumerable<string> PerformWebRequests()
{
var result1 = PerformWebRequest("service1/api/foo");
var result2 = PerformWebRequest("service2/api/foo");
return new string[] { result1, result2 };
}
private string PerformWebRequest(string api)
{
using (HttpClient client = new HttpClient())
{
// Call other web API and return value here...
}
}
}
Because I was using HttpClient all web requests had to be async. I've never used async/await before so I started naively adding in the keywords. First I added the async keyword to the PerformWebRequest(string api) method but then the caller complained that the PerformWebRequests() method has to be async too in order to use await. So I made that async but now the caller of that method must be async too, and so on.
What I want to know is how far down the rabbit hole must everything be marked async to just work? Surely there would come a point where something has to run synchronously, in which case how is that handled safely? I've already read that calling Task.Result is a bad idea because it could cause deadlocks.

What I want to know is how far down the rabbit hole must everything be
marked async to just work? Surely there would come a point where
something has to run synchronously
No, there shouldn't be a point where anything runs synchronously, and that is what async is all about. The phrase "async all the way" actually means all the way up the call stack.
When you process a message asynchronously, you're letting your message loop process requests while your truly asynchronous method runs, because when you go deep down the rabbit hole, There is no Thread.
For example, when you have an async button click event handler:
private async void Button_Click(object sender, RoutedEventArgs e)
{
await DoWorkAsync();
// Do more stuff here
}
private Task DoWorkAsync()
{
return Task.Delay(2000); // Fake work.
}
When the button is clicked, the handler runs synchronously until it hits the first await. Once hit, the method yields control back to the caller, which means the button event handler frees the UI thread, which in turn frees the message loop to process more requests in the meantime.
The same goes for your use of HttpClient. For example, when you have:
public async Task<IHttpActionResult> Post(string value)
{
    var results = await PerformWebRequests();
    // Do something else here...
}
private async Task<IEnumerable<string>> PerformWebRequests()
{
    var result1 = await PerformWebRequestAsync("service1/api/foo");
    var result2 = await PerformWebRequestAsync("service2/api/foo");
    return new string[] { result1, result2 };
}
private async Task<string> PerformWebRequestAsync(string api)
{
    using (HttpClient client = new HttpClient())
    {
        var response = await client.GetAsync(api);
        // More work..
        return await response.Content.ReadAsStringAsync();
    }
}
See how the async keyword went up all the way to the main method processing the POST request. That way, while the async http request is handled by the network device driver, your thread returns to the ASP.NET ThreadPool and is free to process more requests in the meanwhile.
A Console Application is a special case, since when the Main method terminates, unless you spin up a new foreground thread, the app will terminate. There, if the only call is an async call, you have to explicitly use Task.Wait or Task.Result (or, since C# 7.1, declare Main as async Task). In that case there is no SynchronizationContext by default, so continuations run on the thread pool and there isn't a chance to cause this deadlock.
To conclude, async methods shouldn't be processed synchronously at the top of the stack unless there is an exotic use case (such as a Console App). They should flow asynchronously all the way, allowing the thread to be freed when possible.

You need to "async all the way up" to the very top of the call stack, where you reach a message loop that can process all of the asynchronous requests.

How to do asynchronous web calls from within asp.net

Let's say I'm within an ASP.NET application, WCF or Web API, and part of this application's job is to contact a third party over the wire. I'd like to do this asynchronously, or rather without blocking, so that the thread pool doesn't get starved. However, I don't want to change all my code in the service, only the bit that makes the web call.
Here is some code i have written:
public string GetSomeData()
{
    Task<string> stuff = CallApiAsync();
    return stuff.Result; //does this block here?
}
private async Task<string> CallApiAsync()
{
    using (var httpClient = new HttpClient())
    {
        string response = await httpClient.GetStringAsync(Util.EndPoint).ConfigureAwait(false);
        return response;
    }
}
I thought the idea was as follows but please correct any misconceptions.
The caller of CallApiAsync can call the method, and when it hits await a Task is created which represents some work to be done asynchronously that will take some time. At this point the thread that reaches the await returns to the thread pool to do something else, i.e. handle a different request. Once the Task completes, the await line wakes up and the code continues from there as if it was synchronous.
If this is the case, why do I need to return a Task from my API method? The caller seems to have to call stuff.Result, which implies that the task may not have finished and calling Result could block. Note that I don't want to make the calling method async too, as then the method that calls that would need to be async, and so on.
What is the order of events here in my code?
One other question: why did I need to set ConfigureAwait to false? Otherwise everything hangs.

I'd like to do this asynchronously, or rather without blocking, so that the thread pool doesn't get starved. However, I don't want to change all my code in the service, only the bit that makes the web call.
That's not possible. In order to be truly asynchronous, you must allow async to "grow" through the code as far as it needs to. What you're trying to do is block on an asynchronous call, which won't give you any benefit (you're freeing up a thread by using async, but then you're turning around and consuming a thread by using Result).
At this point the thread that reaches the await returns to the thread pool to do something else, i.e. handle a different request.
Not quite. When an async method hits an await, it returns an incomplete Task to its caller. If the caller, in turn, awaits that task, then it returns an incomplete Task to its caller, etc. When the ASP.NET runtime receives an incomplete Task from your action/service method/whatever, then it releases the thread to the thread pool.
So, you do have to go "async all the way" to see the real benefit of async.
I have an async intro on my blog if you want a more gentle introduction, as well as an MSDN article on async best practices (one of which is: async all the way). I also have a blog post that describes the deadlock you were seeing.
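The "returns an incomplete Task to its caller" behaviour described above can be observed with a small sketch. The names AwaitDemo and DecorateAsync are hypothetical; the inner task stands in for any operation that has not finished yet, such as a pending web request.

```csharp
using System.Threading.Tasks;

// Hypothetical demo: an async method hands back an incomplete Task at its first await.
public static class AwaitDemo
{
    // Awaits a task it does not control, then post-processes the value.
    public static async Task<string> DecorateAsync(Task<string> inner)
    {
        // If inner has not finished, the method returns to its caller here
        // with an incomplete Task<string>; the rest runs when inner completes.
        string value = await inner;
        return "[" + value + "]";
    }
}
```

If DecorateAsync is called with the Task of a TaskCompletionSource<string> that has not been completed, the returned task's IsCompleted is false. Once SetResult is called on the source, the returned task completes with the decorated value, and no thread was blocked in between.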

The compiler handles a lot of the magic behind the async pattern for you, but syntactically, you have to tell it what you want it to do by providing a method prototype that says "ok, this is an asynchronous operation that can be awaited."
For this to happen, your method must return a Task or Task<T>.
Any Task can be awaited.
You should be VERY careful when using .Result and .Wait(), as they can block in some very unexpected circumstances, because the runtime may decide to execute your method synchronously.
You should say:
await CallApiAsync();
or, to actually take advantage of it:
Task stuff = CallApiAsync();
//More code that can happen independetly of "stuff"
await stuff;
In order to do that, your GetSomeData() function must also be marked as async, and because it returns a value it must itself return Task<string>.
Finished copy of a working async version of your code:
public async Task<string> GetSomeData()
{
    Task<string> stuff = CallApiAsync();
    return await stuff;
}
private async Task<string> CallApiAsync()
{
    using (var httpClient = new HttpClient())
    {
        string response = await httpClient.GetStringAsync(Util.EndPoint).ConfigureAwait(false);
        return response;
    }
}
Honestly, if that's all the CallApiAsync function is ever going to do, you may as well inline it, though.
