I'm trying to learn the async and await mechanisms in C#.
The simplest example is clear to me.
The line
Task<string> getStringTask = client.GetStringAsync("http://msdn.microsoft.com");
triggers an asynchronous web call. The control returns to AccessTheWebAsync(). It is free to perform DoIndependentWork(). After doing this it waits for the completion of the task getStringTask and when this result is available the function executes the next line
return urlContents.Length;
So, as far as I understand the purpose of the async call is to let the caller execute other operations when the operation tagged with async is in progress.
However, I'm a bit confused by the example in this function.
private async Task<byte[]> GetURLContentsAsync(string url)
{
// The downloaded resource ends up in the variable named content.
var content = new MemoryStream();
// Initialize an HttpWebRequest for the current URL.
var webReq = (HttpWebRequest)WebRequest.Create(url);
// Send the request to the Internet resource and wait for
// the response.
using (WebResponse response = await webReq.GetResponseAsync())
// The previous statement abbreviates the following two statements.
//Task<WebResponse> responseTask = webReq.GetResponseAsync();
//using (WebResponse response = await responseTask)
{
// Get the data stream that is associated with the specified url.
using (Stream responseStream = response.GetResponseStream())
{
// Read the bytes in responseStream and copy them to content.
await responseStream.CopyToAsync(content);
// The previous statement abbreviates the following two statements.
// CopyToAsync returns a Task, not a Task<T>.
//Task copyTask = responseStream.CopyToAsync(content);
// When copyTask is completed, content contains a copy of
// responseStream.
//await copyTask;
}
}
// Return the result as a byte array.
return content.ToArray();
}
Inside the method GetURLContentsAsync(), there are two async invocations. However, the API waits with an await call on both. The caller is not doing anything between the trigger of the async operation and the receipt of the data. So, as far as I understand, the async/await mechanism brings no benefit here. Am I missing something obvious here?
Your code doesn't need to explicitly be doing anything between await'd async calls to gain benefit. It means that the thread isn't sitting waiting for each call to complete, it is available to do other work.
If this is a web application it can result in more requests being processed. If it is a Windows application it means the UI thread isn't blocked and the user has a better experience.
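To make that concrete, here is a minimal sketch. SimulatedDownloadAsync is a hypothetical stand-in for GetURLContentsAsync, and Task.Delay stands in for the network wait (both are assumptions, not part of the question's code). Each method awaits internally and does nothing "in between", yet a caller that starts two of them gets overlapping waits:

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class AwaitFreesTheThread
{
    // Hypothetical stand-in for GetURLContentsAsync; the delay simulates network latency.
    public static async Task<int> SimulatedDownloadAsync(int milliseconds)
    {
        await Task.Delay(milliseconds);   // no thread is blocked while this is pending
        return milliseconds;
    }

    // Starts both operations before awaiting either one, so their waits overlap.
    public static async Task<TimeSpan> RunBothAsync()
    {
        var stopwatch = Stopwatch.StartNew();
        Task<int> first = SimulatedDownloadAsync(500);
        Task<int> second = SimulatedDownloadAsync(500);
        await Task.WhenAll(first, second);
        return stopwatch.Elapsed;
    }

    public static async Task Main()
    {
        TimeSpan elapsed = await RunBothAsync();
        Console.WriteLine($"Two 500 ms operations took {elapsed.TotalMilliseconds:F0} ms in total");
    }
}
```

On an otherwise idle machine the total should come out much closer to 500 ms than to 1000 ms, because neither wait holds a thread.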
However, the API waits with an await call on both.
You have to await both calls because the code in your method needs to execute sequentially. If you don't await the first call, the lines after it will run before that call has finished, which is probably not what you expect or need.
Two reasons for awaiting both methods come to mind:
the result of the first async method may be used as a parameter in the second async method call
the result of the first async method call may determine whether the second async method is called at all
In those cases it is quite clear why you need to await each async method call inside your async method.
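Both reasons can be sketched in a few lines. The method names and the Task.Delay calls below are hypothetical placeholders for real I/O, chosen only for illustration:

```csharp
using System;
using System.Threading.Tasks;

public static class SequentialAwaits
{
    // Hypothetical first step; returns an empty string when there is no such user.
    public static async Task<string> LookUpUserNameAsync(int userId)
    {
        await Task.Delay(50);
        return userId <= 0 ? "" : $"user{userId}";
    }

    // Hypothetical second step; it needs the result of the first step as its input.
    public static async Task<int> CountOrdersAsync(string userName)
    {
        await Task.Delay(50);
        return userName.Length;   // placeholder for a real query
    }

    public static async Task<int> GetOrderCountAsync(int userId)
    {
        string name = await LookUpUserNameAsync(userId);

        // The first result decides whether the second call happens at all.
        if (name.Length == 0)
        {
            return 0;
        }

        // The first result is passed as a parameter to the second call.
        return await CountOrdersAsync(name);
    }

    public static async Task Main()
    {
        Console.WriteLine(await GetOrderCountAsync(7));
    }
}
```

Without the first await, name would be a Task<string> rather than a string, so neither the check nor the second call could be written this way.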
EDIT:
From the example you are pointing to, you can clearly see that the output of the first async method is used in the second async method call here:
using (WebResponse response = await webReq.GetResponseAsync())
// The previous statement abbreviates the following two statements.
//using (WebResponse response = await responseTask)
{
// Get the data stream that is associated with the specified url.
using (Stream responseStream = response.GetResponseStream())
{
// Read the bytes in responseStream and copy them to content.
await responseStream.CopyToAsync(content);
// The previous statement abbreviates the following two statements.
// CopyToAsync returns a Task, not a Task<T>.
//Task copyTask = responseStream.CopyToAsync(content);
// When copyTask is completed, content contains a copy of
// responseStream.
//await copyTask;
}
}
The task returned by GetResponseAsync completes when the web server starts its response (by sending the headers), while the task returned by CopyToAsync completes once all the data has been sent from the server and copied to the other stream.
If you add code to record how much time elapses between the start of the asynchronous call and the return to your function, you'll see that both methods take some time to complete (on a large file, at least.)
private static async Task<byte[]> GetURLContentsAsync(string url) {
var content = new MemoryStream();
var webReq = (HttpWebRequest)WebRequest.Create(url);
DateTime responseStart = DateTime.Now;
using (WebResponse response = await webReq.GetResponseAsync()) {
Console.WriteLine($"GetResponseAsync time: {(DateTime.Now - responseStart).TotalSeconds}");
using (Stream responseStream = response.GetResponseStream()) {
DateTime copyStart = DateTime.Now;
await responseStream.CopyToAsync(content);
Console.WriteLine($"CopyToAsync time: {(DateTime.Now - copyStart).TotalSeconds}");
}
}
return content.ToArray();
}
For a ~40 MB file on a fast server, the first await is quick while the second await takes longer.
https://ftp.mozilla.org/pub/thunderbird/releases/52.2.1/win32/en-US/Thunderbird%20Setup%2052.2.1.exe
GetResponseAsync time: 0.3422409
CopyToAsync time: 5.3175731
But for a server that takes a while to respond, the first await can take a while too.
http://www.fakeresponse.com/api/?sleep=3
GetResponseAsync time: 3.3125195
CopyToAsync time: 0
I have an app in which a button starts creating XMLs. In the end of each XML creation, the SendInvoice function sends it, receives the response and a function (ParseResponse) parses the responses and does the database operations needed.
The idea is that when all the XMLs are created and sent, the application must close.
The problem is that I have lost control with async and the application seems to close before it actually finishes all the jobs. Also XMLs are sent before the previous have been processed.
The ParseResponse function is not asynchronous.
Here is the SendInvoice function.
Can you suggest any good practise?
Thank you in advance.
public async void SendInvoice(string body)
{
Cursor.Current = Cursors.WaitCursor;
var client = new HttpClient();
var queryString = HttpUtility.ParseQueryString(string.Empty);
var uri = "https://xxxx.xxx/SendInvoices?" + queryString;
HttpResponseMessage response;
// Request body
byte[] byteData = Encoding.UTF8.GetBytes(body);
using (var content = new ByteArrayContent(byteData))
{
content.Headers.ContentType = new MediaTypeHeaderValue("application/xml");
response = await client.PostAsync(uri, content);
string responsebody = await response.Content.ReadAsStringAsync();
ParseResponse(response.ToString());
ParseResponse(responsebody);
}
}
The rest of the code
private void button1_Click(object sender, EventArgs e)
{
For
{
......
SendInvoice(xml)
}
System.Windows.Forms.Application.Exit();
}
Since you are calling the method from an event handler, this is a case where async void is acceptable. Change your button click handler's signature to use async. I also added ConfigureAwait(false) to some of the async method calls (see the related questions "Best practice to call ConfigureAwait for all server-side code" and "Why is writing ConfigureAwait(false) on every line with await always recommended?"):
private async void button1_Click(object sender, EventArgs e)
{
//since you are using a for-loop, I'd suggest adding each Task
//to a List and awaiting all Tasks to complete using .WhenAll()
var tasks = new List<Task>();
FOR
{
......
//await SendInvoice(xml).ConfigureAwait(false);
tasks.Add(SendInvoice(xml));
}
await Task.WhenAll(tasks).ConfigureAwait(false);
System.Windows.Forms.Application.Exit();
}
AND change your SendInvoice method signature to return a Task
public async Task SendInvoice(string body)
{
Cursor.Current = Cursors.WaitCursor;
var client = new HttpClient();
var queryString = HttpUtility.ParseQueryString(string.Empty);
var uri = "https://xxxx.xxx/SendInvoices?" + queryString;
HttpResponseMessage response;
// Request body
byte[] byteData = Encoding.UTF8.GetBytes(body);
using (var content = new ByteArrayContent(byteData))
{
content.Headers.ContentType = new MediaTypeHeaderValue("application/xml");
response = await client.PostAsync(uri, content).ConfigureAwait(false);
string responsebody = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
ParseResponse(response.ToString());
ParseResponse(responsebody);
}
}
I was very used to multithreaded programming and it took me some time to understand asynchronous programming, because it really has nothing to do with multithreading. It is about doing more with a single thread, or a small number of threads.
Asynchronous code is beneficial when the CPU would otherwise be waiting for something besides processing. Examples are: waiting for a network response, waiting for data to be read from disk, waiting on a separate process such as a database server.
It provides a way for the thread you are running to do other things while you wait. C# does this using Task. A task is some work that is being done, and it can be running or it can be waiting, and when waiting it doesn't need a thread attached.
An asynchronous method should return a Task (or Task<T>) so that its caller can await it; async void is only appropriate for event handlers. So your function should be:
public async Task SendInvoice() {
...
The async keyword is used by the compiler to automatically wrap your function in a task object so you don't need to worry about a lot of the details. You just use await when calling another async function. You could do more work yourself to create tasks or return a task from another async function, or even call multiple async functions and await all of them together.
If your async method returns a value, use the generic Task: Task<String>, for example.
The Task is returned from an async method before the task completes. That is what allows the thread to be used by something else, but the Task has to make its way back to the starting place, which is why in asynchronous programming you'll hear "async all the way up". It doesn't really do any good until it gets back to a caller that has multiple tasks to balance, which is usually the entry point of your application or the web request.
You can make your C# Main method async, but it mostly won't matter unless your process is really doing multiple things at the same time. For a web application, that can just be handling multiple requests. For a standalone app, it means you can query multiple APIs, make multiple web requests or db queries at the same time, and await them all, just using a single thread. Obviously, that can make things faster (at least locally, the external resources may have more work to do).
For a simple way to keep your program from exiting, if you have an asynchronous main, just await the call to SendInvoice. If your main is not async, you can use something like:
SendInvoice().Wait()
or, if the method returns a Task<T> rather than a plain Task, read its Result property.
Using Wait() or Result blocks the calling thread until the task completes, so that thread cannot be used for any other tasks in the meantime. If there are more threads in the thread pool, other tasks may continue to run, but using Wait/Result on a single Task typically defeats the point of asynchronous programming. On a UI thread it can also deadlock if the awaited code needs to resume on that same thread, so keep that in mind.
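A minimal sketch of the two options in a console app. SendInvoiceAsync here is a hypothetical variant that returns a Task<string>, and Task.Delay stands in for the HTTP call (both are assumptions for illustration):

```csharp
using System;
using System.Threading.Tasks;

public static class WaitingForATask
{
    // Hypothetical stand-in for SendInvoice; the delay simulates the HTTP round trip.
    public static async Task<string> SendInvoiceAsync(string body)
    {
        await Task.Delay(100);
        return $"sent {body.Length} bytes";
    }

    // Preferred: an async Main awaits the task, so no thread is blocked while waiting.
    public static async Task Main()
    {
        string result = await SendInvoiceAsync("<invoice/>");
        Console.WriteLine(result);
    }

    // Blocking alternative for callers that cannot be async.
    // The calling thread does nothing else until the task completes.
    public static string SendInvoiceBlocking(string body)
    {
        return SendInvoiceAsync(body).Result;
    }
}
```

The blocking variant is fine in a console app, but I would avoid it on a UI thread for the deadlock reason mentioned above.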
EDIT
Now that you have posted your calling code, it appears your call is in a loop. This is a good opportunity to take advantage of async calls and send ALL the invoices at once.
private async void button1_Click(object sender, EventArgs e)
{
List<Task> tasks = new List<Task>();
FOR
{
......
// Add the Task itself; ConfigureAwait(false) returns an awaitable, not a Task.
tasks.Add(SendInvoice(xml));
}
await Task.WhenAll(tasks).ConfigureAwait(false);
System.Windows.Forms.Application.Exit();
}
That will send ALL the invoices, then return from the handler, and then exit once all the responses have been received.
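If, as the question suggests, each invoice must be fully processed before the next one is sent, awaiting inside the loop is an alternative. A minimal sketch, in which SendInvoiceAsync is a hypothetical stand-in that only logs what would happen and the delay simulates the HTTP round trip:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialInvoices
{
    // Hypothetical stand-in for SendInvoice: logs when the send starts and when parsing is done.
    public static async Task SendInvoiceAsync(string xml, List<string> log)
    {
        log.Add($"send {xml}");
        await Task.Delay(50);        // simulates the HTTP round trip
        log.Add($"parsed {xml}");    // simulates ParseResponse
    }

    // Awaiting inside the loop means the next invoice is not sent
    // until the previous one has been sent and parsed.
    public static async Task<List<string>> SendAllInOrderAsync(IEnumerable<string> xmls)
    {
        var log = new List<string>();
        foreach (string xml in xmls)
        {
            await SendInvoiceAsync(xml, log);
        }
        return log;
    }

    public static async Task Main()
    {
        List<string> log = await SendAllInOrderAsync(new[] { "invoice1", "invoice2" });
        Console.WriteLine(string.Join(", ", log));
    }
}
```

This trades throughput for ordering: the total time is the sum of the individual requests, whereas Task.WhenAll lets them overlap.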
Below is my code to get an HTML page
public static async Task<string> GetUrltoHtml(string url)
{
string s;
using (var client = new HttpClient())
{
var result = client.GetAsync(url).Result;
//Console.WriteLine("!!!"+result.StatusCode);
s = result.Content.ReadAsStringAsync().Result; //break point
}
return s;
}
the line
var result = client.GetAsync(url).Result;
makes the app freeze for a few seconds and behave as if it were synchronous.
Any comments are welcome.
According to the docs
Accessing the property's get accessor blocks the calling thread until the asynchronous operation is complete; it is equivalent to calling the Wait method.
So getting Result is a blocking action. You should use await instead, for both calls:
var result = await client.GetAsync(url);
s = await result.Content.ReadAsStringAsync();
(Result is helpful when the result is ready and you just want to get it. Or in some cases you want to block the thread (but it's not recommended).)
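Put together, the method could look roughly like this. It is a sketch, not the only way to write it: the shared static HttpClient and the EnsureSuccessStatusCode call are my additions, and the class and method names are made up for the example.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class HtmlFetcher
{
    // One shared client instead of a new HttpClient per call (an assumption of this sketch).
    private static readonly HttpClient Client = new HttpClient();

    public static async Task<string> GetUrlToHtmlAsync(string url)
    {
        // await hands control back to the caller instead of blocking on .Result
        using (HttpResponseMessage response = await Client.GetAsync(url))
        {
            response.EnsureSuccessStatusCode();   // throws for non-2xx status codes
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

The caller then has to await it as well, for example string html = await HtmlFetcher.GetUrlToHtmlAsync(url); inside an async event handler.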
var httpResponseMessage = await httpClient.SendAsync(message).ConfigureAwait(false);
var dataStream = await httpResponseMessage.Content.ReadAsStreamAsync();
In principle this should be awaited, but no matter what I do, execution exits the method and returns to the UI. Execution resumes when the response arrives, but by that time the UI has already reported that the operation finished, when in fact it hasn't.
All calling methods are awaited.
The initial method is not awaited by design (Task.Run(() => StartDownload(selectedSchedules));). It starts a UI method that executes the services that trigger the HttpClient. When that call finishes, the UI should update with progress, but as soon as httpClient.SendAsync is executed, execution returns to the UI.
Task.Run(() => StartDownload(selectedSchedules)); //First call, initiated by a button
public async Task StartDownload(SchedulesList list)
{
//var t = new Task(() => _scheduleServices.Download(list));
//t.Start();
//await t;
await _scheduleServices.DownloadIwcfDb(list, UsbInfoModels);
}
public async Task Download(SchedulesList schedulesList)
{
await DownloadDb(schedulesList);
}
private async Task DownloadDb(SchedulesList schedulesList)
{
using (var httpClient = new HttpClient())
{
var message = new HttpRequestMessage(new HttpMethod("POST"), ApiCallUrls.GetIwcfSchedules)
{
Content = new StringContent(JsonConvert.SerializeObject(schedulesList), Encoding.UTF8, "application/json")
};
httpClient.Timeout = TimeSpan.FromMinutes(20);
var httpResponseMessage= await httpClient.SendAsync(message).ConfigureAwait(false);
var dataStream = await httpResponseMessage.Content.ReadAsStreamAsync();
using (Stream contentStream = dataStream, stream = new FileStream(Path.Combine(Directories.SomEDir, Directories.SomeFileName), FileMode.Create, FileAccess.Write, FileShare.None))
{
await contentStream.CopyToAsync(stream);
}
}
}
Call Chain Added, irrelevant code removed from the methods
Your problem probably lies in your first call.
In your code you have:
Task.Run(()=>StartDownload(selectedSchedules)); //First call, initiated by a button
//I assume afterwards you update the ProgressBar or give some other progress feedback
What this does is: It calls StartDownload and immediately continues execution. All the other stuff (downloading etc) is then happening in the background. Remember that the method StartDownload does not block; it simply returns a Task object. If you do not await that Task object, the code will simply proceed.
I guess what you wanted is: Call StartDownload, wait for it to finish, then update the progress.
A quick solution would be to mark your event handler of the button with async, and then use async all the way. The method would look a bit like this:
private async void HandleEvent()
{
await StartDownload();
//update progress
}
I can recommend this blog post from Stephen Cleary as an introduction to async-await: https://blog.stephencleary.com/2012/02/async-and-await.html
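The difference between starting the task and awaiting it can be seen in a small console sketch. StartDownloadAsync is a hypothetical stand-in for StartDownload, and the delay simulates the HTTP transfer:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class FireAndForgetDemo
{
    // Hypothetical stand-in for StartDownload; the delay simulates the HTTP transfer.
    public static async Task StartDownloadAsync(ConcurrentQueue<string> log)
    {
        await Task.Delay(200);
        log.Enqueue("download finished");
    }

    // Returns how many log entries existed right after Task.Run and after awaiting the task.
    public static async Task<(int Before, int After)> RunAsync()
    {
        var log = new ConcurrentQueue<string>();

        // Task.Run returns immediately; the download continues in the background.
        Task background = Task.Run(() => StartDownloadAsync(log));
        int before = log.Count;   // code placed here runs before the download is done

        await background;         // only after this line has the download finished
        int after = log.Count;

        return (before, after);
    }

    public static async Task Main()
    {
        var (before, after) = await RunAsync();
        Console.WriteLine($"entries right after Task.Run: {before}, after awaiting: {after}");
    }
}
```

Any progress update placed where before is read would report completion too early, which matches the behaviour described in the question.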
I'm working on a console app that takes a list of endpoints to video data, makes an HTTP request, and saves the result to a file. These are relatively small videos. Because of an issue outside of my control, one of the videos is very large (145 minutes instead of a few seconds).
The problem I'm seeing is that my memory usage spikes to ~1 GB after that request is called, and I eventually get a "Task was cancelled" error (presumably because the client timed out). This is fine, I don't want this video, but what is concerning is that my allocated memory stays high no matter what I do. I want to be able to release the memory. It seems concerning that Task Manager shows ~14 MB memory usage until this call, then trickles up continuously afterwards. In the VS debugger I just see a spike.
I tried throwing everything in a using statement, re-initializing the HttpClient on exception, manually invoking GC.Collect() with no luck. The code I'm working with looks something like this:
consumer.Received += async (model, ea) =>
{
InitializeHttpClient(source);
...
foreach(var item in queue)
{
await SaveFileFromEndpoint(url, fileName);
...
}
}
and the methods:
public void InitializeHttpClient(string source)
{
...
_client = new HttpClient();
...
}
public async Task SaveFileFromEndpoint(string endpoint, string fileName)
{
try
{
using (HttpResponseMessage response = await _client.GetAsync(endpoint))
{
if (response.IsSuccessStatusCode)
{
using(var content = await response.Content.ReadAsStreamAsync())
using (var fileStream = File.Create($"{fileName}"))
{
await response.Content.CopyToAsync(fileStream);
}
}
}
}
catch (Exception ex)
{
}
}
Here is a look at my debugger output:
I guess I have a few questions about what I'm seeing:
Is the memory usage I'm seeing actually an issue?
Is there any way I can release the memory being allocated by a large HTTP request?
Is there any way I can see the content length of the request before the call is made and memory is allocated? So far I haven't been able to find a way to find out before the actual memory is allocated.
Thanks in advance for your help!
If you use HttpClient.SendAsync(HttpRequestMessage, HttpCompletionOption) instead of GetAsync, you can supply HttpCompletionOption.ResponseHeadersRead, (as opposed to the default ResponseContentRead). This means that the response stream will be handed back to you before the response body has downloaded (rather than after it), and will require significantly less buffer to operate.
In addition to @spender's answer (which is on point), you also need to make sure that you dispose the response when you are done with it. You can find more information about this in the article "Efficiently Streaming Large HTTP Responses With HttpClient".
Here is a code sample:
using (HttpClient client = new HttpClient())
{
const string url = "https://github.com/tugberkugurlu/ASPNETWebAPISamples/archive/master.zip";
using (HttpResponseMessage response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
using (Stream streamToReadFrom = await response.Content.ReadAsStreamAsync())
{
string fileToWriteTo = Path.GetTempFileName();
using (Stream streamToWriteTo = File.Open(fileToWriteTo, FileMode.Create))
{
await streamToReadFrom.CopyToAsync(streamToWriteTo);
}
}
}
You also need to take into account that you should not be creating an HttpClient instance per operation. HttpClientFactory is a well-organised way to share HttpClient instances across your app safely and efficiently.
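On the question of seeing the size before the body is downloaded: with ResponseHeadersRead you can inspect Content.Headers.ContentLength first and skip the body if it is too large. A rough sketch, where the class name, the maxBytes parameter and the choice to skip responses without a Content-Length header are all assumptions of mine:

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class SizeLimitedDownloader
{
    private static readonly HttpClient Client = new HttpClient();

    // A response without a Content-Length header (for example a chunked one)
    // is treated as too large here; that policy is an assumption of this sketch.
    public static bool IsWithinLimit(long? contentLength, long maxBytes)
    {
        return contentLength.HasValue && contentLength.Value <= maxBytes;
    }

    // Returns true if the file was saved, false if it was skipped because of its size.
    public static async Task<bool> SaveIfSmallEnoughAsync(string url, string path, long maxBytes)
    {
        // Only the headers are read at this point; the body has not been buffered.
        using (HttpResponseMessage response =
            await Client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();

            if (!IsWithinLimit(response.Content.Headers.ContentLength, maxBytes))
            {
                return false;   // disposing the response abandons the body
            }

            using (Stream body = await response.Content.ReadAsStreamAsync())
            using (FileStream file = File.Create(path))
            {
                await body.CopyToAsync(file);   // streams to disk in chunks
            }
            return true;
        }
    }
}
```

Content-Length is supplied by the server, so it is a hint rather than a guarantee.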
List<string> urls = this.populateRequestList();
this.Logger("Starting");
var reqs = urls.Select<string, WebRequest>(HttpWebRequest.Create).ToArray();
var iars = reqs.Select(req => req.BeginGetResponse(null, null)).ToArray();
var rsps = reqs.Select((req, i) => req.EndGetResponse(iars[i])).ToArray();
this.Logger("Done");
Things I noticed so far:
When I run this code, "Starting" shows up in my log, but "Done" never shows up. When I view the whole process in the debugger, it seems to skip over it like it's not even there. No exceptions are being thrown either. When reqs.Select is looping through req.EndGetResponse(iars[i]), it's like it freezes or skips over stuff. When I view it in the debugger, I don't get past 10-15 loops before it just skips to the end.
Questions:
How do I stop this from "skipping" sometime during var rsps = reqs.Select((req, i) => req.EndGetResponse(iars[i])).ToArray();?
How do I get the html from rsps? I think my problem with that stems from the "skipping". I tried looping through each response and calling Response.GetResponseStream() etc..., but nothing happens as soon as it skips.
The problem with your code is that BeginGetResponse(null, null) accepts a callback as the first argument which is invoked when the operation completes. This callback is where EndGetResponse should be called. When you call EndGetResponse, the operations are not yet completed.
Look at this article to see how async web requests can be made in C# using iterators: http://tomasp.net/blog/csharp-async.aspx.
If you are using the Task Parallel Library (.NET 4 or later), you can also do this:
var urls = new List<string>();
var tasks = urls.Select(url =>
{
var request = WebRequest.Create(url);
var task = Task.Factory.FromAsync<WebResponse>(request.BeginGetResponse, request.EndGetResponse, null);
// FromAsync returns a task that is already started; calling Start() on it would throw.
return task;
}).ToArray();
Task.WaitAll(tasks);
foreach (var task in tasks)
{
using (var response = task.Result)
using (var stream = response.GetResponseStream())
using (var reader = new StreamReader(stream))
{
var html = reader.ReadToEnd();
}
}
You are trying to use the asynchronous request methods to do a synchronous request, that doesn't work.
You are supposed to start the requests using BeginGetResponse with a callback method that handles each response. If you call EndGetResponse immediately after BeginGetResponse, the response hasn't started to arrive yet, so the call has to block until it does, which defeats the purpose of the asynchronous methods.
If you want to make a synchronous request, use the GetResponse method instead.
As I read http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.endgetresponse.aspx you need to wait for the callback before you can use EndGetResponse?
Or use GetResponse: http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.getresponse.aspx
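On newer framework versions, the same "start everything, then collect all results" idea can be written with HttpClient and Task.WhenAll instead of Begin/End pairs. This is a sketch of an alternative technique, not what the original code used, and the class and method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public static class ParallelFetch
{
    private static readonly HttpClient Client = new HttpClient();

    // Starts one fetch per URL and completes when all of them have finished.
    // Results come back in the same order as the input URLs.
    public static Task<string[]> FetchAllAsync(
        IEnumerable<string> urls, Func<string, Task<string>> fetch)
    {
        return Task.WhenAll(urls.Select(fetch));
    }

    // Convenience overload that downloads each URL as a string.
    public static Task<string[]> FetchAllHtmlAsync(IEnumerable<string> urls)
    {
        return FetchAllAsync(urls, url => Client.GetStringAsync(url));
    }
}
```

Usage would be string[] pages = await ParallelFetch.FetchAllHtmlAsync(urls); inside an async method. Passing the fetch function separately is only there so the ordering logic can be tried without a network connection.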