I'm trying to figure out the correct way to parallelize HTTP requests using Task and async/await. I'm using the HttpClient class which already has async methods for retrieving data. If I just call it in a foreach loop and await the response, only one request gets sent at a time (which makes sense because during the await, control is returning to our event loop, not to the next iteration of the foreach loop).
My wrapper around HttpClient looks like this:
public sealed class RestClient
{
private readonly HttpClient client;
public RestClient(string baseUrl)
{
var baseUri = new Uri(baseUrl);
client = new HttpClient
{
BaseAddress = baseUri
};
}
public async Task<Stream> GetResponseStreamAsync(string uri)
{
var resp = await GetResponseAsync(uri);
return await resp.Content.ReadAsStreamAsync();
}
public async Task<HttpResponseMessage> GetResponseAsync(string uri)
{
var resp = await client.GetAsync(uri);
if (!resp.IsSuccessStatusCode)
{
// ...
}
return resp;
}
public async Task<T> GetResponseObjectAsync<T>(string uri)
{
using (var responseStream = await GetResponseStreamAsync(uri))
using (var sr = new StreamReader(responseStream))
using (var jr = new JsonTextReader(sr))
{
var serializer = new JsonSerializer {NullValueHandling = NullValueHandling.Ignore};
return serializer.Deserialize<T>(jr);
}
}
public async Task<string> GetResponseString(string uri)
{
using (var resp = await GetResponseStreamAsync(uri))
using (var sr = new StreamReader(resp))
{
return sr.ReadToEnd();
}
}
}
And the code invoked by our event loop is
public async void DoWork(Action<bool> onComplete)
{
try
{
var restClient = new RestClient("https://example.com");
var ids = (await restClient.GetResponseObjectAsync<IdListResponse>("/ids")).Ids;
Log.Info("Downloading {0:D} items", ids.Count);
using (var fs = new FileStream(@"C:\test.json", FileMode.Create, FileAccess.Write, FileShare.Read))
using (var sw = new StreamWriter(fs))
{
sw.Write("[");
var first = true;
var numCompleted = 0;
foreach (var id in ids)
{
Log.Info("Downloading item {0:D}, completed {1:D}", id, numCompleted);
numCompleted += 1;
try
{
var str = await restClient.GetResponseString($"/info/{id}");
if (!first)
{
sw.Write(",");
}
sw.Write(str);
first = false;
}
catch (HttpException e)
{
if (e.StatusCode == HttpStatusCode.Forbidden)
{
Log.Warn(e.ResponseMessage);
}
else
{
throw;
}
}
}
sw.Write("]");
}
onComplete(true);
}
catch (Exception e)
{
Log.Error(e);
onComplete(false);
}
}
I've tried a handful of different approaches involving Parallel.ForEach, Linq.AsParallel, and wrapping the entire contents of the loop in a Task.
The basic idea is to keep track of all the asynchronous tasks and await them all at once. The simplest way to do this is to extract the body of your foreach into a separate asynchronous method, and do something like this:
var tasks = ids.Select(i => DoWorkAsync(i));
await Task.WhenAll(tasks);
This way, the individual tasks are issued separately (still in sequence, but without waiting for the I/O to complete), and you await them all at the same time.
Do note that you will also need to do some configuration: on .NET Framework, HTTP is throttled by default to only allow two simultaneous connections to the same server (see ServicePointManager.DefaultConnectionLimit).
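The "start everything, then await everything" idea above can be sketched with a cap on how many requests are in flight. This is only a minimal sketch under stated assumptions: ThrottledTasks, WhenAllThrottledAsync and maxConcurrency are hypothetical names that do not appear in the original code, and the actual request is passed in as a delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledTasks
{
    // Hypothetical helper: starts one task per item, lets at most
    // maxConcurrency of them run their work at once, and awaits them together.
    public static async Task<TResult[]> WhenAllThrottledAsync<TItem, TResult>(
        IEnumerable<TItem> items,
        Func<TItem, Task<TResult>> work,
        int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            // ToList() forces the lazy Select so every task is started right away.
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync().ConfigureAwait(false);
                try { return await work(item).ConfigureAwait(false); }
                finally { gate.Release(); }
            }).ToList();

            // Task.WhenAll returns the results in the same order as the tasks.
            return await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

In the question's DoWork, the call might look like `var bodies = await ThrottledTasks.WhenAllThrottledAsync(ids, id => restClient.GetResponseString($"/info/{id}"), 8);`, with the per-item try/catch moved into the delegate. Because the results come back in the order of ids, they can then be written to the file sequentially.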
I have the following problem: I am trying to wait for an async web response, but it never finishes.
public string getTermine(string trmId)
{
System.Threading.Tasks.Task<string> lisi = LoadTermine((HttpWebRequest)WebRequest.Create("http://" + curent.usrCH + apiKey + curent.phrase + apiTrmIDIS + trmId));//Request get String result like http://ajax.googleapis.com/ajax/services/search/web?v=1.0&start="+i+"&q=
lisi.Wait();
return lisi.Result;
}
private async System.Threading.Tasks.Task<string> LoadTermine(HttpWebRequest myRequest)
{
//List<Termine> terminListe = new List<Termine>();
string Resu = null;
using (WebResponse response = await myRequest.GetResponseAsync())
{
using (System.IO.StreamReader reader = new System.IO.StreamReader(response.GetResponseStream()))
{
Resu = reader.ReadToEnd();
}
}
return Resu;
}
P.S. I can't use a synchronous request because these methods are part of the base code used by iOS, WinPhone and Android, and I don't know why I can't get a synchronous WebResponse.
You are creating a deadlock by calling .Wait() and .Result on the task.
You could do something like this, where the myRemoteUrl variable is the URL of your web service:
private async System.Threading.Tasks.Task<string> LoadTermineAsync(HttpWebRequest myRequest)
{
using (var client = new HttpClient()) {
using (var request = new HttpRequestMessage(HttpMethod.Get, myRemoteUrl)) {
var response = await client.SendAsync(request).ConfigureAwait(false);
var result = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
return result;
}
}
}
For more info on Async/Await
And this evolve video is a little bit more advanced.
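The other half of avoiding the deadlock is on the calling side: the caller awaits the task instead of blocking on it. A minimal sketch, assuming getTermine can itself become asynchronous. TermineClient, GetTermineAsync and the client and url parameters are hypothetical names, not part of the original code:

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public static class TermineClient
{
    // Hypothetical async replacement for getTermine: the caller awaits the
    // returned task rather than calling Wait() or Result on it.
    public static async Task<string> GetTermineAsync(HttpClient client, string url)
    {
        using (var response = await client.GetAsync(url).ConfigureAwait(false))
        {
            // Throws HttpRequestException for non-success status codes.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        }
    }
}
```

A UI event handler would then use `string termine = await TermineClient.GetTermineAsync(client, url);`, so no thread is blocked while the request is in flight.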
I am trying to measure how much time I save by using asynchronous HTTP requests instead of synchronous ones, but the code below deadlocks because of the async method contexts. I searched and found that ConfigureAwait(false) can fix this situation, but the code still deadlocks. Any suggestions on how I can fix this?
class Program
{
static void Main(string[] args)
{
Stopwatch stopwatch = new Stopwatch();
var entities = new List<string>() { "iron+man+3", "dark+knight+rises", "frozen+movie" };
stopwatch.Start();
int taskStatus = AsyncGet(entities).Result;
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedTicks);
stopwatch.Reset();
stopwatch.Start();
SyncGet(entities);
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedTicks);
var depTime = DateTime.UtcNow;
depTime = depTime.AddMilliseconds(-depTime.Millisecond);
Console.WriteLine(depTime.ToString("yyyy-MM-ddTHH:mm:ss.fff"));
Console.Read();
}
private static async Task<int> AsyncGet(List<string> entities)
{
var taskHttpList = new List<Task<WebResponse>>();
var taskStreamList = new List<Task<string>>();
var uriTemplate = "https://www.google.co.in/?#q={0}";
foreach (var entity in entities)
{
Uri uri = new Uri(string.Format(uriTemplate, entity));
var request = WebRequest.Create(uri);
taskHttpList.Add(request.GetResponseAsync());
}
foreach (var task1 in taskHttpList)
{
var response = (HttpWebResponse)await task1.ConfigureAwait(false);
taskStreamList.Add((new StreamReader(response.GetResponseStream())).ReadToEndAsync());
}
foreach (var task in taskStreamList)
{
var responseStr = (String)await task.ConfigureAwait(false);
}
return 0;
}
private static void SyncGet(List<string> entities)
{
var uriTemplate = "https://www.google.co.in/?#q={0}";
foreach (var entity in entities)
{
Uri uri = new Uri(string.Format(uriTemplate, entity));
var request = WebRequest.Create(uri);
var response = request.GetResponse();
var str = new StreamReader(response.GetResponseStream()).ReadToEnd();
}
}
}
I imagine that there is a limit on the number of threads handling IO completion events. Instead of processing the requests in lockstep, build a complete chain of tasks per work item:
private static async Task<int> AsyncGet(List<string> entities)
{
var tasks = new List<Task<string>>();
foreach (var entity in entities)
{
var t = AsyncGetResponse(entity);
tasks.Add(t);
}
// Task.WhenAll returns an awaitable task; Task.WaitAll blocks and returns void
await Task.WhenAll(tasks).ConfigureAwait(false);
return 0;
}
static async Task<string> AsyncGetResponse(string entity)
{
const string uriTemplate = "https://www.google.co.in/?#q={0}";
Uri uri = new Uri(string.Format(uriTemplate, entity));
var request = WebRequest.Create(uri);
string result;
using (var response = (HttpWebResponse)await request.GetResponseAsync().ConfigureAwait(false))
{
using (var reader = new StreamReader(response.GetResponseStream()))
{
result = await reader.ReadToEndAsync().ConfigureAwait(false);
}
}
return result;
}
As was mentioned in comments, don't forget to dispose allocated resources, such as WebResponse.
Given:
public async Task<string> SendRequest(this string url)
{
var wc = new WebClient();
wc.DownloadDataCompleted += (s, e) =>
{
var buffer = e.Result;
using (var sr = new StreamReader(new MemoryStream(buffer)))
{
var result = await sr.ReadToEndAsync();
};
};
wc.DownloadDataAsync(new Uri(url));
}
}
The statement:
var result = await sr.ReadToEndAsync();
Shows an error in the designer as follows: "The await operator can only be used within an async lambda expression"
I don't understand why this message is happening, when I look at all ReadToEndAsync examples they look exactly like this code.
Please advise...
When using a WebClient with the TPL, you should be using the methods with Task in the name, to get Task returning methods rather than using the event based mode (which you then would need to transform into tasks):
public static async Task<string> SendRequest(this string url)
{
using (var wc = new WebClient())
{
var bytes = await wc.DownloadDataTaskAsync(url);
using (var reader = new StreamReader(new MemoryStream(bytes)))
return await reader.ReadToEndAsync();
}
}
Or, more simply, when all you need is the string:
public static Task<string> SendRequest(this string url)
{
return (new WebClient()).DownloadStringTaskAsync(new Uri(url));
}
This question is a followup to Threading issues when using HttpClient for asynchronous file downloads.
To get a file transfer to complete asynchronously using HttpClient, you need to add HttpCompletionOption.ResponseHeadersRead to the SendAsync request. Thus, when that call completes, you will be able to determine that all was well with the request and the response headers by adding a call to EnsureSuccessStatusCode. However the data is possibly still being transferred at this point.
How can you detect errors which happen after the headers are returned but before the data transfer is complete? How would said errors manifest themselves?
Some example code follows, with the point of the question marked with the comment: "// *****WANT TO DO MORE ERROR CHECKING HERE*****"
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
namespace TestHttpClient2
{
class Program
{
/* Use Yahoo portal to access quotes for stocks - perform asynchronous operations. */
static string baseUrl = "http://real-chart.finance.yahoo.com/";
static string requestUrlFormat = "/table.csv?s={0}&d=0&e=1&f=2016&g=d&a=0&b=1&c=1901&ignore=.csv";
static void Main(string[] args)
{
var activeTaskList = new List<Task>();
string outputDirectory = "StockQuotes";
if (!Directory.Exists(outputDirectory))
{
Directory.CreateDirectory(outputDirectory);
}
while (true)
{
Console.WriteLine("Enter symbol or [ENTER] to exit:");
string symbol = Console.ReadLine();
if (string.IsNullOrEmpty(symbol))
{
break;
}
Task downloadTask = DownloadDataForStockAsync(outputDirectory, symbol);
if (TaskIsActive(downloadTask))
{
// This is an asynchronous world - lock the list before updating it!
lock (activeTaskList)
{
activeTaskList.Add(downloadTask);
}
}
else
{
Console.WriteLine("task completed already?!??!?");
}
CleanupTasks(activeTaskList);
}
Console.WriteLine("Cleaning up");
while (CleanupTasks(activeTaskList))
{
Task.Delay(1).Wait();
}
}
private static bool CleanupTasks(List<Task> activeTaskList)
{
// reverse loop to allow list item deletions
// This is an asynchronous world - lock the list before updating it!
lock (activeTaskList)
{
for (int i = activeTaskList.Count - 1; i >= 0; i--)
{
if (!TaskIsActive(activeTaskList[i]))
{
activeTaskList.RemoveAt(i);
}
}
return activeTaskList.Count > 0;
}
}
private static bool TaskIsActive(Task task)
{
return task != null
&& task.Status != TaskStatus.Canceled
&& task.Status != TaskStatus.Faulted
&& task.Status != TaskStatus.RanToCompletion;
}
static async Task DownloadDataForStockAsync(string outputDirectory, string symbol)
{
try
{
using (var client = new HttpClient())
{
client.BaseAddress = new Uri(baseUrl);
client.Timeout = TimeSpan.FromMinutes(5);
string requestUrl = string.Format(requestUrlFormat, symbol);
var request = new HttpRequestMessage(HttpMethod.Post, requestUrl);
var response = await client.SendAsync(request,
HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();
using (var httpStream = await response.Content.ReadAsStreamAsync())
{
var timestampedName = FormatTimestampedString(symbol, true);
var filePath = Path.Combine(outputDirectory, timestampedName + ".csv");
using (var fileStream = File.Create(filePath))
{
await httpStream.CopyToAsync(fileStream);
}
}
// *****WANT TO DO MORE ERROR CHECKING HERE*****
}
}
catch (HttpRequestException ex)
{
Console.WriteLine("Exception on thread: {0}: {1}\r\n",
System.Threading.Thread.CurrentThread.ManagedThreadId,
ex.Message,
ex.StackTrace);
}
catch (Exception ex)
{
Console.WriteLine("Exception on thread: {0}: {1}\r\n",
System.Threading.Thread.CurrentThread.ManagedThreadId,
ex.Message,
ex.StackTrace);
}
}
static volatile string lastTimestampedString = string.Empty;
static volatile string dummy = string.Empty;
static string FormatTimestampedString(string message, bool uniquify = false)
{
// This is an asynchronous world - lock the shared resource before using it!
lock (dummy)
//lock (lastTimestampedString)
{
Console.WriteLine("IN - Thread: {0:D2} lastTimestampedString: {1}",
System.Threading.Thread.CurrentThread.ManagedThreadId,
lastTimestampedString);
string newTimestampedString;
while (true)
{
DateTime lastDateTime = DateTime.Now;
newTimestampedString = string.Format(
"{1:D4}_{2:D2}_{3:D2}_{4:D2}_{5:D2}_{6:D2}_{7:D3}_{0}",
message,
lastDateTime.Year, lastDateTime.Month, lastDateTime.Day,
lastDateTime.Hour, lastDateTime.Minute, lastDateTime.Second,
lastDateTime.Millisecond
);
if (!uniquify)
{
break;
}
if (newTimestampedString != lastTimestampedString)
{
break;
}
//Task.Delay(1).Wait();
};
lastTimestampedString = newTimestampedString;
Console.WriteLine("OUT - Thread: {0:D2} lastTimestampedString: {1}",
System.Threading.Thread.CurrentThread.ManagedThreadId,
lastTimestampedString);
return lastTimestampedString;
}
}
}
}
I have copied and slightly cleaned up the relevant code.
var request = new HttpRequestMessage(HttpMethod.Post, requestUrl);
var response = await client.SendAsync(request,
HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();
using (var httpStream = await response.Content.ReadAsStreamAsync())
{
var timestampedName = FormatTimestampedString(symbol, true);
var filePath = Path.Combine(outputDirectory, timestampedName + ".csv");
using (var fileStream = File.Create(filePath))
{
await httpStream.CopyToAsync(fileStream);
}
}
The question is, what if something goes wrong during reading the stream and copying it into your file?
All logical errors have already been addressed as part of the HTTP request and response cycle: the server has received your request, it has decided it is valid, it has responded with success (header portion of response), and it is now sending you the result (body portion of response).
The only errors that could occur now are things like the server crashing, the connection being lost, etc. My understanding is that these will manifest as HttpRequestException, meaning you can write code like this:
try
{
using (var httpStream = await response.Content.ReadAsStreamAsync())
{
var timestampedName = FormatTimestampedString(symbol, true);
var filePath = Path.Combine(outputDirectory, timestampedName + ".csv");
using (var fileStream = File.Create(filePath))
{
await httpStream.CopyToAsync(fileStream);
}
}
}
catch (HttpRequestException e)
{
...
}
The documentation doesn't say much, unfortunately. The reference source doesn't either. So your best bet is to start with this and maybe log all exceptions that are not HttpRequestException, in case another exception type can be thrown during the download of the response body.
If you want to narrow it down to the part which is between the header read and the content read, you actually leave yourself with the asynchronous buffer read:
var httpStream = await response.Content.ReadAsStreamAsync();
If you look at what's going on inside the method, you'll see:
public Task<Stream> ReadAsStreamAsync()
{
this.CheckDisposed();
TaskCompletionSource<Stream> tcs = new TaskCompletionSource<Stream>();
if (this.contentReadStream == null && this.IsBuffered)
{
this.contentReadStream = new MemoryStream(this.bufferedContent.GetBuffer(),
0, (int)this.bufferedContent.Length,
false, false);
}
if (this.contentReadStream != null)
{
tcs.TrySetResult(this.contentReadStream);
return tcs.Task;
}
this.CreateContentReadStreamAsync().ContinueWithStandard(delegate(Task<Stream> task)
{
if (!HttpUtilities.HandleFaultsAndCancelation<Stream>(task, tcs))
{
this.contentReadStream = task.Result;
tcs.TrySetResult(this.contentReadStream);
}
});
return tcs.Task;
}
CreateContentReadStreamAsync is the one doing all the reading; internally, it calls LoadIntoBufferAsync, which you can find here.
Basically, you can see that it encapsulates IOException and ObjectDisposedException, or an ArgumentOutOfRangeException if the buffer is larger than 2GB (although I think that will be very rare).
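Taking the two answers together, a defensive approach is to catch both HttpRequestException and IOException around the body copy. The following is only a sketch under that assumption: BodyDownload and TryCopyBodyAsync are hypothetical names, and which exception type actually surfaces may depend on the platform and on whether the content is buffered.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class BodyDownload
{
    // Hypothetical helper: copies the response body to destination.
    // Returns true when the whole body was copied, false when the
    // transfer failed after the headers had already been received.
    public static async Task<bool> TryCopyBodyAsync(HttpResponseMessage response, Stream destination)
    {
        try
        {
            using (var httpStream = await response.Content.ReadAsStreamAsync().ConfigureAwait(false))
            {
                await httpStream.CopyToAsync(destination).ConfigureAwait(false);
            }
            return true;
        }
        catch (HttpRequestException ex)
        {
            // Assumed to cover failures surfaced by HttpContent itself.
            Console.Error.WriteLine("Body transfer failed: " + ex.Message);
            return false;
        }
        catch (IOException ex)
        {
            // Assumed to cover a connection dropped while streaming.
            Console.Error.WriteLine("Body transfer failed: " + ex.Message);
            return false;
        }
    }
}
```

When this returns false, the caller would typically delete the partially written file, since its contents are incomplete.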
I have two classes, Main.cs and Processing.cs (M and P for short). M calls P, passing an HTML link; P in turn downloads the file, converts it to Base64, renames and saves it, and then returns a string to M. I need M to wait until all of this is complete before proceeding, but I have not been able to make that work.
I used a lambda expression as the event handler so that I could do everything in one function, rather than in a separate function for the event, and return the string with the Base64-converted file. But it just returns the empty string instead of waiting until the value has been assigned.
I thought the taskA.Wait() call would make it wait until all processing was done, but that's not the case.
If anyone has any ideas I would appreciate the help.
The call from Main.cs is like this:
Processing processing = new Processing();
String _link = "http://www.something.com";
var ResultBase64_Val = processing.FileToBase64(_link).Result;
In Processing.cs the function is:
public async Task<String> FileToBase64(String filePath)
{
String convertedFile = "";
WebClient client = new WebClient();
Task taskA = Task.Factory.StartNew(() => client.OpenReadCompleted += async (object sender, OpenReadCompletedEventArgs e) =>
{
byte[] buffer = new byte[e.Result.Length];
buffer = new byte[e.Result.Length];
await e.Result.ReadAsync(buffer, 0, buffer.Length);
convertedFile = Convert.ToBase64String(buffer);
});
client.OpenReadAsync(new Uri(filePath));
taskA.Wait();
return convertedFile;
}
Thanks,
Bob
The problem with your code is that the task started with Task.Factory.StartNew completes instantly, well before OpenReadCompleted is fired sometime in the future. That said, wrapping a naturally asynchronous API like OpenReadAsync with Task.Run or Task.Factory.StartNew is a bad idea anyway. Even if you waited for the event somehow, or used synchronous OpenRead, you'd be wasting a pool thread.
There's a new WebClient.OpenReadTaskAsync method for that:
public async Task<String> FileToBase64(String filePath)
{
using (var client = new WebClient())
using (var stream = await client.OpenReadTaskAsync(new Uri(filePath)))
{
// use stream.ReadAsync
}
}
I also recommend HttpClient over WebClient, the former supports multiple HTTP requests in parallel:
using System.Net.Http;
// ...
public async Task<String> FileToBase64(String filePath)
{
using (var client = new HttpClient())
using (var response = await client.GetAsync(filePath))
using (var stream = await response.Content.ReadAsStreamAsync())
{
return string.Empty;
// use stream.ReadAsync
}
}
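The "use stream.ReadAsync" placeholder in the HttpClient version can be filled in several ways. One possible sketch, assuming the whole file fits comfortably in memory, reads the body as a byte array and encodes it. Base64Downloader and the client parameter are hypothetical names, not from the answer above:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class Base64Downloader
{
    // Hypothetical helper: downloads the resource at url and returns its
    // bytes as a Base64 string. The HttpClient is supplied by the caller
    // so that one instance can be shared across requests.
    public static async Task<string> FileToBase64Async(HttpClient client, string url)
    {
        using (var response = await client.GetAsync(url).ConfigureAwait(false))
        {
            // Throws HttpRequestException for non-success status codes.
            response.EnsureSuccessStatusCode();
            byte[] bytes = await response.Content.ReadAsByteArrayAsync().ConfigureAwait(false);
            return Convert.ToBase64String(bytes);
        }
    }
}
```

The caller would then write `var resultBase64 = await Base64Downloader.FileToBase64Async(client, _link);` rather than reading .Result.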
If you want to use the WebClient and return the result from the method, you'll want to use a TaskCompletionSource and subscribe to the event normally:
public Task<String> FileToBase64(String filePath)
{
TaskCompletionSource<string> completion = new TaskCompletionSource<string>();
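WebClient client = new WebClient();
client.OpenReadCompleted += (s, e) =>
{
// propagate failures and cancellation instead of leaving the task pending
if (e.Error != null) { completion.TrySetException(e.Error); return; }
if (e.Cancelled) { completion.TrySetCanceled(); return; }
using (TextReader reader = new StreamReader(e.Result))
{
completion.TrySetResult(reader.ReadToEnd());
}
};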
client.OpenReadAsync(new Uri(filePath));
return completion.Task;
}
Given this answer, I would recommend using the HttpClient instead of the WebClient
Both Noseratio's answer and Shawn Kendrot's answer should work, but I will try another approach, using WebRequest.
First, I have to extend WebRequest with a GetResponseAsync() method, since WP lacks it:
public static class Extensions
{
public static Task<WebResponse> GetResponseAsync(this WebRequest webRequest)
{
TaskCompletionSource<WebResponse> taskComplete = new TaskCompletionSource<WebResponse>();
// the response (not the request stream) is what carries the downloaded data
webRequest.BeginGetResponse(arg =>
{
try
{
WebResponse response = webRequest.EndGetResponse(arg);
taskComplete.TrySetResult(response);
}
catch (Exception ex) { taskComplete.TrySetException(ex); }
}, webRequest);
return taskComplete.Task;
}
}
Then I would convert your Task to something like this:
public async Task<string> FileToBase64(string filePath)
{
try
{
WebRequest request = WebRequest.Create(new Uri(filePath));
if (request != null)
{
using (WebResponse response = await request.GetResponseAsync())
using (Stream responseStream = response.GetResponseStream())
using (MemoryStream temp = new MemoryStream())
{
const int BUFFER_SIZE = 1024;
byte[] buf = new byte[BUFFER_SIZE];
int bytesread = 0;
// read the body in chunks until the stream is exhausted
while ((bytesread = await responseStream.ReadAsync(buf, 0, BUFFER_SIZE)) > 0)
temp.Write(buf, 0, bytesread);
return Convert.ToBase64String(temp.ToArray());
}
}
return String.Empty;
}
catch { return String.Empty; }
}
Maybe this will help.
First of all, don't use Task.Factory.StartNew with async methods. Stephen Toub and Stephen Cleary have explained why.
Second, if you're using async-await, then you can use the ~TaskAsync methods of the WebClient class with the Microsoft Async NuGet package - given that you said it's for Windows Phone 8.
On a side note, you should always suffix your asynchronous methods with Async (or TaskAsync if not possible).
Given that, your code becomes:
public async Task<String> FileToBase64Async(String filePath)
{
using (var client = new WebClient())
{
// DownloadDataTaskAsync returns a Task, so no event handler is needed
var buffer = await client.DownloadDataTaskAsync(filePath);
return Convert.ToBase64String(buffer);
}
}
But you would be better served with the new HttpClient class that you can find on the Microsoft HTTP Client Libraries NuGet package.
And finally, you should not block on async code in the UI thread.