Multiple Response.WriteAsync Calls - C#

I have been researching Asp.Net Security and I found some surprising code:
Strange Code?
context.Response.ContentType = "text/html";
await context.Response.WriteAsync("<html><body>");
await context.Response.WriteAsync("An remote error has occured: " + context.Request.Query["ErrorMessage"] + "<br>");
await context.Response.WriteAsync("Home");
await context.Response.WriteAsync("</body></html>");
What surprised me is the multiple calls to WriteAsync with short strings.
What I would have done
I would have used a template with String.Format or a StringBuilder to concatenate the strings and then write that in a single call:
var template = @"
<html><body>
An remote error has occured:{0}<br>
Home
</body></html>
";
var html = string.Format(template, context.Request.Query["ErrorMessage"]);
await context.Response.WriteAsync(html);
The differences I observe
My code is much easier to modify.
I've got some extra white-space.
My code uses a larger hard-coded string instead of a bunch of small hard-coded strings.
I use String.Format which may have a performance hit compared to concatenation.
If string concatenation should be avoided, this part should be broken up:
"An remote error has occured: " + context.Request.Query["ErrorMessage"] + "<br>"
Questions
For the purposes of discussion, let's assume this runs on a web server with an average of ~10,000 simultaneous active users, so performance is important.
Why is this done like this?
How does it affect performance?
When should await Response.WriteAsync be called instead of Response.Write?
How often should Response.WriteAsync be called?
As often as possible with tiny amounts of data
Only when a large amount of text is ready

I created an Azure website (running on Basic 1 - 1 Small Instance) to benchmark this. Then I used the free service at https://loader.io to run each test at 100 users/second over 1 minute.
I ran each test 3 times in different orders. The times for each test run were within 200ms of each other.
Results:
The results were clear: StringBuilder won significantly. The cost of each async call far outweighs the cost of any form of string concatenation (even String.Format performed better than the multiple async calls).
1992ms - StringBuilder.Append
3071ms - StringBuilder.AppendFormat
4257ms - WriteAsync with String.Format
9265ms - WriteAsync
Here is the code for each test:
// Do not write this code - It is ugly and performs terribly
private async Task TestWriteAsync(HttpContext context)
{
var r = context.Response;
var id = "id";
var size = "12";
var text = "text";
await r.WriteAsync("<div style='display:none'>");
for (int i = 0; i < 10000; i++)
{
await r.WriteAsync("<li id='");
await r.WriteAsync(id);
await r.WriteAsync("' style='font-size:");
await r.WriteAsync(size);
await r.WriteAsync("'>");
await r.WriteAsync(text);
await r.WriteAsync("</li>");
}
await r.WriteAsync("</div>");
}
// This is much better, but still not great
private async Task TestWriteAsyncFormat(HttpContext context)
{
var r = context.Response;
var id = "id";
var size = "12";
var text = "text";
var template = "<li id='{0}' style='font-size:{1}'>{2}</li>";
await r.WriteAsync("<div style='display:none'>");
for (int i = 0; i < 10000; i++)
{
await r.WriteAsync(string.Format(template, id, size, text));
}
await r.WriteAsync("</div>");
}
// The Performance Winner, but ugly code
private async Task TestStringBuilder(HttpContext context)
{
var sb = new StringBuilder();
var id = "id";
var size = "12";
var text = "text";
sb.Append("<div style='display:none'>");
for (int i = 0; i < 10000; i++)
{
sb.Append("<li id='");
sb.Append(id);
sb.Append("' style='font-size:");
sb.Append(size);
sb.Append("'>");
sb.Append(text);
sb.Append("</li>");
}
sb.Append("</div>");
await context.Response.WriteAsync(sb.ToString());
}
// Decent performance and Clean Code
private async Task TestStringBuilderFormat(HttpContext context)
{
var sb = new StringBuilder();
var id = "id";
var size = "12";
var text = "text";
var template = "<li id='{0}' style='font-size:{1}'>{2}</li>";
sb.Append("<div style='display:none'>");
for (int i = 0; i < 10000; i++)
{
sb.AppendFormat(template, id, size, text);
}
sb.Append("</div>");
await context.Response.WriteAsync(sb.ToString());
}
So although the old "Response.Write" is faster than StringBuilder with synchronous requests, "await Response.WriteAsync" is much slower (because of the async overhead).

I found a link that might answer some of your questions about Response.Write:
http://www.dotnetperls.com/response-write
It seems that writing many short strings is faster there.
I hope Response.WriteAsync behaves the same way.

Related

Asynchronously download and compile list of JsonDocument

I'm a little new (returning after a couple of decades) to C# and to the async/await model of programming. I'm looking for a little guidance, since I received an understandable warning CS1998 that the asynchronous method lacks 'await' operators and will run synchronously.
The code below I think is straightforward - the server API returns data in pages of 25 items. I'm using a continuation to add each page of 25 to a List of JsonDocuments. Calling code will handle the parsing as needed. I'm not sure how I could reasonably leverage anything further in this, but looking for any suggestions/guidance.
internal static async Task<List<JsonDocument>> Get_All_Data(HttpClient client, string endpoint)
{
Console.WriteLine("Downloading all data from {0}{1}", client.BaseAddress, endpoint);
var all_pages = new List<JsonDocument>();
// Get first page to determine total number of pages
HttpResponseMessage response = client.GetAsync(endpoint).Result;
Console.WriteLine("Initial download complete - parsing headers to determine total pages");
//int items_per_page;
if (int.TryParse(Get_Header_Value("X-Per-Page", response.Headers), out int items_per_page) == false)
// throw new Exception("Response missing X-Per-Page in header");
items_per_page = 25;
if (int.TryParse(Get_Header_Value("X-Total-Count", response.Headers), out int total_items) == false)
//throw new Exception("Response missing X-Total-Count in header");
total_items = 1;
// Division returns the number of complete pages; add 1 for a partial page IF total_items is not an exact multiple of items_per_page
var total_pages = total_items / items_per_page;
if ((total_items % items_per_page) != 0) total_pages++;
Console.WriteLine("{0} pages to be downloaded", total_pages);
var http_tasks = new Task[total_pages];
for (int i = 1; i <= total_pages; i++)
{
Console.WriteLine("Downloading page {0}", i);
var paged_endpoint = endpoint + "?page=" + i;
response = client.GetAsync(paged_endpoint).Result;
http_tasks[i - 1] = response.Content.ReadAsStringAsync().ContinueWith((_content) => { all_pages.Add(JsonDocument.Parse(_content.Result)); }); ;
//http_tasks[i].ContinueWith((_content) => { all_pages.Add(JsonDocument.Parse_List(_content.Result)); });
}
System.Threading.Tasks.Task.WaitAll(http_tasks); // wait for all of the downloads and parsing to complete
return all_pages;
}
Thanks for your help
My suggestion is to await all asynchronous operations, and use the Parallel.ForEachAsync method to parallelize the downloading of the JSON documents, while maintaining control of the degree of parallelism:
static async Task<JsonDocument[]> GetAllData(HttpClient client, string endpoint)
{
HttpResponseMessage response = await client.GetAsync(endpoint);
response.EnsureSuccessStatusCode();
if (!Int32.TryParse(GetHeaderValue(response, "X-Total-Count"),
out int totalItems) || totalItems < 0)
totalItems = 1;
if (!Int32.TryParse(GetHeaderValue(response, "X-Per-Page"),
out int itemsPerPage) || itemsPerPage < 1)
itemsPerPage = 25;
int totalPages = ((totalItems - 1) / itemsPerPage) + 1;
JsonDocument[] results = new JsonDocument[totalPages];
ParallelOptions options = new() { MaxDegreeOfParallelism = 5 };
await Parallel.ForEachAsync(Enumerable.Range(1, totalPages), options,
async (page, ct) =>
{
string pageEndpoint = endpoint + "?page=" + page;
HttpResponseMessage pageResponse = await client
.GetAsync(pageEndpoint, ct);
pageResponse.EnsureSuccessStatusCode();
string pageContent = await pageResponse.Content.ReadAsStringAsync(ct);
JsonDocument result = JsonDocument.Parse(pageContent);
results[page - 1] = result;
});
return results;
}
static string GetHeaderValue(HttpResponseMessage response, string name)
=> response.Headers.TryGetValues(name, out var values) ?
values.FirstOrDefault() : null;
The MaxDegreeOfParallelism is configured to the value 5 for demonstration purposes. You can find the optimal degree of parallelism by experimenting with your API. Setting the value too low might result in mediocre performance. Setting the value too high might overburden the target server, and potentially trigger an anti-DoS-attack mechanism.
If you are not familiar with the Enumerable.Range, it is a LINQ method that returns an incremented numeric sequence of integers that starts from start, and contains count elements.
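To make the page arithmetic and Enumerable.Range concrete, here is a minimal sketch. The PagingSketch class and PageCount helper are hypothetical names used only for illustration; the formula is the one from GetAllData above. Note that with C# integer division a total of 0 items still yields 1 page, which may or may not be what you want.

```csharp
using System;
using System.Linq;

public static class PagingSketch
{
    // Same formula as in GetAllData: ceiling division for positive totals.
    // For totalItems == 0 it returns 1, because (-1) / n truncates toward zero.
    public static int PageCount(int totalItems, int itemsPerPage)
        => ((totalItems - 1) / itemsPerPage) + 1;

    public static void Main()
    {
        int totalPages = PageCount(51, 25); // 51 items at 25 per page -> 3 pages

        // Enumerable.Range(start, count): count integers beginning at start.
        int[] pages = Enumerable.Range(1, totalPages).ToArray();
        Console.WriteLine(string.Join(", ", pages)); // prints "1, 2, 3"
    }
}
```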
The GetAllData is an asynchronous method and it is supposed to be awaited. If you are calling it without await, and your application is a UI application like WinForms or WPF, you are at risk of experiencing a deadlock. Don't panic, it happens consistently, and you'll observe it during the testing. One way to prevent it is to append .ConfigureAwait(false) to all awaited operations inside the GetAllData method.

Calling HttpClient and getting identical results from paged requests - is it me or the service?

I am sending five HttpClient requests to the same URL, but with a varying page number parameter. They all fire async, and then I wait for them all to finish using Task.WaitAll(). My requests are using System.Net.Http.HttpClient.
This mostly works fine, and I get five distinct results representing each page of the data about 99% of the time.
But every so often, and I have not dug into deep analysis yet, I get the exact same response for each task. Each task does indeed instantiate its own HttpClient. When I was reusing one client instance, I got this problem. But since I started instantiating new clients for every call, the problem went away.
I am calling a 3rd party web service over which I have no control. So before nagging their team too much about this, I do want to know if I may be doing something wrong here, or if there is some aspect of HttpClient or Task that I'm missing.
Here is the calling code:
for (int i = 1; i <= 5; i++)
{
page = load_made + i;
var t_page = page;
var t_url = url;
var task = new Task<List<T>>(() => DoPagedLoad<T>(t_page, per_page, t_url));
task.Run();
tasks.Add(task);
}
Task.WaitAll(tasks.ToArray());
Here is the code in the DoPagedLoad, which returns a Task:
var client = new HttpClient();
var response = client.GetAsync(url).Result;
var results = response.Content.ReadAsStringAsync().Result;
I would appreciate any help from folks familiar with the possible quirks of Task and HttpClient
NOTE: Run is an extension method to help with async exceptions.
public static Task Run(this Task task)
{
task.Start();
task.ContinueWith(t =>
{
if(t.Exception != null)
Log.Error(t.Exception.Flatten().ToString());
});
return task;
}
It's hard to give a definitive answer because we don't have all the detail but here's a sample implementation of how you should fire off HTTP requests. Notice that all async operations are awaited - Result and Wait / WaitAll are not used. You should almost never need / use any of those - they block synchronously and can create problems.
Also notice that there are no global cookie containers, default headers, etc. defined for the HTTP client. If you need any of that stuff, just create individual HttpRequestMessage objects and add whatever headers you need to add. Don't use the global properties - it's a lot cleaner to just set per-request properties.
// Globally defined HTTP client.
private static readonly HttpClient _httpClient = new HttpClient();
// Other stuff here...
private async Task SomeFunctionToGetContent()
{
var requestTasks = new List<Task<HttpResponseMessage>>();
var responseTasks = new List<Task>();
for (var i = 0; i < 5; i++)
{
// Fake URI but still based on the counter (or other
// variable, similar to page in the question)
var uri = new Uri($"https://.../{i}.html");
requestTasks.Add(_httpClient.GetAsync(uri));
}
await (Task.WhenAll(requestTasks));
for (var i = 0; i < 5; i++)
{
var response = await (requestTasks[i]);
responseTasks.Add(HandleResponse(response));
}
await (Task.WhenAll(responseTasks));
}
private async Task HandleResponse(HttpResponseMessage response)
{
try
{
if (response.Content != null)
{
var content = await (response.Content.ReadAsStringAsync());
// do something with content here; check IsSuccessStatusCode to
// see if the request failed or succeeded
}
else
{
// Do something when no content
}
}
finally
{
response.Dispose();
}
}

Huge performance differences if I dont await in the Parent method C#

I see a huge performance difference between using and NOT using await in the caller. I expected the only difference to be in when control returns: if I use await, the caller waits for the called method to finish before moving to the next statement; otherwise the caller doesn't wait and continues executing the following statements.
In my case there's a huge performance difference. If I don't use await, the caller continues to the next statement without waiting, but it is also much faster than with await in the caller.
Also, am I using async & await correctly?
Code
List<UserViewModel> _ListUser = new List<UserViewModel>();
public XmlElement CreateUpdateUser(Stream input)
{
Main(_ListUser, HttpContext.Current); // using await here makes performance slower and without await it's faster but it returns to the next statement immediately thats the problem.
return FormatResponse("S", "Record(s) created successfully.");
}
public async Task Main(List<UserViewModel> _ListUser, HttpContext current)
{
try
{
WriteToLog("Import Users - Start", 0, DateTime.Now);
UserViewModel _objSiteFileUserSettings = await FillupSiteFileSettings(new UserViewModel());
List<Branch> _branchCollection = await db.Branches.ToListAsync();
List<UserType> _usertypeCollection = await db.UserTypes.ToListAsync();
List<UserStatu> _userstatusCollection = await db.UserStatus.ToListAsync();
List<UserDept> _userdeptCollection = await db.UserDepts.ToListAsync();
List<UserLocation> _userlocationCollection = await db.UserLocations.ToListAsync();
HttpContext.Current = current;
//var tasks = new List<Task>();
foreach (var x in _ListUser)
Update1Record(x, _objSiteFileUserSettings, _branchCollection, _usertypeCollection, _userstatusCollection, _userdeptCollection, _userlocationCollection);
WriteToLog("Import Users - End", 0, DateTime.Now);
}
catch (Exception ex)
{
throw new Exception(ex.ToString());
}
}
public string Update1Record(UserViewModel objUser, UserViewModel _objSiteFileUserSettings, List<Branch> _Lbranch, List<UserType> _Lusertype, List<UserStatu> _Luserstatus, List<UserDept> _Luserdept, List<UserLocation> _Luserlocation)
{
objUser.BranchSiteFile = _objSiteFileUserSettings.BranchSiteFile;
objUser.UsrTypeSiteFile = _objSiteFileUserSettings.UsrTypeSiteFile;
objUser.UsrStatSiteFile = _objSiteFileUserSettings.UsrStatSiteFile;
objUser.BranchId = objUser.Branch != null ? CheckBranch(objUser.Branch, _Lbranch) : null;
objUser.UserDeptId = objUser.UserDept != null ? CheckDept(objUser.UserDept, _Luserdept) : null;
objUser.UserLocationId = objUser.UserLocation != null ? CheckLocation(objUser.UserLocation, _Luserlocation) : null;
objUser.UserStatusId = objUser.UserStatus != null ? CheckStatus(objUser.UserStatus, _Luserstatus) : null;
objUser.UserTypeId = objUser.UserType != null ? CheckType(objUser.UserType, _Lusertype) : 0;
objUser._iEmail = _objSiteFileUserSettings._iEmail;
objUser._iSMS = _objSiteFileUserSettings._iSMS;
using (var VibrantDbContext = new VIBRANT())
using (var AuditDb = new VibrantAuditEntities())
using (var VibrantTransaction = VibrantDbContext.Database.BeginTransaction(System.Data.IsolationLevel.ReadCommitted))
using (var AuditTransaction = AuditDb.Database.BeginTransaction(System.Data.IsolationLevel.ReadCommitted))
{
try
{
VibrantDbContext.Configuration.AutoDetectChangesEnabled = false;
objUser.RecordTimeStamp = DateTime.Now;
var _ObjUserItem = FillupDateTimeValues(objUser);
ImportToDB(_ObjUserItem, 0, VibrantDbContext, AuditDb);
BuildImportLog(objUser, VibrantDbContext, AuditDb);
VibrantDbContext.SaveChanges();
AuditDb.SaveChanges();
VibrantTransaction.Commit();
AuditTransaction.Commit();
}
catch (Exception ex)
{
VibrantTransaction.Rollback();
AuditTransaction.Rollback();
throw new Exception(ex.ToString());
}
}
return "S";
}
public XmlElement FormatResponse(string Status, string Message)
{
XmlDocument xmlDoc = new XmlDocument();
XmlNode response = xmlDoc.CreateElement("Response");
xmlDoc.AppendChild(response);
XmlNode statusNode = xmlDoc.CreateElement("Status");
statusNode.InnerText = Status;
response.AppendChild(statusNode);
XmlNode MessageNode = xmlDoc.CreateElement("Message");
MessageNode.InnerText = Message;
response.AppendChild(MessageNode);
return xmlDoc.DocumentElement;
}
No, I'd say you don't have a grasp yet on what async/await are, and when to use them. You should read up on async/await and Task to fully understand the implementation and, most importantly, the reason for using this structure. The common misconception about asynchronous code is that it makes code faster. This is often not accurate. It actually makes that specific code a little slower; however, it makes code more responsive and allows you to easily leverage the processing capability of your server.
In the simplest sense, code you write or call often has to wait for something. This could be for I/O such as disk access, database, or other communication. It can also have to wait for computation, code to finish working through something that will take a chunk of time to complete. If you have several of these tasks, that take on average say, 5 seconds each, and fire them off synchronously, the first task will finish after 5 seconds, the second after 10 seconds, and the third after 15 seconds.
For instance:
var result1 = DoSomething1(); //5 seconds.
var result2 = DoSomething2(); //5 seconds.
var result3 = DoSomething3(); //5 seconds.
var total = result1 + result2 + result3; // executes after 15 seconds.
Now, if I make each DoSomething() async and return a Task<int>:
var result1 = DoSomething1();
var result2 = DoSomething2();
var result3 = DoSomething3();
int total = result1 + result2 + result3; // ERROR! result1,2,3 are Task<int>, i.e. handles to executing code, not ints.
Instead, if we don't care about the results and just do a Console.WriteLine("Done"); we'll hit that line nearly instantaneously, because all we've effectively done is start three tasks. Each task will now take a bit longer than 5 seconds to run and return its result. For argument's sake, let's say 5.1 seconds (the real overhead is smaller, more like .01). This cost penalty is for the context switching. Your code "appears" to have run instantaneously because we've reached "Done" while worker threads are still executing our tasks. (But don't we care about the results, or about handling any exceptions?)
So, now that we have three asynchronous methods available to run in parallel, we can use await to get the results:
var result1 = await DoSomething1();
var result2 = await DoSomething2();
var result3 = await DoSomething3();
int total = result1 + result2 + result3; // This now works.
However, how long do you think this takes to execute? The answer will be 15.3 seconds. By using await on each call, each operation must wait for the previous one to complete before it even starts.
"Well, what the heck is the point of this then?!" I'm sure some have asked. Well, to have these run in parallel, you can write it like so:
var result1 = DoSomething1();
var result2 = DoSomething2();
var result3 = DoSomething3();
int total = await result1 + await result2 + await result3; // This also works.
The execution time now? 5.1 seconds.
"AHA! So it is faster!" Yes, overall, but only when it is safe to leverage a different thread for each operation AND any code BEFORE the await does not depend on results from the previous async statements and deals only with thread-safe references. There are other considerations like synchronization contexts, exception handling, and more. Keep in mind also that objects like EF's DbContext are not thread safe, so calling awaitable methods that each reference the same DbContext without awaiting them in sequence (because it should be faster in parallel) will lead to ugly little issues and errors.
This only covers the very basics of async/await. There is a lot more reading you should do from Microsoft and other sources about its use and limitations, and most importantly when to use it. It is NOT a silver bullet and should not be defaulted to whenever available, because in most cases I've seen it used, it has led to performance issues and concurrency bugs with references to non-thread-safe code. It is intended for use where an operation can take a while to run, and you can safely finish off some code or kick off a parallel, independent operation before awaiting the result(s). If your synchronous code has performance issues, 99.5% of the time these will not be solved by async/await. You're better off exhausting possible explanations with the existing code before considering parallel operations. The main argument for async/await is making code more responsive.
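If you want to see the sequential-versus-overlapped timing for yourself, here is a minimal console sketch. DoSomethingAsync is a hypothetical stand-in that uses Task.Delay in place of real 5-second work (so no worker thread is actually held while it waits), and the delay is shortened to keep the run quick. The exact numbers will vary by machine; only the ratio matters.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class AwaitOrderSketch
{
    // Hypothetical stand-in for DoSomething1/2/3: waits, then returns a value.
    public static async Task<int> DoSomethingAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    // Awaits each call before starting the next, so the delays add up.
    public static async Task<(int Total, TimeSpan Elapsed)> RunSequentialAsync(int delayMs)
    {
        var sw = Stopwatch.StartNew();
        int r1 = await DoSomethingAsync(1, delayMs);
        int r2 = await DoSomethingAsync(2, delayMs);
        int r3 = await DoSomethingAsync(3, delayMs);
        return (r1 + r2 + r3, sw.Elapsed);
    }

    // Starts all three first and awaits them afterwards, so the delays overlap.
    public static async Task<(int Total, TimeSpan Elapsed)> RunConcurrentAsync(int delayMs)
    {
        var sw = Stopwatch.StartNew();
        Task<int> t1 = DoSomethingAsync(1, delayMs);
        Task<int> t2 = DoSomethingAsync(2, delayMs);
        Task<int> t3 = DoSomethingAsync(3, delayMs);
        int[] results = await Task.WhenAll(t1, t2, t3);
        return (results[0] + results[1] + results[2], sw.Elapsed);
    }

    public static async Task Main()
    {
        var seq = await RunSequentialAsync(500);
        var con = await RunConcurrentAsync(500);
        Console.WriteLine($"sequential: total={seq.Total}, {seq.Elapsed.TotalMilliseconds:F0} ms");
        Console.WriteLine($"concurrent: total={con.Total}, {con.Elapsed.TotalMilliseconds:F0} ms");
    }
}
```

Both versions compute the same total; the sequential one should take roughly three times the delay and the concurrent one roughly one delay.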
Some materials to start with:
https://www.codingame.com/playgrounds/4240/your-ultimate-async-await-tutorial-in-c/introduction
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/
I'm sure there will be other recommendations given for SO questions/answers and other resources.
On a final note: don't do this:
catch (Exception ex)
{
throw new Exception(ex.ToString());
}
Throwing a new exception generates a new call stack. If you aren't doing anything to handle the exception, just remove the try/catch block. If you are doing something and want the exception to then bubble up, use throw:
catch (Exception ex)
{
VibrantTransaction.Rollback();
AuditTransaction.Rollback();
throw;
}
This preserves the existing call stack for the exception.
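Here is a small sketch of what the caller sees in each case. The class and method names are hypothetical, and it only compares the exception type that reaches the caller; the same applies to the stack trace. Wrapping with new Exception(ex.ToString()) hides the original type, so callers can no longer catch it specifically.

```csharp
using System;

public static class RethrowSketch
{
    static void Fail() => throw new InvalidOperationException("original failure");

    // Wraps the failure in a new Exception: callers lose the original type.
    public static void Wrap()
    {
        try { Fail(); }
        catch (Exception ex) { throw new Exception(ex.ToString()); }
    }

    // Rethrows the same exception object, with type and stack trace intact.
    public static void Rethrow()
    {
        try { Fail(); }
        catch (Exception) { throw; }
    }

    // Returns the runtime type name of whatever the action throws.
    public static string ThrownTypeName(Action action)
    {
        try { action(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    public static void Main()
    {
        Console.WriteLine(ThrownTypeName(Wrap));    // prints "Exception"
        Console.WriteLine(ThrownTypeName(Rethrow)); // prints "InvalidOperationException"
    }
}
```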

Parallel.For and httpclient crash the application C#

I want to avoid application crashing problem due to parallel for loop and httpclient but I am unable to apply solutions that are provided elsewhere on the web due to my limited knowledge of programming. My code is pasted below.
class Program
{
public static List<string> words = new List<string>();
public static int count = 0;
public static string output = "";
private static HttpClient Client = new HttpClient();
public static void Main(string[] args)
{
//input path strings...
List<string> links = new List<string>();
links.AddRange(File.ReadAllLines(input));
List<string> longList = new List<string>(File.ReadAllLines(@"a.txt"));
words.AddRange(File.ReadAllLines(output1));
System.Net.ServicePointManager.DefaultConnectionLimit = 8;
count = longList.Count;
//for (int i = 0; i < longList.Count; i++)
Task.Run(() => Parallel.For(0, longList.Count, new ParallelOptions { MaxDegreeOfParallelism = 5 }, (i, loopState) =>
{
Console.WriteLine(i);
string link = @"some link" + longList[i] + "/";
try
{
if (!links.Contains(link))
{
Task.Run(async () => { await Download(link); }).Wait();
}
}
catch (System.Exception e)
{
}
}));
//}
}
public static async Task Download(string link)
{
HtmlAgilityPack.HtmlDocument document = new HtmlDocument();
document.LoadHtml(await getURL(link));
//...stuff with html agility pack
}
public static async Task<string> getURL(string link)
{
string result = "";
HttpResponseMessage response = await Client.GetAsync(link);
Console.WriteLine(response.StatusCode);
if(response.IsSuccessStatusCode)
{
HttpContent content = response.Content;
var bytes = await response.Content.ReadAsByteArrayAsync();
result = Encoding.UTF8.GetString(bytes);
}
return result;
}
}
There are solutions for example this one, but I don't know how to put await keyword in my main method, and currently the program simply exits due to its absence before Task.Run(). As you can see I have already applied a workaround regarding async Download() method to call it in main method.
I have also doubts regarding the use of same instance of httpclient in different parallel threads. Please advise me whether I should create new instance of httpclient each time.
You're right that you have to block tasks somewhere in a console application, otherwise the program will just exit before it's complete. But you're doing this more than you need to. Aim for just blocking the main thread and delegating the rest to an async method. A good practice is to create a method with a signature like private async Task MainAsync(args), put the "guts" of your program logic there, and call it from Main like this:
MainAsync(args).Wait();
In your example, move everything from Main to MainAsync. Then you're free to use await as much as you want. Task.Run and Parallel.For are explicitly consuming new threads for I/O bound work, which is unnecessary in the async world. Use Task.WhenAll instead. The last part of your MainAsync method should end up looking something like this:
await Task.WhenAll(longList.Select(async s => {
Console.WriteLine(s);
string link = @"some link" + s + "/";
try
{
if (!links.Contains(link))
{
await Download(link);
}
}
catch (System.Exception e)
{
}
}));
There is one little wrinkle here though. Your example is throttling the parallelism at 5. If you find you still need this, TPL Dataflow is a great library for throttled parallelism in the async world. Here's a simple example.
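If you would rather not pull in TPL Dataflow, one common alternative (a different technique from the one suggested above) is to gate the tasks with a SemaphoreSlim. The sketch below is only an illustration: ForEachThrottledAsync and MeasurePeakAsync are hypothetical helpers, and Task.Delay stands in for Download(link).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs `work` for every item with at most `maxConcurrency` in flight.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> work)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);
        var tasks = items.Select(async item =>
        {
            await gate.WaitAsync();          // wait for a free slot
            try { await work(item); }
            finally { gate.Release(); }      // free the slot even on failure
        }).ToList();
        await Task.WhenAll(tasks);
    }

    // Returns the highest number of simultaneous calls observed.
    public static async Task<int> MeasurePeakAsync(int itemCount, int maxConcurrency)
    {
        int running = 0, peak = 0;
        object sync = new object();
        await ForEachThrottledAsync(Enumerable.Range(0, itemCount), maxConcurrency, async _ =>
        {
            lock (sync) { running++; peak = Math.Max(peak, running); }
            await Task.Delay(20); // hypothetical stand-in for Download(link)
            lock (sync) { running--; }
        });
        return peak;
    }

    public static async Task Main()
    {
        int peak = await MeasurePeakAsync(50, 5);
        Console.WriteLine($"peak concurrency: {peak}");
    }
}
```

In the question's code, the call would look something like ForEachThrottledAsync(longList, 5, s => Download(...)), with the try/catch kept inside the delegate.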
Regarding HttpClient, using a single instance across threads is completely safe and highly encouraged.

Starting Multiple Async Tasks and Process Them As They Complete (C#)

So I am trying to learn how to write asynchronous methods and have been banging my head to get asynchronous calls to work. What always seems to happen is the code hangs on "await" instruction until it eventually seems to time out and crash the loading form in the same method with it.
There are two main reasons this is strange:
The code works flawlessly when not asynchronous and just a simple loop
I copied the MSDN code almost verbatim to convert the code to asynchronous calls here: https://msdn.microsoft.com/en-us/library/mt674889.aspx
I know there are a lot of questions already about this on the forums, but I have gone through most of them and tried a lot of other ways (with the same result), and I now think something is fundamentally wrong, since even the MSDN code isn't working.
Here is the main method that is called by a background worker:
// this method loads data from each individual webPage
async Task LoadSymbolData(DoWorkEventArgs _e)
{
int MAX_THREADS = 10;
int tskCntrTtl = dataGridView1.Rows.Count;
Dictionary<string, string> newData_d = new Dictionary<string, string>(tskCntrTtl);
// we need to make copies of things that can change in a different thread
List<string> links = new List<string>(dataGridView1.Rows.Cast<DataGridViewRow>()
.Select(r => r.Cells[dbIndexs_s.url].Value.ToString()).ToList());
List<string> symbols = new List<string>(dataGridView1.Rows.Cast<DataGridViewRow>()
.Select(r => r.Cells[dbIndexs_s.symbol].Value.ToString()).ToList());
// we need to create a cancelation token once this is working
// TODO
using (LoadScreen loadScreen = new LoadScreen("Querying stock servers..."))
{
// we can't use the delegate because of async keywords
this.loaderScreens.Add(loadScreen);
// wait until the form is loaded so we dont get exceptions when writing to controls on that form
while ( !loadScreen.IsLoaded() );
// load the total number of operations so we can simplify incrementing the progress bar
// on seperate form instances
loadScreen.LoadProgressCntr(0, tskCntrTtl);
// try to run all async tasks since they are non-blocking threaded operations
for (int i = 0; i < tskCntrTtl; i += MAX_THREADS)
{
List<Task<string[]>> ProcessURL = new List<Task<string[]>>();
List<int> taskList = new List<int>();
// Make a list of task indexs
for (int task = i; task < i + MAX_THREADS && task < tskCntrTtl; task++)
taskList.Add(task);
// ***Create a query that, when executed, returns a collection of tasks.
IEnumerable<Task<string[]>> downloadTasksQuery =
from task in taskList select QueryHtml(loadScreen, links[task], symbols[task]);
// ***Use ToList to execute the query and start the tasks.
List<Task<string[]>> downloadTasks = downloadTasksQuery.ToList();
// ***Add a loop to process the tasks one at a time until none remain.
while (downloadTasks.Count > 0)
{
// Identify the first task that completes.
Task<string[]> firstFinishedTask = await Task.WhenAny(downloadTasks); // <---- CODE HANGS HERE
// ***Remove the selected task from the list so that you don't
// process it more than once.
downloadTasks.Remove(firstFinishedTask);
// Await the completed task.
string[] data = await firstFinishedTask;
if (!newData_d.ContainsKey(data.First()))
newData_d.Add(data.First(), data.Last());
}
}
// now we have the dictionary with all the information gathered from the websites
// now we can add the columns if they dont already exist and load the information
// TODO
loadScreen.UpdateProgress(100);
this.loaderScreens.Remove(loadScreen);
}
}
And here is the async method for querying web pages:
async Task<string[]> QueryHtml(LoadScreen _loadScreen, string _link, string _symbol)
{
string data = String.Empty;
try
{
HttpClient client = new HttpClient();
var doc = new HtmlAgilityPack.HtmlDocument();
var html = await client.GetStringAsync(_link); // <---- CODE HANGS HERE
doc.LoadHtml(html);
string percGrn = doc.FindInnerHtml(
"//span[contains(@class,'time_rtq_content') and contains(@class,'up_g')]//span[2]");
string percRed = doc.FindInnerHtml(
"//span[contains(@class,'time_rtq_content') and contains(@class,'down_r')]//span[2]");
// create something we'll understand later
if ((String.IsNullOrEmpty(percGrn) && String.IsNullOrEmpty(percRed)) ||
(!String.IsNullOrEmpty(percGrn) && !String.IsNullOrEmpty(percRed)))
throw new Exception();
// adding string to empty gives string
string perc = percGrn + percRed;
bool isNegative = String.IsNullOrEmpty(percGrn);
double percDouble;
if (double.TryParse(Regex.Match(perc, @"\d+([.])?(\d+)?").Value, out percDouble))
data = (isNegative ? 0 - percDouble : percDouble).ToString();
}
catch (Exception ex) { }
finally
{
// update the progress bar...
_loadScreen.IncProgressCntr();
}
return new string[] { _symbol, data };
}
I could really use some help. Thanks!
In short, when you combine async with any 'regular' blocking task functions, you can get a deadlock:
http://olitee.com/2015/01/c-async-await-common-deadlock-scenario/
The solution is to use ConfigureAwait:
var html = await client.GetStringAsync(_link).ConfigureAwait(false);
The reason you need this is that you didn't await your original thread.
// ***Create a query that, when executed, returns a collection of tasks.
IEnumerable<Task<string[]>> downloadTasksQuery = from task in taskList select QueryHtml(loadScreen,links[task], symbols[task]);
What's happening here is that you mix the await paradigm with the regular task-handling paradigm, and those don't mix (or rather, you have to use ConfigureAwait(false) for this to work).
