I am trying to understand async programming and have a question about the following functions.
public async void TestAsyncCall() {
Task<string> TaskResult1 = DoSomethingAsync();
string Result2 = DoSomething();
string Result1 = await TaskResult1;
}
public string DoSomething() {
return "synch";
}
public async Task<string> DoSomethingAsync() {
await Task.Delay(10000);
return "asynch";
}
In the function call TestAsyncCall(), would one thread be used to execute DoSomethingAsync(), and another thread to execute DoSomething()?
Then when await is encountered, it would wait for DoSomethingAsync() to complete and release that thread (while also not blocking the original thread)?
Or will this not warrant any new threads being created? In that case will the DoSomethingAsync call be relevant only if it were to deal with some external resource?
I recommend you read my article on async ASP.NET.
Or will this not warrant any new threads being created?
This won't create any new threads. In particular, async and await by themselves won't create any new threads.
On ASP.NET, it's likely that the code after an await will run on a different thread than the code before that await. This is just exchanging one thread for another, though; no new threads are created.
In that case will the DoSomethingAsync call be relevant only if it were to deal with some external resource?
The primary use case for async is to deal with I/O, yes. This is particularly true on ASP.NET.
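To make this concrete, here is a minimal sketch built on the two methods from the question. The AwaitOrderDemo class, the RunAsync helper, and the shortened 200 ms delay are assumptions added for illustration. It records whether the task returned by DoSomethingAsync is still pending right after the call, which shows that the call returns at its first await and the caller simply carries on to DoSomething.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitOrderDemo
{
    public static string DoSomething() => "synch";

    public static async Task<string> DoSomethingAsync()
    {
        // Shortened from the question's 10 seconds so the sketch finishes quickly.
        await Task.Delay(200);
        return "asynch";
    }

    // Mirrors TestAsyncCall, but returns what it observed so it can be inspected.
    public static async Task<(bool PendingAfterCall, string Result1, string Result2)> RunAsync()
    {
        // Runs synchronously until the first await inside, then hands back an unfinished task.
        Task<string> taskResult1 = DoSomethingAsync();
        bool pendingAfterCall = !taskResult1.IsCompleted;

        // Ordinary synchronous call made while the delay is still pending.
        string result2 = DoSomething();

        // Asynchronously wait for the delay to finish; no thread is blocked meanwhile.
        string result1 = await taskResult1;
        return (pendingAfterCall, result1, result2);
    }
}
```

The delay is a timer, not a thread sitting in a sleep, so nothing is occupied while the task is pending.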
As @Stephen-Cleary said, "In particular, async and await by themselves won't create any new threads."
This next example is taken from the book "C# in Depth" by Jon Skeet, chapter 15, p. 465:
class AsyncForm : Form
{
/* The first part of listing 15.1 simply creates the UI and hooks up an event handler for
the button in a straightforward way */
Label label;
Button button;
public AsyncForm()
{
label = new Label {
Location = new Point(10, 20),
Text = "Length"
};
button = new Button {
Location = new Point(10, 50),
Text = "Click"
};
button.Click += DisplayWebSiteLength;
AutoSize = true;
Controls.Add(label);
Controls.Add(button);
}
/* When you click on the button, the text of the book’s home page is fetched
and the label is updated to display the HTML length in characters */
async void DisplayWebSiteLength(object sender, EventArgs e)
{
label.Text = "Fetching...";
using (HttpClient client = new HttpClient())
{
string text =
await client.GetStringAsync("http://csharpindepth.com");
label.Text = text.Length.ToString();
}
}
/* The label is updated to display the HTML length in characters. The
HttpClient is also disposed appropriately, whether the operation succeeds or fails—
something that would be all too easy to forget if you were writing similar asynchronous
code in C# 4 */
}
With this in mind, let's take a look at your code. You have Result1 and Result2, but there's no point in having an asynchronous task wait for a synchronous one to finish. I would use parallelism instead, so that both methods run at the same time and each returns its own set of data, for example from two LINQ queries executing concurrently.
Take a look at this short example of parallelism with async tasks:
public class StudentDocs
{
//some code over here
string sResult = ProcessDocs().Result;
//If string sResult is not empty there was an error
if (!sResult.Equals(string.Empty))
throw new Exception(sResult);
//some code over there
#region Methods
public async Task<string> ProcessDocs()
{
string sResult = string.Empty;
try
{
var taskStuDocs = GetStudentDocumentsAsync(item.NroCliente);
var taskStuClasses = GetStudentSemesterClassesAsync(item.NroCliente, vencimientoParaProductos);
//We Wait for BOTH TASKS to be accomplished...
await Task.WhenAll(taskStuDocs, taskStuClasses);
//Get the IList<Class>
var docsStudent = taskStuDocs.Result;
var docsCourses = taskStuClasses.Result;
/*
You can do something with this data ... here
*/
}
catch (Exception ex)
{
sResult = ex.Message;
Loggerdb.LogInfo("ERROR:" + ex.Message);
}
//Empty when everything succeeded, otherwise the error message
return sResult;
}
public async Task<IList<ClassA>> GetStudentDocumentsAsync(long studentId)
{
return await Task.Run(() => GetStudentDocuments(studentId)).ConfigureAwait(false);
}
public async Task<IList<ClassB>> GetStudentSemesterClassesAsync(long studentId)
{
return await Task.Run(() => GetStudentSemesterCourses(studentId)).ConfigureAwait(false);
}
//Performs task to bring Student Documents
public IList<ClassA> GetStudentDocuments(long studentId)
{
IList<ClassA> studentDocs = new List<ClassA>();
//Let's execute a Stored Procedured map on Entity Framework
using (ctxUniversityData oQuery = new ctxUniversityData())
{
//Since both TASKS are running at the same time we use AsParallel for performing parallels LINQ queries
foreach (var item in oQuery.GetStudentGrades(Convert.ToDecimal(studentId)).AsParallel())
{
//These are every element of IList
studentDocs.Add(new ClassA(
(int)(item.studentId ?? 0),
item.studentName,
item.studentLastName,
Convert.ToInt64(item.studentAge),
item.studentProfile,
item.studentRecord
));
}
}
return studentDocs;
}
//Performs task to bring Student Courses per Semester
public IList<ClassB> GetStudentSemesterCourses(long studentId)
{
IList<ClassB> studentCourses = new List<ClassB>();
//Let's execute a Stored Procedured map on Entity Framework
using (ctxUniversityData oQuery = new ctxUniversityData())
{
//Since both TASKS are running at the same time we use AsParallel for performing parallels LINQ queries
foreach (var item in oQuery.GetStudentCourses(Convert.ToDecimal(studentId)).AsParallel())
{
//These are every element of IList
studentCourses.Add(new ClassB(
(int)(item.studentId ?? 0),
item.studentName,
item.studentLastName,
item.carreerName,
item.semesterNumber,
Convert.ToInt64(item.Year),
item.course ,
item.professorName
));
}
}
return studentCourses;
}
#endregion
}
I want to replace the BackgroundWorker in my WinForms application with a Thread.
The goal is to do the work on a thread other than the UI thread and to prevent the program from hanging while it runs.
So I did this:
private void radBtn_start_Click(object sender, EventArgs e)
{
try
{
string thread_name = "trd_" + rnd.Next(99000, 10000000).ToString();
Thread thread = new Thread(new ThreadStart(Thread_Method));
thread.Name = thread_name;
thread.Start();
}
catch (System.Exception ex)
{
MessageBox.Show("Error in radBtn_start_Click() Is : " + ex.ToString());
}
}
public void Thread_Method()
{
// ...some jobs
Thread.Sleep(20000);
// ...some jobs after the delay
Thread.Sleep(20000);
// ...some jobs after the delay
this.Invoke(new MethodInvoker(delegate
{
radTextBoxControl1.Text += DateTime.Now.ToString() + " : We are at end of search( " + radDropDownList1.SelectedItem.Tag + " ) = -1" + Environment.NewLine;
}));
}
But with this code the UI still hangs during the sleep.
What is the correct code for my purpose?
You don't have to create a new Thread; your process already has a pool of threads anxiously waiting to do something for you.
Usually the threads in the thread pool are used when you use async-await. However, you can also use them for heavy calculations.
My advice is to make your Thread_Method async. The advantage is that whenever your Thread_Method has to wait idly for another process to finish, like writing data to a file, fetching items from a database, or reading information from the internet, the thread is available for the thread pool to do other tasks.
If you are not familiar with async-await: this interview with Eric Lippert really helped me to understand what happens when you use async-await. Search somewhere in the middle for async-await.
One of the nice things about async-await is that, when it is started from the UI thread, the code after an await continues in the same "context" as the UI thread, so it can access UI elements. No need to check InvokeRequired or to call Invoke.
To make your Thread_Method async:
declare it async
instead of TResult return Task<TResult>; instead of void return Task
the only exception: async event handlers return void
whenever you call other methods that have an async version, call that async version, and await the task when you need its result.
public async Task<Address> FetchCustomerAddress(int customerId)
{
// fetch the customer address from the database:
using (var dbContext = new OrderDbContext(...))
{
return await dbContext.Customers
.Where(customer => customer.Id == customerId)
.Select(customer => new Address
{
Name = customer.Name,
Street = customer.Street,
... // etc
})
.FirstOrDefaultAsync();
}
}
public async Task<CustomerOrder> CreateCustomerOrder(
int customerId, IEnumerable orderLines)
{
// start reading the customer Address
var taskReadCustomerAddress = this.FetchCustomerAddress(customerId);
// meanwhile create the order
CustomerOrder order = new CustomerOrder();
foreach (var orderLine in orderLines)
{
order.OrderLines.Add(orderLine);
}
order.CalculateTotal();
// now you need the address of the customer: await:
Address customerAddress = await taskReadCustomerAddress;
order.Address = customerAddress;
return order;
}
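Applied to the Thread_Method from the question, the same steps might look like the following sketch. The SearchJob class, its RunAsync method, and the tag and delay parameters are hypothetical names, and the elided "some jobs" stay elided. The point is only that Thread.Sleep becomes await Task.Delay, so no thread is blocked while waiting.

```csharp
using System;
using System.Threading.Tasks;

public static class SearchJob
{
    // Async counterpart of Thread_Method: it returns the text to display
    // instead of touching any UI control itself.
    public static async Task<string> RunAsync(string tag, TimeSpan delay)
    {
        // ...some jobs
        await Task.Delay(delay);
        // ...some jobs after the delay
        await Task.Delay(delay);
        // ...some jobs after the delay
        return DateTime.Now + " : We are at end of search( " + tag + " ) = -1" + Environment.NewLine;
    }
}
```

In an async void click handler the call would then be something like `radTextBoxControl1.Text += await SearchJob.RunAsync(tag, TimeSpan.FromSeconds(20));`, without Invoke, because the handler resumes on the UI thread after the await.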
Sometimes you don't have to wait idly for another process to finish, but you need to do some heavy calculations and still keep your UI thread responsive. In older applications you would use the BackgroundWorker for this; in newer applications you use Task.Run.
For instance, you have a button and a menu item that both start some heavy calculations. Just like when using the BackgroundWorker, you want to show some progress. While the calculations are running, both the menu item and the button need to be disabled.
public async Task PrintCustomerOrdersAsync(
ICollection<CustomerOrderInformation> customerOrders)
{
// while creating the customer orders: disable the button and the menu items
this.buttonPrintOrders.Enabled = false;
this.menuItemCreateOrderLines.Enabled = false;
// show the progress bar
this.ProgressBarCalculating.Minimum = 0;
this.ProgressBarCalculating.Maximum = customerOrders.Count;
this.ProgressBarCalculating.Value = 0;
this.ProgressBarCalculating.Visible = true;
List<Task<PrintJob>> printJobs = new List<Task<PrintJob>>();
foreach (CustomerOrderInformation orderInformation in customerOrders)
{
// instead of BackGroundworker raise event, you can access the UI items yourself
CustomerOrder order = await this.CreateCustomerOrder(orderInformation.CustomerId,
orderInformation.OrderLines);
this.ProgressBarCalculating.Value +=1;
// print the Order, do not await until printing finished, create next order
printJobs.Add(this.Print(order));
}
// all orders created and sent to the printer. await until all print jobs complete:
await Task.WhenAll(printJobs);
// cleanup:
this.buttonPrintOrders.Enabled = true;
this.menuItemCreateOrderLines.Enabled = true;
this.ProgressBarCalculating.Visible = false;
}
By the way: in a proper design, you would separate the enabling / disabling the items from the actual processing:
public async Task PrintCustomerOrdersAsync(ICollection<CustomerOrderInformation> customerOrders)
{
this.ShowBusyPrintingOrders(customerOrders.Count);
await this.PrintOrdersAsync(customerOrders);
this.HideBusyPrintingOrders();
}
Now to start printing the orders when a button is pressed, there are two possibilities:
If the process is mostly waiting for others: async event handler
If there are really heavy calculations (longer than a second?): start a task that does the calculations
No heavy calculations:
// async event handler has void return value!
private async void ButtonPrintOrdersClickedAsync(object sender, ...)
{
var orderInformations = this.GetOrderInformations();
await PrintCustomerOrdersAsync(orderInformations);
}
Because I don't have anything else useful to do, I await immediately.
Heavy calculations: start a separate task:
private async void ButtonCalculateClickedAsync(object sender, ...)
{
var calculationTask = Task.Run(() => this.DoHeavyCalculations(this.textBox1.Text));
// because you didn't await, you are free to do something else,
// for instance show progress:
while (!calculationTask.IsCompleted)
{
// await one second; UI is responsive!
await Task.Delay(TimeSpan.FromSeconds(1));
this.ProgressBar.Value += 1;
}
}
Be aware: using these methods, you can't stop the process. So you are in trouble if the operator wants to close the application while you are still printing.
Just like your background thread, every method that supports cancellation should regularly check whether cancellation has been requested. The advantage is that this check is also done in the .NET methods that support cancellation, like reading database information or writing a file. The BackgroundWorker couldn't cancel writing to a file.
For this we have the CancellationTokenSource:
private CancellationTokenSource cancellationTokenSource;
private Task taskPrintOrders;
public async Task PrintCustomerOrdersAsync(ICollection<CustomerOrderInformation> customerOrders)
{
this.ShowBusyPrintingOrders(customerOrders.Count);
using (this.cancellationTokenSource = new CancellationTokenSource())
{
taskPrintOrders = this.PrintOrdersAsync(customerOrders, this.cancellationTokenSource.Token);
await taskPrintOrders;
this.HideBusyPrintingOrders();
}
}
private void CancelPrinting()
{
this.cancellationTokenSource?.Cancel();
}
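What "regularly check if cancellation is requested" means inside the worker can be sketched as follows. The CancellableWork class and ProcessAsync are hypothetical stand-ins for something like PrintOrdersAsync, which the answer does not show; Task.Delay stands in for the real I/O.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellableWork
{
    // Processes 'steps' items and returns how many were completed.
    public static async Task<int> ProcessAsync(int steps, CancellationToken token)
    {
        int done = 0;
        for (int i = 0; i < steps; i++)
        {
            // Explicit check between items: throws OperationCanceledException if cancelled.
            token.ThrowIfCancellationRequested();

            // Cancellable framework calls accept the same token and perform the check themselves.
            await Task.Delay(10, token);
            done++;
        }
        return done;
    }
}
```

The caller sees cancellation as an OperationCanceledException (Task.Delay throws the derived TaskCanceledException) when it awaits the returned task, so code that awaits after calling Cancel should be prepared to catch it.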
If you want to cancel and wait until finished, for instance when closing the form:
private bool TaskStillRunning => this.taskPrintOrders != null && !this.taskPrintOrders.IsCompleted;
private async void OnFormClosing(object sender, ...)
{
if (this.TaskStillRunning)
{
bool canClose = this.AskIfCanClose();
if (!canClose)
eventArgs.Cancel = true;
else
{
// continue closing: stop the task, and wait until stopped
this.CancelPrinting();
await this.taskPrintOrders;
}
}
}
This will run on a separate thread without hanging your UI.
Use new Thread:
new Thread(delegate()
{
Thread_Method();
}).Start();
or Task.Run:
Task.Run(() =>
{
Thread_Method();
});
I need to process data from a producer in FIFO fashion, with the ability to abort processing if the same producer produces a new piece of data.
So I implemented an abortable FIFO queue based on Stephen Cleary's AsyncCollection (called AsyncCollectionAbortableFifoQueue in my sample) and one based on TPL's BufferBlock (BufferBlockAbortableAsyncFifoQueue in my sample). Here's the implementation based on AsyncCollection:
public class AsyncCollectionAbortableFifoQueue<T> : IExecutableAsyncFifoQueue<T>
{
private AsyncCollection<AsyncWorkItem<T>> taskQueue = new AsyncCollection<AsyncWorkItem<T>>();
private readonly CancellationToken stopProcessingToken;
public AsyncCollectionAbortableFifoQueue(CancellationToken cancelToken)
{
stopProcessingToken = cancelToken;
_ = processQueuedItems();
}
public Task<T> EnqueueTask(Func<Task<T>> action, CancellationToken? cancelToken)
{
var tcs = new TaskCompletionSource<T>();
var item = new AsyncWorkItem<T>(tcs, action, cancelToken);
taskQueue.Add(item);
return tcs.Task;
}
protected virtual async Task processQueuedItems()
{
while (!stopProcessingToken.IsCancellationRequested)
{
try
{
var item = await taskQueue.TakeAsync(stopProcessingToken).ConfigureAwait(false);
if (item.CancelToken.HasValue && item.CancelToken.Value.IsCancellationRequested)
item.TaskSource.SetCanceled();
else
{
try
{
T result = await item.Action().ConfigureAwait(false);
item.TaskSource.SetResult(result); // Indicate completion
}
catch (Exception ex)
{
if (ex is OperationCanceledException && ((OperationCanceledException)ex).CancellationToken == item.CancelToken)
item.TaskSource.SetCanceled();
else
item.TaskSource.SetException(ex);
}
}
}
catch (Exception) { }
}
}
}
public interface IExecutableAsyncFifoQueue<T>
{
Task<T> EnqueueTask(Func<Task<T>> action, CancellationToken? cancelToken);
}
processQueuedItems is the task that dequeues AsyncWorkItems from the queue and executes them unless cancellation has been requested.
The asynchronous action to execute gets wrapped into an AsyncWorkItem, which looks like this:
internal class AsyncWorkItem<T>
{
public readonly TaskCompletionSource<T> TaskSource;
public readonly Func<Task<T>> Action;
public readonly CancellationToken? CancelToken;
public AsyncWorkItem(TaskCompletionSource<T> taskSource, Func<Task<T>> action, CancellationToken? cancelToken)
{
TaskSource = taskSource;
Action = action;
CancelToken = cancelToken;
}
}
Then there's a task that dequeues items and either processes them or aborts if the CancellationToken has been triggered.
That all works just fine - data gets processed, and if a new piece of data is received, processing of the old is aborted. My problem now stems from these Queues leaking massive amounts of memory if I crank up the usage (producer producing a lot more than the consumer processes). Given it's abortable, the data that is not processed, should be discarded and eventually disappear from memory.
So let's look at how I'm using these queues. I have a 1:1 match of producer and consumer. Every consumer handles the data of a single producer. Whenever I get a new data item and it doesn't match the previous one, I fetch the queue for the given producer (User.UserId) or create a new one (the 'executor' in the code snippet). Then I have a ConcurrentDictionary that holds a CancellationTokenSource per producer/consumer combo. If there's a previous CancellationTokenSource, I call Cancel on it and Dispose it 20 seconds later (immediate disposal would cause exceptions in the queue). I then enqueue processing of the new data. The queue returns a task that I can await so I know when processing of the data is complete, and I then return the result.
Here's that in code:
internal class SimpleLeakyConsumer
{
private ConcurrentDictionary<string, IExecutableAsyncFifoQueue<bool>> groupStateChangeExecutors = new ConcurrentDictionary<string, IExecutableAsyncFifoQueue<bool>>();
private readonly ConcurrentDictionary<string, CancellationTokenSource> userStateChangeAborters = new ConcurrentDictionary<string, CancellationTokenSource>();
protected CancellationTokenSource serverShutDownSource;
private readonly int operationDuration = 1000;
internal SimpleLeakyConsumer(CancellationTokenSource serverShutDownSource, int operationDuration)
{
this.serverShutDownSource = serverShutDownSource;
this.operationDuration = operationDuration * 1000; // convert from seconds to milliseconds
}
internal async Task<bool> ProcessStateChange(string userId)
{
var executor = groupStateChangeExecutors.GetOrAdd(userId, new AsyncCollectionAbortableFifoQueue<bool>(serverShutDownSource.Token));
CancellationTokenSource oldSource = null;
using (var cancelSource = userStateChangeAborters.AddOrUpdate(userId, new CancellationTokenSource(), (key, existingValue) =>
{
oldSource = existingValue;
return new CancellationTokenSource();
}))
{
if (oldSource != null && !oldSource.IsCancellationRequested)
{
oldSource.Cancel();
_ = delayedDispose(oldSource);
}
try
{
var executionTask = executor.EnqueueTask(async () => { await Task.Delay(operationDuration, cancelSource.Token).ConfigureAwait(false); return true; }, cancelSource.Token);
var result = await executionTask.ConfigureAwait(false);
userStateChangeAborters.TryRemove(userId, out var aborter);
return result;
}
catch (Exception e)
{
if (e is TaskCanceledException || e is OperationCanceledException)
return true;
else
{
userStateChangeAborters.TryRemove(userId, out var aborter);
return false;
}
}
}
}
private async Task delayedDispose(CancellationTokenSource src)
{
try
{
await Task.Delay(20 * 1000).ConfigureAwait(false);
}
finally
{
try
{
src.Dispose();
}
catch (ObjectDisposedException) { }
}
}
}
In this sample implementation, all that is being done is wait, then return true.
To test this mechanism, I wrote the following Data producer class:
internal class SimpleProducer
{
//variables defining the test
readonly int nbOfusers = 10;
readonly int minimumDelayBetweenTest = 1; // seconds
readonly int maximumDelayBetweenTests = 6; // seconds
readonly int operationDuration = 3; // number of seconds an operation takes in the tester
private readonly Random rand;
private List<User> users;
private readonly SimpleLeakyConsumer consumer;
protected CancellationTokenSource serverShutDownSource, testAbortSource;
private CancellationToken internalToken = CancellationToken.None;
internal SimpleProducer()
{
rand = new Random();
testAbortSource = new CancellationTokenSource();
serverShutDownSource = new CancellationTokenSource();
generateTestObjects(nbOfusers, 0, false);
consumer = new SimpleLeakyConsumer(serverShutDownSource, operationDuration);
}
internal void StartTests()
{
if (internalToken == CancellationToken.None || internalToken.IsCancellationRequested)
{
internalToken = testAbortSource.Token;
foreach (var user in users)
_ = setNewUserPresence(internalToken, user);
}
}
internal void StopTests()
{
testAbortSource.Cancel();
try
{
testAbortSource.Dispose();
}
catch (ObjectDisposedException) { }
testAbortSource = new CancellationTokenSource();
}
internal void Shutdown()
{
serverShutDownSource.Cancel();
}
private async Task setNewUserPresence(CancellationToken token, User user)
{
while (!token.IsCancellationRequested)
{
var nextInterval = rand.Next(minimumDelayBetweenTest, maximumDelayBetweenTests);
try
{
await Task.Delay(nextInterval * 1000, testAbortSource.Token).ConfigureAwait(false);
}
catch (TaskCanceledException)
{
break;
}
//now randomly generate a new state and submit it to the tester class
UserState? status;
var nbStates = Enum.GetValues(typeof(UserState)).Length;
if (user.CurrentStatus == null)
{
var newInt = rand.Next(nbStates);
status = (UserState)newInt;
}
else
{
do
{
var newInt = rand.Next(nbStates);
status = (UserState)newInt;
}
while (status == user.CurrentStatus);
}
_ = sendUserStatus(user, status.Value);
}
}
private async Task sendUserStatus(User user, UserState status)
{
await consumer.ProcessStateChange(user.UserId).ConfigureAwait(false);
}
private void generateTestObjects(int nbUsers, int nbTeams, bool addAllUsersToTeams = false)
{
users = new List<User>();
for (int i = 0; i < nbUsers; i++)
{
var usr = new User
{
UserId = $"User_{i}",
Groups = new List<Team>()
};
users.Add(usr);
}
}
}
It uses the variables at the beginning of the class to control the test. You can define the number of users (nbOfusers - every user is a producer that produces new data), the minimum (minimumDelayBetweenTest) and maximum (maximumDelayBetweenTests) delay between a user producing the next data and how long it takes the consumer to process the data (operationDuration).
StartTests starts the actual test, and StopTests stops the tests again.
I'm calling these as follows
static void Main(string[] args)
{
var tester = new SimpleProducer();
Console.WriteLine("Test successfully started, type exit to stop");
string str;
do
{
str = Console.ReadLine();
if (str == "start")
tester.StartTests();
else if (str == "stop")
tester.StopTests();
}
while (str != "exit");
tester.Shutdown();
}
So, if I run my tester and type 'start', the Producer class starts producing states that are consumed by Consumer. And memory usage starts to grow and grow and grow. The sample is configured to the extreme, the real-life scenario I'm dealing with is less intensive, but one action of the producer could trigger multiple actions on the consumer side which also have to be executed in the same asynchronous abortable fifo fashion - so worst case, one set of data produced triggers an action for ~10 consumers (that last part I stripped out for brevity).
With 100 producers, each producer produces a new data item every 1-6 seconds (at random intervals; the data produced is random too). Consuming the data takes 3 seconds, so there are plenty of cases where a new set of data arrives before the old one has been fully processed.
Looking at two consecutive memory dumps, it's obvious where the memory usage is coming from: it's all fragments that have to do with the queue. Given that I'm disposing every CancellationTokenSource and not keeping any references to the produced data (or the AsyncWorkItem they're put into), I'm at a loss to explain why this keeps eating up my memory, and I'm hoping somebody else can show me the error of my ways. You can also abort testing by typing 'stop'. You'll see that memory is no longer being eaten, but even if you pause and trigger a GC, memory is not freed either.
The source code of the project in runnable form is on Github. After starting it, you have to type start (plus enter) in the console to tell the producer to start producing data. And you can stop producing data by typing stop (plus enter)
Your code has so many issues that it is impossible to find the leak through debugging. But here are several things that are already a problem and should be fixed first:
It looks like getQueue creates a new queue for the same user each time processUserStateUpdateAsync gets called, and does not reuse existing queues:
var executor = groupStateChangeExecutors.GetOrAdd(user.UserId, getQueue());
A CancellationTokenSource leaks on each call of the code below, because a new instance is created every time AddOrUpdate is called, whether or not it ends up in the dictionary. It should not be passed that way:
userStateChangeAborters.AddOrUpdate(user.UserId, new CancellationTokenSource(), (key, existingValue
Also, the code below should use the same CTS that you pass as the new one when the dictionary has no value for the specific user.UserId:
return new CancellationTokenSource();
There is also a potential leak of the cancelSource variable, because it gets bound to a delegate that can live longer than you want. It's better to pass a concrete CancellationToken there:
executor.EnqueueTask(() => processUserStateUpdateAsync(user, state, previousState,
cancelSource.Token));
For some reason you do not dispose aborter here and in one other place:
userStateChangeAborters.TryRemove(user.UserId, out var aborter);
Creation of Channel can have potential leaks:
taskQueue = Channel.CreateBounded<AsyncWorkItem<T>>(new BoundedChannelOptions(1)
You picked the option FullMode = BoundedChannelFullMode.DropOldest, which should remove the oldest values if there are any, so I assume that this stops queued items from being processed, as they would never be read. It's a hypothesis, but I assume that if an old item is removed without being handled, then processUserStateUpdateAsync won't get called and its resources won't be freed.
You can start with these issues, and it should be easier to find the real cause after that.
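The first point, about a queue being constructed on every call, comes down to which GetOrAdd overload is used. The sketch below contrasts the two; QueueRegistry, CreateQueue, and the plain object standing in for a queue are hypothetical and only there to count constructions.

```csharp
using System.Collections.Concurrent;
using System.Threading;

public sealed class QueueRegistry
{
    private readonly ConcurrentDictionary<string, object> queues =
        new ConcurrentDictionary<string, object>();
    private int created;

    // Number of "queues" constructed so far.
    public int Created => created;

    private object CreateQueue()
    {
        Interlocked.Increment(ref created);
        return new object(); // stand-in for an abortable FIFO queue
    }

    // Value overload: the argument is evaluated on every call,
    // even when the key already exists and the new instance is thrown away.
    public object GetEager(string userId) => queues.GetOrAdd(userId, CreateQueue());

    // Factory overload: the delegate runs only when the key is missing.
    public object GetLazy(string userId) => queues.GetOrAdd(userId, _ => CreateQueue());
}
```

In the question's code every discarded queue has already started its processing loop in the constructor, so it is never collected. Note that under contention the factory overload can still run more than once for the same key; storing a Lazy<T> in the dictionary is the usual way to guarantee a single construction.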
I want to keep my application from crashing when combining a parallel for loop with HttpClient, but I am unable to apply the solutions provided elsewhere on the web due to my limited programming knowledge. My code is pasted below.
class Program
{
public static List<string> words = new List<string>();
public static int count = 0;
public static string output = "";
private static HttpClient Client = new HttpClient();
public static void Main(string[] args)
{
//input path strings...
List<string> links = new List<string>();
links.AddRange(File.ReadAllLines(input));
List<string> longList = new List<string>(File.ReadAllLines(@"a.txt"));
words.AddRange(File.ReadAllLines(output1));
System.Net.ServicePointManager.DefaultConnectionLimit = 8;
count = longList.Count;
//for (int i = 0; i < longList.Count; i++)
Task.Run(() => Parallel.For(0, longList.Count, new ParallelOptions { MaxDegreeOfParallelism = 5 }, (i, loopState) =>
{
Console.WriteLine(i);
string link = @"some link" + longList[i] + "/";
try
{
if (!links.Contains(link))
{
Task.Run(async () => { await Download(link); }).Wait();
}
}
catch (System.Exception e)
{
}
}));
//}
}
public static async Task Download(string link)
{
HtmlAgilityPack.HtmlDocument document = new HtmlDocument();
document.LoadHtml(await getURL(link));
//...stuff with html agility pack
}
public static async Task<string> getURL(string link)
{
string result = "";
HttpResponseMessage response = await Client.GetAsync(link);
Console.WriteLine(response.StatusCode);
if(response.IsSuccessStatusCode)
{
HttpContent content = response.Content;
var bytes = await response.Content.ReadAsByteArrayAsync();
result = Encoding.UTF8.GetString(bytes);
}
return result;
}
}
There are solutions, for example this one, but I don't know how to put the await keyword in my Main method, and currently the program simply exits because there is no await before Task.Run(). As you can see, I have already applied a workaround to call the async Download() method from the Main method.
I also have doubts about using the same HttpClient instance from different parallel threads. Please advise whether I should create a new HttpClient instance each time.
You're right that you have to block somewhere in a console application, otherwise the program will just exit before it's complete. But you're doing this more than you need to. Aim for blocking only the main thread and delegating the rest to an async method. A good practice is to create a method with a signature like private static async Task MainAsync(string[] args), put the "guts" of your program logic there, and call it from Main like this:
MainAsync(args).Wait();
In your example, move everything from Main to MainAsync. Then you're free to use await as much as you want. Task.Run and Parallel.For are explicitly consuming new threads for I/O bound work, which is unnecessary in the async world. Use Task.WhenAll instead. The last part of your MainAsync method should end up looking something like this:
await Task.WhenAll(longList.Select(async s => {
Console.WriteLine(s);
string link = @"some link" + s + "/";
try
{
if (!links.Contains(link))
{
await Download(link);
}
}
catch (System.Exception e)
{
}
}));
There is one little wrinkle here though. Your example is throttling the parallelism at 5. If you find you still need this, TPL Dataflow is a great library for throttled parallelism in the async world. Here's a simple example.
Regarding HttpClient, using a single instance across threads is completely safe and highly encouraged.
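If pulling in TPL Dataflow feels like too much, one alternative technique (not the Dataflow approach mentioned above) is to gate Task.WhenAll with a SemaphoreSlim. The following is a minimal sketch; the Throttle class and ForEachAsync are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs 'body' for every item, with at most 'maxConcurrency' bodies in flight at once.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> body)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                // Asynchronously wait for a free slot; no thread is blocked here.
                await gate.WaitAsync().ConfigureAwait(false);
                try
                {
                    await body(item).ConfigureAwait(false);
                }
                finally
                {
                    // Free the slot even if the body throws.
                    gate.Release();
                }
            }).ToList(); // ToList starts all tasks; the gate holds most of them back.

            await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

In MainAsync this would be called along the lines of `await Throttle.ForEachAsync(longList, 5, s => Download(@"some link" + s + "/"));`, which keeps the limit of 5 from the original Parallel.For.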
I've been reading examples for a long time now, but unfortunately I've been unable to apply the solutions to the code I'm working with. Some quick Facts/Assorted Info:
1) I'm new to C#
2) The code posted below is modified from Amazon Web Services (mostly stock)
3) The purpose of the code is to compare server info with already-downloaded offline info and create a list of files that still need to be downloaded. This snippet builds the list from the server side. The only option with AWS is to call it asynchronously, but I need this to finish before moving forward.
public void InitiateSearch()
{
UnityInitializer.AttachToGameObject(this.gameObject);
//these are the access key and secret access key for credentials
BasicAWSCredentials credentials = new BasicAWSCredentials("secret key", "very secret key");
AmazonS3Config S3Config = new AmazonS3Config()
{
ServiceURL = ("url"),
RegionEndpoint = RegionEndpoint.blahblah
};
//Setting the client to be used in the call below
AmazonS3Client Client = new AmazonS3Client(credentials, S3Config);
var request = new ListObjectsRequest()
{
BucketName = "thebucket"
};
Client.ListObjectsAsync(request, (responseObject) =>
{
if (responseObject.Exception == null)
{
responseObject.Response.S3Objects.ForEach((o) =>
{
int StartCut = o.Key.IndexOf(SearchType) - 11;
if (SearchType == o.Key.Substring(o.Key.IndexOf(SearchType), SearchType.Length))
{
if (ZipCode == o.Key.Substring(StartCut + 12 + SearchType.Length, 5))
{
AWSFileList.Add(o.Key + ", " + o.LastModified);
}
}
}
);
}
else
{
Debug.Log(responseObject.Exception);
}
});
}
I have no idea how to apply await to the Client.ListObjectsAsync line, I'm hoping you all can give me some guidance and let me keep my hair for a few more years.
You can either mark your method async and await it, or you can call .Wait() or read .Result on the Task you're given back.
I have no idea how to apply await to the Client.ListObjectsAsync line
You probably just put await in front of it:
await Client.ListObjectsAsync(request, (responseObject) => ...
As soon as you do this, Visual Studio will give you an error. Take a good look at the error message, because it tells you exactly what to do next (mark InitiateSearch with async and change its return type to Task):
public async Task InitiateSearchAsync()
(it's also a good idea to add an Async suffix to follow the common pattern).
Next, you'd add an await everywhere that InitiateSearchAsync is called, and so on.
I'm assuming Client.ListObjectsAsync returns a Task object, so a solution for your specific problem would be this:
public async void InitiateSearch()
{
//code
var collection = await Client.ListObjectsAsync(request, (responseObject) =>
{
//code
});
foreach (var item in collection)
{
//do stuff with item
}
}
The variable collection will now be filled with the objects. You may want to change the return type of InitiateSearch() to Task, so you can await it too.
await InitiateSearch(); //like this
If this method is an event handler of some sort (like called by the click of a button), then you can keep using void as return type.
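The answers above assume the call returns a Task. The question's call passes a completion callback, so it is possible that this overload returns nothing awaitable. In that case one common approach is to wrap the callback in a TaskCompletionSource. A minimal sketch under that assumption; CallbackApi.ListAsync is a hypothetical stand-in for a callback-style API, not the real AWS client:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical callback-style API used only for illustration.
public static class CallbackApi
{
    public static void ListAsync(string bucket, Action<List<string>, Exception> callback)
    {
        // Simulate the work completing later on a thread-pool thread.
        ThreadPool.QueueUserWorkItem(_ =>
            callback(new List<string> { bucket + "/a.zip", bucket + "/b.zip" }, null));
    }
}

public static class CallbackAdapter
{
    // Exposes the callback API as a Task so that callers can await it.
    public static Task<List<string>> ListAsTask(string bucket)
    {
        var tcs = new TaskCompletionSource<List<string>>();
        CallbackApi.ListAsync(bucket, (keys, error) =>
        {
            if (error != null) tcs.TrySetException(error); // faults the task
            else tcs.TrySetResult(keys);                   // completes the task
        });
        return tcs.Task;
    }
}
```

A caller inside an async method could then write `List<string> keys = await CallbackAdapter.ListAsTask("thebucket");` and continue only after the callback has fired.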
A simple introduction from an unpublished part of the documentation for async-await:
Three things are needed to use async-await:
The Task object: This object is returned by a method that executes asynchronously. It lets you track and control the execution of the method.
The await keyword: "Awaits" a Task. Put this keyword before the Task to asynchronously wait for it to finish.
The async keyword: All methods that use the await keyword have to be marked as async.
A small example that demonstrates the usage of these keywords:
public async Task DoStuffAsync()
{
var result = await DownloadFromWebpageAsync(); //calls method and waits till execution finished
var task = WriteTextAsync(@"temp.txt", result); //starts saving the string to a file, continues execution right away
Debug.Write("this is executed in parallel with WriteTextAsync!"); //executed in parallel with WriteTextAsync!
await task; //wait for WriteTextAsync to finish execution
}
private async Task<string> DownloadFromWebpageAsync()
{
using (var client = new WebClient())
{
return await client.DownloadStringTaskAsync(new Uri("http://stackoverflow.com"));
}
}
private async Task WriteTextAsync(string filePath, string text)
{
byte[] encodedText = Encoding.Unicode.GetBytes(text);
using (FileStream sourceStream = new FileStream(filePath, FileMode.Append))
{
await sourceStream.WriteAsync(encodedText, 0, encodedText.Length);
}
}
Some things to note:
You can specify a return value for an asynchronous operation with Task<T> (here Task<string>). The await keyword waits until the execution of the method finishes and returns the string.
The Task object contains the status of the execution of the method; it can be used like any other variable.
If an exception is thrown (for example by the WebClient), it bubbles up the first time the await keyword is used (in this example at the line var result (...)).
It is recommended to name methods that return a Task object MethodNameAsync.
For more information about this take a look at http://blog.stephencleary.com/2012/02/async-and-await.html.
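The note about exceptions can be illustrated with a small sketch. FailAsync is a hypothetical operation that always fails; it stands in for the WebClient call so the example needs no network:

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionDemo
{
    // Hypothetical operation that fails after a short delay.
    private static async Task<string> FailAsync()
    {
        await Task.Delay(10);
        throw new InvalidOperationException("download failed");
    }

    public static async Task<string> RunAsync()
    {
        Task<string> task = FailAsync(); // no exception is thrown here; it is stored in the task
        try
        {
            return await task;           // the stored exception is rethrown at the await
        }
        catch (InvalidOperationException ex)
        {
            return "caught: " + ex.Message;
        }
    }
}
```

The try/catch therefore belongs around the await, not around the line that merely starts the operation.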
So I am trying to learn how to write asynchronous methods and have been banging my head trying to get asynchronous calls to work. What always seems to happen is that the code hangs on the await instruction until it eventually seems to time out and crashes the loading form used in the same method.
There are two main reasons this is strange:
The code works flawlessly when not asynchronous and just a simple loop
I copied the MSDN code almost verbatim to convert the code to asynchronous calls here: https://msdn.microsoft.com/en-us/library/mt674889.aspx
I know there are already a lot of questions about this on the forums, but I have gone through most of them and tried a lot of other approaches (with the same result), and now I think something is fundamentally wrong, since even the MSDN code isn't working.
Here is the main method that is called by a background worker:
// this method loads data from each individual webPage
async Task LoadSymbolData(DoWorkEventArgs _e)
{
int MAX_THREADS = 10;
int tskCntrTtl = dataGridView1.Rows.Count;
Dictionary<string, string> newData_d = new Dictionary<string, string>(tskCntrTtl);
// we need to make copies of things that can change in a different thread
List<string> links = new List<string>(dataGridView1.Rows.Cast<DataGridViewRow>()
.Select(r => r.Cells[dbIndexs_s.url].Value.ToString()).ToList());
List<string> symbols = new List<string>(dataGridView1.Rows.Cast<DataGridViewRow>()
.Select(r => r.Cells[dbIndexs_s.symbol].Value.ToString()).ToList());
// we need to create a cancelation token once this is working
// TODO
using (LoadScreen loadScreen = new LoadScreen("Querying stock servers..."))
{
// we can't use the delegate because of the async keywords
this.loaderScreens.Add(loadScreen);
// wait until the form is loaded so we don't get exceptions when writing to controls on that form
while ( !loadScreen.IsLoaded() );
// load the total number of operations so we can simplify incrementing the progress bar
// on separate form instances
loadScreen.LoadProgressCntr(0, tskCntrTtl);
// try to run all async tasks since they are non-blocking threaded operations
for (int i = 0; i < tskCntrTtl; i += MAX_THREADS)
{
List<Task<string[]>> ProcessURL = new List<Task<string[]>>();
List<int> taskList = new List<int>();
// Make a list of task indexes
for (int task = i; task < i + MAX_THREADS && task < tskCntrTtl; task++)
taskList.Add(task);
// ***Create a query that, when executed, returns a collection of tasks.
IEnumerable<Task<string[]>> downloadTasksQuery =
from task in taskList select QueryHtml(loadScreen, links[task], symbols[task]);
// ***Use ToList to execute the query and start the tasks.
List<Task<string[]>> downloadTasks = downloadTasksQuery.ToList();
// ***Add a loop to process the tasks one at a time until none remain.
while (downloadTasks.Count > 0)
{
// Identify the first task that completes.
Task<string[]> firstFinishedTask = await Task.WhenAny(downloadTasks); // <---- CODE HANGS HERE
// ***Remove the selected task from the list so that you don't
// process it more than once.
downloadTasks.Remove(firstFinishedTask);
// Await the completed task.
string[] data = await firstFinishedTask;
if (!newData_d.ContainsKey(data.First()))
newData_d.Add(data.First(), data.Last());
}
}
// now we have the dictionary with all the information gathered from the websites
// now we can add the columns if they don't already exist and load the information
// TODO
loadScreen.UpdateProgress(100);
this.loaderScreens.Remove(loadScreen);
}
}
And here is the async method for querying web pages:
async Task<string[]> QueryHtml(LoadScreen _loadScreen, string _link, string _symbol)
{
string data = String.Empty;
try
{
HttpClient client = new HttpClient();
var doc = new HtmlAgilityPack.HtmlDocument();
var html = await client.GetStringAsync(_link); // <---- CODE HANGS HERE
doc.LoadHtml(html);
string percGrn = doc.FindInnerHtml(
"//span[contains(@class,'time_rtq_content') and contains(@class,'up_g')]//span[2]");
string percRed = doc.FindInnerHtml(
"//span[contains(@class,'time_rtq_content') and contains(@class,'down_r')]//span[2]");
// create something we'll understand later
if ((String.IsNullOrEmpty(percGrn) && String.IsNullOrEmpty(percRed)) ||
(!String.IsNullOrEmpty(percGrn) && !String.IsNullOrEmpty(percRed)))
throw new Exception();
// adding string to empty gives string
string perc = percGrn + percRed;
bool isNegative = String.IsNullOrEmpty(percGrn);
double percDouble;
if (double.TryParse(Regex.Match(perc, @"\d+([.])?(\d+)?").Value, out percDouble))
data = (isNegative ? 0 - percDouble : percDouble).ToString();
}
catch (Exception ex) { }
finally
{
// update the progress bar...
_loadScreen.IncProgressCntr();
}
return new string[] { _symbol, data };
}
I could really use some help. Thanks!
In short, when you block on async code with 'regular' task functions (such as .Wait() or .Result) on a thread that has a synchronization context, you can get a deadlock:
http://olitee.com/2015/01/c-async-await-common-deadlock-scenario/
The solution is to use ConfigureAwait(false):
var html = await client.GetStringAsync(_link).ConfigureAwait(false);
The reason you need this is that the call is not awaited all the way back up to your original thread.
// ***Create a query that, when executed, returns a collection of tasks.
IEnumerable<Task<string[]>> downloadTasksQuery = from task in taskList select QueryHtml(loadScreen,links[task], symbols[task]);
What's happening here is that you mix the await paradigm with the regular task handling paradigm, and those don't mix (or rather, you have to use ConfigureAwait(false) for this to work).
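As a rough sketch of the ConfigureAwait(false) suggestion, the following uses a hypothetical FetchAsync (a Task.Delay standing in for the HTTP request) so it runs without a network. It is an illustration under those assumptions, not a drop-in fix for the code in the question:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ConfigureAwaitDemo
{
    // Hypothetical stand-in for client.GetStringAsync(link).
    private static async Task<string> FetchAsync(string symbol)
    {
        // ConfigureAwait(false): the continuation does not have to resume on the
        // captured context (for example the UI thread), so a blocked caller on
        // that context cannot prevent it from running.
        await Task.Delay(10).ConfigureAwait(false);
        return symbol + ":ok";
    }

    public static async Task<string[]> FetchAllAsync(string[] symbols)
    {
        // Start all requests, then await them together; results keep the input order.
        Task<string>[] tasks = symbols.Select(s => FetchAsync(s)).ToArray();
        return await Task.WhenAll(tasks).ConfigureAwait(false);
    }
}
```

One caveat with this approach: code after an await that uses ConfigureAwait(false) may run on a thread-pool thread, so it should not touch UI controls directly (such as a progress bar update) without marshalling back to the UI thread.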