I have some C# code (MVC WebAPI) which iterates over an array of IDs in parallel and makes an API call for each entry. In the first version, the whole code was a simple, synchronous for loop. We have now changed that to a combination of Task.WhenAll and a LINQ Select:
private async Task RunHeavyLoad(IProgress<float> progress) {
List<MyObj> myElements = new List<MyObj>(someEntries);
float totalSteps = 1f / myElements.Count();
int currentStep = 0;
await Task.WhenAll(myElements.Select(async elem => {
var result = await SomeHeavyApiCall(elem);
DoSomethingWithThe(result);
progress.Report(totalSteps * System.Threading.Interlocked.Increment(ref currentStep) * .1f);
}));
// Do some more stuff
}
This is a simplified version of the original method! The actual method EnforceImmediateImport is called by this SignalR hub method:
public class ImportStatusHub : Hub {
public async Task RunUnscheduledImportAsync(DateTime? fromDate, DateTime? toDate) {
Clients.Others.enableManualImport(false);
try {
Progress<float> progress = new Progress<float>((p) => Clients.All.updateProgress(p));
await MvcApplication.GlobalScheduler.EnforceImmediateImport(progress, fromDate, toDate);
} catch (Exception ex) {
Clients.All.importError(ex.Message);
}
Clients.Others.enableManualImport(true);
}
}
Now I wonder whether this is "thread safe" per se, or whether I need to do something with the progress.Report calls to prevent anything from going wrong.
From the docs:
Remarks
Any handler provided to the constructor or event handlers
registered with the ProgressChanged event are invoked through a
SynchronizationContext instance captured when the instance is
constructed. If there is no current SynchronizationContext at the time
of construction, the callbacks will be invoked on the ThreadPool.
For more information and a code example, see the article Async in 4.5:
Enabling Progress and Cancellation in Async APIs in the .NET Framework
blog.
Like anything else using the SynchronizationContext, it's safe to post from multiple threads.
Custom implementations of IProgress<T> should have their behavior defined.
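On your question: internally, Progress<T> only posts the value to the captured context and invokes the handlers there. It is up to the code you wrote to handle the progress on the other side. I would say that the line progress.Report(totalSteps * System.Threading.Interlocked.Increment(ref currentStep) * .1f); can cause a potential progress reporting issue: the increment itself is atomic, but the multiplication and the Report call that follow are not part of that atomic operation, so a task holding a higher step count can report before a task holding a lower one, and the values can arrive out of order.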
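To make the ordering concern concrete, here is a minimal sketch, not the asker's actual code. Each worker takes its step number from Interlocked.Increment, and the receiving side only forwards values that are higher than the last one it saw. MonotonicProgress, ProgressDemo and RunAsync are hypothetical names, and Task.Delay stands in for SomeHeavyApiCall. Unlike Progress<T>, this implementation invokes the handler synchronously on the reporting thread.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical IProgress<float> implementation that drops stale values,
// so the consumer never sees the progress go backwards.
public sealed class MonotonicProgress : IProgress<float>
{
    private readonly Action<float> _handler;
    private readonly object _gate = new object();
    private float _last = float.MinValue;

    public MonotonicProgress(Action<float> handler) => _handler = handler;

    public void Report(float value)
    {
        lock (_gate)
        {
            if (value <= _last) return; // ignore out-of-order reports
            _last = value;
            _handler(value);            // invoked synchronously, under the lock
        }
    }
}

public static class ProgressDemo
{
    // Runs itemCount simulated calls concurrently and reports a fraction in (0, 1].
    public static async Task RunAsync(int itemCount, IProgress<float> progress)
    {
        int currentStep = 0;
        await Task.WhenAll(Enumerable.Range(0, itemCount).Select(async _ =>
        {
            await Task.Delay(10); // stand-in for SomeHeavyApiCall
            int step = Interlocked.Increment(ref currentStep); // atomic, unique per item
            progress.Report((float)step / itemCount);
        }));
    }

    public static async Task Main()
    {
        var progress = new MonotonicProgress(p => Console.WriteLine($"{p:P0}"));
        await RunAsync(5, progress);
    }
}
```

Some intermediate values may be skipped when reports race, but the final value is always delivered because it is the largest one.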
This is what happens internally inside Progress when you call Report
protected virtual void OnReport(T value)
{
// If there's no handler, don't bother going through the sync context.
// Inside the callback, we'll need to check again, in case
// an event handler is removed between now and then.
Action<T> handler = m_handler;
EventHandler<T> changedEvent = ProgressChanged;
if (handler != null || changedEvent != null)
{
// Post the processing to the sync context.
// (If T is a value type, it will get boxed here.)
m_synchronizationContext.Post(m_invokeHandlers, value);
}
}
On the code itself, though, a better way to run in parallel is to use PLINQ. In your current code, if the list contains many items, it will spin up a task for every single item at the same time and wait for all of them to complete. With PLINQ, however, the number of concurrent executions is determined for you to optimize performance.
myElements.AsParallel().ForAll(async elem =>
{
var result = await SomeHeavyApiCall(elem);
DoSomethingWithThe(result);
progress.Report(totalSteps * System.Threading.Interlocked.Increment(ref currentStep) * .1f);
});
Please keep in mind that AsParallel().ForAll() returns immediately when it is given an async delegate, because it does not wait for the tasks the delegate starts. So you might want to capture all the tasks and wait for them before you proceed.
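One way to capture all the tasks and wait for them while still limiting concurrency is to keep Task.WhenAll and gate each call with a SemaphoreSlim. The following is only a sketch: it swaps PLINQ for a semaphore, ThrottledRunner and ForEachAsync are hypothetical names, and Task.Delay stands in for SomeHeavyApiCall.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledRunner
{
    // Runs body for every item, with at most maxConcurrency bodies in flight,
    // and completes only when all of them have finished.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> body)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency, maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();      // wait for a free slot
                try { await body(item); }
                finally { gate.Release(); }  // free the slot even on failure
            }).ToList();                     // ToList starts every task exactly once

            await Task.WhenAll(tasks);
        }
    }

    public static async Task Main()
    {
        await ForEachAsync(new[] { 1, 2, 3, 4, 5 }, 2, async id =>
        {
            await Task.Delay(50); // stand-in for SomeHeavyApiCall(elem)
            Console.WriteLine($"Finished {id}");
        });
        Console.WriteLine("All done");
    }
}
```

Because the method awaits Task.WhenAll, code placed after the call runs only once every item has been processed.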
One last thing: if your list is being edited while it is being processed, I recommend using ConcurrentQueue, ConcurrentDictionary or ConcurrentBag.
I am using the System.Threading.Timer class in one of my projects and I've noticed that the callback methods are called before the previous ones get to finish which is different than what I was expecting.
For example the following code
var delta = new TimeSpan(0,0,1);
var timer = new Timer(TestClass.MethodAsync, null, TimeSpan.Zero, delta);
static class TestClass
{
static int i = 0;
public static async void MethodAsync(object _)
{
i++;
Console.WriteLine("method " + i + "started");
await Task.Delay(10000);
Console.WriteLine("method " + i + "finished");
}
}
has this output
method 1started
method 2started
method 3started
method 4started
method 5started
method 6started
method 7started
method 8started
method 9started
method 10started
method 11started
method 11finished
method 12started
method 12finished
Which of course is not thread safe. What I was expecting is that a new call would be made to the callback method after the previous call has succeeded and additionally after the delta period is elapsed.
What I am looking for is where in the docs from Microsoft is this behavior documented and maybe if there is a built in way to make it wait for the callback calls to finish before starting new ones
The problem of overlapping event handlers is inherent in the classic multithreaded .NET timers (System.Threading.Timer and System.Timers.Timer). Attempting to solve this problem while remaining on the event-publisher-subscriber model is difficult, tricky, and error prone. .NET 6 introduced a new timer, the PeriodicTimer, that attempts to solve this problem once and for all. Instead of handling an event, you start an asynchronous loop and await the PeriodicTimer.WaitForNextTickAsync method on each iteration. Example:
class TestClass : IDisposable
{
private int i = 0;
private PeriodicTimer _timer;
public async Task StartAsynchronousLoop()
{
if (_timer != null) throw new InvalidOperationException();
_timer = new(TimeSpan.FromSeconds(1));
while (await _timer.WaitForNextTickAsync())
{
i++;
Console.WriteLine($"Iteration {i} started");
await Task.Delay(10000); // Simulate an I/O-bound operation
Console.WriteLine($"Iteration {i} finished");
}
}
public void Dispose() => _timer?.Dispose();
}
This way there is no possibility for overlapping executions, provided that you will start only one asynchronous loop.
The await _timer.WaitForNextTickAsync() returns false when the timer is disposed. You can also stop the loop by passing a CancellationToken as an argument. When the token is canceled, WaitForNextTickAsync will complete with an OperationCanceledException.
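As a rough illustration of the cancellation path, the sketch below passes a token to WaitForNextTickAsync and treats the resulting OperationCanceledException as the normal way out of the loop. PeriodicLoop and RunAsync are hypothetical names, and the intervals are arbitrary. It requires .NET 6 or later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PeriodicLoop
{
    // Runs work on every tick until the token is canceled;
    // returns the number of completed iterations.
    public static async Task<int> RunAsync(
        TimeSpan period, Func<Task> work, CancellationToken token)
    {
        int iterations = 0;
        using var timer = new PeriodicTimer(period);
        try
        {
            while (await timer.WaitForNextTickAsync(token))
            {
                await work();   // the next tick is not awaited until this completes
                iterations++;
            }
        }
        catch (OperationCanceledException)
        {
            // Expected when the token is canceled while waiting for a tick.
        }
        return iterations;
    }

    public static async Task Main()
    {
        using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(350));
        int count = await RunAsync(
            TimeSpan.FromMilliseconds(100),
            () => Task.Delay(10), // stand-in for an I/O-bound operation
            cts.Token);
        Console.WriteLine($"Completed {count} iterations before cancellation");
    }
}
```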
In case the periodic action is not asynchronous, you can offload it to the ThreadPool, by wrapping it in Task.Run:
await Task.Run(() => Thread.Sleep(10000)); // Simulate a blocking operation
If you are targeting a .NET platform older than .NET 6, you can find alternatives to the PeriodicTimer here.
What I am looking for is where in the docs from Microsoft is this behavior documented...
System.Timers.Timer
If processing of the Elapsed event lasts longer than Interval, the event might be raised again on another ThreadPool thread. In this situation, the event handler should be reentrant.
System.Threading.Timer
The callback method executed by the timer should be reentrant, because it is called on ThreadPool threads.
For System.Windows.Forms.Timer this post asserts that the event does wait. The documentation doesn't seem very specific, but in the Microsoft Timer.Tick Event official example the code shows turning the timer off and on in the handler. So it seems that, regardless, steps are taken to prevent ticks and avoid reentrancy.
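The same "turn it off and on in the handler" idea can be sketched for System.Threading.Timer by creating the timer as a one-shot and re-arming it only after the callback has finished. This is an illustrative assumption, not something taken from the Microsoft example: NonOverlappingTimer is a hypothetical helper, and the delays are arbitrary.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: a one-shot System.Threading.Timer that is re-armed
// only after the previous callback has finished.
public sealed class NonOverlappingTimer : IDisposable
{
    private readonly Timer _timer;
    private readonly TimeSpan _period;
    private readonly Func<Task> _work;

    public NonOverlappingTimer(TimeSpan period, Func<Task> work)
    {
        _period = period;
        _work = work;
        // Create the timer disabled, then arm it once; an infinite period means "do not repeat".
        _timer = new Timer(OnTick, null, Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
        _timer.Change(TimeSpan.Zero, Timeout.InfiniteTimeSpan);
    }

    private async void OnTick(object state)
    {
        try
        {
            await _work();
        }
        catch (Exception ex)
        {
            // An exception escaping an async void method would crash the process.
            Console.Error.WriteLine(ex);
        }
        finally
        {
            try { _timer.Change(_period, Timeout.InfiniteTimeSpan); } // schedule the next run
            catch (ObjectDisposedException) { /* disposed while the work was running */ }
        }
    }

    public void Dispose() => _timer.Dispose();
}

public static class NonOverlappingTimerDemo
{
    public static async Task Main()
    {
        int i = 0;
        using (new NonOverlappingTimer(TimeSpan.FromMilliseconds(100), async () =>
        {
            int n = ++i; // safe here because runs never overlap
            Console.WriteLine($"method {n} started");
            await Task.Delay(300);
            Console.WriteLine($"method {n} finished");
        }))
        {
            await Task.Delay(1500);
        }
    }
}
```

With this arrangement the period is measured from the end of one run to the start of the next, so the callbacks cannot pile up.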
...and if there is a built in way to make it wait for the callback calls to finish before starting new ones.
According to the first Microsoft link (you might consider this a workaround, but it's straight from the horse's mouth):
One way to resolve this race condition is to set a flag that tells the event handler for the Elapsed event to ignore subsequent events.
The way I personally go about achieving this objective is to call Wait(0) on the synchronization object of choice, as a robust way to ignore reentrancy without having timer events pile up in a queue:
static SemaphoreSlim _sslim = new SemaphoreSlim(1, 1);
public static async void MethodAsync(object _)
{
if (_sslim.Wait(0))
{
try
{
i++;
Console.WriteLine($"method {i} started # {DateTime.Now}");
await Task.Delay(10000);
Console.WriteLine($"method {i} finished # {DateTime.Now}");
}
catch (Exception ex)
{
Debug.Assert(false, ex.Message);
}
finally
{
_sslim.Release();
}
}
}
In which case your MethodAsync generates this output:
I've been working on a project and saw the code below. I am new to the async/await world. As far as I know, only a single task is performed in the method, so why is it decorated with async/await? What benefits do I get from using async/await, and what is the drawback if I remove async/await, i.e. make it synchronous? I am a little bit confused, so any help will be appreciated.
[Route("UpdatePersonalInformation")]
public async Task<DataTransferObject<bool>> UpdatePersonalInformation([FromBody] UserPersonalInformationRequestModel model)
{
DataTransferObject<bool> transfer = new DataTransferObject<bool>();
try
{
model.UserId = UserIdentity;
transfer = await _userService.UpdateUserPersonalInformation(model);
}
catch (Exception ex)
{
transfer.TransactionStatusCode = 500;
transfer.ErrorMessage = ex.Message;
}
return transfer;
}
Service code
public async Task<DataTransferObject<bool>> UpdateUserPersonalInformation(UserPersonalInformationRequestModel model)
{
DataTransferObject<bool> transfer = new DataTransferObject<bool>();
await Task.Run(() =>
{
try
{
var data = _userProfileRepository.FindBy(x => x.AspNetUserId == model.UserId)?.FirstOrDefault();
if (data != null)
{
var userProfile = mapper.Map<UserProfile>(model);
userProfile.UpdatedBy = model.UserId;
userProfile.UpdateOn = DateTime.UtcNow;
userProfile.CreatedBy = data.CreatedBy;
userProfile.CreatedOn = data.CreatedOn;
userProfile.Id = data.Id;
userProfile.TypeId = data.TypeId;
userProfile.AspNetUserId = data.AspNetUserId;
userProfile.ProfileStatus = data.ProfileStatus;
userProfile.MemberSince = DateTime.UtcNow;
if(userProfile.DOB==DateTime.MinValue)
{
userProfile.DOB = null;
}
_userProfileRepository.Update(userProfile);
transfer.Value = true;
}
else
{
transfer.Value = false;
transfer.Message = "Invalid User";
}
}
catch (Exception ex)
{
transfer.ErrorMessage = ex.Message;
}
});
return transfer;
}
What benefits I am getting by using async/await
Normally, on ASP.NET, the benefit of async is that your server is more scalable - i.e., can handle more requests than it otherwise could. The "Synchronous vs. Asynchronous Request Handling" section of this article goes into more detail, but the short explanation is that async/await frees up a thread so that it can handle other requests while the asynchronous work is being done.
However, in this specific case, that's not actually what's going on. Using async/await in ASP.NET is good and proper, but using Task.Run on ASP.NET is not. Because what happens with Task.Run is that another thread is used to run the delegate within UpdateUserPersonalInformation. So this isn't asynchronous; it's just synchronous code running on a background thread. UpdateUserPersonalInformation will take another thread pool thread to run its synchronous repository call and then yield the request thread by using await. So it's just doing a thread switch for no benefit at all.
A proper implementation would make the repository asynchronous first, and then UpdateUserPersonalInformation can be implemented without Task.Run at all:
public async Task<DataTransferObject<bool>> UpdateUserPersonalInformation(UserPersonalInformationRequestModel model)
{
DataTransferObject<bool> transfer = new DataTransferObject<bool>();
try
{
var data = _userProfileRepository.FindBy(x => x.AspNetUserId == model.UserId)?.FirstOrDefault();
if (data != null)
{
...
await _userProfileRepository.UpdateAsync(userProfile);
transfer.Value = true;
}
else
{
transfer.Value = false;
transfer.Message = "Invalid User";
}
}
catch (Exception ex)
{
transfer.ErrorMessage = ex.Message;
}
return transfer;
}
The await keyword only indicates that the execution of the current method is suspended until the awaited Task has completed. This means that if you remove the await, the method will continue executing and therefore return the transfer object immediately, even if the UpdateUserPersonalInformation Task is not finished.
Take a look at this example:
private void showInfo()
{
Task.Delay(1000);
MessageBox.Show("Info");
}
private async void showInfoAsync()
{
await Task.Delay(1000);
MessageBox.Show("Info");
}
In the first method, the MessageBox is immediately displayed, since the newly created Task (which only waits a specified amount of time) is not awaited. However, the second method specifies the await keyword, therefore the MessageBox is displayed only after the Task is finished (in the example, after 1000ms elapsed).
But in both cases the delay Task runs asynchronously in the background, so the main thread (for example the UI) will not freeze.
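The difference can also be observed without a UI. The console sketch below measures how long each variant takes to reach the point where the MessageBox would be shown; AwaitDemo and its method names are hypothetical, and the 300 ms delay is arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Starts the delay but does not wait for it.
    // Returns the elapsed milliseconds at the point where the message would be shown.
    public static long WithoutAwait()
    {
        var sw = Stopwatch.StartNew();
        _ = Task.Delay(300);            // the task is started and then ignored
        return sw.ElapsedMilliseconds;
    }

    // Waits for the delay before continuing.
    public static async Task<long> WithAwait()
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(300);          // the method resumes only after the delay
        return sw.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        Console.WriteLine($"Without await: {WithoutAwait()} ms");
        Console.WriteLine($"With await:    {await WithAwait()} ms");
    }
}
```

The first figure should be close to zero, while the second should be roughly the length of the delay.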
The async-await mechanism is mainly used:
when you have some long-running process that takes time and you want it to run in the background;
in a UI, when you don't want to block the main thread, which would hurt the responsiveness of the UI.
you can read more here:
https://learn.microsoft.com/en-us/dotnet/csharp/async
Time Outs
The main use of async and await is to prevent timeouts by waiting for long operations to complete. However, there is another, less known but very powerful one.
If you don't await a long operation, you will get a result back, such as a null, even though the actual request has not completed yet.
Cancellation Tokens
Async requests have a default parameter you can add:
public async Task<DataTransferObject<bool>> UpdatePersonalInformation(
[FromBody] UserPersonalInformationRequestModel model,
CancellationToken cancellationToken){..}
A CancellationToken allows the request to stop when the user changes pages or interrupts the connection. A good example of this is a user has a search box, and every time a letter is typed you filter and search results from your API. Now imagine the user types a very long string with say 15 characters. That means that 15 requests are sent and 15 requests need to be completed. Even if the front end is not awaiting the first 14 results, the API is still doing all the 15 requests.
A cancellation token simply tells the API to stop working on requests whose results are no longer needed.
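The search-box scenario can be simulated in a console program. In the sketch below every "keystroke" cancels the previous search, so only the last one runs to completion. SearchDemo, SearchAsync and TypeAsync are hypothetical names; in a real ASP.NET action the token would be supplied by the framework rather than by a CancellationTokenSource of your own.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SearchDemo
{
    // Hypothetical stand-in for a slow search endpoint.
    public static async Task<string> SearchAsync(string term, CancellationToken token)
    {
        await Task.Delay(200, token); // throws OperationCanceledException when canceled
        return $"results for '{term}'";
    }

    // Issues one search per typed character and cancels the previous one.
    // Returns how many searches ran to completion.
    public static async Task<int> TypeAsync(string text)
    {
        int completed = 0;
        CancellationTokenSource previous = null;
        var pending = new List<Task>();

        for (int length = 1; length <= text.Length; length++)
        {
            previous?.Cancel();                 // the older result is no longer needed
            previous = new CancellationTokenSource();
            pending.Add(RunOne(text.Substring(0, length), previous.Token));
        }

        await Task.WhenAll(pending);
        return completed;

        async Task RunOne(string term, CancellationToken token)
        {
            try
            {
                await SearchAsync(term, token);
                Interlocked.Increment(ref completed);
            }
            catch (OperationCanceledException)
            {
                // Canceled searches stop early instead of doing unused work.
            }
        }
    }

    public static async Task Main()
    {
        int completed = await TypeAsync("a very long query");
        Console.WriteLine($"Searches that ran to completion: {completed}");
    }
}
```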
I would like to chime in on this because most answers, although good, do not point to a definite rule for when to use it and when not to.
From my experience, if you are developing anything with a front end, add async/await to your methods when you expect output from other threads to be input to your UI. This is the best strategy for handling multithreaded output, and Microsoft should be commended for coming out with it when they did. Without async/await you would have to add more code to bring thread output to the UI (e.g. Event, Event Handler, Delegate, Event Subscription, Marshaller).
You don't need it anywhere else, except when using it strategically for slow peripherals.
I am reproducing my Rx issue with a simplified test case below. The test below hangs. I am sure it is a small, but fundamental, thing that I am missing, but can't put my finger on it.
public class Service
{
private ISubject<double> _subject = new Subject<double>();
public void Reset()
{
_subject.OnNext(0.0);
}
public IObservable<double> GetProgress()
{
return _subject;
}
}
public class ObTest
{
[Fact]
private async Task SimpleTest()
{
var service = new Service();
var result = service.GetProgress().Take(1);
var task = Task.Run(async () =>
{
service.Reset();
});
await result;
}
}
UPDATE
My attempt above was to simplify the problem a little and understand it. In my case GetProgress() is a merge of various Observables that publish the download progress; one of these Observables is a Subject<double> that publishes 0 every time somebody calls a method to delete the download.
The race condition identified by Enigmativity and Theodor Zoulias may(??) happen in real life. I display a view which attempts to get the progress, however, quick fingers delete it just in time.
What I need to understand a bit more is if the download is started again (subscription has taken place by now, by virtue of displaying a view, which has already made the subscription) and somebody again deletes it.
public class Service
{
private ISubject<double> _deleteSubject = new Subject<double>();
public void Reset()
{
_deleteSubject.OnNext(0.0);
}
public IObservable<double> GetProgress()
{
return _deleteSubject.Merge(downloadProgress);
}
}
Your code isn't hanging. It's awaiting an observable that sometimes never gets a value.
You have a race condition.
The Task.Run is sometimes executing to completion before the await result creates the subscription to the observable - so it never sees the value.
Try this code instead:
private async Task SimpleTest()
{
var service = new Service();
var result = service.GetProgress().Take(1);
var awaiter = result.GetAwaiter();
var task = Task.Run(() =>
{
service.Reset();
});
await awaiter;
}
The line await result creates a subscription to the observable. The problem is that the notification _subject.OnNext(0.0) may occur before this subscription, in which case the value will pass unobserved, and the await result will continue waiting for a notification forever. In this particular example the notification is always missed, at least on my PC, because the subscription is delayed for around 30 msec (measured with a Stopwatch), which is longer than the time needed for the task that resets the service to complete, probably because the JIT compiler must load and compile some Rx-related assembly. The situation changes when I do a warm-up by calling new Subject<int>().FirstAsync().Subscribe() before running the example. In that case the notification is observed almost always, and the hanging is avoided.
I can think of two robust solutions to this problem.
The solution suggested by Enigmativity, to create an awaitable subscription before starting the task that resets the service. This can be done with either GetAwaiter or ToTask.
To use a ReplaySubject<T> instead of a plain vanilla Subject<T>.
Represents an object that is both an observable sequence as well as an observer. Each notification is broadcasted to all subscribed and future observers, subject to buffer trimming policies.
The ReplaySubject will cache the value so that it can be observed by the future subscription, eliminating the race condition. You could initialize it with a bufferSize of 1 to minimize the memory footprint of the buffer.
I have a class (Class A) that is responsible for running an async job in the background that looks like this:
public async void DoJob()
{
while (true)
{
var thingToDo = this.getNextThing();
if (thingToDo != null)
{
try
{
await this.performAction(thingToDo);
}
catch (Exception ex)
{
// file logging of error.
// then wait a certain period.
await Task.Delay(someInterval);
}
}
else
{
// Gets the interval that should be awaited until there is a
// thingToDo available.
var waitInterval = this.getWaitUntilNextThingAvailable();
// if there such an interval then wait for it.
if (waitInterval != null)
{
await Task.Delay(waitInterval.Value);
}
// else (basically when there is nothing to be done by this job)
// use an AsyncManualResetEvent to wait until its set.
else
{
await this.waitHandle.WaitAsync();
}
}
}
}
I am basically interested in the last else block - the one where I use an AsyncManualResetEvent (provided by the AsyncEx library)
I use an event provided by another class (Class B) to set the waitHandle. This is what the subscription looks like (note that this method is in Class B):
private event Action ChangeOccurred;
public void Attach(Action action)
{
this.ChangeOccurred += action;
}
And now onto my question : I use ChangeOccurred?.Invoke() to set the waitHandle such that Class A can be notified that there is something to do and continue performing things in the background.
Is Invoke() the right way? I am not sure if I should be using BeginInvoke() and EndInvoke() instead. The event contains no data and is simply used as a signal that the async job can do things already.
The code in Class B where the ChangeOccurred event is invoked is synchronous.
Do not use BeginInvoke/EndInvoke. Those methods just call Invoke on a thread pool thread.
Using myEvent?.Invoke() is an appropriate way of raising the event, which (synchronously) sets the AsyncManualResetEvent. The fact that there's an asynchronous listener on the AsyncManualResetEvent doesn't matter.
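The interaction can be sketched without the AsyncEx dependency by using a TaskCompletionSource as a one-shot stand-in for AsyncManualResetEvent. This is a swapped-in technique for illustration only: Publisher, Worker and SignalDemo are hypothetical names playing the roles of Class B and Class A.

```csharp
using System;
using System.Threading.Tasks;

// Plays the role of Class B: raises the event synchronously.
public class Publisher
{
    private event Action ChangeOccurred;

    public void Attach(Action action) => ChangeOccurred += action;

    public void RaiseChange() => ChangeOccurred?.Invoke();
}

// Plays the role of Class A: waits asynchronously for the signal.
public class Worker
{
    // One-shot stand-in for AsyncManualResetEvent (assumption for this sketch).
    private readonly TaskCompletionSource<bool> _signal =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    public void Set() => _signal.TrySetResult(true);

    public async Task<string> WaitForChangeAsync()
    {
        await _signal.Task;   // resumes after Set has been called
        return "change observed";
    }
}

public static class SignalDemo
{
    public static async Task<string> RunAsync()
    {
        var publisher = new Publisher();
        var worker = new Worker();
        publisher.Attach(worker.Set);

        Task<string> waiting = worker.WaitForChangeAsync(); // starts waiting
        publisher.RaiseChange();   // synchronous call that only sets the signal
        return await waiting;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync());
    }
}
```

The raise itself stays cheap because the handler only sets the signal; the waiting code continues on its own, separately from the caller of Invoke.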
On a side note, the latest (v5 preview) version of AsyncEx includes PauseToken / PauseTokenSource types, which are really just a simple wrapper around AsyncManualResetEvent but might make the intent of the code a bit clearer.
I'm looking for a way to implement the following:
A says: "Yo, X happened. Bye."
Others see that and start doing some work.
In other words, I would like to fire an event and let others handle that in a fire and forget way.
So I've looked into the observer pattern: https://msdn.microsoft.com/en-us/library/dd783449(v=vs.110).aspx. However this example is synchronous, and if the observers take a long time to do their work, the notify method blocks for a long time.
I also looked at how to raise events: https://msdn.microsoft.com/en-us/library/9aackb16(v=vs.110).aspx. Also this example is synchronous, and blocks the sender for a long time when the handler takes long to handle the event.
My question is:
How do I do fire and forget events/messages/delegates in C#?
You should probably take a look at TPL Dataflow (part of the Task Parallel Library). It has a block called ActionBlock<TInput> that should be a good start for you:
The ActionBlock<TInput> class is a target block that calls a delegate
when it receives data. Think of a ActionBlock<TInput> object as a
delegate that runs asynchronously when data becomes available. The
delegate that you provide to an ActionBlock<TInput> object can be of
type Action or type System.Func<TInput, Task>[...]
Therefore, what about giving a Func<TInput, Task> to ActionBlock<TInput> to perform asynchronous stuff? I've modified the sample found on this TPL Dataflow MSDN article:
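// Factor used by the sample observers.
const int factor = 2;
List<Func<int, Task>> observers = new List<Func<int, Task>>
{
// Each observer returns a Task so that it can run asynchronously.
n => Task.Run(() => Console.WriteLine(n)),
n => Task.Run(() => Console.WriteLine(n * factor)),
n => Task.Run(() => Console.WriteLine(n * n / factor))
};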
// Create an ActionBlock<int> object that prints values
// to the console.
var actionBlock = new ActionBlock<int>
(
n =>
{
// Fire and forget call to all observers
foreach(Func<int, Task> observer in observers)
{
// Don't await for its completion
observer(n);
}
}
);
// Post several messages to the block.
for (int i = 0; i < 3; i++)
{
actionBlock.Post(i * 10);
}
// Set the block to the completed state
actionBlock.Complete();
// See how I commented out the following sentence.
// You don't wait actions to complete as you want the fire
// and forget behavior!
// actionBlock.Completion.Wait();
You might also want to take a look at BufferBlock<T>.