I'm putting together a simple C# app where a user types commands into a "command bar" and the results are fetched in real time.
For example, I could type "google stackoverflow" and it would send an API call off to Google, fetch each of the results, and display them in the UI.
Currently I fire off the "search" method if the user pauses typing for more than a quarter of a second, so if you paused in the middle it could fire "google stack" and then "google stackoverflow".
In reality the API is doing a (rather slow) database query, and this locks up the UI while the first search completes before it even starts on the second search.
Is there a simple (C# 4.0) way to run the search call in a separate thread that I can then cancel/abort if the user continues typing?
e.g.
Task<string> thread;

string getSearchResults(string input) {
    // ... do some work ...
}

string userPaused(string search) {
    if (this.thread.IsRunning()) this.thread.Kill();   // pseudocode - no such members exist on Task
    this.thread = new Task<string>(() => getSearchResults(search));
    return this.thread.Result;
}
I've looked at the Tasks API and it doesn't look as if you can kill a task in the middle of its work; the documentation suggests using a while loop and checking a shouldStop boolean, but during an API call that downloads the results there is no while loop.
The threading documentation, however, points you back to tasks if you need to get a return value.
What you can do with Tasks is create them and then cancel them when they are no longer needed. If you can't cancel the operation itself (like a database query), you can always cancel before the results get returned. Your code might look something like this (not tested, just a draft):
var tokenSource2 = new CancellationTokenSource();
CancellationToken ct = tokenSource2.Token;
var task = Task.Factory.StartNew(() =>
{
ct.ThrowIfCancellationRequested();
var result = Database.GetResult(); // whatever database query method you use.
ct.ThrowIfCancellationRequested();
return result;
}, tokenSource2.Token);
So as you can see, it will query the database and return the value if no cancellation was requested, but if you cancel the task it will not return a value; instead it will throw an OperationCanceledException that you need to catch. For details see MSDN's Task Cancellation documentation, but I think this should give you an idea.
Don't worry about the number of tasks: if your query is not very slow it won't matter, because the user won't be able to trigger that many searches. If you have an asynchronous way of querying the database you can improve this code a bit more, but that also shouldn't be too hard.
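To tie that back to your userPaused sketch, here is a rough, untested outline of how the handler could cancel the previous search and start a new one without blocking the UI. The fields and the GetSearchResults/DisplayResults methods are illustrative names, not an exact API:
private CancellationTokenSource _searchCts;   // illustrative fields, not from the original code
private Task<string> _searchTask;

void UserPaused(string search)
{
    // Cancel the previous search, if any, and start a new one.
    if (_searchCts != null) _searchCts.Cancel();
    _searchCts = new CancellationTokenSource();
    var ct = _searchCts.Token;

    _searchTask = Task.Factory.StartNew(() =>
    {
        ct.ThrowIfCancellationRequested();
        return GetSearchResults(search);      // your existing (blocking) search call
    }, ct);

    // Handle the result (or the cancellation) back on the UI thread without blocking it.
    _searchTask.ContinueWith(t =>
    {
        if (t.Status == TaskStatus.RanToCompletion)
            DisplayResults(t.Result);         // hypothetical UI update method
    }, CancellationToken.None, TaskContinuationOptions.None,
       TaskScheduler.FromCurrentSynchronizationContext());
}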
I am currently learning C# Async/Await feature and can see its usefulness in GUI and web apps but I am still trying to figure out its real usefulness in Console apps. Can you give an example that drives home the point?
Async allows more code to run until the tasks are awaited, so if there is other code that can run at the same time (meaning it does not depend on the other task), it can start right away.
for example:
public async Task<string> GetUserFullNameAsync(string firstName)
{
    return await GetUserFullNameAsyncInner(firstName); // gets the user's full name from the db in an async fashion - takes 4 seconds
}

public async Task<DateTime> GetFlightTimeAsync(string flightName)
{
    return await GetFlightTimeAsyncInner(flightName); // gets the flight time from the db in an async fashion - takes 4 seconds
}

public async Task<UserDetails> GetUserDetailsAsync(string userFullName)
{
    return await GetUserDetailsAsyncInner(userFullName); // gets the user details by full name from the db in an async fashion - takes 4 seconds
}
Let's look at this function:
public async Task<UserDetails> GetUserDetails(string firstName)
{
    var userFullName = await GetUserFullNameAsync(firstName);
    return await GetUserDetailsAsync(userFullName);
}
Notice how GetUserDetailsAsync depends on getting the full name first, via GetUserFullNameAsync.
So if you need the UserDetails object, you have to wait for GetUserFullNameAsync to finish. That may take some time - especially for heavier actions like video processing and such.
In this example: 4 seconds for the first call + 4 seconds for the second = 8 seconds.
Now let's look at this second function:
public async Task<FlightDetails> GetUserFlightDetails(string firstName, string flightName)
{
    var userFullNameTask = GetUserFullNameAsync(firstName);
    var flightTimeTask = GetFlightTimeAsync(flightName);
    await Task.WhenAll(userFullNameTask, flightTimeTask);
    return new FlightDetails(await userFullNameTask, await flightTimeTask);
}
Notice that GetFlightTimeAsync does not depend on any other function, so if you need, say, the user's full name and the flight time, you can start both operations and let them run concurrently - hence the total time to wait is shorter than getting the full name and then getting the flight time.
4 seconds for the first call and 4 seconds for the second, run concurrently, comes to roughly 4 seconds in total rather than 8.
Let's look at asynchronous programming from a different angle than just a way of doing things in parallel. Yes, you can run tasks in parallel, but you can find plenty of code that uses async/await yet awaits every asynchronous call immediately.
What is the point of doing that? There is no parallel execution there...
It is all about making better use of available system resources, especially threads.
Once execution reaches asynchronous code, the thread can be released - and threads are a limited system resource. By releasing the thread while it idles waiting for I/O-bound work to complete, it can be used to serve another request. This also protects against usage bursts, since the scheduler doesn't suddenly find itself starved of threads to serve new requests.
Choosing an async operation instead of a synchronous one doesn't speed up the operation. It will take the same amount of time (or even more). It just enables that thread to continue executing some other CPU bound work instead of wasting resources.
If you have any I/O-bound needs (such as requesting data from a network, accessing a database, or reading and writing to a file system), you'll want to utilize asynchronous programming. No matter if the application is a console one or not.
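For a console app, a minimal sketch (the URLs are just placeholders) could look like this; while both requests are in flight, no thread is blocked waiting on them:
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        using (var http = new HttpClient())
        {
            // Kick off two I/O-bound requests; no thread sits blocked while they are in flight.
            Task<string> first = http.GetStringAsync("https://example.com");   // placeholder URLs
            Task<string> second = http.GetStringAsync("https://example.org");

            // Total elapsed time is roughly the slower request, not the sum of both.
            string[] pages = await Task.WhenAll(first, second);

            Console.WriteLine($"Downloaded {pages[0].Length} and {pages[1].Length} characters.");
        }
    }
}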
Bonus: if you are wondering, "OK, my application released the thread, but there must be some other thread that is really doing the waiting!", have a look at this article from Stephen Cleary.
I already have some experience in working with threads in Windows but most of that experience comes from using Win32 API functions in C/C++ applications. When it comes to .NET applications however, I am often not sure about how to properly deal with multithreading. There are threads, tasks, the TPL and all sorts of other things I can use for multithreading but I never know when to use which of those options.
I am currently working on a C# based Windows service which needs to periodically validate different groups of data from different data sources. Implementing the validation itself is not really an issue for me but I am unsure about how to handle all of the validations running simultaneously.
I need a solution for this which allows me to do all of the following things:
Run the validations at different (predefined) intervals.
Control all of the different validations from one place so I can pause and/or stop them if necessary, for example when a user stops or restarts the service.
Use the system resources as efficiently as possible to avoid performance issues.
So far I've only had one similar project before where I simply used Thread objects combined with a ManualResetEvent and a Thread.Join call with a timeout to notify the threads about when the service is stopped. The logic inside those threads to do something periodically then looked like this:
while (!shutdownEvent.WaitOne(0))
{
if (DateTime.Now > nextExecutionTime)
{
// Do something
nextExecutionTime = nextExecutionTime.AddMinutes(interval);
}
Thread.Sleep(1000);
}
While this did work as expected, I've often heard that using threads directly like this is considered "old school" or even bad practice. I also think that this solution does not use threads very efficiently, as they are just sleeping most of the time. How can I achieve something like this in a more modern and efficient way?
If this question is too vague or opinion-based then please let me know and I will try my best to make it as specific as possible.
The question feels a bit broad, but we can take the provided code and try to improve it.
Indeed, the problem with the existing code is that for the majority of the time it keeps a thread blocked while doing nothing useful (sleeping). The thread also wakes up every second only to check the interval and, in most cases, go back to sleep because it's not validation time yet. Why does it do that? Because if you slept for a longer period, you might block for a long time when you signal shutdownEvent and then join the thread; Thread.Sleep doesn't provide a way to be interrupted on request.
To solve both problems we can use:
The cooperative cancellation mechanism, in the form of CancellationTokenSource + CancellationToken.
Task.Delay instead of Thread.Sleep.
For example:
async Task ValidationLoop(CancellationToken ct) {
while (!ct.IsCancellationRequested) {
try {
var now = DateTime.Now;
if (now >= _nextExecutionTime) {
// do something
_nextExecutionTime = _nextExecutionTime.AddMinutes(1);
}
var waitFor = _nextExecutionTime - now;
if (waitFor.Ticks > 0) {
await Task.Delay(waitFor, ct);
}
}
catch (OperationCanceledException) {
// expected, just exit
// otherwise, let it go and handle cancelled task
// at the caller of this method (returned task will be cancelled).
return;
}
catch (Exception) {
// either have global exception handler here
// or expect the task returned by this method to fail
// and handle this condition at the caller
}
}
}
Now we do not hold a thread any more, because await Task.Delay does not block one. Instead, after the specified time interval it will execute the subsequent code on a free thread-pool thread (it's more complicated than this, but we won't go into details here).
We also don't need to wake up every second for no reason, because Task.Delay accepts a cancellation token as a parameter. When that token is signalled, Task.Delay is immediately interrupted with an exception, which we expect, and we break out of the validation loop.
To stop the provided loop you need to use CancellationTokenSource:
private readonly CancellationTokenSource _cts = new CancellationTokenSource();
You pass its _cts.Token into the provided method. Then, when you want to signal cancellation, just call:
_cts.Cancel();
To further improve resource usage: if your validation code performs any IO (reading files from disk, network calls, database access, etc.), use the async versions of those operations. Then no thread is kept blocked while waiting on IO either.
Now you don't need to manage threads yourself any more; instead you operate in terms of the tasks you need to perform, letting the framework / OS manage threads for you.
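For completeness, a rough sketch of how the service could wire this up, assuming a classic ServiceBase-derived Windows service (the member names here are illustrative, not from your code):
private readonly CancellationTokenSource _cts = new CancellationTokenSource();
private Task _validationTask;

protected override void OnStart(string[] args)
{
    // Kick off the loop; don't block here - keep the returned task so we can observe it on stop.
    _validationTask = ValidationLoop(_cts.Token);
}

protected override void OnStop()
{
    _cts.Cancel();
    try
    {
        // Give the loop a moment to finish the current iteration and exit.
        _validationTask.Wait(TimeSpan.FromSeconds(10));
    }
    catch (AggregateException)
    {
        // A faulted task surfaces here; log it if needed.
    }
}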
You should use Microsoft's Reactive Framework (aka Rx) - NuGet System.Reactive and add using System.Reactive.Linq; - then you can do this:
Subject<bool> starter = new Subject<bool>();
IObservable<Unit> query =
starter
.StartWith(true)
.Select(x => x
? Observable.Interval(TimeSpan.FromSeconds(5.0)).SelectMany(y => Observable.Start(() => Validation()))
: Observable.Never<Unit>())
.Switch();
IDisposable subscription = query.Subscribe();
That fires off the Validation() method every 5.0 seconds.
When you need to pause and resume, do this:
starter.OnNext(false);
// Now paused
starter.OnNext(true);
// Now restarted.
When you want to stop it all call subscription.Dispose().
Using: ASP.NET Core, Entity Framework Core, ABP 4.5
I have a user registration and initialization flow, but it takes a long time. I want to parallelize it. The complication is that the steps update the same entity, just different fields.
My goal:
1. The endpoint should respond as soon as possible;
2. Long initialization is processed in the background;
Code-before (minor details omitted for brevity)
public async Task<ResponceDto> Rgistration(RegModel input)
{
var user = await _userRegistrationManager.RegisterAsync(input.EmailAddress, input.Password, false );
var result = await _userManager.AddToRoleAsync(user, defaultRoleName);
user.Code = GenerateCode();
await SendEmail(user.EmailAddress, user.Code);
await AddSubEntities(user);
await AddSubCollectionEntities(user);
await CurrentUnitOfWork.SaveChangesAsync();
return user.MapTo<ResponceDto>();
}
private async Task AddSubEntities(User user)
{
var newSubEntity = new newSubEntity { User = user, UserId = user.Id };
await _subEntityRepo.InsertAsync(newSubEntity);
//few another One-to-One entities...
}
private async Task AddSubCollectionEntities(User user)
{
List<AnotherEntity> collection = GetSomeCollection(user.Type);
await _anotherEntitieRepo.GetDbContext().AddRangeAsync(collection);
//few another One-to-Many collections...
}
Attempted change:
public async Task<ResponceDto> Rgistration(RegModel input)
{
var user = await _userRegistrationManager.RegisterAsync(input.EmailAddress, input.Password, false );
Task.Run(async () => {
var result = await _userManager.AddToRoleAsync(user, defaultRoleName);
});
Task.Run(async () => {
user.Code = GenerateCode();
await SendEmail(user.EmailAddress, user.Code);
});
Task.Run(async () => {
using (var unitOfWork = UnitOfWorkManager.Begin())
{//long operation. defalt unitOfWork out of scope
try
{
await AddSubEntities(user);
}
finally
{
unitOfWork.Complete();
}
}
});
Task.Run(async () => {
using (var unitOfWork = UnitOfWorkManager.Begin())
{
try
{
await AddSubCollectionEntities(user);
}
finally
{
unitOfWork.Complete();
}
}
});
await CurrentUnitOfWork.SaveChangesAsync();
return user.MapTo<ResponceDto>();
}
Errors:
Here I get a lot of different concurrency-related errors. The most frequent:
A second operation started on this context before a previous operation completed. This is usually caused by different threads using the same instance of DbContext.
In a few registration calls: Cannot insert duplicate key row in object 'XXX' with unique index 'YYY'. The duplicate key value is (70). The statement has been terminated.
I thought every request on the server ran in its own thread, but apparently not.
Or all users are successfully registered, but some sub-entity is missing from the database. It would be much better for the registration to fail outright than to have to figure out afterwards where the initialization went wrong =(
How can I keep the user entity "open" for updates from this request and at the same time "closed" to changes initiated by other requests? How can I make this code thread-safe and fast? Can anyone help with advice?
Using Task.Run in ASP.NET is rarely a good idea.
Async methods run on the thread pool anyway, so wrapping them in Task.Run is simply adding overhead without any benefit.
The purpose of using async in ASP.NET is simply to prevent threads being blocked so they are able to serve other HTTP requests.
Ultimately, your database is the bottleneck; if all these operations need to happen before you return a response to the client, then there's not much you can do other than to let them happen.
If it is possible to return early and allow some operations to continue running on the background, then there are details here showing how that can be done.
Task.Run is not the same as running in parallel. It queues the work on a thread-pool thread, and since you're not awaiting it, the rest of the code can move on; but that also means you're essentially orphaning that work. When the action returns, all the scoped services will be disposed, which includes things like your context. Any of those tasks that haven't finished yet will error out as a result.
The thread pool is a limited resource, and within the context of a web application, it equates directly to the throughput of your server. Every thread you take is one less request you can service. As a result, you're more likely to end up queuing requests, which will only add to processing time. It's virtually never appropriate to use Task.Run in a web environment.
Also, EF Core (or old EF, for that matter) does not support parallelization. So, even without the other problems described above, it will stop you cold from doing what you're trying to do here, regardless.
The queries you have here are not complex. Even if you were trying to insert 100s of things at once, it should still take only milliseconds to complete. If there is any significant delay here, you need to look at the resources of your database server and your network latency, first.
More likely than not, the slow-down is coming from the sending of the email. That too can likely be optimized, though. I was in a situation once where it was taking emails 30 seconds to send, until I finally figured out that it was an issue with our Exchange server, where an IT admin had idiotically introduced a 30 second delay on purpose. Regardless, it is generally always preferable to background things like sending emails, since they aren't core to your app's functionality. However, that means actually processing them in background, i.e. queue them and process them via something like a hosted service or an entirely different worker process.
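To illustrate that last point, one possible shape for backgrounding the email is a queue drained by a hosted service. This sketch uses System.Threading.Channels and assumes .NET Core 3.0 or later; all type and method names here are assumptions, not part of the question's code base:
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

// Registered in Startup: services.AddSingleton<EmailQueue>(); services.AddHostedService<EmailSender>();
public class EmailQueue
{
    private readonly Channel<(string Address, string Code)> _channel =
        Channel.CreateUnbounded<(string Address, string Code)>();

    // Called from the registration endpoint: enqueue and return immediately.
    public void Enqueue(string address, string code) =>
        _channel.Writer.TryWrite((address, code));

    public IAsyncEnumerable<(string Address, string Code)> ReadAllAsync(CancellationToken ct) =>
        _channel.Reader.ReadAllAsync(ct);
}

public class EmailSender : BackgroundService
{
    private readonly EmailQueue _queue;
    public EmailSender(EmailQueue queue) => _queue = queue;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Drain the queue in the background, outside the request pipeline.
        await foreach (var item in _queue.ReadAllAsync(stoppingToken))
        {
            await SendEmail(item.Address, item.Code);   // your existing email-sending logic
        }
    }

    private Task SendEmail(string address, string code) => Task.CompletedTask; // placeholder
}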
I'm doing some tests with the new "Background tasks with hosted services in ASP.NET Core" feature introduced in version 2.1, more specifically with queued background tasks, and a question about parallelism came to my mind.
I'm currently following the tutorial provided by Microsoft strictly, and when trying to simulate a workload with several requests from the same user enqueueing tasks, I noticed that all work items are executed in order, so there is no parallelism.
My question is: is this behavior expected? And if so, in order to make the execution parallel, is it OK to fire and forget instead of awaiting each work item to complete?
I've searched for a couple of days about this specific scenario without luck, so if anyone has any guide or examples to provide, I would be really glad.
Edit: The code from the tutorial is quite long, so the link for it is https://learn.microsoft.com/en-us/aspnet/core/fundamentals/host/hosted-services?view=aspnetcore-2.1#queued-background-tasks
The method which executes the work item is this:
public class QueuedHostedService : IHostedService
{
...
public Task StartAsync(CancellationToken cancellationToken)
{
_logger.LogInformation("Queued Hosted Service is starting.");
_backgroundTask = Task.Run(BackgroundProcessing);
return Task.CompletedTask;
}
private async Task BackgroundProcessing()
{
while (!_shutdown.IsCancellationRequested)
{
var workItem =
await TaskQueue.DequeueAsync(_shutdown.Token);
try
{
await workItem(_shutdown.Token);
}
catch (Exception ex)
{
_logger.LogError(ex,
$"Error occurred executing {nameof(workItem)}.");
}
}
}
...
}
The main point of the question is whether anyone could share how to use this specific technology to execute several work items at the same time, since a server can handle that workload.
I tried the fire-and-forget approach when executing the work items and it worked the way I intended: several tasks executing in parallel at the same time. I'm just not sure whether this is an acceptable practice, or whether there is a better or more proper way of handling this situation.
The code you posted executes the queued items in order, one at a time, but in parallel to the web server. An IHostedService by definition runs in parallel to the web server. This article provides a good overview.
Consider the following example:
_logger.LogInformation ("Before()");
for (var i = 0; i < 10; i++)
{
var j = i;
_backgroundTaskQueue.QueueBackgroundWorkItem (async token =>
{
var random = new Random();
await Task.Delay (random.Next (50, 1000), token);
_logger.LogInformation ($"Event {j}");
});
}
_logger.LogInformation ("After()");
We add ten tasks, each of which waits a random amount of time. If you put this code in a controller method, the events will still be logged even after the controller method returns. But each item is executed in order, so the output looks like this:
Event 0
Event 1
...
Event 8
Event 9
In order to introduce parallelism we have to change the implementation of the BackgroundProcessing method in the QueuedHostedService.
Here is an example implementation that allows two Tasks to be executed in parallel:
private async Task BackgroundProcessing()
{
    // Allow at most two work items to run at the same time.
    var semaphore = new SemaphoreSlim(2);

    void HandleTask(Task task)
    {
        // Free a slot once a work item finishes (successfully or not).
        semaphore.Release();
    }

    while (!_shutdown.IsCancellationRequested)
    {
        await semaphore.WaitAsync();
        var item = await TaskQueue.DequeueAsync(_shutdown.Token);

        var task = item(_shutdown.Token);
        task.ContinueWith(HandleTask);
    }
}
Using this implementation, the logged events are no longer in order, because each task waits a random amount of time. So the output could be:
Event 0
Event 1
Event 2
Event 3
Event 4
Event 5
Event 7
Event 6
Event 9
Event 8
Edit: Is it OK in a production environment to execute code this way, without awaiting it?
I think the reason why most devs have a problem with fire-and-forget is that it is often misused.
When you execute a Task using fire-and-forget you are basically telling me that you do not care about the result of this function. You do not care if it exits successfully, if it is canceled or if it threw an exception. But for most Tasks you do care about the result.
You do want to make sure a database write went through
You do want to make sure a Log entry is written to the hard drive
You do want to make sure a network packet is sent to the receiver
And if you care about the result of the Task then fire-and-forget is the wrong method.
That's it in my opinion. The hard part is finding a Task where you really do not care about the result of the Task.
You can add the QueuedHostedService once or twice for every CPU in the machine.
So something like this:
for (var i=0;i<Environment.ProcessorCount;++i)
{
services.AddHostedService<QueuedHostedService>();
}
You can hide this in an extension method and make the concurrency level configurable to keep things clean.
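A minimal sketch of that extension-method idea (the method name is just illustrative):
using Microsoft.Extensions.DependencyInjection;

public static class QueuedHostedServiceExtensions
{
    public static IServiceCollection AddQueuedHostedServices(
        this IServiceCollection services, int concurrencyLevel)
    {
        // Mirrors the loop above: one QueuedHostedService per requested slot.
        // Note: ASP.NET Core 2.x registers a new hosted service instance per call;
        // later versions deduplicate AddHostedService by type, in which case you may
        // need services.AddSingleton<IHostedService, QueuedHostedService>() instead.
        for (var i = 0; i < concurrencyLevel; i++)
        {
            services.AddHostedService<QueuedHostedService>();
        }
        return services;
    }
}

// Usage in Startup.ConfigureServices:
// services.AddQueuedHostedServices(Environment.ProcessorCount);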
I want to know when some parallel tasks are completed.
I'm using this code to make between 1,500 and 2,000 small WebClient.DownloadString calls, with a 10-second HTTP request timeout, against a website:
Task.Factory.StartNew(() =>
Parallel.ForEach<string>(myKeywords, new ParallelOptions
{ MaxDegreeOfParallelism = 5 }, getKey));
Sometimes a query fails, so there are exceptions and the function never finishes, and the UI refresh inside each getKey call sometimes seems to fire twice, so I cannot get an accurate idea of how many tasks have completed. I'm calculating: number of UI refresh calls / total number of keywords, and I get a result between 100% and 250%, and I never know when the tasks are complete. I searched through a lot of SO discussions but none offered a direct method or a workaround that suits my needs. So I guess Framework 4.0 doesn't provide any Tasks.AllCompleted event handler or similar?
Should I run my Parallel.ForEach in another thread instead of my UI thread and then call:
myTasks.WaitAll
[EDIT]
A temporary solution was to copy my list of strings into an ArrayList and then remove each item from the list, one by one, at the beginning of each query. Whether the function worked or not, I know when all items have been processed.
Parallel.ForEach is no different from other loops when it comes to handling exceptions. If an exception is thrown, it is going to stop processing of the loop. This is probably why you're seeing variances in the percentages (I assume you might be updating the count as you process the loop).
Also, you don't really need Parallel.ForEach, because the calls that you're making on the WebClient class are waiting on IO completion (the network responses); they are not computationally bound (Parallel.ForEach is much better when you are computationally bound).
That said, you should first translate your calls to WebClient to use Task<TResult>. Translating the event-based asynchronous pattern to the task-based asynchronous pattern is simple with the use of the TaskCompletionSource<TResult> class.
Assuming that you have a sequence of Uri instances that are produced as a result of your calls to getKey, you can create a function to do this:
static Task<string> DownloadStringAsync(Uri uri)
{
    // Create a WebClient.
    var wc = new WebClient();

    // Set up your web client here (headers, credentials, etc.).

    // Create the TaskCompletionSource.
    var tcs = new TaskCompletionSource<string>();

    // Set the event handler on the web client.
    wc.DownloadStringCompleted += (s, e) => {
        // Dispose of the WebClient when done.
        using (wc)
        {
            // Set the task completion source based on the event.
            if (e.Cancelled)
            {
                // Set cancellation.
                tcs.SetCanceled();
                return;
            }

            // Exception?
            if (e.Error != null)
            {
                // Set exception.
                tcs.SetException(e.Error);
                return;
            }

            // Set result.
            tcs.SetResult(e.Result);
        }
    };

    // Start the download.
    wc.DownloadStringAsync(uri);

    // Return the task.
    return tcs.Task;
}
Note: the above can be optimized to use one WebClient; that is left as an exercise for you (assuming your tests show you need it).
From there, you can get a sequence of Task<string>:
// Gotten from myKeywords
IEnumerable<Uri> uris = ...;
// The tasks.
Task<string>[] tasks = uris.Select(DownloadStringAsync).ToArray();
Note that the call to the ToArray extension method is what forces the tasks to start running; it gets around deferred execution. You don't strictly have to call ToArray, but you must call something that enumerates the entire sequence and thereby causes the tasks to start.
Once you have these Task<string> instances, you can wait on them all to complete by calling the ContinueWhenAll<TAntecedentResult> method on the TaskFactory class, like so:
Task.Factory.ContinueWhenAll(tasks, a => { }).Wait();
When this is done, you can cycle through the tasks array and look at the Exception and/or Result properties to check to see what the exception or result was.
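For example, that final inspection could be a simple loop like this:
foreach (var task in tasks)
{
    if (task.IsFaulted)
    {
        // The underlying exception is wrapped in an AggregateException.
        Console.WriteLine("Download failed: " + task.Exception.GetBaseException().Message);
    }
    else if (!task.IsCanceled)
    {
        Console.WriteLine("Downloaded " + task.Result.Length + " characters.");
    }
}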
If you are updating a user interface, then you should look at intercepting the call to Enumerable.Select, namely, you should call the ContinueWith<TNewResult> method on the Task<TResult> to perform an operation when that download is complete, like so:
// The tasks.
Task<string>[] tasks = uris.
    Select(DownloadStringAsync).
    // Select receives a Task<string> here, continue that.
    Select(t => t.ContinueWith(t2 => {
        // Do something here:
        // - increment a count
        // - fire an event
        // - update the UI
        // Note that you have to take care of synchronization here, so
        // make sure to synchronize access to a count, or serialize calls
        // to the UI thread appropriately with a SynchronizationContext.
        ...

        // Return the result, this ensures that you'll still have a
        // Task<string> waiting.
        return t2.Result;
    })).
    ToArray();
This will allow you to update things as they happen. Note that in the above case, if you call Select again, you might want to check the state of t2 and fire some other events, depending on what you want your error handling mechanism to be.