Rx and tasks - cancel running task when new task is spawned? - c#

I have a user interaction scenario I'd like to handle with Rx.
The scenario is similar to the canonical "when the user stops typing, do some work" (usually, search for what the user has typed so far) (1), but I also need to:
(2) only get the latest of the results of "do some work" units (see below)
(3) when a new unit of work starts, cancel any work in progress (in my case it's CPU intensive)
For (1) I use an IObservable for the user events, throttled with .Throttle() to only trigger on pauses between events ("user stops typing").
From that, I .Select(_ => CreateMyTask(...).ToObservable()).
This gives me an IObservable<IObservable<T>> where each of the inner observables wraps a single task.
To get (2) I finally apply .Switch() to only get the results from the newest unit of work.
What about (3), cancelling pending tasks?
If I understand correctly, whenever there's a new inner IObservable<T>, the .Switch() method subscribes to it and unsubscribes from the previous one(s), causing them to Dispose().
Maybe that can be somehow wired to trigger the task to cancel?

You can just use Observable.FromAsync, which will generate tokens that are cancelled when the observer unsubscribes:
input.Throttle(...)
.Select(_ => Observable.FromAsync(token => CreateMyTask(..., token)))
.Switch()
.Subscribe(...);
This will generate a new token for each unit of work and cancel it every time Switch switches to the new one.
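As a rough, self-contained sketch of the same idea (assuming the System.Reactive NuGet package): SearchAsync and the Subject-based input below are hypothetical stand-ins for CreateMyTask and the throttled user events, not part of the original code. The second OnNext arrives while the first query is still running, so Switch unsubscribes from the first inner observable and FromAsync cancels its token.

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Threading;
using System.Threading.Tasks;

static class SwitchCancellationDemo
{
    // Hypothetical stand-in for CreateMyTask: simulated work that honours the token.
    static async Task<string> SearchAsync(string term, CancellationToken token)
    {
        try
        {
            await Task.Delay(500, token); // placeholder for the real work
            return "results for " + term;
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("cancelled: " + term);
            throw;
        }
    }

    static void Main()
    {
        // Hypothetical input source; in the question this is the throttled event stream.
        var input = new Subject<string>();

        using (input
            .Select(term => Observable.FromAsync(token => SearchAsync(term, token)))
            .Switch()
            .Subscribe(result => Console.WriteLine(result)))
        {
            input.OnNext("r");
            input.OnNext("rx");   // supersedes the "r" query while it is still running
            Thread.Sleep(1000);   // give the surviving query time to finish
        }
    }
}
```

For CPU-bound work, the task body would need to poll token.IsCancellationRequested (or call token.ThrowIfCancellationRequested()) periodically, since cancellation is cooperative.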

Do you have to work with Tasks?
If you're happy to work purely with Observables then you can do this nicely yourself.
Try doing something like this:
var query =
Observable.Create<int>(o =>
{
var cancelling = false;
var cancel = Disposable.Create(() =>
{
cancelling = true;
});
var subscription = Observable.Start(() =>
{
for (var i = 0; i < 100; i++)
{
Thread.Sleep(10); //1000 ms in total
if (cancelling)
{
Console.WriteLine("Cancelled on {0}", i);
return -1;
}
}
Console.WriteLine("Done");
return 42;
}).Subscribe(o);
return new CompositeDisposable(cancel, subscription);
});
This observable simulates hard work in the for loop with Thread.Sleep(10), but when the subscription is disposed the loop exits and the CPU-intensive work ceases. You can then rely on the standard Rx disposal performed by Switch to cancel the in-progress work.
If you'd like that bundled up in a method, then try this:
public static IObservable<T> Start<T>(Func<Func<bool>, T> work)
{
return Observable.Create<T>(o =>
{
var cancelling = false;
var cancel = Disposable
.Create(() => cancelling = true);
var subscription = Observable
.Start(() => work(() => cancelling))
.Subscribe(o);
return new CompositeDisposable(cancel, subscription);
});
}
And then call it with a function like this:
Func<Func<bool>, int> work = cancelling =>
{
for (var i = 0; i < 100; i++)
{
Thread.Sleep(10); //1000 ms in total
if (cancelling())
{
Console.WriteLine("Cancelled on {0}", i);
return -1;
}
}
Console.WriteLine("Done");
return 42;
};
Here's my code that proved this worked:
var disposable =
ObservableEx
.Start(work)
.Subscribe(x => Console.WriteLine(x));
Thread.Sleep(500);
disposable.Dispose();
I got "Cancelled on 50" (sometimes "Cancelled on 51") as my output.

Related

Set property value when all threads are finished?

In my application there are three threads like:
private Thread _analysisThread;
private Thread _head2HeadThread;
private Thread _formThread;
and each thread is started in the following way:
if (_analysisThread == null || !_analysisThread.IsAlive)
{
_analysisThread = new Thread(() => { Analysis.Logic(match); });
_analysisThread.Start();
}
I have a ListView where the user can select an item and start the threads again, but I want to prevent this because the methods inside each thread are heavy and need time to complete.
For now I want to disable the ListView selection, so I did:
<ListView IsEnabled="{Binding IsMatchListEnabled}">
private bool _isMatchListEnabled = true;
public bool IsMatchListEnabled
{
get { return _isMatchListEnabled; }
set
{
_isMatchListEnabled = value;
OnPropertyChanged();
}
}
Before a new thread starts I set IsMatchListEnabled = false; what I need is to check when all threads have finished and then set IsMatchListEnabled = true. Currently, if I re-enable the ListView right after starting the threads, it is enabled immediately, because the thread code runs asynchronously while the code outside the threads runs synchronously, so the property is effectively useless.
What I tried, to avoid this, is creating a loop like this:
while (true)
{
if (!_analysisThread.IsAlive && !_head2HeadThread.IsAlive && !_formThread.IsAlive)
{
IsMatchListEnabled = true;
break;
}
}
This loop is placed after all the threads are started but, as you can imagine, it freezes the application.
Any solution?
All comments are correct: it's better to use Tasks. Just to answer the OP's question:
You can synchronize threads with ManualResetEvent: keep one event per thread, plus one additional thread that changes IsMatchListEnabled when all threads are finished.
public static void SomeThreadAction(object id)
{
var ev = new ManualResetEvent(false);
events[id] = ev; // store the event somewhere
Thread.Sleep(2000 * (int)id); // do your work
ev.Set(); // set the event signaled
}
Then, somewhere else, we need to initialize the waiting routine.
// we need tokens to be able to cancel waiting
var cts = new CancellationTokenSource();
var ct = cts.Token;
Task.Factory.StartNew(() =>
{
bool completed = false;
while (!ct.IsCancellationRequested && !completed)
{
// will check if our routine is cancelled each second
completed =
WaitHandle.WaitAll(
events.Values.Cast<ManualResetEvent>().ToArray(),
TimeSpan.FromSeconds(1));
}
if (completed) // if not completed, then somebody cancelled our routine
; // change your variable here
});
Complete example can be found and viewed here.
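For comparison, the Task-based approach recommended in the comments could look roughly like the sketch below. The three worker methods are hypothetical placeholders for Analysis.Logic and the other two workloads, and the view model is reduced to the single property (no OnPropertyChanged). In a UI application, the code after the await would normally resume on the UI thread, so the property can be set there directly.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class MatchViewModel
{
    public bool IsMatchListEnabled { get; private set; } = true;

    // Hypothetical placeholders for the three heavy operations.
    static void Analysis() => Thread.Sleep(300);
    static void HeadToHead() => Thread.Sleep(200);
    static void Form() => Thread.Sleep(100);

    public async Task RunAllAsync()
    {
        IsMatchListEnabled = false;
        try
        {
            // Run all three in parallel and resume only when every one has finished.
            await Task.WhenAll(
                Task.Run(() => Analysis()),
                Task.Run(() => HeadToHead()),
                Task.Run(() => Form()));
        }
        finally
        {
            // Re-enable the list even if one of the operations threw.
            IsMatchListEnabled = true;
        }
    }

    static async Task Main()
    {
        var vm = new MatchViewModel();
        await vm.RunAllAsync();
        Console.WriteLine(vm.IsMatchListEnabled); // True
    }
}
```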
I would suggest using Microsoft's Reactive Framework for this. It's more powerful than tasks and the code is far simpler than using threads.
Let's say you have 3 long-running operations:
Action huey = () => { Console.WriteLine("Huey Start"); Thread.Sleep(5000); Console.WriteLine("Huey Done"); };
Action dewey = () => { Console.WriteLine("Dewey Start"); Thread.Sleep(5000); Console.WriteLine("Dewey Done"); };
Action louie = () => { Console.WriteLine("Louie Start"); Thread.Sleep(5000); Console.WriteLine("Louie Done"); };
Now you can write the following simple query:
IObservable<Unit> query =
from a in new [] { huey, dewey, louie }.ToObservable()
from u in Observable.Start(() => a())
select u;
You run it like this:
Stopwatch sw = Stopwatch.StartNew();
IDisposable subscription = query.Subscribe(u => { }, () =>
{
Console.WriteLine("All Done in {0} seconds.", sw.Elapsed.TotalSeconds);
});
The results I get are:
Huey Start
Dewey Start
Louie Start
Huey Done
Louie Done
Dewey Done
All Done in 5.0259197 seconds.
Three 5 second operations complete in 5.03 seconds. All in parallel.
If you want to stop the computation early just call subscription.Dispose().
NuGet "System.Reactive" to get the bits.

Run X number of Task<T> at any given time while keeping UI responsive

I have a C# WinForms (.NET 4.5.2) app utilizing the TPL. The tool has a synchronous function which is passed over to a task factory X amount of times (with different input parameters), where X is a number declared by the user before commencing the process. The tasks are started and stored in a List<Task>.
Assuming the user entered 5, we have this in an async button click handler:
for (int i = 0; i < X; i++)
{
var progress = Progress(); // returns a new IProgress<T>
var task = Task<int>.Factory.StartNew(() => MyFunction(progress), TaskCreationOptions.LongRunning);
TaskList.Add(task);
}
Each progress instance updates the UI.
Now, as soon as a task is finished, I want to fire up a new one. Essentially, the process should run indefinitely, having X tasks running at any given time, unless the user cancels via the UI (I'll use cancellation tokens for this). I try to achieve this using the following:
while (TaskList.Count > 0)
{
var completed = await Task.WhenAny(TaskList.ToArray());
if (completed.Exception == null)
{
// report success
}
else
{
// flatten AggregateException, print out, etc
}
// update some labels/textboxes in the UI, and then:
TaskList.Remove(completed);
var task = Task<int>.Factory.StartNew(() => MyFunction(progress), TaskCreationOptions.LongRunning);
TaskList.Add(task);
}
This is bogging down the UI. Is there a better way of achieving this functionality, while keeping the UI responsive?
A suggestion was made in the comments to use TPL Dataflow, but due to time constraints and specs, alternative solutions are welcome.
Update
I'm not sure whether the progress reporting might be the problem? Here's what it looks like:
private IProgress<string> Progress()
{
return new Progress<string>(msg =>
{
txtMsg.AppendText(msg);
});
}
Now, as soon as a task is finished, I want to fire up a new one. Essentially, the process should run indefinitely, having X tasks running at any given time
It sounds to me like you want an infinite loop inside your task:
for (int i = 0; i < X; i++)
{
var progress = Progress(); // returns a new IProgress<T>
var task = RunIndefinitelyAsync(progress);
TaskList.Add(task);
}
private async Task RunIndefinitelyAsync(IProgress<string> progress)
{
while (true)
{
try
{
await Task.Run(() => MyFunction(progress));
// handle success
}
catch (Exception ex)
{
// handle exceptions
}
// update some labels/textboxes in the UI
}
}
However, I suspect that the "bogging down the UI" is probably in the // handle success and/or // handle exceptions code. If my suspicion is correct, then push as much of the logic into the Task.Run as possible.
As I understand it, you simply need parallel execution with a defined degree of parallelism. There are a lot of ways to implement what you want. I suggest using a blocking collection and the Parallel class instead of tasks.
So when user clicks button, you need to create a new blocking collection which will be your data source:
BlockingCollection<IProgress> queue = new BlockingCollection<IProgress>();
CancellationTokenSource source = new CancellationTokenSource();
Now you need a runner that will execute your work in parallel:
Task.Factory.StartNew(() =>
Parallel.For(0, X, i =>
{
foreach (IProgress p in queue.GetConsumingEnumerable(source.Token))
{
MyFunction(p);
}
}), source.Token);
Or you can choose the more correct way, with a partitioner. You'll need a partitioner class:
private class BlockingPartitioner<T> : Partitioner<T>
{
private readonly BlockingCollection<T> _Collection;
private readonly CancellationToken _Token;
public BlockingPartitioner(BlockingCollection<T> collection, CancellationToken token)
{
_Collection = collection;
_Token = token;
}
public override IList<IEnumerator<T>> GetPartitions(int partitionCount)
{
throw new NotImplementedException();
}
public override IEnumerable<T> GetDynamicPartitions()
{
return _Collection.GetConsumingEnumerable(_Token);
}
public override bool SupportsDynamicPartitions
{
get { return true; }
}
}
And the runner will look like this:
ParallelOptions Options = new ParallelOptions();
Options.MaxDegreeOfParallelism = X;
Task.Factory.StartNew(
() => Parallel.ForEach(
new BlockingPartitioner<IProgress>(queue, source.Token),
Options,
p => MyFunction(p)));
So all you need now is to fill the queue with the necessary data. You can do that whenever you want.
And as a final touch, when the user cancels the operation, you have two options:
first, you can abort execution with a source.Cancel call,
or you can stop gracefully by marking the collection complete (queue.CompleteAdding); in that case the runner will process all already-queued data and then finish.
Of course you need additional code to handle exceptions, progress, state and so on. But the main idea is here.
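The graceful-stop option can be illustrated with a minimal sketch. It uses plain integers as work items and two consumer tasks as stand-ins for the X parallel workers; both are assumptions made for illustration, and the Thread.Sleep is a placeholder for MyFunction.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class QueueShutdownDemo
{
    static void Main()
    {
        var queue = new BlockingCollection<int>();
        var source = new CancellationTokenSource();
        int processed = 0;

        // Two consumers, standing in for the X parallel workers.
        var workers = new Task[2];
        for (int w = 0; w < workers.Length; w++)
        {
            workers[w] = Task.Run(() =>
            {
                // Blocks while the queue is empty; ends once adding is completed
                // and the queue has been drained.
                foreach (int item in queue.GetConsumingEnumerable(source.Token))
                {
                    Thread.Sleep(10); // placeholder for MyFunction
                    Interlocked.Increment(ref processed);
                }
            });
        }

        for (int i = 0; i < 20; i++)
        {
            queue.Add(i);
        }

        // Graceful stop: no new items are accepted, but queued ones are still processed.
        queue.CompleteAdding();
        Task.WaitAll(workers);

        Console.WriteLine(processed); // 20
    }
}
```

Calling source.Cancel() instead would make GetConsumingEnumerable throw OperationCanceledException in the consumers, abandoning whatever is still queued.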

Unit tests failing with Observable.FromAsync and Observable.Switch

I'm having trouble testing a class that makes use of Observable.FromAsync<T>() and Observable.Switch<T>(). It waits for a trigger observable to produce a value, then starts an async operation, and finally collects all the operations' results in a single output sequence. The gist of it is something like:
var outputStream = triggerStream
.Select(_ => Observable
.FromAsync(token => taskProducer.DoSomethingAsync(token)))
.Switch();
I put up some sanity check tests with the bare minimum parts to understand what's going on, here's the test with results in comments:
class test_with_rx : nspec
{
void Given_async_task_and_switch()
{
Subject<Unit> triggerStream = null;
TaskCompletionSource<long> taskDriver = null;
ITestableObserver<long> testObserver = null;
IDisposable subscription = null;
before = () =>
{
TestScheduler scheduler = new TestScheduler();
testObserver = scheduler.CreateObserver<long>();
triggerStream = new Subject<Unit>();
taskDriver = new TaskCompletionSource<long>();
// build stream under test
IObservable<long> streamUnderTest = triggerStream
.Select(_ => Observable
.FromAsync(token => taskDriver.Task))
.Switch();
/* Also tried with this Switch() overload
IObservable<long> streamUnderTest = triggerStream
.Select(_ => taskDriver.Task)
.Switch(); */
subscription = streamUnderTest.Subscribe(testObserver);
};
context["Before trigger"] = () =>
{
it["Should not notify"] = () => testObserver.Messages.Count.Should().Be(0);
// PASSED
};
context["After trigger"] = () =>
{
before = () => triggerStream.OnNext(Unit.Default);
context["When task completes"] = () =>
{
long result = -1;
before = () =>
{
taskDriver.SetResult(result);
//taskDriver.Task.Wait(); // tried with this too
};
it["Should notify once"] = () => testObserver.Messages.Count.Should().Be(1);
// FAILED: expected 1, actual 0
it["Should notify task result"] = () => testObserver.Messages[0].Value.Value.Should().Be(result);
// FAILED: of course, index out of bound
};
};
after = () =>
{
taskDriver.TrySetCanceled();
taskDriver.Task.Dispose();
subscription.Dispose();
};
}
}
In other tests I've done with mocks too, I can see that the Func passed to FromAsync is actually invoked (e.g. taskProducer.DoSomethingAsync(token)), but then it looks like nothing more follows, and the output stream doesn't produce the value.
I also tried inserting some Task.Delay(x).Wait(), or some taskDriver.Task.Wait() before hitting expectations, but with no luck.
I read this SO thread and I'm aware of schedulers, but at a first look I thought I didn't need them, no ObserveOn() is being used. Was I wrong? What am I missing? TA
Just for completeness, testing framework is NSpec, assertion library is FluentAssertions.
What you're hitting is a case of testing Rx and TPL together.
An exhaustive explanation can be found here, but I'll try to give advice for your particular code.
Basically, your code is working fine, but your test is not.
Observable.FromAsync turns into a ContinueWith on the provided task, which is executed on the thread pool, hence asynchronously.
There are several ways to fix your test (from ugly to complex):
Sleep after the result is set (note that taskDriver.Task.Wait() doesn't help, because Wait doesn't wait for continuations):
taskDriver.SetResult(result);
Thread.Sleep(50);
Set the result before executing FromAsync (because FromAsync returns an immediate IObservable if the task is already finished, i.e. it skips ContinueWith):
taskDriver.SetResult(result);
triggerStream.OnNext(Unit.Default);
Replace FromAsync with a testable alternative, e.g.:
public static IObservable<T> ToObservable<T>(Task<T> task, TaskScheduler scheduler)
{
if (task.IsCompleted)
{
return task.ToObservable();
}
else
{
AsyncSubject<T> asyncSubject = new AsyncSubject<T>();
task.ContinueWith(t => task.ToObservable().Subscribe(asyncSubject), scheduler);
return asyncSubject.AsObservable<T>();
}
}
(using either a synchronous TaskScheduler, or a testable one)
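A synchronous TaskScheduler of the kind mentioned above might be sketched as follows. SynchronousTaskScheduler is a hypothetical name, not a framework type; it simply runs every queued task inline on the thread that queues it. That should let a continuation scheduled on it run inside SetResult, which removes the need for sleeps in a test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical scheduler: executes each task immediately on the calling thread.
sealed class SynchronousTaskScheduler : TaskScheduler
{
    public override int MaximumConcurrencyLevel => 1;

    protected override void QueueTask(Task task) => TryExecuteTask(task);

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
        => TryExecuteTask(task);

    // Nothing is ever left waiting in a queue.
    protected override IEnumerable<Task> GetScheduledTasks() => Enumerable.Empty<Task>();
}

static class SchedulerDemo
{
    static void Main()
    {
        var taskDriver = new TaskCompletionSource<long>();
        bool continuationRan = false;

        // The continuation is queued to the synchronous scheduler
        // as soon as the antecedent task completes.
        taskDriver.Task.ContinueWith(
            _ => { continuationRan = true; },
            new SynchronousTaskScheduler());

        taskDriver.SetResult(-1);
        Console.WriteLine(continuationRan);
    }
}
```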

Parallel ForEach wait 500 ms before spawning

I have this situation:
var tasks = new List<ITask> ...
Parallel.ForEach(tasks, currentTask => currentTask.Execute() );
Is it possible to instruct PLinq to wait for 500ms before the next thread is spawned?
System.Threading.Thread.Sleep(5000);
You are using Parallel.ForEach totally wrong. You should make a special enumerator that rate-limits itself to getting data once every 500 ms.
I made some assumptions about how your DTO works, since you didn't provide any details.
private IEnumerable<SomeResource> GetRateLimitedResource()
{
SomeResource someResource = null;
do
{
someResource = _remoteProvider.GetData();
if(someResource != null)
{
yield return someResource;
Thread.Sleep(500);
}
} while (someResource != null);
}
Here is how your parallel loop should look then:
Parallel.ForEach(GetRateLimitedResource(), SomeFunctionToProcessSomeResource);
There are already some good suggestions. I would agree with others that you are using PLINQ in a manner it wasn't meant to be used.
My suggestion would be to use System.Threading.Timer. This is probably better than writing a method that returns an IEnumerable<> that forces a half second delay, because you may not need to wait the full half second, depending on how much time has passed since your last API call.
With the timer, it will invoke a delegate that you've provided it at the interval you specify, so even if the first task isn't done, a half second later it will invoke your delegate on another thread, so there won't be any extra waiting.
From your example code, it sounds like you have a list of tasks; in this case, I would use System.Collections.Concurrent.ConcurrentQueue to keep track of the tasks. Once the queue is empty, turn off the timer.
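A minimal sketch of this timer-plus-queue idea, under the assumption that each queued task can be represented as an Action (the question's ITask.Execute() would be wrapped the same way). The task bodies and the 500 ms period are illustrative only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

static class TimedDispatchDemo
{
    static void Main()
    {
        // Hypothetical work items standing in for ITask.Execute().
        var queue = new ConcurrentQueue<Action>();
        for (int i = 1; i <= 3; i++)
        {
            int n = i; // copy so each closure captures its own value
            queue.Enqueue(() => Console.WriteLine("task " + n));
        }

        using (var done = new ManualResetEventSlim())
        {
            Timer timer = null;
            timer = new Timer(_ =>
            {
                if (queue.TryDequeue(out Action work))
                {
                    work(); // runs on a thread-pool thread
                }
                else
                {
                    // Queue is empty: turn the timer off and signal completion.
                    timer.Change(Timeout.Infinite, Timeout.Infinite);
                    done.Set();
                }
            }, null, Timeout.Infinite, Timeout.Infinite);

            timer.Change(0, 500); // first tick immediately, then every 500 ms
            done.Wait();
            timer.Dispose();
        }
    }
}
```

Because the timer fires on a fixed schedule, a task that takes longer than 500 ms does not delay the next one; the next callback simply runs on another thread-pool thread.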
You could use Enumerable.Aggregate instead.
var task = tasks.Aggregate((t1, t2) =>
t1.ContinueWith(_ =>
{ Thread.Sleep(500); return t2.Result; }));
If you don't want the tasks chained, there is also the indexed overload of Select, assuming the tasks are in order of delay.
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => x * 2))
.Select((x, i) => Task.Delay(TimeSpan.FromMilliseconds(i * 500))
.ContinueWith(_ => x.Result));
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}
From the comments, a better option would be to guard the resource instead of using a time delay.
static object Locker = new object();
static int GetResultFromResource(int arg)
{
lock(Locker)
{
Thread.Sleep(500);
return arg * 2;
}
}
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => GetResultFromResource(x)));
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}
In this case how about a Producer-Consumer pattern with a BlockingCollection<T>?
var tasks = new BlockingCollection<ITask>();
// add tasks, if this is an expensive process, put it out onto a Task
// tasks.Add(x);
// we're done producin' (allows GetConsumingEnumerable to finish)
tasks.CompleteAdding();
RunTasks(tasks);
With a single consumer thread:
static void RunTasks(BlockingCollection<ITask> tasks)
{
foreach (var task in tasks.GetConsumingEnumerable())
{
task.Execute();
// this may not be as accurate as you would like
Thread.Sleep(500);
}
}
If you have access to .NET 4.5 you can use Task.Delay:
static void RunTasks(BlockingCollection<ITask> tasks)
{
foreach (var task in tasks.GetConsumingEnumerable())
{
Task.Delay(500)
.ContinueWith(_ => task.Execute())
.Wait();
}
}

ReactiveExtensions Observable FromAsync calling twice Function

OK, I'm trying to understand Rx and I'm kind of lost here.
FromAsyncPattern is now deprecated, so I took the example from here (section "Light up Task with Rx"), and it works. I just made a few changes: instead of using await, I just Wait on the observable and subscribe to it.
What I don't understand is why the function SumSquareRoots is called twice.
var res = Observable.FromAsync(ct => SumSquareRoots(x, ct))
.Timeout(TimeSpan.FromSeconds(5));
res.Subscribe(y => Console.WriteLine(y));
res.Wait();
class Program
{
static void Main(string[] args)
{
Samples();
}
static void Samples()
{
var x = 100000000;
try
{
var res = Observable.FromAsync(ct => SumSquareRoots(x, ct))
.Timeout(TimeSpan.FromSeconds(5));
res.Subscribe(y => Console.WriteLine(y));
res.Wait();
}
catch (TimeoutException)
{
Console.WriteLine("Timed out :-(");
}
}
static Task<double> SumSquareRoots(long count, CancellationToken ct)
{
return Task.Run(() =>
{
var res = 0.0;
Console.WriteLine("Why I'm called twice");
for (long i = 0; i < count; i++)
{
res += Math.Sqrt(i);
if (i % 10000 == 0 && ct.IsCancellationRequested)
{
Console.WriteLine("Noticed cancellation!");
ct.ThrowIfCancellationRequested();
}
}
return res;
});
}
}
The reason SumSquareRoots is called twice is that you're subscribing twice:
// Subscribes to res
res.Subscribe(y => Console.WriteLine(y));
// Also Subscribes to res, since it *must* produce a result, even
// if that result is then discarded (i.e. Wait doesn't return IObservable)
res.Wait();
Subscribe is the foreach of Rx - just as you could end up doing 2x the work if you foreach an IEnumerable twice, multiple Subscribes mean multiples of the work. To undo this, you could use a blocking call that doesn't discard the result:
Console.WriteLine(res.First());
Or, you could use Publish to "freeze" the result and play it back to > 1 subscriber (kind of like how you'd use ToArray in LINQ):
var published = res.Publish();
published.Connect();
// Both subscriptions get the same result, SumSquareRoots is only called once
published.Subscribe(Console.WriteLine);
published.Wait();
The general rule you can follow is, that any Rx method that doesn't return IObservable<T> or Task<T> will result in a Subscription(*)
* - Not technically correct. But your brain will feel better if you think of it this way.
