I have a long-running async method that reads a stream and fires an event whenever it finds something:
public static async void GetStream(int id, CancellationToken token)
It takes a cancellation token because it is created in a new task. Internally it awaits each read from the stream:
var result = await sr.ReadLineAsync();
Now I want to convert this to a method that returns an IObservable<> so that I can use it with the Reactive Extensions. From what I've read, the best way to do this is with Observable.Create, and since Rx 2.0 now also supports async, I can get it all to work with something like this:
public static IObservable<Message> ObservableStream(int id, CancellationToken token)
{
return Observable.Create<Message>(
async (IObserver<Message> observer) =>
{
The rest of the code inside is the same, but instead of firing events I'm calling observer.OnNext(). But, this feels wrong. For one thing I'm mixing CancellationTokens up in there, and although adding the async keyword made it work, is this actually the best thing to do? I'm calling my ObservableStream like this:
Client.ObservableStream(555404, token).ObserveOn(Dispatcher.CurrentDispatcher).SubscribeOn(TaskPoolScheduler.Default).Subscribe(m => Messages.Add(m));
You are correct. Once you represent your interface through an IObservable, you should avoid requiring the callers to supply a CancellationToken. That doesn't mean you cannot use them internally. Rx provides several mechanisms to produce CancellationToken instances which are canceled when the observer unsubscribes from your observable.
There are a number of ways to tackle your problem. The simplest requires almost no changes in your code. It uses an overload of Observable.Create which supplies you with a CancellationToken that triggers if the caller unsubscribes:
public static IObservable<Message> ObservableStream(int id)
{
    return Observable.Create<Message>(async (observer, token) =>
    {
        // no exception handling required. If this method throws,
        // Rx will catch it and call observer.OnError() for us.
        using (var stream = /*...open your stream...*/)
        {
            string msg;
            while ((msg = await stream.ReadLineAsync()) != null)
            {
                if (token.IsCancellationRequested) { return; }
                observer.OnNext(msg);
            }
            observer.OnCompleted();
        }
    });
}
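To make the mechanism concrete, here is a minimal sketch of the idea behind that overload, written against the BCL interfaces only. It is not how Rx implements Observable.Create; TokenObservable, CountingObserver and UnsubscribeDemo are hypothetical names invented for this illustration, and Task.Delay stands in for the stream read. It only shows that disposing the subscription cancels the token handed to the producer.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the token-supplying Observable.Create overload.
public sealed class TokenObservable<T> : IObservable<T>
{
    private readonly Func<IObserver<T>, CancellationToken, Task> _producer;

    public TokenObservable(Func<IObserver<T>, CancellationToken, Task> producer)
        => _producer = producer;

    public IDisposable Subscribe(IObserver<T> observer)
    {
        var cts = new CancellationTokenSource();
        _ = PumpAsync(observer, cts.Token); // start producing for this subscriber
        return new Unsubscriber(cts);       // disposing it cancels the token
    }

    private async Task PumpAsync(IObserver<T> observer, CancellationToken token)
    {
        try { await _producer(observer, token); }
        catch (OperationCanceledException) { /* unsubscribed: stay silent */ }
        catch (Exception ex) { observer.OnError(ex); }
    }

    private sealed class Unsubscriber : IDisposable
    {
        private readonly CancellationTokenSource _cts;
        public Unsubscriber(CancellationTokenSource cts) => _cts = cts;
        public void Dispose() => _cts.Cancel();
    }
}

// Observer that only counts the values it receives.
public sealed class CountingObserver : IObserver<int>
{
    private int _count;
    public int Count => Volatile.Read(ref _count);
    public void OnNext(int value) => Interlocked.Increment(ref _count);
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

public static class UnsubscribeDemo
{
    // Returns true once the producer loop has observed the cancellation.
    public static async Task<bool> RunAsync()
    {
        var stopped = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        var source = new TokenObservable<int>(async (observer, token) =>
        {
            try
            {
                for (var i = 0; ; i++)
                {
                    await Task.Delay(10, token); // stands in for ReadLineAsync
                    observer.OnNext(i);
                }
            }
            finally { stopped.TrySetResult(true); }
        });

        IDisposable subscription = source.Subscribe(new CountingObserver());
        await Task.Delay(50);
        subscription.Dispose(); // unsubscribing cancels the producer's token
        return await stopped.Task;
    }
}
```

The caller never sees a CancellationToken: it only holds the IDisposable returned by Subscribe, which is the design the answer recommends.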
You should change GetStream to return a Task instead of void (async void methods should be avoided except when absolutely required, as svick commented). Once it returns a Task, you can just call .ToObservable() on the result and you are done.
For example:
public static async Task<int> GetStream(int id, CancellationToken token) { ... }
Then,
GetStream(1, new CancellationToken(false))
.ToObservable()
.Subscribe(Console.Write);
In contrast to Task.Wait() or Task.Result, awaiting a Task in C# 5 does not leave the waiting thread idle. Instead, the method that uses the await keyword must be async, so that the await makes the method return a new Task that represents the execution of the async method.
But when the awaited Task has already completed by the time the async method reaches the await, the await recognizes the Task as finished and execution simply continues, so the async method returns its Task object only at a later time. In some cases this is later than acceptable, because it is probably a common mistake for developers to assume that an await always defers the subsequent statements in their async method.
The mistaken async method’s structure could look like the following:
async Task doSthAsync()
{
var a = await getSthAsync();
// perform a long operation
}
Then sometimes doSthAsync() will return the Task only after a long time.
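A small sketch of that behavior, assuming a hypothetical GetSthAsync whose result is already available (here via Task.FromResult) and a flag standing in for the long operation. All names are made up for the illustration.

```csharp
using System.Threading.Tasks;

public static class CompletedAwaitDemo
{
    public static bool LongPartRan;

    // Hypothetical framework method whose result is already available.
    private static Task<int> GetSthAsync() => Task.FromResult(42);

    private static async Task DoSthAsync()
    {
        var a = await GetSthAsync(); // already completed: no suspension here
        LongPartRan = true;          // stands in for the long operation
    }

    // Returns true if the "long" part ran before the caller got its Task back.
    public static bool RanBeforeReturn()
    {
        LongPartRan = false;
        Task t = DoSthAsync();
        return LongPartRan && t.IsCompleted;
    }
}
```

Because the awaited task is already complete, DoSthAsync runs to the end synchronously and the caller only receives the Task afterwards.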
I know it should rather be written like this:
async Task doSthAsync()
{
var a = await getSthAsync();
await Task.Run(() =>
{
// perform a long operation
});
}
... or that:
async Task doSthAsync()
{
var a = await getSthAsync();
await Task.Yield();
// perform a long operation
}
But I do not find the last two patterns pretty, and I want to prevent the mistake from occurring. I am developing a framework which provides getSthAsync, and the first structure will be the common one. So getSthAsync should return an awaitable which always yields, like the YieldAwaitable returned by Task.Yield() does.
Unfortunately most features provided by the Task Parallel Library like Task.WhenAll(IEnumerable<Task> tasks) only operate on Tasks so the result of getSthAsync should be a Task.
So is it possible to return a Task which always yields?
First of all, the consumer of an async method shouldn't assume it will "yield", as yielding has nothing to do with the method being async. If the consumer needs to make sure the work is offloaded to another thread, they should use Task.Run to enforce that.
Second of all, I don't see how using Task.Run or Task.Yield is problematic, as it's used inside an async method which returns a Task and not a YieldAwaitable.
If you want to create a Task that behaves like YieldAwaitable you can just use Task.Yield inside an async method:
async Task Yield()
{
await Task.Yield();
}
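For comparison, a sketch of what awaiting Task.Yield() directly does to the calling method. It assumes there is no SynchronizationContext (for example, a console application), so the continuation is queued to the thread pool; the gate is a hypothetical device used only to make the ordering observable.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class YieldDemo
{
    // Returns true if the caller regained control before the rest of the method ran.
    public static bool CallerGotControlFirst()
    {
        using var gate = new ManualResetEventSlim(false);

        async Task DoSthAsync()
        {
            await Task.Yield(); // always suspends; the rest is queued
            gate.Wait();        // stands in for the long operation
        }

        Task t = DoSthAsync();               // returns at the Task.Yield() line
        bool returnedEarly = !t.IsCompleted; // the rest cannot finish before gate.Set()
        gate.Set();                          // let the "long operation" finish
        t.Wait();
        return returnedEarly;
    }
}
```

Blocking with t.Wait() is only acceptable here because the sketch assumes no synchronization context; in UI code the same call could deadlock.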
Edit:
As was mentioned in the comments, this has a race condition where it may not always yield. This race condition is inherent with how Task and TaskAwaiter are implemented. To avoid that you can create your own Task and TaskAwaiter:
public class YieldTask : Task
{
    public YieldTask() : base(() => {})
    {
        Start(TaskScheduler.Default);
    }

    public new TaskAwaiterWrapper GetAwaiter() => new TaskAwaiterWrapper(base.GetAwaiter());
}

public struct TaskAwaiterWrapper : INotifyCompletion
{
    private TaskAwaiter _taskAwaiter;

    public TaskAwaiterWrapper(TaskAwaiter taskAwaiter)
    {
        _taskAwaiter = taskAwaiter;
    }

    public bool IsCompleted => false;
    public void OnCompleted(Action continuation) => _taskAwaiter.OnCompleted(continuation);
    public void GetResult() => _taskAwaiter.GetResult();
}
This will create a task that always yields because IsCompleted always returns false. It can be used like this:
public static readonly YieldTask YieldTask = new YieldTask();
private static async Task MainAsync()
{
await YieldTask;
// something
}
Note: I highly discourage anyone from actually doing this kind of thing.
Here is a polished version of i3arnon's YieldTask:
public class YieldTask : Task
{
public YieldTask() : base(() => { },
TaskCreationOptions.RunContinuationsAsynchronously)
=> RunSynchronously();
public new YieldAwaitable.YieldAwaiter GetAwaiter()
=> default;
public new YieldAwaitable ConfigureAwait(bool continueOnCapturedContext)
{
if (!continueOnCapturedContext) throw new NotSupportedException();
return default;
}
}
The YieldTask is completed immediately upon creation, but its awaiter says otherwise: GetAwaiter().IsCompleted always returns false. This mischief makes the await operator trigger the desired asynchronous switch every time it awaits this task. Actually, creating multiple YieldTask instances is redundant; a singleton would work just as well.
There is a problem with this approach though. The underlying methods of the Task class are not virtual, and hiding them with the new modifier means that polymorphism doesn't work. If you store a YieldTask instance to a Task variable, you'll get the default task behavior. This is a considerable drawback for my use case, but I can't see any solution around it.
I have to use a library whose API looks something like this:
public void Connect();
...
public delegate void ConnectResultDelegate(bool succeeded, string msg);
public ConnectResultDelegate ConnectResultHandler;
After calling the Connect() method, the ConnectResultHandler callback delegate will get called.
The API exposes other methods that work in a similar "request-response" manner; I guess the reason for the delegates is that the methods interact with an external hardware device, and the response (delegate call) may not happen for many milliseconds.
I was hoping I could wrap the API in some way that would allow me to use it in a more "sequential" manner that is more like async/await, along the lines of:
void DoSomething()
{
_library.Connect();
// Wait for notification that this has completed
// Do something with the response passed to the delegate callback
_library.Configure(...);
// Wait for notification that this has completed
// Do something with the response
..etc..
}
Thoughts? Refactoring the library itself is not an option.
There are one or two similar SO questions out there, but they differ in that their delegates are passed to the methods, rather than being separate properties, making it relatively easy to wrap in a Task.
There are a lot of answers that show how to convert events or Begin/End async operations into tasks. The code in the question, though, doesn't follow the conventions of either model. It resembles the Event-based Asynchronous Pattern (EAP) without using an event. If you searched for event-to-task conversions, you'd find a lot of answers. Delegates weren't normally used for async operations, though, as the convention before EAP was to use the Asynchronous Programming Model (APM), or Begin/End.
The process is still the same, though. It's described in Interop with Other Asynchronous Patterns and Types.
In all cases, a TaskCompletionSource is used to create a Task that's signalled when an operation completes.
When the class follows the APM conventions, one can use the TaskFactory.FromAsync method to convert a Begin/End pair into a task. FromAsync uses a TaskCompletionSource under the covers to return a Task that's signaled when the callback is called. The Interop doc example for this is Stream.BeginRead:
public static Task<int> ReadAsync(this Stream stream,
byte[] buffer, int offset,
int count)
{
if (stream == null)
throw new ArgumentNullException("stream");
return Task<int>.Factory.FromAsync(stream.BeginRead,
stream.EndRead, buffer,
offset, count, null);
}
Using delegates is similar to using events, which is also shown in the interop article. Adapted to the question, it would look something like this :
public Task<bool> ConnectAsync(ThatService service)
{
if (service==null)
throw new ArgumentNullException(nameof(service));
var tcs=new TaskCompletionSource<bool>();
service.ConnectResultHandler=(ok,msg)=>
{
if(ok)
{
tcs.TrySetResult(true);
}
else
{
tcs.TrySetException(new Exception(msg));
}
};
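// Start the request only after the handler is registered; this assumes
// Connect() is what triggers the callback, as described in the question.
service.Connect();
return tcs.Task;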
}
This will allow you to use ConnectAsync in an async method, eg :
public async Task MyMethod()
{
...
var ok=await ConnectAsync(_service);
...
}
If msg contains data on success, you could change ConnectAsync to :
public Task<string> ConnectAsync(ThatService service)
{
if (service==null)
throw new ArgumentNullException(nameof(service));
var tcs=new TaskCompletionSource<string>();
service.ConnectResultHandler=(ok,msg)=>
{
if(ok)
{
tcs.TrySetResult(msg);
}
else
{
tcs.TrySetException(new Exception(msg));
}
};
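// Start the request only after the handler is registered; this assumes
// Connect() is what triggers the callback, as described in the question.
service.Connect();
return tcs.Task;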
}
You can change ConnectAsync into an extension method which will allow you to use it as if it were a method of your service class :
public static class MyServiceExtensions
{
public static Task<string> ConnectAsync(this ThatService service)
{
//Same as before
}
}
And use it :
public async Task MyMethod()
{
...
var msg=await _service.ConnectAsync();
...
}
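Putting the pieces together, here is a self-contained sketch that can be run without the real library. FakeDevice is a hypothetical stand-in for the third-party class: it answers after a short delay on another thread, and the "connected" message is invented for the demo. The wrapper registers the handler first and only then calls Connect().

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the third-party library in the question.
public class FakeDevice
{
    public delegate void ConnectResultDelegate(bool succeeded, string msg);
    public ConnectResultDelegate ConnectResultHandler;

    // Simulates hardware that answers some milliseconds later on another thread.
    public void Connect()
    {
        Task.Delay(20).ContinueWith(_ => ConnectResultHandler?.Invoke(true, "connected"));
    }
}

public static class FakeDeviceExtensions
{
    public static Task<string> ConnectAsync(this FakeDevice device)
    {
        if (device == null)
            throw new ArgumentNullException(nameof(device));

        // Run continuations asynchronously so the awaiting code does not
        // execute on the library's callback thread.
        var tcs = new TaskCompletionSource<string>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        device.ConnectResultHandler = (ok, msg) =>
        {
            if (ok) tcs.TrySetResult(msg);
            else tcs.TrySetException(new InvalidOperationException(msg));
        };

        device.Connect(); // start the request once the handler is in place
        return tcs.Task;
    }
}

public static class ConnectDemo
{
    public static async Task<string> RunAsync()
    {
        var device = new FakeDevice();
        return await device.ConnectAsync();
    }
}
```

Since the handler is a single delegate field rather than an event, this sketch assumes only one Connect() request is in flight at a time; overlapping calls would overwrite each other's handler.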
Regarding the right worker method signature I need to understand the following:
is there a point in returning Task instead of void for Worker method (if going sync)?
Should I really wait (call Wait()) on the Worker method (if going sync)?
what should be the return value of Worker method when marked as returning Task object (both if going sync/async)?
what signature and body of Worker method should be, given the work it completes is long-running CPU/IO-bound work? Should I follow this recommendation (if going mixed/async)?
Note
Regarding the nature of the code in the Worker method: although part of it is CPU-bound, I can choose to call the async versions of the I/O-bound methods (SQL queries), so it may be entirely sync or partially async.
public class LoopingService
{
private CancellationTokenSource cts;
// ..
void Worker(CancellationToken cancellationToken)
{
while(!cancellationToken.IsCancellationRequested)
{
// mixed, CPU/IO-bound code
try {
// sql query (can be called either as sync/async)
var lastId = documentService.GetLastDocument().Id;
// get next document from a public resource (third-party code, sync)
// can be moved to a web api
var document = thirdPartyDocumentService.GetNextDocument(lastId);
// apply different processors in parallel
var tasksList = new List<Task>();
foreach(var processor in documentService.Processors) {
// each processor checks if it's applicable
// which may include xml-parsing, additional db calls, regexes
// if it's applicable then document data is inserted into the db
var task = new Task(() => processor.Process(document));
tasksList.Add(task);
task.Start();
}
// or
// var tasksList = documentService.ProcessParallel(document);
Task.WaitAll(tasksList.ToArray(), cancellationToken);
}
catch(Exception ex) {
logger.log(ex);
}
}
}
public void Start()
{
this.cts = new CancellationTokenSource();
Task.Run(() => this.Worker(cts.Token));
}
public void Stop()
{
this.cts.Cancel();
this.cts.Dispose();
}
}
is there a point in returning Task instead of void for Worker method?
If Worker is a truly asynchronous method, it should return a Task so that you are able to await it. If it's just a synchronous method running on a background thread, there is no point in changing the return type from void, provided that the method is not supposed to return anything.
what should be the return value of Worker method when marked as returning Task object?
Nothing. Provided that the method is asynchronous and marked as async with a return type of Task, it shouldn't return any value:
async Task Worker(CancellationToken cancellationToken) { ... }
Note that there is no point in defining the method as async unless you actually use the await keyword in it.
what signature and body of Worker method should be given the work it completes is long-running CPU/IO-bound work? Should I follow this recommendation?
Yes, probably. If for some reason you are doing both asynchronous and synchronous (CPU-bound) work in the same method, you should prefer an asynchronous signature and not wrap the synchronous parts in Task.Run. Then your service would look something like this:
public class LoopingService
{
    private CancellationTokenSource cts;

    async Task Worker(CancellationToken cancellationToken)
    {
        while (!cancellationToken.IsCancellationRequested)
        {
            await ...
        }
    }

    public async Task Start()
    {
        this.cts = new CancellationTokenSource();
        await this.Worker(cts.Token).ConfigureAwait(false);
    }

    public void Stop()
    {
        this.cts.Cancel();
        this.cts.Dispose();
    }
}
Ideally your method should be either asynchronous or CPU-bound but not both though.
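One possible variation, offered as a sketch rather than as the recommended design: Start keeps the worker's Task in a field instead of awaiting it, and a hypothetical StopAsync cancels, waits for the loop to finish, and only then disposes the token source. Task.Delay stands in for the real asynchronous I/O, and the iteration counter exists only so the behavior can be observed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class LoopingServiceSketch
{
    private CancellationTokenSource _cts;
    private Task _worker;
    private int _iterations;

    public int Iterations => Volatile.Read(ref _iterations);

    private async Task Worker(CancellationToken cancellationToken)
    {
        while (!cancellationToken.IsCancellationRequested)
        {
            Interlocked.Increment(ref _iterations);  // stands in for one unit of work
            await Task.Delay(10, cancellationToken); // stands in for async I/O
        }
    }

    public void Start()
    {
        _cts = new CancellationTokenSource();
        _worker = Worker(_cts.Token); // keep the Task instead of awaiting it here
    }

    public async Task StopAsync()
    {
        _cts.Cancel();
        try { await _worker.ConfigureAwait(false); }
        catch (OperationCanceledException) { /* expected on cancellation */ }
        finally { _cts.Dispose(); }
    }
}
```

A caller would use it as `service.Start(); ... await service.StopAsync();`, which lets Start return immediately while still giving Stop a way to observe the loop's completion or failure.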
I want to expose an API which internally schedules a sequence of Tasks that should be cancellable by the user.
e.g.
public ??? DoWork()
{
Task t = new .... , myCancellationToken);
return ???
}
What is the correct object to return for cancellation control?
Is it CancellationTokenSource ?
public CancellationTokenSource DoWork()
{
CancellationTokenSource source = new ....
Task t = new .... , source.Token);
return source;
}
Should I return anything at all?
Should I just accept a CancellationToken as an arg and let the user create the token source if needed?
public void DoWork(CancellationToken token)
{
Task t = new .... , token);
}
What is the most idiomatic way to deal with this?
Should I just accept a CancellationToken as an arg and let the user create the token source if needed?
This. But you should quite possibly return a Task as well, so that the user can observe when it's completed etc. This is amenable to async/await as well, of course.
You may also want to have overloads:
public Task DoWork()
{
return DoWork(CancellationToken.None);
}
public Task DoWork(CancellationToken cancellationToken)
{
...
}
See the Task-based Asynchronous Pattern for general conventions on this sort of thing.
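As a rough sketch of how those overloads behave from the caller's side: the body of DoWork below is invented (a loop over Task.Delay standing in for the scheduled tasks), and WorkApi and CancelDemo are hypothetical names. The caller owns the CancellationTokenSource and observes the outcome through the returned Task.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WorkApi
{
    public static Task DoWork() => DoWork(CancellationToken.None);

    public static async Task DoWork(CancellationToken cancellationToken)
    {
        for (var step = 0; step < 100; step++)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await Task.Delay(20, cancellationToken); // stands in for one scheduled task
        }
    }
}

public static class CancelDemo
{
    // Returns true when the returned Task ends up in the Canceled state.
    public static async Task<bool> RunAsync()
    {
        using var cts = new CancellationTokenSource();
        Task work = WorkApi.DoWork(cts.Token);
        cts.Cancel(); // the caller decides when to cancel
        try { await work; }
        catch (OperationCanceledException) { /* expected */ }
        return work.IsCanceled;
    }
}
```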
Async methods should return a Task. You will need the Task for synchronization purposes as well as for getting the result if you do not implement callback patterns with IAsync. If you mark the method with async and call it with await, the result is unwrapped from the Task automatically.
Task is the point of interest in the TPL. Do not create a new Task explicitly unless you need to; prefer the static Run method instead.
In C# and the TPL (Task Parallel Library), the Task<T> class represents ongoing work that produces a value of type T.
I'd like to know what the need is for the Task.FromResult<T> method.
That is: In a scenario where you already have the produced value at hand, what is the need to wrap it back into a Task?
The only thing that comes to mind is that it's used as some adapter for other methods accepting a Task instance.
There are two common use cases I've found:
When you're implementing an interface that allows asynchronous callers, but your implementation is synchronous.
When you're stubbing/mocking asynchronous code for testing.
One example would be a method that makes use of a cache. If the result is already computed, you can return a completed task with the value (using Task.FromResult). If it is not, then you go ahead and return a task representing ongoing work.
Cache Example: Cache Example using Task.FromResult for Pre-computed values
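A minimal sketch of the caching case, assuming a hypothetical SquareCache whose slow computation is simulated with Task.Delay. On a cache hit the method returns an already-completed task via Task.FromResult; on a miss it returns a task representing the ongoing work.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class SquareCache
{
    private readonly ConcurrentDictionary<int, int> _cache =
        new ConcurrentDictionary<int, int>();

    public Task<int> GetSquareAsync(int n)
    {
        // Cache hit: the value is already known, so wrap it in a completed task.
        if (_cache.TryGetValue(n, out var cached))
            return Task.FromResult(cached);

        // Cache miss: return a task that represents the ongoing work.
        return ComputeAndStoreAsync(n);
    }

    private async Task<int> ComputeAndStoreAsync(int n)
    {
        await Task.Delay(10); // stands in for slow work
        var result = n * n;
        _cache[n] = result;
        return result;
    }
}
```

Callers await GetSquareAsync the same way in both cases; only the second and later calls for a given n come back already completed.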
Use it when you want to create an awaitable method without using the async keyword.
I found this example:
public class TextResult : IHttpActionResult
{
string _value;
HttpRequestMessage _request;
public TextResult(string value, HttpRequestMessage request)
{
_value = value;
_request = request;
}
public Task<HttpResponseMessage> ExecuteAsync(CancellationToken cancellationToken)
{
var response = new HttpResponseMessage()
{
Content = new StringContent(_value),
RequestMessage = _request
};
return Task.FromResult(response);
}
}
Here you are creating your own implementation of the IHttpActionResult interface to be used in a Web Api Action. The ExecuteAsync method is expected to be asynchronous but you don't have to use the async keyword to make it asynchronous and awaitable. Since you already have the result and don't need to await anything it's better to use Task.FromResult.
From msft.com Create pre-computed tasks with Task.FromResult:
This method is useful when you perform an asynchronous operation that returns a Task object, and the result of that Task object is already computed.
Use Task.FromResult when you want to have an asynchronous operation but sometimes the result is in hand synchronously. You can find a good sample here http://msdn.microsoft.com/en-us/library/hh228607.aspx.
I would argue that you could use Task.FromResult for methods that are synchronous and take a long time to complete while you can do other independent work in your code. I'd rather make those methods async, though. But imagine the situation where you have no control over the code being called and you want that implicit parallel processing.
Task.Run() runs a lambda on a thread-pool thread; no async keyword is required, and it returns a task wrapping the result object. In my example, I have multiple tasks running simultaneously and wait for their completion. Once all the tasks have completed, I can cycle through their results. Task.FromResult is used to supply a task result that was not generated by Task.Run().
Task.FromResult wraps an object, in this case an instance of the RecordStruct class, in the Result of a completed task. I created two tasks by calling the getData function. Task.WaitAll waits for each of the tasks, after which every task's Result holds a RecordStruct. I then access the element property of each RecordStruct.
public class RecordStruct
{
public RecordStruct(string _element) {
element = _element;
}
public string element { get;set; }
}
public class TaskCustom
{
public Task<RecordStruct> getData(string phrase)
{
if (phrase == "hello boise")
{
return Task.FromResult(new RecordStruct("Boise is a great place to live"));
}
return Task.Run(() =>
{
return new RecordStruct(phrase);
});
}
}
[Fact]
public async Task TestFactory()
{
TaskCustom obj = new TaskCustom();
List<Task<RecordStruct>> tasks = new List<Task<RecordStruct>>();
tasks.Add(obj.getData("hello world"));
tasks.Add(obj.getData("hello boise"));
Task.WaitAll(tasks.ToArray());
for(int ctr = 0; ctr < tasks.Count; ctr++) {
if (tasks[ctr].Status == TaskStatus.Faulted)
output.WriteLine(" Task fault occurred");
else
{
output.WriteLine("test sent {0}",
tasks[ctr].Result.element);
Assert.True(true);
}
}
}