I have a class which takes a stream in the constructor. You can then set up callbacks for various events, and then call StartProcessing. The issue is that I want to use it from a function which should return an IEnumerable.
Example:
public class Parser
{
public Parser(System.IO.Stream s) { /* saves stream and does some set up */ }
public delegate void OnParsedHandler(List<string> token);
public event OnParsedHandler OnParsedData;
public void StartProcessing()
{
// reads stream and makes callback when it has a whole record
}
}
public class Application
{
public IEnumerable<Thing> GetThings(System.IO.Stream s)
{
Parser p = new Parser(s);
p.OnParsedData += (List<string> str) =>
{
Thing t = new Thing(str[0]);
// here is where I would like to yield
// but I can't
yield return t;
};
p.StartProcessing();
}
}
Right now my solution, which isn't so great, is to put all the Things into a List that is captured by the lambda, and then iterate over them after calling StartProcessing.
public class Application
{
public IEnumerable<Thing> GetThings(System.IO.Stream s)
{
Parser p = new Parser(s);
List<Thing> thingList = new List<Thing>();
p.OnParsedData += (List<string> str) =>
{
Thing t = new Thing(str[0]);
thingList.Add(t);
};
p.StartProcessing();
foreach (Thing t in thingList)
{
yield return t;
}
}
}
The issue here is that now I have to save all of the Thing objects in a list.
The problem you have here is that you don't fundamentally have a "pull" mechanic here, you're trying to push data from the parser. If the parser is going to push data to you, rather than letting the caller pull the data, then GetThings should return an IObservable, rather than an IEnumerable, so the caller can consume the data when it's ready.
If it really is important to have a pull mechanic here then Parser shouldn't fire an event to indicate that it has new data, but rather the caller should be able to ask it for new data and have it get it; it should either return all of the parsed data, or itself return an IEnumerable.
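As a rough sketch of that pull-based alternative, the parser could expose its records as a lazily evaluated sequence instead of raising an event. The PullParser class, the ReadRecords method, and the one-comma-separated-record-per-line format are assumptions made for illustration; they are not part of the Parser in the question, and Thing is a stand-in type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Stand-in for the Thing type from the question.
public class Thing
{
    public string Name { get; }
    public Thing(string name) { Name = name; }
}

// Hypothetical pull-style parser: records are parsed only as the caller enumerates.
public class PullParser
{
    private readonly Stream _stream;

    public PullParser(Stream s) { _stream = s; }

    // Assumes one comma-separated record per line.
    public IEnumerable<List<string>> ReadRecords()
    {
        using (var reader = new StreamReader(_stream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return new List<string>(line.Split(','));
            }
        }
    }
}

public static class Application
{
    // No intermediate list: each Thing is created when the caller asks for it.
    public static IEnumerable<Thing> GetThings(Stream s)
    {
        foreach (List<string> record in new PullParser(s).ReadRecords())
        {
            yield return new Thing(record[0]);
        }
    }
}
```

With this shape, GetThings can yield directly because the parser itself is an iterator; nothing is read from the stream until the caller starts enumerating.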
Interesting question. I would like to build upon what @servy has said regarding push and pull. In your implementation above, you are effectively adapting a push mechanism to a pull interface.
Now, first things first. You have not specified whether the call to the StartProcessing() method is a blocking call or not. A couple of remarks regarding that:
If the method is blocking (synchronous), then there is really no point in adapting it to a pull model anyway. The caller will see all the data processed in a single blocking call.
In that regard, receiving the data indirectly via an event handler scatters into two seemingly unrelated constructs what should otherwise be a single, cohesive, explicit operation. For example:
void ProcessAll(Action<Thing> callback);
On the other hand, if the StartProcessing() method actually spawns a new thread (in which case it might be better named BeginProcessing() and follow the Event-based Asynchronous Pattern or another async processing pattern), you could adapt it to a pull mechanism by means of a synchronization construct using a wait handle: ManualResetEvent, Mutex and the like. Pseudo-code:
public IEnumerable<Thing> GetThings(System.IO.Stream s)
{
var parser = new Parser(s);
var waitable = new AutoResetEvent(false);
Thing item = null;
parser.OnParsedData += (Thing thing) =>
{
item = thing;
waitable.Set();
};
IAsyncResult result = parser.BeginProcessing();
while (!result.IsCompleted)
{
waitable.WaitOne();
yield return item;
}
}
Disclaimer
The above code serves only as a means for presenting an idea. It is not thread-safe and the synchronization mechanics do not work properly. See the producer-consumer pattern for more information.
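For reference, here is a minimal producer-consumer sketch built on BlockingCollection<T> from the BCL. It assumes StartProcessing() is a blocking call that can safely run on a worker thread; the Parser below is only a stand-in that raises one event per comma-separated line, and the capacity of 16 is an arbitrary choice.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Stand-in for the Parser from the question: raises one event per line of input.
public class Parser
{
    private readonly Stream _stream;
    public Parser(Stream s) { _stream = s; }
    public event Action<List<string>> OnParsedData;

    public void StartProcessing()
    {
        using (var reader = new StreamReader(_stream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                OnParsedData?.Invoke(new List<string>(line.Split(',')));
            }
        }
    }
}

public class Thing
{
    public string Name { get; }
    public Thing(string name) { Name = name; }
}

public static class Application
{
    public static IEnumerable<Thing> GetThings(Stream s)
    {
        // Bounded queue: the parser thread blocks once 16 items are waiting.
        using (var queue = new BlockingCollection<Thing>(boundedCapacity: 16))
        {
            var parser = new Parser(s);
            parser.OnParsedData += fields => queue.Add(new Thing(fields[0]));

            // Run the blocking parse on a worker thread; always mark the queue complete.
            Task producer = Task.Run(() =>
            {
                try { parser.StartProcessing(); }
                finally { queue.CompleteAdding(); }
            });

            // Blocks until an item is available; ends after CompleteAdding.
            foreach (Thing t in queue.GetConsumingEnumerable())
            {
                yield return t;
            }

            producer.Wait(); // surfaces any parser exception as an AggregateException
        }
    }
}
```

One caveat with this sketch: if the caller stops enumerating early, the queue is disposed while the parser may still be adding to it, so a real implementation would also need a way to cancel the producer.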
Related
I am having the following problem with RethinkDB: the RunChangesAsync method runs once, and when called it starts listening for changes on a given query. When the query result changes, you are given a Cursor<Change<Class>>, which is a delta between the initial state and the actual state.
My question is how can I make this run continuously?
If I use:
while(true)
{
code.... //changes happening while program is here
....../
...RunChangesAsync();
/......processed buffered items
code //new changes here
}
If changes happen at the points I marked in the code, they would not be caught by RunChanges. The only changes caught would be those that occur while RunChanges is listening, not before it starts or after it retrieves the results.
So I tried wrapping the RunChanges in an observable but it does not listen continuously for changes as I would have expected...it just retrieves 2 null items (garbage I suppose) and ends.
Observable
public IObservable<Cursor<Change<UserStatus?>>> GetObservable() =>
r.Db(Constants.DB_NAME).Table(Constants.CLIENT_TABLE).RunChangesAsync<UserStatus?>(this.con,CancellationToken.None).ToObservable();
Observer
class PlayerSubscriber : IObserver<Cursor<Change<UserStatus?>>>
{
public void OnCompleted() => Console.WriteLine("Finished");
public void OnError(Exception error) => Console.WriteLine("error");
public void OnNext(Cursor<Change<UserStatus?>> value)
{
foreach (var item in value.BufferedItems)
Console.WriteLine(item);
}
}
Program
class Program
{
public static RethinkDB r = RethinkDB.R;
public static bool End = false;
static async Task Main(string[] args)
{
var address = new Address { Host = "127.0.0.1", Port = 28015 };
var con = await r.Connection().Hostname(address.Host).Port(address.Port).ConnectAsync();
var database = new Database(r, con);
var obs = database.GetObservable();
var sub = new PlayerSubscriber();
var disp = obs.Subscribe(sub);
Console.ReadKey();
Console.WriteLine("Hello World!");
}
}
When debugging, I can see that the OnNext method of the observer is executed only once (it returns two null objects) and then it closes.
P.S.: Database is just a wrapper around RethinkDB queries. The only method used is GetObservable, which I posted above. UserStatus is a POCO.
When creating a change feed, you'll want to create one change feed object. For example, when you get back a Cursor<Change<T>> after running .RunChangesAsync(); that is really all you need.
The cursor object you get back from query.RunChangesAsync() is your change feed object that you will use for the entire lifetime you want to receive changes.
In your example:
while(true)
{
code.... //changes happening while program is here
....../
...RunChangesAsync();
/......processed buffered items
code //new changes here
}
Having .RunChangesAsync(); in a while loop is not the correct approach. You don't need to re-run the query again and get another Cursor<Change<T>>. I'll explain how this works at the end of this post.
Also, do not use cursor.BufferedItems on the cursor object. The cursor.BufferedItems property is not meant to be consumed by your code directly; it is only exposed for those special situations where you want to "peek ahead" inside the cursor object (client-side) for items that are ready to be consumed that are specific to your change feed query.
The proper way to consume items in your change feed is to enumerate over the cursor object itself as shown below:
var cursor = await query.RunChangesAsync(conn);
foreach (var item in cursor){
Console.WriteLine(item);
}
When the cursor runs out of items, it will make a request to the RethinkDB server for more items. Keep in mind, each iteration of the foreach loop can potentially be a blocking call. For example, the foreach loop can block indefinitely when 1) there are no items on the client-side to be consumed (.BufferedItems.Count == 0) and 2) there are no items that have been changed on the server-side according to your change feed query criteria. Under these circumstances, the foreach loop will block until the RethinkDB server sends you an item that is ready to be consumed.
Documentation about using Reactive Extensions and RethinkDB in C#
There is a driver unit test that shows how .NET Reactive Extensions can work here.
Specifically, Lines 31 - 47 in this unit test set up a change feed with Reactive Extensions:
var changes = R.Db(DbName).Table(TableName)
//.changes()[new {include_states = true, include_initial = true}]
.Changes()
.RunChanges<JObject>(conn);
changes.IsFeed.Should().BeTrue();
var observable = changes.ToObservable();
//use a new thread if you want to continue,
//otherwise, subscription will block.
observable.SubscribeOn(NewThreadScheduler.Default)
.Subscribe(
x => OnNext(x),
e => OnError(e),
() => OnCompleted()
);
Additionally, here is a good example and explanation of what happens and how to consume a change feed with C#:
Hope that helps.
Thanks,
Brian
If you have an operation that has the signature Task<int> ReadAsync(), then the way to set up polling, is like this:
IObservable<int> PollRead(TimeSpan interval)
{
return
Observable
.Interval(interval)
.SelectMany(n => Observable.FromAsync(() => ReadAsync()));
}
I'd also caution you against creating your own implementation of IObserver<T> - it's fraught with danger. You should use Observer.Create(...) if you are creating your own observer that you want to hand around. Generally you don't even do that.
public static void CacheUncachedMessageIDs(List<int> messageIDs)
{
var uncachedRecordIDs = LocalCacheController.GetUncachedRecordIDs<PrivateMessage>(messageIDs);
if (!uncachedRecordIDs.Any()) return;
using (var db = new DBContext())
{
.....
}
}
The above method is repeated regularly throughout the project (except with different generics passed in). I'm looking to avoid repeated usages of the if (!uncachedRecordIDs.Any()) return; lines.
In short, is it possible to make the LocalCacheController.GetUncachedRecordIDs return the CacheUncachedMessageIDs method?
This will guarantee a new data context is not created unless it needs to be (stops accidentally forgetting to add the return line in the parent method).
It is not possible for a nested method to return from parent method.
You could throw an exception inside GetUncachedRecordIDs and leave it unhandled in the calling method, which would do the trick, but exceptions are not meant for this kind of control flow, so it creates confusion. Moreover, it is very slow.
Another not suggested mechanic is to use some goto magic. This also generates confusion because goto allows unexpected behaviour in program execution flow.
Your best bet would be to return a Result object with a simple bool HasUncachedRecordIDs field and then check it. If it is false, return. This replaces the Any() method call in the caller with a plain field check.
var uncachedRecordIDsResult = LocalCacheController.GetUncachedRecordIDs<PrivateMessage>(messageIDs);
if (!uncachedRecordIDsResult.HasUncachedRecordIDs) return;
My reasoning for lack of this feature in the language is that calling GetUncachedRecordIDs in basically any function would unexpectedly end that parent function, without warning. Also, it would intertwine closely both functions, and best programming practices involve loose coupling of classes and methods.
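To make the Result-object idea concrete, here is a small sketch. The UncachedResult type, the LocalCacheControllerSketch class, and its hard-coded set of cached IDs are all hypothetical stand-ins, since the real controller is not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type: bundles the IDs with a flag the caller can test.
public sealed class UncachedResult
{
    public IReadOnlyList<int> RecordIDs { get; }
    public bool HasUncachedRecordIDs => RecordIDs.Count > 0;

    public UncachedResult(IEnumerable<int> recordIDs)
    {
        RecordIDs = recordIDs.ToList();
    }
}

public static class LocalCacheControllerSketch
{
    // Stand-in cache contents; the real storage is an assumption here.
    private static readonly HashSet<int> Cached = new HashSet<int> { 1, 2, 3 };

    public static UncachedResult GetUncachedRecordIDs(IEnumerable<int> ids)
    {
        return new UncachedResult(ids.Where(id => !Cached.Contains(id)));
    }
}

public static class Caller
{
    // Returns how many IDs it would load; 0 means the early return was taken.
    public static int CacheUncachedMessageIDs(List<int> messageIDs)
    {
        UncachedResult result = LocalCacheControllerSketch.GetUncachedRecordIDs(messageIDs);
        if (!result.HasUncachedRecordIDs) return 0;

        // ... open the data context here and load result.RecordIDs ...
        return result.RecordIDs.Count;
    }
}
```

The early return still has to be written in every caller; the sketch only makes the check a property read rather than a LINQ call.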
You could pass an Action to your GetUncachedRecordIDs method which you only invoke if you need to. Rough sketch of the idea:
// LocalCacheController
void GetUncachedRecordIDs<T>(List<int> messageIDs, Action<List<int>> action)
{
// ...
if (!cached) {
action(recordIds);
}
}
// ...
public static void CacheUncachedMessageIDs(List<int> messageIDs)
{
LocalCacheController.GetUncachedRecordIDs<PrivateMessage>(messageIDs, uncachedRecordIDs => {
using (var db = new DBContext())
{
// ...
}
});
}
I have a class that receives standard .Net events from an external class.
These events have an address property (in addition to a lot of other properties, of course) that I can use to synchronize my events, so that I should be able to create a method to Get something, wait for the correct event, then return the data from the event in the Get method.
However, I'm fairly new to synchronization in C# and was hoping any of you could help me out. Below is somewhat pseudo code for what I want to accomplish:
Someone calls DoAsynchronousToSynchronousCall
That method waits until an event has been received with the same address (or until it times out)
The event handler checks against all current requests. If it finds a request with the same address, it lets DoAsynchronousToSynchronousCall know the reply has arrived
DoAsynchronousToSynchronousCall gets (or retrieves) the reply and returns it to the caller
public class MyMessage
{
public string Address { get; set; }
public string Data { get; set; }
}
public Main
{
externalClass.MessageReceived += MessageReceived;
}
public void MessageReceived(MyMessage message)
{
MyMessage request = _requestQueue.FirstOrDefault(m => m.Address == message.Address);
if (request != null)
{
// Do something to let DoAsynchronousToSynchronousCall() know the reply has arrived
}
}
private List<MyMessage> _requestQueue = new List<MyMessage>();
public MyMessage DoAsynchronousToSynchronousCall(MyMessage message)
{
_requestQueue.Add(message);
externalClass.Send(message);
// Do something to wait for a reply (as checked for above)
MyMessage reply = WaitForCorrectReply(timeout: 10000);
return reply;
}
I feel like I'm missing an opportunity to use async and await (yet I don't know how), and I hope you're able to understand what I'm trying to accomplish based on the information above.
You really can't have multiple calls on the fly and have synchronous responses. If you want synchronous responses for multiple calls then you need to do the calls synchronously too.
I would look at using Microsoft's Reactive Extensions (NuGet "Rx-Main") to make what you're doing as simple as possible. Rx lets you turn events into streams of values that you can query against.
Here's what I would do.
I would first define a stream of the received messages as IObservable<MyMessage> receivedMessages like this:
receivedMessages =
Observable
.FromEvent<MessageReceivedHandler, MyMessage>(
h => externalClass.MessageReceived += h,
h => externalClass.MessageReceived -= h);
(You didn't provide a class def so I've called the event delegate MessageReceivedHandler.)
Now you can redefine DoAsynchronousToSynchronousCall as:
public IObservable<MyMessage> DoAsynchronousCall(MyMessage message)
{
return Observable.Create<MyMessage>(o =>
{
IObservable<MyMessage> result =
receivedMessages
.Where(m => m.Address == message.Address)
.Take(1);
IObservable<MyMessage> timeout =
Observable
.Timer(TimeSpan.FromSeconds(10.0))
.Select(x => (MyMessage)null);
IDisposable subscription =
Observable
.Amb(result, timeout)
.Subscribe(o);
externalClass.Send(message);
return subscription;
});
}
The result observable is the receivedMessages filtered for the current message.Address.
The timeout observable is a default value to return if the call takes longer than TimeSpan.FromSeconds(10.0) to complete.
Finally the subscription uses Observable.Amb(...) to determine which of result or timeout produces a value first and subscribes to that result.
So now to call this you can do this:
DoAsynchronousCall(new MyMessage() { Address = "Foo", Data = "Bar" })
.Subscribe(response => Console.WriteLine(response.Data));
So, if I make a simple definition of ExternalClass like this:
public class ExternalClass
{
public event MessageReceivedHandler MessageReceived;
public void Send(MyMessage message)
{
this.MessageReceived(new MyMessage()
{
Address = message.Address,
Data = message.Data + "!"
});
}
}
...I get the result Bar! printed on the console.
If you have a whole bunch of messages that you want to process you can do this:
var messagesToSend = new List<MyMessage>();
/* populate `messagesToSend` */
var query =
from message in messagesToSend.ToObservable()
from response in DoAsynchronousCall(message)
select new
{
message,
response
};
query
.Subscribe(x =>
{
/* Do something with each correctly paired
`x.message` & `x.response`
*/
});
You're probably looking for ManualResetEvent which functions as a "toggle" of sorts to switch between thread-blocking and non-blocking behavior. The DoAsynchronousToSynchronousCall would Reset and then WaitOne(int timeoutMilliseconds) the event to block the thread, and the thing checking for the correct reply arrived would do the Set call to let the thread continue on its way if the correct thing arrived.
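A sketch of that approach, using one ManualResetEventSlim per pending address, might look like the following. The RequestReplyBridge class and the Action<MyMessage> passed to its constructor (standing in for externalClass.Send) are assumptions for illustration, and the sketch allows only one outstanding request per address.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public class MyMessage
{
    public string Address { get; set; }
    public string Data { get; set; }
}

public class RequestReplyBridge
{
    // One entry per address that is currently waiting for a reply.
    private sealed class Pending
    {
        public readonly ManualResetEventSlim Signal = new ManualResetEventSlim(false);
        public MyMessage Reply;
    }

    private readonly ConcurrentDictionary<string, Pending> _pending =
        new ConcurrentDictionary<string, Pending>();
    private readonly Action<MyMessage> _send;

    // 'send' stands in for externalClass.Send.
    public RequestReplyBridge(Action<MyMessage> send) { _send = send; }

    // Hook this up to externalClass.MessageReceived.
    public void MessageReceived(MyMessage message)
    {
        Pending pending;
        if (_pending.TryGetValue(message.Address, out pending))
        {
            pending.Reply = message;
            pending.Signal.Set(); // wake the waiting caller
        }
    }

    // Returns the reply, or null if none arrives within the timeout.
    public MyMessage DoAsynchronousToSynchronousCall(MyMessage message, int timeoutMilliseconds)
    {
        var pending = new Pending();
        if (!_pending.TryAdd(message.Address, pending))
            throw new InvalidOperationException("A request for this address is already pending.");
        try
        {
            _send(message);
            return pending.Signal.Wait(timeoutMilliseconds) ? pending.Reply : null;
        }
        finally
        {
            // The event is deliberately not disposed, to avoid racing with a late Set().
            Pending removed;
            _pending.TryRemove(message.Address, out removed);
        }
    }
}
```

The request is registered before Send is called, so a reply that arrives immediately (even on the same thread) is not missed.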
I have a slow and expensive method that returns some data for me:
public Data GetData(){...}
I don't want to wait until this method finishes executing. Instead, I want to return cached data immediately.
I have a class CachedData that contains one property, Data cachedData.
So I want to create another method, public CachedData GetCachedData(), that will start a new task (calling GetData inside it) and immediately return the cached data; after the task finishes, the cache is updated.
I need GetCachedData() to be thread-safe because multiple requests will call this method.
I will have a light ping, "has anything changed?", each minute, and if it returns true (cachedData != currentData) then I will call GetCachedData().
I'm new to C#. Please help me implement this method.
I'm using .net framework 4.5.2
The basic idea is clear:
You have a Data property which is a wrapper around an expensive function call.
In order to have some response immediately the property holds a cached value and performs updating in the background.
No need for an event when the updater is done because you poll, for now.
That seems like a straight-forward design. At some point you may want to use events, but that can be added later.
Depending on the circumstances it may be necessary to make access to the property thread-safe. I think that if the Data cache is a simple reference and no other data is updated together with it, a lock is not necessary, but you may want to declare the reference volatile so that the reading thread does not rely on a stale cached (ha!) version. This post seems to have good links which discuss the issues.
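A minimal sketch of that design, assuming the cache really is a single reference, could look like this. The CachedDataProvider class, the Func<Data> delegate standing in for the slow GetData call, and the Version property are illustrative assumptions. Interlocked.CompareExchange is used so that at most one refresh runs at a time.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class Data
{
    public int Version { get; set; }
}

public class CachedDataProvider
{
    private readonly Func<Data> _getData;   // the slow, expensive call
    private volatile Data _cached;          // reference is swapped atomically by the refresher
    private int _refreshing;                // 0 = idle, 1 = refresh in progress

    public CachedDataProvider(Func<Data> getData, Data initial)
    {
        _getData = getData;
        _cached = initial;
    }

    // Returns the current cached value immediately and starts at most one background refresh.
    public Data GetCachedData()
    {
        RefreshAsync();
        return _cached;
    }

    // Exposed so that callers can wait for the refresh if they want to.
    public Task RefreshAsync()
    {
        // Only the caller that flips 0 -> 1 starts a refresh; the others skip it.
        if (Interlocked.CompareExchange(ref _refreshing, 1, 0) != 0)
            return Task.FromResult(0);

        return Task.Run(() =>
        {
            try { _cached = _getData(); }
            finally { Interlocked.Exchange(ref _refreshing, 0); }
        });
    }
}
```

Note that GetCachedData discards the refresh task, so an exception thrown by the slow call goes unobserved in this sketch; real code would want to log or handle it.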
If GetCachedData is never called concurrently, you may not need the lock. If the data is null (as it certainly is on the first run), we wait for the long-running method to finish its work.
public class SlowClass
{
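// Initialized once; assigning it in the instance constructor would replace the
// shared lock object every time a new SlowClass is created.
private static readonly object _lock = new object();
private static Data _cachedData;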
public void GetCachedData()
{
var task = new Task(DoStuffLongRun);
task.Start();
if (_cachedData == null)
task.Wait();
}
public Data GetData()
{
if (_cachedData == null)
GetCachedData();
return _cachedData;
}
private void DoStuffLongRun()
{
lock (_lock)
{
Console.WriteLine("Locked Entered");
Thread.Sleep(5000);//Do Long Stuff
_cachedData = new Data();
}
}
}
I have tested on console application.
static void Main(string[] args)
{
var mySlow = new SlowClass();
var mySlow2 = new SlowClass();
mySlow.GetCachedData();
for (int i = 0; i < 5; i++)
{
Console.WriteLine(i);
mySlow.GetData();
mySlow2.GetData();
}
mySlow.GetCachedData();
Console.Read();
}
Maybe you can use the MemoryCache class,
as explained here in MSDN
This sample console application has 2 observables. The first one pushes numbers from 1 to 100. This observable is subscribed to by the AsyncClass, which runs a long-running process for each number it gets. Upon completion of this new async process I want to be able to 'push' to 2 subscribers which would be doing something with this new value.
My attempts are commented in the source code below.
AsyncClass:
class AsyncClass
{
private readonly IConnectableObservable<int> _source;
private readonly IDisposable _sourceDisposeObj;
public IObservable<string> _asyncOpObservable;
public AsyncClass(IConnectableObservable<int> source)
{
_source = source;
_sourceDisposeObj = _source.Subscribe(
ProcessArguments,
ExceptionHandler,
Completed
);
_source.Connect();
}
private void Completed()
{
Console.WriteLine("Completed");
Console.ReadKey();
}
private void ExceptionHandler(Exception exp)
{
throw exp;
}
private void ProcessArguments(int evtArgs)
{
Console.WriteLine("Argument being processed with value: " + evtArgs);
//_asyncOpObservable = LongRunningOperationAsync("hello").Publish();
// not going to work either since this creates a new observable for each value from main observer
}
// http://rxwiki.wikidot.com/101samples
public IObservable<string> LongRunningOperationAsync(string param)
{
// should not be creating an observable here, rather 'pushing' values?
return Observable.Create<string>(
o => Observable.ToAsync<string, string>(DoLongRunningOperation)(param).Subscribe(o)
);
}
private string DoLongRunningOperation(string arg)
{
return "Hello";
}
}
Main:
static void Main(string[] args)
{
var source = Observable
.Range(1, 100)
.Publish();
var asyncObj = new AsyncClass(source);
var _asyncTaskSource = asyncObj._asyncOpObservable;
var ui1 = new UI1(_asyncTaskSource);
var ui2 = new UI2(_asyncTaskSource);
}
UI1 (and UI2, they're basically the same):
class UI1
{
private IConnectableObservable<string> _asyncTaskSource;
private IDisposable _taskSourceDisposable;
public UI1(IConnectableObservable<string> asyncTaskSource)
{
_asyncTaskSource = asyncTaskSource;
_asyncTaskSource.Connect();
_taskSourceDisposable = _asyncTaskSource.Subscribe(RefreshUI, HandleException, Completed);
}
private void Completed()
{
Console.WriteLine("UI1: Stream completed");
}
private void HandleException(Exception obj)
{
Console.WriteLine("Exception! "+obj.Message);
}
private void RefreshUI(string obj)
{
Console.WriteLine("UI1: UI refreshing with value "+obj);
}
}
This is my first project with Rx so let me know if I should be thinking differently. Any help would be highly appreciated!
I'm going to let you know you should be thinking differently... :) Flippancy aside, this looks like a case of bad collision between object-oriented and functional-reactive styles.
It's not clear what the requirements are around timing of the data flow and caching of results here - the use of Publish and IConnectableObservable is a little confused. I'm going to guess you want to avoid the 2 downstream subscriptions causing the processing of a value being duplicated? I'm basing some of my answer on that premise. The use of Publish() can achieve this by allowing multiple subscribers to share a subscription to a single source.
Idiomatic Rx wants you to try and keep to a functional style. In order to do this, you want to present the long running work as a function. So let's say, instead of trying to wire your AsyncClass logic directly into the Rx chain as a class, you could present it as a function like this contrived example:
async Task<int> ProcessArgument(int argument)
{
// perform your lengthy calculation - maybe in an OO style,
// maybe creating class instances and invoking methods etc.
await Task.Delay(TimeSpan.FromSeconds(1));
return argument + 1;
}
Now, you can construct a complete Rx observable chain calling this function, and through the use of Publish().RefCount() you can avoid multiple subscribers causing duplicate effort. Note how this separates concerns too - the code processing the value is simpler because the reuse is handled elsewhere.
var query = source.SelectMany(x => ProcessArgument(x).ToObservable())
.Publish().RefCount();
By creating a single chain for subscribers, the work is only started when necessary on subscription. I've used Publish().RefCount() - but if you want to ensure values aren't missed by the second and subsequent subscribers, you could use Replay (easy) or use Publish() and then Connect - but you'll want the Connect logic outside the individual subscriber's code because you just need to call it once when all subscribers have subscribed.