Limit replay buffer by observable


I have a stream of live data and a second stream that delimits which parts of the live data belong together. When someone subscribes to the live data stream, I would like to replay the live data to them. However, I don't want to remember all of it, only the part since the last time the delimiter stream emitted a value.
There is an issue which would solve my problem, since it covers a replay operator that does exactly what I want (or at least I think so).
What is currently the easiest way to do this? Is there a better way than something like the following?
private class ReplayWithLimitObservable<TItem, TDelimiter> : IConnectableObservable<TItem>
{
private readonly List<TItem> cached = new List<TItem>();
private readonly IObservable<TDelimiter> delimitersObservable;
private readonly IObservable<TItem> itemsObservable;
public ReplayWithLimitObservable(IObservable<TItem> itemsObservable, IObservable<TDelimiter> delimitersObservable)
{
this.itemsObservable = itemsObservable;
this.delimitersObservable = delimitersObservable;
}
public IDisposable Subscribe(IObserver<TItem> observer)
{
lock (cached)
{
cached.ForEach(observer.OnNext);
}
return itemsObservable.Subscribe(observer);
}
public IDisposable Connect()
{
var delimiters = delimitersObservable.Subscribe(
p =>
{
lock (cached)
{
cached.Clear();
}
});
var items = itemsObservable.Subscribe(
p =>
{
lock (cached)
{
cached.Add(p);
}
});
return Disposable.Create(
() =>
{
items.Dispose();
delimiters.Dispose();
lock (cached)
{
cached.Clear();
}
});
}
}
public static IConnectableObservable<TItem> ReplayWithLimit<TItem, TDelimiter>(IObservable<TItem> items, IObservable<TDelimiter> delimiters)
{
return new ReplayWithLimitObservable<TItem, TDelimiter>(items, delimiters);
}

Does this do what you want? It has the advantage of leaving all of the locking and race conditions to the Rx pros :)
private class ReplayWithLimitObservable<T, TDelimiter> : IConnectableObservable<T>
{
private IConnectableObservable<IObservable<T>> _source;
public ReplayWithLimitObservable(IObservable<T> source, IObservable<TDelimiter> delimiter)
{
_source = source
.Window(delimiter) // new replay window on delimiter
.Select<IObservable<T>,IObservable<T>>(window =>
{
var replayWindow = window.Replay();
// immediately connect and start memorizing values
replayWindow.Connect();
return replayWindow;
})
.Replay(1); // remember the latest window
}
public IDisposable Connect()
{
return _source.Connect();
}
public IDisposable Subscribe(IObserver<T> observer)
{
return _source
.Concat()
.Subscribe(observer);
}
}
public static IConnectableObservable<TItem> ReplayWithLimit<TItem, TDelimiter>(IObservable<TItem> items, IObservable<TDelimiter> delimiters)
{
return new ReplayWithLimitObservable<TItem, TDelimiter>(items, delimiters);
}
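To see what the Window/Replay pipeline above buys you, here is a minimal sketch that wires the same operators to two Subjects and checks what a late subscriber receives. It assumes the System.Reactive package; the class name ReplayWindowDemo and the use of Unit as the delimiter type are my own choices for illustration, not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class ReplayWindowDemo
{
    public static List<int> Run()
    {
        var items = new Subject<int>();
        var delimiters = new Subject<Unit>();

        // Same shape as the answer: one replayed window per delimiter,
        // with the outer Replay(1) remembering only the newest window.
        IConnectableObservable<IObservable<int>> windows = items
            .Window(delimiters)
            .Select(window =>
            {
                IConnectableObservable<int> replay = window.Replay();
                replay.Connect(); // start buffering immediately
                return (IObservable<int>)replay;
            })
            .Replay(1);

        var seen = new List<int>();
        using (windows.Connect())
        {
            items.OnNext(1);
            items.OnNext(2);
            delimiters.OnNext(Unit.Default); // 1 and 2 fall out of the replay buffer
            items.OnNext(3);

            // A late subscriber: gets the current window replayed, then live values.
            using (windows.Concat().Subscribe(seen.Add))
            {
                items.OnNext(4);
            }
        }
        return seen;
    }
}
```

Calling ReplayWindowDemo.Run() should return 3 and 4: the values before the delimiter are forgotten, the value after it is replayed, and later values arrive live. Note that this sketch never disposes the inner replay connections, which is fine for a demo but worth handling in real code.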

Related

Looking for a way to do less locking while caching

I am using the code below to cache items. It's pretty basic.
The issue I have is that every time it caches an item, a section of the code locks. With roughly a million items arriving every hour or so, this is a problem.
I've tried creating a dictionary of static lock objects per cacheKey so that locking is granular, but that in itself becomes an issue with managing their expiration, etc.
Is there a better way to implement minimal locking?
private static readonly object cacheLock = new object();
public static T GetFromCache<T>(string cacheKey, Func<T> GetData) where T : class {
// Returns null if the string does not exist, prevents a race condition
// where the cache invalidates between the contains check and the retrieval.
T cachedData = MemoryCache.Default.Get(cacheKey) as T;
if (cachedData != null) {
return cachedData;
}
lock (cacheLock) {
// Check to see if anyone wrote to the cache while we were
// waiting our turn to write the new value.
cachedData = MemoryCache.Default.Get(cacheKey) as T;
if (cachedData != null) {
return cachedData;
}
// The value still did not exist so we now write it in to the cache.
cachedData = GetData();
MemoryCache.Default.Set(cacheKey, cachedData, new CacheItemPolicy(...));
return cachedData;
}
}

You may want to consider using ReaderWriterLockSlim, which lets you take a write lock only when needed.
Using cacheLock.EnterReadLock(); and cacheLock.EnterWriteLock(); should improve performance when reads far outnumber writes.
The link I gave even has an example of a cache, which is exactly what you need. I copy it here:
public class SynchronizedCache
{
private ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();
private Dictionary<int, string> innerCache = new Dictionary<int, string>();
public int Count
{ get { return innerCache.Count; } }
public string Read(int key)
{
cacheLock.EnterReadLock();
try
{
return innerCache[key];
}
finally
{
cacheLock.ExitReadLock();
}
}
public void Add(int key, string value)
{
cacheLock.EnterWriteLock();
try
{
innerCache.Add(key, value);
}
finally
{
cacheLock.ExitWriteLock();
}
}
public bool AddWithTimeout(int key, string value, int timeout)
{
if (cacheLock.TryEnterWriteLock(timeout))
{
try
{
innerCache.Add(key, value);
}
finally
{
cacheLock.ExitWriteLock();
}
return true;
}
else
{
return false;
}
}
public AddOrUpdateStatus AddOrUpdate(int key, string value)
{
cacheLock.EnterUpgradeableReadLock();
try
{
string result = null;
if (innerCache.TryGetValue(key, out result))
{
if (result == value)
{
return AddOrUpdateStatus.Unchanged;
}
else
{
cacheLock.EnterWriteLock();
try
{
innerCache[key] = value;
}
finally
{
cacheLock.ExitWriteLock();
}
return AddOrUpdateStatus.Updated;
}
}
else
{
cacheLock.EnterWriteLock();
try
{
innerCache.Add(key, value);
}
finally
{
cacheLock.ExitWriteLock();
}
return AddOrUpdateStatus.Added;
}
}
finally
{
cacheLock.ExitUpgradeableReadLock();
}
}
public void Delete(int key)
{
cacheLock.EnterWriteLock();
try
{
innerCache.Remove(key);
}
finally
{
cacheLock.ExitWriteLock();
}
}
public enum AddOrUpdateStatus
{
Added,
Updated,
Unchanged
};
~SynchronizedCache()
{
if (cacheLock != null) cacheLock.Dispose();
}
}

I don't know how MemoryCache.Default is implemented, or whether or not you have control over it.
But in general, prefer using ConcurrentDictionary over Dictionary with lock in a multi threaded environment.
GetFromCache would just become
ConcurrentDictionary<string, T> cache = new ConcurrentDictionary<string, T>();
...
cache.GetOrAdd("someKey", (key) =>
{
var data = PullDataFromDatabase(key);
return data;
});
There are two more things to take care of.
Expiry
Instead of saving T as the value of the dictionary, you can define a type
struct CacheItem<T>
{
public T Item { get; set; }
public DateTime Expiry { get; set; }
}
And store the cache as a CacheItem with a defined expiry.
cache.GetOrAdd("someKey", (key) =>
{
var data = PullDataFromDatabase(key);
return new CacheItem<T>() { Item = data, Expiry = DateTime.UtcNow.Add(TimeSpan.FromHours(1)) };
});
Now you can implement expiration in an asynchronous thread.
Timer expirationTimer = new Timer(ExpireCache, null, 60000, 60000);
...
void ExpireCache(object state)
{
var needToExpire = cache.Where(c => DateTime.UtcNow >= c.Value.Expiry).Select(c => c.Key);
foreach (var key in needToExpire)
{
cache.TryRemove(key, out CacheItem<T> _);
}
}
Once a minute, you search for all cache entries that need to be expired, and remove them.
"Locking"
Using ConcurrentDictionary guarantees that simultaneous read/writes won't corrupt the dictionary or throw an exception.
But, you can still end up with a situation where two simultaneous reads cause you to fetch the data from the database twice.
One neat trick to solve this is to wrap the value of the dictionary with Lazy
ConcurrentDictionary<string, Lazy<CacheItem<T>>> cache = new ConcurrentDictionary<string, Lazy<CacheItem<T>>>();
...
var item = cache.GetOrAdd("someKey", key => new Lazy<CacheItem<T>>(() =>
{
var data = PullDataFromDatabase(key);
return new CacheItem<T>() { Item = data, Expiry = DateTime.UtcNow.Add(TimeSpan.FromHours(1)) };
})).Value;
Explanation
With GetOrAdd, you might end up invoking the "get from database if not in cache" delegate multiple times in the case of simultaneous requests.
However, GetOrAdd will end up using only one of the values that the delegate returned, and by returning a Lazy, you guarantee that only one Lazy will get invoked.
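The Lazy trick can be checked with a small experiment. The sketch below uses only the BCL; SlowLoad is a hypothetical stand-in for PullDataFromDatabase, and the load counter exists purely so the behaviour can be observed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class LazyCacheDemo
{
    private static readonly ConcurrentDictionary<string, Lazy<string>> Cache =
        new ConcurrentDictionary<string, Lazy<string>>();

    private static int _loads;

    // Hypothetical stand-in for PullDataFromDatabase.
    private static string SlowLoad(string key)
    {
        Interlocked.Increment(ref _loads);
        Thread.Sleep(50);
        return "value-for-" + key;
    }

    public static string Get(string key)
    {
        // GetOrAdd may build several Lazy wrappers under contention, but only
        // the one that ends up stored ever has Value read, so SlowLoad runs
        // once per key.
        return Cache.GetOrAdd(
            key,
            k => new Lazy<string>(
                () => SlowLoad(k),
                LazyThreadSafetyMode.ExecutionAndPublication)).Value;
    }

    public static int RunConcurrentReads()
    {
        // Sixteen simultaneous readers of the same key.
        Parallel.For(0, 16, i => { Get("someKey"); });
        return Volatile.Read(ref _loads);
    }
}
```

LazyCacheDemo.RunConcurrentReads() returns 1 no matter how many readers race on the key. One thing to keep in mind with this design: if the loader throws, ExecutionAndPublication caches the exception inside the Lazy, so a failed entry needs to be removed before it can be retried.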

Dispose inner subscription of merge

!!warning: Rx newbie!!
We have multiple price feeds. The requirement is to subscribe to all of these feeds and only output the latest tick every 1 second (throttle).
public static class FeedHandler
{
private static IObservable<PriceTick> _combinedPriceFeed = null;
private static double _throttleFrequency = 1000;
public static void AddToCombinedFeed(IObservable<PriceTick> feed)
{
_combinedPriceFeed = _combinedPriceFeed != null ? _combinedPriceFeed.Merge(feed) : feed;
AddFeed(_combinedPriceFeed);
}
private static IDisposable _subscriber;
private static void AddFeed(IObservable<PriceTick> feed)
{
_subscriber?.Dispose();
_subscriber = feed.Buffer(TimeSpan.FromMilliseconds(_throttleFrequency)).Subscribe(buffer => buffer.GroupBy(x => x.InstrumentId, (key, result) => result.First()).ToObservable().Subscribe(NotifyClient));
}
public static void NotifyClient(PriceTick tick)
{
//Do some action
}
}
The code has multiple issues. If I call AddToCombinedFeed with the same feed multiple times, the streams will get duplicated to start with, e.g.:
IObservable<PriceTick> feed1;
FeedHandler.AddToCombinedFeed(feed1);//1 stream
FeedHandler.AddToCombinedFeed(feed1);//2 streams (even though the GroupBy and First() calls will prevent this effect from propagating further)
This brings me to the question. If I want to remove one price stream from the merged stream, how can I do that?

Update - New Solution
Use DynamicData (MIT license) by Roland Pheasant, available on NuGet:
Use a SourceList instead of a List.
Use the MergeMany operator.
Code:
public class FeedHandler
{
private readonly IDisposable _subscriber;
private readonly SourceList<IObservable<PriceTick>> _feeds = new SourceList<IObservable<PriceTick>>();
private readonly double _throttleFrequency = 1000;
public FeedHandler()
{
var combinedPriceFeed = _feeds.Connect().MergeMany(x => x).Buffer(TimeSpan.FromMilliseconds(_throttleFrequency)).SelectMany(buffer => buffer.GroupBy(x => x.InstrumentId, (key, result) => result.First()));
_subscriber = combinedPriceFeed.Subscribe(NotifyClient);
}
public void AddFeed(IObservable<PriceTick> feed) => _feeds.Add(feed);
public void NotifyClient(PriceTick tick)
{
//Do some action
}
}
Old Solution
Eradicate the need to resubscribe by applying the Switch() technique.
Your _combinedPriceFeed just switches to the next observable that
is supplied by _combinedPriceFeedChange.
Keep a list to manage your multiple feeds. Create the new observable whenever the list changes and provide it via _combinedPriceFeedChange.
A corresponding remove method follows the same logic.
Code:
public class FeedHandler
{
private readonly IDisposable _subscriber;
private readonly IObservable<PriceTick> _combinedPriceFeed;
private readonly List<IObservable<PriceTick>> _feeds = new List<IObservable<PriceTick>>();
private readonly BehaviorSubject<IObservable<PriceTick>> _combinedPriceFeedChange = new BehaviorSubject<IObservable<PriceTick>>(Observable.Never<PriceTick>());
private readonly double _throttleFrequency = 1000;
public FeedHandler()
{
_combinedPriceFeed = _combinedPriceFeedChange.Switch().Buffer(TimeSpan.FromMilliseconds(_throttleFrequency)).SelectMany(buffer => buffer.GroupBy(x => x.InstrumentId, (key, result) => result.First()));
_subscriber = _combinedPriceFeed.Subscribe(NotifyClient);
}
public void AddFeed(IObservable<PriceTick> feed)
{
_feeds.Add(feed);
_combinedPriceFeedChange.OnNext(_feeds.Merge());
}
public void NotifyClient(PriceTick tick)
{
//Do some action
}
}

Here's the code you need:
private static SerialDisposable _subscriber = new SerialDisposable();
private static void AddFeed(IObservable<PriceTick> feed)
{
_subscriber.Disposable =
feed
.Buffer(TimeSpan.FromMilliseconds(_throttleFrequency))
.SelectMany(buffer =>
buffer
.GroupBy(x => x.InstrumentId, (key, result) => result.First()))
.Subscribe(NotifyClient);
}
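For the original question of removing a single feed from the merged stream, one option that needs nothing beyond System.Reactive is to give every feed its own subscription into a shared Subject and hand that subscription back to the caller. This is only a sketch of the idea: FeedHub and its members are hypothetical names, and string stands in for PriceTick.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public sealed class FeedHub<T>
{
    private readonly Subject<T> _merged = new Subject<T>();
    private readonly object _gate = new object();

    // Apply Buffer/GroupBy/etc. to this stream exactly once.
    public IObservable<T> Merged { get { return _merged.AsObservable(); } }

    // Disposing the returned handle removes just this feed.
    public IDisposable AddFeed(IObservable<T> feed)
    {
        return feed.Subscribe(
            tick => { lock (_gate) { _merged.OnNext(tick); } },
            (Exception ex) => { /* assumption: a failed feed is simply dropped */ });
    }
}

public static class FeedHubDemo
{
    public static List<string> Run()
    {
        var hub = new FeedHub<string>();
        var seen = new List<string>();
        var feedA = new Subject<string>();
        var feedB = new Subject<string>();

        using (hub.Merged.Subscribe(seen.Add))
        {
            IDisposable a = hub.AddFeed(feedA);
            IDisposable b = hub.AddFeed(feedB);
            feedA.OnNext("A1");
            feedB.OnNext("B1");
            a.Dispose();          // remove feed A only
            feedA.OnNext("A2");   // no longer forwarded
            feedB.OnNext("B2");
            b.Dispose();
        }
        return seen;
    }
}
```

FeedHubDemo.Run() returns A1, B1, B2. Only OnNext is forwarded, so a feed that completes does not terminate the merged stream; whether swallowing a feed's error is acceptable depends on the application.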

How to create repeatable IObservable of resource instances?

Need some help with Rx. I want to define an observable which should create a resource when the first subscription is created, post this resource instance once for every new subscription, and dispose that resource instance when all subscriptions are done. Something like Observable.Using, but with Publish(value) and RefCount behaviour. All my attempts to express it using standard operators failed. This code does what I want, but I think there must be a standard way to do it. I really don't want to reinvent the wheel.
using System;
using System.Reactive.Linq;
using System.Reactive.Disposables;
namespace ConsoleApplication1
{
class Program
{
static void Main()
{
// this part is what I can't express in standard Rx operators..
Res res = null;
RefCountDisposable disp = null;
var @using = Observable.Create<Res>(obs =>
{
res = res ?? new Res();
disp = disp == null || disp.IsDisposed ? new RefCountDisposable(res) : disp;
obs.OnNext(res);
return new CompositeDisposable(disp.GetDisposable(), disp, Disposable.Create(() => res = null));
});
// end
var sub1 = @using.Subscribe(Print);
var sub2 = @using.Subscribe(Print);
sub1.Dispose();
sub2.Dispose();
sub1 = @using.Subscribe(Print);
sub2 = @using.Subscribe(Print);
sub1.Dispose();
sub2.Dispose();
Console.ReadKey();
}
static void Print(object o)
{
Console.WriteLine(o.GetHashCode());
}
}
class Res : IDisposable
{
public Res()
{
Console.WriteLine("CREATED");
}
public void Dispose()
{
Console.WriteLine("DISPOSED");
}
}
}
Output:
CREATED
1111
1111
DISPOSED
CREATED
2222
2222
DISPOSED
My "best" attempt with standard operators:
var @using = Observable.Using(() => new Res(), res => Observable.Never(res).StartWith(res))
.Replay(1)
.RefCount();
and output is:
CREATED
1111
1111
DISPOSED
CREATED
1111 <-- this is "wrong" value
2222
2222
DISPOSED
Thank you!
ps. sorry for my poor english =(

After a little headache I finally realized that the problem with Using.Replay.RefCount was that Replay internally calls Multicast with a single ReplaySubject instance, but in my specific case I need a Replay that recreates the subject on every new first subscription. Through Google I found the Rxx library, and its ReconnectableObservable was the answer. It uses a subject factory instead of a subject instance to recreate the subject in every Connect call (original Rxx code, simply without contracts):
internal sealed class ReconnectableObservable<TSource, TResult> : IConnectableObservable<TResult>
{
private ISubject<TSource, TResult> Subject
{
get { return _subject ?? (_subject = _factory()); }
}
private readonly object _gate = new object();
private readonly IObservable<TSource> _source;
private readonly Func<ISubject<TSource, TResult>> _factory;
private ISubject<TSource, TResult> _subject;
private IDisposable _subscription;
public ReconnectableObservable(IObservable<TSource> source, Func<ISubject<TSource, TResult>> factory)
{
_source = source;
_factory = factory;
}
public IDisposable Connect()
{
lock (_gate)
{
if (_subscription != null)
return _subscription;
_subscription = new CompositeDisposable(
_source.Subscribe(Subject),
Disposable.Create(() =>
{
lock (_gate)
{
_subscription = null;
_subject = null;
}
}));
return _subscription;
}
}
public IDisposable Subscribe(IObserver<TResult> observer)
{
lock (_gate)
{
return Subject.Subscribe(observer);
}
}
}
and a few extension methods:
public static class Ext
{
public static IConnectableObservable<T> Multicast<T>(this IObservable<T> obs, Func<ISubject<T>> subjectFactory)
{
return new ReconnectableObservable<T, T>(obs, subjectFactory);
}
public static IConnectableObservable<T> ReplayReconnect<T>(this IObservable<T> obs, int replayCount)
{
return obs.Multicast(() => new ReplaySubject<T>(replayCount));
}
public static IConnectableObservable<T> PublishReconnect<T>(this IObservable<T> obs)
{
return obs.Multicast(() => new Subject<T>());
}
}
Using that code, I'm now able to do this:
var @using = Observable
.Using(() => new Res(), _ => Observable.Never(_).StartWith(_))
.ReplayReconnect(1) // <-- that's it!
.RefCount();
Yahoo! It works as expected.
Thanks for all who answered! You pushed me in the right direction.

Try this:
var @using = Observable.Using(
() => new Res(),
res => Observable.Return(res).Concat(Observable.Never<Res>()))
.Publish((Res)null)
.RefCount()
.SkipWhile(res => res == null);
The Concat prevents observers from auto-unsubscribing when the observable produces its only value.

If there is a way to do this using standard operators, I can't see it.
The problem is that there is no "cache values only if there are subscribers" option amongst the standard operators.
The Replay operator will cache the last value regardless of subscribers, and is the underlying cause of the "wrong" value you are seeing.
It highlights the fact that Using + Replay is a dangerous combination since it's emitted a disposed value.
I suspect that if someone did manage some wizardry with the standard operators, it wouldn't be as readable as your Observable.Create implementation.
I have used Observable.Create many times to create code that I am certain is more concise, readable and maintainable than an equivalent construction using standard operators.
My advice is that there is absolutely nothing wrong with using Observable.Create - wrap your code up in a nice factory method that accepts the resource and you're good to go. Here's my attempt at doing that, it's just a repackage of your code with thread safety added:
public static IObservable<T> CreateObservableRefCountedResource<T>(Func<T> resourceFactory)
where T : class, IDisposable
{
T resource = null;
RefCountDisposable resourceDisposable = null;
var gate = new object();
return Observable.Create<T>(o =>
{
lock (gate)
{
resource = resource ?? resourceFactory();
var disposeAction = Disposable.Create(() =>
{
lock (gate)
{
resource.Dispose();
resource = null;
}
});
resourceDisposable = (resourceDisposable == null || resourceDisposable.IsDisposed)
? new RefCountDisposable(disposeAction)
: resourceDisposable;
o.OnNext(resource);
return new CompositeDisposable(
resourceDisposable,
resourceDisposable.GetDisposable());
}
});
}
EDITED - Forgot to call resourceDisposable.GetDisposable()!
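The stale value that Replay(1).RefCount() hands out after a reconnect can be reproduced without any disposable resource. The sketch below assumes the System.Reactive package and uses a counter in place of Res; StaleReplayDemo is a hypothetical name. Each new connection produces a fresh id, yet the second subscriber is expected to see the previous id first because the single ReplaySubject inside Replay survives the disconnect.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

public static class StaleReplayDemo
{
    public static List<int> Run()
    {
        int created = 0;

        // Every connection to the source "creates" a new id and never completes,
        // mirroring Observable.Never(res).StartWith(res) from the question.
        IObservable<int> source = Observable
            .Defer(() =>
            {
                int id = ++created;
                return Observable.Never<int>().StartWith(id);
            })
            .Replay(1)
            .RefCount();

        var seen = new List<int>();

        // First subscriber connects, sees id 1, then disconnects.
        source.Subscribe(seen.Add).Dispose();

        // Second subscriber: the buffered 1 is replayed before the
        // reconnect produces id 2.
        source.Subscribe(seen.Add).Dispose();

        return seen;
    }
}
```

With the Rx behaviour described in the answers above, Run() should yield 1, 1, 2, where the middle value is the stale one. With Observable.Using in place of Defer, that stale value would be an already disposed resource, which is why the combination is risky.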

Parallel.ForEach stalled when integrated with BlockingCollection

I adapted my implementation of a parallel consumer from the code in this question
class ParallelConsumer<T> : IDisposable
{
private readonly int _maxParallel;
private readonly Action<T> _action;
private readonly TaskFactory _factory = new TaskFactory();
private CancellationTokenSource _tokenSource;
private readonly BlockingCollection<T> _entries = new BlockingCollection<T>();
private Task _task;
public ParallelConsumer(int maxParallel, Action<T> action)
{
_maxParallel = maxParallel;
_action = action;
}
public void Start()
{
try
{
_tokenSource = new CancellationTokenSource();
_task = _factory.StartNew(
() =>
{
Parallel.ForEach(
_entries.GetConsumingEnumerable(),
new ParallelOptions { MaxDegreeOfParallelism = _maxParallel, CancellationToken = _tokenSource.Token },
(item, loopState) =>
{
Log("Taking" + item);
if (!_tokenSource.IsCancellationRequested)
{
_action(item);
Log("Finished" + item);
}
else
{
Log("Not Taking" + item);
_entries.CompleteAdding();
loopState.Stop();
}
});
},
_tokenSource.Token);
}
catch (OperationCanceledException oce)
{
System.Diagnostics.Debug.WriteLine(oce);
}
}
private void Log(string message)
{
Console.WriteLine(message);
}
public void Stop()
{
Dispose();
}
public void Enqueue(T entry)
{
Log("Enqueuing" + entry);
_entries.Add(entry);
}
public void Dispose()
{
if (_task == null)
{
return;
}
_tokenSource.Cancel();
while (!_task.IsCanceled)
{
}
_task.Dispose();
_tokenSource.Dispose();
_task = null;
}
}
And here is the test code
class Program
{
static void Main(string[] args)
{
TestRepeatedEnqueue(100, 1);
}
private static void TestRepeatedEnqueue(int itemCount, int parallelCount)
{
bool[] flags = new bool[itemCount];
var consumer = new ParallelConsumer<int>(parallelCount,
(i) =>
{
flags[i] = true;
}
);
consumer.Start();
for (int i = 0; i < itemCount; i++)
{
consumer.Enqueue(i);
}
Thread.Sleep(1000);
Debug.Assert(flags.All(b => b == true));
}
}
The test always fails: it always gets stuck at around the 93rd item of the 100 tested. Any idea which part of my code causes this issue, and how to fix it?

You cannot use Parallel.ForEach() with BlockingCollection.GetConsumingEnumerable(), as you have discovered.
For an explanation, see this blog post:
https://devblogs.microsoft.com/pfxteam/parallelextensionsextras-tour-4-blockingcollectionextensions/
Excerpt from the blog:
BlockingCollection’s GetConsumingEnumerable implementation is using BlockingCollection’s internal synchronization which already supports multiple consumers concurrently, but ForEach doesn’t know that, and its enumerable-partitioning logic also needs to take a lock while accessing the enumerable.
As such, there’s more synchronization here than is actually necessary, resulting in a potentially non-negligable performance hit.
[Also] the partitioning algorithm employed by default by both Parallel.ForEach and PLINQ use chunking in order to minimize synchronization costs: rather than taking the lock once per element, it'll take the lock, grab a group of elements (a chunk), and then release the lock.
While this design can help with overall throughput, for scenarios that are focused more on low latency, that chunking can be prohibitive.
That blog also provides the source code for a method called GetConsumingPartitioner() which you can use to solve the problem.
public static class BlockingCollectionExtensions
{
public static Partitioner<T> GetConsumingPartitioner<T>(this BlockingCollection<T> collection)
{
return new BlockingCollectionPartitioner<T>(collection);
}
public class BlockingCollectionPartitioner<T> : Partitioner<T>
{
private BlockingCollection<T> _collection;
internal BlockingCollectionPartitioner(BlockingCollection<T> collection)
{
if (collection == null)
throw new ArgumentNullException("collection");
_collection = collection;
}
public override bool SupportsDynamicPartitions
{
get { return true; }
}
public override IList<IEnumerator<T>> GetPartitions(int partitionCount)
{
if (partitionCount < 1)
throw new ArgumentOutOfRangeException("partitionCount");
var dynamicPartitioner = GetDynamicPartitions();
return Enumerable.Range(0, partitionCount).Select(_ => dynamicPartitioner.GetEnumerator()).ToArray();
}
public override IEnumerable<T> GetDynamicPartitions()
{
return _collection.GetConsumingEnumerable();
}
}
}

The failure is caused by the following, as explained here:
The partitioning algorithm employed by default by both
Parallel.ForEach and PLINQ use chunking in order to minimize
synchronization costs: rather than taking the lock once per element,
it'll take the lock, grab a group of elements (a chunk), and then
release the lock.
To get it to work, you can add a method on your ParallelConsumer<T> class to indicate that the adding is completed, as below
public void StopAdding()
{
_entries.CompleteAdding();
}
Now call this method after your for loop, as below:
consumer.Start();
for (int i = 0; i < itemCount; i++)
{
consumer.Enqueue(i);
}
consumer.StopAdding();
Otherwise, Parallel.ForEach() would wait for the threshold to be reached so as to grab the chunk and start processing.
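As an alternative to hand-writing a partitioner, the BCL can be asked to skip chunking altogether. The sketch below is a swapped-in technique, not the blog's GetConsumingPartitioner: it uses Partitioner.Create with EnumerablePartitionerOptions.NoBuffering (available since .NET 4.5) and also calls CompleteAdding so the loop can finish. ConsumingDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ConsumingDemo
{
    public static bool[] Run(int itemCount, int maxParallel)
    {
        var flags = new bool[itemCount];

        using (var entries = new BlockingCollection<int>())
        {
            // NoBuffering hands items to workers one at a time instead of
            // waiting to fill a chunk, so no item sits unprocessed in a buffer.
            var partitioner = Partitioner.Create(
                entries.GetConsumingEnumerable(),
                EnumerablePartitionerOptions.NoBuffering);

            Task consumer = Task.Run(() =>
            {
                Parallel.ForEach(
                    partitioner,
                    new ParallelOptions { MaxDegreeOfParallelism = maxParallel },
                    i => { flags[i] = true; });
            });

            for (int i = 0; i < itemCount; i++)
            {
                entries.Add(i);
            }

            // Without this the consuming enumerable never ends
            // and Parallel.ForEach never returns.
            entries.CompleteAdding();
            consumer.Wait();
        }

        return flags;
    }
}
```

ConsumingDemo.Run(100, 1).All(b => b) is true once the call returns, because the method waits for the consumer instead of sleeping for a fixed time as the test in the question does.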

How to use lock on a Dictionary containing a list of objects in C#?

I have the following class:
public static class HotspotsCache
{
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
private static object Lock = new object();
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
lock (Lock)
{
if (!_companyHotspots.ContainsKey(companyId))
{
RefreshCompanyHotspotCache(companyId);
}
return _companyHotspots[companyId];
}
}
private static void RefreshCompanyHotspotCache(short companyId)
{
....
hotspots = ServiceProvider.Instance.GetService<HotspotsService>().GetHotSpots(..);
_companyHotspots.Add(companyId, hotspots);
....
}
}
The issue that I'm having is that the operation of getting the hotspots, in the RefreshCompanyHotspotCache method, takes a lot of time. So while one thread is performing the cache refresh for a certain companyId, all the other threads wait until this operation is finished, although there could be threads requesting the list of hotspots for another companyId for which the list is already loaded in the dictionary. I would like these last threads not to be blocked. I also want all threads that are requesting the list of hotspots for a company that is not yet loaded in the cache to wait until the list is fully retrieved and loaded in the dictionary.
Is there a way to lock only the threads that are reading/writing the cache for a certain companyId (for which the refresh is taking place) and let the other threads that are requesting data for another company do their job?
My thought was to use an array of locks
lock (companyLocks[companyId])
{
...
}
But that didn't solve anything. The threads dealing with one company are still waiting for threads that are refreshing the cache for other companies.

Use the Double-checked lock mechanism also mentioned by Snowbear - this will prevent your code locking when it doesn't actually need to.
With your idea of an individual lock per client, I've used this mechanism in the past, though I used a dictionary of locks. I made a utility class for getting a lock object from a key:
/// <summary>
/// Provides a mechanism to lock based on a data item being retrieved
/// </summary>
/// <typeparam name="T">Type of the data being used as a key</typeparam>
public class LockProvider<T>
{
private object _syncRoot = new object();
private Dictionary<T, object> _lstLocks = new Dictionary<T, object>();
/// <summary>
/// Gets an object suitable for locking the specified data item
/// </summary>
/// <param name="key">The data key</param>
/// <returns></returns>
public object GetLock(T key)
{
if (!_lstLocks.ContainsKey(key))
{
lock (_syncRoot)
{
if (!_lstLocks.ContainsKey(key))
_lstLocks.Add(key, new object());
}
}
return _lstLocks[key];
}
}
So simply use this in the following manner...
private static LockProvider<short> _clientLocks = new LockProvider<short>();
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
if (!_companyHotspots.ContainsKey(companyId))
{
lock (_clientLocks.GetLock(companyId))
{
if (!_companyHotspots.ContainsKey(companyId))
{
// Add item to _companyHotspots here...
}
}
}
return _companyHotspots[companyId];
}
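One caveat with the LockProvider above is that it reads a plain Dictionary outside the lock, which is not safe while another thread may be adding to it. A minimal variant, assuming .NET 4 or later, lets ConcurrentDictionary.GetOrAdd do that work; KeyedLocks is a hypothetical name and the sketch is not the answer's original class.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class KeyedLocks<TKey>
{
    private readonly ConcurrentDictionary<TKey, object> _locks =
        new ConcurrentDictionary<TKey, object>();

    // Always returns the same lock object for the same key, even when
    // several threads ask for a new key at the same moment: GetOrAdd may
    // create spare objects under contention but stores and returns only one.
    public object GetLock(TKey key)
    {
        return _locks.GetOrAdd(key, _ => new object());
    }
}

public static class KeyedLocksDemo
{
    public static bool SameKeySameLock()
    {
        var locks = new KeyedLocks<short>();
        return ReferenceEquals(locks.GetLock(1), locks.GetLock(1));
    }

    public static bool DifferentKeysDifferentLocks()
    {
        var locks = new KeyedLocks<short>();
        return !ReferenceEquals(locks.GetLock(1), locks.GetLock(2));
    }
}
```

Usage is the same as before: lock (locks.GetLock(companyId)) { ... }. The cache dictionary that is read outside the per-key lock has the same thread-safety concern and would need to be a ConcurrentDictionary as well. Like the original, this sketch never removes lock objects, so it suits a bounded key space such as company ids.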

How about you only lock 1 thread, and let that update, while everyone else uses the old list?
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
private static Dictionary<short, List<HotSpot>> _companyHotspotsOld = new Dictionary<short, List<HotSpot>>();
private static bool _hotspotsUpdating = false;
private static object Lock = new object();
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
if (!_hotspotsUpdating)
{
if (!_companyHotspots.ContainsKey(companyId))
{
lock (Lock)
{
_hotspotsUpdating = true;
_companyHotspotsOld = _companyHotspots;
RefreshCompanyHotspotCache(companyId);
_hotspotsUpdating = false;
return _companyHotspots[companyId];
}
}
else
{
return _companyHotspots[companyId];
}
}
else
{
return _companyHotspotsOld[companyId];
}
}

Have you looked into ReaderWriterLockSlim? That should let you get finer-grained locking where you only take a write lock when needed.
Another thing you may need to look out for is false sharing. I don't know how a lock is implemented exactly but if you lock on objects in an array they're bound to be close to each other in memory, possibly putting them on the same cacheline, so the lock may not behave as you expect.
Another idea, what happens if you change the last code snippet to
object l = companyLocks[companyId];
lock(l){
}
could be the lock statement wraps more here than intended.
GJ

New idea, with locking just the lists as they are created.
If you can guarantee that each company will have at least one hotspot, do this:
public static class HotspotsCache
{
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
static HotspotsCache()
{
foreach(short companyId in allCompanies)
{
_companyHotspots.Add(companyId, new List<HotSpot>());
}
}
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
List<HotSpot> result = _companyHotspots[companyId];
if(result.Count == 0)
{
lock(result)
{
if(result.Count == 0)
{
RefreshCompanyHotspotCache(companyId, result);
}
}
}
return result;
}
private static void RefreshCompanyHotspotCache(short companyId, List<HotSpot> resultList)
{
....
hotspots = ServiceProvider.Instance.GetService<HotspotsService>().GetHotSpots(..);
resultList.AddRange(hotspots);
....
}
}
Since the dictionary is not being modified after its initial creation, there is no need to do any locking on it. We only need to lock the individual lists as we populate them; the read operation needs no locking (including the initial Count == 0).

If you're able to use .NET 4 then the answer is straightforward -- use a ConcurrentDictionary<K,V> instead and let that look after the concurrency details for you:
public static class HotSpotsCache
{
private static readonly ConcurrentDictionary<short, List<HotSpot>>
_hotSpotsMap = new ConcurrentDictionary<short, List<HotSpot>>();
public static List<HotSpot> GetCompanyHotSpots(short companyId)
{
return _hotSpotsMap.GetOrAdd(companyId, id => LoadHotSpots(id));
}
private static List<HotSpot> LoadHotSpots(short companyId)
{
return ServiceProvider.Instance
.GetService<HotSpotsService>()
.GetHotSpots(/* ... */);
}
}
If you're not able to use .NET 4 then your idea of using several more granular locks is a good one:
public static class HotSpotsCache
{
private static readonly Dictionary<short, List<HotSpot>>
_hotSpotsMap = new Dictionary<short, List<HotSpot>>();
private static readonly object _bigLock = new object();
private static readonly Dictionary<short, object>
_miniLocks = new Dictionary<short, object>();
public static List<HotSpot> GetCompanyHotSpots(short companyId)
{
List<HotSpot> hotSpots;
object miniLock;
lock (_bigLock)
{
if (_hotSpotsMap.TryGetValue(companyId, out hotSpots))
return hotSpots;
if (!_miniLocks.TryGetValue(companyId, out miniLock))
{
miniLock = new object();
_miniLocks.Add(companyId, miniLock);
}
}
lock (miniLock)
{
if (!_hotSpotsMap.TryGetValue(companyId, out hotSpots))
{
hotSpots = LoadHotSpots(companyId);
lock (_bigLock)
{
_hotSpotsMap.Add(companyId, hotSpots);
_miniLocks.Remove(companyId);
}
}
return hotSpots;
}
}
private static List<HotSpot> LoadHotSpots(short companyId)
{
return ServiceProvider.Instance
.GetService<HotSpotsService>()
.GetHotSpots(/* ... */);
}
}
