Singleton pattern on persistent in-memory cache - c#

Using what I judged to be the best of all worlds from the excellent article Implementing the Singleton Pattern in C#, I have been successfully using the following class to persist user-defined data in memory (for data that is very rarely modified):
public class Params
{
static readonly Params Instance = new Params();
Params()
{
}
public static Params InMemory
{
get
{
return Instance;
}
}
private IEnumerable<Localization> _localizations;
public IEnumerable<Localization> Localizations
{
get
{
return _localizations ?? (_localizations = new Repository<Localization>().Get());
}
}
public int ChunkSize
{
get
{
// Loc uses the Localizations impl
return LC.Loc("params.chunksize").To<int>();
}
}
public void RebuildLocalizations()
{
_localizations = null;
}
// other similar values coming from the DB and staying in-memory,
// and their refresh methods
}
My usage would look something like this:
var allLocs = Params.InMemory.Localizations; //etc
Whenever I update the database, RebuildLocalizations gets called, so only part of my in-memory store is rebuilt. One production environment out of about 10 seems to misbehave when RebuildLocalizations gets called: it does not refresh at all, but the problem also seems to be intermittent and very odd altogether.
My current suspicion falls on the singleton, even though I think it does the job well and all the unit tests show that the singleton mechanism, the refresh mechanism and the RAM performance work as expected.
That said, I am down to these possibilities:
This customer is lying when he says their environment is not using load balancing, which is a setup I would not expect the in-memory stuff to work properly in (right?)
There is some non-standard pool configuration in their IIS which I am testing against (maybe in a Web Garden setting?)
The singleton is failing somehow, but not sure how.
Any suggestions?
.NET 3.5 so not much parallel juice available, and not ready to use the Reactive Extensions for now
Edit1: as per the suggestions, would the getter look something like:
public IEnumerable<Localization> Localizations
{
get
{
lock(_localizations) {
return _localizations ?? (_localizations = new Repository<Localization>().Get());
}
}
}

To expand on my comment, here is how you might make the Localizations property thread safe:
public class Params
{
private object _lock = new object();
private IEnumerable<Localization> _localizations;
public IEnumerable<Localization> Localizations
{
get
{
lock (_lock) {
if ( _localizations == null ) {
_localizations = new Repository<Localization>().Get();
}
return _localizations;
}
}
}
public void RebuildLocalizations()
{
lock(_lock) {
_localizations = null;
}
}
// other similar values coming from the DB and staying in-memory,
// and their refresh methods
}

There is no point in creating a thread safe singleton, if your properties are not going to be thread safe.
You should either lock around assignment of the _localizations field, or instantiate it in your singleton's constructor (preferred). Any suggestion which applies to singleton instantiation applies to this lazy-instantiated property.
The same thing further applies to all properties (and their properties) of Localization. If this is a Singleton, it means that any thread can access it any time, and simply locking the getter will again do nothing.
For example, consider this case:
   Thread 1                              Thread 2
   // both threads access the singleton, but you are "safe" because you locked
1. var loc1 = Params.Localizations;      var loc2 = Params.Localizations;
   // do stuff                           // thread 2 calls the same property...
2. var value = loc1.ChunkSize;           var chunk = LC.Loc("params.chunksize");
   // invalidate                         // ...there is a slight pause here...
3. loc1.RebuildLocalizations();
                                         // ...and gets the wrong value
4.                                       var value = chunk.To();
If you are only reading these values, then it might not matter, but you can see how you can easily get in trouble with this approach.
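Remember that with threading, you never know if a different thread will execute something between two instructions. In C#, only single reads and writes of references and of primitive values up to 32 bits are guaranteed to be atomic; anything composed of more than one operation is not.
This means that this line: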
return LC.Loc("params.chunksize").To<int>();
is, as far as threading is concerned, equivalent to:
var loc = LC.Loc("params.chunksize");
Thread.Sleep(1); // anything can happen here :-(
return loc.To<int>();
Any thread can jump in between Loc and To.
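One way to narrow that window is sketched below, under assumptions: keep the cached data in an immutable snapshot and have each caller read the reference once into a local. `SettingsSnapshot` and `SettingsCache` are hypothetical names that are not part of the original code, and the `load` delegate stands in for the repository call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable snapshot of the cached values; never modified after construction.
public sealed class SettingsSnapshot
{
    private readonly Dictionary<string, string> _values;

    public SettingsSnapshot(IDictionary<string, string> values)
    {
        // Copy the input so later changes by the caller cannot leak in.
        _values = new Dictionary<string, string>(values);
    }

    public int GetInt(string key)
    {
        return int.Parse(_values[key]);
    }
}

// Hypothetical cache: the lock protects loading and invalidation only.
// Callers keep working on whichever snapshot they were handed.
public sealed class SettingsCache
{
    private readonly object _lock = new object();
    private readonly Func<IDictionary<string, string>> _load; // stand-in for the DB call
    private SettingsSnapshot _current;

    public SettingsCache(Func<IDictionary<string, string>> load)
    {
        _load = load;
    }

    public SettingsSnapshot Current
    {
        get
        {
            lock (_lock)
            {
                if (_current == null)
                {
                    _current = new SettingsSnapshot(_load());
                }
                return _current;
            }
        }
    }

    public void Rebuild()
    {
        lock (_lock)
        {
            _current = null;
        }
    }
}
```

A caller would write `var snapshot = cache.Current;` once and read every value from `snapshot`, so a concurrent `Rebuild()` cannot change the data halfway through that unit of work. The next call to `Current` picks up the fresh values.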

Do replace operations on different ConcurrentDictionary keys share one lock?

Does replacing a value associated with a ConcurrentDictionary key lock any dictionary operations beyond that key?
EDIT: For example, I'd like to know if either thread will ever block the other, besides when the keys are first added, in the following:
public static class Test {
private static ConcurrentDictionary<int, int> cd = new ConcurrentDictionary<int, int>();
static Test() { // static constructors cannot have an access modifier
new Thread(UpdateItem1).Start();
new Thread(UpdateItem2).Start();
}
private static void UpdateItem1() {
while (true) cd[1] = 0;
}
private static void UpdateItem2() {
while (true) cd[2] = 0;
}
}
Initially I assumed it does, because for example dictionary[key] = value; could refer to a key that is not present yet. However, while working I realized that if an add is necessary it could occur after a separate lock escalation.
I was drafting the following class, but the indirection provided by the AccountCacheLock class is unnecessary if the answer to this question (above) is "no". In fact, all of my own lock management is pretty much unneeded.
// A flattened subset of repository user values that are referenced for every member page access
public class AccountCache {
// The AccountCacheLock wrapper allows the AccountCache item to be updated in a locally-confined account-specific lock.
// Otherwise, one of the following would be necessary:
// Replace a ConcurrentDictionary item, requiring a lock on the ConcurrentDictionary object (unless the ConcurrentDictionary internally implements similar indirection)
// Update the contents of the AccountCache item, requiring either a copy to be returned or the lock to wrap the caller's use of it.
private static readonly ConcurrentDictionary<int, AccountCacheLock> dictionary = new ConcurrentDictionary<int, AccountCacheLock>();
public static AccountCache Get(int accountId, SiteEntities refreshSource) {
AccountCacheLock accountCacheLock = dictionary.GetOrAdd(accountId, k => new AccountCacheLock());
AccountCache accountCache;
lock (accountCacheLock) {
accountCache = accountCacheLock.AccountCache;
}
if (accountCache == null || accountCache.ExpiresOn < DateTime.UtcNow) {
accountCache = new AccountCache(refreshSource.Accounts.Single(a => a.Id == accountId));
lock (accountCacheLock) {
accountCacheLock.AccountCache = accountCache;
}
}
return accountCache;
}
public static void Invalidate(int accountId) {
// TODO
}
private AccountCache(Account account) {
ExpiresOn = DateTime.UtcNow.AddHours(1);
Status = account.Status;
CommunityRole = account.CommunityRole;
Email = account.Email;
}
public readonly DateTime ExpiresOn;
public readonly AccountStates Status;
public readonly CommunityRoles CommunityRole;
public readonly string Email;
private class AccountCacheLock {
public AccountCache AccountCache;
}
}
Side question: is there something in the ASP.NET framework that already does this?
You don't need to be doing any locks. The ConcurrentDictionary should handle that pretty well.
Side question: is there something in the ASP.NET framework that already does this?
Of course. It's not specifically related to ASP.NET but you may take a look at the System.Runtime.Caching namespace and more specifically the MemoryCache class. It adds things like expiration and callbacks on the top of a thread safe hashtable.
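As a rough illustration of that suggestion, here is a minimal sketch of a one-hour cache built on `MemoryCache.Default`. `AccountEmailCache` and the `loadEmail` delegate are hypothetical names standing in for the repository lookup, and the sketch assumes the loaded value is never null (`MemoryCache` rejects null values).

```csharp
using System;
using System.Runtime.Caching;

public static class AccountEmailCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    private static string KeyFor(int accountId)
    {
        return "account-email:" + accountId;
    }

    // 'loadEmail' is a hypothetical stand-in for the repository lookup.
    public static string GetEmail(int accountId, Func<int, string> loadEmail)
    {
        string key = KeyFor(accountId);

        var cached = Cache.Get(key) as string;
        if (cached != null)
        {
            return cached;
        }

        string loaded = loadEmail(accountId);
        var policy = new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.UtcNow.AddHours(1)
        };

        // AddOrGetExisting returns the entry already present if another
        // thread added one first, or null if this call inserted 'loaded'.
        var existing = Cache.AddOrGetExisting(key, loaded, policy) as string;
        return existing ?? loaded;
    }

    public static void Invalidate(int accountId)
    {
        Cache.Remove(KeyFor(accountId));
    }
}
```

Two threads that miss at the same moment may both call `loadEmail`, but only one result ends up in the cache. If the load must run exactly once, the factory could be wrapped in a `Lazy<T>` before it is stored.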
I don't quite understand the purpose of the AccountCache you have shown in your updated question. It's exactly what a simple caching layer gives you for free.
Obviously if you intend to be running your ASP.NET application in a web farm you should consider some distributed caching such as memcached for example. There are .NET implementations of the ObjectCache class on top of the memcached protocol.
I also wanted to note that I took a cursory peek inside ConcurrentDictionary, and it looks like item replacements are locked on neither the individual item nor the entire dictionary, but rather the hash of the item (i.e. a lock object associated with a dictionary "bucket"). It seems to be designed so that an initial introduction of a key also does not lock the entire dictionary, provided the dictionary need not be resized. I believe this also means that two updates can occur simultaneously provided they don't produce matching hashes.
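Following the advice that no explicit locks are needed, the cache from the question could be sketched without the `AccountCacheLock` wrapper. This is a simplified illustration, not the asker's final code: `CachedAccount`, `SimpleAccountCache` and the `loadEmail` delegate are hypothetical, and the entry is immutable so that replacing it is a single dictionary write.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical immutable cache entry.
public sealed class CachedAccount
{
    public readonly string Email;
    public readonly DateTime ExpiresOn;

    public CachedAccount(string email, DateTime expiresOn)
    {
        Email = email;
        ExpiresOn = expiresOn;
    }
}

public static class SimpleAccountCache
{
    private static readonly ConcurrentDictionary<int, CachedAccount> Entries =
        new ConcurrentDictionary<int, CachedAccount>();

    // 'loadEmail' is a hypothetical stand-in for the database query.
    public static CachedAccount Get(int accountId, Func<int, string> loadEmail)
    {
        CachedAccount entry;
        if (Entries.TryGetValue(accountId, out entry) && entry.ExpiresOn >= DateTime.UtcNow)
        {
            return entry;
        }

        // Two racing threads may both load. The last write wins, and each
        // thread still returns a fully constructed entry.
        entry = new CachedAccount(loadEmail(accountId), DateTime.UtcNow.AddHours(1));
        Entries[accountId] = entry;
        return entry;
    }

    public static void Invalidate(int accountId)
    {
        CachedAccount removed;
        Entries.TryRemove(accountId, out removed);
    }
}
```

The trade-off is the same as in the question: a cache miss under contention can trigger more than one database read, which is usually acceptable when the entry is cheap to build.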

Caching Objects with Expensive Build & Allowing Updates

I am working on a caching manager for a MVC web application. For this app, I have some very large objects that are costly to build. During the application lifetime, I may need to create several of these objects, based upon user requests. When built, the user will be working with the data in the objects, resulting in many read actions. On occasion, I will need to update some minor data points in the cached object (create & replace would take too much time).
Below is a cache manager class that I have created to help me in this. Beyond basic thread safety, my goals were to:
Allow multiple reads against an object, but lock all reads to that object upon an update request.
Ensure that the object is only ever created 1 time if it does not already exist (keep in mind that it's a long build action).
Allow the cache to store many objects, and maintain a lock per object (rather than one lock for all objects).
public class CacheManager
{
private static readonly ObjectCache Cache = MemoryCache.Default;
private static readonly ConcurrentDictionary<string, ReaderWriterLockSlim>
Locks = new ConcurrentDictionary<string, ReaderWriterLockSlim>();
private const int CacheLengthInHours = 1;
public object AddOrGetExisting(string key, Func<object> factoryMethod)
{
Locks.GetOrAdd(key, new ReaderWriterLockSlim());
var policy = new CacheItemPolicy
{
AbsoluteExpiration = DateTimeOffset.Now.AddHours(CacheLengthInHours)
};
return Cache.AddOrGetExisting
(key, new Lazy<object>(factoryMethod), policy);
}
public object Get(string key)
{
var targetLock = AcquireLockObject(key);
if (targetLock != null)
{
targetLock.EnterReadLock();
try
{
var cacheItem = Cache.GetCacheItem(key);
if(cacheItem!= null)
return cacheItem.Value;
}
finally
{
targetLock.ExitReadLock();
}
}
return null;
}
public void Update<T>(string key, Func<T, object> updateMethod)
{
var targetLock = AcquireLockObject(key);
var targetItem = (Lazy<object>) Get(key);
if (targetLock == null || key == null) return;
targetLock.EnterWriteLock();
try
{
updateMethod((T)targetItem.Value);
}
finally
{
targetLock.ExitWriteLock();
}
}
private ReaderWriterLockSlim AcquireLockObject(string key)
{
return Locks.ContainsKey(key) ? Locks[key] : null;
}
}
Am I accomplishing my goals while remaining thread safe? Do you all see a better way to achieve my goals?
Thanks!
UPDATE: So the bottom line here was that I was really trying to do too much in 1 area. For some reason, I was convinced that managing the Get / Update operations in the same class that managed the cache was a good idea. After looking at Groo's solution & rethinking the issue, I was able to do a good amount of refactoring which removed this issue I was facing.
Well, I don't think this class does what you need.
Allow multiple reads against the object, but lock all reads upon an update request
You may lock all reads to the cache manager, but you are not locking reads (nor updates) to the actual cached instance.
Ensure that the object is only ever created 1 time if it does not already exist (keep in mind that it's a long build action).
I don't think you ensured that. You are not locking anything while adding the object to the dictionary (and, furthermore, you are adding a lazy constructor, so you don't even know when the object is going to be instantiated).
Edit: This part holds; the only thing I would change is to make Get return a Lazy<object>. While writing my program, I forgot to cast it, and calling ToString on the return value returned "Value not created".
Allow the cache to store many objects, and maintain a lock per object (rather than one lock for all objects).
That's the same as point 1: you are locking the dictionary, not the access to the object. And your update delegate has a strange signature (it accepts a typed generic parameter, and returns an object which is never used). This means you are really modifying the object's properties, and these changes are immediately visible to any part of your program holding a reference to that object.
How to resolve this
If your object is mutable (and I presume it is), there is no way to ensure transactional consistency unless each of your properties also acquires a lock on each read access. A way to simplify this is to make it immutable (that's why immutable objects are so popular for multithreading).
Alternatively, you may consider breaking this large object into smaller pieces and caching each piece separately, making them immutable if needed.
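The immutability suggestion can be sketched as follows. This is an illustration under assumptions, not a drop-in replacement for the cache manager: `Pair` and `PairHolder` are hypothetical, and the sketch relies on `Volatile.Read` (.NET 4.5 or later) and `Interlocked.CompareExchange`. An update builds a new instance and publishes it with one atomic reference swap, so a reader sees either the old object or the new one, never a mix.

```csharp
using System;
using System.Threading;

// Hypothetical immutable value: "updates" build a new instance.
public sealed class Pair
{
    public readonly string First;
    public readonly string Second;

    public Pair(string first, string second)
    {
        First = first;
        Second = second;
    }

    public override string ToString()
    {
        return string.Format("First: [{0}], Second: [{1}]", First, Second);
    }
}

// Hypothetical holder that publishes a new instance with one atomic reference write.
public sealed class PairHolder
{
    private Pair _current;

    public PairHolder(Pair initial)
    {
        _current = initial;
    }

    public Pair Current
    {
        get { return Volatile.Read(ref _current); }
    }

    // Applies 'change' to the current value and retries if another
    // writer swapped the reference in the meantime.
    public void Update(Func<Pair, Pair> change)
    {
        while (true)
        {
            Pair seen = Volatile.Read(ref _current);
            Pair next = change(seen);
            Pair before = Interlocked.CompareExchange(ref _current, next, seen);
            if (object.ReferenceEquals(before, seen))
            {
                return;
            }
        }
    }
}
```

The cost is that every update allocates a new object, which may be too expensive for the very large objects described in the question. Splitting them into smaller immutable pieces, as suggested above, keeps that cost down.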
[Edit] Added a race condition example:
class Program
{
static void Main(string[] args)
{
CacheManager cache = new CacheManager();
cache.AddOrGetExisting("item", () => new Test());
// let one thread modify the item
ThreadPool.QueueUserWorkItem(s =>
{
Thread.Sleep(250);
cache.Update<Test>("item", i =>
{
i.First = "CHANGED";
Thread.Sleep(500);
i.Second = "CHANGED";
return i;
});
});
// let one thread just read the item and print it
ThreadPool.QueueUserWorkItem(s =>
{
var item = ((Lazy<object>)cache.Get("item")).Value;
Log(item.ToString());
Thread.Sleep(500);
Log(item.ToString());
});
Console.Read();
}
class Test
{
private string _first = "Initial value";
public string First
{
get { return _first; }
set { _first = value; Log("First", value); }
}
private string _second = "Initial value";
public string Second
{
get { return _second; }
set { _second = value; Log("Second", value); }
}
public override string ToString()
{
return string.Format("--> PRINTING: First: [{0}], Second: [{1}]", First, Second);
}
}
private static void Log(string message)
{
Console.WriteLine("Thread {0}: {1}", Thread.CurrentThread.ManagedThreadId, message);
}
private static void Log(string property, string value)
{
Console.WriteLine("Thread {0}: {1} property was changed to [{2}]", Thread.CurrentThread.ManagedThreadId, property, value);
}
}
Something like this should happen:
t = 0ms : thread A gets the item and prints the initial value
t = 250ms: thread B modifies the first property
t = 500ms: thread A prints the INCONSISTENT value (only the first prop. changed)
t = 750ms: thread B modifies the second property

Thread Safety with less locking [duplicate]

I am trying to effectively determine which approach is better:
Currently, I have a singleton instance that exposes entities that are loaded lazily. I have listed three approaches, each of which has some advantages. The first approach relies solely on the double-checked locking pattern to ensure thread safety. The second approach doesn't use locking, but it has the potential for a double load in case of a race. The third approach uses a solution that I am becoming very fond of (System.Lazy).
For some reason, I feel there is something wrong with the second approach (System.Threading.Interlocked), yet I can't pinpoint it. Is there a reason to favor one approach over the other? I did cover this in a previous post, where I felt the third option is the way to go from now on.
I stripped the code to the bare bones to be able to explain the design.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace TPLDemo
{
public class SomeEntity
{
}
public class MultiThreadedManager
{
private static readonly System.Lazy<MultiThreadedManager> instance = new Lazy<MultiThreadedManager>(() => { return new MultiThreadedManager(); });
private readonly object _syncRoot = new object();
private List<SomeEntity> _inMemoryEntities = null;
private List<SomeEntity> _inMemoryEntitiesUsingLockFreeApproach = null;
private System.Lazy<List<SomeEntity>> _inMemoryUsingLazy = new Lazy<List<SomeEntity>>(() => { return MultiThreadedManager.Instance.LoadFromSomewhere(); });
public static MultiThreadedManager Instance
{
get { return instance.Value; }
}
public IEnumerable<SomeEntity> LazyEntities
{
get
{
return _inMemoryUsingLazy.Value;
}
}
public IEnumerable<SomeEntity> LocklessEntities
{
get
{
if (_inMemoryEntitiesUsingLockFreeApproach == null)
{
do
{
// Is it possible multiple threads hit this at the same time?
} while (System.Threading.Interlocked.CompareExchange<List<SomeEntity>>(ref _inMemoryEntitiesUsingLockFreeApproach, this.LoadFromSomewhere(), null) != null);
}
return _inMemoryEntitiesUsingLockFreeApproach;
}
}
/// <summary>
/// This is thread safe but it involved some locking.
/// </summary>
public IEnumerable<SomeEntity> Entities
{
get
{
if (_inMemoryEntities == null)
{
lock (_syncRoot)
{
if (_inMemoryEntities == null)
{
List<SomeEntity> list = this.LoadFromSomewhere();
_inMemoryEntities = list;
}
}
}
return _inMemoryEntities;
}
}
private List<SomeEntity> LoadFromSomewhere()
{
return new List<SomeEntity>();
}
public void ReloadEntities()
{
// This is sufficient because any subsequent call will reload them safely.
_inMemoryEntities = null;
// This is sufficient because any subsequent call will reload them safely.
_inMemoryEntitiesUsingLockFreeApproach = null;
// This is necessary because _inMemoryUsingLazy.Value is read-only.
_inMemoryUsingLazy = new Lazy<List<SomeEntity>>(() => { return MultiThreadedManager.Instance.LoadFromSomewhere(); });
}
}
}
The third option (Lazy) allows you to configure how it should behave. You can make it behave like (1) or like (2).
In any case, once it is loaded it does not need to lock or interlock internally to hand you back the loaded Value.
So by all means go for System.Lazy.
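To make the "configure how it should behave" remark concrete, here is a small sketch using the `LazyThreadSafetyMode` argument of the `Lazy<T>` constructor. `LazyModes` and `LoadFromSomewhere` are placeholders for illustration, not the asker's real loader.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class LazyModes
{
    // Hypothetical stand-in for the expensive load.
    private static List<string> LoadFromSomewhere()
    {
        return new List<string> { "a", "b" };
    }

    // Behaves like approach (1): the factory runs at most once, and
    // other threads block until it has finished.
    public static readonly Lazy<List<string>> Locked =
        new Lazy<List<string>>(LoadFromSomewhere, LazyThreadSafetyMode.ExecutionAndPublication);

    // Behaves like approach (2): racing threads may each run the factory,
    // but only the first result is published and every caller sees that one.
    public static readonly Lazy<List<string>> Racy =
        new Lazy<List<string>>(LoadFromSomewhere, LazyThreadSafetyMode.PublicationOnly);
}
```

`ExecutionAndPublication` is also what the `Lazy<T>(Func<T>)` constructor used in the question gives you by default.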
There is one nasty thing though: If the factory function fails, the exception is stored and thrown every time the Value property is accessed. This means that this Lazy instance is now ruined. You cannot ever retry. This means that a transient failure (network error, ...) might permanently take down your application until it is manually restarted.
I have complained about this on MS Connect but it is by design.
My solution was to write my own Lazy. It's not hard.
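The answer does not show its replacement, so the following is only a guess at what such a class might look like. `RetryingLazy<T>` is a hypothetical name, and the sketch takes the simplest route of locking on every access rather than optimizing the already-loaded path.

```csharp
using System;

// Hypothetical replacement for Lazy<T> that does not cache a failure.
public sealed class RetryingLazy<T>
{
    private readonly object _lock = new object();
    private readonly Func<T> _factory;
    private bool _created;
    private T _value;

    public RetryingLazy(Func<T> factory)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        _factory = factory;
    }

    public T Value
    {
        get
        {
            lock (_lock)
            {
                if (!_created)
                {
                    // If the factory throws, _created stays false,
                    // so the next caller runs the factory again.
                    _value = _factory();
                    _created = true;
                }
                return _value;
            }
        }
    }
}
```

Taking the lock on every read is slower than what `Lazy<T>` does once the value exists, but an uncontended lock is cheap, and it keeps the sketch obviously correct.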

Singleton factory, sort of

Sorry if this has been answered elsewhere... I have found a lot of posts on similar things but not the same.
I want to ensure that only one instance of an object exists at a time BUT I don't want that object to be retained past its natural life-cycle, as it might be with the Singleton pattern.
I am writing some code where processing of a list gets triggered (by external code that I have no control over) every minute. Currently I just create a new 'processing' object each time and it gets destroyed when it goes out of scope, as per normal. However, there might be occasions when the processing takes longer than a minute, and so the next trigger will create a second instance of the processing class in a new thread.
Now, I want to have a mechanism whereby only one instance can be around at a time... say, some sort of factory whereby it'll only allow one object at a time. A second call to the factory will return null, instead of a new object, say.
So far my (crappy) solution is to have a Factory type object as a nested class of the processor class:
class XmlJobListProcessor
{
private static volatile bool instanceExists = false;
public static class SingletonFactory
{
private static object lockObj = new object();
public static XmlJobListProcessor CreateListProcessor()
{
if (!instanceExists)
{
lock (lockObj)
{
if (!instanceExists)
{
instanceExists = true;
return new XmlJobListProcessor();
}
return null;
}
}
return null;
}
}
private XmlJobListProcessor() { }
....
}
I was thinking of writing an explicit destructor for the XmlJobListProcessor class that reset the 'instanceExists' field to false.
I realise this is a seriously terrible design. The factory should be a class in its own right... it's only nested so that both it and the instance destructors can access the volatile boolean...
Anyone have any better ways to do this? Cheers
I know .NET 4 is not as widely used, but eventually it will be and you'll have:
private static readonly Lazy<XmlJobListProcessor> _instance =
new Lazy<XmlJobListProcessor>(() => new XmlJobListProcessor());
Then you have access to it via _instance.Value, which is initialized the first time it's requested.
Your original example uses double-check locking, which should be avoided at all costs.
See the MSDN Singleton implementation article on how to initialize the singleton properly.
Just make one and keep it around; don't destroy and create it every minute.
"minimize the moving parts"
I would instantiate the class and keep it around. Certainly I wouldn't use a destructor (if you mean ~myInstance() )... that increases GC time. In addition, if a process takes longer than a minute, what do you do with the data that was supposed to be processed if you just return a null value?
Keep the instance alive, and possibly build a buffer mechanism to continue taking input while the processor class is busy. You can check to see:
if ( isBusy == true )
{
// add data to bottom of buffer
}
else
{
// call processing
}
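The buffer idea above can be sketched as a small class. Everything here is hypothetical (`BufferedProcessor<T>`, `Submit`, and the `process` delegate are illustrative names), and it assumes requests may be handled later, on whichever thread is already draining the buffer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical long-lived processor: requests that arrive while it is
// busy are queued instead of being dropped.
public sealed class BufferedProcessor<T>
{
    private readonly object _lock = new object();
    private readonly Queue<T> _buffer = new Queue<T>();
    private readonly Action<T> _process;
    private bool _isBusy;

    public BufferedProcessor(Action<T> process)
    {
        _process = process;
    }

    public void Submit(T item)
    {
        lock (_lock)
        {
            _buffer.Enqueue(item);
            if (_isBusy)
            {
                return; // the thread that is already draining will pick it up
            }
            _isBusy = true;
        }

        // Only one thread at a time gets here; it drains the buffer,
        // calling the processing delegate outside the lock.
        while (true)
        {
            T next;
            lock (_lock)
            {
                if (_buffer.Count == 0)
                {
                    _isBusy = false;
                    return;
                }
                next = _buffer.Dequeue();
            }

            try
            {
                _process(next);
            }
            catch
            {
                // Do not leave the processor marked busy forever after a failure.
                lock (_lock) { _isBusy = false; }
                throw;
            }
        }
    }
}
```

Compared with returning null when busy, no trigger is lost, at the price of the draining thread possibly running longer than a single request.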
I take everyone's point about not re-instantiating the processor object and BillW's point about a queue, so here is my bastardized mashup solution:
public static class PRManager
{
private static XmlJobListProcessor instance = new XmlJobListProcessor();
private static object lockobj = new object();
public static void ProcessList(SPList list)
{
bool acquired = Monitor.TryEnter(lockobj);
try
{
if (acquired)
{
instance.ProcessList(list);
}
}
catch (ArgumentNullException)
{
}
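finally
{
// Release only if acquired; otherwise Monitor.Exit throws SynchronizationLockException.
if (acquired) Monitor.Exit(lockobj);
}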
}
}
The processor is retained long-term as a static member (here, long-term object retention is not a problem since it has no state variables etc.). If the lock on lockobj is already held by another thread, the request just isn't processed and the calling thread goes on with its business.
Cheers for the feedback guys. Stackoverflow will ensure my internship! ;D

C# thread safety of global configuration settings

In a C# app, suppose I have a single global class that contains some configuration items, like so :
public class Options
{
int myConfigInt;
string myConfigString;
..etc.
}
static Options GlobalOptions;
The members of this class will be used across different threads:
Thread1: GlobalOptions.myConfigString = blah;
while
Thread2: string thingie = GlobalOptions.myConfigString;
Using a lock for access to the GlobalOptions object would also unnecessarily block when 2 threads are accessing different members, but on the other hand creating a sync-object for every member seems a bit over the top too.
Also, using a lock on the global options would make my code less nice I think;
if I have to write
string stringiwanttouse;
lock(GlobalOptions)
{
stringiwanttouse = GlobalOptions.myConfigString;
}
everywhere (and is this thread-safe or is stringiwanttouse now just a pointer to myConfigString ? Yeah, I'm new to C#....) instead of
string stringiwanttouse = GlobalOptions.myConfigString;
it makes the code look horrible.
So...
What is the best (and simplest!) way to ensure thread-safety ?
You could wrap the field in question (myConfigString in this case) in a property, and have code in the get/set accessors that uses either a lock (Monitor.Enter/Monitor.Exit) or a Mutex. Then, accessing the property only locks that single field, and doesn't lock the whole class.
Edit: adding code
private static object obj = new object(); // only used for locking
public static string MyConfigString {
get {
lock(obj)
{
return myConfigString;
}
}
set {
lock(obj)
{
myConfigString = value;
}
}
}
The following was written before the OP's edit:
public static class Options
{
private static int _myConfigInt;
private static string _myConfigString;
private static bool _initialized = false;
private static object _locker = new object();
private static void InitializeIfNeeded()
{
if (!_initialized) {
lock (_locker) {
if (!_initialized) {
ReadConfiguration();
_initialized = true;
}
}
}
}
private static void ReadConfiguration() { /* ... */ }
public static int MyConfigInt {
get {
InitializeIfNeeded();
return _myConfigInt;
}
}
public static string MyConfigString {
get {
InitializeIfNeeded();
return _myConfigString;
}
}
//..etc.
}
After that edit, I can say that you should do something like the above, and only set configuration in one place - the configuration class. That way, it will be the only class modifying the configuration at runtime, and only when a configuration option is to be retrieved.
Your configurations may be 'global', but they should not be exposed as a global variable. If configurations don't change, they should be used to construct the objects that need the information - either manually or through a factory object. If they can change, then an object that watches the configuration file/database/whatever and implements the Observer pattern should be used.
Global variables (even those that happen to be a class instance) are a Bad Thing™
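As a loose sketch of that design (an illustration only; `AppOptions` and `OptionsSource` are hypothetical names, and `Volatile.Read`/`Volatile.Write` need .NET 4.5 or later): the settings live in an immutable object, one owner replaces it, and interested parties subscribe to a change event instead of reading a global.

```csharp
using System;
using System.Threading;

// Hypothetical immutable settings object.
public sealed class AppOptions
{
    public readonly int MyConfigInt;
    public readonly string MyConfigString;

    public AppOptions(int myConfigInt, string myConfigString)
    {
        MyConfigInt = myConfigInt;
        MyConfigString = myConfigString;
    }
}

// Hypothetical single owner of the configuration: the only place it is replaced.
public sealed class OptionsSource
{
    private AppOptions _current;

    // Raised after a new AppOptions instance has been published.
    public event Action<AppOptions> Changed;

    public OptionsSource(AppOptions initial)
    {
        _current = initial;
    }

    public AppOptions Current
    {
        get { return Volatile.Read(ref _current); }
    }

    public void Publish(AppOptions updated)
    {
        Volatile.Write(ref _current, updated);

        // Copy the delegate first so a concurrent unsubscribe cannot null it out.
        Action<AppOptions> handler = Changed;
        if (handler != null)
        {
            handler(updated);
        }
    }
}
```

A consumer that needs several settings to agree with each other reads `Current` once into a local and uses that instance throughout, which is the "local copy" idea without any per-property locks.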
What do you mean by thread safety here? It's not the global object that needs to be thread safe, it is the accessing code. If two threads write to a member variable near the same instant, one of them will "win", but is that a problem? If your client code depends on the global value staying constant until it is done with some unit of processing, then you will need to create a synchronization object for each property that needs to be locked. There isn't any great way around that. You could just cache a local copy of the value to avoid problems, but the applicability of that fix will depend on your circumstances. Also, I wouldn't create a synch object for each property by default, but instead as you realize you will need it.
