Creating Dynamic Locks at Runtime in ASP.NET - c#

Are the following assumptions valid for this code? I put some background info under the code, but I don't think it's relevant.
Assumption 1: Since this is a single application, I'm making the assumption it will be handled by a single process. Thus, static variables are shared between threads, and declaring my collection of lock objects statically is valid.
Assumption 2: If I know the value is already in the dictionary, I don't need to lock on read. I could use a ConcurrentDictionary, but I believe this one will be safe since I'm not enumerating (or deleting), and the value will exist and not change when I call UnlockOnValue().
Assumption 3: I can lock on the Keys collection, since that reference won't change, even if the underlying data structure does.
private static Dictionary<String,Object> LockList =
new Dictionary<string,object>();
private void LockOnValue(String queryStringValue)
{
lock(LockList.Keys)
{
if(!LockList.Keys.Contains(queryStringValue))
{
LockList.Add(screenName,new Object());
}
System.Threading.Monitor.Enter(LockList[queryStringValue]);
}
}
private void UnlockOnValue(String queryStringValue)
{
System.Threading.Monitor.Exit(LockList[queryStringValue]);
}
Then I would use this code like:
LockOnValue(Request.QueryString["foo"])
//Check cache expiry
//if expired
//Load new values and cache them.
//else
//Load cached values
UnlockOnValue(Request.QueryString["foo"])
Background: I'm creating an app in ASP.NET that downloads data based on a single user-defined variable in the query string. The number of values will be quite limited. I need to cache the results for each value for a specified period of time.
Approach: I decided to use local files to cache the data, which is not the best option, but I wanted to try it since this is non-critical and performance is not a big issue. I used 2 files per option, one with the cache expiry date, and one with the data.
Issue: I'm not sure what the best way to do locking is, and I'm not overly familiar with threading issues in .NET (one of the reasons I chose this approach). Based on what's available, and what I read, I thought the above should work, but I'm not sure and wanted a second opinion.

Your current solution looks pretty good. The two things I would change:
1: UnlockOnValue needs to go in a finally block. If an exception is thrown, it will never release its lock.
2: LockOnValue is somewhat inefficient, since it does a dictionary lookup twice. This isn't a big deal for a small dictionary, but for a larger one you will want to switch to TryGetValue.
Also, your assumption 3 holds - at least for now. But the Dictionary contract makes no guarantee that the Keys property always returns the same object. And since it's so easy to not rely on this, I'd recommend against it. Whenever I need an object to lock on, I just create an object for that sole purpose. Something like:
private static Object _lock = new Object();

lock only has a scope of a single process. If you want to span processes you'll have to use primitives like Mutex (named).
lock is the same as Monitor.Enter and Monitor.Exit. If you also do Monitor.Enter and Monitor.Exit, it's being redundant.
You don't need to lock on read, but you do have to lock the "transaction" of checking if the value doesn't exist and adding it. If you don't lock on that series of instructions, something else could come in between when you check for the key and when you add it and add it--thus resulting in an exception. The lock you're doing is sufficient to do that (you don't need the additional calls to Enter and Exit--lock will do that for you).

Related

Is it best practice to create a variable if accessing a property of an object more than once in a routine?

When I first began as a junior C# dev, I was always told during code reviews that if I was accessing an object's property more than once in a given scope then I should create a local variable within the routine as it was cheaper than having to retrieve it from the object. I never really questioned it as it came from more people I perceived to be quite knowledgeable at the time.
Below is a rudimentary example
Example 1: storing an objects identifer in a local variable
public void DoWork(MyDataType object)
{
long id = object.Id;
if (ObjectLookup.TryAdd(id, object))
{
DoSomeOtherWork(id);
}
}
Example 2: retrieving the identifier from the Id property of the object property anytime it is needed
public void DoWork(MyDataType object)
{
if (ObjectLookup.TryAdd(object.Id, object))
{
DoSomeOtherWork(object.Id);
}
}
Does it actually matter or was it more a preference of coding style where I was working? Or perhaps a situational design time choice for the developer to make?
As explained in this answer, if the property is a basic getter/setter than the CLR "will inline the property access and generate code that’s as efficient as accessing a field directly". However, if your property, for example, does some calculations every time the property is accessed, then storing the value of the property in a local variable will avoid the overhead of additional calculations being done.
All the memory allocation stuff aside, there is the principle of DRY(don't repeat yourself). When you can deal with one variable with a short name rather than repeating the object nesting to access the external property, why not do that?
Apart from that, by creating that local variable you are respecting the single responsibility principle by isolating the methods from the external entity they don't need to know about.
And lastly if the so-called resuing leads to unwanted instantiation of reference types or any repetitive calculation, then it is a must to create the local var and reuse it throughout the class/method.
Any way you look at it, this practice helps with readability and more maintainable code, and possibly safer too.
I don't know if it is faster or not (though I would say that the difference is negligible and thus unimportant), but I'll cook up some benchmark for you.
What IS important though will be made evident to you with an example;
public Class MyDataType
{
publig int id {
get {
// Some actual code
return this.GetHashCode() * 2;
}
}
}
Does this make more sense? The first time I will access the id Getter, some code will be executed. The second time, the same code will be executed costing twice as much with no need.
It is very probable, that the reviewers had some such case in mind and instead of going into every single one property and check what you are doing and if it is safe to access, they created a new rule.
Another reason to store, would be useability.
Imagine the following example
object.subObject.someOtherSubObject.id
In this case I ask in reviews to store to a variable even if they use it just once. That is because if this is used in a complicated if statement, it will reduce the readability and maintainability of the code in the future.
A local variable is essentially guaranteed to be fast, whereas there is an unknown amount of overhead involved in accessing the property.
It's almost always a good idea to avoid repeating code whenever possible. Storing the value once means that there is only one thing to change if it needs changing, rather than two or more.
Using a variable allows you to provide a name, which gives you an opportunity to describe your intent.
I would also point out that if you're referring to other members of an object a lot in one place, that can often be a strong indication that the code you're writing actually belongs in that other type instead.
You should consider that getting a value from a method that is calculated from an I/O-bound or CPU-bound process can be irrational. Therefore, it's better to define a var and store the result to avoid multiple same processing.
In the case that you are using a value like object.Id, utilizing a variable decorated with const keyword guarantees that the value will not change in the scope.
Finally, it's better to use a local var in the classes and methods.

Which way is thread-safe and has better performance to use, ContainsKey or string indexer?

I have a collection as below
private static readonly Dictionary<string,object> _AppCache = new Dictionary<string,object>;
Then I was wondering which one is better to use to check if a key exists (none of my keys has null value)
_AppCache.ContainsKey("x")
_AppCache["x"] != null
This code might be access through various number of threads
The whole code is:
public void SetGlobalObject(string key, object value)
{
globalCacheLock.EnterWriteLock();
try
{
if (!_AppCache.ContainsKey(key))
{
_AppCache.Add(key, value);
}
}
finally
{
globalCacheLock.ExitWriteLock();
}
}
Update
I changed my code to use dictionary to keep focus of the question on Conatinskey or Indexer
I don't disagree with other's advice to use Dictionary. However, to answer your question, I think you should use ContainsKey to check if a key exists for several reasons
That is specifically what ContainsKey was written to do
For _AppCache["x"] != null to work your app must operate under an unenforced assumption (that no values will be null). That assumption may hold true now, but future maintainers may not know or understand this critical assumption, resulting in unintuitive bugs
Slightly less processing for ContainsKey, although this is not really important
Neither of the two choices are threadsafe, so that is not a deciding factor. For that, you either need to use locking, or use ConcurrentDictionary.
If you move to a Dictionary (per your question update), the answer is even more in favor of ContainsKey. If you used the index option, you would have to catch an exception to detect if the key is not in the Dictionary. ContainsKey would be much more straightforward in your code.
When the key is in the Dictionary, ContainsKey is slightly more efficient. Both options first call an internal method FindEntry. In the case of ContainsKey, it just returns the result of that. For the index option, it must also retrieve the value. In the case of the key not being in the Dictionary, the index option would be a fair amount less efficient, because it will be throwing an exception.
You are obviously checking for the existence of that key. In that case, _AppCache["x"] != null will give you a KeyNotFoundException if the key does not exist, which is probably not as desirable. If you really want to check if the key exists, without generating an exception by just checking, you have to use _AppCache.ContainsKey("x"). For checking if the key exists in the dictionary or hashtable, I would stick with ContainsKey. Any difference in performance, if != null is faster, would be offset by the additional code to deal with the exception if the key really does not exist.
In reality, _AppCache["x"] != null is not checking if the key exists, it is checking, given that key "x" exists, whether the associated value is null.
Neither way (although accomplishing different tasks) gives you any advantage on thread safety.
All of this holds true if you use ConcurrentDictionary - no difference in thread safety, the two ways accomplish different things, any possible gain in checking with !=null is offset by additional code to handle exception. So, use ContainsKey.
If you're concerned about thread-safety, you should have a look at the ConcurrentDictionary class.
If you do not want to use ConcurrentDictionary, than you'll have to make sure that you synchronize access to your regular Dictionary<K,V> instance. That means, making sure that no 2 threads can have multiple access to your dictionary, by locking on each write and read operation.
For instance, if you want to add something to a regular Dictionary in a thread-safe way, you'll have to do it like this:
private readonly object _sync = new object();
// ...
lock( _sync )
{
if( _dictionary.ContainsKey(someKey) == false )
{
_dictionary.Add(someKey, somevalue);
}
}
You should'nt be using using Hashtable anymore since the introduction of the generic Dictionary<K,V> class and therefore type-safe alternative has been introduced in .NET 2.0
One caveat though when using a Dictionary<K,V>: when you want to retrieve the value associated with a given key, the Dictionary will throw an exception when there is no entry for that specified key, whereas a Hashtable will return null in that case.
You should use a ConcurrentDictionary rather than a Dictionary, which is thread-safe itself. Therefore you do not need the lock, which (generally *) improves performance, since the locking mechanisms are rather expensive.
Now, only to check whether an entry exists I recommend ContainsKey, irrespective of which (Concurrent)Dictionary you use:
_AppCache.ContainsKey(key)
But what you do in two steps can be done in one step using the Concurrent Dictionary by using GetOrAdd:
_AppCache.GetOrAdd(key, value);
You need a lock for neither action:
public void SetGlobalObject(string key, object value)
{
_AppCache.GetOrAdd(key, value);
}
Not only does this (probably *) perform better, but I think it expresses your intentions much clearer and less cluttered.
(*) Using "probably" and "generally" here to emphasise that these data structures do have loads of baked-in optimisations for performance, however performance in your specific case must always be measured.

Double checked locking on Dictionary "ContainsKey"

My team is currently debating this issue.
The code in question is something along the lines of
if (!myDictionary.ContainsKey(key))
{
lock (_SyncObject)
{
if (!myDictionary.ContainsKey(key))
{
myDictionary.Add(key,value);
}
}
}
Some of the posts I've seen say that this may be a big NO NO (when using TryGetValue). Yet members of our team say it is ok since "ContainsKey" does not iterate on the key collection but checks if the key is contained via the hash code in O(1). Hence they claim there is no danger here.
I would like to get your honest opinions regarding this issue.
Don't do this. It's not safe.
You could be calling ContainsKey from one thread while another thread calls Add. That's simply not supported by Dictionary<TKey, TValue>. If Add needs to reallocate buckets etc, I can imagine you could get some very strange results, or an exception. It may have been written in such a way that you don't see any nasty effects, but I wouldn't like to rely on it.
It's one thing using double-checked locking for simple reads/writes to a field, although I'd still argue against it - it's another to make calls to an API which has been explicitly described as not being safe for multiple concurrent calls.
If you're on .NET 4, ConcurrentDictionary is probably the way forward. Otherwise, just lock on every access.
If you are in a multithreaded environment, you may prefer to look at using a ConcurrentDictionary. I blogged about it a couple of months ago, you might find the article useful: http://colinmackay.co.uk/blog/2011/03/24/parallelisation-in-net-4-0-the-concurrent-dictionary/
This code is incorrect. The Dictionary<TKey, TValue> type does not support simultaneous read and write operations. Even though your Add method is called within the lock the ContainsKey is not. Hence it easily allows for a violation of the simultaneous read / write rule and will lead to corruption in your instance
It doesn't look thread-safe, but it would probably be hard to make it fail.
The iteration vs hash lookup argument doesn't hold, there could be a hash-collision for instance.
If this dictionary is rarely written and often read, then I often employ safe double locking by replacing the entire dictionary on write. This is particularly effective if you can batch writes together to make them less frequent.
For example, this is a cut down version of a method we use that tries to get a schema object associated with a type, and if it can't, then it goes ahead and creates schema objects for all the types it finds in the same assembly as the specified type to minimize the number of times the entire dictionary has to be copied:
public static Schema GetSchema(Type type)
{
if (_schemaLookup.TryGetValue(type, out Schema schema))
return schema;
lock (_syncRoot) {
if (_schemaLookup.TryGetValue(type, out schema))
return schema;
var newLookup = new Dictionary<Type, Schema>(_schemaLookup);
foreach (var t in type.Assembly.GetTypes()) {
var newSchema = new Schema(t);
newLookup.Add(t, newSchema);
}
_schemaLookup = newLookup;
return _schemaLookup[type];
}
}
So the dictionary in this case will be rebuilt, at most, as many times as there are assemblies with types that need schemas. For the rest of the application lifetime the dictionary accesses will be lock-free. The dictionary copy becomes a one-time initialization cost of the assembly. The dictionary swap is thread-safe because pointer writes are atomic so the whole reference gets switched at once.
You can apply similar principles in other situations as well.

Name for this pattern? (Answer: lazy initialization with double-checked locking)

Consider the following code:
public class Foo
{
private static object _lock = new object();
public void NameDoesNotMatter()
{
if( SomeDataDoesNotExist() )
{
lock(_lock)
{
if( SomeDataDoesNotExist() )
{
CreateSomeData();
}
else
{
// someone else also noticed the lack of data. We
// both contended for the lock. The other guy won
// and created the data, so we no longer need to.
// But once he got out of the lock, we got in.
// There's nothing left to do.
}
}
}
}
private bool SomeDataDoesNotExist()
{
// Note - this method must be thread-safe.
throw new NotImplementedException();
}
private bool CreateSomeData()
{
// Note - This shouldn't need to be thread-safe
throw new NotImplementedException();
}
}
First, there are some assumptions I need to state:
There is a good reason I couldn't just do this once an app startup. Maybe the data wasn't available yet, etc.
Foo may be instantiated and used concurrently from two or more threads. I want one of them to end up creating some data (but not both of them) then I'll allow both to access that same data (ignore thread safety of accessing the data)
The cost to SomeDataDoesNotExist() is not huge.
Now, this doesn't necessarily have to be confined to some data creation situation, but this was an example I could think of.
The part that I'm especially interested in identifying as a pattern is the check -> lock -> check. I've had to explain this pattern to developers on a few occasions who didn't get the algorithm at first glance but could then appreciate it.
Anyway, other people must do similarly. Is this a standardized pattern? What's it called?
Though I can see how you might think this looks like double-checked locking, what it actually looks like is dangerously broken and incorrect double-checked locking. Without an actual implementation of SomeDataDoesNotExist and CreateSomeData to critique we have no guarantee whatsoever that this thing is actually threadsafe on every processor.
For an example of an analysis of how double-checked locking can go wrong, check out this broken and incorrect version of double-checked locking:
C# manual lock/unlock
My advice: don't use any low-lock technique without a compelling reason and a code review from an expert on the memory model; you'll probably get it wrong. Most people do.
In particular, don't use double-checked locking unless you can describe exactly what memory access reorderings the processors can do on your behalf and provide a convincing argument that your solution is correct given any possible memory access reordering. The moment you step away even slightly from a known-to-be-correct implementation, you need to start the analysis over from scratch. You can't assume that just because one implementation of double-checked locking is correct, that they all are; almost none of them are correct.
Lazy initialization with double-checked locking?
The part that I'm especially interested in identifying as a pattern is the check -> lock -> check.
That is called double-checked locking.
Beware that in older Java versions (before Java 5) it is not safe because of how Java's memory model was defined. In Java 5 and newer changes were made to the specification of Java's memory model so that it is now safe.
The only name that comes to mind for this kind of is "Faulting". This name is used in iOS Core-Data framework to similar effect.
Basically, your method NameDoesNotMatter is a fault, and whenever someone invokes it, it results in the object to get populated or initialized.
See http://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/CoreData/Articles/cdFaultingUniquing.html for more details on how this design pattern is used.

Read a value multiple times or store as a variable first time round?

Basically, is it better practice to store a value into a variable at the first run through, or to continually use the value? The code will explain it better:
TextWriter tw = null;
if (!File.Exists(ConfigurationManager.AppSettings["LoggingFile"]))
{
// ...
tw = File.CreateText(ConfigurationManager.AppSettings["LoggingFile"]);
}
or
TextWriter tw = null;
string logFile = ConfigurationManager.AppSettings["LoggingFile"].ToString();
if (!File.Exists(logFile))
{
// ...
tw = File.CreateText(logFile);
}
Clarity is important, and DRY (don't repeat yourself) is important. This is a micro-abstraction - hiding a small, but still significant, piece of functionality behind a variable. The performance is negligible, but the positive impact of clarity can't be understated. Use a well-named variable to hold the value once it's been acquired.
the 2nd solution is better for me because :
the dictionary lookup has a cost
it's more readable
Or you can have a singleton object with it's private constructor that populates once all configuration data you need.
Second one would be the best choice.
Imagine this next situation. Settings are updated by other threads and during some of them, since setting value isn't locked, changes to another value.
In the first situation, your execution can fail, or it'll be executed fine, but code was checking for a file of some name, and later saves whatever to a file that's not the one checked before. This is too bad, isn't it?
Another benefit is you're not retrieving the value twice. You get once, and you use wherever your code needs to read the whole setting.
I'm pretty sure, the second one is more readable. But if you talk about performance - do not optimize on early stages and without profiler.
I must agree with the others. Readability and DRY is important and the cost of the variable is very low considering that often you will have just Objects and not really store the thing multiple times.
There might be exceptions with special or large objects. You must keep in mind the question if the value you cache might change in between and if you would like or not (most times the second!) to know the new value within your code! In your example, think what might happen when ConfigurationManager.AppSettings["LoggingFile"] changes between the two calls (due to accessor logic or thread or always reading the value from a file from disk).
Resumee: About 99% you will want the second method / the cache!
IMO that would depend on what you are trying to cache. Caching a setting from App.conig might not be as benefiial (apart from code readability) as caching the results of a web service call over a GPRS connection.

Categories

Resources