StackExchange.Redis pipelining on for-loop? - c#

Preface
I have a simple interface that assumes dependencies between keys.
Two of its methods are:
Remove(string key) - that removes a single key from the cache.
RemoveDependentsOf(string baseKey) - that removes baseKey and all dependents of baseKey.
The dependents of baseKey are specified in a Redis set.
So in order to remove all dependents of baseKey I have to read baseKey's set and then loop over its members, deleting each of them.
Question
I have read the StackExchange.Redis documentation, so I know about its legendary pipelining support, and according to that documentation the following code should run very efficiently.
However, I cannot see how the library could pipeline the KeyDelete commands, since the method returns a boolean indicating whether the key was deleted.
So before the second KeyDelete command is executed, the first one must have been sent and its response received, which is not efficient.
What am I missing here?
How should I have written the code below?
public void Remove(string key)
{
    _redis.KeyDelete(key);
}
public void RemoveDependentsOf(string key)
{
    Remove(key);
    var setKey = GetDependencySetKey(key);
    RedisValue[] dependents = _redis.SetMembers(setKey);
    foreach (var dependentKey in dependents)
    {
        RemoveDependentsOf(dependentKey);
    }
    // This is the way to remove the whole set
    _redis.KeyExpire(setKey, TimeSpan.Zero);
}

You’re using synchronous methods and, although you don’t explicitly depend on the result of the KeyDelete operation, StackExchange.Redis doesn’t know that you’re not using the result. You are therefore not getting any of the pipelining benefits offered by the library.
The documentation explicitly calls out the two ways that you can use the pipelining support; use the Async method and perform a Task.WhenAll if you want to know when it’s done or use Fire and Forget. You can explicitly tell the library that you want to do this by passing CommandFlags.FireAndForget to your commands, e.g.
_redis.KeyDelete(key, CommandFlags.FireAndForget);
Note this will cause a default result to be returned from the call rather than the actual result. Given that you are disregarding those results anyway you should be OK!
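The other option mentioned above, the Async methods combined with Task.WhenAll, could look roughly like the following. This is only a sketch: the DependencyCache class name and the ":dependents" naming scheme inside GetDependencySetKey are assumptions made for illustration, and like the question's code it does not guard against cyclic dependencies.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using StackExchange.Redis;

// Hypothetical wrapper class; only the IDatabase calls come from StackExchange.Redis.
public class DependencyCache
{
    private readonly IDatabase _redis;

    public DependencyCache(IDatabase redis)
    {
        _redis = redis;
    }

    // Assumed naming scheme for the set that lists a key's dependents.
    private static string GetDependencySetKey(string key)
    {
        return key + ":dependents";
    }

    public async Task RemoveDependentsOfAsync(string key)
    {
        var setKey = GetDependencySetKey(key);

        // The members must be known before they can be deleted, so this call is awaited.
        RedisValue[] dependents = await _redis.SetMembersAsync(setKey);

        // Start every delete without awaiting it individually, so the commands
        // can be written to the connection back to back.
        var pending = new List<Task>
        {
            _redis.KeyDeleteAsync(key),
            _redis.KeyDeleteAsync(setKey)
        };
        foreach (var dependent in dependents)
        {
            pending.Add(RemoveDependentsOfAsync((string)dependent));
        }

        // Wait once for all outstanding operations.
        await Task.WhenAll(pending);
    }
}
```

Only the SetMembersAsync call has to complete before the next step can start; the deletes for one level are all issued before any of their replies is awaited.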

Related

Why ConcurrentDictionary TryUpdate() method requires indicating oldValue?

When calling TryUpdate you have to specify the old value in addition to the key. Why is this required?
An additional question: why do TryUpdate and similar methods have a while (true) loop wrapper inside?
The ConcurrentDictionary<TKey,TValue> collection is designed to support concurrent scenarios, where operations must be atomic. For example let's say that you have a dictionary with string keys and int values, and you want to increment the value of the key "A". The following code is not atomic:
dictionary["A"]++;
Between reading the value and updating it, it is possible that another thread will change the value, resulting in the other thread's change being lost. It is easier to see it if we rewrite the above code like this:
var value = dictionary["A"];
value++;
dictionary["A"] = value;
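The solution is to avoid updating the dictionary through the indexer and use TryUpdate instead. TryUpdate replaces the value only if the current value still equals the old value we pass in, which is why it needs that argument. If another thread gets its update in between our read and our write, the call returns false and we have to start all over again, until we finally win the race at updating this key: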
while (true)
{
    var existing = dictionary["A"];
    var updated = existing + 1;
    if (dictionary.TryUpdate("A", updated, existing)) break;
}
Doing loops with while (true), also known as "spinning", is a typical technique in low-lock multithreaded programming.
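To see the retry loop hold up under contention, here is a small self-contained sketch. The TryUpdateDemo class and the IncrementWithTryUpdate helper are names made up for this example; only ConcurrentDictionary and Parallel.For come from the framework.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class TryUpdateDemo
{
    // Hypothetical helper: atomically increments the value stored under key.
    // Assumes the key is already present in the dictionary.
    public static int IncrementWithTryUpdate(ConcurrentDictionary<string, int> dictionary, string key)
    {
        while (true)
        {
            int existing = dictionary[key];
            int updated = existing + 1;
            // Succeeds only if the stored value is still 'existing';
            // otherwise another thread got there first and we retry.
            if (dictionary.TryUpdate(key, updated, existing)) return updated;
        }
    }

    public static void Main()
    {
        var dictionary = new ConcurrentDictionary<string, int>();
        dictionary["A"] = 0;

        // Many concurrent increments of the same key; none of them is lost.
        Parallel.For(0, 10000, i => { IncrementWithTryUpdate(dictionary, "A"); });

        Console.WriteLine(dictionary["A"]); // prints 10000
    }
}
```

Replacing the helper body with dictionary[key]++ would typically print a smaller number, because some increments overwrite each other.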
Related question: Is there a way to use ConcurrentDictionary.TryUpdate with a lambda expression?

How does a "GetFoo()" function differ from "Foo"? [duplicate]

This is probably a matter of personal preference, but when do you use properties instead of functions in your code?
For instance to get an error log I could say
string GetErrorLog()
{
    return m_ErrorLog;
}
or I could
string ErrorLog
{
    get { return m_ErrorLog; }
}
How do you decide which one to use? I seem to be inconsistent in my usage and I'm looking for a good general rule of thumb. Thanks.
I tend to use properties if the following are true:
The property will return a single, logical value
Little or no logic is involved (typically just return a value, or do a small check/return value)
I tend to use methods if the following are true:
There is going to be significant work involved in returning the value, i.e. it'll get fetched from a DB, or something that may take "time"
There is quite a bit of logic involved, either in getting or setting the value
In addition, I'd recommend looking at Microsoft's Design Guidelines for Property Usage. They suggest:
Use a property when the member is a logical data member.
Use a method when:
The operation is a conversion, such as Object.ToString.
The operation is expensive enough that you want to communicate to the user that they should consider caching the result.
Obtaining a property value using the get accessor would have an observable side effect.
Calling the member twice in succession produces different results.
The order of execution is important. Note that a type's properties should be able to be set and retrieved in any order.
The member is static but returns a value that can be changed.
The member returns an array. Properties that return arrays can be very misleading. Usually it is necessary to return a copy of the internal array so that the user cannot change internal state. This, coupled with the fact that a user can easily assume it is an indexed property, leads to inefficient code. In the following code example, each call to the Methods property creates a copy of the array. As a result, 2n+1 copies of the array will be created in the following loop.
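The code sample that the last guideline refers to is not reproduced in this answer. A minimal sketch of the situation it describes might look like this; Catalog, Names, GetNames and the CopiesMade counter are hypothetical names invented for the illustration.

```csharp
using System;

public class Catalog
{
    private readonly string[] _names = { "alpha", "beta", "gamma" };

    // Counts defensive copies so the cost becomes visible.
    public int CopiesMade { get; private set; }

    // Property form: every read clones the internal array.
    public string[] Names
    {
        get { CopiesMade++; return (string[])_names.Clone(); }
    }

    // Method form: the call syntax hints that work is being done.
    public string[] GetNames()
    {
        CopiesMade++;
        return (string[])_names.Clone();
    }
}

public static class ArrayPropertyDemo
{
    public static int CountMatchesViaProperty(Catalog catalog, string text)
    {
        int matches = 0;
        // The property is read once per condition check and once per iteration.
        for (int i = 0; i < catalog.Names.Length; i++)
        {
            if (catalog.Names[i] == text) matches++;
        }
        return matches;
    }

    public static int CountMatchesViaMethod(Catalog catalog, string text)
    {
        // A method call nudges the caller to fetch the array once and reuse it.
        string[] names = catalog.GetNames();
        int matches = 0;
        for (int i = 0; i < names.Length; i++)
        {
            if (names[i] == text) matches++;
        }
        return matches;
    }

    public static void Main()
    {
        var viaProperty = new Catalog();
        CountMatchesViaProperty(viaProperty, "beta");
        Console.WriteLine(viaProperty.CopiesMade); // prints 7, i.e. 2n+1 for n = 3

        var viaMethod = new Catalog();
        CountMatchesViaMethod(viaMethod, "beta");
        Console.WriteLine(viaMethod.CopiesMade); // prints 1
    }
}
```

With three elements the loop condition runs four times and the body three times, which is where the 2n+1 copies come from.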
Here are Microsoft's guidelines:
Choosing Between Properties and Methods
Consider using a property if the member represents a logical attribute of the type.
Do use a property, rather than a method, if the value of the property is stored in the process memory and the property would just provide access to the value.
Do use a method, rather than a property, in the following situations.
The operation is orders of magnitude slower than a field set would be. If you are even considering providing an asynchronous version of an operation to avoid blocking the thread, it is very likely that the operation is too expensive to be a property. In particular, operations that access the network or the file system (other than once for initialization) should most likely be methods, not properties.
The operation is a conversion, such as the Object.ToString method.
The operation returns a different result each time it is called, even if the parameters do not change. For example, the NewGuid method returns a different value each time it is called.
The operation has a significant and observable side effect. Note that populating an internal cache is not generally considered an observable side effect.
The operation returns a copy of an internal state (this does not include copies of value type objects returned on the stack).
The operation returns an array.
I use properties when it's clear the semantic is "Get somevalue from the object". However, using a method is a good way to communicate "this may take a bit more than a trivial effort to return".
For example, a collection could have a Count property. It's reasonable to assume a collection object knows how many items are currently held without actually having to loop through them and count them.
On the other hand, this hypothetical collection could have a GetSum() method which returns the total of the set of items held. The collection could just as easily have a Sum property instead, but using a method communicates the idea that the collection will have to do some real work to get an answer.
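A small sketch of that hypothetical collection, assuming a made-up NumberBag class backed by a List<int>:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical collection used only to illustrate the naming choice.
public class NumberBag
{
    private readonly List<int> _items = new List<int>();

    public void Add(int value)
    {
        _items.Add(value);
    }

    // Cheap: the backing list already tracks its size.
    public int Count
    {
        get { return _items.Count; }
    }

    // Walks every item, so it is exposed as a method rather than a property.
    public long GetSum()
    {
        long total = 0;
        foreach (int item in _items)
        {
            total += item;
        }
        return total;
    }
}

public static class NumberBagDemo
{
    public static void Main()
    {
        var bag = new NumberBag();
        bag.Add(1);
        bag.Add(2);
        bag.Add(3);

        Console.WriteLine(bag.Count);    // prints 3
        Console.WriteLine(bag.GetSum()); // prints 6
    }
}
```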
I'd never use a property if I could be affecting more than one field - I'd always use a method.
Generally, I just use the
public string ErrorLog { get; private set; }
syntax for Properties and use Methods for everything else.
In addition to Reed's answer: I use a property when it is only going to be a getter, such as one that returns a resource like an Event Log. I try to only use properties when the property will be side-effect free.
If there is more than something trivial happening in a property, then it should be a method. For example, if your ErrorLog getter property was actually going and reading files, then it should be a method. Accessing a property should be fast, and if it is doing much processing, it should be a method. If there are side effects of accessing a property that the user of the class might not expect, then it should probably be a method.
There is a .NET Framework Design Guidelines book that covers this kind of stuff in great detail.

Can I specify CAS value and durability requirements together in a store operation?

I am using Couchbase .NET SDK 1.3.
The task is to store a value, provided the data under its key has not been modified in the DB since the value was read, and be sure the new value is persisted/replicated to a certain number of nodes. For modification check, I'd like to utilize optimistic locking, i.e. Couchbase's CAS methods. I need to synchronously wait until persistence/replication of the value is successful.
The problem is that the Couchbase SDK provides methods to specify either a CAS value or durability requirements, but not both:
ExecuteCas(mode, key, value, validfor, cas);
ExecuteStore(mode, key, value, persistTo, replicateTo);
I need to combine both. There is also an Observe method:
Observe(key, cas, persistTo, replicateTo);
It seems to be what I need, but I couldn't find its documentation anywhere. In particular, I can't be sure whether the method waits for the value to be persisted/replicated or just checks the state at the moment of the call. Is it valid to use this method like so?
var storeResult = client.ExecuteCas(StoreMode.Set, key, value, TimeSpan.FromSeconds(60), cas);
// check storeResult.Success
var observeResult = client.Observe(key, cas, persistTo, replicateTo);
// check observeResult.Success
I got an answer on the official forums: https://forums.couchbase.com/t/2199
Indeed, the Observe method can be used. It will block until the value is persisted/replicated.

Double checked locking on Dictionary "ContainsKey"

My team is currently debating this issue.
The code in question is something along the lines of
if (!myDictionary.ContainsKey(key))
{
    lock (_SyncObject)
    {
        if (!myDictionary.ContainsKey(key))
        {
            myDictionary.Add(key, value);
        }
    }
}
Some of the posts I've seen say that this may be a big NO NO (when using TryGetValue). Yet members of our team say it is ok since "ContainsKey" does not iterate on the key collection but checks if the key is contained via the hash code in O(1). Hence they claim there is no danger here.
I would like to get your honest opinions regarding this issue.
Don't do this. It's not safe.
You could be calling ContainsKey from one thread while another thread calls Add. That's simply not supported by Dictionary<TKey, TValue>. If Add needs to reallocate buckets etc, I can imagine you could get some very strange results, or an exception. It may have been written in such a way that you don't see any nasty effects, but I wouldn't like to rely on it.
It's one thing using double-checked locking for simple reads/writes to a field, although I'd still argue against it - it's another to make calls to an API which has been explicitly described as not being safe for multiple concurrent calls.
If you're on .NET 4, ConcurrentDictionary is probably the way forward. Otherwise, just lock on every access.
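Both suggestions can be sketched as follows. The SafeAddDemo class and its method names are made up for the example; the ConcurrentDictionary and Dictionary calls are the standard framework ones.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class SafeAddDemo
{
    // Option 1 (.NET 4 and later): the collection handles the synchronization itself.
    private static readonly ConcurrentDictionary<string, int> Concurrent =
        new ConcurrentDictionary<string, int>();

    // Option 2: a plain Dictionary where every access, reads included, takes the lock.
    private static readonly Dictionary<string, int> Plain = new Dictionary<string, int>();
    private static readonly object SyncObject = new object();

    public static bool AddIfMissing(string key, int value)
    {
        // TryAdd checks and inserts as one atomic operation.
        return Concurrent.TryAdd(key, value);
    }

    public static bool AddIfMissingLocked(string key, int value)
    {
        lock (SyncObject)
        {
            if (Plain.ContainsKey(key)) return false;
            Plain.Add(key, value);
            return true;
        }
    }

    public static bool TryGetLocked(string key, out int value)
    {
        // Reads are locked too, so they can never overlap with an Add.
        lock (SyncObject)
        {
            return Plain.TryGetValue(key, out value);
        }
    }

    public static void Main()
    {
        Console.WriteLine(AddIfMissing("a", 1));       // prints True
        Console.WriteLine(AddIfMissing("a", 2));       // prints False
        Console.WriteLine(AddIfMissingLocked("a", 1)); // prints True
        Console.WriteLine(AddIfMissingLocked("a", 2)); // prints False
    }
}
```

The difference from the double-checked version is that no ContainsKey call on the plain Dictionary ever runs outside the lock.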
If you are in a multithreaded environment, you may prefer to look at using a ConcurrentDictionary. I blogged about it a couple of months ago, you might find the article useful: http://colinmackay.co.uk/blog/2011/03/24/parallelisation-in-net-4-0-the-concurrent-dictionary/
This code is incorrect. The Dictionary<TKey, TValue> type does not support simultaneous read and write operations. Even though your Add method is called within the lock, the ContainsKey is not. Hence it easily allows for a violation of the simultaneous read/write rule and will lead to corruption in your instance.
It doesn't look thread-safe, but it would probably be hard to make it fail.
The iteration vs hash lookup argument doesn't hold; there could be a hash collision, for instance.
If this dictionary is rarely written and often read, then I often employ safe double locking by replacing the entire dictionary on write. This is particularly effective if you can batch writes together to make them less frequent.
For example, this is a cut down version of a method we use that tries to get a schema object associated with a type, and if it can't, then it goes ahead and creates schema objects for all the types it finds in the same assembly as the specified type to minimize the number of times the entire dictionary has to be copied:
public static Schema GetSchema(Type type)
{
    if (_schemaLookup.TryGetValue(type, out Schema schema))
        return schema;
    lock (_syncRoot) {
        if (_schemaLookup.TryGetValue(type, out schema))
            return schema;
        var newLookup = new Dictionary<Type, Schema>(_schemaLookup);
        foreach (var t in type.Assembly.GetTypes()) {
            var newSchema = new Schema(t);
            newLookup.Add(t, newSchema);
        }
        _schemaLookup = newLookup;
        return _schemaLookup[type];
    }
}
So the dictionary in this case will be rebuilt, at most, as many times as there are assemblies with types that need schemas. For the rest of the application lifetime the dictionary accesses will be lock-free. The dictionary copy becomes a one-time initialization cost per assembly. The dictionary swap is thread-safe because reference writes are atomic, so the whole reference gets switched at once, and a dictionary is never modified again after it has been published.
You can apply similar principles in other situations as well.

Abusing type overloading to create boilerplate code in C#

In a project I'm currently working on, we've added a wrapper class for accessing the HttpSessionState object. The trouble is that the current solution means you have to write some code to wrap the functionality. I came up with the following solution
/// <typeparam name="TKey">Class used for generating key into session state storage.</typeparam>
/// <typeparam name="T">Type of object to store.</typeparam>
public static class SessionManager<TKey, T>
{
    static SessionManager()
    {
        _key = typeof(TKey).ToString();
    }
    private static readonly string _key;
    public static string Key
    {
        get { return _key; }
    }
    // Other functions ... (Set, IsSet, Remove, etc.)
}
Now you can create the desired storage by merely using
using StringStore = Test.SessionManager<System.Boolean, System.String>;
using StringStore2 = Test.SessionManager<System.Version, System.String>;
StringStore.Set("I'm here");
StringStore2.Set("I'm also here");
The code works and is nice in that you can easily create the wrapper class (a single using statement) and everything is static. The code is, however, abusing the type system a bit, so maybe it is a bit too obscure? Before I added it I wanted to get some feedback, so here's the question:
If you were maintaining said system and encountered the code above, would you
Hunt down and kill whoever checked the file in?
Be a bit annoyed by the attempt to be clever, but let it slide?
Think it was a nice way of avoiding boilerplate code?
Would you prefer to use a text generation tool like T4?
Thanks for any replies,
Mads
So, you're using generics to create a key for a dictionary. I'd say this is definitely not a good idea for a couple different reasons.
The first reason is that it violates the reasoning behind a dictionary key. A key should hold some significance to the value it holds. Why are you storing a string under System.Boolean? What does System.Boolean mean? Does it mean the string is either true or false? Confusion like this makes the code harder to support. I'm also suspicious that the key value is used in casting the string somewhere else in code. This is clearly a mis-mixing of dictionaries and generics.
The second reason is that the Session is shared within a user's session. So two completely different segments of code written by two different developers are accessing this shared location. What is to stop one developer from identifying System.Boolean as an appropriate key-type for data A in their code, and another using System.Boolean as a key-type for data B in their code? Now the first developer is expecting A when accessing this bucket, but gets B. A meaningful, unique key would prevent this from happening.
If the use of System.Boolean versus System.Version is merely to distinguish different types to get separate instances of _key into the system, my response would be somewhere between #1 and #2. At least comment it and create some dummy types (maybe just empty interfaces) to use instead of using arbitrary .NET types.
I say this as someone whose primary job has been code review for the past 6+ months. If you're working with a lot of other people who need to understand that code, it's going to be difficult to understand and maintain if you don't at least give the reader some clue.
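The "dummy types" suggestion could be sketched like this. Everything here is an assumption for illustration: the marker interfaces, the Set and Get members (the question elides them), and the SessionStore dictionary, which merely stands in for HttpSessionState so the example runs outside ASP.NET.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical marker types: empty interfaces whose only job is to name a session slot.
public interface IUserDisplayNameKey { }
public interface ILastSearchTermKey { }

// Stand-in for HttpSessionState (an assumption made so the sketch is self-contained).
public static class SessionStore
{
    public static readonly Dictionary<string, object> Items = new Dictionary<string, object>();
}

/// <typeparam name="TKey">Marker type that names the session slot.</typeparam>
/// <typeparam name="T">Type of object to store.</typeparam>
public static class SessionManager<TKey, T>
{
    private static readonly string _key = typeof(TKey).ToString();

    public static string Key
    {
        get { return _key; }
    }

    public static void Set(T value)
    {
        SessionStore.Items[_key] = value;
    }

    public static T Get()
    {
        return (T)SessionStore.Items[_key];
    }
}

public static class MarkerKeyDemo
{
    public static void Main()
    {
        SessionManager<IUserDisplayNameKey, string>.Set("Mads");
        SessionManager<ILastSearchTermKey, string>.Set("redis pipelining");

        Console.WriteLine(SessionManager<IUserDisplayNameKey, string>.Get()); // prints Mads
        Console.WriteLine(SessionManager<ILastSearchTermKey, string>.Get());  // prints redis pipelining
    }
}
```

A reader who meets SessionManager<IUserDisplayNameKey, string> can at least guess what the slot holds, and two developers are less likely to pick the same marker by accident than to both reach for System.Boolean.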
