Using lock on the key of a Dictionary<string, object> - c#

I have a Dictionary<string, someobject>.
EDIT: It was pointed out to me that my example was bad. My intention was not to update the references in a loop, but to update different values as different threads need to update or get the data. I changed the loop to a method.
I need to update items in my dictionary one key at a time, and I was wondering whether there are any problems with using lock on the .Key value of my Dictionary object?
private static Dictionary<string, MatrixElement> matrixElements = new Dictionary<string, MatrixElement>();
//Pseudo-code
public static void UpdateValue(string key)
{
KeyValuePair<string, MatrixElement> keyValuePair = matrixElements[key];
lock (keyValuePair.Key)
{
keyValuePair.Value = SomeMeanMethod();
}
}
Would that hold up in court or fail? I just want each value in the dictionary to be locked independently, so that locking (and updating) one value does not lock the others. I'm also aware that the lock will be held for a long time, but the data will be invalid until fully updated.

Locking on an object that is accessible outside of the code locking it is a big risk. If any other code (anywhere) ever locks that object you could be in for some deadlocks that are hard to debug. Also note that you lock the object, not the reference, so if I gave you a dictionary, I may still hold references to the keys and lock on them - causing us to lock on the same object.
If you completely encapsulate the dictionary and generate the keys yourself (they are never passed in), then you may be safe.
However, try to stick to one rule - limit the visibility of the objects you lock on to the locking code itself whenever possible.
That's why you see this:
public class Something
{
private readonly object lockObj = new object();
public void SomethingReentrant()
{
lock(lockObj) // Line A
{
// ...
}
}
}
rather than seeing line A above replaced by
lock(this)
That way, a separate object is locked on, and the visibility is limited.
Edit: Jon Skeet correctly observed that lockObj above should be readonly.

No, this would not work.
The reason is string interning. This means that:
string a = "Something";
string b = "Something";
are both the same object, because identical string literals are interned. Therefore, you should never lock on strings: if some other part of the program (e.g. another instance of this same class) also locks on the same string, you could accidentally create lock contention where there is no need for it, or possibly even a deadlock.
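The interning behaviour described above can be checked with a minimal sketch. The class and method names (InterningDemo, LiteralsShareInstance, RuntimeStringSharesInstance) are hypothetical and only serve the illustration.

```csharp
using System;

public static class InterningDemo
{
    // Two identical literals: the compiler and runtime intern them,
    // so both variables reference one string object.
    public static bool LiteralsShareInstance()
    {
        string a = "Something";
        string b = "Something";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time is a separate object,
    // even though it contains the same characters.
    public static bool RuntimeStringSharesInstance()
    {
        string a = "Something";
        string b = new string("Something".ToCharArray());
        return object.ReferenceEquals(a, b);
    }
}
```

Both cases are a problem for locking: in the first, unrelated code ends up sharing a lock; in the second, code that believes it shares a lock does not.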
Feel free to do this with non-strings, though. For best clarity, I make it a personal habit to always create a separate lock object:
class Something
{
bool threadSafeBool = true;
object threadSafeBoolLock = new object(); // Always lock this to use threadSafeBool
}
I recommend you do the same. Create a Dictionary with the lock objects for every matrix cell. Then, lock these objects when needed.
PS. Changing the collection you are iterating over is not considered very nice. It will even throw an exception with most collection types. Try to refactor this - e.g. iterate over a list of keys, if it will always be constant, not the pairs.

Note: I assume the exception thrown when modifying the collection during iteration has already been fixed.
Dictionary is not a thread-safe collection, which means it is not safe to modify and read it from different threads without external synchronization. Hashtable is (was?) thread-safe for the one-writer-many-readers scenario, but Dictionary has a different internal data structure and doesn't inherit this guarantee.
This means that you cannot modify your dictionary while another thread is reading or writing it; doing so can corrupt its internal data structures. Locking on the key doesn't protect those structures, because while you modify that key, another thread could be reading a different key of the same dictionary. Even if you can guarantee that all your keys are the same objects (as mentioned regarding string interning), this doesn't put you on the safe side. Example:
1. You lock the key and begin to modify the dictionary.
2. Another thread attempts to get the value for a key that happens to fall into the same bucket as the locked one. This happens not only when the hash codes of two objects are the same, but more frequently when hashcode % tableSize is the same.
3. Both threads are accessing the same bucket (a linked list of keys with the same hashcode % tableSize value).
4. If there is no such key in the dictionary, the first thread will start modifying the list, and the second thread is likely to read an incomplete state.
5. If such a key already exists, implementation details of the dictionary could still modify the data structure, for example by moving recently accessed keys to the head of the list for faster retrieval. You cannot rely on implementation details.
There are many cases like that in which you end up with a corrupted dictionary. So you need an external synchronization object (or the Dictionary itself, if it is not exposed publicly) and must lock on it during the entire operation. If you need more granular locking because the operation can take a long time, you can copy the keys you need to update, iterate over the copy, lock the entire dictionary for each single-key update (don't forget to verify that the key is still there), and release the lock in between to let other threads run.
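The approach in the last paragraph (snapshot the keys, then lock the whole dictionary once per key) might look like the following sketch. The Matrix class, its int values and the compute delegate are assumptions made for illustration; they are not part of the original question.

```csharp
using System;
using System.Collections.Generic;

public sealed class Matrix
{
    // Private lock object: nothing outside this class can lock on it.
    private readonly object _sync = new object();
    private readonly Dictionary<string, int> _elements = new Dictionary<string, int>();

    public void Set(string key, int value)
    {
        lock (_sync) { _elements[key] = value; }
    }

    public bool TryGet(string key, out int value)
    {
        lock (_sync) { return _elements.TryGetValue(key, out value); }
    }

    public void UpdateAll(Func<string, int, int> compute)
    {
        string[] keys;
        // Copy the keys under the lock so the loop never enumerates the live dictionary.
        lock (_sync) { keys = new List<string>(_elements.Keys).ToArray(); }

        foreach (string key in keys)
        {
            // The lock is taken per key and released between keys,
            // so other threads get a chance to run.
            lock (_sync)
            {
                // The key may have been removed since the snapshot was taken.
                if (_elements.TryGetValue(key, out int current))
                {
                    _elements[key] = compute(key, current);
                }
            }
        }
    }
}
```

Note that compute still runs while the lock is held, so a slow computation blocks all other access for its duration.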

If I'm not mistaken, the original intention was to lock on a single element rather than on the whole dictionary (like a table-level lock vs. a row-level lock in a database).
You can't lock on the dictionary's key, as many here have explained.
What you can do is keep an internal dictionary of lock objects that corresponds to the actual dictionary. When you want to write to YourDictionary[Key1], you first lock on InternalLocksDictionary[Key1], so only a single thread at a time will write to that entry of YourDictionary.
A (not too clean) example can be found here.
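A minimal sketch of the lock-object dictionary idea, under some assumptions: the PerKeyStore name is hypothetical, and both dictionaries are ConcurrentDictionary instances, because a plain Dictionary is not safe to read while another thread writes a different key.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class PerKeyStore<TValue>
{
    // One private lock object per key; the objects never leave this class.
    private readonly ConcurrentDictionary<string, object> _locks =
        new ConcurrentDictionary<string, object>();
    private readonly ConcurrentDictionary<string, TValue> _values =
        new ConcurrentDictionary<string, TValue>();

    public void Update(string key, Func<TValue> compute)
    {
        // GetOrAdd hands every caller the same stored lock object for a given key.
        object gate = _locks.GetOrAdd(key, _ => new object());
        lock (gate)
        {
            // Only updates of this key wait here; other keys proceed independently.
            _values[key] = compute();
        }
    }

    public bool TryGet(string key, out TValue value)
    {
        return _values.TryGetValue(key, out value);
    }
}
```

The lock objects are never removed in this sketch, so the lock dictionary grows with the number of distinct keys.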

Just came across this and thought I'd share some code I wrote a few years ago, where I needed to lock a dictionary on a per-key basis:
using (var lockObject = new Lock(hashedCacheID))
{
var lockedKey = lockObject.GetLock();
//now do something with the dictionary
}
the lock class
class Lock : IDisposable
{
    private static readonly Dictionary<string, string> Lockedkeys = new Dictionary<string, string>();
    private static readonly object CritialLock = new object();
    // The private string instance this Lock synchronizes on, looked up once in the constructor
    private readonly string _lockObject;
    private bool _isLocked;
    public Lock(string key)
    {
        // Dictionary is not thread-safe, so every access to Lockedkeys happens under CritialLock
        lock (CritialLock)
        {
            // if the dictionary doesn't contain the key, add it
            if (!Lockedkeys.TryGetValue(key, out _lockObject))
            {
                _lockObject = String.Copy(key); // ensure that the two objects have different references
                Lockedkeys.Add(key, _lockObject);
            }
        }
    }
    public string GetLock()
    {
        if (!_isLocked)
        {
            Monitor.Enter(_lockObject);
        }
        _isLocked = true;
        return _lockObject;
    }
    public void Dispose()
    {
        if (_isLocked)
        {
            Monitor.Exit(_lockObject);
        }
        _isLocked = false;
    }
}

In your example, you can not do what you want to do!
You will get a System.InvalidOperationException with a message of Collection was modified; enumeration operation may not execute.
Here is an example to prove:
using System.Collections.Generic;
using System;
public class Test
{
private Int32 age = 42;
static public void Main()
{
(new Test()).TestMethod();
}
public void TestMethod()
{
Dictionary<Int32, string> myDict = new Dictionary<Int32, string>();
myDict[age] = age.ToString();
foreach(KeyValuePair<Int32, string> pair in myDict)
{
Console.WriteLine("{0} : {1}", pair.Key, pair.Value);
++age;
Console.WriteLine("{0} : {1}", pair.Key, pair.Value);
myDict[pair.Key] = "new";
Console.WriteLine("Changed!");
}
}
}
The output would be:
42 : 42
42 : 42
Unhandled Exception: System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
at System.ThrowHelper.ThrowInvalidOperationException(ExceptionResource resource)
at System.Collections.Generic.Dictionary`2.Enumerator.MoveNext()
at Test.TestMethod()
at Test.Main()

I can see a few potential issues there:
strings can be shared, so you don't necessarily know who else might be locking on that key object for what other reason
strings might not be shared: you may be locking on one string key with the value "Key1" and some other piece of code may have a different string object that also contains the characters "Key1". To the dictionary they're the same key but as far as locking is concerned they are different objects.
That locking won't prevent changes to the value objects themselves, i.e. matrixElements[someKey].ChangeAllYourContents()

Related

C# Access ConcurrentDictionary value by key in async app

I have a question about ConcurrentDictionary in .NET C#.
My app is going to be async (I try to do that :)).
I have some external devices, which send data to my core (C# .NET) via some TCPIP communication. I store the objects in values of ConcurrentDictionary for each device. I have some operations with that data, where I need to read it and sometimes change some in the object.
So far it works well without deadlocks: when I increase the number of external/simulated devices, it does not slow down, and it can handle more messages at the same time without data loss.
But: I am not sure if I'm using it correctly.
I need to change some values inside of the object, call some functions and store all changes in the dict. All objects in the dict must be available to be read by other processes (I know during the "DoJob" other processes can have old values in dict until I will save value, but in my case it is ok). I just need to avoid blocking/locking other tasks and make it as fast as possible.
Which way is better:
1 way (i use it now):
var dict = new ConcurrentDictionary<string, MyClass>(concurrencyLevel, initialCapacity);
private async Task DoJob(string myKey)
{
MyClass myClass;
MyClass myClassInitState;
dict.TryGetValue(myKey, out myClass);
dict.TryGetValue(myKey, out myClassInitState);
var value = myClass.SomeValueToRead;
myClass.Prop1 = 10;
await myClass.DoSomeAnotherJob();
dict.TryUpdate(myKey, myClass, myClassInitState);
}
2 way:
var dict = new ConcurrentDictionary<string, MyClass>(concurrencyLevel, initialCapacity);
private async Task DoJob(string myKey)
{
var value = dict[myKey].SomeValueToRead;
dict[myKey].ChangeProp1(10);
await dict[myKey].DoSomeAnotherJob();
}
The second way looks much more clear and simple. But I am not sure if I can do that because of async.
Will I block the other threads/tasks?
Which way will be faster? I expect first one, because inside of DoJob I do not work with dict, but with some copy of object and after all I will update the dict.
Could reading the values directly (#2) slow down the whole process?
Could other processes read the most recently updated value from dict while #2 is running, without any trouble?
What happen when I call:
dict[myKey].DoSomeAnotherJob();
It is awaitable, so it should not block the threads. But in fact it is called in shared dict in some its value.
The thread-safe ConcurrentDictionary (as opposed to a plain old Dictionary) has nothing to do with async/await.
What this does:
await dict[myKey].DoSomeAnotherJob();
Is this:
var temp = dict[myKey];
await temp.DoSomeAnotherJob();
You do not need a ConcurrentDictionary in order to call that async method, dict can just as well be a regular Dictionary.
Also, assuming MyClass is a reference type (a class as opposed to a struct), saving its original reference in a temporary variable and updating the dictionary, as you do, is unnecessary. The moment after you called myClass.Prop1 = 10, this change is propagated to all other places where you have a reference to that same myClass instance.
You only want to call TryUpdate() if you want to replace the value, but you don't, as it's still the same reference - there's nothing to replace, both myClass and myClassInitState point to the same object.
The only reason to use a ConcurrentDictionary (as opposed to a Dictionary), is when the dictionary is accessed from multiple threads. So if you call DoJob() from different threads, that's when you should use a ConcurrentDictionary.
Also, when multithreading is involved, this is dangerous:
var value = dict[myKey].SomeValueToRead;
dict[myKey].ChangeProp1(10);
await dict[myKey].DoSomeAnotherJob();
Because in the meantime, another thread could change the value for myKey, meaning you obtain a different reference each time you call dict[myKey]. So saving it in a temporary variable is the way to go.
Also, using the indexer property (myDict[]) instead of TryGetValue() has its own issue (it throws a KeyNotFoundException when the key is missing), but that is still not a threading issue.
Both ways are equal in effect. The first way uses methods to read from the collection and the second way uses an indexer to achieve the same. In fact the indexer internally invokes TryGetValue().
When invoking MyClass myClass = concurrentDictionary[key] (or concurrentDictionary[key].MyClassOperation()) the dictionary internally executes the getter of the indexer property:
public TValue this[TKey key]
{
get
{
if (!TryGetValue(key, out TValue value))
{
throw new KeyNotFoundException();
}
return value;
}
set
{
if (key == null) throw new ArgumentNullException("key");
TryAddInternal(key, value, true, true, out TValue dummy);
}
}
The internal ConcurrentDictionary code shows that
concurrentDictionary.TryGetValue(key, out value)
and
var value = concurrentDictionary[key]
are the same, except that the indexer will throw a KeyNotFoundException if the key doesn't exist.
From a consumer's point of view, the first version using TryGetValue allows you to write more readable code:
// Only get value if key exists
if (concurrentDictionary.TryGetValue(key, out MyClass value))
{
value.Operation();
}
vs
// Only get value if key exists to avoid an exception (two lookups, and not atomic: the key may be removed in between)
if (concurrentDictionary.ContainsKey(key))
{
MyClass myClass = concurrentDictionary[key];
myClass.Operation();
}
Speaking of readability, your code can be simplified as follows:
private async Task DoJob(string myKey)
{
if (dict.TryGetValue(myKey, out MyClass myClass))
{
var value = myClass.SomeValueToRead;
myClass.Prop1 = 10;
await myClass.DoSomeAnotherJob();
}
}
await myClass.DoSomeAnotherJob() won't block: await returns control to the caller while the awaited operation is pending instead of blocking the calling thread.
Neither TryGetValue nor this[] will block other threads.
Both access variants execute at the same speed, as they share the same implementation.
dict[myKey].Operation()
is equal to
MyClass myClass = dict[myKey];
myClass.Operation();
It's the same as when
GetMyClass().Operation()
is equal to
MyClass myClass = GetMyClass();
myClass.Operation();
Your perception is wrong. Nothing is called "inside" the dictionary. As you can see from the internal code snippet, dict[key] simply returns a value.
Remarks
ConcurrentDictionary is a way to provide thread-safe access to a collection. That is, a thread that accesses the collection will always see the collection in a defined state.
But be aware that the items themselves do not become thread-safe just because they are stored in a thread-safe collection.
Consider the following method being executed by two threads simultaneously.
Both threads share the same ConcurrentDictionary and therefore the same objects.
// Thread 1
private async Task DoJob(string myKey)
{
if (dict.TryGetValue(myKey, out MyClass myClass))
{
var value = myClass.SomeValueToRead;
myClass.Prop1 = 10;
await myClass.DoSomeLongRunningJob();
// The result may now be '100' and not '20' because myClass.Prop1 is not thread-safe: the second thread can change the value while this method is awaiting
int result = 2 * myClass.Prop1;
}
}
// Thread 2
private async Task DoJob(string myKey)
{
if (dict.TryGetValue(myKey, out MyClass myClass))
{
var value = myClass.SomeValueToRead;
myClass.Prop1 = 50;
await myClass.DoSomeLongRunningJob();
int result = 2 * myClass.Prop1; // '100'
}
}
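One possible way to protect the item itself is sketched below. It assumes an asynchronous gate is acceptable and uses SemaphoreSlim, since a lock statement cannot contain an await. GuardedItem and SetAndDoubleAsync are hypothetical names, and Task.Delay stands in for the long-running job.

```csharp
using System.Threading;
using System.Threading.Tasks;

public sealed class GuardedItem
{
    // Allows one caller at a time; WaitAsync does not block the calling thread.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    public int Prop1 { get; private set; }

    public async Task<int> SetAndDoubleAsync(int newValue)
    {
        await _gate.WaitAsync();
        try
        {
            Prop1 = newValue;
            await Task.Delay(10);   // placeholder for the long-running job
            return 2 * Prop1;       // no other caller can change Prop1 in between
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

With the gate in place, a caller that sets 10 gets 20 back and a caller that sets 50 gets 100, regardless of how the two calls interleave.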
Also, ConcurrentDictionary.TryUpdate(key, newValue, comparisonValue) is logically the same as the following code, except that TryUpdate performs the check and the replacement as one atomic operation:
if (dict.ContainsKey(key))
{
var value = dict[key];
if (value == comparisonValue)
{
dict[key] = newValue;
}
}
Example: Let's say the dictionary contains a numeric element at key "Amount" with a value of 50. Thread 1 only wants to modify this value if Thread 2 hasn't changed it in the meantime. Thread 2 value is more important (has precedence). You now can use the TryUpdate method to apply this rule:
if (dict.TryGetValue("Amount", out int oldIntValue))
{
// Change value according to rule
if (oldIntValue > 0)
{
// Calculate a new value
int newIntValue = oldIntValue * 2;
// Replace the old value inside the dictionary ONLY if thread 2 hasn't changed it already
dict.TryUpdate("Amount", newIntValue, oldIntValue);
}
else // Change value without rule
{
dict["Amount"] = 1;
}
}
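When the competing write should not simply win, a common variation is to retry until the compare-and-replace succeeds. The sketch below is that variation, not the precedence rule from the example above; AmountUpdater and TryDouble are hypothetical names.

```csharp
using System.Collections.Concurrent;

public static class AmountUpdater
{
    // Doubles the value stored under key. Returns false if the key does not exist.
    public static bool TryDouble(ConcurrentDictionary<string, int> dict, string key)
    {
        // Re-read and retry whenever another thread changed the value
        // between our read and our TryUpdate.
        while (dict.TryGetValue(key, out int oldValue))
        {
            int newValue = oldValue * 2;
            if (dict.TryUpdate(key, newValue, oldValue))
            {
                return true;
            }
        }
        return false;
    }
}
```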

Thread safe Increment in C#

I am trying to Increment an element in a list in C#, but I need it to be thread safe, so the count does not get affected.
I know you can do this for integers:
Interlocked.Increment(ref sdmpobjectlist1Count);
but this does not work on a list. I have the following so far:
lock (padlock)
{
DifferenceList[diff[d].PropertyName] = DifferenceList[diff[d].PropertyName] + 1;
}
I know this works, but I'm not sure if there is another way to do this?
As David Heffernan said, ConcurrentDictionary should provide better performance. However, the performance gain might be negligible, depending on how frequently multiple threads try to access the cache.
using System;
using System.Collections.Concurrent;
using System.Threading;
namespace ConcurrentCollections
{
class Program
{
static void Main()
{
var cache = new ConcurrentDictionary<string, int>();
for (int threadId = 0; threadId < 2; threadId++)
{
new Thread(
() =>
{
while (true)
{
var newValue = cache.AddOrUpdate("key", 0, (key, value) => value + 1);
Console.WriteLine("Thread {0} incremented value to {1}",
Thread.CurrentThread.ManagedThreadId, newValue);
}
}).Start();
}
Thread.Sleep(TimeSpan.FromMinutes(2));
}
}
}
If you use a List<int[]> rather than a List<int>, and have each element in the list be a single-item array, you will be able to do Increment(ref List[whatever][0]) and have it be atomic. One could improve storage efficiency slightly if one defined
class ExposedFieldHolder<T> {public T Value;}
and then used a List<ExposedFieldHolder<int>> and used the statement Increment(ref List[whatever].Value) to perform the increment. Things could be more efficient yet if the built-in types provided a means of exposing an item as a ref or allowed derived classes sufficient access to their internals to provide such ability themselves. They don't, however, so one must either define one's own collection types from scratch or encapsulate each item in its own class object [using an array or a wrapper class].
Check the variable you lock on, "padLock". Normally you would define it as private static Object padLock = new Object(). If you do not define it as static, each instance has its own copy, so threads working with different instances will not exclude each other.
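If the counters can live in a plain int array rather than a list, array elements can be passed by ref, so Interlocked.Increment works on them directly. This is a sketch under that assumption; CounterDemo and CountInParallel are hypothetical names.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    // Increments each of the slots counters incrementsPerSlot times from parallel workers.
    public static int[] CountInParallel(int slots, int incrementsPerSlot)
    {
        var counts = new int[slots];
        Parallel.For(0, slots * incrementsPerSlot, i =>
        {
            // An array element is a variable, so it can be passed by ref
            // and incremented atomically without a lock.
            Interlocked.Increment(ref counts[i % slots]);
        });
        return counts;
    }
}
```

A List<int> indexer returns a copy of the value rather than a reference to the storage, which is why the same call does not compile for a list.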

How to access the reference values of a HashSet<TValue> without enumeration?

I have this scenario in which memory conservation is paramount. I am trying to read in > 1 GB of Peptide sequences into memory and group peptide instances together that share the same sequence. I am storing the Peptide objects in a Hash so I can quickly check for duplication, but found out that you cannot access the objects in the Set, even after knowing that the Set contains that object.
Memory is really important and I don't want to duplicate data if at all possible. (Otherwise I would have designed my data structure as peptides = Dictionary<string, Peptide>, but that would duplicate the string in both the dictionary and the Peptide class.) Below is the code showing what I would like to accomplish:
public class SomeClass {
// Main Storage of all the Peptide instances, class provided below
private HashSet<Peptide> peptides = new HashSet<Peptide>();
public void SomeMethod(IEnumerable<string> files) {
foreach(string file in files) {
using(PeptideReader reader = new PeptideReader(file)) {
foreach(DataLine line in reader.ReadNextLine()) {
Peptide testPep = new Peptide(line.Sequence);
if(peptides.Contains(testPep)) {
// ** Problem Is Here **
// I want to get the Peptide object that is in HashSet
// so I can add the DataLine to it, I don't want use the
// testPep object (even though they are considered "equal")
peptides[testPep].Add(line); // I know this doesn't work
testPep.Add(line); // THIS IS NO GOOD, since it won't be saved in the HashSet which I use in other methods.
} else {
// The HashSet doesn't contain this peptide, so we can just add it
testPep.Add(line);
peptides.Add(testPep);
}
}
}
}
}
}
public class Peptide : IEquatable<Peptide> {
public string Sequence {get;private set;}
private int hCode = 0;
public PsmList PSMs {get;set;}
public Peptide(string sequence) {
Sequence = sequence.Replace('I', 'L');
hCode = Sequence.GetHashCode();
}
public void Add(DataLine data) {
if(PSMs == null) {
PSMs = new PsmList();
}
PSMs.Add(data);
}
public override int GetHashCode() {
return hCode;
}
public bool Equals(Peptide other) {
return Sequence.Equals(other.Sequence);
}
}
public class PsmList : List<DataLine> { /* and some other stuff that is not important */ }
Why does HashSet not let me get the object reference that is contained in the HashSet? I know people will try to say that if HashSet.Contains() returns true, your objects are equivalent. They may be equivalent in terms of values, but I need the references to be the same since I am storing additional information in the Peptide class.
The only solution I came up with is Dictionary<Peptide, Peptide> in which both the key and value point to the same reference. But this seems tacky. Is there another data structure to accomplish this?
Basically you could reimplement HashSet<T> yourself, but that's about the only solution I'm aware of. The Dictionary<Peptide, Peptide> or Dictionary<string, Peptide> solution is probably not that inefficient though - if you're only wasting a single reference per entry, I would imagine that would be relatively insignificant.
In fact, if you remove the hCode member from Peptide, that will save you 4 bytes per object, which is the same size as a reference on x86 anyway. There's no point in caching the hash as far as I can tell, as you'll only compute the hash of each object once, at least in the code you've shown.
If you're really desperate for memory, I suspect you could store the sequence considerably more efficiently than as a string. If you give us more information about what the sequence contains, we may be able to make some suggestions there.
I don't know that there's any particularly strong reason why HashSet doesn't permit this, other than that it's a relatively rare requirement - but it's something I've seen requested in Java as well...
Use a Dictionary<string, Peptide>.
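On newer runtimes (.NET Framework 4.7.2 and later, .NET Core 2.0 and later), HashSet<T> has a TryGetValue method that returns the stored instance equal to a probe value, which postdates the answers above. The sketch below assumes such a runtime and uses a simplified Peptide in which a List<string> stands in for the PsmList; PeptideGrouping and AddLine are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public sealed class Peptide : IEquatable<Peptide>
{
    public string Sequence { get; }
    public List<string> Lines { get; } = new List<string>();

    public Peptide(string sequence) { Sequence = sequence.Replace('I', 'L'); }

    public bool Equals(Peptide other) => other != null && Sequence == other.Sequence;
    public override bool Equals(object obj) => Equals(obj as Peptide);
    public override int GetHashCode() => Sequence.GetHashCode();
}

public static class PeptideGrouping
{
    // Adds line to the peptide already stored in the set, or stores a new one.
    public static Peptide AddLine(HashSet<Peptide> peptides, string sequence, string line)
    {
        var probe = new Peptide(sequence);
        // TryGetValue yields the instance held by the set, not the probe.
        if (!peptides.TryGetValue(probe, out Peptide stored))
        {
            peptides.Add(probe);
            stored = probe;
        }
        stored.Lines.Add(line);
        return stored;
    }
}
```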

How to reset a Dictionary

If I declared a dictionary like this:
private static Dictionary<string, object> aDict = new Dictionary<string, object>();
And now I want to use it at another place. How do I reset it?
aDict = new Dictionary<string, object>(); // like this?
aDict = null; // or like this?
or other reset styles?
You can simply use the Clear method, it will remove all keys and values, then you can reuse it without having to create new instances:
aDict.Clear();
Try this
aDict.Clear();
aDict.Clear(); will work.
aDict.Clear(); is the only way to go since you don't want to change the reference and keep the same object available at another place
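The difference between clearing and reassigning can be sketched as follows; ResetDemo and the variable other (standing in for a reference held "at another place") are hypothetical.

```csharp
using System.Collections.Generic;

public static class ResetDemo
{
    public static (int afterClear, int afterReassign) Run()
    {
        var aDict = new Dictionary<string, object> { ["a"] = 1 };
        var other = aDict;                  // a second reference to the same dictionary

        aDict.Clear();                      // empties the one shared object
        int afterClear = other.Count;       // other sees the cleared dictionary

        other["b"] = 2;
        aDict = new Dictionary<string, object>();   // aDict now points to a new object
        int afterReassign = other.Count;    // other still holds the old dictionary with "b"

        return (afterClear, afterReassign);
    }
}
```

Here Run() returns (0, 1): Clear() is visible through every reference, while reassignment only changes the one variable.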
As everybody has pretty much answered, the .Clear() method provided by the Dictionary class is the way to go here (I can't agree more).
Just to make it clear (for newbies, of course ;)) why not to use the other approaches, such as creating a new instance every time we need to reset the dictionary:
aDict = new Dictionary<string, object>(); // like this?
Even though this works, it is not a memory-efficient approach: it creates a new instance and leaves the old instance(s) of the dictionary waiting for the GC (garbage collector) to reclaim them once they are no longer referenced. You would agree that there is no point in consuming extra memory when you don't need to :)
and
aDict = null; // or like this?
This leaves your variable set to null, so the next time the OP wants to use it as a dictionary, the OP has to create another instance (yes, you got it right: not memory efficient).
It is also not good programming style here, as someone might end up calling .ContainsKey() (or any other operation on the dictionary, for that matter) on the aDict variable and cause a NullReferenceException if aDict is still null.
Hope this explanation helps!! Thanks for reading!
Running a decompile of the Clear method in Resharper on a Dictionary object shows this:
/// <summary>Removes all keys and values from the <see cref="T:System.Collections.Generic.Dictionary`2" />.</summary>
[__DynamicallyInvokable]
public void Clear()
{
if (this.count <= 0)
return;
for (int index = 0; index < this.buckets.Length; ++index)
this.buckets[index] = -1;
Array.Clear((Array) this.entries, 0, this.count);
this.freeList = -1;
this.count = 0;
this.freeCount = 0;
++this.version;
}
The dictionary contains an integer array of buckets and other control variables that are set to either -1 or 0 to effectively clear the keys and values from the dictionary object. As the .NET source code shows, quite a few variables together represent a valid state of the Dictionary. Interesting.

Generating the next available unique name in C#

If you were to have a naming system in your app where the app contains say 100 actions, which creates new objects, like:
Blur
Sharpen
Contrast
Darken
Matte
...
and each time you use one of these, a new instance is created with a unique editable name, like Blur01, Blur02, Blur03, Sharpen01, Matte01, etc. How would you generate the next available unique name, so that it's an O(1) operation or near constant time. Bear in mind that the user can also change the name to custom names, like RemoveFaceDetails, etc.
It's acceptable to have some constraints, like restricting the number of characters to 100, using letters, numbers, underscores, etc...
EDIT: You can also suggest solutions without "filling the gaps" that is without reusing the already used, but deleted names, except the custom ones of course.
I refer you to Michael A. Jackson's Two Rules of Program Optimization:
1. Don't do it.
2. For experts only: Don't do it yet.
Simple, maintainable code is far more important than optimizing for a speed problem that you think you might have later.
I would start simple: build a candidate name (e.g. "Sharpen01"), then loop through the existing filters to see if that name exists. If it does, increment and try again. This is O(N^2), but until you get thousands of filters, that will be good enough.
If, sometime later, the O(N^2) does become a problem, then I'd start by building a HashSet of existing names. Then you can check each candidate name against the HashSet, rather than iterating. Rebuild the HashSet each time you need a unique name, then throw it away; you don't need the complexity of maintaining it in the face of changes. This would leave your code easy to maintain, while only being O(N).
O(N) will be good enough. You do not need O(1). The user is not going to click "Sharpen" enough times for there to be any difference.
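The throwaway-HashSet approach could be sketched like this. NameGenerator and NextName are hypothetical names, and the two-digit suffix format is an assumption taken from the examples in the question.

```csharp
using System.Collections.Generic;

public static class NameGenerator
{
    // Returns the first baseName + two-digit number that is not in existingNames.
    public static string NextName(string baseName, IEnumerable<string> existingNames)
    {
        // Built on every call and then discarded: O(N) to build,
        // O(1) expected per candidate lookup.
        var taken = new HashSet<string>(existingNames);
        for (int n = 1; ; n++)
        {
            string candidate = baseName + n.ToString("00");
            if (!taken.Contains(candidate))
            {
                return candidate;
            }
        }
    }
}
```

Because custom names are part of existingNames, a user-chosen name such as "Blur02" is skipped like any other taken name.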
I would create a static integer in action class that gets incremented and assigned as part of each new instance of the class. For instance:
class Blur
{
private static int count = 0;
private string _name;
public string Name
{
get { return _name; }
set { _name = value; }
}
public Blur()
{
_name = "Blur" + count++.ToString();
}
}
Since count is static, each time you create a new instance of the class, it will be incremented and appended to the default name. O(1) time.
EDIT
If you need to fill in the holes when you delete, I would suggest the following. It would automatically queue up numbers when items are renamed, but it would be more costly overall:
class Blur
{
private static int count = 0;
private static Queue<int> deletions = new Queue<int>();
private string _name;
public string Name
{
get { return _name; }
set
{
_name = value;
Delete();
}
}
private int assigned;
public Blur()
{
if (deletions.Count > 0)
{
assigned = deletions.Dequeue();
}
else
{
assigned = count++;
}
_name = "Blur" + assigned.ToString();
}
public void Delete()
{
if (assigned >= 0)
{
deletions.Enqueue(assigned);
assigned = -1;
}
}
}
Also, when you delete an object, you'll need to call .Delete() on the object.
CounterClass Dictionary version
class CounterClass
{
private int count;
private Queue<int> deletions;
public CounterClass()
{
count = 0;
deletions = new Queue<int>();
}
public string GetNumber()
{
if (deletions.Count > 0)
{
return deletions.Dequeue().ToString();
}
return count++.ToString();
}
public void Delete(int num)
{
deletions.Enqueue(num);
}
}
You can create a Dictionary to look up the counter for each string. Just make sure you parse out the index and call .Delete(int) whenever you rename or delete a value.
You can easily do it in O(m), where m is the number of existing instances of the name (and not dependent on n, the number of items in the list).
1. Look up the string S in question. If S isn't in the list, you're done.
2. S exists, so construct S+"01" and check for that. Continue incrementing (e.g. next try S+"02") until it doesn't exist.
This gives you unique names but they're still "pretty" and human-readable.
Unless you expect a large number of duplicates, this should be "near-constant" time because m will be so small.
Caveat: What if the string naturally ends with e.g. "01"? In your case this sounds unlikely so perhaps you don't care. If you do care, consider adding more of a suffix, e.g. "_01" instead of just "01" so it's easier to tell them apart.
You could do something like this:
private Dictionary<string, int> instanceCounts = new Dictionary<string, int>();
private string GetNextName(string baseName)
{
// TryGetValue leaves count at 0 when the base name has not been used yet
instanceCounts.TryGetValue(baseName, out int count);
// the next number is one more than the last one issued, so the first name ends in "01"
count++;
// update the dictionary with the new value
instanceCounts[baseName] = count;
// format the number as desired
return baseName + count.ToString("00");
}
You would then just use it by calling GetNextName(...) with the base name you wanted, such as
string myNextName = GetNextName("Blur");
Using this, you wouldn't have to pre-init the dictionary.
It would fill in as you used the various base words.
Also, this is O(1).
I would create a dictionary with a string key and an integer value, storing the next number to use for a given action. This will be almost O(1) in practice.
private IDictionary<String, Int32> NextFreeActionNumbers = null;
private void InitializeNextFreeActionNumbers()
{
this.NextFreeActionNumbers = new Dictionary<String, Int32>();
this.NextFreeActionNumbers.Add("Blur", 1);
this.NextFreeActionNumbers.Add("Sharpen", 1);
this.NextFreeActionNumbers.Add("Contrast", 1);
// ... and so on ...
}
private String GetNextActionName(String action)
{
Int32 number = this.NextFreeActionNumbers[action];
this.NextFreeActionNumbers[action] = number + 1;
return String.Format("{0} {1}", action, number);
}
And you will have to check for collisions with user-edited values. Again, a dictionary might be a smart choice. There is no way around that. Whatever way you generate your names, the user can always change an existing name to the next one you would generate, unless you include all existing names in the generation scheme. (Or use a special character that is not allowed in user-edited names, but that would not be very nice.)
Because of the comments on reusing the holes, I want to add this here, too: don't reuse the holes created by renaming or deletion. This will confuse the user, because names they deleted or modified will suddenly reappear.
I would look for ways to simplify the problem.
Are there any constraints that can be applied? As an example, would it be good enough if each user can only have one (active) type of action? Then, the actions could be distinguished using the name (or ID) of the user.
Blur (Ben F)
Blur (Adrian H)
Focus (Ben F)
Perhaps this is not an option in this case, but maybe something else would be possible. I would go to great lengths in order to avoid the complexity in some of the proposed solutions!
If you want O(1) time then just track how many instances of each you have. Keep a hashtable with all of the possible objects, when you create an object, increment the value for that object and use the result in the name.
You're definitely not going to want to expose a GUID to the user interface.
Are you proposing an initial name like "Blur04", letting the user rename it, and then raising an error message if the user's custom name conflicts? Or silently renaming it to "CustomName01" or whatever?
You can use a Dictionary to check for duplicates in O(1) time. You can have incrementing counters for each effect type in the class that creates your new effect instances. Like Kevin mentioned, it gets more complex if you have to fill in gaps in the numbering when an effect is deleted.
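Combining the two ideas above (a per-effect counter plus a set of all names in use for O(1) duplicate checks) might look like the sketch below. NameRegistry, Create and TryRename are hypothetical names; gaps are deliberately not refilled.

```csharp
using System.Collections.Generic;

public sealed class NameRegistry
{
    // Last number issued per base name, e.g. "Blur" -> 3.
    private readonly Dictionary<string, int> _lastNumber = new Dictionary<string, int>();
    // Every name currently in use, generated or custom.
    private readonly HashSet<string> _used = new HashSet<string>();

    public string Create(string baseName)
    {
        _lastNumber.TryGetValue(baseName, out int n);   // n stays 0 for a new base name
        string name;
        do
        {
            n++;
            name = baseName + n.ToString("00");
        }
        while (!_used.Add(name));   // skip numbers taken by custom names
        _lastNumber[baseName] = n;
        return name;
    }

    public bool TryRename(string oldName, string newName)
    {
        if (oldName == newName) return _used.Contains(oldName);
        // Reject unknown old names and names that are already taken.
        if (!_used.Contains(oldName) || !_used.Add(newName)) return false;
        _used.Remove(oldName);
        return true;
    }
}
```

Creation stays near constant time as long as users rarely rename items onto upcoming generated names; each such collision costs one extra loop iteration.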
