Is this dictionary operation thread-safe? - c#

I understand that regular dictionary read/write operations aren't threadsafe. What about this case?
class Data{ public int data{get; set;}}
and
var dict = new Dictionary<int, Data>();
Can this be thread safe?
await Task.WhenAll(dict
    .Select(async t =>
    {
        t.Value.data = t.Value.data + 1;
        await somefunc(t.Value.data);
    }));
Is there a better way to write this operation without using a ConcurrentDictionary? There is only one place where I need this behaviour.
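A minimal sketch of the pattern above, assuming nothing adds or removes entries while the tasks run and each Data object is touched by only one lambda. Under those assumptions the dictionary is only enumerated, never modified, and the increment runs synchronously, one entry at a time, while Task.WhenAll enumerates the query. SomeFunc is a hypothetical stand-in for the question's somefunc.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var dict = new Dictionary<int, Data>
{
    [1] = new Data { data = 10 },
    [2] = new Data { data = 20 },
};

// Hypothetical stand-in for somefunc: any asynchronous work on the value.
static Task SomeFunc(int value) => Task.Delay(10);

// The dictionary itself is only read here. Each lambda modifies the Data
// object of its own entry and nothing else.
await Task.WhenAll(dict.Select(async pair =>
{
    // Runs synchronously, before the first await, while WhenAll enumerates.
    pair.Value.data = pair.Value.data + 1;
    await SomeFunc(pair.Value.data);
}));

Console.WriteLine(dict[1].data); // prints 11
Console.WriteLine(dict[2].data); // prints 21

class Data { public int data { get; set; } }
```

If another thread can add or remove entries during the enumeration, this sketch no longer holds and the dictionary needs a lock or a concurrent collection.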

Related

WaitAll for tasks from dictionary and get result, including Dictionary key

I have to modify the following code. I am given a list of tasks (the httpClient.Run method returns a Task). I have to run them and wait until all of them are done. Afterwards I need to collect all the results and build a response.
var tasks = new Dictionary<string, Task<string>>();
tasks.Add("CODE1", service.RunTask("CODE1"));
tasks.Add("CODE2", service.RunTask("CODE2"));
tasks.Add("CODE3", service.RunTask("CODE3"));
//...
var result = await Task.WhenAll(tasks.Values); // how to get CODE (dictionary KEY) here
// build response
The problem is that once we have the results, we no longer know which task produced which one. result is a string array, but I need, for instance, a KeyValuePair array. We need to know which task (CODE) was run so that we can build the response properly.
You can use async in a Select lambda to transform the KeyValuePair<T1, Task<T2>>s into Task<KeyValuePair<T1, T2>>s.
var resultTasks = tasks.Select(async pair => KeyValuePair.Create(pair.Key, await pair.Value));
IReadOnlyCollection<KeyValuePair<string, string>> results = await Task.WhenAll(resultTasks);
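As a self-contained sketch of the Select approach above: RunTask here is a hypothetical stand-in for service.RunTask, and the final ToDictionary step is an illustrative addition showing one way to look results up by key afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for service.RunTask from the question.
static async Task<string> RunTask(string code)
{
    await Task.Delay(10);
    return $"result of {code}";
}

var tasks = new Dictionary<string, Task<string>>
{
    ["CODE1"] = RunTask("CODE1"),
    ["CODE2"] = RunTask("CODE2"),
};

// Each wrapper task awaits the original task and pairs its result with the key.
KeyValuePair<string, string>[] results = await Task.WhenAll(
    tasks.Select(async pair => KeyValuePair.Create(pair.Key, await pair.Value)));

// Rebuild a lookup from key to result.
var byCode = results.ToDictionary(r => r.Key, r => r.Value);
Console.WriteLine(byCode["CODE2"]); // prints "result of CODE2"
```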
Something like this ought to do it. This is written assuming you do the Task.WhenAll() call first. Otherwise, you'll block the thread when you start to enumerate over resultsWithKeys.
await Task.WhenAll(tasks.Values);
var resultsWithKeys = tasks.Select(
    x => new
    {
        Key = x.Key,
        Result = x.Value.Result // safe: the task has already completed
    });
foreach (var result in resultsWithKeys)
    Console.WriteLine($"{result.Key} - {result.Result}");
The Dictionary<string, Task<string>> contains tasks that do not propagate the key as part of their Result. If you prefer tasks that include the key in their result, you'll have to create a new dictionary and fill it with new tasks that wrap the existing ones. The TResult of these new tasks can be a struct of your choice, for example a KeyValuePair<string, string>. Below is an extension method, WithKeys, that makes it easy to create the new dictionary with the new tasks:
public static Dictionary<TKey, Task<KeyValuePair<TKey, TValue>>> WithKeys<TKey, TValue>(
this Dictionary<TKey, Task<TValue>> source)
{
return source.ToDictionary(e => e.Key,
async e => KeyValuePair.Create(e.Key, await e.Value.ConfigureAwait(false)),
source.Comparer);
}
Usage example:
KeyValuePair<string, string>[] result = await Task.WhenAll(tasks.WithKeys().Values);

ConcurrentBag of strings and using .Contains in Parallel.ForEach

I'm using a ConcurrentBag to hold a list of strings. Occasionally it contains a duplicate.
However, I'm checking its contents before adding each new entry, so it should never contain a duplicate.
ConcurrentDictionary<string, string> SystemFiles = PopulateSystemFiles();
ConcurrentBag<string> SystemNames = new ConcurrentBag<string>();
Parallel.ForEach(SystemFiles, file =>
{
string name = GetSystemName(file.Value);
if (!SystemNames.Contains(name))
{
SystemNames.Add(name);
}
});
My assumption is that the .Contains method is not thread-safe. Am I correct?
ConcurrentBag is threadsafe, but your code isn't:
if (!SystemNames.Contains(name))
{
SystemNames.Add(name);
}
Contains will execute in a thread-safe way, then Add will also execute in a thread-safe way, but you have no guarantee that another thread hasn't added the same item in between the two calls.
For your needs, I recommend using a ConcurrentDictionary instead. Just ignore the value as you won't need it.
var SystemNames = new ConcurrentDictionary<string, bool>();
Then use the TryAdd method to do the "if not contains then add" in a single atomic operation:
SystemNames.TryAdd(name, true);
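A minimal sketch of the TryAdd approach under parallel load. The input array is hypothetical: it stands in for the names returned by GetSystemName and is built so that 1000 entries map to only 10 distinct names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical input: 1000 entries that yield only 10 distinct names.
var names = Enumerable.Range(0, 1000).Select(i => $"system-{i % 10}").ToArray();

var systemNames = new ConcurrentDictionary<string, bool>();

Parallel.ForEach(names, name =>
{
    // TryAdd checks and inserts in one atomic step.
    // It returns false, and changes nothing, if the key already exists.
    systemNames.TryAdd(name, true);
});

Console.WriteLine(systemNames.Count); // prints 10
```

The distinct names are then available through systemNames.Keys.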

Is HashSet<T> thread safe as a value of ConcurrentDictionary<TKey, HashSet<T>>?

If I have the following code:
var dictionary = new ConcurrentDictionary<int, HashSet<string>>();
foreach (var user in users)
{
if (!dictionary.ContainsKey(user.GroupId))
{
dictionary.TryAdd(user.GroupId, new HashSet<string>());
}
dictionary[user.GroupId].Add(user.Id.ToString());
}
Is the act of adding an item into the HashSet inherently thread safe because HashSet is a value property of the concurrent dictionary?
No. Putting a container in a thread-safe container does not make the inner container thread safe.
dictionary[user.GroupId].Add(user.Id.ToString());
is calling HashSet's add after retrieving it from the ConcurrentDictionary. If this GroupId is looked up from two threads at once this would break your code with strange failure modes. I saw the result of one of my teammates making the mistake of not locking his sets, and it wasn't pretty.
This is a plausible solution. I'd do something different myself but this is closer to your code.
if (!dictionary.ContainsKey(user.GroupId))
{
dictionary.TryAdd(user.GroupId, new HashSet<string>());
}
var groups = dictionary[user.GroupId];
lock(groups)
{
groups.Add(user.Id.ToString());
}
No, the collection (the dictionary itself) is thread-safe, not whatever you put in it. You have a couple of options:
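Use AddOrUpdate as @TheGeneral mentioned. The update delegate is not run under the dictionary's lock, so the set still has to be locked while it is modified:
dictionary.AddOrUpdate(user.GroupId,
    _ => new HashSet<string> { user.Id.ToString() },
    (_, set) => { lock (set) { set.Add(user.Id.ToString()); } return set; });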
Use a concurrent collection, like the ConcurrentBag<T>:
ConcurrentDictionary<int, ConcurrentBag<string>>
When you are building the dictionary, as in your code, you are better off accessing it as little as possible. Think of something like this:
var dictionary = new ConcurrentDictionary<int, ConcurrentBag<string>>();
var groupedUsers = users.GroupBy(u => u.GroupId);
foreach (var group in groupedUsers)
{
// get the bag from the dictionary or create it if it doesn't exist
var currentBag = dictionary.GetOrAdd(group.Key, new ConcurrentBag<string>());
// load it with the users required
foreach (var user in group)
{
if (!currentBag.Contains(user.Id.ToString()))
{
currentBag.Add(user.Id.ToString());
}
}
}
If you actually want a built-in concurrent HashSet-like collection, you'd need to use ConcurrentDictionary<int, ConcurrentDictionary<string, string>> and use only the keys (or only the values) of the inner dictionary.
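A minimal sketch of the nested-dictionary idea, where the inner ConcurrentDictionary serves as a set. The user data is hypothetical, and byte is used as a throwaway value type here instead of string, which is an arbitrary choice for this sketch.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical users: 100 distinct (Id, GroupId) pairs, each listed twice.
var users = Enumerable.Range(0, 200)
    .Select(i => (Id: i % 100, GroupId: i % 4))
    .ToArray();

var groups = new ConcurrentDictionary<int, ConcurrentDictionary<string, byte>>();

Parallel.ForEach(users, user =>
{
    // GetOrAdd returns the single set stored for this group,
    // even if several threads race to create it.
    var set = groups.GetOrAdd(user.GroupId, _ => new ConcurrentDictionary<string, byte>());

    // The inner dictionary acts as a set: the key is the member, the value is ignored.
    set.TryAdd(user.Id.ToString(), 0);
});

Console.WriteLine(groups.Count);    // prints 4
Console.WriteLine(groups[0].Count); // prints 25
```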

Thread-safe changes to a ConcurrentDictionary

I am populating a ConcurrentDictionary in a Parallel.ForEach loop:
var result = new ConcurrentDictionary<int, ItemCollection>();
Parallel.ForEach(allRoutes, route =>
{
// Some heavy operations
lock(result)
{
if (!result.ContainsKey(someKey))
{
result[someKey] = new ItemCollection();
}
result[someKey].Add(newItem);
}
});
How do I perform the last steps in a thread-safe manner without using the lock statement?
EDIT: Assume that ItemCollection is thread-safe.
I think you want GetOrAdd, which is explicitly designed to either fetch an existing item, or add a new one if there's no entry for the given key.
var collection = result.GetOrAdd(someKey, _ => new ItemCollection());
collection.Add(newItem);
As noted in the question comments, this assumes that ItemCollection is thread-safe.
You need to use the GetOrAdd method.
var result = new ConcurrentDictionary<int, ItemCollection>();
int someKey = ...;
var newItem = ...;
ItemCollection collection = result.GetOrAdd(someKey, _ => new ItemCollection());
collection.Add(newItem);
Assuming ItemCollection.Add is not thread-safe, you will need a lock, but you can reduce the size of the critical region.
var collection = result.GetOrAdd(someKey, k => new ItemCollection());
lock(collection)
collection.Add(...);
Update: Since it seems to be thread-safe, you don't need the lock at all
var collection = result.GetOrAdd(someKey, k => new ItemCollection());
collection.Add(...);
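A minimal sketch of the GetOrAdd-plus-lock variant. List<int> stands in for an ItemCollection that is not thread-safe, and the key calculation is made up for illustration; both are assumptions of this sketch.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// List<int> stands in for an ItemCollection that is NOT thread-safe (assumption).
var result = new ConcurrentDictionary<int, List<int>>();

Parallel.ForEach(Enumerable.Range(0, 1000), item =>
{
    int someKey = item % 5; // hypothetical key for this item

    // The factory may run more than once under contention,
    // but only one list is ever stored and returned for a key.
    var collection = result.GetOrAdd(someKey, _ => new List<int>());

    // Lock only the collection being modified, not the whole dictionary.
    lock (collection)
    {
        collection.Add(item);
    }
});

Console.WriteLine(result.Count);                          // prints 5
Console.WriteLine(result.Values.Sum(list => list.Count)); // prints 1000
```

Locking the individual collection lets threads working on different keys proceed in parallel, which locking the whole dictionary does not.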

Passing a dictionary as a parameter to a thread function

How is it possible to pass a dictionary as a parameter to a thread function and then iterate through it?
Dictionary<string, Track> dic = allTracks;
updateThread = new Thread(() => toDB(dic));
updateThread.Start();
and the function:
public static void toDB( Dictionary<string, Track> dict)
{
foreach (KeyValuePair<string, Track> pair in dict)
{
//do something - but I do not alter anything in dictionary
}
}
I have tried like this but I get an error
Collection was modified; enumeration operation may not execute.
You will get this exception if the dictionary is modified while it is being enumerated, whether by the main thread or by the thread you passed it to. You can use a ConcurrentDictionary or implement the locking yourself.
However, if you do not intend to modify the original collection inside the function you are calling on the thread, and you don't need the latest values either, then you can simply create a copy before passing it to your separate thread.
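A minimal sketch of the copy approach, assuming the copy is taken on the owning thread while no other thread is modifying the dictionary. The string values are a hypothetical stand-in for Track. The copy is shallow, so the value objects are still shared with the original.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical data: string values stand in for Track objects.
var allTracks = new Dictionary<string, string>
{
    ["a"] = "Track A",
    ["b"] = "Track B",
};

// Take the snapshot before the worker thread starts.
var snapshot = new Dictionary<string, string>(allTracks);

int seen = 0;
var updateThread = new Thread(() =>
{
    foreach (KeyValuePair<string, string> pair in snapshot)
    {
        seen++; // read-only work on the snapshot would go here
    }
});
updateThread.Start();

// Changing the original no longer affects the enumeration in the worker.
allTracks["c"] = "Track C";

updateThread.Join();
Console.WriteLine(seen);            // prints 2
Console.WriteLine(allTracks.Count); // prints 3
```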
