I need a container where insert is fast and thread-safe, because I plan to use it inside a Parallel.ForEach loop.
Once in a while, I will scan said container and remove every item it contains.
What's the best choice given those constraints?
Thanks
You could use a ConcurrentBag<T>. In general, the System.Collections.Concurrent namespace is worth checking. If you have unique keys, a ConcurrentDictionary<TKey, TValue> would be a great choice, as it gives you very fast access to elements by key.
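A minimal sketch of that pattern, assuming the items are plain ints and that the periodic scan simply drains the bag with TryTake. The BagDemo class and FillAndDrain method are made-up names for illustration, not anything from the question:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class BagDemo
{
    // Fills a bag from a parallel loop, then drains it.
    // Returns how many items were removed during the drain.
    public static int FillAndDrain(int count)
    {
        var bag = new ConcurrentBag<int>();

        // Add can be called from many threads at once without extra locking.
        Parallel.ForEach(Enumerable.Range(0, count), i => bag.Add(i * i));

        // The "once in a while" scan: TryTake removes one item per call
        // and returns false once the bag is empty.
        var drained = new List<int>();
        while (bag.TryTake(out int item))
        {
            drained.Add(item);
        }
        return drained.Count;
    }

    static void Main()
    {
        Console.WriteLine(FillAndDrain(1000)); // 1000
    }
}
```

Note that a ConcurrentBag makes no promise about the order in which items come back out. If order matters, a ConcurrentQueue<T> with TryDequeue can be drained the same way.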
There are a bunch of concurrent collections in .NET 4.0: dictionary, queue, etc. See http://msdn.microsoft.com/en-us/library/dd997305.aspx
Try ConcurrentBag: it is thread-safe and very fast, since most operations are implemented lock-free. There are also ConcurrentDictionary etc., so I am not sure exactly which features you need.
Linked List. That has fast insertion, though I'm not sure if a threadsafe version exists in .NET
Related
I was reading about C#'s ImmutableSortedDictionary in System.Collections.Immutable and thinking about how to apply it in my program. I quite like C++'s lower_bound and upper_bound (see here), and I was rather expecting to see something of the sort for range lookups. However, similar methods seem to be strangely absent from the documentation. Am I missing something? Or does MS truly provide a sorted dictionary without efficient access to the sorted ranges? That doesn't exactly seem like something one could do on an IEnumerable of the keys as say an extension method, so I'm a bit puzzled I'm not seeing something provided directly by the collection.
It is irritating that the available built-in collections are not offering a full set of features (like the SortedDictionary lacking a BinarySearch method), forcing us to search for third-party solutions (like the C5 library).
In your case, instead of an ImmutableSortedDictionary you could probably use an ImmutableSortedSet, embedding the values in the keys and using an appropriate comparer. At least the API of this class contains the properties Min and Max.
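A rough sketch of that idea, assuming int keys and string values stored as tuples and compared by key only. KeyOnlyComparer and LowerBound are hypothetical helpers written for this example. The LowerBound part relies on ImmutableSortedSet<T>.IndexOf returning the bitwise complement of the next larger element's index when the item is missing, which is how the documentation describes it as far as I recall, so check that before relying on it:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

static class SortedSetDemo
{
    // Orders (Key, Value) pairs by key only, so the set acts like a sorted dictionary.
    sealed class KeyOnlyComparer : IComparer<(int Key, string Value)>
    {
        public int Compare((int Key, string Value) x, (int Key, string Value) y)
            => x.Key.CompareTo(y.Key);
    }

    public static ImmutableSortedSet<(int Key, string Value)> Build()
        => ImmutableSortedSet.Create<(int Key, string Value)>(
               new KeyOnlyComparer(), (10, "ten"), (20, "twenty"), (30, "thirty"));

    // Hypothetical helper: index of the first entry whose key is >= key,
    // similar in spirit to C++ lower_bound. Returns set.Count if there is none.
    public static int LowerBound(ImmutableSortedSet<(int Key, string Value)> set, int key)
    {
        int index = set.IndexOf((key, null)); // the value part is ignored by the comparer
        return index >= 0 ? index : ~index;
    }

    static void Main()
    {
        var set = Build();
        Console.WriteLine(set.Min.Value);              // ten
        Console.WriteLine(set.Max.Value);              // thirty
        Console.WriteLine(set[LowerBound(set, 15)]);   // the entry with key 20
    }
}
```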
I understand that Java HashMap elements are stored in "buckets" based on the hashes of the elements' keys. Does that same hashing occur in C# dictionaries? If not, then how does lookup work?
There are several kinds of dictionaries in the C# System.Collections namespaces. They use different strategies to store their internal data:
This one, System.Collections.Specialized.HybridDictionary,
uses a ListDictionary (a singly linked list) while the collection is small, then switches to a Hashtable once it grows. The "normal" System.Collections.Generic.Dictionary<TKey, TValue> uses a hash table internally all the time, so keys are placed in buckets by hash code, much like in a Java HashMap. There is also a dictionary for concurrent use (ConcurrentDictionary); look it up yourself if you like.
So it depends on what kind of dictionary you are using, and (in the above case) the strategy might change due to internal considerations of the class, for performance or other reasons.
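One way to see the hashing happen with the generic Dictionary is to pass it a comparer that counts calls. This is only an illustrative sketch: CountingComparer is a made-up class, and its hash (the string length) is deliberately weak so that "cat" and "dog" collide and Equals has to break the tie:

```csharp
using System;
using System.Collections.Generic;

// Made-up comparer that records how often the dictionary asks for a hash or an equality check.
sealed class CountingComparer : IEqualityComparer<string>
{
    public int HashCalls { get; private set; }
    public int EqualsCalls { get; private set; }

    // Deliberately weak hash: all strings of the same length collide.
    public int GetHashCode(string s) { HashCalls++; return s.Length; }

    public bool Equals(string a, string b) { EqualsCalls++; return string.Equals(a, b); }
}

static class HashingDemo
{
    public static CountingComparer Run(out bool found)
    {
        var comparer = new CountingComparer();
        var dict = new Dictionary<string, int>(comparer);

        dict["cat"] = 1;
        dict["dog"] = 2;                  // same length as "cat", so same hash code
        found = dict.ContainsKey("dog");  // hash first, then Equals to confirm the match
        return comparer;
    }

    static void Main()
    {
        var stats = Run(out bool found);
        Console.WriteLine($"found={found}, hash calls={stats.HashCalls}, equals calls={stats.EqualsCalls}");
    }
}
```

The hash code narrows the search to a handful of candidates, and Equals is only called on entries whose hash codes match.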
I still use Wintellect's PowerCollections library, even though it is aging and not maintained because it did a good job covering holes left in the standard MS Collections libraries. But LINQ and C# 4.0 are poised to replace PowerCollections...
I was very happy to discover System.Linq.Lookup because it should replace Wintellect.PowerCollections.MultiDictionary in my toolkit. But Lookup seems to be immutable! Is that true? Can you only create a populated Lookup by calling ToLookup?
Yes, you can only create a Lookup by calling ToLookup. The immutable nature of it means that it's easy to share across threads etc, of course.
If you want a mutable version, you could always use the Edulinq implementation as a starting point. It's internally mutable, but externally immutable - and I wouldn't be surprised if the Microsoft implementation worked in a similar way.
Personally I'm rarely in a situation where I want to mutate the lookup - I would prefer to perform appropriate transformations on the input first. I would encourage you to think in this way too - I find myself wishing for better immutability support from other collections (e.g. Dictionary) more often than I wish that Lookup were mutable :)
That is correct. Lookup is immutable, you can create an instance by using the Linq ToLookup() extension method. Technically even that fact is an implementation detail since the method returns an ILookup interface which in the future might be implemented by some other concrete class.
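For reference, a small sketch of building and reading a lookup, assuming the key is just the word length. LookupDemo and ByLength are made-up names:

```csharp
using System;
using System.Linq;

static class LookupDemo
{
    // ToLookup is the only public way in; the result exposes no Add or Remove.
    public static ILookup<int, string> ByLength(string[] words)
        => words.ToLookup(w => w.Length);

    static void Main()
    {
        var lookup = ByLength(new[] { "ant", "bee", "crab", "dove" });

        Console.WriteLine(string.Join(",", lookup[3])); // ant,bee
        Console.WriteLine(lookup.Contains(4));          // True
        Console.WriteLine(lookup[99].Count());          // 0 (a missing key yields an empty sequence)
    }
}
```

To "change" a lookup, you transform the source sequence and call ToLookup again, which matches the advice above about doing the transformations on the input first.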
Is there a practical difference between .All() and .TrueForAll() when operating on a List? I know that .All() is part of IEnumerable, so why add .TrueForAll()?
From the docs for List<T>.TrueForAll:
Supported in: 4, 3.5, 3.0, 2.0
So it was added before Enumerable.All.
The same is true for a bunch of other List<T> methods which work in a similar way to their LINQ counterparts. Note that ConvertAll is somewhat different, in that it has the advantage of knowing that it's working on a List<T> and creating a List<TResult>, so it gets to preallocate whatever it needs.
TrueForAll existed in .NET 2.0, before LINQ was in .NET 3.5.
See: http://msdn.microsoft.com/en-us/library/kdxe4x4w(v=VS.80).aspx
TrueForAll appears to be specific to List, while All is part of LINQ.
My guess is that the former dates back to the .NET 2 days, while the latter is new in .NET 3.5.
Sorry for digging this out, but I came across this question and have seen that the actual question about differences is not properly answered.
The differences are:
The Enumerable.All() extension method does an additional check on the object it extends (if it is null, it throws an ArgumentNullException).
Enumerable.All() may not check elements in order. In theory, the specification would allow the items to be checked in a different order on every call. List.TrueForAll() specifies in the documentation that it will always be in the order of the list.
The second point is because Enumerable.All must use a foreach or the MoveNext() method to iterate over the items, while List.TrueForAll() internally uses a for loop with list indices.
However, you can be pretty sure that the foreach / MoveNext() approach will also return the elements in the order of the list entries, because a lot of programs expect that and would break if this were changed in the future.
From a performance point of view, List.TrueForAll() should be faster because it has one check fewer and a for loop over a list is cheaper than foreach. However, the compiler usually does a good job and optimizes a lot here, so there will probably be (almost) no measurable difference in the end.
Conclusion: List.TrueForAll() is the better choice in theory. But practically it makes no difference.
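A quick sketch to make the null-check difference concrete. AllDemo and its helpers are made-up names. Note that calling TrueForAll on a null list still fails, just with a NullReferenceException from the runtime rather than an ArgumentNullException from an explicit check:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AllDemo
{
    // Runs the action and reports the type name of whatever it throws, or "none".
    public static string ExceptionFor(Action action)
    {
        try { action(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // On a real list, both methods give the same answer for the same predicate.
    public static bool BothAgree(List<int> list, Predicate<int> p)
        => list.TrueForAll(p) == list.All(x => p(x));

    static void Main()
    {
        var numbers = new List<int> { 2, 4, 6 };
        Console.WriteLine(BothAgree(numbers, x => x % 2 == 0)); // True
        Console.WriteLine(BothAgree(numbers, x => x > 2));      // True

        List<int> missing = null;
        Console.WriteLine(ExceptionFor(() => missing.All(x => x > 0)));        // ArgumentNullException
        Console.WriteLine(ExceptionFor(() => missing.TrueForAll(x => x > 0))); // NullReferenceException
    }
}
```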
Basically, because this method existed before Linq did. TrueForAll on a List originated in Framework 2.0.
TrueForAll is not an extension method and has been in the framework since version 2.0.
I have a list of strings, I need to be able to simply probe if a new string is in the table or not. When the list is large, testing a simple list directly is pretty inefficient... so typically I use a Dictionary to get constant lookup speeds, although I don't actually care about the value. This seems like a misuse of a dictionary, so I'm wondering what other approaches I could take.
Is there a better way to do hit testing that I am unaware of?
You should use a HashSet<string>, which is specifically designed for this purpose.
A HashSet is better suited than a Dictionary for this purpose.
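A minimal sketch of the membership test, assuming ordinal (case-sensitive) string comparison is what you want. HashSetDemo is a made-up name:

```csharp
using System;
using System.Collections.Generic;

static class HashSetDemo
{
    public static HashSet<string> Build()
        => new HashSet<string>(StringComparer.Ordinal) { "alpha", "beta" };

    static void Main()
    {
        var seen = Build();

        Console.WriteLine(seen.Contains("beta"));  // True
        Console.WriteLine(seen.Contains("Beta"));  // False (ordinal comparison is case-sensitive)

        // Add doubles as a probe: it returns false if the string was already present.
        Console.WriteLine(seen.Add("alpha"));      // False
        Console.WriteLine(seen.Add("gamma"));      // True
    }
}
```

If you need case-insensitive matching, pass StringComparer.OrdinalIgnoreCase to the constructor instead.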