How to convert linq results to HashSet or HashedSet - c#

I have a property on a class that is an ISet. I'm trying to get the results of a linq query into that property, but can't figure out how to do so.
Basically, looking for the last part of this:
ISet<T> foo = new HashedSet<T>();
foo = (from x in bar.Items select x).SOMETHING;
Could also do this:
HashSet<T> foo = new HashSet<T>();
foo = (from x in bar.Items select x).SOMETHING;

I don't think there's anything built in which does this... but it's really easy to write an extension method:
public static class Extensions
{
    public static HashSet<T> ToHashSet<T>(
        this IEnumerable<T> source,
        IEqualityComparer<T> comparer = null)
    {
        return new HashSet<T>(source, comparer);
    }
}
Note that you really do want an extension method (or at least a generic method of some form) here, because you may not be able to express the type of T explicitly:
var query = from i in Enumerable.Range(0, 10)
            select new { i, j = i + 1 };
var resultSet = query.ToHashSet();
You can't do that with an explicit call to the HashSet<T> constructor. We're relying on type inference for generic methods to do it for us.
Now you could choose to name it ToSet and return ISet<T> - but I'd stick with ToHashSet and the concrete type. This is consistent with the standard LINQ operators (ToDictionary, ToList) and allows for future expansion (e.g. ToSortedSet). You may also want to provide an overload specifying the comparison to use.
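The "future expansion" mentioned above could follow the same shape. Here is a minimal sketch, assuming a hypothetical ToSortedSet extension; the names SetExtensions, ToSortedSet and SortedSetDemo are illustrative and not part of the framework:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SetExtensions
{
    // Hypothetical companion to ToHashSet; not a framework method.
    public static SortedSet<T> ToSortedSet<T>(
        this IEnumerable<T> source,
        IComparer<T> comparer = null)
    {
        // SortedSet<T> falls back to Comparer<T>.Default when comparer is null.
        return new SortedSet<T>(source, comparer);
    }
}

public static class SortedSetDemo
{
    public static string Run()
    {
        // Type inference picks T = int from the query, as with ToHashSet.
        var query = from i in new[] { 3, 1, 2, 3 } select i * 10;
        SortedSet<int> set = query.ToSortedSet();
        return string.Join(",", set); // duplicates dropped, ascending order
    }
}
```

Returning the concrete SortedSet<T> keeps the method in line with the ToList and ToDictionary naming convention described above.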

Just pass your IEnumerable into the constructor for HashSet.
HashSet<T> foo = new HashSet<T>(from x in bar.Items select x);

This functionality has been added as an extension method on IEnumerable<TSource> to .NET Framework 4.7.2 and .NET Core 2.0. It is consequently also available on .NET 5 and later.
ToHashSet<TSource>(IEnumerable<TSource>)
ToHashSet<TSource>(IEnumerable<TSource>, IEqualityComparer<TSource>)
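As a rough illustration of the second overload, the sketch below assumes a target framework that ships Enumerable.ToHashSet; the class and method names (BuiltInToHashSetDemo, DistinctIgnoringCase) are made up for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInToHashSetDemo
{
    public static HashSet<string> DistinctIgnoringCase(IEnumerable<string> names)
    {
        // Uses the framework overload that takes an IEqualityComparer<TSource>.
        return names.ToHashSet(StringComparer.OrdinalIgnoreCase);
    }
}
```

With this comparer, "alice" and "ALICE" count as the same element, so a sequence containing both plus "bob" should produce a set of two items.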

As @Joel stated, you can just pass your enumerable in. If you want to write an extension method, you can do:
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> items)
{
    return new HashSet<T>(items);
}

There is an extension method built into .NET Framework and .NET Core for converting an IEnumerable to a HashSet: https://learn.microsoft.com/en-us/dotnet/api/?term=ToHashSet
public static System.Collections.Generic.HashSet<TSource> ToHashSet<TSource> (this System.Collections.Generic.IEnumerable<TSource> source);
It appears that I cannot use it in .NET Standard libraries yet (at the time of writing), so I use this extension method:
[Obsolete("This method is built into .NET Framework and .NET Core, " +
    "but is not yet available in .NET Standard. Remove this method once it is added.")]
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source, IEqualityComparer<T> comparer = null) => new HashSet<T>(source, comparer);

That's pretty simple :)
var foo = new HashSet<T>(from x in bar.Items select x);
and yes T is the type specified by OP :)

If you only need read-only access to the set and the source is a parameter to your method, then I would go with:
public static ISet<T> EnsureSet<T>(this IEnumerable<T> source)
{
    ISet<T> result = source as ISet<T>;
    if (result != null)
        return result;
    return new HashSet<T>(source);
}
The reason is that callers may already pass an ISet to your method, in which case you do not need to create a copy.
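One consequence worth seeing in code: when the argument is already a set, the method hands back the caller's own instance, which is why this only suits read-only use. A minimal sketch, restating EnsureSet in compact form; the EnsureSetExtensions and EnsureSetDemo names are illustrative:

```csharp
using System.Collections.Generic;

public static class EnsureSetExtensions
{
    public static ISet<T> EnsureSet<T>(this IEnumerable<T> source)
    {
        // Reuse the caller's set when there is one; otherwise copy into a new HashSet<T>.
        return source as ISet<T> ?? new HashSet<T>(source);
    }
}

public static class EnsureSetDemo
{
    public static bool ReturnsSameInstanceForSet()
    {
        var original = new HashSet<int> { 1, 2, 3 };
        // No copy is made, so mutating the result would mutate 'original' too.
        return object.ReferenceEquals(original, original.EnsureSet());
    }

    public static bool ReturnsCopyForList()
    {
        var list = new List<int> { 1, 2, 2 };
        ISet<int> set = list.EnsureSet();
        // A list is not an ISet<int>, so a new set with the duplicate removed is built.
        return !object.ReferenceEquals(list, set) && set.Count == 2;
    }
}
```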

Jon's answer is perfect. The only caveat is that, using NHibernate's HashedSet, I need to convert the results to a collection. Is there an optimal way to do this?
ISet<string> bla = new HashedSet<string>((from b in strings select b).ToArray());
or
ISet<string> bla = new HashedSet<string>((from b in strings select b).ToList());
Or am I missing something else?
Edit: This is what I ended up doing:
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source)
{
    return new HashSet<T>(source);
}
public static HashedSet<T> ToHashedSet<T>(this IEnumerable<T> source)
{
    return new HashedSet<T>(source.ToHashSet());
}

Rather than simply converting an IEnumerable to a HashSet, it is often convenient to build a HashSet from one property of each object in a sequence. You could write this as:
var set = myObject.Select(o => o.Name).ToHashSet();
but my preference would be to use selectors:
var set = myObject.ToHashSet(o => o.Name);
They do the same thing, and the second is obviously shorter, but I find the idiom fits my brain better (I think of it as being like ToDictionary).
Here's the extension method to use, with support for custom comparers as a bonus.
public static HashSet<TKey> ToHashSet<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> selector,
    IEqualityComparer<TKey> comparer = null)
{
    return new HashSet<TKey>(source.Select(selector), comparer);
}
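To show the comparer parameter in use, here is a small sketch built around the method above. The Person class, the SelectorSetExtensions and SelectorDemo names, and the sample data are all hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectorSetExtensions
{
    public static HashSet<TKey> ToHashSet<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> selector,
        IEqualityComparer<TKey> comparer = null)
    {
        return new HashSet<TKey>(source.Select(selector), comparer);
    }
}

// Hypothetical element type used only for this example.
public sealed class Person
{
    public string Name { get; set; } = "";
}

public static class SelectorDemo
{
    public static int CountDistinctNames()
    {
        var people = new[]
        {
            new Person { Name = "Ann" },
            new Person { Name = "ann" },
            new Person { Name = "Bob" }
        };
        // The comparer applies to the selected keys, not to the Person objects.
        HashSet<string> names = people.ToHashSet(p => p.Name, StringComparer.OrdinalIgnoreCase);
        return names.Count;
    }
}
```

Because the comparer ignores case, "Ann" and "ann" collapse into one entry and the set ends up with two names.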

Related

Extension method for nested generic type

I'm writing an extension method, that should work on a generic of generics, say IEnumerable<IEnumerable<T>> - for example
public static IEnumerable<T> SelectAll<T>(this IEnumerable<IEnumerable<T>> source)
{
    return source.SelectMany(x => x);
}
Now, how do I make it accept a parameter that's actually a List<List<T>>? I've only been able to make it swallow a List<IEnumerable<T>>. Is there a way to do this without manually casting?
It's working nicely, what's the problem you're having?
var w = new List<List<int>>(){
    new List<int>{2,3,4},
    new List<int>{5,3,2}
};
w.SelectAll().Dump();
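The reason this works is generic covariance: since C# 4, IEnumerable<out T> is covariant, so an IEnumerable<List<int>> converts implicitly to IEnumerable<IEnumerable<int>>. A minimal sketch without the LINQPad-specific Dump() call; the NestedExtensions and CovarianceDemo names are illustrative:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NestedExtensions
{
    public static IEnumerable<T> SelectAll<T>(this IEnumerable<IEnumerable<T>> source)
    {
        return source.SelectMany(x => x);
    }
}

public static class CovarianceDemo
{
    public static List<int> Flatten()
    {
        var nested = new List<List<int>>
        {
            new List<int> { 2, 3, 4 },
            new List<int> { 5, 3, 2 }
        };
        // List<List<int>> is an IEnumerable<List<int>>, which converts to
        // IEnumerable<IEnumerable<int>> because IEnumerable<out T> is covariant.
        return nested.SelectAll().ToList();
    }
}
```

Covariance applies only when the type argument is a reference type. That holds here because the inner List<int> is a class, even though its elements are value types.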

What's the Best Way to Add One Item to an IEnumerable<T>?

Here's how I would add one item to an IEnumerable object:
//Some IEnumerable<T> object
IEnumerable<string> arr = new string[] { "ABC", "DEF", "GHI" };
//Add one item
arr = arr.Concat(new string[] { "JKL" });
This is awkward. I don't see a method called something like ConcatSingle() however.
Is there a cleaner way to add a single item to an IEnumerable object?
Nope, that's about as concise as you'll get using built-in language/framework features.
You could always create an extension method if you prefer:
arr = arr.Append("JKL");
// or
arr = arr.Append("123", "456");
// or
arr = arr.Append("MNO", "PQR", "STU", "VWY", "etc", "...");
// ...
public static class EnumerableExtensions
{
    public static IEnumerable<T> Append<T>(
        this IEnumerable<T> source, params T[] tail)
    {
        return source.Concat(tail);
    }
}
IEnumerable<T> is read-only: you cannot add or remove items through it. Instead, you have to create a new collection, for example by converting to a list and adding to that:
var newCollection = arr.ToList();
newCollection.Add("JKL"); //is your new collection with the item added
Write an extension method ConcatSingle :)
public static IEnumerable<T> ConcatSingle<T>(this IEnumerable<T> source, T item)
{
    return source.Concat(new [] { item } );
}
But you need to be more careful with your terminology.
You can't add an item to an IEnumerable<T>. Concat creates a new instance.
Example:
var items = Enumerable.Range(1, 10);
Console.WriteLine(items.Count()); // 10
var original = items;
items = items.ConcatSingle(11);
Console.WriteLine(original.Count()); // 10
Console.WriteLine(items.Count()); // 11
As you can see, the original enumeration, which we saved in original, didn't change.
Since IEnumerable is read-only, you need to convert to a list:
var new_one = arr.ToList();
new_one.Add("JKL");
Or you can write an extension method like:
public static IEnumerable<T> Append<T>(this IEnumerable<T> source, params T[] item)
{
    return source.Concat(item);
}
Append() is exactly what you need. It has been added to .NET Standard (in 2017), so you no longer need to write your own extension method. You can simply do this:
arr = arr.Append("JKL");
Since .NET is open source, you can look at the implementation here (it is more sophisticated than the custom methods suggested above):
https://github.com/dotnet/runtime/blob/master/src/libraries/System.Linq/src/System/Linq/AppendPrepend.cs
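A short sketch of the built-in operators, assuming a target framework where Enumerable.Append and its counterpart Enumerable.Prepend are available; AppendDemo is an illustrative name:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AppendDemo
{
    public static string Run()
    {
        IEnumerable<string> arr = new[] { "ABC", "DEF", "GHI" };

        // Both calls return new lazy sequences; the underlying array is untouched.
        IEnumerable<string> extended = arr.Append("JKL").Prepend("000");

        return string.Join(",", extended) + " | " + arr.Count();
    }
}
```

Run() should yield "000,ABC,DEF,GHI,JKL | 3", with the trailing 3 confirming that the original sequence still has three items.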
You're assigning an array to an IEnumerable. Why don't you use the Array type instead of IEnumerable?
Otherwise you can use IList (or List) if you want to change the collection.
I use IEnumerable only for method parameters when I just need to read, and IList (or List) when I need to change the items in it.

C# Extension Methods

I'm currently trying to write an extension method, but it doesn't seem to be operating as intended. Before we delve too much deeper, here's the code I have:
public static void Remove<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
    var items = source.Where(predicate);
    source = source.Where(t => !items.Contains(t));
}
The desire is that I can call this extension method on any IEnumerable and all items matching the predicate are then removed from the collection. I'm tired of iterating through collections to find the items that match and then removing them one at a time to avoid altering the collection while enumerating through it...
Anyway... When I step through the code, everything seems to work. Before exiting the method, the source has the correct number of items removed. However, when I return to the calling code, all of the items still exist in my original IEnumerable object. Any tips?
Thanks in advance,
Sonny
You can't do that the way you originally wrote it: you are taking a reference variable (source) and making it refer to a new instance. This modifies the local reference source, not the original argument passed in.
Keep in mind that for reference types in C#, the default parameter-passing scheme is pass by value (where the value being passed is a reference).
Let's say you pass a variable x to this method, and x refers to the original list, which lives at theoretical location 1000. Then source is a new reference to that same list at location 1000.
Now when you say:
source = source.Where(....);
You are assigning source to a new sequence (say at location 2000), but that only affects what source points to and not the x you passed in.
To fix this as an extension method, you would really want to return the new sequence instead:
public static IEnumerable<T> Remove<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
    if (source == null) throw new ArgumentNullException("source");
    if (predicate == null) throw new ArgumentNullException("predicate");
    // you can also collapse your logic to returning the opposite result of your predicate
    return source.Where(x => !predicate(x));
}
This is all assuming you want to keep it totally generic to IEnumerable<T> as you asked in your question. Obviously as also pointed out in other examples if you only care about List<T> there is a baked-in RemoveAll() method.
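The pass-by-value behaviour described in this answer can be observed directly. In the sketch below, RemoveInPlace mirrors the question's approach and Without mirrors the fix; both names, and ReassignDemo itself, are hypothetical and chosen only to keep the two variants apart:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReassignDemo
{
    // Mirrors the question's approach: only the local copy of the reference changes.
    public static void RemoveInPlace<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        source = source.Where(x => !predicate(x));
    }

    // Returns the filtered sequence so the caller can use or reassign it.
    public static IEnumerable<T> Without<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        return source.Where(x => !predicate(x));
    }

    public static (int Before, int After) Run()
    {
        IEnumerable<int> x = new List<int> { 1, 2, 3, 2 };

        x.RemoveInPlace(n => n == 2);
        int before = x.Count();        // the caller's x still sees all four items

        x = x.Without(n => n == 2);    // the caller reassigns explicitly
        return (before, x.Count());
    }
}
```

Run() should return (4, 2): the first call has no visible effect on the caller, while the second takes effect only because the caller assigns the result.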
This kind of extension should be implemented by returning a new sequence. That way you can integrate into a chain of sequence operations:
public static IEnumerable<T> Remove<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
    return source.Where(t => !predicate(t));
}
var query = mySequence.Select(x => x.Y).Remove(x => x == 2).Select(x => 2*x);
Now the method is nothing but a wrapper around Where(), which obviously isn't helpful. You might consider getting rid of it.
If you want to actually update the underlying collection (assuming that even exists) then you can't do it this way, since IEnumerable<T> doesn't provide any way to alter its contents. You would have to do something like:
var myNewList = new List<int>(oldList.Remove(x => x == 2));
Finally, if you are working with List<T>, you can use the RemoveAll() method to actually remove items from the list:
int numberOfItemsRemoved = myList.RemoveAll(x => x == 2);
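For contrast with the sequence-returning approach, here is a small sketch of List<T>.RemoveAll mutating the list in place; RemoveAllDemo and the sample values are illustrative:

```csharp
using System.Collections.Generic;

public static class RemoveAllDemo
{
    public static (int Removed, string Remaining) Run()
    {
        var myList = new List<int> { 1, 2, 3, 2, 4 };

        // RemoveAll edits the list itself and reports how many items it dropped.
        int removed = myList.RemoveAll(x => x == 2);

        return (removed, string.Join(",", myList));
    }
}
```

Here Run() should return (2, "1,3,4"), and no reassignment is needed because the same list instance was modified.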
Try this: there's a useful List<T>.RemoveAll(Predicate<T> match) method, which I think is designed for this: http://msdn.microsoft.com/en-us/library/wdka673a.aspx
So just use it on the list you have, passing the condition for the items you want removed:
source.RemoveAll(t => predicate(t));
Alternatively, have your extension method return the required enumerable and use that.
This is because IEnumerable is immutable.
You have to return another sequence from your Remove method for this to work:
public static IEnumerable<T> Remove<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
    var items = source.Where(predicate);
    return source.Where(t => !items.Contains(t));
}

How to get an empty list of a collection?

I have a collection of anonymous class and I want to return an empty list of it.
What is the best readable expression to use?
I thought of the following, but I don't think they are readable enough:
var result = MyCollection.Take(0).ToList();
var result = MyCollection.Where(p => false).ToList();
Note: I don't want to empty the collection itself.
Any suggestions?
What about:
Enumerable.Empty<T>();
This returns an empty enumerable of type T. If you really want a List, you are free to do this:
Enumerable.Empty<T>().ToList<T>();
Actually, if you use a generic extension you don't even have to use any Linq to achieve this, you already have the anonymous type exposed through T
public static IList<T> GetEmptyList<T>(this IEnumerable<T> source)
{
    return new List<T>();
}
var emp = MyCollection.GetEmptyList();
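To make the type-inference point concrete, the sketch below applies GetEmptyList to a list of anonymous objects. The anonymous type's shape (Id, Name) and the EmptyListExtensions and EmptyListDemo names are assumptions made for the example:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EmptyListExtensions
{
    public static IList<T> GetEmptyList<T>(this IEnumerable<T> source)
    {
        // 'source' is never enumerated; it only lets the compiler infer T.
        return new List<T>();
    }
}

public static class EmptyListDemo
{
    public static (int EmptyCount, int SourceCount, int AfterAdd) Run()
    {
        var myCollection = Enumerable.Range(0, 3)
            .Select(i => new { Id = i, Name = "item" + i })
            .ToList();

        var empty = myCollection.GetEmptyList();
        int emptyCount = empty.Count;

        // Same property names, types and order, so this is the same anonymous type.
        empty.Add(new { Id = 99, Name = "new" });

        return (emptyCount, myCollection.Count, empty.Count);
    }
}
```

Run() should return (0, 3, 1): the new list starts empty, the source collection is untouched, and the new list accepts elements of the same anonymous type.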
Given that your first suggestion works and should perform well - if readability is the only issue, why not create an extension method:
public static IList<T> CreateEmptyCopy<T>(this IEnumerable<T> source)
{
    return source.Take(0).ToList();
}
Now you can refactor your example to
var result = MyCollection.CreateEmptyCopy();
For performance reasons, you should stick with the first option you came up with.
The other one would iterate over the entire collection before returning an empty list.
Because the type is anonymous, there is no way to name it in source code in order to create a list of it directly. There is, however, a way to create such a list through reflection.

Immutable set in .NET

Does the .NET BCL have an immutable Set type? I'm programming in a functional dialect of C# and would like to do something like
new Set.UnionWith(A).UnionWith(B).UnionWith(C)
But the best I can find is HashSet.UnionWith, which would require the following sequence of calls:
HashSet composite = new HashSet();
composite.UnionWith(A);
composite.UnionWith(B);
composite.UnionWith(C);
This use is highly referentially opaque, making it hard to optimize and understand. Is there a better way to do this without writing a custom functional set type?
The new ImmutableCollections have:
ImmutableStack<T>
ImmutableQueue<T>
ImmutableList<T>
ImmutableHashSet<T>
ImmutableSortedSet<T>
ImmutableDictionary<K, V>
ImmutableSortedDictionary<K, V>
More info here
About the union this test passes:
[Test]
public void UnionTest()
{
    var a = ImmutableHashSet.Create("A");
    var b = ImmutableHashSet.Create("B");
    var c = ImmutableHashSet.Create("C");
    var d = a.Union(b).Union(c);
    Assert.IsTrue(ImmutableHashSet.Create("A", "B", "C").SetEquals(d));
}
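The chained style asked for in the question can be sketched by starting from the empty set. This assumes the System.Collections.Immutable package is referenced; ImmutableUnionDemo and the sample values are illustrative:

```csharp
using System.Collections.Immutable;

public static class ImmutableUnionDemo
{
    public static (int CompositeCount, int ACount) Run()
    {
        var a = ImmutableHashSet.Create("x", "y");
        var b = new[] { "y", "z" };               // Union accepts any IEnumerable<T>
        var c = ImmutableHashSet.Create("w");

        // Each Union call returns a new set and leaves its operands unchanged.
        var composite = ImmutableHashSet<string>.Empty.Union(a).Union(b).Union(c);

        return (composite.Count, a.Count);
    }
}
```

Run() should return (4, 2): the composite holds w, x, y and z, while a still contains only its two original elements.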
Update
This answer was written some time ago, and since then a set of immutable collections has been introduced in the System.Collections.Immutable namespace.
Original answer
You can roll out your own method for this:
public static class HashSetExtensions {
    public static HashSet<T> Union<T>(this HashSet<T> self, HashSet<T> other) {
        var set = new HashSet<T>(self); // don't change the original set
        set.UnionWith(other);
        return set;
    }
}
Use it like this:
var composite = A.Union(B).Union(C);
You can also use LINQ's Union, but to get a set, you'll need to pass the result to the HashSet constructor:
var composite = new HashSet<string>(A.Union(B).Union(C));
But, HashSet itself is mutable. You could try to use F#'s immutable set.
Also, as mentioned in the comments by ErikE, using Concat yields the same result and probably performs better:
var composite = new HashSet<string>(A.Concat(B).Concat(C));
There is a ReadOnlyCollection, but it's not a hash table. LINQ adds the Union method as an extension.
