I'm currently trying to write an extension method, but it doesn't seem to be operating as intended. Before we delve too much deeper, here's the code I have:
public static void Remove<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
var items = source.Where(predicate);
source = source.Where(t => !items.Contains(t));
}
The desire is that I can call this extension method on any IEnumerable and all items matching the predicate are then removed from the collection. I'm tired of iterating through collections to find the items that match and then removing them one at a time to avoid altering the collection while enumerating through it...
Anyway... When I step through the code, everything seems to work. Before exiting the method, source has the correct number of items removed. However, when I return to the calling code, all of the items still exist in my original IEnumerable object. Any tips?
Thanks in advance,
Sonny
You can't do that the way you have originally written it. You are taking a reference variable (source) and making it refer to a new instance. This modifies the local reference source and not the original argument passed in.
Keep in mind that for reference types in C#, the default parameter passing scheme is pass by value (where the value being passed is a reference).
Let's say you pass in a variable x to this method, and x refers to the original list, which lives at theoretical location 1000. This means that source is a new reference to the original list living at location 1000.
Now when you say:
source = source.Where(....);
You are assigning source to a new sequence (say at location 2000), but that only affects what source points to and not the x you passed in.
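To make the "location 1000 vs. 2000" picture concrete, here is a minimal sketch. The names ParameterDemo, Reassign and ReassignByRef are made up for illustration; the ref version is shown only to contrast the two passing modes and is written as a plain static method, not as a suggested fix.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ParameterDemo
{
    // 'source' is a copy of the caller's reference, so reassigning it
    // is invisible to the caller.
    public static void Reassign(IEnumerable<int> source)
    {
        source = source.Where(n => n != 2);
    }

    // With 'ref', the parameter is an alias for the caller's variable,
    // so the assignment is visible after the call.
    public static void ReassignByRef(ref IEnumerable<int> source)
    {
        source = source.Where(n => n != 2);
    }
}

static class Program
{
    static void Main()
    {
        IEnumerable<int> x = new List<int> { 1, 2, 3 };

        ParameterDemo.Reassign(x);
        Console.WriteLine(x.Count()); // 3: x still refers to the original list

        ParameterDemo.ReassignByRef(ref x);
        Console.WriteLine(x.Count()); // 2: x now refers to the filtered sequence
    }
}
```

Note that even in the ref case the original list is untouched; only the variable x now points at a filtered view of it.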
To fix this as an extension method, you would really want to return the new sequence instead:
public static IEnumerable<T> Remove<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
if (source == null) throw new ArgumentNullException("source");
if (predicate == null) throw new ArgumentNullException("predicate");
// you can also collapse your logic to returning the opposite result of your predicate
return source.Where(x => !predicate(x));
}
This all assumes you want to keep it totally generic to IEnumerable<T>, as you asked in your question. As also pointed out in other answers, if you only care about List<T>, there is a built-in RemoveAll() method.
This kind of extension should be implemented by returning a new sequence. That way you can integrate into a chain of sequence operations:
public static IEnumerable<T> Remove<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
return source.Where(t => !predicate(t));
}
var query = mySequence.Select(x => x.Y).Remove(x => x == 2).Select(x => 2*x);
Now the method is nothing but a wrapper around Where(), which obviously isn't helpful. You might consider getting rid of it.
If you want to actually update the underlying collection (assuming that even exists) then you can't do it this way, since IEnumerable<T> doesn't provide any way to alter its contents. You would have to do something like:
var myNewList = new List<int>(oldList.Remove(x => x == 2));
Finally, if you are working with List<T>, you can use the RemoveAll() method to actually remove items from the list:
int numberOfItemsRemoved = myList.RemoveAll(x => x == 2);
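If you want in-place removal for more than just List<T>, one option is to target IList<T> instead of IEnumerable<T>, since IList<T> does expose RemoveAt. The following is only a sketch under that assumption; the name RemoveWhere is hypothetical and not part of the framework for IList<T>.

```csharp
using System;
using System.Collections.Generic;

static class ListExtensions
{
    // Hypothetical helper: removes matching items in place and
    // returns how many were removed.
    public static int RemoveWhere<T>(this IList<T> list, Func<T, bool> predicate)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));

        int removed = 0;
        // Walk backwards so RemoveAt never shifts an index we have yet to visit.
        for (int i = list.Count - 1; i >= 0; i--)
        {
            if (predicate(list[i]))
            {
                list.RemoveAt(i);
                removed++;
            }
        }
        return removed;
    }
}

static class Program
{
    static void Main()
    {
        IList<int> numbers = new List<int> { 1, 2, 3, 2 };
        int removed = numbers.RemoveWhere(x => x == 2);
        Console.WriteLine(removed);                   // 2
        Console.WriteLine(string.Join(",", numbers)); // 1,3
    }
}
```

This will throw NotSupportedException for fixed-size or read-only IList<T> implementations such as arrays, so it is not a universal replacement for returning a new sequence.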
Try this: there's a useful List<T>.RemoveAll(Predicate<T> match) method, which I think is designed for this: http://msdn.microsoft.com/en-us/library/wdka673a.aspx
So if source is actually a List<T>, just call it on the list you have, passing the condition for the items you want removed:
source.RemoveAll(t => predicate(t));
Otherwise, have your extension method return the required enumerable and use that.
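This is because IEnumerable<T> offers no way to modify the underlying sequence, and reassigning the source parameter only changes the method's local copy of the reference.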
You have to return another sequence from your Remove method for this to work:
public static IEnumerable<T> Remove<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
var items = source.Where(predicate);
return source.Where(t => !items.Contains(t));
}
I have cast a value to IEnumerable as follows:
var info = property.Info;
object data = info.GetValue(obj);
...
var enumerable = (IEnumerable)data;
if (enumerable.Any()) // Does not compile
{
}
if (enumerable.GetEnumerator().Current != null) // Runtime error
{
}
and I would like to see whether this enumerable has any elements by using the LINQ Any() method. Unfortunately, even with a using directive for LINQ, I can't.
How would I do this without specifying the generic type?
While you can't do this directly, you could do it via Cast:
if (enumerable.Cast<object>().Any())
That should always work, as any IEnumerable can be wrapped as an IEnumerable<object>. It will end up boxing the first element if it's actually an IEnumerable<int> or similar, but it should work fine. Unlike most LINQ methods, Cast and OfType target IEnumerable rather than IEnumerable<T>.
You could write your own subset of extension methods like the LINQ ones but operating on the non-generic IEnumerable type if you wanted to, of course. Implementing LINQ to Objects isn't terribly hard - you could use my Edulinq project as a starting point, for example.
There are cases where you could implement Any(IEnumerable) slightly more efficiently than using Cast - for example, taking a shortcut if the target implements the non-generic ICollection interface. At that point, you wouldn't need to create an iterator or take the first element. In most cases that won't make much performance difference, but it's the kind of thing you could do if you were optimizing.
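As a rough sketch of the ICollection shortcut described above: the method below checks Count when the source is a non-generic ICollection and only falls back to an enumerator otherwise. The name HasAny is an assumption, chosen so it does not compete with Enumerable.Any on typed sequences.

```csharp
using System;
using System.Collections;

static class NonGenericEnumerableExtensions
{
    // Hypothetical helper for the non-generic IEnumerable.
    public static bool HasAny(this IEnumerable source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Shortcut: ICollection exposes Count, so no enumerator is needed.
        if (source is ICollection collection)
            return collection.Count > 0;

        IEnumerator enumerator = source.GetEnumerator();
        try
        {
            // MoveNext returns true only if there is a first element.
            return enumerator.MoveNext();
        }
        finally
        {
            // The non-generic IEnumerator is not IDisposable, so check at run time.
            (enumerator as IDisposable)?.Dispose();
        }
    }
}

static class Program
{
    static void Main()
    {
        IEnumerable empty = new ArrayList();
        IEnumerable ints = new[] { 1, 2, 3 };
        Console.WriteLine(empty.HasAny()); // False
        Console.WriteLine(ints.HasAny());  // True
    }
}
```

Both ArrayList and arrays implement ICollection, so the two calls in Main take the shortcut path.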
One method is to use foreach, as noted in IEnumerable "Remarks". It also provides details on the additional methods off of the result of GetEnumerator.
bool hasAny = false;
foreach (object i in (IEnumerable)(new int[1] /* IEnumerable of any type */)) {
hasAny = true;
break;
}
(Which is itself easily transferable to an Extension method.)
Your attempt to use GetEnumerator().Current tried to get the current value of an enumerator that had not yet been moved to the first position. It would also have given the wrong result if the first item existed but was null. What you could have done (and what the Any() in Enumerable does) is check whether it is possible to move to that first item, i.e. whether there is a first item to move to:
internal static class UntypedLinq
{
public static bool Any(this IEnumerable source)
{
if (source == null) throw new ArgumentNullException(nameof(source));
IEnumerator ator = source.GetEnumerator();
// Unfortunately unlike IEnumerator<T>, IEnumerator does not implement
// IDisposable. (A design flaw fixed when IEnumerator<T> was added).
// We need to test whether disposal is required or not.
if (ator is IDisposable disp)
{
using(disp)
{
return ator.MoveNext();
}
}
return ator.MoveNext();
}
// Not completely necessary. Causes any typed enumerables to be handled by the existing Any
// in Linq via a short method that will be inlined.
public static bool Any<T>(this IEnumerable<T> source) => Enumerable.Any(source);
}
I was "playing" around with LINQ and testing some stuff and something came to my attention.
Let's suppose I have this "lazy" implementation for the GroupBy extension method:
public static IEnumerable<IGrouping<TKey, TSource>> GroupByA<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector)
{
//To avoid duplicate groups
List<TKey> grouping = new List<TKey>();
foreach (var item in source)
{
if (!grouping.Contains(keySelector(item)))
{
grouping.Add(keySelector(item));
Group<TKey, TSource> g = new Group<TKey, TSource>(
keySelector(item),
source.Where(x => keySelector(x).Equals(keySelector(item)))
);
Console.WriteLine("Returning group");
yield return g; //yield returning a complete group
}
}
}
Note: Assume Group<TKey, TSource> implements IGrouping<TKey, TSource>.
I was wondering: what happens if I execute this?
var groups = students.GroupByA(x => x.Group).Take(2);
Note: students is List<Student>.
Will .Take(2) force the complete .GroupByA(x=>x.Group) execution, or will it somehow consume one group at a time until it has counted 2? Either way, why?
PS: I tried using my own implementation for:
public static IEnumerable<T> TakeA<T>(this IEnumerable<T> source, int count)
like this:
public static IEnumerable<T> TakeA<T>(this IEnumerable<T> source, int count)
{
int iter = 0;
foreach (var item in source)
{
if (iter == count)
yield break;
yield return item;
iter++;
}
}
But I am pretty sure that this way causes the GroupBy to execute completely before calling TakeA. I don't know whether that is down to my implementation or whether the original Take does something different.
The C# compiler translates your code into a state machine. That is, it creates a new class behind the scenes with state and behavior needed for iterating the student list. Each time you call the code you get an instance of this class.
Will .Take(2) force the complete .GroupByA(x=>x.Group) execution
Looking at the full students.GroupByA(x => x.Group).Take(2) expression, .NET is able to use the new class instance created by GroupByA() with the Take() function. You can think of it this way: execution only continues until the second time your code hits the yield line, and no further.
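A small experiment can make this visible. The sketch below uses a made-up iterator (LazyDemo.Numbers, not from the question) that records each item it produces, so you can see how far Take(2) actually drives it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LazyDemo
{
    // Hypothetical iterator: records each value in 'log' just before yielding it.
    public static IEnumerable<int> Numbers(List<int> log)
    {
        for (int i = 1; i <= 5; i++)
        {
            log.Add(i);
            yield return i;
        }
    }
}

static class Program
{
    static void Main()
    {
        var log = new List<int>();

        // Building the query runs none of the iterator body.
        var firstTwo = LazyDemo.Numbers(log).Take(2);
        Console.WriteLine(log.Count); // 0

        // Enumerating pulls items one at a time and stops after two.
        var result = firstTwo.ToList();
        Console.WriteLine(log.Count); // 2
    }
}
```

The iterator body never reaches items 3 to 5, which is the "only continues until the second yield" behavior in action.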
However, the nature of a GROUP BY operation is that you must loop through the entire dataset to know the attributes of your group. Even though you only reach the yield line twice, each source.Where() call still has to look at your entire dataset, making this at least an O(n*m) operation: every time you identify a new group, you go through the entire dataset again.
It should be possible to write an O(n) GROUP BY operation by using a Dictionary rather than a List for finding new groups, accumulating aggregate info in the Dictionary values as you go. You might want to see if you can manage that. Of course, the catch is that with small values of n (small source lists), the hash calculations and lookups can cost more than the sequence iterations.
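For what it's worth, a Dictionary-based version could look roughly like this. It is only a sketch: the name GroupByDictionary is made up, it returns KeyValuePair items instead of IGrouping because the Group class from the question isn't shown, it assumes non-null keys, and argument validation is omitted for brevity.

```csharp
using System;
using System.Collections.Generic;

static class GroupingExtensions
{
    // Hypothetical single-pass grouping: one dictionary lookup per item.
    public static IEnumerable<KeyValuePair<TKey, List<TSource>>> GroupByDictionary<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector)
    {
        var groups = new Dictionary<TKey, List<TSource>>();
        var order = new List<TKey>(); // remembers the first-seen order of keys

        foreach (var item in source)
        {
            TKey key = keySelector(item);
            if (!groups.TryGetValue(key, out List<TSource> members))
            {
                members = new List<TSource>();
                groups.Add(key, members);
                order.Add(key);
            }
            members.Add(item);
        }

        // Groups are only yielded after the whole source has been read.
        foreach (TKey key in order)
            yield return new KeyValuePair<TKey, List<TSource>>(key, groups[key]);
    }
}

static class Program
{
    static void Main()
    {
        var words = new[] { "apple", "avocado", "banana", "blueberry", "cherry" };
        foreach (var g in words.GroupByDictionary(w => w[0]))
            Console.WriteLine(g.Key + ": " + string.Join(", ", g.Value));
    }
}
```

The trade-off is that the source is read in full before the first group comes out, so Take(2) saves no work on the source side; what you gain is a single pass overall instead of one pass per group.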
I have a collection of an anonymous type and I want to return an empty list of it.
What is the most readable expression to use?
I thought of the following, but I don't think they are readable enough:
var result = MyCollection.Take(0).ToList();
var result = MyCollection.Where(p => false).ToList();
Note: I don't want to empty the collection itself.
Any suggestions?
What about:
Enumerable.Empty<T>();
This returns an empty enumerable of type T. If you really want a List, you are free to do this:
Enumerable.Empty<T>().ToList<T>();
Actually, if you use a generic extension method, you don't even have to use any LINQ to achieve this; the anonymous type is already exposed through T:
public static IList<T> GetEmptyList<T>(this IEnumerable<T> source)
{
return new List<T>();
}
var emp = MyCollection.GetEmptyList();
Given that your first suggestion works and should perform well - if readability is the only issue, why not create an extension method:
public static IList<T> CreateEmptyCopy<T>(this IEnumerable<T> source)
{
return source.Take(0).ToList();
}
Now you can refactor your example to
var result = MyCollection.CreateEmptyCopy();
For performance reasons, you should stick with the first option you came up with.
The other one would iterate over the entire collection before returning an empty list.
Because the type is anonymous, there is no way to name it in source code when creating a list directly. There is, however, a way to create such a list through reflection.
I have a property on a class that is an ISet. I'm trying to get the results of a linq query into that property, but can't figure out how to do so.
Basically, looking for the last part of this:
ISet<T> foo = new HashedSet<T>();
foo = (from x in bar.Items select x).SOMETHING;
Could also do this:
HashSet<T> foo = new HashSet<T>();
foo = (from x in bar.Items select x).SOMETHING;
I don't think there's anything built in which does this... but it's really easy to write an extension method:
public static class Extensions
{
public static HashSet<T> ToHashSet<T>(
this IEnumerable<T> source,
IEqualityComparer<T> comparer = null)
{
return new HashSet<T>(source, comparer);
}
}
Note that you really do want an extension method (or at least a generic method of some form) here, because you may not be able to express the type of T explicitly:
var query = from i in Enumerable.Range(0, 10)
select new { i, j = i + 1 };
var resultSet = query.ToHashSet();
You can't do that with an explicit call to the HashSet<T> constructor. We're relying on type inference for generic methods to do it for us.
Now you could choose to name it ToSet and return ISet<T> - but I'd stick with ToHashSet and the concrete type. This is consistent with the standard LINQ operators (ToDictionary, ToList) and allows for future expansion (e.g. ToSortedSet). You may also want to provide an overload specifying the comparison to use.
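To illustrate the "future expansion" point, a ToSortedSet counterpart could follow the same shape. This is a hypothetical sketch, not a framework method; it relies only on the SortedSet<T>(IEnumerable<T>, IComparer<T>) constructor.

```csharp
using System;
using System.Collections.Generic;

static class SetExtensions
{
    // Hypothetical sibling of ToHashSet, following the same naming pattern.
    public static SortedSet<T> ToSortedSet<T>(
        this IEnumerable<T> source,
        IComparer<T> comparer = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        // A null comparer makes SortedSet<T> fall back to the default comparer.
        return new SortedSet<T>(source, comparer);
    }
}

static class Program
{
    static void Main()
    {
        var set = new[] { 3, 1, 2, 3 }.ToSortedSet();
        Console.WriteLine(string.Join(",", set)); // 1,2,3
    }
}
```

Returning the concrete SortedSet<T> keeps the method consistent with the ToHashSet design choice discussed above.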
Just pass your IEnumerable into the constructor for HashSet.
HashSet<T> foo = new HashSet<T>(from x in bar.Items select x);
This functionality has been added as an extension method on IEnumerable<TSource> to .NET Framework 4.7.2 and .NET Core 2.0. It is consequently also available on .NET 5 and later.
ToHashSet<TSource>(IEnumerable<TSource>)
ToHashSet<TSource>(IEnumerable<TSource>, IEqualityComparer<TSource>)
As @Joel stated, you can just pass your enumerable in. If you want an extension method, you can write:
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> items)
{
return new HashSet<T>(items);
}
There is an extension method built into .NET Framework and .NET Core for converting an IEnumerable to a HashSet: https://learn.microsoft.com/en-us/dotnet/api/?term=ToHashSet
public static System.Collections.Generic.HashSet<TSource> ToHashSet<TSource> (this System.Collections.Generic.IEnumerable<TSource> source);
It appears that I cannot use it in .NET Standard libraries yet (at the time of writing), so I use this extension method:
[Obsolete("This method is available in .NET Framework and .NET Core, " +
"but not in .NET Standard yet. Remove this method once it is added.")]
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source, IEqualityComparer<T> comparer = null) => new HashSet<T>(source, comparer);
That's pretty simple :)
var foo = new HashSet<T>(from x in bar.Items select x);
and yes T is the type specified by OP :)
If you only need read-only access to the set and the source is a parameter to your method, then I would go with:
public static ISet<T> EnsureSet<T>(this IEnumerable<T> source)
{
ISet<T> result = source as ISet<T>;
if (result != null)
return result;
return new HashSet<T>(source);
}
The reason is that callers may already be passing an ISet<T> to your method, in which case you do not need to create a copy.
Jon's answer is perfect. The only caveat is that, using NHibernate's HashedSet, I need to convert the results to a collection. Is there an optimal way to do this?
ISet<string> bla = new HashedSet<string>((from b in strings select b).ToArray());
or
ISet<string> bla = new HashedSet<string>((from b in strings select b).ToList());
Or am I missing something else?
Edit: This is what I ended up doing:
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source)
{
return new HashSet<T>(source);
}
public static HashedSet<T> ToHashedSet<T>(this IEnumerable<T> source)
{
return new HashedSet<T>(source.ToHashSet());
}
Rather than the simple conversion of IEnumerable to a HashSet, it is often convenient to convert a property of another object into a HashSet. You could write this as:
var set = myObject.Select(o => o.Name).ToHashSet();
but, my preference would be to use selectors:
var set = myObject.ToHashSet(o => o.Name);
They do the same thing, and the second is obviously shorter, but I find the idiom fits my brain better (I think of it as being like ToDictionary).
Here's the extension method to use, with support for custom comparers as a bonus.
public static HashSet<TKey> ToHashSet<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> selector,
IEqualityComparer<TKey> comparer = null)
{
return new HashSet<TKey>(source.Select(selector), comparer);
}
I'm in a situation where I just want to append the values in a string array (type String[]) to an object of type IList<String>. A quick look-up on MSDN revealed that IList<T>'s Insert method only has a version which takes an index and an object T, and does not have a version which takes IEnumerable<T> instead of T. Does this mean that I have to write a loop over the input list to put the values into the destination list? If that's the case, it seems like a very limiting and rather unfriendly API design to me. Maybe I'm missing something. What do C# experts do in this case?
That's because an interface generally specifies the least functionality required to make it usable, to reduce the burden on implementors. With C# 3.0 you can add this as an extension method:
public static void AddRange<T>(this IList<T> list, IEnumerable<T> items) {
if(list == null) throw new ArgumentNullException("list");
if(items == null) throw new ArgumentNullException("items");
foreach(T item in items) list.Add(item);
}
Et voilà: IList<T> now has AddRange:
IList<string> list = ...
string[] arr = {"abc","def","ghi","jkl","mno"};
list.AddRange(arr);
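One possible refinement, sketched here under the assumption that many IList<T> instances are really List<T> at run time: delegate to List<T>.AddRange when you can, and fall back to the loop otherwise. The name AddAll is made up to keep it visibly distinct from the built-in List<T>.AddRange.

```csharp
using System;
using System.Collections.Generic;

static class ListRangeExtensions
{
    // Hypothetical variant of the AddRange extension with a List<T> fast path.
    public static void AddAll<T>(this IList<T> list, IEnumerable<T> items)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        if (items == null) throw new ArgumentNullException(nameof(items));

        // List<T> already has an AddRange of its own, so reuse it.
        if (list is List<T> concrete)
        {
            concrete.AddRange(items);
            return;
        }

        // Generic fallback for any other IList<T> implementation.
        foreach (T item in items) list.Add(item);
    }
}

static class Program
{
    static void Main()
    {
        IList<string> list = new List<string> { "abc" };
        string[] arr = { "def", "ghi" };
        list.AddAll(arr);
        Console.WriteLine(list.Count); // 3
    }
}
```

Whether the fast path matters in practice depends on the sizes involved; for a handful of items the plain loop is just as good.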