LINQ to empty Collection - c#

I have one simple LINQ expression in my VM:
public int MaxItem => Collection.Max((c)=> c.Count);
There is no problem with it if Collection is full of items. But if I clear it like this:
Collection.Clear();
Then I get an exception:
System.InvalidOperationException
How can I fix it?

Max (and Min) are undefined for empty sets, so the only reasonable behavior is to throw an exception when the sequence has no items.
If you need special handling for your collection, check for the empty (or "full of items") condition and call different methods:
public int MaxItem => Collection.IsFullOfItems ?
Collection.Max((c)=> c.Count) : 0;
(You can use Any() or Count != 0 for most collection types if you don't have a custom IsFullOfItems property.)
Alternatively, if Collection is your own custom class, you can implement your own Max(Func<T, int> selector) method in that class. Instance methods take precedence over extension methods, so it will be used in the Collection.Max call instead of the default Enumerable.Max extension method.
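As a rough sketch of the check described above, here are two ways to fall back to 0 for an empty collection. The Item type and the MaxItemSketch helper are hypothetical names used only for illustration, and 0 is assumed to be an acceptable fallback value:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type standing in for whatever the collection holds.
public class Item
{
    public int Count { get; set; }
}

public static class MaxItemSketch
{
    // Variant 1: explicit emptiness check, falling back to 0.
    public static int MaxWithAny(ICollection<Item> items)
    {
        return items.Any() ? items.Max(c => c.Count) : 0;
    }

    // Variant 2: project first, then substitute a single 0 if nothing is left.
    public static int MaxWithDefaultIfEmpty(IEnumerable<Item> items)
    {
        return items.Select(c => c.Count).DefaultIfEmpty(0).Max();
    }
}
```

Both variants return 0 after Collection.Clear(); which one reads better is mostly a matter of taste.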

Related

Does .Where(x=> x.listFoos.Count() > 1) count all sub element?

Given a list of objects with a property of type List<Foo>:
class Bar{
public List<Foo> Foos{get;set;}
}
Will the following code, which selects all Bars with more than one Foo, count all Foos?
Or will it stop iterating at 2 Foos?
var input = new List<Bar>();
var result = input.Where(x=> x.Foos.Count()>1).ToList();
It won't count anything. List<T> redundantly stores the number of elements, so accessing the Count property is an O(1) operation.
This works even if you use the Enumerable.Count() extension method rather than List<T>'s built-in Count property, because Enumerable.Count() has a built-in optimization if the underlying data source implements ICollection<T>.
As mentioned by Enigmativity in the comments: If you have an IEnumerable which is not an ICollection<T>, you can use the following instead to prevent iterating the entire enumerable:
var result = input.Where(x => x.Foos.Skip(1).Any()).ToList();
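To make the difference visible, here is a small sketch with a lazy sequence that counts how many elements it has produced. LazyCountSketch, Numbers and HasMoreThanOne are made-up names for this illustration, not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyCountSketch
{
    // Number of elements the iterator below has produced so far.
    public static int Produced;

    // A lazy sequence that is NOT an ICollection<T>, so Count() must walk it.
    public static IEnumerable<int> Numbers(int howMany)
    {
        for (var i = 0; i < howMany; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // True when the sequence holds at least two elements.
    public static bool HasMoreThanOne<T>(IEnumerable<T> source)
    {
        return source.Skip(1).Any();
    }
}
```

Calling HasMoreThanOne(Numbers(1000)) should pull only the first couple of elements from the iterator, whereas Numbers(1000).Count() > 1 has to produce all 1000.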
When you have questions about how parts of .NET work, it's ideal to look at the source code. This is the source for List<T>.Count:
// Read-only property describing how many elements are in the List.
public int Count {
get {
Contract.Ensures(Contract.Result<int>() >= 0);
return _size;
}
}
_size is changed whenever the underlying collection is changed, so the property doesn't actually count anything; it just returns the known size of the list.

Cast generic Collection to List once, then RemoveAt multiple times

I have a generic collection of objects as a property of an object. This data comes from a SQL query and an API. I want to use the RemoveAt method to remove some of them efficiently, but Visual Studio complains that the RemoveAt method is undefined. My intuition is to cast the Collection to a List, giving me access to the RemoveAt method. I only want to cast one time, then use RemoveAt as many times as necessary. I could use the Remove(object) method, but it requires traversing the Collection to look for the object on each call, which is slower than using RemoveAt.
Here is what I'm trying:
obj.stuckArray = obj.stuckArray.ToList();
After this line, I have a line of code that looks like this:
obj.stuckArray.RemoveAt(1);
Unfortunately the RemoveAt gets underlined with red and the warning from Visual Studio reads: "ICollection does not contain a definition for 'RemoveAt'"
Is it possible to cast once and call RemoveAt multiple times? Or is this not possible?
Just do it in three statements instead of two, using a local variable:
var list = obj.stuckArray.ToList();
list.RemoveAt(1);
obj.stuckArray = list;
That way the type of stuckArray doesn't matter as much: you only need to be able to call ToList() on it. The RemoveAt method on List<T> is fine because that's the type of list.
Obviously you want to remove an element from the array and store it back into the original member stuckArray. However, as ICollection<T> has no RemoveAt method defined, you get the error. The method does exist on List<T>, however.
So do the following instead:
var tmp = obj.stuckArray.ToList();
tmp.RemoveAt(1);
obj.stuckArray = tmp;
However, this will traverse the entire collection anyway, as ToList copies the entire collection into a new one. But I don't see any way around this in order to delete an element from your array, because an array has no RemoveAt method.
As per your EDIT: why not just do the removal after reassigning stuckArray:
var tmp = obj.stuckArray.ToList();
obj.stuckArray = tmp;
Now you can call RemoveAt as often as you want:
((List<MyType>)obj.stuckArray).RemoveAt(1);
((List<MyType>)obj.stuckArray).RemoveAt(1);
((List<MyType>)obj.stuckArray).RemoveAt(1);
Casting this many times shouldn't have a big impact on your performance, as obj.stuckArray already is a List<MyType>. RemoveAt, on the other hand, will have an effect here, as the method shifts every element after the removed index within the internal array, as you can see in the source code for RemoveAt:
public void RemoveAt(int index) {
if ((uint)index >= (uint)_size) {
ThrowHelper.ThrowArgumentOutOfRangeException();
}
Contract.EndContractBlock();
_size--;
if (index < _size) {
Array.Copy(_items, index + 1, _items, index, _size - index); // every element after index is copied one slot down
}
_items[_size] = default(T);
_version++;
}
So by calling RemoveAt three times, you also shift the remaining elements of the internal array three times.
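When the elements to remove are consecutive, as with the three RemoveAt(1) calls above, List<T>.RemoveRange can do the same job with a single shift. The following is only a sketch; RemoveRangeSketch and WithoutRange are hypothetical names, and whether the difference matters for your data sizes is something to measure:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RemoveRangeSketch
{
    // Removes `count` consecutive elements starting at `index` from a copy
    // of the source; the trailing elements are shifted once by RemoveRange.
    public static List<T> WithoutRange<T>(IEnumerable<T> source, int index, int count)
    {
        var list = source.ToList();
        list.RemoveRange(index, count);
        return list;
    }
}
```

WithoutRange(items, 1, 3) leaves the same elements as copying items to a list and calling RemoveAt(1) three times.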
If you have control over the object that exposes the stuckArray property, you can expose stuckArray as ICollection<T>, but inside the object use a List<T> as its backing field. Then you can add a method that removes an item by its index:
class MyClass
{
// Of course, this doesn't have to be int...
private readonly List<int> _stuckArray = new List<int>();
// Callers only see the ICollection<int> view of the list.
public ICollection<int> StuckArray { get { return _stuckArray; } }
public void RemoveFromStuckArray(int index)
{
_stuckArray.RemoveAt(index);
}
}
That lets you keep whatever references you already have to the property while also supplying a method to remove items by index efficiently, though I'm not sure it's a good idea to enable removing items by index from an ICollection in the first place.

Removing All Nulls From Collections

I'm having to deal with collections of data being thrown at my application from data sources out of my control. Some of these collections contain nulls which I would prefer to filter out as soon as they hit my code rather than scatter null checking code all over the place. I want to do this in a reusable generic fashion and have written this method to do it:
public static void RemoveNulls<T>(this IList<T> collection) where T : class
{
for (var i = 0; i < collection.Count(); i++)
{
if (collection[i] == null)
collection.RemoveAt(i);
}
}
I know on the concrete List class there is the RemoveAll() method that could be used like:
collection.RemoveAll(x => x == null);
But a lot of the return types are interface based (IList/ IList ...) rather than concrete types.
Instead of removing nulls from the source collection, you can create a copy of the collection without nulls using LINQ:
collection.Where(i => i != null).ToList();
Extension methods would work on any IEnumerable, including IList.
Your method won't work because removing an element causes the index of every subsequent element to be decremented, so the loop skips the element that slides into the freed slot and misses consecutive nulls. If you don't want a LINQ solution (which seems simplest: see the answer from @alex), you should iterate backwards.
public static void RemoveNulls<T>(this IList<T> collection) where T : class
{
for (var i = collection.Count-1; i >= 0 ; i--)
{
if (collection[i] == null)
collection.RemoveAt(i);
}
}
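To see the skipped element concretely, the sketch below puts the forward loop from the question next to the backward loop. The class and method names are made up for this comparison:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RemoveNullsSketch
{
    // The forward loop from the question: after RemoveAt(i) the next element
    // slides into slot i, and the i++ that follows steps over it.
    public static void RemoveNullsForward<T>(IList<T> collection) where T : class
    {
        for (var i = 0; i < collection.Count; i++)
        {
            if (collection[i] == null)
                collection.RemoveAt(i);
        }
    }

    // The backward loop: a removal only shifts elements already visited.
    public static void RemoveNullsBackward<T>(IList<T> collection) where T : class
    {
        for (var i = collection.Count - 1; i >= 0; i--)
        {
            if (collection[i] == null)
                collection.RemoveAt(i);
        }
    }
}
```

With a list such as "a", null, null, "b", the forward version leaves one null behind, while the backward version removes both.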

Better way to check for elements in list?

If I want to perform actions such as .Where(...) or .Max(...), I need to make sure the list is not null and has a count greater than zero. Besides doing something such as the following every time I want to use the list:
if(mylist != null && mylist.Count > 0)
{...}
is there a more inline or lambda-like technique that I can use? Or another, more compact technique?
public static class LinqExtensions
{
public static bool IsNullOrEmpty<T>(this IEnumerable<T> items)
{
return items == null || !items.Any();
}
}
You can then do something like
if (!myList.IsNullOrEmpty())
....
My general preference is to have empty list instances instead of null list variables. However, not everyone can cajole their co-workers into this arrangement. You can protect yourself from null list variables using this extension method.
public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> source)
{
return source ?? Enumerable.Empty<T>();
}
Called by:
IEnumerable<Customer> result = myList.EmptyIfNull().Where(c => c.Name == "Bob");
Most LINQ methods work on empty collections. Two methods that don't are Min and Max. Generally, I call these methods against an IGrouping. Most IGrouping implementations have at least one element (for example, IGroupings generated by GroupBy or ToLookup). For other cases, you can use Enumerable.DefaultIfEmpty.
int result = myList.EmptyIfNull().Select(c => c.FavoriteNumber).DefaultIfEmpty().Max();
Don't let the list be null
Ensure the object is always in a valid state. By ensuring the list is never null, you never have to check that the list is null.
public class MyClass
{
private readonly IEnumerable<int> ints;
public MyClass(IEnumerable<int> ints)
{
this.ints = ints ?? Enumerable.Empty<int>(); // never store a null sequence
}
public IEnumerable<int> IntsGreaterThan5()
{
return this.ints.Where(x => x > 5);
}
}
Even if this list were empty, you'd still get a valid IEnumerable<int> back.
Max and Min overloads with Nullable types
That still doesn't solve the "Max" and "Min" problems though. There are overloads of Max and Min that take selectors. Those selector overloads can return nullable ints, so your max method becomes this:
this.ints.Max(x => new int?(x));
Therefore, you run Max and check to see if you've gotten a null value or an integer back. voila!
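A minimal sketch of that nullable-selector approach follows; NullableMaxSketch and MaxOrNull are hypothetical names chosen for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullableMaxSketch
{
    // Uses the Max overload whose selector returns int?; for an empty
    // sequence that overload yields null instead of throwing.
    public static int? MaxOrNull(IEnumerable<int> ints)
    {
        return ints.Max(x => (int?)x);
    }
}
```

A caller can then pick its own fallback, for example NullableMaxSketch.MaxOrNull(values) ?? -1.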
Other Options
Custom Extension Methods
You could also write your own extension methods.
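public static class MinMaxHelper
{
// Returns null for an empty sequence instead of throwing.
public static int? MaxOrDefault(this IEnumerable<int> ints)
{
if(!ints.Any())
{
return null;
}
return ints.Max();
}
// Returns defaultValue for an empty sequence instead of throwing.
public static int MaxOrDefault(this IEnumerable<int> ints, int defaultValue)
{
if(!ints.Any())
{
return defaultValue;
}
return ints.Max();
}
}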
Overriding Linq Extension Methods
And finally, remember that the built-in LINQ extension methods can be shadowed by your own extension methods with matching signatures, provided yours live in a namespace closer to the calling code than System.Linq (otherwise the call is ambiguous). Therefore, you could write extension methods to replace .Where(...) and .Max(...) that return null (or a default value) instead of throwing an ArgumentNullException if the enumerable is null.
Use empty collections instead of null collections. Where will work just fine against an empty collection, so you don't need to ensure that Count > 0 before calling it. You can also call Max on an empty collection if you do a bit of gymnastics first.
For IEnumerable<T> use Enumerable.Empty<T>()
For T[] use new T[0]
For List<T> use new List<T>()
You could try myList.Any() instead of .Count, but you'd still need to check for null.
If there is a risk of your list being null, you will always have to check that before calling any of its methods, but you could use the Any() method rather than Count. It returns true as soon as it finds one item, regardless of how many more the list contains. For a List<T> the Count property is already O(1), so the saving only matters for sequences that are not an ICollection<T>, where Count() has to iterate the whole sequence:
if(mylist != null && mylist.Any())
{...}
You can use ?? operator which converts null to the value you supply on the right side:
public ProcessList(IEnumerable<int> ints)
{
this.ints = ints ?? new List<int>();
}
By the way: It is not a problem to process an empty list using LINQ.
You don't need to check Count to call Where. Max needs a non-empty list for value types but that can be overcome with an inline cast, eg
int? max = new List<int>().Max(i => (int?)i); // max = null

Implementing IEquatable<T> to avoid duplicates from List<T>

I have a List<CustomObject> and want to remove duplicates from it.
If two Custom Objects have same value for property: City, then I will call them duplicate.
I have implemented IEquatable as follows, but not able to remove duplicates from the list.
What is missing?
public class CustomAddress : IAddress, IEqualityComparer<IAddress>
{
//Other class members go here
//IEqualityComparer members
public bool Equals(IAddress x, IAddress y)
{
// Check whether the compared objects reference the same data.
if (ReferenceEquals(x, y)) return true;
// Check whether any of the compared objects is null.
if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
return false;
// Check whether the Objects' properties are equal.
return x.City.Equals(y.City);
}
public int GetHashCode(IAddress obj)
{
// Check whether the object is null.
if (ReferenceEquals(obj, null)) return 0;
int hashAreaName = obj.City == null ? 0 : obj.City.GetHashCode();
return hashAreaName;
}
}
I am using .NET 3.5
With your overrides of Equals and GetHashCode in place, if you have an existing list that you need to filter, simply invoke Distinct() (available through the namespace System.Linq) on the list.
var noDupes = list.Distinct();
This will give you a duplicate-free sequence. If you need that to be a concrete list, simply add a ToList() to the end of the invocation.
var noDupes = list.Distinct().ToList();
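For reference, this is roughly what such overrides could look like when equality is based on City alone. CityAddress and DistinctSketch are hypothetical stand-ins for the question's types, and the Street property is only there to show that other members are ignored:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's CustomObject: equal when City matches.
public class CityAddress : IEquatable<CityAddress>
{
    public string City { get; set; }
    public string Street { get; set; }

    public bool Equals(CityAddress other)
    {
        return other != null && string.Equals(City, other.City);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as CityAddress);
    }

    // Must agree with Equals: equal cities give equal hash codes.
    public override int GetHashCode()
    {
        return City == null ? 0 : City.GetHashCode();
    }
}

public static class DistinctSketch
{
    public static List<CityAddress> WithoutDuplicateCities(IEnumerable<CityAddress> source)
    {
        return source.Distinct().ToList();
    }
}
```

Because the class implements IEquatable<CityAddress> itself, Distinct() and Contains pick up the City-based equality without a separate comparer.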
Another answer mentions implementing an IEqualityComparer<CustomObject>. This is useful when overriding Equals and GetHashCode directly is either impossible (you don't control the source) or does not make sense (your idea of equality in this particular case is not universal for the class). In that case, define the comparer as demonstrated and provide an instance of the comparer to an overload of Distinct.
Finally, if you're building a list from the ground up and want to avoid duplicates being inserted, you can use a HashSet<T>. The HashSet also accepts a custom comparer in the constructor, so you can optionally include that.
var mySet = new HashSet<CustomObject>();
bool isAdded = mySet.Add(myElement);
// isAdded will be false if myElement already exists in set, and
// myElement would not be added a second time.
// or you could use
if (!mySet.Contains(myElement))
mySet.Add(myElement);
One more option that does not use .NET library methods but can be useful in a pinch is Jon Skeet's DistinctBy. The idea is that you submit a Func<MyObject, Key> lambda expression directly and omit the overrides of Equals and GetHashCode (or the custom comparer) entirely.
var noDupes = list.DistinctBy(obj => obj.City); // NOT part of BCL
Just by implementing .Equals the way you did (which you implemented correctly), you will not prevent duplicates from being added to a List<T>. You will actually have to remove them manually.
Instead of List<CustomObject> use HashSet<CustomObject>. It will never contain duplicates.
That's because List<CustomObject> tests whether your class (CustomObject) implements IEquatable<CustomObject>; your class implements IEqualityComparer<IAddress> instead.
I assume that for the duplicate check you are using the Contains method before adding a new member.
To match duplicates on only a specific property you need a comparer.
class MyComparer : IEqualityComparer<CustomObject>
{
public bool Equals(CustomObject x, CustomObject y)
{
return x.City.Equals(y.City);
}
public int GetHashCode(CustomObject x)
{
return x.City.GetHashCode();
}
}
Usage:
var yourDistinctObjects = yourObjects.Distinct(new MyComparer());
Edit: Found this thread that does what you need and I think I referred to it in the past:
Remove duplicates in the list using linq
One answer that I thought was kind of interesting (but not how I had done it) was:
var distinctItems = items.GroupBy(x => x.Id).Select(y => y.First());
It's a one-liner that does what you need, but it might not be as efficient as the other methods.
