If I want to perform actions such as .Where(...) or .Max(...), I need to make sure the list is not null and has a count greater than zero. Besides doing something such as the following every time I want to use the list:
if(mylist != null && mylist.Count > 0)
{...}
is there a more inline or lambda-like technique that I can use? Or another, more compact technique?
public static class LinqExtensions
{
public static bool IsNullOrEmpty<T>(this IEnumerable<T> items)
{
return items == null || !items.Any();
}
}
You can then do something like
if (!myList.IsNullOrEmpty())
....
My general preference is to have empty list instances instead of null list variables. However, not everyone can cajole their co-workers into this arrangement. You can protect yourself from null list variables using this extension method.
public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> source)
{
return source ?? Enumerable.Empty<T>();
}
Called by:
var result = myList.EmptyIfNull().Where(c => c.Name == "Bob");
Most LINQ methods work on empty collections. Two common ones that don't are Min and Max (over non-nullable value types). Generally, I call these methods against an IGrouping. Most IGrouping implementations have at least one element (for example, IGroupings generated by GroupBy or ToLookup). For other cases, you can use Enumerable.DefaultIfEmpty, which substitutes a single default value (0 for int) when the sequence is empty.
int result = myList.EmptyIfNull().Select(c => c.FavoriteNumber).DefaultIfEmpty().Max();
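To see how the two pieces combine, here is a minimal sketch. MaxOrZero and the demo class are hypothetical names used only for illustration; EmptyIfNull is repeated so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullSafeExtensions
{
    // Same idea as EmptyIfNull above: substitute an empty sequence for null.
    public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> source)
    {
        return source ?? Enumerable.Empty<T>();
    }
}

public static class EmptyIfNullDemo
{
    // Hypothetical helper: largest value, or 0 (default(int)) when there is nothing to compare.
    public static int MaxOrZero(IEnumerable<int> numbers)
    {
        return numbers.EmptyIfNull().DefaultIfEmpty().Max();
    }

    public static void Main()
    {
        Console.WriteLine(MaxOrZero(null));               // 0
        Console.WriteLine(MaxOrZero(new List<int>()));    // 0
        Console.WriteLine(MaxOrZero(new[] { 3, 9, 4 }));  // 9
    }
}
```

Note that 0 is indistinguishable from a real maximum of 0, so this only fits cases where that default is acceptable.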
Don't let the list be null
Ensure the object is always in a valid state. By ensuring the list is never null, you never have to check that the list is null.
public class MyClass
{
private readonly IEnumerable<int> ints;
public MyClass(IEnumerable<int> ints)
{
this.ints = ints;
}
public IEnumerable<int> IntsGreaterThan5()
{
return this.ints.Where(x => x > 5);
}
}
Even if this list were empty, you'd still get a valid IEnumerable<int> back.
Max and Min overloads with Nullable types
That still doesn't solve the "Max" and "Min" problems, though. Max and Min have overloads that take a selector, and the selector can return a nullable int, so your Max call becomes this:
this.ints.Max(x => new int?(x));
You then run Max and check whether you got null or an integer back. Voilà!
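A small sketch of that check, assuming a hypothetical MaxOrNull wrapper around the selector overload:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullableMaxDemo
{
    // Hypothetical helper: null for an empty sequence, otherwise the largest element.
    public static int? MaxOrNull(IEnumerable<int> ints)
    {
        return ints.Max(x => (int?)x);
    }

    public static void Main()
    {
        int? none = MaxOrNull(new int[0]);
        int? some = MaxOrNull(new[] { 2, 7, 5 });

        Console.WriteLine(none.HasValue);  // False
        Console.WriteLine(some);           // 7
        Console.WriteLine(none ?? -1);     // -1 (caller picks the fallback)
    }
}
```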
Other Options
Custom Extension Methods
You could also write your own extension methods.
public static class MinMaxHelper
{
// Returns null when the sequence is empty.
public static int? MaxOrDefault(this IEnumerable<int> ints)
{
if(!ints.Any())
{
return null;
}
return ints.Max();
}
// Returns defaultValue when the sequence is empty.
public static int MaxOrDefault(this IEnumerable<int> ints, int defaultValue)
{
if(!ints.Any())
{
return defaultValue;
}
return ints.Max();
}
}
Overriding Linq Extension Methods
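And finally, remember that the built-in LINQ extension methods can be shadowed (they cannot literally be overridden) by your own extension methods with matching signatures, as long as the compiler finds yours first, for example because they are declared in the namespace of the calling code. Therefore, you could write extension methods that replace .Where(...) and .Max(...) and return an empty sequence, null, or a default value instead of throwing an ArgumentNullException when the enumerable is null.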
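A rough sketch of that shadowing, under the assumption that the calling code lives in a namespace (here the made-up MyApp) that also declares the replacement. NullTolerantLinq and ShadowingDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace MyApp
{
    public static class NullTolerantLinq
    {
        // Same shape as Enumerable.Where, but treats a null source as empty.
        public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate)
        {
            return Enumerable.Where(source ?? Enumerable.Empty<T>(), predicate);
        }
    }

    public static class ShadowingDemo
    {
        // Inside MyApp, the call below binds to NullTolerantLinq.Where,
        // because extension methods in the enclosing namespace are tried first.
        public static int CountGreaterThan5(List<int> values)
        {
            return values.Where(x => x > 5).Count();
        }

        public static void Main()
        {
            Console.WriteLine(CountGreaterThan5(null));                       // 0
            Console.WriteLine(CountGreaterThan5(new List<int> { 1, 6, 9 }));  // 2
        }
    }
}
```

This changes the meaning of familiar calls for everyone reading the code, so a differently named method (such as EmptyIfNull) is often the less surprising choice.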
Use empty collections instead of null collections. Where will work just fine against an empty collection, so you don't need to ensure that Count > 0 before calling it. You can also call Max on an empty collection if you do a bit of gymnastics first.
For IEnumerable<T> use Enumerable.Empty<T>()
For T[] use new T[0]
For List<T> use new List<T>()
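As an illustration, here is a sketch in which three hypothetical "nothing found" methods each return an empty collection, so the caller needs neither a null check nor a Count check:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptyInsteadOfNullDemo
{
    // Hypothetical "nothing found" results: each returns an empty collection, never null.
    public static IEnumerable<int> NoSequence() { return Enumerable.Empty<int>(); }
    public static int[] NoArray() { return new int[0]; }
    public static List<int> NoList() { return new List<int>(); }

    // Works for any of the three without a null or Count check.
    public static int CountGreaterThan5(IEnumerable<int> values)
    {
        return values.Where(x => x > 5).Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountGreaterThan5(NoSequence()));  // 0
        Console.WriteLine(CountGreaterThan5(NoArray()));     // 0
        Console.WriteLine(CountGreaterThan5(NoList()));      // 0
    }
}
```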
You could try myList.Any() instead of .Count, but you'd still need to check for null.
If there is a risk of your list being null, you will always have to check that before calling any of its methods, but you could use the Any() method rather than Count. Any() returns true as soon as it finds one item, regardless of how many items follow. For a sequence that is not an ICollection<T>, this avoids iterating over the entire sequence, which is what Count() does (the Count property of List<T> is already O(1)):
if(mylist != null && mylist.Any())
{...}
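The difference is easiest to see on a lazy sequence. The sketch below uses a hypothetical iterator that records how many items were actually pulled:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCountDemo
{
    private static int pulled;

    // Lazy sequence that records how many items were actually produced.
    private static IEnumerable<int> Numbers(int howMany)
    {
        for (int i = 0; i < howMany; i++)
        {
            pulled++;
            yield return i;
        }
    }

    public static int ItemsPulledByAny(int howMany)
    {
        pulled = 0;
        Numbers(howMany).Any();
        return pulled;
    }

    public static int ItemsPulledByCount(int howMany)
    {
        pulled = 0;
        Numbers(howMany).Count();
        return pulled;
    }

    public static void Main()
    {
        Console.WriteLine(ItemsPulledByAny(1000));    // 1
        Console.WriteLine(ItemsPulledByCount(1000));  // 1000
    }
}
```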
You can use ?? operator which converts null to the value you supply on the right side:
public ProcessList(IEnumerable<int> ints)
{
this.ints = ints ?? new List<int>();
}
By the way: It is not a problem to process an empty list using LINQ.
You don't need to check Count to call Where. Max needs a non-empty list for value types, but that can be overcome with an inline cast, e.g.
int? max = new List<int>().Max(i => (int?)i); // max = null
Related
I need to check if two lists have any elements in common. I just need a yes/no - I don't need the actual list of common elements.
I could use Enumerable.Intersect() but this does actually return the set of matching items which seems like it would require extra overhead. Is there a better method for checking if lists are disjoint?
My lists do actually happen to be List<T> but that isn't crucial and I could use something like HashSet (say) if that were more convenient. i.e., I don't want to unnecessarily constrain potential solutions.
Many thanks
Simplest version (using Intersect):
public bool Compare<T>(List<T> firstCollection, List<T> secondCollection)
{
return firstCollection.Intersect(secondCollection).Any();
}
The only caveat is that T should implement IEquatable<T>, or you should pass a custom IEqualityComparer<T> to the Intersect call. Also make sure that GetHashCode is overridden along with Equals.
Edit 1:
This version uses a Dictionary that counts occurrences. Besides the boolean answer, the dictionary ends up holding, for each element of the first collection, its number of occurrences in the first collection minus its number of occurrences in the second, so it is fairly versatile. This solution also requires T to implement equality correctly (Equals and GetHashCode, or IEquatable<T>).
public bool CompareUsingDictionary<T>(IEnumerable<T> firstCollection, IEnumerable<T> secondCollection)
{
// The compared type must override Equals and GetHashCode consistently. Null items are not supported as keys.
// Null handling: if both collections are null return true; if only one is null return false.
if (firstCollection == null && secondCollection == null)
return true;
if (firstCollection == null || secondCollection == null)
return false;
// Key: the item itself (keying on the hash code alone would treat hash collisions as matches).
// Value: number of occurrences.
var dictionary = new Dictionary<T, int>();
// Count each value in the first list
foreach (T item in firstCollection)
{
int count;
dictionary.TryGetValue(item, out count); // count stays 0 when the key is missing
dictionary[item] = count + 1;
}
// If a value from the second list exists in the dictionary, decrease its count
bool anyShared = false;
foreach (T item in secondCollection)
{
if (dictionary.ContainsKey(item))
{
dictionary[item]--;
anyShared = true;
}
}
// True when at least one element occurs in both collections
return anyShared;
}
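Since the question allows HashSet<T>, a shorter route is HashSet<T>.Overlaps. The following is a sketch only; HaveCommonElement is a hypothetical name, and the default equality comparer for T is assumed.

```csharp
using System;
using System.Collections.Generic;

public static class OverlapDemo
{
    // True when the two sequences share at least one element.
    public static bool HaveCommonElement<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        var set = new HashSet<T>(first);  // one pass to index the first sequence
        return set.Overlaps(second);      // probes the set for each item of the second
    }

    public static void Main()
    {
        Console.WriteLine(HaveCommonElement(new[] { 1, 2, 3 }, new[] { 3, 4 }));  // True
        Console.WriteLine(HaveCommonElement(new[] { 1, 2, 3 }, new[] { 4, 5 }));  // False
    }
}
```

If one of the lists is already a HashSet<T>, the copy can be skipped and Overlaps called on it directly.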
I have one simple linq in my VM:
public int MaxItem => Collection.Max((c)=> c.Count);
There is no problem with it if Collection is full of items. But if I clear it like this:
Collection.Clear();
then I get an exception:
System.InvalidOperationException
How can I fix it?
Max (and Min) are undefined for empty sets, so the only reasonable behavior is to throw an exception when the sequence has no items.
If you need special handling for your collection, check for the empty (or "full of items") condition and call different methods:
public int MaxItem => Collection.IsFullOfItems ?
Collection.Max((c)=> c.Count) : 0;
(You can use Any() or Count != 0 for most collection types if you don't have your custom IsFullOfItems property)
Alternatively, if Collection is your own custom class, you can implement your own Max(Func<T, int> selector) method in that class; instance methods take precedence over extension methods, so it will be used in the Collection.Max call instead of the default Enumerable.Max extension method.
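Another way to get a fallback of 0 without a separate emptiness check is DefaultIfEmpty. This is a sketch that assumes, purely for illustration, that Collection is a list of lists; the view model type is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MaxItemViewModel
{
    // Assumed shape of the collection: each item exposes a Count.
    public List<List<int>> Collection = new List<List<int>>();

    // 0 when Collection is empty, otherwise the largest inner Count.
    public int MaxItem
    {
        get { return Collection.Select(c => c.Count).DefaultIfEmpty(0).Max(); }
    }
}

public static class MaxItemDemo
{
    public static void Main()
    {
        var vm = new MaxItemViewModel();
        vm.Collection.Add(new List<int> { 1, 2, 3 });
        vm.Collection.Add(new List<int> { 4 });
        Console.WriteLine(vm.MaxItem);  // 3

        vm.Collection.Clear();
        Console.WriteLine(vm.MaxItem);  // 0
    }
}
```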
I'm having to deal with collections of data being thrown at my application from data sources out of my control. Some of these collections contain nulls which I would prefer to filter out as soon as they hit my code rather than scatter null checking code all over the place. I want to do this in a reusable generic fashion and have written this method to do it:
public static void RemoveNulls<T>(this IList<T> collection) where T : class
{
for (var i = 0; i < collection.Count(); i++)
{
if (collection[i] == null)
collection.RemoveAt(i);
}
}
I know on the concrete List class there is the RemoveAll() method that could be used like:
collection.RemoveAll(x => x == null);
But a lot of the return types are interface based (IList/ IList ...) rather than concrete types.
Instead of removing nulls from the source collection, you can create a copy of the collection without nulls using LINQ:
collection.Where(i => i != null).ToList();
Extension methods would work on any IEnumerable, including IList.
Your method won't work because removing an element will cause the index of all subsequent elements to be decremented. If you don't want a Linq solution (which seems simplest: see the answer from #alex), you should iterate backwards.
public static void RemoveNulls<T>(this IList<T> collection) where T : class
{
for (var i = collection.Count-1; i >= 0 ; i--)
{
if (collection[i] == null)
collection.RemoveAt(i);
}
}
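To make the index shift concrete, here is a small sketch that runs both loops on a list with two consecutive nulls. The method names are made up for the comparison.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveNullsDemo
{
    // Forward loop as in the question: after RemoveAt(i), the next element
    // slides into slot i and is skipped when i is incremented.
    public static void RemoveNullsForward<T>(IList<T> collection) where T : class
    {
        for (var i = 0; i < collection.Count; i++)
        {
            if (collection[i] == null)
                collection.RemoveAt(i);
        }
    }

    // Backward loop: removals only shift elements that were already visited.
    public static void RemoveNullsBackward<T>(IList<T> collection) where T : class
    {
        for (var i = collection.Count - 1; i >= 0; i--)
        {
            if (collection[i] == null)
                collection.RemoveAt(i);
        }
    }

    public static void Main()
    {
        var forward = new List<string> { "a", null, null, "b" };
        RemoveNullsForward(forward);
        Console.WriteLine(forward.Count);   // 3 (one null survives)

        var backward = new List<string> { "a", null, null, "b" };
        RemoveNullsBackward(backward);
        Console.WriteLine(backward.Count);  // 2
    }
}
```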
I have a List<CustomObject> and want to remove duplicates from it.
If two Custom Objects have same value for property: City, then I will call them duplicate.
I have implemented IEquatable as follows, but not able to remove duplicates from the list.
What is missing?
public class CustomAddress : IAddress, IEqualityComparer<IAddress>
{
//Other class members go here
//IEqualityComparer members
public bool Equals(IAddress x, IAddress y)
{
// Check whether the compared objects reference the same data.
if (ReferenceEquals(x, y)) return true;
// Check whether any of the compared objects is null.
if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
return false;
// Check whether the Objects' properties are equal.
return x.City.Equals(y.City);
}
public int GetHashCode(IAddress obj)
{
// Check whether the object is null.
if (ReferenceEquals(obj, null)) return 0;
int hashAreaName = City == null ? 0 : City.GetHashCode();
return hashAreaName;
}
}
I am using .NET 3.5
With your overrides of Equals and GetHashCode in place, if you have an existing list that you need to filter, simply invoke Distinct() (available through the namespace System.Linq) on the list.
var noDupes = list.Distinct();
This will give you a duplicate-free sequence. If you need that to be a concrete list, simply add a ToList() to the end of the invocation.
var noDupes = list.Distinct().ToList();
Another answer mentions implementing an IEqualityComparer<CustomObject>. This is useful when overriding Equals and GetHashCode directly is either impossible (you don't control the source) or does not make sense (your idea of equality in this particular case is not universal for the class). In that case, define the comparer as demonstrated and provide an instance of the comparer to an overload of Distinct.
Finally, if you're building a list from the ground-up and want to avoid duplicates being inserted, you can use a HashSet<T> as mentioned here. The HashSet also accepts a custom comparer in the constructor, so you can optionally include that.
var mySet = new HashSet<CustomObject>();
bool isAdded = mySet.Add(myElement);
// isAdded will be false if myElement already exists in set, and
// myElement would not be added a second time.
// or you could use
if (!mySet.Contains(myElement))
mySet.Add(myElement);
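For the comparer-in-the-constructor variant, a minimal sketch using plain strings and the built-in StringComparer.OrdinalIgnoreCase (chosen here only to keep the example short; a custom IEqualityComparer<CustomObject> is passed the same way):

```csharp
using System;
using System.Collections.Generic;

public static class HashSetComparerDemo
{
    // Adds each city to a case-insensitive set and returns how many were accepted.
    public static int CountAccepted(IEnumerable<string> cities)
    {
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        int accepted = 0;
        foreach (string city in cities)
        {
            if (set.Add(city)) accepted++;  // Add returns false for a duplicate
        }
        return accepted;
    }

    public static void Main()
    {
        Console.WriteLine(CountAccepted(new[] { "Paris", "PARIS", "Rome" }));  // 2
    }
}
```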
One more option that does not use .NET library methods but can be useful in a pinch is Jon Skeet's DistinctBy, of which you can see a rough implementation here. The idea is that you pass a Func<MyObject, Key> lambda expression directly and skip the overrides of Equals and GetHashCode (or the custom comparer) entirely.
var noDupes = list.DistinctBy(obj => obj.City); // NOT part of BCL
Just by implementing .Equals the way you did (which you implemented correctly), you will not prevent duplicates from being added to a List<T>. You will actually have to remove them manually.
Instead of List<CustomObject> use HashSet<CustomObject>. It will never contain duplicates.
That's because List<CustomObject> tests whether your class (CustomObject) implements IEquatable<CustomObject>, not IEquatable<IAddress> as you did.
I assume that for duplicate check you are using the Contains method, before adding a new member
To match duplicates on only a specific property you need a comparer.
class MyComparer : IEqualityComparer<CustomObject>
{
public bool Equals(CustomObject x, CustomObject y)
{
return x.City.Equals(y.City);
}
public int GetHashCode(CustomObject x)
{
return x.City.GetHashCode();
}
}
Usage:
var yourDistinctObjects = yourObjects.Distinct(new MyComparer());
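Here is a self-contained sketch of the same comparer idea with null handling added. CustomObject's shape (City, Street) and the name CityComparer are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the object being de-duplicated.
public class CustomObject
{
    public string City { get; set; }
    public string Street { get; set; }
}

// Two objects are duplicates when their City values are equal.
public class CityComparer : IEqualityComparer<CustomObject>
{
    public bool Equals(CustomObject x, CustomObject y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.City, y.City);
    }

    public int GetHashCode(CustomObject obj)
    {
        // Equal cities must produce equal hash codes; null maps to 0.
        return (obj == null || obj.City == null) ? 0 : obj.City.GetHashCode();
    }
}

public static class DistinctByCityDemo
{
    public static int CountDistinctCities(IEnumerable<CustomObject> items)
    {
        return items.Distinct(new CityComparer()).Count();
    }

    public static void Main()
    {
        var items = new List<CustomObject>
        {
            new CustomObject { City = "Oslo", Street = "A" },
            new CustomObject { City = "Oslo", Street = "B" },
            new CustomObject { City = "Rome", Street = "C" }
        };
        Console.WriteLine(CountDistinctCities(items));  // 2
    }
}
```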
Edit: Found this thread that does what you need and I think I referred to it in the past:
Remove duplicates in the list using linq
One answer that I thought was kind of interesting (but not how I had done it) was:
var distinctItems = items.GroupBy(x => x.Id).Select(y => y.First());
It's a one liner that does what you need but might not be as efficient as the other methods.
So I frequently run into this situation... where Do.Something(...) returns a null collection, like so:
int[] returnArray = Do.Something(...);
Then, I try to use this collection like so:
foreach (int i in returnArray)
{
// do some more stuff
}
I'm just curious, why can't a foreach loop operate on a null collection? It seems logical to me that 0 iterations would get executed with a null collection... instead it throws a NullReferenceException. Anyone know why this could be?
This is annoying as I'm working with APIs that aren't clear on exactly what they return, so I end up with if (someCollection != null) everywhere.
Well, the short answer is "because that's the way the compiler designers designed it." Realistically, though, your collection object is null, so there's no way for the compiler to get the enumerator to loop through the collection.
If you really need to do something like this, try the null coalescing operator:
int[] array = null;
foreach (int i in array ?? Enumerable.Empty<int>())
{
System.Console.WriteLine(string.Format("{0}", i));
}
A foreach loop calls the GetEnumerator method.
If the collection is null, this method call results in a NullReferenceException.
It is bad practice to return a null collection; your methods should return an empty collection instead.
There is a big difference between an empty collection and a null reference to a collection.
When you use foreach, it internally calls the IEnumerable's GetEnumerator() method. When the reference is null, that call raises a NullReferenceException.
However, it is perfectly valid to have an empty IEnumerable or IEnumerable<T>. In this case, foreach will not "iterate" over anything (since the collection is empty), but it will also not throw, since this is a perfectly valid scenario.
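The contrast can be checked with a short sketch. CountIterations is a hypothetical helper that reports -1 when enumeration throws.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ForeachNullDemo
{
    // Counts loop iterations; returns -1 if enumerating throws NullReferenceException.
    public static int CountIterations(IEnumerable<int> source)
    {
        try
        {
            int n = 0;
            foreach (int item in source)
            {
                n++;
            }
            return n;
        }
        catch (NullReferenceException)
        {
            return -1;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountIterations(Enumerable.Empty<int>()));  // 0
        Console.WriteLine(CountIterations(new[] { 1, 2 }));           // 2
        Console.WriteLine(CountIterations(null));                     // -1
    }
}
```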
Edit:
Personally, if you need to work around this, I'd recommend an extension method:
public static IEnumerable<T> AsNotNull<T>(this IEnumerable<T> original)
{
return original ?? Enumerable.Empty<T>();
}
You can then just call:
foreach (int i in returnArray.AsNotNull())
{
// do some more stuff
}
This was answered long ago, but I have done it the following way to avoid a null reference exception; it may be useful for someone using the C# null-conditional operator ?.
//fragments is a list which can be null
fragments?.ForEach((obj) =>
{
//do something with obj
});
Another extension method to work around this:
public static void ForEach<T>(this IEnumerable<T> items, Action<T> action)
{
if(items == null) return;
foreach (var item in items) action(item);
}
Consume in several ways:
(1) with a method that accepts T:
returnArray.ForEach(Console.WriteLine);
(2) with an expression:
returnArray.ForEach(i => UpdateStatus(string.Format("{0}% complete", i)));
(3) with a multiline anonymous method
int toCompare = 10;
returnArray.ForEach(i =>
{
var thisInt = i;
var next = i++;
if(next > toCompare) Console.WriteLine("Match: {0}", i);
});
Because a null collection is not the same thing as an empty collection. An empty collection is a collection object with no elements; a null collection is a nonexistent object.
Here's something to try: Declare two collections of any sort. Initialize one normally so that it's empty, and assign the other the value null. Then try adding an object to both collections and see what happens.
Just write an extension method to help you out:
public static class Extensions
{
public static void ForEachWithNull<T>(this IEnumerable<T> source, Action<T> action)
{
if(source == null)
{
return;
}
foreach(var item in source)
{
action(item);
}
}
}
It is the fault of Do.Something(). The best practice here would be to return an array of size 0 (that is possible) instead of a null.
Because behind the scenes the foreach acquires an enumerator, equivalent to this:
using (IEnumerator<int> enumerator = ((IEnumerable<int>)returnArray).GetEnumerator()) {
while (enumerator.MoveNext()) {
int i = enumerator.Current;
// do some more stuff
}
}
I think the answers provided here explain clearly why the exception is thrown. I just wish to add the way I usually work with these collections. Sometimes I use the collection more than once and would have to test for null every time. To avoid that, I do the following:
var returnArray = DoSomething() ?? Enumerable.Empty<int>();
foreach (int i in returnArray)
{
// do some more stuff
}
This way we can use the collection as much as we want without fearing the exception, and we don't pollute the code with excessive conditional statements.
Using the null-conditional operator ?. is also a great approach. But in the case of arrays (like the example in the question), the array must be converted to a List first, because ForEach is defined on List<T>:
int[] returnArray = DoSomething();
returnArray?.ToList().ForEach((i) =>
{
// do some more stuff
});
SPListItem item;
DataRow dr = datatable.NewRow();
dr["ID"] = (!Object.Equals(item["ID"], null)) ? item["ID"].ToString() : string.Empty;