I have two lists of an object type:
List<MyClass> list1;
List<MyClass> list2;
What is the best way (in terms of performance and clean code) to extract the differences in data between these two lists?
I mean getting the objects that were added, deleted, or changed (and what the change was).
Try Except with Union, but you'll need to do it for both in order to find differences in both.
var exceptions = list1.Except(list2).Union(list2.Except(list1)).ToList();
Or, as an alternative to LINQ, HashSet<T>.SymmetricExceptWith() may be considerably faster:
var exceptions = new HashSet<MyClass>(list1);
exceptions.SymmetricExceptWith(list2);
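A minimal sketch of the two approaches side by side, assuming the element type already has value equality (plain int here; for MyClass you would need Equals/GetHashCode overrides or an IEqualityComparer<MyClass>). The class name SymmetricDifferenceDemo and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SymmetricDifferenceDemo
{
    // LINQ version: items that occur in exactly one of the two lists.
    public static List<int> WithLinq(List<int> list1, List<int> list2)
    {
        return list1.Except(list2).Union(list2.Except(list1)).ToList();
    }

    // HashSet version: same set of items, computed in place on a copy of list1.
    public static HashSet<int> WithHashSet(List<int> list1, List<int> list2)
    {
        var result = new HashSet<int>(list1);
        result.SymmetricExceptWith(list2);
        return result;
    }

    public static void Main()
    {
        var list1 = new List<int> { 1, 2, 3, 4 };
        var list2 = new List<int> { 3, 4, 5 };

        Console.WriteLine(string.Join(", ", WithLinq(list1, list2)));
        Console.WriteLine(string.Join(", ", WithHashSet(list1, list2).OrderBy(x => x)));
    }
}
```

Both calls yield the items 1, 2 and 5. Note that either approach only tells you which items differ, not which list each one came from.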
IEnumerable<string> differenceQuery = list1.Except(list2);
http://msdn.microsoft.com/en-us/library/bb397894.aspx
You may use FindAll to get the result you want, even if MyClass does not implement IEquatable or IComparable. Here is one example:
List<MyClass> interestedList = list1.FindAll(delegate(MyClass item1) {
    // Look for an item in list2 that matches item1 on the properties you care about.
    MyClass found = list2.Find(delegate(MyClass item2) {
        return item2.propertyA == item1.propertyA ...;
    });
    return found != null;
});
In the same way, you can get your interested items from list2 by comparing to list1.
This strategy may get your "changed" items as well.
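As a rough sketch of how added, removed and changed items could be separated, assuming each object carries a unique key: the Item class, its Id and Name properties, and the ListDiff helper are all hypothetical stand-ins for MyClass, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MyClass: Id identifies the object, Name is its data.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ListDiff
{
    // Items whose Id appears in newList but not in oldList.
    public static List<Item> Added(List<Item> oldList, List<Item> newList)
    {
        var oldIds = new HashSet<int>(oldList.Select(i => i.Id));
        return newList.Where(i => !oldIds.Contains(i.Id)).ToList();
    }

    // Items whose Id appears in oldList but not in newList.
    public static List<Item> Removed(List<Item> oldList, List<Item> newList)
    {
        return Added(newList, oldList);
    }

    // Pairs (old, new) that share an Id but differ in Name.
    public static List<KeyValuePair<Item, Item>> Changed(List<Item> oldList, List<Item> newList)
    {
        // Assumes Ids are unique within oldList; ToDictionary throws otherwise.
        var oldById = oldList.ToDictionary(i => i.Id);
        var changed = new List<KeyValuePair<Item, Item>>();
        foreach (var item in newList)
        {
            Item before;
            if (oldById.TryGetValue(item.Id, out before) && before.Name != item.Name)
            {
                changed.Add(new KeyValuePair<Item, Item>(before, item));
            }
        }
        return changed;
    }

    public static void Main()
    {
        var oldList = new List<Item> { new Item { Id = 1, Name = "a" }, new Item { Id = 2, Name = "b" } };
        var newList = new List<Item> { new Item { Id = 2, Name = "B" }, new Item { Id = 3, Name = "c" } };

        Console.WriteLine("Added: " + Added(oldList, newList).Count);
        Console.WriteLine("Removed: " + Removed(oldList, newList).Count);
        Console.WriteLine("Changed: " + Changed(oldList, newList).Count);
    }
}
```

Using a HashSet and a Dictionary keeps each pass roughly linear, whereas nesting Find inside FindAll compares every item of one list against every item of the other.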
One way to get items that are either in list1 or in list2 but not in both would be:
var common = list1.Intersect(list2);
var exceptions = list1.Except(common).Concat(list2.Except(common));
Try this to compare two objects property by property, and call it in a loop for a List<T>:
public static void GetPropertyChanges<T>(this T oldObj, T newObj)
{
Type type = typeof(T);
foreach (System.Reflection.PropertyInfo pi in type.GetProperties(System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.Instance))
{
// pi already describes the property, so there is no need to look it up again by name.
object selfValue = pi.GetValue(oldObj, null);
object toValue = pi.GetValue(newObj, null);
if (selfValue != null && toValue != null)
{
if (selfValue.ToString() != toValue.ToString())
{
//do your code
}
}
}
}
I have two lists of type Link:
Link
{
Title;
url;
}
The two lists are lst1 and lst2, both of type List<Link>.
I want the elements that are in lst2 but not in lst1.
How can I do that using a lambda expression?
I don't want to use a for loop.
For reference comparison:
list2.Except(list1);
For value comparison you can use:
list2.Where(el2 => !list1.Any(el1 => el1.Title == el2.Title && el1.url == el2.url));
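The Where/Any form above compares every element of list2 against every element of list1. For larger lists, a hash-based lookup may be preferable. The sketch below assumes the Link class from the question has string properties Title and url; the LinkDiff helper and its method name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mirrors the Link type from the question (property types are assumed to be string).
public class Link
{
    public string Title { get; set; }
    public string url { get; set; }
}

public static class LinkDiff
{
    // Returns the links of list2 whose (Title, url) pair does not occur in list1.
    public static List<Link> OnlyInSecond(List<Link> list1, List<Link> list2)
    {
        // Tuple<,> compares its components by value, so it can serve as a set key.
        var seen = new HashSet<Tuple<string, string>>(
            list1.Select(l => Tuple.Create(l.Title, l.url)));
        return list2.Where(l => !seen.Contains(Tuple.Create(l.Title, l.url))).ToList();
    }

    public static void Main()
    {
        var list1 = new List<Link> { new Link { Title = "A", url = "http://a" } };
        var list2 = new List<Link>
        {
            new Link { Title = "A", url = "http://a" },
            new Link { Title = "B", url = "http://b" }
        };

        foreach (Link link in OnlyInSecond(list1, list2))
            Console.WriteLine(link.Title + " " + link.url);
    }
}
```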
In set operations what you are looking for is
a union minus the intersect
so
(list1 union list2) except (list1 intersect list2)
check out this link for linq set operations
https://msdn.microsoft.com/en-us/library/bb546153.aspx
class CompareLists
{
static void Main()
{
// Create the IEnumerable data sources.
string[] names1 = System.IO.File.ReadAllLines(@"../../../names1.txt");
string[] names2 = System.IO.File.ReadAllLines(@"../../../names2.txt");
// Create the query. Note that method syntax must be used here.
IEnumerable<string> differenceQuery =
names1.Except(names2);
// Execute the query.
Console.WriteLine("The following lines are in names1.txt but not names2.txt");
foreach (string s in differenceQuery)
Console.WriteLine(s);
// Keep the console window open in debug mode.
Console.WriteLine("Press any key to exit");
Console.ReadKey();
}
}
/* Output:
The following lines are in names1.txt but not names2.txt
Potra, Cristina
Noriega, Fabricio
Aw, Kam Foo
Toyoshima, Tim
Guy, Wey Yuan
Garcia, Debra
*/
EDIT
Using Except is exactly the right way to go. If your type overrides Equals and GetHashCode, or you're only interested in reference type equality (i.e. two references are only "equal" if they refer to the exact same object), you can just use:
var list3 = list1.Except(list2).ToList();
If you need to express a custom idea of equality, e.g. by ID, you'll need to implement IEqualityComparer. For example:
public class IdComparer : IEqualityComparer<CustomObject>
{
public int GetHashCode(CustomObject co)
{
if (co == null)
{
return 0;
}
return co.Id.GetHashCode();
}
public bool Equals(CustomObject x1, CustomObject x2)
{
if (object.ReferenceEquals(x1, x2))
{
return true;
}
if (object.ReferenceEquals(x1, null) ||
object.ReferenceEquals(x2, null))
{
return false;
}
return x1.Id == x2.Id;
}
}
Then use:
var list3 = list1.Except(list2, new IdComparer()).ToList();
EDIT: As noted in comments, this will remove any duplicate elements; if you need duplicates to be preserved, let us know... it would probably be easiest to create a set from list2 and use something like:
var list3 = list1.Where(x => !set2.Contains(x)).ToList();
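A small sketch of the difference between the two forms, using int elements for brevity. Here set2 is simply a HashSet built from list2; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicatePreservingDiff
{
    // Keeps every occurrence in list1 that has no equal element in list2.
    public static List<int> Difference(List<int> list1, List<int> list2)
    {
        var set2 = new HashSet<int>(list2);
        return list1.Where(x => !set2.Contains(x)).ToList();
    }

    public static void Main()
    {
        var list1 = new List<int> { 1, 1, 2, 3, 3 };
        var list2 = new List<int> { 2 };

        // Except behaves like a set operation and collapses duplicates.
        Console.WriteLine(string.Join(", ", list1.Except(list2)));
        // The Where/Contains form preserves them.
        Console.WriteLine(string.Join(", ", Difference(list1, list2)));
    }
}
```

For a custom type, the HashSet constructor also accepts an IEqualityComparer<T>, so the IdComparer shown earlier could be passed in as a second argument.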
Difference between two lists
I have an object.
This object is actually some kind of item container (I don't know which kind, but I can check).
Is there any code that can tell me how many items it contains?
For example:
object[] arrObj = new object[2] {1, 2};
object o = (object)arrObj;
In this case arrObj is an array so I can check:
((Array)o).Length //2
But what if I have those 2 others ?
ArrayList al = new ArrayList(2);
al.Add(1);
al.Add(2);
object o = (object)al ;
and
List<object> lst= new List<object>(2);
object o = (object)lst;
Is there any general code that can tell me how many items are in this cast object (o in these samples)?
Of course I can check if (o is ...) { }, but I'm looking for more general code.
You can cast to the interface every container implements: IEnumerable. However, for better performance, it is a good idea to first try IEnumerable<T>:
var count = -1;
// o is the object from the question whose element count we want.
var enumerable = o as IEnumerable<object>;
if (enumerable != null)
    count = enumerable.Count();
else
{
    var nonGenericEnumerable = o as IEnumerable;
    // count stays -1 if o is not a container at all.
    if (nonGenericEnumerable != null)
        count = nonGenericEnumerable.Cast<object>().Count();
}
For Count() to be available, you need to add using System.Linq; to your .cs file.
Please note that this code has one big advantage: If the collection implements ICollection<T> - like List<T> or strongly typed arrays of reference types - this code executes in O(1) [assuming the concrete implementation of ICollection<T>.Count executes in O(1)]. Only if it doesn't - like ArrayList or strongly typed arrays of value types - does this code execute in O(n), and additionally it will box the items in the case of an array of value types.
You could use linq.
var count = ((IEnumerable)o).Cast<object>().Count();
Make sure that the type of o implements IEnumerable and that you have using System.Linq at the top of your file.
Well, the most basic interface it could implement would be IEnumerable. Unfortunately, even Enumerable.Count from LINQ is only implemented for IEnumerable<T>, but you could easily write your own:
public static int Count(IEnumerable sequence)
{
// Shortcut for any ICollection implementation
var collection = sequence as ICollection;
if (collection != null)
{
return collection.Count;
}
var iterator = sequence.GetEnumerator();
try
{
int count = 0;
while (iterator.MoveNext())
{
count++;
}
return count;
}
finally
{
IDisposable disposable = iterator as IDisposable;
if (disposable != null)
{
disposable.Dispose();
}
}
}
Note that this is basically equivalent to:
int count = 0;
foreach (object item in sequence)
{
count++;
}
... except that because it never uses Current, it wouldn't need to do any boxing if your container was actually an int[] for example.
Call it with:
var sequence = container as IEnumerable;
if (sequence != null)
{
int count = Count(sequence);
// Use the count
}
It's worth noting that avoiding boxing really is a bit of a micro-optimization: it's unlikely to really be significant. But you can do it once, just in this method, and then take advantage of it everywhere.
For my function
IEnumerable<CallbackListRecord> LoadOpenListToProcess(CallbackSearchParams usp);
This line errors when the sequence contains no elements (as it should)
CallbackListRecord nextRecord = CallbackSearch.LoadOpenListToProcess(p).First();
I have changed it to the following
CallbackListRecord nextRecord = null;
IEnumerable<CallbackListRecord> nextRecords = CallbackSearch.LoadOpenListToProcess(p);
if (nextRecords.Any())
{
nextRecord = nextRecords.First();
}
Are there better, easier or more elegant ways to determine if the IEnumerable sequence has no elements?
You should try to avoid enumerating it more times than necessary (even if short-circuited, like First and Any) - how about:
var nextRecord = CallbackSearch.LoadOpenListToProcess(p).FirstOrDefault();
if(nextRecord != null) {
// process it...
}
This works well with classes (since you can just compare the reference to null).
You can shorten the code to the following
var nextrecord = CallbackSearch.LoadOpenListToProcess(p).FirstOrDefault();
nextrecord will either contain the First element if there was one or null if the collection was empty.
If you are anticipating that there could be null values in the sequence, you could handle the enumerator yourself.
// The using block makes sure the enumerator is disposed afterwards.
using (var enumerator = CallbackSearch.LoadOpenListToProcess(p).GetEnumerator())
{
    if (enumerator.MoveNext()) {
        var item = enumerator.Current;
        ...
    }
}
You could add an extension method like this:
public static class Extensions
{
public static bool HasElements<T>(this IEnumerable<T> collection)
{
foreach (T t in collection)
return true;
return false;
}
}
I often want to grab the first element of an IEnumerable<T> in .net, and I haven't found a nice way to do it. The best I've come up with is:
foreach(Elem e in enumerable) {
// do something with e
break;
}
Yuck! So, is there a nice way to do this?
If you can use LINQ you can use:
var e = enumerable.First();
This will throw an exception though if enumerable is empty: in which case you can use:
var e = enumerable.FirstOrDefault();
FirstOrDefault() will return default(T) if the enumerable is empty, which will be null for reference types or the default 'zero-value' for value types.
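For value types this means an empty sequence and a sequence that starts with the zero value are indistinguishable. One possible workaround, sketched here under the assumption that the elements are int, is to project to a nullable type first; FirstOrNull is a hypothetical helper name, not a framework method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstOrDefaultDemo
{
    // Returns the first element, or null when the sequence is empty.
    public static int? FirstOrNull(IEnumerable<int> source)
    {
        return source.Select(x => (int?)x).FirstOrDefault();
    }

    public static void Main()
    {
        var empty = new int[0];
        var startsWithZero = new[] { 0, 1, 2 };

        // Both lines print 0, so the two cases cannot be told apart.
        Console.WriteLine(empty.FirstOrDefault());
        Console.WriteLine(startsWithZero.FirstOrDefault());

        // With the nullable projection, only the non-empty sequence has a value.
        Console.WriteLine(FirstOrNull(empty).HasValue);
        Console.WriteLine(FirstOrNull(startsWithZero).HasValue);
    }
}
```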
If you can't use LINQ, then your approach is technically correct and no different than creating an enumerator using the GetEnumerator and MoveNext methods to retrieve the first result (this example assumes enumerable is an IEnumerable<Elem>):
Elem e = myDefault;
using (IEnumerator<Elem> enumer = enumerable.GetEnumerator()) {
if (enumer.MoveNext()) e = enumer.Current;
}
Joel Coehoorn mentioned .Single() in the comments; this will also work, if you are expecting your enumerable to contain exactly one element - however it will throw an exception if it is either empty or larger than one element. There is a corresponding SingleOrDefault() method that covers this scenario in a similar fashion to FirstOrDefault(). However, David B explains that SingleOrDefault() may still throw an exception in the case where the enumerable contains more than one item.
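The three cases for SingleOrDefault() can be checked with a short sketch; the SingleDemo class and its Describe method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleDemo
{
    // Reports what SingleOrDefault does for the given sequence.
    public static string Describe(IEnumerable<string> source)
    {
        try
        {
            string value = source.SingleOrDefault();
            return value == null ? "default (null)" : "value: " + value;
        }
        catch (InvalidOperationException)
        {
            // Raised when the sequence holds more than one element.
            return "threw InvalidOperationException";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new string[0]));
        Console.WriteLine(Describe(new[] { "a" }));
        Console.WriteLine(Describe(new[] { "a", "b" }));
    }
}
```

Single() differs only in the empty case, where it throws instead of returning the default value.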
Edit: Thanks Marc Gravell for pointing out that I need to dispose of my IEnumerator object after using it - I've edited the non-LINQ example to display the using keyword to implement this pattern.
Just in case you're using .NET 2.0 and don't have access to LINQ:
static T First<T>(IEnumerable<T> items)
{
using(IEnumerator<T> iter = items.GetEnumerator())
{
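// MoveNext returns false for an empty sequence; fail loudly rather than return an undefined Current.
if (!iter.MoveNext())
throw new InvalidOperationException("Sequence contains no elements");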
return iter.Current;
}
}
This should do what you're looking for. It uses generics, so you can get the first item of any IEnumerable<T>.
Call it like so:
List<string> items = new List<string>() { "A", "B", "C", "D", "E" };
string firstItem = First<string>(items);
Or
int[] items = new int[] { 1, 2, 3, 4, 5 };
int firstItem = First<int>(items);
You could modify it readily enough to mimic .NET 3.5's IEnumerable.ElementAt() extension method:
static T ElementAt<T>(IEnumerable<T> items, int index)
{
using(IEnumerator<T> iter = items.GetEnumerator())
{
for (int i = 0; i <= index; i++, iter.MoveNext()) ;
return iter.Current;
}
}
Calling it like so:
int[] items = { 1, 2, 3, 4, 5 };
int elemIdx = 3;
int item = ElementAt<int>(items, elemIdx);
Of course if you do have access to LINQ, then there are plenty of good answers posted already...
Well, you didn't specify which version of .Net you're using.
Assuming you have 3.5, another way is the ElementAt method:
var e = enumerable.ElementAt(0);
FirstOrDefault ?
Elem e = enumerable.FirstOrDefault();
//do something with e
Try this
IEnumerable<string> aa;
string a = (from t in aa where t.Equals("") select t).ToArray()[0];
Use FirstOrDefault or a foreach loop as already mentioned. Manually fetching an enumerator and calling Current should be avoided. foreach will dispose your enumerator for you if it implements IDisposable. When calling MoveNext and Current yourself, you have to dispose it manually (if applicable).
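The disposal behavior can be made visible with an iterator whose finally block only runs when its enumerator is disposed. This is an illustrative sketch; DisposeDemo, Numbers and the Cleaned flag are invented names.

```csharp
using System;
using System.Collections.Generic;

public static class DisposeDemo
{
    public static bool Cleaned;

    // Iterator whose finally block runs when the enumerator is disposed
    // (or when iteration runs to completion).
    public static IEnumerable<int> Numbers()
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            Cleaned = true;
        }
    }

    public static int FirstWithForeach()
    {
        Cleaned = false;
        foreach (int n in Numbers())
        {
            return n; // leaving the loop early still disposes the enumerator
        }
        throw new InvalidOperationException("Sequence contains no elements");
    }

    public static int FirstWithoutDispose()
    {
        Cleaned = false;
        IEnumerator<int> e = Numbers().GetEnumerator();
        e.MoveNext();
        return e.Current; // the enumerator is never disposed here
    }

    public static void Main()
    {
        FirstWithForeach();
        Console.WriteLine("foreach cleaned up: " + Cleaned);
        FirstWithoutDispose();
        Console.WriteLine("manual enumerator cleaned up: " + Cleaned);
    }
}
```

In real code the finally block might close a file or a database reader, which is why the undisposed variant can leak resources.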
If your IEnumerable doesn't expose its <T> and LINQ fails, you can write a method using reflection:
public static T GetEnumeratedItem<T>(Object items, int index) where T : class
{
T item = null;
if (items != null)
{
System.Reflection.MethodInfo mi = items.GetType()
.GetMethod("GetEnumerator");
if (mi != null)
{
object o = mi.Invoke(items, null);
if (o != null)
{
System.Reflection.MethodInfo mn = o.GetType()
.GetMethod("MoveNext");
if (mn != null)
{
object next = mn.Invoke(o, null);
while (next != null && next.ToString() == "True")
{
if (index < 1)
{
System.Reflection.PropertyInfo pi = o
.GetType().GetProperty("Current");
if (pi != null) item = pi
.GetValue(o, null) as T;
break;
}
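index--;
// Advance the enumerator; without this call the loop would never terminate.
next = mn.Invoke(o, null);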
}
}
}
}
}
return item;
}
You can also try the more general version, which gives you the ith element:
enumerable.ElementAtOrDefault(i);
hope it helps
What's the "best" (taking both speed and readability into account) way to determine if a list is empty? Even if the list is of type IEnumerable<T> and doesn't have a Count property.
Right now I'm tossing up between this:
if (myList.Count() == 0) { ... }
and this:
if (!myList.Any()) { ... }
My guess is that the second option is faster, since it'll come back with a result as soon as it sees the first item, whereas the first option (for an IEnumerable) will need to visit every item to return the count.
That being said, does the second option look as readable to you? Which would you prefer? Or can you think of a better way to test for an empty list?
Edit: @lassevk's response seems to be the most logical, coupled with a bit of runtime checking to use a cached count if possible, like this:
public static bool IsEmpty<T>(this IEnumerable<T> list)
{
if (list is ICollection<T>) return ((ICollection<T>)list).Count == 0;
return !list.Any();
}
You could do this:
public static Boolean IsEmpty<T>(this IEnumerable<T> source)
{
if (source == null)
return true; // or throw an exception
return !source.Any();
}
Edit: Note that simply using the .Count method will be fast if the underlying source actually has a fast Count property. A valid optimization above would be to detect a few base types and simply use the .Count property of those, instead of the .Any() approach, but then fall back to .Any() if no guarantee can be made.
I would make one small addition to the code you seem to have settled on: check also for ICollection, as this is implemented even by some non-obsolete generic classes as well (i.e., Queue<T> and Stack<T>). I would also use as instead of is as it's more idiomatic and has been shown to be faster.
public static bool IsEmpty<T>(this IEnumerable<T> list)
{
if (list == null)
{
throw new ArgumentNullException("list");
}
var genericCollection = list as ICollection<T>;
if (genericCollection != null)
{
return genericCollection.Count == 0;
}
var nonGenericCollection = list as ICollection;
if (nonGenericCollection != null)
{
return nonGenericCollection.Count == 0;
}
return !list.Any();
}
LINQ itself must be doing some serious optimization around the Count() method somehow.
Does this surprise you? I imagine that for IList implementations, Count simply reads the number of elements directly while Any has to query the IEnumerable.GetEnumerator method, create an instance and call MoveNext at least once.
/EDIT @Matt:
I can only assume that the Count() extension method for IEnumerable is doing something like this:
Yes, of course it does. This is what I meant. Actually, it uses ICollection instead of IList but the result is the same.
I just wrote up a quick test, try this:
IEnumerable<Object> myList = new List<Object>();
Stopwatch watch = new Stopwatch();
int x;
watch.Start();
for (var i = 0; i <= 1000000; i++)
{
if (myList.Count() == 0) x = i;
}
watch.Stop();
Stopwatch watch2 = new Stopwatch();
watch2.Start();
for (var i = 0; i <= 1000000; i++)
{
if (!myList.Any()) x = i;
}
watch2.Stop();
Console.WriteLine("myList.Count() = " + watch.ElapsedMilliseconds.ToString());
Console.WriteLine("myList.Any() = " + watch2.ElapsedMilliseconds.ToString());
Console.ReadLine();
The second is almost three times slower :)
Trying the stopwatch test again with a Stack, an array, or other collection types, the result seems to depend on the type of list, because those show Count to be slower.
So I guess it depends on the type of list you're using!
(Just to point out, I put 2000+ objects in the List and count was still faster, opposite with other types)
List.Count is O(1) according to Microsoft's documentation:
http://msdn.microsoft.com/en-us/library/27b47ht3.aspx
So just use List.Count == 0; it's much faster than a query.
This is because it has a data member called Count which is updated any time something is added or removed from the list, so when you call List.Count it doesn't have to iterate through every element to get it, it just returns the data member.
The second option is much quicker if you have multiple items.
Any() returns as soon as 1 item is found.
Count() has to keep going through the entire list.
For instance suppose the enumeration had 1000 items.
Any() would check the first one, then return true.
Count() would return 1000 after traversing the entire enumeration.
This is potentially worse if you use one of the predicate overloads - Count() still has to check every single item, even if there is only one match.
You get used to using the Any one - it does make sense and is readable.
One caveat - if you have a List, rather than just an IEnumerable then use that list's Count property.
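The difference is easy to observe with a lazy sequence that counts how many items it actually produces. This is a sketch for illustration; AnyVsCountDemo, Tracked and Visited are invented names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVsCountDemo
{
    public static int Visited;

    // Lazy sequence that records how many items were actually produced.
    public static IEnumerable<int> Tracked(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Visited++;
            yield return i;
        }
    }

    // Number of items Any() pulls from a lazy sequence of the given length.
    public static int VisitsForAny(int count)
    {
        Visited = 0;
        Tracked(count).Any();
        return Visited;
    }

    // Number of items Count() pulls from a lazy sequence of the given length.
    public static int VisitsForCount(int count)
    {
        Visited = 0;
        Tracked(count).Count();
        return Visited;
    }

    public static void Main()
    {
        Console.WriteLine("Any visited: " + VisitsForAny(1000));
        Console.WriteLine("Count visited: " + VisitsForCount(1000));
    }
}
```

This only applies to sequences without a stored count, such as iterators or query results; for a List<T>, the Count property is a simple field read.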
@Konrad what surprises me is that in my tests, I'm passing the list into a method that accepts IEnumerable<T>, so the runtime can't optimize it by calling the Count() extension method for IList<T>.
I can only assume that the Count() extension method for IEnumerable is doing something like this:
public static int Count<T>(this IEnumerable<T> list)
{
if (list is IList<T>) return ((IList<T>)list).Count;
int i = 0;
foreach (var t in list) i++;
return i;
}
... in other words, a bit of runtime optimization for the special case of IList<T>.
/EDIT @Konrad +1 mate - you're right about it more likely being on ICollection<T>.
Ok, so what about this one?
public static bool IsEmpty<T>(this IEnumerable<T> enumerable)
{
// Dispose the enumerator once the check is done.
using (var enumerator = enumerable.GetEnumerator())
{
return !enumerator.MoveNext();
}
}
EDIT: I've just realized that someone has sketched this solution already. It was mentioned that the Any() method will do this, but why not do it yourself? Regards
Another idea:
if(enumerable.FirstOrDefault() != null)
However I like the Any() approach more.
This was critical to get this to work with Entity Framework:
var genericCollection = list as ICollection<T>;
if (genericCollection != null)
{
//your code
}
If I check with Count(), LINQ executes a "SELECT COUNT(*) ..." in the database, but I only need to know whether the result contains any data, so I switched to FirstOrDefault() instead of Count():
Before
var cfop = from tabelaCFOPs in ERPDAOManager.GetTable<TabelaCFOPs>() select tabelaCFOPs;
if (cfop.Count() > 0)
{
var itemCfop = cfop.First();
//....
}
After
var cfop = from tabelaCFOPs in ERPDAOManager.GetTable<TabelaCFOPs>() select tabelaCFOPs;
var itemCfop = cfop.FirstOrDefault();
if (itemCfop != null)
{
//....
}
private bool NullTest<T>(T[] list, string attribute)
{
bool status = false;
if (list != null)
{
int flag = 0;
var property = GetProperty(list.FirstOrDefault(), attribute);
foreach (T obj in list)
{
if (property.GetValue(obj, null) == null)
flag++;
}
status = flag == 0;
}
return status;
}
public PropertyInfo GetProperty<T>(T obj, string str)
{
Expression<Func<T, string, PropertyInfo>> GetProperty = (TypeObj, Column) => TypeObj.GetType().GetProperty(TypeObj
.GetType().GetProperties().ToList()
.Find(property => property.Name
.ToLower() == Column
.ToLower()).Name.ToString());
return GetProperty.Compile()(obj, str);
}
Here's my implementation of Dan Tao's answer, allowing for a predicate:
public static bool IsEmpty<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
if (source == null) throw new ArgumentNullException();
if (IsCollectionAndEmpty(source)) return true;
return !source.Any(predicate);
}
public static bool IsEmpty<TSource>(this IEnumerable<TSource> source)
{
if (source == null) throw new ArgumentNullException();
if (IsCollectionAndEmpty(source)) return true;
return !source.Any();
}
private static bool IsCollectionAndEmpty<TSource>(IEnumerable<TSource> source)
{
var genericCollection = source as ICollection<TSource>;
if (genericCollection != null) return genericCollection.Count == 0;
var nonGenericCollection = source as ICollection;
if (nonGenericCollection != null) return nonGenericCollection.Count == 0;
return false;
}
List<T> li = new List<T>();
(li.First().DefaultValue.HasValue) ? string.Format("{0:yyyy/MM/dd}", sender.First().DefaultValue.Value) : string.Empty;
myList.ToList().Count == 0. That's all
This extension method works for me:
public static bool IsEmpty<T>(this IEnumerable<T> enumerable)
{
try
{
enumerable.First();
return false;
}
catch (InvalidOperationException)
{
return true;
}
}