Checking two lists for Modifications - c#

I have two lists of two different types that have the following common properties:
Id -->used to identify corresponding objects
Bunch of other Properties
ModificationDate
I need to compare these two lists based on the modification date. If the dates differ (the first list's ModificationDate is greater than the second list's ModificationDate), then copy all the properties of that item from the first list to the second.
Please let me know the best way to do this.
EDITED: The second list may or may not contain all elements of the first, and vice versa. My first list is always the source list, so if an item is present in list 1 and not in list 2, we need to add it to list 2. Also, if an item is present in list 2 but not in list 1, remove it from list 2.

Finding added/deleted items
var list1 = new List<MyType>();
var list2 = new List<MyType>();
// These two assume MyType : IEquatable<MyType>
var added = list1.Except(list2);
var deleted = list2.Except(list1);
// Now add "added" to list2, remove "deleted" from list2
If MyType does not implement IEquatable<MyType>, or the implementation is not based solely on comparing ids, you will need to create an IEqualityComparer<MyType>:
class MyTypeIdComparer : IEqualityComparer<MyType>
{
public bool Equals(MyType x, MyType y)
{
return x.Id.Equals(y.Id); // Equals must return bool; CompareTo returns an int
}
public int GetHashCode(MyType obj)
{
return obj.Id.GetHashCode();
}
}
Which will allow you to do:
// This does not assume so much for MyType
var comparer = new MyTypeIdComparer();
var added = list1.Except(list2, comparer);
var deleted = list2.Except(list1, comparer);
Finding modified items
var modified = list1.Concat(list2)
.GroupBy(item => item.Id)
.Where(g => g.Select(item => item.ModificationDate)
.Distinct().Count() != 1);
// To propagate the modifications:
foreach(var grp in modified) {
var items = grp.OrderBy(item => item.ModificationDate).ToList();
var target = items.First(); // earliest modification date = old
var source = items.Last(); // latest modification date = new (taken from the ordered items, not the raw group)
// And now copy properties from source to target
}
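Since the question says the two lists hold different types and list 1 is always the source, the three steps (remove, add, update) can also be done in one pass with a dictionary keyed by Id. This is only a minimal sketch: SourceItem, TargetItem, ListSync and the single Name property are hypothetical stand-ins for your real types, and it assumes Ids are unique within each list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two different element types.
public class SourceItem
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime ModificationDate { get; set; }
}

public class TargetItem
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime ModificationDate { get; set; }
}

public static class ListSync
{
    // Makes target mirror source: removes orphans, adds missing items,
    // and overwrites items whose source copy is newer.
    public static void Sync(List<SourceItem> source, List<TargetItem> target)
    {
        var sourceIds = new HashSet<int>(source.Select(s => s.Id));

        // 1. Remove target items that no longer exist in the source.
        target.RemoveAll(t => !sourceIds.Contains(t.Id));

        // 2. Index the remaining target items by Id (assumes unique Ids).
        var targetById = target.ToDictionary(t => t.Id);

        foreach (var s in source)
        {
            if (!targetById.TryGetValue(s.Id, out var t))
            {
                // 3. Present only in the source: add a new target item.
                t = new TargetItem { Id = s.Id };
                target.Add(t);
                targetById.Add(s.Id, t);
                Copy(s, t);
            }
            else if (s.ModificationDate > t.ModificationDate)
            {
                // 4. Present in both and the source is newer: copy properties.
                Copy(s, t);
            }
        }
    }

    // Extend this with the rest of your shared properties.
    private static void Copy(SourceItem s, TargetItem t)
    {
        t.Name = s.Name;
        t.ModificationDate = s.ModificationDate;
    }
}
```

Calling ListSync.Sync(list1, list2) mutates list2 in place. Items in list2 that are newer than their list1 counterpart are left alone, which matches the "first list greater than second" rule in the question.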

This might help. LINQ has lots of useful methods, such as Except and Intersect.
http://msdn.microsoft.com/en-us/library/bb397894.aspx

The provided link was helpful in comparing two lists of different types
Comparing Collections in .Net

Related

Update a property field in a List

I have a List<Map> and I wanted to update the Map.Target property based from a matching value from another List<Map>.
Basically, the logic is:
If mapsList1.Name is equal to mapsList2.Name
Then mapsList1.Target = mapsList2.Name
The structure of the Map class looks like this:
public class Map {
public Guid Id { get; set; }
public string Name { get; set; }
public string Target { get; set; }
}
I tried the following but obviously it's not working:
List<Map> mapsList1 = new List<Map>();
List<Map> mapsList2 = new List<Map>();
// populate the 2 lists here
mapsList1.Where(m1 => mapsList2.Where(m2 => m1.Name == m2.Name) ) // don't know what to do next
The count of items in list 1 will be always greater than or equal to the count of items in list 2. No duplicates in both lists.
Assuming there are a small number of items in the lists and only one item in list 1 that matches:
list2.ForEach(l2m => list1.First(l1m => l1m.Name == l2m.Name).Target = l2m.Target);
If there is more than one item in list1 that must be updated, enumerate the entire list1, doing a FirstOrDefault on list2:
list1.ForEach(l1m => l1m.Target = list2.FirstOrDefault(l2m => l1m.Name == l2m.Name)?.Target ?? l1m.Target);
If there are a large number of items in list2, turn it into a dictionary
var d = list2.ToDictionary(m => m.Name);
list1.ForEach(m => m.Target = d.ContainsKey(m.Name) ? d[m.Name].Target : m.Target);
(Presumably list2 doesn't contain any repeated names)
If list1's names are unique and everything in list2 is in list1, you could even turn list1 into a dictionary and enumerate list2:
var d=list1.ToDictionary(m => m.Name);
list2.ForEach(m => d[m.Name].Target = m.Target);
If list2 has entries that are not in list1, or list1 has duplicate names, you could use a Lookup instead. You'd just have to take care to avoid the "Collection was modified; enumeration operation may not execute" exception, which you'd get if you tried to modify a collection while enumerating what the lookup returns for a name.
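The Lookup idea could look roughly like the sketch below. MapUpdater and ApplyTargets are names made up for this example; the Map class is the one from the question. Because the loop only sets a property on the matched objects and never adds to or removes from a list, the "collection was modified" problem should not come up here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Map
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Target { get; set; }
}

public static class MapUpdater
{
    // Copies Target from each list2 entry onto every list1 entry with the same Name.
    public static void ApplyTargets(List<Map> list1, List<Map> list2)
    {
        // ToLookup tolerates duplicate names, and indexing a name that is
        // not present yields an empty sequence instead of throwing.
        ILookup<string, Map> byName = list1.ToLookup(m => m.Name);

        foreach (var m2 in list2)
        {
            foreach (var m1 in byName[m2.Name])
            {
                m1.Target = m2.Target;
            }
        }
    }
}
```

Entries in list1 with no match in list2 keep their existing Target.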
mapsList1.Where(m1 => mapsList2.Where(m2 => m1.Name == m2.Name) ) // don't know what to do next
LINQ Where doesn't really work like that / that's not a statement in itself. The m1 is the entry from list1, and the inner Where would produce an enumerable of list 2 items, but it doesn't result in the Boolean the outer Where is expecting, nor can you do anything to either of the sequences because LINQ operations are not supposed to have side effects. The only thing you can do with a Where is capture or use the sequence it returns in some other operation (like enumerating it), so Where isn't really something you'd use for this operation unless you use it to find all the objects you need to alter. It's probably worth pointing out that ForEach is a list thing, not a LINQ thing, and is basically just another way of writing foreach(var item in someList)
If the collections are big enough, a better approach would be to create a dictionary to look up the targets:
List<Map> mapsList1 = new List<Map>();
List<Map> mapsList2 = new List<Map>();
var dict = mapsList2
.GroupBy(map => map.Name)
.ToDictionary(maps => maps.Key, maps => maps.First().Target);
foreach (var map in mapsList1)
{
if (dict.TryGetValue(map.Name, out var target))
{
map.Target = target;
}
}
Note that if mapsList2 contains duplicate names, only the first entry for each name is used.

Merge C# list elements based on attribute of the object

I have a list in C#: a List<User>, where the User object has a few properties such as username and age.
The list contains duplicate entities (each appearing only twice) according to the username. Even though the usernames are the same, the other attributes are not.
How can I merge those elements and remove the duplicates from the list?
P.S.: Even though there are duplicate entities according to the username, the other attributes are empty in one element, while the other element has the values for those attributes.
var duplicates = Users
.GroupBy(u => u.UserName)
.Where(g => g.Count() > 1)
.ToList();
Each member is now an IEnumerable with the same UserName
foreach(var duplicate in duplicates)
{
// write some logic to combine >= 2 Users
// and remove all but 1 from original Users
// a rough idea:
var main = duplicate.First();
foreach(var user in duplicate.Skip(1))
{
// merge user with main
....
toDeleteList.Add(user);
}
}
You can use an IEqualityComparer:
internal class UserEqualChecker : IEqualityComparer<User>
{
public bool Equals(User x, User y)
{
//Code for what makes them equal
//for instance
return x.UserName.Equals(y.UserName, System.StringComparison.OrdinalIgnoreCase);
}
//.....
}
And then...
var list = new List<User>();
//put the data into the list...
var distinct = list.Distinct(new UserEqualChecker()).ToList(); // Distinct returns a new sequence; it does not modify list
This way, you have a reusable comparer
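The comparer above leaves out GetHashCode, and Distinct relies on it: two users that Equals treats as the same must also produce the same hash code. A fuller sketch might look like this. UserNameComparer and the User shape (UserName plus a nullable Age) are assumptions made for the example, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical User type for illustration.
public class User
{
    public string UserName { get; set; }
    public int? Age { get; set; }
}

public sealed class UserNameComparer : IEqualityComparer<User>
{
    public bool Equals(User x, User y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.UserName, y.UserName, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(User obj)
    {
        // Must agree with Equals: same comparison, same case-insensitivity.
        return obj.UserName is null
            ? 0
            : StringComparer.OrdinalIgnoreCase.GetHashCode(obj.UserName);
    }
}
```

Keep in mind that Distinct only drops the extra items; it does not merge their values, so on its own it does not cover the "other attributes are empty in one element" part of the question.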
You can group your list using linq and then create a new object with your merged data:
var merged = from item in mylist
group item by item.UserName into grp
select new YourClass {
Username = grp.Key,
Property1 = grp.Select(g => g.Property1).FirstOrDefault(p => p != null),
Property2 = ...
};
This assumes usernames are case-sensitive; you can change the grouping key to UserName.ToUpper() or something similar.
As @HenkHolterman said, you have to define how to select the values for your properties.
The rule I wrote for Property1 is only an example.
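To make the grouping approach concrete, here is a small sketch under some assumptions: User has a nullable Age and an Email (both invented for the example), usernames are compared case-insensitively, and "merge" means "take the first non-empty value in the group". UserMerger is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical User type; Age and Email are example attributes.
public class User
{
    public string UserName { get; set; }
    public int? Age { get; set; }
    public string Email { get; set; }
}

public static class UserMerger
{
    // Returns one merged User per username.
    public static List<User> Merge(IEnumerable<User> users)
    {
        return users
            .GroupBy(u => u.UserName, StringComparer.OrdinalIgnoreCase)
            .Select(g => new User
            {
                UserName = g.Key,
                // Take the first non-empty value found in the group.
                Age = g.Select(u => u.Age).FirstOrDefault(a => a.HasValue),
                Email = g.Select(u => u.Email).FirstOrDefault(e => !string.IsNullOrEmpty(e))
            })
            .ToList();
    }
}
```

This builds a new list instead of editing the original one; assign the result back if you want to replace it.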

make the remove from a list with condition in fastest way

Is there an alternative way to delete objects from a list,
instead of the foreach I have used?
I don't think my approach is the best or most efficient way.
It looks like this:
var allobj= .. //this a list of all object
var myobj= .. //this a list of my selected object
foreach (var inu in myobj.ToArray())
{
if (allobj.Where(p => p.UserName == inu.UserName).Count() != 0)
{
myobj.Remove(inu);
}
}
The other answers have a drawback: they create a new collection that excludes the selected items instead of removing items from the actual collection.
This approach does not copy the main collection; it removes items from the list directly in a single pass.
You generate a HashSet from your selected items so that each username lookup takes constant time on average.
// generate hashset from selected items
var set = new HashSet<string>(myobj.Select(x => x.UserName));
// remove every item whose UserName is in the set
allobj.RemoveAll(x => set.Contains(x.UserName));
If you want to remove the objects from myobj whose UserName exists in allobj, keep only the ones that have no match:
var remaining = myobj.Where(obj => !allobj.Any(o => o.UserName == obj.UserName)).ToList();
You cannot remove items from the list you are iterating over.
Anyway, you can construct a new list containing all the elements whose username is not present in the global list:
var finalList = myobj.Where(obj => allobj.All(o => o.UserName !=
obj.UserName)).ToList();
If objects in those lists have the same reference, then to get all except selected items you can simply use:
var r = allItems.Except(selectedItems).ToList();
If they don't have the same reference, you can create the result this way:
var r = allItems.Where(x => !selectedItems.Any(y => y.UserName == x.UserName)).ToList();
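For the direction the question actually asks about (removing from myobj the items whose UserName appears in allobj), the HashSet and RemoveAll ideas above can be combined. A rough sketch, where Account and ListFilter are hypothetical names standing in for your own types:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type with just the property used for matching.
public class Account
{
    public string UserName { get; set; }
}

public static class ListFilter
{
    // Removes from 'selected' every item whose UserName appears in 'all'.
    // Returns the number of items removed.
    public static int RemoveKnownUsers(List<Account> selected, IEnumerable<Account> all)
    {
        // Build the set once so each membership check is O(1) on average.
        var known = new HashSet<string>(all.Select(a => a.UserName));

        // RemoveAll edits the list in place; no ToArray() copy is needed.
        return selected.RemoveAll(s => known.Contains(s.UserName));
    }
}
```

With the names from the question this would be called as ListFilter.RemoveKnownUsers(myobj, allobj).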

Sort in-memory list by another in-memory list

Is it possible to sort an in-memory list by another list (the second list would be a reference data source or something like that)?
public class DataItem
{
public string Name { get; set; }
public string Path { get; set; }
}
// a list of Data Items, randomly sorted
List<DataItem> dataItems = GetDataItems();
// the sort order data source with the paths in the correct order
IEnumerable<string> sortOrder = new List<string> {
"A",
"A.A1",
"A.A2",
"A.B1"
};
// is there a way to tell linq to sort the in-memory list of objects
// by the sortOrder "data source"
dataItems = dataItems.OrderBy(p => p.Path == sortOrder).ToList();
First, let's assign an index to each item in sortOrder:
var sortOrderWithIndices = sortOrder.Select((x, i) => new { path = x, index = i });
Next, we join the two lists and sort:
var dataItemsOrdered =
from d in dataItems
join x in sortOrderWithIndices on d.Path equals x.path //pull index by path
orderby x.index //order by index
select d;
This is how you'd do it in SQL as well.
Here is an alternative (and I argue more efficient) approach to the one accepted as answer.
List<DataItem> dataItems = GetDataItems();
IDictionary<string, int> sortOrder = new Dictionary<string, int>()
{
// lower value sorts first
{"A", 0},
{"A.A1", 1},
{"A.A2", 2},
{"A.B1", 3},
};
dataItems.Sort((di1, di2) => sortOrder[di1.Path].CompareTo(sortOrder[di2.Path]));
Let's say Sort() and OrderBy() both take O(n log n), where n is the number of items in dataItems. The solution given here then takes O(n log n) to perform the sort, with a constant-time dictionary lookup per comparison. We assume that creating the sortOrder dictionary costs about the same as creating the IEnumerable in the original post.
Doing a join and then sorting the collection adds the cost of the join on top of the sort. With a naive nested-loop join that would be O(nm), where m is the number of elements in sortOrder, for a total of O(nm + n log n). LINQ to Objects builds a hash-based lookup for the join, so the extra cost there is closer to O(n + m).
Either way, in theory the join approach may still boil down to roughly O(n log n). In practice, though, the join costs extra cycles, in addition to possible extra space for the auxiliary collections LINQ may create to process the query.
If your list of paths is large, you would be better off performing your lookups against a dictionary:
var sortValues = sortOrder.Select((p, i) => new { Path = p, Value = i })
.ToDictionary(x => x.Path, x => x.Value);
dataItems = dataItems.OrderBy(di => sortValues[di.Path]).ToList();
Custom ordering can also be done with a custom comparer (an implementation of the IComparer<TKey> interface) that is passed as the second argument to the OrderBy method.
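A comparer for this case could be sketched as follows. PathOrderComparer is a made-up name, and the choice to sort paths that are missing from sortOrder to the end is an assumption of this sketch, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DataItem
{
    public string Name { get; set; }
    public string Path { get; set; }
}

public sealed class PathOrderComparer : IComparer<string>
{
    private readonly Dictionary<string, int> _rank;

    public PathOrderComparer(IEnumerable<string> sortOrder)
    {
        // Map each path to its position in the reference order (assumes no duplicates).
        _rank = sortOrder
            .Select((path, index) => new { path, index })
            .ToDictionary(x => x.path, x => x.index);
    }

    public int Compare(string x, string y) => Rank(x).CompareTo(Rank(y));

    // Paths missing from the sort order sort last (an assumption of this sketch).
    private int Rank(string path) =>
        path != null && _rank.TryGetValue(path, out var r) ? r : int.MaxValue;
}
```

Usage would be dataItems = dataItems.OrderBy(d => d.Path, new PathOrderComparer(sortOrder)).ToList(); since OrderBy is a stable sort, items with the same rank keep their relative order.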

Tell LINQ Distinct which item to return

I understand how to do a Distinct() on an IEnumerable, and that I have to create an IEqualityComparer for more advanced cases. However, is there a way to tell it which of the duplicated items to return?
For example say you have a List<T>
List<MyClass> test = new List<MyClass>();
test.Add(new MyClass {ID = 1, InnerID = 4});
test.Add(new MyClass {ID = 2, InnerID = 4});
test.Add(new MyClass {ID = 3, InnerID = 14});
test.Add(new MyClass {ID = 4, InnerID = 14});
You then do:
var distinctItems = test.Distinct(new DistinctItemComparer());
class DistinctItemComparer : IEqualityComparer<MyClass> {
public bool Equals(MyClass x, MyClass y) {
return x.InnerID == y.InnerID;
}
public int GetHashCode(MyClass obj) {
return obj.InnerID.GetHashCode();
}
}
This code will return the items with ID 1 and 3. Is there a way to return the items with ID 2 and 4 instead?
I don't believe it's actually guaranteed, but I'd be very surprised to see the behaviour of Distinct change from returning items in the order they occur in the source sequence.
So, if you want particular items, you should order your source sequence that way. For example:
items.OrderByDescending(x => x.ID)
.Distinct(new DistinctItemComparer());
Note that one alternative to using Distinct with a custom comparer is to use DistinctBy from MoreLINQ:
items.OrderByDescending(x => x.ID)
.DistinctBy(x => x.InnerID);
Although you can't guarantee that the normal LINQ to Objects ordering from Distinct won't change, I'd be happy to add a guarantee to MoreLINQ :) (It's the only ordering that is sensible anyway, to be honest.)
Yet another alternative would be to use GroupBy instead - then for each inner ID you can get all the matching items, and go from there.
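On .NET 6 and later, System.Linq ships its own DistinctBy, so the MoreLINQ-style query above can be written without an extra package. A minimal sketch, assuming the MyClass shape from the question; DistinctPicker is a hypothetical helper name, and as noted above, relying on "first occurrence wins" is an assumption about how the implementation behaves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public int ID { get; set; }
    public int InnerID { get; set; }
}

public static class DistinctPicker
{
    // Returns, for each InnerID, the item with the highest ID.
    public static List<MyClass> HighestIdPerInnerId(IEnumerable<MyClass> items)
    {
        return items
            .OrderByDescending(x => x.ID)   // highest ID is seen first
            .DistinctBy(x => x.InnerID)     // built into System.Linq from .NET 6
            .ToList();
    }
}
```

If you would rather not depend on that ordering behaviour, the GroupBy approach makes the choice of item explicit.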
You don't want distinct then - you want to group your items and select the "maximum" element for them, based on ID:
var distinctItems = test.Distinct(new DistinctItemComparer());
var otherItems = test.GroupBy(a => a.InnerID, (innerID, values) => values.OrderBy(b => b.ID).Last());
var l1 = distinctItems.ToList();
var l2 = otherItems.ToList();
l1 = your current list
l2 = your desired list
This doesn't sound like a job for Distinct, this sounds like a job for Where. You want to filter the sequence in your case:
var ids = new[] { 2, 4 };
var newSeq = test.Where(m => ids.Contains(m.ID));
If you want to select one particular of the group of elements that are considered equal using the comparison you use, then you can use group by:
var q = from t in test
group t by t.InnerID into g
select g.First(...);
In the select clause, you'll get a collection of elements that are equal and you can select the one specific element you need (e.g. using First(...)). You actually don't need to add Distinct to the end, because you're already selecting only a single element for each of the groups.
No, there's no way.
Distinct() is used to find distinct elements. If you're worried about which element to return...then obviously they are not truly identical (and therefore not distinct) and you have a flaw in your design.
