Update a property field in a List - c#

I have a List<Map> and I want to update the Map.Target property based on a matching value from another List<Map>.
Basically, the logic is:
If mapsList1.Name is equal to mapsList2.Name
Then mapsList1.Target = mapsList2.Name
The structure of the Map class looks like this:
public class Map {
public Guid Id { get; set; }
public string Name { get; set; }
public string Target { get; set; }
}
I tried the following but obviously it's not working:
List<Map> mapsList1 = new List<Map>();
List<Map> mapsList2 = new List<Map>();
// populate the 2 lists here
mapsList1.Where(m1 => mapsList2.Where(m2 => m1.Name == m2.Name) ) // don't know what to do next
The count of items in list 1 will be always greater than or equal to the count of items in list 2. No duplicates in both lists.

Assuming there are a small number of items in the lists and only one item in list 1 that matches:
list2.ForEach(l2m => list1.First(l1m => l1m.Name == l2m.Name).Target = l2m.Target);
If more than one item in list1 must be updated, enumerate the whole of list1 and do a FirstOrDefault on list2:
list1.ForEach(l1m => l1m.Target = list2.FirstOrDefault(l2m => l1m.Name == l2m.Name)?.Target ?? l1m.Target);
If there are a large number of items in list2, turn it into a dictionary:
var d = list2.ToDictionary(m => m.Name);
list1.ForEach(m => m.Target = d.ContainsKey(m.Name) ? d[m.Name].Target : m.Target);
(Presumably list2 doesn't contain any repeated names)
If list1's names are unique and everything in list2 is in list1, you could even turn list1 into a dictionary and enumerate list2:
var d=list1.ToDictionary(m => m.Name);
list2.ForEach(m => d[m.Name].Target = m.Target);
If list2 has entries that are not in list1, or list1 has duplicate names, you could use a Lookup instead. You would just have to avoid the "Collection was modified; enumeration operation may not execute" error you would get if you tried to modify the list it returns in response to a name while enumerating it.
mapsList1.Where(m1 => mapsList2.Where(m2 => m1.Name == m2.Name) ) // don't know what to do next
LINQ Where doesn't really work like that, and that's not a statement in itself. The m1 is the entry from list1, and the inner Where would produce an enumerable of list2 items, but it doesn't result in the Boolean the outer Where is expecting, nor can you do anything to either of the sequences, because LINQ operations are not supposed to have side effects.
The only thing you can do with a Where is capture or use the sequence it returns in some other operation (like enumerating it), so Where isn't really something you'd use for this operation unless you use it to find all the objects you need to alter.
It's probably worth pointing out that ForEach is a List thing, not a LINQ thing, and is basically just another way of writing foreach(var item in someList).
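The Lookup idea mentioned above could look roughly like this. It is only a sketch, assuming the Map class from the question; the MapUpdater class and the ApplyTargets method are names made up for illustration. Since it only changes the Target of each Map and never adds to or removes from a list, the "collection was modified" error should not come up here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Map
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Target { get; set; }
}

// Hypothetical helper, not part of the original question.
public static class MapUpdater
{
    // Copies Target from each list2 item to every list1 item with the same Name.
    // Tolerates duplicate names in list1 and list2 names that are missing from list1.
    public static void ApplyTargets(List<Map> list1, List<Map> list2)
    {
        // A lookup maps one key to zero or more items;
        // indexing it with an unknown key yields an empty sequence.
        ILookup<string, Map> byName = list1.ToLookup(m => m.Name);

        foreach (Map source in list2)
        {
            foreach (Map match in byName[source.Name])
            {
                match.Target = source.Target;
            }
        }
    }
}
```

Called as MapUpdater.ApplyTargets(mapsList1, mapsList2), this touches only the list1 entries whose Name occurs in list2 and leaves the rest unchanged.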

If the collections are big enough, a better approach would be to create a dictionary to look up the targets:
List<Map> mapsList1 = new List<Map>();
List<Map> mapsList2 = new List<Map>();
var dict = mapsList2
.GroupBy(map => map.Name)
.ToDictionary(maps => maps.Key, maps => maps.First().Target);
foreach (var map in mapsList1)
{
if (dict.TryGetValue(map.Name, out var target))
{
map.Target = target;
}
}
Note that for any name duplicated in mapsList2, only the first Target is kept.

Related

Flatten a Dictionary<int, List<object>>

I have a dictionary which has an integer Key that represents a year, and a Value which is a list of Channel objects. I need to flatten the data and create a new object from it.
Currently, my code looks like this:
Dictionary<int, List<Channel>> myDictionary;
foreach(var x in myDictionary)
{
var result = (from a in x.Value
from b in anotherList
where a.ChannelId == b.ChannelId
select new NewObject
{
NewObjectYear = x.Key,
NewObjectName = a.First().ChannelName,
}).ToList();
list.AddRange(result);
}
Notice that I am using the Key to be the value of property NewObjectYear.
I want to get rid of foreach since the dictionary contains a lot of data and doing some joins inside the iteration makes it very slow. So I decided to refactor and came up with this:
var flatten = myDictionary.SelectMany(x => x.Value.Select(y =>
new KeyValuePair<int, Channel>(x.Key, y))).ToList();
But with this, I couldn't get the Key directly. Using something like flatten.Select(x => x.Key) is definitely not the correct way. So I tried finding other ways to flatten that would be favorable for my scenario but failed. I also thought about creating a class which will contain the year and the list from the flattened but I don't know how.
Please help me with this.
Also, is there another way that doesn't require creating a new class?
It seems to me you are only trying to filter; you do not need a join for that:
var anotherListIDs = new HashSet<int>(anotherList.Select(c => c.ChannelId));
foreach (var x in myDictionary)
{
list.AddRange(x.Value
.Where(c => anotherListIDs.Contains(c.ChannelId))
.Select(c => new NewObject
{
NewObjectYear = x.Key,
NewObjectName = c.ChannelName, // c is a single Channel here, so First() does not apply
}));
}
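If the goal is also to drop the foreach, the same filter can be written as a single SelectMany call whose result selector still sees the dictionary key, so no intermediate KeyValuePair or extra class is needed. The sketch below assumes the shapes of Channel and NewObject, which the question does not show, and takes each matching channel's own name (an assumption about the intent). ChannelFlattener and Flatten are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shapes; the question does not show these classes.
public class Channel
{
    public int ChannelId { get; set; }
    public string ChannelName { get; set; }
}

public class NewObject
{
    public int NewObjectYear { get; set; }
    public string NewObjectName { get; set; }
}

// Hypothetical helper name.
public static class ChannelFlattener
{
    public static List<NewObject> Flatten(
        Dictionary<int, List<Channel>> byYear,
        IEnumerable<int> wantedChannelIds)
    {
        var wanted = new HashSet<int>(wantedChannelIds);

        // The two-selector overload of SelectMany passes both the dictionary
        // entry and each channel to the result selector, so the year (Key)
        // is still available when the new object is built.
        return byYear
            .SelectMany(
                entry => entry.Value.Where(c => wanted.Contains(c.ChannelId)),
                (entry, channel) => new NewObject
                {
                    NewObjectYear = entry.Key,
                    NewObjectName = channel.ChannelName,
                })
            .ToList();
    }
}
```

The ids could be supplied as anotherList.Select(c => c.ChannelId). Whether this is faster than the loop depends on the data; the main saving in either version comes from the HashSet lookup.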
You do realise that if the second element of the list in a specific dictionary entry has a matching ChannelId, you return the first element of this list, don't you?
var otherList = new OtherItem[]
{
new OtherItem() {ChannelId = 1, ...}
};
var dictionary = new Dictionary<int, List<Channel>>
{
{ 10, // Key
new List<Channel>() // Value
{
new Channel() {ChannelId = 100, ChannelName = "100"},
new Channel() {ChannelId = 1, ChannelName = "1"},
}
},
};
Although the 2nd element has a matching ChannelId, you return the Name of the first element.
Anyway, let's assume this is what you really want. You are right, your function isn't very efficient.
Your dictionary implements IEnumerable<KeyValuePair<int, List<Channel>>>. Therefore every x in your foreach is a KeyValuePair<int, List<Channel>>. Every x.Value is a List<Channel>.
So for every element in your dictionary (which is a KeyValuePair<int, List<Channel>>), you take the complete list and perform a full inner join of the complete list with otherList, and for the result you take the key of the KeyValuePair and the first element of the List in the KeyValuePair.
And even though you might not use the complete result, but only the first or the first few, because of FirstOrDefault(), or Take(3), you do this for every element of every list in your Dictionary.
Indeed your query could be much more efficient.
As you use the ChannelIds in your OtherList only to find out whether one is present, one of the major improvements would be to convert the ChannelIds of OtherList to a HashSet<int>, which gives a very fast lookup to check whether the ChannelId of one of the values in your Dictionary is in the HashSet.
So for every element in your dictionary, you only have to check every ChannelId in the list to see if one of them is in the HashSet. As soon as you've found one, you can stop and return only the first element of the List and the Key.
My solution is an extension method of Dictionary<int, List<Channel>>. See Extension Methods Demystified.
public static IEnumerable<NewObject> ExtractNewObjects(this Dictionary<int, List<Channel>> dictionary,
IEnumerable<OtherItem> otherList)
{
// I'll only use the ChannelIds of the otherList, so extract them
IEnumerable<int> otherChannelIds = otherList
.Select(otherItem => otherItem.ChannelId);
return dictionary.ExtractNewObjects(otherChannelIds);
}
This calls the other ExtractNewObjects overload:
public static IEnumerable<NewObject> ExtractNewObjects(this Dictionary<int, List<Channel>> dictionary,
IEnumerable<int> otherChannelIds)
{
var otherChannelIdsSet = new HashSet<int>(otherChannelIds);
// duplicate channelIds will be removed automatically
foreach (KeyValuePair<int, List<Channel>> keyValuePair in dictionary)
{
// is any ChannelId in the list also in otherChannelIdsSet?
// every keyValuePair.Value is a List<Channel>
// every Channel has a ChannelId
// channelId found if any of these ChannelIds is in the HashSet
bool channelIdFound = keyValuePair.Value
.Any(channel => otherChannelIdsSet.Contains(channel.ChannelId));
if (channelIdFound)
{
yield return new NewObject()
{
NewObjectYear = keyValuePair.Key,
NewObjectName = keyValuePair.Value
.Select(channel => channel.ChannelName)
.FirstOrDefault(),
};
}
}
}
usage:
IEnumerable<OtherItem> otherList = ...
Dictionary<int, List<Channel>> dictionary = ...
IEnumerable<NewObject> extractedNewObjects = dictionary.ExtractNewObjects(otherList);
var someNewObjects = extractedNewObjects
.Take(5) // here we see the benefit from the yield return
.ToList();
We can see three efficiency improvements and one robustness improvement:
The use of HashSet<int> enables a very fast lookup to see if the ChannelId is in OtherList.
The use of Any() stops enumerating the List<Channel> as soon as we've found a matching ChannelId in the HashSet.
The use of yield return means that you don't enumerate over more elements in your Dictionary than you'll actually use.
The use of Select and FirstOrDefault when creating NewObjectName prevents exceptions if List<Channel> is empty.

How to store the result of a linq query in a KeyDictionary variable

So I have a collection of objects that have multiple properties; two of these are groupname and personname. Now I need to count how many objects in the collection belong to a certain group and person. In other words, I need to group by groupname, then personname, and then count how many objects have this combination. First I created this:
public MultiKeyDictionary<string, string, int> GetPersonsPerGroup(IEnumerable<Home> homes ,List<string> gr, List<string> na)
{
List<string> groups = gr;
groups.Add("");
List<string> names = na;
names.Add("");
List<Home> Filtered = homes.ToList();
Filtered.ForEach(h => h.RemoveNull());
var result = new MultiKeyDictionary<string, string, int>();
int counter1 = 0;
foreach (var g in groups)
{
int counter2 = 0;
foreach (var n in names)
{
int counter3 = 0;
foreach (Home h in Filtered)
{
if (h.GroupName == g && h.PersonName == n)
{
counter3++;
if (counter3 > 100)
break;
}
}
if (counter3 > 0)
{
result.Add(g,n,counter3);
}
counter2++;
}
counter1++;
}
return result;
}
This may look good, but the problem is that the "homes" parameter can contain more than 10000 objects, with more than 1500 unique names and around 200 unique groups. That causes it to iterate billions of times, really slowing my program down. So I need another way of handling this, which made me decide to try using LINQ. That led to this creation:
var newList = Filtered.GroupBy(x => new { x.GroupName, x.PersonName })
.Select(y => (MultiKeyDictionary<string, string, int>)result.Add(y.Key.GroupName, y.Key.PersonName, y.ToList().Count));
This gives the error "Cannot convert type 'void' to 'MultiKeyDictionary<string,string,int>'", and I have no idea how to solve it. How can I make it so that the result of this query is stored in one MultiKeyDictionary without having to iterate over each possible combination and count all of them?
Some information:
MultiKeyDictionary is a class I defined (something I found on here, actually); it's just a normal dictionary but with two keys associated with one value.
The RemoveNull() method on the Home object makes sure that none of the properties of the Home object are null. If one is, the value gets set to something not null ("null", basic date, 0, ...).
The parameters are:
homes = a list of Home objects received from another class
gr = a list of all the unique groups in the list of homes
na = a list of all the unique names in the list of homes
The same name can occur on different groups
Hopefully someone can help me get further!
Thanks in advance!
Select must return something. You are not returning but only adding to an existing list. Do this instead:
var newList = Filtered.GroupBy(x => new { x.GroupName, x.PersonName });
var result = new MultiKeyDictionary<string, string, int>();
foreach (var y in newList)
{
result.Add(y.Key.GroupName, y.Key.PersonName, y.Count());
}
The reason you are getting this error:
"Cannot convert type 'void' to 'MultiKeyDictionary<string,string,int>'"
is that you are trying to cast the value returned from Add, which is void, to MultiKeyDictionary<string,string,int>, which clearly cannot be done.
If MultiKeyDictionary requires both keys to match in order to find a result, then you might want to just use a regular Dictionary with a tuple as a composite key. C# 7 has features that make this pretty easy:
public Dictionary<(string, string), int> GetPersonsPerGroup(IEnumerable<Home> homes ,List<string> gr, List<string> na)
{
// group the incoming homes directly; apply RemoveNull() first if needed, as in the question
return homes.GroupBy(x => (x.GroupName, x.PersonName))
.ToDictionary(g => g.Key, g => g.Count());
}
You can even associate optional compile-time names with your tuple's values, by declaring it like this: Dictionary<(string groupName, string personName), int>.
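A minimal sketch of the named-tuple variant follows. It assumes a Home class with only the two properties used here (the real class has more), and HomeCounter and CountPerGroupAndPerson are made-up names for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed minimal shape of the Home class from the question.
public class Home
{
    public string GroupName { get; set; }
    public string PersonName { get; set; }
}

// Hypothetical helper name.
public static class HomeCounter
{
    public static Dictionary<(string groupName, string personName), int> CountPerGroupAndPerson(
        IEnumerable<Home> homes)
    {
        // Value tuples compare by value, so equal (group, person) pairs
        // end up in the same group and share one dictionary key.
        return homes
            .GroupBy(h => (groupName: h.GroupName, personName: h.PersonName))
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

A count can then be read with counts.TryGetValue(("GroupA", "Alice"), out int n), and when enumerating the dictionary the key parts are available as entry.Key.groupName and entry.Key.personName. This is a single pass over homes rather than one pass per group and name combination.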
Your anonymous-type grouping key should work fine as a standard Dictionary key, so there is no reason to create a new type of Dictionary unless it offers special access via single keys. Just convert the grouping to a standard Dictionary:
var result = Filtered.GroupBy(f => new { f.GroupName, f.PersonName })
.ToDictionary(fg => fg.Key, fg => fg.Count());

Sort in-memory list by another in-memory list

Is it possible to sort an in-memory list by another list (the second list would be a reference data source or something like that)?
public class DataItem
{
public string Name { get; set; }
public string Path { get; set; }
}
// a list of Data Items, randomly sorted
List<DataItem> dataItems = GetDataItems();
// the sort order data source with the paths in the correct order
IEnumerable<string> sortOrder = new List<string> {
"A",
"A.A1",
"A.A2",
"A.B1"
};
// is there a way to tell linq to sort the in-memory list of objects
// by the sortOrder "data source"
dataItems = dataItems.OrderBy(p => p.Path == sortOrder).ToList();
First, let's assign an index to each item in sortOrder:
var sortOrderWithIndices = sortOrder.Select((x, i) => new { path = x, index = i });
Next, we join the two lists and sort:
var dataItemsOrdered =
from d in dataItems
join x in sortOrderWithIndices on d.Path equals x.path //pull index by path
orderby x.index //order by index
select d;
This is how you'd do it in SQL as well.
Here is an alternative (and, I would argue, more efficient) approach to the one in the accepted answer.
List<DataItem> dataItems = GetDataItems();
IDictionary<string, int> sortOrder = new Dictionary<string, int>()
{
// lower value sorts first, so the ranks follow the desired order
{"A", 0},
{"A.A1", 1},
{"A.A2", 2},
{"A.B1", 3},
};
dataItems.Sort((di1, di2) => sortOrder[di1.Path].CompareTo(sortOrder[di2.Path]));
Let's say Sort() and OrderBy() both take O(n*logn), where n is number of items in dataItems. The solution given here takes O(n*logn) to perform the sort. We assume the step required to create the dictionary sortOrder has a cost not significantly different from creating the IEnumerable in the original post.
Doing a join and then sorting the collection, however, adds the cost of the join on top of the sort, where m is the number of elements in sortOrder. A naive nested-loop join would cost O(nm); LINQ to Objects builds a hash-based lookup for Join, so the extra cost is closer to O(n + m), and the total comes to roughly O(n + m + n*logn).
In theory, then, the approach using join boils down to O(n*logn) anyway when m is not much larger than n. But in practice, the join costs extra cycles. This is in addition to the possible extra space incurred by the LINQ approach, where auxiliary collections may be created in order to process the query.
If your list of paths is large, you would be better off performing your lookups against a dictionary:
var sortValues = sortOrder.Select((p, i) => new { Path = p, Value = i })
.ToDictionary(x => x.Path, x => x.Value);
dataItems = dataItems.OrderBy(di => sortValues[di.Path]).ToList();
Custom ordering is done by using a custom comparer (an implementation of the IComparer interface) that is passed as the second argument to the OrderBy method.
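As a rough illustration of the comparer approach, here is one possible IComparer<string> that ranks paths by their position in the reference list. PathOrderComparer is a made-up name, and two choices are assumptions of this sketch rather than anything stated in the question: the reference list contains no duplicate paths, and paths missing from it sort after all known ones.

```csharp
using System.Collections.Generic;
using System.Linq;

public class DataItem
{
    public string Name { get; set; }
    public string Path { get; set; }
}

// Hypothetical comparer: orders paths by their position in a reference list.
public class PathOrderComparer : IComparer<string>
{
    private readonly Dictionary<string, int> _rank;

    public PathOrderComparer(IEnumerable<string> sortOrder)
    {
        // Assumption: sortOrder has no duplicates (ToDictionary would throw otherwise).
        _rank = sortOrder
            .Select((path, index) => new { path, index })
            .ToDictionary(x => x.path, x => x.index);
    }

    // Assumption: paths missing from the reference list sort after all known paths.
    private int RankOf(string path)
    {
        if (path != null && _rank.TryGetValue(path, out int rank))
        {
            return rank;
        }
        return int.MaxValue;
    }

    public int Compare(string x, string y) => RankOf(x).CompareTo(RankOf(y));
}
```

It would be used as dataItems = dataItems.OrderBy(d => d.Path, new PathOrderComparer(sortOrder)).ToList(); because OrderBy is a stable sort, items with unknown paths keep their original relative order at the end.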

HashSet union while preserving values from combined items

I am looking for an efficient way to combine multiple hash sets (based on object key) while preserving non-key values (such as Version below).
class MyObject {
public string Key {get; set;}
public long Version {get; set;}
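public override int GetHashCode() => Key.GetHashCode(); // hash on Key only
public override bool Equals(object obj) => obj is MyObject other && other.Key == Key; // compare Key only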
}
In the end, I need to combine the hash sets into a master list, but also keep all the Versions.
I can make the union of 3 sets in this way:
foreach (var o in List1.Union(List2).Union(List3))
Console.WriteLine("{0} : {1}", o.Key, o.Version);
This only shows the version from one of the lists (List1, or whichever list contains the item).
I need to compile these into a result with all versions.
Something I wish I could do, like this:
foreach (var o in List1.Union(List2).Union(List3).Select((a,b,c) => new DiffObj(){Key=a.Key,VersionA=a.Version,VersionB=b.Version,VersionC=c.Version}))
Console.WriteLine("{0} : {1},{2},{3}", o.Key, o.VersionA, o.VersionB, o.VersionC);
Is that possible with hash sets?
Update:
It is important to keep track of which list had which version (in the final result).
It sounds like you want a grouping:
var grouped = list1.Select(x => new { List=1, Item=x })
.Concat(list2.Select(x => new { List=2, Item=x }))
.Concat(list3.Select(x => new { List=3, Item=x }))
.GroupBy(pair => pair.Item);
Then you can just iterate over each group, which will contain equal values according to Equals and GetHashCode, but which can still be distinct. The List value will indicate which list each item came from.
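To make the iteration step concrete, the sketch below turns each group into one row with a version per list. DiffObj is modelled on the type wished for in the question, with nullable versions as an assumption so that "not present in that list" can be represented; VersionMerger and Merge are hypothetical names. It groups on the Key string directly, so it does not depend on how MyObject overrides Equals and GetHashCode.

```csharp
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public string Key { get; set; }
    public long Version { get; set; }
}

// Hypothetical result type, modelled on the DiffObj wished for in the question.
public class DiffObj
{
    public string Key { get; set; }
    public long? VersionA { get; set; }
    public long? VersionB { get; set; }
    public long? VersionC { get; set; }
}

// Hypothetical helper name.
public static class VersionMerger
{
    public static List<DiffObj> Merge(
        IEnumerable<MyObject> list1,
        IEnumerable<MyObject> list2,
        IEnumerable<MyObject> list3)
    {
        return list1.Select(x => new { List = 1, Item = x })
            .Concat(list2.Select(x => new { List = 2, Item = x }))
            .Concat(list3.Select(x => new { List = 3, Item = x }))
            .GroupBy(pair => pair.Item.Key) // group on the key string itself
            .Select(g => new DiffObj
            {
                Key = g.Key,
                // null means that list had no item with this key
                VersionA = g.Where(p => p.List == 1).Select(p => (long?)p.Item.Version).FirstOrDefault(),
                VersionB = g.Where(p => p.List == 2).Select(p => (long?)p.Item.Version).FirstOrDefault(),
                VersionC = g.Where(p => p.List == 3).Select(p => (long?)p.Item.Version).FirstOrDefault(),
            })
            .ToList();
    }
}
```

Each DiffObj then records which list supplied which version, which is what the update to the question asks for.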

Checking two lists for Modifications

I have two lists of two different types which have the following common properties:
Id -->used to identify corresponding objects
Bunch of other Properties
ModificationDate
I need to compare these two lists based on the modification date. If the modification date is different (the first list's ModificationDate is greater than the second list's ModificationDate), then copy all the properties of that item from the first list to the second.
Please let me know the best way to do this.
EDITED: The second list may or may not contain all elements of the first, and vice versa. My first list is always the source list, so if an item is present in list 1 and not present in list 2, we need to add it to list 2. Also, if an item is present in list 2 but not in list 1, remove it from list 2.
Finding added/deleted items
var list1 = new List<MyType>();
var list2 = new List<MyType>();
// These two assume MyType : IEquatable<MyType>
var added = list1.Except(list2);
var deleted = list2.Except(list1);
// Now add "added" to list2, remove "deleted" from list2
If MyType does not implement IEquatable<MyType>, or the implementation is not based solely on comparing ids, you will need to create an IEqualityComparer<MyType>:
class MyTypeIdComparer : IEqualityComparer<MyType>
{
public bool Equals(MyType x, MyType y)
{
return x.Id.Equals(y.Id);
}
public int GetHashCode(MyType obj)
{
return obj.Id.GetHashCode();
}
}
Which will allow you to do:
// This does not assume so much for MyType
var comparer = new MyTypeIdComparer();
var added = list1.Except(list2, comparer);
var deleted = list2.Except(list1, comparer);
Finding modified items
var modified = list1.Concat(list2)
.GroupBy(item => item.Id)
.Where(g => g.Select(item => item.ModificationDate)
.Distinct().Count() != 1);
// To propagate the modifications:
foreach(var grp in modified) {
var items = grp.OrderBy(item => item.ModificationDate);
var target = items.First(); // earliest modification date = old
var source = items.Last(); // latest modification date = new
// And now copy properties from source to target
}
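Because the question says the two lists hold different types and list 1 is always the source, the add, remove and update steps can also be combined in one pass keyed by Id. The following is only a sketch under assumptions: SourceItem, TargetItem, the single Payload property standing in for the "bunch of other properties", the int Id, and the ListSync.Sync name are all invented for illustration, and Ids are assumed to be unique within each list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shapes: the question only says both types share Id,
// ModificationDate and a bunch of other properties.
public class SourceItem
{
    public int Id { get; set; }
    public string Payload { get; set; }
    public DateTime ModificationDate { get; set; }
}

public class TargetItem
{
    public int Id { get; set; }
    public string Payload { get; set; }
    public DateTime ModificationDate { get; set; }
}

// Hypothetical helper name.
public static class ListSync
{
    // Makes target mirror source: removes extras, adds missing items, refreshes stale ones.
    public static void Sync(List<SourceItem> source, List<TargetItem> target)
    {
        Dictionary<int, SourceItem> sourceById = source.ToDictionary(s => s.Id);

        // Remove target items whose Id no longer exists in the source list.
        target.RemoveAll(t => !sourceById.ContainsKey(t.Id));

        Dictionary<int, TargetItem> targetById = target.ToDictionary(t => t.Id);

        foreach (SourceItem s in source)
        {
            if (!targetById.TryGetValue(s.Id, out TargetItem t))
            {
                // Present in source only: create it in target.
                t = new TargetItem { Id = s.Id };
                target.Add(t);
            }
            else if (s.ModificationDate <= t.ModificationDate)
            {
                continue; // target copy is already up to date
            }

            // Copy the shared properties from source to target.
            t.Payload = s.Payload;
            t.ModificationDate = s.ModificationDate;
        }
    }
}
```

The dictionaries keep each lookup close to constant time, so the whole synchronisation is roughly linear in the combined size of the two lists.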
This might help. The LINQ library has lots of decent methods, such as Except and Intersect.
http://msdn.microsoft.com/en-us/library/bb397894.aspx
The provided link was helpful in comparing two lists of different types
Comparing Collections in .Net
