Group Multiple Properties with Logical OR - c#

I have a class with multiple properties, of which I'm interested in two. Say, PropA and PropB in the following example.
public class GroupByOR
{
public string PropA { get; set; }
public string PropB { get; set; }
public int SomeNumber { get; set; }
public GroupByOR(string a, string b, int num) { PropA = a; PropB = b; SomeNumber = num; }
}
I would like to group a list of this class's objects so that an item falls into a group if either PropA or PropB matches.
For example, let's say my list looks like this:
List<GroupByOR> list = new List<GroupByOR>
{
new GroupByOR("CA", "NY", 1), // Item 1
new GroupByOR("CA", "OR", 2), // Item 2
new GroupByOR("NY", "OR", 5) // Item 3
};
Then my desired outcome is this:
Group 1: Items 1 and 2 (based on CA)
Group 2: Items 1 and 3 (based on NY)
Group 3: Items 2 and 3 (based on OR)
Looking around, I found this and many other examples, but they all seem to focus on grouping by multiple properties with an AND operation.
Is what I'm trying to achieve even possible?
Or else is join the way to go here? If so can you please direct me in the right direction?

You're going to end up with more items than you started with, so GroupBy isn't the whole answer. You need something like this:
List<GroupByOR> list = new List<GroupByOR>
{
new GroupByOR("CA", "NY", 1), // Item 1
new GroupByOR("CA", "OR", 2), // Item 2
new GroupByOR("NY", "OR", 5) // Item 3
};
var lookupA = list.ToLookup(e => e.PropA);
var lookupB = list.ToLookup(e => e.PropB);
var keys = lookupA.Select(e => e.Key)
.Concat(lookupB.Select(e => e.Key)).Distinct();
var result = keys.Select(e =>
new
{
Key = e,
Values = lookupA[e].Concat(lookupB[e]).Distinct()
});
This projects the result into a new anonymous type, with Key being the value of PropA or PropB and Values being an IEnumerable<GroupByOR> of all the matching elements.
Edit: Code walkthrough
The first two LINQ lines make a lookup (effectively a multi-valued dictionary) out of the enumeration, with the given key. These can be enumerated as IEnumerable<IGrouping<TKey, TValue>>, but can also be used for efficient lookups.
var lookupA = list.ToLookup(e => e.PropA);
var lookupB = list.ToLookup(e => e.PropB);
The next line is finding all the distinct values of PropA and PropB (using the lookups, but could have gone back to the list for this too).
var keys = lookupA.Select(e => e.Key).Concat(lookupB.Select(e => e.Key)).Distinct();
The last line takes the distinct keys, fetches the matching enumerations from both the PropA lookup and the PropB lookup, concatenates them, and de-duplicates them (an item whose PropA and PropB are equal would otherwise appear twice in its group).
The select statement produces an anonymous type. These types can't be referred to explicitly, but they can be assigned to a var and enumerated. If you want to store the resulting value, you'd have to make a non-anonymous type.
var result = keys.Select(e =>
new
{
Key = e,
Values = lookupA[e].Concat(lookupB[e]).Distinct()
});
Edit: Output
CA
Values: (PropA: CA, PropB: NY, SomeNumber: 1) (PropA: CA, PropB: OR, SomeNumber: 2)
NY
Values: (PropA: NY, PropB: OR, SomeNumber: 5) (PropA: CA, PropB: NY, SomeNumber: 1)
OR
Values: (PropA: CA, PropB: OR, SomeNumber: 2) (PropA: NY, PropB: OR, SomeNumber: 5)
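For anyone who wants to run the lookup approach end to end, here is a minimal sketch. It assumes tuples in place of the GroupByOR class (purely to keep the example self-contained) and prints only SomeNumber for each matching item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: tuples stand in for the GroupByOR class from the question.
var list = new List<(string PropA, string PropB, int SomeNumber)>
{
    ("CA", "NY", 1), // Item 1
    ("CA", "OR", 2), // Item 2
    ("NY", "OR", 5)  // Item 3
};

var lookupA = list.ToLookup(e => e.PropA);
var lookupB = list.ToLookup(e => e.PropB);

var result = lookupA.Select(g => g.Key)
    .Concat(lookupB.Select(g => g.Key))
    .Distinct()
    .Select(key => new
    {
        Key = key,
        // Indexing a lookup with a key it does not contain yields an empty sequence.
        Values = lookupA[key].Concat(lookupB[key]).Distinct().ToList()
    })
    .ToList();

foreach (var group in result)
{
    Console.WriteLine($"{group.Key}: {string.Join(", ", group.Values.Select(v => v.SomeNumber))}");
}
```

Materializing with ToList() avoids re-running the query each time the result is enumerated.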

Related

Join two lists of different objects and create a new list

I've got two lists of different objects.
List<ObjA> objAs = new List<ObjA>();
List<ObjB> objBs = new List<ObjB>();
They have the following class structures.
public class ObjA
{
public int Id;
public int ObjBId;
}
public class ObjB
{
public int Id;
public string Title;
}
Joining objA's ObjBId property to ObjB's Id property, I want to create a list of ObjA's Ids alongside ObjB's Titles. Something like this:
List<int, string> output = new List<int, string>();
// where int = ObjA's Id, string = ObjB's Title
How can I do this in LINQ? Are there any alternative than using Concat and creating a wrapper class?
You can use the Join method and return the result as a list of tuples, List<(int, string)> (available beginning with C# 7), because List<int, string> isn't a valid C# declaration.
var output = objAs.Join(objBs, a => a.ObjBId, b => b.Id, (a, b) => (a.Id, b.Title)).ToList();
You may also use anonymous objects instead of tuples, e.g. (a, b) => new { a.Id, b.Title}
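A small runnable sketch of the tuple-returning Join, under the assumption that tuples replace the ObjA and ObjB classes and with made-up sample data. Giving the tuple elements names in the declared type is optional, but it lets callers write pair.Id and pair.Title instead of Item1 and Item2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; tuples stand in for ObjA and ObjB.
var objAs = new List<(int Id, int ObjBId)> { (1, 10), (2, 20), (3, 10) };
var objBs = new List<(int Id, string Title)> { (10, "First"), (20, "Second") };

// Inner join: ObjA.ObjBId matched against ObjB.Id.
List<(int Id, string Title)> output = objAs
    .Join(objBs, a => a.ObjBId, b => b.Id, (a, b) => (a.Id, b.Title))
    .ToList();

foreach (var pair in output)
{
    Console.WriteLine($"{pair.Id}: {pair.Title}");
}
```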
Enumerable.Join should help you in this.
var result = objAs.Join(objBs,x=>x.ObjBId,y=>y.Id,(x,y)=>new {x.Id,y.Title})
.ToList();
You can use a join and return a list
var result = (from a in objAs
join b in objBs on a.ObjBId equals b.Id
select new
{
a.Id,
b.Title
}).ToList();
So for every element of objAs, you want to take the Id, and if objBs contains an object whose Id equals that element's ObjBId, you want the Id from objA and the Title from objB.
You didn't write what you want if there is no item in objBs with a matching Id.
Let's assume you want null in that case.
var result = objAs.GroupJoin(objBs, // GroupJoin A with B
objA => objA.ObjBId, // from every element in A take the ObjBId
objB => objB.Id, // from every element in B take the Id
// ResultSelector: take every element of A, together with its matching elements from B
(objA, matchingObjBs) => new
{
Id = objA.Id,
Title = matchingObjBs.Select(objB => objB.Title).FirstOrDefault(),
});
The nice thing about GroupJoin is that you also get the elements from A that have no matching B. And if there is more than one matching item in B, you take the first one.
If you don't want the items from A that have no match in B, a plain Join on the same keys is enough:
var result = objAs.Join(objBs,
objA => objA.ObjBId,
objB => objB.Id,
(objA, objB) => new { objA.Id, objB.Title });
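To see the "null when there is no match" behaviour of GroupJoin, here is a minimal sketch keyed on ObjBId as the question describes. The sample data is invented, and tuples stand in for the two classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: the second ObjA points at an ObjB that does not exist.
var objAs = new List<(int Id, int ObjBId)> { (1, 10), (2, 99) };
var objBs = new List<(int Id, string Title)> { (10, "First") };

var result = objAs
    .GroupJoin(objBs,
        a => a.ObjBId,
        b => b.Id,
        // Every element of A is kept; Title is null when nothing in B matched.
        (a, matches) => (Id: a.Id, Title: matches.Select(b => b.Title).FirstOrDefault()))
    .ToList();

foreach (var row in result)
{
    Console.WriteLine($"{row.Id}: {row.Title ?? "(no match)"}");
}
```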

Group and separate list

I have below entity structure
public class Item
{
public EnumType Type { get; set; }
public int Price { get; set; }
}
public enum EnumType
{
A =1,
B=2,
C =3
}
I have a list of items as follow
var items = new List<Item>
{
new Item{ Price=5, Type= EnumType.B},
new Item{ Price=5, Type= EnumType.B},
new Item{ Price=5, Type= EnumType.B},
new Item{ Price=10, Type= EnumType.B},
new Item{ Price=10, Type= EnumType.B},
new Item{ Price=10, Type= EnumType.B},
new Item{ Price=15, Type= EnumType.C},
new Item{ Price=15, Type= EnumType.C},
new Item{ Price=15, Type= EnumType.C},
new Item{ Price=15, Type= EnumType.C},
new Item{ Price=15, Type= EnumType.C}
};
If the price and type are same, based on type it need to exclude every nth item from the list and then calculate the sum.
i.e type B = 3, Type C = 4
Which means in above sample data, since there are 3 items each in type B once it group by price and type it need to exclude every 3rd item when calculate sum.
So sum for type B will be 5+5+10+10 and sum for type C will be 15+15+15+15
I tried using modulo, but it seems that's not the right direction.
I have tried this so far
static int GetFactorByType(EnumType t)
{
switch(t)
{
case EnumType.A:
return 2;
case EnumType.B:
return 3;
case EnumType.C:
return 4;
default:
return 2;
}
}
var grp = items.GroupBy(g => new { g.Type, g.Price }).Select(s => new
{
type= s.Key.Type,
price = s.Key.Price,
count = s.Count()
}).Where(d => d.count % GetFactorByType(d.type) == 0).ToList();
Here's one solve:
//track the type:nth element discard
var dict = new Dictionary<EnumType, int?>();
dict[EnumType.B] = 3;
dict[EnumType.C] = 4;
//groupby turns our list of items into two collections, depending on whether their type is b or c
var x = items.GroupBy(g => new { g.Type })
.Select(g => new //now project a new collection
{
g.Key.Type, //that has the type
SumPriceWithoutNthElement = //and a sum
//the sum is calculated by reducing the list based on index position: in Where((v, i) => ...), i is the index of the item.
//We drop every Nth one, N being determined by a dictionary lookup, or 2 if the stored value is null
//(the indexer throws KeyNotFoundException for a type with no entry, so add one for every type you use)
//we only want list items where (index%N != N-1) is true
g.Where((v, i) => (i % (dict[g.Key.Type]??2)) != ((dict[g.Key.Type] ?? 2) - 1))
.Sum(r => r.Price) //sum the price for the remaining
}
).ToList(); //tolist may not be necessary, i just wanted to look at it
It seemed to me like your question words and your example are not aligned. You said (and did in code):
If the price and type are same, based on type it need to exclude every nth item from the list and then calculate the sum. i.e type B = 3, Type C = 4
Which to me means you should group by Type and Price, so B/5 is one list, and B/10 is another list. But you then said:
Which means in above sample data, since there are 3 items each in type B once it group by price and type it need to exclude every 3rd item when calculate sum. So sum for type B will be 5+5+10+10
I couldn't quite understand this. To me there are 3 items in B/5, so B/5 should be a sum of 10 (B/5 + B/5 + excluded). There are 3 items in B/10, again, should be (B/10 + B/10 + excluded) for a total of 20.
The code above does not group by price. It outputs a collection of 2 items, Type=B,SumWithout=30 and Type=C,SumWithout=60. This one groups by price too, it outputs a 3 item collection:
var x = items.GroupBy(g => new { g.Type, g.Price })
.Select(g => new
{
g.Key.Type,
g.Key.Price,
SumPriceWithoutNthElement =
g.Where((v, i) => (i % (dict[g.Key.Type]??2)) != ((dict[g.Key.Type] ?? 2) - 1))
.Sum(r => r.Price) }
).ToList();
The items are Type=B,Price=5,SumWithout=10 and Type=B,Price=10,SumWithout=20 and Type=C,Price=15,SumWithout=60
Maybe you mean group by type&price, remove every 3rd item (from b, 4th item from c etc), then group again by type only and then sum
This means if your type B prices were
1,1,1,1,2,2,2,2,2
    ^       ^
we would remove one 1 and one 2 (the ones with arrows under them), then sum for a total of 11. This is different to removing every 3rd for all type b, which leaves a total of 9:
1,1,1,1,2,2,2,2,2
    ^     ^     ^
?
In which case, maybe group by Type/sum again the SumWithout output from my second example
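The difference between the two readings can be checked with a short sketch. It uses only the nine example prices above and a fixed N of 3; the variable names are illustrative.

```csharp
using System;
using System.Linq;

var prices = new[] { 1, 1, 1, 1, 2, 2, 2, 2, 2 };
const int n = 3;

// Reading 1: drop every 3rd item within each price group, then add the group sums.
int perPriceGroup = prices
    .GroupBy(p => p)
    .Sum(g => g.Where((p, i) => i % n != n - 1).Sum());

// Reading 2: drop every 3rd item across the whole list.
int acrossAll = prices.Where((p, i) => i % n != n - 1).Sum();

Console.WriteLine($"per price group: {perPriceGroup}, across all: {acrossAll}");
// prints "per price group: 11, across all: 9"
```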
I did consider that there might be more efficient ways to do this without LINQ, and the code would almost certainly be easier to understand if it were non-LINQ. LINQ isn't necessarily a magic bullet that can kill all problems, and even though it might look like a hammer with which every problem can be beaten, sometimes it's good to avoid it.
Depending on how you intended the problem to be solved (is price part of the group key or not), building a dictionary and accumulating 0 instead of the price for every Nth element might be one way. The other way, if price is to be part of the key, could be to sum all the prices and then subtract (count/N)*price from the total price.
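The subtract-(count/N)*price idea could look roughly like the sketch below. It assumes price is part of the group key, uses tuples with a string Type instead of the Item class and EnumType enum, and falls back to N = 2 for unknown types as GetFactorByType does; the factors dictionary and the Subtotal name are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: tuples with a string Type stand in for Item and EnumType.
var items = new List<(string Type, int Price)>
{
    ("B", 5), ("B", 5), ("B", 5),
    ("B", 10), ("B", 10), ("B", 10),
    ("C", 15), ("C", 15), ("C", 15), ("C", 15), ("C", 15)
};
var factors = new Dictionary<string, int> { ["B"] = 3, ["C"] = 4 };

var sums = items
    .GroupBy(i => (i.Type, i.Price))
    .Select(g =>
    {
        int n = factors.TryGetValue(g.Key.Type, out var f) ? f : 2;
        int excluded = g.Count() / n; // integer division: number of complete runs of n items
        return (g.Key.Type, g.Key.Price, Subtotal: g.Sum(i => i.Price) - excluded * g.Key.Price);
    })
    .ToList();

foreach (var s in sums)
{
    Console.WriteLine($"{s.Type}/{s.Price}: {s.Subtotal}");
}
```

No index bookkeeping is needed here because every item in a group has the same price, so it does not matter which items are the excluded ones.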
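Grouping by a new object only works if the key type compares by value. Anonymous types and strings do; an instance of an ordinary class that doesn't override Equals and GetHashCode is always unique, which would give you as many groups as you have items. A string key is one alternative. Try something like this: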
var grp = items.GroupBy(g => $"{g.Type}/{g.Price}").Select(s => new
{
type = s.First().Type,
price = s.First().Price,
count = s.Count()
}).Where(d => d.count % GetFactorByType(d.type) == 0).ToList();
This way, you group by a string composed from the type/price combination, so if the items are equivalent, the strings will be equal.
The $"{g.Type}/{g.Price}" string amounts to "B/5" for your first three example items, so it's quite readable as well.

LINQ query for ordering by a specific set list

I am developing a little music application where I'm filling a drop-down list.
My issue is in the LINQ query that I run here:
var results = (from x in db.Keys
where x.SongId == songId
orderby x.ChordKey ascending
select x.ChordKey).ToList();
My values for the ChordKey are always only going to be:
Ab, A, Bb, B, C, C#, Db, D, Eb, E, F, F#, Gb, G
I'd like them to be ordered as they are above, unfortunately A will appear before Ab etc. if ordered alphabetically. Is there a way to have it ordered according to the specific standard above?
Use an enum for the keys with underlying integral values that sort the way you want.
public enum ChordKey
{Ab=1, A=2, Bb=3, B=4, C=5,
Db=6, D=7, Eb=8, E=9,
F=10, Gb=11, G=12}
then
var results = (from x in db.Keys
where x.SongId == songId
orderby (int)x.ChordKey ascending
select x.ChordKey).ToList();
You can keep a list that defines the custom sort order and order your results by each item's index in that list. If there may be ChordKey values that are not in the list (which doesn't seem to be the case here), you'll need further checking, because IndexOf returns -1 for them and they would sort first:
var sortingOrder = new List<string>()
{
"Ab", "A", "Bb", "B", "C", "C#", "Db", "D", "Eb", "E", "F", "F#", "Gb", "G"
};
results = results.OrderBy(x => sortingOrder.IndexOf(x)).ToList();
This orders each item in your List by the index of the item in your sorting list.
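A variation on the same idea, sketched under a couple of assumptions: the positions are precomputed into a dictionary (called rank here) so the sort does not scan the list for every element, and values missing from the order list are sent to the end. The results list is invented sample data standing in for the database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var sortingOrder = new List<string>
{
    "Ab", "A", "Bb", "B", "C", "C#", "Db", "D", "Eb", "E", "F", "F#", "Gb", "G"
};

// Map each key to its position in the desired order.
var rank = sortingOrder
    .Select((key, index) => (key, index))
    .ToDictionary(p => p.key, p => p.index);

// Hypothetical query results; "H" is deliberately not a known key.
var results = new List<string> { "G", "A", "C#", "Ab", "H" };

var ordered = results
    .OrderBy(k => rank.TryGetValue(k, out var r) ? r : int.MaxValue) // unknown keys go last
    .ToList();

Console.WriteLine(string.Join(", ", ordered)); // Ab, A, C#, G, H
```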
Another solution is to create a ChordKey class that implements the IComparer interface:
class ChordKey : IComparer {
// the number of the Chord. For Ab is 1 (or 0), for "G" is 14 (or 13) for example
public int Id { get; set; }
// name of the Chord. For Ab is "Ab"
public string Name { get; set; }
public ChordKey(string name, int id) {
Name = name;
Id = id;
}
public int Compare(object a, object b) {
var c1 = (ChordKey)a;
var c2 = (ChordKey)b;
return c1.Id - c2.Id;
}
}
Now you can use it in your LINQ queries:
var results = (from x in db.Keys
where x.SongId == songId
orderby x.ChordKey.Id ascending
select x.ChordKey).ToList();

Speed of linq query grouping and intersect in particular

Say 3 lists exist with over 500,000 records and we need to perform a set of operations (subsets shown below):
1) Check for repeating ids in lists one and two, retrieve the distinct ids while summing up "ValueA" for duplicate ids, and put the results in a list. Let's call this list list12.
2) Compare all the values with matching ids between list3 and list12 and print the results, say to the console.
3) Ensure optimal performance.
This is what I have so far:
var list1 = new List<abc>()
{
new abc() { Id = 0, ValueA = 50},
new abc() { Id = 1, ValueA = 40},
new abc() { Id = 1, ValueA = 70}
};
var list2 = new List<abc>()
{
new abc() { Id = 0, ValueA = 40},
new abc() { Id = 1, ValueA = 60},
new abc() { Id = 3, ValueA = 20},
};
var list3 = new List<abc>()
{
new abc() { Id = 0, ValueA = 50},
new abc() { Id = 1, ValueA = 40},
new abc() { Id = 4, ValueA = 70},
};
1) with the help of the solution from here [link][1] I was able to resolve part 1.
var list12 = list1.Concat(list2).GroupBy(i => i.Id)
.Select(g => new
{
Id = g.Key,
NewValueA = g.Sum(j => j.ValueA),
});
2) I can't seem to get the complete result set for this part. I can get the matching ids (maybe someone knows of a faster way than hash sets), but I also need the ValueA from each list along with the matching ids.
foreach (var value in list3.ToHashSet().Select(i => i.Id).Intersect(list12.ToHashSet().Select(j => j.Id)))
{
Console.WriteLine(value); // prints the matching id
// ?? how do I get ValueA from both lists in the quickest way possible
}
3) My only attempt at improving performance, from reading online, is to use hash sets as seen in the attempt above, but I may be doing this incorrectly and someone may have a better solution.
I don't think that any conversion to HashSet, however efficient, will increase performance. The reason is that the lists must be enumerated to create the HashSets and then the HashSets must be enumerated to get to the results.
If you put everything in one LINQ statement the number of enumerations will be minimized. And by calculating the sums at the end the number of calculations is reduced to the absolute minimum:
list1.Concat(list2)
.Join(list3, x => x.Id, l3 => l3.Id, (l12,l3) => l12)
.GroupBy (x => x.Id)
.Select(g => new
{
Id = g.Key,
NewValueA = g.Sum(j => j.ValueA),
})
With your data this shows:
Id NewValueA
0 90
1 170
I don't know if I understood all requirements well, but this should give you the general idea.
If you want to get access to both elements you probably want a join. A join is a very general construct that can be used to construct all other set operations.
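As a sketch of what such a join might look like with the sample data from the question: tuples stand in for the abc class, and the Value12 and Value3 names are invented for the example. Enumerable.Join builds a hash-based lookup over the inner sequence, so no explicit HashSet is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: tuples stand in for the abc class.
var list1 = new List<(int Id, int ValueA)> { (0, 50), (1, 40), (1, 70) };
var list2 = new List<(int Id, int ValueA)> { (0, 40), (1, 60), (3, 20) };
var list3 = new List<(int Id, int ValueA)> { (0, 50), (1, 40), (4, 70) };

// Part 1: distinct ids from lists one and two, with ValueA summed per id.
var list12 = list1.Concat(list2)
    .GroupBy(x => x.Id)
    .Select(g => (Id: g.Key, NewValueA: g.Sum(x => x.ValueA)));

// Part 2: ids present on both sides, carrying the value from each side.
var matches = list12
    .Join(list3, l12 => l12.Id, l3 => l3.Id,
        (l12, l3) => (l12.Id, Value12: l12.NewValueA, Value3: l3.ValueA))
    .ToList();

foreach (var m in matches)
{
    Console.WriteLine($"Id {m.Id}: list12 = {m.Value12}, list3 = {m.Value3}");
}
```

If list3 could contain the same Id more than once, each occurrence would produce its own row, so it may need the same grouping step first.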

sort list-of-lists using linq and custom sorting rules

I have a list of lists, with each inner list being a list of integers like this (code is simplified for clarity):
var listOfLists = new List<List<int>>
{
new List<int> {6, 0, 2, 5, 6}, // #3 in sort order
new List<int> {6, 0, 2, 2, 5, 6}, // #2 in sort order
new List<int> {0, -1, 0, 0, 7}, // #1 in sort order
new List<int> {11, 3, 5, 5, 12}, // #4 in sort order
};
I want to sort these lists using LINQ-to-Objects according to the following rules:
1) If either list is empty, the non-empty list sorts first
2) The list with the lowest minimum number is sorted first
3) If there's a tie, the list with the largest number of duplicates of the minimum is sorted first. Example: {2,2} sorts before {2}.
4) If there's still a tie, remove the minimums from both lists and go to Step #1
Background:
Each list represents locations which each have 1+ retail stores. Each retail store is assigned an "urgency" number defining how badly that location needs to be restocked. The "worst" is -1, which usually means a customer has complained about empty products. 0 usually means the store has run out of high-selling products. 1 is usually a store that's almost empty. And so on. In practice the urgency calculation is complex and based on multiple criteria validated by data-mining, but how we get the numbers isn't important for the sorting.
I want to find the most urgent locations, which I'm defining as the location with the most urgent store, with the number of stores at that level of urgency used as a tie-breaker.
I divided the problem into two:
First, I dealt with the tiebreaking by duplicates part of the problem by subtracting a small floating-point number (e.g. .0000001) for every duplicate, e.g.:
from m in list
group m by m.Urgency into g
orderby g.Key
select g.Key - (g.Count()-1) *.0000001
This left me with a much simpler problem: simply comparing two sequences item-by-item.
For this task I adapted Jon Skeet's answer to Is there a built-in way to compare IEnumerable (by their elements)?. I had to change his comparer to fix a code bug and because my sorting rules treated longer sequences as sorted before shorter sequences, but otherwise I copied the code verbatim.
Here's the resulting solution, somewhat simplified for posting here:
var urgentLocations =
( from m in Stores
group m by m.Location into g
where g.Min(m => m.Urgency) < 14
select new {Name = g.Key, Stores = g.OrderBy(m=>m.Urgency).ToList()})
.OrderBy (g=>g.Stores, new LocationComparer())
.ToList();
public class LocationComparer : IComparer<List<Store>>
{
public int Compare(List<Store> source1, List<Store> source2)
{
var reduced1 = from m in source1
group m by m.Urgency into g
orderby g.Key
select g.Key - (g.Count() - 1) * .0000001;
var reduced2 = from m in source2
group m by m.Urgency into g
orderby g.Key
select g.Key - (g.Count() - 1) * .0000001;
return SequenceCompare(reduced1, reduced2);
}
// adapted from https://stackoverflow.com/a/2811805/126352 but modified so that
// shorter sequences are sorted last
public int SequenceCompare<T>(IEnumerable<T> source1, IEnumerable<T> source2)
{
// You could add an overload with this as a parameter
IComparer<T> elementComparer = Comparer<T>.Default;
using (IEnumerator<T> iterator1 = source1.GetEnumerator())
using (IEnumerator<T> iterator2 = source2.GetEnumerator())
{
while (true)
{
bool next1 = iterator1.MoveNext();
bool next2 = iterator2.MoveNext();
if (!next1 && !next2) // Both sequences finished
{
return 0;
}
if (!next1) // Only the first sequence has finished
{
return 1;
}
if (!next2) // Only the second sequence has finished
{
return -1;
}
// Both are still going, compare current elements
int comparison = elementComparer.Compare(iterator1.Current,
iterator2.Current);
// If elements are non-equal, we're done
if (comparison != 0)
{
return comparison;
}
}
}
}
}
This also works for Compare, right?
public int Compare(List<Store> source1, List<Store> source2)
{
var reduced1 = source1.Select(s => s.Urgency).OrderBy(u => u);
var reduced2 = source2.Select(s => s.Urgency).OrderBy(u => u);
return SequenceCompare(reduced1, reduced2);
}
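One way to check the simpler comparison is to run it against the listOfLists sample from the question. The sketch below works on plain integers rather than Store objects, and uses Comparer<T>.Create with a local function in place of the LocationComparer class; both are simplifications for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listOfLists = new List<List<int>>
{
    new List<int> {6, 0, 2, 5, 6},    // #3 in sort order
    new List<int> {6, 0, 2, 2, 5, 6}, // #2 in sort order
    new List<int> {0, -1, 0, 0, 7},   // #1 in sort order
    new List<int> {11, 3, 5, 5, 12},  // #4 in sort order
};

// Element-by-element comparison; a sequence that runs out first sorts last.
static int SequenceCompare(IEnumerable<int> first, IEnumerable<int> second)
{
    using var e1 = first.GetEnumerator();
    using var e2 = second.GetEnumerator();
    while (true)
    {
        bool next1 = e1.MoveNext();
        bool next2 = e2.MoveNext();
        if (!next1 && !next2) return 0; // both finished
        if (!next1) return 1;           // only the first finished
        if (!next2) return -1;          // only the second finished
        int comparison = e1.Current.CompareTo(e2.Current);
        if (comparison != 0) return comparison;
    }
}

// Sort each list's values ascending, then compare the sorted sequences.
var comparer = Comparer<List<int>>.Create(
    (x, y) => SequenceCompare(x.OrderBy(v => v), y.OrderBy(v => v)));

var sorted = listOfLists.OrderBy(l => l, comparer).ToList();

foreach (var l in sorted)
{
    Console.WriteLine(string.Join(", ", l));
}
```

Because duplicates of the minimum stay in the sorted sequence, {2,2} still compares before {2}: the first elements tie and the shorter sequence then runs out first.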
