Looking for a built-in way (preferably a one-liner) to reproduce this Python line in C#.
sorted_weights = sorted(weights, key=lambda weight: (weight[1], weight[0]))
It sorts the map/dictionary first by value and, if there are duplicate values, by key. (Please note: both keys and values are integers.)
I'd like to avoid writing my own function/loop (which I am capable of ;)) to achieve sorting if not needed. I'm pretty sure there is a functional programming approach in C# for this as well, isn't there?
Short answer
A one-liner:
Assuming your Weights have properties Value and Key:
var sortedWeights = weights.OrderBy(weight => weight.Value).ThenBy(weight => weight.Key);
Reusable method
Apparently your input is a sequence of similar items, and you want to sort first by one property, then by another property.
My advice would be to create an extension method. After that you can use it as a one-liner, LINQ-like method. See extension methods demystified
To make it reusable, your <int, int> version calls a generic method:
public static IEnumerable<TSource> OrderBy<TSource, TSort1, TSort2>(
this IEnumerable<TSource> source,
Func<TSource, TSort1> sortProperty1,
Func<TSource, TSort2> sortProperty2)
{
return source.OrderBy(item => sortProperty1(item))
.ThenBy(item => sortProperty2(item));
}
Usage:
IEnumerable<Weight> weights = ...
// sort by Weight.Value then by Id:
var result = weights.OrderBy(weight => weight.Value, weight => weight.Id);
Or a dictionary, ordered by Weight.X, then by dictionary key:
Dictionary<int, Weight> dict = ...
var result = dict.OrderBy(dictItem => dictItem.Value.X,
dictItem => dictItem.Key);
If you don't want to mention the second sort property, consider adding an extra extension method:
public static IEnumerable<KeyValuePair<TKey, TValue>> OrderByThenByKey<TKey, TValue, TProperty>(
// TODO: invent a proper method name
this IEnumerable<KeyValuePair<TKey, TValue>> source,
Func<TValue, TProperty> propertySelector)
{
// call the other OrderBy
return source.OrderBy(keyValuePair => propertySelector(keyValuePair.Value), keyValuePair => keyValuePair.Key);
}
Usage:
Dictionary<int, Weight> dict = ...
var result = dict.OrderByThenByKey(weight => weight.X);
Assuming weights is a dictionary, you can try:
//using System.Linq;
var sortedWeights = weights.OrderBy(weight => weight.Value).ThenBy(weight => weight.Key);
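To check this against the Python line, here is a minimal sketch that assumes weights is a Dictionary<int, int>. The sample data is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: key -> weight value.
var weights = new Dictionary<int, int> { [3] = 10, [1] = 20, [2] = 10 };

// Sort by value first, then break ties by key (like key=(weight[1], weight[0])).
var sortedWeights = weights
    .OrderBy(weight => weight.Value)
    .ThenBy(weight => weight.Key)
    .ToList();

foreach (var pair in sortedWeights)
{
    Console.WriteLine($"{pair.Key}: {pair.Value}");
}
// 2: 10
// 3: 10
// 1: 20
```

ToList() is only there to materialize the result; without it the query stays lazy.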
I'm not 100% clear on what that Python code produces, but if it's a flat list of weight objects then this C# line does the same thing:
using System.Linq;
var sorted_weights = weights.OrderBy(weight => (weight[1], weight[0]));
Just be aware that the output object is the equivalent of a Python generator: it is only evaluated when you enumerate it.
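A small sketch of that lazy behaviour, using a made-up list of integers: an element added after the OrderBy call still shows up in the result, because sorting only happens at enumeration time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 3, 1, 2 };

// No sorting happens here; this only describes the query.
var ordered = numbers.OrderBy(n => n);

// Change the source before the query is enumerated.
numbers.Add(0);

// Enumeration (here via ToList) runs the sort against the current contents.
var snapshot = ordered.ToList();
Console.WriteLine(string.Join(", ", snapshot)); // 0, 1, 2, 3
```

Calling ToList() or ToArray() right away is the usual way to take a snapshot if that behaviour is not wanted.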
Related
I have a list with some identifiers like this:
List<long> docIds = new List<long>() { 6, 1, 4, 7, 2 };
Moreover, I have another list of <T> items, which are represented by the ids described above.
List<T> docs = GetDocsFromDb(...)
I need to keep the same order in both collections, so that the items in List<T> are in the same position as in the first one (due to search engine scoring reasons). And this process cannot be done in the GetDocsFromDb() function.
If necessary, it's possible to change the second list into some other structure (Dictionary<long, T> for example), but I'd prefer not to change it.
Is there any simple and efficient way to do this "ordering depending on some IDs" with LINQ?
docs = docs.OrderBy(d => docIds.IndexOf(d.Id)).ToList();
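IndexOf scans docIds once per document, which may matter for long lists. One possible variation, sketched here with a made-up tuple type standing in for T, is to precompute each id's position in a dictionary and sort by that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var docIds = new List<long> { 6, 1, 4, 7, 2 };

// Hypothetical stand-in for the documents returned by GetDocsFromDb(...).
var docs = new List<(long Id, string Title)>
{
    (1, "one"), (2, "two"), (4, "four"), (6, "six"), (7, "seven")
};

// Map each id to its position in docIds (assumes the ids are unique).
var position = docIds
    .Select((id, index) => (id, index))
    .ToDictionary(p => p.id, p => p.index);

// Sort by the precomputed position instead of calling IndexOf per item.
docs = docs.OrderBy(d => position[d.Id]).ToList();

Console.WriteLine(string.Join(", ", docs.Select(d => d.Title)));
// six, one, four, seven, two
```

This sketch throws a KeyNotFoundException if a document's id is not in docIds, so it assumes both collections cover the same ids.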
Since you don't specify T,
public static IEnumerable<T> OrderBySequence<T, TId>(
this IEnumerable<T> source,
IEnumerable<TId> order,
Func<T, TId> idSelector)
{
var lookup = source.ToDictionary(idSelector, t => t);
foreach (var id in order)
{
yield return lookup[id];
}
}
Is a generic extension for what you want.
You could use the extension like this perhaps,
var orderDocs = docs.OrderBySequence(docIds, doc => doc.Id);
A safer version might be
public static IEnumerable<T> OrderBySequence<T, TId>(
this IEnumerable<T> source,
IEnumerable<TId> order,
Func<T, TId> idSelector)
{
var lookup = source.ToLookup(idSelector, t => t);
foreach (var id in order)
{
foreach (var t in lookup[id])
{
yield return t;
}
}
}
which will work if source does not zip exactly with order.
Jodrell's answer is best, but actually he reimplemented System.Linq.Enumerable.Join. Join also uses Lookup and keeps ordering of source.
docIds.Join(
docs,
i => i,
d => d.Id,
(i, d) => d);
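A quick sketch of the Join approach with made-up data. The tuple type is a stand-in for T, and id 7 is deliberately missing from docs to show that Join, being an inner join, simply skips it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var docIds = new List<long> { 6, 1, 4, 7, 2 };

// Hypothetical documents in arbitrary order; there is no document with id 7.
var docs = new List<(long Id, string Title)>
{
    (1, "one"), (2, "two"), (4, "four"), (6, "six")
};

// The outer sequence (docIds) drives the order of the result.
var ordered = docIds
    .Join(docs, i => i, d => d.Id, (i, d) => d)
    .ToList();

Console.WriteLine(string.Join(", ", ordered.Select(d => d.Title)));
// six, one, four, two
```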
One simple approach is to zip with the ordering sequence:
List<T> docs = GetDocsFromDb(...).Zip(docIds, Tuple.Create)
.OrderBy(x => x.Item2).Select(x => x.Item1).ToList();
I've created a simplification of the issue. I have an ordered IEnumerable, and I'm wondering why applying a Where filter could unorder the objects.
This does not compile, although it seems it could:
IOrderedEnumerable<int> tmp = new List<int>().OrderBy(x => x);
// Error: Cannot implicitly convert IEnumerable<int> to IOrderedEnumerable<int>
tmp = tmp.Where(x => x > 1);
I understand that there would be no guaranteed execution order if coming from an IQueryable, such as when using LINQ to some DB provider.
However, when dealing with LINQ to Objects, what scenario could occur that would unorder your objects, or why wasn't this implemented?
EDIT
I understand how to properly order this; that is not the question. My question is more of a design question. A Where filter on LINQ to Objects should enumerate the given enumerable and apply filtering. So why is it that we can only return an IEnumerable instead of an IOrderedEnumerable?
EDIT
To clarify the scenario in which this would be useful: I'm building queries based on conditions in my code, and I want to reuse as much code as possible. I have a function that returns an IOrderedEnumerable; however, after applying the additional Where I would have to reorder it even though it would still be in its original ordered state.
Rene's answer is correct, but could use some additional explanation.
IOrderedEnumerable<T> does not mean "this is a sequence that is ordered". It means "this is a sequence that has had an ordering operation applied to it and you may now follow that up with a ThenBy to impose additional ordering requirements."
The result of Where does not allow you to follow it up with ThenBy, and therefore you may not use it in a context where an IOrderedEnumerable<T> is required.
Make sense?
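To make the distinction concrete, here is a small sketch with made-up data. ThenBy is available on the result of OrderBy, but not on the result of Where, so any ThenBy has to come before the filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data.
var people = new[]
{
    (Name: "Cy", Age: 30),
    (Name: "Al", Age: 30),
    (Name: "Bo", Age: 17)
};

var byAge = people.OrderBy(p => p.Age);         // IOrderedEnumerable: ThenBy allowed
var byAgeThenName = byAge.ThenBy(p => p.Name);  // still IOrderedEnumerable

// Where returns a plain IEnumerable; the items keep their order,
// but the result no longer offers ThenBy.
IEnumerable<(string Name, int Age)> adults = byAgeThenName.Where(p => p.Age >= 18);
// adults.ThenBy(p => p.Name);  // would not compile

Console.WriteLine(string.Join(", ", adults.Select(p => p.Name))); // Al, Cy
```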
But of course, as others have said, you almost always want to do the filtering first and then the ordering. That way you are not spending time putting items into order that you are just going to throw away.
There are of course times when you do have to order and then filter; for example, the query "songs in the top ten that were sung by a woman" and the query "the top ten songs that were sung by a woman" are potentially very different! The first one is sort the songs -> take the top ten -> apply the filter. The second is apply the filter -> sort the songs -> take the top ten.
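The two song queries can be sketched like this, with a made-up chart of four songs and "top two" standing in for "top ten" to keep the data small.

```csharp
using System;
using System.Linq;

// Hypothetical chart data.
var songs = new[]
{
    (Title: "A", Rank: 1, SungByWoman: false),
    (Title: "B", Rank: 2, SungByWoman: true),
    (Title: "C", Rank: 3, SungByWoman: false),
    (Title: "D", Rank: 4, SungByWoman: true)
};

// "Songs in the top two that were sung by a woman": sort, take, then filter.
var inTopTwoByWomen = songs
    .OrderBy(s => s.Rank)
    .Take(2)
    .Where(s => s.SungByWoman)
    .Select(s => s.Title)
    .ToList();

// "The top two songs that were sung by a woman": filter, sort, then take.
var topTwoByWomen = songs
    .Where(s => s.SungByWoman)
    .OrderBy(s => s.Rank)
    .Take(2)
    .Select(s => s.Title)
    .ToList();

Console.WriteLine(string.Join(", ", inTopTwoByWomen)); // B
Console.WriteLine(string.Join(", ", topTwoByWomen));   // B, D
```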
The signature of Where() is this:
public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
So this method takes an IEnumerable<int> as first argument. The IOrderedEnumerable<int> returned from OrderBy implements IEnumerable<int> so this is no problem.
But as you can see, Where returns an IEnumerable<int> and not an IOrderedEnumerable<int>, and an IEnumerable<int> cannot be implicitly converted to an IOrderedEnumerable<int>.
Anyway, the objects in that sequence will still have the same order. So you could just do it like this
IEnumerable<int> tmp = new List<int>().OrderBy(x => x).Where(x => x > 1);
and get the sequence you expected.
But of course you should (for performance reasons) filter your objects first and sort them afterwards when there are fewer objects to sort:
IOrderedEnumerable<int> tmp = new List<int>().Where(x => x > 1).OrderBy(x => x);
The tmp variable's type is IOrderedEnumerable.
Where() is a function just like any other with a return type, and that return type is IEnumerable. IEnumerable and IOrderedEnumerable are not the same.
So when you do this:
tmp = tmp.Where(x => x > 1);
You are trying to assign the result of a Where() function call, which is an IEnumerable, to the tmp variable, which is an IOrderedEnumerable. They are not directly compatible, there is no implicit cast, and so the compiler gives you an error.
The problem is you are being too specific with the tmp variable's type. You can make one simple change that will make this all work by being just a little less specific with your tmp variable:
IEnumerable<int> tmp = new List<int>().OrderBy(x => x);
tmp = tmp.Where(x => x > 1);
Because IOrderedEnumerable inherits from IEnumerable, this code will all work. As long as you don't want to call ThenBy() later on, this should give you exactly the same results as you expect without any other loss of ability to use the tmp variable later.
If you really need an IOrderedEnumerable, you can always just call .OrderBy(x => x) again:
IOrderedEnumerable<int> tmp = new List<int>().OrderBy(x => x);
tmp = tmp.Where(x => x > 1).OrderBy(x => x);
And again, in most cases (not all, but most) you want to get your filtering out of the way before you start sorting. In other words, this is even better:
var tmp = new List<int>().Where(x => x > 1).OrderBy(x => x);
why wasn't this implemented?
Most likely because the LINQ designers decided that the effort to implement, test, document etc. wasn't worth it compared to the potential use cases. In fact, you are the first one I have heard complaining about that.
But if it's so important to you, you can add that missing functionality yourself (similar to Jon Skeet's MoreLINQ extension library). For instance, something like this:
namespace MyLinq
{
public static class Extensions
{
public static IOrderedEnumerable<T> Where<T>(this IOrderedEnumerable<T> source, Func<T, bool> predicate)
{
return new WhereOrderedEnumerable<T>(source, predicate);
}
class WhereOrderedEnumerable<T> : IOrderedEnumerable<T>
{
readonly IOrderedEnumerable<T> source;
readonly Func<T, bool> predicate;
public WhereOrderedEnumerable(IOrderedEnumerable<T> source, Func<T, bool> predicate)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (predicate == null) throw new ArgumentNullException(nameof(predicate));
this.source = source;
this.predicate = predicate;
}
public IOrderedEnumerable<T> CreateOrderedEnumerable<TKey>(Func<T, TKey> keySelector, IComparer<TKey> comparer, bool descending) =>
new WhereOrderedEnumerable<T>(source.CreateOrderedEnumerable(keySelector, comparer, descending), predicate);
public IEnumerator<T> GetEnumerator() => Enumerable.Where(source, predicate).GetEnumerator();
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
}
}
And putting it into action:
using System;
using System.Collections.Generic;
using System.Linq;
using MyLinq;
var test = Enumerable.Range(0, 100)
.Select(n => new { Foo = 1 + (n / 20), Bar = 1 + n })
.OrderByDescending(e => e.Foo)
.Where(e => (e.Bar % 2) == 0)
.ThenByDescending(e => e.Bar) // Note this compiles:)
.ToList();
I am trying to create a function whereby I can pass in a functor/predicate that can slot into a dictionary's 'Where' method.
(cardPool is the dictionary of type 'cardStats')
Pseudo of what I'd like to do:
void CardStats findCard(Predicate<CardStats> pred)
{
return cardPool.Where(pred);
}
This code obviously won't work but is simply a rough example of the functionality I am looking for.
I have had no problems setting this up for lists, but for a Dictionary, it's really got me stumped.
Any help would be great, thanks!
Edit:
Ahh sorry I should have mentioned more: Cardstats is the value, the key is of type int. I'd like to sift through the values (cardStats) and test their properties such as ID(int) or name(string).
Dictionary<TKey, TValue> implements IEnumerable<KeyValuePair<TKey, TValue>>, so its Where extension method takes a predicate of type Func<KeyValuePair<TKey, TValue>, bool>.
You could implement your method like this:
CardStats findCard(Func<int, CardStats, bool> pred)
{
return cardPool.Where(kv => pred(kv.Key, kv.Value))
.Select(kv => kv.Value)
.FirstOrDefault();
}
And use it like this:
CardStats stats = myCards.findCard((id, card) => id == 7);
or
CardStats stats = myCards.findCard((id, card) => card.Name == "Ace of Clubs");
Note that using Where on a dictionary doesn't take advantage of the dictionary's quick lookup features and basically treats it as a linear collection of key-value pairs.
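The difference can be sketched with a made-up dictionary that maps card ids to names (a simplification of CardStats): a key lookup goes through the hash table, while a search by value has to scan the pairs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical card pool: id -> name.
var cardPool = new Dictionary<int, string>
{
    [7] = "Ace of Clubs",
    [9] = "King of Hearts"
};

// Lookup by key uses the dictionary's hashing.
bool hasSeven = cardPool.TryGetValue(7, out var nameOfSeven);

// Searching by value walks the key-value pairs one by one.
int idOfKing = cardPool
    .Where(kv => kv.Value == "King of Hearts")
    .Select(kv => kv.Key)
    .FirstOrDefault();

Console.WriteLine($"{hasSeven}: {nameOfSeven}"); // True: Ace of Clubs
Console.WriteLine(idOfKing);                     // 9
```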
One more comment: I would suggest providing a method that returns an IEnumerable of found cards if there are several. Or you could provide one that does that, and one that just returns the first match:
IEnumerable<CardStats> findCards(Func<int, CardStats, bool> pred)
{
return cardPool.Where(kv => pred(kv.Key, kv.Value))
.Select(kv => kv.Value);
}
CardStats findCard(Func<int, CardStats, bool> pred)
{
return findCards(pred).FirstOrDefault();
}
I would use FirstOrDefault as the first statement because it will stop as soon as it finds a matching element. Another thing is that I would consider using something other than a dictionary, because using it this way is an abuse of its indexed purpose.
Anyway, this is the code I would use:
public CardStats Find(Func<CardStats, bool> predicate)
{
KeyValuePair<int, CardStats> kvCard = cardPool.FirstOrDefault(kvp => predicate(kvp.Value));
if (kvCard.Equals(default(KeyValuePair<int, CardStats>)))
return null;
return kvCard.Value;
}
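One possible alternative to comparing against default(KeyValuePair<...>) is to search the dictionary's Values collection directly. This is only a sketch: it uses string values in place of CardStats, and it relies on the value type being a reference type so that "not found" comes back as null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical card pool: id -> name (standing in for CardStats).
var cardPool = new Dictionary<int, string>
{
    [7] = "Ace of Clubs",
    [9] = "King of Hearts"
};

// FirstOrDefault on Values stops at the first match and yields null if there is none.
var king = cardPool.Values.FirstOrDefault(name => name.StartsWith("King"));
var joker = cardPool.Values.FirstOrDefault(name => name.StartsWith("Joker"));

Console.WriteLine(king);          // King of Hearts
Console.WriteLine(joker == null); // True
```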
Can I split an IEnumerable<T> into two IEnumerable<T> using LINQ and only a single query/LINQ statement?
I want to avoid iterating through the IEnumerable<T> twice. For example, is it possible to combine the last two statements below so allValues is only traversed once?
IEnumerable<MyObj> allValues = ...
List<MyObj> trues = allValues.Where( val => val.SomeProp ).ToList();
List<MyObj> falses = allValues.Where( val => !val.SomeProp ).ToList();
Some people like Dictionaries, but I prefer Lookups due to the behavior when a key is missing.
IEnumerable<MyObj> allValues = ...
ILookup<bool, MyObj> theLookup = allValues.ToLookup(val => val.SomeProp);
// does not throw when there are not any true elements.
List<MyObj> trues = theLookup[true].ToList();
// does not throw when there are not any false elements.
List<MyObj> falses = theLookup[false].ToList();
Unfortunately, this approach enumerates twice - once to create the lookup, then once to create the lists.
If you don't really need lists, you can get this down to a single iteration:
IEnumerable<MyObj> trues = theLookup[true];
IEnumerable<MyObj> falses = theLookup[false];
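The missing-key behaviour mentioned above can be sketched as follows, using plain integers and an "is even" test in place of MyObj and SomeProp. The input deliberately contains no odd numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var allValues = new[] { 2, 4, 6 };

// A lookup returns an empty sequence for a key it has never seen.
ILookup<bool, int> lookup = allValues.ToLookup(v => v % 2 == 0);
List<int> oddsFromLookup = lookup[false].ToList();

// A dictionary built from groups has no entry for that key at all.
Dictionary<bool, List<int>> groups = allValues
    .GroupBy(v => v % 2 == 0)
    .ToDictionary(g => g.Key, g => g.ToList());

bool threw = false;
try
{
    _ = groups[false];
}
catch (KeyNotFoundException)
{
    threw = true;
}

Console.WriteLine(oddsFromLookup.Count); // 0
Console.WriteLine(threw);                // True
```

With a dictionary, TryGetValue avoids the exception at the cost of an extra branch.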
You can use this:
var groups = allValues.GroupBy(val => val.SomeProp);
To force immediate evaluation like in your example:
var groups = allValues.GroupBy(val => val.SomeProp)
.ToDictionary(g => g.Key, g => g.ToList());
List<MyObj> trues = groups[true];
List<MyObj> falses = groups[false];
Copy pasta extension method for your convenience.
public static void Fork<T>(
this IEnumerable<T> source,
Func<T, bool> pred,
out IEnumerable<T> matches,
out IEnumerable<T> nonMatches)
{
var groupedByMatching = source.ToLookup(pred);
matches = groupedByMatching[true];
nonMatches = groupedByMatching[false];
}
Or using tuples in C# 7.0
public static (IEnumerable<T> matches, IEnumerable<T> nonMatches) Fork<T>(
this IEnumerable<T> source,
Func<T, bool> pred)
{
var groupedByMatching = source.ToLookup(pred);
return (groupedByMatching[true], groupedByMatching[false]);
}
// Ex.
var numbers = new [] { 1, 2, 3, 4, 5, 6, 7, 8 };
var (numbersLessThanEqualFour, numbersMoreThanFour) = numbers.Fork(x => x <= 4);
Modern C# example using just Linq, no custom extension methods:
(IEnumerable<MyObj> trues, IEnumerable<MyObj> falses)
= allValues.Aggregate<MyObj,(IEnumerable<MyObj> trues, IEnumerable<MyObj> falses)>(
(new List<MyObj>(),new List<MyObj>()),
(a, i) => i.SomeProp ? (a.trues.Append(i), a.falses) : (a.trues, a.falses.Append(i))
);
Does this answer the question, yes; is this better or more readable than a foreach, no.
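For comparison, the plain foreach version might look like this. It is a sketch with integers and an "is even" test standing in for MyObj and SomeProp; the source is traversed exactly once.

```csharp
using System;
using System.Collections.Generic;

var allValues = new[] { 1, 2, 3, 4, 5 };

var trues = new List<int>();
var falses = new List<int>();

// Single pass: each item goes into exactly one of the two lists.
foreach (var value in allValues)
{
    if (value % 2 == 0)
    {
        trues.Add(value);
    }
    else
    {
        falses.Add(value);
    }
}

Console.WriteLine(string.Join(", ", trues));  // 2, 4
Console.WriteLine(string.Join(", ", falses)); // 1, 3, 5
```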
In all of these answers you lose LINQ's 2nd greatest power (after expressiveness of course); laziness! When we call ToDictionary() or ToLookup() we are forcing an enumeration.
Let's take a look at the implementation of partition in Haskell, a great lazy functional programming language.
From Hoogle:
'partition' p xs = ('filter' p xs, 'filter' (not . p) xs)
As you can see, partition is an expression which returns a tuple of two other expressions: first, where the predicate is applied to the elements, and second, where the inverse of the predicate is applied to the elements. Haskell is lazily evaluated implicitly in this case, similar to how LINQ to Objects is lazy through deferred execution: nothing runs until the sequence is enumerated.
So why don't we implement our partition extension method the same way? In LINQ, filter is called Where, so let's use that.
public static (IEnumerable<T>, IEnumerable<T>) Partition<T>(
this IEnumerable<T> source, Func<T, bool> predicate)
=> (source.Where(predicate), source.Where(x => !predicate(x)));
One caveat with this is that if you force an evaluation on the matches AND the rest, you will perform a double enumeration. However, don't try to optimise early. With this approach you can express partition in terms of LINQ thereby preserving its beneficial characteristics.
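The double enumeration can be observed with a small sketch. Source() is a hypothetical iterator that counts how many items it hands out, and the two Where calls mirror what the lazy partition does internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int reads = 0;

// Hypothetical source that counts every item it yields.
IEnumerable<int> Source()
{
    foreach (var n in new[] { 1, 2, 3, 4 })
    {
        reads++;
        yield return n;
    }
}

var source = Source();
var evens = source.Where(n => n % 2 == 0);
var odds = source.Where(n => n % 2 != 0);
Console.WriteLine(reads); // 0 (nothing has been enumerated yet)

var evenList = evens.ToList();
Console.WriteLine(reads); // 4 (first pass over the source)

var oddList = odds.ToList();
Console.WriteLine(reads); // 8 (second pass over the source)
```

If the source is expensive to produce (a database query, for instance), that second pass is the cost to weigh against keeping the laziness.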
Had some fun coming up with this extension method based on the ToLookup suggestion in other answers:
public static (IEnumerable<T> XS, IEnumerable<T> YS) Bifurcate<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
var lookup = source.ToLookup(predicate);
return (lookup[true], lookup[false]);
}
The callsite will look like this:
var numbers = new []{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var (evens, odds) = numbers.Bifurcate(n => n % 2 == 0);
I think the usability of this is nice which is why I'm posting this answer.
We can go even further:
public static (IEnumerable<T> XS, IEnumerable<T> YS, IEnumerable<T> ZS) Trifurcate<T>(this IEnumerable<T> source, Func<T, bool> predicate1, Func<T, bool> predicate2)
{
var lookup = source.ToLookup(x =>
{
if (predicate1(x))
return 1;
if (predicate2(x))
return 2;
return 3;
});
return (lookup[1], lookup[2], lookup[3]);
}
The order of predicates matters with this one. If you pass n => n > 5 and n => n > 100 in that order for example, the second collection will always be empty.
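The effect of predicate order can be sketched directly with ToLookup, using the same bucket logic as Trifurcate on three made-up numbers.

```csharp
using System;
using System.Linq;

var numbers = new[] { 1, 50, 200 };

// n > 5 is tested first, so everything above 100 already lands in bucket 1.
var broadFirst = numbers.ToLookup(n => n > 5 ? 1 : n > 100 ? 2 : 3);

// Testing the narrower predicate first lets both buckets fill.
var narrowFirst = numbers.ToLookup(n => n > 100 ? 1 : n > 5 ? 2 : 3);

Console.WriteLine(string.Join(", ", broadFirst[1]));  // 50, 200
Console.WriteLine(broadFirst[2].Count());             // 0
Console.WriteLine(string.Join(", ", narrowFirst[1])); // 200
Console.WriteLine(string.Join(", ", narrowFirst[2])); // 50
```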
One might even have an itch to come up with a version of this that would work with a variable number of predicates (I know I did), but as far as I know that's not possible with tuple return values in C#.
Suppose I have an IEnumerable<ClassA>, where ClassA exposes an ID property of type long.
Is it possible to use a Linq query to get all instances of ClassA with ID belonging to a second IEnumerable?
In other words, can this be done?
IEnumerable<ClassA> = original.Intersect(idsToFind....)?
where original is an IEnumerable<ClassA> and idsToFind is IEnumerable<long>.
Yes.
As other people have answered, you can use Where, but it will be extremely inefficient for large sets.
If performance is a concern, you can call Join:
var results = original.Join(idsToFind, o => o.ID, id => id, (o, id) => o);
If idsToFind can contain duplicates, you'll need to either call Distinct() on the IDs or on the results or replace Join with GroupJoin (The parameters to GroupJoin would be the same).
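A small sketch of the duplicate issue, with a made-up tuple type standing in for ClassA: a repeated id makes Join return the matching item once per occurrence, and calling Distinct() on the ids avoids that.

```csharp
using System;
using System.Linq;

// Hypothetical items standing in for ClassA.
var original = new[]
{
    (ID: 1L, Name: "a"),
    (ID: 2L, Name: "b"),
    (ID: 3L, Name: "c")
};
var idsToFind = new long[] { 2, 3, 3 };

// The repeated id 3 produces item "c" twice.
var withDuplicates = original
    .Join(idsToFind, o => o.ID, id => id, (o, id) => o)
    .ToList();

// De-duplicating the ids first yields each match once.
var unique = original
    .Join(idsToFind.Distinct(), o => o.ID, id => id, (o, id) => o)
    .ToList();

Console.WriteLine(string.Join(", ", withDuplicates.Select(o => o.Name))); // b, c, c
Console.WriteLine(string.Join(", ", unique.Select(o => o.Name)));         // b, c
```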
I will post an answer using Intersect.
This is useful if you want to intersect 2 IEnumerables of the same type.
First we will need an EqualityComparer:
public class KeyEqualityComparer<T> : IEqualityComparer<T>
{
private readonly Func<T, object> keyExtractor;
public KeyEqualityComparer(Func<T, object> keyExtractor)
{
this.keyExtractor = keyExtractor;
}
public bool Equals(T x, T y)
{
return this.keyExtractor(x).Equals(this.keyExtractor(y));
}
public int GetHashCode(T obj)
{
return this.keyExtractor(obj).GetHashCode();
}
}
Secondly we apply the KeyEqualityComparer to the Intersect function:
var list3= list1.Intersect(list2, new KeyEqualityComparer<ClassToCompare>(s => s.Id));
You can do it, but in the current form, you'd want to use the Where extension method.
var results = original.Where(x => yourEnumerable.Contains(x.ID));
Intersect on the other hand will find elements that are in both IEnumerable's. If you are looking for just a list of ID's, you can do the following which takes advantage of Intersect
var ids = original.Select(x => x.ID).Intersect(yourEnumerable);
A simple way would be:
IEnumerable<ClassA> result = original.Where(a => idsToFind.Contains(a.ID));
Use the Where method to filter the results:
var result = original.Where(o => idsToFind.Contains(o.ID));
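If idsToFind is large, Contains on a plain IEnumerable<long> scans it for every element of original. One common variation, sketched here with a made-up tuple type in place of ClassA, is to copy the ids into a HashSet<long> first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical items standing in for ClassA.
var original = new[]
{
    (ID: 1L, Name: "a"),
    (ID: 2L, Name: "b"),
    (ID: 3L, Name: "c")
};
IEnumerable<long> idsToFind = new long[] { 3, 2 };

// Build the set once; each Contains call is then a hash lookup.
var idSet = new HashSet<long>(idsToFind);

var result = original.Where(o => idSet.Contains(o.ID)).ToList();

Console.WriteLine(string.Join(", ", result.Select(o => o.Name))); // b, c
```

The result keeps the order of original, and duplicate ids in idsToFind do not produce duplicate items.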
Naming things is important. Here is an extension method based on the Join operator:
private static IEnumerable<TSource> IntersectBy<TSource, TKey>(
this IEnumerable<TSource> source,
IEnumerable<TKey> keys,
Func<TSource, TKey> keySelector)
=> source.Join(keys, keySelector, id => id, (o, id) => o);
You can use it like this var result = items.IntersectBy(ids, item => item.id).
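As a side note, and assuming .NET 6 or later, System.Linq ships its own Enumerable.IntersectBy with the same parameter shape, so a custom method of that name may no longer be needed (and could clash with it). A minimal sketch with made-up data:

```csharp
using System;
using System.Linq;

// Hypothetical items and ids.
var items = new[]
{
    (Id: 1L, Name: "a"),
    (Id: 2L, Name: "b"),
    (Id: 3L, Name: "c")
};
var ids = new long[] { 3, 2, 2 };

// Built-in IntersectBy (assumes .NET 6+): keeps items whose key appears in ids.
var result = items.IntersectBy(ids, item => item.Id).ToList();

Console.WriteLine(string.Join(", ", result.Select(item => item.Name))); // b, c
```

Unlike the Join-based version, the built-in method has set semantics, so it returns each key at most once even when ids contains duplicates.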
I've been tripping up all morning on Intersect, and how it doesn't work anymore in Core 3, due to it being client side, not server side.
From a list of items pulled from a database, the user can then choose to display them in a way that requires children to be attached to that original list to get more information.
What used to work was:
itemList = _context.Item
.Intersect(itemList)
.Include(i => i.Notes)
.ToList();
What seems to now work is:
itemList = _context.Item
.Where(item => itemList.Contains(item))
.Include(i => i.Notes)
.ToList();
This seems to be working as expected, without any significant performance difference, and is really no more complicated than the first.