What is the purpose of SelectMany(x => x)? - c#

I understand the use of lambda functions as a filter such as (x => x.Count() == 1), but what is the purpose of the (x => x)? When I take it out, the code doesn't compile, and every example of lambda functions I can find seems to use it to filter in one line instead of multiple lines without the lambda.
List<Tuple<int, int>> regVals = ReadRegValCollection.SelectMany(x => x).ToList();
The purpose of this gem is to flatten a List of Lists into a single List.

x => x is a lambda expression that returns whatever argument it's provided with.
It's equivalent to a method declared as
public T Identity<T>(T item)
{
    return item;
}
It's commonly used with the SelectMany method to flatten a collection declared as IEnumerable<IEnumerable<T>> into IEnumerable<T>.
SelectMany requires a delegate that matches Func<TSource, IEnumerable<TResult>>. When the source is IEnumerable<IEnumerable<T>>, TSource is itself IEnumerable<T>, so if you want the result to be IEnumerable<T>, no projection has to be done on the source collection's elements: each one already is an IEnumerable<TResult>.
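A minimal sketch of the flattening described above. The variable names and sample data are made up for illustration, and the code is written as top-level statements, so it assumes C# 9 or later:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: a list of lists of tuples.
var nested = new List<List<Tuple<int, int>>>
{
    new List<Tuple<int, int>> { Tuple.Create(1, 10), Tuple.Create(2, 20) },
    new List<Tuple<int, int>>(),                        // an empty inner list contributes nothing
    new List<Tuple<int, int>> { Tuple.Create(3, 30) }
};

// x => x hands each inner list back unchanged; SelectMany concatenates them.
List<Tuple<int, int>> flat = nested.SelectMany(x => x).ToList();

// A non-identity selector flattens and projects in one step.
List<int> firstItems = nested.SelectMany(x => x.Select(t => t.Item1)).ToList();

Console.WriteLine(flat.Count);                    // 3
Console.WriteLine(string.Join(",", firstItems));  // 1,2,3
```

The identity lambda is only needed because SelectMany always asks how to turn one element into a sequence; here each element already is one.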

Related

Why doesn't IOrderedEnumerable retain order after where filtering

I've created a simplification of the issue. I have an ordered IEnumerable, and I'm wondering why applying a Where filter could unorder the objects.
This does not compile, although it seems it should be able to:
IOrderedEnumerable<int> tmp = new List<int>().OrderBy(x => x);
// Error: Cannot implicitly convert IEnumerable<int> to IOrderedEnumerable<int>
tmp = tmp.Where(x => x > 1);
I understand that there would be no guaranteed execution order if coming from an IQueryable, such as when using LINQ with some DB provider.
However, when dealing with LINQ to Objects, what scenario could occur that would unorder your objects, or why wasn't this implemented?
EDIT
I understand how to properly order this; that is not the question. My question is more of a design question. A Where filter on LINQ to Objects should enumerate the given enumerable and apply filtering. So why is it that we can only return an IEnumerable instead of an IOrderedEnumerable?
EDIT
To clarify the scenario in which this would be useful: I'm building queries based on conditions in my code, and I want to reuse as much code as possible. I have a function that returns an OrderedEnumerable; however, after applying the additional Where I would have to reorder the result even though it would still be in its original ordered state.
Rene's answer is correct, but could use some additional explanation.
IOrderedEnumerable<T> does not mean "this is a sequence that is ordered". It means "this is a sequence that has had an ordering operation applied to it and you may now follow that up with a ThenBy to impose additional ordering requirements."
The result of Where does not allow you to follow it up with ThenBy, and therefore you may not use it in a context where an IOrderedEnumerable<T> is required.
Make sense?
But of course, as others have said, you almost always want to do the filtering first and then the ordering. That way you are not spending time putting items into order that you are just going to throw away.
There are of course times when you do have to order and then filter; for example, the query "songs in the top ten that were sung by a woman" and the query "the top ten songs that were sung by a woman" are potentially very different! The first one is sort the songs -> take the top ten -> apply the filter. The second is apply the filter -> sort the songs -> take the top ten.
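To make the difference between the two queries concrete, here is a small sketch with invented chart data (the tuple fields Title, Rank and SungByWoman are assumptions for illustration, and "top two" stands in for "top ten" to keep the data short):

```csharp
using System;
using System.Linq;

// Hypothetical chart data, for illustration only.
var songs = new[]
{
    (Title: "A", Rank: 1, SungByWoman: false),
    (Title: "B", Rank: 2, SungByWoman: true),
    (Title: "C", Rank: 3, SungByWoman: false),
    (Title: "D", Rank: 4, SungByWoman: true),
};

// "Songs in the top two that were sung by a woman": sort -> take -> filter.
var inTopTwoByWomen = songs
    .OrderBy(s => s.Rank)
    .Take(2)
    .Where(s => s.SungByWoman)
    .Select(s => s.Title)
    .ToList();

// "The top two songs that were sung by a woman": filter -> sort -> take.
var topTwoByWomen = songs
    .Where(s => s.SungByWoman)
    .OrderBy(s => s.Rank)
    .Take(2)
    .Select(s => s.Title)
    .ToList();

Console.WriteLine(string.Join(",", inTopTwoByWomen)); // B
Console.WriteLine(string.Join(",", topTwoByWomen));   // B,D
```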
The signature of Where() is this:
public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
So this method takes an IEnumerable<int> as first argument. The IOrderedEnumerable<int> returned from OrderBy implements IEnumerable<int> so this is no problem.
But as you can see, Where returns an IEnumerable<int> and not an IOrderedEnumerable<int>, and an IEnumerable<int> cannot be implicitly converted to an IOrderedEnumerable<int>.
Anyway, the objects in that sequence will still have the same order. So you could just do it like this:
IEnumerable<int> tmp = new List<int>().OrderBy(x => x).Where(x => x > 1);
and get the sequence you expected.
But of course you should (for performance reasons) filter your objects first and sort them afterwards when there are fewer objects to sort:
IOrderedEnumerable<int> tmp = new List<int>().Where(x => x > 1).OrderBy(x => x);
The tmp variable's type is IOrderedEnumerable.
Where() is a function just like any other with a return type, and that return type is IEnumerable. IEnumerable and IOrderedEnumerable are not the same.
So when you do this:
tmp = tmp.Where(x => x > 1);
You are trying to assign the result of a Where() function call, which is an IEnumerable, to the tmp variable, which is an IOrderedEnumerable. They are not directly compatible, there is no implicit cast, and so the compiler gives you an error.
The problem is you are being too specific with the tmp variable's type. You can make one simple change that will make this all work by being just a little less specific with your tmp variable:
IEnumerable<int> tmp = new List<int>().OrderBy(x => x);
tmp = tmp.Where(x => x > 1);
Because IOrderedEnumerable inherits from IEnumerable, this code will all work. As long as you don't want to call ThenBy() later on, this should give you exactly the same results as you expect without any other loss of ability to use the tmp variable later.
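A quick sketch of what that looks like in practice, using made-up sample numbers. It assumes LINQ to Objects, where Where yields elements in the order the source produces them:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 5, 1, 4, 2, 3 }; // made-up sample data

IOrderedEnumerable<int> ordered = numbers.OrderBy(x => x);
IEnumerable<int> filtered = ordered.Where(x => x > 1);

// ThenBy is only offered on the ordered sequence:
IOrderedEnumerable<int> refined = ordered.ThenBy(x => x % 2);
// filtered.ThenBy(x => x % 2);   // would not compile: IEnumerable<int> has no ThenBy

// The filtered sequence still comes out in sorted order.
Console.WriteLine(string.Join(",", filtered)); // 2,3,4,5
```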
If you really need an IOrderedEnumerable, you can always just call .OrderBy(x => x) again:
IOrderedEnumerable<int> tmp = new List<int>().OrderBy(x => x);
tmp = tmp.Where(x => x > 1).OrderBy(x => x);
And again, in most cases (not all, but most) you want to get your filtering out of the way before you start sorting. In other words, this is even better:
var tmp = new List<int>().Where(x => x > 1).OrderBy(x => x);
why wasn't this implemented?
Most likely because the LINQ designers decided that the effort to implement, test, document etc. wasn't worth it compared to the potential use cases. In fact, you are the first one I've heard complaining about that.
But if it's so important to you, you can add that missing functionality yourself (similar to @Jon Skeet's MoreLINQ extension library). For instance, something like this:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
namespace MyLinq
{
    public static class Extensions
    {
        // More specific than Enumerable.Where, so it is picked for ordered sources.
        public static IOrderedEnumerable<T> Where<T>(this IOrderedEnumerable<T> source, Func<T, bool> predicate)
        {
            return new WhereOrderedEnumerable<T>(source, predicate);
        }
        class WhereOrderedEnumerable<T> : IOrderedEnumerable<T>
        {
            readonly IOrderedEnumerable<T> source;
            readonly Func<T, bool> predicate;
            public WhereOrderedEnumerable(IOrderedEnumerable<T> source, Func<T, bool> predicate)
            {
                if (source == null) throw new ArgumentNullException(nameof(source));
                if (predicate == null) throw new ArgumentNullException(nameof(predicate));
                this.source = source;
                this.predicate = predicate;
            }
            // ThenBy/ThenByDescending call this: extend the ordering on the source, keep the filter on top.
            public IOrderedEnumerable<T> CreateOrderedEnumerable<TKey>(Func<T, TKey> keySelector, IComparer<TKey> comparer, bool descending) =>
                new WhereOrderedEnumerable<T>(source.CreateOrderedEnumerable(keySelector, comparer, descending), predicate);
            public IEnumerator<T> GetEnumerator() => Enumerable.Where(source, predicate).GetEnumerator();
            IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
        }
    }
}
And putting it into action:
using System;
using System.Collections.Generic;
using System.Linq;
using MyLinq;
var test = Enumerable.Range(0, 100)
    .Select(n => new { Foo = 1 + (n / 20), Bar = 1 + n })
    .OrderByDescending(e => e.Foo)
    .Where(e => (e.Bar % 2) == 0)
    .ThenByDescending(e => e.Bar) // Note this compiles :)
    .ToList();

C# Dictionary search predicate as method argument

I am trying to create a function whereby I can pass in a functor/predicate that can slot into a dictionary's 'Where' method.
(cardPool is the dictionary of type 'cardStats')
Pseudo of what I'd like to do:
void CardStats findCard(Predicate<CardStats> pred)
{
return cardPool.Where(pred);
}
This code obviously won't work but is simply a rough example of the functionality I am looking for.
I have had no problems setting this up for lists, but for a Dictionary, it's really got me stumped.
Any help would be great, thanks!
Edit:
Ahh sorry I should have mentioned more: Cardstats is the value, the key is of type int. I'd like to sift through the values (cardStats) and test their properties such as ID(int) or name(string).
Dictionary<TKey, TValue> implements IEnumerable<KeyValuePair<TKey, TValue>>, so its Where extension method takes a predicate of type Func<KeyValuePair<TKey, TValue>, bool>.
You could implement your method like this:
CardStats findCard(Func<int, CardStats, bool> pred)
{
    return cardPool.Where(kv => pred(kv.Key, kv.Value))
                   .Select(kv => kv.Value)
                   .FirstOrDefault();
}
And use it like this:
CardStats stats = myCards.findCard((id, card) => id == 7);
or
CardStats stats = myCards.findCard((id, card) => card.Name == "Ace of Clubs");
Note that using Where on a dictionary doesn't take advantage of the dictionary's quick lookup features and basically treats it as a linear collection of key-value pairs.
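To illustrate the difference, here is a rough sketch. The card values are simplified to plain strings (an assumption made to keep the example self-contained; the real cardPool holds CardStats objects):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for cardPool: key is the card id, value is just the card name.
var cardPool = new Dictionary<int, string>
{
    [7] = "Ace of Clubs",
    [8] = "King of Hearts"
};

// Searching by key: uses the dictionary's hashed lookup.
bool found = cardPool.TryGetValue(7, out var byKey);

// Searching by a property of the value: a linear scan over every entry.
string byName = cardPool.Values.FirstOrDefault(name => name.StartsWith("King"));

Console.WriteLine(found);   // True
Console.WriteLine(byKey);   // Ace of Clubs
Console.WriteLine(byName);  // King of Hearts
```

If the search is always by id, TryGetValue on the key is the cheaper route; a predicate over the values only pays off when the key is not what you are searching on.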
One more comment: I would suggest providing a method that returns an IEnumerable of found cards if there are several. Or you could provide one that does that, and one that just returns the first match:
IEnumerable<CardStats> findCards(Func<int, CardStats, bool> pred)
{
    return cardPool.Where(kv => pred(kv.Key, kv.Value))
                   .Select(kv => kv.Value);
}
CardStats findCard(Func<int, CardStats, bool> pred)
{
    return findCards(pred).FirstOrDefault();
}
I would use FirstOrDefault directly, because it stops as soon as it finds a matching element. Another thing: I would consider using something other than a dictionary, because using it this way defeats its purpose as an indexed collection.
Anyway, this is the code I would use:
public CardStats Find(Func<CardStats, bool> predicate)
{
    KeyValuePair<int, CardStats> kvCard = cardPool.FirstOrDefault(kvp => predicate(kvp.Value));
    // FirstOrDefault returns default(KeyValuePair<,>) when nothing matches.
    if (kvCard.Equals(default(KeyValuePair<int, CardStats>)))
        return null;
    return kvCard.Value;
}

Need help understanding .Select method C#

I am having difficulty understanding what type of statement this is and how to use the .Select method.
var lines = System.IO.File.ReadLines(@"c:\temp\mycsvfil3.csv")
    .Select(l => new
    {
        myIdentiafication= int.Parse(l.Split(',')[0].Trim()),
        myName= l.Split(',')[1].Trim()
    }
    ).OrderBy(i => i.Id);
any help is appreciated!
The Enumerable.Select method is an extension method for an IEnumerable<T> type. It takes a Func<TSource, TResult> that allows you to take in your IEnumerable<T> items and project them to something else, such as a property of the type, or a new type. It makes heavy use of generic type inference from the compiler to do this without <> everywhere.
In your example, the IEnumerable<T> is the IEnumerable<string> of lines that File.ReadLines returns. The Select func creates an anonymous type (also making use of generic type inference) and assigns some properties based on splitting each line l, which is a string from your enumerable.
OrderBy is another IEnumerable<T> extension method and proceeds to return an IEnumerable<T> in the order based on the expression you provide.
T at this point is the anonymous type from the Select with two properties (myIdentiafication and myName), so the OrderBy(i => i.Id) bit won't compile. It can be fixed:
.OrderBy(i => i.myIdentiafication);
This is a LINQ query. Enumerable.Select projects each line from the file into an anonymous object with the properties myIdentiafication and myName. Then you sort the sequence of anonymous objects with Enumerable.OrderBy. But you should order by a property that exists in the anonymous object, e.g. myIdentiafication, because there is no Id property:
var lines = File.ReadLines(@"c:\temp\mycsvfil3.csv") // get sequence of lines
    .Select(l => new {
        myIdentiafication = int.Parse(l.Split(',')[0].Trim()),
        myName = l.Split(',')[1].Trim()
    }).OrderBy(i => i.myIdentiafication);
NOTE: To avoid splitting each line twice, you can use query syntax and introduce new range variables:
var lines = from l in File.ReadLines(@"c:\temp\mycsvfil3.csv")
            let pair = l.Split(',')
            let id = Int32.Parse(pair[0].Trim())
            orderby id
            select new {
                Id = id,
                Name = pair[1].Trim()
            };
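The same split-once idea can also be written in method syntax with a statement-bodied lambda. This is only a sketch: the in-memory rawLines array is a stand-in for File.ReadLines, and the sample rows are invented.

```csharp
using System;
using System.Linq;

// Stand-in for File.ReadLines(...): hypothetical in-memory CSV lines.
var rawLines = new[] { "2, Bob", "1, Alice" };

var people = rawLines
    .Select(l =>
    {
        string[] parts = l.Split(',');   // split once per line
        return new
        {
            Id = int.Parse(parts[0].Trim()),
            Name = parts[1].Trim()
        };
    })
    .OrderBy(p => p.Id)                  // order by a property that exists on the anonymous type
    .ToList();

Console.WriteLine(people[0].Name); // Alice
```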
From each string returned by ReadLines, Select creates an anonymous object with two properties (myIdentiafication and myName). Within the Select, the lambda parameter l represents a single line from the sequence returned by ReadLines.

Can I split an IEnumerable into two by a boolean criteria without two queries?

Can I split an IEnumerable<T> into two IEnumerable<T> using LINQ and only a single query/LINQ statement?
I want to avoid iterating through the IEnumerable<T> twice. For example, is it possible to combine the last two statements below so allValues is only traversed once?
IEnumerable<MyObj> allValues = ...
List<MyObj> trues = allValues.Where( val => val.SomeProp ).ToList();
List<MyObj> falses = allValues.Where( val => !val.SomeProp ).ToList();
Some people like Dictionaries, but I prefer Lookups due to the behavior when a key is missing.
IEnumerable<MyObj> allValues = ...
ILookup<bool, MyObj> theLookup = allValues.ToLookup(val => val.SomeProp);
// does not throw when there are not any true elements.
List<MyObj> trues = theLookup[true].ToList();
// does not throw when there are not any false elements.
List<MyObj> falses = theLookup[false].ToList();
Unfortunately, this approach enumerates twice - once to create the lookup, then once to create the lists.
If you don't really need lists, you can get this down to a single iteration:
IEnumerable<MyObj> trues = theLookup[true];
IEnumerable<MyObj> falses = theLookup[false];
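The missing-key behavior mentioned above can be seen in a small sketch. It uses plain ints and an "is even" predicate in place of MyObj and SomeProp, which are assumptions made only to keep the example runnable:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var allValues = new[] { 1, 3, 5 }; // made-up data with no even numbers

ILookup<bool, int> lookup = allValues.ToLookup(n => n % 2 == 0);
Dictionary<bool, List<int>> dict = allValues
    .GroupBy(n => n % 2 == 0)
    .ToDictionary(g => g.Key, g => g.ToList());

// A lookup returns an empty sequence for a key it has never seen.
int evensInLookup = lookup[true].Count();

// A dictionary throws for a key it has never seen.
bool dictionaryThrew = false;
try
{
    _ = dict[true];
}
catch (KeyNotFoundException)
{
    dictionaryThrew = true;
}

Console.WriteLine(evensInLookup);    // 0
Console.WriteLine(dictionaryThrew);  // True
```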
You can use this:
var groups = allValues.GroupBy(val => val.SomeProp);
To force immediate evaluation like in your example:
var groups = allValues.GroupBy(val => val.SomeProp)
.ToDictionary(g => g.Key, g => g.ToList());
List<MyObj> trues = groups[true];
List<MyObj> falses = groups[false];
Copy pasta extension method for your convenience.
public static void Fork<T>(
    this IEnumerable<T> source,
    Func<T, bool> pred,
    out IEnumerable<T> matches,
    out IEnumerable<T> nonMatches)
{
    var groupedByMatching = source.ToLookup(pred);
    matches = groupedByMatching[true];
    nonMatches = groupedByMatching[false];
}
Or using tuples in C# 7.0:
public static (IEnumerable<T> matches, IEnumerable<T> nonMatches) Fork<T>(
    this IEnumerable<T> source,
    Func<T, bool> pred)
{
    var groupedByMatching = source.ToLookup(pred);
    return (groupedByMatching[true], groupedByMatching[false]);
}
// Ex.
var numbers = new [] { 1, 2, 3, 4, 5, 6, 7, 8 };
var (numbersLessThanEqualFour, numbersMoreThanFour) = numbers.Fork(x => x <= 4);
Modern C# example using just Linq, no custom extension methods:
(IEnumerable<MyObj> trues, IEnumerable<MyObj> falses)
    = allValues.Aggregate<MyObj, (IEnumerable<MyObj> trues, IEnumerable<MyObj> falses)>(
        (new List<MyObj>(), new List<MyObj>()),
        (a, i) => i.SomeProp ? (a.trues.Append(i), a.falses) : (a.trues, a.falses.Append(i))
    );
Does this answer the question, yes; is this better or more readable than a foreach, no.
In all of these answers you lose LINQ's 2nd greatest power (after expressiveness of course); laziness! When we call ToDictionary() or ToLookup() we are forcing an enumeration.
Let's take a look at the implementation of partition in Haskell, a great lazy functional programming language.
From Hoogle:
partition p xs = (filter p xs, filter (not . p) xs)
As you can see, partition is an expression that returns a tuple of two other expressions: the first applies the predicate to the elements, and the second applies the inverse of the predicate. Haskell evaluates these lazily, similar to how LINQ to Objects defers execution: an operator such as Where returns a sequence that does no work until it is enumerated.
So why don't we implement our partition extension method the same way? In LINQ, filter is called Where, so let's use that.
public static (IEnumerable<T>, IEnumerable<T>) Partition<T>(
    this IEnumerable<T> source, Func<T, bool> predicate)
    => (source.Where(predicate), source.Where(x => !predicate(x)));
One caveat with this is that if you force an evaluation on the matches AND the rest, you will perform a double enumeration. However, don't try to optimise early. With this approach you can express partition in terms of LINQ thereby preserving its beneficial characteristics.
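The caveat can be observed with a counting source. In this sketch, Source and the enumerations counter are hypothetical helpers introduced only to show when the underlying sequence is actually walked:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int enumerations = 0;

// Hypothetical source that counts how many times it is enumerated from the start.
IEnumerable<int> Source()
{
    enumerations++;
    for (int i = 1; i <= 6; i++)
        yield return i;
}

Func<int, bool> isEven = n => n % 2 == 0;
IEnumerable<int> source = Source();

// Same shape as the Partition method: two lazy Where queries over one source.
var (evens, odds) = (source.Where(isEven), source.Where(n => !isEven(n)));

int beforeForcing = enumerations;   // nothing has run yet

List<int> evenList = evens.ToList(); // first pass over the source
List<int> oddList = odds.ToList();   // second pass over the source

Console.WriteLine(beforeForcing);   // 0
Console.WriteLine(enumerations);    // 2
```

If the source is expensive or can only be read once (a network stream, say), the ToLookup-based answers are the safer choice despite being eager.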
Had some fun coming up with this extension method based on the ToLookup suggestion in other answers:
public static (IEnumerable<T> XS, IEnumerable<T> YS) Bifurcate<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
    var lookup = source.ToLookup(predicate);
    return (lookup[true], lookup[false]);
}
The callsite will look like this:
var numbers = new []{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var (evens, odds) = numbers.Bifurcate(n => n % 2 == 0);
I think the usability of this is nice which is why I'm posting this answer.
We can go even further:
public static (IEnumerable<T> XS, IEnumerable<T> YS, IEnumerable<T> ZS) Trifurcate<T>(this IEnumerable<T> source, Func<T, bool> predicate1, Func<T, bool> predicate2)
{
    var lookup = source.ToLookup(x =>
    {
        if (predicate1(x))
            return 1;
        if (predicate2(x))
            return 2;
        return 3;
    });
    return (lookup[1], lookup[2], lookup[3]);
}
The order of predicates matters with this one. If you pass n => n > 5 and n => n > 100 in that order for example, the second collection will always be empty.
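Here is a small sketch of that ordering effect. Split is a hypothetical int-only local function that mirrors the Trifurcate logic, so the example runs without a separate extension class, and the sample numbers are made up:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical int-only stand-in for Trifurcate: the first matching predicate wins.
static (List<int> First, List<int> Second, List<int> Rest) Split(
    IEnumerable<int> source, Func<int, bool> predicate1, Func<int, bool> predicate2)
{
    var lookup = source.ToLookup(x => predicate1(x) ? 1 : predicate2(x) ? 2 : 3);
    return (lookup[1].ToList(), lookup[2].ToList(), lookup[3].ToList());
}

var numbers = new[] { 1, 50, 200 };

// Broad predicate first: everything above 100 is already claimed by "n > 5".
var broadFirst = Split(numbers, n => n > 5, n => n > 100);

// Narrow predicate first: each bucket gets something.
var narrowFirst = Split(numbers, n => n > 100, n => n > 5);

Console.WriteLine(broadFirst.Second.Count);               // 0
Console.WriteLine(string.Join(",", narrowFirst.First));   // 200
Console.WriteLine(string.Join(",", narrowFirst.Second));  // 50
Console.WriteLine(string.Join(",", narrowFirst.Rest));    // 1
```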
One might even have an itch to come up with a version of this that would work with a variable number of predicates (I know I did), but as far as I know that's not possible with tuple return values in C#.

How can I convert a Predicate<T> to an Expression<Predicate<T>> to use with Moq?

Please help this Linq newbie!
I'm creating a list inside my class under test, and I would like to use Moq to check the results.
I can easily put together a Predicate which checks the results of the list. How do I then make that Predicate into an Expression?
var myList = new List<int> {1, 2, 3};
Predicate<List<int>> myPredicate = (list) =>
{
    return list.Count == 3; // amongst other stuff
};
// ... do my stuff
myMock.Verify(m => m.DidStuffWith(It.Is<List<int>>( ??? )));
??? needs to be an Expression<Predicate<List<int>>> if you can get your head round that many generics. I've found answers which do this the other way round and compile an Expression into a Predicate. They're not helping me understand Linq any better, though.
EDIT: I've got it working with methods; with expressions; I would just like to know if there's any way to do it with a lambda with a body - and if not, why not?
Change:
Predicate<List<int>> myPredicate = (list) => list.Count == 3;
To:
Expression<Predicate<List<int>>> myPredicate = (list) => list.Count == 3;
The compiler does the magic for you. With some caveats [1], any lambda dealing only with expressions (no blocks) can be converted into an expression tree by wrapping the delegate type (in this case Predicate<List<int>>) with Expression<>. As you noted, you could then invert it yet again by calling myPredicate.Compile().
[1] For example, async lambdas (e.g. async () => await Task.Delay(1)) cannot be converted to an expression tree.
Update: You simply cannot use the compiler to arrive at the expression tree you want if it includes statements. You'll have to build up the expression tree yourself (a lot more work) using the static methods in Expression. (Expression.Block, Expression.IfThen, etc.)
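As a rough sketch of what building it by hand looks like, the following constructs the tree for a lambda with a body, equivalent to list => { int count = list.Count; return count == 3; }. The variable names are illustrative, and whether a given Moq version can evaluate a block-bodied tree inside It.Is is not something this example establishes; it only shows the tree being built and compiled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Parameter of the lambda and a local variable used inside its body.
ParameterExpression list = Expression.Parameter(typeof(List<int>), "list");
ParameterExpression count = Expression.Variable(typeof(int), "count");

// Block body: assign the local, then yield the comparison as the block's value.
BlockExpression body = Expression.Block(
    new[] { count },
    Expression.Assign(count, Expression.Property(list, nameof(List<int>.Count))),
    Expression.Equal(count, Expression.Constant(3)));

Expression<Predicate<List<int>>> predicate =
    Expression.Lambda<Predicate<List<int>>>(body, list);

// Compile turns the tree back into an ordinary delegate.
Predicate<List<int>> compiled = predicate.Compile();

Console.WriteLine(compiled(new List<int> { 1, 2, 3 })); // True
Console.WriteLine(compiled(new List<int> { 1 }));       // False
```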
Kirk Woll's answer directly addresses your question, but consider the fact that you can use the Callback method on a Setup of a void method to handle the parameters which were passed in on invocation. This makes more sense to me since you're already having to build a method to validate the list anyway; it also gives you a local copy of the list.
// what is this list used for?
var myList = new List<int> {1, 2, 3};
List<int> listWithWhichStuffWasDone = null;
// other setup
myMock.Setup(m => m.DoStuffWith(It.IsAny<List<int>>()))
    .Callback<List<int>>(l => listWithWhichStuffWasDone = l); // capture the argument passed on invocation
objectUnderTest.MethodUnderTest();
myMock.Verify(m => m.DoStuffWith(It.IsAny<List<int>>()));
Validate(listWithWhichStuffWasDone);
