Flatten IEnumerable<IEnumerable<>>; understanding generics - C#

I wrote this extension method (which compiles):
public static IEnumerable<J> Flatten<T, J>(this IEnumerable<T> @this)
    where T : IEnumerable<J>
{
    foreach (T t in @this)
        foreach (J j in t)
            yield return j;
}
Why does the code below cause a compile-time error (no suitable method found)?
IEnumerable<IEnumerable<int>> foo = new int[2][];
var bar = foo.Flatten();
If I implement the extension like below, I get no compile time error:
public static IEnumerable<J> Flatten<J>(this IEnumerable<IEnumerable<J>> @this)
{
    foreach (IEnumerable<J> js in @this)
        foreach (J j in js)
            yield return j;
}
Edit (2): I consider this question answered, but it raised another question about overload resolution and type constraints, which I posted here: Why aren't type constraints part of the method signature?

First, you don't need Flatten(); that method already exists, and is called SelectMany(). You can use it like this:
IEnumerable<IEnumerable<int>> foo = new [] { new[] {1, 2}, new[] {3, 4} };
var bar = foo.SelectMany(x => x); // bar is {1, 2, 3, 4}
Second, your first attempt doesn't work because generic type inference works only based on the arguments to the method, not generic constraints associated with the method. Since there is no argument that directly uses the J generic parameter, the type inference engine can't guess what J should be, and thus doesn't think that your method is a candidate.
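For what it's worth, if you supply both type arguments explicitly, the first version should resolve fine; it is only inference of J that gives up, because no argument mentions J:
// Explicit type arguments satisfy the constraint, so this should compile;
// inference fails only because nothing in the argument list mentions J.
var bar = foo.Flatten<IEnumerable<int>, int>();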
It's edifying to see how SelectMany() gets around this: it requires an additional Func<TSource, IEnumerable<TResult>> argument. That allows the type inference engine to determine both generic type parameters, since both are available based solely on the arguments provided to the method.
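For reference, here is a simplified sketch of the shape of that overload (not the actual BCL source); both TSource and TResult appear in the parameter list, so the call site supplies everything inference needs:
// Simplified sketch of the System.Linq.Enumerable.SelectMany overload in question.
public static IEnumerable<TResult> SelectMany<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, IEnumerable<TResult>> selector)
{
    foreach (TSource item in source)
        foreach (TResult result in selector(item))
            yield return result;
}

// TSource is inferred from foo, TResult from the x => x selector:
var bar = foo.SelectMany(x => x);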

dlev's answer is fine; I just thought I'd add a little more information.
Specifically, I note that you are attempting to use generics to implement a sort of covariance on IEnumerable<T>. In C# 4 and above, IEnumerable<T> already is covariant.
Your second example illustrates this. If you have
List<List<int>> lists = whatever;
foreach(int x in lists.Flatten()) { ... }
then type inference will reason that List<List<int>> is convertible to IE<List<int>>, List<int> is convertible to IE<int>, and therefore, because of covariance, IE<List<int>> is convertible to IE<IE<int>>. That gives type inference something to go on; it can infer that the element type (J in your second version) is int, and everything is good.
This doesn't work in C# 3. Life is a bit harder in a world without covariance but you can get by with judicious use of the Cast<T> extension method.
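For example, a sketch of the C# 3 workaround using the second Flatten above (GetLists() here is just a stand-in for "whatever"); Cast<IEnumerable<int>>() supplies the conversion that covariance provides for free in C# 4:
// C# 3: IEnumerable<T> is not covariant, so convert each inner List<int>
// to IEnumerable<int> explicitly before flattening.
List<List<int>> lists = GetLists();
IEnumerable<IEnumerable<int>> flattenable = lists.Cast<IEnumerable<int>>();
foreach (int x in flattenable.Flatten()) { /* ... */ }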

Related

Why does the compiler not allow us to use `var` instead of a `generic type`?

In the example code below, a generic type parameter is used to write a Reverse function that reverses an array of any type:
public T[] Reverse<T>(T[] array)
{
    var result = new T[array.Length];
    int j = 0;
    for (int i = array.Length - 1; i >= 0; i--)
    {
        result[j] = array[i];
        j++;
    }
    return result;
}
However, I imagined I could write the same code as below, using var:
public var[] Reverse(var[] array)
{
    var result = new var[array.Length];
    int j = 0;
    for (int i = array.Length - 1; i >= 0; i--)
    {
        result[j] = array[i];
        j++;
    }
    return result;
}
However, the compiler does not accept the latter. I want to know: what is the difference between a generic type and var?
It doesn't compile, so it doesn't work.
Generics and var are very different things. var means "compiler, I'm lazy, please discover for me the single exact type that I should use here, inferring it from what I write after the =" (there are some cases where it is mandatory to use var instead of writing the variable type explicitly, but we will ignore them here). So for example
var foo = "Hello";
The foo variable type is string, because the compiler can infer it by looking at the type of the expression after the assignment =. The var is totally replaced by the "correct" type in the compiled program.
So it would be equivalent to writing:
string foo = "Hello";
Generics instead are a way to make a method/class able to adapt to different types that are used in calling/creating them. In this instance the caller could
int[] foo1 = Reverse(new int[] { 1, 2, 3, 4, 5 });
or
long[] bar1 = Reverse(new long[] { 1, 2, 3, 4, 5 });
The compiler (because generics are resolved at compile time) will infer the type T (int or long) from the parameters used and will record it in the compiled file. The runtime will then see this and create two different specialized versions of Reverse (one for int and one for long). So T is an "opening" onto the various possible parameter types, whereas with var there is a single possible type the variable can be. In other words, the compiled file contains a single Reverse<T> method, while at runtime there is a Reverse<int> version of the method and a Reverse<long> version of the method (and, if necessary, the runtime will create other versions).
Using var as a parameter wouldn't have any meaning, and it would be poorer syntax than generics, where the list of generic type parameters is put in a known place (between the method name and the ( in this case) and you can have multiple generic types, like
public static IEnumerable<TResult> Select<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
(that is the LINQ Select), where there are two generic parameters, TSource and TResult. With your syntax you wouldn't be able to differentiate between the two generic parameters (there is a single var keyword), and you couldn't use var as it is currently used (compiler, I'm lazy, please discover for me the type of this local variable).
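For instance, a small, self-contained sketch where both type parameters of Select are inferred at the call site:
using System;
using System.Collections.Generic;
using System.Linq;

class SelectInferenceDemo
{
    static void Main()
    {
        // TSource = string is inferred from the array,
        // TResult = int is inferred from s => s.Length.
        string[] words = { "alpha", "beta", "gamma" };
        IEnumerable<int> lengths = words.Select(s => s.Length);
        Console.WriteLine(string.Join(", ", lengths)); // 5, 4, 5
    }
}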

Returning different type via generics in C#

My generic extension method signature:
public static class MyExtensions
{
public static IEnumerable<R> DoThing<T, R>(this IEnumerable<T> source)
where T : class
where R : class
{
throw new NotImplementedException();
}
}
My usage pattern:
List<MyItem> codes = new List<MyItem>();
List<MyNewItem> newValues = codes.Where(o => o.Property == 1).DoThing<MyItem, MyNewItem>();
codes.Where returns an IEnumerable<T> here; it's the normal System.Linq namespace.
Results in:
'IEnumerable' does not contain a definition for 'DoThing' and no extension method 'DoThing' accepting a first argument of type 'IEnumerable' could be found (are you missing a using directive or an assembly reference?)
I needed to specify both types, T and R, as it turns out.
Thank you all for the help!
You must either specify no type arguments or all of them.
Do this:
List<MyNewItem> newValues = codes.Where(o => o.Property == 1).DoThing();
That won't work with the code you show (because it doesn't return anything), but if you fix it, it probably will. Otherwise do this:
List<MyNewItem> newValues = codes.Where(o => o.Property == 1).DoThing<MyItem, MyNewItem>();
You cannot do partial method type inference in C#. It would be nice if you could, but you can't. Your choices are to either provide the entire list of type arguments for the method call, or none of them and have them deduced.
In your case type inference will never deduce the type of R because type inference only looks at arguments and formals and there is no formal that uses R.
Looking more specifically at your scenario: I cannot for the life of me see what the body of your method could possibly be; it must create a sequence of R out of a sequence of T. OK, let's suppose T is Giraffe and R is Comparer<Nullable<Rectangle>>. How on earth are you going to get a bunch of nullable rectangle comparers out of a bunch of giraffes?
You are almost certainly doing something very wrong here. Typically you would provide a function to do that. That's why the signature of Select is
public static IEnumerable<R> Select<A, R>(
this IEnumerable<A> items,
Func<A, R> projection)
Now there is enough compile time information for the compiler to deduce both A and R from the call site arguments, and there is a way to get a sequence of R out of a sequence of A, by calling the projection on each A.
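Applied to DoThing, that would look roughly like this (a sketch; the projection parameter and the MyNewItem construction in the usage comment are hypothetical):
public static class MyExtensions
{
    // With a projection argument, both T and R can be inferred at the call site.
    public static IEnumerable<R> DoThing<T, R>(
        this IEnumerable<T> source,
        Func<T, R> projection)
        where T : class
        where R : class
    {
        foreach (T item in source)
            yield return projection(item);
    }
}

// Usage (T = MyItem and R = MyNewItem are both inferred):
// var newValues = codes.Where(o => o.Property == 1)
//                      .DoThing(o => new MyNewItem(/* ... */))
//                      .ToList();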

Cast from IEnumerable to IEnumerable<object>

I prefer to use IEnumerable<object>, because the LINQ extension methods are defined on it, not on IEnumerable, so I can write, for example, range.Skip(2). However, I also prefer to use IEnumerable, because T[] is implicitly convertible to IEnumerable whether T is a reference type or a value type, and in the latter case no boxing is involved, which is good. As a result, I can write IEnumerable range = new[] { 1, 2, 3 };. It seems impossible to combine the best of both worlds, so I chose to settle on IEnumerable and do some kind of cast when I need to apply LINQ methods.
From this SO thread, I learned that range.Cast<object>() is able to do the job, but it incurs performance overhead which is unnecessary in my opinion. I tried to perform a direct compile-time cast, (IEnumerable<object>)range. According to my tests, it works when the element type is a reference type but not when it is a value type. Any ideas?
FYI, the question stems from this GitHub issue. And the test code I used is as follows:
static void Main(string[] args)
{
// IEnumerable range = new[] { 1, 2, 3 }; // won't work
IEnumerable range = new[] { "a", "b", "c" };
var range2 = (IEnumerable<object>)range;
foreach (var item in range2)
{
Console.WriteLine(item);
}
}
According to my tests, it works for reference element type but not for value type.
Correct. This is because IEnumerable<out T> is co-variant, and co-variance/contra-variance is not supported for value types.
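A quick illustration (a minimal sketch):
using System;
using System.Collections;
using System.Collections.Generic;

class CovarianceDemo
{
    static void Main()
    {
        // Reference element type: the variance conversion from
        // IEnumerable<string> to IEnumerable<object> exists, so the cast succeeds.
        IEnumerable strings = new[] { "a", "b", "c" };
        IEnumerable<object> ok = (IEnumerable<object>)strings;
        Console.WriteLine(ok != null); // True

        // Value element type: int[] implements IEnumerable<int>, but there is
        // no variance conversion to IEnumerable<object>, so the cast fails.
        IEnumerable ints = new[] { 1, 2, 3 };
        try
        {
            var boom = (IEnumerable<object>)ints; // throws InvalidCastException
            Console.WriteLine(boom);              // never reached
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException, as expected");
        }
    }
}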
I come to know that range.Cast<object>() is able to do the job. But it incurs performance overhead which is unnecessary in my opinion.
IMO the performance cost (caused by boxing) is unavoidable if you want a collection of objects from a collection of value types. Using the non-generic IEnumerable won't avoid boxing either, because IEnumerable.GetEnumerator provides an IEnumerator whose .Current property returns an object. I'd prefer to always use IEnumerable<T> instead of IEnumerable. So just use the .Cast<object>() method and forget about the boxing.
After decompiling that extension, the source showed this:
public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source)
{
IEnumerable<TResult> enumerable = source as IEnumerable<TResult>;
if (enumerable != null)
return enumerable;
if (source == null)
throw Error.ArgumentNull("source");
return Enumerable.CastIterator<TResult>(source);
}
private static IEnumerable<TResult> CastIterator<TResult>(IEnumerable source)
{
foreach (TResult result in source)
yield return result;
}
For reference element types, this does essentially nothing more than the direct cast to IEnumerable<object> would have done in the first place.
You stated:
According to my tests, it works for reference element type but not for value type.
How did you test that?
Although I really do not like this approach, I know it is possible to provide a toolset similar to LINQ-to-Objects that is callable directly on the non-generic IEnumerable interface, without forcing a cast to IEnumerable<object> (bad: possible boxing!) and without casting to IEnumerable<TFoo> (even worse: we'd need to know and write TFoo!).
However, it is:
not free at runtime: it may be heavy; I didn't run performance tests
not free for the developer: you actually need to write all those LINQ-like extension methods for IEnumerable (or find a lib that does it)
not simple: you need to inspect the incoming type carefully and be careful with the many possible options
not an oracle: given a collection that implements IEnumerable but does not implement IEnumerable<T>, it can only throw an error or silently cast it to IEnumerable<object>
not always going to work: given a collection that implements both IEnumerable<int> and IEnumerable<string>, it simply cannot know what to do; even giving up and casting to IEnumerable<object> doesn't sound right here
Here's an example for .Net4+:
using System;
using System.Linq;
using System.Collections.Generic;
class Program
{
public static void Main()
{
Console.WriteLine("List<int>");
new List<int> { 1, 2, 3 }
.DoSomething()
.DoSomething();
Console.WriteLine("List<string>");
new List<string> { "a", "b", "c" }
.DoSomething()
.DoSomething();
Console.WriteLine("int[]");
new int[] { 1, 2, 3 }
.DoSomething()
.DoSomething();
Console.WriteLine("string[]");
new string[] { "a", "b", "c" }
.DoSomething()
.DoSomething();
Console.WriteLine("nongeneric collection with ints");
var stack = new System.Collections.Stack();
stack.Push(1);
stack.Push(2);
stack.Push(3);
stack
.DoSomething()
.DoSomething();
Console.WriteLine("nongeneric collection with mixed items");
new System.Collections.ArrayList { 1, "a", null }
.DoSomething()
.DoSomething();
Console.WriteLine("nongeneric collection with .. bits");
new System.Collections.BitArray(0x6D)
.DoSomething()
.DoSomething();
}
}
public static class MyGenericUtils
{
public static System.Collections.IEnumerable DoSomething(this System.Collections.IEnumerable items)
{
// check the REAL type of incoming collection
// if it implements IEnumerable<T>, we're lucky!
// we can unwrap it
// ...usually. How to unwrap it if it implements it multiple times?!
var ietype = items.GetType().FindInterfaces((t, args) =>
t.IsGenericType && t.GetGenericTypeDefinition() == typeof(IEnumerable<>),
null).SingleOrDefault();
if (ietype != null)
{
return
doSomething_X(
doSomething_X((dynamic)items)
);
// .doSomething_X() - and since the compile-time type is 'dynamic' I cannot chain
// .doSomething_X() - it in normal way (despite the fact it would actually compile well)
// `dynamic` doesn't resolve extension methods!
// it would look for doSomething_X inside the returned object
// ..but that's just INTERNAL implementation. For the user
// on the outside it's chainable
}
else
// uh-oh, now what? it can be an array, it can be a non-generic collection
// like System.Collections.Hashtable .. but..
// from the type-definition point of view it means it holds any
// OBJECTs inside, even mixed types, and it exposes them as IEnumerable
// which returns them as OBJECTs, so..
return items.Cast<object>()
.doSomething_X()
.doSomething_X();
}
private static IEnumerable<T> doSomething_X<T>(this IEnumerable<T> valitems)
{
// do-whatever,let's just see it being called
Console.WriteLine("I got <{1}>: {0}", valitems.Count(), typeof(T));
return valitems;
}
}
Yes, that's silly. I chained them four times (2 outside x 2 inside) just to show that the type information is not lost in subsequent calls. The point was to show that the 'entry point' takes a non-generic IEnumerable and that <T> is resolved wherever it can be. You can easily adapt the code to make it a normal LINQ-to-Objects .Count() method. Similarly, one can write all the other operations, too.
This example uses dynamic to let the runtime resolve the most specific T for IEnumerable<T>, if possible (which we need to ensure first). Without dynamic (i.e. .NET 2.0) we'd need to invoke doSomething_X through reflection, or implement it twice, as doSomething_Refs<T>() where T : class and doSomething_Vals<T>() where T : struct, and do some magic to call the right one without actually casting (probably reflection, again).
Nevertheless, it seems that you can get something-like-LINQ working "directly" on things hidden behind the non-generic IEnumerable, all thanks to the fact that the objects behind IEnumerable still carry their full type information (though that assumption may fail with COM or Remoting). However, I think settling for IEnumerable<T> is the better option. Let's leave plain old IEnumerable to special cases where there is really no other option.
..oh.. and I actually didn't investigate if the code above is correct, fast, safe, resource-conserving, lazy-evaluating, etc.
IEnumerable<T> is a generic interface. As long as you're only dealing with generics and types known at compile-time, there's no point in using IEnumerable<object> - either use IEnumerable<int> or IEnumerable<T>, depending entirely on whether you're writing a generic method, or one where the correct type is already known. Don't try to find an IEnumerable to fit them all - use the correct one in the first place - it's very rare for that not to be possible, and most of the time, it's simply a result of bad object design.
The reason IEnumerable<int> cannot be cast to IEnumerable<object> may be somewhat surprising, but it's actually very simple - value types aren't polymorphic, so they don't support co-variance. Do not be mistaken - IEnumerable<string> doesn't implement IEnumerable<object> - the only reason you can cast IEnumerable<string> to IEnumerable<object> is that IEnumerable<T> is co-variant.
It's just a funny case of "surprising, yet obvious". It's surprising, since int derives from object, right? And yet, it's obvious, because int doesn't really derive from object, even though it can be cast to an object through a process called boxing, which creates a "real object-derived int".
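In code terms, the boxing that makes that cast possible looks like this (a trivial sketch):
int i = 42;
object boxed = i;         // boxing: allocates a heap object that wraps the value
int unboxed = (int)boxed; // unboxing: copies the value back out of the box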

Can params[] be parameters for a lambda expression? [duplicate]

This question already has answers here:
Variable parameters in C# Lambda
I've recently started exploring lambda expressions, and a question came to mind. Say I have a function that requires an indeterminate number of parameters. I would use the params keyword to model that variable number of parameters.
My question: can I do something similar with Lambda expressions? For example:
Func<int[], int> foo = (params int[] numbers) =>
{
    int result = 0;
    foreach (int number in numbers)
    {
        result += number;
    }
    return result;
};
If so, two sub-questions present themselves - is there a 'good' way to write such an expression, and would I even want to write an expression like this at some point?
Well, sort of.
First, instead of using Func<>, you would need to define a custom delegate:
public delegate int ParamsFunc (params int[] numbers);
Then, you could write a following lambda:
ParamsFunc sum = p => p.Sum();
And invoke it with variable number of arguments:
Console.WriteLine(sum(1, 2, 3));
Console.WriteLine(sum(1, 2, 3, 4));
Console.WriteLine(sum(1, 2, 3, 4, 5));
But to be honest, it is really much more straightforward to stick with built-in Func<> delegates.
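For comparison, a sketch of the Func<>-based version, where the caller passes an explicit array instead of a params list:
using System;
using System.Linq;

class FuncVsParamsDelegate
{
    static void Main()
    {
        // Built-in delegate: the caller wraps the arguments in an array.
        Func<int[], int> sum = numbers => numbers.Sum();

        Console.WriteLine(sum(new[] { 1, 2, 3 }));       // 6
        Console.WriteLine(sum(new[] { 1, 2, 3, 4, 5 })); // 15
    }
}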
The closest thing that I think you can get would be something like this:
Func<int[], int> foo = numbers =>
{
    // logic...
};
var result = foo(Params.Get(1, 5, 4, 4, 36, 321, 21, 2, 0, -4));
And have:
public static class Params
{
    public static T[] Get<T>(params T[] arr)
    {
        return arr;
    }
}
But I can't see how that beats a simple new[] {1, 5, 4, 4, ...}
There are two things here: the Func<int[], int> generic delegate on the LHS and the lambda expression on the RHS. The former cannot involve params, since a Func<T, TResult> delegate is declared like:
public delegate TResult Func<in T, out TResult>(T arg); //ie no params involved
You need your own delegate that accepts params input as shown in accepted answer.
The latter, which is what the question title is about, is not possible in C# either, but for a reason.
The LHS of an assignment expression is a compile-time thing (unless it's dynamic, of course, but again the compiler is aware of that) and its RHS is a run-time thing (except, of course, for consts). The compiler can infer what's typed on the LHS, but it gets the values on the RHS only at run time, i.e. when the code is run. When you type this:
Func<int[], int> foo = ....
foo is always considered to be Func<int[], int>. It would add a lot of complexity to the compiler if it had to decipher the RHS. For example, if what you're attempting were possible, think about this scenario:
Func<int[], int> foo = (params int[] numbers) =>
{
    int result = 0;
    foreach (int number in numbers)
    {
        result += number;
    }
    return result;
};
//and later at some other place
foo = (int[] numbers) => 0;
//how would you call 'foo' now?
Instead, when you write your own delegate that accepts params, you're telling the compiler directly (i.e. it is known from the LHS).
Of the three features that parameters of a named method support, i.e. out/ref, params, and optional parameters, lambda expressions (and even the earlier anonymous delegate syntax) support only out/ref.
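For instance, a lambda can have a ref parameter as long as the delegate type declares one (a small sketch):
using System;

// A custom delegate type is needed; Func<> and Action<> cannot declare ref parameters.
delegate void RefAction(ref int x);

class RefLambdaDemo
{
    static void Main()
    {
        RefAction increment = (ref int x) => x++;

        int value = 41;
        increment(ref value);
        Console.WriteLine(value); // 42
    }
}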

C# Generic Generics (A Serious Question)

In C# I am trying to write code where I would be creating a Func delegate which is in itself generic. For example the following (non-Generic) delegate is returning an arbitrary string:
Func<string> getString = () => "Hello!";
I on the other hand want to create a generic which acts similarly to generic methods. For example if I want a generic Func to return default(T) for a type T. I would imagine that I write code as follows:
Func<T><T> getDefaultObject = <T>() => default(T);
Then I would use it as
getDefaultObject<string>() which would return null and if I were to write getDefaultObject<int>() would return 0.
This question is not merely an academic exercise. I have found numerous places where I could have used this, but I cannot get the syntax right. Is this possible? Are there any libraries that provide this sort of functionality?
Well, you can't overload anything based only on the return value, and that includes variables.
You can however get rid of that lambda expression and write a real function:
T getDefaultObject<T>() { return default(T); }
and then you call it exactly like you want:
int i=getDefaultObject<int>(); // i=0
string s=getDefaultObject<string>(); // s=null
Though one might find practical workarounds like Stephen Cleary's
Func<T> CreateGetDefaultObject<T>() { return () => default(T); }
where you can specify the type argument directly, this is quite an interesting problem from a theoretical point of view, one that cannot be solved by C#'s current type system.
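A quick usage sketch, assuming the CreateGetDefaultObject<T> factory above:
// The type argument is supplied when the factory is called, so each call
// yields an ordinary, non-generic Func<T> closed over that type.
Func<int> getDefaultInt = CreateGetDefaultObject<int>();
Func<string> getDefaultString = CreateGetDefaultObject<string>();

Console.WriteLine(getDefaultInt());            // 0
Console.WriteLine(getDefaultString() == null); // True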
A type which, as you call it, is in itself generic, is referred to as a higher-rank type.
Consider the following example (pseudo-C#):
Tuple<int[], string[]> Test(Func<?> f) {
return (f(1), f("Hello"));
}
In your proposed system, a call could look like that:
Test(x => new[] { x }); // Returns ({ 1 }, { "Hello" })
But the question is: how do we type the function Test and its argument f?
Apparently, f maps every type T to an array T[] of that type. So maybe:
Tuple<int[], string[]> Test<T>(Func<T, T[]> f) {
return (f(1), f("Hello"));
}
But this doesn't work. We can't parameterize Test with any particular T, since f should be applicable to all types T. At this point, C#'s type system can't go any further.
What we needed was a notation like
Tuple<int[], string[]> Test(forall T : Func<T, T[]> f) {
return (f(1), f("Hello"));
}
In your case, you could type
forall T : Func<T> getDefaultValue = ...
The only language I know that supports this kind of generics is Haskell:
test :: (forall t . t -> [t]) -> ([Int], [String])
test f = (f 1, f "hello")
See this Haskellwiki entry on polymorphism about this forall notation.
This isn't possible, since a delegate instance in C# cannot have generic parameters. The closest you can get is to pass the type object as a regular parameter and use reflection. :(
In many cases, casting to dynamic helps remove the pain of reflection, but dynamic doesn't help when creating new instances, such as your example.
You can't do this, because generic type arguments have to be supplied at compile time; to construct a type known only at run time you have to use the Activator class:
Object o = Activator.CreateInstance(typeof(StringBuilder));
which will do exactly what you want. You can wrap it as follows:
public T Default<T>()
{
return (T)Activator.CreateInstance(typeof(T));
}
Edit
Blindy's solution is better.
