I prefer to use IEnumerable<object>, because the LINQ extension methods are defined on IEnumerable<T>, not on IEnumerable, so that I can use, for example, range.Skip(2). However, I also prefer to use IEnumerable, because T[] is implicitly convertible to IEnumerable whether T is a reference type or a value type. In the latter case no boxing is involved, which is good. As a result, I can write IEnumerable range = new[] { 1, 2, 3 }. It seems impossible to get the best of both worlds. Anyway, I chose to settle on IEnumerable and do some kind of cast when I need to apply LINQ methods.
From this SO thread, I come to know that range.Cast<object>() is able to do the job. But it incurs performance overhead which is unnecessary in my opinion. I tried to perform a direct compile-time cast like (IEnumerable<object>)range. According to my tests, it works for reference element type but not for value type. Any ideas?
FYI, the question stems from this GitHub issue. And the test code I used is as follows:
static void Main(string[] args)
{
    // IEnumerable range = new[] { 1, 2, 3 }; // won't work
    IEnumerable range = new[] { "a", "b", "c" };
    var range2 = (IEnumerable<object>)range;
    foreach (var item in range2)
    {
        Console.WriteLine(item);
    }
}
According to my tests, it works for reference element type but not for
value type.
Correct. This is because IEnumerable<out T> is co-variant, and co-variance/contra-variance is not supported for value types.
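A minimal sketch of that difference, using only the standard library. The names VarianceDemo, IsObjectSequence and AsObjects are made up for illustration and do not come from the thread:

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class VarianceDemo
{
    // True when the runtime object can be viewed as IEnumerable<object>
    // as-is, i.e. when covariance applies to its element type.
    public static bool IsObjectSequence(IEnumerable source)
    {
        return source is IEnumerable<object>;
    }

    // Works for any element type; value-type elements are boxed
    // one by one as the result is enumerated.
    public static IEnumerable<object> AsObjects(IEnumerable source)
    {
        return source.Cast<object>();
    }
}
```

With these helpers, IsObjectSequence(new[] { "a", "b" }) is true while IsObjectSequence(new[] { 1, 2 }) is false, and AsObjects accepts both arrays.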
I come to know that range.Cast<object>() is able to do the job. But it
incurs performance overhead which is unnecessary in my opinion.
IMO the performance cost (brought by boxing) is unavoidable if you want a collection of objects from a collection of value types. Using the non-generic IEnumerable won't avoid boxing, because IEnumerable.GetEnumerator returns an IEnumerator whose .Current property returns an object. I'd always prefer IEnumerable<T> over IEnumerable. So just use the .Cast method and forget about the boxing.
After decompiling that extension, the source showed this:
public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source)
{
    IEnumerable<TResult> enumerable = source as IEnumerable<TResult>;
    if (enumerable != null)
        return enumerable;
    if (source == null)
        throw Error.ArgumentNull("source");
    return Enumerable.CastIterator<TResult>(source);
}
private static IEnumerable<TResult> CastIterator<TResult>(IEnumerable source)
{
    foreach (TResult result in source)
        yield return result;
}
So Cast<TResult>() first tries exactly the cast you wrote by hand, to IEnumerable<TResult>, and only falls back to converting item by item when that cast fails.
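One way to observe that short-circuit is to compare references. This is only a sketch; CastDemo and CastIsNoOp are hypothetical names chosen for the example:

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class CastDemo
{
    // Reports whether Cast<object>() handed back the very same instance,
    // meaning no wrapper iterator was created around the source.
    public static bool CastIsNoOp(IEnumerable source)
    {
        IEnumerable<object> cast = source.Cast<object>();
        return object.ReferenceEquals(cast, source);
    }
}
```

For a List<string> the call should return true, because the list already is an IEnumerable<object> through covariance. For a List<int> it should return false, because a new iterator that boxes each element is created.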
You stated:
According to my tests, it works for reference element type but not for
value type.
How did you test that?
Although I really do not like this approach, I know it is possible to provide a toolset similar to LINQ-to-Objects that is callable directly on an IEnumerable interface, without forcing a cast to IEnumerable<object> (bad: possible boxing!) and without casting to IEnumerable<TFoo> (even worse: we'd need to know and write TFoo!).
However, it is:
not free at runtime: it may be heavy; I didn't run a performance test
not free for developer: you actually need to write all those LINQ-like extension methods for IEnumerable (or find a lib that does it)
not simple: you need to inspect the incoming type carefully and need to be careful with many possible options
is not an oracle: given a collection that implements IEnumerable but does not implement IEnumerable<T> it only can throw error or silently cast it to IEnumerable<object>
will not always work: given a collection that implements both IEnumerable<int> and IEnumerable<string> it simply cannot know what to do; even giving up and casting to IEnumerable<object> doesn't sound right here
Here's an example for .NET 4+:
using System;
using System.Linq;
using System.Collections.Generic;
class Program
{
public static void Main()
{
Console.WriteLine("List<int>");
new List<int> { 1, 2, 3 }
.DoSomething()
.DoSomething();
Console.WriteLine("List<string>");
new List<string> { "a", "b", "c" }
.DoSomething()
.DoSomething();
Console.WriteLine("int[]");
new int[] { 1, 2, 3 }
.DoSomething()
.DoSomething();
Console.WriteLine("string[]");
new string[] { "a", "b", "c" }
.DoSomething()
.DoSomething();
Console.WriteLine("nongeneric collection with ints");
var stack = new System.Collections.Stack();
stack.Push(1);
stack.Push(2);
stack.Push(3);
stack
.DoSomething()
.DoSomething();
Console.WriteLine("nongeneric collection with mixed items");
new System.Collections.ArrayList { 1, "a", null }
.DoSomething()
.DoSomething();
Console.WriteLine("nongeneric collection with .. bits");
new System.Collections.BitArray(0x6D)
.DoSomething()
.DoSomething();
}
}
public static class MyGenericUtils
{
public static System.Collections.IEnumerable DoSomething(this System.Collections.IEnumerable items)
{
// check the REAL type of incoming collection
// if it implements IEnumerable<T>, we're lucky!
// we can unwrap it
// ...usually. How to unwrap it if it implements it multiple times?!
var ietype = items.GetType().FindInterfaces((t, args) =>
t.IsGenericType && t.GetGenericTypeDefinition() == typeof(IEnumerable<>),
null).SingleOrDefault();
if (ietype != null)
{
return
doSomething_X(
doSomething_X((dynamic)items)
);
// .doSomething_X() - and since the compile-time type is 'dynamic' I cannot chain
// .doSomething_X() - it in normal way (despite the fact it would actually compile well)
// `dynamic` doesn't resolve extension methods!
// it would look for doSomething_X inside the returned object
// ..but that's just INTERNAL implementation. For the user
// on the outside it's chainable
}
else
// uh-oh. now what? it can be array, it can be a non-generic collection
// like System.Collections.Hashtable .. but..
// from the type-definition point of view it means it holds any
// OBJECTs inside, even mixed types, and it exposes them as IEnumerable
// which returns them as OBJECTs, so..
return items.Cast<object>()
.doSomething_X()
.doSomething_X();
}
private static IEnumerable<T> doSomething_X<T>(this IEnumerable<T> valitems)
{
// do-whatever,let's just see it being called
Console.WriteLine("I got <{1}>: {0}", valitems.Count(), typeof(T));
return valitems;
}
}
Yes, that's silly. I chained them four times (2 outside x 2 inside) just to show that the type information is not lost in subsequent calls. The point was to show that the 'entry point' takes a nongeneric IEnumerable and that <T> is resolved wherever it can be. You can easily adapt the code to make it a normal LINQ-to-Objects .Count() method. Similarly, one can write all other operations, too.
This example uses dynamic to let the platform resolve the narrowest T for IEnumerable<T>, if possible (which we need to ensure first). Without dynamic (i.e. .NET 2.0) we'd need to invoke doSomething_X through reflection, or implement it twice, as doSomething_refs<T>() where T : class plus doSomething_vals<T>() where T : struct, and do some magic to call it properly without actually casting (probably reflection, again).
Nevertheless, it seems that you can get something-like-linq working "directly" on things hidden behind nongeneric IEnumerable. All thanks to the fact that the objects hiding behind IEnumerable still have their own full type information (yeah, that assumption may fail with COM or Remoting). However.. I think settling for IEnumerable<T> is a better option. Let's leave plain old IEnumerable to special cases where there is really no other option.
..oh.. and I actually didn't investigate if the code above is correct, fast, safe, resource-conserving, lazy-evaluating, etc.
IEnumerable<T> is a generic interface. As long as you're only dealing with generics and types known at compile-time, there's no point in using IEnumerable<object> - either use IEnumerable<int> or IEnumerable<T>, depending entirely on whether you're writing a generic method, or one where the correct type is already known. Don't try to find an IEnumerable to fit them all - use the correct one in the first place - it's very rare for that not to be possible, and most of the time, it's simply a result of bad object design.
The reason IEnumerable<int> cannot be cast to IEnumerable<object> may be somewhat surprising, but it's actually very simple - value types aren't polymorphic, so they don't support co-variance. Do not be mistaken - IEnumerable<string> doesn't implement IEnumerable<object> - the only reason you can cast IEnumerable<string> to IEnumerable<object> is that IEnumerable<T> is co-variant.
It's just a funny case of "surprising, yet obvious". It's surprising, since int derives from object, right? And yet, it's obvious, because an int value is not stored as an object reference: converting it to object requires boxing, which allocates a new object holding a copy of the value, whereas variance only permits reference conversions that leave the value's representation unchanged.
TL;DR - I would expect these all to work the same, but (per comments) they do not:
var c1 = new[] { FileMode.Append }.Cast<int>();
var c2 = new[] { FileMode.Append }.Select(x => (int)x);
var c3 = new[] { FileMode.Append }.Select(x => x).Cast<int>();
foreach (var x in c1 as IEnumerable)
Console.WriteLine(x); // Append (I would expect 6 here!)
foreach (var x in c2 as IEnumerable)
Console.WriteLine(x); // 6
foreach (var x in c3 as IEnumerable)
Console.WriteLine(x); // 6
This is a contrived example; I obviously wouldn't cast the collections to IEnumerable if I didn't have to, and in that case everything would work as expected. But I'm working on a library with several methods that take an object and return a serialized string representation. If it determines via reflection that the object implements IEnumerable, it will enumerate it and, in almost all cases, return the expected result...except for this strange case with Array.Cast<T>.
There are two things I could do here:
Tell users to materialize IEnumerables first, such as with ToList().
Create an overload for each affected method that takes an IEnumerable<T>.
For different reasons, neither of those is ideal. Is it possible for a method that takes an object to somehow infer T when Array.Cast<T>() is passed?
Is it possible for a method that takes an object to somehow infer T when Array.Cast<T>() is passed?
No, not in the example you gave.
The reason you get the output you do is that the Enumerable.Cast<T>() method has an optimization to allow the original object to be returned when it's compatible with the type you ask for:
public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source) {
IEnumerable<TResult> typedSource = source as IEnumerable<TResult>;
if (typedSource != null) return typedSource;
if (source == null) throw Error.ArgumentNull("source");
return CastIterator<TResult>(source);
}
So in your first case, nothing actually happens. The Cast<T>() method is just returning the object you passed into the method, and so by the time you get it back, the fact that it ever went through Cast<T>() is completely lost.
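A sketch of how a reflection-style consumer sees the elements, and how the Select form from the question sidesteps the short-circuit. EnumCastDemo, ToInts and FirstElementType are hypothetical helper names, not part of any library:

```csharp
using System.Collections;
using System.IO;
using System.Linq;

public static class EnumCastDemo
{
    // Select performs a real per-element conversion, so the result does not
    // depend on whether Cast<int>() returns the original array.
    public static IEnumerable ToInts(FileMode[] modes)
    {
        return modes.Select(m => (int)m);
    }

    // Runtime type of the first element as seen through the non-generic
    // interface, which is all a method taking 'object' can observe.
    public static System.Type FirstElementType(IEnumerable items)
    {
        foreach (object item in items)
            return item.GetType();
        return null; // empty sequence
    }
}
```

FirstElementType(ToInts(new[] { FileMode.Append })) reports System.Int32. Passing new[] { FileMode.Append }.Cast<int>() instead hands the method the original FileMode[] array, according to the behaviour described in the question.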
Your question doesn't have any other details about how you got into this situation or why it matters in a practical sense. But we can say conclusively that given the code you posted, it would be impossible to achieve the goal you've stated.
I have a method that looks like this (assume that I have the necessary method GetMySerializedDataArray() and my serializer JsonSerializer):
public static List<T> GetMyListOfData<T>()
{
    var msgList = new List<T>();
    foreach (string s in GetMySerializedDataArray())
    {
        msgList.Add(JsonSerializer.Deserialize<T>(s));
    }
    return msgList;
}
This works fine and as expected.
However, I want to use the same method to optionally, if and only if the generic type is specified as string, return the data unserialized like this (which does not compile and has syntax problems):
public static List<T> GetMyListOfData<T>(bool leaveSerialized)
{
    if (typeof (T) != typeof(string) && leaveSerialized)
    {
        throw new ArgumentException("Parameter must be false when generic type is not string", "leaveSerialized");
    }
    var msgList = new List<T>();
    foreach (string s in GetMySerializedDataArray())
    {
        if (leaveSerialized)
        {
            // Casting does not work: "Cannot cast expression of type 'System.Collections.Generic.List<T>' to type 'List<string>'"
            // I've tried various permutations of "is" and "as"... but they don't work with generic types
            // But I know in this case that I DO have a list of strings..... just the compiler doesn't.
            // How do I assure the compiler?
            ((List<string>)msgList).Add(s);
        }
        else
        {
            msgList.Add(JsonSerializer.Deserialize<T>(s));
        }
    }
    return msgList;
}
My questions are in the inline comments. Basically, although the compiler clearly doesn't like the cast of generic to non-generic, and it won't let me use permutations of the "is" and "as" operators either, I know I actually have the correct string in this case. How do I assure the compiler it is OK?
Many thanks in advance.
EDIT: SOLUTION
Thanks to Lee and Lorentz, both. I will be creating two public methods, but implementing the code in a private method with the admittedly icky decision tree about whether to leave serialization. My reason is that my real-world method is far more complex than what I posed here to SO, and I don't want to duplicate those business rules.
FINAL EDIT: CHANGED SOLUTION
Although both answers were very helpful, I have now been able to detangle business rules, and as a result the "correct" answer for me is now the first -- two different methods. Thanks again to all.
You should not return a list of strings as a list of T. I would suggest that you use two separate methods and skip the parameter:
public static List<T> GetMyListOfData<T>()
public static List<string> GetSerializedMyListOfData()
The advantages of this approach are:
It's more readable (imo): GetSerializedMyListOfData() vs GetMyListOfData<string>(true)
You also know the intent of the caller at compile time and don't have to throw an exception when the type argument doesn't match the intent to leave the data serialized
You can cast to object first:
((List<string>)(object)msgList).Add(s);
however a cleaner solution could be to create another method for dealing with strings, this would also allow you to remove the leaveSerialized parameter.
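A rough sketch of the two-method split with the loop shared in one private helper, which is close to what the asker describes in the first edit. Everything here is an assumption made for illustration: MessageReader and ReadAll are invented names, the data source is a stub, and the deserializer is passed in as a delegate because the question's JsonSerializer is not specified:

```csharp
using System;
using System.Collections.Generic;

public static class MessageReader
{
    // Stand-in for the question's GetMySerializedDataArray(); hypothetical data.
    private static string[] GetMySerializedDataArray()
    {
        return new[] { "1", "2", "3" };
    }

    // Shared loop: the only thing that varies is how each string is converted.
    private static List<T> ReadAll<T>(Func<string, T> convert)
    {
        var result = new List<T>();
        foreach (string s in GetMySerializedDataArray())
            result.Add(convert(s));
        return result;
    }

    // Deserializing variant; the caller supplies the deserializer,
    // for example s => JsonSerializer.Deserialize<T>(s).
    public static List<T> GetMyListOfData<T>(Func<string, T> deserialize)
    {
        return ReadAll(deserialize);
    }

    // Raw variant: no type test, no flag and no cast are needed.
    public static List<string> GetSerializedMyListOfData()
    {
        return ReadAll(s => s);
    }
}
```

The business rules live in ReadAll only once, and neither public method needs the (List<string>)(object) double cast.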
In C# I use LINQ and IEnumerable a good bit. And all is well-and-good (or at least mostly so).
However, in many cases I find myself that I need an empty IEnumerable<X> as a default. That is, I would like
foreach (var x in xs) { ... }
to work without needing a null-check. Now this is what I currently do, depending upon the larger context:
var xs = f() ?? new X[0]; // when xs is assigned, sometimes
foreach (var x in xs ?? new X[0]) { ... } // inline, sometimes
Now, while the above is perfectly fine for me -- that is, if there is any "extra overhead" with creating the array object I just don't care -- I was wondering:
Is there "empty immutable IEnumerable/IList" singleton in C#/.NET? (And, even if not, is there a "better" way to handle the case described above?)
Java has the Collections.EMPTY_LIST immutable singleton -- "well-typed" via Collections.<T>emptyList() -- which serves this purpose, although I am not sure whether a similar concept could even work in C#, because generics are handled differently.
Thanks.
You are looking for Enumerable.Empty<T>().
In other news the Java empty list sucks because the List interface exposes methods for adding elements to the list which throw exceptions.
Enumerable.Empty<T>() is exactly that.
In your original example you use an empty array to provide an empty enumerable. While using Enumerable.Empty<T>() is perfectly right, there might be other cases: if you have to use an array (or the IList<T> interface), you can use the method
System.Array.Empty<T>()
which helps you to avoid unnecessary allocations.
Notes / References:
the documentation does not mention that this method allocates the empty array only once for each type
Roslyn analyzers recommend this method with the warning CA1825: Avoid zero-length array allocations
Microsoft reference implementation
.NET Core implementation
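A small sketch of the caching behaviour mentioned in the notes, assuming a framework that has Array.Empty<T>() (.NET Framework 4.6 or later, or .NET Core). EmptyArrayDemo and its members are hypothetical names:

```csharp
using System;

public static class EmptyArrayDemo
{
    // Two calls for the same T are compared by reference; in the linked
    // implementations both return the same cached zero-length array.
    public static bool IsShared<T>()
    {
        return object.ReferenceEquals(Array.Empty<T>(), Array.Empty<T>());
    }

    // Null-safe array access without allocating a new T[0] each time.
    public static T[] OrEmpty<T>(T[] xs)
    {
        return xs ?? Array.Empty<T>();
    }
}
```

This mirrors the f() ?? new X[0] pattern from the question, minus the per-call allocation.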
I think you're looking for Enumerable.Empty<T>().
Empty list singleton doesn't make that much sense, because lists are often mutable.
I think adding an extension method is a clean alternative thanks to their ability to handle nulls - something like:
public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> list)
{
return list ?? Enumerable.Empty<T>();
}
foreach(var x in xs.EmptyIfNull())
{
...
}
Using Enumerable.Empty<T>() with lists has a drawback. If you hand Enumerable.Empty<T> into the list constructor then an array of size 4 is allocated. But if you hand an empty Collection into the list constructor then no allocation occurs. So if you use this solution throughout your code then most likely one of the IEnumerables will be used to construct a list, resulting in unnecessary allocations.
Microsoft implemented Any() like this (source):
public static bool Any<TSource>(this IEnumerable<TSource> source)
{
if (source == null) throw new ArgumentNullException("source");
using (IEnumerator<TSource> e = source.GetEnumerator())
{
if (e.MoveNext()) return true;
}
return false;
}
If you want to save a call on the call stack, instead of writing an extension method that calls !Any(), just make these three changes:
public static bool IsEmpty<TSource>(this IEnumerable<TSource> source) //first change (name)
{
if (source == null) throw new ArgumentNullException("source");
using (IEnumerator<TSource> e = source.GetEnumerator())
{
if (e.MoveNext()) return false; //second change
}
return true; //third change
}
I had a need for a method that could take a collection of strings, and replace all occurrences of a specific string with another.
For example, if I have a List<string> that looks like this:
List<string> strings = new List<string> { "a", "b", "delete", "c", "d", "delete" };
and I want to replace "delete" with "", I would use this LINQ statement:
strings = (from s in strings select (s=="delete" ? s=String.Empty : s)).ToList();
and it works great. But then I figured I should make it an extension method, since I'd likely use it again later. In this case, I just want to write the following:
strings.ReplaceStringInListWithAnother( "delete", String.Empty);
While my code compiles, and the LINQ statement works inside of the extension method, when I return the collection reverts back to its original contents:
public static void ReplaceStringInListWithAnother( this List<string> my_list, string to_replace, string replace_with)
{
my_list = (from s in my_list select (s==to_replace ? s=replace_with : s)).ToList();
}
So it would seem that I just modified a copy of the List... but when I looked at the code for Pop, it modifies the collection similarly, yet the changes stick, so my assumption was that my method's parameter declarations are correct.
Can anyone explain what I am doing wrong here?
The LINQ statement you wrote does not modify the collection, it actually creates a new one.
The extension method you wrote creates this new collection and then discards it. The assignment is redundant: you’re assigning to a local parameter, which goes out of scope immediately after.
When you’re calling the method, you’re also discarding its result instead of assigning it back.
Therefore, you should write the method like this:
public static List<string> ReplaceStringInListWithAnother(
this List<string> my_list, string to_replace, string replace_with)
{
return (from s in my_list select
(s == to_replace ? replace_with : s)).ToList();
}
and the call like this:
strings = strings.ReplaceStringInListWithAnother("delete", "");
By the way, you can make the function more useful by making it generic:
public static List<T> ReplaceInList<T>(this List<T> my_list,
T to_replace, T replace_with) where T : IEquatable<T>
{
return (from s in my_list select
(s.Equals(to_replace) ? replace_with : s)).ToList();
}
This way you can use it for other things, not just strings. Furthermore, you can also declare it to use IEnumerable<T> instead of List<T>:
public static IEnumerable<T> ReplaceItems<T>(this IEnumerable<T> my_list,
T to_replace, T replace_with) where T : IEquatable<T>
{
return from s in my_list select (s.Equals(to_replace) ? replace_with : s);
}
This way you can use it for any collection of equatable items, not just List<T>. Notice that List<T> implements IEnumerable<T>, so you can still pass a List into this function. If you want a list out, simply call .ToList() after the call to this one.
Update: If you actually want to replace elements in a list instead of creating a new one, you can still do that with an extension method, and it can still be generic, but you can’t use Linq and you can’t use IEnumerable<T>:
public static void ReplaceInList<T>(this List<T> my_list,
T to_replace, T replace_with) where T : IEquatable<T>
{
for (int i = 0; i < my_list.Count; i++)
if (my_list[i].Equals(to_replace))
my_list[i] = replace_with;
}
This will not return the new list, but instead modify the old one, so it has a void return type like your original.
Here's a hint: what do you expect the below code to do?
void SetToTen(int y)
{
y = 10;
}
int x = 0;
SetToTen(x);
Hopefully, you understand that the SetToTen method above does nothing meaningful, since it only changes the value of its own local variable y and has no effect on the variable whose value was passed to it (in order for that to happen, the y parameter would have to be of type ref int and the method would be called as SetToTen(ref x)).
Keeping in mind that extension methods are really just static methods in fancy clothes, it should be clear why your ReplaceStringInListWithAnother is not doing what you expected: it is only setting its local my_list variable to a new value, having no effect on the original List<string> passed to the method.
Now, it's worth mentioning that the only reason this is not working for you is that your code works by setting a variable to a new object*. If you were to modify the List<string> passed to ReplaceStringInListWithAnother, everything would work just fine:
public static void ReplaceStringInListWithAnother( this List<string> my_list, string to_replace, string replace_with)
{
for (int i = 0; i < my_list.Count; ++i)
{
if (my_list[i] == to_replace)
{
my_list[i] = replace_with;
}
}
}
It's also worth mentioning that List<string> is an overly restrictive parameter type for this method; you could achieve the same functionality for any type implementing IList<string> (and so I'd change the my_list parameter to be of type IList<string>).
*Reading your question again, it seems clear to me that this is the main point of confusion for you. The important thing you have to realize is that by default, everything in C# is passed by value. With value types (anything defined as a struct -- int, double, DateTime, and many more), the thing that's passed is the value itself. With reference types (anything that's defined as a class), the thing that's passed is a reference to an object. In the latter case, all method calls on references to objects of mutable types do actually affect the underlying object, since multiple variables of reference type can point to the same object. But assignment is different from a method call; if you assign a reference to an object that has been passed by value to some new reference to an object, you are doing nothing to the underlying object, and therefore nothing is happening that would be reflected by the original reference.
This is a really important concept that many .NET developers struggle with. But it's also a topic that's been explained thoroughly elsewhere. If you need more explanation, let me know and I'll try to dig up a link to a page that makes all of this as clear as possible.
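The distinction between reassigning a parameter and mutating the object it refers to can be sketched as follows. PassingDemo and its methods are names invented for this example:

```csharp
using System.Collections.Generic;

public static class PassingDemo
{
    // Reassigns the parameter only; the caller's variable still refers
    // to the original list, which is left untouched.
    public static void Reassign(List<string> list)
    {
        list = new List<string> { "replaced" };
    }

    // Mutates the object that both the caller's variable and the
    // parameter refer to, so the caller sees the change.
    public static void Mutate(List<string> list)
    {
        list[0] = "changed";
    }

    // With ref, the assignment reaches the caller's variable itself.
    public static void ReassignByRef(ref List<string> list)
    {
        list = new List<string> { "replaced" };
    }
}
```

Starting from a list containing "x", Reassign leaves it as "x", Mutate turns the first item into "changed", and ReassignByRef(ref list) makes the caller's variable point at a new list containing "replaced".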
You haven't shown the code for "Pop" so it's hard to know what you mean. You talk about "when I return the collection" but you're not returning anything - the method has a void return type.
LINQ typically doesn't change the contents of an existing collection. Usually you should return a new collection from the extension method. For example:
public static IEnumerable<string> ReplaceAll
    (this IEnumerable<string> myList, string toReplace, string replaceWith)
{
    return myList.Select(x => x == toReplace ? replaceWith : x);
}
(I've made it more general here - you shouldn't start materializing lists unless you really need to.)
You'd then call it with:
strings = strings.ReplaceAll("delete", "").ToList();
... or change the type of strings to IEnumerable<string> and just use
strings = strings.ReplaceAll("delete", "");
In C# I am trying to write code where I would be creating a Func delegate which is in itself generic. For example the following (non-Generic) delegate is returning an arbitrary string:
Func<string> getString = () => "Hello!";
I on the other hand want to create a generic which acts similarly to generic methods. For example if I want a generic Func to return default(T) for a type T. I would imagine that I write code as follows:
Func<T><T> getDefaultObject = <T>() => default(T);
Then I would use it as
getDefaultObject<string>() which would return null and if I were to write getDefaultObject<int>() would return 0.
This question is not merely an academic exercise. I have found numerous places where I could have used this but I cannot get the syntax right. Is this possible? Are there any libraries which provide this sort of functionality?
Well you can't overload anything based only on the return value, so this includes variables.
You can however get rid of that lambda expression and write a real function:
T getDefaultObject<T>() { return default(T); }
and then you call it exactly like you want:
int i=getDefaultObject<int>(); // i=0
string s=getDefaultObject<string>(); // s=null
Though one might find practical workarounds like Stephen Cleary's
Func<T> CreateGetDefaultObject<T>() { return () => default(T); }
where you can specify the generics directly, this is a quite interesting problem from a theoretical point that cannot be solved by C#'s current type system.
A type which, as you call it, is in itself generic, is referred to as a higher-rank type.
Consider the following example (pseudo-C#):
Tuple<int[], string[]> Test(Func<?> f) {
return (f(1), f("Hello"));
}
In your proposed system, a call could look like that:
Test(x => new[] { x }); // Returns ({ 1 }, { "Hello" })
But the question is: how do we type the function Test and its argument f?
Apparently, f maps every type T to an array T[] of this type. So maybe?
Tuple<int[], string[]> Test<T>(Func<T, T[]> f) {
return (f(1), f("Hello"));
}
But this doesn't work. We can't parameterize Test with any particular T, since f must be applicable to all types T. At this point, C#'s type system can't go further.
What we needed was a notation like
Tuple<int[], string[]> Test(forall T : Func<T, T[]> f) {
return (f(1), f("Hello"));
}
In your case, you could type
forall T : Func<T> getDefaultValue = ...
The only language I know that supports this kind of generics is Haskell:
test :: (forall t . t -> [t]) -> ([Int], [String])
test f = (f 1, f "hello")
See this Haskellwiki entry on polymorphism about this forall notation.
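Delegates cannot express this, but an interface with a generic method is a commonly used approximation of the forall above in C#. The following is only a sketch of that substitute technique, not something the answer proposes; IArrayMaker, SingletonArrayMaker and RankTwoDemo are hypothetical names:

```csharp
using System;

// One object whose method is generic, standing in for "forall T : Func<T, T[]>".
public interface IArrayMaker
{
    T[] Make<T>(T value);
}

public sealed class SingletonArrayMaker : IArrayMaker
{
    // Wraps any value in a one-element array of its own type.
    public T[] Make<T>(T value)
    {
        return new[] { value };
    }
}

public static class RankTwoDemo
{
    // f is applied at two different element types inside one method body.
    public static Tuple<int[], string[]> Test(IArrayMaker f)
    {
        return Tuple.Create(f.Make(1), f.Make("Hello"));
    }
}
```

The price is that the argument has to be written as a class rather than as a lambda, since a lambda cannot implement an interface.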
This isn't possible, since a delegate instance in C# cannot have generic parameters. The closest you can get is to pass the type object as a regular parameter and use reflection. :(
In many cases, casting to dynamic helps remove the pain of reflection, but dynamic doesn't help when creating new instances, such as your example.
You can't do this, because the type arguments of a delegate have to be fixed at compile time. You have to use the Activator class:
Object o = Activator.CreateInstance(typeof(StringBuilder));
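which creates an instance through the type's parameterless constructor. Note that this is not the same as default(T): it throws for types without such a constructor, such as string. You can write it as the following: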
public T Default<T>()
{
return (T)Activator.CreateInstance(typeof(T));
}
Edit
Blindy's solution is better.