Grouping consecutive identical items: IEnumerable<T> to IEnumerable<IEnumerable<T>> - c#

I've got an interesting problem: given an IEnumerable<string>, is it possible to yield an IEnumerable<IEnumerable<string>> that groups identical adjacent strings in one pass?
Let me explain.
1. Basic illustrative sample :
Considering the following IEnumerable<string> (pseudo representation):
{"a","b","b","b","c","c","d"}
How to get an IEnumerable<IEnumerable<string>> that would yield something of the form:
{ // IEnumerable<IEnumerable<string>>
{"a"}, // IEnumerable<string>
{"b","b","b"}, // IEnumerable<string>
{"c","c"}, // IEnumerable<string>
{"d"} // IEnumerable<string>
}
The method prototype would be:
public IEnumerable<IEnumerable<string>> Group(IEnumerable<string> items)
{
// todo
}
But it could also be :
public void Group(IEnumerable<string> items, Action<IEnumerable<string>> action)
{
// todo
}
...where action would be called for each subsequence.
2. More complicated sample
Ok, the first sample is very simple, and only aims to make the high level intent clear.
Now imagine we are dealing with IEnumerable<Anything>, where Anything is a type defined like this:
public class Anything
{
public string Key {get;set;}
public double Value {get;set;}
}
We now want to generate the subsequences based on the Key (grouping every run of consecutive Anything items that have the same key), and later use them to calculate the total value per group:
public void Compute(IEnumerable<Anything> items)
{
Console.WriteLine(items.Sum(i=>i.Value));
}
// then somewhere, assuming the Group method
// that returns an IEnumerable<IEnumerable<Anything>> actually exists:
foreach(var subsequence in Group(allItems))
{
Compute(subsequence);
}
3. Important notes
Only one iteration over the original sequence
No intermediate collection allocations (we can assume millions of items in the original sequence, and millions of consecutive items in each group)
Keep the enumerators and the deferred execution behavior
We can assume that resulting subsequences will be iterated only once, and will be iterated in order.
Is it possible, and how would you write it?

Is this what you are looking for?
Iterate list only once.
Defer execution.
No intermediate collections (my other post failed on this criterion).
This solution relies on object state because it's difficult to share state between two IEnumerable methods that use yield (no ref or out params).
internal class Program
{
static void Main(string[] args)
{
var result = new[] { "a", "b", "b", "b", "c", "c", "d" }.Partition();
foreach (var r in result)
{
Console.WriteLine("Group".PadRight(16, '='));
foreach (var s in r)
Console.WriteLine(s);
}
}
}
internal static class PartitionExtension
{
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> src)
{
var grouper = new DuplicateGrouper<T>();
return grouper.GroupByDuplicate(src);
}
}
internal class DuplicateGrouper<T>
{
T CurrentKey;
IEnumerator<T> Itr;
bool More;
public IEnumerable<IEnumerable<T>> GroupByDuplicate(IEnumerable<T> src)
{
using(Itr = src.GetEnumerator())
{
More = Itr.MoveNext();
while (More)
yield return GetDuplicates();
}
}
IEnumerable<T> GetDuplicates()
{
CurrentKey = Itr.Current;
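// EqualityComparer<T>.Default avoids a NullReferenceException when the key is null.
while (More && EqualityComparer<T>.Default.Equals(CurrentKey, Itr.Current))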
{
yield return Itr.Current;
More = Itr.MoveNext();
}
}
}
Edit: Added extension method for cleaner usage. Fixed loop test logic so that "More" is evaluated first.
Edit: Dispose the enumerator when finished

Way Better Solution That Meets All Requirements
OK, scrap my previous solution (I'll leave it below, just for reference). Here's a much better approach that occurred to me after making my initial post.
Write a new class that implements IEnumerator<T> and provides a few additional properties: IsValid and Previous. This is all you really need to resolve the whole mess with having to maintain state inside an iterator block using yield.
Here's how I did it (pretty trivial, as you can see):
internal class ChipmunkEnumerator<T> : IEnumerator<T> {
private readonly IEnumerator<T> _internal;
private T _previous;
private bool _isValid;
public ChipmunkEnumerator(IEnumerator<T> e) {
_internal = e;
_isValid = false;
}
public bool IsValid {
get { return _isValid; }
}
public T Previous {
get { return _previous; }
}
public T Current {
get { return _internal.Current; }
}
public bool MoveNext() {
if (_isValid)
_previous = _internal.Current;
return (_isValid = _internal.MoveNext());
}
public void Dispose() {
_internal.Dispose();
}
#region Explicit Interface Members
object System.Collections.IEnumerator.Current {
get { return Current; }
}
void System.Collections.IEnumerator.Reset() {
_internal.Reset();
_previous = default(T);
_isValid = false;
}
#endregion
}
(I called this a ChipmunkEnumerator because maintaining the previous value reminded me of how chipmunks have pouches in their cheeks where they keep nuts. Does it really matter? Stop making fun of me.)
Now, utilizing this class in an extension method to provide exactly the behavior you want isn't so tough!
Notice that below I've defined GroupConsecutive to actually return an IEnumerable<IGrouping<TKey, T>> for the simple reason that, if these are grouped by key anyway, it makes sense to return an IGrouping<TKey, T> rather than just an IEnumerable<T>. As it turns out, this will help us out later anyway...
public static IEnumerable<IGrouping<TKey, T>> GroupConsecutive<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector)
where TKey : IEquatable<TKey> {
using (var e = new ChipmunkEnumerator<T>(source.GetEnumerator())) {
if (!e.MoveNext())
yield break;
while (e.IsValid) {
yield return e.GetNextDuplicateGroup(keySelector);
}
}
}
public static IEnumerable<IGrouping<T, T>> GroupConsecutive<T>(this IEnumerable<T> source)
where T : IEquatable<T> {
return source.GroupConsecutive(x => x);
}
private static IGrouping<TKey, T> GetNextDuplicateGroup<T, TKey>(this ChipmunkEnumerator<T> e, Func<T, TKey> keySelector)
where TKey : IEquatable<TKey> {
return new Grouping<TKey, T>(keySelector(e.Current), e.EnumerateNextDuplicateGroup(keySelector));
}
private static IEnumerable<T> EnumerateNextDuplicateGroup<T, TKey>(this ChipmunkEnumerator<T> e, Func<T, TKey> keySelector)
where TKey : IEquatable<TKey> {
do {
yield return e.Current;
} while (e.MoveNext() && keySelector(e.Previous).Equals(keySelector(e.Current)));
}
(To implement these methods, I wrote a simple Grouping<TKey, T> class that implements IGrouping<TKey, T> in the most straightforward way possible. I've omitted the code just so as to keep moving along...)
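The Grouping<TKey, T> class is omitted from the answer, so here is a minimal sketch of what it might look like. This is an assumption, not the author's actual code: it only stores the key and forwards enumeration to the lazy sequence it was given, so nothing is buffered.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the omitted Grouping<TKey, T> class.
public sealed class Grouping<TKey, T> : IGrouping<TKey, T>
{
    private readonly IEnumerable<T> _items;

    public Grouping(TKey key, IEnumerable<T> items)
    {
        Key = key;
        _items = items;
    }

    public TKey Key { get; }

    // Forwards to the wrapped (possibly lazy) sequence; no copy is made.
    public IEnumerator<T> GetEnumerator() => _items.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Because the wrapper does not copy anything, each group is only as re-enumerable as the sequence passed to its constructor.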
OK, check it out. I think the code example below pretty well captures something resembling the more realistic scenario you described in your updated question.
var entries = new List<KeyValuePair<string, int>> {
new KeyValuePair<string, int>( "Dan", 10 ),
new KeyValuePair<string, int>( "Bill", 12 ),
new KeyValuePair<string, int>( "Dan", 14 ),
new KeyValuePair<string, int>( "Dan", 20 ),
new KeyValuePair<string, int>( "John", 1 ),
new KeyValuePair<string, int>( "John", 2 ),
new KeyValuePair<string, int>( "Bill", 5 )
};
var dupeGroups = entries
.GroupConsecutive(entry => entry.Key);
foreach (var dupeGroup in dupeGroups) {
Console.WriteLine(
"Key: {0} Sum: {1}",
dupeGroup.Key.PadRight(5),
dupeGroup.Select(entry => entry.Value).Sum()
);
}
Output:
Key: Dan Sum: 10
Key: Bill Sum: 12
Key: Dan Sum: 34
Key: John Sum: 3
Key: Bill Sum: 5
Notice this also fixes the problem with my original answer of dealing with IEnumerator<T> objects that were value types. (With this approach, it doesn't matter.)
There's still going to be a problem if you try calling ToList here, as you will find out if you try it. But considering you included deferred execution as a requirement, I doubt you would be doing that anyway. For a foreach, it works.
Original, Messy, and Somewhat Stupid Solution
Something tells me I'm going to get totally refuted for saying this, but...
Yes, it is possible (I think). See below for a damn messy solution I threw together. (Catches an exception to know when it's finished, so you know it's a great design!)
Now, Jon's point about there being a very real problem in the event that you try to do, for instance, ToList, and then access the values in the resulting list by index, is totally valid. But if your only intention here is to be able to loop over an IEnumerable<T> using a foreach -- and you're only doing this in your own code -- then, well, I think this could work for you.
Anyway, here's a quick example of how it works:
var ints = new int[] { 1, 3, 3, 4, 4, 4, 5, 2, 3, 1, 6, 6, 6, 5, 7, 7, 8 };
var dupeGroups = ints.GroupConsecutiveDuplicates(EqualityComparer<int>.Default);
foreach (var dupeGroup in dupeGroups) {
Console.WriteLine(
"New dupe group: " +
string.Join(", ", dupeGroup.Select(i => i.ToString()).ToArray())
);
}
Output:
New dupe group: 1
New dupe group: 3, 3
New dupe group: 4, 4, 4
New dupe group: 5
New dupe group: 2
New dupe group: 3
New dupe group: 1
New dupe group: 6, 6, 6
New dupe group: 5
New dupe group: 7, 7
New dupe group: 8
And now for the (messy as crap) code:
Note that since this approach requires passing the actual enumerator around between a few different methods, it will not work if that enumerator is a value type, as calls to MoveNext in one method are only affecting a local copy.
public static IEnumerable<IEnumerable<T>> GroupConsecutiveDuplicates<T>(this IEnumerable<T> source, IEqualityComparer<T> comparer) {
using (var e = source.GetEnumerator()) {
if (e.GetType().IsValueType)
throw new ArgumentException(
"This method will not work on a value type enumerator."
);
// get the ball rolling
if (!e.MoveNext()) {
yield break;
}
IEnumerable<T> nextDuplicateGroup;
while (e.FindMoreDuplicates(comparer, out nextDuplicateGroup)) {
yield return nextDuplicateGroup;
}
}
}
private static bool FindMoreDuplicates<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer, out IEnumerable<T> duplicates) {
duplicates = enumerator.GetMoreDuplicates(comparer);
return duplicates != null;
}
private static IEnumerable<T> GetMoreDuplicates<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer) {
try {
if (enumerator.Current != null)
return enumerator.GetMoreDuplicatesInner(comparer);
else
return null;
} catch (InvalidOperationException) {
return null;
}
}
private static IEnumerable<T> GetMoreDuplicatesInner<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer) {
while (enumerator.Current != null) {
var current = enumerator.Current;
yield return current;
if (!enumerator.MoveNext())
break;
if (!comparer.Equals(current, enumerator.Current))
break;
}
}

Your second bullet is the problematic one. Here's why:
var groups = CallMagicGetGroupsMethod().ToList();
foreach (string x in groups[3])
{
...
}
foreach (string x in groups[0])
{
...
}
Here, it's trying to iterate over the fourth group and then the first group... that's clearly only going to work if all the groups are buffered or it can reread the sequence, neither of which is ideal.
I suspect you want a more "reactive" approach - I don't know offhand whether Reactive Extensions does what you want (the "consecutive" requirement is unusual) but you should basically provide some sort of action to be executed on each group... that way the method won't need to worry about having to return you something which could be used later on, after it's already finished reading.
Let me know if you'd like me to try to find a solution within Rx, or whether you would be happy with something like:
void GroupConsecutive(IEnumerable<string> items,
Action<IEnumerable<string>> action)
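For what it's worth, a callback-based version along those lines could be sketched roughly as follows. This is only an illustration under my own assumptions: the class name ConsecutiveGrouper is made up, the method is generic rather than string-only, and equality uses EqualityComparer<T>.Default.

```csharp
using System;
using System.Collections.Generic;

public static class ConsecutiveGrouper
{
    // Sketch only: calls `action` once per run of adjacent equal items.
    public static void GroupConsecutive<T>(IEnumerable<T> items, Action<IEnumerable<T>> action)
    {
        var comparer = EqualityComparer<T>.Default;
        using (var e = items.GetEnumerator())
        {
            bool more = e.MoveNext();
            while (more)
            {
                T key = e.Current;

                // Lazily yields the current run straight from the shared enumerator.
                IEnumerable<T> Run()
                {
                    while (more && comparer.Equals(key, e.Current))
                    {
                        yield return e.Current;
                        more = e.MoveNext();
                    }
                }

                action(Run());

                // If the callback stopped early (or never enumerated), skip the
                // rest of the run so the next callback starts at the next key.
                while (more && comparer.Equals(key, e.Current))
                    more = e.MoveNext();
            }
        }
    }
}
```

The callback has to consume its group before it returns. Once control goes back to GroupConsecutive the underlying enumerator moves on, so holding on to the group and enumerating it later yields nothing useful.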

Here's a solution that I think satisfies your requirements, works with any type of data item, and is quite short and readable:
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> list)
{
var current = list.FirstOrDefault();
while (!Equals(current, default(T))) {
var cur = current;
Func<T, bool> equalsCurrent = item => item.Equals(cur);
yield return list.TakeWhile(equalsCurrent);
list = list.SkipWhile(equalsCurrent);
current = list.FirstOrDefault();
}
}
Notes:
Deferred execution is there (both TakeWhile and SkipWhile do it).
I think this iterates over the entire collection only once (with SkipWhile); it does iterate over the collection once more when you process the returned IEnumerables, but the partitioning itself iterates only once.
If you don't care about value types, you can add a constraint and change the while condition to a test for null.
If I am somehow mistaken, I'd be especially interested in comments pointing out the mistakes!
Very Important Aside:
This solution will not allow you to enumerate the produced enumerables in any order other than the one it provides them in. However, I think the original poster has been pretty clear in comments that this is not a problem.

Related

Complexity between IEnumerable<T> return type implementation

Is there a significant complexity difference between these two implementations, or does the compiler optimize it anyway?
Usage:
for(int i = 0; i < int.MaxValue; i++)
{
foreach(var item in GoodItems)
{
if(DoSomethingBad(item))
break; // this is later added.
}
}
Implementation (1):
public IEnumerable<T> GoodItems
{
get { return _list.Where(x => x.IsGood); }
}
Implementation (2):
public IEnumerable<T> GoodItems
{
get { foreach(var item in _list.Where(x => x.IsGood)) yield return item; }
}
It appears that IEnumerable methods should always be implemented using (2)? When is one better than the other?
I just built an example program and then used ILSpy to examine the output assembly. The second option will actually generate an extra class that wraps the call to Where but adds zero value to the code. The extra layer the code must follow will probably not cause performance issues in most programs but consider all the extra syntax just to perform the same thing at a slightly slower speed. Not worth it in my book.
Where uses yield return internally. You don't need to wrap it in another yield return.
You call _list.Where(x => x.IsGood); in both. With that said, isn't it obvious which has to be the better usage?
yield return has its uses, but this scenario, especially in a getter, is not one of them.
The extra code without payload in "implementation 2" is the lesser evil here.
Both variants lead to undesirable creation of new object each time you call the property getter. So, results of two sequential getter calls will not be equal:
interface IItem
{
bool IsGood { get; set; }
}
class ItemsContainer<T>
where T : IItem
{
private readonly List<T> items = new List<T>();
public IEnumerable<T> GoodItems
{
get { return items.Where(item => item.IsGood); }
}
// ...
}
// somewhere in code
class Item : IItem { /* ... */ }
var container = new ItemsContainer<Item>();
Console.WriteLine(container.GoodItems == container.GoodItems); // False; Oops!
You should avoid this side-effect:
class ItemsContainer<T>
where T : IItem
{
private readonly List<T> items;
private readonly Lazy<IEnumerable<T>> goodItems;
public ItemsContainer()
{
this.items = new List<T>();
this.goodItems = new Lazy<IEnumerable<T>>(() => items.Where(item => item.IsGood));
}
public IEnumerable<T> GoodItems
{
get { return goodItems.Value; }
}
// ...
}
or make a method instead of property:
public IEnumerable<T> GetGoodItems()
{
return _list.Where(x => x.IsGood);
}
Also, a property is not a good idea if you want to provide a snapshot of your items to the client code.
Internally, the first version gets compiled down to something that looks like this:
public IEnumerable<T> GoodItems
{
get
{
foreach (var item in _list)
if (item.IsGood)
yield return item;
}
}
Whereas the second one will now look something like:
public IEnumerable<T> GoodItems
{
get
{
foreach (var item in GoodItemsHelper)
yield return item;
}
}
private IEnumerable<T> GoodItemsHelper
{
get
{
foreach (var item in _list)
if (item.IsGood)
yield return item;
}
}
The Where clause in LINQ is implemented with deferred execution. So there's no need to apply the foreach (...) yield return ... pattern. You're making more work for yourself, and potentially for the runtime.
I don't know if the second version gets jitted to the same thing as the first. Semantically, the two are distinct in that the first does a single round of deferred execution while the second does two rounds. On those grounds I'd argue that the second would be more complex.
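To make the deferred execution point concrete, here is a small sketch (the DeferredDemo class and its method name are invented for illustration). It counts how often the Where predicate runs before and after the query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns how many times the predicate ran before and after enumeration.
    public static (int before, int after) CountPredicateCalls()
    {
        var list = new List<int> { 1, 2, 3, 4 };
        int calls = 0;

        // Building the query does not run the predicate.
        IEnumerable<int> evens = list.Where(x => { calls++; return x % 2 == 0; });
        int before = calls;

        // Enumerating it does: once per source element.
        foreach (var item in evens) { }

        return (before, calls);
    }
}
```

CountPredicateCalls() returns (0, 4): nothing runs until the foreach, which is why wrapping the Where call in another foreach/yield return buys no extra laziness.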
The real question you need to ask is: When you're exposing the IEnumerable, what guarantees are you making? Are you saying that you want to simply provide forward iteration? Or are you stating that your interface provides deferred execution?
In the code below, my intent is simply to provide forward enumeration without random access:
private List<Int32> _Foo = new List<Int32>() { 1, 2, 3, 4, 5 };
public IEnumerable<Int32> Foo
{
get
{
return _Foo;
}
}
But here, I want to prevent unnecessary computation. I want my expensive computation to be performed only when a result is requested.
private List<Int32> _Foo = new List<Int32>() { 1, 2, 3, 4, 5 };
public IEnumerable<Int32> Foo
{
get
{
foreach (var item in _Foo)
{
var result = DoSomethingExpensive(item);
yield return result;
}
}
}
Even though both versions of Foo look identical on the outside, their internal implementation does different things. That's the part that you need to watch out for. When you use LINQ, you don't need to worry about deferring execution since most operators do it for you. In your own code, you may wish to go with the first or second depending on your needs.

Appending an element to a collection using LINQ

I am trying to process some list with a functional approach in C#.
The idea is that I have a collection of Tuple<T,double> and I want to change the Item 2 of some element T.
The functional way to do so, as data is immutable, is to take the list, filter for all elements where the element is different from the one to change, and then append a new tuple with the new values.
My problem is that I do not know how to append the element at the end. I would like to do:
public List<Tuple<T,double>> Replace(List<Tuple<T,double>> collection, T term,double value)
{
return collection.Where(x=>!x.Item1.Equals(term)).Append(Tuple.Create(term,value));
}
But there is no Append method. Is there something else?
I believe you are looking for the Concat operator.
It joins two IEnumerable<T> together, so you can create one with a single item to join.
public List<Tuple<T,double>> Replace(List<Tuple<T,double>> collection, T term,double value)
{
var newItem = new List<Tuple<T,double>>();
newItem.Add(new Tuple<T,double>(term,value));
return collection.Where(x=>!x.Item1.Equals(term)).Concat(newItem).ToList();
}
It seems that .NET 4.7.1 adds Append LINQ operator, which is exactly what you want. Unlike Concat it takes a single value.
By the way, if you declare a generic method you should include type parameter(s) after its name:
public List<Tuple<T, double>> Replace<T>(List<Tuple<T, double>> collection, T term, double value)
{
return collection.Where(x => !x.Item1.Equals(term))
.Append(Tuple.Create(term, value))
.ToList();
}
LINQ is not for mutation.
Functional programming avoids mutation.
Thus:
public IEnumerable<Tuple<T,double>> Extend<T>(IEnumerable<Tuple<T,double>> collection,
T term,double value)
{
// The lambda parameter is named t so that it does not clash with the loop variable x.
foreach (var x in collection.Where(t=>!t.Item1.Equals(term)))
{
yield return x;
}
yield return Tuple.Create(term,value);
}
If you're willing to use an additional package, check out MoreLinq, available on NuGet. It provides a new overload of the Concat function:
public static IEnumerable<T> Concat<T>(this IEnumerable<T> head, T tail);
This function does exactly what was asked for, e.g. you could do
var myEnumerable = Enumerable.Range(10, 3); // Enumerable of values 10, 11, 12
var newEnumerable = myEnumerable.Concat(3); // Enumerable of values 10, 11, 12, 3
And, if you like LINQ, you will probably like a lot of the other new functions, too!
Additionally, as pointed out in a discussion on the MoreLinq Github-page, the function
public static IEnumerable<TSource> Append<TSource>(this IEnumerable<TSource> source, TSource element);
with a different name but the same functionality is available in .NET Core, so it might be possible that we will see it for C# in the future.
This should do what you want (although it uses mutation inside, it feels functional from a callers perspective):
public List<Tuple<T, double>> Replace(List<Tuple<T, double>> collection, T term, double value) {
var result = collection.Where(x => !x.Item1.Equals(term)).ToList();
result.Add(Tuple.Create(term, value));
return result;
}
An alternative way to do it is to use "map" (Select in LINQ):
public List<Tuple<T, double>> Replace(List<Tuple<T, double>> collection, T term, double value) {
return collection.Select(x =>
Tuple.Create(
x.Item1,
x.Item1.Equals(term) ? value : x.Item2)).ToList();
}
But it might give you different results than your original intention. Although, to me, that's what I think when I see a method called Replace, which is, replace-in-place.
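To show how the results can differ, here is a sketch that puts the two strategies side by side. The names ReplaceDemo, ReplaceByAppend and ReplaceInPlace are made up for this comparison, and Concat with a one-element array stands in for Append so it does not depend on the framework version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReplaceDemo
{
    // Filter-then-append: the updated tuple always ends up last,
    // and it is added even when `term` was not present.
    public static List<Tuple<T, double>> ReplaceByAppend<T>(
        List<Tuple<T, double>> collection, T term, double value)
    {
        return collection.Where(x => !x.Item1.Equals(term))
                         .Concat(new[] { Tuple.Create(term, value) })
                         .ToList();
    }

    // Map (Select): the tuple keeps its position, and nothing is
    // added when `term` is missing.
    public static List<Tuple<T, double>> ReplaceInPlace<T>(
        List<Tuple<T, double>> collection, T term, double value)
    {
        return collection.Select(x => x.Item1.Equals(term) ? Tuple.Create(x.Item1, value) : x)
                         .ToList();
    }
}
```

With the input ("a", 1), ("b", 2), ("c", 3) and a replacement of "a" by 9, ReplaceByAppend gives the order b, c, a while ReplaceInPlace gives a, b, c. For a term that is not in the list, the first returns four tuples and the second returns the original three.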
UPDATE
You can also create what you want like this:
public List<Tuple<T, double>> Replace(List<Tuple<T, double>> collection, T term, double value) {
return collection.
Where(x => !x.Item1.Equals(term)).
Append(Tuple.Create(term, value)).
ToList();
}
Using Concat, as mentioned by Oded:
public static class EnumerableEx {
public static IEnumerable<T> Append<T>(this IEnumerable<T> source, T item) {
return source.Concat(new T[] { item });
}
}
One way is to use .Concat(), but you need to have an enumerable rather than a single item as the second argument. Creating an array with a single element does work, but is cumbersome to write.
It is better to write a custom extension method to do so.
One approach is to create a new List<T> and add the items from the first list and then the items from the second list. However, it is better to use the yield keyword instead, so you do not need to create a list and the enumerable will be evaluated lazily:
public static class EnumerableExtensions
{
public static IEnumerable<T> Concat<T>(this IEnumerable<T> list, T item)
{
foreach (var element in list)
{
yield return element;
}
yield return item;
}
}
The closest answer I could find came from this post and is:
return collection.Where(x=>!x.Item1.Equals(term)).Concat(new[]{Tuple.Create(term,value)});

C# Iterator best approach

Apart from the fact that IEnumerable exposes GetEnumerator() (and that IEnumerable is essential for "foreach"), the following two approaches both allow us to iterate over the collection. What is the advantage of one over the other? (I am not asking about the difference between IEnumerable and IEnumerator.)
static void Main()
{
IEnumerator<int> em = MyEnumerator<int>(new int[] { 1, 2, 3, 4 });
IEnumerator<int> e = Collection<int>
(new int[] { 1, 2, 3, 4 }).GetEnumerator();
while (em.MoveNext())
{
Console.WriteLine(em.Current);
}
while (e.MoveNext())
{
Console.WriteLine(e.Current);
}
Console.ReadKey(true);
}
approach 1
public static IEnumerator<T> MyEnumerator<T>(T[] vals )
{
T[] some = vals;
foreach (var v in some)
{
yield return v;
}
}
approach 2
public static IEnumerable<T> Collection<T>(T[] vals)
{
T[] some = vals;
foreach (var v in some)
{
yield return v;
}
}
The main difference is that most APIs accept an IEnumerable<T> as input but not an IEnumerator<T>.
With an IEnumerator<T> you also have to remember to call Reset() before iterating again, whereas with IEnumerable<T> the syntax is more obvious (just call GetEnumerator() again). Also see Eric Lippert's comment about Reset being a bad idea: if Reset isn't implemented in your IEnumerator<T>, or is buggy, it becomes a one-time-only enumerator (pretty useless in a lot of cases).
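As a small illustration of the Reset() problem, the sketch below (ResetDemo and its members are invented names) calls Reset() on an enumerator produced by a yield-based method like approach 1. Enumerators generated by the C# compiler for iterator blocks do not support Reset() and throw NotSupportedException.

```csharp
using System;
using System.Collections.Generic;

public static class ResetDemo
{
    // An iterator method returning IEnumerator<T>, in the style of approach 1.
    public static IEnumerator<int> Numbers()
    {
        yield return 1;
        yield return 2;
    }

    // Returns true if calling Reset() on the iterator throws.
    public static bool ResetThrows()
    {
        using (IEnumerator<int> e = Numbers())
        {
            e.MoveNext();
            try
            {
                e.Reset();
                return false;
            }
            catch (NotSupportedException)
            {
                return true;
            }
        }
    }
}
```

ResetThrows() returns true, so an enumerator obtained this way really is single-use; to iterate again you need a fresh one, which is exactly what IEnumerable<T>.GetEnumerator() gives you.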
Another difference may be that an IEnumerable<T> can be enumerated from multiple threads at the same time, whereas an IEnumerator<T> stores a single position in the enumerated data (imagine a RandomEnumerable or RangeEnumerable).
So the conclusion is that IEnumerable<T> is more versatile. Anyway, if you have a function returning an IEnumerator<T>, generating the IEnumerable<T> around it is simple.
class EnumeratorWrapper<T> : IEnumerable<T>
{
Func<IEnumerator<T>> m_generator;
public EnumeratorWrapper(Func<IEnumerator<T>> generator)
{
m_generator = generator;
}
public IEnumerator<T> GetEnumerator()
{
return m_generator();
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return m_generator();
}
}
Just for API consistency, using IEnumerable<T> seems the best solution to me.
But the issue with Reset() in IEnumerator<T> makes it an unusable solution anyway, so IEnumerable<T> is the way to go.
Correct me if I'm wrong but the only difference is the difference between IEnumerable and IEnumerator and since you specifically said you're not asking the difference, both are a good...

Passing a single item as IEnumerable<T>

Is there a common way to pass a single item of type T to a method which expects an IEnumerable<T> parameter? Language is C#, framework version 2.0.
Currently I am using a helper method (it's .Net 2.0, so I have a whole bunch of casting/projecting helper methods similar to LINQ), but this just seems silly:
public static class IEnumerableExt
{
// usage: IEnumerableExt.FromSingleItem(someObject);
public static IEnumerable<T> FromSingleItem<T>(T item)
{
yield return item;
}
}
Other way would of course be to create and populate a List<T> or an Array and pass it instead of IEnumerable<T>.
[Edit] As an extension method it might be named:
public static class IEnumerableExt
{
// usage: someObject.SingleItemAsEnumerable();
public static IEnumerable<T> SingleItemAsEnumerable<T>(this T item)
{
yield return item;
}
}
Am I missing something here?
[Edit2] We found someObject.Yield() (as @Peter suggested in the comments below) to be the best name for this extension method, mainly for brevity, so here it is along with the XML comment if anyone wants to grab it:
public static class IEnumerableExt
{
/// <summary>
/// Wraps this object instance into an IEnumerable&lt;T&gt;
/// consisting of a single item.
/// </summary>
/// <typeparam name="T"> Type of the object. </typeparam>
/// <param name="item"> The instance that will be wrapped. </param>
/// <returns> An IEnumerable&lt;T&gt; consisting of a single item. </returns>
public static IEnumerable<T> Yield<T>(this T item)
{
yield return item;
}
}
Well, if the method expects an IEnumerable you've got to pass something that is a list, even if it contains one element only.
passing
new[] { item }
as the argument should be enough I think
In C# 3.0 you can utilize the System.Linq.Enumerable class:
// using System.Linq
Enumerable.Repeat(item, 1);
This will create a new IEnumerable that only contains your item.
Your helper method is the cleanest way to do it, IMO. If you pass in a list or an array, then an unscrupulous piece of code could cast it and change the contents, leading to odd behaviour in some situations. You could use a read-only collection, but that's likely to involve even more wrapping. I think your solution is as neat as it gets.
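The cast-back risk can be sketched like this. CastBackDemo and TryOverwriteFirst are hypothetical names standing in for the "unscrupulous piece of code", and Yield mirrors the helper from the question.

```csharp
using System;
using System.Collections.Generic;

public static class CastBackDemo
{
    // Same idea as the question's helper: a one-item iterator.
    public static IEnumerable<T> Yield<T>(T item)
    {
        yield return item;
    }

    // A badly behaved callee that tries to cast the sequence back to an array.
    public static bool TryOverwriteFirst(IEnumerable<int> items, int newValue)
    {
        if (items is int[] array && array.Length > 0)
        {
            array[0] = newValue;   // mutates the caller's array
            return true;
        }
        return false;
    }
}
```

Passing new[] { 42 } lets TryOverwriteFirst succeed and change the caller's array, whereas passing CastBackDemo.Yield(42) makes it return false, because the compiler-generated iterator object is not an int[].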
In C# 3 (I know you said 2), you can write a generic extension method which might make the syntax a little more acceptable:
static class IEnumerableExtensions
{
public static IEnumerable<T> ToEnumerable<T>(this T item)
{
yield return item;
}
}
client code is then item.ToEnumerable().
This helper method works for one item or many.
public static IEnumerable<T> ToEnumerable<T>(params T[] items)
{
return items;
}
I'm kind of surprised that no one suggested a new overload of the method with an argument of type T to simplify the client API.
public void DoSomething<T>(IEnumerable<T> list)
{
// Do Something
}
public void DoSomething<T>(T item)
{
DoSomething(new T[] { item });
}
Now your client code can just do this:
MyItem item = new MyItem();
Obj.DoSomething(item);
or with a list:
List<MyItem> itemList = new List<MyItem>();
Obj.DoSomething(itemList);
Either (as has previously been said)
MyMethodThatExpectsAnIEnumerable(new[] { myObject });
or
MyMethodThatExpectsAnIEnumerable(Enumerable.Repeat(myObject, 1));
As a side note, the last version can also be nice if you want an empty list of an anonymous object, e.g.
var x = MyMethodThatExpectsAnIEnumerable(Enumerable.Repeat(new { a = 0, b = "x" }, 0));
I agree with @EarthEngine's comments to the original post, which is that 'AsSingleton' is a better name. See this Wikipedia entry. It then follows from the definition of singleton that if a null value is passed as an argument, 'AsSingleton' should return an IEnumerable with a single null value instead of an empty IEnumerable, which would settle the if (item == null) yield break; debate. I think the best solution is to have two methods, 'AsSingleton' and 'AsSingletonOrEmpty', where, in the event that a null is passed as an argument, 'AsSingleton' will return a single null value and 'AsSingletonOrEmpty' will return an empty IEnumerable. Like this:
public static IEnumerable<T> AsSingletonOrEmpty<T>(this T source)
{
if (source == null)
{
yield break;
}
else
{
yield return source;
}
}
public static IEnumerable<T> AsSingleton<T>(this T source)
{
yield return source;
}
Then, these would, more or less, be analogous to the 'First' and 'FirstOrDefault' extension methods on IEnumerable which just feels right.
This is 30% faster than yield or Enumerable.Repeat when used in foreach due to this C# compiler optimization, and of the same performance in other cases.
public struct SingleSequence<T> : IEnumerable<T> {
public struct SingleEnumerator : IEnumerator<T> {
private readonly SingleSequence<T> _parent;
private bool _couldMove;
public SingleEnumerator(ref SingleSequence<T> parent) {
_parent = parent;
_couldMove = true;
}
public T Current => _parent._value;
object IEnumerator.Current => Current;
public void Dispose() { }
public bool MoveNext() {
if (!_couldMove) return false;
_couldMove = false;
return true;
}
public void Reset() {
_couldMove = true;
}
}
private readonly T _value;
public SingleSequence(T value) {
_value = value;
}
public IEnumerator<T> GetEnumerator() {
return new SingleEnumerator(ref this);
}
IEnumerator IEnumerable.GetEnumerator() {
return new SingleEnumerator(ref this);
}
}
in this test:
// Fastest among seqs, but still 30x slower than direct sum
// 49 mops vs 37 mops for yield, or c.30% faster
[Test]
public void SingleSequenceStructForEach() {
var sw = new Stopwatch();
sw.Start();
long sum = 0;
for (var i = 0; i < 100000000; i++) {
foreach (var single in new SingleSequence<int>(i)) {
sum += single;
}
}
sw.Stop();
Console.WriteLine($"Elapsed {sw.ElapsedMilliseconds}");
Console.WriteLine($"Mops {100000.0 / sw.ElapsedMilliseconds * 1.0}");
}
As I have just found, and seen that user LukeH suggested too, a nice simple way of doing this is as follows:
public static void PerformAction(params YourType[] items)
{
// Forward call to IEnumerable overload
PerformAction(items.AsEnumerable());
}
public static void PerformAction(IEnumerable<YourType> items)
{
foreach (YourType item in items)
{
// Do stuff
}
}
This pattern will allow you to call the same functionality in a multitude of ways: a single item; multiple items (comma-separated); an array; a list; an enumeration, etc.
I'm not 100% sure on the efficiency of using the AsEnumerable method though, but it does work a treat.
Update: The AsEnumerable function looks pretty efficient! (reference)
Although it's overkill for one method, I believe some people may find the Interactive Extensions useful.
The Interactive Extensions (Ix) from Microsoft includes the following method.
public static IEnumerable<TResult> Return<TResult>(TResult value)
{
yield return value;
}
Which can be utilized like so:
var result = EnumerableEx.Return(0);
Ix adds new functionality not found in the original Linq extension methods, and is a direct result of creating the Reactive Extensions (Rx).
Think, Linq Extension Methods + Ix = Rx for IEnumerable.
You can find both Rx and Ix on CodePlex.
I recently asked the same thing on another post
Is there a way to call a C# method requiring an IEnumerable<T> with a single value? ...with benchmarking.
I wanted people stopping by here to see the brief benchmark comparison shown at that newer post for 4 of the approaches presented in these answers.
It seems that simply writing new[] { x } in the arguments to the method is the shortest and fastest solution.
This may not be any better but it's kind of cool:
Enumerable.Range(0, 1).Select(i => item);
Sometimes I do this, when I'm feeling impish:
"_".Select(_ => 3.14) // or whatever; any type is fine
This is the same thing with less shift key presses, heh:
from _ in "_" select 3.14
For a utility function I find this to be the least verbose, or at least more self-documenting than an array, although it'll let multiple values slide; as a plus it can be defined as a local function:
static IEnumerable<T> Enumerate<T> (params T[] v) => v;
// usage:
IEnumerable<double> example = Enumerate(1.234);
Here are all of the other ways I was able to think of (runnable here):
using System;
using System.Collections.Generic;
using System.Linq;
public class Program {
public static IEnumerable<T> ToEnumerable1 <T> (T v) {
yield return v;
}
public static T[] ToEnumerable2 <T> (params T[] vs) => vs;
public static void Main () {
static IEnumerable<T> ToEnumerable3 <T> (params T[] v) => v;
p( new string[] { "three" } );
p( new List<string> { "three" } );
p( ToEnumerable1("three") ); // our utility function (yield return)
p( ToEnumerable2("three") ); // our utility function (params)
p( ToEnumerable3("three") ); // our local utility function (params)
p( Enumerable.Empty<string>().Append("three") );
p( Enumerable.Empty<string>().DefaultIfEmpty("three") );
p( Enumerable.Empty<string>().Prepend("three") );
p( Enumerable.Range(3, 1) ); // only for int
p( Enumerable.Range(0, 1).Select(_ => "three") );
p( Enumerable.Repeat("three", 1) );
p( "_".Select(_ => "three") ); // doesn't have to be "_"; just any one character
p( "_".Select(_ => 3.3333) );
p( from _ in "_" select 3.0f );
p( "a" ); // only for char
// these weren't available for me to test (might not even be valid):
// new Microsoft.Extensions.Primitives.StringValues("three")
}
static void p <T> (IEnumerable<T> e) =>
Console.WriteLine(string.Join(' ', e.Select((v, k) => $"[{k}]={v,-8}:{v.GetType()}").DefaultIfEmpty("<empty>")));
}
For those wondering about performance: while @mattica has provided some benchmarking information in a similar question referenced above, my benchmark tests have produced a different result.
In .NET 7, yield return value is ~9% faster than new T[] { value } and allocates 75% as much memory. In most cases, this is already hyper-performant and is as good as you'll ever need.
I was curious if a custom single collection implementation would be faster or more lightweight. It turns out because yield return is implemented as IEnumerator<T> and IEnumerable<T>, the only way to beat it in terms of allocation is to do that in my implementation as well.
If you're passing IEnumerable<> to an outside library, I would strongly recommend not doing this unless you're very familiar with what you're building. That being said, I made a very simple (not-reuse-safe) implementation which was able to beat the yield method by 5ns and allocated only half as much as the array.
Because all tests were passed an IEnumerable<T>, value types generally performed worse than reference types. The best implementation I had was actually the simplest - you can look at the SingleCollection class in the gist I linked to. (This was 2ns faster than yield return, but allocated 88% of what the array would, compared to the 75% allocated for yield return.)
TL;DR: if you care about speed, use yield return item. If you really care about speed, use a SingleCollection.
The easiest way, I'd say, would be new T[]{item}; there's no dedicated syntax to do this. The closest equivalent that I can think of is the params keyword, but of course that requires you to have access to the method definition and is only usable with arrays.
Enumerable.Range(1,1).Select(_ => {
//Do some stuff... side effects...
return item;
});
The above code is useful when used like this:
var existingOrNewObject = MyData.Where(myCondition)
.Concat(Enumerable.Range(1,1).Select(_ => {
//Create my object...
return item;
})).Take(1).First();
In the above code snippet there is no empty/null check, and exactly one object is guaranteed to be returned without fear of exceptions. Furthermore, because it is lazy, the closure will not be executed until it is established that no existing data fits the criteria.
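The laziness claim can be checked with a small sketch. The helper name FirstOrCreate, the sample data, and the FactoryCalls counter are assumptions made for illustration; the counter only records whether the fallback lambda actually ran.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyFallbackDemo
{
    // Counts how often the fallback lambda actually runs.
    public static int FactoryCalls;

    // Returns the first match, or a lazily created fallback when nothing matches.
    public static string FirstOrCreate(IEnumerable<string> data, Func<string, bool> condition)
    {
        return data.Where(condition)
            .Concat(Enumerable.Range(1, 1).Select(_ =>
            {
                FactoryCalls++;   // side effect: only reached if no existing item matched
                return "created";
            }))
            .First();
    }

    public static void Main()
    {
        var data = new[] { "apple", "banana" };
        Console.WriteLine(FirstOrCreate(data, s => s.StartsWith("b"))); // banana
        Console.WriteLine(FactoryCalls);                                 // 0
        Console.WriteLine(FirstOrCreate(data, s => s.StartsWith("z"))); // created
        Console.WriteLine(FactoryCalls);                                 // 1
    }
}
```

First() alone is enough here; the extra Take(1) in the snippet above does no harm but is not required.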
To be filed under "Not necessarily a good solution, but still...a solution" or "Stupid LINQ tricks", you could combine Enumerable.Empty<>() with Enumerable.Append<>()...
IEnumerable<string> singleElementEnumerable = Enumerable.Empty<string>().Append("Hello, World!");
...or Enumerable.Prepend<>()...
IEnumerable<string> singleElementEnumerable = Enumerable.Empty<string>().Prepend("Hello, World!");
The latter two methods are available since .NET Framework 4.7.1 and .NET Core 1.0.
This is a workable solution if one were really intent on using existing methods instead of writing their own, though I'm undecided if this is more or less clear than the Enumerable.Repeat<>() solution. This is definitely longer code (partly due to type parameter inference not being possible for Empty<>()) and creates twice as many enumerator objects, however.
Rounding out this "Did you know these methods exist?" answer, Array.Empty<>() could be substituted for Enumerable.Empty<>(), but it's hard to argue that makes the situation...better.
I'm a bit late to the party but I'll share my way anyway.
My problem was that I wanted to bind the ItemsSource of a WPF TreeView to a single object. The hierarchy looks like this:
Project > Plot(s) > Room(s)
There was always going to be only one Project, but I still wanted to show the project in the tree without having to pass a collection with only that one object in it, as some suggested.
Since you can only pass IEnumerable objects as ItemsSource, I decided to make my class IEnumerable:
public class ProjectClass : IEnumerable<ProjectClass>
{
private readonly SingleItemEnumerator<ProjectClass> enumerator;
...
public IEnumerator<ProjectClass> GetEnumerator() => this.enumerator;
IEnumerator IEnumerable.GetEnumerator() => this.GetEnumerator();
}
And create my own Enumerator accordingly:
public class SingleItemEnumerator : IEnumerator
{
private bool hasMovedOnce;
public SingleItemEnumerator(object current)
{
this.Current = current;
}
public bool MoveNext()
{
if (this.hasMovedOnce) return false;
this.hasMovedOnce = true;
return true;
}
public void Reset()
{ }
public object Current { get; }
}
public class SingleItemEnumerator<T> : IEnumerator<T>
{
private bool hasMovedOnce;
public SingleItemEnumerator(T current)
{
this.Current = current;
}
public void Dispose() => (this.Current as IDisposable)?.Dispose();
public bool MoveNext()
{
if (this.hasMovedOnce) return false;
this.hasMovedOnce = true;
return true;
}
public void Reset()
{ }
public T Current { get; }
object IEnumerator.Current => this.Current;
}
This is probably not the "cleanest" solution but it worked for me.
EDIT
To uphold the single responsibility principle, as @Groo pointed out, I created a new wrapper class:
public class SingleItemWrapper : IEnumerable
{
private readonly SingleItemEnumerator enumerator;
public SingleItemWrapper(object item)
{
this.enumerator = new SingleItemEnumerator(item);
}
public object Item => this.enumerator.Current;
public IEnumerator GetEnumerator() => this.enumerator;
}
public class SingleItemWrapper<T> : IEnumerable<T>
{
private readonly SingleItemEnumerator<T> enumerator;
public SingleItemWrapper(T item)
{
this.enumerator = new SingleItemEnumerator<T>(item);
}
public T Item => this.enumerator.Current;
public IEnumerator<T> GetEnumerator() => this.enumerator;
IEnumerator IEnumerable.GetEnumerator() => this.GetEnumerator();
}
Which I used like this:
TreeView.ItemsSource = new SingleItemWrapper(itemToWrap);
EDIT 2
I corrected a mistake with the MoveNext() method.
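One thing to watch with the wrapper as written: it hands out the same enumerator instance on every GetEnumerator() call, and Reset() does nothing, so a second enumeration would yield no items. A possible variant, sketched under the assumption that repeated enumeration matters (the class name ReusableSingleItemWrapper is hypothetical), lets the compiler build a fresh enumerator per call:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of SingleItemWrapper<T> that can be enumerated any number of times.
public sealed class ReusableSingleItemWrapper<T> : IEnumerable<T>
{
    public ReusableSingleItemWrapper(T item) => Item = item;

    public T Item { get; }

    public IEnumerator<T> GetEnumerator()
    {
        // An iterator block creates a new state machine on every call.
        yield return Item;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class WrapperDemo
{
    public static void Main()
    {
        var wrapper = new ReusableSingleItemWrapper<string>("Project");
        Console.WriteLine(wrapper.Count()); // 1
        Console.WriteLine(wrapper.Count()); // 1 again, since each call gets its own enumerator
    }
}
```

This also removes the need for a hand-written enumerator class, at the cost of one small allocation per enumeration.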
I prefer
public static IEnumerable<T> Collect<T>(this T item, params T[] otherItems)
{
yield return item;
foreach (var otherItem in otherItems)
{
yield return otherItem;
}
}
This lets you call item.Collect() if you want the singleton, but it also lets you call item.Collect(item2, item3) if you want

Is there a statement to prepend an element T to a IEnumerable<T>

For example:
string element = "a";
IEnumerable<string> list = new List<string>{ "b", "c", "d" };
IEnumerable<string> singleList = ???; //singleList yields "a", "b", "c", "d"
I take it you can't just Insert into the existing list?
Well, you could use new[] {element}.Concat(list).
Otherwise, you could write your own extension method:
public static IEnumerable<T> Prepend<T>(
this IEnumerable<T> values, T value) {
yield return value;
foreach (T item in values) {
yield return item;
}
}
...
var singleList = list.Prepend("a");
Since .NET Framework 4.7.1 there is a LINQ method for that:
list.Prepend("a");
https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.prepend?view=netframework-4.7.1
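A short sketch of the built-in method applied to the example from the question (the class name PrependDemo and the helper Build are only there to make it runnable). Note that Prepend does not modify the source; it returns a new lazy sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrependDemo
{
    // Builds the sequence asked for in the question using Enumerable.Prepend.
    public static IEnumerable<string> Build(IEnumerable<string> list, string element)
        => list.Prepend(element);

    public static void Main()
    {
        string element = "a";
        IEnumerable<string> list = new List<string> { "b", "c", "d" };
        IEnumerable<string> singleList = Build(list, element);

        Console.WriteLine(string.Join(", ", singleList)); // a, b, c, d
        Console.WriteLine(string.Join(", ", list));       // b, c, d (the source is untouched)
    }
}
```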
public static class IEnumerableExtensions
{
public static IEnumerable<T> Prepend<T>(this IEnumerable<T> ie, T item)
{
return new T[] { item }.Concat(ie);
}
}
You can roll your own:
static IEnumerable<T> Prepend<T>(this IEnumerable<T> seq, T val) {
yield return val;
foreach (T t in seq) {
yield return t;
}
}
And then use it:
IEnumerable<string> singleList = list.Prepend(element);
This would do it...
IEnumerable<string> singleList = new[] {element}.Concat(list);
If you wanted the singleList to be an actual List, then...
List<string> singleList = new List<string>() {element}.Concat(list).ToList();
... works too.
Also:
IEnumerable<string> items = Enumerable.Repeat(item, 1).Concat(list);
I find it convenient to be able to prepend multiple items in a chainable fashion. This version takes advantage of extension methods and params.
As a note, this version implicitly allows a null source, but it's just as easy to change it to throw new ArgumentNullException() if that's the desired behavior.
public static class IEnumerableExtensions
{
public static IEnumerable<T> Prepend<T>(this IEnumerable<T> source, params T[] items)
{
return items.Concat(source ?? new T[0]);
}
}
Allows for a very readable syntax for individual items:
GetItems().Prepend(first, second, third);
...and for collections of items (arrays only, since params expects a T[]):
GetItems().Prepend(GetMoreItems());
Finishing the example in the question results in:
string element = "a";
IEnumerable<string> list = new List<string>{ "b", "c", "d" };
IEnumerable<string> singleList = list.Prepend(element);
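If a null source should be rejected rather than treated as empty, a variant might look like the sketch below. The name PrependAll is hypothetical, chosen so it does not compete with the built-in Enumerable.Prepend; the checks throw ArgumentNullException, which is the conventional exception for null arguments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrependAllExtensions
{
    // Hypothetical strict variant: rejects null instead of silently substituting an empty array.
    public static IEnumerable<T> PrependAll<T>(this IEnumerable<T> source, params T[] items)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (items == null) throw new ArgumentNullException(nameof(items));
        return items.Concat(source); // still lazy: nothing is enumerated here
    }
}

public static class PrependAllDemo
{
    public static void Main()
    {
        IEnumerable<string> list = new List<string> { "c", "d" };
        Console.WriteLine(string.Join(", ", list.PrependAll("a", "b"))); // a, b, c, d
    }
}
```

Since the method is not an iterator block, the null checks run at call time rather than on first enumeration.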
No, there's no such built-in statement, but it's trivial to implement such a function:
IEnumerable<T> PrependTo<T>(IEnumerable<T> underlyingEnumerable, params T[] values)
{
foreach(T value in values)
yield return value;
foreach(T value in underlyingEnumerable)
yield return value;
}
IEnumerable<string> singleList = PrependTo(list, element);
You can even make it an extension method if your C# version allows it.
Just as a reminder - List< T > is not the only type of container. If you find yourself adding elements to the front of the list quite frequently, you can also consider using Stack< T > to implement your container. Once you have a stack
var container = new Stack<string>(new string[] { "d", "c", "b" }); // pushed in reverse: a Stack enumerates last-in, first-out
you can always "prepend" an element via
container.Push("a");
and still use the collection as IEnumerable< T > like in
foreach (var s in container)
// do sth with s
besides all the other methods typical for a stack like Pop(), Peek(), ...
Some of the solutions above iterate through the whole IEnumerable<T> just to prepend one element (or more than one in one case). This can be a very expensive operation if your collection contains a large number of elements and the frequency of prepending is relatively high.
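One caveat with the Stack<T> approach: enumeration runs from the most recently pushed element to the oldest, so the initial items have to be supplied in reverse for the result to read a, b, c, d. A minimal sketch (the class and method names are hypothetical):

```csharp
using System;
using System.Collections.Generic;

public static class StackPrependDemo
{
    // Builds a stack whose enumeration order is a, b, c, d.
    public static Stack<string> Build()
    {
        // The constructor pushes items in the given order, so "b" ends up on top.
        var container = new Stack<string>(new[] { "d", "c", "b" });
        container.Push("a"); // O(1) "prepend": "a" becomes the new top
        return container;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Build())); // a, b, c, d
    }
}
```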
Looking at some of the examples, I think I'd prefer to reverse the extension to apply to the object.
public static IEnumerable<T> PrependTo<T>(this T value, IEnumerable<T> values) {
return new[] { value }.Concat(values);
}
Used like
var singleList = element.PrependTo(list);
As pointed out by Niklas & NetMage in the comments, there is now a built-in Prepend method in LINQ (Enumerable.Prepend).
