Using key-value pairs as parameters - c#

Simple. If I use:
public void Add(params int[] values)
Then I can use this as:
Add(1, 2, 3, 4);
But now I'm dealing with key-value pairs! I have a KeyValue class to link an integer to a string value. So I start with:
public void Add(params KeyValue[] values)
But I can't use this:
Add(1, "A", 2, "B", 3, "C", 4, "D");
Instead, I'm forced to use:
Add(new KeyValue(1, "A"), new KeyValue(2, "B"), new KeyValue(3, "C"), new KeyValue(4, "D"));
Ewww... Already I dislike this...
So, right now I use the Add function without the params modifier and just pass a pre-defined array to it. Since it's only used for quick initialization in a test, I'm not too troubled by the additional code, although I want to keep the code simple to read. I would love to know a trick that makes the call I can't use work. Is there any way to do this without the "new KeyValue()" construction?

If you accepted an IDictionary<int,string>, you could presumably use (in C# 3.0, at least):
Add(new Dictionary<int,string> {
{1, "A"}, {2, "B"}, {3, "C"}, {4, "D"}
});
Any use?
Example Add:
static void Add(IDictionary<int, string> data) {
foreach (var pair in data) {
Console.WriteLine(pair.Key + " = " + pair.Value);
}
}

You can modify your current class design, but you will need to add generics and use the IEnumerable interface.
class KeyValue<TKey, TValue>
{
public KeyValue()
{
}
}
// 1. change: need to implement IEnumerable interface
class KeyValueList<TKey, TValue> : IEnumerable<TKey>
{
// 2. prerequisite: parameterless constructor needed
public KeyValueList()
{
// ...
}
// 3. need Add method to take advantage of
// so called "collection initializers"
public void Add(TKey key, TValue value)
{
// here you will need to initialize the
// KeyValue object and add it
}
// need to implement IEnumerable<TKey> here!
}
After these additions you can do the following:
new KeyValueList<int, string>() { { 1, "A" }, { 2, "B" } };
The compiler will use the IEnumerable interface and the Add method to populate the KeyValueList. Note that this requires C# 3.0 or later.
If you are using this for tests, these changes are not worth it. It's quite an effort and you change quite a lot of production code for tests.
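For reference, here is a minimal sketch of what the filled-in class might look like. It is an assumption on my part, not the poster's actual code: it stores the BCL KeyValuePair<TKey, TValue> instead of the custom KeyValue class, enumerates pairs rather than keys, and the Count property is only there for illustration.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical completion of the KeyValueList outlined above.
// Assumption: pairs are stored as KeyValuePair<TKey, TValue>.
public class KeyValueList<TKey, TValue> : IEnumerable<KeyValuePair<TKey, TValue>>
{
    private readonly List<KeyValuePair<TKey, TValue>> _items =
        new List<KeyValuePair<TKey, TValue>>();

    // The compiler calls this once per { key, value } entry
    // in a collection initializer.
    public void Add(TKey key, TValue value)
    {
        _items.Add(new KeyValuePair<TKey, TValue>(key, value));
    }

    // Illustrative helper, not required by the initializer syntax.
    public int Count
    {
        get { return _items.Count; }
    }

    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
    {
        return _items.GetEnumerator();
    }

    // The non-generic enumerator is required by IEnumerable<T>.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

With this in place, new KeyValueList<int, string> { { 1, "A" }, { 2, "B" } } compiles into one constructor call followed by two Add calls.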

You could use something like the following, with the obvious drawback that you lose strong typing.
public void Add(params Object[] inputs)
{
Int32 numberPairs = inputs.Length / 2;
KeyValue[] keyValues = new KeyValue[numberPairs];
for (Int32 i = 0; i < numberPairs; i++)
{
Int32 key = (Int32)inputs[2 * i];
String value = (String)inputs[2 * i + 1];
keyValues[i] = new KeyValue(key, value);
}
// Call the overloaded method accepting KeyValue[].
this.Add(keyValues);
}
public void Add(params KeyValue[] values)
{
// Do work here.
}
You should of course add some error handling in case the arguments are of the wrong type. Not that smart, but it will work.
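A rough sketch of what that error handling could look like follows. The KeyValue class here is a stand-in for the one in the question (its real members are not shown there), and PairParser and ToKeyValues are hypothetical names chosen for illustration.

```csharp
using System;

// Stand-in for the KeyValue class from the question (assumed shape).
public class KeyValue
{
    public int Key { get; }
    public string Value { get; }

    public KeyValue(int key, string value)
    {
        Key = key;
        Value = value;
    }
}

public static class PairParser
{
    // Converts an alternating key/value argument list into KeyValue objects,
    // checking the argument count and each element's type before casting.
    public static KeyValue[] ToKeyValues(params object[] inputs)
    {
        if (inputs == null)
            throw new ArgumentNullException(nameof(inputs));
        if (inputs.Length % 2 != 0)
            throw new ArgumentException("Expected an even number of arguments.", nameof(inputs));

        var result = new KeyValue[inputs.Length / 2];
        for (int i = 0; i < result.Length; i++)
        {
            object rawKey = inputs[2 * i];
            object rawValue = inputs[2 * i + 1];

            if (!(rawKey is int))
                throw new ArgumentException("Argument " + (2 * i) + " must be an int key.", nameof(inputs));
            // A null value is allowed; anything else must be a string.
            if (rawValue != null && !(rawValue is string))
                throw new ArgumentException("Argument " + (2 * i + 1) + " must be a string value.", nameof(inputs));

            result[i] = new KeyValue((int)rawKey, (string)rawValue);
        }
        return result;
    }
}
```

The checks only move the failure from an InvalidCastException to a more descriptive ArgumentException; the compile-time safety is still lost.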

Related

How can I create a new instance of ImmutableDictionary?

I would like to write something like this:
var d = new ImmutableDictionary<string, int> { { "a", 1 }, { "b", 2 } };
(using ImmutableDictionary from System.Collections.Immutable). It seems like a straightforward usage as I am declaring all the values upfront -- no mutation there. But this gives me the error:
The type 'System.Collections.Immutable.ImmutableDictionary<TKey,TValue>' has no constructors defined
How am I supposed to create a new immutable dictionary with static content?
You can't create an immutable collection with a collection initializer, because the compiler translates the initializer into a sequence of calls to the Add method. For example, if you look at the IL code for var d = new Dictionary<string, int> { { "a", 1 }, { "b", 2 } }; you'll get
IL_0000: newobj instance void class [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::.ctor()
IL_0005: dup
IL_0006: ldstr "a"
IL_000b: ldc.i4.1
IL_000c: callvirt instance void class [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::Add(!0, !1)
IL_0011: dup
IL_0012: ldstr "b"
IL_0017: ldc.i4.2
IL_0018: callvirt instance void class [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::Add(!0, !1)
Obviously this violates the concept of immutable collections.
Both your own answer and Jon Skeet's are ways to deal with this.
// lukasLansky's solution
var d = new Dictionary<string, int> { { "a", 1 }, { "b", 2 } }.ToImmutableDictionary();
// Jon Skeet's solution
var builder = ImmutableDictionary.CreateBuilder<string, int>();
builder.Add("a", 1);
builder.Add("b", 2);
var result = builder.ToImmutable();
Either create a "normal" dictionary first and call ToImmutableDictionary (as per your own answer), or use ImmutableDictionary<,>.Builder:
var builder = ImmutableDictionary.CreateBuilder<string, int>();
builder.Add("a", 1);
builder.Add("b", 2);
var result = builder.ToImmutable();
It's a shame that the builder doesn't have a public constructor as far as I can tell, as it prevents you from using the collection initializer syntax, unless I've missed something... the fact that the Add method returns void means you can't even chain calls to it, making it more annoying - as far as I can see, you basically can't use a builder to create an immutable dictionary in a single expression, which is very frustrating :(
So far I like this most:
var d = new Dictionary<string, int> { { "a", 1 }, { "b", 2 } }.ToImmutableDictionary();
You could use a helper like this:
public struct MyDictionaryBuilder<TKey, TValue> : IEnumerable
{
private ImmutableDictionary<TKey, TValue>.Builder _builder;
public MyDictionaryBuilder(int dummy)
{
_builder = ImmutableDictionary.CreateBuilder<TKey, TValue>();
}
public void Add(TKey key, TValue value) => _builder.Add(key, value);
public TValue this[TKey key]
{
set { _builder[key] = value; }
}
public ImmutableDictionary<TKey, TValue> ToImmutable() => _builder.ToImmutable();
public IEnumerator GetEnumerator()
{
// Only implementing IEnumerable because collection initializer
// syntax is unavailable if you don't.
throw new NotImplementedException();
}
}
(I'm using the new C# 6 expression-bodied members, so if you want this to compile on older versions, you'd need to expand those into full members.)
With that type in place, you can use collection initializer syntax like so:
var d = new MyDictionaryBuilder<int, string>(0)
{
{ 1, "One" },
{ 2, "Two" },
{ 3, "Three" }
}.ToImmutable();
or if you're using C# 6 you could use object initializer syntax, with its new support for indexers (which is why I included a write-only indexer in my type):
var d2 = new MyDictionaryBuilder<int, string>(0)
{
[1] = "One",
[2] = "Two",
[3] = "Three"
}.ToImmutable();
This combines the advantages of both proposed approaches:
Avoids building a full Dictionary<TKey, TValue>
Lets you use initializers
The problem with building a full Dictionary<TKey, TValue> is that there is a bunch of overhead involved in constructing that; it's an unnecessarily expensive way of passing what's basically a list of key/value pairs, because it will carefully set up a hash table structure to enable efficient lookups that you'll never actually use. (The object you'll be performing lookups on is the immutable dictionary you eventually end up with, not the mutable dictionary you're using during initialization.)
ToImmutableDictionary is just going to iterate through the contents of the dictionary (a process rendered less efficient by the way Dictionary<TKey, TValue> works internally - it takes more work to do this than it would with a simple list), gaining absolutely no benefit from the work that went into building up the dictionary, and then has to do the same work it would have done if you'd used the builder directly.
Jon's code avoids this, using only the builder, which should be more efficient. But his approach doesn't let you use initializers.
I share Jon's frustration that the immutable collections don't provide a way to do this out of the box.
Edited 2017/08/10: I've had to change the zero-argument constructor to one that takes an argument that it ignores, and to pass a dummy value everywhere you use this. @gareth-latty pointed out in a comment that a struct can't have a zero-args constructor. When I originally wrote this example that wasn't true: for a while, previews of C# 6 allowed you to supply such a constructor. This feature was removed before C# 6 shipped (after I wrote the original answer, obviously), presumably because it was confusing - there were scenarios in which the constructor wouldn't run. In this particular case it was safe to use it, but unfortunately the language feature no longer exists. Gareth's suggestion was to change it into a class, but then any code using this would have to allocate an object, causing unnecessary GC pressure - the whole reason I used a struct was to make it possible to use this syntax with no additional runtime overhead.
I tried modifying this to perform deferred initialization of _builder but it turns out that the JIT code generator isn't smart enough to optimize these away, so even in release builds it checks _builder for each item you add. (And it inlines that check and the corresponding call to CreateBuilder which turns out to produce quite a lot of code with lots of conditional branching). It really is best to have a one-time initialization, and this has to occur in the constructor if you want to be able to use this initializer syntax. So the only way to use this syntax with no additional costs is to have a struct that initializes _builder in its constructor, meaning that we now need this ugly dummy argument.
Or this
ImmutableDictionary<string, int>.Empty
.Add("a", 1)
.Add("b", 2);
There is also an AddRange method available.
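A small sketch of the AddRange variant, assuming the System.Collections.Immutable package is referenced. AddRangeExample and Build are just illustrative names.

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;

public static class AddRangeExample
{
    public static ImmutableDictionary<string, int> Build()
    {
        // AddRange accepts any IEnumerable<KeyValuePair<TKey, TValue>>,
        // so an array of pairs is enough. It returns a new dictionary
        // and leaves ImmutableDictionary<string, int>.Empty untouched.
        return ImmutableDictionary<string, int>.Empty.AddRange(new[]
        {
            new KeyValuePair<string, int>("a", 1),
            new KeyValuePair<string, int>("b", 2)
        });
    }
}
```

Unlike the chained Add calls, this adds all the pairs in a single call, although the KeyValuePair constructors make it more verbose.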
I prefer this syntax:
var dict = ImmutableDictionaryEx.Create<int, string>(
(1, "one"),
(2, "two"),
(3, "three"));
Can be easily achieved using this method:
public static class ImmutableDictionaryEx {
/// <summary>
/// Creates a new <see cref="ImmutableDictionary"/> with the given key/value pairs.
/// </summary>
public static ImmutableDictionary<K, V> Create<K, V>(params (K key, V value)[] items) where K : notnull {
var builder = ImmutableDictionary.CreateBuilder<K, V>();
foreach (var (key, value) in items)
builder.Add(key, value);
return builder.ToImmutable();
}
}
There is another answer I don't see here (maybe it's a new method?):
Just use the CreateRange method to create a new ImmutableDictionary using an IEnumerable<KVP>.
// The MyObject record example
record MyObject(string Name, string OtherStuff);
// The init logic for the ImmutableDictionary
var myObjects = new List<MyObject>();
var dictionary = ImmutableDictionary
.CreateRange(myObjects.Select(obj => new KeyValuePair<string, MyObject>(obj.Name, obj)));
You could write a custom ImmutableDictionaryLiteral class with an implicit operator to make a syntax very close to what you want.
public class ImmutableDictionaryLiteral<TKey, TValue> : Dictionary<TKey, TValue>
{
public static implicit operator ImmutableDictionary<TKey, TValue>(ImmutableDictionaryLiteral<TKey, TValue> source)
{
return source.ToImmutableDictionary();
}
}
Then you use it by declaring an ImmutableDictionary<TKey, TValue> variable and initializing it with an ImmutableDictionaryLiteral<TKey, TValue> value.
ImmutableDictionary<string, string> dict = new ImmutableDictionaryLiteral<string, string>()
{
{ "key", "value" },
{ "key2", "value2" }
};
This also works with object initializer syntax
ImmutableDictionary<string, string> dict = new ImmutableDictionaryLiteral<string, string>()
{
["key"] = "value",
["key2"] = "value2"
};

Alternating params

Is there a kind of alternating params for method parameters?
I like the keyword params. But sometimes I need two parameters to be params.
I want to call a method like so:
Method(1, "a", 2, "b", 3, "c")
where 1, 2 and 3 are keys and "a", "b" and "c" are assigned values.
If I try to define the method parameters I would intuitively try to use params for two parameters like so:
void Method(params int[] i, string[] s)
The compiler would then assign every argument at an odd position to the first parameter and every argument at an even position to the second parameter.
But (as you know) params is only allowed on the last parameter.
Of course I could create a parameter class (e.g. KeyValue) and use it so:
Method(new[] {new KeyValue(1, "a"), new KeyValue(2, "b"), new KeyValue(3, "c")})
But that is too much code imo.
Is there any shorter notation?
Edit: I just found a good answer to another question. It suggests inheriting from List and overloading the Add method so that the new list can be initialized this way:
new KeyValueList<int, string>{{ 1, "a" }, { 2, "b" }, { 3, "c" }}
Method definition would be:
void Method(KeyValueList<int, string> list)
Call would be:
Method(new KeyValueList<int, string>{{ 1, "a" }, { 2, "b" }, { 3, "c" }})
There is no "alternating params" notation as you described.
You can only have one params parameter and it must be last - if you want to have different types as params parameters you can use object as the array type.
Consider passing in a list made of a custom type that retains the meaning of these items.
public class MyType
{
public int MyNum { get; set; }
public string MyStr { get; set; }
}
Method(List<MyType> myList);
You could do this via params object[] keysAndValues and sort it out yourself, but... it's a bit icky, what with all the boxing/unboxing that would go on.
Just giving an updated answer for today's developers looking for a solution...
Old School Answer:
You could create a class or struct that includes the parameters of the type you need...
struct Struct {
public int Integer {get; set;}
public string Text {get; set;}
public Struct(int integer, string text) {
Integer = integer;
Text = text;
}
}
... and then define your function to accept it as the params array...
void Method(params Struct[] structs)
... and call it like...
Method(new Struct(1, "A"), new Struct(2, "B"), new Struct(3, "C"));
New School Answer:
Or, with the latest C#, you could simply use ValueTuples...
void Method(params (int, string)[] valueTuples)
... and call it like...
Method((1, "A"), (2, "B"), (3, "C"))
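To make the tuple version concrete, here is a small self-contained sketch. It assumes C# 7 or later for tuple syntax and deconstruction. TupleParamsExample and Collect are made-up names, and collecting into a Dictionary is just one thing the method might do with the pairs.

```csharp
using System.Collections.Generic;

public static class TupleParamsExample
{
    // Each argument is one (key, value) tuple, so the pairing is checked
    // by the compiler instead of being reconstructed from positions.
    public static Dictionary<int, string> Collect(params (int Key, string Value)[] pairs)
    {
        var result = new Dictionary<int, string>();
        foreach (var (key, value) in pairs)
        {
            // Dictionary.Add throws ArgumentException on a duplicate key.
            result.Add(key, value);
        }
        return result;
    }
}
```

A call then looks like TupleParamsExample.Collect((1, "A"), (2, "B"), (3, "C")), which is about as close to the alternating syntax as the language allows.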

Why can I initialize a List like an array in C#?

Today I was surprised to find that in C# I can do:
List<int> a = new List<int> { 1, 2, 3 };
Why can I do this? What constructor is called? How can I do this with my own classes? I know that this is the way to initialize arrays but arrays are language items and Lists are simple objects ...
This is part of the collection initializer syntax in .NET. You can use this syntax on any collection you create as long as:
It implements IEnumerable (preferably IEnumerable<T>)
It has a method named Add(...)
What happens is the default constructor is called, and then Add(...) is called for each member of the initializer.
Thus, these two blocks are roughly identical:
List<int> a = new List<int> { 1, 2, 3 };
And
List<int> temp = new List<int>();
temp.Add(1);
temp.Add(2);
temp.Add(3);
List<int> a = temp;
You can call an alternate constructor if you want, for example to prevent over-sizing the List<T> during growing, etc:
// Notice, calls the List constructor that takes an int arg
// for initial capacity, then Add()'s three items.
List<int> a = new List<int>(3) { 1, 2, 3, };
Note that the Add() method need not take a single item, for example the Add() method for Dictionary<TKey, TValue> takes two items:
var grades = new Dictionary<string, int>
{
{ "Suzy", 100 },
{ "David", 98 },
{ "Karen", 73 }
};
Is roughly identical to:
var temp = new Dictionary<string, int>();
temp.Add("Suzy", 100);
temp.Add("David", 98);
temp.Add("Karen", 73);
var grades = temp;
So, to add this to your own class, all you need do, as mentioned, is implement IEnumerable (again, preferably IEnumerable<T>) and create one or more Add() methods:
public class SomeCollection<T> : IEnumerable<T>
{
// implement Add() methods appropriate for your collection
public void Add(T item)
{
// your add logic
}
// implement your enumerators for IEnumerable<T> (and IEnumerable)
public IEnumerator<T> GetEnumerator()
{
// your implementation
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
Then you can use it just like the BCL collections do:
public class MyProgram
{
private SomeCollection<int> _myCollection = new SomeCollection<int> { 13, 5, 7 };
// ...
}
(For more information, see the MSDN)
It is so-called syntactic sugar.
List<T> is a "simple" class, but the compiler gives it special treatment in order to make your life easier.
This feature is called a collection initializer. You need to implement IEnumerable<T> and an Add method.
According to the C# Version 3.0 Specification "The collection object to which a collection initializer is applied must be of a type that implements System.Collections.Generic.ICollection<T> for exactly one T."
However, this information appears to be inaccurate as of this writing; see Eric Lippert's clarification in the comments below.
It works thanks to collection initializers which basically require the collection to implement an Add method and that will do the work for you.
Another cool thing about collection initializers is that you can have multiple overloads of Add method and you can call them all in the same initializer! For example this works:
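public class MyCollection<T> : IEnumerable<T>
{
private readonly List<T> _items = new List<T>();
public void Add(T item, int number)
{
_items.Add(item);
}
public void Add(T item, string text)
{
_items.Add(item);
}
public bool Add(T item) //return type could be anything
{
_items.Add(item);
return true;
}
// enumerators required by IEnumerable<T>
public IEnumerator<T> GetEnumerator() { return _items.GetEnumerator(); }
IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}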
var myCollection = new MyCollection<bool>
{
true,
{ false, 0 },
{ true, "" },
false
};
It calls the correct overloads. Also, it looks for just the method with name Add, the return type could be anything.
The array-like syntax is turned into a series of Add() calls.
To see this in a more interesting example, consider the following code, in which I do two things that at first sound illegal in C#: 1) setting a read-only property, and 2) setting a list with an array-like initializer.
public class MyClass
{
public MyClass()
{
_list = new List<string>();
}
private IList<string> _list;
public IList<string> MyList
{
get
{
return _list;
}
}
}
//In some other method
var sample = new MyClass
{
MyList = {"a", "b"}
};
This code will work perfectly, although 1) MyList is read-only and 2) I set a list with an array-style initializer.
This works because, inside an object initializer, the compiler turns the nested { } syntax into a series of Add() calls on the existing collection, which is perfectly legal even on a read-only property.
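To illustrate, the object initializer above behaves roughly like the hand-written version in this sketch. MyClass is repeated so the snippet stands on its own. InitializerExpansion and its two method names are hypothetical, and the expansion is an approximation of what the compiler generates, not its literal output.

```csharp
using System.Collections.Generic;

public class MyClass
{
    private readonly IList<string> _list = new List<string>();

    // Getter only: the property itself can never be reassigned.
    public IList<string> MyList
    {
        get { return _list; }
    }
}

public static class InitializerExpansion
{
    // Nested collection initializer: note there is no "new" after the "=".
    public static MyClass WithInitializer()
    {
        return new MyClass { MyList = { "a", "b" } };
    }

    // Roughly equivalent code: the getter is read and Add is called
    // on the list it returns; the property is never assigned.
    public static MyClass Expanded()
    {
        var temp = new MyClass();
        temp.MyList.Add("a");
        temp.MyList.Add("b");
        return temp;
    }
}
```

One consequence worth knowing: if the getter returned null, the initializer form would fail at run time with a NullReferenceException, because nothing ever creates the list for you.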

How to specify a list selection method?

I've got a method that computes a list. At certain points in the algorithm a single element from the list needs to be chosen. It doesn't really matter which element is chosen, but I'd like to leave it up to the user to decide.
Right now, I've added an extension method IList<T>.Random() which simply takes a random element. .First() would have worked equally well. Supposing I want to let the user pick which method is used, or perhaps an entirely different method, how would that look?
I was thinking about using an enum with limited options, and then I could wrap each of these calls in a switch and call the appropriate function. But maybe some sort of lambda function would be more appropriate?
This method needs to be used in two different places, once on a List<char> and once on a List<string>. I want to use the same method for both.
This isn't a GUI app. I'm trying to decide how to design the API.
Specifically, I want to have a field like
public Func<IList<T>, T> SelectElement = list => list.First();
Which would then be used in the method,
public string Reverse(string pattern, IList<object> args = null, IDictionary<string, object> kwargs = null)
But generic fields aren't possible. So I'm looking for an alternative solution. One would be to make the SelectElement method an argument to Reverse(), then I could make it generic... but I was hoping to keep it at a class-level for re-usability. Don't want to pass any more args to the function if I can help it.
Edit: full source code
how about this:
public class MyClass
{
public static class C<T>
{
public static Func<IList<T>, T> SelectElement;
}
public int Test(IList<int> list)
{
return C<int>.SelectElement(list);
}
}
static class Program
{
static void Main(string[] args)
{
MyClass.C<char>.SelectElement = xs => xs.First();
MyClass.C<int>.SelectElement = xs => xs.First();
var list = new List<int>(new int[] { 1, 2, 3 });
var c = new MyClass();
var v = c.Test(list);
Console.WriteLine(v);
}
}
Here's an extremely basic example I put together using a generic method that takes in a Func<IEnumerable<T>, T> for selecting an item from the list and then returns the result. I've done a few examples of how to call it:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Test
{
class Program
{
static void Main(string[] args)
{
//Simple list.
var list = new List<int> { 1, 2, 3, 4 };
// Try it with first
var result = DoItemSelect(list, Enumerable.First);
Console.WriteLine(result);
// Try it with last
result = DoItemSelect(list, Enumerable.Last);
Console.WriteLine(result);
// Try it with ElementAt for the second item (index 1) in the list.
result = DoItemSelect(list, enumerable => enumerable.ElementAt(1));
Console.WriteLine(result);
}
public static T DoItemSelect<T>(IEnumerable<T> enumerable, Func<IEnumerable<T>, T> selector)
{
// You can do whatever your method does here, selector is the user specified func for
// how to select from the enumerable. Here I just return the result of selector directly.
return selector(enumerable);
}
}
}
If you want to limit the choices a user has you could follow the route of an enum and make this method a private method and then have a way to convert the enum to the appropriate selector delegate to pass to the underlying private method.
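Here is one way the enum route could be sketched. SelectionMode, ItemSelector and ToSelector are names I made up for illustration. The random branch copies the sequence to a list first, which is a simplification rather than something the question requires.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum SelectionMode { First, Last, Random }

public static class ItemSelector
{
    private static readonly Random Rng = new Random();

    // Public entry point: callers pick from a fixed set of strategies.
    public static T Select<T>(IEnumerable<T> items, SelectionMode mode)
    {
        return DoItemSelect(items, ToSelector<T>(mode));
    }

    // Maps each enum value to the delegate the private method expects.
    private static Func<IEnumerable<T>, T> ToSelector<T>(SelectionMode mode)
    {
        switch (mode)
        {
            case SelectionMode.First:
                return xs => xs.First();
            case SelectionMode.Last:
                return xs => xs.Last();
            case SelectionMode.Random:
                return xs =>
                {
                    var list = xs.ToList();
                    // Throws ArgumentOutOfRangeException for an empty sequence.
                    return list[Rng.Next(list.Count)];
                };
            default:
                throw new ArgumentOutOfRangeException(nameof(mode));
        }
    }

    // Same shape as the generic method above, now hidden from callers.
    private static T DoItemSelect<T>(IEnumerable<T> enumerable, Func<IEnumerable<T>, T> selector)
    {
        return selector(enumerable);
    }
}
```

Because Select is a generic method, the same call works for a List<char> and a List<string> without any casting to object.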
public Func<IList<object>, object> SelectElement = list => list.First();
private T _S<T>(IEnumerable<T> list)
{
return (T)SelectElement(list.Cast<object>().ToList());
}
I can make the anonymous method work on objects, thereby avoiding generics, and then add a helper method which is what I'll actually use to call it. A little ugly, but seems to work.
This works for chars and strings. Haven't tested with other types. Built this before I saw Ralph's code, which is practically the same.
LINQPad code:
void Main()
{
var chars = new List<char>();
var strings = new List<string>();
chars.AddRange(new char[] {'1','2','4','7','8','3'});
strings.AddRange(new string[] {"01","02","09","12","28","52"});
chars.Dump();
strings.Dump();
Func<IList<object>, string> SelectFirst = ( list )
=> list.First().ToString();
Func<IList<object>, string> SelectLast = ( list )
=> list.Last().ToString();
Func<IList<object>, string> SelectRandom = ( list )
=> list.ElementAt( new Random().Next(0, list.Count())).ToString();
SelectBy(SelectFirst, strings.Cast<object>().ToList()).Dump();
SelectBy(SelectFirst, chars.Cast<object>().ToList()).Dump();
SelectBy(SelectLast, strings.Cast<object>().ToList()).Dump();
SelectBy(SelectLast, chars.Cast<object>().ToList()).Dump();
SelectBy(SelectRandom, strings.Cast<object>().ToList()).Dump();
SelectBy(SelectRandom, chars.Cast<object>().ToList()).Dump();
}
private string SelectBy(Func<IList<object>, string> func, IList<object> list)
{
return func(list);
}

Grouping consecutive identical items: IEnumerable<T> to IEnumerable<IEnumerable<T>>

I've got an interesting problem: Given an IEnumerable<string>, is it possible to yield a sequence of IEnumerable<IEnumerable<string>> that groups identical adjacent strings in one pass?
Let me explain.
1. Basic illustrative sample :
Considering the following IEnumerable<string> (pseudo representation):
{"a","b","b","b","c","c","d"}
How to get an IEnumerable<IEnumerable<string>> that would yield something of the form:
{ // IEnumerable<IEnumerable<string>>
{"a"}, // IEnumerable<string>
{"b","b","b"}, // IEnumerable<string>
{"c","c"}, // IEnumerable<string>
{"d"} // IEnumerable<string>
}
The method prototype would be:
public IEnumerable<IEnumerable<string>> Group(IEnumerable<string> items)
{
// todo
}
But it could also be :
public void Group(IEnumerable<string> items, Action<IEnumerable<string>> action)
{
// todo
}
...where action would be called for each subsequence.
2. More complicated sample
Ok, the first sample is very simple, and only aims to make the high level intent clear.
Now imagine we are dealing with IEnumerable<Anything>, where Anything is a type defined like this:
public class Anything
{
public string Key {get;set;}
public double Value {get;set;}
}
We now want to generate the subsequences based on the Key (grouping all consecutive Anything items that have the same key), so that we can later use them to calculate the total value per group:
public void Compute(IEnumerable<Anything> items)
{
Console.WriteLine(items.Sum(i=>i.Value));
}
// then somewhere, assuming the Group method
// that returns an IEnumerable<IEnumerable<Anything>> actually exists:
foreach(var subsequence in Group(allItems))
{
Compute(subsequence);
}
3. Important notes
Only one iteration over the original sequence
No intermediate collection allocations (we can assume millions of items in the original sequence, and millions of consecutive items in each group)
Keeping enumerators and deferred execution behavior
We can assume that resulting subsequences will be iterated only once, and will be iterated in order.
Is it possible, and how would you write it?
Is this what you are looking for?
Iterate list only once.
Defer execution.
No intermediate collections (my other post failed on this criterion).
This solution relies on object state because it's difficult to share state between two IEnumerable methods that use yield (no ref or out params).
internal class Program
{
static void Main(string[] args)
{
var result = new[] { "a", "b", "b", "b", "c", "c", "d" }.Partition();
foreach (var r in result)
{
Console.WriteLine("Group".PadRight(16, '='));
foreach (var s in r)
Console.WriteLine(s);
}
}
}
internal static class PartitionExtension
{
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> src)
{
var grouper = new DuplicateGrouper<T>();
return grouper.GroupByDuplicate(src);
}
}
internal class DuplicateGrouper<T>
{
T CurrentKey;
IEnumerator<T> Itr;
bool More;
public IEnumerable<IEnumerable<T>> GroupByDuplicate(IEnumerable<T> src)
{
using(Itr = src.GetEnumerator())
{
More = Itr.MoveNext();
while (More)
yield return GetDuplicates();
}
}
IEnumerable<T> GetDuplicates()
{
CurrentKey = Itr.Current;
while (More && EqualityComparer<T>.Default.Equals(CurrentKey, Itr.Current)) // null-safe comparison
{
yield return Itr.Current;
More = Itr.MoveNext();
}
}
}
Edit: Added extension method for cleaner usage. Fixed loop test logic so that "More" is evaluated first.
Edit: Dispose the enumerator when finished
Way Better Solution That Meets All Requirements
OK, scrap my previous solution (I'll leave it below, just for reference). Here's a much better approach that occurred to me after making my initial post.
Write a new class that implements IEnumerator<T> and provides a few additional properties: IsValid and Previous. This is all you really need to resolve the whole mess with having to maintain state inside an iterator block using yield.
Here's how I did it (pretty trivial, as you can see):
internal class ChipmunkEnumerator<T> : IEnumerator<T> {
private readonly IEnumerator<T> _internal;
private T _previous;
private bool _isValid;
public ChipmunkEnumerator(IEnumerator<T> e) {
_internal = e;
_isValid = false;
}
public bool IsValid {
get { return _isValid; }
}
public T Previous {
get { return _previous; }
}
public T Current {
get { return _internal.Current; }
}
public bool MoveNext() {
if (_isValid)
_previous = _internal.Current;
return (_isValid = _internal.MoveNext());
}
public void Dispose() {
_internal.Dispose();
}
#region Explicit Interface Members
object System.Collections.IEnumerator.Current {
get { return Current; }
}
void System.Collections.IEnumerator.Reset() {
_internal.Reset();
_previous = default(T);
_isValid = false;
}
#endregion
}
(I called this a ChipmunkEnumerator because maintaining the previous value reminded me of how chipmunks have pouches in their cheeks where they keep nuts. Does it really matter? Stop making fun of me.)
Now, utilizing this class in an extension method to provide exactly the behavior you want isn't so tough!
Notice that below I've defined GroupConsecutive to actually return an IEnumerable<IGrouping<TKey, T>> for the simple reason that, if these are grouped by key anyway, it makes sense to return an IGrouping<TKey, T> rather than just an IEnumerable<T>. As it turns out, this will help us out later anyway...
public static IEnumerable<IGrouping<TKey, T>> GroupConsecutive<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector)
where TKey : IEquatable<TKey> {
using (var e = new ChipmunkEnumerator<T>(source.GetEnumerator())) {
if (!e.MoveNext())
yield break;
while (e.IsValid) {
yield return e.GetNextDuplicateGroup(keySelector);
}
}
}
public static IEnumerable<IGrouping<T, T>> GroupConsecutive<T>(this IEnumerable<T> source)
where T : IEquatable<T> {
return source.GroupConsecutive(x => x);
}
private static IGrouping<TKey, T> GetNextDuplicateGroup<T, TKey>(this ChipmunkEnumerator<T> e, Func<T, TKey> keySelector)
where TKey : IEquatable<TKey> {
return new Grouping<TKey, T>(keySelector(e.Current), e.EnumerateNextDuplicateGroup(keySelector));
}
private static IEnumerable<T> EnumerateNextDuplicateGroup<T, TKey>(this ChipmunkEnumerator<T> e, Func<T, TKey> keySelector)
where TKey : IEquatable<TKey> {
do {
yield return e.Current;
} while (e.MoveNext() && keySelector(e.Previous).Equals(keySelector(e.Current)));
}
(To implement these methods, I wrote a simple Grouping<TKey, T> class that implements IGrouping<TKey, T> in the most straightforward way possible. I've omitted the code just so as to keep moving along...)
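Since the Grouping<TKey, T> class is omitted, here is a guess at what a minimal version might look like. It is a hypothetical stand-in, not the answerer's actual code: it only stores the key and wraps the lazily evaluated element sequence without buffering it.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the omitted Grouping<TKey, T> class.
public sealed class Grouping<TKey, T> : IGrouping<TKey, T>
{
    private readonly IEnumerable<T> _elements;

    public Grouping(TKey key, IEnumerable<T> elements)
    {
        Key = key;
        _elements = elements;
    }

    public TKey Key { get; private set; }

    // Enumeration is forwarded to the wrapped sequence, so no
    // intermediate collection is allocated for the group.
    public IEnumerator<T> GetEnumerator()
    {
        return _elements.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

Because the wrapper does not copy anything, each group in the surrounding approach is still only safe to enumerate once and in order, as the question allows.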
OK, check it out. I think the code example below pretty well captures something resembling the more realistic scenario you described in your updated question.
var entries = new List<KeyValuePair<string, int>> {
new KeyValuePair<string, int>( "Dan", 10 ),
new KeyValuePair<string, int>( "Bill", 12 ),
new KeyValuePair<string, int>( "Dan", 14 ),
new KeyValuePair<string, int>( "Dan", 20 ),
new KeyValuePair<string, int>( "John", 1 ),
new KeyValuePair<string, int>( "John", 2 ),
new KeyValuePair<string, int>( "Bill", 5 )
};
var dupeGroups = entries
.GroupConsecutive(entry => entry.Key);
foreach (var dupeGroup in dupeGroups) {
Console.WriteLine(
"Key: {0} Sum: {1}",
dupeGroup.Key.PadRight(5),
dupeGroup.Select(entry => entry.Value).Sum()
);
}
Output:
Key: Dan Sum: 10
Key: Bill Sum: 12
Key: Dan Sum: 34
Key: John Sum: 3
Key: Bill Sum: 5
Notice this also fixes the problem with my original answer of dealing with IEnumerator<T> objects that were value types. (With this approach, it doesn't matter.)
There's still going to be a problem if you try calling ToList here, as you will find out if you try it. But considering you included deferred execution as a requirement, I doubt you would be doing that anyway. For a foreach, it works.
Original, Messy, and Somewhat Stupid Solution
Something tells me I'm going to get totally refuted for saying this, but...
Yes, it is possible (I think). See below for a damn messy solution I threw together. (Catches an exception to know when it's finished, so you know it's a great design!)
Now, Jon's point about there being a very real problem in the event that you try to do, for instance, ToList, and then access the values in the resulting list by index, is totally valid. But if your only intention here is to be able to loop over an IEnumerable<T> using a foreach -- and you're only doing this in your own code -- then, well, I think this could work for you.
Anyway, here's a quick example of how it works:
var ints = new int[] { 1, 3, 3, 4, 4, 4, 5, 2, 3, 1, 6, 6, 6, 5, 7, 7, 8 };
var dupeGroups = ints.GroupConsecutiveDuplicates(EqualityComparer<int>.Default);
foreach (var dupeGroup in dupeGroups) {
    Console.WriteLine(
        "New dupe group: " +
        string.Join(", ", dupeGroup.Select(i => i.ToString()).ToArray())
    );
}
Output:
New dupe group: 1
New dupe group: 3, 3
New dupe group: 4, 4, 4
New dupe group: 5
New dupe group: 2
New dupe group: 3
New dupe group: 1
New dupe group: 6, 6, 6
New dupe group: 5
New dupe group: 7, 7
New dupe group: 8
And now for the (messy as crap) code:
Note that since this approach passes the actual enumerator around between a few different methods, it will not work if that enumerator is a value type, because calls to MoveNext in one method only affect a local copy.
public static IEnumerable<IEnumerable<T>> GroupConsecutiveDuplicates<T>(this IEnumerable<T> source, IEqualityComparer<T> comparer) {
    using (var e = source.GetEnumerator()) {
        // The same enumerator instance is shared by all the helpers below.
        if (e.GetType().IsValueType)
            throw new ArgumentException(
                "This method will not work on a value type enumerator."
            );
        // get the ball rolling
        if (!e.MoveNext()) {
            yield break;
        }
        IEnumerable<T> nextDuplicateGroup;
        while (e.FindMoreDuplicates(comparer, out nextDuplicateGroup)) {
            // The caller must enumerate each group before asking for the next one;
            // otherwise the shared enumerator never advances.
            yield return nextDuplicateGroup;
        }
    }
}
private static bool FindMoreDuplicates<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer, out IEnumerable<T> duplicates) {
    duplicates = enumerator.GetMoreDuplicates(comparer);
    return duplicates != null;
}
private static IEnumerable<T> GetMoreDuplicates<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer) {
    try {
        // Current can throw InvalidOperationException once the enumerator is
        // past the end; that is treated as the end of the input.
        if (enumerator.Current != null)
            return enumerator.GetMoreDuplicatesInner(comparer);
        else
            return null;
    } catch (InvalidOperationException) {
        return null;
    }
}
private static IEnumerable<T> GetMoreDuplicatesInner<T>(this IEnumerator<T> enumerator, IEqualityComparer<T> comparer) {
    while (enumerator.Current != null) {
        var current = enumerator.Current;
        yield return current;
        // Stop at the end of the input or at the first item that differs.
        if (!enumerator.MoveNext())
            break;
        if (!comparer.Equals(current, enumerator.Current))
            break;
    }
}
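To make the value-type caveat above concrete, here is a small sketch of the copy semantics involved. It is not part of the original answer; the class and method names (EnumeratorCopyDemo, AdvanceCopy, AdvanceShared) are made up for illustration. A struct enumerator such as List<int>.Enumerator is copied when passed by value, whereas an enumerator held through the IEnumerator<int> interface is a single boxed object that every method shares.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorCopyDemo
{
    // Hypothetical helper: receives a struct enumerator by value,
    // so MoveNext advances only the callee's copy.
    public static void AdvanceCopy(List<int>.Enumerator e)
    {
        e.MoveNext();
    }

    // Hypothetical helper: receives a reference to one (boxed) enumerator,
    // so the caller observes the MoveNext call.
    public static void AdvanceShared(IEnumerator<int> e)
    {
        e.MoveNext();
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };

        List<int>.Enumerator byValue = list.GetEnumerator();
        byValue.MoveNext();                   // Current is 1
        AdvanceCopy(byValue);                 // only the copy moves on
        Console.WriteLine(byValue.Current);   // prints 1

        IEnumerator<int> shared = list.GetEnumerator(); // boxed once here
        shared.MoveNext();                    // Current is 1
        AdvanceShared(shared);                // same object, so it advances
        Console.WriteLine(shared.Current);    // prints 2
    }
}
```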
Your second bullet is the problematic one. Here's why:
var groups = CallMagicGetGroupsMethod().ToList();
foreach (string x in groups[3])
{
    ...
}
foreach (string x in groups[0])
{
    ...
}
Here, it's trying to iterate over the fourth group and then the first group... that's clearly only going to work if all the groups are buffered or it can reread the sequence, neither of which is ideal.
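For comparison, the buffered option mentioned above could look roughly like the sketch below. This is only an illustration under my own assumptions, not code from the question or from any answer: GroupConsecutiveBuffered is a hypothetical name, and each group is copied into its own List<T> before it is handed out, which is exactly the buffering cost being discussed. The outer sequence is still produced lazily, one finished group at a time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BufferedGrouping
{
    // Hypothetical helper: yields each run of equal consecutive items
    // as a fully materialized list.
    public static IEnumerable<List<T>> GroupConsecutiveBuffered<T>(
        this IEnumerable<T> source, IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        List<T> group = null;
        foreach (var item in source)
        {
            if (group != null && !comparer.Equals(group[0], item))
            {
                yield return group;   // the run has ended; hand it out complete
                group = null;
            }
            if (group == null) group = new List<T>();
            group.Add(item);
        }
        if (group != null) yield return group;   // last run, if any
    }
}

public static class BufferedGroupingDemo
{
    public static void Main()
    {
        var groups = new[] { "a", "a", "b", "c", "c", "a" }
            .GroupConsecutiveBuffered()
            .ToList();
        // Out-of-order access is safe because every group is its own list.
        Console.WriteLine(string.Join(",", groups[3]));   // prints a
        Console.WriteLine(string.Join(",", groups[0]));   // prints a,a
    }
}
```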
I suspect you want a more "reactive" approach. I don't know offhand whether Reactive Extensions does what you want (the "consecutive" requirement is unusual), but you should basically provide some sort of action to be executed on each group. That way the method doesn't need to return something which could be used later on, after it has already finished reading.
Let me know if you'd like me to try to find a solution within Rx, or whether you would be happy with something like:
void GroupConsecutive(IEnumerable<string> items,
Action<IEnumerable<string>> action)
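A minimal sketch of what a method with that signature might do is shown below. Only the signature comes from the text above; the body and the class name ReactiveStyleGrouping are my assumptions. For simplicity it buffers one group at a time and then invokes the action, so it does not stream items within a group. A fully streaming version would take more work.

```csharp
using System;
using System.Collections.Generic;

public static class ReactiveStyleGrouping
{
    // Assumed implementation: collects one run of equal consecutive strings,
    // passes it to the action, then starts collecting the next run.
    public static void GroupConsecutive(IEnumerable<string> items,
                                        Action<IEnumerable<string>> action)
    {
        var group = new List<string>();
        foreach (var item in items)
        {
            if (group.Count > 0 && !string.Equals(group[0], item))
            {
                action(group);
                // Start a new list so the action may keep the old one.
                group = new List<string>();
            }
            group.Add(item);
        }
        if (group.Count > 0) action(group);   // final run, if any
    }
}

public static class ReactiveStyleGroupingDemo
{
    public static void Main()
    {
        ReactiveStyleGrouping.GroupConsecutive(
            new[] { "x", "x", "y" },
            g => Console.WriteLine(string.Join(", ", g)));
        // prints "x, x" and then "y"
    }
}
```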
Here's a solution that I think satisfies your requirements, works with any type of data item, and is quite short and readable:
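public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> list)
{
    var current = list.FirstOrDefault();
    // Stops at the end of the sequence, or at an element equal to default(T).
    while (!Equals(current, default(T))) {
        var cur = current; // fresh variable per iteration for the closure
        // The static Equals is null-safe, unlike item.Equals(cur).
        Func<T, bool> equalsCurrent = item => Equals(item, cur);
        yield return list.TakeWhile(equalsCurrent);
        list = list.SkipWhile(equalsCurrent);
        current = list.FirstOrDefault();
    }
}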
Notes:
Deferred execution is there (both TakeWhile and SkipWhile do it).
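This does not iterate over the source only once, though: each SkipWhile wraps the previous sequence, so every FirstOrDefault call and every returned group starts a new enumeration of the source from the beginning. The source must therefore be safe to enumerate repeatedly, and the total work grows with the number of groups.
As written, the loop also stops at the first element equal to default(T) (0 for int, null for reference types). If you don't care about value types, you can add a class constraint and change the while condition to a test for null.
If I am somehow mistaken, I'd be especially interested in comments pointing out the mistakes!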
Very Important Aside:
This solution will not allow you to enumerate the produced enumerables in any order other than the one it provides them in. However, I think the original poster has been pretty clear in comments that this is not a problem.
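The class-constraint variant mentioned in the notes might look like the sketch below. The name PartitionByReference is hypothetical, and the body is my reading of that note rather than code from the answer. Like the original, it re-reads the source for each group, so it assumes a source that can be enumerated repeatedly, and it stops at the first null element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionSketch
{
    // Hypothetical variant of Partition restricted to reference types,
    // using a null test as the loop condition.
    public static IEnumerable<IEnumerable<T>> PartitionByReference<T>(
        this IEnumerable<T> list) where T : class
    {
        var current = list.FirstOrDefault();
        while (current != null)
        {
            var cur = current;   // fresh variable per iteration for the closure
            Func<T, bool> equalsCurrent = item => Equals(item, cur);
            yield return list.TakeWhile(equalsCurrent);
            list = list.SkipWhile(equalsCurrent);
            current = list.FirstOrDefault();
        }
    }
}

public static class PartitionSketchDemo
{
    public static void Main()
    {
        var source = new[] { "a", "a", "b", "c", "c" };
        foreach (var group in source.PartitionByReference())
        {
            Console.WriteLine(string.Join(",", group));
        }
        // prints "a,a", then "b", then "c,c"
    }
}
```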
