List<String> ByRef - c#

I'm wondering how one can prove what the .Net framework is doing behind the scenes.
I have a method that accepts a parameter of a List<String> originalParameterList.
In my method I have another List<String> newListObj. If I do the following:
List<String> newListObj = originalParameterList;
newListObj.Add(value);
newListObj.Add(value1);
newListObj.Add(value2);
The count of the originalParameterList grows (+3).
If I do this:
List<String> newListObj = new List<String>(originalParameterList);
newListObj.Add(value);
newListObj.Add(value1);
newListObj.Add(value2);
The count of the originalParameterList stays the same (+0).
I also found that this code behaves the same:
List<String> newListObj = new List<String>(originalParameterList.ToArray());
newListObj.Add(value);
newListObj.Add(value1);
newListObj.Add(value2);
The count of the originalParameterList stays the same (+0).
My question is, is there a way to see what the .Net Framework is doing behind the scenes in a definitive way?

You can load your assembly into ILDASM and, once it is loaded, find your method and double-click it;
it will show the CIL code of that method. Just type "IL" in the Windows Start menu search.
Alternatively, you can use the following ways to create a new independent list:
private void GetList(List<string> lst)
{
List<string> NewList = lst.Cast<string>().ToList();
NewList.Add("6");
//not same values.
//or....
List<string> NewList2 = lst.ConvertAll(s => s); // renamed: a second NewList in the same scope does not compile
NewList2.Add("6");
//again different values
}

Normally, the documentation should give enough information to use the API.
In your specific example, the documentation for public List(IEnumerable<T> collection) says (emphasis mine):
Initializes a new instance of the List class that contains elements
copied from the specified collection and has sufficient capacity to
accommodate the number of elements copied.
For the reference here is the source code for the constructor:
public List (IEnumerable <T> collection)
{
if (collection == null)
throw new ArgumentNullException ("collection");
// initialize to needed size (if determinable)
ICollection <T> c = collection as ICollection <T>;
if (c == null) {
_items = EmptyArray<T>.Value;
AddEnumerable (collection);
} else {
_size = c.Count;
_items = new T [Math.Max (_size, DefaultCapacity)];
c.CopyTo (_items, 0);
}
}
void AddEnumerable (IEnumerable <T> enumerable)
{
foreach (T t in enumerable)
{
Add (t);
}
}

The simplest way to do it is simply go to MSDN
http://msdn.microsoft.com/en-us/library/fkbw11z0.aspx
It says that
Initializes a new instance of the List class that contains elements copied from the specified collection and has sufficient capacity to accommodate the number of elements copied.
so internally it simply adds all elements of the passed IEnumerable to the new list. It also says that
this is an O(n) operation
which means every element is copied; no shortcut such as sharing the source's storage is taken.

That's because in the first case you referenced the original list (since List is a reference type), and you modified its contents via newListObj. In the second and third cases you copied the original list's contents via the List constructor and modified the new collection, which has no effect on the original.
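The difference between assigning and copying can be checked directly with ReferenceEquals. The following is only a minimal sketch; the class name ListAliasDemo and the sample values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ListAliasDemo
{
    // Returns whether each variable refers to the original instance,
    // plus the final element count of the original list.
    public static (bool Alias, bool Copy, int OriginalCount) Run()
    {
        var original = new List<string> { "a", "b" };

        // Plain assignment copies the reference, not the list.
        List<string> alias = original;
        alias.Add("c"); // also visible through 'original'

        // The copy constructor allocates a new list holding the same elements.
        var copy = new List<string>(original);
        copy.Add("d"); // not visible through 'original'

        return (ReferenceEquals(alias, original),
                ReferenceEquals(copy, original),
                original.Count);
    }
}
```

Run() reports that alias is the same instance as original, copy is not, and original ends up with three elements.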

As others already said, there are various tools that let you examine the source code of the .NET framework. I personally prefer dotPeek from JetBrains, which is free.
In the specific case that you have mentioned, I think when you pass a list into the constructor of another list, that list is copied. If you just assign one variable to another, those variables are then simply referring to the same list.

You can either
read the documentation over at MSDN
decompile the resulting MSIL-code, for instance using Telerik's free JustDecompile
or step through the .NET Framework code using the debugger.

This is the code from the List constructor:
public List(IEnumerable<T> collection)
{
if (collection == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
}
ICollection<T> collection2 = collection as ICollection<T>;
if (collection2 != null)
{
int count = collection2.Count;
this._items = new T[count];
collection2.CopyTo(this._items, 0);
this._size = count;
return;
}
this._size = 0;
this._items = new T[4];
using (IEnumerator<T> enumerator = collection.GetEnumerator())
{
while (enumerator.MoveNext())
{
this.Add(enumerator.Current);
}
}
}
As you can see, when you call the constructor that takes an IEnumerable, it copies all the data into itself.

Related

Does GetEnumerator() in C# return a copy, or does it iterate the original source?

I have a simple GetEnumerator usage.
private ConcurrentQueue<string> queue = new ConcurrentQueue<string>();
public IEnumerator GetEnumerator()
{
return queue.GetEnumerator();
}
I want to update the queue outside of this class.
So, I'm doing:
var list = _queue.GetEnumerator();
while (list.MoveNext())
{
list.Current as string = "aaa";
}
Does GetEnumerator() return a copy of the queue, or does it iterate the original values?
So while updating, I update the original?
Thank you :)
It depends on the exact underlying implementation.
As far as I remember, most of the built in dotnet containers use the current data, and not a snapshot.
You will likely get an exception if you modify a collection while iterating over it -- this is to protect against exactly this issue.
This is not the case for ConcurrentQueue<T>, as the GetEnumerator method returns a snapshot of the contents of the queue (as of .Net 4.6 - Docs)
The IEnumerator interface does not have a set on the Current property, so you cannot modify the collection this way (Docs)
Modifying a collection (add, remove, replace elements) while iterating is in general risky, as one should not rely on how the iterator is implemented.
In addition, a queue is designed for removing the first element and adding elements at the end; it does not allow replacing an element "in the middle".
Here are two approaches that could work:
Approach 1 - Create a new queue with updated elements
Iterate over the original queue and recreate a new collection in the process.
var newQueueUpdated = new ConcurrentQueue<string>();
var iterator = _queue.GetEnumerator();
while (iterator.MoveNext())
{
newQueueUpdated.Enqueue("aaa"); // ConcurrentQueue<T> has Enqueue, not Add
}
_queue = newQueueUpdated;
This can be done in one go by using LINQ's .Select and feeding the resulting IEnumerable to the ConcurrentQueue constructor:
_queue = new ConcurrentQueue<string>(_queue.Select(x => "aaa"));
Beware, could be resource consuming. Of course, other implementations are possible, especially if your collection is large.
Approach 2 - Collection of mutable elements
You could use a wrapper class to enable mutation of objects stored:
public class MyObject
{
public string Value { get; set; }
}
Then you create a private ConcurrentQueue<MyObject> queue = new ConcurrentQueue<MyObject>(); instead.
And now you can mutate the elements, without having to change any reference in the collection itself:
var enumerator = _queue.GetEnumerator();
while (enumerator.MoveNext())
{
enumerator.Current.Value = "aaa";
}
In the code above, the references stored by the container have never changed. Their internal state has changed, though.
In the question's code, you were actually trying to replace one object (a string) with another object, which has no clear meaning for a queue and cannot be done through .Current, which is read-only. For some containers it is even forbidden.
Here's some test code to see if I can modify the ConcurrentQueue<string> while it is iterating.
ConcurrentQueue<string> queue = new ConcurrentQueue<string>(new[] { "a", "b", "c" });
var e = queue.GetEnumerator();
while (e.MoveNext())
{
Console.Write(e.Current);
if (e.Current == "b")
{
queue.Enqueue("x");
}
}
e = queue.GetEnumerator(); //e.Reset(); is not supported
while (e.MoveNext())
{
Console.Write(e.Current);
}
That runs successfully and produces abcabcx.
However, if we change the collection to a standard List<string> then it fails.
Here's the implementation:
List<string> list = new List<string>(new[] { "a", "b", "c" });
var e = list.GetEnumerator();
while (e.MoveNext())
{
Console.Write(e.Current);
if (e.Current == "b")
{
list.Add("x");
}
}
e = list.GetEnumerator();
while (e.MoveNext())
{
Console.Write(e.Current);
}
That produces ab before throwing an InvalidOperationException.
For ConcurrentQueue this is specifically addressed by the documentation:
The enumeration represents a moment-in-time snapshot of the contents
of the queue. It does not reflect any updates to the collection after
GetEnumerator was called. The enumerator is safe to use concurrently
with reads from and writes to the queue.
So the answer is: It acts as if it returns a copy. (It doesn't actually make a copy, but the effect is as if it was a copy - i.e. changing the original collection while enumerating it will not change the items produced by the enumeration.)
This behaviour is NOT guaranteed for other types - for example, attempting to enumerate a List<T> will fail if the list is modified during the enumeration.

Remove and Return First Item of List

I was wondering if there is a built-in method to remove and return the first item of a list with one method/command.
I used this, which was not pretty:
Item currentItem = items.First();
items.RemoveAt(0);
So I wrote an extension method:
public static class ListExtensions
{
public static T RemoveAndReturnFirst<T>(this List<T> list)
{
T currentFirst = list.First();
list.RemoveAt(0);
return currentFirst;
}
}
//Example code
Item currentItem = items.RemoveAndReturnFirst();
Is this the best possibility or is there any built-in method?
The list is returned from a nHibernate-Query and therefore it should remain a List<T>.
Most suitable collection for this operation is Queue:
var queue = new Queue<int>();
queue.Enqueue(10); //add first
queue.Enqueue(20); //add to the end
var first = queue.Dequeue(); //removes first and returns it (10)
Queue makes Enqueue and Dequeue operations very fast. But if you need to search inside the queue or get an item by index, it's a bad choice. Consider which kinds of operations you perform and how often, and choose the most suitable collection accordingly: queue, stack, list or simple array.
Also you can create a Queue from a List:
var list = new List<int>();
var queue = new Queue<int>(list);
There is no built-in method. Your code looks fine to me.
One small thing, I would use the indexer, not the First extension method:
T currentFirst = list[0];
And check that your list has Count > 0.
public static T RemoveAndReturnFirst<T>(this List<T> list)
{
if (list == null || list.Count == 0)
{
// Instead of returning the default,
// an exception might be more compliant to the method signature.
return default(T);
}
T currentFirst = list[0];
list.RemoveAt(0);
return currentFirst;
}
If you have to worry about concurrency, I would advise using another collection type, since this one isn't thread-safe.
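As the comment in the method above hints, throwing on an empty list may fit the signature better than returning default(T). A minimal sketch of that variant follows; the name PopFirst is hypothetical, and the choice of InvalidOperationException simply mirrors what Queue<T>.Dequeue does on an empty queue.

```csharp
using System;
using System.Collections.Generic;

public static class ListPopExtensions
{
    // Hypothetical helper: removes and returns the first element,
    // throwing instead of silently returning default(T).
    public static T PopFirst<T>(this List<T> list)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        if (list.Count == 0) throw new InvalidOperationException("The list is empty.");

        T first = list[0];
        list.RemoveAt(0); // O(n): the remaining elements are shifted down by one
        return first;
    }
}
```

With this variant a caller cannot confuse "the list was empty" with "the first element happened to be null or 0".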

Impact of IEnumerable.ToList()

I'm just wondering what goes on when calling .ToList() on an IEnumerable in C#. Do the items actually get copied to completely new duplicated items on the heap or does the new List simply refer to the original items on the heap?
I'm wondering because someone told me it's expensive to call ToList, whereas if it's simply about assigning existing objects to a new list, that's a lightweight call.
I've written this fiddle https://dotnetfiddle.net/s7xIc2
Is simply checking the hashcode enough to know?
IEnumerable doesn't have to contain a list of anything. It can (and often does) resolve each current item at the time it is requested.
On the other hand, a List<T> holds all of its items in memory.
So the answer is... It depends.
What is backing your IEnumerable? If it's the file system then yes, calling .ToList can be quite expensive. If it's an in-memory list already, then no, calling .ToList would not be terribly expensive.
As an example, let's say you created an IEnumerable that generated and returned a random number each time MoveNext was called. In this case calling .ToList on the IEnumerable would never return, and would eventually throw an OutOfMemoryException.
However, an IEnumerable of database objects has a finite bounds (usually :) ) and as long as all the data fits in memory, calling .ToList could be entirely appropriate.
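To make the random-number example concrete, here is a small sketch of such an endless sequence. The names LazySequenceDemo, RandomNumbers and FirstFive are invented for illustration; the point is only that ToList() is safe once the sequence is bounded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySequenceDemo
{
    // An endless, lazily evaluated sequence: a value is only produced
    // when the consumer calls MoveNext() on the enumerator.
    public static IEnumerable<int> RandomNumbers(int seed)
    {
        var random = new Random(seed);
        while (true)
        {
            yield return random.Next();
        }
    }

    public static List<int> FirstFive(int seed)
    {
        // Take(5) bounds the sequence, so ToList() terminates.
        // RandomNumbers(seed).ToList() without Take would never return.
        return RandomNumbers(seed).Take(5).ToList();
    }
}
```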
Here is one version of ToList:
public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source)
{
if (source == null) throw Error.ArgumentNull("source");
return new List<TSource>(source);
}
It creates a new list from the source, here is the constructor:
// Constructs a List, copying the contents of the given collection. The
// size and capacity of the new list will both be equal to the size of the
// given collection.
//
public List(IEnumerable<T> collection) {
if (collection==null)
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
Contract.EndContractBlock();
ICollection<T> c = collection as ICollection<T>;
if( c != null) {
int count = c.Count;
if (count == 0)
{
_items = _emptyArray;
}
else {
_items = new T[count];
c.CopyTo(_items, 0);
_size = count;
}
}
else {
_size = 0;
_items = _emptyArray;
// This enumerable could be empty. Let Add allocate a new array, if needed.
// Note it will also go to _defaultCapacity first, not 1, then 2, etc.
using(IEnumerator<T> en = collection.GetEnumerator()) {
while(en.MoveNext()) {
Add(en.Current);
}
}
}
}
It copies the items.
The code is from here: referencesource.microsoft.com
ToList() creates a new List object that will contain references to the original objects, or copies of the objects if they are structs.
For instance, a List of int would be a full copy. A list of "Product" would only hold references to the products, not full copies. If an original product is modified, the product seen through the list is modified as well.
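The reference-versus-value behaviour described above can be observed with a few lines. This is only a sketch: the Product class here is a hypothetical stand-in with a single property.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Product // hypothetical reference type
{
    public string Name { get; set; }
}

public static class ToListCopyDemo
{
    public static (string NameSeenViaCopy, int IntSeenViaCopy) Run()
    {
        var products = new List<Product> { new Product { Name = "old" } };
        var numbers = new List<int> { 1 };

        List<Product> productsCopy = products.ToList(); // copies the references
        List<int> numbersCopy = numbers.ToList();       // copies the values

        products[0].Name = "new"; // mutates the single shared Product object
        numbers[0] = 99;          // replaces the value in the original list only

        return (productsCopy[0].Name, numbersCopy[0]);
    }
}
```

Run() yields "new" for the product name and 1 for the integer: the copied list sees the mutation of the shared object but not the replaced value.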

Intelligent way of removing items from a List<T> while enumerating in C#

I have the classic case of trying to remove an item from a collection while enumerating it in a loop:
List<int> myIntCollection = new List<int>();
myIntCollection.Add(42);
myIntCollection.Add(12);
myIntCollection.Add(96);
myIntCollection.Add(25);
foreach (int i in myIntCollection)
{
if (i == 42)
myIntCollection.Remove(96); // The error is here.
if (i == 25)
myIntCollection.Remove(42); // The error is here.
}
At the beginning of the iteration after a change takes place, an InvalidOperationException is thrown, because enumerators don’t like when the underlying collection changes.
I need to make changes to the collection while iterating. There are many patterns that can be used to avoid this, but none of them seems to have a good solution:
1. Do not delete inside this loop, instead keep a separate “Delete List”, that you process after the main loop.
This is normally a good solution, but in my case, I need the item to be gone instantly as “waiting” till after the main loop to really delete the item changes the logic flow of my code.
2. Instead of deleting the item, simply set a flag on the item and mark it as inactive. Then add the functionality of pattern 1 to clean up the list.
This would work for all of my needs, but it means that a lot of code will have to change in order to check the inactive flag every time an item is accessed. This is far too much administration for my liking.
3. Somehow incorporate the ideas of pattern 2 in a class that derives from List<T>. This Superlist will handle the inactive flag, the deletion of objects after the fact and also will not expose items marked as inactive to enumeration consumers. Basically, it just encapsulates all the ideas of pattern 2 (and subsequently pattern 1).
Does a class like this exist? Does anyone have code for this? Or is there a better way?
I’ve been told that accessing myIntCollection.ToArray() instead of myIntCollection will solve the problem and allow me to delete inside the loop.
This seems like a bad design pattern to me, or maybe it’s fine?
Details:
The list will contain many items and I will be removing only some of them.
Inside the loop, I will be doing all sorts of processes, adding, removing etc., so the solution needs to be fairly generic.
The item that I need to delete may not be the current item in the loop. For example, I may be on item 10 of a 30 item loop and need to remove item 6 or item 26. Walking backwards through the array will no longer work because of this. ;o(
The best solution is usually to use the RemoveAll() method:
myList.RemoveAll(x => x.SomeProp == "SomeValue");
Or, if you need certain elements removed:
MyListType[] elems = new[] { elem1, elem2 };
myList.RemoveAll(x => elems.Contains(x));
This assumes that your loop is solely intended for removal purposes, of course. If you do need additional processing, then the best method is usually to use a for or while loop, since then you're not using an enumerator:
for (int i = myList.Count - 1; i >= 0; i--)
{
// Do processing here, then...
if (shouldRemoveCondition)
{
myList.RemoveAt(i);
}
}
Going backwards ensures that you don't skip any elements.
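To see why the direction matters, compare a naive forward loop with the backward loop on the same input. This is an illustrative sketch with made-up names (RemovalOrderDemo and its two methods) that removes even numbers.

```csharp
using System.Collections.Generic;

public static class RemovalOrderDemo
{
    // Naive forward loop: after RemoveAt(i) the next element slides into
    // index i, and the i++ of the loop then steps over it.
    public static List<int> RemoveEvensForward(IEnumerable<int> source)
    {
        var list = new List<int>(source);
        for (int i = 0; i < list.Count; i++)
        {
            if (list[i] % 2 == 0) list.RemoveAt(i);
        }
        return list;
    }

    // Backward loop: removing at index i only shifts elements that
    // have already been visited.
    public static List<int> RemoveEvensBackward(IEnumerable<int> source)
    {
        var list = new List<int>(source);
        for (int i = list.Count - 1; i >= 0; i--)
        {
            if (list[i] % 2 == 0) list.RemoveAt(i);
        }
        return list;
    }
}
```

For the input 2, 4, 6, 7 the forward version leaves 4 and 7 (the 4 was skipped), while the backward version leaves only 7.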
Response to Edit:
If you're going to have seemingly arbitrary elements removed, the easiest method might be to just keep track of the elements you want to remove, and then remove them all at once after. Something like this:
List<int> toRemove = new List<int>();
foreach (var elem in myList)
{
// Do some stuff
// Check for removal
if (needToRemoveAnElement)
{
toRemove.Add(elem);
}
}
// Remove everything here
myList.RemoveAll(x => toRemove.Contains(x));
If you must both enumerate a List<T> and remove from it then I suggest simply using a while loop instead of a foreach
var index = 0;
while (index < myList.Count) {
if (someCondition(myList[index])) {
myList.RemoveAt(index);
} else {
index++;
}
}
I know this post is old, but I thought I'd share what worked for me.
Create a copy of the list for enumerating, and then in the for each loop, you can process on the copied values, and remove/add/whatever with the source list.
private void ProcessAndRemove(IList<Item> list)
{
foreach (var item in list.ToList())
{
if (item.DeterminingFactor > 10)
{
list.Remove(item);
}
}
}
When you need to iterate through a list and might modify it during the loop then you are better off using a for loop:
for (int i = 0; i < myIntCollection.Count; i++)
{
if (myIntCollection[i] == 42)
{
myIntCollection.RemoveAt(i); // RemoveAt removes by index; Remove(i) would remove the value i
i--;
}
}
Of course you must be careful: I decrement i whenever an item is removed, as otherwise we would skip entries (an alternative is to go backwards through the list).
If you only need to remove items, you should just use RemoveAll as dlev has suggested (it is a List<T> method, so it works without LINQ).
As you enumerate the list, add the ones you want to KEEP to a new list. Afterward, assign the new list to myIntCollection:
List<int> myIntCollection=new List<int>();
myIntCollection.Add(42);
List<int> newCollection=new List<int>(myIntCollection.Count);
foreach(int i in myIntCollection)
{
// Keep everything except the items to delete
// (i == 42 is only an example condition).
if (i != 42)
newCollection.Add(i);
}
myIntCollection = newCollection;
Let's take your code:
List<int> myIntCollection=new List<int>();
myIntCollection.Add(42);
myIntCollection.Add(12);
myIntCollection.Add(96);
myIntCollection.Add(25);
If you want to change the list while you're in a foreach, you must iterate over a copy by calling .ToList():
foreach(int i in myIntCollection.ToList())
{
if (i == 42)
myIntCollection.Remove(96);
if (i == 25)
myIntCollection.Remove(42);
}
For those it may help, I wrote this Extension method to remove items matching the predicate and return the list of removed items.
public static IList<T> RemoveAllKeepRemoved<T>(this IList<T> source, Predicate<T> predicate)
{
IList<T> removed = new List<T>();
for (int i = source.Count - 1; i >= 0; i--)
{
T item = source[i];
if (predicate(item))
{
removed.Add(item);
source.RemoveAt(i);
}
}
return removed;
}
How about
int[] tmp = new int[myIntCollection.Count ()];
myIntCollection.CopyTo(tmp);
foreach(int i in tmp)
{
myIntCollection.Remove(42); //The error is no longer here.
}
If you're interested in high performance, you can use two lists. The following minimises garbage collection, maximises memory locality and never actually removes an item from a list, which is very inefficient if it's not the last item.
private void RemoveItems()
{
_newList.Clear();
foreach (var item in _list)
{
item.Process();
if (!item.NeedsRemoving())
_newList.Add(item);
}
var swap = _list;
_list = _newList;
_newList = swap;
}
Just figured I'd share my solution to a similar problem where I needed to remove items from a list while processing them.
So basically a "foreach" that removes each item from the list after it has been processed.
My test:
var list = new List<TempLoopDto>();
list.Add(new TempLoopDto("Test1"));
list.Add(new TempLoopDto("Test2"));
list.Add(new TempLoopDto("Test3"));
list.Add(new TempLoopDto("Test4"));
list.PopForEach((item) =>
{
Console.WriteLine($"Process {item.Name}");
});
Assert.That(list.Count, Is.EqualTo(0));
I solved this with an extension method, "PopForEach", that performs an action and then removes the item from the list.
public static class ListExtensions
{
public static void PopForEach<T>(this List<T> list, Action<T> action)
{
var index = 0;
while (index < list.Count) {
action(list[index]);
list.RemoveAt(index);
}
}
}
Hope this can be helpful to anyone.
Currently you are using a list. If you could use a dictionary instead, it would be much easier. I'm making some assumptions that you are really using a class instead of just a list of ints. This would work if you had some form of unique key. In the dictionary, object can be any class you have and int would be any unique key.
Dictionary<int, object> myIntCollection = new Dictionary<int, object>();
myIntCollection.Add(42, "");
myIntCollection.Add(12, "");
myIntCollection.Add(96, "");
myIntCollection.Add(25, "");
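foreach (int i in myIntCollection.Keys.ToList()) // iterate a copy of the keys (requires System.Linq)
{
//Check to make sure the key wasn't already removed
if (myIntCollection.ContainsKey(i))
{
if (i == 42) //You can test against the key
myIntCollection.Remove(96);
if (Equals(myIntCollection[i], 25)) //or you can test against the value (== does not compile for object and int)
myIntCollection.Remove(42);
}
}
Or you could use
Dictionary<myUniqueClass, bool> myCollection; //Bool is just an empty place holder
The nice thing is that you can do anything you want to the underlying dictionary while looping over the copied keys. The copy doesn't update with added or removed entries, which is why the ContainsKey check is needed. Note that enumerating myIntCollection.Keys directly while removing entries throws an InvalidOperationException on .NET Framework.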

"Possible multiple enumeration of IEnumerable" vs "Parameter can be declared with base type"

In Resharper 5, the following code led to the warning "Parameter can be declared with base type" for list:
public void DoSomething(List<string> list)
{
if (list.Any())
{
// ...
}
foreach (var item in list)
{
// ...
}
}
In Resharper 6, this is not the case. However, if I change the method to the following, I still get that warning:
public void DoSomething(List<string> list)
{
foreach (var item in list)
{
// ...
}
}
The reason is that in this version list is only enumerated once, so changing it to IEnumerable<string> will not automatically introduce another warning.
Now, if I change the first version manually to use an IEnumerable<string> instead of a List<string>, I will get that warning ("Possible multiple enumeration of IEnumerable") on both occurrences of list in the body of the method:
public void DoSomething(IEnumerable<string> list)
{
if (list.Any()) // <- here
{
// ...
}
foreach (var item in list) // <- and here
{
// ...
}
}
I understand why, but I wonder how to solve this warning, assuming that the method really only needs an IEnumerable<T> and not a List<T>, because I just want to enumerate the items and I don't want to change the list.
Adding a list = list.ToList(); at the beginning of the method makes the warning go away:
public void DoSomething(IEnumerable<string> list)
{
list = list.ToList();
if (list.Any())
{
// ...
}
foreach (var item in list)
{
// ...
}
}
I understand why that makes the warning go away, but it looks a bit like a hack to me...
Any suggestions on how to solve that warning better and still use the most general type possible in the method signature?
The following problems should all be solved for a good solution:
No call to ToList() inside the method, because it has a performance impact
No usage of ICollection<T> or even more specialized interfaces/classes, because they change the semantics of the method as seen from the caller.
No multiple iterations over an IEnumerable<T> and thus risking accessing a database multiple times or similar.
Note: I am aware that this is not a Resharper issue, and thus, I don't want to suppress this warning, but fix the underlying cause as the warning is legit.
UPDATE:
Please don't care about Any and the foreach. I don't need help in merging those statements to have only one enumeration of the enumerable.
It could really be anything in this method that enumerates the enumerable multiple times!
You should probably take an IEnumerable<T> and ignore the "multiple iterations" warning.
This message is warning you that if you pass a lazy enumerable (such as an iterator or a costly LINQ query) to your method, parts of the iterator will execute twice.
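The double execution can be made visible by counting how often a lazy source is pulled. The sketch below uses an invented iterator, ExpensiveSource, as a stand-in for a costly query; the counter and all names are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    public static int SourceReads; // counts how many items the source produced

    // Stand-in for a lazy, costly source such as a database query.
    static IEnumerable<string> ExpensiveSource()
    {
        foreach (var s in new[] { "a", "b", "c" })
        {
            SourceReads++;
            yield return s;
        }
    }

    public static int CountReads()
    {
        SourceReads = 0;
        IEnumerable<string> list = ExpensiveSource();

        if (list.Any())            // first enumeration: pulls one item, then stops
        {
        }
        foreach (var item in list) // second enumeration: starts over, pulls all three
        {
        }
        return SourceReads;
    }
}
```

CountReads() returns 4 rather than 3: the first item is produced twice because each enumeration restarts the iterator.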
There is no perfect solution; choose one according to the situation.
enumerable.ToList; you may optimize it by first trying "enumerable as List", as long as you don't modify the list
Iterate two times over the IEnumerable but make it clear for the caller (document it)
Split in two methods
Take List to avoid cost of "as"/ToList and potential cost of double enumeration
The first solution (ToList) is probably the most "correct" for a public method that could be working on any Enumerable.
You can ignore Resharper issues, the warning is legit in a general case but may be wrong in your specific situation. Especially if the method is intended for internal usage and you have full control on callers.
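The first option (ToList with an "as" shortcut) could look roughly like the sketch below. AsMaterialized is a hypothetical helper name; checking for IReadOnlyCollection<T> instead of List<T> is a choice made here so that arrays and other in-memory collections also skip the copy, and it assumes .NET 4.5 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableMaterializeExtensions
{
    // Hypothetical helper: returns the source itself when it is already an
    // in-memory collection, otherwise enumerates it exactly once into a list.
    public static IReadOnlyCollection<T> AsMaterialized<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return source as IReadOnlyCollection<T> ?? source.ToList();
    }
}
```

Because the result is read-only, the caller's list cannot be modified by accident, which is the condition the answer attaches to the "as" shortcut.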
This class will give you a way to split the first item off of the enumeration and then have an IEnumerable for the rest of the enumeration without giving you a double enumeration, thus avoiding the potentially nasty performance hit. Its usage is like this (where T is whatever type you are enumerating):
var split = new SplitFirstEnumerable<T>(currentIEnumerable);
T firstItem = split.First;
IEnumerable<T> remaining = split.Remaining;
Here is the class itself:
/// <summary>
/// Use this class when you want to pull the first item off of an IEnumerable
/// and then enumerate over the remaining elements and you want to avoid the
/// warning about "possible double iteration of IEnumerable" AND without constructing
/// a list or other duplicate data structure of the enumerable. You construct
/// this class from your existing IEnumerable and then use its First and
/// Remaining properties for your algorithm.
/// </summary>
/// <typeparam name="T">The type of item you are iterating over; there are no
/// "where" restrictions on this type.</typeparam>
public class SplitFirstEnumerable<T>
{
private readonly IEnumerator<T> _enumerator;
/// <summary>
/// Constructor
/// </summary>
/// <remarks>Will throw an exception if there are zero items in enumerable or
/// if the enumerable is already advanced past the last element.</remarks>
/// <param name="enumerable">The enumerable that you want to split</param>
public SplitFirstEnumerable(IEnumerable<T> enumerable)
{
_enumerator = enumerable.GetEnumerator();
if (_enumerator.MoveNext())
{
First = _enumerator.Current;
}
else
{
throw new ArgumentException("Parameter 'enumerable' must have at least 1 element to be split.");
}
}
/// <summary>
/// The first item of the original enumeration, equivalent to calling
/// enumerable.First().
/// </summary>
public T First { get; private set; }
/// <summary>
/// The items of the original enumeration minus the first, equivalent to calling
/// enumerable.Skip(1).
/// </summary>
public IEnumerable<T> Remaining
{
get
{
while (_enumerator.MoveNext())
{
yield return _enumerator.Current;
}
}
}
}
This does presuppose that the IEnumerable has at least one element to start. If you want to do more of a FirstOrDefault type setup, you'll need to catch the exception that would otherwise be thrown in the constructor.
There exists a general solution to address both Resharper warnings: the lack of guarantee for repeat-ability of IEnumerable, and the List base class (or potentially expensive ToList() workaround).
Create a specialized class, e.g. "RepeatableEnumerable", implementing IEnumerable, with "GetEnumerator()" implemented with the following logic outline:
Yield all items already collected so far from the inner list.
If the wrapped enumerator has more items,
While the wrapped enumerator can move to the next item,
Get the current item from the inner enumerator.
Add the current item to the inner list.
Yield the current item
Mark the inner enumerator as having no more items.
Add extension methods and appropriate optimizations where the wrapped parameter is already repeatable. Resharper will no longer flag the indicated warnings on the following code:
public void DoSomething(IEnumerable<string> list)
{
var repeatable = list.ToRepeatableEnumerable();
if (repeatable.Any()) // <- no warning here anymore.
// Further, this will read at most one item from list. A
// query (SQL LINQ) with a 10,000 items, returning one item per second
// will pass this block in 1 second, unlike the ToList() solution / hack.
{
// ...
}
foreach (var item in repeatable) // <- and no warning here anymore, either.
// Further, this will read in lazy fashion. In the 10,000 item, one
// per second, query scenario, this loop will process the first item immediately
// (because it was read already for Any() above), and then proceed to
// process one item every second.
{
// ...
}
}
With a little work, you can also turn RepeatableEnumerable into LazyList, a full implementation of IList. That's beyond the scope of this particular problem though. :)
UPDATE: Code implementation requested in comments -- not sure why the original PDL wasn't enough, but in any case, the following faithfully implements the algorithm I suggested (My own implementation implements the full IList interface; that is a bit beyond the scope I want to release here... :) )
public class RepeatableEnumerable<T> : IEnumerable<T>
{
readonly List<T> innerList;
IEnumerator<T> innerEnumerator;
public RepeatableEnumerable( IEnumerator<T> innerEnumerator )
{
this.innerList = new List<T>();
this.innerEnumerator = innerEnumerator;
}
public IEnumerator<T> GetEnumerator()
{
// 1. Yield all items already collected so far from the inner list.
foreach( var item in innerList ) yield return item;
// 2. If the wrapped enumerator has more items
if( innerEnumerator != null )
{
// 2A. while the wrapped enumerator can move to the next item
while( innerEnumerator.MoveNext() )
{
// 1. Get the current item from the inner enumerator.
var item = innerEnumerator.Current;
// 2. Add the current item to the inner list.
innerList.Add( item );
// 3. Yield the current item
yield return item;
}
// 3. Mark the inner enumerator as having no more items.
innerEnumerator.Dispose();
innerEnumerator = null;
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
// Add extension methods and appropriate optimizations where the wrapped parameter is already repeatable.
public static class RepeatableEnumerableExtensions
{
public static RepeatableEnumerable<T> ToRepeatableEnumerable<T>( this IEnumerable<T> items )
{
var result = ( items as RepeatableEnumerable<T> )
?? new RepeatableEnumerable<T>( items.GetEnumerator() );
return result;
}
}
I realize this question is old and already marked as answered, but I was surprised that nobody suggested manually iterating over the enumerator:
// NOTE: list is of type IEnumerable<T>.
// The name was taken from the OP's code.
var enumerator = list.GetEnumerator();
if (enumerator.MoveNext())
{
// Run your list.Any() logic here
...
do
{
var item = enumerator.Current;
// Run your foreach (var item in list) logic here
...
} while (enumerator.MoveNext());
}
Seems a lot more straightforward than the other answers here.
Generally speaking, what you need is some state object into which you can PUSH the items (within a foreach loop), and out of which you then get your final result.
The downside of the enumerable LINQ operators is that they actively enumerate the source instead of accepting items being pushed to them, so they don't meet your requirements.
If, for example, you just need the minimum and maximum values of a sequence of 1'000'000 integers that cost $1'000 worth of processor time to retrieve, you end up writing something like this:
public class MinMaxAggregator
{
private bool _any;
private int _min;
private int _max;
public void OnNext(int value)
{
if (!_any)
{
_min = _max = value;
_any = true;
}
else
{
if (value < _min) _min = value;
if (value > _max) _max = value;
}
}
public MinMax GetResult()
{
if (!_any) throw new InvalidOperationException("Sequence contains no elements.");
return new MinMax(_min, _max);
}
}
public static MinMax DoSomething(IEnumerable<int> source)
{
var aggr = new MinMaxAggregator();
foreach (var item in source) aggr.OnNext(item);
return aggr.GetResult();
}
In fact, you just re-implemented the logic of the Min() and Max() operators. Of course that's easy, but they are only examples of the arbitrarily complex logic you might otherwise easily express in a LINQish way.
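For simple cases like this one, a fold can also carry several running results through a single pass. The following is only a minimal sketch using Enumerable.Aggregate. The MinMax struct here is a hypothetical stand-in, since the original snippets use a MinMax type without showing its definition, and SinglePassMinMax.Compute is a name made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type; the MinMax used in the answer is not shown there.
public readonly struct MinMax
{
    public MinMax(int min, int max) { Min = min; Max = max; }
    public int Min { get; }
    public int Max { get; }
}

public static class SinglePassMinMax
{
    // Enumerates the source exactly once, carrying the running min and max.
    public static MinMax Compute(IEnumerable<int> source)
    {
        // The accumulator starts as null so an empty sequence can be detected.
        MinMax? result = source.Aggregate<int, MinMax?>(
            null,
            (acc, value) => acc.HasValue
                ? new MinMax(Math.Min(acc.Value.Min, value), Math.Max(acc.Value.Max, value))
                : new MinMax(value, value));

        if (!result.HasValue)
            throw new InvalidOperationException("Sequence contains no elements.");
        return result.Value;
    }
}
```

This stays readable for two or three aggregates, but it does not compose the way chained operators do, which is the gap the push-based approach addresses.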
The solution came to me on last night's walk: we need to PUSH... that's REACTIVE! All the beloved operators also exist in a reactive version built for the push paradigm. They can be chained together at will to whatever complexity you need, just like their enumerable counterparts.
So the min/max example boils down to:
public static MinMax DoSomething(IEnumerable<int> source)
{
// bridge over to the observable world
var connectable = source.ToObservable(Scheduler.Immediate).Publish();
// express the desired result there (note: connectable is observed by multiple observers)
var combined = connectable.Min().CombineLatest(connectable.Max(), (min, max) => new MinMax(min, max));
// subscribe
var resultAsync = combined.GetAwaiter();
// unload the enumerable into connectable
connectable.Connect();
// pick up the result
return resultAsync.GetResult();
}
Why not:
bool any = false; // must be initialized, otherwise if (any) does not compile
foreach (var item in list)
{
    any = true;
    // ...
}
if (any)
{
    //...
}
Update: Personally, I wouldn't drastically change the code just to get around a warning like this. I would just disable the warning and continue on. The warning is suggesting you change the general flow of the code to make it better; if you're not making the code better (and are arguably making it worse) by addressing the warning, then the point of the warning is missed.
For example:
// ReSharper disable PossibleMultipleEnumeration
public void DoSomething(IEnumerable<string> list)
{
if (list.Any()) // <- here
{
// ...
}
foreach (var item in list) // <- and here
{
// ...
}
}
// ReSharper restore PossibleMultipleEnumeration
UIMS* - Fundamentally, there is no great solve. IEnumerable<T> used to be the "very basic thing that represents a bunch of things of the same type, so using it in method sigs is Correct." It has now also become a "thing that might evaluate behind the scenes, and might take a while, so now you always have to worry about that."
It's as if IDictionary suddenly were extended to support lazy loading of values, via a LazyLoader property of type Func<TKey,TValue>. Actually that'd be neat to have, but not so neat to be added to IDictionary, because now every time we receive an IDictionary we have to worry about that. But that's where we are.
So it would seem that "if a method takes an IEnumerable and evals it twice, always force eval via ToList()" is the best you can do. And nice work by JetBrains to give us this warning.
*(Unless I'm Missing Something . . . just made it up but it seems useful)
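One way to apply the "force eval via ToList()" rule without copying data that is already in memory is sketched below. The Materialize.Once helper is a made-up name for illustration, not an existing framework API. It assumes that anything implementing IReadOnlyCollection<T> (such as List<T> or an array) is safe to enumerate repeatedly.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Materialize
{
    // Returns the source itself when it is already an in-memory collection;
    // otherwise runs the lazy sequence once and returns the resulting list.
    public static IReadOnlyCollection<T> Once<T>(IEnumerable<T> source)
    {
        return source as IReadOnlyCollection<T> ?? source.ToList();
    }
}
```

A method that needs both Any() and foreach could call Materialize.Once(list) at the top and work only with the returned collection from then on.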
Be careful when accepting enumerables in your method. The "warning" for the base type is only a hint; the enumeration warning is a true warning.
However, your list will be enumerated at least twice, because you call Any() and then run a foreach. If you add a ToList(), the sequence will be enumerated three times, so remove the ToList().
I would suggest setting ReSharper's warning severity for the base type to a hint. That way you still get a hint (green underline) and the option to quick-fix it (Alt+Enter), but no "warnings" in your file.
You should take care if enumerating the IEnumerable is an expensive action like loading something from file or database, or if you have a method which calculates values and uses yield return. In this case do a ToList() or ToArray() first to load/calculate all data only ONCE.
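The effect is easy to observe with a counter. The sketch below is illustrative only: LoadItems is a hypothetical stand-in for an expensive source, and Starts simply counts how often enumeration of it begins.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    // Number of times the lazy source has been started (illustration only).
    public static int Starts;

    // Hypothetical stand-in for an expensive source such as a file or database read.
    public static IEnumerable<string> LoadItems()
    {
        Starts++; // runs on the first MoveNext() of every new enumeration
        yield return "a";
        yield return "b";
    }

    // Any() and foreach each start their own enumeration of the lazy source.
    public static int Naive(IEnumerable<string> list)
    {
        int processed = 0;
        if (list.Any())
        {
            // ...
        }
        foreach (var item in list)
        {
            processed++;
        }
        return processed;
    }

    // ToList() runs the source once; Any() and foreach then use the in-memory copy.
    public static int Materialized(IEnumerable<string> list)
    {
        List<string> items = list.ToList();
        int processed = 0;
        if (items.Any())
        {
            // ...
        }
        foreach (var item in items)
        {
            processed++;
        }
        return processed;
    }
}
```

Starting from Starts == 0, Naive(LoadItems()) leaves Starts at 2, while Materialized(LoadItems()) leaves it at 1.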
You could use ICollection<T> (or IList<T>). It's less specific than List<T>, but doesn't suffer from the multiple-enumeration problem.
Still, I'd tend to use IEnumerable<T> in this case. You can also consider refactoring the code to enumerate only once.
Use an IList as your parameter type rather than IEnumerable. IEnumerable has different semantics from List, whereas IList has the same semantics.
IEnumerable could be based on a non-seekable stream, which is why you get the warnings.
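A minimal sketch of what a collection-typed parameter looks like in practice. The class name and the returned count are assumptions added for illustration. With ICollection<T> the emptiness check becomes a property read instead of an enumeration.

```csharp
using System.Collections.Generic;

public static class CollectionParameterDemo
{
    // ICollection<T> tells callers that the data is already in memory,
    // so checking Count and then iterating does not re-run a lazy source.
    public static int DoSomething(ICollection<string> list)
    {
        int processed = 0;
        if (list.Count > 0)
        {
            // ...
        }
        foreach (var item in list)
        {
            processed++;
        }
        return processed;
    }
}
```

The trade-off is that a caller holding a lazy sequence now has to call ToList() or ToArray() before passing it in, which makes the cost visible at the call site.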
You can iterate only once:
public void DoSomething(IEnumerable<string> list)
{
bool isFirstItem = true;
foreach (var item in list)
{
if (isFirstItem)
{
isFirstItem = false;
// ...
}
// ...
}
}
There is something no one has mentioned before (@Zebi). Any() already iterates to try to find an element. If you call ToList(), it iterates as well, to create the list. The original idea of IEnumerable is only to iterate; anything else triggers an iteration in order to do its work. You should try to do everything inside a single loop, including your .Any() check.
If you pass a list of Action delegates to your method, you get cleaner code that iterates only once:
public void DoSomething(IEnumerable<string> list, params Action<string>[] actions)
{
    foreach (var item in list)
    {
        // actions is an array, so use Length rather than Count
        for (int i = 0; i < actions.Length; i++)
        {
            actions[i](item);
        }
    }
}
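As a rough usage sketch, the Any() check and the per-item work can both be passed in as actions. The ActionPipelineDemo class and its Run method are hypothetical names. DoSomething is repeated here as a static method only so that the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

public static class ActionPipelineDemo
{
    // Same idea as the method above: one pass, every action applied to each item.
    public static void DoSomething(IEnumerable<string> list, params Action<string>[] actions)
    {
        foreach (var item in list)
        {
            foreach (var action in actions)
            {
                action(item);
            }
        }
    }

    // Records whether any item was seen and how many there were, in a single pass.
    public static (bool Any, int Count) Run(IEnumerable<string> list)
    {
        bool any = false;
        int count = 0;
        DoSomething(list, item => any = true, item => count++);
        return (any, count);
    }
}
```

Note that the "any" logic now runs once per item rather than once up front, so work that must happen exactly once still needs a guard like the isFirstItem flag shown earlier.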
