Cast generic Collection to List once, then RemoveAt multiple times - c#

I have a generic collection of objects as a property of an object. The data comes from a SQL query and an API. I want to use the RemoveAt method to remove some of the items efficiently, but Visual Studio complains that RemoveAt is undefined. My intuition is to cast the collection to a List, which would give me access to RemoveAt. I only want to cast once, then call RemoveAt as many times as necessary. I could use Remove(object), but each call has to traverse the collection to find the object, which is slower than RemoveAt.
Here is what I'm trying:
obj.stuckArray = obj.stuckArray.ToList();
After this line, I have a line of code that looks like this:
obj.stuckArray.RemoveAt(1);
Unfortunately the RemoveAt gets underlined with red and the warning from Visual Studio reads: "ICollection does not contain a definition for 'RemoveAt'"
Is it possible to cast once and then call RemoveAt multiple times, or is this not possible?

Just do it in three statements instead of two, using a local variable:
var list = obj.stuckArray.ToList();
list.RemoveAt(1);
obj.stuckArray = list;
That way the type of stuckArray doesn't matter as much: you only need to be able to call ToList() on it. The RemoveAt method on List<T> is fine because that's the type of list.

Obviously you want to remove an element from the collection and store the result back into the original member stuckArray. However, ICollection<T> does not define a RemoveAt method, hence the error. The method does exist on List<T>.
So do the following instead:
var tmp = obj.stuckArray.ToList();
tmp.RemoveAt(1);
obj.stuckArray = tmp;
However, this will traverse the entire collection anyway, as ToList copies every element into a new list. But I don't see any way around this if you want to delete an element by index, because neither an array nor ICollection<T> has a RemoveAt method.
As per your edit: why not do the removal after reassigning stuckArray?
var tmp = obj.stuckArray.ToList();
obj.stuckArray = tmp;
Now you can call RemoveAt as often as you want:
((List<MyType>)obj.stuckArray).RemoveAt(1);
((List<MyType>)obj.stuckArray).RemoveAt(1);
((List<MyType>)obj.stuckArray).RemoveAt(1);
Casting this many times shouldn't have a big impact on performance, as obj.stuckArray already is a List<MyType>. RemoveAt, on the other hand, does have a cost: it shifts every element after the removed index within the internal array, as you can see in the source code for RemoveAt:
public void RemoveAt(int index) {
    if ((uint)index >= (uint)_size) {
        ThrowHelper.ThrowArgumentOutOfRangeException();
    }
    Contract.EndContractBlock();
    _size--;
    if (index < _size) {
        Array.Copy(_items, index + 1, _items, index, _size - index); // shifts all elements after index down by one slot
    }
    _items[_size] = default(T);
    _version++;
}
So by calling RemoveAt three times, you also shift the tail of the internal array three times.
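If several indices need to go at once, one way to avoid the repeated shifting is to collect them in a set and rebuild the list in a single pass. This is a minimal sketch, assuming the indices are known up front; ListIndexRemoval and RemoveAtMany are hypothetical names, not framework APIs:

```csharp
using System.Collections.Generic;

public static class ListIndexRemoval
{
    // Hypothetical helper: returns a new list without the elements at the
    // given positions. The source is walked once, so the total cost is O(n)
    // instead of one tail shift per RemoveAt call.
    public static List<T> RemoveAtMany<T>(IEnumerable<T> source, IEnumerable<int> indices)
    {
        var toRemove = new HashSet<int>(indices);
        var result = new List<T>();
        int i = 0;
        foreach (T item in source)
        {
            if (!toRemove.Contains(i))
            {
                result.Add(item);
            }
            i++;
        }
        return result;
    }
}
```

With the property from the question, this could be used as obj.stuckArray = ListIndexRemoval.RemoveAtMany(obj.stuckArray, new[] { 1, 4, 7 }); since List<T> implements ICollection<T>. The indices refer to positions in the original collection, not to positions after earlier removals.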

If you have control over the object that exposes the stuckArray property, you can expose it as ICollection<T> but use a List<T> as its backing field inside the object. Then you can add a method that removes an item by its index:
using System.Collections.Generic;

class MyClass
{
    private readonly List<int> _stuckArray = new List<int>(); // of course, this doesn't have to be int...
    public ICollection<int> StuckArray { get { return _stuckArray; } }
    public void RemoveFromStuckArray(int index)
    {
        _stuckArray.RemoveAt(index);
    }
}
That lets you keep whatever references you already have to the property and also supplies a method to remove items by index efficiently, though I'm not sure it's a good idea to allow removing items by index from an ICollection in the first place.

Related

Does .Where(x=> x.listFoos.Count() > 1) count all sub element?

Given a list of objects with a property of type List:
class Bar
{
    public List<Foo> Foos { get; set; }
}
Will the following code, which selects every Bar with more than one Foo, count all the Foos?
Or will it stop iterating at 2 Foos?
var input = new List<Bar>();
var result = input.Where(x=> x.Foos.Count()>1).ToList();
It won't count anything. List<T> stores the number of elements in a field, so accessing the Count property is an O(1) operation.
This works even if you use the Enumerable.Count() extension method rather than List<T>'s built-in Count property, because Enumerable.Count() has a built-in optimization for data sources that implement ICollection<T>.
As mentioned by Enigmativity in the comments: if you have an IEnumerable<T> which is not an ICollection<T>, you can use the following instead to avoid iterating the entire enumerable:
var result = input.Where(x => x.Foos.Skip(1).Any()).ToList();
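To see the early exit in action, here is a small sketch that counts how many elements a lazy sequence actually produces. EarlyExitDemo, Numbers, Produced and HasMoreThanOne are hypothetical names made up for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EarlyExitDemo
{
    // Number of items the lazy sequence has produced so far.
    public static int Produced;

    // A lazy sequence that is not an ICollection<T>, so Count() has to walk it.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // True when the sequence has at least two elements; enumeration stops
    // as soon as the second element is found.
    public static bool HasMoreThanOne(IEnumerable<int> source)
    {
        return source.Skip(1).Any();
    }
}
```

In this sketch, HasMoreThanOne(Numbers(1000)) pulls only two elements from the sequence, whereas Numbers(1000).Count() > 1 would have to produce all 1000.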
When you have questions about how parts of .NET work, it's ideal to look at the source code. This is the source for List<T>.Count:
// Read-only property describing how many elements are in the List.
public int Count {
    get {
        Contract.Ensures(Contract.Result<int>() >= 0);
        return _size;
    }
}
_size is updated whenever the underlying collection changes, so Count doesn't actually count anything; it just returns the known size of the list.

Create a copy of IEnumerable<T> to modify collection from different threads?

I am using a third-party library which has its own custom data model. The hierarchy of the data model is as follows:
Model
---Tables(type of Table)
-----Rows(type of Row)
-------Cells( type of Cell)
Table has a Rows property, much like DataTable, and I have to access this property from more than one task. I need to find the row in the table whose column value equals a specified value.
To do this, I have created a method which uses a lock statement so that only one thread can execute it at a time.
public static Row GetRowWithColumnValue(Model model, string tableKey, string indexColumnKey, string indexColumnValue)
{
    Row simObj = null;
    lock (syncRoot)
    {
        SimWrapperFromValueFactory wrapperSimSystem = new SimWrapperFromValueFactory(model, tableKey, indexColumnKey);
        simObj = wrapperSimSystem.GetWrapper(indexColumnValue);
    }
    return simObj;
}
To build the lookup for one of the columns in the Table, I have created a method which always tries to create a copy of the rows, to avoid a collection-modified exception:
Private Function GetTableRows(table As Table) As List(Of Row)
    Dim rowsList As New List(Of Row)(table.Rows) 'Case 1
    'rowsList.AddRange(table.Rows) 'Case 2
    ' Case 3
    'For i As Integer = 0 To table.Rows.Count - 1
    '    rowsList.Add(table.Rows.ElementAt(i))
    'Next
    Return rowsList
End Function
But other threads can modify the table (e.g. add or remove rows, or update a column value in any row), and I get the following "collection modified" exception:
at System.ThrowHelper.ThrowInvalidOperationException(ExceptionResource resource)
at System.Collections.Generic.List`1.Enumerator.MoveNextRare()
at System.Collections.Generic.List`1.InsertRange(Int32 index, IEnumerable`1 collection)
I cannot change this third-party library to use concurrent collections, and the same data model is shared between multiple projects.
Question: I am looking for a solution that allows multiple readers on this collection even while it is being modified on other threads. Is it possible to get a copy of the collection without getting an exception?
I looked at the SO threads below but did not find an exact solution:
Lock vs. ToArray for thread safe foreach access of List collection
Can ToArray() throw an exception?
Is returning an IEnumerable<> thread-safe?
The simplest solution is to retry on exception, like this:
private List<Row> CopyVolatileList(IEnumerable<Row> original)
{
    while (true)
    {
        try
        {
            List<Row> copy = new List<Row>();
            foreach (Row row in original) {
                copy.Add(row);
            }
            // Validate.
            if (copy.Count != 0 && copy[copy.Count - 1] == null) // Assuming Row is a reference type.
            {
                // At least one element was removed from the list while we were copying.
                continue;
            }
            return copy;
        }
        catch (InvalidOperationException)
        {
            // Check ex.Message?
        }
        // Keep trying.
    }
}
Eventually you'll get a run where the exception isn't thrown and the data integrity validation passes.
Alternatively, you can dive deep (and I mean very, very deep).
DISCLAIMER: Never ever use this in production. Unless you're desperate and really have no other option.
So we've established that you're working with a custom collection (TableRowCollection) which ultimately uses List<Row>.Enumerator to iterate through the rows. This strongly suggests that your collection is backed by a List<Row>.
First things first, you need to get a reference to that list. Your collection will not expose it publicly, so you'll need to fiddle a bit. You will need to use Reflection to find and get the value of the backing list. I recommend looking at your TableRowCollection in the debugger. It will show you non-public members and you will know what to reflect.
If you can't find your List<Row>, then take a closer look at TableRowCollection.GetEnumerator() - specifically GetEnumerator().GetType(). If that returns List<Row>.Enumerator, then bingo: we can get the backing list out of it, like so:
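List<Row> list;
using (IEnumerator<Row> enumerator = table.GetEnumerator())
{
    // "list" is the field name in the .NET Framework reference source;
    // other runtimes may name it differently.
    list = (List<Row>)typeof(List<Row>.Enumerator)
        .GetField("list", BindingFlags.Instance | BindingFlags.NonPublic)
        .GetValue(enumerator);
}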
If the above methods of getting your List<Row> have failed, there is no need to read further. You might as well give up.
In case you've succeeded, now that you have the backing List<Row>, we'll have to look at Reference Source for List<T>.
What we see is 3 fields being used:
private T[] _items;
private int _size; // Accessible via "Count".
private int _version;
Our goal is to copy the items whose indexes are between zero and _size - 1 from the _items array into a new array, and to do so in between _version changes.
Observations re thread safety: List<T> does not use locks, none of the fields are marked as volatile and _version is incremented via ++, not Interlocked.Increment. Long story short this means that it is impossible to read all 3 field values and confidently say that we're looking at stable data. We'll have to read the field values repeatedly in order to be somewhat confident that we're looking at a reasonable snapshot (we will never be 100% confident, but you might choose to settle for "good enough").
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;
using System.Threading;

private Row[] CopyVolatileList(List<Row> original)
{
    while (true)
    {
        // Get _items and _size values which are safe to use in tandem.
        int version = GetVersion(original); // _version.
        Row[] items = GetItems(original); // _items.
        int count = original.Count; // _size.
        if (items.Length < count)
        {
            // Definitely a torn read. Copy will fail.
            continue;
        }
        // Copy.
        Row[] copy = new Row[count];
        Array.Copy(items, 0, copy, 0, count);
        // Stabilization window.
        Thread.Sleep(1);
        // Validate.
        if (version == GetVersion(original)) {
            return copy;
        }
        // Keep trying.
    }
}

static Func<List<Row>, int> GetVersion = CompilePrivateFieldAccessor<List<Row>, int>("_version");
static Func<List<Row>, Row[]> GetItems = CompilePrivateFieldAccessor<List<Row>, Row[]>("_items");

static Func<TObject, TField> CompilePrivateFieldAccessor<TObject, TField>(string fieldName)
{
    ParameterExpression param = Expression.Parameter(typeof(TObject), "o");
    MemberExpression fieldAccess = Expression.PropertyOrField(param, fieldName);
    return Expression
        .Lambda<Func<TObject, TField>>(fieldAccess, param)
        .Compile();
}
Note re stabilization window: the bigger it is, the more confidence you have that you're not dealing with a torn read (where the list is in the middle of modifying all 3 fields). I've settled on the smallest value that never failed in my tests, where I called CopyVolatileList in a tight loop on one thread and used another thread to add items to the list, remove them or clear the list at random intervals between 0 and 20ms.
If you remove the stabilization window, you will occasionally get a copy with uninitialized elements at the end of the array because the other thread has removed a row while you were copying - that's why it's needed.
You should obviously validate the copy once it's built, to the best of your ability (at least check for uninitialized elements at the end of the array in case the stabilization window fails).
Good luck.

Invoking .Count on a list contained inside an element of an IEnumerable

In my code I have an IEnumerable:
IEnumerable<SomeType> listOfSomething = MethodWhichReturnsAnIEnumerable(param1);
Each element in listOfSomething also contains a list of something else; let's call it listOfRules. I need to return the elements in listOfSomething which have more than 0 elements in their listOfRules:
var result = listOfSomething.Where(x => x.listOfRules.Count > 0);
What does that mean for performance? listOfRules is a List, so I'm curious what calling Count will do to the IEnumerable listOfSomething, in terms of whether it will pull everything into memory.
Since listOfRules is a List, querying the Count property is very fast: for a List it just returns the value of a private field and does not iterate the whole collection each time. Here is the implementation:
// Read-only property describing how many elements are in the List.
public int Count {
    get {
        Contract.Ensures(Contract.Result<int>() >= 0);
        return _size;
    }
}
If listOfRules is a List<T>, Count just returns the stored value; it won't enumerate the collection. It has nothing to do with listOfSomething: listOfSomething will be enumerated, and the Count property will be read on each list. So there is nothing to worry about.
list.Count just returns the value of a field, so it is very fast: O(1).
So your overall performance will be O(N), where N is the number of records in listOfSomething.

How to remove items in IEnumerable<MyClass>?

How do I remove items from an IEnumerable that match specific criteria?
RemoveAll() does not apply.
You can't; IEnumerable as an interface does not support removal.
If your IEnumerable instance is actually of a type that supports removal (such as List<T>) then you can cast to that type and use the Remove method.
Alternatively you can copy items to a different IEnumerable based on your criteria, or you can use a lazy-evaluated query (such as with Linq's .Where) to filter your IEnumerable on the fly. Neither of these will affect your original container, though.
This will produce a new collection rather than modifying the existing one; however, I think it is the idiomatic way to do it with LINQ.
var filtered = myCollection.Where(x => x.SomeProp != SomeValue);
Another option would be to use Where to produce a new IEnumerable<T> with references to the objects you want removed then pass that to a Remove call on the original collection. Of course that would actually consume more resources.
You can't remove items from an IEnumerable<T>. You can remove items from an ICollection<T> or filter items from an IEnumerable<T>.
// filtering example; does not modify oldEnumerable itself
var filteredEnumerable = oldEnumerable.Where(...);
// removing example
var coll = (ICollection<MyClass>)oldEnumerable;
coll.Remove(item);
You don't remove items from an IEnumerable. It's not possible. It's just a sequence of items. You can remove items from some underlying source that generates the sequences, for example if the IEnumerable is based on a list you can remove items from that list.
The other option you have is to create a new sequence, based on this one, that never shows the given items. You can do that using Where, but it's important to realize this isn't removing items, but rather choosing to show items based on a certain condition.
As everyone has already stated, you can't remove from IEnumerable because that is not what the interface is describing. Consider the following example:
public IEnumerable<string> GetSomeStrings()
{
    yield return "FirstString";
    yield return "Another string";
}
Clearly, removing an element from this IEnumerable is not something you can reasonably do. Instead, you'd have to make a new enumeration without the items you don't want.
The yield keyword provides other examples; for instance, you can have infinite lists:
public IEnumerable<int> GetPowersOf2()
{
    int value = 1;
    while (true)
    {
        yield return value;
        value = value * 2;
    }
}
Items cannot be removed from an IEnumerable<T>. From the documentation:
Exposes the enumerator, which supports a simple iteration over a collection of a specified type.
You can cast it and use List<T>.RemoveAll(Predicate<T> match); this is exactly what you need.
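A direct cast throws if the sequence is not actually a List<T>, so a type check is the safer form of this idea. A minimal sketch, assuming in-place removal is acceptable whenever the underlying object is a list; EnumerableRemoval and TryRemoveAll are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableRemoval
{
    // Hypothetical helper: removes matching items in place when the sequence
    // is really a List<T>; returns false and changes nothing otherwise.
    public static bool TryRemoveAll<T>(IEnumerable<T> source, Predicate<T> match)
    {
        if (source is List<T> list)
        {
            list.RemoveAll(match);
            return true;
        }
        return false;
    }
}
```

When this returns false (for an array or an iterator method, say), filtering with Where into a new sequence remains the fallback.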
This is how I do it:
IEnumerable<T> myVar = getSomeData(); // assume myVar holds some data
myVar = myVar.Where(d => d.Id > 10); // that's all; I want data with Id > 10 only
How about trying Enumerable.Empty, i.e.
T obj = new T();
IEnumerable<T> myVar = new T[] { obj }; // now myVar has an element
myVar = Enumerable.Empty<T>(); // now myVar is empty

Difference on .Count() in IQueryable and IEnumerable

I have a list of objects that I'm passing to a function, and inside it I want to count them.
When I use IQueryable it doesn't count them correctly, but when I use IEnumerable it counts them fine.
public int functCount(IQueryable<MyObject> myListOfObjects)
{
    int numberOfItems = myListOfObjects.Count(o => o.isSet); // isSet is a bool property in my object
    return numberOfItems;
}
On the first call it returns the correct number of items, but when I change the isSet property on some elements in the list, it returns the same number as the first call.
public int functCount(IEnumerable<MyObject> myListOfObjects)
{
    int numberOfItems = myListOfObjects.Count(o => o.isSet); // isSet is a bool property in my object
    return numberOfItems;
}
When I change my code to this IEnumerable version, it returns the correct count every time.
Can anyone explain the difference, and why this is happening?
Since whatever you pass to either of those functions does not cause a compilation error, it is safe to assume it is an IQueryable. That means .Count() is re-evaluated as a SQL statement and run against the database each time, ignoring whatever changes you made in memory.
Calling .Count() on an IEnumerable, however, calls a totally different method, which simply counts the items of the in-memory collection.
In other words IEnumerable<T>.Count() and IQueryable<T>.Count() are two completely different methods, that do different things.
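The static type of the argument decides which of the two methods the compiler binds to. A minimal sketch of that binding, assuming a MyObject class like the one in the question (the property is spelled IsSet here, and CountOverloads is a hypothetical name):

```csharp
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public bool IsSet { get; set; }
}

public static class CountOverloads
{
    // Binds to Queryable.Count: the predicate is passed as an expression tree
    // for the query provider to translate (to SQL, for a database provider).
    public static int CountSet(IQueryable<MyObject> source)
    {
        return source.Count(o => o.IsSet);
    }

    // Binds to Enumerable.Count: the predicate is a compiled delegate that
    // runs against the objects currently in memory.
    public static int CountSet(IEnumerable<MyObject> source)
    {
        return source.Count(o => o.IsSet);
    }
}
```

Note that an in-memory list wrapped with AsQueryable() still sees in-memory changes through either overload; the stale count described in the question only appears when the IQueryable is backed by a database, because the query is sent to the database again on every call.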
