Speeding up iterating through two foreach loops - c#

I'm trying to speed up iterating through two foreach loops; at the moment it takes about 15 seconds:
foreach (var prodCost in Settings.ProdCostsAndQtys)
{
foreach (var simplified in Settings.SimplifiedPricing
.Where(simplified => prodCost.improd.Equals(simplified.PPPROD) &&
prodCost.pplist.Equals(simplified.PPLIST)))
{
prodCost.pricecur = simplified.PPP01;
prodCost.priceeur = simplified.PPP01;
}
}
Basically the ProdCostsAndQtys list is a list of objects which has 5 properties, the size of the list is 798677
The SimplifiedPricing list is a list of objects with 44 properties, the size of this list is 347 but is more than likely going to get a lot bigger (hence wanting to get the best performance now).
The outer loop iterates through every object in the first list. In the inner loop, if both conditions match, two properties of the object from the first list are overwritten with a value from the matching object in the second list.

It seems that your SimplifiedPricing is a smaller lookup list and the outer loop iterates on a larger list. It looks to me as if the main source of delay is the Equals check for each item on the smaller list to match each item in the larger list. Also, when you have a match, you update the value in the larger list, so updating multiple times looks redundant.
Considering this, I would suggest building up a Dictionary for the items in the smaller list, increasing memory consumption but drastically speeding up lookup times. First we need something to hold the key of this dictionary. I will assume that the improd and pplist are integers, but it does not matter for this case:
public struct MyKey
{
public readonly int Improd;
public readonly int Pplist;
public MyKey(int improd, int pplist)
{
Improd = improd;
Pplist = pplist;
}
public override int GetHashCode()
{
return Improd.GetHashCode() ^ Pplist.GetHashCode();
}
public override bool Equals(object obj)
{
if (!(obj is MyKey)) return false;
var other = (MyKey)obj;
return other.Improd.Equals(this.Improd) && other.Pplist.Equals(this.Pplist);
}
}
Now that we have something that compares Pplist and Improd in one go, we can use it as a key for a dictionary containing the SimplifiedPricing.
IReadOnlyDictionary<MyKey, SimplifiedPricing> simplifiedPricingLookup =
(from sp in Settings.SimplifiedPricing
group sp by new MyKey(sp.PPPROD, sp.PPLIST) into g
select new {key = g.Key, value = g.Last()}).ToDictionary(o => o.key, o => o.value);
Notice the IReadOnlyDictionary. This is to show our intent of not modifying this dictionary after its creation, allowing us to safely parallelize the main loop:
Parallel.ForEach(Settings.ProdCostsAndQtys, c =>
{
SimplifiedPricing value;
if (simplifiedPricingLookup.TryGetValue(new MyKey(c.improd, c.pplist), out value))
{
c.pricecur = value.PPP01;
c.priceeur = value.PPP01;
}
});
This should change your single-threaded O(n·m) nested loop (n items in ProdCostsAndQtys, m items in SimplifiedPricing) into a parallelized O(n) loop, with a slight overhead for creating the simplifiedPricingLookup dictionary.
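If you would rather not write a key struct by hand, a value tuple can serve as the composite key. The following is only a minimal single-threaded sketch under some assumptions: C# 7 or later, integer key fields, a decimal price, and simplified stand-in classes (ProdCost, SimplifiedPricing, PriceUpdater) that model just the fields used in the question.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for the question's types; only the fields used here are modelled.
public class ProdCost
{
    public int improd;
    public int pplist;
    public decimal pricecur;
    public decimal priceeur;
}

public class SimplifiedPricing
{
    public int PPPROD;
    public int PPLIST;
    public decimal PPP01;
}

public static class PriceUpdater
{
    public static void Apply(IEnumerable<ProdCost> costs, IEnumerable<SimplifiedPricing> pricing)
    {
        // Build the lookup once. On duplicate keys the last entry wins,
        // which matches what the original nested loop ends up assigning.
        var lookup = new Dictionary<(int, int), SimplifiedPricing>();
        foreach (var sp in pricing)
            lookup[(sp.PPPROD, sp.PPLIST)] = sp;

        // One hash lookup per item instead of a scan of the pricing list.
        foreach (var c in costs)
        {
            if (lookup.TryGetValue((c.improd, c.pplist), out var match))
            {
                c.pricecur = match.PPP01;
                c.priceeur = match.PPP01;
            }
        }
    }
}
```

Value tuples compare and hash by their components, so no Equals or GetHashCode override is needed for the key.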

A join should be more efficient:
var toUpdate = from pc in Settings.ProdCostsAndQtys
join s in Settings.SimplifiedPricing
on new { prod=pc.improd, list=pc.pplist } equals new { prod=s.PPPROD, list=s.PPLIST }
select new { prodCost = pc, simplified = s };
foreach (var pcs in toUpdate)
{
pcs.prodCost.pricecur = pcs.simplified.PPP01;
pcs.prodCost.priceeur = pcs.simplified.PPP01;
}

You could make use of multiple threads with Parallel.ForEach:
Parallel.ForEach(Settings.ProdCostsAndQtys, prodCost =>
{
foreach (var simplified in Settings.SimplifiedPricing
.Where(simplified =>
prodCost.improd.Equals(simplified.PPPROD) &&
prodCost.pplist.Equals(simplified.PPLIST)))
{
prodCost.pricecur = simplified.PPP01;
prodCost.priceeur = simplified.PPP01;
}
});
However, this only applies if you have the lists in memory. There are far more efficient mechanisms for updating the lists in the database. Also, using a LINQ join might make the code more readable at negligible performance cost.

Related

What is the most efficient way to find elements in a list that do not exist in another list and vice versa?

Consider you have two lists in C#, first list contains elements of TypeOne and second list contains elements of TypeTwo:
TypeOne
{
int foo;
int bar;
}
TypeTwo
{
int baz;
int qux;
}
Now I need to find elements (by some property value) in the first list that don't exist in the second list, and similarly I want to find elements in the second list that don't exist in the first list. (There are only zero or one occurrences in either list.)
What I tried so far is to iterate both lists like this:
foreach (var item in firstList)
{
if (!secondList.Any(a => a.baz == item.foo))
{
// Item is in the first list but not in second list.
}
}
and again:
foreach (var item in secondList)
{
if (!firstList.Any(a => a.foo == item.baz))
{
// Item is in the second list but not in first list.
}
}
I hardly think this is a good way to do what I want. I'm iterating my lists two times and use Any in each of them which also iterates the list. So too many iterations.
What is the most efficient way to achieve this?
I am afraid there is no prebuilt solution for this, so the best we can do is optimize as much as possible. We only have to iterate the first list, because everything in the second list will already have been compared:
// First we need copies to operate on
var firstCopy = new List<TypeOne>(firstList);
var secondCopy = new List<TypeTwo>(secondList);
// Now we iterate the first list once complete
foreach (var typeOne in firstList)
{
var match = secondCopy.FirstOrDefault(s => s.baz == typeOne.foo);
if (match == null)
{
// Item in first but not in second
}
else
{
// Match is duplicate and shall be removed from both
firstCopy.Remove(typeOne);
secondCopy.Remove(match);
}
}
After running this both copies will only contain the values which are unique in this instance. This not only reduces it to half the number of iterations but also constantly improves because the second copy shrinks with each match.
Use this LINQ Query.
var result1 = secondList.Where(p2 => !firstList.Any(p1 => p1.foo == p2.baz));
var result2 = firstList.Where(p1 => !secondList.Any(p2 => p2.baz == p1.foo));
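Both queries above still scan the other list for every element. If the lists are large, a hash set of the compared property values avoids that. This is a minimal sketch, assuming the compared properties are the int fields foo and baz from the question; the ListDiff class and its method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class TypeOne { public int foo; public int bar; }
public class TypeTwo { public int baz; public int qux; }

public static class ListDiff
{
    // Items of first whose foo has no matching baz in second.
    public static List<TypeOne> OnlyInFirst(List<TypeOne> first, List<TypeTwo> second)
    {
        var keys = new HashSet<int>(second.Select(s => s.baz));
        return first.Where(f => !keys.Contains(f.foo)).ToList();
    }

    // Items of second whose baz has no matching foo in first.
    public static List<TypeTwo> OnlyInSecond(List<TypeOne> first, List<TypeTwo> second)
    {
        var keys = new HashSet<int>(first.Select(f => f.foo));
        return second.Where(s => !keys.Contains(s.baz)).ToList();
    }
}
```

Each list is walked once to build a set and once to filter, so the work grows with the sum of the list sizes rather than their product.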

Is it the same to iterate over Linq expression result than to assign it first to a variable?

This is difficult to explain in words, so I will use code examples.
Let's suppose I already have a list of clients that I want to filter.
Basically I want to know if this:
foreach(var client in list.Where(c=>c.Age > 20))
{
//Do something
}
is the same as this:
var filteredClients = list.Where(c=>c.Age > 20);
foreach(var client in filteredClients)
{
//Do something
}
I've been told that the first approach executes the .Where() in every iteration.
I'm sorry if this is a duplicate; I couldn't find any related question.
Thanks in advance.
Yes, both those examples are functionally identical. One just stores the result from Enumerable.Where in a variable before accessing it while the other just accesses it directly.
To really see why this will not make a difference, you have to understand what a foreach loop essentially does. The code in your examples (both of them) is basically equivalent to this (I’ve assumed a known type Client here):
IEnumerable<Client> x = list.Where(c=>c.Age > 20);
// foreach loop
IEnumerator<Client> enumerator = x.GetEnumerator();
while (enumerator.MoveNext())
{
Client client = enumerator.Current;
// Do something
}
So what actually happens here is the IEnumerable result from the LINQ method is not consumed directly, but an enumerator of it is requested first. And then the foreach loop does nothing else than repeatedly asking for a new object from the enumerator and processing the current element in each loop body.
Looking at this, it doesn’t matter whether the x in the above code is really an x (i.e. a previously stored variable), or whether it’s the list.Where() call itself. Only the enumerator object, which is created just once, is used in the loop.
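One way to convince yourself of this is to count the calls. The sketch below is only an illustration: EnumerationDemo, Filtered and the two counters are hypothetical names, and a plain list of ages stands in for the clients.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumerationDemo
{
    public static int SourceCalls;      // how often the foreach source expression was evaluated
    public static int PredicateCalls;   // how often the filter ran

    // Hypothetical helper that counts evaluations of the loop's source expression.
    static IEnumerable<int> Filtered(List<int> ages)
    {
        SourceCalls++;
        return ages.Where(a => { PredicateCalls++; return a > 20; });
    }

    public static int CountAdults(List<int> ages)
    {
        int adults = 0;
        // Filtered(...) is evaluated once, before the first iteration.
        foreach (var age in Filtered(ages))
            adults++;
        return adults;
    }
}
```

After one call to CountAdults, SourceCalls is 1 and PredicateCalls equals the number of elements in the list: the expression after "in" runs once, and the predicate runs once per element.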
Now to cover that SharePoint example which Colin posted. It looks like this:
SPList activeList = SPContext.Current.List;
for (int i=0; i < activeList.Items.Count; i++)
{
SPListItem listItem = activeList.Items[i];
// do stuff
}
This is a fundamentally different thing though. Since this is not using a foreach loop, we do not get that one enumerator object which we use to iterate through the list. Instead, we repeatedly access activeList.Items: Once in the loop body to get an item by index, and once in the continuation condition of the for loop where we get the collection’s Count property value.
Unfortunately, Microsoft does not follow its own guidelines all the time, so even if Items is a property on the SPList object, it actually is creating a new SPListItemCollection object every time. And that object is empty by default and will only lazily load the actual items when you first access an item from it. So above code will eventually create a large amount of SPListItemCollections which will each fetch the items from the database. This behavior is also mentioned in the remarks section of the property documentation.
This generally violates Microsoft’s own guidelines on choosing a property vs a method:
Do use a method, rather than a property, in the following situations.
The operation returns a different result each time it is called, even if the parameters do not change.
Note that if we used a foreach loop for that SharePoint example again, then everything would have been fine, since we would have again only requested a single SPListItemCollection and created a single enumerator for it:
foreach (SPListItem listItem in activeList.Items.Cast<SPListItem>())
{ … }
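The same effect can be reproduced without SharePoint. The following sketch uses a made-up CostlyList class whose Items property builds a new list on every access (an assumption standing in for the behaviour described above) and compares an indexed for loop with and without hoisting the property into a local variable.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a type whose property builds a new collection on every access.
public class CostlyList
{
    public int Allocations { get; private set; }

    public List<string> Items
    {
        get
        {
            Allocations++;   // simulates an expensive fetch
            return new List<string> { "a", "b", "c" };
        }
    }
}

public static class HoistDemo
{
    // Accesses Items in the loop condition and in the body on every iteration.
    public static int NaiveFor(CostlyList list)
    {
        for (int i = 0; i < list.Items.Count; i++)
        {
            var item = list.Items[i];
        }
        return list.Allocations;
    }

    // Reads the property once and reuses the local.
    public static int HoistedFor(CostlyList list)
    {
        var items = list.Items;
        for (int i = 0; i < items.Count; i++)
        {
            var item = items[i];
        }
        return list.Allocations;
    }
}
```

With three items, the naive loop touches the property once per condition check and once per body execution, while the hoisted loop touches it exactly once.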
They are not quite the same:
Here is the original C# code:
static void ForWithVariable(IEnumerable<Person> clients)
{
var adults = clients.Where(x => x.Age > 20);
foreach (var client in adults)
{
Console.WriteLine(client.Age.ToString());
}
}
static void ForWithoutVariable(IEnumerable<Person> clients)
{
foreach (var client in clients.Where(x => x.Age > 20))
{
Console.WriteLine(client.Age.ToString());
}
}
Here is the code as decompiled from the compiled Intermediate Language (IL), according to ILSpy:
private static void ForWithVariable(IEnumerable<Person> clients)
{
Func<Person, bool> arg_21_1;
if ((arg_21_1 = Program.<>c.<>9__1_0) == null)
{
arg_21_1 = (Program.<>c.<>9__1_0 = new Func<Person, bool>(Program.<>c.<>9.<ForWithVariable>b__1_0));
}
IEnumerable<Person> enumerable = clients.Where(arg_21_1);
foreach (Person current in enumerable)
{
Console.WriteLine(current.Age.ToString());
}
}
private static void ForWithoutVariable(IEnumerable<Person> clients)
{
Func<Person, bool> arg_22_1;
if ((arg_22_1 = Program.<>c.<>9__2_0) == null)
{
arg_22_1 = (Program.<>c.<>9__2_0 = new Func<Person, bool>(Program.<>c.<>9.<ForWithoutVariable>b__2_0));
}
foreach (Person current in clients.Where(arg_22_1))
{
Console.WriteLine(current.Age.ToString());
}
}
As you can see, there is a key difference:
IEnumerable<Person> enumerable = clients.Where(arg_21_1);
A more practical question, however, is whether the differences hurt performance. I concocted a test to measure that.
class Program
{
public static void Main()
{
Measure(ForEachWithVariable);
Measure(ForEachWithoutVariable);
Console.ReadKey();
}
static void Measure(Action<List<Person>, List<Person>> action)
{
var clients = new[]
{
new Person { Age = 10 },
new Person { Age = 20 },
new Person { Age = 30 },
}.ToList();
var adultClients = new List<Person>();
var sw = new Stopwatch();
sw.Start();
for (var i = 0; i < 1E6; i++)
action(clients, adultClients);
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds.ToString());
Console.WriteLine($"{adultClients.Count} adult clients found");
}
static void ForEachWithVariable(List<Person> clients, List<Person> adultClients)
{
var adults = clients.Where(x => x.Age > 20);
foreach (var client in adults)
adultClients.Add(client);
}
static void ForEachWithoutVariable(List<Person> clients, List<Person> adultClients)
{
foreach (var client in clients.Where(x => x.Age > 20))
adultClients.Add(client);
}
}
class Person
{
public int Age { get; set; }
}
After several runs of the program, I was not able to find any significant difference between ForEachWithVariable and ForEachWithoutVariable. They were always close in time, and neither was consistently faster than the other. Interestingly, if I change 1E6 to just 1000, the ForEachWithVariable is actually consistently slower, by about 1 millisecond.
So, I conclude that for LINQ to Objects, there is no practical difference. The same type of test could be run if your particular use case involves LINQ to Entities (or SharePoint).

Comparing two lists of nested lists and returning the added/changed/removed items

I've looked at many similar questions on stackoverflow, but I haven't seen an exact match for my problem.
I need to compare two "lists of nested lists" and capture the differences. One is an "old" list and the other is a "new" list. When comparing the nested lists, they can be considered equal if all of the NESTED list items (the MyObject.Ids) are present in both lists in order (you can assume that the nested MyObject.Ids lists are already sorted and that there are no duplicates). The MyObject.Id and MyObject.Name properties are not considered in the equality comparison, but they are still important metadata for each MyObject and should not get lost.
I am not looking for a boolean indicator of equality. Instead I need to create three new lists which capture the differences between the old and new lists (e.g. a list of items which were Added, a list of items which were Removed, and a list of items which were present in both lists).
Below is an example of some code which does exactly what I want! What I would like to know is how to make this shorter/better/simpler (cutting out one of the for loops would be a good start). To make things trickier, please assume that you cannot make any changes to the MyObject class or use any custom Equals/IEqualityComparer etc implementations.
public class MyObject
{
public Guid Id { get; set; }
public string Name { get; set; }
public List<Guid> Ids { get; set; }
}
...
// Get the list of existing objects (assume this returns some populated list)
List<MyObject> existingObjects = GetExistingObjects();
// Create a list of updated objects
List<MyObject> updatedObjects = new List<MyObject>()
{
new MyObject()
{
Ids = new List<Guid>() { new Guid("48af3cb9-945a-4ab9-91e4-7ee5765e5304"), new Guid("54b5128a-cf53-436c-9d88-2ef7abd15140") }
},
new MyObject()
{
Ids = new List<Guid>() { new Guid("0485382f-8f92-4a71-9eba-09831392ceb9"), new Guid("3d8b98df-caee-41ce-b802-2f0c5f9742de") }
}
};
// Do the comparison and capture the differences
List<MyObject> addedObjects = new List<MyObject>();
List<MyObject> removedObjects = new List<MyObject>();
List<MyObject> sameObjects = new List<MyObject>();
foreach (MyObject obj in updatedObjects)
{
if (existingObjects.Any(list => list.Ids.SequenceEqual(obj.Ids)))
{
sameObjects.Add(obj);
continue;
}
addedObjects.Add(obj);
}
foreach (MyObject obj in existingObjects)
{
if (!updatedObjects.Any(list => list.Ids.SequenceEqual(obj.Ids)))
{
removedObjects.Add(obj);
}
}
Here is a version that is a little shorter (due to the elimination of the second loop) and a little better (due to the elimination of the inefficient search in the second loop). It is still O(N^2) in time, though, because of the linear search inside the remaining loop.
var addedObjects = new List<MyObject>();
var removedObjects = new List<MyObject>(existingObjects);
var sameObjects = new List<MyObject>();
foreach (var newObject in updatedObjects)
{
int index = removedObjects.FindIndex(oldObject => oldObject.Ids.SequenceEqual(newObject.Ids));
if (index < 0)
addedObjects.Add(newObject);
else
{
removedObjects.RemoveAt(index);
sameObjects.Add(newObject);
}
}
Update: A shorter, but IMO definitely not better (in fact worse performance wise) version
var addedObjects = updatedObjects.Where(newObject => !existingObjects.Any(oldObject => oldObject.Ids.SequenceEqual(newObject.Ids))).ToList();
var removedObjects = existingObjects.Where(oldObject => !updatedObjects.Any(newObject => newObject.Ids.SequenceEqual(oldObject.Ids))).ToList();
var sameObjects = updatedObjects.Where(newObject => !addedObjects.Any(addedObject => addedObject.Ids.SequenceEqual(newObject.Ids))).ToList();
If MyObject does not define custom equality comparison, i.e. uses default reference equality, the last line could be replaced with shorter and better performing
var sameObjects = updatedObjects.Except(addedObjects);
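If the lists are large, the linear search can be replaced by a dictionary lookup without touching MyObject or writing an IEqualityComparer, by turning each Ids list into a string key. This is only a sketch under stated assumptions: no two existing objects share the same Ids sequence, and the NestedListDiff class, KeyOf helper and Compare signature are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public List<Guid> Ids { get; set; }
}

public static class NestedListDiff
{
    // Builds a string key from the (already sorted, duplicate-free) Ids.
    static string KeyOf(MyObject o)
    {
        return string.Join(",", o.Ids);
    }

    public static void Compare(
        List<MyObject> existing, List<MyObject> updated,
        out List<MyObject> added, out List<MyObject> removed, out List<MyObject> same)
    {
        // Assumes no two existing objects share the same Ids sequence;
        // ToDictionary throws on a duplicate key.
        var remaining = existing.ToDictionary(o => KeyOf(o));
        added = new List<MyObject>();
        same = new List<MyObject>();

        foreach (var obj in updated)
        {
            if (remaining.Remove(KeyOf(obj)))
                same.Add(obj);      // present in both lists
            else
                added.Add(obj);     // only in the updated list
        }

        // Whatever was never matched by an updated object was removed.
        removed = remaining.Values.ToList();
    }
}
```

Building the keys costs time proportional to the total number of Ids, and each lookup is a hash lookup, so the comparison no longer scans the whole old list for every new object.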
You can use the Intersect and Except methods in LINQ.
With Intersect you will get the existing objects,
and with Except you will get the new objects.
Example of Except from MSDN:
double[] numbers1 = { 2.0, 2.1, 2.2, 2.3, 2.4, 2.5 };
double[] numbers2 = { 2.2 };
IEnumerable<double> onlyInFirstSet = numbers1.Except(numbers2);
foreach (double number in onlyInFirstSet)
Console.WriteLine(number);

One To Many LINQ Joins Across Nested Collections

I have a development scenario where I am joining two collections with Linq; a single list of column header objects which contain presentation metadata, and an enumeration of kv dictionaries which result from a web service call. I can currently iterate (for) through the dictionary enumeration, and join the single header list to the current kv dictionary without issue. After joining, I emit a curated array of dictionary values for each iteration.
What I would like to do is eliminate the for loop, and join the single header list directly to the entire enumeration. I understand the 1-to-1 collection join pretty well, but the 1-to-N syntax is eluding me.
Details
I have the following working method:
public void GetQueryResults(DataTable outputTable)
{
var odClient = new ODataClient(UrlBase);
var odResponse = odClient.FindEntries(CommandText);
foreach (var row in odResponse)
{
var rowValues = OutputFields
.Join(row, h => h.Key, r => r.Key,
(h, r) => new { Header = h, ResultRow = r })
.Select(r => r.ResultRow.Value);
outputTable.Rows.Add(rowValues.ToArray());
}
}
odResponse contains IEnumerable<IDictionary<string, object>>; OutputFields contains IList<QueryField>; the .Join produces an enumeration of anons containing matched field metadata (.Header) and response kv pairs (.ResultRow); finally, the .Select emits the matched response values for row consumption. The OutputField collection looks like this:
class QueryField
{
public string Key { get; set; }
public string Label { get; set; }
public int Order { get; set; }
}
Which is declared as:
public IList<QueryField> OutputFields { get; private set; }
By joining the collection of field headers to the response rows, I can pluck just the columns I need from the response. If the header keys contain { "size", "shape", "color" } and the response keys contain { "size", "mass", "color", "longitude", "latitude" }, I will get an array of values for { "size", "shape", "color" }, where shape is null, and the mass, longitude, and latitude values are ignored. For the purposes of this scenario, I am not concerned with ordering. This all works a treat.
Problem
What I'd like to do is refactor this method to return an enumeration of value array rows, and let the caller manage the consumption of the data:
public IEnumerable<string[]> GetQueryResults()
{
var odClient = new ODataClient(UrlBase);
var odResponse = odClient.FindEntries(CommandText);
var responseRows = //join OutputFields to each row in odResponse by .Key
return responseRows;
}
Followup Question
Would a Linq-implemented solution for this refactor require an immediate scan of the enumeration, or can it pass back a lazy result? The purpose of the refactor is to improve encapsulation without causing redundant collection scans. I can always build imperative loops to reformat the response data the hard way, but what I'd like from Linq is something like a closure.
Thanks heaps for spending the time to read this; any suggestions are appreciated!
I'm not completely sure what you mean but could it be you're meaning something like this?
public IEnumerable<object[]> GetQueryResults()
{
var odClient = new ODataClient(UrlBase);
var odResponse = odClient.FindEntries(CommandText);
// i'd rather you linq here.
var responseRows = from row in odResponse
select (from field in row
join outputfield in OutputFields
on field.Key equals outputfield.Key
select field.Value).ToArray();
return responseRows;
}
Instead of filling a DataTable, this creates an array of objects for each row and fills it with field.Value wherever field.Key exists in OutputFields. The whole thing is wrapped in an IEnumerable (from row in odResponse).
Usage:
var responseRows = GetQueryResults();
foreach(var rowValues in responseRows)
outputTable.Rows.Add(rowValues);
The trick here is that, within one query, you iterate the rows and run a subquery on the fields, storing the subquery result directly in an object[]. Each object[] is only created when responseRows is iterated. I think this answers your second question: the result is lazy.
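For comparison, here is a self-contained sketch of the same idea in method syntax that does not depend on the OData client. It is an assumption-laden illustration, not the original method: RowProjection and Project are made-up names, a plain list of key strings stands in for OutputFields, and missing keys are emitted as null instead of being dropped.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RowProjection
{
    // keys: the header keys to keep (hypothetical stand-in for OutputFields).
    // rows: one dictionary per response row.
    public static IEnumerable<object[]> Project(
        IEnumerable<IDictionary<string, object>> rows, IList<string> keys)
    {
        // Deferred: nothing is enumerated until the caller iterates the result.
        return rows.Select(row =>
            keys.Select(k => row.TryGetValue(k, out var v) ? v : null).ToArray());
    }
}
```

Because Select is deferred, each row is read from the source and converted to an array only when the caller's foreach reaches it, so there is no extra pass over the response.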

Best way to remove items from a collection

What is the best way to approach removing items from a collection in C# once the item is known, but not its index? This is one way to do it, but it seems inelegant at best.
//Remove the existing role assignment for the user.
int cnt = 0;
int assToDelete = 0;
foreach (SPRoleAssignment spAssignment in workspace.RoleAssignments)
{
if (spAssignment.Member.Name == shortName)
{
assToDelete = cnt;
}
cnt++;
}
workspace.RoleAssignments.Remove(assToDelete);
What I would really like to do is find the item to remove by property (in this case, name) without looping through the entire collection and using 2 additional variables.
If RoleAssignments is a List<T> you can use the following code.
workspace.RoleAssignments.RemoveAll(x => x.Member.Name == shortName);
If you want to access members of the collection by one of their properties, you might consider using a Dictionary<TKey, TValue> or KeyedCollection<TKey, TItem> instead. This way you don't have to search for the item you're looking for.
Otherwise, you could at least do this:
foreach (SPRoleAssignment spAssignment in workspace.RoleAssignments)
{
if (spAssignment.Member.Name == shortName)
{
workspace.RoleAssignments.Remove(spAssignment);
break;
}
}
@smaclell asked, in a comment to @sambo99, why reverse iteration was more efficient.
Sometimes it's more efficient. Consider you have a list of people, and you want to remove or filter all customers with a credit rating < 1000;
We have the following data
"Bob" 999
"Mary" 999
"Ted" 1000
If we were to iterate forward, we'd soon get into trouble
for( int idx = 0; idx < list.Count ; idx++ )
{
if( list[idx].Rating < 1000 )
{
list.RemoveAt(idx); // whoops!
}
}
At idx = 0 we remove Bob, which then shifts all remaining elements left. The next time through the loop idx = 1, but list[1] is now Ted instead of Mary. We end up skipping Mary by mistake. We could use a while loop, and we could introduce more variables.
Or, we just reverse iterate:
for (int idx = list.Count-1; idx >= 0; idx--)
{
if (list[idx].Rating < 1000)
{
list.RemoveAt(idx);
}
}
All the indexes to the left of the removed item stay the same, so you don't skip any items.
The same principle applies if you're given a list of indexes to remove from an array. In order to keep things straight you need to sort the list and then remove the items from highest index to lowest.
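As a small illustration of that last point, the sketch below removes a given set of positions from a List<T> by visiting them from highest to lowest. IndexRemoval and RemoveAtIndexes are hypothetical names, and the indexes are assumed to be valid for the list.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IndexRemoval
{
    public static void RemoveAtIndexes<T>(List<T> list, IEnumerable<int> indexes)
    {
        // Highest index first, so earlier removals never shift a pending index.
        // Distinct guards against the same position being listed twice.
        foreach (int index in indexes.Distinct().OrderByDescending(x => x))
            list.RemoveAt(index);
    }
}
```

Removing in ascending order instead would require subtracting the number of items already removed from each subsequent index.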
Now you can just use List<T>.RemoveAll with a lambda and declare what you're doing in a straightforward manner.
list.RemoveAll(o => o.Rating < 1000);
For the case of removing a single item, iterating forwards or backwards is equally efficient. You could also use List<T>.FindIndex for this.
int removeIndex = list.FindIndex(o => o.Name == "Ted");
if( removeIndex != -1 )
{
list.RemoveAt(removeIndex);
}
If it's an ICollection then you won't have a RemoveAll method. Here's an extension method that will do it:
public static void RemoveAll<T>(this ICollection<T> source,
Func<T, bool> predicate)
{
if (source == null)
throw new ArgumentNullException("source", "source is null.");
if (predicate == null)
throw new ArgumentNullException("predicate", "predicate is null.");
source.Where(predicate).ToList().ForEach(e => source.Remove(e));
}
Based on:
http://phejndorf.wordpress.com/2011/03/09/a-removeall-extension-for-the-collection-class/
For a simple List structure the most efficient way seems to be using the Predicate RemoveAll implementation.
Eg.
workspace.RoleAssignments.RemoveAll(x => x.Member.Name == shortName);
The reasons are:
The RemoveAll method is implemented in List<T> and has access to the internal array storing the actual data. It compacts the surviving items in a single pass, without allocating a new array.
Each RemoveAt call shifts every element after the removed index down by one, so removing many items one at a time costs O(N) per removal. Reverse iteration keeps the indexes valid but does not avoid that cost.
If you are stuck implementing this in the pre-C# 3.0 era, you have 2 options.
The easily maintainable option: copy all the items you want to keep into a new list and swap it in for the original list.
Eg.
List<int> list = GetList();
List<int> list2 = new List<int>();
foreach (int i in list)
{
// Keep only the odd numbers, i.e. remove the even ones.
if (!(i % 2 == 0))
{
list2.Add(i);
}
}
list = list2;
Or
The tricky slightly faster option, which involves shifting all the data in the list down when it does not match and then resizing the array.
If you are removing items from a list really frequently, perhaps another structure like a Hashtable (.NET 1.1), a Dictionary (.NET 2.0), or a HashSet (.NET 3.5) is better suited for this purpose.
What type is the collection? If it's List, you can use the helpful "RemoveAll":
int cnt = workspace.RoleAssignments
.RemoveAll(spa => spa.Member.Name == shortName);
(This works in .NET 2.0. Of course, if you don't have the newer compiler, you'll have to use "delegate (SPRoleAssignment spa) { return spa.Member.Name == shortName; }" instead of the nice lambda syntax.)
Another approach if it's not a List, but still an ICollection:
var toRemove = workspace.RoleAssignments
.FirstOrDefault(spa => spa.Member.Name == shortName);
if (toRemove != null) workspace.RoleAssignments.Remove(toRemove);
This requires the Enumerable extension methods. (You can copy the Mono ones in, if you are stuck on .NET 2.0). If it's some custom collection that cannot take an item, but MUST take an index, some of the other Enumerable methods, such as Select, pass in the integer index for you.
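To illustrate the index-passing overload of Select mentioned above, here is a minimal sketch that finds the position of the first matching item in any IEnumerable<T>. IndexFinder and IndexOfFirst are made-up names, and returning -1 for "not found" is just a convention chosen here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexFinder
{
    // Returns the index of the first match, or -1 if there is none.
    public static int IndexOfFirst<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        return source
            .Select((item, index) => new { item, index })   // pair each item with its position
            .Where(x => predicate(x.item))
            .Select(x => x.index)
            .DefaultIfEmpty(-1)
            .First();
    }
}
```

The resulting index could then be passed to an index-based Remove method on a collection that does not accept the item itself.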
This is my generic solution
public static IEnumerable<T> Remove<T>(this IEnumerable<T> items, Func<T, bool> match)
{
var list = items.ToList();
for (int idx = 0; idx < list.Count(); idx++)
{
if (match(list[idx]))
{
list.RemoveAt(idx);
idx--; // the list is 1 item shorter
}
}
return list.AsEnumerable();
}
It would look much simpler if extension methods supported passing by reference!
usage:
var result = new[] { "mike", "john", "ali" };
result = result.Remove(x => x == "mike").ToArray();
Assert.IsTrue(result.Length == 2);
EDIT: ensured that the list looping remains valid even when deleting items by decrementing the index (idx).
Here is a pretty good way to do it
http://support.microsoft.com/kb/555972
System.Collections.ArrayList arr = new System.Collections.ArrayList();
arr.Add("1");
arr.Add("2");
arr.Add("3");
/*This throws an exception
foreach (string s in arr)
{
arr.Remove(s);
}
*/
//whereas this works correctly
Console.WriteLine(arr.Count);
foreach (string s in new System.Collections.ArrayList(arr))
{
arr.Remove(s);
}
Console.WriteLine(arr.Count);
Console.ReadKey();
There is another approach you can take depending on how you're using your collection. If you're downloading the assignments one time (e.g., when the app runs), you could translate the collection on the fly into a hashtable where:
shortname => SPRoleAssignment
If you do this, then when you want to remove an item by short name, all you need to do is remove the item from the hashtable by key.
Unfortunately, if you're loading these SPRoleAssignments a lot, that obviously isn't going to be any more cost efficient in terms of time. The suggestions other people made about using Linq would be good if you're using a new version of the .NET Framework, but otherwise, you'll have to stick to the method you're using.
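A sketch of that lookup idea, using a generic Dictionary instead of the older Hashtable. The Assignment class is a hypothetical stand-in for SPRoleAssignment with just a name, and short names are assumed to be unique.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a role assignment that has a short name.
public class Assignment
{
    public string ShortName { get; set; }
}

public static class AssignmentIndex
{
    public static Dictionary<string, Assignment> Build(IEnumerable<Assignment> assignments)
    {
        // Assumes short names are unique; ToDictionary throws on a duplicate key.
        return assignments.ToDictionary(a => a.ShortName);
    }
}
```

Once the dictionary is built, removing by name is a single call such as index.Remove("bob"), which returns false when the key is not present. Note that this only removes the entry from the in-memory lookup, not from the original collection.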
Taking the Dictionary point of view, I have done this.
Dictionary<string, bool> sourceDict = new Dictionary<string, bool>();
sourceDict.Add("Sai", true);
sourceDict.Add("Sri", false);
sourceDict.Add("SaiSri", true);
sourceDict.Add("SaiSriMahi", true);
var itemsToDelete = sourceDict.Where(DictItem => DictItem.Value == false);
foreach (var item in itemsToDelete)
{
sourceDict.Remove(item.Key);
}
Note:
The above code will fail in the .NET Client Profile (3.5 and 4.5); some readers mentioned it also fails for them in .NET 4.0, and I'm not sure which settings cause the problem.
To avoid the error "Collection was modified; enumeration operation may not execute.", materialize the result of the Where statement with .ToList(), as below:
var itemsToDelete = sourceDict.Where(DictItem => DictItem.Value == false).ToList();
Per MSDN, the Client Profile is discontinued from .NET 4.5 onwards. http://msdn.microsoft.com/en-us/library/cc656912(v=vs.110).aspx
Save your items first, then delete them.
var itemsToDelete = Items.Where(x => !!!your condition!!!).ToArray();
for (int i = 0; i < itemsToDelete.Length; ++i)
Items.Remove(itemsToDelete[i]);
You need to override GetHashCode() in your Item class.
The best way to do it is by using LINQ together with List<T>.RemoveAll.
Example class:
public class Product
{
public string Name { get; set; }
public string Price { get; set; }
}
Linq query:
int removedCount = collection1.RemoveAll(w => collection2.Any(q => q.Name == w.Name));
This removes from collection1 every element whose Name matches the Name of any element in collection2. collection1 must be a List<T>, and RemoveAll returns the number of elements removed.
Remember to use: using System.Linq;
To do this while looping through the collection and not to get the modifying a collection exception, this is the approach I've taken in the past (note the .ToList() at the end of the original collection, this creates another collection in memory, then you can modify the existing collection)
foreach (SPRoleAssignment spAssignment in workspace.RoleAssignments.ToList())
{
if (spAssignment.Member.Name == shortName)
{
workspace.RoleAssignments.Remove(spAssignment);
}
}
If you have got a List<T>, then List<T>.RemoveAll is your best bet. There can't be anything more efficient. Internally it does the array moving in one shot, not to mention it is O(N).
If all you have is an IList<T> or an ICollection<T>, you have roughly these three options:
public static void RemoveAll<T>(this IList<T> ilist, Predicate<T> predicate) // O(N^2)
{
for (var index = ilist.Count - 1; index >= 0; index--)
{
var item = ilist[index];
if (predicate(item))
{
ilist.RemoveAt(index);
}
}
}
or
public static void RemoveAll<T>(this ICollection<T> icollection, Predicate<T> predicate) // O(N)
{
var nonMatchingItems = new List<T>();
// Move all the items that do not match to another collection.
foreach (var item in icollection)
{
if (!predicate(item))
{
nonMatchingItems.Add(item);
}
}
// Clear the collection and then copy back the non-matched items.
icollection.Clear();
foreach (var item in nonMatchingItems)
{
icollection.Add(item);
}
}
or
public static void RemoveAll<T>(this ICollection<T> icollection, Func<T, bool> predicate) // O(N^2)
{
foreach (var item in icollection.Where(predicate).ToList())
{
icollection.Remove(item);
}
}
Go for either 1 or 2.
1 is lighter on memory and faster if you have fewer deletes to perform (i.e. the predicate is false most of the time).
2 is faster if you have more deletes to perform.
3 is the cleanest code but performs poorly IMO. Again all that depends on input data.
For some benchmarking details see https://github.com/dotnet/BenchmarkDotNet/issues/1505
A lot of good responses here; I especially like the lambda expressions...very clean. I was remiss, however, in not specifying the type of Collection. This is a SPRoleAssignmentCollection (from MOSS) that only has Remove(int) and Remove(SPPrincipal), not the handy RemoveAll(). So, I have settled on this, unless there is a better suggestion.
foreach (SPRoleAssignment spAssignment in workspace.RoleAssignments)
{
if (spAssignment.Member.Name != shortName) continue;
workspace.RoleAssignments.Remove((SPPrincipal)spAssignment.Member);
break;
}
