I can see why this is not allowed:
foreach (Thing t in myCollection) {
if (shouldDelete(t)) {
myCollection.Delete(t);
}
}
but how about this?
foreach (Thing t in myCollection.Where(o=>shouldDelete(o))) {
myCollection.Delete(t);
}
I don't understand why this fails. The "Where()" method obviously isn't returning the original collection, so I am not enumerating round the original collection when I try to remove something from it.
I don't understand why this fails.
I assume your question then is "why does this fail?" (You forgot to actually ask a question in your question.)
The "Where()" method obviously isn't returning the original collection
Correct. "Where" returns an IEnumerable<T> which represents the collection with a filter put on top of it.
so I am not enumerating round the original collection when I try to remove something from it.
Incorrect. You are enumerating the original collection. You're enumerating the original collection with a filter put on top of it.
When you call "Where" it does not eagerly evaluate the filter and produce a brand new copy of the original collection with the filter applied to it. Rather, it gives you an object which enumerates the original collection, but skips over the items that do not match the filter.
When you're at a store and you say "show me everything", the guy showing you everything shows you everything. When you say "now just show me the apples that are between $1 and $5 a kilogram", you are not constructing an entirely new store that has only apples in it. You're looking at exactly the same collection of stuff as before, just with a filter on it.
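A minimal sketch of the point above, assuming myCollection behaves like a List<T>. The class and method names here are made up for illustration; only List<T>, Where and Count are real library members. It shows that the filtered sequence still reads from the live list, so changing the list mid-loop trips the list's own enumerator check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredWhereDemo
{
    // Returns true if removing inside the loop throws InvalidOperationException.
    public static bool RemovingWhileFilteringThrows()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };
        try
        {
            // Where wraps the list's enumerator; it does not copy the list.
            foreach (int n in numbers.Where(x => x % 2 == 0))
            {
                numbers.Remove(n); // modifies the list that is being enumerated
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // The filter runs against the live list each time the query is enumerated.
    public static int CountAfterLaterAdd()
    {
        var numbers = new List<int> { 1, 2, 3 };
        IEnumerable<int> evens = numbers.Where(x => x % 2 == 0);
        numbers.Add(4);       // added after the query object was created
        return evens.Count(); // counts 2 and 4
    }

    public static void Main()
    {
        Console.WriteLine(RemovingWhileFilteringThrows()); // True
        Console.WriteLine(CountAfterLaterAdd());           // 2
    }
}
```

The second method is the "same store, with a filter" idea in code: the query object holds no items of its own.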
Try using this code:
myCollection.RemoveAll(x => shouldDelete(x));
You can do:
myCollection.RemoveAll(shouldDelete);
The second statement returns an IEnumerable<> operating on your list. This one should be okay, because ToList() copies the matching items before the loop starts:
foreach (Thing t in myCollection.Where(o=>shouldDelete(o)).ToList()) {
myCollection.Delete(t);
}
This is because a collection must not be modified inside a foreach loop that is enumerating it. The Delete call runs while the enumeration is still in progress, so it fails.
The Where extension method filters the collection's values using the supplied predicate and returns a lazily evaluated IEnumerable<T> over the original collection. Hence you still can't modify the collection while iterating over the result.
You can use RemoveAll() for your purpose.
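The two workarounds suggested in the answers above can be sketched side by side, assuming the collection is a List<T>. ShouldDelete and the class name are hypothetical stand-ins for the question's shouldDelete predicate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SafeRemovalDemo
{
    // Hypothetical predicate standing in for the question's shouldDelete.
    private static bool ShouldDelete(int n) => n % 2 == 0;

    // Option 1: copy the matches with ToList(), then loop over the copy.
    public static List<int> RemoveViaSnapshot()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };
        foreach (int n in numbers.Where(x => ShouldDelete(x)).ToList())
        {
            numbers.Remove(n); // safe: the loop enumerates the copy
        }
        return numbers;
    }

    // Option 2: let List<T>.RemoveAll do the filtering and removal itself.
    public static List<int> RemoveViaRemoveAll()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };
        numbers.RemoveAll(ShouldDelete);
        return numbers;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", RemoveViaSnapshot()));  // 1,3,5
        Console.WriteLine(string.Join(",", RemoveViaRemoveAll())); // 1,3,5
    }
}
```

RemoveAll is usually the simpler choice for a List<T>, since it avoids the temporary copy and the repeated Remove searches.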
I have a ConcurrentBag of objects, and I want to do following over it:
enumerate all items with a where filtering.
for each item, check some properties and, based on the values, make some method call. After the method call, it's better to remove the item from the bag.
modify some properties' value and save it to the bag.
So basically I need something like following:
foreach (var item in myBag.Where(it => it.Property1 == true))
{
if (item.Property2 == true)
{
SomeMethodToReadTheItem(item);
//it's better to remove this item from the bag here, but
//if there is a performance hit, then just leave it.
}
else
{
item.Property3= "new value";
//now how do I save the item back to the bag?
}
}
Of course it should be done in a thread-safe way. I know that the enumeration over a ConcurrentBag is actually over a "snapshot" of the real bag, but how about with a where clause filter? Should I do a ToList to prevent it from making a new "snapshot"?
Also, if you want to modify one specific item, you just call bag.TryTake(out item). But since I already got the item in the enumeration, should I "take" it again?
Any explanation/comment/sample would be very much appreciated.
Thank you.
I'll try to answer specific parts of your question without addressing the performance.
First off, the Where method takes an IEnumerable<T> as its first parameter and iterates over it itself. That calls GetEnumerator() once, so you take only one snapshot of the underlying ConcurrentBag.
Secondly, the thread safety of your code is not very clear; there may be implicit guarantees in the rest of your code that are not specified. For example, you have a ConcurrentBag, so the collection itself is thread-safe, but you modify the items contained in it without any thread synchronisation. If other code, running this same method or another one, reads or modifies the items in the ConcurrentBag concurrently, you may see data races.
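Note that it is not necessary to call TryTake if you already have a reference to the item. Assuming the items are reference types, the bag holds that same reference, so a property change is already visible through the bag and nothing needs to be saved back. Bear in mind too that TryTake removes and returns an arbitrary item from the bag, not a specific one.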
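A small sketch of the snapshot behaviour described above. The Item class is a hypothetical stand-in for the question's item type, and the sketch is single-threaded, so it says nothing about the synchronisation concerns raised earlier.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

// Hypothetical item type modelled on the question's properties.
public sealed class Item
{
    public bool Property1 { get; set; }
    public string Property3 { get; set; } = "";
}

public static class BagSnapshotDemo
{
    // Adding to the bag while enumerating it does not throw, and the loop
    // does not see the new items, because it walks a snapshot.
    public static int VisitedWhileAdding()
    {
        var bag = new ConcurrentBag<Item>
        {
            new Item { Property1 = true },
            new Item { Property1 = true },
            new Item { Property1 = false },
        };
        int visited = 0;
        foreach (Item item in bag.Where(it => it.Property1))
        {
            visited++;
            bag.Add(new Item { Property1 = true }); // not part of the snapshot
        }
        return visited;
    }

    // Items are stored by reference, so a property change made during the
    // loop is visible when the bag is read again afterwards.
    public static bool ChangeIsVisibleInBag()
    {
        var bag = new ConcurrentBag<Item> { new Item { Property1 = true } };
        foreach (Item item in bag.Where(it => it.Property1))
        {
            item.Property3 = "new value";
        }
        return bag.All(it => it.Property3 == "new value");
    }

    public static void Main()
    {
        Console.WriteLine(VisitedWhileAdding());   // 2
        Console.WriteLine(ChangeIsVisibleInBag()); // True
    }
}
```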
I recommend you just create a new list and, for each item that passes the Where filter and that you want to keep, add it to this new list.
It would look something like this:
List<T> myNewList = new List<T>();
foreach (var item in myBag.Where(it => it.Property1))
{
if (!item.Property2)
{
myNewList.Add(item);
}
}
Pay attention to the "!": only items whose Property2 is false are added to the new list.
I need to remove items from the HttpSession collection. In the following code, myList contains the same items as Session. If there are items in myList/Session that are not in itemsToRemove, they should be deleted from the session collection.
However, I'm not sure what the lambda syntax should look like. The following isn't correct.
myList.ForEach(x => !itemsToRemove.Contains(x) { Session.Remove(x) });
Any ideas how I can use a lambda expression to put everything on one line to accomplish this task?
Also, is there a way to avoid creating the intermediate list (myList)? I'm only doing that because I can't remove items from Session while iterating through it.
The most naïve way:
myList.Where(x => !itemsToRemove.Contains(x)) // LINQ extension method
.ToList() // required, because ForEach is a List<T> method
.ForEach(x => Session.Remove(x)); // List<T> method
Also you can use this:
myList.Except(itemsToRemove)
.ToList()
.ForEach(x => Session.Remove(x));
But to use ForEach the underlying type must be List<T>, so you need to call ToList() first, which causes one extra enumeration of the whole collection.
I would do this instead:
foreach (var x in myList.Except(itemsToRemove))
{
Session.Remove(x);
}
This will minimize the number of enumerations.
First off, abatischev's answer is excellent. It's ideal from both a performance perspective and a readability perspective. If, however, you really want to cram all the functionality into one statement (which I don't recommend), you could try the following:
Session.OfType<string>()
.Except(itemsToRemove)
.ToList()
.ForEach(x => Session.Remove(x));
As abatischev mentioned, the ToList() call costs you an extra enumeration through the collection, which could have a non-trivial performance impact if the collection has a large number of elements in it. However, it means the ForEach() call iterates over a newly created List<string>, which fills the role of your myList and lets you remove items from the Session (since you're iterating through that temporary list, rather than the Session).
(Note that I haven't worked with HttpSessionState objects myself, merely looked at their MSDN article. You may need to replace the string generic type with something else if strings aren't what HttpSessionState holds.)
I have code like this:
Person firstPerson = personsEnumerable.First();
where personsEnumerable is an IEnumerable<Person>.
Now, ReSharper underlines the personsEnumerable variable and says "Possible iteration of IEnumerable". I understand what this warning means from other questions here on SO, but I'm wondering why it's showing it in my example. I think First() returns the first element and there's no need to iterate the collection at all?
Is this a 'general' warning message which is not applicable in my case (so I can ignore it), or do I not understand how First() actually works?
Yes, it returns the first element, but it still creates an enumerator to get that first element. You could add ToArray or ToList after Where to prevent this. However, if you just want the first element, you could use the overload of First that takes a Func delegate and remove the Where.
Consider this code:
var personsEnumerable = peopleList.OrderBy(p => p.Age);
var firstPerson = personsEnumerable.First();
foreach (var p in personsEnumerable)
{
// whatever
}
The first line of code sets things up to sort the list and create the results, but it doesn't actually do anything until you try to enumerate it. The second line of code, then would end up doing the sort. So although you're not enumerating the entire result, you're executing code to produce the result. And for a sort, that would entail doing all the heavy lifting.
The foreach loop would do the sort again before actually enumerating the results.
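One way to see the repeated work described above is to count how often the source is read. This is a sketch with made-up names, and it counts reads of the source feeding OrderBy rather than instrumenting the sort itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    // Counts how many source elements are read when the sorted sequence is
    // consumed twice: once by First() and once by a foreach loop.
    public static int SourceReads(bool materialize)
    {
        var ages = new List<int> { 42, 17, 30 };
        int reads = 0;
        IEnumerable<int> sorted = ages
            .Select(a => { reads++; return a; }) // runs on every enumeration
            .OrderBy(a => a);
        if (materialize)
        {
            sorted = sorted.ToList(); // sort once and keep the result
        }
        int youngest = sorted.First();
        foreach (int age in sorted)
        {
            // whatever
        }
        return reads;
    }

    public static void Main()
    {
        Console.WriteLine(SourceReads(materialize: false)); // 6: source read twice
        Console.WriteLine(SourceReads(materialize: true));  // 3: source read once
    }
}
```

Calling ToList() (or ToArray()) once is the usual way to address the ReSharper warning when the sequence really is needed more than once.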
I have some code that filters through a collection of sorted objects according to a filter value. For instance, I want to find the objects where Name=="searchquery". Then I want to take the top X values from that collection.
My questions:
My collection is a List<T>. Does this collection guarantee the sort order?
If so, is there a built-in way to find the top X objects that satisfy the condition? I'm looking for something like
collection.FindAll(o=>o.Name=="searchquery",100);
That would give me the top 100 objects that satisfy the condition. The reason is performance, once I've found my 100 objects, I don't want to keep checking the entire collection.
If I write:
collection.FindAll(o=>o.Name=="searchquery").Take(100);
will the runtime be intelligent enough to stop checking once it hits 100?
I can of course implement this myself, but if there is a built-in way (like a LINQ method) I'd prefer to use it.
collection.Where(o=>o.Name=="searchquery").Take(100)
The order should be in the same order as the original list, and it will stop checking once it takes 100 elements (Where returns an enumeration which is only evaluated as you take elements). From the documentation:
This method is implemented by using deferred execution. The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated either by calling its GetEnumerator method directly or by using foreach in Visual C# or For Each in Visual Basic.
If you need a different sort order, you will have to specify it (this of course means you have no choice but to examine all elements though).
Ok,
My collection is a List<T>. Does this collection guarantee the sort order?
No, but it will preserve the order of insertion.
If so, is there a built-in way to find the top X objects that satisfy the condition?
someEnumerable.Where(r => r.Name == "searchquery").Take(100)
If I write:
// Some linq that works
will the runtime be intelligent enough to stop checking once it hits 100?
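Yes. Where and Take are both lazily evaluated, so enumeration stops once 100 matching items have been produced.
Now, if you have an IList that has been sorted and you want to quickly iterate the top 100 items, do this.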
var list = sourceEnumerable.OrderBy(r => r.Name).ToList();
foreach(var r in list.Where(r => r.Name == "searchquery").Take(100))
{
// Do something
}
collection.Where(o=>o.Name=="searchquery").Take(100)
is the most correct answer, because behind the scenes Where uses deferred execution. Below is, in simplified form, how the Where method is implemented:
public static IEnumerable<T> Where<T>(this IEnumerable<T> collection, Func<T, bool> func)
{
foreach (var item in collection)
{
if (func(item))
{
yield return item;
}
}
}
So when you call Take(100), the loop only runs until it has found the first 100 items that satisfy the criteria.
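To make the early exit concrete, here is a sketch that counts predicate calls for Where(...).Take(100) and for the FindAll(...).Take(100) variant from the question. The data set and all names are invented for illustration: 1000 strings, every other one matching.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EarlyExitDemo
{
    // Hypothetical data: 1000 names, every other one equal to "searchquery".
    private static List<string> MakeNames() =>
        Enumerable.Range(0, 1000)
                  .Select(i => i % 2 == 0 ? "searchquery" : "other")
                  .ToList();

    // Where is lazy, so Take(100) stops pulling items after the 100th match.
    public static int ChecksWithWhere()
    {
        List<string> names = MakeNames();
        int checks = 0;
        List<string> top = names
            .Where(n => { checks++; return n == "searchquery"; })
            .Take(100)
            .ToList();
        return checks;
    }

    // List<T>.FindAll is eager: it tests every element and builds a full
    // result list before Take(100) gets to see anything.
    public static int ChecksWithFindAll()
    {
        List<string> names = MakeNames();
        int checks = 0;
        List<string> top = names
            .FindAll(n => { checks++; return n == "searchquery"; })
            .Take(100)
            .ToList();
        return checks;
    }

    public static void Main()
    {
        Console.WriteLine(ChecksWithWhere());   // well under 1000
        Console.WriteLine(ChecksWithFindAll()); // 1000
    }
}
```

So the runtime is not being clever in the FindAll case: the early exit comes entirely from Where being deferred.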
If you know for sure that the objects in your collection are not repeated (e.g. they have something like a primary key), then you can use SortedList instead of List<T>. This guarantees that your list is still sorted when you filter it by some criterion. Have a look here for a SortedList example:
http://msdn.microsoft.com/en-us/library/system.collections.sortedlist(v=vs.100).aspx
What's the best/easiest way to obtain a count of items within an IEnumerable collection without enumerating over all of the items in the collection?
Possible with LINQ or Lambda?
In any case, you have to loop through it. Linq offers the Count method:
var result = myenum.Count();
The solution depends on why you don't want to enumerate through the collection.
If it's because enumerating the collection might be slow, then there is no solution that will be faster. You might want to consider using an ICollection instead if possible. Unless the enumeration is remarkably slow (e.g. it reads items from disk) speed shouldn't be a problem though.
If it's because enumerating the collection will require more code then it's already been written for you in the form of the .Count() extension method. Just use MyEnumerable.Count().
If it's because you want to be able to enumerate the collection after you've counted then the .Count() extension method allows for this. You can even call .Count() on a collection you're in the middle of enumerating and it will carry on from where it was before the count. For example:
IEnumerable<int> myEnumerable = Series.Generate(5);
foreach (int item in myEnumerable)
{
Console.WriteLine(item + " (" + myEnumerable.Count() + ")");
}
will give the results
0 (5)
1 (5)
2 (5)
3 (5)
4 (5)
If it's because the enumeration has side effects (e.g. writes to disk/console) or is dependent on variables that may change between counting and enumerating (e.g. reads from disk) [N.B. If possible, I would suggest rethinking the architecture as this can cause a lot of problems] then one possibility to consider is reading the enumeration into intermediate storage. For example:
List<int> seriesAsList = Series.Generate(5).ToList();
All of the above assume you can't change the type (i.e. it is returned from a library that you do not own). If possible you might want to consider changing to use an ICollection or IList (ICollection being more widely scoped than IList) which has a Count property on it.
You will have to enumerate to get a count. Other constructs like the List keep a running count.
Use this.
IEnumerable list =..........;
list.OfType<T>().Count()
It will return the count.
There's also IList or ICollection, if you want to use a construct that is still somewhat flexible, but also has the feature you require. They both imply IEnumerable.
It also depends on what you want to achieve by counting. If you only want to find out whether the enumerable collection has any elements, you could use
myEnumerable.Any() rather than myEnumerable.Count(): the former stops after the first element, while the latter enumerates all the elements.
An IEnumerable will have to iterate through every item to get the full count.
If you just need to check whether there are one or more items in an IEnumerable, a more efficient method is to check if there are any. Any() only checks that there is a value and does not loop through everything.
IEnumerable<string> myStrings = new List<string> { "one", "two", "three" };
bool hasValues = myStrings.Any();
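The difference between Any() and Count() can be checked with a sequence that reports how many items it has produced. This is a sketch: Numbers is a made-up iterator method, not a library call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCountDemo
{
    // Hypothetical lazy sequence that invokes onYield each time it
    // produces an item.
    private static IEnumerable<int> Numbers(int n, Action onYield)
    {
        for (int i = 0; i < n; i++)
        {
            onYield();
            yield return i;
        }
    }

    public static int ItemsProducedByAny()
    {
        int produced = 0;
        Numbers(1000, () => produced++).Any(); // stops after the first item
        return produced;
    }

    public static int ItemsProducedByCount()
    {
        int produced = 0;
        Numbers(1000, () => produced++).Count(); // walks the whole sequence
        return produced;
    }

    public static void Main()
    {
        Console.WriteLine(ItemsProducedByAny());   // 1
        Console.WriteLine(ItemsProducedByCount()); // 1000
    }
}
```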
Not possible with LINQ, as calling .Count(...) does enumerate the collection. If you're running into the problem where you can't iterate through a collection twice, try this:
List<MyTableItem> myList = dataContext.MyTable.ToList();
int myTableCount = myList.Count;
foreach (MyTableItem item in myList)
{
...
}
If you need to count and then loop you may be better off with a list.
If you're using count to check for members you can use Any() to avoid enumerating the entire collection.
The best solution, I think, is to do the following:
using System.Linq.Dynamic;
myEnumerable.AsQueryable().Count()
When I want to use the Count property, I use IList<T>, which extends the IEnumerable<T> and ICollection<T> interfaces. In the example below the underlying data structure is an array. I stepped through with the VS debugger and found that the .Count property below returns the Array.Length property.
IList<string> FileServerVideos = Directory.GetFiles(VIDEOSERVERPATH, "*.mp4");
if (FileServerVideos.Count == 0)
return;