I'm trying to iterate over a collection and call TransformCvlValue for each record.
fields?.Select(x => TransformCvlValue(x, cvl)).ToList();
If I call .ToList() it works as expected.
Why does .ToList() need to be called?
Is there another way of doing this?
Calling Select() on an IEnumerable<T> does not immediately execute the action but builds a new IEnumerable<T> with the specified transform / action.
Generally, the delegates passed to LINQ extension methods are only executed when the IEnumerable<T> is enumerated, for example by iterating over it in a foreach or by calling .ToList().
Select() should mostly be used when you really want to project elements from one type to another, e.g. by applying a projection to an element. It should not be used when you want to call a method for every element in an IEnumerable<T>.
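To make the deferred execution described above concrete, here is a minimal sketch. The Transform method and the Calls counter are hypothetical stand-ins for TransformCvlValue, introduced only to count how often the delegate runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Hypothetical counter: how many times Transform has been invoked.
    public static int Calls;

    // Hypothetical stand-in for TransformCvlValue.
    public static int Transform(int value)
    {
        Calls++;
        return value * 2;
    }

    public static void Main()
    {
        var fields = new List<int> { 1, 2, 3 };

        // Building the query does not run Transform.
        IEnumerable<int> query = fields.Select(x => Transform(x));
        Console.WriteLine(Calls); // prints 0

        // Materializing the query runs Transform once per element.
        List<int> results = query.ToList();
        Console.WriteLine(Calls); // prints 3
    }
}
```

Enumerating the same query a second time would run Transform again for every element, which is another reason to keep side effects out of Select.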
For me, the most readable and straightforward way would be to simply iterate over the fields:
if (fields != null)
{
    foreach (var field in fields)
    {
        TransformCvlValue(field, cvl);
    }
}
This makes clear what you intend the code to do and is easy to understand when you or your colleagues have to maintain the code in the future.
Related
Please have a look at the following code:
var list = new List<MyClass>();
...
var selection = list
    .Preprocess()
    .Where(x => x.MyProperty == 42)
    .ToList();
I want to implement Preprocess in such a way that it processes only the items selected by the predicate that follows. So, if list contains thousands of objects, I don't want Preprocess to process all of them, but only those that are selected by the Where clause.
I know this sounds like it would be better to move Preprocess to the end of the query, but I have my reasons to do it that way.
Is that possible with IQueryable<T>? Or does it behave like IEnumerable, where the whole LINQ-to-objects pipeline is strictly sequential?
For your purposes, IQueryable here is no different from IEnumerable, which is, as you said, purely sequential. Immediately after you invoke Where(), there's effectively no trace of the original "collection".
Now, you could theoretically do what you want, but that would require rolling "your own LINQ". Preprocess() could return some kind of PreprocessedEnumerable, all the custom operators would attach to it and the final ToList() will do the reordering of the calls.
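A rough sketch of what such a "roll your own LINQ" wrapper could look like. Everything here is hypothetical: PreprocessedEnumerable<T>, its custom Where and ToList, and the choice to pass the preprocessing step as an Action<T> are assumptions made for illustration, not an existing API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: records the preprocessing step and the filters
// instead of running them, so ToList() can decide the order.
public sealed class PreprocessedEnumerable<T>
{
    private readonly IEnumerable<T> _source;
    private readonly Action<T> _preprocess;
    private readonly Func<T, bool> _predicate;

    public PreprocessedEnumerable(IEnumerable<T> source, Action<T> preprocess, Func<T, bool> predicate)
    {
        _source = source;
        _preprocess = preprocess;
        _predicate = predicate;
    }

    // Custom Where: combines the new predicate with the recorded one.
    public PreprocessedEnumerable<T> Where(Func<T, bool> predicate)
    {
        Func<T, bool> previous = _predicate;
        return new PreprocessedEnumerable<T>(
            _source, _preprocess, x => previous(x) && predicate(x));
    }

    // Custom ToList: filters first, then preprocesses only the survivors.
    public List<T> ToList()
    {
        var result = new List<T>();
        foreach (T item in _source)
        {
            if (!_predicate(item)) continue;
            _preprocess(item);
            result.Add(item);
        }
        return result;
    }
}

public static class PreprocessExtensions
{
    // Hypothetical entry point: starts with a predicate that accepts everything.
    public static PreprocessedEnumerable<T> Preprocess<T>(
        this IEnumerable<T> source, Action<T> preprocess)
    {
        return new PreprocessedEnumerable<T>(source, preprocess, _ => true);
    }
}

public static class PreprocessDemo
{
    public static void Main()
    {
        var processed = new List<int>();
        var selection = new List<int> { 1, 2, 3, 4 }
            .Preprocess(x => processed.Add(x))
            .Where(x => x > 2)
            .ToList();

        Console.WriteLine(string.Join(",", processed)); // prints 3,4
        Console.WriteLine(string.Join(",", selection)); // prints 3,4
    }
}
```

Because the wrapper does not implement IEnumerable<T>, only the operators defined on it are available after Preprocess(); every other LINQ operator you need would have to be re-implemented the same way, which is the cost the answer alludes to.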
I need to remove items from the HttpSession collection. In the following code, myList contains the same items as Session. If there are items in myList/Session that are not in itemsToRemove, they should be deleted from the session collection.
However, I'm not sure what the lambda syntax should look like. The following isn't correct.
myList.ForEach(x => !itemsToRemove.Contains(x) { Session.Remove(x) });
Any ideas how I can use a lambda expression to put everything on one line to accomplish this task?
Also, is there a way to avoid creating the intermediate list (myList)? I'm only doing that because I can't remove items from Session while iterating through it.
The most naïve way:
myList.Where(x => !itemsToRemove.Contains(x)) // LINQ extension method
      .ToList()                               // required, because ForEach is a List<T> method
      .ForEach(x => Session.Remove(x));
Also you can use this:
myList.Except(itemsToRemove)
      .ToList()
      .ForEach(x => Session.Remove(x));
But to use ForEach, the underlying type has to be List<T>, so you need to call ToList() first, which causes one extra enumeration of the whole collection.
I would do this instead:
foreach (var x in myList.Except(itemsToRemove))
{
    Session.Remove(x);
}
This will minimize the number of enumerations.
First off, abatischev's answer is excellent. It's ideal from both a performance perspective and a readability perspective. If, however, you really want to cram all the functionality into one statement (which I don't recommend), you could try the following:
Session.OfType<string>()
       .Except(itemsToRemove)
       .ToList()
       .ForEach(x => Session.Remove(x));
As abatischev mentioned, the ToList() call costs you an extra enumeration through the collection, which could have a non-trivial performance impact if the collection has a large number of elements in it. However, it means the ForEach() call iterates over a newly created List<string>, which fills the role of your myList and lets you remove items from the Session (since you're iterating through that temporary list, rather than the Session).
(Note that I haven't worked with HttpSessionState objects myself, merely looked at their MSDN article. You may need to replace the string generic type with something else if strings aren't what HttpSessionState holds.)
I have written some code that has this form:
var queryResult = from instance in someCollection
                  where instance meets some criteria
                  select instance;
foreach (InstanceType instance in queryResult.ToList()) {
    instance.SomeMethod();
}
This seems a bit redundant in that the query iterates over the collection and then there is another iteration to invoke the method on all found instances. It would be nice to be able to invoke the instance method within the query, rather than having to write an additional loop.
How could someone accomplish what the code above does with just a single query?
You can use ForEach to call void methods:
someCollection
    .Where(instance => instance meets some criteria)
    .ToList()
    .ForEach(item => item.SomeMethod(param1, param2, ...)); // ForEach(SomeMethod) works only if SomeMethod takes the item as its single parameter
Just remove the .ToList() from your code and you'll be looping over the collection only once.
In general, it is advisable not to have side effects in your queries, and methods like instance.SomeMethod() typically are side effects.
Apart from removing the ToList call (which is really the additional and redundant loop here), the code snippet looks fine to me.
This looks a little better. I'm not sure of the actual number of iterations, though.
foreach (var instance in someCollection.Where(instance meets some criteria))
{
    instance.SomeMethod();
}
So basically I have this method.
public List<Customer> FilterCustomersByStatus(List<Customer> source, string status)
{
    return (List<Customer>)source.Where(c => c.Status == status);
}
It throws an error saying that it cannot cast:
Unable to cast object of type 'WhereListIterator`1[AppDataAcces.Customer]' to type 'System.Collections.Generic.List`1[AppDataAcces.Customer]'.
Why? Since the underlying type is the same, does the Enumerable.Where create a new instance of WhereListIterator, and if so why would anyone do this? That's an unnecessary loss of performance and functionality, since I always have to create a new list (.ToList()).
does the Enumerable.Where create a new instance of WhereListIterator
Yes.
and if so why would anyone do this
Because it allows lazy streaming behavior. Where won't have to filter all the list if its consumer wants only first or second entry. This is normal for LINQ.
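A small sketch of that streaming behavior, assuming a hypothetical helper CountPredicateCalls that counts how often the Where predicate runs before First() returns:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyWhereDemo
{
    // Hypothetical helper: returns how many elements Where had to test
    // before First() found a match.
    public static int CountPredicateCalls(IEnumerable<int> source)
    {
        int calls = 0;
        _ = source.Where(x => { calls++; return x % 2 == 0; }).First();
        return calls;
    }

    public static void Main()
    {
        // A million elements, but only the first two are ever tested.
        Console.WriteLine(CountPredicateCalls(Enumerable.Range(1, 1000000))); // prints 2
    }
}
```

Had Where returned a fully built List<int>, all one million elements would have been tested before First() could run.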
because that's an unnecessary loss of performance and functionality since I always have to create a new list (.ToList())
That "loss of performance and functionality" comes from your design. You don't need List<Customer> after filtering, because it's pointless to do any modifications on it.
Update: "why is it implemented so"
Because it is implemented over IEnumerable, not IList. And thus it looks like IEnumerable and it quacks like IEnumerable.
Besides, it's just so much easier to implement it this way. Imagine for a moment that you have to write Where over IList. Which has to return IList. What should it do? Return a proxy over original list? You'll suffer huge performance penalties on every access. Return new list with filtered items? It'll be the same as doing Where().ToList(). Return original list but with all non-matching items deleted? That's what RemoveAll is for, why make another method.
And remember, LINQ tries to play functional, and tries to treat objects as immutables.
As others pointed out, you need to use ToList to convert the result to List<T>.
The reason is that Where is lazily evaluated, so Where does not really filter the data.
What it does is create an IEnumerable which filters data as needed.
Lazy evaluation has several benefits. It might be faster, it allows using Where with infinite IEnumerables, etc.
ToList forces the result to be converted to List<T>, which seems to be what you want.
The Where extension method filters and returns an IEnumerable<TSource>, so you need to call .ToList() to convert the result back to a list:
public List<Customer> FilterCustomersByStatus(List<Customer> source, string status)
{
    return source.Where(c => c.Status == status).ToList(); // returns a List<Customer>
}
The difference between IEnumerable and IList is that the enumerable returned by Where doesn't hold any data of its own; it hands out an iterator that walks through the source as you request each new item (for example, with a foreach loop). The list, on the other hand, is a copy of the data. In your case, to create the List, the ToList() method iterates through the entire data and adds each item to a List object.
Depending on the usage you are planning, both have advantages and disadvantages. For example, if you are planning to use the entire data more than once, you should go with the list, but if you are planning to use it once, or to query it again using LINQ, the enumerable should be your choice.
Edit:
The answer to the question of why the return type of Where is WhereListIterator instead of List is partly down to how LINQ works. For example, if you had another Where or another LINQ method following the first, the iterators are chained (and in some cases, such as consecutive Where calls, combined), so each element flows through the entire method chain in a single pass and only the iterator for the final query is returned. If the first Where returned a List instead, each LINQ method in the chain would execute separately over the whole data.
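The single-pass behavior of a chain can be observed with a short sketch. The log list and the ChainedWhereDemo class are hypothetical and exist only to record the order in which the two predicates run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainedWhereDemo
{
    // Hypothetical helper: runs two chained Where calls and returns the
    // order in which the predicates were evaluated.
    public static List<string> Run()
    {
        var log = new List<string>();
        var query = new[] { 1, 2, 3 }
            .Where(x => { log.Add("first:" + x); return x != 2; })
            .Where(x => { log.Add("second:" + x); return true; });

        // Nothing has been logged yet; enumeration happens here.
        query.ToList();
        return log;
    }

    public static void Main()
    {
        // Each element passes through both filters before the next one is read.
        Console.WriteLine(string.Join(", ", Run()));
        // prints first:1, second:1, first:2, first:3, second:3
    }
}
```

If the first Where produced a complete list before the second one started, the log would instead show all "first" entries followed by all "second" entries.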
Try this:
public List<Customer> FilterCustomersByStatus(List<Customer> source, string status)
{
    return source.Where(c => c.Status == status).ToList();
}
I'm using the new ReSharper version 6. In several places in my code it has underlined some text and warned me of a "Possible multiple enumeration of IEnumerable".
I understand what this means, and have taken the advice where appropriate, but in some cases I'm not sure it's actually a big deal.
Like in the following code:
var properties = Context.ObjectStateManager.GetObjectStateEntry(this).GetModifiedProperties();
if (properties.Contains("Property1") || properties.Contains("Property2") || properties.Contains("Property3")) {
...
}
It's underlining each mention of properties on the second line, warning that I am enumerating over this IEnumerable multiple times.
If I add .ToList() to the end of line 1 (turning properties from an IEnumerable<string> into a List<string>), the warnings go away.
But surely, if I convert it to a List, then it will enumerate over the entire IEnumerable to build the List in the first place, and then enumerate over the List as required to find the properties (i.e. 1 full enumeration, and 3 partial enumerations). Whereas in my original code, it is only doing the 3 partial enumerations.
Am I wrong? What is the best method here?
I don't know exactly what your properties really is here - but if it's essentially representing an unmaterialized database query, then your if statement will perform three queries.
I suspect it would be better to do:
string[] propertiesToFind = { "Property1", "Property2", "Property3" };
if (properties.Any(x => propertiesToFind.Contains(x)))
{
    ...
}
That will logically only iterate over the sequence once - and if there's a database query involved, it may well be able to just use a SQL "IN" clause to do it all in the database in a single query.
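The difference in the number of enumerations can be checked with a small sketch. GetProperties and the Enumerations counter are hypothetical stand-ins for GetModifiedProperties(); the sketch assumes the real sequence is lazy in the same way an iterator method is.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    // Hypothetical counter: how many times the sequence has been enumerated.
    public static int Enumerations;

    // Hypothetical lazy source standing in for GetModifiedProperties().
    public static IEnumerable<string> GetProperties()
    {
        Enumerations++; // runs each time an enumeration starts
        yield return "Property3";
        yield return "Other";
    }

    public static void Main()
    {
        IEnumerable<string> properties = GetProperties();

        // Three separate Contains calls: up to three enumerations.
        bool found = properties.Contains("Property1")
                  || properties.Contains("Property2")
                  || properties.Contains("Property3");
        Console.WriteLine(Enumerations); // prints 3

        // One Any call: a single enumeration.
        Enumerations = 0;
        string[] propertiesToFind = { "Property1", "Property2", "Property3" };
        bool foundOnce = properties.Any(x => propertiesToFind.Contains(x));
        Console.WriteLine(Enumerations); // prints 1
    }
}
```

With an in-memory iterator like this one the extra enumerations are cheap; the concern is a source where each enumeration triggers a database query or other expensive work.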
If you invoke Contains() on an IEnumerable, you get the extension method, which in general just iterates through the items until it finds a match. List<T>.Contains() is also a linear search, but it runs over data that is already in memory, so repeating it is cheap and has no side effects; that is why there is no warning for a list.
The extension method only knows that it has an IEnumerable, although Enumerable.Contains() does check whether the source implements ICollection<T> and, if so, delegates to that collection's own Contains().