Is a statement recalculated in every iteration when used in LINQ? - c#

For instance
myEnumerable.Where(v => v != myDictionary["someKey"])
When this query is executed, is myDictionary["someKey"] evaluated on every iteration (meaning the dictionary is queried for the key each time), or is the result of myDictionary["someKey"] reused after the first iteration?

The result of myDictionary["someKey"] will not be cached (see the edit below); it will be evaluated for every item of myEnumerable. However, you can still cache it manually:
var someValue = myDictionary["someKey"];
myEnumerable.Where(v => v != someValue)
Also note that if you plan to iterate over that IEnumerable multiple times, it is best to materialize it via ToList(). Otherwise, the deferred query is executed again on every enumeration.
var query = myEnumerable.Where(v => v != myDictionary["someKey"]);
foreach (var item in query) { /* ... */}
foreach (var item in query) { /* ... */}
In the above example, the Where clause is executed twice.
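The effect of materializing can be made visible by counting predicate calls. The following is a minimal sketch, not part of the original answer; the class name DeferredCountDemo and the helper CountPredicateCalls are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredCountDemo
{
    // Hypothetical helper: returns how many times the Where predicate ran
    // after enumerating the query the given number of times.
    public static int CountPredicateCalls(bool materialize, int passes)
    {
        var source = new List<int> { 1, 2, 3 };
        int calls = 0;

        // The lambda increments the counter each time Where tests an element.
        IEnumerable<int> query = source.Where(v => { calls++; return v != 2; });

        // ToList() runs the query once and stores the results in memory.
        if (materialize)
            query = query.ToList();

        for (int i = 0; i < passes; i++)
        {
            foreach (var item in query) { /* consume the sequence */ }
        }

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountPredicateCalls(materialize: false, passes: 2)); // 6
        Console.WriteLine(CountPredicateCalls(materialize: true, passes: 2));  // 3
    }
}
```

Without ToList() the predicate runs for every element on every pass; with ToList() it runs once per element in total.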
EDIT: As @LucasTrzesniewski has pointed out, this only holds true for LINQ-to-Objects, because there the query is evaluated in memory. For LINQ-to-Entities it is a little different: the query is converted into a SQL query and then executed in the database in order to avoid round trips.

Here's a really simple demo (and please, don't try this at home):
var myDictionary = new Dictionary<string,string>() { { "someKey", "someValue" } };
var myEnumerable = new List<string> { "someValue", "someOtherValue" };
var test = myEnumerable.Where(v => v == myDictionary["someKey"]);
foreach (var t in test)
{
Console.WriteLine(t);
myDictionary["someKey"] = "someOtherValue";
}
If myDictionary["someKey"] were only evaluated once, then changing the value of myDictionary["someKey"] wouldn't change anything. But if you run the code, you will see that it prints both someValue and someOtherValue. If you comment out the line that changes the dictionary value, then you will only see someValue.
As @Lucas Trzesniewski points out in the comments to the other answer, this applies to LINQ-to-Objects. There are a number of important differences between LINQ-to-Objects and LINQ-to-SQL.

The lambda expression you supply to the LINQ Where extension method is simply a Func<> delegate. The method is executed for each item in the IEnumerable<T>, receiving the current item as a parameter. It doesn't do anything special other than that. Your code is somewhat similar to:
var myTempCollection = new List<MyClass>();
foreach(MyClass item in myEnumerable)
{
if (item != myDictionary["someKey"])
{
myTempCollection.Add(item);
}
}
var result = myTempCollection;
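The loop above builds the whole list eagerly, whereas the real Where is lazy. A closer approximation, under the assumption that an iterator method is a fair stand-in, is sketched below. MyWhere is a hypothetical name and this is not the actual framework source.

```csharp
using System;
using System.Collections.Generic;

public static class LazyWhereSketch
{
    // Hypothetical stand-in for Enumerable.Where, for illustration only.
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            // The delegate runs here, once per element, every time the sequence is enumerated.
            if (predicate(item))
                yield return item;
        }
    }

    public static void Main()
    {
        var lookup = new Dictionary<string, string> { { "someKey", "b" } };
        var items = new List<string> { "a", "b", "c" };

        // lookup["someKey"] is read inside the predicate, so it is read once per element.
        var filtered = items.MyWhere(v => v != lookup["someKey"]);
        Console.WriteLine(string.Join(",", filtered)); // a,c
    }
}
```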

It depends on the QueryProvider implementation. For example, the ObjectQueryProvider used by Linq-to-objects will access it on every iteration. For Linq-to-entities, it will access it once and then send that value to the database server.

Related

LINQ query becomes invalid upon editing result data?

(Please see the answer I wrote for more understanding of the situation.)
Below is a query which works great, operating on selected rows of table STUDENTS. Then one edit destroys the query variable. What's wrong here?
students is rows selected from an import Datatable defined in part by:
importTable.Columns.Add("SECTION", typeof(string));
importTable.Columns.Add("NUMBER", typeof(string));
importTable.Columns.Add("ID", typeof(string));
(Because the DataTable is untyped, I need to cast the data into string to use the fields).
Then called by:
IEnumerable<DataRow> s = importTable.AsEnumerable();
IEnumerable<DataRow> t = s
.OrderBy(r => r["HALL"]);
IEnumerable<DataRow> sortedTable = t
.OrderBy(r =>
{ //if (r["ID"] is DBNull)
// return "";
//else
return r["ID"]; // ERROR
});
IEnumerable<DataRow> tue = sortedTable.Where(r => r["DAY"].Equals("TUE"));
IEnumerable<DataRow> wed = sortedTable.Where(r => r["DAY"].Equals("WED"));
AssignSections(tue);
AssignSections(wed);
Here is the query:
public void AssignSections(IEnumerable<DataRow> students)
{
IEnumerable<IEnumerable<DataRow>> query = from e in students.AsEnumerable()
orderby (e["SHORTSCHOOL"] as string).PadRight(30) + e["SEED"] as string
group e by new { DAY=e["DAY"], GRADE=e["GRADE"] } into g
orderby g.Key.GRADE as string
select g.AsEnumerable();
var queryList = query.ToList(); // ArgumentException during "WED" call
foreach (var grade in query)
foreach (var student in grade)
if (student["ID"] == DBNull.Value)
{
student["SECTION"] = "S";
student["ID"] = "ID1";
}
}
Assigning SECTION works, NO PROBLEM. Assigning ID, however, makes query appear invalid when inspected. Future uses of query also prove to be invalid (though the foreach finishes fine).
For what it's worth, grade is just fine, but students is also invalidated, though the original table seems to be fine as well.
No magic here. It's a combination of LINQ deferred execution and the use of DBNull, which cannot be compared with values of other types.
Deferred execution has been explained many times, so I'm not going to spend time on it. In short, the query is executed only when (and every time) it is enumerated. Enumerating means foreach, ToList, etc., and technically speaking it happens when GetEnumerator of the enumerable (or the first MoveNext of the enumerator) is called.
All you need to remember from the above is that LINQ queries returning IEnumerable<T> (or IQueryable<T>) are not executed (evaluated) at the time you define them, but every time you enumerate them (directly or indirectly). This should explain the "The answer surprisingly to me is that LINQ reorders code" part of your own answer. No, LINQ does not reorder the code; it's your code that does that, by reevaluating the LINQ queries at points different from the place where you define your query variables. If you want to evaluate them just once at a specific point, add ToList, ToArray, or a similar method, which enumerates the query and stores the result in an in-memory collection, and use that collection for further processing. It will still be an IEnumerable<T>, but further enumerations will enumerate the stored result rather than reevaluate the query.
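A small sketch of the difference between a live query and a stored result, assuming a plain List<int> as the source. SnapshotDemo and its member names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    public static (int LiveCount, int SnapshotCount) Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Defined but not executed yet.
        IEnumerable<int> live = numbers.Where(n => n > 1);

        // Executed right here; the results are copied into a new list.
        List<int> snapshot = numbers.Where(n => n > 1).ToList();

        // Change the source after both variables were assigned.
        numbers.Add(4);

        // live is evaluated now and sees 2, 3, 4; snapshot still holds 2, 3.
        return (live.Count(), snapshot.Count);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.LiveCount + " " + result.SnapshotCount); // 3 2
    }
}
```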
The main issue is DBNull. From your explanation, it looks like all the ID values are initially DBNull, so the first query runs fine (DBNull knows how to compare to itself :). Once the source contains at least one value that is not DBNull, any further query that uses OrderBy on that column with the default IComparer will fail.
It can easily be reproduced w/o data tables with the following simple code:
var data = new[]
{
new { Id = (object)DBNull.Value },
new { Id = (object)DBNull.Value }
};
var query = data.OrderBy(e => e.Id);
query.ToList(); // Success
data[1] = new { Id = (object)"whatever" };
query.ToList(); // Fail
showing the deferred query execution and reevaluation, or directly (to prove that the problem is not with editing):
new[]
{
new { Id = (object)DBNull.Value },
new { Id = (object)"whatever" }
}
.OrderBy(e => e.Id)
.ToList(); // Fail
The solution is to avoid DBNull altogether. The easiest way with DataTable (and a much better one than as string or ToString()) is to use the DataRowExtensions.Field extension methods instead of the object-returning indexer. Besides providing strongly typed access to the columns, they also handle DBNull for you (converting it to null when you request string or a nullable type), so you won't experience such issues.
It can be proved by changing your problematic code to
.OrderBy(r => r.Field<string>("ID"))
and the problem will be gone. I strongly recommend doing that for other column accessors as well.
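A self-contained sketch of the Field<string> approach on a tiny table. The table contents and the class name FieldOrderingDemo are invented for illustration; on .NET Framework this assumes a reference to System.Data.DataSetExtensions.

```csharp
using System;
using System.Data;
using System.Linq;

public static class FieldOrderingDemo
{
    public static string[] SortedIds()
    {
        var table = new DataTable();
        table.Columns.Add("ID", typeof(string));
        table.Rows.Add(new object[] { DBNull.Value }); // row without an ID
        table.Rows.Add(new object[] { "ID1" });        // row with an ID

        // Field<string> converts DBNull to null, which the default string comparer can sort.
        return table.AsEnumerable()
                    .OrderBy(r => r.Field<string>("ID"))
                    .Select(r => r.Field<string>("ID") ?? "(null)")
                    .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", SortedIds())); // (null),ID1
    }
}
```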
The answer surprisingly to me is that LINQ reorders code. The context was this:
IEnumerable<DataRow> s = importTable.AsEnumerable();
IEnumerable<DataRow> t = s
.OrderBy(r => r["HALL"]);
IEnumerable<DataRow> sortedTable = t
.OrderBy(r =>
{ //if (r["ID"] is DBNull)
// return "";
//else
return r["ID"]; // ERROR
});
IEnumerable<DataRow> tue = sortedTable.Where(r => r["DAY"].Equals("TUE"));
IEnumerable<DataRow> wed = sortedTable.Where(r => r["DAY"].Equals("WED"));
AssignSections(tue);
AssignSections(wed);
The 3 commented lines indicate the fault. And what happened: sortedTable was partially initialized in order to feed the Where clause for initializing tue. But then the sortedTable was completed to initialize wed AFTER the call to assign wed appeared in the code, but just in time to use wed in the query constructed in AssignSections!
So the ERROR arose during AssignSections, when the code detoured to completing the initialization of sortedTable, and I could detect this by adding the 3 disabled lines and setting a breakpoint on the return ""; line.
Magic?
DBNull and null are not the same...
As your original error message says, "Object must be of type string" (to be assigned to a string).
DBNull can't be cast to a string; it is a class of its own...
You need to handle this case in your code.
See this link for a simple helper method:
Unable to cast object of type 'System.DBNull' to type 'System.String'
using System;
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
DBNull dbNull = DBNull.Value;
Console.WriteLine(typeof(string).IsAssignableFrom(typeof(DBNull)));//False
Console.WriteLine(dbNull is string); //False
//Console.WriteLine((string)dbNull); // compile time error
//Console.WriteLine(dbNull as string); // compile time error
Console.ReadLine();
}
}
}
Also, make sure you read how "lazy loading"/"deferred execution" works in LINQ/IEnumerable.
You don't have to use IEnumerable all the time, especially if you are not sure how it works.

IEnumerable takes too long to process when filtering on it

I have a feeling I know what the reason for this behavior is, but I don't know what the best way of resolving it would be.
I have built a LinqToSQL query:
public IEnumerable<AllConditionByCountry> GenerateConditions(int paramCountryId)
{
var AllConditionsByCountry =
(from cd in db.tblConditionDescriptions...
join...
join...
select new AllConditionByCountry
{
CountryID = cd.CountryID,
ConditionDescription = cd.ConditionDescription,
ConditionID = cd.ConditionID,
...
...
}).OrderBy(x => x.CountryID).AsEnumerable<AllConditionByCountry>();
return AllConditionsByCountry;
}
This query returns about 9500+ rows of data.
I'm calling this from my Controller like so:
svcGenerateConditions generateConditions = new svcGenerateConditions(db);
IEnumerable<AllConditionByCountry> AllConditionsByCountry;
AllConditionsByCountry = generateConditions.GenerateConditions(1);
Which then I'm looping through:
foreach (var record in AllConditionsByCountry)
{
...
...
...
This is where I think the issue is:
var rList = AllConditionsByCountry
.Where(x => x.ConditionID == conditionID)
.Select(x => x)
.AsEnumerable();
I'm doing a nested loop based on the data that I'm gathering from the above query (utilizing the original data I'm getting from AllConditionByCountry). I think this is where my issue lies. When it filters the data, it SLOWS down greatly.
Basically this process writes out a bunch of files (.json, .html)
I've tested this at first using just ADO.Net and to run through all of these records it took about 4 seconds. Using EF (stored procedure or LinqToSql) it takes minutes.
Is there anything I should do with my types of lists that I'm using or is that just the price of using LinqToSql?
I've tried to return List<AllConditionByCountry>, IQueryable, IEnumerable from my GenerateConditions method. List took a very long time (similar to what I'm seeing now). IQueryable I got errors when I tried to do the 2nd filter (Query results cannot be enumerated more than once).
I have run this same Linq statement in LinqPad and it returns in less than a second.
I'm happy to add any additional information.
Please let me know.
Edit:
foreach (var record in AllConditionsByCountry)
{
...
...
...
var rList = AllConditionsByCountry
.Where(x => x.ConditionID == conditionID)
.Select(x => x)
.AsEnumerable();
conditionDescriptionTypeID = item.ConditionDescriptionTypeId;
id = conditionDescriptionTypeID + "_" + count.ToString();
...
...
}
TL;DR: You're making 9895 queries against the database instead of one. You need to rewrite your query such that only one is executed. Look into how IEnumerable works for some hints into doing this.
Ah, yeah, that foreach loop is your problem.
foreach (var record in AllConditionsByCountry)
{
...
...
...
var rList = AllConditionsByCountry.Where(x => x.ConditionID == conditionID).Select(x => x).AsEnumerable();
conditionDescriptionTypeID = item.ConditionDescriptionTypeId;
id = conditionDescriptionTypeID + "_" + count.ToString();
...
...
}
Linq-to-SQL works similarly to Linq in that it (loosely speaking) appends functions to a chain to be executed when the enumerable is iterated - for example,
Enumerable.Repeat(1, 1).Select<int, int>(x => throw new Exception());
This doesn't actually cause your code to crash because the enumerable is never iterated. Linq-to-SQL operates on a similar principle. So, when you define this:
var AllConditionsByCountry =
(from cd in db.tblConditionDescriptions...
join...
join...
select new AllConditionByCountry
{
CountryID = cd.CountryID,
ConditionDescription = cd.ConditionDescription,
ConditionID = cd.ConditionID,
...
...
}).OrderBy(x => x.CountryID).AsEnumerable<AllConditionByCountry>();
You're not executing anything against a database, you're just instructing C# to build a query that does this when it is iterated. That's why just declaring this query is fast.
Your problem comes when you get to your foreach loop. When you hit the loop, you signal that you want to start iterating the AllConditionsByCountry iterator. This causes .NET to go off and execute the initial query, which takes time.
When you call AllConditionsByCountry.Where(x => x.ConditionID == conditionID) inside the loop, you're constructing another iterator that doesn't do anything by itself. Presumably you actually use the result of rList within that loop, however, so you're essentially constructing N queries to be executed against the database (where N is the size of AllConditionsByCountry).
This leads to a scenario where you are effectively executing approximately 9501 queries against the database - 1 for your initial query and then one query for each element within the original query. The drastic slowdown compared to ADO.NET is because you're probably making 9500 more queries than you were originally.
You ideally should change the code so that there is one and only one query executed against the database. You've a couple of options:
Rewrite the Linq-to-SQL query such that all of the legwork is done by the SQL database
Rewrite the Linq-to-SQL query so it looks like this
var conditions = AllConditionsByCountry.ToList();
foreach (var record in conditions)
{
var rList = conditions.Where(....);
}
Note that in that example I am searching conditions rather than AllConditionsByCountry - .ToList() will return a list that has already been iterated so you do not create any more database queries. This will still be slow (since you're doing O(N^2) over 9500 records), but it will still be faster than creating 9500 queries since it will all be done in memory.
Just rewrite the query in ADO.NET if you're more comfortable with raw SQL than Linq-to-SQL. There's nothing wrong with this.
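For the in-memory option, the O(N^2) rescanning mentioned above can be avoided by grouping the materialized rows once. This is a sketch using ToLookup, which is a different technique from the plain Where in the example; ConditionRow is a hypothetical, trimmed-down stand-in for the row type in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down version of the row type from the question.
public class ConditionRow
{
    public int ConditionID { get; set; }
    public string ConditionDescription { get; set; }
}

public static class LookupDemo
{
    public static int CountMatches(IEnumerable<ConditionRow> rows, int conditionId)
    {
        // Materialize once; with a real database query this would be the single round trip.
        List<ConditionRow> conditions = rows.ToList();

        // Group by key once, so later lookups do not rescan the whole list.
        ILookup<int, ConditionRow> byCondition = conditions.ToLookup(c => c.ConditionID);

        // Indexing a lookup with a missing key yields an empty sequence, not an exception.
        return byCondition[conditionId].Count();
    }

    public static void Main()
    {
        var rows = new[]
        {
            new ConditionRow { ConditionID = 1, ConditionDescription = "a" },
            new ConditionRow { ConditionID = 1, ConditionDescription = "b" },
            new ConditionRow { ConditionID = 2, ConditionDescription = "c" },
        };
        Console.WriteLine(CountMatches(rows, 1)); // 2
    }
}
```

In the loop from the question, the lookup would be built once before the foreach and indexed inside it.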
I think I should point out what methods cause an IEnumerable to be iterated and what ones don't.
Any method named As* (such as AsEnumerable<T>()) does not cause the enumerable to be iterated. It's essentially a way of casting from one type to another.
Any method named To* (such as ToList<T>()) will cause the enumerable to be iterated. In the case of Linq-to-SQL this will also execute the database query. Any method that results in you getting a value out of the enumerable will also cause iteration. You can use this to your advantage by creating a query, forcing iteration using ToList(), and then searching that list. This causes the comparisons to be done in memory, which is what I demo above.
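The As*/To* distinction can be observed with an iterator that counts how many elements it has produced. A minimal sketch using an in-memory source instead of a database; AsVersusToDemo and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AsVersusToDemo
{
    private static int _produced;

    // Counts every element handed out, so iteration becomes observable.
    private static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 3; i++)
        {
            _produced++;
            yield return i;
        }
    }

    public static (int AfterAs, int AfterTo) Run()
    {
        _produced = 0;
        IEnumerable<int> query = Source().Where(n => n > 0);

        // AsEnumerable only changes the static type; nothing is iterated.
        IEnumerable<int> same = query.AsEnumerable();
        int afterAs = _produced;

        // ToList enumerates the whole sequence and stores the results.
        List<int> list = same.ToList();
        int afterTo = _produced;

        return (afterAs, afterTo);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.AfterAs + " " + result.AfterTo); // 0 3
    }
}
```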
// Firstly: return List<> instead of IEnumerable<>, because you need to work with the result later
public List<AllConditionByCountry> GenerateConditions(int paramCountryId)
{
var AllConditionsByCountry =
(from cd in db.tblConditionDescriptions...
join...
join...
select new AllConditionByCountry
{
CountryID = cd.CountryID,
ConditionDescription = cd.ConditionDescription,
ConditionID = cd.ConditionID,
...
...
})
.OrderBy(x => x.CountryID)
.ToList(); // return a list, so only 1 query is executed
// .AsEnumerable<AllConditionByCountry>(); // no longer needed
return AllConditionsByCountry;
}
about this part:
foreach (var record in AllConditionsByCountry) // you can use AllConditionsByCountry.ForEach(record=>{...});
{
...
// AllConditionsByCountry will not query the db again, because it's a list, no longer a query
var rList = AllConditionsByCountry.Where(x => x.ConditionID == conditionID); // .Select(x => x).AsEnumerable() is unnecessary; use AsXXX only if compilation requires it
...
}
BTW:
You should page your results; no page needs 100+ results. Returning 10K rows is the issue itself.
GenerateConditions(int paramCountryId, int page = 0, int pagesize = 50)
It's odd that you have to use a sub-query. Usually it means GenerateConditions does not return the data structure you need; change it to return the right data so that no sub-query is needed.
Use a compiled query to improve things further: https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/ef/language-reference/compiled-queries-linq-to-entities
We don't see your full query, but usually that is exactly the part you should improve, especially when you have many conditions to filter, join, and group on. A small change can make all the difference.

How to set properties using Linq statement

Instead of doing this horrible loop which does achieve the desired result :
foreach (var mealsViewModel in mealsListCollection)
{
foreach (var VARIABLE in mealsViewModel.Items)
{
foreach (var d in VARIABLE.ArticlesAvailable)
{
d.ArticleQty = 0;
}
}
}
I'm trying to achieve the same result with this LINQ statement:
mealsListCollection.ForEach(u =>
u.Items.Select(o => o.ArticlesAvailable.Select(c =>
{
c.ArticleQty = 0;
return c;
})));
But the LINQ statement does not reset ArticleQty to zero.
What am I doing wrong, and why?
Change your LINQ to use ForEach, because Select does not iterate through the collection in the way you want.
MSDN definitions:
Select: Projects each element of a sequence into a new form.
ForEach: Performs the specified action on each element of the List<T>.
mealsListCollection.ForEach(u =>
u.Items.ForEach(o =>
o.ArticlesAvailable.ForEach(c =>
{
c.ArticleQty = 0;
})));
Use SelectMany to work through trees of nested lists. Use the ForEach function last to do the work:
mealsListCollection
.SelectMany(m => m.Items)
.SelectMany(i => i.ArticlesAvailable)
.ToList()
.ForEach(a => { a.ArticleQty = 0; });
What you are doing wrong is this: Select returns a projection of your collection, but it has no effect until the objects are iterated over. Sitting inside the ForEach call, the Selects are never enumerated, so they are outside of the execution path. (Review the comments for more information.)
.Select() in a call by itself does nothing special but determine what the returned sequence will look like.
.Select().ToList() iterates over the collection, applying the projection.
If you were to set a variable equal to the .Select call, but never access the data inside it, then the values would essentially still be what they started as. As soon as you iterate over it, or select a specific element, it would then apply the projection.
Changing the Selects to ForEach calls, per vasily's comments, will give you the desired results.
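The behavior described above can be checked with a short sketch. Article here is a hypothetical one-property class standing in for the real model, and SelectSideEffectDemo is an invented name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the article type in the question.
public class Article
{
    public int ArticleQty { get; set; }
}

public static class SelectSideEffectDemo
{
    public static (int BeforeEnumeration, int AfterEnumeration) Run()
    {
        var articles = new List<Article> { new Article { ArticleQty = 5 } };

        // Only describes the projection; the lambda has not run yet.
        var projection = articles.Select(a => { a.ArticleQty = 0; return a; });
        int before = articles[0].ArticleQty;

        // Enumerating the projection finally runs the lambda for each element.
        projection.ToList();
        int after = articles[0].ArticleQty;

        return (before, after);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.BeforeEnumeration + " " + result.AfterEnumeration); // 5 0
    }
}
```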
Can I perhaps suggest that you look to set the value to 0 further up (or down) your stack? Without knowing your use case, maybe there is a better place to default it back to 0 than where you have chosen (AutoMapper, an initializer, etc.).

LINQ, Unable to create a constant value of type XXX. Only primitive types or enumeration types are supported in this context

In my application I have Lecturers and they have list of Courses they can teach and when I'm deleting a course I want to remove connection to lecturers. Here's the code:
public void RemoveCourse(int courseId)
{
using (var db = new AcademicTimetableDbContext())
{
var courseFromDb = db.Courses.Find(courseId);
var toRemove = db.Lecturers
.Where(l => l.Courses.Contains(courseFromDb)).ToList();
foreach (var lecturer in toRemove)
{
lecturer.Courses.Remove(courseFromDb);
}
db.SaveChanges();
}
}
but it doesn't work. I get
NotSupportedException: Unable to create a constant value of type Course. Only primitive types or enumeration types are supported in this context.
What am I doing wrong?
You can't use Contains with non-primitive values. Do
Where(l => l.Courses.Select(c => c.CourseId).Contains(courseId))
(or the Id field you use).
If you are using a DbContext, you can query the .Local collection, and the == operator will also work with objects:
public void RemoveCourse(int courseId)
{
using (var db = new AcademicTimetableDbContext())
{
var courseFromDb = db.Courses.Find(courseId);
db.Lecturers.Load(); // this is optional; it may take some time on the first load
//Add .Local to this line
var toRemove = db.Lecturers.Local
.Where(l => l.Courses.Contains(courseFromDb)).ToList();
foreach (var lecturer in toRemove)
{
lecturer.Courses.Remove(courseFromDb);
}
db.SaveChanges();
}
}
The .Local is an ObservableCollection, so you can compare anything you like inside it (not limited to SQL queries which don't support object comparison). Just to make sure you get all your objects in the .Local collection you can call the db.Lecturers.Load() method before calling .Local, which brings all database entries into the Local collection.
The Courses collection of below line should be null or empty.
var toRemove = db.Lecturers
.Where(l => l.Courses.Contains(courseFromDb)).ToList();
This can also happen when you pass a Func<T, bool> to Where() as a way to write a dynamic condition.
For some reason the delegate can't be translated to SQL.
You cannot compare complex types if you have not specified what you mean by equality.
As the exception detail says, you need to compare primitive values (like an integer in your case).
It is also better to use the Any() method instead.
var toRemove = db.Lecturers
.Where(l => l.Courses.Any(p=>p.Id == courseFromDb.Id)).ToList();

LINQ query to select based on property

IEnumerable<MyClass> objects = ...
foreach(MyClass obj in objects)
{
if(obj.someProperty != null)
SomeFunction(obj.someProperty);
}
I get the feeling I can write a smug LINQ version using a lambda but all my C# experience is 'classical' i.e more Java-like and all this Linq stuff confuses me.
What would it look like, and is it worth doing, or is this kind of Linq usage just seen as showing off "look I know Linq!"
LINQ itself doesn't contain anything for this - I'd use a normal foreach loop:
foreach (var value in objects.Select(x => x.someProperty)
.Where(y => y != null))
{
SomeFunction(value);
}
Or if you want a query expression version:
var query = from obj in objects
let value = obj.someProperty
where value != null
select value;
foreach (var value in query)
{
SomeFunction(value);
}
(I prefer the first version, personally.)
Note that I've performed the selection before the filtering to avoid calling the property twice unnecessarily. It's not for performance reasons so much as I didn't like the redundancy :)
While you can use ToList() and call ForEach() on that, I prefer to use a straight foreach loop, as per Eric's explanation. Basically SomeFunction must incur a side-effect to be useful, and LINQ is designed with side-effect-free functions in mind.
objects.Where(i => i.someProperty != null)
.ToList()
.ForEach(i => SomeFunction(i.someProperty));
Although it can be done with LINQ, it's not always necessary, and sometimes you lose readability. For your particular example, I'd leave it alone.
One option is to use the pattern outlined in the book Linq In Action which uses an extension method to add a ForEach operator to IEnumerable<>
From the book:
public static void ForEach<T> (this IEnumerable<T> source, Action<T> func)
{
foreach (var item in source)
func(item);
}
Then you can use that like this:
(from foo in fooList
where foo.Name.Contains("bar")
select foo)
.ForEach(foo => Console.WriteLine(foo.Name));
LINQ is used to create a result, so if you use it to call SomeFunction for each found item, you would be using a side effect of the code to do the main work. Things like that make the code harder to maintain.
You can use it to filter out the non-null values, though:
foreach(MyClass obj in objects.Where(o => o.someProperty != null)) {
SomeFunction(obj.someProperty);
}
You can move the if statement into a Where clause of Linq:
IEnumerable<MyClass> objects = ...
foreach(MyClass obj in objects.Where(o => o.someProperty != null))
{
SomeFunction(obj.someProperty);
}
Going further, you can use List's ForEach method:
IEnumerable<MyClass> objects = ...
objects.Where(obj => obj.someProperty != null).ToList()
.ForEach(obj => SomeFunction(obj.someProperty));
That's making the code slightly harder to read, though. Usually I stick with the typical foreach statement versus List's ForEach, but it's entirely up to you.
