LINQ query sometimes (randomly, for the exact same data) throwing NullReferenceException - c#

I am using LINQ-to-Entities (EF 6.1.3) to perform the following query:
var users = msgList.Select(m => m.From)
.Union(msgList.Select(m => m.To))
.Distinct()
.Where(u => u.ID != userId) //userId is an assigned local var.
.ToList();
msgList is a List (already fetched, not a queryable and lazy loading is off) of Messages which consists of some fields like From and To which are guaranteed to be non-null. Both From and To were Included in the original query, so they are guaranteed to be non-null.
My User object is also guaranteed to be non-null, so there's nothing that can actually be null.
However, this line is sometimes throwing a null pointer exception, and sometimes executing perfectly with the exact same user, exact same database, exactly same data (nothing altered). Load is not an issue as it's a code not yet in production and I'm the only one testing it.
The exception seems to be thrown at the Where call:
at System.Linq.Enumerable.WhereEnumerableIterator`1.MoveNext()
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
How can this happen?
UPDATE: This is of course not a duplicate of What is a NullReferenceException, and how do I fix it?. Any sane developer with even a little knowledge in .NET/C#/OOP knows what that error is and that this question has nothing to do with it, even though it involves that exception as a part of it.
UPDATE 2: I've switched it to assigning to a list each line, as suggested below:
var msgListSelection = msgList.Select(m => m.From).ToList();
var union = msgListSelection.Union(msgList.Select(m => m.To)).ToList();
var distinct = union.Distinct().ToList();
var where = distinct.Where(u => u.ID != userId).ToList();
var users = where;
The exception occurs at the where line:
var where = distinct.Where(u => u.ID != User.ID).ToList();
If distinct returned null, it would have been thrown on ToList call of var distinct = union.Distinct().ToList(); on the line above.
Am I missing something?
UPDATE 2: My User class is a POCO C# type mapped to an Entity type in my database which has an ID property of long, and my Message class is again a POCO type mapped in Entity Framework, with navigation properties From and To to some User instances guaranteed to be non-null. They are annotated as Required and I've also checked them at the database level just to be sure.
UPDATE 3: My EF context lives from the beginning of the request (set at a delegating handler in the beginning of the request) to the end. I don't think the problem is related to the lifespan of the DbContext as there are many controllers with the same mechanism with tens of methods that access the context, and I'm only having such problem with this particular method.
UPDATE 4: I've added a null check on distincts:
var distinct = union.Distinct().ToList();
if(distinct == null)
{
throw new Exception("distinct was null");
}
var where = distinct.Where(u => u.ID != userId).ToList();
It seems to pass that point with no problem, but throws the null pointer exception at the last line var where = distinct.Where(u => u.ID != userId).ToList(); which rules out the possibility that distinct may be null.
UPDATE 5: I wrote an API testing tool and sent about 250 requests to the same endpoint with the same user. The first one failed with this error, and all the rest succeeded. There seems to be a problem with the first request.

You may be experiencing what is caused by the closure principle. You reference the User property in your LINQ query. Because the LINQ query in itself is executed as an (anonymous) method delegate, the closure principle applies.
Quoting the above link:
In essence, a closure is a block of code which can be executed at a
later time, but which maintains the environment in which it was first
created - i.e. it can still use the local variables etc of the method
which created it, even after that method has finished executing.
The usage of the User property is subject to this principle. Its value can have changed upon the execution of the LINQ query. To protect against this, the User property should be copied to a local variable and that referenced in the LINQ query. Like so:
var user = User;
var users = msgList.Select(m => m.From)
.Union(msgList.Select(m => m.To))
.Distinct()
.Where(u => u.ID != user.ID)
.ToList();
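To illustrate the point above, here is a minimal, self-contained sketch using LINQ to Objects. The Person class and the CurrentUser property are hypothetical stand-ins for the User type and User property from the question; the sketch only shows that a lambda reads a property when it executes, not when the query is built.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public long ID { get; set; }
}

public static class ClosureDemo
{
    // Hypothetical stand-in for the User property from the question.
    public static Person CurrentUser { get; set; }

    public static (bool propertyVersionThrew, List<long> localVersionIds) Run()
    {
        var people = new List<Person> { new Person { ID = 1 }, new Person { ID = 2 } };
        CurrentUser = new Person { ID = 1 };

        // The lambda reads CurrentUser every time it runs, not when the query is built.
        IEnumerable<Person> viaProperty = people.Where(p => p.ID != CurrentUser.ID);

        // A local copy pins the reference that was current at this point.
        Person snapshot = CurrentUser;
        IEnumerable<Person> viaLocal = people.Where(p => p.ID != snapshot.ID);

        // Simulate the property changing between building and enumerating the query.
        CurrentUser = null;

        bool threw = false;
        try { viaProperty.ToList(); }
        catch (NullReferenceException) { threw = true; }

        // The local-copy version still filters against the original user.
        return (threw, viaLocal.Select(p => p.ID).ToList());
    }
}
```

Calling ClosureDemo.Run() should report that the property-based query threw, while the query using the local copy still returns the remaining ID.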
Update
When using a local reference copy of the User property, another possibility for the NullReferenceException may lie with the Select-Union-Distinct methods. When calling ToList, the Where clause is executed on all items in the union of the two Select clauses. By default, Distinct compares elements with the default equality comparer (IEquatable<T>.Equals if the type implements it, otherwise Object.Equals), which would be called on the elements from Select(m => m.From). If one of these elements is null, dereferencing it would cause the NullReferenceException.
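A quick way to check where a null element would actually blow up is a small LINQ to Objects experiment. The sketch below uses a hypothetical Member class; in LINQ to Objects the default comparer used by Distinct tolerates null elements, so a null in the sequence surfaces in the Where predicate instead, which is consistent with the WhereEnumerableIterator frame in the stack trace from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Member
{
    public long ID { get; set; }
}

public static class NullElementDemo
{
    // Distinct itself copes with null elements: the result here is { a, null }.
    public static int DistinctCountWithNull()
    {
        var a = new Member { ID = 1 };
        var items = new List<Member> { a, null, a, null };
        return items.Distinct().Count();
    }

    // The predicate dereferences each element, so a null element throws inside Where.
    public static bool WhereThrowsOnNull()
    {
        var items = new List<Member> { new Member { ID = 1 }, null };
        try
        {
            items.Distinct().Where(u => u.ID != 1).ToList();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    // A defensive variant that skips null elements before touching ID.
    public static List<long> NullSafeIds()
    {
        var items = new List<Member> { new Member { ID = 1 }, null, new Member { ID = 2 } };
        return items.Where(u => u != null && u.ID != 1).Select(u => u.ID).ToList();
    }
}
```

If the null-safe variant makes the intermittent failure disappear, that points at a null From or To on the first request rather than at the closure.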

Related

There is already an open DataReader associated with this Command without ToList()

I have the method below to load dependent data from navigation property. However, it generates an error. I can remove the error by adding ToList() or ToArray(), but I'd rather not do that for performance reasons. I also cannot set the MARS property in my web.config file because it causes a problem for other classes of the connection.
How can I solve this without using extension methods or editing my web.config?
public override void Load(IEnumerable<Ques> data)
{
if (data.Any())
{
foreach (var pstuu in data)
{
if (pstuu?.Id_user != null)
{
db.Entry(pstuu).Reference(q => q.Users).Load();
}
}
}
}
I take it from this question you've got a situation something like:
// (outside code)
var query = db.SomeEntity.Where(x => x.SomeCondition == someCondition);
LoadDependent(query);
Chances are based on this method it's probably a call stack of various methods that build search expressions and such, but ultimately what gets passed into LoadDependent() is an IQueryable<TEntity>.
Instead if you call:
// (outside code)
var query = db.SomeEntity.Where(x => x.SomeCondition == someCondition);
var data = query.ToList();
LoadDependent(data);
Or, in your LoadDependent, change it to do something like:
base.LoadDependent(data);
data = data.ToList();
or better,
foreach (Ques qst in data.ToList())
Then your LoadDependent() call works, but in the first example you get an error that a DataReader is open. This is because your foreach call as-is would be iterating over the IQueryable meaning EF's data reader would be left open so further calls to db, which I'd assume is a module level variable for the DbContext that is injected, cannot be made.
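The difference between iterating a live query and iterating a materialized list can be sketched without a database. FakeConnection below is purely hypothetical (it is not an EF or ADO.NET type); it only mimics the rule that a second command cannot run while a reader is still streaming rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a connection that allows only one open reader at a time.
public sealed class FakeConnection
{
    private bool _readerOpen;

    // Streams rows lazily; the "reader" stays open until enumeration finishes or is disposed.
    public IEnumerable<int> Query(int count)
    {
        if (_readerOpen) throw new InvalidOperationException("There is already an open reader.");
        _readerOpen = true;
        try
        {
            for (int i = 1; i <= count; i++) yield return i;
        }
        finally
        {
            _readerOpen = false;
        }
    }

    // A second command on the same connection, like Reference(...).Load().
    public string LoadReference(int id)
    {
        if (_readerOpen) throw new InvalidOperationException("There is already an open reader.");
        return "user-" + id;
    }
}

public static class OpenReaderDemo
{
    // Iterating the live query: the reader is still open when LoadReference is called.
    public static bool StreamingFails()
    {
        var conn = new FakeConnection();
        try
        {
            foreach (int id in conn.Query(3)) conn.LoadReference(id);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Materializing first: ToList() drains and closes the reader before the loop starts.
    public static List<string> MaterializedWorks()
    {
        var conn = new FakeConnection();
        var result = new List<string>();
        foreach (int id in conn.Query(3).ToList()) result.Add(conn.LoadReference(id));
        return result;
    }
}
```

The cost of ToList() here is only holding the already-fetched rows in memory; it does not add a database round trip.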
Replacing this:
db.Entry(qst).Reference(q => q.AspNetUsers).Load();
with this:
db.Entry(qst).Reference(q => q.AspNetUsers).LoadAsync();
... does not actually work. This just delegates the load call asynchronously, and without awaiting it, it too would fail, just not raise the exception on the continuation thread.
As mentioned in the comments to your question this is a very poor design choice to handle loading references. You are far, far better off enabling lazy loading and taking the Select n+1 hit if/when a reference is actually needed if you aren't going to implement the initial fetch properly with either eager loading or projection.
Code like this forces a Select n+1 pattern throughout your code.
A good example of loading a "Ques" with its associated User eager loaded:
var ques = db.Ques
.Include(x => x.AspNetUsers)
.Where(x => x.SomeCondition == someCondition)
.ToList();
Whether "SomeCondition" results in 1 Ques returned or 1000 Ques returned, the data will execute with one query to the DB.
Select n+1 scenarios are bad because in the case where 1000 Ques are returned with a call to fetch dependencies you get:
var ques = db.Ques
.Where(x => x.SomeCondition == someCondition)
.ToList(); // 1 query.
foreach(var q in ques)
db.Entry(q).Reference(x => x.AspNetUsers).Load(); // 1 query x 1000
1001 queries run. This compounds with each reference you want to load.
This becomes problematic when later code wants to offer pagination, such as taking only 25 items where the total record count could run into the tens of thousands or more. This is where lazy loading would be the lesser of two Select n+1 evils, as with lazy loading you know that AspNetUsers would only be selected if any returned Ques actually referenced it, and only for those Ques that actually reference it. So if the pagination only "touched" 25 rows, lazy loading would result in 26 queries. Lazy loading is a trap, however, as later code changes could inadvertently lead to performance issues appearing in seemingly unrelated areas when new references or code changes result in far more references being "touched" and kicking off a query.
If you are going to pursue a LoadDependent() type method then you need to ensure that it is called as late as possible, once you have a known set size to load (i.e. after pagination), because you will need to materialize the collection to load related entities with the same DbContext instance. Trying to work around it using detached instances (AsNoTracking()) or by using a completely new DbContext instance may give you some headway but will invariably lead to more problems later, as you will have a mix of tracked and untracked entities, or worse, entities tracked by different DbContexts depending on how these loaded entities are consumed.
An alternative some teams pursue, rather than a LoadReference() type method, is an IncludeReference() type method. The goal here is to build .Include statements into the IQueryable. This can be done two ways: either by magic strings (property names) or by passing in expressions for the references to include. Again, this can turn into a bit of a rabbit hole when handling more deeply nested references (i.e. building .Include().ThenInclude() chains). This avoids the Select n+1 issue by eager loading the required related data.
I solved the problem by deleting the Load method and using Include() in my first query, so that the reference data is loaded into the navigation property.

How to reload database with Entity Framework in C#

I am trying to do some checks on my database with an automated process. On a schedule the process goes out to a service and checks all the entries in the database against a list.
I want to re-insert the records that may have been deleted and update ones that are out of date
foreach (Category x in CustomeClass)
{
Category exists = Context.SSActivewear_Category
.Where(b => b.CategoryID == x.CategoryID)
.FirstOrDefault();
if (exists == null)
Context.Add(x);
else
Context.Update(x);
}
Not sure why, but I keep getting messages about tracking an instance with the same key, etc. Can someone point me to a best practice for something like this?
Danka!
This type of error is common when re-using the same instance of the EF DbContext, especially when trying to load the same entity more than once from the database using the same context.
If this is the case, then simply recreate the context (either new it up or use context factory) and then try update or modify your database data.
Creating new context is cheap, so no worries there.
After updating, save changes and dispose of the context (or use it in using statement to begin with).
If you are modifying the same entity multiple times using the same context, then do not load it from database multiple times.
In your particular code example I would check whether there are duplicate Category objects in the CustomeClass collection.
Is there duplication of CategoryID?
Is CategoryID for sure auto-generated when saving data (entity configuration)?
Is code not trying to update multiple entities with same id?
etc.
Entity framework works with references. Two instances of a class with the same data amount to two different references and only one reference can be associated with a DbContext, otherwise you get errors like that.
As per your example below:
foreach (Category x in CustomeClass)
{
Category exists = Context.SSActivewear_Category
.Where(b => b.CategoryID == x.CategoryID)
.FirstOrDefault();
if (exists == null)
Context.Add(x);
else
Context.Update(x);
}
Category is assumed to be your Entity, so "CustomeClass" would be a collection of instances that are not associated with your Context. These are Detached instances.
When your "exists" comes back as null, this will appear to work as the Category "x" gets added and tracked by the Context. However, when "exists" comes back as not null, you now have two instances for the same entity. "exists" is tracked by the DbContext, while "x" is not. You cannot use Update() with "x", you must copy the values across.
The simplest way to do this would be Automapper where you can create a map from Category to Category, then use Map to copy all values from "x" over into "exists":
var config = new MapperConfiguration(cfg => cfg.CreateMap<Category, Category>());
var mapper = config.CreateMapper();
mapper.Map(x, exists);
This is purely an example, you'll probably want to configure and inject a mapper that handles your entity copying. You can configure the CreateMap to exclude columns that shouldn't ever change. (Using ForMember, etc.)
Alternatively you can copy the values across manually:
// ...
else
{
exists.Name = x.Name;
exists.SomeValue = x.SomeValue;
// ...
}
In general you should avoid using the Update method in EF as this will result in a statement that overwrites all columns in a table, rather than updating just the column(s) that changed. (When you copy values onto the tracked instance instead, only the changed columns are updated, and if no columns changed then no UPDATE SQL will actually get run.)
On another side note, when getting "exists", you should use SingleOrDefault() not FirstOrDefault() as you expect 0 or 1 row back. First* methods should be used in cases where you expect there can be multiple matches but only want the first match, and should always be used with an OrderBy*() method to ensure the results are predictable.
You can use Update by performing an Exists check query that doesn't load a tracked entity. Examples would be:
Category exists = Context.SSActivewear_Category
.AsNoTracking()
.Where(b => b.CategoryID == x.CategoryID)
.SingleOrDefault();
or better:
bool exists = Context.SSActivewear_Category
.Where(b => b.CategoryID == x.CategoryID)
.Any();
Then you could use Context.Update(x). AsNoTracking() tells EF to load an instance but not track it. This would really be a waste in this case as it's a round trip to the DB to return everything in the Category only to check if something was returned or not. The Any() call would be a round trip but just does an EXISTS db query to return true or false.
However, these are not fool-proof as there is no guarantee that the Context instance isn't already tracking an instance for that Category from some other operation. Something as trivial as having a Category appear in CustomeClass twice for any reason would be enough to trip the above examples up as once you call Context.Update(x) that instance is now tracked, so the loop iteration for the second instance would fail.
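The duplicate-key failure described above can be reproduced without EF. In the sketch below, TinyTracker is a hypothetical stand-in for the context's change tracker (it is not an EF type), and LastPerKey is one possible way to collapse the incoming collection to a single instance per CategoryID before any Add/Update call; keeping the last occurrence is an assumption, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Category
{
    public int CategoryID { get; set; }
    public string Name { get; set; }
}

// Hypothetical stand-in for a change tracker: one tracked reference per key.
public sealed class TinyTracker
{
    private readonly Dictionary<int, Category> _tracked = new Dictionary<int, Category>();

    public int Count => _tracked.Count;

    public void Attach(Category c)
    {
        if (_tracked.TryGetValue(c.CategoryID, out var existing) && !ReferenceEquals(existing, c))
            throw new InvalidOperationException("Another instance with the same key is already tracked.");
        _tracked[c.CategoryID] = c;
    }
}

public static class UpsertDemo
{
    // Keeps one instance per CategoryID (the last one seen) so no key is attached twice.
    public static List<Category> LastPerKey(IEnumerable<Category> incoming)
    {
        return incoming
            .GroupBy(c => c.CategoryID)
            .Select(g => g.Last())
            .ToList();
    }
}
```

Attaching a raw list that contains the same CategoryID twice throws on the second instance, while attaching the result of LastPerKey does not.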

Populate navigation properties when used in method passed to Where clause

I have an endpoint which should list all Candidatures filtered by search term, so what I do is to create a method which accepts candidature entity and searchTerm as params. Then I pass that method to Where clause, but the problem is that I got NullReferenceException because navigation properties are nulls. If I put statement inside Where clause instead of the method then it doesn't throw Exception. The question is how to fix this, but I want to keep the external method because there will be a lot more logic, but I need to have access to all navigation properties i.e. they should be populated.
if (!string.IsNullOrEmpty(searchTerm))
{
query = query.Where(c => FilterBySearchTerm(c, searchTerm));
}
var result = await query.Select(c => new CandidaturesResponseModel()
{
Id = c.Id,
Name = c.PrimaryMember.FullName, // that's filled
}).ToListAsync();
private bool FilterBySearchTerm(Candidature c, string searchTerm)
{
return c.PrimaryMember.FirstName.Contains(searchTerm); // here is the exception because PrimaryMember navigation property is null. So I want this to be filled.
}
The issue is that you're materializing the query by using your FilterBySearchTerm method. EF cannot translate random methods to SQL, so it has to go ahead and run the query, get the results back and then apply your Where. EF would actually throw an exception in the past, but EF Core handles this silently.
Anyways, once the query is run, you're done. Your filtering is happening in-memory, and at that point, without an Include, your related entities are not there to work with. Long and short, you'll need to build your filter in place (rather than using a separate method) in order for EF to be able to translate that to SQL.
An alternative approach which may serve you better is pass a queryable to your FilterBySearchTerm method. For example, instead of doing:
query = query.Where(c => FilterBySearchTerm(c, searchTerm));
Do
query = FilterBySearchTerm(query, searchTerm);
Then, inside FilterBySearchTerm, you can directly apply Where clauses to the passed in query. That allows you to build an actual query that EF can understand, while also encapsulating the logic.
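A minimal sketch of that queryable-in, queryable-out shape follows. The Candidature and Member classes are simplified guesses at the asker's model, and the demo runs against an in-memory list via AsQueryable() only so that it is self-contained; against a real DbSet the same Where expression would be handed to the query provider for translation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, assumed shapes of the entities from the question.
public class Member
{
    public string FirstName { get; set; }
}

public class Candidature
{
    public int Id { get; set; }
    public Member PrimaryMember { get; set; }
}

public static class CandidatureFilters
{
    // Takes and returns IQueryable, so the filter stays an expression tree
    // that a query provider can translate instead of an opaque method call.
    public static IQueryable<Candidature> FilterBySearchTerm(
        IQueryable<Candidature> query, string searchTerm)
    {
        if (string.IsNullOrEmpty(searchTerm)) return query;
        return query.Where(c => c.PrimaryMember.FirstName.Contains(searchTerm));
    }

    // In-memory demo only; a real caller would pass the DbSet-based query instead.
    public static List<int> Demo(string searchTerm)
    {
        IQueryable<Candidature> data = new List<Candidature>
        {
            new Candidature { Id = 1, PrimaryMember = new Member { FirstName = "Anna" } },
            new Candidature { Id = 2, PrimaryMember = new Member { FirstName = "Bob" } }
        }.AsQueryable();

        return FilterBySearchTerm(data, searchTerm).Select(c => c.Id).ToList();
    }
}
```

Further conditions can be chained inside FilterBySearchTerm with additional Where calls, which keeps the logic in one place without forcing client-side evaluation.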
Just use Include method to add PrimaryMember
query = query.Include(x=> x.PrimaryMember).Where(c => FilterBySearchTerm(c, searchTerm));

Database Querying issue in Entity Framework. Getting null each time

My problem is, that every time when I try querying the database with LinQ methods, I get either Null when using SingleOrDefault(), or an InvalidOperationException when using Single().
Here's my code.
var currentUser = UserManager.FindById(User.Identity.GetUserId());
var post = ApplicationDbContext.Posts
.Include(c => c.Author)
.SingleOrDefault(c => c.Author.Id == currentUser.Id);
if (post == null)
return View(new CertainPost());
I'm suspicious if it ain't wrong that I've used the .Include method on author and in the same time I used the author value to query the DB. If that's the issue, how should I write this properly? Can I in some way use SingleOrDefault() method later in my code, when Author will be loaded? I've been doing it like this, but I find this way very messy
//Such a mess
var posts = ApplicationDbContext.Posts
.Include(c => c.Author)
.ToList();
foreach(var item in posts)
{
if(item.Author.Id == currentUser.Id)
{
var post = ApplicationDbContext.Posts.SingleOrDefault(c=>c.Id==item.Id);
}
}
So, what code should I write to compromise on usefulness and optimization?
The problem is likely that you have multiple posts for that author. The only meaningful logical difference between the non-working code and the working but "messy" code is that, in the working code, you're selecting by an item id.
The difference between Single and SingleOrDefault is only that when nothing matches, the first raises an exception, while the second simply returns null; both throw if more than one item matches. With either, there are two scenarios that will make them fail:
There are no matching items
There is more than one matching item
Since you're sure that the first is not the situation (you've confirmed there is a post), then the second is the issue.
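To tell the two scenarios apart in practice, it can help to see how SingleOrDefault behaves in each. The following is a small in-memory sketch (the Describe helper and its return strings are made up for illustration): an empty result yields the default value, while more than one match raises InvalidOperationException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleDemo
{
    // Reports which of the three outcomes SingleOrDefault produces for a given value.
    public static string Describe(IEnumerable<int> source, int wanted)
    {
        try
        {
            // Project to int? so that "no match" is distinguishable from a matching 0.
            int? match = source
                .Where(x => x == wanted)
                .Select(x => (int?)x)
                .SingleOrDefault();

            return match.HasValue ? "one match" : "no match (default returned)";
        }
        catch (InvalidOperationException)
        {
            return "more than one match (exception)";
        }
    }
}
```

For the sequence 1, 2, 2 this reports a default for 3, one match for 1, and an exception for 2, so a null result and an InvalidOperationException point at different causes.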
Long and short, you need to either provide some other differentiating detail to SingleOrDefault such that it can match one and only one post, or you need to just use something like Where and return all matching posts:
var post = ApplicationDbContext.Posts
.Include(c => c.Author)
.SingleOrDefault(c => c.Author.Id == currentUser.Id && c.Id == postId);
Or
var posts = ApplicationDbContext.Posts
.Include(c => c.Author)
.Where(c => c.Author.Id == currentUser.Id);
FWIW, you don't need to use Include for any relationships that are part of the query. In the case of Author, it has to be joined in order to figure out if its id equals the current user's id, so it will already be included.
Additionally, it's an unnecessary query to look up the user when all you need is the id. You already have the id, since you used that to look the user up in the first place. You might have done that because passing User.Identity.GetUserId() directly to your query will raise an exception. However, that only occurs because Entity Framework doesn't know how to translate that method to something it can do in a SQL query. If you pass it just the value, you're fine:
var userId = User.Identity.GetUserId();
var posts = ApplicationDbContext.Posts
.Where(c => c.Author.Id == userId);

Why am I unable to access properties of my object model?

For some odd reason, I am unable to access the properties of this object EVEN if I cast it as its model type. Does anyone have an idea as to why? (It may be obvious, but I'm pretty new to C# so please be patient! :o) )
Users currentUser = new Users();
currentUser = (from x in db_tc.Users where x.Id == Convert.ToInt32(User.Identity.Name) select x);
When I call currentUser, I am only able to access the CRUD methods and a List<Users> property called usrList. I didn't create the list definition, so I imagine this is some piece of the Entity framework that is automagically created.
I did try casting currentUser with (Users) prior to the entity query, it didn't help at all.
That's because you've only created the query, you haven't actually executed it. Add Single() (or First() etc.) to get the result:
var currentUser = (from x in db_tc.Users where x.Id == Convert.ToInt32(User.Identity.Name) select x).SingleOrDefault();
Single(): Gets the only element of the sequence, and will throw an exception if no element is found or if the sequence has more than one element.
First(): Gets the first element of the sequence, but will throw an exception if no element is found.
SingleOrDefault() and FirstOrDefault(): Same as above, but will return default(T) instead of throwing on the empty sequence.
This "deferred execution" aspect of LINQ is probably the most difficult part to understand. The basic idea is that you can build up a query using the query operations (Where(), Select(), etc.), and then you can execute that query to actually get its results, using one of the non-deferred execution operations (Single(), ToList(), etc.)
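Deferred execution is easy to observe with an in-memory list. The sketch below is only an illustration with made-up data: the Where call builds the query, ToList() runs it once, and enumerating the query again later re-runs it against whatever the source contains at that moment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static (int snapshotCount, int liveCount) Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Building the query does not execute anything yet.
        IEnumerable<int> evens = numbers.Where(n => n % 2 == 0);

        // ToList() executes the query now and stores the results.
        List<int> snapshot = evens.ToList();

        // Changing the source afterwards does not affect the stored snapshot...
        numbers.Add(4);

        // ...but enumerating the query again re-runs it over the updated list.
        return (snapshot.Count, evens.Count());
    }
}
```

This is also why a variable holding the query exposes query members rather than the entity's properties: it is not a result until an operator such as Single() or ToList() executes it.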
The query you have here returns a collection of the matches from the DB, not a single object.
If you intend on it only returning a single record append .First(); to the end of your linq query.
Also remove this line Users currentUser = new Users();
and add this var currentUser =...
Some more tips from the "good LINQ practices" wagon...
LINQ should "usually" return var, then you convert to the data type you are expecting. Another good practice I have found is to immediately validate the return from LINQ, as any usage without validation is highly exception prone. For instance:
var qUser = (from x in db.Users select x);
if (qUser != null && qUser.Count() > 0)
{
List<User> users = qUser.ToList();
// ... process users result ...
}
}
The not null and count greater than 0 check should be required after each LINQ query. ;)
And don't forget to wrap in try-catch for SqlException!
