EF Core Entity Equality - c#

Sorry if this is a dumb question, most of my experience with ORMs has not been EF, and looking this up online gets me a lot of bad hits. It's almost like "reference" means different things to different people...
If I write code like so:
using (var db = new DbContext())
{
var entity1 = await db.Foos.FirstOrDefaultAsync(x => x.Id == 1);
var entity2 = await db.Foos.FirstOrDefaultAsync(x => x.Id == 1);
return entity1.Equals(entity2);
}
This returns true. Since my entity is a reference type, the Equals under the hood should be an Object.ReferenceEquals() call.
What I want to know is, is this reliable, i.e. will any entity represented by a particular database record in a context always be referentially equal, or can it "drop out" of the cache, get reloaded on demand and have a new reference like what happens in some less sophisticated ORMs? If an entity is loaded as part of a collection on another entity, is it still the same object? Are there rules/ settings that govern this behavior?

As #IvanStoev pointed out in the comments, the referential consistency is by design and a core part of EF, so the same object in the database should always reference the same object in the database context... at least in the scope of the specific database context you're in. YMMV if you are dealing with multiple database contexts.

Related

.NET 6 Entity Framework Tracking with Dependency Injection

I guess I just don't understanding EF tracking. I have the context added via dependency injection via:
builder.Services.AddDbContext<OracleContext>(options => options.UseOracle(OracleConnectionString, b => b.UseOracleSQLCompatibility("11"))
.LogTo(s => System.Diagnostics.Debug.WriteLine(s))
.EnableDetailedErrors(Settings.Dev_System)
.EnableSensitiveDataLogging(Settings.Dev_System)
.UseQueryTrackingBehavior(QueryTrackingBehavior.NoTracking));
I set the tracking behavior to NoTracking here (at least so I thought).
I have a .NET Controller that has the context in its constructor. It passes this context to my class constructor containing my methods. Pretty much everything works fine... except for one:
I have a method that does a context.RawSqlQuery to get a list of objects. I iterate over these objects calling two separate methods from a different class that was generated the same way (using the injected context). This method first does a EF query to verify the object does not already exist, if it does it returns it and we move on - no issues. On the query to check if it exists I also added .AsNoTracking() for SnGs. However, if the object does not exist, and I try to make a new one... every time I do an context.add I get
"The instance of entity type 'Whatever' cannot be tracked because another instance with the key value '{MfrId: 90}' is already being tracked. When attaching existing entities, ensure that only one entity instance with a given key value is attached."
I have tried adding
db.ChangeTracker.QueryTrackingBehavior = QueryTrackingBehavior.NoTracking - no change
I have tried adding context.Entry(NewItem).State = EntityState.Detached; before and after the call - no change.
I tried a loop in the context that gets all tracked objects and sets them to detached - no change.
What am I missing here? First - why is it tracking at all? Second - any suggestions on how to get passed this? Should I just give up using dependency injection for the context (suck... lots of rework for this)?
As requested - here is the class & method that is failing (non related stuff removed):
public class AssetMethods : IAssetMethods
{
public OracleContext db;
public AssetMethods(OracleContext context)
{
db = context;
}
public CcpManufacturer? CreateNewManufacturer(CcpManufacturer NewMan, string ActorID)
{
...blah blah non DB validation stuff removed...
//Check if it exists already
CcpManufacturer? OldMan = db.CcpManufacturers.Where(m=>m.MfrName == NewMan.MfrName).AsNoTracking().FirstOrDefault();
if (OldMan != null) {
return OldMan;
}
//Who done did it
NewMan.CreatedBy = ActorID;
NewMan.CreationDate = DateTime.Now;
NewMan.Isenabled = true;
//save
db.CcpManufacturers.Add(NewMan);
db.SaveChanges(); //fails here
//Prevent XSS Reflection
return db.CcpManufacturers.Find(NewMan.MfrId);
}
}
this method is called from this code. The OM is also using the injected context
List<MasterItem> Items = OM.RawSqlQuery(Query, x => new MasterItem { MFR_NAME = (string)x[0], MODEL_NUMBER = (string)x[1], LEGAL_NAME= (string)x[2]});
foreach (MasterItem item in Items)
{
CcpManufacturer? Man = new() {
MfrName = item.MFR_NAME,
Displayname = item.MFR_NAME
};
Man = AM.CreateNewManufacturer(Man,System.Id); //if its new.. it never get passed here because of the discussed error...
if (Man == null || Man.MfrId == 0)
{
continue;
}
.... other stuff
}
So the mfr id is added to a new object that's passed to a pretty much identical methods to create a item (where the mfr id is attached). Now - if I detach THAT item - I am ok. But why is it tracking when I have it turned off pretty much everywhere?
Yes, you found your problem.
Turning off Tracking affects what EF does when querying for entities. This means when I tell EF to read data from the DB and give me entities, it will not hang onto references of those entities.
However, entities you tell a DBContext to ADD to a DbSet and related entities will be tracked, regardless of your tracking setting.
So if I do something like:
var entity1 = context.Entities.Single(x => x.Id == entityId).AsNoTracking();
var entity2 = context.Entities.Single(x => x.Id == entityId).AsNoTracking();
The references to entity1 and entity2 will be 2 distinct references to the same record. Both are detached, so the DbContext isn't tracking either of them. I can use and attach either of them to perform an update, but that entity would be from that point considered Attached until I explicitly detach it again. Attempting to use the other reference for an update would result in that error. If I query specifying NoTracking after I have attached and updated that first entity reference, I will get back a new untracked entity reference. The DbContext doesn't return it's tracked reference, but it doesn't discard it either.
The exact same thing happens if I add a new entity then query for it specifying NoTracking. The query returns an untracked reference. So if you try and attach it to update a row, EF will complain about the reference it is already tracking.
I don't recommend diving down the rabbit hole of passing around detached entities unless you're keen to spend the time to really understand what is going on behind the scenes and prepared for the pretty deliberate and careful handling of references. The implications aren't just things not working as expected, it's having things work or not work on a completely situational basis which can be a nasty thing to debug and fix, even when you know what to look for.

How to reload database with Entity Framework in C#

I am trying to do some checks on my database with an automated process. On a schedule the process goes out to a service and checks all the entries in the database against a list.
I want to re-insert the records that may have been deleted and update ones that are out of date
foreach (Category x in CustomeClass)
{
Category exists = Context.SSActivewear_Category
.Where(b => b.CategoryID == x.CategoryID)
.FirstOrDefault();
if (exists == null)
Context.Add(x);
else
Context.Update(x);
}
Not sure but I keep getting messages about tracking an instance with the same key etc. Can someone point me to a best practice on something like this
Danka!
This type of error is common when re-using the same instance of EF dbContext, especially when trying to load the same entity more then once from database using the same context.
If this is the case, then simply recreate the context (either new it up or use context factory) and then try update or modify your database data.
Creating new context is cheap, so no worries there.
After updating, save changes and dispose of the context (or use it in using statement to begin with).
If you are modifying the same entity multiple times using the same context, then do not load it from database multiple times.
In Your particular code example I would check if there is no duplication of Category objects in CustomeClass collection.
Is there duplication of CategoryID?
Is CategoryID for sure auto-generated when saving data (entity configuration)?
Is code not trying to update multiple entities with same id?
etc.
Entity framework works with references. Two instances of a class with the same data amount to two different references and only one reference can be associated with a DbContext, otherwise you get errors like that.
As per your example below:
foreach (Category x in CustomeClass)
{
Category exists = Context.SSActivewear_Category
.Where(b => b.CategoryID == x.CategoryID)
.FirstOrDefault();
if (exists == null)
Context.Add(x);
else
Context.Update(x);
}
Category is assumed to be your Entity, so "CustomeClass" would be a collection of instances that are not associated with your Context. These are Detached instances.
When your "exists" comes back as #null, this will appear to work as the Category "x" gets added and tracked by the Context. However, when "exists" comes back as not #null, you now have two instances for the same entity. "exists" is tracked by the DbContext, while "x" is not. You cannot use Update() with "x", you must copy the values across.
The simplest way to do this would be Automapper where you can create a map from Category to Category, then use Map to copy all values from "x" over into "exists":
var config = new MapperConfiguration(cfg => cfg.CreateMap<Category, Category>());
var mapper = config.CreateMapper();
mapper.Map(x, exists);
This is purely an example, you'll probably want to configure and inject a mapper that handles your entity copying. You can configure the CreateMap to exclude columns that shouldn't ever change. (Using ForMember, etc.)
Alternatively you can copy the values across manually:
// ...
else
{
exists.Name = x.Name,
exists.SomeValue = x.SomeValue,
// ...
}
In general you should avoid using the Update method in EF as this will result in a statement that overwrites all columns in a table, rather than updating just the column(s) that changes. (If no columns changed then no UPDATE SQL will actually get run)
On another side note, when getting "exists", you should use SingleOrDefault() not FirstOrDefault() as you expect 0 or 1 row back. First* methods should be used in cases where you expect there can be multiple matches but only want the first match, and should always be used with an OrderBy*() method to ensure the results are predictable.
You can use Update by performing an Exists check query that doesn't load a tracked entity. Examples would be:
Category exists = Context.SSActivewear_Category
.AsNoTracking()
.Where(b => b.CategoryID == x.CategoryID)
.SingleOrDefault();
or better:
bool exists = Context.SSActivewear_Category
.Where(b => b.CategoryID == x.CategoryID)
.Any();
Then you could use Context.Update(x). AsNoTracking() tells EF to load an instance but not track it. This would really be a waste in this case as it's a round trip to the DB to return everything in the Category only to check if something was returned or not. The Any() call would be a round trip but just does an EXISTS db query to return true or false.
However, these are not fool-proof as there is no guarantee that the Context instance isn't already tracking an instance for that Category from some other operation. Something as trivial as having a Category appear in CustomeClass twice for any reason would be enough to trip the above examples up as once you call Context.Update(x) that instance is now tracked, so the loop iteration for the second instance would fail.

Live Transfer of data from one provider to another in Entity Framework

I apologise if this has been asked already, I am struggling greatly with the terminology of what I am trying to find out about as it conflicts with functionality in Entity Framework.
What I am trying to do:
I would like to create an application that on setup gives the user to use 1 database as a "trial"/"startup" database, i.e. non-production database. This would allow a user to trial the application but would not have backups etc. in no way would this be a "production" database. This could be SQLite for example.
When the user is then ready, they could then click "convert to production" (or similar), and give it the target of the new database machine/database. This would be considered the "production" environment. This could be something like MySQL, SQLServer or.. whatever else EF connects to these days..
The question:
Does EF support this type of migration/data transfer live? Would it need another app where you could configure the EF source and EF destination for it to then run through the process of conversion/seeding/population of the data source to another data source?
Why I have asked here:
I have tried to search for things around this topic, but transferring/migration brings up subjects totally non-related, so any help would be much appreciated.
From what you describe I don't think there is anything out of the box to support that. You can map a DbContext to either database, then it would be a matter of fetching and detaching entities from the evaluation DbContext and attaching them to the production one.
For a relatively simple schema / object graph this would be fairly straight-forward to implement.
ICollection<Customer> customers = new List<Customer>();
using(var context = new AppDbContext(evalConnectionString))
{
customers = context.Customers.AsNoTracking().ToList();
}
using(var context = new AppDbContext(productionConnectionString))
{ // Assuming an empty database...
context.Customers.AddRange(customers);
}
Though for more complex models this could take some work, especially when dealing with things like existing lookups/references. Where you want to move objects that might share the same reference to another object you would need to query the destination DbContext for existing relatives and substitute them before saving the "parent" entity.
ICollection<Order> orders = new List<Order>();
using(var context = new AppDbContext(evalConnectionString))
{
orders = context.Orders
.Include(x => x.Customer)
.AsNoTracking()
.ToList();
}
using(var context = new AppDbContext(productionConnectionString))
{
var customerIds = orders.Select(x => x.Customer.CustomerId)
.Distinct().ToList();
var existingCustomers = context.Customers
.Where(x => customerIds.Contains(x.CustomerId))
.ToList();
foreach(var order in orders)
{ // Assuming all customers were loaded
var existingCustomer = existingCustomers.SingleOrDefault(x => x.CustomerId == order.Customer.CustomerId);
if(existingCustomer != null)
order.Customer = existingCustomer;
else
existingCustomers.Add(order.Customer);
context.Orders.Add(order);
}
}
This is a very simple example to outline how to handle scenarios where you may be inserting data with references that may, or may not exist in the target DbContext. If we are copying across Orders and want to deal with their respective Customers we first need to check if any tracked customer reference exists and use that reference to avoid a duplicate row being inserted or throwing an exception.
Normally loading the orders and related references from one DbContext should ensure that multiple orders referencing the same Customer entity will all share the same entity reference. However, to use detached entities that we can associate with the new DbContext via AsNoTracking(), detached references to the same record will not be the same reference so we need to treat these with care.
For example where there are 2 orders for the same customer:
var ordersA = context.Orders.Include(x => x.Customer).ToList();
Assert.AreSame(orders[0].Customer, orders[1].Customer); // Passes
var ordersB = context.Orders.Include(x => x.Customer).AsNoTracking().ToList();
Assert.AreSame(orders[0].Customer, orders[1].Customer); // Fails
Even though in the 2nd example both are for the same customer. Each will have a Customer reference with the same ID, but 2 different references because the DbContext is not tracking the references used. One of the several "gotchas" with detached entities and efforts to boost performance etc. Using tracked references isn't ideal since those entities will still think they are associated with another DbContext. We can detach them, but that means diving through the object graph and detaching all references. (Do-able, but messy compared to just loading them detached)
Where it can also get complicated is when possibly migrating data in batches (disposing of a DbContext regularly to avoid performance pitfalls for larger data volumes) or synchronizing data over time. It is generally advisable to first check the destination DbContext for matching records and use those to avoid duplicate data being inserted. (or throwing exceptions)
So simple data models this is fairly straight forward. For more complex ones where there is more data to bring across and more relationships between that data, it's more complicated. For those systems I'd probably look at generating a database-to-database migration such as creating INSERT statements for the desired target DB from the data in the source database. There it is just a matter of inserting the data in relational order to comply with the data constraints. (Either using a tool or rolling your own script generation)

Entity Framework 6 - use my getHashCode()

There's a certain amount of background to get through for this one - please bear with me!
We have a n-tier WPF application using EF - we load the data from the database via dbContext into POCO classes. The dbContext is destroyed and the user is then able to edit the data.
We use "state painting" as suggested by Julie Lerman in her book "Programming Entity Framework: DBContext" so that when we add the root entity to a new dbContext for saving we can set whether each child entity is added, modified or left unchanged etc.
The problem we had when we first did this (back in November 2012!) was that if the root entity we are adding to the dbContext has multiple instances of the same child entity (ie, a "Task" record linked to a user, with "Status Histories" also linked to the same user) the process would fail because even though the child entities were the same (from the same database row) they were given different hashcodes so EF recognised them as different objects.
We fixed this, (back in December 2012!), by overriding GetHashCode on our entities to return either the database ID if the entity came from the database, or an unique negative number if the entity is as yet unsaved. Now when we add the root entity to the dbContext it was clever enough to realise the same child entity is being added more than once and it dealt with it correctly. This has been working fine since December 2012 until we upgraded to EF6 last week...
One of the new "features" with EF6 is that it now uses it's own Equals and GetHashCode methods to perform change-tracking tasks, ignoring any custom overrides. See: http://msdn.microsoft.com/en-us/magazine/dn532202.aspx (search for "Less Interference with your coding style"). This is great if you expect EF to manage the change-tracking but in a disconnected n-tier application we don't want this and in fact this breaks our code that has been working fine for over a year.
Hopefully this makes sense.
Now - the question - does anyone know of any way we can tell EF6 to use OUR GetHashCode and Equals methods like it did in EF5, or does anyone have a better way to deal with adding a root entity to a dbContext that has duplicated child entities in it so that EF6 would be happy with it?
Thanks for any help. Sorry for the long post.
UPDATED
Having poked around in the EF code it looks like the hashcode of an InternalEntityEntry (dbEntityEntry) used to be set by getting the hashcode of the entity, but now in EF6 is retrieved by using RuntimeHelpers.GetHashCode(_entity), which means our overridden hashcode on the entity is ignored. So I guess getting EF6 to use our hashcode is out of the question, so maybe I need to concentrate on how to add an entity to the context that potentially has duplicated child entities without upsetting EF. Any suggestions?
UPDATE 2
The most annoying thing is that this change in functionality is being reported as a good thing, and not, as I see it, a breaking change! Surely if you have disconnected entities, and you've loaded them with .AsNoTracking() for performance (and because we know we are going to disconnect them so why bother tracking them) then there is no reason for dbContext to override our getHashcode method!
UPDATE 3
Thanks for all the comments and suggestion - really appreciated!
After some experiments it does appear to be related to .AsNoTracking(). If you load the data with .AsNoTracking() duplicate child entities are separate objects in memory (with different hashcodes) so there is a problem state painting and saving them later. We fixed this problem earlier by overriding the hashcodes, so when the entities are added back to the saving context the duplicate entities are recognised as the same object and are only added once, but we can no longer do this with EF6. So now I need to investigate further why we used .AsNoTracking() in the first place.
One other thought I have is that maybe EF6's change tracker should only use its own hashcode generation method for entries it is actively tracking - if the entities have been loaded with .AsNoTracking() maybe it should instead use the hashcode from the underlying entity?
UPDATE 4
So now we've ascertained we can't continue to use our approach (overridden hashcodes and .AsNoTracking) in EF6, how should we manage updates to disconnected entities? I've created this simple example with blogposts/comments/authors:
In this sample, I want to open blogpost 1, change the content and the author, and save again. I've tried 3 approaches with EF6 and I can't get it to work:
BlogPost blogpost;
using (TestEntities te = new TestEntities())
{
te.Configuration.ProxyCreationEnabled = false;
te.Configuration.LazyLoadingEnabled = false;
//retrieve blog post 1, with all comments and authors
//(so we can display the entire record on the UI while we are disconnected)
blogpost = te.BlogPosts
.Include(i => i.Comments.Select(j => j.Author))
.SingleOrDefault(i => i.ID == 1);
}
//change the content
blogpost.Content = "New content " + DateTime.Now.ToString("HH:mm:ss");
//also want to change the author from Fred (2) to John (1)
//attempt 1 - try changing ID? - doesn't work (change is ignored)
//blogpost.AuthorID = 1;
//attempt 2 - try loading the author from the database? - doesn't work (Multiplicity constraint violated error on Author)
//using (TestEntities te = new TestEntities())
//{
// te.Configuration.ProxyCreationEnabled = false;
// te.Configuration.LazyLoadingEnabled = false;
// blogpost.AuthorID = 1;
// blogpost.Author = te.Authors.SingleOrDefault(i => i.ID == 1);
//}
//attempt 3 - try selecting the author already linked to the blogpost comment? - doesn't work (key values conflict during state painting)
//blogpost.Author = blogpost.Comments.First(i => i.AuthorID == 1).Author;
//blogpost.AuthorID = 1;
//attempt to save
using (TestEntities te = new TestEntities())
{
te.Configuration.ProxyCreationEnabled = false;
te.Configuration.LazyLoadingEnabled = false;
te.Set<BlogPost>().Add(blogpost); // <-- (2) multiplicity error thrown here
//paint the state ("unchanged" for everything except the blogpost which should be "modified")
foreach (var entry in te.ChangeTracker.Entries())
{
if (entry.Entity is BlogPost)
entry.State = EntityState.Modified;
else
entry.State = EntityState.Unchanged; // <-- (3) key conflict error thrown here
}
//finished state painting, save changes
te.SaveChanges();
}
If you use this code in EF5, using our existing approach of adding .AsNoTracking() to the original query..
blogpost = te.BlogPosts
.AsNoTracking()
.Include(i => i.Comments.Select(j => j.Author))
.SingleOrDefault(i => i.ID == 1);
..and overriding GetHashCode and Equals on the entities: (for example, in the BlogPost entity)..
public override int GetHashCode()
{
return this.ID;
}
public override bool Equals(object obj)
{
BlogPost tmp = obj as BlogPost;
if (tmp == null) return false;
return this.GetHashCode() == tmp.GetHashCode();
}
..all three approaches in the code now work fine.
Please can you tell me how to achieve this in EF6? Thanks
It’s interesting and surprising that you got your application working this way in EF5. EF always requires only a single instance of any entity. If a graph of objects are added and EF incorrectly assumes that it is already tracking an object when it is in fact tracking a different instance, then the internal state that EF is tracking will be inconsistent. For example, the graph just uses .NET references and collections, so the graph will still have multiple instances, but EF will only be tracking one instance. This means that changes to properties of an entity may not be detected correctly and fixup between instances may also result in unexpected behavior. It would be interesting to know if your code solved these problems in some way or if it just so happened that your app didn’t hit any of these issues and hence the invalid state tracking didn’t matter for your app.
The change we made for EF6 makes it less likely that an app can get EF state tracking into an invalid state which would then cause unexpected behavior. If you have a clever pattern to ensure the tracking state is valid that we broke with EF6 then it would be great if you could file a bug with a full repro at http://entityframework.codeplex.com/.

Entity equality across different Linq-to-SQL contexts

I am attempting to add some multi-threading to a WPF application I have created in order to create a more responsive interface, but as Linq-to-SQL data contexts are not thread safe, I am forced to use one per thread.
My problem is that the same entity pulled from two different contexts, are apparently not equal. Take the following code sample, where I have a simple database with employee records:
var context1 = new DataModelDataContext();
var context2 = new DataModelDataContext();
var emp1 = context1.Employees.Single(x => x.ID == 1);
var emp2 = context2.Employees.Single(x => x.ID == 1);
Console.WriteLine(string.Format("Employees equal: {0}", emp1 == emp2));
Console.ReadKey();
When run, this returns:
Employees equal: False
In my mind I would expect these to objects to be equal, as they would be if I pulled them from the same context. I can overcome this by checking emp1.ID == emp2.ID instead, but this is problematic when trying to use WPF bindings such as SelectedItem.
Is there any way around this? This behaviour appears to be the same in Entity Framework as well.
Two independently instantiated objects may represent the same item in the data store, but they will never evaluate as equal in any language. You will have to write code that compares the members of each object to determine if their data is equal. This may or may not be as simple as comparing the primary key.
You can always override Equals and GetHasCode to ensure that objects are "equal" even if they are not same instance (which is default equality rule used for reference types).
As #cdonner stated, when you load an object from the data store - two different instances of the object will be created with identical data. This will mean that object1 != object2.
One way of overcoming this is to set up a cache like dictionary in your repository. Example: Dictionary<Type, object>. Where object is the type of your indentifier (int in this case).
So rather than querying the data store by using inline code like context1.Employees.Single(x => x.ID == 1); you could set it up so that you call it like Repositories.Employees.WithID(1);
What this would then do is check the local repository cache for an Employee object with ID == 1, if this is available return it instead of querying the data store.
From then on, your references will always be the same.
This may need some refining for when you want to update and\or refresh the object in memory from the data store so that you don't persist stale data, and so that when you refresh the data from the data store, you update your cache.

Categories

Resources