What is the difference between IDbSet.Add and DbEntityEntry.State = EntityState.Added?

In EF 4.1+, is there a difference between these 2 lines of code?
dbContext.SomeEntitySet.Add(entityInstance);
dbContext.Entry(entityInstance).State = EntityState.Added;
Or do they do the same thing? I'm wondering if one might affect child collections / navigation properties differently than the other.

When you use dbContext.SomeEntitySet.Add(entityInstance); the state of this entity and all of its related entities/collections is set to Added, whereas dbContext.Entry(entityInstance).State = EntityState.Added; also attaches all the related entities/collections to the context but leaves them as Unchanged.
So if the entity you are trying to create has a related entity (and its value is not null), Add will also create a new record for that child entity, while the other approach won't.

I just tested this with EF 6, with related entities/navigation properties, and in both cases the created objects were identical. (All parent and related child objects were created.) The only difference I noticed was that Add was faster by about a factor of 2. My data had 1000 parent objects, each with 5 child objects for a total of 6000 objects written to the DB.
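Since the two answers above disagree, one way to check the behavior on a given EF version is to print the state of a related entity after each call. The following is a minimal sketch, assuming EF 6 (System.Data.Entity) and a database the context can connect to; Parent, Child and DemoContext are hypothetical names, and no output is claimed here because it may depend on the EF version.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity; // EF 6 assumed

// Hypothetical model, for illustration only.
public class Parent
{
    public int Id { get; set; }
    public virtual ICollection<Child> Children { get; set; } = new List<Child>();
}

public class Child
{
    public int Id { get; set; }
    public int ParentId { get; set; }
    public virtual Parent Parent { get; set; }
}

public class DemoContext : DbContext
{
    public DbSet<Parent> Parents { get; set; }
    public DbSet<Child> Children { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Variant 1: DbSet.Add on the root of the graph.
        using (var context = new DemoContext())
        {
            var parent = new Parent();
            var child = new Child();
            parent.Children.Add(child);

            context.Parents.Add(parent);
            Console.WriteLine("Add: child state = " + context.Entry(child).State);
        }

        // Variant 2: setting the entry state on the root of the graph.
        using (var context = new DemoContext())
        {
            var parent = new Parent();
            var child = new Child();
            parent.Children.Add(child);

            context.Entry(parent).State = EntityState.Added;
            Console.WriteLine("State = Added: child state = " + context.Entry(child).State);
        }
    }
}
```

Nothing is saved here; the sketch only inspects what the change tracker reports for the child after each call.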

Related

How can I attach an entity to the context when one of its children is already attached?

Background
In my application we were running into issues when trying to add a new entity with existing children to the database after mapping it from a DTO using AutoMapper. The entity and its children were not attached to the context, and so the children would be treated as new and EF would attempt to insert duplicates. My solution to this was to automatically attach an entity to the context whenever an object was mapped to a BaseEntity type (BaseEntity is the base class for all of our Model objects, it has an Id property and nothing else). Here is the code:
public TDestination Map<TDestination>(object source) where TDestination : class
{
    var result = _mapper.Map<TDestination>(source);
    if (typeof(TDestination).IsSubclassOf(typeof(BaseEntity)) && result != null)
        _context.Attach(result); //_context is a DbContext
    return result;
}
This worked fine in my initial test cases, but now I've run into an issue where the entity I'm attaching has a child that is already attached to the context. This throws "The instance of entity type 'MyChildEntity' cannot be tracked because another instance with the same key value for {'Id'} is already being tracked.".
How can I attach an entity to the context when a child is already attached? I'm trying to keep this method extremely generic so that it can be used by any object that we are trying to map from a DTO to a BaseEntity.
What I've Tried
I've tried grabbing the associated EntityEntry and recursively detach all of its children using the following method before attempting to call Attach():
private void DetachChildren(EntityEntry entity)
{
    foreach (var member in entity.Members.Where(x => x.CurrentValue != null))
    {
        if (IsBaseEntityType(member.CurrentValue.GetType()))
        {
            var childEntity = _context.Entry(member.CurrentValue);
            childEntity.State = EntityState.Detached;
            DetachChildren(childEntity);
        }
    }
}
I've confirmed that this loop does reach the problem child and sets its state to detached, but I still end up getting the same error when calling Attach().

Welcome to the hell that is working with detached entities.
AutoMapper can be leveraged to update existing entities rather than forming an entity and attaching it. Given an object like an Order:
public void UpdateOrder(OrderDTO orderDTO)
{
    // Load the existing order; Single throws if it does not exist.
    var order = _context.Orders.Single(x => x.OrderId == orderDTO.OrderId);
    // Copy the DTO's values onto the tracked entity.
    _mapper.Map(orderDTO, order);
    _context.SaveChanges();
}
The benefits of this approach are that it works whether or not the order happens to be tracked, it asserts that the order exists (which an update assumes), and when SaveChanges runs, only the fields that actually changed are updated. If only 1 field changed, the UPDATE statement updates that single field. Attaching a new object and setting its EntityState to Modified will update all fields. This could introduce unexpected attack vectors for changing data you don't expect to be changed, since a DTO needs to pass enough info to construct a whole entity to avoid unintentionally null-ing data. The mapping from DTO to entity should ensure that only editable fields are copied across.
In the case where the OrderDTO will contain one or more child collections to update, you will likely need to use a mapping that excludes those collections, then use AfterMap in the mapping configuration to inspect the child collection for new vs. existing vs. removed entities and handle those accordingly. (Add vs. mapper.Map vs. Remove)
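The new vs. existing vs. removed handling described above can be sketched without any EF or AutoMapper dependency. This is only an illustration under stated assumptions: OrderLine, OrderLineDTO, the Quantity property and the OrderLineSync helper are hypothetical names, and an OrderLineId of 0 is assumed to mean "not saved yet".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity and DTO, for illustration only.
public class OrderLine
{
    public int OrderLineId { get; set; }
    public int Quantity { get; set; }
}

public class OrderLineDTO
{
    public int OrderLineId { get; set; } // 0 is assumed to mean "new line"
    public int Quantity { get; set; }
}

public static class OrderLineSync
{
    // Brings the entity's lines in sync with the lines sent in the DTO.
    public static void Apply(ICollection<OrderLine> existing, IReadOnlyCollection<OrderLineDTO> incoming)
    {
        var incomingIds = new HashSet<int>(
            incoming.Where(d => d.OrderLineId != 0).Select(d => d.OrderLineId));

        // Removed: on the entity but no longer in the DTO.
        // ToList() copies the matches so the collection can be modified safely.
        foreach (var line in existing.Where(l => !incomingIds.Contains(l.OrderLineId)).ToList())
            existing.Remove(line);

        foreach (var dto in incoming)
        {
            if (dto.OrderLineId == 0)
            {
                // New: no key yet, so add a fresh entity.
                existing.Add(new OrderLine { Quantity = dto.Quantity });
            }
            else
            {
                // Existing: copy editable values onto the current entity.
                // Single throws if the DTO refers to a line that is not on this order.
                var match = existing.Single(l => l.OrderLineId == dto.OrderLineId);
                match.Quantity = dto.Quantity;
            }
        }
    }
}
```

In a real AfterMap callback, the lines taken out of the collection would normally also have to be removed through the context, since removing an entity from a navigation collection does not necessarily delete its row.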
Generally the updates can be structured to perform atomic operations that make the entity interactions as straightforward as possible. For instance UpdateOrderDetails(orderDTO) would update information about the order itself, where there would be separate methods such as AddOrderLine(newOrderLineDTO) vs. UpdateOrderLine(orderLineDTO) vs. RemoveOrderLine(orderLineId) etc. rather than having all order line operations and other related changes done through a single UpdateOrder method accepting a whole modified object graph.
The alternative when dealing with graphs and the possibility of tracked entities is that you need to check each and every related entity against the DbSet's .Local, or use other means, to see if the entity is tracked. If it is, then you have to replace the references and copy any applicable changes to the already tracked entity. Telling a DbContext to ignore an existing entity isn't always a simple matter, as there can be references to that entity in other tracked entities. Generally you'll want to detect a tracked entity reference, then update your references to use that tracked reference and update it as needed. It is lots of mucking around with references, and definitely does not work well with generic methods.
Generic operations are tempting from a DRY standpoint, but dealing with situations where some entities might be tracked vs. not, and then handling type mismatches etc. (source = object = can be anything..) adds a lot of complexity in place of simpler methods to handle operations.

EF 6 Saving multiple levels of child entities and multiple parents

Given this model:
I would like to be able to save the relations in one SaveChanges call. That is, I have either a new or an updated ContainerParent with multiple first-level children, and each of those can have 1 or 2 further levels of children.
The thing is, each child has both a key to another Child, for finding its parent, and a key to the container, so that the container can get all of its Children regardless of their hierarchical level.
With this pseudocode (for the case where all entities are created, not updated):
var newContainerParent = context.ContainerParents.Add(new ContainerParent());
var rootChild = context.Children.Add(new Child());
var secondLevelChild = new Child();
var thirdLevelChild = new Child();
secondLevelChild.Children.Add(thirdLevelChild);
rootChild.Children.Add(secondLevelChild);
newContainerParent.Children.Add(rootChild);
context.SaveChanges();
The problem with this code is that only rootChild will have the FK to the container set. I also tried adding the children to both their parent child AND the container:
rootChild.Children.Add(secondLevelChild);
newContainerParent.Children.Add(rootChild);
newContainerParent.Children.Add(secondLevelChild);
newContainerParent.Children.Add(thirdLevelChild);
I have the same problem when updating an existing container with new children. I set the already existing key of the parent on all the children, but when SaveChanges is called the key is not saved; it's reverted to null.
I fixed it by doing all this in 2 steps: saving once, then getting all the newly created children and updating them with the parent key, then calling SaveChanges again.
I have a feeling I'm missing something, and that I should not need to save twice.

The number or frequency of SaveChanges calls has no real bearing on anything, performance included. So why do you want to minimize it?
Actually, storing such a self-referencing table with one SaveChanges is not possible,
because the ID of a new entity is generated when it is saved. So you first need to save it; then you get the ID, which you can store in another entity. This might require further UPDATE commands on the entity you just stored.
You have two options to solve this.
1) Manually generated IDs: handle it all yourself, so that you know the ID before you store the entity.
2) If there is no circularity in your dependencies, i.e. a perfect tree structure, save the items top-down, level by level. I assume the children hold a reference to their parents, so the root has no reference to any other item; save it first, then the first-level children, and so on.
This requires multiple SaveChanges calls, but that is not a disadvantage. It is one INSERT command per entity anyway, no matter whether you do it in 1 SaveChanges or in 100.
Both solutions avoid UPDATE commands on the entities; they do inserts only.
Entity Framework could actually work out these dependencies itself and derive an insert order for new entities, but this is not implemented today, or not perfectly, especially for self-referencing tables. The order in which items are saved is more or less random, so you have to enforce the order with intermediate SaveChanges calls.
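Option 1 can be sketched in plain C#, independent of EF. This is an illustration under assumptions that are not in the question: keys are client-generated Guids, and the property names ParentChildId and ContainerId as well as the TreeKeys helper are hypothetical. With keys known up front, both foreign keys can be filled in for every level before anything is saved.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model with client-generated Guid keys.
public class ContainerParent
{
    public Guid Id { get; set; } = Guid.NewGuid();
}

public class Child
{
    public Guid Id { get; set; } = Guid.NewGuid();
    public Guid? ParentChildId { get; set; } // key of the parent Child, null for a root child
    public Guid ContainerId { get; set; }    // key of the owning container, set on every level
    public List<Child> Children { get; } = new List<Child>();
}

public static class TreeKeys
{
    // Walks the subtree below 'node' and sets both foreign keys on every child.
    public static void AssignKeys(ContainerParent container, Child node, Child parent = null)
    {
        node.ContainerId = container.Id;
        node.ParentChildId = parent?.Id;

        foreach (var child in node.Children)
            AssignKeys(container, child, node);
    }
}
```

Calling TreeKeys.AssignKeys(container, rootChild) before saving would give every descendant its container key. With EF, the mapping would presumably also need to mark these keys as not database-generated, which is not shown here.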

Behavior I can't explain with Entity Framework

I'm trying to find documentation about a behavior in Entity Framework. It works, but before relying on this behavior, I want to be sure it's normal EF behavior and not an unexpected side effect that would be "fixed" in a future version. Here's the situation:
I have a pretty deep object hierarchy (which I will simplify here). The structure is a multi-level collection of objects (class A contains a collection of class B, which contains a collection of class C, which contains ...), 7 levels deep.
I have to filter elements on some properties of C. My first attempts, which loaded the complete hierarchy, produced a complicated LINQ query (which would be a maintenance nightmare), and the generated SQL query was far from efficient. To simplify all this, I decided to split the query into 2 steps: first, I load the collection of class C (and all its children), filtered as I want; then I load class A and B for all instances of B that contain an item of my filtered collection of C.
Here's the catch: using that technique, I expected to have to manually repopulate the collection of C in my B class, but the collection is already populated with the elements of the filtered collection. I verified the SQL query in IntelliTrace, and the data required to fill instances of C is not included in the second query, so the only logical conclusion is that EF did this from the information in the context. BTW, lazy loading is turned off for that context.
Is this behavior normal in EF? If so, can you give me a link to the documentation explaining how this works?
Here's a snippet to illustrate this :
using(var context = new MyContext())
{
    //Includes and where clauses are greatly simplified for the purpose of the sample
    var filteredC = context.C.Include(x=>x.ListOfD).Include(x=>x.ListOfD.Select(y=>y.ListOfE)).Where(c=>c.Status==Status).ToList();
    int[] bToLoad = filteredC.Select(c=>c.IDofB).Distinct().ToArray();
    var listOfAAndB = context.A.Include(a=>a.ListOfB).Where(x=>x.ListOfB.Any(y=>bToLoad.Contains(y.ID))).ToList();
    //At this step, I expected B.ListOfC to be empty but it's somehow populated
}
Thanks

This is standard behavior for a DbContext life cycle. To be honest, I can't link you to any documentation that describes this feature, but I can explain how it works.
An EF Context is stateful, and keeps track of all the entities that have already been fetched. It also knows about the relations between entities in your DB and your entity model.
So if you fetch new objects that have a direct relation to that object (in your case, C has a foreign key to B), the navigation property is populated by the Context. This is a feature, and not a bug, as it tries to explicitly avoid Lazy loading queries to the DB for objects that have already been fetched.
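The idea can be illustrated with a toy tracker in plain C#. This is a conceptual sketch only, not EF's actual implementation: TinyContext and its Track methods are hypothetical, and B and C are reduced to the members needed here. Whenever an entity starts being tracked, the tracker looks for already tracked entities on the other end of the foreign key and connects the navigation properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the classes in the question.
public class B
{
    public int ID { get; set; }
    public List<C> ListOfC { get; } = new List<C>();
}

public class C
{
    public int ID { get; set; }
    public int IDofB { get; set; } // foreign key to B
    public B Parent { get; set; }
}

// Hypothetical toy tracker that performs relationship fixup.
public class TinyContext
{
    private readonly Dictionary<int, B> _trackedB = new Dictionary<int, B>();
    private readonly List<C> _trackedC = new List<C>();

    public void Track(C c)
    {
        _trackedC.Add(c);
        // If the parent is already tracked, connect it right away.
        if (_trackedB.TryGetValue(c.IDofB, out var b))
            Link(b, c);
    }

    public void Track(B b)
    {
        _trackedB[b.ID] = b;
        // Connect any children that were tracked before their parent arrived.
        foreach (var c in _trackedC.Where(x => x.IDofB == b.ID))
            Link(b, c);
    }

    private static void Link(B b, C c)
    {
        c.Parent = b;
        if (!b.ListOfC.Contains(c))
            b.ListOfC.Add(c);
    }
}
```

Tracking a C first and its B afterwards mirrors the two-step query in the question: the B ends up with a populated ListOfC even though the second step never loaded any C.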

How does DetectChanges work?

There are the following 2 entities, with the following properties:
Parent
    ID
    Children
Child
    ID
    ParentID
    Parent
Now I have the following code:
db.Configuration.AutoDetectChangesEnabled = false;
var child1 = new Child();
parent.Children.Add(child1);
db.ChangeTracker.DetectChanges();
parent.Children.Remove(child1);
var child2 = new Child();
child2.Parent = parent;
child2.ParentID = parent.ID;
db.Children.Add(child2);
At this point, child1 and child2 are completely identical. The Parent and ParentID properties have the same values (that of parent). Examining the dbContext entry for both of them also shows exactly the same information, e.g. OriginalValues is empty for both.
If I now call db.ChangeTracker.DetectChanges however, child1.Parent becomes null, while child2.Parent keeps its value. How does EF know to do this - where does it keep the info needed to be able to make this difference?
Thank you for any ideas

In general
Detect Changes works by detecting the differences between the current property values of the entity and the original property values that are stored in a snapshot when the entity was queried or attached.
Entity Framework Automatic Detect Changes
and
Ensures that ObjectStateEntry changes are synchronized with changes in all objects that are tracked by the ObjectStateManager.
ObjectContext.DetectChanges Method
So in your example, after you disable automatic change detection, the new values sit on top of the original values until you update the model by calling context.ChangeTracker.DetectChanges and synchronize all objects (it more or less defers the operations). When you then call DetectChanges, all operations (Remove and Add) get committed to the model so that they can be saved later. Thus the parent is removed from child1 (because of parent.Children.Remove(child1)), and child2 keeps its value because there the operation was to add the parent (assignment).
The reason why you would want to disable automatic change detection is performance:
If you are tracking a lot of entities in your context and you call one of these methods many times in a loop, then you may get significant performance improvements by turning off detection of changes for the duration of the loop.
If you'd like to know the exact technical/implementation details, you can dig through the source code:
.NET Framework source code online > DetectChanges
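The snapshot comparison described in the first quote can be sketched in plain C#. This is a conceptual illustration only, not how EF implements it: SnapshotTracker is a hypothetical class that handles scalar properties via reflection, and the Child class is reduced to two properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the Child entity in the question.
public class Child
{
    public int ID { get; set; }
    public int? ParentID { get; set; }
}

// Hypothetical snapshot-based change detector.
public class SnapshotTracker
{
    // Original property values per tracked entity (keyed by reference).
    private readonly Dictionary<object, Dictionary<string, object>> _snapshots =
        new Dictionary<object, Dictionary<string, object>>();

    // Takes the snapshot, as would happen when an entity is queried or attached.
    public void Attach(object entity)
    {
        _snapshots[entity] = ReadValues(entity);
    }

    // Returns the names of the properties whose current value differs from the snapshot.
    public IList<string> DetectChanges(object entity)
    {
        var original = _snapshots[entity];
        var current = ReadValues(entity);
        return current
            .Where(pair => !Equals(pair.Value, original[pair.Key]))
            .Select(pair => pair.Key)
            .ToList();
    }

    private static Dictionary<string, object> ReadValues(object entity)
    {
        return entity.GetType()
            .GetProperties()
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .ToDictionary(p => p.Name, p => p.GetValue(entity));
    }
}
```

Nothing is noticed at the moment a property is assigned; a difference only surfaces when DetectChanges compares the current values against the stored snapshot, which is why the result depends on when that call happens.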

Load Entities AsNoTracking() with navigation properties, without specifying includes

I would like to know if the following scenario is possible with Entity Framework:
I want to load several tables with the AsNoTracking option, since they are all static tables that cannot be changed by the user.
Those tables also happen to be navigation properties of others. Up till now I have relied on the AutoMapping feature of Entity Framework, and I don't use .Include() or lazy loading.
So instead of:
var result = from x in context.TestTable
.Include("ChildTestTable")
select x;
I am using it like this:
context.ChildTestTable.Load();
context.TestTable.Load();
var result = context.TestTable.Local;
This works smoothly because the application is designed so that the tables in the database are very small; no table exceeds 600 rows (and that's already a pretty high value in my app).
Now, my way of loading data isn't working with .AsNoTracking().
Is there any way to make it work?
So I can write:
context.ChildTestTable.AsNoTracking().ToList();
var result = context.TestTable.AsNoTracking().ToList();
Instead of:
var result = from x in context.TestTable.AsNoTracking()
.Include("ChildTestTable")
select x;
So basically, I want to have 1 or more tables loaded with AutoMapping feature on but without loading them into the Object State Manager, is that a possibility?

The simple answer is no. For normal tracking queries, the state manager is used for both identity resolution (finding a previously loaded instance of a given entity and using it instead of creating a new instance) and fixup (connecting navigation properties together). When you use a no-tracking query it means that the entities are not tracked in the state manager. This means that fixup between entities from different queries cannot happen because EF has no way of finding those entities.
If you were to use Include with your no-tracking query then EF would attempt to do some fixup between entities within the query, and this will work a lot of the time. However, some queries can result in referencing the same entity multiple times and in some of those cases EF has no way of knowing that it is the same entity being referenced and hence you may get duplicates.
I guess the thing you don't really say is why you want to use no-tracking. If your tables don't have a lot of data then you're unlikely to see significant perf improvements, although many factors can influence this. (As a digression, using the ObservableCollection returned by .Local could also impact perf and should not be necessary if the data never changes.) Generally speaking you should only use no-tracking if you have an explicit need to do so since otherwise it ends up adding complexity without benefit.
