Unexpected behavior in entity framework

I ran into what I think is a really odd situation with Entity Framework. Basically, if I update a row directly with a SQL command, when I retrieve that row through LINQ it doesn't have the updated information. Please see the example below for more information.
First I created a simple DB table:
CREATE TABLE dbo.Foo (
    Id int NOT NULL PRIMARY KEY IDENTITY(1,1),
    Name varchar(50) NULL
)
Then I created a console application to add an object to the DB, update it with a sql command and then retrieve the object that was just created. Here it is:
public class FooContext : DbContext
{
    public FooContext() : base("FooConnectionString")
    {
    }
    public IDbSet<Foo> Foo { get; set; }
    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Foo>().ToTable("Foo");
        base.OnModelCreating(modelBuilder);
    }
}
public class Foo
{
    [Key]
    public int Id { get; set; }
    public string Name { get; set; }
}
public class Program
{
    static void Main(string[] args)
    {
        //setup the context
        var context = new FooContext();
        //add the row
        var foo = new Foo()
        {
            Name = "Before"
        };
        context.Foo.Add(foo);
        context.SaveChanges();
        //update the name
        context.Database.ExecuteSqlCommand("UPDATE Foo Set Name = 'After' WHERE Id = " + foo.Id);
        //get the new foo
        var newFoo = context.Foo.FirstOrDefault(x => x.Id == foo.Id);
        //I would expect the name to be 'After' but it is 'Before'
        Console.WriteLine(string.Format("The new name is: {0}", newFoo.Name));
        Console.ReadLine();
    }
}
The WriteLine at the bottom prints "Before", but I would expect it to print "After". The odd thing is that if I run SQL Profiler I see the SQL query being executed, and if I run that query in Management Studio myself, it returns "After" as the name. I am running SQL Server 2014.
Can someone please help me understand what is going on here?
UPDATE:
It is going to the database on the FirstOrDefault line. Please see the attached screenshot from SQL Profiler.
So my question really is this:
1) If it is caching, shouldn't it avoid going to the DB at all? Is this a bug in EF?
2) If it is going to the DB and spending the resources, shouldn't EF update the object?

FooContext includes change tracking and caching, so the in-memory object that is returned from your query is the same instance that you added earlier. Calling SaveChanges() does not clear the context, and FooContext is not aware of the changes that happened underneath it in the database.
This is usually a good thing -- not making expensive database calls for every operation.
In your sample, try making the same query from a new FooContext, and you should see "After".
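As a rough sketch of that suggestion (assuming EF6 and the Foo and FooContext classes from your question; the FreshReadSketch class and PrintCurrentName method are names I made up for illustration), any of these three reads should bypass the stale tracked instance:

```csharp
using System;
using System.Data.Entity;   // EF6; provides the AsNoTracking() extension method
using System.Linq;

public static class FreshReadSketch
{
    // Hypothetical helper. Assumes Foo and FooContext from the question and
    // that fooId is the row that was changed by the raw UPDATE statement.
    public static void PrintCurrentName(FooContext context, int fooId)
    {
        // 1) A brand-new context tracks nothing yet, so the row read from
        //    the database is materialized into a new object.
        using (var fresh = new FooContext())
        {
            var fromFresh = fresh.Foo.FirstOrDefault(x => x.Id == fooId);
            if (fromFresh != null)
                Console.WriteLine("New context: {0}", fromFresh.Name);
        }

        // 2) A no-tracking query on the existing context returns an object
        //    that is not tied to the context's tracked instance.
        var untracked = context.Foo.AsNoTracking().FirstOrDefault(x => x.Id == fooId);
        if (untracked != null)
            Console.WriteLine("AsNoTracking: {0}", untracked.Name);

        // 3) Reload() overwrites the tracked instance with the database values.
        var tracked = context.Foo.Find(fooId);
        if (tracked != null)
        {
            context.Entry(tracked).Reload();
            Console.WriteLine("After Reload: {0}", tracked.Name);
        }
    }
}
```

Calling it right after the ExecuteSqlCommand line in your Main, with the same context and foo.Id, is how I would check this. Note that option 3 changes the object you already hold, while options 1 and 2 leave it untouched.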
Update:
Responding to your updated question: yes, you are right. I missed before that you were using FirstOrDefault(). If you were using context.Foo.Find(foo.Id), as I wrongly assumed, then there would be no query, because Find() checks the entities the context already tracks before going to the database.
As for why the in-memory object is not updated to reflect the change in the database, I'd need to do some research to do anything more than speculate. That said, here is my speculation:
An instance of the database context cannot return more than one instance of the same entity. Within a unit of work, we must be able to rely on the context to return the same instance of the entity. Otherwise, we might query by different criteria and get 3 objects representing the same conceptual entity. At that point, how can the context deal with changes to any of them? What if the name is changed to a different value on two of them and then SaveChanges() is called -- what should happen?
Given then that the context tracks at most a single instance of each entity, why can't EF just update that entity at the point at which a query is executed? EF could even discard that change if there is a pending in-memory change, since it knows about those changes.
I think one part of the answer is that diffing all the columns on large entities and in large result sets is prohibitively expensive.
I think a bigger part of the answer is that executing a simple SELECT statement should not have the potential to cause side effects throughout the system. Entities may be grouped or looped over by the value of some property, and changing the value of that property at an indeterminate time, as a result of a SELECT query, is highly unsound.
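To make the speculation above concrete, here is a deliberately simplified model of an identity map in plain C#. It is not Entity Framework's actual implementation; TinyIdentityMap and Materialize are hypothetical names. It only illustrates the behaviour you observed: the row is fetched again, but because the key is already tracked, the existing instance is returned and the freshly read values are ignored.

```csharp
using System;
using System.Collections.Generic;

public class Foo
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical, minimal stand-in for a context's tracked-entity store.
public class TinyIdentityMap
{
    private readonly Dictionary<int, Foo> _tracked = new Dictionary<int, Foo>();

    // Called once for every row that comes back from a query.
    public Foo Materialize(int id, string nameFromDatabase)
    {
        Foo existing;
        if (_tracked.TryGetValue(id, out existing))
        {
            // Key already tracked: return the same instance and
            // discard the values that were just read.
            return existing;
        }

        var created = new Foo { Id = id, Name = nameFromDatabase };
        _tracked.Add(id, created);
        return created;
    }
}

public static class Demo
{
    public static void Main()
    {
        var map = new TinyIdentityMap();

        var first = map.Materialize(1, "Before");  // first load, gets tracked
        var second = map.Materialize(1, "After");  // row re-read after the raw UPDATE

        Console.WriteLine(ReferenceEquals(first, second)); // True
        Console.WriteLine(second.Name);                    // Before
    }
}
```

The design choice this models is "one instance per key, first values win", which is why a second query costs a round trip but does not change what you see.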

Related

Entity Framework Core - Modified key on owned type

While saving changes in my database, an exception with the following message is returned:
The property 'OrderId' on entity type 'Order.CustomerDeliveryDetails#CustomerDetails' is part of a key and so cannot be modified or marked as modified. To change the principal of an existing entity with an identifying foreign key first delete the dependent and invoke 'SaveChanges' then associate the dependent with the new principal.
The database is implemented with Entity Framework Core using a 'code first' approach. Order.CustomerDeliveryDetails is an owned type (of the type CustomerDetails) of the entity Order. CustomerDetails has no property called OrderId. As I understand it, OrderId is an implicit key, generated by Entity Framework Core as a shadow property.
The classes are structured as follows:
public class Order
{
    public int Id { get; set; }
    public CustomerDetails CustomerDeliveryDetails { get; set; }
}
[Owned]
public class CustomerDetails
{
    public string Street { get; set; }
}
The object is updated as follows:
var order = await orderContext.Orders
    .Where(o => o.Id == updateOrder.Id)
    .FirstOrDefaultAsync();
order.CustomerDeliveryDetails.Street = updateOrder.CustomerDeliveryDetails.Street;
await orderContext.SaveChangesAsync();
What I fail to understand is how OrderId can be modified, when it can't be accessed directly in the code.
The only thing I can think of which might cause this error is the fact that this update is being run on a timed webjob in Azure. This hunch is supported by the fact that the update passes the related unit tests. Could this have to do with a race condition?
Update:
I'm fairly certain the error comes from some sort of race condition. The timed webjob loads a list of orders that need to be updated every 2 minutes. The update works fine as long as the list contains fewer than roughly 100 orders, but starts to fail once this list gets longer.
The webjob is probably unable to finish updating all the orders within 2 minutes if the list gets too long.
The context is added through dependency injection as follows:
serviceProvider.AddDbContext<OrdersContext>(options => options.UseSqlServer(ctx.Configuration["ConnectionString"], sqlOptions => sqlOptions.EnableRetryOnFailure()));
My best guess is that the context is being shared between multiple calls of the webjob, which is causing the errors.

This boils down to your database relationships. Are you using a database first or a code first approach? How are the models defined? What's the relationship between the Order, CustomerDetails and CustomerDeliveryDetails tables?
Please provide the code and I will be able to help you with the solution.

how DbContext.AttachRange() works in this scenario

I saw a book with some code like this:
public class Order
{
    public int OrderID { get; set; }
    public ICollection<CartLine> Lines { get; set; }
    ...
}
public class CartLine
{
    public int CartLineID { get; set; }
    public Product Product { get; set; }
    public int Quantity { get; set; }
}
//Product class is just a normal class that has properties such as ProductID, Name etc
and in the order repository, there is a SaveOrder method:
public void SaveOrder(Order order)
{
    context.AttachRange(order.Lines.Select(l => l.Product));
    if (order.OrderID == 0)
    {
        context.Orders.Add(order);
    }
    context.SaveChanges();
}
and the book says:
when store an Order object in the database. When the user’s cart data is deserialized from the session store, the JSON package creates new objects that are not known to Entity Framework Core, which then tries to write all the objects into the database. For the Product objects, this means that Entity Framework Core tries to write objects that have already been stored, which causes an error. To avoid this problem, I notify Entity Framework Core that the objects exist and shouldn’t be stored in the database unless they are modified
I'm confused, and have two questions:
Q1: Why does writing objects that have already been stored cause an error? From the point of view of the underlying database, isn't it just an UPDATE statement that sets all columns to their current values? I know it does unnecessary work by rewriting everything without changing anything, but it shouldn't throw an error at the database level, should it?
Q2: Why don't we do the same thing for CartLine, like this:
    context.AttachRange(order.Lines.Select(l => l.Product));
    context.AttachRange(order.Lines);
to prevent the CartLine objects from being stored in the database, the same way we do for the Product objects?

Okay, so this is gonna be a long one:
1st Question:
In Entity Framework (core or "old" 6), there's this concept of "Change tracking". The DbContext class is capable of tracking all the changes you made to your data, and then applying it in the DB via SQL statements (INSERT, UPDATE, DELETE). To understand why it throws an error in your case, you first need to understand how the DbContext / change tracking actually works. Let's take your example:
public void SaveOrder(Order order)
{
    context.AttachRange(order.Lines.Select(l => l.Product));
    if (order.OrderID == 0)
    {
        context.Orders.Add(order);
    }
    context.SaveChanges();
}
In this method, you receive an Order instance which contains Lines and Products. Let's assume that this method was called from some web application, meaning you didn't load the Order entity from the DB. This is what's known as the Disconnected Scenario.
It's "disconnected" in the sense that your DbContext is not aware of their existence. When you call context.AttachRange you are literally telling EF: I'm in control here, and I'm 100% sure these entities already exist in the DB. Please be aware of them from now on!
Let's use your code again: Imagine that it's a new Order (so it will enter your if there) and you remove the context.AttachRange part of the code. As soon as the code reaches the Add and SaveChanges these things will happen internally in the DbContext:
The DetectChanges method will be called
It will try to find all the entities (Order, Lines and Products) in its current graph
If it doesn't find them, they will be added to the "pending changes" as new records to be inserted
Then you continue and call SaveChanges and it will fail as the book tells you. Why? Imagine that the Products selected were:
Id: 1, "Macbook Pro"
Id: 2, "Office Chair"
When the DbContext looked at the entities and didn't know about them, it added them to the pending changes with a state of Added. When you call SaveChanges, it issues the INSERT statements for these products based on their current state in the model. Since Ids 1 and 2 already exist in the database, the operation fails with a primary key violation.
That's why you have to call Attach (or AttachRange) in this case. This effectively tells EF that the entities exist in the DB, and it should not try to insert them again. They will be added to the context with a state of Unchanged. Attach is often used in these cases where you didn't load the entities from the dbContext before.
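If you want to see the states for yourself, a small sketch along these lines should do it. I'm assuming EF Core with the Microsoft.EntityFrameworkCore.InMemory package here; ShopContext, StateSketch and the database name are made up for the example, and the entity classes are trimmed-down versions of the ones from the book.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Product
{
    public int ProductID { get; set; }
    public string Name { get; set; }
}

public class CartLine
{
    public int CartLineID { get; set; }
    public Product Product { get; set; }
    public int Quantity { get; set; }
}

public class Order
{
    public int OrderID { get; set; }
    public ICollection<CartLine> Lines { get; set; }
}

// Hypothetical context used only for this illustration.
public class ShopContext : DbContext
{
    public ShopContext(DbContextOptions<ShopContext> options) : base(options) { }
    public DbSet<Order> Orders { get; set; }
    public DbSet<Product> Products { get; set; }
}

public static class StateSketch
{
    public static void Main()
    {
        // Assumption: the in-memory provider, so no real database is needed.
        var options = new DbContextOptionsBuilder<ShopContext>()
            .UseInMemoryDatabase("state-sketch")
            .Options;

        // A "disconnected" graph, as if it had just been deserialized.
        var order = new Order
        {
            Lines = new List<CartLine>
            {
                new CartLine
                {
                    Quantity = 1,
                    Product = new Product { ProductID = 1, Name = "Macbook Pro" }
                }
            }
        };

        using (var context = new ShopContext(options))
        {
            // Tell EF the products already exist; comment this line out to compare.
            context.AttachRange(order.Lines.Select(l => l.Product));
            context.Orders.Add(order);

            // Print the state EF has assigned to every tracked entity.
            foreach (var entry in context.ChangeTracker.Entries())
            {
                Console.WriteLine("{0}: {1}", entry.Entity.GetType().Name, entry.State);
            }
        }
    }
}
```

With the AttachRange line in place I would expect the Product to be reported as Unchanged and the Order and CartLine as Added; with it commented out, the Product should show up as Added as well, which is what leads to the failing INSERT.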
2nd question:
This is hard for me to assess because I don't know the context/model at that level, but here's my guess:
You don't need to do that with CartLine because with every order, you probably want to insert new order lines. Think of buying stuff at Amazon. You put the products in the cart and it will generate an Order, then Order Lines, the things that compose that order.
If you were then to update an existing order and add more items to it, then you would run into the same issue. You would have to load the existing CartLines prior to saving them in the db, or call Attach as you did here.
Hope it's a little bit clearer. I have answered a similar question where I gave more details, so maybe reading that also helps more:
How does EF Core Modified Entity State behave?

Strange caching issues in Entity Framework 7

I've come across something quite specific and wondering if anyone out there has faced the same issue.
My SQL query (in a Stored Procedure) is simple, I've simplified it a little but:
BEGIN
    SELECT DISTINCT
        [ANU].[OldUserId] AS [ID]
        ,[ANU].[Email]
    FROM
        [dbo].[AspNetUsers] AS [ANU]
    INNER JOIN
        [dbo].[User] AS [U]
    ON
        [U].[ID] = [ANU].[OldUserId]
END
Pretty simple, and the SP is fine when run directly through SQL Management Studio.
However, I run it via Entity Framework as such:
[ResponseCache(Duration = 0)] // used this out of desperation
public List<DriverDTO> GetByOrganisation(int organisationId, bool isManager)
{
    return _context.Set<DriverDTO>().FromSql("dbo.New_User_List @OrganisationId = {0}, @IsManager = {1}", organisationId, isManager).ToList();
}
DriverDTO:
public class DriverDTO
{
    [Key] // tried removing this also
    public int ID { get; set; }
    public string Email { get; set; }
}
It runs and brings back results, fine. However these results are getting cached. Every call to the SP after the first call brings back the same results, even if I update the records. So, say I edit a User record and change the email - the originally fetched email will always be brought back.
Again, running the SP via SQL Manager brings back the correct results, but my C#/EF side does not. The only logical thing in my head here is that something is somehow being cached under the hood that I desperately need to get around?!

Your loaded entities are cached in the DbContext (in each DbSet's Local collection).
There are several options:
1) Use AsNoTracking for your query:
    return _context.Set<DriverDTO>()
        .AsNoTracking()
        .FromSql("dbo.New_User_List @OrganisationId = {0}, @IsManager = {1}", organisationId, isManager)
        .ToList();
This should avoid Entity Framework caching completely for this query.
2) Use a new DbContext instance for each query.
3) Alternatively, detach all cached entities from the context before issuing your query... something like (untested):
    _context.Set<DriverDTO>().Local.ToList().ForEach(x =>
    {
        _context.Entry(x).State = EntityState.Detached;
    });
Notice that, contrary to what one may think, you can't use _context.Set<DriverDTO>().Local.Clear(), since this will mark your entities as deleted (so if you call SaveChanges afterwards, it'll delete the entities from the database), so be careful if you are experimenting with the local cache.
Unless you have a need to use a single DbContext or have the received entities from the SP attached to it, I'd go for #2. Otherwise, I'd go for #1. I put #3 there for completeness but I'd avoid meddling with the local cache unless strictly necessary.

EntityFramework adding new object to a collection

This question has probably been answered before; if that's the case, I would appreciate it if you could point me in the right direction.
I would like to know what happens when a new object is added to an Entity Framework collection.
More precisely, I'd like to know whether the whole collection is loaded into memory in order to add the new object.
For example:
public class MyContext : DbContext
{
    public DbSet<Assignment> Assignments { get; set; }
}
public class SomeClass
{
    public void AddAssignment(Assignment assignment)
    {
        var ctx = new MyContext();
        ctx.Assignments.Add(assignment);
        ctx.SaveChanges();
    }
}
Do all the assignment records have to be loaded into memory just to perform a simple insert???

In short: no, the entire collection is not loaded.
The AddObject() method (the ObjectContext API equivalent of the DbSet.Add() call in your code) is used for adding newly created objects that do not exist in the database. When AddObject() is called, a temporary EntityKey is generated and the EntityState is set to 'Added'.
When context.SaveChanges() is called, EF 4.0 goes ahead and inserts the record into the database. Note that Entity Framework converts the code into queries that the database understands and handles all data interaction and low-level details. Also notice that in the code above we are accessing data as objects and properties.
After you have executed the code, you can go ahead and physically verify the record in the database.

Entity Framework: Linq query finds entries by original data but returns reference to changed entry

I just spent some days now to find a bug caused by some strange behavior of the Entity Framework (version 4.4.0.0). For an explanation I wrote a small test program. At the end you'll find some questions I have about that.
Declaration
Here we have a class "Test" which represents our test dataset. It only has an ID (primary key) and a "value" property. In our TestContext we implement a DbSet Tests, which shall handle our "Test" objects as a database table.
public class Test
{
    public int ID { get; set; }
    public int value { get; set; }
}
public class TestContext : DbContext
{
    public DbSet<Test> Tests { get; set; }
}
Initialization
Now, we remove any (if existent) entries from our "Tests" table and add our one and only "Test" object. It has ID=1 (primary key) and value=10.
// Create a new DBContext...
TestContext db = new TestContext();
// Remove all entries...
foreach (Test t in db.Tests) db.Tests.Remove(t);
db.SaveChanges();
// Add one test entry...
db.Tests.Add(new Test { ID = 1, value = 10 });
db.SaveChanges();
Tests
Finally, we run some tests. We select our entry by its original value (=10) and we change the "value" of our entry to 4711. BUT, we do not call db.SaveChanges(); !!!
// Find our entry by its value (=10)
var result = from r in db.Tests
             where r.value == 10
             select r;
Test t2 = result.FirstOrDefault();
// change its value from 10 to 4711...
t2.value = 4711;
Now, we try to find the (old) entry by the original value (=10) and do some tests on the results of that.
// now we try to select it via its old value (==10)
var result2 = from r in db.Tests
              where r.value == 10
              select r;
// Did we get it?
if (result2.FirstOrDefault() != null && result2.FirstOrDefault().value == 4711)
{
    Console.WriteLine("We found the changed entry by it's old value...");
}
When running the program we'll actually see "We found the changed entry by it's old value...". That means we have run a query for r.value == 10 and found something... This would be acceptable. BUT we receive the already changed object (which does not fulfill value == 10)!!!
Note: You'll get an empty result set for "where r.value == 4711".
In some further testing, we find out, that the Entity Framework always hands out a reference to the same object. If we change the value in one reference, it's changed in the other one too. Well, that's ok... but one should know it happens.
Test t3 = result2.FirstOrDefault();
t3.value = 42;
if (t2.value == 42)
{
    Console.WriteLine("Seems as if we have a reference to the same object...");
}
Summary
When running a LINQ query on the same database context (without calling SaveChanges()) we will receive references to the same object if it has the same primary key. The strange thing is: even if we change an object, we will find it (only!) by its old values. But we will receive a reference to the already changed object. This means that the restriction in our query (value == 10) is not guaranteed to hold for any entries that we changed since our last call to SaveChanges().
Questions
Of course, I'll probably have to live with some effects here. But I would like to avoid calling SaveChanges() after every little change, especially because I would like to use it for transaction handling... to be able to revert some changes if something goes wrong.
I would be glad, if anyone could answer me one or even both of the following questions:
Is there a possibility to change the behavior of entity framework to work as if I would communicate with a normal database during a transaction? If so... how to do it?
Where is a good resource for answering "How to use the context of entity framework?" which answers questions like "What can I rely on?" and "How to choose the scope of my DBContext object"?
EDIT #1
Richard just explained how to access the original (unchanged) database values. While this is valuable and helpful I've got the urge to clarify the goal ...
Let's have a look at what happens when using SQL. We setup a table "Tests":
CREATE TABLE Tests (ID INT, value INT, PRIMARY KEY(ID));
INSERT INTO Tests (ID, value) VALUES (1,10);
Then we have a transaction that first looks for entities whose values are 10. After this, we update the value of these entries and look again for those entries. In SQL we already work on the updated version, so we will not find any results for our second query. Finally, we do a "rollback", so the value of our entry should be 10 again...
START TRANSACTION;
SELECT ID, value FROM Tests WHERE value=10; {1 result}
UPDATE Tests SET value=4711 WHERE ID=1; {our update}
SELECT ID, value FROM Tests WHERE value=10; {no result, as value is now 4711}
ROLLBACK; { just for testing transactions... }
I would like to have exactly this behavior for the Entity Framework (EF), where db.SaveChanges(); is equivalent to "COMMIT", where all LINQ queries are equivalent to "SELECT" statements and every write access to an entity is just like an "UPDATE". I don't care when EF actually issues the UPDATE statement, but it should behave the same way as using a SQL database directly... Of course, if "SaveChanges()" is called and returns successfully, it should be guaranteed that all data was persisted correctly.
Note: Yes, I could call db.SaveChanges() before every query, but then I would lose the possibility of a "Rollback".
Regards,
Stefan

As you've discovered, Entity Framework tracks the entities it has loaded, and returns the same reference for each query which accesses the same entity. This means that the data returned from your query matches the current in-memory version of the data, and not necessarily the data in the database.
If you need to access the database values, you have several options:
Use a new DbContext to load the entity;
Use .AsNoTracking() to load an un-tracked copy of your entity;
Use context.Entry(entity).GetDatabaseValues() to load the property values from the database;
If you want to overwrite the properties of the local entity with the values from the database, you'll need to call context.Entry(entity).Reload().
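Here is a rough sketch of how those options look side by side. It assumes the Test and TestContext classes from your question and the DbContext API; the DatabaseValuesSketch class and Compare method are hypothetical names used only for illustration.

```csharp
using System;
using System.Data.Entity;
using System.Linq;

public static class DatabaseValuesSketch
{
    // Hypothetical helper. "entity" is a Test instance that db already tracks
    // and that may carry unsaved in-memory changes.
    public static void Compare(TestContext db, Test entity)
    {
        // The current in-memory value (4711 in your example).
        Console.WriteLine("Current:  {0}", entity.value);

        // The value the entity had when it was loaded or last saved.
        int original = db.Entry(entity).OriginalValues.GetValue<int>("value");
        Console.WriteLine("Original: {0}", original);

        // A fresh read of the row; null if the row no longer exists.
        var databaseValues = db.Entry(entity).GetDatabaseValues();
        if (databaseValues != null)
        {
            Console.WriteLine("Database: {0}", databaseValues.GetValue<int>("value"));
        }

        // An untracked copy, independent of the instance the context holds.
        var copy = db.Tests.AsNoTracking().FirstOrDefault(t => t.ID == entity.ID);
        if (copy != null)
        {
            Console.WriteLine("Untracked copy: {0}", copy.value);
        }

        // Discards the in-memory changes and reloads the tracked instance.
        db.Entry(entity).Reload();
        Console.WriteLine("After Reload: {0}", entity.value);
    }
}
```

Only the last call changes the object you are holding; the others just read, so they are safe to use while you still have pending changes you want to keep.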

You can wrap your updates in a transaction to achieve the same result as in your SQL example:
// requires a reference to System.Transactions and: using System.Transactions;
using (var transaction = new TransactionScope())
{
    var result = from r in db.Tests
                 where r.value == 10
                 select r;
    Test t2 = result.FirstOrDefault();
    // change its value from 10 to 4711...
    t2.value = 4711;
    // send UPDATE to the database but don't commit the transaction
    db.SaveChanges();
    var result2 = from r in db.Tests
                  where r.value == 10
                  select r;
    // should not return anything
    Trace.Assert(result2.Count() == 0);
    // This way you can commit the transaction:
    // transaction.Complete();
    // but we do nothing and after this line, the transaction is rolled back
}
For more information see http://msdn.microsoft.com/en-us/library/bb896325(v=vs.100).aspx

I think your problem is the expression tree. The Entity Framework executes your query against the database when you call SaveChanges(), as you already mentioned. When you manipulate something within the context, the changes do not happen in the database; they happen in memory. Only when you call SaveChanges() are your actions translated to, let's say, SQL.
When you do a simple select, the database is queried at the moment you access the data. So if you have not called SaveChanges(), it finds the dataset in the database with (SQL) SELECT * FROM Test WHERE VALUE = 10 but interprets from the expression tree that it has to be value == 4711.
The transaction in EF is happening in your storage. Everything you do before SaveChanges() is your transaction. For further information, read: MSDN
A really good resource for information about EF, which is probably up to date, is the Microsoft Data Developer Center
