Bulk Update in Entity Framework Core - c#

I pull a bunch of timesheet entries out of the database and use them to create an invoice. Once I save the invoice and have an Id I want to update the timesheet entries with the invoice Id. Is there a way to bulk update the entities without loading them one at a time?
void SaveInvoice(Invoice invoice, int[] timeEntryIds) {
context.Invoices.Add(invoice);
context.SaveChanges();
// Is there anything like?
context.TimeEntries
.Where(te => timeEntryIds.Contains(te.Id))
.Update(te => te.InvoiceId = invoice.Id);
}

Disclaimer: I'm the owner of the project Entity Framework Plus.
Our library has a Batch Update feature, which I believe is what you are looking for.
This feature supports EF Core.
// Is there anything like? YES!!!
context.TimeEntries
.Where(te => timeEntryIds.Contains(te.Id))
.Update(te => new TimeEntry() { InvoiceId = invoice.Id });
Wiki: EF Batch Update
EDIT: Answer to a comment
"Does it support Contains as in your example? I think this comes from EF Core, where it is not a supported feature even in version 3.1."
EF Core 3.x supports Contains: https://dotnetfiddle.net/DAdIO2
EDIT: Answer to a comment
"This is great, but it requires classes to have a public parameterless constructor, which is not great. Is there any way to get around this?"
Anonymous types are supported starting from EF Core 3.x:
context.TimeEntries
.Where(te => timeEntryIds.Contains(te.Id))
.Update(te => new { InvoiceId = invoice.Id });
Online example: https://dotnetfiddle.net/MAnPvw

As of EF Core 7.0 there are built-in bulk update and delete methods, ExecuteUpdate() and ExecuteDelete():
context.Customers.Where(...).ExecuteDelete();
context.Customers.Where(...).ExecuteUpdate(c => c.SetProperty(b => b.Age, b => b.Age + 1));
The object-initializer and anonymous-type forms, ExecuteUpdate(c => new Customer { Age = c.Age + 1 }) and ExecuteUpdate(c => new { Age = c.Age + 1 }), were proposed during the design but, as far as I know, did not ship in EF Core 7.0; only the SetProperty form compiles there.

Are you after performance or simplified syntax?
I would suggest using a direct SQL query:
string query = "Update TimeEntries Set InvoiceId = <invoiceId> Where Id in (comma separated ids)";
context.Database.ExecuteSqlCommandAsync(query);
For the comma-separated ids you can use string.Join(',', timeEntryIds).
It depends on what you actually need. If you want to stay with LINQ, then you need to iterate through each object.
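The comma-joined string above is concatenated straight into the SQL text. A minimal sketch of a parameterized alternative follows, assuming EF Core 3.0 or later, where ExecuteSqlRaw accepts positional {0}, {1}, ... placeholders plus an argument array. The BulkUpdateSql class and its Build method are hypothetical names introduced only for illustration; the table and column names come from the question.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of EF Core): builds the UPDATE text with
// positional placeholders and the matching argument array.
public static class BulkUpdateSql
{
    public static (string Sql, object[] Args) Build(int invoiceId, int[] timeEntryIds)
    {
        if (timeEntryIds == null || timeEntryIds.Length == 0)
            throw new ArgumentException("At least one id is required.", nameof(timeEntryIds));

        // {0} is reserved for the invoice id, so the id placeholders start at {1}.
        string placeholders = string.Join(", ",
            Enumerable.Range(1, timeEntryIds.Length).Select(i => "{" + i + "}"));

        string sql = "UPDATE TimeEntries SET InvoiceId = {0} WHERE Id IN (" + placeholders + ")";

        // Arguments in placeholder order: the invoice id first, then every time entry id.
        object[] args = new object[] { invoiceId }
            .Concat(timeEntryIds.Select(id => (object)id))
            .ToArray();

        return (sql, args);
    }
}

// Assumed usage inside SaveInvoice, after the invoice has been saved:
// var (sql, args) = BulkUpdateSql.Build(invoice.Id, timeEntryIds);
// context.Database.ExecuteSqlRaw(sql, args);
```

Each value travels as a database parameter instead of being spliced into the command text. On SQL Server a single command is limited to roughly 2100 parameters, so very large id lists would need to be split into chunks.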

If TimeEntry has an association to Invoice (check the navigation properties), you can probably do something like this:
var timeEntries = context.TimeEntries.Where(te => timeEntryIds.Contains(te.Id)).ToArray();
foreach (var timeEntry in timeEntries)
    invoice.TimeEntries.Add(timeEntry);
context.Invoices.Add(invoice);
// Saves the entire context and takes care of the ids
context.SaveChanges();

The IQueryable.ToQueryString method introduced in Entity Framework Core 5.0 may help with this scenario. This method will generate SQL that can be included in a raw SQL query to perform a bulk update of records identified by that query.
For example:
void SaveInvoice(Invoice invoice, int[] timeEntryIds) {
context.Invoices.Add(invoice);
context.SaveChanges();
var query = context.TimeEntries
.Where(te => timeEntryIds.Contains(te.Id))
.Select(te => te.Id);
var sql = $"UPDATE TimeEntries SET InvoiceId = {{0}} WHERE Id IN ({query.ToQueryString()})";
context.Database.ExecuteSqlRaw(sql, invoice.Id);
}
The major drawback of this approach is that you end up with raw SQL appearing in your code. However I don't know of any reasonable way to avoid that with current Entity Framework Core capabilities - you're stuck with this caveat, or the caveats of other answers posted here such as:
Introducing a dependency on another library such as Entity Framework Plus or ELinq.
Using DbContext.SaveChanges() which will involve the execution of multiple SQL queries to retrieve and update records one at a time rather than doing a bulk update.

In Entity Framework Core you can do this with the UpdateRange method. You can see some sample usages here.
using (var context = new YourContext())
{
context.UpdateRange(yourModifiedEntities);
// or the followings are also valid
//context.UpdateRange(yourModifiedEntity1, yourModifiedEntity2, yourModifiedEntity3);
//context.YourEntity.UpdateRange(yourModifiedEntities);
//context.YourEntity.UpdateRange(yourModifiedEntity1, yourModifiedEntity2,yourModifiedEntity3);
context.SaveChanges();
}

Bulk update is supported in EF Core 7:
context
    .TimeEntries
    .Where(te => timeEntryIds.Contains(te.Id))
    .ExecuteUpdate(s => s.SetProperty(
        te => te.InvoiceId,
        te => invoice.Id));
There is also an async version of this method, ExecuteUpdateAsync.

In EF Core 7, use ExecuteUpdate() (see the "What's New" documentation):
var multipleRows = TableA.Where(t=>t.Id < 99);
multipleRows.ExecuteUpdate(t=>
t.SetProperty(
r => r.Salary,
r => r.Salary * 2));
// The SQL has already been sent to the database, so SaveChanges() is not needed here.
// SaveChanges();
SQL generated by EF:
UPDATE [t]
SET [t].[Salary] = [t].[Salary] * 2
FROM [TableA] AS [t]
WHERE [t].[ID] < 99

Related

What is a good way to archive data with the identity column using EF Core?

I'm using .NET 6 and EF Core 6.0.13. I have two databases Foo and FooArchive with identical schemas. I need to archive (migrate) data that are older than a year from Foo to FooArchive for 7 tables. What's the best way to do this with EF Core? I will describe below what I tried and the issues I'm running into.
NOTE: There are no foreign keys or any relationships defined for any table in both DBs and so no navigation properties, etc.
There are FooContext and FooArchiveContext classes both using the same entity models but different connections and injected into the repository class
Query for the customer ids whose account is older than 365 days on a different transaction
var customerIds = GetCustomerIds();
Loop through CustomerIds collection and archive one customer at a time
foreach(var customerId in customerIds)
{
using(var fooTx = _fooContext.Database.BeginTransaction())
using(var fooArchiveTx = _fooArchiveContext.Database.BeginTransaction())
{
//Series of left joins to get the data from 7 tables
var recordsToArchive = (from cust in _fooContext.Customers
join ord in _fooContext.Orders on cust.Id equals ord.CustId into co
from ord in co.DefaultIfEmpty()
join odt in _fooContext.OrderDetails on ord.Id equals odt.OrderId into ordt
.
where cust.Id == customerId
select new {
Customer = cust,
Order = ord,
OrderDetail = odt,
.
.
}).ToList();
var customer = recordsToArchive.Select(x => x.Customer).Distinct().First();
var orders = recordsToArchive.Select(x => x.Order).Where(x => x != null).Distinct();
var orderDetails = recordsToArchive.Select(x => x.OrderDetail).Where(x => x != null).Distinct();
.
.
// Check if the record to be migrated is already in FooArchiveContext
var existingRecord = _fooArchiveContext.Customers.FirstOrDefault(x => x.Id == customerId);
if(existingRecord == null)
{
_fooArchiveContext.Customers.Add(customer);
_fooArchiveContext.Orders.AddRange(orders);
_fooArchiveContext.OrderDetails.AddRange(orderDetails);
.
}
else
{
_fooArchiveContext.Customers.Update(customer);
_fooArchiveContext.Orders.UpdateRange(orders);
_fooArchiveContext.OrderDetails.UpdateRange(orderDetails);
.
}
_fooArchiveContext.SaveChanges();
//Remove the record from fooContext
_fooContext.OrderDetails.RemoveRange(orderDetails);
_fooContext.Orders.RemoveRange(orders);
_fooContext.Customers.Remove(customer);
.
_fooContext.SaveChanges();
fooArchiveTx.Commit();
fooTx.Commit();
}
}
Is what I'm doing the right approach? I think I may have to use the AutoMapper to copy entities in between two contexts. It works in the InMemory database but fails when I try it against the actual SQL Server instances. I get an error
Cannot insert explicit value for identity column in table 'Orders' when IDENTITY_INSERT is set to OFF
I would like to keep the same Ids as in the original db instance (fooContext).
I guess I can remove the Id in the entity object and save. Then query for the new Id and update the related entities but sounds tackier than the code I already have. I've seen SO answers where EF core is turning identity insert option on and off before and after calling SaveChanges() like below but haven't tried.
db.Users.Add(user);
db.Database.ExecuteSqlRaw("SET IDENTITY_INSERT MyDB.Users ON");
db.SaveChanges();
db.Database.ExecuteSqlRaw("SET IDENTITY_INSERT MyDB.Users OFF");
transaction.Commit();
Thanks for your help.
If I understand your problem correctly, your approach of looping through each customer and archiving their records one at a time seems reasonable. However, there are a few areas where you can improve your implementation.
Firstly, you should avoid fetching the same data more than once. In your code, the left joins repeat the customer and order rows for every order detail, and you then have to split the flattened result back into customers, orders, and order details. This can be improved by using the Include method to eagerly load the related entities along with the primary entity. Note that Include relies on navigation properties, which your models do not currently define, so you would need to add them first.
Secondly, you should avoid duplicating code. In your code, you have duplicate code for adding and updating the entities in the archive database. You can reduce the duplication by using the Attach method to attach the entities to the context and then calling Update or Add depending on whether the entity is already in the context or not.
Thirdly, you should use a bulk insert/update operation instead of adding/updating the entities one at a time. EF Core 6 has no built-in set-based bulk operations (ExecuteUpdate and ExecuteDelete only arrived in EF Core 7, and there is still no built-in bulk insert), but you can use third-party libraries like Entity Framework Extensions or Z.EntityFramework.Plus to perform bulk operations.
Finally, you should avoid setting the identity column values explicitly. Instead, let the database generate the identity values for you. To do this, you can remove the identity column from your entity models or use the ValueGeneratedOnAdd() method in your entity configuration.
With these improvements in mind, here's an example implementation of your code:
using (var fooTx = _fooContext.Database.BeginTransaction())
using (var fooArchiveTx = _fooArchiveContext.Database.BeginTransaction())
{
var cutoffDate = DateTime.UtcNow.AddYears(-1);
var customerIds = _fooContext.Customers
.Where(c => c.CreatedAt < cutoffDate)
.Select(c => c.Id)
.ToList();
foreach (var customerId in customerIds)
{
var customer = _fooContext.Customers
.Include(c => c.Orders)
.ThenInclude(o => o.OrderDetails)
.FirstOrDefault(c => c.Id == customerId);
if (customer != null)
{
if (_fooArchiveContext.Customers.Any(c => c.Id == customerId))
{
_fooArchiveContext.Attach(customer);
_fooArchiveContext.Update(customer);
}
else
{
_fooArchiveContext.Add(customer);
}
_fooArchiveContext.SaveChanges();
_fooArchiveContext.Orders.BulkInsert(customer.Orders);
_fooArchiveContext.OrderDetails.BulkInsert(customer.Orders.SelectMany(o => o.OrderDetails));
_fooContext.OrderDetails.RemoveRange(customer.Orders.SelectMany(o => o.OrderDetails));
_fooContext.Orders.RemoveRange(customer.Orders);
_fooContext.Customers.Remove(customer);
_fooContext.SaveChanges();
}
}
fooArchiveTx.Commit();
fooTx.Commit();
}
In this code, we first get the list of customer IDs whose accounts are older than a year. We then loop through each customer and retrieve their orders and order details using the Include method. We then check if the customer already exists in the archive database and use the Attach and Update methods to update the existing customer, or the Add method to add a new customer.
We then use the BulkInsert method from the Entity Framework Extensions library to insert the orders and order details in bulk. We also remove the orders, order details, and customer from the source database using the RemoveRange method.
Finally, we call SaveChanges on the archive and source contexts and commit the transactions.

Entity framework select an existing item and only pull back one of its fields to update

I'm trying to do something that should be "simple", I want to pull out a piece of data from my database but I only want to pull out the description (the database table for this item has first name, last name, address etc etc).
So when I make my database call I want to grab just the description and then update it. I don't want to fetch anything else, as this costs network bandwidth and may cause lag if it is called multiple times within a few seconds.
Here is the code that I'm trying to fix:
using (var context = new storemanagerEntities())
{
var stock = context.stocks.Where(x => x.id == model.Id).Select(
x => new stock()
{
description = x.description
}).FirstOrDefault();
stock.description = model.Description;
context.SaveChanges();
}
The error I am catching is this
**The entity or complex type 'StoreManagerModel.stock' cannot be constructed in a LINQ to Entities query.**
I'm sure using the "new" keyword is probably the problem, but does anyone have any ideas on how to solve this?
--update
This code works, but it doesn't seem to actually update the database
public void UpdateDescription(StockItemDescriptionModel model)
{
using (var context = new storemanagerEntities())
{
var stock = context.stocks.Where(x => x.id == model.Id)
.AsEnumerable()
.Select(
x => new stock
{
description = x.description
}).FirstOrDefault();
stock.description = model.Description;
context.SaveChanges();
}
}
At the moment it seems it may be my MySQL driver, which is version 6390; it seems to work in another version I tried. Sorry, I haven't found the answer yet.
You can do it even without getting any entity from the database by creating a stub entity:
context.Configuration.ValidateOnSaveEnabled = false;
// Create stub entity:
var stock = new stock { id = model.Id, description = model.Description };
// Attach stub entity to the context:
context.Entry(stock).State = System.Data.Entity.EntityState.Unchanged;
// Mark one property as modified.
context.Entry(stock).Property("description").IsModified = true;
context.SaveChanges();
Validation on save is switched off, otherwise EF will validate the stub entity, which is very likely to fail because it may have required properties without values.
Of course it may be wise to check whether the entity does exist in the database at all, but that can be done by a cheap Any() query.
You can't project to a mapped entity type in a LINQ to Entities (L2E) query; you would need to switch the query over to LINQ to Objects. For optimal performance it's recommended to use AsEnumerable rather than ToList to avoid materializing the query too early, e.g.
var stock = context.stocks.Where(x => x.id == model.Id)
.AsEnumerable()
.Select(x => new stock
{
description = x.description
})
.FirstOrDefault();
As to your actual problem, the above won't allow you to do this as is because you have effectively created a non-tracking entity. In order for EF to understand how to connect your entity to your DB you would need to attach it to the context - this would require that you also pull down the Id of the entity though i.e.
Select(x => new stock
{
id = x.id,
description = x.description
})
...
context.stocks.Attach(stock);
stock.description = model.Description;
context.SaveChanges();
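The AsEnumerable versus ToList recommendation in this answer comes down to deferred execution. The sketch below illustrates it with plain LINQ to Objects rather than a real DbContext; LazyDemo, Source, and the Pulled counter are hypothetical names used only to count how many rows get enumerated.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database result stream; no EF involved.
public static class LazyDemo
{
    // Number of rows that have been pulled from Source() so far.
    public static int Pulled;

    // Yields five "rows" lazily, counting each one as it is handed out.
    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 5; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // AsEnumerable keeps the pipeline lazy: FirstOrDefault stops after one row.
    public static int PulledWithAsEnumerable()
    {
        Pulled = 0;
        int first = Source().AsEnumerable().Select(x => x * 10).FirstOrDefault();
        return Pulled;
    }

    // ToList drains the whole source before Select and FirstOrDefault run.
    public static int PulledWithToList()
    {
        Pulled = 0;
        int first = Source().ToList().Select(x => x * 10).FirstOrDefault();
        return Pulled;
    }
}
```

With an id filter that matches a single row the difference is negligible, but for wider queries the lazy form avoids buffering rows that are never used.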

Update Multiple Rows in Entity Framework from a list of ids

I am trying to create a query for entity framework that will allow me to take a list of ids and update a field associated with them.
Example in SQL:
UPDATE Friends
SET msgSentBy = '1234'
WHERE id IN (1, 2, 3, 4)
How do I convert the above into entity framework?
Something like the following:
var idList = new int[] { 1, 2, 3, 4 };
using (var db = new SomeDatabaseContext())
{
    var friends = db.Friends.Where(f => idList.Contains(f.ID)).ToList();
    friends.ForEach(a => a.msgSentBy = "1234");
    db.SaveChanges();
}
UPDATE:
You can update multiple fields as below:
friends.ForEach(a =>
{
a.property1 = value1;
a.property2 = value2;
});
The IQueryable.ToQueryString method introduced in Entity Framework Core 5.0 may help with this scenario, if you are willing to have some raw SQL appearing in your code. This method will generate SQL that can be included in a raw SQL query to perform a bulk update of records identified by that query.
For example:
using var context = new DbContext();
var ids = new List<int>() { 1, 2, 3, 4 };
var query = context.Friends.Where(_ => ids.Contains(_.id)).Select(_ => _.id);
var sql = $"UPDATE Friends SET msgSentBy = {{0}} WHERE id IN ({query.ToQueryString()})";
context.Database.ExecuteSqlRaw(sql, "1234");
The major drawback of this approach is the use of raw SQL. However I don't know of any reasonable way to avoid that with current Entity Framework Core capabilities - you're stuck with this caveat, or the caveats of other answers posted here such as:
Introducing a dependency on another library like https://github.com/yangzhongke/Zack.EFCore.Batch.
Using DbContext.SaveChanges() which will update records one at a time rather than doing a bulk update.
If (when) the following issue is addressed in the future then we are likely to get a better answer here: Bulk (i.e. set-based) CUD operations (without loading data into memory) #795
var idList = new int[] { 1, 2, 3, 4 };
var friendsToUpdate = await Context.Friends
    .Where(f => idList.Contains(f.Id))
    .ToListAsync();
foreach (var item in friendsToUpdate)
{
    item.msgSentBy = "1234";
}
// Persist the changes; without this call nothing is written to the database.
await Context.SaveChangesAsync();
You can use foreach to update each element that meets your condition.
Here is a more generic example:
var itemsToUpdate = await Context.friends.Where(f => f.Id == <someCondition>).ToListAsync();
foreach (var item in itemsToUpdate)
{
    item.property = updatedValue;
}
await Context.SaveChangesAsync();
In general you will most likely use async methods with await for database queries.
I have created a library to batch delete or update records in a single round trip on EF Core 5.
Sample code as follows:
await ctx.DeleteRangeAsync(b => b.Price > n || b.AuthorName == "zack yang");
await ctx.BatchUpdate()
.Set(b => b.Price, b => b.Price + 3)
.Set(b=>b.AuthorName,b=>b.Title.Substring(3,2)+b.AuthorName.ToUpper())
.Set(b => b.PubTime, b => DateTime.Now)
.Where(b => b.Id > n || b.AuthorName.StartsWith("Zack"))
.ExecuteAsync();
Github repository: https://github.com/yangzhongke/Zack.EFCore.Batch
Report: https://www.reddit.com/r/dotnetcore/comments/k1esra/how_to_batch_delete_or_update_in_entity_framework/
The best way to do a massive update with Entity Framework Core 7 is like this:
context.Friends
.Where(f => f.Id <= 1_000)
.ExecuteUpdate(f => f.SetProperty(x => x.Name, x => $"Updated {x.Name}"));
Reference: https://learn.microsoft.com/en-us/ef/core/what-is-new/ef-core-7.0/whatsnew#executeupdate-and-executedelete-bulk-updates

Inefficient entity framework queries

I have the following foreach statement:
foreach (var articleId in cleanArticlesIds)
{
var countArt = context.TrackingInformations.Where(x => x.ArticleId == articleId).Count();
articleDictionary.Add(articleId, countArt);
}
The database looks like this:
TrackingInformation(Id, ArticleId --some stuff
Article(Id, --some stuff
What I want to do is get the count for every article id from the TrackingInformations table.
For example:
ArticleId:1 Count:1
ArticleId:2 Count:8
ArticleId:3 Count:5
ArticleId:4 Count:0
so that I end up with a dictionary<articleId, count>.
Context is the Entity Framework DbContext. The problem is that this solution is very slow (there are more than 10k articles in the database and the number is expected to grow rapidly).
Try the following query to gather the grouped data, and then add the missing entries. You can try skipping the Select clause; I don't know whether EF handles ToDictionary well.
If you run into the Select N+1 problem (a huge number of database requests), you can add a ToList() step between Select and ToDictionary, so that all the required information is brought into memory first.
This all depends on your mapping configuration and environment, so to get good performance you need to experiment a little with different queries. The main approach is to aggregate as much data as possible at the database level with few queries.
var articleDictionary = context.TrackingInformations
    .Where(trackInfo => cleanArticlesIds.Contains(trackInfo.ArticleId))
    .GroupBy(trackInfo => trackInfo.ArticleId)
    .Select(grp => new { grp.Key, Count = grp.Count() })
    // Key by the raw article id so the ContainsKey check below uses the same key type.
    .ToDictionary(info => info.Key, info => info.Count);
// Articles with no tracking rows are absent from the grouped result; add them with a count of 0.
foreach (var missingArticleId in cleanArticlesIds)
{
    if (!articleDictionary.ContainsKey(missingArticleId))
        articleDictionary.Add(missingArticleId, 0);
}
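The group-then-fill-zeros idea can be tried without a database. This is a minimal LINQ to Objects sketch; ArticleCounts and CountByArticle are hypothetical names, and the trackedArticleIds sequence stands in for the ArticleId column of the TrackingInformations table.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory illustration; no DbContext involved.
public static class ArticleCounts
{
    public static Dictionary<int, int> CountByArticle(
        IEnumerable<int> trackedArticleIds,
        IEnumerable<int> cleanArticleIds)
    {
        var wanted = new HashSet<int>(cleanArticleIds);

        // One grouping pass instead of one Count() query per article.
        Dictionary<int, int> counts = trackedArticleIds
            .Where(id => wanted.Contains(id))
            .GroupBy(id => id)
            .ToDictionary(grp => grp.Key, grp => grp.Count());

        // Articles that never appear in the tracking data get an explicit 0.
        foreach (int id in wanted)
        {
            if (!counts.ContainsKey(id))
                counts.Add(id, 0);
        }

        return counts;
    }
}
```

Against a real context the same Where/GroupBy/Count shape is what should let the provider issue a single GROUP BY query, which is where the speed-up over the per-article loop comes from.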
If TrackingInformation is a navigation property of Article, then you can do this:
var result=context.Article.Select(a=>new {a.id,Count=a.TrackingInformation.Count()});
Putting it into a dictionary is simple as well:
var result=context.Article
.Select(a=>new {a.id,Count=a.TrackingInformation.Count()})
.ToDictionary(a=>a.id,a=>a.Count);
If TrackingInformation isn't a navigation property, then you can do:
var result = context.Article.GroupJoin(
    context.TrackingInformation,
    article => article.id,
    // Join on the foreign key, not on the tracking row's own id.
    tracking => tracking.ArticleId,
    (article, trackings) => new { id = article.id, Count = trackings.Count() })
    .ToDictionary(a => a.id, a => a.Count);

What is the recommended practice to update or delete multiple entities in EntityFramework?

In SQL one might sometimes write something like
DELETE FROM table WHERE column IS NULL
or
UPDATE table SET column1=value WHERE column2 IS NULL
or any other criterion that might apply to multiple rows.
As far as I can tell, the best EntityFramework can do is something like
foreach (var entity in db.Table.Where(row => row.Column == null))
db.Table.Remove(entity); // or entity.Column2 = value;
db.SaveChanges();
But of course that will retrieve all the entities, and then run a separate DELETE query for each. Surely that must be much slower if there are many entities that satisfy the criterion.
So, to cut a long story short, is there any support in EntityFramework for updating or deleting multiple entities in a single query?
EF doesn't have support for batch updates or deletes but you can simply do:
db.Database.ExecuteSqlCommand("DELETE FROM ...", someParameter);
Edit:
People who really want to stick with LINQ queries sometimes use a workaround where they first create the SELECT SQL query from the LINQ query:
string query = db.Table.Where(row => row.Column == null).ToString();
and after that find the first occurrence of FROM and replace the beginning of the query with DELETE and execute result with ExecuteSqlCommand. The problem with this approach is that it works only in basic scenarios. It will not work with entity splitting or some inheritance mapping where you need to delete two or more records per entity.
Take a look at Entity Framework Extensions (Multiple entity updates). This project allows set operations using lambda expressions. Samples from the docs:
this.Container.Devices.Delete(o => o.Id == 1);
this.Container.Devices.Update(
o => new Device() {
LastOrderRequest = DateTime.Now,
Description = o.Description + "teste"
},
o => o.Id == 1);
Digging into the EFE project source code, you can see how it automates @Ladislav Mrnka's second approach, adding set operations as well:
public override string GetDmlCommand()
{
//Recover Table Name
StringBuilder updateCommand = new StringBuilder();
updateCommand.Append("UPDATE ");
updateCommand.Append(MetadataAccessor.GetTableNameByEdmType(
typeof(T).Name));
updateCommand.Append(" ");
updateCommand.Append(setParser.ParseExpression());
updateCommand.Append(whereParser.ParseExpression());
return updateCommand.ToString();
}
Edited 3 years later
Take a look at this great answer: https://stackoverflow.com/a/12751429
Entity Framework Extended Library helps to do this.
Delete
//delete all users where FirstName matches
context.Users.Delete(u => u.FirstName == "firstname");
Update
//update all tasks with status of 1 to status of 2
context.Tasks.Update(
t => t.StatusId == 1,
t2 => new Task {StatusId = 2});
//example of using an IQueryable as the filter for the update
var users = context.Users.Where(u => u.FirstName == "firstname");
context.Users.Update(users, u => new User {FirstName = "newfirstname"});
https://github.com/loresoft/EntityFramework.Extended
