LINQ to SQL DAL, is this thread safe? - c#

My code is written in C# and the data layer uses LINQ to SQL to fill/load detached object classes.
I recently changed the code to work with multiple threads, and I'm pretty sure my DAL isn't thread safe.
Can you tell me if PopCall() and Count() are thread safe and, if not, how do I fix them?
public class DAL
{
    // Read one Call item from the database and delete that same item from the database.
    public static OCall PopCall()
    {
        using (var db = new MyDataContext())
        {
            var fc = (from c in db.Calls where c.Called == false select c).FirstOrDefault();
            OCall call = FillOCall(fc);
            if (fc != null)
            {
                db.Calls.DeleteOnSubmit(fc);
                db.SubmitChanges();
            }
            return call;
        }
    }

    public static int Count()
    {
        using (var db = new MyDataContext())
        {
            return (from c in db.Calls select c.ID).Count();
        }
    }

    private static OCall FillOCall(Model.Call c)
    {
        if (c != null)
            return new OCall { ID = c.ID, Caller = c.Caller, Called = c.Called };
        else
            return null;
    }
}
Detached OCall class:
public class OCall
{
    public int ID { get; set; }
    public string Caller { get; set; }
    public bool Called { get; set; }
}

Individually they are thread-safe, since each call uses its own isolated data context. However, they are not an atomic unit, so it is not safe to check that the count is > 0 and then assume there is still something there to pop; any other thread could be mutating the database.
If you need something like this, you can wrap in a TransactionScope which will give you (by default) the serializable isolation level:
using (var tran = new TransactionScope())
{
    int count = DAL.Count();
    if (count > 0)
    {
        var call = DAL.PopCall();
        // TODO: something with call, assuming it is non-null
    }
    tran.Complete();
}
Of course, this introduces blocking. It is better to simply rely on FirstOrDefault() returning null when there is nothing to read.
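If the default serializable level blocks too aggressively, TransactionScope also accepts an explicit isolation level; a minimal sketch using System.Transactions (the ReadCommitted choice here is illustrative, not a recommendation):

var options = new TransactionOptions
{
    IsolationLevel = System.Transactions.IsolationLevel.ReadCommitted,
    Timeout = TransactionManager.DefaultTimeout
};
using (var tran = new TransactionScope(TransactionScopeOption.Required, options))
{
    // ... count/pop work as above ...
    tran.Complete();
}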
Note that PopCall could still throw exceptions - if another thread/process deletes the data between you obtaining it and calling SubmitChanges. The good thing about it throwing here is that you shouldn't find that you return the same record twice.
The SubmitChanges is transactional, but the reads aren't, unless spanned by a transaction-scope or similar. To make PopCall atomic without throwing:
public static OCall PopCall()
{
    using (var tran = new TransactionScope())
    using (var db = new MyDataContext())
    {
        var fc = (from c in db.Calls where c.Called == false select c).FirstOrDefault();
        OCall call = FillOCall(fc);
        if (fc != null)
        {
            db.Calls.DeleteOnSubmit(fc);
            db.SubmitChanges();
        }
        tran.Complete();
        return call;
    }
}
Now the FirstOrDefault is covered by the serializable isolation-level, so doing the read will take a lock on the data. It would be even better if we could explicitly issue an UPDLOCK here, but LINQ-to-SQL doesn't offer this.
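One hedged workaround, sketched under the assumption that the question's model is in play: DataContext.ExecuteQuery lets you issue the locking hint in raw SQL while still materializing a tracked entity, so the delete can proceed through the normal LINQ to SQL pipeline.

public static OCall PopCallWithLock()
{
    using (var tran = new TransactionScope())
    using (var db = new MyDataContext())
    {
        // UPDLOCK holds the row we read; READPAST skips rows locked by other poppers.
        var fc = db.ExecuteQuery<Model.Call>(
            "SELECT TOP(1) * FROM Calls WITH (UPDLOCK, READPAST) WHERE Called = 0")
            .FirstOrDefault();
        OCall call = FillOCall(fc);
        if (fc != null)
        {
            db.Calls.DeleteOnSubmit(fc);
            db.SubmitChanges();
        }
        tran.Complete();
        return call;
    }
}

Because ExecuteQuery results of a mapped entity type are change-tracked by the context, DeleteOnSubmit and SubmitChanges behave exactly as in the original PopCall.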

Count() is thread-safe. Calling it twice at the same time from two different threads will not harm anything. Another thread might change the number of items during the call, but so what? Another thread might change the number of items a microsecond after it returns, and there's nothing you can do about that.
PopCall, on the other hand, does have a potential threading problem. One thread could read fc, then before it reaches SubmitChanges(), another thread may intercede and do the read and delete before returning control to the first thread, which will then attempt to delete the already-deleted record. Both calls will return the same object, even though the intention was that a row only be returned once.

Unfortunately, no amount of LINQ to SQL trickery, SqlClient isolation levels, or System.Transactions can make PopCall() thread safe, where 'thread safe' really means 'concurrency safe' (i.e. when concurrency occurs on the database server, outside the control and scope of the client code/process). Nor is any sort of C# locking and synchronization going to help you. You need to deeply internalize how a relational storage engine works to get this done correctly. Using tables as queues (as you do here) is notoriously tricky, deadlock prone, and really hard to get right.
Even less fortunately, your solution is going to have to be platform specific. I'm only going to explain the right way to do it with SQL Server, and that is to leverage the OUTPUT clause. If you want more detail on why this is the case, read the article Using tables as Queues. Your Pop operation must occur atomically in the database with a call like this:
WITH cte AS (
    SELECT TOP(1) ...
    FROM Calls WITH (READPAST)
    WHERE Called = 0)
DELETE FROM cte
OUTPUT DELETED.*;
Not only this, but the Calls table has to be organized with a leftmost clustered key on the Called column. Why this is the case is again explained in the article I referenced before.
In this context the Count call is basically useless. Your only way to check correctly for an available item is to Pop; asking for Count just puts useless stress on the database to return a COUNT() value which means nothing in a concurrent environment.
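For illustration, a hedged sketch of how this atomic dequeue could be invoked from the existing DAL; DataContext.ExecuteQuery can materialize the OUTPUT rows straight into the detached OCall type, and the explicit column list is an assumption based on the question's model:

public static OCall PopCallAtomic()
{
    using (var db = new MyDataContext())
    {
        // DELETE ... OUTPUT runs as a single atomic statement on the server,
        // so no client-side transaction is needed for correctness.
        return db.ExecuteQuery<OCall>(@"
            WITH cte AS (
                SELECT TOP(1) ID, Caller, Called
                FROM Calls WITH (READPAST)
                WHERE Called = 0)
            DELETE FROM cte
            OUTPUT DELETED.ID, DELETED.Caller, DELETED.Called;").FirstOrDefault();
    }
}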

Related

Sql transaction error occurs sometimes [duplicate]

I am currently getting this error:
System.Data.SqlClient.SqlException: New transaction is not allowed because there are other threads running in the session.
while running this code:
public class ProductManager : IProductManager
{
    #region Declare Models
    private RivWorks.Model.Negotiation.RIV_Entities _dbRiv = RivWorks.Model.Stores.RivEntities(AppSettings.RivWorkEntities_connString);
    private RivWorks.Model.NegotiationAutos.RivFeedsEntities _dbFeed = RivWorks.Model.Stores.FeedEntities(AppSettings.FeedAutosEntities_connString);
    #endregion

    public IProduct GetProductById(Guid productId)
    {
        // Do a quick sync of the feeds...
        SyncFeeds();
        ...
        // get a product...
        ...
        return product;
    }

    private void SyncFeeds()
    {
        bool found = false;
        string feedSource = "AUTO";
        switch (feedSource) // companyFeedDetail.FeedSourceTable.ToUpper())
        {
            case "AUTO":
                var clientList = from a in _dbFeed.Client.Include("Auto") select a;
                foreach (RivWorks.Model.NegotiationAutos.Client client in clientList)
                {
                    var companyFeedDetailList = from a in _dbRiv.AutoNegotiationDetails where a.ClientID == client.ClientID select a;
                    foreach (RivWorks.Model.Negotiation.AutoNegotiationDetails companyFeedDetail in companyFeedDetailList)
                    {
                        if (companyFeedDetail.FeedSourceTable.ToUpper() == "AUTO")
                        {
                            var company = (from a in _dbRiv.Company.Include("Product") where a.CompanyId == companyFeedDetail.CompanyId select a).First();
                            foreach (RivWorks.Model.NegotiationAutos.Auto sourceProduct in client.Auto)
                            {
                                foreach (RivWorks.Model.Negotiation.Product targetProduct in company.Product)
                                {
                                    if (targetProduct.alternateProductID == sourceProduct.AutoID)
                                    {
                                        found = true;
                                        break;
                                    }
                                }
                                if (!found)
                                {
                                    var newProduct = new RivWorks.Model.Negotiation.Product();
                                    newProduct.alternateProductID = sourceProduct.AutoID;
                                    newProduct.isFromFeed = true;
                                    newProduct.isDeleted = false;
                                    newProduct.SKU = sourceProduct.StockNumber;
                                    company.Product.Add(newProduct);
                                }
                            }
                            _dbRiv.SaveChanges(); // ### THIS BREAKS ### //
                        }
                    }
                }
                break;
        }
    }
}
Model #1 - This model sits in a database on our Dev Server. (Image: http://content.screencast.com/users/Keith.Barrows/folders/Jing/media/bdb2b000-6e60-4af0-a7a1-2bb6b05d8bc1/Model1.png)
Model #2 - This model sits in a database on our Prod Server and is updated each day by automatic feeds. (Image: http://content.screencast.com/users/Keith.Barrows/folders/Jing/media/4260259f-bce6-43d5-9d2a-017bd9a980d4/Model2.png)
Note - The red circled items in Model #1 are the fields I use to "map" to Model #2. Please ignore the red circles in Model #2: they are from another question I had, which is now answered.
Note: I still need to put in an isDeleted check so I can soft delete it from DB1 if it has gone out of our client's inventory.
All I want to do, with this particular code, is connect a company in DB1 with a client in DB2, get their product list from DB2, and INSERT it into DB1 if it is not already there. The first time through should be a full pull of inventory. Each time it is run thereafter, nothing should happen unless new inventory came in on the feed overnight.
So the big question - how do I solve the transaction error I am getting? Do I need to drop and recreate my context each time through the loops (which does not make sense to me)?
After much pulling out of hair I discovered that the foreach loops were the culprits. What needs to happen is to call EF but materialize the results into an IList<T> of the target type, then loop over the IList<T>.
Example:
IList<Client> clientList = (from a in _dbFeed.Client.Include("Auto") select a).ToList();
foreach (RivWorks.Model.NegotiationAutos.Client client in clientList)
{
    var companyFeedDetailList = from a in _dbRiv.AutoNegotiationDetails where a.ClientID == client.ClientID select a;
    // ...
}
As you've already identified, you cannot save from within a foreach that is still drawing from the database via an active reader.
Calling ToList() or ToArray() is fine for small data sets, but when you have thousands of rows, you will be consuming a large amount of memory.
It's better to load the rows in chunks.
public static class EntityFrameworkUtil
{
    public static IEnumerable<T> QueryInChunksOf<T>(this IQueryable<T> queryable, int chunkSize)
    {
        return queryable.QueryChunksOfSize(chunkSize).SelectMany(chunk => chunk);
    }

    public static IEnumerable<T[]> QueryChunksOfSize<T>(this IQueryable<T> queryable, int chunkSize)
    {
        int chunkNumber = 0;
        while (true)
        {
            var query = (chunkNumber == 0)
                ? queryable
                : queryable.Skip(chunkNumber * chunkSize);
            var chunk = query.Take(chunkSize).ToArray();
            if (chunk.Length == 0)
                yield break;
            yield return chunk;
            chunkNumber++;
        }
    }
}
Given the above extension methods, you can write your query like this:
foreach (var client in clientList.OrderBy(c => c.Id).QueryInChunksOf(100))
{
    // do stuff
    context.SaveChanges();
}
The queryable object you call this method on must be ordered. This is because Entity Framework only supports IQueryable<T>.Skip(int) on ordered queries, which makes sense when you consider that multiple queries for different ranges require the ordering to be stable. If the ordering isn't important to you, just order by primary key as that's likely to have a clustered index.
This version will query the database in batches of 100. Note that SaveChanges() is called for each entity.
If you want to improve your throughput dramatically, you should call SaveChanges() less frequently. Use code like this instead:
foreach (var chunk in clientList.OrderBy(c => c.Id).QueryChunksOfSize(100))
{
    foreach (var client in chunk)
    {
        // do stuff
    }
    context.SaveChanges();
}
This results in 100 times fewer database update calls. Of course each of those calls takes longer to complete, but you still come out way ahead in the end. Your mileage may vary, but this was worlds faster for me.
And it gets around the exception you were seeing.
EDIT: I revisited this question after running SQL Profiler and updated a few things to improve performance. For anyone who is interested, here is some sample SQL showing what the queries look like.
The first loop doesn't need to skip anything, so it is simpler.
SELECT TOP (100) -- the chunk size
    [Extent1].[Id] AS [Id],
    [Extent1].[Name] AS [Name]
FROM [dbo].[Clients] AS [Extent1]
ORDER BY [Extent1].[Id] ASC
Subsequent calls need to skip previous chunks of results, so they introduce the use of row_number:
SELECT TOP (100) -- the chunk size
    [Extent1].[Id] AS [Id],
    [Extent1].[Name] AS [Name]
FROM (
    SELECT [Extent1].[Id] AS [Id], [Extent1].[Name] AS [Name],
        row_number() OVER (ORDER BY [Extent1].[Id] ASC) AS [row_number]
    FROM [dbo].[Clients] AS [Extent1]
) AS [Extent1]
WHERE [Extent1].[row_number] > 100 -- the number of rows to skip
ORDER BY [Extent1].[Id] ASC
We have now posted an official response to the bug opened on Connect. The workarounds we recommend are as follows:
This error is due to Entity Framework creating an implicit transaction during the SaveChanges() call. The best way to work around the error is to use a different pattern (i.e., not saving while in the midst of reading) or by explicitly declaring a transaction. Here are three possible solutions:
// 1: Save after iteration (recommended approach in most cases)
using (var context = new MyContext())
{
    foreach (var person in context.People)
    {
        // Change to person
    }
    context.SaveChanges();
}

// 2: Declare an explicit transaction
using (var transaction = new TransactionScope())
{
    using (var context = new MyContext())
    {
        foreach (var person in context.People)
        {
            // Change to person
            context.SaveChanges();
        }
    }
    transaction.Complete();
}

// 3: Read rows ahead (dangerous!)
using (var context = new MyContext())
{
    var people = context.People.ToList(); // Note that this forces the database
                                          // to evaluate the query immediately
                                          // and could be very bad for large tables.
    foreach (var person in people)
    {
        // Change to person
        context.SaveChanges();
    }
}
Indeed you cannot save changes inside a foreach loop in C# using Entity Framework.
The context.SaveChanges() method acts like a commit on a regular database system (RDBMS).
Just make all your changes (which Entity Framework will cache) and then save them all at once by calling SaveChanges() after the loop (outside of it), like a database commit command.
This works if you can save all changes at once.
Just put context.SaveChanges() after the end of your foreach loop.
Materializing your queryable lists with .ToList() should also make it work fine.
FYI, this is from a book, with some lines adjusted, because it's still valid:
Invoking the SaveChanges() method begins a transaction which automatically rolls back all changes persisted to the database if an exception occurs before iteration completes; otherwise the transaction commits. You might be tempted to apply the method after each entity update or deletion rather than after iteration completes, especially when you're updating or deleting massive numbers of entities.
If you try to invoke SaveChanges() before all data has been processed, you incur a "New transaction is not allowed because there are other threads running in the session" exception. The exception occurs because SQL Server doesn't permit starting a new transaction on a connection that has a SqlDataReader open, even with Multiple Active Result Sets (MARS) enabled by the connection string (EF's default connection string enables MARS).
Sometimes it's better to understand why things are happening ;-)
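For reference, MARS is just a connection-string flag; a sketch with placeholder server and database names:

// Placeholder connection string; only the MultipleActiveResultSets flag matters here.
var connectionString =
    "Data Source=.;Initial Catalog=MyDb;Integrated Security=True;" +
    "MultipleActiveResultSets=True";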
Always materialize your selection as a List.
E.g.:
var tempGroupOfFiles = Entities.Submited_Files.Where(r => r.FileStatusID == 10 && r.EventID == EventId).ToList();
Then loop through the collection while saving changes:
foreach (var item in tempGroupOfFiles)
{
    var itemToUpdate = item;
    if (itemToUpdate != null)
    {
        itemToUpdate.FileStatusID = 8;
        itemToUpdate.LastModifiedDate = DateTime.Now;
    }
    Entities.SaveChanges();
}
I was getting this same issue but in a different situation. I had a list of items in a list box. The user can click an item and select delete, but I am using a stored proc to delete the item because there is a lot of logic involved in deleting it. When I call the stored proc the delete works fine, but any future call to SaveChanges will cause the error. My solution was to call the stored proc outside of EF, and this worked fine. For some reason, when I call the stored proc the EF way of doing things, it leaves something open.
We started seeing this error "New transaction is not allowed because there are other threads running in the session" after migrating from EF5 to EF6.
Google brought us here, but we are not calling SaveChanges() inside the loop. The errors were raised when executing a stored procedure using ObjectContext.ExecuteFunction inside a foreach loop reading from the DB.
Any call to ObjectContext.ExecuteFunction wraps the function in a transaction. Beginning a transaction while there is already an open reader causes the error.
It is possible to disable wrapping the SP in a transaction by setting the following option.
_context.Configuration.EnsureTransactionsForFunctionsAndCommands = false;
The EnsureTransactionsForFunctionsAndCommands option allows the SP to run without creating its own transaction and the error is no longer raised.
DbContextConfiguration.EnsureTransactionsForFunctionsAndCommands Property
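A minimal sketch of where the flag could be set, assuming an EF6 context named MyContext (the name is illustrative):

public class MyContext : DbContext
{
    public MyContext()
    {
        // Stop EF6 from wrapping ExecuteFunction calls in their own transaction,
        // which would conflict with a data reader that is still open.
        Configuration.EnsureTransactionsForFunctionsAndCommands = false;
    }
}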
Here are another 2 options that allow you to invoke SaveChanges() in a foreach loop.
The first option is use one DBContext to generate your list objects to iterate through, and then create a 2nd DBContext to call SaveChanges() on. Here is an example:
//Get your IQueryable list of objects from your main DBContext(db)
IQueryable<Object> objects = db.Object.Where(whatever where clause you desire);

//Create a new DBContext outside of the foreach loop
using (DBContext dbMod = new DBContext())
{
    //Loop through the IQueryable
    foreach (Object object in objects)
    {
        //Get the same object you are operating on in the foreach loop from the new DBContext(dbMod) using the object's id
        Object objectMod = dbMod.Object.Find(object.id);

        //Make whatever changes you need on objectMod
        objectMod.RightNow = DateTime.Now;

        //Invoke SaveChanges() on the dbMod context
        dbMod.SaveChanges();
    }
}
The 2nd option is to get a list of database objects from the DBContext, but select only the ids. Then iterate through the list of ids (presumably ints), get the object corresponding to each id, and invoke SaveChanges() that way. The idea behind this method is that grabbing a large list of integers is a lot more efficient than getting a large list of db objects and calling .ToList() on the entire object list. Here is an example of this method:
//Get the list of objects you want from your DBContext, and select just the Ids into a list
List<int> Ids = db.Object.Where(enter where clause here).Select(m => m.Id).ToList();
var objects = Ids.Select(id => db.Objects.Find(id));
foreach (var object in objects)
{
    object.RightNow = DateTime.Now;
    db.SaveChanges();
}
If you get this error due to a foreach, and you really need to save one entity first inside the loop and use its generated identity further along in the loop (as was my case), the easiest solution is to use another DbContext to insert the entity (which will populate its Id) and then use that Id in the outer context.
For example
using (var context = new DatabaseContext())
{
    ...
    using (var context1 = new DatabaseContext())
    {
        ...
        context1.SaveChanges();
    }
    //get the id of the object inserted via context1 and use it
    context.SaveChanges();
}
I was also facing the same issue.
Here is the cause and solution.
http://blogs.msdn.com/b/cbiyikoglu/archive/2006/11/21/mars-transactions-and-sql-error-3997-3988-or-3983.aspx
Make sure that before firing data manipulation commands like inserts and updates, you have closed all previous active SQL readers.
The most common mistake is in functions that read data from the db and return values, e.g. a function like isRecordExist.
In that case we return from the function immediately when we find the record, and forget to close the reader.
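A hedged sketch of the pattern being described; IsRecordExist is a hypothetical helper, and wrapping the reader in using guarantees it is closed even on the early-return path:

private bool IsRecordExist(SqlConnection connection, int id)
{
    using (var command = new SqlCommand("SELECT 1 FROM Records WHERE Id = @id", connection))
    {
        command.Parameters.AddWithValue("@id", id);
        using (var reader = command.ExecuteReader())
        {
            // The reader is disposed on every path, including this early return.
            return reader.Read();
        }
    }
}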
So in the project where I had this exact same issue, the problem wasn't in the foreach or the .ToList(); it was actually in the AutoFac configuration we used.
This created some weird situations where the above error was thrown, along with a bunch of other equivalent errors.
This was our fix:
Changed this:
container.RegisterType<DataContext>().As<DbContext>().InstancePerLifetimeScope();
container.RegisterType<DbFactory>().As<IDbFactory>().SingleInstance();
container.RegisterType<UnitOfWork>().As<IUnitOfWork>().InstancePerRequest();
To:
container.RegisterType<DataContext>().As<DbContext>().As<DbContext>();
container.RegisterType<DbFactory>().As<IDbFactory>().As<IDbFactory>().InstancePerLifetimeScope();
container.RegisterType<UnitOfWork>().As<IUnitOfWork>().As<IUnitOfWork>();//.InstancePerRequest();
I know it is an old question, but I faced this error today, and I found that this error can be thrown when a database table trigger gets an error.
For your information, you can also check your table's triggers when you get this error.
I needed to read a huge ResultSet and update some records in the table.
I tried to use chunks as suggested in Drew Noakes's answer.
Unfortunately, after 50000 records I got an OutOfMemoryException.
The answer Entity framework large data set, out of memory exception explains that
EF creates a second copy of data which it uses for change detection (so that it can persist changes to the database). EF holds this second set for the lifetime of the context, and it's this set that runs you out of memory.
The recommendation is to re-create your context for each batch.
So I retrieved the minimum and maximum values of the primary key (the tables have primary keys as auto-incremented integers). Then I retrieved chunks of records from the database by opening a context for each chunk. After processing, each chunk's context closes and releases the memory. It ensures that memory usage is not growing.
Below is a snippet from my code:
public void ProcessContextByChunks()
{
    var tableName = "MyTable";
    var startTime = DateTime.Now;
    int i = 0;
    int foundCount = 0; // not declared in the original snippet; assumed a local counter
    var minMaxIds = GetMinMaxIds();
    for (int fromKeyID = minMaxIds.From; fromKeyID <= minMaxIds.To; fromKeyID += _chunkSize)
    {
        try
        {
            using (var context = InitContext())
            {
                var chunk = GetMyTableQuery(context).Where(r => (r.KeyID >= fromKeyID) && (r.KeyID < fromKeyID + _chunkSize));
                try
                {
                    foreach (var row in chunk)
                    {
                        foundCount = UpdateRowIfNeeded(++i, row);
                    }
                    context.SaveChanges();
                }
                catch (Exception exc)
                {
                    LogChunkException(i, exc);
                }
            }
        }
        catch (Exception exc)
        {
            LogChunkException(i, exc);
        }
    }
    LogSummaryLine(tableName, i, foundCount, startTime);
}

private FromToRange<int> GetMinMaxIds()
{
    var minMaxIds = new FromToRange<int>();
    using (var context = InitContext())
    {
        var allRows = GetMyTableQuery(context);
        minMaxIds.From = allRows.Min(n => (int?)n.KeyID ?? 0);
        minMaxIds.To = allRows.Max(n => (int?)n.KeyID ?? 0);
    }
    return minMaxIds;
}

private IQueryable<MyTable> GetMyTableQuery(MyEFContext context)
{
    return context.MyTable;
}

private MyEFContext InitContext()
{
    var context = new MyEFContext();
    context.Database.Connection.ConnectionString = _connectionString;
    //context.Database.Log = SqlLog;
    return context;
}
FromToRange is a simple structure with From and To properties.
Recently I faced the same issue in my project, so I'm posting my experience; it might help someone in the same boat as I was. The issue was that I was looping through the results of an EF select query (the results are not retrieved into memory):
var products = (from e in _context.Products
                where e.StatusId == 1
                select new { e.Name, e.Type });

foreach (var product in products)
{
    //doing some insert EF queries
    //some EF select queries
    await _context.SaveChangesAsync(stoppingToken); // This code breaks.
}
I updated my Products select query to bring the results into a List rather than an IQueryable (the IQueryable keeps the reader open for the duration of the foreach loop, and hence the save was failing):
var products = (from e in _context.Products
                where e.StatusId == 1
                select new { e.Name, e.Type }).ToList(); // the added .ToList() is the fix
The code below works for me:
private pricecheckEntities _context = new pricecheckEntities();
...
private void resetpcheckedtoFalse()
{
    try
    {
        foreach (var product in _context.products)
        {
            product.pchecked = false;
            _context.products.Attach(product);
            _context.Entry(product).State = EntityState.Modified;
        }
        _context.SaveChanges();
    }
    catch (Exception extofException)
    {
        MessageBox.Show(extofException.ToString());
    }
    productsDataGrid.Items.Refresh();
}
In my case, the problem appeared when I called a stored procedure via EF and then later SaveChanges threw this exception. The problem was that in calling the procedure, the enumerator was not disposed. I fixed the code the following way:
public bool IsUserInRole(string username, string roleName, DataContext context)
{
    var result = context.aspnet_UsersInRoles_IsUserInRoleEF("/", username, roleName);
    //using here solved the issue
    using (var en = result.GetEnumerator())
    {
        if (!en.MoveNext())
            throw new Exception("empty result of aspnet_UsersInRoles_IsUserInRoleEF");
        int? resultData = en.Current;
        return resultData == 1; //1 = success, see T-SQL for return codes
    }
}
I am much too late to the party, but today I faced the same error, and how I resolved it was simple. My scenario was similar to the given code: I was making DB transactions inside nested for-each loops.
The problem is that a single DB transaction takes a little longer than the for-each loop, so once the earlier transaction is not yet complete, the new transaction throws an exception. The solution is to create a new object in the for-each loop where you are making the DB transaction.
For the above mentioned scenario, the solution will be like this:
foreach (RivWorks.Model.Negotiation.AutoNegotiationDetails companyFeedDetail in companyFeedDetailList)
{
    // A fresh context per iteration ('private' is not valid on a local variable, so use var)
    var _dbRiv = RivWorks.Model.Stores.RivEntities(AppSettings.RivWorkEntities_connString);
    if (companyFeedDetail.FeedSourceTable.ToUpper() == "AUTO")
    {
        var company = (from a in _dbRiv.Company.Include("Product") where a.CompanyId == companyFeedDetail.CompanyId select a).First();
        foreach (RivWorks.Model.NegotiationAutos.Auto sourceProduct in client.Auto)
        {
            foreach (RivWorks.Model.Negotiation.Product targetProduct in company.Product)
            {
                if (targetProduct.alternateProductID == sourceProduct.AutoID)
                {
                    found = true;
                    break;
                }
            }
            if (!found)
            {
                var newProduct = new RivWorks.Model.Negotiation.Product();
                newProduct.alternateProductID = sourceProduct.AutoID;
                newProduct.isFromFeed = true;
                newProduct.isDeleted = false;
                newProduct.SKU = sourceProduct.StockNumber;
                company.Product.Add(newProduct);
            }
        }
        _dbRiv.SaveChanges(); // ### THIS NO LONGER BREAKS ### //
    }
}
I am a little bit late, but I had this error too. I solved the problem by checking which values were being updated.
I found out that my query was wrong and that there were over 250+ edits pending. So I corrected my query, and now it works correctly.
So in my situation: check the query for errors by debugging over the result that the query returns, and then correct the query.
Hope this helps resolve future problems.
My situation was similar to the others above. I had an IQueryable which I was doing a foreach on. This in turn called a method with SaveChanges(). Boom, an exception here, as there was already a transaction open from the query above.
// Example:
var myList = _context.Table.Where(x => x.time == null);
foreach (var i in myList)
{
    MyFunction(i); // <<-- Has _context.SaveChanges() which throws exception
}
Adding ToList() to the end of the query was the solution in my case.
// Fix
var myList = _context.Table.Where(x => x.time == null).ToList();
Most of the answers here relate to loops, but my problem was different. While I was trying to use multiple dbcontext.SaveChanges() commands in the same scope, I got the error many times.
In my case, for EF Core 3.1, using
dbcontext.Database.BeginTransaction()
and
dbcontext.Database.CommitTransaction();
fixed the problem. Here is my entire code:
public IActionResult ApplyForCourse()
{
    var master = _userService.GetMasterFromCurrentUser();
    var trainee = new Trainee
    {
        CourseId = courseId,
        JobStatus = model.JobStatus,
        Gender = model.Gender,
        Name = model.Name,
        Surname = model.Surname,
        Telephone = model.Telephone,
        Email = model.Email,
        BirthDate = model.BirthDate,
        Description = model.Description,
        EducationStatus = EducationStatus.AppliedForEducation,
        TraineeType = TraineeType.SiteFirst
    };
    dbcontext.Trainees.Add(trainee);
    dbcontext.SaveChanges();

    dbcontext.Database.BeginTransaction();
    var user = userManager.GetUserAsync(User).Result;
    master.TraineeId = trainee.Id;
    master.DateOfBirth = model.BirthDate;
    master.EducationStatus = trainee.EducationStatus;
    user.Gender = model.Gender;
    user.Email = model.Email;
    dbcontext.Database.CommitTransaction();
    dbcontext.SaveChanges();

    return RedirectToAction("Index", "Home");
}

EntityFramework is painfully slow at executing an update query

We're investigating a performance issue where EF 6.1.3 is being painfully slow, and we cannot figure out what might be causing it.
The database context is initialized with:
Configuration.ProxyCreationEnabled = false;
Configuration.AutoDetectChangesEnabled = false;
Configuration.ValidateOnSaveEnabled = false;
We have isolated the performance issue to the following method:
protected virtual async Task<long> UpdateEntityInStoreAsync(T entity,
    string[] changedProperties)
{
    using (var session = sessionFactory.CreateReadWriteSession(false, false))
    {
        var writer = session.Writer<T>();
        writer.Attach(entity);
        await writer.UpdatePropertyAsync(entity, changedProperties.ToArray()).ConfigureAwait(false);
    }
    return entity.Id;
}
There are two property names in the changedProperties list, and EF correctly generates an update statement that updates just these two properties.
This method is called repeatedly (to process a collection of data items) and takes about 15-20 seconds to complete.
If we replace the method above with the following, execution time drops to 3-4 seconds:
protected virtual async Task<long> UpdateEntityInStoreAsync(T entity,
    string[] changedProperties)
{
    var sql = $"update {entity.TypeName()}s set";
    var separator = false;
    foreach (var property in changedProperties)
    {
        sql += (separator ? ", " : " ") + property + " = @" + property;
        separator = true;
    }
    sql += " where id = @Id";
    var parameters = (from parameter in changedProperties.Concat(new[] { "Id" })
                      let property = entity.GetProperty(parameter)
                      select ContextManager.CreateSqlParameter(parameter, property.GetValue(entity))).ToArray();
    using (var session = sessionFactory.CreateReadWriteSession(false, false))
    {
        await session.UnderlyingDatabase.ExecuteSqlCommandAsync(sql, parameters).ConfigureAwait(false);
    }
    return entity.Id;
}
The UpdatePropertyAsync method called on the writer (a repository implementation) looks like this:
public virtual async Task UpdatePropertyAsync(T entity, string[] changedPropertyNames, bool save = true)
{
    if (changedPropertyNames == null || changedPropertyNames.Length == 0)
    {
        return;
    }
    Array.ForEach(changedPropertyNames, name => context.Entry(entity).Property(name).IsModified = true);
    if (save)
        await context.SaveChangesAsync().ConfigureAwait(false);
}
What is EF doing that completely kills performance? And is there anything we can do to work around it (short of using another ORM)?
By timing the code I was able to see that the additional time spent by EF was in the call to Attach the object to the context, and not in the actual query to update the database.
By eliminating all object references (setting them to null before attaching the object and restoring them after the update is complete) the EF code runs in "comparable times" (5 seconds, but with lots of logging code) to the hand-written solution.
So it looks like EF has a "bug" (some might call it a feature) causing it to inspect the attached object recursively even though change tracking and validation have been disabled.
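A sketch of that workaround; the entity and navigation property names here are hypothetical, and the point is only the save/null/restore pattern around Attach:

// Save navigation references, detach the graph, attach, mark the property, save.
var parent = entity.Parent;                 // hypothetical navigation property
entity.Parent = null;                       // prevent recursive graph inspection
context.Set<MyEntity>().Attach(entity);
context.Entry(entity).Property(e => e.Name).IsModified = true;
await context.SaveChangesAsync().ConfigureAwait(false);
entity.Parent = parent;                     // restore the reference afterwards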
Update: EF 7 appears to have addressed this issue by allowing you to pass in a GraphBehavior enum when calling Attach.
The problem with Entity Framework is that when you call SaveChanges(), insert statements are sent to the database one by one; that's how Entity Framework works.
And there are actually 2 db hits per insert: the first db hit is the insert statement for a record, and the second one is a select statement to get the id of the inserted record.
So you have numOfRecords * 2 database trips * time for one database trip.
Write context.Database.Log = message => Debug.WriteLine(message); in your code to log the generated SQL to the console, and you will see what I am talking about.
You can use BulkInsert, here is the link: https://efbulkinsert.codeplex.com/
Seeing as though you have already tried setting:
Configuration.AutoDetectChangesEnabled = false;
Configuration.ValidateOnSaveEnabled = false;
and you are not using ordered lists, I think you are going to have to refactor your code and do some benchmarking.
I believe the bottleneck is the foreach, as the context has to deal with a potentially large amount of bulk data (not sure how much that is in your case).
Try to cut the items contained in your array down into smaller batches before calling the SaveChanges(); or SaveChangesAsync(); methods, and note the performance deviations as opposed to letting the context grow too large.
Also, if you are still not seeing further gains, try disposing of the context after SaveChanges(); and creating a new one; depending on the size of your entity list, flushing out the context may yield even further improvements. A sketch of this batching pattern follows.
But this all depends on how many entities we are talking about, and may only be noticeable in the hundreds and thousands of records scenarios.
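A minimal sketch of that batch-then-dispose advice, assuming a context MyContext with a set Entities keyed by Id (all names illustrative):

const int batchSize = 100; // batch size is a guess; benchmark your own workload
for (int skip = 0; ; skip += batchSize)
{
    using (var context = new MyContext())
    {
        // A stable ordering is required for Skip/Take paging to be correct.
        var batch = context.Entities
            .OrderBy(e => e.Id)
            .Skip(skip)
            .Take(batchSize)
            .ToList();
        if (batch.Count == 0)
            break;
        foreach (var e in batch)
        {
            // ... apply your changes to e ...
        }
        context.SaveChanges(); // one save per batch
    } // disposing the context flushes its change tracker between batches
}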

Entity Framework 4 Out of Memory on SaveChanges

I have a table that contains greater than half a million records. Each record contains about 60 fields but we only make changes to three of them.
We make a small modification to each entity based on a calculation and a look up.
Clearly I can't update each entity in turn and then SaveChanges as that would take far too long.
So at the end of the whole process I call SaveChanges on the Context.
This is causing an Out of Memory error when I apply SaveChanges.
I'm using the DataRepository pattern.
//Update code
DataRepository<ExportOrderSKUData> repoExportOrders = new DataRepository<ExportOrderSKUData>();
foreach (ExportOrderSKUData grpDCItem in repoExportOrders.All())
{
    // ..make changes to entity..
}
repoExportOrders.SaveChanges();

//Data repository snip
public DataRepository()
{
    _context = new tomEntities();
    _objectSet = _context.CreateObjectSet<T>();
}

public List<T> All()
{
    return _objectSet.ToList<T>();
}

public void SaveChanges()
{
    _context.SaveChanges();
}
What should I be looking for in this instance?
Making changes to half a million records through EF within one transaction is not a supported use case. Doing it in small batches is a better technical solution. Doing it on the database side through some stored procedure can be an even better solution.
I would start by slightly modifying your code (translate it to your repository API yourself):
using (var readContext = new YourContext())
{
    var set = readContext.CreateObjectSet<ExportOrderSKUData>();
    foreach (var item in set.ToList())
    {
        readContext.Detach(item);
        using (var updateContext = new YourContext())
        {
            updateContext.Attach(item);
            // make your changes
            updateContext.SaveChanges();
        }
    }
}
This code uses a separate context for saving each item, so each save is in its own transaction. Don't be afraid of that. Even if you try to save more records within one call of SaveChanges, EF will use a separate roundtrip to the database for every updated record. The only difference is whether you want to have multiple updates in the same transaction (but having half a million updates in a single transaction would cause issues anyway).
Another option may be:
using (var readContext = new YourContext())
{
    var set = readContext.CreateObjectSet<ExportOrderSKUData>();
    set.MergeOption = MergeOption.NoTracking;
    foreach (var item in set)
    {
        using (var updateContext = new YourContext())
        {
            updateContext.Attach(item);
            // make your changes
            updateContext.SaveChanges();
        }
    }
}
This can in theory consume even less memory because you don't need to have all entities loaded prior to doing the foreach. The first example probably needs to load all entities prior to enumeration (by calling ToList) to avoid an exception when calling Detach (modifying the collection during enumeration) - but I'm not sure if that really happens.
Modifying those examples to use some batches should be easy.
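For instance, a hedged sketch that keeps the NoTracking read but saves through the update context once per batch instead of once per item (the batch size and the Attach-based API are carried over from the examples above):

using (var readContext = new YourContext())
{
    var set = readContext.CreateObjectSet<ExportOrderSKUData>();
    set.MergeOption = MergeOption.NoTracking;

    const int batchSize = 1000;
    var updateContext = new YourContext();
    int pending = 0;
    try
    {
        foreach (var item in set)
        {
            updateContext.Attach(item);
            // make your changes
            if (++pending == batchSize)
            {
                updateContext.SaveChanges();  // one transaction per batch
                updateContext.Dispose();      // fresh context keeps memory flat
                updateContext = new YourContext();
                pending = 0;
            }
        }
        if (pending > 0)
            updateContext.SaveChanges();      // flush the final partial batch
    }
    finally
    {
        updateContext.Dispose();
    }
}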

EF4.1 DBContext: Insert/update in one Save() function without identity PK

I have a streets table which has a combination of two string columns acting as the PK: postalcode and streetcode.
With EF4.1 and DbContext, I'd like to write a single "Save" method that takes a street (arriving in a detached state) and checks if it already exists in the database. If it does, it issues an UPDATE, and if it doesn't, it issues an INSERT.
FYI, the application that saves these streets reads them from a text file (there are a few tens of thousands of these "street lines" in that file).
What I've come up with for now is:
public void Save(Street street)
{
    var existingStreet = (
        from s in streetContext.Streets
        where s.PostalCode.Equals(street.PostalCode)
            && s.StreetCode.Equals(street.StreetCode)
        select s
    ).FirstOrDefault();

    if (existingStreet != null)
        this.streetContext.Entry(street).State = System.Data.EntityState.Modified;
    else
        this.streetContext.Entry(street).State = System.Data.EntityState.Added;

    this.streetContext.SaveChanges();
}
Is this good practice? How about performance? For every street it first does a roundtrip to the db to see if it exists.
Wouldn't it be better performance-wise to try to insert the street (state = added) and catch any PK violations? In the catch block, I could then change the state to Modified and call SaveChanges() again. Or would that not be good practice?
Any suggestions?
Thanks
Select all the streets, then make a foreach loop that compares and changes states. After the loop is done, call SaveChanges. This way you only make a few calls to the db instead of several thousand.
Thanks for the replies, but none were really satisfying, so I did some more research and rewrote the method like this, which satisfies my need.
ps: I renamed the method to Import() because I find that a more appropriate name for a method that is used for (bulk) importing entities from an outside source (like a text file, in my case).
ps2: I know it's not really best practice to catch an exception and let it die silently by not doing anything with it, but I don't have the need to do anything with it in my particular case. It just serves as a way to find out that the row already exists in the database.
public void Import(Street street)
{
    try
    {
        this.streetContext.Entry(street).State = System.Data.EntityState.Added;
        this.streetContext.SaveChanges();
    }
    catch (System.Data.Entity.Infrastructure.DbUpdateException dbUex)
    {
        this.streetContext.Entry(street).State = System.Data.EntityState.Modified;
        this.streetContext.SaveChanges();
    }
    finally
    {
        ((IObjectContextAdapter)this.streetContext).ObjectContext.Detach(street);
    }
}
Your code must result in an exception if the street already exists in the database, because you will load it from the context and after that try to attach another instance with the same primary key to the same context instance.
If you really have to do this, use this code instead:
public void Save(Street street)
{
    string postalCode = street.PostalCode;
    string streetCode = street.StreetCode;
    bool existingStreet = streetContext.Streets.Any(s =>
        s.PostalCode == postalCode
        && s.StreetCode == streetCode);

    if (existingStreet)
        streetContext.Entry(street).State = System.Data.EntityState.Modified;
    else
        streetContext.Entry(street).State = System.Data.EntityState.Added;

    streetContext.SaveChanges();
}
Anyway, it is still not a reliable solution in a highly concurrent system, because another thread can insert the same street between your check and the subsequent insert.
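One hedged way to close that race is to make the insert-or-update decision in a single atomic statement on the server; a sketch using MERGE through ExecuteSqlCommand, where the Streets table, its columns, and the Name payload column are assumptions based on the question:

streetContext.Database.ExecuteSqlCommand(@"
    MERGE Streets AS target
    USING (SELECT @p0 AS PostalCode, @p1 AS StreetCode) AS source
        ON target.PostalCode = source.PostalCode
       AND target.StreetCode = source.StreetCode
    WHEN MATCHED THEN
        UPDATE SET target.Name = @p2          -- hypothetical payload column
    WHEN NOT MATCHED THEN
        INSERT (PostalCode, StreetCode, Name)
        VALUES (source.PostalCode, source.StreetCode, @p2);",
    street.PostalCode, street.StreetCode, street.Name);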

C# - Entity Framework - Understanding some basics

Model #1 - This model sits in a database on our Dev Server. (Image: http://content.screencast.com/users/Keith.Barrows/folders/Jing/media/bdb2b000-6e60-4af0-a7a1-2bb6b05d8bc1/Model1.png)
Model #2 - This model sits in a database on our Prod Server and is updated each day by automatic feeds. (Image: http://content.screencast.com/users/Keith.Barrows/folders/Jing/media/4260259f-bce6-43d5-9d2a-017bd9a980d4/Model2.png)
I have written what should be some simple code to sync my feed (Model #2) into my working DB (Model #1). Please note this is prototype code and the models may not be as pretty as they should be. Also, the entry into Model #1 for the feed link data (mainly ClientID) is a manual process at this point, which is why I am writing this simple sync method.
private void SyncFeeds()
{
    var sourceList = from a in _dbFeed.Auto where a.Active == true select a;
    foreach (RivWorks.Model.NegotiationAutos.Auto source in sourceList)
    {
        var targetList = from a in _dbRiv.Product where a.alternateProductID == source.AutoID select a;
        if (targetList.Count() > 0)
        {
            // UPDATE...
            try
            {
                var product = targetList.First();
                product.alternateProductID = source.AutoID;
                product.isFromFeed = true;
                product.isDeleted = false;
                product.SKU = source.StockNumber;
                _dbRiv.SaveChanges();
            }
            catch (Exception ex)
            {
                string m = ex.Message;
            }
        }
        else
        {
            // INSERT...
            try
            {
                long clientID = source.Client.ClientID;
                var companyDetail = (from a in _dbRiv.AutoNegotiationDetails where a.ClientID == clientID select a).First();
                var company = companyDetail.Company;
                switch (companyDetail.FeedSourceTable.ToUpper())
                {
                    case "AUTO":
                        var product = new RivWorks.Model.Negotiation.Product();
                        product.alternateProductID = source.AutoID;
                        product.isFromFeed = true;
                        product.isDeleted = false;
                        product.SKU = source.StockNumber;
                        company.Product.Add(product);
                        break;
                }
                _dbRiv.SaveChanges();
            }
            catch (Exception ex)
            {
                string m = ex.Message;
            }
        }
    }
}
Now for the questions:
In Model #2, the class structure for Auto is missing ClientID (see red circled area). From everything I have learned, EF creates a child Client object and I should be able to find the ClientID in that child object. Yet, when I run my code, source.Client is a NULL object. Am I expecting something that EF does not do? Is there a way to populate the child object correctly?
Why does EF hide the child entity ID (ClientID in this case) in the parent table? Is there any way to expose it?
What else sticks out like the proverbial sore thumb?
TIA
1) The reason you are seeing a null for source.Client is because related objects are not loaded until you request them, or they are otherwise loaded into the object context. The following will load them explicitly:
if (!source.ClientReference.IsLoaded)
{
    source.ClientReference.Load();
}
However, this is sub-optimal when you have a list of more than one record, as it sends one database query per Load() call. A better alternative is to use the Include() method in your initial query, to instruct the ORM to load the related entities you are interested in, so:
var sourceList = from a in _dbFeed.Auto.Include("Client") where a.Active == true select a;
An alternative third method is to use something called relationship fix-up, where, if the related clients in your example had been queried previously, they would still be in your object context. For example:
var clients = (from a in _dbFeed.Client select a).ToList();
The EF will then 'fix up' the relationships so source.Client would not be null. Obviously this is only something you would do if you required a list of all clients for synching, so it is not relevant for your specific example.
Always remember that objects are never loaded into the EF unless you request them!
2) The first version of the EF deliberately does not map foreign key fields to observable fields or properties. This is a good rundown on the matter. In EF4.0, I understand foreign keys will be exposed due to popular demand.
3) One issue you may run into is the number of database queries that requests for Products or AutoNegotiationContacts may generate. As an alternative, consider loading them in bulk or with a join on your initial query.
It's also seen as good practice to use an object context for one 'operation' and then dispose of it, rather than persisting it across requests. There is very little overhead in initializing one, so one object context per SyncFeeds() call is more appropriate. ObjectContext implements IDisposable, so you can instantiate it in a using block and wrap the method's contents in that, to ensure everything is cleaned up correctly once your changes are submitted.
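A minimal sketch of that per-operation pattern applied to the question's SyncFeeds(), assuming the factory methods shown in the question return disposable contexts:

private void SyncFeeds()
{
    // One short-lived context pair per operation, disposed when the method ends.
    using (var dbRiv = RivWorks.Model.Stores.RivEntities(AppSettings.RivWorkEntities_connString))
    using (var dbFeed = RivWorks.Model.Stores.FeedEntities(AppSettings.FeedAutosEntities_connString))
    {
        // ... query, modify, and call dbRiv.SaveChanges() here ...
    }
}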
