Linq: Delete and Insert same Primary Key values within TransactionScope - c#

I want to replace existing records in the DB with new records in one transaction. Using TransactionScope, I have
using ( var scope = new TransactionScope())
{
db.Tasks.DeleteAllOnSubmit(oldTasks);
db.Tasks.SubmitChanges();
db.Tasks.InsertAllOnSubmit(newTasks);
db.Tasks.SubmitChanges();
scope.Complete();
}
My program threw
System.InvalidOperationException: Cannot add an entity that already exists.
After some trial and error, I found the culprit lies in the the fact that there isn't any other execution instructions between the delete and the insert. If I insert other code between the first SubmitChanges() and InsertAllOnSubmit(), everything works fine. Can anyone explain why is this happening? It is very concerning.
I tried another one to update the objects:
IEnumerable<Task> tasks = ( ... some long query that involves multi tables )
.AsEnumerable()
.Select( i =>
{
i.Task.Duration += i.LastLegDuration;
return i.Task;
}
db.SubmitChanges();
This didn't work neither. db didn't pick up any changes to Tasks.
EDIT:
This behavior doesn't seem to have anything to do with Transactions. At the end, I adopted the grossly inefficient Update:
newTasks.ForEach( t =>
{
Task attached = db.Tasks.Single( i => ... use primary id to look up ... );
attached.Duration = ...;
... more updates, Property by Property ...
}
db.SubmitChanges();

Instead of inserting and deleting or making multiple queries, you can try to update multiple rows in one pass by selecting a list of Id's to update and checking if the list contains each item.
Also, make sure you mark your transaction as complete to indicate to transaction manager that the state across all resources is consistent, and the transaction can be committed.
Dictionary<int,int> taskIdsWithDuration = getIdsOfTasksToUpdate(); //fetch a dictionary keyed on id's from your long query and values storing the corresponding *LastLegDuration*
using (var scope = new TransactionScope(TransactionScopeOption.Required))
{
var tasksToUpdate = db.Tasks.Where(x => taskIdsWithDuration.Keys.Contains(x.id));
foreach (var task in tasksToUpdate)
{
task.duration1 += taskIdsWithDuration[task.id];
}
db.SaveChanges();
scope.Complete();
}
Depending on your scenario, you can invert the search in the case that your table is extremely large and the number of items to update is reasonably small, to leverage indexing. Your existing update query should work fine if this is the case, so I doubt you'll need to invert it.

I had same problem in LinqToSql and I don't think its to do with the transaction, but with how the session/context is coalescing changes. I say this because I fixed the problem by bypassing linqtosql for the delete and using some raw sql to do it. Ugly I know, but it worked, and all inside a transaction scope.

Related

EF (C#) with tunneling to MySQL - optimising database calls

So I have program to deal with some sort of work in MySQL database. I'm connecting through SSH tunneling with putty (yes, I know launching programs on server itself would be much better but I don't have a choice here).
I have some problems with programs speed. I solved one by adding ".Include(table_name)" but I can't think about a way to do it here.
So purpose of this method is to clean database of unwanted, broken records. Simplified code looks like this:
using (var dbContext = new MyDatabase_dataEntities())
{
List<achievements> achiList = new List<achievements>();
var achievementsQuery = from data in dbContext.achievements
orderby data.playerID
select data;
achiList = achievementsQuery.Skip(counter * 5000).Take(5000).ToList();
foreach (achievements record in achiList)
{
var playerExists = from data in dbContext.players_data
where data.playerID == record.playerID
select data;
if(!playerExists.Any())
{
dbContext.achievements.Remove(record);
}
}
dbContext.SaveChanges();
counter++;
}
So this is built this way because I want to load achievements table then check if achievements have their player in player_data. If such player doesn't exist, remove his achievements.
It is all in do while, so I don't overload my memory by loading all data at once.
I know the problem is with checking in database in foreach steps but I can't figure out how to do it without it. Other things I tried generated errors either because EF couldn't translate it into SQL or exceptions were thrown when trying to access non-existing entity. Doing it in foreach bottlenecks whole program probably because of ping to the server.
I will need similar thing more often so I would be really gratefull if anyone could help me with making it so there is no need to call to database in "foreach". I know I could try to load whole players_data table and then check for Any(), but some tables I need it on are too big for that.
Oh, and turning off tracking changes doesn't help at this point because it is not what slows the program.
I would be gratefull for any help, thanks in advance!
EDIT: Mmmm, is there a way to get achievements which doesn't have player_data corresponding to them through one query using associations? Like adding to achievements query something like:
where !data.player_data.Exists()
Intellisense shows me that there is nothing like Exists or Any to use at this point. Is there any trick similar to this? It would definitely deal with the problem I have with speed there since database call in foreach wouldn't be needed.
If you want to delete achievements that don't have corresponding user records, then you can user a SQL query below:
DELETE a
FROM `achievements` a
LEFT JOIN `user` AS u
ON u.`playerID` = a.`playerID`
WHERE u.`playerID` IS NULL;
SQL query will be an order of magnitude faster than Entity Framework.
If you want to do that in the application, you can use the following code that uses LINQ to Entities and LINQ extensions methods. I assume you have a foreign key for player_data in achievements table so Entity Framework generates player_data lazy property for your achievements entity:
using (var dbContext = new MyDatabase_dataEntities())
{
var proceed = true;
while (proceed)
{
// Get net 1000 entities to delete
var entitiesToDelete = dbContext.achievements
.Where(x => x.players_data == null)
.Take(1000)
.ToList();
dbContext.achievements.RemoveRange(entitiesToDelete);
dbContext.SaveChanges();
// Proceed if deleted entities during this iteration
proceed = entitiesToDelete.Count() > 0;
}
}
If you prefer to use LINQ query syntax instead of extension methods, then your code will look like:
using (var dbContext = new MyDatabase_dataEntities())
{
var proceed = true;
while (proceed)
{
// Get net 1000 entities to delete
var query = from achievement in dbContext.achievements
where achievement.players_data == null
select achievement;
var entitiesToDelete = query.ToList();
dbContext.achievements.RemoveRange(entitiesToDelete);
dbContext.SaveChanges();
// Proceed if deleted entities during this iteration
proceed = entitiesToDelete.Count() > 0;
}
}

How To insert records of one database to other database in LINQ

I have 2 database with tables.
I wanted to insert records from first database to second database table in LINQ. I have created 2 dbml files with 2 datacontexts but I am unable to code the insertion of records.
I have list of records:
using(_TimeClockDataContext)
{
var Query = (from EditTime in _TimeClockDataContext.tblEditTimes
orderby EditTime.ScanDate ascending
select new EditTimeBO
{
EditTimeID = EditTime.EditTimeID,
UserID = Convert.ToInt64(EditTime.UserID),
ScanCardId = Convert.ToInt64(EditTime.ScanCardId),
}).ToList();
return Query;
}
Now I want to insert record in new table which is in _Premire2DataContext.
If you want to "copy" records from one database to another using Linq then you need two database contexts, one for the database you are reading from, and one for the database you are reading to.
EditTime[] sourceRows;
using (var sourceContext = CreateSourceContext())
{
sourceRows = ReadRows(sourceContext);
}
using (var destinationContext = CreateDestinationContext())
{
WriteRows(destinationContext, sourceRows);
}
You now just need to fill in the implementations for ReadRows and WriteRows using standard LINQ to SQL code. The code for writing rows should look a bit like this.
void WriteRows(TimeClockDataContext context, EditTime[] rows)
{
foreach (var row in rows)
{
destinationContext.tblEditTimes.Add(row);
}
destinationContext.SubmitChanges();
}
Note that as long as the schema is the same you can use the same context and therefore the same objects - so when reading records we ideally want to return the correct array type, therefore reading is going to look a bit like this
EditTime[] ReadRows(TimeClockDataContext context)
{
return (
from EditTime in _TimeClockDataContext.tblEditTimes
orderby EditTime.ScanDate ascending
select EditTime
).ToArray();
}
You can use an array or a list - it doesn't really matter. I've used an array mostly because the syntax is shorter. Note that we return the original EditTime objects rather than create new ones as this means we can add those objects directly to the second data context.
I've not compiled any of this code yet, so it might contain some typos. Also apologies if I've made some obvious errors - its been a while since I last used LINQ to SQL.
If you have foreign keys or the second database has a different schema then things get more complicated, but the fundamental process remains the same - read from one context (using standard LINQ to SQL) and store the results somewhere, then add the rows the the second context (using standard LINQ to SQL).
Also note that this isn't necessarily going to be particularly quick. If performance is an issue then you should look into using bulk inserts in the WriteRows method, or potentially even use linked servers to do the entire thing in SQL.

Pessimistic locking in EF code first

I'd like to lock specified row(s) in my table exclusively, so no reads no updates allowed until the actual transaction completes. To do this, I've created a helper class in my database repository:
public void PessimisticMyEntityHandler(Action<IEnumerable<MyEntity>> fieldUpdater, string sql, params object[] parameters)
{
using (var scope = new System.Transactions.TransactionScope())
{
fieldUpdater(DbContext.Set<MyEntity>().SqlQuery(sql, parameters));
scope.Complete();
}
}
Here is my test code. Basicly I'm just starting two tasks and both of them tries to lock the row with the Id '1'. My guess was that the second task won't be able to read(and update) the row until the first one finishes its job, but the output window shows that it actually can.
Task.Factory.StartNew(() =>
{
var dbRepo = new DatabaseRepository();
dbRepo.PessimisticMyEntityHandler(myEntities =>
{
Debug.WriteLine("entered into lock1");
/* Modify some properties considering the current ones... */
var myEntity = myEntities.First();
Thread.Sleep(1500);
myEntity.MyEntityCode = "abcdefgh";
dbRepo.Update<MyEntity>(myEntity);
Debug.WriteLine("leaving lock1");
}, "SELECT * FROM MyEntities WITH (UPDLOCK, HOLDLOCK) WHERE Id = #param1", new SqlParameter("param1", 1));
});
Task.Factory.StartNew(() =>
{
Thread.Sleep(500);
var dbRepo = new DatabaseRepository();
dbRepo.PessimisticMyEntityHandler(myEntities =>
{
Debug.WriteLine("entered into lock2");
/* Modify some properties considering the current ones... */
var myEntity = myEntities.First();
myEntity.MyEntityCode = "xyz";
dbRepo.Update<MyEntity>(myEntity);
Debug.WriteLine("leaving lock2");
}, "SELECT * FROM MyEntities WITH (UPDLOCK, HOLDLOCK) WHERE Id = #param1", new SqlParameter("param1", 1));
});
Output window:
entered into lock1
entered into lock2
leaving lock2
leaving lock1
What you are asking for, involves two main phenomena in DBMS and particularly in SQL Server: Lock and Isolation Level. I do my best to explain them in summery.
You asked about Pessimistic Concurrency. The answer is: it is not supported yet in Entity Framework. In other words, by conventional API of EF you cannot lock a table or some rows for SELECT like what for example Oracle does via SELECT FOR UPDATE. Though you can achieve this by writing a native SQL command to select some rows or the entire table with an Exclusive lock and maintain this lock until the end of the transaction. This way other threads not only cannot update the selected rows, but also cannot select them. They get blocked until you release the lock. This is what I do in my projects and though somehow risky, it works fine.
So In summery:
Lock for select: NO by EF / Yes by native SQL
Lock for Update:
When you modify rows in DB, the modified rows gain some sort of lock. The kind of lock is determined by the Isolation Level of the running Transaction. The default of Isolation Level in SQL Server is Read Committed which means that all the rows that are modified in current transaction gain Shared lock. This lock is compatible with SELECT but not with UPDATE or DELETE. It means that when you modify a row in your transaction, by default it is guaranteed that no other parallel threads can infer and change them until you end the transaction either by COMMIT or ROLLBACK.
To understand the locks in SQL Server see:
http://technet.microsoft.com/en-us/library/aa213039(v=sql.80).aspx
To understand transaction isolation level see: http://technet.microsoft.com/en-us/library/ms189122(v=sql.105).aspx
.
Update:
Table hints UPDLOCK and HOLDLOCK may be ignored by query optimizer or other DBMS modules since they are just hint :-). The only combination of table hints that is surely enforced is (XLOCK, PAGLOCK).
Example: SELECT * FROM MyTable WITH (XLOCK, PAGLOCK)
As I said, manual locking is risky. Use it with maximum level of consideration.

Query Context for Inserted but yet Uncommitted records

I've searched and searched stackoverflow and www, but have found no answers to this question.
I am looping through a number of records and under certain conditions I'm inserting new records into table A. Then I'm looping again on another data source (which cannot be merged with the first one), and if that be the case, I want to insert new records into the same table A. I only want to commit the records at the end of the process, but it'll give a primary key violation error if I just insert them.
Note: linq is not managing primary keys. Probably because I'm sort of a noob with linq and don't really know how to get linq to work with Oracle sequences.
My question is how do I check the existing context for the records I have inserted. This is what I am doing.
foreach(var rec in recordList1)
{
...
dataContext.InsertOnSubmit(obj);
}
foreach(var rec in recordList2)
{
if ( ! [check context here for existing record] )
{
...
dataContext.InsertOnSubmit(obj);
}
}
dataContext.SubmitChanges();
I've tried querying the context in different ways, but it'll only return committed values.
Thanks in advance!
Best regards.
To access the objects inserted, updated, deleted in the datacontext you need to call GetChangeSet.
var changed = dataContext.GetChangeSet();
var inserted = changed.Inserts;
var updated = changed.Updates;
var deleted = changed.Deletes;

How to update with Linq-To-SQL?

I need to update values but I am looping all the tables values to do it:
public static void Update(IEnumerable<Sample> samples
, DataClassesDataContext db)
{
foreach (var sample in db.Samples)
{
var matches = samples.Where(a => a.Id == sample.Id);
if(matches.Any())
{
var match = matches.First();
match.SomeColumn = sample.SomeColumn;
}
}
db.SubmitChanges();
}
I am certain the code above isn't the right way to do it, but I couldn't think of any other way yet. Can you show a better way?
Yes, there is a simpler way. Much simpler. If you attach your entities to the context and then Refresh (with KeepCurrentValues selected), Linq to SQL will get those entities from the server, compare them, and mark updated those that are different. Your code would look something like this.
public static void Update(IEnumerable<Sample> samples
, DataClassesDataContext db)
{
db.Samples.AttachAll(samples);
db.Refresh(RefreshMode.KeepCurrentValues, samples)
db.SubmitChanges();
}
In this case, Linq to SQL is using the keys to match and update records so as long as your keys are in synch, you're fine.
With Linq2Sql (or Linq to Entities), there is no way* to update records on the server without retrieving them in full first, so what you're doing is actually correct.
If you want to avoid this, write a stored procedure that does what you want and add it to your model.
I'm not entirely sure if that was your intended question however :)
*: There are some hacks around that use LINQ to build a SELECT statement and butcher the resulting SELECT statement into an UPDATE somehow, but I wouldn't recommend it.

Categories

Resources