I'm using Entity Framework to communicate with the database (fetching and writing data) in an ASP.NET Core application. (It's used as a very basic API, acting as a server for a separate application.)
At some point clients send requests to join a given lobby. I've just confirmed that if 4 requests arrive at the same time, they all get signed up to the lobby, but the player count doesn't update correctly, and when it does, it can go over the limit.
Am I using Entity Framework wrong? Is there an alternative tool for this kind of thing? Should I make the application handle requests on a single thread (if someone can remind me how), or wrap all my actions/endpoints in a lock statement?
No matter how I structure my code, it's all vulnerable to these simultaneous HTTP requests moving through my repository/context in parallel.
It would be great if I could build some kind of queue, which I believe is effectively what wrapping everything in a lock would do.
EDIT:
As answered by vasily.sib, I can resolve this with the use of concurrency tokens. Please check his answer for some great information on how to use them!
Your problem is that operations like these...
_context.Sessions.FirstOrDefault(s => s.Id == sessionId).PlayerCount += 1;
_context.SaveChanges();
...are not atomic. With FirstOrDefault you get the session (which you then dereference without a null check, so First would be a better option here, since it gives a clearer error message). Then you save the changes in a second step. Between those two steps, another concurrent request could have changed and saved a new value for PlayerCount.
There are multiple ways to resolve this, and most of them would require some changes on DB level.
One way to resolve it is to write a stored procedure that can do the update atomically. You would need an UPDATE statement similar to this one:
UPDATE Sessions
SET PlayerCount = PlayerCount + 1
WHERE Id = @SessionId
If you don't want to use stored procedures you could send this SQL directly to the DB.
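A minimal sketch of that (assuming EF Core 3 or later; sessionId comes from the request):
// One atomic statement executed on the server; the interpolated value
// is turned into a DbParameter, so this is not injectable.
var rows = _context.Database.ExecuteSqlInterpolated(
    $"UPDATE Sessions SET PlayerCount = PlayerCount + 1 WHERE Id = {sessionId}");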
One problem with this solution is that later you also do this step:
if (thisSession.Size == thisSession.PlayerCount)
{
thisSession.Status = SessionStatus.Active;
_context.SaveChanges(); // this can be trimmed and added at the previous scope
}
You would have to integrate this in your UPDATE statement to keep the operation atomic. If you want to add additional steps, things can get complicated.
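As a sketch of such an integrated statement (hypothetical, assuming Status is stored as an int and sent the same way as above):
// Increments the count, flips the status when the session becomes full,
// and refuses the join entirely when it already is full.
var rows = _context.Database.ExecuteSqlInterpolated($@"
    UPDATE Sessions
    SET PlayerCount = PlayerCount + 1,
        Status = CASE WHEN PlayerCount + 1 = Size THEN {(int)SessionStatus.Active}
                      ELSE Status END
    WHERE Id = {sessionId} AND PlayerCount < Size");
// rows == 0 means the session was already full and the join was rejected.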
Another way is to use the optimistic concurrency support built into EF Core. In a nutshell, this means that when you save the data, EF first checks whether the destination row is still at the same version it had when you retrieved it.
To achieve that, your session needs a column that contains a version counter. For SQL Server this would be a column of type rowversion. Once you have this column, EF can do its optimistic concurrency magic. If the row changed in the meantime, EF throws a DbUpdateConcurrencyException, which you have to handle. There are different ways to resolve the error. One easy way is to repeat your entire operation until it succeeds.
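A minimal sketch of that approach (entity and status names are taken from the question; the endless retry loop is deliberately simplistic):
using System.ComponentModel.DataAnnotations;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Session
{
    public int Id { get; set; }
    public int Size { get; set; }
    public int PlayerCount { get; set; }
    public SessionStatus Status { get; set; }

    [Timestamp] // maps to a SQL Server rowversion column
    public byte[] RowVersion { get; set; }
}

// Inside the join endpoint: retry until the save succeeds against
// the latest row version.
while (true)
{
    var session = _context.Sessions.First(s => s.Id == sessionId);
    if (session.PlayerCount >= session.Size)
        throw new InvalidOperationException("Session is full.");

    session.PlayerCount += 1;
    if (session.PlayerCount == session.Size)
        session.Status = SessionStatus.Active;

    try
    {
        _context.SaveChanges();
        break; // saved against an unchanged row version
    }
    catch (DbUpdateConcurrencyException)
    {
        // Another request changed the row in between; discard the
        // stale values and try again with fresh data.
        foreach (var entry in _context.ChangeTracker.Entries())
            entry.Reload();
    }
}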
Related
Today in an interview I was asked this kind of question:
Let's say you want to update multiple records at the same time using a web API; how would you do that? It's around 1000s of records to update.
I replied that I would use async and await.
Then he asked: if user 1 updates a record and user 2 updates the same record at the same time, whose update takes effect, and how can this scenario be handled?
What would be the best reply to this kind of question?
Entity Framework is not designed for mass updates; it is designed for queries and transactional updates (a few records at a time).
If you need to mass-update data, either write a SQL statement that performs all of the updates or use an ETL tool like SSIS; raw SQL is much faster than Entity Framework for this. That said, for normal CRUD that is not querying a billion rows, EF is totally fine, but when handling a significant amount of data, replacing EF calls with stored procedures is a good choice.
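As a hedged sketch of the raw-SQL route (EF Core syntax shown; the table, column and variable names are made up; in EF6 the equivalent call is Database.ExecuteSqlCommand):
// One set-based statement instead of loading ~1000 entities,
// mutating them in memory and saving them back one by one.
var affected = context.Database.ExecuteSqlInterpolated(
    $"UPDATE Orders SET Status = {newStatus} WHERE BatchId = {batchId}");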
Question: if user 1 updates a record and user 2 updates the same record at the same time, whose update takes effect, and how can this scenario be handled?
This is called "concurrency". To handle this kind of scenario you use a locking mechanism for concurrency handling (updating the same record from different users), so that one user can see that another user is modifying the same record.
Asp.net Entity Framework Aspect:
In asp.net Entity framework has
RowVersion property which will track update log for a specific data
when it’s been updated.
There are two kinds of concurrency mechanisms you can use:
Optimistic Concurrency
Pessimistic concurrency
You can have a look at the official documentation for more details. Additionally, here is an implementation example.
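For illustration, marking a column as the row version looks like this (the same fluent call works in EF6 and EF Core; the Product entity is hypothetical):
// In OnModelCreating: a byte[] property becomes the concurrency token
// backed by a SQL Server rowversion column.
modelBuilder.Entity<Product>()
    .Property(p => p.RowVersion)
    .IsRowVersion();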
Database Aspect:
You can also handle this scenario at the database level. Every major database has a locking mechanism for working on the same records simultaneously from multiple users. You can have a look here.
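A pessimistic sketch at the database level (EF6 syntax; the table and property names are hypothetical):
// Take an update lock on the row for the duration of the transaction,
// so a second user blocks until the first one commits.
using (var tx = context.Database.BeginTransaction())
{
    var product = context.Products
        .SqlQuery("SELECT * FROM Products WITH (UPDLOCK, ROWLOCK) WHERE Id = @p0",
                  productId)
        .Single();

    product.Price = newPrice;
    context.SaveChanges();
    tx.Commit();
}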
Hope this guides you accordingly and clears up your confusion.
Is it possible to hold on to the 'changes' you want to make with Entity Framework?
For instance, I run an update query, then the connection fails and I close the app; tomorrow, I want it to run that update query once a connection is restored.
Is a thing like that possible with Entity Framework 6?
You could possibly create your own ChangeTracker, like in this tutorial, which saves all changes to a file. Parsing the file for later use might be tricky, though.
The other option would be to use retry logic and hope the connection problem was just a slight hiccup.
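If you go the retry route, EF6 ships a built-in execution strategy that retries transient failures; a minimal sketch (the retry count and delay are arbitrary):
using System;
using System.Data.Entity;
using System.Data.Entity.SqlServer;

// Picked up automatically by EF6 when it lives in the same assembly
// as the DbContext.
public class ResilientDbConfiguration : DbConfiguration
{
    public ResilientDbConfiguration()
    {
        // Retries operations that fail with known transient SQL Server errors
        // (5 attempts, at most 10 seconds between them).
        SetExecutionStrategy("System.Data.SqlClient",
            () => new SqlAzureExecutionStrategy(5, TimeSpan.FromSeconds(10)));
    }
}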
I think ZorgoZ is right and this answer will not address your literal question, but what I think is your actual problem (see the XY problem): you fail to save some changes (business-wise) and you want to be able to retry saving them later; it isn't necessarily EF changes that you want to persist at a later moment.
One way to do it is to store the business information that triggered the change and retry the whole flow:
define some sort of queue stored in a file, isolated storage, etc. (depends on your technology)
your update flow persists some sort of record/object in the queue, containing all relevant information plus a status (e.g. Queued, Save error, etc.)
the update is tried; if it fails, the status is updated (e.g. to Save error)
the user closes the application
later, the user opens the application and sees the status of the queue items (item 1 saved OK, item 2 save error, etc.)
Besides avoiding EF technical issues, this implementation allows for flow changes that include steps outside of EF's save (e.g. also calling some external API) and abstracts away the data access layer entirely (serializing EF context changes would very likely depend on EF specifics).
Also, I suspect that serializing the context information (e.g. the data behind some form) is easier to implement than serializing EF context changes.
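A minimal sketch of such a queue item (all names are hypothetical; the payload is business data, not EF entities):
public enum PendingChangeStatus { Queued, Saved, SaveError }

public class PendingChange
{
    public Guid Id { get; set; } = Guid.NewGuid();
    public DateTime CreatedUtc { get; set; } = DateTime.UtcNow;
    public PendingChangeStatus Status { get; set; } = PendingChangeStatus.Queued;

    // The business payload that lets the whole flow be replayed later,
    // e.g. the form data serialized as JSON.
    public string PayloadJson { get; set; }
}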
Folks - apologies for rehashing this topic as I see even here on Stack, there are so many questions on the topic already.
But I find myself in an interesting place and I'm hoping you can help.
High-level question: can SQL Server have the leeway to decide that a view should be wrapped in ISOLATION LEVEL SNAPSHOT?
I know that sounds like a crazy question but I'm trying to exhaust all avenues per an issue I'm encountering.
I'm working with an application that runs 35 queries to retrieve data from another database via a linked server. Each query is a simple SELECT against a single table. All DB operations are carried out against SQL Server, and the retrieval code is ADO.NET/C#, etc.
34 of the queries work flawlessly - but there's this one bad apple, and for it, I get the transaction isolation level snapshot issue.
I've also tested data retrieval outside of the application and when I implement the below snippet on the "problem" query, I also get the issue:
using (var trans = conn.BeginTransaction(IsolationLevel.Snapshot))
However, when I do NOT implement it on said query, all is well.
I've also tested this against the other queries - with and without "Snapshot" - and my results are predictable... With "Snapshot" in place, no queries process... When not implemented, all queries process...
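For reference, the standalone test was essentially this (connection string and view name are placeholders):
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    // Requires ALLOW_SNAPSHOT_ISOLATION ON for the database; note that
    // snapshot transactions cannot span linked-server (remote) queries.
    using (var trans = conn.BeginTransaction(IsolationLevel.Snapshot))
    using (var cmd = new SqlCommand("SELECT * FROM dbo.ProblemView", conn, trans))
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read()) { /* consume rows */ }
        trans.Commit();
    }
}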
My results suggest that the application is responsible for changing up the data retrieval strategy.
Per their knowledge base, I found this: Locking is handled by the database level (MS SQL Server/Oracle) and not by "us". Generally, locking is row level but the optimizer may choose something different
Unfortunately I don't have native access to the boilerplate application code responsible for data retrieval. I suspect that this particular query/table has one or more keywords - either in the column or query/table naming - that trigger the application to use an alternate retrieval strategy. On the developer forums, I've asked if this is the case and I'm awaiting a reply...
Anyway, back to their mention of "the optimizer may choose something different": their optimizer, or perhaps the database optimizer? Can SQL Server be set up to make a "judgement call"? Is the statement unclear, or do I just not know enough about SQL Server and its capabilities?
I know it seems like a crazy question but I really want to knock out all possible avenues here.
Thank you kindly for suspending your disbelief and humoring this crazy post :)
Apparently objects with the word "valuation" in their name (perhaps because of the sensitive nature it implies) cause the application to build the transaction. Once I changed the view name, the data was returned to the client successfully.
So yes the application was/is the issue.
I have this ASP.NET MVC3 project that uses EF5 db-first for data access; a pattern my predecessor established was to introduce the following check for update / delete operations:
if (context.SaveChanges() == 0)
throw new SqlExecutionException("...");
I've recently come to realize that this approach fails when existing data is "updated" with unchanged data, like when a user opens an edit window and presses "OK" without actually changing anything; the number of changed records is then 0, the exception is thrown, and that's wrong.
Looking around, I've come to see the purpose of this check as mistaken: if the record has been deleted after being pulled from the DB, EF will throw a subtype of DataException to signal that, and I can't think of any other reason for this check to exist. My predecessor is out of reach, thus my question: can I safely purge these checks from my code and replace them with a filter for DataExceptions at a higher level?
Yes, this is a pattern from the C++ days. If there is a database constraint violation, EF (or any other ORM) will give you a good error message.
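A minimal sketch of the replacement (SqlExecutionException is the question's custom type; the catch covers the deleted-row case the old check was aiming at):
using System.Data.Entity.Infrastructure;

try
{
    // 0 affected rows just means nothing changed; that is not an error.
    context.SaveChanges();
}
catch (DbUpdateConcurrencyException)
{
    // Thrown when a row was deleted or modified after being loaded;
    // this exception ultimately derives from DataException.
    throw new SqlExecutionException("...");
}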
We have a system built using Entity Framework 5 for creating, editing and deleting data, but sometimes EF is too slow, or it simply isn't possible to use it (e.g. views which build data for tables based on users participating in certain groups in the database, etc.), and we have to use a stored procedure to update the data.
However, we've gotten ourselves into a situation where we have to save the changes through EF first, so the data is in the database, and then call the stored procedures. We can't use TransactionScope, as it always escalates to a distributed transaction and/or locks the table(s) for selects for the duration of the transaction.
We are also trying to introduce a DomainEvents pattern which queues events and raises them after SaveChanges, so we have the data we need in the DB; but then we may end up with the first part succeeding and the second part failing.
Are there any good ways to handle this or do we need to move away from EF entirely for this scenario?
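Note: EF6 added DbContext.Database.BeginTransaction, which keeps the EF writes and the stored procedure call on a single connection, so the transaction never escalates to a distributed one; a rough sketch (the procedure name is made up):
using (var tx = context.Database.BeginTransaction())
{
    context.SaveChanges();                  // EF writes

    context.Database.ExecuteSqlCommand(     // same connection and transaction
        "EXEC dbo.RebuildGroupData @GroupId = @p0", groupId);

    tx.Commit();                            // both succeed or both roll back
}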
I had a similar scenario. I later broke the process into small steps and used EF only, keeping each small step short. Even though the overall time is longer, the system is easier to maintain and scale. I also minimized joins, updated only the entity itself, and disabled EF's AutoDetectChangesEnabled and ValidateOnSaveEnabled.
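The configuration flags mentioned above, as a sketch (both live on DbContext.Configuration in EF5/EF6):
// Skip per-entity change scanning and validation while doing
// many small, simple updates.
context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;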
Sometimes, if you look at your problem in a different way, you may find a better solution.
Good luck!