Our team is developing a machine that will perform a physical process on a tray that holds vials of medical samples. The physical process will take approximately 1.5 hours. The tray and related vials are entities, loaded from a database using the Entity Framework. As the process runs, the device will update values on the entities. The changes may happen minutes or seconds apart. At the end of certain steps, between 10 and 45 minutes apart, we want to save those entities back to the database, and keep going.
Is it acceptable to have an Entity Framework context open for 1.5 hours? Can I make changes and save the entities multiple times during that time period using that context? If not, what is the best way to handle this?
Some ideas so far:
1) We could use the attach/detach capability. This should allow us to make changes to the entities outside of the context, then create a new context and attach the entity when we want to save, then detach it to continue working.
2) We could create a new context every time we want to change one of the entities. But I don't think we want to save every time we make a change.
3) We could copy the entities to business objects, and make the changes there. Then when we want to save, we would open a context and copy the changes into the entities, and save.
A combination of ideas 2 and 3 seems ideal.
First off, do not keep a context open for hours at a time. You can do this through configuration, but it just wastes resources: your operation runs for 90 minutes, while opening a connection takes roughly 3 milliseconds.
So just create a context as you need it. Next, keep in mind that although you open a context to gather data or maintain state, you do not actually need to save the data if it is not ready to be stored. You can just store it locally.
This is where idea 3 comes in, with local memory. Basically, you keep the data in local memory with an event handler attached. As the local copy changes, have the database update once the changes fall within some acceptable time window.
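A minimal sketch of that idea, assuming an EF6-style MyContext with a TrayRuns set and an entity type that implements INotifyPropertyChanged (all of these names are placeholders, not your actual model):

using System.ComponentModel;
using System.Data.Entity;

public class TrayRunMonitor
{
    private readonly TrayRun _run;   // entity loaded earlier, now living outside any context
    private bool _dirty;

    public TrayRunMonitor(TrayRun run)
    {
        _run = run;
        // Assumes the entity implements INotifyPropertyChanged.
        ((INotifyPropertyChanged)_run).PropertyChanged += (s, e) => _dirty = true;
    }

    // Call at the end of each step (every 10-45 minutes).
    public void PersistIfDirty()
    {
        if (!_dirty) return;

        using (var db = new MyContext())
        {
            db.TrayRuns.Attach(_run);
            db.Entry(_run).State = EntityState.Modified;  // mark the entity as changed
            db.SaveChanges();                             // one short-lived context and connection
        }
        _dirty = false;
    }
}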
Is it acceptable to have an Entity Framework context open for 1.5 hours?
UPDATE: Per the resources you link, if you allow EF to manage the opening and closing of the connection, it will open the connection as late as possible and close it as early as possible, so that the relatively costly database connection can be returned to the connection pool. The connection will only be held open for the duration that the context exists if you manually manage the database connection.
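For illustration, a small EF6 sketch of the difference (MyContext is a placeholder, and this is a fragment rather than a complete program):

using (var db = new MyContext())
{
    // Default: EF opens the connection just before each query or SaveChanges
    // and returns it to the connection pool immediately afterwards.
    var one = db.Database.SqlQuery<int>("SELECT 1").First();

    // Manual: if you open the connection yourself, EF leaves it open
    // until you close it or dispose the context.
    // db.Database.Connection.Open();
}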
At the end of certain steps, between 10 and 45 minutes apart, we want to save those entities back to the database, and keep going.
Note that if the client crashes for any reason, the changes kept in memory will be lost. Consider the impact of this when deciding whether you really want to wait that long before persisting your data.
If it is quite certain that this is, and will remain, an architecture with one or just a few clients writing to a dedicated database, I would opt to keep the code as simple as possible... trading a resource inefficiency that does not matter in this very specific case for a lower chance of programmer error.
I understand that you want to save data in batches and that it's no use to save individual values if a batch as a whole doesn't succeed.
Since resources are not the bottleneck and this is a dedicated system, it doesn't matter that a context lives relatively long, so I would use a context per batch. The context collects the data and concludes each batch by one SaveChanges call, which automatically saves one batch in one database transaction. Roughly, the code would look like this:
do
{
    // Start of a new batch.
    using (var db = new MyContext())
    {
        // Collect data into the context
        ...
        db.SaveChanges();
    }
} while (...); // While there are new batches
Database connections will be opened and closed when needed. SaveChanges will do this, but also any other database interaction you may need in-between. EF will never leave a connection open for longer than necessary.
Related
We have a process that needs to run every so often against a DB used by a web app, and we need to prevent all other updates during this process execution. Is there any global way to do this, maybe through nHibernate, .NET, or maybe directly in Oracle?
The original idea was to have a one-record DB table to indicate if the process is running or not, but with this we will need to go back to every single save/update method and make changes to verify if this record exist or not prior to the save/update call.
My reaction to that kind of requirement is to review the design, as it is highly unusual outside of doing application upgrades. Other than that, there are a couple of options:
Shut down the DB, open it in exclusive mode, make the changes, and then open it up for everyone.
Attempt to lock all the required tables with LOCK TABLE. That might generate deadlock exceptions depending on the order in which the locks are taken; a rough sketch follows below.
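For the LOCK TABLE option, this is roughly how it might look from .NET, assuming Oracle's managed ADO.NET provider (Oracle.ManagedDataAccess.Client) and a placeholder table name:

using (var conn = new OracleConnection(connectionString))
{
    conn.Open();
    using (var tx = conn.BeginTransaction())
    using (var cmd = conn.CreateCommand())
    {
        cmd.Transaction = tx;
        // Blocks other writers to this table until the transaction commits or rolls back.
        cmd.CommandText = "LOCK TABLE EVENT_DATA IN EXCLUSIVE MODE";
        cmd.ExecuteNonQuery();

        // ... perform the maintenance work here ...

        tx.Commit();   // releases the table lock
    }
}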
I'm trying to make entity framework work properly with my application. The scenario I have is something like the following: Say I have 8000 items, and each item has 100 components. The way it currently works is I eager load the 8000 items and lazy load the components for each item, because eager loading the entire thing would be too slow on application startup.
As far as I understand, in order to have lazy loading work, I need to keep the context alive for the whole application lifetime. So I have a single instance of the context that is opened on startup and closed on exit. I also use that to track changes and save changes.
However I've been reading about EF and many people advise against this approach, in favor of opening and closing contexts at each operation. My question is: how would you go about lazy loading properties, tracking changes, and saving changes if I cannot work with the same context?
Furthermore, I am already facing issues since I use different threads to load data in the background or save in the background (say it's saving; if I edit a tracked property, it raises an exception). I fixed some of them by using a FIFO queue (on a specific thread) for operations on the same context; however, tracked properties won't respect the queue.
Some help would be greatly appreciated as to how to use EF properly.
If I do a write to a (ms)SQL database and save the changes using SaveChangesAsync(), is there a possibility that a future read on the database could read unsaved changes?
Factors could include whether a different DbContext, thread, or process is used to access the database.
Short answer: NO (Tim P is misinformed).
Calling DbContext.SaveChangesAsync will automagically create a transaction for the duration of the save.
This means that if any other thread tries to access the table, one of a number of things can occur.
Generally it means that the database call on the other thread will block while the transaction is still uncommitted (that is, neither committed nor rolled back).
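To make that boundary visible, here is a minimal EF6 sketch with an explicit transaction (by default SaveChangesAsync wraps the save in one for you; MyContext is a placeholder and the snippet assumes it sits inside an async method):

using (var db = new MyContext())
using (var tx = db.Database.BeginTransaction(System.Data.IsolationLevel.ReadCommitted))
{
    // ... add or modify entities here ...
    await db.SaveChangesAsync();   // rows are written but still uncommitted;
                                   // other READ COMMITTED connections block or see the old data
    tx.Commit();                   // now the changes become visible to other readers
}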
Short answer: Yes.
It depends on several factors, such as how much you are saving to the database. If you're saving 1,000 rows/objects, and it's a slow database server, it's possible that this time window is wide enough that another thread is reading while (for example) row #879 has yet to be saved. And this has nothing to do with it being asynchronous. This is the normal concurrency problem in dealing with multi-user relational database systems.
I have process A that runs every 5 minutes and needs to write something in table "EventLog". This works all day long but at night there is another process B starting that needs to delete a lot of old data from this table. The table has millions of rows (blobs included) and many related tables (deletion by cascade) so process B runs up to ~45 minutes. While process B is running I get a lot of deadlock warnings for process A and I want to get rid of these.
The easy option would be "Don't run process A when process B is running" but there must be a better approach. I am using EntityFramework 6 and TransactionScope in both processes. I didn't find out how to set priority or something like that on my processes. Is this possible?
EDIT:
I forgot to say that I am already using one delete transaction per record, not one transaction for all records. Inside the loop I create a new DbContext and TransactionScope, so each record has its own transaction. My problem is that deleting a record still takes some time because of the related BLOBs and data in other related tables (let's say about 5 seconds per row). I still get deadlock situations when the deleting process (B) crosses the inserting process (A).
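For reference, this is roughly the per-record loop the edit describes (EF6 with System.Transactions; MyContext, EventLogs, and idsToDelete are placeholders, not the actual code):

foreach (var id in idsToDelete)
{
    using (var scope = new TransactionScope())
    using (var db = new MyContext())
    {
        var row = db.EventLogs.Find(id);
        if (row != null)
        {
            db.EventLogs.Remove(row);   // related rows are removed via the configured cascades
            db.SaveChanges();
        }
        scope.Complete();               // each record commits on its own
    }
}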
Transactions don't have priority. Deadlock victims are chosen by the database, most commonly based on things like how much work would be required to roll each one back. One way of avoiding a deadlock is to ensure that you block rather than deadlock, by accessing tables in the same order, and by taking locks up front at the level you will eventually need (for example, taking an UPDLOCK when reading the data, to avoid two queries each getting read locks and then one trying to escalate to a write lock). Ultimately, though, this is a tricky area, and something that takes 45 minutes to complete (please tell me that isn't a single transaction!) is always going to cause problems.
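A hedged sketch of taking the UPDLOCK up front (EF6 raw SQL against SQL Server; EventLog and cutoff are placeholders):

using (var db = new MyContext())
using (var tx = db.Database.BeginTransaction())
{
    // Take update locks while reading, so a competing writer waits here instead of
    // both sides acquiring shared locks and later deadlocking on lock escalation.
    var ids = db.Database
                .SqlQuery<int>("SELECT Id FROM EventLog WITH (UPDLOCK) WHERE Created < @p0", cutoff)
                .ToList();

    // ... update or delete those rows inside the same transaction ...

    tx.Commit();
}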
Rework process B so that it does not delete everything at once but in smaller batches that never take more than a minute. Run those in a loop until everything that needs to be deleted is gone.
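One way to sketch that batching (EF6 raw SQL against SQL Server; this assumes the cascades are defined in the database, and EventLog and cutoff are placeholders):

int deleted;
do
{
    using (var db = new MyContext())
    {
        // Delete at most 500 parent rows per round trip; the cascades remove the related rows.
        deleted = db.Database.ExecuteSqlCommand(
            "DELETE TOP (500) FROM EventLog WHERE Created < @p0", cutoff);
    }
    // Small pause so process A's inserts can squeeze in between batches.
    Thread.Sleep(TimeSpan.FromSeconds(2));
} while (deleted > 0);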
I'm running a number of threads which each attempt to perform INSERTs into one SQLite database. Each thread creates its own connection to the DB. They each create a command, open a transaction, perform some INSERTs, and then close the transaction. It seems that the second thread to attempt anything gets the following SQLiteException: The database file is locked. I have tried unwrapping the INSERTs from the transaction, as well as narrowing the scope of the INSERTs contained within each commit, with no real effect; subsequent access to the DB file raises the same exception.
Any thoughts? I'm stumped and I'm not sure where to look next...
Update your insertion code so that if it encounters an exception indicating a database lock, it waits a bit and tries again. Increase the wait time by random increments each time (the "random backoff" algorithm). This should allow the threads to each grab the global write lock in turn. Performance will be poor, but the code should work without significant modification.
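A sketch of that retry loop with System.Data.SQLite (connection, value, and the Samples table are placeholders; ideally you would also check that the exception really is the busy/locked error before retrying):

var rng = new Random();
int delayMs = 50;
for (int attempt = 0; attempt < 10; attempt++)
{
    try
    {
        using (var cmd = connection.CreateCommand())
        {
            cmd.CommandText = "INSERT INTO Samples (Value) VALUES (@v)";
            cmd.Parameters.AddWithValue("@v", value);
            cmd.ExecuteNonQuery();
        }
        break;                                       // success, stop retrying
    }
    catch (SQLiteException)                          // "The database file is locked"
    {
        Thread.Sleep(delayMs + rng.Next(delayMs));   // wait a bit, with random jitter
        delayMs = Math.Min(delayMs * 2, 2000);       // back off, capped at 2 seconds
    }
}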
However, SQLite is not appropriate for highly-concurrent modification. You have two permanent solutions:
Move to a "real" database, such as PostgreSQL or MySQL
Serialize all your database modifications through one thread, to avoid fighting over SQLite's single write lock (a sketch follows below).
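A minimal sketch of that second option, funnelling every write through a single consumer thread with a BlockingCollection (connectionString and the Samples table are placeholders):

var writeQueue = new BlockingCollection<Action<SQLiteConnection>>();

var writerThread = new Thread(() =>
{
    using (var conn = new SQLiteConnection(connectionString))
    {
        conn.Open();
        foreach (var work in writeQueue.GetConsumingEnumerable())
            work(conn);   // only this thread ever touches the database
    }
});
writerThread.Start();

// Any other thread just queues its insert instead of opening its own connection:
writeQueue.Add(conn =>
{
    using (var cmd = conn.CreateCommand())
    {
        cmd.CommandText = "INSERT INTO Samples (Value) VALUES (42)";
        cmd.ExecuteNonQuery();
    }
});

// Call writeQueue.CompleteAdding() at shutdown so the writer thread can exit.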
Two things to check:
1) Confirm that your version of SQLite was compiled with THREAD support
2) Confirm that you are not opening the database EXCLUSIVE
I was not doing this in C#, but rather in Android, and I got around this "database is locked" error by keeping the SQLite database open within the wrapper class that owns it, for the entire lifetime of that class. Each insert done within this class can then run in its own thread (because, depending on your data storage situation, SD card versus device memory, etc., DB writes could take a long time). I even tried throttling it, making about a dozen insert threads at once, and each one was handled very well because the insert method didn't have to worry about opening/closing a DB.
I'm not sure whether a persistent DB life-cycle is considered good style (it may be considered bad in most cases), but for now it's working pretty well.
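Translated into C# terms, the same pattern might look roughly like this (System.Data.SQLite assumed; the class, table, and connection string are placeholders):

public sealed class SampleStore : IDisposable
{
    private readonly SQLiteConnection _conn;
    private readonly object _writeLock = new object();

    public SampleStore(string connectionString)
    {
        _conn = new SQLiteConnection(connectionString);
        _conn.Open();                     // opened once, reused for every insert
    }

    public void Insert(string value)
    {
        lock (_writeLock)                 // serialize writers on the shared connection
        {
            using (var cmd = _conn.CreateCommand())
            {
                cmd.CommandText = "INSERT INTO Samples (Value) VALUES (@v)";
                cmd.Parameters.AddWithValue("@v", value);
                cmd.ExecuteNonQuery();
            }
        }
    }

    public void Dispose()
    {
        _conn.Dispose();
    }
}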