Entity Framework and ADO.NET with Unit of Work pattern - c#

We have a system built on Entity Framework 5 for creating, editing and deleting data. The problem is that sometimes EF is too slow, or it simply isn't possible to use Entity Framework (for example, views that build table data based on which groups a user participates in), so we have to use a stored procedure to update the data.
This has left us with a problem: we have to save the changes through EF so that the data is in the database, and only then call the stored procedures. We can't use TransactionScope because it always escalates to a distributed transaction and/or locks the table(s) for selects during the transaction.
We are also trying to introduce a DomainEvents pattern that queues events and raises them after the changes are saved, so that the data we need is in the DB. But then the first part may succeed and the second part fail.
Are there any good ways to handle this or do we need to move away from EF entirely for this scenario?

I had a similar scenario. I ended up breaking the process into small ones that use EF only, and kept each small process short. The overall time is longer, but the system is easier to maintain and scale. I also minimized joins, updated only the entity itself, and disabled EF's AutoDetectChangesEnabled and ValidateOnSaveEnabled.
Sometimes, if you look at your problem in a different way, you may find a better solution.
Good luck!
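On the question's worry that the save can succeed while the stored procedure fails: one option is to run both on the same connection inside one local transaction, which avoids a distributed transaction. The sketch below illustrates only that shape, using Python's built-in sqlite3 module as a stand-in for the real database. The tables `users` and `user_summary`, and the function names, are hypothetical. In EF 6 (not EF 5) the comparable calls are Database.BeginTransaction() and Database.ExecuteSqlCommand() on the same context.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("CREATE TABLE user_summary (user_count INTEGER NOT NULL)")
    return conn


def save_and_recalculate(conn, user_name, fail_in_second_step=False):
    """Insert a row, then rebuild derived data, in one local transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        # Step 1: the kind of write the ORM performs on "save changes".
        conn.execute("INSERT INTO users (name) VALUES (?)", (user_name,))
        # Step 2: stands in for the stored procedure that rebuilds derived data.
        if fail_in_second_step:
            raise RuntimeError("simulated stored procedure failure")
        conn.execute("DELETE FROM user_summary")
        conn.execute(
            "INSERT INTO user_summary (user_count) SELECT COUNT(*) FROM users"
        )
```

If the second step raises, the insert from the first step is rolled back with it, so queued domain events can be raised only after the whole unit has committed.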

Related

Transactional Systems: archiving data from databases with Entity Framework

I've written a small tool for archiving data from my Entity Framework code-first driven database.
I'm testing it thoroughly and have reached the point of trying it with large amounts of data, which is where the problems start. For example, I sometimes get timeouts or exceptions like this:
Deadlock found when trying to get lock; try restarting transaction.
I know what transactions are, and I guess Entity Framework creates one for all of the changes in a DbContext, so that if any of them (or the entire thing) fails when SaveChanges() is called, nothing is actually changed. (Short side question: can I then simply run SaveChanges() again?)
What I want to know is this: since I need to delete different batches of information throughout my database (after exporting it), I'm constantly creating a new DbContext for each of those batches.
Should I create transactions manually for every batch and commit them all at once at the very end?
I'm studying Informatics and learn about transactional information systems in one of my courses. How is it possible with Entity Framework to create a meta transaction for all my single transactions when deleting batches of data, so all the data that is spread throughout the database is only really deleted when everything worked well, like this:
Or is there a better way to solve the entire thing?
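The "meta transaction" asked about here can be as simple as one ordinary transaction that spans every batch and commits once at the end. The following is a minimal sketch of that idea, not an Entity Framework API: it uses Python's sqlite3 module, and the `orders` and `invoices` tables and the `archive_batches` function are made up for illustration.

```python
import sqlite3


def make_db():
    """Build a small in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO orders (id) VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO invoices (id) VALUES (?)", [(10,)])
    conn.commit()
    return conn


def archive_batches(conn, batches):
    """Delete every batch in one transaction; any failure undoes all of them.

    `batches` is a list of (table_name, ids) pairs. Table names are assumed
    to come from trusted code, never from user input.
    """
    deleted = 0
    with conn:  # a single commit at the very end, rollback on any exception
        for table, ids in batches:
            for row_id in ids:
                cur = conn.execute(f"DELETE FROM {table} WHERE id = ?", (row_id,))
                if cur.rowcount != 1:
                    raise LookupError(f"{table} row {row_id} not found")
                deleted += 1
    return deleted
```

The trade-off is that one long transaction holds its locks for longer, which can make the timeouts and deadlocks described above more likely. Since a rollback leaves the data untouched, the whole call can be retried.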

Continuous delivery and database schema changes with entity framework

We want to progress towards continuous delivery of our application into production. We currently deploy to Azure, use table/blob storage and have an Azure SQL database, which we access with Entity Framework.
As the database schema changes we want to be able to automatically apply the schema changes to the production database, but as this will happen whilst the application is live and the code changes are being deployed to many nodes at the same time we are not sure what the correct approach is.
After some reading it seems (and this makes sense) that the application needs to be tolerant of the 2 different database schema versions, so that it doesn't matter if it's an old version of the code or a new version of the code which sees the database, however I'm not sure what the best way to approach handling this in the application is, using the entity framework.
Should we have versioned instances of the EF generated classes in the code which know how to access a specific version of the schema? What happens when the schema is updated and an old version of the code is running against the database?
Our entity framework classes are mapped to views in specific schemas in the db and nothing is mapped to the underlying tables, so potentially this could allow us to create v1 views which the old code uses and v2 views which the new code uses, but maintaining this feels like it would be a bit of a nightmare (it's already enough of a pain simply maintaining the EF mappings to views rather than tables).
So what are best practices in this area? What do others do to solve this problem?
Whether you use EF or not, maintaining the code's ability to work with 2 consecutive versions of the database is a good (and perhaps the only viable) approach here.
Here are some ways we handle specific types of migrations:
When adding a column, we can typically just add the column (with a default constraint if non-nullable) and not worry about the code. EF will never issue a "SELECT *", so it will be able to continue to function properly while ignoring the new column. Similarly, adding a table is easy.
When removing a column or table, simply keep that column around 1 version longer than you would have otherwise.
For more complex migrations (e.g. completely changing the structure of a table or a segment of the data model), deploy the new model alongside backwards-compatibility views (or tables with triggers to keep them in sync), which live for as long as the code that references them. As you say, this can be a lot of work depending on the complexity of the migration, but it sounds like you are already well-positioned to do this because your EF entities point to views anyway. On the other hand, the benefit of this work is that you have more time to do the code migration. If you have a large codebase, this could be really beneficial in allowing you to migrate the data model to fit the needs of new features while still supporting old features without major code changes.
As a side note, the difficulty of data migration often makes us push the finalized data model as far back as possible in the development schedule. With EF, you can write and test a lot of code before the data model is finalized (we use code-first to generate a sample SQL Express database in unit tests, even though our production database is not maintained by code-first). That way, we make fewer incremental changes to the production data model once a new feature is released.
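The backwards-compatibility view idea can be sketched as follows. This is only an illustration in Python with the built-in sqlite3 module rather than Azure SQL. The `v1_person` and `v2_person` views, the underlying tables, and the name-splitting rule (exactly one space per name) are all assumptions made up for the example.

```python
import sqlite3

# Queries issued by the old and the new release of the application.
OLD_CODE_QUERY = "SELECT id, name FROM v1_person ORDER BY id"
NEW_CODE_QUERY = "SELECT id, first_name, last_name FROM v2_person ORDER BY id"


def create_v1(conn):
    """Version 1: one `name` column, exposed to the code through a view."""
    conn.execute(
        "CREATE TABLE person_v1_data (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute("CREATE VIEW v1_person AS SELECT id, name FROM person_v1_data")
    conn.execute("INSERT INTO person_v1_data (name) VALUES ('Ada Lovelace')")
    conn.commit()


def migrate_to_v2(conn):
    """Version 2: split the name, then re-point the v1 view at the new table."""
    conn.execute(
        "CREATE TABLE person_v2_data ("
        "id INTEGER PRIMARY KEY, first_name TEXT NOT NULL, last_name TEXT NOT NULL)"
    )
    # Assumes every stored name contains exactly one space.
    conn.execute(
        "INSERT INTO person_v2_data (id, first_name, last_name) "
        "SELECT id, substr(name, 1, instr(name, ' ') - 1), "
        "substr(name, instr(name, ' ') + 1) FROM person_v1_data"
    )
    conn.execute(
        "CREATE VIEW v2_person AS "
        "SELECT id, first_name, last_name FROM person_v2_data"
    )
    # The old view keeps its name and shape but now reads from the new table.
    conn.execute("DROP VIEW v1_person")
    conn.execute(
        "CREATE VIEW v1_person AS "
        "SELECT id, first_name || ' ' || last_name AS name FROM person_v2_data"
    )
    conn.commit()
```

Nodes still running the old release keep issuing `OLD_CODE_QUERY` unchanged, and the `v1_person` view can be dropped one release later, once no deployed code references it.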

How entity framework works for large number of records? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I already see an unanswered question here on.
My question is -
Is EF really production ready for large application?
The question originated from these underlying questions -
1. EF pulls all the records into memory then performs the query operation. How EF would behave when table has around ~1000 records?
2. For simple edit I have to pull the record edit it and then push to db using SaveChanges()
I faced a similar situation: we had a large database with many tables of 7-10 million records each, and we used Entity Framework to display the data. To get good performance, here is what I learned. My 10 golden rules for Entity Framework:
1. Understand that a call to the database is made only when the actual records are required. Everything before that just builds the query (SQL), so fetch only the piece of data you need rather than requesting a large number of records. Trim the fetch size as much as possible.
2. Yes, use stored procedures where necessary (in some cases they are the better choice, and they are not as evil as some would have you believe). Import them into your model and create function imports for them. You can also call them directly with ExecuteStoreCommand() and ExecuteStoreQuery<>(). The same goes for functions and views, although EF has a really odd way of calling functions: "SELECT dbo.blah(@id)".
3. EF is slower when it has to populate an entity with a deep hierarchy. Be extremely careful with such entities.
4. When you request records that you do not need to modify, tell EF not to watch for property changes (AutoDetectChanges). Record retrieval is much faster that way.
5. Indexing the database is always good, but with EF it becomes very important. The columns you use for retrieval and sorting should be properly indexed.
6. When your model is large, the VS2010/VS2012 model designer gets really crazy, so break your model into medium-sized models. The limitation is that entities from different models cannot be shared, even though they may point to the same table in the database.
7. When you have to change the same entity in different places, use the same entity instance, make the changes and save it only once. The point is to AVOID retrieving the same record, changing it and saving it multiple times. (A real performance gain.)
8. When you need the information in only one or two columns, try not to fetch the full entity. You can either execute your SQL directly or use a mini entity of some kind. You may also need to cache some frequently used data in your application.
9. Transactions are slow. Be careful with them.
10. SQL Profiler, or any query profiler, is your friend. Run it while developing your application to see what EF sends to the database. When you perform a join using LINQ or a lambda expression in your application, EF usually generates a Select-Where-In-Select style query, which may not always perform well. If you find such a case, roll up your sleeves, perform the join in the DB and have EF retrieve the results. (I forgot this one, the most important one!)
If you keep these things in mind, EF should give almost the same performance as plain ADO.NET, if not the same.
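The advice on indexing and on inspecting the generated query can be tried out on a small scale. The sketch below is not EF or SQL Profiler: it uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for looking at an execution plan. The `customers` table, the index name and the helper functions are hypothetical.

```python
import sqlite3

# Select only the column that is needed, filtered on the column to be indexed.
SQL = "SELECT name FROM customers WHERE city = ?"


def make_db():
    """Create an in-memory table with 1000 hypothetical customer rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, name TEXT, city TEXT, notes TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers (name, city, notes) VALUES (?, ?, ?)",
        [(f"customer {i}", f"city {i % 50}", "x" * 100) for i in range(1000)],
    )
    conn.commit()
    return conn


def query_plan(conn, sql, params=()):
    """Return the planner's description of how `sql` would be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


def add_city_index(conn):
    """Index the column used for retrieval."""
    conn.execute("CREATE INDEX idx_customers_city ON customers (city)")
    conn.commit()
```

Calling `query_plan(conn, SQL, ("city 7",))` before and after `add_city_index(conn)` shows the plan changing from a full table scan to a search that uses the index.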
1. EF pulls all the records into memory then performs the query operation. How EF would behave when table has around ~1000 records?
That's not true! EF fetches only the necessary records, and queries are transformed into proper SQL statements. EF can cache objects locally within the context (and track all changes made to entities), but as long as you keep the context open only while it is needed, this won't be a problem.
2. For simple edit I have to pull the record edit it and then push to db using SaveChanges()
It's true, but I would not bother working around it unless you really see performance problems. Because 1. is not true, only one record is fetched from the DB before it is saved. You can bypass the fetch by writing the SQL statement yourself and sending it as a plain string.
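To make the difference between the two paths concrete, here is a small sketch in Python with sqlite3 (a stand-in, not EF). The `customers` table and both function names are invented for the example. The first function loads the row and then writes it back; the second sends one parameterized UPDATE without loading anything.

```python
import sqlite3


def make_db():
    """Create an in-memory table holding one hypothetical customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ann')")
    conn.commit()
    return conn


def rename_load_then_save(conn, customer_id, new_name):
    """Fetch the record first, then write it back: two round trips."""
    row = conn.execute(
        "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return 0
    with conn:
        cur = conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?", (new_name, row[0])
        )
    return cur.rowcount


def rename_direct(conn, customer_id, new_name):
    """Send a single parameterized UPDATE: one round trip, nothing loaded."""
    with conn:
        cur = conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?", (new_name, customer_id)
        )
    return cur.rowcount
```

Both functions pass values as parameters rather than concatenating them into the SQL string, which is worth keeping when the statement is written by hand.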
EF translates your LINQ query into an SQL query, so it doesn't pull all records into memory. The generated SQL might not always be the most efficient, but a thousand records won't be a problem at all.
Yes, that's one way of doing it (assuming you only want to edit one record). If you are changing several records, you can get them all using one query and SaveChanges() will persist all of those changes.
EF is not a bad ORM framework. It is a different one with its own characteristics. Compare Microsoft Entity Framework 6, against say NetTiers which is powered by Microsoft Enterprise Library 6.
These are two entirely different beasts. The accepted answer is really good because it goes through the nuances of EF6. What's key to understand is that each ORM has its own strengths and weaknesses. Compare the project requirements and its data access patterns against the ORM's behavior patterns.
For Example: NetTiers will always give you higher raw performance than EF6. However that is primarily because it is not a point and click ORM and as part and parcel of generating the ORM you will be optimizing your data model, adding custom stored procedures where relevant, etc... if you engineered your data model with the same effort for EF6 you could probably get close to the same performance.
Also consider whether you can modify the ORM. For example, with NetTiers you can add extensions to the CodeSmith templates to include your own design patterns over and above what is generated by the base ORM library.
Also consider EF6 makes significant use of reflection whereas NetTiers or any library powered by Microsoft Enterprise Library will make heavy use of Generics instead. These are two entirely different approaches. Why so? Because EF6 is based on dynamic reflection whereas NetTiers is based on static reflection. Which is faster and which is better entirely depends on the usage patterns that will be required of the ORM.
Sometimes a hybrid approach works better: Consider for example EF6 for Web API OData endpoints, A few large tables wrapped with NetTiers & Microsoft Enterprise Library with custom stored procedures, and a few large masterdata tables wrapped with a custom built write through object cache where on initial load the record set is streamed into the memory cache using an ADO data reader.
These are all different and they all have their best fit scenarios: EF6, NetTiers, NHibernate, Wilson OR Mapper, XPO from Dev Express, etc...
There is no simple answer for your question. The main thing is about what you want to do with your data? And do you need so much data at one time?
EF translates your queries to SQL, so at that point there are no objects in memory. Once you get the data, the selected records are in memory. If you select a large number of large objects, it can be a performance killer if you need to manipulate them all.
If you don't need to manipulate them all you can disable change tracking and enable it later for single objects you need to manipulate.
So you see it depends on your type of application.
If you need to manipulate a mass of data efficiently, then don't use an OR mapper!
Otherwise EF is fine, but consider how many objects you really need at one time and what you want to do with them.

Is EF or SQL the better choice to audit data changes?

The requirement seems simple: when data changes, audit the changes.
Here are some important pieces of the equation:
The Data in my application spans multiple tables (some cross ref. tables).
My DTO is deep, with Navigation Properties conditionally populated.
When loaded, I copy the original DTO with its "original values".
When saved is requested, the original DTO contains the changes.
Ideally, foreign keys will read like useful text not Id numbers.
Unlike TFS' cool history feature, mine seems more complicated because of the many related tables and conditional child entities.
I see three possibilities (so far):
I could use C# to reflect the objects and create a before/after record.
I could use triggers in SQL 2008R2 to catch changes and coalesce a before/after record.
I could store the raw before/after objects and let SQL 2008R2 parse them.
Please note: right now, it seems to me that SQL 2008R2's CDC is far too heavy an option. I am really looking for something I can build, but I admit my mind is open to anything right now.
My question
Before I get started building this:
How does everybody else handle auditing a complex EF DTO?
Is there a low(ish)-tech solution available?
Thank you in advance.
Related, but not-completely-related questions already on StackOverflow: Implementing Audit Log / Change History with MVC & Entity Framework and Create Data Audit in SQL Server and https://stackoverflow.com/questions/5773419/how-to-audit-many-to-many-relationship-in-entity-framework and Maintaining audit log for entities split across multiple tables and Linq to SQL Audit Trail / Audit Log: should I use triggers or doddleaudit? do not provide an answer.
IF audit is a real requirement I would opt for the trigger solution... since the other methods have several "shortcomings":
"blind" to any changes happening through other means than your application
if you make some code changes and forget about adding the audit code the audit trail gets "blind spots"
The trigger-based solution can be secured so that only special users can even see the audited data...
I usually work with Oracle but from my experience in such situations: allow the app only SELECT rights via Views , any insert/delete/update should be done via Stored procedures and audit trail should be done via triggers...
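A trigger-based audit trail along these lines might look like the sketch below. It is written against SQLite through Python's sqlite3 module purely so that it runs anywhere; SQL Server and Oracle trigger syntax differs. The `product` and `product_audit` tables and the trigger name are hypothetical.

```python
import sqlite3


def make_audited_db():
    """Create a table plus a trigger that records every price change."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            price REAL NOT NULL
        );
        CREATE TABLE product_audit (
            audit_id INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL,
            old_price REAL,
            new_price REAL,
            changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        -- Fires for any UPDATE that sets price, whichever client issued it.
        CREATE TRIGGER product_price_audit
        AFTER UPDATE OF price ON product
        WHEN OLD.price <> NEW.price
        BEGIN
            INSERT INTO product_audit (product_id, old_price, new_price)
            VALUES (OLD.id, OLD.price, NEW.price);
        END;
        """
    )
    return conn
```

Because the trigger lives in the database, an update made outside the application (a script, an admin tool) is audited too, which is the "blind spot" argument made above.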
I've recently implemented an audit log manager on top of Entity Framework. When I instantiate my audit manager, I reflect all of the entity classes, and store the property information. Then within the object context SavingChanges event, I audit all of the changes. It works great. In the case of foreign keys, I just store their Id's before and after during changes.
The nice thing about this solution is that it doesn't require any extra coding. Once you create a log manager of sort, you don't have to worry about adding new triggers, or modifying triggers when new columns are added. Any changes to your entity classes will automatically be picked up when reflecting the classes.
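The reflection idea described here (walk the properties of a "before" and an "after" object and record what differs) can be sketched independently of EF. The example below is an assumption-laden illustration in Python using dataclasses, not the answerer's audit manager. The `Customer` and `Address` types and the `diff` function are invented for it.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Address:
    city: str
    zip_code: str


@dataclass
class Customer:
    id: int
    name: str
    address: Address


def diff(before, after, prefix=""):
    """Return (path, old, new) for every field that differs, recursing into children."""
    changes = []
    for f in fields(before):  # reflect over the declared fields
        old, new = getattr(before, f.name), getattr(after, f.name)
        path = prefix + f.name
        if is_dataclass(old) and is_dataclass(new):
            changes.extend(diff(old, new, path + "."))
        elif old != new:
            changes.append((path, old, new))
    return changes
```

Because the fields are discovered at run time, a property added to the class later is picked up without touching the audit code, which mirrors the benefit claimed above.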
Well, let's see. SQL Server auditing already exists, comes with tools, is probably already known by your DBAs, doesn't slow down your app, and can trace events that the application itself will never even see.
On the other hand, rolling your own in EF will allow you to audit non-SQL Server data sources. It also doesn't require EE.
Trigger Solution, Pros:
Cannot bypass the audit
Trigger Solution, Cons:
Cannot audit non SQL data
Cannot audit complex objects on insert
Entity Framework, Pros:
Can audit everything
Can audit complex objects in any state
Entity Framework, Cons:
Can be bypassed (like direct-to-SQL)
Requires a copy of original values
My choice is Entity Framework. Using STE makes it easier.
Either way you have to roll your own.

Why creating Tables in run-time (code behind) is bad?

People suggest creating database table dynamically (or, in run-time) should be avoided, with the saying that it is bad practice and will be hard to maintain.
I don't see the reason why, and I don't see difference between creating table and any another SQL query/statement such as SELECT or INSERT. I wrote apps that create, delete and modify database and tables in run time, and so far I do not see any performance issues.
Can anyone explain the cons of creating databases and tables at run time?
Tables are much more complex entities than rows, and managing table creation is much more complex than an insert, which has to abide by an existing model (the table). True, a CREATE TABLE statement is a standard SQL operation, but depending on creating tables dynamically smacks of bad design decisions.
Now, if you just create one or two and that's it, or an entire database dynamically, or from a script once, that might be ok. But if you depend on having to create more and more tables to handle your data you will also need to join more and more and query more and more. One very serious issue I encountered with an app that made use of dynamic table creation is that a single SQL Server query can only involve 255 tables. It's a built-in constraint. (And that's SQL Server, not CE.) It only took a few weeks in production for this limit to be reached resulting in a nonfunctioning application.
And if you get into editing the tables, e.g. adding/dropping columns, then your maintenance headache gets even worse. There's also the matter of binding your db data to your app's logic. Another issue is upgrading production databases. This would really be a challenge if a db had been growing with objects dynamically and you suddenly needed to update the model.
When you need to store data in such a dynamic manner the standard practice is to make use of EAV models. You have fixed tables and your data is added dynamically as rows so your schema does not have to change. There are drawbacks of course but it's generally thought of as better practice.
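A minimal EAV layout could look like the sketch below, written with Python's sqlite3 module for illustration. The `entity` and `entity_value` tables and the helper functions are hypothetical, and every value is stored as text, which is one of the drawbacks alluded to above.

```python
import sqlite3


def make_db():
    """Two fixed tables: new attributes become rows, not columns or tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, kind TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE entity_value ("
        "entity_id INTEGER NOT NULL REFERENCES entity(id), "
        "attribute TEXT NOT NULL, "
        "value TEXT, "
        "PRIMARY KEY (entity_id, attribute))"
    )
    conn.commit()
    return conn


def set_value(conn, entity_id, attribute, value):
    """Insert or overwrite one attribute of an entity (stored as text)."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO entity_value (entity_id, attribute, value) "
            "VALUES (?, ?, ?)",
            (entity_id, attribute, str(value)),
        )


def load(conn, entity_id):
    """Return all attributes of an entity as a dict of strings."""
    rows = conn.execute(
        "SELECT attribute, value FROM entity_value WHERE entity_id = ?",
        (entity_id,),
    )
    return dict(rows.fetchall())
```

Adding an attribute that did not exist before is just another call to `set_value`, so the schema never changes at run time. The cost is that typed queries and constraints on individual attributes become harder.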
KMC,
Remember the following points:
What if you want to add or remove a column? You may need to change the code and compile it again.
What if the database location changes?
Developers who are not very good with databases can make changes. If you create the schema at the back end, DBAs can take care of it.
If you get any performance issues, they may be tough to debug.
You will need to be a little clearer about what you mean by "creating tables".
One reason to not allow the application to control table creation and deletion is that this is a task that should be handled only by an administrator. You don't want normal users to have the ability to delete whole tables.
Temporary tables are a different story, and you may need to create temporary tables as part of your queries, but your basic database structure should be managed only by someone with the rights to do so.
Sometimes creating tables dynamically is not the best option security-wise (Google "SQL injection"); it would be better to use stored procedures and have your insert or update operations occur at the database level by executing the stored procedures from code.
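The injection risk with dynamic table creation comes from the fact that identifiers such as table names cannot be passed as query parameters, so they end up concatenated into the statement. If tables really must be created from code, one common mitigation is a strict whitelist check on the name. The sketch below shows that check in Python with sqlite3. The function name, the naming rule and the column layout are assumptions for illustration.

```python
import re
import sqlite3

# Letters, digits and underscores only, not starting with a digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,62}")


def create_user_table(conn, table_name):
    """Create a table only if its name passes a strict identifier check."""
    if not _IDENTIFIER.fullmatch(table_name):
        raise ValueError(f"unsafe table name: {table_name!r}")
    # Safe to interpolate: the name can contain no quotes, spaces or semicolons.
    conn.execute(
        f'CREATE TABLE "{table_name}" (id INTEGER PRIMARY KEY, payload TEXT)'
    )
    conn.commit()
```

A name such as `x; DROP TABLE users` is rejected before any SQL is built. This does not address the permission concern raised above: the application account still needs rights to create tables.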
