Does Entity Framework Use Reflection and Hurt Performance? - c#

I ultimately have two areas with a few questions each about Entity Framework, but let me give a little background so you know what context I am asking for this information in.
At my place of work my team is planning a complete re-write of our application structure so we can adhere to more modern standards. This re-write includes a completely new data layer project. In this project most of the team wants to use Entity Framework. I too would like to use it because I am very familiar with it from my using it in personal projects. However, one team member is opposed to this vehemently, stating that Entity Framework uses reflection and kills performance. His other argument is that EF uses generated SQL that is far less efficient than stored procedures. I'm not so familiar with the inner-workings of EF and my searches haven't turned up anything extremely useful.
Here are my questions. I've tried to make them as specific as possible. If you need some clarification please ask.
Issue 1 Questions - Reflection
Is this true about EF using reflection and hurting performance?
Where does EF use reflection if it does?
Are there any resources that compare performance? Something that I could use to objectively compare technologies for data access in .NET and then present it to my team?
Issue 2 Questions - SQL
What are the implications of this?
Is it possible to use stored procedures to populate EF entities?
Again are there some resources that compare the generated queries with stored procedures, and what the implications of using stored procedures to populate entities (if you can) would be?
I did some searching on my own but didn't come up with too much about EF under the hood.

Yes, it does like many other ORMs (NHibernate) and useful frameworks (DI tools). For example WPF cannot work without Reflection.
While the performance implications of using Reflection has not changed much over the course of the last 10 years since .NET 1.0 (although there has been improvements), with the faster hardware and general trend towards readability, it is becoming less of a concern now.
Remember that main performance hit is at the time of reflecting aka binding which is reading the type metadata into xxxInfo (such as MethodInfo) and this happens at the application startup.
Calling reflected method is definitely slower but is not considered much of an issue.
UPDATE
I have used Reflector to look at the source code of EF and I can confirm it heavily uses Reflection.

Answer for Issue 1:
You can take a look at exactly what is output by EF by examining the Foo.Designer.cs file that is generated. You will see that the resulting container does not use reflection, but does make heavy use of generics.
Here are the places that Entity Framework certainly does use reflection:
The Expression<T> interface is used in creating the SQL statements. The extension methods in System.Linq are based around the idea of Expression Trees which use the types in System.Reflection to represent function calls and types, etc.
When you use a stored procedure like this: db.ExecuteStoreQuery<TEntity>("GetWorkOrderList #p0, #p1", ...), Entity Framework has to populate the entity, and at very least has to check that the TEntity type provided is tracked.
Answer for Issue 2:
It is true that the queries are often strange-looking but that does not indicate that it is less efficient. You would be hard pressed to come up with a query whose acutal query plan is worse.
On top of that, you certainly can use Stored Procedures, or even Inline SQL with entity framework, for querying and for Creating, Updating and Deleting.
Aside:
Even if it used reflection everywhere, and didn't let you use stored procedures, why would that be a reason not to use it? I think that you need to have your coworker prove it.

I can comment on Issue 2 about Generated EF Queries are less efficient than Stored Procedures.
Basically yes sometimes the generated queries are a mess and need some tuning. There are many tools to help you correct this, SQL Profiler, LinqPad, and others. But in the end the Generated Queries may look like crap but they do typically run quickly.
Yes you can map EF entities to Procedures. This is possible and will allow you to control some of the nasty generated EF queries. In turn you could also map views to your entities allowing you to control how the views select the data.
I cannot think of resources but I must say this. The comparison to using EF vs SQL stored procedures is apples to oranges. EF provides a robust way of mapping your Database to your code directly. This combined with LINQ to Entity queries will allow your developers to quickly produce code. EF is an ORM where as SQL store procedures is not.

The entity framework likely uses reflection, but I would not expect this to hurt performance. High-end librairies that are based on reflection typically use light-weight code generation to mitigate the cost. They inspect each type just once to generate the code and then use the generated code from that point on. You pay a little when your application starts up, but the cost is negligible from there on out.
As for stored procedures, they are faster than plain old queries, but this benefit is often oversold. The primary advantage is that the database will precompile and store the execution plan for each stored procedure. But the database will also cache the execution plans it creates for plain old SQL queries. So this benefit varies a great deal depending on the number and type of queries your application executes. And yes, you can use stored procedures with the entity framework if you need to.

I don't know if EF uses reflection (I don't believe it does... I can't think of what information would have to be determined at run-time); but even if it did, so what?
Reflection is used all over the place in .NET (e.g. serializers), some of which are called all of the time.
Reflection really isn't all that slow; especially in the context of this discussion. I imagine the overhead of making a database call, running the query, returning it and hydrating the objects dwarf the performance overhead of reflection.
EDIT
Article by Rick Strahl on reflection performance: .Net Reflection and Performance(old, but still relevant).

Entity Framework generated sql queries are fine, even if they are not exactly as your DBA would write by hand.
However it generates complete rubbish when it comes to queries over base types. If you are planning on using a Table-per-Type inheritance scheme, and you anticipate queries on the base types, then I would recommend proceeding with caution.
For a more detailed explanation of this bizarre shortcoming see my question here. take special note of the accepted answer to that question --- this is arguably a bug.
As for the issue of reflection -- I think your co-worker is grasping at straws. It's highly unlikely that reflection will be the root cause of any performance problems in your app.

EF uses reflection. I didn't check it by I think it is the main point of the mapping when materializing instance of entity from database record. You will say what is a column's name and what is the name of a property and when a data reader is executed you must somehow fill the property which you only know by its name.
I believe that all these "performance like" issues are solved correctly by caching needed information for mapping. Performance impact caused by reflection will probably be nothing comparing to network communication with the server, executing the complex query, etc.
If we talk about EF performance check these two documents: Part 1 and Part2. It describes some reason why people sometimes think the EF is very slow.
Are stored procedures faster then EF queries? My answer: I don't think so. The main advantage of the linq is that you can build your query in the application. This is extreamly handy when you need anything like list filtering, paging and sorting + changing number of displayed columns, loading realted entities etc. If you want to implement this in a stored procedure you will either have tens of procedures for diffent query configurations or you will use a dynamic sql. Dynamic sql is exactly what uses EF. But in case of EF that query has compile time validation which is not the case of plain SQL. The only difference in performance is the amount of transfered data when you send whole query to the server and when you send only exec procecdeure and parameters.
It is true that queries are sometimes very strange. Especially inheritance produces sometimes bad queries. But this is already solved. You can always use custom stored procedure or SQL query to return entities, complex types or custom types. EF will materialize results for you so you don't need to bother with data readers etc.
If you want to be objective when presenting Entity framework you should also mention its cons. Entity framework doesn't support command batching. If you need to update or insert multiple entities each sql command have its own roundrip to database. Because of that you should not use EF for data migrations etc. Another problem with EF is almost no hooks for extensibility.

Related

Stored procedure vs. Entity Framework (LINQ queries) [duplicate]

I've used the entity framework in a couple of projects. In every project, I've used stored procedures mapped to the entities because of the well known benefits of stored procedures - security, maintainability, etc. However, 99% of the stored procedures are basic CRUD stored procedures. This seems like it negates one of the major, time saving features of the Entity Framework -- SQL generation.
I've read some of the arguments regarding stored procedures vs. generated SQL from the Entity Framework. While using CRUD SPs is better for security, and the SQL generated by EF is often more complex than necessary, does it really buy anything in terms of performance or maintainability to use SPs?
Here is what I believe:
Most of the time, modifying an SP requires updating the data model
anyway. So, it isn't buying much in terms of maintainability.
For web applications, the connection to the database uses a single user ID specific to the application. So, users don't even have direct database access. That reduces the security benefit.
For a small application, the slightly decreased performance from using
generated SQL probably wouldn't be much of an issue. For high
volume, performance critical applications, would EF even be a wise
choice? Plus, are the insert / update / delete statements generated
by EF really that bad?
Sending every attribute to a stored procedure has it's own performance penalties, whereas the EF generated code only sends the attributes that were actually changed. When doing updates to large tables, the increased network traffic and overhead of updating all attributes probably negates the performance benefit of stored procedures.
With that said, my specific questions are:
Are my beliefs listed above correct? Is the idea of always using SPs something that is "old school" now that ORMs are gaining in popularity? In your experience, which is the better way to go with EF -- mapping SPs for all insert / update / deletes, or using EF generated SQL for CRUD operations and only using SPs for more complex stuff?
I think always using SP's is somewhat old school. I used to code that way, and now do everything I can in EF generated code...and when I have a performance problem, or other special need, I then add back in a strategic SP to solve a particular problem....it doesn't have to be either or - use both.
All my basic CRUD operations are straight EF generated code - my web apps used to have 100's or more of SP's, now a typical one will have a dozen SP's and everything else is done in my C# code....and my productivity has gone WAY up by eliminating those 95% of CRUD stored procs.
Yes your beliefs are absolutely correct. Using stored procedures for data manipulation has meaning mainly if:
Database follows strict security rules where changing data is allowed only through stored procedures
You are using views or custom queries for mapping your entities and you need advanced logic in stored procedure to push data back
You have some advanced logic (related to data) in the procedure for any other reason
Using procedures for pure CUD where non of mentioned cases applies is redundant and it doesn't provide any measurable performance boost except single scenario
You will use stored procedures for batch / bulk modifications
EF doesn't have bulk / batch functionality so changing 1000 records result in 1000 updates each executed with separate database roundtrip! But such procedures cannot be mapped to entities anyway and must be executed separately via function import (if possible) or directly as ExecuteStoreCommand or old ADO.NET (for example if you want to use table valued parameter).
The whole different story can be R in CRUD where stored procedure can get significant performance boost for reading data with your own optimized queries.
If performance is your primary concern, then you should take one of your existing apps that uses EF with SPs, disable the SPs, and benchmark the new version. That's the only way to get an answer perfectly applicable to your situation. You might find that EF no matter what you do isn't fast enough for your performance needs compared to custom code, but outside of very high volume sites I think EF 4.1 is actually pretty reasonable.
From my PoV, EF is a great developer productivity boost. You lose a fair bit of that if you're writing SPs for simple CRUD operations, and for Insert/Update/Delete in particular I really don't see you gaining much in performance because those operations are so straightforward to generate SQL for. There are definitely some Select cases where EF will not do the optimal thing and you can get major performance increases by writing a SP (hierarchical queries using CONNECT BY in Oracle come to mind as an example).
The best way to deal with that type of thing is to write your app letting EF generate the SQL. Benchmark it. Find areas where there's performance issues and write SPs for those. Delete is almost never going to be one of the cases you need to do this.
As you mentioned, the security gain here is somewhat lessened because you should have EF on an Application tier that has its own account for the app anyway, so you can restrict what it does. SPs do give you a bit more control but in typical usage I don't think it matters.
It's an interesting question that doesn't have a truely right or wrong answer. I use EF primarily so that I don't have to write generic CRUD SPs and can instead spend my time working on the more complex cases, so for me I'd say you should write fewer of them. :)
I agree broadly with E.J, but there are a couple of other options. It really boils down to the requirements for the particular system:
Do you need to get the app developed FAST? - Then use entity framework and its automatic SQL
Need fine-grained and solid security? - Get onto stored procedures
Need it to run as fast as possible? - You're probably looking at some happy medium!
In my opinion as long as your application/database does not suffer from performance issues and you are mostly using the database for CRUD and accessing it using just one DB user, it is better to use generated SQL. It is faster to develop, more maintainable and the few security or more privacy benefits are not worth it (if the data is not so sensitive). Also the use of model based database access or LINQ disabled the threat from SQL injections.

Is using Stored Procedure such a bad idea?

Microsoft has often provided ways to make it easy to develop things that are simple and trivial.
There are certain things that I dislike in EFxx.
First and foremost, the fact that in order to do an update, you need to LOAD the record first, so the operation becomes a 2 step process where maybe you just want to update a boolean value.
Second, I like Stored Procedures because i can run 10 different things within the same connection call where if I were using EFxx I would have to run 10 separate DB calls (or more if update was involved).
My concern and question to the MVC EF gurus is ...
Is using Stored Procedures such a bad idea? I still see EFxx as just another way Microsoft gives us to develop simple programs much faster, but in reality it's not the true recommended way.
Any hint and tip will be much appreciated, specially on the concept of "what's the best way to run an update on EFxx" & "is Stored Procedures bad for EFxx".
You are falling into a logical fallacy. Just because EF is designed to work a certain way doesn't mean you aren't supposed to ever do it a different way. And just because EF may not be good to do a certain thing in a certain way doesn't mean EF sucks or shouldn't be used for anything. This is the All or nothing argument. If it can't do everything perfectly, then it is useless.. and that's just not true.
EF is an Object-Relational Mapping tool. You would only use it when you want to work with your data as objects. You would not use it if you want to work with your data as relational sets (aka SQL).
You're also not stuck with using EF or nothing. You could use EF for queries, and use stored procs for updates. Or the other way around. It's about using the tool that works best for the given situation.
And no, EF is not just for "simple" or "trivial" things. But, using it for more complex scenarios often requires deeper knowledge of how EF works so that you know what its doing under the covers.
Using a stored proc in EF is as simple as saying MyContext.Database.ExecuteSqlCommand() or MyContext.Database.SqlQuery(). This is the most basic way to do so, and it provides rudimentary object to sproc mapping, but it does not support the more complex ORM functionality like caching, change tracking, etc..
EF6 will more fully support sprocs for backing of queries, updates, and deletes as well, supporting more of the feature set.
EF is not a magic bullet. It has tradeoffs, and you need to decide whether it's right for you in the circumstances you're going to use it.
FYI, you're absolutely wrong about needing to get an object before updating it, although that's just the simplest way of dealing with it. EF also implements a unit of work pattern, so if you are doing 10 inserts, it's not going to make 10 round trips, it will send them all as a single prepared statement.
Just like you can write bad SQL, you can write bad EF queries. Just because you are good at SQL and bad at EF doesn't mean EF sucks. It means, you aren't an expert in it yet.
So to your question, no. Nobody has ever said using Sprocs is a bad idea. The thing is, in many cases, sprocs are overkill. They also create an artificial separation of your logic into two different subsystems. Writing your queries in C# means you're writing your business logic entirely in one language, which as a lot of maintenance benefits. Some environments need sproc use, some don't..
This has been asked and answered many times. Like this one.
There will always be pros and cons to both. It's just a matter of what is important to you. Do you just need simple CRUD operations (one at a time)? I would probably use ORMs. Do you do bulk DB operations? Use SPs. Do you need to do rapid development? Use ORMs. Do you need flexibility such that you need full control over SQL? Use SP.
Also, take note that you can reduce the number of DB trips your context in EF does. You can try to read more about different types of EF loading. Also, calling SPs is possible in EF. Data read using SP & Add/Update using SP.

Moving from ADO.NET to ADO.NET Entity Framework

I am web developer for nine years now.
I love to develop custom CMS and purly hand coded web applications.
I was ok with ADO.NET Data Access model, writting native SQL Queries to the database and calling store procedures via DBCommand.
2 years now i was thinking to move to ADO.NET Entity Framework.
I know there are alot of advantages in terms of productivity but i really don't like/understand the way it work Entity Framework.
In terms of productivity i have create an application that auto generates for me the ADO.NET Code so i don't waste mutch time to code ADO.NET code.
Should i move on Entity Framework?
PS : I am a performance lover.:P
PS 2 :For example, How can i implement a Modified Preorder Tree Traversal to manage hierarchical data (ex : Categories of products) in Entity framework?
PS 3 : I Work with MySql Server
Edit
After a bit of reading, i understand that ADO.NET Entity Framework is wonderful.
It give us alot of benefits that we have to hand craft or "copy-paste" in the past.
Another benefit that comes with it is that is completely Provider independent.
Even if you like the old ADO.NET micanism or you are a dinosaur like me(:P) you can use the entity framework using the EntityClient like SqlClient, MySqlClient and use the power of Entity-Sql witch is provider independent.
Yes you loose some performance.
But with all these cache technologies out there you can overcome this.
As i always say, "The C is fast, the Assembly even more...but we use C#/VB.NET/Java"
Thank you very mutch for the good advices.
It depends.
ORMs work well when you are forced to persist an object graph to a relational storage. The better option would be to use an Object Database. So:
If your application will benefit from using an Object Database and you are forced to use relational storage, then answer is simple: Yes, you need ORM.
If you already have your data layer strategy and you don't need to spend a lot of time using it and you feel it's fine, then the answer is also simple: You don't need ORM., with one simple "but"...
You can't foresee all advantages/disadvantages until you try. And nobody has your mind and your projects. So the better answer would be: Try it and figure it out yourself.
The choice of ORM does not change your data model in most cases. You can use the exact same methods that you used to use, but you now use them in code rather than SQL.
In your MPTT example, you would do the same thing, but it would look something like this in the case of a tree of food items, where the root items left value is 1, and right value 20.
var query = from f in food where lft > 1 and rgt < 20 select f.name;
What's more, if you do discover something you can't do very well in the ORM, you can always just use the ORM to call a sproc that does what you need in SQL.
In fact, even if I wasn't using an ORM to map tables, i'd still use it to call my sprocs because it will automatically create all the wrapper code, parameterize the queries, make it all type safe, and reconstitute it into a data transfer object. It saves writing a lot of boilerplate.
To answer at least one aspect of performance, EF will generate parameterized queries, which is good for performance. Parameterized queries allow the db to store execution plans and the dba to optimize the plans if necessary. Otherwise most queries are treated by the db as totally brand new and thus it creates a new execution plan every time.

How is using Entity + LINQ not just essentially hard coding my queries?

So I've been developing with Entity + LINQ for a bit now and I'm really starting to wonder about best practices. I'm used to the model of "if I need to get data, reference a stored procedure". Stored procedures can be changed on the fly if needed and don't require code recompiling. I'm finding that my queries in my code are looking like this:
List<int> intList = (from query in context.DBTable
where query.ForeignKeyId == fkIdToSearchFor
select query.ID).ToList();
and I'm starting to wonder what the difference is between that and this:
List<int> intList = SomeMgrThatDoesSQLExecute.GetResults(
string.Format("SELECT [ID]
FROM DBTable
WHERE ForeignKeyId = {0}",
fkIdToSearchFor));
My concern is that that I'm essentially hard coding the query into the code. Am I missing something? Is that the point of Entity? If I need to do any real query work should I put it in a sproc?
The power of Linq doesn't really make itself apparent until you need more complex queries.
In your example, think about what you would need to do if you wanted to apply a filter of some form to your query. Using the string built SQL you would have to append values into a string builder, protected against SQL injection (or go through the additional effort of preparing the statement with parameters).
Let's say you wanted to add a statement to your linq query;
IQueryable<Table> results = from query in context.Table
where query.ForeignKeyId = fldToSearchFor
select query;
You can take that and make it;
results.Where( r => r.Value > 5);
The resulting sql would look like;
SELECT * FROM Table WHERE ForeignKeyId = fldToSearchFor AND Value > 5
Until you enumerate the result set, any extra conditions you want to decorate will get added in for you, resulting in a much more flexible and powerful way to make dynamic queries. I use a method like this to provide filters on a list.
I personally avoid to hard-code SQL requests (as your second example). Writing LINQ instead of actual SQL allows:
ease of use (Intellisense, type check...)
power of LINQ language (which is most of the time more simple than SQL when there is some complexity, multiple joins...etc.)
power of anonymous types
seeing errors right now at compile-time, not during runtime two months later...
better refactoring if your want to rename a table/column/... (you won't forget to rename anything with LINQ becaues of compile-time checks)
loose coupling between your requests and your database (what if you move from Oracle to SQL Server? With LINQ you won't change your code, with hardcoded requests you'll have to review all of your requests)
LINQ vs stored procedures: you put the logic in your code, not in your database. See discussion here.
if I need to get data, reference a stored procedure. Stored procedures
can be changed on the fly if needed and don't require code recompiling
-> if you need to update your model, you'll probably also have to update your code to take the update of the DB into account. So I don't think it'll help you avoid a recompilation most of the time.
Is LINQ is hard-coding all your queries into your application? Yes, absolutely.
Let's consider what this means to your application.
If you want to make a change to how you obtain some data, you must make a change to your compiled code; you can't make a "hotfix" to your database.
But, if you're changing a query because of a change in your data model, you're probably going to have to change your domain model to accommodate the change.
Let's assume your model hasn't changed and your query is changing because you need to supply more information to the query to get the right result. This kind of change most certainly requires that you change your application to allow the use of the new parameter to add additional filtering to the query.
Again, let's assume you're happy to use a default value for the new parameter and the application doesn't need to specify it. The query might include an field as part of the result. You don't have to consume this additional field though, and you can ignore the additional information being sent over the wire. It has introduced a bit of a maintenance problem here, in that your SQL is out-of-step with your application's interpretation of it.
In this very specific scenario where you either aren't making an outward change to the query, or your application ignores the changes, you gain the ability to deploy your SQL-only change without having to touch the application or bring it down for any amount of time (or if you're into desktops, deploy a new version).
Realistically, when it comes to making changes to a system, the majority of your time is going to be spent designing and testing your queries, not deploying them (and if it isn't, then you're in a scary place). The benefit of having your query in LINQ is how much easier it is to write and test them in isolation of other factors, as unit tests or part of other processes.
The only real reason to use Stored Procedures over LINQ is if you want to share your database between several systems using a consistent API at the SQL-layer. It's a pretty horrid situation, and I would prefer to develop a service-layer over the top of the SQL database to get away from this design.
Yes, if you're good at SQL, you can get all that with stored procs, and benefit from better performance and some maintainance benefits.
On the other hand, LINQ is type-safe, slightly easier to use (as developers are accustomed to it from non-db scenarios), and can be used with different providers (it can translate to provider-specific code). Anything that implements IQueriable can be used the same way with LINQ.
Additionally, you can pass partially constructed queries around, and they will be lazy evaluated only when needed.
So, yes, you are hard coding them, but, essentially, it's your program's logic, and it's hard coded like any other part of your source code.
I also wondered about that, but the mentality that database interactivity is only in the database tier is an outmoded concept. Having a dedicated dba churn out stored procedures while multiple front end developers wait, is truly an inefficient use of development dollars.
Java shops hold to this paradigm because they do not have a tool such as linq and hold to that paradigm due to necessity. Databases with .Net find that the need for a database person is lessoned if not fully removed because once the structure is in place the data can be manipulated more efficiently (as shown by all the other responders to this post) in the data layer using linq than by raw SQL.
If one has a chance to use code first and create a database model from the ground up by originating the entity classes first in the code and then running EF to do the magic of actually creating the database...one understands how the database is truly a tool which stores data and the need for stored procedures and a dedicated dba is not needed.

Methods of pulling data from a database

I'm getting ready to start a C# web application project and just wanted some opinions regarding pulling data from a database. As far as I can tell, I can either use C# code to access the database from the code behind (i.e. LINQ) of my web app or I can call a stored procedure that will collect all the data and then read it with a few lines of code in my code behind. I'm curious to know which of these two approaches, or any other approach, would be the most efficient, elegant, future proof and easiest to test.
The most future proof way to write your application would be to have an abstraction between you and your database. To do that you would want to use an ORM of some sort. I would recommend using either NHibernate or Entity Framework.
This would give you the advantage of only having to write your queries once instead of multiple times (Example: if you decide to change your database "moving from mssql to mysql or vice versa"). This also gives you the advantage of having all of your data in objects. Which is much easier to work with than raw ado Datatables or DataReaders.
Most developers like to introduce at least one layer between the code behind and the Database.
Additionally there are many data access strategies that people use within that layer. ADO.NET, Entity Framework, Enterprise Library NHibernate, Linq etc.
In all of those you can use SQL Queries or Stored Procedures. I prefer Stored Procedures because they are easy for me to write and deploy. Others prefer to use Parameterized queries.
When you have so many options its usually indicative that there really isn't a clear winner. This means you can probably just pick a direction and go with it and you'll be fine.
But you really shouldn't use non-parameterized queries and you shouldn't do it in the code behind but instead in seperate classes
Using LINQ to SQL to access your data is probably the worst choice right now. Microsoft has said that they will no longer be improving LINQ to SQL in favor of Entity Framework. Also, you can use LINQ with your EF if you should choose to go that route.
I would recommend using an ORM like nHibernate or Entity framework instead of a sproc/ADO approach. Between the two ORMs, I would probably suggest EF for you where you are just getting the hang of this. EF isn't QUITE as powerful as nHibernate but it has a shorter learning curve and is pretty robust.

Categories

Resources