I am currently trying to find different areas where Linq is not sufficient and FromSqlRaw or ExecuteSqlRaw have to be used.
Some examples I have found are
Bulk updates https://learn.microsoft.com/en-us/ef/core/performance/efficient-updating
Executing stored procedures https://learn.microsoft.com/en-us/ef/core/querying/raw-sql
However, I am looking for more areas where Linq does not perform well enough, and even queries that cannot be generated from Linq in EF Core when it comes to database access.
My goal is to find poor performing Linq translations and examine the cause.
This is a bit of a solution looking for a problem. Given an application, I would start looking for inefficiencies that might benefit from a different approach by using a profiler and observing the database access at as close to production capacity as I am allowed to get.
EF is like any tool: it can be leveraged to create works of art, and it can be abused and misused to create shanties. Even when done correctly, optimizations like indexes are tuned based on looking at real-world usage. There are many options that I would look at to address performance issues before considering going direct to SQL. Typical culprits that can be easily identified via profiling:
1. Lazy loading. (Dozens to hundreds of queries following up a "main" query.)
2. Over-use of eager loading. (Queries involving a heck of a lot of joins.)
3. Sloppy use of client-side evaluation. (Either enabling that feature in EF Core, or slapping a ToList somewhere to "fix" it when a query complains. AsSplitQuery can help here; projection is a better solution in most cases.)
4. Lack of pagination where more data is returned than necessary. (Similar to #3, having methods like "GetAll" and then applying filtering, pagination, etc.)
5. Giving users too much flexibility in querying that they don't need 99% of the time, but that grinds the system to a halt in the 1% of cases where someone does try it. (Giving users filters/sorts on ALL columns and performing things like string.Contains by default for text searches.)
6. Giving users access to expensive, but necessary queries in real-time. (Big, justified queries, but being run against the production dataset and not "throttled" by something like a queue to ensure too many of these monsters don't get run at once.)
Those are some of the top culprits that come to mind around performance, and none of them require resorting to SQL. Batch processing in your list is certainly one case that I believe does deserve looking outside of Linq, and potentially outside of EF altogether. Stored procs I am mixed on. They can make sense if there is business logic that is shared between an EF-supported application and another existing system and I want to share that business logic as-is. The trouble is that if I'm relying on the sproc for business rules then there's little point to EF, and if I'm splitting business rules between C#/EF and sprocs, then I'm having to manage logic in two locations.
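To make the lazy loading and pagination culprits concrete, here is a minimal, language-neutral sketch. It uses Python's built-in sqlite3 rather than EF, and the orders/order_lines schema and the CountingDb helper are made up for illustration. It only shows the query pattern a profiler would reveal, not what any particular EF version emits.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_lines (id INTEGER PRIMARY KEY, order_id INTEGER, product TEXT);
""")
conn.executemany("INSERT INTO orders (id, customer) VALUES (?, ?)",
                 [(i, f"customer-{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO order_lines (order_id, product) VALUES (?, ?)",
                 [(i, f"product-{i}") for i in range(1, 51)])
conn.commit()


class CountingDb:
    """Hypothetical helper: runs a query and counts the round trips."""

    def __init__(self, connection):
        self.connection = connection
        self.round_trips = 0

    def query(self, sql, params=()):
        self.round_trips += 1
        return self.connection.execute(sql, params).fetchall()


def load_lazily(db):
    """Lazy-loading shape: one 'main' query, then one follow-up per row."""
    orders = db.query("SELECT id, customer FROM orders")
    return {
        order_id: db.query(
            "SELECT product FROM order_lines WHERE order_id = ?", (order_id,))
        for order_id, _customer in orders
    }


def load_page(db, page, page_size):
    """Projection plus pagination: one query, only the needed columns and rows."""
    return db.query(
        """SELECT o.id, o.customer, l.product
           FROM orders o JOIN order_lines l ON l.order_id = o.id
           ORDER BY o.id LIMIT ? OFFSET ?""",
        (page_size, page * page_size))


lazy_db = CountingDb(conn)
lazy_result = load_lazily(lazy_db)

paged_db = CountingDb(conn)
first_page = load_page(paged_db, page=0, page_size=10)

print(lazy_db.round_trips, paged_db.round_trips)  # prints "51 1"
```

The first shape is what a profiler shows as "dozens to hundreds of queries following up a main query". The second is the kind of single, bounded query that projection and paging aim for, and neither needs hand-written SQL in an ORM.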
Microsoft has often provided ways to make it easy to develop things that are simple and trivial.
There are certain things that I dislike in EFxx.
First and foremost, the fact that in order to do an update, you need to LOAD the record first, so the operation becomes a 2 step process where maybe you just want to update a boolean value.
Second, I like Stored Procedures because I can run 10 different things within the same connection call, where if I were using EFxx I would have to run 10 separate DB calls (or more if an update was involved).
My concern and question to the MVC EF gurus is ...
Is using Stored Procedures such a bad idea? I still see EFxx as just another way Microsoft gives us to develop simple programs much faster, but in reality it's not the true recommended way.
Any hints and tips will be much appreciated, especially on "what's the best way to run an update in EFxx" and "are Stored Procedures bad for EFxx".
You are falling into a logical fallacy. Just because EF is designed to work a certain way doesn't mean you aren't supposed to ever do it a different way. And just because EF may not be good at doing a certain thing in a certain way doesn't mean EF sucks or shouldn't be used for anything. This is the all-or-nothing argument: if it can't do everything perfectly, then it is useless. That's just not true.
EF is an Object-Relational Mapping tool. You would only use it when you want to work with your data as objects. You would not use it if you want to work with your data as relational sets (aka SQL).
You're also not stuck with using EF or nothing. You could use EF for queries, and use stored procs for updates. Or the other way around. It's about using the tool that works best for the given situation.
And no, EF is not just for "simple" or "trivial" things. But using it for more complex scenarios often requires deeper knowledge of how EF works, so that you know what it's doing under the covers.
Using a stored proc in EF is as simple as saying MyContext.Database.ExecuteSqlCommand() or MyContext.Database.SqlQuery(). This is the most basic way to do so, and it provides rudimentary object-to-sproc mapping, but it does not support the more complex ORM functionality like caching, change tracking, etc.
EF6 will more fully support sprocs for backing of queries, updates, and deletes as well, supporting more of the feature set.
EF is not a magic bullet. It has tradeoffs, and you need to decide whether it's right for you in the circumstances you're going to use it.
FYI, you're absolutely wrong about needing to get an object before updating it, although that's just the simplest way of dealing with it. EF also implements a unit of work pattern, so if you are doing 10 inserts, it's not going to make 10 round trips, it will send them all as a single prepared statement.
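As a rough illustration of what the database needs for these two cases, here is a sketch using Python's built-in sqlite3 instead of EF (the users table is hypothetical). Flipping one boolean is a single UPDATE with no prior SELECT, and ten inserts can be committed together as one unit of work. Whether a given EF version also combines the statements into one round trip is not something this sketch shows.

```python
import sqlite3

# Hypothetical table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, is_active INTEGER)")
conn.executemany(
    "INSERT INTO users (id, name, is_active) VALUES (?, ?, ?)",
    [(1, "ann", 0), (2, "bob", 0)])
conn.commit()

# Update a single boolean column by key. No SELECT is needed first.
cursor = conn.execute("UPDATE users SET is_active = 1 WHERE id = ?", (1,))
conn.commit()
rows_changed = cursor.rowcount

# Unit of work: ten inserts, one transaction.
# "with conn" commits on success and rolls back if an exception is raised.
new_users = [(i, f"user-{i}", 1) for i in range(3, 13)]
with conn:
    conn.executemany(
        "INSERT INTO users (id, name, is_active) VALUES (?, ?, ?)", new_users)

total_users = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
active_users = conn.execute(
    "SELECT COUNT(*) FROM users WHERE is_active = 1").fetchone()[0]
print(rows_changed, total_users, active_users)  # prints "1 12 11"
```

In an ORM the equivalent of the first part is usually attaching an entity with a known key and marking only the changed property as modified, so the generated SQL is the same single UPDATE.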
Just like you can write bad SQL, you can write bad EF queries. Just because you are good at SQL and bad at EF doesn't mean EF sucks. It means, you aren't an expert in it yet.
So to your question, no. Nobody has ever said using sprocs is a bad idea. The thing is, in many cases, sprocs are overkill. They also create an artificial separation of your logic into two different subsystems. Writing your queries in C# means you're writing your business logic entirely in one language, which has a lot of maintenance benefits. Some environments need sproc use, some don't.
This has been asked and answered many times. Like this one.
There will always be pros and cons to both. It's just a matter of what is important to you. Do you just need simple CRUD operations (one at a time)? I would probably use ORMs. Do you do bulk DB operations? Use SPs. Do you need to do rapid development? Use ORMs. Do you need flexibility such that you need full control over SQL? Use SP.
Also, take note that you can reduce the number of DB trips your context in EF does. You can try to read more about different types of EF loading. Also, calling SPs is possible in EF. Data read using SP & Add/Update using SP.
I am aware that this question may be a little bit dangerous to ask, but I really need some opinion with this.
We've got our system, a website (it will be a popular web portal; we use MVC3), and before I got here the rest of my co-workers chose NHibernate as their OR mapper solution and started to write Criteria queries and such.
Right now the team is closer to the Linq approach, so we tried to write queries with the built-in Linq provider. The thing is, it's horribly adapted: you literally cannot write a non-trivial query without getting a "not supported" exception.
We decided that this is the last possible moment to change our OR mapper to something more Linq-based, and since EF4.1 got the ultra-friendly Code First option, we decided that this is what we need.
The problem I need some opinions on is whether it is worth the time to migrate from NHibernate to EF4.1. The project will last at least one more year in development, so we have a lot of work to do, and we want to do it in a nice and non-frustrating way.
Some facts:
We have about 50 entities in our project
We have about 160 queries written in Criteria API (all covered in unit tests)
We need to have composite, inheritance and many-many support
The project will be twice as big as it is now
We are not satisfied with our database performance
We hate the way that we write queries right now!
So, now: is migration a good or bad idea? Will EF resolve our problems and make us happy, or will that step just be a waste of our time?
Regards
Be aware that you will exchange better linq support for worse mapping functionality and sometimes much worse performance (inheritance queries, no query or command batching, ...). Now return to the blackboard and think again. If you don't like your database performance now, it will hardly improve with EF.
I guess it is a little bit late to change the technology; it will have a high cost. But anyway, if you really want to do it, why not make a proof of concept where you take some really complex mapped feature with some advanced queries and try to do the same in EF Code First? You can test this in a simple console application and compare both the mapping experience and the queries plus performance.
Performance is perhaps not an issue at the moment, but it can be something you really have to optimize in the future, and EF will provide you far fewer features for that. If you want to improve the performance of an EF solution, you very often revert to native SQL and stored procedures. Do you think that will be a better experience for writing queries?
I have to agree with @Ladislav. EF with LINQ is nice and just works compared to NHibernate LINQ; however, the SQL EF generates is sometimes pretty terrible and not very performant, and you will be forced to recode complex queries into views, SPs, etc. NHibernate can also fall into this trap, but having many different options is a benefit, as you can cherry-pick the best one to suit your needs.
I suppose you are looking at balancing the following:-
rewrite using EF and then rework the ugly/slow generated SQL into more performant database queries
use NHibernate and drop LINQ for anything too complex into Criteria/QueryOver/HQL
Jumping from one ORM into another can be kinda like jumping from the frying pan into the fire. Both have their sweet spots; both will burn your fingers!
Personally, I have also come across the issues with the Linq provider for NHibernate.
However, I chose to stick with NHibernate because of its "proper" SQL-generation, overall performance and extensibility.
The latter allows you to take load off your RDBMS with a 2nd-level cache of your choice, such as Memcached. It keeps the objects in memory (on the Memcached server), fetching/committing them from/to the RDBMS only if needed. This also applies to compiled SQL queries.
Well to be honest it sounds like you've already made up your mind. I'm not sure what you are really asking here. If you are interested in sticking with NHibernate you should ask specific questions on problems you are having.
In my opinion if your team is more familiar with EF then you should switch. If they aren't then I'm not sure if you will ever know if the EF will solve all of your problems unless you actually outline the specific problems you are having.
Did you try Fluent NHibernate? http://fluentnhibernate.org/. You can use LINQ with it. I used it in a project and it worked very well.
I've heard from some that LINQ to SQL is good for lightweight apps. But then I see LINQ to SQL being used for Stack Overflow, and a bunch of other .coms I know (from interviewing with them).
OK, so is this true? For an e-commerce site that's bringing in millions, where you're typically only doing basic CRUD most of the time with the exception of an occasional stored proc for something more complex, is LINQ to SQL complete enough and performance-wise good enough, or able to be tweaked enough, to run happily? I've heard that you just need to tweak performance on the DB side when using LINQ to SQL.
So there are really 2 questions here:
1) Meaning/scope/definition of a "Lightweight" O/RM solution: What the heck does "lightweight" mean when people say LINQ to SQL is a "lightweight O/RM" and is that true??? If this is so lightweight then why do I see a bunch of huge .coms using it?
Is it good enough to run major .coms (obviously it looks like it is) and what determines what the context of "lightweight" is...it's such a generic statement.
2) Performance: I'm working on my own .com and researching different O/RMs. I'm not really looking at the Entity Framework (yet), just want to figure out the LINQ to SQL basics here and determine if it will be efficient enough for me. The problem I think is you can't tweak or control the SQL it generates...
When people describe LINQ to SQL as lightweight, I think they mean it is good enough at what it does, but there is a lot of stuff it doesn't even try to do. I think this is a good thing, because all that extra stuff that other ORMs might try to let you do isn't really even needed if you're just willing to make a few sacrifices.
For example, I think it's a best practice to try to keep all application data in a single database. This is the kind of thing that LINQ to SQL expects if you want to be able to do joins and whatnot. However, if you work in some environment with layers of bureaucracy, you might not be able to convince everyone to move legacy data around, or centralize on a single way of doing things. In the end you need a more complicated ORM and you end up with arguably crappier software. That's just one example of why you might not be able to shape the data as it needs to be.
So yeah, if big .coms are willing or able to do things in a consistent manner and follow best practices, there is no reason why the ORM can't be as simple as necessary.
Microsoft Linq to SQL, Entity Framework (EF), NHibernate, etc. are all proposing ORMs as the next generation of data mapping technologies, and are claiming to be lightweight, fast and easy. Like, for example, this article that just got published in VS magazine:
http://visualstudiomagazine.com/features/article.aspx?editorialsid=2583
Who all are excited about implementing these technologies in their projects? Where is the innovation in these technologies that makes them so great over their predecessors?
I have written data access layers, persistence components, and even my own ORMs in hundreds of applications over the years (one of my "hobbies"); I have even implemented my own business transaction manager (discussed elsewhere on SO).
ORM tools have been around for a long time on other platforms, such as Java, Python, etc. It appears that there is a new fad now that Microsoft-centric teams have discovered them. Overall, I think that is a good thing--a necessary step in the journey to explore and comprehend the concepts of architecture and design that seems to have been introduced along with the arrival of .NET.
Bottom line: I would always prefer to do my own data access rather than fight some tool that is trying to "help" me. It is never acceptable to give up my control over my destiny, and data access is a critical part of my application's destiny. Some simple principles make data access very manageable.
Use the basic concepts of modularity, abstraction, and encapsulation--so wrap your platform's basic data access API (e.g., ADO.NET) with your own layer that raises the abstraction level closer to your problem space. DO NOT code all your data access DIRECTLY against that API (also discussed elsewhere on SO).
Severely apply the DRY (Don't Repeat Yourself) principle = refactor the daylights out of your data access code. Use code generation when appropriate as a means of refactoring, but seek to eliminate the need for code generation whenever you can. Generally, code generation reveals that something is missing from your environment--language deficiency, designed-in tool limitation, etc.
Meanwhile, learn to use the available API well, particularly regarding performance and robustness, then incorporate those lessons into your own abstracted data access layer. For example, learn to make proper use of parameters in your SQL rather than embedding literal values into SQL strings.
Finally, keep in mind that any application/system that becomes successful will grow to encounter performance problems. Fixing performance problems relies more on designing them out rather than just "tweaking" something in the implementation. That design work will affect the database and the application, which must change in sync. Therefore, seek to be able to make such changes easily (agile) rather than attempt to avoid ever changing the application itself. In part, that eventually means being able to deploy changes without downtime. It is not hard to do, if you don't "design" away from it.
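The advice above about using parameters rather than embedding literal values can be sketched as follows. This uses Python's built-in sqlite3 as a stand-in for ADO.NET, with a hypothetical customers table; the same contrast applies to any data access API that supports parameters.

```python
import sqlite3

# Hypothetical table and rows, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)",
                 [("O'Brien",), ("Smith",)])
conn.commit()


def find_by_name_unsafe(name):
    """Anti-pattern: the value is pasted into the SQL text."""
    sql = f"SELECT id, name FROM customers WHERE name = '{name}'"
    return conn.execute(sql).fetchall()


def find_by_name(name):
    """Preferred: the value travels separately as a parameter."""
    return conn.execute(
        "SELECT id, name FROM customers WHERE name = ?", (name,)).fetchall()


# A legitimate name containing a quote works with a parameter...
print(find_by_name("O'Brien"))

# ...but breaks the string-built query with a syntax error.
try:
    find_by_name_unsafe("O'Brien")
    unsafe_failed = False
except sqlite3.OperationalError:
    unsafe_failed = True

# A crafted value changes the meaning of the string-built query (returns
# every row), while the parameterized version treats it as plain data.
injected = "x' OR '1'='1"
leaked_rows = find_by_name_unsafe(injected)
safe_rows = find_by_name(injected)
```

Wrapping calls like find_by_name in your own data access layer is one way to make the parameterized form the only one the rest of the application can reach.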
I'm a huge ORM advocate. Code generation with ORM saves my shop about 20-30% on most of our projects.
And we do contract development, so this is a big win.
Chris Lively made an interesting point about having to do a redeploy any time a query gets touched. This may be a problem for some people, but it does not touch us at all. We have rules against making production database changes directly.
Also, you can still rely on traditional sprocs and even views when appropriate... We are not dogmatically 100% ORM, that's for sure.
I have been on the ORM train for the longest time, since the free version of LLBLGen to the latest and greatest commercial product LLBLGen Pro. I think ORMs fit in very well for a lot of the common data input output systems.
That isn't to say they solve all problems, however. It is a tool which can be used where it makes sense to be used. If your database schema is relatively close to how your business objects need to be, ORMs are the best.
It's not a bandwagon to jump on, it's a reaction to a real problem! Object Relational Mapping (ORM) has been around for a long time and it solves a real problem.
Original Object Oriented (OO) languages were all about simulating real world problems using a computer language. It could be argued that if you are really using an OO language to build systems you will be simulating the real world problem domain using a Domain Driven Design (DDD). This logically takes you to a separation of concerns model in order to keep your DDD clean and clear from all the clutter of data persistence and application controls.
When you build systems following a DDD pattern and use a Relational database for persistence then you really need a good ORM or you will be spending too much time building and maintaining database crud (pun intended).
ORM is an old problem and was solved years ago by products like Object Lens and Top Link. Object Lens was a Smalltalk ORM built by ParkPlace in the 90's. Top Link was built by Object People for Smalltalk, then converted for Java, and is currently used by Oracle. Top Link has also been around since the 90's. DDD advocates are now beginning to clearly articulate the case for DDD and gaining mind share. Therefore ORM, by necessity, is becoming mainstream and Microsoft is just reacting as usual.
No. Not everyone is.
Here's the number one big ass elephant in the room with most of the ORM tools (especially LINQ to SQL):
You are guaranteed that ANY data related change will require a full redeployment of your application.
For example, my day job can currently fix most query problems by modifying an existing stored procedure and redeploying just that one piece. With Linq, et al, your data queries are moved into your actual code. Guess what that means?
ORM is a good match for people who get along OK with software that writes software for them; but if you are obsessive about controlling what's happening and why, ORM can be suboptimal, particularly with database optimization. Any abstraction layer has costs and benefits; ORM has both, but the balance isn't right yet IMHO. And ORM, in its current form, ironically adds an abstraction layer that still puts classes and unabstracted database schemas too intimately together.
My experience is that it can help you get a proof-of-concept version together quickly, but can introduce refactoring requirements you may not be familiar with (at least yet).
Add to that the fact that the tool is evolving, and that best practices and patterns are not well established, nor is there a consensus of the kind that lets other programmers (or future programmers) feel comfortable with your code. So I expect to see higher-than-usual refactoring requirements for a while.
I'll reserve judgment (optimistically) about where it will settle in terms of being mainstream. I wouldn't bet a current project on it at this point. My patterns for dealing with the impedance mismatch are satisfactory for my purposes.
You have to fight with the ORM system once you want to do anything beyond the simplest select, update or delete. And your performance goes into the toilet once you begin doing real stuff.
So no.
I look forward to the day my team starts looking into ORM solutions. Until that day, we are a pure DataSet/Stored Procedure shop and let me tell you that it isn't all biscuits and gravy being "pure".
Firstly, the new Linq to SQL is performing close to that of stored procs and data readers. We use datasets everywhere, so performance would improve. So far so good for ORM.
Secondly, stored procs have the added benefit of being released separate of code, but they also have the detriment of being released separate of code. Pretend for a second that you have a database with more than 5 systems connecting to it and more than 10 people working on it. Now think about managing all those stored procedure changes and dependencies, especially when there is a common code base. It is a nightmare...
Third, it is difficult to debug stored procs. They often result in erroneous behavior for any number of reasons. That is not to say the same could not result from the dynamic SQL being generated by the ORM, but one less problem is one less problem. No permissions issues (though mostly resolved in SQL 2005), no more multi-step development: writing the query, testing the query, deploying the query, tying it into the code, testing the code, deploying the code. The query is part of the code and I see this as a good thing.
Fourth, you can still use stored procedures. Running some reports that take a long time? Stored procs are a better solution. Why? Query execution plans can be optimized by the database engine. I won't pretend to understand all the workings of the database, but I do know there are some limitations to optimizing dynamic SQL currently, and that is a trade-off we make when going with an ORM. However, stored procs are not ruled out when using an ORM solution.
Really, the biggest reason I see people avoiding an ORM is that they simply don't have experience with one. There will be an obvious learning curve and ignorance stage. However, if it is going to improve development speed and hardly hinder (or in my case improve) runtime performance, it is a trade-off worth making.
I'm a big fan as well, using EF and Linq-to-SQL. My reasons are:
Since LINQ is compiled and type safe, you don't get the problems of typos in "string-based" SQL. I don't know how many hours I've spent of my life tracking down an error in an SP or other SQL where a "tick" or some other keyword was in the wrong place.
The above and other factors make development faster.
Though there certainly is overhead compared to "closer to the metal" methods of database querying, none of us would be using .NET at all, or even C++ if performance was our #1 concern. For most apps, I've been able to get excellent performance from Linq-To-SQL, even without using the stored proc approach (the client-based compiled queries is my usual approach).
Certainly for some applications you still want to do things the old fashioned way though.
I guess what I meant was, what is the innovation that ORMs provide over building your DAL using traditional ADO.NET, SQL and mapping them to objects in code?
Here are the three major pieces of my DAL, which I am comparing with ORMs to see the benefits:
You still have to have a query in an ORM = SQL (SQL is more powerful by far)
Mapping code moves to configuration but still not eliminated, just shifts from one paradigm to another
Objects have to be defined and managed in a way that is tightly related to your data schema, unlike in the traditional approach, where I can keep them decoupled.
Am I missing something?
I have been following Fluent-NHibernate very closely as it has some of the most potential I've ever seen in a project.
I am a big ORM guy, get the logic out of the database, use the database only for speed.
I love the speed you can develop an application. The biggest advantage, depending on the ORM, is you can change the back end without having to translate your business logic.
I switched to LINQ and have never looked back. DBLinq is awesome for working with databases other than MSSQL. I have used it with MySQL and it is GREAT.
Not yet, still skeptical; like most Microsoft products, I wait for SP2 or a year and a half before trusting them in a production environment.
And note that pretty much every new thing introduced by anyone, not just Microsoft, is hailed as "lightweight, fast and easy" - take it with a block of salt. They do not advertise the problems/issues quite as loudly as the benefits/features! That's why we wait for early adopters to discover them.
This is not to disparage ORM or LINQ or anything like that; I'm reserving judgement until
I have time to evaluate them,
some need arises that only they can satisfy,
the technology appears stable and well-supported enough to risk in one of my clients' production environments, and/or
a client requests it
Please note: I've done ORM before, manually, and it worked just fine, so I have high hopes for the newer ORM systems.
If CodeSmith generates code based on your tables, aren't you still tightly coupled to your data schema? I would prefer a solution that decouples my objects from my database schema for more flexibility in the architecture.
That's from one of your comments - It's true, CodeSmith tightly couples you to your tables.
NHibernate, on the other hand, has a lot of features that can help with that: you have Components, so that in code you can have a Person with an Address property, where Address is a separate class.
You also have inheritance mapping. So it does a fair job of decoupling your schema from your domain.
We still use a hand rolled, repetitive cut'n'paste DAL where I work. It's extremely painful, complex, and error prone, but it's something all developers can understand. Although it is working for us at the moment, I don't recommend it as it begins to break down quickly on large projects. If someone doesn't want to go to full blown ORM, I'd at least advocate some sort of DAL generation.
I'm actually working on writing an ORM tool in .NET as a side project to keep me entertained.
SQL is dead to me. I hate writing it, especially having it in my code anywhere. Manually writing select/insert/update/delete queries for each object is a waste of time IMO. And don't even get me started on handling NULLs ("where col_1 = ?" vs "where col_1 is null") when dynamically generating queries. The ORM tools can handle that for me.
Also, having one place where SQL may be dynamically generated would hopefully go a long way toward eliminating SQL injection attacks.
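The NULL problem mentioned above ("where col_1 = ?" vs "where col_1 is null") is easy to show with a small sketch. This is not taken from any particular ORM; build_where, ALLOWED_COLUMNS and the table t are hypothetical names, and Python's built-in sqlite3 is used only so the example runs on its own.

```python
import sqlite3

# Hypothetical whitelist: column names are never taken from user input.
ALLOWED_COLUMNS = {"col_1", "col_2"}


def build_where(filters):
    """Build a WHERE fragment and parameter list from {column: value}.

    A value of None becomes "IS NULL", because "col = ?" with a NULL
    parameter never matches any row in SQL.
    """
    clauses, params = [], []
    for column, value in filters.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
        if value is None:
            clauses.append(f"{column} IS NULL")
        else:
            clauses.append(f"{column} = ?")
            params.append(value)
    # "1 = 1" keeps the statement valid when there are no filters.
    return (" AND ".join(clauses) or "1 = 1"), params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col_1 TEXT, col_2 INTEGER)")
conn.executemany("INSERT INTO t (col_1, col_2) VALUES (?, ?)",
                 [("a", 1), (None, 2), (None, 3)])
conn.commit()

# Naive version: "= ?" with a NULL parameter silently returns nothing.
naive_rows = conn.execute(
    "SELECT col_2 FROM t WHERE col_1 = ?", (None,)).fetchall()

# Generated version: the builder switches to IS NULL.
where, params = build_where({"col_1": None})
null_rows = conn.execute(
    f"SELECT col_2 FROM t WHERE {where} ORDER BY col_2", params).fetchall()

print(naive_rows, null_rows)  # prints "[] [(2,), (3,)]"
```

Because values only ever travel as parameters and column names come from a fixed set, this single generation point is also where the SQL injection protection lives.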
On the other hand, I've used Hibernate, and I absolutely hate it. In a real-world project, we ran into limitations, unimplemented bits, and bugs every few weeks.
Keeping the query logic DB side (usually as a view or stored procedure) also has the benefit of being available for DBAs to tune. That was a constant battle at my last job, using Hibernate. DBAs: "give us all the possible queries so we can tune" Devs: "uh, I don't know because Hibernate will generate them on the fly. I can give you some HQL and an XML mapping though!" DBAs: "Don't make me punch you in the face!"
I dislike the code generation used in most ORMs. In fact, code generation in general I find to be a weak tool that is usually indicative of using the wrong language in the first place.
In particular with .Net reflection, I don't see any need for code gen for ORM purposes.
Here's one strong opinion.
No, I dumped ORMs and switched to Smalltalk and an OODB: Gemstone. Better code, less code, faster development.