I'm curious to know how the community feels on this subject. I've recently run into the question in an NHibernate/WCF scenario (entities persisted at the service layer) and realized I may be going in the wrong direction here.
My question is, plainly: when using a persistent object graph (NHibernate, LINQ to SQL, etc.) behind a web service (WCF in this scenario), do you prefer to send those entities over the wire, or would you create a set of lighter DTOs (sans cyclic references) to send across instead?
DTOs. Use AutoMapper for the object-to-object mapping.
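For example, a minimal sketch (the Order/OrderDto types are hypothetical, and depending on your AutoMapper version you may use the older static Mapper.CreateMap API instead of MapperConfiguration):

    using System.Collections.Generic;
    using AutoMapper;

    // Hypothetical entity with a lazy-loaded collection we don't want on the wire.
    public class Order
    {
        public int Id { get; set; }
        public decimal Total { get; set; }
        public virtual ICollection<OrderLine> Lines { get; set; }
    }

    public class OrderLine
    {
        public int Id { get; set; }
        public string Product { get; set; }
    }

    // The DTO carries only the fields the client actually needs.
    public class OrderDto
    {
        public int Id { get; set; }
        public decimal Total { get; set; }
    }

    public static class OrderMapping
    {
        // Configure the mapping once at startup, then map at the service boundary.
        private static readonly IMapper Mapper =
            new MapperConfiguration(cfg => cfg.CreateMap<Order, OrderDto>()).CreateMapper();

        public static OrderDto ToDto(Order order) => Mapper.Map<OrderDto>(order);
    }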
I've been in this scenario multiple times before and can speak from experience on both sides. Originally I just serialized my entities and sent them as-is. This worked fine from a functional standpoint, but the more I looked into it, the more I realized I was sending more data than I needed to and was losing the ability to vary the implementation on either side. In subsequent service applications I've taken to creating DTOs whose only purpose is to get data to and from the web service.
Interop aside, having to think about all the fields being sent over the wire is very helpful (to me) for making sure I'm not sending data that isn't needed or, worse, data that should never reach the client.
As others have mentioned, AutoMapper is a great tool for entity to DTO mapping.
I've almost always created DTOs to transfer over the wire and used richer entities on my server and client. On the client they'll have some common presentation logic, while on the server they'll have business logic. Mapping between the DTOs and the entities can be dumb, but it needs to happen. Tools like AutoMapper help you with it.
If you're asking "do I send serialized entities from a web service to the outside world?", then the answer is definitely no; you're going to get minimal interoperability if you do that. DTOs help solve this problem by defining a set of 'objects' that can be instantiated in any language, whether you're using C#, Java, JavaScript, or anything else.
I've always had problems sending NHibernate objects over the wire, particularly if you're using an ActiveRecord model and/or if your object has ties to the session (yuck). Another nasty result is that NHibernate may try to load the object at the entry of the method (before you can get to it), which can also cause problems.
So... getting the message here? Problems, problems, problems... DTOs all the way.
Related
I am using MVC 5 Web API with Entity Framework 6 database first. While serializing my objects to JSON I ran into problems with self-referencing loops. I googled the problem and found many solutions, so I'm wondering which approach is best.
I found:
Use [JsonIgnore], but then I need to add it again every time I update the model from the DB.
Remove virtual from the collection properties.
Create a new layer of Data Transfer Objects (DTOs).
Use JsonSerializerSettings (this didn't work for me, since it generates "$id" and "$ref"; see the sketch just below).
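For reference, the Web API configuration for option 4 typically looks like the sketch below; ReferenceLoopHandling.Ignore simply stops serializing when it detects a cycle, whereas PreserveReferencesHandling is what produces the "$id"/"$ref" markers:

    using System.Web.Http;
    using Newtonsoft.Json;

    public static class WebApiConfig
    {
        public static void Register(HttpConfiguration config)
        {
            var settings = config.Formatters.JsonFormatter.SerializerSettings;

            // Silently stop serializing a branch when a cycle is detected...
            settings.ReferenceLoopHandling = ReferenceLoopHandling.Ignore;

            // ...instead of preserving object identity with "$id"/"$ref" markers:
            // settings.PreserveReferencesHandling = PreserveReferencesHandling.Objects;
        }
    }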
I guess you know that the answer is "it depends", right? However, in my experience, more often than not we end up with a DTO layer. It is useful for solving a whole family of problems of this kind, while the other solutions address only this particular instance. On other occasions we also had to flatten objects (Employee.CompanyName instead of Employee.Company.Name) and deal with similar issues. The downside is that you cannot directly expose IQueryable from your API, although we did something to translate expression trees. So basically it comes down to whether you care about exposing IQueryable directly from the service.
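The flattening mentioned above can be done by hand or via AutoMapper's flattening convention (Company.Name maps to CompanyName by name); here is a minimal hand-rolled sketch with hypothetical Employee/Company types:

    // Hypothetical domain types.
    public class Company
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public class Employee
    {
        public int Id { get; set; }
        public string FullName { get; set; }
        public Company Company { get; set; }
    }

    // Flattened DTO: the client sees a single string, not a nested object graph.
    public class EmployeeDto
    {
        public int Id { get; set; }
        public string FullName { get; set; }
        public string CompanyName { get; set; }
    }

    public static class EmployeeMapping
    {
        // Mapping by hand keeps the flattening explicit and easy to follow.
        public static EmployeeDto ToDto(Employee e) => new EmployeeDto
        {
            Id = e.Id,
            FullName = e.FullName,
            CompanyName = e.Company != null ? e.Company.Name : null
        };
    }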
I have done a lot of searching and experimenting and have been unable to find a workable resolution to this problem.
Environment/Tools
Visual Studio 2013
C#
Three tier web application:
Database tier: SQL Server 2012
Middle tier: Entity Framework 6.* using Database First, Web API 2.*
Presentation tier: MVC 5 w/Razor, Bootstrap, jQuery, etc.
Background
I am building a web application for a client that requires a strict three-tier architecture. Specifically, the presentation layer must perform all data access through a web service; it cannot access a database directly. The application allows a small group of paid staff members to manage people, waiting lists, and the resources they are waiting for. Based on the requirements, the data model/database design is entirely centered on people (the User table).
Problem
When the presentation layer requests something, say a Resource, it is related to at least one User, which in turn is related to some other table, say Roles, which are related to many more Users, which are related to many more Roles and other things. The point being that when I query for just about anything, EF wants to bring in almost the entire database.
Normally this would be okay because of EF's default lazy-load behavior, but when serializing just about any object to JSON to return to the presentation layer, the Newtonsoft.Json serializer hangs for a long time and then overflows the stack.
What I Have Tried
Here is what I have attempted so far:
Set Newtonsoft's JSON serializer ReferenceLoopHandling setting to Ignore. No luck. This is not a cyclic-graph issue; it is just the sheer volume of data that gets brought in (there are over 20,000 Users).
Clear/reset unneeded collections and set reference properties to null. This showed some promise, but I could not get around Entity Framework's desire to track everything.
Just setting/clearing nav properties causes those changes to be saved back to the database on the next .SaveChanges() (note: this is an assumption, but it seemed pretty sound; if anyone knows differently, please speak up).
Detaching the entities causes EF to automatically clear ALL collections and set ALL reference properties to null, whether I wanted it to or not.
Using .AsNoTracking() on everything threw some exception about not allowing non-tracked entities to have navigation properties (I don't recall the exact details).
Use AutoMapper to make copies of the object graph, only including related objects I specify. This approach is basically working, but in the process of (I believe) performing the auto-mapping, all of the navigation properties are accessed, causing EF to query and resolve them. In one case this leads to almost 300,000 database calls during a single request to the web service.
What I am Looking For
In short, has anyone had to tackle this problem before and come up with a working and performant solution?
Failing that, any pointers on where to look for how to handle this would be greatly appreciated.
Additional Note: It occurred to me as I wrote this that I could possibly combine the second and third items above. In other words, set/clear nav properties, then automap the graph to new objects, then detach everything so it won't get saved (or perhaps wrap it in a transaction and roll it back at the end). However, if there is a more elegant solution I would rather use that.
Thanks,
Dave
It is true that doing what you are asking for is very difficult and it's an architectural trap I see a lot of projects get stuck in.
Even if this problem were solvable, you'd basically end up with a data layer that just wraps the database and destroys performance, because you can't leverage SQL properly.
Instead, consider building your data access service so that it returns meaningful objects containing meaningful data; that is, only the data required to perform the specific task outlined in the requirements documentation. It is true that a post is related to an account, which has many achievements, etc., etc. But usually all I want is the text and the name of the poster, and I don't want it for one post; I want it for each post in a page. Instead, write data services and methods that do things relevant to your application.
To clarify, it's the difference between returning a Page object containing a list of Posts that hold only a poster name and a message, and returning entire EF objects containing large amounts of irrelevant data such as IDs and auditing data like creation times.
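As a rough sketch of that difference (hypothetical Post/Author/Page types and a hypothetical BlogContext), projecting inside the query keeps both the SQL and the response payload small:

    using System;
    using System.Collections.Generic;
    using System.Data.Entity;
    using System.Linq;

    // Hypothetical entities and context, just to make the shape concrete.
    public class Author { public int Id { get; set; } public string Name { get; set; } }

    public class Post
    {
        public int Id { get; set; }
        public string Text { get; set; }
        public DateTime CreatedAt { get; set; }
        public virtual Author Author { get; set; }
    }

    public class BlogContext : DbContext
    {
        public DbSet<Post> Posts { get; set; }
    }

    // Lightweight shapes: only what the page actually displays.
    public class PostSummaryDto
    {
        public string PosterName { get; set; }
        public string Message { get; set; }
    }

    public class PageDto
    {
        public int PageNumber { get; set; }
        public List<PostSummaryDto> Posts { get; set; }
    }

    public class PostService
    {
        public PageDto GetPage(int pageNumber, int pageSize)
        {
            using (var db = new BlogContext())
            {
                // The projection is translated to SQL, so only the two columns the
                // page needs ever leave the database; no entity graph is materialized
                // and nothing irrelevant gets serialized.
                var posts = db.Posts
                    .OrderByDescending(p => p.CreatedAt)
                    .Skip((pageNumber - 1) * pageSize)
                    .Take(pageSize)
                    .Select(p => new PostSummaryDto
                    {
                        PosterName = p.Author.Name,
                        Message = p.Text
                    })
                    .ToList();

                return new PageDto { PageNumber = pageNumber, Posts = posts };
            }
        }
    }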
Consider the Twitter API. If it were implemented as above, performance would be abysmal with the amount of traffic Twitter gets. And most of the information returned (costing CPU time, disk activity, DB connections as they're held open longer, network bandwidth) would be completely irrelevant to what developers want to do.
Instead, the API exposes what would be useful to a developer looking to make a Twitter app. Get me the posts by this user. Get me the bio for this user. This is probably implemented as very nicely tuned SQL queries for someone as big as Twitter, but for a smaller client, EF is great as long as you don't attempt to defeat its performance features.
This additionally makes testing much easier as the smaller, more relevant data objects are far easier to mock.
For three-tier applications, especially if you are going to expose your entities "raw" in services, I would recommend that you disable lazy loading and proxy generation in EF. The alternative is to use DTOs instead of entities, so that the web services return a model object tailored to the service instead of the entity (as suggested by jameswilddev).
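In EF6, disabling both typically looks something like this in the context constructor (a sketch; the class name and connection-string name are placeholders):

    using System.Data.Entity;

    public class AppDbContext : DbContext
    {
        public AppDbContext() : base("name=DefaultConnection")   // placeholder connection string name
        {
            // No dynamic proxies: serializers see the plain POCO types you wrote.
            Configuration.ProxyCreationEnabled = false;

            // No lazy loading: navigation properties stay null unless you explicitly
            // Include() them, so serialization can't fan out across the whole graph.
            Configuration.LazyLoadingEnabled = false;
        }

        // DbSets go here, e.g. public DbSet<User> Users { get; set; }
    }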
Either way will work; each has its own trade-offs.
If you are using EF in a multi-tier environment, I would highly recommend Julia Lerman's DbContext book (I have no affiliation): http://www.amazon.com/Programming-Entity-Framework-Julia-Lerman-ebook/dp/B007ECU7IC
There is a chapter in the book dedicated to working with DbContext in multi-tier environments (you will see the same recommendations about lazy loading and proxies). It also talks about how to manage inserts and updates in a multi-tier environment.
I had such a project and it was a stressful one.... I also needed to load a large amount of data, process it from different angles, and pass it to a complex dashboard of charts and tables.
My optimizations were:
1- Instead of using EF to load data, I called old-school stored procedures (and, for further optimization, grouped things to reduce the result set as much as possible for the charts, e.g. one query returns a table from which multiple chart datasets can be extracted). A sketch of this follows at the end of this answer.
2- More importantly, instead of Newtonsoft's JSON I used fastJSON, whose performance was notable (it is really fast, but it does not cope well with complex objects, for example view models that contain lists of models nested deeper and deeper).
It is better to read about the pros and cons of fastJSON first:
https://www.codeproject.com/Articles/159450/fastJSON
3- Relational database design is the prime suspect in this kind of problem. It can be worthwhile to keep the tables that hold raw data to be processed (most often for analytics) in a denormalized schema, which saves time when querying the data.
Also be wary of using the model classes generated by the EF designer from the database for reading/selecting data, especially when you want to serialize them (sometimes I think about splitting the same schema into two sets of otherwise identical classes/models, one for writing and one for reading, such that the write models keep the virtual collections that come from foreign keys and the read models ignore them... I am not sure about this).
NOTE: In the case of very, very large data it is better to go deeper and set up In-Memory OLTP for the specific tables that contain facts or raw data; in that case the table acts more like a non-relational table, as in NoSQL.
NOTE: For example, in MS SQL Server you can take advantage of SQLCLR, which lets you write code in C#, VB, etc. and call it from T-SQL; in other words, handle data processing at the database level.
4- For interactive views that need to load data, I think it is better to consider which information should be processed server-side and which can be handled client-side (sometimes it is better to query data from the client side; however, you should keep in mind that data on the client side can be accessed by the user). It is situation-dependent.
5- For views over large raw data tables, using datatables.min.js is a good idea, and everyone suggests using server-side paging on tables.
6- For importing and exporting data from big files, OLE DB is the best choice, I think.
However, I still doubt these are exact solutions. If anybody has practical solutions, please mention them ;).
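Regarding point 1, calling a stored procedure from EF6 without materializing tracked entities could be sketched roughly like this (the procedure name, the result type, and AppDbContext are all hypothetical):

    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Linq;

    public class ChartRowDto
    {
        public string Category { get; set; }
        public decimal Value { get; set; }
    }

    public class ReportingService
    {
        public List<ChartRowDto> GetDashboardData(int year)
        {
            using (var db = new AppDbContext())   // hypothetical DbContext
            {
                // SqlQuery maps each result row onto the DTO by column name;
                // no change tracking, no navigation properties, no proxies.
                return db.Database
                    .SqlQuery<ChartRowDto>(
                        "EXEC dbo.GetDashboardData @Year",
                        new SqlParameter("@Year", year))
                    .ToList();
            }
        }
    }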
I have fiddled with a similar problem using EF model first and found the following solution satisfactory for "one to many" relations:
Include "Foreign key properties" in the sub-entities and use this for later look-up.
Define the get/set modifiers of any "Navigation Properties" (sub-collections) in your EF entity to private.
This will give you an object not exposing the sub-collections, and you will only get the main properties serialized. This workaround will require some restructuring of your LINQ queries, asking directly from your table of SubItems with the foreign key property as your filtering option like this:
    var myFitnessClubs = context.FitnessClubs
        .Where(f => f.FitnessClubChainID == myFitnessClubChain.ID);
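The resulting entity shape might look roughly like this (hypothetical FitnessClub/FitnessClubChain classes; with model first you set the getter/setter access on the navigation property in the designer, and the generated class ends up shaped like this):

    using System.Collections.Generic;

    public class FitnessClubChain
    {
        public int ID { get; set; }
        public string Name { get; set; }

        // Private navigation property: EF can still use the relationship, but the
        // collection is not public, so it is never serialized and never pulled in
        // by callers.
        private ICollection<FitnessClub> FitnessClubs { get; set; }
    }

    public class FitnessClub
    {
        public int ID { get; set; }
        public string Name { get; set; }

        // Exposed foreign key used for the look-up query shown above.
        public int FitnessClubChainID { get; set; }
    }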
Note 1:
You may of course choose to implement this solution only partly, affecting just the sub-collections that you definitely do not want serialized.
Note 2:
For "many to many" relations, at least one of the entities needs to have a public representation of the collection, since the relation cannot be retrieved using a single ID property.
I'm working with repositories, and one thing I'm really working hard on is to make things as decoupled as they can be, so that if tomorrow we change from relational databases to something else, like NoSQL, we are good to go; we just have to change our DAL.
I've been trying to figure out how to implement the SaveChanges method in my Web API controller without needing to use the EFContextProvider. I then found the Breeze NoDb sample; however, that sample uses the Breeze ContextProvider in the repository. This troubles me because Breeze is a JS library, so it belongs to the presentation side of my application. Having the repository use a component from Breeze couples the DAL to the presentation, something I don't want to do.
Searching again for how to implement SaveChanges without EF, I found this question, where there's one very good answer explaining how to convert the SaveBundle to a SaveMap and then use that to implement the saving logic. However, I'm stuck at this point because each entry of the SaveMap gives just a Type object and the EntityInfo, so I don't see how to use this with my repositories.
So, how do I deal with SaveChanges without referring to the EFContextProvider and without coupling the repositories to the ContextProvider?
Are you planning to switch from SQL Server to a NoSQL database? Why aren't you doing that right now, then? How often are you planning to switch backing storage? Probably not often, if ever.
I found that a database switch, especially from SQL to NoSQL, is a big paradigm shift. In one of my applications I went through a conversion from SQL to RavenDB. Despite having everything decoupled and using repositories everywhere, I still had to rewrite most of the application's storage logic.
What you are trying to do - you are not going to need it. So stop making life hard for yourself and get on with implementing features.
The ContextProvider does the work of converting the JObject (which Json.NET gives you in the SaveChanges method) into real, typed .NET objects. The EntityInfo object that the ContextProvider creates for each entity contains the entity object itself, as well as the entityAspect properties it got from the client: the EntityState (Added, Modified, or Deleted), the original values of all changed properties, and the temporary values for any auto-generated keys. This is the information you would need to save the entities yourself. The "SaveMap" just organizes them by Type for convenience, but you can manipulate them however you like.
As described in the post you referenced, you could proceed by using a ContextProvider just to convert the JObject to entities, then pass those entities to the appropriate repositories. Your repositories don't need to know anything about the ContextProvider.
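In rough outline, that could look like this (the repository abstraction is hypothetical; EntityInfo and the SaveMap dictionary come from the Breeze ContextProvider as described above, so treat the exact member names as version-dependent):

    using System;
    using System.Collections.Generic;

    // Hypothetical repository abstraction; EntityInfo and EntityState come from
    // the Breeze ContextProvider (exact namespace depends on your Breeze version).
    public interface IRepository
    {
        void Add(object entity);
        void Update(object entity);
        void Delete(object entity);
        void Commit();
    }

    public class SaveHandler
    {
        private readonly Func<Type, IRepository> _repositoryFactory;

        public SaveHandler(Func<Type, IRepository> repositoryFactory)
        {
            _repositoryFactory = repositoryFactory;
        }

        // saveMap: one entry per entity type, as produced by the ContextProvider.
        public void Save(Dictionary<Type, List<EntityInfo>> saveMap)
        {
            foreach (var kvp in saveMap)
            {
                var repository = _repositoryFactory(kvp.Key);

                foreach (var entityInfo in kvp.Value)
                {
                    // EntityInfo.Entity is the typed .NET object rebuilt from the
                    // client JObject; EntityState says what the client did to it.
                    switch (entityInfo.EntityState)
                    {
                        case EntityState.Added:    repository.Add(entityInfo.Entity);    break;
                        case EntityState.Modified: repository.Update(entityInfo.Entity); break;
                        case EntityState.Deleted:  repository.Delete(entityInfo.Entity); break;
                    }
                }

                repository.Commit();
            }
        }
    }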
Breeze offers an NHibernate provider that you can look at, which shows how to talk to a non-EF backend that is still a .NET server. The ContextProvider is a convenience that makes implementing any .NET provider substantially easier, but it is by no means a requirement.
As for NoSQL, you should take a look at the Breeze Node provider and MongoDB sample, which is hosted in Node.js (and which shows that the ContextProvider is obviously not a requirement).
We also expect to have a Breeze server implementation written in Java in the near future, which again has no "ContextProvider" requirement.
We are developing an application that uses Entity Framework behind a simple WCF SOAP service (not a WCF Data Service). I am very confused; I have read the following posts but I don't understand which way to go. This question is almost the same as mine, but I have a constraint to use POCOs and to try to avoid DTOs. It's not that big a service. However, in the answer to the question I linked, it is written that if I try to send POCO classes over the wire, there will be problems with serialization.
This post implements a solution related to my problem, but the author did not mention anything about the serialization problem; he just set ProxyCreationEnabled = false, which I found in many other articles as well.
But these posts are also a little old, so what is the recommendation today? I have to post and get a lot of Word/Excel/PDF/text files as well, so will it be OK to send POCO classes, or will there be problems with serialization?
Thanks!
I definitely do not agree with that answer. It suggests reinventing the wheel (and it does not even indicate why POCOs should not be used).
You can definitely go with POCOs; I see no reason to have serialization issues. If you do have any, you can write DTOs for the specific problematic parts and map them to POCOs in the business layer.
It is good practice to use POCOs; as the name itself suggests, they are Plain Old CLR Objects. Writing the same classes again by hand instead of generating them has no advantage. You can simply test it.
UPDATE:
Lazy loading: lazy loading means fetching related objects from the database whenever they are accessed. If you have already serialized and deserialized an entity (e.g. you have sent it to the client side over the wire), lazy loading will not work, since you will not have a proxy on the client side.
Proxy: a proxy class simply enables communication with the DB (a very simplified definition, by the way). It is not possible to use an instance of a proxy on the client side; it does not make sense. Just separate the proxy classes and POCO entities into two different DLLs, share only the POCO objects with the client, and use the proxies on the service side.
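So, on the service side, you typically eager-load whatever the operation promises to return before serializing it (a sketch with hypothetical Customer/Order types and a hypothetical AppDbContext configured with proxies and lazy loading turned off; Include replaces the lazy loading you lose once the object leaves the service):

    using System.Data.Entity;   // brings in the Include() lambda overload
    using System.Linq;

    public class CustomerService   // hypothetical WCF service implementation
    {
        public Customer GetCustomer(int id)
        {
            using (var db = new AppDbContext())
            {
                // Load the related data the client needs *now*; once the POCO
                // has been serialized there is no context left to lazy-load from.
                return db.Customers
                    .Include(c => c.Orders)
                    .SingleOrDefault(c => c.Id == id);
            }
        }
    }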
I'm starting a project using EF 4 and POCO.
What is the best practice for sending data to the client? Should I send the POCO, or should I have a DTO instead?
Are there any issues I should be aware of when sending an entity (disconnected from the context) to the client?
Is it a recommended practice to send the POCO to the client layer?
I believe we are mixing two definitions here that have no relation to each other.
DTO, or Data Transfer Object, is a design pattern; you can use it to transfer data between layers, and DTOs don't have behavior. Martin Fowler explains this very well at: http://www.martinfowler.com/eaaCatalog/dataTransferObject.html
On the other hand, we have POCO, or Plain Old CLR Object. But to talk about POCO, we have to know where the term started: POJO, or Plain Old Java Object. Martin Fowler coined that term with two partners, and he explains it here: http://www.martinfowler.com/bliki/POJO.html
So POCOs can have behavior and everything else you want. They are the same ordinary classes you write on a daily basis; the name was coined just to have a short, easy-to-remember way of referring to them.
In answer to your second question, I think the best approach, and the one I always go for, is sending DTOs from the business layer to everything that uses it (e.g. your services, web site, desktop app, mobile app, etc.). This is because they don't have behavior and, in most cases, contain little more than properties, so they are lightweight and ideal for use in services; and of course, they don't reveal sensitive data from your business.
That being said, if you are planning to use DTOs, I can recommend downloading EntitiesToDTOs, an Entity Framework DTO generator that I recently published on CodePlex; it is free and open source. Go to http://entitiestodtos.codeplex.com
For me, one of the main reasons to use EF4 with POCOs is that you don't need DTOs. I can understand using DTOs with traditional EDMX files, where your entities are pretty bloated, but that isn't the case here.
Your POCO obviously needs to be serializable, but there really shouldn't be any issues specific to sending POCO entities that don't also occur with DTOs.
I have a somewhat different opinion from those above.
I believe a DTO or ViewModel is still needed outside of the server layer.
In a real-world application, very few views need only one domain object; almost every view needs multiple domain objects.
And all those domain objects get wrapped in one DTO or ViewModel class.
This is why I insist a DTO or ViewModel is still needed, even though the entities are POCOs.
I would consider EF4 entities business models AND view models rolled into one. They already implement PropertyChanged out of the box, for example. Partial classes can provide custom functionality if you need it. Mirroring the entities with your own safety layer creates unnecessary work and maintenance, in my opinion.
I'm a believer in separating business logic from everything else. However, in the case of EF4 the work is already done for you. Go nuts.