Decoupling into DAL and BLL - my concerns

In many posts on this topic I come across very simple examples that do not answer my question.
Let's say I have a document table and a user table. In a DAL written in ADO.NET I have a method to retrieve all documents matching some criteria. Now in the UI I have a case where I need to show this list along with the names of the creators.
Up to now I have done it with one method in the DAL containing a JOIN statement.
However, every time I have such a complex method I have to do custom mapping to some object that doesn't map 1:1 to the DB.
Should it be put into another layer? If so, I will have to give up the join query in favor of iterating through the results and querying each document's author, which doesn't make sense (performance).
What is the best approach for such scenarios?

For your UI, my suggestion is to have a DTO (a view model, for those MVP/MVC people) hold the user's data and the corresponding list of documents.
Custom mapping will always be present, so I suggest you take a look at AutoMapper to ease those mapping pains.

I ran into the same thing in the past while creating my own custom data access layers. You want your objects to map one to one to your DB, but many times you just need to write one-off custom functions to retrieve data through inner joins. I would not put these custom actions into their own layer.
At times, what I have done is create a general class that was responsible for retrieving data for grids, combo boxes, etc., joining information from a number of tables. This class would return custom objects containing the retrieved results. If you are not satisfied with a tool that performs automatic custom mapping for you, I would suggest creating your own auto-mapping class builder utility.
As long as you split your app into data access, business, and UI layers I think you are headed in the right direction.
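
To make the DTO suggestion from the answers above concrete, here is a minimal sketch of a DAL method that keeps the JOIN and maps each row to a flat read model. The class names, column names, and schema (Document with Id, Title, CreatorId; User with Id, Name) are assumptions for illustration, not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical read model: one row of the joined document/user result.
public sealed class DocumentListItem
{
    public int DocumentId { get; set; }
    public string Title { get; set; } = "";
    public string CreatorName { get; set; } = "";
}

public static class DocumentListQuery
{
    // Assumed schema: Document(Id, Title, CreatorId) and [User](Id, Name).
    public const string Sql =
        "SELECT d.Id, d.Title, u.Name AS CreatorName " +
        "FROM Document d JOIN [User] u ON u.Id = d.CreatorId";

    // Maps the joined result set to DTOs. Accepts any IDataReader,
    // for example the one returned by IDbCommand.ExecuteReader().
    public static List<DocumentListItem> ReadAll(IDataReader reader)
    {
        int id = reader.GetOrdinal("Id");
        int title = reader.GetOrdinal("Title");
        int creator = reader.GetOrdinal("CreatorName");

        var items = new List<DocumentListItem>();
        while (reader.Read())
        {
            items.Add(new DocumentListItem
            {
                DocumentId = reader.GetInt32(id),
                Title = reader.GetString(title),
                CreatorName = reader.GetString(creator)
            });
        }
        return items;
    }
}
```

The query stays a single round trip, and the UI only ever sees DocumentListItem, so the object that does not map 1:1 to a table lives next to the query that produces it rather than in a separate layer.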

Related

Using a Data Access Layer in an OOP C# application using SQL

I have come from an environment where I was taught to use objects and employ OOP techniques where possible and I guess it has guided me down a particular road and influenced my product designs for quite some time.
Typically my data access layer will have a number of classes which map onto database tables, so if I need to display a list of companies, I will have a 'Company' object and a database table called 'company'. The Company object knows how to instantiate itself from a DataRow read from the database using a 'SELECT * FROM company WHERE id = x' type query. So whenever I display a list of companies I populate a list of Company objects and display them. If I need to display attributes of the company I already have the data loaded.
It has been mentioned that 'select *' is frowned on and my object approach can be inefficient, but I am having issues identifying another way of working with a database table and objects which would work if you only read specific fields - the object just wouldn't be populated.
Yes I could change the list to directly query just the required fields from the database and display them but that means my UI code would need to be more closely linked to the data access code - I personally like the degree of separation of having the object separating the layers.
Always willing to learn though - I work on my own so don't always get up to speed with the latest technologies or methodologies so any comments welcome.

I don't think I can show you a definitive solution to that, but I'll try pointing you in the right direction (since it's more of a theoretical question).
Depending on the design pattern you follow in your application, you could decouple the data access layer from the UI and still follow the rule of not fetching all columns when they are not necessary. Choosing the right design pattern for an application can give you this sort of ease.
For example, maybe you could treat the less detailed version of an object as an object in itself (which, honestly, I don't think would be a good approach).
Also, I'll mention that the very popular Rails ORM ActiveRecord only fetches from the DB when the data is used. Maybe you could use similar logic to track not only when but also which columns will be used, so you could limit the query.
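
As a rough illustration of the "fetch only when the data is used" idea, here is a sketch using Lazy<T> from the .NET base class library. The Company and CompanyDetails shapes and the loadDetails delegate are hypothetical; in a real DAL the delegate would run the second, more detailed query.

```csharp
using System;

// Hypothetical "expensive" part of a company that the list view never shows.
public sealed class CompanyDetails
{
    public string Address { get; set; } = "";
}

public sealed class Company
{
    private readonly Lazy<CompanyDetails> _details;

    // loadDetails stands in for a DAL call such as
    // "SELECT address FROM company WHERE id = @id".
    public Company(int id, string name, Func<int, CompanyDetails> loadDetails)
    {
        Id = id;
        Name = name;
        _details = new Lazy<CompanyDetails>(() => loadDetails(id));
    }

    public int Id { get; }
    public string Name { get; }

    // The first access triggers the load; later accesses reuse the value.
    public CompanyDetails Details => _details.Value;
    public bool DetailsLoaded => _details.IsValueCreated;
}
```

The list query can then select only id and name, while the UI still works against a Company object and the extra columns are read only for the rows that actually need them.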

Serializing complex EF model over JSON

I have done a lot of searching and experimenting and have been unable to find a workable resolution to this problem.
Environment/Tools
- Visual Studio 2013
- C#
- Three tier web application:
  - Database tier: SQL Server 2012
  - Middle tier: Entity Framework 6.* using Database First, Web API 2.*
  - Presentation tier: MVC 5 w/Razor, Bootstrap, jQuery, etc.
Background
I am building a web application for a client that requires a strict three-tier architecture. Specifically, the presentation layer must perform all data access through a web service. The presentation layer cannot access a database directly. The application allows a small group of paid staff members to manage people, waiting lists, and the resources they are waiting for. Based on the requirements the data model/database design is entirely centered around the people (User table).
Problem
When the presentation layer requests something, say a Resource, it is related to at least one User, which in turn is related to some other table, say Roles, which are related to many more Users, which are related to many more Roles and other things. The point being that, when I query for just about anything EF wants to bring in almost the entire database.
Normally this would be okay because of EF's default lazy-load behavior, but when serializing just about any object to JSON for returning to the presentation layer, the Newtonsoft.Json serializer hangs for a long time then blows a stack error.
What I Have Tried
Here is what I have attempted so far:
1. Set Newtonsoft's JSON serializer ReferenceLoopHandling setting to Ignore. No luck. This is not a cyclic graph issue; it is just the sheer volume of data that gets brought in (there are over 20,000 Users).
2. Clear/reset unneeded collections and set reference properties to null. This showed some promise, but I could not get around Entity Framework's desire to track everything.
   - Just setting nav properties to null/clear causes those changes to be saved back to the database on the next .SaveChanges() (NOTE: This is an assumption here, but seemed pretty sound. If anyone knows different, please speak up).
   - Detaching the entities causes EF to automatically clear ALL collections and set ALL reference properties to null, whether I wanted it to or not.
   - Using .AsNoTracking() on everything threw some exception about not allowing non-tracked entities to have navigation properties (I don't recall the exact details).
3. Use AutoMapper to make copies of the object graph, only including related objects I specify. This approach is basically working, but in the process of (I believe) performing the auto-mapping, all of the navigation properties are accessed, causing EF to query and resolve them. In one case this leads to almost 300,000 database calls during a single request to the web service.
What I am Looking For
In short, has anyone had to tackle this problem before and come up with a working and performant solution?
Lacking that, any pointers for at least where to look for how to handle this would be greatly appreciated.
Additional Note: It occurred to me as I wrote this that I could possibly combine the second and third items above. In other words, set/clear nav properties, then automap the graph to new objects, then detach everything so it won't get saved (or perhaps wrap it in a transaction and roll it back at the end). However, if there is a more elegant solution I would rather use that.
Thanks,
Dave

It is true that doing what you are asking for is very difficult and it's an architectural trap I see a lot of projects get stuck in.
Even if this problem were solvable, you'd basically end up with a data layer which just wraps the database and destroys performance, because you can't leverage SQL properly.
Instead, consider building your data access service in such a way that it returns meaningful objects containing meaningful data; that is, only the data required to perform a specific task outlined in the requirements documentation. It is true that a post is related to an account, which has many achievements, and so on. But usually all I want is the text and the name of the poster. And I don't want it for one post; I want it for each post on a page. So write data services and methods which do things that are relevant to your application.
To clarify, it's the difference between returning a Page object containing a list of Posts which contain only a poster name and message and returning entire EF objects containing large amounts of irrelevant data such as IDs, auditing data such as creation time.
Consider the Twitter API. If it were implemented as above, performance would be abysmal with the amount of traffic Twitter gets. And most of the information returned (costing CPU time, disk activity, DB connections as they're held open longer, network bandwidth) would be completely irrelevant to what developers want to do.
Instead, the API exposes what would be useful to a developer looking to make a Twitter app. Get me the posts by this user. Get me the bio for this user. This is probably implemented as very nicely tuned SQL queries for someone as big as Twitter, but for a smaller client, EF is great as long as you don't attempt to defeat its performance features.
This additionally makes testing much easier as the smaller, more relevant data objects are far easier to mock.
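
A minimal sketch of the Page-of-Posts idea described in this answer, assuming hypothetical Post and Account entities and a PostService helper (none of these names come from the question). It runs against any IQueryable<Post>; when that queryable comes from an EF context, the Select projection is normally translated into SQL that reads only the projected columns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for entities; in the scenario above these would be EF classes.
public sealed class Account
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public sealed class Post
{
    public int Id { get; set; }
    public string Text { get; set; } = "";
    public DateTime CreatedAt { get; set; }
    public Account Author { get; set; } = new Account();
}

// Task-specific result types: only what the page needs.
public sealed class PostSummary
{
    public string PosterName { get; set; } = "";
    public string Message { get; set; } = "";
}

public sealed class Page
{
    public int Number { get; set; }
    public List<PostSummary> Posts { get; set; } = new List<PostSummary>();
}

public static class PostService
{
    // Returns one page of posts, newest first, projected to summaries.
    public static Page GetPage(IQueryable<Post> posts, int number, int size)
    {
        return new Page
        {
            Number = number,
            Posts = posts
                .OrderByDescending(p => p.CreatedAt)
                .Skip((number - 1) * size)
                .Take(size)
                .Select(p => new PostSummary { PosterName = p.Author.Name, Message = p.Text })
                .ToList()
        };
    }
}
```

Because PostSummary has no navigation properties, serializing a Page cannot walk into the rest of the object graph.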

For three tier applications, especially if you are going to expose your entities "raw" in services, I would recommend that you disable Lazy Load and Proxy generation in EF. Your alternative would be to use DTOs instead of entities, so that the web services return a model object tailored to the service instead of the entity (as suggested by jameswilddev).
Either way will work, and has a variety of trade-offs.
If you are using EF in a multi-tier environment, I would highly recommend Julia Lerman's DbContext book (I have no affiliation): http://www.amazon.com/Programming-Entity-Framework-Julia-Lerman-ebook/dp/B007ECU7IC
There is a chapter in the book dedicated to working with DbContext in multi-tier environments (you will see the same recommendations about Lazy Load and Proxy). It also talks about how to manage inserts and updates in a multi-tier environment.
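
For reference, a configuration sketch of the recommendation above, assuming the EntityFramework 6 package is referenced. WaitingListContext is a hypothetical context name; with Database First the context class is generated, so the same two settings could instead be applied to a context instance after it is created.

```csharp
using System.Data.Entity;

// Hypothetical EF6 context; only the two Configuration flags matter here.
public class WaitingListContext : DbContext
{
    public WaitingListContext()
    {
        // Navigation properties are no longer loaded on first access,
        // so a serializer walking the graph does not trigger queries.
        Configuration.LazyLoadingEnabled = false;

        // Entities are created as plain classes rather than dynamic proxies.
        Configuration.ProxyCreationEnabled = false;
    }
}
```

With both flags off, related data has to be requested explicitly (for example with Include), which makes it visible in code exactly what each service call returns.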

I had a project like this, and it was a stressful one. I also needed to load large amounts of data, process it from different angles, and pass it to a complex dashboard of charts and tables.
My optimizations were:
1. Instead of using EF to load data, I called old-school stored procedures (and, to optimize further, grouped the data to shrink the result as much as possible for the charts; e.g. one query returns a table from which the datasets for multiple charts can be extracted).
2. More importantly, instead of Newtonsoft's JSON I used fastJSON, whose performance was notable. It is really fast but does not cope well with complex objects; a simple example is a view model that contains a list of models, which in turn contain more lists, and so on. It is better to read about the pros and cons of fastJSON first:
https://www.codeproject.com/Articles/159450/fastJSON
3. Relational database design is the prime suspect for this problem, so it might be good to keep the tables holding raw data to be processed (most probably for analytics) in a denormalized schema, which improves query performance.
Also beware of using the model classes generated by the EF designer from the database for reading or selecting data, especially when you want to serialize them. (Sometimes I think about splitting the same schema into two sets of identical classes/models, one for writing and one for reading, so that the write models keep the virtual collections that come from foreign keys and the read models omit them. I am not sure about this.)
NOTE: For very large amounts of data it is better to go deeper and set up an in-memory OLTP table for the table that contains the facts or raw data. However, in that case the table acts as a non-relational table, like NoSQL.
NOTE: In MS SQL, for example, you can use SQL CLR, which lets you write routines in C#, VB, etc. and call them from T-SQL; in other words, handle the data processing at the database level.
4. For an interactive view that needs to load data, I think it is better to consider which information should be processed on the server side and which can be handled on the client side. (Sometimes it is better to query data from the client side, but keep in mind that data on the client side can be accessed by the user.) It depends on the situation.
5. For a large raw data table in a view, using datatables.min.js is a good idea, and everyone suggests using server-side paging on tables.
6. For importing and exporting data from big files, I think OLE DB is the best choice.
However, I still doubt that these are exact solutions. If anybody has practical solutions, please mention them ;)

I have fiddled with a similar problem using EF model first, and found the following solution satisfying for "One to Many" relations:
Include "Foreign key properties" in the sub-entities and use this for later look-up.
Define the get/set modifiers of any "Navigation Properties" (sub-collections) in your EF entity to private.
This will give you an object not exposing the sub-collections, and you will only get the main properties serialized. This workaround will require some restructuring of your LINQ queries, asking directly from your table of SubItems with the foreign key property as your filtering option like this:
    var myFitnessClubs = context.FitnessClubs
        .Where(f => f.FitnessClubChainID == myFitnessClubChain.ID);
Note 1:
You may of course choose to implement this solution only partly, affecting just the sub-collections that you strongly do not want to serialize.
Note 2:
For "Many to Many" relations, at least one of the entities needs to have a public representation of the collection, since the relation cannot be retrieved using a single ID property.

Entity Framework with two (almost) identical models

I have two databases that are almost identical in schema but separate in the data held within them. The consuming application does not need to know about these differences, so I am creating a service which will have the Entity Framework models, but I want it to return a single type. For example:
**DatabaseA** -> EFModelA
- Customers
- Orders
**DatabaseB** -> EFModelB
- Customers
- Orders
I want to be able to return a Customer class through a call, rather than it returning either EFModelA.Customer or EFModelB.Customer.
I have looked into two ways of doing this but can't work out which one will offer the least resistance in terms of maintenance. They are:
Create a unified view of the data in DatabaseA which unions in the data from DatabaseB, so the application only sees one entity set. The downside is working out how to make sure that the keys are unique, but I can add an extra field to denote which database each row came from.
Have 2 separate EF Models and have some way of scripting (e.g. T4 templates) a combined model of equivalent types and then implemented mapping code between the two EF types and the combined type.
Has anyone else come across this problem and how have you solved this? If not, how would you tackle this problem?
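
One way the second option could look, written by hand rather than generated: a combined Customer type plus mapping functions from each model, with a source marker (borrowed from the first option) to keep keys unique across databases. The property names, the SourceDatabase enum, and the CustomerMapper class are assumptions for illustration; the stand-in classes in the EFModelA and EFModelB namespaces represent the generated entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for the two generated EF entity types.
namespace EFModelA
{
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; } = "";
    }
}

namespace EFModelB
{
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; } = "";
    }
}

public enum SourceDatabase { A, B }

// The single type the service returns; (Source, Id) is unique across both databases.
public sealed class Customer
{
    public SourceDatabase Source { get; set; }
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class CustomerMapper
{
    public static Customer FromA(EFModelA.Customer c)
    {
        return new Customer { Source = SourceDatabase.A, Id = c.Id, Name = c.Name };
    }

    public static Customer FromB(EFModelB.Customer c)
    {
        return new Customer { Source = SourceDatabase.B, Id = c.Id, Name = c.Name };
    }

    // Merges both sets into one list of the combined type.
    public static List<Customer> Combine(
        IEnumerable<EFModelA.Customer> a, IEnumerable<EFModelB.Customer> b)
    {
        return a.Select(c => FromA(c)).Concat(b.Select(c => FromB(c))).ToList();
    }
}
```

The maintenance cost is one mapping function per entity per model, which is the part a T4 template would generate.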

Looking for basic pointers on Repository Pattern for highly related entities

I'm writing a database app in C# using SQL Server CE 3.5 and would like to implement a Repository Pattern. I've done several searches both on Google and SO; however, I cannot find an implementation that matches my needs, so I will ask the SO community directly.
The key business objects in my app are: video, actor, tag category and tag. The basic business rules are as follows:
Every tag belongs to a tag category.
A video may or may not have multiple actors and tags associated with it.
Actors and tags may or may not have multiple videos associated with them.
Here is where things get fuzzy for me:
Should I implement a video repository that includes actors, tag categories, and tags or should each of these business objects have their own repositories? Given these objects can exist independently, I'm inclined to create a repository for each one.
If each object should have its own repository, how do I relate them? For example, should the video repository include a property that queries the tag repository for matches?
I'm looking for some guidelines or best practices for setting this up. I understand the basics of the repository pattern, but I need some advice as to how to connect them together.

You should only have a repository for your aggregate roots.
I would not recommend using the repository as a way of encapsulating all your queries. Repositories are not big dumping grounds for queries - they are a specific tool for use in scenarios where DDD is most applicable. See this article for some more info: http://ayende.com/blog/3955/repository-is-the-new-singleton
There should be no need to 'connect' or 'relate' repositories.
If you want to write a query such as "Load all the tags for videos that this user has borrowed", it is probably best not to put it in the repository. This query is most likely specific to a certain case, e.g. a UI, and should be written inside or close to the class for which the query is required. The output of the query would probably be mapped to read-only Data Transfer Objects specifically created for the UI's requirement, not to your entities.

When to separate certain entities into different repositories?

I generally try and keep all related entities in the same repository. The following are entities that have a relationship between the two (marked with indentation):
User
    UserPreference
So they make sense to go into a user repository. However users are often linked to many different entities, what would you do in the following example?
User
    UserPreference
    Order
Order
    Product
Order has a relationship with both product and user but you wouldn't put functionality for all 4 entities in the same repository. What do you do when you are dealing with the user entities and gathering order information? You may need extra information about the product and often ORMs will offer the ability of lazy loading. However if your product entity is in a separate repository to the user entity then surely this would cause a conflict between repositories?

In the sense of Eric Evans's Domain-Driven Design ( http://domaindrivendesign.org/index.htm ), you should first think about your Aggregates. You then build your repositories around those.
There are many techniques for handling Aggregates that relate to each other. The one that I use most often is to only allow Aggregates to relate to each other through a read-only interface. One of the key thoughts behind Aggregates is that you can't change the state of underlying objects without going through the root. So if Product and User are root Aggregates in your model, then I can't update a Product if I got to it by going through User->Order->Product. I have to get the Product from the Product repository to edit it. (From a UI point of view you can make it look like you go User->Order->Product, but when you hit the Product edit screen you grab the entity from the Product Repository.)
When you are looking at a Product (in code) by going from User->Order->Product, you should be looking at a Product interface that does not have any way to change the underlying state of the Product (only gets, no sets, etc.).
Organize your Aggregates, and therefore your Repositories, by how you use them. I can see User and Product being their own Aggregates and having their own Repositories. I'm not sure from your description if Order should belong to User or also stand alone.
Either way use a readonly interface when Aggregates relate. When you have to cross over from one Aggregate to the other go fetch it from its own Repository.
If your Repositories are caching then when you load an Order (through a User) only load the Product Id's from the database. Then load the details from the Product Repository using the Product Id. You can optimize a bit by loading any other invariants on the Product as you load the Order.
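
A small sketch of the read-only interface technique described in this answer. IProductView, the in-memory ProductRepository, and the member names are hypothetical; a real repository would load from the database.

```csharp
using System;
using System.Collections.Generic;

// Read-only view of a Product: gets only, no way to change state.
public interface IProductView
{
    int Id { get; }
    string Name { get; }
}

// The Product aggregate root; state changes go through its own methods.
public sealed class Product : IProductView
{
    public Product(int id, string name) { Id = id; Name = name; }
    public int Id { get; }
    public string Name { get; private set; }
    public void Rename(string name) { Name = name; }
}

// An Order reaches its Product only through the read-only interface.
public sealed class Order
{
    public Order(int id, IProductView product) { Id = id; Product = product; }
    public int Id { get; }
    public IProductView Product { get; }
}

// In-memory stand-in for the Product repository.
public sealed class ProductRepository
{
    private readonly Dictionary<int, Product> _store = new Dictionary<int, Product>();
    public void Add(Product product) { _store[product.Id] = product; }
    public Product GetById(int id) { return _store[id]; }
}
```

Code that navigates User->Order->Product can read order.Product.Name but has no Rename available; to edit, it takes order.Product.Id to the repository and works with the Product returned from there.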

By repository you mean class?
Depending on the use of the objects (repositories) you could make a view that combines the data on the database and create a class (repository) with your ORM to represent that view. This design would work when you want to display lighter weight objects with only a couple columns from each of the tables.

If SQL Server is your database, and by repository you mean a database, then I would just stick the information in whatever database makes sense and have a view in dependent databases that selects out of the other database via three-dot notation.

I'm still confused by what you mean by "repository." I would make all of the things you talked about separate classes (and therefore separate files) and they'd all reside in the same project.
