Design of Data Access Layer with multiple databases

Design of Data Access Layer with multiple databases - c#

I have read many posts concerning the issue of having several databases and how to design a DAL efficiently in this case. In many cases, the forum suggests to apply the repository pattern, which works fine in most cases.
However, I find myself in a different situation. I have 3 different databases: Oracle, OLE DB and SQLServer. Currently, there exists a unique DAL with many different classes sending SQL queries down to a layer below to be executed in the corresponding database. They way the system works is that two of the databases are only used to read information from them, and the other one is used to either store this same information or read it later on. I have to propose a better design to the current implementation, but it seems as if a common interface for all three databases is not plausible from an architectural point of view.
Is there any design pattern that solves this situation? Should I have three different DALs? Or perhaps it is possible (and advisable) to apply the repository pattern to this problem?

Answers to your question will probably be very subjective. These are some thoughts.
You could apply command-query separation. The query side integrates directly with your data layer, bypassing any business or domain layer, and the returning entities are optimized for read and project from your databases. This layer could also be responsible to merge results from different database calls.
The command side consists of command handlers, using domain or business entities, which are mapped from your R/W database.
By doing this, the interface that you expose will be more clear and business oriented.
I'm not sure that completely abstracting out the data access layer with custom units of work and repositories is really needed: do the advantages outweigh the disadvantages? They rarely do, because you will you ever change a database technology? And if you do, this probably means a rewrite anyway. Also, if you use entity framework code first, you already have unit of work and an abstraction on top of your database; and the flexibility of using LINQ.
Bottom line - try not to over-engineer/over-abstract things; or make things super-generic.

Your core/business code should never be dependent on any contract/interface/class that is placed in the DAL layer of the application.
Accessing data is something the business/core layer of your application needs to be able to do, and this it should be able to do without any dependency of SQL statements and without having any knowledge of the underlying data access technology.
I think you need to remove any "sql" statements from the core part of the application. SQL is vendor dependent and any dependency to a specific database engine needs to be clean out of you core, and moved to the DAL where it belongs. Then you need to create interfaces that resides outside of the DAL(s) which you then create implementation classes for in one or many DAL modules/classes. Your DAL can be dependent of your core, but not the other way around.
I don't see why the repository layer can't be used in this case. When I have a database which I can only read from, I usually let the name of the repository interface indicate this, like ICompanyRepositoryRead.

Related

which layer doest contain query for reading?

My project has 4 layers: Presentation, application, domain, infrastructure.
I am using ORM for the write side. Read side, sometimes I have cases that have a complex query. I process by using a native query.
I write it at the infrastructure layer. But, I see some projects such as "ESHOPONWEB" write a native query in application layers.
Which layer should contain the native query?

I wasn't familiar with ESHOPONWEB aka eShopOnWeb, a reference architecture & implementation that Microsoft put out, but I see it has an accompanying eBook which includes an "Architectural principles" section, another on "Common web application architectures", and at a glance other sections explain the architecture in more detail.
Using the book and code should give you a good way to learn the relevant concepts at your own pace.
Which layer should contain the native query?
I'm not 100% sure what you mean by "native query". I did a quick search through the eBook and didn't see any obvious references to that.
Typical best practice architecture (like what is in the eBook) says that queries in the Application / Business Logic layer should be agnostic - meaning they should not contain any code or dependencies that tightly-couples the Business Logic to a specific Data Access provider.
So, if you need to query a data base, the database specific query code (or reference to a Stored Procedure in the database or any other database specific concept) should be in the Data Access code, which is in the Data Access (Infrastructure) layer.
If something in your Business Logic layer wants to do a query, there's three approaches that come to mind:
It wants to query itself, which it is free to do, using agnostic code against objects it already has.
It wants to query data from the database, it gets the Data Access to do that. The query inputs and results need to be transported between these layers agnostically. E.g. query results might come back as a collection/array/list of Data Transfer Objects (DTO) or Plain Old CLR/Class Objects (POCO).
You take some section of data from your database / Data Access layer, pre-load that as a database agnostic data model in the Business Logic layer, which it can query just like #1.
I'm not sure, but your Domain layer might be what you can use to do #3?
Does that make sense, and answer your question?
Regarding Layers
As the eShopOnWeb book explains, you want to keep various "concerns" (like business logic, UI, and data access all separate. The separation is to make maintenance and change easier to manage, especially over a long time. E.g.:
Keeping the business logic layer/code separate from the UI layer/code means that you should be able to add new types of UI without having to re-write the application code.
Keeping the business logic layer/code separate from data access layer/code means that you should be able to add/swap data providers (e.g. different types of database), and the business logic should be less impacted by database changes.
In the two points, above I say "should" because it depends on how clean and effective your separation is.
To translate that to your layers, UI is obviously your Presenation layer, Business Logic sounds like your Application and Domain layers, Data Access is your Infrastructure layer.
In the eBook they give an example (fig 5-1, pg 16) where they logically have those four layers you use but in the one code-base / package (it will all get deployed together).

Do I neet Unity of Work and Repository patterns when working with Entity Framework?

So at my job I was pointed to http://www.codeproject.com/Articles/990492/RESTful-Day-sharp-Enterprise-Level-Application#_Toc418969121 and was told to learn these patterns and implement them in my solution.
What confused me was that these things were before entity framework 6 and from what I understood, Unity of Work is used to optimize database performance by grouping queries together. Since EF6 has already these optimizations, should I still implement these layers? I get that the layerness helps with modularization and switching of data source. Does that mean that EF6 is too complex to implement with these patterns and should ADO.Net be used directly or something like that?
EDIT: I've noticed that this added layer allows usage of mock data sources. Not sure how useful this is because of the need to add another layer of apstraction

"Unit of Work is used to optimize database performance by grouping queries together." - This is not correct. Unit of Work is there to collect related operations together into a single transaction which is then committed or rolled back as a whole. It tracks changes made to objects so that required database operations can be deduced automatically and performed on your behalf.
When you work with Entity Framework, you use it to create DbContext from model. That class is both the Repository and Unit of Work, so you don't have to do anything special. Things only become more complicated than that when your project becomes so large that DbContext becomes more of a burden.

Repository is used to abstract your application from datasource, but since EntityFramework implements this pattern by itself and gives you a possibility to change data source seamlessly, there is no neccesity to add one more layer of abstraction.
You will just limit EF possibilities, while creating something like GenericRepository<T>. And nevertheless you won't be able to replace EF by another library with no changes to your code, even if you implement such a layer. (Some queries written for EF will fail for NHibernate, for example).
Just don't use DbContext everywhere in your application (inside UI code at least), use it by your data access layer (services with business dependent methods or something in that way).
Even for scenarios, where some cloud data storage is used (which EF won't be able to handle seamlessly), there is no neccessity for that layer, it's better to introduce separate classes and use them explicitly, because you cannot fit db and cloud interaction into one abstraction, it will start leaking at some point.

Entity Framework is a UnitOfWork/Repository pattern itself. If you need to abstract yourself from EF, then you could implement a layer on top of EF with your own UoW/Rep pattern.
This is good if you want to replace EF at some point in your proyect.
The cons? I think that building a UoW on top of EF gives you a redundant architecture and you will end up writing more code for something that maybe will never change.
In my current proyect, the main structure is very common, with a Data layer (with a sublayer for the Entities) for EF, Logic layer (where I put all the Business Logic) and the View layer (It can be web, or whatever). With that structure I directly invoke EF in the Logic layer.

Are Entity Framework Models Reusable?

How reusable are the results of using EF?
Currently, I use stored procedures for 100% of my data access. The main reason I may be looking to do things differently for my newest project is for maintainability: adding an attribute to a table means manually altering dozens of stored procedures. If I understand EF correctly, I should be able to add an attribute to an Entity in my EF model, and then ask EF to update my CRUD methods for me. awesome.
However, there is one thing holding me back: reusability. I like that I can make the SP's for a given database once, and be done with them; I can make 12 applications that all use that database and getting that data will be as easy as calling the correct SP.
Will the same be true if I switch to a more EF-centric approach?
Can I import an Existing EF Data Model and have it work without too much trouble?
Can I make it so that I alter the Data Model once, and that change is felt across all applications?

Ad1. You can easily reuse complex EF queries if you follow the Repository pattern. Repositories are where you encapsulate your data access and, yes, repositories are easily reused between different modules/applications.
(not that you can't reuse code without Repositories! Repositories are just a common way of doing it for data access layer)
Ad2. I am not sure what you mean by "import existing EF model" (import where?) but usually EF models are straightforward and can be reused
Ad3. Just have your model in a separate assembly.

A real benefit to using EF is getting away from stored procedures.
The problem that exists with using stored procedures for all your data access is that you are forced to put business logic into your data layer.
Check out Should the data access layer contain business logic? While its not true in every case, generally keeping your business logic in your business layer gives you better separation of concerns.
I have an EF project that I use as the data layer for several applications. This allows me to change it once and have all the other projects get the benefits. Granted, sometimes supporting code needs to be changed in these other projects, but you'd have that same problem in a stored procedure model as well.

Any N-tier design would solve this problem by simply creating a class (in a class library) that understands how to access the data by using entity framework. Then any applications that want to access the data uses the class library, configure a connection string in the app.config or web.config and you're done.

Rich domain model with ORM

I seem to be missing something and extensive use of google didn't help to improve my understanding...
Here is my problem:
I like to create my domain model in a persistence ignorant manner, for example:
I don't want to add virtual if I don't need it otherwise.
I don't like to add a default constructor, because I like my objects to always be fully constructed. Furthermore, the need for a default constructor is problematic in the context of dependency injection.
I don't want to use overly complicated mappings, because my domain model uses interfaces or other constructs not readily supported by the ORM.
One solution to this would be to have separate domain objects and data entities. Retrieval of the constructed domain objects could easily be solved using the repository pattern and building the domain object from the data entity returned by the ORM. Using AutoMapper, this would be trivial and not too much code overhead.
But I have one big problem with this approach: It seems that I can't really support lazy loading without writing code for it myself. Additionally, I would have quite a lot of classes for the same "thing", especially in the extended context of WCF and UI:
Data entity (mapped to the ORM)
Domain model
WCF DTO
View model
So, my question is: What am I missing? How is this problem generally solved?
UPDATE:
The answers so far suggest what I already feared: It looks like I have two options:
Make compromises on the domain model to match the prerequisites of the ORM and thus have a domain model the ORM leaks into
Create a lot of additional code
UPDATE:
In addition to the accepted answer, please see my answer for concrete information on how I solved those problems for me.

I would question that matching the prereqs of an ORM is necessarily "making compromises". However, some of these are fair points from the standpoint of a highly SOLID, loosely-coupled architecture.
An ORM framework exists for one sole reason; to take a domain model implemented by you, and persist it into a similar DB structure, without you having to implement a large number of bug-prone, near-impossible-to-unit-test SQL strings or stored procedures. They also easily implement concepts like lazy-loading; hydrating an object at the last minute before that object is needed, instead of building a large object graph yourself.
If you want stored procs, or have them and need to use them (whether you want to or not), most ORMs are not the right tool for the job. If you have a very complex domain structure such that the ORM cannot map the relationship between a field and its data source, I would seriously question why you are using that domain and that data source. And if you want 100% POCO objects, with no knowledge of the persistence mechanism behind, then you will likely end up doing an end run around most of the power of an ORM, because if the domain doesn't have virtual members or child collections that can be replaced with proxies, then you are forced to eager-load the entire object graph (which may well be impossible if you have a massive interlinked object graph).
While ORMs do require some knowledge in the domain of the persistence mechanism in terms of domain design, an ORM still results in much more SOLID designs, IMO. Without an ORM, these are your options:
Roll your own Repository that contains a method to produce and persist every type of "top-level" object in your domain (a "God Object" anti-pattern)
Create DAOs that each work on a different object type. These types require you to hard-code the get and set between ADO DataReaders and your objects; in the average case a mapping greatly simplifies the process. The DAOs also have to know about each other; to persist an Invoice you need the DAO for the Invoice, which needs a DAO for the InvoiceLine, Customer and GeneralLedger objects as well. And, there must be a common, abstracted transaction control mechanism built into all of this.
Set up an ActiveRecord pattern where objects persist themselves (and put even more knowledge about the persistence mechanism into your domain)
Overall, the second option is the most SOLID, but more often than not it turns into a beast-and-two-thirds to maintain, especially when dealing with a domain containing backreferences and circular references. For instance, for fast retrieval and/or traversal, an InvoiceLineDetail record (perhaps containing shipping notes or tax information) might refer directly to the Invoice as well as the InvoiceLine to which it belongs. That creates a 3-node circular reference that requires either an O(n^2) algorithm to detect that the object has been handled already, or hard-coded logic concerning a "cascade" behavior for the backreference. I've had to implement "graph walkers" before; trust me, you DO NOT WANT to do this if there is ANY other way of doing the job.
So, in conclusion, my opinion is that ORMs are the least of all evils given a sufficiently complex domain. They encapsulate much of what is not SOLID about persistence mechanisms, and reduce knowledge of the domain about its persistence to very high-level implementation details that break down to simple rules ("all domain objects must have all their public members marked virtual").

In short - it is not solved
(here goes additional useless characters to post my awesome answer)

All good points.
I don't have an answer (but the comment got too long when I decided to add something about stored procs) except to say my philosophy seems to be identical to yours and I code or code generate.
Things like partial classes make this a lot easier than it used to be in the early .NET days. But ORMs (as a distinct "thing" as opposed to something that just gets done in getting to and from the database) still require a LOT of compromises and they are, frankly, too leaky of an abstraction for me. And I'm not big on having a lot of dupe classes because my designs tend to have a very long life and change a lot over the years (decades, even).
As far as the database side, stored procs are a necessity in my view. I know that ORMs support them, but the tendency is not to do so by most ORM users and that is a huge negative for me - because they talk about a best practice and then they couple to a table-based design even if it is created from a code-first model. Seems to me they should look at an object datastore if they don't want to use a relational database in a way which utilizes its strengths. I believe in Code AND Database first - i.e. model the database and the object model simultaneously back and forth and then work inwards from both ends. I'm going to lay it out right here:
If you let your developers code ORM against your tables, your app is going to have problems being able to live for years. Tables need to change. More and more people are going to want to knock up against those entities, and now they all are using an ORM generated from tables. And you are going to want to refactor your tables over time. In addition, only stored procedures are going to give you any kind of usable role-based manageability without dealing with every tabl on a per-column GRANT basis - which is super-painful. If you program well in OO, you have to understand the benefits of controlled coupling. That's all stored procedures are - USE THEM so your database has a well-defined interface. Or don't use a relational database if you just want a "dumb" datastore.

Have you looked at the Entity Framework 4.1 Code First? IIRC, the domain objects are pure POCOs.

this what we did on our latest project, and it worked out pretty well
use EF 4.1 with virtual keywords for our business objects and have our own custom implementation of T4 template. Wrapping the ObjectContext behind an interface for repository style dataaccess.
using automapper to convert between Bo To DTO
using autoMapper to convert between ViewModel and DTO.
you would think that viewmodel and Dto and Business objects are same thing, and they might look same, but they have a very clear seperation in terms of concerns.
View Models are more about UI screen, DTO is more about the task you are accomplishing, and Business objects primarily concerned about the domain
There are some comprimises along the way, but if you want EF, then the benfits outweigh things that you give up

Over a year later, I have solved these problems for me now.
Using NHibernate, I am able to map fairly complex Domain Models to reasonable database designs that wouldn't make a DBA cringe.
Sometimes it is needed to create a new implementation of the IUserType interface so that NHibernate can correctly persist a custom type. Thanks to NHibernates extensible nature, that is no big deal.
I found no way to avoid adding virtual to my properties without loosing lazy loading. I still don't particularly like it, especially because of all the warnings from Code Analysis about virtual properties without derived classes overriding them, but out of pragmatism, I can now live with it.
For the default constructor I also found a solution I can live with. I add the constructors I need as public constructors and I add an obsolete protected constructor for NHibernate to use:
[Obsolete("This constructor exists because of NHibernate. Do not use.")]
protected DataExportForeignKey()
{
}

.net, C# Interface between Business Logic and DAL

I'm working on a small application from scratch and using it to try to teach myself architecture and design concepts. It's a .NET 3.5, WPF application, and I'm using Sql Compact Edition as my data store.
I'm working on the business logic layer, and have just now begun to write the DAL. I'm just using SqlCeComamnds to send over simple queries and SqlCeResultSet to get at the results. I'm starting to design my Insert and Update methods, and here's the issue - I don't know the best way to get the necessary data from the BLL into the DAL. Do I pass in a generic collection? Do I have a massive parameter list with all the data for the database? Do I simply pass in the actual business object (thus tying my DAL to the conrete stuff in the BLL?).
I thought about using interfaces - simply passing IBusinessObjectA into the DAL, which provides the simplicity I'm looking for without tying me TOO tightly to current implementations. What do you guys think?

I don't think there is a simple answer to your questions because there are many options depending on the circumstances. I have found it helpful to read the two books below to help me understand the problems you describe better.
MS .NET: Architecting Applications for the Enterprise (Esposito, Saltarello)
MS Application Architecture Guide, 2nd edition.
The second book is available online. Look here.

I think it is OK to pass the Business object to the Data Access Layer. I think the BLL's job is just to work with its objects, to check if all rules are being followed, about what can be saved, by whom, on what fields, time, etc.
Once it has done that it should pass it to the DAL, and I think it is IT'S job to figure out how to convert what it got into something that can be persisted, but it wont check what is being persisted or read or by whom, it will just do it. This could be straight foward, a la linq, but if your logic mdoels do not match your data model 1:1, then the DAL should do all the conversion.
About tying your DAL to the stuff in the BLL, I think you should worry about the other way around, tying your BLL to your DAL. I would use an interface to represent your DAL (as in IRepository) that way you can make your BLL call any kind of persistance mechanism just by changing the type of IRepository it is using (extra points if you use IoC :P). The concrete classes that implement the IRepository would be tied to the business objects, but they have to know what is it that they are saving don't they? while the BLL does NOT have to know what is doing the saving.

To pass business object in the DAL is the simpler and fastest method. It works in small projects, but have same disadvantages:
1) Business Objects are part of BLL layer, and if you pass objects in BLL then DAL becomes dependent of BLL. low layer knows about upper one - this contradicts the idea of layers at all.
2) Business Object are usially very complex to save it directly in BD. In this case it is better to introduce new "Mappers" intermediate layer.
To overcome all these issues I usially make interface to DAL independent of Business Objects. I use "Row" classes instead - representation of one record in the database or XML. In .NET 3.5 linqtosql autogenerated classes can be used for this purpose.

If I was in your position, I'd probably use LINQ to SQL to define my data access layer - it'll save you lots of work maintaining all that SqlCeFooBar stuff and give you a designer (of sorts) for maintaining your database that you would otherwise lack, using SQL CE.
So in that case, I'd probably couple the business logic layer pretty tightly to the entities exposed by the L2S layer. The justification being that the entities are the business objects, albeit devoid of any services.
I probably wouldn't let the entities get as far up the hierarchy as the UI though. At that level, it makes much more sense to use a model specifically for the view - especially given that you're using WPF.
Of course, all of this depends upon the size and complexity of your application. I'm assuming it's a fairly small scale application (single user?) given that you're using SQL CE.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.