If I wanted to access a database in Delphi, I could add a datamodule to a project, configure it from my mainform and then access it anywhere in the application; a reference would be stored in a global variable.
I know that in C# and other more modern OO languages global variables are frowned upon. So how can I access my database from where I need it? The biggest problem I have is the configuration: location, user, password, etc. are unknown at design time.
I now have a db-class and make a new instance when I need it, but then I would have to store those settings in some globally accessible thing, and I have simply moved the problem.
What's the standard solution?
Thanks, regards, Miel.
I always use the singleton pattern. As for configuration, look at the System.Configuration.ConfigurationManager class which allows you to read settings from your project's app.config/web.config file.
It's a bit tricky to define the absolute best practice for database access in OOP.
You've hit the nail on the head that there are a lot of factors to consider:
how are configuration parameters handled?
is the app multi-threaded? do you need database connection pools?
do you need database portability (e.g. do you use different DBs in dev versus production? are you concerned about vendor lock-in with one DB? are you distributing the app to other users who may be using a different DB?)
are you concerned with securing your SQL statements, or centrally enforcing other access permissions?
is there common logic involved when performing some inserts and updates that you'd rather not duplicate everywhere a particular table is touched?
Because of this, many OOP folks gravitate to an ORM framework for anything but the simplest cases. The general idea is that your application logic shouldn't need to talk to the database directly at any point: isolate your business code from the actual persistence mechanism for as long as possible.
Instead, try to design your application so that your business logic talks to a model layer. In other words, have model objects in the system that encapsulate and describe your business data. These model objects then expose methods for obtaining and saving their state into the database, but your logic doesn't need to care about that.
For example, say you have a concept called "Person" in your system. You'd probably model this as a class with some properties. In pseudo-code:
Person:
- first_name
- last_name
Your actual code in the system is then only concerned with instantiating and using Person objects, not with obtaining DB handles or writing SQL:
p = Person.get(first_name='Joe')
p.last_name = 'Bloggs'
p.save()
In an object-oriented world, you'll find that your business logic code becomes cleaner (and shorter!), easier to maintain, and much more testable.
Of course, you're right in that this means you need to now go off and build a database back-end that translates that Person class to one or more tables in your relational database. This is where using an ORM framework comes in handy. In Python, people use Django and SQLAlchemy. I'll let others indicate what folks use in C# (I'm not a C# developer, but you did tag your question OOP, so I'm going for the generic answer here, rather than C# specific).
The point, though, is that the ORM framework puts all the DB access in a single set of classes in the code, so that the DB access, configuration and pools are handled in one place... no need to instantiate them all over the application. What you use "where you need it" is the model object.
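As a rough illustration of the idea (a hand-written sketch, not any particular ORM's API), the Person pseudo-code could be backed by a small model class. Python's built-in sqlite3 module stands in for the real database here; the table name `person` and the methods `connect`, `get` and `save` are assumptions made up for this example.

```python
import sqlite3


class Person:
    """Hypothetical model object; callers never see SQL or connections."""

    _conn = None  # configured once at startup, e.g. from user-supplied settings

    @classmethod
    def connect(cls, database):
        cls._conn = sqlite3.connect(database)
        cls._conn.execute(
            "CREATE TABLE IF NOT EXISTS person ("
            "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
        )

    def __init__(self, first_name, last_name, id=None):
        self.id = id
        self.first_name = first_name
        self.last_name = last_name

    @classmethod
    def get(cls, first_name):
        row = cls._conn.execute(
            "SELECT first_name, last_name, id FROM person WHERE first_name = ?",
            (first_name,),
        ).fetchone()
        return cls(*row) if row else None

    def save(self):
        if self.id is None:
            # New object: insert it and remember the generated key.
            cur = self._conn.execute(
                "INSERT INTO person (first_name, last_name) VALUES (?, ?)",
                (self.first_name, self.last_name),
            )
            self.id = cur.lastrowid
        else:
            self._conn.execute(
                "UPDATE person SET first_name = ?, last_name = ? WHERE id = ?",
                (self.first_name, self.last_name, self.id),
            )
        self._conn.commit()


Person.connect(":memory:")      # the only place that knows where the data lives
Person("Joe", "Smith").save()
p = Person.get(first_name="Joe")
p.last_name = "Bloggs"
p.save()
```

The business code at the bottom only touches Person; location, user and password would be passed to `connect` once, wherever the application reads its settings.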
Of course, if your app is very simple and you want just a raw DB handle, then I do recommend the dependency injection approach others have listed.
Hope that helps.
It seems to me that you need to create an appropriate object (containing the connection or similar) and pass that instance to each object requiring access (see dependency injection).
This is different from using singletons. By using this mechanism, it'll free you from the dependency on one object and (perhaps a more compelling reason in this instance) allow you to perform testing by injecting mock objects or similar in place of the originally-injected database accessor object. I would definitely shy away from the singleton mechanism in this scenario.
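A minimal sketch of what that could look like, written in Python for brevity (the same shape works in C#). All class names here (`Database`, `FakeDatabase`, `InvoicePrinter`) are hypothetical and exist only for the example.

```python
class Database:
    """Hypothetical accessor; a real one would wrap a connection."""

    def __init__(self, host, user, password):
        self.host, self.user, self.password = host, user, password

    def fetch_customer_name(self, customer_id):
        raise NotImplementedError("would run a query against self.host")


class FakeDatabase:
    """Test double with the same method, backed by a dict."""

    def __init__(self, names):
        self.names = names

    def fetch_customer_name(self, customer_id):
        return self.names[customer_id]


class InvoicePrinter:
    def __init__(self, database):
        # The dependency is handed in; nothing here reaches for a global.
        self.database = database

    def header(self, customer_id):
        return "Invoice for " + self.database.fetch_customer_name(customer_id)


# In a test, the fake is injected in place of the real accessor.
printer = InvoicePrinter(FakeDatabase({7: "Miel"}))
header = printer.header(7)
```

In production the composition root (the main form, in the question's terms) would build one `Database` from the user's settings and pass it to whatever needs it.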
I actually use a repository class that takes the db information in its constructor, and the classes that need it get it passed in. I use an Inversion of Control (IoC) tool to inject those values.
You could store the user information in a flat file somewhere, then read from and write to it from your db-class.
This way you won't duplicate the settings in your code, but the user can still modify the settings.
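One way this might look, sketched in Python with a JSON file. The file name `db_settings.json` and the `DbSettings` class are assumptions for the example; a real application would also want to protect the password rather than store it in plain text.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict


@dataclass
class DbSettings:
    host: str
    user: str
    password: str  # plain text only for the sake of the sketch

    @classmethod
    def load(cls, path):
        with open(path, encoding="utf-8") as f:
            return cls(**json.load(f))

    def save(self, path):
        with open(path, "w", encoding="utf-8") as f:
            json.dump(asdict(self), f, indent=2)


# Write the settings once (e.g. from a settings dialog), read them back later.
path = os.path.join(tempfile.mkdtemp(), "db_settings.json")
DbSettings("localhost", "miel", "secret").save(path)
loaded = DbSettings.load(path)
```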
SubSonic is the "Swiss Army knife" for object relational mapping, and offers the ability to execute stored procedures and return results to List. You can have it up and running within a half hour.
I have read many posts concerning the issue of having several databases and how to design a DAL efficiently in this case. In many cases, the forum suggests to apply the repository pattern, which works fine in most cases.
However, I find myself in a different situation. I have 3 different databases: Oracle, OLE DB and SQL Server. Currently, there is a single DAL with many different classes sending SQL queries down to a layer below to be executed in the corresponding database. The way the system works is that two of the databases are only read from, and the third is used either to store this same information or to read it later on. I have to propose a better design than the current implementation, but it seems as if a common interface for all three databases is not plausible from an architectural point of view.
Is there any design pattern that solves this situation? Should I have three different DALs? Or perhaps it is possible (and advisable) to apply the repository pattern to this problem?
Answers to your question will probably be very subjective. These are some thoughts.
You could apply command-query separation. The query side integrates directly with your data layer, bypassing any business or domain layer, and the returning entities are optimized for read and project from your databases. This layer could also be responsible to merge results from different database calls.
The command side consists of command handlers, using domain or business entities, which are mapped from your R/W database.
By doing this, the interface that you expose will be more clear and business oriented.
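A small sketch of that split, in Python and with in-memory stand-ins for the three databases. Every name here (`OrderQueries`, `StoreSummary`, `StoreSummaryHandler`, the order/customer data) is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderSummary:
    """Read model: a flat projection shaped for the caller."""
    order_id: int
    customer_name: str


class OrderQueries:
    """Query side: reads from the two read-only sources and merges results."""

    def __init__(self, orders_source, customers_source):
        self.orders = orders_source        # e.g. rows from one database
        self.customers = customers_source  # e.g. rows from another

    def summaries(self):
        return [
            OrderSummary(order_id, self.customers[customer_id])
            for order_id, customer_id in self.orders
        ]


@dataclass(frozen=True)
class StoreSummary:
    """Command: an intent to change state, carrying only the data it needs."""
    summary: OrderSummary


class StoreSummaryHandler:
    """Command side: writes to the one read/write store."""

    def __init__(self, store):
        self.store = store

    def handle(self, command):
        self.store[command.summary.order_id] = command.summary


queries = OrderQueries(orders_source=[(1, 10), (2, 11)],
                       customers_source={10: "Acme", 11: "Globex"})
store = {}
handler = StoreSummaryHandler(store)
for s in queries.summaries():
    handler.handle(StoreSummary(s))
```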
I'm not sure that completely abstracting out the data access layer with custom units of work and repositories is really needed: do the advantages outweigh the disadvantages? They rarely do, because will you ever change your database technology? And if you do, this probably means a rewrite anyway. Also, if you use Entity Framework Code First, you already have a unit of work and an abstraction on top of your database, plus the flexibility of using LINQ.
Bottom line - try not to over-engineer/over-abstract things; or make things super-generic.
Your core/business code should never be dependent on any contract/interface/class that is placed in the DAL layer of the application.
Accessing data is something the business/core layer of your application needs to be able to do, and this it should be able to do without any dependency of SQL statements and without having any knowledge of the underlying data access technology.
I think you need to remove any "sql" statements from the core part of the application. SQL is vendor dependent, and any dependency on a specific database engine needs to be cleaned out of your core and moved to the DAL where it belongs. Then you need to create interfaces that reside outside of the DAL(s), for which you then create implementation classes in one or more DAL modules/classes. Your DAL can be dependent on your core, but not the other way around.
I don't see why the repository layer can't be used in this case. When I have a database which I can only read from, I usually let the name of the repository interface indicate this, like ICompanyRepositoryRead.
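To make the naming idea concrete, here is a hedged sketch in Python using abstract base classes in place of C# interfaces. `CompanyRepositoryRead` mirrors the ICompanyRepositoryRead name above; `CompanyRepository`, `InMemoryCompanyRepository` and `copy_company` are hypothetical.

```python
from abc import ABC, abstractmethod


class CompanyRepositoryRead(ABC):
    """Read-only contract, defined in the core, not in the DAL."""

    @abstractmethod
    def get_name(self, company_id): ...


class CompanyRepository(CompanyRepositoryRead):
    """Read/write contract for the one database that also stores data."""

    @abstractmethod
    def save_name(self, company_id, name): ...


class InMemoryCompanyRepository(CompanyRepository):
    """Stand-in for a DAL implementation (Oracle, SQL Server, ...)."""

    def __init__(self):
        self._names = {}

    def get_name(self, company_id):
        return self._names.get(company_id)

    def save_name(self, company_id, name):
        self._names[company_id] = name


def copy_company(source: CompanyRepositoryRead, target: CompanyRepository, company_id):
    # Core code depends only on the interfaces above, never on SQL.
    target.save_name(company_id, source.get_name(company_id))


source, target = InMemoryCompanyRepository(), InMemoryCompanyRepository()
source.save_name(1, "Acme")
copy_company(source, target, 1)
```

Code that receives a `CompanyRepositoryRead` has no write method to call, which documents the read-only databases in the type itself.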
I have a situation where I need to create an application which supports multiple databases. Multiple databases means the client can use any one of the databases, like Oracle, SQL Server, MySQL or PostgreSQL, at first.
I was trying to use an ORM like NHibernate or MyBatis, but they have their limitations and need expertise to use.
So I decided to use the data providers provided by Microsoft, like ADO.NET, OLEDB, ODP.NET etc.
Is there any way to keep my database logic the same for all the databases? I have tried IDbConnection, IDbCommand etc., but they have a problem in the case of Oracle (Ref Cursor).
Is there any way to achieve this? Some link or guide would be appreciated.
Edit:
There is a problem with the DBTypes because they are enums that are defined differently by the different data providers.
Well, real-life applications are complicated like that. Before you know it, you want to replace the UI with an App, expose your logic as a WCF service, change the e-mail service with another service provider, test pieces of your code while mocking the DAL and change the database with another one.
The usual way to deal with this is to pass all calls through an interface that separates the implementation from the caller. After that, you can implement the different DAL's.
Personally I usually go with this approach:
First create a single DLL that contains all interfaces. Basically the idea is to expose all calls that your UI, App or whatever needs through the interface. From now on, your UI doesn't talk to databases or e-mail providers anymore.
If you need to get access to the interface, you use a factory pattern. Never use 'new'; that will get you in trouble in the long run.
It's not trivial to create this, and needs proper crafting. Usually I begin with a bare minimum version, hack everything else in the UI as a first version, then move everything that touches a DB or a service into the right project while creating interfaces and finally re-engineer everything until I'm 100% satisfied.
Interfaces should be built to last. Sure, changes will happen over time, but you really want to minimize these. Think about what the future will hold, read up on what other people came up with and ensure your interfaces reflect that.
Basically you now have a working piece of software that works with a single database, mail provider, etc. So far so good.
Next, re-engineer the factory. Basically you want to use the configuration settings to pick the right provider (the right DLL that implements your interface) for your data. A simple switch can suffice in most cases.
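A sketch of such a factory, in Python. The provider names, the `settings` dictionary and the `CustomerStore` classes are assumptions for the example; in the setup described above, each implementation would live in its own DLL.

```python
from abc import ABC, abstractmethod


class CustomerStore(ABC):
    """The interface the UI programs against."""

    @abstractmethod
    def describe(self): ...


class SqlServerCustomerStore(CustomerStore):
    def describe(self):
        return "customers via SQL Server"


class OracleCustomerStore(CustomerStore):
    def describe(self):
        return "customers via Oracle"


def create_customer_store(settings):
    """The 'simple switch': only this function knows the concrete classes."""
    provider = settings["provider"]
    if provider == "sqlserver":
        return SqlServerCustomerStore()
    if provider == "oracle":
        return OracleCustomerStore()
    raise ValueError("unknown provider: " + provider)


store = create_customer_store({"provider": "oracle"})
```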
At this point I usually make it a habit to make a ton of unit tests for the interfaces.
The last step is to create DLL's for the different database providers. One of these will be loaded at run-time in your application.
I prefer simple Linq to SQL (I also use the library from LinqConnect) because it's pretty fast. I simply start by copy-pasting the other database provider, and then re-engineer it until it works. Personally I don't believe in a magic 'support all sql databases' solution anymore: in my experience, some databases will handle certain queries much, much faster than other databases, which means that you will probably end up with some custom code for each database anyway.
This is also the point where your unit tests are really going to pay off. Basically, you can just start with copy-paste and give it a test. If you're lucky, everything will run right away with decent performance... if not, you know where to start.
Build to last
Build things to last. Things will change:
Think about updates and test them. Prefer automatic tests.
You don't want to tinker with your Factory every day. Use Reflection, Expressions, Code generation or whatever your poison is to save yourself the trouble of changing code.
Spend time writing tests. Make sure you cover the bulk. I cannot stress this enough; under pressure people usually 'save' time by not writing tests. You'll notice that this time that you 'save' will double back on you as support when you've gone live. Every month.
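The point about not tinkering with the factory can be sketched as a self-registering provider table. This is one possible approach, shown in Python with a decorator where C# would more likely use reflection or attributes; `PROVIDERS`, `provider` and the two order classes are hypothetical.

```python
PROVIDERS = {}


def provider(name):
    """Class decorator that registers an implementation under a config name."""
    def register(cls):
        PROVIDERS[name] = cls
        return cls
    return register


@provider("mysql")
class MySqlOrders:
    def describe(self):
        return "orders via MySQL"


@provider("postgresql")
class PostgreSqlOrders:
    def describe(self):
        return "orders via PostgreSQL"


def create_orders(settings):
    # No switch to maintain: new providers register themselves.
    return PROVIDERS[settings["provider"]]()


orders = create_orders({"provider": "postgresql"})
```

Adding a provider then means adding a class, with no change to `create_orders`.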
What about Entity Framework
I've seen a lot of my customers get into trouble with performance because of this. In the many times that I've tested it, I had the same experience. I noticed customers hacking around EF for a lot of queries to get a bit of decent performance.
To be fair, I gave up a few years ago, and I know they have made considerable performance improvements. Still, I would test it (especially with complex queries) before considering it.
If I would use EF, I'd implement all EF stuff in a 'database common DLL', and then derive classes from that. As I said, not all databases are the same with queries - and you might want to implement some hacks that are necessary to get decent performance. Your tests will tell.
Bonuses
Programming through interfaces also has a lot of advantages in combination with proxies. To name a few, you can easily create log sinks, caching, statistics, WCF, etc. by simply implementing the same interface. And if you end up hating your current OR mapper some day, you can just throw it away without touching a single line of your app.
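For instance, a caching proxy might look roughly like this. It is a Python sketch with made-up names (`SlowCustomerStore`, `CachingCustomerStore`); the only thing that matters is that both classes expose the same method.

```python
class SlowCustomerStore:
    """Stand-in for a real DAL implementation."""

    def __init__(self):
        self.calls = 0

    def get_name(self, customer_id):
        self.calls += 1  # count how often the 'database' is actually hit
        return "customer-%d" % customer_id


class CachingCustomerStore:
    """Proxy with the same method; callers cannot tell the difference."""

    def __init__(self, inner):
        self.inner = inner
        self.cache = {}

    def get_name(self, customer_id):
        if customer_id not in self.cache:
            self.cache[customer_id] = self.inner.get_name(customer_id)
        return self.cache[customer_id]


real = SlowCustomerStore()
store = CachingCustomerStore(real)
names = [store.get_name(1), store.get_name(1), store.get_name(2)]
```

A logging or statistics proxy follows the same pattern: wrap the inner object, do the extra work, delegate.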
I believe Microsoft's Data Access Components would be suitable to you.
https://en.wikipedia.org/wiki/Microsoft_Data_Access_Components
How about writing microservices and connecting them through a REST API?
You (and maybe your team) could provide a core application which handles the logic and the UI. This is still based on your current technology. But instead of directly adding some kind of database connection, you could provide multiple microservices (based on ASP.NET or ASP.NET Core), each exposing a REST API. You get your data from each database through such a microservice. So you would develop one microservice for e.g. MySQL and another one for MS SQL, and when a new customer comes along with Oracle you write a new small microservice which implements your expected API.
More info (based on .net core) is here: https://docs.asp.net/en/latest/tutorials/first-web-api.html
I think which technology to use is a discussion for the team. But today I would recommend writing a microservice. It also makes attaching a new app, e.g. for a mobile device, much easier :)
Yes, it's possible.
Right now I am working with the same scenario, where all my logic-related data (what you would typically call metadata) resides in one DB and the data resides in another DB.
What you need to do: keep the connection-related parameters in two different files (you can call these prop files). Then you need a concrete connection class which takes its parameters from these prop files. Wherever you need to create a connection, just supply the prop file and it will create the DB connection as desired.
I seem to be missing something and extensive use of google didn't help to improve my understanding...
Here is my problem:
I like to create my domain model in a persistence ignorant manner, for example:
I don't want to add virtual if I don't need it otherwise.
I don't like to add a default constructor, because I like my objects to always be fully constructed. Furthermore, the need for a default constructor is problematic in the context of dependency injection.
I don't want to use overly complicated mappings, because my domain model uses interfaces or other constructs not readily supported by the ORM.
One solution to this would be to have separate domain objects and data entities. Retrieval of the constructed domain objects could easily be solved using the repository pattern and building the domain object from the data entity returned by the ORM. Using AutoMapper, this would be trivial and not too much code overhead.
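The split described here could be sketched as follows, in Python with hand-written mapping where the text mentions AutoMapper. `CustomerRow`, `Customer` and `CustomerRepository` are hypothetical names, and a dictionary stands in for the ORM session.

```python
from dataclasses import dataclass


@dataclass
class CustomerRow:
    """Data entity: the shape the ORM wants (mutable, default-constructible)."""
    id: int = 0
    name: str = ""
    email: str = ""


class Customer:
    """Domain object: always fully constructed, no persistence concerns."""

    def __init__(self, customer_id, name, email):
        if "@" not in email:
            raise ValueError("invalid email")
        self.customer_id, self.name, self.email = customer_id, name, email


class CustomerRepository:
    def __init__(self, rows):
        self._rows = {row.id: row for row in rows}  # stands in for the ORM session

    def get(self, customer_id):
        row = self._rows[customer_id]
        # The hand-written equivalent of what a mapping library would do.
        return Customer(row.id, row.name, row.email)


repo = CustomerRepository([CustomerRow(1, "Ann", "ann@example.com")])
customer = repo.get(1)
```

Note that the repository builds the whole domain object eagerly, which is exactly where lazy loading gets lost.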
But I have one big problem with this approach: It seems that I can't really support lazy loading without writing code for it myself. Additionally, I would have quite a lot of classes for the same "thing", especially in the extended context of WCF and UI:
Data entity (mapped to the ORM)
Domain model
WCF DTO
View model
So, my question is: What am I missing? How is this problem generally solved?
UPDATE:
The answers so far suggest what I already feared: It looks like I have two options:
Make compromises on the domain model to match the prerequisites of the ORM and thus have a domain model the ORM leaks into
Create a lot of additional code
UPDATE:
In addition to the accepted answer, please see my answer for concrete information on how I solved those problems for me.
I would question that matching the prereqs of an ORM is necessarily "making compromises". However, some of these are fair points from the standpoint of a highly SOLID, loosely-coupled architecture.
An ORM framework exists for one sole reason; to take a domain model implemented by you, and persist it into a similar DB structure, without you having to implement a large number of bug-prone, near-impossible-to-unit-test SQL strings or stored procedures. They also easily implement concepts like lazy-loading; hydrating an object at the last minute before that object is needed, instead of building a large object graph yourself.
If you want stored procs, or have them and need to use them (whether you want to or not), most ORMs are not the right tool for the job. If you have a very complex domain structure such that the ORM cannot map the relationship between a field and its data source, I would seriously question why you are using that domain and that data source. And if you want 100% POCO objects, with no knowledge of the persistence mechanism behind, then you will likely end up doing an end run around most of the power of an ORM, because if the domain doesn't have virtual members or child collections that can be replaced with proxies, then you are forced to eager-load the entire object graph (which may well be impossible if you have a massive interlinked object graph).
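To show why replaceable child collections matter, here is a toy version of what such a proxy does. It is a Python sketch under invented names (`LazyList`, `load_lines`, `Invoice`), not how any specific ORM implements it.

```python
class LazyList:
    """Stand-in for a child collection; loads on first access only."""

    def __init__(self, loader):
        self._loader = loader
        self._items = None

    def _load(self):
        if self._items is None:
            self._items = self._loader()
        return self._items

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        return len(self._load())


queries = []


def load_lines():
    queries.append("load invoice lines")  # placeholder for a real query
    return ["line 1", "line 2"]


class Invoice:
    def __init__(self, number, lines):
        self.number = number
        self.lines = lines  # may be a plain list or a LazyList


invoice = Invoice(42, LazyList(load_lines))
queries_before = len(queries)   # nothing has been loaded yet
line_count = len(invoice.lines)  # first access triggers the load
list(invoice.lines)              # second access reuses the loaded items
```

The domain class only has to tolerate having its collection swapped for a look-alike; that tolerance is what the virtual members buy in C#.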
While ORMs do require some knowledge in the domain of the persistence mechanism in terms of domain design, an ORM still results in much more SOLID designs, IMO. Without an ORM, these are your options:
Roll your own Repository that contains a method to produce and persist every type of "top-level" object in your domain (a "God Object" anti-pattern)
Create DAOs that each work on a different object type. These types require you to hard-code the get and set between ADO DataReaders and your objects; in the average case a mapping greatly simplifies the process. The DAOs also have to know about each other; to persist an Invoice you need the DAO for the Invoice, which needs a DAO for the InvoiceLine, Customer and GeneralLedger objects as well. And, there must be a common, abstracted transaction control mechanism built into all of this.
Set up an ActiveRecord pattern where objects persist themselves (and put even more knowledge about the persistence mechanism into your domain)
Overall, the second option is the most SOLID, but more often than not it turns into a beast-and-two-thirds to maintain, especially when dealing with a domain containing backreferences and circular references. For instance, for fast retrieval and/or traversal, an InvoiceLineDetail record (perhaps containing shipping notes or tax information) might refer directly to the Invoice as well as the InvoiceLine to which it belongs. That creates a 3-node circular reference that requires either an O(n^2) algorithm to detect that the object has been handled already, or hard-coded logic concerning a "cascade" behavior for the backreference. I've had to implement "graph walkers" before; trust me, you DO NOT WANT to do this if there is ANY other way of doing the job.
So, in conclusion, my opinion is that ORMs are the least of all evils given a sufficiently complex domain. They encapsulate much of what is not SOLID about persistence mechanisms, and reduce knowledge of the domain about its persistence to very high-level implementation details that break down to simple rules ("all domain objects must have all their public members marked virtual").
In short - it is not solved
All good points.
I don't have an answer (but the comment got too long when I decided to add something about stored procs) except to say my philosophy seems to be identical to yours and I code or code generate.
Things like partial classes make this a lot easier than it used to be in the early .NET days. But ORMs (as a distinct "thing" as opposed to something that just gets done in getting to and from the database) still require a LOT of compromises and they are, frankly, too leaky of an abstraction for me. And I'm not big on having a lot of dupe classes because my designs tend to have a very long life and change a lot over the years (decades, even).
As far as the database side, stored procs are a necessity in my view. I know that ORMs support them, but the tendency is not to do so by most ORM users and that is a huge negative for me - because they talk about a best practice and then they couple to a table-based design even if it is created from a code-first model. Seems to me they should look at an object datastore if they don't want to use a relational database in a way which utilizes its strengths. I believe in Code AND Database first - i.e. model the database and the object model simultaneously back and forth and then work inwards from both ends. I'm going to lay it out right here:
If you let your developers code ORM against your tables, your app is going to have problems being able to live for years. Tables need to change. More and more people are going to want to knock up against those entities, and now they all are using an ORM generated from tables. And you are going to want to refactor your tables over time. In addition, only stored procedures are going to give you any kind of usable role-based manageability without dealing with every table on a per-column GRANT basis - which is super-painful. If you program well in OO, you have to understand the benefits of controlled coupling. That's all stored procedures are - USE THEM so your database has a well-defined interface. Or don't use a relational database if you just want a "dumb" datastore.
Have you looked at the Entity Framework 4.1 Code First? IIRC, the domain objects are pure POCOs.
This is what we did on our latest project, and it worked out pretty well:
Use EF 4.1 with virtual keywords for our business objects, and have our own custom implementation of the T4 template. Wrap the ObjectContext behind an interface for repository-style data access.
Use AutoMapper to convert between BO and DTO.
Use AutoMapper to convert between ViewModel and DTO.
You would think that view models, DTOs and business objects are the same thing, and they might look the same, but they have a very clear separation in terms of concerns.
View models are more about the UI screen, DTOs are more about the task you are accomplishing, and business objects are primarily concerned with the domain.
There are some compromises along the way, but if you want EF, the benefits outweigh the things that you give up.
Over a year later, I have solved these problems for me now.
Using NHibernate, I am able to map fairly complex Domain Models to reasonable database designs that wouldn't make a DBA cringe.
Sometimes it is necessary to create a new implementation of the IUserType interface so that NHibernate can correctly persist a custom type. Thanks to NHibernate's extensible nature, that is no big deal.
I found no way to avoid adding virtual to my properties without losing lazy loading. I still don't particularly like it, especially because of all the warnings from Code Analysis about virtual properties without derived classes overriding them, but out of pragmatism, I can now live with it.
For the default constructor I also found a solution I can live with. I add the constructors I need as public constructors and I add an obsolete protected constructor for NHibernate to use:
[Obsolete("This constructor exists because of NHibernate. Do not use.")]
protected DataExportForeignKey()
{
}
Your advice is needed! I'm just after some sort of pseudo-code/idea of which way to go that is robust and reliable. Maybe there exists a useful pattern for the purpose?
void AddDevice(string itemId);
I have an interface with some methods (above is one). In a new class that implements the interface, there is an external provider involved which needs to be informed of updates in the class.
The class itself gets/sets information in a SQL Server database. Some (not all) of the information must be pushed to the external provider.
This gives me two scenarios (which I ask for help with):
WriteOnlytoDatabase = true / false
I would like to use the same method in both cases, without using a bool method parameter. Is that possible? Could a delegate be used to switch between the two? Please remember it's an interface here (that the GUI talks to).
Two transfers, how to track errors
Because we do two transfers (database, external provider), an error can make one or the other unavailable. If there is an error on the external provider, I am thinking of some sort of "undone actions queue" to handle it.
Advice is welcome.
This could be solved in a dozen ways, but some designs are better than others :)
[The reply from Matías below was written before I edited the question.]
First of all, I would point out that there may be a synchronization mechanism on the market or in the open source community that can do this for you, outside your code. If so, I wouldn't implement my own way of synchronizing such data. I'd prefer to let such a tool do it for me.
If that isn't your case and your data can't be synced with a standard or known tool, we need to think about another solution.
I believe an entity shouldn't be responsible for syncing itself with another storage. That should be a task for the layer between the business layer and that storage: the data layer.
The business layer accesses the data without knowing any detail of where it is retrieved from. It just gets or changes business objects' states, or removes them from the store.
It's the business layer that in some cases would pass an argument like "ActionKind" (an enumeration) and, since the business layer relies on a data access layer, some code there would do something depending on the "ActionKind".
This "ActionKind" would let the data access layer choose an implementation of "how and where to store data".
That data access layer would have some "event" or "trigger" that fires when a change has been made to one of the underlying storage devices, and some handler(s) would synchronize the data in all other stores.
The "event" or "trigger" handler could be implemented directly in code (hard-coded) or through an interface like "IDataSynchronizer" (choose your own identifier, it's just an example) with a "Synchronize" method that is called when data changes in any storage.
I believe that with an approach like this one you'd have fewer problems with synchronization, and you won't need to care about "1st device has the data, 2nd doesn't, so I need to check blah blah..."! :)
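A rough sketch of that arrangement, in Python. `ActionKind` and the synchronizer idea come from the description above; the member names, `ProviderSynchronizer`, `DeviceDataLayer` and the list standing in for the SQL Server table are assumptions.

```python
from enum import Enum


class ActionKind(Enum):
    DATABASE_ONLY = 1
    DATABASE_AND_PROVIDER = 2


class ProviderSynchronizer:
    """Hypothetical 'IDataSynchronizer': pushes changes to the external provider."""

    def __init__(self):
        self.pushed = []

    def synchronize(self, item_id):
        self.pushed.append(item_id)


class DeviceDataLayer:
    def __init__(self, synchronizers):
        self.rows = []                      # stands in for the SQL Server table
        self.synchronizers = synchronizers  # handlers fired after a change

    def add_device(self, item_id, kind):
        self.rows.append(item_id)
        if kind is ActionKind.DATABASE_AND_PROVIDER:
            # The "event": notify every registered handler of the change.
            for handler in self.synchronizers:
                handler.synchronize(item_id)


sync = ProviderSynchronizer()
data = DeviceDataLayer([sync])
data.add_device("A1", ActionKind.DATABASE_ONLY)
data.add_device("B2", ActionKind.DATABASE_AND_PROVIDER)
```

A handler that catches provider failures and parks the item for a later retry would be one place to put the "undone actions queue" from the question.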
Which class in my project should be responsible for keeping track of which Aggregate Roots have already been created, so as not to create two instances for the same Entity? Should my repository keep a list of all aggregates it has created? If yes, how do I do that? Do I use a singleton repository (doesn't sound right to me)? Would another option be to encapsulate this "caching" somewhere else, in some other class? What would this class look like and what pattern would it fit?
I'm not using an O/R mapper, so if there is a technology out there that handles it, I'd need to know how it does it (as in what pattern it uses) to be able to use it.
Thanks!
I believe you are thinking about the Identity Map pattern, as described by Martin Fowler.
In his full description of the pattern (in his book), Fowler discusses implementation concerns for read/write entities (those that participate in transactions) and read-only (reference data, which ideally should be read only once and subsequently cached in memory).
I suggest obtaining his excellent book, but the excerpt describing this pattern is readable on Google Books (look for "fowler identity map").
Basically, an identity map is an object that stores, for example in a hashtable, the entity objects loaded from the database. The map itself is stored in the context of the current session (request), preferably in a Unit of Work (for read/write entities). For read-only entities, the map need not be tied to the session and may be stored in the context of the process (global state).
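A minimal sketch of an identity map kept inside a repository, in Python. The names (`Customer`, `CustomerRepository`, `load_row`) are hypothetical, and a plain function stands in for the database read.

```python
class Customer:
    def __init__(self, customer_id, name):
        self.customer_id, self.name = customer_id, name


class CustomerRepository:
    """Repository with an identity map scoped to one unit of work."""

    def __init__(self, load_row):
        self._load_row = load_row  # hypothetical function: id -> (id, name)
        self._identity_map = {}    # id -> already-built aggregate

    def get(self, customer_id):
        if customer_id not in self._identity_map:
            self._identity_map[customer_id] = Customer(*self._load_row(customer_id))
        return self._identity_map[customer_id]


loads = []


def load_row(customer_id):
    loads.append(customer_id)  # record each trip to the 'database'
    return (customer_id, "customer-%d" % customer_id)


repo = CustomerRepository(load_row)  # create one per session/request
first, second = repo.get(1), repo.get(1)
```

Both calls return the same instance, and the repository itself does not need to be a singleton: its lifetime is that of the session that owns it.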
I consider caching to be something that happens at the Service level, rather than in a repository. Repositories should be "dumb" and just do basic CRUD operations. Services can be smart enough to work with caching as necessary (which is probably more of a business rule than a low-level data access rule).
To put it simply, I don't let any code use Repositories directly - only Services can do that. Then everything else consumes the relevant services as interfaces. That gives you a nice wrapper for putting in biz logic, caching, etc.
I would say that if this "caching" is managed anywhere other than the Repository, then you are letting concerns leak out.
The repository is your collection of items. No code consuming the repository should have to decide whether to retrieve the object from the repository or from somewhere else.
Singleton sounds like the wrong lifetime; it likely should be per-request. This is easy to manage if you are using an IoC/DI container.
It seems like the fact that you even have to consider multiple calls for the same aggregate is evidence of an architecture/design problem. I would be interested to hear an example of what those first and second calls might be, and why they require the same exact instance of your AR.