Database layer design with unit tests in mind

Basically I need a bunch of database access interfaces that can be implemented differently, or to put it another way, de-coupled from the underlying actual database:
interface IDbConnexion
{
IDbCollection Collection { get; }
Task Connect();
}
interface IDbCollection
{
void Insert(Model model);
}
I will use Collection to do CRUD operations. But the problem with that is, for a concrete implementation for a particular database, e.g. MongoDB, I would need to do the following:
class MongoDbCollection: IDbCollection
{
IMongoCollection<Model> _collection;
void IDbCollection.Insert(Model model)
{
_collection.InsertOne(model);
}
}
Because this is my own wrapper for the MongoDB client, I would need to wrap every single method provided by IMongoCollection so that I can use its full power. E.g. if I want to use InsertMany then I would need to add my own wrapper for InsertMany to MongoDbCollection. And as my needs grow, MongoDbCollection will bloat too. But I don't want to do this. I should just be able to access all the methods the MongoDB driver exposes to me without too much extra work.
So I am thinking of an alternative:
interface IDbConnexion
{
IMongoCollection<Model> Collection { get; }
...
}
But the problem with this is that my interface is coupled to MongoDB. My unit test mock will need to implement IMongoCollection just to make it compile.
So my question is: is there a design pattern that allows me full access to the concrete Collection property while allowing me to easily change the concrete implementation?

Related

C# work with database which type is not known in advance

I have an app that collects data and writes it to a database. The database type is not known in advance, it's defined via an .ini file. So I have a method like this, if the database is Firebird SQL:
public bool writeToDB()
{
FbConnection dbConn = new FbConnection(connString);
dbConn.Open();
FbTransaction dbTrans = dbConn.BeginTransaction();
FbCommand writeCmd = new FbCommand(cmdText, dbConn, dbTrans);
/* some stuff */
writeCmd.ExecuteNonQuery();
dbTrans.Commit();
writeCmd.Dispose();
dbConn.Close();
return true;
}
To make the same work for e.g. MS Access database, I only have to replace FbConnection, FbTransaction and FbCommand with OleDbConnection, OleDbTransaction and OleDbCommand respectively.
But I don't want to have a separate identical method for each type of database.
Is it possible to define the database connection / transaction / command type at runtime, after the database type is known?
Thanks

When you're writing code at this level - opening and closing connections, creating and executing commands - there's probably no benefit in trying to make this method or class database-agnostic. This code is about implementation details so it makes sense that it would be specific to an implementation like a particular database.
"But I don't want to have a separate identical method for each type of database."
You're almost certainly better off having separate code for separate implementations. If you try to write code that accommodates multiple implementations it will be complicated. Then another implementation (database) comes along which almost but doesn't quite fit the pattern you've created and you have to make it even more complicated to fit that one in.
I don't know the specifics of what you're building, but "what if I need a different database" is usually a "what if" that never happens. When we try to write one piece of code that satisfies requirements we don't actually have, it becomes complex and brittle. Then real requirements come along and they're harder to meet because our code is tied in knots to do things it doesn't really need to do.
That doesn't mean that all of our code should be coupled to a specific implementation, like a database. We just have to find a level of abstraction that's good enough. Does our application need to interact with a database to save and retrieve data? A common abstraction for that is a repository. In C# we could define an interface like this:
public interface IFooRepository
{
Task<Foo> GetFoo(Guid fooId);
Task Save(Foo foo);
}
Then we can create separate implementations for different databases if and when we need them. Code that depends on IFooRepository won't be coupled to any of those implementations, and those implementations won't be coupled to each other.
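As a rough sketch of how that decoupling could look, the snippet below adds an in-memory implementation of IFooRepository and a consumer that only sees the interface. The shape of Foo, and the names InMemoryFooRepository and FooRenamer, are assumptions made up for illustration; a Firebird- or Access-backed class would sit behind the same interface.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical entity; the original does not show what Foo looks like.
public class Foo
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

public interface IFooRepository
{
    Task<Foo> GetFoo(Guid fooId);
    Task Save(Foo foo);
}

// One possible implementation: keeps everything in a dictionary.
public class InMemoryFooRepository : IFooRepository
{
    private readonly Dictionary<Guid, Foo> _store = new Dictionary<Guid, Foo>();

    public Task<Foo> GetFoo(Guid fooId)
    {
        _store.TryGetValue(fooId, out Foo foo);
        return Task.FromResult(foo); // null when the id is unknown
    }

    public Task Save(Foo foo)
    {
        _store[foo.Id] = foo;
        return Task.CompletedTask;
    }
}

// Consumer code depends only on the interface, never on a database type.
public class FooRenamer
{
    private readonly IFooRepository _repository;

    public FooRenamer(IFooRepository repository)
    {
        _repository = repository;
    }

    public async Task Rename(Guid fooId, string newName)
    {
        Foo foo = await _repository.GetFoo(fooId);
        if (foo == null)
        {
            throw new KeyNotFoundException("No Foo with id " + fooId);
        }
        foo.Name = newName;
        await _repository.Save(foo);
    }
}
```

Swapping databases then means passing a different IFooRepository to FooRenamer; the consumer itself does not change.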

First (and Second and Third). STOP REINVENTING THE WHEEL.
https://learn.microsoft.com/en-us/ef/core/providers/?tabs=dotnet-core-cli
A lot of code and a lot of better testing has already been done.
Guess what is in that larger list:
FirebirdSql.EntityFrameworkCore.Firebird Firebird 3.0 onwards
EntityFrameworkCore.Jet Microsoft Access files
......
So I'm going to suggest something in line with everyone else, BUT something that also allows for some reuse.
I am basing this on the fact that Entity Framework Core provides functionality for several RDBMSs.
See:
https://learn.microsoft.com/en-us/ef/core/providers/?tabs=dotnet-core-cli
public interface IEmployeeDomainDataLayer
{
Task<Employee> GetSingle(Guid empKey);
Task Save(Employee emp);
}
public abstract class EmployeeEntityFrameworkDomainDataLayerBase : IEmployeeDomainDataLayer
{
/* you'll inject a MyDbContext into this class */
//implement Task<Employee> GetSingle(Guid empKey); /* but also allow this to be overrideable */
//implement Task Save(Employee emp); /* but also allow this to be overrideable */
}
public class EmployeeJetEntityFrameworkDomainDataLayer : EmployeeEntityFrameworkDomainDataLayerBase, IEmployeeDomainDataLayer
{
/* do not do any overriding OR override if you get into a jam */
}
public class EmployeeSqlServerEntityFrameworkDomainDataLayer : EmployeeEntityFrameworkDomainDataLayerBase, IEmployeeDomainDataLayer
{
/* do not do any overriding OR override if you get into a jam */
}
You "code to an interface, not an implementation". Aka, your business layer codes to IEmployeeDomainDataLayer.
This gives you most code in EmployeeEntityFrameworkDomainDataLayerBase. BUT if any of the concretes give you trouble, you have a way to code something up ONLY FOR THAT CONCRETE.
If you want DESIGN TIME "picking of the RDBMS", then you do this:
You inject one of the concretes ( EmployeeJetEntityFrameworkDomainDataLayer OR EmployeeSqlServerEntityFrameworkDomainDataLayer ) into your IOC, based on which backend you want to wire to.
If you want RUN-TIME "picking of the RDBMS", you can define a "factory".
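public static class HardCodedEmployeeDomainDataLayerFactory
{
public static IEmployeeDomainDataLayer getAnIEmployeeDomainDataLayer(string key)
{
// Pick the concrete based on the key; the key values here are only illustrative.
switch (key)
{
case "Jet":
return new EmployeeJetEntityFrameworkDomainDataLayer();
case "SqlServer":
return new EmployeeSqlServerEntityFrameworkDomainDataLayer();
default:
throw new ArgumentException("Unknown key: " + key, nameof(key));
}
}
}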
The factory above suffers from IOC anemia. Aka, if your concretes need items for their constructors..you have to fudge them.
A better idea than the above is the kissing cousin of the "Factory" pattern, called the Strategy design pattern.
It is a "kinda factory", BUT you inject the possible results of the "factory" in via a constructor. Aka, the "factory" is NOT hard coded...and does NOT suffer from IOC anemia.
See my answer here:
Using a Strategy and Factory Pattern with Dependency Injection
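A minimal sketch of that "kinda factory" idea follows; it is not the code from the linked answer. The Key property, the simplified interface, and the EmployeeDataLayerSelector name are assumptions for illustration. The point is that the candidates arrive through the constructor, so the selector never has to construct them itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the data layer interface; the Key property
// is an assumption added so each concrete can identify itself.
public interface IEmployeeDomainDataLayer
{
    string Key { get; }
}

public class JetEmployeeDataLayer : IEmployeeDomainDataLayer
{
    public string Key => "Jet";
}

public class SqlServerEmployeeDataLayer : IEmployeeDomainDataLayer
{
    public string Key => "SqlServer";
}

// The candidates are injected (typically by the IoC container),
// so nothing is new-ed up or hard coded inside this class.
public class EmployeeDataLayerSelector
{
    private readonly IReadOnlyList<IEmployeeDomainDataLayer> _candidates;

    public EmployeeDataLayerSelector(IEnumerable<IEmployeeDomainDataLayer> candidates)
    {
        _candidates = candidates.ToList();
    }

    public IEmployeeDomainDataLayer Select(string key)
    {
        IEmployeeDomainDataLayer match = _candidates.FirstOrDefault(
            c => string.Equals(c.Key, key, StringComparison.OrdinalIgnoreCase));
        if (match == null)
        {
            throw new ArgumentException("No data layer registered for key: " + key, nameof(key));
        }
        return match;
    }
}
```

Because each concrete is built by the container, its own constructor dependencies (a DbContext, a logger, and so on) are resolved in the usual way rather than being fudged inside a static factory.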

Generic Interface for replaceable/swapable data layer

I'm trying to come up with a way to write a generic interface that can be implemented for multiple data stores, basically the generic type will specify the ID for the data in the data store (some of these data stores use strongly typed libraries, so you have to call something like long createNode(Node node))
For example, the system we are using now uses longs for the IDs, but if we made the transition to something like SQL (or used both at the same time) it would most likely be GUIDs. The entities at the business logic layer are pretty much set; think of it as basically a file system with a bunch of custom attributes on the nodes.
So far I tried something like this:
public interface DataThing<T>
{
T CreateThing(CustomEntity node);
CustomEntity GetThing(T ID);
}
public class LongDataThing : DataThing<long>
{
public long CreateThing(CustomEntity node)
{
//...implement
}
public CustomEntity GetThing(long ID)
{
//...implement
}
}
...do the same thing for GUID/Int/string...whatever
Where I'm having problems is instantiating the class to work with, which is basically a factory design pattern. I thought I could do something like:
private DataThing myDataStore;
switch(config.dbtype)
{
case "longDB":
this.myDataStore = new LongDataThing();
break;
case "stringDB":
this.myDataStore = new StringDataThing();
break;
//...etc.
}
But I can't create private DataThing myDataStore; without specifying it as long, string...etc, which means the business logic already has to know which type is being implemented instead of just asking the factory for a datastore object.
So what am I missing here? Can you not abstract out the datatype of an interface like this and have it transparent to the calling 'business logic' and figure out which implementation is desired from a config file or some other outside logic?
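One possible direction, sketched here under stated assumptions rather than offered as the definitive answer, is to put a non-generic interface in front of the generic one and let IDs travel as opaque strings. IDataStore, DataStoreAdapter, and DataStoreFactory are hypothetical names, and the toy LongDataThing exists only to make the sketch runnable.

```csharp
using System;
using System.Collections.Generic;

public class CustomEntity
{
    public string Name { get; set; }
}

// Generic, store-specific interface, as in the question.
public interface DataThing<T>
{
    T CreateThing(CustomEntity node);
    CustomEntity GetThing(T id);
}

// Non-generic view for the business logic: IDs are opaque strings.
public interface IDataStore
{
    string CreateThing(CustomEntity node);
    CustomEntity GetThing(string id);
}

// Adapter that hides T by converting IDs to and from strings.
public class DataStoreAdapter<T> : IDataStore
{
    private readonly DataThing<T> _inner;
    private readonly Func<string, T> _parseId;

    public DataStoreAdapter(DataThing<T> inner, Func<string, T> parseId)
    {
        _inner = inner;
        _parseId = parseId;
    }

    public string CreateThing(CustomEntity node) => _inner.CreateThing(node).ToString();

    public CustomEntity GetThing(string id) => _inner.GetThing(_parseId(id));
}

// Toy long-keyed store, only here to make the sketch runnable.
public class LongDataThing : DataThing<long>
{
    private readonly Dictionary<long, CustomEntity> _nodes = new Dictionary<long, CustomEntity>();
    private long _nextId = 1;

    public long CreateThing(CustomEntity node)
    {
        long id = _nextId++;
        _nodes[id] = node;
        return id;
    }

    public CustomEntity GetThing(long id) => _nodes[id];
}

public static class DataStoreFactory
{
    // The switch lives here; callers only ever see IDataStore.
    public static IDataStore Create(string dbType)
    {
        switch (dbType)
        {
            case "longDB":
                return new DataStoreAdapter<long>(new LongDataThing(), long.Parse);
            default:
                throw new ArgumentException("Unknown dbtype: " + dbType, nameof(dbType));
        }
    }
}
```

The trade-off is that the business logic loses the strongly typed ID; a GUID-keyed store would be wired up the same way with Guid.Parse.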

Effective Repository in C# - Where to put methods?

I'm trying to build a new application using the Repository pattern for the first time and I'm a little confused about using a Repository. Suppose I have the following classes:
public class Ticket
{
}
public class User
{
public List<Ticket> AssignedTickets { get; set; }
}
public class Group
{
public List<User> GroupMembers { get; set; }
public List<Ticket> GroupAssignedTickets { get; set; }
}
I need methods that can populate these collections by fetching data from the database.
I'm confused as to which associated Repository class I should put those methods in. Should I design my repositories so that everything returning type T goes in the repository for type T as such?
public class TicketRepository
{
public List<Ticket> GetTicketsForGroup(Group g) { }
public List<Ticket> GetTicketsForUser(User u) { }
}
public class UserRepository
{
public List<User> GetMembersForGroup(Group g) { }
}
The obvious drawback I see here is that I need to start instantiating a lot of repositories. What if my User also has assigned Widgets, Fidgets, and Lidgets? When I populate a User, I need to instantiate a WidgetRepository, a FidgetRepository, and a LidgetRepository all to populate a single user.
Alternatively, do I construct my repository so that everything requesting based on type T is lumped into the repository for type T as listed below?
public class GroupRepository
{
public List<Ticket> GetTickets(Group g) { }
public List<User> GetMembers(Group g) { }
}
public class UserRepository
{
public List<Ticket> GetTickets(User u) { }
}
The advantage I see here is that if I now need my user to have a collection of Widgets, Fidgets, and Lidgets, I just add the necessary methods to the UserRepository pattern and don't need to instantiate a bunch of different repository classes every time I want to create a user, but now I've scattered the concerns for a user across several different repositories.
I'm really not sure which way is right, if any. Any suggestions?

The repository pattern can help you to:
Put things that change for the same reason together
As well as
Separate things that change for different reasons
On the whole, I would expect a "User Repository" to be a repository for obtaining users. Ideally, it would be the only repository that you can use to obtain users, because if you change stuff, like user tables or the user domain model, you would only need to change the user repository. If you have methods on many repositories for obtaining a user, they would all need to change.
Limiting the impact of change is good, because change is inevitable.
As for instantiating many repositories, using a dependency injection tool such as Ninject or Unity to supply the repositories, or using a repository factory, can reduce new-ing up lots of repositories.
Finally, you can take a look at the concept of Domain Driven Design to find out more about the key purpose behind domain models and repositories (and also about aggregate roots, which are relevant to what you are doing).

Fascinating question with no right answer. This might be a better fit for programmers.stackexchange.com rather than stackoverflow.com. Here are my thoughts:
Don't worry about creating too many repositories. They are basically stateless objects so it isn't like you will use too much memory. And it shouldn't be a significant burden to the programmer, even in your example.
The real benefit of repositories is for mocking the repository for unit testing. Consider splitting them up based on what is simplest for the unit tests, to make the dependency injection simple and clear. I've seen cases where every query is a repository (they call those "queries" instead of repositories). And other cases where there is one repository for everything.
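To make the testing point concrete, here is a small sketch in which a narrow repository interface is replaced by a hand-written fake. ITicketRepository, WorkloadReport, and FakeTicketRepository are hypothetical names chosen for illustration, and the Ticket and User shapes are assumptions.

```csharp
using System;
using System.Collections.Generic;

public class Ticket
{
    public string Title { get; set; }
}

public class User
{
    public string Name { get; set; }
}

// Narrow interface: only what the code under test needs.
public interface ITicketRepository
{
    List<Ticket> GetTicketsForUser(User user);
}

// Code under test; it receives the repository instead of creating it.
public class WorkloadReport
{
    private readonly ITicketRepository _tickets;

    public WorkloadReport(ITicketRepository tickets)
    {
        _tickets = tickets;
    }

    public bool IsOverloaded(User user, int limit)
    {
        return _tickets.GetTicketsForUser(user).Count > limit;
    }
}

// Hand-written fake for tests: returns canned data, touches no database.
public class FakeTicketRepository : ITicketRepository
{
    private readonly int _ticketCount;

    public FakeTicketRepository(int ticketCount)
    {
        _ticketCount = ticketCount;
    }

    public List<Ticket> GetTicketsForUser(User user)
    {
        var tickets = new List<Ticket>();
        for (int i = 0; i < _ticketCount; i++)
        {
            tickets.Add(new Ticket { Title = "Ticket " + i });
        }
        return tickets;
    }
}
```

The smaller the interface, the less a fake has to implement, which is one argument for splitting repositories along the lines the tests need.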

As it turns out, the first option was the more practical option in this case. There were a few reasons for this:
1) When making changes to a type and its associated repository (assume Ticket), it was far easier to modify the Ticket and TicketRepository in one place than to chase down every method in every repository that used a Ticket.
2) When I attempted to use interfaces to dictate the type of queues each repository could pull, I ran into issues where a single repository couldn't implement a generic interface using type T multiple times with the only differentiation in interface method implementation being the parameter type.
3) I access data from SharePoint and a database in my implementation, and created two abstract classes to provide data tools to the concrete repositories for either SharePoint or SQL Server. Assume that in the example above Users come from SharePoint while Tickets come from a database. Using my model I would not be able to use these abstract classes, as the group would have to inherit from both my SharePoint abstract class and my SQL abstract class. C# does not support multiple inheritance of classes, abstract or otherwise. However, if I'm grouping all Ticket-related behaviours into a TicketRepository and all User-related behaviours into a UserRepository, each repository only needs access to one type of underlying data source (SQL or SharePoint, respectively).

Where should I put commonly used data access code with logic not fitting to Repository when using Service classes on top of Repository/UnitOfWork?

In my earlier question I was asking about implementing repository/unit of work pattern for large applications built with an ORM framework like EF.
One follow-up problem I cannot get past right now is where to put code that contains business logic but is still low-level enough to be used in many other parts of the application.
For example, here are a few such methods:
Getting all users in one or more roles.
Getting all cities where a user has privileges within an optional region.
Getting all measure devices of a given device type, within a given region for which the current user has privileges.
Finding a product by code, checking if it's visible and throwing an exception if not found or not visible.
All of these methods use a UnitOfWork for data access or manipulation, and receive several parameters as in their specification. I think everyone could write a lot more examples of such common tasks in a large project. My question is: where should I put these method implementations? I can see the following options currently.
Option 1: Every method goes to its own service class
public class RegionServices {
// support DI constructor injection
public RegionServices(IUnitOfWork work) {...}
...
public IEnumerable<City> GetCitiesForUser(User user, Region region = null) { ... }
...
}
public class DeviceServices {
// support DI constructor injection
public DeviceServices(IUnitOfWork work) {...}
...
public IEnumerable<Device> GetDevicesForUser(User user, DeviceType type, Region region = null) { ... }
...
}
What I don't like about it is that if a higher-level application service needs to call, for example, 3 of these methods, then it needs to instantiate 3 services, and if I use DI then I even have to put all 3 into the constructor, easily resulting in quite a bit of code smell.
Option 2: Creating some kind of Facade for such common data access
public class DataAccessHelper {
// support DI constructor injection
public DataAccessHelper(IUnitOfWork work) {...}
...
public IEnumerable<City> GetCitiesForUser(User user, Region region = null) { ... }
public IEnumerable<Device> GetDevicesForUser(User user, DeviceType type, Region region = null) { ... }
public IEnumerable<User> GetUsersInRoles(params string[] roleIds) { ... }
...
}
I don't like it because it feels like it violates the SRP, although it can be much more comfortable to use.
Option 3: Creating extension methods for the Repositories
public static class DataAccessExtensions {
public static IEnumerable<City> GetCitiesForUser(this IRepository repo, User user, Region region = null) { ... }
}
Here IRepository is an interface with generic methods like Query<T>, Save<T>, etc. I don't like it either because it feels like I want to give business logic to repositories which is not advisable AFAIK. However, it expresses that these methods are common and lower level than service classes, which I like.
Maybe there are other options as well?... Thank you for the help.

If you say that a certain piece of domain logic needs to look at 3 distinct pieces of information in order to make a decision then we will need to provide this information to it.
Further if we say that each of these distinct pieces can be useful to other parts of the domain then each of them will need to be in its own method also. We can debate whether each query needs to be housed in a separate class or not depending on your domain/design.
The point I wanted to make is that there will be an application service which delegates to one or more Finder classes (classes where your queries are housed). These classes house only queries; the application service accumulates their results and passes them down to a Domain Service as method params.
The domain service acts on the received parameters, executes the logic and returns the result. This way the domain service is easily testable.
Pseudocode
App Service
result1 = finder.query1()
result2 = finder.query2()
result3 = yetanotherfinder.query()
domainresult = domainservice.calculate(result1,result2,result3);
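The pseudocode above could be fleshed out roughly as follows. The pricing scenario and every type name (IPriceFinder, IDiscountFinder, PricingDomainService, QuoteAppService) are invented for illustration; only the shape (finders query, the application service gathers, the domain service decides) comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Finders house queries only. These interfaces are hypothetical.
public interface IPriceFinder
{
    decimal FindUnitPrice(string productCode);
}

public interface IDiscountFinder
{
    IReadOnlyList<decimal> FindDiscountRates(string customerId);
}

// Domain service: pure logic over the values it is handed, no data access.
public class PricingDomainService
{
    public decimal Calculate(decimal unitPrice, int quantity, IEnumerable<decimal> discountRates)
    {
        // Apply the single best discount rate, or none if the list is empty.
        decimal bestRate = discountRates.DefaultIfEmpty(0m).Max();
        return unitPrice * quantity * (1m - bestRate);
    }
}

// Application service: gathers query results, then delegates the decision.
public class QuoteAppService
{
    private readonly IPriceFinder _prices;
    private readonly IDiscountFinder _discounts;
    private readonly PricingDomainService _pricing;

    public QuoteAppService(IPriceFinder prices, IDiscountFinder discounts, PricingDomainService pricing)
    {
        _prices = prices;
        _discounts = discounts;
        _pricing = pricing;
    }

    public decimal Quote(string customerId, string productCode, int quantity)
    {
        decimal unitPrice = _prices.FindUnitPrice(productCode);
        IReadOnlyList<decimal> rates = _discounts.FindDiscountRates(customerId);
        return _pricing.Calculate(unitPrice, quantity, rates);
    }
}
```

Since PricingDomainService takes plain values, it can be exercised in a test without any finder or database.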

Repositories belong to the domain, queries do not (http://www.jefclaes.be/2014/01/repositories-where-did-we-go-wrong_26.html).
You could define explicit queries and query handlers and use those outside of your domain.
public class GetUserStatisticsQuery
{
public int UserId { get; set; }
}
public class GetUserStatisticsQueryResult
{
...
}
public class GetUserStatisticsQueryHandler :
IHandleQuery<GetUserStatisticsQuery, GetUserStatisticsQueryResult>
{
public GetUserStatisticsQueryResult Handle(GetUserStatisticsQuery query)
{
... "SELECT * FROM x" ...
}
}
var result = _queryExecutor.Execute<GetUserStatisticsQueryResult>(
new GetUserStatisticsQuery { UserId = 1 });
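The answer does not show IHandleQuery or the query executor, so the following is only one way they might be wired together. The QueryExecutor class, its Register method, and the LoginCount field are assumptions; the handler returns canned numbers where a real one would run SQL.

```csharp
using System;
using System.Collections.Generic;

public interface IHandleQuery<TQuery, TResult>
{
    TResult Handle(TQuery query);
}

public class GetUserStatisticsQuery
{
    public int UserId { get; set; }
}

public class GetUserStatisticsQueryResult
{
    public int UserId { get; set; }
    public int LoginCount { get; set; } // hypothetical field
}

// Stand-in handler: returns canned numbers instead of querying a database.
public class GetUserStatisticsQueryHandler
    : IHandleQuery<GetUserStatisticsQuery, GetUserStatisticsQueryResult>
{
    public GetUserStatisticsQueryResult Handle(GetUserStatisticsQuery query)
    {
        return new GetUserStatisticsQueryResult { UserId = query.UserId, LoginCount = 42 };
    }
}

// Minimal executor: maps a query type to a delegate that runs its handler.
public class QueryExecutor
{
    private readonly Dictionary<Type, Func<object, object>> _handlers =
        new Dictionary<Type, Func<object, object>>();

    public void Register<TQuery, TResult>(IHandleQuery<TQuery, TResult> handler)
    {
        _handlers[typeof(TQuery)] = query => handler.Handle((TQuery)query);
    }

    public TResult Execute<TResult>(object query)
    {
        if (!_handlers.TryGetValue(query.GetType(), out Func<object, object> run))
        {
            throw new InvalidOperationException("No handler for " + query.GetType().Name);
        }
        return (TResult)run(query);
    }
}
```

In a real application the registrations would more likely come from a DI container than from manual Register calls.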

I'm adding my conclusion as an answer, because I quickly realized that this question is quite subjective and has no exact answer; it depends heavily on personal preferences and design trends.
The comments and the answers helped me in seeing more clearly how things like this should basically be implemented, thank you for all of your effort.
Conclusion
A "repository" should clearly be responsible only for data persistence. Because it doesn't hold any domain logic or type-specific logic, I think it can be represented and implemented as an IRepository interface with generic methods like Save<T>, Delete<T>, Query<T>, GetByID<T>, etc. Please refer to my previous question mentioned in the beginning of my original post.
On the other hand, I think (at least now with my current project) that introducing a new class or classes for each lower-level domain logic task (in most cases some kind of querying logic) is a bit of an over-engineered solution, which I don't need. I mean I don't want to introduce classes like GetUsersInRoles or GetDevicesInRegionWithType, etc. I feel I would end up with a lot of classes, and a lot of boilerplate code when referring to them.
I decided to implement the 3rd option, adding static query functions as extensions to IRepository. It can be nicely separated in a Queries project folder, and structured in several static classes each named after the underlying domain model on which it defines operations. For example I've implemented user related queries as follows: in Queries folder I've created a UserQueries.cs file, in which I have:
public static class UserQueries {
public static IEnumerable<User> GetInRoles(this IRepository repository, params string[] roles)
{
...
}
}
This way I can easily and comfortably access such methods via extensions on every IRepository, and the methods are unit-testable and support DI (as they are callable on any IRepository implementation). This technique fits my current needs best.
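As a sketch of the unit-testability claim, the snippet below fills in one possible body for GetInRoles and runs it against an in-memory repository. The IRepository shape (only Query<T>, returning IQueryable<T>), the Roles property on User, and InMemoryRepository are all assumptions; the real interface is not shown in the post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public string Name { get; set; }
    public List<string> Roles { get; set; } = new List<string>();
}

// Assumed shape of the generic repository; only Query<T> is needed here.
public interface IRepository
{
    IQueryable<T> Query<T>() where T : class;
}

public static class UserQueries
{
    public static IEnumerable<User> GetInRoles(this IRepository repository, params string[] roles)
    {
        var wanted = new HashSet<string>(roles);
        return repository.Query<User>()
            .Where(u => u.Roles.Any(r => wanted.Contains(r)))
            .ToList();
    }
}

// In-memory repository for tests; no database or ORM involved.
public class InMemoryRepository : IRepository
{
    private readonly List<object> _items = new List<object>();

    public InMemoryRepository(params object[] items)
    {
        _items.AddRange(items);
    }

    public IQueryable<T> Query<T>() where T : class
    {
        return _items.OfType<T>().AsQueryable();
    }
}
```

Whether the same Where clause translates to SQL depends on the ORM behind the real IRepository, so the in-memory test checks the filtering logic, not the generated query.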
It can be refactored even further to make it even cleaner. I could introduce "ghost" sealed classes like UserQueriesWrapper and use it to structure the calling code and this way not put every kind of such extensions to IRepository. I mean something like this:
// technical class, wraps an IRepository dummily forwarding all members to the wrapped object
public class RepositoryWrapper : IRepository
{
internal RepositoryWrapper(IRepository repository) {...}
}
// technical class for holding user related query extensions
public sealed class UserQueriesWrapper : RepositoryWrapper {
internal UserQueriesWrapper(IRepository repository) : base(repository) {...}
}
public static class UserQueries {
public static UserQueriesWrapper Users(this IRepository repository) {
return new UserQueriesWrapper(repository);
}
public static IEnumerable<User> GetInRoles(this UserQueriesWrapper repository, params string[] roles)
{
...
}
}
...
// now I can use it with a nicer and cleaner syntax
var users = _repo.Users().GetInRoles("a", "b");
...
Thank you for the answers and comments again, and please if there is something I didn't notice or any gotcha with this technique, leave a comment here.

Architecture Design for DataInterface - remove switch on type

I am developing a project that calculates various factors for a configuration of components.
The configuration is set/changed by the user at runtime. I have a Component base class and all configuration items are derived from it.
The information for each component is retrieved from data storage as and when it is required.
So that the storage medium can change I have written a DataInterface class to act as an intermediary.
Currently the storage medium is an Access Database. The DataInterface class thus opens the database and creates query strings to extract the relevant data. The query string will be different for each component.
The problem I have is designing how the call to GetData is made between the component class and the DataInterface class. My solutions have evolved as follows:
1) DataInterface has a public method GetXXXXData() for each component type. (where XXX is component type).
Sensor sensor = new Sensor();
sensor.Data = DataInterface.GetSensorData();
2) DataInterface has a public method GetData(componentType) and switches inside on component type.
Sensor sensor = new Sensor();
sensor.Data = DataInterface.GetData(ComponentType.Sensor);
3) Abstract component base class has virtual method GetData() which is overridden by each derived class. GetData() makes use of the DataInterface class to extract data.
Sensor sensor = new Sensor();
sensor.GetData();
//populates Data field internally. Could be called in constructor
For me solution 3 appears to be the most OOD way of doing things. The problem I still have however is that the DataInterface still needs to switch on the type of the caller to determine which query string to use.
I could put this information in each component object but then this couples the components to the storage medium chosen. Not good. Also, the component should not care how the data is stored. It should just call its GetData method and get data back.
Hopefully, that makes sense. What I'm looking for is a way to implement the above functionality that does not depend on using a switch on type.
I'm still learning how to design architecture so any comments on improvement welcome.
TIA

Actually, solution #3 is the worst because it gives the Sensor class artificial responsibilities. The other two solutions are better in that they encapsulate the data access responsibilities into different classes.
I would suggest the following interfaces and classes.
interface IComponentDataReader
{
object GetData();
}
abstract class AbstractComponent
{
private IComponentDataReader dataReader;
public AbstractComponent(IComponentDataReader dataReader)
{
this.dataReader = dataReader;
}
protected object GetData()
{
return dataReader.GetData();
}
}
class Sensor : AbstractComponent
{
public Sensor(IComponentDataReader dataReader)
: base(dataReader)
{
}
public void DoSomethingThatRequiresData()
{
object data = GetData();
// do something
}
}
class SensorDataReader : IComponentDataReader
{
public object GetData()
{
// read your data from storage here; null is only a placeholder
object data = null;
return data;
}
}
class MyApplication
{
public static void Main(string[] args)
{
Sensor sensor = new Sensor(new SensorDataReader());
sensor.DoSomethingThatRequiresData();
}
}
I hope this makes sense. Basically, for good OOD, if you can keep each class doing only one thing (Single Responsibility Principle) and knowing only about itself, you will be fine. You may be asking why an IComponentDataReader is passed to Sensor if it should only know about itself. In this case, consider that the reader is provided to Sensor (Dependency Injection) instead of Sensor requesting it (which would mean looking outside its own responsibilities).

First, I agree with the idea of each component object, in its constructor, being responsible for asking for its configuration. In fact, perhaps that's pushed up into the base class constructor. We end up with a
DataInterface.GetData( getMyType() );
kind of a call.
Then, your main question: how can we implement GetData(type)?
In effect you want a mapping from a type to a query string, and you don't want to be changing code as new components are added. So how about a data-driven approach: a simple external configuration providing that mapping. Then it's just a config change to add more components.
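A minimal sketch of such a data-driven mapping, assuming a plain "ComponentType=query" text format that is made up for this example. ComponentQueryMap, GetQueryFor, and the sample query are hypothetical; a real DataInterface would go on to execute the query it looks up.

```csharp
using System;
using System.Collections.Generic;

public static class ComponentQueryMap
{
    // Parses lines of the form "ComponentType=query string".
    // Blank lines and lines starting with '#' are skipped.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0 || line[0] == '#')
            {
                continue;
            }
            int separator = line.IndexOf('=');
            if (separator <= 0)
            {
                throw new FormatException("Bad mapping line: " + line);
            }
            map[line.Substring(0, separator).Trim()] = line.Substring(separator + 1).Trim();
        }
        return map;
    }
}

public class DataInterface
{
    private readonly IReadOnlyDictionary<string, string> _queries;

    public DataInterface(IReadOnlyDictionary<string, string> queries)
    {
        _queries = queries;
    }

    // Looks up the query by the component's type name; no switch needed.
    public string GetQueryFor(object component)
    {
        string typeName = component.GetType().Name;
        if (!_queries.TryGetValue(typeName, out string query))
        {
            throw new KeyNotFoundException("No query configured for " + typeName);
        }
        return query;
    }
}

// Placeholder component type for the example.
public class Sensor
{
}
```

Adding a new component type then means adding one line to the configuration, for example by feeding File.ReadAllLines of a mapping file into Parse.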

If I understand you right, you are making it a little too complicated:
Define an interface with the getData() method (a few connect and disconnect methods, and maybe some exceptions, would also be a good idea).
Derive a separate class for every data provider / different storage type based on that interface, like "AccessStorage", "MySQLStorage", "WhateverStorage" ...
Now you can quickly exchange one data storage implementation for another, have different connection methods/query strings for each implementation, and you can use multiple storages at the same time and iterate through them with a static interface method that has access to all storages and keeps them in a list.
No need for any switches.
