Repository Pattern without an ORM


I am using the repository pattern in a .NET C# application that does not use an ORM. The issue I am having is how to fill one-to-many list properties of an entity. For example, suppose the Customer class has a List property called Orders and my repository has a method called GetCustomerById:
Should I load the Orders list within the GetCustomerById method?
What if the Order itself has another list property and so on?
What if I want to do lazy loading? Where would I put the code to load the Orders property in Customer? Inside the Orders property get{} accessor? But then I would have to inject the repository into the domain entity, which I don't think is the right solution.
This also raises questions about features like change tracking, deletion, etc. So I think the underlying question is: can I do DDD without an ORM?
But right now I am only interested in lazy loading list properties in my domain entities. Any ideas?
Nabeel
I am assuming this is a very common issue for anyone doing Domain-Driven Design without an ORM. Any ideas?

can I do DDD without ORM ?
Yes, but an ORM simplifies things.
To be honest, I think your problem isn't about whether you need an ORM or not. It's that you are thinking too much about the data rather than the behaviour, which is the key to success with DDD. In terms of the data model, most entities will have associations to most other entities in some form, and from this perspective you could traverse all around the model. This is what it looks like with your customer and orders, and perhaps why you think you need lazy loading. But you need to use aggregates to break these relationships up into behavioural groups.
For example, why have you modelled the customer aggregate to have a list of orders? If the answer is "because a customer can have orders" then I'm not sure you're in the mindset of DDD.
What behaviour is there that requires the customer to have a list of orders? When you give more thought to the behaviour of your domain (i.e. what data is required at what point) you can model your aggregates based around use cases and things become much clearer and much easier as you are only change tracking for a small set of objects in the aggregate boundary.
I suspect that Customer should be a separate aggregate without a list of orders, and Order should be an aggregate with a list of order lines. If you need to perform operations on each order for a customer, then use orderRepository.GetOrdersForCustomer(customerID); make your changes, then use orderRepository.Save(order);
Regarding change tracking without an ORM there are various ways you can do this, for example the order aggregate could raise events that the order repository is listening to for deleted order lines. These could then be deleted when the unit of work completed. Or a slightly less elegant way is to maintain deleted lists, i.e. order.DeletedOrderLines which your repository can obviously read.
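As a rough illustration of the "deleted lists" idea, here is a minimal sketch. The RemoveLine, ClearDeletedOrderLines and DeletedIds members are names I made up for the example, and the repository only records which ids it would delete rather than running real SQL:

```csharp
using System;
using System.Collections.Generic;

public class OrderLine
{
    public Guid Id { get; }
    public string Product { get; }

    public OrderLine(Guid id, string product)
    {
        Id = id;
        Product = product;
    }
}

public class Order
{
    private readonly List<OrderLine> _orderLines = new List<OrderLine>();
    private readonly List<OrderLine> _deletedOrderLines = new List<OrderLine>();

    public IReadOnlyList<OrderLine> OrderLines => _orderLines;

    // Read by the repository when saving (hypothetical member name).
    public IReadOnlyList<OrderLine> DeletedOrderLines => _deletedOrderLines;

    public void AddLine(OrderLine line) => _orderLines.Add(line);

    public void RemoveLine(Guid lineId)
    {
        var line = _orderLines.Find(l => l.Id == lineId);
        if (line == null) return;

        _orderLines.Remove(line);
        _deletedOrderLines.Add(line); // remember it so the repository can delete the row
    }

    public void ClearDeletedOrderLines() => _deletedOrderLines.Clear();
}

public class OrderRepository
{
    // Stand-in for the DELETE statements a real repository would issue.
    public List<Guid> DeletedIds { get; } = new List<Guid>();

    public void Save(Order order)
    {
        foreach (var line in order.DeletedOrderLines)
        {
            // Assumption: real code would run something like
            // DELETE FROM OrderLine WHERE Id = @id here.
            DeletedIds.Add(line.Id);
        }
        order.ClearDeletedOrderLines();
    }
}
```

The aggregate only remembers what was removed; deciding how and when to delete stays in the repository.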
To Summarise:
I think you need to think more about behaviour than data
ORMs make life easier for change tracking, but you can do it without one and you can definitely do DDD without one.
EDIT in response to comment:
I don't think I'd implement lazy loading for order lines. What operations are you likely to perform on the order without needing the order lines? Not many I suspect.
However, I'm not one to be confined to the 'rules' of DDD when it doesn't seem to make sense, so... If in the unlikely scenario that there are a number of operations performed on the order object that didn't require the order lines to be populated AND there are often a large number of order lines associated to an order (both would have to be true for me to consider it an issue) then I'd do this:
Have this private field in the order object:
private Func<Guid, IList<OrderLine>> _lazilyGetOrderLines;
Which would be passed by the order repository to the order on creation:
Order order = new Order(this.GetOrderLines);
Where this is a private method on the OrderRepository:
private IList<OrderLine> GetOrderLines(Guid orderId)
{
//DAL Code here
}
Then the OrderLines property could look like:
public IEnumerable<OrderLine> OrderLines
{
get
{
if (_orderLines == null)
_orderLines = _lazilyGetOrderLines(this.OrderId);
return _orderLines;
}
}
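Putting those fragments together, a self-contained sketch might look like the following. The constructor taking an orderId, the GetOrderById method and the LoadCount counter are my own additions for illustration, and the "DAL code" is replaced with a hard-coded list:

```csharp
using System;
using System.Collections.Generic;

public class OrderLine
{
    public string Product { get; set; } = "";
}

public class Order
{
    private readonly Func<Guid, IList<OrderLine>> _lazilyGetOrderLines;
    private IList<OrderLine> _orderLines;

    public Guid OrderId { get; }

    public Order(Guid orderId, Func<Guid, IList<OrderLine>> lazilyGetOrderLines)
    {
        OrderId = orderId;
        _lazilyGetOrderLines = lazilyGetOrderLines;
    }

    public IEnumerable<OrderLine> OrderLines
    {
        get
        {
            // Load on first access only, then reuse the cached list.
            if (_orderLines == null)
                _orderLines = _lazilyGetOrderLines(OrderId);
            return _orderLines;
        }
    }
}

public class OrderRepository
{
    // Counts loader calls so the laziness is observable (illustration only).
    public int LoadCount { get; private set; }

    public Order GetOrderById(Guid orderId)
    {
        // The order receives a delegate, not the repository itself.
        return new Order(orderId, GetOrderLines);
    }

    private IList<OrderLine> GetOrderLines(Guid orderId)
    {
        LoadCount++;
        // Assumption: real DAL code would query the database here.
        return new List<OrderLine> { new OrderLine { Product = "Widget" } };
    }
}
```

The Order never references the repository type, only a Func, so the domain class stays free of data access code.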
Edit 2
I've found this blog post which has a similar solution to mine but slightly more elegant:
http://thinkbeforecoding.com/post/2009/02/07/Lazy-load-and-persistence-ignorance

1) Should I load the Orders list within the GetCustomerById method?
It's probably a good idea to separate the order mapping code from the customer mapping code. If you're writing your data access code by hand, calling that mapping module from the GetCustomerById method is your best option.
2) What if the Order itself has another list property and so on?
The logic to put all those together has to live somewhere; the related aggregate repository is as good a place as any.
3) What if I want to do lazy loading? Where would I put the code to load the Orders property in customer? Inside the Orders property get{} accessor? But then I would have to inject repository into the domain entity? which I don't think is the right solution.
The best solution I've seen is to make your repository return subclassed domain entities (using something like Castle DynamicProxy) - that lets you maintain persistence ignorance in your domain model.

Another possible answer is to create a new proxy object that inherits from Customer, call it CustomerProxy, and handle the lazy load there. All of this is pseudo-code meant to give you the idea, not something to copy and paste.
Example:
public class Customer
{
public int ID {get; set;}
public string Name {get; set;}
// etc...
public virtual IList<Order> Orders {get; protected set;}
}
Here is the Customer "proxy" class. This class does not live in the business layer, but in the data layer along with your Context and Data Mappers. Note that any collections you want to lazy-load should be declared as virtual (I believe EF 4.0 also requires you to make properties virtual, as it spins up proxy classes at runtime on pure POCOs so the Context can keep track of changes).
internal sealed class CustomerProxy : Customer
{
private bool _ordersLoaded = false;
public override IList<Order> Orders
{
get
{
IList<Order> orders = new List<Order>();
if (!_ordersLoaded)
{
//assuming you are using mappers to translate entities to db and back
//mappers also live in the data layer
CustomerDataMapper mapper = new CustomerDataMapper();
orders = mapper.GetOrdersByCustomerID(this.ID);
_ordersLoaded = true;
// Cache the orders for later use of the instance
base.Orders = orders;
}
else
{
orders = base.Orders;
}
return orders;
}
}
}
So, in this case, our entity object, Customer is still free from database/datamapper code calls, which is what we want... "pure" POCO's. You've delegated the lazy-load to the proxy object which lives in the Data layer, and does instantiate data mappers and make calls.
There is one drawback to this approach: calling client code can't override the lazy load. It's either on or off, so it's up to you in your particular usage circumstance. If you know that maybe 75% of the time you'll need the Orders of a Customer, then lazy-load is probably not the best bet. It would be better for your CustomerDataMapper to populate that collection at the time you get a Customer entity.
Again, I think NHibernate and EF 4.0 both allow you to change lazy-loading characteristics at runtime, so, as per usual, it makes sense to use an ORM, because a lot of functionality is provided for you.
If you don't use Orders that often, then use a lazy-load to populate the Orders collection.
I hope that this is "right", and is a way of accomplishing lazy-load the correct way for Domain Model designs. I'm still a newbie at this stuff...
Mike

Related

Many-to-many relation with repository pattern in ASP.NET MVC

I am building an ASP.NET MVC application. My database has a many-to-many relationship with an intermediate table.
In my application, I am using repository pattern without Unit-Of-Work. I have a generic repository with CRUD operations defined in it.
Because I am using Entity Framework with a database-first approach, I have created my models from EDMX.
The intermediate table is not showing up in the .edmx file, but the relationship is marked with a diamond sign, which apparently denotes many-to-many.
That is the background. Now the issue:
I have two tables, Student and Books. In my view, I want to display a form which has fields from both the Student and Books tables. The idea is that each student fills in their details, chooses the books they are interested in reading, and hits "Submit". On submit, their record should be saved in my database. For the admin, I want to display all the stored records, for which I will use an accordion. Because I am using a generic repo, I am injecting only one Student repo into my controller, so when the form is created I only get data from the Student table (i.e. their details). I don't get the fields from Books where they can select the books.
Can someone please suggest a solution?
As a fallback, if this doesn't work, I am thinking of putting all the details in one single table in my database and using that. But I want to avoid that approach.
Any ideas, suggestions will be really helpful.

If you are using a Generic Repository pattern, then consider replacing it with a more purpose-built Repository class to serve the controller or service that wants to interact with the data domain. Generic Repositories, while extremely common out there in examples and such are an anti-pattern especially when it comes to Entity Framework. The reason is because they are poorly suited to the task and don't follow the intent of a Generic pattern. Generic classes are classes optimized where you can treat all instances entirely equally. This means if I have a Repository<Student> and a Repository<Book> then every operation between a Student and Book should be identical.
With EF, working on such assumptions is either crippling the capabilities that EF can bring, or adding a lot of unnecessary complexity to your solution to enable features like eager loading, filtering, projection, sorting, pagination, etc. While a Generic Repository can still serve as a base class for a repository, even then, the common capability that it can really provide doesn't really make it very worthwhile.
The other problem with Generic Repositories, or more specifically a Repository tied to a single Domain object, is that it violates the Single Responsibility Principle. SRP is part of the S.O.L.I.D. design principles and states that a class should have one, and only one, reason to change. While on the surface using a Repository per domain object might seem like you're giving a repository one reason to change, this isn't really the case. Take something like a StudentRepository. How many controllers or services will need to interact with Students? Will they all be expecting to perform the exact same operations and have the exact same requirements of the Repository fetching and updating Students? Each consumer of a StudentRepository is a reason for that repository to change. One technique to get the most out of EF is to leverage projection, whereby we use Select or ProjectTo to significantly reduce the data size coming back and can leverage indexes for commonly used queries. If we are using a Generic Repository it's even worse, because now the code in the repository has every reason to change as it needs to apply to all domain classes.
By all means you can write Generic Repositories or Repository-per-Domain Class to satisfy SRP, however the resulting Repository will either be extremely inefficient or extremely complex.
Instead, I recommend thinking of a Repository like you would a Controller in MVC, where a Repository has a single purpose: To Serve that Controller/Service.
For example, if I have a StudentController, I would create a StudentRepository. However, the purpose of StudentRepository is to serve the StudentController as opposed to the Student domain object. If the StudentController needs a list of Books, the StudentRepository will expose a method to retrieve them. A better example might be where I have a SearchStudentController and EditStudentController. Each of these would have respective SearchStudentRepository and EditStudentRepository. In this way the StudentRepository can expose methods specific to the needs of the Controller or Service that needs access to the domain. It has one, and only one reason to change.
The other advantage of this pattern is it makes dependency management a lot cleaner. Rather than a StudentController needing a StudentRepository, and a BookRepository, and a CourseRepository, and a ... It needs just one Repository to serve the domain.
There may be a legitimate case to have more common Repository available for things like lookup values or such that pretty much all similar Controllers or Services might consume where that consumption is identical across all controllers.
The counter-argument to this approach is that code can be duplicated. For instance if you have a BooksRepository for listing/adding/managing books and a StudentsRepository that also needs to list books, then you can end up with duplicate or similar code for something like:
IEnumerable<Book> GetBooks();
However, these methods are often "similar" rather than "identical". When you want a list of Books for a particular Student, chances are you are filtering out books that are applicable to their courses, or the current revision, etc. When you are listing Books on a book search and management screen you might want to see/filter books by completely different criteria.
So if in your case we have a StudentController and a non-Generic, Controller-serving StudentRepository class, when we want to get a list of students we can explore options that don't impact anything else. At a start we can consider something like:
public async Task<IEnumerable<Student>> GetStudents()
{
var students = await _context.Students
.Include(s => s.Books)
.ToListAsync();
return students;
}
Doing this with a Generic Repository isn't really viable, but with a Repository designed to serve our specific needs we can write queries that meet those needs.
For something like search results where we don't need every detail, the repository could return a simplified DTO with the details that need to be displayed. For instance if we just wanted the student's ID #, Name, and # of books:
[Serializable]
public class StudentSummaryDTO
{
public int StudentId { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public int BookCount { get; set; }
}
Then in the repository:
public async Task<IEnumerable<StudentSummaryDTO>> GetStudents()
{
var students = await _context.Students
.Select(s => new StudentSummaryDTO
{
StudentId = s.StudentId,
FirstName = s.FirstName,
LastName = s.LastName,
BookCount = s.Books.Count
}).ToListAsync();
return students;
}
This can generate a much faster and lighter weight query to run to return just enough data for the consumer. A more advanced variant is just to design the repository to return IQueryable<Student> to allow the consuming Controller to perform its own projection, pagination, etc.
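To give a feel for the IQueryable<Student> variant, here is a small sketch. It swaps the EF context for an in-memory list wrapped with AsQueryable() so it can run on its own, and the StudentQueries.GetSummaryPage helper, its paging parameters and the Book class shape are assumptions for the example, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Book
{
    public string Title { get; set; } = "";
}

public class Student
{
    public int StudentId { get; set; }
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public List<Book> Books { get; set; } = new List<Book>();
}

public class StudentSummaryDTO
{
    public int StudentId { get; set; }
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public int BookCount { get; set; }
}

public class StudentRepository
{
    private readonly IQueryable<Student> _students;

    // Assumption: with EF this would wrap _context.Students instead of a list.
    public StudentRepository(IEnumerable<Student> students)
    {
        _students = students.AsQueryable();
    }

    // No ToList() here: the query stays composable for the caller.
    public IQueryable<Student> GetStudents() => _students;
}

public static class StudentQueries
{
    // The consumer decides on sorting, paging and projection.
    public static List<StudentSummaryDTO> GetSummaryPage(
        StudentRepository repository, int page, int pageSize)
    {
        return repository.GetStudents()
            .OrderBy(s => s.LastName)
            .Skip((page - 1) * pageSize)
            .Take(pageSize)
            .Select(s => new StudentSummaryDTO
            {
                StudentId = s.StudentId,
                FirstName = s.FirstName,
                LastName = s.LastName,
                BookCount = s.Books.Count
            })
            .ToList();
    }
}
```

With a real EF-backed IQueryable the sorting, paging and projection would be translated into the SQL query rather than executed in memory, which is the point of returning IQueryable from the repository.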

Repository with many methods or Entities with foreign keys

I have a database with a Customer, Supplier, and Services (this is a gross simplification, I really have about 100 tables)
I am developing a new Entity Framework library for accessing these tables.
A Customer has many Suppliers
A Supplier has many Services
I am trying to decide which approach to follow -
A )
Use mapping to connect the Customer to the Supplier and the Supplier to the Services, then every time I load a customer I get all his suppliers and their services (and other tables loaded)
B )
Have no mapping between entities, but provide methods in the relevant repository; e.g. in the supplier repository I'll have IEnumerable<Supplier> GetSupplierByCustomerID(int customerID)
EDIT Changed above to IEnumerable based on suggestions.
Are these the two main approaches when using EF? Which is considered better, from your perspective.
Is there another approach I'm not considering?

I would personally expose many simple methods.
Use mapping to connect the Customer to the Supplier and the Supplier
to the Services, then every time I load a customer I get all his
suppliers and their services (and other tables loaded)
If you only need to get the Name of your customer from its ID, then the above solution would require you to load useless and heavy object graph unless you use lazy loading, but as you may have some serialization process (3-tier architecture ?), it's a problem for you as you can't use lazy loading in this case.
So you could expose for example:
Supplier GetSupplierByID(int supplierID)
IEnumerable<Supplier> GetSuppliersByCustomerID(int customerID)
...
I would also recommend not exposing IQueryable. If possible, use IEnumerable instead. See this article for more details about the danger of using IQueryable when all implications are not well known.

In general I feel that putting a repository over EF is always a good idea. You get to abstract your database logic from your client-side logic (or even business logic). In the specific case you mention there is one other nice benefit: you would only get the information that you want when you specifically call for it (like the GetSupplierByCustomerID example that you mentioned).
Another approach you might consider is the one that I mentioned in the answer to this question: Bounded Contexts. The more separation of concerns that you have in your application, the better it will be in the long run for you and your fellow programmers (especially when you want to unit test it all).

It is just my opinion, and I do not know whether it is proper in your case since it depends on your business requirements, but I generally prefer a third option:
All repositories return IEnumerables, not IQueryables: this ensures all database operations finish before any business logic runs.
All repositories expose methods with optional parameters for declaring the navigation properties to include: this lets callers request exactly the related entities they need.
Create a base generic repository and inherit from it in each of your repositories.
Implement the unit of work pattern to share the context and enable transactions.
Sample method signature from the base repository (T is the type of entity):
IEnumerable<T> Find(Expression<Func<T, bool>> criteria, params Expression<Func<T, object>>[] navigationList)

When it comes to whether to map or not, I'd opt for A. There are many advantages to navigation properties (like Customer.Suppliers) and there are many ways to control lazy/eager loading.
The advantage of navigation properties is that linq queries are much easier to write. You will hardly ever have to write a join:
With join:
from supp in db.Suppliers
join serv in db.Services on supp.SupplierId equals serv.SupplierId
select ...
With navigation property
from supp in db.Suppliers
from serv in supp.Services
select ...
Or things like this:
from supp in db.Suppliers
select new { supp.Name, ServicesCount = supp.Services.Count() }
and EF will figure out how to do the joins in SQL.
Having navigation properties doesn't mean that they always get loaded. For lazy loading to happen, two conditions must be met:
The property must be defined as virtual so that EF can override it in a proxy type with the wiring to carry out lazy loading.
The context must have lazy loading enabled. It is by default, but you can turn it off per instance by setting context.Configuration.LazyLoadingEnabled = false.
So this also shows two ways to control lazy loading: you can enable/disable it structurally or temporarily.
Apart from that you can control the opposite, eager loading, in two ways:
Using the Include statement:
db.Suppliers.Include(s => s.Services)
Including navigation properties in projections:
from supp in db.Suppliers
from serv in supp.Services
select new { supp.Name, serv.ServiceName }
(there are more ways, but these are the most important ones)
This applies to writing linq queries in your services or repositories. As others have said: don't expose IQueryable to the consumers of your service/repository methods.
One last important note: lazy loading is only possible within the scope of a live context. If the context is disposed and a lazy-loading navigation property is addressed, an exception is thrown. At the same time, it is recommended to use context instances with a short life span. So there's the dilemma: expose entity objects, or only DTOs, view models and the like. When you expose lazy loading-enabled entity objects, a consumer may inadvertently address a navigation property that has not been loaded yet after the context is gone.

My entities have only setters and getters but no methods - design failure -

I have read this once:
"Don't leave entities as bags of getters and setters and put their methods in another layer unless you have a good reason to"
My customer, order, ... objects just get the data from the SqlDataReaders. They have only getters and setters.
My first question is which design approach does this follow when someone implements methods in entities AND what are these methods doing?

This way of thinking comes from the Domain Driven Design community.
In DDD you create a Domain Model that captures the functionality that your users request. You design your entities as having functionality and the data they need for it. You group them together in aggregates and you have separate classes that are responsible for construction (Factories) and querying (Repositories).
If you only have getters/setters you have an 'Anemic Domain Model'. Martin Fowler wrote about it in this article.
The problem with an Anemic Domain model is that you have the overhead of mapping your database to objects but not the benefits of it. If you don't use your entities as a real domain model, why don't you just use a DataTable or something for your data and keep your business logic in separate functions? An Anemic Domain model is an anti-pattern that should be avoided.
You also mention that you map the entities yourself. This blog explains why using an Object-Relational Mapping tool can really help. If you use Entity Framework with a Code First approach you can write a clean domain model with both data and functionality and map it to your database without much hassle. Then you will have the best of both worlds.

When you have methods as part of your model, you should only include model specific type of logic. For example, consider a bank account:
public class Account {
public AccountId Id { get; set; }
public Person Customer {get; set; }
public void Credit(Money amount) { ... }
public void Debit(Money amount) { ... }
}
Credit and Debit are model-specific logic (you won't find them anywhere else in the application), and should be encapsulated in the Account class.
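To make that concrete, here is one way the method bodies might look. This is only a sketch: it uses a plain decimal instead of the Money type, a Guid instead of AccountId, and the "no overdraft" rule is an invented business rule for illustration:

```csharp
using System;

public class Account
{
    public Guid Id { get; }

    // The balance can only change through Credit and Debit.
    public decimal Balance { get; private set; }

    public Account(Guid id, decimal openingBalance = 0m)
    {
        Id = id;
        Balance = openingBalance;
    }

    public void Credit(decimal amount)
    {
        if (amount <= 0m)
            throw new ArgumentOutOfRangeException(nameof(amount), "Amount must be positive.");

        Balance += amount;
    }

    public void Debit(decimal amount)
    {
        if (amount <= 0m)
            throw new ArgumentOutOfRangeException(nameof(amount), "Amount must be positive.");

        // Assumed rule for the example: the account may not be overdrawn.
        if (amount > Balance)
            throw new InvalidOperationException("Insufficient funds.");

        Balance -= amount;
    }
}
```

Because the setter on Balance is private, callers cannot bypass the rules by assigning the balance directly, which is what distinguishes this from a bag of getters and setters.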
You also mentioned that you used SqlDataReader within your model classes to get the data from the database. This is a big anti-pattern. Here are some problems you will encounter with this:
Violating the Single Responsibility Principle - The model is now in charge of both representing the data and getting the data from the db.
How about querying children in your model? It gets messy.
You won't be able to change your data-access as easily.
Keep the model lean. Put the data access logic in a repository, i.e. AccountRepository.

Is there a design pattern for light & heavy versions of an object?

I have the need for both light-weight, and heavy-weight versions of an object in my application.
A light-weight object would contain only ID fields, but no instances of related classes.
A heavy-weight object would contain IDs, and instances of those classes.
Here is an example class (for purpose of discussion only):
public class OrderItem
{
// FK to Order table
public int OrderID;
public Order Order;
// FK to Product table
public int ProductID;
public Product Product;
// columns in OrderItem table
public int Quantity;
public decimal UnitCost;
// Loads an instance of the object without linking to objects it relates to.
// Order, Product will be NULL.
public static OrderItem LoadOrderItemLite()
{
var reader = // get from DB Query
var item = new OrderItem();
item.OrderID = reader.GetInt("OrderID");
item.ProductID = reader.GetInt("ProductID");
item.Quantity = reader.GetInt("Quantity");
item.UnitCost = reader.GetDecimal("UnitCost");
return item;
}
// Loads an instance of the object and links to all other objects.
// Order, Product objects will exist.
public static OrderItem LoadOrderItemFULL()
{
var item = LoadOrderItemLite();
item.Order = Order.LoadFULL(item.OrderID);
item.Product = Product.LoadFULL(item.ProductID);
return item;
}
}
Is there a good design pattern to follow to accomplish this?
I can see how it can be coded into a single class (as my example above), but it is not apparent in which way an instance is being used. I would need to have NULL checks throughout my code.
Edit:
This object model is being used on client side of client-server application. In the case where I'm using the light-weight objects, I don't want lazy load because it will be a waste of time and memory ( I will already have the objects in memory on client side elsewhere)

Lazy initialization, Virtual Proxy and Ghost are three implementations of the lazy loading pattern. Basically, they load properties only once you need them. Now, I suppose you'll be using some repository to store objects, so I'd encourage you to use one of the ORM tools available (Hibernate, Entity Framework and so on); they all implement this functionality for you.

Have you considered using an ORM tool like NHibernate for accessing DB? If you use something like NHibernate, you would get this behavior by means of lazy loading.
Most ORM tools do exactly what you are looking for within lazy loading - they first get the object identifiers, and upon accessing a method, they issue subsequent queries to load the related objects.

Sounds like you might have a need for a Data Transfer Object (DTO), just a "dumb" wrapper class that summarizes a business entity. I usually use something like that when I need to flatten out an object for display. Be careful, though: overuse results in an anti-pattern.
But rendering an object for display is different from limiting hits against the database. As Randolph points out, if your intention is the latter, then use one of the existing deferred loading patterns, or better yet, use an ORM.
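For what it's worth, one way to read the DTO suggestion is to give the light and heavy versions separate types, so the compiler tells you which one you hold and the null checks go away. The sketch below assumes that reading; OrderItemSummary, ToSummary and the shapes of Order and Product are invented for the example:

```csharp
using System;

public class Order
{
    public int OrderID { get; set; }
}

public class Product
{
    public int ProductID { get; set; }
    public string Name { get; set; } = "";
}

// Light version: ids and column values only, no related objects.
public class OrderItemSummary
{
    public int OrderID { get; set; }
    public int ProductID { get; set; }
    public int Quantity { get; set; }
    public decimal UnitCost { get; set; }
}

// Heavy version: related objects are always present, never null.
public class OrderItem
{
    public Order Order { get; }
    public Product Product { get; }
    public int Quantity { get; }
    public decimal UnitCost { get; }

    public OrderItem(Order order, Product product, int quantity, decimal unitCost)
    {
        Order = order ?? throw new ArgumentNullException(nameof(order));
        Product = product ?? throw new ArgumentNullException(nameof(product));
        Quantity = quantity;
        UnitCost = unitCost;
    }

    // Flatten the full object into its light counterpart.
    public OrderItemSummary ToSummary()
    {
        return new OrderItemSummary
        {
            OrderID = Order.OrderID,
            ProductID = Product.ProductID,
            Quantity = Quantity,
            UnitCost = UnitCost
        };
    }
}
```

A method that accepts OrderItem can rely on Order and Product being there, while code that only needs ids takes OrderItemSummary.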

Take a look at the registry pattern, you can use it to find objects and also to better manage these objects, like keeping them in a cache.

Advice on Linq to SQL mapping object design

I hope the title and following text are clear, I'm not very familiar with the correct terms so please correct me if I get anything wrong. I'm using Linq ORM for the first time and am wondering how to address the following.
Say I have two DB tables:
User
----
Id
Name

Phone
-----
Id
UserId
Model
The Linq code generator produces a bunch of entity classes.
I then write my own classes and interfaces which wrap these Linq classes:
class DatabaseUser : IUser
{
public DatabaseUser(User user)
{
_user = user;
}
public Guid Id
{
get { return _user.Id; }
}
... etc
}
so far so good.
Now it's easy enough to find a user's phones from Phones.Where(p => p.User == user), but surely consumers of the API shouldn't need to write their own Linq queries to get at data, so I should wrap this query in a function or property somewhere.
So the question is, in this example, would you add a Phones property to IUser or not?
In other words, should my interface specifically be modelling my database objects (in which case Phones doesn't belong in IUser), or are they actually simply providing a set of functions and properties which are conceptually associated with a User (in which case it does)?
There seems drawbacks to both views, but I'm wondering if there is a standard approach to the problem. Or just any general words of wisdom you could share.
My first thought was to use extension methods but in fact that doesn't work in this case.

I've had some awful experiences trying to abstract LINQtoSQL entities behind interfaces. It was a while ago, but from memory the main problem was that it totally breaks associations. For example, if you have a Customer -> Order relationship, you end up exposing it as an ICustomer with a collection of IOrders, which means that Customer has to do some awkward mapping to cast its internal collection of Order objects as IOrders.
Then you have to assume that when an IOrder gets passed back in, that we can cast it to an Order. Otherwise LINQtoSQL can't deal with it, but then that defeats the point of having the interface there in the first place.
I would strongly recommend that you don't try and abstract away the entity classes too much, LINQtoSQL doesn't actually put any real magic in them, the DataContext handles their persistence lifecycle, so they remain testable.
The aspects that I would be looking to hide behind an interface would be the interactions with DataContext, for example using Repository-style classes:
public interface IPhoneRepository
{
IEnumerable<Phone> GetPhonesForUser(User user);
}
public class L2SPhoneRepository : IPhoneRepository
{
private readonly MyDataContext context;
public L2SPhoneRepository(MyDataContext context)
{
this.context = context;
}
public IEnumerable<Phone> GetPhonesForUser(User user)
{
return context.Phones.Where(p => p.User == user);
}
}
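One payoff of hiding the DataContext behind IPhoneRepository is that consumers can be exercised without a database. The sketch below illustrates that under some assumptions: the User and Phone classes are simplified stand-ins for the generated entities, and InMemoryPhoneRepository and PhoneSummaryService are hypothetical names:

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the generated LINQ to SQL entity classes.
public class User
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public class Phone
{
    public int Id { get; set; }
    public User User { get; set; }
    public string Model { get; set; } = "";
}

public interface IPhoneRepository
{
    IEnumerable<Phone> GetPhonesForUser(User user);
}

// Test double: same contract as the L2S implementation, no DataContext needed.
public class InMemoryPhoneRepository : IPhoneRepository
{
    private readonly List<Phone> _phones;

    public InMemoryPhoneRepository(IEnumerable<Phone> phones)
    {
        _phones = phones.ToList();
    }

    public IEnumerable<Phone> GetPhonesForUser(User user)
    {
        return _phones.Where(p => p.User == user).ToList();
    }
}

// A consumer that depends only on the interface.
public class PhoneSummaryService
{
    private readonly IPhoneRepository _phones;

    public PhoneSummaryService(IPhoneRepository phones)
    {
        _phones = phones;
    }

    public string Describe(User user)
    {
        var models = _phones.GetPhonesForUser(user).Select(p => p.Model);
        return user.Name + ": " + string.Join(", ", models);
    }
}
```

In production the service would receive the L2SPhoneRepository; in a unit test it receives the in-memory version, and the entity classes themselves stay unwrapped.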

Your interface should model how you would like for the objects to be used. Since you are trying to abstract, then the consumer should not have to query the DB. Whether you make it a property, or a separate function call (ie, GetPhones()), is entirely up to you. Since you are completely wrapping things, you'll have to make some choices about how deep/lazily you want to load your objects.

You should add a Phones property to IUser and make it nullable, so that for a User who doesn't have a Phone it will be null.
Since you don't want consumers of the API to write queries, you should implement functions like GetUser(), etc.
Here is a nice list of articles about n-tier applications in ASP.NET:
http://imar.spaanjaars.com/QuickDocId.aspx?quickdoc=416

I tend to consider the Linq2Sql related stuff to be an implementation detail of the data access code and, like the real structure of the database, shouldn't necessarily be exposed to other parts of the system.
If your API is going to be consumed by other people it should be cohesive and easy to use and not cluttered by things the consumer doesn't need to know about. If I'm dealing with users and their phones I don't really want to know about DataContexts or (ugh) DataSets.
Also, by keeping the bulk of your code ignorant of the L2S and database you will have an easier time testing, making schema changes (oh, so now the User table needs to keep a history of every change made) or even changing the ORM completely.
