I would like to decouple my business data entities from my database data entities and in that way make my application a bit more "data source independent", so that to switch data sources all I would need to do is create a few new repositories for the new data source.
However, I can't quite make up my mind about how to do the mapping.
My current data source is a "database" from Parse.com, and in my first attempt at the mapping I was using AutoMapper.
What if one of my entities has a reference to itself? Or what if one of the "child entities" has a reference back to its parent (sort of like EF does it)?
class ParentClass
{
    public string Name { get; set; }
    public IEnumerable<ChildClass> Children { get; set; }
}

class ChildClass
{
    public string Name { get; set; }
    public ParentClass Parent { get; set; }
}
I assume that if I map this, it would end up in a loop?
Another problem I'm having: what if, for instance, the Children had some really deep reference properties?
Let's pretend, for instance, that ChildClass looks like this:
class ChildClass
{
    public string Name { get; set; }
    public ParentClass Parent { get; set; }
    public IEnumerable<ChildClass> Children { get; set; }
}
I know this might be a contrived example, but in this case a ChildClass object could have a very deep reference chain: a bunch of children, each of which might have a bunch of children, and so forth.
If I mapped this using AutoMapper I would end up mapping all of these children until there are no children left to map. But what if I don't actually need all the "sub-children"?
Is there a way to make it "lazy load" the children on property use?
As an example:
myParentObject.Children.FirstOrDefault().Children.FirstOrDefault().Children.FirstOrDefault()
That would cause the Children property to be loaded only "on call".
Any suggestions as to how to map your data entities to business entities?
I assume that what I'm actually looking for is something that makes it possible to have custom business entities that are treated the same way Entity Framework treats its entities.
Your repositories could return simple POCO DTO objects (yes, I know the "O" in POCO and DTO already stands for "object"). These do not need complexities like a back-link to the parent; they are just hierarchical: a master with a list of details, and so on. Some have references to lookups. Here you could get into trouble if two objects point to the same lookup object and you can't manage to make them the same instance. The Identity Map pattern (http://en.wikipedia.org/wiki/Identity_map_pattern) or the Value Object pattern (http://en.wikipedia.org/wiki/Value_object) helps here.
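A minimal sketch of such an identity map (the generic shape and the names here are my own, not from a particular library):

using System;
using System.Collections.Generic;

// Hands out one instance per key within a unit of work, so two DTOs that
// reference the same lookup end up pointing at the same object.
public class IdentityMap<TKey, TEntity> where TEntity : class
{
    private readonly Dictionary<TKey, TEntity> _loaded = new Dictionary<TKey, TEntity>();

    // Returns the cached instance for the key, or loads and caches it.
    public TEntity GetOrAdd(TKey key, Func<TKey, TEntity> load)
    {
        TEntity entity;
        if (!_loaded.TryGetValue(key, out entity))
        {
            entity = load(key);
            _loaded[key] = entity;
        }
        return entity;
    }
}

The repositories would then ask the map for lookups instead of building them again, e.g. lookups.GetOrAdd(dto.CountryId, id => countryRepository.GetById(id)) - the Country names are made up for the example.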
Your mapper could then live in its own assembly that references both your business entities and the DTOs, so the business entities and DTOs don't need to know about each other. This way you get a maximum of decoupling and independence.
Your repository should be different for every use case and should "know" which data is needed to what depth, and load it accordingly. No almighty "ContactRepository" that always loads the addresses even when they are not needed in every use case. And don't use lazy loading in production; you could end up with a lot of (performance) trouble with it. (Traversing with two little foreach loops could end up in thousands of database round-trips.)
Your mapper could use AutoMapper where suitable, at least for the easy parts. The complex parts you could implement the hard way, by hand.
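For the cyclic Parent/Children graph from the question, AutoMapper's PreserveReferences and MaxDepth settings are the usual way to keep the mapper from recursing forever. A sketch of a profile, where the Dto types are hypothetical business-entity counterparts:

using System.Collections.Generic;
using AutoMapper;

// Hypothetical business-side counterparts of the data entities.
public class ParentDto
{
    public string Name { get; set; }
    public IEnumerable<ChildDto> Children { get; set; }
}

public class ChildDto
{
    public string Name { get; set; }
    public ParentDto Parent { get; set; }
}

public class EntityMappingProfile : Profile
{
    public EntityMappingProfile()
    {
        CreateMap<ParentClass, ParentDto>()
            .PreserveReferences();   // reuse already-mapped instances instead of looping

        CreateMap<ChildClass, ChildDto>()
            .PreserveReferences()
            .MaxDepth(2);            // stop mapping nested Children beyond this depth
    }
}

In recent AutoMapper versions the profile is registered with something like new MapperConfiguration(cfg => cfg.AddProfile<EntityMappingProfile>()).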
My goal is asynchronous loading of related entities using DbContext.
Let's imagine two projects. The first, named MyApp.Domain, contains the domain entities.
namespace MyApp.Domain
{
public class PlanPage
{
public Guid Id { get; set; }
}
}
namespace MyApp.Domain
{
public class PlanPageDay
{
public Guid Id { get; set; }
public Guid PlanPageId { get; set; }
}
}
The second project, named MyApp.Infrastructure.EntityFramework, contains the configuration that maps the entities to the database. It also contains a class which extends the domain entity and implements Entity Framework-specific logic.
namespace MyApp.Infrastructure.EntityFramework.Models
{
public class PlanPageEntity : PlanPage
{
private readonly ApplicationDbContext _applicationDbContext;
protected PlanPageEntity(ApplicationDbContext applicationDbContext)
{
_applicationDbContext = applicationDbContext;
}
public ICollection<PlanPageDay>? Days { get; set; }
public async Task<ICollection<PlanPageDay>> GetDays()
{
return Days ??= await _applicationDbContext.PlanPageDays
.Where(pd => pd.PlanPageId == Id)
.ToListAsync();
}
}
}
The purpose of this example is simple: we separate infrastructure code from domain code. Here is how we plan to use this concept:
// Entity initializing code. Placing somewhere in domain logic.
var plan = new PlanPage(/*some constructor arguments*/);
// Entity loading code. Placing somewhere in infrastructure implementation.
public async Task<PlanPage> GetPlanPage(Guid id)
{
return await _applicationDbContext.Set<PlanPageEntity>().FindAsync(id);
}
Note that we tell Entity Framework to use the child class (PlanPageEntity) so that it can handle all the EF-specific things it needs to.
The question is: is it possible to configure EF so that it allows us to use this concept?
As requested, here are a few more details about the opinion I stated in the comments.
The main reason why I think your current approach is a bad idea is that it violates the separation of concerns design principle: when you mix domain models with data access models, you make your domain logic completely dependent on how you model the data in your database. This quickly limits your options, because the database may impose restrictions on how you can model your data that don't fit well with the domain logic you want to implement, and it also makes maintenance difficult. E.g. if you decide to split up one DB table into two, you might have a big task ahead of you in order to make your domain logic work with those two new models/tables. Additionally, making performance optimizations in your database easily becomes a nightmare if not thought through ahead of time - and you shouldn't spend time thinking about optimizing your system before it's necessary.
I know this is a little abstract since I don't know much about your domain but I'm sure I could find more arguments against it.
Instead, separating data access models (and in general all external data models) from your domain models makes it much easier to maintain: if you need to make some changes to your database, you simply need to update the logic that maps the data from your data access models to your domain model - nothing in your domain logic needs to change.
In the examples you have given, you have already logically separated your domain models and data access models into two separate projects. So why not follow through with that thought and separate the two with a binding/mapping layer in-between?
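As a rough sketch of what that in-between layer could look like (PlanPageRecord is a hypothetical EF-only data model; only the mapping class references both sides):

using System;
using MyApp.Domain;

namespace MyApp.Infrastructure.EntityFramework.Mapping
{
    // Hypothetical EF-only data model; the domain never sees this type.
    public class PlanPageRecord
    {
        public Guid Id { get; set; }
    }

    // Only this class knows about both the data model and the domain model.
    public static class PlanPageMapper
    {
        public static PlanPage ToDomain(PlanPageRecord record)
        {
            // Copy only what the domain model actually needs.
            return new PlanPage { Id = record.Id };
        }

        public static PlanPageRecord ToRecord(PlanPage page)
        {
            return new PlanPageRecord { Id = page.Id };
        }
    }
}

If the database changes, only PlanPageRecord and this mapper change; the domain's PlanPage stays untouched.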
Is it possible to configure the EF so that it allows us to use this concept?
Yes. Essentially you have DTOs, and your entities derive from your DTOs, so when you fetch an entity you can return it directly. But you wouldn't be able to attach a non-entity, so you'd have to map it. It's going to be inconvenient, and like 99.999% of bespoke entity and repository designs, it will ultimately be a waste of time.
This is somewhat similar to what EF already does for you: start with persistence-ignorant entity classes, and introduce persistence-aware runtime subtypes for the scenarios that require them, which is basically just lazy loading.
Consider, if you will, the example of an Order class with a collection property of OrderLines.
public class Order
{
public OrderLineCollection OrderLines { get; private set; }
}
Now consider a Data Access Layer that returns a collection of Order objects without the OrderLines property populated (empty collection).
To minimize round trips to the server, the system passes the ids of all the Order objects to the DAL, which returns the OrderLine objects for every Order in one go. Code in the Business Rules Layer is responsible for adding the correct OrderLine objects to the correct Order objects; a sketch of that stitching follows the DAL code below.
public class OrderDAL
{
public IEnumerable<Order> GetOrdersByCustomer(int customerId)
{
...
}
public IEnumerable<OrderLine> GetOrderLines(IEnumerable<int> orderIds)
{
...
}
}
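A sketch of that stitching code in the Business Rules Layer; Order.Id, OrderLine.OrderId and an Add method on OrderLineCollection are assumed names that aren't in the original:

using System.Collections.Generic;
using System.Linq;

public class OrderService
{
    private readonly OrderDAL _orderDal = new OrderDAL();

    public IList<Order> GetOrdersWithLines(int customerId)
    {
        // One DAL call for the orders, one for all of their lines.
        var orders = _orderDal.GetOrdersByCustomer(customerId).ToList();
        var lines = _orderDal.GetOrderLines(orders.Select(o => o.Id));

        // Match the lines to their orders in memory.
        var linesByOrderId = lines.ToLookup(l => l.OrderId);
        foreach (var order in orders)
        {
            foreach (var line in linesByOrderId[order.Id])
                order.OrderLines.Add(line);
        }
        return orders;
    }
}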
Is this the general way of doing this kind of thing (to reduce database round-trips)?
Should the DAL have the responsibility of returning fully populated Order objects?
Are there better ways?
And no, I cannot use an ORM tool in this particular instance!
I for one don't. I don't want to go back to the store to retrieve more data after an initial query. When loading the data, you (ought to) know for which scenario you are loading it, so you know beforehand which "navigation properties" or joins you want to make. That way, with one query, you can get all the data you want.
This is however from a stateless point of view, as I'm currently focusing on MVC and Entity Framework. I guess if you're creating an accounting program, you may have one Orders screen that displays order headers, and an Order Details screen where you want to display the details for the selected order. So in that case, yes, it can be useful to only have to retrieve the OrderLines for the selected order(s).
As usual, the answer is: it depends.
And no, I cannot use an ORM tool in this particular instance!
Why?
I'm reading through Pro ASP.NET MVC 3 Framework, which just came out, and am a bit confused about how to handle the retrieval of aggregate objects from a data store. The book uses Entity Framework, but I am considering using a mini-ORM (Dapper or PetaPoco). As an example, the book uses the following objects:
public class Member {
public string name { get; set; }
}
public class Item {
public int id { get; set; }
public List<Bid> bids { get; set; }
}
public class Bid {
public int id { get; set; }
public Member member { get; set; }
public decimal amount { get; set; }
}
As far as I've gotten in the book, they just mention the concept of aggregates and move on. So I am assuming you would then implement some basic repository methods, such as:
List<Item> GetAllItems()
List<Bid> GetBidsById(int id)
Member GetMemberById(int id)
Then, if you wanted to show a list of all items, their bids, and the bidding member, you'd have something like
List<Item> items = Repository.GetAllItems();
foreach (Item i in items) {
    i.bids = Repository.GetBidsById(i.id);
    foreach (Bid b in i.bids) {
        b.member = Repository.GetMemberById(b.id);
    }
}
If this is correct, isn't it awfully inefficient, since you could potentially issue thousands of queries in a few seconds? In my non-ORM mind, I would have written a query like
SELECT
item.id,
bid.id,
bid.amount,
member.name
FROM
item
INNER JOIN bid
ON item.id = bid.itemId
INNER JOIN member
ON bid.memberId = member.id
and stuck it in a DataTable. I know it's not pretty, but one large query versus a few dozen little ones seems a better alternative.
If this is not correct, then can someone please enlighten me as to the proper way of handling aggregate retrieval?
If you use Entity Framework for your Data Access Layer, read the Item entity and use the .Include() fluent method to bring the Bids and Members along for the ride.
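A sketch of that with the classes from the question (AuctionContext and its Items set are made-up names, and the property casing is kept as posted). With EF 4.1/5/6 the lambda form of Include handles nested paths via Select; EF Core would use Include(...).ThenInclude(...) instead:

using System.Collections.Generic;
using System.Data.Entity;   // EF 4.1+/6 namespace for the lambda Include extension
using System.Linq;

public class ItemQueries
{
    private readonly AuctionContext _db;   // hypothetical DbContext exposing DbSet<Item> Items

    public ItemQueries(AuctionContext db)
    {
        _db = db;
    }

    public List<Item> GetItemsWithBidsAndMembers()
    {
        // One SQL statement with the joins generated by EF instead of N+1 queries.
        return _db.Items
            .Include(i => i.bids.Select(b => b.member))
            .ToList();
    }
}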
An aggregate is a collection of related data. The aggregate root is the logical entry point of that data. In your example, the aggregate root is an Item with Bid data. You could also look at the Member as an aggregate root with Bid data.
You may use your data access layer to retrieve the object graph of each aggregate and transform the data for use in the view. You may even ensure you eagerly fetch all of the data from the children. It is possible to transform the data using a tool like AutoMapper.
However, I believe that it is better to use your data access layer to project the domain objects into the data structure you need for the view, whether it be ORM or DataSet. Again, to use your example, would you actually retrieve the entire object graph suggested? Do I need all items including their bids and members? Or do I need a list of items, number of bids, plus member name and amount for the current winning bid? When I need more data about a particular item, I can go retrieve that when the request is made.
In short, your intuition was spot-on that it is inefficient to retrieve all that data, when a projection would suffice. I would just urge you to limit the projection even further and retrieve only the data you require for the current view.
This would be handled in different ways depending on your data access strategy. If you were using NHibernate or Entity Framework, you could have the ORM automatically populate these properties for you eagerly, lazy load them, etc. Entity Framework calls them "navigation properties"; I'm not sure whether NHibernate has a specific name for these "child properties" or "child collections".
In old-school ADO.NET, you might do something like create a stored procedure that returns multiple result sets (one for the main object and other result sets for your child collections or related objects), which would let you avoid calling the database multiple times. You could then iterate over the result sets and hydrate your object with all its relationships in one database call, inside a single repository method.
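A rough sketch of that multiple-result-set approach using the Item and Bid classes from the question; the stored procedure name, parameter and column ordinals are made up for illustration:

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public class ItemRepository
{
    private readonly string _connectionString;

    public ItemRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public Item GetItemWithBids(int itemId)
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand("GetItemAggregate", connection))   // hypothetical proc
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@ItemId", itemId);

            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                // First result set: the item row.
                reader.Read();
                var item = new Item { id = reader.GetInt32(0), bids = new List<Bid>() };

                // Second result set: its bids.
                reader.NextResult();
                while (reader.Read())
                {
                    item.bids.Add(new Bid
                    {
                        id = reader.GetInt32(0),
                        amount = reader.GetDecimal(1)
                    });
                }
                return item;
            }
        }
    }
}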
Wherever in your system you do the data retrieval, you would program your ORM of choice to do an eager fetch of the related objects (aggregates).
Which data access method to use depends on your project: convenience vs. performance.
Using EF or LINQ to SQL really boosts coding speed. When it comes to performance, you really should care about every SQL statement you send to the database.
No ORM gives you both.
You can treat the read (query) and the write (command) side of the model separately.
When you want to mutate the state of your aggregate, you load the Aggregate Root (AR) via a repository, mutate its state using the intention-revealing public methods on the AR, and then save the AR back again with the repository.
On the read side, however, you can be as flexible as you want. I don't know Entity Framework, but with NHibernate you could use the QueryOver API to generate flexible queries to populate DTOs designed to be consumed by the client, whether that is a service or a view. If you want more performance, you could go with Dapper. You could even use stored procs that project directly into a DTO; that way you can be as efficient in the DB layer as possible.
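For example, a read-side query with Dapper that projects straight into a view DTO instead of loading the whole aggregate; the DTO, table and column names here are purely illustrative:

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Dapper;

// Illustrative read-model DTO; not part of the domain model.
public class OrderSummaryDto
{
    public int OrderId { get; set; }
    public string CustomerName { get; set; }
    public int LineCount { get; set; }
}

public class OrderReadQueries
{
    private readonly string _connectionString;

    public OrderReadQueries(string connectionString)
    {
        _connectionString = connectionString;
    }

    public IList<OrderSummaryDto> GetOrderSummaries(int customerId)
    {
        const string sql = @"
            SELECT o.Id AS OrderId, c.Name AS CustomerName, COUNT(l.Id) AS LineCount
            FROM Orders o
            JOIN Customers c ON c.Id = o.CustomerId
            LEFT JOIN OrderLines l ON l.OrderId = o.Id
            WHERE o.CustomerId = @customerId
            GROUP BY o.Id, c.Name";

        using (var connection = new SqlConnection(_connectionString))
        {
            // Dapper maps the result columns onto the DTO by name.
            return connection.Query<OrderSummaryDto>(sql, new { customerId }).ToList();
        }
    }
}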
I am using the repository pattern in a .NET C# application that does not use an ORM. The issue I am having is how to fill one-to-many list properties of an entity, e.g. a customer that has a list of orders: the Customer class has a List property called Orders, and my repository has a method called GetCustomerById. So:
Should I load the Orders list within the GetCustomerById method?
What if the Order itself has another list property and so on?
What if I want to do lazy loading? Where would I put the code that loads the Orders property of the customer? Inside the Orders property's get accessor? But then I would have to inject the repository into the domain entity, which I don't think is the right solution.
This also raises questions about features like change tracking, deleting, etc. So I think the end question is: can I do DDD without an ORM?
But right now I am only interested in lazy loading list properties in my domain entities. Any ideas?
I am assuming this is a very common issue for anyone not using an ORM with Domain-Driven Design. Any ideas?
Can I do DDD without an ORM?
Yes, but an ORM simplifies things.
To be honest, I think your problem isn't whether you need an ORM or not - it's that you are thinking too much about the data rather than the behaviour, which is the key to success with DDD. In terms of the data model, most entities will have associations to most other entities in some form, and from this perspective you could traverse all around the model. This is what it looks like with your customer and orders, and perhaps why you think you need lazy loading. But you need to use aggregates to break these relationships up into behavioural groups.
For example, why have you modelled the customer aggregate to have a list of orders? If the answer is "because a customer can have orders", then I'm not sure you're in the DDD mindset.
What behaviour is there that requires the customer to have a list of orders? When you give more thought to the behaviour of your domain (i.e. what data is required at what point), you can model your aggregates around use cases, and things become much clearer and much easier, as you are only change tracking a small set of objects within the aggregate boundary.
I suspect that Customer should be a separate aggregate without a list of orders, and Order should be an aggregate with a list of order lines. If you need to perform operations on each order for a customer, then use orderRepository.GetOrdersForCustomer(customerID), make your changes, and then use orderRepository.Save(order).
Regarding change tracking without an ORM, there are various ways you can do this. For example, the order aggregate could raise events for deleted order lines that the order repository listens to; these could then be deleted when the unit of work completes. A slightly less elegant way is to maintain deleted lists, e.g. order.DeletedOrderLines, which your repository can obviously read (see the sketch below).
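A minimal sketch of the deleted-list option (the OrderLine type and the read-only collections are assumptions of mine):

using System.Collections.Generic;

public class Order
{
    private readonly List<OrderLine> _orderLines = new List<OrderLine>();
    private readonly List<OrderLine> _deletedOrderLines = new List<OrderLine>();

    public IEnumerable<OrderLine> OrderLines { get { return _orderLines; } }

    // The repository reads this when the unit of work completes and issues the deletes.
    public IEnumerable<OrderLine> DeletedOrderLines { get { return _deletedOrderLines; } }

    public void RemoveOrderLine(OrderLine line)
    {
        if (_orderLines.Remove(line))
            _deletedOrderLines.Add(line);
    }
}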
To Summarise:
I think you need to think more about behaviour than data
ORMs make life easier for change tracking, but you can do it without one, and you can definitely do DDD without one.
EDIT in response to comment:
I don't think I'd implement lazy loading for order lines. What operations are you likely to perform on the order without needing the order lines? Not many I suspect.
However, I'm not one to be confined by the 'rules' of DDD when they don't seem to make sense, so... In the unlikely scenario that there are a number of operations performed on the order object that don't require the order lines to be populated AND there are often a large number of order lines associated with an order (both would have to be true for me to consider it an issue), then I'd do this:
Have this private field in the order object:
private Func<Guid, IList<OrderLine>> _lazilyGetOrderLines;
Which would be passed by the order repository to the order on creation:
Order order = new Order(this.GetOrderLines);
Where this is a private method on the OrderRepository:
private IList<OrderLine> GetOrderLines(Guid orderId)
{
//DAL Code here
}
Then the order lines property could look like this:
// Backing field for the lazily loaded lines.
private IList<OrderLine> _orderLines;

public IEnumerable<OrderLine> OrderLines
{
    get
    {
        if (_orderLines == null)
            _orderLines = _lazilyGetOrderLines(this.OrderId);
        return _orderLines;
    }
}
Edit 2
I've found this blog post which has a similar solution to mine but slightly more elegant:
http://thinkbeforecoding.com/post/2009/02/07/Lazy-load-and-persistence-ignorance
1) Should I load the Orders list within the GetCustomerById method?
It's probably a good idea to separate the order mapping code from the customer mapping code. If you're writing your data access code by hand, calling that mapping module from the GetCustomerById method is your best option.
2) What if the Order itself has another list property and so on?
The logic to put all those together has to live somewhere; the related aggregate repository is as good a place as any.
3) What if I want to do lazy loading? Where would I put the code that loads the Orders property of the customer? Inside the Orders property's get accessor? But then I would have to inject the repository into the domain entity, which I don't think is the right solution.
The best solution I've seen is to make your repository return subclassed domain entities (using something like Castle DynamicProxy) - that lets you maintain persistence ignorance in your domain model.
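A very rough sketch of that approach with Castle DynamicProxy; IOrderLoader, Customer.Id and GetOrdersByCustomerId are assumed names, and the Orders property must be virtual for the proxy to intercept it:

using System.Collections.Generic;
using Castle.DynamicProxy;

// Hypothetical abstraction over the hand-written data access code.
public interface IOrderLoader
{
    IList<Order> GetOrdersByCustomerId(int customerId);
}

// Intercepts the virtual Orders getter and loads the list on first access,
// keeping the Customer class itself persistence-ignorant.
public class LazyOrdersInterceptor : IInterceptor
{
    private readonly IOrderLoader _orderLoader;
    private IList<Order> _orders;

    public LazyOrdersInterceptor(IOrderLoader orderLoader)
    {
        _orderLoader = orderLoader;
    }

    public void Intercept(IInvocation invocation)
    {
        if (invocation.Method.Name == "get_Orders")
        {
            if (_orders == null)
            {
                var customer = (Customer)invocation.Proxy;
                _orders = _orderLoader.GetOrdersByCustomerId(customer.Id);
            }
            invocation.ReturnValue = _orders;
            return;
        }
        invocation.Proceed();
    }
}

// In the repository, one interceptor per proxy instance:
// var customer = new ProxyGenerator().CreateClassProxy<Customer>(
//     new LazyOrdersInterceptor(orderLoader));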
Another possible answer is to create a new proxy object that inherits from Customer, call it CustomerProxy, and handle the lazy load there. All of this is pseudo-code, so it's meant to give you an idea, not to be copied and pasted for use.
Example:
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    // etc...
    public virtual IList<Order> Orders { get; protected set; }
}
Here is the Customer "proxy" class. This class does not live in the business layer, but in the data layer along with your context and data mappers. Note that any collections you want to lazy-load should be declared virtual (I believe EF 4.0 also requires you to make properties virtual, as it spins up proxy classes at runtime on pure POCOs so the context can keep track of changes).
internal sealed class CustomerProxy : Customer
{
    private bool _ordersLoaded = false;

    public override IList<Order> Orders
    {
        get
        {
            IList<Order> orders = new List<Order>();
            if (!_ordersLoaded)
            {
                // Assuming you are using mappers to translate entities to the db and back;
                // mappers also live in the data layer.
                CustomerDataMapper mapper = new CustomerDataMapper();
                orders = mapper.GetOrdersByCustomerID(this.Id);
                _ordersLoaded = true;

                // Cache the orders for later use of the instance.
                base.Orders = orders;
            }
            else
            {
                orders = base.Orders;
            }
            return orders;
        }
    }
}
So, in this case, our entity object, Customer, is still free of database/data-mapper calls, which is what we want: "pure" POCOs. You've delegated the lazy load to the proxy object, which lives in the data layer and does instantiate data mappers and make calls.
There is one drawback to this approach: the calling client code can't override the lazy load... it's either on or off. So it's up to you in your particular usage circumstance. If you know that maybe 75% of the time you'll need the Orders of a Customer, then lazy loading is probably not the best bet; it would be better for your CustomerDataMapper to populate that collection at the time you get a Customer entity.
Again, I think NHibernate and EF 4.0 both allow you to change lazy-loading characteristics at runtime, so, as usual, it makes sense to use an ORM, because a lot of functionality is provided for you.
If you don't use Orders that often, then use a lazy-load to populate the Orders collection.
I hope that this is "right", and is a way of accomplishing lazy-load the correct way for Domain Model designs. I'm still a newbie at this stuff...
Basically, I need to set a property to the results of a query that uses data from the parent object.
With the domain model below, I need to set the C property of EntityB using data from both EntityA and EntityB.
Also, I need to set the A property of EntityB to be the actual instance of EntityA that is its parent.
Query:
Set EntityB.C = (select * from EntityC where SomeProperty = EntityB.SomeProperty and AnotherProperty = EntityB.A.AnotherProperty);
SomeProperty and AnotherProperty are not just keys.
class EntityA
{
public IList<EntityB> B
{
get;
set;
}
}
class EntityB
{
public EntityA A
{
get;
set;
}
public EntityC C
{
get;
set;
}
}
class EntityC
{
...
}
I need a way to execute code (to run the query and assign the result to the property) for each entity returned. I came close using the OnLoad method of an interceptor, but I am looking for another way. Perhaps using a result transformer or a projection?
First of all, if you're using NHibernate properly, the properties and associations should be automatically done for you by the framework. If they're not, then you don't have it set up correctly...
As for doing a query in a property... this is usually not recommended (abstract it into a utility class, or at the very least a function call), but I do remember seeing some way on here how to do it.
There are actually two questions.
Question 1: How to have a property that is loaded by some query?
Ask yourself if it really needs to be in the entity. Consider instead having a DTO (data transfer object) that holds data from different entities and queries.
If you're sure that you need this property in the entity, take a look at formulas for single-ended properties and filters for collections.
I can't provide more detailed information, because your question is very general and it depends on the actual problem, but you should be able to find a solution by starting from those features.
Question 2: How can I have a property pointing to the parent?
Very easy: just implement the property and map the collection of children (B) with inverse="true". Implement your entities so that they consistently point to the correct parent (see the sketch at the end of this answer).
Why doesn't NH do this for you? Because NH's responsibility is only to persist your entities to the database. NH does not make any changes to the data on its own; that is the responsibility of your business logic.
Note: your application should also be able to run without NH, e.g. in a unit test, so relations should be managed in your code.
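A small sketch of what "manage the relation in your code" can look like for the classes above; making the collection read-only and adding an AddB helper are design choices of mine, not requirements:

using System.Collections.Generic;

class EntityA
{
    private readonly IList<EntityB> _b = new List<EntityB>();

    // Mapped with inverse="true"; NH persists it, the code below keeps it consistent.
    public IEnumerable<EntityB> B
    {
        get { return _b; }
    }

    public void AddB(EntityB child)
    {
        child.A = this;   // the back-reference is set by business code, not by NHibernate
        _b.Add(child);
    }
}

With NHibernate you could map the _b backing field (via a field access strategy) so the collection stays encapsulated.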