How to efficiently load an object model from a data store?


Consider if you will, the example of an Order class having a collection property of OrderLines.
public class Order
{
public OrderLineCollection OrderLines { get; private set; }
}
Now consider a Data Access Layer that returns a collection of Order objects without the OrderLines property populated (empty collection).
To minimize round trips to the server, the system passes the IDs of all the Order objects to the DAL, which returns the OrderLine objects for every Order in one go. Code in the Business Rules Layer is responsible for adding the correct OrderLine objects to the correct Order objects.
public class OrderDAL
{
public IEnumerable<Order> GetOrdersByCustomer(int customerId)
{
...
}
public IEnumerable<OrderLine> GetOrderLines(IEnumerable<int> orderIds)
{
...
}
}
Is this the general way of doing this kind of thing (to reduce database round trips)?
Should the DAL have the responsibility of returning fully populated Order objects?
Are there better ways?
And no, I cannot use an ORM tool in this particular instance!
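The stitching step described in the question (matching the OrderLine objects from the second query to their parent Order objects) might look roughly like the sketch below. It is only an illustration: the Id and OrderId properties, the use of List<OrderLine> in place of OrderLineCollection, and the OrderAssembler helper are assumptions, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class OrderLine
{
    public int OrderId { get; set; }          // assumed foreign key back to the order
    public string Product { get; set; } = "";
}

public class Order
{
    public int Id { get; set; }               // assumed key
    public List<OrderLine> OrderLines { get; } = new List<OrderLine>();
}

// Hypothetical helper that would live in the business layer.
public static class OrderAssembler
{
    // Attaches each line to its parent order with one pass over each collection.
    public static void AttachLines(IEnumerable<Order> orders, IEnumerable<OrderLine> lines)
    {
        // ToLookup groups the lines by OrderId once; a missing key yields an empty sequence.
        ILookup<int, OrderLine> linesByOrder = lines.ToLookup(l => l.OrderId);
        foreach (Order order in orders)
        {
            order.OrderLines.AddRange(linesByOrder[order.Id]);
        }
    }
}
```

Grouping the lines once keeps the matching linear in the number of orders plus lines, instead of scanning all lines for every order.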

I for one don't. I don't want to go back to the store to retrieve more data after an initial query. When loading the data, you (ought to) know for what environment you are loading it, so you will know beforehand which "navigational properties" or joins you want. This way you can get all the data you want with one query.
This is however from a stateless point of view, as I'm currently focusing on MVC and Entity Framework. I guess if you're creating an accounting program, you may have one Orders screen that displays order headers, and an Order Details screen where you want to display the details for the selected order. So in that case, yes, it can be useful to only have to retrieve the OrderLines for the selected order(s).
As usual, the answer is: it depends.
And no, I cannot use an ORM tool in this particular instance!
Why?

Related

Fetching from Database: Property or Method?

I've got a class that I'd like to have fetch a record from a database. I need to make sure that there's at most one record. The record should be a single match to the class, based on the OrderId.
I feel like a property getter would make more sense than a method, but I know that property getters should avoid throwing exceptions, and .Single()/.SingleOrDefault() could end up throwing one. I feel like the method might make people think that it was fetching from the database each time. Either way, I'd have the result cached in a local field.
What is the best practice for something like this? I have an example of what my code is like below.
N.B.: I know that ideally we'd have a unique index on the DB column to make sure it's unique, but that's not possible with the vendor database we're using.
class OrderDetails
{
DbOrder _order;
string OrderId { get; set; }
DbOrder Order // property way
{
get{
if (this._order == null)
this._order = dbContext.Where(x => x.OrderId == this.OrderId).SingleOrDefault();
return _order;
}
}
DbOrder GetOrder() // method way
{
if (this._order == null)
this._order = dbContext.Where(x => x.OrderId == this.OrderId).SingleOrDefault();
return _order;
}
}

I'd say a property should raise an exception when one is needed, same as anywhere else (and, just as anywhere else, exceptions should be avoided where possible).
More to the point, I think properties shouldn't have "side effects", and although yours strictly does not, that is the closest thing I can liken it to. Opening a DB connection, querying data and piping results seems like a lot of work for a property, and a method would be more descriptive: you expect a method to do more legwork ad hoc.

Take some time to think by yourself what your class OrderDetails represents:
It represents the details of an order.
or: it represents some access to some storage where order details can be fetched (and possibly changed).
If OrderDetails would represent the first, then it would simply be some POCO with only get and set properties.
Clearly, your OrderDetails represent the 2nd. Your class is meant to ease the access to the storage of the order details. It also hides the storage, so if the storage changes (database changes structure, or not a database anymore but in-memory data), users of your class won't have to change.
The function of your OrderDetails is also more an access to order details, because two objects of class OrderDetails would not mean two order details, but two methods to access the same order details.
If you didn't mean this, but you wanted every order detail object to represent its own order details, consider changing the class so that it contains the fetched data of the order details, not some access to fetch the data. Also make some functions that fetch the data and return an object with the fetched data.
Consider making static functions like OrderDetails.Create(...) or, even better, create an order detail factory class that creates OrderDetail objects for you, filled with the desired data.
If you have separated the data from the methods that fetch the data, your question is answered: the methods that fetch the data will raise exceptions if fetching does not succeed. The POCO object that contains the fetched data won't have to raise exceptions: the data itself is not wrong.
If you really meant that your OrderDetail is not the detail of the order, but some access to get the order details, then getting the details of the order is clearly not a property of the access object, but some functionality of this access object.
Another reason to separate the order details from the access to the storage of the order details is consistency of data. If your order detail has some related properties, and you get them from the storage in separate calls, how would you guarantee that your related data does not change between the first and the second call?
For instance: get the Postcode and City of a person's address in separate calls. If the data of the person you are querying changes between your first and second call, because the person is moving, then you could get the old Postcode and the new City.
Summarized: make sure that it is clear to you which parts of your design represent the data itself and which represent access to the data, and design your classes accordingly. Data classes will be filled with property get/set; access classes will be filled with functions that return the data classes.
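As a rough illustration of the split described above, here is a minimal sketch with a data class and a separate access class. The names OrderDetailsData and OrderDetailsStore, the Postcode and City properties on the order, and the in-memory list standing in for the database are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Data class: holds fetched values only, so its properties never need to throw.
public class OrderDetailsData
{
    public string OrderId { get; set; } = "";
    public string Postcode { get; set; } = "";
    public string City { get; set; } = "";
}

// Access class: fetching is a method, and the method is where failures surface.
public class OrderDetailsStore
{
    // Stand-in for the real storage (hypothetical; a database in the original question).
    private readonly IReadOnlyList<OrderDetailsData> _rows;

    public OrderDetailsStore(IReadOnlyList<OrderDetailsData> rows)
    {
        _rows = rows;
    }

    public OrderDetailsData Fetch(string orderId)
    {
        List<OrderDetailsData> matches = _rows.Where(r => r.OrderId == orderId).ToList();
        if (matches.Count != 1)
        {
            throw new InvalidOperationException(
                $"Expected exactly one record for order '{orderId}', found {matches.Count}.");
        }

        // Copy all related values in a single read so they stay consistent with each other.
        OrderDetailsData row = matches[0];
        return new OrderDetailsData { OrderId = row.OrderId, Postcode = row.Postcode, City = row.City };
    }
}
```

With this shape, the "property or method" question mostly disappears: the method on the access class may throw, and the properties on the data class only return values that were already fetched.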

Is there a design pattern for light & heavy versions of an object?

I have the need for both light-weight, and heavy-weight versions of an object in my application.
A light-weight object would contain only ID fields, but no instances of related classes.
A heavy-weight object would contain IDs, and instances of those classes.
Here is an example class (for purpose of discussion only):
public class OrderItem
{
// FK to Order table
public int OrderID;
public Order Order;
// FK to Product table
public int ProductID;
public Product Product;
// columns in OrderItem table
public int Quantity;
public decimal UnitCost;
// Loads an instance of the object without linking to objects it relates to.
// Order, Product will be NULL.
public static OrderItem LoadOrderItemLite()
{
var reader = // get from DB Query
var item = new OrderItem();
item.OrderID = reader.GetInt("OrderID");
item.ProductID = reader.GetInt("ProductID");
item.Quantity = reader.GetInt("Quantity");
item.UnitCost = reader.GetDecimal("UnitCost");
return item;
}
// Loads an instance of the object and links it to all related objects.
// Order, Product objects will exist.
public static OrderItem LoadOrderItemFULL()
{
var item = LoadOrderItemLite();
item.Order = Order.LoadFULL(item.OrderID);
item.Product = Product.LoadFULL(item.ProductID);
return item;
}
}
Is there a good design pattern to follow to accomplish this?
I can see how it can be coded into a single class (as my example above), but it is not apparent in which way an instance is being used. I would need to have NULL checks throughout my code.
Edit:
This object model is being used on the client side of a client-server application. In the case where I'm using the light-weight objects, I don't want lazy loading because it would be a waste of time and memory (I will already have the objects in memory elsewhere on the client side).

Lazy Initialization, Virtual Proxy and Ghost are three implementations of the lazy loading pattern. Basically, they load properties only when you need them. Now, I suppose you'll be using some repository to store objects, so I'd encourage you to use one of the available ORM tools (Hibernate, Entity Framework and so on); they all implement this functionality for you for free.
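For readers who cannot use an ORM, the lazy initialization variant can be sketched by hand with the framework's Lazy<T> type. This is an assumption-laden illustration rather than the answerer's code: the productLoader callback and the IsProductLoaded property are hypothetical, and the OrderItem class is cut down to a single relation.

```csharp
using System;

public class Product
{
    public int Id { get; set; }
}

public class OrderItem
{
    private readonly Lazy<Product> _product;

    public int ProductID { get; }

    // productLoader is a hypothetical callback supplied by the data layer.
    public OrderItem(int productId, Func<int, Product> productLoader)
    {
        ProductID = productId;
        // The loader runs at most once, on first access to Product.
        _product = new Lazy<Product>(() => productLoader(productId));
    }

    // Triggers the load on first access, then returns the cached instance.
    public Product Product => _product.Value;

    // True once the related object has actually been loaded.
    public bool IsProductLoaded => _product.IsValueCreated;
}
```

Callers that only need the ID never pay for the load, and callers that need the related object do not have to scatter null checks through their code.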

Have you considered using an ORM tool like NHibernate for accessing DB? If you use something like NHibernate, you would get this behavior by means of lazy loading.
Most ORM tools do exactly what you are looking for within lazy loading - they first get the object identifiers, and upon accessing a method, they issue subsequent queries to load the related objects.

Sounds like you might have a need for a Data Transfer Object (DTO), just a "dumb" wrapper class that summarizes a business entity. I usually use something like that when I need to flatten out an object for display. Be careful, though: overuse results in an anti-pattern.
But rendering an object for display is different from limiting hits against the database. As Randolph points out, if your intention is the latter, then use one of the existing deferred loading patterns, or better yet, use an ORM.

Take a look at the registry pattern, you can use it to find objects and also to better manage these objects, like keeping them in a cache.

A quick question about aggregate relational objects in MVC

I'm reading through Pro ASP.NET MVC 3 Framework, which just came out, and am a bit confused about how to handle the retrieval of aggregate objects from a data store. The book uses Entity Framework, but I am considering using a mini-ORM (Dapper or PetaPoco). As an example, the book uses the following objects:
public class Member {
public string name { get; set; }
}
public class Item {
public int id { get; set; }
public List<Bid> bids { get; set; }
}
public class Bid {
public int id { get; set; }
public Member member { get; set; }
public decimal amount { get; set; }
}
As far as I'm into the book, they just mention the concept of aggregates and move on. So I am assuming you would then implement some basic repository methods, such as:
List<Item> GetAllItems()
List<Bid> GetBidsById(int id)
Member GetMemberById(int id)
Then, if you wanted to show a list of all items, their bids, and the bidding member, you'd have something like
List<Item> items = Repository.GetAllItems();
foreach (Item i in items) {
    i.bids = Repository.GetBidsById(i.id);
    foreach (Bid b in i.bids) {
        // b.id is kept from the original; a real lookup would need the bidder's member id
        b.member = Repository.GetMemberById(b.id);
    }
}
If this is correct, isn't this awfully inefficient, since you could potentially issue thousands of queries in a few seconds? In my non-ORM thinking mind, I would have written a query like
SELECT
item.id,
bid.id,
bid.amount,
member.name
FROM
item
INNER JOIN bid
ON item.id = bid.itemId
INNER JOIN member
ON bid.memberId = member.id
and stuck it in a DataTable. I know it's not pretty, but one large query versus a few dozen little ones seems a better alternative.
If this is not correct, then can someone please enlighten me as to the proper way of handling aggregate retrieval?
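One way to keep the single JOIN query and still end up with Item, Bid and Member objects is to group the flat rows in memory. The sketch below assumes the rows have already been read into a hypothetical ItemBidRow type whose properties mirror the selected columns; ItemGraphBuilder is likewise an illustrative name, and the property names are PascalCased versions of the book's classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One row of the flattened JOIN result (hypothetical type mirroring the selected columns).
public class ItemBidRow
{
    public int ItemId { get; set; }
    public int BidId { get; set; }
    public decimal Amount { get; set; }
    public string MemberName { get; set; } = "";
}

public class Member
{
    public string Name { get; set; } = "";
}

public class Bid
{
    public int Id { get; set; }
    public Member Member { get; set; } = new Member();
    public decimal Amount { get; set; }
}

public class Item
{
    public int Id { get; set; }
    public List<Bid> Bids { get; set; } = new List<Bid>();
}

public static class ItemGraphBuilder
{
    // Turns flat rows into one Item per item id, each with its bids attached.
    public static List<Item> Build(IEnumerable<ItemBidRow> rows)
    {
        return rows
            .GroupBy(r => r.ItemId)
            .Select(g => new Item
            {
                Id = g.Key,
                Bids = g.Select(r => new Bid
                {
                    Id = r.BidId,
                    Amount = r.Amount,
                    Member = new Member { Name = r.MemberName }
                }).ToList()
            })
            .ToList();
    }
}
```

Note that with INNER JOINs, as in the query above, items without any bids produce no rows and therefore do not appear in the result.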

If you use Entity Framework for your Data Access Layer, read the Item entity and use the .Include() fluent method to bring the Bids and Members along for the ride.

An aggregate is a collection of related data. The aggregate root is the logical entry point of that data. In your example, the aggregate root is an Item with Bid data. You could also look at the Member as an aggregate root with Bid data.
You may use your data access layer to retrieve the object graph of each aggregate and transform the data for use in the view. You may even ensure that you eagerly fetch all of the data from the children. It is possible to transform the data using a tool like AutoMapper.
However, I believe that it is better to use your data access layer to project the domain objects into the data structure you need for the view, whether it be ORM or DataSet. Again, to use your example, would you actually retrieve the entire object graph suggested? Do I need all items including their bids and members? Or do I need a list of items, number of bids, plus member name and amount for the current winning bid? When I need more data about a particular item, I can go retrieve that when the request is made.
In short, your intuition was spot-on that it is inefficient to retrieve all that data, when a projection would suffice. I would just urge you to limit the projection even further and retrieve only the data you require for the current view.

This would be handled in different ways depending on your data access strategy. If you were using NHibernate or Entity Framework, you can have the ORM automatically populate these properties for you eagerly, lazy load them, etc. Entity Framework calls them "Navigation Properties", I'm not sure that NHibernate has a specific name for these "child properties" or "child collections".
In old-school ADO.NET, you might do something like create a stored procedure that returns multiple result sets (one for the main object and other result sets for your child collections or related objects), which would let you avoid calling the database multiple times. You could then iterate over the results sets and hydrate your object with all its relationships with one database call, and inside of a single repository method.

Wherever in your system you do the data retrieval, you would program your ORM of choice to do an eager fetch of the related objects (aggregates).

Which data access method to use depends on your project.
Convenience vs performance.
Using EF or Linq to SQL really boosts the coding speed. When talking about performance, you really should care about every sql statement you deliver to the database.
No ORM can do both.

You can treat the read (query) and the write (command) side of the model separately.
When you want to mutate the state of your Aggregate, you load the Aggregate Root (AR) via a repository, mutate its state using the intention revealing public methods on the AR, then save the AR with the repository back again.
On the read side, however, you can be as flexible as you want. I don't know Entity Framework, but with NHibernate you could use the QueryOver API to generate flexible queries that populate DTOs designed to be consumed by the client, whether that is a service or a view. If you want more performance you could go with Dapper. You could even use stored procedures that project directly to a DTO, so you can be as efficient in the DB layer as possible.

Repository Pattern without an ORM

I am using the repository pattern in a .NET C# application that does not use an ORM. The issue I am having is how to fill one-to-many List properties of an entity. For example, if the Customer class has a List property called Orders and my repository has a method called GetCustomerById, then:
Should I load the Orders list within the GetCustomerById method?
What if the Order itself has another list property and so on?
What if I want to do lazy loading? Where would I put the code to load the Orders property in customer? Inside the Orders property get{} accessor? But then I would have to inject repository into the domain entity? which I don't think is the right solution.
This also raises questions about features like change tracking, deletion, etc. So I think the end question is: can I do DDD without an ORM?
But right now I am only interested in lazy loading List properties in my domain entities? Any idea?
Nabeel
I am assuming this is a very common issue for anyone not using an ORM in a Domain Driven Design? Any idea?

can I do DDD without ORM ?
Yes, but an ORM simplifies things.
To be honest, I think your problem isn't to do with whether you need an ORM or not. It's that you are thinking too much about the data rather than the behaviour, which is the key to success with DDD. In terms of the data model, most entities will have associations to most other entities in some form, and from this perspective you could traverse all around the model. This is what it looks like with your customer and orders, and perhaps why you think you need lazy loading. But you need to use aggregates to break these relationships up into behavioural groups.
For example, why have you modelled the customer aggregate to have a list of orders? If the answer is "because a customer can have orders" then I'm not sure you're in the mindset of DDD.
What behaviour is there that requires the customer to have a list of orders? When you give more thought to the behaviour of your domain (i.e. what data is required at what point) you can model your aggregates based around use cases and things become much clearer and much easier as you are only change tracking for a small set of objects in the aggregate boundary.
I suspect that Customer should be a separate aggregate without a list of orders, and Order should be an aggregate with a list of order lines. If you need to perform operations on each order for a customer, then use orderRepository.GetOrdersForCustomer(customerID), make your changes, then use orderRepository.Save(order).
Regarding change tracking without an ORM there are various ways you can do this, for example the order aggregate could raise events that the order repository is listening to for deleted order lines. These could then be deleted when the unit of work completed. Or a slightly less elegant way is to maintain deleted lists, i.e. order.DeletedOrderLines which your repository can obviously read.
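The "deleted lists" variant mentioned above could be sketched as follows. Only the DeletedOrderLines name comes from the answer; AddLine, RemoveLine and ClearDeletedLines are hypothetical helpers added for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class OrderLine
{
    public Guid Id { get; set; }
}

public class Order
{
    private readonly List<OrderLine> _orderLines = new List<OrderLine>();
    private readonly List<OrderLine> _deletedOrderLines = new List<OrderLine>();

    public IReadOnlyList<OrderLine> OrderLines => _orderLines;

    // Read by the repository on save to decide which rows to delete.
    public IReadOnlyList<OrderLine> DeletedOrderLines => _deletedOrderLines;

    public void AddLine(OrderLine line) => _orderLines.Add(line);

    public void RemoveLine(OrderLine line)
    {
        // Only lines that were actually part of the order are recorded as deleted.
        if (_orderLines.Remove(line))
        {
            _deletedOrderLines.Add(line);
        }
    }

    // Called by the repository once the deletes have been written.
    public void ClearDeletedLines() => _deletedOrderLines.Clear();
}
```

On save, the repository would issue deletes for everything in DeletedOrderLines and then clear the list, so the aggregate itself never touches the data store.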
To Summarise:
I think you need to think more about behaviour than data
ORM's make life easier for change tracking, but you can do it without one and you can definitely do DDD without one.
EDIT in response to comment:
I don't think I'd implement lazy loading for order lines. What operations are you likely to perform on the order without needing the order lines? Not many I suspect.
However, I'm not one to be confined to the 'rules' of DDD when it doesn't seem to make sense, so... If in the unlikely scenario that there are a number of operations performed on the order object that didn't require the order lines to be populated AND there are often a large number of order lines associated to an order (both would have to be true for me to consider it an issue) then I'd do this:
Have this private field in the order object:
private Func<Guid, IList<OrderLine>> _lazilyGetOrderLines;
Which would be passed by the order repository to the order on creation:
Order order = new Order(this.GetOrderLines);
Where this is a private method on the OrderRepository:
private IList<OrderLine> GetOrderLines(Guid orderId)
{
//DAL Code here
}
Then the OrderLines property could look like:
public IEnumerable<OrderLine> OrderLines
{
    get
    {
        // Load the lines on first access only
        if (_orderLines == null)
            _orderLines = _lazilyGetOrderLines(this.OrderId);
        return _orderLines;
    }
}
Edit 2
I've found this blog post which has a similar solution to mine but slightly more elegant:
http://thinkbeforecoding.com/post/2009/02/07/Lazy-load-and-persistence-ignorance

1) Should I load the Orders list within the GetCustomerById method?
It's probably a good idea to separate the order mapping code from the customer mapping code. If you're writing your data access code by hand, calling that mapping module from the GetCustomerById method is your best option.
2) What if the Order itself has another list property and so on?
The logic to put all those together has to live somewhere; the related aggregate repository is as good a place as any.
3) What if I want to do lazy loading? Where would I put the code to load the Orders property in customer? Inside the Orders property get{} accessor? But then I would have to inject repository into the domain entity? which I don't think is the right solution.
The best solution I've seen is to make your repository return subclassed domain entities (using something like Castle DynamicProxy) - that lets you maintain persistence ignorance in your domain model.

Another possible answer is to create a new Proxy object that inherits from Customer, call it CustomerProxy, and handle the lazy load there. All this is pseudo-code, so it's to give you an idea, not just copy and paste it for use.
Example:
public class Customer
{
    public int ID { get; set; }       // type assumed; the original pseudo-code omitted it
    public string Name { get; set; }  // type assumed
    // etc...
    public virtual IList<Order> Orders { get; protected set; }
}
Here is the Customer "proxy" class. This class does not live in the business layer, but in the data layer along with your Context and data mappers. Note that any collection you want to lazy-load should be declared virtual (I believe EF 4.0 also requires you to make properties virtual, as it spins up proxy classes at runtime on pure POCOs so the Context can keep track of changes).
internal sealed class CustomerProxy : Customer
{
private bool _ordersLoaded = false;
public override IList<Order> Orders
{
get
{
IList<Order> orders = new List<Order>();
if (!_ordersLoaded)
{
//assuming you are using mappers to translate entities to db and back
//mappers also live in the data layer
CustomerDataMapper mapper = new CustomerDataMapper();
orders = mapper.GetOrdersByCustomerID(this.ID);
_ordersLoaded = true;
// Cache the orders for later use of the instance
base.Orders = orders;
}
else
{
orders = base.Orders;
}
return orders;
}
}
}
So, in this case, our entity object, Customer is still free from database/datamapper code calls, which is what we want... "pure" POCO's. You've delegated the lazy-load to the proxy object which lives in the Data layer, and does instantiate data mappers and make calls.
There is one drawback to this approach: calling client code can't override the lazy load; it's either on or off. So it's up to you in your particular usage circumstance. If you know that, say, 75% of the time you'll need the Orders of a Customer, then lazy-load is probably not the best bet. It would be better for your CustomerDataMapper to populate that collection at the time you get a Customer entity.
Again, I think NHibernate and EF 4.0 both allow you to change lazy-loading characteristics at runtime, so, as per usual, it makes sense to use an ORM, b/c a lot of functionality is provided for you.
If you don't use Orders that often, then use a lazy-load to populate the Orders collection.
I hope that this is "right", and is a way of accomplishing lazy-load the correct way for Domain Model designs. I'm still a newbie at this stuff...
Mike

c# object equality for database persistance

I want to learn how others cope with the following scenario.
This is not homework or an assignment of any kind. The example classes have been created to better illustrate my question however it does reflect a real life scenario which we would like feedback on.
We retrieve all data from the database and place it into an object. An object represents a single record, and if multiple records exist in the database, we place the data into a List<> of the record object.
Let's say we have the following classes:
public class Employee
{
public bool _Modified;
public string _FirstName;
public string _LastName;
public List<Employee_Address> _Address;
}
public class Employee_Address
{
public bool _Modified;
public string _Address;
public string _City;
public string _State;
}
Please note that the getters and setters have been omitted from the classes for the sake of clarity. Before any code police accuse me of not using them, please note that they have been left out for this example only.
The database has a table for Employees and another for Employee Addresses.
Conceptually, what we do is to create a List object that represents the data in the database tables. We do a deep clone of this object which we then bind to controls on the front end. We then have two objects (Orig and Final) representing data from the database.
The user then makes changes to the "Final" object by creating, modifying, deleting records. We then want to persist these changes to the database.
Obviously we want to be as elegant as possible, only editing, creating, deleting those records that require it.
We ultimately want to compare the two List objects so that we can;
See what properties have changed so that the changes can be persisted to the database.
See what properties (records) no longer exist in the second List<> so that these records can be deleted from the database.
See what new properties exist in the new List<> so that we can create these in the database.
Who wants to get the ball rolling on how we can best achieve this? Keep in mind that we also need to drill down into the Employee_Address list to check for any changes, not just the top-level properties.
I hope I have made myself clear and look forward to any suggestions.

Add a nullable ObjectID field to your layer's base type. Pass it to the front end and back to see whether a particular instance is already persisted in the database.
It also has many other uses, even if you don't have any kind of Identity Map.

I would do exactly the same thing .NET does in their Data classes, that is keep the record state (System.Data.DataRowState comes to mind) and all associated versions together in one object.
This way:
You can tell at a glance whether it has been modified, inserted, deleted, or is still the original record.
You can quickly find what has been changed by querying the new vs old versions, without having to dig in another collection to find the old version.

You should investigate the use of the Identity Map pattern. Coupled with Unit of Work, this allows you to maintain an object "cache" of sorts from which you can check which objects need saving to the database, and when reading, to return objects from the identity map rather than creating new objects and returning those.
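A bare-bones identity map might look like the sketch below. It is a simplified illustration only (no Unit of Work, no eviction), and the IdentityMap class, its GetOrLoad method and the load callback are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Keeps one in-memory instance per key so repeated reads return the same object.
public class IdentityMap<TKey, TEntity> where TEntity : class
{
    private readonly Dictionary<TKey, TEntity> _entities = new Dictionary<TKey, TEntity>();

    public int Count => _entities.Count;

    // Returns the cached entity for the id, or loads and caches it on first request.
    public TEntity GetOrLoad(TKey id, Func<TKey, TEntity> load)
    {
        if (_entities.TryGetValue(id, out TEntity existing))
        {
            return existing;
        }

        TEntity loaded = load(id);   // load is a hypothetical data-access callback
        _entities[id] = loaded;
        return loaded;
    }
}
```

Because every caller receives the same instance for a given key, a change made through one reference is visible wherever that record is used, which removes the need to reconcile two separate copies.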

Why would you want to compare two list objects? You will potentially be using a lot of memory for what is essentially duplicate data.
I'd suggest having a status property for each object that can tell you if that particular object is New, Deleted, or Changed. If you want to go further than making the property an enum, you can make it an object that contains some sort of Dictionary holding the changes to apply, though that will most likely be useful only for the Changed status.
After you've added such a property, it should be easy to go through your list, add the New objects, remove the Deleted objects etc.
You may want to check how the Entity Framework does this sort of thing as well.
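The status-property idea can be sketched as below. The RecordState enum, the EmployeeRecord class and the ChangeSet helper are hypothetical names, and the Unchanged value is an assumption added so that untouched records can be skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum RecordState { Unchanged, New, Changed, Deleted }

public class EmployeeRecord
{
    public string FirstName { get; set; } = "";
    public RecordState State { get; set; } = RecordState.Unchanged;
}

public static class ChangeSet
{
    // Groups records by the database action they need; unchanged records are skipped.
    public static ILookup<RecordState, EmployeeRecord> PendingWork(IEnumerable<EmployeeRecord> records)
    {
        return records
            .Where(r => r.State != RecordState.Unchanged)
            .ToLookup(r => r.State);
    }
}
```

The persistence code can then walk the New, Changed and Deleted groups and issue the matching inserts, updates and deletes, without keeping a second cloned list to compare against.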
