Attaching entities to an EF context without loading them, without sacrificing DDD - C#

In DDD it is customary to protect an entity's properties like this:
public class Customer
{
    private Customer() { }
    public Customer(int id, string name) { /* ...populate properties... */ }
    public int Id { get; private set; }
    public string Name { get; private set; }
    // and so on...
}
EF uses reflection, so it can handle all those private members.
But what if you need to attach an entity without loading it (a very common thing to do):
var customer = new Customer { Id = getIdFromSomewhere() }; // can't do this!
myContext.Set<Customer>().Attach(customer);
This won't work because the Id setter is private.
What is a good way to deal with this mismatch between the language and DDD?
Ideas:
make Id public (and break DDD)
create a constructor/method to populate a dummy object (makes no sense)
use reflection ("cheat")
???
I think the best compromise is to use reflection and set that private Id property, just like EF does. Yes, it's reflection and it's slow, but it's much faster than loading from the database. And yes, it's cheating, but at least as far as the domain is concerned, there is officially no way to instantiate that entity without going through the constructor.
How do you handle this scenario?
PS: I did a simple benchmark and it takes about 10 seconds to create a million instances using reflection. Compared to hitting the database, or to the reflection EF already performs, the extra overhead is tiny.

"Customary" implies it's not a hard-and-fast rule, so if you have specific reasons to break it in your application, go for it. Making the property setter public would still be better than resorting to reflection for this: not only because of the performance cost, but also because reflection makes it much easier to introduce unwanted side effects into your application. Reflection just isn't the way to deal with this.
But I think the first question here is why you would want the ID of an object to be set from the outside in the first place. EF uses the ID primarily to identify objects, and you should not use the ID for any logic in your application beyond that.
Assuming you have a strong reason to want to change the ID, I actually think you gave the answer yourself in the source you just put in the comments:
So you would have methods to control what happens to your objects and
in doing so, constrain the properties so that they are not exposed to
be set or modified “willy nilly”.
You can keep the private setter and use a method to set the ID.
EDIT:
After reading this, I did some more testing myself, and you could have the following:
public class Customer
{
    private Customer() { }
    public Customer(int id) { /* only sets id */ }
    public Customer(int id, string name) { /* ...populate properties... */ }
    public int Id { get; private set; }
    public string Name { get; private set; }
    // and so on...
    public void SetName(string name)
    {
        // set name, perhaps check for a condition first
    }
}
public class MyController
{
    //...
    public void AttachCustomerToOrder(Order order) // wrapper method added so the snippet compiles; the original elided it
    {
        var customer = new Customer(getIdFromSomewhere());
        myContext.Set<Customer>().Attach(customer);
        order.setCustomer(customer);
        myContext.SaveChanges(); // saves the order with its customer set, without actually changing the customer: it is still read as unchanged
    }
    //...
}
This code leaves the private setters as they were (you will need the methods for editing, of course) and only the required changes are pushed to the database afterwards. As is also explained in the link above, only changes made after attaching are used, and you should make sure you don't manually set the state of the object to Modified, otherwise all properties are pushed (potentially emptying your object).
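As a rough illustration of that last point (EF6-style API; treat this as a sketch rather than code from the question):
var customer = new Customer(getIdFromSomewhere());
myContext.Set<Customer>().Attach(customer);            // tracked as Unchanged

// Safe: only what you change after attaching gets written back. If needed, a single
// column can be flagged explicitly:
// myContext.Entry(customer).Property(c => c.Name).IsModified = true;

// Risky: this marks every property as modified, so the mostly-empty stub would
// overwrite the whole row on SaveChanges():
// myContext.Entry(customer).State = EntityState.Modified;

myContext.SaveChanges();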

This is what I'm doing, using reflection. I think it's the best bad option.
var customer = CreateInstanceFromPrivateConstructor<Customer>();
SetPrivateProperty(p => p.Id, customer, 10);
myContext.Set<Customer>().Attach(customer);
//...and all the above was just for this:
order.setCustomer(customer);
myContext.SaveChanges();
The implementations of those two reflection helper methods aren't important (a possible sketch follows the list below). What is important:
EF uses reflection for lots of stuff
Database reads are much slower than these reflection calls (the benchmark I mentioned in the question shows how insignificant this perf hit is, about 10s to create a million instances)
The domain stays fully DDD - you can't create an entity in a weird state or create one without going through the constructor (I did exactly that above, but I cheated for one specific case, just like EF does)
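For reference, here is one possible way those two helpers could be implemented; the signatures are inferred from the usage above, so treat this as a sketch, not the definitive version (requires System, System.Linq.Expressions and System.Reflection):
static T CreateInstanceFromPrivateConstructor<T>() where T : class
{
    // Activator.CreateInstance(type, nonPublic: true) will use the private parameterless constructor.
    return (T)Activator.CreateInstance(typeof(T), nonPublic: true);
}

static void SetPrivateProperty<T, TValue>(Expression<Func<T, TValue>> property, T instance, TValue value)
{
    // Resolve the property named in the lambda (e.g. p => p.Id) and invoke its non-public setter.
    var name = ((MemberExpression)property.Body).Member.Name;
    var prop = typeof(T).GetProperty(name, BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
    prop.GetSetMethod(nonPublic: true).Invoke(instance, new object[] { value });
}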

Related

Updating DDD Aggregates with Collections

So, I've got an aggregate (Project) that has a collection of entities (ProjectVariables) in it. The variables do not have Ids on them because they have no identity outside of the Project Aggregate Root.
public class Project
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public List<ProjectVariable> ProjectVariables { get; set; }
}
public class ProjectVariable
{
    public string Key { get; set; }
    public string Value { get; set; }
    public List<string> Scopes { get; set; }
}
The user interface for the project is an Angular web app. A user visits the details page for the project and can add/remove/edit the project variables. He can change the name. No changes persist to the database until the user clicks save and the web app posts some JSON to the backend, which in turn passes it down to the domain.
In accordance with DDD, it's proper practice to have small, succinct methods on the Aggregate roots that make atomic changes to them. An example in this domain could be a method Project.AddProjectVariable(projectVariable).
In order to keep to this practice, the front-end app needs to track changes and submit them, something like this:
public class SaveProjectCommand
{
    public string NewName { get; set; }
    public List<ProjectVariable> AddedProjectVariables { get; set; }
    public List<ProjectVariable> RemovedProjectVariables { get; set; }
    public List<ProjectVariable> EditedProjectVariables { get; set; }
}
I suppose it's also possible to post the now edited Project, retrieve the original Project from the repo, and diff them, but that seems a little ridiculous.
This object would get translated into Service Layer methods, which would call methods on the Aggregate root to accomplish the intended behaviors.
So, here's where my questions come...
ProjectVariables have no Id. They are transient objects. If I need to remove them, as passed in from the UI's change tracking, how do I identify the ones that need to be removed on the Aggregate? Again, they have no identification. I could add surrogate Ids to the ProjectVariable entity, but that seems wrong and dirty.
Does change tracking in my UI seem like it's making the UI do too much?
Are there alternative mechanisms? One thought was to just replace all of the ProjectVariables in the Project Aggregate Root every time it's saved. Wouldn't that have me adding a Project.ClearVariables() and then using Project.AddProjectVariable() to replace them? Project.ReplaceProjectVariables(List) seems very "CRUDish".
Am I missing a key component? It seems to me that DDD atomic methods don't mesh well with a pattern where you can make a number of different changes to an entity before committing it.
In accordance with DDD, it's proper practice to have small, succinct
methods on the Aggregate roots that make atomic changes to them.
I wouldn't phrase it that way. The methods should, as much as possible, reflect cohesive operations that have a domain meaning and correspond with a verb or noun in the ubiquitous language. But the state transitions that happen as a consequence are not necessarily small; they can change vast swaths of Aggregate data.
I agree that it is not always feasible though. Sometimes, you'll just want to change some entities field by field. If it happens too much, maybe it's time to consider changing from a rich domain model approach to a CRUD one.
ProjectVariables have no Id. They are transient objects.
So they are probably Value Objects instead of Entities.
You usually don't modify Value Objects but replace them (especially if they're immutable). Project.ReplaceProjectVariables(List) or some equivalent is probably your best option here. I don't see it as being too CRUDish. Pure CRUD here would mean that you only have a setter on the Variables property and aren't even allowed to create a method and name it as you want.
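A minimal sketch of what that could look like on the aggregate (names follow the question; the read-only collection and the validation hook are my assumptions):
public class Project
{
    private readonly List<ProjectVariable> _projectVariables = new List<ProjectVariable>();

    public Guid Id { get; private set; }
    public string Name { get; private set; }

    public IReadOnlyCollection<ProjectVariable> ProjectVariables
    {
        get { return _projectVariables; }
    }

    public void ReplaceProjectVariables(IEnumerable<ProjectVariable> variables)
    {
        // Variables are value objects: validate the whole set here (e.g. no duplicate keys),
        // then swap it wholesale instead of editing items in place.
        _projectVariables.Clear();
        _projectVariables.AddRange(variables);
    }
}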

C# immutability.. a lie?

It is considered good practice in a multithreaded environment to clone objects (e.g. a list) to promote immutability. However, if we do so it can be a lie to the API users. What I'm saying is this:
Consider the following code:
public class Teacher {
    public List<Student> Students = new List<Student>();
    public Student GetStudent(int index) {
        return Students[index].Clone();
    }
}
public class Student {
    public DateTime LastAttended { get; set; }
    // assumed by the question: a shallow copy handed out instead of the original
    public Student Clone() {
        return new Student { LastAttended = LastAttended };
    }
}
and the users of the API could then do this:
var teacher = new Teacher();
var student3 = teacher.GetStudent(3);
student3.LastAttended = DateTime.Now;
Without proper documentation the user could not have known that the student object he is getting is actually a clone, in which case any changes made to the object will not be reflected in the original one.
How can the code above be improved so that the user intuitively knows that GetStudent is meant only for reading and not for modification? Is there any way to prevent or restrict modification of the Student object returned from the GetStudent method?
Your Student object isn't immutable at all. If you want immutability, make an immutable object:
public sealed class Student {
    private readonly DateTime _lastAttended;
    public DateTime LastAttended { get { return _lastAttended; } }

    public Student(DateTime lastAttended)
    {
        _lastAttended = lastAttended;
    }
}
If you don't want someone to set the value of a property, then do not expose a setter, only a getter.
This of course requires architecting the application around this. If you actually need to update the LastAttended time, you would do that e.g. through a Repository that updates the Database and returns a new Student object. Also, many ORMs can't automatically handle immutable objects and need some translation code.
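A hypothetical sketch of that approach (the repository name and method are mine, not an established API):
public interface IStudentRepository
{
    // Persists the new attendance time and hands back a fresh immutable Student.
    Student UpdateLastAttended(int studentId, DateTime lastAttended);
}

// usage: the caller never mutates the Student it already holds
// var updatedStudent = studentRepository.UpdateLastAttended(3, DateTime.Now);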
Note that your issue is super common when people cache objects in memory and then pass them along, e.g. to view models which manipulate them, unknowingly modifying the master object in the cache. This is why cloning is often recommended for caches. Cloning protects you from other code making modifications to "your" objects - every time someone asks, they get a new instance of your master object. Cloning does not prevent the caller from messing up his own copy.
Note that declaring a field as readonly doesn't do much if the type of the field is mutable - I could still do e.g. student.Course.Name = "Test"; even if Course were readonly. I cannot change the reference in the Student object, but I can still access any property setters on the referenced object.
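A throwaway example to illustrate (Course is a hypothetical type, not part of the original code):
public sealed class Course
{
    public string Name { get; set; }   // mutable
}

public sealed class Student
{
    public readonly Course Course = new Course();
}

// somewhere in calling code:
var student = new Student();
// student.Course = new Course();     // compile error: cannot reassign a readonly field
student.Course.Name = "Test";         // allowed: readonly only protects the reference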
True immutability is a bit of a pain in C#, as it means a lot of typing and a lot of factory methods. At some point it may be okay to just leave a normal mutable get/set and trust that callers know what to do, since they can only mess up themselves, not you. That said, anything that actually manipulates the data in the database needs proper security/business rule checks.

Update an entity property when another property changes (after entity is initialized)

I am using Entity Framework with the Code First approach in a WPF MVVM application backed by a SQL CE database. I am trying to design a model class that can update one of its property values in response to another of its property values changing. Basically, I am looking for a way to define a POCO that is "self-tracking" after the instance is initialized by EF. If the answer involves abandoning Code First, then maybe that is the only viable route (not sure). A basic example:
class ThingModel
{
    public int Id { get; set; }
    public bool OutsideDbNeedsUpdate { get; set; }

    private string _foo;
    public string Foo
    {
        get { return _foo; }
        set
        {
            if (_foo != value)
            {
                _foo = value;
                OutsideDbNeedsUpdate = true;
            }
        }
    }
}
However, the problem with the above is that whenever the DbContext is initializing an instance at runtime and setting the fields, my class is prematurely setting the dependent field in response. In other words, I am searching for a simple pattern that would allow my POCO class to ONLY do this special change tracking after EF has finished initializing the fields on an instance.
I realize I could do something like the solution here, but my business case requires that this special change tracking be decoupled from the EF change tracking; in other words, I need to be able to call SaveChanges regardless of the state of the HasChanges property above. This is because I would like to periodically check that flag on my entities and in turn update dependent values in an outside database (not the same one backing the EF DbContext), and many changes/saves may happen to the EF DB between pushes to the outside DB. Hence I was hoping to just persist the flag with the record in my DB and reset it to false when the periodic update to the outside DB occurs.
After your edit I think you can use the ObjectMaterialized event.
This event is raised after all scalar, complex, and reference properties have been set on an object, but before collections are loaded.
Put this in the constructor of your DbContext:
((IObjectContextAdapter)this).ObjectContext.ObjectMaterialized +=
HandleObjectMaterialized;
And the method:
private void HandleObjectMaterialized(object sender, ObjectMaterializedEventArgs e)
{ }
Now the question is, what to put in the method body? Probably the easiest solution is to define an interface
interface IChangeTracker
{
    bool Materialized { get; set; }
    bool OutsideDbNeedsUpdate { get; }
}
and let the classes you want to track implement this interface.
Then, in HandleObjectMaterialized you can do:
var entity = e.Entity as IChangeTracker;
if (entity != null)
{
entity.Materialized = true;
}
After this you know when you can set OutsideDbNeedsUpdate internally.
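Putting it together with the question's ThingModel, one possible sketch (the Materialized guard is the only addition):
class ThingModel : IChangeTracker
{
    public int Id { get; set; }
    public bool Materialized { get; set; }
    public bool OutsideDbNeedsUpdate { get; set; }

    private string _foo;
    public string Foo
    {
        get { return _foo; }
        set
        {
            if (_foo != value)
            {
                _foo = value;
                // Ignore the assignments EF makes while materializing the entity;
                // only user changes after that should raise the flag.
                if (Materialized)
                    OutsideDbNeedsUpdate = true;
            }
        }
    }
}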
Original text
Generally it is not recommended to have properties with side effects (or, more exactly, with more side effects than changing the state they represent). Maybe there are exceptions to this rule, but most of the time it is just not a good idea to have dependencies between properties.
I have to guess a bit what you can do best, because I don't know what your real code is about, but it might be possible to put the logic in the getter. Just an example:
public MyState State
{
    get { return this.EndDate.HasValue ? MyState.Completed : this._state; }
    set { this._state = value; }
}
This does not remove the mutual dependencies, but it defers the moment of effect to the time the property is accessed, which in your case may be no sooner than SaveChanges().
Another strategy is to make a method that sets both properties at once. Methods are expected to have side effects, especially when their names clearly indicate it. You could have a method like SetMasterAndDependent(string master).
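A hypothetical sketch (the Master property is invented for illustration):
public void SetMasterAndDependent(string master)
{
    // One intention-revealing operation instead of a property with hidden side effects.
    this.Master = master;
    this.OutsideDbNeedsUpdate = true;
}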
Now, methods are not convenient in data binding scenarios. In that case you'd better let the view model set both properties, or call the method as above.

Entities doing too much?

I have an old puzzle, so I thought I'd share it with you; maybe I'll get pointed in the right direction.
The thing is, some of our entities in the database are quite big (read: have many properties), and the business logic rarely uses all of an entity's properties, so every time I need to think about which properties must be loaded for the business logic to work correctly. A very hypothetical sample:
public class Product
{
    public string Title { get; set; }
    public string Description { get; set; }
    public string RetailPrice { get; set; }
    public string SupplierId { get; set; }
    public Supplier Supplier { get; set; }
    // many other properties
}
public class ProductDiscountService
{
    public decimal Get(Product product)
    {
        // use only RetailPrice and Supplier code
        return discount;
    }
}
public class ProductDescriptionService
{
    public string GetSearchResultHtml(Product product)
    {
        // use only Title and Description
        return html;
    }
}
It looks like I could extract interfaces IDiscountProduct and ISearchResultProduct, mark Product as implementing those interfaces, and then create smaller DTOs implementing each of them, but at the moment that looks like overkill (at least I haven't seen anyone grouping properties using interfaces).
Splitting the database entity into smaller entities also doesn't look reasonable, as all those properties belong to the product, and I'm afraid I'd be forced to use many joins to select anything; and if I later decide that some property belongs to another entity, that move will be quite hard to implement.
Passing every property used in a particular method's business logic as a method parameter also looks like a bad solution.
Unless the properties are big (read: long strings and/or binaries), I'd just load them all.
The points below are for simple properties (e.g. Title)
No extra code (get this product with title only, or get with price only, blah-blah)
A product instance is always complete, so you can pass it around without checking if the property is null.
If you have to lazy-load some other properties later, it'll cost you more than loading them eagerly. If you have something like 20 properties, this isn't even a big object (again, provided your (hypothetical) Description property is not kilobytes in size).
Now, if you have related objects (Product.Supplier), those should be lazy-loaded, imo, unless you know the property will be used.
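If this is EF, for example, marking the navigation property virtual is enough to get that behaviour (a sketch, assuming EF with lazy loading/proxies enabled; property list trimmed from the question's Product):
public class Product
{
    // simple columns: loaded eagerly, cheap enough not to bother excluding
    public string Title { get; set; }
    public string Description { get; set; }
    public string RetailPrice { get; set; }
    public string SupplierId { get; set; }

    // related object: virtual so it is lazy-loaded only when actually accessed
    public virtual Supplier Supplier { get; set; }
}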

Best "pattern" for Data Access Layer to Business Object

I'm trying to figure out the cleanest way to do this.
Currently I have a customer object:
public class Customer
{
    public int Id { get; set; }
    public string name { get; set; }
    public List<Email> emailCollection { get; set; }

    public Customer(int id)
    {
        this.emailCollection = Email.getEmails(id);
    }
}
Then my Email object is also pretty basic.
public class Email
{
    private int index;
    public string emailAddress { get; set; }
    public int emailType { get; set; }

    public Email(...) { ... }

    public static List<Email> getEmails(int id)
    {
        return DataAccessLayer.getCustomerEmailsByID(id);
    }
}
The DataAccessLayer currently connects to the database, uses a SqlDataReader to iterate over the result set, creates new Email objects, and adds them to a List which it returns when done.
So where and how can I improve upon this?
Should I have my DataAccessLayer instead return a DataTable and leave it up to the Email object to parse and return a List back to the Customer?
I guess "Factory" is probably the wrong word, but should I have another type of EmailFactory which takes a DataTable from the DataAccessLayer and returns a List to the Email object? I guess that kind of sounds redundant...
Is it even proper practice to have my Email.getEmails(id) as a static method?
I might just be throwing myself off by trying to find and apply the best "pattern" to what would normally be a simple task.
Thanks.
Follow up
I created a working example where my domain/business object extracts a customer record by id from an existing database. The XML mapping files in NHibernate are really neat. After I followed a tutorial to set up the sessions and repository factories, pulling database records was pretty straightforward.
However, I've noticed a huge performance hit.
My original method consisted of a Stored Procedure on the DB, which was called by a DAL object, which parsed the result set into my domain/business object.
I clocked my original method at 30ms to grab a single customer record. I then clocked the NHibernate method at 3000ms to grab the same record.
Am I missing something? Or is there just a lot of overhead using this NHibernate route?
Otherwise I like the cleanliness of the code:
protected void Page_Load(object sender, EventArgs e)
{
    ICustomerRepository repository = new CustomerRepository();
    Customer customer = repository.GetById(id);
}

public class CustomerRepository : ICustomerRepository
{
    public Customer GetById(string Id)
    {
        using (ISession session = NHibernateHelper.OpenSession())
        {
            Customer customer = session
                .CreateCriteria(typeof(Customer))
                .Add(Restrictions.Eq("ID", Id))
                .UniqueResult<Customer>();
            return customer;
        }
    }
}
The example I followed had me create a helper class to manage the session; maybe that's why I'm getting this overhead?
public class NHibernateHelper
{
    private static ISessionFactory _sessionFactory;

    private static ISessionFactory SessionFactory
    {
        get
        {
            if (_sessionFactory == null)
            {
                Configuration cfg = new Configuration();
                cfg.Configure();
                cfg.AddAssembly(typeof(Customer).Assembly);
                _sessionFactory = cfg.BuildSessionFactory();
            }
            return _sessionFactory;
        }
    }

    public static ISession OpenSession()
    {
        return SessionFactory.OpenSession();
    }
}
With the application I'm working on, speed is of the essence, and ultimately a lot of data will pass between the web app and the database. If pulling up a customer record takes an agent 3 seconds instead of a third of a second, that's a huge hit. But if I'm doing something weird and this is a one-time initial setup cost, then it might be worth it if the performance were otherwise just as good as executing stored procedures on the DB.
Still open to suggestions!
Updated.
I'm scrapping my ORM/NHibernate route. I found the performance is just too slow to justify using it. Basic customer queries just take too long for our environment. 3 seconds compared to sub-second responses is too much.
If we wanted slow queries, we'd just keep our current implementation. The whole point of rewriting it was to drastically improve response times.
However, after having played with NHibernate this past week, it is a great tool! It just doesn't quite fit my needs for this project.
If the configuration you've got works now, why mess with it? It doesn't sound like you're identifying any particular needs or issues with the code as it is.
I'm sure a bunch of OO types could huddle around and suggest various refactorings here so that the correct responsibilities and roles are respected, and somebody might even try to shoehorn in a design pattern or two. But the code you have now is simple and sounds like it doesn't have any issues - I'd say leave it.
I've implemented a DAL by basically doing what NHibernate does, but manually. What NHibernate does is create a proxy class that inherits from your domain object (which should have all its fields marked as virtual). All data access code goes into property overrides; it's pretty elegant actually.
I simplified this somewhat by having my repositories fill out the simple properties themselves and only using a proxy for lazy loading. What I ended up with is a set of classes like this:
public class Product {
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public virtual Customer Customer { get; set; }
}

public class ProductLazyLoadProxy : Product {
    private readonly ICustomerRepository _customerRepository;

    public ProductLazyLoadProxy(ICustomerRepository customerRepository) {
        _customerRepository = customerRepository;
    }

    public override Customer Customer {
        get {
            if (base.Customer == null)
                base.Customer = _customerRepository.Get(CustomerId);
            return base.Customer;
        }
        set { base.Customer = value; }
    }
}

public class ProductRepository : IProductRepository {
    private readonly ICustomerRepository _customerRepository; // dependency assumed so the proxy can be constructed

    public Product Get(int id) {
        var dr = GetDataReaderForId(id);
        return new ProductLazyLoadProxy(_customerRepository) {
            Id = Convert.ToInt32(dr["id"]),
            CustomerId = Convert.ToInt32(dr["customer_id"]),
        };
    }
}
But after writing about 20 of these I just gave up and learned NHibernate. With Linq2NHibernate for querying and FluentNHibernate for configuration, the roadblocks are lower than ever nowadays.
Most likely your application has its domain logic set up as transaction scripts. For .NET implementations that use the Transaction Script pattern, Martin Fowler recommends the Table Data Gateway pattern. .NET provides good support for this pattern because a table data gateway works well with a Record Set, which Microsoft implements with its DataSet-type classes.
Various tools within the Visual Studio environment should increase your productivity. The fact that DataSets can easily be data-bound to various controls (like the DataGridView) makes them a good choice for data-driven applications.
If your business logic is more complex than a few validations, a domain model becomes a good option. Do note that a domain model comes with a whole different set of data access requirements!
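A minimal table data gateway sketch using plain ADO.NET (table and column names are assumed for illustration; requires System.Data and System.Data.SqlClient):
public class CustomerGateway
{
    private readonly string _connectionString;

    public CustomerGateway(string connectionString)
    {
        _connectionString = connectionString;
    }

    public DataTable FindById(int id)
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var adapter = new SqlDataAdapter("SELECT Id, Name FROM Customers WHERE Id = @id", connection))
        {
            adapter.SelectCommand.Parameters.AddWithValue("@id", id);
            var table = new DataTable();
            adapter.Fill(table);   // Fill opens and closes the connection itself
            return table;
        }
    }
}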
This may be too radical for you and doesn't really solve the question, but how about completely scrapping your data layer and opting for an ORM? You will save yourself a lot of the code redundancy that spending a week or so hand-rolling a DAL brings.
That aside, the pattern you're using resembles a repository pattern, sort of. I'd say your options are:
A service object in your Email class - say EmailService - instantiated in the constructor or a property. Accessed via an instance such as email.Service.GetById(id)
A static method on Email, like Email.GetById(id) which is a similar approach
A completely separate static class that is basically a façade, EmailManager for example, with static methods like EmailManager.GetById(int) (a short sketch follows after this list)
The ActiveRecord pattern where you are dealing with an instance, like
email.Save() and email.GetById()
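As a rough sketch of the third option, reusing the DataAccessLayer call from the question (treat this as one way to shape it, not a prescription):
public static class EmailManager
{
    public static List<Email> GetById(int customerId)
    {
        // Thin façade: delegate straight to the existing data access layer.
        return DataAccessLayer.getCustomerEmailsByID(customerId);
    }
}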
