MongoDB: Normalized Store to Embedded Domain Model


I have determined that a relational model makes more sense for a particular collection in a database. The problem is that the domain model was originally built as an embedded model, and there is a large amount of UI code that expects it in this form. Updating the database schema isn't an issue, but I'm curious whether there is an easy way to avoid remodeling the C# domain objects into a more old-fashioned relational model. I've started writing mappers (with AutoMapper) between version one and version two (see classes below), but it's getting messy really quickly.
Below is some fictitious code that outlines a similar domain model.
// Existing
class MyClass
{
List<Event> Events { get; set; }
List<Movie> Movies { get; set; }
}
// How it should have been modeled
class MyClass
{
List<int> Events { get; set; } // Stores Event IDs
List<int> Movies { get; set; } // Stores Movie IDs
}
The database will have to be normalized.
If I have to remodel the domain model, that's fine; I just want to feel comfortable I've exhausted other possibilities that might save time. Is there an easy solution to this problem I'm overlooking?

If the only purpose of your restructuring is the relational database, I'd advise you to look into O/R mapping. An O/R mapper, like NHibernate or Entity Framework, should be able to map your existing embedded model to a relational database. Using an O/R mapper can remove the need to remodel your domain.

Given the specific problem, it seemed the only two options I could justify were the two I mentioned in my initial post (map the data manually or change my domain object). Ultimately, for me, the path of least resistance was to map the data manually. I appreciate the suggestion by pjvds, but I couldn't justify switching to a new ORM at this stage of the project considering so many other things work better with the C# MongoDB driver and also considering a mapper isn't necessary for the other portions of our database.
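The manual-mapping route described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `StoredMyClass`, `MyClassMapper`, the `Id`/`Name`/`Title` properties, and the dictionaries standing in for the Event and Movie collections are all hypothetical, and a real implementation would replace the dictionary lookups with queries through the MongoDB driver.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only; they are not from the original post.
public class Event { public int Id { get; set; } public string Name { get; set; } = ""; }
public class Movie { public int Id { get; set; } public string Title { get; set; } = ""; }

// Embedded shape that the existing UI code expects.
public class MyClass
{
    public List<Event> Events { get; set; } = new();
    public List<Movie> Movies { get; set; } = new();
}

// Normalized shape as it would be stored: only IDs.
public class StoredMyClass
{
    public List<int> EventIds { get; set; } = new();
    public List<int> MovieIds { get; set; } = new();
}

public static class MyClassMapper
{
    // Re-hydrates the embedded model. IDs with no matching record are skipped.
    // The dictionaries are assumed stand-ins for lookups against the database.
    public static MyClass ToDomain(
        StoredMyClass stored,
        IReadOnlyDictionary<int, Event> events,
        IReadOnlyDictionary<int, Movie> movies)
    {
        return new MyClass
        {
            Events = stored.EventIds.Where(id => events.ContainsKey(id))
                                    .Select(id => events[id]).ToList(),
            Movies = stored.MovieIds.Where(id => movies.ContainsKey(id))
                                    .Select(id => movies[id]).ToList(),
        };
    }

    // Flattens the embedded model back to ID lists before saving.
    public static StoredMyClass ToStored(MyClass domain)
    {
        return new StoredMyClass
        {
            EventIds = domain.Events.Select(e => e.Id).ToList(),
            MovieIds = domain.Movies.Select(m => m.Id).ToList(),
        };
    }
}
```

Keeping both directions in one mapper class means the UI keeps seeing the embedded `MyClass`, while only the persistence code knows about the ID lists.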

Related

Is it possible to use a child class to implement separation of concerns using EF Core?

My goal is async loading of related entities using DBContext.
Let's imagine two projects. The first is named MyApp.Domain and contains the domain entities.
namespace MyApp.Domain
{
public class PlanPage
{
public Guid Id { get; set; }
}
}
namespace MyApp.Domain
{
public class PlanPageDay
{
public Guid Id { get; set; }
public Guid PlanPageId { get; set; }
}
}
The second project is named MyApp.Infrastructure.EntityFramework and contains the configuration that maps entities to the database. It also contains a class which extends the domain entity and implements Entity Framework-specific logic.
namespace MyApp.Infrastructure.EntityFramework.Models
{
public class PlanPageEntity : PlanPage
{
private readonly ApplicationDbContext _applicationDbContext;
protected PlanPageEntity(ApplicationDbContext applicationDbContext)
{
_applicationDbContext = applicationDbContext;
}
public ICollection<PlanPageDay>? Days { get; set; }
public async Task<ICollection<PlanPageDay>> GetDays()
{
return Days ??= await _applicationDbContext.PlanPageDays
.Where(pd => pd.PlanPageId == Id)
.ToListAsync();
}
}
}
The purpose of this example is simple: we separate infrastructure code from domain code. Here is how we plan to use this concept:
// Entity initializing code. Placing somewhere in domain logic.
var plan = new PlanPage(/*some constructor arguments*/);
// Entity loading code. Placing somewhere in infrastructure implementation.
public async Task<PlanPage> GetPlanPage(Guid id)
{
return await _applicationDbContext.Set<PlanPageEntity>().FindAsync(id);
}
Note that we tell Entity Framework to use the child class (PlanPageEntity) so it can handle all the EF-specific concerns.
The question is: Is it possible to configure the EF so that it allows us to use this concept?

As requested, here's a little more detail on the opinion I stated in the comments.
The main reason why I think your current approach is a bad idea is that it violates the separation of concerns design principle: when you mix domain models with data access models, you make your domain logic completely dependent on how you model the data in your database. This quickly limits your options, because the database may impose restrictions on how you can model your data that don't fit well with the domain logic you want to implement, and it makes maintenance difficult. E.g. if you decide to split one DB table into two, you might have a big task ahead of you to make your domain logic work with those two new models/tables. Additionally, making performance optimizations in your database easily becomes a nightmare if not thought through ahead of time - and you shouldn't spend time optimizing your system before it's necessary.
I know this is a little abstract since I don't know much about your domain but I'm sure I could find more arguments against it.
Instead, separating data access models (and in general all external data models) from your domain models makes it much easier to maintain: if you need to make some changes to your database, you simply need to update the logic that maps the data from your data access models to your domain model - nothing in your domain logic needs to change.
In the examples you have given, you have already logically separated your domain models and data access models into two separate projects. So why not follow through with that thought and separate the two with a binding/mapping layer in-between?

Is it possible to configure the EF so that it allows us to use this concept?
Yes. Essentially you have DTOs, and your Entities derive from your DTOs. So when you fetch an Entity you can return it directly. But you wouldn't be able to attach a non-Entity, so you'd have to map it. It's going to be inconvenient and, like 99.999% of bespoke entity and repository designs, will ultimately be a waste of time.
This is somewhat similar to what EF already does for you: start with persistence-ignorant Entity classes, and introduce persistence-aware runtime subtypes for scenarios that require them, which is basically just Lazy Loading.

Are Domain Models different from the Database models?

I understand the concepts in DDD but in practise it gets a bit confusing.
I am using C#, SQL Server and EF. I see that based on my database schema, the persistence models would look different from my aggregates. In order to define clean, nice aggregates, entities and value objects, my domain models would look different from the database models.
Moreover, if I try to merge these 2 then somehow I try to design my domain models more based on technology, not on domain.
Maybe a more concrete example would be for example:
Do I need to add an ID field in the entity if that id is used just for DB?
Many-to-many relationships can also get tricky.
Are these two models different, so that the repository implementation must convert the DB model into the domain model, or should I make the domain models the same as the models used by EF?

That's a common issue when you start with DDD.
They are completely separate things.
A Domain Model is an abstraction. It should not concern itself about the technology that you are using and it doesn't represent database tables, documents, classes, functions etc.
The Domain Model represents concepts, dependencies and interactions between these concepts.
Technology is used for the Implementation of a Domain Model.
Different technologies can be used to implement the same Domain Model.
As much as we would like to have the freedom to do whatever we like with our domain models, in practice we do use a technology for their implementation, and sometimes this will affect them.
Of course you can dump all frameworks and libraries and make your own solutions that will make your implementation a lot easier. Even if you do this you are still left with your language, C#, Java, Ruby etc... and the tools that it provides you.
Here's an example:
Let's say we develop a system for car rentals. A person can rent a car. In our Domain we have the concept of an Account that a Person has, a Car and a CarRental.
That's your Domain Model. At this point we are not concerned with the language, the database or anything else.
Now to the Implementation of the Domain Model.
We will use C#. Now we need to decide on a database. If we decide to use SQL, we may want to use the nice ability of an RDBMS to do joins, so we may implement it using integer IDs this way:
public class Account {
public int ID { get; private set; }
// other stuff
}
public class Car {
public int ID { get; private set; }
// other stuff
}
public class CarRental {
public int AccountID { get; private set; }
public int CarID { get; private set; }
// other stuff
}
On the other hand, we may decide that integer IDs are not good because you can get collisions if you have to work with other systems or move databases, so we decide to think a bit and use an e-mail address as the unique identifier for the Account and the license plate as the unique identifier for the Car:
public class Account {
public Email Email { get; private set; }
// other stuff
}
public class Car {
public LicensePlate LicensePlate { get; private set; }
// other stuff
}
public class CarRental {
public Email AccountEmail { get; private set; }
public LicensePlate CarLicensePlate { get; private set; }
// other stuff
}
If we follow DDD and divide our Entities into Aggregates properly, we should not need to do joins between them, as they should not be loaded or changed in the same transaction.
On the other hand, we may want to build a different model (a read model if we use CQRS, or a model for extracting reports) on top of the same database, and the second approach can make this harder since we lose the ability to do joins.
On the other hand, if we use a NoSQL database like MongoDB for example, we don't have the ability to do joins. So using integer ID's doesn't bring us any value.
Even if we decide use a SQL database and don't use integer ID's we can still use Domain Events and build additional models and reporting using them.
Also using joins won't work if we have a distributed system, so using integer ID's may not bring us any value at all.
Sometimes it's convenient to use the abilities of your technology to make some things easier, but too often we let it harm our Systems and Domain Models.
We end up building systems that are very robust from a technological point of view but don't do the things they are supposed to do, and that makes them useless.
A useless system is useless no matter how well it's implemented.
If you haven't read the DDD book, read it. Eric Evans talks about how your technology can either help you or fight you all the way. He also talks about how DDD allows for freedom of implementation so that you don't have to fight your technology.
Another thing that we tend to do is think about persistence all the time. It's true that we persist things most of the time, but that doesn't mean that a Domain Model is something that is persisted to a database.
When I started programming, I started with computer graphics and modeling applications like 3DsMax and Maya. When I started to write applications that use the holy Database, it was really weird to me how people don't think about and don't know their domains, just persist stuff and make it work, and all they talk about is technology.
If you are into computer graphics you won't be able to write a single line of code if you don't know Math. So you start with learning Math. After you know some of it, then you can write code.
Let's take games for example. You may design a Physics Engine that models the Domain of physics. In this model you will have concepts like Work, Power, Gravity, Acceleration etc. It's not necessary to persist them. You do persist other concepts, like the Weight of your Player, so that the Physics Engine will know how gravity should affect it, but you still don't persist Power to the database. It's still a Domain Model. Power, Work etc. are functions, not entities or aggregates. They are still part of your Domain Model.
Let's say you want to build a Physics Engine. The thing is that if you want to build a Physics Engine you have to know Physics. Physics is complex. Even if you are a really good programmer who knows EF or SQL or whatever, it won't help you build Physics Engines. Knowing the Domain of physics and being able to make a Domain Model of it and then Implement it is the key.
If you want to really feel the difference between a Domain Model and its Implementation, check this out. This domain can blow your mind before you get to any implementation.
Also check this article on modeling entities with DDD.
EDIT
Here is another article that explains differences between NHibernate and EntityFramework for Domain Modeling.

Designing a Persistence Layer

For a project we are starting to look at persistence features and how we want to implement them. Currently we are keeping Clean Architecture in mind, probably going for Onion Architecture. As such, we want to define a new outer layer in which the persistence layer resides.
We're looking at various ORM solutions (we seem to be converging on Entity Framework) using SQLite as the data store, and we are hitting a snag: how should we manage IDs and deal with adding to or removing from a collection, or moving an instance between different collections?
In the core of our 'onion', we want to keep our POCO objects. As such, we do not want some kind of 'ID' property to be added in our business objects. Only inside the persistence layer do we want to have classes with object ID's. Because of this separation:
how should removing a business object from some collection cause a row to be deleted from the SQLite database?
More complex (at least I think it is): how should moving a POCO instance from one collection to another cause the foreign key of a SQLite database row to be changed (instead of removing the row and recreating it with the same values)?
Looking around the internet I've yet to find an implementation somewhere that demonstrates a persistence layer in a Clean Architecture design. Plenty of high level diagrams and "depend only inward", but no source code examples to give a demonstration.
Some possible solutions that we came up with so far:
Have some lookup between POCO instances and their representative 'database model objects' (which have IDs etc.) within the persistence layer. When saving the project state, business model objects are matched with these database model objects, and the state of each match is updated accordingly. Then the object is persisted.
When loading a project, the persistence layer returns decorator objects of business objects that add an ID to the business object, which is only visible within the persistence layer by casting the objects to that decorator class. However, this prevents us from defining sealed POCO objects and seems to break the Clean Architecture design philosophy.
Option 1 seems costly in memory due to effectively doubling the business objects in memory. Option 2 seems the most elegant, but as I've written: it feels that it breaks Clean Architecture.
Are there better alternatives to these? Should we just go for Option 2 and take Clean Architecture more as a guideline than a rule? Can someone point us to a working example in code? (I did find an iOS example at https://github.com/luisobo/clean-architecture, but as I'm not literate in the language I cannot do much with it.)

As others have mentioned in the comments, IDs are a natural part of applications and are usually required in other parts than persistence. So trying to avoid IDs at all costs is going to produce awkward designs.
Identity Design
However, identity design (where to use which IDs, what information to put in IDs, user defined vs system generated, etc.) is something that is very important and requires thought.
A good starting point to determine what requires an ID and what not is the Value Object / Entity distinction of domain-driven design.
Value objects are things that consist of other values and don't change - so you don't need an ID.
Entities have a lifecycle and change over time. So their value alone is not enough to identify them - they need an explicit ID.
As you see here, reasoning is very different from the technical point of view that you take in your question. This does not mean you should ignore constraints imposed by frameworks (e.g. entity framework), however.
If you want an in-depth discussion about identity design, I can recommend "Implementing DDD" by Vaughn Vernon (Section "Unique Identity" in Chapter 5 - Entities).
Note: I don't mean to recommend that you use DDD because of this. I just think that DDD has some nice guidelines about ID design. Whether or not to use DDD in this project is an entirely different question.
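The Value Object / Entity distinction above can be sketched in C# as follows. `Money` and `Customer` are hypothetical examples chosen for illustration, not types from the question; the sketch assumes C# 9 or later for `record`.

```csharp
#nullable enable
using System;

// Value object: defined entirely by its values and immutable, so it needs no ID.
// A record gives value-based equality for free.
public sealed record Money(decimal Amount, string Currency);

// Entity: has a lifecycle and changes over time, so equality is based on an
// explicit ID rather than on the current property values.
public sealed class Customer : IEquatable<Customer>
{
    public Guid Id { get; }
    public string Name { get; private set; }

    public Customer(Guid id, string name)
    {
        Id = id;
        Name = name;
    }

    public void Rename(string newName) => Name = newName;

    public bool Equals(Customer? other) => other is not null && Id == other.Id;
    public override bool Equals(object? obj) => Equals(obj as Customer);
    public override int GetHashCode() => Id.GetHashCode();
}
```

Two `Money` instances with the same amount and currency are interchangeable, whereas two `Customer` instances are the same customer only if their IDs match, regardless of the current name.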

First of all, everything in the real world has an id. You have your social security number. Cars have their registration number. Items in shops have an EAN code (and a production identity). Without ids nothing in the world would work (a bit exaggerated, but hopefully you get my point).
It's the same with applications.
If your business objects do not have any natural keys (like a social security number) you MUST have a way to identify them. Your application will otherwise fail as soon as you copy your object or transfer it over a process boundary, because then it's a new object. It's like when the sheep Dolly was cloned. Is it the same sheep? No, it's Mini-Dolly.
The other part is that when you build complex structures you are violating the law of Demeter. For instance like:
public class ForumPost
{
public int Id { get; set; }
public string Title { get; set; }
public string Body { get; set; }
public User Creator { get; set; }
}
public class User
{
public string Id { get; set; }
public string FirstName { get; set; }
}
When you use that code and invoke:
post.Creator.FirstName = "Arnold";
postRepos.Update(post);
what do you expect to happen? Should your forum post repository suddenly be responsible for changes made to the user?
That's why ORMs are so sucky. They violate good architecture.
Back to ids. A good design is instead to use a user id, because then we do not break the law of Demeter and still get a good separation of concerns.
public class ForumPost
{
public int Id { get; set; }
public string Title { get; set; }
public string Body { get; set; }
public int CreatorId { get; set; }
}
So the conclusion is:
Do not abandon ids, as it introduces complexity when trying to identify the real object from all the copies of it that you will get.
Using ids when referencing different entities helps you keep a good design with distinct responsibilities.

Manipulating large quantities of data in ASP.NET MVC 5

I am currently working towards implementing a charting library with a database that contains a large amount of data. For the table I am using, the raw data is spread out across 148 columns of data, with over 1000 rows. As I have only created models for tables that contain a few columns, I am unsure how to go about implementing a model for this particular table. My usual method of creating a model and using the Entity Framework to connect it to a database doesn't seem practical, as implementing 148 properties for each column does not seem like an efficient method.
My questions are:
What would be a good method to implement this table into an MVC project so that there are read actions that allow one to pull the data from the table?
How would one structure a model so that one could read 148 columns of data from it without having to declare 148 properties?
Is the Entity Framework an efficient way of achieving this goal?

Entity Framework Database First sounds like the perfect solution for your problem.
Database First means what it sounds like: the data exists before the code does. Entity Framework will generate the models as partial classes for you based on the table you point it at.
Additionally, exceptions won't be thrown if the table changes (as long as nothing is accessing a field that doesn't exist), which can be extremely beneficial in a lot of cases. Migrations are not necessary. Instead, all you have to do is right click on the generated model and click "Update Model from Database" and it works like magic. The whole process can be significantly faster than Code First.
Here is another tutorial to help you.

Yes, with Database First you can create the entities very quickly. Also remember that it is good practice to return only the fields you really need: if your entity has 148 columns but your app needs only 10 fields, convert the original entity to a model or view model and use that.
One excellent tool that can help you is AutoMapper.
Regards,

Wow, that's a lot of columns!
Given your circumstances a few thoughts come to mind:
1: If your problem is the leg work of creating that many properties you could look at Entity Framework Power Tools. EF Tools is able to reverse engineer a database and create the necessary models/entity relation mappings for you, saving you a lot of the grunt work.
To save you pulling all of that data out in one go you can then use projections like so:
var result = DbContext.ChartingData.Select(x => new PartialDto {
Property1 = x.Column1,
Property50 = x.Column50,
Property109 = x.Column109
});
A tool like AutoMapper will allow you to do this with ease via simply configurable mapping profiles:
var result = DbContext.ChartingData.Project().To<PartialDto>().ToList();
2: If you have concerns with the performance of manipulating such large entities through Entity Framework then you could also look at using something like Dapper (which will happily work alongside Entity Framework).
This would save you the hassle of modelling the entities for the larger tables but allow you to easily query/update specific columns:
public class ModelledDataColumns
{
public string Property1 { get; set; }
public string Property50 { get; set; }
public string Property109 { get; set; }
}
const string sqlCommand = "SELECT Property1, Property50, Property109 FROM YourTable WHERE Id = @Id";
IEnumerable<ModelledDataColumns> collection = connection.Query<ModelledDataColumns>(sqlCommand, new { Id = 5 }).ToList();
Ultimately if you're keen to go the Entity Framework route then as far as I'm aware there's no way to pull that data from the database without having to create all of the properties one way or another.

ServiceStack.Net Redis: Storing Related Objects vs. Related Object Ids

My team has decided to work with Redis via the ServiceStack.net Redis Client as the underlying repository for a new high-volume website we're working on. I'm not really sure where to look for documentation for this question (general Redis docs, specific ServiceStack.Net docs, or both). Is there a definitive source of documentation on how to implement Redis via ServiceStack.Net that covers all you need to know about both Redis concepts and ServiceStack.Net concepts, or do we need to combine documentation from both sides to get the full picture?
I'm just grappling with how exactly to store related objects in our model's object graph. Here's a simple scenario that I want to work with:
There are two objects in the system: User and Feed. In RDBMS terms these two objects have a one-to-many relationship, that is, a User has a collection of Feed objects and a feed can only belong to one User. Feeds will always be accessed from Redis via their user but occasionally we'll want to get access to the user via a feed instance.
So the question I have is whether we should be storing the related objects as properties or should we store the Id values of the related objects? To illustrate:
Approach A:
public class User
{
public User()
{
Feeds = new List<Feed>();
}
public int Id { get; set; }
public List<Feed> Feeds { get; set; }
// Other properties
}
public class Feed
{
public long Id { get; set; }
public User User { get; set; }
}
Approach B:
public class User
{
public User()
{
FeedIds = new List<long>();
}
public long Id { get; set; }
public List<long> FeedIds { get; set; }
public List<Feed> GetFeeds()
{
return repository.GetFeeds( FeedIds );
}
}
public class Feed
{
public long Id { get; set; }
public long UserId { get; set; }
public User GetUser()
{
return repository.GetUser( UserId );
}
}
Which of the above approaches will work best? I've seen both approaches used in various examples but I get the impression that some of the examples I've seen may not be best-practice.
A few simple related questions:
If I make a change to an object will it automatically be reflected in Redis or will it require a save? I'm assuming the latter, but need to be absolutely clear.
If I (can) use Approach A, will an update to User object X be reflected throughout the entire object graph wherever it is referenced or will it be necessary to save changes across the graph?
Is there a problem with storing an object via its interface (i.e. using IList<Feed> as opposed to List<Feed>)?
Sorry if these questions are a little basic - until 2 weeks ago I'd never even heard of Redis - let alone ServiceStack - (nor had anyone in my team) so we're really starting from scratch here...

Rather than re-hash a lot of other documentation that's out there in the wild, I'll list a few resources for background info on Redis + ServiceStack's Redis Client:
What to think about when designing a NoSQL Redis application
Designing a NoSQL Database using Redis
General Overview of Redis and .NET
Schemaless versioning and Data Migrations with C# Redis Client
There is no magic - Redis is a blank canvas
First I want to point out that using Redis as a data store just provides a blank canvas and doesn't have any concept of related entities by itself. i.e. it just provides access to distributed comp-sci data structures. How relationships get stored is ultimately up to the client driver (i.e. ServiceStack C# Redis Client) or the app developer, by using Redis's primitive data structure operations. Since all the major data structures are implemented in Redis, you basically have complete freedom on how you want to structure and store your data.
Think how you would structure relationships in code
So the best way to think about how to store stuff in Redis is to completely disregard how data is stored in an RDBMS table and think about how it is stored in your code, i.e. using the built-in C# collection classes in memory - which Redis mirrors in behavior with its server-side data structures.
Despite not having a concept of related entities, Redis's built-in Set and SortedSet data structures provide the ideal way to store indexes. E.g. Redis's Set collection only stores a max of 1 occurrence of an element. This means you can safely add items/keys/ids to it and not care if the item exists already, as the end result will be the same whether you called it 1 or 100 times - i.e. it's idempotent, and ultimately only 1 element remains stored in the Set. So a common approach when storing an object graph (aggregate root) is to store the Child Entity Ids (aka Foreign Keys) into a Set every time you save the model.
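The idempotent behaviour of a Set-based index can be illustrated with C#'s own `HashSet<T>`, in the spirit of "think how it is stored in your code". The sketch below is an in-memory stand-in only: `InMemorySetIndex` and the key string used in the usage note are hypothetical and are not part of the ServiceStack Redis client API.

```csharp
using System.Collections.Generic;

// In-memory stand-in for Set keys on a Redis server; not the ServiceStack API.
public class InMemorySetIndex
{
    private readonly Dictionary<string, HashSet<long>> _sets = new();

    // Mirrors Set semantics: adding a member that already exists changes nothing,
    // so the call can be repeated on every save without creating duplicates.
    public void Add(string key, long id)
    {
        if (!_sets.TryGetValue(key, out var set))
        {
            set = new HashSet<long>();
            _sets[key] = set;
        }
        set.Add(id);
    }

    // Returns the ids stored under the key, or an empty collection if none exist.
    public IReadOnlyCollection<long> Members(string key)
    {
        return _sets.TryGetValue(key, out var set) ? set : new HashSet<long>();
    }
}
```

Calling `Add("user:1:feeds", 42)` once or a hundred times leaves exactly one entry under that key, which is why the child ids can be re-added each time the parent is saved.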
Visualizing your data
For a good visualization of how Entities are stored in Redis I recommend installing the Redis Admin UI which works well with ServiceStack's C# Redis Client as it uses the key naming convention below to provide a nice hierarchical view, grouping your typed entities together (despite all keys existing in the same global keyspace).
To view and edit an Entity, click on the Edit link to see and modify the selected entity's internal JSON representation. Hopefully you'll be able to make better decisions about how to design your models once you can see how they're stored.
How POCO / Entities are stored
The C# Redis Client works with any POCOs that have a single primary key - which by default is expected to be Id (though this convention is overridable with ModelConfig).
Essentially POCOs get stored into Redis as serialized JSON, with both the typeof(Poco).Name and the Id used to form a unique key for that instance. E.g:
urn:Poco:{Id} => '{"Id":1,"Foo":"Bar"}'
POCOs in the C# Client are conventionally serialized using ServiceStack's fast Json Serializer where only properties with public getters are serialized (and public setters to get de-serialized back).
Defaults are overrideable with [DataMember] attrs but not recommended since it uglifies your POCOs.
Entities are blobbed
So knowing that POCOs in Redis are just blobbed, you only want to keep non-aggregate root data on your POCOs as public properties (unless you purposely want to store redundant data). A good convention is to use methods to fetch the related data, since methods won't get serialized and they also tell your app which calls make remote reads.
So the question on whether the Feed should get stored with the User is whether or not it's non-aggregate root data, i.e. whether or not you want to access the users feeds outside the context of the user? If no, then leave the List<Feed> Feeds property on the User type.
Maintaining Custom Indexes
If however you would like to keep all feeds accessible independently, i.e. with redisFeeds.GetById(1) then you will want to store it outside of the user and maintain an index linking the 2 entities.
As you've noticed there are many ways to store relationships between entities, and how you do so is largely a matter of preference. For the child entity in a parent>child relationship you would always want to store the ParentId with the child entity. For the Parent, one option is to store a collection of ChildIds with the model and then do a single fetch for all child entities to re-hydrate the model.
Another option is to maintain the index outside of the parent DTO in its own Set for each parent instance. Some good examples of this are in the C# source code of the Redis StackOverflow demo, where the relationship of Users > Questions and Users > Answers is stored in:
idx:user>q:{UserId} => [{QuestionId1},{QuestionId2},etc]
idx:user>a:{UserId} => [{AnswerId1},{AnswerId2},etc]
The C# RedisClient also includes support for a default Parent/Child convention via its TParent.StoreRelatedEntities(), TParent.GetRelatedEntities<TChild>() and TParent.DeleteRelatedEntities() APIs, where an index is maintained behind the scenes that looks like:
ref:Question/Answer:{QuestionId} => [{answerIds},..]
Effectively these are just some of your possible options, where there are many different ways to achieve the same end and in which you also have the freedom to roll your own.
NoSQL's schema-less, loose-typing freedoms should be embraced and you shouldn't be worried about trying to follow a rigid, pre-defined structure you might be familiar with when using an RDBMS.
In conclusion, there's no real right way to store data in Redis. E.g. the C# Redis Client makes some assumptions in order to provide a high-level API around POCOs, and it blobs the POCOs in Redis's binary-safe string values - though other clients prefer to store an entity's properties in Redis Hashes (Dictionaries) instead. Both will work.
