ServiceStack.Net Redis: Storing Related Objects vs. Related Object Ids - c#

My team has decided to work with Redis via the ServiceStack.net Redis Client as the underlying repository for a new high-volume website we're working on. I'm not really sure where to look for documentation for this question (general Redis docs, ServiceStack.Net-specific docs, or both). Is there a definitive source that covers everything you need to know about both Redis concepts and ServiceStack.Net concepts, or do we need to piece together documentation from both sides separately to get the full picture?
I'm just grappling with how exactly to store related objects in our model's object graph. Here's a simple scenario that I want to work with:
There are two objects in the system: User and Feed. In RDBMS terms these two objects have a one-to-many relationship, that is, a User has a collection of Feed objects and a feed can only belong to one User. Feeds will always be accessed from Redis via their user but occasionally we'll want to get access to the user via a feed instance.
So the question I have is whether we should be storing the related objects as properties or should we store the Id values of the related objects? To illustrate:
Approach A:
public class User
{
    public User()
    {
        Feeds = new List<Feed>();
    }

    public long Id { get; set; }
    public List<Feed> Feeds { get; set; }
    // Other properties
}

public class Feed
{
    public long Id { get; set; }
    public User User { get; set; }
}
Approach B:
public class User
{
    public User()
    {
        FeedIds = new List<long>();
    }

    public long Id { get; set; }
    public List<long> FeedIds { get; set; }

    public List<Feed> GetFeeds()
    {
        // 'repository' is assumed to be an injected data-access component
        return repository.GetFeeds(FeedIds);
    }
}

public class Feed
{
    public long Id { get; set; }
    public long UserId { get; set; }

    public User GetUser()
    {
        return repository.GetUser(UserId);
    }
}
Which of the above approaches will work best? I've seen both approaches used in various examples but I get the impression that some of the examples I've seen may not be best-practice.
A few simple related questions:
If I make a change to an object will it automatically be reflected in Redis or will it require a save? I'm assuming the latter, but need to be absolutely clear.
If I (can) use Approach A, will an update to User object X be reflected throughout the entire object graph wherever it is referenced or will it be necessary to save changes across the graph?
Is there a problem with storing an object via its interface (i.e. using IList<Feed> as opposed to List<Feed>)?
Sorry if these questions are a little basic - until 2 weeks ago I'd never even heard of Redis, let alone ServiceStack (nor had anyone in my team), so we're really starting from scratch here...

Rather than re-hash a lot of the documentation that's already out in the wild, I'll list a few links for background info on Redis + ServiceStack's Redis Client:
What to think about when designing a NoSQL Redis application
Designing a NoSQL Database using Redis
General Overview of Redis and .NET
Schemaless versioning and Data Migrations with C# Redis Client
There is no magic - Redis is a blank canvas
First I want to point out that using Redis as a data store just provides a blank canvas and doesn't have any concept of related entities by itself. i.e. it just provides access to distributed comp-sci data structures. How relationships get stored is ultimately up to the client driver (i.e. ServiceStack C# Redis Client) or the app developer, by using Redis's primitive data structure operations. Since all the major data structures are implemented in Redis, you basically have complete freedom on how you want to structure and store your data.
Think how you would structure relationships in code
So the best way to think about how to store stuff in Redis is to completely disregard how data is stored in an RDBMS table and think about how it is stored in your code, i.e. using the built-in C# collection classes in memory - which Redis mirrors in behavior with its server-side data structures.
Despite not having a concept of related entities, Redis's built-in Set and SortedSet data structures provide the ideal way to store indexes. E.g. Redis's Set collection only stores a maximum of 1 occurrence of an element. This means you can safely add items/keys/ids to it and not care whether the item already exists, as the end result will be the same whether you called it 1 or 100 times - i.e. it's idempotent, and ultimately only 1 element remains stored in the Set. So a common use-case when storing an object graph (aggregate root) is to store the Child Entity Ids (aka Foreign Keys) into a Set every time you save the model.
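To make that concrete, here's a minimal sketch using ServiceStack's C# Redis Client (the key name and the ids are illustrative):

using ServiceStack.Redis;

using (var redis = new RedisClient("localhost"))
{
    long userId = 1, feedId = 42; // illustrative ids

    // Idempotent: adding the same id 1 or 100 times leaves exactly one entry in the Set
    redis.AddItemToSet("idx:user>feed:" + userId, feedId.ToString());
}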
Visualizing your data
For a good visualization of how Entities are stored in Redis I recommend installing the Redis Admin UI which works well with ServiceStack's C# Redis Client as it uses the key naming convention below to provide a nice hierarchical view, grouping your typed entities together (despite all keys existing in the same global keyspace).
To view and edit an Entity, click on the Edit link to see and modify the selected entity's internal JSON representation. Hopefully you'll be able to make better decisions about how to design your models once you can see how they're stored.
How POCO / Entities are stored
The C# Redis Client works with any POCOs that have a single primary key - which by default is expected to be Id (though this convention is overridable with ModelConfig).
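For example, a one-line sketch of pointing the client at a different primary key - assuming the ModelConfig API shape and a hypothetical Poco type with a PocoId property:

// Assumed API shape; PocoId is a hypothetical primary-key property
ModelConfig<Poco>.Id(x => x.PocoId);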
Essentially POCOs get stored into Redis as serialized JSON, with both the typeof(Poco).Name and the Id used to form a unique key for that instance. E.g.:
urn:Poco:{Id} => '{"Id":1,"Foo":"Bar"}'
POCOs in the C# Client are conventionally serialized using ServiceStack's fast JSON Serializer, where only properties with public getters are serialized (and they need public setters to get de-serialized back).
The defaults are overridable with [DataMember] attributes, but this isn't recommended since it uglifies your POCOs.
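A minimal sketch of round-tripping a blobbed POCO, assuming the User type from the question and a Redis instance on localhost:

using ServiceStack.Redis;

using (var redis = new RedisClient("localhost"))
{
    redis.Store(new User { Id = 1 });     // blobbed as JSON under its urn:{Type}:{Id} key
    var fetched = redis.GetById<User>(1); // read back by primary key
}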
Entities are blobbed
So knowing that POCOs in Redis are just blobbed, you only want to keep non-aggregate-root data on your POCOs as public properties (unless you purposely want to store redundant data). A good convention is to use methods to fetch the related data (since they won't get serialized), which also tells your app which methods make remote calls to read data.
So the question of whether the Feed should get stored with the User is really whether or not it's non-aggregate-root data, i.e. whether or not you want to access the user's feeds outside the context of the user. If not, then leave the List<Feed> Feeds property on the User type.
Maintaining Custom Indexes
If, however, you would like to keep all feeds accessible independently, i.e. with redisFeeds.GetById(1), then you will want to store them outside of the user and maintain an index linking the two entities.
As you've noticed, there are many ways to store relationships between entities, and how you do so is largely a matter of preference. For the child entity in a parent>child relationship you would always want to store the ParentId with the child entity. For the Parent, one option is to store a collection of ChildIds with the model and then do a single fetch for all child entities to re-hydrate it.
Another way is to maintain the index outside of the parent DTO, in its own Set for each parent instance. Some good examples of this are in the C# source code of the Redis StackOverflow demo, where the relationships of Users > Questions and Users > Answers are stored in:
idx:user>q:{UserId} => [{QuestionId1},{QuestionId2},etc]
idx:user>a:{UserId} => [{AnswerId1},{AnswerId2},etc]
The C# RedisClient also includes support for a default Parent/Child convention via its TParent.StoreRelatedEntities(), TParent.GetRelatedEntities<TChild>() and TParent.DeleteRelatedEntities() APIs, where an index is maintained behind the scenes that looks like:
ref:Question/Answer:{QuestionId} => [{answerIds},..]
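A hedged sketch of those APIs (Question/Answer are illustrative POCOs with Id properties; exact overloads may vary by client version):

var question = new Question { Id = 1 };
var answer = new Answer { Id = 1 };

redis.StoreRelatedEntities(question, answer);                // stores both and maintains the ref: index
var answers = redis.GetRelatedEntities<Question, Answer>(1); // re-hydrates the child entities
redis.DeleteRelatedEntities<Question, Answer>(1);            // removes the index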
Effectively these are just some of the possible options; there are many different ways to achieve the same end, and you also have the freedom to roll your own.
NoSQL's schema-less, loose-typing freedoms should be embraced and you shouldn't be worried about trying to follow a rigid, pre-defined structure you might be familiar with when using an RDBMS.
In conclusion, there's no real right way to store data in Redis. E.g. the C# Redis Client makes some assumptions in order to provide a high-level API around POCOs, and it blobs the POCOs into Redis's binary-safe string values - though other clients prefer to store an entity's properties in Redis Hashes (Dictionaries) instead. Both will work.
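For comparison, a minimal sketch of that hash-per-entity style with the same client (key and field names are illustrative):

using (var redis = new RedisClient("localhost"))
{
    redis.SetEntryInHash("user:1", "Name", "Alice");      // one hash field per property
    var name = redis.GetValueFromHash("user:1", "Name");
}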

Related

NHibernate: Better to store reference to another Entity or the entity's ID?

New to NHibernate and C#.
I have these two classes:
public class User // Simplified version
{
    private long _id;
    private String _username; // unique
    private ISet<Role> _roles;
    // Properties
}

and

public class Role
{
    private long _id;
    private String _name;
    // Properties
}
Is it better to store a reference to the Role class (as done above) or just store the IDs of the Role class (so: private ISet<long> _roles)? Why?
Any pros and cons I should be aware of?
Well, firstly, NHibernate is an ORM:
... In object-oriented programming, data management tasks act on object-oriented (OO) objects that are almost always non-scalar values. For example, consider an address book entry that represents a single person along with zero or more phone numbers and zero or more addresses. This could be modeled in an object-oriented implementation by a "Person object" with attributes/fields to hold each data item that the entry comprises: the person's name, a list of phone numbers, and a list of addresses. The list of phone numbers would itself contain "PhoneNumber objects" and so on. The address book entry is treated as a single object by the programming language (it can be referenced by a single variable containing a pointer to the object, for instance). Various methods can be associated with the object, such as a method to return the preferred phone number, the home address, and so on....
Secondly, whether it is better to do A or B depends on the use case.
But I can say (based on my experience) that if:
there are two objects in our domain, e.g. User and Role
we can represent them as one-to-many and many-to-one (bidirectional mapping)
then I will always map them via references, because there is no benefit to mapping them as long ReferenceId and ISet<long> ReferenceIds.
The only use case I can imagine for mapping just IDs would be using a stateless session to fetch some huge amount of data. But even in that scenario, we can use projections.
"Storing" the Ids doesn't sound like a good idea to me. In fact, the database schema would look the same, so it's not a difference how you store data, just how you design your classes. And ids aren't very useful in contrast to actual objects.
Here some pros and cons anyway:
Advantage of mapping IDs
You could serialize your entity more easily, because the object graph ends there and you wouldn't end up serializing too many objects. (Note that serializing entities has some other issues and is not recommended in many cases.)
Advantages of mapping objects:
You can easily navigate to the objects without DB interaction thus taking full advantages of using an ORM (maintainability).
You can make use of batch size, which avoids the N+1 problem without optimizing data access in your problem domain (performance and maintainability)
When you build the domain model, it should use proper references rather than id values.
Advantages:
You can have a proper domain model, so programming becomes easier (if you want to get a list of role names per user, it's pretty straightforward in the domain model, while with an id list you would have to load each Role before you could read its name)
Easy to query (using either QueryOver / Linq or HQL)
Efficient SQL (if you want to load the user and roles, you can use Futures to load everything in a single round trip if you use references, but if you use Ids then you have to issue multiple queries - see the sketch below)
I don't see any disadvantages of using references as long as mapping is correct.
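A minimal sketch of the Futures point above (session and entity names assumed; both queries hit the database in a single round trip when the first result is enumerated):

var users = session.QueryOver<User>().Future();
var roles = session.QueryOver<Role>().Future();

foreach (var user in users) // triggers one batched round trip for both queries
{
    // user's roles resolve from the session without extra queries when mapped as references
}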
However, I'd rather store the Id of an entity (or a DTO) if the requirement is to keep an object across multiple sessions. For example, if you want to store the user in the web Session object, I would not store any domain objects there; rather, I'd store the Id or a DTO.

What's a robust way of obtaining updated instances in an NHibernate object graph after a Session.Merge?

In my domain I deal entirely in disconnected entities from the perspective of NHibernate (version 3.1). Accordingly, when mapping from my primary domain to my rdbms domain, I call Session.Merge on the rdbms entities because the graph contains a mixture of persistent and transient instances - some exist already, and some are new and need adding to the database.
Say I have the following model (pseudo)
class Post
{
    ISet<Comment> Comments { get; set; }
}

class Comment
{
    Post Post { get; set; }
    string UnimportantString { get; set; }
}
I have an object instance for the Post obtained from the Session, and a new (transient) Comment instance myComment, created as a result of mapping from my domain model.
I add this to that existing Post:
myComment.Post = myPost;
myPost.Comments.Add(myComment);
myPost = Session.Merge(myPost);
Assume please that this is wrapped in units of work.
By calling Session.Merge on myPost, by design NHibernate does not modify the existing instances in the supplied graph, but returns different instances that are either created or were already in the 1st-level cache. So although the returned myPost now contains a persistent instance of the comment, the original myComment instance to which my code has a reference has not been modified.
I wish to map the modified entities back to my primary domain, but I'm lacking a robust way to select the modified Comment from the merged myPost.Comments. I'm assuming that myPost.Comments.Last() is not guaranteed to be the one I just added.
I wish to robustly determine which entry in the freshly-merged myPost.Comments set corresponds to my original myComment instance, for example so that I may obtain the ID as a result of the transaction completing. I can't do it by ID because my original myComment was a new object. I can't call SaveOrUpdate on myComment because this is a simplified example of a much more complex graph - there are other properties and a deeper graph, and I want to be able to reliably correlate NHibernate's new instances with the original instances I supplied to it.
What's the best way of grabbing the updated Comment after a Session.Merge(myPost) so that I can map it back to my primary domain model?
p.s. Unfortunately just changing the call to myComment = Session.Merge(myComment) isn't sufficient as I have other objects too that I want to reference following a single call to Merge, not just this one Comment in my simplified example
p.p.s. This contrived example hides the fact, but I use Session.Merge because in the process of mapping from my primary domain to my rdbms domain I end up with a graph that may contain persistent, transient and/or dirty entities
I've done something similar to this and implemented it using NHibernate's entity versioning feature. Using this, I can supply the ID of the entity ahead of time and then reliably match the updated entities.
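A minimal sketch of the idea, assuming Comment is mapped with a client-assigned Guid identifier (rather than a database-generated one):

// Assign the identifier up front so it survives Session.Merge
var myComment = new Comment { Id = Guid.NewGuid(), Post = myPost };
myPost.Comments.Add(myComment);

var mergedPost = session.Merge(myPost);

// Correlate the merged instance with the original via the pre-assigned id
var mergedComment = mergedPost.Comments.Single(c => c.Id == myComment.Id);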

MongoDB: Normalized Store to Embedded Domain Model

I have determined that a relational model makes more sense for a particular collection in a database. The problem is, the domain model was originally started with an embedded model, and there is a large amount of UI code that expects it in this form. Updating the database schema isn't an issue, but I'm curious whether there is an easy way around remodeling the C# domain objects into a more old-fashioned relational model. I've started writing mappers (with AutoMapper) between version one and version two (see classes below), but it's getting messy really quickly.
Below is some fictitious code that outlines a similar domain model.
// Existing
class MyClass
{
    List<Event> Events { get; set; }
    List<Movie> Movies { get; set; }
}

// How it should have been modeled
class MyClass
{
    List<int> Events { get; set; } // Stores Event IDs
    List<int> Movies { get; set; } // Stores Movie IDs
}
The database will have to be normalized.
If I have to remodel the domain model, that's fine; I just want to feel comfortable I've exhausted other possibilities that might save time. Is there an easy solution to this problem I'm overlooking?
If the only purpose of your restructuring is the relational database I'd advise you to look into O/R mapping. An O/R mapper, like NHibernate or the Entity Framework, should be able to map your existing embedded model to a relational database. Using an O/R mapper can take away the need of remodeling your domain.
Given the specific problem, it seemed the only two options I could justify were the two I mentioned in my initial post (map the data manually or change my domain object). Ultimately, for me, the path of least resistance was to map the data manually. I appreciate the suggestion by pjvds, but I couldn't justify switching to a new ORM at this stage of the project considering so many other things work better with the C# MongoDB driver and also considering a mapper isn't necessary for the other portions of our database.

A quick question about aggregate relational objects in MVC

I'm reading through Pro ASP.NET MVC 3 Framework, which just came out, and am a bit confused about how to handle the retrieval of aggregate objects from a data store. The book uses Entity Framework, but I am considering using a mini-ORM (Dapper or PetaPoco). As an example, the book uses the following objects:
public class Member {
public string name { get; set; }
}
public class Item {
public int id { get; set; }
public List<Bid> bids { get; set; }
}
public class Bid {
public int id { get; set; }
public Member member { get; set; }
public decimal amount { get; set; }
}
As far as I'm into the book, they just mention the concept of aggregates and move on. So I am assuming you would then implement some basic repository methods, such as:
List<Item> GetAllItems()
List<Bid> GetBidsById(int id)
Member GetMemberById(int id)
Then, if you wanted to show a list of all items, their bids, and the bidding member, you'd have something like
List<Item> items = Repository.GetAllItems();
foreach (Item i in items) {
    i.bids = Repository.GetBidsById(i.id);
    foreach (Bid b in i.bids) {
        b.member = Repository.GetMemberById(b.id);
    }
}
If this is correct, isn't this awfully inefficient, since you could potentially issue thousands of queries in a few seconds? In my non-ORM thinking mind, I would have written a query like
SELECT
item.id,
bid.id,
bid.amount,
member.name
FROM
item
INNER JOIN bid
ON item.id = bid.itemId
INNER JOIN member
ON bid.memberId = member.id
and stuck it in a DataTable. I know it's not pretty, but one large query versus a few dozen little ones seems a better alternative.
If this is not correct, then can someone please enlighten me as to the proper way of handling aggregate retrieval?
If you use Entity Framework for your Data Access Layer, read the Item entity and use the .Include() fluent method to bring the Bids and Members along for the ride.
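A hedged sketch of that eager load, using the classes from the question (AuctionContext is a hypothetical DbContext; the exact Include syntax depends on your EF version):

using (var context = new AuctionContext())
{
    var items = context.Items
        .Include("bids.member") // pulls each Item's bids and each bid's member in one query
        .ToList();
}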
An aggregate is a collection of related data. The aggregate root is the logical entry point of that data. In your example, the aggregate root is an Item with Bid data. You could also look at the Member as an aggregate root with Bid data.
You may use your data access layer to retrieve the object graph of each aggregate and transform the data for use in the view. You may even ensure you eagerly fetch all of the children's data. It is possible to transform the data using a tool like AutoMapper.
However, I believe it is better to use your data access layer to project the domain objects into the data structure you need for the view, whether with an ORM or a DataSet. Again, to use your example: would you actually retrieve the entire object graph suggested? Do I need all items, including their bids and members? Or do I need a list of items, the number of bids, plus the member name and amount for the current winning bid? When I need more data about a particular item, I can go retrieve it when the request is made.
In short, your intuition was spot-on that it is inefficient to retrieve all that data, when a projection would suffice. I would just urge you to limit the projection even further and retrieve only the data you require for the current view.
This would be handled in different ways depending on your data access strategy. If you were using NHibernate or Entity Framework, you can have the ORM automatically populate these properties for you eagerly, lazy load them, etc. Entity Framework calls them "Navigation Properties", I'm not sure that NHibernate has a specific name for these "child properties" or "child collections".
In old-school ADO.NET, you might do something like create a stored procedure that returns multiple result sets (one for the main object and other result sets for your child collections or related objects), which would let you avoid calling the database multiple times. You could then iterate over the results sets and hydrate your object with all its relationships with one database call, and inside of a single repository method.
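A hedged ADO.NET sketch of that approach (the stored procedure name and hydration details are illustrative):

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.GetItemsWithBids", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    conn.Open();

    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read()) { /* hydrate Item objects */ }

        reader.NextResult(); // advance to the second result set
        while (reader.Read()) { /* hydrate Bids and attach them to their Items */ }
    }
}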
Wherever in your system you do the data retrieval, you would program your ORM of choice to do an eager fetch of the related objects (aggregates).
Which kind of data access method to use depends on your project: convenience vs. performance. Using EF or LINQ to SQL really boosts coding speed, but when performance matters you really should care about every SQL statement you deliver to the database. No ORM can do both.
You can treat the read (query) and the write (command) side of the model separately.
When you want to mutate the state of your Aggregate, you load the Aggregate Root (AR) via a repository, mutate its state using the intention revealing public methods on the AR, then save the AR with the repository back again.
On the read side, however, you can be as flexible as you want. I don't know Entity Framework, but with NHibernate you could use the QueryOver API to generate flexible queries that populate DTOs designed to be consumed by the client, whether it be a service or a View. If you want more performance, you could go with Dapper. You could even use stored procs that project directly into DTOs; that way you can be as efficient in the DB layer as possible.

how does your custom class relate to the database

Okay, so I've studied C# and ASP.NET long enough and would like to know how all these custom classes I've created relate to the database. For example,
I have a class called Employee:
public class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string EmailAddress { get; set; }
}
and I have a database table with the following 4 fields:
ID
Name
EmailAddress
PhoneNumber
It seems like the custom class mirrors my database. And in ASP.NET I can simply run the LINQ to SQL tooling on my database and get the whole schema as a class without typing out a custom class with getters and setters.
So let's just say that I am now running a query to retrieve a list of employees. I would like to know: how does my application map my Employee class to my database?
By itself, it doesn't. But add any ORM (or similar) and you start to get closer. For example, with LINQ-to-SQL (which I mention because it is easy to get working with Visual Studio), you typically get a tooling-generated "data context" class, which you use as:
using (var ctx = new MyDatabase()) {
    foreach (var emp in ctx.Employees) {
        ....
    }
}
This is generating TSQL and mapping the data to objects automatically. By default the tooling creates a separate Employee class, but you can tweak this via partial classes. This also supports inserts, data changes and deletion.
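For example, a minimal sketch of extending the generated entity via a partial class (DisplayName is a hypothetical member):

public partial class Employee
{
    // Computed member living alongside the tooling-generated mapping code
    public string DisplayName
    {
        get { return Name + " <" + EmailAddress + ">"; }
    }
}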
There are also tools that allow re-use of your existing domain objects; either approach can be successful - each has advantages and disadvantages.
If you only want to read data, then it is even easier; a micro-ORM such as dapper-dot-net allows you to use your type with TSQL that you write, with it handling the tedious materialisation code.
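A hedged Dapper sketch against the Employee table above (the connection string and SQL are illustrative):

using (var conn = new SqlConnection(connectionString))
{
    // Dapper materialises each row into an Employee instance
    var employees = conn.Query<Employee>(
        "SELECT ID, Name, EmailAddress FROM Employee WHERE ID = @id",
        new { id = 1 });
}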
Your question is a little vague, imo. But what you are referring to is the Model of the MVC (Model-View-Controller) architecture.
The Model - your Employee class - manages the data of the application. So it can not only get and set (save / update) your data, but it can also be used to notify of data changes (usually to the view).
You mentioned you were using SQL, so more than likely you could create and save an entire employee record by sending an associative array of the table data to the database. Your Model class would handle the SQL syntax needed to INSERT the data. In larger MVC frameworks, the Model of your application inherits from several other classes that handle saving properly to different types of backends other than MS SQL.
Models will also normally have functions to handle finding and updating records. Typically you specify a search field, the Model returns the record (which includes the ID), and you pass that back into a save / update function to make changes to the record. You could also hook into this level of the Model to create revisions of the data you are saving.
So how the Model directly correlates to your SQL structure depends on how you write it, or which framework you decide to use. I believe a common one for ASP.NET is Microsoft's ASP.NET MVC.
Your class cannot be directly mapped to the database without an ORM tool. The ORM tool will read your configuration and map your class to a DB row per your mappings automatically. That means you don't need to read the row and set the class fields explicitly, but you do have to provide mapping files and go through the ORM framework to load the entities; the framework will take care of the rest.
You can check out NHibernate; there is a getting-started guide on NHibernate.
