Store objects with common base class in database - c#

Let's say I have a common base class/interface:
interface ICommand
{
void Execute();
}
Then there are a few commands inheriting from this interface.
class CommandA : ICommand
{
int x;
int y;
public CommandA(int x, int y)
{ ... }
public void Execute () { ... }
}
class CommandB : ICommand
{
string name;
public CommandB(string name)
{ ... }
public void Execute () { ... }
}
Now I want to store these commands in a database with a common method, then later load all of them from the DB into a List<ICommand> and call Execute on each one.
Right now I have a single table in the DB called commands, where I store a string serialization of each object. The columns are: id|commandType|commaSeparatedListOfParameters. This is very easy and works well for loading all commands, but I can't query the commands easily without resorting to substring and other obscure methods. I would like an easy way to run SELECT id,x,y FROM commandA_commands WHERE x=... and at the same time have a generic way of loading all commands from the commands table (I guess this would be some kind of UNION/JOIN of commandA_commands, commandB_commands, etc.).
It is important that adding a new command does not require much manual fiddling in the DB or manual creation of serialize/parse methods. I have tons of commands, and new ones are added and removed all the time. I don't mind creating a command+table+query generation tool, though, if the best solution requires it.
The best I can think of myself is a common table like id|commandType|param1|param2|param3|etc., which isn't much better (actually worse?) than my current solution: many commands will need null parameters, and the datatypes vary, so I would have to resort to string conversion again and size each field for the largest command.
The database is SQL Server 2008
Edit: Found similar question here Designing SQL database to represent OO class hierarchy

You can use an ORM to map the command inheritance to the database. For example, you can use the "Table per Hierarchy" technique provided by ORMs (e.g. Entity Framework, NHibernate). The ORM will instantiate the correct subclass when you retrieve the rows.
Here's an example of doing it in Entity Framework Code first
abstract class Command : ICommand
{
public int Id {get;set;}
public abstract void Execute();
}
class CommandA : Command
{
public int X {get;set;}
public int Y {get;set;}
public override void Execute () { ... }
}
class CommandB : Command
{
public string Name {get;set;}
public override void Execute () { ... }
}
Refer to the EF 4.1 Code First Walkthrough to configure this model with EF.
If your commands take drastically different sets of parameters, you can consider "Table per Type" inheritance modeling. Here you will pay a significant performance penalty because of the many unions and table joins involved.
An alternative approach would be to store the command parameters as XML that you (de)serialize manually. This way you can keep all your commands in a single table without sacrificing performance. The drawback is that you cannot easily filter on the command parameters.
Each approach has its pros and cons. Choose the strategy that suits your requirements.

This problem is pretty common, and I haven't seen a solution that is without drawbacks. The only option to exactly store a hierarchy of objects in a database is to use some NoSQL database.
However, if a relational database is mandated, I usually go with this approach:
one table for the base class/interface, that stores the common data for all types
one table per descendant class, that uses the exact same primary key from the base table (this is a nice use case for the sequences introduced in SQL Server 2012, btw)
one table that holds the types that will be stored
a view that joins all those tables together, to enable easy loading/querying of the objects
In your case, I would create:
Table CommandBase
ID (int or guid) - the ID of the command
TypeID (int) - ID of the type of command
Table CommandA
ID (int or guid) - the same ID from the CommandBase table
X (int)
Y (int)
Table CommandB
ID (int or guid) - the same ID from the CommandBase table
Name (nvarchar)
Table CommandTypes
ID (int) - ID of the command type
Name (nvarchar) - Name of the command type ("CommandA", "CommandB",...)
TableName (nvarchar) - Name of the table that stores the type - useful if some dynamic SQL is needed - otherwise optional
View Commands, something along the lines of:
select cb.ID, a.X, a.Y, b.Name
from CommandBase cb
left outer join CommandA a on a.ID = cb.ID
left outer join CommandB b on b.ID = cb.ID
The upside of this approach is that it mirrors your class structure. It's easy to understand and use.
The downside is that it gets more and more cumbersome as you add new classes, it's hard to model more than one level of hierarchy, and the view can get a mile long if there are lots of descendants.
Personally, I would use this approach if I know that the number of subclasses is relatively small and relatively fixed, as it requires creating (and maintaining) a new table for each new type. However, the model is quite simple, so it's possible to create a tool/script that could do the creating and maintaining for you.
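The tool/script idea can be sketched with reflection. This is only a minimal sketch, assuming each command exposes its parameters as public properties (the question's classes use private fields) and assuming a hypothetical CLR-to-SQL type mapping; CommandSchemaGenerator, CreateTable and CreateView are made-up names, while CommandBase and the Commands view follow the layout above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public interface ICommand { void Execute(); }

public class CommandA : ICommand
{
    public int X { get; set; }
    public int Y { get; set; }
    public void Execute() { }
}

public class CommandB : ICommand
{
    public string Name { get; set; }
    public void Execute() { }
}

public static class CommandSchemaGenerator
{
    // Hypothetical CLR-to-SQL mapping; extend it for the types your commands use.
    private static readonly Dictionary<Type, string> SqlTypes = new Dictionary<Type, string>
    {
        { typeof(int), "int" },
        { typeof(string), "nvarchar(max)" },
    };

    // Public instance properties, sorted by name so the generated SQL is stable.
    private static IEnumerable<PropertyInfo> Columns(Type commandType)
    {
        return commandType
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .OrderBy(p => p.Name);
    }

    // One table per command type, keyed by the ID taken from CommandBase.
    public static string CreateTable(Type commandType)
    {
        var columns = Columns(commandType)
            .Select(p => p.Name + " " + SqlTypes[p.PropertyType]);
        return "CREATE TABLE " + commandType.Name
            + " (ID int PRIMARY KEY REFERENCES CommandBase(ID), "
            + string.Join(", ", columns) + ")";
    }

    // The view left-joins every command table; columns are aliased as
    // Type_Property so two commands may share a property name.
    public static string CreateView(IEnumerable<Type> commandTypes)
    {
        var types = commandTypes.ToList();
        var columns = types.SelectMany(t => Columns(t)
            .Select(p => t.Name + "." + p.Name + " AS " + t.Name + "_" + p.Name));
        var joins = types.Select(t =>
            "LEFT OUTER JOIN " + t.Name + " ON " + t.Name + ".ID = cb.ID");
        return "CREATE VIEW Commands AS SELECT cb.ID, cb.TypeID, "
            + string.Join(", ", columns)
            + " FROM CommandBase cb " + string.Join(" ", joins);
    }
}
```

Calling CommandSchemaGenerator.CreateTable(typeof(CommandA)) yields a CREATE TABLE statement with X and Y columns, and re-running CreateView over all command types after adding a new one regenerates the view.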

We are using XML in SQL more and more for our soft data. It might be slightly painful to query (using XPath), but it allows us to store metadata per row, validated against a schema, etc.
XML can then be parsed by code and parameters can be matched via reflection to parameters in the de-serialised object.
Effectively this replicates the ORM functionality, but with a flat database structure, simplifies queries, and the parameters are even queryable through XPath.
And remember kids, views are evil.
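To make the "parse the XML and match parameters via reflection" idea concrete, here is a rough sketch. It assumes the root element is named after the command class, each child element is named after a public settable property, and the command class lives in the same assembly as the loader; XmlCommandLoader and the XML layout are illustrative, not something the answer above specifies.

```csharp
using System;
using System.Globalization;
using System.Reflection;
using System.Xml.Linq;

public interface ICommand { void Execute(); }

public class CommandA : ICommand
{
    public int X { get; set; }
    public int Y { get; set; }
    public void Execute() { }
}

public static class XmlCommandLoader
{
    // Builds a command from XML such as <CommandA><X>1</X><Y>2</Y></CommandA>.
    public static ICommand Load(string xml)
    {
        XElement root = XElement.Parse(xml);

        // Assumption: the element name is the class name, in this assembly.
        Type type = typeof(XmlCommandLoader).Assembly
            .GetType(root.Name.LocalName, throwOnError: true);
        object command = Activator.CreateInstance(type);

        foreach (XElement parameter in root.Elements())
        {
            PropertyInfo property = type.GetProperty(parameter.Name.LocalName);
            if (property == null)
                throw new InvalidOperationException(
                    "Unknown parameter: " + parameter.Name.LocalName);

            // Convert the element text to the property's declared type.
            object value = Convert.ChangeType(
                parameter.Value, property.PropertyType, CultureInfo.InvariantCulture);
            property.SetValue(command, value);
        }
        return (ICommand)command;
    }
}
```

With this shape, adding a new command needs no new parsing code as long as its parameters are simple convertible types.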

Related

Entity framework 6 code first: what is the best implementation for a baseobject with 10 childobjects

We have a baseobject with 10 childobjects and EF6 code first.
Of those 10 childobjects, 5 have only a few (extra) properties, and 5 have multiple properties (5 to 20).
We implemented this as table-per-type, so we have one table for the base and 1 per child (total 10).
This, however, creates HUGE select queries with select case and unions all over the place, which also takes the EF 6 seconds to generate (the first time).
I read about this issue, and that the same issue holds in the table-per-concrete type scenario.
So what we are left with is table-per-hierarchy, but that creates a table with a large number of columns, which doesn't sound great either.
Is there another solution for this?
I thought about maybe skipping the inheritance and creating a union view for when I want to get all the items from all the child objects/records.
Any other thoughts?
Another solution would be to implement some kind of CQRS pattern where you have separate databases for writing (command) and reading (query). You could even de-normalize the data in the read database so it is very fast.
Assuming you need at least one normalized model with referential integrity, I think your decision really comes down to Table per Hierarchy and Table per Type. TPH is reported by Alex James from the EF team and more recently on Microsoft's Data Development site to have better performance.
Advantages of TPT and why they're not as important as performance:
Greater flexibility, which means the ability to add types without affecting any existing table. Not too much of a concern because EF migrations make it trivial to generate the required SQL to update existing databases without affecting data.
Database validation on account of having fewer nullable fields. Not a massive concern because EF validates data according to the application model. If data is being added by other means it is not too difficult to run a background script to validate data. Also, TPT and TPC are actually worse for validation when it comes to primary keys because two sub-class tables could potentially contain the same primary key. You are left with the problem of validation by other means.
Storage space is reduced on account of not needing to store all the null fields. This is only a very trivial concern, especially if the DBMS has a good strategy for handling 'sparse' columns.
Design and gut-feel. Having one very large table does feel a bit wrong, but that is probably because most db designers have spent many hours normalizing data and drawing ERDs. Having one large table seems to go against the basic principles of database design. This is probably the biggest barrier to TPH. See this article for a particularly impassioned argument.
That article summarizes the core argument against TPH as:
It's not normalized even in a trivial sense, it makes it impossible to enforce integrity on the data, and what's most "awesome:" it is virtually guaranteed to perform badly at a large scale for any non-trivial set of data.
These are mostly wrong. Performance and integrity are mentioned above, and TPH does not necessarily mean denormalized. There are just many (nullable) foreign key columns that are self-referential. So we can go on designing and normalizing the data exactly as we would with TPT. In a current database I have many relationships between sub-types and have created an ERD as if it were a TPT inheritance structure. This actually reflects the implementation in code-first Entity Framework. For example, here is my Expenditure class, which inherits from Relationship, which inherits from Content:
public class Expenditure : Relationship
{
/// <summary>
/// Inherits from Content: Id, Handle, Description, Parent (is context of expenditure and usually
/// a Project)
/// Inherits from Relationship: Source (the Principal), SourceId, Target (the Supplier), TargetId,
///
/// </summary>
[Required, InverseProperty("Expenditures"), ForeignKey("ProductId")]
public Product Product { get; set; }
public Guid ProductId { get; set; }
public string Unit { get; set; }
public double Qty { get; set; }
public string Currency { get; set; }
public double TotalCost { get; set; }
}
The InversePropertyAttribute and the ForeignKeyAttribute provide EF with the information required to make the required self joins in the single database.
The Product type also maps to the same table (also inheriting from Content). Each Product has its own row in the table and rows that contain Expenditures will include data in the ProductId column, which is null for rows containing all other types. So the data is normalized, just placed in a single table.
The beauty of using EF code first is we design the database in exactly the same way and we implement it in (almost) exactly the same way regardless of using TPH or TPT. To change the implementation from TPH to TPT we simply need to add an annotation to each sub-class, mapping them to new tables. So, the good news for you is it doesn't really matter which one you choose. Just build it, generate a stack of test data, test it, change strategy, test it again. I reckon you'll find TPH the winner.
Having experienced similar problems myself, I have a few suggestions. I'm also open to improvements on these suggestions, as it's a complex topic and I don't have it all worked out.
Entity framework can be very slow when dealing with non-trivial queries on complex entities - ie those with multiple levels of child collections. In some performance tests I've tried it does sit there an awful long time compiling the query. In theory EF 5 and onwards should cache compiled queries (even if the context gets disposed and re-instantiated) without you having to do anything, but I'm not convinced that this is always the case.
I've read some suggestions that you should create multiple DataContexts with only smaller subsets of your database entities for a complex database. If this is practical for you give it a try! But I imagine there would be maintenance issues with this approach.
1) I know this is obvious but it's worth saying anyway - make sure you have the right foreign keys set up in your database for related entities, as Entity Framework will then keep track of these relationships and be much quicker at generating queries where you need to join on the foreign key.
2) Don't retrieve more than you need. One-size fits all methods to get a complex object are rarely optimal. Say you are getting a list of base objects (to put in a list) and you only need to display the name and ID of these objects in the list of the base object. Just retrieve only the base object - any navigation properties that aren't specifically needed should not be retrieved.
3) If the child objects are not collections, or they are collections but you only need 1 item (or an aggregate value such as the count) from them I would absolutely implement a View in the database and query that instead. It is MUCH quicker. EF doesn't have to do any work - its all done in the database, which is better equipped for this type of operation.
4) Be careful with .Include() and this goes back to point #2 above. If you are getting a single object + a child collection property you are best not using .Include() as then when the child collection is retrieved this will be done as a separate query. (so not getting all the base object columns for every row in the child collection)
EDIT
Following comments here's some further thoughts.
As we are dealing with an inheritance hierarchy it makes logical sense to store separate tables for the additional properties of the inheriting classes + a table for the base class. As to how to make Entity Framework perform well though is still up for debate.
I've used EF for a similar scenario (but fewer children), (Database first), but in this case I didn't use the actual Entity framework generated classes as the business objects. The EF objects directly related to the DB tables.
I created separate business classes for the base and inheriting classes, and a set of Mappers that would convert to them. A query would look something like
public static List<BaseClass> GetAllItems()
{
using (var db = new MyDbEntities())
{
var q1 = db.InheritedClass1.Include("BaseClass").ToList()
.ConvertAll(x => (BaseClass)InheritedClass1Mapper.MapFromContext(x));
var q2 = db.InheritedClass2.Include("BaseClass").ToList()
.ConvertAll(x => (BaseClass)InheritedClass2Mapper.MapFromContext(x));
return q1.Union(q2).ToList();
}
}
Not saying this is the best approach, but it might be a starting point?
The queries are certainly quick to compile in this case!
Comments welcome!
With Table per Hierarchy you end up with only one table, so your CRUD operations will generally be faster, and this table is abstracted away by your domain layer anyway. The disadvantage is that you lose the ability to use NOT NULL constraints on subclass columns, so this needs to be handled properly by your business layer to avoid potential data integrity issues. Also, adding or removing entities means that the table changes, but that's also manageable.
With Table per type you have the problem that the more classes in the hierarchy you have, the slower your CRUD operations will become.
All in all, as performance is probably the most important consideration here and you have a lot of classes, I think Table per Hierarchy is a winner in terms of both performance and simplicity and taking into account your number of classes.
Also look at this article, more specifically at chapter 7.1.1 (Avoiding TPT in Model First or Code First applications), where they state: "when creating an application using Model First or Code First, you should avoid TPT inheritance for performance concerns."
The EF6 Code First model I'm working on uses generics and an abstract base class called "BaseEntity". I also use generics and a base class for the EntityTypeConfiguration class.
In the event that I need to reuse a couple of properties "columns" on some tables and it doesn't make sense for them to be on BaseEntity or BaseEntityWithMetaData, I make an interface for them.
E.g. I have one for addresses I haven't finished yet. So if an entity has address information it will implement IAddressInfo. Casting an entity to IAddressInfo will give me an object with just the AddressInfo on it.
Originally I had my metadata columns as their own table. But like others have mentioned, the queries were horrendous, and it was slower than slow. So I thought, why don't I just use multiple inheritance paths to support what I want to do, so the columns are on every table that needs them and not on the ones that don't. Also, I am using MySQL, which has a column limit of 4096. SQL Server 2008 has 1024. Even at 1024, I don't see realistic scenarios for going over that on one table.
And none of my objects inherit in such a way that they have columns they don't need. When that need arises, I create a new base class at a level that prevents the extra columns.
Here are enough snippets from my code to understand how I have my inheritance set up. So far it works really well for me. I haven't really produced a scenario I couldn't model with this setup.
public class BaseEntityConfig<T> : EntityTypeConfiguration<T> where T : BaseEntity<T>, new()
{
}
public class BaseEntity<T> where T : BaseEntity<T>, new()
{
//shared properties here
}
public class BaseEntityWithMetaDataConfig<T> : BaseEntityConfig<T> where T : BaseEntityWithMetaData<T>, new()
{
public BaseEntityWithMetaDataConfig()
{
// Configure the two audit relationships once, for every derived configuration.
this.HasOptional(e => e.RecCreatedBy).WithMany().HasForeignKey(p => p.RecCreatedByUserId);
this.HasOptional(e => e.RecLastModifiedBy).WithMany().HasForeignKey(p => p.RecLastModifiedByUserId);
}
}
public class BaseEntityWithMetaData<T> : BaseEntity<T> where T : BaseEntityWithMetaData<T>, new()
{
#region Entity Properties
public DateTime? DateRecCreated { get; set; }
public DateTime? DateRecModified { get; set; }
public long? RecCreatedByUserId { get; set; }
public virtual User RecCreatedBy { get; set; }
public virtual User RecLastModifiedBy { get; set; }
public long? RecLastModifiedByUserId { get; set; }
public DateTime? RecDateDeleted { get; set; }
#endregion
}
public class PersonConfig : BaseEntityWithMetaDataConfig<Person>
{
public PersonConfig()
{
this.ToTable("people");
this.HasKey(e => e.PersonId);
this.HasOptional(e => e.User).WithRequired(p => p.Person).WillCascadeOnDelete(true);
this.HasOptional(p => p.Employee).WithRequired(p => p.Person).WillCascadeOnDelete(true);
this.HasMany(e => e.EmailAddresses).WithRequired(p => p.Person).WillCascadeOnDelete(true);
this.Property(e => e.FirstName).IsRequired().HasMaxLength(128);
this.Property(e => e.MiddleName).IsOptional().HasMaxLength(128);
this.Property(e => e.LastName).IsRequired().HasMaxLength(128);
}
}
//I have to use this pattern to allow other classes to inherit from Person; they have to inherit from BasePerson<T>
public class Person : BasePerson<Person>
{
//Just a dummy class to expose BasePerson as it is.
}
public class BasePerson<T> : BaseEntityWithMetaData<T> where T: BasePerson<T>, new()
{
#region Entity Properties
public long PersonId { get; set; }
public virtual User User { get; set; }
public string FirstName { get; set; }
public string MiddleName { get; set; }
public string LastName { get; set; }
public virtual Employee Employee { get; set; }
public virtual ICollection<PersonEmail> EmailAddresses { get; set; }
#endregion
#region Entity Helper Properties
[NotMapped]
public PersonEmail PrimaryPersonalEmail
{
get
{
PersonEmail ret = null;
if (this.EmailAddresses != null)
ret = (from e in this.EmailAddresses where e.EmailAddressType == EmailAddressType.Personal_Primary select e).FirstOrDefault();
return ret;
}
}
[NotMapped]
public PersonEmail PrimaryWorkEmail
{
get
{
PersonEmail ret = null;
if (this.EmailAddresses != null)
ret = (from e in this.EmailAddresses where e.EmailAddressType == EmailAddressType.Work_Primary select e).FirstOrDefault();
return ret;
}
}
private string _DefaultEmailAddress = null;
[NotMapped]
public string DefaultEmailAddress
{
get
{
if (string.IsNullOrEmpty(_DefaultEmailAddress))
{
PersonEmail personalEmail = this.PrimaryPersonalEmail;
if (personalEmail != null && !string.IsNullOrEmpty(personalEmail.EmailAddress))
_DefaultEmailAddress = personalEmail.EmailAddress;
else
{
PersonEmail workEmail = this.PrimaryWorkEmail;
if (workEmail != null && !string.IsNullOrEmpty(workEmail.EmailAddress))
_DefaultEmailAddress = workEmail.EmailAddress;
}
}
return _DefaultEmailAddress;
}
}
#endregion
#region Constructor
static BasePerson()
{
}
public BasePerson()
{
this.User = null;
this.EmailAddresses = new HashSet<PersonEmail>();
}
public BasePerson(string firstName, string lastName)
{
this.FirstName = firstName;
this.LastName = lastName;
}
#endregion
}
Now, the code in the context's OnModelCreating looks like this:
//Config
modelBuilder.Conventions.Remove<PluralizingTableNameConvention>();
//Initialize configuration. Each line tells Entity Framework how to create relationships between the different tables in the database,
//such as table names, foreign key constraints, unique constraints, all relations etc.
modelBuilder.Configurations.Add(new PersonConfig());
modelBuilder.Configurations.Add(new PersonEmailConfig());
modelBuilder.Configurations.Add(new UserConfig());
modelBuilder.Configurations.Add(new LoginSessionConfig());
modelBuilder.Configurations.Add(new AccountConfig());
modelBuilder.Configurations.Add(new EmployeeConfig());
modelBuilder.Configurations.Add(new ContactConfig());
modelBuilder.Configurations.Add(new ConfigEntryCategoryConfig());
modelBuilder.Configurations.Add(new ConfigEntryConfig());
modelBuilder.Configurations.Add(new SecurityQuestionConfig());
modelBuilder.Configurations.Add(new SecurityQuestionAnswerConfig());
The reason I created base classes for the configuration of my entities is that when I started down this path I ran into an annoying problem: I had to configure the shared properties for every derived class over and over again. And if I updated one of the fluent API mappings, I had to update code in every derived class.
But by using this inheritance method on the configuration classes, the two properties are configured in one place and inherited by the configuration class for derived entities.
So when PersonConfig is configured, it runs the logic in the BaseEntityWithMetaDataConfig class to configure the two properties, and again when UserConfig runs, and so on.
Three different approaches have different names in M. Fowler's language:
Single Table inheritance - whole inheritance hierarchy held in one table. No joins, optional columns for child types. You need to distinguish which child type it is.
Concrete Table inheritance - you have one table for each concrete type. No joins when reading a single type (loading the whole hierarchy needs a UNION across the tables), no optional columns. In this case, a base type table is needed only if the base type requires its own mapping (an instance can be created).
Class Table inheritance - you have base type table, and child tables - each adding only additional columns to the base's columns. Joins, no optional columns. In this case, base type table always contains row for each child; however, you can retrieve common columns only if no child-specific columns are needed (rest comes with lazy loading maybe?).
All approaches are workable - it only depends on the amount and structure of data you have, so you can measure performance differences first.
Choice will be based on the number of joins vs. data distribution vs. optional columns.
If you don't have (and are not going to have) many child types, I would go with class table inheritance, since it stays close to the domain and is easy to translate/map.
If you have many child tables to work with at the same time, and anticipate bottleneck in joins - go with single table inheritance.
If joins are not needed at all and you are going to work with one concrete type at a time - go with concrete table inheritance.
Although Table per Hierarchy (TPH) is the better approach for fast CRUD operations, it unavoidably produces a single table with a large number of columns. The CASE and UNION clauses that you mentioned are created because the resulting query is effectively requesting a polymorphic result set that includes multiple types.
However, when EF returns a flattened table that includes the data for all the types, it does extra work to ensure that null values are returned for columns that may be irrelevant for a particular type. Technically, this extra validation using CASE and UNION is not necessary.
The issue below is a performance glitch in Microsoft EF6, and they are aiming to deliver a fix in a future release.
The below query:
SELECT
[Extent1].[CustomerId] AS [CustomerId],
[Extent1].[Name] AS [Name],
[Extent1].[Address] AS [Address],
[Extent1].[City] AS [City],
CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[State] END AS [C2],
CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[Zip] END AS [C3]
FROM [dbo].[Customers] AS [Extent1]
can be safely replaced by:
SELECT
[Extent1].[CustomerId] AS [CustomerId],
[Extent1].[Name] AS [Name],
[Extent1].[Address] AS [Address],
[Extent1].[City] AS [City],
[UnionAll1].[State] AS [C2],
[UnionAll1].[Zip] AS [C3]
FROM [dbo].[Customers] AS [Extent1]
So, having seen this flaw in the current Entity Framework 6 release, your options are either to use a Model First approach or to use TPH.

Using interfaces in LINQ database queries

I am working on part of an application that simply pulls information from the database and displays it to users. For simplicity's sake, let us assume I have a database with two tables, Cats and Dogs. Both tables have manually assigned primary keys, which are never duplicated or overlapping. The goal is to perform one LINQ query that concatenates both tables.
I recently asked this question regarding performing a LINQ concat on two collections of objects, Cats and Dogs, that were manually created in code. I advise reading the previous question as it will give much insight to this one.
The reason I wish to use interfaces is to simplify my queries. I currently have a solution that uses .Select to project each of the columns I need into an anonymous type. This would work fine for this example, but would run to pages of code with the data I am actually working with.
The difference between that previous question and this one is that I am trying to pull these animals from a database. From my analysis, it seems that .NET or Entity Framework is not able to relate my database to my interface.
Model (From old question)
public interface iAnimal
{
string name { get; set; }
int age { get; set; }
}
public class Dog :iAnimal
{
public string name { get; set; }
public int age { get; set; }
}
public class Cat:iAnimal
{
public string name { get; set; }
public int age { get; set; }
}
Here are some different LINQ queries I have tried and the resulting error. The first example will be using the solution from the previous question.
var model = _db.Cats.Concat<iAnimal>(_db.Dogs).Take(4);
System.ArgumentException: DbUnionAllExpression requires arguments with compatible collection ResultTypes.
Without Covariance:
var model = _db.Cats.Cast<iAnimal>().Concat(_db.Dogs.Cast<iAnimal>());
System.NotSupportedException: Unable to cast the type 'Test.Models.Cat' to type 'Test.Interfaces.iAnimals'. LINQ to Entities only supports casting Entity Data Model primitive types.
From the above error, it looks like I am not able to use interfaces to interact with databases as it is not mapped to any particular table.
Any insight would be much appreciated. Thanks
EDIT
In response to @Reed Copsey: with your solution, I get the same error as my example without covariance. I tried changing the view's type to match what the error recommends, which results in this error:
System.InvalidOperationException: The model item passed into the dictionary is of type 'System.Data.Entity.Infrastructure.DbQuery`1[Test.Interfaces.iAnimal]', but this dictionary requires a model item of type 'System.Collections.Generic.IEnumerable`1[Test.Models.Cat]'.
Your database knows nothing about your interface, and you will probably not be able to get this working. I see two options.
You could use inheritance - supported by the Entity Framework, for example - and derive both entities from a common base entity. Then you will be able to perform queries against the base type, but this may require changes to your data model depending on the way you implement inheritance at the database level.
Have a look at the documentation for TPT inheritance and TPH inheritance. There are still other inheritance models like TPC inheritance but they currently lack designer support.
The second option is to fetch results from both tables into memory and use LINQ to Objects to merge them into a single collection.
var dogs = database.Dogs.Take(4).ToList();
var cats = database.Cats.Take(4).ToList();
var pets = dogs.Cast<iAnimal>().Concat(cats).ToList(); // iAnimal is the interface from the question
Also note that your query
var model = _db.Cats.Concat<iAnimal>(_db.Dogs).Take(4);
seems not really well designed - the result will definitely depend on the database used but I would not be surprised if you usually just get the first four cats and never see any dog.
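As a small illustration of the in-memory option, the sketch below merges two already-fetched lists through the interface and sorts before taking, so the result no longer depends on which table was concatenated first. The model types are copied from the question; AnimalQueries and Youngest are hypothetical names, and ordering by age is just one possible choice.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface iAnimal
{
    string name { get; set; }
    int age { get; set; }
}

public class Dog : iAnimal
{
    public string name { get; set; }
    public int age { get; set; }
}

public class Cat : iAnimal
{
    public string name { get; set; }
    public int age { get; set; }
}

public static class AnimalQueries
{
    // Works on materialized sequences (LINQ to Objects), not on DbSet queries.
    public static List<iAnimal> Youngest(IEnumerable<Cat> cats, IEnumerable<Dog> dogs, int count)
    {
        return cats.Concat<iAnimal>(dogs)   // covariance lets both sides be seen as iAnimal
            .OrderBy(a => a.age)            // order first, so Take is not biased towards cats
            .ThenBy(a => a.name)
            .Take(count)
            .ToList();
    }
}
```

The lists passed in would come from something like database.Cats.ToList() and database.Dogs.ToList(), ideally with a filter applied first so that only the needed rows are loaded.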

Best practice when storing/retrieving data from database inside class

I would like to know what's the best code design when storing and retrieving data from a database when working with objects and classes.
I could do this in two ways.
In the class constructor I query the database, store all the info in instance variables inside the class, and retrieve it with getters/setters. This way I can always get any information I want, but in many cases I won't need all the information all the time.
public class Group {
public int GroupID { get; private set; }
public string Name { get; private set; }
public Group(int groupID)
{
this.GroupID = groupID;
this.Name = // retrieve data from database
}
public string getName()
{
// this is just an example method, I know I can retrieve the name from the getter :)
return Name;
}
}
The other way is to create some methods that take the groupID as a parameter and query the database for just the specific information I need. This could result in more queries, but I will only get the information I need.
public class Group {
public Group()
{
}
public string getName(int groupID)
{
// query database for the name based on groupID
return name;
}
}
What do you think is the best way to go? Is there a best practice to go with here or is it up to me what I think works the best?
You don't want to do heavy DB work in the constructor. Heavy work should be done in methods.
You also don't necessarily want to couple the DB work with the entity class that holds the data. What if you want a method to return two of those objects from the database? Take GetGroups(), for example: you can't even construct one object without doing DB work. For something that returns multiple objects, storage and retrieval need to be decoupled from the entity class.
Instead, decouple your DB work from your entity objects. One option is a data access layer with methods like GetFoo or GetFoos that query the database, populate the objects and return them.
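A rough sketch of that separation, using the Group entity from the question. IGroupRepository and InMemoryGroupRepository are hypothetical names, and the in-memory dictionary merely stands in for real database calls, so treat it as an illustration of the shape rather than a finished data access layer.

```csharp
using System.Collections.Generic;
using System.Linq;

// Plain entity: holds data only, no database code.
public class Group
{
    public int GroupID { get; set; }
    public string Name { get; set; }
}

// Data access contract; callers depend on this, not on the storage details.
public interface IGroupRepository
{
    Group GetGroup(int groupID);
    IReadOnlyList<Group> GetGroups();
}

// Stand-in implementation; a real one would query the database here.
public class InMemoryGroupRepository : IGroupRepository
{
    private readonly Dictionary<int, Group> _rows = new Dictionary<int, Group>
    {
        { 1, new Group { GroupID = 1, Name = "Admins" } },
        { 2, new Group { GroupID = 2, Name = "Users" } },
    };

    // Returns null when no group has the given ID.
    public Group GetGroup(int groupID)
    {
        Group group;
        return _rows.TryGetValue(groupID, out group) ? group : null;
    }

    // One call returns many entities without any per-object constructor queries.
    public IReadOnlyList<Group> GetGroups()
    {
        return _rows.Values.OrderBy(g => g.GroupID).ToList();
    }
}
```

Because the entity has no database code, GetGroups can build many Group objects from a single query, which the constructor-based design cannot do.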
If you use an ORM, see:
https://stackoverflow.com/questions/3505/what-are-your-favorite-net-object-relational-mappers-orm
Lazy loading versus early loading, which is what this really boils down to, is best determined by usage.
Mostly this means related entities -- if you are dealing with an individual address, for instance, splitting the read for the city from the read for the state would be crazy; OTOH, when returning a list of company employees, reading their address information is probably a waste of time and memory.
Also, these aren't mutually exclusive options -- you can have a constructor that calls the database, and a constructor that uses provided data.
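One way to combine both options is sketched below with Lazy<T>: one constructor defers the lookup until Name is first read, the other accepts data the caller already has. The loadName delegate is a hypothetical stand-in for whatever actually queries the database.

```csharp
using System;

public class Group
{
    private readonly Lazy<string> _name;

    public int GroupID { get; private set; }

    // The first read triggers the loader (if any); later reads reuse the cached value.
    public string Name { get { return _name.Value; } }

    // Lazy: nothing is loaded until Name is accessed.
    public Group(int groupID, Func<int, string> loadName)
    {
        GroupID = groupID;
        _name = new Lazy<string>(() => loadName(groupID));
    }

    // Eager: the caller already has the data, e.g. from a bulk query.
    public Group(int groupID, string name)
    {
        GroupID = groupID;
        _name = new Lazy<string>(() => name);
    }
}
```

A list screen could use the eager constructor with rows from a single query, while a detail screen could pass a loader and pay for the lookup only if the name is actually shown.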
If it is a relational database, the best way would be to use an ORM (object-relational mapper). See here for a list of ORMs:
https://en.wikipedia.org/wiki/List_of_object-relational_mapping_software

how does your custom class relate to the database

Okay, so I've studied C# and ASP.NET long enough and would like to know how all these custom classes I created relate to the database. For example:
I have a class called Employee
public class Employee
{
public int ID { get; set; }
public string Name { get; set; }
public string EmailAddress { get; set; }
}
And I have a database with the following 4 fields:
ID
Name
EmailAddress
PhoneNumber
It seems like the custom class is my database. And in ASP.NET I can simply run the LINQ to SQL command on my database and get the whole schema of my class without typing out a custom class with getters and setters.
So let's just say that I am now running a query to retrieve a list of employees. I would like to know how my application maps my Employee class to my database.
By itself, it doesn't. But add any ORM or similar, and you start to get closer. For example, with LINQ-to-SQL (which I mention because it is easy to get working with Visual Studio), you typically get (given to you by the tooling) a custom "data context" class, which you use as:
using(var ctx = new MyDatabase()) {
foreach(var emp in ctx.Employees) {
....
}
}
This is generating TSQL and mapping the data to objects automatically. By default the tooling creates a separate Employee class, but you can tweak this via partial classes. This also supports inserts, data changes and deletion.
There are also tools that allow re-use of your existing domain objects; either approach can be successful - each has advantages and disadvantages.
If you only want to read data, then it is even easier; a micro-ORM such as dapper-dot-net allows you to use your type with TSQL that you write, with it handling the tedious materialisation code.
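To give a feel for the "tedious materialisation code" that such tools write for you, here is a hand-rolled sketch using plain ADO.NET interfaces. EmployeeMaterializer is a hypothetical helper, not part of any library, and it assumes the column names match the table described in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string EmailAddress { get; set; }
}

// Hypothetical helper: copies columns from each row onto a new Employee.
public static class EmployeeMaterializer
{
    public static List<Employee> ReadAll(IDataReader reader)
    {
        var result = new List<Employee>();

        // Look the column positions up once, before iterating the rows.
        int idCol = reader.GetOrdinal("ID");
        int nameCol = reader.GetOrdinal("Name");
        int emailCol = reader.GetOrdinal("EmailAddress");

        while (reader.Read())
        {
            result.Add(new Employee
            {
                ID = reader.GetInt32(idCol),
                Name = reader.GetString(nameCol),
                // Database NULL has to be translated by hand.
                EmailAddress = reader.IsDBNull(emailCol) ? null : reader.GetString(emailCol)
            });
        }
        return result;
    }
}
```

Note that the PhoneNumber column is simply ignored because the class has no matching property; every new column or type means another line like the ones above, which is the work an ORM or micro-ORM takes over.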
Your question is a little vague, imo. But what you are referring to is the Model of the MVC (Model-View-Controller) architecture.
The Model, your Employee class, manages the data of the application. So it can not only get and set (save / update) your data, but it can also be used to notify of a data change (usually to the view).
You mentioned you were using SQL, so more than likely you could create and save an entire employee record by sending an associative array of the table data to save it to the database. Your setter for the class would handle the specific SQL syntax to INSERT the data. In larger MVC frameworks, the Model of your application inherits from several other classes that handle saving to different types of backends other than MS SQL.
Models will also normally have functions to handle finding and updating records. This is normally done by specifying a search field and getting back the record, which would include the ID; you would normally pass this back into a save / update function to make changes to the record. You could also tie into this level of the Model to create revisions of the data you are saving.
So how the model correlates to your SQL structure depends on how you write it, or on which framework you decide to use. I believe a common one for ASP.NET is Microsoft's ASP.NET MVC.
Your class cannot be directly mapped to the database without an ORM tool. The ORM tool will read your configuration and automatically map your class to a DB row according to your mappings. That means you don't need to read the row and set the class fields explicitly, but you do have to provide mapping files and go through the ORM framework to load the entities; the framework will take care of the rest.
You can check out NHibernate and its getting-started guide.

Adding behavior to LINQ to Entities models

What's the preferred approach when using L2E to add behavior to the objects in the data model?
Having a wrapper class that implements the behavior you need with only the data you need
using (var dbh = new ffEntities())
{
var query = from feed in dbh.feeds select
new FFFeed(feed.name, new Uri(feed.uri), feed.refresh);
return query.ToList();
}
//Later in a separate place, not even in the same class
foreach (FFFeed feed in feedList) { feed.doX(); }
Using directly the data model instances and have a method that operates over the IEnumerable of those instances
using (var dbh = new ffEntities())
{
var query = from feed in dbh.feeds select feed;
return query.ToList();
}
//Later in a separate place, not even in the same class
foreach (feeds feed in feedList) { doX(feed); }
Using extension methods on the data model class so it ends up having the extra methods the wrapper would have.
public static class dataModelExtensions {
public static void doX(this feeds source) {
//do X
}
}
//Later in a separate place, not even in the same class
foreach (feeds feed in feedList) { feed.doX(); }
Which one is best? I tend to favor the last approach as it's clean and doesn't interfere with the CRUD facilities (I can just use it to insert/update/delete directly, no need to wrap things back), but I wonder if there's a downside I haven't seen.
Is there a fourth approach? I fail at grasping LINQ's philosophy a bit, especially regarding LINQ to Entities.
The entity classes are partial classes as far as I know, so you can add another file that extends them directly using the partial keyword.
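A minimal sketch of the partial-class route follows. The first declaration only stands in for the designer-generated entity; the property names come from the question's code, but their types and the members ParsedUri and IsDue are assumptions made for illustration.

```csharp
using System;

// Stand-in for the generated half of the entity (normally tool-generated).
public partial class feeds
{
    public string name { get; set; }
    public string uri { get; set; }
    public int refresh { get; set; } // assumed to be a refresh interval in minutes
}

// Hand-written half, kept in a separate file so regeneration does not overwrite it.
public partial class feeds
{
    // Convenience view over the stored string column.
    public Uri ParsedUri
    {
        get { return new Uri(uri); }
    }

    // Hypothetical behavior: is the feed due for another fetch?
    public bool IsDue(int minutesSinceLastFetch)
    {
        return minutesSinceLastFetch >= refresh;
    }
}
```

The compiler merges both declarations into one class, so the instances returned by the context carry the extra members. Members added this way exist only in memory and cannot be used inside a LINQ to Entities query.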
Otherwise, I usually have a wrapper class, i.e. my ViewModel (I'm using WPF with MVVM). I also have some generic helper classes with fluent interfaces that I use to add specific query filters to my ViewModel.
I think it's a mistake to put behaviors on entity types at all.
The Entity Framework is based around the Entity Data Model, described by one of its architects as "very close to the object data model of .NET, modulo the behaviors." Put another way, your entity model is designed to map relational data into object space, but it should not be extended with methods. Save your methods for business types.
Unlike some other ORMs, you are not stuck with whatever object type comes out of the black box. You can project to nearly any type with LINQ, even if it is shaped differently than your entity types. So use entity types for mapping only, not for business code, data transfer, or presentation models.
Entity types are declared partial when code is generated. This leads some developers to attempt to extend them into business types. This is a mistake. Indeed, it is rarely a good idea to extend entity types. The properties created within your entity model can be queried in LINQ to Entities; properties or methods you add to the partial class cannot be included in a query.
Consider these examples of a business method:
public Decimal CalculateEarnings(Guid id)
{
var timeRecord = (from tr in Context.TimeRecords
.Include("Employee.Person")
.Include("Job.Steps")
.Include("TheWorld.And.ItsDog")
where tr.Id == id
select tr).First();
// Calculate has deep knowledge of entity model
return EarningsHelpers.Calculate(timeRecord);
}
What's wrong with this method? The generated SQL is going to be ferociously complex, because we have asked the Entity Framework to materialize instances of entire objects merely to get at the minority of properties required by the Calculate method. The code is also fragile. Changing the model will not only break the eager loading (via the Include calls), but will also break the Calculate method.
The Single Responsibility Principle states that a class should have only one reason to change. In the example shown on the screen, the EarningsHelpers type has the responsibility both of actually calculating earnings and of keeping up-to-date with changes to the entity model. The first responsibility seems correct, the second doesn't sound right. Let's see if we can fix that.
public Decimal CalculateEarnings(Guid id)
{
var timeData = (from tr in Context.TimeRecords
where tr.Id == id
select new EarningsCalculationContext
{
Salary = tr.Employee.Salary,
StepRates = from s in tr.Job.Steps
select s.Rate,
TotalHours = tr.Stop - tr.Start
}).First();
// Calculate has no knowledge of entity model
return EarningsHelpers.Calculate(timeData);
}
In the next example, I have rewritten the LINQ query to pick out only the bits of information required by the Calculate method, and project that information onto a type which rolls up the arguments for the Calculate method. If writing a new type just to pass arguments to a method seemed like too much work, I could have also projected onto an anonymous type, and passed Salary, StepRates, and TotalHours as individual arguments. But either way, we have fixed the dependency of EarningsHelpers on the entity model, and as a free bonus we've gotten more efficient SQL, as well.
You might look at this code and wonder what would happen if the Job property of TimeRecord were nullable. Wouldn't I get a null reference exception?
No, I would not. This code will not be compiled and executed as IL; it will be translated to SQL. LINQ to Entities coalesces null references. In the example query shown on the screen, StepRates would simply return null if Job was null. You can think of this as being identical to lazy loading, except without the extra database queries. The code says, "If there is a job, then load the rates from its steps."
An additional benefit of this kind of architecture is that it makes unit testing of the Web assembly very easy. Unit tests should not access a database, generally speaking (put another way, tests which do access a database are integration tests rather than unit tests). It's quite easy to write a mock repository which returns arrays of objects as Queryables rather than actually going to the Entity Framework.
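The mock-repository idea can be sketched as follows. All type names here (TimeRecord, ITimeRecordRepository, FakeTimeRecordRepository, EarningsService) and the simplified rate-times-hours calculation are hypothetical; the point is only that an array exposed through AsQueryable can replace the Entity Framework context in a test.

```csharp
using System;
using System.Linq;

// Simplified, hypothetical entity for the sketch.
public class TimeRecord
{
    public Guid Id { get; set; }
    public decimal HourlyRate { get; set; }
    public decimal Hours { get; set; }
}

// The business code depends on this abstraction, not on the EF context.
public interface ITimeRecordRepository
{
    IQueryable<TimeRecord> TimeRecords { get; }
}

// Test double: serves an in-memory array as an IQueryable.
public class FakeTimeRecordRepository : ITimeRecordRepository
{
    private readonly TimeRecord[] records;

    public FakeTimeRecordRepository(params TimeRecord[] records)
    {
        this.records = records;
    }

    public IQueryable<TimeRecord> TimeRecords
    {
        get { return records.AsQueryable(); }
    }
}

public class EarningsService
{
    private readonly ITimeRecordRepository repository;

    public EarningsService(ITimeRecordRepository repository)
    {
        this.repository = repository;
    }

    // Projects only the values needed, then calculates in memory.
    public decimal CalculateEarnings(Guid id)
    {
        var data = (from tr in repository.TimeRecords
                    where tr.Id == id
                    select new { tr.HourlyRate, tr.Hours }).First();
        return data.HourlyRate * data.Hours;
    }
}
```

One caveat with this approach: the fake runs as LINQ to Objects, which does not coalesce null references the way LINQ to Entities does, so a query that is safe against the database can still throw against the fake.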
