I am working on part of an application that simply pulls information from the database and displays it to users. For simplicity's sake, let us assume I have a database with two tables, Cats and Dogs. Both tables have manually assigned primary keys that never duplicate or overlap. The goal I am trying to achieve is to perform a single LINQ query that concatenates both tables.
I recently asked this question regarding performing a LINQ Concat on two collections of objects, Cats and Dogs, that were created manually in code. I advise reading the previous question, as it will give much insight into this one.
The reason I wish to use interfaces is to simplify my queries. I currently have a solution that .Select()s each of the columns I need into an anonymous type. This works fine for this instance, but would consume pages with the data I am actually working with.
The difference between that previous question and this one is that I am trying to pull these animals from a database. From my analysis, it seems that .NET or Entity Framework is not able to relate my database to my interface.
Model (From old question)
public interface iAnimal
{
    string name { get; set; }
    int age { get; set; }
}

public class Dog : iAnimal
{
    public string name { get; set; }
    public int age { get; set; }
}

public class Cat : iAnimal
{
    public string name { get; set; }
    public int age { get; set; }
}
Here are some different LINQ queries I have tried and the resulting error. The first example will be using the solution from the previous question.
var model = _db.Cats.Concat<iAnimal>(_db.Dogs).Take(4);
System.ArgumentException: DbUnionAllExpression requires arguments with compatible collection ResultTypes.
Without Covariance:
var model = _db.Cats.Cast<iAnimal>().Concat(_db.Dogs.Cast<iAnimal>());
System.NotSupportedException: Unable to cast the type 'Test.Models.Cat' to type 'Test.Interfaces.iAnimals'. LINQ to Entities only supports casting Entity Data Model primitive types.
From the above error, it looks like I am not able to use interfaces to interact with databases as it is not mapped to any particular table.
Any insight would be much appreciated. Thanks
EDIT
In response to @Reed Copsey: with your solution, I get the same error as my example without covariance. I tried changing the view's type to match what the error recommends, which results in this error:
System.InvalidOperationException: The model item passed into the dictionary is of type 'System.Data.Entity.Infrastructure.DbQuery`1[Test.Interfaces.iAnimal]', but this dictionary requires a model item of type 'System.Collections.Generic.IEnumerable`1[Test.Models.Cat]'.
Your database knows nothing about your interface, and you will probably not be able to get this working. I see two options.
You could use inheritance - which is, for example, supported by Entity Framework - and derive both entities from a common base entity. Then you will be able to perform queries against the base type, but this may require changes to your data model, depending on the way you implement inheritance at the database level.
Have a look at the documentation for TPT inheritance and TPH inheritance. There are still other inheritance models like TPC inheritance but they currently lack designer support.
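As a sketch of that first option (the names here are hypothetical, not from the question): with a common base entity mapped code-first, a single DbSet on the base type can query both sub-types at once.

```csharp
using System.Data.Entity;

// Hypothetical base entity; Cat and Dog become sub-types of it.
public abstract class Animal
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}
public class Dog : Animal { }
public class Cat : Animal { }

public class PetContext : DbContext
{
    // Querying this set returns cats and dogs together.
    // By default EF uses TPH: one table plus a discriminator column.
    public DbSet<Animal> Animals { get; set; }
}
```

A query such as `db.Animals.OrderBy(a => a.Name).Take(4)` then translates to a single SQL statement, with no Concat or Cast needed.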
The second option is to fetch results from both tables into memory and use LINQ to Objects to merge them into a single collection.
var dogs = database.Dogs.Take(4).ToList();
var cats = database.Cats.Take(4).ToList();
var pets = dogs.Cast<IPet>().Concat(cats).ToList();
Also note that your query
var model = _db.Cats.Concat<iAnimal>(_db.Dogs).Take(4);
does not seem well designed - the result will definitely depend on the database used, but I would not be surprised if you usually just got the first four cats and never saw any dog.
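If the intent is "four pets of any kind", an explicit ordering makes the result deterministic. Applied to the in-memory variant above (the ordering by name is an assumed criterion):

```csharp
var pets = dogs.Cast<IPet>()
    .Concat(cats)
    .OrderBy(p => p.name) // assumed criterion; without an OrderBy, Take(4) is arbitrary
    .Take(4)
    .ToList();
```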
Related
We have a base object with 10 child objects in EF6 code-first.
Of those 10 child objects, 5 have only a few (extra) properties, and 5 have multiple properties (5 to 20).
We implemented this as table-per-type, so we have one table for the base and one per child (10 in total).
This, however, creates HUGE select queries, with select case and unions all over the place, which also takes EF six seconds to generate (the first time).
I have read about this issue, and that the same issue holds in the table-per-concrete-type scenario.
So what we are left with is table-per-hierarchy, but that creates a table with a large number of properties, which doesn't sound great either.
Is there another solution for this?
I thought about maybe skipping the inheritance and creating a union view for when I want to get all the items from all the child objects/records.
Any other thoughts?
Another solution would be to implement some kind of CQRS pattern where you have separate databases for writing (command) and reading (query). You could even de-normalize the data in the read database so it is very fast.
Assuming you need at least one normalized model with referential integrity, I think your decision really comes down to Table per Hierarchy and Table per Type. TPH is reported by Alex James from the EF team and more recently on Microsoft's Data Development site to have better performance.
Advantages of TPT and why they're not as important as performance:
Greater flexibility, which means the ability to add types without affecting any existing table. Not too much of a concern because EF migrations make it trivial to generate the required SQL to update existing databases without affecting data.
Database validation on account of having fewer nullable fields. Not a massive concern because EF validates data according to the application model. If data is being added by other means it is not too difficult to run a background script to validate data. Also, TPT and TPC are actually worse for validation when it comes to primary keys because two sub-class tables could potentially contain the same primary key. You are left with the problem of validation by other means.
Storage space is reduced on account of not needing to store all the null fields. This is only a very trivial concern, especially if the DBMS has a good strategy for handling 'sparse' columns.
Design and gut-feel. Having one very large table does feel a bit wrong, but that is probably because most db designers have spent many hours normalizing data and drawing ERDs. Having one large table seems to go against the basic principles of database design. This is probably the biggest barrier to TPH. See this article for a particularly impassioned argument.
That article summarizes the core argument against TPH as:
It's not normalized even in a trivial sense, it makes it impossible to enforce integrity on the data, and what's most "awesome:" it is virtually guaranteed to perform badly at a large scale for any non-trivial set of data.
These are mostly wrong. Performance and integrity are mentioned above, and TPH does not necessarily mean denormalized. There are just many (nullable) foreign key columns that are self-referential. So we can go on designing and normalizing the data exactly as we would with a TPT. In a current database I have many relationships between sub-types and have created an ERD as if it were a TPT inheritance structure. This actually reflects the implementation in code-first Entity Framework. For example, here is my Expenditure class, which inherits from Relationship, which inherits from Content:
public class Expenditure : Relationship
{
/// <summary>
/// Inherits from Content: Id, Handle, Description, Parent (is context of expenditure and usually
/// a Project)
/// Inherits from Relationship: Source (the Principal), SourceId, Target (the Supplier), TargetId,
///
/// </summary>
[Required, InverseProperty("Expenditures"), ForeignKey("ProductId")]
public Product Product { get; set; }
public Guid ProductId { get; set; }
public string Unit { get; set; }
public double Qty { get; set; }
public string Currency { get; set; }
public double TotalCost { get; set; }
}
The InversePropertyAttribute and the ForeignKeyAttribute provide EF with the information required to make the required self joins in the single database table.
The Product type also maps to the same table (also inheriting from Content). Each Product has its own row in the table and rows that contain Expenditures will include data in the ProductId column, which is null for rows containing all other types. So the data is normalized, just placed in a single table.
The beauty of using EF code first is we design the database in exactly the same way and we implement it in (almost) exactly the same way regardless of using TPH or TPT. To change the implementation from TPH to TPT we simply need to add an annotation to each sub-class, mapping them to new tables. So, the good news for you is it doesn't really matter which one you choose. Just build it, generate a stack of test data, test it, change strategy, test it again. I reckon you'll find TPH the winner.
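As a sketch of that last point: switching the model above from TPH (the default) to TPT only requires a Table attribute per sub-class, with the class bodies left untouched.

```csharp
using System.ComponentModel.DataAnnotations.Schema;

// Each annotation maps one sub-class to its own table, turning the
// default TPH mapping into TPT. Remove them to switch back.
[Table("Relationships")]
public class Relationship : Content { /* members unchanged */ }

[Table("Expenditures")]
public class Expenditure : Relationship { /* members unchanged */ }
```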
Having experienced similar problems myself I've a few suggestions. I'm also open to improvements on these suggestions as It's a complex topic, and I don't have it all worked out.
Entity Framework can be very slow when dealing with non-trivial queries on complex entities - i.e. those with multiple levels of child collections. In some performance tests I've tried, it sits there an awfully long time compiling the query. In theory EF 5 and onwards should cache compiled queries (even if the context gets disposed and re-instantiated) without you having to do anything, but I'm not convinced that this is always the case.
I've read some suggestions that you should create multiple DataContexts with only smaller subsets of your database entities for a complex database. If this is practical for you give it a try! But I imagine there would be maintenance issues with this approach.
1) I Know this is obvious but worth saying anyway - make sure you have the right foreign keys set up in your database for related entities, as then entity framework will keep track of these relationships, and be much quicker generating queries where you need to join using the foreign key.
2) Don't retrieve more than you need. One-size-fits-all methods to get a complex object are rarely optimal. Say you are getting a list of base objects and you only need to display their name and ID. Just retrieve the base object - any navigation properties that aren't specifically needed should not be retrieved.
3) If the child objects are not collections, or they are collections but you only need one item (or an aggregate value such as the count) from them, I would absolutely implement a View in the database and query that instead. It is MUCH quicker. EF doesn't have to do any work - it's all done in the database, which is better equipped for this type of operation.
4) Be careful with .Include(), and this goes back to point 2 above. If you are getting a single object plus a child collection property, you may be better off not using .Include(): the child collection will then be retrieved as a separate query, avoiding fetching all the base object's columns for every row in the child collection.
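A minimal sketch of point 4, with hypothetical Order/OrderLine entities standing in for any parent/child pair:

```csharp
// With Include, one wide result set repeats the parent's columns
// for every child row:
//   var order = db.Orders.Include("Lines").Single(o => o.Id == orderId);

// Without Include, two slim queries fetch the same data:
var order = db.Orders.Single(o => o.Id == orderId);
order.Lines = db.OrderLines
    .Where(l => l.OrderId == orderId)
    .ToList();
```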
EDIT
Following comments here's some further thoughts.
As we are dealing with an inheritance hierarchy it makes logical sense to store separate tables for the additional properties of the inheriting classes + a table for the base class. As to how to make Entity Framework perform well though is still up for debate.
I've used EF for a similar scenario (but with fewer children), database-first; in that case I didn't use the actual Entity Framework generated classes as the business objects. The EF objects related directly to the DB tables.
I created separate business classes for the base and inheriting classes, and a set of Mappers that would convert to them. A query would look something like
public static List<BaseClass> GetAllItems()
{
using (var db = new MyDbEntities())
{
var q1 = db.InheritedClass1.Include("BaseClass").ToList()
.ConvertAll(x => (BaseClass)InheritedClass1Mapper.MapFromContext(x));
var q2 = db.InheritedClass2.Include("BaseClass").ToList()
.ConvertAll(x => (BaseClass)InheritedClass2Mapper.MapFromContext(x));
return q1.Union(q2).ToList();
}
}
Not saying this is the best approach, but it might be a starting point?
The queries are certainly quick to compile in this case!
Comments welcome!
With Table per Hierarchy you end up with only one table, so obviously your CRUD operations will be faster, and this table is abstracted away by your domain layer anyway. The disadvantage is that you lose the ability to use NOT NULL constraints, so this needs to be handled properly by your business layer to avoid potential data integrity issues. Also, adding or removing entities means that the table changes; but that's also something that is manageable.
With Table per type you have the problem that the more classes in the hierarchy you have, the slower your CRUD operations will become.
All in all, as performance is probably the most important consideration here and you have a lot of classes, I think Table per Hierarchy is a winner in terms of both performance and simplicity and taking into account your number of classes.
Also look at this article, more specifically at chapter 7.1.1 (Avoiding TPT in Model First or Code First applications), where they state: "when creating an application using Model First or Code First, you should avoid TPT inheritance for performance concerns."
The EF6 Code First model I'm working on uses generics and an abstract base class called "BaseEntity". I also use generics and a base class for the EntityTypeConfiguration class.
In the event that I need to reuse a couple of properties "columns" on some tables and it doesn't make sense for them to be on BaseEntity or BaseEntityWithMetaData, I make an interface for them.
E.g. I have one for addresses I haven't finished yet. So if an entity has address information it will implement IAddressInfo. Casting an entity to IAddressInfo will give me an object with just the AddressInfo on it.
Originally I had my metadata columns as their own table. But like others have mentioned, the queries were horrendous, and it was slower than slow. So I thought, why don't I just use multiple inheritance paths to support what I want to do, so the columns are on every table that needs them, and not on the ones that don't. Also, I am using MySQL, which has a column limit of 4096. SQL Server 2008 has 1024. Even at 1024, I don't see realistic scenarios for going over that on one table.
And none of my objects inherit in such a way that they have columns they don't need. When that need arises, I create a new base class at a level that prevents the extra columns.
Here's are enough snippets from my code to understand how I have my inheritance setup. So far it works really well for me. I haven't really produced a scenario I couldn't model with this setup.
public class BaseEntityConfig<T> : EntityTypeConfiguration<T> where T : BaseEntity<T>, new()
{
}
public class BaseEntity<T> where T : BaseEntity<T>, new()
{
    //shared properties here
}
public class BaseEntityWithMetaDataConfig<T> : BaseEntityConfig<T> where T : BaseEntityWithMetaData<T>, new()
{
    public BaseEntityWithMetaDataConfig()
    {
        this.HasOptional(e => e.RecCreatedBy).WithMany().HasForeignKey(p => p.RecCreatedByUserId);
        this.HasOptional(e => e.RecLastModifiedBy).WithMany().HasForeignKey(p => p.RecLastModifiedByUserId);
    }
}
public class BaseEntityWithMetaData<T> : BaseEntity<T> where T : BaseEntityWithMetaData<T>, new()
{
#region Entity Properties
public DateTime? DateRecCreated { get; set; }
public DateTime? DateRecModified { get; set; }
public long? RecCreatedByUserId { get; set; }
public virtual User RecCreatedBy { get; set; }
public virtual User RecLastModifiedBy { get; set; }
public long? RecLastModifiedByUserId { get; set; }
public DateTime? RecDateDeleted { get; set; }
#endregion
}
public class PersonConfig : BaseEntityWithMetaDataConfig<Person>
{
    public PersonConfig()
    {
        this.ToTable("people");
        this.HasKey(e => e.PersonId);
        this.HasOptional(e => e.User).WithRequired(p => p.Person).WillCascadeOnDelete(true);
        this.HasOptional(p => p.Employee).WithRequired(p => p.Person).WillCascadeOnDelete(true);
        this.HasMany(e => e.EmailAddresses).WithRequired(p => p.Person).WillCascadeOnDelete(true);
        this.Property(e => e.FirstName).IsRequired().HasMaxLength(128);
        this.Property(e => e.MiddleName).IsOptional().HasMaxLength(128);
        this.Property(e => e.LastName).IsRequired().HasMaxLength(128);
    }
}
//I have to use this pattern to allow other classes to inherit from Person; they have to inherit from BasePerson<T>
public class Person : BasePerson<Person>
{
//Just a dummy class to expose BasePerson as it is.
}
public class BasePerson<T> : BaseEntityWithMetaData<T> where T: BasePerson<T>, new()
{
#region Entity Properties
public long PersonId { get; set; }
public virtual User User { get; set; }
public string FirstName { get; set; }
public string MiddleName { get; set; }
public string LastName { get; set; }
public virtual Employee Employee { get; set; }
public virtual ICollection<PersonEmail> EmailAddresses { get; set; }
#endregion
#region Entity Helper Properties
[NotMapped]
public PersonEmail PrimaryPersonalEmail
{
get
{
PersonEmail ret = null;
if (this.EmailAddresses != null)
ret = (from e in this.EmailAddresses where e.EmailAddressType == EmailAddressType.Personal_Primary select e).FirstOrDefault();
return ret;
}
}
[NotMapped]
public PersonEmail PrimaryWorkEmail
{
get
{
PersonEmail ret = null;
if (this.EmailAddresses != null)
ret = (from e in this.EmailAddresses where e.EmailAddressType == EmailAddressType.Work_Primary select e).FirstOrDefault();
return ret;
}
}
private string _DefaultEmailAddress = null;
[NotMapped]
public string DefaultEmailAddress
{
get
{
if (string.IsNullOrEmpty(_DefaultEmailAddress))
{
PersonEmail personalEmail = this.PrimaryPersonalEmail;
if (personalEmail != null && !string.IsNullOrEmpty(personalEmail.EmailAddress))
_DefaultEmailAddress = personalEmail.EmailAddress;
else
{
PersonEmail workEmail = this.PrimaryWorkEmail;
if (workEmail != null && !string.IsNullOrEmpty(workEmail.EmailAddress))
_DefaultEmailAddress = workEmail.EmailAddress;
}
}
return _DefaultEmailAddress;
}
}
#endregion
#region Constructor
static BasePerson()
{
}
public BasePerson()
{
this.User = null;
this.EmailAddresses = new HashSet<PersonEmail>();
}
public BasePerson(string firstName, string lastName)
{
this.FirstName = firstName;
this.LastName = lastName;
}
#endregion
}
Now, the code in the context's OnModelCreating looks like:
//Config
modelBuilder.Conventions.Remove<PluralizingTableNameConvention>();
//initialize configuration; each line is responsible for telling Entity Framework how to create relationships between the different tables in the database,
//such as table names, foreign key constraints, unique constraints, all relations, etc.
modelBuilder.Configurations.Add(new PersonConfig());
modelBuilder.Configurations.Add(new PersonEmailConfig());
modelBuilder.Configurations.Add(new UserConfig());
modelBuilder.Configurations.Add(new LoginSessionConfig());
modelBuilder.Configurations.Add(new AccountConfig());
modelBuilder.Configurations.Add(new EmployeeConfig());
modelBuilder.Configurations.Add(new ContactConfig());
modelBuilder.Configurations.Add(new ConfigEntryCategoryConfig());
modelBuilder.Configurations.Add(new ConfigEntryConfig());
modelBuilder.Configurations.Add(new SecurityQuestionConfig());
modelBuilder.Configurations.Add(new SecurityQuestionAnswerConfig());
The reason I created base classes for the configuration of my entities was that when I started down this path, I ran into an annoying problem. I had to configure the shared properties for every derived class over and over again. And if I updated one of the fluent API mappings, I had to update code in every derived class.
But by using this inheritance method on the configuration classes, the two properties are configured in one place and inherited by the configuration class for derived entities.
So when PersonConfig is configured, it runs the logic on the BaseEntityWithMetaData configuration class to configure the two properties, and again when UserConfig runs, etc.
Three different approaches have different names in M. Fowler's language:
Single Table inheritance - whole inheritance hierarchy held in one table. No joins, optional columns for child types. You need to distinguish which child type it is.
Concrete Table inheritance - you have one table for each concrete type. Joins, no optional columns. In this case, base type table is needed only if the base type requires to have its own mapping (instance can be created).
Class Table inheritance - you have a base type table, and child tables - each adding only additional columns to the base's columns. Joins, no optional columns. In this case, the base type table always contains a row for each child; however, you can retrieve common columns only if no child-specific columns are needed (the rest comes with lazy loading, maybe?).
All approaches are workable - it only depends on the amount and structure of data you have, so you can measure performance differences first.
The choice will be based on the number of joins vs. data distribution vs. optional columns.
If you don't have (and are not going to have) many child types, I would go with class table inheritance, since it stays close to the domain and will be easy to translate/map.
If you have many child tables to work with at the same time, and anticipate a bottleneck in joins - go with single table inheritance.
If joins are not needed at all and you are going to work with one concrete type at a time - go with concrete table inheritance.
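In EF6 Code First terms, the three patterns correspond roughly to the following configuration (a sketch with hypothetical Animal/Dog/Cat types; table and column names are illustrative):

```csharp
// Single Table inheritance (TPH) - EF's default; the discriminator
// column can be configured explicitly:
modelBuilder.Entity<Animal>()
    .Map<Dog>(m => m.Requires("AnimalType").HasValue("Dog"))
    .Map<Cat>(m => m.Requires("AnimalType").HasValue("Cat"));

// Class Table inheritance (TPT) - each sub-type's table joins
// back to the base table on the primary key:
modelBuilder.Entity<Dog>().ToTable("Dogs");
modelBuilder.Entity<Cat>().ToTable("Cats");

// Concrete Table inheritance (TPC) - each concrete type's table
// carries all columns, including the inherited ones:
modelBuilder.Entity<Dog>().Map(m =>
{
    m.MapInheritedProperties();
    m.ToTable("Dogs");
});
```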
Although Table per Hierarchy (TPH) is a better approach for fast CRUD operations, in that case it is impossible to avoid a single table with so many properties in the database. The case and union clauses that you mentioned are created because the resulting query is effectively requesting a polymorphic result set that includes multiple types.
However, when EF returns a flattened table that includes the data for all the types, it does extra work to ensure that null values are returned for columns that may be irrelevant for a particular type. Technically, this extra validation using case and union is not necessary.
The issue below is a performance glitch in Microsoft EF6, and they are aiming to deliver a fix in a future release.
The below query:
SELECT
[Extent1].[CustomerId] AS [CustomerId],
[Extent1].[Name] AS [Name],
[Extent1].[Address] AS [Address],
[Extent1].[City] AS [City],
CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[State] END AS [C2],
CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[Zip] END AS [C3]
FROM [dbo].[Customers] AS [Extent1]
can be safely replaced by:
SELECT
[Extent1].[CustomerId] AS [CustomerId],
[Extent1].[Name] AS [Name],
[Extent1].[Address] AS [Address],
[Extent1].[City] AS [City],
[UnionAll1].[State] AS [C2],
[UnionAll1].[Zip] AS [C3]
FROM [dbo].[Customers] AS [Extent1]
So, you have just seen the problem and the flaw in the current Entity Framework 6 release; you have the option to either use a Model First approach or a TPH approach.
I have a quick question about the sqlite-net library, which can be found here: https://github.com/praeclarum/sqlite-net.
The thing is, I have no idea how collections and custom objects will be inserted into the database, or how to convert them back when querying, if needed.
Take this model for example:
[PrimaryKey, AutoIncrement]
public int Id { get; set; }
private string _name; // The name of the subject, e.g. "Physics"
private ObservableCollection<Lesson> _lessons;
Preface: I've not used sqlite-net; rather, I spent some time simply reviewing the source code on the github link posted in the question.
From the first page on the sqlite-net github site, there are two bullet points that should help in some high level understanding:
Very simple methods for executing CRUD operations and queries safely (using parameters) and for retrieving the results of those queries in a strongly typed fashion
In other words, sqlite-net will work well with non-complex models; will probably work best with flattened models.
Works with your data model without forcing you to change your classes. (Contains a small reflection-driven ORM layer.)
In other words, sqlite-net will transform/map the result set of the SQL query to your model; again, will probably work best with flattened models.
Looking at the primary source code of SQLite.cs, there is an InsertAll method and a few overloads that will insert a collection.
When querying for data, you should be able to use the Get<T> method and the Table<T> method, and there is also a Query<T> method you could take a look at as well. Each should map the results to the type parameter.
Finally, take a look at the examples and tests for a more in-depth look at using the framework.
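A short sketch of those methods (the model, the file name, and the data are hypothetical; the attributes come from SQLite.cs):

```csharp
public class Course
{
    [PrimaryKey, AutoIncrement]
    public int CourseId { get; set; }
    public string Name { get; set; }
}

var db = new SQLiteConnection("app.db");
db.CreateTable<Course>();                 // builds the table from the model via reflection
db.InsertAll(new[] { new Course { Name = "Physics" } });

var byKey  = db.Get<Course>(1);           // single row by primary key
var byName = db.Table<Course>()           // strongly typed table query
    .Where(c => c.Name == "Physics")
    .ToList();
var viaSql = db.Query<Course>(            // raw SQL, still mapped to Course
    "select * from Course where Name = ?", "Physics");
```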
I've worked quite a bit with SQLite-net in the past few months (including this presentation yesterday)
how collections, and custom objects will be inserted into the database
I think the answer is they won't.
While it is a very capable database and ORM, SQLite-net is targeting lightweight mobile apps. Because of this lightweight focus, the classes used are generally very simple flattened objects like:
public class Course
{
public int CourseId { get; set; }
public string Name { get; set; }
}
public class Lesson
{
public int LessonId { get; set; }
public string Name { get; set; }
public int CourseId { get; set; }
}
If you then need to Join these back together and to handle insertion and deletion of related objects, then that's down to you - the app developer - to handle. There's no auto-tracking of related objects like there is in a larger, more complicated ORM stack.
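A sketch of that manual re-join, using the flattened Course/Lesson models above and LINQ to Objects:

```csharp
// Load both tables once, then group lessons under their course in memory.
var lessonsByCourse = db.Table<Lesson>()
    .ToList()
    .ToLookup(l => l.CourseId);

foreach (var course in db.Table<Course>().ToList())
{
    var lessons = lessonsByCourse[course.CourseId].ToList();
    // attach 'lessons' to a view model, display them, etc.
}
```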
In practice, I've not found this a problem. I find SQLite-net very useful in my mobile apps.
I have two entities and there are their POCO:
public class DocumentColumn
{
public virtual long Id { get; set; }
public virtual string Name { get; set; }
public virtual long? DocumentTypeId { get; set; }
}
public class DocumentType {
public virtual long Id { get; set; }
public virtual string Name { get; set; }
}
There is a relation between those two entities. In the db, the relation is called FK_T_DOCUMENT_COLUMN_T_DOCUMENT_TYPE.
When I do:
DocumentColumns.Where(x => x.DocumentTypeId == documentTypeId).ToList();
I get the exception:
{"Metadata information for the relationship 'MyModel.FK_T_DOCUMENT_COLUMN_T_DOCUMENT_TYPE' could not be retrieved. If mapping attributes are used, make sure that the EdmRelationshipAttribute for the relationship has been defined in the assembly. When using convention-based mapping, metadata information for relationships between detached entities cannot be determined.\r\nParameter name: relationshipName"}
I tried removing the relationship and the DocumentColumn table and reloading them, but the code still throws the exception.
What does this exception mean, and how can I solve it?
EDIT:
The exception happens also If I do DocumentColumns.ToList();
(Presuming you are talking about Code First ....)
There is no information in either class to let CF know that there is a relationship between them. It doesn't matter that the database has the info; Entity Framework needs to have a clue about the relationship. You provide only a property with an integer. CF cannot infer a relationship from that. You must have something in one class or the other that exposes the related type. This is not a database. It's a data model. Very different things.
But that's not all. I'm guessing that this is a one-to-many relationship. You could either put a List property into the DocumentType class or a DocumentType property in the DocumentColumn class. If you only do the latter, CF and EF will NOT know about the 1:*. It will presume a 1:1 (that is, if you leave the DocumentTypeId integer in there; otherwise it will presume a 1:0..1). However, I think you could get away with this and then just configure the multiplicity (1:*) in the fluent API.
UPDATE...reading your question again, I think you are using an EDMX and designer not code first. What are you using to create your POCO classes? Are you doing code gen from the EDMX or just writing the classes. I still think the lack of a navigation property in at least ONE of the types might be the cause of the problem. The ERROR message does not suggest that...I'm only coming to this conclusion by looking at the classes and inferring my understanding of how EF works with the metadata. I could be barking up the wrong tree. FWIW, I have asked the team if they are familiar with this exception and can provide some idea of what pattern would create it. It's pretty bizarre. :)
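As a sketch of what the answer above suggests, adding a navigation property on at least one side gives Code First enough to infer the relationship:

```csharp
public class DocumentColumn
{
    public virtual long Id { get; set; }
    public virtual string Name { get; set; }
    public virtual long? DocumentTypeId { get; set; }

    // Navigation property: lets Code First pair this with
    // DocumentTypeId and infer the FK relationship.
    public virtual DocumentType DocumentType { get; set; }
}
```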
It seems odd to me that you are using EF with a defined relationship and you are not using the related property. Can you not do:
DocumentColumns.Where(x=>x.DocumentType.Id == documentTypeId).ToList();
This is what I would expect to see in this instance.
I'm reading through Pro ASP.NET MVC 3 Framework, which just came out, and am a bit confused about how to handle the retrieval of aggregate objects from a data store. The book uses Entity Framework, but I am considering using a mini-ORM (Dapper or PetaPoco). As an example, the book uses the following objects:
public class Member {
public string name { get; set; }
}
public class Item {
public int id { get; set; }
public List<Bid> bids { get; set; }
}
public class Bid {
public int id { get; set; }
public Member member { get; set; }
public decimal amount { get; set; }
}
As far as I'm into the book, they just mention the concept of aggregates and move on. So I am assuming you would then implement some basic repository methods, such as:
List<Item> GetAllItems()
List<Bid> GetBidsById(int id)
Member GetMemberById(int id)
Then, if you wanted to show a list of all items, their bids, and the bidding member, you'd have something like
List<Item> items = Repository.GetAllItems();
foreach (Item i in items) {
    i.bids = Repository.GetBidsById(i.id);
    foreach (Bid b in i.bids) {
        b.member = Repository.GetMemberById(b.id);
    }
}
If this is correct, isn't this awfully inefficient, since you could potentially issue thousands of queries in a few seconds? In my non-ORM thinking mind, I would have written a query like
SELECT
item.id,
bid.id,
bid.amount,
member.name
FROM
item
INNER JOIN bid
ON item.id = bid.itemId
INNER JOIN member
ON bid.memberId = member.id
and stuck it in a DataTable. I know it's not pretty, but one large query versus a few dozen little ones seems a better alternative.
If this is not correct, then can someone please enlighten me as to the proper way of handling aggregate retrieval?
If you use Entity Framework for your Data Access Layer, read the Item entity and use the .Include() fluent method to bring the Bids and Members along for the ride.
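A sketch of that, assuming an EF context with an Items set for the book's model:

```csharp
using System.Data.Entity; // brings in the lambda overload of Include

// Eager-load each item's bids and, for each bid, its member,
// all in one round trip to the database.
var items = context.Items
    .Include(i => i.bids.Select(b => b.member))
    .ToList();
```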
An aggregate is a collection of related data. The aggregate root is the logical entry point of that data. In your example, the aggregate root is an Item with Bid data. You could also look at the Member as an aggregate root with Bid data.
You may use your data access layer to retrieve the object graph of each aggregate and transform the data for use in the view. You may even ensure you eagerly fetch all of the data from the children. It is possible to transform the data using a tool like AutoMapper.
However, I believe that it is better to use your data access layer to project the domain objects into the data structure you need for the view, whether it be ORM or DataSet. Again, to use your example, would you actually retrieve the entire object graph suggested? Do I need all items including their bids and members? Or do I need a list of items, number of bids, plus member name and amount for the current winning bid? When I need more data about a particular item, I can go retrieve that when the request is made.
In short, your intuition was spot-on that it is inefficient to retrieve all that data, when a projection would suffice. I would just urge you to limit the projection even further and retrieve only the data you require for the current view.
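Such a limited projection might look like the following in LINQ (the DTO and navigation-property names are assumptions, not part of the original example):

using System.Linq;

// Project straight into a view-specific DTO: only the columns the view needs.
// One SQL query; no full entities are materialized.
var summaries = db.Items
    .Select(i => new ItemSummaryDto           // hypothetical view-model type
    {
        ItemId        = i.Id,
        BidCount      = i.Bids.Count(),
        TopAmount     = i.Bids.Max(b => (decimal?)b.Amount),
        TopBidderName = i.Bids.OrderByDescending(b => b.Amount)
                              .Select(b => b.Member.Name)
                              .FirstOrDefault()
    })
    .ToList();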
This would be handled in different ways depending on your data access strategy. If you were using NHibernate or Entity Framework, you can have the ORM automatically populate these properties for you eagerly, lazy load them, etc. Entity Framework calls them "Navigation Properties"; I'm not sure that NHibernate has a specific name for these "child properties" or "child collections".
In old-school ADO.NET, you might do something like create a stored procedure that returns multiple result sets (one for the main object and other result sets for your child collections or related objects), which would let you avoid calling the database multiple times. You could then iterate over the result sets and hydrate your object with all its relationships with one database call, inside a single repository method.
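A rough sketch of that stored-procedure approach with plain ADO.NET (the procedure name and hydration details are made up for illustration):

using System.Data;
using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.GetItemsWithBids", conn)) // hypothetical proc
{
    cmd.CommandType = CommandType.StoredProcedure;
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())       // first result set: items
        {
            /* hydrate Item objects */
        }
        reader.NextResult();        // advance to the second result set: bids
        while (reader.Read())
        {
            /* hydrate Bid objects and attach them to their Item */
        }
    }
}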
Wherever in your system you do the data retrieval, you would program your ORM of choice to do an eager fetch of the related objects (aggregates).
Which data access method to use depends on your project: it is a trade-off between convenience and performance. Using EF or LINQ to SQL really boosts coding speed, but when performance matters you should care about every SQL statement you deliver to the database. No ORM can give you both.
You can treat the read (query) and the write (command) side of the model separately.
When you want to mutate the state of your Aggregate, you load the Aggregate Root (AR) via a repository, mutate its state using the intention revealing public methods on the AR, then save the AR with the repository back again.
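That write-side round trip is only a few lines; a sketch with illustrative repository and method names:

// Load the aggregate root, mutate it through its public behaviour, save it back.
var item = itemRepository.GetById(itemId);  // hypothetical repository
item.PlaceBid(memberId, amount);            // intention-revealing method on the AR
itemRepository.Save(item);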
On the read side, however, you can be as flexible as you want. I don't know Entity Framework, but with NHibernate you could use the QueryOver API to generate flexible queries to populate DTOs designed to be consumed by the client, whether it be a service or a View. If you want more performance you could go with Dapper. You could even use stored procs that project into a DTO; that way you can be as efficient in the DB layer as possible.
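For the read side, an NHibernate QueryOver projection into a DTO could look something like this (the DTO type and property names are assumptions for the sketch):

using NHibernate.Transform; // for Transformers.AliasToBean

ItemSummaryDto dto = null;  // alias object used by WithAlias

var summaries = session.QueryOver<Item>()
    .SelectList(list => list
        .Select(i => i.Id).WithAlias(() => dto.ItemId)
        .Select(i => i.Name).WithAlias(() => dto.Name))
    .TransformUsing(Transformers.AliasToBean<ItemSummaryDto>())
    .List<ItemSummaryDto>();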
I am having some issues with using the OrderBy extension method on a LINQ query when it is operating on an enum type. I have created a regular DataContext using Visual Studio by simply dragging and dropping everything onto the designer. I have then created separate entity models, which are simply POCOs, and I have used a repository pattern to fetch the data from my database and map it into my own entity models (or rather, I have a repository pattern that builds up an IQueryable that'll do all this).
Everything works just fine, except when I try to apply an OrderBy (outside of the repository) on a property that I have mapped from short/smallint to an enum.
Here are the relevant code bits:
public class Campaign
{
public long Id { get; set; }
public string Name { get; set; }
....
public CampaignStatus Status { get; set; }
...
}
public enum CampaignStatus : short
{
Active,
Inactive,
Todo,
Hidden
}
public class SqlCampaignRepository : ICampaignRepository
{
...
public IQueryable<Campaign> Campaigns()
{
DataContext db = new DataContext();
return from c in db.Campaigns
select new Campaign
{
Id = c.Id,
Name = c.Name,
...
Status = (CampaignStatus)c.Status,
...
};
}
}
And then elsewhere
SqlCampaignRepository rep = new SqlCampaignRepository();
var query = rep.Campaigns().OrderBy(c => c.Status);
This triggers the following exception:
System.ArgumentException was unhandled by user code
Message="The argument 'value' was the wrong type. Expected 'IQMedia.Models.CampaignType'. Actual 'System.Int16'."
Source="System.Data.Linq"
StackTrace:
at System.Data.Linq.SqlClient.SqlOrderExpression.set_Expression(SqlExpression value)
at System.Data.Linq.SqlClient.SqlBinder.Visitor.VisitSelect(SqlSelect select)
at System.Data.Linq.SqlClient.SqlVisitor.Visit(SqlNode node)
at System.Data.Linq.SqlClient.SqlBinder.Visitor.VisitIncludeScope(SqlIncludeScope scope)
...
(the stack trace was originally in Danish; "ved" means "at")
I have tried typecasting the Status to short in the OrderBy expression, but that doesn't help, and the same goes if I cast it to the actual enum type.
Any help fixing this is greatly appreciated!
Can you specify the type CampaignStatus directly in your DataContext through the designer? That way the value is automatically mapped to the enum.
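If the column's Type is set in the designer to the fully qualified enum name, the generated property ends up along these lines (a sketch; the storage field name and attribute details are whatever the designer generates):

[Column(Storage = "_Status", DbType = "SmallInt NOT NULL", CanBeNull = false)]
public CampaignStatus Status
{
    get { return this._Status; }
    set { this._Status = value; /* designer-generated change tracking omitted */ }
}

With the enum mapped at the DataContext level, LINQ to SQL knows how to translate comparisons and ordering on the property without a manual cast in the projection.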
What is the relationship between the Campaign class and Campaigns? If Campaigns returns the set of Campaign objects, note that you can't normally "select new" a mapped entity.
I wonder if it would work any better if you did the OrderBy before the Select?
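That would mean ordering on the raw smallint column inside the repository query, before the mapping; a sketch based on the repository code above:

return from c in db.Campaigns
       orderby c.Status          // orders on the underlying smallint column
       select new Campaign
       {
           Id = c.Id,
           Name = c.Name,
           Status = (CampaignStatus)c.Status
       };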
One final trick might be to create a fake composable [Function], using trivial TSQL. For example, ABS might be enough. i.e. something like (on the context):
[Function(Name="ABS", IsComposable=true)]
public int Abs(int value)
{ // to prove not used by our C# code...
throw new NotImplementedException();
}
Then try:
.OrderBy(x => ctx.Abs(x.Status))
I haven't tested the above, but can give it a go later... it works for some other similar cases, though.
Worth a shot...
My DataContext has its own entity class named Campaign (living in a different namespace, of course). Also, the status column is saved as a smallint in the database, and the LINQ entity class lists its type as short (System.Int16).
The OrderBy DOES work if I apply it in the query in my repository - this is all part of a bigger thing, though, and the whole idea is NOT to have the repository apply any sorting, filtering or anything like that, but merely map the database entity classes to my own. This example is obviously a bit pointless in that it's pretty much a straight mapping, but in some cases I have localization added into it as well.
Also, I forgot to add - the exception obviously doesn't occur until I try to execute the query (i.e. calling ToList, or enumerating over the collection).
In the bigger picture this method is being used by a service class which is then supposed to add filtering, sorting and all that - and the point of all this is of course to separate things out a bit, but also to allow easy transition to a different database, or a different OR/M, later on, if that would be the desire.
Ah, I didn't see that last bit until after I replied - I have not had any experience using the Function attribute yet, but I will not have access to the DataContext in the class where I am supposed to apply the sorting.