Best strategies when working with a micro ORM? - C#

I started using PetaPOCO and Dapper, and they both have their own limitations. But on the other hand, they are so lightning fast compared to Entity Framework that I tend to let go of their limitations.
My question is: is there any micro ORM that lets us define one-to-many, many-to-one and many-to-many relationships concretely? Both Dapper.Net and PetaPOCO implement these relationships in a somewhat hack-ish way, and moreover they don't scale very well when you have 5-6 joins. If there isn't a single micro ORM that can deal with this, then my second question is: should I accept that these micro ORMs aren't good at defining relationships, and create a new POCO entity for every type of query that involves these multi-joins? Can that scale well?
I hope I am clear with my question. If not, let me know.

I generally follow these steps.
I create my view model so that it represents the exact data and format I want to display in a view.
I query straight from the database via PetaPoco onto my view models.
In my branch I have a
T SingleInto<T>(T instance, string sql, params object[] args);
method which takes an existing object and maps columns directly onto it, matched by name. This works brilliantly for this scenario.
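For illustration, usage might look like this (a sketch against that branch's method; the InvoiceLineViewModel name and columns are my own, not from the original):

// hypothetical view model, populated column-by-column by name
public class InvoiceLineViewModel
{
    public int InvoiceId { get; set; }
    public string CustomerName { get; set; }
    public decimal Total { get; set; }
}

// db is a PetaPoco.Database from the branch linked below; the join feeds the view model directly
var model = db.SingleInto(new InvoiceLineViewModel(),
    @"SELECT i.InvoiceId, c.Name AS CustomerName, i.Total
      FROM Invoice i
      INNER JOIN Customer c ON c.CustomerId = i.CustomerId
      WHERE i.InvoiceId = @0", invoiceId);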
My branch can be found here if needed.
https://github.com/schotime/petapoco/

they don't scale very well when you have 5-6 joins
Yes, they don't, but that is a good thing, because when the system you are building starts to get complex, you are free to write exactly the joins you want, without performance penalties or headaches.
Yes, I miss not having to write all these JOINs, as with Linq2SQL, but then I created a simple tool to write the common joins, so I get the basic SQL for any entity and can build from there.
Example:
[TableName("Product")]
[PrimaryKey("ProductID")]
[ExplicitColumns]
public class Product
{
    [PetaPoco.Column("ProductID")]
    public int ProductID { get; set; }

    [PetaPoco.Column("Name")]
    [Display(Name = "Name")]
    [Required]
    [StringLength(50)]
    public String Name { get; set; }

    ...

    [PetaPoco.Column("ProductTypeID")]
    [Display(Name = "ProductType")]
    public int ProductTypeID { get; set; }

    [ResultColumn]
    public string ProductType { get; set; }

    ...

    public static Product SingleOrDefault(int id)
    {
        var sql = BaseQuery();
        sql.Append("WHERE Product.ProductID = @0", id);  // PetaPoco uses @0-style parameters
        return DbHelper.CurrentDb().SingleOrDefault<Product>(sql);
    }

    public static PetaPoco.Sql BaseQuery(int TopN = 0)
    {
        var sql = PetaPoco.Sql.Builder;
        sql.AppendSelectTop(TopN);
        sql.Append("Product.*, ProductType.Name as ProductType");
        sql.Append("FROM Product");
        sql.Append("INNER JOIN ProductType ON Product.ProductTypeID = ProductType.ProductTypeID");
        return sql;
    }
}
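Building from that base query is then just a matter of appending to it; for example (a sketch, the filter is illustrative):

// start from the common joins and add criteria per use case
var sql = Product.BaseQuery();
sql.Append("WHERE ProductType.Name = @0", "Beverages");
var products = DbHelper.CurrentDb().Fetch<Product>(sql);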

Would QueryFirst help here? You get the speed of micro ORMs, with the added comfort of every-error-a-compile-time-error, plus IntelliSense both for your queries and their output. You define your joins in SQL, as God intended. If typing out join conditions is really bugging you, DBForge might be the answer, and because you're working in SQL, these tools are compatible and you're not locked in.

Related

Entity Framework LINQ SQL Query Performance

Hello everyone. I'm working on an API that returns a dish with its restaurant details, from a database that has restaurants and their dishes.
I'm wondering if the following makes the query any more efficient by converting the first form to the second:
from res in _context.Restaurant
join resdish in _context.RestaurantDish
    on res.Id equals resdish.RestaurantId
where resdish.RestaurantDishId == dishId
select new { res, resdish }
Second:
from resdish in _context.RestaurantDish
where resdish.RestaurantDishId == dishId
join res in _context.Restaurant
    on resdish.RestaurantId equals res.Id
select new { res, resdish }
The reason I'm debating this is that I feel like the second version filters down to the single restaurant dish first and then joins, rather than joining all dishes and then filtering.
Is this correct?
You can use a profiler on your database to capture the SQL in both cases, or inspect the SQL that EF generates, and you'll likely find that it is virtually identical either way. It boils down to how the reader (a developer) interprets the intention of the logic.
As far as building efficient queries in EF goes: EF is an ORM, meaning it maps between an object-oriented model and a relational data model; it isn't just an API for translating Linq to SQL. Part of the power for writing simple and efficient queries comes from navigation properties and projection. A Dish is the property of a particular Restaurant, while a Restaurant has many Dishes on its menu. This forms a one-to-many relationship in the database, and navigation properties can map this relationship in your object model:
public class Restaurant
{
    [Key]
    public int RestaurantId { get; set; }
    // ... other fields

    public virtual ICollection<Dish> Dishes { get; set; } = new List<Dish>();
}

public class Dish
{
    [Key]
    public int DishId { get; set; }

    //[ForeignKey(nameof(Restaurant))]
    //public int RestaurantId { get; set; }

    public virtual Restaurant Restaurant { get; set; }
}
The FK property for the Restaurant ID is optional and can be configured as a Shadow Property (one that EF knows about and generates, but that isn't exposed on the entity). I recommend using shadow properties for FKs mainly to avoid two sources of truth for a relationship (dish.RestaurantId and dish.Restaurant.RestaurantId): changing the FK does not automatically update the relationship unless you reload the entity, and updating the relationship does not automatically update the FK until you call SaveChanges.
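If you go the shadow-property route, the mapping is a one-liner in the fluent API; a minimal sketch, assuming EF Core and the entities above:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // "RestaurantId" becomes a shadow property: EF creates and tracks the FK column,
    // but the Dish entity does not expose it
    modelBuilder.Entity<Dish>()
        .HasOne(d => d.Restaurant)
        .WithMany(r => r.Dishes)
        .HasForeignKey("RestaurantId");
}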
Now if you wanted to get a particular dish and its associated restaurant:
var dish = _context.Dishes
    .Include(d => d.Restaurant)
    .Single(d => d.DishId == dishId);
This fetches both entities. Note that there is no need now to manually write Joins like you would with SQL. EF supports Join, but it should only be used in very rare cases where a schema isn't properly normalized/relational and you need to map loosely joined entities/tables. (Such as a table using an "OwnerId" that could join to a "This" or a "That" table based on a discriminator such as OwnerType.)
If you leave off the .Include(d => d.Restaurant) and have lazy loading enabled on the DbContext, then EF will attempt to load the Restaurant automatically if and when the code first accesses dish.Restaurant. This provides a safety net, but can incur steep performance penalties in many cases, so treat it as a safety net, not a crutch.
Eager loading works well when dealing with single entities and their related data, where you will need to do things with those relationships: for instance, loading a Restaurant to review and add/remove dishes, or loading a Dish and possibly changing its Restaurant. However, eager loading can come at a significant cost in how EF and SQL provide that related data behind the scenes.
By default when you use Include, EF will add an INNER or LEFT JOIN between the associated tables. This creates a Cartesian product between the involved tables. If you have 100 restaurants averaging 30 dishes each and select all 100 restaurants while eager loading their dishes, the resulting query returns 3,000 rows. Now if a Dish has something like Reviews, averaging 5 reviews per dish, and you eager load Dishes and Reviews, the result set contains every column across all three tables and 15,000 rows in total. You can hopefully appreciate how this can grow out of hand pretty fast.
EF then goes through that Cartesian product and populates the associated entities in the object graph. This can lead to questions about why "my query runs fast in SSMS but slow in EF", since EF can have a lot of work to do, especially if it has been tracking references from restaurants, dishes, and/or reviews to scan through and provide. Later versions of EF can mitigate this a bit with query splitting: instead of JOINs, EF fetches the related data using multiple separate SELECT statements, which can execute and be processed a fair bit faster, but it still amounts to a lot of data going over the wire and needing memory to materialize.
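In EF Core 5 and later, that split-query behaviour is an explicit opt-in; a sketch of the standard API, not code from the question:

var restaurants = _context.Restaurants
    .Include(r => r.Dishes)
        .ThenInclude(d => d.Reviews)
    .AsSplitQuery() // separate SELECTs instead of one big JOIN / Cartesian product
    .ToList();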
Most of the time, though, you won't need ALL rows, or ALL columns, for each and every related entity. This is where projection comes in, using Select. Say that when we pull back our list of restaurants, we want to list the restaurants in a given city along with their top 5 dishes based on user reviews. We only need the RestaurantId and Name to display in these results, along with the Dish name and the number of positive reviews. Instead of loading every column from every table, we can define view models for Restaurants and Dishes for this summary view, and project the entities onto them:
public class RestaurantSummaryViewModel
{
    public int RestaurantId { get; set; }
    public string Name { get; set; }
    public ICollection<DishSummaryViewModel> Top5Dishes { get; set; } = new List<DishSummaryViewModel>();
}

public class DishSummaryViewModel
{
    public string Name { get; set; }
    public int PositiveReviewCount { get; set; }
}

var restaurants = _context.Restaurants
    .Where(r => r.City.CityId == cityId)
    .OrderBy(r => r.Name)
    .Select(r => new RestaurantSummaryViewModel
    {
        RestaurantId = r.RestaurantId,
        Name = r.Name,
        Top5Dishes = r.Dishes
            .OrderByDescending(d => d.Reviews.Where(rv => rv.Score > 3).Count())
            .Select(d => new DishSummaryViewModel
            {
                Name = d.Name,
                PositiveReviewCount = d.Reviews.Where(rv => rv.Score > 3).Count()
            })
            .Take(5)
            .ToList() // no semicolon here: this is a property assignment inside the initializer
    })
    .ToList();
Notice that the above Linq example doesn't use Join or even Include. Provided you follow a basic set of rules so that EF can translate what you want to project down to SQL, you can accomplish a fair bit and produce far more efficient queries. The above statement generates SQL that runs across the related tables but returns only the fields needed to populate the desired view models. This lets you tune indexes based on the data that is most commonly needed, and it reduces the amount of data going across the wire, plus memory usage on both the DB and app servers. Libraries like AutoMapper and its ProjectTo method can simplify the above even further: you configure once how to select into the desired view model, then replace that whole Select( ... ) with just ProjectTo<RestaurantSummaryViewModel>(config), where "config" is a reference to the AutoMapper configuration that resolves how to turn Restaurants and their associated entities into the desired view model(s).
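With AutoMapper wired up, the replacement looks roughly like this (a sketch; the mapping profile behind "config" is assumed to be defined elsewhere):

using AutoMapper.QueryableExtensions;

var restaurants = _context.Restaurants
    .Where(r => r.City.CityId == cityId)
    .OrderBy(r => r.Name)
    .ProjectTo<RestaurantSummaryViewModel>(config) // config: AutoMapper IConfigurationProvider
    .ToList();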
In any case, this should give you some avenues to explore with EF and what it can bring to the table to produce (hopefully :) easy-to-understand and efficient query expressions.

EF Core without additional repository pattern

There seems to be a certain movement advocating that when we use EF Core we should avoid creating Repository & Unit of Work patterns, because EF Core already implements both and we can leverage that implicit implementation. That would be great, because implementing those patterns is not always as straightforward as it would seem.
So here's the problem. When implementing a repository the 'classic' way, we have a place to put the code that builds our domain objects. Let me explain with an example: we have Invoice and InvoiceRow entities, and each Invoice has many InvoiceRows. I included only the navigational property, for brevity.
public class Invoice
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public decimal Total { get; set; }
    public List<InvoiceRow> InvoiceRows { get; }
}

public class InvoiceRow
{
    public int Id { get; set; }
    public Invoice Invoice { get; set; }
    public decimal UnitPrice { get; set; }
    public decimal RowPrice { get; set; }
}
Now, my business object is an Invoice with its rows, and this should be the only way to manipulate the invoices.
When using an 'explicit' repository we would do something like:
public class InvoicesRepo
{
    public AppDbContext AppDbContext { get; private set; }

    public Invoice Find(int id)
    {
        return AppDbContext.Invoices
            .Where(invoice => invoice.Id == id)
            .Include(nameof(Invoice.InvoiceRows)) // include the rows collection, not the row type name
            .First();
    }
}
This restricts access to the Invoice to the method InvoicesRepo.Find(id), which builds the invoice in the way expected by the domain logic code.
Is it possible to achieve this with bare EF Core? Maybe working with the visibility of DbSets and/or additional features that I don't know about? Since this seems to be quite fundamental functionality for a full-blown repository, if it's not achievable, have I just destroyed the main argument of the experts advocating no (additional) repository when using EF Core?
Is it possible to achieve this with bare EF Core? Maybe working with the visibility of DbSets and/or additional features that I don't know about?
Sure. Accepting that the DbContext is your repository doesn't mean you can't make design decisions, or that you have to use the default DbContext design.
You can add reusable data-access code to your DbContext for convenience and consistency, e.g. methods like:
public Invoice FindInvoice(int id)
{
    return this.Invoices
        .Where(invoice => invoice.Id == id)
        .Include(nameof(Invoice.InvoiceRows))
        .First();
}
So code that needs the standard shape of an Invoice with its InvoiceRows calls this method, while code that needs some nonstandard shape can still access the DbSets or IQueryable methods and construct a custom query.
You can even eliminate the DbSet properties, to more strongly guide users to use your custom methods, like:
public IQueryable<Invoice> Invoices => this.Set<Invoice>().Include(nameof(Invoice.InvoiceRows)); // Set<Invoice>() avoids the property referencing itself
Then to get Invoices without InvoiceRows a consumer would either add a custom projection to this, something like
db.Invoices.Where(i => i.CustomerID == custId).Select(i => new InvoiceDTO(i)).ToList();
or access the DbSet
var invoice = db.Set<Invoice>().Find(invoiceId);
And you can organize the methods on your DbContext by having it implement various interfaces.
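For example, a minimal sketch of that idea (the interface name is made up):

// consumers can depend on a narrow query interface instead of the whole context
public interface IInvoiceQueries
{
    Invoice FindInvoice(int id);
}

public class AppDbContext : DbContext, IInvoiceQueries
{
    // ... DbSets and the FindInvoice method shown above
}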
Ok, this is just a preliminary answer based on many readings all around.
public class InvoiceQuery
{
    public AppDbContext AppDbContext { get; private set; }
    public int Id { get; set; }

    // the context has to come from somewhere; a constructor is the simplest way in
    public InvoiceQuery(AppDbContext appDbContext)
    {
        AppDbContext = appDbContext;
    }

    public Invoice Execute()
    {
        return AppDbContext
            .Invoices
            .Include(nameof(Invoice.InvoiceRows))
            .Where(invoice => invoice.Id == Id)
            .FirstOrDefault();
    }
}
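Usage is then a one-liner (a sketch, assuming the constructor above):

var invoice = new InvoiceQuery(appDbContext) { Id = 42 }.Execute();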
My problem was that this is not substantially different from what we would have in a repository. That's from a practical point of view; from a theoretical point of view, it puts you in the right perspective, while the repository is somewhat misleading.
The reason it's misleading is that there is no 1-1 association between entities and the actions or queries you can run against the database. Even in this case, Invoice is not just an invoice: it is Invoice plus InvoiceRows. (By the way, I think InvoiceQuery is a good name (and not InvoiceWithRowsQuery), because from a business-logic point of view an invoice is a fully-loaded invoice; an invoice with 0 rows is an empty invoice, not a partially-loaded one.)
So a query focuses on what you get, not on the entity you start from, because there can be more than one.
This "query" name is somewhat counter-intuitive, because one would say that as you move towards the business logic you stop seeing things like queries and start seeing things like parameters. And we do have parameters: "query" is only the name of the container. Maybe we should call it a Business Object Query; that would be too long, but that's the meaning. So this query object is just a simple container for an EF query; I will talk about this later on. In this class-based implementation, besides being glorified as an object, there is no additional functionality. Maybe you would expect some additional functionality; instead you have something less. The missing thing is that queries are no longer associated with an entity. This is another counter-intuitive fact that shows we are dismantling something: the contrived entity-repository view.
Having such a simple object poses a question: why not a method on some related class? If we put it on the entity, we go back to the repository-like organization, which seems flawed mainly because the missing 1-1 mapping means that in many cases you couldn't tell which entity to associate a query with. But some authors use a sort of 'container class' that is really no more than a container, except that it isn't tied to an entity (OrdersData, for example). If, like me, you like classes with a clear purpose, this sends shivers down your spine. We started out saying that we have a solution better than the repository, that the repository has so many problems; we end up with a class that has "Data" in its name. How could a better solution be so lame? It feels like the repository alternative is just a bunch of stuff, not as steady and well designed as expected. On the other hand, this is just a way to collect queries (the same queries that you would have in a repository, the very same queries that you would have in a query/command pattern...) under a name that is not that of an entity.
Yes, all this seems to boil down to a naming choice. Advocates of the repository pattern and advocates of no-repository fiercely fight each other; then you look at the code and what it actually does, and the code is the same: the names are all we are really arguing about. That is, if you look at what repositorists do in their repositories and what non-repositorists do in their (not always clearly named) classes, they do exactly the same things. But this is just an impression; I have yet to time-proof it.
In the repository model, having a repository per entity, with its methods, was reassuring. It provided a sort of scaffolding for all of the methods. Unfortunately that scaffolding seems too limiting, again because of the non-1-1 relation between entities and queries/commands. Yet we have to say that root entities like Invoice (in DDD terms, Invoice is the aggregate root and InvoiceRows belong to its aggregate) would fit quite well in a repository-style organization.
For this reason, in this query-based solution the flatness of the query collection is not fully satisfying, and that is another reason why I want to elaborate further on this topic.

Entity Framework 6 code first: what is the best implementation for a base object with 10 child objects?

We have a base object with 10 child objects in EF6 code first.
Of those 10 child objects, 5 have only a few (extra) properties, and 5 have multiple properties (5 to 20).
We implemented this as table-per-type, so we have one table for the base and one per child (10 in total).
This, however, creates HUGE select queries, with case statements and unions all over the place, which also takes EF 6 seconds to generate (the first time).
I read about this issue, and the same problem holds in the table-per-concrete-type scenario.
So what we are left with is table-per-hierarchy, but that creates a table with a very large number of properties, which doesn't sound great either.
Is there another solution for this?
I thought about maybe skipping the inheritance and creating a union view for when I want to get all the items from all the child objects/records.
Any other thoughts?
Another solution would be to implement some kind of CQRS pattern where you have separate databases for writing (command) and reading (query). You could even de-normalize the data in the read database so it is very fast.
Assuming you need at least one normalized model with referential integrity, I think your decision really comes down to Table per Hierarchy and Table per Type. TPH is reported by Alex James from the EF team and more recently on Microsoft's Data Development site to have better performance.
Advantages of TPT and why they're not as important as performance:
Greater flexibility, which means the ability to add types without affecting any existing table. Not too much of a concern because EF migrations make it trivial to generate the required SQL to update existing databases without affecting data.
Database validation on account of having fewer nullable fields. Not a massive concern because EF validates data according to the application model. If data is being added by other means it is not too difficult to run a background script to validate data. Also, TPT and TPC are actually worse for validation when it comes to primary keys because two sub-class tables could potentially contain the same primary key. You are left with the problem of validation by other means.
Storage space is reduced on account of not needing to store all the null fields. This is only a very trivial concern, especially if the DBMS has a good strategy for handling 'sparse' columns.
Design and gut-feel. Having one very large table does feel a bit wrong, but that is probably because most db designers have spent many hours normalizing data and drawing ERDs. Having one large table seems to go against the basic principles of database design. This is probably the biggest barrier to TPH. See this article for a particularly impassioned argument.
That article summarizes the core argument against TPH as:
It's not normalized even in a trivial sense, it makes it impossible to enforce integrity on the data, and what's most "awesome:" it is virtually guaranteed to perform badly at a large scale for any non-trivial set of data.
These are mostly wrong. Performance and integrity are addressed above, and TPH does not necessarily mean denormalized: there are just many (nullable) foreign key columns that are self-referential. So we can go on designing and normalizing the data exactly as we would with a TPT structure. In a current database I have many relationships between sub-types, and I have created an ERD as if it were a TPT inheritance structure; this actually reflects the implementation in code-first Entity Framework. For example, here is my Expenditure class, which inherits from Relationship, which inherits from Content:
public class Expenditure : Relationship
{
    /// <summary>
    /// Inherits from Content: Id, Handle, Description, Parent (is context of expenditure and usually
    /// a Project)
    /// Inherits from Relationship: Source (the Principal), SourceId, Target (the Supplier), TargetId,
    /// </summary>
    [Required, InverseProperty("Expenditures"), ForeignKey("ProductId")]
    public Product Product { get; set; }
    public Guid ProductId { get; set; }

    public string Unit { get; set; }
    public double Qty { get; set; }
    public string Currency { get; set; }
    public double TotalCost { get; set; }
}
The InversePropertyAttribute and the ForeignKeyAttribute provide EF with the information required to make the required self joins in the single database.
The Product type also maps to the same table (it also inherits from Content). Each Product has its own row in the table, and rows that contain Expenditures include data in the ProductId column, which is null for rows of all other types. So the data is normalized, just placed in a single table.
The beauty of using EF code first is that we design the database in exactly the same way, and we implement it in (almost) exactly the same way, regardless of using TPH or TPT. To change the implementation from TPH to TPT we simply add an annotation to each sub-class, mapping it to a new table. So the good news is that it doesn't really matter which one you choose: just build it, generate a stack of test data, test it, change strategy, and test it again. I reckon you'll find TPH the winner.
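Concretely, in EF6 code first the switch is just a mapping annotation on each sub-class; a minimal sketch, assuming the Expenditure class above:

// with no Table attribute the hierarchy maps to one table (TPH);
// mapping each sub-class to its own table switches the model to TPT
[Table("Expenditures")]
public class Expenditure : Relationship
{
    // ... properties as above
}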
Having experienced similar problems myself, I have a few suggestions. I'm also open to improvements on these, as it's a complex topic and I don't have it all worked out.
Entity Framework can be very slow when dealing with non-trivial queries on complex entities, i.e. those with multiple levels of child collections. In some performance tests I've tried, it sits there an awfully long time compiling the query. In theory EF 5 and onwards should cache compiled queries (even if the context gets disposed and re-instantiated) without you having to do anything, but I'm not convinced this is always the case.
I've read suggestions that for a complex database you should create multiple DbContexts, each containing only a smaller subset of your database entities. If this is practical for you, give it a try! But I imagine there would be maintenance issues with that approach.
1) I know this is obvious, but worth saying anyway: make sure you have the right foreign keys set up in your database for related entities. Entity Framework will then keep track of these relationships and be much quicker at generating queries where you need to join on the foreign key.
2) Don't retrieve more than you need. One-size-fits-all methods to get a complex object are rarely optimal. Say you are getting a list of base objects (to put in a list) and you only need to display their name and ID. Retrieve only the base object: any navigation properties that aren't specifically needed should not be retrieved.
3) If the child objects are not collections, or they are collections but you only need one item (or an aggregate value such as the count) from them, I would absolutely implement a View in the database and query that instead. It is MUCH quicker. EF doesn't have to do any work; it's all done in the database, which is better equipped for this type of operation.
4) Be careful with .Include(); this goes back to point 2 above. If you are getting a single object plus a child collection property, you are best off not using .Include(), so that retrieving the child collection is done as a separate query (avoiding a result set that repeats all the base object's columns for every child row). A sketch of points 2 and 4 follows.
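(The entity and property names here are illustrative, not from the question.)

// load only the base object...
var parent = db.BaseObjects.Single(b => b.Id == id);

// ...then pull just the child columns needed, as a separate query,
// avoiding a JOIN that repeats the base columns on every child row
var children = db.ChildItems
    .Where(c => c.BaseObjectId == id)
    .Select(c => new { c.Id, c.Name })
    .ToList();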
EDIT
Following the comments, here are some further thoughts.
As we are dealing with an inheritance hierarchy, it makes logical sense to store separate tables for the additional properties of the inheriting classes, plus a table for the base class. How to make Entity Framework perform well here, though, is still up for debate.
I've used EF for a similar scenario (but with fewer children), database-first, but in this case I didn't use the Entity-Framework-generated classes as the business objects. The EF objects related directly to the DB tables.
I created separate business classes for the base and inheriting classes, and a set of mappers that convert to them. A query would look something like:
public static List<BaseClass> GetAllItems()
{
    using (var db = new MyDbEntities())
    {
        var q1 = db.InheritedClass1.Include("BaseClass").ToList()
            .ConvertAll(x => (BaseClass)InheritedClass1Mapper.MapFromContext(x));
        var q2 = db.InheritedClass2.Include("BaseClass").ToList()
            .ConvertAll(x => (BaseClass)InheritedClass2Mapper.MapFromContext(x));
        return q1.Union(q2).ToList();
    }
}
Not saying this is the best approach, but it might be a starting point?
The queries are certainly quick to compile in this case!
Comments welcome!
With Table per Hierarchy you end up with only one table, so obviously your CRUD operations will be faster, and this table is abstracted away by your domain layer anyway. The disadvantage is that you lose the ability to use NOT NULL constraints, so this needs to be handled properly by your business layer to avoid potential data-integrity issues. Also, adding or removing entities means that the table changes; but that's also manageable.
With Table per Type you have the problem that the more classes in the hierarchy you have, the slower your CRUD operations become.
All in all, as performance is probably the most important consideration here and you have a lot of classes, I think Table per Hierarchy is the winner, in terms of both performance and simplicity, taking your number of classes into account.
Also look at this article, specifically chapter 7.1.1 (Avoiding TPT in Model First or Code First applications), where they state: "when creating an application using Model First or Code First, you should avoid TPT inheritance for performance concerns."
The EF6 CodeFirst model I'm working on uses generics and an abstract base class called "BaseEntity". I also use generics and a base class for the EntityTypeConfiguration class.
In the event that I need to reuse a couple of property "columns" on some tables, and it doesn't make sense for them to be on BaseEntity or BaseEntityWithMetaData, I make an interface for them.
E.g. I have one for addresses I haven't finished yet. So if an entity has address information it will implement IAddressInfo. Casting an entity to IAddressInfo will give me an object with just the AddressInfo on it.
Originally I had my metadata columns in their own table. But, like others have mentioned, the queries were horrendous, and it was slower than slow. So I thought: why not just use multiple inheritance paths to support what I want to do, so the columns are on every table that needs them and not on the ones that don't? Also, I am using MySQL, which has a column limit of 4096; SQL Server 2008 has 1024. Even at 1024, I don't see realistic scenarios for going over that on one table.
And none of my objects inherit in such a way that they have columns they don't need. When that need arises, I create a new base class at a level that prevents the extra columns.
Here are enough snippets from my code to understand how my inheritance is set up. So far it works really well for me, and I haven't yet hit a scenario I couldn't model with this setup.
public class BaseEntityConfig<T> : EntityTypeConfiguration<T> where T : BaseEntity<T>, new()
{
}

public class BaseEntity<T> where T : BaseEntity<T>, new()
{
    //shared properties here
}

public class BaseEntityWithMetaDataConfig<T> : BaseEntityConfig<T> where T : BaseEntityWithMetaData<T>, new()
{
    public BaseEntityWithMetaDataConfig()
    {
        this.HasOptional(e => e.RecCreatedBy).WithMany().HasForeignKey(p => p.RecCreatedByUserId);
        this.HasOptional(e => e.RecLastModifiedBy).WithMany().HasForeignKey(p => p.RecLastModifiedByUserId);
    }
}

public class BaseEntityWithMetaData<T> : BaseEntity<T> where T : BaseEntityWithMetaData<T>, new()
{
    #region Entity Properties
    public DateTime? DateRecCreated { get; set; }
    public DateTime? DateRecModified { get; set; }
    public long? RecCreatedByUserId { get; set; }
    public virtual User RecCreatedBy { get; set; }
    public virtual User RecLastModifiedBy { get; set; }
    public long? RecLastModifiedByUserId { get; set; }
    public DateTime? RecDateDeleted { get; set; }
    #endregion
}
public class PersonConfig : BaseEntityWithMetaDataConfig<Person>
{
    public PersonConfig()
    {
        this.ToTable("people");
        this.HasKey(e => e.PersonId);
        this.HasOptional(e => e.User).WithRequired(p => p.Person).WillCascadeOnDelete(true);
        this.HasOptional(p => p.Employee).WithRequired(p => p.Person).WillCascadeOnDelete(true);
        this.HasMany(e => e.EmailAddresses).WithRequired(p => p.Person).WillCascadeOnDelete(true);
        this.Property(e => e.FirstName).IsRequired().HasMaxLength(128);
        this.Property(e => e.MiddleName).IsOptional().HasMaxLength(128);
        this.Property(e => e.LastName).IsRequired().HasMaxLength(128);
    }
}
//I have to use this pattern to allow other classes to inherit from Person; they have to inherit from BasePerson<T>
public class Person : BasePerson<Person>
{
    //Just a dummy class to expose BasePerson as it is.
}
public class BasePerson<T> : BaseEntityWithMetaData<T> where T : BasePerson<T>, new()
{
    #region Entity Properties
    public long PersonId { get; set; }
    public virtual User User { get; set; }
    public string FirstName { get; set; }
    public string MiddleName { get; set; }
    public string LastName { get; set; }
    public virtual Employee Employee { get; set; }
    public virtual ICollection<PersonEmail> EmailAddresses { get; set; }
    #endregion

    #region Entity Helper Properties
    [NotMapped]
    public PersonEmail PrimaryPersonalEmail
    {
        get
        {
            PersonEmail ret = null;
            if (this.EmailAddresses != null)
                ret = (from e in this.EmailAddresses where e.EmailAddressType == EmailAddressType.Personal_Primary select e).FirstOrDefault();
            return ret;
        }
    }

    [NotMapped]
    public PersonEmail PrimaryWorkEmail
    {
        get
        {
            PersonEmail ret = null;
            if (this.EmailAddresses != null)
                ret = (from e in this.EmailAddresses where e.EmailAddressType == EmailAddressType.Work_Primary select e).FirstOrDefault();
            return ret;
        }
    }

    private string _DefaultEmailAddress = null;
    [NotMapped]
    public string DefaultEmailAddress
    {
        get
        {
            if (string.IsNullOrEmpty(_DefaultEmailAddress))
            {
                PersonEmail personalEmail = this.PrimaryPersonalEmail;
                if (personalEmail != null && !string.IsNullOrEmpty(personalEmail.EmailAddress))
                    _DefaultEmailAddress = personalEmail.EmailAddress;
                else
                {
                    PersonEmail workEmail = this.PrimaryWorkEmail;
                    if (workEmail != null && !string.IsNullOrEmpty(workEmail.EmailAddress))
                        _DefaultEmailAddress = workEmail.EmailAddress;
                }
            }
            return _DefaultEmailAddress;
        }
    }
    #endregion

    #region Constructor
    static BasePerson()
    {
    }

    public BasePerson()
    {
        this.User = null;
        this.EmailAddresses = new HashSet<PersonEmail>();
    }

    public BasePerson(string firstName, string lastName)
    {
        this.FirstName = firstName;
        this.LastName = lastName;
    }
    #endregion
}
Now, the code in the context's OnModelCreating looks like this:
//Config
modelBuilder.Conventions.Remove<PluralizingTableNameConvention>();

//Initialize configuration. Each line tells Entity Framework how to create the relationships
//between the different tables in the database: table names, foreign key constraints,
//unique constraints, all relations, etc.
modelBuilder.Configurations.Add(new PersonConfig());
modelBuilder.Configurations.Add(new PersonEmailConfig());
modelBuilder.Configurations.Add(new UserConfig());
modelBuilder.Configurations.Add(new LoginSessionConfig());
modelBuilder.Configurations.Add(new AccountConfig());
modelBuilder.Configurations.Add(new EmployeeConfig());
modelBuilder.Configurations.Add(new ContactConfig());
modelBuilder.Configurations.Add(new ConfigEntryCategoryConfig());
modelBuilder.Configurations.Add(new ConfigEntryConfig());
modelBuilder.Configurations.Add(new SecurityQuestionConfig());
modelBuilder.Configurations.Add(new SecurityQuestionAnswerConfig());
The reason I created base classes for the configuration of my entities is that when I started down this path, I ran into an annoying problem: I had to configure the shared properties for every derived class over and over again, and if I updated one of the fluent API mappings, I had to update code in every derived class.
But by using this inheritance method on the configuration classes, the two properties are configured in one place and inherited by the configuration class for each derived entity.
So when PersonConfig is configured, it runs the logic on the BaseEntityWithMetaDataConfig class to configure the two properties, and again when UserConfig runs, etc.
The three approaches have different names in M. Fowler's language:
Single Table Inheritance: the whole inheritance hierarchy held in one table. No joins, optional columns for child types. You need a discriminator to distinguish which child type a row is.
Concrete Table Inheritance: one table for each concrete type. Joins, no optional columns. Here, the base type table is needed only if the base type requires its own mapping (i.e. instances of it can be created).
Class Table Inheritance: a base type table, plus child tables that each add only the columns beyond the base's. Joins, no optional columns. Here, the base type table always contains a row for each child; however, you can retrieve the common columns only when no child-specific columns are needed (the rest could come with lazy loading, perhaps).
All three approaches are workable; it only depends on the amount and structure of the data you have, so measure the performance differences first.
The choice comes down to the number of joins vs. data distribution vs. optional columns.
If you don't have (and aren't going to have) many child types, I would go with Class Table Inheritance, since it stays close to the domain and is easy to translate/map.
If you have many child tables to work with at the same time, and anticipate a bottleneck in joins, go with Single Table Inheritance.
If joins are not needed at all and you are going to work with one concrete type at a time, go with Concrete Table Inheritance.
Although Table per Hierarchy (TPH) is the better approach for fast CRUD operations, in that case a single table with that many properties is unavoidable in the resulting database. The case and union clauses you mentioned are created because the resulting query effectively requests a polymorphic result set that includes multiple types.
However, when EF returns the flattened table that includes the data for all the types, it does extra work to ensure that null values are returned for columns that may be irrelevant for a particular type. Technically, this extra validation using case and union is not necessary.
The issue below is a performance glitch in Microsoft EF6, and they are aiming to deliver a fix in a future release.
The below query:
SELECT
    [Extent1].[CustomerId] AS [CustomerId],
    [Extent1].[Name] AS [Name],
    [Extent1].[Address] AS [Address],
    [Extent1].[City] AS [City],
    CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[State] END AS [C2],
    CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[Zip] END AS [C3]
FROM [dbo].[Customers] AS [Extent1]
can be safely replaced by:
SELECT
    [Extent1].[CustomerId] AS [CustomerId],
    [Extent1].[Name] AS [Name],
    [Extent1].[Address] AS [Address],
    [Extent1].[City] AS [City],
    [UnionAll1].[State] AS [C2],
    [UnionAll1].[Zip] AS [C3]
FROM [dbo].[Customers] AS [Extent1]
So, you have just seen the problem and the flaw in the current release of Entity Framework 6; you have the option to use either a Model First approach or a TPH approach.

How to insert an ObservableCollection property to a local sqlite-net db?

I have a quick question about the sqlite-net library, which can be found here: https://github.com/praeclarum/sqlite-net.
The thing is, I have no idea how collections and custom objects will be inserted into the database, nor how to convert them back when querying, if needed.
Take this model for example:
public class Subject // class declaration assumed; the original snippet omitted it
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }

    private string _name; // The name of the subject, e.g. "Physics"
    private ObservableCollection<Lesson> _lessons;
}
Preface: I've not used sqlite-net; rather, I spent some time reviewing the source code at the GitHub link posted in the question.
From the first page of the sqlite-net GitHub site, there are two bullet points that should help with a high-level understanding:
Very simple methods for executing CRUD operations and queries safely (using parameters) and for retrieving the results of those queries in a strongly typed fashion
In other words, sqlite-net will work well with non-complex models; will probably work best with flattened models.
Works with your data model without forcing you to change your classes. (Contains a small reflection-driven ORM layer.)
In other words, sqlite-net will transform/map the result set of the SQL query to your model; again, will probably work best with flattened models.
Looking at the primary source code of SQLite.cs, there is an InsertAll method and a few overloads that will insert a collection.
When querying for data, you should be able to use the Get<T> method and the Table<T> method, and there is also a Query<T> method you could look at as well. Each should map the results to the type parameter.
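A quick sketch of those calls (method names as they appear in the sqlite-net source; the simplified Lesson shape with an Id primary key is my assumption):

var db = new SQLiteConnection("subjects.db");
db.CreateTable<Lesson>();      // builds the table from the model via reflection
db.InsertAll(lessons);         // inserts a collection within one transaction
var one = db.Get<Lesson>(1);   // lookup by primary key
var some = db.Table<Lesson>()  // strongly typed table query
    .Where(l => l.Id > 5)
    .ToList();
var raw = db.Query<Lesson>("select * from Lesson where Name like ?", "%physics%");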
Finally, take a look at the examples and tests for a more in-depth look at using the framework.
I've worked quite a bit with SQLite-net in the past few months (including this presentation yesterday)
how collections, and custom objects will be inserted into the database
I think the answer is they won't.
While it is a very capable database and ORM, SQLite-net targets lightweight mobile apps. Because of this lightweight focus, the classes used are generally very simple, flattened objects like:
public class Course
{
    public int CourseId { get; set; }
    public string Name { get; set; }
}

public class Lesson
{
    public int LessonId { get; set; }
    public string Name { get; set; }
    public int CourseId { get; set; }
}
If you then need to join these back together, and to handle insertion and deletion of related objects, that's down to you, the app developer, to handle; there's no auto-tracking of related objects like there is in a larger, more complicated ORM stack.
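In practice that by-hand handling is short; a sketch using the classes above (assuming CourseId is marked as the primary key, and db is an open SQLiteConnection):

// two simple queries; the "join" is done by hand in memory
var course = db.Get<Course>(courseId);
var lessons = db.Table<Lesson>()
    .Where(l => l.CourseId == courseId)
    .ToList();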
In practice, I've not found this a problem. I find SQLite-net very useful in my mobile apps.

Is this Repository pattern efficient with LINQ-to-SQL?

I'm currently reading the book Pro Asp.Net MVC Framework. In the book, the author suggests using a repository pattern similar to the following.
[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true,
            IsDbGenerated = true,
            AutoSync = AutoSync.OnInsert)]
    public int ProductId { get; set; }

    [Column] public string Name { get; set; }
    [Column] public string Description { get; set; }
    [Column] public decimal Price { get; set; }
    [Column] public string Category { get; set; }
}
public interface IProductsRepository
{
    IQueryable<Product> Products { get; }
}

public class SqlProductsRepository : IProductsRepository
{
    private Table<Product> productsTable;

    public SqlProductsRepository(string connectionString)
    {
        productsTable = new DataContext(connectionString).GetTable<Product>();
    }

    public IQueryable<Product> Products
    {
        get { return productsTable; }
    }
}
Data is then accessed in the following manner:
public ViewResult List(string category)
{
    var productsInCategory = (category == null)
        ? productsRepository.Products
        : productsRepository.Products.Where(p => p.Category == category);
    return View(productsInCategory);
}
Is this an efficient means of accessing data? Is the entire table going to be retrieved from the database and filtered in memory or is the chained Where() method going to cause some LINQ magic to create an optimized query based on the lambda?
Finally, what other implementations of the Repository pattern in C# might provide better performance when hooked up via LINQ-to-SQL?
I can understand Johannes' desire to control the execution of the SQL more tightly, and with the implementation of what I sometimes call 'lazy anchor points' I have been able to do that in my app.
I use a combination of custom LazyList<T> and LazyItem<T> classes that encapsulate lazy initialization:
LazyList<T> wraps the IQueryable functionality of an IList collection but maximises some of LinqToSql's deferred-execution functions, and
LazyItem<T> wraps a lazy invocation of a single item using the LinqToSql IQueryable, or a generic Func<T> method for executing other code deferred.
Here is an example: I have this model object Announcement, which may have an attached image or PDF document:
public class Announcement : //..
{
    public int ID { get; set; }
    public string Title { get; set; }
    public AnnouncementCategory Category { get; set; }
    public string Body { get; set; }
    public LazyItem<Image> Image { get; set; }
    public LazyItem<PdfDoc> PdfDoc { get; set; }
}
The Image and PdfDoc classes inherit from a type File that contains the byte[] holding the binary data. This binary data is heavy, and I might not always need it returned from the DB every time I want an Announcement, so I want to keep my object graph 'anchored' but not 'populated' (if you like).
So if I do something like this:
Console.WriteLine(anAnnouncement.Title);
..I can do so knowing that I have only loaded from the db the data for the immediate Announcement object. But if on the following line I need to do this:
Console.WriteLine(anAnnouncement.Image.Inner.Width);
..I can be sure that the LazyItem<T> knows how to go and get the rest of the data.
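The real classes are linked at the end of this answer; the gist of LazyItem<T> is roughly this (a minimal sketch, not the full implementation):

// wraps the deferred load of a single related item
public class LazyItem<T> where T : class
{
    private readonly Func<T> _loader;
    private T _inner;

    public LazyItem(Func<T> loader)
    {
        _loader = loader;
    }

    // first access runs the loader (e.g. a LinqToSql query); later accesses reuse the result
    public T Inner => _inner ?? (_inner = _loader());
}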
Another great benefit is that these 'lazy' classes can hide the particular implementation of the underlying repository, so I don't necessarily have to use LinqToSql. I am using LinqToSql in the app I'm cutting these examples from, but it would be easy to plug in another data source (or even a completely different data layer that perhaps does not use the Repository pattern).
LINQ but not LinqToSql
You will find that sometimes you want to do some fancy LINQ query that happens to barf when execution flows down to the LinqToSql provider. That is because LinqToSql works by translating the effective LINQ query logic into T-SQL code, and that is not always possible.
For example, I have this function that I want an IQueryable result from:
private IQueryable<Event> GetLatestSortedEvents()
{
    // TODO: WARNING: HEAVY SQL QUERY! fix
    return this.GetSortedEvents().ToList()
        .Where(ModelExtensions.Event.IsUpcomingEvent())
        .AsQueryable();
}
Why that code does not translate to SQL is not important; just believe me that the conditions in that IsUpcomingEvent() predicate involve a number of DateTime comparisons that are simply far too complicated for LinqToSql to convert to T-SQL.
By using .ToList(), then the condition (.Where(..)), and then .AsQueryable(), I'm effectively telling LinqToSql that I need all of the .GetSortedEvents() items even though I'm then going to filter them. This is an instance where my filter expression will not render to SQL correctly, so I need to filter in memory. This is what I might call the limit of LinqToSql's performance as far as deferred execution and lazy loading go, but I only have a small number of these WARNING: HEAVY SQL QUERY! blocks in my app, and I think further smart refactoring could eliminate them completely.
Finally, LinqToSql can make a fine data-access provider in large apps if you want it to. I found that to get the results I want, and to abstract away and isolate certain things, I needed to add code here and there. And where I wanted more control over the actual SQL performance, I added smarts to get the desired results. So IMHO LinqToSql is perfectly OK for heavy apps that need db query optimization, provided you understand how LinqToSql works. My design was originally based on Rob's Storefront tutorial, so you might find it useful if you need more explanation about my rants above.
And if you want to use those lazy classes above, you can get them here and here.
Is this an efficient means of accessing data? Is the entire table going to be retrieved from the database and filtered in memory, or is the chained Where() method going to cause some LINQ magic to create an optimized query based on the lambda?
It is efficient, if you wish to say so. The repository exposes an IQueryable interface, which basically represents any LINQ data provider (in this case Linq2Sql).
Queries are executed the moment you start iterating over the result.
IQueryable therefore supports query composition: you can add any .Where(), .GroupBy() or .OrderBy() call to a query and it will be satisfied by the database.
If you put an enumeration into the query, such as .ToList(), everything after that point happens in memory (LinqToObjects).
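For example (a sketch; repository is the IProductsRepository from the question):

// composed: one SQL statement including the WHERE, executed when ToList() enumerates
var cheap = repository.Products
    .Where(p => p.Price < 10m)
    .ToList();

// enumerated too early: ToList() pulls the whole table and the Where runs in memory
var cheapInMemory = repository.Products
    .ToList()
    .Where(p => p.Price < 10m)
    .ToList();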
But I think the repository implementation is useless. I want my repository to control query execution, which is impossible when exposing IQueryable.
Yes, Linq2Sql will generate magic to make it more efficient. It depends on you using the IQueryable interface. If you want to check, clamp the SQL profiler on and you will see it generate the appropriate query.
I would recommend introducing a service layer to abstract away your dependency on Linq2Sql.
I've also read that book recently and this is the SQL generated when I ran the sample code:
SELECT [t1].[Category]
FROM ( SELECT DISTINCT [t0].[Category]
       FROM [Products] AS [t0] ) AS [t1]
ORDER BY [t1].[Category]
I don't think you can write anything more efficient given that database. However, in most real databases your categories would be in a separate table to keep things DRY.
