I use the Table per Type (TPT) inheritance strategy, with an abstract class "Task" and a lot of concrete classes (about 30).
When I try to show a worklist of "to do" tasks, which means asking EF for the abstract class, or when I try to get a generic task by ID, EF generates a 10,000-line query that joins every concrete class, and the result is very slow.
Is there a way to configure EF to avoid the big query?
In the worklist method, I only need fields of the abstract class.
This is my code:
    public Task GetTaskById(int id) {
        return this.repository.Tasks.Where(t => t.ID == id).FirstOrDefault();
    }
    public IQueryable<Task> GetWorklist() {
        // The lambda parameter is t; the original referenced an undefined "a".
        return this.repository.Tasks.Where(t => t.ActivitySate.Code == ActivitySateEnum.TO_DO);
    }
Thank you
Relational databases don't handle the concept of inheritance very well. Several strategies have been devised for Entity Framework to mimic inheritance.
Which strategy suits you most depends on which kind of queries and updates you perform most often.
Suppose you have a class Person, and two specific kinds of Persons: Teachers and Students. There are two popular strategies to implement inheritance:
Table per Type (TPT)
Every class is represented in a separate table. In our example three tables are created: a Persons table, a Teachers table with a foreign key to the Person row of that Teacher, and a Students table with a foreign key to the Person row of that Student.
If you query: "give me the Persons that ...", only one table needs to be inspected. However, if you ask: "give me the Students who ...", then a join between the Persons table and the Students table is needed.
If you add / update / remove one Student, then two tables need to be updated.
If in future one column needs to be added to one of the classes, only one table is involved.
Adding a new kind of Person, like Sponsors is easy, however they have to be Persons and inherit all Person columns. If later you decide that a Sponsor is not a Person anymore you are in trouble.
This method is most suitable if you ask far more often for Persons than for Students and Teachers. It is less suitable if you ask quite often for Students with Person data. Also if you add / remove / update Students very often, don't use this method.
Also use this method if you need to create a Person that is neither a Teacher nor a Student yet, but later may become one of them, or maybe both Teacher and Student
Table per concrete class (TPC)
There is no separate table for Persons. All Person properties are in the Teachers table as well as in the Students table.
Querying "Students who ..." or "Teachers that ..." will only involve one table. However querying "Persons that ..." will involve the concatenation of data retrieved from the Students table with data retrieved from the Teachers table.
Add / Remove / Update a Student will always involve one table.
Adding a column to a Student involves changing one table. However adding a column to Person involves changing both Students and Teachers tables.
Adding a new kind of Person, say Janitors or Sponsors is easy. It won't be a problem if in future a Sponsor is not a Person anymore.
You can't create a Person, it always has to be either a Teacher or a Student. A Student never can become a Teacher, he will become a new Person (which seems a bit ironic :-). No Student can be a Teacher as well.
Use this method if you seldom ask for Persons who ..., but most often ask for Students who ...
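As a rough sketch of how the two strategies could be mapped with EF6 code-first: the context names, the Subject and Year properties, and the table names below are assumptions made for illustration, and the code needs the EntityFramework 6 package. Person is abstract here so that the TPC mapping produces no Persons table; for TPT it could just as well be concrete.

```csharp
using System.Data.Entity; // assumes the EntityFramework 6 NuGet package

public abstract class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Teacher : Person { public string Subject { get; set; } }
public class Student : Person { public int Year { get; set; } }

// Table per Type: one table per class; Teachers and Students rows
// share their key with the matching Persons row.
public class TptSchoolContext : DbContext
{
    public DbSet<Person> Persons { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Person>().ToTable("Persons");
        modelBuilder.Entity<Teacher>().ToTable("Teachers");
        modelBuilder.Entity<Student>().ToTable("Students");
    }
}

// Table per concrete class: no Persons table; each subtype table
// repeats the inherited Person columns.
public class TpcSchoolContext : DbContext
{
    public DbSet<Person> Persons { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Teacher>().Map(m =>
        {
            m.MapInheritedProperties();
            m.ToTable("Teachers");
        });
        modelBuilder.Entity<Student>().Map(m =>
        {
            m.MapInheritedProperties();
            m.ToTable("Students");
        });
    }
}
```

With TPC the key values have to be unique across the Teachers and Students tables, so database-generated identity keys per table usually need extra care.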
Conclusion
The strategy to choose for your inheritance depends on how you will use your tables.
You seem to have 30 kinds of Persons implemented as TPC (no separate Persons table). If you ask for Persons who ..., your database has to concatenate the results from all 30 tables.
If you think this is by far the most used kind of query, consider changing the inheritance strategy to TPT. Whether you should do this depends on whether the database is already filled with a lot of data or not. If you are using code-first, you'll probably start with a fairly empty database.
The problem is that you use a bad repository that does not return IQueryable, so it does not allow EF to actually use the filters you DO have (you do, right?) where you limit the returned data to only some fields.
So, what is left is to materialize the entity (which is SOOOO standard for the repository antipattern). And there you go: for that, it NEEDS to join TPT. Those are 30 classes, which means 30+ tables. First, the query likely does not have 10k lines. Second, this is normal and smallish for really complex SQL (which you DO have here). Third, you set that up yourself. Yes, this is what is needed to pull in all the data.
Solution? Get rid of the surplus repository (DbContext IS a repository, you know), then make the filter based on the base type and make sure to project ONLY the needed fields into an anonymous class, so EF CAN do optimization.
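A minimal sketch of that projection, assuming simplified stand-ins for the question's entities: TaskBase, ReviewTask, ShippingTask, ActivityStateCode and WorklistItem are hypothetical names, and a named DTO replaces the anonymous class only so the query can be returned from a method. The sample runs against any IQueryable. Passed an EF set of the base type, the same expression references only base-class properties, which is what should give EF room to skip the subtype tables. Whether it actually does depends on the EF version, so check the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for the entities in the question.
public enum ActivityStateCode { ToDo, Done }

public abstract class TaskBase
{
    public int ID { get; set; }
    public string Title { get; set; }
    public ActivityStateCode State { get; set; }
}

public class ReviewTask : TaskBase { public string Reviewer { get; set; } }
public class ShippingTask : TaskBase { public string Carrier { get; set; } }

// Hypothetical DTO holding only the base-class fields the worklist needs.
public class WorklistItem
{
    public int ID { get; set; }
    public string Title { get; set; }
}

public static class Worklist
{
    // Filters on base-type properties and projects to a flat DTO,
    // so no property of a concrete subtype is ever referenced.
    public static IQueryable<WorklistItem> ToDo(IQueryable<TaskBase> tasks)
    {
        return tasks
            .Where(t => t.State == ActivityStateCode.ToDo)
            .Select(t => new WorklistItem { ID = t.ID, Title = t.Title });
    }
}
```

The caller would pass the context's base-type set directly, for example Worklist.ToDo(context.Tasks), where context.Tasks is assumed to be exposed as IQueryable.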
Related
Let's say I have 4 tables: tbl_dogs, tbl_cats, tbl_birds and tbl_fish, each with their own _Id columns, of course. I want to create the ability to have many-to-many relationships between any of these tables. In my head, I picture a tbl_relationships table that has 2 columns: animal1_id and animal2_id, and entries can be cat_12 | dog_3, bird_1 | dog_9, fish_8 | cat_4, etc. This is a 2 part question:
1) Is this possible with EF code first? Meaning, can the two columns on my "relationships" table actually be pulled from multiple different tables? If so, how would I define that in the EF classes?
2) What if, rather than animal1_id and animal2_id as the columns, I wanted to have parent_animal and child_animal, so that when I went to look at a fish, I could pull all the child_animal records that have that fish as a parent_animal as well as all the parent_animal records that have that fish as a child_animal?
Any help would be greatly appreciated, thank you!
It's not possible with any relational database. Entity Framework is beside the point. A foreign key has to be made to a specific table. Period.
The Python framework Django does sort of what you're looking for with its generic content types, but it's more of a hack than anything. For each generically related item, two columns are set: one for the type of the object and one for the id of the object. There are no foreign keys, because again, foreign keys are impossible in this scenario.
In the framework code, in order to materialize the related object, they then issue a query to the correct table (based on the object type) using the object id. However, this is much simpler in Python than C# because Python is duck-typed.
You can technically achieve the same thing in C# if you were properly motivated, but it would be an entirely manual endeavor. Entity Framework is of no help to you here. You would also need to employ reflection in order to materialize the right type in the end, and reflection is both a pain in the posterior and hugely inefficient (slow).
That said, since the specific scenario here deals with things all of a certain type, animals, you can sort of achieve what you're looking for with inheritance. In other words, you can create an Animal entity and then have each of Dog, Cat, Bird, Fish, etc. inherit from Animal. Then, you can create a foreign key to Animal and interact with any of them. However, you would only be able to interact with them as Animals, not as specifically a Bird, for example.
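A small sketch of that inheritance approach, using in-memory collections in place of the database. The AnimalRelationship and AnimalQueries names and their members are made up for illustration, and the sketch assumes Ids are unique across all animals because they share one key space.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: every animal type derives from Animal, so both
// relationship columns can reference one and the same key.
public abstract class Animal
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Dog : Animal { }
public class Cat : Animal { }
public class Bird : Animal { }
public class Fish : Animal { }

// One row of the relationships table: both columns point at Animal.
public class AnimalRelationship
{
    public int ParentAnimalId { get; set; }
    public int ChildAnimalId { get; set; }
}

public static class AnimalQueries
{
    // Animals recorded as children of the given animal.
    public static List<Animal> ChildrenOf(
        Animal animal, IEnumerable<Animal> animals, IEnumerable<AnimalRelationship> links)
    {
        var childIds = links
            .Where(l => l.ParentAnimalId == animal.Id)
            .Select(l => l.ChildAnimalId)
            .ToList();
        return animals.Where(a => childIds.Contains(a.Id)).ToList();
    }

    // Animals recorded as parents of the given animal.
    public static List<Animal> ParentsOf(
        Animal animal, IEnumerable<Animal> animals, IEnumerable<AnimalRelationship> links)
    {
        var parentIds = links
            .Where(l => l.ChildAnimalId == animal.Id)
            .Select(l => l.ParentAnimalId)
            .ToList();
        return animals.Where(a => parentIds.Contains(a.Id)).ToList();
    }
}
```

The results come back typed as Animal, which matches the limitation described above: any Dog or Fish specifics would need a cast or an OfType filter afterwards.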
If you need to have a single matrix table with just two columns, and you need each animal type to be in its own table, then Chris Pratt’s answer is succinct.
If you can go with more than two columns but are forced to stick with individual animal tables, I would try to have a column for every animal table in the matrix table and turn this matrix table into a massive multi-columned monster.
If you are limited to two columns in the matrix table but can target a matrix table based on what parent/child combo you are looking for, then I would set up twelve matrix tables and have each of them for one of the two possible animal combinations for each pair of animal tables. Obviously you are going to have issues if the number of animal tables is actually much larger.
If the animal tables will all have identical columns (type/content/requirements) themselves, there is a possible workaround. You can have all animals in one table with an identifier that signifies what kind of animal it is. This would typically be a foreign key attached to a lookup table of animal types. Then you can have your matrix table as both parent and child columns, as they will have explicit relationships to just a single other table (the ‘mass animal’ table; both parent and child would point to its PK).
A more elegant possible workaround to the prior paragraph (which I haven’t had a chance to use yet, so I am really hazy on its implementation with respect to gotchas) is Inheritance. Inheritance allows you to combine several tables that hold near-identical (in terms of fields and data types) collections of data, and be able to tell those collections apart by means of a discriminator. That way, as per the prior paragraph, you could have a single Animals table with a discriminator that indicates what each row is (cat, dog, etc.), and a matrix table where both the parent and child point to the ID of the Animals table. Since each column in the matrix table has an association to a unique table (instead of multiple tables), this could work out well.
Taking a simple example:
Customer can place order on ecommerce website.
Now, I can infer two compositions ('has a' relationships) from this statement.
Order has a customer.
Customer has one or multiple orders.
How should I create classes for this? What factors should I keep in mind while designing these two classes? Below are the possibilities I have found so far.
Order class with customerId.
Order class with Customer object.
Customer class with no order history. (Since i can still find information about customer order from order table)
Customer class with List of OrderId.
Customer class with List of Order objects.
How do I decide which is best for a situation?
The first possibility is the simple one which seems enough for your case! An order will have the customer id as a foreign key.
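As a minimal sketch of that first option, with hypothetical property names and an in-memory list standing in for the order table: the Order carries only the customer's id, and a customer's order history is found by filtering on it.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; } // foreign key to Customer.Id
    public decimal Total { get; set; }
}

public static class OrderQueries
{
    // The customer's order history is derived from the orders themselves,
    // so Customer does not need to hold a list of orders.
    public static List<Order> OrdersFor(int customerId, IEnumerable<Order> orders)
    {
        return orders.Where(o => o.CustomerId == customerId).ToList();
    }
}
```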
I was hoping someone could give me a bit of advice here. I am wondering if I am on track or way off base in my approach. I am using Entity Framework, database first approach. I have a link table that associates people to each other. Person 1 associated to Person 2 as a friend for example. (association_type holds a key value associated to a lookup table)
I noticed that Entity Framework creates two separate navigation properties.
[EdmRelationshipNavigationPropertyAttribute("IntelDBModel", "FK_a_Person_Person_t_Person", "a_Person_Person")]
public EntityCollection<a_Person_Person> a_Person_Person
[EdmRelationshipNavigationPropertyAttribute("IntelDBModel", "FK_a_Person_Person_t_Person1", "a_Person_Person")]
public EntityCollection<a_Person_Person> a_Person_Person1
In other parts of the application, I have successfully used Entity Framework to write data to the database. For example, I have a person to telephone relationship.
In the person to telephone scenario, I create a t_Person (p) object, then create a t_Telephone (t) object and use p.t_Telephone.Add(t);
That seems to work fine.
I am somewhat lost in terms of how to manage this person to person link table insert.
When saving to the database, I use foreach to iterate through the People objects.
    foreach (t_Person p in People)
    {
        ctx.t_Person.AddObject(p);
        ...
    }
I know what person is associated to what person in this People object collection. However, I don't know how to utilize the t_Person navigation properties (a_Person_Person) to save the person1 and person2 values to the link table (a_Person_Person).
Any hints would be greatly appreciated.
I think this situation will generally give you a hard time with EF, since you are linking two foreign keys to one table with the same primary key. Relationships and lazy loading become difficult to handle, and you might get duplicate or wrong records. I would add another property to the t_person table, like datecreated, which would make EF treat the table not as an association but as an actual entity, giving you more control over the entity and over insertion and deletion.
I have a conceptual problem. It's about the correct database architecture for persisting inherited objects. I'd like to know the proper way to do it without using EF Code First, because this is all designed beforehand and is not necessarily used with EF: it may be, but I'm not sure, or not only. I need a proper way that will still be consistent with ORM approaches like Entity Framework.
Let's simplify and say we have an object called "Transportation Vehicle" : TransVehicle, it has following properties :
Name
Color
Age
Let's say now we have a "Car" inheriting from the TransVehicle, which adds following properties :
FuelType
WheelSize
We also have a "Plane" which adds those following other properties :
EngineQuantity
MaxTakeOffWeight
So, I may have in my code a List<TransVehicle> which will contain Cars and Planes.
I suppose I could have a table "TransVehicle" with fields like "Id, Name, Color, Age", then a table "Cars" with fields like "Id, FuelType, WheelSize", and a table "Planes" with fields "Id, EngineQuantity, MaxTakeOffWeight".
I could say :
I read "Plane" rows and complete them with information coming from "TransVehicle" with the same ID.
I read "TransVehicle" rows, and for each, see if I find a Planes record or a Cars record to instantiate the proper object.
I read "TransVehicle" rows, and look up an enum value (string, int?) in a special field to get the object type, then depending on this type, get the information from the "Plane" or "Car" table.
Which is good conceptual practice? Do you have other tips? Which way will be easy to map in an ORM?
This is a common problem of mapping an object hierarchy to a relational model. You can read about it all over the web.
You basically have three options:
Hierarchy as single table - hierarchy is flattened into a table with discriminator column.
Table for each class - each class has its own table and you do join over all of them with complex queries to get the data. This is what you are doing.
Table for each concrete class - middle ground between the other two if you have abstract classes in the hierarchy
Most ORMs do allow you to pick which one fits the hierarchy best or even mix and match between them for maximum performance or storage savings.
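To make the first option concrete, here is a hand-written sketch of what an ORM does behind the scenes with a single flattened table: one row type with a discriminator and nullable columns for every subclass. VehicleRow, VehicleMapper and the "Car"/"Plane" discriminator values are assumptions for illustration, not any ORM's actual API.

```csharp
using System;

public abstract class TransVehicle
{
    public string Name { get; set; }
    public string Color { get; set; }
    public int Age { get; set; }
}

public class Car : TransVehicle
{
    public string FuelType { get; set; }
    public int WheelSize { get; set; }
}

public class Plane : TransVehicle
{
    public int EngineQuantity { get; set; }
    public double MaxTakeOffWeight { get; set; }
}

// Hypothetical flattened row: shared columns, a discriminator, and
// nullable columns for the properties only some subclasses have.
public class VehicleRow
{
    public int Id { get; set; }
    public string Discriminator { get; set; }
    public string Name { get; set; }
    public string Color { get; set; }
    public int Age { get; set; }
    public string FuelType { get; set; }
    public int? WheelSize { get; set; }
    public int? EngineQuantity { get; set; }
    public double? MaxTakeOffWeight { get; set; }
}

public static class VehicleMapper
{
    // Picks the concrete class from the discriminator, then copies
    // the shared columns onto the new object.
    public static TransVehicle Materialize(VehicleRow row)
    {
        TransVehicle vehicle;
        switch (row.Discriminator)
        {
            case "Car":
                vehicle = new Car
                {
                    FuelType = row.FuelType,
                    WheelSize = row.WheelSize.GetValueOrDefault()
                };
                break;
            case "Plane":
                vehicle = new Plane
                {
                    EngineQuantity = row.EngineQuantity.GetValueOrDefault(),
                    MaxTakeOffWeight = row.MaxTakeOffWeight.GetValueOrDefault()
                };
                break;
            default:
                throw new ArgumentException("Unknown vehicle type: " + row.Discriminator);
        }
        vehicle.Name = row.Name;
        vehicle.Color = row.Color;
        vehicle.Age = row.Age;
        return vehicle;
    }
}
```

The trade-off shows in VehicleRow itself: reading never needs a join, but every subclass column must be nullable and most rows leave several of them empty.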
I started using RavenDB and I am getting concurrency exceptions when I create an entity and link it to another entity, for example:
Class - List<string> students - string is the student id
Student
When creating a new student, I fetch the related class and add the student to the "students" list. I saw that storing and adding this relation is the bottleneck in my application.
How can I fix this concurrency problem? Or can I do this linking in another way? Or could I update the Class with a PatchRequest so that I won't have problems with concurrency?
You have two options
Retry on concurrency failure
Reduce contention on the Class entity.
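The first option could be sketched as a small retry helper. ConcurrencyConflictException is a hypothetical stand-in for whatever exception your RavenDB client version throws on a conflicting write, and Retry.OnConflict is an illustrative name, not part of any library.

```csharp
using System;

// Hypothetical stand-in for the store's concurrency exception.
public class ConcurrencyConflictException : Exception { }

public static class Retry
{
    // Runs the attempt, retrying only on a concurrency conflict, up to
    // maxAttempts in total. The last failure is rethrown to the caller.
    public static void OnConflict(Action attempt, int maxAttempts)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int tries = 1; ; tries++)
        {
            try
            {
                attempt();
                return;
            }
            catch (ConcurrencyConflictException)
            {
                if (tries >= maxAttempts) throw;
            }
        }
    }
}
```

Each attempt should reload the Class document before changing it. Retrying the save with the stale copy would just hit the same conflict again.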
Really, you have a many-to-many relationship between students and classes - so you can store the related key on either side of the relationship (or both sides if desired).
If your app works more frequently with classes than students, try putting a list of ClassIds on each student. You can still get back a list of all students in the class via a query.
Aside - I suggest "Course" instead of "Class" to avoid stepping on keywords in C#.
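A sketch of keeping the key on the student side, using the "Course" name suggested above. The CourseIds property and the Enrollment helper are hypothetical, and an in-memory collection stands in for a RavenDB query over students.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Course
{
    public string Id { get; set; }
    public string Title { get; set; }
}

public class Student
{
    public string Id { get; set; }
    public string Name { get; set; }
    public List<string> CourseIds { get; set; }

    public Student()
    {
        CourseIds = new List<string>();
    }
}

public static class Enrollment
{
    // Enrolling touches only the student document, so concurrent
    // enrollments of different students never write the same Course.
    public static void Enroll(Student student, string courseId)
    {
        if (!student.CourseIds.Contains(courseId))
        {
            student.CourseIds.Add(courseId);
        }
    }

    // The course roster is recovered with a query over students.
    public static List<Student> StudentsIn(string courseId, IEnumerable<Student> students)
    {
        return students.Where(s => s.CourseIds.Contains(courseId)).ToList();
    }
}
```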