Entity-Framework: Sort on Many-to-Many

I have two entities with a many-to-many relationship and I'm looking for a way to be able to sort the result from the tables.
In other words, when I get a row from table1 and all the corresponding records from table2 I want to be able to have a stored sort order for table2 that's specific for that row in table1.
My first thought was to add a sort column to the table that represents the relation, but to my knowledge there is no way of accessing the new column in the relation.
Does anybody have any suggestions on how to accomplish this?

As Ladislav Mrnka states, if you add the new column to the junction table, there will be a new entity "in the middle" that will make navigation much harder.
If you want to avoid this, but still be able to make the navigation as usual, you can keep the junction table and add a new table, just like the junction, with the order column added. When you need the order info, you can just join this table to get it and use it.
This new table will, of course, require some maintenance. For example, you can define a cascading delete from the junction table to the junction+order table, and use a trigger (ooops, that's not good!) to create a new row with a default order for each newly created relation. It would be much more advisable to handle this in your business logic.
I know it's tricky, but there's no magic solution... just choose whichever you are more comfortable with.

You can add a new column to the junction table, but the table will then become a new entity, so your model will consist of three entities and two one-to-many relations instead of two entities and a single many-to-many relation.
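A minimal sketch of what querying such a three-entity model might look like. It uses plain in-memory classes and LINQ to Objects rather than a real EF context, and the names `Table1Item`, `Table2Item`, `Link`, and `SortOrder` are hypothetical, chosen only for illustration. With EF, the same query shape would run against the mapped entity sets.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities: the junction table has become an entity of its own.
public class Table1Item { public int Id; public string Name; }
public class Table2Item { public int Id; public string Name; }

public class Link
{
    public int Table1Id;
    public int Table2Id;
    public int SortOrder;   // the extra column that forces the bridge entity
}

public static class Program
{
    // Returns the Table2 items linked to one Table1 row, in that row's stored order.
    public static List<Table2Item> SortedChildren(
        int table1Id, IEnumerable<Link> links, IEnumerable<Table2Item> table2)
    {
        return (from l in links
                where l.Table1Id == table1Id
                join t in table2 on l.Table2Id equals t.Id
                orderby l.SortOrder
                select t).ToList();
    }

    public static void Main()
    {
        var table2 = new List<Table2Item>
        {
            new Table2Item { Id = 1, Name = "A" },
            new Table2Item { Id = 2, Name = "B" },
            new Table2Item { Id = 3, Name = "C" },
        };
        var links = new List<Link>
        {
            new Link { Table1Id = 10, Table2Id = 1, SortOrder = 2 },
            new Link { Table1Id = 10, Table2Id = 3, SortOrder = 1 },
            new Link { Table1Id = 20, Table2Id = 1, SortOrder = 1 },
            new Link { Table1Id = 20, Table2Id = 2, SortOrder = 2 },
        };

        // Item "A" sits at a different position for each Table1 row.
        Console.WriteLine(string.Join(", ", SortedChildren(10, links, table2).Select(t => t.Name))); // C, A
        Console.WriteLine(string.Join(", ", SortedChildren(20, links, table2).Select(t => t.Name))); // A, B
    }
}
```

The price of this design is visible in the query: navigation from table1 to table2 always goes through the bridge entity.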

Because you need to sort table2 results per table1 row rather than globally, you have three inelegant options:
1. The approach Ladislav suggested (at the cost of an awkward model): add an order column and a bridge entity.
2. The approach JotaBe suggested (at the cost of an awkward schema): add an additional table and maintain both.
3. If the context is used only for reading (no need to change relationships) and you don't mind editing the EDMX manually after every update from the DB, you could hack the EDMX and change the SSDL definition of the relationship table to an SQL query, e.g.
<EntitySet Name="AS_TO_BS" EntityType="BlaBla.Store.AS_TO_BS">
<DefiningQuery>
SELECT ID1, ID2
FROM AS_TO_BS
ORDER BY ORDERVALUE
</DefiningQuery>
</EntitySet>
Instead of:
<EntitySet Name="AS_TO_BS" EntityType="BlaBla.Store.AS_TO_BS"
store:Type="Tables" Schema="MY_SCHEMA" />
See if you can relax your requirements, if not then settle on one of the three solutions.
Edit:
Another idea:
Use a view to duplicate the relationship table, then map the relationship to the view (as read only) and the order entity to the table (writable).

Thank you all for the good answers to my question. I now feel more confident about the pros and cons of the different solutions.
What I ended up doing was this: As it turns out, just adding a sort column to the relation-table doesn't affect the model, update from DB still works and the table still gets mapped as a many-to-many relation. Then I created a stored procedure that fetches the sort column from the relation-table and another stored procedure to update the sort-index of a specified record.

Related

Data modeling for Same tables with same columns

I have many tables that have same number of columns and names because they are all lookup tables.
For example, there are LabelType and TaskType tables. LabelType and TaskType tables have TypeID and TypeName columns. They will be used as a foreign key in other tables such as LabelType table with shippingLog table and TaskType table with EmployeeTask Table.
LabelType Table
TypeID  TypeName
1       Fedex
2       UPS
3       USPS
TaskType Table
TypeID  TypeName
1       Receiving
2       Pickup
3       Shipping
So far, I have more than 20 such tables and I expect the number to keep increasing.
I have no problem with that, but I am wondering whether there is a better or smarter way of organizing these tables. I was even thinking of consolidating all of them into one lookup Type table and differentiating the rows by adding a foreign key to a second lookup table. That second table would hold values like Label, Task, etc. Then I would need just one or two tables for all the lookup data.
Please, advise me if you have any better or smarter way of data modeling.

Just because data has a similar structure doesn't mean it has the same meaning or the same constraints. Keep your lookup tables separate. This keeps foreign keys separate, so the database can protect itself from referencing the wrong kind of lookup data. [1]
I wish relational DBMSes supported inheritance, where you could define the basic structure in the parent table and just add specific FKs in the child tables. As it stands now, you'll need to endure some repetition in your DDL...
NOTE: One exception to the "keep lookup tables separate" rule might be when your system needs to be dynamic (i.e. be able to add new kinds of lookup data without actually creating new physical tables in the database), but it doesn't look that way from your question.
[1] With one big lookup table, FKs alone won't stop (for example) the ShippingLog table from referencing a row meant for the EmployeeTask table. By using identifying relationships and migrating PKs, you can protect yourself from this, but not without introducing some redundancies and needing some careful constraining. It's cleaner and probably more performant to simply do the right thing and keep lookup tables separate.

Keep your lookup tables separate. It's faster at lookup time, and you will do millions of lookups between times when you add a new lookup table.
A lot of tables is not a big problem.

Recommend usage of temp table or table variable in Entity Framework 4. Update Performance Entity framework

I need to update a bit field in a table and set this field to true for a specific list of Ids in that table.
The Ids are passed in from an external process.
I guess in pure SQL the most efficient way would be to create a temp table and populate it with the Ids, then join the main table with this and set the bit field accordingly.
I could create a SPROC to take the Ids, but there could be 200 - 300,000 rows involved that need this flag set, so it's probably not the most efficient way. Using an IN clause has limitations with regard to the amount of data that can be passed and to performance.
How can I achieve the above using the Entity Framework?
I guess it's possible to create a SPROC that creates a temp table, but that table would not exist from the model's perspective.
Is there a way to dynamically add entities at run time? [Or is this approach just going to cause headaches?]
I'm making the assumption above though that populating a temp table with 300,000 rows and doing a join would be quicker than calling a SPROC 300,000 times :)
[The Ids are Guids]
Is there another approach that I should consider?

For data volumes like 300k rows, I would forget EF. I would do this by having a table such as:
BatchId RowId
Where RowId is the PK of the row we want to update, and BatchId just refers to this "run" of 300k rows (to allow multiple at once etc).
I would generate a new BatchId (this could be anything unique; a Guid leaps to mind), and use SqlBulkCopy to insert the records into this table, i.e.
100034 17
100034 22
...
100034 134556
I would then use a single sproc to do the join and update (and delete the batch from the table).
SqlBulkCopy is the fastest way of getting this volume of data to the server; you won't drown in round-trips. EF is object-oriented: nice for lots of scenarios, but not this one.
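The steps above could be sketched roughly as follows. This is only an outline under stated assumptions: the staging table `dbo.UpdateBatch`, the stored procedure `dbo.ApplyUpdateBatch`, and its `@BatchId` parameter are hypothetical names, and the row ids are typed as `Guid` because the question says the Ids are Guids. It needs a reachable SQL Server and a reference to `System.Data.SqlClient` to actually run.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class BatchFlagUpdater
{
    // Builds the in-memory (BatchId, RowId) rows that will be bulk-copied.
    public static DataTable BuildBatchTable(Guid batchId, IEnumerable<Guid> rowIds)
    {
        var table = new DataTable();
        table.Columns.Add("BatchId", typeof(Guid));
        table.Columns.Add("RowId", typeof(Guid));
        foreach (var id in rowIds)
            table.Rows.Add(batchId, id);
        return table;
    }

    // Assumed names: dbo.UpdateBatch (staging table), dbo.ApplyUpdateBatch (sproc).
    public static void Run(string connectionString, IEnumerable<Guid> rowIds)
    {
        var batchId = Guid.NewGuid();
        var table = BuildBatchTable(batchId, rowIds);

        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // Step 1: push all ids to the server in bulk.
            using (var bulk = new SqlBulkCopy(connection))
            {
                bulk.DestinationTableName = "dbo.UpdateBatch";
                bulk.ColumnMappings.Add("BatchId", "BatchId");
                bulk.ColumnMappings.Add("RowId", "RowId");
                bulk.WriteToServer(table);
            }

            // Step 2: one call; the sproc is expected to join on BatchId,
            // set the flag, and delete the batch rows.
            using (var command = new SqlCommand("dbo.ApplyUpdateBatch", connection))
            {
                command.CommandType = CommandType.StoredProcedure;
                command.Parameters.AddWithValue("@BatchId", batchId);
                command.ExecuteNonQuery();
            }
        }
    }
}
```

Because every row carries the `BatchId`, several uploads can share the staging table at the same time without interfering with one another.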

I'm marking Marc's response as the answer, but I'd just like to give a little detail on how we implemented the requirement.
Marc's response helped greatly in the formulation of our solution.
We had to deal with an aim/guideline of staying within the Entity Framework while not using SPROCs, and although our solution may not suit others, it has worked for us.
We created an Item table in the database with BatchId [uniqueidentifier] and ItemId [varchar] columns.
This table was added to the EF model, so we did not use temporary tables.
On upload of the Ids, this table is populated with them [we find inserts are quick enough using EF].
We then use context.ExecuteStoreCommand to run SQL that joins the Item table to the main table and updates the bit field in the main table for records that exist for the batch Id created specifically for that session.
Finally, we clear this table for that batch Id.
We get the performance while keeping within our no-SPROC goal. [Which not all of us agree with :) but it's a democracy]
Our exact requirements are a little more complex, but insofar as we needed good update performance using the Entity Framework given our specific restrictions, it works fine.
Liam
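For readers who want to picture the ExecuteStoreCommand step, here is a hedged sketch of how it might look with an EF4 `ObjectContext`. The `Item` table and its `BatchId`/`ItemId` columns come from the description above; `MainTable`, `Id`, and `IsFlagged` are hypothetical names standing in for the real main table, and the SQL is T-SQL. Since `ItemId` is stored as varchar while the Ids are Guids, the join below relies on SQL Server converting the values, which is an assumption about the data.

```csharp
using System;
using System.Data.Objects;   // EF4 ObjectContext

public static class BatchFlagCommands
{
    // Hypothetical main table and flag column; {0} becomes a SQL parameter.
    const string UpdateSql =
        @"UPDATE m SET m.IsFlagged = 1
          FROM MainTable AS m
          JOIN Item AS i ON i.ItemId = m.Id
          WHERE i.BatchId = {0}";

    const string CleanupSql = "DELETE FROM Item WHERE BatchId = {0}";

    // Flags every main-table row listed for this batch, then clears the batch.
    // Returns the number of rows the UPDATE reported as affected.
    public static int FlagBatch(ObjectContext context, Guid batchId)
    {
        int updated = context.ExecuteStoreCommand(UpdateSql, batchId);
        context.ExecuteStoreCommand(CleanupSql, batchId);
        return updated;
    }
}
```

Passing `batchId` as an argument rather than concatenating it into the string keeps the command parameterized.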

Database Table Schema and Aggregate Roots

The application is single-user, 1-tier (1 PC), with a SqlCE database. The DataService layer will be (I think) a Repository returning domain objects and querying the database with LinqToSql (dbml). There are obviously a lot more columns; this is a simplified view.
LogTime in separate table: http://i53.tinypic.com/9h8cb4.png
LogTime in ItemTimeLog table (as Time): http://i51.tinypic.com/4dvv4.png
This is my first attempt at creating a database with more than 2 tables. I think the table schema makes sense, but I need some reassurance or criticism, because the table relations look quite scary, to be honest. I'm hoping you could:
Look at the table schema and respond if there are clear signs of trouble or errors that you spot right away. And, if you have time,
Look at the Program Summary/Questions, and see if the table layout makes sense with regard to those points.
Please be brutal, I will try to defend :)
Program summary:
a) A set of categories, each having a set of strategies (1:m)
b) Each day a number of items will be produced. And each strategy MAY reference it.
(So there can be 50 items, and a strategy may reference 23 of them)
c) An item can be referenced by more than one strategy. So I think it's an m:m relation.
d) Status values will be logged at fixed time-fractions through the day, for:
- .... each Strategy.....each StrategyItem....each item
e) An action on an item may be executed by a strategy that reference it.
- This is logged as ItemAction (Could have called it StrategyItemAction)
User Requests
Points b) through e) describe the main activity mode of the program: working with only today's DayLog, for each category. The second-priority activity is retrieval of history, which will typically be: from all categories, from day x to day y, get all StrategyDailyLog.
Questions
First, does the overall layout look sound? I'm worried to see that there are so many relationships in all directions, connecting everything. Is this normal, or does it look like trouble?
StrategyItem is made to represent an m:m relationship. Is it correct as I noted 1:m / 1:1 (marked red) ?
StrategyItemTimeLog and ItemTimeLog: these log values that both need to be retrieved together when retrieving a StrategyItem. The reason I separated them is that the first one is strategy-specific, and several strategies can reference the same item. So I thought I should not duplicate those values that depend only on the item and not on the strategy. Hence I also pulled out the LogTime, as it seems to be the only parameter that unites the logs. But this all looks quite disturbing with those 3 tables. Does it make sense at all? Or do you have a suggestion?
The pink circles show my vague attempt at Aggregate Root Paths. I've been thinking in terms of "what entity is responsible for delete", though I'm unsure about the actual root. I think it's Category. Does that make sense in relation to the User Requests described above?
EDIT1:
(Updated schema, showing typical number of hierarchy items for the first few relations, for 365 days, and additional explanations)
1:1 relation: Sorry. I made a mistake. The StrategyDailyLog should be 1:m. See updated schema. It is one per Strategy, per day.
DayLog / StrategyDailyLog: I've been pondering whether DayLog should be a part of the hierarchy like this or not. The purpose of the DayLog table is to hold "sum values" derived from all the StrategyDailyLog rows for the same day, like performance values for that day. It also holds the date value, which allows me to omit a date value in StrategyDailyLog (which I feel would be a kind of duplicate modeling of the date field); instead, the reference to DayLog exists to "find" the date. I'm not sure if this is an abuse/misconception of normalization.
Null value: I hadn't thought about this. I believe I found 2, as now marked in StrategyDailyLog and ItemAction. They cannot be null on creation, but they can be set to null if one needs to delete either a Strategy or a StrategyItem. That should not require a delete of the StrategyDailyLog and the ItemAction. Hence they can be set to null.
All Id columns: My idea was to have ID (autogenerated Integer) as PK for all my tables. I believed that would also be sufficient as a candidate key. Is this not a proper way to make PKs? It's the only way any table of mine can be identified. I asked a question before about whether that was OK; maybe I misunderstood, but I thought it was a good approach.
m:m relation: This is what I have attempted to do: StrategyItem is the m:m table of StrategyDailyLog / DailyItem.

Ok. Here is me being brutal. I do not understand the model.
So instead of trying to comment on that so much, here are some thoughts that came to my mind when I looked at it.
I think you should have a look at your 1:1 relationships (all of them). Why are DayLog and StrategyDailyLog separated into two tables? Probably because you will always have at least one DayLog item, but not all DayLog items have a StrategyDailyLog item. If that is the case, you can have a StrategyID FK in the DayLog table with the allow-nulls option.
It would help to understand the model if you could show which fields are required and which fields accept null as a value.
All your tables have their own id column. That can be quite confusing when doing 1:1 relations and m:m relations. For a 1:1 relation, the relation between the two tables is usually made on the primary key in both tables. If you do not do that, you have to create a candidate key on the foreign key column. In your case that means that StrategyDailyLog should have a candidate key on DayLogID.
An m:m relation between two tables is usually solved by adding a new table in between, containing the primary keys from both tables. Those fields together are the primary key for the table in the middle.
Let's say, for example, that you have an m:m relationship between Category and Strategy. You would then create a table called CategoryStrategy with two fields, CategoryID and StrategyID, that together are the primary key for table CategoryStrategy.
I hope my comments make sense and that they are useful to you.
EDIT 2011-01-17
I do not think that you should make it a principle to use an IDENTITY column as the primary key in all tables. An m:m relation does not need it, so you should not do it. I also think that you have misunderstood what I meant by a candidate key. A candidate key is a key that could have been used as the primary key. In MS SQL Server you define a UNIQUE CONSTRAINT for your candidate key.
Ex: Table StrategyItem has id as its PK, but the combination of StrategyID and DailyItemID is a candidate key. It would be better to remove id and use StrategyID+DailyItemID as the PK.
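Since the question mentions LinqToSql, here is a small sketch of how a junction table keyed by the pair of foreign keys might be declared with LINQ to SQL attribute mapping, instead of through the designer. The class layout is an assumption made for illustration; only the table and column names come from the discussion above. `Main` merely reads the mapping attributes back through reflection to show which members form the key, and it does not touch a database.

```csharp
using System;
using System.Data.Linq.Mapping;
using System.Linq;

// Junction table keyed by the pair (StrategyID, DailyItemID); no separate id column.
[Table(Name = "StrategyItem")]
public class StrategyItem
{
    [Column(IsPrimaryKey = true)]
    public int StrategyID { get; set; }

    [Column(IsPrimaryKey = true)]
    public int DailyItemID { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Collect the properties marked as part of the primary key.
        var keyMembers = typeof(StrategyItem).GetProperties()
            .Where(p => p.GetCustomAttributes(typeof(ColumnAttribute), false)
                         .Cast<ColumnAttribute>()
                         .Any(c => c.IsPrimaryKey))
            .Select(p => p.Name);

        Console.WriteLine(string.Join(" + ", keyMembers));
    }
}
```

Marking both columns with `IsPrimaryKey = true` tells LINQ to SQL that the two columns together identify a row; the uniqueness itself still has to be enforced by the primary key defined in the database.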
Below is the schema that I would have built with your description. I might have missed something important because I do not know everything about what you want to do.
You should not think so much about query performance and building aggregates when designing the schema. That can be handled by creating indexes on columns and using sum, count and group by in your queries. An index on the Created column in the model below would be necessary for your queries on a date or date interval. In MS SQL Server there is something called the clustered index. By default the PK of a table is the clustered index, but in this case I would make the index on the Created column the clustered index.
A Category has 0, 1 or more Strategies.
A LogItem has one Category and optionally one Strategy.
LogItem.Created holds date and time.

How do you use Linq to connect tables in different databases?

I'm a bit of a Linq newbie, and I couldn't find any documentation to help me with what seems to be a pretty trivial problem - so your help will be much appreciated!
I have a table Table1 in database DB1, which has a "pseudo" foreign key Table2ID to table Table2 in database DB2, on the same server. "Pseudo", because obviously I can't have an actual FK spanning two databases.
Now I'm playing around with the O/R designer, and I love the way all the relationships are generated when I bring database objects into the designer... very cool! And I want my Table1 object to have a relationship to Table2, just like it has relationships with all the "real" foreign key-related objects in DB1. But I can't bring Table2 into my db diagram, because it's in the wrong DB.
To synthesize this, I tried creating a view Table2 in DB1, which is simply select * from DB2..Table2. Aha, now I can drop a Table2 object into my diagram. I can even make a parent/child relationship between Table1 and Table2. But when I look at the generated code, Table1 still has no relationship to Table2, which I find most perplexing.
Am I missing a step somewhere? Is there a better/recommended way of doing this?
Thanks!
Later...
Along the lines of what one person suggested, I tried filling in the partial class of Table1 with all the methods required to access Table2, by copying all the structures for a related object within the same DB.
This actually worked for reads, but as soon as I tried to update or insert a record, I got an exception:
An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported.
So it looks like the designers of Linq have actually thought about this scenario, and decided that you are not allowed to connect objects in different databases. That's really a shame... :(
... and even later...
Thanks to @williammandra.com, I found that you need to set the primary key on a view manually. But there's still another problem: for some reason, when you load a value from the view Table2 and set it on a new Table1 record, then commit the changes, it tries to insert a new record into Table2, which obviously causes a PK violation. Any idea why this happens, and how to get around it?

Views don't have primary keys, and without one the O/R designer can't create the relationship. Your solution of using a view gets you halfway there. The step you are missing is setting the "Primary Key" property to true in the O/R designer for the key field in the Table2 view. You still have to create the association manually, but once you save the dbml the relationship will show up in the generated code.

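You could create two dbml's, one for each db. Then join the tables in your query:
// A single LINQ to SQL query cannot span two DataContexts,
// so both sides are pulled into memory and joined with LINQ to Objects.
var tb1 = DataContext1.Table1.AsEnumerable();
var tb2 = DataContext2.Table2.ToList();
var result = (from t1 in tb1
join t2 in tb2 on t1.column equals t2.column
where ...
select ...
);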
You could also set tb2 = to your view rather than another datacontext...

Assuming you can access one database from the other you can do this by manually editing the .dbml file.
<Table Name="Table1.dbo.Table" Member="MemberObject">
<Table Name="Table2.dbo.Table" Member="MemberObject">
You might actually be able to do this by looking at the properties of a table in the designer and changing its source.

LINQ to SQL Association - "Properties do not have matching types"

I am trying to link two fields of a given table to the same field in another table.
I have done this before so I can't work out what is wrong this time.
Anyway:
Table1
- Id (Primary)
- FK-Table2a (Nullable, foreign key relationship in DB to Table2.Id)
- FK-Table2b (Nullable, foreign key relationship in DB to Table2.Id)
Table2
- Id (Primary)
The association works for FK-Table2a but not FK-Table2b.
In fact, when I load into LINQ to SQL, it shows Table2.Id as associated to Table1.Id.
If I try and change this, or add a new association for FK-Table2b to Table2.Id it says: "Properties do not have matching types".
This also works in other projects - maybe I should just copy over the .dbml?
Any ideas?

I see this problem when I try to create one-to-one relationships where one side of the relationship is nullable (so really, one-to-zero/one). LINQ-to-SQL doesn't seem to support this so it appears we are forced to a plural relationship and a collection that will contain zero or one items. Annoying.

No idea on the cause, but I just reconstructed my .dbml from scratch and it fixed itself.
Oh for a "refresh" feature...

I had the same problem. This error appeared when I tried to link fields of different types, or when I tried to drag and drop a table onto the .dbml design surface while the .dbml already contained linked tables whose linked fields had different types.
