In our database we have a parent - child - grandchild relation that is a many-to-many relationship ( twice ). This happens through two junction / cross-reference tables. The Parent/Child/Grandchild tables have varchar functional keys that are unique. Below is a simplified version showing only the first step in the hierarchy:
Parent             Junction           Child
+----+-------+     +------+------+    +----+-------+
| PK | F_KEY |     | PK_1 | PK_2 |    | PK | F_KEY |
+----+-------+     +------+------+    +----+-------+
| 1  | AAA   |     | 1    | 1    |    | 1  | BBB   |
+----+-------+     +------+------+    +----+-------+
The parent, child and grandchild tables each contain several million records.
Situation
We need to deal with the situation where we're given a collection of parent-child-grandchild records, some of which may already be present in the database. We need to insert the ones that are not yet present and ignore the rest ( based on the functional key ).
So the current implementation:
switches off autodetectChanges and disables all constraints on the datacontext.
checks for parents already present ( using F_KEY ) - inserts non existing ones
checks for children already present ( F_KEY ) - inserts non existing ones and I think manually updates EF
idem for grandchildren
Not surprisingly - something went wrong and now we have missing links in our junction table and we're having to fix this through scripts.
This implementation doesn't sit well with me. Argument of the dev was performance. Original implementation did not perform:
Given list of parents - ignore existing ones
Look at remaining children - replace existing ones with DbEntries
Idem for grandchildren
SaveChanges()
It didn't perform. My colleague said: 'Think about it: you have to insert the parents, then retrieve the ids. Save the children, retrieve their ids, use these for the first junction table, etc.'
Question
How can I make this perform? I mean - the current implementation works, but it's not very maintainable and it really rubs me the wrong way.
An idea I had - if we make the junction table contain the unique functional keys like so:
Parent             Junction           Child
+----+-------+     +------+------+    +----+-------+
| PK | F_KEY |     | PK_1 | PK_2 |    | PK | F_KEY |
+----+-------+     +------+------+    +----+-------+
| 1  | AAA   |     | AAA  | BBB  |    | 1  | BBB   |
+----+-------+     +------+------+    +----+-------+
Then we don't have to retrieve the ids of the inserted items to store them in the junction table. Does that make sense? Will EF be able to benefit from that?
If that doesn't work - and we're not using EF in the way it's at its best - we might as well consider using stored procedures or direct queries to the database. We'd save the overhead of EF altogether, and at least then we'd be in full control of what we're doing instead of having EF build the queries for us behind the scenes.
What are the thoughts on that? Any other suggestions are very welcome as well of course.
For this kind of task I would make a stored procedure that accepts a few table-valued parameters ( https://msdn.microsoft.com/en-us/library/bb510489.aspx and https://msdn.microsoft.com/en-us/library/bb675163(v=vs.110).aspx ) with the lists of new Parents, Children, Parent-Child junctions, GrandChildren and Child-GrandChild junctions, and perform all the merging on the server inside one transaction without transmitting anything back to the client.
A bunch of MERGE T-SQL statements processing rows in bulk worked quite well for me in similar cases.
Merge Parents, then Children, then GrandChildren tables. Then Junction between Parents and Children. Then Junction between Children and GrandChildren.
As long as the size of collection that you need to merge is reasonable (say, around 10K rows) it would work very well with a single call to the stored procedure. If you have to merge significantly more rows, consider splitting them in smaller batches and calling your stored procedure several times.
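To make the merge order above concrete, here is a minimal sketch. It is not the T-SQL stored procedure itself: it uses Python's built-in sqlite3 module as a stand-in engine, a temp table named Incoming in place of a table-valued parameter, and INSERT ... SELECT with NOT IN / NOT EXISTS in place of MERGE (SQLite has no MERGE). The function name merge_batch and the Incoming table are made up for illustration. It only covers the parent/child step; the grandchild step would repeat the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parent   (PK INTEGER PRIMARY KEY, F_KEY TEXT NOT NULL UNIQUE);
    CREATE TABLE Child    (PK INTEGER PRIMARY KEY, F_KEY TEXT NOT NULL UNIQUE);
    CREATE TABLE Junction (PK_1 INTEGER NOT NULL REFERENCES Parent(PK),
                           PK_2 INTEGER NOT NULL REFERENCES Child(PK),
                           PRIMARY KEY (PK_1, PK_2));
    -- A row that already exists before any batch arrives.
    INSERT INTO Parent (F_KEY) VALUES ('AAA');
""")

def merge_batch(conn, links):
    """Merge (parent_f_key, child_f_key) pairs; existing rows are left alone."""
    links = list(links)
    with conn:  # one transaction for the whole batch
        # Hypothetical staging table, standing in for a table-valued parameter.
        conn.execute("CREATE TEMP TABLE IF NOT EXISTS Incoming (P_KEY TEXT, C_KEY TEXT)")
        conn.execute("DELETE FROM Incoming")
        conn.executemany("INSERT INTO Incoming VALUES (?, ?)", links)
        # 1. Parents that are not present yet.
        conn.execute("""
            INSERT INTO Parent (F_KEY)
            SELECT DISTINCT P_KEY FROM Incoming
            WHERE P_KEY NOT IN (SELECT F_KEY FROM Parent)""")
        # 2. Children that are not present yet.
        conn.execute("""
            INSERT INTO Child (F_KEY)
            SELECT DISTINCT C_KEY FROM Incoming
            WHERE C_KEY NOT IN (SELECT F_KEY FROM Child)""")
        # 3. Junction rows: the ids are looked up by joining on the functional
        #    keys, so no generated id ever travels back to the client.
        conn.execute("""
            INSERT INTO Junction (PK_1, PK_2)
            SELECT DISTINCT p.PK, c.PK
            FROM Incoming i
            JOIN Parent p ON p.F_KEY = i.P_KEY
            JOIN Child  c ON c.F_KEY = i.C_KEY
            WHERE NOT EXISTS (SELECT 1 FROM Junction j
                              WHERE j.PK_1 = p.PK AND j.PK_2 = c.PK)""")
```

Calling merge_batch(conn, [("AAA", "BBB"), ("DDD", "BBB")]) twice in a row should leave the same rows as calling it once, which is the property the original EF code was trying to get by hand.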
Related
First of all, I'm an amateur and not a native English speaker, so I would appreciate it if you would have a little patience with me ;)
Trying to do two things here and I'm not sure if I should do two questions about it, but since it's all related in my case, I would like to say it all in one question.
I'm making a sort of accounting software, in theory for my personal use. I'm using a DB-generated auto_increment ID for almost all my objects, but for some specific cases I need a "parallel", more open ID that won't be the primary key but could be manipulated by the user (yeah, I've read lots of questions about "you don't need a consecutive Primary Key", and I understand it and agree, but let me remark that this column won't be the primary key; let's call it just a "human-not-computer-expert friendly ID") matching these conditions:
The Id should auto increment when no parameters given.
When a number is given as a parameter that number should be used if not occupied, if occupied throw an exception.
The user should be asked if he/she wants to fill the IDs left missing by DELETEs and other operations, so if the user says "yes", the minimum missing ID should be automatically found and used.
I have no problem with doing this "by hand" in C#, but is there some way to achieve something like this in MySQL directly? I've read in the MySQL documentation that AUTO_INCREMENT fulfills my first two conditions, but even if it fills missing deleted numbers by default, which I'm not sure of, I don't want it to do that by default; I need the software to ask first, or at least to do it based on a configuration pre-established by the user.
Therefore I think I should do it by hand in C# (at least the last part, but I suspect I will be forced to do it entirely), which brings up the question about LAST_INSERT_ID.
So, the MYSQL documentation says:
If the previous statement returned an error, the value of LAST_INSERT_ID() is undefined. For transactional tables, if the statement is rolled back due to an error, the value of LAST_INSERT_ID() is left undefined. For manual ROLLBACK, the value of LAST_INSERT_ID() is not restored to that before the transaction; it remains as it was at the point of the ROLLBACK.
I understand that LAST_INSERT_ID() is basically useless if the previous INSERT statement fails for whatever reason.
If that's the case, is there no way to retrieve the last inserted ID that ensures a known behaviour when something fails? Something like returning 0 or raising an SQL exception when the INSERT fails? And if there's no other way, what is the standard way of doing it (I suppose MAX(Id) won't do it), if something like a standard way exists... or should I just stop trying to do it in one go and do the updates first, check that all went ok, and then do a SELECT LAST_INSERT_ID()?
To sum up:
Is there some way to achieve a column of consecutive numbers that fulfills the given conditions in MySQL directly?
What's with LAST_INSERT_ID? Should I give up and not use it directly?
Situation 1, knowing an id that you want inserted into an AUTO_INCREMENT
Honoring that the AI is not a PK as described.
-- drop table a12b;
create table a12b
( id varchar(100) primary key,
ai_id int not null AUTO_INCREMENT,
thing varchar(100) not null,
key(ai_id)
);
insert a12b (id,thing) values ('a','fish'); -- ai_id=1
insert a12b (id,thing) values ('b','dog'); -- 2
insert a12b (id,thing) values ('b2','cat'); -- 3
delete from a12b where id='b';
insert a12b(id,ai_id,thing) values ('b',2,'dog with spots'); -- 2 ******** right here
insert a12b (id,thing) values ('z','goat'); -- 4
select * from a12b;
+----+-------+----------------+
| id | ai_id | thing          |
+----+-------+----------------+
| a  |     1 | fish           |
| b  |     2 | dog with spots |
| b2 |     3 | cat            |
| z  |     4 | goat           |
+----+-------+----------------+
4 rows in set (0.00 sec)
Situation 2, having a system where you delete rows at some point. And want to fill those explicitly deleted gaps later: See my answer Here
Situation 3 (INNODB has a bunch of gaps sprinkled all over):
This was not part of the question. If you need to spot a gap without knowing where it is, shoot for a left join against a helper table loaded up with numbers (this works for ints, not varchars, but then again we are talking about ints here). I know it sounds lame, but helper tables are lean and mean and get the job done. The following would be a helper table: https://stackoverflow.com/a/33666394
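A rough sketch of that helper-table left join follows. It runs against SQLite through Python's sqlite3 module rather than MySQL, so treat it as an illustration of the query shape only. The helper table name numbers, its size (1 to 1000), the sample rows and the two function names are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a12b (id TEXT PRIMARY KEY,
                       ai_id INTEGER NOT NULL UNIQUE,
                       thing TEXT NOT NULL);
    -- Sample rows with ai_id 2 and 5 missing (assumed data).
    INSERT INTO a12b VALUES ('a', 1, 'fish'), ('b2', 3, 'cat'),
                            ('z', 4, 'goat'), ('z2', 6, 'goat');
    -- Helper table: nothing but consecutive integers.
    CREATE TABLE numbers (n INTEGER PRIMARY KEY);
""")
conn.executemany("INSERT INTO numbers (n) VALUES (?)",
                 ((n,) for n in range(1, 1001)))

def smallest_missing_id(conn):
    """Lowest helper number that has no matching ai_id."""
    return conn.execute("""
        SELECT MIN(h.n)
        FROM numbers h
        LEFT JOIN a12b t ON t.ai_id = h.n
        WHERE t.ai_id IS NULL""").fetchone()[0]

def gaps_below_max(conn):
    """Every unused number below the current highest ai_id."""
    rows = conn.execute("""
        SELECT h.n
        FROM numbers h
        LEFT JOIN a12b t ON t.ai_id = h.n
        WHERE t.ai_id IS NULL
          AND h.n < (SELECT MAX(ai_id) FROM a12b)
        ORDER BY h.n""").fetchall()
    return [r[0] for r in rows]

print(smallest_missing_id(conn))  # 2
print(gaps_below_max(conn))       # [2, 5]
```

The application can then ask the user whether to reuse the number returned by smallest_missing_id, and pass it explicitly in the INSERT as in Situation 1.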
INNODB Gap Anomaly
using the above table with 4 rows, continue with:
insert a12b (id,thing) values ('z','goat'); -- oops, problem, failed, but AI is incremented behind the scene
insert a12b (id,thing) values ('z2','goat'); -- 6 (you now have a gap)
data:
+----+-------+----------------+
| id | ai_id | thing          |
+----+-------+----------------+
| a  |     1 | fish           |
| b  |     2 | dog with spots |
| b2 |     3 | cat            |
| z  |     4 | goat           |
| z2 |     6 | goat           |
+----+-------+----------------+
There are a ton of ways to generate gaps. See This and That
I am designing a SQL Server database and I have to create multiple FKs in one column.
I have these tables, and I want to create a menu in one table:
Table 1          Table 2        Table 3
| Pages          | Jobs         | News
|------------    |---------     |-----------
| Pageid         | Jobid        | NewsId
| PageName       | JobName      | NewsTitle
| MenuName       | MenuName     | MenuName
My aim is to reference these tables in one column.
I have a table from this scenario
| MenuGroup
|------------
| menuGroupId
| MenuName
| RecordeId
So how can I achieve a normalized database design?
Sol 1 (Fixed no of Columns):
This is the most standard and normalized solution. You can create a new table with nullable columns as suggested by @Tim
| JoiningTable
|------------
| Id
| PageId
| JobId
| NewsId
Sol2:(Dynamic no of Columns):
I do not consider this a good approach, since referential integrity is lost here, but for a dynamic number of columns I don't have any other solution except this one.
Type:
|------------
| TypeId
| Name
JoiningTable
|------------
| Id
| JoiningId
| TypeId (news,job,pages etc etc)
You can compress these two tables into one by replacing a TypeId with type field in JoiningTable.
NoSQL may also be a solution, but I have no experience working with NoSQL, so I cannot recommend anything about that.
I suggest you create another table that has the 3 tables' ids as foreign keys, then use the primary key of this new table in the MenuGroup table. You can then use LEFT JOIN to get to the individual tables through the new table.
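As a sketch of the joining table with three nullable foreign keys and the LEFT JOINs that resolve it: the example below uses SQLite via Python's sqlite3 module instead of SQL Server, and the sample rows are invented. The CHECK constraint (exactly one of the three ids set per row) is an extra assumption of mine, not something the answers above require, and its boolean-sum form is a SQLite idiom that would need rewriting for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE Pages (PageId INTEGER PRIMARY KEY, PageName  TEXT);
    CREATE TABLE Jobs  (JobId  INTEGER PRIMARY KEY, JobName   TEXT);
    CREATE TABLE News  (NewsId INTEGER PRIMARY KEY, NewsTitle TEXT);
    CREATE TABLE JoiningTable (
        Id     INTEGER PRIMARY KEY,
        PageId INTEGER REFERENCES Pages(PageId),
        JobId  INTEGER REFERENCES Jobs(JobId),
        NewsId INTEGER REFERENCES News(NewsId),
        -- Assumption: each row points at exactly one of the three tables.
        CHECK ((PageId IS NOT NULL) + (JobId IS NOT NULL)
               + (NewsId IS NOT NULL) = 1)
    );
    CREATE TABLE MenuGroup (
        menuGroupId INTEGER PRIMARY KEY,
        MenuName    TEXT,
        RecordeId   INTEGER NOT NULL REFERENCES JoiningTable(Id)
    );
    -- Invented sample data.
    INSERT INTO Pages VALUES (1, 'About');
    INSERT INTO Jobs  VALUES (1, 'Editor');
    INSERT INTO News  VALUES (1, 'Launch');
    INSERT INTO JoiningTable VALUES (1, 1, NULL, NULL),
                                    (2, NULL, 1, NULL),
                                    (3, NULL, NULL, 1);
    INSERT INTO MenuGroup VALUES (1, 'Main', 1), (2, 'Main', 2), (3, 'Footer', 3);
""")

def menu_entries(conn):
    """Resolve every menu row to the title of whichever record it points at."""
    return conn.execute("""
        SELECT m.MenuName, COALESCE(p.PageName, j.JobName, n.NewsTitle)
        FROM MenuGroup m
        JOIN JoiningTable jt ON jt.Id = m.RecordeId
        LEFT JOIN Pages p ON p.PageId = jt.PageId
        LEFT JOIN Jobs  j ON j.JobId  = jt.JobId
        LEFT JOIN News  n ON n.NewsId = jt.NewsId
        ORDER BY m.menuGroupId""").fetchall()
```

Because every column is a real foreign key, an attempt to point a JoiningTable row at a page that does not exist is rejected by the database, which is the integrity that Sol 2 gives up.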
Having multiple FK's in a single table isn't bad, try using the MenuName column as FK so you won't have to create an extra field in your database.
I have 3 tables simplified to the below:
main_table
id | attribute1_id | attribute2_id | price
attribute1_table
id | attribute_name
attribute2_table
id | attribute_name
I've created a view in Sql Server that joins the tables together to give me the following output:
main_table
id | attribute1_id | attribute2_id | attribute1_name | attribute2_name | price
The problem I have is that I want to show the data in a DataGridView and allow the price to be editable. But I've created a "view", which I take it is not the correct thing to use (i.e. it's called a "view", which doesn't sound editable?)
I know I could create my own script to go through and update only the "main_table" but I think there must be a way to use DataGridView / Linked datasets with joined tables?
Best thing I can advise is to create a stored procedure that takes all of the parameters and then within that procedure do the individual update statements to the table. Should work fairly well.
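The idea is roughly this: read through the joined view, but send the edit to the base table only. Below is a small sketch using SQLite through Python's sqlite3 module as a stand-in for SQL Server, with a plain function where the answer suggests a stored procedure. The view name main_view, the function name update_price and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attribute1_table (id INTEGER PRIMARY KEY, attribute_name TEXT);
    CREATE TABLE attribute2_table (id INTEGER PRIMARY KEY, attribute_name TEXT);
    CREATE TABLE main_table (
        id            INTEGER PRIMARY KEY,
        attribute1_id INTEGER REFERENCES attribute1_table(id),
        attribute2_id INTEGER REFERENCES attribute2_table(id),
        price         REAL
    );
    -- Read-only presentation of the joined data (hypothetical view name).
    CREATE VIEW main_view AS
        SELECT m.id, m.attribute1_id, m.attribute2_id,
               a1.attribute_name AS attribute1_name,
               a2.attribute_name AS attribute2_name,
               m.price
        FROM main_table m
        JOIN attribute1_table a1 ON a1.id = m.attribute1_id
        JOIN attribute2_table a2 ON a2.id = m.attribute2_id;
    -- Invented sample data.
    INSERT INTO attribute1_table VALUES (1, 'red');
    INSERT INTO attribute2_table VALUES (1, 'large');
    INSERT INTO main_table VALUES (1, 1, 1, 10.0);
""")

def update_price(conn, row_id, new_price):
    """Write the edited price to the base table; return the rows affected."""
    with conn:
        cur = conn.execute("UPDATE main_table SET price = ? WHERE id = ?",
                           (new_price, row_id))
    return cur.rowcount
```

The grid would be filled from main_view, and its save handler would call something like update_price with the id and price of each edited row; the next read of the view then shows the new price.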
I'm working with databases for the first time (SQL CE 3.5) and I'm not certain how I define a relationship between tables where later ( I think) I'll have to use a join to select some value for one field from another table.
MY TABLE     TABLE A     TABLE B
--------     -------     -------
OrderID      a_Text      b_Text
a_Text
b_Text
Once it is all implemented, when I set a value for a_Text in [MY TABLE] I only want to be able to use a value that is defined in [Table A] (and the same for b_Text with [Table B]).
What you want here is a Foreign Key Constraint on your a_Text field which enforces a link between MY_TABLE/TABLE_A for that particular field value.
This is a database relationship and as such should be defined at database-level and if required, at model-level. Most modern day ORM technologies e.g. EntityFramework/NHibernate, do a pretty good job at representing the same relationships at model-level, or at least make it very trivial to do so - EF will do it automatically if you create a context via a database directly.
It's pretty simple to create a relationship using SQLCE through the VS Designer - Walkthrough: Creating a SQL Server Compact Database gives an example of adding a relationship between two tables.
Based on your requirements I wouldn't recommend having your value field (a_Text) in TABLE A as the PK. One of the biggest concerns is if you update the key you need to cascade that change throughout the other referencing tables. It's much more flexible to introduce a surrogate key and make your a_Text field a unique key itself.
My Table     Table A     Table B
--------     -------     -------
OrderID      a_ID        b_ID
a_ID         a_Text      b_Text
b_ID
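Here is a minimal sketch of that layout with the constraints in place. It uses SQLite via Python's sqlite3 module rather than SQL CE 3.5, so the DDL syntax will differ in detail. The table names without spaces, the b_ID / b_Text column names for Table B, the sample values and the add_order helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    -- Surrogate keys as PKs; the text values stay unique but are free to change.
    CREATE TABLE TableA (a_ID INTEGER PRIMARY KEY, a_Text TEXT NOT NULL UNIQUE);
    CREATE TABLE TableB (b_ID INTEGER PRIMARY KEY, b_Text TEXT NOT NULL UNIQUE);
    CREATE TABLE MyTable (
        OrderID INTEGER PRIMARY KEY,
        a_ID    INTEGER NOT NULL REFERENCES TableA(a_ID),
        b_ID    INTEGER NOT NULL REFERENCES TableB(b_ID)
    );
    -- Invented sample data.
    INSERT INTO TableA (a_ID, a_Text) VALUES (1, 'red');
    INSERT INTO TableB (b_ID, b_Text) VALUES (1, 'large');
""")

def add_order(conn, a_id, b_id):
    """Insert an order; return False if either id is not a defined value."""
    try:
        with conn:
            conn.execute("INSERT INTO MyTable (a_ID, b_ID) VALUES (?, ?)",
                         (a_id, b_id))
        return True
    except sqlite3.IntegrityError:
        return False

def order_texts(conn):
    """The join mentioned in the question: ids resolved back to their text."""
    return conn.execute("""
        SELECT o.OrderID, a.a_Text, b.b_Text
        FROM MyTable o
        JOIN TableA a ON a.a_ID = o.a_ID
        JOIN TableB b ON b.b_ID = o.b_ID
        ORDER BY o.OrderID""").fetchall()
```

With this in place, renaming 'red' in TableA is a single-row update and no cascade is needed, because MyTable only stores the surrogate id.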
I'm working on a local city project and have some questions on efficiently creating relationships between "parks" and "activities" in Microsoft SQL 2000. We are using ASP.NET C# to
I have my two tables "Parks" and "Activities." I have also created a lookup table with the proper relationships set on the primary keys of both "Parks" and "Activities." My lookup table is called "ParksActitivies."
We have about 30 activities that we can associate with each park. An intern is going to be managing the website, and the activities will be evaluated every 6 months.
So far I have created an admin tool that allows you to add/edit/delete each park. Adding a park is simple. The data is new, so I simply allow them to edit the park details, and associate "Activities" dynamically pulled from the database. This was done in a repeater control.
Editing works, but I don't feel that it's as efficient as it could be. Saving the main park details is no problem, as I simply call Save() on the park instance that I created. However, to remove the stale records in the lookup table I simply run "DELETE FROM ParksActitivies WHERE ParkID = @ParkID" and then INSERT a record for each of the checked activities.
For my ID column on the lookup table, I have an incrementing integer value, which after quite a bit of testing has got into the thousands. While this does work, I feel that there has to be a better way to update the lookup table.
Can anyone offer some insight on how I may improve this? I am currently using stored procedures, but I'm not the best at very complex statements.
[ParkID | ParkName | Latitude | Longitude ]
 1      | Freemont | -116.34  | 35.32
 2      | Jackson  | -116.78  | 34.2
[ActivityID | ActivityName | Description ]
 1          | Picnic       | Blah
 2          | Dancing      | Blah
 3          | Water Polo   | Blah
[ID | ParkID | ActivityID ]
 1  | 1      | 2
 2  | 2      | 1
 3  | 2      | 2
 4  | 2      | 3
I would prefer to learn how to do it a more universal way as opposed to using Linq-To-SQL or ADO.NET.
would prefer to learn how to do it a more universal way as opposed to using LINQ2SQL or ADO.NET.
You're obviously using ADO.NET Core :). And that's fine; I think you should stick to using stored procedures and DbCommands and such...
If you were using MSSQL 2008 you'd be able to do this using table-valued parameters and the MERGE statement. Since you're using MSSQL 2000 (why?), what you'd need to do is the following:
1. Send a comma delimited list of the Activity ids (the new ones) along with the ParkId to your stored proc. The ActivityIds parameter would be a varchar(50) for example.
In your stored proc you can split the ids
The strategy would be something like
1. For the Ids passed in, delete records that don't match
The SQL for that would be
DELETE FROM ParkActivities
WHERE ActivityId NOT IN (Some List of Ids)
AND ParkId = @ParkId
Since your list is a string you can do it like this (if @ParkId is an int, convert it to a varchar variable first, because it is concatenated into the statement text)
EXEC('DELETE FROM ParkActivities WHERE ActivityId NOT IN (' + @ActivityIds + ') AND ParkId = ' + @ParkId)
Now you can insert those activities that are not already in the table. The simplest way to do this would be to insert the ParkActivity ids into a temp table. To do that you'll need to split the comma delimited list into individual ids and insert them into a temp table. Once you have the data in the temp table you can insert doing a join.
There is a user-defined function for MSSQL 2000 that can do the split and return a table variable with each value on a separate row.
http://msdn.microsoft.com/en-us/library/Aa496058
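Put together, the strategy above (split the list, delete what no longer matches, insert what is missing) can be sketched as follows. This is not SQL Server 2000 code: it uses SQLite through Python's sqlite3 module, does the splitting in Python instead of a split function, and uses a composite primary key instead of the incrementing ID column. The temp table name Wanted and the function name sync_activities are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ParkActivities (
        ParkId     INTEGER NOT NULL,
        ActivityId INTEGER NOT NULL,
        PRIMARY KEY (ParkId, ActivityId)
    );
    -- Same links as the sample data in the question.
    INSERT INTO ParkActivities VALUES (1, 2), (2, 1), (2, 2), (2, 3);
""")

def sync_activities(conn, park_id, activity_ids_csv):
    """Make the park's links match the comma-delimited id list.

    Returns (rows_deleted, rows_inserted); untouched links are left in place.
    """
    wanted = [int(part) for part in activity_ids_csv.split(",") if part.strip()]
    with conn:  # one transaction for delete + insert
        # Hypothetical temp table holding the split ids.
        conn.execute("CREATE TEMP TABLE IF NOT EXISTS Wanted (ActivityId INTEGER PRIMARY KEY)")
        conn.execute("DELETE FROM Wanted")
        conn.executemany("INSERT OR IGNORE INTO Wanted VALUES (?)",
                         ((a,) for a in wanted))
        # 1. Remove links that are no longer checked.
        deleted = conn.execute("""
            DELETE FROM ParkActivities
            WHERE ParkId = ?
              AND ActivityId NOT IN (SELECT ActivityId FROM Wanted)""",
            (park_id,)).rowcount
        # 2. Add links that are checked but not stored yet.
        inserted = conn.execute("""
            INSERT INTO ParkActivities (ParkId, ActivityId)
            SELECT ?, w.ActivityId
            FROM Wanted w
            WHERE NOT EXISTS (SELECT 1 FROM ParkActivities pa
                              WHERE pa.ParkId = ? AND pa.ActivityId = w.ActivityId)""",
            (park_id, park_id)).rowcount
    return deleted, inserted
```

For park 2 in the sample data, passing "1,3,4" removes only the Dancing link and adds one new link, so rows that did not change are never deleted and re-created. Parsing the ids with int() also avoids concatenating raw user input into the statement text.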
What is wrong with LinqToSQL and ADO.NET? I mean, could you specify your doubts about using those technologies?
update
If LinqToSQL is not supported for 2000, you can easily upgrade to the free SQL Server 2008 Express. It would definitely be enough for the purposes you described.