I'm currently working on a sandbox environment based on two databases located on different servers. What I am aiming to do is allow my clients to make changes on a test server and then once approved, I can simply hit a button and import the data across to my live database.
So far, I have managed to port the data across the two databases, but what I would like to do is amend the primary keys on the test server to match those held on the live server (in case I need backups, and so that I can make checks to stop the same information being copied multiple times).
So far I have tried this solution:
DT_SitePage OldPage = new DT_SitePage
{
    PageID = SP.PageID
};
DT_SitePage NewPage = new DT_SitePage
{
    PageID = int.Parse(ViewState["PrimaryKey"].ToString())
};
Sandbox.DT_SitePages.Attach(NewPage, OldPage);
Sandbox.SubmitChanges();
However I keep getting the error:
***Value of member 'PageID' of an object of type 'DT_SitePage' changed.
A member defining the identity of the object cannot be changed.
Consider adding a new object with new identity and deleting the existing one instead.***
Is there any way in LINQ to avoid this error and force the database to update this field?
Many Thanks
Why don't you use the stock backup/restore functionality supplied by the DB manufacturer?
It makes perfect sense that high-level ORM tools won't allow you to change the primary key of a record, as they identify the record only by its primary key.
You should consider making direct UPDATE queries to the DB from your code instead.
And anyway, changing the primary key is a bad idea. What prevents you from INSERTing the row with the needed value in the first place?
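To illustrate the "INSERT it with the needed value in the first place" suggestion, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the real databases, and the table and function names (site_page, copy_page_with_live_key) are made up for the example. On SQL Server, a column declared as IDENTITY would additionally need SET IDENTITY_INSERT ... ON around the insert.

```python
import sqlite3


def copy_page_with_live_key(conn, live_id, title):
    """Insert a row using the key that the live server already assigned."""
    conn.execute(
        "INSERT INTO site_page (page_id, title) VALUES (?, ?)",
        (live_id, title),
    )


# Hypothetical test database with a single page table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE site_page (page_id INTEGER PRIMARY KEY, title TEXT)")

# The key 4711 stands for whatever value the live database holds.
copy_page_with_live_key(conn, 4711, "About us")
rows = conn.execute("SELECT page_id, title FROM site_page").fetchall()
```

Because the key is supplied at insert time, nothing ever has to change a primary key afterwards.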
As said, modifying primary keys is typically something you don't want to do. If LINQ to SQL didn't give this early warning, your RDBMS would probably complain (SQL Server does!). Updating primary keys is not trivial, especially when records are related by foreign key constraints.
In cross-database scenarios, it is more common to use some "global" unique identification, for which GUIDs may do. Maybe your data can be identified in an alternative way? (Like when two users have the same name, they are deemed identical).
If you don't need to keep the database structures identical, you may consider using an extra field in your test database to store the "live" primary key.
Here is a post with lots of useful thoughts.
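The extra-field idea could be sketched roughly as follows. This is only an illustration under assumptions: sqlite3 stands in for both servers, and the names site_page, live_page_id and publish are invented for the example.

```python
import sqlite3


def publish(test_db, live_db):
    """Copy pages from test to live, remembering the live key on the test row."""
    pages = test_db.execute(
        "SELECT page_id, title, live_page_id FROM site_page"
    ).fetchall()
    for page_id, title, live_id in pages:
        if live_id is None:
            # First publish: let live assign its own key, then record it.
            cur = live_db.execute(
                "INSERT INTO site_page (title) VALUES (?)", (title,)
            )
            test_db.execute(
                "UPDATE site_page SET live_page_id = ? WHERE page_id = ?",
                (cur.lastrowid, page_id),
            )
        else:
            # Already published: update in place instead of copying again.
            live_db.execute(
                "UPDATE site_page SET title = ? WHERE page_id = ?",
                (title, live_id),
            )


test_db = sqlite3.connect(":memory:")
live_db = sqlite3.connect(":memory:")
test_db.execute(
    "CREATE TABLE site_page ("
    "page_id INTEGER PRIMARY KEY, title TEXT, live_page_id INTEGER)"
)
live_db.execute("CREATE TABLE site_page (page_id INTEGER PRIMARY KEY, title TEXT)")
live_db.execute("INSERT INTO site_page (title) VALUES ('Existing live page')")
test_db.execute("INSERT INTO site_page (title) VALUES ('Draft')")

publish(test_db, live_db)
test_db.execute("UPDATE site_page SET title = 'Final'")
publish(test_db, live_db)  # the second run must not create a duplicate

live_rows = live_db.execute(
    "SELECT page_id, title FROM site_page ORDER BY page_id"
).fetchall()
```

The test row keeps its own primary key, and the stored live key is what prevents the same page from being copied twice.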
Related
What is the best practice to handle the following situation?
It is known that many records (thousands) will be inserted with a fair possibility of a primary key exception. In some cases the exception should trigger some alternative queries and logic. In other cases it doesn't matter much and I merely want to log the event.
Should the insert be attempted and the exception caught?
or
Should a query be made to check for the existing record, then attempt the insert if none exists?
or Both?
I have noticed slightly better performance when merely catching the exception, but there's not a significant difference.
IMO it depends. If the client is responsible for generating a PK, using a UUID or Snowflake etc. where keys are expected to be unique, then the first option is fine. Whether you bother with a retry after generating a new ID or simply fail the operation and ask the user to try again (as it should be a 1 in a billion exception, not the norm) is up to you. If the data relies on sequences or user-entered meaningful keys, it should be managed on the DB side using DatabaseGeneratedOption.Identity and meaningless keys, with related object graphs created and committed within a single SaveChanges call.
The typical concern around ID generation and EF is usually where developers don't rely on EF/the DB to manage the PK and FKs through navigation properties. They feel they need to know the PK in order to set FKs for related data, either saving the primary entity to get the PK or generating keys client-side. One of the key benefits of using an ORM like EF is giving it the related objects and letting it manage the inserting of PKs and FKs automatically.
There are a couple of things here.
First, you must have a primary key constraint on the column at the database level.
At the Entity Framework level, it is good practice to check whether the record exists. Query for the record by its primary key; if it is found, EF returns the entity, you make your changes to it, and SaveChanges persists them.
If the entity is not found, you add it.
Updating without querying first is problematic for EF, especially if multiple requests try to update the same record.
One more case: if several requests can insert the same record and you generate primary keys manually, the primary key constraint will help here by rejecting the duplicates.
For updates too, there is a risk of data loss if you do not take care of concurrency.
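For what it's worth, the "attempt the insert and catch the exception" option can be sketched like this. It uses sqlite3 rather than Entity Framework, and the item table and insert_or_report function are hypothetical. A check-then-insert variant still leaves a window between the SELECT and the INSERT in which another request can insert the same key, so the constraint (and handling its violation) is needed either way.

```python
import sqlite3


def insert_or_report(conn, key, value):
    """Attempt the insert; return False if the key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO item (id, value) VALUES (?, ?)", (key, value)
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key constraint fired: log the event or run the
        # alternative logic here.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, value TEXT)")

# Key 1 is submitted twice; only the first attempt succeeds.
results = [insert_or_report(conn, key, "v") for key in (1, 2, 1)]
```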
I need to one-way-synchronize external Data with CRM on a regular basis (i.e. nightly).
This includes creating new Records as well as updating existing ones.
That means, I have to keep track of IDs of CRM-Entities created by my synchronization process.
I have already managed to create and update records in CRM from rows of database tables, so this is not a problem.
Currently, my mapped tables have the following columns
id: The table's primary key, set when inserting a new row
new_myentityid: Primary Attribute of the mapped entity, set after the record was created by the synchronization process
new_name etc.: the values of the records attributes
However, I see a way to drastically simplify the whole process:
Instead of having a primary key (id) in the database and keeping track of the CRM ID (new_myentityid) in a separate column, I could just as well get rid of the id columns, make the CRM ID column (new_myentityid) the primary key of the table, and set it when inserting new records (newid()). So basically I would substitute id with new_myentityid from a database perspective. I could then bulk-upsert via ExecuteMultipleRequest in combination with UpsertRequest.
This way, I would save a column in each mapped table as well as logic to store the CRM IDs after creating them.
Question
Would this be acceptable or is there anything that should make me avoid this?
Disclaimer: I'm not aware of a best practice for this so this is just my personal opinion on the matter having developed for Dynamics several times.
I think that using the CRM Entity GUID for your primary key is a good idea. It's less complicated and is handled well in SQL. I assume the column in your database is uniqueidentifier.
My only comment is to not generate the GUIDs yourself. Let CRM generate them for you as it does a better job at keeping everything sequential and indexed.
See this blog entry on MSDN for further detail
I'm probably a little late to this discussion but just wanted to add my tuppence worth.
There is nothing inherently wrong with specifying the GUID when creating a new record in CRM, and this behaviour is explicitly supported by the SDK.
A common real life scenario is when creating records by script; it is useful to have the same GUID for an entity in Dev, Test and Production environments (Admittedly we normally use the GUID auto generated in Dev).
The reason that it is considered best practice to allow CRM to generate its own GUID (https://msdn.microsoft.com/en-us/library/gg509027.aspx) is that CRM will generate the GUID sequentially. Using newid() generates a statistically random GUID. This has a performance impact on SQL Server around index maintenance. This thread provides some insight: What are the performance improvement of Sequential Guid over standard Guid?
But basically specifying your own GUID can cause the underlying SQL INSERT statement to become more expensive. Read and Update operations should remain the same.
If you are generating your own GUIDs in SQL, you can always use NEWSEQUENTIALID (https://msdn.microsoft.com/en-us/library/ms189786.aspx) to get sequentially generated GUIDs.
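The index-maintenance argument can be made a bit more tangible with a toy model. The sketch below is not how SQL Server stores or orders uniqueidentifier values (it sorts them with its own byte order); it simply treats the index as a sorted list and counts how many inserts land somewhere other than the end. The function name and the way the "sequential" keys are produced are assumptions for illustration.

```python
import bisect
import random
import uuid


def count_mid_index_inserts(keys):
    """Count inserts that land before the current end of a sorted index."""
    index = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos < len(index):
            # A real B-tree would have to touch an interior page here,
            # which is where page splits and fragmentation come from.
            mid_inserts += 1
        index.insert(pos, key)
    return mid_inserts


rng = random.Random(0)  # fixed seed so the run is repeatable
random_keys = [uuid.UUID(int=rng.getrandbits(128)) for _ in range(1000)]
# Stand-in for sequentially generated GUIDs: strictly increasing values.
sequential_keys = [uuid.UUID(int=n) for n in range(1, 1001)]

random_mid = count_mid_index_inserts(random_keys)
sequential_mid = count_mid_index_inserts(sequential_keys)
```

With increasing keys every insert is an append, while with random keys nearly every insert goes into the middle of the index.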
Hi, the previous posts cover this well. Just to note that if you did go with generating GUIDs outside of CRM, you could mitigate the potential performance impact on INSERTs simply by running a weekly maintenance plan to rebuild the clustered indexes directly on the SQL database(s). This would, I believe, ensure that the GUIDs were ordered sequentially. In any case, the CRM API will always be the bottleneck, so it is best to do things the way the platform expects, to avoid issues later on.
Why not save to a new table?
For example, if the original table is named "customer", save your new data in "customer_update", with the same fields as the original table.
It will help you in the future, in case you want to look at the data's origin.
We have a text processing application developed in C# using .NET FW 4.0 where the Administrator can define various settings. All this 'settings' data reside in about 50 tables with foreign key relations and Identity primary keys (this one will make it tricky, I think). The entire database is no more than 100K records, with the average table having about 6 short columns. The system is based on MS SQL 2008 R2 Express database.
We face a requirement to create a snapshot of all this data so that the administrator of the system could roll back to one of the snapshots anytime he screws up something. We need to keep the last 5 snapshots only. Creation of the snapshot must be started from the application GUI, and so must the rollback to any of the snapshots if needed (using SSMS will not be allowed as direct access to the DB is denied). The system is still in development (are we ever really finished?), which means that new tables and columns are added frequently. Thus we need a robust method that can take care of changes automatically (digging through code after inserting/changing columns is something we want to avoid unless there's no other way). The best way would be to say "I want to create a snapshot of all tables where the name begins with 'Admin'". Obviously, this is quite a DB-intensive task, but because it will be used in emergency situations only, this is something that I do not mind. I also do not mind if table locks happen, as nothing will try to use these tables while the creation or rollback of the snapshot is in progress.
The problem can be divided into 2 parts:
creating the snapshot
rolling back to the snapshot
Regarding problem #1. we may have two options:
export the data into XML (file or database column)
duplicate the data inside SQL into the same or different tables (like creating the same table structure again with the same names as the original tables prefixed with "Backup").
Regarding problem #2. the biggest issue I see is how to re-import all data into foreign key related tables which use IDENTITY columns for PK generation. I need to delete all data from all affected tables then re-import everything while temporarily relaxing FK constraints and switching off Identity generation. Once data is loaded I should check if FK constraints are still OK.
Or perhaps I should find a logical way to load tables so that constraint checking can remain in place while loading (as we do not have an unmanageable number of tables this could be a viable solution). Of course I need to do all deletion and re-loading in a single transaction, for obvious reasons.
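The "logical way to load tables" mentioned above amounts to a topological sort of the tables by their foreign key dependencies. A rough sketch, assuming SQLite as a stand-in (on SQL Server the dependencies would come from the catalog views instead of PRAGMA foreign_key_list) and Python 3.9+ for graphlib; the Admin* tables and the load_order function are invented for the example:

```python
import sqlite3
from graphlib import TopologicalSorter


def load_order(conn, prefix):
    """Return tables whose names start with prefix, parents before children."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' AND name LIKE ?",
            (prefix + "%",),
        )
    ]
    graph = {}
    for table in tables:
        # Column 2 of PRAGMA foreign_key_list is the referenced (parent) table.
        parents = {
            fk[2] for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        }
        # Keep only parents inside the snapshot set and ignore self-references.
        graph[table] = (parents & set(tables)) - {table}
    return list(TopologicalSorter(graph).static_order())


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AdminGrand (id INTEGER PRIMARY KEY,
                             child_id INTEGER REFERENCES AdminChild(id));
    CREATE TABLE AdminChild (id INTEGER PRIMARY KEY,
                             parent_id INTEGER REFERENCES AdminParent(id));
    CREATE TABLE AdminParent (id INTEGER PRIMARY KEY);
    CREATE TABLE Other (id INTEGER PRIMARY KEY);
""")
order = load_order(conn, "Admin")
```

Loading in this order (and deleting in the reverse order) lets the constraints stay enabled. Circular references between tables would make TopologicalSorter raise a CycleError, in which case the constraints have to be relaxed after all.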
I suspect there may be no pure SQL-based solution for this, although SQL CLR might be of help to avoid moving data out of SQL Server.
Is there anyone out there with the same problem we face? Maybe someone who successfully solved such problem?
I do not expect a step by step instruction. Any help on where to start, which routes to take (export to RAW XML or keep snapshot inside the DB or both), pros/cons would be really helpful.
Thank you for your help and your time.
Daniel
We don't have this exact problem, but we have a very similar problem in which we provide our customers with a baseline set of configuration data (fairly complex, mostly identity PKs) that needs to be updated when we provide a new release.
Our mechanism is probably overkill for your situation, but I am sure there is a subset of it that is applicable.
The basic approach is this:
First, we execute a script that drops all of the FK constraints and changes the nullability of those FK columns that are currently NOT NULL to NULL. This script also drops all triggers to ensure that any logical constraints implemented in them will not be executed.
Next, we perform the data import, setting identity_insert on before updating a table, then setting it back off after the data in the table is updated.
Next, we execute a script that checks the data integrity of the newly added items with respect to the foreign keys. In our case, we know that items that do not have a corresponding parent record can safely be deleted, but you may choose to take a different approach (report the error and let someone manually handle the issue).
Finally, once we have verified the data, we execute another script that restores the nullability, adds the FKs back, and reinstalls the triggers.
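A much reduced sketch of these four steps, using SQLite as a stand-in: foreign key enforcement is switched off rather than the constraints being dropped, there are no triggers, and SQLite needs no identity_insert because explicit key values are always accepted. The parent/child tables and the function name are hypothetical.

```python
import sqlite3


def import_with_relaxed_checks(conn, parents, children):
    """Load rows with FK enforcement off, delete orphans, then re-enable it."""
    conn.execute("PRAGMA foreign_keys = OFF")  # step 1: relax constraints
    with conn:  # step 2: import the data in one transaction
        conn.executemany("INSERT INTO parent (id, name) VALUES (?, ?)", parents)
        conn.executemany(
            "INSERT INTO child (id, parent_id) VALUES (?, ?)", children
        )
    # Step 3: list rows that violate a foreign key and delete them.
    orphans = conn.execute("PRAGMA foreign_key_check").fetchall()
    with conn:
        for table, rowid, _parent_table, _fk_index in orphans:
            conn.execute(f'DELETE FROM "{table}" WHERE rowid = ?', (rowid,))
    conn.execute("PRAGMA foreign_keys = ON")  # step 4: restore enforcement
    return len(orphans)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE child ("
    "id INTEGER PRIMARY KEY, parent_id INTEGER REFERENCES parent(id))"
)

# Child 11 points at parent 99, which is not part of the import.
removed = import_with_relaxed_checks(conn, [(1, "a")], [(10, 1), (11, 99)])
remaining = conn.execute("SELECT id, parent_id FROM child").fetchall()
```

Deleting the orphans is the choice described above; reporting them for manual handling would replace the DELETE loop.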
If you have the budget for it, I would strongly recommend that you take a look at the tools that Red Gate provides, specifically SQL Packager and SQL Data Compare (I suspect there may be other tools out there as well, we just don't have any experience with them). These tools have been critical in the successful implementation of our strategy.
Update
We provide the baseline configuration through an SQL Script that is generated by RedGate's SQL Packager.
Because our end-users can modify the database between updates which will cause the identity values in their database to be different in ours, we actually store the baseline primary and foreign keys in separate fields within each record.
When we update the customer database and we need to link new records to known configuration information, we can use the baseline fields to find out what the database-specific FKs should be.
In other words, there is always a known set of field ids for well-known configuration records regardless of what other data is modified in the database, and we can use this to link records together.
For example, if I have Table1 linked to Table2, Table1 will have a baseline PK and Table2 will have a baseline PK and a baseline FKey containing Table1's baseline PK. When we update records, if we add a new Table2 record, all we have to do is find the Table1 record with the specified baseline PK, then update the actual FKey in Table2 with the actual PK in Table1.
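The Table1/Table2 example could be sketched as follows. The column names (baseline_pk, baseline_fk, table1_id) are guesses at what such a schema might look like, and SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, baseline_pk INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, baseline_pk INTEGER,
                         baseline_fk INTEGER,
                         table1_id INTEGER REFERENCES table1(id));
    -- In this customer's database the well-known record 1 happens to have id 57.
    INSERT INTO table1 (id, baseline_pk, name) VALUES (57, 1, 'Well-known config');
    -- Row from a new release: only the baseline keys are known at import time.
    INSERT INTO table2 (baseline_pk, baseline_fk) VALUES (200, 1);
""")


def resolve_foreign_keys(conn):
    """Fill in the real FK by looking up the parent through its baseline key."""
    with conn:
        conn.execute("""
            UPDATE table2
               SET table1_id = (SELECT id FROM table1
                                 WHERE table1.baseline_pk = table2.baseline_fk)
             WHERE table1_id IS NULL
        """)


resolve_foreign_keys(conn)
linked = conn.execute("SELECT baseline_pk, table1_id FROM table2").fetchall()
```

The baseline columns never change, so the lookup works no matter which identity values the customer's database has handed out.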
A kind of versioning by date ranges is a common method for records in Enterprise applications. As an example we have a table for business entities (us) or companies (uk) and we keep the current official name in another table as follows:
CompanyID  Name          ValidFrom   ValidTo
12         Business Lld  2000-01-01  2008-09-23
12         Business Inc  2008-09-23  NULL
The NULL in the last record means that this is the current one. You may use the above logic and possibly add more columns to gain more control. This way there are no duplicates, you can keep the history to any depth, and you can synchronize the current values across tables easily. Performance is also good, since finding the current row is a simple filter on ValidTo.
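Querying such a table could look like the sketch below. It assumes the ranges are half-open (ValidFrom inclusive, ValidTo exclusive), which is a guess based on 2008-09-23 appearing in both rows; the table and function names are made up, and SQLite with ISO date strings stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company_name (company_id INTEGER, name TEXT,
                               valid_from TEXT, valid_to TEXT);
    INSERT INTO company_name VALUES (12, 'Business Lld', '2000-01-01', '2008-09-23');
    INSERT INTO company_name VALUES (12, 'Business Inc', '2008-09-23', NULL);
""")


def current_name(conn, company_id):
    """The current row is the one whose valid_to is NULL."""
    row = conn.execute(
        "SELECT name FROM company_name WHERE company_id = ? AND valid_to IS NULL",
        (company_id,),
    ).fetchone()
    return row[0] if row else None


def name_on(conn, company_id, day):
    """Name valid on an ISO date, treating ranges as [valid_from, valid_to)."""
    row = conn.execute(
        """SELECT name FROM company_name
            WHERE company_id = ? AND valid_from <= ?
              AND (valid_to IS NULL OR ? < valid_to)""",
        (company_id, day, day),
    ).fetchone()
    return row[0] if row else None
```

For example, name_on(conn, 12, '2005-06-01') picks the first row, while current_name(conn, 12) picks the second.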
I have inherited an old shabby database, and would like to put loads of foreign keys in on existing relationship columns, so that I can use things like nHibernate for relationships.
I am a little inexperienced in the realm of keys, and although I feel I understand how they work, there is some part of me that is fearful of corrupting the database somehow.
For example, I've come across the concept of "cascade on delete". I don't think there are currently any foreign keys on the database, so I guess this won't affect me... but how can I check to be sure?
What other risks do I need to be aware of?
In a way I'd love to just use nHibernate without foreign keys, but from what I can see this wouldn't be possible?
The biggest problem with putting foreign keys on a database that was designed without them (which is an indication the original database designers were incompetent, so there will be many other problems to fix as well) is that there is close to a 100% chance that you have orphaned data that doesn't have a parent key. You will need to figure out what to do with this data. Some of it can just be thrown out, as it is no longer usable in any fashion and is simply wasting space. However, if any of it relates to orders or anything financial, you need to keep the data, in which case you may need to define a parent record of "unknown" that you can relate the records to. Find and fix all bad data first, then add the foreign keys.
Use cascade update and cascade delete sparingly, as they can lock up your database if a large number of records need to be changed. Additionally, in many cases you want the delete to fail if related records exist. You don't ever want to cascade delete through financial records, for instance. If deleting the user would delete past orders, that is a very bad thing! If you don't use cascading, you are likely to come across the buggy code that let the data get bad, when you can no longer delete or change a record once the key is in place. So test all deleting and updating functionality thoroughly once you have the keys in place.
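Finding the orphaned rows and pointing them at an "unknown" parent could be sketched like this. The customer/orders schema is invented for the example, and SQLite stands in for the inherited database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customer VALUES (1, 'Alice');
    -- Order 101 refers to customer 42, which does not exist.
    INSERT INTO orders VALUES (100, 1, 25), (101, 42, 10);
""")


def find_orphans(conn):
    """Orders whose customer_id has no matching customer row."""
    return conn.execute("""
        SELECT o.id, o.customer_id
          FROM orders o
          LEFT JOIN customer c ON c.id = o.customer_id
         WHERE o.customer_id IS NOT NULL AND c.id IS NULL
    """).fetchall()


def repoint_orphans_to_unknown(conn):
    """Create an 'Unknown' customer and attach all orphaned orders to it."""
    with conn:
        cur = conn.execute("INSERT INTO customer (name) VALUES ('Unknown')")
        unknown_id = cur.lastrowid
        conn.execute(
            """UPDATE orders SET customer_id = ?
                WHERE customer_id NOT IN (SELECT id FROM customer)""",
            (unknown_id,),
        )
    return unknown_id


orphans_before = find_orphans(conn)
unknown_id = repoint_orphans_to_unknown(conn)
orphans_after = find_orphans(conn)
```

Only once find_orphans returns nothing for every relationship can the foreign key constraints be added without errors.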
NHibernate does not require foreign keys to be present on a database in order to be used. However, I would still recommend adding foreign keys whenever possible: they are a good thing because they make sure that your database's referential integrity is as it should be.
For example, if I had a User and a Comment table within my database and I were to delete user 1 who happens to have made two comments, without foreign keys I'd now have two comments without an owner! We obviously do not want this situation to ever occur.
This is where foreign keys come in: by declaring that User is a foreign key within the Comment table, our database server will make sure that we can't delete a user unless there are no comments associated with him or her (anymore).
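The User/Comment example can be tried out in a few lines. This sketch uses SQLite, which only enforces foreign keys after PRAGMA foreign_keys = ON; the table names users and comments and the helper function are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY,
                           user_id INTEGER NOT NULL REFERENCES users(id),
                           body TEXT);
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO comments VALUES (1, 1, 'first'), (2, 1, 'second');
""")


def try_delete_user(conn, user_id):
    """Return True if the user was deleted, False if a constraint blocked it."""
    try:
        with conn:
            conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
        return True
    except sqlite3.IntegrityError:
        return False


blocked = try_delete_user(conn, 1)  # user 1 still has two comments
with conn:
    conn.execute("DELETE FROM comments WHERE user_id = 1")
allowed = try_delete_user(conn, 1)  # no comments left, delete goes through
```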
Introducing foreign keys into a database is a good thing. It will expose existing invalid data. It will keep existing valid data, valid. You might have to perform some data manipulation on tables that have already gone haywire (i.e. create an 'Unknown user' or something similar and update all non-existing keys to point at it, this is a decision that needs to be made after examining the meaning of the data).
It might even cause a few issues initially, where an existing application crashes if, for example, it doesn't delete all the data it should (such as not deleting all comments in my example). But this is a good thing in the long term, as it exposes where things are going wrong and allows you to fix them without the data and database getting into an even worse state in the meantime.
NHibernate cascades are separate from foreign keys and are NHibernate's way of allowing you to, for example, make sure all child objects are deleted when you delete a parent. This allows you to make sure that any change you make to your data model does not violate your foreign key relationships (which would cause a database exception and no changes to be applied). Personally I prefer to take care of this myself, but it's up to you whether and how you want to use them.
Foreign keys formalize relationships in a normalized database. The foreign key constraints you are talking about do things like preventing the creation of a row that references a key which does not exist, or the deletion of a row which defines an entity still being used or referenced. This is called "referential integrity".
I suggest using some kind of modelling tool to draw a so-called ERM or entity-relationship model diagram. This will help you to get an overview of how the data is stored and where changes would be useful.
After you have done this, then you should consider whether or not the data is at a reasonable (say second or third normal form) degree of normalization. Pay particular attention to every entity having a primary key, and that the data completely describes the key. You should also try to remove redundancy and split non-atomic fields into a new table. "Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key so help you Codd." If you find the data is not normalized it would be a good time to fix any serious structural problems and/or refactor, if appropriate.
At this point, adding foreign keys and constraints is putting the cart before the horse. Ensure you have data integrity before you try to protect it. You need to do some preparation work first, then constraints will keep your not-so-shabby newly remodeled database in tip-top shape. The constraints will ensure that no one decides to make exceptions to the rules that turn the data into a mess. Take the time to give the data a better organized home now, and put locks on the doors after.
How do I just simply allow MySQL to assign a primary key to an inserted object with nhibernate? It seems I would want to set the generator as a type "identity", but the documentation states that using this "..require[s] two SQL queries to insert a new object." Why would it do that? Is there some way to get this functioning like a normal insert sql statement?
The reason that it requires two queries is that with an identity the value of the column is not defined until the row is inserted. Therefore it requires a select after insert to get the column value for the inserted object. This is pretty standard and I wouldn't let it stop me from using autogenerated keys as my primary key. The other option is to pre-generate the key -- say a GUID for the new object before persisting it to the database. For the most part I don't really see an advantage to this unless there are other mitigating circumstances, such as having to merge data from separate databases where autogenerated keys might collide.
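The difference between the two approaches, sketched with SQLite instead of MySQL and NHibernate (table and function names are invented). How the generated key is read back differs per database and driver; the point is only that with an identity column the key does not exist until the row has been inserted.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auto_item (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)
conn.execute("CREATE TABLE guid_item (id TEXT PRIMARY KEY, name TEXT)")


def insert_auto(conn, name):
    """The key exists only after the INSERT, so it has to be read back."""
    cur = conn.execute("INSERT INTO auto_item (name) VALUES (?)", (name,))
    return cur.lastrowid


def insert_guid(conn, name):
    """The key is chosen up front, so nothing needs to be read back."""
    key = str(uuid.uuid4())
    conn.execute("INSERT INTO guid_item (id, name) VALUES (?, ?)", (key, name))
    return key


first_id = insert_auto(conn, "a")
second_id = insert_auto(conn, "b")
guid_key = insert_guid(conn, "c")
```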
There's an obvious advantage: Letting NHibernate use Guids or Hilo as the id generator will enable an extremely cool feature in NHibernate: batching. Just configure NHibernate to use batching (for like, say 1000 statements), and your inserts will suddenly be extremely fast.
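To show why a hi/lo style generator makes batching possible, here is a simplified sketch of the general idea. It is not NHibernate's implementation (its table layout and formula differ), and the hilo table, the class and the block size are assumptions. The generator reserves a whole block of ids with one round trip, so every row already has its key before anything is sent to the database.

```python
import sqlite3


class HiLoGenerator:
    """Hands out ids locally; touches the database once per block of ids."""

    def __init__(self, conn, block_size=100):
        self.conn = conn
        self.block_size = block_size
        self.next_id = 0
        self.block_end = 0  # exclusive upper bound of the reserved block

    def _reserve_block(self):
        with self.conn:
            self.conn.execute("UPDATE hilo SET next_hi = next_hi + 1")
            hi = self.conn.execute("SELECT next_hi FROM hilo").fetchone()[0] - 1
        self.next_id = hi * self.block_size + 1
        self.block_end = self.next_id + self.block_size

    def next(self):
        if self.next_id >= self.block_end:
            self._reserve_block()
        value = self.next_id
        self.next_id += 1
        return value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hilo (next_hi INTEGER NOT NULL)")
conn.execute("INSERT INTO hilo VALUES (0)")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")

gen = HiLoGenerator(conn)
rows = [(gen.next(), f"item {i}") for i in range(250)]
with conn:
    # All keys are known up front, so the inserts can go out as one batch.
    conn.executemany("INSERT INTO item (id, name) VALUES (?, ?)", rows)
```

With an identity column each insert would have to complete before the next object's key is known, which is what prevents this kind of batching.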
Fabio has a post about the various available generators here - extremely useful reading if you are using NHibernate (or if you know someone who thinks NHibernate performs badly).