Is it OK to generate IDs of entities outside of CRM? - c#

I need to one-way-synchronize external Data with CRM on a regular basis (i.e. nightly).
This includes creating new Records as well as updating existing ones.
That means, I have to keep track of IDs of CRM-Entities created by my synchronization process.
I already managed to create and update records in CRM from rows of database tables, so this is not a problem.
Currently, my mapped tables have the following columns:
id: the table's primary key, set when inserting a new row
new_myentityid: the primary attribute of the mapped entity, set after the record was created by the synchronization process
new_name etc.: the values of the record's attributes
However, I see a way to drastically simplify the whole process:
Instead of having a primary key (id) in the database and keeping track of the CRM ID (new_myentityid) in a separate column, I could get rid of the id column and make the CRM ID column (new_myentityid) the primary key of the table, setting it when inserting new records (newid()); in other words, I would substitute new_myentityid for id from a database perspective. I could then bulk-upsert via ExecuteMultipleRequest in combination with UpsertRequest.
This way, I would save a column in each mapped table as well as the logic for storing the CRM IDs after creating the records.
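For concreteness, the bulk upsert I have in mind would look roughly like this. This is only a sketch: rows/row and service stand in for my staging rows and the IOrganizationService, and it assumes an SDK and organization version that support UpsertRequest.

    // Requires Microsoft.Xrm.Sdk and Microsoft.Xrm.Sdk.Messages.
    // Sketch only: bulk-upsert staging rows whose primary key already holds the CRM GUID.
    var batch = new ExecuteMultipleRequest
    {
        Settings = new ExecuteMultipleSettings { ContinueOnError = true, ReturnResponses = false },
        Requests = new OrganizationRequestCollection()
    };

    foreach (var row in rows)                                        // rows: placeholder for the staging-table rows
    {
        var entity = new Entity("new_myentity") { Id = row.Id };    // row.Id = the GUID primary key of the table
        entity["new_name"] = row.Name;
        batch.Requests.Add(new UpsertRequest { Target = entity });
    }

    var response = (ExecuteMultipleResponse)service.Execute(batch); // service: IOrganizationService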
Question
Would this be acceptable or is there anything that should make me avoid this?

Disclaimer: I'm not aware of a best practice for this, so this is just my personal opinion on the matter, having developed for Dynamics several times.
I think that using the CRM Entity GUID for your primary key is a good idea. It's less complicated and is handled well in SQL. I assume the column in your database is uniqueidentifier.
My only comment is to not generate the GUIDs yourself. Let CRM generate them for you as it does a better job at keeping everything sequential and indexed.
See this blog entry on MSDN for further detail
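For example, a minimal sketch of that approach, assuming an IOrganizationService called service and the column names from the question:

    // Sketch: let CRM assign the GUID, then write it back to the staging table.
    var entity = new Entity("new_myentity");
    entity["new_name"] = row.Name;                 // row: a staging-table row (placeholder)

    Guid crmId = service.Create(entity);           // CRM generates a sequential GUID here

    // Store crmId in the new_myentityid column so the next run updates instead of re-creating.
    // SaveCrmIdToStagingRow is purely illustrative; use whatever persistence your sync code has.
    SaveCrmIdToStagingRow(row.Id, crmId);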

I'm probably a little late to this discussion but just wanted to add my tuppence worth.
There is nothing inherently wrong with specifying the GUID when creating a new record in CRM, and this behaviour is explicitly supported by the SDK.
A common real life scenario is when creating records by script; it is useful to have the same GUID for an entity in Dev, Test and Production environments (Admittedly we normally use the GUID auto generated in Dev).
The reason it is considered best practice to let CRM generate its own GUIDs (https://msdn.microsoft.com/en-us/library/gg509027.aspx) is that CRM will generate the GUIDs sequentially. Using newid() generates a statistically random GUID. This has a performance impact on SQL Server around index maintenance. This thread provides some insight: What are the performance improvement of Sequential Guid over standard Guid?
But basically specifying your own GUID can cause the underlying SQL INSERT statement to become more expensive. Read and Update operations should remain the same.
If you are generating your own GUIDs in SQL, you can always use NEWSEQUENTIALID (https://msdn.microsoft.com/en-us/library/ms189786.aspx) for sequentially generated GUIDs.
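And for completeness, supplying your own GUID from the SDK is just a matter of setting the Id before calling Create (a sketch, reusing the entity name from the question):

    // Sketch: pre-assigning the GUID client-side; note the index-fragmentation caveat above.
    var entity = new Entity("new_myentity") { Id = Guid.NewGuid() };
    entity["new_name"] = "Created with a pre-assigned id";

    Guid createdId = service.Create(entity);   // returns the same GUID that was supplied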

Hi, the previous posts cover this well. Just to note that if you did go with generating GUIDs outside of CRM, you could mitigate the potential performance impact (on INSERTs) simply by running a weekly maintenance plan to rebuild the clustered indexes directly on the SQL database(s); this should, I believe, keep the rows ordered sequentially by GUID. In any case, CRM/the API will always be the bottleneck, so it is best to do things the way the platform expects, to avoid issues later on.

Why not save the data in a new table?
For example, if the original table is named "customer", save your new data in "customer_update", with the same fields as the original table.
This will help you in the future if you ever want to look at where the data came from.

Related

SQL recursively copy rows from multiple tables following PK FK relationships

I was given the task of creating a stored procedure to copy every piece of data associated with a given ID in our database. This data spans dozens of tables; each table may have dozens of matching rows.
example:

    Table Account
        PK = AccountID
    Table AccountSettings
        FK = AccountID
    Table Users
        PK = UserID
        FK = AccountID
    Table UserContent
        PK = UserContentID
        FK = UserID
I want to create a copy of everything that is associated with an AccountID (which will traverse nearly every table). The copy will have a new AccountID and UserContentID but will have the same UserID. The new data needs to go into its respective tables.
:) fun right?
The above is just a sample but I will be doing this for something like 50 or 60 tables.
I have researched using CTEs but am still a bit foggy on them; that may prove to be the best method. My SQL skills are... well, I have worked with SQL for about 40 logged hours so far :)
Any advice or direction on where to look would be greatly appreciated. In addition, I am not opposed to doing this via C# if that would be possible or better.
Thanks in advance for any help or info.
The simplest way to solve this is the brute force way: write a very long proc that processes each table individually. This will be error-prone and very hard to maintain. But it will have the advantage of not relying on the database or database metadata to be in any particularly consistent state.
If you want something that works based on metadata, things are more interesting. You have three challenges there:
You need to programmatically identify all the related tables.
You need to generate insert statements for all 50 or 60 tables.
You need to capture generated ids for those tables that are more than one or two steps away from the Account table, so that they can in turn be used as foreign keys in yet more copied records.
I've looked at this problem in the past, and while I can't offer you a watertight algorithm, I can give you a general heuristic. In other words: this is how I'd approach it.
Using a later version of MS Entity Framework (you said you'd be open to using C#), build a model of the Account table and all the related tables.
Review the heck out of it. If your database is like many, some of the relationships your application(s) assume will, for whatever reason, not have an actual foreign key relationship set up in the database. Create them in your model anyway.
Write a little recursive routine in C# that can take an Account object and traverse all the related tables. Pick a couple of Account instances and have it dump table name and key information to a file (see the sketch after these steps). Review that for completeness and plausibility.
Once you are satisfied you have a good model and a good algorithm that picks up everything, it's time to get cracking on the code. You need to write a more complicated algorithm that can read an Account and recursively clone all the records that reference it. You will probably need reflection in order to do this, but it's not that hard: all the metadata that you need will be in there, somewhere.
Test your code. Allow plenty of time for debugging.
Use your first algorithm, in step 3, to compare results for completeness and accuracy.
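To make step 3 a bit more concrete, here is a rough sketch of such a dump routine. It is not tied to any particular schema: the key-name convention and property handling are assumptions you would adapt to your own EF model.

    // Needs: using System; using System.Collections; using System.Collections.Generic; using System.IO;
    static void Dump(object entity, HashSet<object> visited, TextWriter log)
    {
        // Skip nulls, strings, value types, and anything already visited (guards against cycles).
        if (entity == null || entity is string || entity.GetType().IsValueType) return;
        if (!visited.Add(entity)) return;

        var type = entity.GetType();

        // Naive convention: assume the key property is named <TableName>ID, e.g. AccountID.
        var keyProp = type.GetProperty(type.Name + "ID");
        var keyValue = keyProp != null ? keyProp.GetValue(entity, null) : "(no key found)";
        log.WriteLine("{0}: {1}", type.Name, keyValue);

        // Recurse into collection navigation properties (e.g. Account.Users, User.UserContents).
        foreach (var prop in type.GetProperties())
        {
            if (typeof(IEnumerable).IsAssignableFrom(prop.PropertyType) && prop.PropertyType != typeof(string))
            {
                var children = prop.GetValue(entity, null) as IEnumerable;
                if (children == null) continue;
                foreach (var child in children)
                    Dump(child, visited, log);
            }
        }
    }

Called as Dump(account, new HashSet<object>(), Console.Out), it prints one line per related row; the clone pass in the later step follows the same traversal but creates new entities instead of logging.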
The advantage of the EF approach: as the database changes, so can your model, and if your code is metadata-based, it ought to be able to adapt.
The disadvantage: if you have such phenomena as fields that are "really" the same but are different types, or complex three-way relationships that aren't modeled properly, or embedded CSV lists that you'd need to parse out, this won't work. It only works if your database is in good shape and is well-modeled. Otherwise you'll need to resort to brute force.

Updating primary keys using LINQ

I'm currently working on a sandbox environment based on two databases located on different servers. What I am aiming to do is allow my clients to make changes on a test server and then once approved, I can simply hit a button and import the data across to my live database.
So far, I have managed to port the data across the two databases, but what I would like to do is amend the primary keys on the test server to match those held on the live server (in case I need backups, and so that I can run checks to stop the same information being copied multiple times).
So far I have tried this solution:
    DT_SitePage OldPage = new DT_SitePage
    {
        PageID = SP.PageID
    };

    DT_SitePage NewPage = new DT_SitePage
    {
        PageID = int.Parse(ViewState["PrimaryKey"].ToString())
    };

    Sandbox.DT_SitePages.Attach(NewPage, OldPage);
    Sandbox.SubmitChanges();
However, I keep getting the error:

    Value of member 'PageID' of an object of type 'DT_SitePage' changed.
    A member defining the identity of the object cannot be changed.
    Consider adding a new object with new identity and deleting the existing one instead.
Is there any way in LINQ to avoid this error and force the database to update this field?
Many Thanks
Why not use the stock backup/restore functionality supplied by the DB manufacturer?
It makes perfect logical sense that high-level ORM tools won't allow you to change the primary key of a record, as they identify the record only by its primary key.
You should consider making direct UPDATE queries to the DB from your code instead.
And anyway, changing the primary key is a bad idea; what prevents you from INSERTing the row with the needed value in the first place?
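If you really do need to change the key in place, the direct-SQL route would look something like this (a sketch, assuming Sandbox is a LINQ to SQL DataContext and DT_SitePage maps to a table of the same name):

    // ExecuteCommand bypasses the change tracker; {0}/{1} are sent as SQL parameters.
    int newId = int.Parse(ViewState["PrimaryKey"].ToString());
    Sandbox.ExecuteCommand(
        "UPDATE DT_SitePage SET PageID = {0} WHERE PageID = {1}",
        newId, SP.PageID);

    // Remember that any rows referencing the old PageID via foreign keys must be updated too.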
As said, modifying primary keys is typically something you don't want to do. If LINQ to SQL didn't give the early warning, your RDBMS would probably complain (SQL Server does!). Especially when records are related by foreign key constraints, updating primary keys is not trivial.
In cross-database scenarios, it is more common to use some "global" unique identification, for which GUIDs may do. Maybe your data can be identified in an alternative way? (Like when two users have the same name, they are deemed identical).
If you don't need to keep the database structures identical, you might consider using an extra field in your test database to store the "live" primary key.
Here is a post with lots of useful thoughts.

Creating snapshot of application data - best practice

We have a text processing application developed in C# using .NET FW 4.0 where the Administrator can define various settings. All this 'settings' data resides in about 50 tables with foreign key relations and IDENTITY primary keys (this one will make it tricky, I think). The entire database is no more than 100K records, with the average table having about 6 short columns. The system is based on an MS SQL 2008 R2 Express database.
We face a requirement to create a snapshot of all this data so that the administrator of the system can roll back to one of the snapshots any time he screws something up. We need to keep the last 5 snapshots only. Creation of a snapshot must be initiated from the application GUI, and so must the rollback to any of the snapshots if needed (using SSMS will not be allowed, as direct access to the DB is denied).
The system is still in development (are we ever really finished?), which means that new tables and columns are added frequently. Thus we need a robust method that handles such changes automatically (digging through code after inserting/changing columns is something we want to avoid unless there's no other way). Ideally we would just be able to say "create a snapshot of all tables whose name begins with 'Admin'".
Obviously, this is quite a DB-intensive task, but since it will be used in emergency situations only, that is something I do not mind. I also do not mind if table locks happen, as nothing will try to use these tables while the creation of or rollback to a snapshot is in progress.
The problem can be divided into 2 parts:
creating the snapshot
rolling back to the snapshot
Regarding problem #1, we may have two options:
export the data into XML (file or database column)
duplicate the data inside SQL into the same or different tables (like creating the same table structure again with the same names as the original tables prefixed with "Backup").
Regarding problem #2, the biggest issue I see is how to re-import all data into foreign-key-related tables that use IDENTITY columns for PK generation. I need to delete all data from all affected tables, then re-import everything while temporarily relaxing FK constraints and switching off identity generation. Once the data is loaded, I should check whether the FK constraints are still OK.
Or perhaps I should find a logical order in which to load the tables so that constraint checking can remain in place while loading (as we do not have an unmanageable number of tables, this could be a viable solution). Of course, I need to do all the deletion and re-loading in a single transaction, for obvious reasons.
I suspect there may be no pure SQL-based solution for this, although SQL CLR might be of help to avoid moving data out of SQL Server.
Is there anyone out there with the same problem we face? Maybe someone who has successfully solved such a problem?
I do not expect a step by step instruction. Any help on where to start, which routes to take (export to RAW XML or keep snapshot inside the DB or both), pros/cons would be really helpful.
Thank you for your help and your time.
Daniel
We don't have this exact problem, but we have a very similar problem in which we provide our customers with a baseline set of configuration data (fairly complex, mostly identity PKs) that needs to be updated when we provide a new release.
Our mechanism is probably overkill for your situation, but I am sure there is a subset of it that is applicable.
The basic approach is this:
First, we execute a script that drops all of the FK constraints and changes the nullability of those FK columns that are currently NOT NULL to NULL. This script also drops all triggers to ensure that any logical constraints implemented in them will not be executed.
Next, we perform the data import, setting IDENTITY_INSERT ON before loading a table (so the original identity values can be inserted explicitly), then setting it back OFF after the data in the table is updated.
Next, we execute a script that checks the data integrity of the newly added items with respect to the foreign keys. In our case, we know that items that do not have a corresponding parent record can safely be deleted, but you may choose to take a different approach (report the error and let someone manually handle the issue).
Finally, once we have verified the data, we execute another script that restores the nullability, adds the FKs back, and reinstalls the triggers.
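To give a feel for the import step in C# (just a sketch: the table, columns, and connectionString are placeholders, and the constraint-drop and restore scripts described above are assumed to run before and after this):

    // Requires System.Data.SqlClient.
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var tx = conn.BeginTransaction())
        {
            var cmd = conn.CreateCommand();
            cmd.Transaction = tx;

            // Allow explicit values in the IDENTITY column while this table is loaded.
            cmd.CommandText =
                "SET IDENTITY_INSERT dbo.AdminSetting ON; " +
                "INSERT INTO dbo.AdminSetting (SettingID, Name) VALUES (@id, @name); " +
                "SET IDENTITY_INSERT dbo.AdminSetting OFF;";
            cmd.Parameters.AddWithValue("@id", 42);
            cmd.Parameters.AddWithValue("@name", "RestoredValue");
            cmd.ExecuteNonQuery();

            tx.Commit();
        }
    }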
If you have the budget for it, I would strongly recommend that you take a look at the tools that Red Gate provides, specifically SQL Packager and SQL Data Compare (I suspect there may be other tools out there as well, we just don't have any experience with them). These tools have been critical in the successful implementation of our strategy.
Update
We provide the baseline configuration through an SQL Script that is generated by RedGate's SQL Packager.
Because our end users can modify the database between updates, which causes the identity values in their database to differ from those in ours, we actually store the baseline primary and foreign keys in separate fields within each record.
When we update the customer database and we need to link new records to known configuration information, we can use the baseline fields to find out what the database-specific FKs should be.
In other words, there is always a known set of field IDs for well-known configuration records, regardless of what other data is modified in the database, and we can use this to link records together.
For example, if I have Table1 linked to Table2, Table1 will have a baseline PK and Table2 will have a baseline PK and a baseline FKey containing Table1's baseline PK. When we update records, if we add a new Table2 record, all we have to do is find the Table1 record with the specified baseline PK, then update the actual FKey in Table2 with the actual PK in Table1.
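In code, that fix-up looks something like this (a sketch with hypothetical entity and column names, assuming a LINQ-style data context):

    // Find the parent by its baseline key, then point the new child at the
    // actual, database-specific identity value.
    var parent = db.Table1s.Single(t => t.BaselinePK == newChild.BaselineTable1FK);
    newChild.Table1ID = parent.Table1ID;   // actual FK now matches this customer's database
    db.SubmitChanges();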
A kind of versioning by date ranges is a common method for records in enterprise applications. As an example, we have a table for business entities (US) or companies (UK), and we keep the current official name in another table as follows:
    CompanyID  Name          ValidFrom    ValidTo
    12         Business Ltd  2000-01-01   2008-09-23
    12         Business Inc  2008-09-23   NULL
The NULL in the last record means that this is the current one. You may use the above logic and possibly add more columns to gain more control. This way there are no duplicates, you can keep the history at any level of detail, and you can synchronize the current values across tables easily. Finally, the performance will be great.
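For example, reading the current value with this scheme is a simple filter on ValidTo (a sketch, assuming a CompanyNames entity mapped over the table above):

    // The row with ValidTo == null is the current official name.
    var currentName = db.CompanyNames
        .Where(n => n.CompanyID == 12 && n.ValidTo == null)
        .Select(n => n.Name)
        .Single();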

Why is creating tables at run time (code-behind) bad?

People suggest that creating database tables dynamically (that is, at run time) should be avoided, saying that it is bad practice and will be hard to maintain.
I don't see why, and I don't see the difference between creating a table and any other SQL query/statement such as SELECT or INSERT. I have written apps that create, delete, and modify databases and tables at run time, and so far I have not seen any performance issues.
Can anyone explain the cons of creating databases and tables at run time?
Tables are much more complex entities than rows, and managing table creation is much more complex than an insert, which has to abide by an existing model: the table. True, a CREATE TABLE statement is a standard SQL operation, but depending on creating tables dynamically smacks of a bad design decision.
Now, if you just create one or two and that's it, or an entire database dynamically, or from a script once, that might be OK. But if you depend on having to create more and more tables to handle your data, you will also need to join more and more and query more and more. One very serious issue I encountered with an app that made use of dynamic table creation is that a single SQL Server query can only involve 255 tables. It's a built-in constraint. (And that's SQL Server, not CE.) It only took a few weeks in production for this limit to be reached, resulting in a nonfunctioning application.
And if you get into editing the tables, e.g. adding/dropping columns, then your maintenance headache gets even worse. There's also the matter of binding your db data to your app's logic. Another issue is upgrading production databases. This would really be a challenge if a db had been growing with objects dynamically and you suddenly needed to update the model.
When you need to store data in such a dynamic manner, the standard practice is to make use of EAV models. You have fixed tables, and your data is added dynamically as rows, so your schema does not have to change. There are drawbacks, of course, but it's generally thought of as better practice.
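As a small illustration of the EAV idea (the table and column names here are made up, not a standard schema): new "columns" become rows in a fixed attribute/value table, so no DDL runs at run time.

    // Requires System.Data.SqlClient; connectionString is a placeholder.
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        "INSERT INTO EntityAttributeValue (EntityId, AttributeName, Value) VALUES (@e, @a, @v)", conn))
    {
        cmd.Parameters.AddWithValue("@e", 1001);      // which logical entity this value belongs to
        cmd.Parameters.AddWithValue("@a", "Color");   // the dynamic "column"
        cmd.Parameters.AddWithValue("@v", "Red");     // its value
        conn.Open();
        cmd.ExecuteNonQuery();
    }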
KMC,
Remember the following points:
If you want to add or remove a column, you may need to change the code and compile it again.
What if the database location changes?
Developers who are not very good at databases can make changes; if you create the schema on the back end, DBAs can take care of it.
If you get any performance issues, it may be tough to debug.
You will need to be a little clearer about what you mean by "creating tables".
One reason to not allow the application to control table creation and deletion is that this is a task that should be handled only by an administrator. You don't want normal users to have the ability to delete whole tables.
Temporary tables are a different story, and you may need to create temporary tables as part of your queries, but your basic database structure should be managed only by someone with the rights to do so.
Sometimes creating tables dynamically is not the best option security-wise (Google "SQL injection"), and it would be better to use stored procedures and have your insert or update operations occur at the database level by executing the stored procedures from code.
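For example, something along these lines keeps the SQL on the database side and the parameters typed (the procedure and parameter names are placeholders):

    // Requires System.Data and System.Data.SqlClient; connectionString is a placeholder.
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("usp_InsertCustomer", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;   // no SQL text built in application code
        cmd.Parameters.AddWithValue("@Name", "Contoso");
        conn.Open();
        cmd.ExecuteNonQuery();
    }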

Letting the database pick the primary key in NHibernate

How do I just simply allow MySQL to assign a primary key to an inserted object with NHibernate? It seems I would want to set the generator to type "identity", but the documentation states that using this "..require[s] two SQL queries to insert a new object." Why would it do that? Is there some way to get this functioning like a normal INSERT SQL statement?
The reason that it requires two queries is that with an identity the value of the column is not defined until the row is inserted. Therefore it requires a select after insert to get the column value for the inserted object. This is pretty standard and I wouldn't let it stop me from using autogenerated keys as my primary key. The other option is to pre-generate the key -- say a GUID for the new object before persisting it to the database. For the most part I don't really see an advantage to this unless there are other mitigating circumstances, such as having to merge data from separate databases where autogenerated keys might collide.
There's an obvious advantage: letting NHibernate use GUIDs or hilo as the id generator will enable an extremely cool feature in NHibernate: batching. Just configure NHibernate to use batching (for, say, 1000 statements), and your inserts will suddenly be extremely fast.
Fabio has a post about the various available generators here - extremely useful reading if you are using NHibernate (or if you know someone who thinks NHibernate performs badly).
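For reference, enabling batching is just a configuration property. This is only a sketch: it assumes the class mappings already use a client-side generator such as guid.comb or hilo (identity defeats batching, since every row needs its own round-trip), and whether batching actually kicks in also depends on the ADO.NET driver in use.

    // Sketch: enable ADO.NET batching in the NHibernate configuration.
    var cfg = new NHibernate.Cfg.Configuration().Configure();   // reads hibernate.cfg.xml
    cfg.SetProperty("adonet.batch_size", "1000");                // batch up to 1000 statements
    var sessionFactory = cfg.BuildSessionFactory();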
