Identifying dependencies among tables - C#

Given a table, is there any way to identify all the tables that hold a foreign key reference to it?
The actual scenario goes like this: given a database, I have a set of C# schema classes which I have to populate from data in the database and store in a cache. All these schemas should always be in sync with the database.
Now, I have two ways to solve this problem. One is to go and update all the stored schemas whenever a database change happens, which will be very costly. The other is to use some heuristic-based algorithm to correctly identify the schemas that will be impacted by a DB change and update only those.
In order to implement this, I was thinking of building a dependency tree/graph structure in which a table T1 is said to depend on a table T2 if T1 has a foreign key constraint on T2. Then, whenever a change happens in one or more tables, I can quickly walk the graph and determine which schemas need to be updated.
I know that you can find these kinds of dependencies using data dictionaries, but since I am using Entity Framework, I'm looking for a way to do it through Entity Framework.
Also, if someone has a better approach to the same problem, please share that as well.
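For illustration, the storage model that EF6 loads already exposes every FK association, so a dependency map can be built without touching the data dictionary. Below is a minimal sketch assuming EF6; the DependencyMap class and the string-keyed map shape are illustrative, not an established API:

using System.Collections.Generic;
using System.Data.Entity;
using System.Data.Entity.Core.Metadata.Edm;
using System.Data.Entity.Infrastructure;
using System.Linq;

public static class DependencyMap
{
    // Maps each principal table to the tables holding a FK to it, read
    // from the storage (SSpace) model that EF has loaded for the context.
    public static Dictionary<string, List<string>> Build(DbContext context)
    {
        // Force EF to load the store model before asking for SSpace items.
        context.Database.Initialize(force: false);
        var workspace = ((IObjectContextAdapter)context).ObjectContext.MetadataWorkspace;

        var map = new Dictionary<string, List<string>>();
        foreach (var assoc in workspace.GetItems<AssociationType>(DataSpace.SSpace)
                                       .Where(a => a.Constraint != null))
        {
            // FromProperties live on the principal (referenced) table,
            // ToProperties on the dependent (referencing) table.
            var principal = assoc.Constraint.FromProperties.First().DeclaringType.Name;
            var dependent = assoc.Constraint.ToProperties.First().DeclaringType.Name;

            if (!map.TryGetValue(principal, out var dependents))
                map[principal] = dependents = new List<string>();
            dependents.Add(dependent);
        }
        return map;
    }
}

When a table changes, the schemas to refresh would be that table's entry plus everything reachable from it in the map (the transitive closure, since dependent tables can themselves be referenced by other tables).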

Related

How to maintain database concurrency in long logical flows

We are using Entity Framework Code First to save reports to a SQL database. Many of the objects have many-to-many relations, so the data is split into different tables.
In order to prevent duplication of data, we first check if a certain object is already saved and only then add the relation to the database.
For example, we have an object Person that can have many countries, and an object Country that can hold multiple Person objects.
At the beginning of the save flow we query the database for existing countries and set them on the Person object if they exist, or create them if they don't.
This flow worked fine while there was only one saving process at a time, but now we have a requirement to support many saves running simultaneously, and my worry is that one thread will add a new country right after another thread has checked for existing countries.
I was wondering what good practices there are to solve this problem with minimal impact on performance.
Thanks!
It doesn't sound like you're fully leveraging the capabilities of your chosen ORM. Relationships are represented in your returned entities if you are using the library according to its documentation, so updating a single entity's many-to-many relationship propagates to all other related entities, as long as the EntityID remains the same.
If you are still having trouble trusting the integrity of this relationship, I would suggest using the bulk-update feature of Entity Framework.
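One widely used practice for exactly this race is to let the database enforce uniqueness and retry on conflict, rather than trusting the check-then-insert sequence alone. A minimal sketch, assuming EF6 and a unique index on Country.Name (the index is an assumption, not something stated in the question):

using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using System.Linq;

public class Country
{
    public int CountryId { get; set; }
    public string Name { get; set; }   // assumed to carry a unique index in the DB
}

public class ReportContext : DbContext
{
    public DbSet<Country> Countries { get; set; }
}

public static class CountryLookup
{
    // Insert-or-fetch that tolerates a concurrent insert of the same name.
    public static Country GetOrCreate(ReportContext db, string name)
    {
        var existing = db.Countries.SingleOrDefault(c => c.Name == name);
        if (existing != null) return existing;

        var country = db.Countries.Add(new Country { Name = name });
        try
        {
            db.SaveChanges();
            return country;
        }
        catch (DbUpdateException)
        {
            // Another writer won the race: detach our copy, read theirs.
            db.Entry(country).State = EntityState.Detached;
            return db.Countries.Single(c => c.Name == name);
        }
    }
}

The unique constraint does the real work here; the catch block just recovers gracefully when the check-then-insert window is lost to another thread.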

Is it possible to use Entity Framework and keep object relations in the code and out of the database

I'm having a hard time just defining my situation, so please be patient. Either I have a situation that no one blogs about, or I've created a problem in my mind by not understanding the concepts.
I have a database which is something of a mess, and the DB owner wants to keep it that way. By mess I mean it is not normalized and has no relationships defined, although they do exist...
I want to use EF, and I want to optimize my code by reducing database calls.
As a simplified example I have two tables with no relationships set like so:
Table: Human
HumanId, HumanName, FavoriteFoodId, LeastFavoriteFoodId, LastFoodEatenId
Table: Food
FoodId, FoodName, FoodProperty1, FoodProperty2
I want to write a single EF database call that will return a human and a full object for each related food item.
First, is it possible to do this?
Second, how?
Boring background information: a super SQL developer has written a query that returns 21 tables in 20 milliseconds, containing a total of 1401 columns. This is being turned into an XML document for our front-end developer to bind to. I want to change our technique to use objects and thus reduce the amount of hand coding and mapping from fields to XML (not to mention the handling of nulls vs. empty strings, etc.) and create a type-safe, compile-time environment. Unfortunately we are not allowed to change the database or add relationships...
If I understand you correctly, it's better for you to use the Entity Framework Code First approach:
You can define your objects (entities) Human and Food
Make relations between them in code even if they don't have foreign keys in the DB
Query them using LINQ
And yes, you can select all related information in one call.
You can define the relationships in code with Entity Framework using the Fluent API. In your case you might be able to define your entities manually, or use a tool to reverse engineer your EF model from an existing database. There is some support for this built into Visual Studio, and there are VS extensions like EF Power Tools that offer this capability.
As for making a single call to the database with EF, you would probably need to create a stored procedure or a view that returns all of the information you need. With the standard setup and lazy loading enabled, EF will instead make calls to the database and populate the data as it is needed.
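To make the Fluent API suggestion concrete, here is a minimal sketch for the Human/Food example above, assuming EF6 Code First over the existing, relationship-free tables; all mapping lives in code, so the database stays untouched:

using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

public class Human
{
    public int HumanId { get; set; }
    public string HumanName { get; set; }
    public int FavoriteFoodId { get; set; }
    public int LeastFavoriteFoodId { get; set; }
    public int LastFoodEatenId { get; set; }

    public virtual Food FavoriteFood { get; set; }
    public virtual Food LeastFavoriteFood { get; set; }
    public virtual Food LastFoodEaten { get; set; }
}

public class Food
{
    public int FoodId { get; set; }
    public string FoodName { get; set; }
    public string FoodProperty1 { get; set; }
    public string FoodProperty2 { get; set; }
}

public class FoodContext : DbContext
{
    public DbSet<Human> Humans { get; set; }
    public DbSet<Food> Foods { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Relationships exist only in the model; no FK constraints are
        // required (or created) in the database itself.
        modelBuilder.Entity<Human>()
            .HasRequired(h => h.FavoriteFood).WithMany()
            .HasForeignKey(h => h.FavoriteFoodId);
        modelBuilder.Entity<Human>()
            .HasRequired(h => h.LeastFavoriteFood).WithMany()
            .HasForeignKey(h => h.LeastFavoriteFoodId);
        modelBuilder.Entity<Human>()
            .HasRequired(h => h.LastFoodEaten).WithMany()
            .HasForeignKey(h => h.LastFoodEatenId);
    }

    // One round trip: the three related foods are eagerly loaded via joins.
    public List<Human> LoadHumans() =>
        Humans.Include(h => h.FavoriteFood)
              .Include(h => h.LeastFavoriteFood)
              .Include(h => h.LastFoodEaten)
              .ToList();
}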

View using same type as Table

I have a table that is used throughout an app by Entity Framework. I have a view that returns an identical column set, but is actually a union on itself to work around some bad normalization (the app is large and partially out of my hands; this part is unavoidable).
Is it possible to have Entity Framework 4 treat a view that is exactly like a table as the same type, so that I can use this view to populate a collection of that type? This question seems to indicate it is possible in NHibernate, but I can't find anything like it for Entity Framework. It would be an extra bonus if the navigation properties could still be used with Include(), but this is not necessary (I can always join manually).
Since EF works on mappings from objects to database entities, this is not directly possible. What you need is something like changing the queried database entity dynamically, and AFAIK this is not possible without manually changing the object context.
For sure the EF runtime won't care, as long as it can treat the view as if it were a completely separate table. The two possible challenges that I foresee are:
Tooling: our wizard does allow you to select views when doing reverse engineering (i.e. database-first). Definitely, if you can use Code First against an existing database, you can just pretend that the view is a table, but you won't get any help scripting the database creation or migrations.
Updates: in general you can perform updates against a view by setting up stored procedure mapping (which is available in the EF Designer from v1, or in Code First starting in EF6). You might also be able to make your view updatable directly, or use INSTEAD OF triggers (see "Updatable Views" here for more details). If I remember correctly, the SQL generated by EF to retrieve database-generated values (e.g. for identity columns) is not compatible in some cases with INSTEAD OF triggers. Yet another alternative is to have your application treat the view as read-only and perform all updates through the actual table, which you would map as a separate entity. Keep in mind that in-memory entities for the view and the original table will not be kept in sync.
Hope this helps!
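As a sketch of the "pretend the view is a table" route mentioned above, in Code First the mapping is a single line; the entity and view names below are placeholders:

using System.Data.Entity;

public class AccountRow
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class ReportingContext : DbContext
{
    public DbSet<AccountRow> AccountRows { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Point the entity at the view; EF will SELECT from it exactly as
        // it would from a table. Migrations/database creation won't know
        // how to script the view, and updates are subject to the caveats
        // discussed above.
        modelBuilder.Entity<AccountRow>().ToTable("vw_AccountUnion");
    }
}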

SQL recursively copy rows from multiple tables following PK FK relationships

I was given the task of creating a stored procedure to copy every piece of data associated with a given ID in our database. This data spans dozens of tables, and each table may have dozens of matching rows.
example:
Table: Account
  PK = AccountID
Table: AccountSettings
  FK = AccountID
Table: Users
  PK = UserID
  FK = AccountID
Table: UserContent
  PK = UserContentID
  FK = UserID
I want to create a copy of everything that is associated with an AccountID (which will traverse nearly every table). The copy will have a new AccountID and UserContentID but will keep the same UserID, and the new data needs to go into its respective table.
:) fun right?
The above is just a sample but I will be doing this for something like 50 or 60 tables.
I have researched using CTEs but am still a bit foggy on them; that may prove to be the best method. My SQL skills are... well, I have worked with SQL for about 40 logged hours so far :)
Any advice or direction on where to look would be greatly appreciated. In addition, I am not opposed to doing this via C# if that would be possible or better.
Thanks in advance for any help or info.
The simplest way to solve this is the brute force way: write a very long proc that processes each table individually. This will be error-prone and very hard to maintain. But it will have the advantage of not relying on the database or database metadata to be in any particularly consistent state.
If you want something that works based on metadata, things are more interesting. You have three challenges there:
You need to programmatically identify all the related tables.
You need to generate insert statements for all 50 or 60.
You need to capture generated ids for those tables that are more than one or two steps away from the Account table, so that they can in turn be used as foreign keys in yet more copied records.
I've looked at this problem in the past, and while I can't offer you a watertight algorithm, I can give you a general heuristic. In other words: this is how I'd approach it.
Using a later version of MS Entity Framework (you said you'd be open to using C#), build a model of the Account table and all the related tables.
Review the heck out of it. If your database is like many, some of the relationships your application(s) assume will, for whatever reason, not have an actual foreign key relationship set up in the database. Create them in your model anyway.
Write a little recursive routine in C# that can take an Account object and traverse all the related tables (a sketch of such a routine appears at the end of this answer). Pick a couple of Account instances and have it dump table name and key information to a file. Review that for completeness and plausibility.
Once you are satisfied you have a good model and a good algorithm that picks up everything, it's time to get cracking on the code. You need to write a more complicated algorithm that can read an Account and recursively clone all the records that reference it. You will probably need reflection in order to do this, but it's not that hard: all the metadata that you need will be in there, somewhere.
Test your code. Allow plenty of time for debugging.
Use your first algorithm, in step 3, to compare results for completeness and accuracy.
The advantage of the EF approach: as the database changes, so can your model, and if your code is metadata-based, it ought to be able to adapt.
The disadvantage: if you have such phenomena as fields that are "really" the same but are different types, or complex three-way relationships that aren't modeled properly, or embedded CSV lists that you'd need to parse out, this won't work. It only works if your database is in good shape and is well-modeled. Otherwise you'll need to resort to brute force.
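For illustration, here is the kind of "little recursive routine" step 3 describes, using plain reflection; the *Id key convention and the collection-only traversal are simplifying assumptions, not a prescribed schema:

using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class GraphDumper
{
    // Walks an entity graph and dumps "TypeName: key" pairs. Reference
    // navigations (e.g. User.Account) would need the same treatment as
    // collections; omitted here for brevity.
    public static void Dump(object entity, TextWriter log, HashSet<object> seen)
    {
        if (entity == null || !seen.Add(entity)) return;  // cycle guard

        var type = entity.GetType();
        var key = type.GetProperty(type.Name + "Id")?.GetValue(entity);
        log.WriteLine($"{type.Name}: {key}");

        foreach (var prop in type.GetProperties()
                                 .Where(p => p.GetIndexParameters().Length == 0))
        {
            var value = prop.GetValue(entity);
            if (value is IEnumerable children && !(value is string))
                foreach (var child in children.OfType<object>())
                    Dump(child, log, seen);
        }
    }
}

Run against a context with lazy loading enabled, this walk will issue queries as it descends; for large graphs you would eagerly load or batch instead.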

Creating snapshot of application data - best practice

We have a text processing application developed in C# using .NET FW 4.0 where the administrator can define various settings. All this 'settings' data resides in about 50 tables with foreign key relations and IDENTITY primary keys (this one will make it tricky, I think). The entire database is no more than 100K records, with the average table having about 6 short columns. The system is based on an MS SQL 2008 R2 Express database.
We face a requirement to create a snapshot of all this data so that the administrator of the system can roll back to one of the snapshots any time he screws something up. We need to keep the last 5 snapshots only. Creation of a snapshot must be started from the application GUI, and so must the rollback to any of the snapshots if needed (using SSMS will not be allowed, as direct access to the DB is denied). The system is still in development (are we ever really finished?), which means that new tables and columns are added all the time. Thus we need a robust method that can take care of changes automatically (digging through code after inserting/changing columns is something we want to avoid unless there's no other way). The best way would be to be able to say "I want to create a snapshot of all tables whose names begin with 'Admin'". Obviously, this is quite a DB-intensive task, but since it will be used in emergency situations only, this is something I do not mind. I also do not mind if table locks happen, as nothing will try to use these tables while the creation of or rollback to a snapshot is in progress.
The problem can be divided into 2 parts:
creating the snapshot
rolling back to the snapshot
Regarding problem #1, we may have two options:
export the data into XML (file or database column)
duplicate the data inside SQL Server into the same or different tables (i.e. creating the same table structure again, with the original table names prefixed with "Backup").
Regarding problem #2, the biggest issue I see is how to re-import all data into foreign-key-related tables which use IDENTITY columns for PK generation. I would need to delete all data from all affected tables, then re-import everything while temporarily relaxing the FK constraints and switching off identity generation. Once the data is loaded, I should check that the FK constraints are still OK.
Or perhaps I should find a logical order in which to load the tables so that constraint checking can remain in place while loading (as we do not have an unmanageable number of tables, this could be a viable solution). Of course, I need to do all the deletion and re-loading in a single transaction, for obvious reasons.
I suspect there may be no pure SQL-based solution for this, although SQL CLR might be of help to avoid moving data out of SQL Server.
Is there anyone out there who has faced the same problem? Maybe someone who has successfully solved such a problem?
I do not expect a step by step instruction. Any help on where to start, which routes to take (export to RAW XML or keep snapshot inside the DB or both), pros/cons would be really helpful.
Thank you for your help and your time.
Daniel
We don't have this exact problem, but we have a very similar problem in which we provide our customers with a baseline set of configuration data (fairly complex, mostly identity PKs) that needs to be updated when we provide a new release.
Our mechanism is probably overkill for your situation, but I am sure there is a subset of it that is applicable.
The basic approach is this:
First, we execute a script that drops all of the FK constraints and changes the nullability of those FK columns that are currently NOT NULL to NULL. This script also drops all triggers to ensure that any logical constraints implemented in them will not be executed.
Next, we perform the data import, setting IDENTITY_INSERT ON before loading a table, then setting it back OFF after the data in the table has been loaded.
Next, we execute a script that checks the data integrity of the newly added items with respect to the foreign keys. In our case, we know that items that do not have a corresponding parent record can safely be deleted, but you may choose to take a different approach (report the error and let someone manually handle the issue).
Finally, once we have verified the data, we execute another script that restores the nullability, adds the FKs back, and reinstalls the triggers.
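A minimal sketch of the import step (step 2 above), using plain ADO.NET; the table name and INSERT text are placeholders, and note that SQL Server allows IDENTITY_INSERT to be ON for only one table per session at a time:

using System.Data.SqlClient;

public static class SnapshotImporter
{
    // insertSql must list its columns explicitly, including the identity
    // column, or SQL Server will reject the explicit identity values.
    public static void ImportTable(SqlConnection conn, string table, string insertSql)
    {
        using (var cmd = conn.CreateCommand())
        {
            cmd.CommandText =
                $"SET IDENTITY_INSERT {table} ON; " +
                insertSql +
                $"; SET IDENTITY_INSERT {table} OFF;";
            cmd.ExecuteNonQuery();
        }
    }
}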
If you have the budget for it, I would strongly recommend that you take a look at the tools that Red Gate provides, specifically SQL Packager and SQL Data Compare (I suspect there may be other tools out there as well, we just don't have any experience with them). These tools have been critical in the successful implementation of our strategy.
Update
We provide the baseline configuration through an SQL Script that is generated by RedGate's SQL Packager.
Because our end users can modify the database between updates, which causes the identity values in their database to differ from those in ours, we actually store the baseline primary and foreign keys in separate fields within each record.
When we update the customer database and we need to link new records to known configuration information, we can use the baseline fields to find out what the database-specific FKs should be.
In other words, there is always a known set of field IDs for well-known configuration records, regardless of what other data is modified in the database, and we can use this to link records together.
For example, if Table1 is linked to Table2, Table1 will have a baseline PK, and Table2 will have a baseline PK and a baseline FK containing Table1's baseline PK. When we update records, if we add a new Table2 record, all we have to do is find the Table1 record with the specified baseline PK, then update the actual FK in Table2 with the actual PK of that Table1 record.
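A sketch of that re-linking step, with all table and column names invented for the example (the Baseline* columns hold the keys as they were in our copy of the data):

using System.Data.SqlClient;

public static class BaselineRelinker
{
    // Resolve each Table2 row's baseline FK to the customer's actual key.
    public static void RelinkTable2(SqlConnection conn)
    {
        const string sql = @"
            UPDATE t2
            SET    t2.Table1Id = t1.Table1Id          -- actual FK
            FROM   Table2 t2
            JOIN   Table1 t1
              ON   t1.BaselinePk = t2.BaselineTable1Fk";
        using (var cmd = new SqlCommand(sql, conn))
            cmd.ExecuteNonQuery();
    }
}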
A kind of versioning by date ranges is a common method for records in enterprise applications. As an example, we have a table for business entities (US) or companies (UK), and we keep the current official name in another table as follows:
CompanyID  Name          ValidFrom   ValidTo
12         Business Ltd  2000-01-01  2008-09-23
12         Business Inc  2008-09-23  NULL
The NULL in the last record means that it is the current one. You may use the above logic and possibly add more columns to gain more control. This way there are no duplicates, you can keep as much history as you need, and you can synchronize the current values across tables easily. Finally, the performance will be great.
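A short sketch of reading the current row under this scheme, assuming EF and invented names:

using System;
using System.Data.Entity;
using System.Linq;

public class CompanyName
{
    public int Id { get; set; }
    public int CompanyId { get; set; }
    public string Name { get; set; }
    public DateTime ValidFrom { get; set; }
    public DateTime? ValidTo { get; set; }   // NULL marks the current row
}

public static class CompanyNameQueries
{
    // The row whose ValidTo is still open is the current official name.
    public static string CurrentName(DbContext db, int companyId) =>
        db.Set<CompanyName>()
          .Single(n => n.CompanyId == companyId && n.ValidTo == null)
          .Name;
}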
