Is EF or SQL the better choice to audit data changes? - c#

The requirement seems simple: when data changes, audit the changes.
Here are some important pieces of the equation:
The data in my application spans multiple tables (some are cross-reference tables).
My DTO is deep, with Navigation Properties conditionally populated.
When loaded, I copy the original DTO with its "original values".
When a save is requested, the original DTO contains the changes.
Ideally, foreign keys will read as useful text, not Id numbers.
Unlike TFS' cool history feature, mine seems more complicated because of the many related tables and conditional child entities.
I see three possibilities (so far):
I could use C# to reflect the objects and create a before/after record (see the sketch after this list).
I could use triggers in SQL 2008R2 to catch changes and coalesce a before/after record.
I could store the raw before/after objects and let SQL 2008R2 parse them.
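For the first option, a minimal reflection-based diff might look like the sketch below (the AuditDiff name and the scalar-only comparison are illustrative; navigation properties and collections would need their own handling):

    using System;
    using System.Collections.Generic;
    using System.Reflection;

    public static class AuditDiff
    {
        // Compares two instances of the same DTO type and reports the scalar
        // properties whose values differ between the "before" and "after" copies.
        public static IEnumerable<string> Diff<T>(T original, T current)
        {
            foreach (PropertyInfo prop in typeof(T).GetProperties(
                         BindingFlags.Public | BindingFlags.Instance))
            {
                if (!prop.CanRead || prop.GetIndexParameters().Length > 0)
                    continue;

                object before = prop.GetValue(original, null);
                object after = prop.GetValue(current, null);

                if (!object.Equals(before, after))
                    yield return string.Format("{0}: '{1}' -> '{2}'", prop.Name, before, after);
            }
        }
    }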
Please note: right now, it seems to me that SQL 2008R2's CDC is far too heavy an option. I am really looking for something I can build, but I admit my mind is open to anything right now.
My question
Before I get started building this:
How does everybody else handle auditing a complex EF DTO?
Is there a low(ish)-tech solution available?
Thank you in advance.
Related, but not completely related, questions already on Stack Overflow that do not provide an answer: Implementing Audit Log / Change History with MVC & Entity Framework; Create Data Audit in SQL Server; https://stackoverflow.com/questions/5773419/how-to-audit-many-to-many-relationship-in-entity-framework; Maintaining audit log for entities split across multiple tables; and Linq to SQL Audit Trail / Audit Log: should I use triggers or doddleaudit?

If auditing is a real requirement I would opt for the trigger solution, since the other methods have several shortcomings:
They are "blind" to any changes happening through means other than your application.
If you make some code changes and forget to add the audit code, the audit trail gets "blind spots".
The trigger-based solution can be secured so that only special users can even see the audited data...
I usually work with Oracle, but from my experience in such situations: allow the app only SELECT rights via views; any insert/delete/update should be done via stored procedures; and the audit trail should be done via triggers...

I've recently implemented an audit log manager on top of Entity Framework. When I instantiate my audit manager, I reflect all of the entity classes and store the property information. Then, within the object context's SavingChanges event, I audit all of the changes. It works great. In the case of foreign keys, I just store their Ids before and after the change.
The nice thing about this solution is that it doesn't require any extra coding. Once you create a log manager of sorts, you don't have to worry about adding new triggers, or modifying triggers when new columns are added. Any changes to your entity classes will automatically be picked up when reflecting the classes.
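A minimal sketch of that hook, assuming an EF 4/5 ObjectContext (the AuditLogManager name and the Console output are placeholders for however the audit records are actually stored):

    using System;
    using System.Data;
    using System.Data.Objects;

    public class AuditLogManager
    {
        public void Attach(ObjectContext context)
        {
            context.SavingChanges += (sender, args) =>
            {
                var ctx = (ObjectContext)sender;

                // Modified entity entries expose both the original and the current values.
                foreach (ObjectStateEntry entry in
                         ctx.ObjectStateManager.GetObjectStateEntries(EntityState.Modified))
                {
                    if (entry.IsRelationship)
                        continue;   // relationship entries have no scalar properties to diff

                    foreach (string property in entry.GetModifiedProperties())
                    {
                        object before = entry.OriginalValues[property];
                        object after = entry.CurrentValues[property];

                        // Persist however you like: an audit table, a log file, a queue...
                        Console.WriteLine("{0}.{1}: '{2}' -> '{3}'",
                            entry.EntitySet.Name, property, before, after);
                    }
                }
            };
        }
    }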

Well, let's see. SQL Server auditing already exists, comes with tools, is probably already known by your DBAs, doesn't slow down your app, and can trace events that the application itself will never even see.
On the other hand, rolling your own in EF will allow you to audit non-SQL Server data sources. It also doesn't require Enterprise Edition.

Trigger Solution, Pros:
Cannot bypass the audit
Trigger Solution, Cons:
Cannot audit non-SQL data
Cannot audit complex objects on insert
Entity Framework, Pros:
Can audit everything
Can audit complex objects in any state
Entity Framework, Cons:
Can be bypassed (e.g. by direct-to-SQL changes)
Requires a copy of original values
My choice is Entity Framework. Using self-tracking entities (STEs) makes it easier.
Either way you have to roll your own.

Related

Entity Framework and Database Triggers

I have a database which is created in a separate project; an .edmx model file was generated by Entity Framework, which created the model classes from the existing database.
There are several things that add entries to the database (other parts of the backend, the front-end site, the API, etc.). Currently the method I have is a loop that checks for new entries in the database every 5 seconds (basically just a call to the table that looks for entries newer than the most recent entry I know of), and then I use the entry to perform actions that are not database related.
I was wondering if there is a better way to get new entries than constantly querying the database for something new, or if what I'm doing is fine, preferably something that can be built upon / with EF.
Thanks for any help!
If you want to notify your app as soon as any database records are inserted, updated, or deleted and do some extra processing on them, then you have two choices.
You can go with SqlDependency or SqlTableDependency. Both are used to notify the application when something in the database changes. There is just one constraint: you must be able to enable the Service Broker for SQL Server using ALTER DATABASE MyDatabase SET ENABLE_BROKER (this is important, as some databases don't support broker services, e.g. SQL Azure). A minimal SqlDependency sketch follows the links below.
Here are some good links to explore both the approaches.
https://github.com/christiandelbianco/monitor-table-change-with-sqltabledependency
https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/detecting-changes-with-sqldependency
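Here is a minimal SqlDependency sketch (the dbo.Entries table and its columns are made up; note that the query must list columns explicitly and use two-part table names, or SQL Server will refuse to register the notification):

    using System;
    using System.Data.SqlClient;

    public class NewEntryListener
    {
        private readonly string _connectionString;

        public NewEntryListener(string connectionString)
        {
            _connectionString = connectionString;
            SqlDependency.Start(_connectionString);   // requires Service Broker to be enabled
        }

        public void Subscribe()
        {
            using (var connection = new SqlConnection(_connectionString))
            using (var command = new SqlCommand(
                "SELECT EntryId, CreatedOn FROM dbo.Entries", connection))
            {
                var dependency = new SqlDependency(command);
                dependency.OnChange += OnChange;

                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read()) { /* read the current rows if needed */ }
                }
            }
        }

        private void OnChange(object sender, SqlNotificationEventArgs e)
        {
            // e.Info indicates Insert/Update/Delete; notifications fire only once,
            // so re-subscribe before doing the non-database work.
            Subscribe();
            Console.WriteLine("Table changed: {0}", e.Info);
        }

        public void Stop()
        {
            SqlDependency.Stop(_connectionString);
        }
    }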

Is it possible to use Entity Framework and keep object relations in the code and out of the database

I'm having a hard time just defining my situation, so please be patient. Either I have a situation that no one blogs about, or I've created a problem in my mind through a lack of understanding of the concepts.
I have a database which is something of a mess, and the DB owner wants to keep it that way. By mess I mean it is not normalized and has no relationships defined, although they do exist...
I want to use EF, and I want to optimize my code by reducing database calls.
As a simplified example I have two tables with no relationships set like so:
Table: Human
HumanId, HumanName, FavoriteFoodId, LeastFavoriteFoodId, LastFoodEatenId
Table: Food
FoodId, FoodName, FoodProperty1, FoodProperty2
I want to write a single EF database call that will return a human and a full object for each related food item.
First, is it possible to do this?
Second, how?
Boring background information: a super SQL developer has written a query that returns 21 tables in 20 milliseconds, containing a total of 1401 columns. This is being turned into an XML document for our front-end developer to bind to. I want to change our technique to use objects and thus reduce the amount of hand coding and mapping from fields to XML (not to mention the handling of nulls vs. empty strings, etc.) and create a type-safe, compile-time environment. Unfortunately we are not allowed to change the database or add relationships...
If I understand you correctly, it's better for you to use the Entity Framework Code First approach:
You can define your objects (entities) Human and Food
Define relationships between them in code even if they don't have foreign keys in the DB
Query them using LINQ to Entities
And yes, you can select all related information in one call (see the sketch below).
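A minimal sketch of those steps, assuming EF 5/6 code-first mapped onto the existing, unmodified tables (the FoodContext name and the navigation-property names are illustrative):

    using System.Data.Entity;
    using System.Linq;

    public class Food
    {
        public int FoodId { get; set; }
        public string FoodName { get; set; }
        public string FoodProperty1 { get; set; }
        public string FoodProperty2 { get; set; }
    }

    public class Human
    {
        public int HumanId { get; set; }
        public string HumanName { get; set; }
        public int FavoriteFoodId { get; set; }
        public int LeastFavoriteFoodId { get; set; }
        public int LastFoodEatenId { get; set; }

        public virtual Food FavoriteFood { get; set; }
        public virtual Food LeastFavoriteFood { get; set; }
        public virtual Food LastFoodEaten { get; set; }
    }

    public class FoodContext : DbContext
    {
        public DbSet<Human> Humans { get; set; }
        public DbSet<Food> Foods { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)
        {
            // Map to the existing tables.
            modelBuilder.Entity<Human>().ToTable("Human");
            modelBuilder.Entity<Food>().ToTable("Food");

            // Relationships exist only in code; the database has no FK constraints.
            modelBuilder.Entity<Human>()
                .HasRequired(h => h.FavoriteFood).WithMany()
                .HasForeignKey(h => h.FavoriteFoodId).WillCascadeOnDelete(false);
            modelBuilder.Entity<Human>()
                .HasRequired(h => h.LeastFavoriteFood).WithMany()
                .HasForeignKey(h => h.LeastFavoriteFoodId).WillCascadeOnDelete(false);
            modelBuilder.Entity<Human>()
                .HasRequired(h => h.LastFoodEaten).WithMany()
                .HasForeignKey(h => h.LastFoodEatenId).WillCascadeOnDelete(false);
        }
    }

    // One database call that loads a human and all three related foods:
    // Database.SetInitializer<FoodContext>(null);  // never try to create/alter the existing DB
    // using (var db = new FoodContext())
    // {
    //     var human = db.Humans
    //         .Include(h => h.FavoriteFood)
    //         .Include(h => h.LeastFavoriteFood)
    //         .Include(h => h.LastFoodEaten)
    //         .Single(h => h.HumanId == someId);
    // }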
You can define the relationships in the code with Entity Framework using Fluent API. In your case you might be able to define your entities manually, or use a tool to reverse engineer your EF model from an existing database. There is some support for this built in to Visual Studio, and there are VS extensions like EF Power Tools that offer this capability.
As for making a single call to the database with EF, you would probably need to create a stored procedure or a view that returns all of the information you need. Using the standard setup with lazy-loading enabled, EF will make calls to the database and populate the data as needed.

Continuous delivery and database schema changes with entity framework

We want to progress towards being able to do continuous delivery of our application into production. We currently deploy to Azure and use table/blob storage, and have an Azure SQL database which we access with Entity Framework.
As the database schema changes we want to be able to automatically apply the schema changes to the production database, but as this will happen while the application is live and the code changes are being deployed to many nodes at the same time, we are not sure what the correct approach is.
After some reading it seems (and this makes sense) that the application needs to be tolerant of the two different database schema versions, so that it doesn't matter whether an old version of the code or a new version of the code sees the database. However, I'm not sure of the best way to handle this in the application when using Entity Framework.
Should we have versioned instances of the EF generated classes in the code which know how to access a specific version of the schema? What happens when the schema is updated and an old version of the code is running against the database?
Our Entity Framework classes are mapped to views in specific schemas in the DB and nothing is mapped to the underlying tables, so potentially this could allow us to create v1 views which the old code uses and v2 views which the new code uses. But maintaining this feels like it would be a bit of a nightmare (it's already enough of a pain simply maintaining the EF mappings to views rather than tables).
So what are best practices in this area? What do others do to solve this problem?
Whether you use EF or not, maintaining the code's ability to work with 2 consecutive versions of the database is a good (and perhaps the only viable) approach here.
Here are some ways we handle specific types of migrations:
When adding a column, we can typically just add the column (with a default constraint if non-nullable) and not worry about the code. EF will never issue a "SELECT *", so it will be able to continue to function properly while ignoring the new column. Similarly, adding a table is easy.
When removing a column or table, simply keep that column or table around one version longer than you would have otherwise.
For more complex migrations (e.g. completely changing the structure of a table or a segment of the data model), deploy the new model alongside backwards-compatibility views (or tables with triggers to keep them in sync), which will live as long as the code that references them does. As you say, this can be a lot of work depending on the complexity of the migration, but it sounds like you are already well-positioned to do this because your EF entities point to views anyway. On the other hand, the benefit of this work is that you have more time to do the code migration. If you have a large codebase, this could be really beneficial in allowing you to migrate the data model to fit the needs of new features while still supporting old features without major code changes.
As a side note, the difficulty of data migration often makes us push finalizing the data model as far back as possible in the development schedule. With EF, you can write and test a lot of code before the data model is finalized (we use code-first to generate a sample SQLExpress database in unit tests, even though our production database is not maintained by code-first). That way, we make fewer incremental changes to the production data model once a new feature is released.
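As a rough illustration of that testing approach (the SampleContext model and its connection-string name are made up), a code-first context can rebuild a throw-away SQL Express database on demand:

    using System.Data.Entity;

    // Hypothetical code-first model used only while the real schema is in flux.
    public class Person
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public class SampleContext : DbContext
    {
        // "SampleTestDb" is assumed to be a connection string in the test config
        // pointing at a local SQL Express / LocalDB instance.
        public SampleContext() : base("name=SampleTestDb") { }
        public DbSet<Person> People { get; set; }
    }

    public static class SampleDatabaseFactory
    {
        public static SampleContext CreateFresh()
        {
            // Drop and recreate the throw-away database so every test run
            // sees the current code-first model; production is never touched.
            Database.SetInitializer(new DropCreateDatabaseAlways<SampleContext>());

            var context = new SampleContext();
            context.Database.Initialize(force: true);
            return context;
        }
    }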

Entity Framework and ADO.NET with Unit of Work pattern

We have a system built using Entity Framework 5 for creating, editing and deleting data, but the problem we have is that sometimes EF is too slow, or it simply isn't possible to use Entity Framework (views which build data for tables based on users participating in certain groups in the database, etc.), and we have to use a stored procedure to update the data.
However, we have gotten ourselves into a problem where we have to save the changes with EF in order to have the data in the database and then call the stored procedures. We can't use TransactionScope, as it always escalates to a distributed transaction and/or locks the table(s) for selects during the transaction.
We are also trying to introduce a DomainEvents pattern which will queue events and raise them after SaveChanges, so that we have the data we need in the DB, but then we may end up with the first part succeeding and the second part failing.
Are there any good ways to handle this or do we need to move away from EF entirely for this scenario?
I had a similar scenario. Later I broke the process into smaller ones and used EF only, keeping each small process short. Even though the overall time is longer, the system is easier to maintain and scale. I also minimized joins, updated only the entity itself, and disabled EF's AutoDetectChangesEnabled and ValidateOnSaveEnabled (sketched below).
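A minimal sketch of those two settings, assuming EF 5 and hypothetical Order/OrdersContext types:

    using System.Data.Entity;

    public class Order
    {
        public int Id { get; set; }
        public string Status { get; set; }
    }

    public class OrdersContext : DbContext
    {
        public DbSet<Order> Orders { get; set; }
    }

    public static class OrderUpdater
    {
        public static void MarkShipped(int orderId)
        {
            using (var context = new OrdersContext())
            {
                // Turn off the per-operation change scan and save-time validation.
                context.Configuration.AutoDetectChangesEnabled = false;
                context.Configuration.ValidateOnSaveEnabled = false;

                var order = context.Orders.Find(orderId);
                order.Status = "Shipped";   // update only the entity itself, no navigation properties

                // With automatic detection off, tell EF about the change before saving.
                context.ChangeTracker.DetectChanges();
                context.SaveChanges();
            }
        }
    }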
Sometimes if you look your problem in different ways, you may have better solution.
Good luck!

Best Practice - Mixing Table-Entities with View-Entities in EntityFramework?

I have a legacy database that I'd like to interact with Entity Framework.
The database is highly normalised for storing information about flights. In order to make it easier to work with some of the data, a number of SQL Views have been written to flatten data and to pivot certain multi-table joins into more logical information.
After quickly looking over this I see two problems with using Views in EF.
The views contain lots and lots of keys. Some quick googling seems to indicate I will need to manually edit the EDMX file to remove this info.
The Views don't have any relationships to the other table entities. These associations need to be manually added in order to link a View -> Table.
Both of these seem like major pain points when it comes to refreshing the model from the DB when the DBA team makes changes.
Is this just something you need to "put up with" when working with EF, or are there any suggested patterns/practices to deal with these?
Mixing Table-Entities with View-Entities is ok and largely depends on your requirements.
My experience has been these are things you are going to have to deal with.
When I first started using Entity, I used views a lot because I was told I needed to use them. As I became more familiar with Entity I began to prefer the use of table-entities over view-entities, mainly because I felt I had more control. Views are OK when you are presenting read-only info, or as you described (flattened data, pivots, joins, etc.); however, when your requirements change and you now have to add CRUD, you are going to have to use stored procedures or change your model to use table-entities anyway, so you might as well use table-entities from the start.
The views contain lots and lots of keys. Some quick googling seems to indicate I will need to manually edit the EDMX file to remove this info.
This wasn't ever really a problem for me. You can undo keys of the view-entity in the designer. If you're talking about doing this for the view in the storage layer, then yes, you can, to make it work, but as soon as you update your model from the database you are going to have to do this over again, so I wouldn't recommend it. You are better off working with your DBA to adjust the key constraints in the database.
The Views don't have any relationships to the other table entities. These associations need to be manually added in order to link a View -> Table.
This was often a problem for me. Sometimes you are able to add keys and create relationships without any problems, but oftentimes you may have to change the keys and/or relationships in the DB to make it work. This depends on your requirements; you may have to deal with this even when using table-entities.
Hope this helps.
I've been in a similar situation as we transitioned into using Entity Framework.
The first step was to start with a blank EF model and add the tables when we created the domain service calls. This at least meant that the model wasn't crazy to start with! Then the plan was to try and not use views as much as possible and move that kind of logic into the domain service, where at least it could be tested, and slowly deprecate the CRUD stored procedures. It's worked fine and there haven't really been any major problems.
In practice there are still some views, mainly used for situations that need to be performant. Fortunately these views can be considered in isolation (for read-only grids) and have been left as such in the model with no associations. Adding the keys in would, I'm sure, be annoying.
Editing the EDMX file is okay, but sometimes on a model refresh these changes can get lost. This has happened to me particularly when EF thinks a table is a view. And yes it's a pain and something that has just been put up with.
