Maintaining Referential Integrity Without Actually Deleting A Record - c#

Rather than deleting an entry from the database, I am planning on using a boolean column like isActive in every table and manage its true/false state.
Normally when you delete a record from the database,
referential integrity is maintained, which means you cannot delete it before deleting its dependencies.
when you query a deleted record, it returns null.
How can I achieve the same results in an automated way using Entity Framework? Because checking isActive field for every entity in every query manually seems too much work which will be error-prone. And the same holds true for marking the dependencies as isActive=false.
EDIT:
My purpose is not limited to point-in-time queries. Let me give an example. UserA posted a photo and UserB wrote a comment on it. Then UserB wanted to delete his account. But the comment has its poster FK pointing at UserB. So, rather than deleting UserB, I want to deactivate its account but keep the record in order not to break dependencies.
And I want to extend this logic to every table in the database. Is that wrong?

As kind of a side answer to this question, instead of querying all of the tables directly why not use Views and then query the views? You can place a filter in the view to only display the "IsActive = true" records, that way you don't have to worry about including it manually in every query (something you mention is error prone).

Because checking isActive field for every entity in every query manually seems too much work which will be error-prone
It is error prone. But you may not always want only the active records (admin page?). You may also not want to soft delete ALL records, as not everything makes sense to keep around (in my experience). You could use an Expression to help you out / wire it up for certain methods / repositories and build dynamic queries.
Expression<Func<MyModel, bool>> IsActive = x => x.IsActive;
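A minimal sketch of how such an expression might be reused, assuming a hypothetical MyModel class and a hypothetical Filters helper. An in-memory list stands in for the DbSet here; with Entity Framework the same Where call would be translated to SQL instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical model; only the IsActive property comes from the answer.
public class MyModel
{
    public int Id { get; set; }
    public bool IsActive { get; set; }
}

// Hypothetical helper class, not part of any library.
public static class Filters
{
    // The reusable predicate from the answer.
    public static readonly Expression<Func<MyModel, bool>> IsActive = x => x.IsActive;

    // Applies the soft-delete filter unless the caller (e.g. an admin page) opts out.
    public static IQueryable<MyModel> ActiveOnly(IQueryable<MyModel> source, bool includeInactive = false)
    {
        return includeInactive ? source : source.Where(IsActive);
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for context.Set<MyModel>().
        var data = new List<MyModel>
        {
            new MyModel { Id = 1, IsActive = true },
            new MyModel { Id = 2, IsActive = false }
        }.AsQueryable();

        Console.WriteLine(Filters.ActiveOnly(data).Count());        // 1
        Console.WriteLine(Filters.ActiveOnly(data, true).Count());  // 2
    }
}
```

Keeping the opt-out as an explicit parameter makes the "admin page" case visible at the call site instead of hiding it in a second set of queries.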
And the same holds true for marking the dependencies as isActive=false
A base repository could handle the delete for all your repositories, which would set the status to false (where the BaseModel would have an IsActive property).
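public int Delete<TEntity>(long id) where TEntity : BaseModel
{
    using (var context = GetContext())
    {
        var dbEntity = context.Set<TEntity>().Find(id);
        // Find returns null when no record has this key; nothing to soft delete then
        if (dbEntity == null)
        {
            return 0;
        }
        // Soft delete: flag the record instead of removing it
        dbEntity.IsActive = false;
        return context.SaveChanges();
    }
}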

There is an OSS tool called EF Filters that can achieve what you are looking for: https://github.com/jbogard/EntityFramework.Filters
It lets you set global filters like an IsActive field and would certainly work for queries.

Related

Relation Many to Many EF6 Fluent API

I am developing an application using EF6 with the Fluent API and I have trouble managing a many-to-many relationship.
For internal reasons the join table has a specific format with 4 fields:
- Left Id (FK)
- Right Id (FK)
- StartDate (dateTime)
- EndDate (datetime)
Deleting a link is in fact setting the EndDate as not null, but I don't know how to configure it in EF6.
On the other hand, when reading links, records with a non-null EndDate shouldn't be considered.
Can you give me a solution?
Thank you.
Join tables and EF
EF automates some things for you. For this, it uses convention-over-configuration. If you stick to the convention, you can skip on a whole lot of common configuration.
For example, if your entity has a property named Id, EF will inherently assume that this is the PK.
Similarly, if two entity types have nav props that refer to each other (and only one direct link between the two entities exists), then EF will automatically assume that these nav props are the two sides to a single many-to-many relationship. EF will make a join table in the database, but it will keep this hidden from you, and let you deal with the two entity types themselves.
For some internal reasons the Join table has a specific format including 4 fields - Left Id (FK) - Right Id (FK) - StartDate (dateTime) - EndDate (datetime)
Your join table no longer conforms to what the content of a conventional and automatically generated EF join table is. You are expecting a level of custom configurability that EF cannot provide based on blind convention, which means you have to explicitly configure this.
Secondly, the fact that you have these additional columns implies that you wish to use this data at some point (presumably to show the historical relations between two entities). Therefore, it doesn't make sense to rely on EF's automatic join tables, as the join table and its content would be hidden from the application/developer.
It's possible that the second consideration is invalid for you, if you don't need the application to ever fetch the ended entries. But the overall point still stands.
The solution here is to make the join record an explicit entity of its own. In essence, you are not dealing with a many-to-many here, you are dealing with a specific entity (the join element) with two one-to-many relationships (one for each of the two entity types).
This enables you to achieve exactly what you want. Your expectation of what EF can automate for you simply doesn't apply in this case.
Soft delete
Deleting a link is in fact setting the EndDate as not null, but I don't know how to configure it in EF6.
In general, this is known as "soft delete" behavior, albeit maybe slightly differently here. In a regular soft delete pattern, when an entry is deleted, the database secretly retains the entry but the application doesn't know that and doesn't see the entry again.
It's unclear if you intend for ended entries to still show up in the application, e.g. the relational history. If this is not the case, then your situation is exactly soft delete behavior.
This isn't something you configure on the model level, but rather something you override in your db context's SaveChanges behavior. A simple example of how I implement a soft delete:
public override int SaveChanges()
{
    // Get all entries of the change tracker (of a given type)
    var entries = ChangeTracker.Entries<IAuditedEntity>().ToList();
    // Filter the entries that are being deleted
    foreach (var entry in entries.Where(e => e.State == EntityState.Deleted))
    {
        // Change the entry so it instead updates the entry and does not delete it
        entry.Entity.DeletedOn = DateTime.Now;
        entry.State = EntityState.Modified;
    }
    return base.SaveChanges();
}
This allows you to prevent deletions to the entities that you want this to apply to, which is the safest way to implement a soft delete as this serves as a catch-all for database deletes coming from whichever consumer uses this db context.
The solution to your question is pretty much the same. Assuming you named your join entity (see the previous section) JoinEntity:
public override int SaveChanges()
{
    var entries = ChangeTracker.Entries<JoinEntity>().ToList();
    // Filter the entries that are being deleted
    foreach (var entry in entries.Where(e => e.State == EntityState.Deleted))
    {
        // Change the entry so it instead updates the entry and does not delete it
        entry.Entity.Ended = DateTime.Now;
        entry.State = EntityState.Modified;
    }
    return base.SaveChanges();
}
Word of warning
Soft deletes tend to be a catch-all for all entities (or at least a significant chunk of your database). Therefore, it makes sense to catch this at the db context level as I did here.
However, if this entity is unique in that it is soft deleted, then this is more of a business logic implementation than it is a DAL-architecture. If you start writing many custom rules for different types of entities, the db context logic is going to get cluttered and it's not going to be nice to work with because you need to account for multiple possible operations happening during the SaveChanges.
Take note to not push what is supposed to be a business logic decision to the DAL. I can't draw this line for you, it depends on your context. But evaluate whether the db context is the best place to implement this behavior.
Can you give me a solution ?
If your linking table has extra columns you have to model it as an Entity, and the EndDate logic for navigation needs to be explicit. EF won't do any of that for you.
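As a rough sketch of what "model it as an entity and make the EndDate logic explicit" could look like: the class and method names below (Link, LinkQueries, Active, End) are hypothetical, and only the four columns come from the question. An in-memory list stands in for the DbSet so that the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical join entity; the four properties mirror the columns in the question.
public class Link
{
    public int LeftId { get; set; }
    public int RightId { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime? EndDate { get; set; }   // null means the link is still active
}

// Hypothetical helper, not part of EF.
public static class LinkQueries
{
    // Reading links: ignore every record whose EndDate has been set.
    public static IQueryable<Link> Active(IQueryable<Link> links)
    {
        return links.Where(l => l.EndDate == null);
    }

    // "Deleting" a link: stamp EndDate instead of removing the record.
    public static void End(Link link, DateTime when)
    {
        if (link.EndDate == null)
        {
            link.EndDate = when;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var links = new List<Link>
        {
            new Link { LeftId = 1, RightId = 10, StartDate = new DateTime(2020, 1, 1) },
            new Link { LeftId = 1, RightId = 11, StartDate = new DateTime(2020, 1, 1) }
        };

        LinkQueries.End(links[0], new DateTime(2021, 1, 1));

        Console.WriteLine(LinkQueries.Active(links.AsQueryable()).Count()); // 1
    }
}
```

With EF, the Link class would be mapped as a regular entity with two one-to-many relationships, and the Active filter would be applied wherever links are read.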

How to transition partial update operation from ObjectContext to DbContext

I am using Entity Framework 5.0. I am in the process of changing my app from the ObjectContext to the DbContext model. According to Microsoft, DbContext is the recommended approach. I use the database-first approach and have generated the model from the database.
But the very first simple task already causes a problem: a simple update of a record is broken.
Let's take a simple table Item, for illustration only:
Item
(
ItemId int NOT NULL, -- Primary key
Name nvarchar(50) NOT NULL,
Description NVARCHAR(50)
)
I have noticed that DbContext does not support updating a record the way ObjectContext does.
In my application I have a simple update method.
public void UpdateItem()
{
    MyContext context = new MyContext();
    Item item = new Item();
    item.ItemId = 666;
    context.Items.Attach(item);
    // From this point onward EF tracks the changes I make to item
    item.Description = "Some description";
    context.SaveChanges();
}
Using ObjectContext this method correctly updates a record. Using SQL profiler I can see that it generates something like this (greatly simplified!!!)
UPDATE Item
SET Description = 'Some description'
WHERE ItemId = 666
If, however I try to do the same thing in DbContext I get the exception:
System.Exception: Items.aspx.cs - logged from CustomError() ---> System.Data.Entity.Validation.DbEntityValidationException: Validation failed for one or more entities. See 'EntityValidationErrors' property for more details.
at System.Data.Entity.Internal.InternalContext.SaveChanges()
at System.Data.Entity.Internal.LazyInternalContext.SaveChanges()
at System.Data.Entity.DbContext.SaveChanges()
And no database UPDATE is issued to Sql server.
I guess that DbContext validates all the properties and the property Name is null. This is by design: I do not intend to modify it, I do not even know what its value is, and I do not need to know.
Only the property Description was changed. Clearly DbContext does not track changes correctly.
How can this problem be resolved?
I have researched the issue and found something on updating records.
For example this link: https://stackoverflow.com/a/15339512/4601078
db.Users.Attach(updatedUser);
var entry = db.Entry(updatedUser);
entry.Property(e => e.Email).IsModified = true;
// other changed properties
db.SaveChanges();
But this is horrible code. For every property one would have to add a line like:
entry.Property(e => e.Email).IsModified = true;
This produces ugly, unreadable code, and I suspect lambda expressions are not stellar in performance.
Even worse are those who propose making a round trip to the DB to fetch the existing record with all properties populated, update it, and then save the changes. This is a no-go with regard to performance.
So, how should simple entity updates be tackled, or is DbContext just another item in Microsoft's collection of dead ends which serve no real purpose?
DbContext doesn't really track changes by watching properties, it compares the values to previously known values. And validation always works on the entire entity so the only real choice if you want to do things this way is to disable validation during this operation. See Entity Framework validation with partial updates
If you know for sure that the changes you apply are valid, or you have custom code to validate them, you can turn off validation by EF altogether:
db.Configuration.ValidateOnSaveEnabled = false;
This works OK as long as you do it your way: attach a new entity with a known Id (aka a stub entity) and then modify its properties. EF will only update the properties it detects as having been modified (indeed by comparing original and current values, not, as ObjectContext did, by change notifications). You shouldn't mark the entity itself as modified.
If you don't want to turn off EF's validation, but neither want to mark individual properties as modified, I think this could be a useful alternative (brought to my attention by Alex's answer).

Archive data based on conditions

We've been using the Entity framework-code first approach and Fluent Api, and have this requirement, an entity with multiple navigation properties and the possibility of numerous entries.
This entity reflects the data of a process and a field captures whether the entity is active in the process. I've provided an example for this.
public class ProcessEntity
{
//Other properties and Navigation properties
public bool IsInProcess { get; set; }
}
What I've been trying to do is have another table, a mapping table or something similar, that contains only the ProcessEntity items whose IsInProcess property is set to true, i.e. this table provides the ProcessEntities that are active in the process.
The whole idea behind this segregation is that a lot of queries and reports are generated only on the items that are still in process, and querying the whole table every time with a Where clause would be a performance bottleneck. Please correct me if I'm wrong.
I thought of having a mapping table but the entries have to be manually added and removed based on the condition.
Is there any other solution or alternative design ideas for this requirement?
Consider using an index.
Your second table is what an index would do.
Let the DB do its job.
Given that a boolean isn't a great differentiator, a date or similar as part of the index may also be useful.
eg How to create index in Entity Framework 6.2 with code first
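A configuration sketch of what that could look like with the fluent HasIndex method that EF 6.2 added. It assumes the EntityFramework 6.2 (or later) package is referenced; the context name ProcessContext and the Id property are hypothetical.

```csharp
using System.Data.Entity;

public class ProcessEntity
{
    public int Id { get; set; }            // hypothetical key property
    public bool IsInProcess { get; set; }
}

// Hypothetical context name.
public class ProcessContext : DbContext
{
    public DbSet<ProcessEntity> ProcessEntities { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Let the database maintain the "only items in process" lookup as an index.
        modelBuilder.Entity<ProcessEntity>()
            .HasIndex(p => p.IsInProcess);

        base.OnModelCreating(modelBuilder);
    }
}
```

To follow the suggestion of adding a date to the index, the expression could select an anonymous type with both properties, for example `p => new { p.IsInProcess, p.SomeDate }`, where SomeDate is a placeholder for whatever date column the entity actually has.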

Is DbSet<>.Local something to use with special care?

For a few days now, I have been struggling with retrieving my entities from a repository (DbContext).
I am trying to save all the entities in an atomic action. Thus, different entities together represent something of value to me. If all the entities are 'valid', then I can save them all to the database. Entity 'a' is already stored in my repository, and needs to be retrieved to 'validate' entity 'b'.
That's where the problem arises. My repository relies on the DbSet<TEntity> class which works great with Linq2Sql (Include() navigation properties e.g.). But, the DbSet<TEntity> does not contain entities that are in the 'added' state.
So I have (as far as I know) two options:
Use the ChangeTracker to see which entities are available and query them into a set based on their EntityState.
Use the DbSet<TEntity>.Local property.
The ChangeTracker seems to involve some extra hard work to get it working in a way such that I can use Linq2Sql to Include() navigation properties e.g.
The DbSet<TEntity>.Local seems a bit weird to me. It might just be the name. I just read something that it is not performing very well (slower than DbSet<> itself). Not sure if that is a false statement.
Could somebody with significant EntityFramework experience shine some light on this? What's the 'wise' path to follow? Or am I seeing ghosts and should I always use the .Local property?
Update with code examples:
An example of what goes wrong
public void AddAndRetrieveUncommittedTenant()
{
_tenantRepository = new TenantRepository(new TenantApplicationTestContext());
const string tenantName = "testtenant";
// Create the tenant, but not call `SaveChanges` yet until all entities are validated
_tenantRepository.Create(tenantName);
//
// Some other code
//
var tenant = _tenantRepository.GetTenants().FirstOrDefault(entity => entity.Name.Equals(tenantName));
// The tenant will be null, because I did not call save changes yet,
// and the implementation of the Repository uses a DbSet<TEntity>
// instead of the DbSet<TEntity>.Local.
Assert.IsNotNull(tenant);
// Can I safely use DbSet<TEntity>.Local ? Or should I play
// around with DbContext.ChangeTracker instead?
}
An example of how I want to use my Repository
In my Repository I have this method:
public IQueryable<TEntity> GetAll()
{
return Context.Set<TEntity>().AsQueryable();
}
Which I use in business code in this fashion:
public List<Case> GetCasesForUser(User user)
{
    return _repository.GetAll().
        Where(@case => @case.Owner.EmailAddress.Equals(user.EmailAddress)).
        Include(@case => @case.Type).
        Include(@case => @case.Owner).
        ToList();
}
That is mainly the reason why I prefer to stick to DbSet like variables. I need the flexibility to Include navigation properties. If I use the ChangeTracker I retrieve the entities in a List, which does not allow me to lazy load related entities at a later point in time.
If this is close to incomprehensible bullsh*t, then please let me know so that I can improve the question. I desperately need an answer.
Thx a lot in advance!
If you want to be able to 'easily' issue a query against the DbSet and have it find newly created items, then you will need to call SaveChanges() after each entity is created. If you are using a 'unit of work' style approach to working with persistent entities, this is actually not problematic because you can have the unit of work wrap all actions within the UoW as a DB transaction (i.e. create a new TransactionScope when the UoW is created, and call Commit() on it when the UoW completed). With this structure, the changes are sent to the DB, and will be visible to DbSet, but not visible to other UoWs (modulo whatever isolation level you use).
If you don't want the overhead of this, then you need to modify your code to make use of Local at appropriate times (which may involve looking at Local, and then issuing a query against the DbSet if you didn't find what you were looking for). The Find() method on DbSet can also be quite helpful in these situations. It will find an entity by primary key in either Local or the DB. So if you only need to locate items by primary key, this is pretty convenient (and has performance advantages as well).
As mentioned by Terry Coatta, the best approach if you don't want to save the records first would be checking both sources.
For example:
public Person LookupPerson(string emailAddress, DateTime effectiveDate)
{
    Expression<Func<Person, bool>> criteria =
        p =>
            p.EmailAddress == emailAddress &&
            p.EffectiveDate == effectiveDate;
    return LookupPerson(_context.Set<Person>().Local.AsQueryable(), criteria) ?? // Search local
           LookupPerson(_context.Set<Person>().AsQueryable(), criteria); // Search database
}
private Person LookupPerson(IQueryable<Person> source, Expression<Func<Person, bool>> predicate)
{
    return source.FirstOrDefault(predicate);
}
For those who come after, I ran into some similar issues and decided to give the .Concat method a try. I have not done extensive performance testing so someone with more knowledge than I should feel free to chime in.
Essentially, in order to properly break up functionality into smaller chunks, I ended up with a situation in which I had a method that didn't know about consecutive or previous calls to that same method in the current UoW. So I did this:
var context = new MyDbContextClass();
var emp = context.Employees.Concat(context.Employees.Local).FirstOrDefault(e => e.Name.Contains("some name"));
This may only apply to EF Core, but every time you reference .Local of a DbSet, you're silently triggering change detection on the context, which can be a performance hit, depending on how complex your model is, and how many entries are currently being tracked.
If this is a concern, you'll want to use (for EF Core) dbContext.ChangeTracker.Entries<T>() to get the locally tracked entities, which will not trigger change detection, but does require manual filtering by entity state, as it will include deleted and detached entities.
There's a similar version of this in EF6, but in EFCore the Entries is a list of EntityEntries which you'll have to select out the entry.Entity to get out the same data the DbSet would give you.

How to implement recursive deletion?

I have the following situation:
Customers contain projects and projects contain licenses.
Because of archiving we don't delete anything; we use an IsDeleted flag instead.
Otherwise I could have used cascade deletion.
OK, I work with the repository pattern, so I call
customerRepository.Delete(customer);
But here starts the problem. The customer is set to isdeleted true. But then I would like to delete all the projects of that customer and each project that gets deleted should delete all licenses as well.
I would like to know if there is a proper solution for this.
It has to be performant though.
Take note that this is a simple version of the actual problem. A customer has also sites which are also linked to licenses but I just wanted to simplify the problem for you guys.
I'm working in a C# environment using sql server 2008 as database.
edit: I'm using Enterprise Library to connect to the database
One option would be to do this in the database with triggers. I guess another option would be use Cascade update, but that might not fit in with how your domain works.
Personally I'd probably just bite the bullet and write C# code to do the setting of IsDeleted type field for me (if there was one and only one app accessing the DB).
I recommend just writing a stored procedure (or group of stored procedures) to encapsulate this logic, which would look something like this:
update Customer set isDeleted = 1
where CustomerId = @CustomerId
/* Say the Address table has a foreign key to customer */
update Address set isDeleted = 1
where CustomerId = @CustomerId
/*
To delete related records that also have child data,
write and call other procedures to handle the details
*/
exec DeleteProjectByCustomer @CustomerId
/* ... etc ... */
Then call this procedure from customerRepository.Delete within a transaction.
This totally depends on your DAL. For instance NHibernate mappings can be setup to cascade delete all these associated objects without extra code. I'm sure EF has something similar. How are you connecting to your DB?
If your objects aren't persisted, then the .NET GC will sweep all your project objects away once there is no reference to them. I presume from your question though that you are talking about removing them from the database?
If your relationships are fixed (i.e. a license is always related to a project, and a project to a customer), you can get away with not cascading the update at all. Since you're already dealing with the pain of soft deletes in your queries, you might as well add in the pain of checking the hierarchy:
SELECT [...] FROM License l
JOIN Project p ON l.ProjectID = p.ID
JOIN Customer c on p.CustomerID = c.ID
WHERE l.IsDeleted <> 1 AND p.IsDeleted <> 1 AND c.IsDeleted <> 1
This will add a performance burden only in the case where you have queries on a child table that don't join to the ancestor tables.
It has an additional merit over a cascading approach: it lets you undelete items without automatically undeleting their children. If I delete one of a project's licenses, then delete the project, then undelete the project, a cascading approach will lose the fact that I deleted that first license. This approach won't.
In your object model, you'd implement it like this:
private bool _IsDeleted;
public bool IsDeleted
{
    get
    {
        // Deleted if this item itself is deleted, or if any ancestor is deleted
        return _IsDeleted || (Parent != null && Parent.IsDeleted);
    }
    set
    {
        _IsDeleted = value;
    }
}
...though you must be careful to actually store the private _IsDeleted value in the database, and not the value of IsDeleted.
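A small self-contained sketch of the undelete behavior described above. The Node class and the IsDeletedStored property are hypothetical stand-ins for License/Project and for whatever is mapped to the database column; the parent check is parenthesized explicitly so that the item's own flag is never masked.

```csharp
using System;

// Hypothetical type: Parent would be the Project of a License, or the Customer of a Project.
public class Node
{
    private bool _isDeleted;

    public Node Parent { get; set; }

    // Value to persist: only this item's own flag.
    public bool IsDeletedStored
    {
        get { return _isDeleted; }
    }

    // Effective value: deleted if this item or any ancestor is deleted.
    public bool IsDeleted
    {
        get { return _isDeleted || (Parent != null && Parent.IsDeleted); }
        set { _isDeleted = value; }
    }
}

public static class Program
{
    public static void Main()
    {
        var project = new Node();
        var licenseA = new Node { Parent = project };
        var licenseB = new Node { Parent = project };

        licenseA.IsDeleted = true;   // delete one license
        project.IsDeleted = true;    // then delete the project
        Console.WriteLine(licenseB.IsDeleted);  // True (inherited from the project)

        project.IsDeleted = false;   // undelete the project
        Console.WriteLine(licenseA.IsDeleted);  // True (its own deletion is remembered)
        Console.WriteLine(licenseB.IsDeleted);  // False
    }
}
```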
