Duplicating a model that has a hierarchy - c#

My model looks something like this:
Company
-Locations
Locations
-Stores
Stores
-Products
So I want to make a copy of a Company, and all of its associations should also be copied and saved to the database.
How can I do this if I have the Company loaded in memory?
Company company = DbContext.Companies.Find(123);
If it is tricky, I can loop through each association and create a new object myself. The IDs will be different but everything else should be the same.
I am using EF 6.

Cloning object graphs with EF is a piece of cake:
var company = DbContext.Companies.AsNoTracking()
.Include(c => c.Locations
.Select(l => l.Stores
.Select(s => s.Products)))
.Where(c => c.Id == 123)
.FirstOrDefault();
DbContext.Companies.Add(company);
DbContext.SaveChanges();
A few things to note here.
AsNoTracking() is vital, because the objects you add to the context shouldn't be tracked already.
Now if you Add() the company, all entities in its object graph will be marked as Added as well.
I assume that the database generates new primary key values (identity columns). If so, EF will ignore the current values from the existing objects in the database. If not, you'll have to traverse the object graph and assign new values yourself.
One caveat: this only works well if the associations are 1:0..n. If there is a n:m association, identical entities may get inserted multiple times. If, for example, Store-Product is n:m and product A occurs at store 1 and store 2, product A will be inserted twice. If you want to prevent this, you should fetch the objects by one context, with tracking (i.e. without AsNoTracking), and Add() them in a new context. By enabling tracking, EF keeps track of identical entities and won't duplicate them. In this case, proxy creation should be disabled, otherwise the entities keep a reference to the context they came from.
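A rough sketch of that alternative, assuming a DbContext class named MyDbContext here (the Include chain is the same as above, and proxy creation is switched off so the entities carry no reference to the source context):
Company company;
using (var source = new MyDbContext())
{
    source.Configuration.ProxyCreationEnabled = false; // plain POCOs: no references back to this context
    company = source.Companies
        .Include(c => c.Locations
            .Select(l => l.Stores
                .Select(s => s.Products)))
        .Single(c => c.Id == 123); // tracked, so a shared Product stays one instance in the graph
}
using (var target = new MyDbContext())
{
    target.Companies.Add(company); // marks the whole graph as Added; shared entities are added only once
    target.SaveChanges();
}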
More details here: Merge identical databases into one

I would add a method to each model that needs to be cloneable this way; I'd also recommend an interface for it (a sketch of such an interface appears further below).
It could be done something like this:
//Company.cs
public Company DeepClone()
{
    Company clone = new Company();
    clone.Name = this.Name;
    //...more properties (be careful when copying reference types)
    clone.Locations = new List<Location>(this.Locations.Select(l => l.DeepClone()));
    return clone;
}
You should repeat this basic pattern for every class and "child" class that needs to be copyable. This way each object knows how to create a deep clone of itself and passes responsibility for child objects off to the child class, neatly encapsulating everything.
It could be used this way:
Company copyOfCompany123 = DbContext.Companies.Find(123).DeepClone();
My apologies if there are any errors in the above code; I don't have Visual Studio available at the moment to verify everything, I'm working from memory.
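The interface mentioned at the start of this answer could be as simple as this sketch (IDeepCloneable<T> is just a suggested name, not an existing type):
public interface IDeepCloneable<T>
{
    T DeepClone();
}
public partial class Company : IDeepCloneable<Company> { /* DeepClone() as shown above */ }
public partial class Location : IDeepCloneable<Location> { /* clones its Stores the same way */ }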
One other really simple and code-efficient way to deep-clone an object using serialization can be found in this post: How do you do a deep copy of an object in .Net (C# specifically)?
// Requires: using System.IO; and using System.Runtime.Serialization.Formatters.Binary;
public static T DeepClone<T>(T obj)
{
    using (var ms = new MemoryStream())
    {
        var formatter = new BinaryFormatter();
        formatter.Serialize(ms, obj);
        ms.Position = 0;
        return (T) formatter.Deserialize(ms);
    }
}
Just be aware that this can have some pretty serious resource and performance issues depending on your object structure. Every class that you want to use it on must also be marked with the [Serializable] attribute.
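For example, every class in the object graph would need the attribute before this approach works; a quick sketch using the Company/Location model from the question (property lists abbreviated):
[Serializable]
public class Company
{
    public string Name { get; set; }
    public List<Location> Locations { get; set; }
}
[Serializable]
public class Location
{
    public List<Store> Stores { get; set; }
}
// Usage: var copy = DeepClone(company);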

Related

Entity Framework update the intermediate table of a many-to-many relation

I've got a many-to-many relation between user and project. Like:
class User
{
public ICollection<Project> Projects { get; set; }
}
class Project
{
public ICollection<User> Users { get; set; }
}
Entity Framework automatically generated the intermediate table.
The thing is, I want to update the user along with the entire list of projects. This list could have been modified in any way: projects could have been added and deleted at the same time before the user object gets updated.
I always get the same error: Entity Framework tried to add a duplicate entry in the intermediate table.
I've tried numerous things without success (a few listed below).
var tmp = Context.Entry(user); // user being the updated object.
tmp.State = EntityState.Modified;
tmp.Collection(e => e.Projects).IsModified = true;
Context.Users.Update(user);
Context.SaveChanges();
or
var tmp = Context.Users.SingleOrDefault(u => u.Id == user.Id);
if (tmp == null)
return null;
Context.Entry(tmp).CurrentValues.SetValues(user);
Context.SaveChanges();
return user;
or just plain old update:
Context.Users.Update(user);
Context.SaveChanges();
But none of these worked.
The issue sounds like you have a detached User entity with a set of Projects, and you want to pass that into a method and associate it with the DbContext to persist the changes.
You are encountering issues with doubled-up records because, while you attach the user to the DbContext, it treats each of the Project entities associated with the user as a new instance, since they don't reference tracked instances themselves.
Updating detached entities with associations is fairly involved, especially where you expect to possibly add or remove associations in an operation.
The recommended approach would be to load the current User and Projects from the DB then leverage Automapper to guard what values you can copy over from the detached entity, and then go through the associations to add/remove any project references that have changed. If it is possible to create a brand new project to associate to the user as part of this operation, you need to handle that as well.
var existingUser = Context.Users.Include(x => x.Projects).Single(x => x.UserId == user.UserId);
Mapper.Map(user, existingUser);
// Where Automapper is configured with a User to User mapping with allowed
// values to copy over, ignoring anything that cannot legally be changed.
var newProjectIds = user.Projects.Select(x => x.ProjectId).ToList();
var existingProjectIds = existingUser.Projects.Select(x => x.ProjectId).ToList();
var projectIdsToAdd = newProjectIds.Except(existingProjectIds).ToList();
var projectIdsToRemove = existingProjectIds.Except(newProjectIds).ToList();
var projectsToAdd = Context.Projects.Where(x => projectIdsToAdd.Contains(x.ProjectId)).ToList();
var projectsToRemove = existingUser.Projects.Where(x => projectIdsToRemove.Contains(x.ProjectId)).ToList();
foreach(var project in projectsToRemove)
existingUser.Projects.Remove(project);
foreach(var project in projectsToAdd)
existingUser.Projects.Add(project);
Context.SaveChanges();
... This example does not cover the possibility of brand new projects. If the updated user can include a brand new project then you need to detect those when looking for projectsToAdd to add any Projects from the passed in project list where the ID is in new project IDs but not found in the DB. Those detached references can be added to the User loaded from the DbContext, however you do need to handle any navigation properties that each Project might have to avoid duplication, substituting each of those with references to tracked entities, including any bi-directional reference back to the User if present.
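A rough sketch of that extra detection, building on the variables above (the rule for spotting a brand new project depends on your model; here anything whose ID was not found in the Projects table is treated as new):
var foundProjectIds = projectsToAdd.Select(x => x.ProjectId).ToList();
var brandNewProjects = user.Projects
    .Where(x => projectIdsToAdd.Contains(x.ProjectId) && !foundProjectIds.Contains(x.ProjectId))
    .ToList();
foreach (var project in brandNewProjects)
{
    project.Users = null; // strip any bi-directional reference back to the detached user
    existingUser.Projects.Add(project); // inserted as a new Project plus its join-table row
}
Context.SaveChanges();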
In general, dealing with detached entities has various considerations that you need to keep in mind and handle very deliberately. It is generally much better to avoid passing detached entities around and instead aim to pass a minimal representation of the data you want to associate, then load and adjust that server-side. Usually the argument to using detached entities is to avoid having to load the data again, however this leads to more code when trying to synchronize these detached instances, and neglects the fact that data state could have changed since the detached instances were taken. The above code for instance should also be looking at entity versioning between the detached entity and the loaded state to detect if anyone else might have made changes since the detached copies were read at the start of the user process for making the changes.

Live Transfer of data from one provider to another in Entity Framework

I apologise if this has been asked already, I am struggling greatly with the terminology of what I am trying to find out about as it conflicts with functionality in Entity Framework.
What I am trying to do:
I would like to create an application that, on setup, gives the user the option to use one database as a "trial"/"startup" database, i.e. a non-production database. This would allow a user to trial the application but would not have backups etc.; in no way would this be a "production" database. This could be SQLite for example.
When the user is then ready, they could then click "convert to production" (or similar) and give it the target of the new database machine/database. This would be considered the "production" environment. This could be something like MySQL, SQL Server or... whatever else EF connects to these days.
The question:
Does EF support this type of migration/data transfer live? Would it need another app where you could configure the EF source and EF destination for it to then run through the process of conversion/seeding/population of the data source to another data source?
Why I have asked here:
I have tried to search for things around this topic, but transferring/migration brings up subjects totally non-related, so any help would be much appreciated.
From what you describe I don't think there is anything out of the box to support that. You can map a DbContext to either database, then it would be a matter of fetching and detaching entities from the evaluation DbContext and attaching them to the production one.
For a relatively simple schema / object graph this would be fairly straight-forward to implement.
ICollection<Customer> customers = new List<Customer>();
using(var context = new AppDbContext(evalConnectionString))
{
customers = context.Customers.AsNoTracking().ToList();
}
using(var context = new AppDbContext(productionConnectionString))
{ // Assuming an empty database...
context.Customers.AddRange(customers);
context.SaveChanges();
}
Though for more complex models this could take some work, especially when dealing with things like existing lookups/references. Where you want to move objects that might share the same reference to another object you would need to query the destination DbContext for existing relatives and substitute them before saving the "parent" entity.
ICollection<Order> orders = new List<Order>();
using(var context = new AppDbContext(evalConnectionString))
{
orders = context.Orders
.Include(x => x.Customer)
.AsNoTracking()
.ToList();
}
using(var context = new AppDbContext(productionConnectionString))
{
var customerIds = orders.Select(x => x.Customer.CustomerId)
.Distinct().ToList();
var existingCustomers = context.Customers
.Where(x => customerIds.Contains(x.CustomerId))
.ToList();
foreach(var order in orders)
{ // Assuming all customers were loaded
var existingCustomer = existingCustomers.SingleOrDefault(x => x.CustomerId == order.Customer.CustomerId);
if(existingCustomer != null)
order.Customer = existingCustomer;
else
existingCustomers.Add(order.Customer);
context.Orders.Add(order);
}
context.SaveChanges();
}
This is a very simple example to outline how to handle scenarios where you may be inserting data with references that may, or may not exist in the target DbContext. If we are copying across Orders and want to deal with their respective Customers we first need to check if any tracked customer reference exists and use that reference to avoid a duplicate row being inserted or throwing an exception.
Normally loading the orders and related references from one DbContext should ensure that multiple orders referencing the same Customer entity will all share the same entity reference. However, to use detached entities that we can associate with the new DbContext via AsNoTracking(), detached references to the same record will not be the same reference so we need to treat these with care.
For example where there are 2 orders for the same customer:
var ordersA = context.Orders.Include(x => x.Customer).ToList();
Assert.AreSame(ordersA[0].Customer, ordersA[1].Customer); // Passes
var ordersB = context.Orders.Include(x => x.Customer).AsNoTracking().ToList();
Assert.AreSame(ordersB[0].Customer, ordersB[1].Customer); // Fails
Even though in the second example both orders are for the same customer, each will have a Customer reference with the same ID but a different object reference, because the DbContext is not tracking the references used. This is one of the several "gotchas" with detached entities and efforts to boost performance. Using tracked references isn't ideal either, since those entities will still think they are associated with another DbContext. We can detach them, but that means diving through the object graph and detaching all references. (Doable, but messy compared to just loading them detached.)
Where it can also get complicated is when possibly migrating data in batches (disposing of a DbContext regularly to avoid performance pitfalls for larger data volumes) or synchronizing data over time. It is generally advisable to first check the destination DbContext for matching records and use those to avoid duplicate data being inserted. (or throwing exceptions)
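As a sketch of what batching could look like, reusing the hypothetical AppDbContext from above (the batch size and keyset paging on CustomerId are arbitrary choices):
const int batchSize = 500; // arbitrary; tune for your data volume
var lastId = 0;
while (true)
{
    List<Customer> batch;
    using (var source = new AppDbContext(evalConnectionString))
    {
        batch = source.Customers.AsNoTracking()
            .Where(x => x.CustomerId > lastId)
            .OrderBy(x => x.CustomerId)
            .Take(batchSize)
            .ToList();
    }
    if (batch.Count == 0)
        break;
    using (var destination = new AppDbContext(productionConnectionString))
    {
        // Check the destination for rows that already exist to avoid inserting duplicates.
        var ids = batch.Select(x => x.CustomerId).ToList();
        var existingIds = destination.Customers
            .Where(x => ids.Contains(x.CustomerId))
            .Select(x => x.CustomerId)
            .ToList();
        destination.Customers.AddRange(batch.Where(x => !existingIds.Contains(x.CustomerId)));
        destination.SaveChanges();
    }
    lastId = batch.Last().CustomerId;
}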
So simple data models this is fairly straight forward. For more complex ones where there is more data to bring across and more relationships between that data, it's more complicated. For those systems I'd probably look at generating a database-to-database migration such as creating INSERT statements for the desired target DB from the data in the source database. There it is just a matter of inserting the data in relational order to comply with the data constraints. (Either using a tool or rolling your own script generation)

Entity Framework updating two databases

As I've mentioned in a couple other questions, I'm currently trying to replace a home-grown ORM with the Entity Framework, now that our database can support it.
Currently, we have certain objects set up such that they are mapped to a table in our internal database and a table in the database that runs our website (which is not even in the same state, let alone on the same server). So, for example:
Part p = new Part(12345);
p.Name = "Renamed part";
p.Update();
will update both the internal and the web databases simultaneously to reflect that the part with ID 12345 is now named "Renamed part". This logic only needs to go one direction (internal -> web) for the time being. We access the web database through a LINQ-to-SQL DBML and its objects.
I think my question has two parts, although it's possible I'm not asking the right question in the first place.
Is there any kind of "OnUpdate()" event/method that I can use to trigger validation of "Should this be pushed to the web?" and then do the pushing? If there isn't anything by default, is there any other way I can insert logic between .SaveChanges() and when it hits the database?
Is there any way that I can specify for each object which DBML object it maps to, and for each EF auto-generated property which property on the L2S object to map to? The names often match up, but not always so I can't rely on that. Alternatively, can I modify the L2S objects in a generic way so that they can populate themselves from the EF object?
Sounds like a job for SQL Server replication.
You don't need to interconnect the two as you seem to be suggesting in question 2.
Just have the two separate databases with their own EF or L2S models and abstract them away using repositories with domain objects.
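A minimal sketch of that abstraction, using the Part entity from the question (the repository name and members here are hypothetical):
public interface IPartRepository
{
    Part GetById(int id);
    void Update(Part part);
}
// e.g. an InternalPartRepository implemented over the EF model and a WebPartRepository
// implemented over the LINQ-to-SQL DataContext; calling code works against IPartRepository
// and decides when a change should be written through one or both.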
This is the solution I ended up going with. Note that the implementation of IAdvantageWebTable is inherited from the existing base class, so nothing special needed to be done for EF-based classes, once the T4 template was modified to inherit correctly.
public partial class EntityContext
{
public override int SaveChanges(System.Data.Objects.SaveOptions options)
{
var modified = this.ObjectStateManager.GetObjectStateEntries(EntityState.Modified | EntityState.Added); // Get the list of things to update
var result = base.SaveChanges(options); // Call the base SaveChanges, which clears that list.
using (var context = new WebDataContext()) // This is the second database context.
{
foreach (var obj in modified)
{
var table = obj.Entity as IAdvantageWebTable;
if (table != null)
{
table.UpdateWeb(context); // This is IAdvantageWebTable.UpdateWeb(), which calls all the existing logic I've had in place for years.
}
}
context.SubmitChanges();
}
return result;
}
}

Storing run-time data in objects from ObjectSet<T>

I'm very new to Entity Framework's Object Services (see Object Services Overview (Entity Framework)), so forgive me if I use the wrong terminology here.
I'm using the EDMX file to connect to an SQLite database. What I'm trying to do is use the ObjectSet<T> normally, to access a collection of objects from a table in the database. However, I want to additionally store some run-time-only data in the objects in that set. In my case, I have a set of devices stored in the database, but upon startup, I want to mark them as "Connected" or "Disconnected", and keep track of this state throughout execution.
Since the (row) types generated by the EDMX are partial, I've added another partial definition and added my public bool Connected property there. This seems to work: I can set it, and future queries provide objects with the same value that I previously set. The problem is, I don't know a) how it is working, or b) whether I can trust it. These doubts come from the fact that these aren't really collections of objects we're dealing with, right?
Hopefully that made sense, else I can provide more detail.
What you're doing is completely safe.
ObjectSet is still a collection of objects, with a lot of magic added underneath.
I am not an expert on the internals but here is how I think it works:
The Entity Framework has a StateTracker that keeps track of all the entities you're working with.
Every class in your EDMX model is required to have a key. EF is using that key internally so that it loads that specific object only once into memory.
var foo = db.Foos.Single(x => x.Id == 1); // foo with Id 1 is unique (in memory)
var foo2 = db.Foos.Single(x => x.Id == 1); // same instance of foo, but with updated values
var foo3 = db.Foos.Single(x => x.Id == 2); // a new unique instance (Id = 2)
bool sameObject = Object.Equals(foo, foo2); // will return true;
At every select the following happens:
Is an instance of class Foo already tracked/does it already exist?
Yes -> update the properties of the existing instance from the database.
No -> create new instance of class Foo (take values from database)
Of course it can only ever update mapped properties. So the ones you defined in the partial class won't be overwritten.
In case you're going to use code first, there is also the [NotMapped] attribute, which makes sure that the property won't be included in the table if you generate a new database from your code-first model.
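As a sketch of the questioner's scenario (Device is a hypothetical entity name here), the extra property lives only in the hand-written partial class, so it is never part of the mapping and never refreshed from the database:
// Device.Runtime.cs - second partial definition alongside the EDMX-generated Device class
public partial class Device
{
    // Unknown to the EDMX mapping, so EF neither persists nor overwrites it.
    public bool Connected { get; set; }
}
// Code-first equivalent:
// [NotMapped]
// public bool Connected { get; set; }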
I hope I could clarify some things for you.

Is there a design pattern for light & heavy versions of an object?

I have the need for both light-weight, and heavy-weight versions of an object in my application.
A light-weight object would contain only ID fields, but no instances of related classes.
A heavy-weight object would contain IDs, and instances of those classes.
Here is an example class (for purpose of discussion only):
public class OrderItem
{
// FK to Order table
public int OrderID;
public Order Order;
// FK to Product table
public int ProductID;
public Product Product;
// columns in OrderItem table
public int Quantity;
public decimal UnitCost;
// Loads an instance of the object without linking to objects it relates to.
// Order, Product will be NULL.
public static OrderItem LoadOrderItemLite()
{
var reader = // get from DB Query
var item = new OrderItem();
item.OrderID = reader.GetInt("OrderID");
item.ProductID = reader.GetInt("ProductID");
item.Quantity = reader.GetInt("Quantity");
item.UnitCost = reader.GetDecimal("UnitCost");
return item;
}
// Loads an instance of the object and links it to all other objects.
// Order, Product objects will exist.
public static OrderItem LoadOrderItemFULL()
{
var item = LoadOrderItemLite();
item.Order = Order.LoadFULL(item.OrderID);
item.Product = Product.LoadFULL(item.ProductID);
return item;
}
}
Is there a good design pattern to follow to accomplish this?
I can see how it can be coded into a single class (as my example above), but it is not apparent in which way an instance is being used. I would need to have NULL checks throughout my code.
Edit:
This object model is being used on client side of client-server application. In the case where I'm using the light-weight objects, I don't want lazy load because it will be a waste of time and memory ( I will already have the objects in memory on client side elsewhere)
Lazy initialization, Virtual Proxy and Ghost are three implementations of the lazy loading pattern. Basically, they refer to loading properties only once you need them. Now, I suppose you'll be using some repository to store objects, so I'd encourage you to use any of the ORM tools available (Hibernate, Entity Framework and so on); they all implement this functionality for you for free.
Have you considered using an ORM tool like NHibernate for accessing DB? If you use something like NHibernate, you would get this behavior by means of lazy loading.
Most ORM tools do exactly what you are looking for within lazy loading - they first get the object identifiers, and upon accessing a method, they issue subsequent queries to load the related objects.
Sounds like you might have a need for a Data Transfer Object (DTO), just a "dumb" wrapper class that summarizes a business entity. I usually use something like that when I need to flatten out an object for display. Be careful, though: overuse results in an anti-pattern.
But rendering an object for display is different from limiting hits against the database. As Randolph points out, if your intention is the latter, then use one of the existing deferred loading patterns, or better yet, use an ORM.
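For illustration, a DTO for the OrderItem example above might be nothing more than the following sketch (field names are illustrative):
// A flat, "dumb" representation of an order line for display or transfer;
// it carries IDs and display fields only, never the related entity instances.
public class OrderItemDto
{
    public int OrderID { get; set; }
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public int Quantity { get; set; }
    public decimal UnitCost { get; set; }
}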
Take a look at the registry pattern; you can use it to find objects and also to manage them better, for example by keeping them in a cache.
