We are running into various StackOverflowExceptions and OutOfMemoryExceptions when sending certain entities to the server as part of an InvokeServerMethod call. The problem seems to come up because DevForce ends up trying to serialize far more data than we expect it to. I tracked it down to data stored in the OriginalValuesMap.
The original values are for DataEntityProperties that we've added to the entity but that aren't marked with [DataMember], so they normally don't get sent to the server. But if we have an existing (previously saved) entity and then change one of those properties, the initial value of the property does end up getting serialized as part of the OriginalValuesMap. This is causing us big problems because it turns out the original value is an entity with a huge entity graph.
Adding to the problem, the entities we are dealing with are actually clones (via ((ICloneable)origEntity).Clone()) of existing (previously saved) entities so they have a state of detached and I haven't found a way to clear the OriginalValuesMap for detached entities. Usually I'd do myEntity.EntityAspect.AcceptChanges() but that doesn't do anything for detached entities. I couldn't find any other easy way to do this.
So far, the only way I've found to clear the original values is to attach the entity to an EntityManager. This does clear the original values, but it is a major pain: I'm dealing with a large number of entities (so performance is a concern), and many of these entities don't have unique primary key values (in fact, they don't have any key values filled in because they are just 'in memory' objects that I never plan to save), so I need to do extra work to avoid 'duplicate key exception' errors when adding them to an entity manager.
Is there some other way I can clear the original values for a detached entity? Should detached entities even be tracking original values in the first place if things like AcceptChanges don't work for detached entities? Or maybe a cloned entity shouldn't 'inherit' the original values of its source? I don't have a strong opinion on any of these possibilities...I just want to be able to serialize my entities.
Our app is a Silverlight client running DevForce 2012 v7.2.4.0
Before diving into the correct behavior for detached entities, I'd like to back up and verify that it really is the OriginalValuesMap which is causing the exception. The contents of the OriginalValuesMap should follow the usual rules for the DataContractSerializer, so I'd think that non-DataMember items would not be serialized. Can you try serializing one of these problem entities to a text file to send to IdeaBlade support? You can use SerializationFns.Save(entity, filename, null, false) to quickly serialize an item. If it does look like the OriginalValuesMap contains things it shouldn't, I'll also need the type definition(s) involved.
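If it helps, that call is a one-liner; here is a minimal sketch (the namespace and the way you obtain the entity are assumptions on my part):

    // Dump one problem entity to a file so support can inspect exactly what gets serialized.
    // Namespace assumed to be IdeaBlade.EntityModel; GetOneOfTheProblemEntities() is a placeholder.
    using IdeaBlade.EntityModel;

    var entity = GetOneOfTheProblemEntities();
    SerializationFns.Save(entity, @"C:\temp\problem-entity.xml", null, false);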
Related
We are running into some issues with EF Core on SQL databases in a Web API when trying to update complex objects in the database that are provided by a client.
A detailed example: when receiving an object "Blog" with 1-n "Posts" from a client and trying to update this existing object in the database, should we:
1. Make sure the primary keys are set and just use dbContext.Update(blogFromClient), or
2. Load and track the blog (including its posts) from the database, then patch the changes from the client onto this object and call SaveChanges()?
When using approach (1) we ran into these issues:
- Existing posts for the existing blog in the database are not deleted when the client no longer sends them, so we need to figure them out and delete them manually.
- We get tracking errors ("is already being tracked") if dependencies of the blog (for example a "User" as "Creator") are already in the ChangeTracker.
- We cannot unit test our business logic without a real DbContext when using a repository pattern (the tracking errors simply don't occur there).
- When using a real DbContext with an in-memory database for tests, we cannot rely on things like foreign-key violations or computed columns.
When using approach (2):
- We can easily manage updated relations and keep easy track of the object.
- It leads to a performance penalty because we load an object we do not really need.
- We need to do a lot of manual mapping, because tools like AutoMapper cannot automatically map objects with n-n relations while keeping EF Core's tracking correct (we get primary-key errors, as some objects are deleted from lists and then re-added with the same primary key, which is not allowed because the primary key cannot be set on insert).
- n-n relations can easily be damaged by this: in the database there may be an n-n relation from blog to post, while the post holds the same relation back to its blogs (the same row in SQL). If only one side of the relation (blog to post, but not post to blog) is sent and the other side is removed from the list, EF Core will track that entry as "deleted".
In vanilla SQL we would manage this by:
- deleting all existing blog-to-post relations,
- updating the post itself,
- creating all the new relations.
In EF Core we cannot write such statements (e.g. bulk-deleting relations) without loading the relations first and then tracking each one in detail.
Is there any best practice for handling an update of complex objects with deep relations when the "new" data comes from a client?
The correct approach is #2: "Load and track the blog (including its posts) from the database, then patch the changes from the client onto this object and call SaveChanges()".
As to your concerns:
it leads to a performance penalty because we load an object we do not really need
You are incorrect in assuming you don't need this. You do in fact need it, because you absolutely shouldn't be posting every single property of every single entity and related entity, including things that should not be changed such as audit properties. And if you don't post every property, you will end up nulling things out when you save. As such, the only correct path is to always load the full dataset from the database and then modify it based on what was posted. Doing it any other way will cause problems and is, frankly, 100% wrong.
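To make that concrete, here is a rough load-then-patch sketch; the types (BlogContext, Blog, Post, BlogDto, PostDto) and properties are placeholders, not anything from your code:

    // Sketch only: assumes EF Core, a BlogContext with a Blogs DbSet, and DTOs that mirror
    // what the client sends. Adjust to your real model.
    using System.Linq;
    using System.Threading.Tasks;
    using Microsoft.EntityFrameworkCore;

    public async Task UpdateBlogAsync(BlogContext dbContext, BlogDto blogDto)
    {
        // Load the current state, including the children we intend to reconcile.
        var blog = await dbContext.Blogs
            .Include(b => b.Posts)
            .SingleAsync(b => b.Id == blogDto.Id);

        // Patch only the fields the client is allowed to change.
        blog.Title = blogDto.Title;

        // Remove children the client no longer sends.
        foreach (var post in blog.Posts.Where(p => blogDto.Posts.All(d => d.Id != p.Id)).ToList())
            blog.Posts.Remove(post);

        // Add new children and patch existing ones, matched by key.
        foreach (var postDto in blogDto.Posts)
        {
            var existing = blog.Posts.FirstOrDefault(p => p.Id == postDto.Id);
            if (existing == null)
                blog.Posts.Add(new Post { Content = postDto.Content });
            else
                existing.Content = postDto.Content;
        }

        await dbContext.SaveChangesAsync();
    }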
we need to do a lot of manual mapping, because tools like AutoMapper cannot automatically map objects with n-n relations while keeping EF Core's tracking correct
What you're describing here is a limitation of any automatic mapping. In order to map entity to entity in collections, the tool would have to somehow know what identifies each entity uniquely. That's usually going to be a PK, of course, but AutoMapper doesn't (and shouldn't) make assumptions about that. Instead, the default and naive behavior is to simply replace the collection on the destination with the collection on the source. To EF, though, that looks like you're deleting everything in the collection and then adding new items to the collection, which is the source of your issue.
There are two paths forward. First, you can simply ignore the collection properties on the source and then map those manually. You can still use AutoMapper for the rest of the mapping, but you'd need to iterate over each item in the collection individually, matching it with the appropriate destination item based on your knowledge of what identifies the entity (i.e. the part AutoMapper doesn't know).
Second, there's actually an additional library for AutoMapper to make this easier: AutoMapper.Collection. The entire point of this library is to provide the ability to tell AutoMapper how to identify your entities, so that it can then map collections correctly. If you utilize this library and add the additional necessary configuration, then you can map your entities as normal without worrying about collections getting messed up.
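As a minimal sketch of that setup (assuming the AutoMapper.Collection package and placeholder Blog/Post types; the exact configuration API can vary by AutoMapper version):

    // Sketch only: AutoMapper + AutoMapper.Collection; Blog/Post/BlogDto/PostDto are placeholders.
    using AutoMapper;
    using AutoMapper.EquivalencyExpression;

    var config = new MapperConfiguration(cfg =>
    {
        // Enables collection mapping that matches items instead of replacing the whole list.
        cfg.AddCollectionMappers();

        cfg.CreateMap<BlogDto, Blog>();

        // Tell AutoMapper how to decide that an incoming PostDto corresponds to an existing Post,
        // so matching items are updated in place rather than deleted and re-added.
        cfg.CreateMap<PostDto, Post>()
           .EqualityComparison((dto, entity) => dto.Id == entity.Id);
    });

    var mapper = config.CreateMapper();

    // Later, after loading and tracking the blog from the database:
    // mapper.Map(blogDto, trackedBlog);   // patches the tracked graph, collections included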
I am maintaining an existing application where, for performance reasons, I needed to point NHibernate at a view to get away from it producing outer joins. This works, and I get an entity back populated with data.
Now, this object is then updated in C# and Update is called on it; Update is a generic method in the C# code used by a number of other repository classes. When this Update method is called, I get the error message:
"NHibernate.NonUniqueObjectException: a different object with the same identifier value was already associated with the session"
It points to a nested object inside the entity object, but I am unclear as to how to resolve this. I don't want to change the update method in case this impacts classes that use it.
If I need to revert from using a view to get the data, is it possible to set something in the mapping config to force NHibernate to use equi-joins rather than left outer joins?
I am not that familiar with NHibernate, and so any guidance/help would be appreciated.
session.Merge(entity) would be the solution here. Working with detached objects is described in the NHibernate documentation:
9.4.2. Updating detached objects
a short excerpt:
...
The last case can be avoided by using Merge(Object o). This method
copies the state of the given object onto the persistent object with
the same identifier. If there is no persistent instance currently
associated with the session, it will be loaded. The method returns the
persistent instance. If the given instance is unsaved or does not
exist in the database, NHibernate will save it and return it as a
newly persistent instance. Otherwise, the given instance does not
become associated with the session. In most applications with detached
objects, you need both methods, SaveOrUpdate() and Merge().
In other words, calling Merge(entity) should solve this issue and properly resolve the conflicts between the "passed" objects and the "session (loaded)" objects.
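As a hedged sketch of what that could look like in a generic repository (the _session field and the method shape are assumptions about your code, not the real thing):

    // Sketch only: replace a plain Update/SaveOrUpdate call with Merge so that an entity
    // already loaded into the session with the same identifier does not cause
    // NonUniqueObjectException. Merge returns the persistent instance; keep using that.
    using NHibernate;

    public class Repository
    {
        private readonly ISession _session;   // assumed to be injected/managed elsewhere

        public Repository(ISession session) => _session = session;

        public T Update<T>(T detachedEntity) where T : class
        {
            using (var tx = _session.BeginTransaction())
            {
                var persistent = _session.Merge(detachedEntity);
                tx.Commit();
                return persistent;
            }
        }
    }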
I am using VS 2013 Express for Web with ADO.NET Entity Data Model.
When updating the entity data model from the database using the 'Refresh' tab (it seems you can only select one item, even though the heading says select objects, plural), the behaviour seems unclear and I have noticed some issues.
Just two examples:
I changed a stored procedure so it returned the same number of fields, but one field had a slightly different type; the complex type, however, never changed. I realise there can be an impact on client code, but the refresh simply did not change the complex type, and everything stayed the same. By contrast, removing the relevant elements from the model browser and then re-adding the elements from the database back into the model did exactly what I expected.
I made some significant changes to two or three tables, their attributes and one relationship, but did not change the table names. Here again refresh had some very odd results, so I simply created a fresh model.
I am planning some more changes; the first change, specifically, is adding a FK relationship that I forgot.
Is there any way to be sure of what is supported and what is not in terms of refresh?
I am also concerned that if refresh fails and I therefore delete the two tables with the relationship, what impact will that have on the temporarily orphaned tables and their relationships, and will their connections with the other tables still work once I regenerate the two tables? I guess it depends on how the generated code works underneath.
I want to make these kinds of changes but avoid having to recreate the entire model.
Any advice appreciated.
The most reliable way of ensuring you always have the latest version is to select all (Ctrl+A), delete, and then re-add everything from the model page.
I know it sounds like a pain but it's guaranteed to work as long as you haven't made any changes to the model from within Visual Studio.
The refresh doesn't always work.
I'm exploring Mongo as an alternative to relational databases but I'm running into a problem with the concept of schemaless collections.
In theory it sounds great, but as soon as you tie a model to a collection, the model becomes your de facto schema. You can no longer just add or remove fields from your model and expect it to continue to work. I see the same problems here managing changes as you have with a relational database, in that you need some sort of script to migrate from one version of the database schema to the other.
Am I approaching this from the wrong angle? What approaches do members here take to ensure that their collection items stay in sync with their domain model when making updates to their domain model?
Edit: It's worth noting that these problems obviously exist in relational databases as well, but I'm asking specifically for strategies in mitigating the problem using schemaless databases and more specifically Mongo. Thanks!
Schema migration with MongoDB is actually a lot less painful than with, say, SQL server.
Adding a new field is easy: old records will come in with it set to null, or you can use attributes to control the default value, e.g. [BsonDefaultValue("abc", SerializeDefaultValue = false)].
The [BsonIgnoreIfNull] attribute is also handy for omitting objects that are null from the document when it is serialized.
Removing a field is fairly easy too; you can use [BsonExtraElements] (see the docs) to collect the leftover fields and preserve them, or [BsonIgnoreExtraElements] to simply throw them away.
With these in place there really is no need to go convert every record to the new schema, you can do it lazily as needed when records are updated, or slowly in the background.
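To make that concrete, here is an illustrative document class combining those attributes (the class and property names are invented):

    // Sketch only: MongoDB .NET driver attributes from MongoDB.Bson.Serialization.Attributes.
    using MongoDB.Bson;
    using MongoDB.Bson.Serialization.Attributes;

    public class Customer
    {
        public ObjectId Id { get; set; }

        // Old documents without this field deserialize with the default value,
        // and the default is not written back unless it changes.
        [BsonDefaultValue("abc", SerializeDefaultValue = false)]
        public string Category { get; set; }

        // Omitted from the stored document when null.
        [BsonIgnoreIfNull]
        public string Nickname { get; set; }

        // Catch-all for fields that no longer have a matching property, so they are preserved.
        [BsonExtraElements]
        public BsonDocument ExtraElements { get; set; }
    }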
PS, since you are also interested in using dynamic with Mongo, here's an experiment I tried along those lines. And here's an updated post with a complete serializer and deserializer for dynamic objects.
My current thought on this is to use the same sort of implementation I would use with a relational database: have a database version collection which stores the current version of the database schema.
My repositories would have a minimum required version which they need in order to accurately serialize and deserialize the collections' items. If the current DB version is lower than the required version, I just throw an exception. Then I use migrations, which do all the conversion necessary to bring the collections to the required state for deserialization and update the database version number.
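A rough sketch of that version gate (the collection name, the document shape, and the use of the current MongoDB .NET driver are all assumptions):

    // Sketch only: repositories call EnsureAtLeast() with the minimum schema version they understand.
    using System;
    using MongoDB.Bson;
    using MongoDB.Driver;

    public class SchemaVersionGuard
    {
        private readonly IMongoDatabase _db;

        public SchemaVersionGuard(IMongoDatabase db) => _db = db;

        public int CurrentVersion()
        {
            // Single document holding the schema version, e.g. { Version: 3 }.
            var doc = _db.GetCollection<BsonDocument>("database_version")
                         .Find(FilterDefinition<BsonDocument>.Empty)
                         .FirstOrDefault();
            return doc == null ? 0 : doc["Version"].AsInt32;
        }

        public void EnsureAtLeast(int requiredVersion)
        {
            var current = CurrentVersion();
            if (current < requiredVersion)
                throw new InvalidOperationException(
                    $"Database schema version {current} is older than required {requiredVersion}; run migrations first.");
        }
    }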
With statically-typed languages like C#, whenever an object gets serialised somewhere, its original class changes, and it is then deserialised back into the new class, you're probably going to run into problems somewhere along the line. That's fairly unavoidable whether it's MongoDB, WCF, XmlSerializer or whatever.
You've usually got some flexibility with serialization options, for example with Mongo you can change a class property name but still have its value map to the same field name (e.g. using the BsonElement attribute). Or you can tell the deserializer to ignore Mongo fields that don't have a corresponding class property, using the BsonIgnoreExtraElements attribute, so deleting a property won't cause an exception when the old field is loaded from Mongo.
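For example (illustrative class and field names):

    // The C# property was renamed, but documents keep the old field name,
    // and stored fields without a matching property are ignored on load.
    using MongoDB.Bson.Serialization.Attributes;

    [BsonIgnoreExtraElements]
    public class Article
    {
        [BsonElement("title")]          // stored field name stays "title"
        public string Headline { get; set; }
    }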
Overall though, for any structural schema changes you'll probably need to reload the data or run a migration script. The other alternative is to use C# dynamic variables, although that doesn't really solve the underlying problem, you'll just get fewer serialization errors.
I've been using MongoDB for a little over a year now, though not for very large projects. I use hugo's csmongo or the fork here. I like the dynamic approach it introduces; it is especially useful for projects where the database structure is volatile.
Question - What is a good best-practice approach for saving/keeping in sync an in-memory graph of objects with the database?
Background:
That is, say I have the classes Node and Relationship, and the application builds up a graph of related objects using these classes. There might be 1000 nodes with various relationships between them. The application needs to query the structure, hence an in-memory approach is no doubt good for performance (e.g. traversing the graph from Node X to find the root parents).
The graph does need to be persisted however into a database with tables NODES and RELATIONSHIPS.
Therefore, what is a good best-practice approach for saving/keeping in sync an in-memory graph of objects with the database?
Ideal requirements would include:
build up changes in-memory and then 'save' afterwards (mandatory)
when saving, apply updates to database in correct order to avoid hitting any database constraints (mandatory)
keep persistence mechanism separate from model, for ease in changing persistence layer if needed, e.g. don't just wrap an ADO.net DataRow in the Node and Relationship classes (desirable)
mechanism for doing optimistic locking (desirable)
Or is the overhead of all this just not worth it for a smallish application, and should I simply hit the database each time for everything (assuming the response times were acceptable)? I would still like to avoid that if it doesn't add too much extra overhead, so the app remains somewhat scalable performance-wise.
I'm using the self-tracking entities in Entity Framework 4. After the entities are loaded into memory, StartTracking() MUST be called on every entity. Then you can modify your entity graph in memory without any DB operations. When you're done with the modifications, you call the context extension method ApplyChanges(rootOfEntityGraph) and then SaveChanges(), and your modifications are persisted. After that you have to start the tracking again on every entity in the graph. Two hints/ideas I'm using at the moment:
1.) call StartTracking() at the beginning on every entity
I'm using an interface IWorkspace to abstract the ObjectContext (it simplifies testing -> see the open-source implementation bbv.DomainDrivenDesign at SourceForge). They also use a QueryableContext, so I created a further concrete Workspace and QueryableContext implementation and intercept the loading process with my own IEnumerable implementation. When the workspace's consumer executes the query obtained via CreateQuery(), my intercepting IEnumerable object registers an event handler on the context's ChangeTracker. In this event handler I call StartTracking() for every entity loaded and added into the context (this doesn't work if you load the objects with NoTracking, because in that case the objects aren't added to the context and the event handler never fires). After the enumeration in the self-made iterator, the event handler on the ObjectStateManager is deregistered. A rough sketch of the wiring follows below.
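Simplified, the hook looks roughly like this (names are mine; it assumes the STE template's IObjectWithChangeTracker interface and its generated StartTracking() extension):

    // Sketch only: EF4 ObjectContext / ObjectStateManager event wiring.
    using System.ComponentModel;
    using System.Data.Objects;

    public static class TrackingHook
    {
        public static void Attach(ObjectContext context)
        {
            context.ObjectStateManager.ObjectStateManagerChanged += OnStateManagerChanged;
        }

        public static void Detach(ObjectContext context)
        {
            context.ObjectStateManager.ObjectStateManagerChanged -= OnStateManagerChanged;
        }

        private static void OnStateManagerChanged(object sender, CollectionChangeEventArgs e)
        {
            // Fires for every entity added to the context, i.e. each entity a query materializes
            // (which is why NoTracking queries never trigger it).
            var tracked = e.Element as IObjectWithChangeTracker;   // STE-generated interface
            if (e.Action == CollectionChangeAction.Add && tracked != null)
            {
                tracked.StartTracking();   // STE-generated extension method
            }
        }
    }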
2.) call StartTracking() after ApplyChanges()/SaveChanges()
In the workspace implementation, I ask the context's ObjectStateManager for the modified entities, e.g.:
var addedEntities = this.context.ObjectStateManager.GetObjectStateEntries(EntityState.Added);
--> analogous for modified entities
I cast them to IObjectWithChangeTracker and call the AcceptChanges() method on the entity itself. This restarts the object's change tracker (see the sketch below).
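A condensed sketch of that step, continuing the snippet above (AcceptChanges() here is the STE-generated method on the entity, not ObjectStateEntry.AcceptChanges(); EntityState lives in System.Data):

    // Restart tracking on every entity that was added or modified.
    var changedEntries = this.context.ObjectStateManager
        .GetObjectStateEntries(EntityState.Added | EntityState.Modified);

    foreach (var entry in changedEntries)
    {
        var tracked = entry.Entity as IObjectWithChangeTracker;
        if (tracked != null)
        {
            tracked.AcceptChanges();   // restarts the entity's change tracker
        }
    }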
For my project I have the same mandatory points as you. I played around with EF 3.5 and didn't find a satisfactory solution, but the new self-tracking entities in EF 4 seem to fit my requirements (as far as I have explored the functionality).
If you're interested, I'll send you my "spike"-project.
Does anyone have an alternative solution? My project is a server application which holds objects in memory for fast operations, while modifications should also be persisted (without a round trip to the DB). At some points in the code, object graphs are marked as deleted/terminated and removed from the in-memory container. With the solution explained above I can reuse the generated model from EF and don't have to code and wrap all the objects myself again. The generated code for the self-tracking entities comes from T4 templates, which can be adapted very easily.
Thanks a lot for any other ideas/criticism.
The short answer is that you can still keep a graph (a collection of linked objects) in memory and write the changes to the database as they occur. If that is taking too long, you could put the changes onto a message queue (though that is probably overkill) or execute the updates and inserts on a separate thread.
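As a hedged illustration of the separate-thread option (all names here are invented; the queued actions stand in for whatever database writes your persistence layer performs):

    // Sketch only: queue change actions in memory and apply them on a background thread,
    // so graph operations never wait on the database.
    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    public class WriteBehindQueue : IDisposable
    {
        private readonly BlockingCollection<Action> _work = new BlockingCollection<Action>();
        private readonly Task _consumer;

        public WriteBehindQueue()
        {
            _consumer = Task.Run(() =>
            {
                foreach (var action in _work.GetConsumingEnumerable())
                    action();                      // e.g. an INSERT/UPDATE against NODES or RELATIONSHIPS
            });
        }

        public void Enqueue(Action dbWrite) => _work.Add(dbWrite);

        public void Dispose()
        {
            _work.CompleteAdding();   // drain remaining writes, then stop
            _consumer.Wait();
        }
    }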