I'm using this code to determine if it should create a parent or if the parent already exists:
var id = 1;
var parent = Session.Get<Parent>(id);
if (parent == null)
parent = new Parent();
var child = new Child();
child.Parent = parent;
parent.Children.Add(child);
Session.Save(parent);
Right now this seems very inefficient: this method hits the database with 3 separate SQL statements every time a child is added:
Get parent based on id
Insert child
Insert/update parent (depending on whether the parent already existed)
Could I do this in a better way?
There are two scenarios in fact.
In the first case, when we really do not know whether a parent with the provided id exists, there's no other way. That approach will always need this many SQL statements: one to find out whether the parent exists, plus an insert if it does not.
In the second scenario, if we do know that there is a parent in the DB (with the provided id), we can make it more efficient with the built-in support: Load<Parent>(id)
9.2. Loading an object
... Load() returns an object that is an uninitialized proxy and
does not actually hit the database until you invoke a method of the
object...
Get more details here:
NHibernate difference between Query<T>, Get<T> and Load<T>
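To make the deferred-loading idea behind Load() concrete, here is a minimal sketch in plain C#. It is not NHibernate's proxy implementation; LazyReference and the loader delegate are hypothetical names invented for illustration. The point is only that the id is known immediately, while the (simulated) database hit is postponed until the object itself is touched.

```csharp
using System;

// Hypothetical illustration of an "uninitialized proxy": the id is available
// right away, the loader (standing in for a database query) runs on first use.
public class LazyReference<T> where T : class
{
    private readonly Lazy<T> _value;

    public int Id { get; }

    public LazyReference(int id, Func<int, T> loader)
    {
        Id = id;
        _value = new Lazy<T>(() => loader(id));
    }

    // True once the loader has run.
    public bool IsInitialized => _value.IsValueCreated;

    // First access triggers the loader; later accesses reuse the result.
    public T Value => _value.Value;
}
```

Reading Id never runs the loader, which is why linking a child to a parent that is known to exist does not need a SELECT first.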
Given that this is regular business logic development, I wouldn't mind these queries.
Get is quite fast, because it usually performs a lookup by primary key, which usually is a clustered index (don't know about SQLite).
Existing parents need to be found to be linked anyway. You can postpone the actual query by using Load, but in my experience you need the parent anyway.
Updating the parent may be unnecessary. What does it update? Is there a missing inverse-mapping?
If you are doing the whole thing for a lot of records (not only one), you may consider other options. (Batches, pre-fetching, Futures, whatever.)
If you think that you need a highly optimized implementation for exactly this code, you should consider avoiding the ORM and implementing it in plain SQL (and recheck whether it would really use fewer queries). Writing object-oriented code requires having objects in memory, which sometimes requires getting data from the database that wouldn't be required in a highly optimized implementation.
Given this model:
I would like to be able to save in one SaveChange call the relations. Which means, I either have a new or updated ContainerParent, and multiple first level children and each of those can have 1 or 2 levels deeper.
The thing is, the children both have a key to themselves, for finding its parent, and a key to the container, for the container to get all its Children independently of their hierarchical level.
With this pseudo code (in the case of all entities are created, not updated)
var newContainerParent = context.ContainerParents.Add(new ContainerParent());
var rootChild = context.Children.Add(new Child());
var secondLevelChild = new Child();
var thirdLevelChild = new Child();
secondLevelChild.Children.Add(thirdLevelChild);
rootChild.Children.Add(secondLevelChild);
newContainerParent.Children.Add(rootChild);
context.SaveChanges();
The problem with this code is that only the rootChild will have the FK for the container set. I also tried adding the children to both their parent child AND the container:
rootChild.Children.Add(secondLevelChild);
newContainerParent.Children.Add(rootChild);
newContainerParent.Children.Add(secondLevelChild);
newContainerParent.Children.Add(thirdLevelChild);
I have the same problem while updating an existing container with new children. I set the already existing key of the parent on all the children, but when SaveChanges is called the key is not saved; it's reverted to null.
I fixed it by doing all this in 2 steps: saving once, then getting all the newly created children and updating them with the parent key, then calling SaveChanges again.
I have a feeling I'm missing something, that I should not need to save twice.
The number or frequency of SaveChanges calls has no real implication here, for performance or otherwise (apart from each call committing as its own transaction unless you wrap them in one). So why do you want to minimize it?
Actually, storing such a self-referencing table with one SaveChanges is not possible, because the ID of a new entity is generated when it is saved. So you first need to save it, and then you get the ID, which you can store in another entity. This might require further UPDATE commands against the entity you just stored.
You have two options to solve this.
1) Manually generated IDs: handle it all yourself, so you know the ID before you store the entity.
2) If there is no circularity in your dependencies (a perfect tree structure), save the items top-down, level by level. I assume the children hold a reference to their parents, so the root has no reference to any other item; you save that first, then the first-level children, and so on.
This requires multiple SaveChanges calls, but this is not a disadvantage. It is one INSERT command per entity anyway, no matter whether you do it in 1 SaveChanges or in 100.
Both solutions avoid "Update" Commands to the entities, they do Inserts only.
Entity Framework could actually work out these dependencies itself and create an insert order for new entities, but this is not implemented today, or not perfectly, especially for self-referencing tables. The order in which items are saved is more or less random. So you have to enforce the order with intermediate SaveChanges calls.
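The top-down, level-by-level ordering from option 2 can be sketched without any EF types. This is only an illustration under assumptions: Node is a hypothetical stand-in for the Child entity, and SaveOrder.ByLevel is a helper name made up here, not an Entity Framework API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the self-referencing Child entity (not EF-mapped).
public class Node
{
    public string Name { get; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(string name) { Name = name; }
}

public static class SaveOrder
{
    // Groups a tree into levels: index 0 holds the root, index 1 its
    // direct children, index 2 the grandchildren, and so on.
    public static List<List<Node>> ByLevel(Node root)
    {
        var levels = new List<List<Node>>();
        var current = new List<Node> { root };
        while (current.Count > 0)
        {
            levels.Add(current);
            var next = new List<Node>();
            foreach (var node in current)
                next.AddRange(node.Children);
            current = next;
        }
        return levels;
    }
}
```

Each inner list would then be added to the context and followed by its own SaveChanges call before moving on to the next level, so every parent already has its generated ID when its children are inserted.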
I want to cache a never-changing aggregate which would be accessible through a root object only (all other entities are accessible only by using Reference/HasMany properties on the root object).
Should I use NHibernate (which we are already using) second-level-cache or is it better to build some sort of singleton that provides access to all entities in the aggregate?
I found a blog post about getting everything with MultiQuery but my database does not support it.
The 'old way' to do this would be to
Do a select * from all aggregate tables
Loop the entities and set the References and the Collections manually
Something like:
foreach (var e in Entities)
{
e.Parent = loadedParentEntities.SingleOrDefault(pe => e.ParentId == pe.Id);
}
But surely there is a way to tell NHibernate to do this for me?
Update
So far I have tried merely fetching everything from the db, hoping NHibernate does all the reference setting. It does not, however :(
var getRoot = Session.Query<RootObject>().ToList();
var getRoot_hasMany = Session.Query<RootObjectCollection>().ToList();
var getRoot_hasMany_ref = Session.Query<RootObjectCollectionReference>().ToList();
var getRoot_hasMany_hasMany = Session.Query<RootObjectCollectionCollection>().ToList();
Domain:
Root objects are getRoot. These have a collection property 'HasMany'. These HasMany have each a reference back to GetRoot, and a reference to another entity (getRoot_hasMany_ref), and a collection of their own (getRoot_hasMany_hasMany). If this doesn't make sense, I'll create an ERD but the actual structure is not really relevant for the question (I think).
This results in 4 queries being executed. (which is good)
However, when accessing properties like getRoot.First().HasMany.First().Ref or getRoot.First().HasMany.First().HasMany().First() it still results in extra queries being executed, even though everything should already be known to the ISession?
So how do I tell NHibernate to perform those 4 queries and then build the graphs without using any proxy properties, ... so that I have access to everything even after the ISession went out of scope?
I think there are several questions in one.
I stopped trying to trick NHibernate too much. I wouldn't access entities from multiple threads, because they are usually not thread safe. At least when using lazy loading. Caching lazy entities is therefore something evil.
I would avoid too many queries by the use of batch size, which is by far the cleanest and easiest solution and in most cases "good enough". It's fully transparent to the business logic, which makes it so cool.
I would:
Consider not caching the entity at all. Use NH first level cache (say: always load it using session.Get()). Make use of lazy loading when only a small part of the data is used in a single transaction.
If there is a proven need to cache the data, consider turning off lazy loading altogether (by making the entities non-lazy and setting all the collections to non-lazy). Load the entity once and cache it. Still consider thread safety when accessing the data while it is still loaded.
Should the entities be lazy, because some instances of the same type are not in the cache, consider using a DTO-like structure as cache. Copy all the data into a similar class structure that does not consist of entities. This may sound like a lot of additional work, but in the end it will avoid many strange problems and save you a lot of time.
Usually, query time is less important than flush time. Flush time is what NH spends finding out which entities changed in a session. To avoid this, make entities read-only if you can.
If the whole object tree never changes (config settings?) then just load it efficiently with all references/collections initialised:
using (var Session = Sessionfactory.OpenSession())
{
var root = Session.Query<RootObject>().FetchMany(x => x.Collection).ToFutureValue();
Session.Query<RootObjectCollection>().Fetch(x => x.Ref).FetchMany(x => x.Collection).ToFuture();
// Do something with root.Value
}
I have parent child relationship between two entities(Parent and Child).
My Parent mapping is as follows:
<class name="Parent" table="Parents">
...
<bag name="Children" cascade="all">
<key column="ParentID"></key>
<one-to-many class="Child"></one-to-many>
</bag>
</class>
I would like to execute the following:
someParent.Children.Remove(someChild);
The Child class has a reference to another parent class, Type. The relationship looks like
Note: I apologize for the non-linked url above, I couldn't seem to get past the asterisk in the url string using Markup
Due to this relationship, when the above code is called, instead of a DELETE query, an UPDATE query is generated which removes the ParentID from the Child table(sets to null).
Is it possible to force NHibernate to delete the child record completely, when removed from the Parent.Children collection?
UPDATE
#Spencer's Solution
Very attractive solution as this is something that can be implemented in future classes. However, due to the way sessions are handled(in my particular case) in the repository pattern, this is near impossible as we would have to pass session types(CallSessionContext/WebSessionContext) depending on the application.
#Jamie's Solution
Simple and quick to implement, however I've hit another road block. My child entity looks as follows:
When using the new method, NHibernate generates an update statement setting the TypeID and ParentID to null, as opposed to a single delete outright. If I missed something within the implementation, let me know as this method would be painless to move forward with.
#The One-Shot-Delete solution described here, outlines an idea dereferencing the collection to force a single delete. Same results as above however, an update statement is issued.
//Instantiate new collection and add persisted items
List<Child> children = new List<Child>();
children.AddRange(parent.Children);
//Find and remove requested items from new collection
var childrenToRemove = children
.Where(c => c.Type.TypeID == 1)
.ToList();
foreach (var c in childrenToRemove) { children.Remove(c); }
parent.Children = null;
//Set persisted collection to new list
parent.Children = children;
Solution
Took a bit of digging, but Jamie's solution came through with some additional modifications. For future readers, based on my class model above:
Type mapping - Inverse = true, Cascade = all
Parent mapping - Inverse = true, Cascade = all-delete-orphan
The Remove methods as described in Jamie's solution work. This does produce a single delete statement per orphaned item, so there is room for tuning in the future; however, the end result is successful.
Instead of exposing the IList<Child>, control access to the collection through a method:
public void RemoveChild(Child child)
{
Children.Remove(child);
child.Parent = null;
child.Type.RemoveChild(child);
}
Type.RemoveChild would look similar but you would have to be careful to not put it into an infinite loop calling each other's RemoveChild methods.
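One way to keep the two RemoveChild methods from calling each other forever is to bail out as soon as the child is no longer in the collection. The sketch below is plain C# without any NHibernate mapping; ChildType stands in for the Type class from the question (renamed here only to avoid confusion with System.Type), and the property shapes are assumptions.

```csharp
using System.Collections.Generic;

public class Child
{
    public Parent Parent { get; set; }
    public ChildType Type { get; set; }
}

public class Parent
{
    public IList<Child> Children { get; } = new List<Child>();

    public void RemoveChild(Child child)
    {
        // Remove returns false when the child is already gone,
        // which is what stops the mutual recursion.
        if (!Children.Remove(child)) return;
        child.Parent = null;
        child.Type?.RemoveChild(child);
    }
}

// Assumed shape of the question's "Type" class.
public class ChildType
{
    public IList<Child> Children { get; } = new List<Child>();

    public void RemoveChild(Child child)
    {
        if (!Children.Remove(child)) return;
        child.Type = null;
        child.Parent?.RemoveChild(child);
    }
}
```

Whichever side is called first, the child ends up removed from both collections with both references cleared, and the second call on either side is a no-op.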
I don't think this is exactly possible because Hibernate has no way of knowing if the record has been orphaned. It can check if any other classes relate to the child class but then it would be assuming that it's aware of the entire DB structure which may not be the case.
You're not completely out of luck, however. By using the IList interface in conjunction with a custom ICascadeDeleteChild interface that you create yourself, you can come up with a rather seamless solution. Here are the basic steps.
Create a class that implements IList and IList<> and call it CascadeDeleteList or something along those lines.
Create a private .Net List inside this class and simply proxy the various method calls to this List.
Create an Interface called ICascadeDeleteChild and give it a method Delete()
In the remove method of your CascadeDeleteList, check the type of the object that is to be removed. If it is of type ICascadeDeleteChild then call its Delete method.
Change your Child class to implement the ICascadeDeleteChild interface.
I know it seems like a pain but once this is done these interfaces should be simple to port around.
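A rough sketch of those steps follows. Instead of proxying every IList member by hand, it derives from System.Collections.ObjectModel.Collection<T>, which already implements IList and IList<T> over an inner list; that is a substitution made here for brevity, not what the steps literally describe. AuditedChild is a hypothetical sample class. NHibernate may replace mapped collections with its own implementations, so wiring such a list into a mapped property is left open.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public interface ICascadeDeleteChild
{
    void Delete();
}

// Collection<T> routes Remove/RemoveAt through RemoveItem and Clear through ClearItems.
public class CascadeDeleteList<T> : Collection<T>
{
    protected override void RemoveItem(int index)
    {
        T item = this[index];
        base.RemoveItem(index);
        if (item is ICascadeDeleteChild child) child.Delete();
    }

    protected override void ClearItems()
    {
        var removed = new List<T>(this);
        base.ClearItems();
        foreach (T item in removed)
            if (item is ICascadeDeleteChild child) child.Delete();
    }
}

// Hypothetical child; a real Delete() would ask the session/repository to delete it.
public class AuditedChild : ICascadeDeleteChild
{
    public bool Deleted { get; private set; }

    public void Delete() { Deleted = true; }
}
```

With this in place, list.Remove(someChild) both takes the item out of the list and calls someChild.Delete(), which is where the actual delete against the session would go.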
I have a Linq object, and I want to make changes to it and save it, like so:
public void DoSomething(MyClass obj) {
obj.MyProperty = "Changed!";
MyDataContext dc = new MyDataContext();
dc.GetTable<MyClass>().Attach(obj, true); // throws exception
dc.SubmitChanges();
}
The exception is:
System.InvalidOperationException: An entity can only be attached as modified without original state if it declares a version member or does not have an update check policy.
It looks like I have a few choices:
put a version member on every one of my Linq classes & tables (100+) that I need to use in this way.
find the data context that originally created the object and use that to submit changes.
implement OnLoaded in every class and save a copy of this object that I can pass to Attach() as the baseline object.
To hell with concurrency checking; load the DB version just before attaching and use that as the baseline object (NOT!!!)
Option (2) seems the most elegant method, particularly if I can find a way of storing a reference to the data context when the object is created. But - how?
Any other ideas?
EDIT
I tried to follow Jason Punyon's advice and create a concurrency field on one table as a test case. I set all the right properties (Time Stamp = true etc.) on the field in the dbml file, and I now have a concurrency field... and a different error:
System.NotSupportedException: An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported.
So what the heck am I supposed to attach, then, if not an existing entity? If I wanted a new record, I would do an InsertOnSubmit()! So how are you supposed to use Attach()?
Edit - FULL DISCLOSURE
OK, I can see it's time for full disclosure of why all the standard patterns aren't working for me.
I have been trying to be clever and make my interfaces much cleaner by hiding the DataContext from the "consumer" developers. This I have done by creating a base class
public class LinqedTable<T> where T : LinqedTable<T> {
...
}
... and every single one of my tables has the "other half" of its generated version declared like so:
public partial class MyClass : LinqedTable<MyClass> {
}
Now LinqedTable has a bunch of utility methods, most particularly things like:
public static T Get(long ID) {
// code to load the record with the given ID
// so you can write things like:
// MyClass obj = MyClass.Get(myID);
// instead of:
// MyClass obj = myDataContext.GetTable<MyClass>().Where(o => o.ID == myID).SingleOrDefault();
}
public static Table<T> GetTable() {
// so you can write queries like:
// var q = MyClass.GetTable();
// instead of:
// var q = myDataContext.GetTable<MyClass>();
}
Of course, as you can imagine, this means that LinqedTable must somehow be able to have access to a DataContext. Up until recently I was achieving this by caching the DataContext in a static context. Yes, "up until recently", because that "recently" is when I discovered that you're not really supposed to hang on to a DataContext for longer than a unit of work, otherwise all sorts of gremlins start coming out of the woodwork. Lesson learned.
So now I know that I can't hang on to that data context for too long... which is why I started experimenting with creating a DataContext on demand, cached only on the current LinqedTable instance. This then led to the problem where the newly created DataContext wants nothing to do with my object, because it "knows" that it's being unfaithful to the DataContext that created it.
Is there any way of pushing the DataContext info onto the LinqedTable at the time of creation or loading?
This really is a poser. I definitely do not want to compromise on all these convenience functions I've put into the LinqedTable base class, and I need to be able to let go of the DataContext when necessary and hang on to it while it's still needed.
Any other ideas?
Updating with LINQ to SQL is, um, interesting.
If the data context is gone (which in most situations, it should be), then you will need to get a new data context, and run a query to retrieve the object you want to update. It's an absolute rule in LINQ to SQL that you must retrieve an object to delete it, and it's just about as iron-clad that you should retrieve an object to update it as well. There are workarounds, but they are ugly and generally have lots more ways to get you in trouble. So just go get the record again and be done with it.
Once you have the re-fetched object, then update it with the content of your existing object that has the changes. Then do a SubmitChanges() on the new data context. That's it! LINQ to SQL will generate a fairly heavy-handed version of optimistic concurrency by comparing every value in the record to the original (in the re-fetched) record. If any value changed while you had the data, LINQ to SQL will throw a concurrency exception. (So you don't need to go altering all your tables for versioning or timestamps.)
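The "update the re-fetched object with the content of your changed object" step can be done property by property, or with a small reflection helper like the sketch below. EntityCopier and Customer are hypothetical names, not part of LINQ to SQL; the helper only copies value-type and string properties, so association properties (EntityRef/EntitySet based) are left alone, and key columns can be skipped by name.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class EntityCopier
{
    // Copies public read/write value-type and string properties from source
    // to target and returns how many values actually changed.
    public static int CopyScalars<T>(T source, T target, params string[] skip)
    {
        int changed = 0;
        foreach (PropertyInfo p in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (!p.CanRead || !p.CanWrite || skip.Contains(p.Name)) continue;
            if (p.GetIndexParameters().Length > 0) continue;
            if (!(p.PropertyType.IsValueType || p.PropertyType == typeof(string))) continue;

            object newValue = p.GetValue(source);
            if (object.Equals(newValue, p.GetValue(target))) continue;

            p.SetValue(target, newValue);
            changed++;
        }
        return changed;
    }
}

// Hypothetical entity used only to demonstrate the helper.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string City { get; set; }
}
```

In the flow described above, target would be the object just re-fetched on the new data context and source the detached object carrying the edits; SubmitChanges() on the new context then sends only what differs.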
If you have any questions about the generated update statements, you'll have to break out SQL Profiler and watch the updates go to the database. Which is actually a good idea, until you get confidence in the generated SQL.
One last note on transactions - the data context will generate a transaction for each SubmitChanges() call, if there is no ambient transaction. If you have several items to update and want to run them as one transaction, make sure you use the same data context for all of them, and wait to call SubmitChanges() until you've updated all the object contents.
If that approach to transactions isn't feasible, then look up the TransactionScope object. It will be your friend.
I think 2 is not the best option. It's sounding like you're going to create a single DataContext and keep it alive for the entire lifetime of your program which is a bad idea. DataContexts are lightweight objects meant to be spun up when you need them. Trying to keep the references around is also probably going to tightly couple areas of your program you'd rather keep separate.
Running a hundred ALTER TABLE statements one time, regenerating the context and keeping the architecture simple and decoupled is the elegant answer...
find the data context that originally created the object and use that to submit changes
Where did your datacontext go? Why is it so hard to find? You're only using one at any given time right?
So what the heck am I supposed to attach, then, if not an existing entity? If I wanted a new record, I would do an InsertOnSubmit()! So how are you supposed to use Attach()?
You're supposed to attach an instance that represents an existing record... but was not loaded by another datacontext - can't have two contexts tracking record state on the same instance. If you produce a new instance (ie. clone) you'll be good to go.
You might want to check out this article and its concurrency patterns for update and delete section.
The "An entity can only be attached as modified without original state if it declares a version member" error when attaching an entity that has a timestamp member will (should) only occur if the entity has not travelled 'over the wire' (read: been serialized and deserialized again). If you're testing with a local test app that is not using WCF or something else that will result in the entities being serialized and deserialized, then they will still keep references to the original datacontext through entitysets/entityrefs (associations/nav. properties).
If this is the case, you can work around it by serializing and deserializing it locally before calling the datacontext's .Attach method. E.g.:
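// Requires:
// using System.IO;
// using System.Runtime.Serialization;

// Round-trips the entity through DataContractSerializer to get a copy
// that holds no references to the original DataContext.
internal static T CloneEntity<T>(T originalEntity)
{
    Type entityType = typeof(T);
    DataContractSerializer ser =
        new DataContractSerializer(entityType);
    using (MemoryStream ms = new MemoryStream())
    {
        ser.WriteObject(ms, originalEntity);
        ms.Position = 0; // rewind before reading the copy back
        return (T)ser.ReadObject(ms);
    }
}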
Alternatively you can detach it by setting all entitysets/entityrefs to null, but that is more error prone so although a bit more expensive I just use the DataContractSerializer method above whenever I want to simulate n-tier behavior locally...
(related thread: http://social.msdn.microsoft.com/Forums/en-US/linqtosql/thread/eeeee9ae-fafb-4627-aa2e-e30570f637ba )
You can reattach to a new DataContext. The only thing that prevents you from doing so under normal circumstances is the property changed event registrations that occur within the EntitySet<T> and EntityRef<T> classes. To allow the entity to be transferred between contexts, you first have to detach the entity from the DataContext, by removing these event registrations, and then later on reattach to the new context by using the DataContext.Attach() method.
Here's a good example.
When you retrieve the data in the first place, turn off object tracking on the context that does the retrieval. This will prevent the object state from being tracked on the original context. Then, when it's time to save the values, attach to the new context, refresh to set the original values on the object from the database, and then submit changes. The following worked for me when I tested it.
MyClass obj = null;
using (MyDataContext context = new MyDataContext())
{
    // No change tracking on the context that loads the object
    context.ObjectTrackingEnabled = false;
    obj = (from p in context.MyClasses
           where p.ID == someId
           select p).FirstOrDefault();
}
obj.Name += "test";
using (MyDataContext context2 = new MyDataContext())
{
    context2.MyClasses.Attach(obj);
    // Take the original values from the database, keep the changes made above
    context2.Refresh(System.Data.Linq.RefreshMode.KeepCurrentValues, obj);
    context2.SubmitChanges();
}
Using LINQ to Entities sounds like a great way to query against a database and get actual CLR objects that I can modify, data bind against and so forth. But if I perform the same query a second time do I get back references to the same CLR objects or an entirely new set?
I do not want multiple queries to generate an ever growing number of copies of the same actual data. The problem here is that I could alter the contents of one entity and save it back to the database but another instance of the entity is still in existence elsewhere and holding the old data.
Within the same DataContext, my understanding is that you'll always get the same objects - for queries which return full objects instead of projections.
Different DataContexts will fetch different objects, however - so there's a risk of seeing stale data there, yes.
In the same DataContext you would get the same object if it's queried (DataContext maintains internal cache for this).
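The idea behind that internal cache is usually called an identity map. A minimal sketch in plain C# follows; IdentityMap and Resolve are hypothetical names, and this is not how DataContext or ObjectContext are actually implemented, just the principle: one tracked instance per primary key within one context.

```csharp
using System;
using System.Collections.Generic;

public class IdentityMap<TKey, TEntity> where TEntity : class
{
    private readonly Dictionary<TKey, TEntity> _tracked = new Dictionary<TKey, TEntity>();

    // Returns the instance already tracked for this key; otherwise
    // materializes a new one, tracks it, and returns it.
    public TEntity Resolve(TKey key, Func<TEntity> materialize)
    {
        if (_tracked.TryGetValue(key, out TEntity existing)) return existing;

        TEntity created = materialize();
        _tracked[key] = created;
        return created;
    }
}
```

Two queries through the same map hand back the same reference for the same key, while a second map (a second context) builds its own copy, which is where stale data can show up.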
Be aware that the objects you deal with are most likely mutable, so instead of one problem (data duplication) you can get another (concurrent access).
Depending on the business case, it may be OK to let the second transaction with stale data fail on commit.
Also, imagine a good old IDataReader/DataSet scenario. Two queries would return two different readers that would fill different datasets. So the data duplication problem isn't ORM specific.
[oops; note that this reply applies to Linq-to-SQL, not Entity Framework.]
I've left it here (rather than delete) because it is partly on-topic, and might be useful.
Further to the other replies, note that the data-context also has the ability to avoid doing a round-trip for simple "by primary key" queries - it will check the cache first.
Unfortunately, it was completely broken in 3.5, and is still half-broken in 3.5SP1, but it works for some queries. This can save a lot of time if you are getting individual objects.
So basically, IIRC you need to use:
// uses object identity cache (IIRC)
var obj = ctx.Single(x=>x.Id == id);
But not:
// causes round-trip (IIRC)
var obj = ctx.Where(x=>x.Id == id).Single();