How to Create a Node with Neo4jClient in Neo4j v2? - c#

Under Neo4j v1.9.x, I used the following sort of code.
private Category CreateNodeCategory(Category cat)
{
var node = client.Create(cat,
new IRelationshipAllowingParticipantNode<Category>[0],
new[]
{
new IndexEntry(NeoConst.IDX_Category)
{
{ NeoConst.PRP_Name, cat.Name },
{ NeoConst.PRP_Guid, cat.Nguid.ToString() }
}
});
cat.Nid = node.Id;
client.Update<Category>(node, cat);
return cat;
}
The reason being that the Node Id was auto generated and I could use it later for a quick look up, start bits in other queries, etc. Like the following:
private Node<Category> CategoryGet(long nodeId)
{
return client.Get<Category>((NodeReference<Category>)nodeId);
}
This enables the following which appeared to work well.
public Category CategoryAdd(Category cat)
{
cat = CategoryFind(cat);
if (cat.Nid != 0) { return cat; }
return CreateNodeCategory(cat);
}
public Category CategoryFind(Category cat)
{
if (cat.Nid != 0) { return cat; }
var node = client.Cypher.Start(new {
n = Node.ByIndexLookup(NeoConst.IDX_Category, NeoConst.PRP_Name, cat.Name)})
.Return<Node<Category>>("n")
.Results.FirstOrDefault();
if (node != null) { cat = node.Data; }
return cat;
}
Now the cypher Wiki, examples and bad-habits recommend using the .ExecuteWithoutResults() in all the CRUD.
So the question I have is how do you have an Auto Increment value for the node ID?

First up, for Neo4j 2 and onwards, you always need to start with the frame of reference "how would I do this in Cypher?". Then, and only then, do you worry about the C#.
Now, distilling your question, it sounds like your primary goal is to create a node, and then return a reference to it for further work.
You can do this in cypher with:
CREATE (myNode)
RETURN myNode
In C#, this would be:
var categoryNode = graphClient.Cypher
.Create("(category {cat})")
.WithParams(new { cat })
.Return(category => category.Node<Category>())
.Results
.Single();
However, this still isn't 100% what you were doing in your original CreateNodeCategory method. You are creating the node in the DB, getting Neo4j's internal identifier for it, then saving that identifier back into the same node. Basically, you're using Neo4j to generate auto-incrementing numbers for you. That's functional, but not really a good approach. I'll explain more ...
First up, the concept of Neo4j even giving you the node id back is going away. It's an internal identifier that actually happens to be a file offset on disk. It can change. It is low level. If you think about SQL for a second, do you use a SQL query to get the file byte offset of a row, then reference that for future updates? A: No; you write a query that finds and manipulates the row all in one hit.
Now, I notice that you already have an Nguid property on the nodes. Why can't you use that as the id? Or if the name is always unique, use that? (Domain relevant ids are always preferable to magic numbers.) If neither are appropriate, you might want to look at a project like SnowMaker to help you out.
Next, we need to look at indexing. The type of indexing that you're using is referred to in the 2.0 docs as "Legacy Indexing" and misses out on some of the cool Neo4j 2.0 features.
For the rest of this answer, I'm going to assume your Category class looks like this:
public class Category
{
public Guid UniqueId { get; set; }
public string Name { get; set; }
}
Let's start by creating our category node with a label:
var category = new Category { UniqueId = Guid.NewGuid(), Name = "Spanners" };
graphClient.Cypher
.Create("(category:Category {category})")
.WithParams(new { category })
.ExecuteWithoutResults();
And, as a one-time operation, let's establish a schema-based index on the Name property of any nodes with the Category label:
graphClient.Cypher
.Create("INDEX ON :Category(Name)")
.ExecuteWithoutResults();
Now, we don't need to worry about manually keeping indexes up to date.
We can also introduce an index and unique constraint on UniqueId:
graphClient.Cypher
.Create("CONSTRAINT ON (category:Category) ASSERT category.UniqueId IS UNIQUE")
.ExecuteWithoutResults();
Querying is now very easy:
graphClient.Cypher
.Match("(c:Category)")
.Where((Category c) => c.UniqueId == someGuidVariable)
.Return(c => c.As<Category>())
.Results
.Single();
Rather than looking up a category node, to then do another query, just do it all in one go:
var productsInCategory = graphClient.Cypher
.Match("(c:Category)<-[:IN_CATEGORY]-(p:Product)")
.Where((Category c) => c.UniqueId == someGuidVariable)
.Return(p => p.As<Product>())
.Results;
If you want to update a category, do that in one go as well:
graphClient.Cypher
.Match("(c:Category)")
.Where((Category c) => c.UniqueId == someGuidVariable)
.Set("c = {category}")
.WithParams(new { category })
.ExecuteWithoutResults();
Finally, your CategoryAdd method currently 1) does one DB hit to find an existing node, 2) a second DB hit to create a new one, 3) a third DB hit to update the ID on it. Instead, you can compress all of this to a single call too using the MERGE keyword:
public Category GetOrCreateCategoryByName(string name)
{
return graphClient.Cypher
.WithParams(new {
name,
newIdIfRequired = Guid.NewGuid()
})
.Merge("(c:Category { Name: {name} })")
.OnCreate()
.Set("c.UniqueId = {newIdIfRequired}")
.Return(c => c.As<Category>())
.Results
.Single();
}
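To make the get-or-create semantics concrete without a database, here is a rough in-memory analogy. It is only a sketch: CategoryRecord and InMemoryCategoryStore are hypothetical names invented for illustration, and nothing here is Neo4jClient API. The point is that "find it, or create it with a fresh id" happens in one call instead of separate find, create and update steps.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for the Category class assumed above.
public class CategoryRecord
{
    public Guid UniqueId { get; set; }
    public string Name { get; set; }
}

// Hypothetical in-memory store; the dictionary plays the role of the graph.
public class InMemoryCategoryStore
{
    private readonly ConcurrentDictionary<string, CategoryRecord> byName =
        new ConcurrentDictionary<string, CategoryRecord>();

    // Returns the existing record for the name, or stores and returns a new one.
    // Under contention the factory may run more than once, but only one
    // record is ever kept per name.
    public CategoryRecord GetOrCreateByName(string name)
    {
        return byName.GetOrAdd(
            name,
            n => new CategoryRecord { UniqueId = Guid.NewGuid(), Name = n });
    }
}
```

Calling GetOrCreateByName("Spanners") twice on the same store returns the same record, so the id generated on the first call is the one every later caller sees.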
Basically,
Don't use Neo4j's internal ids as a way to hack around managing your own identities. (But they may release some form of autonumbering in the future. Even if they do, domain identities like email addresses or SKUs or airport codes or ... are preferred. You don't even always need an id: you can often infer a node based on its position in the graph.)
Generally, Node<T> will disappear over time. If you use it now, you're just accruing legacy code.
Look into labels and schema-based indexing. They will make your life easier.
Try to do things in one query. It will be much faster.
Hope that helps!

Related

Insert order of Entity Framework children

I have a structure like this
public class Son {
public string Name {get;set;}
public int Age {get;set;}
}
public class Daughter {
public string Name {get;set;}
public int Age {get;set;}
}
public class Parent {
public Daughter[] Daughters {get;set;}
public Son[] Sons {get;set;}
}
Where there is a FK Parent -> Son and Parent -> Daughter
Currently when doing a Context.SaveChanges() on a parent object it saves the Parent, then the Daughters, and then the Sons. I need it to save the Sons before the Daughters because we have a database trigger that validates the Sons based on the Daughters (and will deny the whole thing if it doesn't meet a requirement).
This trigger is obviously outside the knowledge of EF.
How can I specify that Sons are dependent on Daughters in EF such that Sons get inserted first; or is there a specification or attribute that I can define insert order?
PS: Do not look too much into the contrived example (such as why we don't save it under one thing called Children). The real-world example is much more complicated, but the idea of saving Sons before Daughters is there.
I love a challenge!
Firstly a declaration: I'm not a fan of Triggers or building a requirement for making order of insert important. My first exercise would be to exhaust all options to remove such a requirement.
After a bit of tinkering from what I can see at least when adding entities, for instance a Parent with one or more Daughters and one or more Sons, the order of insert is consistently alphabetical based on the Entity names. For example with entities named "Parent", "Daughter", and "Son", the insert was always Parent > Daughter > Son. The order of the properties, configuration, inserts, or even the table names had no bearing on the operations, however renaming the entity class "Son" to "ASon" resulted in Sons being inserted before Daughters. I don't know if this will carry forward to edits, but it's something to consider without getting too hacky. (Though something like this would definitely need to be documented well in the system in case someone questions a naming convention to get something inserting before something else.)
That said, getting into the hacky fun business!
Using a Son entity called ASon to force Sons before Daughters, it is possible to get EF to reverse that insert order:
using (var context = new ParentDbContext())
{
var parent = context.Parents.Create();
parent.Name = "Steve";
parent.Daughters.Add(new Daughter { Name = "Elise" });
parent.Daughters.Add(new Daughter { Name = "Susan" });
parent.Sons.Add(new ASon { Name = "Jason" });
parent.Sons.Add(new ASon { Name = "Duke" });
context.Parents.Add(parent);
context.SaveChanges();
}
Out of the box this inserted Parent, Son, Son, Daughter, Daughter.
To reverse it, I overrode SaveChanges, looking for our Sons to defer saving until after everything else:
public override int SaveChanges()
{
var trackedStates = new[] { EntityState.Added, EntityState.Modified };
var trackedParentIds = ChangeTracker.Entries<Parent>().Where(x => trackedStates.Contains(x.State)).Select(x => x.Entity.ParentId).ToList();
var addedSons = ChangeTracker.Entries<ASon>().Where(x => x.State == EntityState.Added).ToList();
var modifiedSons = ChangeTracker.Entries<ASon>().Where(x => x.State == EntityState.Modified).ToList();
int tempid = -1;
int modifiedParentCount = addedSons.Select(x => x.Entity.Parent.ParentId)
.Where(x => trackedParentIds.Contains(x))
.Count();
List<Tuple<Parent, ASon>> associatedSons = new List<Tuple<Parent, ASon>>();
modifiedSons.ForEach(x => { x.State = EntityState.Unchanged; });
addedSons.ForEach(x =>
{
x.Entity.SonId = tempid--;
associatedSons.Add(new Tuple<Parent, ASon>(x.Entity.Parent, x.Entity));
x.Entity.Parent.Sons.Remove(x.Entity);
x.State = EntityState.Unchanged;
});
var result = base.SaveChanges();
addedSons.ForEach(x => { x.Entity.Parent = associatedSons.Single(a => a.Item2 == x.Entity).Item1; x.State = EntityState.Added; });
modifiedSons.ForEach(x => { x.State = EntityState.Modified; });
result += base.SaveChanges() - modifiedParentCount;
return result;
}
So what this is doing:
The first bit is easy, we find our added and modified sons. We also take a count of parents with both modified and added sons. These will get double-counted when this is done.
For modified sons, we just set their state to Unchanged.
For added sons, we need to do a bit of dirty work. We need to give them a temporary unique ID because to mark them as Unchanged, EF still wants to track their ID and when you add 2 sons it will fail here. Note that when we put them back as added, they will receive proper IDs from Identity columns, not these temporary negative ones. We also track the association of their added sons with their respective parent in a Tuple because we need to temporarily remove those sons from their parent. Finally the added son is also marked unchanged.
Now we call the base SaveChanges which will save our parents and their daughters.
For modified sons, we just need to update the state back to Modified.
For our added sons, we use our saved association to re-assign them to their parent, and then set their state back to added.
We call the base SaveChanges again, appending the affected row count to the first run, and subtract the duplicate parent references. (parents that were already counted due to being modified)
The sketchy bit is adjusting the result count for the double save. This might not be 100% accurate, but it should only be an issue if you happen to be using the result of SaveChanges. I can't say I've ever really paid much attention to that return value :)
Hopefully that gives you some ideas to play with.

Entity Framework Core Linq query returning ids which do not exist in database

I wonder if there is an easy way, using LINQ with Entity Framework Core, to check whether a given list of ids exists in the database and to return the ids that do not exist.
The use case where I come across this: if the user can do something with a list of objects (represented by the list of their ids), I want to check whether these ids exist or not.
Of course I could query all objects/object ids that exist in the database and cross check in a second step.
Just wondering if it would be possible in one step.
What I mean in code:
public class MyDbObject
{
public int Id { get; set; }
public string Name { get; set; }
}
public IActionResult DoSomethingWithObjects([FromQuery]List<int> ids)
{
List<int> idsThatDoNotExistInTheDb = DbContext.MyDbObject.Where(???);
return NotFound("those ids do not exist: " + string.Join(", ", idsThatDoNotExistInTheDb));
}
You can obtain the list of IDs that match, then remove them from the original list, like this:
var validIds = DbContext
.MyDbObject
.Where(obj => ids.Contains(obj.Id))
.Select(obj => obj.Id);
var idsThatDoNotExistInTheDb = ids.Except(validIds);
This approach may be slow, though, so you may be better off doing it in a stored procedure that takes a table-valued parameter.
Note: Pre-checks of this kind are not bullet-proof, because a change may happen between the moment when you validate IDs and the moment when you start the operation. It is better to structure your APIs in a way that it validates and then does whatever it needs to do right away. If validation fails, the API returns a list of errors.
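The set-difference step can be tried without a database. The sketch below assumes the matching ids have already been fetched and uses a plain in-memory sequence in their place; MissingIdExample and FindMissing are illustrative names, not part of Entity Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MissingIdExample
{
    // Returns the requested ids that are absent from existingIds.
    // existingIds stands in for the ids returned by the database query.
    public static List<int> FindMissing(IEnumerable<int> requestedIds, IEnumerable<int> existingIds)
    {
        // Except is a set difference: duplicate requested ids are collapsed,
        // and results keep the order in which they first appear in requestedIds.
        return requestedIds.Except(existingIds).ToList();
    }
}
```

For example, MissingIdExample.FindMissing(new[] { 1, 2, 3, 4 }, new[] { 2, 4, 5 }) yields 1 and 3. Because Except collapses duplicates, a request that repeats an unknown id reports it only once.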

I don't understand Dapper's mapping, multimapping and QueryMultiple

Title says it all, I'm trying to use it but I don't understand it. It's possible that the problem is a lack of knowledge due to that I'm an amateur, but I've read a dozen questions about this thing and googled for three days, and I still don't understand it.
I have SO many questions that I'm not sure that I should write it all in only one Question, or even if someone would read it all. If someone have other solution or think I should split it in different questions... well, I'm open to suggestions.
I was going to write an example, but again I read dozen of examples for days and didn't help me.
I just can't make my mind to understand how work something like the example at github:
var sql =
@"select * from #Posts p
left join #Users u on u.Id = p.OwnerId
Order by p.Id";
var data = connection.Query<Post, User, Post>(sql, (post, user) => { post.Owner = user; return post;});
So, Post have a property of type User and that property is called Owner, right? Something like:
public class Post
{
...
public User Owner { get; set;}
}
Therefore Query<Post, User, Post> will return a Post instance with all the properties and what not, AND will create a User instance and assign it to the Post.Owner property? How would simple parameters be added to that query? For example, if someone wanted to pass the id as an int parameter like ...WHERE Id = @Id", new {Id = id}, where should the parameter be added, given that the parameter right now is (post, user) => { post.Owner = user; return post;}? Does the parameter always refer to the types given? Can you only use the simple typical parameters for the dynamic query, or can both be used simultaneously? How?
Also, how does it differentiate which DB field goes to which object? Does it do something like class name = DB table name? What happens if the classes don't have the same name as the DB table and I want to use the [Table] attribute: will it work, or is the attribute only for Dapper.Contrib.Extensions methods? Would it work with objects that share the same DB table?
Regarding same table for different objects question, f.i. lets say I have a Person object that have a BankAccount object:
public class Person
{
...
public BankAccount Account {get; set;}
...
}
public class BankAccount
{
private string _Account;
public string Account
{
get { return _Account; }
set
{
if(!CheckIfIBANIsCorrect(value))
throw new Exception();
_Account = value;
}
}
private bool CheckIfIBANIsCorrect(string IBAN)
{
//...
//Check it
}
}
I could store the string account at the same table than Person, since every person would have a single account referred by the person's Id. How should I map something like that? Is there even a way, should I simply load the result in a dynamic object and then create all the objects, will Query create the rest of the Person object and I should bother to create the nested object myself?
And by the way, how is splitOn supposed to be used in all this? I understand that it should split the result into various "groups" so you can split the results by Ids, for instance, and take what you need, but I don't understand how I should retrieve the info from the different "groups", or how it returns the different "groups": lists, enumerables, what?
QueryMultiple is another thing that is FAR beyond my understanding, regardless of how many questions and answers I read.
You know... how the * does that .Read thing work? Everything I read here or by googling assumes that Read is some sort of automagic thing that can miraculously discern between objects. Again, does it divide results by class names, so I just have to be sure every object has the correct table name? And again, what happens with the [Table] attribute in this case?
I think the problem I'm having is that I can't find(I suppose it doesn't exist) a single web page that describes it all(the examples at GitHub are very scarce), and I only still finding answers to concrete cases that doesn't answer exactly what I'm trying to understand but only that concrete cases, which are confusing me more and more while I read them, since everyone seems to use a bunch of different methods without explaining WHY or HOW.
I think that your main problem with the Dapper querying of joined table queries is thinking that the second argument in the list is always the "param" argument. Consider the following code:
var productsWithoutCategories = conn.Query<Product>(
"SELECT * FROM Products WHERE ProductName LIKE @nameStartsWith + '%'",
new { nameStartsWith = "a" }
);
Here, there are two arguments "sql" and "param" - if we used named arguments then the code would look like this:
var productsWithoutCategories = conn.Query<Product>(
sql: "SELECT * FROM Products WHERE ProductName LIKE @nameStartsWith + '%'",
param: new { nameStartsWith = "a" }
);
In your example, you have
var data = connection.Query<Post, User, Post>(sql, (post, user) => { post.Owner = user; return post;});
The second argument there is actually an argument called "map" which tells Dapper how to combine entities for cases where you've joined two tables in your SQL query. If we used named arguments then it would look like this:
var data = connection.Query<Post, User, Post>(
sql: sql,
map: (post, user) => { post.Owner = user; return post;}
);
I'm going to use the classic NORTHWND database in a complete example. Say we have the classes
public class Product
{
public int ProductId { get; set; }
public string ProductName { get; set; }
public bool Discontinued { get; set; }
public Category Category { get; set; }
}
public class Category
{
public int CategoryID { get; set; }
public string CategoryName { get; set; }
}
and we want to build a list of Products, with the nested Category type populated, we'd do the following:
using (var conn = new SqlConnection("Server=.;Database=NORTHWND;Trusted_Connection=True;"))
{
var productsWithCategories = conn.Query<Product, Category, Product>(
"SELECT * FROM Products INNER JOIN Categories ON Categories.CategoryID = Products.CategoryID",
map: (product, category) =>
{
product.Category = category;
return product;
},
splitOn: "CategoryID"
);
}
This goes through all the rows of JOIN'd Product and Category data and generates a list of unique Products but can't be sure how to combine the Category data with it, so it requires a "map" function which takes a Product instance and a Category instance and which must return a Product instance which has the Category data combined with it. In this example, it's easy - we just need to set the Category property on the Product instance to the Category instance.
Note that I've had to specify a "splitOn" value. Dapper presumes that the key columns of tables will simply be called "Id" and, if they are, then it can deal with joins on those columns automatically. However, in this case, we're joining on a column called "CategoryID" and so we have to tell Dapper to split the data back up (into Products and into Categories) according to that column name.
If we also wanted to specify "param" object to filter down the results, then we could do something like the following:
var productsWithCategories = conn.Query<Product, Category, Product>(
"SELECT * FROM Products INNER JOIN Categories ON Categories.CategoryID = Products.CategoryID WHERE ProductName LIKE @nameStartsWith + '%'",
map: (product, category) =>
{
product.Category = category;
return product;
},
param: new { nameStartsWith = "a" },
splitOn: "CategoryID"
);
To answer your final question, QueryMultiple simply executes multiple queries in one go and then allows you to read them back separately. For example, instead of doing this (with two separate queries):
using (var conn = new SqlConnection("Server=.;Database=NORTHWND;Trusted_Connection=True;"))
{
var categories = conn.Query("SELECT * FROM Categories");
var products = conn.Query("SELECT * FROM Products");
}
You could specify a single SQL statement that includes both queries in one batch, but you would then need to read them separately out of the combined result set that is returned from QueryMultiple:
using (var conn = new SqlConnection("Server=.;Database=NORTHWND;Trusted_Connection=True;"))
{
var combinedResults = conn.QueryMultiple("SELECT * FROM Categories; SELECT * FROM Products");
var categories = combinedResults.Read<Category>();
var products = combinedResults.Read<Product>();
}
I think that the other examples I've seen of QueryMultiple are a little confusing as they are often returning single values from each query, rather than full sets of rows (which is what is more often seen in simple Query calls). So hopefully the above clears that up for you.
Note: I haven't covered your question about the [Table] attribute - if you're still having problems after you've tried this out then I would suggest creating a new question for it. Dapper uses the "splitOn" value to decide when the columns for one entity end and the next start (in the JOIN example above there were fields for Product and then fields for Category). If you renamed the Category class to something else then the query will still work, Dapper doesn't rely upon the table name in this case - so hopefully you won't need the [Table] at all.
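To picture what "splitting" a joined row means, here is a conceptual sketch that works on column names only. It is not Dapper's implementation and the names SplitOnSketch, FindSplitIndex and Split are made up for illustration. One assumption to flag: the sketch splits at the last column matching the splitOn name, chosen so that a row where Products also carries a CategoryID column still divides at the start of the Categories columns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitOnSketch
{
    // Returns the index of the last column whose name matches splitOn
    // (case-insensitive). Index 0 is never considered, so the first
    // entity always keeps at least one column.
    public static int FindSplitIndex(IReadOnlyList<string> columns, string splitOn)
    {
        for (int i = columns.Count - 1; i > 0; i--)
        {
            if (string.Equals(columns[i], splitOn, StringComparison.OrdinalIgnoreCase))
            {
                return i;
            }
        }
        throw new ArgumentException("Split column '" + splitOn + "' not found.", nameof(splitOn));
    }

    // Divides the columns into those for the first entity and those for the second.
    public static (string[] First, string[] Second) Split(IReadOnlyList<string> columns, string splitOn)
    {
        int index = FindSplitIndex(columns, splitOn);
        return (columns.Take(index).ToArray(), columns.Skip(index).ToArray());
    }
}
```

Given the columns ProductID, ProductName, CategoryID, Discontinued, CategoryID, CategoryName and a splitOn of "CategoryID", the first group holds the four Product columns and the second holds CategoryID and CategoryName. Each group is then what would be mapped onto the Product and the Category passed to the "map" function.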

SQL query shows good database values, but LINQ to Entity framework brings a null value from nowhere

I'm having a really strange problem here, and i dont have any clue why.
I'm supposed to make small localdb console app in C#. The goal is to enter persons (teachers, actually) in the DB, with a certain amount of information.
I have a few classes, but 2 of them are important here: Certification and Notation.
Certifications are, well, certifications of the professors.
The code for these classes is this:
class Certification
{
public int CertificationID { get; set; }
public virtual Teacher Teacher { get; set; }
public virtual Course Course { get; set; }
public string CertificationName { get; set; }
public virtual Notation Notation { get; set; }
}
class Notation
{
public int NotationID {get;set;}
public string Note {get;set;}
}
Nothing too dangerous. Through migrations i made my database, and they look like they should:
Certification:
CertificationID (PK)
CertificationName
Course_CourseID (FK to another class, course)
Notation_NotationID (FK to notations)
Teacher_TeacherID (FK to the teachers)
Notations:
NotationID (PK)
Note
My program allows me to add teachers, with all the informations i need, and for example, their certifications. Here, i made some dummy teacher, with a dummy certification.
If i call SELECT * FROM Certification , i get exactly what i should get, a single line like this:
CertificationID = 6
CertificationName = placeholder
Course_CourseID = 13
Notation_NotationID = 12
Teacher_TeacherID = 5
Everything is correct in this. CourseID links to an actual course in the database, NotationID in an actual note, and Teacher to an actual teacher too. Everything is fine!
Now, i just want to show the certifications of our teacher:
var certifs = from c in db.Certifications where c.Teacher.TeacherID == item.TeacherID select c;
foreach(var v in certifs )
{
var course = (from c in db.Courses where c.CourseID == v.Course.CourseID select c).First();
var note = (from n in db.Notations where n.NotationID == v.Notation.NotationID select n.NotationID).First();
Console.WriteLine("Name: " + v.CertificationName + ", related to the " + course.CourseName + " course, with a note of " + note);
Console.WriteLine("");
}
And it doesn't work. When my foreach loop starts, my first item in the loop doesn't have any reference to a notation. Everything else is fine: the foreign keys for the course and the teachers are here, and valid, but for the notation, i only get a null value. So my certification item looks more like:
CertificationID = 6
CertificationName = placeholder
Course_CourseID = 13
Notation_NotationID = null
Teacher_TeacherID = 5
Basically, if i do a SQL Query, my row in the database is perfectly fine, but calling it through the entity framework (and LINQ) returns a null value for the notation. (which throws an exception when calling var note etc....
Does anybody have an idea about this? I'm really stuck on this.
I'm sorry if my English isn't good enough. If you guys need more information, just ask.
Answered by JC:
Lazy loading isn't working properly. Eager loading solves the problem.
The problem is you aren't populating your navigation properties when you retrieve the Certification entities. Then you try to access them and they're null.
You either need to make sure lazy loading is turned on:
Configuration.LazyLoadingEnabled = true; //In your DbContext's constructor
in which case just accessing the Course and Notation references should cause them to be populated by separate database queries...
...or you need to employ eager loading when querying against the DbSet:
var certifs = from c in db.Certifications.Include(c=>c.Course).Include(c=>c.Notation) where ...
Which will cause Course and Notation to be loaded at the same time as Certifications, all in one database query.
In your line
var note = (from n in db.Notations
where n.NotationID == v.Notation.NotationID
select n.NotationID).First();
you are selecting n.NotationID only, which returns just an integer. Try changing the select to select n

How to version a performance counter category?

I have a performance counter category. The counters in this category may change for my next release so when the program starts, I want to check if the category exists and it is the correct version - if not, create the new category. I can do this by storing a GUID in the help string but this is obviously smelly. Is it possible to do this more cleanly with the .NET API?
Existing smelly version...
if (PerformanceCounterCategory.Exists(CATEGORY_NAME))
{
PerformanceCounterCategory c = new PerformanceCounterCategory(CATEGORY_NAME);
if (c.CategoryHelp != CATEGORY_VERSION)
{
PerformanceCounterCategory.Delete(CATEGORY_NAME);
}
}
if (!PerformanceCounterCategory.Exists(CATEGORY_NAME))
{
// Create category
}
In our system, each time the application starts we do a check for the existing category. If it isn't found, we create the category. If it exists, we compare the existing category to what we expect and recreate it (delete, create) if there are missing values.
var missing = counters
.Where(counter => !PerformanceCounterCategory.CounterExists(counter.Name, CategoryName))
.Count();
if (missing > 0)
{
PerformanceCounterCategory.Delete(CategoryName);
PerformanceCounterCategory.Create(
CategoryName,
CategoryHelp,
PerformanceCounterCategoryType.MultiInstance,
new CounterCreationDataCollection(counters.Select(x => (CounterCreationData)x).ToArray()));
}
I don't think there is a better way. IMHO, I don't think this is a terrible solution.
