I'm looking to transform my in-memory plain old C# classes into a Neo4j database.
(Class types are node types and derive from, nodes have a List for "linkedTo")
Rather than write a long series of Cypher queries to create nodes and properties and then link them with relationships, I am wondering if there is anything more clever I can do.
For example, can I serialize them to JSON and then import that directly into Neo4j?
I understand that the Unwind method in the C# Neo4j driver may help here, but I don't see good examples of its use, and relationships then need to be matched and created separately.
Is there an optimal method for doing this? I expect to have around 50k nodes.
OK, first off, I'm using Neo4jClient for this and I've added an INDEX to the DB using:
CREATE INDEX ON :MyClass(Id)
This is important for the way this works, as it makes inserting the data a lot quicker.
I have a class:
public class MyClass
{
public int Id {get;set;}
public string AValue {get;set;}
public ICollection<int> LinkToIds {get;set;} = new List<int>();
}
Which has an Id which I'll be keying off, and a string property - just because. The LinkToIds property is a collection of Ids that this instance is linked to.
To generate my MyClass instances I'm using this method to randomly generate them:
private static ICollection<MyClass> GenerateMyClass(int number = 50000){
var output = new List<MyClass>();
Random r = new Random((int) DateTime.Now.Ticks);
for (int i = 0; i < number; i++)
{
var mc = new MyClass { Id = i, AValue = $"Value_{i}" };
var numberOfLinks = r.Next(1, 10);
for(int j = 0; j < numberOfLinks; j++){
var link = r.Next(0, number); // upper bound is exclusive, so this covers Ids 0..number-1
if(!mc.LinkToIds.Contains(link) && link != mc.Id)
mc.LinkToIds.Add(link);
}
output.Add(mc);
}
return output;
}
Then I use another method to split this into smaller 'batches':
private static ICollection<ICollection<MyClass>> GetBatches(ICollection<MyClass> toBatch, int sizeOfBatch)
{
var output = new List<ICollection<MyClass>>();
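if(toBatch.Count == 0) return output; // nothing to batch; also avoids dividing by zero below
if(sizeOfBatch > toBatch.Count) sizeOfBatch = toBatch.Count;
// Round up so a trailing partial batch is not dropped.
var numBatches = (toBatch.Count + sizeOfBatch - 1) / sizeOfBatch;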
for(int i = 0; i < numBatches; i++){
output.Add(toBatch.Skip(i * sizeOfBatch).Take(sizeOfBatch).ToList());
}
return output;
}
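As a minimal sketch of the same batching idea in isolation: the helper below steps through the input in strides of the batch size, so a final partial batch is kept. The names BatchDemo and Split are hypothetical and not part of the answer's code or of Neo4jClient.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchDemo
{
    // Hypothetical helper: splits a list into consecutive batches.
    // The last batch may be smaller than batchSize.
    public static List<List<T>> Split<T>(IReadOnlyList<T> items, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batches = new List<List<T>>();
        for (int start = 0; start < items.Count; start += batchSize)
        {
            batches.Add(items.Skip(start).Take(batchSize).ToList());
        }
        return batches;
    }

    public static void Main()
    {
        var ids = Enumerable.Range(0, 12).ToList();
        foreach (var batch in Split(ids, 5))
            Console.WriteLine(string.Join(",", batch));
        // Prints three lines: 0,1,2,3,4 then 5,6,7,8,9 then 10,11
    }
}
```

On .NET 6 or later, the built-in Enumerable.Chunk method does the same job.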
Then to actually add into the DB:
void Main()
{
var gc = new GraphClient(new Uri("http://localhost:7474/db/data"), "neo4j", "neo");
gc.Connect();
var batches = GetBatches(GenerateMyClass(), 5000);
var now = DateTime.Now;
foreach (var batch in batches)
{
DateTime bstart = DateTime.Now;
var query = gc.Cypher
.Unwind(batch, "node")
.Merge($"(n:{nameof(MyClass)} {{Id: node.Id}})")
.Set("n = node")
.With("n, node")
.Unwind("node.LinkToIds", "linkTo")
.Merge($"(n1:{nameof(MyClass)} {{Id: linkTo}})")
.With("n, n1")
.Merge("(n)-[:LINKED_TO]->(n1)");
query.ExecuteWithoutResults();
Console.WriteLine($"Batch took: {(DateTime.Now - bstart).TotalMilliseconds} ms");
}
Console.WriteLine($"Total took: {(DateTime.Now - now).TotalMilliseconds} ms");
}
On my aging (5-6 years old now) machine it takes about 20s to put in 50,000 nodes and about 500,000 relationships.
Let's break down that important call to Neo4j above. The key thing, as you rightly suggest, is UNWIND: here I UNWIND a batch and give each 'row' in that collection the identifier node. I can then access its properties (node.Id) and use them to MERGE a node. In the first UNWIND I always SET the newly created node (n) to node, so all the properties (in this case just AValue) are set.
So up to the first With we have a new node created with a MyClass label and all its properties set. Note that this includes the LinkToIds array, which you might want to remove if you like things tidy. I'll leave that to you.
In the second UNWIND we take advantage of the fact that the LinkToIds property is an array, and use it to MERGE a 'placeholder' node that will be filled in later; then we create a relationship between n and the n1 placeholder. NB: if a node with the same Id as n1 already exists, MERGE reuses it, and when we reach that Id during the first UNWIND we set all the properties of the placeholder.
It's not the easiest thing to explain, but the best things to look at are MERGE and UNWIND in the Neo4j documentation.
Related
As you can see, I have ten teams in my database, and here is my code. I want to randomly generate matches in ASP.NET (C#).
The problem in this code is that "d" is a Dictionary<int, List<Team>>, while the type of Data is object.
The error is shown below.
Note: in the database, team_id and team_name are related, so looking up a team_id also gives you the team_name.
The function lives in a service, and the service is called from the controller.
[HttpGet("DoMatch")]
public IActionResult DoMatch()
{
var res= _matchService.DoMatch();
return Ok(res);
}
public ResponseModel DoMatch()
{
var random = new Random();
List<Team> list = _context.Team.ToList();
Dictionary<int, List<Team>> d = new Dictionary<int, List<Team>> { };
var count = list.Count();
for (int i = 0; i < count / 2; i++)
{
List<Team> temp = new List<Team>();
int index1 = random.Next(list.Count);
temp.Add(list[index1]);
list.RemoveAt(index1);
int index2 = random.Next(list.Count);
temp.Add(list[index2]);
list.RemoveAt(index2);
d.Add(i, temp);
}
return new ResponseModel
{
Data = d,
IsSuccess = true
};
}
The exception is:
System.NotSupportedException: The collection type 'System.Collections.Generic.Dictionary`2[System.Int32,System.Collections.Generic.List`1[Fantasy_League.Models.Team]]' on 'FantasyLeague.Models.ViewModels.ResponseModel.Data' is not supported.
The actual problem that you're running into, as described by the exception message you're getting, is that Dictionary<int, ...> cannot be serialized to be sent back in the web response. JSON requires each key to be a string. So you'll need to decide what you actually want your model to look like. Most likely it would work just fine to use the Values from your dictionary.
Data = d.Values,
That will make the JSON data come across as an array where each element is an array with the paired teams in it.
But Fildor makes a good point in his comment, that you could do this more easily by shuffling and pairing up adjacent teams:
Data = list.OrderBy(t => random.Next()).Chunk(2);
Then all that fancy dictionary logic goes away.
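To illustrate the shuffle-and-pair idea outside ASP.NET, here is a small sketch that uses plain strings instead of the Team entity. PairingDemo and MakeMatches are made-up names, and Chunk assumes .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairingDemo
{
    // Shuffles the teams by sorting on a random key, then pairs adjacent entries.
    // Enumerable.Chunk is only available on .NET 6 and later.
    public static List<string[]> MakeMatches(IEnumerable<string> teams, Random random)
    {
        return teams
            .OrderBy(_ => random.Next())
            .Chunk(2)
            .ToList();
    }

    public static void Main()
    {
        var teams = new[] { "A", "B", "C", "D", "E", "F" };
        foreach (var match in MakeMatches(teams, new Random()))
            Console.WriteLine(string.Join(" vs ", match));
    }
}
```

With an odd number of teams the last chunk contains a single team, so you would need to decide how to handle that case.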
I have a list that is constantly being updated throughout my program. I would like to be able to compare the initial count and final count of my list after every update. The following is just a sample code (the original code is too lengthy) but it sufficiently captures the problem.
class Bot
{
public int ID { get; set; }
}
public class Program
{
public void Main()
{
List<Bot> InitialList = new List<Bot>();
List<Bot> FinalList = new List<Bot>();
for (int i = 0; i < 12345; i++)
{
Bot b = new Bot() {ID = i};
InitialList.Add(b);
}
FinalList = InitialList;
for (int i = 0; i < 12345; i++)
{
Bot b = new Bot() {ID = i};
FinalList.Add(b);
}
Console.Write($"Initial list has {InitialList.Count} bots");
Console.Write($"Final list has {FinalList.Count} bots");
}
}
Output:
Initial list has 24690 bots
Final list has 24690 bots
Expected for both lists to have 12345 bots.
What is the correct way to copy the initial list so that the new set is not simply added to the original?
To do what you seem to want to do, you want to copy the list rather than assign a new reference to the same list. So instead of
FinalList = InitialList;
Use
FinalList.AddRange(InitialList);
Basically what you had was two variables both referring to the same list. This way you have two different lists, one with the initial values and one with new values.
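The difference between assigning a reference and copying can be seen in a few lines. This is only a sketch with made-up names (CopyDemo, ShallowCopy); it uses the List<T> copy constructor, which has the same effect as calling AddRange on a new empty list.

```csharp
using System;
using System.Collections.Generic;

public static class CopyDemo
{
    // Creates a new list holding the same elements as the source.
    public static List<T> ShallowCopy<T>(List<T> source)
    {
        return new List<T>(source);
    }

    public static void Main()
    {
        var initial = new List<int> { 1, 2, 3 };

        var alias = initial;              // second name for the same list object
        var copy = ShallowCopy(initial);  // separate list object

        copy.Add(4);
        Console.WriteLine(initial.Count); // 3: adding to the copy leaves the original alone

        alias.Add(5);
        Console.WriteLine(initial.Count); // 4: the alias and the original are one list
    }
}
```

Note that the copy is shallow: for a List<Bot>, both lists would still point at the same Bot instances, so changing a Bot's ID shows up through either list.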
That said, you could also just store the count if that's all you want to do.
int initialCount = InitialList.Count;
FinalList = InitialList;
Although there's now no longer a reason to copy from one to the other if you already have the data you need.
I get the feeling you actually want to do more than what's stated in the question though, so the correct approach may change depending on what you actually want to do.
I have a list of arrays, and I want to take one value from each array to build up a JSON structure. Currently, for every managedstrategy, the currency is always the last value in the loop. How can I take the 1st value, then the 2nd value, and so on, while looping over the names?
List<managedstrategy> Records = new List<managedstrategy>();
int idcnt = 0;
foreach (var name in results[0])
{
managedstrategy ms = new managedstrategy();
ms.Id = idcnt++;
ms.Name = name.ToString();
foreach (var currency in results[1]) {
ms.Currency = currency.ToString();
}
Records.Add(ms);
}
var Items = new
{
total = results.Count(),
Records
};
return Json(Items, JsonRequestBehavior.AllowGet);
JSON structure is {Records:[{name: blah, currency: gbp}]}
Assuming that I understand the problem correctly, you may want to look into the Zip method provided by Linq. It's used to "zip" together two different lists, similar to how a zipper works.
A related question can be found here.
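A rough sketch of what the Zip approach could look like is below. The ManagedStrategy class and the Build method are stand-ins written for this example, and the inputs are assumed to be plain string sequences rather than the asker's actual results type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the asker's managedstrategy class.
public class ManagedStrategy
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Currency { get; set; }
}

public static class ZipDemo
{
    // Pairs the i-th name with the i-th currency; Zip stops at the shorter sequence.
    public static List<ManagedStrategy> Build(IEnumerable<string> names, IEnumerable<string> currencies)
    {
        return names
            .Zip(currencies, (name, currency) => new { name, currency })
            .Select((pair, index) => new ManagedStrategy
            {
                Id = index,
                Name = pair.name,
                Currency = pair.currency
            })
            .ToList();
    }

    public static void Main()
    {
        var records = Build(new[] { "alpha", "beta" }, new[] { "gbp", "usd" });
        foreach (var r in records)
            Console.WriteLine($"{r.Id}: {r.Name} {r.Currency}");
    }
}
```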
Currently you are nesting the second loop inside the first, so every record ends up with the last currency. Put both assignments in a single for loop indexed by position instead:
for (int i = 0; i < someNumber; i++) // someNumber: how many name/currency pairs there are
{
managedstrategy ms = new managedstrategy();
ms.Id = i;
ms.Name = results[0][i].ToString();
ms.Currency = results[1][i].ToString();
Records.Add(ms);
}
I have a class:
class DisplayableUnit
{
public string ID; //unique ID which is not repeated
public string ParentID; //ID of parent DisplayableUnit
[NonSerialized]
public DisplayableUnit ParentDU; //Instance of parent displayable unit
//other fields
}
Each instance of this class is stored in a List.
At some point I serialize each of those instances into a separate file and then load it back.
The ParentDU field becomes null, of course, and I really don't need to serialize it anyway.
Now my task is to restore the relations between instances, so I'm looking for the clearest and fastest way to do it.
What I have is a **List<DisplayableUnit> LoadedProject.DUnits** with all deserialized objects.
I wrote some functions to do it, but using them still feels kind of weird and time-consuming.
private static List<DisplayableUnit> GetChildDisplayableUnitsFor(DisplayableUnit dunit)
{
List<DisplayableUnit> ret_list = new List<DisplayableUnit>();
for (int i = 0; i < LoadedProject.DUnits.Count; i++) //iterate through all deserialized units
if (string.Compare(LoadedProject.DUnits[i].ParentID, dunit.ID) == 0) //compare the own ID and parentID of potential child
ret_list.Add(LoadedProject.DUnits[i]); //add to list if this is child
return ret_list;
}
public static void RestoreTreeForDU(DisplayableUnit du)
{
List<DisplayableUnit> childs = GetChildDisplayableUnitsFor(du); //get childs units
for (int i = 0; i < childs.Count; i++) //iterate through those
{
childs[i].ParentDU = du; //restore instance link
RestoreTreeForDU(childs[i]); //make just found child as parent and see if we can restore childs for it.
}
}
public static List<DisplayableUnit> GetParentDUnits()
{
List<DisplayableUnit> ret_list = new List<DisplayableUnit>();
for (int i = 0; i < LoadedProject.DUnits.Count; i++)
if (string.IsNullOrEmpty(LoadedProject.DUnits[i].ParentID))
ret_list.Add(LoadedProject.DUnits[i]);
return ret_list;
}
And this is where I start to wonder what to do next... What do I need to do first to start restoring the relations?
Do I just need to iterate through LoadedProject.DUnits (all deserialized units) and call RestoreTreeForDU for every unit?
This looks kind of weird, since some units would already have been restored, and so on.
It's all so confusing :/
Create a lookup of the ID to the actual object, then you can simply loop through the list and get the value of its parent from that lookup:
List<DisplayableUnit> list = new List<DisplayableUnit>();
//todo deserialize into list
var lookup = list.ToDictionary(unit => unit.ID, unit => unit);
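foreach (var unit in list)
{
// Root units have no ParentID, so skip them instead of letting the indexer throw.
if (!string.IsNullOrEmpty(unit.ParentID) && lookup.TryGetValue(unit.ParentID, out var parent))
unit.ParentDU = parent;
}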
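For reference, here is a self-contained sketch of the lookup approach that can be run on its own. The Unit class is a trimmed stand-in for DisplayableUnit, and the names TreeDemo and RestoreParents are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed stand-in for DisplayableUnit.
public class Unit
{
    public string ID;
    public string ParentID;
    public Unit ParentDU;
}

public static class TreeDemo
{
    // One pass to build the ID -> unit dictionary, one pass to wire up parents.
    public static void RestoreParents(IReadOnlyCollection<Unit> units)
    {
        var lookup = units.ToDictionary(unit => unit.ID);
        foreach (var unit in units)
        {
            // Roots (no ParentID) and orphans (unknown ParentID) keep ParentDU == null.
            if (!string.IsNullOrEmpty(unit.ParentID)
                && lookup.TryGetValue(unit.ParentID, out var parent))
            {
                unit.ParentDU = parent;
            }
        }
    }

    public static void Main()
    {
        var units = new List<Unit>
        {
            new Unit { ID = "1" },
            new Unit { ID = "2", ParentID = "1" },
            new Unit { ID = "3", ParentID = "2" },
        };
        RestoreParents(units);
        foreach (var u in units)
            Console.WriteLine($"{u.ID} -> {(u.ParentDU?.ID ?? "(root)")}");
    }
}
```

Because each unit is visited once and each dictionary lookup is constant time on average, there is no need for the recursive walk over the whole list for every unit.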
Let's say I have two List<string>. These are populated from the results of reading a text file.
List owner contains:
cross
jhill
bbroms
List assignee contains:
Chris Cross
Jack Hill
Bryan Broms
During the read from a SQL source (the SQL statement contains a join)... I would perform
if(sqlReader["projects.owner"] == "something in owner list" || sqlReader["assign.assignee"] == "something in assignee list")
{
// add this projects information to the primary results LIST
list_by_owner.Add(sqlReader["projects.owner"],sqlReader["projects.project_date_created"],sqlReader["projects.project_name"],sqlReader["projects.project_status"]);
// if the assignee is not null, add also to the secondary results LIST
// logic to determine if assign.assignee is null goes here
list_by_assignee.Add(sqlReader["assign.assignee"],sqlReader["projects.owner"],sqlReader["projects.project_date_created"],sqlReader["projects.project_name"],sqlReader["projects.project_status"]);
}
I do not want to end up using nested foreach.
A for loop would probably suffice. Someone mentioned Zip to me, but I wasn't sure whether that would be a preferable route in my situation.
One loop to iterate through both lists (assuming both have same count):
for (int i = 0; i < alpha.Count; i++)
{
var itemAlpha = alpha[i]; // <= your object of list alpha
var itemBeta = beta[i]; // <= your object of list beta
//write your code here
}
From what you describe, you don't need to iterate at all.
This is what you need:
http://msdn.microsoft.com/en-us/library/bhkz42b3.aspx
Usage:
if (listAlpha.Contains(resultA) || listBeta.Contains(resultA)) {
// do your operation
}
The list iteration happens implicitly inside the Contains method. That's 2n comparisons, versus n*n for nested iteration.
You would be better off iterating each list sequentially, one after the other, if you need to go that route at all.
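As a variation on the Contains idea, the sketch below swaps the lists for HashSet<string>, which is my substitution rather than what the answer uses. It makes each membership check roughly constant time instead of a scan. MembershipDemo and IsWanted are made-up names, and the case-insensitive comparison is an assumption about how the names should match.

```csharp
using System;
using System.Collections.Generic;

public static class MembershipDemo
{
    // Sets built once from the two text-file lists (case-insensitive by assumption).
    private static readonly HashSet<string> Owners =
        new HashSet<string>(new[] { "cross", "jhill", "bbroms" }, StringComparer.OrdinalIgnoreCase);

    private static readonly HashSet<string> Assignees =
        new HashSet<string>(new[] { "Chris Cross", "Jack Hill", "Bryan Broms" }, StringComparer.OrdinalIgnoreCase);

    // True when the row's owner or assignee appears in the corresponding set.
    public static bool IsWanted(string owner, string assignee)
    {
        return (owner != null && Owners.Contains(owner))
            || (assignee != null && Assignees.Contains(assignee));
    }

    public static void Main()
    {
        Console.WriteLine(IsWanted("jhill", null));         // True
        Console.WriteLine(IsWanted("nobody", "Jack Hill")); // True
        Console.WriteLine(IsWanted("nobody", "Someone"));   // False
    }
}
```

When reading from a data reader, the column values come back as object, so they would need converting to string before being passed to a check like this.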
This list is maybe better represented as a List<KeyValuePair<string, string>> which would pair the two list values together in a single list.
There are several options for this. The least "painful" would be plain old for loop:
for (var index = 0; index < alpha.Count; index++)
{
var alphaItem = alpha[index];
var betaItem = beta[index];
// Do something.
}
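Another interesting approach is to use the indexed LINQ methods. Remember that they are evaluated lazily, so you have to consume the resulting enumerable, and the lambda has to return a value. For example:
var pairs = alpha.Select((alphaItem, index) =>
{
var betaItem = beta[index];
// Do something, then return a value for this pair.
return (alphaItem, betaItem);
}).ToList(); // ToList forces the lazy query to run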
Or you can enumerate both collections if you use the enumerators directly:
using (var alphaEnumerator = alpha.GetEnumerator())
using (var betaEnumerator = beta.GetEnumerator())
{
while (alphaEnumerator.MoveNext() && betaEnumerator.MoveNext())
{
var alphaItem = alphaEnumerator.Current;
var betaItem = betaEnumerator.Current;
// Do something
}
}
Zip (if you need pairs) or Concat (if you need combined list) are possible options to iterate 2 lists at the same time.
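To make the difference between the two concrete, here is a short sketch using the owner and assignee values from the question. CombineDemo, Pair and Append are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CombineDemo
{
    // Zip: walks both sequences in step and produces one result per pair.
    public static List<string> Pair(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Zip(second, (a, b) => $"{a} / {b}").ToList();
    }

    // Concat: produces every item of the first sequence, then every item of the second.
    public static List<string> Append(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Concat(second).ToList();
    }

    public static void Main()
    {
        var owner = new[] { "cross", "jhill", "bbroms" };
        var assignee = new[] { "Chris Cross", "Jack Hill", "Bryan Broms" };

        Console.WriteLine(string.Join("; ", Pair(owner, assignee)));
        Console.WriteLine(string.Join("; ", Append(owner, assignee)));
    }
}
```

Pair yields three combined entries here, while Append yields six separate ones; Zip silently stops at the shorter sequence if the lengths differ.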
I like doing something like this to enumerate over parallel lists:
int alphaCount = alpha.Count ;
int betaCount = beta.Count ;
int i = 0 ;
while ( i < alphaCount && i < betaCount )
{
var a = alpha[i] ;
var b = beta[i] ;
// handle matched alpha/beta pairs
++i ;
}
while ( i < alphaCount )
{
var a = alpha[i] ;
// handle unmatched alphas
++i ;
}
while ( i < betaCount )
{
var b = beta[i] ;
// handle unmatched betas
++i ;
}