Range of Elements with Quick Access by Two Keys - c#

I have this information to store in memory for my basic neural network simulation.
Node (NodeId, State)
Relationship (SourceNodeId, TargetNodeId, Weight, State)
State is the activation level, which is the only value that changes during simulation. It is a non-negative float.
I want to get all incoming and all outgoing relationships for the current node easily, by which I mean with very good performance. (I have around 1,000,000 nodes, each with 50 relationships on average.)
The main part of my program looks like this (pseudo code).
foreach(Node in Nodes)
{
    Inputs[] = all incoming relationships;
    Node.State = sum of all Inputs[] elements;
    Outputs[] = all outgoing relationships;
    normalize all Outputs[] elements temporarily; // so that the sum of their weights is 1
    foreach(Output in Outputs[])
    {
        Output.State = Node.State * Output.Weight;
    }
}
I hope you understand what I want to do. If not I will try to explain better.
What type of Object would be best to have quick access to the nodes by their SourceNodeId and by their TargetNodeId?
PS: Programming in C# using Visual Studio.

I think a list of nodes plus two synchronized dictionaries will be fast.
List<Node> allNodes;
Dictionary<Node,List<Node>> sourceTarget;
Dictionary<Node,List<Node>> targetSource;
I recommend encapsulating them in a single object, a two-way dictionary:
class TwoWayDictionary<T1,T2>
{
    private Dictionary<T1,List<T2>> sourceTarget;
    private Dictionary<T2,List<T1>> targetSource;
    // Public methods and accessors should go here ...
}
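Since the question is really about looking up relationships from either end, here is a minimal sketch of that idea with both dictionaries holding the same Relationship objects. The names Relationship, RelationshipIndex, Incoming and Outgoing are made up for illustration, and the integer node ids are an assumption; this is one possible shape, not a tuned implementation.

```csharp
using System.Collections.Generic;

// Hypothetical edge type; the fields follow the question's description.
public sealed class Relationship
{
    public int SourceNodeId;
    public int TargetNodeId;
    public float Weight;
    public float State;
}

// Keeps two dictionaries in sync so an edge can be found from either end.
public sealed class RelationshipIndex
{
    private static readonly List<Relationship> Empty = new List<Relationship>();

    private readonly Dictionary<int, List<Relationship>> bySource =
        new Dictionary<int, List<Relationship>>();
    private readonly Dictionary<int, List<Relationship>> byTarget =
        new Dictionary<int, List<Relationship>>();

    // Registers the same object under both of its endpoints.
    public void Add(Relationship r)
    {
        AddTo(bySource, r.SourceNodeId, r);
        AddTo(byTarget, r.TargetNodeId, r);
    }

    public IReadOnlyList<Relationship> Outgoing(int nodeId) { return Get(bySource, nodeId); }
    public IReadOnlyList<Relationship> Incoming(int nodeId) { return Get(byTarget, nodeId); }

    private static void AddTo(Dictionary<int, List<Relationship>> map, int key, Relationship r)
    {
        List<Relationship> list;
        if (!map.TryGetValue(key, out list))
        {
            list = new List<Relationship>();
            map[key] = list;
        }
        list.Add(r);
    }

    private static IReadOnlyList<Relationship> Get(Dictionary<int, List<Relationship>> map, int key)
    {
        List<Relationship> list;
        return map.TryGetValue(key, out list) ? list : Empty;
    }
}
```

Because both lists reference the same objects, writing Output.State through Outgoing(id) is visible when the target node later reads Incoming(id). If the node ids happen to be dense (0 to N-1), plain arrays of lists indexed by id would probably avoid the hashing cost, but that is a guess that would need measuring.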

Related

Get first (or any) value from HashSet

Currently, my code looks like this:
private List<Node> dirtyNodes = new List<Node>();
public void UpdateDirtyNodes()
{
    while(dirtyNodes.Count > 0)
    {
        Node nodeToUpdate = dirtyNodes[0];
        nodeToUpdate.UpdateNode();
        dirtyNodes.Remove(nodeToUpdate);
    }
}
public void MarkNodeDirty(Node node)
{
    if(!dirtyNodes.Contains(node))
    {
        dirtyNodes.Add(node);
    }
}
public void MarkNodeClean(Node node)
{
    dirtyNodes.Remove(node);
}
This is a performance-critical part of the code, and it's slower than I'd like because dirtyNodes.Contains has to iterate over the entire array in most cases. I'd like to replace the List with a HashSet because it should be faster, but I can't figure out how to make that work with UpdateDirtyNodes().
The difficulty is that UpdateNode() can add or remove nodes from dirtyNodes at any time, hence the slightly awkward while loop. Is there a way I can get the "first" value from a HashSet? Order doesn't matter, I just need to stay in the while loop until dirtyNodes is empty, updating whatever node comes next.
I would prefer to avoid using Linq since this code will be part of a library and I don't want to force them to include Linq.
How can I do this?
Turns out it's very easy to just use the enumerator directly:
public void UpdateDirtyNodes()
{
    while(dirtyNodes.Count > 0)
    {
        using(HashSet<Node>.Enumerator enumerator = dirtyNodes.GetEnumerator())
        {
            if(enumerator.MoveNext())
            {
                Node nodeToUpdate = enumerator.Current;
                nodeToUpdate.UpdateNode();
                dirtyNodes.Remove(nodeToUpdate);
            }
        }
    }
}
public void MarkNodeDirty(Node node)
{
    dirtyNodes.Add(node);
}
I originally tried something similar but didn't fully understand how to manually use enumerators and it didn't work.
It's significantly faster than the List (overall frame time is ~25-50% faster depending on the situation just from that one change) so I'm very happy. (Don't panic about that 30MB allocation in the screenshot below - I'm working on it.)
Add a bool dirty field inside the Node class. This is in addition to keeping a hash set. Then MarkNodeClean() doesn't need to remove nodes from the HashSet, shaving off some CPU cycles.
If you feel that adding a field inside the Node class is too "dirty" (pun intended), then just make a HashSet<(Node, bool)> instead of HashSet<Node>, but you are losing performance on allocating and garbage-collecting extra objects, which is not ideal since your code is performance-critical.
UpdateDirtyNodes() will take nodes one at a time until the HashSet is empty. After taking each node, it will look at the boolean flag to decide whether the node is actually dirty.
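A rough sketch of how the flag and the set could work together. The Dirty field, the DirtyTracker wrapper and the UpdateCount counter are names invented for this example; clearing the flag before calling UpdateNode() is my own choice, made so that a node can mark itself dirty again during its update.

```csharp
using System.Collections.Generic;

public class Node
{
    public bool Dirty;        // the extra flag suggested above
    public int UpdateCount;   // only here so the effect can be observed

    public void UpdateNode() { UpdateCount++; }
}

public class DirtyTracker
{
    private readonly HashSet<Node> dirtyNodes = new HashSet<Node>();

    public void MarkNodeDirty(Node node)
    {
        node.Dirty = true;
        dirtyNodes.Add(node);   // no effect if the node is already queued
    }

    public void MarkNodeClean(Node node)
    {
        node.Dirty = false;     // the node stays in the set and is skipped later
    }

    public void UpdateDirtyNodes()
    {
        while (dirtyNodes.Count > 0)
        {
            // Take any one element; the loop is left before the set is modified.
            Node next = null;
            foreach (Node candidate in dirtyNodes) { next = candidate; break; }

            dirtyNodes.Remove(next);
            if (!next.Dirty) continue;   // was cleaned after being queued

            next.Dirty = false;
            next.UpdateNode();
        }
    }
}
```

A node that is marked dirty and then clean still costs one set removal during the next UpdateDirtyNodes() call, so whether this beats removing it immediately depends on how often that happens.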
P.S.
You should remove dirtyNodes.Clear(); from UpdateDirtyNodes(). This is a race condition. If a node is added after the while loop finds that dirtyNodes.Count is 0, then dirtyNodes.Clear(); clears out that node without processing it. This is a separate bug, not related to your question.

Object oriented design, interacting objects

This problem reminds me of the minigame Doodle God. There are several objects and some of them can interact with each other and form new objects. Each object is naturally its own class: water, fire, air, etc. These all inherit from the same base class. The water and fire objects, for example, could be combined to form an ash object which can be used in new combinations.
The problem is figuring out an elegant way to handle all the possible combinations. The most obvious, but horribly unmaintainable, solution would be creating a function that takes any two objects as parameters and uses a huge switch block to compare typenames and figure out what kind of object (if any) should be returned when these two interact. It is also important that combine(a, b) should always equal combine(b, a).
What would be a maintainable and efficient design for this scenario?
We had to write code for this in a game to collide items. We ended up going for a two-dimensional structure that stored a bunch of delegate methods.
     | air              | wind               | fire
air  | combine(air,air) | combine(air,wind)  | combine(air,fire)
wind |                  | combine(wind,wind) | combine(wind,fire)
fire |                  |                    | combine(fire,fire)
Because combine(a, b) equals combine(b, a), you only need to populate just over half of the combining matrix.
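One way to fill only half of the matrix is to normalise the order of each pair before using it as a key. The sketch below is an illustration under assumptions: it uses a hypothetical Element enum rather than one class per element, and it stores results directly instead of delegates.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element set, used only for this example.
public enum Element { Air, Wind, Fire, Storm }

public static class CombinationTable
{
    private static readonly Dictionary<Tuple<Element, Element>, Element> table =
        new Dictionary<Tuple<Element, Element>, Element>();

    // Orders the pair so that (a, b) and (b, a) map to the same key.
    private static Tuple<Element, Element> Key(Element a, Element b)
    {
        return a <= b ? Tuple.Create(a, b) : Tuple.Create(b, a);
    }

    public static void Register(Element a, Element b, Element result)
    {
        table[Key(a, b)] = result;
    }

    // Returns false when no combination has been registered for the pair.
    public static bool TryCombine(Element a, Element b, out Element result)
    {
        return table.TryGetValue(Key(a, b), out result);
    }
}
```

After CombinationTable.Register(Element.Fire, Element.Air, Element.Storm), both TryCombine(Element.Air, Element.Fire, out r) and TryCombine(Element.Fire, Element.Air, out r) find the same entry.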
You could (for instance):
lookup =
    new Dictionary<
        Tuple<Type, Type>,
        Func<ICombinable, ICombinable, ICombinable>>();
lookup.Add(
    Tuple.Create(typeof(Air), typeof(Fire)),
    (air, fire) => new Explosion());
Then have a single method:
ICombinable Combine(ICombinable a, ICombinable b)
{
    var typeA = a.GetType();
    var typeB = b.GetType();
    // Try the pair in the order given...
    var typeCombo1 = Tuple.Create(typeA, typeB);
    Func<ICombinable, ICombinable, ICombinable> combineFunc;
    if (lookup.TryGetValue(typeCombo1, out combineFunc))
    {
        return combineFunc(a, b);
    }
    // ...then in the reverse order, so each pair only needs registering once.
    var typeCombo2 = Tuple.Create(typeB, typeA);
    if (lookup.TryGetValue(typeCombo2, out combineFunc))
    {
        return combineFunc(b, a);
    }
    return null; // no combination registered for this pair (throwing is an alternative)
}
All game objects are already designed in some way. They are either hardcoded or read at runtime from a resource.
This data structure can easily be stored in a Dictionary<Element, Dictionary<Element, Element>>.
var fire = new FireElement();
var water = new WaterElement();
var steam = new SteamElement();
_allElements = new Dictionary<Element, Dictionary<Element, Element>>
{
    // fire + water => steam
    { fire, new Dictionary<Element, Element> { { water, steam } } },
    // water + fire => steam (the duplicated, mirrored entry)
    { water, new Dictionary<Element, Element> { { fire, steam } } }
};
When loading or defining the elements, you can just duplicate them, as there will be a few hundred at most. The overhead is negligible compared to the ease of coding, IMO.
The keys of _allElements contain all existing, combinable elements. The value of _allElements[SomeElement] yields yet another dictionary, which you can index with the element you wish to combine it with.
This means you can find the resulting element of a combination with the following code:
public Element Combine(Element element1, Element element2)
{
return _allElements[element1][element2];
}
Which, when called as such:
var resultingElement = Combine(fire, water);
Yields steam, the same result as calling Combine(water, fire).
Untested, but I hope the principle applies.
This is exactly the right place for interfaces. With them you can avoid the big switch, and each element class can implement its own behaviour for interacting with another element class.
I would propose using an Abstract Factory returning a specific kind of interface, let's say InteractionOutcome. You would not escape the need for a switch-case, but you would end up with something much more maintainable by using different factories for each "construction".
Hope I helped!

All the paths between 2 nodes in graph

I have to make an uninformed search (Breadth-first-Search) program which takes two nodes and return all the paths between them.
public void BFS(Nod start, Nod end) {
    Queue<Nod> queue = new Queue<Nod>();
    queue.Enqueue(start);
    while (queue.Count != 0)
    {
        Nod u = queue.Dequeue();
        if (u == end) break;
        else
        {
            u.data = "Visited";
            foreach (Edge edge in u.getChildren())
            {
                if (edge.getEnd().data == "")
                {
                    edge.getEnd().data = "Visited";
                    if (edge.getEnd() != end)
                    {
                        edge.getEnd().setParent(u);
                    }
                    else
                    {
                        edge.getEnd().setParent(u);
                        cost = 0;
                        PrintPath(edge.getEnd(), true);
                        edge.getEnd().data = "";
                        //return;
                    }
                }
                queue.Enqueue(edge.getEnd());
            }
        }
    }
}
My problem is that I only get two paths instead of all of them, and I don't know what to change in my code to get them all. The input of my problem is based on this map:
In the BFS algorithm you must not stop after you find a solution. One idea is to set data to null for all the cities you visited except the first one and let the function run a little bit longer. I don't have time to write you a snippet, but if you don't get it I will write at least some pseudocode. If you didn't understand my idea, post a comment with your question and I will try to explain better.
Breadth first search is a strange way to generate all possible paths for the following reason: you'd need to keep track of whether each individual path in the BFS had traversed the node, not that it had been traversed at all.
Take a simple example
1----2
 \    \
  3----4----5
We want all paths from 1 to 5. We queue up 1, then 2 and 3, then 4, then 5. We've lost the fact that there are two paths through 4 to 5.
I would suggest trying to do this with DFS, though this may be fixable for BFS with some thinking. Each thing queued would be a path, not a single node, so one could see whether that path had visited each node. This is wasteful memory-wise, though.
A path is a sequence of vertices in which no vertex is repeated. Given this definition, you could write a recursive algorithm that works as follows. Pass four parameters to the function, call it F(u, v, intermediate_list, no_of_vertices), where u is the current source (which changes as we recurse), v is the destination, intermediate_list is a list of vertices that is initially empty (every time we use a vertex we add it to the list, to avoid using a vertex more than once in our path), and no_of_vertices is the length of the path that we would like to find, which is bounded below by 2 and above by V, the number of vertices. Essentially, the function returns a list of paths whose source is u, whose destination is v, and whose length is no_of_vertices. Create an initial empty list and make calls to F(u, v, {}, 2), F(u, v, {}, 3), ..., F(u, v, {}, V), each time merging the output of F into the list where we intend to store all paths. Try to implement this, and if you still face trouble, I'll write the pseudo-code for you.
Edit: Solving the above problem using BFS: Breadth first search is an algorithm that could be used to explore all the states of a graph. You could explore the graph of all paths of the given graph, using BFS, and select the paths that you want. For each vertex v, add the following states to the queue: (v, {v}, {v}), where each state is defined as: (current_vertex, list_of_vertices_already_visited, current_path). Now, while the queue is not empty, pop off the top element of the queue, for each edge e of the current_vertex, if the tail vertex x doesn't already exist in the list_of_vertices_already_visited, push the new state (x, list_of_vertices_already_visited + {x}, current_path -> x) to the queue, and process each path as you pop it off the queue. This way you can search the entire graph of paths for a graph, whether directed, or undirected.
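The state-based BFS described above could look roughly like the following. This is a sketch under assumptions: vertices are plain ints, the graph is an adjacency dictionary (both invented for the example), only the start vertex is enqueued initially since the question asks about paths between two given nodes, and the path itself doubles as the visited list.

```csharp
using System.Collections.Generic;

public static class PathSearch
{
    // graph maps each vertex to its neighbours (hypothetical representation).
    public static List<List<int>> AllSimplePaths(
        Dictionary<int, List<int>> graph, int start, int end)
    {
        var results = new List<List<int>>();
        var queue = new Queue<List<int>>();
        queue.Enqueue(new List<int> { start });

        while (queue.Count > 0)
        {
            // Each queued item is a whole path, not a single vertex.
            List<int> path = queue.Dequeue();
            int last = path[path.Count - 1];

            if (last == end)
            {
                results.Add(path);
                continue;   // do not extend a path beyond the destination
            }

            List<int> neighbours;
            if (!graph.TryGetValue(last, out neighbours)) continue;

            foreach (int next in neighbours)
            {
                if (path.Contains(next)) continue;   // keeps the path simple
                queue.Enqueue(new List<int>(path) { next });
            }
        }
        return results;
    }
}
```

For an undirected graph with edges 1-2, 1-3, 2-4, 3-4 and 4-5 (each listed in both directions in the dictionary), asking for paths from 1 to 5 gives 1, 2, 4, 5 and 1, 3, 4, 5. Memory use grows with the number of partial paths, which is the cost already mentioned above.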
Sounds like homework. But the fun kind.
The following is pseudocode. It is depth first instead of breadth first (so it would need converting to a queue-type algorithm) and may contain bugs, but the general gist should be clear.
class Node {
    Vector[Link] connections;
    String name;
}
class Link {
    Node destination;
    int distance;
}
// Appends every path from source to end_dest onto routes.
// route holds the nodes on the path currently being explored.
// Initial call: paths(source, end_dest, empty route, empty routes)
void paths(Node source, Node end_dest, Vector[Node] route, Vector[Vector[Node]] routes) {
    route.push(source);
    if (source == end_dest) {
        routes.push(copy of route);
    } else {
        for each connection in source.connections {
            // skip nodes already on this path, so no node is visited twice
            if (connection.destination not in route) {
                paths(connection.destination, end_dest, route, routes);
            }
        }
    }
    route.pop(); // backtrack before returning to the caller
}
Edit: Is this actually the answer to the question you are looking for? Or do you actually want to find the shortest path? If it's the latter, use Dijkstra's algorithm: http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm

Recursive Tree Mapping

I've been working a lot with tree implementations lately and with how we represent and understand trees. My focus has been on turning mathematical expressions into binary trees. I set myself the problem of representing a tree in a linear form, say a string or an array, while still retaining important information about the tree and its subtrees.
As such I have developed a really simple encoding for binary expression trees that does just this. However, I am having some issues implementing it effectively in a recursive manner; it seems to be the one failing aspect of the concept.
The encoding is simple: if the node is a left child it is given a map of 1, and if it is a right child it is given a 0, appended to the map of its parent. This simple encoding allows me to encode entire balanced and unbalanced trees like this:
       ##                  ##
      /  \                /  \
     1    0      OR      1    0
    / \  / \                 / \
   11 10 01 00              01 00
Etc to trees of depth N
Does anyone have any suggestions as to how to create a recursive function that would create the prefix string representing a mapping of this sort (for example ## 1 11 10 0 01 00)?
I was told this would be difficult/impossible due to having to keep track of alternating between 1 and 0 while retaining and concatenating to the value of the parent.
I wondered if anyone had any insight or ideas into how to do this with C#?
I'm not sure I understand your problem, but here is something that might help. One solution might be implementing a graph traversal routine on a Graph (remember a Tree is a specialized Graph), where the visit occurs the first time you encounter a node/vertex. I apologize for posting Java code when you asked for C#, but I happen to know Java...
public void depthFirstSearch(Graph graph, Vertex start){
Set<Vertex> visited = new HashSet<Vertex>(); // could use vertex.isVisited()...
Deque<Vertex> stack = new ArrayDeque<Vertex>(); // stack implies depth first
// first visit the root element, then add it to the stack so
// we will visit its children in a depth first order
visit(start);
visited.add(start);
stack.push(start);
while(stack.isEmpty() == false){
List<Edge> edges = graph.getEdges(stack.peekFirst());
Vertex nextUnvisited = null;
for(Edge edge : edges){
if(visited.contains(edge.getEndVertex()) == false){
nextUnvisited = edge.getEndVertex();
break; // break for loop
}
}
if(nextUnvisited == null){
// check the next item in the stack
Vertex popped = stack.pop();
} else {
// visit adjacent unvisited vertex
visit(nextUnvisited);
visited.add(nextUnvisited);
stack.push(nextUnvisited); // visit its "children"
}
}
}
public void visit(Vertex vertex){
// your own visit logic (string append, etc)
}
You can easily modify this to be a breadth first search by using the Deque as a queue instead of stack as follows:
stack.pop() >>>> queue.removeFirst()
stack.push() >>>> queue.addLast()
Note that for this purpose the Graph and Edge classes support the following operations :
public interface Graph {
...
// get edges originating from Vertex v
public List<Edge> getEdges(Vertex v);
...
}
public interface Edge {
...
// get the vertex at the start of the edge
// not used here but kind of implied by the getEndVertex()...
public Vertex getStartVertex();
// get the vertex at the end of the edge
public Vertex getEndVertex();
...
}
Hopefully that gives you some ideas.
Well, I don't know if I completely get your question, but it seems you want a preorder traversal of the tree. I don't know C#'s syntax, but I think the pseudocode will be as follows:
preorder_traversal(node)
    if (node != NULL)
        print(node)
        preorder_traversal(node.left_child)
        preorder_traversal(node.right_child)
    else
        return
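In C#, that preorder idea combined with the 1/0 encoding from the question might look like the sketch below. TreeNode and TreeMapper are hypothetical names for the example; the key point is that each recursive call receives its parent's map as a parameter, so nothing has to be tracked outside the recursion.

```csharp
using System.Collections.Generic;

// Minimal binary tree node, assumed for this example.
public class TreeNode
{
    public TreeNode Left;
    public TreeNode Right;
}

public static class TreeMapper
{
    // Returns the maps of all nodes in preorder, separated by spaces.
    public static string Encode(TreeNode root)
    {
        var parts = new List<string>();
        Visit(root, "", parts);
        return string.Join(" ", parts);
    }

    private static void Visit(TreeNode node, string map, List<string> parts)
    {
        if (node == null) return;

        // The root has an empty map and is written as "##".
        parts.Add(map.Length == 0 ? "##" : map);

        Visit(node.Left, map + "1", parts);    // left child: parent's map + 1
        Visit(node.Right, map + "0", parts);   // right child: parent's map + 0
    }
}
```

For a full tree of depth two this produces ## 1 11 10 0 01 00, matching the example in the question. For the unbalanced tree, where only the right child has children, it produces ## 1 0 01 00.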
Building a tree recursively is a difficult challenge even for a seasoned programmer. I realize I'm a bit late to the party on this question considering it was originally posted in March of 2011. Better late than never?
One important factor in creating a tree is just making sure your dataset is formatted correctly. You simply need a way to associate a parent to a child. Once the association is clearly defined, then you can begin to code the solution. I chose to use a simple format like this:
ParentId ChildId
1 2
1 3
2 4
3 5
Etc.
Once that relationship is established, I developed a recursive method to iterate through the dataset to build the tree.
First I identify all the parent nodes and store them in a collection giving them each a unique identifier using a combination of the parent ID and child ID:
private void IdentifyParentNodes()
{
SortedList<string, MyTreeNode> newParentNodes = new SortedList<string,MyTreeNode>();
Dictionary<string, string> parents = new Dictionary<string, string>();
foreach (MyTreeNode oParent in MyTreeDataSource.Values)
{
if (!parents.ContainsValue(oParent.ParentId))
{
parents.Add(oParent.ParentId + "." + oParent.ChildId, oParent.ParentId);
newParentNodes.Add(oParent.ParentId + "." + oParent.ChildId, oParent);
}
}
this._parentNodes = newParentNodes;
}
Then the root calling method would loop through the parents and call the recursive method to build the tree:
// Build the rest of the tree
foreach (MyTreeNode node in ParentNodes.Values)
{
RecursivelyBuildTree(node);
}
Recursive method:
private void RecursivelyBuildTree(MyTreeNode node)
{
int nodePosition = 0;
_renderedTree.Append(FormatNode(MyTreeNodeType.Parent, node, 0));
_renderedTree.Append(NodeContainer("open", node.ParentId));
foreach (MyTreeNode child in GetChildren(node.ParentId).Values)
{
nodePosition++;
if (IsParent(child.ChildId))
{
RecursivelyBuildTree(child);
}
else
{
_renderedTree.Append(FormatNode(MyTreeNodeType.Leaf, child, nodePosition));
}
}
_renderedTree.Append(NodeContainer("close", node.ParentId));
}
Method used to get children of a parent:
private SortedList<string, MyTreeNode> GetChildren(string parentId)
{
SortedList<string, MyTreeNode> childNodes = new SortedList<string, MyTreeNode>();
foreach (MyTreeNode node in this.MyTreeDataSource.Values)
{
if (node.ParentId == parentId)
{
childNodes.Add(node.ParentId + node.ChildId, node);
}
}
return childNodes;
}
Not that complex or elegant, but it got the job done. This was written in 2007 time frame, so it's old code, but it still works. :-) Hope this helps.

Is this more suited for key value storage or a tree?

I'm trying to figure out the best way to represent some data. It basically follows the form Manufacturer.Product.Attribute = Value. Something like:
Acme.*.MinimumPrice = 100
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
So the minimum price across all Acme products is 100 except in the case of product A and B. I want to store this data in C# and have some function where GetValue("Acme.ProductC.MinimumPrice") returns 100 but GetValue("Acme.ProductA.MinimumPrice") return 50.
I'm not sure how to best represent the data. Is there a clean way to code this in C#?
Edit: I may not have been clear. This is configuration data that needs to be stored in a text file then parsed and stored in memory in some way so that it can be retrieved like the examples I gave.
Write the text file exactly like this:
Acme.*.MinimumPrice = 100
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
Parse it into a path/value pair sequence:
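// requires: using System.IO; using System.Linq;
foreach (var pair in File.ReadAllLines(configFileName)
    .Select(l => l.Split('='))
    // Trim so the spaces around '=' do not end up in the path or the value
    .Select(a => new { Path = a[0].Trim(), Value = a[1].Trim() }))
{
    // do something with each pair.Path and pair.Value
}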
Now, there are two possible interpretations of what you want to do. The string Acme.*.MinimumPrice could mean that for any lookup where there is no specific override, such as Acme.Toadstool.MinimumPrice, we return 100, even though there is nothing referring to Toadstool anywhere in the file. Or it could mean that it should only return 100 if there are other specific mentions of Toadstool in the file.
If it's the former, you could store the whole lot in a flat dictionary, and at look up time keep trying different variants of the key until you find something that matches.
If it's the latter, you need to build a data structure of all the names that actually occur in the path structure, to avoid returning values for ones that don't actually exist. This seems more reliable to me.
So going with the latter option, Acme.*.MinimumPrice is really saying "add this MinimumPrice value to any product that doesn't have its own specifically defined value". This means that you can basically process the pairs at parse time to eliminate all the asterisks, expanding it out into the equivalent of a completed version of the config file:
Acme.ProductA.MinimumPrice = 50
Acme.ProductB.MinimumPrice = 60
Acme.ProductC.DefaultColor = Blue
Acme.ProductC.MinimumPrice = 100
The nice thing about this is that you only need a flat dictionary as the final representation and you can just use TryGetValue or [] to look things up. The result may be a lot bigger, but it all depends how big your config file is.
You could store the information more minimally, but I'd go with something simple that works to start with, and give it a very simple API so that you can re-implement it later if it really turns out to be necessary. You may find (depending on the application) that making the look-up process more complicated is worse over all.
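For comparison, the simpler first interpretation (a flat dictionary, trying key variants at lookup time) could be sketched as below. ConfigStore, Load and the decision to return null for unknown keys are assumptions made for the example; it only handles the three-part Manufacturer.Product.Attribute form with a wildcard in the product position.

```csharp
using System.Collections.Generic;

public class ConfigStore
{
    private readonly Dictionary<string, string> values = new Dictionary<string, string>();

    // Accepts lines such as "Acme.*.MinimumPrice = 100".
    public void Load(IEnumerable<string> lines)
    {
        foreach (string line in lines)
        {
            int eq = line.IndexOf('=');
            if (eq < 0) continue;   // skip lines without a value
            values[line.Substring(0, eq).Trim()] = line.Substring(eq + 1).Trim();
        }
    }

    // Tries the exact key first, then the manufacturer-wide wildcard.
    public string GetValue(string path)
    {
        string value;
        if (values.TryGetValue(path, out value)) return value;

        string[] parts = path.Split('.');
        if (parts.Length == 3)
        {
            string wildcard = parts[0] + ".*." + parts[2];
            if (values.TryGetValue(wildcard, out value)) return value;
        }
        return null;   // nothing defined for this path
    }
}
```

Loaded with the four lines from the question, GetValue("Acme.ProductC.MinimumPrice") returns "100" and GetValue("Acme.ProductA.MinimumPrice") returns "50". Under this interpretation Acme.Toadstool.MinimumPrice also returns "100", which is exactly the behaviour the second interpretation is meant to avoid.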
I'm not entirely sure what you're asking, but it sounds like you're saying one of two things. Either:
I need a function that will return a fixed value, 100, for every product ID except for two cases: ProductA and ProductB
In that case you don't even need a data structure. A simple comparison function will do
int GetValue(string key) {
if ( key == "Acme.ProductA.MinimumPrice" ) { return 50; }
else if (key == "Acme.ProductB.MinimumPrice") { return 60; }
else { return 100; }
}
Or you could have been asking
I need a function that will return a value if already defined or 100 if it's not
In that case I would use a Dictionary<string,int>. For example
class DataBucket {
private Dictionary<string,int> _priceMap = new Dictionary<string,int>();
public DataBucket() {
_priceMap["Acme.ProductA.MinimumPrice"] = 50;
_priceMap["Acme.ProductB.MinimumPrice"] = 60;
}
public int GetValue(string key) {
int price = 0;
if ( !_priceMap.TryGetValue(key, out price)) {
price = 100;
}
return price;
}
}
One way is to create a nested dictionary: Dictionary<string, Dictionary<string, Dictionary<string, object>>>. In your code you would split "Acme.ProductA.MinimumPrice" on the dots and get or set a value in the dictionary corresponding to the resulting chunks.
Another way is using Linq2Xml: you can create an XDocument with Acme as the root node and products as children of the root, and the attributes can be stored either as attributes on the products or as child nodes. I prefer the second solution, but it would be slower if you have thousands of products.
I would take an OOP approach to this. As you describe it, your products are all represented by objects, which is good. This seems like a good use of polymorphism.
I would have all products derive from a ProductBase that has a virtual property with the default value:
public virtual int MinimumPrice { get { return 100; } }
And then your specific products, such as ProductA, override it:
public override int MinimumPrice { get { return 50; } }
