Topological sort with support for cyclic dependencies - c#

Consider the following dependencies (where A --> B means B depends on A, so effectively, A is the 'parent')
A --> B
A --> C
C --> D
C --> E
More graphically:
         A
         |
     ----------
     |        |
     B        C
              |
         -----------
         |         |
         D         E
A topological sort algorithm would return something like:
ABCDE
I have found code for this (exhibit A and exhibit B), but neither supports cyclic dependencies. I am in a situation where this could happen:
A --> B
B --> C
B --> D
C --> B
C --> E
Graphically:
    A
    |
    B <--> C
    |      |
    D      E
This could return ABCDE or ACBDE. So because B and C are on the same 'level', the order between them is not important (likewise for D and E).
How could I accomplish such a thing? I realize this isn't exactly a topological sort, but I'm no expert mathematician, so I don't really know where to start looking, let alone how to implement this.
Personally, I'm working in C#, but if you know how to do it in any other language, I'd be happy to study your code and translate it to C#.
update
I can also have the following situation:
    A <--------
    |         |
    --> B --> C
        |     |
        D     E
So, important, this doesn't have to be a tree. I can have any arbitrary graph. In fact, not all nodes have to be connected to one another.

First off, it is conceptually easier if you have a graph such that you can ask "what do you depend on"? I'm going to assume that we have a graph where a directed edge from A to B means "A depends on B", which is the opposite of your statement.
I am somewhat confused by your question since a topo sort that ignores cycles is virtually the same as a regular topo sort. I'll develop the algorithm so that you can handle cycles as you see fit; perhaps that will help.
The idea of the sort is:
A graph is a collection of nodes such that each node has a collection of neighbours. As I said, if a node A has a neighbour B then A depends on B, so B must happen before A.
The sort takes a graph and produces a sorted list of nodes.
During the operation of the sort a dictionary is maintained which maps every node onto one of three values: alive, dead and undead. An alive node has yet to be processed. A dead node is already processed. An undead node is being processed; it's no longer alive but not yet dead.
If you encounter a dead node you can skip it; it's already in the output list.
If you encounter a live node then you process it recursively.
If you encounter an undead node then it is part of a cycle. Do what you like. (Produce an error if cycles are illegal, treat it as dead if cycles are legal, etc.)
function topoSort(graph)
    state = []
    list = []
    for each node in graph
        state[node] = alive
    for each node in graph
        visit(graph, node, list, state)
    return list
function visit(graph, node, list, state)
    if state[node] == dead
        return // We've done this one already.
    if state[node] == undead
        return // We have a cycle; if you have special cycle handling code do it here.
    // It's alive. Mark it as undead.
    state[node] = undead
    for each neighbour in getNeighbours(graph, node)
        visit(graph, neighbour, list, state)
    state[node] = dead
    append(list, node)
Make sense?
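The pseudocode above translates fairly directly to C#. The following is only a minimal sketch under some assumptions that are not in the answer: the graph is a `Dictionary<string, List<string>>` mapping each node to the nodes it depends on, every node appears as a key, and the names `Mark` and `TopoSorter` are made up for illustration. Cycles are simply skipped (an undead node is treated as dead).

```csharp
using System;
using System.Collections.Generic;

// The three states from the answer: not yet processed, in progress, finished.
public enum Mark { Alive, Undead, Dead }

public static class TopoSorter
{
    // graph[node] lists the nodes that "node" depends on (assumed representation).
    public static List<string> Sort(Dictionary<string, List<string>> graph)
    {
        var state = new Dictionary<string, Mark>();
        var result = new List<string>();
        foreach (var node in graph.Keys)
            state[node] = Mark.Alive;
        foreach (var node in graph.Keys)
            Visit(graph, node, result, state);
        return result;
    }

    private static void Visit(Dictionary<string, List<string>> graph, string node,
                              List<string> result, Dictionary<string, Mark> state)
    {
        if (state[node] == Mark.Dead)
            return; // Already in the output list.
        if (state[node] == Mark.Undead)
            return; // Part of a cycle; throw here instead if cycles are illegal.
        state[node] = Mark.Undead;
        foreach (var neighbour in graph[node])
            Visit(graph, neighbour, result, state);
        state[node] = Mark.Dead;
        result.Add(node); // All dependencies are already in the list.
    }
}
```

With the cyclic example from the question expressed as dependencies (B depends on A and C, C on B, D on B, E on C), `TopoSorter.Sort` places A before B, B before D and C before E; the relative order of B and C depends on which of them is visited first.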

edit 3 years later: I've occasionally come back to this since I first implemented this in 2014. I didn't really have a good understanding of it at the time when I first posted this answer, so that answer was overly complex. The sort is actually pretty straightforward to implement:
using System;
using System.Collections.Generic;
public class Node
{
    public int Data { get; set; }
    public List<Node> Children { get; set; }
    public Node()
    {
        Children = new List<Node>();
    }
}
public class Graph
{
    public List<Node> Nodes { get; set; }
    public Graph()
    {
        Nodes = new List<Node>();
    }
    public List<Node> TopologicSort()
    {
        var results = new List<Node>();
        var dead = new List<Node>();
        var pending = new List<Node>();
        Visit(Nodes, results, dead, pending);
        return results;
    }
    private void Visit(List<Node> graph, List<Node> results, List<Node> dead, List<Node> pending)
    {
        foreach (var n in graph)
        {
            // Skip nodes that have already been fully processed.
            if (dead.Contains(n))
            {
                continue;
            }
            if (pending.Contains(n))
            {
                // n is still being processed further up the call stack, so this edge closes a cycle.
                Console.WriteLine(String.Format("Cycle detected (node Data={0})", n.Data));
                // Skip this edge, but keep visiting the remaining siblings.
                continue;
            }
            pending.Add(n);
            // Recursively call this function for every child of the current node.
            Visit(n.Children, results, dead, pending);
            pending.Remove(n);
            dead.Add(n);
            // Made it past the recursion, so every child has been handled.
            // Therefore, append the node to the output list.
            results.Add(n);
        }
    }
}

In fact, what you want is a breadth-first printout of your graph.
The linked Wikipedia page lists an algorithm to perform this.
There's also this question on SO.
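As a rough illustration of the breadth-first idea, here is a minimal sketch in C#. It assumes the question's convention (A --> B means B is a child of A), a `Dictionary<string, List<string>>` of children, and a single known start node; the name `BreadthFirst` is invented for this example. Note that a plain breadth-first order is not a valid dependency order for every graph, and disconnected nodes would need additional start points.

```csharp
using System;
using System.Collections.Generic;

public static class BreadthFirst
{
    // children[node] lists the nodes that depend on "node" (assumed representation).
    public static List<string> Order(Dictionary<string, List<string>> children, string start)
    {
        var visited = new HashSet<string> { start };
        var queue = new Queue<string>();
        var result = new List<string>();
        queue.Enqueue(start);
        while (queue.Count > 0)
        {
            var node = queue.Dequeue();
            result.Add(node);
            if (!children.TryGetValue(node, out var next))
                continue; // Leaf node: nothing depends on it.
            foreach (var child in next)
            {
                // HashSet.Add returns false for nodes seen before, which also stops cycles.
                if (visited.Add(child))
                    queue.Enqueue(child);
            }
        }
        return result;
    }
}
```

For the cyclic example in the question (A --> B, B --> C, B --> D, C --> B, C --> E), starting at A yields A, B, C, D, E.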

Start by thinking about the problem right. You don't have a tree. You have an arbitrary graph.
With that in mind, what you probably need to do first is to find cycles and break them by deleting an edge in the cycle (OK, marking the edge as "ignore this when doing topological sort").
With all the cycles removed, you can apply topological sort to the remaining nodes and arcs.

Related

C# Object suddenly null

I've searched long and hard for this and now I'm at road's end. I've had this issue in more than this project, but I ended up scrapping the others. I have some C# code in which I'm trying to build a Huffman tree. At one point I do this ("nodeList" is a List<Node>):
Node node = new Node(nodeList[0], nodeList[1]);
nodeList.Add(node); // Adds a new node which includes two subnodes.
// Remove nodes from list (as they are now included in another node)
nodeList.RemoveAt(0);
nodeList.RemoveAt(1);
And the constructors in use here are:
// Constructor that takes nodes
public Node(Node left, Node right)
{
leftNode = new Node(left);
rightNode = new Node(right);
}
// Constructor that only takes 1 single node
public Node(Node copy)
{
rightNode = copy.rightNode;
leftNode = copy.leftNode;
unencodedBits = copy.unencodedBits;
encodingValue = copy.encodingValue;
positions = copy.positions;
}
I added the second constructor in the hope that it would fix my problem (thinking that removing the node from the list maybe nulled it out). All of the fields in my Node class are on the right-hand side in the second constructor.
The problem: After doing the second "RemoveAt" the Node will no longer contain the two nodes. And I can not understand why. What do I have to do to prevent this from happening and why does it happen (so I can understand similar cases in the future)?
I probably forgot to include some vital information; If I did, please tell me. And thanks for any assistance.
Is your nodeList object an array or a List? If it is a list, then nodeList.RemoveAt(0) causes the node currently located at index 1 to now be located at index 0, so you would need to call
nodeList.RemoveAt(0);
nodeList.RemoveAt(0);
instead of
nodeList.RemoveAt(0);
nodeList.RemoveAt(1);
see here: http://msdn.microsoft.com/en-us/library/5cw9x18z(v=vs.110).aspx
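The index shift is easy to see on a small list. This is only an illustrative sketch; the class and method names are invented, and strings stand in for the question's Node objects.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveAtDemo
{
    // Removes the first two elements: after the first RemoveAt(0),
    // the old element 1 has moved to index 0.
    public static List<string> RemoveZeroTwice(List<string> items)
    {
        var copy = new List<string>(items);
        copy.RemoveAt(0);
        copy.RemoveAt(0);
        return copy;
    }

    // The pattern from the question: the second call removes the element
    // that was originally at index 2, not the one at index 1.
    public static List<string> RemoveZeroThenOne(List<string> items)
    {
        var copy = new List<string>(items);
        copy.RemoveAt(0);
        copy.RemoveAt(1);
        return copy;
    }
}
```

Starting from a, b, c, d, the first method leaves c, d while the second leaves b, d. `List<T>.RemoveRange(0, 2)` is another way to drop the first two elements in one call.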

All the paths between 2 nodes in graph

I have to write an uninformed search (breadth-first search) program which takes two nodes and returns all the paths between them.
public void BFS(Nod start, Nod end) {
    Queue<Nod> queue = new Queue<Nod>();
    queue.Enqueue(start);
    while (queue.Count != 0)
    {
        Nod u = queue.Dequeue();
        if (u == end) break;
        else
        {
            u.data = "Visited";
            foreach (Edge edge in u.getChildren())
            {
                if (edge.getEnd().data == "")
                {
                    edge.getEnd().data = "Visited";
                    if (edge.getEnd() != end)
                    {
                        edge.getEnd().setParent(u);
                    }
                    else
                    {
                        edge.getEnd().setParent(u);
                        cost = 0;
                        PrintPath(edge.getEnd(), true);
                        edge.getEnd().data = "";
                        //return;
                    }
                }
                queue.Enqueue(edge.getEnd());
            }
        }
    }
}
My problem is that I only get two paths instead of all of them, and I don't know what to edit in my code to get them all. The input of my problem is based on this map:
In the BFS algorithm you must not stop after you find a solution. One idea is to set data to null for all the cities you visited except the first one and let the function run a little bit longer. I don't have time to write you a snippet, but if you don't get it I will write at least some pseudocode. If you didn't understand my idea, post a comment with your question and I will try to explain better.
Breadth first search is a strange way to generate all possible paths for the following reason: you'd need to keep track of whether each individual path in the BFS had traversed the node, not that it had been traversed at all.
Take a simple example
1----2
\ \
3--- 4----5
We want all paths from 1 to 5. We queue up 1, then 2 and 3, then 4, then 5. We've lost the fact that there are two paths through 4 to 5.
I would suggest trying to do this with DFS, though this may be fixable for BFS with some thinking. Each thing queued would be a path, not a single node, so one could see if that path had visited each node. This is wasteful memory-wise, though.
A path is a sequence of vertices where no vertex is repeated more than once. Given this definition, you could write a recursive algorithm which shall work as follows: Pass four parameters to the function, call it F(u, v, intermediate_list, no_of_vertices), where u is the current source (which shall change as we recurse), v is the destination, intermediate_list is a list of vertices which shall be initially empty, and every time we use a vertex, we'll add it to the list to avoid using a vertex more than once in our path, and no_of_vertices is the length of the path that we would like to find, which shall be lower bounded by 2, and upper bounded by V, the number of vertices. Essentially, the function shall return a list of paths whose source is u, destination is v, and whose length of each path is no_of_vertices. Create an initial empty list and make calls to F(u, v, {}, 2), F(u, v, {}, 3), ..., F(u, v, {}, V), each time merging the output of F with the list where we intend to store all paths. Try to implement this, and if you still face trouble, I'll write the pseudo-code for you.
Edit: Solving the above problem using BFS: Breadth first search is an algorithm that could be used to explore all the states of a graph. You could explore the graph of all paths of the given graph, using BFS, and select the paths that you want. For each vertex v, add the following states to the queue: (v, {v}, {v}), where each state is defined as: (current_vertex, list_of_vertices_already_visited, current_path). Now, while the queue is not empty, pop off the top element of the queue, for each edge e of the current_vertex, if the tail vertex x doesn't already exist in the list_of_vertices_already_visited, push the new state (x, list_of_vertices_already_visited + {x}, current_path -> x) to the queue, and process each path as you pop it off the queue. This way you can search the entire graph of paths for a graph, whether directed, or undirected.
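A minimal C# sketch of the "queue of paths" idea follows. It simplifies the state described above: each queued item is just the path so far, which doubles as the list of already-visited vertices. The adjacency dictionary, the integer vertex labels and the name `PathFinder` are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class PathFinder
{
    // Returns every simple path (no repeated vertex) from start to end.
    public static List<List<int>> AllPaths(Dictionary<int, List<int>> adjacency, int start, int end)
    {
        var results = new List<List<int>>();
        var queue = new Queue<List<int>>();
        queue.Enqueue(new List<int> { start });
        while (queue.Count > 0)
        {
            var path = queue.Dequeue();
            var last = path[path.Count - 1];
            if (last == end)
            {
                results.Add(path); // Complete path; do not extend it further.
                continue;
            }
            if (!adjacency.TryGetValue(last, out var neighbours))
                continue;
            foreach (var next in neighbours)
            {
                if (path.Contains(next))
                    continue; // This particular path already used the vertex.
                queue.Enqueue(new List<int>(path) { next });
            }
        }
        return results;
    }
}
```

Reading the small example graph above as undirected edges 1-2, 1-3, 2-4, 3-4 and 4-5, this returns the two paths 1, 2, 4, 5 and 1, 3, 4, 5. Because every partial path is copied, memory use grows quickly on dense graphs.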
Sounds like homework. But the fun kind.
The following is pseudocode. It is depth first instead of breadth first (so it should be converted to a queue-type algorithm) and may contain bugs, but the general gist should be clear.
class Node{
    Vector[Link] connections;
    String name;
}
class Link{
    Node destination;
    int distance;
}
Vector[Vector[Node]] paths(Node source, Node end_dest, Vector[Vector[Node]] routes){
    for each route in routes{
        bool has_next = false;
        for each connection in source.connections{
            if !(connection.destination in route) {
                has_next = true;
                route.push(connection.destination);
                if (connection.destination != end_dest){
                    paths(connection.destination, end_dest, routes);
                }
            }
        }
        if !has_next {
            routes.remove(route) //watch out here, might mess up the iteration
        }
    }
    return routes;
}
Edit: Is this actually the answer to the question you are looking for? Or do you actually want to find the shortest path? If it's the latter, use Dijkstra's algorithm: http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm

Range of Elements with Quick Access by Two Keys

I have the following information to store for my basic neural network simulation.
Node (NodeId, State)
Relationship (SourceNodeId, TargetNodeId, Weight, State)
State is the activation level which is the only value which changes during simulation. It is an unsigned float.
I want to easily get all incoming and all outgoing relationships for the current node. By easily I mean with very good performance. (I have around 1,000,000 nodes with 50 relationships each on average.)
The main part of my program looks like this (pseudocode).
foreach(Node in Nodes)
{
    Inputs[] = all incoming relationships;
    Node.State = sum of all Inputs[] elements;
    Outputs[] = all outgoing relationships;
    normalize all Outputs[] elements temporarily; // so that the sum of their weights is 1
    foreach(Output in Outputs[])
    {
        Output.State = Node.State * Output.Weight;
    }
}
I hope you understand what I want to do. If not I will try to explain better.
What type of object would be best for quick access to the relationships by their SourceNodeId and by their TargetNodeId?
PS: Programming in C# using Visual Studio.
I think a list of nodes plus two synchronized dictionaries will be fast.
List<Node> allNodes;
Dictionary<Node,List<Node>> sourceTarget;
Dictionary<Node,List<Node>> targetSource;
I recommend encapsulating them in a single object, a two-way dictionary:
class TwoWayDictionary<T1,T2>
{
    private Dictionary<T1,List<T2>> sourceTarget;
    private Dictionary<T2,List<T1>> targetSource;
    // Public methods and accessors should go here ...
}
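One possible way to fill in the public methods is sketched below. The method names `Add`, `GetTargets` and `GetSources` are hypothetical and not part of the answer; the point is only that both dictionaries are updated together so they cannot drift apart.

```csharp
using System;
using System.Collections.Generic;

public class TwoWayDictionary<T1, T2>
{
    private readonly Dictionary<T1, List<T2>> sourceTarget = new Dictionary<T1, List<T2>>();
    private readonly Dictionary<T2, List<T1>> targetSource = new Dictionary<T2, List<T1>>();

    // Registers one relationship in both directions.
    public void Add(T1 source, T2 target)
    {
        if (!sourceTarget.TryGetValue(source, out var targets))
        {
            targets = new List<T2>();
            sourceTarget[source] = targets;
        }
        targets.Add(target);

        if (!targetSource.TryGetValue(target, out var sources))
        {
            sources = new List<T1>();
            targetSource[target] = sources;
        }
        sources.Add(source);
    }

    // Outgoing side: everything reachable from "source" (empty list if unknown).
    public List<T2> GetTargets(T1 source)
    {
        if (sourceTarget.TryGetValue(source, out var targets))
            return targets;
        return new List<T2>();
    }

    // Incoming side: everything pointing at "target" (empty list if unknown).
    public List<T1> GetSources(T2 target)
    {
        if (targetSource.TryGetValue(target, out var sources))
            return sources;
        return new List<T1>();
    }
}
```

Both lookups are a single hash lookup on average, so fetching the incoming or outgoing side of a node does not require scanning all relationships.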

Is there an accepted name for this type of enumerable operation?

I often find myself needing to traverse trees of hierarchical objects and perform operations on each item along the way. Is there a generally accepted name for this kind of operation in the list comprehension vernacular? I ask because I remember first learning about Python's zip function back before it had an equivalent in the .NET Framework and thinking it had an unusual but appropriate name.
Here are a couple of generalized methods that recurse up and down tree structures and yield each item as they're encountered.
public static IEnumerable<T> Ancestors<T>(T source, Func<T, T> selector)
{
    do
    {
        yield return source;
        source = selector(source);
    } while (!Equals(source, default(T)));
}
public static IEnumerable<T> Descendents<T>(T source,
    Func<T, IEnumerable<T>> selector)
{
    var stack = new Stack<T>();
    stack.Push(source);
    while (stack.Count > 0)
    {
        source = stack.Pop();
        yield return source;
        var items = selector(source);
        if (items != null)
        {
            foreach (var item in items)
            {
                stack.Push(item);
            }
        }
    }
}
Assuming that the selector gives the child nodes, your second method is a "right first depth first" traversal. That is, if you had
A
/ \
B C
/ \ / \
D E F G
Then you get A, C, G, F, B, E, D. You get "G" before "B" because "depth first" goes as deep as it can before it tries another branch. In your particular example you'll get C before B because it prioritizes right over left.
If you changed it to
foreach (var item in items.Reverse())
then you'd get a left-first-depth-first traversal, which is how most people think of depth first traversal.
If you changed the stack to a queue then it would become a "breadth first" traversal. A, B, C, D, E, F, G. You do an entire "level" at a time.
There are other traversals as well. Notice that depth-first and breadth-first searches both have the property that parent nodes come before child nodes. You can also have "post-order" traversals in which every node comes after its children.
Binary trees also have an "inorder" traversal. The inorder traversal of this tree is D, B, E, A, F, C, G. That is, every node comes after everything in its left subtree and before everything in its right subtree. As an exercise, can you write an in-order traversal on a binary tree?
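One way to approach the in-order exercise is a recursive iterator. This is just a sketch; the `BinaryNode` class and the `Traversals` name are invented for the example and are not from the question.

```csharp
using System;
using System.Collections.Generic;

public class BinaryNode
{
    public string Value;
    public BinaryNode Left;
    public BinaryNode Right;

    public BinaryNode(string value, BinaryNode left = null, BinaryNode right = null)
    {
        Value = value;
        Left = left;
        Right = right;
    }
}

public static class Traversals
{
    // Yields the whole left subtree, then the node itself, then the right subtree.
    public static IEnumerable<string> InOrder(BinaryNode node)
    {
        if (node == null)
            yield break;
        foreach (var value in InOrder(node.Left))
            yield return value;
        yield return node.Value;
        foreach (var value in InOrder(node.Right))
            yield return value;
    }
}
```

For the seven-node tree drawn above, `Traversals.InOrder` yields D, B, E, A, F, C, G. Moving the `yield return node.Value` line before or after both loops gives pre-order or post-order instead.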
These are standard Tree Traversal functions, also commonly known as "tree walking". It's difficult to give your examples standardised names because the concrete walking strategy is not known :)

Recursive Tree Mapping

I've been working a lot with tree implementations lately and with how we represent and understand trees. My focus has been on turning mathematical expressions into binary trees. I set myself the problem of representing a tree in a linear form, say a string or an array, while still retaining important information about the tree and its subtrees.
As such I have developed a really simple encoding for binary expression trees that does just this. However, I am having some issues implementing it effectively in a recursive manner; it seems to be the one failing aspect of the concept.
The encoding is simple: if the node is a left child it is given a map of 1, and if it is a right child it is given a 0, appended to the map of its parent. This simple encoding allows me to encode entire balanced and unbalanced trees like this:
       ##                     ##
      /  \                   /  \
     1    0        OR       1    0
    / \  / \                    / \
   11 10 01 00                 01 00
Etc to trees of depth N
Does anyone have any suggestions as to how to create a recursive function that would create the prefix string representing a mapping of this sort (for example ## 1 11 10 0 01 00)?
I was told this would be difficult or impossible due to having to keep track of alternating between 1 and 0 while retaining and concatenating to the value of the parent.
I wondered if anyone had any insight or ideas into how to do this with C#.
I'm not sure I understand your problem, but here is something that might help. One solution might be implementing a graph traversal routine on a Graph (remember a Tree is a specialized Graph), where the visit occurs the first time you encounter a node/vertex. I apologize for posting Java code when you asked for C#, but I happen to know Java...
public void depthFirstSearch(Graph graph, Vertex start){
    Set<Vertex> visited = new HashSet<Vertex>(); // could use vertex.isVisited()...
    Deque<Vertex> stack = new ArrayDeque<Vertex>(); // stack implies depth first
    // first visit the root element, then add it to the stack so
    // we will visit its children in a depth first order
    visit(start);
    visited.add(start);
    stack.push(start);
    while(stack.isEmpty() == false){
        List<Edge> edges = graph.getEdges(stack.peekFirst());
        Vertex nextUnvisited = null;
        for(Edge edge : edges){
            if(visited.contains(edge.getEndVertex()) == false){
                nextUnvisited = edge.getEndVertex();
                break; // break for loop
            }
        }
        if(nextUnvisited == null){
            // no unvisited neighbours left: drop this vertex and
            // check the next item in the stack
            stack.pop();
        } else {
            // visit adjacent unvisited vertex
            visit(nextUnvisited);
            visited.add(nextUnvisited);
            stack.push(nextUnvisited); // visit its "children"
        }
    }
}
public void visit(Vertex vertex){
    // your own visit logic (string append, etc)
}
You can easily modify this to be a breadth first search by using the Deque as a queue instead of stack as follows:
stack.pop() >>>> queue.removeFirst()
stack.push() >>>> queue.addLast()
Note that for this purpose the Graph and Edge classes support the following operations :
public interface Graph {
    ...
    // get edges originating from Vertex v
    public List<Edge> getEdges(Vertex v);
    ...
}
public interface Edge {
    ...
    // get the vertex at the start of the edge
    // not used here but kind of implied by the getEndVertex()...
    public Vertex getStartVertex();
    // get the vertex at the end of the edge
    public Vertex getEndVertex();
    ...
}
Hopefully that gives you some ideas.
Well, I don't know if I completely get your question, but it seems you want a preorder traversal of the tree. I don't know C#'s syntax, but I think the pseudocode would be as follows:
preorder_traversal(node)
    if(node != NULL)
        print(node)
        preorder_traversal(left_sub_child)
        preorder_traversal(right_sub_child)
    else
        return
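Applied to the question's encoding, the preorder idea could look like the following C# sketch. The recursion carries the parent's code down as a parameter, so nothing has to be tracked between calls. The `ExprNode` and `TreeMap` names are assumptions for illustration, and the node class omits the operator or operand a real expression tree would hold.

```csharp
using System;
using System.Collections.Generic;

public class ExprNode
{
    public ExprNode Left;
    public ExprNode Right;
}

public static class TreeMap
{
    // Builds the space-separated prefix string of position codes.
    public static string Prefix(ExprNode root)
    {
        var codes = new List<string>();
        Collect(root, "", codes);
        return string.Join(" ", codes);
    }

    private static void Collect(ExprNode node, string code, List<string> codes)
    {
        if (node == null)
            return;
        // The root has an empty code and is written as "##".
        codes.Add(code.Length == 0 ? "##" : code);
        Collect(node.Left, code + "1", codes);  // Left child: parent's code plus 1.
        Collect(node.Right, code + "0", codes); // Right child: parent's code plus 0.
    }
}
```

For a full tree of depth two this produces `## 1 11 10 0 01 00`, matching the example in the question.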
Building a tree recursively is a difficult challenge even for a seasoned programmer. I realize I'm a bit late to the party on this question considering it was originally posted in March of 2011. Better late than never?
One important factor in creating a tree is just making sure your dataset is formatted correctly. You simply need a way to associate a parent to a child. Once the association is clearly defined, then you can begin to code the solution. I chose to use a simple format like this:
ParentId ChildId
1 2
1 3
2 4
3 5
Etc.
Once that relationship is established, I developed a recursive method to iterate through the dataset to build the tree.
First I identify all the parent nodes and store them in a collection giving them each a unique identifier using a combination of the parent ID and child ID:
private void IdentifyParentNodes()
{
    SortedList<string, MyTreeNode> newParentNodes = new SortedList<string,MyTreeNode>();
    Dictionary<string, string> parents = new Dictionary<string, string>();
    foreach (MyTreeNode oParent in MyTreeDataSource.Values)
    {
        if (!parents.ContainsValue(oParent.ParentId))
        {
            parents.Add(oParent.ParentId + "." + oParent.ChildId, oParent.ParentId);
            newParentNodes.Add(oParent.ParentId + "." + oParent.ChildId, oParent);
        }
    }
    this._parentNodes = newParentNodes;
}
Then the root calling method would loop through the parents and call the recursive method to build the tree:
// Build the rest of the tree
foreach (MyTreeNode node in ParentNodes.Values)
{
    RecursivelyBuildTree(node);
}
Recursive method:
private void RecursivelyBuildTree(MyTreeNode node)
{
    int nodePosition = 0;
    _renderedTree.Append(FormatNode(MyTreeNodeType.Parent, node, 0));
    _renderedTree.Append(NodeContainer("open", node.ParentId));
    foreach (MyTreeNode child in GetChildren(node.ParentId).Values)
    {
        nodePosition++;
        if (IsParent(child.ChildId))
        {
            RecursivelyBuildTree(child);
        }
        else
        {
            _renderedTree.Append(FormatNode(MyTreeNodeType.Leaf, child, nodePosition));
        }
    }
    _renderedTree.Append(NodeContainer("close", node.ParentId));
}
Method used to get children of a parent:
private SortedList<string, MyTreeNode> GetChildren(string parentId)
{
    SortedList<string, MyTreeNode> childNodes = new SortedList<string, MyTreeNode>();
    foreach (MyTreeNode node in this.MyTreeDataSource.Values)
    {
        if (node.ParentId == parentId)
        {
            childNodes.Add(node.ParentId + node.ChildId, node);
        }
    }
    return childNodes;
}
Not that complex or elegant, but it got the job done. This was written in 2007 time frame, so it's old code, but it still works. :-) Hope this helps.
