Compare XPath list to find the closest to another node? - c#

I have the following node
"/html[1]/body[1]/div[1]/div[1]/div[3]/div[1]/div[7]/p[1]/#text[1]"
How can I figure out that the last one of these is the closest one?
"/html[1]/body[1]/div[1]/div[1]/div[3]/div[1]/div[4]/div[1]/img[1]"
"/html[1]/body[1]/div[1]/div[1]/div[3]/div[1]/div[4]/div[3]/a[1]/img[1]"
"/html[1]/body[1]/div[1]/div[1]/div[3]/div[1]/div[4]/div[3]/a[2]/img[1]"
"/html[1]/body[1]/div[1]/div[1]/div[3]/div[1]/div[4]/div[5]/img[1]"
"/html[1]/body[1]/div[1]/div[1]/div[3]/div[1]/div[5]/div[1]/img[1]"
It won't always be necessarily the last one.
Here's how I got there:
protected string GuessThumbnail(HtmlDocument document)
{
HtmlNode root = document.DocumentNode;
IEnumerable<string> result = new List<string>();
HtmlNode description = root.SelectSingleNode(DescriptionPredictiveXPath);
if (description != null) // in this case, we predict relevant images are the ones closest to the description text node.
{
HtmlNode node = description.ParentNode;
while (node != null)
{
string path = string.Concat(node.XPath, ImageXPath);
node = node.ParentNode;
IEnumerable<HtmlNode> nodes = root.SelectNodesOrEmpty(path);
// find the image tag that's closest to the text node.
if (nodes.Any())
{
var xpaths = nodes.Select(n => n.XPath);
xpaths.ToList();
// return closest
}
}
}
// figure some other way to do it
throw new NotImplementedException();
}

Consider assigning each node its position in a depth-first traversal of the whole tree. Comparing two nodes then reduces to comparing two integers.
If you can attach arbitrary data to your nodes, store the position on the node directly. Otherwise, keep a dictionary that maps each node to its position.
Note that, depending on how often you need to do this comparison, this approach may be too slow for you, but it is easy to implement and to measure whether it meets your requirements.
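The numbering idea can be sketched as follows. This is a minimal illustration, not HtmlAgilityPack code: `SimpleNode`, `DocumentOrder.Index` and `DocumentOrder.Closest` are hypothetical names introduced here, and "closest" is assumed to mean the smallest difference in depth-first position.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an HTML node; not part of HtmlAgilityPack.
public class SimpleNode
{
    public string Name;
    public List<SimpleNode> Children = new List<SimpleNode>();

    public SimpleNode(string name, params SimpleNode[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

public static class DocumentOrder
{
    // Assigns each node its position in a depth-first (pre-order) walk.
    public static Dictionary<SimpleNode, int> Index(SimpleNode root)
    {
        var positions = new Dictionary<SimpleNode, int>();
        var stack = new Stack<SimpleNode>();
        stack.Push(root);
        while (stack.Count > 0)
        {
            SimpleNode node = stack.Pop();
            positions[node] = positions.Count;
            // Push children in reverse so the first child is visited first.
            for (int i = node.Children.Count - 1; i >= 0; i--)
                stack.Push(node.Children[i]);
        }
        return positions;
    }

    // Returns the candidate whose position is nearest to the target's position.
    public static SimpleNode Closest(SimpleNode target,
                                     IEnumerable<SimpleNode> candidates,
                                     Dictionary<SimpleNode, int> positions)
    {
        SimpleNode best = null;
        int bestDistance = int.MaxValue;
        foreach (SimpleNode candidate in candidates)
        {
            int distance = Math.Abs(positions[candidate] - positions[target]);
            if (distance < bestDistance)
            {
                best = candidate;
                bestDistance = distance;
            }
        }
        return best;
    }
}
```

The index is built once per document, so each later comparison is a dictionary lookup and a subtraction. With a real HTML parser the same walk would run over that library's node type instead of `SimpleNode`.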

Did it like this:
protected string GuessThumbnail(HtmlDocument document)
{
HtmlNode root = document.DocumentNode;
HtmlNode description = root.SelectSingleNode(DescriptionPredictiveXPath);
if (description != null)
{
// in this case, we predict relevant images are the ones closest to the description text node.
HtmlNode parent = description.ParentNode;
while (parent != null)
{
string path = string.Concat(parent.XPath, ImageXPath);
IList<HtmlNode> images = root.SelectNodesOrEmpty(path).ToList();
// find the image tag that's closest to the text node.
if (images.Any())
{
HtmlNode descriptionOutermost = description.ParentNodeUntil(parent); // get the first child towards the description from the parent node.
int descriptionIndex = descriptionOutermost.GetIndex(); // get the index of the description's outermost element.
HtmlNode closestToDescription = null;
int distanceToDescription = int.MaxValue;
foreach (HtmlNode image in images)
{
int index = image.ParentNodeUntil(parent).GetIndex(); // get the index of the image's outermost element.
if (index > descriptionIndex)
{
index *= -1;
}
int distance = descriptionIndex - index;
if (distance < distanceToDescription)
{
closestToDescription = image;
distanceToDescription = distance;
}
}
if (closestToDescription != null)
{
string source = closestToDescription.Attributes["src"].Value;
return source;
}
}
parent = parent.ParentNode;
}
}
// figure some other way to do it
throw new NotImplementedException();
}
public static HtmlNode ParentNodeUntil(this HtmlNode node, HtmlNode parent)
{
while (node.ParentNode != parent)
{
node = node.ParentNode;
}
return node;
}
public static int GetIndex(this HtmlNode node)
{
return node.ParentNode.ChildNodes.IndexOf(node);
}

Related

C# Delete node from BST - Am I on the right track?

EDIT:
Thanks for the help earlier. I have been stepping through the code with Step Into and Step Over, and the logic looks right, but the nodes are not being deleted and I'm not sure why.
The real BST takes 5 arguments, but I am using just one for testing. It compares values and detects whether a node has children without any problem; it just won't set the node to null.
I am only testing nodes with 0 or 1 children.
main
Tree aTree = new Tree();
aTree.InsertNode("a");
aTree.InsertNode("s");
aTree.InsertNode("3");
aTree.InsertNode("1");
aTree.InsertNode("p");
aTree.PreorderTraversal();
aTree.RemoveNode("p");
aTree.RemoveNode("3");
aTree.PreorderTraversal();
Console.ReadKey();
My Delete Methods are:
Tree Node
public void Remove(TreeNode root, TreeNode Delete) {
if (Data == null) {
}
if (Delete.Data.CompareTo(root.Data) < 0) {
root.nodeLeft.Remove(root.nodeLeft, Delete);
}
if (Delete.Data.CompareTo(root.Data) > 0) {
root.nodeRight.Remove(root.nodeRight, Delete);
}
if (Delete.Data == root.Data) {
//No child nodes
if (root.nodeLeft == null && root.nodeRight == null) {
root = null;
}
else if (root.nodeLeft == null)
{
TreeNode temp = root;
root = root.nodeRight;
root.nodeRight = null;
temp = null;
}
//No right child
else if (root.nodeRight == null)
{
TreeNode temp = root;
root = root.nodeLeft;
root.nodeLeft = null;
temp = null;
}
//Has both child nodes
else
{
TreeNode min = minvalue(root.nodeRight);
root.Data = min.Data;
root.nodeRight.Remove(root.nodeRight, min);
}
}
}
Find Min
public TreeNode minvalue(TreeNode node)
{
TreeNode current = node;
/* loop down to find the leftmost leaf */
while (current.nodeLeft != null)
{
current = current.nodeLeft;
}
return current;
}
Tree
public void RemoveNode(string Nation)
{
TreeNode Delete = new TreeNode(Nation);
root.Remove(root, Delete);
}
Remove is of return type void, but you're trying to assign it to root.nodeLeft and root.nodeRight, causing your type conversion error.
In general your Remove function needs to return the root of the sub-tree as the result, as in
public TreeNode Remove(TreeNode root, TreeNode Delete) {
// return type is TreeNode so the caller can re-link the sub-tree
if (root == null) {
return null;
}
if (Delete.Data.CompareTo(root.Data) < 0) {
root.nodeLeft = Remove(root.nodeLeft, Delete);
return root;
}
... and so on.
Otherwise, since your nodes don't refer to their parents, there would be no way for the parent to know that the child node is gone, or that a new node is now at the root of the sub-tree.
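A fuller sketch of the "return the root of the sub-tree" pattern is shown below. It is a simplified illustration rather than the asker's classes: `BstNode` and `Bst` are hypothetical names, the methods are static, and keys are compared with `string.CompareOrdinal` instead of `CompareTo` so the ordering does not depend on culture settings.

```csharp
using System;

// Hypothetical node type for illustration; fields mirror the question's Data/left/right idea.
public class BstNode
{
    public string Data;
    public BstNode Left;
    public BstNode Right;

    public BstNode(string data) { Data = data; }
}

public static class Bst
{
    // Inserts data and returns the (possibly new) root of the sub-tree.
    public static BstNode Insert(BstNode root, string data)
    {
        if (root == null) return new BstNode(data);
        int cmp = string.CompareOrdinal(data, root.Data);
        if (cmp < 0) root.Left = Insert(root.Left, data);
        else if (cmp > 0) root.Right = Insert(root.Right, data);
        return root;
    }

    // Removes data from the sub-tree and returns the sub-tree's new root.
    public static BstNode Remove(BstNode root, string data)
    {
        if (root == null) return null; // value not present
        int cmp = string.CompareOrdinal(data, root.Data);
        if (cmp < 0)
        {
            root.Left = Remove(root.Left, data);
        }
        else if (cmp > 0)
        {
            root.Right = Remove(root.Right, data);
        }
        else
        {
            // Zero or one child: the other child (possibly null) replaces this node.
            if (root.Left == null) return root.Right;
            if (root.Right == null) return root.Left;
            // Two children: copy the in-order successor, then remove it from the right sub-tree.
            BstNode min = root.Right;
            while (min.Left != null) min = min.Left;
            root.Data = min.Data;
            root.Right = Remove(root.Right, min.Data);
        }
        return root;
    }

    public static bool Contains(BstNode root, string data)
    {
        while (root != null)
        {
            int cmp = string.CompareOrdinal(data, root.Data);
            if (cmp == 0) return true;
            root = cmp < 0 ? root.Left : root.Right;
        }
        return false;
    }
}
```

The key point is that every recursive call is assigned back (`root.Left = Remove(...)`), so the parent's link is updated. Setting a local `root` variable to null, as in the question, never changes the parent.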

C# Node pointer issue

I am having some trouble setting child nodes using C#. I am trying to build a tree of nodes where each node holds an int value and can have up to a number of children equal to its value.
My issue appears when I iterate over a node looking for empty (null) children so that I can add a new node in that spot. I can find and return the null child, but when I assign the new node to it, the new node is not connected to the parent node.
So if I add 1 node, it is linked to my head node, but if I try to add a second, it does not become a child of the head node. I am trying to build this with unit tests, so here is the test code showing that the head does not show the new node as its child (also confirmed with the Visual Studio debugger):
[TestMethod]
public void addSecondNodeAsFirstChildToHead()
{
//arrange
Problem3 p3 = new Problem3();
p3.addNode(2, p3._head);
Node expected = null;
Node expected2 = p3._head.children[0];
int count = 2;
//act
Node actual = p3.addNode(1, p3._head);
Node expected3 = p3._head.children[0];
//assert
Assert.AreNotEqual(expected, actual, "Node not added"); //pass
Assert.AreNotEqual(expected2, actual, "Node not added as first child"); //pass
Assert.AreEqual(expected3, actual, "Node not added as first child"); //FAILS HERE
Assert.AreEqual(count, p3.nodeCount, "Not added"); //pass
}
Here is my code.
public class Node
{
public Node[] children;
public int data;
public Node(int value)
{
data = value;
children = new Node[value];
for(int i = 0; i < value; i++)
{
children[i] = null;
}
}
}
public class Problem3
{
public Node _head;
public int nodeCount;
public Problem3()
{
_head = null;
nodeCount = 0;
}
public Node addNode(int value, Node currentNode)
{
if(value < 1)
{
return null;
}
Node temp = new Node(value);
//check head
if (_head == null)
{
_head = temp;
nodeCount++;
return _head;
}
//start at Current Node
if (currentNode == null)
{
currentNode = temp;
nodeCount++;
return currentNode;
}
//find first empty child
Node emptyChild = findEmptyChild(currentNode);
emptyChild = temp;
nodeCount++;
return emptyChild;
}
public Node findEmptyChild(Node currentNode)
{
Node emptyChild = null;
//find first empty child of current node
for (int i = 0; i < currentNode.children.Length; i++)
{
if (currentNode.children[i] == null)
{
return currentNode.children[i];
}
}
//move to first child and check it's children for an empty
//**this causes values to always accumulate on left side of the tree
emptyChild = findEmptyChild(currentNode.children[0]);
return emptyChild;
}
I feel the problem is I am trying to treat the nodes as pointers like I would in C++ but that it is not working as I expect.
It is impossible for a function to return a handle (or a pointer) to something that does not yet exist. Either you initialize non existent value inside the function, or you provide enough variables for it to be initialized outside of the function.
One solution would be to rename the function findEmptyChild to something like initializeEmptyChild(Node currentNode, Node newNode), adding one more Node parameter to it (when calling it that would be temp value), and in the loop before return you initialize the previously empty Node, currentNode.children[i] = newNode.
Another solution would be not to return just one Node but two values, a parent node and an index where empty child is found, Tuple<Node, int> findEmptyChild(Node currentNode), and in the loop instead of return currentNode.children[i] you do return new Tuple<Node, int>(currentNode, i). When calling the function you would change the code to
var parentAndIndex = findEmptyChild(currentNode);
parentAndIndex.Item1.children[parentAndIndex.Item2] = temp;
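The second suggestion (returning the parent together with the slot index) might look like the sketch below. It is an illustration under assumptions: `Slots`, `FindEmptySlot` and the static `AddNode` are hypothetical names, and unlike the question's code the search visits every child sub-tree instead of only `children[0]`.

```csharp
using System;

public class Node
{
    public Node[] children;
    public int data;

    public Node(int value)
    {
        data = value;
        children = new Node[value]; // array elements default to null
    }
}

public static class Slots
{
    // Finds the first empty child slot and returns the parent that owns it plus the index.
    // Returns null when the whole sub-tree is full.
    public static Tuple<Node, int> FindEmptySlot(Node current)
    {
        if (current == null) return null;
        for (int i = 0; i < current.children.Length; i++)
        {
            if (current.children[i] == null)
                return Tuple.Create(current, i);
        }
        // All direct slots are taken: search each child's sub-tree in order.
        foreach (Node child in current.children)
        {
            Tuple<Node, int> slot = FindEmptySlot(child);
            if (slot != null) return slot;
        }
        return null;
    }

    // Creates a node and stores it in the first empty slot under parent.
    public static Node AddNode(int value, Node parent)
    {
        if (value < 1) return null;
        Tuple<Node, int> slot = FindEmptySlot(parent);
        if (slot == null) return null;
        var node = new Node(value);
        slot.Item1.children[slot.Item2] = node; // write through the parent's array
        return node;
    }
}
```

Because the assignment goes through `slot.Item1.children[...]`, the parent's array is modified. Assigning to a local variable that merely held the value `null` cannot have that effect.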
Look at this part of your code:
Node temp = new Node(value);
//...
Node emptyChild = findEmptyChild(currentNode);
emptyChild = temp;
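You are assigning a new node to the local variable emptyChild. Doing so only rebinds that variable, so you "lose" the connection with any parent node. You should copy the new node's values into the existing node instead (this requires emptyChild to be a real node object rather than null, which the changes described next take care of):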
emptyChild.data = temp.data;
emptyChild.children = temp.children;
As others said, your approach using null checking could be improved. You mentioned that Node.data holds the number of children of a given node, so you could simply say that a node with Node.data == 0 should be treated as null, or empty. For example, instead of having:
rootNode.children[0] = null; // rootNode can have a lot of children
rootNode.children[1] = null;
//...
you would have:
rootNode.children[0] = new Node(0);
rootNode.children[1] = new Node(0);
//...
At this point your code will look similar to this:
public class Node
{
public Node[] children;
public int data;
public Node(int value)
{
data = value;
children = new Node[value];
// Instead of "pointing" to null,
// create a new empty node for each child.
for (int i = 0; i < value; i++)
{
children[i] = new Node(0);
}
}
}
public class Problem3
{
public Node _head;
public int nodeCount;
public Problem3()
{
_head = null;
nodeCount = 0;
}
public Node addNode(int value, Node currentNode)
{
if (value < 1)
{
return null;
}
Node temp = new Node(value);
//check head
if (_head == null)
{
_head = temp;
nodeCount++;
return _head;
}
//start at Current Node
if (currentNode == null)
{
currentNode = temp;
nodeCount++;
return currentNode;
}
//find first empty child
Node emptyChild = findEmptyChild(currentNode);
if (emptyChild != null)
{
emptyChild.data = temp.data;
emptyChild.children = temp.children;
nodeCount++;
}
return emptyChild;
}
public Node findEmptyChild(Node currentNode)
{
// Null checking.
if (currentNode == null)
return null;
// If current node is empty, return it.
if (currentNode.data == 0)
return currentNode;
// If current node is non-empty, check its children.
// If no child is empty, null will be returned.
// You could change this method to check even the
// children of the children and so on...
return currentNode.children.FirstOrDefault(node => node.data == 0);
}
}
Let's look now at the testing part (please see the comments for clarification):
[TestMethod]
public void addSecondNodeAsFirstChildToHead()
{
//arrange
Problem3 p3 = new Problem3();
p3.addNode(2, p3._head); // Adding two empty nodes to _head, this means that now _head can
// contain two nodes, but for now they are empty (think of them as
// being "null", even if it's not true)
Node expected = null;
Node expected2 = p3._head.children[0]; // Should be the first of the empty nodes added before.
// Be careful: if you later change p3._head.children[0]
// values, expected2 will change too, because they are
// now pointing to the same object in memory
int count = 2;
//act
Node actual = p3.addNode(1, p3._head); // Now we add a non-empty node to _head, this means
// that we will have a total of two non-empty nodes:
// this one freshly added and _head (added before)
Node expected3 = p3._head.children[0]; // This was an empty node, but now should be non-empty
// because of the statement above. Now expected2 should
// be non-empty too.
//assert
Assert.AreNotEqual(expected, actual, "Node not added"); //pass
// This assert won't work anymore, because expected2, expected3 and actual
// now point at the same object in memory: p3._head.children[0].
// In your code, this assert passed only because expected2 was null while
// actual was a new node that never got attached to the tree.
// In order to make it work, you should replace this statement:
// Node expected2 = p3._head.children[0];
// with this one:
// Node expected2 = new Node(0); // Create an empty node.
// expected2.data = p3._head.children[0].data; // Copy data
// expected2.children = p3._head.children[0].children;
// This will make a copy of the node instead of changing its reference.
Assert.AreNotEqual(expected2, actual, "Node not added as first child");
// Now this will work.
Assert.AreEqual(expected3, actual, "Node not added as first child");
Assert.AreEqual(count, p3.nodeCount, "Not added"); //pass
}

Binary tree recursive search by node content does not return correct node

I have a binary tree class.
I need to find a first occurrence of a node with some specified content and return this node using recursion.
For example Find("B") should find a first occurrence of a node with content "B".
public Node Find(string content)
{
Node aux = null;
bool found = false;
if (this.left != null)
{
this.left.Find(content);
}
if (found != true)
{
if (content == this.content)
{
found = true;
return aux = this;
}
}
if (this.right != null)
{
this.right.Find(content);
}
return aux;
}
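One likely cause is that the results of the recursive calls `this.left.Find(content)` and `this.right.Find(content)` are discarded, and `found` is a local variable that each call starts at false. A minimal sketch of a version that propagates the result is shown below; `BinaryNode` is a hypothetical class name, and the in-order visiting sequence (left, self, right) is kept from the question's code.

```csharp
using System;

// Hypothetical minimal binary tree node; field names follow the question's code.
public class BinaryNode
{
    public string content;
    public BinaryNode left;
    public BinaryNode right;

    public BinaryNode(string content, BinaryNode left = null, BinaryNode right = null)
    {
        this.content = content;
        this.left = left;
        this.right = right;
    }

    // Returns the first node (in left, self, right order) with matching content, or null.
    public BinaryNode Find(string content)
    {
        if (left != null)
        {
            BinaryNode found = left.Find(content);
            if (found != null) return found; // stop as soon as the left sub-tree has a match
        }
        if (this.content == content) return this;
        return right != null ? right.Find(content) : null;
    }
}
```

No `found` flag is needed: a non-null return value already signals that the search succeeded.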

How to turn textual qualified data to a tree

I have data that looks like the below:
a.b.c.d.e.f.g
b.c.d.e.f.g.h.x
c.d.e.q.s.n.m.y
a.b.c
I need to take this data and turn each and every level into a node in a treeview. So the tree looks something like:
a
b
c
d
e
...
b
c
d
....
If, for example, another a appears at the same level, the elements under it should be added as nodes to the existing a branch. I have thought of the following:
Parse each line that is qualified by the dot character for each element and create an ordered list.
For each item in the list add it as a node in the current location.
Before adding check to make sure another item at the same level does not exist with the same name.
Add the next element until all items in the list are done, next elements being child to the first added item of the list.
I hope I was clear and let me know if it needs further clarification.
You can change the Node class to have the checks, if you want a list of children nodes, etc, add that as a HashSet, so you can easily make the check for uniqueness. Add a method in the Node class to do the AddChild and do the check on the HashSet.
public class Main
{
public Main()
{
string treeStr = "";
string[] strArr = { "a.b.c.d.e.f.g", "b.c.d.e.f.g.h.x" };
List<Node> nodes = new List<Node>();
Node currentNode;
foreach (var str in strArr)
{
string[] split = str.Split('.');
currentNode = null;
for (int i = 0; i < split.Length; i++)
{
var newNode = new Node { Value = split[i] }; // use the current segment, not the whole line
if (currentNode != null)
{
currentNode.Child = newNode;
}
else
{
nodes.Add(newNode);
}
currentNode = newNode;
}
}
}
}
public class Node
{
public string Value { get; set; }
public Node Child { get; set; }
}
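The uniqueness check described at the start of this answer can be sketched as follows. This is an assumption-laden illustration: `PathNode`, `GetOrAddChild` and `PathTree.Build` are hypothetical names, and a `Dictionary` keyed by name is used in place of the suggested HashSet so that an existing child can be looked up and reused, not just detected.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node type: each node keeps its children keyed by name.
public class PathNode
{
    public string Value { get; }
    public Dictionary<string, PathNode> Children { get; } = new Dictionary<string, PathNode>();

    public PathNode(string value) { Value = value; }

    // Returns the existing child with this name, or creates and adds one.
    public PathNode GetOrAddChild(string name)
    {
        PathNode child;
        if (!Children.TryGetValue(name, out child))
        {
            child = new PathNode(name);
            Children.Add(name, child);
        }
        return child;
    }
}

public static class PathTree
{
    // Builds one tree from dot-qualified lines; shared prefixes share nodes.
    public static PathNode Build(IEnumerable<string> lines)
    {
        var root = new PathNode(""); // artificial root holding the top-level names
        foreach (string line in lines)
        {
            PathNode current = root;
            foreach (string name in line.Split('.'))
                current = current.GetOrAddChild(name);
        }
        return root;
    }
}
```

With the four sample lines from the question, the root ends up with three children (a, b, c), and the line a.b.c adds no new nodes because that path already exists.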
I'm assuming the existence of the methods CreateRootNode and AddChildNode.
void ParseToTreeview(IEnumerable<string> data) {
foreach (var line in data) {
var names = line.Split('.');
TreeNode node = null; // declared outside the loop so it carries over as the parent of the next level
for (var i = 0; i < names.Length; i++) {
if (i == 0)
node = CreateRootNode(name:names[i]);
else
node = AddChildNode(name:names[i], parentNode:node);
}
}
}
A recursive method to add all of these is what you need. Here's a sample:
Use:
string[] yourListOfData = { "a.b.c.d.e.f.g", "b.c.d.e.f.g.h.x", "c.d.e.q.s.n.m.y", "a.b.c" };
foreach(string x in yourListOfData)
PopulateTreeView(x, myTreeView.Nodes[0]);
Sample Method:
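public void PopulateTreeView(string values, TreeNode parentNode)
{
// requires: using System.Linq;
if (string.IsNullOrEmpty(values))
return;
// Split off the first segment; the remainder is handled by the recursive call.
int dot = values.IndexOf('.');
string nodeValue = dot < 0 ? values : values.Substring(0, dot);
string additionalData = dot < 0 ? "" : values.Substring(dot + 1);
// Reuse an existing child with the same text so that shared prefixes are merged.
TreeNode myNode = parentNode.Nodes.Cast<TreeNode>().FirstOrDefault(n => n.Text == nodeValue);
if (myNode == null)
{
myNode = new TreeNode(nodeValue);
parentNode.Nodes.Add(myNode);
}
PopulateTreeView(additionalData, myNode);
}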
NOTE: code above is not tested, might need tweaking

Is this implementation of a Red Black Tree C# correct?

Please critique my code. I noticed my last assert fails with a value of 277. I expected the value to be 255 (half of 500 + 10). Is this a valid test, or have I done something wrong?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.VisualStudio.TestTools.UnitTesting;
namespace RedBlackTree
{
public enum NodeColor
{
Red,
Black
}
public enum NodeSide
{
Left,
Right,
None
}
public class RedBlackTreeNode<T>
where T : IComparable<T>
{
public RedBlackTreeNode<T> Left { get; set; }
public RedBlackTreeNode<T> Right { get; set; }
public RedBlackTreeNode<T> Parent { get; set; }
public T Data {get; set;}
public NodeColor Color { get; set; }
public RedBlackTreeNode<T> Uncle
{
get
{
if (WhichSideAmIOn() == NodeSide.Left)
{
return this.Parent.Right;
}
else
return this.Parent.Left;
}
}
public override string ToString()
{
return String.Format("Me: {0} Left: {1} Right: {2}", Data, Left != null ? Left.Data.ToString() : "null", Right != null ? Right.Data.ToString() : "null");
}
public NodeSide WhichSideAmIOn()
{
if (this.Parent == null) return NodeSide.None;
if (this.Parent.Left == this)
return NodeSide.Left;
if (this.Parent.Right == this)
return NodeSide.Right;
throw new Exception("Impossible - there can be only two sides. You must not be a child of your parent.");
}
}
public class RedBlackTree<T>
where T : IComparable<T>
{
private RedBlackTreeNode<T> Root { get; set; }
public void InsertNode(T data)
{
//make a node to hold the data - always red
RedBlackTreeNode<T> newNode = new RedBlackTreeNode<T>();
newNode.Data = data;
newNode.Color = NodeColor.Red;
//rule 1 - if the root is null, then the new node is the root and roots can't be red.
if (Root == null)
{
Root = newNode;
Root.Color = NodeColor.Black;
return;
}
//otherwise if we're not the first node, insert by walking.
RedBlackTreeNode<T> walker = Root;
while (walker != null)
{
if (newNode.Data.CompareTo(walker.Data)< 0)
{
//walk left
if (walker.Left == null)
{
walker.Left = newNode;
newNode.Parent = walker;
break;
}
else
{
walker = walker.Left;
}
}
else if (newNode.Data.CompareTo(walker.Data) > 0)
{
//walk right
if (walker.Right == null)
{
walker.Right = newNode;
newNode.Parent = walker;
break;
}
else
{
walker = walker.Right;
}
}
else //todo: remove me
{
//we're equal, ignore this node in general
return;
}
}
//rebalance -
//at this point we have the parent , we have the newnode and we need to implement some rules.
Rebalance();
}
private void Rebalance()
{
RedBlackTreeNode<T> node = Root;
Stack<RedBlackTreeNode<T>> stack = new Stack<RedBlackTreeNode<T>>();
while (stack.Count !=0 || node !=null )
{
if (node != null)
{
stack.Push(node);
node = node.Left;
}
else
{
node = stack.Pop();
Rebalance(node);
node = node.Right;
}
}
}
private void Rebalance(RedBlackTreeNode<T> node)
{
if (node.Parent == null) return;
if (node.Parent.Color == NodeColor.Red) //rule 2 or 3
{
if (node.Uncle != null) //the rule 2 - change him to black as well
{
Rule2(node);
}
else //if my uncle doesn't exist, it's could be rule 3 or 4, which requires rotation
{
//if my parent and I are on the same side,
if (node.WhichSideAmIOn() == node.Parent.WhichSideAmIOn())
{
Rule3(node);
}
else
{
Rule4(node);
}
}
}
}
private void Rule2(RedBlackTreeNode<T> node)
{
//my parent + uncle needs to be black
if (node.Parent == null) throw new Exception("um how?");
node.Parent.Color = NodeColor.Black;
node.Uncle.Color = NodeColor.Black;
}
//The rule of two red nodes to the same side
//if the nodes of the tree are stacked in one direction and the two stacked nodes are red
//the middle node comes up to the parent and the top node becomes the left or right hand child.
private void Rule3(RedBlackTreeNode<T> node)
{
//make my grand parent, my parents left|right
//where am i?
NodeSide ns = node.WhichSideAmIOn();
if (node.Parent == null) throw new Exception("um how?");
RedBlackTreeNode<T> parent = node.Parent;
RedBlackTreeNode<T> grandParent = parent.Parent;
RedBlackTreeNode<T> greatGrandParent = grandParent.Parent;
//set my great, grand parent, to point at my parent
NodeSide gpSide = grandParent.WhichSideAmIOn();
if (gpSide == NodeSide.Left)
{
if (greatGrandParent !=null)
greatGrandParent.Left = parent;
}
else
{
if (greatGrandParent != null)
greatGrandParent.Right = parent;
}
//swap my grandparent into my parent's other child
if (ns == NodeSide.Left)
{
//set my parents right to my grandParent
parent.Right = grandParent;
grandParent.Left = null;
}
else if (ns == NodeSide.Right)
{
//set my parent's left to my grandParent
parent.Left = grandParent;
grandParent.Right = null;
}
//reset the parent, update the root
parent.Parent = greatGrandParent;
if (greatGrandParent == null)
{
Root = parent;
}
grandParent.Parent = parent;
//swap colors
parent.Color = NodeColor.Black;
grandParent.Color = NodeColor.Red;
}
//The rule of two red nodes on different sides
//if the nodes of a tree are both red and one goes to the left, but the other goes to the right
//then the middle node becomes the parent and the top node becomes the left or right child
private void Rule4(RedBlackTreeNode<T> node)
{
if (node.Parent == null) throw new Exception("um how?");
RedBlackTreeNode<T> parent = node.Parent;
RedBlackTreeNode<T> grandParent = parent.Parent;
RedBlackTreeNode<T> greatGrandParent = grandParent.Parent;
//fix the reference that will be above me
NodeSide ns;
if (grandParent!= null)
{
ns = grandParent.WhichSideAmIOn();
//replace the reference to my grand parent with me
if (ns == NodeSide.Left)
{
greatGrandParent.Left = node;
}
else if (ns == NodeSide.Right)
{
greatGrandParent.Right = node;
}
}
//put my parent and my grand parent on the
//correct side of me.
ns = node.WhichSideAmIOn();
NodeSide parentSide = parent.WhichSideAmIOn();
if (ns == NodeSide.Left)
{
node.Left = grandParent;
node.Right = parent;
//I was the child of parent, wipe this reference
parent.Left = null;
}
else
{
node.Left = parent;
node.Right = grandParent;
//i was the child of parent, wipe this reference
parent.Right = null;
}
parent.Parent = node;
grandParent.Parent = node;
//parent was the child of grandparent, wipe this reference
if (parentSide == NodeSide.Left) { grandParent.Left = null; }
if (parentSide == NodeSide.Right) { grandParent.Right = null; }
//reset my parent and root
node.Parent = greatGrandParent;
if (greatGrandParent == null)
{
Root = node;
}
//swap colors
node.Color = NodeColor.Black;
grandParent.Color = NodeColor.Red;
}
public void Print()
{
Stack<RedBlackTreeNode<T>> stack = new Stack<RedBlackTreeNode<T>>();
RedBlackTreeNode<T> temp = Root;
while (stack.Count != 0 || temp != null)
{
if (temp != null)
{
stack.Push(temp);
temp = temp.Left;
}
else
{
temp = stack.Pop();
Console.WriteLine(temp.Data.ToString());
temp = temp.Right;
}
}
}
public double Height
{
get
{
Stack<RedBlackTreeNode<T>> stack = new Stack<RedBlackTreeNode<T>>();
RedBlackTreeNode<T> temp = Root;
double currentHeight =0;
while (stack.Count != 0 || temp != null)
{
if (temp != null)
{
stack.Push(temp);
if (temp.Left != null || temp.Right != null)
{
currentHeight++;
}
temp = temp.Left;
}
else
{
temp = stack.Pop();
temp = temp.Right;
}
}
return currentHeight;
}
}
}
class Program
{
static void Main(string[] args)
{
RedBlackTree<int> rbt = new RedBlackTree<int>();
rbt.InsertNode(1);
rbt.InsertNode(2);
rbt.InsertNode(3);
rbt.InsertNode(4);
rbt.InsertNode(5);
rbt.InsertNode(6);
rbt.InsertNode(7);
rbt.InsertNode(8);
rbt.InsertNode(9);
rbt.InsertNode(10);
rbt.Print();
Assert.AreEqual(5, rbt.Height); //make sure sorted vals don't snake off to the left or right
//insert 500 more random numbers, height should remain balanced
Random random = new Random();
for (int i = 0; i < 500; i++)
{
rbt.InsertNode(random.Next(0, 10000));
}
Assert.AreEqual(255, rbt.Height);
}
}
}
I think your test is incorrect, although I think your code has other problems that the test isn't catching.
First of all, the Height property does not actually return the height, but the number of nodes with at least one child. If you want the height of the deepest node then you should do something like currentHeight = Math.Max(currentHeight, stack.Count) on each iteration instead. You may also want it to return an int rather than a double.
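A minimal sketch of a height computation is given below, using hypothetical names (`HeightNode`, `TreeMetrics.Height`). It is recursive on purpose: in the iterative in-order traversal from the question, a node is popped before its right sub-tree is visited, so `stack.Count` does not always include every ancestor and would need extra bookkeeping to serve as a depth.

```csharp
using System;

// Hypothetical minimal node; only the child links matter for the height.
public class HeightNode
{
    public HeightNode Left;
    public HeightNode Right;
}

public static class TreeMetrics
{
    // Height counted in nodes on the longest root-to-leaf path; an empty tree has height 0.
    public static int Height(HeightNode node)
    {
        if (node == null) return 0;
        return 1 + Math.Max(Height(node.Left), Height(node.Right));
    }
}
```

The recursion depth equals the tree height, which stays logarithmic for a correctly balanced tree.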
The number of nodes without children should be approximately half of them like you want, but red-black trees are not perfectly balanced. You can have a valid tree with one third of the nodes having one child, one third having two, and one third having none: start with a perfectly balanced tree with all black nodes at the last level and add a red child to each one. This maintains the red-black tree invariants, but as many as two-thirds of the nodes will have children.
Similarly, if you were to test depth, it would fall between about log2(N) and 2 log2(N + 1), since a red-black tree with N nodes has a height of at most 2 log2(N + 1).
You may want to write tests that verify the invariants of the tree directly. Visit every node in the tree, and verify that every red node has a black parent and that every path to a leaf contains the same number of black nodes. If you run those tests after every insert in your test suite, you can be sure that the tree is always balanced.
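Such an invariant check could be sketched as below. The names `RbColor`, `RbNode` and `RbCheck` are hypothetical and independent of the classes in the question. The sketch verifies only the colour rules (black root, no red node with a red parent, equal black count on every path) and does not check key ordering.

```csharp
using System;

public enum RbColor { Red, Black }

// Hypothetical minimal node for illustration.
public class RbNode
{
    public RbColor Color;
    public RbNode Left;
    public RbNode Right;

    public RbNode(RbColor color, RbNode left = null, RbNode right = null)
    {
        Color = color;
        Left = left;
        Right = right;
    }
}

public static class RbCheck
{
    // Returns the black height of the sub-tree, or -1 if an invariant is violated.
    public static int BlackHeight(RbNode node, bool parentIsRed = false)
    {
        if (node == null) return 1; // null leaves count as black
        bool isRed = node.Color == RbColor.Red;
        if (isRed && parentIsRed) return -1; // a red node must have a black parent
        int left = BlackHeight(node.Left, isRed);
        int right = BlackHeight(node.Right, isRed);
        if (left < 0 || right < 0 || left != right) return -1; // unequal black counts
        return left + (isRed ? 0 : 1);
    }

    // True when the root is black (or the tree is empty) and all colour rules hold.
    public static bool IsValid(RbNode root)
    {
        return root == null || (root.Color == RbColor.Black && BlackHeight(root) > 0);
    }
}
```

Calling `IsValid` after every insert in a test makes a broken rotation or recolouring fail at the insert that caused it, rather than at a final aggregate assertion.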
As for the code itself, your Rebalance method crawls the entire tree on every insert. This means insert will require O(N) time and will negate the benefits of using a self-balancing tree. Retrieval will still be O(log N), but you could get the same result by keeping a sorted list and inserting elements into the appropriate place. You should only have to rebalance the tree along the path being inserted, which will only be O(log N) nodes.
I think some of your transformations are wrong. You don't check the color of the current node before calling Rule2, and that rule appears to change nodes to black without ensuring that other paths in the tree have the same number of black nodes. (I may be misreading it; red-black trees are too complicated to do entirely in my head.)
If you're looking for a reference implementation, the Wikipedia page on Red-black trees has an implementation in C that could easily be translated to C#, and SortedSet<T> is implemented using a red-black tree that you can view with Reflector.
