Recursion faster than iteration - c#

I've implemented a quadtree in C# and have come across a weird occurrence where recursion seems to perform better than iteration, despite looking like it should be the opposite.
My nodes look like this:
class QuadNode
{
private QuadNode topLeft;
private QuadNode topRight;
private QuadNode bottomRight;
private QuadNode bottomLeft;
// other fields...
}
To traverse the tree I used the following recursive method, which I invoke on the root node:
public void Traverse()
{
    // visit 'this'
    if (topLeft != null)
        topLeft.Traverse();
    if (topRight != null)
        topRight.Traverse();
    if (bottomRight != null)
        bottomRight.Traverse();
    if (bottomLeft != null)
        bottomLeft.Traverse();
}
Mostly out of interest, I tried to create an iterative method for traversing the tree.
I added the following field to each node: private QuadNode next, and when I create the tree I perform a breadth-first traversal using a queue, linking the next field of each node to the next node in line. Essentially I created a singly-linked list from the nodes of the tree.
At this point I am able to traverse the tree with the following method:
public void Traverse()
{
    QuadNode node = this;
    while (node != null)
    {
        // visit node
        node = node.next;
    }
}
After testing the performance of each method I was very surprised to learn that the iterative version is consistently and noticeably slower than the recursive one. I've tested this on both huge trees and small trees and the recursive method is always faster. (I used a Stopwatch for benchmarking.)
I've confirmed that both methods traverse the entire tree successfully and that the iterative version only visits each node exactly once as planned, so it's not a problem with the linking between them.
It seems obvious to me that the iterative version would perform better... what could be the cause of this? Am I overlooking some obvious reason as to why the recursive version is faster?
I'm using Visual Studio 2012, compiled under Release, Any CPU ("Prefer 32-bit" unchecked).
Edit:
I've opened a new project and created a simple test which also confirms my results.
Here's the full code: http://pastebin.com/SwAsTMjQ
The code isn't commented but I think it's pretty self-documenting.

Cache locality is killing speed. Try:
public void LinkNodes()
{
var queue = new Queue<QuadNode>();
LinkNodes(queue);
QuadNode curr = this;
foreach (var item in queue)
{
curr.next = item;
curr = item;
}
}
public void LinkNodes(Queue<QuadNode> queue)
{
queue.Enqueue(this);
if (topLeft != null)
topLeft.LinkNodes(queue);
if (topRight != null)
topRight.LinkNodes(queue);
if (bottomRight != null)
bottomRight.LinkNodes(queue);
if (bottomLeft != null)
bottomLeft.LinkNodes(queue);
}
Now the iterative version should be 30-40% faster than the recursive version.
The iterative version was slow because your linked list walks the nodes breadth-first instead of depth-first. You created your elements depth-first, so they are laid out depth-first in memory. My algorithm builds the traversal list depth-first, so consecutive nodes in the list are also close together in memory.
(Note that I used a Queue in LinkNodes() to make it easier to follow, but in truth you could do it without one.)
public QuadNode LinkNodes(QuadNode prev = null)
{
if (prev != null)
{
prev.next = this;
}
QuadNode curr = this;
if (topLeft != null)
curr = topLeft.LinkNodes(curr);
if (topRight != null)
curr = topRight.LinkNodes(curr);
if (bottomRight != null)
curr = bottomRight.LinkNodes(curr);
if (bottomLeft != null)
curr = bottomLeft.LinkNodes(curr);
return curr;
}
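To check that the depth-first linking above really visits nodes in the same order as the recursive traversal, here is a minimal sketch. The stripped-down `QuadNode`, the `Id` field, the `TraverseLinked` method, and the `QuadTreeDemo.Build` helper are illustrative assumptions and are not taken from the original pastebin code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, stripped-down node: four children, the 'Next' link,
// and an Id that records creation order.
class QuadNode
{
    public int Id;
    public QuadNode TopLeft, TopRight, BottomRight, BottomLeft;
    public QuadNode Next;

    // Recursive depth-first (pre-order) traversal.
    public void Traverse(List<int> visited)
    {
        visited.Add(Id);
        if (TopLeft != null) TopLeft.Traverse(visited);
        if (TopRight != null) TopRight.Traverse(visited);
        if (BottomRight != null) BottomRight.Traverse(visited);
        if (BottomLeft != null) BottomLeft.Traverse(visited);
    }

    // Threads 'Next' through the tree in the same pre-order.
    // Returns the last node that was linked.
    public QuadNode LinkNodes(QuadNode prev = null)
    {
        if (prev != null) prev.Next = this;
        QuadNode curr = this;
        if (TopLeft != null) curr = TopLeft.LinkNodes(curr);
        if (TopRight != null) curr = TopRight.LinkNodes(curr);
        if (BottomRight != null) curr = BottomRight.LinkNodes(curr);
        if (BottomLeft != null) curr = BottomLeft.LinkNodes(curr);
        return curr;
    }

    // Iterative traversal that simply follows the 'Next' links.
    public void TraverseLinked(List<int> visited)
    {
        for (QuadNode node = this; node != null; node = node.Next)
            visited.Add(node.Id);
    }
}

static class QuadTreeDemo
{
    // Builds a full tree of the given depth; nodes are created (and numbered) depth-first.
    public static QuadNode Build(int depth, ref int nextId)
    {
        var node = new QuadNode { Id = nextId++ };
        if (depth > 0)
        {
            node.TopLeft = Build(depth - 1, ref nextId);
            node.TopRight = Build(depth - 1, ref nextId);
            node.BottomRight = Build(depth - 1, ref nextId);
            node.BottomLeft = Build(depth - 1, ref nextId);
        }
        return node;
    }
}
```

For a tree built with `Build(2, ref id)` and linked with `LinkNodes()`, both `Traverse` and `TraverseLinked` should record the ids in creation order. Whether that order is also the memory order depends on the allocator and the garbage collector, so any speed-up should be confirmed by measuring.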

Looking at your code, both methods seem to do the same work, BUT the recursive one handles up to 4 child nodes per call in straight-line code, so it does not "jump" between the tests, whereas the iterative one "jumps" back to the beginning of the loop after every single node.
I'd say that if you want to see similar behaviour you'll have to unroll the iterative loop into something like:
public void Traverse()
{
    QuadNode node = this;
    while (node != null)
    {
        // visit node
        node = node.next;
        if (node != null) { /* visit node */ node = node.next; }
        if (node != null) { /* visit node */ node = node.next; }
        if (node != null) { /* visit node */ node = node.next; }
    }
}

Related

Is there any benefit to using LINQ to get the first (and only) element in IEnumerable<T>?

Is there any significant difference between these two lines?
var o = xmlFile.Descendants("SomeElement").ElementAt(0).Value;
And:
var o = xmlFile.Descendants("SomeElement").First().Value;
XmlFile is an XDocument object, and Descendants(XName name) returns IEnumerable<XElement>.
I know First(); will throw an exception if the collection is empty and you might want to use FirstOrDefault(); but that's fine in this case; I already validate my XDocument object against an XmlSchemaSet, so I know the element exists. I suppose directly accessing Value would throw an exception either way if the collection was empty, as ElementAt(0) wouldn't return anything either.
But yea; I, obviously, don't like adding using directives if I don't need to. Is there any reason one might want to use LINQ in this case? I can't imagine there's any real performance difference in either case.
I ask because the user is able to upload a zip file containing any number of XML files that need to be processed. 1 "record" per XML file.
EDIT: My original question was going to be "How do you get the first element from an IEnumerable without adding using System.Linq;?", but then I found ElementAt, not realizing that both methods are part of LINQ.
So I guess really what I want to know is, would there be a difference between either snippet above and this:
var descendants = xmlFile.Descendants("SomeElement");
var enumerator = descendants.GetEnumerator();
var node = (enumerator.MoveNext()) ? enumerator.Current : null;
I'd definitely say LINQ is much more readable, and for that alone is probably worth using. But again, the user can upload I think up to a 10 MB zip file and each of these XML files ranges from about 2 kilobytes to 10 kilobytes, depending on which schema it is. So that's a good number of files.
Check the source. Both ElementAt and First are extension methods defined on System.Linq.Enumerable (as noted by Lee in the question comments).
Update
I included the implementation for Single as well, as it was discussed it would be a better option for this specific problem. Fundamentally this comes down to readability and exceptions that are thrown, as they all use the same way of accessing the first element.
public static TSource First<TSource>(this IEnumerable<TSource> source) {
if (source == null) throw Error.ArgumentNull("source");
IList<TSource> list = source as IList<TSource>;
if (list != null) {
if (list.Count > 0) return list[0];
}
else {
using (IEnumerator<TSource> e = source.GetEnumerator()) {
if (e.MoveNext()) return e.Current;
}
}
throw Error.NoElements();
}
public static TSource ElementAt<TSource>(this IEnumerable<TSource> source, int index) {
if (source == null) throw Error.ArgumentNull("source");
IList<TSource> list = source as IList<TSource>;
if(list != null) return list[index];
if (index < 0) throw Error.ArgumentOutOfRange("index");
using (IEnumerator<TSource> e = source.GetEnumerator()) {
while (true) {
if (!e.MoveNext()) throw Error.ArgumentOutOfRange("index");
if (index == 0) return e.Current;
index--;
}
}
}
public static TSource Single<TSource>(this IEnumerable<TSource> source) {
if (source == null) throw Error.ArgumentNull("source");
IList<TSource> list = source as IList<TSource>;
if (list != null) {
switch (list.Count) {
case 0: throw Error.NoElements();
case 1: return list[0];
}
}
else {
using (IEnumerator<TSource> e = source.GetEnumerator()) {
if (!e.MoveNext()) throw Error.NoElements();
TSource result = e.Current;
if (!e.MoveNext()) return result;
}
}
throw Error.MoreThanOneElement();
}
The only real difference is the name, but it's important anyway. If you only want the first item, use Enumerable.First/FirstOrDefault; if you want the first but maybe later also the second, third, etc., then use ElementAt/ElementAtOrDefault.
The intention should be self explanatory. Readability is the key factor here.
You can find source code here, for example:
Enumerable.ElementAt and Enumerable.First
You can see that both methods are optimized for collections that support access via index.
The other answers here point out that both options you've presented actually use LINQ. But your updated question asks if this is equivalent to the original LINQ call:
var descendants = xmlFile.Descendants("SomeElement");
var enumerator = descendants.GetEnumerator();
var node = (enumerator.MoveNext()) ? enumerator.Current : null;
Well, no, not quite. Firstly, note that IEnumerator<T> implements IDisposable, but your code never calls Dispose (although I doubt that would actually have any effect in this case). Secondly, your code handles empty data sets differently from either of those LINQ methods (your implementation is more like FirstOrDefault). A more equivalent version would be:
XElement node;
using (var enumerator = xmlFile.Descendants("SomeElement").GetEnumerator())
{
if (!enumerator.MoveNext())
{
throw new Exception(...);
}
node = enumerator.Current;
}
Or without the using:
XElement node;
var enumerator = xmlFile.Descendants("SomeElement").GetEnumerator();
try {
if (!enumerator.MoveNext()) { throw new Exception(...); }
node = enumerator.Current;
} finally {
enumerator.Dispose();
}
But in truth, we don't need the Enumerator at all. We can get rid of the call to Descendants like this:
var n = xmlFile.FirstNode;
var node = n as XElement;
while (node == null && n != null)
{
node = (n = n.NextNode) as XElement;
}
while (node != null && node.Name != "SomeElement")
{
node = (n = node.FirstNode ?? node.NextNode ?? node.Parent?.NextNode) as XElement;
while (node == null && n != null)
{
node = (n = n.NextNode) as XElement;
}
}
if (node == null)
{
throw new Exception("");
}
Now, if you profile this, you'll find some marginal performance boost with the more complex solutions. Here's the results of a fairly basic benchmark I put together (first column is without compiler optimizations, second column is with compiler optimizations):
Method        Mean (/o-)   Mean (/o+)
First()       0.1468333    0.1414340
ElementAt()   0.1452045    0.1419018
No Linq       0.1334992    0.1259622
While Loop    0.0895821    0.0693819
However, saving a few processor cycles usually isn't your biggest concern in enterprise-level applications. Given the typical costs for maintaining code, you should generally try to optimize for readability, and in my opinion, this is a lot easier to read:
var node = xmlFile.Descendants("SomeElement").First();
They can be used interchangeably since they both are defined in System.Linq.Enumerable.
But here some minor differences:
1) If no results are returned, .First will throw an exception.
2) .ElementAt(0) will throw an exception if the indexer is out of bounds.
Both of these exceptions can be avoided by using FirstOrDefault() and/or ElementAtOrDefault(0)
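The difference in exception behaviour can be seen in a short sketch. The `FirstElementDemo` class, the `Describe` helper, and the sample XML are made up for illustration; only `First`, `ElementAt`, `FirstOrDefault`, and `Descendants` are real library calls.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class FirstElementDemo
{
    // Hypothetical helper: returns the element's value, "null" when no element
    // was returned, or the name of the exception type that was thrown.
    public static string Describe(Func<XElement> getElement)
    {
        try
        {
            XElement element = getElement();
            return element == null ? "null" : element.Value;
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    public static void Run()
    {
        XDocument doc = XDocument.Parse("<root><SomeElement>42</SomeElement></root>");

        // Element present: both calls return the same element.
        Console.WriteLine(Describe(() => doc.Descendants("SomeElement").First()));      // 42
        Console.WriteLine(Describe(() => doc.Descendants("SomeElement").ElementAt(0))); // 42

        // Element missing: the two methods throw different exception types.
        Console.WriteLine(Describe(() => doc.Descendants("Missing").First()));          // InvalidOperationException
        Console.WriteLine(Describe(() => doc.Descendants("Missing").ElementAt(0)));     // ArgumentOutOfRangeException

        // The OrDefault variant returns null instead of throwing.
        Console.WriteLine(Describe(() => doc.Descendants("Missing").FirstOrDefault())); // null
    }
}
```

So when the element is guaranteed to exist, the choice comes down to which name reads better and which exception you would rather see if the guarantee is ever broken.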

A special C# Tree algorithm in Umbraco CMS

I'm creating a special tree algorithm and I need a bit of help with the code that I currently have, but before you take a look on it please let me explain what it really is meant to do.
I have a tree structure and I'm interacting with one of its nodes (the nodes are Umbraco CMS classes). On each interaction I walk the tree up to the root and collect the visited nodes in a global collection (a List<Node> in this particular case). So far so good, but on a later interaction with another node I must check whether the list already contains the parents of the clicked node. If it contains every parent but not the node itself, then the interaction is on the lowest level (I hope you are still with me?).
Unfortunately, calling Contains() on the list doesn't detect that the nodes are already in it, so the same values get added all over again even though I added the Contains() check.
Can anyone give me a hand here if they have already run into this problem? I replaced the Contains() call with Except and Union, and they yield the same result: the list still contains duplicates.
var currentValue = (string)CurrentPage.technologies;
List<Node> globalNodeList = new List<Node>();
string[] result = currentValue.Split(',');
foreach (var item in result)
{
var node = new Node(int.Parse(item));
if (globalNodeList.Count > 0)
{
List<Node> nodeParents = new List<Node>();
if (node.Parent != null)
{
while (node != null)
{
if (!nodeParents.Contains(node))
{
nodeParents.Add(node);
}
node = (Node)node.Parent;
}
}
else { globalNodeList.Add(node); }
if (nodeParents.Count > 0)
{
var differences = globalNodeList.Except<Node>(globalNodeList);
globalNodeList = globalNodeList.Union<Node>(differences).ToList<Node>();
}
}
else
{
if (node.Parent != null)
{
while (node != null)
{
globalNodeList.Add(node);
node = (Node)node.Parent;
}
}
else
{
globalNodeList.Add(node);
}
}
}
}
If I understand your question, you only want to see if a particular node is an ancestor of another node. If so, just check the Path property of the node as a string. The Path property is a comma-separated string of ids, so there is no need to build the list yourself.
Just myNode.Path.Contains(",1001") will work.
Small remarks.
If you are using Umbraco 6, use the IPublishedContent instead of Node.
If you do want to build a list like this, I would rather pass multiple ids to the Umbraco helper and let Umbraco build the list (from cache).
For the second remark, you are able to do this:
var myList = Umbraco.Content(1001,1002,1003);
or with an array/list
var myList = Umbraco.Content(someNode.Path.Split(','));
and because you are crawling up to the root, you might need to add a .Reverse()
More information about the UmbracoHelper can be found in the documentation: http://our.umbraco.org/documentation/Reference/Querying/UmbracoHelper/
If you are using Umbraco 4 you can use #Library.NodesById(...)
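One caveat with a plain substring test on the path: ",1001" also matches an id such as 10015. A minimal sketch of a stricter check follows. It uses no Umbraco types, and it assumes the path looks like "-1,1001,1050" with the node's own id as the last segment; `PathCheck` and `HasAncestor` are hypothetical names.

```csharp
using System;
using System.Globalization;

static class PathCheck
{
    // Returns true when 'ancestorId' appears as a whole segment of the
    // comma-separated path, ignoring the last segment (assumed to be the
    // node's own id).
    public static bool HasAncestor(string path, int ancestorId)
    {
        if (path == null) throw new ArgumentNullException("path");

        string[] ids = path.Split(',');
        string wanted = ancestorId.ToString(CultureInfo.InvariantCulture);

        for (int i = 0; i < ids.Length - 1; i++)
        {
            if (ids[i] == wanted) return true;
        }
        return false;
    }
}
```

With these assumptions, `PathCheck.HasAncestor("-1,10015,1050", 1001)` is false, while `"-1,10015,1050".Contains(",1001")` is true.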

Binary search tree iterator c#

I have a problem with the MoveNext method in my enumerator. I need an iterator for a binary search tree. In the constructor of my enumerator I initialize node to the root of the tree.
Current is the value that I need to return for the next item. This code for MoveNext returns wrong values.
public bool MoveNext()
{
if (Current == null)
Current = node.Value;
else if (node.Left != null)
{
node = node.Left;
Current = node.Value;
}
else if (node.Right != null)
{
node = node.Right;
Current = node.Value;
}
else
{
node.Value = Current;
do
{
if (node.Parent == null)
return false;
node = node.Parent;
} while (node.Right == null);
Current = node.Value;
}
return true;
}
I see a few issues with that code. First, in the else-branch, you are changing the value of a node in the tree - You probably meant to write Current = node.Value; instead of node.Value = Current;.
However, that's not the main issue. Your iterator will get stuck in an infinite loop really easily. Everything looks reasonable for traversing down, you take the leftmost path down to a leaf node.
Then you backtrack up until you find an ancestor node which has a Right child and yield the value of that node. However, this value was already returned by the iterator on the way down. Also, you don't remember which path you already traversed down, so on the next step you will inevitably follow the same path down again that you took before, then you'll backtrack up again and so on, ad infinitum.
In order to fix this, don't stop at the parent node when you backtrack - take the first step down the next path already. It is true that this will always be the Right child of some node, but it is not necessarily the Right child of the first ancestor that has one, because you might already be backtracking up from that branch.
So to summarize: If you can't go down any further, backtrack up one level. Check if you are coming from the Left or the Right child node. If you came from the left one, go down the right one if it exists, and set Current to its value. If it doesn't, or if you already come from the right child, recurse upwards another level.
Your enumerator modifies the tree:
Node.Value = Current;
Enumerators shouldn't do that.
In the last else you are changing the value of the node to the same value as the current node:
node.Value = Current;
I think that you tried to put the current node in the node variable, but that's not done by putting the current value in the value of the current node, and it's not needed as the node variable already contains the current node.
As Medo42 pointed out, if you are coming from the Right node, all children of that parent have already been iterated, so you should check for that when looking for a parent to continue iterating from:
} while (node.Right == null || node.Right == last);
When you have looped up the parent chain to find the right parent, you are using the parent instead of getting the child:
node = node.Right;
So:
public bool MoveNext() {
if (Current == null)
Current = node.Value;
else if (node.Left != null)
{
node = node.Left;
Current = node.Value;
}
else if (node.Right != null)
{
node = node.Right;
Current = node.Value;
}
else
{
Node last;
do
{
if (node.Parent == null)
return false;
last = node;
node = node.Parent;
} while (node.Right == null || node.Right == last);
node = node.Right;
Current = node.Value;
}
return true;
}
I posted my solution here: Binary Search Tree Iterator java
Source code:
https://github.com/yan-khonski-it/bst/blob/master/bst-core/src/main/java/com/yk/training/bst/iterators/BSTIterator.java
Here is the algorithm (you can then implement it in any language you like).
ArrayIterator
A BSTIterator that uses an array under the hood.
Perform an in-order traversal and collect the visited nodes into a list.
The iterator then works like a list iterator, which is easy to implement.
next() and hasNext() have constant time complexity; however, this iterator requires memory for all N elements of the tree.
StackIterator
All nodes still to be returned by next() are stored in a stack. Each time you return the next node, you pop it from the stack (let's call it currentNode).
If currentNode has a right child (its left subtree has already been returned), you push the right child onto the stack. Then you walk down the left branch of that right child and push every left node onto the stack.
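The StackIterator description could look roughly like this in C#. This is a sketch, not a port of the linked Java code: `TreeNode`, `BstIterator`, and `PushLeftSpine` are names chosen here, the node holds a plain `int`, and the iterator exposes `HasNext`/`Next()` rather than implementing `IEnumerator<T>`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal tree node; no Parent pointer is needed for this approach.
class TreeNode
{
    public int Value;
    public TreeNode Left, Right;
}

// In-order iterator that keeps the nodes still to be returned on an explicit stack.
class BstIterator
{
    private readonly Stack<TreeNode> stack = new Stack<TreeNode>();

    public BstIterator(TreeNode root)
    {
        PushLeftSpine(root);
    }

    public bool HasNext
    {
        get { return stack.Count > 0; }
    }

    public int Next()
    {
        if (stack.Count == 0)
            throw new InvalidOperationException("No more elements.");

        // The top of the stack is the smallest value not yet returned.
        TreeNode current = stack.Pop();

        // Its left subtree is already done; queue up the right subtree.
        PushLeftSpine(current.Right);
        return current.Value;
    }

    // Pushes 'node' and all of its left descendants onto the stack.
    private void PushLeftSpine(TreeNode node)
    {
        while (node != null)
        {
            stack.Push(node);
            node = node.Left;
        }
    }
}
```

The stack never holds more nodes than the height of the tree, so this uses less memory than the array-based variant, at the cost of `Next()` being constant time only on average.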

Why can't I do currentNode = currentNode.Next.Next?

I have made my own single chained/linked list.
Now, if I want to delete/remove a node/item from my list, I'd have to do something like this:
public void Delete(PARAMETERS)
{
Node previousNode = null,
currentNode = f;
while (currentNode != null)
{
if (SOMECONDITION)
{
if (previousNode == null)
{
f = currentNode.Next;
}
else
{
previousNode.Next = currentNode.Next;
}
}
else
{
previousNode = currentNode;
}
currentNode = currentNode.Next;
}
}
If SOMECONDITION is true, you simply skip the currentNode, thereby effectively "deleting" the node, as nothing points to it anymore.
But, I am really wondering, why can I not do something like this:
(...)
while ()
{
if (SOMECONDITION)
{
currentNode = currentNode.Next;
}
currentNode = currentNode.Next;
}
(...)
OR perhaps:
(...)
while ()
{
if (SOMECONDITION)
{
currentNode = currentNode.Next.Next;
}
else
{
currentNode = currentNode.Next;
}
}
(...)
What fundamental understanding do I lack?
Doing:
currentNode = currentNode.Next.Next;
Is a prime candidate for a NullReferenceException
EDIT:
Here's a list implementation with some pictures that may help you understand.
http://www.csharpfriends.com/Articles/getArticle.aspx?articleID=176
There is nothing to say you can't do Next.Next.
The only issue is what if currentNode.Next is null? Then you would get an error.
PreviousNode works because you are doing a NULL check before using it.
currentNode is just a temporary pointer variable (reference) that ceases to exist at the end of the scope (that is by the next closing brace). When you change what that reference points to, you don't change any other references; changing currentNode doesn't magically change what the previous node's Next reference points to.
currentNode = currentNode.Next // only changes the temporary reference
You have to actually reach into the linked list and change a reference inside the list, which is what you do when you change previousNode.Next: you change what node the previous node considers its next node. You basically tell it "This is your new Next node, forget about the old one".
Also, as the others have stated, you should check for null references throughout. If currentNode.Next is the last node in the list, its Next will point at nothing, and you'll get a NullReferenceException.
Perhaps if you re-write the original a bit you would see better what you are really doing to the list.
public void Delete(PARAMETERS)
{
var previous = FindPreviousNode(PARAMETERS);
if( previous == null && Matches(f, PARAMETERS)) {
f = f.Next;
} else if(previous != null ) {
previous.Next = previous.Next.Next;
} // you could add "else { throw new NodeNotFound(); }" if that's appropriate
}
private Node FindPreviousNode(PARAMETERS) {
Node currentNode = f;
while (currentNode != null) {
if (Matches(currentNode.Next, PARAMETERS)) {
return currentNode;
}
currentNode = currentNode.Next;
}
return null;
}
You have asked around in the comments to understand more what's up with the list and the Next's properties, so here it goes:
Let's say the list is 1|3|5|7: first points to 1, 1's Next property points to 3, 3's Next points to 5, 5's Next points to 7, and 7's Next points to null. That's all you keep track of to store the list. If you set 5's Next property to null, you are deleting the 7. If instead you set 3's Next property to 7, you are deleting the 5 from the list. If you set first to 3, you are deleting the 1.
It's all about the first and the Next properties. That's what makes the list.
The assignments to currentNode and previousNode do not alter the structure of the linked list. They're merely used to step through the structure.
The assignment to previousNode.Next is what changes the structure. Doing currentNode = currentNode.Next.Next will skip over the next node (if currentNode.Next is not null) but it won't alter the structure of the list.
You should really sketch a picture of the linked list if you're wondering about problems like this. It's far easier to see what needs to be done to accomplish some linked list mutation, than it is to reason it out.
Honestly, I don't follow the posted code at all.
If this is a standard linked list (each node has a Next, but that's it), follow these steps to run a deletion of a single item:
Step 1: Find the target node you want to delete, but keep track of the previous node visited.
Step 2: prevNode.Next = targetNode.Next
Note: special checks for deleting the head of the list need to be done.
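The two steps and the head special case can be sketched as follows. `ListNode`, `SinglyLinkedList`, `AddLast`, and `ToList` are illustrative names, and matching on an `int` value stands in for whatever condition the real list uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal node: a value and a Next reference, nothing else.
class ListNode
{
    public int Value;
    public ListNode Next;
}

class SinglyLinkedList
{
    private ListNode first;

    public void AddLast(int value)
    {
        ListNode node = new ListNode { Value = value };
        if (first == null)
        {
            first = node;
            return;
        }
        ListNode tail = first;
        while (tail.Next != null) tail = tail.Next;
        tail.Next = node;
    }

    // Removes the first node holding 'value'. Returns false if no such node exists.
    public bool Delete(int value)
    {
        // Step 1: find the target while remembering the node before it.
        ListNode previous = null;
        ListNode current = first;
        while (current != null && current.Value != value)
        {
            previous = current;   // only moves local references, the list is unchanged
            current = current.Next;
        }
        if (current == null) return false;

        // Step 2: rewire a reference that lives INSIDE the list.
        if (previous == null)
            first = current.Next;          // special case: deleting the head
        else
            previous.Next = current.Next;  // bypass the target node
        return true;
    }

    public List<int> ToList()
    {
        var values = new List<int>();
        for (ListNode node = first; node != null; node = node.Next)
            values.Add(node.Value);
        return values;
    }
}
```

Only the assignments to `first` and `previous.Next` change the list; every assignment to `current` or `previous` just moves a local reference along it.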
How do you know in both cases that currentNode.Next is not null and thus that you can apply .Next on it? You are only checking for the != null in the loop condition.

LinkedList<T> (2.0): removing items iteratively

I need to iterate through a LinkedList<T> (in .NET 2.0) and remove all the items that match a given criterion.
This was easy in Java, since I could do the following:
Iterator<E> i = list.iterator();
while (i.hasNext()) {
E e = i.next();
if (e == x) {
// Found, so move it to the front,
i.remove();
list.addFirst(x);
// Return it
return x;
}
}
Unfortunately, in the .NET behavior of IEnumerator<T> (the equivalent of Iterator<E>) there's no remove method to remove the current element from the collection.
Also, in the LinkedList<T> there's no way to access an element at a given index, to accomplish the task by iterating back from the last to the first.
Have you got any idea on how to do it? Thank you very much!
This will remove all nodes that match a criterion, in one loop through the linked list.
LinkedListNode<E> node = list.First;
while (node != null)
{
    // Remember the successor first: removing a node detaches it from the list.
    var next = node.Next;
    if (node.Value == x) {
        list.Remove(node);
    }
    node = next;
}
I believe that's what you're attempting... You also added back in the node at the beginning of the list (so your java code didn't remove all of the nodes, but rather moved the first matching to the beginning of the list). That would be easy to do with this approach, as well.
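A self-contained sketch of the same "capture the next node before removing" pattern, generalized to a predicate. The `LinkedListUtil` class and `RemoveAll` method are hypothetical names; `LinkedList<T>.First`, `LinkedListNode<T>.Next`, and `LinkedList<T>.Remove(LinkedListNode<T>)` are the real APIs, all available since .NET 2.0.

```csharp
using System;
using System.Collections.Generic;

static class LinkedListUtil
{
    // Removes every node whose value satisfies 'match' in a single pass.
    // Returns the number of nodes removed.
    public static int RemoveAll<T>(LinkedList<T> list, Predicate<T> match)
    {
        if (list == null) throw new ArgumentNullException("list");
        if (match == null) throw new ArgumentNullException("match");

        int removed = 0;
        LinkedListNode<T> node = list.First;
        while (node != null)
        {
            // Capture the successor before removing: once a node is removed
            // from the list, its Next property no longer leads anywhere.
            LinkedListNode<T> next = node.Next;
            if (match(node.Value))
            {
                list.Remove(node);
                removed++;
            }
            node = next;
        }
        return removed;
    }
}
```

Because `Remove(LinkedListNode<T>)` is an O(1) operation, the whole pass is linear in the length of the list, unlike repeatedly calling `Remove(T)`, which searches from the start each time.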
It's actually a lot easier in C#.
public static T PlaceAtHead<T>(LinkedList<T> list, T x)
{
    list.Remove(x);    // removes the first occurrence of x, if there is one
    list.AddFirst(x);
    return x;
}
One ugly option is to iterate through your list, find all the items that apply and store them in a list. Then iterate through your second list and call remove on your LinkedList...
I'm hoping someone else has a more elegant solution :)
Just a little addition to Reed Copsey's answer with a predicate:
public static T MoveAheadAndReturn<T>(LinkedList<T> ll, Predicate<T> pred)
{
    if (ll == null)
        throw new ArgumentNullException("ll");
    if (pred == null)
        throw new ArgumentNullException("pred");
    LinkedListNode<T> node = ll.First;
    while (node != null)
    {
        if (pred(node.Value))
        {
            // Move the first matching node to the front and return its value.
            ll.Remove(node);
            ll.AddFirst(node);
            return node.Value;
        }
        node = node.Next;
    }
    // No match: return default(T) rather than the last value visited.
    return default(T);
}
