Is there a method to build a balanced binary search tree?
Example:
1 2 3 4 5 6 7 8 9
        5
       / \
      3   etc
     / \
    2   4
   /
  1
I'm thinking there is a method to do this without using the more complex self-balancing trees. Otherwise I can do it on my own, but someone has probably done this already :)
Thanks for the answers! This is the final Python code:
def _buildTree(self, keys):
    if not keys:
        return None
    middle = len(keys) // 2
    return Node(
        key=keys[middle],
        left=self._buildTree(keys[:middle]),
        right=self._buildTree(keys[middle + 1:])
    )
For each subtree:
Find the middle element of the subtree and put that at the top of the tree.
Find all the elements before the middle element and use this algorithm recursively to get the left subtree.
Find all the elements after the middle element and use this algorithm recursively to get the right subtree.
If you sort your elements first (as in your example) finding the middle element of a subtree can be done in constant time.
This is a simple algorithm for constructing a one-off balanced tree. It is not an algorithm for a self-balancing tree.
Here is some source code in C# that you can try for yourself:
using System.Collections.Generic;
using System.Linq;

public class Program
{
    class TreeNode
    {
        public int Value;
        public TreeNode Left;
        public TreeNode Right;
    }

    // Builds a balanced tree from the sorted range values[min..max), max exclusive.
    TreeNode constructBalancedTree(List<int> values, int min, int max)
    {
        if (min == max)
            return null;
        int median = min + (max - min) / 2;
        return new TreeNode
        {
            Value = values[median],
            Left = constructBalancedTree(values, min, median),
            Right = constructBalancedTree(values, median + 1, max)
        };
    }

    TreeNode constructBalancedTree(IEnumerable<int> values)
    {
        // Sort once and reuse the list so the input is not enumerated twice.
        List<int> sorted = values.OrderBy(x => x).ToList();
        return constructBalancedTree(sorted, 0, sorted.Count);
    }

    void Run()
    {
        TreeNode balancedTree = constructBalancedTree(Enumerable.Range(1, 9));
        // displayTree(balancedTree); // TODO: implement this!
    }

    static void Main(string[] args)
    {
        new Program().Run();
    }
}
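For readers following along in Python (the language of the question's final code), here is a minimal sketch of the same index-range approach, plus one possible way to fill in the display step that the C# leaves as a TODO. The names `build_balanced`, `display_lines` and `height` are made up for this illustration and are not part of the answer above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TreeNode:
    value: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def build_balanced(values: List[int], lo: int, hi: int) -> Optional[TreeNode]:
    """Build a balanced subtree from the sorted range values[lo:hi] (hi exclusive)."""
    if lo == hi:  # empty range, nothing to build
        return None
    mid = lo + (hi - lo) // 2
    return TreeNode(
        value=values[mid],
        left=build_balanced(values, lo, mid),
        right=build_balanced(values, mid + 1, hi),
    )


def display_lines(node: Optional[TreeNode], depth: int = 0) -> List[str]:
    """Render the tree rotated 90 degrees: right subtree on top, root at the left margin."""
    if node is None:
        return []
    return (
        display_lines(node.right, depth + 1)
        + ["    " * depth + str(node.value)]
        + display_lines(node.left, depth + 1)
    )


def height(node: Optional[TreeNode]) -> int:
    """Number of levels in the tree (0 for an empty tree)."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


values = sorted(range(1, 10))
root = build_balanced(values, 0, len(values))
print("\n".join(display_lines(root)))
```

Passing index bounds instead of slices avoids copying the list at every level, which is the same choice the C# version makes with `min` and `max`.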
This paper explains in detail:
Tree Rebalancing in Optimal Time and Space
http://www.eecs.umich.edu/~qstout/abs/CACM86.html
Also here:
One-Time Binary Search Tree Balancing:
The Day/Stout/Warren (DSW) Algorithm
http://penguin.ewu.edu/~trolfe/DSWpaper/
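As a rough illustration of what those two references describe, here is a compact Python sketch of DSW-style rebalancing: first flatten the tree into a right-leaning "vine" with right rotations, then fold the vine back up with passes of left rotations. This is written from the general outline of the algorithm, not copied from either paper; the `Node` class and all function names are assumptions made for the example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_to_vine(pseudo_root):
    """Flatten the tree hanging off pseudo_root.right into a right-only chain; return its size."""
    tail = pseudo_root
    rest = tail.right
    size = 0
    while rest is not None:
        if rest.left is None:
            # No left child: this node is already part of the vine.
            tail = rest
            rest = rest.right
            size += 1
        else:
            # Rotate right so the left child moves up into the vine.
            left = rest.left
            rest.left = left.right
            left.right = rest
            rest = left
            tail.right = left
    return size


def compress(pseudo_root, count):
    """Perform `count` left rotations along the right spine."""
    scanner = pseudo_root
    for _ in range(count):
        child = scanner.right
        scanner.right = child.right
        scanner = scanner.right
        child.right = scanner.left
        scanner.left = child


def dsw_balance(root):
    """Rebalance an existing binary search tree in place and return the new root."""
    pseudo_root = Node(None, right=root)
    size = tree_to_vine(pseudo_root)
    # Largest count of the form 2^k - 1 that does not exceed size.
    full = (1 << ((size + 1).bit_length() - 1)) - 1
    compress(pseudo_root, size - full)  # push the surplus nodes down as bottom-level leaves
    size = full
    while size > 1:
        size //= 2
        compress(pseudo_root, size)
    return pseudo_root.right


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


# A degenerate tree: 1 -> 2 -> ... -> 9, every node has only a right child.
chain = None
for key in range(9, 0, -1):
    chain = Node(key, right=chain)

balanced = dsw_balance(chain)
print(inorder(balanced), height(balanced))
```

Unlike the sort-then-split approach, this works on a tree that already exists and needs no extra array, which is the point of the papers.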
If you really want to do it on-the-fly, you need a self-balancing tree.
If you just want to build a simple tree, without having to go to the trouble of balancing it, just randomize the elements before inserting them into the tree.
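To see the effect of shuffling, here is a small Python sketch that inserts the same keys into a plain, non-balancing binary search tree twice: once in sorted order and once in random order. The `Node`, `insert`, `build` and `height` names are made up for the example, and the fixed seed is only there to make runs repeatable. A shuffled tree is not guaranteed to be balanced; it is just very unlikely to be badly unbalanced.

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain BST insert with no rebalancing (iterative, so deep chains are fine)."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                break
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                break
            node = node.right
    return root


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def height(root):
    """Count levels with a breadth-first walk (avoids deep recursion on chains)."""
    levels = 0
    level = [root] if root is not None else []
    while level:
        levels += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return levels


keys = list(range(1, 1024))

# Sorted input degenerates into a linked list: one level per key.
sorted_height = height(build(keys))

# Shuffled input usually gives a height within a small factor of log2(n).
rng = random.Random(0)  # fixed seed, an assumption made only for repeatability
shuffled = keys[:]
rng.shuffle(shuffled)
shuffled_height = height(build(shuffled))

print(sorted_height, shuffled_height)
```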
Make the median of your data (or more precisely, the nearest element in your array to the median) the root of the tree. And so on recursively.
I am trying to generate a decision tree that will be displayed in a TreeView. This is for a football game developer interface. It will allow the user to add events to particular nodes. The problem I have is generating all of the nodes. When using a linked list one can share nodes where paths cross, but this can't be done in a TreeView, as the nodes get confused. I have an image here:
As you know, football is a goal event game, and once a goal is scored, I move to the next node in the tree. So a score of 0 - 0 is the starting point. That node then splits into 2 nodes, (1 - 0) and (0 - 1). Once on a path, the tree needs to cater for travelling down that path but also cater for 2 - 2, 3 - 2, 3 - 3, etc.
So each node in the tree needs to contain all possible solutions from the previous score. I'm sure you get the idea.
The maximum score, or exit point for the recursion is defined as:
(Home + Away) < 8
I call the recursion routine with:
Recurse(rootNode, 0, 0);
The function CreateNodeFromScore does the fancy node creation and works great.
My recursion code is here:
private void Recurse(TreeNode node, int iHome, int iAway) {
    if ((iHome + iAway) == 8) {
        return;
    }
    node.Nodes.Add(CreateNodeFromScore(iHome, iAway));
    TreeNode nextNode = node.Nodes[0];
    Recurse(nextNode, ++iHome, iAway);
    Recurse(nextNode, iHome, ++iAway);
}
private TreeNode CreateNodeFromScore(int iHome, int iAway) {
    return new TreeNode(iHome.ToString() + " - " + iAway.ToString());
}
I have tried many ways to get this working, but the solution eludes me.
This is an algorithm problem rather than a GUI, TreeView, C# or C++ problem. The code can be pretty much directly translated between the 2 languages.
Can anyone help me?
I'm going to answer my own question here, as I have figured out the answer. Interestingly, posting here has helped me re-think the problem, as I think I was getting confused. My thanks to @dbc for his advice and pointers.
private void Recurse(TreeNode node, int iHome, int iAway) {
    if ((iHome + iAway) > 7)
        return;
    var homeNode = CreateNodeFromScore(iHome + 1, iAway);
    var awayNode = CreateNodeFromScore(iHome, iAway + 1);
    node.Nodes.Add(homeNode);
    node.Nodes.Add(awayNode);
    Recurse(homeNode, iHome + 1, iAway);
    Recurse(awayNode, iHome, iAway + 1);
}
Produces this result:
https://i.imgur.com/ztnbRDA.png
I hope this might be useful for others
Andrea
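Since the question says this is an algorithm problem rather than a TreeView one, here is a GUI-free Python sketch of the corrected recursion, with a couple of checks on the shape of the result. `ScoreNode` is a hypothetical stand-in for the TreeView's TreeNode, and `MAX_GOALS` encodes the (Home + Away) < 8 rule from the question.

```python
MAX_GOALS = 8  # recursion stops once home + away reaches this total


class ScoreNode:
    """Hypothetical stand-in for a TreeView node: a score plus child nodes."""

    def __init__(self, home, away):
        self.home = home
        self.away = away
        self.children = []

    @property
    def label(self):
        return f"{self.home} - {self.away}"


def add_children(node):
    """Give the node one child per possible next goal, then recurse into each child."""
    if node.home + node.away >= MAX_GOALS:
        return
    home_goal = ScoreNode(node.home + 1, node.away)
    away_goal = ScoreNode(node.home, node.away + 1)
    for child in (home_goal, away_goal):
        node.children.append(child)
        add_children(child)


def count_nodes(node):
    return 1 + sum(count_nodes(child) for child in node.children)


def count_label(node, label):
    """How many nodes in the tree carry the given score label."""
    own = 1 if node.label == label else 0
    return own + sum(count_label(child, label) for child in node.children)


root = ScoreNode(0, 0)
add_children(root)
print([child.label for child in root.children])
print(count_nodes(root))
```

Because a tree cannot share nodes, the same score shows up once per path that reaches it. The score 1 - 1, for example, appears twice: once under 1 - 0 and once under 0 - 1.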
In my Unity3d application, I need to detect a polyline that has been selected by a user. The easy way to determine this is to add a collider component to each GameObject (polyline) then I'll know whenever the user clicks a polyline. But this is incredibly inefficient because I will have thousands of polylines.
So my more efficient method is to store each polyline's distance from the point (0,0,0) in a List<KeyValuePair<double,GameObject>>. This list will be ordered from lowest distance to highest. When the user selects a point in the game, I will determine this point's distance (D) from (0,0,0), then use an 'upper bound' binary search to find the polyline closest to this point (i.e., with a similar distance to (0,0,0)).
My question: before I reinvent the wheel and code my own 'upper bound' binary search, element sorting, etc., is there a C# .NET class for upper-bound binary search that will sort and search for me?
I am aware of the method List(T).BinarySearch() but is it up to me to ensure that the List is sorted correctly? If my list isn't sorted, and the method needs to sort the list each method call then that could be rather inefficient.
You could use a SortedList<double, GameObject> to store your polygons sorted, instead of a List<KeyValuePair<double,GameObject>>. Or you can Sort() your List<> once after all polygons have been added (the second option is the best if you are not going to add other polygons later, obviously).
@LeakyCode provided an implementation of lower bound for an IList, which would give you the index (in your list) of the closest GameObject:
private static int BinarySearch<T>(IList<T> list, T value)
{
    if (list == null)
        throw new ArgumentNullException("list");
    if (list.Count == 0)
        return 0; // nothing to search; 0 is the insertion point
    var comp = Comparer<T>.Default;
    int lo = 0, hi = list.Count - 1;
    while (lo < hi) {
        int m = lo + (hi - lo) / 2; // written this way so the sum cannot overflow
        if (comp.Compare(list[m], value) < 0) lo = m + 1;
        else hi = m - 1;
    }
    if (comp.Compare(list[lo], value) < 0) lo++;
    return lo;
}
public static int FindFirstIndexGreaterThanOrEqualTo<T,U>
    (this SortedList<T,U> sortedList, T key)
{
    return BinarySearch(sortedList.Keys, key);
}
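The lower-bound index on its own only gives the first entry that is greater than or equal to the search value; to get the closest entry you still have to compare it with its left neighbour. Here is a short Python sketch of that last step using the standard `bisect` module. The `closest_index` helper and the sample distances are made up for illustration.

```python
from bisect import bisect_left


def closest_index(sorted_distances, target):
    """Index of the entry in an ascending list whose value is nearest to target."""
    if not sorted_distances:
        raise ValueError("sorted_distances must not be empty")
    # Lower bound: first index whose value is >= target.
    i = bisect_left(sorted_distances, target)
    if i == 0:
        return 0
    if i == len(sorted_distances):
        return i - 1
    before = sorted_distances[i - 1]
    after = sorted_distances[i]
    # On a tie the smaller distance wins; that is an arbitrary choice for this sketch.
    return i if after - target < target - before else i - 1


# Hypothetical data: each polyline's distance from (0,0,0), sorted once up front.
distances = [1.0, 2.5, 4.0, 7.5]
names = ["polyline A", "polyline B", "polyline C", "polyline D"]

print(names[closest_index(distances, 3.0)])
```

The list is sorted once when it is built; each lookup is then O(log n) and never re-sorts anything.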
You should sort your list first using the List<T>.Sort method and implement your own IComparer<T> in a class. I think that's the best approach.
References to IComparer
Reference to ISort
I'm searching through a generic list (or IQueryable) which contains 3 columns. I'm trying to find the value of the third column based on the first and second, but the search is really slow. For a single search the time isn't noticeable, but I'm performing this search in a loop, and 700 iterations take a combined time of over 2 minutes, which is far too slow. Columns 1 and 2 are int and column 3 is a double. Here is the LINQ I'm using:
public static Distance FindByStartAndEnd(int start, int end, IQueryable<Distance> distanceList)
{
Distance item = distanceList.Where(h => h.Start == start && h.End == end).FirstOrDefault();
return item ;
}
There could be up to 60,000 entries in the IQueryable list. I know that is quite a lot, but I didn't think it would pose any problem for searching.
So my question is, is there a better way to search through a collection when needing to match 2 columns to get value of a third? I guess I need all 700 searches to be almost instant, but it takes about 300ms for each which soon mounts up.
UPDATE - Final Solution
I've now created a dictionary using Tuple with start and end as the key. I think this could be the right solution.
var dictionary = new Dictionary<Tuple<int, int>, double>();
var key = new Tuple<int, int>(Convert.ToInt32(reader[0]), Convert.ToInt32(reader[1]));
var value = Convert.ToDouble(reader[2]);
if (value <= distance)
{
    dictionary.Add(key, value);
}
var key = new Tuple<int, int>(5, 20);
Works fine - much faster
Create a dictionary where columns 1 and 2 create the key. You create the dictionary once and then your searches will be almost instant.
If you have control over your collection and model classes, there is a library which allows you to index the properties of the class, which can greatly speed up searching.
http://i4o.codeplex.com/
I'd give a HashSet a try. This should speed things up ;)
Create a single value out of the first two columns, for example by concatenating them into a long, and use that as a key in a dictionary:
public static long Combine(int start, int end) {
    // Cast end to uint so a negative value is not sign-extended over the upper 32 bits.
    return ((long)start << 32) | (uint)end;
}
Dictionary<long, Distance> lookup = distanceList.ToDictionary(h => Combine(h.Start, h.End));
Then you can look up the value:
public static Distance FindByStartAndEnd(int start, int end, Dictionary<long, Distance> lookup) {
    Distance item;
    if (!lookup.TryGetValue(Combine(start, end), out item)) {
        item = null;
    }
    return item;
}
Getting an item from a dictionary is close to an O(1) operation, which should make a dramatic difference compared with the O(n) operation of looping through the items to find one.
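The same idea can be sketched in a few lines of Python, where a tuple works directly as a dictionary key, so no bit-packing is needed. `Distance`, `build_lookup` and `find_by_start_and_end` are illustrative names, and the sample rows are made up.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class Distance(NamedTuple):
    start: int
    end: int
    value: float


def build_lookup(rows: Iterable[Distance]) -> Dict[Tuple[int, int], Distance]:
    """Index the rows once by (start, end). If a pair repeats, the last row wins."""
    return {(row.start, row.end): row for row in rows}


def find_by_start_and_end(
    lookup: Dict[Tuple[int, int], Distance], start: int, end: int
) -> Optional[Distance]:
    """Average O(1) hash lookup instead of an O(n) scan; None when the pair is absent."""
    return lookup.get((start, end))


rows = [Distance(1, 2, 10.5), Distance(5, 20, 3.25), Distance(20, 5, 4.0)]
lookup = build_lookup(rows)  # build once, before the loop of 700 searches
print(find_by_start_and_end(lookup, 5, 20))
```

The important part is that the dictionary is built once, outside the loop; rebuilding it for every search would throw the gain away.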
Your problem is that LINQ has to execute the expression tree every time you return the item. Instead, call this method once with multiple start and end values:
public static IEnumerable<Distance> FindByStartAndEnd
    (IEnumerable<KeyValuePair<int, int>> startAndEnd,
    IQueryable<Distance> distanceList)
{
    return
        from item in distanceList
        where
            startAndEnd.Select(s => s.Key).Contains(item.Start)
            && startAndEnd.Select(s => s.Value).Contains(item.End)
        select item;
}
Please help. I've been trying to generate a random binary search tree of size 1024, and the elements need to be a random sorted set. I'm able to write code that creates a binary search tree by adding elements manually, but I'm unable to write code that generates a random balanced binary tree of size 1024 and then tries to find a key in that tree. Please, and thank you in advance.
Edit added code from comments
ya it is homework... and this is what i got so far as code:
using System;
namespace bst {
    public class Node {
        public int value;
        public Node Right = null;
        public Node Left = null;
        public Node(int value)
        {
            this.value = value;
        }
    }
    public class BST {
        public Node Root = null;
        public BST() { }
        public void Add(int new_value)
        {
            if(Search(new_value))
            {
                Console.WriteLine("value (" + new_value + ") already");
            }
            else
            {
                AddNode(this.Root,new_value);
            }
        }
    }
}
Use recursion.
Each branch generates a new branch: select the middle item of the unsorted set (the median) and put it in the current item of the tree. Copy all items less than the median to another array and pass that array to a recursive call of the same method. Then copy all items greater than the median to another array and pass that array to a recursive call of the same method.
A perfectly balanced tree needs an odd number of items (2^k - 1 in total); with any other count, some branches end one level early. You also need to decide, when the median value occurs more than once, whether the duplicates belong on the lower branch or the upper branch. I put duplicates on the upper branch in my example.
The median will be the number where an equal amount of numbers is less than and greater than the number. 1,2,3,3,4,18,29,105,123
In this case, the median is 4, even though the mean (or average) is much higher.
I didn't include code that determines the median.
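void BuildTreeItem(TreeItem item, List<int> set)
{
    int median = DetermineMedian(set); // not shown; must return a value that occurs in set
    item.Value = median;
    var smalls = new List<int>();
    var larges = new List<int>();
    bool medianUsed = false;
    foreach (int value in set)
    {
        if (value < median)
            smalls.Add(value);
        else if (value == median && !medianUsed)
            medianUsed = true;  // this occurrence is the one stored in the current item
        else
            larges.Add(value);  // larger values and duplicates of the median go on the upper branch
    }
    // Only create a child when there is something to put in it.
    if (smalls.Count > 0)
    {
        item.Lower = new TreeItem();
        BuildTreeItem(item.Lower, smalls);
    }
    if (larges.Count > 0)
    {
        item.Upper = new TreeItem();
        BuildTreeItem(item.Upper, larges);
    }
}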
Unless it is homework, the easiest solution would be to sort the data first and then build a tree by using the middle item as the root and descending into each half. The method proposed by Xaade is similar, but much slower due to DetermineMedian complexity.
The other option is to actually look at algorithms that build balanced trees (like http://en.wikipedia.org/wiki/Red-black_tree ) to see if it fits your requirements.
EDIT: removing incorrect statement about speed of Xaade algorithm - it is actually as fast as quick sort (n log n - check each element on every level of recursion with log n levels of recursion), not sure why I estimated it slower.
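A minimal Python sketch of the sort-first approach applied to the question as asked: 1024 distinct random keys, a balanced tree built from the sorted list, and a key lookup. The key range, the fixed seed and all names here are assumptions chosen for the example.

```python
import random


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def build(sorted_keys):
    """Middle key becomes the root; each half is built the same way."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    return Node(
        sorted_keys[mid],
        left=build(sorted_keys[:mid]),
        right=build(sorted_keys[mid + 1:]),
    )


def contains(root, key):
    """Standard BST search: go left for smaller keys, right for larger ones."""
    node = root
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


rng = random.Random(42)  # fixed seed, only so the run is repeatable
keys = sorted(rng.sample(range(1_000_000), 1024))  # 1024 distinct random keys, then sorted
root = build(keys)

print(height(root), contains(root, keys[100]), contains(root, -1))
```

With 1024 keys the tree has 11 levels, so any search touches at most 11 nodes.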
I'm a Linq beginner so just looking for someone to let me know if following is possible to implement with Linq and if so some pointers how it could be achieved.
I want to transform one financial time series list into another where the second series list will be same length or shorter than the first list (usually it will be shorter, i.e., it becomes a new list where the elements themselves represent aggregation of information of one or more elements from the 1st list). How it collapses the list from one to the other depends on the data in the first list. The algorithm needs to track a calculation that gets reset upon new elements added to second list. It may be easier to describe via an example:
List 1 (time ordered from beginning to end series of closing prices and volume):
{P=7,V=1}, {P=10,V=2}, {P=10,V=1}, {P=10,V=3}, {P=11,V=5}, {P=12,V=1}, {P=13,V=2}, {P=17,V=1}, {P=15,V=4}, {P=14,V=10}, {P=14,V=8}, {P=10,V=2}, {P=9,V=3}, {P=8,V=1}
List 2 (series of open/close price ranges and summation of volume for such range period using these 2 param settings to transform list 1 to list 2: param 1: Price Range Step Size = 3, param 2: Price Range Reversal Step Size = 6):
{O=7,C=10,V=1+2+1}, {O=10,C=13,V=3+5+1+2}, {O=13,C=16,V=0}, {O=16,C=10,V=1+4+10+8+2}, {O=10,C=8,V=3+1}
In list 2, I explicitly am showing the summation of the V attributes from list 1 in list 2. But V is just a long so it would just be one number in reality. So how this works is opening time series price is 7. Then we are looking for first price from this initial starting price where delta is 3 away from 7 (via param 1 setting). In list 1, as we move thru the list, the next step is upwards move to 10 and thus we've established an "up trend". So now we build our first element in list 2 with Open=7,Close=10 and sum up the Volume of all bars used in first list to get to this first step in list 2. Now, next element starting point is 10. To build another up step, we need to advance another 3 upwards to create another up step or we could reverse and go downwards 6 (param 2). With data from list 1, we reach 13 first, so that builds our second element in list 2 and sums up all the V attributes used to get to this step. We continue on this process until end of list 1 processing.
Note the gap jump that happens in list 1. We still want to create a step element of {O=13,C=16,V=0}. The V of 0 is simply stating that we have a range move that went thru this step but had Volume of 0 (no actual prices from list 1 occurred here - it was above it but we want to build the set of steps that lead to price that was above it).
Second to last entry in list 2 represents the reversal from up to down.
Final entry in list 2 just uses final Close from list 1 even though it really hasn't finished establishing full range step yet.
Thanks for any pointers of how this could be potentially done via Linq if at all.
My first thought is, why try to use LINQ on this? It seems like a better situation for making a new Enumerable using the yield keyword to partially process and then spit out an answer.
Something along the lines of this:
public struct PricePoint
{
    public ulong price;
    public ulong volume;
}
public struct RangePoint
{
    public ulong open;
    public ulong close;
    public ulong volume;
}
// Step sizes taken from the example in the question
// (declare these in the class that holds calculateRanges).
const ulong STEP = 3;
const ulong REVERSAL_STEP = 6;
public static IEnumerable<RangePoint> calculateRanges(IEnumerable<PricePoint> pricePoints)
{
    if (pricePoints.Any())
    {
        ulong open = pricePoints.First().price;
        ulong volume = pricePoints.First().volume;
        foreach (PricePoint pricePoint in pricePoints.Skip(1))
        {
            volume += pricePoint.volume;
            if (pricePoint.price > open)
            {
                if ((pricePoint.price - open) >= STEP)
                {
                    // We have established an up-trend.
                    RangePoint rangePoint;
                    rangePoint.open = open;
                    rangePoint.close = pricePoint.price;
                    rangePoint.volume = volume;
                    open = pricePoint.price;
                    volume = 0;
                    yield return rangePoint;
                }
            }
            else
            {
                if ((open - pricePoint.price) >= REVERSAL_STEP)
                {
                    // We have established a reversal.
                    RangePoint rangePoint;
                    rangePoint.open = open;
                    rangePoint.close = pricePoint.price;
                    rangePoint.volume = volume;
                    open = pricePoint.price;
                    volume = 0;
                    yield return rangePoint;
                }
            }
        }
        // Emit whatever is left as a final, possibly incomplete, range.
        RangePoint lastPoint;
        lastPoint.open = open;
        lastPoint.close = pricePoints.Last().price;
        lastPoint.volume = volume;
        yield return lastPoint;
    }
}
This isn't yet complete. For instance, it doesn't handle gapping, and there is an unhandled edge case where the last data point might be consumed, but it will still process a "lastPoint". But it should be enough to get started.