.NET equivalent of Java's TreeSet.floor & TreeSet.ceiling - c#

As an example, there's a Binary Search Tree which holds a range of values. Before adding a new value, I need to check if it already contains it's 'almost duplicate'. I have Java solution which simply performs floor and ceiling and further condition to do the job.
JAVA: Given a TreeSet, floor() returns the greatest element in this set less than or equal to the given element; ceiling() returns the least element in this set greater than or equal to the given element
TreeSet<Long> set = new TreeSet<>();
long l = (long)1; // anything
Long floor = set.floor(l);
Long ceil = set.ceiling(l);
C#: Closest data structure seems to be SortedSet<>. Could anyone advise the best way to get floor and ceil results for an input value?
SortedSet<long> set = new SortedSet<long>();

The above, as mentioned, is not the answer since this is a tree we expect logarithmic times. Java's floor and ceiling methods are logarithmic. GetViewBetween is logarigmic and so are Max and Min, so:
floor for SortedSet<long>:
sortedSet.GetViewBetween(long.MinValue, num).Max
ceiling for SortedSet<long>:
sortedSet.GetViewBetween(num, long.MaxValue).Min

You can use something like this. In Linq there is LastOrDefault method:
var floor = sortedSet.LastOrDefault(i => i < num);
// num is the number whose floor is to be calculated
if (! (floor < sortedSet.ElementAt(0)))
{
// we have a floor
}
else
// nothing is smaller in the set
{
}

Related

Compare if elements are almost equal in a list in C# .NET

I am still very beginner in C# and .NET and just need to do this simple test.
var odds = new System.Collections.Generic.List<double>();
// here is a code which adds the values in the list
foreach(var odd in odds)
{
System.Console.WriteLine(odd);
}
and the output is something like that:
13.098252624859418
14.098252624859349
13.098252624859577
13.098252624853423
14.098252624859398
So I would like to compare all the values inside the list if they are almost equal. That means even if there is a little difference between the numbers (such as 13 and 14) inside the list still to be acceptable so I would like this difference to be maximum of 2.
Check the difference between the maximum value and the minimum value in the list (2 in your case). Using a tolerance value. For example
double delta = 2;
// getting largest element
var maxNum = odds.Max();
// getting smallest element
var minNum = odds.Min();
var almostEqual = maxNum - minNum <= delta;
You'll need to do it manually, as is recommended with every floating point number comparison (because floating point math is unintuitive), doing that is quite simple, something like this:
var a = 13.098252624859418;
var b = 14.098252624859398;
// define your acceptable range, i.e 1.0 means number 1.0 larger and smaller are equal to one another
var delta = 1.0;
var areNearlyEqual = Math.Abs(a - b) <= delta; // true
Now if you want to check if every element in a List is nearly equal to every other element, there is a naïve and more "complicated" solution, I'll start with the naïve one:
(Don't actually use this implementation, this is for illustration purposes of how to check equality of all items in a list which aren't just numbers)
var allAreNearlyEqual = true; // Let's start of assuming all are equal
foreach (var x in odds)
{
if (!allAreNearlyEqual)
break;
foreach (var y in odds)
{
if (!Math.Abs(x - y) <= delta)
allAreNearlyEqual = false;
}
}
Console.WriteLine(allAreNearlyEqual);
As you can see we need to iterate over every element in the list (x) and compare it to every other element in the list (y), there is an easier to read (and also faster*) version of this:
var max = odds.Max();
var min = odds.Min();
if (Math.Abs(max - min) <= delta)
Console.WriteLine("All items are nearly equal");
else
Console.WriteLine("Not all items are nearly equal");
(This takes advantage of the fact that all other elements between the min and max are also close enough to be nearly equal, if the min and max are)
You can check out the implementation for Max here to see how they do it, but basically it's just a foreach loop which returns the highest value found.
*The second version is faster, because it's O(2N) where as the first version is O(N^2), I added the first version to illustrate how you could do the same thing on a list of objects which are not just numbers

How can i make random float numbers but to check that there will be no same numbers?

for(int i = 0; i < gos.Length; i++)
{
float randomspeed = (float)Math.Round (UnityEngine.Random.Range (1.0f, 15.0f));
floats.Add (randomspeed);
_animator [i].SetFloat ("Speed", randomspeed);
}
Now what i get is only round numbers between 1 and 15. I mean i'm not getting numbers like 1.0 or 5.4 or 9.8 or 14.5 is it logical to have speed values like this ? If so how can i make that the random numbers will include also floats ?
Second how can i make sure that there will be no the same numbers ?
gos Length is 15
As noted in the other answer, you aren't getting fractional values, because you call Math.Round(), which has the express purpose of rounding to the nearest whole number (when called the way you do).
As for preventing duplicates, I question the need to ensure against duplicates. First, the number of possible values within the range you're selecting is large enough that the chances of getting duplicates is very small. Second, it appears you are selecting random speeds for some game object, and it seems to me that in that scenario, it's entirely plausible that once in a while you would find a pair of game objects with the same speed.
That said, if you still want to do that, I would advise against the linear searches recommended by the other answers. Game logic should be reasonably efficient, and in this scenario that would mean using a hash set. For example:
HashSet<float> values = new HashSet<float>();
while (values.Count < gos.Length)
{
float randomSpeed = UnityEngine.Random.Range(1.0f, 15.0f);
// The Add() method returns "true" if the value _wasn't_ already in the set
if (values.Add(randomSpeed))
{
_animator[values.Count - 1].SetFloat("Speed, randomSpeed);
}
}
// it's not clear from your question whether you really need the list of
// floats at the end, but if you do, this is a way to convert the hash set
// to a list
floats = values.ToList();
The reason you're not getting any decimals is because you're using Math.Round, this will either raise the float to the next whole number or lower it.
As for if it's logical, it depends.As for your case animation speed is usually done by floats because it can smoothly speed up and down.
Also to answer your question on how to avoid duplicates of the same float.. which in itself is already very unlikely, try doing this instead :
for(int i = 0; i < gos.Length; i++)
{
float randomspeed = 0f;
// Keep repeating this until we find an unique randomspeed.
while(randomspeed == 0f || floats.Contains(randomspeed))
{
// Use this is you want round numbers
//randomspeed = Mathf.Round(Random.Range(1.0f, 15.0f));
randomspeed = Random.Range(1.0f, 15.0f);
}
floats.Add (randomspeed);
_animator [i].SetFloat ("Speed", randomspeed);
}
Your first problem: if you use Math.Round(), you'll never get numbers like 5.4...
Second question: you can check for existance of the number before you add the number:
private float GenerateRandomSpeed()
{ return (float)UnityEngine.Random.Range (1.0f, 15.0f);}
for(int i = 0; i < gos.Length; i++)
{
float randomspeed= GenerateRandomSpeed();
while (floats.any(x=>x==randomspeed))
randomspeed=GenerateRandomSpeed();
floats.Add (randomspeed);
_animator [i].SetFloat ("Speed", randomspeed);
}
I didn't test it but i hope it can direct you to the answer.

Best way to group list of doubles, to add up to a specific value

I have task, that I'm unsure on how i should approach.
there's a list of doubles, and i need to group them together to add up to a specific value.
Say i have:
14.6666666666666,
14.6666666666666,
2.37499999999999,
1.04166666666665,
1.20833333333334,
1.20833333333334,
13.9583333333333,
1.20833333333334,
3.41666666666714,
3.41666666666714,
1.20833333333334,
1.20833333333334,
14.5416666666666,
1.20833333333335,
1.04166666666666,
And i would like to group into set values such as 12,14,16
I would like to take the highest value in the list then group it with short ones to equal the closest value above.
example:
take double 14.6666666666666, and group it with 1.20833333333334 to bring me close to 16, and if there are anymore small doubles left in the list, group them with that as well.
Then move on to the next double in the list..
That's literally the "Cutting stock Problem" (Sometimes called the 1 Dimensional Bin Packing Problem). There are a number of well documented solutions.
The only way to get the "Optimal" solution (other than a quantum computer) is to cycle through every combination, and select the best outcome.
A quicker way to get an "OK" solution is called the "First Fit Algorithm". It takes the requested values in the order they come, and removes them from the first piece of material that can fulfill the request.
The "First Fit Algorithm" can be slightly improved by pre-ordering the the values from largest to smallest, and pre-ordering the materials from smallest to largest. You could also uses the material that is closest to being completely consumed by the request, instead of the first piece that can fulfill the request.
A compromise, but one that requires more code is a "Genetic Algorithm". This is an over simplification, but you could use the basic idea of the "First Fit Algorithm", but randomly swap two of the values before each pass. If the efficiency increases, you keep the change, and if it decreases, you go back to the last state. Repeat until a fixed amount of time has passed or until you're happy.
Put the doubles in a list and sort them. Grab the highest value that is less than the target to start. Then loop through from the start of the list adding the values until you reach a point where adding the value will put you over the limit.
var threshold = 16;
List<double> values = new List<double>();
values.Add(14.932034);
etc...
Sort the list:
values = values.OrderBy(p => p).ToList();
Grab the highest value that is less than your threshold:
// Highest value under threshold
var highestValue = values.Where(x => x < threshold).Max();
Now perform your search and calculations until you reach your solution:
currentValue = highestValue
Console.WriteLine("Starting with: " + currentValue);
foreach(var val in values)
{
if(currentValue + val <= theshold)
{
currentValue = currentValue + val;
Console.WriteLine(" + " + val.ToString());
}
else
break;
}
Console.WriteLine("Finished with: " + currentValue.ToString());
Console.ReadLine();
Repeat the process for the next value and so on until you've output all of the solutions you want.

Random Forest trains but tree count is zero?

I'm using emguCV to use OpenCV machine learning algorithms. I can successfully train a RTree(i get success) but when i try to predict it gives me always -1. Then i tried to get the Variable Importance matrix and the tree count and the matrix comes as null (i specified the params to built it) and the tree count comes as 0.
Does anyone has any thoughts on what i'm doing wrong? PS, if i use a decision tree i can get predictions.
I have 6 variables and about 11000 samples.
Below are the parameters i use:
MCvRTParams param = new MCvRTParams();
param.maxDepth = 8;// max depth
param.minSampleCount = 10;// min sample count
param.regressionAccuracy = 0;// regression accuracy: N/A here
param.useSurrogates = true; //compute surrogate split, no missing data
param.maxCategories = 15;// max number of categories (use sub-optimal algorithm for larger numbers)
param.cvFolds = 10;
//param.use1seRule = true;
param.truncatePrunedTree = true;
//param.priors = priorsHandle.AddrOfPinnedObject(); // the array of priors
Thanks
Try setting regressionAccuracy to a non-zero number. regressionAccuracy stops splitting nodes, if the accuracy within a node is better than regressionAccuracy. If you set it to zero it will stop immediately at the root node.

Standard deviation of generic list? [duplicate]

This question already has answers here:
How do I determine the standard deviation (stddev) of a set of values?
(12 answers)
Standard Deviation in LINQ
(8 answers)
Closed 9 years ago.
I need to calculate the standard deviation of a generic list. I will try to include my code. Its a generic list with data in it. The data is mostly floats and ints. Here is my code that is relative to it without getting into to much detail:
namespace ValveTesterInterface
{
public class ValveDataResults
{
private List<ValveData> m_ValveResults;
public ValveDataResults()
{
if (m_ValveResults == null)
{
m_ValveResults = new List<ValveData>();
}
}
public void AddValveData(ValveData valve)
{
m_ValveResults.Add(valve);
}
Here is the function where the standard deviation needs to be calculated:
public float LatchStdev()
{
float sumOfSqrs = 0;
float meanValue = 0;
foreach (ValveData value in m_ValveResults)
{
meanValue += value.LatchTime;
}
meanValue = (meanValue / m_ValveResults.Count) * 0.02f;
for (int i = 0; i <= m_ValveResults.Count; i++)
{
sumOfSqrs += Math.Pow((m_ValveResults - meanValue), 2);
}
return Math.Sqrt(sumOfSqrs /(m_ValveResults.Count - 1));
}
}
}
Ignore whats inside the LatchStdev() function because I'm sure its not right. Its just my poor attempt to calculate the st dev. I know how to do it of a list of doubles, however not of a list of generic data list. If someone had experience in this, please help.
The example above is slightly incorrect and could have a divide by zero error if your population set is 1. The following code is somewhat simpler and gives the "population standard deviation" result. (http://en.wikipedia.org/wiki/Standard_deviation)
using System;
using System.Linq;
using System.Collections.Generic;
public static class Extend
{
public static double StandardDeviation(this IEnumerable<double> values)
{
double avg = values.Average();
return Math.Sqrt(values.Average(v=>Math.Pow(v-avg,2)));
}
}
This article should help you. It creates a function that computes the deviation of a sequence of double values. All you have to do is supply a sequence of appropriate data elements.
The resulting function is:
private double CalculateStandardDeviation(IEnumerable<double> values)
{
double standardDeviation = 0;
if (values.Any())
{
// Compute the average.
double avg = values.Average();
// Perform the Sum of (value-avg)_2_2.
double sum = values.Sum(d => Math.Pow(d - avg, 2));
// Put it all together.
standardDeviation = Math.Sqrt((sum) / (values.Count()-1));
}
return standardDeviation;
}
This is easy enough to adapt for any generic type, so long as we provide a selector for the value being computed. LINQ is great for that, the Select funciton allows you to project from your generic list of custom types a sequence of numeric values for which to compute the standard deviation:
List<ValveData> list = ...
var result = list.Select( v => (double)v.SomeField )
.CalculateStdDev();
Even though the accepted answer seems mathematically correct, it is wrong from the programming perspective - it enumerates the same sequence 4 times. This might be ok if the underlying object is a list or an array, but if the input is a filtered/aggregated/etc linq expression, or if the data is coming directly from the database or network stream, this would cause much lower performance.
I would highly recommend not to reinvent the wheel and use one of the better open source math libraries Math.NET. We have been using that lib in our company and are very happy with the performance.
PM> Install-Package MathNet.Numerics
var populationStdDev = new List<double>(1d, 2d, 3d, 4d, 5d).PopulationStandardDeviation();
var sampleStdDev = new List<double>(2d, 3d, 4d).StandardDeviation();
See http://numerics.mathdotnet.com/docs/DescriptiveStatistics.html for more information.
Lastly, for those who want to get the fastest possible result and sacrifice some precision, read "one-pass" algorithm https://en.wikipedia.org/wiki/Standard_deviation#Rapid_calculation_methods
I see what you're doing, and I use something similar. It seems to me you're not going far enough. I tend to encapsulate all data processing into a single class, that way I can cache the values that are calculated until the list changes.
for instance:
public class StatProcessor{
private list<double> _data; //this holds the current data
private _avg; //we cache average here
private _avgValid; //a flag to say weather we need to calculate the average or not
private _calcAvg(); //calculate the average of the list and cache in _avg, and set _avgValid
public double average{
get{
if(!_avgValid) //if we dont HAVE to calculate the average, skip it
_calcAvg(); //if we do, go ahead, cache it, then set the flag.
return _avg; //now _avg is garunteed to be good, so return it.
}
}
...more stuff
Add(){
//add stuff to the list here, and reset the flag
}
}
You'll notice that using this method, only the first request for average actually computes the average. After that, as long as we don't add (or remove, or modify at all, but those arnt shown) anything from the list, we can get the average for basically nothing.
Additionally, since the average is used in the algorithm for the standard deviation, computing the standard deviation first will give us the average for free, and computing the average first will give us a little performance boost in the standard devation calculation, assuming we remember to check the flag.
Furthermore! places like the average function, where you're looping through every value already anyway, is a great time to cache things like the minimum and maximum values. Of course, requests for this information need to first check whether theyve been cached, and that can cause a relative slowdown compared to just finding the max using the list, since it does all the extra work setting up all the concerned caches, not just the one your accessing.

Categories

Resources