selection based on percentage weighting

selection based on percentage weighting - c#

I have a set of values, and an associated percentage for each:
a: 70% chance
b: 20% chance
c: 10% chance
I want to select a value (a, b, c) based on the percentage chance given.
how do I approach this?
my attempt so far looks like this:
r = random.random()
if r <= .7:
return a
elif r <= .9:
return b
else:
return c
I'm stuck coming up with an algorithm to handle this. How should I approach this so it can handle larger sets of values without just chaining together if-else flows.
(any explanation or answers in pseudo-code are fine. a python or C# implementation would be especially helpful)

Here is a complete solution in C#:
public class ProportionValue<T>
{
public double Proportion { get; set; }
public T Value { get; set; }
}
public static class ProportionValue
{
public static ProportionValue<T> Create<T>(double proportion, T value)
{
return new ProportionValue<T> { Proportion = proportion, Value = value };
}
static Random random = new Random();
public static T ChooseByRandom<T>(
this IEnumerable<ProportionValue<T>> collection)
{
var rnd = random.NextDouble();
foreach (var item in collection)
{
if (rnd < item.Proportion)
return item.Value;
rnd -= item.Proportion;
}
throw new InvalidOperationException(
"The proportions in the collection do not add up to 1.");
}
}
Usage:
var list = new[] {
ProportionValue.Create(0.7, "a"),
ProportionValue.Create(0.2, "b"),
ProportionValue.Create(0.1, "c")
};
// Outputs "a" with probability 0.7, etc.
Console.WriteLine(list.ChooseByRandom());

For Python:
>>> import random
>>> dst = 70, 20, 10
>>> vls = 'a', 'b', 'c'
>>> picks = [v for v, d in zip(vls, dst) for _ in range(d)]
>>> for _ in range(12): print random.choice(picks),
...
a c c b a a a a a a a a
>>> for _ in range(12): print random.choice(picks),
...
a c a c a b b b a a a a
>>> for _ in range(12): print random.choice(picks),
...
a a a a c c a c a a c a
>>>
General idea: make a list where each item is repeated a number of times proportional to the probability it should have; use random.choice to pick one at random (uniformly), this will match your required probability distribution. Can be a bit wasteful of memory if your probabilities are expressed in peculiar ways (e.g., 70, 20, 10 makes a 100-items list where 7, 2, 1 would make a list of just 10 items with exactly the same behavior), but you could divide all the counts in the probabilities list by their greatest common factor if you think that's likely to be a big deal in your specific application scenario.
Apart from memory consumption issues, this should be the fastest solution -- just one random number generation per required output result, and the fastest possible lookup from that random number, no comparisons &c. If your likely probabilities are very weird (e.g., floating point numbers that need to be matched to many, many significant digits), other approaches may be preferable;-).

Knuth references Walker's method of aliases. Searching on this, I find http://code.activestate.com/recipes/576564-walkers-alias-method-for-random-objects-with-diffe/ and http://prxq.wordpress.com/2006/04/17/the-alias-method/. This gives the exact probabilities required in constant time per number generated with linear time for setup (curiously, n log n time for setup if you use exactly the method Knuth describes, which does a preparatory sort you can avoid).

Take the list of and find the cumulative total of the weights: 70, 70+20, 70+20+10. Pick a random number greater than or equal to zero and less than the total. Iterate over the items and return the first value for which the cumulative sum of the weights is greater than this random number:
def select( values ):
variate = random.random() * sum( values.values() )
cumulative = 0.0
for item, weight in values.items():
cumulative += weight
if variate < cumulative:
return item
return item # Shouldn't get here, but just in case of rounding...
print select( { "a": 70, "b": 20, "c": 10 } )
This solution, as implemented, should also be able to handle fractional weights and weights that add up to any number so long as they're all non-negative.

Let T = the sum of all item weights
Let R = a random number between 0 and T
Iterate the item list subtracting each item weight from R and return the item that causes the result to become <= 0.

def weighted_choice(probabilities):
random_position = random.random() * sum(probabilities)
current_position = 0.0
for i, p in enumerate(probabilities):
current_position += p
if random_position < current_position:
return i
return None
Because random.random will always return < 1.0, the final return should never be reached.

import random
def selector(weights):
i=random.random()*sum(x for x,y in weights)
for w,v in weights:
if w>=i:
break
i-=w
return v
weights = ((70,'a'),(20,'b'),(10,'c'))
print [selector(weights) for x in range(10)]
it works equally well for fractional weights
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
print [selector(weights) for x in range(10)]
If you have a lot of weights, you can use bisect to reduce the number of iterations required
import random
import bisect
def make_acc_weights(weights):
acc=0
acc_weights = []
for w,v in weights:
acc+=w
acc_weights.append((acc,v))
return acc_weights
def selector(acc_weights):
i=random.random()*sum(x for x,y in weights)
return weights[bisect.bisect(acc_weights, (i,))][1]
weights = ((70,'a'),(20,'b'),(10,'c'))
acc_weights = make_acc_weights(weights)
print [selector(acc_weights) for x in range(100)]
Also works fine for fractional weights
weights = ((0.7,'a'),(0.2,'b'),(0.1,'c'))
acc_weights = make_acc_weights(weights)
print [selector(acc_weights) for x in range(100)]

today, the update of python document give an example to make a random.choice() with weighted probabilities:
If the weights are small integer ratios, a simple technique is to build a sample population with repeats:
>>> weighted_choices = [('Red', 3), ('Blue', 2), ('Yellow', 1), ('Green', 4)]
>>> population = [val for val, cnt in weighted_choices for i in range(cnt)]
>>> random.choice(population)
'Green'
A more general approach is to arrange the weights in a cumulative distribution with itertools.accumulate(), and then locate the random value with bisect.bisect():
>>> choices, weights = zip(*weighted_choices)
>>> cumdist = list(itertools.accumulate(weights))
>>> x = random.random() * cumdist[-1]
>>> choices[bisect.bisect(cumdist, x)]
'Blue'
one note: itertools.accumulate() needs python 3.2 or define it with the Equivalent.

I think you can have an array of small objects (I implemented in Java although I know a little bit C# but I am afraid can write wrong code), so you may need to port it yourself. The code in C# will be much smaller with struct, var but I hope you get the idea
class PercentString {
double percent;
String value;
// Constructor for 2 values
}
ArrayList<PercentString> list = new ArrayList<PercentString();
list.add(new PercentString(70, "a");
list.add(new PercentString(20, "b");
list.add(new PercentString(10, "c");
double percent = 0;
for (int i = 0; i < list.size(); i++) {
PercentString p = list.get(i);
percent += p.percent;
if (random < percent) {
return p.value;
}
}

If you are really up to speed and want to generate the random values quickly, the Walker's algorithm mcdowella mentioned in https://stackoverflow.com/a/3655773/1212517 is pretty much the best way to go (O(1) time for random(), and O(N) time for preprocess()).
For anyone who is interested, here is my own PHP implementation of the algorithm:
/**
* Pre-process the samples (Walker's alias method).
* #param array key represents the sample, value is the weight
*/
protected function preprocess($weights){
$N = count($weights);
$sum = array_sum($weights);
$avg = $sum / (double)$N;
//divide the array of weights to values smaller and geq than sum/N
$smaller = array_filter($weights, function($itm) use ($avg){ return $avg > $itm;}); $sN = count($smaller);
$greater_eq = array_filter($weights, function($itm) use ($avg){ return $avg <= $itm;}); $gN = count($greater_eq);
$bin = array(); //bins
//we want to fill N bins
for($i = 0;$i<$N;$i++){
//At first, decide for a first value in this bin
//if there are small intervals left, we choose one
if($sN > 0){
$choice1 = each($smaller);
unset($smaller[$choice1['key']]);
$sN--;
} else{ //otherwise, we split a large interval
$choice1 = each($greater_eq);
unset($greater_eq[$choice1['key']]);
}
//splitting happens here - the unused part of interval is thrown back to the array
if($choice1['value'] >= $avg){
if($choice1['value'] - $avg >= $avg){
$greater_eq[$choice1['key']] = $choice1['value'] - $avg;
}else if($choice1['value'] - $avg > 0){
$smaller[$choice1['key']] = $choice1['value'] - $avg;
$sN++;
}
//this bin comprises of only one value
$bin[] = array(1=>$choice1['key'], 2=>null, 'p1'=>1, 'p2'=>0);
}else{
//make the second choice for the current bin
$choice2 = each($greater_eq);
unset($greater_eq[$choice2['key']]);
//splitting on the second interval
if($choice2['value'] - $avg + $choice1['value'] >= $avg){
$greater_eq[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
}else{
$smaller[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
$sN++;
}
//this bin comprises of two values
$choice2['value'] = $avg - $choice1['value'];
$bin[] = array(1=>$choice1['key'], 2=>$choice2['key'],
'p1'=>$choice1['value'] / $avg,
'p2'=>$choice2['value'] / $avg);
}
}
$this->bins = $bin;
}
/**
* Choose a random sample according to the weights.
*/
public function random(){
$bin = $this->bins[array_rand($this->bins)];
$randValue = (lcg_value() < $bin['p1'])?$bin[1]:$bin[2];
}

Here is my version that can apply to any IList and normalize the weight. It is based on Timwi's solution : selection based on percentage weighting
/// <summary>
/// return a random element of the list or default if list is empty
/// </summary>
/// <param name="e"></param>
/// <param name="weightSelector">
/// return chances to be picked for the element. A weigh of 0 or less means 0 chance to be picked.
/// If all elements have weight of 0 or less they all have equal chances to be picked.
/// </param>
/// <returns></returns>
public static T AnyOrDefault<T>(this IList<T> e, Func<T, double> weightSelector)
{
if (e.Count < 1)
return default(T);
if (e.Count == 1)
return e[0];
var weights = e.Select(o => Math.Max(weightSelector(o), 0)).ToArray();
var sum = weights.Sum(d => d);
var rnd = new Random().NextDouble();
for (int i = 0; i < weights.Length; i++)
{
//Normalize weight
var w = sum == 0
? 1 / (double)e.Count
: weights[i] / sum;
if (rnd < w)
return e[i];
rnd -= w;
}
throw new Exception("Should not happen");
}

I've my own solution for this:
public class Randomizator3000
{
public class Item<T>
{
public T value;
public float weight;
public static float GetTotalWeight<T>(Item<T>[] p_itens)
{
float __toReturn = 0;
foreach(var item in p_itens)
{
__toReturn += item.weight;
}
return __toReturn;
}
}
private static System.Random _randHolder;
private static System.Random _random
{
get
{
if(_randHolder == null)
_randHolder = new System.Random();
return _randHolder;
}
}
public static T PickOne<T>(Item<T>[] p_itens)
{
if(p_itens == null || p_itens.Length == 0)
{
return default(T);
}
float __randomizedValue = (float)_random.NextDouble() * (Item<T>.GetTotalWeight(p_itens));
float __adding = 0;
for(int i = 0; i < p_itens.Length; i ++)
{
float __cacheValue = p_itens[i].weight + __adding;
if(__randomizedValue <= __cacheValue)
{
return p_itens[i].value;
}
__adding = __cacheValue;
}
return p_itens[p_itens.Length - 1].value;
}
}
And using it should be something like that (thats in Unity3d)
using UnityEngine;
using System.Collections;
public class teste : MonoBehaviour
{
Randomizator3000.Item<string>[] lista;
void Start()
{
lista = new Randomizator3000.Item<string>[10];
lista[0] = new Randomizator3000.Item<string>();
lista[0].weight = 10;
lista[0].value = "a";
lista[1] = new Randomizator3000.Item<string>();
lista[1].weight = 10;
lista[1].value = "b";
lista[2] = new Randomizator3000.Item<string>();
lista[2].weight = 10;
lista[2].value = "c";
lista[3] = new Randomizator3000.Item<string>();
lista[3].weight = 10;
lista[3].value = "d";
lista[4] = new Randomizator3000.Item<string>();
lista[4].weight = 10;
lista[4].value = "e";
lista[5] = new Randomizator3000.Item<string>();
lista[5].weight = 10;
lista[5].value = "f";
lista[6] = new Randomizator3000.Item<string>();
lista[6].weight = 10;
lista[6].value = "g";
lista[7] = new Randomizator3000.Item<string>();
lista[7].weight = 10;
lista[7].value = "h";
lista[8] = new Randomizator3000.Item<string>();
lista[8].weight = 10;
lista[8].value = "i";
lista[9] = new Randomizator3000.Item<string>();
lista[9].weight = 10;
lista[9].value = "j";
}
void Update ()
{
Debug.Log(Randomizator3000.PickOne<string>(lista));
}
}
In this example each value has a 10% chance do be displayed as a debug =3

Based loosely on python's numpy.random.choice(a=items, p=probs), which takes an array and a probability array of the same size.
public T RandomChoice<T>(IEnumerable<T> a, IEnumerable<double> p)
{
IEnumerator<T> ae = a.GetEnumerator();
Random random = new Random();
double target = random.NextDouble();
double accumulator = 0;
foreach (var prob in p)
{
ae.MoveNext();
accumulator += prob;
if (accumulator > target)
{
break;
}
}
return ae.Current;
}
The probability array p must sum to (approx.) 1. This is to keep it consistent with the numpy interface (and mathematics), but you could easily change that if you wanted.

Related

Random number with Probabilities in C#

I have converted this Java program into a C# program.
using System;
using System.Collections.Generic;
namespace RandomNumberWith_Distribution__Test
{
public class DistributedRandomNumberGenerator
{
private Dictionary<Int32, Double> distribution;
private double distSum;
public DistributedRandomNumberGenerator()
{
distribution = new Dictionary<Int32, Double>();
}
public void addNumber(int val, double dist)
{
distribution.Add(val, dist);// are these two
distSum += dist; // lines correctly translated?
}
public int getDistributedRandomNumber()
{
double rand = new Random().NextDouble();//generate a double random number
double ratio = 1.0f / distSum;//why is ratio needed?
double tempDist = 0;
foreach (Int32 i in distribution.Keys)
{
tempDist += distribution[i];
if (rand / ratio <= tempDist)//what does "rand/ratio" signify? What does this comparison achieve?
{
return i;
}
}
return 0;
}
}
public class MainClass
{
public static void Main(String[] args)
{
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
drng.addNumber(2, 0.3d);
drng.addNumber(3, 0.5d);
//=================
// start simulation
int testCount = 1000000;
Dictionary<Int32, Double> test = new Dictionary<Int32, Double>();
for (int i = 0; i < testCount; i++)
{
int random = drng.getDistributedRandomNumber();
if (test.ContainsKey(random))
{
double prob = test[random]; // are these
prob = prob + 1.0 / testCount;// three lines
test[random] = prob; // correctly translated?
}
else
{
test.Add(random, 1.0 / testCount);// is this line correctly translated?
}
}
foreach (var item in test.Keys)
{
Console.WriteLine($"{item}, {test[item]}");
}
Console.ReadLine();
}
}
}
I have several questions:
Can you check if the marked-by-comment lines are correct or need explanation?
Why doesn't getDistributedRandomNumber() check if the sum of the distribution 1 before proceeding to further calculations?

The method
public void addNumber(int val, double dist)
Is not correctly translated, since you are missing the following lines:
if (this.distribution.get(value) != null) {
distSum -= this.distribution.get(value);
}
Those lines should cover the case when you call the following (based on your example code):
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
drng.addNumber(1, 0.5d);
So calling the method addNumber twice with the same first argument, the missing code part looks if the first argument is already present in the dictionary and if so it will remove the "old" value from the dictionary to insert the new value.
Your method should look like this:
public void addNumber(int val, double dist)
{
if (distribution.TryGetValue(val, out var oldDist)) //get the old "dist" value, based on the "val"
{
distribution.Remove(val); //remove the old entry
distSum -= oldDist; //substract "distSum" with the old "dist" value
}
distribution.Add(val, dist); //add the "val" with the current "dist" value to the dictionary
distSum += dist; //add the current "dist" value to "distSum"
}
Now to your second method
public int getDistributedRandomNumber()
Instead of calling initializing a new instance of Random every time this method is called you should only initialize it once, so change the line
double rand = new Random().NextDouble();
to this
double rand = _random.NextDouble();
and initialize the field _random outside of a method inside the class declaration like this
public class DistributedRandomNumberGenerator
{
private Dictionary<Int32, Double> distribution;
private double distSum;
private Random _random = new Random();
... rest of your code
}
This will prevent new Random().NextDouble() from producing the same number over and over again if called in a loop.
You can read about this problem here: Random number generator only generating one random number
As I side note, fields in c# are named with a prefix underscore. You should consider renaming distribution to _distribution, same applies for distSum.
Next:
double ratio = 1.0f / distSum;//why is ratio needed?
Ratio is need because the method tries its best to do its job with the information you have provided, imagine you only call this:
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
int random = drng.getDistributedRandomNumber();
With those lines you told the class you want to have the number 1 in 20% of the cases, but what about the other 80%?
And that's where the ratio variable comes in place, it calculates a comparable value based on the sum of probabilities you have given.
eg.
double ratio = 1.0f / distSum;
As with the latest example drng.addNumber(1, 0.2d); distSum will be 0.2, which translates to a probability of 20%.
double ratio = 1.0f / 0.2;
The ratio is 5.0, with a probability of 20% the ratio is 5 because 100% / 5 = 20%.
Now let's have a look at how the code reacts when the ratio is 5
double tempDist = 0;
foreach (Int32 i in distribution.Keys)
{
tempDist += distribution[i];
if (rand / ratio <= tempDist)
{
return i;
}
}
rand will be to any given time a value that is greater than or equal to 0.0, and less than 1.0., that's how NextDouble works, so let's assume the following 0.254557522132321 as rand.
Now let's look what happens step by step
double tempDist = 0; //initialize with 0
foreach (Int32 i in distribution.Keys) //step through the added probabilities
{
tempDist += distribution[i]; //get the probabilities and add it to a temporary probability sum
//as a reminder
//rand = 0.254557522132321
//ratio = 5
//rand / ratio = 0,0509115044264642
//tempDist = 0,2
// if will result in true
if (rand / ratio <= tempDist)
{
return i;
}
}
If we didn't apply the ratio the if would be false, but that would be wrong, since we only have a single value inside our dictionary, so no matter what the rand value might be the if statement should return true and that's the natur of rand / ratio.
To "fix" the randomly generated number based on the sum of probabilities we added. The rand / ratio will only be usefull if you didn't provide probabilites that perfectly sum up to 1 = 100%.
eg. if your example would be this
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.2d);
drng.addNumber(2, 0.3d);
drng.addNumber(3, 0.5d);
You can see that the provided probabilities equal to 1 => 0.2 + 0.3 + 0.5, in this case the line
if (rand / ratio <= tempDist)
Would look like this
if (rand / 1 <= tempDist)
Divding by 1 will never change the value and rand / 1 = rand, so the only use case for this devision are cases where you didn't provided a perfect 100% probability, could be either more or less.
As a side note, I would suggest changing your code to this
//call the dictionary distributions (notice the plural)
//dont use .Keys
//var distribution will be a KeyValuePair
foreach (var distribution in distributions)
{
//access the .Value member of the KeyValuePair
tempDist += distribution.Value;
if (rand / ratio <= tempDist)
{
return i;
}
}
Your test routine seems to be correctly translated.

Nearest value from user input in an array C#

so in my application , I read some files into it and ask the user for a number , in these files there a lot of numbers and I am trying to find the nearest value when the number they enter is not in the file. So far I have as following
static int nearest(int close_num, int[] a)
{
foreach (int bob in a)
{
if ((close_num -= bob) <= 0)
return bob;
}
return -1;
}
Console.WriteLine("Enter a number to find out if is in the selected Net File: ");
int i3 = Convert.ToInt32(Console.ReadLine());
bool checker = false;
//Single nearest = 0;
//linear search#1
for (int i = 0; i < a.Length; i++)//looping through array
{
if(a[i] == i3)//checking to see the value is found in the array
{
Console.WriteLine("Value found and the position of it in the descending value of the selected Net File is: " + a[i]);
checker = true;
}
else
{
int found = nearest(i3,a);
Console.WriteLine("Cannot find this number in the Net File however here the closest number to that: " + found );
//Console.WriteLine("Cannot find this number in the Net File however here the closest number to that : " + nearest);
}
}
When a value that is in the file is entered the output is fine , but when it comes to the nearest value I cannot figure a way. I can't use this such as BinarySearchArray for this. a = the array whilst i3 is the value the user has entered. Would a binary search algorithm just be simpler for this?
Any help would be appreciated.

You need to make a pass over all the elements of the array, comparing each one in turn to find the smallest difference. At the same time, keep a note of the current nearest value.
There are many ways to do this; here's a fairly simple one:
static int nearest(int close_num, int[] a)
{
int result = -1;
long smallestDelta = long.MaxValue;
foreach (int bob in a)
{
long delta = (bob > close_num) ? (bob - close_num) : (close_num - bob);
if (delta < smallestDelta)
{
smallestDelta = delta;
result = bob;
}
}
return result;
}
Note that delta is calculated so that it is the absolute value of the difference.

Well, first we should define, what is nearest. Assuming that,
int nearest for given int number is the item of int[] a such that Math.Abs(nearest - number) is the smallest possible value
we can put it as
static int nearest(int number, int[] a)
{
long diff = -1;
int result = 0;
foreach (int item in a)
{
// actual = Math.Abs((long)item - number);
long actual = (long)item - number;
if (actual < 0)
actual = -actual;
// if item is the very first value or better than result
if (diff < 0 || actual < diff) {
result = item;
diff = actual;
}
}
return result;
}
The only tricky part is long for diff: it may appear that item - number exceeds int range (and will either have IntegerOverflow exceprion thrown or *invalid answer), e.g.
int[] a = new int[] {int.MaxValue, int.MaxValue - 1};
Console.Write(nearest(int.MinValue, a));
Note, that expected result is 2147483646, not 2147483647

what about LINQ ?
var nearestNumber = a.OrderBy(x => Math.Abs(x - i3)).First();

Just iterate through massive and find the minimal delta between close_num and array members
static int nearest(int close_num, int[] a)
{
// initialize as big number, 1000 just an example
int min_delta=1000;
int result=-1;
foreach (int bob in a)
{
if (Math.Abs(bob-close_num) <= min_delta)
{
min_delta = bob-close_num;
result = bob;
}
}
return result;
}

Where is the flaw in my algorithm for consolidating gold mines?

The setup is that, given a list of N objects like
class Mine
{
public int Distance { get; set; } // from river
public int Gold { get; set; } // in tons
}
where the cost of moving the gold from one mine to the other is
// helper function for cost of a move
Func<Tuple<Mine,Mine>, int> MoveCost = (tuple) =>
Math.Abs(tuple.Item1.Distance - tuple.Item2.Distance) * tuple.Item1.Gold;
I want to consolidate the gold into K mines.
I've written an algorithm, thought it over many times, and don't understand why it isn't working. Hopefully my comments help out. Any idea where I'm going wrong?
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
class Mine
{
public int Distance { get; set; } // from river
public int Gold { get; set; } // in tons
}
class Solution
{
static void Main(String[] args)
{
// helper function for reading lines
Func<string, int[]> LineToIntArray = (line) => Array.ConvertAll(line.Split(' '), Int32.Parse);
int[] line1 = LineToIntArray(Console.ReadLine());
int N = line1[0], // # of mines
K = line1[1]; // # of pickup locations
// Populate mine info
List<Mine> mines = new List<Mine>();
for(int i = 0; i < N; ++i)
{
int[] line = LineToIntArray(Console.ReadLine());
mines.Add(new Mine() { Distance = line[0], Gold = line[1] });
}
// helper function for cost of a move
Func<Tuple<Mine,Mine>, int> MoveCost = (tuple) =>
Math.Abs(tuple.Item1.Distance - tuple.Item2.Distance) * tuple.Item1.Gold;
// all move combinations
var moves = from m1 in mines
from m2 in mines
where !m1.Equals(m2)
select Tuple.Create(m1,m2);
// moves in ascending order of cost
var ordered = from m in moves
orderby MoveCost(m)
select m;
int sum = 0; // running total of move costs
var spots = Enumerable.Repeat(1, N).ToArray(); // spots[i] = 1 if hasn't been consildated into other mine, 0 otherwise
var iter = ordered.GetEnumerator();
while(iter.MoveNext() && spots.Sum() != K)
{
var move = iter.Current; // move with next smallest cost
int i = mines.IndexOf(move.Item1), // index of source mine in move
j = mines.IndexOf(move.Item2); // index of destination mine in move
if((spots[i] & spots[j]) == 1) // if the source and destination mines are both unconsolidated
{
sum += MoveCost(move); // add this consolidation to the total cost
spots[i] = 0; // "remove" mine i from the list of unconsolidated mines
}
}
Console.WriteLine(sum);
}
}
An example of a test case I'm failing is
3 1
11 3
12 2
13 1
My output is
3
and the correct output is
4

The other answer does point out a flaw in the implementation, but it fails to mention that in your code, you aren't actually changing the Gold values in the remaining Mine objects. So even if you did re-sort the data, it wouldn't help.
Furthermore, at each iteration all you really care about is the minimum value. Sorting the entire list of data is overkill. You can just scan it once to find the minimum-valued item.
You also don't really need the separate array of flags. Just maintain your move objects in a list, and after choosing a move, remove the move objects that include the Mine you would otherwise have flagged as no longer valid.
Here is a version of your algorithm that incorporates the above feedback:
static void Main(String[] args)
{
string input =
#"3 1
11 3
12 2
13 1";
StringReader reader = new StringReader(input);
// helper function for reading lines
Func<string, int[]> LineToIntArray = (line) => Array.ConvertAll(line.Split(' '), Int32.Parse);
int[] line1 = LineToIntArray(reader.ReadLine());
int N = line1[0], // # of mines
K = line1[1]; // # of pickup locations
// Populate mine info
List<Mine> mines = new List<Mine>();
for (int i = 0; i < N; ++i)
{
int[] line = LineToIntArray(reader.ReadLine());
mines.Add(new Mine() { Distance = line[0], Gold = line[1] });
}
// helper function for cost of a move
Func<Tuple<Mine, Mine>, int> MoveCost = (tuple) =>
Math.Abs(tuple.Item1.Distance - tuple.Item2.Distance) * tuple.Item1.Gold;
// all move combinations
var moves = (from m1 in mines
from m2 in mines
where !m1.Equals(m2)
select Tuple.Create(m1, m2)).ToList();
int sum = 0, // running total of move costs
unconsolidatedCount = N;
while (moves.Count > 0 && unconsolidatedCount != K)
{
var move = moves.Aggregate((a, m) => MoveCost(a) < MoveCost(m) ? a : m);
sum += MoveCost(move); // add this consolidation to the total cost
move.Item2.Gold += move.Item1.Gold;
moves.RemoveAll(m => m.Item1 == move.Item1 || m.Item2 == move.Item1);
unconsolidatedCount--;
}
Console.WriteLine("Moves: " + sum);
}
Without more detail in your question, I can't guarantee that this actually meets the specification. But it does produce the value 4 for the sum. :)

When you consolidate mine i into mine j, the amount of gold in the mine j is increased. This makes consolidations from mine j to other mines more expensive potentially making the ordering of the mines by the move cost invalid. To fix this, you could re-sort the list of mines at the beginning of each iteration of your while-loop.

C# Simple Constrained Weighted Average Algorithm with/without Solver

I'm at a loss as to why I can't get this seemingly simple problem solved using Microsoft Solver Foundation.
All I need is to modify the weights (numbers) of certain observations to ensure that no 1 observation's weight AS A PERCENTAGE exceeds 25%. This is for the purposes of later calculating a constrained weighted average with the results of this algorithm.
For example, given the 5 weights of { 45, 100, 33, 500, 28 }, I would expect the result of this algorithm to be { 45, 53, 33, 53, 28 }, where 2 of the numbers had to be reduced such that they're within the 25% threshold of the new total (212 = 45+53+33+53+28) while the others remained untouched. Note that even though initially, the 2nd weight of 100 was only 14% of the total (706), as a result of decreasing the 4th weight of 500, it subsequently pushed up the % of the other observations and therein lies the only challenge with this.
I tried to recreate this using Solver only for it to tell me that it is the solution is "Infeasible" and it just returns all 1s. Update: solution need not use Solver, any alternative is welcome so long as it is fast when dealing with a decent number of weights.
var solver = SolverContext.GetContext();
var model = solver.CreateModel();
var decisionList = new List<Decision>();
decisionList.Add(new Decision(Domain.IntegerRange(1, 45), "Dec1"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 100), "Dec2"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 33), "Dec3"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 500), "Dec4"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 28), "Dec5"));
model.AddDecisions(decisionList.ToArray());
int weightLimit = 25;
foreach (var decision in model.Decisions)
{
model.AddConstraint(decision.Name + "weightLimit", 100 * (decision / Model.Sum(model.Decisions.ToArray())) <= weightLimit);
}
model.AddGoal("calcGoal", GoalKind.Maximize, Model.Sum(model.Decisions.ToArray()));
var solution = solver.Solve();
foreach (var decision in model.Decisions)
{
Debug.Print(decision.GetDouble().ToString());
}
Debug.Print("Solution Quality: " + solution.Quality.ToString());
Any help with this would be very much appreciated, thanks in advance.

I ditched Solver b/c it didn't live up to its name imo (or I didn't live up to its standards :)). Below is where I landed. Because this function gets used many times and on large lists of input weights, efficiency and performance are key so this function attempts to do the least # of iterations possible (let me know if anyone has any suggested improvements though). The results get used for a weighted average so I use "AttributeWeightPair" to store the value (attribute) and its weight and the function below is what modifies the weights to be within the constraint when given a list of these AWPs. The function assumes that weightLimit is passed in as a %, e.g. 25% gets passed in as 25, not 0.25 --- ok I'll stop stating what'll be obvious from the code - so here it is:
public static List<AttributeWeightPair<decimal>> WeightLimiter(List<AttributeWeightPair<decimal>> source, decimal weightLimit)
{
weightLimit /= 100; //convert to percentage
var zeroWeights = source.Where(w => w.Weight == 0).ToList();
var nonZeroWeights = source.Where(w => w.Weight > 0).ToList();
if (nonZeroWeights.Count == 0)
return source;
//return equal weights if given infeasible constraint
if ((1m / nonZeroWeights.Count()) > weightLimit)
{
nonZeroWeights.ForEach(w => w.Weight = 1);
return nonZeroWeights.Concat(zeroWeights).ToList();
}
//return original list if weight-limiting is unnecessary
if ((nonZeroWeights.Max(w => w.Weight) / nonZeroWeights.Sum(w => w.Weight)) <= weightLimit)
{
return source;
}
//sort (ascending) and store original weights
nonZeroWeights = nonZeroWeights.OrderBy(w => w.Weight).ToList();
var originalWeights = nonZeroWeights.Select(w => w.Weight).ToList();
//set starting point and determine direction from there
var initialSumWeights = nonZeroWeights.Sum(w => w.Weight);
var initialLimit = weightLimit * initialSumWeights;
var initialSuspects = nonZeroWeights.Where(w => w.Weight > initialLimit).ToList();
var initialTarget = weightLimit * (initialSumWeights - (initialSuspects.Sum(w => w.Weight) - initialLimit * initialSuspects.Count()));
var antepenultimateIndex = Math.Max(nonZeroWeights.FindLastIndex(w => w.Weight <= initialTarget), 1); //needs to be at least 1
for (int i = antepenultimateIndex; i < nonZeroWeights.Count(); i++)
{
nonZeroWeights[i].Weight = originalWeights[antepenultimateIndex - 1]; //set cap equal to the preceding weight
}
bool goingUp = (nonZeroWeights[antepenultimateIndex].Weight / nonZeroWeights.Sum(w => w.Weight)) > weightLimit ? false : true;
//Procedure 1 - find the weight # at which a cap would result in a weight % just UNDER the weight limit
int penultimateIndex = antepenultimateIndex;
bool justUnderTarget = false;
while (!justUnderTarget)
{
for (int i = penultimateIndex; i < nonZeroWeights.Count(); i++)
{
nonZeroWeights[i].Weight = originalWeights[penultimateIndex - 1]; //set cap equal to the preceding weight
}
var currentMaxPcntWeight = nonZeroWeights[penultimateIndex].Weight / nonZeroWeights.Sum(w => w.Weight);
if (currentMaxPcntWeight == weightLimit)
{
return nonZeroWeights.Concat(zeroWeights).ToList();
}
else if (goingUp && currentMaxPcntWeight < weightLimit)
{
nonZeroWeights[penultimateIndex].Weight = originalWeights[penultimateIndex]; //reset
if (penultimateIndex < nonZeroWeights.Count() - 1)
penultimateIndex++; //move up
else break;
}
else if (!goingUp && currentMaxPcntWeight > weightLimit)
{
if (penultimateIndex > 1)
penultimateIndex--; //move down
else break;
}
else
{
justUnderTarget = true;
}
}
if (goingUp) //then need to back up a step
{
penultimateIndex = (penultimateIndex > 1 ? penultimateIndex - 1 : 1);
for (int i = penultimateIndex; i < nonZeroWeights.Count(); i++)
{
nonZeroWeights[i].Weight = originalWeights[penultimateIndex - 1];
}
}
//Procedure 2 - increment the modified weights (subject to a cap equal to their original values) until the weight limit is hit (allowing a very slight overage for the last term in some cases)
int ultimateIndex = penultimateIndex;
var sumWeights = nonZeroWeights.Sum(w => w.Weight); //use this counter instead of summing every time for condition check within loop
bool justOverTarget = false;
while (!justOverTarget)
{
for (int i = ultimateIndex; i < nonZeroWeights.Count(); i++)
{
if (nonZeroWeights[i].Weight + 1 > originalWeights[i])
{
if (ultimateIndex < nonZeroWeights.Count() - 1)
ultimateIndex++;
else justOverTarget = true;
}
else
{
nonZeroWeights[i].Weight++;
sumWeights++;
}
}
if ((nonZeroWeights.Last().Weight / sumWeights) >= weightLimit)
{
justOverTarget = true;
}
}
return nonZeroWeights.Concat(zeroWeights).ToList();
}
public class AttributeWeightPair<T>
{
public T Attribute { get; set; }
public decimal? Weight { get; set; }
public AttributeWeightPair(T attribute, decimal? count)
{
this.Attribute = attribute;
this.Weight = count;
}
}

Combination Algorithm

Length = input Long(can be 2550, 2880, 2568, etc)
List<long> = {618, 350, 308, 300, 250, 232, 200, 128}
The program takes a long value, for that particular long value we have to find the possible combination from the above list which when added give me a input result(same value can be used twice). There can be a difference of +/- 30.
Largest numbers have to be used most.
Ex:Length = 868
For this combinations can be
Combination 1 = 618 + 250
Combination 2 = 308 + 232 + 200 +128
Correct Combination would be Combination 1
But there should also be different combinations.
public static void Main(string[] args)
{
//subtotal list
List<int> totals = new List<int>(new int[] { 618, 350, 308, 300, 250, 232, 200, 128 });
// get matches
List<int[]> results = KnapSack.MatchTotal(2682, totals);
// print results
foreach (var result in results)
{
Console.WriteLine(string.Join(",", result));
}
Console.WriteLine("Done.");
}
internal static List<int[]> MatchTotal(int theTotal, List<int> subTotals)
{
List<int[]> results = new List<int[]>();
while (subTotals.Contains(theTotal))
{
results.Add(new int[1] { theTotal });
subTotals.Remove(theTotal);
}
if (subTotals.Count == 0)
return results;
subTotals.Sort();
double mostNegativeNumber = subTotals[0];
if (mostNegativeNumber > 0)
mostNegativeNumber = 0;
if (mostNegativeNumber == 0)
subTotals.RemoveAll(d => d > theTotal);
for (int choose = 0; choose <= subTotals.Count; choose++)
{
IEnumerable<IEnumerable<int>> combos = Combination.Combinations(subTotals.AsEnumerable(), choose);
results.AddRange(from combo in combos where combo.Sum() == theTotal select combo.ToArray());
}
return results;
}
public static class Combination
{
public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int choose)
{
return choose == 0 ?
new[] { new T[0] } :
elements.SelectMany((element, i) =>
elements.Skip(i + 1).Combinations(choose - 1).Select(combo => (new[] { element }).Concat(combo)));
}
}
I Have used the above code, can it be more simplified, Again here also i get unique values. A value can be used any number of times. But the largest number has to be given the most priority.
I have a validation to check whether the total of the sum is greater than the input value. The logic fails even there..

The algorithm you have shown assumes that the list is sorted in ascending order. If not, then you shall first have to sort the list in O(nlogn) time and then execute the algorithm.
Also, it assumes that you are only considering combinations of pairs and you exit on the first match.
If you want to find all combinations, then instead of "break", just output the combination and increment startIndex or decrement endIndex.
Moreover, you should check for ranges (targetSum - 30 to targetSum + 30) rather than just the exact value because the problem says that a margin of error is allowed.
This is the best solution according to me because its complexity is O(nlogn + n) including the sorting.

V4 - Recursive Method, using Stack structure instead of stack frames on thread
It works (tested in VS), but there could be some bugs remaining.
static int Threshold = 30;
private static Stack<long> RecursiveMethod(long target)
{
Stack<long> Combination = new Stack<long>(establishedValues.Count); //Can grow bigger, as big as (target / min(establishedValues)) values
Stack<int> Index = new Stack<int>(establishedValues.Count); //Can grow bigger
int lowerBound = 0;
int dimensionIndex = lowerBound;
long fail = -1 * Threshold;
while (true)
{
long thisVal = establishedValues[dimensionIndex];
dimensionIndex++;
long afterApplied = target - thisVal;
if (afterApplied < fail)
lowerBound = dimensionIndex;
else
{
target = afterApplied;
Combination.Push(thisVal);
if (target <= Threshold)
return Combination;
Index.Push(dimensionIndex);
dimensionIndex = lowerBound;
}
if (dimensionIndex >= establishedValues.Count)
{
if (Index.Count == 0)
return null; //No possible combinations
dimensionIndex = Index.Pop();
lowerBound = dimensionIndex;
target += Combination.Pop();
}
}
}
Maybe V3 - Suggestion for Ordered solution trying every combination
Although this isn't chosen as the answer for the related question, I believe this is a good approach - https://stackoverflow.com/a/17258033/887092(, otherwise you could try the chosen answer (although the output for that is only 2 items in set being summed, rather than up to n items)) - it will enumerate every option including multiples of the same value. V2 works but would be slightly less efficient than an ordered solution, as the same failing-attempt will likely be attempted multiple times.
V2 - Random Selection - Will be able to reuse the same number twice
I'm a fan of using random for "intelligence", allowing the computer to brute force the solution. It's also easy to distribute - as there is no state dependence between two threads trying at the same time for example.
static int Threshold = 30;
public static List<long> RandomMethod(long Target)
{
List<long> Combinations = new List<long>();
Random rnd = new Random();
//Assuming establishedValues is sorted
int LowerBound = 0;
long runningSum = Target;
while (true)
{
int newLowerBound = FindLowerBound(LowerBound, runningSum);
if (newLowerBound == -1)
{
//No more beneficial values to work with, reset
runningSum = Target;
Combinations.Clear();
LowerBound = 0;
continue;
}
LowerBound = newLowerBound;
int rIndex = rnd.Next(LowerBound, establishedValues.Count);
long val = establishedValues[rIndex];
runningSum -= val;
Combinations.Add(val);
if (Math.Abs(runningSum) <= 30)
return Combinations;
}
}
static int FindLowerBound(int currentLowerBound, long runningSum)
{
//Adjust lower bound, so we're not randomly trying a number that's too high
for (int i = currentLowerBound; i < establishedValues.Count; i++)
{
//Factor in the threshold, because an end aggregate which exceeds by 20 is better than underperforming by 21.
if ((establishedValues[i] - Threshold) < runningSum)
{
return i;
}
}
return -1;
}
V1 - Ordered selection - Will not be able to reuse the same number twice
Add this very handy extension function (uses a binary algorithm to find all combinations):
//Make sure you put this in a static class inside System namespace
public static IEnumerable<List<T>> EachCombination<T>(this List<T> allValues)
{
var collection = new List<List<T>>();
for (int counter = 0; counter < (1 << allValues.Count); ++counter)
{
List<T> combination = new List<T>();
for (int i = 0; i < allValues.Count; ++i)
{
if ((counter & (1 << i)) == 0)
combination.Add(allValues[i]);
}
if (combination.Count == 0)
continue;
yield return combination;
}
}
Use the function
static List<long> establishedValues = new List<long>() {618, 350, 308, 300, 250, 232, 200, 128, 180, 118, 155};
//Return is a list of the values which sum to equal the target. Null if not found.
List<long> FindFirstCombination(long target)
{
foreach (var combination in establishedValues.EachCombination())
{
//if (combination.Sum() == target)
if (Math.Abs(combination.Sum() - target) <= 30) //Plus or minus tolerance for difference
return combination;
}
return null; //Or you could throw an exception
}
Test the solution
var target = 858;
var result = FindFirstCombination(target);
bool success = (result != null && result.Sum() == target);
//TODO: for loop with random selection of numbers from the establishedValues, Sum and test through FindFirstCombination

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

selection based on percentage weighting - c#

Let T = the sum of all item weights Let R = a random number between 0 and T Iterate the item list subtracting each item weight from R and return the item that causes the result to become <= 0.

Related

Random number with Probabilities in C#

Nearest value from user input in an array C#

Where is the flaw in my algorithm for consolidating gold mines?

C# Simple Constrained Weighted Average Algorithm with/without Solver

Combination Algorithm

Categories

Resources