C# Simple Constrained Weighted Average Algorithm with/without Solver - c#

I'm at a loss as to why I can't get this seemingly simple problem solved using Microsoft Solver Foundation.
All I need is to modify the weights (numbers) of certain observations to ensure that no 1 observation's weight AS A PERCENTAGE exceeds 25%. This is for the purposes of later calculating a constrained weighted average with the results of this algorithm.
For example, given the 5 weights of { 45, 100, 33, 500, 28 }, I would expect the result of this algorithm to be { 45, 53, 33, 53, 28 }, where 2 of the numbers had to be reduced such that they're within the 25% threshold of the new total (212 = 45+53+33+53+28) while the others remained untouched. Note that even though initially, the 2nd weight of 100 was only 14% of the total (706), as a result of decreasing the 4th weight of 500, it subsequently pushed up the % of the other observations and therein lies the only challenge with this.
I tried to recreate this using Solver, only for it to report that the solution is "Infeasible" and return all 1s. Update: the solution need not use Solver; any alternative is welcome so long as it is fast when dealing with a decent number of weights.
var solver = SolverContext.GetContext();
var model = solver.CreateModel();
var decisionList = new List<Decision>();
decisionList.Add(new Decision(Domain.IntegerRange(1, 45), "Dec1"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 100), "Dec2"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 33), "Dec3"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 500), "Dec4"));
decisionList.Add(new Decision(Domain.IntegerRange(1, 28), "Dec5"));
model.AddDecisions(decisionList.ToArray());
int weightLimit = 25;
foreach (var decision in model.Decisions)
{
model.AddConstraint(decision.Name + "weightLimit", 100 * (decision / Model.Sum(model.Decisions.ToArray())) <= weightLimit);
}
model.AddGoal("calcGoal", GoalKind.Maximize, Model.Sum(model.Decisions.ToArray()));
var solution = solver.Solve();
foreach (var decision in model.Decisions)
{
Debug.Print(decision.GetDouble().ToString());
}
Debug.Print("Solution Quality: " + solution.Quality.ToString());
Any help with this would be very much appreciated, thanks in advance.

I ditched Solver b/c it didn't live up to its name imo (or I didn't live up to its standards :)). Below is where I landed. Because this function gets used many times and on large lists of input weights, efficiency and performance are key so this function attempts to do the least # of iterations possible (let me know if anyone has any suggested improvements though). The results get used for a weighted average so I use "AttributeWeightPair" to store the value (attribute) and its weight and the function below is what modifies the weights to be within the constraint when given a list of these AWPs. The function assumes that weightLimit is passed in as a %, e.g. 25% gets passed in as 25, not 0.25 --- ok I'll stop stating what'll be obvious from the code - so here it is:
public static List<AttributeWeightPair<decimal>> WeightLimiter(List<AttributeWeightPair<decimal>> source, decimal weightLimit)
{
weightLimit /= 100; //convert from a percentage to a fraction (25 -> 0.25)
var zeroWeights = source.Where(w => w.Weight == 0).ToList();
var nonZeroWeights = source.Where(w => w.Weight > 0).ToList();
if (nonZeroWeights.Count == 0)
return source;
//return equal weights if given infeasible constraint
if ((1m / nonZeroWeights.Count()) > weightLimit)
{
nonZeroWeights.ForEach(w => w.Weight = 1);
return nonZeroWeights.Concat(zeroWeights).ToList();
}
//return original list if weight-limiting is unnecessary
if ((nonZeroWeights.Max(w => w.Weight) / nonZeroWeights.Sum(w => w.Weight)) <= weightLimit)
{
return source;
}
//sort (ascending) and store original weights
nonZeroWeights = nonZeroWeights.OrderBy(w => w.Weight).ToList();
var originalWeights = nonZeroWeights.Select(w => w.Weight).ToList();
//set starting point and determine direction from there
var initialSumWeights = nonZeroWeights.Sum(w => w.Weight);
var initialLimit = weightLimit * initialSumWeights;
var initialSuspects = nonZeroWeights.Where(w => w.Weight > initialLimit).ToList();
var initialTarget = weightLimit * (initialSumWeights - (initialSuspects.Sum(w => w.Weight) - initialLimit * initialSuspects.Count()));
var antepenultimateIndex = Math.Max(nonZeroWeights.FindLastIndex(w => w.Weight <= initialTarget), 1); //needs to be at least 1
for (int i = antepenultimateIndex; i < nonZeroWeights.Count(); i++)
{
nonZeroWeights[i].Weight = originalWeights[antepenultimateIndex - 1]; //set cap equal to the preceding weight
}
bool goingUp = (nonZeroWeights[antepenultimateIndex].Weight / nonZeroWeights.Sum(w => w.Weight)) > weightLimit ? false : true;
//Procedure 1 - find the weight # at which a cap would result in a weight % just UNDER the weight limit
int penultimateIndex = antepenultimateIndex;
bool justUnderTarget = false;
while (!justUnderTarget)
{
for (int i = penultimateIndex; i < nonZeroWeights.Count(); i++)
{
nonZeroWeights[i].Weight = originalWeights[penultimateIndex - 1]; //set cap equal to the preceding weight
}
var currentMaxPcntWeight = nonZeroWeights[penultimateIndex].Weight / nonZeroWeights.Sum(w => w.Weight);
if (currentMaxPcntWeight == weightLimit)
{
return nonZeroWeights.Concat(zeroWeights).ToList();
}
else if (goingUp && currentMaxPcntWeight < weightLimit)
{
nonZeroWeights[penultimateIndex].Weight = originalWeights[penultimateIndex]; //reset
if (penultimateIndex < nonZeroWeights.Count() - 1)
penultimateIndex++; //move up
else break;
}
else if (!goingUp && currentMaxPcntWeight > weightLimit)
{
if (penultimateIndex > 1)
penultimateIndex--; //move down
else break;
}
else
{
justUnderTarget = true;
}
}
if (goingUp) //then need to back up a step
{
penultimateIndex = (penultimateIndex > 1 ? penultimateIndex - 1 : 1);
for (int i = penultimateIndex; i < nonZeroWeights.Count(); i++)
{
nonZeroWeights[i].Weight = originalWeights[penultimateIndex - 1];
}
}
//Procedure 2 - increment the modified weights (subject to a cap equal to their original values) until the weight limit is hit (allowing a very slight overage for the last term in some cases)
int ultimateIndex = penultimateIndex;
var sumWeights = nonZeroWeights.Sum(w => w.Weight); //use this counter instead of summing every time for condition check within loop
bool justOverTarget = false;
while (!justOverTarget)
{
for (int i = ultimateIndex; i < nonZeroWeights.Count(); i++)
{
if (nonZeroWeights[i].Weight + 1 > originalWeights[i])
{
if (ultimateIndex < nonZeroWeights.Count() - 1)
ultimateIndex++;
else justOverTarget = true;
}
else
{
nonZeroWeights[i].Weight++;
sumWeights++;
}
}
if ((nonZeroWeights.Last().Weight / sumWeights) >= weightLimit)
{
justOverTarget = true;
}
}
return nonZeroWeights.Concat(zeroWeights).ToList();
}
public class AttributeWeightPair<T>
{
public T Attribute { get; set; }
public decimal? Weight { get; set; }
public AttributeWeightPair(T attribute, decimal? count)
{
this.Attribute = attribute;
this.Weight = count;
}
}
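For comparison, here is a minimal closed-form sketch in Python. It is not the function above, just a different way to get the same kind of result. It assumes non-negative weights and a limit passed as a fraction (0.25, not 25). The names cap_weights and limit are made up for this illustration. If the k largest weights are all set to a common cap c and the rest sum to S, then c / (S + k*c) = limit gives c = limit * S / (1 - k * limit). The loop tries k = 1, 2, ... until the largest untouched weight fits under the cap.

```python
from fractions import Fraction

def cap_weights(weights, limit):
    """Cap weights so that no weight exceeds `limit` (a fraction) of the new total.

    Hypothetical helper for illustration; assumes all weights are >= 0.
    """
    limit = Fraction(limit)
    positive = sorted(w for w in weights if w > 0)
    n = len(positive)
    if n == 0:
        return list(weights)
    # With n items the best achievable maximum share is 1/n, so fall back
    # to equal weights when the limit cannot be beaten (same idea as above).
    if n * limit <= 1:
        return [1 if w > 0 else 0 for w in weights]
    total = sum(positive)
    if positive[-1] <= limit * total:
        return list(weights)  # nothing exceeds the limit
    uncapped_sum = total
    for k in range(1, n):
        uncapped_sum -= positive[n - k]  # the k largest weights get capped
        cap = limit * uncapped_sum / (1 - k * limit)
        if positive[n - k - 1] <= cap:  # largest untouched weight fits
            return [min(w, cap) for w in weights]
    raise AssertionError("not expected for non-negative weights")

print(cap_weights([45, 100, 33, 500, 28], 0.25))
```

For the weights in the question the cap comes out as exactly 53. In general the cap can be fractional, whereas the function above steps in whole units, so the two may differ slightly. The work is one sort plus a single pass.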

Related

Determine if a number can be made with prepicked numbers and times

I have two arrays: the first holds the numbers (denominations) that can be used, and the second holds how many times each of those numbers can be used. The input is a letter plus a number: the letter determines which method is used, and the number is the amount I have to make from the available numbers. I need to figure out how many times each number from the first array must be used to make that amount. If the amount cannot be made, I would like to just say "number can't be made" (or similar) and let the program move on.
Here is what I have
int[] picksToUse = { 100, 50, 20, 10, 5, 1 };
int[] timesToUse = { 10, 10, 10, 10, 10, 10 };
string choice = Console.ReadLine();
string input = "";
if(choice.Length > 2)
{
input = choice.Substring(choice.IndexOf("$") + 1);
}
if(...){
}
else if (choice.Equals("D"))
{
int amt = Convert.ToInt32(input);
// code here to determine if number can be made with above choices
Dispense(amt, timesToUse);
}
Assuming picksToUse and timesToUse are exactly the same as you declared them, here's a way to know if you have enough of everything in stock to "pay up". It's a boolean function, which uses recursion. You would call it with the amount needed as parameter, and it would tell you right there if you have enough of everything.
Private Function HasCashInStock(amount As Integer, Optional index As Integer = 0) As Boolean
Dim billsNeeded As Integer = amount \ picksToUse(index)
If billsNeeded > timesToUse(index) Then
Return False
End If
amount -= picksToUse(index) * billsNeeded
If amount = 0 Then
Return True
End If
Return HasCashInStock(amount, index + 1)
End Function
The \ is the integer division operator (in VB.NET, at least; I'm shamelessly letting you translate this code). If you're not familiar with it: when you use it on integers, it discards the fractional part of the result. In C#, / on two ints already behaves this way.
3 / 2 yields 1.5, which is not an integer.
3 \ 2 yields 1.
That's all there is to it, really. Oh yeah, and recursion. I like recursion, but others will tell you to avoid it as much as you can. What can I say, I think that a nice recursive function has elegance.
You can also copy this function, modify it, and use the copy to subtract from your timesToUse() array once you know for sure that there's enough of everything to pay up.
If HasCashInStock(HereIsTheAmountAsInteger) Then
GivesTheMoney(HereIsTheAmountAsInteger)
End If
Having two functions is not the leanest code but it would be more readable. Have fun!
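One caveat: the function above returns False as soon as one denomination runs short, even if smaller bills could cover the rest (1100 with only ten 100s, for example). Below is a minimal Python sketch of a check that falls back to smaller bills. The name dispense and its return shape are assumptions for illustration, not part of the original code.

```python
def dispense(amount, denominations, counts):
    """Return how many of each denomination to use, or None if impossible.

    Hypothetical helper: tries the largest bills first and backs off
    when a choice cannot be completed with the remaining stock.
    """
    n = len(denominations)
    # capacity[i] = total value still available from denomination i onwards
    capacity = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        capacity[i] = capacity[i + 1] + denominations[i] * counts[i]

    def solve(remaining, index):
        if remaining == 0:
            return [0] * (n - index)
        if index == n or remaining > capacity[index]:
            return None
        value = denominations[index]
        most = min(remaining // value, counts[index])
        for used in range(most, -1, -1):  # prefer as many large bills as possible
            rest = solve(remaining - used * value, index + 1)
            if rest is not None:
                return [used] + rest
        return None

    return solve(amount, 0)

picks_to_use = [100, 50, 20, 10, 5, 1]
times_to_use = [10, 10, 10, 10, 10, 10]
print(dispense(1100, picks_to_use, times_to_use))
```

Returning the counts means one function can do both the check and the dispensing: a None result is the "cannot be made" case, and otherwise the counts can be subtracted from the stock.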
This is what I used to finish my project.
public static bool Validate(int amount, int[] total, int[] needed)
{
int[] billCount = total;
int[] cash = { 100, 50, 20, 10, 5, 1 };
int remaining = amount; // renamed: a local called "total" clashed with the int[] parameter
bool isValid = true;
for (int i = 0; i < total.Length; i++)
{
if(remaining >= cash[i])
{
billCount[i] = billCount[i] - needed[i];
}
if(billCount[i] < 0)
{
isValid = false;
break;
}
}
return isValid;
}
I got some working code. I did it with a class. I remember when I couldn't see what good classes were. Now I can't brush my teeth without a class. :-)
I force myself to do these problems to gain a little experience in C#. Now I have 3 things I like about C#.
class ATM
{
public int Denomination { get; set; }
public int Inventory { get; set; }
public ATM(int denom, int inven)
{
Denomination = denom;
Inventory = inven;
}
}
List<int> Bills = new List<int>();
List<ATM> ATMs = new List<ATM>();
private void OP2()
{
int[] picksToUse = { 100, 50, 20, 10, 5, 1 };
foreach (int d in picksToUse )
{
ATM atm = new ATM(d, 10);
ATMs.Add(atm);
}
//string sAmtRequested = Console.ReadLine();
string sAmtRequested = textBox1.Text;
if (int.TryParse(sAmtRequested, out int AmtRequested))
{
int RunningBalance = AmtRequested;
do
{
ATM BillReturn = GetBill(RunningBalance);
if (BillReturn is null)
{
MessageBox.Show("Cannot complete transaction");
return;
}
RunningBalance -= BillReturn.Denomination;
} while (RunningBalance > 0);
}
else
{
MessageBox.Show("Non-numeric request.");
return;
}
foreach (int bill in Bills)
Debug.Print(bill.ToString());
Debug.Print("Remaining Inventory");
foreach (ATM atm in ATMs)
Debug.Print($"For Denomination {atm.Denomination} there are {atm.Inventory} bills remaining");
}
private ATM GetBill(int RequestBalance)
{
var FilteredATMs = from atm in ATMs
where atm.Inventory > 0
orderby atm.Denomination descending
select atm;
foreach (ATM bill in FilteredATMs)
{
if (RequestBalance >= bill.Denomination )
{
bill.Inventory -= 1;
Bills.Add(bill.Denomination);
return bill;
}
}
return null;
}

Which corner case unit test would this fail?

I tried the Fish problem on Codility and I secured 75% marks for correctness because the results reported that my code failed one simple test case. The results do not report what input was provided for the test case.
Could you please help me find out what is wrong with my code and what corner case it would fail?
using System;
public class Solution
{
// Time complexity: O(N)
// Space complexity: O(N)
public int solution(int[] sizes, int[] direction)
{
if (sizes == null || direction == null)
throw new ArgumentNullException();
var sizesLen = sizes.Length;
var directionLen = direction.Length;
if (sizesLen != direction.Length)
throw new ArgumentException();
var len = sizesLen;
if (len <= 1) return len;
var survivors = new Fish[len];
survivors[0] = new Fish(sizes[0], direction[0]);
var curr = 0;
for (int i = 1; i < len; i++)
{
var fish = new Fish(sizes[i], direction[i]);
if (survivors[curr].Direction == 1 && fish.Direction == 0)
{
if (fish.Size < survivors[curr].Size) continue;
while(curr >= 0 &&
fish.Size > survivors[curr].Size &&
survivors[curr].Direction == 1)
{
curr--;
}
}
survivors[++curr] = fish;
}
return ++curr;
}
}
public class Fish
{
public Fish(int size, int direction)
{
Size = size;
Direction = direction;
}
public int Size { get; set; }
public int Direction { get; set; }
}
As mentioned in your code, your Solution is O(M*N). As stated in the problem link, the code should run in linear time. Hence, I will not correct your solution as it will eventually fail on bigger test cases. I will provide you a linear algorithm that you can easily implement.
Keep a Stack S, empty initially.
Iterate over the array A, i from 0 to n-1
When you encounter an element, say A[i], do the following
If the stack S is empty, then push both (A[i], B[i]) as a pair
Else, extract the top pair from the stack S and compare the value of B[top] and B[i].
While B[top] is 1 and B[i] is 0, then one of the fishes will eat the other one. So pop from stack S, the top element. Now, compare which fish is bigger with values A[top] and A[i]. Whichever is bigger, that fish stays alive. Push that pair in the stack S, that corresponds to the fish that stays alive. Continue the while loop till the condition fails
Otherwise (B[top] is not 1, or B[i] is not 0) the two fish never meet, so simply push the new pair (A[i],B[i])
The size of the stack S at the end, is your answer.
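The steps above can be sketched in Python as follows. This is a minimal illustration, assuming all sizes are distinct, direction 1 means downstream and 0 means upstream (as in the problem statement). The function name surviving_fish is made up for this example.

```python
def surviving_fish(sizes, directions):
    """Count the fish left alive, using a single stack of (size, direction)."""
    stack = []
    for size, direction in zip(sizes, directions):
        alive = True
        # A meeting only happens when the fish on top swims downstream (1)
        # and the new fish swims upstream (0).
        while stack and stack[-1][1] == 1 and direction == 0:
            if stack[-1][0] > size:
                alive = False  # the new fish is eaten
                break
            stack.pop()  # the fish on top is eaten; check the next one
        if alive:
            stack.append((size, direction))
    return len(stack)

print(surviving_fish([4, 3, 2, 1, 5], [0, 1, 0, 0, 0]))
```

Each fish is pushed at most once and popped at most once, which is where the linear bound comes from.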
Note: You might not be passing that test case, for which your solution times out. For example, for N=100000, your solution will time out.
In my solution, the worst case time complexity is O(N+N) = O(2N) = O(N). N times because of the iteration over array A and another N times worst case, due to the Stack if it keeps shrinking, for the while condition holds true.
Hope it helps!!!
Edit: suppose A = [ 99, 98, 92, 91, 93 ], and B = [1, 1, 1, 1, 0]. Your code gives answer as 3. Expected answer = 2
Edit-2: This is your modified code that will pass every test case
public int solution(int[] sizes, int[] direction)
{
if (sizes == null || direction == null)
throw new ArgumentNullException();
var sizesLen = sizes.Length;
var directionLen = direction.Length;
if (sizesLen != direction.Length)
throw new ArgumentException();
var len = sizesLen;
if (len <= 1) return len;
var survivors = new Fish[len];
survivors[0] = new Fish(sizes[0], direction[0]);
var curr = 0;
for (int i = 1; i < len; i++)
{
var fish = new Fish(sizes[i], direction[i]);
if (survivors[curr].Direction == 1 && fish.Direction == 0)
{
if (fish.Size < survivors[curr].Size) continue;
while(curr >= 0 &&
fish.Size > survivors[curr].Size &&
survivors[curr].Direction == 1)
{
curr--;
}
if (curr >= 0)
{
if (fish.Size < survivors[curr].Size &&
survivors[curr].Direction == 1)
continue;
}
}
survivors[++curr] = fish;
}
return ++curr;
}
}
public class Fish
{
public Fish(int size, int direction)
{
Size = size;
Direction = direction;
}
public int Size { get; set; }
public int Direction { get; set; }
}
I think the intention here is to use Stack or Queue. Here is a solution with two Stack.
public static int Fish(int[] A, int[] B)
{
var downStreamFish = new Stack<int>(B.Length);
var upStreamFish = new Stack<int>(B.Length);
var result = B.Length;
for (var i = 0; i < B.Length; i++)
{
// push the fish into up/down stream stack.
if (B[i] == 1)
downStreamFish.Push(i);
else
upStreamFish.Push(i);
// check to see whether it's possible to eat a fish
while (downStreamFish.Count > 0 && upStreamFish.Count > 0)
{
var dfIndex = downStreamFish.Peek();
var ufIndex = upStreamFish.Peek();
//NOTE:downstream fish index must be less than upstream fish index in order for 'eat' to happen
if (dfIndex < ufIndex)
{
if (A[dfIndex] > A[ufIndex])
upStreamFish.Pop();
else
downStreamFish.Pop();
result--; // one fish is eaten
}
else
break; // eat condition is not met
}
}
return result;
}

Combination Algorithm

Length = input Long(can be 2550, 2880, 2568, etc)
List<long> = {618, 350, 308, 300, 250, 232, 200, 128}
The program takes a long value. For that value we have to find the possible combinations from the above list which, when added together, give the input value (the same value can be used twice). A difference of +/- 30 is allowed.
The largest numbers should be used as much as possible.
Ex:Length = 868
For this combinations can be
Combination 1 = 618 + 250
Combination 2 = 308 + 232 + 200 +128
Correct Combination would be Combination 1
But there should also be different combinations.
public static void Main(string[] args)
{
//subtotal list
List<int> totals = new List<int>(new int[] { 618, 350, 308, 300, 250, 232, 200, 128 });
// get matches
List<int[]> results = KnapSack.MatchTotal(2682, totals);
// print results
foreach (var result in results)
{
Console.WriteLine(string.Join(",", result));
}
Console.WriteLine("Done.");
}
internal static List<int[]> MatchTotal(int theTotal, List<int> subTotals)
{
List<int[]> results = new List<int[]>();
while (subTotals.Contains(theTotal))
{
results.Add(new int[1] { theTotal });
subTotals.Remove(theTotal);
}
if (subTotals.Count == 0)
return results;
subTotals.Sort();
double mostNegativeNumber = subTotals[0];
if (mostNegativeNumber > 0)
mostNegativeNumber = 0;
if (mostNegativeNumber == 0)
subTotals.RemoveAll(d => d > theTotal);
for (int choose = 0; choose <= subTotals.Count; choose++)
{
IEnumerable<IEnumerable<int>> combos = Combination.Combinations(subTotals.AsEnumerable(), choose);
results.AddRange(from combo in combos where combo.Sum() == theTotal select combo.ToArray());
}
return results;
}
public static class Combination
{
public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int choose)
{
return choose == 0 ?
new[] { new T[0] } :
elements.SelectMany((element, i) =>
elements.Skip(i + 1).Combinations(choose - 1).Select(combo => (new[] { element }).Concat(combo)));
}
}
I have used the code above. Can it be simplified? It also only gives me combinations of unique values, whereas a value should be usable any number of times, and the largest number has to be given the highest priority.
I have a validation to check whether the sum is greater than the input value, but the logic fails even there.
The algorithm you have shown assumes that the list is sorted in ascending order. If not, then you shall first have to sort the list in O(nlogn) time and then execute the algorithm.
Also, it assumes that you are only considering combinations of pairs and you exit on the first match.
If you want to find all combinations, then instead of "break", just output the combination and increment startIndex or decrement endIndex.
Moreover, you should check for ranges (targetSum - 30 to targetSum + 30) rather than just the exact value because the problem says that a margin of error is allowed.
This is the best solution according to me because its complexity is O(nlogn + n) including the sorting.
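As a rough Python sketch of the two-index scan described above, with the +/- 30 range check added: the function name, the tolerance parameter and the choice to stop at the first match are assumptions for illustration. It only considers pairs of two different list entries.

```python
def find_pair_within(values, target, tolerance=30):
    """Return one pair whose sum is within `tolerance` of `target`, else None."""
    ordered = sorted(values)  # the scan relies on ascending order
    start, end = 0, len(ordered) - 1
    while start < end:
        total = ordered[start] + ordered[end]
        if total < target - tolerance:
            start += 1  # even the largest remaining partner is too small
        elif total > target + tolerance:
            end -= 1  # even the smallest remaining partner is too large
        else:
            return ordered[start], ordered[end]
    return None

print(find_pair_within([618, 350, 308, 300, 250, 232, 200, 128], 868))
```

This returns the first pair the scan reaches, which is not necessarily the closest one: for 868 it finds 232 + 618 = 850 before it would reach 250 + 618.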
V4 - Recursive Method, using Stack structure instead of stack frames on thread
It works (tested in VS), but there could be some bugs remaining.
static int Threshold = 30;
private static Stack<long> RecursiveMethod(long target)
{
Stack<long> Combination = new Stack<long>(establishedValues.Count); //Can grow bigger, as big as (target / min(establishedValues)) values
Stack<int> Index = new Stack<int>(establishedValues.Count); //Can grow bigger
int lowerBound = 0;
int dimensionIndex = lowerBound;
long fail = -1 * Threshold;
while (true)
{
long thisVal = establishedValues[dimensionIndex];
dimensionIndex++;
long afterApplied = target - thisVal;
if (afterApplied < fail)
lowerBound = dimensionIndex;
else
{
target = afterApplied;
Combination.Push(thisVal);
if (target <= Threshold)
return Combination;
Index.Push(dimensionIndex);
dimensionIndex = lowerBound;
}
if (dimensionIndex >= establishedValues.Count)
{
if (Index.Count == 0)
return null; //No possible combinations
dimensionIndex = Index.Pop();
lowerBound = dimensionIndex;
target += Combination.Pop();
}
}
}
Maybe V3 - Suggestion for Ordered solution trying every combination
Although this isn't chosen as the answer for the related question, I believe this is a good approach: https://stackoverflow.com/a/17258033/887092 (otherwise you could try the chosen answer, although its output is only 2 items in the set being summed, rather than up to n items). It will enumerate every option, including multiples of the same value. V2 works but would be slightly less efficient than an ordered solution, as the same failing attempt will likely be tried multiple times.
V2 - Random Selection - Will be able to reuse the same number twice
I'm a fan of using random for "intelligence", allowing the computer to brute force the solution. It's also easy to distribute - as there is no state dependence between two threads trying at the same time for example.
static int Threshold = 30;
public static List<long> RandomMethod(long Target)
{
List<long> Combinations = new List<long>();
Random rnd = new Random();
//Assuming establishedValues is sorted
int LowerBound = 0;
long runningSum = Target;
while (true)
{
int newLowerBound = FindLowerBound(LowerBound, runningSum);
if (newLowerBound == -1)
{
//No more beneficial values to work with, reset
runningSum = Target;
Combinations.Clear();
LowerBound = 0;
continue;
}
LowerBound = newLowerBound;
int rIndex = rnd.Next(LowerBound, establishedValues.Count);
long val = establishedValues[rIndex];
runningSum -= val;
Combinations.Add(val);
if (Math.Abs(runningSum) <= Threshold)
return Combinations;
}
}
static int FindLowerBound(int currentLowerBound, long runningSum)
{
//Adjust lower bound, so we're not randomly trying a number that's too high
for (int i = currentLowerBound; i < establishedValues.Count; i++)
{
//Factor in the threshold, because an end aggregate which exceeds by 20 is better than underperforming by 21.
if ((establishedValues[i] - Threshold) < runningSum)
{
return i;
}
}
return -1;
}
V1 - Ordered selection - Will not be able to reuse the same number twice
Add this very handy extension function (uses a binary algorithm to find all combinations):
//Make sure you put this in a static class inside System namespace
public static IEnumerable<List<T>> EachCombination<T>(this List<T> allValues)
{
var collection = new List<List<T>>();
for (int counter = 0; counter < (1 << allValues.Count); ++counter)
{
List<T> combination = new List<T>();
for (int i = 0; i < allValues.Count; ++i)
{
if ((counter & (1 << i)) == 0)
combination.Add(allValues[i]);
}
if (combination.Count == 0)
continue;
yield return combination;
}
}
Use the function
static List<long> establishedValues = new List<long>() {618, 350, 308, 300, 250, 232, 200, 128, 180, 118, 155};
//Return is a list of the values which sum to equal the target. Null if not found.
List<long> FindFirstCombination(long target)
{
foreach (var combination in establishedValues.EachCombination())
{
//if (combination.Sum() == target)
if (Math.Abs(combination.Sum() - target) <= 30) //Plus or minus tolerance for difference
return combination;
}
return null; //Or you could throw an exception
}
Test the solution
var target = 858;
var result = FindFirstCombination(target);
bool success = (result != null && Math.Abs(result.Sum() - target) <= 30); //same tolerance as FindFirstCombination
//TODO: for loop with random selection of numbers from the establishedValues, Sum and test through FindFirstCombination

Split a collection of double by size of its contents

I have a collection of numbers (Collection<double>) that can be any size and can contain negative and positive numbers. I am trying to split it up based on some criteria. Starting at the first number in the collection, I want to build a collection while the numbers are above -180 and below 180. Any numbers above 180 go in a new collection, and any numbers below -180 go in another new collection. If the numbers come back within the acceptable range, those go in a new collection again. The problem is that the collections need to stay in order.
For example.
Take a collection of 100:
the first 50 is between 180 and -180.
the next 20 are below -180
the next 20 are above 180
the last 10 are between 180 and -180
From the collection above I should now have 4 separate collections, in the same order as the original collection.
First collection numbers in original order between 180 and -180
second collection numbers in original order below -180
third collection numbers in original order above 180
fourth collection numbers in original order between 180 and -180
I have made an attempt, what I have doesn't work and is a nasty mess of if statements. I don't know linq very well but I think there may be a more elegant solution using that. Can anyone help me out here either with showing me how to create a linq statement or suggestions on how to get my if statements to work if that is the best way.
Collection<Tuple<Collection<double>, int>> collectionOfDataSets = new Collection<Tuple<Collection<double>, int>>();
Collection<double> newDataSet = new Collection<double>();
for (int i = 0; i < dataSet.Count; i++) {
if (dataSet[i] < 180 && dataSet[i] > -180) {
newDataSet.Add(dataSet[i]);
} else {
Tuple<Collection<double>, int> lastEntry = collectionOfDataSets.LastOrDefault(b => b.Item2 == i--);
if (lastEntry != null){
lastEntry.Item1.Add(dataSet[i]);
}
double lastInLastCollection = collectionOfDataSets.ElementAtOrDefault(collectionOfDataSets.Count).Item1.Last();
if (newDataSet.Count > 0 && lastInLastCollection!= dataSet[i]){
collectionOfDataSets.Add(new Tuple<Collection<double>, int>(newDataSet, i));
}
newDataSet = new Collection<double>();
}
}
Thank you in advance for any assistance.
Your example is complicated. I'll first state and solve a simpler problem, then use the same method to solve your original problem.
I want to split a list of numbers into contiguous groups of even and odd numbers. For example, given the list 2,2,4,3,6,2 I would split it into three groups [2,2,4], [3], [6,2]
This can be done concisely with a GroupAdjacentBy method
> var numbers = new List<int>{2,2,4,3,6,2};
> numbers.GroupAdjacentBy(x => x % 2)
[[2,2,4], [3], [6,2]]
To solve your problem, simply replace the even-odd classifying function above with your classification function:
> var points = new List<int>{-180,180};
> var f = new Func<int,int>(x => points.BinarySearch(x));
> var numbers = new List<int>{6,-50,100,190,200,20};
> numbers.GroupAdjacentBy(f)
[[6,-50,100], [190,200], [20]]
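GroupAdjacentBy is not part of standard LINQ, so it has to come from a helper or a library. To show what it is expected to do, here is a small Python sketch of the same idea using itertools.groupby, which groups consecutive items that share a key. The classify function and the decision to treat exactly -180 and 180 as in range are assumptions for illustration.

```python
from itertools import groupby

def classify(x):
    """Return -1 below -180, 1 above 180, and 0 otherwise."""
    if x < -180:
        return -1
    if x > 180:
        return 1
    return 0

def group_adjacent(values, key):
    """Split values into runs of consecutive items with the same key."""
    return [list(run) for _, run in groupby(values, key=key)]

print(group_adjacent([2, 2, 4, 3, 6, 2], lambda x: x % 2))
print(group_adjacent([6, -50, 100, 190, 200, 20], classify))
```

Because only neighbouring items are compared, the original order is preserved and a category can show up in more than one group.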
If you need the collections to be updated as soon as the values change, why don't you use properties? Something like
// your original collection
public IList<double> OriginalValues; //= new List<double> { -1000, 5, 7, 1000 };
public IList<double> BelowMinus180
{
get { return OriginalValues.Where(x => x < -180).ToList().AsReadOnly(); }
}
public IList<double> BetweenMinus180And180
{
get { return OriginalValues.Where(x => x >= -180 && x <= 180).ToList().AsReadOnly(); }
}
public IList<double> Above180
{
get { return OriginalValues.Where(x => x > 180).ToList().AsReadOnly(); }
}
public static List<List<T>> PartitionBy<T>(this IEnumerable<T> seq, Func<T, bool> predicate)
{
bool lastPass = true;
return seq.Aggregate(new List<List<T>>(), (partitions, item) =>
{
bool inc = predicate(item);
if (inc == lastPass)
{
if (partitions.Count == 0)
{
partitions.Add(new List<T>());
}
partitions.Last().Add(item);
}
else
{
partitions.Add(new List<T> { item });
}
lastPass = inc;
return partitions;
});
}
You can then use:
List<List<double>> segments = newDataSet.PartitionBy(d => d > -180 && d < 180);
How about this possible solution using two passes? In the first pass we find the indices where a change occurs, and in the second pass we do the actual partitioning.
First an auxiliary method to determine the category:
protected int DetermineCategory(double number)
{
if (number < 180 && number > -180)
return 0;
else if (number < -180)
return 1;
else
return 2;
}
And then the actual algorithm:
List<int> indices = new List<int>();
int currentCategory = -1;
for (int i = 0; i < numbers.Count; i++)
{
int newCat = DetermineCategory(numbers[i]);
if (newCat != currentCategory)
{
indices.Add(i);
currentCategory = newCat;
}
}
indices.Add(numbers.Count); //sentinel so the final group is also emitted
List<List<double>> collections = new List<List<double>>(indices.Count);
for (int i = 1; i < indices.Count; ++i)
collections.Add(new List<double>(
numbers.Skip(indices[i - 1]).Take(indices[i] - indices[i - 1])));
Here is a new answer based on the new info you provided. I hope this time I will be closer to what you need
public IEnumerable<IList<double>> GetCollectionOfCollections(IList<double> values, IList<double> boundries)
{
var ordered = values.OrderBy(x => x).ToList();
for (int i = 0; i < boundries.Count; i++)
{
var collection = ordered.Where(x => x < boundries[i]).ToList();
if (collection.Count > 0)
{
ordered = ordered.Except(collection).ToList();
yield return collection.ToList();
}
}
if (ordered.Count() > 0)
{
yield return ordered;
}
}
One method with linq. Untested but should work
var firstSet = dataSet.TakeWhile(x=>x>-180&&x<180);
var totalCount = firstSet.Count();
var secondSet = dataSet.Skip(totalCount).TakeWhile(x=>x<-180);
totalCount+=secondSet.Count();
var thirdSet = dataSet.Skip(totalCount).TakeWhile(x=>x>180);
totalCount += thirdSet.Count();
var fourthSet = dataSet.Skip(totalCount);

selection based on percentage weighting

I have a set of values, and an associated percentage for each:
a: 70% chance
b: 20% chance
c: 10% chance
I want to select a value (a, b, c) based on the percentage chance given.
how do I approach this?
my attempt so far looks like this:
r = random.random()
if r <= .7:
    return a
elif r <= .9:
    return b
else:
    return c
I'm stuck coming up with an algorithm to handle this. How should I approach this so it can handle larger sets of values without just chaining together if-else flows.
(any explanation or answers in pseudo-code are fine. a python or C# implementation would be especially helpful)
Here is a complete solution in C#:
public class ProportionValue<T>
{
public double Proportion { get; set; }
public T Value { get; set; }
}
public static class ProportionValue
{
public static ProportionValue<T> Create<T>(double proportion, T value)
{
return new ProportionValue<T> { Proportion = proportion, Value = value };
}
static Random random = new Random();
public static T ChooseByRandom<T>(
this IEnumerable<ProportionValue<T>> collection)
{
var rnd = random.NextDouble();
foreach (var item in collection)
{
if (rnd < item.Proportion)
return item.Value;
rnd -= item.Proportion;
}
throw new InvalidOperationException(
"The proportions in the collection do not add up to 1.");
}
}
Usage:
var list = new[] {
ProportionValue.Create(0.7, "a"),
ProportionValue.Create(0.2, "b"),
ProportionValue.Create(0.1, "c")
};
// Outputs "a" with probability 0.7, etc.
Console.WriteLine(list.ChooseByRandom());
For Python:
>>> import random
>>> dst = 70, 20, 10
>>> vls = 'a', 'b', 'c'
>>> picks = [v for v, d in zip(vls, dst) for _ in range(d)]
>>> for _ in range(12): print random.choice(picks),
...
a c c b a a a a a a a a
>>> for _ in range(12): print random.choice(picks),
...
a c a c a b b b a a a a
>>> for _ in range(12): print random.choice(picks),
...
a a a a c c a c a a c a
>>>
General idea: make a list where each item is repeated a number of times proportional to the probability it should have; use random.choice to pick one at random (uniformly), this will match your required probability distribution. Can be a bit wasteful of memory if your probabilities are expressed in peculiar ways (e.g., 70, 20, 10 makes a 100-items list where 7, 2, 1 would make a list of just 10 items with exactly the same behavior), but you could divide all the counts in the probabilities list by their greatest common factor if you think that's likely to be a big deal in your specific application scenario.
Apart from memory consumption issues, this should be the fastest solution -- just one random number generation per required output result, and the fastest possible lookup from that random number, no comparisons &c. If your likely probabilities are very weird (e.g., floating point numbers that need to be matched to many, many significant digits), other approaches may be preferable;-).
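The memory concern mentioned above can be reduced by dividing the counts by their greatest common divisor before building the list. This is a minimal sketch, assuming positive integer weights; `build_population` is a hypothetical helper name and not part of the original answer:

```python
import random
from functools import reduce
from math import gcd

def build_population(values, counts):
    """Repeat each value in proportion to its integer count, reduced by the GCD."""
    divisor = reduce(gcd, counts)  # for 70, 20, 10 the common divisor is 10
    return [v for v, c in zip(values, counts) for _ in range(c // divisor)]

population = build_population(('a', 'b', 'c'), (70, 20, 10))
print(len(population))            # 10 items instead of 100
print(random.choice(population))  # 'a' about 70% of the time
```

The reduced list behaves exactly like the 100-item one, because only the ratios between the counts matter.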
Knuth references Walker's method of aliases. Searching on this, I find http://code.activestate.com/recipes/576564-walkers-alias-method-for-random-objects-with-diffe/ and http://prxq.wordpress.com/2006/04/17/the-alias-method/. This gives the exact probabilities required in constant time per number generated with linear time for setup (curiously, n log n time for setup if you use exactly the method Knuth describes, which does a preparatory sort you can avoid).
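As a rough illustration of the alias method, here is a sketch of the linear-time setup commonly attributed to Vose, which avoids the preparatory sort. The function names `build_alias_table` and `alias_draw` are made up for this example, and the weights are assumed to be non-negative with a positive sum:

```python
import random

def build_alias_table(weights):
    """Linear-time setup: returns (prob, alias) lists of length n."""
    n = len(weights)
    total = float(sum(weights))
    # Scale the weights so that an average slot has probability exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [1.0] * n
    alias = list(range(n))
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        lg = large.pop()
        prob[s] = scaled[s]
        alias[s] = lg
        # Slot s borrows its missing mass (1 - scaled[s]) from lg.
        scaled[lg] -= 1.0 - scaled[s]
        (small if scaled[lg] < 1.0 else large).append(lg)
    # Any slot left over keeps prob 1.0; only rounding error remains there.
    return prob, alias

def alias_draw(prob, alias, rng=random):
    """Constant-time draw: pick a slot uniformly, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]

prob, alias = build_alias_table([70, 20, 10])
values = ['a', 'b', 'c']
print(values[alias_draw(prob, alias)])
```

Each draw costs one uniform slot pick and one comparison, regardless of how many weights there are.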
Take the list of weights and find their cumulative totals: 70, 70+20, 70+20+10. Pick a random number greater than or equal to zero and less than the total. Iterate over the items and return the first value for which the cumulative sum of the weights is greater than this random number:
import random

def select(values):
    # Scale a uniform variate in [0, 1) to the total weight.
    variate = random.random() * sum(values.values())
    cumulative = 0.0
    for item, weight in values.items():
        cumulative += weight
        if variate < cumulative:
            return item
    return item  # Shouldn't get here, but just in case of rounding...

print(select({"a": 70, "b": 20, "c": 10}))
This solution, as implemented, should also be able to handle fractional weights and weights that add up to any number so long as they're all non-negative.
Let T = the sum of all item weights
Let R = a random number between 0 and T
Iterate the item list subtracting each item weight from R and return the item that causes the result to become <= 0.
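The three steps above can be sketched in Python as follows. The name `pick_by_subtraction` and the `rng` parameter are assumptions made for this illustration. The sketch also uses a strict `< 0` test with R drawn from [0, T), rather than `<= 0`, so that an item with zero weight can never be returned:

```python
import random

def pick_by_subtraction(items, rng=random):
    """items: list of (value, weight) pairs with non-negative weights."""
    total = sum(weight for _, weight in items)  # T
    r = rng.random() * total                    # R in [0, T)
    for value, weight in items:
        r -= weight
        if r < 0:
            return value
    return items[-1][0]  # only reachable through floating-point rounding

print(pick_by_subtraction([('a', 70), ('b', 20), ('c', 10)]))
```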
import random

def weighted_choice(probabilities):
    # Returns the index of the chosen entry.
    random_position = random.random() * sum(probabilities)
    current_position = 0.0
    for i, p in enumerate(probabilities):
        current_position += p
        if random_position < current_position:
            return i
    return None
Because random.random will always return < 1.0, the final return should never be reached.
import random

def selector(weights):
    # i is a random point in [0, total weight).
    i = random.random() * sum(w for w, v in weights)
    for w, v in weights:
        if w >= i:
            break
        i -= w
    return v

weights = ((70, 'a'), (20, 'b'), (10, 'c'))
print([selector(weights) for x in range(10)])
It works equally well for fractional weights:
weights = ((0.7, 'a'), (0.2, 'b'), (0.1, 'c'))
print([selector(weights) for x in range(10)])
If you have a lot of weights, you can use bisect to reduce the number of iterations required:
import random
import bisect

def make_acc_weights(weights):
    # Build a list of (cumulative weight, value) pairs.
    acc = 0
    acc_weights = []
    for w, v in weights:
        acc += w
        acc_weights.append((acc, v))
    return acc_weights

def selector(acc_weights):
    # The last cumulative weight is the total of all weights.
    i = random.random() * acc_weights[-1][0]
    return acc_weights[bisect.bisect(acc_weights, (i,))][1]

weights = ((70, 'a'), (20, 'b'), (10, 'c'))
acc_weights = make_acc_weights(weights)
print([selector(acc_weights) for x in range(100)])
It also works fine for fractional weights:
weights = ((0.7, 'a'), (0.2, 'b'), (0.1, 'c'))
acc_weights = make_acc_weights(weights)
print([selector(acc_weights) for x in range(100)])
The Python documentation has since been updated with an example of making random.choice() work with weighted probabilities:
If the weights are small integer ratios, a simple technique is to build a sample population with repeats:
>>> weighted_choices = [('Red', 3), ('Blue', 2), ('Yellow', 1), ('Green', 4)]
>>> population = [val for val, cnt in weighted_choices for i in range(cnt)]
>>> random.choice(population)
'Green'
A more general approach is to arrange the weights in a cumulative distribution with itertools.accumulate(), and then locate the random value with bisect.bisect():
>>> choices, weights = zip(*weighted_choices)
>>> cumdist = list(itertools.accumulate(weights))
>>> x = random.random() * cumdist[-1]
>>> choices[bisect.bisect(cumdist, x)]
'Blue'
One note: itertools.accumulate() requires Python 3.2 or later; on older versions, define it yourself using the equivalent code given in the itertools documentation.
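On Python 3.6 and later, the standard library's random.choices() accepts weights directly, so the cumulative distribution does not have to be built by hand. A short sketch reusing the colours from the documentation example (the variable names are chosen for this illustration):

```python
import random

values = ['Red', 'Blue', 'Yellow', 'Green']
weights = [3, 2, 1, 4]

# k is the number of independent draws, made with replacement.
sample = random.choices(values, weights=weights, k=10)
print(sample)
```

The weights are relative, so they do not need to sum to 1 or to 100.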
You can keep a list of small objects, each holding a percentage and a value. I implemented it in Java (I know a little C#, but I'm afraid I would write incorrect code), so you may need to port it yourself. The C# version will be shorter with struct and var, but I hope you get the idea:
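class PercentString {
    double percent;
    String value;
    // Constructor for 2 values
    PercentString(double percent, String value) {
        this.percent = percent;
        this.value = value;
    }
}
static String pick() {
    ArrayList<PercentString> list = new ArrayList<PercentString>();
    list.add(new PercentString(70, "a"));
    list.add(new PercentString(20, "b"));
    list.add(new PercentString(10, "c"));
    // Random number in [0, 100); the percentages are assumed to sum to 100.
    double random = Math.random() * 100;
    double percent = 0;
    for (int i = 0; i < list.size(); i++) {
        PercentString p = list.get(i);
        percent += p.percent;
        if (random < percent) {
            return p.value;
        }
    }
    return null; // not reached when the percentages sum to 100
}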
If speed really matters and you want to generate the random values quickly, Walker's algorithm, which mcdowella mentioned in https://stackoverflow.com/a/3655773/1212517, is pretty much the best way to go (O(1) time for random(), and O(N) time for preprocess()).
For anyone who is interested, here is my own PHP implementation of the algorithm:
/**
* Pre-process the samples (Walker's alias method).
* @param array key represents the sample, value is the weight
*/
protected function preprocess($weights){
$N = count($weights);
$sum = array_sum($weights);
$avg = $sum / (double)$N;
//divide the array of weights to values smaller and geq than sum/N
$smaller = array_filter($weights, function($itm) use ($avg){ return $avg > $itm;}); $sN = count($smaller);
$greater_eq = array_filter($weights, function($itm) use ($avg){ return $avg <= $itm;}); $gN = count($greater_eq);
$bin = array(); //bins
//we want to fill N bins
for($i = 0;$i<$N;$i++){
//At first, decide for a first value in this bin
//if there are small intervals left, we choose one
if($sN > 0){
$choice1 = each($smaller);
unset($smaller[$choice1['key']]);
$sN--;
} else{ //otherwise, we split a large interval
$choice1 = each($greater_eq);
unset($greater_eq[$choice1['key']]);
}
//splitting happens here - the unused part of interval is thrown back to the array
if($choice1['value'] >= $avg){
if($choice1['value'] - $avg >= $avg){
$greater_eq[$choice1['key']] = $choice1['value'] - $avg;
}else if($choice1['value'] - $avg > 0){
$smaller[$choice1['key']] = $choice1['value'] - $avg;
$sN++;
}
//this bin comprises of only one value
$bin[] = array(1=>$choice1['key'], 2=>null, 'p1'=>1, 'p2'=>0);
}else{
//make the second choice for the current bin
$choice2 = each($greater_eq);
unset($greater_eq[$choice2['key']]);
//splitting on the second interval
if($choice2['value'] - $avg + $choice1['value'] >= $avg){
$greater_eq[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
}else{
$smaller[$choice2['key']] = $choice2['value'] - $avg + $choice1['value'];
$sN++;
}
//this bin comprises of two values
$choice2['value'] = $avg - $choice1['value'];
$bin[] = array(1=>$choice1['key'], 2=>$choice2['key'],
'p1'=>$choice1['value'] / $avg,
'p2'=>$choice2['value'] / $avg);
}
}
$this->bins = $bin;
}
/**
* Choose a random sample according to the weights.
*/
public function random(){
$bin = $this->bins[array_rand($this->bins)];
// Return the bin's first sample with probability p1, otherwise its second.
return (lcg_value() < $bin['p1']) ? $bin[1] : $bin[2];
}
Here is my version, which works on any IList and normalizes the weights. It is based on Timwi's solution (selection based on percentage weighting):
/// <summary>
/// return a random element of the list or default if list is empty
/// </summary>
/// <param name="e"></param>
/// <param name="weightSelector">
/// Returns the chance of the element being picked. A weight of 0 or less means 0 chance to be picked.
/// If all elements have weight of 0 or less they all have equal chances to be picked.
/// </param>
/// <returns></returns>
public static T AnyOrDefault<T>(this IList<T> e, Func<T, double> weightSelector)
{
if (e.Count < 1)
return default(T);
if (e.Count == 1)
return e[0];
var weights = e.Select(o => Math.Max(weightSelector(o), 0)).ToArray();
var sum = weights.Sum(d => d);
var rnd = new Random().NextDouble();
for (int i = 0; i < weights.Length; i++)
{
//Normalize weight
var w = sum == 0
? 1 / (double)e.Count
: weights[i] / sum;
if (rnd < w)
return e[i];
rnd -= w;
}
throw new Exception("Should not happen");
}
I've my own solution for this:
public class Randomizator3000
{
public class Item<T>
{
public T value;
public float weight;
public static float GetTotalWeight<T>(Item<T>[] p_itens)
{
float __toReturn = 0;
foreach(var item in p_itens)
{
__toReturn += item.weight;
}
return __toReturn;
}
}
private static System.Random _randHolder;
private static System.Random _random
{
get
{
if(_randHolder == null)
_randHolder = new System.Random();
return _randHolder;
}
}
public static T PickOne<T>(Item<T>[] p_itens)
{
if(p_itens == null || p_itens.Length == 0)
{
return default(T);
}
float __randomizedValue = (float)_random.NextDouble() * (Item<T>.GetTotalWeight(p_itens));
float __adding = 0;
for(int i = 0; i < p_itens.Length; i ++)
{
float __cacheValue = p_itens[i].weight + __adding;
if(__randomizedValue <= __cacheValue)
{
return p_itens[i].value;
}
__adding = __cacheValue;
}
return p_itens[p_itens.Length - 1].value;
}
}
Using it should look something like this (this example is for Unity3D):
using UnityEngine;
using System.Collections;
public class teste : MonoBehaviour
{
Randomizator3000.Item<string>[] lista;
void Start()
{
lista = new Randomizator3000.Item<string>[10];
lista[0] = new Randomizator3000.Item<string>();
lista[0].weight = 10;
lista[0].value = "a";
lista[1] = new Randomizator3000.Item<string>();
lista[1].weight = 10;
lista[1].value = "b";
lista[2] = new Randomizator3000.Item<string>();
lista[2].weight = 10;
lista[2].value = "c";
lista[3] = new Randomizator3000.Item<string>();
lista[3].weight = 10;
lista[3].value = "d";
lista[4] = new Randomizator3000.Item<string>();
lista[4].weight = 10;
lista[4].value = "e";
lista[5] = new Randomizator3000.Item<string>();
lista[5].weight = 10;
lista[5].value = "f";
lista[6] = new Randomizator3000.Item<string>();
lista[6].weight = 10;
lista[6].value = "g";
lista[7] = new Randomizator3000.Item<string>();
lista[7].weight = 10;
lista[7].value = "h";
lista[8] = new Randomizator3000.Item<string>();
lista[8].weight = 10;
lista[8].value = "i";
lista[9] = new Randomizator3000.Item<string>();
lista[9].weight = 10;
lista[9].value = "j";
}
void Update ()
{
Debug.Log(Randomizator3000.PickOne<string>(lista));
}
}
In this example, each value has a 10% chance of being written to the debug log.
Based loosely on python's numpy.random.choice(a=items, p=probs), which takes an array and a probability array of the same size.
public T RandomChoice<T>(IEnumerable<T> a, IEnumerable<double> p)
{
IEnumerator<T> ae = a.GetEnumerator();
Random random = new Random();
double target = random.NextDouble();
double accumulator = 0;
foreach (var prob in p)
{
ae.MoveNext();
accumulator += prob;
if (accumulator > target)
{
break;
}
}
return ae.Current;
}
The probability array p must sum to (approx.) 1. This is to keep it consistent with the numpy interface (and mathematics), but you could easily change that if you wanted.
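If you want to enforce that requirement before drawing, a small validation step is enough. The sketch below is in Python; `check_probabilities` and its `tol` parameter are hypothetical names used only for illustration, and the tolerance is an arbitrary assumption rather than the value numpy uses:

```python
import math

def check_probabilities(p, tol=1e-8):
    """Raise ValueError unless p is non-negative and sums to about 1."""
    if any(x < 0 for x in p):
        raise ValueError("probabilities must be non-negative")
    # fsum reduces accumulated rounding error when p is long.
    if not math.isclose(math.fsum(p), 1.0, abs_tol=tol):
        raise ValueError("probabilities do not sum to 1")

check_probabilities([0.7, 0.2, 0.1])  # passes silently
```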
