I am trying to optimize performance using IJobParallelFor in Unity (code below).
Unfortunately, I get an error like this:
System.IndexOutOfRangeException: Index {0} is out of restricted
IJobParallelFor range [{1}...{2}] in ReadWriteBuffer.
I tried to use
[NativeDisableParallelForRestriction] and
[NativeDisableContainerSafetyRestriction],
but without effect. Here is the job:
[BurstCompile(CompileSynchronously = true)]
public struct DilationJob : IJobParallelFor
{
[ReadOnly]
public NativeArray<Color32> colorsArray;
public NativeArray<int> voxelToColor;
public int kernelSize;
public NativeArray<int> neighboursArray;
public int cubeNumber;
public void Execute(int index)
{
int dimx = 512;
int dimy = 512;
//int[] neighboursArray = new int[kernelSize * kernelSize * kernelSize];
int listIndex = 0;
for (int i = -1; i < kernelSize - 1; i++)
{
for (int j = -1; j < kernelSize - 1; j++)
{
for (int k = -1; k < kernelSize - 1; k++)
{
int neigbourIndex = (i + 1) * (j + 1) * kernelSize + (j + 1) * kernelSize + (k + 1);
if (neigbourIndex < 0)
{
neigbourIndex = 0;
}
neighboursArray[neigbourIndex] =
index + k * dimx * dimy + j * dimx + i;
if (neighboursArray[neigbourIndex] < colorsArray.Length && neighboursArray[neigbourIndex] >= 0 &&
colorsArray[neighboursArray[neigbourIndex]].b == 255)
{
voxelToColor[listIndex] = index;
listIndex++;
}
}
}
}
}
}
As far as I know, this exception is caused by the fact that an IJobParallelFor, as the name says, is executed in parallel:
Execute(int index) will be executed once for each index from 0 to the provided length. Each iteration must be independent from other iterations (The safety system enforces this rule for you). The indices have no guaranteed order and are executed on multiple cores in parallel.
Unity automatically splits the work into chunks of no less than the provided batchSize, and schedules an appropriate number of jobs based on the number of worker threads, the length of the array and the batch size.
What this means: say you have an array of length 10 and a batchSize of 4; then you will probably end up with 3 parallel chunks for the indices [0, 1, 2, 3], [4, 5, 6, 7] and [8, 9]. Since each of these 3 chunks may run on a different core, each one only gets access to its own part of the NativeArray(s). (More on that here)
What probably happens is that several of your parallel jobs try to write to the same index of your output NativeArrays voxelToColor and neighboursArray. Specifically, even without working through the rest of the calculation: listIndex starts at 0 in every Execute call, so every call that finds a match writes to voxelToColor[0], which is not allowed and does not make much sense to me either.
Within one Execute call you are restricted to only write to the given index.
As far as I know, the message should continue:
ReadWriteBuffers are restricted to only read & write the element at the job index. You can use double buffering strategies to avoid race conditions due to reading & writing in parallel to the same elements from a job.
[ReadOnly] tagged arrays are the exception because here multiple parallel accesses can't corrupt the data as long as you only read.
The [NativeDisableContainerSafetyRestriction] attribute, if I understand correctly, turns off the safety checks for race conditions between different jobs and the main thread.
What you probably want instead is [NativeDisableParallelForRestriction], which as far as I understand disables the safety restriction on parallel access to array indices. To be honest, the Unity API documentation is quite sparse on these.
As a little example
public class Example : MonoBehaviour
{
public Color32[] colors = new Color32[10];
private void Awake()
{
var job = new DilationJob()
{
colorsArray = new NativeArray<Color32>(colors, Allocator.Persistent),
voxelToColor = new NativeArray<int>(colors.Length, Allocator.Persistent)
};
var handle = job.Schedule(colors.Length, 4);
handle.Complete();
foreach (var i in job.voxelToColor)
{
Debug.Log(i);
}
job.colorsArray.Dispose();
job.voxelToColor.Dispose();
}
}
[BurstCompile(CompileSynchronously = true)]
public struct DilationJob : IJobParallelFor
{
[ReadOnly] public NativeArray<Color32> colorsArray;
[NativeDisableParallelForRestriction]
public NativeArray<int> voxelToColor;
public void Execute(int index)
{
voxelToColor[index] = colorsArray[index].a;
if (index + 1 < colorsArray.Length - 1) voxelToColor[index + 1] = 0;
}
}
should not throw any exception. But if you comment out the [NativeDisableParallelForRestriction] you will get the exception you are getting.
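An alternative to disabling the restriction is to restructure the job so that each Execute call writes only to its own index, for example one flag per voxel instead of a shared list. The sketch below illustrates that pattern in plain C# outside Unity, with Parallel.For standing in for IJobParallelFor and ordinary arrays standing in for NativeArray. The names DilationSketch and Dilate, the blue/dimX/dimY/dimZ parameters and the fixed 3x3x3 kernel are assumptions made for illustration, not part of the original job.

```csharp
using System;
using System.Threading.Tasks;

public static class DilationSketch
{
    // Hypothetical helper: marks a voxel when it or any of its 26 neighbours
    // has a blue channel of 255. 'blue' holds one blue value per voxel,
    // laid out as x + y * dimX + z * dimX * dimY.
    public static bool[] Dilate(byte[] blue, int dimX, int dimY, int dimZ)
    {
        var marked = new bool[blue.Length];
        Parallel.For(0, blue.Length, index =>
        {
            int x = index % dimX;
            int y = (index / dimX) % dimY;
            int z = index / (dimX * dimY);
            bool hit = false;
            for (int dz = -1; dz <= 1 && !hit; dz++)
            for (int dy = -1; dy <= 1 && !hit; dy++)
            for (int dx = -1; dx <= 1 && !hit; dx++)
            {
                int nx = x + dx, ny = y + dy, nz = z + dz;
                // Skip neighbours that fall outside the volume.
                if (nx < 0 || ny < 0 || nz < 0 || nx >= dimX || ny >= dimY || nz >= dimZ)
                    continue;
                // Reading other indices is safe because 'blue' is never written.
                if (blue[nx + ny * dimX + nz * dimX * dimY] == 255)
                    hit = true;
            }
            // The only write of this iteration goes to its own slot.
            marked[index] = hit;
        });
        return marked;
    }
}
```

Because every iteration writes to marked[index] only, the iterations are independent. In a Unity job the same shape should need no restriction-disabling attribute on the output array, although that has not been verified here.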
public static int n;
public static int w;
public static int[] s;
public static int[] p;
static void Main(string[] args)
{
n = 5;
w = 5;
s = new int[n + 1];
p = new int[n + 1];
Random rnd = new Random();
for (int i = 1; i <= n; i++)
{
s[i] = rnd.Next(1, 10);
p[i] = rnd.Next(1, 10);
}
Console.WriteLine(F_recursion(n, w));
Console.WriteLine(DP(n, w));
}
// recursive approach
public static int F_recursion(int n, int w)
{
if (n == 0 || w == 0)
return 0;
else if (s[n] > w)
return F_recursion(n - 1, w);
else
{
return Math.Max(F_recursion(n - 1, w), (p[n] + F_recursion(n - 1, w - s[n])));
}
}
// iterative approach
public static int DP(int n, int w)
{
int result = 0;
for (int i = 1; i <= n; i++)
{
if (s[i] > w)
{
continue;
}
else
{
result += p[i];
w = w - s[i];
}
}
return result;
}
I need to convert the F_recursion function to an iterative one. I have written the function DP above, which sometimes works but not always. I learned that the problem is in F_recursion(n - 1, w - s[n]); I have no idea how to make w - s[n] work correctly in an iterative solution. If I change w - s[n] and w - s[i] to just w, then the program always works.
In Console:
s[i] = 2 p[i] = 3
-------
s[i] = 3 p[i] = 4
-------
s[i] = 5 p[i] = 3
-------
s[i] = 3 p[i] = 8
-------
s[i] = 6 p[i] = 6
-------
Recursive:11
Iteration:7
but sometimes it works
s[i] = 5 p[i] = 6
-------
s[i] = 8 p[i] = 1
-------
s[i] = 3 p[i] = 5
-------
s[i] = 3 p[i] = 1
-------
s[i] = 7 p[i] = 7
-------
Recursive:6
Iteration:6
The following approach might be useful when bigger numbers are involved (especially for s), so that a two-dimensional array would be unnecessarily big and only a few w values would actually be used in computing the result.
The idea: precompute the possible w values by starting at w and, for each i in [n, n-1, ..., 1], determining the values w_[i] reachable from w_[i+1], that is w_[i+1] itself and w_[i+1] - s[i] where w_[i+1] >= s[i], without duplicates.
Then iterate n_i from 0 to n and compute sub-results only for the valid w_[i] values.
I chose an array of Dictionary as the data structure, since it's relatively easy to represent sparse data this way.
public static int DP(int n, int w)
{
// compute possible w values for each iteration from 0 to n
Stack<HashSet<int>> validW = new Stack<HashSet<int>>();
validW.Push(new HashSet<int>() { w });
for (int i = n; i > 0; i--)
{
HashSet<int> validW_i = new HashSet<int>();
foreach (var prevValid in validW.Peek())
{
validW_i.Add(prevValid);
if (prevValid >= s[i])
{
validW_i.Add(prevValid - s[i]);
}
}
validW.Push(validW_i);
}
// compute sub-results for all possible n,w values.
Dictionary<int, int>[] value = new Dictionary<int,int>[n + 1];
for (int n_i = 0; n_i <= n; n_i++)
{
value[n_i] = new Dictionary<int, int>();
HashSet<int> validSubtractW_i = validW.Pop();
foreach (var w_j in validSubtractW_i)
{
if (n_i == 0 || w_j == 0)
value[n_i][w_j] = 0;
else if (s[n_i] > w_j)
value[n_i][w_j] = value[n_i - 1][w_j];
else
value[n_i][w_j] = Math.Max(value[n_i - 1][w_j], (p[n_i] + value[n_i - 1][w_j - s[n_i]]));
}
}
return value[n][w];
}
It's important to understand that some space and computation is "wasted" in order to precompute the possible w values and to support the sparse data structures. So this approach might perform badly for large data sets with small values in s, where most w values will be possible sub-results.
After some more thought I realized, if space is a concern, you can actually throw away the sub-results of everything except the previous outer loop iteration, since the recursion in this algorithm follows a strict n-1 pattern. However, I'm not including this into my code for now.
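The space saving mentioned above can be sketched as follows. Note that this sketch uses a dense one-dimensional array rather than the dictionaries of this answer, so it illustrates the "keep only the previous iteration" idea, not the sparse variant. The class name KnapsackRolling and passing s and p as parameters are assumptions; the arrays are 1-based as in the question.

```csharp
using System;

public static class KnapsackRolling
{
    // s[i] is the size and p[i] the profit of item i; index 0 is unused,
    // matching the 1-based arrays of the question.
    public static int DP(int[] s, int[] p, int n, int w)
    {
        // best[j] holds the best profit for capacity j using the items seen so far.
        var best = new int[w + 1];
        for (int i = 1; i <= n; i++)
        {
            // Iterate capacities downwards so that best[j - s[i]] still refers
            // to the previous item iteration (each item is used at most once).
            for (int j = w; j >= s[i]; j--)
            {
                best[j] = Math.Max(best[j], p[i] + best[j - s[i]]);
            }
        }
        return best[w];
    }
}
```

For the first data set in the question (sizes 2, 3, 5, 3, 6 and profits 3, 4, 3, 8, 6 with w = 5) this returns 11, the same as the recursive version.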
Your approach does not work because your dynamic programming state space (which apparently is only one variable) does not match the signature of the recursive method. The goal of the dynamic programming approach should be to define and fill a state space such that all results for evaluation are available when needed. On inspection of the recursive method, notice that the recursive calls of F_recursion may change both arguments, n and w. This is an indication that a two-dimensional state space should be used.
The first argument (which apparently limits the range of items) can range from 0 to n while the second argument (which apparently is some bound for the total of an item property) can range from 0 to w.
You should define a two-dimensional state space
int[,] value = new int[n + 1, w + 1];
to accommodate the values; the extra row and column are needed because both arguments include 0 (the base cases) as well as n and w. Next, you should initialize the values to undefined; you can use the value Int32.MinValue for this, because the recurrence takes a maximum and Int32.MinValue never wins against a real value.
Next, the iterative version of the algorithm should use two loops which iterate in a forward manner, unlike the recursion, which decreases the arguments.
for (int i = 0; i <= n; i++)
{
    for (int j = 0; j <= w; j++)
    {
        // logic for the recurrence relation goes here
    }
}
In the innermost block you can use a modified version of the recurrence relation. Instead of using recursive calls, you access values which are stored in value; instead of returning values, you write the values to value.
Semantically this is the same as memoization, but instead of using actual recursive calls, the order of evaluation ensures that the necessary values always exist, making additional logic unnecessary.
Once the state space is filled, the entry value[n, w] holds the maximal value for the entire input.
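Putting these steps together, a bottom-up version of F_recursion might look like the sketch below. The class name Knapsack and passing s and p as parameters are assumptions made to keep the example self-contained; the arrays are 1-based as in the question. Since C# zero-initializes the array, row 0 and column 0 already hold the base cases, so no explicit "undefined" marker is needed in this particular sketch.

```csharp
using System;

public static class Knapsack
{
    // s[i] is the size and p[i] the profit of item i; index 0 is unused.
    public static int DP(int[] s, int[] p, int n, int w)
    {
        // value[i, j] corresponds to F_recursion(i, j).
        // Row 0 and column 0 stay 0, which matches the base case n == 0 || w == 0.
        var value = new int[n + 1, w + 1];
        for (int i = 1; i <= n; i++)
        {
            for (int j = 1; j <= w; j++)
            {
                // Case 1: item i is skipped (also the only option if it does not fit).
                value[i, j] = value[i - 1, j];
                // Case 2: item i fits, so compare against taking it.
                if (s[i] <= j)
                {
                    value[i, j] = Math.Max(value[i, j], p[i] + value[i - 1, j - s[i]]);
                }
            }
        }
        return value[n, w];
    }
}
```

Every entry read in the inner block belongs to row i - 1, which the outer loop has already filled; this is the ordering guarantee described above.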
Please check the code below; it tries to compute the probability of a birthday conflict. To my surprise, if I execute the code sequentially, the result is the expected value of around 0.44, but if I run it in parallel with Parallel.For, the result is 0.99.
Can anyone explain the result?
public static void BirthdayConflict(int num = 5, int people = 300) {
int N = 100000;
int act = 0;
Random r = new Random();
Action<int> action = (a) => {
List<int> p = new List<int>();
for (int i = 0; i < people; i++)
{
p.Add(r.Next(364) + 1);
}
p.Sort();
bool b = false;
for (int i = 0; i < 300; i++)
{
if (i + num -1 >= people) break;
if (p[i] == p[i + num -1])
b = true;
}
if (b)
Interlocked.Increment(ref act);
// act++;
};
// Result is around 0.99 - which is not OK
// Parallel.For( 0, N, action);
//Result is around 0.44 - which is OK
for (int i = 0; i < N; i++)
{
action(0);
}
Console.WriteLine(act / 100000.0);
Console.ReadLine();
}
You're using a System.Random instance that is shared between threads. It's not thread-safe, so you're getting wrong results (actually it just stops working and returns 0). From MSDN:
If your app calls Random methods from multiple threads, you must use a synchronization object to ensure that only one thread can access the random number generator at a time. If you don't ensure that the Random object is accessed in a thread-safe way, calls to methods that return random numbers return 0.
Simple (but not so efficient for parallel execution) solution is to use a lock:
lock (r)
{
for (int i = 0; i < people; i++)
{
p.Add(r.Next(364) + 1);
}
}
To improve performance (but you should measure), you may use multiple instances of System.Random; be careful to initialize each one with a different seed.
I found a useful explanation of why Random does not work across multiple threads; although it was originally written for Java, it can still be helpful.
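The multiple-instances idea could be sketched with one Random per thread, as below. The class name BirthdaySimulation, the method Estimate and the seeding scheme (an incrementing counter started from Environment.TickCount) are assumptions for illustration; other ways of producing distinct seeds would work as well.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BirthdaySimulation
{
    // Each new Random gets the next counter value, so no two threads share a seed.
    private static int seedCounter = Environment.TickCount;

    // One Random instance per thread; no locking is needed when using it.
    private static readonly ThreadLocal<Random> LocalRandom =
        new ThreadLocal<Random>(() => new Random(Interlocked.Increment(ref seedCounter)));

    // Estimates the probability that at least 'num' of 'people' share a birthday.
    public static double Estimate(int num, int people, int trials)
    {
        int hits = 0;
        Parallel.For(0, trials, _ =>
        {
            Random r = LocalRandom.Value;
            var days = new int[people];
            for (int i = 0; i < people; i++)
            {
                days[i] = r.Next(364) + 1; // same range as in the question
            }
            Array.Sort(days);
            // After sorting, 'num' equal birthdays occupy 'num' adjacent slots.
            for (int i = 0; i + num - 1 < people; i++)
            {
                if (days[i] == days[i + num - 1])
                {
                    Interlocked.Increment(ref hits);
                    break;
                }
            }
        });
        return hits / (double)trials;
    }
}
```

With the question's parameters (num = 5, people = 300), the estimate should land near the sequential result rather than 0.99.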
I'm performing some array manipulation/calculation in CUDA (via the Cudafy.NET library, though I'm equally interested in CUDA/C++ methods), and need to calculate the minimum and maximum values that are in the array. One of the kernels looks like this:
[Cudafy]
public static void UpdateEz(GThread thread, float time, float ca, float cb, float[,] hx, float[,] hy, float[,] ez)
{
var i = thread.blockIdx.x;
var j = thread.blockIdx.y;
if (i > 0 && i < ez.GetLength(0) - 1 && j > 0 && j < ez.GetLength(1) - 1)
ez[i, j] =
ca * ez[i, j]
+ cb * (hx[i, j] - hx[i - 1, j])
+ cb * (hy[i, j - 1] - hy[i, j])
;
}
I'd like to do something like this:
[Cudafy]
public static void UpdateEz(GThread thread, float time, float ca, float cb, float[,] hx, float[,] hy, float[,] ez, out float min, out float max)
{
var i = thread.blockIdx.x;
var j = thread.blockIdx.y;
min = float.MaxValue;
max = float.MinValue;
if (i > 0 && i < ez.GetLength(0) - 1 && j > 0 && j < ez.GetLength(1) - 1)
{
ez[i, j] =
ca * ez[i, j]
+ cb * (hx[i, j] - hx[i - 1, j])
+ cb * (hy[i, j - 1] - hy[i, j])
;
min = Math.Min(ez[i, j], min);
max = Math.Max(ez[i, j], max);
}
}
Anyone knows of a convenient way to return the minimum and maximum values (for the entire array, not just per thread or block)?
If you are writing an electromagnetic wave simulator and do not want to reinvent the wheel, you can use thrust::minmax_element. Below is a simple example of how to use it. Please add your own CUDA error checking.
#include <stdio.h>
#include <cuda_runtime_api.h>
#include <thrust/pair.h>
#include <thrust/device_vector.h>
#include <thrust/extrema.h>
int main()
{
const int N = 5;
const float h_a[N] = { 3., 21., -2., 4., 5. };
float *d_a; cudaMalloc(&d_a, N * sizeof(float));
cudaMemcpy(d_a, h_a, N * sizeof(float), cudaMemcpyHostToDevice);
float minel, maxel;
thrust::pair<thrust::device_ptr<float>, thrust::device_ptr<float>> tuple;
tuple = thrust::minmax_element(thrust::device_pointer_cast(d_a), thrust::device_pointer_cast(d_a) + N);
minel = tuple.first[0];
maxel = tuple.second[0];
printf("minelement %f - maxelement %f\n", minel, maxel);
return 0;
}
Based on your comment to your question, you were trying to find the max and min values while calculating them; while it's possible, it's not the most efficient. If you're set on doing that, then you can have an atomic comparison against some global minimum and global maximum, with the downside that each thread will be serialized, which will likely be a significant bottleneck.
For the more canonical approach to finding the maximum or minimum in an array via a reduction, you can do something along the lines of:
#define MAX_NEG ... //some small number
template <typename T, int BLKSZ> __global__
void cu_max_reduce(const T* d_data, const int d_len, T* max_val)
{
volatile __shared__ T smem[BLKSZ];
const int tid = threadIdx.x;
const int bid = blockIdx.x;
//starting index for each block to begin loading the input data into shared memory
const int bid_sidx = bid*BLKSZ;
//load the input data to smem, with padding if needed. each thread handles 2 elements
#pragma unroll
for (int i = 0; i < 2; i++)
{
//get the index for the thread to load into shared memory
const int tid_idx = 2*tid + i;
const int ld_idx = bid_sidx + tid_idx;
if(ld_idx < (bid+1)*BLKSZ && ld_idx < d_len)
smem[tid_idx] = d_data[ld_idx];
else
smem[tid_idx] = MAX_NEG;
__syncthreads();
}
//run the reduction per-block
for (unsigned int stride = BLKSZ/2; stride > 0; stride >>= 1)
{
if(tid < stride)
{
smem[tid] = ((smem[tid] > smem[tid + stride]) ? smem[tid]:smem[tid + stride]);
}
__syncthreads();
}
//write the per-block result out from shared memory to global memory
max_val[bid] = smem[0];
}
//assume we have d_data as a device pointer with our data, of length data_len
template <typename T> __host__
T cu_find_max(const T* d_data, const int data_len)
{
//in your host code, invoke the kernel with something along the lines of:
const int thread_per_block = 16;
const int elem_per_thread = 2;
const int BLKSZ = elem_per_thread*thread_per_block; //number of elements to process per block
const int blocks_per_grid = ceil((float)data_len/(BLKSZ));
dim3 block_dim(thread_per_block, 1, 1);
dim3 grid_dim(blocks_per_grid, 1, 1);
T *d_max;
cudaMalloc((void **)&d_max, sizeof(T)*blocks_per_grid);
cu_max_reduce <T, BLKSZ> <<<grid_dim, block_dim>>> (d_data, data_len, d_max);
//etc....
}
This will find the per-block maximum value. You can run it again on its output (e.g. with d_max as the input data and with updated launch parameters) on 1 block to find the global maximum - running it in a multi-pass manner like this is necessary if your dataset is too large (in this case, above 2 * 4096 elements, since we have each thread process 2 elements, although you could just process more elements per thread to increase this).
I should point out that this isn't particularly efficient (you'd want to use a more intelligent stride when loading the shared memory to avoid bank conflicts), and I'm not 100% sure it's correct (it worked on a few small testcases I tried), but I tried to write it for maximal clarity. Also don't forget to put in some error checking code to make sure your CUDA calls are completing successfully, I left them out here to keep it short(er).
I should also direct you towards some more in-depth documentation; you can take a look at the CUDA sample reduction over at http://docs.nvidia.com/cuda/cuda-samples/index.html although it's not doing a min/max calculation, it's the same general idea (and more efficient). Also, if you're looking for simplicity, you might just want to use Thrust's functions thrust::max_element and thrust::min_element, and the documentation at: thrust.github.com/doc/group__extrema.html
You can develop your own min/max algorithm using a divide and conquer method.
If you have the possibility to use npp, then this function may be useful : nppsMinMax_32f.
Hi and thanks for looking!
Background
I have a computing task that requires either a lot of time, or parallel computing.
Specifically, I need to loop through a list of about 50 images, Base64 encode them, and then calculate the Levenshtein distance between each newly encoded item and values in an XML file containing about 2000 Base64 string-encoded images in order to find the string in the XML file that has the smallest Lev. Distance from the benchmark string.
A regular foreach loop works, but is too slow so I have chosen to use PLINQ to take advantage of my Core i7 multi-core processor:
Parallel.ForEach(candidates, item => findImage(total,currentWinner,benchmark,item));
The task starts brilliantly, racing along at high speed, but then I get an "Out of Memory" exception.
I am using C#, .NET 4, Forms App.
Question
How do I tweak my PLINQ code so that I don't run out of available memory?
Update/Sample Code
Here is the method that is called to initiate the PLINQ foreach:
private void btnGo_Click(object sender, EventArgs e)
{
XDocument doc = XDocument.Load(@"C:\Foo.xml");
var imagesNode = doc.Element("images").Elements("image"); //Each "image" node contains a Base64 encoded string.
string benchmark = tbData.Text; //A Base64 encoded string.
IEnumerable<XElement> candidates = imagesNode;
currentWinner = 1000000; //Set the "Current" low score to a million and bubble lower scores into its place iteratively.
Parallel.ForEach(candidates, i => {
dist = Levenshtein(benchmark, i.Element("score").Value);
if (dist < currentWinner)
{
currentWinner = dist;
path = i.Element("path").Value;
}
});
}
. . .and here is the Levenshtein Distance Method:
public static int Levenshtein(string s, string t) {
int n = s.Length;
int m = t.Length;
var d = new int[n + 1, m + 1];
// Step 1
if (n == 0)
{
return m;
}
if (m == 0)
{
return n;
}
// Step 2
for (int i = 0; i <= n; d[i, 0] = i++)
{
}
for (int j = 0; j <= m; d[0, j] = j++)
{
}
// Step 3
for (int i = 1; i <= n; i++)
{
//Step 4
for (int j = 1; j <= m; j++)
{
// Step 5
int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
// Step 6
d[i, j] = Math.Min(
Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
Thanks in advance!
Update
Ran into this error again today under different circumstances. I was working on a desktop app with high memory demand. Make sure that you have set the project for 64-bit architecture to access all available memory. My project was set on x86 by default and so I kept getting out of memory exceptions. Of course, this only works if you can count on 64-bit processors for your deployment.
End Update
After struggling a bit with this it appears to be operator error:
I was making calls to the UI thread from the parallel threads in order to update progress labels, but I was not doing it in a thread-safe way.
Additionally, I was running the app without the debugger, so there was an uncaught exception each time the code attempted to update the UI thread from a parallel thread which caused the overflow.
Without being an expert on PLINQ, I am guessing that it handles all of the low-level allocation stuff for you as long as you don't make a goofy smelly code error like this one.
Hope this helps someone else.
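Separately from the UI-thread issue described above, it may be worth noting that the Levenshtein method in the question allocates an (n + 1) x (m + 1) int matrix on every call, and with Parallel.ForEach several of those matrices exist at the same time. Since each row of the matrix depends only on the row before it, two rows are enough when only the distance (not the edit path) is needed. A minimal sketch under that assumption; the class name LevenshteinTwoRow is hypothetical:

```csharp
using System;

public static class LevenshteinTwoRow
{
    // Same recurrence as the question's method, but keeping only two rows.
    public static int Distance(string s, string t)
    {
        if (s.Length == 0) return t.Length;
        if (t.Length == 0) return s.Length;

        var previous = new int[t.Length + 1];
        var current = new int[t.Length + 1];

        // Row 0: turning the empty prefix of s into t[0..j) costs j insertions.
        for (int j = 0; j <= t.Length; j++) previous[j] = j;

        for (int i = 1; i <= s.Length; i++)
        {
            current[0] = i; // deleting the first i characters of s
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(previous[j] + 1, current[j - 1] + 1),
                    previous[j - 1] + cost);
            }
            // The finished row becomes the previous row of the next iteration.
            var swap = previous;
            previous = current;
            current = swap;
        }
        return previous[t.Length];
    }
}
```

Memory per call then grows with the length of t only, rather than with the product of both string lengths.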
I am creating a forecasting application that will run simulations for various "modes" that a production plant is able to run. The plant can run in one mode per day, so I am writing a function that will add up the different modes chosen each day that best maximize the plant’s output and best aligns with the sales forecast numbers provided. This data will be loaded into an array of mode objects that will then be used to calculate the forecast output of the plant.
I have created the functions to do this, however, I need to make them recursive so that I am able to handle any number (within reason) of modes and work days (which varies based on production needs). Listed below is my code using for loops to simulate what I want to do. Can someone point me in the right direction in order to create a recursive function to replace the need for multiple for loops?
Here GetNumber4 is the method for four modes and GetNumber5 the method for five modes. The int start parameter is the number of work days.
private static void GetNumber4(int start)
{
int count = 0;
int count1 = 0;
for (int i = 0; 0 <= start; i++)
{
for (int j = 0; j <= i; j++)
{
for (int k = 0; k <= j; k++)
{
count++;
for (int l = 0; l <= i; l++)
{
count1 = l;
}
Console.WriteLine(start + " " + (count1 - j) + " " + (j - k) + " " + k);
count1 = 0;
}
}
start--;
}
Console.WriteLine(count);
}
private static void GetNumber5(int start)
{
int count = 0;
int count1 = 0;
for (int i = 0; 0 <= start; i++)
{
for (int j = 0; j <= i; j++)
{
for (int k = 0; k <= j; k++)
{
for (int l = 0; l <= k; l++)
{
count++;
for (int m = 0; m <= i; m++)
{
count1 = m;
}
Console.WriteLine(start + " " + (count1 - j) + " " + (j - k) + " " + (k - l) + " " + l);
count1 = 0;
}
}
}
start--;
}
Console.WriteLine(count);
}
EDITED:
I think that it would be more helpful if I gave an example of what I was trying to do. For example, if a plant could run in three modes "A", "B", "C" and there were three work days, then the code will return the following results.
3 0 0
2 1 0
2 0 1
1 2 0
1 1 1
1 0 2
0 3 0
0 2 1
0 1 2
0 0 3
The series of numbers represent the three modes A B C. I will load these results into a Modes object that has the corresponding production rates. Doing it this way allows me to shortcut creating a list of every possible combination; it instead gives me a frequency of occurrence.
Building on one of the solutions already offered, I would like to do something like this.
//Where Modes is a custom class
private static Modes GetNumberRecur(int start, int numberOfModes)
{
if (start < 0)
{
return Modes;
}
//Do work here
GetNumberRecur(start - 1);
}
Thanks to everyone who has already provided input.
Calling GetNumber(5, x) should yield the same result as GetNumber5(x):
static void GetNumber(int num, int max) {
Console.WriteLine(GetNumber(num, max, ""));
}
static int GetNumber(int num, int max, string prefix) {
if (num < 2) {
Console.WriteLine(prefix + max);
return 1;
}
else {
int count = 0;
for (int i = max; i >= 0; i--)
count += GetNumber(num - 1, max - i, prefix + i + " ");
return count;
}
}
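If the combinations should be returned rather than printed (for example, to load them into the Modes objects mentioned in the question), the same recursion can collect them into a list. A sketch, with ModeSplits, Enumerate and Fill as made-up names, assuming at least one mode:

```csharp
using System;
using System.Collections.Generic;

public static class ModeSplits
{
    // Returns every way to distribute 'days' work days across 'modes' modes.
    // Each result has one entry per mode and the entries sum to 'days'.
    public static List<int[]> Enumerate(int modes, int days)
    {
        var results = new List<int[]>();
        Fill(new int[modes], 0, days, results);
        return results;
    }

    private static void Fill(int[] current, int position, int remaining, List<int[]> results)
    {
        if (position == current.Length - 1)
        {
            // The last mode takes whatever is left, like 'prefix + max' above.
            current[position] = remaining;
            results.Add((int[])current.Clone());
            return;
        }
        for (int i = remaining; i >= 0; i--)
        {
            current[position] = i;
            Fill(current, position + 1, remaining - i, results);
        }
    }
}
```

ModeSplits.Enumerate(3, 3) yields ten rows, starting with 3 0 0 and ending with 0 0 3.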
A recursive function just needs a terminating condition. In your case, that seems to be when start is less than 0:
private static void GetNumberRec(int start)
{
if(start < 0)
return;
// Do stuff
// Recurse
GetNumberRec(start-1);
}
I've refactored your example into this:
private static void GetNumber5(int start)
{
var count = 0;
for (var i = 0; i <= start; i++)
{
for (var j = 0; j <= i; j++)
{
for (var k = 0; k <= j; k++)
{
for (var l = 0; l <= k; l++)
{
count++;
Console.WriteLine(
(start - i) + " " +
(i - j) + " " +
(j - k) + " " +
(k - l) + " " +
l);
}
}
}
}
Console.WriteLine(count);
}
Please verify this is correct.
A recursive version should then look like this:
public static void GetNumber(int start, int depth)
{
var count = GetNumber(start, depth, new Stack<int>());
Console.WriteLine(count);
}
private static int GetNumber(int start, int depth, Stack<int> counters)
{
if (depth == 0)
{
Console.WriteLine(FormatCounters(counters));
return 1;
}
else
{
var count = 0;
for (int i = 0; i <= start; i++)
{
counters.Push(i);
count += GetNumber(i, depth - 1, counters);
counters.Pop();
}
return count;
}
}
FormatCounters is left as an exercise to the reader ;)
I previously offered a simple C# recursive function here.
The top-most function ends up having a copy of every permutation, so it should be easily adapted for your needs..
I realize that everyone's beaten me to the punch at this point, but here's a dumb Java algorithm (pretty close to C# syntactically, so you can try it out).
import java.util.ArrayList;
import java.util.List;
/**
* The operational complexity of this is pretty poor and I'm sure you'll be able to optimize
* it, but here's something to get you started at least.
*/
public class Recurse
{
/**
* Base method to set up your recursion and get it started
*
* @param start The total number that digits from all the days will sum up to
* @param days The number of days to split the "start" value across (e.g. 5 days equals
* 5 columns of output)
*/
private static void getNumber(int start,int days)
{
//start recursing
printOrderings(start,days,new ArrayList<Integer>(start));
}
/**
* So this is a pretty dumb recursion. I stole code from a string permutation algorithm that I wrote awhile back. So the
* basic idea to begin with was if you had the string "abc", you wanted to print out all the possible permutations of doing that
* ("abc","acb","bac","bca","cab","cba"). So you could view your problem in a similar fashion...if "start" is equal to "5" and
* days is equal to "4" then that means you're looking for all the possible permutations of (0,1,2,3,4,5) that fit into 4 columns. You have
* the extra restriction that when you find a permutation that works, the digits in the permutation must add up to "start" (so for instance
* [0,0,3,2] is cool, but [0,1,3,3] is not). You can begin to see why this is a dumb algorithm because it currently just considers all
* available permutations and keeps the ones that add up to "start". If you want to optimize it more, you could keep a running "sum" of
* the current contents of the list and break your loop when it's greater than "start".
*
* Essentially the way you get all the permutations is to have the recursion choose a new digit at each level until you have a full
* string (or a value for each "day" in your case). It's just like nesting for loops, but the for loop actually only gets written
* once because the nesting is done by each subsequent call to the recursive function.
*
* @param start The total number that digits from all the days will sum up to
* @param days The number of days to split the "start" value across (e.g. 5 days equals
* 5 columns of output)
* @param chosen The current permutation at any point in time, may contain between 0 and "days" numbers.
*/
private static void printOrderings(int start,int days,List<Integer> chosen)
{
if(chosen.size() == days)
{
int sum = 0;
for(Integer i : chosen)
{
sum += i.intValue();
}
if(sum == start)
{
System.out.println(chosen.toString());
}
return;
}
else if(chosen.size() < days)
{
for(int i=0; i <= start; i++)
{
if(chosen.size() >= days)
{
break;
}
List<Integer> newChosen = new ArrayList<Integer>(chosen);
newChosen.add(i);
printOrderings(start,days,newChosen);
}
}
}
public static void main(final String[] args)
{
//your equivalent of GetNumber4(5)
getNumber(5,4);
//your equivalent of GetNumber5(5)
getNumber(5,5);
}
}