Print all partitions into disjoint combinations of fixed size - c#

I have an array of numbers from 1 to n, and I need to find all possible partitions into disjoint combinations of 3 numbers.
That is, for n = 9 the situation is as follows:
Array: 1, 2, 3, 4, 5, 6, 7, 8, 9;
Possible combinations of 3: 123, 124 ... 245, 246 ... 478, 479, etc .;
Possible partitions into 3 disjoint combinations: 123 456 789, 123 457 689 ... 123 468 579 ... 127 458 369, etc.
I've found an algorithm for finding combinations of 3 numbers from a set, here it is: https://www.geeksforgeeks.org/print-all-possible-combinations-of-r-elements-in-a-given-array-of-size-n/ (there are actually 2 of them, but I used the first one). Now the question is how to find combinations of the combinations themselves, and this is where I run into difficulties: it seems to me that I need recursion again, but I don't fully understand how and where exactly to use it (and perhaps the real point lies elsewhere). I've also seen a non-recursive algorithm that finds all the combinations of given numbers, https://rosettacode.org/wiki/Combinations#C.23, but I could not do anything with it (my attempt is enclosed below). Could you please help me?
public static IEnumerable<int[]> Combinations(int[] a, int n, int m)
{
int[] result = new int[m];
Stack<int> stack = new Stack<int>();
stack.Push(0);
while (stack.Count > 0)
{
int index = stack.Count - 1;
int value = stack.Pop();
while (value < n)
{
result[index++] = ++value;
stack.Push(value);
if (index == m)
{
for (int i = 0; i < 3; i++)
{
a = a.Where(val => val != result[i]).ToArray();
}
return Combinations (a, n-3, m);
break;
}
}
}
}

Assuming n is a multiple of 3, there is a simple and intuitive recursive algorithm. (Writing it efficiently is a bit more of a challenge :-) ).
In pseudocode, generalising 3 to k:
# A must have a multiple of k elements
# I write V \ C to mean "V without the values in C". Since producing
# copies is expensive, you should find a more efficient way of doing
# this.
Partition(A, k):
    If A has k elements, produce the partition consisting only of A
    Otherwise:
        Let m be the smallest element of A.
        For each combination C of k-1 elements from A \ [m]:
            Add m to C
            For each partition P generated by Partition(A \ C, k):
                produce P with the addition of C
Of course, that depends on you having access to an algorithm which can enumerate the k-combinations of a list. (Even better would be a function which produced successive shuffles of the list with different k-combinations at the beginning, while maintaining the list in order. Sadly, few standard libraries provide that.)
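The pseudocode above can be sketched in C# roughly as follows. This is only a minimal illustration, not a tuned implementation: the names PartitionSketch, Partitions and Combinations are made up for the example, the input is assumed to hold distinct values in increasing order with a count that is a multiple of k, and "A \ C" is done by plain copying (which is exactly the inefficiency the pseudocode comment warns about).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionSketch
{
    // Lazily yields every partition of 'items' into groups of size k.
    // Assumption: items are distinct, sorted, and items.Count is a multiple of k.
    public static IEnumerable<List<int[]>> Partitions(List<int> items, int k)
    {
        if (items.Count == 0)
        {
            // Nothing left to place: the only partition is the empty one.
            yield return new List<int[]>();
            yield break;
        }

        int first = items[0];                      // m, the smallest element
        List<int> rest = items.Skip(1).ToList();   // A \ [m]

        foreach (int[] others in Combinations(rest, k - 1, 0))
        {
            int[] group = new[] { first }.Concat(others).ToArray();   // C with m added
            List<int> remaining = rest.Except(others).ToList();       // A \ C (copied)

            foreach (List<int[]> tail in Partitions(remaining, k))
            {
                var result = new List<int[]> { group };
                result.AddRange(tail);
                yield return result;
            }
        }
    }

    // Yields all combinations of 'size' elements taken from source[start..].
    private static IEnumerable<int[]> Combinations(List<int> source, int size, int start)
    {
        if (size == 0)
        {
            yield return new int[0];
            yield break;
        }
        for (int i = start; i <= source.Count - size; i++)
            foreach (int[] tail in Combinations(source, size - 1, i + 1))
                yield return new[] { source[i] }.Concat(tail).ToArray();
    }
}
```

Calling PartitionSketch.Partitions(Enumerable.Range(1, 9).ToList(), 3) enumerates the partitions one at a time, so they can be consumed in a foreach loop without storing them all.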
There's another recursive algorithm, which can easily be made into an iterative algorithm by maintaining an explicit stack. It's possibly not quite as intuitive, although once you see it, how it works is pretty obvious, but it's a lot easier to implement efficiently. It requires us to maintain the invariants that each set in the partition is stored in increasing order, and that the sets themselves are sorted in increasing order by their first element. (The order itself is irrelevant, and it's totally reasonable to just assume that the original order of the elements is the desired sort order, as long as the elements are kept in a data structure whose ordering is constant.)
Once you establish that rule, you can start by making all the partition's sets empty, and then place each successive element in order, using each of the possible locations which obey the following simple constraints:
Once a set contains the correct number of elements, no more elements can be added to it.
Each element is placed at the end of a set (because all the already-placed elements are smaller and all the elements yet to be placed are bigger);
An element can only be added to an empty set if it is the first empty set in the partition (to guarantee that the sets themselves will be sorted).
To avoid constantly copying the sets in the partition, you can implement this by using a fixed-size two-dimensional array of n ⁄ k rows and k columns, where each row represents one set in the partition; it's then necessary to keep another array of n ⁄ k integers holding the current length of each set.
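A rough C# sketch of this placement scheme, under the assumption that the elements are simply the integers 1..n (so their natural order is the placement order). The names PlacementSketch, Partitions and Place are invented for the example, and a jagged array stands in for the two-dimensional array. A copy of the grid is made only at the moment a complete partition is handed to the caller.

```csharp
using System;
using System.Collections.Generic;

public static class PlacementSketch
{
    // Yields each partition of 1..n into n/k groups of size k.
    // Each result is a snapshot (copy) of the working grid.
    public static IEnumerable<int[][]> Partitions(int n, int k)
    {
        if (k <= 0 || n < 0 || n % k != 0)
            throw new ArgumentException("n must be a non-negative multiple of k");

        int groups = n / k;
        var grid = new int[groups][];          // one row per set in the partition
        for (int g = 0; g < groups; g++) grid[g] = new int[k];
        var length = new int[groups];          // current number of elements in each set
        return Place(1, n, k, grid, length);
    }

    private static IEnumerable<int[][]> Place(int value, int n, int k, int[][] grid, int[] length)
    {
        if (value > n)
        {
            // Every element has been placed, so every set is full: emit a copy.
            var copy = new int[grid.Length][];
            for (int g = 0; g < grid.Length; g++) copy[g] = (int[])grid[g].Clone();
            yield return copy;
            yield break;
        }

        for (int g = 0; g < grid.Length; g++)
        {
            if (length[g] < k)                        // a full set accepts nothing more
            {
                grid[g][length[g]++] = value;         // always append at the end of the set
                foreach (int[][] p in Place(value + 1, n, k, grid, length))
                    yield return p;
                length[g]--;                          // undo the placement
            }
            if (length[g] == 0) break;                // only the first empty set may be used
        }
    }
}
```

Because the working grid is reused across the whole enumeration, the memory in use at any moment stays proportional to n rather than to the number of partitions.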
One advantage of the first algorithm is that it makes it reasonably obvious how many possible partitions there are, because the number of partitions generated by the inner loop is independent of the combination chosen in the outer loop. Consequently, if we write P(n, k) for the number of k-partitions of n objects, we can see that
P(n, k) = C(n−1, k−1) × P(n−k, k) for n>0   (where C(n, k) is the binomial coefficient)
That's simply a product of binomial coefficients:
P(n, k) = C(n−1, k−1) × C(n−k−1, k−1) × C(n−2k−1, k−1) × … × C(k−1, k−1)
Since C(n, k) is n! ⁄ (k! (n−k)!), that can be simplified to n! ⁄ (d! × (k!)^d) where d is the number of sets in each partition, i.e. d = n ⁄ k. That number is obviously a lot smaller than n! but it still grows extremely rapidly, making large arguments to the partition function impractical. For k=3, the first few counts are:
P( 3, 3) = 1
P( 6, 3) = 10
P( 9, 3) = 280
P(12, 3) = 15,400
P(15, 3) = 1,401,400
P(18, 3) = 190,590,400
P(21, 3) = 36,212,176,000
For this reason, it's usually advisable to generate and use the possible values one at a time rather than attempting to stash them all into a massive vector, which would take up a lot of memory.
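The counts in the table can be reproduced with a few lines of C#. This is just a sketch of the recurrence P(n, k) = C(n−1, k−1) × P(n−k, k) with P(0, k) = 1; the class and method names are invented for the example, and BigInteger is used because the values quickly outgrow 64 bits.

```csharp
using System;
using System.Numerics;

public static class PartitionCount
{
    // Number of ways to split n distinct items into unordered groups of size k.
    public static BigInteger Count(int n, int k)
    {
        if (k <= 0 || n < 0 || n % k != 0)
            throw new ArgumentException("n must be a non-negative multiple of k");

        BigInteger total = BigInteger.One;
        // Multiply C(n-1, k-1) * C(n-k-1, k-1) * ... * C(k-1, k-1).
        for (int m = n; m > 0; m -= k)
            total *= Binomial(m - 1, k - 1);
        return total;
    }

    // Binomial coefficient C(n, r), computed so that every intermediate value is an integer.
    private static BigInteger Binomial(int n, int r)
    {
        BigInteger result = BigInteger.One;
        for (int i = 1; i <= r; i++)
            result = result * (n - r + i) / i;
        return result;
    }
}
```

For instance, PartitionCount.Count(9, 3) evaluates to 280 and PartitionCount.Count(21, 3) to 36,212,176,000, matching the table.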

Related

Comparing 1 million integers in an array without sorting it first

I have a task to find the difference between every integer in an array of random numbers and return the lowest difference. A requirement is that the integers can be between 0 and int.maxvalue and that the array will contain 1 million integers.
I put some code together which works fine for a small amount of integers but it takes a long time (so long most of the time I give up waiting) to do a million. My code is below, but I'm looking for some insight on how I can improve performance.
for(int i = 0; i < _RandomIntegerArray.Count(); i++) {
for(int ii = i + 1; ii < _RandomIntegerArray.Count(); ii++) {
if (_RandomIntegerArray[i] == _RandomIntegerArray[ii]) continue;
int currentDiff = Math.Abs(_RandomIntegerArray[i] - _RandomIntegerArray[ii]);
if (currentDiff < lowestDiff) {
Pairs.Clear();
}
if (currentDiff <= lowestDiff) {
Pairs.Add(new NumberPair(_RandomIntegerArray[i], _RandomIntegerArray[ii]));
lowestDiff = currentDiff;
}
}
}
Apologies to everyone that has pointed out that I don't sort; unfortunately sorting is not allowed.
Imagine that you have already found a pair of integers a and b from your random array such that a > b and a-b is the lowest among all possible pairs of integers in the array.
Does an integer c exist in the array such that a > c > b, i.e. c goes between a and b? Clearly, the answer is "no", because otherwise you'd pick the pair {a, c} or {c, b}.
This gives an answer to your problem: a and b must be next to each other in a sorted array. Sorting can be done in O(N*log N), and the search can be done in O(N), an improvement over the O(N²) algorithm that you have.
As per @JonSkeet, try sorting the array first and then only compare consecutive array items, which means that you only need to iterate the array once:
Array.Sort(_RandomIntegerArray);
for (int i = 1; i < _RandomIntegerArray.Count(); i++)
{
int currentDiff = _RandomIntegerArray[i] - _RandomIntegerArray[i-1];
if (currentDiff < lowestDiff)
{
Pairs.Clear();
}
if (currentDiff <= lowestDiff)
{
Pairs.Add(new NumberPair(_RandomIntegerArray[i], _RandomIntegerArray[i-1]));
lowestDiff = currentDiff;
}
}
In my testing this results in < 200 ms elapsed for 1 million items.
You've got a million integers out of a possible 2.15 or 4.3 billion (signed or unsigned). That means the largest possible min distance is either about 2150 or 4300. Let's say that the max possible min distance is D.
Divide the legal integers into groups of length D. Create a hash h keyed on integers with arrays of ints as values. Process your array by taking each element x, and adding it to h[x/D].
The point of doing this is that any valid pair of points is either contained in h[k] for some k, or split between h[k] and h[k+1].
Find your pair of points by going through the keys of the hash and checking the points associated with adjacent keys. You can sort if you like, or use a bitvector, or any other method but now you're dealing with small arrays (on average 1 element per array).
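A sketch of this bucketing idea in C#, with two loudly stated assumptions that differ from the question's code: duplicates count as a difference of 0 (the question skips equal values), and the bucket width is derived from the actual minimum and maximum of the data rather than from the fixed bound D. The names ClosestPairSketch and MinDifference are invented for the example. No sorting is involved.

```csharp
using System;
using System.Collections.Generic;

public static class ClosestPairSketch
{
    // Smallest difference between any two entries (0 if a value repeats), without sorting.
    public static long MinDifference(int[] values)
    {
        if (values.Length < 2) throw new ArgumentException("need at least two values");

        long min = values[0], max = values[0];
        foreach (int v in values)
        {
            if (v < min) min = v;
            if (v > max) max = v;
        }

        // Pigeonhole: the smallest gap is at most (max - min) / (n - 1), hence strictly
        // smaller than 'width', so the closest pair shares a bucket or sits in adjacent ones.
        long width = (max - min) / (values.Length - 1) + 1;

        var buckets = new Dictionary<long, List<int>>();
        foreach (int v in values)
        {
            long key = (v - min) / width;
            if (!buckets.TryGetValue(key, out List<int> list))
                buckets[key] = list = new List<int>();
            list.Add(v);
        }

        long best = long.MaxValue;
        foreach (KeyValuePair<long, List<int>> entry in buckets)
        {
            List<int> here = entry.Value;

            // Pairs inside the same bucket.
            for (int a = 0; a < here.Count; a++)
                for (int b = a + 1; b < here.Count; b++)
                    best = Math.Min(best, Math.Abs((long)here[a] - here[b]));

            // Pairs that straddle this bucket and the next one.
            if (buckets.TryGetValue(entry.Key + 1, out List<int> next))
                foreach (int x in here)
                    foreach (int y in next)
                        best = Math.Min(best, Math.Abs((long)x - y));
        }
        return best;
    }
}
```

On uniformly random input the buckets stay tiny, so the total work is close to linear; on heavily clustered input a single bucket can hold most of the values and the inner loops degrade towards the quadratic behaviour of the original code.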
As the elements of the array are between 0 and int.maxvalue, I am supposing here that maxvalue is less than 1 million. If so, you just need to initialise an array of size maxvalue + 1 to 0 and then, as you read the 1 million values, increment the entry at each value's index.
Now read through this array and find the lowest difference as described by others, as if all the values were sorted. If any value is present more than once, its entry will be > 1, so you can immediately say that the minimum difference is 0.
NOTE: This method is efficient only if you cannot use sorting and, more importantly, int.maxvalue is much smaller than 10^6 (1 million).
It helps a little if you do not count on each iteration
int countIntegers = _RandomIntegerArray.Count();
for(int i = 0; i < countIntegers; i++) {
//...
for(int ii = i + 1; ii < countIntegers; ii++) {
//...
This is safe provided that Count() only returns the number of ints in the array and the array is not modified inside the loops.
How about splitting up the array into arraysize/number of processors sized chunks and running each chunk in a different thread. (Neil)
Assume three parts A, B and C of size as close as possible.
For each part, find the minimum "in-part" difference and that of pairs with the first component from the current part and the second from the next part (A being the next from C).
With a method taking O(n²) time, n/3 should take one ninth, done 2*3 times, this amounts to two thirds plus change for combining the results.
This calls to be applied recursively - remember Карацу́ба/Karatsuba multiplication?
Wait - maybe use two parts after all, for three fourth of the effort - very close to "Karatsuba". (When not seeing how to use an even number of parts, I was thinking multiprocessing with every processor doing "the same".)

Getting all combinations of K and less elements in List of N elements with big K

I want to have all combination of elements in a list for a result like this:
List: {1,2,3}
1
2
3
1,2
1,3
2,3
My problem is that I have 180 elements, and I want to have all combinations of up to 5 elements. In my tests with 4 elements, it took a long time (2 minutes) but all went well. But with 5 elements, I get an out-of-memory exception.
My code presently is this:
public IEnumerable<IEnumerable<Rondin>> getPossibilites(List<Rondin> rondins)
{
var combin5 = rondins.Combinations(5);
var combin4 = rondins.Combinations(4);
var combin3 = rondins.Combinations(3);
var combin2 = rondins.Combinations(2);
var combin1 = rondins.Combinations(1);
return combin5.Concat(combin4).Concat(combin3).Concat(combin2).Concat(combin1).ToList();
}
With the function: (taken from this question: Algorithm to return all combinations of k elements from n)
public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int k)
{
return k == 0 ? new[] { new T[0] } :
elements.SelectMany((e, i) =>
elements.Skip(i + 1).Combinations(k - 1).Select(c => (new[] { e }).Concat(c)));
}
I need to search the list for a combination whose elements add up to a value close (within a certain precision) to a target, and this for each element in another list. Here is all my code for this part:
var possibilites = getPossibilites(opt.rondins);
possibilites = possibilites.Where(p => p.Sum(r => r.longueur + traitScie) < 144);
foreach(BilleOptimisee b in opt.billesOptimisees)
{
var proches = possibilites.Where(p => p.Sum(r => (r.longueur + traitScie)) < b.chute && Math.Abs(b.chute - p.Sum(r => r.longueur)) - (p.Count() * 0.22) < 0.01).OrderByDescending(p => p.Sum(r => r.longueur)).ElementAt(0);
if(proches != null)
{
foreach (Rondin r in proches)
{
opt.rondins.Remove(r);
b.rondins.Add(r);
possibilites = possibilites.Where(p => !p.Contains(r));
}
}
}
With the code I have, how can I limit the memory taken by my list ? Or is there a better solution to search in a very big set of combinations ?
Please, if my question is not good, tell me why and I will do my best to learn and ask better questions next time ;)
Your output list for combinations of 5 elements will have ~1.5*10^9 (that's billion with a b) sublists of size 5. If you use 32-bit integers, even neglecting list overhead and assuming a perfect list with 0 bytes of overhead, that will be ~30 GB (1.5*10^9 × 5 × 4 bytes)!
You should reconsider if you actually need to generate the list like you do, some alternative might be: streaming the list of elements - i.e. generating them on the fly.
That can be done by creating a function which takes the last combination as an argument and outputs the next one. (To see how it is done, think about how you increase a number by one: you go from the last digit to the first, remembering a "carry over" until you are done.)
A streaming example for choosing 2 out of 4:
start: {4,3}
curr = start {4, 3}
curr = next(curr) {4, 2} // reduce last by one
curr = next(curr) {4, 1} // reduce last by one
curr = next(curr) {3, 2} // cannot reduce more, reduce the first by one, and set the follower to maximal possible value
curr = next(curr) {3, 1} // reduce last by one
curr = next(curr) {2, 1} // similar to {3,2}
done.
Now, you need to figure how to do it for lists of size 2, then generalize it for arbitrary size - and program your streaming combination generator.
Good Luck!
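For reference, here is one way such a streaming generator might look in C#. It is a sketch only: the names StreamingCombinations and Combinations are invented, and it counts index positions upwards ({1,2}, {1,3}, ...) rather than downwards as in the walkthrough above, which is the same "carry" idea in the opposite direction. Only one combination exists in memory at a time.

```csharp
using System;
using System.Collections.Generic;

public static class StreamingCombinations
{
    // Lazily yields every k-element combination of 'items', one at a time.
    public static IEnumerable<T[]> Combinations<T>(IReadOnlyList<T> items, int k)
    {
        int n = items.Count;
        if (k < 0 || k > n) yield break;

        // index[i] is the position in 'items' of the i-th chosen element.
        var index = new int[k];
        for (int i = 0; i < k; i++) index[i] = i;

        while (true)
        {
            var combo = new T[k];
            for (int i = 0; i < k; i++) combo[i] = items[index[i]];
            yield return combo;

            // Find the rightmost position that can still move forward (the "carry").
            int pos = k - 1;
            while (pos >= 0 && index[pos] == n - k + pos) pos--;
            if (pos < 0) yield break;          // every position is at its maximum: done

            index[pos]++;
            // Reset everything to the right to the smallest values allowed.
            for (int i = pos + 1; i < k; i++) index[i] = index[i - 1] + 1;
        }
    }
}
```

With a generator like this, a filter such as the length check from the question can be applied inside a foreach loop, keeping only the candidates that pass instead of materialising all ~1.5 billion combinations first.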
Let your precision be defined in the imaginary spectrum.
Use a real index to access the leaf and then traverse the leaf with the required precision.
See PrecicseList at http://net7mma.codeplex.com/SourceControl/latest#Common/Collections/Generic/PrecicseList.cs
While the implementation is not 100% complete as linked you can find where I used a similar concept here:
http://net7mma.codeplex.com/SourceControl/latest#RtspServer/MediaTypes/RFC6184Media.cs
Using this concept I was able to re-order h.264 Access Units and their underlying Network Access Layer Components in what I consider a very interesting way... outside of interesting it also has the potential to be more efficient using close the same amount of memory.
et al, e.g, 0 can be proceeded by 0.1 or 0.01 or 0.001, depending on the type of the key in the list (double, float, Vector, inter alia) you may have the added benefit of using the FPU or even possibly Intrinsics if supported by your processor, thus making sorting and indexing much faster than would be possible on normal sets regardless of the underlying storage mechanism.
Using this concept allows for very interesting ordering... especially if you provide a mechanism to filter the precision.
I was also able to find several bugs in the bit-stream parser of quite a few well known media libraries using this methodology...
I found my solution; I'm writing it here so that other people who have a similar problem can have something to work with...
I made a recursive function that searches for a fixed number of possibilities that fit the conditions. When that number of possibilities has been found, I return the list of possibilities, do some calculations with the results, and then I can restart the process. I added a timer to stop the search when it takes too long. Since my condition is based on the sum of the elements, I only build possibilities from distinct values, and I search for a small number of possibilities each time (like 1).
So the function returns a possibility with a very high precision, I do what I need to do with this possibility, I remove its elements from the original list, and I call the function again with the same precision until nothing is returned, at which point I can continue with another precision. When many precisions are done, there are only about 30 elements left in my list, so I can ask for all the possibilities (that still fit the maximum sum), and this part is much easier than the beginning.
Here is my code:
public List<IEnumerable<Rondin>> getPossibilites(IEnumerable<Rondin> rondins, int nbElements, double minimum, double maximum, int instance = 0, double longueur = 0)
{
if(instance == 0)
timer = DateTime.Now;
List<IEnumerable<Rondin>> liste = new List<IEnumerable<Rondin>>();
//Get all distinct rondins that can fit into the maximal length
foreach (Rondin r in rondins.Where(r => r.longueur < (maximum - longueur)).DistinctBy(r => r.longueur).OrderBy(r => r.longueur))
{
//Check the current length
double longueur2 = longueur + r.longueur + traitScie;
//If the current length is under the maximal length
if (longueur2 < maximum)
{
//Get all the possibilities with all rondins except the current one, and add them to the list
foreach (IEnumerable<Rondin> poss in getPossibilites(rondins.Where(rondin => rondin.id != r.id), nbElements - liste.Count, minimum, maximum, instance + 1, longueur2).Select(possibilite => possibilite.Concat(new Rondin[] { r })))
{
liste.Add(poss);
if (liste.Count >= nbElements && nbElements > 0)
break;
}
//If this the current length in higher than the minimum, add it to the list
if (longueur2 >= minimum)
liste.Add(new Rondin[] { r });
}
//If we have enough possibilities, we stop the research
if (liste.Count >= nbElements && nbElements > 0)
break;
//If the research is taking too long, stop the research and return the list;
if (DateTime.Now.Subtract(timer).TotalSeconds > 30)
break;
}
return liste;
}

Recommended Combination Algorithms for Numbers with Ranges

I am currently trying to write C# code that finds multiple arrays of integers that equal a specified total when they are summed up. I would like to find these combinations while each integer in the array is given a range it can be.
For example, if our total is 10 and we have an int array of size 3 where the first number can be between 1 and 4, the second 2 and 4, and the third 3 and 6, some possible combination are [1, 3, 6], [2, 2, 6], and [4, 2, 4].
What sort of algorithm would help solve a problem like this in the most efficient amount of time? Also, what other things should I keep in mind when translating this problem into C# code?
I would do this using recursion. You can simply iterate over all possible values and see if they give a required sum.
Input
Let's suppose we have the following input pattern:
N S
min1 min2 min3 ... minN
max1 max2 max3 ... maxN
For your example
if our total is 10 and we have an int array of size 3 where the first
number can be between 1 and 4, the second 2 and 4, and the third 3 and
6
it will be:
3 10
1 2 3
4 4 6
Solution
We have read our input values. Now, we just try to use each possible number for our solution.
We will have a List which will store the current path:
static List<int> current = new List<int>();
The recursive function is pretty simple:
private static void Do(int index, int currentSum)
{
if (index == length) // Termination
{
if (currentSum == sum) // If result is a required sum - just output it
Output();
return;
}
// try all possible solutions for current index
for (int i = minValues[index]; i <= maxValues[index]; i++)
{
current.Add(i);
Do(index + 1, currentSum + i); // pass new index and new sum
current.RemoveAt(current.Count() - 1);
}
}
For non-negative values we can also include the following condition. It is an improvement to the recursion that cuts off a huge number of incorrect iterations: if currentSum is already greater than sum, it is useless to continue in this recursion branch:
if (currentSum > sum) return;
Actually, this algorithm is a simple "find combinations that give a sum S" problem solution with one difference: inner loop indices within minValue[index] and maxValue[index].
Demo
Here is the working IDEOne demo of my solution.
You cannot do much better than nested for loops/recursion. Though if you are familiar with the 3SUM problem you will know a little trick to reduce the time complexity of this sort of algorithm! If you have n ranges then you know what number you have to pick from the nth range after you make your first n-1 choices!
I will use an example to walk through my suggestion.
if our total is 10 and we have an int array of size 3 where the first number can be between 1 and 4, the second 2 and 4, and the third 5 and 6
First of all lets process the data to be a bit nicer to deal with. I personally like the idea of working with ranges that start at 0 instead of arbitrary numbers! So we subtract the lower bounds from the upper bounds:
(1 to 4) -> (0 to 3)
(2 to 4) -> (0 to 2)
(5 to 6) -> (0 to 1)
Of course now we need to adjust our target sum to reflect the new ranges. So we subtract our original lower bounds from our target sum as well!
TargetSum = 10-1-2-5 = 2
Now we can represent our ranges with just the upper bound since they share a lower bound! So a range array would look something like:
RangeArray = [3,2,1]
Lets sort this (it will become more obvious why later). So we have:
RangeArray = [1,2,3]
Great! Now onto the beef of the algorithm... the summing! For now I will use for loops as it is easier to use for example purposes. You will have to use recursion. Yeldar's code should give you a good starting place.
result = []
for i from 0 to RangeArray[0]:
    SumList = [i]
    newSum = TargetSum - i
    for j from 0 to RangeArray[1]:
        if (newSum-j)>=0 and (newSum-j)<=RangeArray[2] then
            finalList = SumList + [j, newSum-j]
            result.append(finalList)
Note the inner loop. This is what was inspired by the 3SUM algorithm. We take advantage of the fact that we know what value we have to pick from the third range (since it is defined by our first 2 choices).
From here you have to of course re-map the results back to the original ranges by adding the original lowerbounds to the values that came from the corresponding ranges.
Notice that we now understand why it may be a good idea to sort RangeArray. The last range gets absorbed into the second-to-last range's loop. We want the largest range to be the one that does not loop.
I hope this helps to get you started! If you need any help translating my pseudocode into c# just ask :)
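A possible C# rendering of the "last value is forced" trick, generalised to any number of ranges with recursion. It is a sketch under stated simplifications: the names RangeSumSketch, Solve and Fill are invented, the ranges are used as given (the shift to zero-based ranges and the sorting step described above are left out), and results are collected in a list for brevity.

```csharp
using System;
using System.Collections.Generic;

public static class RangeSumSketch
{
    // Lists every array v with min[i] <= v[i] <= max[i] whose entries add up to target.
    public static List<int[]> Solve(int[] min, int[] max, int target)
    {
        if (min.Length != max.Length) throw new ArgumentException("min and max must have the same length");
        var results = new List<int[]>();
        if (min.Length > 0)
            Fill(0, target, new int[min.Length], min, max, results);
        return results;
    }

    private static void Fill(int index, int remaining, int[] current,
                             int[] min, int[] max, List<int[]> results)
    {
        int last = min.Length - 1;
        if (index == last)
        {
            // The final value is dictated by the earlier choices, so no loop is needed:
            // it just has to fall inside the last range.
            if (remaining >= min[last] && remaining <= max[last])
            {
                current[last] = remaining;
                results.Add((int[])current.Clone());
            }
            return;
        }

        for (int v = min[index]; v <= max[index]; v++)
        {
            current[index] = v;
            Fill(index + 1, remaining - v, current, min, max, results);
        }
    }
}
```

For the question's original example (ranges 1 to 4, 2 to 4 and 3 to 6, total 10), Solve(new[] { 1, 2, 3 }, new[] { 4, 4, 6 }, 10) returns 10 arrays, starting with [1, 3, 6]. Reordering the ranges so the widest one comes last would follow the sorting advice above.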

Divide a list of string into groups randomly

Given a list of strings with n items, I wish to divide it into b groups (b<=n) where each group has i to j (j>=i) items.
An example:
Say
List<string> lst=new List<string>(new string[]{"a","b","c","d"});
(Therefore n=4)
Assume the function that provide this functionality is
List<List<string>> DivideIntoGroup(List<string> lst, int b, int i, int j)
one of the possible result of DivideIntoGroup(lst, 3, 1, 2) is
{"a"},
{"b","c"},
{"d"}
How should I write the DivideIntoGroup functions?
I am not a C# expert so I will give you a purely mathematical solution, and hopefully you will be able to translate it in your language.
Basically your task consists of two separate parts: choose b groups i to j elements each, and randomness. The second should be easy - just random shuffle the elements initially and then do the group splitting. Lets get down to the interesting part:
How to split n elements in b groups containing i to j elements each?
A straightforward solution would be to take a random number between i and j for the number of elements of the first group, then the second, etc. However, there is no guarantee that, doing so, you will not be left with a last group whose number of elements is not between i and j. Also, such a solution does not give a uniformly random distribution.
The correct approach will be to get the number of elements of the first group, respecting the probability of solution of the overall group splitting when you take as many elements - you basically are interested how many solutions are overall for the task(n, b, i, j) and how many will exist for the task(n-k, b-1, i, j) if we assume we take k elements in the first group. If we are able to calculate just the number of solutions, you can take each k with its respective probability and do random sampling of k for the first group, then the second and so on...
So now the question is: how many solutions are there for task(n, b, i, j)?
Noting the fact that task(n, b, i, j) = sum(k=i to j) task(n-k, b - 1, i, j), you can find these numbers easily using recursion (use dynamic programming, i.e. memoization, so that you need not calculate the values more than once).
PS: There might be a closed form solution for the number of solutions, but I can't figure it out right away and as long as n * b is kept relatively small (< 10^6) recursive solution should work.
EDIT
PS2: actually the numbers in task(n, b, i, j) might get pretty large very fast, so consider using big integers.
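The counting-then-sampling idea might be sketched in C# like this. All names (GroupSizeSampler, CountWays, SampleSizes, RandomBelow) are invented for the example, lo and hi stand for the question's i and j, and "uniform" here means uniform over ordered sequences of group sizes, which is my reading of the answer. BigInteger is used for the counts, as suggested in PS2.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class GroupSizeSampler
{
    // ways[g, m] = number of ordered size sequences for g groups holding m items in total,
    // each size between lo and hi. This is task(m, g, lo, hi) from the text.
    public static BigInteger[,] CountWays(int n, int b, int lo, int hi)
    {
        if (lo < 0 || hi < lo) throw new ArgumentException("need 0 <= lo <= hi");
        var ways = new BigInteger[b + 1, n + 1];     // entries start at zero
        ways[0, 0] = BigInteger.One;
        for (int g = 1; g <= b; g++)
            for (int m = 0; m <= n; m++)
                for (int size = lo; size <= hi && size <= m; size++)
                    ways[g, m] += ways[g - 1, m - size];
        return ways;
    }

    // Picks the group sizes so that every valid sequence of sizes is equally likely.
    public static List<int> SampleSizes(int n, int b, int lo, int hi, Random rng)
    {
        BigInteger[,] ways = CountWays(n, b, lo, hi);
        if (ways[b, n].IsZero) throw new ArgumentException("no valid split exists");

        var sizes = new List<int>();
        int remaining = n;
        for (int g = b; g >= 1; g--)
        {
            // A size is chosen with probability ways[g-1, remaining-size] / ways[g, remaining].
            BigInteger ticket = RandomBelow(ways[g, remaining], rng);
            for (int size = lo; size <= hi && size <= remaining; size++)
            {
                BigInteger weight = ways[g - 1, remaining - size];
                if (ticket < weight)
                {
                    sizes.Add(size);
                    remaining -= size;
                    break;
                }
                ticket -= weight;
            }
        }
        return sizes;
    }

    // Uniform random value in [0, bound) by rejection sampling; simple rather than fast.
    private static BigInteger RandomBelow(BigInteger bound, Random rng)
    {
        byte[] bytes = bound.ToByteArray();
        BigInteger candidate;
        do
        {
            rng.NextBytes(bytes);
            bytes[bytes.Length - 1] &= 0x7F;     // clear the sign bit so the value is non-negative
            candidate = new BigInteger(bytes);
        } while (candidate >= bound);
        return candidate;
    }
}
```

To finish DivideIntoGroup, one would shuffle the list and then cut it into consecutive slices of the sampled sizes. For the example in the question, CountWays(4, 3, 1, 2)[3, 4] is 3, corresponding to the size sequences 1+1+2, 1+2+1 and 2+1+1.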
What I would do as a solution is so, this is pseudo code of course:
func( n, b, i, j )
{
if(n == 0)
return //finished
if(i>j or i>min(j,n))
return //no solution possible down this path
out = choose_random_between (i , min(j,n))
current_ave_of_cells_per_group = ( (n - out) / (b - 1) )
if current_ave_of_cells_per_group < i
func ( n, b, i, min(out-1,n) )
else if current_ave_of_cells_per_group > j
func ( n, b, out+1, min(j,n) )
else
**form the group consisting of 'out' numbers**
func ( n-out, b-1, i, min(j,n-out) )
}

Building a non sequential list of numbers (From a large range)

I need to create a non-sequential list of numbers that fit within a range. For instance, I need to generate a list of numbers from 1 to 1 million and make sure that none of the numbers are in sequential order, i.e. that they are completely shuffled. I guess my first question is: are there any good algorithms out there that could help, and how best to implement this?
I currently am not sure the best way to implement, either via a c# console app that will spit out the numbers in an XML file or in a database that will spit out the numbers into a table or a set of tables, but that is really secondary to actually working out the best way of "shuffling" the set of numbers.
Any advice guys?
Rob
First off, if none of the numbers are in sequential order then every number in the sequence must be less than its predecessor. A sequence which has that property is sorted from biggest to smallest! Clearly that is not what you want. (Or perhaps you simply do not want any subsequence of the form 5, 6, 7 ? But 6, 8, 20 would be OK?)
To answer your question properly we need to know more information about the problem space. Things I would want to know:
1) Is the size of the range equal to, larger than, or smaller than the size of the sequence? That is, are you going to ask for ten numbers between 1 and 10, five numbers between 1 and 10 or fifty numbers between 1 and 10?
2) Is it acceptable for the sequence to contain duplicates? (If the number of items in the sequence is larger than the range, then clearly yes.)
3) What is the randomness being used for? Most random number generators are only pseudo-random; a clever attacker can deduce the next "random" number by knowing the previous ones. If for example you are generating a series of five cards out of a deck of 52 to make a poker hand, you want really strong randomness; you don't want players to be able to deduce what their opponents have in their hands.
How "non-sequential" do you want it?
You could easily generate a list of random numbers from a range with the Random class:
Random rnd1 = new Random();
List<int> largeList = new List<int>();
for (int i = 0; i < largeNumber; i++)
{
    // Next(1, 1000001) returns a value from 1 to 1000000 inclusive
    largeList.Add(rnd1.Next(1, 1000001));
}
Edit to add
Admittedly the Durstenfeld algorithm (modern version of the Fisher–Yates shuffle apparently) is much faster:
var fisherYates = new List<int>(upperBound);
for (int i = 0; i < upperBound; i++)
{
fisherYates.Add(i);
}
int n = upperBound;
while (n > 1)
{
n--;
int k = rnd.Next(n + 1);
int temp = fisherYates[k];
fisherYates[k] = fisherYates[n];
fisherYates[n] = temp;
}
For the range 1 to 10000 doing a brute force "find a random number I've not yet used" takes around 4-5 seconds, while this takes around 0.001.
Props to Greg Hewgill for the links.
I understand that you want to get a random array of length 1 million containing all numbers from 1 to 1 million, with no duplicates. Is that right?
You should build up an array with your numbers ranging from 1 to 1 million, then start shuffling. But it can happen (that is true randomness) that two or even more numbers are sequential.
Have a look here
Here's a C# function to get you started:
public IEnumerable<int> GetRandomSequence(int max)
{
    var r = new Random();
    while (true)
    {
        // Random.Next(max) returns a value from 0 up to, but not including, max
        yield return r.Next(max);
    }
}
call it like this to get a million numbers in the range 0-9999998:
var numbers = GetRandomSequence(9999999).Take(1000000);
As for sorting, or if you don't want to allow repeats, look at Enumerable.Range() (which will give you a consecutive ordered sequence) and use a Fisher-Yates (or Knuth) shuffle algorithm (which you can find all over the place).
"Completely shuffled" is a very misunderstood term. One trick fraud experts use when examining what should be "random" data is to watch for cases where there are no repeated values (like the 88 in 374388123), because in a truly random sequence the chance of never having such a pair is very low... Exactly what are you trying to do? What exactly do you mean by "completely shuffled"? If all you mean is a random sequence of digits, then just use the Random class in the CLR to generate random numbers between 0 and 1M, as many as you need...
Well ,you could go with something like this (assuming that you want every number exactly once):
DECLARE @intFrom int
DECLARE @intTo int
DECLARE @tblList table (_id uniqueidentifier, _number int)
SET @intFrom = 0
SET @intTo = 1000000
WHILE (@intFrom < @intTo)
BEGIN
    INSERT INTO @tblList
    SELECT NewID(), @intFrom
    SET @intFrom = @intFrom + 1
END
SELECT *
FROM @tblList
ORDER BY _id
DISCLAIMER: I didn't test this, since I don't have an SQL Server at my disposal at the moment.
This may get you what you need:
1) Populate a list of numbers in order. If your range is 1 - x, it'll look like this:
[1, 2, 3, 4, 5, 6, 7, 8, 9, ... , x]
2) Loop over the list x times, each time choosing a random number between 0 and the length of your list - 1.
3) Use this chosen number to select the corresponding element from your list, and add this number to your output list.
4) Delete the element you just selected from your list. Rinse, repeat.
This will work for any range of numbers, not just lists that start with 1 or 0. The pseudocode looks like this:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
shuffled_nums = []
for i in range(0, len(nums)):
    random_index = rand(0, len(nums) - 1)
    shuffled_nums.add(nums[random_index])
    del(nums[random_index])
