Efficient and deterministic ranking of items in a collection? - c#

I have a list of billions of items in SQL which the user can reorder arbitrarily by moving an item to another position in the list. I am considering a simple solution that halves a double:
Id, Rank
1 10
2 20
3 30
4 40
5 50
Now the user moves item id=3 to the first position, and I recalculate its rank from its adjacent items (0 means no neighbour on the left, max means no neighbour on the right):
Id, Rank
3 (0+10)/2 = 5
1 10
2 20
4 40
5 50
This has a flaw: it works until the gap between adjacent ranks shrinks to the precision limit of a double. After that, several elements end up with the same rank and can no longer be moved relative to each other.
This can be avoided by occasionally recalculating the rank for the entire collection, but I hesitate to implement this at the moment because it seems like too much work.
I would like to know whether there is an algorithmic solution that does not involve updating billions of items, or whether this problem has a well-known name so that I can look for an appropriate solution myself.
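To see how quickly the midpoint scheme from the question runs out of room, here is a minimal sketch. The names RankDemo, Midpoint and InsertsUntilExhausted are made up for illustration, and it assumes ranks are plain doubles and that every new item is dropped directly after the same left neighbour, which is a bad case for this scheme.

```csharp
using System;

public static class RankDemo
{
    // Rank for an item dropped between two neighbours: the midpoint of their ranks.
    public static double Midpoint(double left, double right) => left + (right - left) / 2;

    // Counts how many items can be inserted directly after `left`
    // before the midpoint stops being distinct from its neighbours.
    public static int InsertsUntilExhausted(double left, double right)
    {
        int count = 0;
        while (true)
        {
            double mid = Midpoint(left, right);
            if (mid <= left || mid >= right) return count; // no room left between the two ranks
            right = mid; // the next insert goes between `left` and the item just placed
            count++;
        }
    }
}
```

Calling RankDemo.InsertsUntilExhausted(10, 20) gives the number of moves available between ranks 10 and 20. Repeated inserts at the same spot halve the gap every time, so that number is roughly on the order of the mantissa width of a double and does not depend on the size of the collection.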

Related

How to pick random items through cumulative probability?

Exercise Background
The exercise consists in generating a 2D map with a user given x,y size of said map, and then place on each cell of the map random items from a table.
I have a cell in an [x, y] coordinate of an Items matrix and I have to pick items randomly for every cell of this matrix.
My Problem
I have to select random items from a table of 4 items that have their probabilities shown in cumulative probability, and a cell that has such items can have more than 1 and different combinations of those items.
I don't really know how to go about this problem, taking in account that 2 of the items have the same probability on the given table for the homework.
This is the table of probability given:
Food - 1
Weapons - 0.5
Enemy - 0.5
Trap - 0.3
My Items enumeration:
[Flags]
enum Items
{
Food = 1<<0,
Weapon = 1<<1,
Enemy = 1<<2,
Trap = 1<<3
}
Again, the expected output is to pick randomly, using these probabilities, which items a cell has. What I'd like as an answer is just a starting point or a way to approach the problem. I still want to try to do it myself, so please avoid complete code solutions if you can.
I find it easier to work with integers in this type of problem, so I'll work with:
Food - 10
Weapons - 5
Enemy - 5
Trap - 3
That gives a total of 10 + 5 + 5 + 3 = 23 total possible options.
Most computer RNGs work from base 0, so split the 23 options (as in 0..22) like this:
Food - 0..9 giving 10 options.
Weapons - 10..14 giving 5 options.
Enemy - 15..19 giving 5 options.
Trap - 20..22 giving 3 options.
Work through the possibilities in order, stopping when you reach the selected option. I will use pseudocode as my C++ is very rusty:
function pickFWET()
  pick <- randomInRange(0 to 22);
  if (pick < 10) return FOOD;
  if (pick < 15) return WEAPONS;
  if (pick < 20) return ENEMY;
  if (pick < 23) return TRAP;
  // If we reach here then there was an error.
  throwError("Wrong pick in pickFWET");
end function pickFWET
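For reference, the pseudocode above might look roughly like this in C#. This is only a sketch under the integer weights chosen in this answer (10, 5, 5, 3); the class WeightedPick and its methods are hypothetical names, and it returns a single item per call, so combining several items in one cell is still left to you.

```csharp
using System;

[Flags]
public enum Items
{
    Food = 1 << 0,
    Weapon = 1 << 1,
    Enemy = 1 << 2,
    Trap = 1 << 3
}

public static class WeightedPick
{
    // Maps a pick in 0..22 onto the ranges
    // Food 0..9, Weapon 10..14, Enemy 15..19, Trap 20..22.
    public static Items FromPick(int pick)
    {
        if (pick < 0 || pick >= 23) throw new ArgumentOutOfRangeException(nameof(pick));
        if (pick < 10) return Items.Food;
        if (pick < 15) return Items.Weapon;
        if (pick < 20) return Items.Enemy;
        return Items.Trap;
    }

    // Random.Next(23) returns an integer in 0..22.
    public static Items PickRandom(Random rng) => FromPick(rng.Next(23));
}
```

Keeping FromPick separate from the random draw makes the range boundaries easy to check by hand.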
If two items have the same cumulative probability then the probability of getting the latter item is 0. Double check the probability table, but if it is correct, then 'Weapons' is not a valid option to get.
In general, however, if you could somehow generate a random number between 0 and 1, the problem would be easy, right? With a few if conditions you can choose one of the options given this random number.
With a little bit of search you can easily find how to generate a random number in whatever language you desire.

Building up a matrix dynamically in the xy plane

I have to read some data sequentially (from a file) and put the data into a matrix. I don't know the dimensions of the matrix initially. For example, consider the data plotted on an x, y plane with years on the y axis and increments on the x axis. At first the data came in for 1990 with 3 increments:
year increment(1991) increment(1992) increment(1993)
1990 12 25 35
Note that I will only know about the increments after reading the data line. So next is 1989 with 4 increments. So it should be
year increment(1990) increment(1991) increment(1992) increment(1993)
1989 23 33 43 53
1990 0 12 25 35
Note that when the new data came in another increment year came in the y axis (1990). As there is no increment year of 1990 for year 1990, this has to be filled with zero or kept empty, but the
In the end I have to create a matrix. For example
year increment(1990) increment(1991) increment(1992) increment(1993)
1989 23 33 43 53
1990 0 12 25 35
1991 0 0 23 33
To build up the matrix, the difficult part is I don't know the years/increments initially, I will only know after reading the entire data. I would like to plot the matrix while reading the data so that I can avoid more than one pass through the data.
The placement of the matrix in the xy axis will be only known after the entire data is processed!
Any suggestions?
I quite like the sparse matrix solution, but you could use a version of http://en.wikipedia.org/wiki/Dynamic_array. Dynamic arrays are arrays that you resize when they get too full. Resizing is expensive, but if you increase the size by a constant factor every time you resize the cost of resizing works out so that the total cost is still O(n) if the final size has n elements.
To use dynamic arrays for this, you could create two dynamic arrays for each row, one growing with years larger than those seen so far, and one growing with years smaller than those seen so far (so with years in decreasing order along the array).
Another way to do this would be to create a single area of storage for the matrix, with only the central section used, so there is always space to add entries in any direction. You would then have to check that increasing the size of this storage by a constant factor when you were about to run over the edges would lead to a total cost of at most O(n). I suspect that it would, but the constant factors might not be very good.
You can build it as a sparse matrix with SortedList<int, SortedList<int, int>>
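A minimal sketch of that sparse layout, assuming integer years and integer values and treating missing cells as 0. The SparseMatrix class and its member names are hypothetical; only SortedList and SortedSet come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SparseMatrix
{
    // year -> (increment year -> value); both levels stay sorted by key.
    private readonly SortedList<int, SortedList<int, int>> rows =
        new SortedList<int, SortedList<int, int>>();

    // Every increment year seen so far, in ascending order.
    private readonly SortedSet<int> columns = new SortedSet<int>();

    public void Set(int year, int incrementYear, int value)
    {
        if (!rows.TryGetValue(year, out var row))
        {
            row = new SortedList<int, int>();
            rows.Add(year, row);
        }
        row[incrementYear] = value;
        columns.Add(incrementYear);
    }

    // Cells that were never set read as 0.
    public int Get(int year, int incrementYear) =>
        rows.TryGetValue(year, out var row) && row.TryGetValue(incrementYear, out var v) ? v : 0;

    // Renders the matrix with one line per year, using the columns known at this point.
    public IEnumerable<string> ToLines()
    {
        yield return "year " + string.Join(" ", columns);
        foreach (var year in rows.Keys)
            yield return year + " " + string.Join(" ", columns.Select(c => Get(year, c)));
    }
}
```

Because rows and columns are keyed by year, data can arrive in any order in a single pass, and the final placement only matters when ToLines is called.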

Selecting random item from list having probability weighting using c#?

I have a scenario where I take a list of users (20 users) from my database and give each user a weighting:
first 5 users: probability factor of 4
next 5 users: probability factor of 3
next 5 users: probability factor of 2
next 5 users: probability factor of 1
So a user in the first 5 is 4 times more
likely to be picked than a user in the last 5.
How can I select a random user from the list using these probabilities in C#?
Can anybody help me with this? I am stuck on the logic.
You could add each user to the list as many times as its probability factor. So the first 5 users are in the list 4 times, the next 5 users 3 times, and so on. Then just select one user at random from the complete list.
Create a list of partial sums of weights. In your example, it would be
[4, 8, 12, 16, 20, 23, ...]
The last element is the sum of all weights. Pick a random number between 0 and this sum (exclusive). Then your element is the first element whose partial sum is greater than the random number. So if you got 11, you need the third element; if you got 16, the fifth; and so on.
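The partial-sum idea could be sketched as follows. The class WeightedChoice and its method names are invented for illustration, and the sketch assumes all weights are positive integers so that the partial sums are strictly increasing.

```csharp
using System;
using System.Collections.Generic;

public static class WeightedChoice
{
    // Running totals of the weights, e.g. {4, 4, 3} -> {4, 8, 11}.
    public static int[] PartialSums(IReadOnlyList<int> weights)
    {
        var sums = new int[weights.Count];
        int total = 0;
        for (int i = 0; i < weights.Count; i++)
        {
            total += weights[i];
            sums[i] = total;
        }
        return sums;
    }

    // Index of the first partial sum strictly greater than `roll`.
    public static int IndexFor(int[] sums, int roll)
    {
        int pos = Array.BinarySearch(sums, roll);
        // Exact hit: the next element is the first greater one.
        // Miss: the complement is the index of the first greater element.
        return pos >= 0 ? pos + 1 : ~pos;
    }

    // Rolls in 0..total-1 and maps the roll to an element index.
    public static int PickIndex(int[] sums, Random rng) =>
        IndexFor(sums, rng.Next(sums[sums.Length - 1]));
}
```

Compute the partial sums once, then call PickIndex for every draw. The binary search keeps each draw at O(log n), and nothing has to be duplicated in memory.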
I have a (bit hacky) solution for you:
Create a list containing the users, where each user is added as often as his weighting says (e.g. if a user has a weighting of 5, add him 5 times to the list). Then use a Random to fetch a user from that list; that should solve your problem.
One solution would be to find the smallest common denominator of the weights (or just multiply them together) and create a new list that contains the keys of the first list, but multiple times, i.e.:
user1
user1
user2
user3
user3
user3
Then just do a newList.Skip(rnd.Next(newList.Count)).Take(1), where rnd is a Random instance, and you are set!
You could apportion the probability range amongst the users using a dictionary. eg
User 1 has 1-4 (so max of 4)
User 2 has 5-8 (max of 8) etc etc...
Then after selecting the random number find which user within the dictionary it relates to. You can do this using Linq like so...
int iUser = users.OrderBy(p => p.Value).First(p => choice <= p.Value).Key;
..where users is a Dictionary<int,int> (Key = user number, Value = max value) and choice is the randomly generated value. The OrderBy is there because a Dictionary does not guarantee its enumeration order.
This is obviously more complex than the "multiple entries" method proposed by others but has its advantages if you
a) need a fractional weighting which makes the common denominator of your multiple entry method very small (resulting in many entries) or
b) need to weight heavily in favour of particular users (which would again have the effect of making the multiple entry method very large).
Working Example at ideone.

Struggling to make algorithm to generate board for a puzzle game

I'm looking to make a number puzzle game. For the sake of the question, let's say the board is a grid consisting of 4 x 4 squares. (In the actual puzzle game, this number will be 1..15)
A number may only occur once in each column and once in each row, a little like Sudoku, but without "squares".
Valid:
[1, 2, 3, 4
2, 3, 4, 1
3, 4, 1, 2
4, 1, 2, 3]
I can't seem to come up with an algorithm that will consistently generate valid, random n x n boards.
I'm writing this in C#.
Start by reading my series on graph colouring algorithms:
http://blogs.msdn.com/b/ericlippert/archive/tags/graph+colouring/
It is going to seem like this has nothing to do with your problem, but by the time you're done, you'll see that it has everything to do with your problem.
OK, now that you've read that, you know that you can use a graph colouring algorithm to describe a Sudoku-like puzzle and then solve a specific instance of the puzzle. But clearly you can use the same algorithm to generate puzzles.
Start by defining your graph regions that are fully connected.
Then modify the algorithm so that it tries to find two solutions.
Now create a blank graph and set one of the regions at random to a random colour. Try to solve the graph. Were there two solutions? Then add another random colour. Try it again. Were there no solutions? Then back up a step and add a different random colour.
Keep doing that -- adding random colours, backtracking when you get no solutions, and continuing until you get a puzzle that has a unique solution. And you're done; you've got a random puzzle generator.
It seems you could use this valid example as input to an algorithm that randomly swapped two rows a random number of times, then swapped two random columns a random number of times.
There aren't too many combinations you need to try. You can always rearrange a valid board so the top row is 1,2,3,4 (by remapping the symbols), and the left column is 1,2,3,4 (by rearranging rows 2 thru 4). On each row there are only 6 permutations of the remaining 3 symbols, so you can loop over those to find which of the 216 possible boards are valid. You may as well store the valid ones.
Then pick a valid board randomly, randomly rearrange the rows, and randomly reassign the symbols.
I don't speak C#, but the following algorithm ought to be easily translated.
Associate a set consisting of the numbers 1..N with each row and column:
for i = 1 to N
  row_set[i] = column_set[i] = Set(1 .. N)
Then make a single pass through the matrix, choosing an entry for each position randomly from the set elements valid at that row and column. Remove the number chosen from the respective row and column sets.
for r = 1 to N
  for c = 1 to N
    k = RandomChoice( Intersection( column_set[c], row_set[r] ))
    puzzle_board[r, c] = k
    column_set[c] = column_set[c] - k
    row_set[r] = row_set[r] - k
  next c
next r
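A rough C# translation of the single pass above. One caveat the pseudocode does not handle: the intersection can be empty partway through a row (for example after 2, 3, 1 in the second row of a 4 x 4 board whose first row is 1, 2, 3, 4). This sketch simply restarts when that happens, which is an addition of mine and not part of the original algorithm. The LatinSquare class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LatinSquare
{
    // One greedy pass; returns null when some cell has no valid candidate left.
    private static int[,] TryFill(int n, Random rng)
    {
        var rowSets = Enumerable.Range(0, n).Select(_ => new HashSet<int>(Enumerable.Range(1, n))).ToArray();
        var colSets = Enumerable.Range(0, n).Select(_ => new HashSet<int>(Enumerable.Range(1, n))).ToArray();
        var board = new int[n, n];
        for (int r = 0; r < n; r++)
        {
            for (int c = 0; c < n; c++)
            {
                var candidates = rowSets[r].Intersect(colSets[c]).ToList();
                if (candidates.Count == 0) return null; // dead end, caller restarts
                int k = candidates[rng.Next(candidates.Count)];
                board[r, c] = k;
                rowSets[r].Remove(k);
                colSets[c].Remove(k);
            }
        }
        return board;
    }

    // Repeats the greedy pass until it completes.
    public static int[,] Generate(int n, Random rng)
    {
        while (true)
        {
            var board = TryFill(n, rng);
            if (board != null) return board;
        }
    }

    // True when every row and every column holds n distinct values.
    public static bool IsValid(int[,] board)
    {
        int n = board.GetLength(0);
        for (int i = 0; i < n; i++)
        {
            var row = new HashSet<int>();
            var col = new HashSet<int>();
            for (int j = 0; j < n; j++)
            {
                row.Add(board[i, j]);
                col.Add(board[j, i]);
            }
            if (row.Count != n || col.Count != n) return false;
        }
        return true;
    }
}
```

Restarting is the simplest fix but it may need many attempts on larger boards, and the resulting boards are not uniformly distributed. Backtracking within the pass would be the next refinement.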
Looks like you want to generate uniformly distributed Latin Squares.
This pdf has a description of a method by Jacobson and Matthews (which was published elsewhere, a reference of which can be found here: http://designtheory.org/library/encyc/latinsq/z/)
Or you could potentially pre-generate a "lot" of them (before you ship :-)), store that in a file and randomly pick one.
Hope that helps.
The easiest way I can think of would be to create a partial game and solve it. If it's not solvable, or if it's wrong, make another. ;-)
Sudoku without squares sounds a bit like Sudoku. :)
http://www.codeproject.com/KB/game/sudoku.aspx
There is an explanation of the board generator code they use there.
Check out http://www.chiark.greenend.org.uk/~sgtatham/puzzles/ - he's got several puzzles that have precisely this constraint (among others).
A further solution would be this. Suppose you have a number of solutions. For each of them, you can generate a new solution by simply permuting the identifiers (1..15). These new solutions are of course logically the same, but to a player they will appear different.
The permutation might be done by treating each identifier in the initial solution as an index into an array, and then shuffling that array.
Use your first valid example:
1 2 3 4
2 3 4 1
3 4 1 2
4 1 2 3
Then, create randomly 2 permutations of {1, 2, 3, 4}.
Use the first to permute rows and then the second to permute columns.
You can find several ways to create permutations in Knuth's The Art of Computer Programming (TAOCP), Volume 4 Fascicle 2, Generating All Tuples and Permutations (2005), v+128pp. ISBN 0-201-85393-0.
If you can't find a copy in a library, a preprint (of the part that discusses permutations) is available at his site: fasc2b.ps.gz
EDIT - CORRECTION
The above solution is similar to 500-Internal Server Error's one. But I think neither will find all valid arrangements.
For example they'll find:
1 3 2 4
3 1 4 2
2 4 1 3
4 2 3 1
but not this one:
1 2 3 4
2 1 4 3
3 4 1 2
4 3 2 1
One more step is needed: after rearranging rows and columns (using either my way or 500's), create one more permutation (let's call it s3) and use it to permute all the numbers in the array.
s3 = randomPermutation(1 ... n)
for i=1 to n
  for j=1 to n
    array[i,j] = s3( array[i,j] )
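Putting the three permutations together, a sketch in C# could look like this. It starts from the cyclic board used in the example (entry = (row + column) mod n, shifted to 1..n) and uses a Fisher-Yates shuffle for each permutation. The BoardShuffler class and its method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class BoardShuffler
{
    // Fisher-Yates shuffle of 0..n-1.
    public static int[] RandomPermutation(int n, Random rng)
    {
        var p = Enumerable.Range(0, n).ToArray();
        for (int i = n - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            (p[i], p[j]) = (p[j], p[i]);
        }
        return p;
    }

    // Permutes the rows, columns and symbols of the cyclic n x n board.
    public static int[,] Generate(int n, Random rng)
    {
        var rows = RandomPermutation(n, rng);
        var cols = RandomPermutation(n, rng);
        var symbols = RandomPermutation(n, rng);
        var board = new int[n, n];
        for (int r = 0; r < n; r++)
        {
            for (int c = 0; c < n; c++)
            {
                int baseValue = (rows[r] + cols[c]) % n; // entry of the cyclic board, 0-based
                board[r, c] = symbols[baseValue] + 1;    // relabel, then shift to 1..n
            }
        }
        return board;
    }
}
```

Every board produced this way is valid, but the method only reaches boards that are rearrangements of the starting board, so it should not be treated as a uniform sample of all valid boards.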

Centering Divisions Around Zero

I'm trying to create something that sort of resembles a histogram. I'm trying to create buckets from an array.
Suppose I have an array of random doubles between -10 and 10; this is very simplified. I then want to specify a center point, in this case 0, and the number of buckets.
If I want 4 buckets the division would be -10 to -5, -5 to 0, 0 to 5 and 5 to 10. Not that complicated, right? Now if I change the min and max to -12 and -9 and ask for 4 divisions, it's more complicated. I either want a division from -3 to 3, so that it is centered around 0, or divisions from -6 to 0 and 0 to 6.
It's not that hard to find the division size
= Math.Ceiling((Abs(Max) + Abs(Min)) / Divisions)
Then you would basically have an if statement to determine whether you want it centered on 0 or on an edge. You then iterate out from either 0 or DivisionSize/2 depending on the situation. You may not ALWAYS end up with the specified number of divisions but it will be close. Then you iterate through the array and increment the bin count.
Does this seem like a good way to go about this? This method would surely work but it does not seem to be the most elegant. I'm curious as to whether the creation of the bins and the counting from the list could be done in a clever class with linq in a more elegant way?
Something like creating the bins and then having each bin be a property {get;} that returns list.Count(x=> x >= Lower && x < Upper).
To me it seems simpler: you need to find the lower bound and the size of each "division".
Since you want it to be symmetrical around 0, depending on the number of divisions you either get one division that includes 0 for odd numbers (-3,3) or two that meet at 0 for even ones (-3,0)(0,3).
lowerBound = - Max(Abs(from), Abs(to))
bucketSize = 2 * Abs(lowerBound) / divisions
(throw in Ceiling and update bucketSize and lowerBound if needed)
Then use .Aggregate to update an array of buckets (position would be (value-lowerBound)/bucketSize, with additional range checks if needed).
Note: do not implement get the way you suggested. Getters are not expected to perform non-trivial work like walking a large array.
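A sketch of the approach described in this answer, assuming the bucket range is symmetric around 0 and that values on or beyond the outer edges are clamped into the first or last bucket. The class name SymmetricHistogram and the clamping rule are my own choices; the range must not be empty (min and max cannot both be 0).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SymmetricHistogram
{
    // Counts values into `divisions` equal buckets spanning [-m, m],
    // where m is the larger of |min| and |max|.
    public static int[] Count(IEnumerable<double> values, double min, double max, int divisions)
    {
        double lowerBound = -Math.Max(Math.Abs(min), Math.Abs(max));
        double bucketSize = 2 * Math.Abs(lowerBound) / divisions;

        return values.Aggregate(new int[divisions], (buckets, v) =>
        {
            int index = (int)Math.Floor((v - lowerBound) / bucketSize);
            index = Math.Min(Math.Max(index, 0), divisions - 1); // range check: clamp to valid buckets
            buckets[index]++;
            return buckets;
        });
    }
}
```

With an even number of divisions a bucket boundary falls on 0, and with an odd number the middle bucket straddles 0, which matches the two cases in the question. The counts are computed once in a single pass rather than on every property access.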
