Possible duplicate: Help Me Figure Out A Random Scheduling Algorithm using Python and PostgreSQL
Let's say you have a division with 9 teams, and you want them to play 16 games each. Usually you would want 8 games at home and 8 games as the visitor. Is there a known algorithm for randomly assigning the matches?
Note: it may not always work out exactly, so you can have uneven numbers.
Any help is appreciated.
See these permutation algorithms
Does this one work for you: the Fisher–Yates shuffle?
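For reference, a minimal Python sketch of the shuffle (the team labels are placeholders):

    import random

    def fisher_yates_shuffle(items):
        # walk backwards, swapping each slot with a random earlier (or same) slot
        for i in range(len(items) - 1, 0, -1):
            j = random.randint(0, i)
            items[i], items[j] = items[j], items[i]

    teams = ["T1", "T2", "T3", "T4", "T5", "T6", "T7", "T8", "T9"]
    fisher_yates_shuffle(teams)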
There's a nice easy way to generate a round robin here. In the second round, you can repeat the round robin and swap home and away.
If you have an odd number of teams, you just use a dummy team that gives its opponent a bye in a particular round, which results in an extra round. You can distribute that extra round among the other rounds if you'd rather give double-headers than byes.
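A sketch of that approach in Python (the circle method, with a dummy team for the bye and a mirrored second half; team labels are up to you):

    def round_robin(teams):
        teams = list(teams)
        if len(teams) % 2:
            teams.append(None)                   # dummy team: its opponent gets a bye
        n = len(teams)
        rounds = []
        for _ in range(n - 1):
            pairs = [(teams[i], teams[n - 1 - i]) for i in range(n // 2)]
            rounds.append([p for p in pairs if None not in p])
            teams.insert(1, teams.pop())         # rotate everyone except teams[0]
        # second half of the season: same rounds with home and away swapped
        rounds += [[(away, home) for home, away in rnd] for rnd in rounds]
        return rounds

For 9 teams this yields 18 rounds and 16 games per team: 8 at home and 8 away.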
I think you can use a maximum-matching algorithm for bipartite graphs for this (see, e.g., here), which runs in polynomial time.
We represent your problem by giving each team T eight vertices (Th1, ..., Th8) in the "home" subset and eight vertices (Ta1, ..., Ta8) in the "away" subset.
We then look for a maximum matching between the "home" and "away" subsets such that every edge (H, A) in the matching has H in the "home" subset, A in the "away" subset, and H and A belonging to different teams.
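A sketch of that construction in Python, using the networkx library's Hopcroft–Karp implementation (my choice of library; any maximum bipartite matching routine would do, and the team labels are placeholders):

    import networkx as nx
    from networkx.algorithms import bipartite

    teams = ["T%d" % i for i in range(1, 10)]    # 9 teams
    home = [(t, "home", i) for t in teams for i in range(8)]
    away = [(t, "away", i) for t in teams for i in range(8)]

    G = nx.Graph()
    G.add_nodes_from(home, bipartite=0)
    G.add_nodes_from(away, bipartite=1)
    for h in home:                               # each home slot connects to every
        for a in away:                           # away slot of a *different* team
            if h[0] != a[0]:
                G.add_edge(h, a)

    matching = bipartite.hopcroft_karp_matching(G, top_nodes=home)
    games = [(h[0], matching[h][0]) for h in home]   # (home team, away team) pairs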
I have a function that takes in X as an argument and randomly picks an element from a 2D array.
The 2D array has thousands of elements, each of them has a different requirement on X, stored in arr[Y][1].
For example,
arr[0] should only be chosen when X is larger than 4. (arr[0][1] = 4+)
Then arr[33] should only be chosen when X is between 37 and 59. (arr[33][1] = 37!59)
And arr[490] should only be chosen when X is less than 79. (arr[490][1] = 79-)
And there are many more, most with a different X requirement.
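(For concreteness in the answers below, those string conditions could be normalized into numeric intervals with something like this sketch; the notation handling is my reading of the examples:)

    INF = float("inf")

    def parse_condition(s):
        # "4+" -> X above 4; "79-" -> X below 79; "37!59" -> X between 37 and 59
        if s.endswith("+"):
            return (float(s[:-1]), INF)
        if s.endswith("-"):
            return (-INF, float(s[:-1]))
        low, high = s.split("!")
        return (float(low), float(high))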
What is the best way to tackle this problem so that it takes the least space and the least repetition of elements?
The worst way would be to store the possible choices for each X in a 2D array, but that would cause a lot of repetition and cost too much memory.
I have also thought about using three arrays, separating the X+ requirements, the X- requirements, and the X ranges. But that still sounds too basic to me; is there a better way?
One option here would be what's called "accept/reject sampling": you pick a random index i and check if the condition on X is satisfied for that index. If so, you return arr[i]. If not, you pick another index at random and repeat until you find something.
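A minimal sketch of that loop, assuming each condition has been normalized to a (low, high) interval stored in arr[i][1]:

    import random

    def sample(arr, x):
        # keep drawing random indices until the condition on X holds
        while True:
            i = random.randrange(len(arr))
            low, high = arr[i][1]
            if low <= x <= high:
                return arr[i]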
Performance will be good so long as most conditions are satisfied for most values of i. If this isn't the case -- if there are a lot of values of X for which only a tiny number of conditions are satisfied -- then it might make sense to try and precompute something that lets you find (or narrow down) the indices that are allowable for a given X.
How to do this depends on what you allow as a condition on each index. For instance, if every condition is given by an interval like in the examples you give, you could sort the list twice, first by left endpoints and then by right endpoints. Then determining the valid indices for a particular value of X comes down to intersecting the intervals whose left endpoint is less than or equal to X with those whose right endpoint is greater than or equal to X.
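A sketch of that precomputation, again assuming (low, high) interval conditions:

    import bisect

    def build_index(intervals):
        by_left = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
        by_right = sorted(range(len(intervals)), key=lambda i: intervals[i][1])
        lefts = [intervals[i][0] for i in by_left]
        rights = [intervals[i][1] for i in by_right]
        return by_left, lefts, by_right, rights

    def valid_indices(x, by_left, lefts, by_right, rights):
        has_low = set(by_left[:bisect.bisect_right(lefts, x)])    # low <= x
        has_high = set(by_right[bisect.bisect_left(rights, x):])  # high >= x
        return has_low & has_high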
Of course if you allow conditions other than "X is in this interval" then you'd need a different algorithm.
While I believe that re-sampling will be the optimal solution in your case (dozens of resamplings are a very cheap price to pay), here is an algorithm I would never implement in practice (it uses very complicated data structures and is less efficient than resampling), but which has provable bounds. It requires O(n log n) preprocessing time, O(n log n) memory, and O(log n) time per query, where n is the number of elements you can potentially sample.
You store all the endpoints of all the ranges in one array (call it ends). In your case that's [-infty, 4, 37, 59, 79, +infty] (it may require some tuning, like adding +1 to the right ends of ranges; not important now). The idea is that for any X we only have to determine between which two endpoints it lies. E.g., X=62 falls in [59, 79] (I'll call such a pair an interval). Then for each interval you store the set of all possible ranges. For an input X you just find its interval (using binary search) and output a random range corresponding to that interval.
How do you compute the corresponding set of ranges for each interval? We go from left to right through the ends array. Assume we have computed the set for the current interval and move on to the next one; there is some endpoint between these two intervals. If it's the left end of some range, we add the corresponding range to the new set (since we are entering this range). If it's a right end, we remove the range. How do we do this in O(log n) time instead of O(n)? Immutable (persistent) balanced tree sets can do it: essentially, they create new trees instead of modifying the old one.
How do you return a uniformly random range from a set? You augment the tree sets so that each node knows how many nodes its subtree contains. First you sample an integer in the range [0, size(tree)). Then you look at the root node and its children. For example, assume you sampled the integer 15, your left child's subtree has size 10, and the right one's has size 20. Then you go to the right child (since 15 >= 10) and process it with the integer 5 (since 15 - 10 = 5). You will eventually visit a leaf corresponding to a single range; return this range.
Sorry if this is hard to understand. As I said, it's not a trivial approach, and you would only need it for worst-case upper bounds (the other approaches discussed above require linear time in the worst case; resampling may run indefinitely if no element satisfies the restrictions). It also requires some careful handling (e.g., when some ranges have coinciding endpoints).
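For what it's worth, here is a much-simplified, non-persistent sketch of that sweep: it stores each elementary interval's range list explicitly (so O(n^2) memory in the worst case, unlike the persistent-tree version above), and assumes the ranges have already been made half-open, per the "+1 to right ends" tuning mentioned:

    import bisect, random

    def preprocess(ranges):              # ranges: list of half-open (low, high) pairs
        ends = sorted({e for r in ranges for e in r})
        # active[j] holds the ranges covering the elementary interval [ends[j], ends[j+1]);
        # elementary intervals never straddle a range endpoint, so the test is exact
        active = [[r for r in ranges if r[0] <= ends[j] and r[1] >= ends[j + 1]]
                  for j in range(len(ends) - 1)]
        return ends, active

    def query(x, ends, active):
        j = bisect.bisect_right(ends, x) - 1
        if not (0 <= j < len(active)) or not active[j]:
            return None                  # no range contains x
        return random.choice(active[j])  # uniform over the ranges covering x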
I have a random matrix, e.g.:
3 A 6 8
9 2 7* 1
6 6 9 1
2 #3 4 B
I need to find the shortest path from A to B. The * and # marks are jumping points: if you stand on the number marked *, you can jump to the number marked #.
I've thought about this a lot but can't solve it.
How can I achieve this?
In case the values in your matrix are the cost of moving onto a field, the algorithm you need is A*. Wikipedia offers some pseudocode to get started, but if you ask Google you will find loads and loads of example implementations in every language there is.
If the movement cost is always the same, A* works too, but that special case (no heuristic) is just Dijkstra's algorithm.
A* is basically Dijkstra's algorithm with the addition of a heuristic that estimates the remaining cost to the goal.
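A sketch of Dijkstra on the sample grid, under a couple of assumptions on my part: each number is the cost of stepping onto that cell, A and B are treated as cost 0, and the * -> # jump is just an extra available move:

    import heapq

    # A is at (0, 1) and B at (3, 3), both written as 0 here; * is (1, 2), # is (3, 1)
    grid = [[3, 0, 6, 8],
            [9, 2, 7, 1],
            [6, 6, 9, 1],
            [2, 3, 4, 0]]
    start, goal = (0, 1), (3, 3)
    jump_from, jump_to = (1, 2), (3, 1)

    def shortest_path_cost():
        dist = {start: 0}
        heap = [(0, start)]
        while heap:
            d, (r, c) = heapq.heappop(heap)
            if (r, c) == goal:
                return d
            if d > dist[(r, c)]:
                continue                                  # stale queue entry
            moves = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))]
            if (r, c) == jump_from:
                moves.append(jump_to)                     # the * -> # jump
            for nr, nc in moves:
                if 0 <= nr < 4 and 0 <= nc < 4:
                    nd = d + grid[nr][nc]                 # pay the cost of the target cell
                    if nd < dist.get((nr, nc), float("inf")):
                        dist[(nr, nc)] = nd
                        heapq.heappush(heap, (nd, (nr, nc)))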
Preface: I'm currently learning about ANNs because I have ~18.5k images in ~83 classes. They will be used to train an ANN to recognize approximately equal images in real time. I followed the image example in the book, but it doesn't work for me, so I'm going back to the beginning, as I've likely missed something.
I took the Encog XOR example and extended it to teach it how to add numbers less than 100. So far, the results are mixed, even for exact input after training.
Inputs (normalized from 100): 0+0, 1+2, 3+4, 5+6, 7+8, 1+1, 2+2, 7.5+7.5, 7+7, 50+50, 20+20.
Outputs are the numbers added, then normalized to 100.
After training 100,000 times, some sample output from input data:
0+0=1E-18 (great!)
1+2=6.95
3+4=7.99 (so close!)
5+6=9.33
7+8=11.03
1+1=6.70
2+2=7.16
7.5+7.5=10.94
7+7=10.48
50+50=99.99 (woo!)
20+20=41.27 (close enough)
From cherry-picked unseen data:
2+4=7.75
6+8=10.65
4+6=9.02
4+8=9.91
25+75=99.99 (!!)
21+21=87.41 (?)
I've messed with layers, neuron numbers, and [Resilient|Back]Propagation, but I'm not entirely sure if it's getting better or worse. With the above data, the layers are 2, 6, 1.
I have no frame of reference for judging this. Is this normal? Do I not have enough input data? Is my data incomplete, not random enough, or too heavily weighted?
You are not the first to ask this. It seems logical to teach an ANN to add: we teach them to function as logic gates, so why not as addition/multiplication operators? I can't answer this completely, because I have not researched myself how well an ANN performs in this situation.
If you are just teaching addition or multiplication, you might have the best results with a linear output and no hidden layer. For example, to learn to add, the two weights would need to be 1.0 and the bias weight would have to go to zero:
linear((input1 * w1) + (input2 * w2) + bias)
becomes
linear((input1 * 1.0) + (input2 * 1.0) + 0.0)
Training a sigmoid or tanh might be more problematic. The weights/bias and hidden layer would basically have to undo the sigmoid to truly get back to an addition like the one above.
I think part of the problem is that the neural network is recognizing patterns, not really learning math.
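To illustrate, here is a small NumPy sketch of that linear, no-hidden-layer setup (my own toy example, not Encog): plain gradient descent drives both weights toward 1.0 and the bias toward 0.0.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 1.0, size=(200, 2))   # operand pairs, normalized to [0, 1]
    y = X.sum(axis=1)                          # target: their sum
    w, b = np.zeros(2), 0.0
    for _ in range(2000):
        err = X @ w + b - y                    # linear output minus target
        w -= 0.1 * X.T @ err / len(X)          # gradient step on the weights
        b -= 0.1 * err.mean()                  # gradient step on the bias
    print(w, b)                                # approaches [1. 1.] and 0.0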
An ANN can learn an arbitrary function, including all of arithmetic. For example, it has been proved that the addition of N numbers can be computed by a polynomial-size network of depth 2. One way to teach an NN arithmetic is to use a binary representation (i.e., not input normalized from 100, but a set of input neurons each representing one binary digit, with the same representation for the output). This way you will be able to implement addition and other arithmetic. See this paper for further discussion and a description of the ANN topologies used in learning arithmetic.
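A sketch of that encoding (the bit widths here are my assumption):

    def to_bits(n, width=7):
        # little-endian list of binary digits, one input/output neuron per bit
        return [(n >> i) & 1 for i in range(width)]

    # training pair for 37 + 21: 14 input neurons, 8 output neurons
    x = to_bits(37) + to_bits(21)
    y = to_bits(37 + 21, width=8)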
PS. If you want to work on image recognition, it's not a good idea to start practicing with your original dataset. Try some well-studied dataset like MNIST, where it is known what results can be expected from correctly implemented algorithms. After mastering the classical examples, you can move on to working with your own data.
I am in the middle of a demo that teaches the computer how to multiply, and I'd like to share my progress on this: as Jeff suggested, I used the linear approach and in particular ADALINE. At this moment my program "knows" how to multiply by 5. This is the output I am getting:
1 x 5 ~= 5.17716232607829
2 x 5 ~= 10.147218373698
3 x 5 ~= 15.1172744213176
4 x 5 ~= 20.0873304689373
5 x 5 ~= 25.057386516557
6 x 5 ~= 30.0274425641767
7 x 5 ~= 34.9974986117963
8 x 5 ~= 39.967554659416
9 x 5 ~= 44.9376107070357
10 x 5 ~= 49.9076667546553
Let me know if you are interested in this demo. I'd be happy to share.
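In case it helps, a minimal sketch of what such an ADALINE setup can look like (one linear output trained with the delta rule; the learning rate and input range are my guesses, not the demo's actual code):

    import random

    w, b, lr = random.random(), random.random(), 0.001
    for _ in range(100000):
        x = random.uniform(1.0, 10.0)
        err = (w * x + b) - 5.0 * x     # linear output vs. the target 5*x
        w -= lr * err * x               # delta rule (LMS) updates
        b -= lr * err
    print(w, b)                         # w approaches 5.0, b approaches 0.0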
I want to find out the best way to do this in C#:
I have an array of, let's say, 20 numbers, and then one additional variable.
I want to get a sum of some of the numbers that is closest to the given variable.
Let's say I have 1.1, 1.5, 1.7, 1.9, 2.2, 3.1, 3.2, 1.5, 4.5, 4.1, and the additional variable has the value 5.
I want to get the sum of some numbers in the array which will be closest to the given number; once I get that sum, I'll remove those numbers from the list and add them to a new array.
Every comment is welcome.
Thanks
You are describing the optimization version of the Subset Sum problem.
The problem is NP-complete, so no polynomial-time solution for it is known.
However, since the input is fairly small scale, an exponential solution of checking all subsets is feasible: there are only 2^20 ~= 1,000,000 subsets (a bit more, actually, but close enough for estimating the run time).
Pseudocode (valid Python here) should be something like:

    def get_closest_sum(nums, acc, target):
        # each element is either left out (candidate1) or added to the
        # running sum (candidate2); the recursion tries every subset
        if not nums:
            return acc
        candidate1 = get_closest_sum(nums[1:], acc, target)
        candidate2 = get_closest_sum(nums[1:], acc + nums[0], target)
        if abs(target - candidate1) < abs(target - candidate2):
            return candidate1
        else:
            return candidate2
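For the example above, get_closest_sum([1.1, 1.5, 1.7, 1.9, 2.2, 3.1, 3.2, 1.5, 4.5, 4.1], 0, 5) returns 5.0 (e.g., 1.9 + 3.1). If you also need the subset itself, carry the chosen elements along with each candidate instead of just the running sum.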
Recently I have been reading about lotto wheeling and combination generation. I thought I'd give it a whirl and looked around for example code. I managed to cobble together a number wheel based on some VB, but I've introduced an interesting bug while porting it.
http://www.xtremevbtalk.com/showthread.php?t=168296
It allows you to basically ID any combination: you feed it N numbers, K picks, and an index, and it returns that combination in lexicographical order.
It works well at low values, but as the number of balls (N) rises I get extra numbers occurring. For example: 40 balls, 2 picks, combination no. 780 returns 40 and 41! The more picks and numbers I add, the higher this goes; it seems to happen at the end of a run, when the preceding number is due to cycle.
I found that the method for generating the number of possible combinations on the VB forum didn't make a lot of sense, so I found a simpler one:
http://www.dreamincode.net/code/snippet2334.htm
Then I discovered that using doubles causes a lack of resolution. Using long works, but now I can't use higher values of N because the multiplication goes out of range for a long! I then tried ulong and decimal; neither could go much past 26-28 numbers (N).
So I reverted to the version on the VB site.
http://www.xtremevbtalk.com/showthread.php?s=6548354125cb4f312fc555dd0864853e&t=129902
The code is a method to avoid hitting the 96-bit ceiling and claims to be able to calculate as high as N = 98, K = 49.
For some reason I cannot get it to behave; it spits out some very strange numbers.
After giving up for a while, I decided to re-read the suggested wiki article. While most of it was over my head, I was able to discover that certain ways of calculating a binomial coefficient are inaccurate. That wouldn't be appropriate for a system where you are essentially dialing up (wheeling) a game. After a bit of searching and reading I came across this:
http://dmitrybrant.com/2008/04/29/binomial-coefficients-stirling-numbers-csharp
It turns out this is exactly the information I was looking for! The first method is accurate and plenty fast for anything I'm doing. Many thanks to psYchotic for going to the trouble of joining just to post here!
There are exactly 780 combinations of 2 numbers taken from a set of 40. If your combination generator uses a zero-based index, any index >= the maximum number of combinations is invalid.
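To illustrate the off-by-one, here is a sketch of zero-based lexicographic unranking in Python (my own version, using math.comb for exact integer binomials):

    from math import comb

    def combination_at(n, k, index):
        combo, candidate = [], 1
        while k > 0:
            # all combinations starting with `candidate` form one contiguous block
            block = comb(n - candidate, k - 1)
            if index < block:
                combo.append(candidate)   # the combination starts with this number
                k -= 1
            else:
                index -= block            # skip past that whole block
            candidate += 1
        return combo

    # valid zero-based indexes run 0..779; index 780 is out of range,
    # which is where the stray 40-and-41 result comes from
    print(combination_at(40, 2, 779))     # [39, 40]: the 780th and last combination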
You can use the binomial coefficient to determine the number of combinations that can be formed.
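A sketch of the multiplicative formula for that count, which stays in exact integers the whole way (Python integers are arbitrary precision, so even N = 98, K = 49 is exact):

    def n_choose_k(n, k):
        k = min(k, n - k)
        result = 1
        for i in range(1, k + 1):
            result = result * (n - k + i) // i   # each division is exact
        return result

    print(n_choose_k(40, 2))    # 780
    print(n_choose_k(98, 49))   # exact, despite being astronomically large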