Nearest neighbour search when edge costs are asymmetric, some doubts - c#

To clarify my post, I have edited it based on comments.
I was thinking about how to implement a nearest neighbour search efficiently when edge costs are asymmetric. I'm thinking of a number of cities somewhere in the range of 100 to 12000.
In more detail, as an example, there's a cost COST1 of travelling from city A to city B, e.g. by foot, and a cost COST1/10 of travelling from B to A, e.g. by train. In other words, the problem I see here is this: if I have an asymmetric matrix C representing the costs of travelling between cities and I select one point A, how could I efficiently discover, say, the three nearest neighbouring cities B1, B2 and B3 in terms of travelling cost? I would like to run the queries repeatedly. Preprocessing time, if not huge, is all right.
Pondering efficiency led me to thinking of something like a k-d tree, which facilitates finding the k nearest neighbours in O(lg(n)) time when the costs between cities are symmetric. That is the snag with a plain k-d tree in my case, as the travelling costs aren't in general the same in both directions between any two cities. The gist of the matter, then, seems to be: how could I do something like a k-nearest-neighbours search in the asymmetric case?
To remedy the aforementioned symmetry assumption, I thought that instead of just one tree I could have two trees, constructed so that the costs are calculated in both directions, and then run a search through both trees. Then I began to wonder: does anyone know of something designed specifically for asymmetric costs, and/or would the two-tree idea be totally astray?
It may also be that a k-d tree in two dimensions isn't necessarily the best fit here, so pointers to other data structures and algorithms are welcome too, especially from anyone with practical experience of my problem size. Wikipedia lists quite a bunch of approaches, and maybe even an approximate solution would be good enough for what I'm trying to do (this is for a smallish game for learning purposes).

For each pair of cities you need to calculate the cost for every available travel type (foot, train, ...), convert them to one common unit, compare them and take the minimum. That minimum cost is what you can then use in your search algorithms.
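A minimal sketch of this idea in C#, assuming the per-direction costs have already been reduced to a single matrix cost[from, to] (e.g. the cheapest travel mode in each direction). It precomputes, for every city, its neighbours sorted by outgoing cost, so a k-nearest query is just reading the first k entries; the class and method names are illustrative, not from the question.

```csharp
using System.Linq;

class AsymmetricNearest
{
    // For every city, all other cities ordered by *outgoing* travel cost, built once up front.
    private readonly int[][] sortedByOutgoingCost;

    // cost[from, to] = cheapest cost of travelling from city 'from' to city 'to';
    // in general cost[a, b] != cost[b, a], which is exactly the asymmetric case.
    public AsymmetricNearest(double[,] cost)
    {
        int n = cost.GetLength(0);
        sortedByOutgoingCost = new int[n][];
        for (int from = 0; from < n; from++)
        {
            int f = from;
            sortedByOutgoingCost[f] = Enumerable.Range(0, n)
                .Where(to => to != f)
                .OrderBy(to => cost[f, to])
                .ToArray();
        }
    }

    // The k cities that are cheapest to reach *from* the given city:
    // O(k) per query after O(n^2 log n) preprocessing.
    public int[] NearestFrom(int from, int k) => sortedByOutgoingCost[from].Take(k).ToArray();
}
```

Note that the full matrix and the sorted lists are comfortable at the lower end of the stated range, but at 12000 cities the matrix alone is roughly 144 million entries, so memory becomes the limiting factor before query speed does.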

Related

Best match between two sets of points

I've got two lists of points, let's call them L1( P1(x1, y1), ... Pn(xn, yn)) and L2(P'1(x'1, y'1), ... P'n(x'n, y'n)).
My task is to find the best match between their points for minimizing the sum of their distances.
Any clue on some algorithm? The two lists contain approx. 200-300 points.
Thanks and bests.
If the use case of your problem involves matching every point present in list L1 with a point in list L2, then the Hungarian Algorithm would serve as a perfect fit.
The weight in your Hungarian matrix for a given row and column would be the distance between the corresponding pair of points. The overall runtime of the optimized Hungarian algorithm is O(n^3), which will comfortably fit your given constraint of n = 300.
A pretty nice tutorial covering the ideas behind the Hungarian algorithm and its implementation is https://www.topcoder.com/community/competitive-programming/tutorials/assignment-problem-and-hungarian-algorithm/
If not for the Hungarian algorithm, you can also morph the given problem into a max-flow-min-cost problem - the details of which I'll omit for now but can discuss if required.
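As a rough illustration in C#, here is how the assignment cost matrix could be set up. The HungarianSolver call at the end is hypothetical, standing in for an actual O(n^3) Hungarian implementation such as the one in the tutorial above; it is not a real library API.

```csharp
using System;

static class PointMatching
{
    // costs[i, j] = Euclidean distance between the i-th point of L1 and the j-th point of L2.
    public static double[,] BuildCostMatrix((double X, double Y)[] l1, (double X, double Y)[] l2)
    {
        var costs = new double[l1.Length, l2.Length];
        for (int i = 0; i < l1.Length; i++)
            for (int j = 0; j < l2.Length; j++)
                costs[i, j] = Math.Sqrt(
                    (l1[i].X - l2[j].X) * (l1[i].X - l2[j].X) +
                    (l1[i].Y - l2[j].Y) * (l1[i].Y - l2[j].Y));
        return costs;
    }

    // int[] match = HungarianSolver.Solve(costs);
    // 'HungarianSolver' is hypothetical: match[i] would be the index of the L2 point assigned
    // to L1[i] so that the total distance is minimised.
}
```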

A* search for Rush Hour game?

For an assignment for school I have to make a solver for a Rush Hour game. If you aren't familiar with Rush Hour, check this link: http://www.puzzles.com/products/rushhour.htm
For this solver I have to use the A* search algorithm. I looked around on the internet a bit, and I think I understand how the algorithm works, but I don't really have an idea of how to implement it in the solver, nor how I should build up the grid for the cars. Can someone please give me some tips/help with this?
Not a complete solution..
To represent the grid of cars, I'd just use a rectangular array of cells where each cell is marked with an integer -- 0 indicates "empty", and each car has a particular number, so the different cars in the grid will manifest themselves as consecutive cells with the same number.
At this point, you should be able to write a function to return all the possible "moves" from a given grid, where a "move" is a transition from one grid state to another grid state -- you probably don't need to encode a better representation of a move than that.
To implement A*, you'll need a naive heuristic for figuring out how good a move looks, so you know which moves to try first. I would suggest initially that any move which either moves the target car closer to the goal or makes space nearer the front of the target car might be a better candidate move. Like Will A said in the comments, unless you're solving a 1000x1000 Rush Hour board, this probably isn't a big deal.
That's all the tricky parts I can think of.
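A small C# sketch of the representation described above, assuming each cell holds 0 for "empty" or a car id, and that a car slides one cell per move. GetMoves returns the successor grids; all names are illustrative, not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Linq;

static class RushHour
{
    // grid[row, col] == 0 means empty; any other value is a car id occupying consecutive cells.
    public static IEnumerable<int[,]> GetMoves(int[,] grid)
    {
        int rows = grid.GetLength(0), cols = grid.GetLength(1);
        var carIds = grid.Cast<int>().Where(id => id != 0).Distinct();

        foreach (int id in carIds)
        {
            // Collect the cells occupied by this car (row-major order).
            var cells = new List<(int R, int C)>();
            for (int r = 0; r < rows; r++)
                for (int c = 0; c < cols; c++)
                    if (grid[r, c] == id) cells.Add((r, c));

            bool horizontal = cells.All(p => p.R == cells[0].R);
            var deltas = horizontal ? new[] { (0, -1), (0, 1) } : new[] { (-1, 0), (1, 0) };

            foreach (var (dr, dc) in deltas)
            {
                // The car can slide one step if the cell past its leading edge is free.
                var lead = dr + dc > 0 ? cells.Last() : cells.First();
                int nr = lead.R + dr, nc = lead.C + dc;
                if (nr < 0 || nr >= rows || nc < 0 || nc >= cols || grid[nr, nc] != 0) continue;

                var next = (int[,])grid.Clone();
                foreach (var (r, c) in cells) next[r, c] = 0;
                foreach (var (r, c) in cells) next[r + dr, c + dc] = id;
                yield return next;
            }
        }
    }
}
```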
As mquander or Will have already pointed out, the A* algorithm might be a bit of an overkill for your problem.
I'll just give you some hints about which other algorithms you could use to solve the problem.
I don't want to explain how those algorithms work, since you can find many good descriptions on the internet. However, if you have a question, don't hesitate to ask.
You can use some of the algorithms that belong to the family of "uninformed search". There you have, for example, breadth-first search, depth-first search, uniform cost search, depth-limited search and iterative deepening search. If you use breadth-first search or uniform cost search you might run into memory problems, since those algorithms have exponential space complexity (you have to keep the whole frontier in memory). A depth-first search (space complexity O(b*m)) is more memory friendly, since the parts of the tree you visit first can be discarded once you know they do not contain the solution. Depth-limited search and iterative deepening search are almost the same, except that in iterative deepening search you increase the depth limit of the tree iteratively.
If you compare time complexities (b = branching factor of the tree, m = maximum depth of the tree, l = depth limit, d = depth of the shallowest solution):
breadth-first: b^(d+1)
uniform cost: b^(1 + ⌊C*/ε⌋), where C* is the cost of the optimal solution and ε the smallest step cost
depth-first: b^m
depth-limited: b^l (it only finds a solution if l >= d)
iterative deepening: b^d
So as you can see, iterative deepening and breadth-first search perform quite well. The problem with depth-limited search is that if your solution is located deeper than your depth limit, you will not find it.
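For illustration, here is a minimal generic iterative deepening search in C#, assuming the caller supplies a successor function and a goal test (e.g. the Rush Hour move generator and "target car at the exit"); the names and the state type are placeholders, not from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class IterativeDeepening
{
    // Returns the path from start to a goal state, or null if none exists within maxDepth.
    public static List<TState> Search<TState>(
        TState start,
        Func<TState, IEnumerable<TState>> successors,
        Func<TState, bool> isGoal,
        int maxDepth)
    {
        for (int limit = 0; limit <= maxDepth; limit++)
        {
            var path = DepthLimited(start, successors, isGoal, limit, new List<TState> { start });
            if (path != null) return path;
        }
        return null;
    }

    // Plain depth-limited DFS: stop descending when the remaining depth budget hits zero.
    private static List<TState> DepthLimited<TState>(
        TState state,
        Func<TState, IEnumerable<TState>> successors,
        Func<TState, bool> isGoal,
        int limit,
        List<TState> path)
    {
        if (isGoal(state)) return path;
        if (limit == 0) return null;
        foreach (var next in successors(state))
        {
            var result = DepthLimited(next, successors, isGoal, limit - 1, path.Append(next).ToList());
            if (result != null) return result;
        }
        return null;
    }
}
```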
Then you have the so-called "informed search" algorithms such as best-first search, greedy search, A*, hill climbing or simulated annealing. In short, in best-first search you use an evaluation function for each node as an estimate of its "desirability". The goal of greedy search is to expand the node that appears to bring you closest to the goal. Hill climbing and simulated annealing are very similar. Stuart Russell describes hill climbing as follows (which I like a lot): "the hill-climbing algorithm is like climbing Everest in thick fog with amnesia". It is simply a loop that continually moves in the direction of increasing value, so you just "walk" in the direction that increases your evaluation function.
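A tiny sketch of that hill-climbing loop in C#, with the neighbour generator and evaluation function supplied by the caller; both are placeholders, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;

static class HillClimbing
{
    // Steepest-ascent hill climbing: keep moving to the best-scoring neighbour
    // until no neighbour improves on the current state (a local maximum).
    public static TState Climb<TState>(
        TState start,
        Func<TState, IEnumerable<TState>> neighbours,
        Func<TState, double> evaluate)
    {
        var current = start;
        double currentScore = evaluate(current);
        while (true)
        {
            var best = current;
            double bestScore = currentScore;
            foreach (var candidate in neighbours(current))
            {
                double score = evaluate(candidate);
                if (score > bestScore) { bestScore = score; best = candidate; }
            }
            if (bestScore <= currentScore) return current;   // nothing improves: stop here
            current = best;
            currentScore = bestScore;
        }
    }
}
```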
I would use one of the uninformed search algorithms, since they are very easy to implement (you just need to program the tree and traverse it correctly). Informed search usually performs better if you have a good evaluation function...
Hope that helps you...

How to efficiently spread objects on a 2D surface in a "natural" way?

I would like to efficiently generate positions for objects on a given surface. As you probably guessed, this is for a game. The surface is actually a 3D terrain, but the third dimension does not matter, as it is determined by the terrain height.
The problem is that I would like to do this in the most efficient and easy way possible, but still get good results. What I mean by "natural" is something like what is mentioned in this article about Perlin noise (trees forming forests, large to small groups spread out over the land). The approach is nice, but too complicated. I need to do this quite often, and preferably without any more textures involved, even at the cost of worse performance (so the results won't be as pretty, but still good enough to give a nice natural terrain with vegetation).
The number of objects placed varies, but is generally around 50. A nice enhancement would be to somehow restrict the placement of objects in areas with very high altitude (mountains), but I guess that could be done by placing a few more objects and deleting those placed above a given altitude.
This might not be the answer you are looking for, but I believe that Perlin Noise is the solution to your problem.
Perlin Noise itself involves no textures; I do believe that you have a misunderstanding about what it is. It's basically, for your purposes, a 2D index of, for each point, a value between 0 and 1. You don't need to generate any textures. See this description of it for more information and an elegant explanation. The basics of Perlin Noise involves making a few random noise maps, starting with one with very few points, and each new one having twice as many points of randomness (and lower amplitude), and adding them together.
Especially, if your map is discretely tiled, you don't even have to generate the noise at a high resolution :)
How "often" are you planning to do this? If you're going to be doing it 10+ times every single frame, then Perlin Noise might not be your answer. However, if you're doing it once every few seconds (or less), then I don't think that you should have any worries about speed impact -- at least, for 2D Perlin Noise.
Establishing that, you could look at this question and my personal answer to it, which is trying to do something very similar to what you are trying to do. The basic steps involve this:
Generate perlin noise; higher turbulence = less clumping and more isolated features.
Set a "threshold" (ie, 0.5) -- anything above this threshold is considered "on" and anything above it is considered "off". Higher threshold = more frequent, lower threshold = less frequent.
Populate "on" tiles with whatever you are making.
Here are some samples of Perlin Noise used to generate a 50x50 tile-based map. Note that the only difference between the two is the threshold: bigger clumps mean a lower threshold, smaller clumps mean a higher one.
A forest, with blue trees and brown undergrowth
A marsh, with deep areas surrounded by shallower areas
Note you'll have to tweak the constants a bit, but you could do something like this
First, pick a random point. (say 24,50).
Next, identify points of interest for this object. If it's a rock, your points might be the two mountains at 15,13 or 50,42. If it was a forest, it would maybe do some metrics to find the "center" of a couple local forests.
Next, calculate the distance vectors between the point and the points of interest, and scale them by some constant.
Now, add all those vectors to the point.
Next determine if the object is in a legal position. If it is, move to the next object. If it's not, repeat the process.
Adapt as necessary. :-)
One thing: if you want to reject things like trees on mountains, you don't add extra tries; you keep trying to place an object until you either find a suitable location or you've tried a bunch of times and need to bail out because it doesn't look placeable.
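A rough sketch of that retry loop in C#. The isLegal delegate stands in for whatever placement rule you use (e.g. rejecting positions above an altitude limit); it is a placeholder, not an existing API.

```csharp
using System;

static class RejectionPlacement
{
    // Keep proposing random positions until one passes the legality test, or give up.
    public static (int X, int Y)? TryPlace(
        Random rng, int width, int height,
        Func<int, int, bool> isLegal,      // e.g. (x, y) => HeightAt(x, y) < mountainLimit
        int maxAttempts = 100)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            int x = rng.Next(width), y = rng.Next(height);
            if (isLegal(x, y)) return (x, y);
        }
        return null;   // bail out: this object doesn't look placeable
    }
}
```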

All valid combinations of points, in the most (speed) effective way

I know there are quite a few questions out there about generating combinations of elements, but I think this one has a certain twist that makes it worth a new question:
For a pet project of mine I have to pre-compute a lot of state to improve the runtime behaviour of the application later. One of the steps I struggle with is this:
Given N tuples of two integers (let's call them points from here on, although they aren't points in my use case; they are roughly X/Y related, though), I need to compute all valid combinations for a given rule.
The rule might be something like
"Every point included excludes every other point with the same X coordinate"
"Every point included excludes every other point with an odd X coordinate"
I hope and expect that this fact leads to an improvement in the selection process, but my math skills are just being resurrected as I type and I'm unable to come up with an elegant algorithm.
The set of points (N) starts small, but soon outgrows 64 (which rules out the "use a long as a bitmask" solutions).
I'm doing this in C#, but solutions in any language should be fine if it explains the underlying idea
Thanks.
Update in response to Vlad's answer:
Maybe my idea to generalize the question was a bad one. My rules above were invented on the fly and are just placeholders. One realistic rule would look like this:
"Every point included excludes every other point in the triangle above the chosen point"
By that rule and by choosing (2,1) I'd exclude
(2,2) - directly above
(1,3) (2,3) (3,3) - next line
and so on
So the rules are fixed, not general. They are unfortunately more complex than the X/Y samples I initially gave.
How about "the x coordinate of every point included is the exact sum of some subset of the y coordinates of the other included points". If you can come up with a fast algorithm for that simply-stated constraint problem then you will become very famous indeed.
My point being that the problem as stated is so vague as to admit NP-complete or NP-hard problems. Constraint optimization problems are incredibly hard; if you cannot put extremely tight bounds on the problem then it very rapidly becomes not analyzable by machines in polynomial time.
For some special rule types your task seems to be simple. For example, for your example rule #1 you need to choose a subset of all possible values of X, and then for each value from the subset assign an arbitrary Y.
For generic rules I doubt that it's possible to build an efficient algorithm without any AI.
My understanding of the problem is: given a method bool property( Point x ) const, find all points in the set for which property() is true. Is that reasonable?
The brute-force approach is to run all the points through property() and store the ones which return true. The time complexity of this would be O( N ), where (a) N is the total number of points and (b) the property() method is O( 1 ). I guess you are looking for improvements on O( N ). Is that right?
For certain kinds of properties it is possible to improve on O( N ), provided a suitable data structure is used to store the points and suitable pre-computation (e.g. sorting) is done. However, this may not be true for an arbitrary property.
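For completeness, the brute-force O( N ) pass described above is essentially a one-liner in C#; the Point type and the property delegate below are placeholders matching the notation in this answer, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal point type matching the question's X/Y integer tuples.
record Point(int X, int Y);

static class PointFilter
{
    // O(N) brute force: test every point against the predicate and keep the ones that pass.
    public static List<Point> Matching(IEnumerable<Point> points, Func<Point, bool> property)
        => points.Where(property).ToList();
}
```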

Possible Combination of Knapsack problem and?

Alright quick overview
I have looked into the knapsack problem
http://en.wikipedia.org/wiki/Knapsack_problem
and I know it is what I need for my project, but the complicated part of my project is that I need multiple sacks inside a main sack.
The large knapsack that holds all the "bags" can only carry x "bags" (let's say 9 for the sake of example). Each bag has different values:
Weight
Cost
Size
Capacity
and so on; all of those values are integers. Let's assume they range from 0 to 100.
Each inner bag will also be assigned a type, and there can only be one bag of a given type within the outer bag, although the program will be given multiple bags of the same type as input.
I need to assign a maximum weight that the main bag can hold, and all other properties of the smaller bags need to be grouped by weighted values.
Example
Outer Bag:
Can hold 9 smaller bags
Weight no more than 98 [Give or take 5 either side]
Must hold one of each type, and can only hold one of each type at a time.
Inner Bags:
Cost, Weighted at 100%
Size, Weighted at 67%
Capacity, Weighted at 44%
The program will be given multiple bags as input, and must then work out combinations of smaller bags to go into the larger bag. There will be multiple solutions depending on the input, and the program should output the best solutions for me.
I am wondering what you guys think the best way for me to approach this would be.
I will be programming it in either Java or C#. I would love to program it in PHP, but I'm afraid the algorithm would be very inefficient for web servers.
Thanks for any help you can give
-Zack
Okay, well, knapsack is NP-hard, so I'm pretty certain this will be NP-hard as well (if it weren't, you could solve knapsack by doing this with only one outer bag). So for an exactly optimal solution, you're probably going to be able to do no better than searching all combinations. The outline of the program you want will be something like:
for each possible combination
do
    if current combination is better than best previous
        save current combination as best so far
    fi
od
and the run time will be exponential. It sounds, though, like you might be able to get a near-optimal solution with dynamic programming.
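A sketch of that exhaustive loop in C#, enumerating one bag per type (at most 9 types here) and keeping the best combination within the weight limit. The Score function is a placeholder using the weighted Cost/Size/Capacity percentages from the question; the "best" criterion is assumed, not taken from the answer.

```csharp
using System.Collections.Generic;
using System.Linq;

record Bag(string Type, int Weight, int Cost, int Size, int Capacity);

static class BagSearch
{
    // Placeholder scoring: weighted sum of the bag properties (weights taken from the question).
    static double Score(IEnumerable<Bag> bags) =>
        bags.Sum(b => 1.00 * b.Cost + 0.67 * b.Size + 0.44 * b.Capacity);

    // Exhaustively try one bag per type, keeping the best combination within the weight limit.
    public static List<Bag> Best(IEnumerable<Bag> input, int maxWeight)
    {
        var groups = input.GroupBy(b => b.Type).Select(g => g.ToList()).ToList();
        List<Bag> best = null;
        double bestScore = double.NegativeInfinity;

        void Recurse(int groupIndex, List<Bag> chosen, int weight)
        {
            if (weight > maxWeight) return;                    // prune overweight branches early
            if (groupIndex == groups.Count)
            {
                double s = Score(chosen);
                if (s > bestScore) { bestScore = s; best = new List<Bag>(chosen); }
                return;
            }
            foreach (var bag in groups[groupIndex])
            {
                chosen.Add(bag);
                Recurse(groupIndex + 1, chosen, weight + bag.Weight);
                chosen.RemoveAt(chosen.Count - 1);
            }
        }

        Recurse(0, new List<Bag>(), 0);
        return best;
    }
}
```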
Consider using Prolog for your logic programming. There are multiple implementations of it, including P# on Mono (.NET). There's a bit of a learning curve, but once you get used to it, it's pretty much in a league of its own for this kind of problem solving.
Hope this helps. Cheers!
link to P#
