I have a CAD application, that allows user to draw lines and polygons and all that.
One thorny problem that I face is user drawing can be highly imprecise, for example, a user might want to draw two rectangles that are connected to each other. Hence there should be one line shared by two rectangles. However, it's easy for user to, instead of draw a line, draw two lines that are very close to each other, so close to each other that when look from the screen, you would be mistaken that they are the same line, except that they aren't when you zoom in a little bit.
My application would require user to properly draw the lines ( or my preprocessing must be able to do auto correction), or else my internal algorithm (let's call it The Algorithm) would not be able to process the inputs correctly.
What is the best strategy to combat this kind of problem? I am thinking about rounding the point coordinates to a certain degree of precision, but although I can't exactly pinpoint the problem of this approach, but I feel that this is not the correct way of doing things, that this will introduce a new set of problem.
Edit: For the sake of argument the snapping isn't an available option. For the matter, all sorts of "input-side" guidance are not available. The correction must be done via preprocessing on my code, when the drawing is finished, but just before I submit it to my algorithm.
Crazy restriction, you say. But a user can construct their input either in my application, or they can construct their input in other CAD software and then submit to my engine to do the calculation. I can't control how they input in other CAD software.
Edit 2:I can let user to specify the "cluster radius" to occur, but the important point is, I would need to make sure that my preprocessing algorithm is consistent and won't really introduce a new set of problem.
Any idea?
One problem I see is that your clustering/snapping algorithm would have to decide on its own which point to move onto which other point.
During live input snapping is simple: the first point stays put, the second point is snapped onto the first. If in offline mode you get a bunch of points that you know should be snapped together, you have no idea where the resulting point should lie. Calculate the average, possibly resulting in a completely new point? Choose the most central point out of all the candidates? Pick one at random? Try to align your point with some other points on the x/y/z-axis?
If your program allows any user interaction at all, you could detect point clusters that might be candidates for merging, and present the user with different merge target points to choose from.
Otherwise, you could make this kind of behaviour configurable: take a merge radius ("if two or more poins are within n units of one another...") and a merging algorithm ("... merge them into the most central of the points given") as parameters and read them from a config file.
Snapping points. User should be able to snap to end points (and many more) then, when you detect a snap, just change the point user clicked to snap point point. Check AutoCAD, functions line End, Middle and so on.
EDIT: If you want offline snapping then you just need to check every pair of points if they are near each other. The problem is that this in NP-problem so it will take a lot of time as you can't really get under O(n^2) time complexity. This algorithm you need should be under "clustering".
EDIT2: I think you shouldn't consider that input data is bad. But if you reallllllly want to do this, simples way is to take each point, check if there are other points in users defined radius, if yes find whole group that should merge into one point, find avg of coordinates of points and point all of them to that specific point. But remember - most designers KNOW what are snap points for and if they don't use them they have valid idea for that.
Your basic problem seems to me (I hope I understood correctly) to determine if two lines are the "same" line.
Out of my own experience your feeling is correct, rounding the coordinates in the input might prove not to be a good idea.
Maybe you should leave the coordinates in the input as they are but implement your function let's name it IsSameLine That you use in "The Algorithm" (who among others determines if two rectangles are connected if i understood your description correctly).
IsSameLine could transform the endpoints of the input lines from source coordinates to screen coordinates considering a certain (possibly configurable) screen resolution and check if they are the same in screen coordinates.
I.e. let's say you have an input file with the following extent (lowerleft) (upperRight) ((10,10), (24,53)). The question would be how far apart would be points (11,15) and (11.1, 15.1) if drawn at "zoom to extents" level on a 1600x1200 pixels screen. So you can determine a transform from source coordinates to "screen coordinates". You use then this transformation in IsSameLine as described above.
I'm not sure however this would be actually a good solution for you.
Another (maybe better?) possibility is to implement IsSameLine to return true if the points of the two lines are at maximum epsilon distance apart. The epsilon could have a default value computed based on the extent of the input vector data and probably it would be a good idea to give the user the possibility to give another value for it.
Related
Can anybody let me know how I can compare two scans with different clusters in them. I have attached a picture showing both of the scans. I want to keep the clusters which are present in both the scans.
Assuming the two sets contain the same points ( possibly with some noise added), but one of the sets contains extra points.
It should then be fairly simple to check the distance from each point in set A to each point in set B. If there is any point within some radius can mark the point as part of both set A and B. This kind of search will probably benefit greatly from some kind of search structure, like a Kd-Tree, or some other optimization approach.
Without more details it is difficult to provide a more detailed suggestion.
I need to find optimal path between a number of points on a 2d map.
The 2d map is of a building and will simply have where you can not go (through walls) and all the points on the map. So it's not really a map, rather lines you cannot go through with points to pass through.
I have a number of points, say between 20 and 500
I start with one that I select and then need the route calculated for most optimal path.
I would love hints for where to look for this travelling salesman problem with obstacles. Or even better, done library for doing it.
Bonuses
Things like doors can be weighted as they are less fun to pass through back and forth.
Possibility of prioritizing/Weighting the ability to end close to where you started.
Selecting areas as passable but annoying (weighting down)
.Net/C# code that I can use, I want to use this both on .NET MVC project and Xamarin mobile project so .net code would be great (if code exists)
Update example
In my example here we have an office. Now I have not thought every detail out so this is merely an example.
All the purple dots need to be checked
Yellow area could mean annoying to pass through but doable
Red could mean not active but can be passed if no other option exists.
Blue (walls) are impenetrable and can not be passed.
Green is doors, weighted down possibly as it's annoying to go trough closed doors (usually this would probably make sense anyway as the dots in a room would be easiest to check together.
The user would go to one dot, check it, then the software should tell him which one to do next until he is done.
Bonus could be given for ending close to start place. So for instance in this example, if the red area was normal and contained dots it would have been easy to make it a loop. (So the user comes back close to where he started)
Finally I suppose it would also be smart to differentiate outdoors areas as you would need to get dressed for outdoors, so you only want to go out once.
Also it could be smart to be able to prioritize ending on a point close to stairwell to next floor if they intend to check multiple floors at once.
Of course would have more more complex and larger plans the this exmaple.
Again sorry for just brainstorming out ideas but I have never done this kind of work and is happy for any pointers :-)
Let N be the set of nodes to visit (purple points). For each i and j in N, let c(i,j) be the distance (or travel time) to get from i to j. These can be pre-computed based on actual distances plus walls, doors, other barriers, etc.
Now, you could then add a penalty to c(i,j) if the path from i to j goes through a door, "annoying" area, etc. But a more flexible way might be as follows:
Let k = 1,...,K be the various types of undesirable route attributes (doors, annoying areas, etc.). Let a_k(i,j) be the amount of each of these attributes on the path from i to j. (For example, suppose k=1 represents door, k=2 represents yellow areas, k=3 represents outside. Then from an i in the break area to j in the bathroom might have a_1(i,j) = 1, and from an i to a j both in the yellow areas would have a_2(i,j) = 0.5 or 2.0 or however annoying that area is, etc.)
Then, let p_k be a penalty for each unit of undesirable attribute k -- maybe p_1 = 0.1 if you don't mind going through doors too much but p_2 = 3.0 if you really don't like yellow areas.
Then, let c'(i,j) = c(i,j) + sum{k=1,...,K} p_k * a_k(i,j). In other words, replace the actual distance with the distance plus penalties for all the annoyances. The user can set the p_k values before the optimization in order to express his/her preferences among these. The final penalties p_k * a_k(i,j) should be commensurate with the distance units used for c(i,j), though -- you don't want distances of 100m but penalties of 1,000,000.
Now solve a TSP with distances given by c'(i,j).
The TSP requires you to start and end at the same node, so that preference is really a constraint. If you're going to solve for multiple floors simultaneously, then the stairway times would be in the c(i,j) so there's no need to explicitly encourage routes that end near a stairway -- the solution would tend to do that anyway since stairs are slow. If you're going to solve each floor independently, then just set the start node for each floor equal to the stairway.
I wouldn't do anything about the red (allowable but unused) areas -- that would already be baked into the c(i,j) calculations.
Hope this helps.
I wonder if there's any described algorithm that can convert isochrones into approximate area to show a range of some feature (in my problem this feature is a road network).
Example. I have something like on the image beneath:
It's a simple network (where I can arrive from the start point in X minutes or going Y kilometers). I have information of all the nodes and links. Now I need to create an isochrone map that show an approximate range where I can arrive.
Problems:
Convex hull - sucks because of too general approximation,
I can create buffors on roads - so I will get some polygon that shows range, but I will also have the holes by roads that connect into circles.
What I need to obtain is something like this:
I've found some potentially useful information HERE, but there are only some ideas how it could be done. If anyone has any concept, please, help me to solve my problem.
Interesting problem, to get better answers you might want to define exactly what will this area that shows the range (isochrone map) be used for? For example is it illustrative? If you define what kind of approximation you want it could help you solve the problem.
Now here are some ideas.
1) Find all the cycles in the graph (see link), then eliminate edges that are shared between two cycles. Finally take the convex hull of the remaining cycles, this together with all the roads, so that the outliers that do not form cycles are included, will give a good approximation for an isochrome map.
2) A simpler solution is to define a thickness around each point of every road, this thickness should be inversely proportional to how long it takes to arrive at that point from the starting point. I.e. the longer it takes to arrive at the point the less thick. You can then scale the thickness of all points until all wholes are filled, and then you will have an approximate isochrome map. One possible way of implementing this is to run an algorithm that takes all possible routes simultaneously from the starting point, branching off at every new intersection, while tracking how long it took to arrive at each point. During its execution, at every instant of time all previously discovered route should be thickened. At the end you can scale this thickness so as to fill all wholes.
Hopefully this will be of some help. Good luck.
I have solved the problem (it's not so fast and robust, but has to be enough for now).
I generated my possible routes using A* (A-Star) algorithm.
I used #Artur Gower's idea from point one to eliminate cycles and simplify my geometry.
Later I decided to generate 2 types of gemetries (1st - like on the image, 2nd - simple buffers):
1st one:
3. Then I have removed the rest of unnecessary points using Douglas-Peucker algorithm (very fast!).
4. In the end I used Concave Hull algorithm (aka Alpha-Shapes or Non-Convex Hull).
2nd one:
3. Apply a buffer to the existing geometry and take the exterior ring (JTS library made that really easier:)).
Let's assume, that we're scanning test-like documents with checkboxes / empty circles (for signing / striking / ticking). What would be the proper way, to check, if already cropped checkbox/circle is checked/signed/striked/ticked?
In case we'll force test users to fully mark the area, just knowing the position of checkbox/circle and counting amount on non-white pixels would be enough (would it?), but what way should we approach to test, that the checkbox / circle is ticked or checked (X)?
This is going to be part of the project in C#, so code or even ready libraries for .net / c/c++ would be appreciated.
sorry for the shortness of this answer but you could have an ocr system run on the area within the checkbox.
If it returns nothing then you know it's not checked.
If it returns something then compare it against a large white list of possibilities and then flag uncertainties.
you could use the error handling that #dan proposed as well
What makes this more robust than just taking an average is that you can determine if it's not checked with a high certainty. because we're looking for a mark that is in some minimal way recognizable we know if there isn't anything there then it's definitely not checked. all you have to do then is find a good white list of characters and marks that could be used as checks (and think outside of the box, the ocr system may return an 'a' for a squiggle, but that is a positive response). And to clarify, the problem with just taking an average is that any increase in darkness in the check box yields a positive result, which isn't always the case. if someone puts a mark and then erases you're still going to have a increase in darkness within the box.
Lastly i'll add that there are a lot of OCR systems out there now that are pretty advanced. i doubt you'd have much trouble finding one where you could provide additional training data sets that would match you're cases better than random characters.
The algorithm would go something like this:
Find each checkbox (I understand you already have that)
Calculate the average of the color of all pixels
If it is above a certain threshold, it is marked, if below, it is unmarked
However, you should add some checks:
Are multiple above the threshold? -> Let a human check it, the student could have first ticked something and then changed it to another field.
Are none above the threshold? -> Let a human verify that really none have been checked.
I guess the important part about this answer is:
If the algorithm is unsure, flag it for manual processing.
Most of the high performance products that offer checkbox recognition use some kind of bell distribution curve to work out the likelihood of a box actually being checked: too much 'data' and there's a good chance the user changed their mind and has scribbled-out this box; too few and it could be the 'tail' left by a user ticking a box below and not lifting the pen before crossing the next box region.
I'd suggest you apply additional logic to deal with more than one box being allowed (e.g. do you own a car / do you also own a bike) as well as the situation where only one box can be correct (e.g. are you male or female). This should help your app. filter out the more obvious errors.
I've been trying to make a little simple game just to test my logics, and it's a simple labyrinth, it's ugly, and so far sucky.
The engine works pretty well, given that the labyrinth already exists (a matrix), it could be even enjoyable, but I have no intention on drawing a bunch of maps, which might be setting values on 400 (20x20) fields of a matrix. not funny.
Then I've created a function to randomize it, setting floor/wall for each field, and (I expected that) not every map is winnable. then I've made another function which checks if the maps is playable (receives two points, and checks if there's a valid path between them, then I just pass the start and the end. Pretty nifty) and it worked.
If you haven't noticed, this is a VERY stupid way of creating my random labyrinth for the following reasons:
1 - It might come out really easy (giant isles of floor, or a bunch of walls together, making only one, extremely visible path, creating a stupit (though valid) labyrinth
2 - It is potentially the fastest way of creating a perfect random labyrinth EVER, but at the same time it's potentially the slowest too, taking as long as... infinite. This difference is noticed more when I set the grid for 30x30 or more (when something is not overflown)
3 - It's dumb and an offence to logic itself.
In my deffense, I didn't plan making it this way from the beginning, as described, one thing led to another.
So I've started thinking about ways to do a beautiful (full of paths, tricky and winnable) labyrinth, then I've thought about making tiny small (let's say) 5x5 blocks with predesigned entrances and mount them together in a way that it fits, but it would go against my true random desire, as well as my unwillingness to draw it by hand.
Then I've thought about a function to create a random path, run it once to the end, and run it several times to somewhere close to the end, and some crossings and stuff, some creating dead ends, which seemed better to me, but I just couldn't imagine it creating a decent labyrinth.
You can check what I've done so far in this link.
Note: I have no intentions in harming anyone's pc with anything.
First one to open it, please comment here saying that it's safe. - Done (thank you, Jonno_FTW)
If you still don't trust it, use a Virtual Machine.
OBS: I know this is not the best way of developing anything. I should get a decent game engine, bla bla bla, it was some kind of challenge for myself.
I've done maze generation. You don't want to place stuff randomly and validate. Instead, you generate it out from a starting point.
Pick a starting point, move in a random direction. Have a random probability of picking a new direction. Never move into an occupied square, if you bump into one the current trail ends. If the current trail ends pick a square you have already visited and pick a new direction and do a random walk like you did for the first one. Repeat until the maze is as full as you want it to be.
The probability of the direction change should be an input parameter as it makes quite a difference. Note that if you are doing a 3D maze the odds of a vertical turn should be a lot lower than the odds of a horizontal move.
Here's an expansive website dedicated to labyrinths:
http://www.astrolog.org/labyrnth/algrithm.htm
Explains what types of labyrinths there are, goes over the generation algorithms and the solution algorithms, has a lot of cool pictures.
Have a look at the source code in my Roguelike game, Tyrant:
Code for Dungeon.java
There are a lot of diferent map generation techniques used to produce the different level types. But the most basic pattern is to iterate the following:
Start with a blank map
Create a single random room / open space in the map
Randomly select a tile at the edge of the currently open area
Attempt to "grow" a corridor or room randomly out from that space (if it doesn't fit, do nothing)
Loop back to step 3 as many times as you need to create a decent maze
Finally, do a pass over the whole map and convert and remaining blank space to walls
Here's a screenshot of the type of thing you get (Look at the mini-map from the maze structure):
Tyrant screenshot http://www.freeimagehosting.net/uploads/af45502c9c.png
Your question makes me think of the XScreensaver Maze program. Look at its screenshots to see if that's the desired effect.
It looks like it took its maze generation algorithm from Wikipedia.
Wikipedia has a great article on Maze generation algorithms
How you create a random labyrinth will depend on what you want it to look like. If you're creating something that's designed to have a lot of dead ends, then you can just "randomly" trace a path from the start point to the end point, and then randomly fill in the empty spaces, essentially carving the path out of a solid block of material. E.g. imagine you had a stone tablet. First step would be to carve the "solution" path. Then you'd go in and make all of the dead ends.
If you want something that's more "play" than "puzzle", then creating a bunch of tile pieces that fit together in different ways is probably the way to go. That's how the Diablo games did it as far as I can tell; a number of predesigned "sets" and rules about how they fit together. You'd mark the four sides of the block with things like "three open spaces followed by two closed," and then if another piece also has a matching description, they can be put together.
After that, all you have to do is figure out how you can consistently render "random" behavior.
There's actually a trick that Al Lowe used for one of his Leisure Suit Larry games (LSL 3, I believe) that might be helpful.
Basically, he made a bamboo forest 'maze' that the player had to navigate. Rather than creating a separate 'square' of maze for each screen, however, he simply 'flipped' the one screen he had already created and made dead ends by blocking various entrances with a single 'bamboo wall' graphic.
Perhaps you could do the same: have the generator carve a valid maze, and then tell it to place dead-end blocks along some of the paths. That would ensure that there's always at least one valid, open path to the 'finish line', as well as preventing players from just strolling through a super-easy layout.
It'll also make a 30x30 maze more workable, since the computer won't have to test every square of a 900-square grid for validity.