I'm making a game in C# and XNA, and I was trying to come up with a method to render massive terrains without using a tremendous amount of memory or passing the poly limit hard-coded into XNA.
My solution so far is to create a massive heightmap that is loaded into memory at the beginning of the game, in the initialization phase. Then, terrain is only generated nearest to the camera. This is accomplished by projecting a triangle whose apex is the character and whose other two vertices extend to the sides of the character's viewing area. All the pixels inside that triangle on the heightmap are then rendered and drawn into the game, thus only rendering what is seen.
The problem is, I've successfully found (I think, can't test until I get terrain rendering) the three vertices of the triangle. Now I need to find a list of the coordinates for every single pixel inside that triangle - whole numbers only, because I just need a list of pixels to render.
I know it sounds a little confusing, so here's the gist of it:
I have an image, and I project a triangle onto that image. The only thing I know about that triangle are the three vertices. I need a list of the pixels inside that triangle.
I've been Googling around for maybe 20 minutes now, and I figured I might as well go ahead and post something here, given that what I'm trying to do isn't all that common. If I find an answer, I'll be sure to post it here.
But until then, can anyone tell me how to accomplish this?
Edit: A formula, please. If you can provide a formula or algorithm, and an explanation, that would be just perfect.
Edit: I've posted a new question, as I've ditched this method of rendering large terrains. The question is here.
Start here:
http://mathworld.wolfram.com/TriangleInterior.html
One of the non-trivial problems, not mentioned there, that you have to deal with is the pixelization along the boundary.
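To make that concrete, here is a minimal C# sketch of the standard approach: test every pixel in the triangle's bounding box with the sign-of-cross-product (half-space) test from the page above. Point is XNA's integer point type, and the inclusive comparisons are one simple way of handling the boundary pixelization just mentioned:

using System;
using System.Collections.Generic;
using Microsoft.Xna.Framework;

static class TriangleRaster
{
    // Sign of the cross product: which side of edge a->b does p lie on?
    static int Edge(Point a, Point b, Point p)
    {
        return (b.X - a.X) * (p.Y - a.Y) - (b.Y - a.Y) * (p.X - a.X);
    }

    // Lists every integer pixel inside (or on the edge of) triangle v0-v1-v2.
    public static List<Point> PixelsInTriangle(Point v0, Point v1, Point v2)
    {
        var pixels = new List<Point>();
        int minX = Math.Min(v0.X, Math.Min(v1.X, v2.X));
        int maxX = Math.Max(v0.X, Math.Max(v1.X, v2.X));
        int minY = Math.Min(v0.Y, Math.Min(v1.Y, v2.Y));
        int maxY = Math.Max(v0.Y, Math.Max(v1.Y, v2.Y));

        for (int y = minY; y <= maxY; y++)
        {
            for (int x = minX; x <= maxX; x++)
            {
                var p = new Point(x, y);
                int e0 = Edge(v0, v1, p);
                int e1 = Edge(v1, v2, p);
                int e2 = Edge(v2, v0, p);
                // Inside if p is on the same side of all three edges;
                // >= / <= keeps boundary pixels regardless of winding order.
                if ((e0 >= 0 && e1 >= 0 && e2 >= 0) ||
                    (e0 <= 0 && e1 <= 0 && e2 <= 0))
                    pixels.Add(p);
            }
        }
        return pixels;
    }
}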
I have two images (left and right).
I want to measure real-world distances on the image.
When I click on the image, I want to get the real distance from the clicked point to the camera.
Left Image:
Right Image:
I have calibrated the two images. I want to use EmguCV to get distance from the image.
Is this possible?
While I do not know the specifics of EmguCV, I can tell you the concept behind how stereo depth perception works, and hopefully you can then implement some sort of fix.
Essentially, the first step is to segment and match parts of the image. What you are trying to accomplish here is to identify the parts of the image that are the "same" in each. For instance, you want to be able to identify the center of the lamp in each image. The feature set you use to do this is up to you, but one basic approach that may help is using an edge detector (like the Canny method) and trying to match contours with similar shapes. Another common technique is breaking the image up into smaller blocks and matching features within those blocks. The method you use is up to you.
Next, you are able to calculate the distance of the matched objects from the center of your camera in both images. You will need to do this both for the x and y directions. We will call this your x and y disparity.
Now, you need to know the distance between the centers of the cameras that took the picture. Once you have this, there is some simple trig that you can do to solve for distance. There is a rather simple explanation of this here
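As a concrete illustration of that trigonometry, here is a minimal sketch of the standard pinhole-stereo relation depth = f * B / d; the names are illustrative, not EmguCV API calls:

using System;

// Standard pinhole-stereo relation: depth = f * B / d, where f is the
// focal length in pixels, B is the baseline between the camera centres,
// and d is the disparity in pixels.
static double DepthFromDisparity(double focalLengthPixels,
                                 double baselineMeters,
                                 double disparityPixels)
{
    if (disparityPixels <= 0)
        throw new ArgumentException("Zero disparity: point at infinity or a bad match.");
    return focalLengthPixels * baselineMeters / disparityPixels;
}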
Again, this is all conceptual, but it is important to know how the algorithms you are applying work. The first step to understanding the solution to a problem is to understand the problem itself. Once you have a full understanding of the problem, and the procedure for solving it, implementing that procedure with any library should become much easier. Good luck!
I am currently working on a project in which we have a set of photos of trucks going by a camera. I need to detect what type of truck it is (how many wheels it has), so I am using EMGU to try to detect this.
The problem I have is that I cannot seem to detect the wheels using EMGU's HoughCircle detection; it doesn't detect all the wheels and will also detect random circles in the foliage.
So I don't know what I should try next. I tried implementing the SURF algorithm to match wheels between images, but this does not seem to work either since they aren't exactly the same. Is there a way I could implement a "loose" SURF algorithm?
This is what I start with.
This is what I get after the Hough Circle detection. There are many erroneous detections, as some are not even close to containing a circle, and the back wheels are detected as a single one for some reason.
Would it be possible to confirm that the detected circles are actually wheels, using SURF and matching them among themselves? I am a bit lost on what I should do next; any help would be greatly appreciated.
(sorry for the bad English)
UPDATE
Here is what I did.
I used blob tracking to find the blob in my set of photos. With this I can effectively locate the moving truck. Then I split the blob's rectangle in two and take the lower half; from there I know I get the zone that should contain the wheels, which greatly improves detection. I will then run a loose light-intensity check on the wheels I get. Since they are generally darker, I should get a decently low value for those and can discard anything that is too white (180/255 and up). I also know that my circle radius cannot be greater than half the height of the detection zone.
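If useful, a rough sketch of that intensity check using the Emgu CV 2.x Image<,> API; the structure is an assumption, the 180 cut-off is the one mentioned above:

using Emgu.CV;
using Emgu.CV.Structure;

// Reject a candidate circle whose mean grayscale value is too bright
// to be a tyre.
static bool LooksLikeWheel(Image<Gray, byte> gray, CircleF candidate)
{
    // Build a mask containing just the candidate circle.
    var mask = gray.CopyBlank();
    mask.Draw(candidate, new Gray(255), -1); // negative thickness = filled
    double mean = gray.GetAverage(mask).Intensity;
    return mean < 180; // wheels are dark; discard anything too white
}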
In this answer I describe an approach that was tested successfully with the following images:
The image processing pipeline begins by either downsampling the input image or performing a color-reduction operation to decrease the amount of data (colors) in the image. This creates smaller groups of pixels to work with. I chose to downsample:
The 2nd stage of the pipeline performs a Gaussian blur in order to smooth/blur the images:
Next, the images are ready to be thresholded, i.e. binarized:
The 4th stage requires executing Hough Circles on the binarized image to locate the wheels:
The final stage of the pipeline would be to draw the circles that were found over the original image:
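For reference, a rough sketch of the whole pipeline using the Emgu CV 2.x Image<,> API; the file name, thresholds and radii are placeholders to tune for your own photos:

using System.Drawing;
using Emgu.CV;
using Emgu.CV.Structure;

// 1. Load and downsample.
var original = new Image<Bgr, byte>("truck.jpg");   // placeholder file name
var small = original.PyrDown();

// 2. Gaussian blur to smooth out noise.
var smooth = small.SmoothGaussian(9);

// 3. Threshold (binarize) a grayscale version of the image.
var binary = smooth.Convert<Gray, byte>()
                   .ThresholdBinary(new Gray(90), new Gray(255));

// 4. Hough circles on the binarized image; all numbers need tuning.
CircleF[] wheels = binary.HoughCircles(
    new Gray(180),  // Canny edge threshold
    new Gray(60),   // accumulator threshold
    2.0,            // accumulator resolution (dp)
    20.0,           // min distance between circle centres
    5, 60)[0];      // min/max radius; [0] = result for the single channel

// 5. Draw the detected circles over the downsampled image.
foreach (CircleF wheel in wheels)
    small.Draw(wheel, new Bgr(Color.Red), 2);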
This approach is not a robust solution. It's meant only to inspire you to continue your search for answers.
I don't do C#, sorry. Good luck!
First, the wheels' projections are ellipses, not circles. Second, some background gradient can easily produce circle-like objects, so there should be no surprise here. The problem with ellipses, of course, is that they have 5 DOF, not 3 DOF like circles, and a five-dimensional Hough space becomes impractical. Some generalized Hough transforms can probably solve the ellipse problem at the expense of a lot of additional false-alarm (FA) circles. To counter FA you have to verify that they really are wheels that belong to a truck and nothing else.
You probably need to start by specifying your problem in terms of objects and backgrounds rather than wheel detection. This is important since objects create a visual context for detecting wheels, and background analysis will show how easy it would be to segment a truck (object) in the first place. If the camera is static, one can use motion to detect the background. If the background is relatively uniform, a Gaussian mixture model of its colors may help to eliminate much of it.
I strongly suggest using:
http://cvlabwww.epfl.ch/~lepetit/papers/hinterstoisser_pami11.pdf
and the C# implementation:
https://github.com/dajuric/accord-net-extensions
(take a look at samples)
This algorithm can achieve real-time performance (20-30 fps) even with more than 2000 templates, so you can cover both the ellipse (projection) and circle shape cases.
You can modify the hand-tracking sample (FastTemplateMatchingDemo) by putting in your own binary templates (make them in Paint :-)).
P.S:
To suppress false positives, some kind of tracking is also incorporated. The library I linked above also contains some tracking algorithms, such as the discrete Kalman filter and the particle filter, all with samples!
This library is still under development so there is possibility that something will not work.
Please do not hesitate sending me a message.
I was having trouble coming up with a way to describe the problem area that I want to understand better, so I set up the following scenario to help illustrate it.
Given the following image, how would I go about programming something that could find all of the happy faces that match the image in position 1 (call it the template image) and disregard sad-face images like those in positions 2 and 5?
...
I'm not looking for anyone to solve it for me, I just need an insightful first step to get me started as it's uncharted territory for me.
What would this be called? What should I be querying google and stack overflow for in order to find helpful information? Does anyone have a library or code snippet that can help get me started?
Also, I'm a .NET / C# programmer by trade so anything that happens to be in my native language is especially appreciated but not a deal-breaker.
Thanks in advance...
Mike
The technique in fact depends on the actual scenario. This goes by several names, such as content based retrieval, template matching, image description and such.
My suggestions:
If your scenario is like the faces, rotated at known angles with known sizes, look for simpler techniques, such as the correlation of two images (see the sketch after these suggestions). Do it for each angle and you've got it.
If you know that the only variation between images is rotation (you have only the happy and sad faces rotated, without other distortions), you can look for rotation-invariant matching methods. Fourier theory may help you there, as may mappings to polar coordinates combined with correlations.
In the worst case, where you have several variations, you will need to look into image descriptors and pattern-matching techniques. These also depend on the image type, and there are several of them. If you end up there, you'll have a scheme with some libraries/code to extract features from the images and a classifier to tell you which are the same and which are not, with some kind of confidence (such as a distance measure between the feature vectors).
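For the first suggestion, here is a minimal sketch of plain normalized cross-correlation between two equal-sized grayscale images, stored as float[h, w] (an assumed format); a score near 1 means a strong match:

using System;

// Normalized cross-correlation of two equal-sized grayscale images.
static double Correlate(float[,] a, float[,] b)
{
    int h = a.GetLength(0), w = a.GetLength(1);
    double meanA = 0, meanB = 0;
    foreach (float v in a) meanA += v;
    foreach (float v in b) meanB += v;
    meanA /= h * w;
    meanB /= h * w;

    double num = 0, varA = 0, varB = 0;
    for (int y = 0; y < h; y++)
    {
        for (int x = 0; x < w; x++)
        {
            double da = a[y, x] - meanA, db = b[y, x] - meanB;
            num += da * db;   // covariance accumulator
            varA += da * da;
            varB += db * db;
        }
    }
    return num / Math.Sqrt(varA * varB); // 1 = perfect match
}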
The simplest technique would probably be template matching. The difference in your example images is pretty small, though, so it might be hard to differentiate, for example, images 1 and 5.
A possible algorithm is:
Compute gradient of the image
For each gradient vector, compute the gradient direction
Compute the orientation histogram (angle vs frequency) of the gradient vectors
This orientation histogram will be distinct for the "happy" vs the "sad" smiley.
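A minimal sketch of those three steps, assuming the image is already grayscale as a float[h, w] array with values in 0..1 (the magnitude cut-off is an arbitrary choice to skip flat regions):

using System;

// Histogram of gradient directions over a grayscale image.
static int[] OrientationHistogram(float[,] gray, int bins)
{
    int h = gray.GetLength(0), w = gray.GetLength(1);
    var hist = new int[bins];
    for (int y = 1; y < h - 1; y++)
    {
        for (int x = 1; x < w - 1; x++)
        {
            // Central-difference gradient.
            float gx = gray[y, x + 1] - gray[y, x - 1];
            float gy = gray[y + 1, x] - gray[y - 1, x];
            float mag = (float)Math.Sqrt(gx * gx + gy * gy);
            if (mag < 0.05f) continue;          // ignore flat regions
            double angle = Math.Atan2(gy, gx);  // -pi .. pi
            int bin = (int)((angle + Math.PI) / (2 * Math.PI) * bins) % bins;
            hist[bin]++;
        }
    }
    return hist;
}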
Have fun.
A simple poor person's algorithm, just to get the job done in this case, could be:
Determine the bounding box of the image and assume its centre is the centre of the face's circle.
Within the circle, search for the two eyes as BLOBs, i.e. objects containing 20 or so pixels in total that fit within a small defined rectangle.
Once you have the location of the two eyes, you can determine the slope of the line intersecting them, and hence the orientation of the face.
The distance from the point midway between the two eyes, straight down through the centre of the circle to the mouth, returns one of two possible distances, i.e. sad or happy.
Quick and dirty and hardcoded to this particular image but it would do the job quickly.
The AForge option is probably a better generalised approach.
I'm working on a simple 2D Real time strategy game using XNA. Right now I have reached the point where I need to be able to click on the sprite for a unit or building and be able to reference the object associated with that sprite.
From the research I have done over the last three days I have found many references on how to do "Mouse picking" in 3D which does not seem to apply to my situation.
I understand that another way to do this is to simply have an array of all "selectable" objects in the world, and when the player clicks on a sprite, check the mouse location against the locations of all the objects in the array. The problem I have with this approach is that it would become rather slow if the number of units and buildings grows large. (It also does not seem very elegant.) So what are some other ways I could do this? (Please note that I have also considered using a hash table to associate objects with sprite locations, and using a two-dimensional array where each location in the array represents one pixel in the world. Once again, these seem like rather clunky ways of doing things.)
For up to hundreds of units, it should be fast enough to simply do a linear search O(n) over all the units in the world if the click regions are circles or rectangles. Especially seeing as it will be once per click, not once per frame.
If your units are not circular or rectangular, check against a bounding circle or rectangle first, and if that passes check against the more complicated bounding shape.
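A minimal sketch of that two-phase test; Unit, Bounds and ContainsPixel are hypothetical stand-ins for your own types:

using System.Collections.Generic;
using Microsoft.Xna.Framework;

// Linear O(n) pick: cheap rectangle test first, precise test second.
static Unit Pick(Point click, IEnumerable<Unit> units)
{
    foreach (Unit unit in units)
    {
        if (!unit.Bounds.Contains(click.X, click.Y)) // cheap reject
            continue;
        if (unit.ContainsPixel(click))               // precise accept
            return unit;
    }
    return null; // nothing under the cursor
}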
For a more detailed answer, here's my answer to a similar question about space partitioning. There I mention bucketed grids and quadtrees as potential structures for performance optimisation.
But you should never do performance optimisation until you have tested and actually do have a performance problem!
If you have a class that manages drawable objects, you could have a static int that you increment every time you make a new object, and save the old value as a local instance of Color in the drawable object. You can then use the .NET type converter to turn it into byte arrays and back; I don't remember its name, and I'm on my phone on a train, so I can't check for you, I'm afraid.
When you build the color from the byte array, just remember to max the alpha channel. If you happen to create too many objects, you might overrun the available indexes; not sure what to do then... probably have all your objects reacquire new colors starting from 0:0:0:255 again, since hopefully some old ones are no longer in use :P
Not sure I made a lot of sense, but since I'm on a train, that's all I can give you, sorry :)
You could use pixel perfect picking, which scales very well to massive numbers of objects (and has the advantage of being pixel perfect).
Essentially you render your scene using a unique colour for each object. Then you resolve the backbuffer into a texture and get the texture data back, finally you can simply check the pixel underneath the mouse and find out which object the mouse is on.
You can be clever about the information you get back: you can request just the single pixel the mouse is on top of.
Color[] pixel = new Color[1];
// XNA 4: read back just the 1x1 region under the mouse.
texture.GetData(0, new Rectangle(mousePosition.X, mousePosition.Y, 1, 1), pixel, 0, 1);
// pixel[0] == colour of the item the mouse is over.
// You can now look this up in a Dictionary<Color, Item>.
You should be careful not to stall the pipeline by doing this (causing the CPU to wait for the GPU to render things). The best way to avoid that is to swap between 2 render targets, and always GetData from the render target you used last frame; this means the data is a frame out of date, but no human has fast enough reactions to notice.
Addendum in response to your comment.
To assign a unique colour to each object, simply increment a byte for each object. When that byte overflows, increment another, and when that one overflows, increment a third. Then you can use those three bytes as red, green and blue. Remember to keep alpha at the max value; you don't want any see-through objects!
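A minimal sketch of that counter-to-colour scheme in XNA terms; the running id is simply reinterpreted as a 24-bit RGB value:

using Microsoft.Xna.Framework;

static int nextId = 0;

// Hands out a unique, fully opaque Color per object
// (16,777,215 ids before the three bytes overflow).
static Color NextPickingColor()
{
    int id = nextId++;
    // Red = low byte, green = middle byte, blue = high byte.
    return new Color(id & 0xFF, (id >> 8) & 0xFF, (id >> 16) & 0xFF, 255);
}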
Resolving the backbuffer changed slightly in XNA 4: now you must render to a render target and resolve that. Doing so is pretty simple, and is outlined by Shawn Hargreaves here
I'm trying to draw a polygon using C# and DirectX.
All I get is an ordered list of points from a file and I need to draw the flat polygon in a 3d world.
I can load the points and draw a convex shape using a triangle fan and DrawUserPrimitives.
This obviously leads to incorrect results when the polygon is very concave (which it may be).
I can't imagine I'm the only person to grapple with this problem (though I'm a gfx/DirectX neophyte; my background is in GUI/Windows application development).
Can anyone point me towards a simple to follow resource\tutorial\algorithm which may assist me?
Direct3D can only draw triangles (well, it can draw lines and points as well, but that's beside the point). So if you want to draw any shape that is more complex than a triangle, you have to draw a bunch of touching triangles that together equal that shape.
In your case, it's a concave polygon triangulation problem. Given a bunch of vertices, you can keep them as they are; you just need to compute the "index buffer" (in the simplest case, three indices per triangle that say which vertices the triangle uses). Then draw that by putting them into vertex/index buffers or using DrawUserPrimitives.
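For illustration, the draw call might look like the following, assuming a graphicsDevice, a VertexPositionColor[] vertices and a short[] indices produced by your triangulator (all names assumed):

// vertices: the polygon's points as VertexPositionColor.
// indices:  three entries per triangle, from your triangulator.
graphicsDevice.DrawUserIndexedPrimitives(
    PrimitiveType.TriangleList,
    vertices, 0, vertices.Length,    // vertex data, offset, count
    indices, 0, indices.Length / 3); // index data, offset, primitive count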
Some algorithms for triangulating simple (convex or concave, but without self-intersections or holes) polygons are at VTerrain site.
I have used Ratcliff's code in the past; very simple and works well. VTerrain has a dead link to it; the code can be found here. It's C++, but porting that over to C# should be straightforward.
Oh, and don't use triangle fans. They are of very limited use, inefficient and are going away soon (e.g. Direct3D 10 does not support them anymore). Just use triangle lists.
If you are able to use the stencil buffer, it should not be hard to do. Here's a general algorithm:
Clear the stencil buffer to 1.
Pick an arbitrary vertex v0, probably somewhere near the polygon to reduce floating-point errors.
For each vertex v[i] of the polygon in clockwise order:
let s be the segment v[i]->v[i+1] (where i+1 will wrap to 0 when the last vertex is reached)
if v0 is to the "right" of s:
draw a triangle defined by v0, v[i], v[i+1] that adds 1 to the stencil buffer
else
draw a triangle defined by v0, v[i], v[i+1] that subtracts 1 from the stencil buffer
end for
fill the screen with the desired color/texture, testing for stencil buffer values >= 2.
By "right of s" I mean from the perspective of someone standing on v[i] and facing v[i+1]. This can be tested by using a cross product:
cross(v0 - v[i], v[i+1] - v[i]) > 0
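In C# with XNA's Vector2, that test might look like:

using Microsoft.Xna.Framework;

// True when v0 is to the "right" of the segment v[i] -> v[i+1],
// i.e. the z-component of the 2D cross product is positive.
static bool IsRightOf(Vector2 v0, Vector2 vi, Vector2 viPlus1)
{
    Vector2 a = v0 - vi;       // v0 - v[i]
    Vector2 b = viPlus1 - vi;  // v[i+1] - v[i]
    return a.X * b.Y - a.Y * b.X > 0;
}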
Triangulation is the obvious answer, but it's hard to write a solid triangulator. Unless you have two months' time to waste, don't even try it.
There are a couple of codes that may help you:
The GPC Library. Very easy to use, but you may not like its license:
http://www.cs.man.ac.uk/~toby/alan/software/gpc.html
There is also Triangle:
http://www.cs.cmu.edu/~quake/triangle.html
And FIST:
http://www.cosy.sbg.ac.at/~held/projects/triang/triang.html
Another (and my preferred) option would be to use the GLU tessellator. You can load and use the GLU library from DirectX programs just fine. It does not need an OpenGL context, and it's pre-installed on all Windows machines. If you want source, you can lift the triangulation code from the SGI reference implementation. I did that once and it took me just a couple of hours.
So far for triangulation. There is a different way as well: You can use stencil tricks.
The general algorithm goes like this:
Disable color and depth writes. Enable stencil writes and set up your stencil buffer so that it inverts the current stencil value. One bit of stencil is sufficient. Oh, and your stencil buffer should be cleared as well.
Pick a random point on the screen. Any will do. Call this point your Anchor.
For each edge of your polygon, build a triangle from the two vertices that form the edge and your anchor. Draw that triangle.
Once you've drawn all these triangles, turn off stencil writes, turn on stencil testing and color writes, and draw a fullscreen quad in your color of choice. This will fill just the pixels inside your polygon.
It's a good idea to place the anchor in the middle of the polygon and to draw a rectangle only as large as the bounding box of your polygon. That saves a bit of fillrate.
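As a rough sketch of the state setup for these two passes, in XNA 4 terms since that is what the rest of this thread uses (raw Direct3D exposes the same states under different names):

using Microsoft.Xna.Framework.Graphics;

// Pass 1 state: no colour/depth writes, invert the stencil bits per pixel.
var invertStencil = new DepthStencilState
{
    StencilEnable = true,
    StencilFunction = CompareFunction.Always,
    StencilPass = StencilOperation.Invert,
    DepthBufferEnable = false,
};
var noColorWrites = new BlendState
{
    ColorWriteChannels = ColorWriteChannels.None, // pass 1 writes stencil only
};

// Pass 2 state: draw only where the lowest stencil bit ended up set.
var testStencil = new DepthStencilState
{
    StencilEnable = true,
    StencilFunction = CompareFunction.Equal,
    ReferenceStencil = 1,
    StencilMask = 1, // compare only the lowest bit (Invert flips all 8)
    StencilPass = StencilOperation.Keep,
};

// Pass 1: draw the anchor-fan triangles with invertStencil + noColorWrites.
// Pass 2: draw the polygon's bounding quad with testStencil; only pixels
// with odd coverage (inside the polygon) are filled.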
Btw - the stencil technique works for self-intersecting polygons as well.
Hope it helps,
Nils
I just had to do this for a project. The simplest algorithm I found is called "Ear Clipping". A great paper on it is here: TriangulationByEarClipping.pdf
It took me about 250 lines of C++ code and 4 hours to implement the brute-force version of it. Other algorithms have better performance, but this one was simple to implement and understand.