Source reverse engineering of a Kinect app - c#

I was working with an open-source Kinect app and ran into a problem while reading the source.
By the way, the project's idea is to control PowerPoint with hand gestures; you can find the source code here.
The author uses this code:
Skeleton closestSkeleton = skeletons.Where(s => s.TrackingState == SkeletonTrackingState.Tracked)
.OrderBy(s => s.Position.Z * Math.Abs(s.Position.X))
.FirstOrDefault();
Can anyone help me figure out what s => s.Position.Z * Math.Abs(s.Position.X)
means conceptually? I know it's a lambda expression; what I need to understand is why this particular expression is used.

It's a distance metric, used to determine the Skeleton closest to the Kinect sensor.
In the Skeleton Space, Z is the distance from the Kinect sensor (see here).
And if you think of the room as being divided into a left half and a right half by a line running straight out from the Kinect sensor, then X is how far something is from that line: how far to the left or to the right.
This is also why the absolute value of X is used: the code only cares how far the Skeleton is from that hypothetical dividing line, not on which side of it.
So this code takes how far from the sensor a body is (Z) and multiplies it by how far to the left or right it is (|X|). It is a somewhat primitive measure of distance. (One might have expected the Pythagorean theorem, but maybe that was considered too slow?)
The code takes the FirstOrDefault Skeleton, where these Skeletons are ordered by this distance metric.
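If you did want the true planar distance the answer alludes to, only the OrderBy key selector would change. A minimal sketch, assuming the same skeletons collection and Kinect SDK 1.x types as in the question:
// Hedged alternative: order by the actual distance in the X/Z plane
// (Pythagorean) instead of the Z * |X| heuristic used by the author.
Skeleton closestSkeleton = skeletons
    .Where(s => s.TrackingState == SkeletonTrackingState.Tracked)
    .OrderBy(s => Math.Sqrt(s.Position.X * s.Position.X + s.Position.Z * s.Position.Z))
    .FirstOrDefault();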

s => s.Position.Z * Math.Abs(s.Position.X) is the key selector in the OrderBy call: it is the quantity by which all detected bodies are ordered. It weights the sort by depth combined with lateral offset, rather than by the orthogonal Z separation alone.
Consider two objects at the same z coordinate, and the camera at the origin. The closest one is the one with a smaller horizontal (x) distance.

Related

Generate Voronoi diagram without using Fortune's algorithm

I'm hoping to create a Voronoi landscape in Unity in C#. I looked at a number of Unity project files, but they all implement Fortune's algorithm, which is completely over my head. Are there any other methods of generating a Voronoi diagram that are easier to understand?
Slow performance is completely fine with me.
Much appreciated!
Sidenote: since I'm working in Unity and need to generate a 2D/3D mesh from the Voronoi diagram, a per-pixel distance check won't work :,(
On second thought, maybe I could use a 2D array of Vector2s instead of pixels, spaced 1.0 unit apart along the x and z axes.
There is a very simple way to create an approximate Voronoi diagram (VD). For every site s that should define a cell in the VD (2D plane), center a cone at s with constant slope and a certain height. Then look from above onto that landscape of cones (so that all the spikes are visible). The boundary where the different cones meet, projected onto the 2D plane, is the (approximate) Voronoi diagram.
As you asked in the comments, getting the actual edge data this way does not seem so easy. But there may be graphical routines to generate it by intersecting the cones.
An alternative is to compute a Delaunay triangulation of the given point set. There are some implementations referenced in this related post (simple approximations are also mentioned). Then you compute the dual graph of your triangulation and you have the Voronoi diagram. (Dual graph means that for every edge AB in the triangulation there exists an edge in the VD bisecting the space between the two vertices A and B, and for every triangle there exists a vertex in the VD where the dual edges meet.) Otherwise there are also many C# Voronoi implementations around, such as Unity-delaunay, but as you mentioned they use the Fortune approach.
If you want to code everything yourself, you can compute a triangulation of the points by brute force in O(n^2) time for n points. Then apply in-circle tests and edge flips. That is, for every triangle t(abc), create the circle C defined by the three vertices of t. Then check whether some other point d of your point set lies inside C. If so, flip the edge that belongs to t and also forms an edge of the triangle containing d. This flipping is repeated until all triangles fulfil the empty-circle property (the Delaunay condition), which again takes O(n^2) time with brute force. Then you can compute the dual graph as mentioned above.
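A minimal sketch of that in-circle test (my own illustration, not from the post), assuming UnityEngine.Vector2-style points with lowercase x/y fields and a counter-clockwise triangle a, b, c:
// Returns true if d lies strictly inside the circumcircle of the
// counter-clockwise triangle (a, b, c); used to decide whether an edge flip
// is needed. Standard 3x3 determinant form of the Delaunay in-circle test.
static bool InCircle(Vector2 a, Vector2 b, Vector2 c, Vector2 d)
{
    float adx = a.x - d.x, ady = a.y - d.y;
    float bdx = b.x - d.x, bdy = b.y - d.y;
    float cdx = c.x - d.x, cdy = c.y - d.y;

    float det = (adx * adx + ady * ady) * (bdx * cdy - cdx * bdy)
              - (bdx * bdx + bdy * bdy) * (adx * cdy - cdx * ady)
              + (cdx * cdx + cdy * cdy) * (adx * bdy - bdx * ady);

    return det > 0f;
}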
"Easiest? That's the brute-force approach: For each pixel in your output, iterate through all points, compute distance, use the closest. Slow as can be, but very simple. If performance isn't important, it does the job."
[1] Easiest algorithm of Voronoi diagram to implement?
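That brute-force idea maps almost directly onto the grid of Vector2s proposed in the question. A rough sketch (my own, assuming UnityEngine.Vector2 and hypothetical names; sites holds the Voronoi sites and the grid samples are spaced 1.0 unit apart):
// Brute-force Voronoi assignment: for every grid sample, find the nearest
// site and record its index. O(width * depth * sites.Length), slow but simple.
int[,] AssignVoronoiCells(Vector2[] sites, int width, int depth)
{
    int[,] cell = new int[width, depth];
    for (int x = 0; x < width; x++)
    {
        for (int z = 0; z < depth; z++)
        {
            Vector2 sample = new Vector2(x, z);   // 1.0 unit spacing
            int nearest = 0;
            float best = float.MaxValue;
            for (int i = 0; i < sites.Length; i++)
            {
                float d = (sites[i] - sample).sqrMagnitude;  // squared distance is enough for comparison
                if (d < best) { best = d; nearest = i; }
            }
            cell[x, z] = nearest;   // this grid point belongs to site `nearest`
        }
    }
    return cell;
}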

How can you stitch multiple heightmaps together to remove seams?

I am trying to write an algorithm (in C#) that will stitch two or more unrelated heightmaps together so there is no visible seam between the maps. Basically I want to mimic the functionality found on this page:
http://www.bundysoft.com/wiki/doku.php?id=tutorials:l3dt:stitching_heightmaps
(You can just look at the pictures to get the gist of what I'm talking about)
I also want to be able to take a single heightmap and alter it so it can be tiled, in order to create an endless world (All of this is for use in Unity3d). However, if I can stitch multiple heightmaps together, I should be able to easily modify the algorithm to act on a single heightmap, so I am not worried about this part.
Any kind of guidance would be appreciated, as I have searched and searched for a solution without success. Just a simple nudge in the right direction would be greatly appreciated! I understand that many image manipulation techniques can be applied to heightmaps, but I have been unable to find an image processing algorithm that produces the results I'm looking for. For instance, image stitching appears to only work for images that have overlapping fields of view, which is not the case with unrelated heightmaps.
Would utilizing an FFT low-pass filter in some way work, or would that only be useful for generating a single tileable heightmap?
Because the algorithm is to be used in Unity3d, any C# code will have to be confined to .NET 3.5, as I believe that's the latest version Unity supports.
Thanks for any help!
Okay, it seems I was on the right track with my previous attempts at solving this problem. My initial attempt at stitching the heightmaps together involved the following steps for each point on the heightmap:
1) Find the average between a point on the heightmap and its opposite point. The opposite point is simply the first point reflected across either the x axis (if stitching horizontal edges) or the z axis (for the vertical edges).
2) Find the new height for the point using the following formula:
newHeight = oldHeight + (average - oldHeight)*((maxDistance-distance)/maxDistance);
Where distance is the distance from the point on the heightmap to the nearest horizontal or vertical edge (depending on which edge you want to stitch). Any point with a distance less than maxDistance (an adjustable value that affects how much of the terrain is altered) is adjusted based on this formula.
That was the old formula, and while it produced really nice results for most of the terrain, it was creating noticeable lines in the areas between the region of altered heightmap points and the region of unaltered heightmap points. I realized almost immediately that this was occurring because the slope of the altered regions was too steep in comparison to the unaltered regions, thus creating a noticeable contrast between the two. Unfortunately, I went about solving this issue the wrong way, looking for solutions on how to blur or smooth the contrasting regions together to remove the line.
After very little success with smoothing techniques, I decided to try and reduce the slope of the altered region, in the hope that it would better blend with the slope of the unaltered region. I am happy to report that this has improved my stitching algorithm greatly, removing 99% of the lines reported above.
The main culprit from the old formula was this part:
(maxDistance-distance)/maxDistance
which was producing a value between 0 and 1 linearly, based on the distance of the point to the nearest edge. As the distance between a heightmap point and the edge increased, the point would use less and less of the average (as defined above) and shift more and more towards its original value. This linear interpolation was the cause of the too-steep slope, but luckily I found a built-in method in the Mathf class of Unity's API that provides smoothstep (cubic Hermite) interpolation. This is the SmoothStep method.
Using this method (a similar method can be found in the XNA framework, found here), the change in how much of the average is used in determining a heightmap value is strongest at middle distances, but that influence tapers off as the distance approaches maxDistance, creating a gentler slope that blends better with the slope of the unaltered region. The new formula looks something like this:
//Using Mathf - Unity only?
float weight = Mathf.SmoothStep(1f, 0f, distance/maxDistance);
//Using XNA
float weight = MathHelper.SmoothStep(1f, 0f, distance/maxDistance);
//If you can't use either of the two methods above
float input = distance/maxDistance;
float weight = 1f + (-1f)*(3f*(float)Math.Pow(input, 2f) - 2f*(float)Math.Pow(input, 3f));
//Then calculate the new height using this weight
newHeight = oldHeight + (average - oldHeight)*weight;
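Putting it together, a rough sketch of how that weight might be applied across one horizontal edge; names like heights and GetAverageWithOppositePoint are my own placeholders, not the author's code:
// Hedged sketch: adjust every point within maxDistance of the bottom edge.
// heights is the heightmap; GetAverageWithOppositePoint is a placeholder for
// step 1 from the description above (average with the reflected point).
for (int x = 0; x < heights.GetLength(0); x++)
{
    for (int z = 0; z < heights.GetLength(1); z++)
    {
        float distance = z;                      // distance to the edge being stitched
        if (distance >= maxDistance)
            continue;                            // outside the blended region

        float oldHeight = heights[x, z];
        float average = GetAverageWithOppositePoint(x, z);
        float weight = Mathf.SmoothStep(1f, 0f, distance / maxDistance);
        heights[x, z] = oldHeight + (average - oldHeight) * weight;
    }
}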
There may be even better interpolation methods that produce better stitching. I will certainly update this question if I find such a method, so anyone else looking to do heightmap stitching can find the information they need. Kudos to rincewound for being on the right track with linear interpolation!
What is done in the images you posted looks a lot like simple linear interpolation to me.
So basically: You take two images (Left, Right) and define a stitching region. For linear interpolation you could take the leftmost pixel of the left image (in the stitching region) and the rightmost pixel of the right image (also in the stitching region). Then you fill the space in between with interpolated values.
Take this example - I'm using a single line here to show the idea:
Left = [11,11,11,10,10,10,10]
Right= [01,01,01,01,02,02,02]
Lets say our overlap is 4 pixels wide:
Left = [11,11,11,10,10,10,10]
Right= [01,01,01,01,02,02,02]
^ ^ ^ ^ overlap/stitching region.
The leftmost value of the left image would be 10
The rightmost value of the right image would be 1.
Now we interpolate linearly between 10 and 1 in 2 steps, our new stitching region looks as follows
stitch = [10, 07, 04, 01]
We end up with the following stitched line:
line = [11,11,11,10,07,04,01,02,02,02]
If you apply this to two complete images you should get a result similar to what you posted before.
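A short sketch of that idea in C# (my own illustration; left and right are single rows of heights and overlap is the width of the stitching region, 4 in the example above):
// Hedged sketch: linearly blend the stitching region of one row.
static float[] StitchRow(float[] left, float[] right, int overlap)
{
    float a = left[left.Length - overlap];   // leftmost value in the region (10 above)
    float b = right[overlap - 1];            // rightmost value in the region (1 above)
    float[] line = new float[left.Length + right.Length - overlap];

    // copy the parts that stay untouched
    for (int i = 0; i < left.Length - overlap; i++)
        line[i] = left[i];
    for (int i = overlap; i < right.Length; i++)
        line[left.Length - overlap + i] = right[i];

    // fill the stitching region with linearly interpolated values
    for (int i = 0; i < overlap; i++)
    {
        float t = (overlap == 1) ? 0f : i / (float)(overlap - 1);
        line[left.Length - overlap + i] = a + (b - a) * t;
    }
    return line;
}
Applied row by row (or column by column for the other edge), this reproduces the stitched line shown above.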

Finding angle between two markers for use in mathematical optimisation

I am trying to minimize the difference between sets of square markers in 3d space with a set of unknown parameters.
I have a model set of these square markers (represented by 3d position and rotation) which should at the end of optimization match up with a set of observed square markers.
I am using Levenberg–Marquardt to optimize the set of unknown parameters, these parameters will alter the position and rotation of the model 3d markers until they match (more or less) with the observed 3d marker positions.
The observed 3d markers come from a computer vision marker detection algorithm. It gives the id of the markers seen in each frame and the transformation from the camera of each marker (using Coplanar posit). Each 'frame' would only be able to see a small number of markers in the total set of markers, there will also be inaccuracies in the transformation.
I have thought about how to construct my minimization function: the idea is to compare the relative rotations between pairs of markers and to minimize the difference between the model rotations and the observed rotations in each iteration of the LM optimisation.
Essentially:
foreach (Marker m1 in markers)
{
    foreach (Marker m2 in markers)
    {
        Vector3 eulerRotation = getRotation(m1, m2);

        ObservedMarker observed1 = getMatchingObserved(m1);
        ObservedMarker observed2 = getMatchingObserved(m2);

        Vector3 eulerRotationObserved = getRotation(observed1, observed2);

        double diffX = Math.Abs(eulerRotation.X - eulerRotationObserved.X);
        double diffY = Math.Abs(eulerRotation.Y - eulerRotationObserved.Y);
        double diffZ = Math.Abs(eulerRotation.Z - eulerRotationObserved.Z);
    }
}
Where diffX, diffY and diffZ are the values to be minimized.
I am using the following to calculate the angles:
Vector3 axis = Vector3.Cross(getNormal(m1), getNormal(m2));
axis.Normalize();
double angle = Math.Acos(Vector3.Dot(getNormal(m1), getNormal(m2)));
Vector3 modelRotation = calculateEulerAngle(axis, angle);
getNormal(Marker m) calculates the normal to the plane that the square marker lies on.
I am sure I am doing something wrong here, though. Throwing this all into the LM optimiser (I am using ALGLib) doesn't seem to do anything: it goes through one iteration and finishes without changing any of the unknown parameters (initially all 0).
I am thinking that something is wrong with the function I am trying to minimize. It seems the angle calculated (3rd line above) sometimes returns NaN (I am currently setting diffX, diffY and diffZ to 0 in that case). Is it even valid to compare the Euler angles as above?
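(One likely cause of the NaN: Math.Acos returns NaN whenever floating-point error pushes the dot product slightly outside [-1, 1], which tends to happen when the two normals are nearly parallel. A minimal guard, reusing the getNormal helper from above:)
// Clamp the dot product into [-1, 1] before calling Acos so that nearly
// parallel normals don't produce NaN.
double dot = Vector3.Dot(getNormal(m1), getNormal(m2));
dot = Math.Max(-1.0, Math.Min(1.0, dot));
double angle = Math.Acos(dot);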
Any help would be greatly appreciated.
Further information:
Program is written in C#, I am using XNA as well.
The model markers are represented by its four corners in 3D coords
All the model markers are in the same coordinate space.
Observed markers are the four corners as translations from the camera position in camera coordinate space
If m1 and m2 markers are the same marker id or if either m1 or m2 is not observed, I set all the diffs to 0 (no difference).
At first I thought this might be a typo, but then I realized that this could be a bug, having been a victim of similar cases myself in the past.
Shouldn't diffY and diffZ be:
double diffY = Math.Abs(eulerRotation.Y - eulerRotationObserved.Y);
double diffZ = Math.Abs(eulerRotation.Z - eulerRotationObserved.Z);
I don't have enough reputation to post this as a comment, hence posting it as an answer!
Any luck with this? Is it correct to assume that you want to minimize the "sum" of all diffs over all marker combinations? I think if you want to use LM you should not use Math.Abs.
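For instance, since LM squares and sums the residuals itself, one option is to hand it the signed differences directly. A rough sketch reusing the helpers from the question (illustrative only, not tied to any particular ALGLib signature):
// Hedged sketch: build a residual vector for a least-squares optimizer.
// LM squares and sums these internally, so no Math.Abs is needed and the
// objective stays differentiable.
List<double> residuals = new List<double>();
foreach (Marker m1 in markers)
{
    foreach (Marker m2 in markers)
    {
        Vector3 model = getRotation(m1, m2);
        Vector3 observed = getRotation(getMatchingObserved(m1), getMatchingObserved(m2));
        residuals.Add(model.X - observed.X);   // signed, not Math.Abs(...)
        residuals.Add(model.Y - observed.Y);
        residuals.Add(model.Z - observed.Z);
    }
}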
One alternative would be to formulate your objective function manually and use another optimizer. I have recently ported two non-linear optimizers to C# which do not even require you to compute derivatives:
COBYLA2, which supports non-linear constraints but requires more iterations.
BOBYQA, which is limited to variable-bound constraints but provides a considerably more efficient iteration scheme.

Kinect - Difference between Depth and Joint Position.Z

It seems to me that both depth and position.z measure the distance between the body parts and the camera.
From what I see in examples and questions, the body parts of a tracked human can, for example, be coloured differently based on how far they are from the camera.
As for the skeleton, Position.Z is only available for the joints exposed by the SDK.
So in conclusion, both provide the same function, but depth is more precise. Do I have the wrong idea about depth, or am I missing any important points?
*I apologize if this question can be easily found on stackoverflow or on other websites. I couldn't find any pages that could answer my query so I've decided to post here instead.
Depth is trivially calculated per-pixel. Joint.Z is optionally calculated per-joint. Joint calculation has a substantial performance cost because the SDK has to analyze the image to figure out which of those millions of pixels is, for example, your left knee. A Joint has the benefit of also being inferred by the SDK based on its understanding of human anatomy, so if your left knee happens to be occluded by a wandering puppy, the Joint position will still be pretty accurate because assumptions are made based on the other visible joints.
If you are already doing skeleton tracking for x,y of joints then you might as well take advantage of the z that comes with it but otherwise depth will be more efficient.
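For illustration, a hedged sketch of the two values side by side, assuming the Kinect for Windows SDK 1.6+ types (skeleton, depthFrame and pixelIndex are placeholders, not a complete program):
// Joint position: metres, only available for tracked or inferred joints.
float kneeZMeters = skeleton.Joints[JointType.KneeLeft].Position.Z;

// Depth data: millimetres, available for every pixel in the depth frame.
DepthImagePixel[] depthPixels = new DepthImagePixel[depthFrame.PixelDataLength];
depthFrame.CopyDepthImagePixelDataTo(depthPixels);
short kneeDepthMm = depthPixels[pixelIndex].Depth;   // pixelIndex: whichever pixel you care about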

projection of point on line in 3D

I have a point (x, y, z)
and a line direction (x, y, z).
How do I get the projection of the point onto this line?
I tried this code:
http://www.zshare.net/download/93560594d8f74429/
For example, when I use the intersection function in that code with the line direction (1,0,0) and the point (2,3,3), I get a projection of (some value in x, 0, 0), and this is a wrong value.
Any suggestions?
Best regards
You want to project the vector (x,y,z) onto the line with direction (a,b,c).
If (a,b,c) is a unit vector, then the result is just ((x,y,z)·(a,b,c)) (a,b,c) = (ax+by+cz)(a,b,c).
If it's not a unit vector, make it one by dividing it by its norm.
EDIT: a little bit of theory:
Let E be your vector space of dimension N,
and let F be the line directed by the vector a. The hyperplane orthogonal to F is
F⊥ = { y in E : y·a = 0 }
Now let's choose a vector x in E; x can be written as
x = xF + x⊥
where xF is the component of x along F and x⊥ is the component in the orthogonal hyperplane. You want to find xF (it's exactly the same formula as the one I wrote above):
xF = ((x·a) / (a·a)) a
You should have a close look at the Wikipedia article on orthogonal projections and try to find more material on the web.
You can generalise this to any subspace F: if it's not a line but a plane, take the orthogonal complement of F and decompose x in the same way, etc.
This topic is clearly old and I think the original poster meant vector not line. But for the purposes of Google:
A line, unlike a vector, does not (necessarily) have its origin at (0,0,0). So it cannot be described just by a direction; it also needs an origin. This is the zero point of the line; the line can extend beyond and before this point, but when you say you're zero meters along the line, this is where you mean.
So to get the projection of a point onto a line you first need to convert the point into the line's local co-ordinate frame, which you do by subtracting the origin from the point (e.g. if a fence post is the 'line', you go from GPS co-ordinates to '5 metres to the north and a metre above the bottom of the fence post'). Now in this local co-ordinate frame the line is just a vector, so we can get the projection of the point using the normal dot product approach.
pointLocalFrame = point - origin
projection = dotProduct(lineDirection, pointLocalFrame)
NOTE: this assumes the line is infinite in length, if the projection is greater than the actual line length then there is no projection
NOTE: lineDirection must be normalised; i.e. its length must be 1
NB: dot product of two vectors (x1,y1,z1) and (x2,y2,z2) is x1*x2+y1*y2+z1*z2
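A small C# sketch combining both answers (using System.Numerics.Vector3 purely for illustration; an XNA or Unity Vector3 would work the same way):
using System.Numerics;

static Vector3 ProjectPointOntoLine(Vector3 point, Vector3 origin, Vector3 direction)
{
    // Work in the line's local frame, then project onto the normalised direction.
    Vector3 dir = Vector3.Normalize(direction);
    Vector3 local = point - origin;                      // pointLocalFrame = point - origin
    float distanceAlongLine = Vector3.Dot(local, dir);   // signed distance along the line
    return origin + dir * distanceAlongLine;             // the projected point in world space
}

// Example: a line through the origin with direction (1,0,0); the point (2,3,3) projects to (2,0,0).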
