Transforming a point cloud from one space to another - c#

I have a robot arm and a Microsoft Kinect 2.0. The Kinect can output 3D point clouds, which is offset by the Kinects origo and rotation. The robot arm can move its outermost wrist to XYZ positions (and rotations) given in mm.
Imagine that the Kinect is mounted on the arm.
The robot moves to position X:0, Y: 570, Z: -950, and rotates the wrist so that the Kinect points directly downwards (rotated -90 degrees around the Z-axis).
If I get a ColorMesh from the Kinect, how would I rotate and offset the mesh so that it is in robot space instead of 3D camera space? I don't know much about this so any help is appreciated.
Here's an illustration of the real-life setup in mind:
The sphere denotes the workspace reach of the arm.
I was figuring the class to look something like this:
public class CameraToRobotCalibrator
{
//offset in mm compared to robot origo
public const float x_offset = 0f;
public const float y_offset = 570f;
public const float z_offset = -950f;
//camera rotated around its own origo
public const float x_rotation = 0f;
public const float y_rotation = 0f;
public const float z_rotation = -90f;
public ColorMesh ConvertToRobospace(ColorMesh mesh)
{
//?
}
}
I'm guessing if it was for 1 point it wouldn't be too hard, but I don't know about an entire mesh with vertices and faces?

Given the specifications of the robot and its degree of freedom, you can usually calculate a 4x4 matrix that describes the endeffector's position and orientation. This is specific to the robot's design. Let's call this matrix E.
Then, you need to mount the Kinect on the endeffector. This will introduce another translation and possibly a rotation. So you need another matrix M that represents how the Kinect is mounted on the endeffector. It is the transform from the endeffector's coordinate system to the Kinect's coordinate system. Ideally, this matrix would be calibrated somehow. A good initial guess might be derived by measuring.
Then, the overall transform from the robot's root coordinate system to the Kinect coordinate system is T = R * M (assuming column-major matrices).
And this is the matrix that you need to transform your mesh. Transforming a mesh is just like transforming a single point. But now, you transform all vertex positions (add 1 as w-component) and all vertex normals (add 0 as w-component). Keep the face indices as they are.

For those interested in my implementation, see here:
http://pastebin.com/zfCdVFNE
You can probably find some better matrix library out there, I just use the Matrix4 from Kinect's API.
The properties were measured in real life and then calibrated after running a test with the robot and Kinect.
I figured someone might stumble upon the question some day and go 'but where is the CODE?'
I followed this guide which is basically matrices for dummies: http://www.codinglabs.net/article_world_view_projection_matrix.aspx

Related

Constrain rigidbody movement to spline at high speeds / sharp curves

I have a 2.5d platformer game. The character is using rigidbody movement on a spline (using the curvy splines asset) which curves into 3d space in all sorts of ways, while the camera stays fixed to the side so that you see the path and background turning, but maintain a 2d side scrolling perspective.
I'm essentially creating a look rotation based on the spline, then moving the player using that forward vector, and making sure to remove any velocity perpendicular to the path so that the player stays centered on the path even when curving. I'm removing the velocity on that vector instead of projecting all the velocity in the direction of the path so that the player can still jump and fall like normal.
void SetLookRotation()
{
// get nearest TF and point on spline
Vector3 p;
mTF = Spline.GetNearestPointTF(transform.localPosition, out p);
// Get forward and up vectors of point on spline
_localHorizontal = Spline.GetTangentFast(mTF);
_localVertical = Spline.GetOrientationUpFast(mTF);
// Set look rotation to path
transform.rotation = Quaternion.LookRotation(Vector3.Cross(_localHorizontal, _localVertical), _localVertical);
}
void Movement()
{
Vector3 m = transform.right * groundAcceleration * moveInput;
rb.AddForce(RemoveCrossVelocity(m));
rb.velocity = RemoveCrossVelocity(rb.velocity);
Vector3 localVelocity = transform.InverseTransformDirection(rb.velocity);
localVelocity.z = 0;
rb.velocity = transform.TransformDirection(localVelocity);
}
Vector3 RemoveCrossVelocity(Vector3 v)
{
// get magnitude going in the cross product / perpindicular of localHorizontal and localVertical vector
// (essentially the magnitude on "local Z" or to the sides of the player)
Vector3 crossVelocity = Vector3.Project(v, Vector3.Cross(transform.right, transform.up));
// and remove it from the vector
return v -= crossVelocity;
}
The first 2 functions are happening in FixedUpdate() in the order shown.
The problem is, when hitting sharp corners at high speeds, some inertia causes the player to deviate off the center of the path still just ever so slightly, and a lot of that momentum turns into upward momentum, launching the player upwards. Eventually the player can fall off the path completely (I do have a custom gravity acting towards the spline though). It works perfectly at lower speeds though, even when dealing with sharp corners. At least as far as I can tell.
I tried a bit of code from https://answers.unity.com/questions/205406/constraining-rigidbody-to-spline.html too but no luck.
Is there a way I could constrain the player rigidbody on a vector that is not one of the global x/y/z axes? I've tried a host of other solutions like setting the transform of the player towards at the center of the spline but I can't seem to get it without feeling very jerky. Using forces makes the player "rubber band" towards and past the center back and forth. Maybe there is something in my math wrong. In any case, I'm hoping someone could help me make sure that the player will always stay on the center of the spline but only on the vector to the sides of the player's face direction, so that it doesn't mess with jumping. Thank you very much in advance!
For potential future visitors, I have figured this out. There are a few components (and a lot more if you're trying to do full spline based physics, but just to start with movement...)
First we must orient our character, so that our local coordinate system can be referenced with transform.right etc. Luckily this package provides these functions which return useful vectors. I'm sure there is math beyond me to do this otherwise if you are building your own spline system.
void SetLookRotation()
{
// get nearest TF and point on spline
Vector3 p;
playerTF = currentSpline.GetNearestPointTF(transform.localPosition, out p);
// Get forward and up vectors of point on spline
_localHorizontal = currentSpline.GetTangentFast(playerTF);
_localVertical = currentSpline.GetOrientationUpFast(playerTF);
// Set look rotation to path
transform.rotation = Quaternion.LookRotation(Vector3.Cross(_localHorizontal, _localVertical), _localVertical);
}
Here I am setting a velocity directly but if you're using forces it's the same principle.
if (Mathf.Abs(localVelocityAs_X) <= maxDashSpeed * Mathf.Abs(moveInput))
{
Vector3 m = transform.right * maxDashSpeed * moveInput;
rb.velocity = RemoveCrossVelocity(m);
}
localVelocityAs_X is defined as (ran in fixedUpdate/ physics step):
float currLocalVelocityX = (playerTF - prevPositionX) / Time.deltaTime;
localVelocityAs_X = Mathf.Lerp(localVelocityAs_X, currLocalVelocityX, 0.5f);
prevPositionX = playerTF;
Where playerTF is your position on a spline (in this case, using the curvy spline package from the unity asset store. Those spline positions return very small floats so in my case I multiplied playerTF by around 10,000 to make it a more easily readable metric). This is essentially just manually calculating velocity of the player each frame by comparing current position on the spline to last frame's.
RemoveCrossVelocity is the same as above. Comment explanations should suffice.
Vector3 RemoveCrossVelocity(Vector3 v)
{
// get magnitude going in the cross product / perpendicular of local horizontal and local vertical vectors
// (essentially the magnitude on "local Z" of the player)
Vector3 crossVelocity = Vector3.Project(v, Vector3.Cross(transform.right, transform.up));
// and remove it from the vector
return v -= crossVelocity;
}
Finally the solution to the drift. My crude fix was essentially to just adjust the player to the center of the spline every frame. Horizontally, there is no change because it grabs the closest spline point which is calculated by this package to be sort of a float clamped between the start and end of the spline. Vertically, we are being set to the distance the player is from the spline in the local up direction - a fancy way of saying we're not moving vertically at all. The reason this must be done is to avoid the spline vertical position overwriting the players, and we obviously can't set this vector back to playerPos.y in our local coordinate space, so we must resort to using a direction vector * the distance from our everchanging floor. This isn't absolutely ideal at the end of the day, but it works, and there isn't any extra jitter from it (interpolate on your player's rigidbody and some camera dampening helps). All in all these together combine to make a player able to accelerate quickly around sharp corners of a spline with physics and intertia will never cause the player to fly off or drift from the center. Take that, rocket physics!
void ResetPlayerToSpline()
{
Vector3 P; //closest spline point to player
float pTf = currentSpline.GetNearestPointTF(transform.position, out P);
playerHeight = Vector3.Distance(transform.position, P);
transform.position = P + (transform.up * Vector3.Distance(transform.position, P));
}
Ultimately for those possibly looking to do some kind of implementation in the future, the biggest thing you'll run into is a lack of cardinal direction, global oriented axis-based functions and properties normally provided by a game engine. For a primer, here are a few I would use (not including gravity, which is simply opposite your up vector times whatever magnitude):
This one allows you to create a vector using x and y like normal (and z in theory) and run this function to convert it when you actually use the vector in a local space. That way, you don't have to try and think in directions without names. You can still think of things in terms of x and y:
Vector3 ConvertWorldToLocalVector(Vector3 v)
{
Vector3 c;
c = transform.right * v.x + transform.up * v.y;
return c;
}
This is basically the same as what is happening in RemoveCrossVelocity(), but it's important to reiterate this is how you set velocity in a direction to 0. The second part shows how to get velocity in a certain vector.
void Velocity_ZeroY()
{
rb.velocity -= GetLocalVerticalVelocity();
}
public Vector3 GetLocalVerticalVelocity()
{
return Vector3.Project(rb.velocity, _localVertical);
}
Getting height, since you cannot just compare y positions:
height = Vector3.Distance(transform.position, P);
I think that's all the good stuff I can think of. I noticed a severe lack of resources for created spline based physics movement in games, and I'm guessing now it's based on the fact that this was quite an undertaking. It has since been brought to my attention that the game "Pandemonium"(1996) is a curvy 3d spline based sidescrolling platformer - just like mine! The main difference seems to be that it isn't at all based on physics, and I'm not sure from what I can tell if it has pitch changes and gravity to compliment. Hope this helps someone someday, and thank you to those who contributed to the discussion.

Transform kinect v2 coordinate to another coordinate

I actually measured (x,y) joint position that related to a human skeleton in the sagittal plan using Kinect v2 camera. Now, I want to create the angle between Kinect v2 and skeleton direction of motion( like in this figure: http://www.mediafire.com/file/7wf8890ngnmi1d4/kinect.pdf ).
How can I measure the joint position relative to a coordinate fixed on certain join on the skeleton like SpineBase position using MATLAB??
what is the transformation required to do that?
I have no kinect available right now, but here is the theory how I would tackle this:
First of you seem to already be able access the different joint coordinates, so you have sth like this:
if (body.IsTracked)
{
Joint spineMid = body.Joints[JointType.SpineMid];
float x = spineMid.Position.X;
float y = spineMid.Position.Y;
float z = spineMid.Position.Z;
}
This gives us a spineMid point with x,y,z. Each frame we compare that spineMid point to the spinMid point from last frame (and save it afterwards for the comparison in the next frame). Lets call these points P_new and P_old. To get the direction Vector we just subtract the two like so:
p_dir = P_new - P_old
now we have to get the angle between this direction vector and the vector "out" of the kinect which is <0,0,1> with the kinect coordinate system. But given your drawing we need to use z_dir = <0,0,-1>.
By using the unit vector of p_dir, lets call it p_dir_unit, we can use the dot product to get the angle between z_dir and p_dir_unit.
theta = acos(z_dir * p_dir_unit)
If you only need the direction in the x,z plane, you can just set the y value for p_dir to 0 and get the unit vector from that vector. From the absolute length of p_dir you can also get information on how quick the body is moving.
Hope that helps.

3D animation with bones and their matrices

I really am asking this as a last resort. I haven't been able to solve this for 2 days now. So if someone knows a thing or two about 3D, Matrices and animation I would really appreciate your input.
I have downloaded this project and implemented it into my own project: http://xbox.create.msdn.com/en-US/education/catalog/sample/skinned_model.
The character in my game move his hands as he casts a spell. I have successfully made this animation and imported it into the project. But I need to spawn particles inside the palms of his hands which move according to an animation. All I need is the 3D position of the palms of his hands after the animation has been applied.
Picture of hands during the animation:
http://s18.postimg.org/qkaipufa1/hands.png
If you take a look at the skinned model sample project: Class: AnimationPlayer.cs you will notice that it processes the matrices 3 times:
public void Update(TimeSpan time, bool relativeToCurrentTime,
Matrix rootTransform)
{
UpdateBoneTransforms(time, relativeToCurrentTime);
UpdateWorldTransforms(rootTransform);
UpdateSkinTransforms();
}
And allows us to access them after each of the steps:
/// Gets the current bone transform matrices, relative to their parent bones.
/// </summary>
public Matrix[] GetBoneTransforms()
{
return boneTransforms;
}
/// <summary>
/// Gets the current bone transform matrices, in absolute format.
/// </summary>
public Matrix[] GetWorldTransforms()
{
return worldTransforms;
}
/// <summary>
/// Gets the current bone transform matrices,
/// relative to the skinning bind pose.
/// </summary>
public Matrix[] GetSkinTransforms()
{
return skinTransforms;
}
I should also mention that I know the index of the bone in the palm and the index of all its parents:
10 - 11's parent, Root bone
11 - 12's parent
12 - 13's parent
13 - 14's parent
14 - 15's parent
15 - 16's parent
16 - The bone in the palm
As far as I understand this project is that all of the GetXXXXXXXX Commands I listed above return an array of Matrix[] ordered according to the index of the bone. So to get the Transform of Bone 10. I believe the code will look like:
Matrix[] M = animtionplayer.GetSkinTransforms();
Matrix transform = M[10];
OK, now for the parts I don't understand.
I don't know which of the 3 GetXXXXXXXXX functions I need to use to get the palms position.
I think the way the shaders calculate the position of the bones is by multiplying them by each of their parent bones. So:
Matrix[] M = animtionplayer.GetBoneTransforms();
Matrix transform = M[10];
transform = transform * M[11];
transform = transform * M[12];
transform = transform * M[13];
transform = transform * M[14];
transform = transform * M[15];
transform = transform * M[16];
//maybe apply the world position of the model?
transform = transform * MyWorld;
And then maybe to get a vector3 position.
Vector3 HandPosition = transform.Up;
Well when I try the solution above I get mixed results. With certain bones it moves correctly for the middle section of the animation. But honestly nothing good. Can someone explain whats going on here? How do I get the position of the bone in the palm? I'm really in the dark here. I only learnt what a matrix was 2 months ago, and animation with bones only this week.
Alright so this bug was quiet frustrating so I took a break from coding for about a week. I started working on it again 2 days ago. Trying many random things.... looking at the values of the matrices looking for patterns and I finally got it.
Here is an image where the hands are animated and all the bones are highlighted perfectly with big white spheres. And it follows the animation perfectly too!
http://s27.postimg.org/f48i7ae4x/2014_07_30_18_H14_M20_S.jpg
The code:
Matrix[] BoneTransforms = animtionPlayer.GetWorldTransforms();
for (int i = 0; i < BoneTransforms.Length; i++)
{
Vector3 v3 = new Vector3(BoneTransforms[i].M41, BoneTransforms[i].M42, BoneTransforms[i].M43);
v3 = Vector3.Transform(v3, GetWorld(Position, Rotation, Scale));//GetWorld is the models world matrix: position, rotation and scale.
Game1.DrawMarker(v3); //v3 is the position of the current Bone.
}
I just made that code by watching all of the values of the specific Matrix. I learnt that M41, M42, M43 are the X, Y, Z cords from the specific bone.
Reopen your mesh designer, put an empty in his hands about where the particle effect should be, parent the empty to the "palm" bones and keep track of what direction in local coords the empty is facing so you can face the particles in the right direction, rebake the animation and reimport it. I will admit I'm not familiar with the "mesh skinner" you're using - but I would imagine it supports child objects of bones like empties as they're frequently employed for such purposes. You're overthinking this. If you can't do this with the X-box kit, I don't know.
A "transform matrix" isn't necessarily the location of the bone - so I'm confused about what these functions return. They probably just return the matrix it uses to calculate the position of the vertices which are weighted to those bones. There isn't a function to return the "world position" of a palm bone and simply track your animation to the world position of the empty or palm bone? I'm confused what you need to use the transform matrices for. If the library you're using is good, there will almost certainly be a way to get a bone's world coordinate in which case adding a "tracking bone" just outside the palm will let you track animations or objects to that bone or even directly parent them.

Perspective Projection

Few days ago I decided to start in 3D programming and came across perspective projection.
I use the following code to get the matrix:
public static Matrix3D ProjectionMatrix(double angle, double aspect, double near, double far)
{
double size = near * Math.Tan(MathUtils.DegreeToRadian(angle) / 2.0);
double left = -size, right = size, bottom = -size / aspect, top = size / aspect;
Matrix3D m = new Matrix3D(new double[,] {
{2*near/(right-left),0,(right + left)/(right - left),0},
{0,2*near/(top-bottom),(top+bottom)/(top-bottom),0},
{0,0,-(far+near)/(far-near),-(2 * far * near) / (far - near)},
{0,0,-1,0}
});
return m;
}
I use the following code for the camera:
Matrix3D Camera
{
get
{
Vector3D cameraZAxis = -this.LookDirection;
cameraZAxis.Normalize();
Vector3D cameraXAxis = Vector3D.CrossProduct(this.UpDirection, cameraZAxis);
cameraXAxis.Normalize();
Vector3D cameraYAxis = Vector3D.CrossProduct(cameraZAxis, cameraXAxis);
Vector3D cameraPosition = (Vector3D)this.Position;
double offsetX = -Vector3D.DotProduct(cameraXAxis, cameraPosition);
double offsetY = -Vector3D.DotProduct(cameraYAxis, cameraPosition);
double offsetZ = -Vector3D.DotProduct(cameraZAxis, cameraPosition);
return new Matrix3D(new double[,]{{cameraXAxis.X, cameraYAxis.X, cameraZAxis.X, 0},
{cameraXAxis.Y, cameraYAxis.Y, cameraZAxis.Y, 0},
{cameraXAxis.Z, cameraYAxis.Z, cameraZAxis.Z, 0},
{offsetX, offsetY, offsetZ, 1}});
}
}
However, I don't know how to get the Model or World Matrix, also is there anything wrong with the previous code?
A Matrix is used to transform an object from one "space" into another. Think of a Model which is a cube, the model center is 0,0,0 and each corner is at an extent of 5. Now to transform that into your world you would apply a Worldmatrix with the transformation to put these model coordinates into your world.
A translation for example which should "move" the model to 5, 5, 5. Now just think of adding this position to your model and all its points, and now these new coordinates are called "in world space". Now the model is, via your World matrix placed in the world at position 5,5,5 (your former model center resides now in your world at this position).
The view matrix is a little bit tricky to understand, but the easiest is to think of "We can't move a camera, so we move all ojects in the opposite direction so it looks like we moved the camera". This sounds difficult, but in fact its the same like transforming from model space to world space.
Now we transform from World space into view space. Finally, we need to get it on the screen. Obviously your screen is 2D and you still have a 3D scene, now you need a way to transform your 3D objects into your view space. This can be achieved with different kinds of projection. The two important ones are orthogonal projection and perspective projection. In fact this uses the Z component( which we don't have on a 2D surface like a screen) and the screen resolution to "project" our visible world into our screen space.
All this is easily accomplished using matrices. One for each transformation is a good start. To be fair, you don't need any of it, you could directly supply your data in screen space, but this wouldn't be practical. If you don't need a World matrix for example, which happens if your model is modeled so that it could be used directly, you would use an Matrix.Identity which, can roughly be interpreted as multiplying a number by 1 which gives you the same number. Also an Identity Matrix for the view is like having the camera placed in the world on 0,0,0 and looking down the Z axis (which axis heavly depends on the coordinate system you use)
To get the final matrix for a shader you usually pass all these matrices (or a combined one via multiplication) to your shader. If you use the fixed function pipeline there are usually methods to supply them. The projection is usually fixed, but could be changed to apply some visual effects like zooming a sniper rifle or a fisheye effect. The camera matrix of course is used to move and rotate the camera. And the world matrix is used to position your objects in the scene, move players, rotate doors etc.
Disclaimer: This is how i understood the whole thing for myself, so it is by no means a mathematical correct explanation, but maybe it is of any help for the OP.
To get a better understanding of the whole subject you can read this.

Basic render 3D perspective projection onto 2D screen with camera (without opengl)

Let's say I have a data structure like the following:
Camera {
double x, y, z
/** ideally the camera angle is positioned to aim at the 0,0,0 point */
double angleX, angleY, angleZ;
}
SomePointIn3DSpace {
double x, y, z
}
ScreenData {
/** Convert from some point 3d space to 2d space, end up with x, y */
int x_screenPositionOfPt, y_screenPositionOfPt
double zFar = 100;
int width=640, height=480
}
...
Without screen clipping or much of anything else, how would I calculate the screen x,y position of some point given some 3d point in space. I want to project that 3d point onto the 2d screen.
Camera.x = 0
Camera.y = 10;
Camera.z = -10;
/** ideally, I want the camera to point at the ground at 3d space 0,0,0 */
Camera.angleX = ???;
Camera.angleY = ????
Camera.angleZ = ????;
SomePointIn3DSpace.x = 5;
SomePointIn3DSpace.y = 5;
SomePointIn3DSpace.z = 5;
ScreenData.x and y is the screen x position of the 3d point in space. How do I calculate those values?
I could possibly use the equations found here, but I don't understand how the screen width/height comes into play. Also, I don't understand in the wiki entry what is the viewer's position vers the camera position.
http://en.wikipedia.org/wiki/3D_projection
The 'way it's done' is to use homogenous transformations and coordinates. You take a point in space and:
Position it relative to the camera using the model matrix.
Project it either orthographically or in perspective using the projection matrix.
Apply the viewport trnasformation to place it on the screen.
This gets pretty vague, but I'll try and cover the important bits and leave some of it to you. I assume you understand the basics of matrix math :).
Homogenous Vectors, Points, Transformations
In 3D, a homogenous point would be a column matrix of the form [x, y, z, 1]. The final component is 'w', a scaling factor, which for vectors is 0: this has the effect that you can't translate vectors, which is mathematically correct. We won't go there, we're talking points.
Homogenous transformations are 4x4 matrices, used because they allow translation to be represented as a matrix multiplication, rather than an addition, which is nice and quick for your videocard. Also convenient because we can represent successive transformations by multiplying them together. We apply transformations to points by performing transformation * point.
There are 3 primary homogeneous transformations:
Translation,
Rotation, and
Scaling.
There are others, notably the 'look at' transformation, which are worth exploring. However, I just wanted to give a brief list and a few links. Successive application of moving, scaling and rotating applied to points is collectively the model transformation matrix, and places them in the scene, relative to the camera. It's important to realise what we're doing is akin to moving objects around the camera, not the other way around.
Orthographic and Perspective
To transform from world coordinates into screen coordinates, you would first use a projection matrix, which commonly, come in two flavors:
Orthographic, commonly used for 2D and CAD.
Perspective, good for games and 3D environments.
An orthographic projection matrix is constructed as follows:
Where parameters include:
Top: The Y coordinate of the top edge of visible space.
Bottom: The Y coordinate of the bottom edge of the visible space.
Left: The X coordinate of the left edge of the visible space.
Right: The X coordinate of the right edge of the visible space.
I think that's pretty simple. What you establish is an area of space that is going to appear on the screen, which you can clip against. It's simple here, because the area of space visible is a rectangle. Clipping in perspective is more complicated because the area which appears on screen or the viewing volume, is a frustrum.
If you're having a hard time with the wikipedia on perspective projection, Here's the code to build a suitable matrix, courtesy of geeks3D
void BuildPerspProjMat(float *m, float fov, float aspect,
float znear, float zfar)
{
float xymax = znear * tan(fov * PI_OVER_360);
float ymin = -xymax;
float xmin = -xymax;
float width = xymax - xmin;
float height = xymax - ymin;
float depth = zfar - znear;
float q = -(zfar + znear) / depth;
float qn = -2 * (zfar * znear) / depth;
float w = 2 * znear / width;
w = w / aspect;
float h = 2 * znear / height;
m[0] = w;
m[1] = 0;
m[2] = 0;
m[3] = 0;
m[4] = 0;
m[5] = h;
m[6] = 0;
m[7] = 0;
m[8] = 0;
m[9] = 0;
m[10] = q;
m[11] = -1;
m[12] = 0;
m[13] = 0;
m[14] = qn;
m[15] = 0;
}
Variables are:
fov: Field of view, pi/4 radians is a good value.
aspect: Ratio of height to width.
znear, zfar: used for clipping, I'll ignore these.
and the matrix generated is column major, indexed as follows in the above code:
0 4 8 12
1 5 9 13
2 6 10 14
3 7 11 15
Viewport Transformation, Screen Coordinates
Both of these transformations require another matrix matrix to put things in screen coordinates, called the viewport transformation. That's described here, I won't cover it (it's dead simple).
Thus, for a point p, we would:
Perform model transformation matrix * p, resulting in pm.
Perform projection matrix * pm, resulting in pp.
Clipping pp against the viewing volume.
Perform viewport transformation matrix * pp, resulting is ps: point on screen.
Summary
I hope that covers most of it. There are holes in the above and it's vague in places, post any questions below. This subject is usually worthy of a whole chapter in a textbook, I've done my best to distill the process, hopefully to your advantage!
I linked to this above, but I strongly suggest you read this, and download the binary. It's an excellent tool to further your understanding of theses transformations and how it gets points on the screen:
http://www.songho.ca/opengl/gl_transform.html
As far as actual work, you'll need to implement a 4x4 matrix class for homogeneous transformations as well as a homogeneous point class you can multiply against it to apply transformations (remember, [x, y, z, 1]). You'll need to generate the transformations as described above and in the links. It's not all that difficult once you understand the procedure. Best of luck :).
#BerlinBrown just as a general comment, you ought not to store your camera rotation as X,Y,Z angles, as this can lead to an ambiguity.
For instance, x=60degrees is the same as -300 degrees. When using x,y and z the number of ambiguous possibilities are very high.
Instead, try using two points in 3D space, x1,y1,z1 for camera location and x2,y2,z2 for camera "target". The angles can be backward computed to/from the location/target but in my opinion this is not recommended. Using a camera location/target allows you to construct a "LookAt" vector which is a unit vector in the direction of the camera (v'). From this you can also construct a LookAt matrix which is a 4x4 matrix used to project objects in 3D space to pixels in 2D space.
Please see this related question, where I discuss how to compute a vector R, which is in the plane orthogonal to the camera.
Given a vector of your camera to target, v = xi, yj, zk
Normalise the vector, v' = xi, yj, zk / sqrt(xi^2 + yj^2 + zk^2)
Let U = global world up vector u = 0, 0, 1
Then we can compute R = Horizontal Vector that is parallel to the camera's view direction R = v' ^ U,
where ^ is the cross product, given by
a ^ b = (a2b3 - a3b2)i + (a3b1 - a1b3)j + (a1b2 - a2b1)k
This will give you a vector that looks like this.
This could be of use for your question, as once you have the LookAt Vector v', the orthogonal vector R you can start to project from the point in 3D space onto the camera's plane.
Basically all these 3D manipulation problems boil down to transforming a point in world space to local space, where the local x,y,z axes are in orientation with the camera. Does that make sense? So if you have a point, Q=x,y,z and you know R and v' (camera axes) then you can project it to the "screen" using simple vector manipulations. The angles involved can be found out using the dot product operator on Vectors.
Following the wikipedia, first calculate "d":
http://upload.wikimedia.org/wikipedia/en/math/6/0/b/60b64ec331ba2493a2b93e8829e864b6.png
In order to do this, build up those matrices in your code. The mappings from your examples to their variables:
θ = Camera.angle*
a = SomePointIn3DSpace
c = Camera.x | y | z
Or, just do the equations separately without using matrices, your choice:
http://upload.wikimedia.org/wikipedia/en/math/1/c/8/1c89722619b756d05adb4ea38ee6f62b.png
Now we calculate "b", a 2D point:
http://upload.wikimedia.org/wikipedia/en/math/2/5/6/256a0e12b8e6cc7cd71fa9495c0c3668.png
In this case ex and ey are the viewer's position, I believe in most graphics systems half the screen size (0.5) is used to make (0, 0) the center of the screen by default, but you could use any value (play around). ez is where the field of view comes into play. That's the one thing you were missing. Choose a fov angle and calculate ez as:
ez = 1 / tan(fov / 2)
Finally, to get bx and by to actual pixels, you have to scale by a factor related to the screen size. For example, if b maps from (0, 0) to (1, 1) you could just scale x by 1920 and y by 1080 for a 1920 x 1080 display. That way any screen size will show the same thing. There are of course many other factors involved in an actual 3D graphics system but this is the basic version.
Converting points in 3D-space into a 2D point on a screen is simply made by using a matrix. Use a matrix to calculate the screen position of your point, this saves you a lot of work.
When working with cameras you should consider using a look-at-matrix and multiply the look at matrix with your projection matrix.
Assuming the camera is at (0, 0, 0) and pointed straight ahead, the equations would be:
ScreenData.x = SomePointIn3DSpace.x / SomePointIn3DSpace.z * constant;
ScreenData.y = SomePointIn3DSpace.y / SomePointIn3DSpace.z * constant;
where "constant" is some positive value. Setting it to the screen width in pixels usually gives good results. If you set it higher then the scene will look more "zoomed-in", and vice-versa.
If you want the camera to be at a different position or angle, then you will need to move and rotate the scene so that the camera is at (0, 0, 0) and pointed straight ahead, and then you can use the equations above.
You are basically computing the point of intersection between a line that goes through the camera and the 3D point, and a vertical plane that is floating a little bit in front of the camera.
You might be interested in just seeing how GLUT does it behind the scenes. All of these methods have similar documentation that shows the math that goes into them.
The three first lectures from UCSD might be very helful, and contain several illustrations on this topic, which as far as I can see is what you are really after.
Run it thru a ray tracer:
Ray Tracer in C# - Some of the objects he has will look familiar to you ;-)
And just for kicks a LINQ version.
I'm not sure what the greater purpose of your app is (you should tell us, it might spark better ideas), but while it is clear that projection and ray tracing are different problem sets, they have a ton of overlap.
If your app is just trying to draw the entire scene, this would be great.
Solving problem #1: Obscured points won't be projected.
Solution: Though I didn't see anything about opacity or transparency on the blog page, you could probably add these properties and code to process one ray that bounced off (as normal) and one that continued on (for the 'transparency').
Solving problem #2: Projecting a single pixel will require a costly full-image tracing of all pixels.
Obviously if you just want to draw the objects, use the ray tracer for what it's for! But if you want to look up thousands of pixels in the image, from random parts of random objects (why?), doing a full ray-trace for each request would be a huge performance dog.
Fortunately, with more tweaking of his code, you might be able to do one ray-tracing up front (with transparancy), and cache the results until the objects change.
If you're not familiar to ray tracing, read the blog entry - I think it explains how things really work backwards from each 2D pixel, to the objects, then the lights, which determines the pixel value.
You can add code so as intersections with objects are made, you are building lists indexed by intersected points of the objects, with the item being the current 2d pixel being traced.
Then when you want to project a point, go to that object's list, find the nearest point to the one you want to project, and look up the 2d pixel you care about. The math would be far more minimal than the equations in your articles. Unfortunately, using for example a dictionary of your object+point structure mapping to 2d pixels, I am not sure how to find the closest point on an object without running through the entire list of mapped points. Although that wouldn't be the slowest thing in the world and you could probably figure it out, I just don't have the time to think about it. Anyone?
good luck!
"Also, I don't understand in the wiki entry what is the viewer's position vers the camera position" ... I'm 99% sure this is the same thing.
You want to transform your scene with a matrix similar to OpenGL's gluLookAt and then calculate the projection using a projection matrix similar to OpenGL's gluPerspective.
You could try to just calculate the matrices and do the multiplication in software.

Categories

Resources