I am trying to make a little GUI library in C#, but after doing some research on matrix transformations I found out that I need a Matrix3x3 to store the rotation, scale, and translation of a Vector2. But C#'s System.Numerics only has a Matrix3x2 and a Matrix4x4. Could I use one of those instead? If so, how would I go about it? And why isn't there a Matrix3x3 in the standard library?
I am very new to matrix and vector programming, so sorry if this is a stupid question.
Thanks in advance.
You can use Matrix4x4. Start with an identity matrix and fill the top-left 3×3 elements with your matrix.
For example, to solve a 3×3 system of equations:
// Find the barycentric coordinates
//
// Solve for (wA,wB,wC):
// | px | | ax bx cx | | wA |
// | py | = | ay by cy | | wB |
// | pz | | az bz cz | | wC |
// A, B, C and P are System.Numerics.Vector3 values.
var m = new Matrix4x4(
    A.X, B.X, C.X, 0,
    A.Y, B.Y, C.Y, 0,
    A.Z, B.Z, C.Z, 0,
    0,   0,   0,   1);
if (Matrix4x4.Invert(m, out Matrix4x4 u))
{
    var w = new Vector3(
        u.M11*P.X + u.M12*P.Y + u.M13*P.Z,
        u.M21*P.X + u.M22*P.Y + u.M23*P.Z,
        u.M31*P.X + u.M32*P.Y + u.M33*P.Z);
    // ...
}
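For the original 2D GUI use case, note that Matrix3x2 already covers rotation, scale, and translation of a Vector2: the dropped third column of the full 3×3 affine matrix is always (0, 0, 1), so nothing is lost. A minimal sketch (the composition order here is just an assumption; adjust it to your conventions):

using System.Numerics;

// Compose scale, then rotation, then translation into one 2D affine transform.
Matrix3x2 transform =
    Matrix3x2.CreateScale(2.0f) *
    Matrix3x2.CreateRotation(0.5f) *                  // angle in radians
    Matrix3x2.CreateTranslation(new Vector2(10, 20));

// Apply the transform to a point.
Vector2 p = Vector2.Transform(new Vector2(1, 1), transform);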
As for the reasons: the intent of System.Numerics has to be related to computer graphics, since it uses homogeneous coordinates, in which 3D vectors contain 4 elements: three regular coordinates and a scalar weight factor. Homogeneous coordinates vastly simplify the math of computer graphics. The only reason there is a Vector3 is that a Vector4 should be treated as a vector of 3 elements plus a scalar, so Vector3 should be used when composing and decomposing homogeneous coordinates. This means not all 4 elements can be treated equally; sometimes you need to operate on the vector part (the first three elements) separately from the scalar value (the fourth element).
Also, System.Numerics uses single-precision float elements, which are almost never used in scientific computing but are applied universally in computer graphics for their speed. Hopefully, one day when the CLR supports AVX-512, there will be double-precision numeric intrinsics that scientists can actually use.
Related
I have a 3x3 matrix I'm using to track movement in 2D. I need to extract from that the translation, rotation and scale matrices. Can anyone suggest how I would do this? I've had no luck searching online so far (possibly I'm using the wrong terms).
This is just off the top of my head so there may be an error in here, but:
Assuming your matrix is row-major (just transpose everything if you're using column major):
| cos(t) -sin(t) 0 |
| sin(t) cos(t) 0 |
| tx ty 1 |
The translation vector is the last row of the matrix: [tx ty 1]. Extracting scale and rotation from a composed matrix is a bit trickier.
Looking at a 3x3 rotation matrix,
| cos(t) -sin(t) 0 |
| sin(t) cos(t) 0 |
| 0 0 1 |
And a scale matrix
| vx 0 0 |
| 0 vy 0 |
| 0 0 1 |
The combined rotation & scale matrix might look like this (ct = cos(t), st = sin(t)):
| vx*ct -vx*st 0 |
| vy*st vy*ct 0 |
| 0 0 1 |
For uniform scaling, vx=vy.
| v*ct -v*st 0 |
| v*st v*ct 0 |
| 0 0 1 |
Remembering the trig identity
ct^2 + st^2 = 1
we can see that
(v*ct)^2 + (v*st)^2 = v^2
or
v^2*ct^2 + v^2*st^2 = v^2
... all the terms of which are in the composite (scale, rotation, translation, or SRT for short) matrix.
So,
v = sqrt((v*ct)^2 + (v*st)^2)
or
v = sqrt(M[0,0]^2 + M[0,1]^2);
Theta, then, is just
t = acos(vct/v)
or
t = acos(M[0,0]/v)
or, this might be much easier, but I haven't tried it:
theta = atan2(vst,vct),
scale = sqrt(vst^2+vct^2)
where ^ is an exponent, not an XOR.
... and you can work out the rest.
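A minimal sketch of that extraction in C#, assuming the row-major layout above stored as a float[3,3] (the helper name is illustrative):

using System;

// Decompose a row-major 2D SRT matrix into translation, uniform scale
// and rotation angle, using the formulas derived above.
static (float tx, float ty, float scale, float theta) Decompose(float[,] m)
{
    float tx = m[2, 0], ty = m[2, 1];             // translation: last row
    float a = m[0, 0];                            // v*cos(t)
    float b = m[0, 1];                            // -v*sin(t)
    float scale = MathF.Sqrt(a * a + b * b);      // v = sqrt(M[0,0]^2 + M[0,1]^2)
    float theta = MathF.Atan2(m[1, 0], m[0, 0]);  // t = atan2(v*st, v*ct)
    return (tx, ty, scale, theta);
}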
It would be wiser to keep your scale, rotation and translation values around and use those values both to build the matrix and for whatever other tasks you require. Relying on the matrix as the only container for that information will eventually lead to compound floating-point errors and other drama.
Some notes:
Theta is an angle. It's a fun, easy-to-draw Greek symbol, and all us engineers love Greek characters. It's mostly Stockholm syndrome, since some of them look like 5's, and some of them are impossible to draw unless you're Greek.
This works great for 2D, where there is only one possible rotation axis. If you're working in three (or higher! It's a thing now!) dimensions, the concept is the same but the implementation becomes much, much messier.
st, ct, vst, vct, etc. are all contained in the composite matrix you have available. (Composite == concatenated == combined. The terminology in use depends on who you're reading.) The non-intuitive part is extracting the elements that you need from the matrix. This requires understanding what is in the matrix, where, and why.
This is a solvable problem. Normally, I'm a sit-down-and-start-coding kind of guy, and in 25 years of professional development, I'd have to say that it's worked out pretty well so far. But, there are times where a whiteboard or a notebook and a mechanical pencil are better friends than your keyboard.
For a column major matrix, given:
X vector of < Xx, Xy >;
Y vector of < Yx, Yy > and
Translation/position of < Tx, Ty >,
The corresponding column major transformation matrix is:
| Xx Yx Tx |
| Xy Yy Ty |
| 0  0  1  |
The translation is:
| 1 0 Tx |
| 0 1 Ty |
| 0 0 1  |
For the scaling, the scale factors for the X and Y vectors are:
Sx = sqrt(Xx^2 + Xy^2)
Sy = sqrt(Yx^2 + Yy^2)
The scaling transformation is then:
| Sx 0  0 |
| 0  Sy 0 |
| 0  0  1 |
For Rotation, the normalized vectors are:
X Vector: < Xx/Sx, Xy/Sx >
Y Vector: < Yx/Sy, Yy/Sy >
And the rotation transformation "R" is:
| Xx/Sx Yx/Sy 0 |
| Xy/Sx Yy/Sy 0 |
| 0     0     1 |
RECOMPOSITION:
Given translation "T", rotation "R", and scaling "S" transformation in column major order, the original matrix/transform can be recreated with:
T * R * S
Assumption: the original transformation has X and Y vectors that are:
Not parallel;
Not a length of zero;
Not mirrored; and
Not skewed/sheared.
The rotation can be further decomposed into an angle as described by 3Dave (which is row major, and needs to be transposed to be column major).
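A small sketch of that recomposition under the assumptions above, using plain 3×3 arrays since System.Numerics has no 3×3 type (the helper names are illustrative):

using System;

static class Srt
{
    // Multiply two 3x3 matrices stored as [row, column].
    static double[,] Mul(double[,] a, double[,] b)
    {
        var r = new double[3, 3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                for (int k = 0; k < 3; k++)
                    r[i, j] += a[i, k] * b[k, j];
        return r;
    }

    // Rebuild the original column-major transform as T * R * S.
    public static double[,] Recompose(double tx, double ty, double angle,
                                      double sx, double sy)
    {
        double c = Math.Cos(angle), s = Math.Sin(angle);
        var T = new double[,] { { 1, 0, tx }, { 0, 1, ty }, { 0, 0, 1 } };
        var R = new double[,] { { c, -s, 0 }, { s, c, 0 }, { 0, 0, 1 } };
        var S = new double[,] { { sx, 0, 0 }, { 0, sy, 0 }, { 0, 0, 1 } };
        return Mul(Mul(T, R), S);
    }
}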
So, I have a Direct2D Matrix3x2F that I use to store transformations on geometries. I want these transformations to be user-editable, and I don't want the user to have to edit a matrix directly. Is it possible to decompose a 3x2 matrix into scaling, rotation, skewing, and translation?
This is the solution I found for a Direct2D transformation matrix:
scale x = sqrt(M11 * M11 + M12 * M12)
scale y = sqrt(M21 * M21 + M22 * M22) * cos(shear)
rotation = atan2(M12, M11)
shear (y) = atan2(M22, M21) - PI/2 - rotation
translation x = M31
translation y = M32
If you multiply these values back together in the order scale(x, y) * skew(0, shear) * rotate(angle) * translate(x, y), you will get a matrix that performs an equivalent transformation.
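A direct transcription of those formulas, sketched against System.Numerics.Matrix3x2 (which has the same M11..M32 field layout as Direct2D's Matrix3x2F):

using System;
using System.Numerics;

// Decompose a 3x2 transform into scale, rotation, shear and translation,
// following the formulas above.
static void Decompose(Matrix3x2 m, out Vector2 scale, out float rotation,
                      out float shear, out Vector2 translation)
{
    rotation = MathF.Atan2(m.M12, m.M11);
    shear = MathF.Atan2(m.M22, m.M21) - MathF.PI / 2 - rotation;
    scale = new Vector2(
        MathF.Sqrt(m.M11 * m.M11 + m.M12 * m.M12),
        MathF.Sqrt(m.M21 * m.M21 + m.M22 * m.M22) * MathF.Cos(shear));
    translation = new Vector2(m.M31, m.M32);
}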
Decomposition
Yes, you can (at least partially). A 3x2 transform matrix represents a 2D homogeneous 3x3 transform matrix without projections. Such a transform matrix is either OpenGL style:
| Xx Yx Ox |
| Xy Yy Oy |
or DirectX style:
| Xx Xy |
| Yx Yy |
| Ox Oy |
As you tagged Direct2D and are using a 3x2 matrix, the second one is what you have. There are 3 vectors:
X=(Xx,Xy) X axis vector
Y=(Yx,Yy) Y axis vector
O=(Ox,Oy) Origin of coordinate system.
Now let's assume that there is no skew present and the matrix is orthogonal...
Scaling
is very simple: just obtain the lengths of the axis basis vectors.
scalex = sqrt( Xx^2 + Xy^2 );
scaley = sqrt( Yx^2 + Yy^2 );
If a scale coefficient is >1 the matrix scales up, and if <1 it scales down.
Rotation
You can use:
rotation_ang = atan2(Xy, Xx);
Translation
The offset is O, so if it is non-zero you have a translation present.
Skew
In 2D, skew does not complicate things too much, and the bullets above still apply (which is not the case in 3D). The skew angle is the angle between the axes minus 90 degrees, so:
skew_angle = acos( dot(X,Y) / (|X|*|Y|) ) - 0.5*PI;
skew_angle = acos( (Xx*Yx + Xy*Yy) / sqrt(( Xx^2 + Xy^2 )*( Yx^2 + Yy^2 )) ) - 0.5*PI;
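Sketched in C# from the three basis vectors (field names follow the DirectX-style layout above):

using System;
using System.Numerics;

// Extract scale, rotation and skew from the X and Y basis vectors of a
// DirectX-style 3x2 matrix (X = row 1, Y = row 2, O = row 3).
static (float scalex, float scaley, float rotation_ang, float skew_angle)
    FromBasis(Matrix3x2 m)
{
    var X = new Vector2(m.M11, m.M12);
    var Y = new Vector2(m.M21, m.M22);
    // O = (m.M31, m.M32) is the translation/origin.

    float scalex = X.Length();
    float scaley = Y.Length();
    float rotation_ang = MathF.Atan2(X.Y, X.X);  // angle of the X axis
    float skew_angle = MathF.Acos(Vector2.Dot(X, Y) / (scalex * scaley))
                     - 0.5f * MathF.PI;          // angle between axes minus 90°
    return (scalex, scaley, rotation_ang, skew_angle);
}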
Also beware: if your transform matrix does not represent your coordinate system but rather its inverse, then you need to invert your matrix before applying any of this...
So first compute the inverse of:
| Xx Xy 0 |
| Yx Yy 0 |
| Ox Oy 1 |
And apply the above on the result.
For more info about this topic see:
Understanding 4x4 homogenous transform matrices
Especially note the difference between column-major and row-major order (OpenGL vs. DirectX notation).
Store the primary transformations in a class with editable properties
scaling
rotation
skewing
translation
and then build the final transform matrix from those. It will be easier that way. However, if you must, there are algorithms for decomposing a matrix. They are not as simple as you might think.
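A minimal sketch of that idea (the class and property names are illustrative):

using System.Numerics;

// Keep the editable values as the source of truth and derive the matrix.
class Transform2D
{
    public Vector2 Scale { get; set; } = Vector2.One;
    public float Rotation { get; set; }      // radians
    public float SkewY { get; set; }         // radians
    public Vector2 Translation { get; set; }

    // Same composition order as the scale * skew * rotate * translate
    // decomposition described above.
    public Matrix3x2 ToMatrix() =>
        Matrix3x2.CreateScale(Scale) *
        Matrix3x2.CreateSkew(0, SkewY) *
        Matrix3x2.CreateRotation(Rotation) *
        Matrix3x2.CreateTranslation(Translation);
}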
System.Numerics has a method for decomposing 3D transform matrices
https://github.com/dotnet/corefx/blob/master/src/System.Numerics.Vectors/src/System/Numerics/Matrix4x4.cs#L1497
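That method is Matrix4x4.Decompose; a quick usage sketch:

using System.Numerics;

Matrix4x4 m = Matrix4x4.CreateScale(2) * Matrix4x4.CreateTranslation(1, 2, 3);
if (Matrix4x4.Decompose(m, out Vector3 scale, out Quaternion rotation,
                        out Vector3 translation))
{
    // scale == (2,2,2), rotation == identity, translation == (1,2,3)
}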
I have detected the face in an image (only one person) and have the coordinates of the face rectangle.
Since the image can be of any size, I need only the part of the image that is important (head, shoulders). What I intend to do is extend the bounds of the detected rectangle by some factor so that the important parts are included.
Is this the right approach?
Update:
I have tried this, but it's not giving the correct result. Note that I have changed 1.7 to 2 since the constructor only takes integer arguments, and that Top and Left are read-only properties.
foreach (Rectangle f in objects)
{
    int x, y;
    x = f.Top - (f.Height / 8);
    y = f.Left - (f.Width / 2);
    Rectangle myrect = new Rectangle(x, y, f.Width * 2, f.Height * 2);
    g.DrawRectangle(Pens.Gray, myrect);
}
Detected Face Rectangle
Top----->62
Right----->470
Left----->217
Bottom----->315
Extended Rectangle as per answer
Top----->91
Right----->537
Left----->31
Bottom----->597
Extended rectangle
As my previous answer was off-topic, I will write my correct answer here:
As I am not completely familiar with Emgu CV, I would suggest the following approaches:
As Emgu CV is open-source, you could spend restless nights changing the code inside the libraries and recompiling them, etc.
or (my preferred approach):
You think about it biologically, meaning:
You know the position and size of your face rectangle. If you also know body proportions, you can calculate the estimated width of the shoulders and their vertical offset (relative to the face's center).
More details for the biological approach:
Imagine fig. №1 being accurate, and imagine that you have the following image and face rectangle:
Bitmap
| .Width == 100
| .Height == 160
Face // type: System.Drawing.Rectangle
| .Top == 20
| .Left == 50
| .Width == 60
| .Height == 60
then, according to the provided image, the new Rectangle should be:
f := Face // face rectangle
Face_and_Shoulder
| .Top = f.Top - (f.Height / 8)
| .Left = f.Left - (f.Width / 2)
| .Width = f.Width * 2
| .Height = f.Height * 1.7
which would result in the following values:
Face_and_Shoulder
| .Top == 12.5
| .Left == 20
| .Width == 120
| .Height == 102
The resulted rectangle (Face_and_Shoulder) should include the shoulder and hair etc. when drawing it over your image.
This method has, however, a minor drawback: it will not work if the face is rotated by more than a certain number of degrees (I believe more than 5..10°).
To calculate the respective rectangle, I would advise you to use this code (you seem to have confused X and Y in your code sample):
foreach (Rectangle f in objects)
{
    float x = f.Left - (f.Width / 2f);
    float y = f.Top - (f.Height / 8f);
    Rectangle myrect = new Rectangle((int)x, (int)y, f.Width * 2, (int)(f.Height * 1.7));
    g.DrawRectangle(Pens.Gray, myrect);
}
fig. №1 (source: http://www.idrawdigital.com/wp-content/uploads/2009/01/prop_var.gif)
I would create a second bitmap and draw the first one into the second one as follows:
Bitmap source = Image.FromFile("/my/path/to/myimage.png") as Bitmap;
Rectangle facerectangle = /* your face detection logic */;
Bitmap target = new Bitmap(facerectangle.Width, facerectangle.Height);

using (Graphics g = Graphics.FromImage(target))
{
    g.DrawImage(source, new Rectangle(0, 0, target.Width, target.Height),
        facerectangle, GraphicsUnit.Pixel);
}
The code should be rather easy to understand :)
You first load your bitmap source, then create your rectangle using your face-recognition logic, and create the bitmap target, into which you draw the first segment using GDI+'s DrawImage.
I've been trying to figure this relationship out, but I can't; maybe I'm just not searching for the right thing. If I project a world-space coordinate to clip space using Vector3.Project, the X and Y coordinates make sense, but I can't figure out how it computes the Z (0..1) coordinate. For instance, if my near plane is 1 and my far plane is 1000, and I project a Vector3 of (0, 0, 500) (camera center, 50% of the distance to the far plane) to screen space, I get (1050, 500, 0.9994785).
The resulting X and Y coordinates make perfect sense but I don't understand where it's getting the resulting Z-value.
I need this because I'm actually trying to UNPROJECT screen-space coordinates, and I need to be able to pick a Z-value to tell it the distance from the camera at which I want the world-space coordinate to be, but I don't understand the relationship between clip-space Z (0..1) and world-space Z (nearplane..farplane).
In case this helps, my transformation matrices are:
World = Matrix.Identity;

// basically centered at 0,0,0 looking into the screen
View = Matrix.LookAtLH(
    new Vector3(0, 0, 0),  // camera position
    new Vector3(0, 0, 1),  // look target
    new Vector3(0, 1, 0)); // up vector

Projection = Matrix.PerspectiveFovLH(
    (float)(Math.PI / 4), // FieldOfViewY
    1.6f,                 // AspectRatio
    1,                    // NearPlane
    1000);                // FarPlane
Standard perspective projection creates a reciprocal relationship between the scene depth and the depth buffer value, not a linear one. This causes a higher percentage of buffer precision to be applied to objects closer to the near plane than those closer to the far plane, which is typically desired. As for the actual math, here's the breakdown:
The bottom-right 2x2 elements (corresponding to z and w) of the projection matrix are:
[far / (far - near) ] [1]
[-far * near / (far - near)] [0]
This means that after multiplying, z' = z * far / (far - near) - far * near / (far - near) and w' = z. After this step, there is the perspective divide, z'' = z' / w'.
In your specific case, the math works out to the value you got:
z = 500
z' = z * 1000 / (1000 - 1) - 1000 * 1 / (1000 - 1) = 499.499499499...
w' = z = 500
z'' = z' / w' = 0.998998998...
To recover the original depth, simply reverse the operations:
z = (far * near / (far - near)) / ((far / (far - near)) - z'')
(With near = 1, as in your case, the numerator reduces to far / (far - near).)
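A small sketch of that inversion (a hypothetical helper; the near and far defaults match the question's projection):

using System;

// Convert a depth-buffer value z'' in [0, 1] back to view-space depth,
// inverting z'' = (z * far/(far-near) - far*near/(far-near)) / z.
static float DepthToViewZ(float zBuffer, float near = 1f, float far = 1000f)
{
    float a = far / (far - near);        // slope term of the projection
    float b = far * near / (far - near); // offset term of the projection
    return b / (a - zBuffer);
}

// DepthToViewZ(0.998998998f) ≈ 500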
I'm looking for a simple algorithm that, given a rectangle with width w and height h, splits the rectangle into n rectangles of more or less equal size and shape, and calculates the centers of these rectangles.
EDIT: Forgot to mention that the shapes should be as similar as possible to a square.
Any hints on how to start?
A simple algorithm is to split vertically into n equal sized strips of height h and width w/n.
If you assume that the initial rectangle has corners (0,0) and (w,h) then using this algorithm the ith rectangle would have center (w / n * (i + ½), h/2), for 0 <= i < n.
Update: try finding all the factorizations of the number n into factor pairs (i, j) such that i * j = n, and pick the pair whose ratio is closest to the ratio of the rectangle's sides. Then use the two factors to create a regular grid of smaller rectangles (a sketch follows the example below).
For example when n is 10, you can choose between (1, 10), (2, 5), (5, 2) and (10, 1). Here is an example grid using the factors (5, 2):
------------------------------------
| | | | | |
| | | | | |
------------------------------------
| | | | | |
| | | | | |
------------------------------------
If your initial rectangle has width 60 and height 20 then using the factor pair (5, 2) will give ten rectangles of size (60/5, 20/2) = (12, 10) which is close to square.
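A minimal sketch of that factor-pair search (the names are illustrative):

using System;
using System.Collections.Generic;

// Split a w x h rectangle into n near-square cells and return their centers.
static List<(double x, double y)> SplitCenters(double w, double h, int n)
{
    // Pick the factor pair (cols, rows) whose cell aspect ratio is closest to 1.
    int bestCols = 1, bestRows = n;
    double bestScore = double.MaxValue;
    for (int cols = 1; cols <= n; cols++)
    {
        if (n % cols != 0) continue;
        int rows = n / cols;
        double score = Math.Abs(Math.Log((w / cols) / (h / rows))); // 0 if square
        if (score < bestScore) { bestScore = score; bestCols = cols; bestRows = rows; }
    }

    var centers = new List<(double, double)>();
    double cw = w / bestCols, ch = h / bestRows;
    for (int r = 0; r < bestRows; r++)
        for (int c = 0; c < bestCols; c++)
            centers.Add((cw * (c + 0.5), ch * (r + 0.5)));
    return centers;
}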