I was wondering if it is possible to find the dimensions (in pixels) of a cube/cuboid in an image like the one shown below?
I know it's nearly impossible because there is no information about the depth, the viewing angle, etc. But can one at least find the appropriate corners of the cube so that the length, width and height can be approximated?
Any kind of help or information would be appreciated.
Thanks in advance.
I guess I could suggest a solution to the "at least" part of the question. You can find the corners of the cube by finding the lines in the image.
Firstly, find the edges in the image. If the target images are as plain and clear as the provided one, finding edges should be straightforward. Use cv::Canny().
cv::Mat img = cv::imread("cube.png");
cv::Mat edges;
cv::Canny(img, edges, 20, 60);
Secondly, in the edges image, detect the straight lines. Use either cv::HoughLines() or cv::HoughLinesP(). Here, I proceed with the former one:
std::vector<cv::Vec2f> lines;
cv::HoughLines(edges, lines, 0.6, CV_PI / 120, 50);
Please refer to the OpenCV documentation on Hough lines. I also took the code for the visualization from there.
The cv::HoughLines() function detects straight lines and, for each line, returns 2 values (ρ - distance, and θ - rotation angle) which define this line's equation in polar coordinates. This function will often return several lines for one source edge (as it did for a couple of lines here). In our case, we can remove such duplicates by filtering out lines with very close ρ values.
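A minimal sketch of that de-duplication step (the ρ and θ tolerances are arbitrary values picked for this demo, so tune them for your images; for brevity it also ignores the θ wrap-around at π):
std::vector<cv::Vec2f> deduped;
const float rho_tol = 10.0f;                               // pixels; arbitrary demo value
const float theta_tol = static_cast<float>(CV_PI / 60);    // radians; arbitrary demo value
for (const auto& candidate : lines) {
    bool duplicate = false;
    for (const auto& kept : deduped)
        if (std::abs(candidate[0] - kept[0]) < rho_tol &&
            std::abs(candidate[1] - kept[1]) < theta_tol)
            duplicate = true;
    if (!duplicate)
        deduped.push_back(candidate);
}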
The benefit of our case is that the sides of the cube responsible for each dimension (length, width, and height) will have the same rotation angle θ in the found line equations. For instance, we can expect the vertical sides of the cube (responsible for the height dimension) to remain vertical and have their θ close to 0 or π (see the OpenCV documentation). We can find such lines in the vector of detected Hough lines:
std::vector<cv::Vec2f> vertical_lines;
std::copy_if(lines.begin(), lines.end(), std::back_inserter(vertical_lines), [](cv::Vec2f line) {
//copy if θ is near 0 or CV_PI
return ((0 < line[1]) && (line[1] < 0 + CV_PI / 10)) ||
((line[1] < CV_PI) && (line[1] > CV_PI - CV_PI / 10));
});
The same reasoning applies to finding the lines for the rest of the cube sides. Just filter the found Hough lines by appropriate θ.
Now that we have the equations of the lines of interest, we can find their corresponding edge pixels (the code below is not optimal, just a demo):
std::vector<cv::Point> non_zero_points;
cv::findNonZero(edges, non_zero_points);
std::vector<std::vector<cv::Point>> corresponding_points(vertical_lines.size());
for (size_t i = 0; i < vertical_lines.size(); ++i) {
    for (const auto& point : non_zero_points) {
        if (std::abs(cos(vertical_lines[i][1]) * point.x +
                     sin(vertical_lines[i][1]) * point.y -
                     vertical_lines[i][0]) < 2)
            corresponding_points[i].push_back(point);
    }
}
Now, for each found cluster, find the top-most and bottom-most points (or the left-most/right-most points for the other sides) and you get your cube corners.
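For completeness, a sketch of that last step for the vertical clusters (for the other sides you would compare x instead of y):
std::vector<std::pair<cv::Point, cv::Point>> vertical_segments;
for (const auto& cluster : corresponding_points) {
    if (cluster.empty())
        continue;
    // top-most / bottom-most pixel of this cluster (smallest / largest y)
    auto ends = std::minmax_element(cluster.begin(), cluster.end(),
        [](const cv::Point& a, const cv::Point& b) { return a.y < b.y; });
    vertical_segments.emplace_back(*ends.first, *ends.second);
}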
Please note the pixel I denoted by exclamation marks. It got accidentally sorted into one of the vertical Hough lines, but it actually belongs to a non-vertical top side. It needs to be removed, either by some outlier detection or by a different approach to the corresponding-pixel search.
About retrieving the actual lengths of the sides: to my knowledge, it is really a non-trivial problem. Maybe this SO question would be a good place to start.
I'm working on a personal project involving Bezier curves. I have seen many implementations that handle quadratic and cubic curves individually, but I'm looking to create a more generalized algorithm where I can add and remove control points by increasing and decreasing the degree (order) of the curve.
This is not a part of my main question, but if anyone knows an example of a generalized algorithm that I can look at, I would be grateful if they could point me in that direction.
First of all, I understand that any conversion from a low order to a high order curve is an approximation, and not equivalent. I am satisfied with computationally "close enough".
Secondly, I also understand that there is a loss of information when stepping down from a higher order to a lower order curve. This is unavoidable as the higher order curve has more "bends" that the lower order curve simply cannot approximate.
I'm fine with these limitations as it is necessary to be able to add and subtract control points within the curve for desired functionality.
My first question is related to this one asked approximately 5 years ago:
Convert quadratic curve to cubic curve
A quadratic curve has three (3) control points:
Vector3 Start;
Vector3 Control1;
Vector3 End;
To convert to a cubic, we introduce a fourth control point...
Vector3 Control2;
..."Between" Control1 and End. We then set Control1 and Control2 accordingly:
Control2 = new Vector3(
End.x + ((2.0f / 3.0f) * (Control1.x - End.x)),
End.y + ((2.0f / 3.0f) * (Control1.y - End.y)),
End.z + ((2.0f / 3.0f) * (Control1.z - End.z))
);
Control1 = new Vector3(
Start.x + ((2.0f / 3.0f) * (Control1.x - Start.x)),
Start.y + ((2.0f / 3.0f) * (Control1.y - Start.y)),
Start.z + ((2.0f / 3.0f) * (Control1.z - Start.z))
);
I am not certain that this is correct. In the example, only the 'x' component was set. I merely extrapolated 'y' and 'z' from it. If this is not correct, then I'd appreciate knowing what is correct.
This example only covers converting from quadratic to cubic curves. The controlling value appears to be the (2.0f/3.0f) in setting the coordinates. So, does that mean that converting from cubic to quartic would be (2.0f/4.0f) or (3.0f/4.0f) or something else entirely? What is the properly generalized function for this conversion for an arbitrary order of curve?
My project is also going to be working with splines. I am using an edge-to-edge method for constructing my splines, where an edge is defined as a curve of arbitrary order from 1 (a line) to n, the first and last Vector3 in the list of control points are the start and end points of the edge, and connecting edges share the end point of the previous edge with the start point of the next.
Unlike Bezier Curves, I'm not adding and subtracting control points. Instead, I'm dividing and combining edges. What would be a good method for subdividing a curve into two lower order curves with an order no lower than 2 approximating the original curve? Likewise, what would be a good method for combining two curves into a single high order curve?
I realize this is a lot to ask in a single post, but I felt it better to keep it together in one topic rather than dividing it up.
Thanks for any help!
You'll want to read over http://pomax.github.io/bezierinfo/#reordering for how to raise a curve to a higher order, which is almost trivial:
That may look scary, but if you actually look at what each of the three terms is, all three are stupidly simple bits of grade school maths. You typically just want the new weights, so generating those as a new list of coordinates is absolutely trivial and can be written in about a minute. In JavaScript:
function getOneHigher(x, y) {
  const k = x.length;
  const s = k - 1;
  let nx = [], ny = [];
  nx[0] = x[0]; nx[k] = x[s];
  ny[0] = y[0]; ny[k] = y[s];
  for (let i = 1; i < k; i++) {
    nx[i] = ((k - i) * x[i] + i * x[i - 1]) / k;
    ny[i] = ((k - i) * y[i] + i * y[i - 1]) / k;
  }
  return {nx, ny};
}
And we're done. Calling getOneHigher(x, y) will yield two arrays of new coordinates for the same curve, represented as a Bezier curve of a degree one higher than we put in. Note that what this means is that we've kept the set of curve coordinates the same (literally: the functions are identical) while reducing the tangents at each coordinate (because the derivatives are not identical), effectively "slowing down" anything that travels the path that this curve traces.
You'll then also want to follow the link to Sirver's Castle in that section which explains how to do the lossy conversion down in a way that minimizes loss of information. And this should have been multiple questions: this is a Q&A site, future visitors can't find answers to specific questions if you don't ask specific questions but instead combine several technically unrelated ones in a single post.
Let's assume we have a bezier curve with a start p0 of (0, 0) and an end p3 of (100, 0). Right now it would basically be a line with no curve yet. Now let's assume I want to calculate the two missing control points (p1, p2) based on a given angle. What is the best way to achieve this?
Let's assume I wanted something like this:
https://1.bp.blogspot.com/_W3ZUYKgeEpk/SDcAerq1xkI/AAAAAAAAAAc/W9OnovkzgPI/s400/RectanglularControlPoly.jpg
I mean, depending on the position of the control points it forms a triangle of some sort; that is why I was wondering if it's possible.
Control points that go through a Bezier point with a given angle lie on the tangent with that angle.
The resulting bend will be softer the farther away the control point is chosen, so there are many different solutions with the same angle and different curvatures.
To find control points with equally soft curvatures for two Bezier points, simply find the crossing of the two tangents! Use the crossing as the common control point for both segments, i.e. have C1 = C2.
For any sort of symmetrical curve you need to keep the deviations from the crossing symmetrical, i.e. 50%, 10%, etc.
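A rough sketch of that tangent-crossing step, assuming each tangent is given as a point plus a direction vector (Vec2 and IntersectTangents are hypothetical helpers, not GDI+ calls):
#include <cmath>

struct Vec2 { float x, y; };

// Intersect two tangent lines, each given as a point plus a direction.
// Returns false if the tangents are (nearly) parallel.
bool IntersectTangents(Vec2 p1, Vec2 d1, Vec2 p2, Vec2 d2, Vec2& crossing)
{
    float denom = d1.x * d2.y - d1.y * d2.x;   // 2D cross product of the directions
    if (std::abs(denom) < 1e-6f)
        return false;                          // parallel: no single crossing
    float t = ((p2.x - p1.x) * d2.y - (p2.y - p1.y) * d2.x) / denom;
    crossing = { p1.x + t * d1.x, p1.y + t * d1.y };
    return true;
}
The crossing can then be used directly as the shared control point (C1 = C2), or you can back off from it toward each end point by the same fraction for a softer, symmetrical bend.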
Note that for optimizing the overall shape one also needs to look at the neighbouring points; in general the provided GDI function does a good job, so it is worth considering simply adding more Bezier points for controlling the shape; but of course using the perfect set of control points is the most economical solution.
Update: I have added an example of how well a circle (orange) gets approximated by the math in this interesting post.
Short version: An exact solution isn't really possible, but the best fit for a quarter circle is to move the control point to ~55% of the way to the crossing point (d = r * 4 * (sqrt(2) - 1) / 3 ≈ 0.5523 * r). Sometimes, instead of a 4-segment solution, an 8-segment solution is used for an even closer approximation.
private void button_Click(object sender, EventArgs e)
{
int w = Math.Abs(P2.Left - P1.Left);
int h = Math.Abs(P2.Top - P1.Top);
C2.Left = (int) (P2.Left + w * 0.5523f);
C2.Top = P2.Top;
C1.Left = P1.Left;
C1.Top = (int) (P1.Top + h * 0.5523f);
C1.Parent.Invalidate();
}
The code uses Labels for the points and control points.
Btw: Adding ellipses/circles to a GraphicsPath will create bezier curves that seem to be approximated just like that.
I am trying to write an algorithm (in C#) that will stitch two or more unrelated heightmaps together so there is no visible seam between the maps. Basically I want to mimic the functionality found on this page:
http://www.bundysoft.com/wiki/doku.php?id=tutorials:l3dt:stitching_heightmaps
(You can just look at the pictures to get the gist of what I'm talking about)
I also want to be able to take a single heightmap and alter it so it can be tiled, in order to create an endless world (All of this is for use in Unity3d). However, if I can stitch multiple heightmaps together, I should be able to easily modify the algorithm to act on a single heightmap, so I am not worried about this part.
Any kind of guidance would be appreciated, as I have searched and searched for a solution without success. Just a simple nudge in the right direction would be greatly appreciated! I understand that many image manipulation techniques can be applied to heightmaps, but I have been unable to find an image processing algorithm that produces the results I'm looking for. For instance, image stitching appears to only work for images that have overlapping fields of view, which is not the case with unrelated heightmaps.
Would utilizing an FFT low-pass filter in some way work, or would that only be useful in generating a single tileable heightmap?
Because the algorithm is to be used in Unity3d, any C# code will have to be confined to .Net 3.5, as I believe that's the latest version Unity uses.
Thanks for any help!
Okay, it seems I was on the right track with my previous attempts at solving this problem. My initial attempt at stitching the heightmaps together involved the following steps for each point on the heightmap:
1) Find the average between a point on the heightmap and its opposite point. The opposite point is simply the first point reflected across either the x axis (if stitching horizontal edges) or the z axis (for the vertical edges).
2) Find the new height for the point using the following formula:
newHeight = oldHeight + (average - oldHeight)*((maxDistance-distance)/maxDistance);
Where distance is the distance from the point on the heightmap to the nearest horizontal or vertical edge (depending on which edge you want to stitch). Any point with a distance less than maxDistance (which is an adjustable value that affects how much of the terrain is altered) is adjusted based on this formula.
That was the old formula, and while it produced really nice results for most of the terrain, it was creating noticeable lines in the areas between the region of altered heightmap points and the region of unaltered heightmap points. I realized almost immediately that this was occurring because the slope of the altered regions was too steep in comparison to the unaltered regions, thus creating a noticeable contrast between the two. Unfortunately, I went about solving this issue the wrong way, looking for solutions on how to blur or smooth the contrasting regions together to remove the line.
After very little success with smoothing techniques, I decided to try and reduce the slope of the altered region, in the hope that it would better blend with the slope of the unaltered region. I am happy to report that this has improved my stitching algorithm greatly, removing 99% of the lines reported above.
The main culprit from the old formula was this part:
(maxDistance-distance)/maxDistance
which was producing a value between 0 and 1 linearly based on the distance of the point to the nearest edge. As the distance between the heightmap points and the edge increased, the heightmap points would utilize less and less of the average (as defined above), and shift more and more towards their original values. This linear interpolation was the cause of the too-steep slope, but luckily I found a built-in method in the Mathf class of Unity's API that allows for smooth, cubic interpolation. This is the SmoothStep Method.
Using this method (I believe a similar method can be found in the XNA framework, found here), the change in how much of the average is used in determining a heightmap value becomes very severe in the middle distances, but that severity tapers off as the distance gets closer to maxDistance, creating a less severe slope that better blends with the slope of the unaltered region. The new formula looks something like this:
//Using Mathf - Unity only?
float weight = Mathf.SmoothStep(1f, 0f, distance/maxDistance);
//Using XNA
float weight = MathHelper.SmoothStep(1f, 0f, distance/maxDistance);
//If you can't use either of the two methods above
float input = distance/maxDistance;
float weight = 1f + (-1f)*(3f*(float)Math.Pow(input, 2f) - 2f*(float)Math.Pow(input, 3f));
//Then calculate the new height using this weight
newHeight = oldHeight + (average - oldHeight)*weight;
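Putting the pieces together, a full pass might look roughly like this. This is a C++ sketch of the approach described above for the single tileable-map case (mirroring across the vertical edges); the grid layout and function name are just for illustration, and the same logic ports directly to C# for Unity:
#include <algorithm>
#include <vector>

// Blend a square heightmap toward its mirror across the vertical edges so
// the map tiles with itself; maxDistance controls the width of the blend.
void StitchVerticalEdges(std::vector<std::vector<float>>& heights, int maxDistance)
{
    const int size = static_cast<int>(heights.size());
    std::vector<std::vector<float>> original = heights;  // read from a copy, write in place
    for (int z = 0; z < size; ++z) {
        for (int x = 0; x < size; ++x) {
            int distance = std::min(x, size - 1 - x);    // distance to the nearest vertical edge
            if (distance >= maxDistance)
                continue;                                // outside the blend region
            float oldHeight = original[z][x];
            float opposite  = original[z][size - 1 - x]; // the point reflected across the edge
            float average   = 0.5f * (oldHeight + opposite);
            float t = distance / static_cast<float>(maxDistance);
            float weight = 1.0f - (3.0f * t * t - 2.0f * t * t * t); // same curve as SmoothStep(1, 0, t)
            heights[z][x] = oldHeight + (average - oldHeight) * weight;
        }
    }
}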
There may be even better interpolation methods that produce better stitching. I will certainly update this question if I find such a method, so anyone else looking to do heightmap stitching can find the information they need. Kudos to rincewound for being on the right track with linear interpolation!
What is done in the images you posted looks a lot like simple linear interpolation to me.
So basically: You take two images (Left, Right) and define a stitching region. For linear interpolation you could take the leftmost pixel of the left image (in the stitching region) and the rightmost pixel of the right image (also in the stitching region). Then you fill the space in between with interpolated values.
Take this example - I'm using a single line here to show the idea:
Left = [11,11,11,10,10,10,10]
Right= [01,01,01,01,02,02,02]
Let's say our overlap is 4 pixels wide:
Left = [11,11,11,10,10,10,10]
Right= [01,01,01,01,02,02,02]
^ ^ ^ ^ overlap/stitching region.
The leftmost value of the left image would be 10
The rightmost value of the right image would be 1.
Now we interpolate linearly between 10 and 1 with two intermediate steps, so our new stitching region looks as follows
stitch = [10, 07, 04, 01]
We end up with the following stitched line:
line = [11,11,11,10,07,04,01,02,02,02]
If you apply this to two complete images you should get a result similar to what you posted before.
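If it helps, here is a small sketch of that per-row interpolation in C++ (StitchRow is a hypothetical helper; the rows are assumed to overlap by exactly "overlap" samples, and the same logic ports directly to C#):
#include <vector>

// Stitch one row of the left map to one row of the right map by filling the
// shared overlap region with values interpolated linearly from the left
// row's first overlap value to the right row's last overlap value.
std::vector<float> StitchRow(const std::vector<float>& left,
                             const std::vector<float>& right,
                             int overlap)
{
    std::vector<float> row(left.begin(), left.end() - overlap);  // left part, overlap excluded
    float from = left[left.size() - overlap];                    // leftmost value in the stitching region
    float to   = right[overlap - 1];                             // rightmost value in the stitching region
    for (int i = 0; i < overlap; ++i) {
        float t = (overlap == 1) ? 0.0f : i / static_cast<float>(overlap - 1);
        row.push_back(from + (to - from) * t);                   // linear interpolation across the region
    }
    row.insert(row.end(), right.begin() + overlap, right.end()); // right part, overlap excluded
    return row;
}
With the example above (overlap = 4) this reproduces the stitched line [11,11,11,10,07,04,01,02,02,02].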
Let's say I have a data structure like the following:
Camera {
double x, y, z
/** ideally the camera angle is positioned to aim at the 0,0,0 point */
double angleX, angleY, angleZ;
}
SomePointIn3DSpace {
double x, y, z
}
ScreenData {
/** Convert from some point 3d space to 2d space, end up with x, y */
int x_screenPositionOfPt, y_screenPositionOfPt
double zFar = 100;
int width=640, height=480
}
...
Without screen clipping or much of anything else, how would I calculate the screen x,y position of some point given some 3d point in space? I want to project that 3d point onto the 2d screen.
Camera.x = 0
Camera.y = 10;
Camera.z = -10;
/** ideally, I want the camera to point at the ground at 3d space 0,0,0 */
Camera.angleX = ???;
Camera.angleY = ????
Camera.angleZ = ????;
SomePointIn3DSpace.x = 5;
SomePointIn3DSpace.y = 5;
SomePointIn3DSpace.z = 5;
ScreenData.x and y are the screen position of the 3d point in space. How do I calculate those values?
I could possibly use the equations found here, but I don't understand how the screen width/height comes into play. Also, I don't understand in the wiki entry what the viewer's position is versus the camera position.
http://en.wikipedia.org/wiki/3D_projection
The 'way it's done' is to use homogeneous transformations and coordinates. You take a point in space and:
Position it relative to the camera using the model matrix.
Project it either orthographically or in perspective using the projection matrix.
Apply the viewport transformation to place it on the screen.
This gets pretty vague, but I'll try and cover the important bits and leave some of it to you. I assume you understand the basics of matrix math :).
Homogenous Vectors, Points, Transformations
In 3D, a homogeneous point would be a column matrix of the form [x, y, z, 1]. The final component is 'w', a scaling factor, which for vectors is 0: this has the effect that you can't translate vectors, which is mathematically correct. We won't go there; we're talking points.
Homogeneous transformations are 4x4 matrices, used because they allow translation to be represented as a matrix multiplication, rather than an addition, which is nice and quick for your video card. Also convenient because we can represent successive transformations by multiplying them together. We apply transformations to points by performing transformation * point.
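As a concrete illustration of transformation * point, here is a minimal sketch (column-major storage to match the projection-matrix code further down; not a full matrix class):
// Multiply a 4x4 homogeneous transformation (column-major) by a
// homogeneous point [x, y, z, w], giving the transformed point.
void TransformPoint(const float m[16], const float p[4], float out[4])
{
    for (int row = 0; row < 4; ++row)
        out[row] = m[row]      * p[0] +
                   m[row + 4]  * p[1] +
                   m[row + 8]  * p[2] +
                   m[row + 12] * p[3];
}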
There are 3 primary homogeneous transformations:
Translation,
Rotation, and
Scaling.
There are others, notably the 'look at' transformation, which are worth exploring. However, I just wanted to give a brief list and a few links. Successive application of moving, scaling and rotating applied to points is collectively the model transformation matrix, and places them in the scene, relative to the camera. It's important to realise what we're doing is akin to moving objects around the camera, not the other way around.
Orthographic and Perspective
To transform from world coordinates into screen coordinates, you would first use a projection matrix, which commonly comes in two flavors:
Orthographic, commonly used for 2D and CAD.
Perspective, good for games and 3D environments.
An orthographic projection matrix is built from the bounds of the visible region (a code sketch follows below), where the parameters include:
Top: The Y coordinate of the top edge of visible space.
Bottom: The Y coordinate of the bottom edge of the visible space.
Left: The X coordinate of the left edge of the visible space.
Right: The X coordinate of the right edge of the visible space.
I think that's pretty simple. What you establish is an area of space that is going to appear on the screen, which you can clip against. It's simple here, because the visible area of space is a rectangular box. Clipping in perspective is more complicated because the area which appears on screen, or the viewing volume, is a frustum.
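For reference, here is a sketch of how the standard OpenGL-style orthographic matrix is built from those parameters (same column-major indexing as the perspective code below; near/far planes included so depth also lands in the clip range):
void BuildOrthoProjMat(float *m, float left, float right,
                       float bottom, float top,
                       float znear, float zfar)
{
    for (int i = 0; i < 16; ++i)                 // start from all zeros
        m[i] = 0.0f;
    m[0]  = 2.0f / (right - left);               // scale x into [-1, 1]
    m[5]  = 2.0f / (top - bottom);               // scale y into [-1, 1]
    m[10] = -2.0f / (zfar - znear);              // scale z into [-1, 1]
    m[12] = -(right + left) / (right - left);    // translate the centre to the origin
    m[13] = -(top + bottom) / (top - bottom);
    m[14] = -(zfar + znear) / (zfar - znear);
    m[15] = 1.0f;
}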
If you're having a hard time with the Wikipedia article on perspective projection, here's the code to build a suitable matrix, courtesy of geeks3D:
void BuildPerspProjMat(float *m, float fov, float aspect,
float znear, float zfar)
{
float xymax = znear * tan(fov * PI_OVER_360);
float ymin = -xymax;
float xmin = -xymax;
float width = xymax - xmin;
float height = xymax - ymin;
float depth = zfar - znear;
float q = -(zfar + znear) / depth;
float qn = -2 * (zfar * znear) / depth;
float w = 2 * znear / width;
w = w / aspect;
float h = 2 * znear / height;
m[0] = w;
m[1] = 0;
m[2] = 0;
m[3] = 0;
m[4] = 0;
m[5] = h;
m[6] = 0;
m[7] = 0;
m[8] = 0;
m[9] = 0;
m[10] = q;
m[11] = -1;
m[12] = 0;
m[13] = 0;
m[14] = qn;
m[15] = 0;
}
Variables are:
fov: Field of view; note the PI_OVER_360 factor above, so this is given in degrees (45 degrees, i.e. pi/4 radians, is a good value).
aspect: Ratio of width to height.
znear, zfar: used for clipping, I'll ignore these.
and the matrix generated is column major, indexed as follows in the above code:
0 4 8 12
1 5 9 13
2 6 10 14
3 7 11 15
Viewport Transformation, Screen Coordinates
Both of these transformations require another matrix to put things in screen coordinates, called the viewport transformation. That's described here, I won't cover it (it's dead simple).
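If you would rather see it without the matrix, the effect of the viewport transformation on a point that has already been divided by w (normalized device coordinates in [-1, 1]) is roughly this sketch (the y flip is a convention choice):
// Map normalized device coordinates (x, y in [-1, 1], after dividing by w)
// to pixel coordinates for a width x height viewport.
void ViewportTransform(float ndc_x, float ndc_y, int width, int height,
                       int& screen_x, int& screen_y)
{
    screen_x = static_cast<int>((ndc_x * 0.5f + 0.5f) * width);
    screen_y = static_cast<int>((1.0f - (ndc_y * 0.5f + 0.5f)) * height); // flip y: pixel rows grow downward
}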
Thus, for a point p, we would:
Perform model transformation matrix * p, resulting in pm.
Perform projection matrix * pm, resulting in pp.
Clip pp against the viewing volume.
Perform viewport transformation matrix * pp, resulting in ps: the point on screen.
Summary
I hope that covers most of it. There are holes in the above and it's vague in places, post any questions below. This subject is usually worthy of a whole chapter in a textbook, I've done my best to distill the process, hopefully to your advantage!
I linked to this above, but I strongly suggest you read this, and download the binary. It's an excellent tool to further your understanding of these transformations and how it gets points onto the screen:
http://www.songho.ca/opengl/gl_transform.html
As far as actual work, you'll need to implement a 4x4 matrix class for homogeneous transformations as well as a homogeneous point class you can multiply against it to apply transformations (remember, [x, y, z, 1]). You'll need to generate the transformations as described above and in the links. It's not all that difficult once you understand the procedure. Best of luck :).
@BerlinBrown - just as a general comment, you ought not to store your camera rotation as X, Y, Z angles, as this can lead to ambiguity.
For instance, x = 60 degrees is the same as -300 degrees. When using x, y and z, the number of ambiguous possibilities is very high.
Instead, try using two points in 3D space, x1,y1,z1 for camera location and x2,y2,z2 for camera "target". The angles can be backward computed to/from the location/target but in my opinion this is not recommended. Using a camera location/target allows you to construct a "LookAt" vector which is a unit vector in the direction of the camera (v'). From this you can also construct a LookAt matrix which is a 4x4 matrix used to project objects in 3D space to pixels in 2D space.
Please see this related question, where I discuss how to compute a vector R, which is in the plane orthogonal to the camera.
Given a vector from your camera to the target, v = xi + yj + zk
Normalise the vector: v' = (xi + yj + zk) / sqrt(x^2 + y^2 + z^2)
Let U = the global world up vector, u = (0, 0, 1)
Then we can compute R, the horizontal vector that is orthogonal to the camera's view direction: R = v' ^ U,
where ^ is the cross product, given by
a ^ b = (a2b3 - a3b2)i + (a3b1 - a1b3)j + (a1b2 - a2b1)k
This will give you a vector that looks like this.
This could be of use for your question, as once you have the LookAt vector v' and the orthogonal vector R, you can start to project the point in 3D space onto the camera's plane.
Basically all these 3D manipulation problems boil down to transforming a point in world space to local space, where the local x,y,z axes are in orientation with the camera. Does that make sense? So if you have a point, Q=x,y,z and you know R and v' (camera axes) then you can project it to the "screen" using simple vector manipulations. The angles involved can be found out using the dot product operator on Vectors.
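Here is a rough sketch of those vector manipulations (the Vec3 type, ProjectToScreen and focal are hypothetical names; U is the camera up axis R x v' completing the basis, and the divide at the end is the same perspective step used in the other answers):
struct Vec3 { double x, y, z; };

double Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 Sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }

// Express world point Q in the camera's local axes, then do the usual
// perspective divide. R = right, U = up (R x v'), V = look direction v',
// eye = camera location. "focal" plays the role of ez in the other answer.
bool ProjectToScreen(const Vec3& Q, const Vec3& eye,
                     const Vec3& R, const Vec3& U, const Vec3& V,
                     double focal, double& sx, double& sy)
{
    Vec3 rel = Sub(Q, eye);     // vector from the camera to the point
    double x = Dot(rel, R);     // local camera-space coordinates
    double y = Dot(rel, U);
    double z = Dot(rel, V);     // depth along the view direction
    if (z <= 0)
        return false;           // behind the camera
    sx = focal * x / z;         // perspective divide
    sy = focal * y / z;
    return true;
}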
Following the Wikipedia article, first calculate "d":
http://upload.wikimedia.org/wikipedia/en/math/6/0/b/60b64ec331ba2493a2b93e8829e864b6.png
In order to do this, build up those matrices in your code. The mappings from your examples to their variables:
θ = Camera.angle*
a = SomePointIn3DSpace
c = Camera.x | y | z
Or, just do the equations separately without using matrices, your choice:
http://upload.wikimedia.org/wikipedia/en/math/1/c/8/1c89722619b756d05adb4ea38ee6f62b.png
Now we calculate "b", a 2D point:
http://upload.wikimedia.org/wikipedia/en/math/2/5/6/256a0e12b8e6cc7cd71fa9495c0c3668.png
In this case ex and ey are the viewer's position, I believe in most graphics systems half the screen size (0.5) is used to make (0, 0) the center of the screen by default, but you could use any value (play around). ez is where the field of view comes into play. That's the one thing you were missing. Choose a fov angle and calculate ez as:
ez = 1 / tan(fov / 2)
Finally, to get bx and by to actual pixels, you have to scale by a factor related to the screen size. For example, if b maps from (0, 0) to (1, 1) you could just scale x by 1920 and y by 1080 for a 1920 x 1080 display. That way any screen size will show the same thing. There are of course many other factors involved in an actual 3D graphics system but this is the basic version.
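Putting those last two steps together, a minimal sketch (the point is assumed to already be in camera space, i.e. the camera sits at the origin looking down +z; ProjectPoint and the exact sign and offset conventions are just illustrative, so play with them):
#include <cmath>

// Project a camera-space point to pixels. ez comes from the field of view;
// the 0.5 offset recentres (0, 0) at the middle of the screen, as mentioned above.
void ProjectPoint(double x, double y, double z, double fovRadians,
                  int screenWidth, int screenHeight,
                  int& px, int& py)
{
    double ez = 1.0 / std::tan(fovRadians / 2.0); // viewer distance from the image plane
    double bx = (x / z) * ez;                     // in [-1, 1] for points inside the fov
    double by = (y / z) * ez;
    px = static_cast<int>((bx * 0.5 + 0.5) * screenWidth);
    py = static_cast<int>((0.5 - by * 0.5) * screenHeight); // flip y: pixel rows grow downward
}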
Converting a point in 3D space into a 2D point on a screen is simply done by using a matrix. Use a matrix to calculate the screen position of your point; this saves you a lot of work.
When working with cameras you should consider using a look-at matrix and multiplying it with your projection matrix.
Assuming the camera is at (0, 0, 0) and pointed straight ahead, the equations would be:
ScreenData.x = SomePointIn3DSpace.x / SomePointIn3DSpace.z * constant;
ScreenData.y = SomePointIn3DSpace.y / SomePointIn3DSpace.z * constant;
where "constant" is some positive value. Setting it to the screen width in pixels usually gives good results. If you set it higher then the scene will look more "zoomed-in", and vice-versa.
If you want the camera to be at a different position or angle, then you will need to move and rotate the scene so that the camera is at (0, 0, 0) and pointed straight ahead, and then you can use the equations above.
You are basically computing the point of intersection between a line that goes through the camera and the 3D point, and a vertical plane that is floating a little bit in front of the camera.
You might be interested in just seeing how GLUT does it behind the scenes. All of these methods have similar documentation that shows the math that goes into them.
The first three lectures from UCSD might be very helpful, and they contain several illustrations on this topic, which as far as I can see is what you are really after.
Run it through a ray tracer:
Ray Tracer in C# - Some of the objects he has will look familiar to you ;-)
And just for kicks a LINQ version.
I'm not sure what the greater purpose of your app is (you should tell us, it might spark better ideas), but while it is clear that projection and ray tracing are different problem sets, they have a ton of overlap.
If your app is just trying to draw the entire scene, this would be great.
Solving problem #1: Obscured points won't be projected.
Solution: Though I didn't see anything about opacity or transparency on the blog page, you could probably add these properties and code to process one ray that bounced off (as normal) and one that continued on (for the 'transparency').
Solving problem #2: Projecting a single pixel will require a costly full-image tracing of all pixels.
Obviously if you just want to draw the objects, use the ray tracer for what it's for! But if you want to look up thousands of pixels in the image, from random parts of random objects (why?), doing a full ray-trace for each request would be a huge performance dog.
Fortunately, with more tweaking of his code, you might be able to do one ray-tracing up front (with transparency), and cache the results until the objects change.
If you're not familiar with ray tracing, read the blog entry - I think it explains how things really work backwards from each 2D pixel, to the objects, then the lights, which determines the pixel value.
You can add code so that, as intersections with objects are made, you build lists indexed by the intersected points of the objects, with each item being the current 2D pixel being traced.
Then when you want to project a point, go to that object's list, find the nearest point to the one you want to project, and look up the 2D pixel you care about. The math would be far more minimal than the equations in your articles. Unfortunately, using, for example, a dictionary of your object+point structure mapping to 2D pixels, I am not sure how to find the closest point on an object without running through the entire list of mapped points. Although that wouldn't be the slowest thing in the world and you could probably figure it out, I just don't have the time to think about it. Anyone?
good luck!
"Also, I don't understand in the wiki entry what is the viewer's position vers the camera position" ... I'm 99% sure this is the same thing.
You want to transform your scene with a matrix similar to OpenGL's gluLookAt and then calculate the projection using a projection matrix similar to OpenGL's gluPerspective.
You could try to just calculate the matrices and do the multiplication in software.