Facial detection coordinates using a camera - C#

I need a way to grab the coordinates of the face in C# for Windows Phone 8.1 in the camera view. I haven't been able to find anything on the web so I'm thinking it might not be possible. What I need is the x and y (and possibly area) of the "box" that forms around the face when it is detected in the camera view. Has anyone done this before?

Code snippet (bear in mind this is part of an app from the tutorial linked below the code; it's not copy-pasteable, but should provide some help):
const string MODEL_FILE = "haarcascade_frontalface_alt.xml";
FaceDetectionWinPhone.Detector m_detector;

public MainPage()
{
    InitializeComponent();
    m_detector = new FaceDetectionWinPhone.Detector(System.Xml.Linq.XDocument.Load(MODEL_FILE));
}

void photoChooserTask_Completed(object sender, PhotoResult e)
{
    if (e.TaskResult == TaskResult.OK)
    {
        BitmapImage bmp = new BitmapImage();
        bmp.SetSource(e.ChosenPhoto);
        WriteableBitmap btmMap = new WriteableBitmap(bmp);

        // find faces in the image
        List<FaceDetectionWinPhone.Rectangle> faces =
            m_detector.getFaces(btmMap, 10f, 1f, 0.05f, 1, false, false);

        // go through each face and draw a red rectangle on top of it
        foreach (var r in faces)
        {
            int x = Convert.ToInt32(r.X);
            int y = Convert.ToInt32(r.Y);
            int width = Convert.ToInt32(r.Width);
            int height = Convert.ToInt32(r.Height);
            // fill from the top-left corner (x, y) to (x + width, y + height)
            btmMap.FillRectangle(x, y, x + width, y + height, System.Windows.Media.Colors.Red);
        }

        // update the bitmap before drawing it
        btmMap.Invalidate();
        facesPic.Source = btmMap;
    }
}
This is taken from developer.nokia.com.
To do this in real time, you need to intercept the viewfinder image, perhaps using the NewCameraFrame method (EDIT: I'm not sure whether you should use this method or PhotoCamera.GetPreviewBufferArgb32 as described below; I have to leave that up to your research).
So basically your task has two parts:
1. Get the viewfinder image.
2. Detect faces on it (using something like the code above).
If I were you, I'd first do step 2 on an image loaded from disk, and once you can detect faces on that, I'd see how to obtain the current viewfinder image and detect faces on that. X and Y coordinates are easy enough to obtain once you've detected the face - see the code above.
(EDIT): I think you should try the PhotoCamera.GetPreviewBufferArgb32 method to obtain the viewfinder image - see the MSDN documentation. Also, be sure to search through the MSDN docs and tutorials. This should be more than enough to complete step 1.
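For illustration, here's a rough sketch of what step 1 might look like. It assumes an already-initialized PhotoCamera instance (the DetectFacesInViewfinder name is mine) and reuses m_detector and the getFaces parameters from the snippet above - treat it as a starting point, not tested code:
void DetectFacesInViewfinder(PhotoCamera camera)
{
    int w = (int)camera.PreviewResolution.Width;
    int h = (int)camera.PreviewResolution.Height;

    // copy the current viewfinder frame into an ARGB pixel buffer
    int[] pixels = new int[w * h];
    camera.GetPreviewBufferArgb32(pixels);

    // wrap the buffer in a WriteableBitmap so the detector can consume it
    WriteableBitmap frame = new WriteableBitmap(w, h);
    pixels.CopyTo(frame.Pixels, 0);
    frame.Invalidate();

    // same detector and parameters as in the photo-based code above;
    // each rectangle exposes the X, Y, Width and Height you're after
    List<FaceDetectionWinPhone.Rectangle> faces =
        m_detector.getFaces(frame, 10f, 1f, 0.05f, 1, false, false);
}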
A lot of face detection algorithms use Haar classifiers, the Viola-Jones algorithm, etc. If you're familiar with those, you'll feel more confident in what you're doing, but you can do without. Also, read the materials I linked - they seem fairly good.

Related

Android: Get correct coordinates with scaled and translated canvas

I am writing an Android (Xamarin) application which is able to zoom and pan an image. A user can also click on a position on the image. I need those coordinates on the image for later use.
The following code is zooming and panning the image:
protected override void OnDraw(Canvas canvas)
{
    base.OnDraw(canvas);
    _maxX = canvas.Width;
    _maxY = canvas.Height;
    canvas.Translate(_posX, _posY);
    if (_scaleDetector.IsInProgress)
        canvas.Scale(_scaleFactor, _scaleFactor, _scaleDetector.FocusX, _scaleDetector.FocusY);
    else
        canvas.Scale(_scaleFactor, _scaleFactor, _lastGestureX, _lastGestureY);
}
So far so good. I also have a MotionEvent in use, from a LongPressListener. I wrote the following code to translate the MotionEvent coordinates to coordinates on the image:
var actualX = e.GetX() - (_parent._posX / _parent._scaleFactor);
var actualY = e.GetY() - (_parent._posY / _parent._scaleFactor);
e in this case comes from the frame of the image. The frame holds the image (which is _parent), and the user can drag the image; _parent._posX/Y change when that happens. The user can also zoom the image; that's the _scaleFactor.
So, when a user taps anywhere in e, I need to translate those coordinates to the image coordinates.
Those two lines of code work, but when the user zooms in, the coordinates are off, as you can see in the attached image: the red dots represent the calculated positions. The further the user zooms in, the further off the coordinates get. What's wrong with this calculation?
Try this:
var actualX = (e.GetX() - _parent._posX) / _parent._scaleFactor;
var actualY = (e.GetY() - _parent._posY) / _parent._scaleFactor;
I think your problem is that the Canvas is not getting updated; try calling Canvas.UpdateLayout after zooming.
I managed to fix it using a Matrix:
private float[] TranslateCoordinates(float[] coordinates)
{
    // copy the view's current matrix, apply the pan and zoom,
    // then invert it to map screen points back to image points
    var matrix = new Matrix(Matrix);
    matrix.PreScale(_scaleFactor, _scaleFactor);
    matrix.PreTranslate(_posX, _posY);
    matrix.Invert(matrix);
    matrix.MapPoints(coordinates);
    return coordinates;
}
The float[] contains the values of MotionEvent's GetX() and GetY().
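For completeness, a hypothetical call site (assuming the long-press handler lives in the same class as TranslateCoordinates):
public override void OnLongPress(MotionEvent e)
{
    // map the raw touch position back to image coordinates
    float[] imageCoords = TranslateCoordinates(new[] { e.GetX(), e.GetY() });
    float actualX = imageCoords[0];
    float actualY = imageCoords[1];
}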

aruco.net - How to find marker orientation

I am trying to use OpenCV.NET to read scanned forms. The problem is that sometimes the positions of the relevant regions of interest and the alignment may differ depending on the printer the form was printed from and on the way the user scanned it.
So I thought I could use an ArUco marker as a reference point as there are libraries (ArUco.NET) already built to recognize them. I was hoping to find out how much the ArUco code is rotated and then rotate the form backwards by that amount to make sure the text is straight. Then I can use the center of the ArUco code as a reference point to use OCR on specific regions on the form.
I am using the following code to get the OpenGL modelViewMatrix. However, it always seems to contain the same numbers no matter what angle the ArUco code is rotated to. I have only just started with all of these libraries, but I thought the modelViewMatrix would give me different values depending on the rotation of the marker. Why is it always the same?
Mat cameraMatrix = new Mat(3, 3, Depth.F32, 1);
Mat distortion = new Mat(1, 4, Depth.F32, 1);

using (Mat image2 = OpenCV.Net.CV.LoadImageM("./image.tif", LoadImageFlags.Grayscale))
{
    using (var detector = new MarkerDetector())
    {
        detector.ThresholdMethod = ThresholdMethod.AdaptiveThreshold;
        detector.Param1 = 7.0;
        detector.Param2 = 7.0;
        detector.MinSize = 0.01f;
        detector.MaxSize = 0.5f;
        detector.CornerRefinement = CornerRefinementMethod.Lines;
        var markerSize = 10;
        IList<Marker> detectedMarkers = detector.Detect(image2, cameraMatrix, distortion);
        foreach (Marker marker in detectedMarkers)
        {
            Console.WriteLine("Detected a marker top left at: " + marker[0].X + " " + marker[0].Y);
            // the upper 3x3 of the modelview matrix (elements 0,4,8,1,5,9,2,6,10)
            // is the rotation matrix
            double[] modelViewMatrix = marker.GetGLModelViewMatrix();
        }
    }
}
It looks like you have not initialized your camera parameters.
cameraMatrix and distortion are the intrinsic parameters of your camera. You can use OpenCV to find them.
This is for OpenCV 2.4, but it will help you understand the basics:
http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
Once you have calibrated the camera and filled in those two matrices, the model-view matrix should change with the marker's pose.
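To make that concrete, here is a sketch of the structure of those parameters; the numbers are placeholders rather than real calibration output, and how you copy them into OpenCV.Net's Mat depends on that library's API:
// fx/fy are the focal lengths in pixels, cx/cy the principal point
// (roughly the image center); real values come from camera calibration
double fx = 800, fy = 800;
double cx = 320, cy = 240;
double[,] cameraIntrinsics =
{
    { fx,  0, cx },
    {  0, fy, cy },
    {  0,  0,  1 },
};
// four distortion coefficients (k1, k2, p1, p2); all zeros means
// "no lens distortion", which is only a rough approximation
double[] distortionCoeffs = { 0, 0, 0, 0 };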

Detect small rectangles in image AForge

I'm trying to detect rectangles on this image:
with this code:
static void Main(string[] args)
{
    // open the image
    string path = "test.png";
    Bitmap image = (Bitmap)Bitmap.FromFile(path);

    // locate objects
    BlobCounter blobCounter = new BlobCounter();
    blobCounter.FilterBlobs = true;
    blobCounter.MinHeight = 5;
    blobCounter.MinWidth = 5;
    blobCounter.ProcessImage(image);
    Blob[] blobs = blobCounter.GetObjectsInformation();

    // check for rectangles
    SimpleShapeChecker shapeChecker = new SimpleShapeChecker();
    foreach (var blob in blobs)
    {
        List<IntPoint> edgePoints = blobCounter.GetBlobsEdgePoints(blob);
        List<IntPoint> cornerPoints;

        // use the shape checker to extract the corner points
        if (shapeChecker.IsQuadrilateral(edgePoints, out cornerPoints))
        {
            // only do things if the corners form a rectangle
            if (shapeChecker.CheckPolygonSubType(cornerPoints) == PolygonSubType.Rectangle)
            {
                // here I use the Graphics class to draw an overlay, but you
                // could also just use the cornerPoints list to calculate your
                // x, y, width, height values
                List<Point> Points = new List<Point>();
                foreach (var point in cornerPoints)
                {
                    Points.Add(new Point(point.X, point.Y));
                }
                Graphics g = Graphics.FromImage(image);
                g.DrawPolygon(new Pen(Color.Red, 5.0f), Points.ToArray());
                image.Save("result.png");
            }
        }
    }
}
but it doesn't recognize the rectangles (walls). It only recognizes the big square, and when I reduce MinHeight and MinWidth, it recognizes trapezoids in the writing.
I propose a different approach. After working for almost a year with image processing algorithms, what I can tell you is that to create an efficient algorithm, you have to "reflect" how you, as a human, would do it. Here is the proposed approach (a sketch of the whole pipeline in code follows the list):
1. We don't really care about the textures, we care about the edges (rectangles are edges), so apply an Edge detection > Difference filter (http://www.aforgenet.com/framework/docs/html/d0eb5827-33e6-c8bb-8a62-d6dd3634b0c9.htm).
2. We want to exaggerate the walls. As humans we know that we are looking for walls, but the computer does not know this; therefore, apply two rounds of Morphology > Dilatation (http://www.aforgenet.com/framework/docs/html/88f713d4-a469-30d2-dc57-5ceb33210723.htm).
3. We only care about what is wall and what is not, so apply a Binarization > Threshold (http://www.aforgenet.com/framework/docs/html/503a43b9-d98b-a19f-b74e-44767916ad65.htm).
4. (Optional) Apply a blob extraction to erase the labels ("QUARTO", "BANHEIRO", etc.).
5. Apply a Color > Invert. This is done only because the next step detects the white color, not black.
6. Apply Blob Processing > Connected Components Labeling (http://www.aforgenet.com/framework/docs/html/240525ea-c114-8b0a-f294-508aae3e95eb.htm). This will give you all the rectangles.
Note that for each labeled box you have its coordinates, center, width and height, so you can extract a snip of the real image at those coordinates.
PS: Using the AForge Image Processing Lab application is highly recommended for testing your algorithms.
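Here is what that pipeline might look like using the AForge.NET filter classes. The filter names match the AForge.Imaging.Filters namespace, but the threshold value and the file name are placeholder assumptions to tune against your image:
using System;
using System.Drawing;
using AForge.Imaging.Filters;

class Pipeline
{
    static void Main()
    {
        // load the floor plan and convert it to 8bpp grayscale,
        // which the edge detector requires
        Bitmap source = (Bitmap)Bitmap.FromFile("test.png");
        Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(source);

        // 1. edge detection (Difference)
        Bitmap edges = new DifferenceEdgeDetector().Apply(gray);

        // 2. two rounds of dilatation to exaggerate the walls
        var dilatation = new Dilatation();
        Bitmap thick = dilatation.Apply(dilatation.Apply(edges));

        // 3. binarization; 60 is a placeholder threshold to tune
        Bitmap binary = new Threshold(60).Apply(thick);

        // 4. invert, because the labeling step detects white, not black
        Bitmap inverted = new Invert().Apply(binary);

        // 5. connected components labeling; its BlobCounter exposes
        //    the position, width and height of every detected region
        var labeling = new ConnectedComponentsLabeling();
        labeling.Apply(inverted);
        foreach (var blob in labeling.BlobCounter.GetObjectsInformation())
            Console.WriteLine(blob.Rectangle);
    }
}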
Each time a rectangle is found, the polygon is drawn on Graphics and the file is saved only for THAT rectangle. This means that result.png will only contain a single rectangle at a time.
Try first saving all the rectangles in a List<List<Point>>, then going over it and adding ALL the rectangles to the image. Something like this (pseudo):
var image..
var rectangles..
var blobs..

foreach (blob in blobs)
{
    if (blob is rectangle)
    {
        rectangles.add(blob);
    }
}

foreach (r in rectangles)
{
    image.draw(r.points);
}
image.save("result.png");
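In concrete C#, a sketch reusing the variables from the question's code (note it needs a using System.Linq directive for Select):
// collect every rectangle first...
var rectangles = new List<List<IntPoint>>();
foreach (var blob in blobs)
{
    List<IntPoint> corners;
    if (shapeChecker.IsQuadrilateral(blobCounter.GetBlobsEdgePoints(blob), out corners) &&
        shapeChecker.CheckPolygonSubType(corners) == PolygonSubType.Rectangle)
    {
        rectangles.Add(corners);
    }
}

// ...then draw them all and save the file once
using (Graphics g = Graphics.FromImage(image))
using (var pen = new Pen(Color.Red, 5.0f))
{
    foreach (var corners in rectangles)
    {
        g.DrawPolygon(pen, corners.Select(p => new Point(p.X, p.Y)).ToArray());
    }
}
image.Save("result.png");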
If your problem now is noise from the writing on the image, use FillHoles with a hole width and height smaller than the smallest rectangle but larger than any of the writing.
If the image quality is good and no text touches the border of the image, inverting the image and applying FillHoles will remove most of the clutter.
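A hypothetical FillHoles configuration along those lines (the size limits are assumptions you would tune, and binaryImage stands for your thresholded bitmap):
// fill holes smaller than the smallest room but larger than any lettering
var fillHoles = new FillHoles
{
    MaxHoleWidth = 30,   // assumed: larger than the text strokes
    MaxHoleHeight = 30,  // and smaller than the smallest rectangle
    CoupledSizeFiltering = true
};
Bitmap cleaned = fillHoles.Apply(binaryImage);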
Hope I understood your problem correctly.
We are trying to detect rectangles among many other rectangles (consider the gray rectangles of the grid). Almost all algorithms will get confused here, and you're not eliminating those externals from the input image. Why not replace the grid line color with the background color, or use a threshold first to eliminate all the grid lines?
Then dilate all pixels by the width of a wall and find all horizontal and vertical lines; after that, use maths to find rectangles from the detected lines. Uncontrolled filling would be risky: when boundaries are not closed, a fill will merge two rooms into one rectangle.
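If you go the line-detection route, AForge's Hough transform can find the candidate lines. This is only a sketch: the 0.5 intensity threshold is a guess, and binaryImage is your pre-processed bitmap:
// detect strong straight lines in polar (Theta, Radius) form;
// near-horizontal and near-vertical ones can then be intersected
// to form candidate room rectangles
var hough = new HoughLineTransformation();
hough.ProcessImage(binaryImage);
foreach (HoughLine line in hough.GetLinesByRelativeIntensity(0.5))
{
    Console.WriteLine("theta = {0}, radius = {1}", line.Theta, line.Radius);
}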

Texture appears grey when rendered

I'm currently working my way through "Beginning C# Programming" and have hit a problem in chapter 7 when drawing textures.
I have used the same code as on the demo CD, and although I had to change the path of the texture to be absolute, it appears grey when rendered.
I have debugged the program, writing the loaded texture to a file, and that is fine - no problems there. So something after that point is going wrong.
Here are some snippets of code:
public void InitializeGraphics()
{
    // set up the parameters
    Direct3D.PresentParameters p = new Direct3D.PresentParameters();
    p.SwapEffect = Direct3D.SwapEffect.Discard;
    ...
    graphics = new Direct3D.Device( 0, Direct3D.DeviceType.Hardware, this,
        Direct3D.CreateFlags.SoftwareVertexProcessing, p );
    ...
    // set up various drawing options
    graphics.RenderState.CullMode = Direct3D.Cull.None;
    graphics.RenderState.AlphaBlendEnable = true;
    graphics.RenderState.AlphaBlendOperation = Direct3D.BlendOperation.Add;
    graphics.RenderState.DestinationBlend = Direct3D.Blend.InvSourceAlpha;
    graphics.RenderState.SourceBlend = Direct3D.Blend.SourceAlpha;
    ...
}

public void InitializeGeometry()
{
    ...
    texture = Direct3D.TextureLoader.FromFile(
        graphics, "E:\\Programming\\SharpDevelop_Projects\\AdvancedFrameworkv2\\texture.jpg",
        0, 0, 0, 0, Direct3D.Format.Unknown,
        Direct3D.Pool.Managed, Direct3D.Filter.Linear,
        Direct3D.Filter.Linear, 0 );
    ...
}

protected virtual void Render()
{
    graphics.Clear( Direct3D.ClearFlags.Target, Color.White, 1.0f, 0 );
    graphics.BeginScene();

    // set the texture
    graphics.SetTexture( 0, texture );

    // set the vertex format
    graphics.VertexFormat = Direct3D.CustomVertex.TransformedTextured.Format;

    // draw the triangles
    graphics.DrawUserPrimitives( Direct3D.PrimitiveType.TriangleStrip, 2, vertexes );

    graphics.EndScene();
    graphics.Present();
    ...
}
I can't figure out what is going wrong here. If I open the texture in Windows it displays fine, so there must be something not right in the code examples given in the book, or presumably something wrong with my environment.
You're using a REALLY old technology there... I'm guessing you're trying to make a game (as we all did when we started out!), so try using XNA. My best guess is that it's your graphics driver. I know that sounds like a cop-out, but seriously, I've seen this before, and once I swapped out my old graphics card for a new one it worked! I'm not saying your card is broken, or that it's impossible to get this to work, but my two best suggestions would be to:
1) Start using XNA and follow the tutorials at http://www.xnadevelopment.com/tutorials.shtml
2) Replace your graphics card (if you want to carry on with what you are doing now).

How do you do a 3D transform (perspective) in C# or VB.Net?

What I am looking to do sounds really simple, but nowhere on the Internet have I found a way to do this in DotNet, nor a 3rd-party component that does it (without spending thousands on completely unnecessary features).
Here goes:
I have a jpeg of a floor tile (actual photo) that I create a checkerboard pattern with.
In dotnet, it is easy to rotate and stitch photos together and save the final image as a jpeg.
Next, I want to take that final picture and make it appear as if the "tiles" are laying on a floor for a generic "room scene". Basically adding a 3D perspective to make it appear as if it is actually in the room scene.
Here's a website that does something similar with carpeting; however, I need to do this in a WinForms application:
Flor Website
Basically, I need to create a 3D perspective of a jpeg, then save it as a new jpeg (then I can put an overlay of the generic room scene).
Anyone have any idea on where to get a 3rd party DotNet image processing module that can do this seemingly simple task?
It is not so simple, because you need a 3D transformation, which is more complicated and computationally expensive than a simple 2D transformation such as rotation, scaling or shearing. To give you an idea of the difference in the math: 2D transformations require 2-by-2 matrices, whereas a projection transformation (which is more complicated than other 3D transforms) requires a 4-by-4 matrix...
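Purely to illustrate that difference, here is the textbook form of a simple perspective projection in homogeneous coordinates (the viewer distance d is a made-up parameter, not taken from any particular library):
// a point (x, y, z, 1) is multiplied by this 4x4 matrix and then divided
// by the resulting w = 1 + z/d, which produces the distance-dependent
// foreshortening that no 2x2 matrix can express
double d = 1000.0; // distance from the viewer to the projection plane
double[,] projection =
{
    { 1, 0, 0,       0 },
    { 0, 1, 0,       0 },
    { 0, 0, 1, 1.0 / d },
    { 0, 0, 0,       1 },
};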
What you need is a 3D rendering engine in which you can draw polygons (in a perspective view) and then cover them with a texture (like a carpet). For .Net 2.0, I'd recommend SlimDX, a managed wrapper around DirectX that would allow you to render polygons, but there is some learning curve. If you are using WPF (.Net 3.0 and up), there is a built-in 3D canvas that allows you to draw textured polygons in perspective. That might be easier/better to learn than SlimDX for your purposes. I'm sure there is a way to redirect the output of the 3D canvas to a JPEG...
You might simplify the problem a lot if you don't require great performance and if you restrict the orientation of the texture (e.g. always a horizontal floor or always a vertical wall). If so, you could probably render it yourself with a simple drawing loop in .Net 2.0.
If you just want a plain floor, your code could look like the listing below. WARNING: obtaining your desired results will take some significant time and refinement, especially if you don't know the math very well. But on the other hand, it is always fun to play with code of this type... (:
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Windows.Forms;

namespace floorDrawer
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
            ResizeRedraw = DoubleBuffered = true;
            Width = 800;
            Height = 600;
            Paint += new PaintEventHandler(Form1_Paint);
        }

        void Form1_Paint(object sender, PaintEventArgs e)
        {
            // a few parameters that control the projection transform;
            // these are the values you can modify to change the output
            double cz = 10;   // distortion
            double m = 1000;  // magnification, usually around 1000 (the pixel width of the monitor)
            double y0 = -100; // floor height
            string texturePath = @"c:\pj\Hydrangeas.jpg"; // @"c:\pj\Chrysanthemum.jpg";

            // screen size
            int height = ClientSize.Height;
            int width = ClientSize.Width;

            // center of screen
            double cx = width / 2;
            double cy = height / 2;

            // render destination
            var dst = new Bitmap(width, height);

            // source texture
            var src = Bitmap.FromFile(texturePath) as Bitmap;

            // texture dimensions
            int tw = src.Width;
            int th = src.Height;

            for (int y = 0; y < height; y++)
                for (int x = 0; x < width; x++)
                {
                    // inverse-map each screen pixel to a point (u, v) on the floor plane
                    double v = m * y0 / (y - cy) - cz;
                    double u = (x - cx) * (v + cz) / m;

                    // wrap the texture coordinates so the tile repeats
                    int uu = ((int)u % tw + tw) % tw;
                    int vv = ((int)v % th + th) % th;

                    // The following SetPixel() and GetPixel() are painfully slow.
                    // You can replace this whole loop with an equivalent implementation
                    // using pointers inside unsafe{} code to make it much faster.
                    // Note that by casting u and v into integers, we are performing
                    // a nearest-pixel interpolation... it's sloppy but effective.
                    dst.SetPixel(x, y, src.GetPixel(uu, vv));
                }

            // draw the result on the form
            e.Graphics.DrawImage(dst, 0, 0);
        }
    }
}
