I have been trying to wrap my head around how to do structure from motion using Emgu (a C# OpenCV wrapper) and images extracted from a monocular video, but I find most of the papers on the subject too theoretical and written in a mathematical language I am not familiar with.
I have been able to track 2D feature points pretty accurately, although I still get some outliers, and now need to find the camera location and angle given paired arrays of the 2D feature points (current and next frame). I understand I probably need to use some kind of RANSAC algorithm to get the best match from a set of these feature points.
How would one proceed to solve this using Emgu and C#?
Related
I'm using C# with EmguCV in combination with a Kinect to display a video captured by the Kinect. My application successfully detects feature points with the SURF algorithm and marks them as red points on the video. Now I need to find the approximately largest area WITHOUT any feature points in order to detect a blank surface (e.g. free space for projection). Is this possible using the SURF algorithm, or do better solutions exist for this kind of problem?
How do I track points in a video in real time, as in the video below?
https://www.youtube.com/watch?v=jg6Nz6BfoSQ
I managed to use an optical flow method to get this output on my video,
but I couldn't find a way to track points with Emgu CV. Can someone suggest what I should do?
In the YouTube video he used C++ as the language. Does the choice of language affect the real-time response of the system?
You can get the floor coordinates through an API function (http://msdn.microsoft.com/en-us/library/hh973078.aspx). You can also get the raw depth stream data and determine the 3D coordinates of a point. Building a 3D point cloud is a little expensive and is usually done on the GPU for real-time applications. The problem is that you have to track objects. For tracking, you can look at OpenCV and combine it with the Kinect raw data.
I'm working on a small WPF desktop app to track a robot. I have a Kinect for Windows on my desk, and I was able to get the basic features working and run the depth camera stream and the RGB camera stream.
What I need is to track a robot on the floor, but I have no idea where to start. I found out that I should use EMGU (an OpenCV wrapper).
What I want to do is track a robot and find its location using the depth camera. Basically, it's for localization of the robot using stereo triangulation. Then, using TCP and Wi-Fi, I want to send the robot commands to move it from one place to another, using both the RGB and depth cameras. The RGB camera will also be used to map the objects in the area so that the robot can take the best path and avoid them.
The problem is that I have never worked with computer vision before, and this is actually my first project. I'm not tied to a deadline, and I'm more than willing to learn all the related material to finish this project.
I'm looking for details, explanations, hints, links or tutorials to help me achieve this.
Thanks.
Robot localization is a very tricky problem, and I myself have been struggling with it for months now. I can tell you what I have achieved, but you have a number of options:
Optical Flow Based Odometry (also known as visual odometry):
Extract keypoints or features from one image (I used Shi-Tomasi, i.e. cvGoodFeaturesToTrack)
Do the same for a consecutive image
Match these features (I used Lucas-Kanade)
Extract depth information from Kinect
Calculate transformation between two 3D point clouds.
The algorithm above estimates the camera motion between two frames, which in turn tells you the position of the robot.
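The final step, computing the transformation between two matched point sets, can be sketched as follows. This is only a minimal sketch under a simplifying assumption that is not in the answer: the robot moves on a flat floor, so the matched 3D points are projected onto the floor plane and only a 2D rotation and translation are estimated. The function name `estimate_rigid_2d` is hypothetical, the matches are assumed to be free of outliers, and a full 3D version would typically use an SVD-based method instead.

```python
import math

def estimate_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst.

    src and dst are equal-length lists of matched (x, y) points
    (hypothetical floor-plane projections of the 3D keypoints).
    Assumes the matches are already outlier-free.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matched point pairs")
    n = len(src)
    # Centroids of both point sets.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    # Accumulate dot and cross products of the centred points.
    dot = 0.0
    cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation moves the rotated source centroid onto the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)
```

Note that the transform recovered this way describes how the static scene points appear to move; the camera (and so the robot) motion is its inverse.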
Monte Carlo Localization: This is rather simpler, but you should also use wheel odometry with it.
Check this paper out for a C#-based approach.
The method above uses probabilistic models to determine the robot's location.
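To give a feel for the idea, here is a minimal particle-filter sketch of Monte Carlo localization. It is a toy under stated assumptions, not the method from the paper: the robot lives in a 1D corridor, wheel odometry reports how far it moved, and a range sensor measures the distance to a wall at a known position. The names `mcl_step` and `estimate` and the noise parameters are hypothetical.

```python
import math
import random

def mcl_step(particles, odometry, measurement, wall_pos, rng,
             motion_noise=0.1, sensor_noise=0.2):
    """One predict / weight / resample cycle of a 1D particle filter."""
    # Predict: shift every particle by the odometry reading plus noise.
    moved = [p + odometry + rng.gauss(0.0, motion_noise) for p in particles]
    # Weight: Gaussian likelihood of the range measurement for each particle.
    weights = []
    for p in moved:
        expected = wall_pos - p
        err = measurement - expected
        weights.append(math.exp(-0.5 * (err / sensor_noise) ** 2))
    if sum(weights) == 0.0:
        # Every particle is implausible; fall back to uniform weights.
        weights = [1.0] * len(moved)
    # Resample: draw particles in proportion to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))

def estimate(particles):
    """Position estimate: the mean of the particle cloud."""
    return sum(particles) / len(particles)
```

A typical loop starts with particles spread uniformly over the corridor and calls `mcl_step` once per odometry and sensor reading; the cloud then collapses around positions consistent with the measurements.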
The sad part is that even though C++ libraries exist that do what you need very easily, wrapping them for C# is a herculean task. If you can code a wrapper, however, then 90% of your work is done. The key libraries to use are PCL and MRPT.
The last option (which is by far the easiest, but the most inaccurate) is to use KinectFusion, built in to the Kinect SDK 1.7. But my experiences with it for robot localization have been very bad.
You must read SLAM for Dummies; it will make things about Monte Carlo Localization very clear.
The hard reality is that this is very tricky, and you will most probably end up doing it yourself. I hope you dive into this vast topic and learn awesome stuff.
For further information, or for the wrappers that I have written, just comment below... :-)
Best
Not sure if it would help you or not... but I put together a Python module that might help.
http://letsmakerobots.com/node/38883#comments
I have a bitmap image like this
My requirement is to create a GUI that loads the image and lets me change the contrast and other properties, plus an algorithm to mark the particular silver-coloured area shown in the figure, using C++ or C#. I am new to image processing, and through my search I have found that I can use the histogram of the image to find the required area. These are the steps:
Get the histogram
Search for intensity difference
Search for break in the line
Can someone suggest how I can proceed from here? Can I use OpenCV for this, or are other efficient methods available?
NOTE:
This image has many bright points, and the blob algorithm is not successful.
Any other suggestions for retrieving the correct coordinates of the rectangle-like object?
Thanks
OpenCV should work.
Convert your input image to greyscale.
adaptiveThreshold converts it to black and white
Feature detection has a whole list of OpenCV feature detectors; choose one depending on the exact feature that you're trying to detect.
E.g. have a look at the Simple Blob Detector, which lists the basic steps needed. Your silver rectangle certainly qualifies as a "simple blob" (no holes or other hard bits).
If all of your pictures look like that, it does not seem complicated to me to segment the silver area and find its centre. Basically, you will need to apply these algorithms in the sequence below:
I would suggest binarizing the image using Otsu's automatic threshold algorithm
Apply a labelling (blob) algorithm
If you have some problem with noise, you can use an opening filter or a median filter before the blob algorithm
If you end up with only one blob (the one with the biggest area, I guess), use a moment algorithm to find its centre of mass. Then you have the X,Y coordinate you are looking for
These algorithms are classical image processing, so I guess it wouldn't be hard to find them. In any case, I may have them implemented in C#, and I can post them here later if you think they solve your problem.
Maybe some research on DirectShow, a multimedia framework from Microsoft, will help you accomplish your task.
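As a rough illustration of that sequence (threshold, label, centre of mass), here is a minimal sketch in plain Python rather than C#. It assumes the image is already a greyscale list of rows with values 0 to 255 and skips the noise filtering step; the function names `otsu_threshold` and `largest_blob_centroid` are hypothetical, and a real application would use the library equivalents in OpenCV.

```python
from collections import deque

def otsu_threshold(pixels):
    """Return the grey level t that maximises between-class variance.

    Pixels with a value greater than t are treated as foreground.
    """
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg = 0
    sum_bg = 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def largest_blob_centroid(image):
    """Return the (x, y) centre of mass of the largest bright blob, or None."""
    h, w = len(image), len(image[0])
    t = otsu_threshold([v for row in image for v in row])
    mask = [[v > t for v in row] for row in image]
    seen = [[False] * w for _ in range(h)]
    best = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one 4-connected blob starting at (r, c).
            blob = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) > len(best):
                best = blob
    if not best:
        return None
    # Centre of mass: first-order moments divided by the area.
    cx = sum(x for _, x in best) / len(best)
    cy = sum(y for y, _ in best) / len(best)
    return cx, cy
```

Keeping only the largest blob is what makes isolated bright points harmless here, as long as none of them touches the silver area.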
I want to count the number of people crossing a line from either side. I have a camera mounted on the ceiling and pointed at the floor where the line is (so the camera sees just the tops of people's heads, which makes it more a matter of object detection than people detection).
Is there any sample solution for this problem, or for similar problems, that I can learn from?
Edit 1: More than one person is crossing the line at any moment.
If nothing but humans can cross the line, then you do not need to detect people; you only have to detect motion.
There are several approaches to motion detection.
Probably the simplest one fits your goals. You simply calculate the difference between successive frames of the video stream, which gives you a "motion mask", and use it to detect the line-crossing event.
As an improvement on this "algorithm", you may consider the "running average" method.
To determine the direction of motion, you can use "motion templates".
To increase the accuracy of your detector, you may try a background subtraction technique (which, in turn, is not a simple solution), for example if there is some moving background that should be filtered out (e.g. using statistical learning).
All algorithms mentioned are included in OpenCV library.
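To make the frame-differencing and running-average ideas concrete, here is a minimal sketch in plain Python standing in for the OpenCV routines. It rests on assumptions not in the answer: frames are small greyscale lists of rows, the counting line is a horizontal image row, and there is only one moving region per frame (with several people at once, each blob in the mask would have to be labelled and tracked separately). All function names and parameter values are hypothetical.

```python
def motion_mask(background, frame, threshold=30):
    """Mark pixels that differ from the background by more than threshold."""
    return [[abs(f - b) > threshold for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def update_background(background, frame, alpha=0.05):
    """Running average: blend a small fraction of the new frame in."""
    return [[(1 - alpha) * b + alpha * f for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def mask_centroid_row(mask):
    """Mean row index of the moving pixels, or None if nothing moves."""
    rows = [r for r, row in enumerate(mask) for moving in row if moving]
    return sum(rows) / len(rows) if rows else None

def count_crossings(frames, line_row, threshold=30, alpha=0.05):
    """Count (down, up) crossings of line_row by the moving region."""
    background = [[float(v) for v in row] for row in frames[0]]
    down = up = 0
    prev_row = None
    for frame in frames[1:]:
        row = mask_centroid_row(motion_mask(background, frame, threshold))
        if row is not None and prev_row is not None:
            if prev_row < line_row <= row:
                down += 1
            elif row < line_row <= prev_row:
                up += 1
        prev_row = row
        background = update_background(background, frame, alpha)
    return down, up
```

Comparing the centroid position in consecutive frames is a crude substitute for motion templates; it gives the crossing direction only for a single, well-separated mover.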
UPD:
how to compute motion mask
Useful functions for determining motion direction: cvCalcMotionGradient, cvSegmentMotion, cvUpdateMotionHistory (search the docs). The OpenCV library contains example code for motion analysis; see motempl.c
advanced background subtraction from "Learning OpenCV" book
I'm not an expert in video-based CV, but if you can reduce the problem to a finite set of images (for instance, entering the frame, standing on the line, exiting the frame), then you can use one of many shape recognition algorithms. I know of Shape Context, which is good, but I doubt it is subtle enough for this application (it won't tell the difference between a head and most other round objects).
Basically, try to extract key images from the video, and then test them with shape recognition algorithms.
P.S. Finding the key images might be possible with good motion detection methods.