Localization of a robot using Kinect and EMGU(OpenCV wrapper)

Localization of a robot using Kinect and EMGU(OpenCV wrapper) - c#

I'm working on small WPF desktop app to track a robot. I have a Kinect for Windows on my desk and I was able to do the basic features and run the Depth camera stream and the RGB camera stream.
What I need is to track a robot on the floor but I have no idea where to start. I found out that I should use EMGU (OpenCV wrapper)
What I want to do is track a robot and find it's location using the depth camera. Basically, it's for localization of the robot using Stereo Triangulation. Then using TCP and Wifi to send the robot some commands to move him from one place to an other using both the RGB and Depth camera. The RGB camera will also be used to map the object in the area so that the robot can take the best path and avoid the objects.
The problem is that I have never worked with Computer Vision before and it's actually my first, I'm not stuck to a deadline and I'm more than willing to learn all the related stuff to finish this project.
I'm looking for details, explanation, hints, links or tutorials to achieve my need.
Thanks.

Robot localization is a very tricky problem and I myself have been struggling for months now, I can tell you what I have achieved But you have a number of options:
Optical Flow Based Odometery: (Also known as visual odometry):
Extract keypoints from one image or features (I used Shi-Tomashi, or cvGoodFeaturesToTrack)
Do the same for a consecutive image
Match these features (I used Lucas-Kanade)
Extract depth information from Kinect
Calculate transformation between two 3D point clouds.
What the above algorithm is doing is it is trying to estimate the camera motion between two frames, which will tell you the position of the robot.
Monte Carlo Localization: This is rather simpler, but you should also use wheel odometery with it.
Check this paper out for a c# based approach.
The method above uses probabalistic models to determine the robot's location.
The sad part is even though libraries exist in C++ to do what you need very easily, wrapping them for C# is a herculean task. If you however can code a wrapper, then 90% of your work is done, the key libraries to use are PCL and MRPT.
The last option (Which by far is the easiest, but the most inaccurate) is to use KinectFusion built in to the Kinect SDK 1.7. But my experiences with it for robot localization have been very bad.
You must read Slam for Dummies, it will make things about Monte Carlo Localization very clear.
The hard reality is, that this is very tricky and you will most probably end up doing it yourself. I hope you dive into this vast topic, and would learn awesome stuff.
For further information, or wrappers that I have written. Just comment below... :-)
Best

Not sure if is would help you or not...but I put together a Python module that might help.
http://letsmakerobots.com/node/38883#comments

Related

Kinect v2 SDK Clothes Detection

So, recently I managed to land myself a Kinect v2 project (hurray!) on my Industrial Placement which is supposed to detect if a person is wearing the correct PPE or (personal protective equipment to you and me).
Included in this scope is detecting if the person is wearing the correct:
Hazard Protection Suit
Boots
Gloves
Hat
A beard mesh! (Only if they have a beard)
I have only recently begun working with the Kinect v2 Sensor so you will have to forgive any ignorance on my part. I am new to this whole game but have worked through a fair few examples online. I guess I am just looking for advice/sources on how best to solve the problem.
Sources online seem scarce for trying to detect if a person is WEARING something. There are a couple of parts to this project:
Split up human into components (hands, feet, head, body) while retaining colour. This part I imagine would be best done by the Kinect SDK.
Give a percentage likelihood that the person's hand part is wearing gloves, head part is wearing hat etc... I've seen some ideas for this including graphics packages such as OpenCV.
I'm looking for advice concerning both parts of this project. Feel free to ignore the waffling below, but I thought it best to post some of my own ideas first.
Idea (and worked example) 1 - Point clouds
I have done some preliminary projects involving basic detection of humans. In fact, grabbing the VS project from this source: http://laht.info/record-3d-video-with-kinect-v2-and-play-it-back-in-the-browser/ I have managed to create a .ply file out of human beings. The problem here would be trying to split the .ply 3D image up to identify the suit, hat, gloves, boots (and I guess) beard. I imagine this is a fairly complex task.
Of the 3 ideas however, I'm coming back around to this one. Splitting up a 3D .ply image and trying to detect the PPE could turn out to be more accurate. Would need advice here.
Idea 2 - Blob Detection
Using this: http://channel9.msdn.com/coding4fun/kinect/Kinect--OpenCV--WPF--Blob-Tracking I pondered whether there would be a good way of splitting up a human into a colour picture of their "hand part", "head part", "body part" etc with the Kinect v2 SDK. Maybe I could then use a graphics package like OpenCV to test if the colours are similar or if the logo on the suit can be detected. Unfortunately, there is no logo currently on the gloves and boots so this might not turn out to give a very reliable result.
Idea 3 - Change the PPE
This isn't ideal but if I can't do this another way, I could perhaps insist that logos would need to be put on the gloves and suit. In this worst case scenario, I guess I would be trying to just detect some text in 3D space.
Summary
I'd be grateful for any advice on how to begin tackling this problem. Even initial thoughts might spark something :)
Peace!
Benjamin Biggs

Suggestions for a video montage application (extracting and playing randomized short clips from a playlist)?

I'm an electronics engineer used to coding in embedded C and assembly, but I decided to start learning higher-level stuff like C#, .NET, etc., so I can start making software as a hobby. I have a great idea for one of my first projects, but after searching several forums for days on end, I'm left not really knowing what would be the easiest path forward.
The functionality that I'm looking to create is pretty similar to the idea of a photo slideshow, but applied to videos instead. The program would open a playlist or a folder full of videos and then play the videos in a random order, starting from a random starting position, and with a fixed duration (let's say 10 seconds as an example). You would end up being able to watch a sort of "video montage" that consisted of small clips from random parts of the videos in the playlist, shown in a random order, ad infinitum until the program is closed.
There are a number of ways I could tackle the problem:
Develop a standalone video player with the fixed functionality of showing "video slideshows." DirectX has the Microsoft.DirectX.AudioVideoPlayback API that
could be a good starting point. I found an example here: http://www.dreamincode.net/forums/topic/111181-adding-video-to-an-application/
Modify an open source project to add the desired functionality. I've seen a few cool projects that could get me started, like this simple C# Movie Player: http://www.codeproject.com/Articles/18552/C-Movie-Player
Use a scripting interface to implement this functionality on an existing media player, like VLC or Winamp. You could also control VLC via C#, like the example here: Controlling VLC via c#
I realize that the obvious answer for most people would be to "use whatever you're most comfortable with," but since I'm a pure beginner, I don't really have any allegiances to a particular language or development environment. So, I was just curious if anybody had an idea of what might be the least painful option for a beginner.
I also apologize that this is not a very specific programming question. I'm sort of just testing the waters to get my footing. Hopefully, once I get started on the project, I'll be able to come back and post more intelligent and relevant questions!

While your background would lend you toward C#, I recommend investigating something like this and using WPF for the media player. You can then control the media player using a background worker in order to stop the video or queue up the next one. Some other .NET concepts that will be of use to you are FileInfo and DirectoryInfo objects, to provide you with the necessary information about the files. I'm not sure if you've had experience with generic data structures in .NET, but the System.Collections.Generic namespace would be a good place to start to get a feel for data structure you want to keep your playlist in. WPF will also be able to help you with transitions between video clips.
Admittedly WPF is easier with an understanding of the MVVM or MVC design patterns, but I think you'll be able to get something working without having to delve too far into that right up front.

Web Page - 3d earthquake visualization - Silverlight?

I have never written any silverlight apps but I am looking to write a 3d viewer for earthquakes and have it run from my web site.
I would like to create a simple viewer so the user can change the "camera" ie their perspective. The view could contain up to 10,000 objects in the 3d space.
I want the ability to quickly view this - I have seen this on a Power Basic application and want to do this for the web.
I have a current web site at http://canterburyquakelive.co.nz for earthquakes in Canterbury New Zeaalnd and I want to learn the basics so that it can be more interactive.
I want to say for example (to start) place 2 objects in a "space" that I can define and move the camera in real time.
The system must support up to 10,000 objects in the end of the day.
Each object can be a simple circle - no need for special pixel shaders
I am unsure of the exact functionallity of the system at the moment so if I can find a tutorial that allows me to place someone (a circle) into a 3d world (space) and change the camera that would be good.
Any ideas appreciated - there seems to be so much about 3d and silverlight that I may be getting lost in the "gloss" of new features where I need some basics and I can learn and adapt over time.
** Added comment + image **
Basically I am waiting to create a page that look like this using Silverlight. But I am open to any technology.

I've never done 3D in silverlight so I can't exactly answer your question as asked but in general to display geographic markers in a 'real' 3D terrain is quite involved. At a minimum you're probably looking at:
Obtaining binary height data files (last time I looked, NASA gives this away)
Reading and interpreting said files to get 'bitmap' height data
Choosing and dealing with projections (e.g. UTM)
Deciding how to tesselate your bitmap height data
If you want it textured you'll need to also obtain satellite data for that, again converting or processing it to account for projection.
You could ignore the terrain height, but that may not simplify things depending on how 'bumpy' your terrain is.
For a pre-defined small enough area, you could perhaps pre-author a 3d model of the terrain in some 3D package but displaying your markers will still require a projection from long/lat into your 3D space, and you'll still need to know terrain height (unless you do mesh collision with the static model).
Rendering the markers is pretty straightforward by comparison, choose from:
Use a 3D model e.g. a 'pin head' (simple but not always visible)
Render a regular n-gon with 'viewer facing' polygons (resolution independent but maybe ugly)
Render a quad with a circle texture on it (low poly but what size texture to choose?)
There are probably libraries that do some or all of this for you, so if you are set on rolling your own then some of the things I've mentioned could form the basis for your search.
However, given what you've described of your site and situation I suspect you'd be better off avoiding all that work by using a pre-existing solution. E.g. the Google Earth API.

You could consider 3D web plugins that -granted- take you away from Silverlight but that might speed up your development process. I'm thinking in particular of e.g. the Blender 3D web plugin. I can understand the need to write your own viewer, but think twice before you re-invent the wheel. Good luck!

Skeleton tracking with 2D camera

I'm trying to use a 2D camera to recognize the device/object a user is pointing at so I was looking for a skeleton tracking software using a 2D camera in order to be able to do that. Is there any open source project that deals with skeleton tracking using 2D cameras?
(I've gone through tons of links on Google and it seems like most of what's there is just research papers but no actual open source projects)
Thanks!

Skamleton could be an option. It's an open-source project in early stages, but it implements a background subtractor, a skin color classifier, blob tracking and face classification. There is a demo on YouTube.
Note Skamleton use simple cameras, not RGB-D (depth) cameras as the Kinectic system (Kinect uses a structured light device from PrimSense).

It seems there's kind of a pre-release of a SDK for Kinect from Microsoft. Perhaps this might be helpful for you:
http://nuigroup.com/forums/viewthread/11249/
(Although I think this won't be Open Source. But since you are using c#, a Microsoft SDK might be ok for you.)

This seems like an old post, but in case anyone is still looking Extreme Reality uses a regular webcam and does skeleton tracking. It's not open source, but I've played around with it a bit and it does seem to be fairly robust.
http://www.xtr3d.com/developers/resources/

How to draw an unfilled square on top of a stream video using a mouse and track the object enclosed in the square automatically in C#?

I am making an object tracking application. I have used Emgucv 2.1.0.0
to load a video file
to a picturebox. I have also taken the video stream from a web camera.
Now, I want
to draw an unfilled square on the video stream using a mouse and then track the object enclosed
by the unfilled square as the video continues to stream.
This is what people have suggested so far:-
(1) .NET Video overlay drawing(DirectX) - but this is for C++ users, the suggester
said that there are .NET wrappers, but I had a hard time finding any.
(2) DxLogo sample
DxLogo – A sample application showing how to superimpose a logo on a data stream.
It uses a capture device for the video source, and outputs the result to a file.
Sadly, this does not use a mouse.
(3) GDI+ and mouse handling - this area I do not have a clue.
And for tracking the object in the square, I would appreciate if someone give me some research paper links to read.
Any help as to using the mouse to draw on a video is greatly appreciated.
Thank you for taking the time to read this.
Many Thanks

It sounds like you want to do image detection and / or tracking.
The EmguCV ( http://www.emgu.com/wiki/index.php/Main_Page ) library provides a good foundation for this sort of thing in .Net.
e.g. http://www.emgu.com/wiki/index.php/Tutorial#Examples
It's a pretty meaty subject with quite a few years and different branches of research associated with it so I'm not sure anyone can give the definitive guide to such things but reading up neural networks and related topics would give you a pretty good grounding in the way EmguCV and related libraries manage it.
It should be noted that systems such as EmguCV are designed to recognise predefined items within a scene (such as a licence plate number) rather than an arbitory feature within a scene.
For arbitory tracking of a given feature, a search for research papers on edge detection and the like (in combination with a library such a EmguCV) is probably a good start.
(You also may want to sneak a peek at an existing application such as http://www.pfhoe.com/ to see if it fits your needs)

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.