Detecting human body from video source in C#

I am trying to extract a human from a video source so that I can use his image later. I need to extract only the human body and ignore the environment. The good thing is that the background is static. I have tried AForge with the CustomFrameDifferenceDetector filter, which compares the current frame to the static background image and extracts the pixels that differ (difference > threshold). It works well, but there is a problem when skin or part of the clothing has a color similar to the background. In these cases the filter ignores those parts and the result has various holes in the body. Simply decreasing the threshold doesn't solve the problem, since body shadows and other noise increase (even with noise suppression).
Do you know of any known solution to this problem, or is it still unsolved?

It's a hard issue to solve (and one of the reasons Microsoft's Kinect doesn't use visible light alone, and why blue/green screening is still so popular). I'd try to remove the holes (you should be able to predict where the body has to be). If you've got the processing power, use different thresholds and merge the results. You could also try to filter across consecutive frames (e.g. add the current frame to the last frame and normalize the result). This way you could track shapes you lose for a single frame a lot more consistently.
A different approach: use the detected shape/region only for detecting the position of the body. I.e. ignore its specific shape and place a premade shape above/around the estimated body position. This most likely won't work if you'd like some kind of bluescreen-like behaviour, but it might also help with closing holes.
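If you stick with AForge, here is a minimal sketch of the hole-removal idea, assuming you can get at the detector's 8bpp motion mask; the Closing and FillHoles filters exist in AForge.Imaging.Filters, but the size limits below are illustrative and need tuning:
using AForge.Imaging;
using AForge.Imaging.Filters;

// 'detector' is your CustomFrameDifferenceDetector; MotionFrame is the
// 8bpp mask of pixels that differ from the background.
UnmanagedImage mask = detector.MotionFrame;

// Morphological closing bridges small gaps along the silhouette edge.
new Closing().ApplyInPlace(mask);

// FillHoles removes enclosed holes inside the body blob; tune the maximum
// hole size to the largest gap you expect from skin/clothing color clashes.
var fillHoles = new FillHoles
{
    MaxHoleWidth = 50,
    MaxHoleHeight = 50,
    CoupledSizeFiltering = false
};
fillHoles.ApplyInPlace(mask);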

Alturos.Yolo does exactly what you are looking for.
Yolo learns from annotated images how to detect the objects you are looking for. First you need to install the project, along with a set of pre-trained model data, using the NuGet Package Manager. In your case the YOLOv2-tiny model should suffice:
Install-Package Alturos.Yolo
Install-Package Alturos.YoloV2TinyVocData
Once installed, you can use it like this to detect a human in your image:
using (var yoloWrapper = new YoloWrapper("yolov2-tiny-voc.cfg", "yolov2-tiny-voc.weights", "voc.names"))
{
    var items = yoloWrapper.Detect(@"your_image.jpg");
    //if (items.First().Type == "person") { ... }
}
The items collection will contain information about all the objects found. You can check there whether it's a human you are looking at, using the Type property (the VOC model uses the lowercase class name "person").
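A short follow-up sketch of filtering the detections; the 0.5 confidence threshold is an arbitrary illustration, and YoloItem also exposes the bounding box:
using System;
using System.Linq;

// Keep only confident person detections.
var people = items.Where(item => item.Type == "person" && item.Confidence > 0.5);
foreach (var p in people)
    Console.WriteLine($"person at ({p.X}, {p.Y}), size {p.Width}x{p.Height}");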

Related

How to implement a loop closure algorithm for ARFoundation

I am building an app which is used for measuring outside environments on a larger scale (imagine the size of a regular personal driveway) using AR.
In order to increase the reliability of the system I want to implement a loop-closing algorithm that allows the user to set a fixpoint (e.g. an ImageMarker) --> walk around --> return to the fixpoint. The system should then figure out the offset (drift) between the original marker position and the new position upon returning, and apply a transformation to the measuring points in such a way that it accounts for drift.
As far as I know this kind of system is called "loop closure".
Where do I start with implementing such a thing in ARFoundation? Are there built-in methods? What are keywords / names of specific algorithms that I could be researching?
Any help is greatly appreciated. Thank you!
I experimented with two things:
adding a simple standalone image trigger to the scene and checking if that already has an impact on other ARAnchors in the scene -> this doesn't seem to be the case. The transformations on the other anchors do not seem to be affected when the image trigger updates its position.
attaching all points of measurement as children of the image trigger --> this obviously applies the updated image-trigger transformation uniformly to all other points. However, drift does not accumulate uniformly over the runtime of the system.
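ARFoundation does not expose a built-in loop-closure API, so any correction has to be done manually. As a starting point, a minimal sketch (plain Unity math, names illustrative) of computing the delta between the marker pose recorded at the start and the pose observed on return, and applying it to the stored points. Note it applies the correction uniformly, which, as observed above, only approximates drift that accumulated non-uniformly; a common refinement is to scale the correction by how far along the walked path each point was captured.
using UnityEngine;

public static class DriftCorrection
{
    // originalMarker: marker pose recorded at the start of the session.
    // returnedMarker: marker pose observed after walking the loop.
    public static Vector3[] Correct(Pose originalMarker, Pose returnedMarker, Vector3[] points)
    {
        Matrix4x4 original = Matrix4x4.TRS(originalMarker.position, originalMarker.rotation, Vector3.one);
        Matrix4x4 returned = Matrix4x4.TRS(returnedMarker.position, returnedMarker.rotation, Vector3.one);

        // Transform that maps the original marker frame onto the returned one.
        Matrix4x4 delta = returned * original.inverse;

        var corrected = new Vector3[points.Length];
        for (int i = 0; i < points.Length; i++)
            corrected[i] = delta.MultiplyPoint3x4(points[i]);
        return corrected;
    }
}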

NetTopologySuite 'found non-noded intersection' Exception when determining the difference between two specific geometries

Using NetTopologySuite in C#, I'm getting a 'found non-noded intersection' Exception when determining the difference between two specific geometries.
These geometries are the result of using several routines like CascadedPolygonUnion.Union, Intersection, and Difference.
At some point, we have a MultiPolygon from which we want to cut out another geometry (Polygon):
We use this code to try and cut off the 'red' polygon:
Geometry difference = multiPolygon.Difference(geometryToRemove);
But then we get a NetTopologySuite.Geometries.TopologyException with the message:
found non-noded intersection between LINESTRING (240173.28029999882 493556.2806000002, 240173.28177031482 493556.28131837514) and LINESTRING (240173.28176154062 493556.2813140882, 240173.28176153247 493556.2813140842) [ (240173.28176153894, 493556.2813140874) ]
I asked this question also in the NetTopologySuite Discussions forum, because we are close to a release date and I was hoping someone could give some extra insight (or ideas for a workaround), as this looks like a bug in the library because the polygons themselves seem valid.
The data regarding the polygons can be found here - we use the 'RDNew' data to perform the Difference action, but I also added the WGS84 versions of these polygons to be able to view them in tools like geojson.io.
Thanks to one of the maintainers of the library I got the answer.
Basically, I needed to upgrade to version 2.2 (which I had already done at first to see if that would resolve the problem).
But second, I needed to configure the application to use the 'NextGen' overlay generator introduced in version 2.2, which is not turned on by default.
To use the 'NextGen' overlay generator, add the following code at some startup point in your application:
var curInstance = NetTopologySuite.NtsGeometryServices.Instance;
NetTopologySuite.NtsGeometryServices.Instance = new NetTopologySuite.NtsGeometryServices(
    curInstance.DefaultCoordinateSequenceFactory,
    curInstance.DefaultPrecisionModel,
    curInstance.DefaultSRID,
    NetTopologySuite.Geometries.GeometryOverlay.NG, // RH: use 'NextGen' overlay generator
    curInstance.CoordinateEqualityComparer);
I use the current instance of NtsGeometryServices to get and reuse the current default instances of the other configurable parts.
But you're free to create new instances of the required parts (as mentioned in the original post at https://github.com/NetTopologySuite/NetTopologySuite/discussions/530#discussioncomment-888410 ).
There are also possibilities to use both overlay generators side by side (also mentioned in the original post), but I never tried this, as we will be using the 'NextGen' version for the entire application.
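For completeness, a sketch of that side-by-side option based on the linked discussion (untested here): keep the default instance as-is and create geometries that should use the 'NextGen' overlay from a second NtsGeometryServices instance.
var ngServices = new NetTopologySuite.NtsGeometryServices(
    NetTopologySuite.Geometries.Implementation.CoordinateArraySequenceFactory.Instance,
    new NetTopologySuite.Geometries.PrecisionModel(),
    4326, // example SRID
    NetTopologySuite.Geometries.GeometryOverlay.NG,
    new NetTopologySuite.Geometries.CoordinateEqualityComparer());

// Geometries built by this factory use the 'NextGen' overlay for operations
// like Difference; geometries from the default instance keep whatever
// overlay the default instance was configured with.
var ngFactory = ngServices.CreateGeometryFactory(4326);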

Layered Sprites With Predetermined Offset / Rotation

Probably not a very descriptive title, but I'm doing my best. It's my first time posting on StackOverflow, and I'm relatively new to programming in C# (first started around a year ago using Unity, and decided a few days ago to upgrade to XNA). That being said, please be kind to me.
I'm planning out the mechanics of a 2D game that I'm designing, and while most of it seems straightforward after playing around in XNA, there's one issue that I keep coming back to that I have yet to come up with a satisfactory answer for. The issue involves the layering of sprites into composite / complex sprites. For example, a character in the game might be wielding one or two of any number of weapons. I did do a bit of research on the topic, and found some people recommending the RenderTarget class to draw a series of sprites as one, and some recommending simply drawing the sprites on top of one another during Draw(). These topics, however, were mostly focused on the relatively simple case of having a single character in the game.
In my case, the game will have a number of sprite-based characters who have totally different postures / animations. There are around 10 right now, and there will probably be more added later in development. There will likewise be a largish number of weapons (probably around 20 to start) that will be composited onto the characters. That much I'm comfortable with. However, the problem is that each of the characters would require the weapon sprites to be drawn in different locations and with different rotations during each frame of a character's animation.
I've considered a couple approaches to how to pull this off, but they all have pretty massive drawbacks.
The first was simply drawing a spritesheet of each weapon for each character, which would be the same size as the appropriate character. The benefit to this approach would be the ease of just adding the call to draw the additional sprite on top of the base character without having to do any calculations. The downside would be that it creates an inordinate number of extra sprite sheets (200 extra sheets for 10 characters × 20 weapons).
The second was creating a class to handle the weapon sprite. The WeaponSprite class would be attached to a single texture for each weapon, and would then store information about the offset / rotation to use when drawn, based on the character it is attached to. The problem with this is that organizing the offsets / rotations on a per-frame basis would be incredibly tedious, and I can't think of any easy way to pull the information based on the frame required. (I had the idea of making an AnimationFrame class to keep track of the animation name, directional facing and frame number of each character, and then using a dictionary in the weapon class to load the proper data based on the name of the current frame, but something about the idea seemed really ill-conceived.) This method also has the drawback of requiring a relatively large amount of memory to pull off (assuming a Vector2 is 8 bytes and a float is 4, having 10 characters and 20 weapons would require 192 KB of memory given the current number of frames being used, which would only get larger as more weapons were added). I had an offshoot of that idea (which I sort of stole from another post on here about the same topic) of using a reserved alpha-value pixel to link the offset and the 'origin' of each weapon, calculating the position at runtime and then only having to store the rotational float in the aforementioned dictionary.
Since I'm new to XNA (and still pretty green on C#), I figured I'd post and let the experts chime in. Am I on the right track with my methods, or am I missing something really simple? Thank you very much in advance for your help, and please let me know if you need any additional information.
Wow, big question. I can't really tell you exactly how to implement this. But I can give you some helpful nuggets of advice:
Advice #0: Whenever any kind of compositing problem comes up, people come out of the woodwork recommending "render targets" as some kind of compositing panacea. They are usually wrong. Avoid using render targets if you can. You only need them if you are doing effects on the final, composite image (blends, blurs, etc). Otherwise just draw your sprites over the top of each other directly to the backbuffer.
Advice #1: You want to pack all your sprites onto a single sprite sheet, if possible. If you exceed the texture size limit, you'll have to be clever about how you partition your sprites across sheets. The reason is performance - you want to limit the number of texture swaps - see this answer for details.
You may be able to use an existing sprite-packer for XNA. If you can find a suitable one, I recommend you use it. A good one will allow you to treat a packed sprite just as you would treat a texture when calling SpriteBatch.Draw.
Advice #2: Do not worry about how much space your positioning data takes at runtime. 192kb is almost nothing - the size of a small texture.
The upshot of this, and #1, is to store as much as possible in your positioning meta-data, and avoid duplicate textures.
How you store your meta-data almost doesn't matter.
Advice #3: You can change both your storage requirements and content creation story from an n × m problem to an n + m problem (n characters and m weapons). Simply store weapons with only an "origin", and store characters with an "origin" and a "hand position & rotation". Simply render such that the origin of the weapon lines up with the hand of the character (the maths is very simple).
Then you can add characters without worrying about what weapons exist, and add weapons without worrying about what characters exist.
Here's an example of how much space this needs: 10 characters × 20 bytes + 20 weapons × 8 bytes = 360 bytes. Nice and small! (Although you'll probably want many more attachment points - different kinds for different weapons, hats, whatever. Edit: oops I didn't include animation frames - but it's still a relatively small amount of data.)
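A minimal sketch of the rendering side of this, called between SpriteBatch.Begin and End (names are illustrative; the hand offset and rotation would come from the character's per-frame animation data, the origin from the weapon):
void DrawCharacterWithWeapon(SpriteBatch spriteBatch,
    Texture2D characterTexture, Vector2 characterPosition,
    Vector2 handOffset, float handRotation,        // per-frame character data
    Texture2D weaponTexture, Vector2 weaponOrigin) // per-weapon data
{
    spriteBatch.Draw(characterTexture, characterPosition, Color.White);

    // Place the weapon's origin (e.g. its grip) at the character's hand
    // and rotate it to match the hand rotation for this frame.
    spriteBatch.Draw(
        weaponTexture,
        characterPosition + handOffset, // hand position in screen space
        null,                           // use the full weapon texture
        Color.White,
        handRotation,
        weaponOrigin,                   // pivot point inside the weapon texture
        1f,                             // scale
        SpriteEffects.None,
        0f);                            // layer depth
}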
Advice #4: The trickiest part, as you seem to be hinting at in your post, is content creation.
As you hint at, ideally you would want to be able to edit the attachment points directly in your image editor. This is a compelling idea. Special alpha values are only appropriate if your sprites have no anti-aliasing. You could theoretically do something with layers and different colours. The hardest part is figuring out how to encode rotation.
You could use an XNA content pipeline processor to extract data from the image at build-time. However this gets very expensive to implement (especially if you've not done it before - the content pipeline is badly under-documented). Unless your art requirements are truly enormous, it is almost certainly not worth the extra development time required to make the content pipeline extension. By the time you're done, you could have hand-coded the positioning data several times over.
My recommendation, then, is to store the extra data in an easy-to-edit XML file. I recommend using XNA's XML Content Importer. It can be tricky to get the hang of the formatting at first, and you have to remember to include the appropriate assembly referencing. But once you know how to use it, it's the easiest way to get structured data into XNA quickly.
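For example, the XML file's Asset element names a plain C# class that the XML Content Importer deserializes into; a hypothetical weapon entry might map to something like this:
using Microsoft.Xna.Framework;

// Matches an XML asset such as:
//   <XnaContent>
//     <Asset Type="MyGame.WeaponData">
//       <Name>Sword</Name>
//       <Origin>4 28</Origin>
//     </Asset>
//   </XnaContent>
// Loaded at runtime with: var sword = Content.Load<WeaponData>("Weapons/Sword");
public class WeaponData
{
    public string Name;
    public Vector2 Origin; // weapon pivot, as in Advice #3
}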

convert equirectangular to tiles library C#

I want to create tiles out of an equirectangular image, so I want the image to be split into four lateral faces plus top and bottom. Does anyone know of any library that I can import into my C# project which is able to do something like this?
Depending on your original image format, System.Drawing.Bitmap.Clone(Rectangle, PixelFormat) should do the trick.
More information here.
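For the four lateral faces, a minimal sketch of the Clone approach, assuming you just want four equal vertical strips (note this is a plain crop; the top and bottom faces need a real reprojection, see the edit below):
using System.Drawing;

Bitmap source = new Bitmap("equirectangular.jpg"); // hypothetical input
int faceWidth = source.Width / 4;
var faces = new Bitmap[4];
for (int i = 0; i < 4; i++)
{
    var region = new Rectangle(i * faceWidth, 0, faceWidth, source.Height);
    faces[i] = source.Clone(region, source.PixelFormat);
}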
EDIT:
First, let me say that this is not going to answer your question either (not even close), you're seeking a library that already exists for this purpose and I don't know of one personally.
Equirectangular projection is the same as plate carrée (wow, that's humbling), so it's a very simple projection to work with in code.
Here is an example of its use in a GIS application. I don't know what your purposes are, but the math is the same.
One way to do it is to deproject each pixel and then draw it on a new image, but understand that to do this you will still require some sort of projection, because you're changing from three dimensions to two.
I wasn't able to find a good example, but an easier or faster way might be to first use a matrix transform (again, to change projections), and then cut the image into the regions you need.
Like I said, this isn't an answer, but if nothing else it will give you more keywords to google for.
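To make the per-pixel idea concrete, here is a rough sketch of sampling one cube face (the front, +Z, face) from an equirectangular source using the standard spherical mapping; it is unoptimized (GetPixel/SetPixel are slow) and only illustrative:
using System;
using System.Drawing;

static Bitmap RenderFrontFace(Bitmap equi, int faceSize)
{
    var face = new Bitmap(faceSize, faceSize);
    for (int y = 0; y < faceSize; y++)
    {
        for (int x = 0; x < faceSize; x++)
        {
            // Direction through this pixel for the front (+Z) cube face.
            double dx = 2.0 * x / faceSize - 1.0;
            double dy = 1.0 - 2.0 * y / faceSize;
            double lon = Math.Atan2(dx, 1.0);
            double lat = Math.Atan2(dy, Math.Sqrt(dx * dx + 1.0));

            // Longitude/latitude map linearly to equirectangular pixels.
            int sx = (int)((lon / (2 * Math.PI) + 0.5) * equi.Width) % equi.Width;
            int sy = (int)((0.5 - lat / Math.PI) * equi.Height);
            sy = Math.Min(Math.Max(sy, 0), equi.Height - 1);

            face.SetPixel(x, y, equi.GetPixel(sx, sy));
        }
    }
    return face;
}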

Best logic for creating a (true) random labyrinth

I've been trying to make a simple little game just to test my logic, and it's a simple labyrinth. It's ugly and, so far, sucky.
The engine works pretty well: given that the labyrinth already exists (a matrix), it could even be enjoyable, but I have no intention of drawing a bunch of maps, which might mean setting values on 400 (20x20) fields of a matrix. Not fun.
So I created a function to randomize it, setting floor/wall for each field, and (as I expected) not every map is winnable. Then I made another function which checks if the map is playable (it receives two points and checks if there's a valid path between them; I just pass the start and the end. Pretty nifty) and it worked.
If you haven't noticed, this is a VERY stupid way of creating my random labyrinth for the following reasons:
1 - It might come out really easy (giant islands of floor, or a bunch of walls together, making only one, extremely visible path), creating a stupid (though valid) labyrinth.
2 - It is potentially the fastest way of creating a perfect random labyrinth EVER, but at the same time it's potentially the slowest too, taking as long as... forever. This difference shows more when I set the grid to 30x30 or larger (when nothing overflows).
3 - It's dumb and an offence to logic itself.
In my defense, I didn't plan to make it this way from the beginning; as described, one thing led to another.
So I started thinking about ways to make a beautiful (full of paths, tricky and winnable) labyrinth, and I thought about making small (let's say) 5x5 blocks with predesigned entrances and mounting them together in a way that fits, but that would go against my desire for true randomness, as well as my unwillingness to draw them by hand.
Then I thought about a function to create a random path, run it once to the end, and run it several times to somewhere close to the end, adding some crossings and stuff, some creating dead ends. That seemed better to me, but I just couldn't imagine it creating a decent labyrinth.
You can check what I've done so far in this link.
Note: I have no intention of harming anyone's PC with anything.
First one to open it, please comment here saying that it's safe. - Done (thank you, Jonno_FTW)
If you still don't trust it, use a Virtual Machine.
Note: I know this is not the best way of developing anything (I should get a decent game engine, bla bla bla); it was some kind of challenge for myself.
I've done maze generation. You don't want to place stuff randomly and validate. Instead, you generate it out from a starting point.
Pick a starting point and move in a random direction. Have a random probability of picking a new direction. Never move into an occupied square; if you bump into one, the current trail ends. When a trail ends, pick a square you have already visited, pick a new direction, and do a random walk like you did for the first one. Repeat until the maze is as full as you want it to be.
The probability of the direction change should be an input parameter, as it makes quite a difference. Note that if you are doing a 3D maze, the odds of a vertical turn should be a lot lower than the odds of a horizontal move.
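A minimal sketch of that procedure (parameter names and the step cap are illustrative; cells start as walls and the walk carves floor):
using System;
using System.Collections.Generic;

static bool[,] CarveMaze(int w, int h, double turnChance, double fillTarget, Random rng)
{
    var floor = new bool[w, h];
    var visited = new List<(int x, int y)>();
    int carved = 0, target = (int)(w * h * fillTarget);

    var (x, y) = (rng.Next(w), rng.Next(h));
    floor[x, y] = true; visited.Add((x, y)); carved++;
    var dirs = new (int dx, int dy)[] { (1, 0), (-1, 0), (0, 1), (0, -1) };
    var dir = dirs[rng.Next(4)];

    // Step cap guards against an unreachable fill target looping forever.
    for (int step = 0; carved < target && step < w * h * 100; step++)
    {
        if (rng.NextDouble() < turnChance)
            dir = dirs[rng.Next(4)]; // random chance of changing direction

        int nx = x + dir.dx, ny = y + dir.dy;
        if (nx >= 0 && nx < w && ny >= 0 && ny < h && !floor[nx, ny])
        {
            x = nx; y = ny;
            floor[x, y] = true; visited.Add((x, y)); carved++;
        }
        else
        {
            // Trail ended: restart from a random already-visited square.
            (x, y) = visited[rng.Next(visited.Count)];
            dir = dirs[rng.Next(4)];
        }
    }
    return floor; // true = floor, false = wall
}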
Here's an expansive website dedicated to labyrinths:
http://www.astrolog.org/labyrnth/algrithm.htm
Explains what types of labyrinths there are, goes over the generation algorithms and the solution algorithms, has a lot of cool pictures.
Have a look at the source code in my Roguelike game, Tyrant:
Code for Dungeon.java
There are a lot of different map generation techniques used to produce the different level types, but the most basic pattern is to iterate the following:
1. Start with a blank map.
2. Create a single random room / open space in the map.
3. Randomly select a tile at the edge of the currently open area.
4. Attempt to "grow" a corridor or room randomly out from that space (if it doesn't fit, do nothing).
5. Loop back to step 3 as many times as you need to create a decent maze.
6. Finally, do a pass over the whole map and convert any remaining blank space to walls.
Here's a screenshot of the type of thing you get (look at the mini-map for the maze structure):
Tyrant screenshot http://www.freeimagehosting.net/uploads/af45502c9c.png
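A rough C# sketch of that iteration (room sizes and counts are arbitrary; the real Dungeon.java is far more elaborate):
using System;
using System.Collections.Generic;

class DungeonGrower
{
    // Assumes a map bigger than roughly 10x10; true = open floor, false = wall.
    public static bool[,] Generate(int w, int h, int growAttempts, Random rng)
    {
        var open = new bool[w, h];                        // step 1: blank map

        // Step 2: one random seed room.
        CarveRoom(open, rng.Next(1, w - 6), rng.Next(1, h - 6), 5, 4);

        for (int i = 0; i < growAttempts; i++)            // steps 3-5
        {
            var edges = EdgeTiles(open);
            if (edges.Count == 0) break;
            var (x, y) = edges[rng.Next(edges.Count)];

            // Step 4: try to grow a corridor (width 1) or a small room;
            // if it would poke outside the map, do nothing this round.
            int rw = rng.Next(2) == 0 ? 1 : rng.Next(3, 7);
            int rh = rw == 1 ? rng.Next(3, 7) : rng.Next(3, 6);
            if (x + rw < w - 1 && y + rh < h - 1)
                CarveRoom(open, x, y, rw, rh);
        }
        return open;  // step 6: anything still false stays a wall
    }

    static void CarveRoom(bool[,] open, int x, int y, int rw, int rh)
    {
        for (int i = x; i < x + rw; i++)
            for (int j = y; j < y + rh; j++)
                open[i, j] = true;
    }

    // Wall tiles that touch the open area: candidates for growth.
    static List<(int x, int y)> EdgeTiles(bool[,] open)
    {
        var result = new List<(int x, int y)>();
        int w = open.GetLength(0), h = open.GetLength(1);
        for (int x = 1; x < w - 1; x++)
            for (int y = 1; y < h - 1; y++)
                if (!open[x, y] && (open[x - 1, y] || open[x + 1, y] || open[x, y - 1] || open[x, y + 1]))
                    result.Add((x, y));
        return result;
    }
}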
Your question makes me think of the XScreensaver Maze program. Look at its screenshots to see if that's the desired effect.
It looks like it took its maze generation algorithm from Wikipedia.
Wikipedia has a great article on Maze generation algorithms
How you create a random labyrinth will depend on what you want it to look like. If you're creating something that's designed to have a lot of dead ends, then you can just "randomly" trace a path from the start point to the end point, and then randomly fill in the empty spaces, essentially carving the path out of a solid block of material. E.g. imagine you had a stone tablet. First step would be to carve the "solution" path. Then you'd go in and make all of the dead ends.
If you want something that's more "play" than "puzzle", then creating a bunch of tile pieces that fit together in different ways is probably the way to go. That's how the Diablo games did it as far as I can tell; a number of predesigned "sets" and rules about how they fit together. You'd mark the four sides of the block with things like "three open spaces followed by two closed," and then if another piece also has a matching description, they can be put together.
After that, all you have to do is figure out how you can consistently render "random" behavior.
There's actually a trick that Al Lowe used for one of his Leisure Suit Larry games (LSL 3, I believe) that might be helpful.
Basically, he made a bamboo forest 'maze' that the player had to navigate. Rather than creating a separate 'square' of maze for each screen, however, he simply 'flipped' the one screen he had already created and made dead ends by blocking various entrances with a single 'bamboo wall' graphic.
Perhaps you could do the same: have the generator carve a valid maze, and then tell it to place dead-end blocks along some of the paths. That would ensure that there's always at least one valid, open path to the 'finish line', as well as preventing players from just strolling through a super-easy layout.
It'll also make a 30x30 maze more workable, since the computer won't have to test every square of a 900-square grid for validity.
