I am trying to make an application which will do two tasks:
Get some object from an image, e.g. a rectangle which is actually a traffic light.
Find this selected object in the training data; the training data is actually a bulk of images.
I have searched and found the OpenCV library, which can be used, but how do I start? How can I detect a specific shape in an image and find it in the training data with a matching probability?
Also, is there any algorithm that learns automatically?
You would need to have stored the coordinates of the rectangle in a CSV file (for example) along with the path to the image. You would then load the image along with the coordinates to get the traffic light as a subimage. This, I think, answers question 1.
You would then feed these subimages, which would be your positive dataset, along with some negative data, which could be random portions of the image that don't overlap with the traffic light, into a machine learning algorithm such as an SVM trained on HOG features. There is a nice tutorial in Python here: http://www.pyimagesearch.com/2014/11/10/histogram-oriented-gradients-object-detection/
This, I think, would lead you to solving question 2.
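The data-preparation steps above (cut the labelled box out as a positive patch, sample non-overlapping patches of the same size as negatives) can be sketched as follows. This is a minimal numpy-only illustration on a synthetic image; in real code you would load each image path and box from your CSV, and then compute HOG features on the patches as in the linked tutorial. All names here are mine, not from any library.

```python
import numpy as np

def crop_patch(image, x, y, w, h):
    """Extract a sub-image given the top-left corner and size stored in the CSV."""
    return image[y:y + h, x:x + w]

def sample_negative(image, box, patch_w, patch_h, rng):
    """Draw a random patch of the same size that does not overlap the labelled box."""
    H, W = image.shape[:2]
    bx, by, bw, bh = box
    while True:
        x = rng.integers(0, W - patch_w + 1)
        y = rng.integers(0, H - patch_h + 1)
        # reject patches that intersect the traffic-light rectangle
        if x + patch_w <= bx or x >= bx + bw or y + patch_h <= by or y >= by + bh:
            return image[y:y + patch_h, x:x + patch_w]

# toy demo on a synthetic image; real code would load the path from the CSV
rng = np.random.default_rng(0)
img = rng.integers(0, 255, size=(120, 160), dtype=np.uint8)
pos = crop_patch(img, 40, 30, 24, 48)              # labelled traffic-light box
neg = sample_negative(img, (40, 30, 24, 48), 24, 48, rng)
print(pos.shape, neg.shape)                        # (48, 24) (48, 24)
```

The positive and negative patches then go to the feature-extraction and SVM-training steps covered by the tutorial.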
Does that answer your question? Or have I misinterpreted it?
So I am using Tesseract with C# to read English text and it works like a charm. I use pre-trained data from the tesseract repo: https://github.com/tesseract-ocr/tessdata
So far, so good. However, I fail to understand how to solve the following situation: I have an image with a maximum of three numbers on it:
I also followed this tutorial in order to train my own data, but I failed to understand what exactly I was doing mid-way: https://pretius.com/how-to-prepare-training-files-for-tesseract-ocr-and-improve-characters-recognition/
In this tutorial, they use an existing font and train the network accordingly. However, I do not know what this font is. I tried to figure it out myself but was overwhelmed by the huge amount of information about Tesseract and do not have any idea where to start.
I was wondering if the following would be possible: I have lots of pictures looking like this (in fact, every possible character in every possible color; the only difference is that the background differs):
etc...
And with those pictures, I want to train the network, without using any existing font files.
My algorithm right now does not use Tesseract; it just screenshots the position of the numbers and compares them pixel-wise. I do not like this approach though, as the accuracy is only around 60%.
Thanks for your help in advance.
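For what it's worth, the pixel-wise comparison described above can often be made more robust by binarising both the captured digit and a stored template before comparing, so that overall color differences stop hurting the score. A minimal sketch (the templates here are random stand-ins, not real digit images, and the names are mine):

```python
import numpy as np

def match_score(patch, template):
    """Fraction of pixels that agree after binarising both images,
    which makes the compare tolerant of overall color differences."""
    a = patch > patch.mean()
    b = template > template.mean()
    return (a == b).mean()

# toy stand-ins for per-digit template images
rng = np.random.default_rng(3)
templates = {d: rng.random((10, 8)) for d in range(3)}

digit = templates[2].copy()                        # a captured digit image
best = max(templates, key=lambda d: match_score(digit, templates[d]))
print(best)                                        # 2
```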
I need to capture audio data from the computer mic, process it and then plot it in real time. Processing each frame produces a 1-D array which I want to display in an image where each value in the array is mapped to a color. The next audio frame is processed similarly and is then displayed on the next row of the image, and so on. In MATLAB, one can achieve this using the imagesc function. I also want the user to be able to scroll up and down to see current or previous data.
I believe I will need to buffer the processed data in a file or database and then asynchronously update the plot as mentioned above.
I'm trying to achieve all the above using C#.
My question is: what is the best way to generate the image/plot? I've done a lot of research (Microsoft Chart, VTK, several CodeProject articles...) but couldn't find exactly what I want.
Also, what would be the best database to use in such case?
I do not think there is a component that does exactly what you've described. In most frameworks, images are ultimately visualized by native system calls which accept strides, buffers and so on, all driven by a HANDLE. So either you generate a new image each time with the new rows added, or you draw it yourself by stacking the previous image onto the new one.
Scrolling (a.k.a. windowing) is not trivial, but it is possible, again with an already pre-created image of fixed size in memory. However, please note that GDI+-based images (.NET Bitmap) are somewhat limited beyond roughly 9000 px in size. Please consider using alternatives such as IPP or AForge images.
I recommend you draw the rows yourself, because in your task resizing is going to be an issue due to row blurring.
So, all in all, you might need to do it yourself.
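The "stack a new row per processed frame" approach above can be sketched as follows. This uses numpy rather than GDI+, purely to show the data flow; the C# version would append rows to a backing buffer and blit the visible window in the same way. All names are hypothetical.

```python
import numpy as np

class RowImageBuffer:
    """Grows an image one row per processed audio frame.

    Rows are kept in a plain list of arrays; a viewer would map values to
    colors and blit only the visible window, which makes scrolling just a
    row-range selection rather than an image resize.
    """
    def __init__(self, width):
        self.width = width
        self.rows = []

    def append_frame(self, frame):
        # each processed frame is a 1-D array of fixed length
        assert len(frame) == self.width
        self.rows.append(np.asarray(frame, dtype=np.float32))

    def window(self, first_row, height):
        """Return the rows currently scrolled into view."""
        return np.vstack(self.rows[first_row:first_row + height])

buf = RowImageBuffer(width=8)
for i in range(20):                      # 20 fake processed frames
    buf.append_frame(np.full(8, i))
view = buf.window(first_row=5, height=10)
print(view.shape)                        # (10, 8)
```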
We have a large number of images taken from a car for a project. To satisfy privacy norms, we need to detect faces and license plates and then blur those areas. I came to know of the Emgu CV project, and the tutorial given at http://www.emgu.com/wiki/index.php/License_Plate_Recognition_in_CSharp has been very useful for detecting license plates.
Is there a way of blurring this region using Emgu itself?
I don't believe that there is something built-in like what you are looking for.
What you will have to do, as with OpenCV, is blur a whole copy of your source image and then copy the license plate part of the blurred copy back to the original image.
You can do this using the SmoothBlur method first and then the Copy method that accepts a mask as its second argument.
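The same blur-then-mask-copy idea, sketched language-neutrally in numpy (this is a stand-in illustration, not Emgu's actual API; SmoothBlur is approximated here by a simple box blur, and the names are mine):

```python
import numpy as np

def box_blur(img, k=5):
    """Naive box blur via an integral image (a stand-in for SmoothBlur)."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode='edge')
    c = np.cumsum(np.cumsum(padded, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))
    return (c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]) / (k * k)

def blur_region(img, mask, k=5):
    """Blur a whole copy, then copy the blurred pixels back only where
    the mask is set -- the Copy-with-mask idea from the answer above."""
    blurred = box_blur(img, k)
    out = img.astype(np.float64).copy()
    out[mask] = blurred[mask]
    return out

img = np.zeros((40, 60))
img[18:22, 25:35] = 255.0                     # a bright "plate" area
mask = np.zeros(img.shape, dtype=bool)
mask[15:25, 20:40] = True                     # detected plate rectangle
result = blur_region(img, mask)
print(result[20, 30], result[0, 0])           # 204.0 0.0
```

Inside the mask the bright pixels are averaged down; outside the mask the original image is untouched.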
I have an objective: I need to join, for example, 2 pictures like http://imgur.com/9G0fV and http://imgur.com/69HUg. The result has to be an image like http://imgur.com/SCG1X, not http://imgur.com/LO4fh.
I'll explain in words: I have some images with the same areas, and I need to find the common area, crop it in one image and then join them.
Take a look at this article; it explains a possible solution using the AForge.NET image processing library for C#.
What you want to do is read the pixel values into arrays, then find the overlapping area using an algorithm like correlation or min cut. After finding the coordinates of the overlap, write both images out into a new array, using coordinates relative to the large image minus the position of the overlap in the source image plus its position in the destination image. C# is not a factor in solving this, unless you meant to ask about existing .NET frameworks that can help.
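A minimal sketch of that correlation-based approach, assuming a purely horizontal shift between the two images (numpy here; the function names are mine, and a real .NET port would be straightforward):

```python
import numpy as np

def find_overlap(left, right, max_overlap):
    """Score each candidate overlap width by correlation of the shared strip."""
    best_w, best_score = 1, -np.inf
    for w in range(1, max_overlap + 1):
        a = left[:, -w:].ravel().astype(np.float64)
        b = right[:, :w].ravel().astype(np.float64)
        a = a - a.mean()
        b = b - b.mean()
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        score = (a @ b) / denom if denom else 0.0
        if score > best_score:
            best_score, best_w = score, w
    return best_w

def stitch(left, right, overlap):
    """Write both images into a new array, keeping the duplicated strip once."""
    h = left.shape[0]
    out = np.zeros((h, left.shape[1] + right.shape[1] - overlap))
    out[:, :left.shape[1]] = left
    out[:, left.shape[1]:] = right[:, overlap:]
    return out

rng = np.random.default_rng(1)
scene = rng.random((30, 100))
left, right = scene[:, :60], scene[:, 40:]     # views with a 20-column overlap
ov = find_overlap(left, right, max_overlap=30)
merged = stitch(left, right, ov)
print(ov, merged.shape)                        # 20 (30, 100)
```

Real photographs would also need vertical-shift search and some tolerance for exposure differences, which is where feature-based alignment comes in.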
I am developing a .NET library called SharpStitch (commercial) which can do the job.
It uses feature-based image alignment for general-purpose image stitching.
I know I am probably being dense here but I need some help.
I am working on a program that handles mapping of an area. I need the map to be georeferenced so I can gather the MGRS coordinates for any point on the map. I already have a lib I wrote that does this, working with images I import one by one using upper-left and bottom-right coordinates. I then simply calculate the number of pixels and their offset from the top-left and bottom-right of the image.
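That per-pixel calculation amounts to linear interpolation between the two georeferenced corners. A minimal sketch (function and parameter names are hypothetical, and this assumes an unrotated, linearly scaled map image):

```python
def pixel_to_geo(px, py, width, height, upper_left, bottom_right):
    """Linearly interpolate a pixel position between the two georeferenced
    corners; upper_left and bottom_right are (lat, lon) pairs."""
    ul_lat, ul_lon = upper_left
    br_lat, br_lon = bottom_right
    lon = ul_lon + (px / (width - 1)) * (br_lon - ul_lon)
    lat = ul_lat + (py / (height - 1)) * (br_lat - ul_lat)
    return lat, lon

# corners of a hypothetical 1000x800 map image
lat, lon = pixel_to_geo(0, 0, 1000, 800, (40.0, -75.0), (39.0, -74.0))
print(lat, lon)            # 40.0 -75.0 (the upper-left corner maps to itself)
```

The lat/lon result would then be converted to MGRS by the existing library.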
What I am trying to do is create a draggable map like Google Maps or any number of other mapping systems.
Here's the kicker. The system is running on a closed network with no access to Google or any other online resource for the maps.
I have 500 GB worth of map data that I can work with, but the format is something I am not familiar with: an XML file with some georef data, and a truckload of files with a .tileset extension.
I assume I need to create some sort of tile stitching routine similar to what you would see in a game engine, but I have no experience with such engines.
Can anyone give me some advice, libs, or directions to start researching how to parse and use these tileset files and get this feature going?