I believe the text result of OCR depends not only on how good the OCR library is, but also on the image being scanned.
My question is: for a given image, how can I make it easier to recognize?
Change the original color? (for example, convert to greyscale)
Change the scale/size? (for example, bigger)
Change the area to recognize? (for example, crop to a region by setting distances from the top/left edges)
Any other ideas?
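For illustration, here is a rough pre-processing sketch in C# with System.Drawing covering those three ideas (greyscale conversion via a ColorMatrix, upscaling, and cropping to a region of interest); the luminance weights and the scale factor are just typical values, not requirements of any particular OCR engine:

    using System.Drawing;
    using System.Drawing.Drawing2D;
    using System.Drawing.Imaging;

    static class OcrPrep
    {
        // Typical pre-processing before handing an image to an OCR engine:
        // crop to the text region, scale it up, and convert it to greyscale.
        public static Bitmap PrepareForOcr(Bitmap source, Rectangle textRegion, int scale)
        {
            var result = new Bitmap(textRegion.Width * scale, textRegion.Height * scale);
            using (var g = Graphics.FromImage(result))
            using (var attributes = new ImageAttributes())
            {
                // Standard luminance weights map every pixel to a shade of grey.
                attributes.SetColorMatrix(new ColorMatrix(new[]
                {
                    new float[] { 0.299f, 0.299f, 0.299f, 0, 0 },
                    new float[] { 0.587f, 0.587f, 0.587f, 0, 0 },
                    new float[] { 0.114f, 0.114f, 0.114f, 0, 0 },
                    new float[] { 0, 0, 0, 1, 0 },
                    new float[] { 0, 0, 0, 0, 1 }
                }));

                g.InterpolationMode = InterpolationMode.HighQualityBicubic;
                g.DrawImage(source,
                    new Rectangle(0, 0, result.Width, result.Height), // enlarged destination
                    textRegion.X, textRegion.Y, textRegion.Width, textRegion.Height,
                    GraphicsUnit.Pixel, attributes);
            }
            return result;
        }
    }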
Using SharpMap (Windows Forms), how can I display an overlay image over the map tile background so that the image scales and moves along with the map as it zooms and pans? I want to implement georeferencing-like functionality.
I tried using SharpMap.Layers.GdiImageLayer, but I didn't find a way to position it at a custom position (a real-world coordinate); by default it is placed at the origin [0,0] (in the background tile coordinates). Is there a way to define a relation (transformation) that places it at a particular position?
I managed to get closer to a solution using VectorLayer with RasterPointSymbolizer, but the symbol (the image loaded for the symbol) would need to be transformed on map scaling/zooming so that its corners remain at the same real-world coordinates. The math behind that looks a little complicated for something that feels more like a workaround than a natural solution. As a note, I don't need precise calculations using a projection/real geoid, since I am only doing this at building level.
Using GDAL, GeoTIFF might be an option: generate a GeoTIFF from the original image and make the georeferencing metadata dynamic, driven by UI controls. The original image is in raster format (JPG) with no geographic metadata.
Is there a better solution?
If VectorLayer with RasterPointSymbolizer is the best choice, do you have an example of keeping the symbol (image) synchronized with the map view?
As far as I can tell, GdiImageLayer calculates the position based on a world file (GdiImageLayer.SetEnvelope()). Here is the link to the method. This metadata file has to have the same name and location as the image, but with the *.wld file extension. In it you set the position (top-left corner) and the skew on the x and y axes, so rotation will be preserved. I have never tried this, but I would expect correct zooming to work.
(The wiki article on world files also offers a good explanation.)
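For reference, a world file is just six lines of plain text; an illustrative example follows (the values are made up, and the annotations on the right are not part of the file):

    0.5          pixel size in the x-direction
    0.0          rotation term about the y-axis
    0.0          rotation term about the x-axis
    -0.5         pixel size in the y-direction (negative, since image rows run downwards)
    300000.0     x-coordinate of the centre of the upper-left pixel
    5000000.0    y-coordinate of the centre of the upper-left pixel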
There might be another way if no world file is present and the information lives in the image header. There is a GDalSample that loads a GeoTIFF image onto an existing map. The example can be found in the WinFormSamples, in the "Different kind of layers supported by [MapBox]" example. That map shows the overlay only while magnifying and is not fully functional, but it may be a good hint.
I have a small image with some text in it, a screenshot of a small area of a console application. I need the simplest possible test to tell whether it matches a pre-defined pattern - in other words, whether the text is exactly some specific word, let's say "root".
Font is fixed, no rotations, but it should be robust to small offsets.
So, my template/pattern image is fixed:
And my test image can be this:
In case of a negative result, I don't need to extract the text from it or anything. So I'm looking for the simplest test possible.
I tried counting the white pixels in the image, but that's unreliable.
I have an image that is a depth heatmap, from which I've filtered out anything farther away than the first 25% of the depth range.
It looks something like this:
There are two blobs of color in the image: one is my hand (with part of my face behind it), and the other is the desk in the lower-left corner. How can I search the image to find these blobs? I would like to be able to draw a rectangle around each of them if possible.
I can also do this (ignore shades, and filter to black or white):
Pick a random pixel as a seed pixel. This becomes area A. Repeatedly expand A until A doesn't get any bigger. That's your area.
The way to expand A is to look for pixels neighboring A whose color is similar to at least one of their neighbors already in A.
What "similar color" means to you is somewhat variable. If you can reduce the image to exactly two colors, as you say in another answer, then "similar" means "equal". Otherwise, "similar" means colors whose RGB values (or whatever components you use) are each within a small amount of one another (e.g. (255, 128, 128) is similar to (252, 125, 130)).
You can also limit the selected pixels so they must be similar to the seed pixel, but that works better when a human is picking the seed. (I believe this is what is done in Photoshop, for example.)
This can be better than edge detection because you can deal with gradients without filtering them out of existence, and you don't need to process the resulting detected edges into a coherent area. It has the disadvantage that a gradient can go all the way from black to white and it'll register as the same area, but that may be what you want. Also, you have to be careful with the implementation or else it will be too slow.
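A minimal C# sketch of this region-growing idea (my own illustration, using a System.Drawing Bitmap, 4-neighbour expansion and a simple per-channel tolerance):

    using System;
    using System.Collections.Generic;
    using System.Drawing;

    static class RegionGrowing
    {
        // Grows a region from (seedX, seedY): a pixel joins the region when it is
        // a 4-neighbour of a region pixel and its colour is within 'tolerance'
        // per channel of that neighbour.
        public static Rectangle GrowRegion(Bitmap bmp, int seedX, int seedY, int tolerance)
        {
            var inRegion = new bool[bmp.Width, bmp.Height];
            var queue = new Queue<Point>();
            queue.Enqueue(new Point(seedX, seedY));
            inRegion[seedX, seedY] = true;

            int minX = seedX, maxX = seedX, minY = seedY, maxY = seedY;
            int[] dx = { 1, -1, 0, 0 };
            int[] dy = { 0, 0, 1, -1 };

            while (queue.Count > 0)
            {
                Point p = queue.Dequeue();
                Color pc = bmp.GetPixel(p.X, p.Y);

                for (int i = 0; i < 4; i++)
                {
                    int nx = p.X + dx[i], ny = p.Y + dy[i];
                    if (nx < 0 || ny < 0 || nx >= bmp.Width || ny >= bmp.Height || inRegion[nx, ny])
                        continue;

                    Color nc = bmp.GetPixel(nx, ny);
                    bool similar = Math.Abs(nc.R - pc.R) <= tolerance
                                && Math.Abs(nc.G - pc.G) <= tolerance
                                && Math.Abs(nc.B - pc.B) <= tolerance;
                    if (!similar)
                        continue;

                    inRegion[nx, ny] = true;
                    queue.Enqueue(new Point(nx, ny));

                    // Track the bounding box so a rectangle can be drawn around the blob.
                    minX = Math.Min(minX, nx); maxX = Math.Max(maxX, nx);
                    minY = Math.Min(minY, ny); maxY = Math.Max(maxY, ny);
                }
            }
            return Rectangle.FromLTRB(minX, minY, maxX + 1, maxY + 1);
        }
    }

Bitmap.GetPixel keeps the sketch short but is slow; switching to LockBits is the usual answer to the speed caveat mentioned above.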
It might be overkill for what you need, but there's a great wrapper for C# for the OpenCV libraries.
I have successfully used OpenCV in C++ for blob detection, so you might find it useful for what you're trying to do.
http://www.emgu.com/wiki/index.php/Main_Page
and the wiki page on OpenCV:
http://en.wikipedia.org/wiki/OpenCV
Edited to add: Here is a blob detection library for Emgu in C#. There are even some nice features, such as ordering the blobs by descending area (useful for filtering out noise).
http://www.emgu.com/forum/viewtopic.php?f=3&t=205
Edit Again:
If Emgu is too heavyweight, AForge.NET also includes some blob detection methods:
http://www.aforgenet.com/framework/
If the image really is only two or more distinct colours (very little blur between colours), it is an easy case for an edge detection algorithm.
You can use something like the code sample from this question: find a color in an image in C#
It will help you find the x/y of specific colors in your image. Then you could use the min x/max x and the min y/max y to draw your rectangles.
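A rough sketch of that min/max approach (my own illustration; it assumes a System.Drawing Bitmap and an exact colour match, and uses GetPixel for brevity):

    using System.Drawing;

    static class ColorBounds
    {
        // Finds the bounding rectangle of all pixels matching 'target' exactly.
        // Returns Rectangle.Empty when no pixel matches.
        public static Rectangle BoundingBoxOfColor(Bitmap bmp, Color target)
        {
            int minX = int.MaxValue, minY = int.MaxValue, maxX = -1, maxY = -1;
            for (int y = 0; y < bmp.Height; y++)
            {
                for (int x = 0; x < bmp.Width; x++)
                {
                    if (bmp.GetPixel(x, y).ToArgb() != target.ToArgb())
                        continue;
                    if (x < minX) minX = x;
                    if (x > maxX) maxX = x;
                    if (y < minY) minY = y;
                    if (y > maxY) maxY = y;
                }
            }
            return maxX < 0 ? Rectangle.Empty
                            : Rectangle.FromLTRB(minX, minY, maxX + 1, maxY + 1);
        }
    }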
Detecting an object in an image based on its color, in C#.
To detect an object based on its color, there is a simple algorithm. You have to choose a filtering method; the steps are normally:
Take the image
Apply your filtering
Apply greyscaling
Subtract the background to get your objects
Find the positions of all objects
Mark the objects
First you have to choose a filtering method; there are many filtering methods available for C#. I mainly prefer the AForge filters; for this purpose they have a few:
ColorFiltering
ChannelFiltering
HSLFiltering
YCbCrFiltering
EuclideanColorFiltering
My favorite is EuclideanColorFiltering. It is easy and simple. For information about the other filters you can visit the link below. You have to download the AForge DLLs to use these in your code.
More information about the exact steps can be found here: Link
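A rough sketch of these steps with AForge.NET's EuclideanColorFiltering and BlobCounter (the target colour, radius and blob-size thresholds are illustrative values of mine, not taken from the link):

    using System.Drawing;
    using AForge.Imaging;         // RGB, BlobCounter, ObjectsOrder
    using AForge.Imaging.Filters; // EuclideanColorFiltering

    static class ColoredObjectFinder
    {
        // Keeps only pixels close to a target colour, then locates the resulting blobs.
        public static Rectangle[] FindColoredObjects(Bitmap source)
        {
            // Steps 1-2: take the image and apply the colour filtering.
            var filter = new EuclideanColorFiltering
            {
                CenterColor = new RGB(215, 30, 30), // target colour (illustrative)
                Radius = 100                        // allowed distance from that colour
            };
            Bitmap filtered = filter.Apply(source);

            // Steps 3-5: BlobCounter processes the filtered image and finds
            // the connected objects and their positions.
            var blobCounter = new BlobCounter
            {
                FilterBlobs = true, // drop tiny noise blobs
                MinWidth = 5,
                MinHeight = 5,
                ObjectsOrder = ObjectsOrder.Area // biggest blobs first
            };
            blobCounter.ProcessImage(filtered);

            // Step 6: draw these rectangles on the original image to mark the objects.
            return blobCounter.GetObjectsRectangles();
        }
    }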
I want to cut up the image based on various text markers placed at different points within it. The font/size of the markers is up to me.
I know commercial OCR packages provide this in their APIs but I'm hoping I can code this up myself.
Ideally I wouldn't have to go pixel to pixel and compare against an image of the marker text.
I'm good with C++/C#, Java, PHP, and any other language where a library like this exists...
"Ideally I wouldn't have to go pixel to pixel and compare against an image of the marker text."
Well, if you're trying to find the marker image, then that's exactly what you'd have to do.
Here's an idea... Set the marker text to a particular color, then process the background image to make sure that it doesn't have any pixels of that color. Finding the markers should become a lot easier at that point.
A barcode would be easier to detect than a text marker. You can always place them together, with the barcode being used for automatic position detection and the text for the human reader.
If you want a really sophisticated solution, you could use the Hough transform. It is often used for augmented reality work, where it is necessary to find a certain marker in an image. Of course, you would have to change your markers a bit - would that be possible? ;-)
The Hough transform will give you the positions of your marker lines, and thus the area you want to cut out.
Here is a link about the Hough transform, though there are many others:
Hough
Or this one
Wiki
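As a sketch of the idea, AForge.NET ships a HoughLineTransformation that could be used along these lines (the grayscale conversion choice and the line count are my own illustrative values):

    using System.Drawing;
    using AForge.Imaging;         // HoughLineTransformation, HoughLine
    using AForge.Imaging.Filters; // Grayscale

    static class MarkerLineFinder
    {
        // Finds the strongest straight lines in the image; their (Theta, Radius)
        // parameters indicate where the marker lines sit, and thus the area to cut out.
        public static HoughLine[] FindMarkerLines(Bitmap source)
        {
            // The Hough transform in AForge expects an 8bpp grayscale image.
            Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(source);

            var hough = new HoughLineTransformation();
            hough.ProcessImage(gray);

            // Keep only the few most intensive lines; 10 is an arbitrary cut-off.
            return hough.GetMostIntensiveLines(10);
        }
    }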
A fiducial marker would be better than text. That's what they use for augmented reality and such.
If the text is always the same size, shape, and oriented in the same direction, you could use normalized cross-correlation.
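A minimal sketch of normalized cross-correlation at a single offset, assuming grayscale intensity arrays (sliding it over the image and thresholding the score are left out):

    using System;

    static class TemplateMatch
    {
        // Normalized cross-correlation between a template and the equally-sized
        // patch of the image at offset (offX, offY). Returns a value in [-1, 1];
        // values close to 1 indicate a strong match.
        public static double Ncc(double[,] image, double[,] template, int offX, int offY)
        {
            int w = template.GetLength(0), h = template.GetLength(1);

            double meanI = 0, meanT = 0;
            for (int x = 0; x < w; x++)
                for (int y = 0; y < h; y++)
                {
                    meanI += image[offX + x, offY + y];
                    meanT += template[x, y];
                }
            meanI /= w * h;
            meanT /= w * h;

            double num = 0, varI = 0, varT = 0;
            for (int x = 0; x < w; x++)
                for (int y = 0; y < h; y++)
                {
                    double di = image[offX + x, offY + y] - meanI;
                    double dt = template[x, y] - meanT;
                    num += di * dt;
                    varI += di * di;
                    varT += dt * dt;
                }

            double denom = Math.Sqrt(varI * varT);
            return denom == 0 ? 0 : num / denom; // flat patches correlate with nothing
        }
    }

Evaluating this for a small range of offsets around the expected position and keeping the best score (thresholded at, say, 0.9) gives the robustness to small shifts.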
"Ideally I wouldn't have to go pixel to pixel and compare against an image of the marker text."
Well how else are you going to do it? You're only going to search in part of the image?
At some long-ago Flash conferences I recall seeing a demo of a Flash app that had a color picker. Based on the user's color choice the app would show the user a set of images within the approximate range of that color: a bunch of mostly red images, a bunch of mostly blue images, etc.
I'm looking for two things:
1) A link to a demo of this sort of app, ideally a Flash app
2) ActionScript or C# code that describes how to pick a bunch of images that fall within a color range.
I know how to extract the aggregate/average RGB from individual images and persist this info to a database. I need to know how exactly to select out images within a certain range of color tolerance. Could this be done purely using SQL and a knowledge of the alphanumeric assignments of RGB color codes, or is there a better way?
I could not find any sample code, but found an article that gives a high-level explanation of their process (from this other page about Flickr's feature to search for images with similar colors). Apparently, Google also lets you do this with their image search (but I don't know if that is from metadata tags or actual color matching).
Now to the actual answer:
Instead of just storing the average or aggregate color for an image, you will need to store a "color signature" of the image.
My first (educated guess) idea would entail these steps:
Generate the histogram for each color band from the image
Generate some factors that describe each histogram curve (mean, variance, standard deviation, etc. - these factors will make up the digital signature of your image)
Store those factors in your database (and each of these factors would have an index in the DB)
Then, you would take your input (either a color, range of colors, or source image), run your histogram algorithm against that source, and search for matches to your computed factors.
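A rough sketch of steps 1 and 2 (the ColorSignature type and the choice of mean/standard deviation as the stored factors are illustrative; GetPixel is kept for brevity):

    using System;
    using System.Drawing;

    // Illustrative container for the per-channel factors that would be stored
    // (and indexed) in the database.
    class ColorSignature
    {
        public double MeanR, MeanG, MeanB;
        public double StdDevR, StdDevG, StdDevB;
    }

    static class Signatures
    {
        public static ColorSignature ComputeSignature(Bitmap bmp)
        {
            // Step 1: build a histogram for each colour band.
            var histR = new int[256];
            var histG = new int[256];
            var histB = new int[256];
            for (int y = 0; y < bmp.Height; y++)
                for (int x = 0; x < bmp.Width; x++)
                {
                    Color c = bmp.GetPixel(x, y);
                    histR[c.R]++; histG[c.G]++; histB[c.B]++;
                }

            // Step 2: reduce each histogram to a few describing factors.
            int n = bmp.Width * bmp.Height;
            return new ColorSignature
            {
                MeanR = Mean(histR, n), StdDevR = StdDev(histR, n),
                MeanG = Mean(histG, n), StdDevG = StdDev(histG, n),
                MeanB = Mean(histB, n), StdDevB = StdDev(histB, n)
            };
        }

        static double Mean(int[] hist, int n)
        {
            double sum = 0;
            for (int v = 0; v < 256; v++) sum += (double)v * hist[v];
            return sum / n;
        }

        static double StdDev(int[] hist, int n)
        {
            double mean = Mean(hist, n), sum = 0;
            for (int v = 0; v < 256; v++) sum += hist[v] * (v - mean) * (v - mean);
            return Math.Sqrt(sum / n);
        }
    }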
The Flickr Hacks solution I cite in the comments is the best I've found: it involves resizing the image to 1x1 pixels using common resampling algorithms, which gives you an average color for the entire image. Clever.
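A minimal sketch of that trick with System.Drawing (the interpolation mode is my own choice, not necessarily the one from the Flickr Hacks recipe):

    using System.Drawing;
    using System.Drawing.Drawing2D;

    static class AverageColorTrick
    {
        // Averages an entire image by resampling it down to a single pixel.
        public static Color AverageColor(Bitmap source)
        {
            using (var onePixel = new Bitmap(1, 1))
            using (var g = Graphics.FromImage(onePixel))
            {
                // High-quality resampling blends the source pixels into the single result pixel.
                g.InterpolationMode = InterpolationMode.HighQualityBicubic;
                g.DrawImage(source, new Rectangle(0, 0, 1, 1));
                return onePixel.GetPixel(0, 0);
            }
        }
    }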
You can use a set of descriptors for each image, then match those.
There's some great work on it here, for C#:
http://savvash.blogspot.com/p/compact-composite-descriptors.html
Also check out the free img(Rummager) tool, which can do what you want (i.e. find images matched by colour).