I've a image like this (white background and black text). If there is not noise (as you can see: the top and bottom of number line has many noise), Tesseract can recognize number very good.
But when has noise, Tesseract try to recognize it as number and add more number to result. It is really bad. How can I make Tesseract Ignore Noise? I can't make a preprocessing image to make it more contrast or sharp text. This doesn't help anything.
If some tool can to hightlight only string line. It can be really good input to Tesseract. Please help me. Thanks everybody.
You should try eroding and dilating:
The most basic morphological operations are two: Erosion and Dilation.
They have a wide array of uses, i.e. :
Removing noise
...
you could try to down sample your binary image and sample it up again (pyrDown and PyrUp) or you could try to smooth your image with an gaussian blur. And, as already suggested, erode and dilate your image.
I see 3 solutions for your problem:
As already sugested - try using erode and dilate or some kind of blur. It's the simplest solution.
Find all contours (findContours function) and then delete all contours with area less then some value (try different values, you should find correct one quite fast). Note that the value may not be constant - for example you can try to use 80% of average contour area (just add all contours areas, divide it by number of contours and multiply by 0.8).
Find all contours. Create one dimension array of integers, with length equal to your image height. Fill array with zeros. Now for each contour:
I. Find the top and the bottom point (points with the biggest and the smallest value of y coordinate). Let's name this points T and B.
II. Add one to all elements of array which index is between B.y and T.y. (so if B = (1, 4) and T = (3, 11) then add one to array[4], array[5], array[6] ..., array[11]).
Find the biggest element of array. Let's name this value v. All contours for which B.y <= v <= T.y should be letters, other contours - noise.
you can easily remove these noises by using image processing techniques(Morphological operations like erode and dilate) you can choose opencv for this operations.
Do connected component labeling....that is blob counting....all dose noises can never match the size of the numbers....with morphological techniques the numbers also get modified...label the image...count the number of pixels in each labeled region and set a threshold (which you can easily set as you will only have numbers and noises)...cvblob is the library written in C++ available at code googles...
I had similar problem: small noises was cause of tesseract fails. I cannot use open-cv, because I was developing some feature on android, and open-cv was unwanted because of it large size. I don't know if this solution is good, but here is what I did.
I found all black regions in image (points of each region I added to own region set). Then, I check if count of point in this region is bigger than some threshold, like 10, 25 and 50. If true, I make white all points of that region.
Related
I would like to find a piece of an image inside another image. However, I have some regions pixels in both images that I don't want to take into account. So I was thinking of using some type of mask with zeros or ones to indicate the good pixels.
I am using the MatchTemplate method from emgu and it does not accept a mask. Is there any other way of doing what I would like to do? Thank you!
ReferenceImage.MatchTemplate(templateImage, Emgu.CV.CvEnum.TM_TYPE.CV_TM_CCORR_NORMED);
I thought of a solution. Asuming that referenceImageMask and templateMask have 1s in the good pixels and 0s in the bad ones. And that referenceImage and templateImage have already been masked and have 0s in the bad pixels as well.
Then, the first result of template matching will give the not normalized cross correlation between the images.
The second template matching will give for each possible offset the number of pixels that were at the same time different from zero (unmasked) in both images.
Then, normalizing the correlation by that number should give the value I wanted. The average product for the pixels that are not masked in both images.
Image<Gray, float> imCorr = referenceImage.MatchTemplate(templateImage, Emgu.CV.CvEnum.TM_TYPE.CV_TM_CCORR);
Image<Gray, float> imCorrMask = referenceImageMask.MatchTemplate(templateMask, Emgu.CV.CvEnum.TM_TYPE.CV_TM_CCORR);
imCorr = imCorr .Mul(imCorrMask .Pow(-1));
Today you could use this method:
CvInvoke.MatchTemplate(actualImage, expectedImage, result, TemplateMatchingType.CcoeffNormed, mask);
I am trying to write an algorithm (in c#) that will stitch two or more unrelated heightmaps together so there is no visible seam between the maps. Basically I want to mimic the functionality found on this page :
http://www.bundysoft.com/wiki/doku.php?id=tutorials:l3dt:stitching_heightmaps
(You can just look at the pictures to get the gist of what I'm talking about)
I also want to be able to take a single heightmap and alter it so it can be tiled, in order to create an endless world (All of this is for use in Unity3d). However, if I can stitch multiple heightmaps together, I should be able to easily modify the algorithm to act on a single heightmap, so I am not worried about this part.
Any kind of guidance would be appreciated, as I have searched and searched for a solution without success. Just a simple nudge in the right direction would be greatly appreciated! I understand that many image manipulation techniques can be applied to heightmaps, but have been unable to find a image processing algorithm that produces the results I'm looking for. For instance, image stitching appears to only work for images that have overlapping fields of view, which is not the case with unrelated heightmaps.
Would utilizing a FFT low pass filter in some way work, or would that only be useful in generating a single tileable heightmap?
Because the algorithm is to be used in Unit3d, any c# code will have to be confined to .Net 3.5, as I believe that's the latest version Unity uses.
Thanks for any help!
Okay, seems I was on the right track with my previous attempts at solving this problem. My initial attemp at stitching the heightmaps together involved the following steps for each point on the heightmap:
1) Find the average between a point on the heightmap and its opposite point. The opposite point is simply the first point reflected across either the x axis (if stitching horizontal edges) or the z axis (for the vertical edges).
2) Find the new height for the point using the following formula:
newHeight = oldHeight + (average - oldHeight)*((maxDistance-distance)/maxDistance);
Where distance is the distance from the point on the heightmap to the nearest horizontal or vertical edge (depending on which edge you want to stitch). Any point with a distance less than maxDistance (which is an adjustable value that effects how much of the terrain is altered) is adjusted based on this formula.
That was the old formula, and while it produced really nice results for most of the terrain, it was creating noticeable lines in the areas between the region of altered heightmap points and the region of unaltered heightmap points. I realized almost immediately that this was occurring because the slope of the altered regions was too steep in comparison to the unaltered regions, thus creating a noticeable contrast between the two. Unfortunately, I went about solving this issue the wrong way, looking for solutions on how to blur or smooth the contrasting regions together to remove the line.
After very little success with smoothing techniques, I decided to try and reduce the slope of the altered region, in the hope that it would better blend with the slope of the unaltered region. I am happy to report that this has improved my stitching algorithm greatly, removing 99% of the lines reported above.
The main culprit from the old formula was this part:
(maxDistance-distance)/maxDistance
which was producing a value between 0 and 1 linearly based on the distance of the point to the nearest edge. As the distance between the heightmap points and the edge increased, the heightmap points would utilize less and less of the average (as defined above), and shift more and more towards their original values. This linear interpolation was the cause of the too step slope, but luckily I found a built in method in the Mathf class of Unity's API that allows for quadratic (I believe cubic) interpolation. This is the SmoothStep Method.
Using this method (I believe a similar method can be found in the Xna framework found here), the change in how much of the average is used in determining a heightmap value becomes very severe in middle distances, but that severity lessens exponentially the closer the distance gets to maxDistance, creating a less severe slope that better blends with the slope of the unaltered region. The new forumla looks something like this:
//Using Mathf - Unity only?
float weight = Mathf.SmoothStep(1f, 0f, distance/maxDistance);
//Using XNA
float weight = MathHelper.SmoothStep(1f, 0f, distance/maxDistance);
//If you can't use either of the two methods above
float input = distance/maxDistance;
float weight = 1f + (-1f)*(3f*(float)Math.Pow(input, 2f) - 2f*(float)Math.Pow(input, 3f));
//Then calculate the new height using this weight
newHeight = oldHeight + (average - oldHeight)*weight;
There may be even better interpolation methods that produce better stitching. I will certainly update this question if I find such a method, so anyone else looking to do heightmap stitching can find the information they need. Kudos to rincewound for being on the right track with linear interpolation!
What is done in the images you posted looks a lot like simple linear interpolation to me.
So basically: You take two images (Left, Right) and define a stitching region. For linear interpolation you could take the leftmost pixel of the left image (in the stitching region) and the rightmost pixel of the right image (also in the stitching region). Then you fill the space in between with interpolated values.
Take this example - I'm using a single line here to show the idea:
Left = [11,11,11,10,10,10,10]
Right= [01,01,01,01,02,02,02]
Lets say our overlap is 4 pixels wide:
Left = [11,11,11,10,10,10,10]
Right= [01,01,01,01,02,02,02]
^ ^ ^ ^ overlap/stitiching region.
The leftmost value of the left image would be 10
The rightmost value of the right image would be 1.
Now we interpolate linearly between 10 and 1 in 2 steps, our new stitching region looks as follows
stitch = [10, 07, 04, 01]
We end up with the following stitched line:
line = [11,11,11,10,07,04,01,02,02,02]
If you apply this to two complete images you should get a result similar to what you posted before.
I have an binary format STL (STereoLithography) file, I have successfully read the file from c#.net and got the facets, I also got the count of triangles, volume of the part and surface area of the part. But now the problem is that I am not able to find the dimensions of the 3D object(Length,breadth,height). Please help.
I think you will probably have to calculate this yourself, but the algorithm should be fairly simple, assuming you want the dimensions in x,y,z rather than some rotation. Loop through the facets to find the max and min x,y and z coordinate values. Then the dimensions are simply the differences between max and min.
Edit: you could keep track of the max/min values while reading in the points from the file for a minor performance boost.
I have a problem. My company has given me an awfully boring task. We have two databases of dialog boxes. One of these databases contains images of horrific quality, the other very high quality.
Unfortunately, the dialogs of horrific quality contain important mappings to other info.
I have been tasked with, manually, going through all the bad images and matching them to good images.
Would it be possible to automate this process to any degree? Here is an example of two dialog boxes (randomly pulled from Google images) :
So I am currently trying to write a program in C# to pull these photos from the database, cycle through them, find the ones with common shapes, and return theird IDs. What are my best options here ?
I really see no reason to use any external libraries for this, I've done this sort of thing many times and the following algorithm works quite well. I'll assume that if you're comparing two images that they have the same dimensions, but you can just resize one if they don't.
badness := 0.0
For x, y over the entire image:
r, g, b := color at x,y in image 1
R, G, B := color at x,y in image 2
badness += (r-R)*(r-R) + (g-G)*(g-G) + (b-B)*(b-B)
badness /= (image width) * (image height)
Now you've got a normalized badness value between two images, the lower the badness, the more likely that the images match. This is simple and effective, there are a variety of things that make it work better or faster in certain cases but you probably don't need anything like that. You don't even really need to normalize the badness, but this way you can just come up with a single threshold for it if you want to look at several possible matches manually.
Since this question has gotten some more attention I've decided to add a way to speed this up in cases where you are processing many images many times. I used this approach when I had several tens of thousands of images that I needed to compare, and I was sure that a typical pair of images would be wildly different. I also knew that all of my images would be exactly the same dimensions. In a situation in which you are comparing dialog boxes your typical images may be mostly grey-ish, and some of your images may require resizing (although maybe that just indicates a mis-match), in which case this approach may not gain you as much.
The idea is to form a quad-tree where each node represents the average RGB values of the region that node represents. So an 4x4 image would have a root node with RGB values equal to the average RGB value of the image, its children would have RGB values representing the average RGB value of their respective 2x2 regions, and their children would represent individual pixels. (In practice it is a good idea to not go deeper than a region of about 16x16, at that point you should just start comparing individual pixels.)
Before you start comparing images you will also need to decide on a badness threshold. You won't calculate badnesses above this threshold with any reliable accuracy, so this is basically the threshold at which you are willing to label an image as 'not a match'.
Now when you compare image A to image B, first compare the root nodes of their quad-tree representations. Calculate the badness just as you would for a single pixel image, and if the badness exceeds your threshold then return immediately and report the badness at this level. Because you are using normalized badnesses, and since badnesses are calculated using squared differences, the badness at any particular level will be equal to or less than the badness at lower levels, so if it exceeds the threshold at any points you know it will also exceed the threshold at the level of individual pixels.
If the threshold test passes on an nxn image, just drop to the next level down and compare it like it was a 2nx2n image. Once you get low enough just compare the individual pixels. Depending on your corpus of images this may allow you to skip lots of comparisons.
I would personally go for an image hashing algorithm.
The goal of image hashing is to transform image content into a feature sequence, in order to obtain a condensed representation.
This feature sequence (i.e. a vector of bits) must be short enough for fast matching and preserve distinguishable features for similarity measurement to be feasible.
There are several algorithms that are freely available through open source communities.
A simple example can be found in this article, where Dr. Neal Krawetz shows how the Average Hash algorithm works:
Reduce size. The fastest way to remove high frequencies and detail is to shrink the image. In this case, shrink it to 8x8 so that there are 64 total pixels. Don't bother keeping the aspect ratio, just crush it down to fit an 8x8 square. This way, the hash will match any variation of the image, regardless of scale or aspect ratio.
Reduce color. The tiny 8x8 picture is converted to a grayscale. This changes the hash from 64 pixels (64 red, 64 green, and 64 blue) to 64 total colors.
Average the colors. Compute the mean value of the 64 colors.
Compute the bits. This is the fun part. Each bit is simply set based on whether the color value is above or below the mean.
Construct the hash. Set the 64 bits into a 64-bit integer. The order does not matter, just as long as you are consistent. (I set the bits from left to right, top to bottom using big-endian.)
David Oftedal wrote a C# command-line application which can classify and compare images using the Average Hash algorithm.
(I tested his implementation with your sample images and I got a 98.4% similarity).
The main benefit of this solution is that you read each image only once, create the hashes and classify them based upon their similiarity (using, for example, the Hamming distance).
In this way you decouple the feature extraction phase from the classification phase, and you can easily switch to another hashing algorithm if you find it's not enough accurate.
Edit
You can find a simple example here (It includes a test set of 40 images and it gets a 40/40 score).
Here's a topic discussing image similarity with algorithms, already implemented in OpenCV library. You should have no problem importing low-level functions in your C# application.
The Commercial TinEye API is a really good option.
I've done image matching programs in the past and Image Processing technology these days is amazing, its advanced so much.
ps here's where those two random pics you pulled from google came from: http://www.tineye.com/search/1ec9ebbf1b5b3b81cb52a7e8dbf42cb63126b4ea/
Since this is a one-off job, I'd make do with a script (choose your favorite language; I'd probably pick Perl) and ImageMagick. You could use C# to accomplish the same as the script, although with more code. Just call the command line utilities and parse the resulting output.
The script to check a pair for similarity would be about 10 lines as follows:
First retrieve the sizes with identify and check aspect ratios nearly the same. If not, no match. If so, then scale the larger image to the size of the smaller with convert. You should experiment a bit in advance with filter options to find the one that produces the most similarity in known-equivalent images. Nine of them are available.
Then use the compare function to produce a similarity metric. Compare is smart enough to deal with translation and cropping. Experiment to find a similarity threshold that doesn't provide too many false positives.
I would do something like this :
If you already know how the blurred images have been blurred, apply the same function to the high quality images before comparison.
Then compare the images using least-squares as suggested above.
The lowest value should give you a match. Ideally, you would get 0 if both images are identical
to speed things up, you could perform most comparison on downsampled images then refine on a selected subsample of the images
If you don't know, try various probable functions (JPEG compression, downsampling, ...) and repeat
You could try Content-Based Image Retrieval (CBIR).
To put it bluntly:
For every image in the database, generate a fingerprint using a
Fourier Transform
Load the source image, make a fingerprint of the
image
Calculate the Euclidean Distance between the source and all
the images in the database
Sort the results
I think a hybrid approach to this would be best to solve your particular batch matching problem
Apply the Image Hashing algorithm suggested by #Paolo Morreti, to all images
For each image in one set, find the subset of images with a hash closer that a set distance
For this reduced search space you can now apply expensive matching methods as suggested by #Running Wild or #Raskolnikov ... the best one wins.
IMHO, best solution is to blur both images and later use some similarity measure (correlation/ mutual information etc) to get top K (K=5 may be?) choices.
If you extract the contours from the image, you can use ShapeContext to get a very good matching of images.
ShapeContext is build for this exact things (comparing images based on mutual shapes)
ShapeContext implementation links:
Original publication
A goot ppt on the subject
CodeProject page about ShapeContext
*You might need to try a few "contour extraction" techniques like thresholds or fourier transform, or take a look at this CodeProject page about contour extraction
Good Luck.
If you calculate just pixel difference of images, it will work only if images of the same size or you know exactly how to scale it in horizontal and vertical direction, also you will not have any shift or rotation invariance.
So I recomend to use pixel difference metric only if you have simplest form of problem(images are the same in all characteristics but quality is different, and by the way why quality is different? jpeg artifacts or just rescale?), otherwise i recommend to use normalized cross-correlation, it's more stable metric.
You can do it with FFTW or with OpenCV.
If bad quality is just result of lower resolution then:
rescale high quality image to low quality image resolution (or rescale both to equal low resolution)
compare each pixel color to find closest match
So for example rescaling all of images to 32x32 and comparing that set by pixels should give you quite reasonable results and its still easy to do. Although rescaling method can make difference here.
You could try a block-matching algorithm, although I'm not sure its exact effectiveness against your specific problem - http://scien.stanford.edu/pages/labsite/2001/ee368/projects2001/dropbox/project17/block.html - http://www.aforgenet.com/framework/docs/html/05d0ab7d-a1ae-7ea5-9f7b-a966c7824669.htm
Even if this does not work, you should still check out the Aforge.net library. There are several tools there (including block matching from above) that could help you in this process - http://www.aforgenet.com/
I really like Running Wild's algorithm and I think it can be even more effective if you could make the two images more similar, for example by decreasing the quality of the better one.
Running Wild's answer is very close. What you are doing here is calculating the Peak Signal to Noise Ratio for each image, or PSNR. In your case you really only need the Mean Squared Error, but the squaring component of it helps a great deal in calculating difference between images.
PSNR Reference
Your code should look like:
sum = 0.0
for(imageHeight){
for(imageWidth){
errorR = firstImage(r,x,y) - secondImage(r,x,y)
errorG = firstImage(g,x,y) - secondImage(g,x,y)
errorB = firstImage(b,x,y) - secondImage(b,x,y)
totalError = square(errorR) + square(errorG) + square(errorB)
}
sum += totalError
}
meanSquaredError = (sum / (imageHeight * imageWidth)) / 3
I asume the images from the two databases show the same dialog and that the images should be close to identical but of different quality? Then matching images will have same (or very close to same) aspect ratio.
If the low quality images were produced from the high quality images (or equivalent image), then you should use the same image processing procedure as a preprocessing step on the high quality image and match with the low quality image database. Then pixel by pixel comparison or histogram matching should work well.
Image matching can use a lot of resources if you have many images. Maybe a multipass approach is a good idea? For example:
Pass 1: use simple mesures like aspect ratio to groupe images (width and height fields in db?) (computationally cheap)
Pass 2: match or groupe by histogram for 1st-color-channel (or all channels) (relatively computationally cheap)
I will also recommend OpenCV. You can use it with c,c++ and Python (and soon Java).
Just thinking out loud:
If you use two images that should be compared as layers and combine these (subtract one from the other) you get a new image (some drawing programs can be scripted to do batch conversion, or you could use the GPU by writing a tiny DirectX or OpenGL program)
Next you would have to get the brightness of the resulting image; the darker it is, the better the match.
Have you tried contour/thresholding techniques in combination with a walking average window (for RGB values ) ?
I have to be able to set a random location for a waypoint for a flight sim. The maths challenge is straightforward:
"To find a single random location within a quadrangle, where there's an equal chance of the point being at any location."
Visually like this:
An example ABCD quadrangle is:
A:[21417.78 37105.97]
B:[38197.32 24009.74]
C:[1364.19 2455.54]
D:[1227.77 37378.81]
Thanks in advance for any help you can provide. :-)
EDIT
Thanks all for your replies. I'll be taking a look at this at the weekend and will award the accepted answer then. BTW I should have mentioned that the quadrangle can be CONVEX OR CONCAVE. Sry 'bout dat.
Split your quadrangle into two triangles and then use this excellent SO answer to quickly find a random point in one of them.
Update:
Borrowing this great link from Akusete on picking a random point in a triangle.
(from MathWorld - A Wolfram Web Resource: wolfram.com)
Given a triangle with one vertex at
the origin and the others at positions v1
and v2, pick
(from MathWorld - A Wolfram Web Resource: wolfram.com)
where A1
and A2 are uniform
variates in the interval [0,1] , which gives
points uniformly distributed in a
quadrilateral (left figure). The
points not in the triangle interior
can then either be discarded, or
transformed into the corresponding
point inside the triangle (right
figure).
I believe there are two suitable ways to solve this problem.
The first mentioned by other posters is to find the smallest bounding box that encloses the rectangle, then generate points in that box until you find a point which lies inside the rectangle.
Find Bounding box (x,y,width, height)
Pick Random Point x1,y1 with ranges [x to x+width] and [y to y+height]
while (x1 or y1 is no inside the quadrangle){
Select new x1,y1
}
Assuming your quadrangle area is Q and the bounding box is A, the probability that you would need to generate N pairs of points is 1-(Q/A)^N, which approaches 0 inverse exponentially.
I would reccommend the above approach, espesially in two dimensions. It is very fast to generate the points and test.
If you wanted a gaurentee of termination, then you can create an algorithm to only generate points within the quadrangle (easy) but you must ensure the probablity distribution of the points are even thoughout the quadrangle.
http://mathworld.wolfram.com/TrianglePointPicking.html
Gives a very good explination
The "brute force" approach is simply to loop through until you have a valid coordinate. In pseudocode:
left = min(pa.x, pb.x, pc.x, pd.x)
right = max(pa.x, pb.x, pc.x, pd.x)
bottom = min(pa.y, pb.y, pc.y, pd.y)
top = max(pa.y, pb.y, pc.y, pd.y)
do {
x = left + fmod(rand, right-left)
y = bottom + fmod(rand, top-bottom)
} while (!isin(x, y, pa, pb, pc, pd));
You can use a stock function pulled from the net for "isin". I realize that this isn't the fastest-executing thing in the world, but I think it'll work.
So, this time tackling how to figure out if a point is within the quad:
The four edges can be expressed as lines in y = mx + b form. Check if the point is above or below each of the four lines, and taken together you can figure out if it's inside or outside.
Are you allowed to just repeatedly try anywhere within the rectangle which bounds the quadrangle, until you get something within the quad? Might this even be faster than some fancy algorithm to ensure that you pick something within the quad?
Incidentally, in that problem statement, I think the use of the word "find" is confusing. You can't really find a random value that satisfies a condition; the randomizer just gives it to you. What you're trying to do is set parameters on the randomizer to give you values matching certain criteria.
I would divide your quadrangle into multiple figures, where each figure is a regular polygon with one side (or both sides) parallel to one of the axes. For eg, for the figure above, I would first find the maximum rectangle that fits inside the quadrangle, the rectangle has to be parallel to the X/Y axes. Then in the remaining area, I would fit triangles, such triangles will be adjacent to each side of the rectangle.
then it is simple to write a function:
1) get a figure at random.
2) find a random point in the figure.
If the figure chosen in #1 is a rectangle, it should be pretty easy to find a random point in it. The tricky part is to write a routine which can find a random point inside the triangle
You may randomly create points in a bound-in-box only stopping after you find one that it's inside your polygon.
So:
Find the box that contains all the points of your polygon.
Create a random point inside the bounds of the previously box found. Use random functions to generate x and y values.
Check if that point is inside the polygon (See how here or here)
If that point is inside the polygon stop, you're done, if not go to step 2
So, it depends on how you want your distribution.
If you want the points randomly sampled in your 2d view space, then Jacob's answer is great. If you want the points to be sort of like a perspective view (in your example image, more density in top right than bottom left), then you can use bilinear interpolation.
Bilinear interpolation is pretty easy. Generate two random numbers s and t in the range [0..1]. Then if your input points are p0,p1,p2,p3 the bilinear interpolation is:
bilerp(s,t) = t*(s*p3+(1-s)*p2) + (1-t)*(s*p1+(1-s)*p0)
The main difference is whether you want your distribution to be uniform in your 2d space (Jacob's method) or uniform in parameter space.
This is an interesting problem and there's probably as really interesting answer, but in case you just want it to work, let me offer you something simple.
Here's the algorithm:
Pick a random point that is within the rectangle that bounds the quadrangle.
If it is not within the quadrangle (or whatever shape), repeat.
Profit!
edit
I updated the first step to mention the bounding box, per Bart K.'s suggestion.