Image focus calculation - C#

I'm trying to develop an image focusing algorithm for some test automation work. I've chosen to use AForge.NET, since it seems like a mature, .NET-friendly framework.
Unfortunately, I can't seem to find information on building autofocus algorithms from scratch, so I've given it my best try:
Take an image. Apply a Sobel edge detection filter, which generates a greyscale edge outline. Generate a histogram and save the standard deviation. Move the camera one step closer to the subject and take another picture. If the standard deviation is smaller than the previous one, we're getting more in focus; otherwise, we've passed the optimal distance to be taking pictures.
Is there a better way?
Update: there's a HUGE flaw in this, by the way. As I get past the optimal focus point, my "image in focus" value continues growing. You'd expect a roughly parabolic function when plotting focus value against distance, but in reality you get something closer to logarithmic.
Update 2: Okay, I went back to this. The current method we're exploring is: given a few known edges (I know exactly what the objects in the picture are), I do a manual pixel intensity comparison across them. As the resulting profile gets steeper, the image is more in focus. I'll post code once the core algorithm gets ported from MATLAB into C# (yeah, MATLAB... :S).
Update 3: Final update. I came back to this again. The final approach looks like this (a rough code sketch follows the steps):
Step 1: Get an image from the list of images (I took a hundred photos through the focus point).
Step 2: Find an edge of the object I'm focusing on (in my case it's a rectangular object that's always in the same place, so I crop a HIGH and NARROW rectangle around one edge).
Step 3: Get the HorizontalIntensityStatistics (AForge.NET class) for that cropped image.
Step 4: Get the histogram (gray, in my case).
Step 5: Find the derivative of the values of the histogram.
Step 6: The point where the slope is largest is the most focused point.
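For reference, the core of those steps maps onto AForge.NET roughly like the sketch below. It assumes the camera delivers 24bpp RGB frames and that edgeRegion is the tall, narrow rectangle around the known edge; treat it as an illustration of the steps rather than the exact production code.

    using System;
    using System.Drawing;
    using AForge.Imaging;
    using AForge.Imaging.Filters;

    // Returns a focus score for one frame: the steepest slope of the
    // horizontal intensity profile across the cropped edge. The frame
    // that maximises this score is the best-focused one (step 6).
    static double FocusScore(Bitmap frame, Rectangle edgeRegion)
    {
        // step 2: crop a high, narrow strip containing the known edge
        Bitmap strip = new Crop(edgeRegion).Apply(frame);

        // convert to 8bpp grayscale (assumes a 24/32bpp colour source)
        Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(strip);

        // steps 3-4: horizontal intensity statistics and its gray histogram
        var stats = new HorizontalIntensityStatistics(gray);
        int[] values = stats.Gray.Values;

        // step 5: approximate the derivative with finite differences;
        // step 6: keep the steepest slope as the focus score
        double maxSlope = 0;
        for (int i = 1; i < values.Length; i++)
            maxSlope = Math.Max(maxSlope, Math.Abs(values[i] - values[i - 1]));

        return maxSlope;
    }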

You can have a look at the technique used in the NASA Curiosity Mars Rover.
The technique is described in this article
EDGETT, Kenneth S., et al. Curiosity's Mars Hand Lens Imager (MAHLI) Investigation. Space Science Reviews, 2012, 170.1-4: 259-317.
which is available as a PDF here.
Quoting from the article:
7.2.2 Autofocus
Autofocus is anticipated to be the primary method by which MAHLI is focused on Mars. The autofocus command instructs the camera to move to a specified starting motor count position and collect an image, move a specified number of steps and collect another image, and keep doing so until reaching a commanded total number of images, each separated by a specified motor count increment. Each of these images is JPEG compressed (Joint Photographic Experts Group; see CCITT (1993)) with the same compression quality factor applied. The file size of each compressed image is a measure of scene detail, which is in turn a function of focus (an in-focus image shows more detail than a blurry, out of focus view of the same scene). As illustrated in Fig. 23, the camera determines the relationship between JPEG file size and motor count and fits a parabola to the three neighboring maximum file sizes. The vertex of the parabola provides an estimate of the best focus motor count position. Having made this determination, MAHLI moves the lens focus group to the best motor position and acquires an image; this image is stored, the earlier images used to determine the autofocus position are not saved.
Autofocus can be performed over the entire MAHLI field of view, or it can be performed on a sub-frame that corresponds to the portion of the scene that includes the object(s) to be studied. Depending on the nature of the subject and knowledge of the uncertainties in robotic arm positioning of MAHLI, users might elect to acquire a centered autofocus sub-frame or they might select an off-center autofocus sub-frame if positioning knowledge is sufficient to determine where the sub-frame should be located. Use of sub-frames to perform autofocus is highly recommended because this usually results in the subject being in better focus than is the case when autofocus is applied to the full CCD; further, the resulting motor count position from autofocus using a sub-frame usually results in a more accurate determination of working distance from pixel scale.
Figure 23 of the paper plots JPEG file size against motor count and shows the parabola fitted through the peak.
This idea was also suggested in this answer: https://stackoverflow.com/a/2173259/15485
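The file-size metric is easy to prototype in C# with System.Drawing alone. The sketch below is only an illustration of the idea from the paper: it measures the JPEG size of each frame at a fixed quality factor, then fits a parabola through the peak size and its two neighbours; the quality factor and helper names are arbitrary.

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Linq;

    // Size of the frame when JPEG-compressed at a fixed quality factor;
    // a larger file means more scene detail, i.e. better focus.
    static long JpegSize(Bitmap frame, long quality = 75L)
    {
        var codec = ImageCodecInfo.GetImageEncoders()
                                  .First(c => c.FormatID == ImageFormat.Jpeg.Guid);
        var parameters = new EncoderParameters(1);
        parameters.Param[0] = new EncoderParameter(Encoder.Quality, quality);

        using (var ms = new MemoryStream())
        {
            frame.Save(ms, codec, parameters);
            return ms.Length;
        }
    }

    // Fit a parabola through the peak file size and its two neighbours and
    // return the interpolated best position (as an index into `sizes`).
    static double BestFocusPosition(long[] sizes)
    {
        int i = Array.IndexOf(sizes, sizes.Max());
        if (i == 0 || i == sizes.Length - 1)
            return i; // peak at the edge of the sweep; no interpolation possible

        double y0 = sizes[i - 1], y1 = sizes[i], y2 = sizes[i + 1];
        return i + 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2);
    }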

It may be a bit simplistic for your needs, but I've had good results with a simple algorithm that looks at the difference to neighbouring pixels. The sum of the differences of pixels 2 away seems to be a reasonable measure of image contrast. I couldn't find the original paper by Brenner from the 1970s, but it is mentioned in http://www2.die.upm.es/im/papers/Autofocus.pdf
Another issue is that when the image is extremely out of focus, there is very little focus information, so it's hard to tell which way to move, or how to avoid a local maximum.
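As a hedged sketch of that measure (the exact form varies between papers; the common Brenner variant sums squared differences between each pixel and the one two columns away), something like this works on an 8bpp grayscale bitmap and needs the /unsafe compiler switch:

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;

    // Brenner-style focus measure: sum of squared differences between each
    // pixel and the pixel two columns to its right. Larger => sharper.
    static long BrennerFocus(Bitmap gray8bpp)
    {
        var rect = new Rectangle(0, 0, gray8bpp.Width, gray8bpp.Height);
        BitmapData data = gray8bpp.LockBits(rect, ImageLockMode.ReadOnly,
                                            PixelFormat.Format8bppIndexed);
        long sum = 0;
        unsafe
        {
            byte* row = (byte*)data.Scan0.ToPointer();
            for (int y = 0; y < data.Height; y++, row += data.Stride)
            {
                for (int x = 0; x < data.Width - 2; x++)
                {
                    int d = row[x + 2] - row[x];
                    sum += (long)d * d;
                }
            }
        }
        gray8bpp.UnlockBits(data);
        return sum;
    }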

I haven't built one myself, but my first thought would be to do a 2D DFT on a portion of the image. When the image is out of focus, the high-frequency components disappear.
For a lazy prototype, you could try compressing a region of the image as JPEG (at high quality) and looking at the output stream size. A big file means a lot of detail, which in turn implies the image is in focus. Beware that the camera should not be too noisy, and of course you can't compare file sizes across different scenes.

This might be useful. It's how a camera's AF system actually works: passive autofocus.
Contrast measurement
Contrast measurement is achieved by measuring contrast within a sensor field, through the lens. The intensity difference between adjacent pixels of the sensor naturally increases with correct image focus. The optical system can thereby be adjusted until the maximum contrast is detected. In this method, AF does not involve actual distance measurement at all and is generally slower than phase detection systems, especially when operating under dim light. As it does not use a separate sensor, however, contrast-detect autofocus can be more flexible (as it is implemented in software) and potentially more accurate. This is a common method in video cameras and consumer-level digital cameras that lack shutters and reflex mirrors. Some DSLRs (including Olympus E-420, Panasonic L10, Nikon D90, Nikon D5000, Nikon D300 in Tripod Mode, Canon EOS 5D Mark II, Canon EOS 50D) use this method when focusing in their live-view modes. A new interchangeable-lens system, Micro Four Thirds, exclusively uses contrast measurement autofocus, and is said to offer performance comparable to phase detect systems.
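In software this contrast maximisation usually boils down to a simple hill climb over lens positions. The sketch below is only a skeleton: captureAt and focusScore are placeholders for whatever your camera SDK and chosen focus metric provide.

    using System;
    using System.Drawing;

    // Hill-climb over lens positions: keep stepping while the contrast
    // metric improves, stop once it drops and report the last good position.
    // Assumes you start on the near side of the peak and step toward it;
    // in practice you may need to try both directions.
    static int HillClimbFocus(Func<int, Bitmap> captureAt,
                              Func<Bitmap, double> focusScore,
                              int startPosition, int step, int maxSteps)
    {
        int position = startPosition;
        double best = focusScore(captureAt(position));

        for (int i = 0; i < maxSteps; i++)
        {
            double next = focusScore(captureAt(position + step));
            if (next <= best)
                break;          // contrast started to fall: we just passed the peak
            best = next;
            position += step;
        }
        return position;
    }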

While the Sobel filter is a decent choice, I would probably do an edge magnitude calculation on the projections in the x and y directions over several small representative regions. Another .NET-friendly choice based on OpenCV is Emgu CV: http://www.emgu.com/wiki/index.php/Main_Page.

I wonder if the standard deviation is the best choice: if the image gets sharper, the Sobel-filtered image will contain brighter pixels at the edges, but at the same time fewer bright pixels, because the edges are getting thinner. Maybe you could try using an average of the 1% highest pixel values in the Sobel image?
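That metric is straightforward to compute from AForge's histogram classes. A minimal sketch, assuming an 8bpp grayscale input frame:

    using System;
    using System.Drawing;
    using AForge.Imaging;
    using AForge.Imaging.Filters;

    // Average of the brightest `fraction` of pixels in the Sobel image,
    // computed from the intensity histogram rather than by sorting pixels.
    static double TopPercentileFocus(Bitmap gray8bpp, double fraction = 0.01)
    {
        Bitmap edges = new SobelEdgeDetector().Apply(gray8bpp);

        int[] hist = new ImageStatistics(edges).Gray.Values;
        long total = 0;
        foreach (int count in hist) total += count;

        long wanted = (long)(total * fraction);
        long taken = 0, weighted = 0;
        for (int v = 255; v >= 0 && taken < wanted; v--)
        {
            long take = Math.Min(hist[v], wanted - taken);
            taken += take;
            weighted += take * v;
        }
        return taken == 0 ? 0 : (double)weighted / taken;
    }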

Another flavor of focus metric might be:
Grab several images and average them (noise reduction).
Then FFT the averaged image and use the ratio of high-frequency to low-frequency energy.
The higher this ratio, the better the focus. A MATLAB demo (excluding the averaging stage) is available within the demos of the toolbox. :)
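A rough C# sketch of that ratio using AForge's ComplexImage (not the MATLAB demo): the input must be 8bpp grayscale with power-of-two dimensions, and this assumes AForge centres the spectrum during the forward transform, so low frequencies sit near the middle of Data; the 10% cutoff is arbitrary.

    using System;
    using System.Drawing;
    using AForge.Imaging;

    // Ratio of high-frequency to low-frequency magnitude in the 2D FFT.
    // Larger ratio => more fine detail => sharper image.
    static double FrequencyRatioFocus(Bitmap gray8bppPow2, double lowFraction = 0.1)
    {
        ComplexImage ci = ComplexImage.FromBitmap(gray8bppPow2);
        ci.ForwardFourierTransform();

        int w = ci.Width, h = ci.Height;
        double cutoff = lowFraction * Math.Min(w, h) / 2.0;
        double low = 0, high = 0;

        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                double dx = x - w / 2, dy = y - h / 2;
                double magnitude = ci.Data[y, x].Magnitude;
                if (Math.Sqrt(dx * dx + dy * dy) <= cutoff)
                    low += magnitude;   // near the centre: low frequencies
                else
                    high += magnitude;  // far from the centre: high frequencies
            }
        }
        return low > 0 ? high / low : 0;
    }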

Related

Identify groups of similar-coloured pixels

I am writing some software that periodically checks a camera image to identify whether an object has been introduced to the viewed scene. I am using ImageMagick in my WinForms software to compare two images to create a third. In the third image, a pixel is white if the first two images had a similar-coloured pixel, and black if they were different. So if the user sees a group of black pixels, they will know something has been placed in the scene that wasn't there previously, as seen here:
The user will not be seeing this image, so I would like my software to identify this for me. I don't need anything too complex from this analysis - just a Boolean for whether something was changed in the scene.
In my mind, there are two approaches to analysing this: counting the number of black pixels in the image, or writing some algorithm to identify patches of black. My question is about the second approach, since it feels like the more correct one. The image is a bit too noisy (you can see false positives along straight edges) for me to feel entirely comfortable with simple counting.
To identify a group, I have considered using some for loops to look at the colours of the pixels surrounding every pixel, but this seems like it would take forever. My processing can't take more than a few seconds, so I need to be wary of this. Are there cleaner or more efficient ways to identify groups of similarly-coloured pixels, or will I need to run loops and be as efficient as possible?
Threshold the image so that black pixels have value 1 and non-black pixels have value 0.
Use connected component labeling to find all the groups of connected black pixels. http://www.imagemagick.org/script/connected-components.php
Filter out components that are too small or don't have the correct shape (for example, a long line along the sides has a lot of black pixels, but you are not expecting a long line to be a valid group of black pixels). A code sketch of these steps follows.
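The linked ImageMagick operator does this on the command line; since AForge.NET comes up elsewhere in this thread, here is a hedged in-process sketch of the same three steps, assuming the difference image is 8bpp grayscale with black marking "changed" pixels:

    using System.Drawing;
    using AForge.Imaging;
    using AForge.Imaging.Filters;

    // Returns true when at least one sufficiently large patch of changed
    // (black) pixels survives thresholding and size filtering.
    static bool SceneChanged(Bitmap difference8bpp, int minBlobSize)
    {
        // step 1: threshold, then invert so the changed (black) pixels
        // become white blobs on a black background
        Bitmap binary = new Threshold(128).Apply(difference8bpp);
        binary = new Invert().Apply(binary);

        // step 2: connected component labelling
        // step 3: drop components that are too small (noise)
        var counter = new BlobCounter
        {
            FilterBlobs = true,
            MinWidth = minBlobSize,
            MinHeight = minBlobSize
        };
        counter.ProcessImage(binary);

        return counter.GetObjectsInformation().Length > 0;
    }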
At this point I assume you have a sense of the scale of the objects you are interested in capturing. For example, suppose you had an entirely non-black screen (ignoring noise for the purpose of this discussion) except for a black object only roughly 10 pixels in diameter. That seems frightfully small, and probably not enough information to be useful.
Once you have determined what is the minimum size of black mass you are willing to accept, I would then go about querying a staggered matrix.
i.e., a pattern like:
Use a little math to determine what acceptable noise is.
Once you have positive results, (pixels = black), investigate in those sectors.

Simple algorithm to crop empty borders from an image by code?

Currently I'm looking for a rather fast and reasonably accurate algorithm in C#/.NET to do these steps in code:
Load an image into memory.
Starting from the color at position (0,0), find the unoccupied space.
Crop away this unnecessary space.
I've illustrated what I want to achieve:
What I can imagine is to get the color of the pixel at (0,0) and then do some unsafe line-by-line/column-by-column walking through all pixels until I meet a pixel with another color, then cut away the border.
I just fear that this is really really slow.
So my question is:
Are you aware of any quick algorithms (ideally without any 3rd party libraries) to cut away "empty" borders from an in-memory image/bitmap?
Side note: the algorithm should be "reasonably accurate", not 100% accurate. Some tolerance, like one line too many or too few cropped, would be perfectly OK.
Addition 1:
I've just finished implementing my brute force algorithm in the simplest possible manner. See the code over at Pastebin.com.
If you know your image is centered, you might try walking diagonally ( ie (0,0), (1,1), ...(n,n) ) until you have a hit, then backtrack one line at a time checking until you find an "empty" line (in each dimension). For the image you posted, it would cut a lot of comparisons.
You should be able to do that from 2 opposing corners concurrently to get some multi-core action.
Of course, hopefully you don't hit the pathological case of a 1-pixel-wide line in the center of the image :) Or the doubly pathological case of disconnected objects in your image, such that the whole image is centered but nothing crosses the diagonal.
One improvement you could make is to give your "hit color" some tolerance (adjustable maybe?)
The algorithm you are suggesting is a brute-force algorithm and will work every time for all types of images.
But for special cases, such as when the subject is centered and is a continuous blob of color (as in your example), a binary-search-style algorithm can be applied:
Start from the center line (0, length/2), go in one direction at a time, and examine the lines as you would in a binary search.
Do this for all four sides.
This reduces the complexity to O(log₂ n).
For starters, your current algorithm is basically the best possible.
If you want it to run faster, you could code it in C++, which tends to be more efficient than managed unsafe code.
If you stay in C#, you can use Parallel Extensions to run it on multiple cores. That won't reduce the load on the machine, but it will reduce the latency, if any.
If you happen to have a precomputed thumbnail for the image, you can apply your algorithm to the thumbnail first to get a rough idea.
First, you can convert your bitmap to a byte[] using LockBits(), this will be much faster than GetPixel() and won't require you to go unsafe.
As long as you don't naively search the whole image and instead search one side at a time, you've nailed 95% of the algorithm. Just make sure you are not searching already-cropped pixels, as this might actually make the algorithm worse than the naive one if you have two adjacent edges that crop a lot.
A binary search can improve things a tiny bit, but it's not that significant, as in the best case it will maybe save you a line of searching for each direction.
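To make that concrete, here is a hedged sketch of the side-at-a-time scan using LockBits and a managed int[] copy (no unsafe code). It assumes the background colour is the pixel at (0,0), as in the question, that the bitmap can be read as 32bpp ARGB (so the stride equals width * 4), and that an exact colour match is good enough; a tolerance check could replace the equality test.

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.Runtime.InteropServices;

    // Bounding rectangle of everything that differs from the (0,0) colour.
    static Rectangle FindContentBounds(Bitmap bmp)
    {
        var rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
        BitmapData data = bmp.LockBits(rect, ImageLockMode.ReadOnly,
                                       PixelFormat.Format32bppArgb);
        int[] pixels = new int[bmp.Width * bmp.Height];
        Marshal.Copy(data.Scan0, pixels, 0, pixels.Length);
        bmp.UnlockBits(data);

        int background = pixels[0];
        bool RowEmpty(int y)
        {
            for (int x = 0; x < bmp.Width; x++)
                if (pixels[y * bmp.Width + x] != background) return false;
            return true;
        }
        bool ColumnEmpty(int x, int top, int bottom)
        {
            for (int y = top; y <= bottom; y++)
                if (pixels[y * bmp.Width + x] != background) return false;
            return true;
        }

        int topEdge = 0, bottomEdge = bmp.Height - 1;
        int leftEdge = 0, rightEdge = bmp.Width - 1;
        while (topEdge < bottomEdge && RowEmpty(topEdge)) topEdge++;
        while (bottomEdge > topEdge && RowEmpty(bottomEdge)) bottomEdge--;
        // only scan rows that haven't already been cropped away
        while (leftEdge < rightEdge && ColumnEmpty(leftEdge, topEdge, bottomEdge)) leftEdge++;
        while (rightEdge > leftEdge && ColumnEmpty(rightEdge, topEdge, bottomEdge)) rightEdge--;

        return Rectangle.FromLTRB(leftEdge, topEdge, rightEdge + 1, bottomEdge + 1);
    }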
Although I prefer Tarang's answer, I'd like to give some hints on how to 'isolate' objects in an image by referring to a given foreground and background color. This is called 'segmentation' and is used in the field of optical inspection, where an image is not just cropped to some detected object; objects are also counted and measured (area, contour, diameter, etc.).
First of all, you'll usually start walking through your image at x/y coordinates (0,0), going from left to right and top to bottom until you find a pixel whose value differs from the background. The sensitivity of the segmentation is set by defining the grayscale value of the background as well as the grayscale value of the foreground. You may think of walking through the image by coordinates, but from the program's view you'll just walk through an array of pixels, which means you'll have to deal with the formula that maps an x/y coordinate to the pixel's index in the pixel array. This formula of course needs the image's dimensions.
As for cropping: once you've found the so-called 'pivot point' of your foreground object, you'll usually walk along the object using a rule that detects neighboring pixels of the same foreground value. If there is only one object to detect, as in your case, it's easy to store the coordinates of the pixels that are north-most, east-most, south-most and west-most. These four coordinates mark the rectangle your object fits in, and from them you can calculate the cropped image's width and height.

Detecting a blob of color in an image

I have an image that is a depth heatmap, from which I've filtered out anything further away than the first 25% of the depth range.
It looks something like this:
There are two blobs of color in the image, one is my hand (with part of my face behind it), and the other is the desk in the lower left corner. How can I search the image to find these blobs? I would like to be able to draw a rectangle around them if possible.
I can also do this (ignore shades, and filter to black or white):
Pick a random pixel as a seed pixel. This becomes area A. Repeatedly expand A until A doesn't get any bigger. That's your area.
The way to expand A is by looking for neighbor pixels to A, such that they have similar color to at least one neighboring pixel in A.
What "similar color" means to you is somewhat variable. If you can make exactly two colors, as you say in another answer, then "similar" is "equal". Otherwise, "similar" would mean colors that have RGB values or whatnot where each component of the two colors is within a small amount of each other (i.e. 255, 128, 128 is similar to 252, 125, 130).
You can also limit the selected pixels so they must be similar to the seed pixel, but that works better when a human is picking the seed. (I believe this is what is done in Photoshop, for example.)
This can be better than edge detection because you can deal with gradients without filtering them out of existence, and you don't need to process the resulting detected edges into a coherent area. It has the disadvantage that a gradient can go all the way from black to white and it'll register as the same area, but that may be what you want. Also, you have to be careful with the implementation or else it will be too slow.
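A minimal sketch of that expansion loop, with "similar" interpreted as "within a per-channel tolerance of the neighbouring pixel it was reached from" (one of the variants described above):

    using System;
    using System.Collections.Generic;
    using System.Drawing;

    // Breadth-first region growing from a seed pixel. A neighbour joins the
    // area when its colour is within `tolerance` per channel of the pixel it
    // was reached from. GetPixel is slow; switch to LockBits for real use.
    static List<Point> GrowRegion(Bitmap image, Point seed, int tolerance)
    {
        var area = new List<Point>();
        var visited = new bool[image.Width, image.Height];
        var queue = new Queue<Point>();
        queue.Enqueue(seed);
        visited[seed.X, seed.Y] = true;

        while (queue.Count > 0)
        {
            Point p = queue.Dequeue();
            area.Add(p);
            Color c = image.GetPixel(p.X, p.Y);

            foreach (var (dx, dy) in new[] { (1, 0), (-1, 0), (0, 1), (0, -1) })
            {
                int nx = p.X + dx, ny = p.Y + dy;
                if (nx < 0 || ny < 0 || nx >= image.Width || ny >= image.Height || visited[nx, ny])
                    continue;

                Color n = image.GetPixel(nx, ny);
                if (Math.Abs(n.R - c.R) <= tolerance &&
                    Math.Abs(n.G - c.G) <= tolerance &&
                    Math.Abs(n.B - c.B) <= tolerance)
                {
                    visited[nx, ny] = true;
                    queue.Enqueue(new Point(nx, ny));
                }
            }
        }
        return area;
    }

The bounding rectangle of the returned points gives the box to draw around each blob.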
It might be overkill for what you need, but there's a great wrapper for C# for the OpenCV libraries.
I have successfully used OpenCV in C++ for blob detection, so you might find it useful for what you're trying to do.
http://www.emgu.com/wiki/index.php/Main_Page
and the wiki page on OpenCV:
http://en.wikipedia.org/wiki/OpenCV
Edited to add: here is a blob detection library for Emgu in C#. There are even some nice features for ordering the blobs by descending area (useful for filtering out noise).
http://www.emgu.com/forum/viewtopic.php?f=3&t=205
Edit Again:
If Emgu is too heavyweight, AForge.NET also includes some blob detection methods:
http://www.aforgenet.com/framework/
If the image really is only two or more distinct colours (very little blur between colours), it is an easy case for an edge detection algorithm.
You can use something like the code sample from this question : find a color in an image in c#
It will help you find the x/y of specific colors in your image. Then you could use the min x/max x and the min y/max y to draw your rectangles.
Detect object from image based on object color by C#.
To detect an object based on its color, there is an easy algorithm. You have to choose a filtering method. The steps normally are:
Take the image
Apply your filtering
Apply greyscaling
Subtract background and get your objects
Find position of all objects
Mark the objects
First you have to choose a filtering method; there are many filtering methods available for C#. I mainly prefer AForge filters; for this purpose they have a few:
ColorFiltering
ChannelFiltering
HSLFiltering
YCbCrFiltering
EuclideanColorFiltering
My favorite is EuclideanColorFiltering. It is easy and simple. For information about the other filters you can visit the link below. You have to download the AForge DLLs to use these in your code.
More information about the exact steps can be found here: Link
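As a hedged illustration of that pipeline (filter, grayscale, then find and mark the objects), assuming a 24bpp RGB frame and AForge.NET; the minimum blob size is arbitrary:

    using System.Drawing;
    using AForge.Imaging;
    using AForge.Imaging.Filters;

    // Keep pixels near the target colour, grayscale the result, then use
    // BlobCounter to get one bounding box per remaining object.
    static Rectangle[] FindColoredObjects(Bitmap frame24bpp, Color target, short radius)
    {
        // keep only pixels within `radius` of the target colour (rest go black)
        var filter = new EuclideanColorFiltering
        {
            CenterColor = new RGB(target.R, target.G, target.B),
            Radius = radius
        };
        Bitmap filtered = filter.Apply(frame24bpp);

        // grayscale, then label the connected regions that survived filtering
        Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(filtered);
        var blobs = new BlobCounter { FilterBlobs = true, MinWidth = 5, MinHeight = 5 };
        blobs.ProcessImage(gray);

        return blobs.GetObjectsRectangles();
    }

Drawing the returned rectangles onto the original frame marks the detected objects.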

How to detect image orientation (text)

My program works with fax documents stored as separate bitmaps.
I wonder if there is a way to automatically detect the page orientation (vertical or horizontal) so I can show the image preview to the user the right way up (i.e. rotate it if necessary).
Any advice much appreciated!
EDIT: Clarification:
When the fax machine receives a multi-page document, it saves each page as a separate TIFF file.
My app has a built-in viewer displaying those files. All files are scaled to A4 format and saved as TIFF (so there is no chance of detecting the orientation from the height/width parameters).
My viewer displays images in portrait mode by default.
What I'd like to do is automagically detect the situation where the original document was printed in landscape mode (e.g. wide Excel tables); then I'd like to show a rotated preview to the end user to speed up the preview process.
Obviously there are 4 possible fax orientations: portrait/landscape × 2 kinds of rotation.
I'm even interested in a simplified solution that only detects whether the original document was landscape or portrait (I've noticed most landscape docs need to be rotated clockwise).
EDIT2: Idea
I think this might be an idea:
I could draw horizontal and vertical lines and check whether each line cuts any (black) pixels. Then we can compare which type of line (horizontal or vertical) more often passes through without cutting anything, and this decides the page orientation.
What do you think ?
You could perform a Fast Fourier Transform (FFT) to convert your spatial image to a frequency/angle representation. Then find the angle with the most prominent frequency. It sounds complicated but it's not that hard, it's pretty efficient, and in effect it tests every possible angle at once, instead of being a hard-coded hack that only works for specific angles. Search for a sample implementation with search terms like Numerical Recipes and FFT.
You'd need OCR for that. Rolling your own OCR would be a bit difficult, but there might be a library or something out there worth looking into. Also, even with good OCR, it's not a 100% reliable solution.
I wonder if there are some properties of text you could use to help you do this.
For instance, based on a quick glance, there are far more vertical lines in text (l, j, k, m, n, etc.) than horizontal ones, so maybe you could start with this.
But even detecting these isn't straightforward, you'd need to use some sort of filter like a Sobel or Prewitt. They both have horizontal and vertical versions, see here for more info.
Of course the vertical/horizontal lines of an excel spreadsheet would be the strongest edges so you'd have to ignore these and look only at the text.
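As a rough sketch of that comparison (ignoring the problem of spreadsheet ruling lines), you can total the gradient energy along x versus along y on an 8bpp grayscale page and see which dominates; this needs the /unsafe switch:

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;

    // Compares horizontal vs vertical gradient energy. For Latin text,
    // changes along x (vertical strokes) dominate when the page is upright.
    static bool LooksPortrait(Bitmap gray8bpp)
    {
        var rect = new Rectangle(0, 0, gray8bpp.Width, gray8bpp.Height);
        BitmapData data = gray8bpp.LockBits(rect, ImageLockMode.ReadOnly,
                                            PixelFormat.Format8bppIndexed);
        long energyAlongX = 0, energyAlongY = 0;
        unsafe
        {
            byte* basePtr = (byte*)data.Scan0.ToPointer();
            for (int y = 1; y < data.Height; y++)
            {
                byte* row = basePtr + y * data.Stride;
                byte* above = row - data.Stride;
                for (int x = 1; x < data.Width; x++)
                {
                    energyAlongX += Math.Abs(row[x] - row[x - 1]); // vertical strokes
                    energyAlongY += Math.Abs(row[x] - above[x]);   // horizontal strokes
                }
            }
        }
        gray8bpp.UnlockBits(data);

        // more vertical strokes than horizontal ones suggests upright text
        return energyAlongX >= energyAlongY;
    }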
Alternative: Can you not just give the user an easy way to rotate the images, like the arrows in Windows Picture viewer or just show 4 thumbnail previews they can click on. You might need to cache the 4 versions (if you are rotating) so it's quick, but only if speed turns out to be an issue?
Here's a paper entitled "Combined Script and Page Orientation Estimation using the Tesseract OCR engine" [pdf].
I haven't been able to find an implementation of their work, but the approach looks good to me:
The basic idea behind the proposed approach is simple.
A shape classifier is trained on characters (classes) from all the scripts of interest. At run-time, the classifier is run independently on each connected component (CC) in the image and the process is repeated after rotating each CC into three other candidate orientations (90°, 180° and 270° from the input orientation).
The algorithm keeps track of the estimated number of characters in each script for a given orientation, and the accumulated classifier confidence score across all candidate orientations. The estimate of page orientation is chosen as the one with the highest cumulative confidence score, and the estimate of script is chosen as the one with the highest number of characters in that script for the best orientation estimate.

How to identify/check blur image?

How can we identify whether a given image is blurred, and how much (as a percentage) it is blurred, in C#? Is there any API available for that, or any algorithm that could help?
Thanks!
You could perform a 2D FFT and search the frequency coefficients for a value over a certain threshold (to eliminate false positives from rounding/edge errors). A blurred image will never have high-frequency coefficients (large X/Y values in frequency space).
If you want to compare with a certain blurring algorithm, run a single pixel through a 2D-FFT and check further images to see if they have frequency components outside the range of the reference FFT. This means you can use the same algorithm regardless of what type of blurring algorithm is used (box blur, gaussian, etc)
For a very specific problem (finding blurred photos shot to an ancient book), I set up this script, based on ImageMagick:
https://gist.github.com/888239
Given a blurred bitmap alone, you probably can't.
Given the original bitmap and the blurred bitmap you could compare the two pixel by pixel and use a simple difference to tell you how much it is blurred.
Given that I'm guessing you don't have the original image, it might be worth looking at performing some kind of edge detection on the blurred image.
This paper suggests a method using the Haar wavelet transform, but as other posters have said, this is a fairly complex subject.
Although I am posting this answer 8-9 years after the question was asked: at the time, I resolved this problem by applying blur to the image and then comparing it with the original.
The idea is that when we apply blur to a non-blurry image and then compare the two, the difference between the images is very high; but when we apply blur to an already blurry image, the difference between the images is almost 10%.
In this way we resolved our problem of identifying blurred images, and the results were quite good.
Results were published in the following conference paper:
Creating digital life stories through activity recognition with image filtering (2010)
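Since the question is tagged C#, here's a hedged sketch of that re-blur comparison (not the authors' published code), using AForge's GaussianBlur on an 8bpp grayscale image; the threshold is only illustrative and GetPixel is used for brevity:

    using System;
    using System.Drawing;
    using AForge.Imaging.Filters;

    // Blur the image once more and measure how much it changes; a small
    // change suggests the input was already blurry.
    static bool LooksBlurry(Bitmap gray8bpp, double changeThreshold = 0.10)
    {
        Bitmap reblurred = new GaussianBlur(2.0, 7).Apply(gray8bpp);

        double totalChange = 0;
        for (int y = 0; y < gray8bpp.Height; y++)
            for (int x = 0; x < gray8bpp.Width; x++)
                totalChange += Math.Abs(gray8bpp.GetPixel(x, y).R - reblurred.GetPixel(x, y).R);

        double meanChange = totalChange / (gray8bpp.Width * gray8bpp.Height * 255.0);
        return meanChange < changeThreshold; // little change => image was already blurred
    }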
Err... do you have the original image? What you ask is not a simple thing to do, though... if it's even possible.
This is just a bit of a random thought, but you might be able to do it by Fourier transforming both the "blurred" and the original image and seeing if you can produce something with a very similar frequency profile by progressively low-pass filtering the original image in the frequency domain. Testing for "similarity" would be fairly complex in itself, though.
