I have a picture containing text:
I made a method to detect text rows. This method returns the 4 corners of the text zone (always sorted):
I want to modify the bitmap to draw a semi-transparent rectangle through these 4 corners. Something like this:
My image is in grayscale. I wrote a function to draw a rectangle, but I only manage to draw an axis-aligned rectangle:
public static void SaveDrawRectangle(int width, int height, Byte[] matrix, int dpi, System.Drawing.Point[] corners, string path)
{
System.Windows.Media.Imaging.WriteableBitmap wbm = new System.Windows.Media.Imaging.WriteableBitmap(width, height, dpi, dpi, System.Windows.Media.PixelFormats.Bgra32, null);
uint[] pixels = new uint[width * height];
for (int Y = 0; Y < height; Y++)
{
for (int X = 0; X < width; X++)
{
byte pixel = matrix[Y * width + X];
int red = pixel;
int green = pixel;
int blue = pixel;
int alpha = 255;
if (X >= corners[0].X && X <= corners[1].X &&
Y >= corners[0].Y && Y <= corners[3].Y)
{
red = 255;
alpha = 255;
}
pixels[Y * width + X] = (uint)((alpha << 24) + (red << 16) + (green << 8) + blue);
}
}
wbm.WritePixels(new System.Windows.Int32Rect(0, 0, width, height), pixels, width * 4, 0);
using (FileStream stream5 = new FileStream(path, FileMode.Create))
{
PngBitmapEncoder encoder5 = new PngBitmapEncoder();
encoder5.Frames.Add(BitmapFrame.Create(wbm));
encoder5.Save(stream5);
}
}
How can I draw a rectangle from 4 corners?
I modified my condition, replacing it with this code:
public static void SaveDrawRectangle(int width, int height, Byte[] matrix, int dpi, List<Point> corners, string path)
{
System.Windows.Media.Imaging.WriteableBitmap wbm = new System.Windows.Media.Imaging.WriteableBitmap(width, height, dpi, dpi, System.Windows.Media.PixelFormats.Bgra32, null);
uint[] pixels = new uint[width * height];
for (int Y = 0; Y < height; Y++)
{
for (int X = 0; X < width; X++)
{
byte pixel = matrix[Y * width + X];
int red = pixel;
int green = pixel;
int blue = pixel;
int alpha = 255;
if (IsInRectangle(X, Y, corners))
{
red = 255;
}
pixels[Y * width + X] = (uint)((alpha << 24) + (red << 16) + (green << 8) + blue);
}
}
wbm.WritePixels(new System.Windows.Int32Rect(0, 0, width, height), pixels, width * 4, 0);
using (FileStream stream5 = new FileStream(path, FileMode.Create))
{
PngBitmapEncoder encoder5 = new PngBitmapEncoder();
encoder5.Frames.Add(BitmapFrame.Create(wbm));
encoder5.Save(stream5);
}
}
public static bool IsInRectangle(int X, int Y, List<Point> corners)
{
Point p1, p2;
bool inside = false;
if (corners.Count < 3)
{
return inside;
}
var oldPoint = new Point(
corners[corners.Count - 1].X, corners[corners.Count - 1].Y);
for (int i = 0; i < corners.Count; i++)
{
var newPoint = new Point(corners[i].X, corners[i].Y);
if (newPoint.X > oldPoint.X)
{
p1 = oldPoint;
p2 = newPoint;
}
else
{
p1 = newPoint;
p2 = oldPoint;
}
if ((newPoint.X < X) == (X <= oldPoint.X)
&& (Y - (long)p1.Y) * (p2.X - p1.X)
< (p2.Y - (long)p1.Y) * (X - p1.X))
{
inside = !inside;
}
oldPoint = newPoint;
}
return inside;
}
It works, but it has 2 drawbacks:
the generated images are very big (the base image takes 6 MB, and after drawing it takes 25 MB)
generation is slow (my images are 5000x7000 pixels and the process takes 10 seconds)
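One possible way to reduce the run time is to call IsInRectangle only for pixels inside the axis-aligned bounds of the four corners, and to get the transparency effect by blending the overlay color with the gray value instead of replacing it. The following is a minimal sketch of those two helpers. The names QuadOverlay, Bounds and Blend are hypothetical, and the corners are passed as plain coordinate arrays for illustration.

```csharp
using System;

public static class QuadOverlay
{
    // Axis-aligned bounds of the corners, clamped to the image:
    // returns { minX, minY, maxX, maxY }.
    public static int[] Bounds(int[] xs, int[] ys, int width, int height)
    {
        int minX = int.MaxValue, minY = int.MaxValue;
        int maxX = int.MinValue, maxY = int.MinValue;
        for (int i = 0; i < xs.Length; i++)
        {
            minX = Math.Min(minX, xs[i]); maxX = Math.Max(maxX, xs[i]);
            minY = Math.Min(minY, ys[i]); maxY = Math.Max(maxY, ys[i]);
        }
        return new[] { Math.Max(minX, 0), Math.Max(minY, 0),
                       Math.Min(maxX, width - 1), Math.Min(maxY, height - 1) };
    }

    // Blend one overlay channel over a gray value.
    // opacity 0 keeps the gray value, opacity 1 replaces it entirely.
    public static byte Blend(byte gray, byte overlay, double opacity)
    {
        return (byte)Math.Round(gray * (1.0 - opacity) + overlay * opacity);
    }
}
```

Inside the pixel loop, the polygon test would then run only when X and Y fall within the returned bounds, and a translucent red could be written as red = Blend(pixel, 255, 0.4) with green and blue set to Blend(pixel, 0, 0.4). The larger output file is probably a consequence of storing four bytes per pixel (Bgra32) instead of one gray byte, so this sketch does not address it.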
There is probably a better way, but this approach works well.
In the constructor:
Bitmap bitmap = new Bitmap(@"I:\test\Untitled3.jpg");
Bitmap GrayScaleBitmap = GrayScale(bitmap);
And the GrayScale method:
private Bitmap GrayScale(Bitmap bmp)
{
//get image dimension
int width = bmp.Width;
int height = bmp.Height;
//color of pixel
System.Drawing.Color p;
//grayscale
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
//get pixel value
p = bmp.GetPixel(x, y);
//extract pixel component ARGB
int a = p.A;
int r = p.R;
int g = p.G;
int b = p.B;
//find average
int avg = (r + g + b) / 3;
//set new pixel value
bmp.SetPixel(x, y, System.Drawing.Color.FromArgb(a, avg, avg, avg));
}
}
return bmp;
}
This is the original image I made in Paint for testing:
And the result I'm getting from GrayScale:
The red shapes are now black on a white background.
Instead, I want the red shapes to be white on a black background, and I'm not sure how to change the GrayScale method to do that.
What you're doing is correct for grayscale.
What you want is a combination of grayscale and inversion.
Given that each RGB channel ranges over [0 - 255], you can use that to invert the current colors. I would suggest the following change:
int avg = 255 - ((r + g + b) / 3);
Which should achieve what you're after.
To illustrate why this works:
White is 255, 255, 255
Black is 0, 0, 0
So 255 - ((255 + 255 + 255) / 3) = 0 (i.e. white becomes black).
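The arithmetic can be checked in isolation. Below is a small sketch (the class and method names are hypothetical) that applies the inverted average to a few colors. Note that a pure red pixel (255, 0, 0) averages to 85, so it inverts to 170, a light gray rather than pure white.

```csharp
public static class GrayInvert
{
    // Inverted grayscale value: bright input becomes dark and vice versa.
    public static int InvertedAverage(int r, int g, int b)
    {
        return 255 - (r + g + b) / 3;
    }
}
```

For example, InvertedAverage(255, 255, 255) gives 0, InvertedAverage(0, 0, 0) gives 255, and InvertedAverage(255, 0, 0) gives 170.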
I want the background to be white and the shape I draw to be black.
To achieve this, transform your avg (grayscale) value to 0 / 1 and multiply by 255.
//find white-black value
int bw = (r + g + b) / 384; // or /3 / 128 if you prefer
//set new pixel value
bmp.SetPixel(x, y, System.Drawing.Color.FromArgb(a, bw*255, bw*255, bw*255));
Which gives you a pure black-and-white image:
Even if I draw .. a brown shape on a yellow background the final
result should be black shapes on a white background.
My suggested approach covers this requirement but some sort of a "threshold" (like in Photoshop, for example) is necessary for some hues.
int threshold = 128; // [1-254]
//find avg value [0-255]
int wb = (r + g + b) / 3;
//transform avg to [0-1] using threshold
wb = (wb >= threshold) ? 1 : 0;
//set new pixel value
bmp.SetPixel(x, y, System.Drawing.Color.FromArgb(a, wb*255, wb * 255, wb * 255));
If the threshold is too low, the result will be all white; if it is too high, all black. In borderline cases you may also get distorted shapes or background (as in an image editor when the threshold does not suit the hues). The value 128 splits the range evenly (as my original algorithm did).
Some examples with a threshold that is too low:
A: Using threshold 70 with input brown on yellow.
B: Using threshold 60 with input brown on yellow.
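To see why a low threshold washes out a brown-on-yellow image, here is a small sketch of the same threshold rule as a standalone function. The function name is hypothetical, and the RGB values used for "brown" (165, 42, 42) and "yellow" (255, 255, 0) are assumptions for illustration, not the colors of the images above.

```csharp
public static class BwThreshold
{
    // Returns 0 (black) or 255 (white) for a single pixel.
    public static int ToBlackWhite(int r, int g, int b, int threshold)
    {
        int avg = (r + g + b) / 3;          // average in [0-255]
        return avg >= threshold ? 255 : 0;  // at or above threshold -> white
    }
}
```

With these assumed colors, brown averages to 83 and yellow to 170. A threshold of 128 separates them (brown becomes black, yellow becomes white), while a threshold of 70 turns both white, so the shape disappears.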
Full source code:
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Windows.Forms;
namespace BwImgSO
{
class Program
{
static void Main(string[] args)
{
Bitmap bitmap = new Bitmap(@"C:\Users\me\Desktop\color.jpg");
Bitmap grayScaleBitmap = GrayScale(bitmap);
string outputFileName = @"C:\Users\me\Desktop\bw.jpg";
using (MemoryStream memory = new MemoryStream())
{
using (FileStream fs = new FileStream(outputFileName, FileMode.Create, FileAccess.ReadWrite))
{
grayScaleBitmap.Save(memory, ImageFormat.Jpeg);
byte[] bytes = memory.ToArray();
fs.Write(bytes, 0, bytes.Length);
}
}
}
private static Bitmap GrayScale(Bitmap bmp)
{
//get image dimension
int width = bmp.Width;
int height = bmp.Height;
int threshold = 128;
//color of pixel
System.Drawing.Color p;
//grayscale
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
//get pixel value
p = bmp.GetPixel(x, y);
//extract pixel component ARGB
int a = p.A;
int r = p.R;
int g = p.G;
int b = p.B;
//find average, transform to b/w value
//int bw = (r + g + b) / 3 / 128; // or 384 if you prefer
int wb = (r + g + b) / 3; // avg: [0-255]
//transform avg to [0-1] using threshold
wb = (wb >= threshold) ? 1 : 0;
//set new pixel value
bmp.SetPixel(x, y, System.Drawing.Color.FromArgb(a, wb * 255, wb * 255, wb * 255));
}
}
return bmp;
}
}
}
In the following source code, I am trying to do the following:
Obtain a 3x3 array of double values
Convert that double array to Bitmap
Pad that Bitmap.
Bitmap image = ImageDataConverter.ToBitmap(new double[,]
{
{ .11, .11, .11, },
{ .11, .11, .11, },
{ .11, .11, .11, },
});
Bitmap paddedBitmap = ImagePadder.Pad(image, 512, 512);
pictureBox1.Image = paddedBitmap;
However, this source code throws the following exception in BitmapLocker.GetPixel(), because i = 8 and dataLength = 7.
Please note that the image stride is always found to be 4, no matter what the dimensions are.
How can I fix this?
Relevant Source Code
ImageDataConverter.cs
public class ImageDataConverter
{
public static Bitmap ToBitmap(double[,] input)
{
int width = input.GetLength(0);
int height = input.GetLength(1);
Bitmap output = Grayscale.CreateGrayscaleImage(width, height);
BitmapData data = output.LockBits(new Rectangle(0, 0, width, height),
ImageLockMode.WriteOnly,
output.PixelFormat);
int pixelSize = System.Drawing.Image.GetPixelFormatSize(output.PixelFormat) / 8;
int offset = data.Stride - width * pixelSize;
double Min = 0.0;
double Max = 255.0;
unsafe
{
byte* address = (byte*)data.Scan0.ToPointer();
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
double v = 255 * (input[x, y] - Min) / (Max - Min);
byte value = unchecked((byte)v);
for (int c = 0; c < pixelSize; c++, address++)
{
*address = value;
}
}
address += offset;
}
}
output.UnlockBits(data);
return output;
}
}
ImagePadder.cs
public class ImagePadder
{
public static Bitmap Pad(Bitmap image, int newWidth, int newHeight)
{
int width = image.Width;
int height = image.Height;
if (width >= newWidth) throw new Exception("New width must be larger than the old width");
if (height >= newHeight) throw new Exception("New height must be larger than the old height");
Bitmap paddedImage = Grayscale.CreateGrayscaleImage(newWidth, newHeight);
BitmapLocker inputImageLocker = new BitmapLocker(image);
BitmapLocker paddedImageLocker = new BitmapLocker(paddedImage);
inputImageLocker.Lock();
paddedImageLocker.Lock();
//Reading row by row
for (int y = 0; y < image.Height; y++)
{
for (int x = 0; x < image.Width; x++)
{
Color col = inputImageLocker.GetPixel(x, y);
paddedImageLocker.SetPixel(x, y, col);
}
}
string str = string.Empty;
paddedImageLocker.Unlock();
inputImageLocker.Unlock();
return paddedImage;
}
}
BitmapLocker.cs
public class BitmapLocker : IDisposable
{
//private properties
Bitmap _bitmap = null;
BitmapData _bitmapData = null;
private byte[] _imageData = null;
//public properties
public bool IsLocked { get; set; }
public IntPtr IntegerPointer { get; private set; }
public int Width { get { return _bitmap.Width; } }
public int Height { get { return _bitmap.Height; } }
public int Stride { get { return _bitmapData.Stride; } }
public int ColorDepth { get { return Bitmap.GetPixelFormatSize(_bitmap.PixelFormat); } }
public int Channels { get { return ColorDepth / 8; } }
public int PaddingOffset { get { return _bitmapData.Stride - (_bitmap.Width * Channels); } }
public PixelFormat ImagePixelFormat { get { return _bitmap.PixelFormat; } }
public bool IsGrayscale { get { return Grayscale.IsGrayscale(_bitmap); } }
//Constructor
public BitmapLocker(Bitmap source)
{
IsLocked = false;
IntegerPointer = IntPtr.Zero;
this._bitmap = source;
}
/// Lock bitmap
public void Lock()
{
if (IsLocked == false)
{
try
{
// Lock bitmap (so that no movement of data by .NET framework) and return bitmap data
_bitmapData = _bitmap.LockBits(
new Rectangle(0, 0, _bitmap.Width, _bitmap.Height),
ImageLockMode.ReadWrite,
_bitmap.PixelFormat);
// Create byte array to copy pixel values
int noOfBitsNeededForStorage = _bitmapData.Stride * _bitmapData.Height;
int noOfBytesNeededForStorage = noOfBitsNeededForStorage / 8;
_imageData = new byte[noOfBytesNeededForStorage * ColorDepth];//# of bytes needed for storage
IntegerPointer = _bitmapData.Scan0;
// Copy data from IntegerPointer to _imageData
Marshal.Copy(IntegerPointer, _imageData, 0, _imageData.Length);
IsLocked = true;
}
catch (Exception)
{
throw;
}
}
else
{
throw new Exception("Bitmap is already locked.");
}
}
/// Unlock bitmap
public void Unlock()
{
if (IsLocked == true)
{
try
{
// Copy data from _imageData to IntegerPointer
Marshal.Copy(_imageData, 0, IntegerPointer, _imageData.Length);
// Unlock bitmap data
_bitmap.UnlockBits(_bitmapData);
IsLocked = false;
}
catch (Exception)
{
throw;
}
}
else
{
throw new Exception("Bitmap is not locked.");
}
}
public Color GetPixel(int x, int y)
{
Color clr = Color.Empty;
// Get color components count
int cCount = ColorDepth / 8;
// Get start index of the specified pixel
int i = (Height - y - 1) * Stride + x * cCount;
int dataLength = _imageData.Length - cCount;
if (i > dataLength)
{
throw new IndexOutOfRangeException();
}
if (ColorDepth == 32) // For 32 bpp get Red, Green, Blue and Alpha
{
byte b = _imageData[i];
byte g = _imageData[i + 1];
byte r = _imageData[i + 2];
byte a = _imageData[i + 3]; // a
clr = Color.FromArgb(a, r, g, b);
}
if (ColorDepth == 24) // For 24 bpp get Red, Green and Blue
{
byte b = _imageData[i];
byte g = _imageData[i + 1];
byte r = _imageData[i + 2];
clr = Color.FromArgb(r, g, b);
}
if (ColorDepth == 8)
// For 8 bpp get color value (Red, Green and Blue values are the same)
{
byte c = _imageData[i];
clr = Color.FromArgb(c, c, c);
}
return clr;
}
public void SetPixel(int x, int y, Color color)
{
// Get color components count
int cCount = ColorDepth / 8;
// Get start index of the specified pixel
int i = (Height - y - 1) * Stride + x * cCount;
try
{
if (ColorDepth == 32) // For 32 bpp set Red, Green, Blue and Alpha
{
_imageData[i] = color.B;
_imageData[i + 1] = color.G;
_imageData[i + 2] = color.R;
_imageData[i + 3] = color.A;
}
if (ColorDepth == 24) // For 24 bpp set Red, Green and Blue
{
_imageData[i] = color.B;
_imageData[i + 1] = color.G;
_imageData[i + 2] = color.R;
}
if (ColorDepth == 8)
// For 8 bpp set color value (Red, Green and Blue values are the same)
{
_imageData[i] = color.B;
}
}
catch (Exception ex)
{
throw new Exception("(" + x + ", " + y + "), " + _imageData.Length + ", " + ex.Message + ", i=" + i);
}
}
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
protected virtual void Dispose(bool disposing)
{
if (disposing)
{
// free managed resources
_bitmap = null;
_bitmapData = null;
_imageData = null;
IntegerPointer = IntPtr.Zero;
}
}
}
The problem is in the BitmapLocker class. Besides obvious inefficiencies, the class contains two serious bugs.
The first (which causes the exception) is the incorrect buffer size calculation inside the Lock method:
int noOfBitsNeededForStorage = _bitmapData.Stride * _bitmapData.Height;
int noOfBytesNeededForStorage = noOfBitsNeededForStorage / 8;
_imageData = new byte[noOfBytesNeededForStorage * ColorDepth];//# of bytes needed for storage
The Stride property returns
The stride width, in bytes, of the Bitmap object.
and also
The stride is the width of a single row of pixels (a scan line), rounded up to a four-byte boundary. If the stride is positive, the bitmap is top-down. If the stride is negative, the bitmap is bottom-up.
so the correct calculation (shown in several LockBits related MSDN samples) is:
int noOfBytesNeededForStorage = Math.Abs(_bitmapData.Stride) * _bitmapData.Height;
_imageData = new byte[noOfBytesNeededForStorage];
which will fix the exception (your code was doing (12 / 8) * 8 which resulted in 8 rather than the expected 12).
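The numbers for the 3x3, 8 bpp test image can be reproduced with a short sketch. The class and method names are hypothetical. The Stride helper uses the usual round-up-to-four-bytes rule for scan lines, which is stated here as an assumption about how the value 4 comes about rather than something taken from the code above.

```csharp
using System;

public static class StrideMath
{
    // Bytes per scan line, rounded up to a multiple of four.
    public static int Stride(int width, int bitsPerPixel)
    {
        return ((width * bitsPerPixel + 31) / 32) * 4;
    }

    // Corrected buffer size: the stride is already in bytes.
    public static int BufferSize(int stride, int height)
    {
        return Math.Abs(stride) * height;
    }

    // The original calculation, kept only to show where 8 comes from.
    public static int BuggyBufferSize(int stride, int height, int colorDepth)
    {
        return stride * height / 8 * colorDepth;
    }
}
```

For width 3 at 8 bpp the stride is 4, so the corrected size is 4 * 3 = 12 bytes, while the original code computes 12 / 8 * 8 = 8. That matches the reported failure at i = 8 with dataLength = 7.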
The second issue is the determination of the start index of the specified pixel here:
int i = (Height - y - 1) * Stride + x * cCount;
which is the calculation for a bottom-up bitmap with a positive Stride, a combination that, as you can see from the documentation, is not possible.
Hence the correct calculation should be something like this:
int i = (Stride > 0 ? y * Stride : (Height - y - 1) * -Stride) + x * cCount;
or
int i = (Stride > 0 ? y : y - Height + 1) * Stride + x * cCount;
This should be changed in both GetPixel and SetPixel methods.
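As a sketch, the index calculation could be pulled into one shared helper so that GetPixel and SetPixel cannot drift apart. The helper name is hypothetical, and it assumes the managed buffer holds the rows starting from the lowest memory address.

```csharp
public static class PixelMath
{
    // Byte offset of pixel (x, y) in the copied buffer.
    // Positive stride: top-down rows. Negative stride: bottom-up rows.
    public static int PixelIndex(int x, int y, int stride, int height, int bytesPerPixel)
    {
        int row = stride > 0 ? y * stride : (height - y - 1) * -stride;
        return row + x * bytesPerPixel;
    }
}
```

For the 3x3, 8 bpp image with stride 4, the last pixel (2, 2) maps to offset 10, which fits inside the 12-byte buffer.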
I am writing a .NET wrapper for Tesseract OCR, and if I use a grayscale image instead of an RGB image as the input file, the results are pretty good.
So I searched the web for a C# solution to convert an RGB image to a grayscale image, and I found this code.
It performs 3 operations to increase the accuracy of Tesseract:
Resize the image
Convert it into a grayscale image
Remove noise from the image
The converted image now gives almost 90% accurate results.
//Resize
public Bitmap Resize(Bitmap bmp, int newWidth, int newHeight)
{
Bitmap temp = (Bitmap)bmp;
Bitmap bmap = new Bitmap(newWidth, newHeight, temp.PixelFormat);
double nWidthFactor = (double)temp.Width / (double)newWidth;
double nHeightFactor = (double)temp.Height / (double)newHeight;
double fx, fy, nx, ny;
int cx, cy, fr_x, fr_y;
Color color1 = new Color();
Color color2 = new Color();
Color color3 = new Color();
Color color4 = new Color();
byte nRed, nGreen, nBlue;
byte bp1, bp2;
for (int x = 0; x < bmap.Width; ++x)
{
for (int y = 0; y < bmap.Height; ++y)
{
fr_x = (int)Math.Floor(x * nWidthFactor);
fr_y = (int)Math.Floor(y * nHeightFactor);
cx = fr_x + 1;
if (cx >= temp.Width)
cx = fr_x;
cy = fr_y + 1;
if (cy >= temp.Height)
cy = fr_y;
fx = x * nWidthFactor - fr_x;
fy = y * nHeightFactor - fr_y;
nx = 1.0 - fx;
ny = 1.0 - fy;
color1 = temp.GetPixel(fr_x, fr_y);
color2 = temp.GetPixel(cx, fr_y);
color3 = temp.GetPixel(fr_x, cy);
color4 = temp.GetPixel(cx, cy);
// Blue
bp1 = (byte)(nx * color1.B + fx * color2.B);
bp2 = (byte)(nx * color3.B + fx * color4.B);
nBlue = (byte)(ny * (double)(bp1) + fy * (double)(bp2));
// Green
bp1 = (byte)(nx * color1.G + fx * color2.G);
bp2 = (byte)(nx * color3.G + fx * color4.G);
nGreen = (byte)(ny * (double)(bp1) + fy * (double)(bp2));
// Red
bp1 = (byte)(nx * color1.R + fx * color2.R);
bp2 = (byte)(nx * color3.R + fx * color4.R);
nRed = (byte)(ny * (double)(bp1) + fy * (double)(bp2));
bmap.SetPixel(x, y, System.Drawing.Color.FromArgb(255, nRed, nGreen, nBlue));
}
}
//I tried inlining the logic of the two functions below here, without their own for loops, to avoid looping repeatedly, but it did not help and took the same time.
bmap = SetGrayscale(bmap);
bmap = RemoveNoise(bmap);
return bmap;
}
//SetGrayscale
public Bitmap SetGrayscale(Bitmap img)
{
Bitmap temp = (Bitmap)img;
Bitmap bmap = (Bitmap)temp.Clone();
Color c;
for (int i = 0; i < bmap.Width; i++)
{
for (int j = 0; j < bmap.Height; j++)
{
c = bmap.GetPixel(i, j);
byte gray = (byte)(.299 * c.R + .587 * c.G + .114 * c.B);
bmap.SetPixel(i, j, Color.FromArgb(gray, gray, gray));
}
}
return (Bitmap)bmap.Clone();
}
//RemoveNoise
public Bitmap RemoveNoise(Bitmap bmap)
{
for (var x = 0; x < bmap.Width; x++)
{
for (var y = 0; y < bmap.Height; y++)
{
var pixel = bmap.GetPixel(x, y);
if (pixel.R < 162 && pixel.G < 162 && pixel.B < 162)
bmap.SetPixel(x, y, Color.Black);
}
}
for (var x = 0; x < bmap.Width; x++)
{
for (var y = 0; y < bmap.Height; y++)
{
var pixel = bmap.GetPixel(x, y);
if (pixel.R > 162 && pixel.G > 162 && pixel.B > 162)
bmap.SetPixel(x, y, Color.White);
}
}
return bmap;
}
But the problem is that the conversion takes a lot of time.
So I moved the logic of the SetGrayscale(Bitmap bmap) and
RemoveNoise(Bitmap bmap) functions inside the Resize() method to avoid repeated for loops,
but it did not solve my problem.
The Bitmap class's GetPixel() and SetPixel() methods are notoriously slow when called many times. A much faster way to read and write individual pixels in a bitmap is to lock it first.
There's a good example here of how to do that, with a handy LockedBitmap class that wraps the more awkward marshaling code.
Essentially what it does is use the LockBits() method in the Bitmap class, passing a rectangle for the region of the bitmap you want to lock, and then copy those pixels from its unmanaged memory location to a managed one for easier access.
Here's an example on how you would use that example class with your SetGrayscale() method:
public Bitmap SetGrayscale(Bitmap img)
{
LockedBitmap lockedBmp = new LockedBitmap((Bitmap)img.Clone()); // Clone() returns object, so cast it
lockedBmp.LockBits(); // lock the bits for faster access
Color c;
for (int i = 0; i < lockedBmp.Width; i++)
{
for (int j = 0; j < lockedBmp.Height; j++)
{
c = lockedBmp.GetPixel(i, j);
byte gray = (byte)(.299 * c.R + .587 * c.G + .114 * c.B);
lockedBmp.SetPixel(i, j, Color.FromArgb(gray, gray, gray));
}
}
lockedBmp.UnlockBits(); // remember to release resources
return lockedBmp.Bitmap; // return the bitmap (you don't need to clone it again, that's already been done).
}
This wrapper class has saved me a ridiculous amount of time in bitmap processing. Once you've implemented this in all your methods, preferably only calling LockBits() once, then I'm sure your application's performance will improve tremendously.
I also see that you're cloning the images a lot. This probably doesn't take up as much time as the SetPixel()/GetPixel() thing, but its time can still be significant especially with larger images.
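Once the pixels are in a managed byte array (for example, copied with Marshal.Copy from BitmapData.Scan0 after LockBits), the grayscale and noise-removal steps can run in a single pass over that array. The following is a rough sketch of such a pass, not the LockedBitmap class itself. The class name, method name and parameters are hypothetical, it assumes BGR or BGRA byte order with a positive stride, and it uses integer forms of the same .299/.587/.114 weights.

```csharp
public static class GrayBuffer
{
    // Converts each pixel to gray, then pushes values below the cutoff
    // to black and values above it to white, all in one pass.
    public static void GrayAndBinarize(byte[] buffer, int width, int height,
                                       int stride, int bytesPerPixel, int cutoff)
    {
        for (int y = 0; y < height; y++)
        {
            int i = y * stride; // start of row y; padding bytes are skipped
            for (int x = 0; x < width; x++, i += bytesPerPixel)
            {
                int b = buffer[i], g = buffer[i + 1], r = buffer[i + 2];
                int gray = (299 * r + 587 * g + 114 * b) / 1000;
                if (gray < cutoff) gray = 0;
                else if (gray > cutoff) gray = 255;
                buffer[i] = buffer[i + 1] = buffer[i + 2] = (byte)gray;
            }
        }
    }
}
```

With cutoff = 162 this mirrors the two RemoveNoise loops from the question. After the pass, the array would be copied back with Marshal.Copy before calling UnlockBits.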
The easiest way would be to redraw the image onto itself using DrawImage and passing a suitable ColorMatrix. Google for ColorMatrix and gray scale and you'll find a ton of examples, this one for example: http://www.codeproject.com/Articles/3772/ColorMatrix-Basics-Simple-Image-Color-Adjustment
I'm trying to scan 2 images (32bppArgb format), identify when there is a difference and store the difference block's bounds in a list of rectangles.
Suppose these are the images:
second:
I want to get the different rectangle bounds (the opened directory window in our case).
This is what I've done:
private unsafe List<Rectangle> CodeImage(Bitmap bmp, Bitmap bmp2)
{
List<Rectangle> rec = new List<Rectangle>();
bmData = bmp.LockBits(new System.Drawing.Rectangle(0, 0, 1920, 1080), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp.PixelFormat);
bmData2 = bmp2.LockBits(new System.Drawing.Rectangle(0, 0, 1920, 1080), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp2.PixelFormat);
IntPtr scan0 = bmData.Scan0;
IntPtr scan02 = bmData2.Scan0;
int stride = bmData.Stride;
int stride2 = bmData2.Stride;
int nWidth = bmp.Width;
int nHeight = bmp.Height;
int minX = int.MaxValue;
int minY = int.MaxValue;
int maxX = 0;
bool found = false;
for (int y = 0; y < nHeight; y++)
{
byte* p = (byte*)scan0.ToPointer();
p += y * stride;
byte* p2 = (byte*)scan02.ToPointer();
p2 += y * stride2;
for (int x = 0; x < nWidth; x++)
{
if (p[0] != p2[0] || p[1] != p2[1] || p[2] != p2[2] || p[3] != p2[3]) //found differences-began to store positions.
{
found = true;
if (x < minX)
minX = x;
if (x > maxX)
maxX = x;
if (y < minY)
minY = y;
}
else
{
if (found)
{
int height = getBlockHeight(stride, scan0, maxX, minY, scan02, stride2);
found = false;
Rectangle temp = new Rectangle(minX, minY, maxX - minX, height);
rec.Add(temp);
//x += minX;
y += height;
minX = int.MaxValue;
minY = int.MaxValue;
maxX = 0;
}
}
p += 4;
p2 += 4;
}
}
return rec;
}
public unsafe int getBlockHeight(int stride, IntPtr scan, int x, int y1, IntPtr scan02, int stride2) //a function to get an existing block height.
{
int height = 0;
for (int y = y1; y < 1080; y++) //only for example- in our case its 1080 height.
{
byte* p = (byte*)scan.ToPointer();
p += (y * stride) + (x * 4); //set the pointer to a specific potential point.
byte* p2 = (byte*)scan02.ToPointer();
p2 += (y * stride2) + (x * 4); //set the pointer to a specific potential point.
if (p[0] != p2[0] || p[1] != p2[1] || p[2] != p2[2] || p[3] != p2[3]) //still change on the height in the increasing **y** of the block.
height++;
}
return height;
}
This is actually how I call the method:
Bitmap a = Image.FromFile(@"C:\Users\itapi\Desktop\1.png") as Bitmap; //loads as a 32bppArgb bitmap
Bitmap b = Image.FromFile(@"C:\Users\itapi\Desktop\2.png") as Bitmap;
List<Rectangle> l1 = CodeImage(a, b);
int i = 0;
foreach (Rectangle rec in l1)
{
i++;
Bitmap tmp = b.Clone(rec, a.PixelFormat);
tmp.Save(i.ToString() + ".png");
}
But I'm not getting the exact rectangle. I'm getting only half of it, and sometimes even worse. I think something in the code's logic is wrong.
Code for @nico
private unsafe List<Rectangle> CodeImage(Bitmap bmp, Bitmap bmp2)
{
List<Rectangle> rec = new List<Rectangle>();
var bmData1 = bmp.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp.PixelFormat);
var bmData2 = bmp2.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp2.PixelFormat);
int bytesPerPixel = 3;
IntPtr scan01 = bmData1.Scan0;
IntPtr scan02 = bmData2.Scan0;
int stride1 = bmData1.Stride;
int stride2 = bmData2.Stride;
int nWidth = bmp.Width;
int nHeight = bmp.Height;
bool[] visited = new bool[nWidth * nHeight];
byte* base1 = (byte*)scan01.ToPointer();
byte* base2 = (byte*)scan02.ToPointer();
for (int y = 0; y < nHeight; y += 5)
{
byte* p1 = base1;
byte* p2 = base2;
for (int x = 0; x < nWidth; x += 5)
{
if (!ArePixelsEqual(p1, p2, bytesPerPixel) && !(visited[x + nWidth * y]))
{
// fill the different area
int minX = x;
int maxX = x;
int minY = y;
int maxY = y;
var pt = new Point(x, y);
Stack<Point> toBeProcessed = new Stack<Point> ();
visited[x + nWidth * y] = true;
toBeProcessed.Push(pt);
while (toBeProcessed.Count > 0)
{
var process = toBeProcessed.Pop();
var ptr1 = (byte*)scan01.ToPointer() + process.Y * stride1 + process.X * bytesPerPixel;
var ptr2 = (byte*) scan02.ToPointer() + process.Y * stride2 + process.X * bytesPerPixel;
//Check pixel equality
if (ArePixelsEqual(ptr1, ptr2, bytesPerPixel))
continue;
//This pixel is different
//Update the rectangle
if (process.X < minX) minX = process.X;
if (process.X > maxX) maxX = process.X;
if (process.Y < minY) minY = process.Y;
if (process.Y > maxY) maxY = process.Y;
Point n;
int idx;
//Put neighbors in stack
if (process.X - 1 >= 0)
{
n = new Point(process.X - 1, process.Y);
idx = n.X + nWidth * n.Y;
if (!visited[idx])
{
visited[idx] = true;
toBeProcessed.Push(n);
}
}
if (process.X + 1 < nWidth)
{
n = new Point(process.X + 1, process.Y);
idx = n.X + nWidth * n.Y;
if (!visited[idx])
{
visited[idx] = true;
toBeProcessed.Push(n);
}
}
if (process.Y - 1 >= 0)
{
n = new Point(process.X, process.Y - 1);
idx = n.X + nWidth * n.Y;
if (!visited[idx])
{
visited[idx] = true;
toBeProcessed.Push(n);
}
}
if (process.Y + 1 < nHeight)
{
n = new Point(process.X, process.Y + 1);
idx = n.X + nWidth * n.Y;
if (!visited[idx])
{
visited[idx] = true;
toBeProcessed.Push(n);
}
}
}
if (((maxX - minX + 1) > 5) & ((maxY - minY + 1) > 5))
rec.Add(new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1));
}
p1 += 5 * bytesPerPixel;
p2 += 5 * bytesPerPixel;
}
base1 += 5 * stride1;
base2 += 5 * stride2;
}
bmp.UnlockBits(bmData1);
bmp2.UnlockBits(bmData2);
return rec;
}
I see a couple of problems with your code. If I understand it correctly, you
find a pixel that's different between the two images.
then you continue to scan from there to the right, until you find a position where both images are identical again.
then you scan from the last "different" pixel to the bottom, until you find a position where both images are identical again.
then you store that rectangle and start at the next line below it
Am I right so far?
Two obvious things can go wrong here:
If two rectangles have overlapping y-ranges, you're in trouble: You'll find the first rectangle fine, then skip to the bottom Y-coordinate, ignoring all the pixels left or right of the rectangle you just found.
Even if there is only one rectangle, you assume that every pixel on the rectangle's border is different, and all the other pixels are identical. If that assumption isn't valid, you'll stop searching too early, and only find parts of rectangles.
If your images come from a scanner or digital camera, or if they contain lossy compression (JPEG) artifacts, the second assumption will almost certainly be wrong. To illustrate this, here's what I get when I mark every identical pixel in the two JPEG images you linked black, and every different pixel white:
What you see is not a rectangle. Instead, a lot of pixels around the rectangles you're looking for are different:
That's because of jpeg compression artifacts. But even if you used lossless source images, pixels at the borders might not form perfect rectangles, because of antialiasing or because the background just happens to have a similar color in that region.
You could try to improve your algorithm, but if you look at that border, you will find all kinds of ugly counterexamples to any geometric assumptions you'll make.
It would probably be better to implement this "the right way". Meaning:
Either implement a flood fill algorithm that erases different pixels (e.g. by setting them to identical or by storing a flag in a separate mask), then recursively checks the 4 neighboring pixels.
Or implement a connected component labeling algorithm, that marks each different pixel with a temporary integer label, using clever data structures to keep track which temporary labels are connected. If you're only interested in a bounding box, you don't even have to merge the temporary labels, just merge the bounding boxes of adjacent labeled areas.
Connected component labeling is in general a bit faster, but is a bit trickier to get right than flood fill.
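To make the second option more concrete, here is a minimal sketch of two-pass connected component labeling that only tracks bounding boxes. It works on a boolean difference mask rather than on the bitmaps directly. The class name, the mask layout (mask[y, x]) and the { minX, minY, maxX, maxY } box format are assumptions for illustration, and a small union-find list stands in for the "clever data structures" mentioned above.

```csharp
using System;
using System.Collections.Generic;

public static class DiffLabeling
{
    // Follows parent links to the root label, shortening the path on the way.
    static int Find(List<int> parent, int i)
    {
        while (parent[i] != i) { parent[i] = parent[parent[i]]; i = parent[i]; }
        return i;
    }

    // mask[y, x] is true where the two images differ.
    // Returns one { minX, minY, maxX, maxY } box per 4-connected region.
    public static List<int[]> BoundingBoxes(bool[,] mask)
    {
        int h = mask.GetLength(0), w = mask.GetLength(1);
        var labels = new int[h, w];        // 0 means "not different"
        var parent = new List<int> { 0 };  // index 0 is unused

        // Pass 1: assign temporary labels and record which ones touch.
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
            {
                if (!mask[y, x]) continue;
                int left = x > 0 ? labels[y, x - 1] : 0;
                int up = y > 0 ? labels[y - 1, x] : 0;
                if (left == 0 && up == 0)
                {
                    parent.Add(parent.Count);          // new label is its own root
                    labels[y, x] = parent.Count - 1;
                }
                else if (left != 0 && up != 0)
                {
                    int a = Find(parent, left), b = Find(parent, up);
                    parent[Math.Max(a, b)] = Math.Min(a, b); // merge the two sets
                    labels[y, x] = Math.Min(a, b);
                }
                else labels[y, x] = Math.Max(left, up);
            }

        // Pass 2: grow one bounding box per root label.
        var boxes = new Dictionary<int, int[]>();
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
            {
                if (labels[y, x] == 0) continue;
                int root = Find(parent, labels[y, x]);
                if (!boxes.TryGetValue(root, out int[] box))
                    boxes[root] = box = new[] { x, y, x, y };
                box[0] = Math.Min(box[0], x); box[1] = Math.Min(box[1], y);
                box[2] = Math.Max(box[2], x); box[3] = Math.Max(box[3], y);
            }
        return new List<int[]>(boxes.Values);
    }
}
```

The mask itself would be filled by comparing the two locked bitmaps pixel by pixel, possibly with a tolerance so that JPEG artifacts do not each become their own region.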
One last piece of advice: I would rethink your "no 3rd party libraries" policy if I were you. Even if your final product will contain no 3rd party libraries, development might be a lot faster if you used well-documented, well-tested, useful building blocks from a library, then replaced them one by one with your own code. (And who knows, you might even find an open source library with a suitable license that's so much faster than your own code that you'll stick with it in the end...)
ADD: In case you want to rethink your "no libraries" position: Here's a quick and simple implementation using AForge (which has a more permissive license than emgucv):
private static void ProcessImages()
{
// load images
var img1 = AForge.Imaging.Image.FromFile(@"compare1.jpg");
var img2 = AForge.Imaging.Image.FromFile(@"compare2.jpg");
// calculate absolute difference
var difference = new AForge.Imaging.Filters.ThresholdedDifference(15)
{OverlayImage = img1}
.Apply(img2);
// create and initialize the blob counter
var bc = new AForge.Imaging.BlobCounter();
bc.FilterBlobs = true;
bc.MinWidth = 5;
bc.MinHeight = 5;
// find blobs
bc.ProcessImage(difference);
// draw result
BitmapData data = img2.LockBits(
new Rectangle(0, 0, img2.Width, img2.Height),
ImageLockMode.ReadWrite, img2.PixelFormat);
foreach (var rc in bc.GetObjectsRectangles())
AForge.Imaging.Drawing.FillRectangle(data, rc, Color.FromArgb(128,Color.Red));
img2.UnlockBits(data);
img2.Save(@"compareResult.jpg");
}
The actual difference + blob detection part (without loading and result display) takes about 43 ms on the second run (the first run takes longer, of course, due to JIT compilation, caching, etc.)
Result (the rectangle is larger due to jpeg artifacts):
Here is a flood-fill based version of your code. It checks every pixel for difference. If it finds a different pixel, it runs an exploration to find the entire different area.
The code is only meant as an illustration. There are certainly some points that could be improved.
static unsafe bool ArePixelsEqual(byte* p1, byte* p2, int bytesPerPixel) // static, because the static CodeImage below calls it
{
for (int i = 0; i < bytesPerPixel; ++i)
if (p1[i] != p2[i])
return false;
return true;
}
private static unsafe List<Rectangle> CodeImage(Bitmap bmp, Bitmap bmp2)
{
if (bmp.PixelFormat != bmp2.PixelFormat || bmp.Width != bmp2.Width || bmp.Height != bmp2.Height)
throw new ArgumentException();
List<Rectangle> rec = new List<Rectangle>();
var bmData1 = bmp.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp.PixelFormat);
var bmData2 = bmp2.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp2.PixelFormat);
int bytesPerPixel = Image.GetPixelFormatSize(bmp.PixelFormat) / 8;
IntPtr scan01 = bmData1.Scan0;
IntPtr scan02 = bmData2.Scan0;
int stride1 = bmData1.Stride;
int stride2 = bmData2.Stride;
int nWidth = bmp.Width;
int nHeight = bmp.Height;
bool[] visited = new bool[nWidth * nHeight];
byte* base1 = (byte*)scan01.ToPointer();
byte* base2 = (byte*)scan02.ToPointer();
for (int y = 0; y < nHeight; y++)
{
byte* p1 = base1;
byte* p2 = base2;
for (int x = 0; x < nWidth; ++x)
{
if (!ArePixelsEqual(p1, p2, bytesPerPixel) && !(visited[x + nWidth * y]))
{
// fill the different area
int minX = x;
int maxX = x;
int minY = y;
int maxY = y;
var pt = new Point(x, y);
Stack<Point> toBeProcessed = new Stack<Point>();
visited[x + nWidth * y] = true;
toBeProcessed.Push(pt);
while (toBeProcessed.Count > 0)
{
var process = toBeProcessed.Pop();
var ptr1 = (byte*)scan01.ToPointer() + process.Y * stride1 + process.X * bytesPerPixel;
var ptr2 = (byte*)scan02.ToPointer() + process.Y * stride2 + process.X * bytesPerPixel;
//Check pixel equality
if (ArePixelsEqual(ptr1, ptr2, bytesPerPixel))
continue;
//This pixel is different
//Update the rectangle
if (process.X < minX) minX = process.X;
if (process.X > maxX) maxX = process.X;
if (process.Y < minY) minY = process.Y;
if (process.Y > maxY) maxY = process.Y;
Point n; int idx;
//Put neighbors in stack
if (process.X - 1 >= 0)
{
n = new Point(process.X - 1, process.Y); idx = n.X + nWidth * n.Y;
if (!visited[idx]) { visited[idx] = true; toBeProcessed.Push(n); }
}
if (process.X + 1 < nWidth)
{
n = new Point(process.X + 1, process.Y); idx = n.X + nWidth * n.Y;
if (!visited[idx]) { visited[idx] = true; toBeProcessed.Push(n); }
}
if (process.Y - 1 >= 0)
{
n = new Point(process.X, process.Y - 1); idx = n.X + nWidth * n.Y;
if (!visited[idx]) { visited[idx] = true; toBeProcessed.Push(n); }
}
if (process.Y + 1 < nHeight)
{
n = new Point(process.X, process.Y + 1); idx = n.X + nWidth * n.Y;
if (!visited[idx]) { visited[idx] = true; toBeProcessed.Push(n); }
}
}
rec.Add(new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1));
}
p1 += bytesPerPixel;
p2 += bytesPerPixel;
}
base1 += stride1;
base2 += stride2;
}
bmp.UnlockBits(bmData1);
bmp2.UnlockBits(bmData2);
return rec;
}
You can achieve this easily using a flood fill segmentation algorithm.
First, a utility class to make fast bitmap access easier. It encapsulates the pointer logic and makes the code more readable:
class BitmapWithAccess
{
public Bitmap Bitmap { get; private set; }
public System.Drawing.Imaging.BitmapData BitmapData { get; private set; }
public BitmapWithAccess(Bitmap bitmap, System.Drawing.Imaging.ImageLockMode lockMode)
{
Bitmap = bitmap;
BitmapData = bitmap.LockBits(new Rectangle(Point.Empty, bitmap.Size), lockMode, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
}
public Color GetPixel(int x, int y)
{
unsafe
{
byte* dataPointer = MovePointer((byte*)BitmapData.Scan0, x, y);
return Color.FromArgb(dataPointer[3], dataPointer[2], dataPointer[1], dataPointer[0]);
}
}
public void SetPixel(int x, int y, Color color)
{
unsafe
{
byte* dataPointer = MovePointer((byte*)BitmapData.Scan0, x, y);
dataPointer[3] = color.A;
dataPointer[2] = color.R;
dataPointer[1] = color.G;
dataPointer[0] = color.B;
}
}
public void Release()
{
Bitmap.UnlockBits(BitmapData);
BitmapData = null;
}
private unsafe byte* MovePointer(byte* pointer, int x, int y)
{
return pointer + x * 4 + y * BitmapData.Stride;
}
}
Then a class representing a rectangle containing different pixels, to mark them in the resulting image. In general this class can also contain a list of Point instances (or a byte[,] map) to make indicating individual pixels in the resulting image possible:
class Segment
{
public int Left { get; set; }
public int Top { get; set; }
public int Right { get; set; }
public int Bottom { get; set; }
public Bitmap Bitmap { get; set; }
public Segment()
{
Left = int.MaxValue;
Right = int.MinValue;
Top = int.MaxValue;
Bottom = int.MinValue;
}
};
Then the steps of a simple algorithm are as follows:
1) find different pixels
2) use a flood-fill algorithm to find segments on the difference image
3) draw bounding rectangles for the segments found
The first step is the easiest one:
static Bitmap FindDifferentPixels(Bitmap i1, Bitmap i2)
{
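// Both dimensions come from i1, matching the loops below (the images are assumed to be the same size).
var result = new Bitmap(i1.Width, i1.Height, System.Drawing.Imaging.PixelFormat.Format32bppArgb);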
var ia1 = new BitmapWithAccess(i1, System.Drawing.Imaging.ImageLockMode.ReadOnly);
var ia2 = new BitmapWithAccess(i2, System.Drawing.Imaging.ImageLockMode.ReadOnly);
var ra = new BitmapWithAccess(result, System.Drawing.Imaging.ImageLockMode.ReadWrite);
for (int x = 0; x < i1.Width; ++x)
for (int y = 0; y < i1.Height; ++y)
{
var different = ia1.GetPixel(x, y) != ia2.GetPixel(x, y);
ra.SetPixel(x, y, different ? Color.White : Color.FromArgb(0, 0, 0, 0));
}
ia1.Release();
ia2.Release();
ra.Release();
return result;
}
And the second and the third steps are covered with the following three functions:
static List<Segment> Segmentize(Bitmap blackAndWhite)
{
var bawa = new BitmapWithAccess(blackAndWhite, System.Drawing.Imaging.ImageLockMode.ReadOnly);
var result = new List<Segment>();
bool[,] visitedPoints = new bool[blackAndWhite.Width, blackAndWhite.Height];
for (int x = 0;x < blackAndWhite.Width;++x)
for (int y = 0;y < blackAndWhite.Height;++y)
{
if (bawa.GetPixel(x, y).A != 0
&& !visitedPoints[x, y])
{
result.Add(BuildSegment(new Point(x, y), bawa, visitedPoints));
}
}
bawa.Release();
return result;
}
static Segment BuildSegment(Point startingPoint, BitmapWithAccess bawa, bool[,] visitedPoints)
{
var result = new Segment();
List<Point> toProcess = new List<Point>();
toProcess.Add(startingPoint);
while (toProcess.Count > 0)
{
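// Take from the end: removal is O(1), and the visiting order does not matter for a flood fill.
Point p = toProcess[toProcess.Count - 1];
toProcess.RemoveAt(toProcess.Count - 1);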
ProcessPoint(result, p, bawa, toProcess, visitedPoints);
}
return result;
}
static void ProcessPoint(Segment segment, Point point, BitmapWithAccess bawa, List<Point> toProcess, bool[,] visitedPoints)
{
for (int i = -1; i <= 1; ++i)
{
for (int j = -1; j <= 1; ++j)
{
int x = point.X + i;
int y = point.Y + j;
if (x < 0 || y < 0 || x >= bawa.Bitmap.Width || y >= bawa.Bitmap.Height)
continue;
if (bawa.GetPixel(x, y).A != 0 && !visitedPoints[x, y])
{
segment.Left = Math.Min(segment.Left, x);
segment.Right = Math.Max(segment.Right, x);
segment.Top = Math.Min(segment.Top, y);
segment.Bottom = Math.Max(segment.Bottom, y);
toProcess.Add(new Point(x, y));
visitedPoints[x, y] = true;
}
}
}
}
And the following program given your two images as arguments:
static void Main(string[] args)
{
Image ai1 = Image.FromFile(args[0]);
Image ai2 = Image.FromFile(args[1]);
Bitmap i1 = new Bitmap(ai1.Width, ai1.Height, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
Bitmap i2 = new Bitmap(ai2.Width, ai2.Height, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
using (var g1 = Graphics.FromImage(i1))
using (var g2 = Graphics.FromImage(i2))
{
g1.DrawImage(ai1, Point.Empty);
g2.DrawImage(ai2, Point.Empty);
}
var difference = FindDifferentPixels(i1, i2);
var segments = Segmentize(difference);
using (var g1 = Graphics.FromImage(i1))
{
foreach (var segment in segments)
{
g1.DrawRectangle(Pens.Red, new Rectangle(segment.Left, segment.Top, segment.Right - segment.Left, segment.Bottom - segment.Top));
}
}
i1.Save("result.png");
Console.WriteLine("Done.");
Console.ReadKey();
}
produces the following result:
As you can see, there are more differences between the given images than meet the eye. You can filter the resulting segments by size, for example, to drop the small artefacts. There is of course still much work to do in terms of error checking, design and performance.
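The size filter could look roughly like this. It is only a minimal sketch: `SegmentFilter`, `DropSmall`, `minWidth` and `minHeight` are made-up names, and `Segment` is repeated in trimmed form only so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copy of the Segment class above (bounds only).
class Segment
{
    public int Left { get; set; }
    public int Top { get; set; }
    public int Right { get; set; }
    public int Bottom { get; set; }
}

static class SegmentFilter
{
    // Keeps only segments that are at least minWidth x minHeight pixels.
    // The bounds are inclusive, so a single-pixel segment is 1 x 1.
    public static List<Segment> DropSmall(IEnumerable<Segment> segments, int minWidth, int minHeight)
    {
        return segments
            .Where(s => s.Right - s.Left + 1 >= minWidth
                     && s.Bottom - s.Top + 1 >= minHeight)
            .ToList();
    }
}
```

In `Main` this would be called right after `Segmentize`, for example `segments = SegmentFilter.DropSmall(segments, 4, 4);`. The threshold of 4 is arbitrary and would need tuning for your images.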
One idea is to proceed as follows:
1) Rescale images to a smaller size (downsample)
2) Run the above algorithm on smaller images
3) Run the above algorithm on original images, but restricting yourself only to rectangles found in step 2)
This can of course be extended to a multi-level hierarchical approach (using several image sizes and increasing accuracy with each step).
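A rough sketch of the three steps, under several assumptions of mine: the images are grayscale `byte[,]` arrays indexed `[x, y]`, their dimensions are multiples of `factor`, and downsampling is done by summing each block (an unnormalised average). The names `CoarseToFine`, `Downsample`, `DifferingCells` and `Refine` are made up, and step 3 here just computes a bounding box per candidate cell instead of re-running a full flood fill.

```csharp
using System;
using System.Collections.Generic;

static class CoarseToFine
{
    // Step 1: shrink the image by summing factor x factor blocks.
    public static int[,] Downsample(byte[,] img, int factor)
    {
        int w = img.GetLength(0) / factor, h = img.GetLength(1) / factor;
        var small = new int[w, h];
        for (int x = 0; x < w * factor; ++x)
            for (int y = 0; y < h * factor; ++y)
                small[x / factor, y / factor] += img[x, y];
        return small;
    }

    // Step 2: coarse cells whose sums differ are the candidate regions.
    public static List<(int X, int Y)> DifferingCells(byte[,] a, byte[,] b, int factor)
    {
        int[,] sa = Downsample(a, factor), sb = Downsample(b, factor);
        var cells = new List<(int X, int Y)>();
        for (int x = 0; x < sa.GetLength(0); ++x)
            for (int y = 0; y < sa.GetLength(1); ++y)
                if (sa[x, y] != sb[x, y]) cells.Add((x, y));
        return cells;
    }

    // Step 3: exact comparison restricted to one candidate cell.
    // Returns the inclusive bounds of the differing pixels in that cell.
    public static (int Left, int Top, int Right, int Bottom) Refine(
        byte[,] a, byte[,] b, (int X, int Y) cell, int factor)
    {
        int left = int.MaxValue, top = int.MaxValue;
        int right = int.MinValue, bottom = int.MinValue;
        for (int x = cell.X * factor; x < (cell.X + 1) * factor; ++x)
            for (int y = cell.Y * factor; y < (cell.Y + 1) * factor; ++y)
                if (a[x, y] != b[x, y])
                {
                    left = Math.Min(left, x); right = Math.Max(right, x);
                    top = Math.Min(top, y); bottom = Math.Max(bottom, y);
                }
        return (left, top, right, bottom);
    }
}
```

One caveat: changes inside a cell that cancel each other out in the sum (one pixel brighter by 1, another darker by 1) are missed by the coarse pass, so this trades a little accuracy for speed.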
Ah an algorithm challenge. Like! :-)
There are other answers here using e.g. flood fill that will work just fine. I just noticed that you wanted something fast, so let me propose a different idea. Unlike the other people, I haven't tested it; it shouldn't be too hard and should be quite fast, but I simply don't have the time at the moment to test it myself. If you do, please share the results. Also, note that it's not a standard algorithm, so there are probably some bugs here and there in my explanation (and no patents).
My idea is derived from mean adaptive thresholding, but with a lot of important differences. I cannot find the Wikipedia link or my code anymore, so I'll do this off the top of my head. Basically you create a new (64-bit) buffer for each of the two images and fill it with:
f(x,y) = colorvalue + f(x-1, y) + f(x, y-1) - f(x-1, y-1)
f(x,0) = colorvalue + f(x-1, 0)
f(0,y) = colorvalue + f(0, y-1)
The main trick is that you can calculate the sum value of a portion of the image fast, namely by:
g(x1,y1,x2,y2) = f(x2,y2)-f(x1-1,y2)-f(x2,y1-1)+f(x1-1,y1-1)
In other words, this will give the same result as:
result = 0;
for (x=x1; x<=x2; ++x)
for (y=y1; y<=y2; ++y)
result += colorvalue(x,y);
In our case this means that with only four lookups and three additions/subtractions you get a number that acts as a checksum of the block in question. I'd say that's pretty awesome.
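The buffer f is what is usually called a summed-area table (or integral image). Here is a minimal sketch of f and g, assuming the color values are already in an `int[,]` array indexed `[x, y]`; the names `SummedArea`, `Build` and `RegionSum` are mine, and anything outside the image is treated as 0, which matches the two border formulas above.

```csharp
using System;

static class SummedArea
{
    // f[x, y] = sum of values[i, j] for all i <= x and j <= y.
    public static long[,] Build(int[,] values)
    {
        int w = values.GetLength(0), h = values.GetLength(1);
        var f = new long[w, h];
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x)
                f[x, y] = values[x, y]
                        + (x > 0 ? f[x - 1, y] : 0)
                        + (y > 0 ? f[x, y - 1] : 0)
                        - (x > 0 && y > 0 ? f[x - 1, y - 1] : 0);
        return f;
    }

    // g(x1, y1, x2, y2): sum of the original values in the inclusive rectangle.
    public static long RegionSum(long[,] f, int x1, int y1, int x2, int y2)
    {
        long a = f[x2, y2];
        long b = x1 > 0 ? f[x1 - 1, y2] : 0;
        long c = y1 > 0 ? f[x2, y1 - 1] : 0;
        long d = x1 > 0 && y1 > 0 ? f[x1 - 1, y1 - 1] : 0;
        return a - b - c + d;
    }
}
```

Building the table is one pass over the image; after that every `RegionSum` call costs the same no matter how large the rectangle is.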
Now, in our case, we don't really care about the average value; we just care about some sort-of unique number. If the image changes, it should change - simple as that. As for colorvalue, usually some gray scale number is used for thresholding - instead, we'll be using the complete 24-bit RGB value. Because there are only so few compares, we can simply scan until we find a block that doesn't match.
The basic algorithm that I propose works as follows:
for (y=0; y<height;++y)
for (x=0; x<width; ++x)
if (src[x,y] != dst[x,y])
if (!IntersectsWith(x, y, foundBlocks))
FindBlock(x, y, foundBlocks);
Now, IntersectsWith can be something like a quad tree or, if there are only a few blocks, you can simply iterate through the blocks and check whether the pixel lies within the bounds of one of them. You can also advance the x variable past a block that was hit (I would). You can even balance things by re-building the buffer for f(x,y) if you have too many blocks (more precisely: merge found blocks back from dst into src, then rebuild the buffer).
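For the "only a few blocks" case, the linear scan could be sketched like this. `Block` and `BlockLookup` are hypothetical names, and the extra `skipTo` out parameter is my own addition to support advancing x past a block.

```csharp
using System;
using System.Collections.Generic;

// Inclusive block bounds; a stand-in for whatever FindBlock produces.
struct Block
{
    public readonly int X1, Y1, X2, Y2;
    public Block(int x1, int y1, int x2, int y2) { X1 = x1; Y1 = y1; X2 = x2; Y2 = y2; }
    public bool Contains(int x, int y) => x >= X1 && x <= X2 && y >= Y1 && y <= Y2;
}

static class BlockLookup
{
    // Linear scan over the blocks found so far.
    // On a hit, skipTo is the right edge of that block so the caller can jump past it;
    // on a miss, skipTo is simply x.
    public static bool IntersectsWith(int x, int y, List<Block> foundBlocks, out int skipTo)
    {
        foreach (var b in foundBlocks)
            if (b.Contains(x, y)) { skipTo = b.X2; return true; }
        skipTo = x;
        return false;
    }
}
```

Inside the inner loop you would then write something like `if (BlockLookup.IntersectsWith(x, y, foundBlocks, out int skipTo)) { x = skipTo; continue; }` so that the rest of the block on that row is never compared again.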
FindBlock is where it gets interesting. Using the formula for g that's now pretty easy:
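int x1 = x; int y1 = y; int x2 = x; int y2 = y;
bool changes = true;
while (changes)
{
changes = false;
// Grow each side while the adjacent one-pixel strip still differs (bounds checks omitted).
while (g(srcimage,x1-1,y1,x1-1,y2) != g(dstimage,x1-1,y1,x1-1,y2)) { --x1; changes = true; }
while (g(srcimage,x1,y1-1,x2,y1-1) != g(dstimage,x1,y1-1,x2,y1-1)) { --y1; changes = true; }
while (g(srcimage,x2+1,y1,x2+1,y2) != g(dstimage,x2+1,y1,x2+1,y2)) { ++x2; changes = true; }
while (g(srcimage,x1,y2+1,x2,y2+1) != g(dstimage,x1,y2+1,x2,y2+1)) { ++y2; changes = true; }
}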
That's it. Note that the complexity of the FindBlock algorithm is O(x + y) strip comparisons, where x and y are the width and height of the block, which is pretty awesome for finding a 2D block IMO. :-)
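If you want something compilable to experiment with, here is one possible take on the block search with explicit bounds checks. It is a sketch, not a tested implementation of the idea: `BlockFinder` is a made-up name, and `stripDiffers` is a hypothetical callback that should return true when the inclusive strip (x1, y1, x2, y2) differs between the two images, for example by comparing g on the source buffer with g on the destination buffer.

```csharp
using System;

static class BlockFinder
{
    // Starts at a differing pixel (x, y) and grows the block one strip at a time
    // until none of the four neighbouring strips differs any more.
    public static (int X1, int Y1, int X2, int Y2) FindBlock(
        int x, int y, int width, int height,
        Func<int, int, int, int, bool> stripDiffers)
    {
        int x1 = x, y1 = y, x2 = x, y2 = y;
        bool changes = true;
        while (changes)
        {
            changes = false;
            // Column to the left of the block.
            while (x1 > 0 && stripDiffers(x1 - 1, y1, x1 - 1, y2)) { --x1; changes = true; }
            // Row above the block.
            while (y1 > 0 && stripDiffers(x1, y1 - 1, x2, y1 - 1)) { --y1; changes = true; }
            // Column to the right of the block.
            while (x2 < width - 1 && stripDiffers(x2 + 1, y1, x2 + 1, y2)) { ++x2; changes = true; }
            // Row below the block.
            while (y2 < height - 1 && stripDiffers(x1, y2 + 1, x2, y2 + 1)) { ++y2; changes = true; }
        }
        return (x1, y1, x2, y2);
    }
}
```

Keep in mind that comparing sums is only a checksum: a strip whose changes happen to cancel out in the sum will look unchanged and stop the growth early.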
As I said, let me know how it turns out.