I am using Microsoft OnnxRuntime to detect and classify objects in images and I want to apply it to real-time video. To do that, I have to convert each frame into an OnnxRuntime Tensor. Right now I have implemented a method that takes around 300ms:
public Tensor<float> ConvertImageToFloatTensor(Bitmap image)
{
// Create the Tensor with the appropiate dimensions for the NN
Tensor<float> data = new DenseTensor<float>(new[] { 1, image.Width, image.Height, 3 });
// Iterate over the bitmap width and height and copy each pixel
for (int x = 0; x < image.Width; x++)
{
for (int y = 0; y < image.Height; y++)
{
Color color = image.GetPixel(x, y);
data[0, y, x, 0] = color.R / (float)255.0;
data[0, y, x, 1] = color.G / (float)255.0;
data[0, y, x, 2] = color.B / (float)255.0;
}
}
return data;
}
I need this code to run as fast as possible since I am representing the output bounding boxes of the detector as a layer on top of the video. Does anyone know a faster way of doing this conversión?
based in the answers by davidtbernal (Fast work with Bitmaps in C#) and FelipeDurar (Grayscale image from binary data) you should be able to access pixels faster using LockBits and a bit of "unsafe" code
public Tensor<float> ConvertImageToFloatTensorUnsafe(Bitmap image)
{
// Create the Tensor with the appropiate dimensions for the NN
Tensor<float> data = new DenseTensor<float>(new[] { 1, image.Width, image.Height, 3 });
BitmapData bmd = image.LockBits(new Rectangle(0, 0, image.Width, image.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, image.PixelFormat);
int PixelSize = 3;
unsafe
{
for (int y = 0; y < bmd.Height; y++)
{
// row is a pointer to a full row of data with each of its colors
byte* row = (byte*)bmd.Scan0 + (y * bmd.Stride);
for (int x = 0; x < bmd.Width; x++)
{
// note the order of colors is BGR
data[0, y, x, 0] = row[x*PixelSize + 2] / (float)255.0;
data[0, y, x, 1] = row[x*PixelSize + 1] / (float)255.0;
data[0, y, x, 2] = row[x*PixelSize + 0] / (float)255.0;
}
}
image.UnlockBits(bmd);
}
return data;
}
I've compared this piece of code averaging over 1000 runs and got about 3x performance improvement against your original code but results may vary.
Also note I've used 3 channels per pixel as your original answer uses those values only, if you use a 32bpp bitmap, you may change PixelSize to 4 and the last channel should be alpha channel (row[x*PixelSize + 3])
I have a problem. I need to perform this function with lockbits. Please I need help.
public void xPix(Bitmap bmp, int n, Color cx, Color nx)
{
try
{
for (int y = 0; y < bmp.Height; y++)
{
for (int x = 0; x < bmp.Width; x += (n * 2))
{
cx = bmp.GetPixel(x, y);
if (x + n <= bmp.Width - 1) nx = bmp.GetPixel(x + n, y);
bmp.SetPixel(x, y, nx);
if (x + n <= bmp.Width - 1) bmp.SetPixel(x + n, y, cx);
}
}
}
catch { }
}
There were lots of things that didn't make sense to me about your code. I fixed the pieces that were preventing an image from appearing and here is the result. I will explain my changes after the code.
public void xPix(Bitmap bmp, int n, Color cx, Color nx)
{
var img = bmp.LockBits(new Rectangle(Point.Empty, bmp.Size), System.Drawing.Imaging.ImageLockMode.ReadWrite, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
byte[] bmpBytes = new byte[Math.Abs(img.Stride) * img.Height];
System.Runtime.InteropServices.Marshal.Copy(img.Scan0, bmpBytes, 0, bmpBytes.Length);
for (int y = 0; y < img.Height; y++)
{
for (int x = 0; x < img.Width; x+=n*2)
{
cx = Color.FromArgb(BitConverter.ToInt32(bmpBytes, y * Math.Abs(img.Stride) + x * 4));
if (x + n <= img.Width - 1) nx = Color.FromArgb(BitConverter.ToInt32(bmpBytes, y * Math.Abs(img.Stride) + x * 4));
BitConverter.GetBytes(nx.ToArgb()).CopyTo(bmpBytes, y * Math.Abs(img.Stride) + x * 4);
if (x + n <= img.Width - 1) BitConverter.GetBytes(cx.ToArgb()).CopyTo(bmpBytes, y * Math.Abs(img.Stride) + (x + n) * 4);
}
}
System.Runtime.InteropServices.Marshal.Copy(bmpBytes, 0, img.Scan0, bmpBytes.Length);
bmp.UnlockBits(img);
}
protected override void OnClick(EventArgs e)
{
base.OnClick(e);
Bitmap bmp = new Bitmap(#"C:\Users\bluem\Downloads\Default.png");
for (int i = 0; i < bmp.Width; i++)
{
xPix(bmp, new Random().Next(20) + 1, System.Drawing.Color.White, System.Drawing.Color.Green);
}
Canvas.Image = bmp;
}
There's no such class as LockBitmap so I replaced it with the result of a call to Bitmap.LockBits directly.
The result of LockBits does not include functions for GetPixel and SetPixel, so I did what one normally does with the result of LockBits (see https://learn.microsoft.com/en-us/dotnet/api/system.drawing.bitmap.lockbits?view=netframework-4.7.2) and copied the data into a byte array instead.
When accessing the byte data directly, some math must be done to convert the x and y coordinates into a 1-dimensional coordinate within the array of bytes, which I did.
When accessing the byte data directly under the System.Drawing.Imaging.PixelFormat.Format32bppArgb pixel format, multiple bytes must be accessed to convert between byte data and a pixel color, which I did with BitConverter.GetBytes, BitConverter.ToInt32, Color.FromArgb and Color.ToArgb.
I don't think it's a good idea to be changing the Image in the middle of painting it. You should either be drawing the image directly during the Paint event, or changing the image outside the Paint event and allowing the system to draw it. So I used the OnClick of my form to trigger the function instead.
The first random number I got was 0, so I had to add 1 to avoid an endless loop.
The cx and nx parameters never seem to be used as inputs, so I put arbitrary color values in for them. Your x and y variables were not defined/declared anywhere.
If you want faster on-image-action, you can use Marshall.Copy method with Parallel.For
Why dont use GetPixel method? Because every time you call it, your ALL image is loaded to memory. GetPixel get one pixel, and UNLOAD all image. And in every iteration, ALL image is loaded to memory (for example, if u r working on 500x500 pix image, GetPixel will load 500x500 times whole pixels to memory). When you work on images with C# (CV stuff), work on raw bytes from memory.
I will show how to use with Lockbits in Binarization because its easy to explain.
int pixelBPP = Image.GetPixelFormatSize(resultBmp.PixelFormat) / 8;
unsafe
{
BitmapData bmpData = resultBmp.LockBits(new Rectangle(0, 0, resultBmp.Width, resultBmp.Height), ImageLockMode.ReadWrite, resultBmp.PixelFormat);
byte* ptr = (byte*)bmpData.Scan0; //addres of first line
int height = resultBmp.Height;
int width = resultBmp.Width * pixelBPP;
Parallel.For(0, height, y =>
{
byte* offset = ptr + (y * bmpData.Stride); //set row
for(int x = 0; x < width; x = x + pixelBPP)
{
byte value = (offset[x] + offset[x + 1] + offset[x + 2]) / 3 > threshold ? Byte.MaxValue : Byte.MinValue;
offset[x] = value;
offset[x + 1] = value;
offset[x + 2] = value;
if (pixelBPP == 4)
{
offset[x + 3] = 255;
}
}
});
resultBmp.UnlockBits(bmpData);
}
Now, example with Marshall.copy:
BitmapData bmpData = resultBmp.LockBits(new Rectangle(0, 0, resultBmp.Width, resultBmp.Height),
ImageLockMode.ReadWrite,
resultBmp.PixelFormat
);
int bytes = bmpData.Stride * resultBmp.Height;
byte[] pixels = new byte[bytes];
Marshal.Copy(bmpData.Scan0, pixels, 0, bytes); //loading bytes to memory
int height = resultBmp.Height;
int width = resultBmp.Width;
Parallel.For(0, height - 1, y => //seting 2s and 3s
{
int offset = y * stride; //row
for (int x = 0; x < width - 1; x++)
{
int positionOfPixel = x + offset + pixelFormat; //remember about pixel format!
//do what you want with pixel
}
}
});
Marshal.Copy(pixels, 0, bmpData.Scan0, bytes); //copying bytes to bitmap
resultBmp.UnlockBits(bmpData);
Remember, when you warking with RAW bytes very important is to remember about PixelFormat. If you work on RGBA image, you need to set up every channel. (for example offset + x + pixelFormat). I showed it in Binarization example, how to deak with RGBA image with raw data. If lockbits are not fast enough, use Marshall.Copy
I'm trying to scan 2 images (32bppArgb format), identify when there is a difference and store the difference block's bounds in a list of rectangles.
Suppose these are the images:
second:
I want to get the different rectangle bounds (the opened directory window in our case).
This is what I've done:
private unsafe List<Rectangle> CodeImage(Bitmap bmp, Bitmap bmp2)
{
List<Rectangle> rec = new List<Rectangle>();
bmData = bmp.LockBits(new System.Drawing.Rectangle(0, 0, 1920, 1080), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp.PixelFormat);
bmData2 = bmp2.LockBits(new System.Drawing.Rectangle(0, 0, 1920, 1080), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp2.PixelFormat);
IntPtr scan0 = bmData.Scan0;
IntPtr scan02 = bmData2.Scan0;
int stride = bmData.Stride;
int stride2 = bmData2.Stride;
int nWidth = bmp.Width;
int nHeight = bmp.Height;
int minX = int.MaxValue;;
int minY = int.MaxValue;
int maxX = 0;
bool found = false;
for (int y = 0; y < nHeight; y++)
{
byte* p = (byte*)scan0.ToPointer();
p += y * stride;
byte* p2 = (byte*)scan02.ToPointer();
p2 += y * stride2;
for (int x = 0; x < nWidth; x++)
{
if (p[0] != p2[0] || p[1] != p2[1] || p[2] != p2[2] || p[3] != p2[3]) //found differences-began to store positions.
{
found = true;
if (x < minX)
minX = x;
if (x > maxX)
maxX = x;
if (y < minY)
minY = y;
}
else
{
if (found)
{
int height = getBlockHeight(stride, scan0, maxX, minY, scan02, stride2);
found = false;
Rectangle temp = new Rectangle(minX, minY, maxX - minX, height);
rec.Add(temp);
//x += minX;
y += height;
minX = int.MaxValue;
minY = int.MaxValue;
maxX = 0;
}
}
p += 4;
p2 += 4;
}
}
return rec;
}
public unsafe int getBlockHeight(int stride, IntPtr scan, int x, int y1, IntPtr scan02, int stride2) //a function to get an existing block height.
{
int height = 0;;
for (int y = y1; y < 1080; y++) //only for example- in our case its 1080 height.
{
byte* p = (byte*)scan.ToPointer();
p += (y * stride) + (x * 4); //set the pointer to a specific potential point.
byte* p2 = (byte*)scan02.ToPointer();
p2 += (y * stride2) + (x * 4); //set the pointer to a specific potential point.
if (p[0] != p2[0] || p[1] != p2[1] || p[2] != p2[2] || p[3] != p2[3]) //still change on the height in the increasing **y** of the block.
height++;
}
return height;
}
This is actually how I call the method:
Bitmap a = Image.FromFile(#"C:\Users\itapi\Desktop\1.png") as Bitmap;//generates a 32bppRgba bitmap;
Bitmap b = Image.FromFile(#"C:\Users\itapi\Desktop\2.png") as Bitmap;//
List<Rectangle> l1 = CodeImage(a, b);
int i = 0;
foreach (Rectangle rec in l1)
{
i++;
Bitmap tmp = b.Clone(rec, a.PixelFormat);
tmp.Save(i.ToString() + ".png");
}
But I'm not getting the exact rectangle.. I'm getting only half of that and sometimes even worse. I think something in the code's logic is wrong.
Code for #nico
private unsafe List<Rectangle> CodeImage(Bitmap bmp, Bitmap bmp2)
{
List<Rectangle> rec = new List<Rectangle>();
var bmData1 = bmp.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp.PixelFormat);
var bmData2 = bmp2.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp2.PixelFormat);
int bytesPerPixel = 3;
IntPtr scan01 = bmData1.Scan0;
IntPtr scan02 = bmData2.Scan0;
int stride1 = bmData1.Stride;
int stride2 = bmData2.Stride;
int nWidth = bmp.Width;
int nHeight = bmp.Height;
bool[] visited = new bool[nWidth * nHeight];
byte* base1 = (byte*)scan01.ToPointer();
byte* base2 = (byte*)scan02.ToPointer();
for (int y = 0; y < nHeight; y += 5)
{
byte* p1 = base1;
byte* p2 = base2;
for (int x = 0; x < nWidth; x += 5)
{
if (!ArePixelsEqual(p1, p2, bytesPerPixel) && !(visited[x + nWidth * y]))
{
// fill the different area
int minX = x;
int maxX = x;
int minY = y;
int maxY = y;
var pt = new Point(x, y);
Stack<Point> toBeProcessed = new Stack<Point> ();
visited[x + nWidth * y] = true;
toBeProcessed.Push(pt);
while (toBeProcessed.Count > 0)
{
var process = toBeProcessed.Pop();
var ptr1 = (byte*)scan01.ToPointer() + process.Y * stride1 + process.X * bytesPerPixel;
var ptr2 = (byte*) scan02.ToPointer() + process.Y * stride2 + process.X * bytesPerPixel;
//Check pixel equality
if (ArePixelsEqual(ptr1, ptr2, bytesPerPixel))
continue;
//This pixel is different
//Update the rectangle
if (process.X < minX) minX = process.X;
if (process.X > maxX) maxX = process.X;
if (process.Y < minY) minY = process.Y;
if (process.Y > maxY) maxY = process.Y;
Point n;
int idx;
//Put neighbors in stack
if (process.X - 1 >= 0)
{
n = new Point(process.X - 1, process.Y);
idx = n.X + nWidth * n.Y;
if (!visited[idx])
{
visited[idx] = true;
toBeProcessed.Push(n);
}
}
if (process.X + 1 < nWidth)
{
n = new Point(process.X + 1, process.Y);
idx = n.X + nWidth * n.Y;
if (!visited[idx])
{
visited[idx] = true;
toBeProcessed.Push(n);
}
}
if (process.Y - 1 >= 0)
{
n = new Point(process.X, process.Y - 1);
idx = n.X + nWidth * n.Y;
if (!visited[idx])
{
visited[idx] = true;
toBeProcessed.Push(n);
}
}
if (process.Y + 1 < nHeight)
{
n = new Point(process.X, process.Y + 1);
idx = n.X + nWidth * n.Y;
if (!visited[idx])
{
visited[idx] = true;
toBeProcessed.Push(n);
}
}
}
if (((maxX - minX + 1) > 5) & ((maxY - minY + 1) > 5))
rec.Add(new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1));
}
p1 += 5 * bytesPerPixel;
p2 += 5 * bytesPerPixel;
}
base1 += 5 * stride1;
base2 += 5 * stride2;
}
bmp.UnlockBits(bmData1);
bmp2.UnlockBits(bmData2);
return rec;
}
I see a couple of problems with your code. If I understand it correctly, you
find a pixel that's different between the two images.
then you continue to scan from there to the right, until you find a position where both images are identical again.
then you scan from the last "different" pixel to the bottom, until you find a position where both images are identical again.
then you store that rectangle and start at the next line below it
Am I right so far?
Two obvious things can go wrong here:
If two rectangles have overlapping y-ranges, you're in trouble: You'll find the first rectangle fine, then skip to the bottom Y-coordinate, ignoring all the pixels left or right of the rectangle you just found.
Even if there is only one rectangle, you assume that every pixel on the rectangle's border is different, and all the other pixels are identical. If that assumption isn't valid, you'll stop searching too early, and only find parts of rectangles.
If your images come from a scanner or digital camera, or if they contain lossy compression (jpeg) artifacts, the second assumption will almost certainly be wrong. To illustrate this, here's what I get when I mark every identical pixel the two jpg images you linked black, and every different pixel white:
What you see is not a rectangle. Instead, a lot of pixels around the rectangles you're looking for are different:
That's because of jpeg compression artifacts. But even if you used lossless source images, pixels at the borders might not form perfect rectangles, because of antialiasing or because the background just happens to have a similar color in that region.
You could try to improve your algorithm, but if you look at that border, you will find all kinds of ugly counterexamples to any geometric assumptions you'll make.
It would probably be better to implement this "the right way". Meaning:
Either implement a flood fill algorithm that erases different pixels (e.g. by setting them to identical or by storing a flag in a separate mask), then recursively checks if the 4 neighbor pixels.
Or implement a connected component labeling algorithm, that marks each different pixel with a temporary integer label, using clever data structures to keep track which temporary labels are connected. If you're only interested in a bounding box, you don't even have to merge the temporary labels, just merge the bounding boxes of adjacent labeled areas.
Connected component labeling is in general a bit faster, but is a bit trickier to get right than flood fill.
One last advice: I would rethink your "no 3rd party libraries" policy if I were you. Even if your final product will contain no 3rd party libraries, development might by a lot faster if you used well-documented, well-tested, useful building blocks from a library, then replaced them one by one with your own code. (And who knows, you might even find an open source library with a suitable license that's so much faster than your own code that you'll stick with it in the end...)
ADD: In case you want to rethink your "no libraries" position: Here's a quick and simple implementation using AForge (which has a more permissive library than emgucv):
private static void ProcessImages()
{
(* load images *)
var img1 = AForge.Imaging.Image.FromFile(#"compare1.jpg");
var img2 = AForge.Imaging.Image.FromFile(#"compare2.jpg");
(* calculate absolute difference *)
var difference = new AForge.Imaging.Filters.ThresholdedDifference(15)
{OverlayImage = img1}
.Apply(img2);
(* create and initialize the blob counter *)
var bc = new AForge.Imaging.BlobCounter();
bc.FilterBlobs = true;
bc.MinWidth = 5;
bc.MinHeight = 5;
(* find blobs *)
bc.ProcessImage(difference);
(* draw result *)
BitmapData data = img2.LockBits(
new Rectangle(0, 0, img2.Width, img2.Height),
ImageLockMode.ReadWrite, img2.PixelFormat);
foreach (var rc in bc.GetObjectsRectangles())
AForge.Imaging.Drawing.FillRectangle(data, rc, Color.FromArgb(128,Color.Red));
img2.UnlockBits(data);
img2.Save(#"compareResult.jpg");
}
The actual difference + blob detection part (without loading and result display) takes about 43ms, for the second run (this first time takes longer of course, due to JITting, cache, etc.)
Result (the rectangle is larger due to jpeg artifacts):
Here is a flood-fill based version of your code. It checks every pixel for difference. If it finds a different pixel, it runs an exploration to find the entire different area.
The code is only meant as an illustration. There are certainly some points that could be improved.
unsafe bool ArePixelsEqual(byte* p1, byte* p2, int bytesPerPixel)
{
for (int i = 0; i < bytesPerPixel; ++i)
if (p1[i] != p2[i])
return false;
return true;
}
private static unsafe List<Rectangle> CodeImage(Bitmap bmp, Bitmap bmp2)
{
if (bmp.PixelFormat != bmp2.PixelFormat || bmp.Width != bmp2.Width || bmp.Height != bmp2.Height)
throw new ArgumentException();
List<Rectangle> rec = new List<Rectangle>();
var bmData1 = bmp.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp.PixelFormat);
var bmData2 = bmp2.LockBits(new System.Drawing.Rectangle(0, 0, bmp.Width, bmp.Height), System.Drawing.Imaging.ImageLockMode.ReadOnly, bmp2.PixelFormat);
int bytesPerPixel = Image.GetPixelFormatSize(bmp.PixelFormat) / 8;
IntPtr scan01 = bmData1.Scan0;
IntPtr scan02 = bmData2.Scan0;
int stride1 = bmData1.Stride;
int stride2 = bmData2.Stride;
int nWidth = bmp.Width;
int nHeight = bmp.Height;
bool[] visited = new bool[nWidth * nHeight];
byte* base1 = (byte*)scan01.ToPointer();
byte* base2 = (byte*)scan02.ToPointer();
for (int y = 0; y < nHeight; y++)
{
byte* p1 = base1;
byte* p2 = base2;
for (int x = 0; x < nWidth; ++x)
{
if (!ArePixelsEqual(p1, p2, bytesPerPixel) && !(visited[x + nWidth * y]))
{
// fill the different area
int minX = x;
int maxX = x;
int minY = y;
int maxY = y;
var pt = new Point(x, y);
Stack<Point> toBeProcessed = new Stack<Point>();
visited[x + nWidth * y] = true;
toBeProcessed.Push(pt);
while (toBeProcessed.Count > 0)
{
var process = toBeProcessed.Pop();
var ptr1 = (byte*)scan01.ToPointer() + process.Y * stride1 + process.X * bytesPerPixel;
var ptr2 = (byte*)scan02.ToPointer() + process.Y * stride2 + process.X * bytesPerPixel;
//Check pixel equality
if (ArePixelsEqual(ptr1, ptr2, bytesPerPixel))
continue;
//This pixel is different
//Update the rectangle
if (process.X < minX) minX = process.X;
if (process.X > maxX) maxX = process.X;
if (process.Y < minY) minY = process.Y;
if (process.Y > maxY) maxY = process.Y;
Point n; int idx;
//Put neighbors in stack
if (process.X - 1 >= 0)
{
n = new Point(process.X - 1, process.Y); idx = n.X + nWidth * n.Y;
if (!visited[idx]) { visited[idx] = true; toBeProcessed.Push(n); }
}
if (process.X + 1 < nWidth)
{
n = new Point(process.X + 1, process.Y); idx = n.X + nWidth * n.Y;
if (!visited[idx]) { visited[idx] = true; toBeProcessed.Push(n); }
}
if (process.Y - 1 >= 0)
{
n = new Point(process.X, process.Y - 1); idx = n.X + nWidth * n.Y;
if (!visited[idx]) { visited[idx] = true; toBeProcessed.Push(n); }
}
if (process.Y + 1 < nHeight)
{
n = new Point(process.X, process.Y + 1); idx = n.X + nWidth * n.Y;
if (!visited[idx]) { visited[idx] = true; toBeProcessed.Push(n); }
}
}
rec.Add(new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1));
}
p1 += bytesPerPixel;
p2 += bytesPerPixel;
}
base1 += stride1;
base2 += stride2;
}
bmp.UnlockBits(bmData1);
bmp2.UnlockBits(bmData2);
return rec;
}
You can achieve this easily using a flood fill segmentation algorithm.
First an utility class to make fast bitmap access easier. This will help to encapsulate the complex pointer-logic and make the code more readable:
class BitmapWithAccess
{
public Bitmap Bitmap { get; private set; }
public System.Drawing.Imaging.BitmapData BitmapData { get; private set; }
public BitmapWithAccess(Bitmap bitmap, System.Drawing.Imaging.ImageLockMode lockMode)
{
Bitmap = bitmap;
BitmapData = bitmap.LockBits(new Rectangle(Point.Empty, bitmap.Size), lockMode, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
}
public Color GetPixel(int x, int y)
{
unsafe
{
byte* dataPointer = MovePointer((byte*)BitmapData.Scan0, x, y);
return Color.FromArgb(dataPointer[3], dataPointer[2], dataPointer[1], dataPointer[0]);
}
}
public void SetPixel(int x, int y, Color color)
{
unsafe
{
byte* dataPointer = MovePointer((byte*)BitmapData.Scan0, x, y);
dataPointer[3] = color.A;
dataPointer[2] = color.R;
dataPointer[1] = color.G;
dataPointer[0] = color.B;
}
}
public void Release()
{
Bitmap.UnlockBits(BitmapData);
BitmapData = null;
}
private unsafe byte* MovePointer(byte* pointer, int x, int y)
{
return pointer + x * 4 + y * BitmapData.Stride;
}
}
Then a class representing a rectangle containing different pixels, to mark them in the resulting image. In general this class can also contain a list of Point instances (or a byte[,] map) to make indicating individual pixels in the resulting image possible:
class Segment
{
public int Left { get; set; }
public int Top { get; set; }
public int Right { get; set; }
public int Bottom { get; set; }
public Bitmap Bitmap { get; set; }
public Segment()
{
Left = int.MaxValue;
Right = int.MinValue;
Top = int.MaxValue;
Bottom = int.MinValue;
}
};
Then the steps of a simple algorithm are as follows:
find different pixels
use a flood-fill algorithm to find segments on the difference image
draw bounding rectangles for the segments found
The first step is the easiest one:
static Bitmap FindDifferentPixels(Bitmap i1, Bitmap i2)
{
var result = new Bitmap(i1.Width, i2.Height, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
var ia1 = new BitmapWithAccess(i1, System.Drawing.Imaging.ImageLockMode.ReadOnly);
var ia2 = new BitmapWithAccess(i2, System.Drawing.Imaging.ImageLockMode.ReadOnly);
var ra = new BitmapWithAccess(result, System.Drawing.Imaging.ImageLockMode.ReadWrite);
for (int x = 0; x < i1.Width; ++x)
for (int y = 0; y < i1.Height; ++y)
{
var different = ia1.GetPixel(x, y) != ia2.GetPixel(x, y);
ra.SetPixel(x, y, different ? Color.White : Color.FromArgb(0, 0, 0, 0));
}
ia1.Release();
ia2.Release();
ra.Release();
return result;
}
And the second and the third steps are covered with the following three functions:
static List<Segment> Segmentize(Bitmap blackAndWhite)
{
var bawa = new BitmapWithAccess(blackAndWhite, System.Drawing.Imaging.ImageLockMode.ReadOnly);
var result = new List<Segment>();
HashSet<Point> queue = new HashSet<Point>();
bool[,] visitedPoints = new bool[blackAndWhite.Width, blackAndWhite.Height];
for (int x = 0;x < blackAndWhite.Width;++x)
for (int y = 0;y < blackAndWhite.Height;++y)
{
if (bawa.GetPixel(x, y).A != 0
&& !visitedPoints[x, y])
{
result.Add(BuildSegment(new Point(x, y), bawa, visitedPoints));
}
}
bawa.Release();
return result;
}
static Segment BuildSegment(Point startingPoint, BitmapWithAccess bawa, bool[,] visitedPoints)
{
var result = new Segment();
List<Point> toProcess = new List<Point>();
toProcess.Add(startingPoint);
while (toProcess.Count > 0)
{
Point p = toProcess.First();
toProcess.RemoveAt(0);
ProcessPoint(result, p, bawa, toProcess, visitedPoints);
}
return result;
}
static void ProcessPoint(Segment segment, Point point, BitmapWithAccess bawa, List<Point> toProcess, bool[,] visitedPoints)
{
for (int i = -1; i <= 1; ++i)
{
for (int j = -1; j <= 1; ++j)
{
int x = point.X + i;
int y = point.Y + j;
if (x < 0 || y < 0 || x >= bawa.Bitmap.Width || y >= bawa.Bitmap.Height)
continue;
if (bawa.GetPixel(x, y).A != 0 && !visitedPoints[x, y])
{
segment.Left = Math.Min(segment.Left, x);
segment.Right = Math.Max(segment.Right, x);
segment.Top = Math.Min(segment.Top, y);
segment.Bottom = Math.Max(segment.Bottom, y);
toProcess.Add(new Point(x, y));
visitedPoints[x, y] = true;
}
}
}
}
And the following program given your two images as arguments:
static void Main(string[] args)
{
Image ai1 = Image.FromFile(args[0]);
Image ai2 = Image.FromFile(args[1]);
Bitmap i1 = new Bitmap(ai1.Width, ai1.Height, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
Bitmap i2 = new Bitmap(ai2.Width, ai2.Height, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
using (var g1 = Graphics.FromImage(i1))
using (var g2 = Graphics.FromImage(i2))
{
g1.DrawImage(ai1, Point.Empty);
g2.DrawImage(ai2, Point.Empty);
}
var difference = FindDifferentPixels(i1, i2);
var segments = Segmentize(difference);
using (var g1 = Graphics.FromImage(i1))
{
foreach (var segment in segments)
{
g1.DrawRectangle(Pens.Red, new Rectangle(segment.Left, segment.Top, segment.Right - segment.Left, segment.Bottom - segment.Top));
}
}
i1.Save("result.png");
Console.WriteLine("Done.");
Console.ReadKey();
}
produces the following result:
As you can see there are more differences between the given images. You can filter the resulting segments with regard to their size for example to drop the small artefacts. Also there is of course much work to do in terms of error checking, design and performance.
One idea is to proceed as follows:
1) Rescale images to a smaller size (downsample)
2) Run the above algorithm on smaller images
3) Run the above algorithm on original images, but restricting yourself only to rectangles found in step 2)
This can be of course extended to a multi-level hierarchical approach (using more different image sizes, increasing accuracy with each step).
Ah an algorithm challenge. Like! :-)
There are other answers here using f.ex. floodfill that will work just fine. I just noticed that you wanted something fast, so let me propose a different idea. Unlike the other people, I haven't tested it; it shouldn't be too hard and should be quite fast, but I simply don't have the time at the moment to test it myself. If you do, please share the results. Also, note that it's not a standard algorithm, so there are probably some bugs here and there in my explanation (and no patents).
My idea is derived from the idea of mean adaptive thresholding but with a lot of important differences. I cannot find the link from wikipedia anymore or my code, so I'll do this from the top of my mind. Basically you create a new (64-bit) buffer for both images and fill it with:
f(x,y) = colorvalue + f(x-1, y) + f(x, y-1) - f(x-1, y-1)
f(x,0) = colorvalue + f(x-1, 0)
f(0,y) = colorvalue + f(0, y-1)
The main trick is that you can calculate the sum value of a portion of the image fast, namely by:
g(x1,y1,x2,y2) = f(x2,y2)-f(x1-1,y2)-f(x2,y1-1)+f(x1-1,y1-1)
In other words, this will give the same result as:
result = 0;
for (x=x1; x<=x2; ++x)
for (y=y1; y<=y2; ++y)
result += f(x,y)
In our case this means that with only 4 integer operations this will get you some unique number of the block in question. I'd say that's pretty awesome.
Now, in our case, we don't really care about the average value; we just care about some sort-of unique number. If the image changes, it should change - simple as that. As for colorvalue, usually some gray scale number is used for thresholding - instead, we'll be using the complete 24-bit RGB value. Because there are only so few compares, we can simply scan until we find a block that doesn't match.
The basic algorithm that I propose works as follows:
for (y=0; y<height;++y)
for (x=0; x<width; ++x)
if (src[x,y] != dst[x,y])
if (!IntersectsWith(x, y, foundBlocks))
FindBlock(foundBlocks);
Now, IntersectsWith can be something like a quad tree of if there are only a few blocks, you can simply iterate through the blocks and check if they are within the bounds of the block. You can also update the x variable accordingly (I would). You can even balance things by re-building the buffer for f(x,y) if you have too many blocks (more precise: merge found blocks back from dst into src, then rebuild the buffer).
FindBlocks is where it gets interesting. Using the formula for g that's now pretty easy:
int x1 = x-1; int y1 = y-1; int x2 = x; int y2 = y;
while (changes)
{
while (g(srcimage,x1-1,y1,x1,y2) == g(dstimage,x1-1,y1,x1,y2)) { --x1; }
while (g(srcimage,x1,y1-1,x1,y2) == g(dstimage,x1,y1-1,x1,y2)) { --y1; }
while (g(srcimage,x1,y1,x1+1,y2) == g(dstimage,x1,y1,x1+1,y2)) { ++x1; }
while (g(srcimage,x1,y1,x1,y2+1) == g(dstimage,x1,y1,x1,y2+1)) { ++y1; }
}
That's it. Note that the complexity of the FindBlocks algorithm is O(x + y), which is pretty awesome for finding a 2D block IMO. :-)
As I said, let me know how it turns out.
Emgu.CV (Nuget package 2.4.2) doesn't as far as I can tell implement the gpu::alphaComp method available in OpenCV.
As such when trying to implement this specific type of composite it is incredible slow in C#, such that it takes up some 80% of the total cpu usage of my app.
This was my original solution that performs really badly.
static public Image<Bgra, Byte> Overlay( Image<Bgra, Byte> image1, Image<Bgra, Byte> image2 )
{
Image<Bgra, Byte> result = image1.Copy();
Image<Bgra, Byte> src = image2;
Image<Bgra, Byte> dst = image1;
int rows = result.Rows;
int cols = result.Cols;
for (int y = 0; y < rows; ++y)
{
for (int x = 0; x < cols; ++x)
{
// http://en.wikipedia.org/wiki/Alpha_compositing
double srcA = 1.0/255 * src.Data[y, x, 3];
double dstA = 1.0/255 * dst.Data[y, x, 3];
double outA = (srcA + (dstA - dstA * srcA));
result.Data[y, x, 0] = (Byte)(((src.Data[y, x, 0] * srcA) + (dst.Data[y, x, 0] * (1 - srcA))) / outA); // Blue
result.Data[y, x, 1] = (Byte)(((src.Data[y, x, 1] * srcA) + (dst.Data[y, x, 1] * (1 - srcA))) / outA); // Green
result.Data[y, x, 2] = (Byte)(((src.Data[y, x, 2] * srcA) + (dst.Data[y, x, 2] * (1 - srcA))) / outA); // Red
result.Data[y, x, 3] = (Byte)(outA*255);
}
}
return result;
}
Is there a way to optimise the above in C#?
I've additionally looked at using OpencvSharp, but that doesn't appear to provide access to gpu::alphaComp neither.
Is there any OpenCV C# wrapper library that can do alpha Compositing?
AddWeighted does not do what I need it to do.
Whilst similar, this question doesn't provide an answer
So so simple.
public static Image<Bgra, Byte> Overlay(Image<Bgra, Byte> target, Image<Bgra, Byte> overlay)
{
Bitmap bmp = target.Bitmap;
Graphics gra = Graphics.FromImage(bmp);
gra.CompositingMode = System.Drawing.Drawing2D.CompositingMode.SourceOver;
gra.DrawImage(overlay.Bitmap, new Point(0, 0));
return target;
}
I have this code,
copy/paste in a new winform app and this will write a file on your desktop if you run it: test123abcd.png
Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
Dim SquareSize = 5
Dim GridX = 2500
Dim GridY = 2500
Dim SquareCount = GridX * GridY - 1
Dim sw As New Stopwatch
Dim Rect(4) As Rectangle
Rect(0) = New Rectangle(0, 3, 3, 1)
Rect(1) = New Rectangle(3, 0, 1, 3)
Rect(2) = New Rectangle(3, 3, 3, 1)
Rect(3) = New Rectangle(0, 0, 1, 3)
Dim fullsw = Stopwatch.StartNew
Using board = New Bitmap(SquareSize * (GridX + 1), SquareSize * (GridY + 1), Imaging.PixelFormat.Format32bppPArgb)
Using graph = Graphics.FromImage(board)
Using _board = New Bitmap(SquareSize, SquareSize, Imaging.PixelFormat.Format32bppPArgb)
Using g As Graphics = Graphics.FromImage(_board)
For i = 0 To SquareCount
g.Clear(If((i And 1) = 1, Color.Red, Color.Blue))
g.FillRectangles(Brushes.White, Rect)
sw.Start()
graph.DrawImageUnscaled(_board, ((i Mod GridX) * SquareSize), ((i \ GridY) * SquareSize))
sw.Stop()
Next
End Using
End Using
End Using
fullsw.Stop()
board.Save(Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory) & "\test123abcd.png", Imaging.ImageFormat.Png)
End Using
MessageBox.Show("Full SW: " & fullsw.ElapsedMilliseconds & Environment.NewLine &
"DrawImageUnscaled SW: " & sw.ElapsedMilliseconds)
End Sub
about 40% to 45% of the time spent is on DrawImageUnscaled, about 23 seconds on my current computer while the whole thing take about 50 seconds
is there a way to speed up DrawImageUnscaled? (and maybe the whole thing?)
EDIT - question in vb.net, answer in c#
By assuming that the generation part (g.FillRectangles(Brushes.White, Rect), pretty time-consuming too) cannot be avoided, the best thing you can do is avoiding a second graph-generation process (also for board) and just copying the information from _board. Copying is much quicker than a new generation (as shown below), but you have the problem that the source information (_board) do not match the destination format (board by relying on .SetPixel) and thus you will have to create a function determining the current pixel (X/Y point) from the provided information (current rectangle).
Below you can see a simple code showing the time requirement differences between both approaches:
Dim SquareSize As Integer = 5
Dim _board As Bitmap = Bitmap.FromFile("in.png")
Dim board As Bitmap = New Bitmap(_board.Width * SquareSize, _board.Height * SquareSize)
For x As Integer = 0 To _board.Width - 1
For y As Integer = 0 To _board.Height - 1
board.SetPixel(x * SquareSize, y * SquareSize, _board.GetPixel(x, y))
Next
Next
board.Save("out1.png", Imaging.ImageFormat.Png)
board = New Bitmap(_board.Width, _board.Height)
Using board
Using graph = Graphics.FromImage(board)
Using _board
Using g As Graphics = Graphics.FromImage(_board)
For x As Integer = 0 To _board.Width - 1
For y As Integer = 0 To _board.Height - 1
graph.DrawImageUnscaled(_board, x, y)
Next
Next
End Using
End Using
End Using
board.Save("out2.png", Imaging.ImageFormat.Png)
End Using
Bear in mind that it is not a "properly-working code". Its whole point is showing how to copy pixels between bitmaps (by multiplying by a factor, just to get different outputs than inputs); and putting the DrawImageUnscaled method under equivalent conditions (although the output picture is, logically, different) to get a good feeling of the differences in time requirements between both methodologies.
As said via comment, this is all what I can do under the current conditions. I hope that will be enough to help you find the best solution.
wow I like unsafe code when it is worthed, solved my problem with c# in the end
here is the code, which is about 70x faster to the code in my question
using System;
using System.Drawing;
using System.Drawing.Imaging;
namespace BmpFile
{
public class BmpTest
{
private const int PixelSize = 4;
public static long Test(int GridX, int GridY, int SquareSize, Rectangle[][] Rect)
{
Bitmap bmp = new Bitmap(GridX * SquareSize, GridY * SquareSize, PixelFormat.Format32bppArgb);
BitmapData bmd = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height),
System.Drawing.Imaging.ImageLockMode.ReadWrite,
bmp.PixelFormat);
int Stride = bmd.Stride;
int Height = bmd.Height;
int Width = bmd.Width;
int RectFirst = Rect.GetUpperBound(0);
int RectSecond;
int Offset1, Offset2, Offset3;
int i, j, k, l, w, h;
int FullRow = SquareSize * Stride;
int FullSquare = SquareSize * PixelSize;
var sw = System.Diagnostics.Stopwatch.StartNew();
unsafe
{
byte* row = (byte*)bmd.Scan0;
//draw all rectangles
for (i = 0; i <= RectFirst; ++i)
{
Offset1 = ((i / GridX) * FullRow) + ((i % GridX) * FullSquare) + 3;
RectSecond = Rect[i].GetUpperBound(0);
for (j = 0; j <= RectSecond; ++j)
{
Offset2 = Rect[i][j].X * PixelSize + Rect[i][j].Y * Stride;
w=Rect[i][j].Width;
h=Rect[i][j].Height;
for (k = 0; k <= w; ++k)
{
Offset3 = k * PixelSize;
for (l = 0; l <= h; ++l)
{
row[Offset1 + Offset2 + Offset3 + (l * Stride)] = 255;
}
}
}
}
//invert color
for (int y = 0; y < Height; y++)
{
Offset1 = (y * Stride) + 3;
for (int x = 0; x < Width; x++)
{
if (row[Offset1 + x * PixelSize] == 255)
{
row[Offset1 + x * PixelSize] = 0;
}
else
{
row[Offset1 + x * PixelSize] = 255;
}
}
}
}
sw.Stop();
bmp.UnlockBits(bmd);
bmp.Save(Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + #"\test.png", ImageFormat.Png);
bmp.Dispose();
return sw.ElapsedMilliseconds;
}
}
}