Why is rotation much faster on Image than using BitmapEncoder? - c#

Rotating an Image using
image.RenderTransform = new RotateTransform()...
is almost immediate.
On the other hand, using
bitmapEncoder.BitmapTransform.Rotation = BitmapRotation.Clockwise90Degrees...
is much slower (in the FlushAsync()) - more than half a second.
Why is that? And is there a way to harness the fast rotation in order to rotate bitmaps?

The first approach, image.RenderTransform, is hardware-accelerated: the GPU applies the transform when the image is drawn. The bitmap itself is not rotated; it is only displayed rotated or scaled, and only the visible pixels are touched, directly in video memory.
The second approach rotates the bitmap itself on the CPU, pixel by pixel, and allocates new (non-video) memory for the result.
update:
Is there a way to use the GPU to edit bitmaps?
Depends on what you need:
If you want to use the GPU, you could use a managed wrapper (like SlimDX/SharpDX). It will take a lot of time to get results this way. Don't forget that re-rasterizing images on the GPU can lead to quality loss.
If you only want to rotate images by right angles (0, 90, 180, 270), you could use the Bitmap class with LockBits and the Scan0 pointer. This preserves quality and size, and you can create a fast implementation.
Look here:
Fast work with Bitmaps in C#
I would create an algorithm for each angle (0, 90, 180, 270), because you don't want to calculate the x, y position for each pixel. Something like below.
Tip:
Try to get rid of the multiplications and divisions in the inner loop.
/*This time we convert the IntPtr to a ptr*/
byte* scan0 = (byte*)bData.Scan0.ToPointer();
for (int i = 0; i < bData.Height; ++i)
{
for (int j = 0; j < bData.Width; ++j)
{
byte* data = scan0 + i * bData.Stride + j * bitsPerPixel / 8;
//data is a pointer to the first byte of the 3-byte color data
}
}
Becomes something like:
/*This time we convert the IntPtr to a ptr*/
byte* scan0 = (byte*)bData.Scan0.ToPointer();
byte* data = scan0;
int bytesPerPixel = bitsPerPixel / 8;
for (int i = 0; i < bData.Height; ++i)
{
byte* data2 = data;
for (int j = 0; j < bData.Width; ++j)
{
//data2 is a pointer to the first byte of the 3-byte color data
data2 += bytesPerPixel;
}
data += bData.Stride;
}
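As a rough illustration of the "one algorithm per angle" idea, here is a minimal sketch of a 90-degree clockwise rotation that only uses additions in the inner loop. It works on a plain byte[] rather than on Scan0 so that it needs no unsafe code; the class name RotateSketch and the method Rotate90Clockwise are made up for this example, and the output is assumed to be tightly packed (no row padding).

```csharp
using System;

public static class RotateSketch
{
    // Rotates a pixel buffer 90 degrees clockwise.
    // src has `height` rows that are `stride` bytes apart; the result is
    // tightly packed with `width` rows of `height` pixels each.
    public static byte[] Rotate90Clockwise(byte[] src, int width, int height,
                                           int stride, int bytesPerPixel)
    {
        int dstStride = height * bytesPerPixel;
        byte[] dst = new byte[dstStride * width];

        int srcRow = 0;
        // Source row y ends up in destination column (height - 1 - y).
        int dstColumn = (height - 1) * bytesPerPixel;
        for (int y = 0; y < height; ++y)
        {
            int s = srcRow;
            int d = dstColumn;
            for (int x = 0; x < width; ++x)
            {
                for (int b = 0; b < bytesPerPixel; ++b)
                    dst[d + b] = src[s + b];
                s += bytesPerPixel; // next source pixel in the same row
                d += dstStride;     // next destination row, same column
            }
            srcRow += stride;
            dstColumn -= bytesPerPixel;
        }
        return dst;
    }
}
```

The same walking-offset pattern should carry over to the pointer version above: replace the array indices with byte* pointers obtained from Scan0 and use the locked bitmap's Stride for the source rows.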

Related

How to create and write an image from a 2D array?

I have a Color[,] array (2d array of Drawing.Color) in C#. How can I save that locally as a PNG?
Using only .Net official packages provided in the build, no additional library or Nuget packages.
The simple way
First create a blank Bitmap instance with the desired dimensions:
Bitmap bmp = new Bitmap(100, 100);
Loop through your colors array and plot pixels:
for (int i = 0; i < 100; i++)
{
for (int j = 0; j < 100; j++)
{
bmp.SetPixel(i, j, colors[i,j]);
}
}
Finally, save your bitmap to file:
bmp.Save("myfile.png", ImageFormat.Png);
The faster way
The Bitmap.SetPixel method is slow. A much faster way of accessing a bitmap's pixels is by writing directly to an array of 32-bit values (assuming you're shooting for a 32-bit PNG), and have the Bitmap class use that array as its backing.
One way to do this is by creating said array, and getting a GCHandle on it to prevent it from being garbage-collected. The Bitmap class offers a constructor that allows you to create an instance from an array pointer, a pixel format, and a stride (the number of bytes that make up a single row of the pixel data):
public Bitmap (int width, int height, int stride,
System.Drawing.Imaging.PixelFormat format, IntPtr scan0);
This is how you would create a backing array, a handle, and a Bitmap instance:
Int32[] bits = new Int32[width * height];
GCHandle handle = GCHandle.Alloc(bits, GCHandleType.Pinned);
Bitmap bmp = new Bitmap(width, height, width * 4,
PixelFormat.Format32bppPArgb, handle.AddrOfPinnedObject());
Note that:
The backing array has 32-bit entries, since we are working with a 32-bit pixel format
The Bitmap stride is width*4, which is the number of bytes a single row of pixels takes up (4 bytes per pixel)
With this, you can now write pixel values directly into the backing array, and they'll be reflected in the Bitmap. This is much faster than using Bitmap.SetPixel. Here is a code sample, which assumes that you've wrapped up everything in a class that knows how wide and tall the bitmap is:
public void SetPixelValue(int x, int y, int color)
{
// Out of bounds?
if (x < 0 || x >= Width || y < 0 || y >= Height) return;
int index = x + (y * Width);
Bits[index] = color;
}
Please note that color is an int value, not a Color value. If you have an array of Color values, you'll have to convert each to an int first, e.g.
public void SetPixelColor(int x, int y, Color color)
{
SetPixelValue(x, y, color.ToArgb());
}
This conversion will take time, so it's better to work with int values all the way. You can make this faster still by forgoing the x/y bounds check, if you're sure you're never using out-of-bounds coordinates:
public void SetPixelValueUnchecked(int x, int y, int color)
{
// No out of bounds checking.
int index = x + (y * Width);
Bits[index] = color;
}
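The snippets above assume a wrapper class but never show it, and a pinned GCHandle stays pinned until it is explicitly freed. Below is a minimal sketch of such a wrapper, assuming System.Drawing is available; the class name PinnedBitmap and its members are hypothetical and only mirror the Width, Height and Bits names used above.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

// Hypothetical wrapper: owns the backing array, the pin, and the Bitmap.
public sealed class PinnedBitmap : IDisposable
{
    private GCHandle handle;
    private bool disposed;

    public int Width { get; }
    public int Height { get; }
    public int[] Bits { get; }
    public Bitmap Bitmap { get; }

    public PinnedBitmap(int width, int height)
    {
        Width = width;
        Height = height;
        Bits = new int[width * height];
        // Pin the array so the GC cannot move it while the Bitmap points at it.
        handle = GCHandle.Alloc(Bits, GCHandleType.Pinned);
        Bitmap = new Bitmap(width, height, width * 4,
            PixelFormat.Format32bppPArgb, handle.AddrOfPinnedObject());
    }

    public void Dispose()
    {
        if (disposed) return;
        disposed = true;
        Bitmap.Dispose(); // release the Bitmap before the memory it uses
        handle.Free();    // unpin the array so the GC can reclaim it
    }
}
```

Wrapping instances in a using block keeps the pin short-lived. One thing to keep in mind: Format32bppPArgb expects premultiplied alpha, so values from Color.ToArgb() only match exactly for fully opaque colors.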
A caveat is in order here. If you wrap Bitmap this way, you'll still be able to use Graphics to draw things like lines, rectangles, circles etc. by accessing the Bitmap instance directly, but without the speed gain of going through the pinned array. If you want these primitives to be drawn more quickly as well, you'll have to provide your own line/circle implementations. Note that in my experience, your own Bresenham line routine will hardly outperform GDI's built-in one, so it may not be worth it.
An even faster way
Things could be faster still if you're able to set multiple pixels in one go. This would apply if you have a horizontal sequence of pixels with the same color value. The fastest way I've found of setting sequences in an array is using Buffer.BlockCopy. (See here for a discussion). Here is an implementation:
/// <summary>
/// Set a sequential stretch of integers in the bitmap to a specified value.
/// This is done using a Buffer.BlockCopy that duplicates its size on each
/// pass for speed.
/// </summary>
/// <param name="value">Fill value</param>
/// <param name="startIndex">Fill start index</param>
/// <param name="count">Number of ints to fill</param>
private void FillUsingBlockCopy(Int32 value, int startIndex, int count)
{
int numBytesInItem = 4;
int block = 32, index = startIndex;
int endIndex = startIndex + Math.Min(block, count);
while (index < endIndex) // Fill the initial block
Bits[index++] = value;
endIndex = startIndex + count;
for (; index < endIndex; index += block, block *= 2)
{
int actualBlockSize = Math.Min(block, endIndex - index);
Buffer.BlockCopy(Bits, startIndex * numBytesInItem, Bits, index * numBytesInItem, actualBlockSize * numBytesInItem);
}
}
This would be particularly useful when you need a fast way to clear the bitmap, fill a rectangle or a triangle using horizontal lines (for example after triangle rasterization).
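To illustrate the rectangle case, here is a small sketch that fills a rectangle as a series of horizontal runs in a row-major int[] backing array. It uses Array.Fill (available on .NET Core 2.0 and later) for each run instead of the BlockCopy routine above, purely to keep the example short; the class name FillSketch is made up, and the rectangle is assumed to lie fully inside the bitmap.

```csharp
using System;

public static class FillSketch
{
    // Fills the rectangle [x, x + w) by [y, y + h) with one color value.
    // bits is row-major with bitmapWidth entries per row; no clipping is done.
    public static void FillRectangle(int[] bits, int bitmapWidth,
                                     int x, int y, int w, int h, int color)
    {
        int rowStart = y * bitmapWidth + x; // index of the first pixel of the run
        for (int row = 0; row < h; ++row)
        {
            Array.Fill(bits, color, rowStart, w); // one horizontal run
            rowStart += bitmapWidth;              // same column, next row
        }
    }
}
```

On older frameworks without Array.Fill, each run could be filled with the FillUsingBlockCopy method shown above.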
// Color 2D Array
var imgColors = new Color[128, 128];
// Get image width and height from the color array.
// The array is indexed as [x, y] below, so dimension 0 is the width.
int imageW = imgColors.GetLength(0);
int imageH = imgColors.GetLength(1);
// Create Image Instance
Bitmap img = new Bitmap(imageW, imageH);
// Fill Colors on Our Image
for (int x = 0; x < img.Width; ++x)
{
for (int y = 0; y < img.Height; ++y)
{
img.SetPixel(x, y, imgColors[x, y]);
}
}
// Just Save it
img.Save("image.png", ImageFormat.Png);

Why AccessViolationException occurs when accessing pixels in WriteableBitmap Image?

I have a video stream from a camera to an Image in a WPF application. I am trying to access the WriteableBitmap image pixel by pixel before displaying it. As a test I am trying to set the whole image to white or black. In both cases, however, I get an AccessViolationException.
I checked other posts and it seems that this error is very broad and not specific to my case. I can't work out why this isn't working.
So what is the best way to manipulate the pixels in my case? Or why is this not working? Any help is appreciated.
private async void UpdateMasterCameraPreview(IdsFrame frame)
{
if (masterImage != null)
frame.Image.CopyTo(masterImage);
else
masterImage = frame.Image.ToWriteableBitmap();
//BitmapImage temp = ConvertWriteableBitmapToBitmapImage(masterImage);
WriteableBitmap temp = masterImage;
// Here I get the exception, on every pixel access
for (int y = 0; y < temp.PixelHeight; y++)
for (int x = 0; x < temp.PixelWidth; x++)
temp.SetPixel(x, y, 255);
masterImage = temp;
masterImage.Lock();
masterImage.AddDirtyRect(new Int32Rect(0, 0, masterImage.PixelWidth, masterImage.PixelHeight));
masterImage.Unlock();
if (OnMasterFrameCaptured != null)
OnMasterFrameCaptured(this, new CameraFrameCapturedArgs(CameraType.Master, masterImage));
}
You have swapped X and Y: i represents the height and j the width, so you should call SetPixel like this:
temp.SetPixel(j, i, 255);
In cases like this it is better to use meaningful variable names, like x and y.
I ended up using the answer of this post. I can now edit the raw pixel data of any WriteableBitmap image before sending it to the image control in WPF. Below is exactly what I used; here I just make each frame partially transparent under a condition:
public void ConvertImage(ref WriteableBitmap Wbmp)
{
int width = Wbmp.PixelWidth;
int height = Wbmp.PixelHeight;
int stride = Wbmp.BackBufferStride;
int bytesPerPixel = (Wbmp.Format.BitsPerPixel + 7) / 8;
unsafe
{
byte* pImgData = (byte*)Wbmp.BackBuffer;
// set alpha to 50% for any pixel with red < 0x44 and invert the others
int cRowStart = 0;
int cColStart = 0;
for (int row = 0; row < height; row++)
{
cColStart = cRowStart;
for (int col = 0; col < width; col++)
{
byte* bPixel = pImgData + cColStart;
UInt32* iPixel = (UInt32*)bPixel;
if (bPixel[2 /* bgRa */] < 0x44)
{
// set to 50% transparent
bPixel[3 /* bgrA */] = 0x7f;
}
else
{
// invert but maintain alpha
*iPixel = *iPixel ^ 0x00ffffff;
}
cColStart += bytesPerPixel;
}
cRowStart += stride;
}
}
}
And the routine of using it is like this:
masterImage.Lock();
ConvertImage(ref masterImage);
masterImage.AddDirtyRect(new Int32Rect(0, 0, masterImage.PixelWidth, masterImage.PixelHeight));
masterImage.Unlock();

Creating 16-bit+ grayscale images in WPF

I want to create a 16-bit grayscale image from data values in my WPF program. Currently I have been looking at using a WriteableBitmap with PixelFormats.Gray16 set.
However I can't get this to work, and a Microsoft page (http://msdn.microsoft.com/en-us/magazine/cc534995.aspx) lists the Gray16 format as not writeable via the WriteableBitmap, but does not suggest how else to make one in this way.
Currently my code operates within a loop, where i represents the image height and j the width, and looks something like this:
short dataValue = GetDataSamplePixelValue(myDataValue);
//the pixel to fill with this data value:
int pixelOffset = ((i * imageWidth) + j) * bytesPerPixel;
//set the pixel colour values:
pixels[pixelOffset] = dataValue;
I do get an image with this but it is just a bunch of vertical black and white lines. I don't have a problem if using just 8-bit grayscale data (in which case in the above example short is changed to byte).
Does anyone know how to create a 16-bit per pixel or higher grayscale image using WPF? This image will ultimately need to be saved as well.
Any advice is much appreciated.
EDIT
Further to this I have done some editing and am now getting a sensible image using the Gray16 PixelFormat. It's very difficult for me to tell if it is actually 16-bit though, as a colour count by an image program gives 256, and I am not sure if this is because the image is being constrained by WPF, or perhaps the image program does not support it as apparently many image programs ignore the lower 8-bits. For now I will stick with what I have.
For information the code is like this:
myBitmap = new WriteableBitmap((int)visualRect.Width, (int)visualRect.Height, 96, 96, PixelFormats.Gray16, null);
int bytesPerPixel = myBitmap.Format.BitsPerPixel / 8;
ushort[] pixels = new ushort[(int)myBitmap.PixelWidth * (int)myBitmap.PixelHeight];
//if there is a shift factor, set the background colour to white:
if (shiftFactor > 0)
{
for (int i = 0; i < pixels.Length; i++)
{
pixels[i] = 255;
}
}
//the area to be drawn to:
Int32Rect drawRegionRect = new Int32Rect(0, 0, (int)myBitmap.PixelWidth, (int)myBitmap.PixelHeight);
//the number of samples available at this line (reduced by one so that the picked sample can't lie beyond the array):
double availableSamples = myDataFile.DataSamples.Length - 1;
for (int i = 0; i < numDataLinesOnDisplay; i++)
{
//the current line to use:
int currentLine = ((numDataLinesOnDisplay - 1) - i) + startLine < 0 ? 0 : ((numDataLinesOnDisplay- 1) - i) + startLine;
for (int j = 0; j < myBitmap.PixelWidth; j++)
{
//data sample to use:
int sampleToUse = (int)(Math.Floor((availableSamples / myBitmap.PixelWidth) * j));
//get the data value:
ushort dataValue = GetDataSamplePixelValue(sampleToUse);
//the pixel to fill with this data value:
int pixelOffset = (((i + shiftFactor) * (int)myBitmap.PixelWidth) + j);
//set the pixel colour values:
pixels[pixelOffset] = dataValue;
}
}
//copy the byte array into the image:
int stride = myBitmap.PixelWidth * bytesPerPixel;
myBitmap.WritePixels(drawRegionRect, pixels, stride, 0);
In this example startLine and shiftFactor are already set, and depend on from which point in the data file the user is viewing, with shiftFactor only non-zero in the cases of a data file smaller than the screen, in which case I am centering the image vertically using this value.
Either find the bug in your code or post your full code.
The following example with a Gray16 image works correctly:
var width = 300;
var height = 300;
var bitmap = new WriteableBitmap(width, height, 96, 96, PixelFormats.Gray16, null);
var pixels = new ushort[width * height];
for (var y = 0; y < height; ++y)
for (var x = 0; x < width; ++x)
{
var v = (0x10000*2 * x/width + 0x10000 * 3 * y / height);
var isMirror = (v / 0x10000) % 2 == 1;
v = v % 0xFFFF;
if (isMirror)
v = 0xFFFF - v;
pixels[y * width + x] = (ushort)v;
}
bitmap.WritePixels(new Int32Rect(0, 0, width, height), pixels, width *2, 0);
var encoder = new PngBitmapEncoder();
encoder.Frames.Add(BitmapFrame.Create(bitmap));
using (var stream = System.IO.File.Create("gray16.png"))
encoder.Save(stream);
For reference, it is unlikely that a screen can display a 16-bit grayscale image, and also, this format is not well supported by Windows. For example, Windows XP cannot even display a 16-bit grayscale image in Photo viewer, though Windows 7+ can (I'm not sure about Vista, I don't have it).
On top of that, the .NET open TIF method will not load a 16-bit grayscale image.
The solution for loading and saving 16-bit grayscale images, and what I would recommend for TIFs in general, is LibTIFF. You then have the option of loading the whole TIF or loading it line by line, among other methods. I recommend loading it line by line, because then you can keep just the data that will be visible on screen; some TIFs these days get very large and cannot be held in a single array.
So ultimately, do not worry about displaying 16-bit grayscale on screen, it may be limited by the capabilities of the system / monitor, and the human eye cannot tell the difference between this and 8-bit anyway. If however you need to load or save 16-bit, use LibTIFF.

How to properly address 16bpp with pointers in C#

I am trying to copy camera metadata into a Bitmap, and seeing as each value in the metadata is 16 bits (a ushort), I thought it would be sensible to display it in a 16bpp grayscale Bitmap. The code I wrote is as follows:
// Getting the metadata from the device
metaData = new DepthMetaData();
dataSource.GetMetaData(metaData);
// Setting up bitmap, rect and data to use pointer
Bitmap bitmap = new Bitmap(metaData.XRes, metaData.YRes, PixelFormat.Format16bppGrayScale);
Rectangle rect = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
BitmapData data = bitmap.LockBits(rect, ImageLockMode.WriteOnly, PixelFormat.Format16bppGrayScale);
// Pointer pointing to metadata
ushort* ptrMetaData = (ushort*)dataSource.DepthMapPtr.ToPointer();
lock(this)
{
// Runs through the whole bitmap and assigns the entry in the metadata
// to a pixel
for (int y = 0; y < bitmap.Height; ++y)
{
ushort* ptrDestination = (ushort*)data.Scan0.ToPointer() + y * data.Stride;
for (int x = 0; x < bitmap.Width; ++x, ++ptrMetaData)
{
ptrDestination[x] = (ushort)*ptrMetaData;
}
}
}
// Once done unlock the bitmap so that it can be read again
bitmap.UnlockBits(data);
At run time the metadata has XRes = 640 and YRes = 480. The code throws a memory access exception in the for loops on "ptrDestination[x] = (ushort)*ptrMetaData;" after running through only 240 lines, half the total.
I used this with 8bpp where I reduced the resolution and it worked nicely, so I don't see why it should not here. Maybe someone finds the problem.
Thanks already
ushort* ptrDestination = (ushort*)data.Scan0.ToPointer() + y * data.Stride;
The data.Stride value is expressed in bytes, not ushorts, so the pointer advances twice as far per row as it should and runs off the end of the buffer at bitmap.Height / 2. Your for loops are broken, swap bitmap.Width and bitmap.Height. The lock keyword doesn't make much sense here either: apart from dataSource, you are only accessing thread-local data. Fix:
for (int y = 0; y < bitmap.Height; ++y)
{
ushort* ptrDestination = (ushort*)data.Scan0.ToPointer() + y * data.Stride / 2;
for (int x = 0; x < bitmap.Width; ++x, ++ptrMetaData)
{
ptrDestination[x] = (ushort)*ptrMetaData;
}
}
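The byte-versus-element distinction can also be shown without pointers. The sketch below copies 16-bit samples row by row into a destination buffer whose rows are a given number of bytes apart; Buffer.BlockCopy takes every offset and count in bytes, which keeps the stride arithmetic in the same unit the Bitmap uses. The class name Gray16Copy and its parameters are made up for illustration.

```csharp
using System;

public static class Gray16Copy
{
    // Copies width x height 16-bit samples into `destination`, whose rows
    // start strideInBytes apart. All BlockCopy offsets are byte offsets.
    public static void CopyRows(ushort[] source, int width, int height,
                                byte[] destination, int strideInBytes)
    {
        int rowBytes = width * sizeof(ushort);
        for (int y = 0; y < height; ++y)
        {
            Buffer.BlockCopy(source, y * rowBytes,
                             destination, y * strideInBytes, rowBytes);
        }
    }
}
```

In the pointer version, the equivalent habit is to do the row arithmetic on a byte pointer and cast afterwards, for example (ushort*)((byte*)data.Scan0 + y * data.Stride), which stays correct whatever the stride is.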

Speed up Matrix Addition in C#

I'd like to optimize this piece of code :
public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{
for (int x = 0; x < Width; x++)
{
for (int y = 0; y < Height; y++)
{
Byte pixelValue = image.GetPixel(x, y).B;
this.sumOfPixelValues[x, y] += pixelValue;
this.sumOfPixelValuesSquared[x, y] += pixelValue * pixelValue;
}
}
}
This is to be used for image processing, and we're currently running this for about 200 images. We've optimized the GetPixel value to use unsafe code, and we're not using image.Width, or image.Height, as those properties were adding to our runtime costs.
However, we're still stuck at a low speed. The problem is that our images are 640x480, so the middle of the loop is being called about 640x480x200 times.
I'd like to ask if there's a way to speed it up somehow, or convince me that it's fast enough as it is. Perhaps a way is through some fast Matrix Addition, or is Matrix Addition inherently an n^2 operation with no way to speed it up?
Perhaps doing array accesses via unsafe code would speed it up, but I'm not sure how to go about doing it, and whether it would be worth the time. Probably not.
Thanks.
EDIT : Thank you for all your answers.
This is the GetPixel method we're using:
public Color GetPixel(int x, int y)
{
int offsetFromOrigin = (y * this.stride) + (x * 3);
unsafe
{
return Color.FromArgb(this.imagePtr[offsetFromOrigin + 2], this.imagePtr[offsetFromOrigin + 1], this.imagePtr[offsetFromOrigin]);
}
}
Despite using unsafe code, GetPixel may well be the bottleneck here. Have you looked at ways of getting all the pixels in the image in one call rather than once per pixel? For instance, Bitmap.LockBits may be your friend...
On my netbook, a very simple loop iterating 640 * 480 * 200 times only takes about 100 milliseconds, so if you're finding it's all going slowly, you should take another look at the bit inside the loop.
Another optimisation you might want to look at: avoid multi-dimensional arrays. They're significantly slower than single-dimensional arrays.
In particular, you can have a single-dimensional array of size Width * Height and just keep an index:
int index = 0;
for (int x = 0; x < Width; x++)
{
for (int y = 0; y < Height; y++)
{
Byte pixelValue = image.GetPixel(x, y).B;
this.sumOfPixelValues[index] += pixelValue;
this.sumOfPixelValuesSquared[index] += pixelValue * pixelValue;
index++;
}
}
Using the same simple test harness, adding a write to a 2-D rectangular array took the total time of looping over 200 * 640 * 480 up to around 850ms; using a single-dimensional array took it back down to around 340ms. So it's somewhat significant, and currently you've got two of those writes per loop iteration.
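Putting the two suggestions together (read the raw pixel bytes directly, and keep the sums in single-dimensional arrays), a sketch could look like the following. It assumes a 24bpp BGR buffer such as one copied out of LockBits, iterates row by row so that both the image and the sums are walked sequentially, and uses a made-up class name SumsSketch; the uint and ulong element types are an assumption, not something the question specifies.

```csharp
using System;

public static class SumsSketch
{
    // Adds the blue channel of one 24bpp BGR image into the running sums.
    // bgr rows are `stride` bytes apart; sums and sumsSquared hold
    // width * height entries in row-major order.
    public static void Accumulate(byte[] bgr, int width, int height, int stride,
                                  uint[] sums, ulong[] sumsSquared)
    {
        int index = 0;
        int rowStart = 0;
        for (int y = 0; y < height; ++y)
        {
            int offset = rowStart;
            for (int x = 0; x < width; ++x)
            {
                uint blue = bgr[offset]; // blue is the first byte of a BGR pixel
                sums[index] += blue;
                sumsSquared[index] += (ulong)blue * blue;
                ++index;
                offset += 3;
            }
            rowStart += stride;
        }
    }
}
```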
Read this article, which also has some code and discusses the slowness of GetPixel.
From the article this is code to simply invert bits. This shows you the usage of LockBits as well.
It is important to note that unsafe code will not allow you to run your code remotely.
public static bool Invert(Bitmap b)
{
BitmapData bmData = b.LockBits(new Rectangle(0, 0, b.Width, b.Height),
ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);
int stride = bmData.Stride;
System.IntPtr Scan0 = bmData.Scan0;
unsafe
{
byte * p = (byte *)(void *)Scan0;
int nOffset = stride - b.Width*3;
int nWidth = b.Width * 3;
for(int y=0;y < b.Height;++y)
{
for(int x=0; x < nWidth; ++x )
{
p[0] = (byte)(255-p[0]);
++p;
}
p += nOffset;
}
}
b.UnlockBits(bmData);
return true;
}
I recommend that you profile this code and find out what's taking the most time.
You may find that it's the subscripting operation, in which case you might want to change your data structures from:
long[,] sumOfPixelValues = new long[n, m];
long[,] sumOfPixelValuesSquared = new long[n, m];
to
struct Sums
{
public long sumOfPixelValues;
public long sumOfPixelValuesSquared;
}
Sums[,] sums = new Sums[n, m];
This would depend on what you find once you profile the code.
Code profiling is the best place to start.
Matrix addition is a highly parallel operation and can be sped up by parallelizing the operation with multiple threads.
I would recommend using Intel's IPP library, which contains a threaded, highly optimized API for this sort of operation. Perhaps surprisingly it's only about $100, but it would add significant complexity to your project.
If you don't want to trouble yourself with mixed-language programming and IPP, you could try out CenterSpace's C# math libraries. The NMath API contains easy-to-use, forward-scaling matrix operations.
Paul
System.Drawing.Color is a structure, which on current versions of .NET kills most optimizations. Since you're only interested in the blue component anyway, use a method that only gets the data you need.
public byte GetPixelBlue(int x, int y)
{
int offsetFromOrigin = (y * this.stride) + (x * 3);
unsafe
{
return this.imagePtr[offsetFromOrigin];
}
}
Now, exchange the order of iteration of x and y:
public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{
for (int y = 0; y < Height; y++)
{
for (int x = 0; x < Width; x++)
{
Byte pixelValue = image.GetPixelBlue(x, y);
this.sumOfPixelValues[y, x] += pixelValue;
this.sumOfPixelValuesSquared[y, x] += pixelValue * pixelValue;
}
}
}
Now you're accessing all values within a scan line sequentially, which will make much better use of CPU cache for all three matrices involved (image.imagePtr, sumOfPixelValues, and sumOfPixelValuesSquared). [Thanks to Jon for noticing that when I fixed access to image.imagePtr, I broke the other two. Now the output array indexing is swapped to keep it optimal.]
Next, get rid of the member references. Another thread could theoretically be setting sumOfPixelValues to another array midway through, which does horrible horrible things to optimizations.
public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{
uint [,] sums = this.sumOfPixelValues;
ulong [,] squares = this.sumOfPixelValuesSquared;
for (int y = 0; y < Height; y++)
{
for (int x = 0; x < Width; x++)
{
Byte pixelValue = image.GetPixelBlue(x, y);
sums[y, x] += pixelValue;
squares[y, x] += (ulong)(pixelValue * pixelValue);
}
}
}
Now the compiler can generate optimal code for moving through the two output arrays, and after inlining and optimization, the inner loop can step through the image.imagePtr array with a stride of 3 instead of recalculating the offset all the time. Now an unsafe version for good measure, doing the optimizations that I think .NET ought to be smart enough to do but probably isn't:
unsafe public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{
byte* scanline = image.imagePtr;
fixed (uint* sumsStart = &this.sumOfPixelValues[0,0])
fixed (ulong* squaresStart = &this.sumOfPixelValuesSquared[0,0])
{
// fixed pointers are read-only, so advance local copies instead
uint* sums = sumsStart;
ulong* squares = squaresStart;
for (int y = 0; y < Height; y++)
{
byte* blue = scanline;
for (int x = 0; x < Width; x++)
{
byte pixelValue = *blue;
*sums += pixelValue;
*squares += (ulong)(pixelValue * pixelValue);
blue += 3;
sums++;
squares++;
}
scanline += image.stride;
}
}
}
Where are images stored? If each is on disk, then part of your processing time may be spent fetching them from disk. You might examine this to see if it is an issue, and if so, rewrite to pre-fetch the image data so that the array processing code does not have to wait for the data.
If the overall application logic allows it (is each matrix addition independent, or dependent on the output of a previous matrix addition?), and they are independent, I'd examine executing them all on separate threads, or in parallel.
The only possible way I can think of to speed it up would be to try do some of the additions in parallel, which with your size might be beneficial over the threading overhead.
Matrix addition is of course an n^2 operation but you can speed it up by using unsafe code or at least using jagged arrays instead of multidimensional.
About the only way to effectively speed up your matrix multiplication is to use the right algorithm. There are more efficient algorithms for matrix multiplication: take a look at the Strassen and Coppersmith-Winograd algorithms. It was also noted in the previous replies that you can parallelize the code, which helps quite a bit.
I'm not sure if it's faster but you may write something like;
public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{
Byte pixelValue;
for (int x = 0; x < Width; x++)
{
for (int y = 0; y < Height; y++)
{
pixelValue = image.GetPixel(x, y).B;
this.sumOfPixelValues[x, y] += pixelValue;
this.sumOfPixelValuesSquared[x, y] += pixelValue * pixelValue;
}
}
}
If you only do matrix addition, you may want to consider using multiple threads to take advantage of multi-core processors. Also use a one-dimensional index instead of two.
If you want to do more complicated operations, you need to use a highly optimized math library, like NMath.Net, which uses native code rather than .net.
Sometimes doing things in native C#, even unsafe calls, is just slower than using methods that have already been optimized.
No results guaranteed, but you may want to investigate the System.Windows.Media.Imaging name space and look at your whole problem in a different way.
Although it's a micro-optimization and thus may not add much you might want to study what the likelihood is of getting a zero when you do
Byte pixelValue = image.GetPixel(x, y).B;
Clearly, if pixelValue = 0 then there's no reason to do the summations so your routine might become
public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{
for (int x = 0; x < Width; x++)
{
for (int y = 0; y < Height; y++)
{
Byte pixelValue = image.GetPixel(x, y).B;
if(pixelValue != 0)
{
this.sumOfPixelValues[x, y] += pixelValue;
this.sumOfPixelValuesSquared[x, y] += pixelValue * pixelValue;
}
}
}
}
However, the question is how often you're going to see pixelValue=0, and whether the saving on the compute-and-store will offset the cost of the test.
This is a classic case of micro-optimisation failing horribly. You're not going to get anything from looking at that loop. To get real speed benefits you need to start off by looking at the big picture:-
Can you asynchronously preload image[n+1] whilst processing image[n]?
Can you load just the B channel from the image? This will decrease memory bandwidth.
Can you load the B value and update the sumOfPixelValues(Squared) arrays directly, i.e. read the file and update instead of read file, store, read, update? Again, this decreases memory bandwidth.
Can you use one dimensional arrays instead of two dimensional? Maybe create your own array class that works either way.
Perhaps you could look into using Mono and the SIMD extensions?
Can you process the image in chunks and assign them to idle CPUs in a multi-cpu environment?
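As an illustration of the first item in the list (loading image[n+1] while image[n] is being processed), here is a minimal sketch using Task.Run. The Prefetch class and the load and process delegates are placeholders invented for this example; how an image is actually loaded and processed is left to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class Prefetch
{
    // Processes items in order while the next one is loaded in the background.
    public static void ProcessAll<T>(IReadOnlyList<string> paths,
                                     Func<string, T> load, Action<T> process)
    {
        if (paths.Count == 0) return;

        Task<T> next = Task.Run(() => load(paths[0]));
        for (int i = 0; i < paths.Count; i++)
        {
            T current = next.Result; // wait for the load started earlier

            if (i + 1 < paths.Count)
            {
                string path = paths[i + 1];
                next = Task.Run(() => load(path)); // start loading the next item
            }

            process(current); // overlaps with the background load
        }
    }
}
```

Only one image is loaded ahead at a time, so memory use stays at roughly two images. Whether this helps depends on how much of the total time is actually spent loading, which is worth measuring first.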
EDIT:
Try having specialised image accessors so you're not wasting memory bandwidth:
public byte GetBPixel (int x, int y)
{
int offsetFromOrigin = (y * this.stride) + (x * 3);
unsafe
{
// blue is the first byte of each BGR pixel (see GetPixel above)
return this.imagePtr [offsetFromOrigin];
}
}
or, better still:
public byte GetBPixel (int offset)
{
unsafe
{
return this.imagePtr [offset];
}
}
and use the above in a loop like:
for (int start_offset = 0, y = 0 ; y < Height ; start_offset += stride, ++y)
{
for (int x = 0, offset = start_offset ; x < Width ; offset += 3, ++x)
{
pixel = GetBPixel (offset);
// do stuff
}
}
The complexity of matrix addition is O(n^2) in the number of additions.
However, since there are no intermediate results, you can parallelize the additions using threads:
it is easy to prove that the resulting algorithm will be lock-free
you can tune the number of threads to find the optimum
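A minimal sketch of the lock-free argument, assuming a 24bpp BGR byte buffer and one-dimensional sum arrays (both assumptions, as is the class name ParallelSums): each row writes to its own range of indices, so rows can be handed to Parallel.For without any locking.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelSums
{
    // Adds the blue channel of one BGR image into the running sums.
    // Row y only touches indices [y * width, (y + 1) * width), so no two
    // iterations ever write to the same element.
    public static void Accumulate(byte[] bgr, int width, int height, int stride,
                                  uint[] sums, ulong[] sumsSquared)
    {
        Parallel.For(0, height, y =>
        {
            int offset = y * stride; // start of row y in the image
            int index = y * width;   // start of row y in the sums
            for (int x = 0; x < width; ++x)
            {
                uint blue = bgr[offset];
                sums[index] += blue;
                sumsSquared[index] += (ulong)blue * blue;
                offset += 3;
                ++index;
            }
        });
    }
}
```

The number of threads can be capped through the Parallel.For overload that takes a ParallelOptions with MaxDegreeOfParallelism. For a 640x480 image the per-row work is small, so it is worth measuring whether the scheduling overhead is paid back.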
