OpenGL 16-bit display via Tao/C#

I have some scientific image data that's coming out of a detector device in a 16 bit range which then gets rendered in an image. In order to display this data, I'm using OpenGL, because it should support ushorts as part of the library. I've managed to get this data into textures rendering on an OpenGL 1.4 platform, a limitation that is a requirement of this project.
Unfortunately, the resulting textures look like they're being reduced to 8 bits, rather than 16 bits. I test this by generating a gradient image and displaying it; while the image itself has each pixel different from its neighbors, the displayed texture is showing stripe patterns where all pixels next to one another are showing up as equal values.
I've tried doing this with glDrawPixels, and the resulting image actually looks like it's rendering all 16 bits.
How can I force these textures to display properly?
To give more background, the LUT (LookUp Table) is being determined by the following code:
String str = "!!ARBfp1.0\n" +
"ATTRIB tex = fragment.texcoord[0];\n" +
"PARAM cbias = program.local[0];\n" +
"PARAM cscale = program.local[1];\n" +
"OUTPUT cout = result.color;\n" +
"TEMP tmp;\n" +
"TXP tmp, tex, texture[0], 2D;\n" +
"SUB tmp, tmp, cbias;\n" +
"MUL cout, tmp, cscale;\n" +
"END";
Gl.glEnable(Gl.GL_FRAGMENT_PROGRAM_ARB);
Gl.glGenProgramsARB(1, out mFragProg);
Gl.glBindProgramARB(Gl.GL_FRAGMENT_PROGRAM_ARB, mFragProg);
System.Text.Encoding ascii = System.Text.Encoding.ASCII;
Byte[] encodedBytes = ascii.GetBytes(str);
Gl.glProgramStringARB(Gl.GL_FRAGMENT_PROGRAM_ARB, Gl.GL_PROGRAM_FORMAT_ASCII_ARB,
count, encodedBytes);
GetGLError("Shader");
Gl.glDisable(Gl.GL_FRAGMENT_PROGRAM_ARB);
Where cbias and cScale are between 0 and 1.
Thanks!
EDIT: To answer some of the other questions, the line with glTexImage:
Gl.glBindTexture(Gl.GL_TEXTURE_2D, inTexData.TexName);
Gl.glTexImage2D(Gl.GL_TEXTURE_2D, 0, Gl.GL_LUMINANCE, inTexData.TexWidth, inTexData.TexHeight,
0, Gl.GL_LUMINANCE, Gl.GL_UNSIGNED_SHORT, theTexBuffer);
Gl.glTexParameteri(Gl.GL_TEXTURE_2D, Gl.GL_TEXTURE_MIN_FILTER, Gl.GL_LINEAR); // Linear Filtering
Gl.glTexParameteri(Gl.GL_TEXTURE_2D, Gl.GL_TEXTURE_MAG_FILTER, Gl.GL_LINEAR); // Linear Filtering
theTexBuffer = null;
GC.Collect();
GC.WaitForPendingFinalizers();
The pixel format is set when the context is initialized:
Gdi.PIXELFORMATDESCRIPTOR pfd = new Gdi.PIXELFORMATDESCRIPTOR();// The pixel format descriptor
pfd.nSize = (short)Marshal.SizeOf(pfd); // Size of the pixel format descriptor
pfd.nVersion = 1; // Version number (always 1)
pfd.dwFlags = Gdi.PFD_DRAW_TO_WINDOW | // Format must support windowed mode
Gdi.PFD_SUPPORT_OPENGL | // Format must support OpenGL
Gdi.PFD_DOUBLEBUFFER; // Must support double buffering
pfd.iPixelType = (byte)Gdi.PFD_TYPE_RGBA; // Request an RGBA format
pfd.cColorBits = (byte)colorBits; // Select our color depth
pfd.cRedBits = 0; // Individual color bits ignored
pfd.cRedShift = 0;
pfd.cGreenBits = 0;
pfd.cGreenShift = 0;
pfd.cBlueBits = 0;
pfd.cBlueShift = 0;
pfd.cAlphaBits = 0; // No alpha buffer
pfd.cAlphaShift = 0; // Alpha shift bit ignored
pfd.cAccumBits = 0; // Accumulation buffer
pfd.cAccumRedBits = 0; // Individual accumulation bits ignored
pfd.cAccumGreenBits = 0;
pfd.cAccumBlueBits = 0;
pfd.cAccumAlphaBits = 0;
pfd.cDepthBits = 16; // Z-buffer (depth buffer)
pfd.cStencilBits = 0; // No stencil buffer
pfd.cAuxBuffers = 0; // No auxiliary buffer
pfd.iLayerType = (byte)Gdi.PFD_MAIN_PLANE; // Main drawing layer
pfd.bReserved = 0; // Reserved
pfd.dwLayerMask = 0; // Layer masks ignored
pfd.dwVisibleMask = 0;
pfd.dwDamageMask = 0;
pixelFormat = Gdi.ChoosePixelFormat(mDC, ref pfd); // Attempt to find an appropriate pixel format
if (!Gdi.SetPixelFormat(mDC, pixelFormat, ref pfd))
{ // Are we not able to set the pixel format?
BigMessageBox.ShowMessage("Can not set the chosen PixelFormat. Chosen PixelFormat was " + pixelFormat + ".");
Environment.Exit(-1);
}

When you create a texture, the 'type' parameter of glTexImage only describes the data type of the texture data you pass in, before OpenGL converts it into its own storage format. To create a texture with 16 bits per channel you need to request something like GL_LUMINANCE16 as the internal format (the format parameter remains GL_LUMINANCE). If GL_LUMINANCE16 is not available on your OpenGL 1.4 implementation, check whether GL_EXT_texture is available and try GL_LUMINANCE16_EXT.
One of these should work. If it doesn't, you can encode each 16-bit value as a pair of 8-bit values with GL_LUMINANCE_ALPHA and decode it again inside a shader.
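The two-byte fallback can be sketched on the CPU side as below. This is only an illustration under assumptions: the class and method names are hypothetical, and the packed buffer is assumed to be uploaded with GL_LUMINANCE_ALPHA as the format and GL_UNSIGNED_BYTE as the type. The Decode method mirrors the arithmetic a fragment program would need to apply to the sampled luminance and alpha values; it is a reference for that arithmetic, not shader code.

```csharp
using System;

// Hypothetical helper: splits 16-bit samples into luminance/alpha byte pairs.
public static class LuminanceAlphaPacking
{
    // High byte goes to the luminance channel, low byte to the alpha channel.
    public static byte[] Pack(ushort[] samples)
    {
        var packed = new byte[samples.Length * 2];
        for (int i = 0; i < samples.Length; i++)
        {
            packed[2 * i] = (byte)(samples[i] >> 8);       // luminance = high byte
            packed[2 * i + 1] = (byte)(samples[i] & 0xFF); // alpha = low byte
        }
        return packed;
    }

    // Reconstructs the normalized 16-bit value (0..1) from the two bytes,
    // using the same weights a shader would apply to the normalized texel.
    public static double Decode(byte luminance, byte alpha)
    {
        double l = luminance / 255.0; // what the sampler returns for luminance
        double a = alpha / 255.0;     // what the sampler returns for alpha
        return l * (255.0 * 256.0 / 65535.0) + a * (255.0 / 65535.0);
    }
}
```

In the fragment program, the two weights would become constants applied to the luminance and alpha components before the existing bias and scale steps.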

I've never worked in depths higher (deeper) than 8bit per channel, but here's what I'd try first:
Turn off filtering on the texture and see how it affects the output.
Set texturing glHints to best quality.

You could consider using a single-channel floating-point texture through one of the GL_ARB_texture_float, GL_ATI_texture_float or GL_NV_float_buffer extensions, if the hardware supports it. I can't recall whether GL 1.4 has floating-point textures, though.

Related

Loading and displaying a 16 (12) bit grayscale png into a PictureBox

I'm using a framework for some camera hardware called IDS Peak and we are receiving 16 bit grayscale images back from the framework, the framework itself can write the files to disk as PNGs and that's all good and well, but how do I display them in a PictureBox in Winforms?
Windows Bitmap does not support 16 bit grayscale so the following code throws a 'Parameter is not valid.' System.ArgumentException
var image = new Bitmap(width, height, stride, System.Drawing.Imaging.PixelFormat.Format16bppGrayScale, iplImg.Data());
iplImg.Data() here is an IntPtr to the bespoke Image format of the framework.
Considering Windows Bitmap does not support the format, and I can write the files using the framework to PNGs, how can I do one of the following:
Convert to a different object type other than Bitmap to display directly in Winforms without reading from the files.
Load the 16-bit grayscale PNG files into the PictureBox control (or any other control type, it doesn't have to be a PictureBox).
(1) is preferable because it doesn't require file I/O, but (2) is completely fine if it is the only possibility: I need to both save and display the images anyway, and (1) simply avoids a secondary read after the write.
Before being written to disk, the images are actually monochrome with 12 bits per pixel, packed.

While it is possible to display 16-bit images, for example by hosting a WPF control in WinForms, you probably want to apply a windowing function to reduce the image to 8 bits before display.
So let's use unsafe code and pointers for speed:
var bitmapData = myBitmap.LockBits(
new Rectangle(0, 0, myBitmap.Width, myBitmap.Height),
ImageLockMode.ReadWrite,
myBitmap.PixelFormat);
try
{
var ptr= (byte*)bitmapData.Scan0;
var stride = bitmapData.Stride;
var width = bitmapData.Width;
var height= bitmapData.Height;
// Conversion Code
}
finally
{
myBitmap.UnlockBits(bitmapData);
}
or using WPF image classes, which generally have better 16-bit support:
var myBitmap = new WriteableBitmap(new BitmapImage(new Uri("myBitmap.jpg", UriKind.Relative)));
myBitmap.Lock();
try{
var ptr = (byte*)myBitmap.BackBuffer;
...
}
finally
{
myBitmap.Unlock();
}
To loop over all the pixels you would use a double loop:
for (int y = 0; y < height; y++)
{
var row = (ushort*)(ptr+ y * stride);
for (int x = 0; x < width; x++)
{
var pixelValue = row[x];
// Scaling code
}
}
And to scale the value you could use a linear scaling between the min and max values to the 0-255 range of a byte:
var slope = (byte.MaxValue + 1f) / (maxUshortValue - minUshortValue);
var scaled = (int)((pixelValue + 0.5f - minUshortValue) * slope);
scaled = scaled > byte.MaxValue ? byte.MaxValue : scaled;
scaled = scaled < 0 ? 0 : scaled;
var byteValue = (byte)scaled;
The maxUshortValue / minUshortValue would either be computed from the max/min value of the image, or configured by the user. You would also need to create a target image to hold the result: either an 8-bit grayscale bitmap, or a color image in which the same value is written to each color channel.
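Putting the scaling pieces together, a safe (pointer-free) version of the windowing step might look like the sketch below. It assumes the 16-bit pixels have already been copied into a ushort array; the Windowing class and To8Bit method are hypothetical names used only for illustration, and the formula is the same linear scaling described above.

```csharp
using System;

// Hypothetical helper: linear windowing of 16-bit grayscale data down to 8 bits.
public static class Windowing
{
    public static byte[] To8Bit(ushort[] pixels, ushort min, ushort max)
    {
        if (max <= min)
            throw new ArgumentException("max must be greater than min");

        // Map [min, max] onto the 256 byte values.
        float slope = (byte.MaxValue + 1f) / (max - min);
        var result = new byte[pixels.Length];
        for (int i = 0; i < pixels.Length; i++)
        {
            int scaled = (int)((pixels[i] + 0.5f - min) * slope);
            // Clamp values that fall outside the window.
            scaled = Math.Min(byte.MaxValue, Math.Max(0, scaled));
            result[i] = (byte)scaled;
        }
        return result;
    }
}
```

For 12-bit sensor data, calling To8Bit(pixels, 0, 4095) would use the full sensor range as the window; a narrower window increases contrast within that range at the cost of clipping everything outside it.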

Performant method of drawing text onto a png file?

I need to draw a two-dimensional grid of Squares with centered Text on them onto a (transparent) PNG file.
The tiles need to have a sufficiently high resolution so that the text does not get too pixelated.
For testing purposes I create a 2048x2048px 32-bit (with transparency) PNG image with 128x128px tiles.
The problem is I need to do this with reasonable performance. All methods I have tried so far took more than 100ms to complete, while I would need this to be at a max < 10ms. Apart from that I would need the program generating these images to be Cross-Platform and support WebAssembly (but even if you have for example an idea how to do this using posix threads, etc. I would gladly take that as a starting point, too).
Net5 Implementation
using System.Diagnostics;
using System;
using System.Drawing;
namespace ImageGeneratorBenchmark
{
class Program
{
static int rowColCount = 16;
static int tileSize = 128;
static void Main(string[] args)
{
var watch = Stopwatch.StartNew();
Bitmap bitmap = new Bitmap(rowColCount * tileSize, rowColCount * tileSize);
Graphics graphics = Graphics.FromImage(bitmap);
Brush[] usedBrushes = { Brushes.Blue, Brushes.Red, Brushes.Green, Brushes.Orange, Brushes.Yellow };
int totalCount = rowColCount * rowColCount;
Random random = new Random();
StringFormat format = new StringFormat();
format.LineAlignment = StringAlignment.Center;
format.Alignment = StringAlignment.Center;
for (int i = 0; i < totalCount; i++)
{
int x = i % rowColCount * tileSize;
int y = i / rowColCount * tileSize;
graphics.FillRectangle(usedBrushes[random.Next(0, usedBrushes.Length)], x, y, tileSize, tileSize);
graphics.DrawString(i.ToString(), SystemFonts.DefaultFont, Brushes.Black, x + tileSize / 2, y + tileSize / 2, format);
}
bitmap.Save("Test.png");
watch.Stop();
Console.WriteLine($"Output took {watch.ElapsedMilliseconds} ms.");
}
}
}
This takes around 115ms on my machine. I am using the System.Drawing.Common nuget here.
Saving the bitmap takes roughly 55ms and drawing to the graphics object in the loop also takes roughly 60ms, while 40ms can be attributed to drawing the text.
Rust Implementation
use std::path::Path;
use std::time::Instant;
use image::{Rgba, RgbaImage};
use imageproc::{drawing::{draw_text_mut, draw_filled_rect_mut, text_size}, rect::Rect};
use rusttype::{Font, Scale};
use rand::Rng;
#[derive(Default)]
struct TextureAtlas {
segment_size: u16, // The side length of the tile
row_col_count: u8, // The amount of tiles in horizontal and vertical direction
current_segment: u32 // Points to the next segment, that will be used
}
fn main() {
let before = Instant::now();
let mut atlas = TextureAtlas {
segment_size: 128,
row_col_count: 16,
..Default::default()
};
let path = Path::new("test.png");
let colors = vec![Rgba([132u8, 132u8, 132u8, 255u8]), Rgba([132u8, 255u8, 32u8, 120u8]), Rgba([200u8, 255u8, 132u8, 255u8]), Rgba([255u8, 0u8, 0u8, 255u8])];
let mut image = RgbaImage::new(2048, 2048);
let font = Vec::from(include_bytes!("../assets/DejaVuSans.ttf") as &[u8]);
let font = Font::try_from_vec(font).unwrap();
let font_size = 40.0;
let scale = Scale {
x: font_size,
y: font_size,
};
// Draw random color rects for benchmarking
for i in 0..256 {
let rand_num = rand::thread_rng().gen_range(0..colors.len());
draw_filled_rect_mut(
&mut image,
Rect::at((atlas.current_segment as i32 % atlas.row_col_count as i32) * atlas.segment_size as i32, (atlas.current_segment as i32 / atlas.row_col_count as i32) * atlas.segment_size as i32)
.of_size(atlas.segment_size.into(), atlas.segment_size.into()),
colors[rand_num]);
let number = i.to_string();
//let text = &number[..];
let text = number.as_str(); // Somehow this conversion takes ~15ms here for 255 iterations, whereas it should normally only be less than 1us
let (w, h) = text_size(scale, &font, text);
draw_text_mut(
&mut image,
Rgba([0u8, 0u8, 0u8, 255u8]),
(atlas.current_segment % atlas.row_col_count as u32) * atlas.segment_size as u32 + atlas.segment_size as u32 / 2 - w as u32 / 2,
(atlas.current_segment / atlas.row_col_count as u32) * atlas.segment_size as u32 + atlas.segment_size as u32 / 2 - h as u32 / 2,
scale,
&font,
text);
atlas.current_segment += 1;
}
image.save(path).unwrap();
println!("Output took {:?}", before.elapsed());
}
For Rust I was using the imageproc crate. Previously I used the piet-common crate, but the output took more than 300ms. With the imageproc crate I got around 110ms in release mode, which is on par with the C# version, but I think it will perform better with webassembly.
When I used a static string instead of converting the number from the loop (see comment) I got below 100ms execution time. For Rust drawing to the image only takes around 30ms, but saving it takes 80ms.
C++ Implementation
#include <iostream>
#include <cstdlib>
#define cimg_display 0
#define cimg_use_png
#include "CImg.h"
#include <chrono>
#include <string>
using namespace cimg_library;
using namespace std;
/* Generate random numbers in an inclusive range. */
int random(int min, int max)
{
static bool first = true;
if (first)
{
srand(time(NULL));
first = false;
}
return min + rand() % ((max + 1) - min);
}
int main() {
auto t1 = std::chrono::high_resolution_clock::now();
static int tile_size = 128;
static int row_col_count = 16;
// Create 2048x2048px image.
CImg<unsigned char> image(tile_size*row_col_count, tile_size*row_col_count, 1, 3);
// Make some colours.
unsigned char cyan[] = { 0, 255, 255 };
unsigned char black[] = { 0, 0, 0 };
unsigned char yellow[] = { 255, 255, 0 };
unsigned char red[] = { 255, 0, 0 };
unsigned char green[] = { 0, 255, 0 };
unsigned char orange[] = { 255, 165, 0 };
unsigned char colors [] = { // This is terrible, but I don't know C++ very well.
cyan[0], cyan[1], cyan[2],
yellow[0], yellow[1], yellow[2],
red[0], red[1], red[2],
green[0], green[1], green[2],
orange[0], orange[1], orange[2],
};
int total_count = row_col_count * row_col_count;
for (size_t i = 0; i < total_count; i++)
{
int x = i % row_col_count * tile_size;
int y = i / row_col_count * tile_size;
int random_color_index = random(0, 4);
unsigned char current_color [] = { colors[random_color_index * 3], colors[random_color_index * 3 + 1], colors[random_color_index * 3 + 2] };
image.draw_rectangle(x, y, x + tile_size, y + tile_size, current_color, 1.0); // Force use of transparency. -> Does not work. Always outputs 24bit PNGs.
auto s = std::to_string(i);
CImg<unsigned char> imgtext;
unsigned char color = 1;
imgtext.draw_text(0, 0, s.c_str(), &color, 0, 1, 40); // Measure the text by drawing to an empty instance, so that the bounding box will be set automatically.
image.draw_text(x + tile_size / 2 - imgtext.width() / 2, y + tile_size / 2 - imgtext.height() / 2, s.c_str(), black, 0, 1, 40);
}
// Save result image as PNG (libpng and GraphicsMagick are required).
image.save_png("Test.png");
auto t2 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count();
std::cout << "Output took " << duration << "ms.";
getchar();
}
I also reimplemented the same program in C++ using CImg. For .png output libpng and GraphicsMagick are required, too. I am not very fluent in C++ and I did not even bother optimizing, because the save operation took ~200ms in Release mode, whereas the whole Image generation which is currently very unoptimized took only 30ms. So this solution also falls way short of my goal.
Where I am right now
A graph of where I am right now. I will update this when I make some progress.
Why I am trying to do this and why it bothers me so much
I was asked in the comments to give a bit more context. I know this question is getting a bit bloated, but if you are interested, read on...
So basically I need to build a Texture Atlas for a .gltf file. I need to generate a .gltf file from data and the primitives in the .gltf file will be assigned a texture based on the input data, too. In order to optimize for a small amount of draw calls I am putting as much geometry as possible into one single primitive and then use texture coordinates to map the texture to the model. Now GPUs have a maximum size, that the texture can have. I will use 2048x2048 pixels, because the majority of devices supports at least that. That means, that if I have more than 256 objects, I need to add a new primitive to the .gltf and generate another texture atlas. In some cases one texture atlas might be sufficient, in other cases I need up to 15-20.
The textures will have a (semi-)transparent background, maybe text and maybe some lines / hatches or simple symbols, that can be drawn with a path.
I have the whole system set up in Rust already and the .gltf generation is really efficient: I can generate 54000 vertices (=1500 boxes, for example) in about 10ms, which is a common case. Now for this I need to generate 6 texture atlases, which is not really a problem on a multi-core system (7 threads: one for the .gltf, six for the textures). The problem is that generating one takes about 100ms (or now 55ms), which makes the whole process more than 5 times slower.
Unfortunately it gets even worse, because another common case is 15000 objects. Generating the vertices (plus a lot of custom attributes, actually) and assembling the .gltf still only takes 96ms (540000 vertices / 20MB .gltf), but in that time I need to generate 59 texture atlases. I am working on an 8-core system, so at that point it becomes impossible for me to run them all in parallel and I will have to generate ~9 atlases per thread (which means 55ms*9 = 495ms), so again this is 5 times as much and actually creates a quite noticeable lag. In reality it currently takes more than 2.5 s, because I have not yet updated it to use the faster code and there seems to be additional slowdown.
What I need to do
I do understand that it will take some time to write out 4194304 32-bit pixels. But as far as I can see, because I am only writing to different parts of the image (for example only to the upper tile and so on) it should be possible to build a program that does this using multiple threads. That is what I would like to try and I would take any hint on how to make my Rust program run faster.
If it helps I would also be willing to rewrite this in C or any other language, that can be compiled to wasm and can be called via Rust's FFI. So if you have suggestions for more performant libraries I would be very thankful for that too.
Edit
Update 1: I made all the suggested improvements for the C# version from the comments. Thanks for all of them. It is now at 115ms and almost exactly as fast as the Rust version, which makes me believe I am sort of hitting a dead end there and would really need to find a way to parallelize this in order to make significant further improvements...
Update 2: Thanks to @pinkfloydx33 I was able to run the binary in around 60ms (including the first run) after publishing it with dotnet publish -p:PublishReadyToRun=true --runtime win10-x64 --configuration Release.
In the meantime I also tried other methods myself, namely Python with Pillow (~400ms), C# and Rust both with Skia (~314ms and ~260ms) and I also reimplemented the program in C++ using CImg (and libpng as well as GraphicsMagick).

I was able to get all of the drawing (creating the grid and the text) down to 4-5ms by:
Caching values where possible (Random, StringFormat, Math.Pow)
Using ArrayPool for scratch buffer
Using the DrawString overload accepting a StringFormat with the following options:
Alignment and LineAlignment for centering (in lieu of manually calculating)
FormatFlags and Trimming options that disable things like overflow/wrapping since we are just writing small numbers (this had an impact, though negligible)
Using a custom Font from the GenericMonospace font family instead of SystemFonts.DefaultFont
This shaved off ~15ms
Fiddling with various Graphics options, such as TextRenderingHint and SmoothingMode
I got varying results so you may want to fiddle some more
An array of Color and the ToArgb function to create an int representing the 4x bytes of the pixel's color
Using LockBits, (semi-)unsafe code and Span to
Fill a buffer representing 1px high and size * countpx wide (the entire image width) with the int representing the ARGB values of the random colors
Copy that buffer size times (now representing an entire square in height)
Rinse/Repeat
unsafe was required to create a Span<> from the locked bit's Scan0 pointer
Finally, using GDI/native to draw the text over the graphic
I was then able to shave a little bit of time off of the actual saving process by using the Image.Save(Stream) overload. I used a FileStream with a custom buffer-size of 16kb (over the default 4kb) which seemed to be the sweet spot. This brought the total end-to-end time down to around 40ms (on my machine).
private static readonly Random Random = new();
private static readonly Color[] UsedColors = { Color.Blue, Color.Red, Color.Green, Color.Orange, Color.Yellow };
private static readonly StringFormat Format = new()
{
Alignment = StringAlignment.Center,
LineAlignment = StringAlignment.Center,
FormatFlags = StringFormatFlags.NoWrap | StringFormatFlags.FitBlackBox | StringFormatFlags.NoClip,
Trimming = StringTrimming.None, HotkeyPrefix = HotkeyPrefix.None
};
private static unsafe void DrawGrid(int count, int size, bool save)
{
var intsPerRow = size * count;
var sizePerFullRow = intsPerRow * size;
var colorsLen = UsedColors.Length;
using var bitmap = new Bitmap(intsPerRow, intsPerRow, PixelFormat.Format32bppArgb);
var bmpData = bitmap.LockBits(new Rectangle(0, 0, bitmap.Width, bitmap.Height), ImageLockMode.WriteOnly, PixelFormat.Format32bppArgb);
var byteSpan = new Span<byte>(bmpData.Scan0.ToPointer(), Math.Abs(bmpData.Stride) * bmpData.Height);
var intSpan = MemoryMarshal.Cast<byte, int>(byteSpan);
var arr = ArrayPool<int>.Shared.Rent(intsPerRow);
var buff = arr.AsSpan(0, intsPerRow);
for (int y = 0, offset = 0; y < count; ++y)
{
// fill buffer with an entire 1px row of colors
for (var bOffset = 0; bOffset < intsPerRow; bOffset += size)
buff.Slice(bOffset, size).Fill(UsedColors[Random.Next(0, colorsLen)].ToArgb());
// duplicate the pixel high row until we've created a row of squares in full
var len = offset + sizePerFullRow;
for ( ; offset < len; offset += intsPerRow)
buff.CopyTo(intSpan.Slice(offset, intsPerRow));
}
ArrayPool<int>.Shared.Return(arr);
bitmap.UnlockBits(bmpData);
using var graphics = Graphics.FromImage(bitmap);
graphics.TextRenderingHint = TextRenderingHint.ClearTypeGridFit;
// some or all of these may not even matter?
// you may try removing/modifying the rest
graphics.CompositingQuality = CompositingQuality.HighSpeed;
graphics.InterpolationMode = InterpolationMode.Default;
graphics.SmoothingMode = SmoothingMode.HighSpeed;
graphics.PixelOffsetMode = PixelOffsetMode.HighSpeed;
var font = new Font(FontFamily.GenericMonospace, 14, FontStyle.Regular);
var lenSquares = count * count;
for (var i = 0; i < lenSquares; ++i)
{
var x = i % count * size;
var y = i / count * size;
var rect = new Rectangle(x, y, size, size);
graphics.DrawString(i.ToString(), font, Brushes.Black, rect, Format);
}
if (save)
{
using var fs = new FileStream("Test.png", FileMode.Create, FileAccess.Write, FileShare.Write, 16 * 1024);
bitmap.Save(fs, ImageFormat.Png);
}
}
Here are the timings (in ms) using a Stopwatch in Release mode, run outside of Visual Studio. At least the first 1 or 2 timings should be ignored since the methods aren't fully jitted yet. Your mileage will vary depending on your PC, etc.
Image generation only:
Elapsed: 38
Elapsed: 6
Elapsed: 4
Elapsed: 4
Elapsed: 4
Elapsed: 4
Elapsed: 5
Elapsed: 4
Elapsed: 5
Elapsed: 4
Elapsed: 4
Image Generation and saving:
Elapsed: 95
Elapsed: 48
Elapsed: 41
Elapsed: 40
Elapsed: 37
Elapsed: 42
Elapsed: 42
Elapsed: 39
Elapsed: 38
Elapsed: 40
Elapsed: 41
I don't think there is anything that can be done about the slow save. I reviewed the source code of Image.Save. It calls into Native/GDI, passing in a Handle to the Stream, the native image pointer and the Guid representing PNG's ImageCodecInfo (encoder). Any slowness is going to be on that end. Update: I have verified that you get the same slow speed when saving to a MemoryStream so this has nothing to do with the fact you are saving to a file and everything to do with what's going on behind the scenes with GDI/native.
I also attempted to get the Image drawing down further using direct unsafe (pointers) and/or tricks with Unsafe and MemoryMarshal (ex. CopyBlock) as well as unrolling the loops. Those methods either produced identical results or worse and made things a bit harder to follow.
Note: Publishing as a console application with PublishReadyToRun=true seems to help a bit as well.
Update
I realize that the above is just an example, so this may not apply to your end goal. Upon further, extensive review I found that the bulk of the time spent is actually part of Image::Save. It doesn't matter what type of Stream we are saving to, even MemoryStream exhibits the same slowness (obviously disregarding file I/O). I am confident this is related to having GDI objects in the Image/Graphics--in our case the text from DrawString.
As a "simple" test I updated the above so that the text was drawn on a secondary, all-white image. Without saving that image, I then looped over its individual pixels and, based on the rough color (since we have anti-aliasing to deal with), manually set the corresponding pixel on the primary bitmap. The entire end-to-end process took under 20ms on my machine. The rendered image wasn't perfect since it was a quick test, but it proves that you can do parts of this manually and still achieve really low times. The problem is the text drawing, but we can leverage GDI without actually using it in our final image. You just need to find the sweet spot. I also tried using an indexed format and populating the palette with colors beforehand, which also appeared to help some. Anyway, just food for thought.
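The manual copy step described above can be sketched roughly as follows. This is not the code used for the timings; it assumes both images have already been read into 32-bit ARGB int buffers of equal size (for example via LockBits), and the TextOverlay class, the Composite method and the threshold parameter are hypothetical names chosen for illustration.

```csharp
using System;

// Hypothetical helper: copies dark (text) pixels from a white text layer
// onto the primary image buffer. Both buffers hold 32-bit ARGB pixels.
public static class TextOverlay
{
    public static void Composite(int[] target, int[] textLayer, int threshold, int textColor)
    {
        if (target.Length != textLayer.Length)
            throw new ArgumentException("Buffers must have the same size");

        for (int i = 0; i < textLayer.Length; i++)
        {
            int argb = textLayer[i];
            int r = (argb >> 16) & 0xFF;
            int g = (argb >> 8) & 0xFF;
            int b = argb & 0xFF;
            // "Rough color": treat anything darker than the threshold as text,
            // which discards most of the anti-aliased edge pixels.
            int gray = (r + g + b) / 3;
            if (gray < threshold)
                target[i] = textColor;
        }
    }
}
```

A hard threshold loses the anti-aliasing, which matches the "wasn't perfect" caveat above; blending by the gray value instead would keep smoother edges at some extra cost per pixel.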

Convert 12-bit Monochrome Image to 8-bit Grayscale

I have an image sensor board for embedded development for which I need to capture a stream of images and output them in 8-bit monochrome / grayscale format. The imager output is 12-bit monochrome (which takes 2 bytes per pixel).
In the code, I have an IntPtr to a memory buffer that has the 12-bit image data, from which I have to extract and convert that data down to an 8-bit image. This is represented in memory something like this (with a bright light activating the pixels):
As you can see, every second byte contains the LSB that I want to discard, thereby keeping only the odd-numbered bytes (to put it another way). The best solution I can conceptualize is to iterate through the memory, but that's the rub. I can't get that to work. What I need help with is an algorithm in C# to do this.
Here's a sample image that represents a direct creation of a Bitmap object from the IntPtr as follows:
bitmap = new Bitmap(imageWidth, imageHeight, imageWidth, PixelFormat.Format8bppIndexed, pImage);
// Failed Attempt #1
unsafe
{
IntPtr pImage; // pointer to buffer containing 12-bit image data from imager
int i = 0, imageSize = (imageWidth * imageHeight * 2); // two bytes per pixel
byte[] imageData = new byte[imageSize];
do
{
// Should I bitwise shift?
imageData[i] = (byte)(pImage + i) << 8; // Doesn't compile, need help here!
} while (i++ < imageSize);
}
// Failed Attempt #2
IntPtr pImage; // pointer to buffer containing 12-bit image data from imager
imageSize = imageWidth * imageHeight;
byte[] imageData = new byte[imageSize];
Marshal.Copy(pImage, imageData, 0, imageSize);
// I tried with and without this loop. Neither gives me images.
for (int i = 0; i < imageData.Length; i++)
{
if (0 == i % 2) imageData[i / 2] = imageData[i];
}
Bitmap bitmap;
using (var ms = new MemoryStream(imageData))
{
bitmap = new Bitmap(ms);
}
// This also introduced a memory leak somewhere.
Alternatively, if there's a way to do this with a Bitmap, byte[], MemoryStream, etc. that works, I'm all ears, but everything I've tried has failed.

Here is the algorithm that my coworkers helped formulate. It creates two new (unmanaged) pointers into the same buffer: one 8 bits wide and the other 16 bits wide.
By stepping through one word at a time and shifting off the last 4 bits of the source, we get a new 8-bit image with only the MSBs. Each buffer has the same number of words, but since the words are different sizes, they progress at different rates as we iterate over them.
unsafe
{
byte* p_bytebuffer = (byte*)pImage;
short* p_shortbuffer = (short*)pImage;
for (int i = 0; i < imageWidth * imageHeight; i++)
{
*p_bytebuffer++ = (byte)(*p_shortbuffer++ >> 4);
}
}
In terms of performance, this appears to be very fast with no perceivable difference in framerate.
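For comparison, a managed version of the same shift that avoids unsafe code could look like the sketch below. It assumes the raw buffer has first been copied into a byte array (for example with Marshal.Copy), that each pixel is a little-endian 16-bit word, and that the 12 significant bits are right-aligned in that word, as the shift by 4 above implies. The BitDepth class and Convert12To8 method are hypothetical names.

```csharp
using System;

// Hypothetical helper: converts 12-bit samples stored in 16-bit
// little-endian words into 8-bit samples by dropping the 4 low bits.
public static class BitDepth
{
    public static byte[] Convert12To8(byte[] raw)
    {
        if (raw.Length % 2 != 0)
            throw new ArgumentException("Expected two bytes per pixel");

        var result = new byte[raw.Length / 2];
        for (int i = 0; i < result.Length; i++)
        {
            // Rebuild the 16-bit word: low byte first, then high byte.
            int word = raw[2 * i] | (raw[2 * i + 1] << 8);
            // Keep only the 8 most significant of the 12 data bits.
            result[i] = (byte)(word >> 4);
        }
        return result;
    }
}
```

Unlike the in-place pointer version, this allocates a separate output array, so the source buffer is left untouched.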
Special thanks to @Herohtar for spending a substantial amount of time in chat with me attempting to help me solve this.

Get most similar image [duplicate]

This question already has answers here:
How can I measure the similarity between two images? [closed]
(17 answers)
Closed 5 years ago.
I have one Bitmap A and an array of Bitmaps; one Bitmap in the array looks the same as Bitmap A. I'm using the code below, but it sometimes doesn't work: it iterates over the entire array without finding a match, apparently because of minor differences. Is there a way to change the function to return true if the images are 90% similar, or to pick the most similar image in the array? The array has only 6 images.
for(int i = 0; i < list.Count;i++)
{
if(ImageCompareString(image,list[i]))
{
answerIndex = i;
break;
}
}
private static bool ImageCompareString(Bitmap firstImage, Bitmap secondImage)
{
MemoryStream ms = new MemoryStream();
firstImage.Save(ms, System.Drawing.Imaging.ImageFormat.Png);
String firstBitmap = Convert.ToBase64String(ms.ToArray());
ms.Position = 0;
secondImage.Save(ms, System.Drawing.Imaging.ImageFormat.Png);
String secondBitmap = Convert.ToBase64String(ms.ToArray());
if (firstBitmap.Equals(secondBitmap))
{
return true;
}
else
{
return false;
}
}

Of course there is such way... But you have to code it yourself.
First, you should not compare the base64 data. You'll lose direct pixel value access and increase the size of the data to compare by more than 150% (originally 200%, but corrected thanks to PeterDuniho's comment) in C# due to UTF-16.
Second, I assume that all pictures have the same fixed size. Before comparing, reduce the image size to something really small, but keep the width/height aspect ratio. This will speed up the comparison and also eliminate noise.
Third, iterate over both pictures and compare their grayscale pixel values. I assume that you have resized the pictures to 16x16. Since we're comparing grayscale values, the value of one pixel is between 0 and 255, so two pixels can differ by at most 255. The maximum distance between both pictures is therefore 16 * 16 * 255 = 65280. If both pictures are black, the distance between the pictures will be zero (100% similarity). If one picture is black and the other is white, the distance will be 65280 (0% similarity).
To compare the images, iterate over the pixels, subtract the grayscale value of picture A from the grayscale value of picture B at each point x,y, and add the absolute difference to a counter. This counter will be the total distance between both pictures.
Let's assume this counter has a value of 1000 after the comparison loop: the difference is 1000 / 65280 ~ 1.5%, so the pictures are about 98.5% similar.
pseudo-compare-code
long counter = 0;
// Maximum possible distance: every pixel differs by the full grayscale range (255).
long total = (long)image.Width * image.Height * 255;
for (int x = 0; x < image.Width; x++)
{
    for (int y = 0; y < image.Height; y++)
    {
        var p1 = image.GetPixel(x, y);
        var p2 = otherImage.GetPixel(x, y);
        // Simple grayscale: average of the three color channels.
        var g1 = (p1.R + p1.G + p1.B) / 3;
        var g2 = (p2.R + p2.G + p2.B) / 3;
        counter += Math.Abs(g1 - g2);
    }
}
// Floating-point division; integer division would truncate the ratio to 0.
var similarity = 100.0 - (counter * 100.0 / total);
This is a more or less easy approach, but you have to test it with your scenario/images. Instead of comparing grayscale values you could also compare RGB values. Look for distance definitions like the Euclidean distance... Start and keep reading :)
EDIT
This is just a really basic approach that should explain how you can start comparing images. It does not take into account that there might be different image formats (jpeg, png, gif), color formats (indexed, 16bit, 24bit, 32bit) or images with different resolutions.
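The RGB alternative mentioned above could look roughly like the following. This is only a sketch under assumptions that are not in the original answer: the two images have already been scaled to the same size, and each pixel has been extracted into a packed 0xRRGGBB int. The names ImageDistance and RgbSimilarity are made up for illustration.

```csharp
using System;

public static class ImageDistance
{
    // Assumption: pixels are packed as 0xRRGGBB, one int per pixel,
    // and both arrays describe images of identical dimensions.
    public static double RgbSimilarity(int[] a, int[] b)
    {
        if (a.Length != b.Length || a.Length == 0)
            throw new ArgumentException("Images must be non-empty and equally sized.");

        // Largest possible per-pixel Euclidean distance: black vs. white.
        double maxPerPixel = Math.Sqrt(3 * 255.0 * 255.0);
        double total = 0;
        for (int i = 0; i < a.Length; i++)
        {
            int dr = ((a[i] >> 16) & 0xFF) - ((b[i] >> 16) & 0xFF);
            int dg = ((a[i] >> 8) & 0xFF) - ((b[i] >> 8) & 0xFF);
            int db = (a[i] & 0xFF) - (b[i] & 0xFF);
            total += Math.Sqrt(dr * dr + dg * dg + db * db);
        }
        // Scale the summed distance to a percentage of the maximum.
        return 100.0 - (total / (a.Length * maxPerPixel)) * 100.0;
    }
}
```

Whether this ranks your images better than the grayscale version is something you would have to check on your own data.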

How can I do a color or numeric replacement with bitwise/boolean logic

How can I do a color replace like the code below without using the if statement, using boolean algebra instead (or some other magic that will not introduce conditional logic)?
The Problem (excuse the code):
private Image ReplaceRectangleColors(Bitmap b,
Rectangle rect,
Color oldColor,
Color newColor)
{
BitmapData bmData = b.LockBits(rect,
ImageLockMode.ReadWrite,
PixelFormat.Format24bppRgb);
int stride = bmData.Stride;
IntPtr Scan0 = bmData.Scan0;
byte red = 0;
byte blue = 0;
byte green = 0;
unsafe
{
byte * p = (byte *)(void *)Scan0;
int nOffset = stride - rect.Width *3;
for(int y=0; y < rect.Height; ++y)
{
for(int x=0; x < rect.Width; ++x )
{
// Format24bppRgb stores each pixel in memory as blue, green, red.
blue = p[0];
green = p[1];
red = p[2];
if (red == oldColor.R
&& blue == oldColor.B
&& green == oldColor.G)
{
p[0] = newColor.B;
p[1] = newColor.G;
p[2] = newColor.R;
}
p += 3;
}
p += nOffset;
}
}
b.UnlockBits(bmData);
return (Image)b;
}
The problem I have is that if the image is huge, this code gets executed many times and has poor performance. I know there has to be a way to substitute the color replacement with something much cleaner/faster. Any ideas?
Just to summarize and simplify, I want to turn
if (red == oldColor.R
&& blue == oldColor.B
&& green == oldColor.G)
{
red = newColor.R;
blue = newColor.B;
green = newColor.G;
}
into a bit operation that doesn't include an if statement.

There aren't any bitwise operations that will replace pixels of one colour with another for you. In fact, reading a pixel, applying a bitwise operation and writing back the results for every pixel will probably work out slower than reading a pixel and only doing any work on it and writing it back if it matches your target colour.
However, there are some things that can be done to speed up the code, with increasing levels of complexity:
1) The first thing you could do is not to read the 3 bytes before you do the compare. If you read each byte only as it is needed for the comparison, then in the case that the red byte doesn't match, there isn't any need to read or compare the Green/Blue bytes. (The optimiser may well work this out on your behalf though)
2) Use cache coherence by accessing the data in the address-order that it is stored in. (You're doing this by working on the scanlines by putting x in your inner loop).
3) Use multithreading. Break the image into (e.g.) 4 strips, and process them in parallel, and you should be able to get a "several times" speedup if you have a 4+ core processor.
4) You may be able to work several times faster by using a 32-bit or 64-bit value instead of four or eight 8-bit values. This is because fetching one byte from memory might take a similar time (give or take some cache coherence etc) to fetching an entire CPU register (4 or 8 bytes). Once you have the value in a register, you can do a single comparison (RGBA) rather than four (R, G, B, A bytes separately), and then a single write back - potentially as much as 4x faster. This is the easy case (for 32-bpp images), as they conveniently fit one-pixel-per-int, so you can use a 32-bit integer to read/compare/write an entire RGBA pixel in a single operation.
But for other image depths you will have a much harder case, as the number of bytes in each pixel will not exactly match the size of your 32-bit int. For example, for 24bpp images, you will need to read three 32-bit dwords (12 bytes) so that you can then process four pixels (3 bytes x 4 = 12) on each iteration of your loop. You will need to use bitwise operations to peel apart these 3 ints and compare them to your 'oldcolour' (see below). An added complication is that you must be careful not to run off the end of each scanline if you are processing it in 4-pixel jumps. A similar process applies to using 64-bit longs, or processing lower bpp images - but you will have to start doing more intricate bit-wise operations to pull the data out cleanly, and it can get pretty complicated.
So how do you compare the pixels?
The first pixel is easy.
uint oldColour = 0x00112233; // e.g. R=33, G=22, B=11
uint newColour = 0x00445566;
uint chunk1 = scanline[i]; // Treating scanline as an array of uint, read 3 values (12 bytes)
uint chunk2 = scanline[i+1]; // We cache them in locals as we will read/write several times
uint chunk3 = scanline[i+2];
if ((chunk1 & 0x00ffffff) == oldColour) // read and check 3 bytes of pixel
    chunk1 = (chunk1 & 0xff000000) | newColour; // Write back 3 bytes of pixel
The next pixel has one byte in the first int, and 2 bytes in the next int:
if ((chunk1 >> 24) == (oldColour & 0xff)) // Does R byte match?
{
    if ((chunk2 & 0x0000ffff) == (oldColour >> 8)) // Do G, B bytes match?
    {
        chunk1 = (chunk1 & 0x00ffffff) | (newColour << 24); // Replace R byte in chunk1
        chunk2 = (chunk2 & 0xffff0000) | (newColour >> 8); // Replace G, B bytes in chunk2
    }
}
Then the third pixel has 2 bytes (RG) in chunk2 and 1 byte (B) in chunk3:
if ((chunk2 >> 16) == (oldColour & 0xffff))
{
    if ((chunk3 & 0xff) == (oldColour >> 16))
    {
        chunk2 = (chunk2 & 0x0000ffff) | (newColour << 16); // Replace RG bytes in chunk2
        chunk3 = (chunk3 & 0xffffff00) | (newColour >> 16); // Replace B byte in chunk3
    }
}
And finally, the last 3 bytes in chunk3 are the last pixel:
if ((chunk3 >> 8) == oldColour)
    chunk3 = (chunk3 & 0x000000ff) | (newColour << 8);
... and then write back the chunks to the scanline buffer.
That's the gist of it (and my masking/combining above may have some bugs, as I wrote the example code quickly and may have mixed up some of the pixels!).
Of course, once it works, you can then optimise it a load more. For example, whenever I compare stuff to parts of the oldColour (e.g. oldColour >> 16), I can precalculate that constant outside the entire processing loop, and just use an "oldColourShiftedRight16" variable to avoid recalculating it on every pass through the loop. The same goes for all the bits of newColour that are used. Potentially you may be able to make some gains by avoiding writing back the values that haven't been touched, too, as many of your pixels probably won't match the one you want to change.
So that should give you some idea of what you were asking for. It's not particularly simple, but it's a great deal of fun :-)
When you've got it all written and super-optimised, then the final step is to throw it away and just use your graphics card to do the whole thing a bazillion times faster in hardware - but let's face it, where's the fun in that? :-)
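For the easy 32-bpp case from point 4, combined with the parallel processing from point 3, a minimal sketch might look like this. It assumes the pixels have already been copied into an int array with one 32-bit value per pixel and no row padding; ColorReplace and Replace32 are hypothetical names, and Parallel.For over rows is used here in place of four hand-made strips.

```csharp
using System.Threading.Tasks;

public static class ColorReplace
{
    // Assumption: 'pixels' holds one 32-bit ARGB value per pixel,
    // row after row, with no padding between rows.
    public static void Replace32(int[] pixels, int width, int height,
                                 int oldColor, int newColor)
    {
        // Rows do not overlap, so they can be processed in parallel.
        Parallel.For(0, height, y =>
        {
            int rowStart = y * width;
            for (int x = 0; x < width; x++)
            {
                // One comparison covers all four channel bytes at once.
                if (pixels[rowStart + x] == oldColor)
                    pixels[rowStart + x] = newColor;
            }
        });
    }
}
```

Note that the comparison includes the alpha byte, so oldColor has to carry the same alpha value as the pixels you want to match.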

I wrote a project recently where I did color manipulation on a pixel per pixel basis. It had to run fast as it would update while you moved a mouse cursor around.
I started with unsafe code, but I don't like unsafe code, so I moved to safe territory. When I did, I had the same speed issues you have, but the solution wasn't changing the conditional logic. It was designing better algorithms for the pixel manipulation.
I'll give you an overview of what I did and I'm hoping it can get you where you want to be because it's really close.
First: I had multiple possible input pixel formats. Due to that I couldn't assume the RGB bytes were at specific offsets or even a static width. As such, I read the info from the passed in image and return a "color" that represents the sizes of each field:
private System.Drawing.Color GetOffsets(System.Drawing.Imaging.PixelFormat PixelFormat)
{
//Alpha contains bytes per color,
// R contains R offset in bytes
// G contains G offset in bytes
// B contains B offset in bytes
switch(PixelFormat)
{
case System.Drawing.Imaging.PixelFormat.Format24bppRgb:
return System.Drawing.Color.FromArgb(3, 0, 1, 2);
case System.Drawing.Imaging.PixelFormat.Format32bppArgb:
case System.Drawing.Imaging.PixelFormat.Format32bppPArgb:
return System.Drawing.Color.FromArgb(4, 1, 2, 3);
case System.Drawing.Imaging.PixelFormat.Format32bppRgb:
return System.Drawing.Color.FromArgb(4, 0, 1, 2);
case System.Drawing.Imaging.PixelFormat.Format8bppIndexed:
return System.Drawing.Color.White;
default:
return System.Drawing.Color.White;
}
}
For example purposes, let's say that a 24-bit RGB image is the source. I didn't want to change alpha values as I'm going to blend a color in to it.
Thus, R is at offset 0, G is at offset 1 and B is at offset 2, and each pixel is three bytes wide. Thus I create a temporary Color with this data.
Next, since this is in a custom control, I didn't want flickering so I overrode the OnPaintBackground and turned it off:
protected override void OnPaintBackground(System.Windows.Forms.PaintEventArgs pevent)
{
//base.OnPaintBackground(pevent);
}
Finally, and here's the part that gets to the crux of what you're doing, I draw a new image on each OnPaint (which is triggered as a mouse moves because I "Invalidate" it in the mouse move event handler)
Full code - before I call certain sections out ...
protected override void OnPaint(System.Windows.Forms.PaintEventArgs pe)
{
base.OnPaint(pe);
pe.Graphics.FillRectangle(new System.Drawing.SolidBrush(this.BackColor), pe.ClipRectangle);
System.Drawing.Rectangle DestinationRect = GetDestinationRectangle(pe.ClipRectangle);
if(DestinationRect != System.Drawing.Rectangle.Empty)
{
System.Drawing.Image BlendedImage = (System.Drawing.Image) this.Image.Clone();
if(HighlightRegion != System.Drawing.Rectangle.Empty && this.Image != null)
{
System.Drawing.Rectangle OffsetHighlightRegion =
new System.Drawing.Rectangle(
new System.Drawing.Point(
Math.Min(Math.Max(HighlightRegion.X + OffsetX, 0), BlendedImage.Width - HighlightRegion.Width -1),
Math.Min(Math.Max(HighlightRegion.Y + OffsetY, 0), BlendedImage.Height - HighlightRegion.Height -1)
)
, HighlightRegion.Size
);
System.Drawing.Bitmap BlendedBitmap = (System.Drawing.Bitmap) BlendedImage;
System.Drawing.Color OffsetRGB = GetOffsets(BlendedImage.PixelFormat);
byte BlendR = SelectionColor.R;
byte BlendG = SelectionColor.G;
byte BlendB = SelectionColor.B;
byte BlendBorderR = SelectionBorderColor.R;
byte BlendBorderG = SelectionBorderColor.G;
byte BlendBorderB = SelectionBorderColor.B;
if(OffsetRGB != System.Drawing.Color.White) //White means not supported
{
int BitWidth = OffsetRGB.G - OffsetRGB.R;
System.Drawing.Imaging.BitmapData BlendedData = BlendedBitmap.LockBits(new System.Drawing.Rectangle(0, 0, BlendedBitmap.Width, BlendedBitmap.Height), System.Drawing.Imaging.ImageLockMode.ReadWrite, BlendedBitmap.PixelFormat);
int StrideWidth = BlendedData.Stride;
int BytesPerColor = OffsetRGB.A;
int ROffset = BytesPerColor - (OffsetRGB.R + 1);
int GOffset = BytesPerColor - (OffsetRGB.G + 1);
int BOffset = BytesPerColor - (OffsetRGB.B + 1);
byte[] BlendedBytes = new byte[Math.Abs(StrideWidth) * BlendedData.Height];
System.Runtime.InteropServices.Marshal.Copy(BlendedData.Scan0, BlendedBytes, 0, BlendedBytes.Length);
//Create Highlighted Region
for(int Row = OffsetHighlightRegion.Top ; Row <= OffsetHighlightRegion.Bottom ; Row++)
{
for(int Column = OffsetHighlightRegion.Left ; Column <= OffsetHighlightRegion.Right ; Column++)
{
int Offset = Row * StrideWidth + Column * BytesPerColor;
if(Row == OffsetHighlightRegion.Top || Row == OffsetHighlightRegion.Bottom || Column == OffsetHighlightRegion.Left || Column == OffsetHighlightRegion.Right)
{
BlendedBytes[Offset + ROffset] = BlendBorderR;
BlendedBytes[Offset + GOffset] = BlendBorderG;
BlendedBytes[Offset + BOffset] = BlendBorderB;
}
else
{
BlendedBytes[Offset + ROffset] = (byte) ((BlendedBytes[Offset + ROffset] + BlendR) >> 1);
BlendedBytes[Offset + GOffset] = (byte) ((BlendedBytes[Offset + GOffset] + BlendG) >> 1);
BlendedBytes[Offset + BOffset] = (byte) ((BlendedBytes[Offset + BOffset] + BlendB) >> 1);
}
}
}
System.Runtime.InteropServices.Marshal.Copy(BlendedBytes, 0, BlendedData.Scan0, BlendedBytes.Length);
BlendedBitmap.UnlockBits(BlendedData);
//base.Image = (System.Drawing.Image) BlendedBitmap;
}
}
pe.Graphics.DrawImage(BlendedImage, 0, 0, DestinationRect, System.Drawing.GraphicsUnit.Pixel);
}
}
Going through the code here are some explanations...
System.Drawing.Image BlendedImage = (System.Drawing.Image) this.Image.Clone();
It is important to draw to an offscreen image - this creates one such image. Otherwise, the drawing will be much slower.
if(HighlightRegion != System.Drawing.Rectangle.Empty && this.Image != null)
HighlightRegion is a RECT that holds the area to "mark off" on the source image. I have used this to mark off image regions of 4 Million pixels and it still runs fast enough to be "real time"
Some code below is used because a user might be scrolled over or down on the image so I modify my destination by their scrolling amount.
Below that, I cast the IMAGE to a BITMAP and get the before-mentioned Color info which I'll need to start using now. Depending on what you're doing you might want to cache that instead of getting it each time.
System.Drawing.Bitmap BlendedBitmap = (System.Drawing.Bitmap) BlendedImage;
On my control, I exposed two Color properties - SelectionColor and SelectionBorderColor - so that my regions still have a nice border with them. Part of my speed optimization was to pre-cast these to bytes as I'll be doing bitwise operations in a moment.
You'll see a comment in the code, "White means not supported". In this case, "White" is the "fake Color" we use to store our byte widths. I used "White" to mean "I can't operate on this data".
The next line checks that each color channel is indeed one byte wide, which depends on the target pixel format, by subtracting the R offset from the G offset. Note that if you cannot guarantee that your G follows your R then you'll need to use something else. In my case, it was guaranteed.
Now the part you're really looking for starts. I use LockBits to get the bit data. After that, I use the data to finish setting up some pre-loop variables.
And then, I copy the data to a byte array. I'm going to loop through this byte array, change the values and then copy its data back to the BITMAP. Before that, I was working on the BITMAP directly, thinking that since it's offscreen it would be just as fast as working with a native array.
I was wrong. Performance profiling proved it to me. It's faster to copy everything to a byte array and work within that.
Now the loop starts. It goes row by row, column by column. Offset is a number telling us where in the byte array we are in terms of "current pixel".
Then, I blend 50% or I draw a border. Note that for each pixel I have not only an IF statement, but also OR checks.
And it's still fast as blazes.
Finally, I copy back and unlock the bits. And then copy the image to the onscreen surface.
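Stripped of the control and the border handling, the offset arithmetic and the 50% blend on the copied byte array can be sketched as below. This is a simplified illustration, not the code from my control: it assumes a 24-bit buffer with bytes in B, G, R order, inclusive region bounds, and the made-up names RegionBlend and Blend50.

```csharp
public static class RegionBlend
{
    // Assumptions: 3 bytes per pixel stored as B, G, R; 'stride' is the
    // number of bytes per row (including padding); bounds are inclusive.
    public static void Blend50(byte[] bytes, int stride,
                               int left, int top, int right, int bottom,
                               byte r, byte g, byte b)
    {
        for (int row = top; row <= bottom; row++)
        {
            for (int col = left; col <= right; col++)
            {
                // Index of the first byte of the current pixel.
                int offset = row * stride + col * 3;
                // Average the existing channel with the blend color.
                bytes[offset]     = (byte)((bytes[offset] + b) >> 1);
                bytes[offset + 1] = (byte)((bytes[offset + 1] + g) >> 1);
                bytes[offset + 2] = (byte)((bytes[offset + 2] + r) >> 1);
            }
        }
    }
}
```

The array would come from Marshal.Copy out of the locked bitmap and be copied back the same way afterwards, as in the full code above.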


OpenGl 16 bit display via Tao/C# - c#

I've never worked in depths higher (deeper) than 8bit per channel, but here's what I'd try first: Turn off filtering on the texture and see how it affects the output. Set texturing glHints to best quality.

You could consider using a single-channel floating-point texture through one of the GL_ARB_texture_float, GL_ATI_texture_float or GL_NV_float_buffer extensions, if the hardware supports it. I can't recall whether GL 1.4 has floating-point textures or not, though.
