So I have, from an external native library (C++), a pixel buffer that appears to be in 16-bit RGB (the SlimDX equivalent is B5G6R5_UNorm).
I want to display the image represented by this buffer using Direct2D, but Direct2D does not support B5G6R5_UNorm, so I need to convert the pixel buffer to B8G8R8A8_UNorm.
I have seen various code snippets for this kind of task using bit-shifting methods, but none of them were specific to my needs or formats. It doesn't help that I have zero, nada, none, zilch clue about bit shifting or how it is done.
What I am after is a C# code example of such a task, or any built-in method to do the conversion; I don't mind using other libraries.
Please note: I know this can be done using the C# Bitmap classes, but I am trying not to rely on those built-in classes (there is something about GDI I don't like). The images (in the form of pixel buffers) will be coming in thick and fast, and I am choosing SlimDX for its ease of use and performance.
The reason I believe I need to convert the pixel buffer: if I draw the image as B8G8R8A8_UNorm without converting, it has a green cast over it and the pixels are just all over the place, hence my belief that I first need to convert (or 'upgrade') the pixel buffer to the required format. Just to add: when I do the above without converting the buffer, the image also doesn't fill the entire geometry.
The pixel buffers are provided as byte[] objects.
Bit shifting and logical operators are really useful when dealing with image formats, so you owe it to yourself to read more about it. However, I can give you a quick run-down of what this pixel format represents, and how to convert from one to another. I should preface my answer with a warning that I really don't know C# and its support libraries all that well, so there may be an in-box solution for you.
First of all, your pixel buffer has the format B5G6R5_UNORM. So we've got 16 bits (5 red, 6 green, and 5 blue) assigned to each pixel. We can visualize the bit layout of this pixel format as "RRRRRGGGGGGBBBBB", where 'R' stands for bits that belong to the red channel, 'G' for bits that belong to the green channel, and 'B' for bits that belong to the blue channel.
Now, let's say the first 16 bits (two bytes) of your pixel buffer are 1111110100101111. Line that up with the bit layout of your pixel format...
RRRRRGGGGGGBBBBB
1111110100101111
This means the red channel has bits 11111, green has 101001, and blue has 01111. Converting from binary to decimal: red=31, green=41, and blue=15. You'll notice the red channel has all bits set to 1, but its value (31) is actually smaller than the green channel's (41). However, this doesn't mean the color is more green than red when displayed; the green channel has an extra bit, so it can represent more values than the red and blue channels, but in this particular example there is actually more red in the output color! That's where the UNORM part comes in...
UNORM stands for unsigned normalized integer; this means the color channel values are to be interpreted as evenly spaced floating-point numbers from 0.0 to 1.0. The values are normalized by the number of bits allocated. What does that mean, exactly? Let's say you had a format with only 3 bits to store a channel. This means the channel can have 2^3=8 different values, which are shown below with the respective decimal, binary, and normalized representations. The normalized value is just the decimal value divided by the largest possible decimal value that can be represented with N bits.
Decimal | Binary | Normalized
-----------------------------
0 | 000 | 0/7 = 0.000
1 | 001 | 1/7 =~ 0.142
2 | 010 | 2/7 =~ 0.285
3 | 011 | 3/7 =~ 0.428
4 | 100 | 4/7 =~ 0.571
5 | 101 | 5/7 =~ 0.714
6 | 110 | 6/7 =~ 0.857
7 | 111 | 7/7 = 1.000
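In code, that normalization is simply a division by the largest value the bit count allows. A minimal sketch (the helper name Normalize is mine, not from any library):

// Normalize an n-bit channel value to the 0.0-1.0 range.
static float Normalize(uint value, int bitCount)
{
    return value / (float)((1 << bitCount) - 1);
}

// Example from the table above: Normalize(5, 3) == 5/7 =~ 0.714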
Going back to the earlier example, where the pixel had bits 1111110100101111, we already know our decimal values for the three color channels: RGB = {31, 41, 15}. We want the normalized values instead, because the decimal values are misleading and don't tell us much without knowing how many bits they were stored in. The red and blue channels are stored with 5 bits, so the largest decimal value is 2^5-1=31; however, the green channel's largest decimal value is 2^6-1=63. Knowing this, the normalized color channels are:
// NormalizedValue = DecimalValue / MaxDecimalValue
R = 31 / 31 = 1.000
G = 41 / 63 =~ 0.650
B = 15 / 31 =~ 0.483
To reiterate, the normalized values are useful because they represent the relative contribution of each color channel in the output. Adding more bits to a given channel doesn't affect the range of possible colors; it simply improves color accuracy (more shades of that color channel, basically).
Knowing all of the above, you should be able to convert from any RGB(A) format, regardless of how many bits are stored in each channel, to any other RGB(A) format. For example, let's convert the normalized values we just calculated to B8G8R8A8_UNORM. This is easy once you have normalized values calculated, because you just scale by the maximum value in the new format. Every channel uses 8 bits, so the maximum value is 2^8-1=255. Since the original format didn't have an alpha channel, you would typically just store the max value (meaning fully opaque).
// OutputValue = InputValueNormalized * MaxOutputValue
B = 0.483 * 255 = 123.165
G = 0.650 * 255 = 165.75
R = 1.000 * 255 = 255
A = 1.000 * 255 = 255
There's only one thing missing now before you can code this. Way up above, I was able to pull out the bits for each channel just by lining them up and copying them. That's how I got the green bits 101001. In code, this can be done by shifting and by "masking" out the bits we don't care about. Shifting does exactly what it sounds like: it moves bits to the right or left. When you move bits to the right, the rightmost bit gets discarded and the new leftmost bit is assigned 0. Visualization below using the 16-bit example from above.
1111110100101111 // original 16 bits
0111111010010111 // shift right 1x
0011111101001011 // shift right 2x
0001111110100101 // shift right 3x
0000111111010010 // shift right 4x
0000011111101001 // shift right 5x
You can keep shifting, and eventually you'll end up with sixteen 0's. However, I stopped at five shifts for a reason. Notice now the 6 rightmost bits are the green bits (I've shifted/discarded the 5 blue bits). We've very nearly extracted the exact bits we need, but there's still the extra 5 red bits to the left of the green bits. To remove these, we use a "logical and" operation to mask out only the rightmost 6 bits. The mask, in binary, is 0000000000111111; 1 means we want the bit, and 0 means we don't want it. The mask is all 0's except for the last 6 positions, because we only want the last 6 bits. Line this mask up with the 5x shifted number, and the output is 1 when both bits are 1, and 0 for every other bit:
0000011111101001 // original 16 bits shifted 5x to the right
0000000000111111 // bit mask to extract the rightmost 6 bits
------------------------------------------------------------
0000000000101001 // result of the 'logical and' of the two above numbers
The result is exactly the number we're looking for: the 6 green bits and nothing else. Recall that the leading 0's have no effect on the decimal value (it's still 41). It's very simple to do the 'shift right' (>>) and 'logical and' (&) operations in C# or any other C-like language. Here's what it looks like in C#:
// 0xFD2F is 1111110100101111 in binary
uint pixel = 0xFD2F;
// 0x1F is 00011111 in binary (5 rightmost bits are 1)
uint mask5bits = 0x1F;
// 0x3F is 00111111 in binary (6 rightmost bits are 1)
uint mask6bits = 0x3F;
// shift right 11x (discard 5 blue + 6 green bits), then mask 5 bits
uint red = (pixel >> 11) & mask5bits;
// shift right 5x (discard 5 blue bits), then mask 6 bits
uint green = (pixel >> 5) & mask6bits;
// mask 5 rightmost bits
uint blue = pixel & mask5bits;
Putting it all together, you might end up with a routine that looks similar to this. Do be careful to read up on endianness, however, to make sure the bytes are ordered in the way you expect. In this case, the parameter is a 16-bit unsigned integer holding one B5G6R5 pixel:
byte[] B5G6R5toB8G8R8A8(UInt16 input)
{
    return new byte[]
    {
        (byte)((input & 0x1F) / 31.0f * 255),         // blue
        (byte)(((input >> 5) & 0x3F) / 63.0f * 255),  // green
        (byte)(((input >> 11) & 0x1F) / 31.0f * 255), // red
        255                                           // alpha
    };
}
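Since the question's pixels arrive as byte[] buffers rather than individual UInt16 values, you would apply that routine across the whole buffer. Below is a minimal sketch of what that could look like, assuming tightly packed pixels with little-endian byte order within each 16-bit value; the function name and the integer rescaling are my own, and I haven't verified this against SlimDX:

// Converts a whole B5G6R5 buffer to B8G8R8A8 (sketch, untested).
static byte[] ConvertBufferB5G6R5ToB8G8R8A8(byte[] input)
{
    int pixelCount = input.Length / 2;
    var output = new byte[pixelCount * 4];

    for (int i = 0; i < pixelCount; i++)
    {
        // Combine two bytes into one 16-bit pixel (little-endian).
        ushort pixel = (ushort)(input[2 * i] | (input[2 * i + 1] << 8));

        // Extract each channel and rescale from 5/6 bits to 8 bits.
        output[4 * i + 0] = (byte)((pixel & 0x1F) * 255 / 31);         // blue
        output[4 * i + 1] = (byte)(((pixel >> 5) & 0x3F) * 255 / 63);  // green
        output[4 * i + 2] = (byte)(((pixel >> 11) & 0x1F) * 255 / 31); // red
        output[4 * i + 3] = 255;                                       // alpha (opaque)
    }
    return output;
}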
Following up on this question.
I've managed to extract pixel information off a bitmap instance using Bitmap.LockBits. The PixelFormat is Format48bppRgb which, based on my understanding and Peter's answer on the aforementioned question, should store each pixel's RGB color channels using two bytes per channel. The bytes' combined value should be a number between 0 and 8192, representing an RGB channel intensity value. That value is obtained by passing both bytes to the BitConverter.ToUInt16 method.
So upon extracting pixel information for a 3 x 3 test bitmap (shown magnified in the original post for clarity), I get the first row of pixel channels going like this:
Pixel color | Red intensity | Green intensity | Blue intensity
--------------------------------------------------------------
RED         | 8192          | 0               | 0
GREEN       | 0             | 8192            | 0
BLUE        | 0             | 0               | 8192
So far so good.
On the second row, however, it goes like this:
Pixel color | Red intensity | Green intensity | Blue intensity
--------------------------------------------------------------
WHITE       | 8192          | 8192            | 8192
GRAY (!)    | 1768 (!)      | 1768 (!)        | 1768 (!)
BLACK       | 0             | 0               | 0
The white and black pixel channel values make sense to me.
The gray, however, doesn't.
If you use any color picker on the gray pixel above, you should get a perfectly medium gray. In 24-bit color it should be the equivalent of (R: 128, G: 128, B: 128), or #808080 in hexadecimal form.
Then how come, in the Format48bppRgb pixel format, the channel intensity is well below the expected middle value of 4096? Isn't this GDI+-based range of 0-8192 supposed to work like its 8-bit counterpart, with 0 being the lowest intensity and 8192 the highest? What am I missing here?
For reference, the original post includes a screen capture from the Visual Studio debugger showing the raw byte indexes and values, with additional notes on the stride, channel positions, and their extracted intensity values, up to the gray pixel.
The part where you state an incorrect assumption is that #808080 is "perfectly medium gray". It is not, at least not if you look at it in a certain way (more about it here).
Many color standards, including sRGB, use gamma compression to make darker colors more spaced out in the range of 256 values normally used to store RGB colors. This roughly means taking a square root (or 2.2-root) of the relative component value before encoding it. The reason is that the human eye (like other senses) perceives brightness logarithmically, thus it is important to represent even an arithmetically small change in brightness if it would mean actually doubling it, for example.
The byte value of 128 is actually about 21.95% of full brightness ((128/255)^2.2), which is what you're seeing in the case of 16-bit components. The space of possible values there is much larger, thus GDI (or the format) doesn't need to store them in a special way anymore.
In case you need an algorithm, taking the 2.2-root of the value works mostly well, but the correct formula is a bit different, see here. The root function normally has an infinite slope near zero, so the specific formula attempts to fix that by making that portion linear. A piece of code can be derived from that quite easily:
static byte TransformComponent(ushort linear)
{
    const double a = 0.055;
    var c = linear / 8192.0;
    c = c <= 0.0031308 ? 12.92 * c : (1 + a) * Math.Pow(c, 1 / 2.4) - a;
    return (byte)Math.Round(c * Byte.MaxValue);
}
This gives 128 for the value of 1768.
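Should you ever need to go the other way (from an 8-bit sRGB component back to the 0-8192 linear range), it is just the inverse of the same piecewise formula. A minimal sketch, under the same assumptions as above (the name InverseTransformComponent is mine):

static ushort InverseTransformComponent(byte srgb)
{
    const double a = 0.055;
    var c = srgb / 255.0;
    // Invert the piecewise sRGB encoding: linear segment below the
    // threshold, power curve above it (0.04045 = 12.92 * 0.0031308).
    c = c <= 0.04045 ? c / 12.92 : Math.Pow((c + a) / (1 + a), 2.4);
    return (ushort)Math.Round(c * 8192);
}

// InverseTransformComponent(128) gives back 1768.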
I am reading data back from an imaging camera system. The camera detects age, gender, etc., and one of the values that comes back is a confidence value. It is 2 bytes, given as an LSB and an MSB. I have tried converting these to integers and adding them together, but I don't get the value expected.
Is this the correct way to get a value using the LSB and MSB? I have not used this before.
Thanks
Your value is going to be:
Value = LSB + (MSB << 8);
Explanation:
A byte can only store 256 different values (0 to 255), whereas the combined int (for this example) is 16 bits.
The MSB is the left-hand^ side of the 16 bits, and as such needs to be shifted to the left so its bits occupy the high positions. You can then add the two values.
I would suggest looking up the shifting operators.
^ based on endianness (Intel/Motorola)
Assuming that MSB and LSB are most/least significant byte (rather than bit or any other expansion of that acronym), the value can be obtained by MSB * 256 + LSB.
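In C#, either formulation might look like this (a minimal sketch; note that BitConverter.ToUInt16 assumes the byte order in the array matches the machine's endianness, little-endian on typical Intel hardware):

byte lsb = 0x2F; // example values
byte msb = 0xFD;

// Shift the most significant byte into the high 8 bits, then combine.
int value = lsb | (msb << 8); // 0xFD2F = 64815

// Alternative using the framework, with the LSB first (little-endian).
ushort value2 = BitConverter.ToUInt16(new byte[] { lsb, msb }, 0);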
I seem to lack a fundamental understanding of calculating and using hex and byte values in C# (or programming in general).
I'd like to know how to calculate hex values and bytes (0x--) from sources such as strings and RGB colors (like, how do I figure out what the 0x code is for R255 G0 B0?).
Why do we use things like FF? Is it to compensate for the base-10 system to get a number like 10?
Hexadecimal is base 16, so instead of counting from 0 to 9, we count from 0 to F. And we generally prefix hex constants with 0x. Thus,
Hex Dec
-------------
0x00 = 0
0x09 = 9
0x0A = 10
0x0F = 15
0x10 = 16
0x200 = 512
A byte is the typical unit of storage for values on a computer, and on virtually all modern systems, a byte contains 8 bits. Note that bit actually means binary digit, so from this, we gather that a byte has a maximum value of 11111111 binary. That is 0xFF hex, or 255 decimal. Thus, one byte can be represented by a minimum of two hexadecimal characters. A typical 4-byte int is then 8 hex characters, like 0xDEADBEEF.
RGB values are typically packed as three byte values, in that order: R, G, B. Thus,
R=255 G=0 B=0 => R=0xFF G=0x00 B=0x00 => 0xFF0000 or #FF0000 (html)
R=66 G=0 B=248 => R=0x42 G=0x00 B=0xF8 => 0x4200F8 or #4200F8 (html)
For my hex calculations, I like to use python as my calculator:
>>> a = 0x427FB
>>> b = 700
>>> a + b
273079
>>>
>>> hex(a + b)
'0x42ab7'
>>>
>>> bin(a + b)
'0b1000010101010110111'
>>>
For the RGB example, I can demonstrate how we could use bit-shifting to easily calculate those values:
>>> R=66
>>> G=0
>>> B=248
>>>
>>> hex( R<<16 | G<<8 | B )
'0x4200f8'
>>>
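Since the question is about C#, the same packing might look like this there (a minimal sketch):

int r = 66, g = 0, b = 248;

// Pack the channels into one integer: R in bits 16-23, G in 8-15, B in 0-7.
int rgb = (r << 16) | (g << 8) | b;

// "X6" formats as hex, padded to six digits.
Console.WriteLine("0x" + rgb.ToString("X6")); // prints 0x4200F8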
Base-16 (also known as hex) notation is convenient because you can fit four bits in exactly one hex digit, making conversion to binary very easy, yet not requiring as much space as a full binary notation. This is useful when you need to represent bit-oriented data in a human-readable form.
Learning hex is easy - all you need to do is memorize a short table of 16 rows defining the hex-to-binary conversion:
0 - 0000
1 - 0001
2 - 0010
3 - 0011
4 - 0100
5 - 0101
6 - 0110
7 - 0111
8 - 1000
9 - 1001
A - 1010
B - 1011
C - 1100
D - 1101
E - 1110
F - 1111
With this table in hand, you can easily convert hex strings of arbitrary length to their corresponding bit patterns:
0x478FD105 - 01000111100011111101000100000101
Converting back is easy as well: group your binary digits by four, and use the table to make hex digits
0010 1001 0100 0101 0100 1111 0101 1100 - 0x29454F5C
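If you'd rather let the framework do this, the Convert class handles bases 2, 8, 10, and 16. A minimal sketch in C#:

// Parse a hex string, then print its binary representation.
long value = Convert.ToInt64("478FD105", 16);
Console.WriteLine(Convert.ToString(value, 2)); // 1000111100011111101000100000101

// And back again: parse binary, format as hex.
long back = Convert.ToInt64("1000111100011111101000100000101", 2);
Console.WriteLine(back.ToString("X")); // 478FD105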
In decimal, each digit is weighted 10 times more than the one to the right, for example the '3' in 32 is 3 * 10, and the '1' in 102 is 1 * 100. Binary is similar except since there are only two digits (0 and 1) each bit is only weighted twice as much as the one to the right. Hexadecimal uses 16 digits - the 10 decimal digits along with the letters A = 10 to F = 15.
An n-digit decimal number can represent values up to 10^n - 1 and similarly an n-digit binary number can represent values up to 2^n - 1.
Hexadecimal is convenient since you can express a single hex digit in 4 bits since 2^4 = 16 possible values can be represented in 4 bits.
You can convert binary to hex by grouping from the right 4 bits at a time and converting each group to the corresponding hex. For example 1011100 -> (101)(1100) -> 5C
The conversion from hex to binary is even simpler since you can simply expand each hex digit into the corresponding binary, for example 0xA4 -> 1010 0100
The answer to the actual question posted ("Why do we use things like FF, is it to compensate for the base 10 system to get a number like 10?") is this: computers use bits, meaning each digit is either 1 or 0.
The essence is similar to what Lee posted and called "positional notation". In a decimal number, each position in the number refers to a power of 10. For example, in the number 123, the last position represents 10^0 -- the ones. The middle position represents 10^1 -- the tens. And the first is 10^2 -- the hundreds. So the number "123" represents 1 * 100 + 2 * 10 + 3 * 1 = 123.
Numbers in binary use the same system. The number 10 (base 2) represents 1 * 2^1 + 0 * 2^0 = 2.
If you want to express the decimal number 10 in binary, you get the number 1010. That means, you need four bits to represent a single decimal digit.
But with four bits you can represent up to 16 different values, not just 10. If you need four bits per digit anyway, you might as well use numbers in base 16 instead of only base 10. That's where hexadecimal comes into play.
Regarding how to convert ARGB values: as has been written in other replies, converting between binary and hexadecimal is comparatively easy (4 binary digits = 1 hex digit).
Converting between decimal and hex is more involved, and at least for me it's been easier (if I have to do it in my head) to first convert the decimal into its binary representation, and then convert the binary number into hex. Google probably has tons of how-tos and algorithms for that.
(uint)Convert.ToInt32(elements[0]) << 24;
The << is the left shift operator.
Given that the number is a binary number, it will shift all the bits the specified amount to the left.
If we have
2 << 1
This will take the number 2 in binary (00000010) and shift it to the left one bit. This gives you 4 (00000100).
Overflows
Note that once you get to the very left, the bits are discarded. So assume you are working with an 8-bit integer (I know the C# uint in your example is 32 bits - I don't want to have to type out a 32-bit number, so just assume we are on 8 bits):
255 << 1
will return 254 (11111110).
Use
Being very careful of the overflows mentioned before, bit shifting is a very fast way to multiply or divide by 2. In a highly optimised environment (such as games) this is a very useful way to perform arithmetic very fast.
However, in your example, it is taking only the rightmost 8 bits of the number and making them the leftmost 8 bits (multiplying by 16,777,216, i.e. 2^24). Why you would want to do this, I could only guess.
I guess you are referring to Shift operators.
As Mongus Pong said, shifts are usually used to multiply and divide very fast. (And can cause weird problems due to overflow).
I'm going to go out on a limb and try to guess what your code is doing.
If elements[0] is a byte-sized element (that is to say, it contains only 8 bits), then this code results in a straightforward multiplication by 2^24. Otherwise, it will drop the 24 high-order bits and then multiply by 2^24.
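To illustrate the guess: if elements holds four byte-sized values parsed from strings, shifting each into its own position rebuilds one 32-bit value. A minimal sketch (the elements array here is hypothetical, just to make the snippet runnable):

string[] elements = { "18", "52", "86", "120" }; // hypothetical input (0x12, 0x34, 0x56, 0x78)

uint packed = ((uint)Convert.ToInt32(elements[0]) << 24)  // bits 24-31
            | ((uint)Convert.ToInt32(elements[1]) << 16)  // bits 16-23
            | ((uint)Convert.ToInt32(elements[2]) << 8)   // bits 8-15
            |  (uint)Convert.ToInt32(elements[3]);        // bits 0-7

Console.WriteLine(packed.ToString("X8")); // prints 12345678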
I have a 16-bit luminance value stored in two bytes, and I want to convert it to R, G, and B values. I have two questions: how do I convert those two bytes to a short, and, assuming the hue and saturation are 0, how do I turn that short into 8-bits-per-component RGB values?
(The Convert class doesn't have an option to take two bytes and output a short.)
If the 16-bit value is little-endian and unsigned, the 2nd byte would be the one you want to repeat 3x to create the RGB value, and you'd drop the other byte. Or, if you want the RGB in a 32-bit integer, you could either use bit shifts and adds or just multiply that 2nd byte by 0x10101.
Try something like this:
byte luminance_upper = 23;
byte luminance_lower = 57;
int luminance = ((int)luminance_upper << 8) | (int)luminance_lower;
That will give you a value between 0-65535.
Of course, if you just want to end up with a 32-bit ARGB (greyscale) colour from that, the lower byte isn't going to matter, because you're going from a 16-bit luminance to 8-bit R, G, B components.
byte luminance_upper = 255;

// assuming ARGB format
uint argb = (255u << 24)                  // alpha = 255
          | ((uint)luminance_upper << 16) // red
          | ((uint)luminance_upper << 8)  // green
          | (uint)luminance_upper;        // blue (hope my endianness is correct)
Depending on your situation, there may be a better way of going about it. For example, the above code does a linear mapping, however you may get better-looking results by using a logarithmic curve to map the 16-bit values to 8-bit RGB components.
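As a minimal sketch of that idea, here is a gamma (power-law) mapping, which is the usual way to approximate the perceptual curve mentioned above; the 1/2.2 exponent is a common convention, an assumption on my part rather than anything required:

// Map a 16-bit luminance to an 8-bit component with gamma encoding,
// preserving more detail in the darker range than a linear scale would.
static byte ToGamma8(ushort luminance16)
{
    double linear = luminance16 / 65535.0;
    return (byte)Math.Round(Math.Pow(linear, 1.0 / 2.2) * 255);
}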
You will need the other components too (hue and saturation).
Also, 16-bit sounds a bit weird. Normally those values are floating point.
There is a very good article on CodeProject for conversions of many color spaces.