C#: How to concatenate bits to create a UInt64?

I'm trying to create a hashing function for images in order to find similar ones from a database.
The hash is simply a series of bits (101110010) where each bit stands for one pixel. As there are about 60 pixels for each image, I assume it would be best to save this as a UInt64.
Now, when looping through each pixel and calculating each bit, how can I concatenate those and save them as a UInt64?
Thanks for your help.

Use some bit twiddling:
ulong mask = 0;
// For each bit that is set, given its position (0-63):
mask |= 1UL << position;
Note the 1UL: a plain 1 is an int, and an int shift count is masked to 5 bits, so 1 << position would wrap around for positions 32 and above.
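A minimal sketch of the whole loop, assuming a hypothetical IsPixelSet predicate standing in for whatever per-pixel test the real hash uses (it is not part of the original question):

```csharp
using System;

class HashDemo
{
    // Stand-in for the real per-pixel predicate (assumption, not from the question).
    static bool IsPixelSet(int index) => index % 3 == 0;

    static ulong BuildHash(int pixelCount)
    {
        ulong hash = 0;
        for (int position = 0; position < pixelCount; position++)
        {
            if (IsPixelSet(position))
                hash |= 1UL << position;  // 1UL: an int literal's shift count is masked to 5 bits
        }
        return hash;
    }

    static void Main()
    {
        // Positions 0 and 3 are set -> binary 1001 -> 9
        Console.WriteLine(BuildHash(6)); // 9
    }
}
```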

You use bitwise operators like this:
ulong it1 = 0;
byte b1 = 0x24;
byte b2 = 0x36;
...
it1 = ((ulong)b1 << 48) | ((ulong)b2 << 40) | ((ulong)b3 << 32) ...;
(C# has no ubyte type, and the casts to ulong are needed because byte operands are promoted to int before shifting, and an int shift count is masked to 5 bits.)
Alternatively, you can use BitConverter.ToUInt64() to quickly convert a byte array to a UInt64. But are you sure the target is 8 bytes long?
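For reference, a small sketch of the BitConverter route; the byte values here are made up:

```csharp
using System;

class BitConverterDemo
{
    static void Main()
    {
        // BitConverter.ToUInt64 needs 8 bytes available from the start index,
        // hence the "are you sure the target is 8 bytes long?" caveat.
        byte[] bytes = { 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
        ulong value = BitConverter.ToUInt64(bytes, 0);
        // On little-endian machines the first byte is the least significant.
        Console.WriteLine(value); // 16 on little-endian platforms
    }
}
```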

Related

Modify specific bit in byte

I need to modify (!not toggle XOR!) specific bit in byte value. I have:
source byte (e.g. b11010010);
index of bit to modify (e.g. 4);
new value of bit (0 or 1).
Now, what I need: if the new value is 0, then bit[4] must be set to 0; if the new value is 1, then bit[4] must be set to 1.
General part:
var bitIndex = 4;
var value = 0b11010010;   // 'byte' is a keyword, and binary literals use the 0b prefix
var mask = 1 << bitIndex;
var newValue = 1;
This is the easiest way to do this:
if (newValue == 1)
    value |= mask;    // set bit[bitIndex]
else
    value &= ~mask;   // drop bit[bitIndex]
Another way avoids the if/else statement, but looks harder to understand:
value = value & ~mask | (newValue << bitIndex) & mask;
Here, first AND drops bit[bitIndex], second AND calculates new value for bit[bitIndex], and OR set bit[bitIndex] to calculated value, not matter is it 0 or 1.
Is there any easier way to set specific bit into given value?
(newValue << bitIndex) only has a single bit set, there's no need for & mask.
So you have just 5 operations.
value = value & ~(1 << bitIndex) | (newValue << bitIndex); // bitIndex'th bit becomes newValue
It's still complex enough to be worth a comment, but it's easy to see that the comment is correct, because the expression is just two easily recognized operations chained together (unlike the current accepted answer, which requires every reader to sit down and think about it for a minute).
The canonical way to do this is:
value ^= (-newValue ^ value) & (1 << n);
Bit number n will be set if newValue == 1, and cleared if newValue == 0.
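A quick sketch comparing the three approaches from this thread side by side (if/else, the branchless mask form, and the XOR trick); the method names are mine, and all three agree on every input:

```csharp
using System;

class SetBitDemo
{
    static int SetBitIfElse(int value, int bitIndex, int newValue)
    {
        int mask = 1 << bitIndex;
        if (newValue == 1) return value | mask;
        return value & ~mask;
    }

    static int SetBitBranchless(int value, int bitIndex, int newValue)
    {
        // Clear the bit, then OR in the new value at that position.
        return value & ~(1 << bitIndex) | (newValue << bitIndex);
    }

    static int SetBitXor(int value, int bitIndex, int newValue)
    {
        // -newValue is all ones when newValue == 1, all zeros when it is 0.
        return value ^ ((-newValue ^ value) & (1 << bitIndex));
    }

    static void Main()
    {
        int source = 0b11010010;
        Console.WriteLine(SetBitIfElse(source, 4, 0) == SetBitBranchless(source, 4, 0));  // True
        Console.WriteLine(SetBitBranchless(source, 4, 1) == SetBitXor(source, 4, 1));     // True
        // Bit 4 of 11010010 is already 1, so setting it to 1 leaves the value unchanged.
        Console.WriteLine(Convert.ToString(SetBitXor(source, 4, 1), 2));                  // 11010010
    }
}
```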

Bit manipulation on large integers out of 'int' range

Ok, so let's start with a 32 bit integer:
int big = 536855551; // 00011111111111111100001111111111
Now, I want to set the last 10 bits to within this integer:
int little = 69; // 0001000101
So, my approach was this:
big = (big & 4294966272) & (little)
where 4294966272 is the first 22 bits, or 11111111111111111111110000000000.
But of course this isn't supported because 4294966272 is outside of the int range of 0x7FFFFFFF. Also, this isn't going to be my only operation. I also need to be able to set bits 11 through 14. My approach for that (with the same problem) was:
big = (big & 4294951935) | (little << 10)
So with the explanation out of the way, here is what I'm doing as alternatives for the above:
1: ((big >> 10) << 10) | (little)
2: (big & 1023) | ((big >> 14) << 14) | (little << 10)
I don't feel like my alternatives are the most efficient way I could go. Are there any better ways to do this?
Sidenote: If C# supported binary literals, '0b', this would be a lot prettier.
Thanks.
4294966272 should actually be -1024, which is represented as 11111111111111111111110000000000.
For example:
int big = 536855551;
int little = 69;
var thing = Convert.ToInt32("11111111111111111111110000000000", 2);
var res = (big & thing) & (little);
Though, the result will always be 0
00011111111111111100001111111111
&
00000000000000000000000001101001
&
11111111111111111111110000000000
Bit-shifting is usually faster than bit-shifting plus masking (that is, &). I have a test case for it.
You should go with your first alternative.
1: ((big >> 10) << 10) | (little)
Just beware of the difference between unsigned and signed int when it comes to right-shifting: >> on a signed int is arithmetic (it copies the sign bit), while on a uint it shifts in zeros.
Alternatively, you could define big and little as unsigned. Use uint instead of int.
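One way around the out-of-range literal, sketched with the numbers from the question, is an unchecked cast (0xFFFFFC00 is the same bit pattern as -1024):

```csharp
using System;

class MaskDemo
{
    static void Main()
    {
        int big = 536855551;   // 00011111111111111100001111111111
        int little = 69;       // 0001000101

        // unchecked lets the uint literal 0xFFFFFC00 (== -1024) be used as an int mask.
        int clearedLow10 = big & unchecked((int)0xFFFFFC00);

        // OR, not AND, to insert the new low 10 bits.
        int result = clearedLow10 | little;

        Console.WriteLine(result); // 536854597
    }
}
```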

Fastest way to XOR two specific bit indexes in two bytes

This is in C#. I was hoping I could do something like the following.
byte byte1 = 100;
byte byte2 = 100;
byte1[1] = byte1[1] ^ byte2[6]; // XOR bit at index 1 against bit at index 6
However, I am currently stuck at:
if ((byte2 ^ (byte)Math.Pow(2, index2)) < byte2)
    byte1 = (byte)(byte1 ^ (byte)Math.Pow(2, index1));
Is there a faster way, possibly something similar to what I typed at the top?
Edit:
I had never heard of any of the bitwise operators other than XOR. That's why the original solution had the bizarre Math.Pow() calls. I've already improved my solution considerably according to my benchmarking of millions of loop iterations. I'm sure I'll get it faster with more reading. Thanks to everybody that responded.
byte2 = (byte)(byte2 << (7 - index2));
if (byte2 > 127)
{
    byte buffer = (byte)(1 << index1);
    byte1 = (byte)(byte1 ^ buffer);
}
Bytes are immutable, you can't change a bit of the byte as if it was an array. You'd need to access the bits through masks (&) and shifts (<< >>), then create a new byte containing the result.
// result bit is the LSB of r
byte r = (byte)((byte1 >> 1 & 1) ^ (byte2 >> 6 & 1));
The specific mask 1 will erase any bit except the right most (LSB).
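Putting the mask-and-shift idea together, a hedged sketch of a general helper (the XorBits name and the write-back-into-byte1 behavior are mine, not from the thread):

```csharp
using System;

class XorBitsDemo
{
    // XOR the bit of b1 at index1 with the bit of b2 at index2,
    // storing the result back into b1's bit at index1.
    static byte XorBits(byte b1, byte b2, int index1, int index2)
    {
        int bit1 = (b1 >> index1) & 1;   // isolate bit index1 of b1
        int bit2 = (b2 >> index2) & 1;   // isolate bit index2 of b2
        int result = bit1 ^ bit2;
        // Clear bit index1, then set it to the XOR result.
        return (byte)(b1 & ~(1 << index1) | (result << index1));
    }

    static void Main()
    {
        byte byte1 = 100;   // 01100100: bit 1 is 0
        byte byte2 = 100;   // 01100100: bit 6 is 1
        // 0 ^ 1 = 1, so bit 1 of byte1 gets set: 100 -> 102
        Console.WriteLine(XorBits(byte1, byte2, 1, 6)); // 102
    }
}
```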

Bit manipulation in C# using a mask

I need a little help with bitwise operations in C#.
I want to take a UInt16, isolate an arbitrary number of bits, and set them using another UInt16 value.
Example:
10101010 -- Original Value
00001100 -- Mask - Isolates bits 2 and 3
Input Output
00000000 -- 10100010
00000100 -- 10100110
00001000 -- 10101010
00001100 -- 10101110
^^
It seems like you want:
(orig & ~mask) | (input & mask)
The first half zeroes the bits of orig which are in mask. Then you do a bitwise OR against the bits from input that are in mask.
newValue = (originalValue & ~mask) | (inputValue & mask);
originalValue -> 10101010
inputValue -> 00001000
mask -> 00001100
~mask -> 11110011
(originalValue & ~mask)
10101010
& 11110011
----------
10100010
^^
Cleared isolated bits from the original value
(inputValue & mask)
00001000
& 00001100
----------
00001000
newValue =
10100010
| 00001000
----------
10101010
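The merge expression can be checked against the question's table (the Merge name is mine):

```csharp
using System;

class MergeDemo
{
    static byte Merge(byte original, byte input, byte mask)
        => (byte)(original & ~mask | input & mask);

    static void Main()
    {
        byte original = 0b10101010;
        byte mask = 0b00001100;   // isolates bits 2 and 3
        // Reproduces the Input/Output table from the question.
        foreach (byte input in new byte[] { 0b00000000, 0b00000100, 0b00001000, 0b00001100 })
            Console.WriteLine(Convert.ToString(Merge(original, input, mask), 2).PadLeft(8, '0'));
        // 10100010, 10100110, 10101010, 10101110
    }
}
```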
Something like this?
static ushort Transform(ushort value){
    return (ushort)(value & 0x0C/*00001100*/ | 0xA2/*10100010*/);
}
This will convert all your sample inputs to your sample outputs. To be more general, you'd want something like this:
static ushort Transform(ushort input, ushort mask, ushort bitsToSet){
    return (ushort)(input & mask | bitsToSet & ~mask);
}
And you would call this with:
Transform(input, 0x0C, 0xA2);
For the equivalent behavior of the first function.
A number of the terser solutions here look plausible, especially JS Bangs', but don't forget that you also have a handy BitArray collection to use in the System.Collections namespace: http://msdn.microsoft.com/en-us/library/system.collections.bitarray.aspx
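A small sketch of the BitArray route (indexes count from the least significant bit of the first byte):

```csharp
using System;
using System.Collections;

class BitArrayDemo
{
    static void Main()
    {
        // BitArray gives indexed access to bits without manual masking.
        var bits = new BitArray(new byte[] { 0b10101010 });
        bits[2] = true;                 // set bit 2 (it was 0)
        byte[] result = new byte[1];
        bits.CopyTo(result, 0);         // pack the bits back into a byte
        Console.WriteLine(Convert.ToString(result[0], 2)); // 10101110
    }
}
```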
If you want to do bitwise manipulations, I have written a very versatile method to copy any number of bits from one byte (the source byte) to another byte (the target byte). The bits can be put at a different starting bit in the target byte.
In this example, I want to copy 3 bits (bitCount = 3) starting at bit #4 of the source (sourceStartBit) into the destination starting at bit #1 (destStartBit). Please note that the numbering of bits starts with 0 and that in my method, the numbering starts with the most significant bit = 0 (reading from left to right).
byte source = 0b10001110;
byte destination = 0b10110001;
byte result = CopyByteIntoByte(source, destination, 4, 1, 3);
Console.WriteLine("The binary result: " + Convert.ToString(result, toBase: 2));
//The binary result: 11110001
byte CopyByteIntoByte(byte sourceByte, byte destinationByte, int sourceStartBit, int destStartBit, int bitCount)
{
    int[] mask = { 0, 1, 3, 7, 15, 31, 63, 127, 255 };
    byte sourceMask = (byte)(mask[bitCount] << (8 - sourceStartBit - bitCount));
    byte destinationMask = (byte)(~(mask[bitCount] << (8 - destStartBit - bitCount)));
    byte destinationToCopy = (byte)(destinationByte & destinationMask);
    int diff = destStartBit - sourceStartBit;
    byte sourceToCopy;
    if (diff > 0)
    {
        sourceToCopy = (byte)((sourceByte & sourceMask) >> diff);
    }
    else
    {
        sourceToCopy = (byte)((sourceByte & sourceMask) << (diff * -1));
    }
    return (byte)(sourceToCopy | destinationToCopy);
}

Help with optimizing C# function via C and/or Assembly

I have this C# method which I'm trying to optimize:
// assume arrays are same dimensions
private void DoSomething(int[] bigArray1, int[] bigArray2)
{
    int data1;
    byte A1, B1, C1, D1;
    int data2;
    byte A2, B2, C2, D2;
    for (int i = 0; i < bigArray1.Length; i++)
    {
        data1 = bigArray1[i];
        data2 = bigArray2[i];
        A1 = (byte)(data1 >> 0);
        B1 = (byte)(data1 >> 8);
        C1 = (byte)(data1 >> 16);
        D1 = (byte)(data1 >> 24);
        A2 = (byte)(data2 >> 0);
        B2 = (byte)(data2 >> 8);
        C2 = (byte)(data2 >> 16);
        D2 = (byte)(data2 >> 24);
        A1 = A1 > A2 ? A1 : A2;
        B1 = B1 > B2 ? B1 : B2;
        C1 = C1 > C2 ? C1 : C2;
        D1 = D1 > D2 ? D1 : D2;
        bigArray1[i] = (A1 << 0) | (B1 << 8) | (C1 << 16) | (D1 << 24);
    }
}
The function basically compares two int arrays. For each pair of matching elements, the method compares each individual byte value and takes the larger of the two. The element in the first array is then assigned a new int value constructed from the 4 largest byte values (irrespective of source).
I think I have optimized this method as much as possible in C# (probably I haven't, of course - suggestions on that score are welcome as well). My question is, is it worth it for me to move this method to an unmanaged C DLL? Would the resulting method execute faster (and how much faster), taking into account the overhead of marshalling my managed int arrays so they can be passed to the method?
If doing this would get me, say, a 10% speed improvement, then it would not be worth my time for sure. If it was 2 or 3 times faster, then I would probably have to do it.
Note: please, no "premature optimization" comments, thanks in advance. This is simply "optimization".
Update: I realized that my code sample didn't capture everything I'm trying to do in this function, so here is an updated version:
private void DoSomethingElse(int[] dest, int[] src, double pos,
                             double srcMultiplier)
{
    int rdr;
    byte destA, destB, destC, destD;
    double rem = pos - Math.Floor(pos);
    double recipRem = 1.0 - rem;
    byte srcA1, srcA2, srcB1, srcB2, srcC1, srcC2, srcD1, srcD2;
    for (int i = 0; i < src.Length; i++)
    {
        // get destination values
        rdr = dest[(int)pos + i];
        destA = (byte)(rdr >> 0);
        destB = (byte)(rdr >> 8);
        destC = (byte)(rdr >> 16);
        destD = (byte)(rdr >> 24);
        // get bracketing source values
        rdr = src[i];
        srcA1 = (byte)(rdr >> 0);
        srcB1 = (byte)(rdr >> 8);
        srcC1 = (byte)(rdr >> 16);
        srcD1 = (byte)(rdr >> 24);
        rdr = src[i + 1];
        srcA2 = (byte)(rdr >> 0);
        srcB2 = (byte)(rdr >> 8);
        srcC2 = (byte)(rdr >> 16);
        srcD2 = (byte)(rdr >> 24);
        // interpolate (simple linear) and multiply
        srcA1 = (byte)(((double)srcA1 * recipRem) +
                       ((double)srcA2 * rem) * srcMultiplier);
        srcB1 = (byte)(((double)srcB1 * recipRem) +
                       ((double)srcB2 * rem) * srcMultiplier);
        srcC1 = (byte)(((double)srcC1 * recipRem) +
                       ((double)srcC2 * rem) * srcMultiplier);
        srcD1 = (byte)(((double)srcD1 * recipRem) +
                       ((double)srcD2 * rem) * srcMultiplier);
        // bytewise best-of
        destA = srcA1 > destA ? srcA1 : destA;
        destB = srcB1 > destB ? srcB1 : destB;
        destC = srcC1 > destC ? srcC1 : destC;
        destD = srcD1 > destD ? srcD1 : destD;
        // convert bytes back to int
        dest[i] = (destA << 0) | (destB << 8) |
                  (destC << 16) | (destD << 24);
    }
}
Essentially this does the same thing as the first method, except that in this one the second array (src) is always smaller than the first (dest), and the second array is positioned fractionally relative to the first (meaning that instead of being positioned at, say, 10 relative to dest, it can be positioned at 10.682791).
To achieve this, I have to interpolate between two bracketing values in the source (say, 10 and 11 in the above example, for the first element) and then compare the interpolated bytes with the destination bytes.
I suspect here that the multiplication involved in this function is substantially more costly than the byte comparisons, so that part may be a red herring (sorry). Also, even if the comparisons are still somewhat expensive relative to the multiplications, I still have the problem that this system can actually be multi-dimensional, meaning that instead of comparing 1-dimensional arrays, the arrays could be 2-, 5- or whatever-dimensional, so that eventually the time taken to calculate interpolated values would dwarf the time taken by the final bytewise comparison of 4 bytes (I'm assuming that's the case).
How expensive is the multiplication here relative to the bit-shifting, and is this the kind of operation that could be sped up by being offloaded to a C DLL (or even an assembly DLL, although I'd have to hire somebody to create that for me)?
Yes, the _mm_max_epu8() intrinsic does what you want. Chews through 16 bytes at a time. The pain-point is the arrays. SSE2 instructions require their arguments to be aligned at 16-byte addresses. You cannot get that out of the garbage collected heap, it only promises 4-byte alignment. Even if you trick it by calculating an offset in the array that's 16-byte aligned then you'll lose when the garbage collector kicks in and moves the array.
You'll have to declare the arrays in the C/C++ code, using the __declspec(align(#)) declarator. Now you need to copy your managed arrays into those unmanaged ones. And the results back. Whether you are still ahead depends on details not easily seen in your question.
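As a present-day footnote (this API postdates the thread): System.Numerics.Vector offers a portable bytewise Max without hand-written SSE2 or alignment concerns. A hedged sketch, assuming the data has already been flattened to byte arrays of equal length:

```csharp
using System;
using System.Numerics;

class SimdMaxDemo
{
    // Bytewise max of b into a, using SIMD for full vector-width chunks
    // and a scalar loop for the tail.
    static void BytewiseMax(byte[] a, byte[] b)
    {
        int i = 0;
        int step = Vector<byte>.Count;
        for (; i <= a.Length - step; i += step)
        {
            var va = new Vector<byte>(a, i);
            var vb = new Vector<byte>(b, i);
            Vector.Max(va, vb).CopyTo(a, i);
        }
        for (; i < a.Length; i++)       // scalar tail
            if (b[i] > a[i]) a[i] = b[i];
    }

    static void Main()
    {
        byte[] a = { 1, 200, 3, 40 };
        byte[] b = { 5, 100, 9, 30 };
        BytewiseMax(a, b);
        Console.WriteLine(string.Join(",", a)); // 5,200,9,40
    }
}
```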
The function below uses unsafe code to treat the integer arrays as arrays of bytes so that there's no need for bit twiddling.
private static void DoOtherThing(int[] bigArray1, int[] bigArray2)
{
    unsafe
    {
        fixed (int* p1 = bigArray1, p2 = bigArray2)
        {
            byte* b1 = (byte*)p1;
            byte* b2 = (byte*)p2;
            byte* bend = (byte*)(&p1[bigArray1.Length]);
            while (b1 < bend)
            {
                if (*b1 < *b2)
                {
                    *b1 = *b2;
                }
                ++b1;
                ++b2;
            }
        }
    }
}
On my machine running under the debugger in Release mode against arrays of 25 million ints, this code is about 29% faster than your original. However, running standalone, there is almost no difference in runtime. Sometimes your original code is faster, and sometimes the new code is faster.
Approximate numbers:
             Debugger    Standalone
Original     1,400 ms    700 ms
My code        975 ms    700 ms
And, yes, I did compare the results to ensure that the functions do the same thing.
I'm at a loss to explain why my code isn't faster, since it's doing significantly less work.
Given these results, I doubt that you could improve things by going to native code. As you say, the overhead of marshaling the arrays would likely eat up any savings you might realize in the processing.
The following modification to your original code, though, is 10% to 20% faster.
private static void DoSomething(int[] bigArray1, int[] bigArray2)
{
    for (int i = 0; i < bigArray1.Length; i++)
    {
        var data1 = (uint)bigArray1[i];
        var data2 = (uint)bigArray2[i];
        var A1 = data1 & 0xff;
        var B1 = data1 & 0xff00;
        var C1 = data1 & 0xff0000;
        var D1 = data1 & 0xff000000;
        var A2 = data2 & 0xff;
        var B2 = data2 & 0xff00;
        var C2 = data2 & 0xff0000;
        var D2 = data2 & 0xff000000;
        if (A2 > A1) A1 = A2;
        if (B2 > B1) B1 = B2;
        if (C2 > C1) C1 = C2;
        if (D2 > D1) D1 = D2;
        bigArray1[i] = (int)(A1 | B1 | C1 | D1);
    }
}
What about this?
private void DoSomething(int[] bigArray1, int[] bigArray2)
{
    for (int i = 0; i < bigArray1.Length; i++)
    {
        var data1 = (uint)bigArray1[i];
        var data2 = (uint)bigArray2[i];
        bigArray1[i] = (int)(
            Math.Max(data1 & 0x000000FF, data2 & 0x000000FF) |
            Math.Max(data1 & 0x0000FF00, data2 & 0x0000FF00) |
            Math.Max(data1 & 0x00FF0000, data2 & 0x00FF0000) |
            Math.Max(data1 & 0xFF000000, data2 & 0xFF000000));
    }
}
It has a lot less bit shifting in it. You might find the calls to Math.Max aren't inlined if you profile it. In such a case, you'd just make the method more verbose.
I haven't tested this code as I don't have an IDE with me. I reckon it does what you want though.
If this still doesn't perform as you'd expect, you could try using pointer arithmetic in an unsafe block, but I seriously doubt that you'd see a gain. Code like this is unlikely to be faster if you extern to it, from everything I've read. But don't take my word for it. Measure, measure, measure.
Good luck.
I don't see any way of speeding up this code by means of clever bit tricks.
If you really want this code to be faster, the only way I see of significantly (>2x or so) speeding it up on the x86 platform is to go for an assembler/intrinsics implementation. SSE has the instruction PCMPGTB that
"Performs a SIMD compare for the greater value of the packed bytes, words, or doublewords in the destination operand (first operand) and the source operand (second operand). If a data element in the destination operand is greater than the corresponding data element in the source operand, the corresponding data element in the destination operand is set to all 1s; otherwise, it is set to all 0s."
An XMM register would fit four 32-bit ints, and you could loop over your arrays reading the values, getting the mask, and then ANDing the first input with the mask and the second one with the inverted mask. (Note that PCMPGTB compares signed bytes, so unsigned data needs a bias adjustment first.)
On the other hand, maybe you can reformulate your algorithm so that you don't need to pick the larger bytes, but can, for example, take the AND of the operands? Just a thought; hard to see if it can work without seeing the actual algorithm.
Another option for you, if you're able to run Mono, is to use the Mono.Simd package. This provides access to the SIMD instruction set from within .NET. Unfortunately you can't just take the assembly and run it on MS's CLR, as the Mono runtime treats it in a special way at JIT time. The actual assembly contains regular IL (non-SIMD) 'simulations' of the SIMD operations as a fall-back, in case the hardware does not support SIMD instructions.
You also need to be able to express your problem using the types that the API consumes, as far as I can make out.
Here is the blog post in which Miguel de Icaza announced the capability back in November 2008. Pretty cool stuff. Hopefully it will be added to the ECMA standard and MS can add it to their CLR.
You might like to look at the BitConverter class - can't remember if it is the right endianness for the particular conversion you're trying to do, but worth knowing about anyway.
