Faster way to swap endianness in C# with 16 bit words - c#

There's got to be a faster and better way to swap the bytes of 16-bit words than this:
public static void Swap(byte[] data)
{
for (int i = 0; i < data.Length; i += 2)
{
byte b = data[i];
data[i] = data[i + 1];
data[i + 1] = b;
}
}
Does anyone have an idea?

In my attempt to apply for the Uberhacker award, I submit the following. For my testing, I used a Source array of 8,192 bytes and called SwapX2 100,000 times:
public static unsafe void SwapX2(Byte[] source)
{
fixed (Byte* pSource = &source[0])
{
Byte* bp = pSource;
Byte* bp_stop = bp + source.Length;
while (bp < bp_stop)
{
*(UInt16*)bp = (UInt16)(*bp << 8 | *(bp + 1));
bp += 2;
}
}
}
My benchmarking indicates that this version is over 1.8 times faster than the code submitted in the original question.

This way appears to be slightly faster than the method in the original question:
private static byte[] _temp = new byte[0];
public static void Swap(byte[] data)
{
if (data.Length > _temp.Length)
{
_temp = new byte[data.Length];
}
Buffer.BlockCopy(data, 1, _temp, 0, data.Length - 1);
for (int i = 0; i < data.Length; i += 2)
{
_temp[i + 1] = data[i];
}
Buffer.BlockCopy(_temp, 0, data, 0, data.Length);
}
My benchmarking assumed that the method is called repeatedly, so that the resizing of the _temp array isn't a factor. This method relies on the fact that half of the byte-swapping can be done with the initial Buffer.BlockCopy(...) call (with the source position offset by 1).
Please benchmark this yourselves, in case I've completely lost my mind. In my tests, this method takes approximately 70% as long as the original method (which I modified to declare the byte b outside of the loop).

I always liked this:
public static Int64 SwapByteOrder(Int64 value)
{
var uvalue = (UInt64)value;
UInt64 swapped =
( (0x00000000000000FF) & (uvalue >> 56)
| (0x000000000000FF00) & (uvalue >> 40)
| (0x0000000000FF0000) & (uvalue >> 24)
| (0x00000000FF000000) & (uvalue >> 8)
| (0x000000FF00000000) & (uvalue << 8)
| (0x0000FF0000000000) & (uvalue << 24)
| (0x00FF000000000000) & (uvalue << 40)
| (0xFF00000000000000) & (uvalue << 56));
return (Int64)swapped;
}
I believe you'll find this is the fastest method, as well as being fairly readable and safe. Obviously this applies to 64-bit values, but the same technique could be used for 32- or 16-bit values.
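As a rough sketch of how the same shift-and-mask idea might look for the narrower widths, assuming a hypothetical helper class named ByteOrder (the names are not from the answer above):

```csharp
public static class ByteOrder
{
    // 16-bit: the two bytes simply trade places.
    // unchecked keeps the truncating cast safe even if the project enables overflow checks.
    public static ushort Swap16(ushort value)
    {
        return unchecked((ushort)((value >> 8) | (value << 8)));
    }

    // 32-bit: byte 3 -> byte 0, byte 2 -> byte 1, byte 1 -> byte 2, byte 0 -> byte 3.
    public static uint Swap32(uint value)
    {
        return (value >> 24)
             | ((value >> 8) & 0x0000FF00U)
             | ((value << 8) & 0x00FF0000U)
             | (value << 24);
    }
}
```

For example, ByteOrder.Swap16(0x1234) gives 0x3412 and ByteOrder.Swap32(0x12345678) gives 0x78563412.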

The next method was, in my tests, almost 3 times faster than the accepted answer. (It is always faster on more than three characters, i.e. six bytes, and a bit slower on three characters or fewer.) (Note that the accepted answer can read/write outside the bounds of the array.)
(Update: since we already have a pointer, there's no need to call the property to get the length. Reading the length through the pointer is a bit faster, but requires either a runtime check or, as in the next example, a project configuration built for each platform. Define X86 and X64 under each configuration.)
static unsafe void SwapV2(byte[] source)
{
fixed (byte* psource = source)
{
#if X86
var length = *((uint*)(psource - 4)) & 0xFFFFFFFEU;
#elif X64
var length = *((uint*)(psource - 8)) & 0xFFFFFFFEU;
#else
var length = (source.Length & 0xFFFFFFFE);
#endif
while (length > 7)
{
length -= 8;
ulong* pulong = (ulong*)(psource + length);
*pulong = ( ((*pulong >> 8) & 0x00FF00FF00FF00FFUL)
| ((*pulong << 8) & 0xFF00FF00FF00FF00UL));
}
if(length > 3)
{
length -= 4;
uint* puint = (uint*)(psource + length);
*puint = ( ((*puint >> 8) & 0x00FF00FFU)
| ((*puint << 8) & 0xFF00FF00U));
}
if(length > 1)
{
ushort* pushort = (ushort*)psource;
*pushort = (ushort) ( (*pushort >> 8)
| (*pushort << 8));
}
}
}
Five tests with 300.000 times 8192 bytes
SwapV2: 1055, 1051, 1043, 1041, 1044
SwapX2: 2802, 2803, 2803, 2805, 2805
Five tests with 50.000.000 times 6 bytes
SwapV2: 1092, 1085, 1086, 1087, 1086
SwapX2: 1018, 1019, 1015, 1017, 1018
But if the data is large and performance really matters, you could use SSE or AVX. (13 times faster.) https://pastebin.com/WaFk275U
Test 5 times, 100000 loops with 8192 bytes or 4096 chars
SwapX2 : 226, 223, 225, 226, 227 Min: 223
SwapV2 : 113, 111, 112, 114, 112 Min: 111
SwapA2 : 17, 17, 17, 17, 16 Min: 16
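The core of SwapV2 is the mask-and-shift step that swaps four 16-bit words inside one ulong. A minimal sketch of the same idea without unsafe code follows. It assumes Span<T> and MemoryMarshal are available (.NET Core 2.1 or later, or the System.Memory package), and the class and method names are made up for illustration. It has not been benchmarked against the versions above, and a trailing odd byte is simply left untouched:

```csharp
using System;
using System.Runtime.InteropServices;

public static class WordSwap
{
    // Swaps the two bytes of each of the four 16-bit words packed in a ulong.
    // Adjacent bytes trade places, so the result is the same on little- and big-endian machines.
    public static ulong SwapWords(ulong v)
    {
        return ((v >> 8) & 0x00FF00FF00FF00FFUL)
             | ((v << 8) & 0xFF00FF00FF00FF00UL);
    }

    public static void Swap(Span<byte> data)
    {
        int even = data.Length & ~1;   // ignore a trailing odd byte
        int bulk = even & ~7;          // part that can be handled 8 bytes at a time

        // Reinterpret the bulk of the buffer as ulongs and swap four words per step.
        Span<ulong> wide = MemoryMarshal.Cast<byte, ulong>(data.Slice(0, bulk));
        for (int i = 0; i < wide.Length; i++)
        {
            wide[i] = SwapWords(wide[i]);
        }

        // Handle the remaining 2, 4 or 6 bytes one word at a time.
        for (int i = bulk; i < even; i += 2)
        {
            byte b = data[i];
            data[i] = data[i + 1];
            data[i + 1] = b;
        }
    }
}
```

A byte[] converts implicitly to Span<byte>, so WordSwap.Swap(buffer) works directly on an array.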

Well, you could use the XOR swapping trick, to avoid an intermediate byte. It won't be any faster, though, and I wouldn't be surprised if the IL is exactly the same.
for (int i = 0; i < data.Length; i += 2)
{
data[i] ^= data[i + 1];
data[i + 1] ^= data[i];
data[i] ^= data[i + 1];
}

Related

Byte shift issue in an integer conversion

I read 3 bytes in a binary file which I need to convert into an integer.
I use this code to read the bytes :
LastNum last1Hz = new LastNum();
last1Hz.Freq = 1;
Byte[] LastNumBytes1Hz = new Byte[3];
Array.Copy(lap_info, (8 + (32 * k)), LastNumBytes1Hz, 0, 3);
last1Hz.NumData = LastNumBytes1Hz[2] << 16 + LastNumBytes1Hz[1] << 8 + LastNumBytes1Hz[0];
last1Hz.NumData is an integer.
This seems to be the right way to convert bytes into integers in the posts I have seen.
Here is a capture of the values read:
But the integer last1Hz.NumData is always 0.
I'm missing something but can't figure out what.
You need to use brackets (because addition has a higher priority than bit shifting):
int a = 0x87;
int b = 0x00;
int c = 0x00;
int x = c << 16 + b << 8 + a; // result 0
int z = (c << 16) + (b << 8) + a; // result 135
Your code should look like this:
last1Hz.NumData = (LastNumBytes1Hz[2] << 16) + (LastNumBytes1Hz[1] << 8) + LastNumBytes1Hz[0];
I think the problem is an order of precedence issue. + is evaluated before <<
Put brackets in to force the bit shift to be evaluated first.
last1Hz.NumData = (LastNumBytes1Hz[2] << 16) + (LastNumBytes1Hz[1] << 8) + LastNumBytes1Hz[0];
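One way to keep the precedence issue from coming back is to wrap the conversion in a small helper. This is only a sketch: ByteReader and ReadInt24LittleEndian are hypothetical names, and the byte order (lowest byte first) is taken from the question's code:

```csharp
public static class ByteReader
{
    // Combines three bytes, least significant first, into an int.
    // Every shift is parenthesized, so operator precedence cannot change the result.
    public static int ReadInt24LittleEndian(byte[] buffer, int offset)
    {
        return buffer[offset]
             | (buffer[offset + 1] << 8)
             | (buffer[offset + 2] << 16);
    }
}
```

With the bytes 0x87, 0x00, 0x00 this returns 135, matching the corrected expression above.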

How to get the number represented by least-significant non-zero bit efficiently

For example, if it is 0xc0, then the result is 0x40,
Since 0xc0 is equal to binary 11000000, the result should be 01000000.
public static byte Puzzle(byte x) {
byte result = 0;
byte[] masks = new byte[]{1,2,4,8,16,32,64,128};
foreach(var mask in masks)
{
if((x&mask)!=0)
{
return mask;
}
}
return 0;
}
This is my current solution. It turns out this question can be solved in 3-4 lines...
public static byte Puzzle(byte x) {
return (byte) (x & (~x ^ -x));
}
if x is bbbb1000, ~x is BBBB0111 (B is !b)
-x is really ~x+1, (2's complement) so adding 1 to BBBB0111 is BBBB1000
~x ^ -x is then 00001111; ANDing that with x gives the lowest 1 bit.
A better answer was supplied by harold in the comments
public static byte Puzzle(byte x) {
return (byte) (x & -x);
}
Maybe not the best solution, but you can prepare an array byte[256], store the result for each number and then use it.
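A minimal sketch of that lookup-table idea, assuming the table is filled once at startup with the x & -x expression (the class name LowestBitTable is made up for illustration):

```csharp
public static class LowestBitTable
{
    // One precomputed answer for each of the 256 possible byte values.
    private static readonly byte[] Table = BuildTable();

    private static byte[] BuildTable()
    {
        var table = new byte[256];
        for (int i = 0; i < 256; i++)
        {
            // i & -i isolates the lowest set bit; it is 0 when i is 0.
            table[i] = (byte)(i & -i);
        }
        return table;
    }

    public static byte Puzzle(byte x)
    {
        return Table[x];
    }
}
```

For instance, LowestBitTable.Puzzle(0xC0) returns 0x40. Whether the table beats the plain expression depends on the caller, so it is worth measuring.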
Another solution in 1 line:
return (byte)(~(x | (x << 1) | (x << 2) | (x << 3) | (x << 4) | (x << 5) | (x << 6) | (x << 7) | (x << 8)) + 1);
Hardly a 1-line solution, but at least it doesn't use the hardcoded list of the bits.
I'd do it with some simple bit shifting. Basically, down-shifting the value until the lowest bit is not 0, and then upshifting 1 with the amount of performed shifts to reconstruct the least significant value. Since it's a 'while' it needs an advance zero check though, or it'll go into an infinite loop on zero.
public static Byte Puzzle(Byte x)
{
if (x == 0)
return 0;
byte shifts = 0;
while ((x & 1) == 0)
{
shifts++;
x = (Byte)(x >> 1);
}
return (Byte)(1 << shifts);
}

BitConvert.Int32, Shouldn't this be faster?

I am trying to improve the speed of BitConvert, or rather, an alternative way.
So here is the code i thought was supposed to be faster :
bsize = ms.length
int index = 0;
byte[] target = new byte[intsize];
target[index++] = (byte)bsize;
target[index++] = (byte)(bsize >> 8);
target[index++] = (byte)(bsize >> 16);
target[index] = (byte)(bsize >> 24);
And well the BitConvert code:
BitConverter.GetBytes(bsize)
And well, it wasn't faster; in my tests it was a lot slower, more than twice as slow.
So why is it slower?
And is there a way to improve the speed?
EDIT:
BitConvert = 5068 Ticks
OtherMethod above: 12847 Ticks
EDIT 2: My Benchmark code:
private unsafe void ExecuteBenchmark(int samplingSize = 100000)
{
// run the Garbage collector
GC.Collect();
GC.WaitForPendingFinalizers();
// log start
Console.WriteLine("Benchmark started");
// start timer
var t = Stopwatch.StartNew();
for (int i = 0; i < samplingSize; i++)
{
// code under test goes here
}
// stop timer
t.Stop();
// log ending
Console.WriteLine("Execute1 time = " + t.ElapsedTicks + " ticks");
}
Your implementation is slower, because BitConverter uses unsafe code which operates on pointers:
public unsafe static byte[] GetBytes(int value)
{
byte[] array = new byte[4];
fixed (byte* ptr = array)
{
*(int*)ptr = value;
}
return array;
}
And back to int:
public unsafe static int ToInt32(byte[] value, int startIndex)
{
if (value == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.value);
}
if ((ulong)startIndex >= (ulong)((long)value.Length))
{
ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.startIndex, ExceptionResource.ArgumentOutOfRange_Index);
}
if (startIndex > value.Length - 4)
{
ThrowHelper.ThrowArgumentException(ExceptionResource.Arg_ArrayPlusOffTooSmall);
}
int result;
if (startIndex % 4 == 0)
{
result = *(int*)(&value[startIndex]);
}
else
{
if (BitConverter.IsLittleEndian)
{
result = ((int)(*(&value[startIndex])) | (int)(&value[startIndex])[(IntPtr)1 / 1] << 8 | (int)(&value[startIndex])[(IntPtr)2 / 1] << 16 | (int)(&value[startIndex])[(IntPtr)3 / 1] << 24);
}
else
{
result = ((int)(*(&value[startIndex])) << 24 | (int)(&value[startIndex])[(IntPtr)1 / 1] << 16 | (int)(&value[startIndex])[(IntPtr)2 / 1] << 8 | (int)(&value[startIndex])[(IntPtr)3 / 1]);
}
}
return result;
}
Well, first, measuring the speed of such a tiny amount of code is going to be error-prone. Posting your benchmark might give more answers.
But my guess is that on platforms supporting it (like x86), BitConverter probably does a single bounds check and an unaligned write into target rather than 3 shifts, 4 bounds checks, and 4 writes. It may end up completely inlined, alleviating all call overhead.
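If the goal is a single checked write into an existing buffer, one option on newer runtimes is BinaryPrimitives from System.Buffers.Binary. The sketch below assumes .NET Core 2.1 or later (or the System.Memory package) and a caller that can reuse its target array. IntWriter is a hypothetical name, and no timing is claimed for it:

```csharp
using System;
using System.Buffers.Binary;

public static class IntWriter
{
    // Writes value as 4 little-endian bytes into target, starting at offset.
    // AsSpan(offset, 4) performs the range check once for the whole write.
    public static void WriteInt32(byte[] target, int offset, int value)
    {
        BinaryPrimitives.WriteInt32LittleEndian(target.AsSpan(offset, 4), value);
    }
}
```

Unlike BitConverter.GetBytes, this does not allocate a new array per call, which may matter more than the shifts when it runs in a tight loop.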

How to get amount of 1s from 64 bit number [duplicate]

This question already has answers here:
Count number of bits in a 64-bit (long, big) integer?
(3 answers)
Closed 9 years ago.
For an image comparison algorithm I get a 64bit number as result. The amount of 1s in the number (ulong) (101011011100...) tells me how similar two images are, so I need to count them. How would I best do this in C#?
I'd like to use this in a WinRT & Windows Phone App, so I'm also looking for a low-cost method.
EDIT: As I have to count the bits for a large number of Images, I'm wondering if the lookup-table-approach might be best. But I'm not really sure how that works...
Sean Eron Anderson's Bit Twiddling Hacks has this trick, among others:
Counting bits set, in parallel
unsigned int v; // count bits set in this (32-bit value)
unsigned int c; // store the total here
static const int S[] = {1, 2, 4, 8, 16}; // Magic Binary Numbers
static const int B[] = {0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF};
c = v - ((v >> 1) & B[0]);
c = ((c >> S[1]) & B[1]) + (c & B[1]);
c = ((c >> S[2]) + c) & B[2];
c = ((c >> S[3]) + c) & B[3];
c = ((c >> S[4]) + c) & B[4];
The B array, expressed as binary, is:
B[0] = 0x55555555 = 01010101 01010101 01010101 01010101
B[1] = 0x33333333 = 00110011 00110011 00110011 00110011
B[2] = 0x0F0F0F0F = 00001111 00001111 00001111 00001111
B[3] = 0x00FF00FF = 00000000 11111111 00000000 11111111
B[4] = 0x0000FFFF = 00000000 00000000 11111111 11111111
We can adjust the method for larger integer sizes by continuing with the patterns for the Binary Magic Numbers, B and S. If there are k bits, then we need the arrays S and B to be ceil(lg(k)) elements long, and we must compute the same number of expressions for c as S or B are long. For a 32-bit v, 16 operations are used.
The best method for counting bits in a 32-bit integer v is the following:
v = v - ((v >> 1) & 0x55555555); // reuse input as temporary
v = (v & 0x33333333) + ((v >> 2) & 0x33333333); // temp
c = ((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24; // count
The best bit counting method takes only 12 operations, which is the same as the lookup-table method, but avoids the memory and potential cache misses of a table. It is a hybrid between the purely parallel method above and the earlier methods using multiplies (in the section on counting bits with 64-bit instructions), though it doesn't use 64-bit instructions. The counts of bits set in the bytes is done in parallel, and the sum total of the bits set in the bytes is computed by multiplying by 0x1010101 and shifting right 24 bits.
A generalization of the best bit counting method to integers of bit-widths upto 128 (parameterized by type T) is this:
v = v - ((v >> 1) & (T)~(T)0/3); // temp
v = (v & (T)~(T)0/15*3) + ((v >> 2) & (T)~(T)0/15*3); // temp
v = (v + (v >> 4)) & (T)~(T)0/255*15; // temp
c = (T)(v * ((T)~(T)0/255)) >> (sizeof(T) - 1) * CHAR_BIT; // count
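Since the question asks about a C# ulong, here is a sketch of the generalized method above specialized to 64 bits. The masks are the 64-bit values of the (T)~(T)0/3, /15*3, /255*15 and /255 expressions; the class name BitCounter is made up for illustration:

```csharp
public static class BitCounter
{
    public static int CountBits(ulong v)
    {
        // unchecked: the multiply below is meant to wrap around.
        unchecked
        {
            v = v - ((v >> 1) & 0x5555555555555555UL);                           // 2-bit sums
            v = (v & 0x3333333333333333UL) + ((v >> 2) & 0x3333333333333333UL);  // 4-bit sums
            v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FUL;                           // per-byte sums
            return (int)((v * 0x0101010101010101UL) >> 56);                      // add the 8 bytes
        }
    }
}
```

BitCounter.CountBits(ulong.MaxValue) returns 64, and BitCounter.CountBits(0) returns 0.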
Something along these lines would do (note that this isn't tested code, I just wrote it here, so it may and probably will require tweaking).
int numberOfOnes = 0;
for (int i = 63; i >= 0; i--)
{
if (((yourUInt64 >> i) & 1) == 1) numberOfOnes++;
}
Option 1 - fewer iterations if the 64-bit result < 2^63:
byte numOfOnes = 0;
while (result != 0)
{
numOfOnes += (byte)(result & 0x1);
result = (result >> 1);
}
return numOfOnes;
Option 2 - constant number of iterations - can use loop unrolling:
byte numOfOnes = 0;
for (int i = 0; i < 64; i++)
{
numOfOnes += (byte)(result & 0x1);
result = (result >> 1);
}
This is a 32-bit version of bitCount. You could extend it to a 64-bit version by widening the masks and adding one more step with a right shift by 32, and it would be very efficient.
int bitCount(int x) {
/* First let res = ((x & 0xAAAAAAAA) >> 1) + (x & 0x55555555).
* After that, the (2k)th and (2k+1)th bits of res
* hold the number of 1s contained in the (2k)th
* and (2k+1)th bits of x.
* We can use a similar approach to calculate the number of 1s
* contained in the (4k)th, (4k+1)th, (4k+2)th
* and (4k+3)th bits of x, and likewise for groups of 8, 16 and 32 bits.
*/
int varA = (85 << 8) | 85;
varA = (varA << 16) | varA;
int res = ((x>>1) & varA) + (x & varA);
varA = (51 << 8) | 51;
varA = (varA << 16) | varA;
res = ((res>>2) & varA) + (res & varA);
varA = (15 << 8) | 15;
varA = (varA << 16) | varA;
res = ((res>>4) & varA) + (res & varA);
varA = (255 << 16) | 255;
res = ((res>>8) & varA) + (res & varA);
varA = (255 << 8) | 255;
res = ((res>>16) & varA) + (res & varA);
return res;
}
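The lookup-table approach mentioned in the question's edit could look roughly like this: precompute the bit count of every byte value once, then sum the counts of the eight bytes of the ulong. TablePopCount is a made-up name, and this is a sketch, not a measured recommendation:

```csharp
public static class TablePopCount
{
    // BitsInByte[b] holds the number of 1 bits in the byte value b.
    private static readonly byte[] BitsInByte = BuildTable();

    private static byte[] BuildTable()
    {
        var table = new byte[256];
        for (int i = 1; i < 256; i++)
        {
            // The count for i is its lowest bit plus the count for i with that bit shifted out.
            table[i] = (byte)((i & 1) + table[i >> 1]);
        }
        return table;
    }

    public static int CountBits(ulong value)
    {
        int count = 0;
        for (int i = 0; i < 8; i++)
        {
            count += BitsInByte[(int)(value & 0xFF)]; // look up the lowest byte
            value >>= 8;                              // move on to the next byte
        }
        return count;
    }
}
```

The table costs 256 bytes and the loop always runs eight times, so the cost per call is constant regardless of the input.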

Bit-shifting a byte array by N bits

Hello quick question regarding bit shifting
I have a value in HEX: new byte[] { 0x56, 0xAF };
which is 0101 0110 1010 1111
I want the first N bits, for example 12.
Then I must right-shift off the lowest 4 bits (16 - 12) to get 0000 0101 0110 1010 (1386 dec).
I can't wrap my head around it and make it scalable for n bits.
Some time ago I coded these two functions: the first one shifts a byte[] a specified number of bits to the left, the second does the same to the right:
Left Shift:
public byte[] ShiftLeft(byte[] value, int bitcount)
{
byte[] temp = new byte[value.Length];
if (bitcount >= 8)
{
Array.Copy(value, bitcount / 8, temp, 0, temp.Length - (bitcount / 8));
}
else
{
Array.Copy(value, temp, temp.Length);
}
if (bitcount % 8 != 0)
{
for (int i = 0; i < temp.Length; i++)
{
temp[i] <<= bitcount % 8;
if (i < temp.Length - 1)
{
temp[i] |= (byte)(temp[i + 1] >> 8 - bitcount % 8);
}
}
}
return temp;
}
Right Shift:
public byte[] ShiftRight(byte[] value, int bitcount)
{
byte[] temp = new byte[value.Length];
if (bitcount >= 8)
{
Array.Copy(value, 0, temp, bitcount / 8, temp.Length - (bitcount / 8));
}
else
{
Array.Copy(value, temp, temp.Length);
}
if (bitcount % 8 != 0)
{
for (int i = temp.Length - 1; i >= 0; i--)
{
temp[i] >>= bitcount % 8;
if (i > 0)
{
temp[i] |= (byte)(temp[i - 1] << 8 - bitcount % 8);
}
}
}
return temp;
}
If you need further explanation please comment on this, i will then edit my post for clarification...
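You can use a BitArray and then easily copy each bit to the right, starting from the right. Note that BitArray indexes bits starting from the least significant bit of the first byte, so the code below treats the array as a little-endian number.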
http://msdn.microsoft.com/en-us/library/system.collections.bitarray_methods.aspx
you want something like...
var HEX = new byte[] {0x56, 0xAF};
var bits = new BitArray(HEX);
int bitstoShiftRight = 4;
for (int i = 0; i < bits.Length; i++)
{
bits[i] = i < (bits.Length - bitstoShiftRight) ? bits[i + bitstoShiftRight] : false;
}
bits.CopyTo(HEX, 0);
If you have k total bits, and you want the "first" (as in most significant) n bits, you can simply right shift k-n times. The last k-n bits will be removed, by sort of "falling" off the end, and the first n will be moved to the least significant side.
Answering using C-like notation, assuming bits_in_byte is the number of bits in a byte determined elsewhere:
int remove_bits_count= HEX.count*bits_in_byte - bits_to_keep;
int remove_bits_in_byte_count= remove_bits_count % bits_in_byte;
if (remove_bits_count > 0)
{
for (int iteration= 0; iteration<HEX.count; ++iteration)
{
int write_index= HEX.count - iteration - 1;
int read_index_lo= write_index - remove_bits_count/bits_in_byte;
int read_index_hi= read_index_lo - 1;
// bytes that would come from before the start of the array count as zero
int lo= (read_index_lo>=0) ? HEX[read_index_lo] : 0;
int hi= (read_index_hi>=0) ? HEX[read_index_hi] : 0;
HEX[write_index]=
((lo >> remove_bits_in_byte_count) |
(hi << (bits_in_byte - remove_bits_in_byte_count))) & 0xFF;
}
}
Assuming you are overwriting the original array, you basically take every byte you write to and figure out the bytes that it would get its shifted bits from. You go from the end of the array to the front to ensure you never overwrite data you will need to read.
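When the input is short enough to fit in a ulong, the "right shift k-n times" idea can be sketched without touching the array at all. The sketch below assumes the array holds at most 8 bytes, most significant byte first, as in the question; BitSlicer and TakeHighBits are hypothetical names:

```csharp
using System;

public static class BitSlicer
{
    // Treats data as one big-endian number of k = 8 * data.Length bits
    // and returns its n most significant bits as a number.
    public static ulong TakeHighBits(byte[] data, int n)
    {
        if (data.Length > 8)
            throw new ArgumentException("This sketch handles at most 8 bytes.", nameof(data));

        int k = data.Length * 8;
        if (n < 0 || n > k)
            throw new ArgumentOutOfRangeException(nameof(n));

        ulong value = 0;
        foreach (byte b in data)
        {
            value = (value << 8) | b;   // append each byte, most significant first
        }

        // C# masks a ulong shift count to 6 bits, so a shift by 64 would do nothing;
        // handle n == 0 explicitly instead.
        return n == 0 ? 0UL : value >> (k - n);
    }
}
```

For the question's input, BitSlicer.TakeHighBits(new byte[] { 0x56, 0xAF }, 12) returns 1386 (0x056A).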
