Fast byte array masking in C#

I have a struct with some properties (like int A1, int A2,...). I store a list of these structs in binary form in a file.
Now I'm reading the bytes from the file using a binary reader into Buffer, and I want to apply a filter based on the struct's properties (like .A1 = 100 & .A2 = 12).
Performance is very important in my scenario, so I convert the filter criteria to a byte array (Filter) and then mask Buffer with Filter. If the result of the masking is equal to Filter, the Buffer is converted to the struct.
The question: What is the fastest way to mask and compare two byte arrays?
Update: The Buffer size is more than 256 bytes. I'm wondering if there is a better way than iterating over each byte of Buffer and Filter.

The way I would usually approach this is with unsafe code. You can use the fixed keyword to get a byte[] as a long*, which you can then iterate in 1/8th of the iterations - but using the same bit operations. You will typically have a few bytes left over (from it not being an exact multiple of 8 bytes) - just clean those up manually afterwards.
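
A minimal sketch of that approach, assuming both arrays have the same length and that the match rule is the one from the question, (Buffer & Filter) == Filter. The names ByteMask and MatchesFilter are made up for illustration, and the code has to be compiled with unsafe blocks enabled (AllowUnsafeBlocks):

```csharp
using System;

public static class ByteMask
{
    // Returns true when (buffer[i] & filter[i]) == filter[i] for every byte.
    // Assumption: both arrays have the same length.
    public static unsafe bool MatchesFilter(byte[] buffer, byte[] filter)
    {
        if (buffer.Length != filter.Length)
            throw new ArgumentException("Arrays must have the same length.");

        int length = buffer.Length;
        int wordCount = length / sizeof(long);

        fixed (byte* pBuffer = buffer, pFilter = filter)
        {
            // Main loop: 8 bytes per iteration.
            long* b = (long*)pBuffer;
            long* f = (long*)pFilter;
            for (int i = 0; i < wordCount; i++)
            {
                if ((b[i] & f[i]) != f[i])
                    return false;
            }

            // Clean up the tail bytes that do not fill a whole long.
            for (int i = wordCount * sizeof(long); i < length; i++)
            {
                if ((pBuffer[i] & pFilter[i]) != pFilter[i])
                    return false;
            }
        }
        return true;
    }
}
```

The bit operations are the same as in a byte-by-byte loop; only the width of each step changes.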

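Try a simple loop with System.BitConverter.ToInt64(). Something like this:
byte[] arr1; // assigned elsewhere
byte[] arr2; // assigned elsewhere, same length as arr1
// Stop while a full 8 bytes remain so ToInt64 never reads past the end;
// any leftover tail bytes have to be handled one at a time afterwards.
for (int i = 0; i + 8 <= arr1.Length; i += 8)
{
    long p1 = System.BitConverter.ToInt64(arr1, i);
    long p2 = System.BitConverter.ToInt64(arr2, i);
    if ((p1 & p2) != p1) // or whatever condition you need
        break; // break the loop if you need to
}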
My assumption is that comparing/masking two Int64s will be much faster (especially on 64-bit machines) than masking one byte at a time.

Once you've got the two arrays - one from reading the file and one from the filter, all you then need is a fast comparison for the arrays. Check out the following postings which are using unsafe or PInvoke methods.
What is the fastest way to compare two byte arrays?
Comparing two byte arrays in .NET
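
On newer .NET versions there is also a safe option that was not part of the linked postings: comparing the arrays as spans. A small sketch, assuming Span<T> is available (.NET Core 2.1 or later, or the System.Memory package); ArrayCompare and AreEqual are illustrative names:

```csharp
using System;

public static class ArrayCompare
{
    // Compares two byte arrays for equal length and equal content.
    public static bool AreEqual(byte[] a, byte[] b)
    {
        return a.AsSpan().SequenceEqual(b);
    }
}
```

This only covers the comparison step; the masking itself still has to happen before it.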

Related

C# - Convert string of zeros and ones to a byte array or similar

I am getting a string of zeros and ones from a client API request. They are of a set length (28, in this case) and I want to convert them to a byte[] or something similar, with the goal of storing these in SQL via EF Core and later using bitwise operators to compare them.
I can't seem to wrap my head around this one. I'm seeing a lot of posts/questions about converting characters to byte arrays, or byte arrays to strings, neither of which is what I need.
I need a "00111000010101010" to become a literal binary 00111000010101010 I can use a ^ on.
Leading zeros would be fine if necessary, I think the length might be forced to be a multiple of 8?

You can easily convert a binary string to an integer with this:
string source = "00111000010101010";
int number = Convert.ToInt32(source, 2); // The `2` is "base 2"
That gives: 28842.
Then you can go one step further and convert it to a byte array, if needed.
byte[] bytes = BitConverter.GetBytes(number);
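
Since a 28-character string fits in a 32-bit int, the bitwise comparison can be done directly on the parsed values. A rough sketch; the helper names BitStrings, Parse and Format are hypothetical:

```csharp
using System;

public static class BitStrings
{
    // Parses a string of '0' and '1' characters (up to 31 of them here) as base 2.
    public static int Parse(string bits)
    {
        return Convert.ToInt32(bits, 2);
    }

    // Formats a value back to a fixed-width binary string, restoring leading zeros.
    public static string Format(int value, int width)
    {
        return Convert.ToString(value, 2).PadLeft(width, '0');
    }
}
```

For example, BitStrings.Parse("0011") ^ BitStrings.Parse("0101") gives 6, and BitStrings.Format(6, 4) gives "0110". The leading zeros are not stored in the int, so the original width has to be known when formatting.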

How to write directly to memory in C#

I am working on a serial port comms project. There is a piece of hardware sending signed 16 bit integer values, which are being received in to a C# PC application.
Each value is sent over two bytes, least significant byte first.
Performance is critical, so I'm looking at ways of reducing processing.
The C# Serial Port object provides a method ReadExisting, which returns the current buffered values as a string.
It also provides a method Read which can accept a byte array which is then populated with the bytes in the port buffer.
If I read all the values in to a byte array, I then have to join the two bytes together to get the 16 bit number.
I'm intrigued by the string returned from the ReadExisting method.
If I create an array of short (short[] MyValues), I could then get the memory location of the array, and simply write the string to that location. As long as I ensure the bytes are sent in the correct order, I could then simply read the 16 bit values when needed.
Alternatively, I could possibly create two arrays, one an array of shorts, the second an array of bytes twice the size of the first, both with the same memory location.
However, "here be dragons". I have no experience with C# and this level of memory access.
Some brief googling suggests this is possible - this article has some interesting information:
https://www.developerfusion.com/article/84519/mastering-structs-in-c/
Before I head off down this particular rabbit hole, does anyone have any suggestions on how to achieve this level of memory manipulation?
Thanks

Try an overlay
static void Main(string[] args)
{
const int NUMBER_OF_BYTES = 100;
ByteInt16 byteInt = new ByteInt16();
byteInt.data = new byte[NUMBER_OF_BYTES];
byteInt.data2 = new UInt16[NUMBER_OF_BYTES / 2];
byteInt.data = Enumerable.Range(0, NUMBER_OF_BYTES).Select(x => (byte)x).ToArray();
UInt16[] results = byteInt.data2.Select(x => (UInt16)x).ToArray();
}
[StructLayout(LayoutKind.Explicit)]
public struct ByteInt16
{
[FieldOffset(0)]
public byte[] data;
[FieldOffset(0)]
public UInt16[] data2;
}
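
One caveat with the overlay: both fields end up referring to the same array object, so data2.Length still reports the byte count rather than the number of 16-bit values. A safer sketch that stays in managed code is to read into a byte[] with SerialPort.Read and decode pairs of bytes explicitly. SampleDecoder and Decode are illustrative names, and BinaryPrimitives assumes .NET Core 2.1 or later (or the System.Memory package):

```csharp
using System;
using System.Buffers.Binary;

public static class SampleDecoder
{
    // Decodes signed 16-bit values, least significant byte first,
    // from the first byteCount bytes of the buffer.
    public static short[] Decode(byte[] bytes, int byteCount)
    {
        short[] samples = new short[byteCount / 2];
        for (int i = 0; i < samples.Length; i++)
        {
            samples[i] = BinaryPrimitives.ReadInt16LittleEndian(bytes.AsSpan(i * 2, 2));
        }
        return samples;
    }
}
```

On a little-endian machine, Buffer.BlockCopy(bytes, 0, samples, 0, byteCount) would fill the short[] in a single call instead of the loop.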

converting 8 bytes into one long

I am currently developing a C# 2D sandbox based game. The game world is filled with tiles/blocks. Since the world is so large, the game can sometimes use more memory than is allowed for a 32-bit application.
My tiles consist of the following data inside a struct:
public byte type;
public byte typeWall;
public byte liquid;
public byte typeLiquid;
public byte frameX;
public byte frameY;
public byte frameWallX;
public byte frameWallY;
I am looking to encapsulate all this data within one "long" (64-bit integer).
I want properties to get and set each piece of data using bit shifting, etc... (I have never done this).
Would this save space? Would it increase processing speed? If so how can it be accomplished?
Thanks.

I am looking to encapsulate all this data within one "long" (64-bit integer).
You can use StructLayoutAttribute with LayoutKind.Explicit and then decorate fields with FieldOffsetAttribute specifying the exact position.
I want properties to get and set each piece of data using bit shifting, etc... (I have never done this).
Then use shift left (<<), shift right (>>), and masking with 0xff to separate individual bytes: bitwise AND (&) to extract and bitwise OR (|) to write (when writing, don't forget to clear any non-zero bits in the target byte first). Read more about bitwise operations here.
Would this save space? Would it increase processing speed?
Did you measure it? Did you discover a performance / memory consumption problem? If yes, go optimize it. If not, do not optimize prematurely. In other words, don't blindly try without measuring first.
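
To make the shift-and-mask idea concrete, here is a sketch of a tile that keeps all eight values in one 64-bit field. The type name PackedTile and the byte positions are assumptions chosen for illustration, not something from the question:

```csharp
using System;

public struct PackedTile
{
    private ulong _bits;

    // Extract: shift the wanted byte down to the low 8 bits, then truncate.
    private byte GetByte(int index)
    {
        return (byte)((_bits >> (index * 8)) & 0xFF);
    }

    // Write: clear the target byte with an inverted mask, then OR in the new value.
    private void SetByte(int index, byte value)
    {
        int shift = index * 8;
        _bits = (_bits & ~(0xFFUL << shift)) | ((ulong)value << shift);
    }

    public byte Type { get { return GetByte(0); } set { SetByte(0, value); } }
    public byte TypeWall { get { return GetByte(1); } set { SetByte(1, value); } }
    public byte Liquid { get { return GetByte(2); } set { SetByte(2, value); } }
    public byte TypeLiquid { get { return GetByte(3); } set { SetByte(3, value); } }
    public byte FrameX { get { return GetByte(4); } set { SetByte(4, value); } }
    public byte FrameY { get { return GetByte(5); } set { SetByte(5, value); } }
    public byte FrameWallX { get { return GetByte(6); } set { SetByte(6, value); } }
    public byte FrameWallY { get { return GetByte(7); } set { SetByte(7, value); } }

    // The raw 64-bit value, e.g. for saving to disk.
    public ulong Raw { get { return _bits; } }
}
```

A struct with eight byte fields already occupies 8 bytes, so this layout is mainly useful when the whole tile needs to be handled as a single 64-bit value.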

I don't know why you want to do this, but you can do it in this way:
byte type = 4;
byte typeWall = 45;
byte liquid = 45;
byte typeLiquid = 234;
byte frameX = 23;
byte frameY = 23;
byte frameWallX = 22;
byte frameWallY = 221;
byte[] bytes = new [] {type, typeWall, liquid, typeLiquid, frameX, frameY, frameWallX, frameWallY};
long packed = BitConverter.ToInt64(bytes, 0);
or using << (shift) operator.
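
The shift variant could look roughly like this. TilePacker and Pack are made-up names; the byte order is chosen so that the result matches BitConverter.ToInt64 on a little-endian machine:

```csharp
using System;

public static class TilePacker
{
    // Packs exactly eight bytes into one long; bytes[0] lands in the lowest 8 bits.
    public static long Pack(byte[] bytes)
    {
        if (bytes.Length != 8)
            throw new ArgumentException("Exactly eight bytes are required.");

        ulong packed = 0;
        for (int i = 0; i < 8; i++)
        {
            packed |= (ulong)bytes[i] << (8 * i);
        }
        // Reinterpret the bit pattern as a signed value.
        return unchecked((long)packed);
    }
}
```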

As you can see by pasting the following code into linqpad :
void Main()
{
sizeof(byte).Dump("byte size");
sizeof(Int32).Dump("int 32");
sizeof(Int64).Dump("int 64");
sizeof(char).Dump("for good measure, a char:");
}
You'll get:
byte size 1
int 32 4
int 64 8
for good measure, a char: 2
So packing 8 bytes into an Int64 will take the same amount of space, but you'll have to play with the bits yourself (if that's your thing, by all means, go for it :)

Use the data in a c# byte array

I have an image, including image header, stored in a c# byte array (byte []).
The header is at the beginning of the byte array.
If I put the header in a struct (as I did in c++) it looks like this:
typedef struct RS_IMAGE_HEADER
{
long HeaderVersion;
long Width;
long Height;
long NumberOfBands;
long ColorDepth;
long ImageType;
long OriginalImageWidth;
long OriginalImageHeight;
long OffsetX;
long OffsetY;
long RESERVED[54];
long Comment[64];
} RS_IMAGE_HEADER;
How can I do it in c#, how can I get and use all the data in the image header (that stored in the beginning of the byte array)?
Thanks

Structs are perfectly fine in C#, so there should be no issue with declaring the struct much as you've written it, though you will need to add access modifiers such as public. One caveat: a C# long is always 64 bits, whereas a C++ long is 32 bits with the Microsoft compiler, so if the header was written by such a program the fields correspond to C# int. To convert byte arrays to other primitives, the BitConverter class has a helpful family of methods, including ToInt64() and ToInt32(), that convert a run of bytes to a built-in type. To get the specific sequences of array bytes you'll need, check out this question on various techniques for doing array slices in C#.

The easiest way is to create an analogous data structure in C#; I won't go into that here as it is almost the same. An example of reading individual fields out of the byte array is below.
int headerVersionOffset = ... // defined in spec
byte[] headerVersionBuffer = new byte[sizeof(long)];
Buffer.BlockCopy(imageBytes, headerVersionOffset, headerVersionBuffer, 0, sizeof(long));
//Convert bytes to long, etc.
long headerVersion = BitConverter.ToInt64(headerVersionBuffer, 0);
You would want to adapt this to your data structure and usage, you could also accomplish this using a stream or other custom data structures to automatically handle the data for you.
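
As a rough sketch of such an analogous structure, the class below reads the ten named fields from the start of the array. It rests on assumptions the question does not confirm: each C++ long is 4 bytes, values are little-endian, and the fields are packed without padding. ImageHeader, Parse and ReadField are illustrative names:

```csharp
using System;
using System.Buffers.Binary;

public sealed class ImageHeader
{
    // Assumption: each C++ long in the header is 4 bytes, little-endian.
    public const int FieldSize = 4;
    // 10 named fields + RESERVED[54] + Comment[64] = 128 fields in total.
    public const int SizeInBytes = (10 + 54 + 64) * FieldSize;

    public int HeaderVersion, Width, Height, NumberOfBands, ColorDepth;
    public int ImageType, OriginalImageWidth, OriginalImageHeight, OffsetX, OffsetY;

    public static ImageHeader Parse(byte[] imageBytes)
    {
        if (imageBytes.Length < SizeInBytes)
            throw new ArgumentException("Buffer is smaller than the header.");

        return new ImageHeader
        {
            HeaderVersion = ReadField(imageBytes, 0),
            Width = ReadField(imageBytes, 1),
            Height = ReadField(imageBytes, 2),
            NumberOfBands = ReadField(imageBytes, 3),
            ColorDepth = ReadField(imageBytes, 4),
            ImageType = ReadField(imageBytes, 5),
            OriginalImageWidth = ReadField(imageBytes, 6),
            OriginalImageHeight = ReadField(imageBytes, 7),
            OffsetX = ReadField(imageBytes, 8),
            OffsetY = ReadField(imageBytes, 9),
        };
    }

    // Reads the field at the given position, counted in fields from the start.
    private static int ReadField(byte[] bytes, int fieldIndex)
    {
        return BinaryPrimitives.ReadInt32LittleEndian(
            bytes.AsSpan(fieldIndex * FieldSize, FieldSize));
    }
}
```

Under these assumptions the pixel data would start at offset ImageHeader.SizeInBytes in the array.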

Converting int[] to byte: How to look at int[] as it was byte[]?

To explain: I have an array of ints as input. I need to convert it to an array of bytes, where 1 int = 4 bytes (big endian). In C++, I can simply cast it and then access the data as if it were a byte array, without copying or counting the data - just direct access. Is this possible in C#? And in C# 2.0?

Yes, using unsafe code:
int[] arr =...
fixed(int* ptr = arr) {
byte* ptr2 = (byte*)ptr;
// now access ptr2[n]
}
If the compiler complains, add a (void*):
byte* ptr2 = (byte*)(void*)ptr;

You can create a byte[] 4 times the length of your int[].
Then, you iterate through your integer array and get the bytes for each element from:
BitConverter.GetBytes(int32);
Next you copy the 4 bytes from this function to the correct offset (i * 4) using Buffer.BlockCopy.
BitConverter
Buffer.BlockCopy

Have a look at the BitConverter class. You could iterate through the array of int, and call BitConverter.GetBytes(Int32) to get a byte[4] for each one.
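
The BitConverter approach described in these answers can be sketched as follows. Because the question asks for big-endian output, each 4-byte chunk is reversed on little-endian machines; IntArrayConverter and ToBigEndianBytes are illustrative names:

```csharp
using System;

public static class IntArrayConverter
{
    // Copies each int into the result as 4 bytes, most significant byte first.
    public static byte[] ToBigEndianBytes(int[] values)
    {
        byte[] result = new byte[values.Length * sizeof(int)];
        for (int i = 0; i < values.Length; i++)
        {
            // GetBytes returns the bytes in the machine's native order.
            byte[] chunk = BitConverter.GetBytes(values[i]);
            if (BitConverter.IsLittleEndian)
                Array.Reverse(chunk);
            Buffer.BlockCopy(chunk, 0, result, i * sizeof(int), sizeof(int));
        }
        return result;
    }
}
```

This allocates a small array per element, so it is the simple version rather than the fast one.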

If you write unsafe code, you can fix the array in memory, get a pointer to its beginning, and cast that pointer.
unsafe
{
fixed(int* pi=arr)
{
byte* pb=(byte*)pi;
...
}
}
An array in .net is prefixed with the number of elements, so you can't safely convert between int[] and byte[] that points to the same data. You can cast between uint[] and int[] (at least as far as .net is concerned, the support for this feature in C# itself is a bit inconsistent).
There is also a union based trick to reinterpret cast references, but I strongly recommend not using it.
The usual way to get individual integers from a byte array in native-endian order is BitConverter, but it's relatively slow. Manual code is often faster. And of course it doesn't support the reverse conversion.
One way to manually convert assuming little-endian (managed about 400 million reads per second on my 2.6GHz i3):
byte GetByte(int[] arr, int index)
{
uint elem=(uint)arr[index>>2];
return (byte)(elem>>( (index&3)* 8));
}
I recommend manually writing code that uses bitshifting to access individual bytes if you want to go with managed code, and pointers if you want the last bit of performance.
You also need to be careful about endianness issues. Some of these methods only support native endianness.

The simplest way in type-safe managed code is to use:
byte[] result = new byte[intArray.Length * sizeof(int)];
Buffer.BlockCopy(intArray, 0, result, 0, result.Length);
That doesn't quite do what I think your question asked, since on little endian architectures (like x86 or ARM), the result array will end up being little endian, but I'm pretty sure the same is true for C++ as well.
If you can use unsafe{}, you have other options:
unsafe
{
    // An array cannot be cast to a pointer directly; fix it as int* first.
    fixed (int* p = intArray)
    {
        byte* result = (byte*)p;
        // Do stuff with result.
    }
}
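
On newer .NET versions (not C# 2.0) there is also a way to view the int[] as bytes without copying and without unsafe code, using MemoryMarshal.AsBytes. This is a different technique from the ones above, shown here as a sketch; IntArrayView and AsByteView are made-up names:

```csharp
using System;
using System.Runtime.InteropServices;

public static class IntArrayView
{
    // Returns a byte view over the same memory as the int[]; nothing is copied.
    // The bytes appear in the machine's native order (little-endian on x86).
    public static Span<byte> AsByteView(int[] values)
    {
        return MemoryMarshal.AsBytes(values.AsSpan());
    }
}
```

Writes through the returned span change the original int[], so this behaves like the C++ cast rather than like Buffer.BlockCopy.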
