Find the length of an unsafe byte* pointing to a native byte array in C#

How can I find the length of a byte* in C#?
It's pointing to a native byte array in an unmanaged C++ library. I need to convert it to a C# byte[], but to do so I need the length. .Length doesn't work.
byte* ETC = //Stuff from unmanaged c++ DLL;
int ETCLength = ????

You cannot know the length of something just from a pointer; the pointer is just the start. Usually, a pointer and a length are passed together. In the future, this may be improved by Span<T> - or maybe it won't! Time will tell.
You need to already know the length. This could be via an API, or it could be via documentation. There may be a pattern to the data that implies the end (nul terminators, for example, or the length being encoded in the first few bytes), but: that approach is how most buffer attacks start. You should always know the length if you're talking about pointers.
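As a rough illustration of "a pointer and a length are passed together", the native side could report the length through an out-parameter. The sketch below is C++ with hypothetical names (GetEtcData, kEtcData); it is not the API of the library in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical native buffer owned by the library (placeholder contents).
static const std::uint8_t kEtcData[] = {0x10, 0x20, 0x30, 0x40, 0x50};

// Hypothetical export: returns the start of the buffer and reports its
// length through an out-parameter, so the caller never has to guess.
extern "C" const std::uint8_t* GetEtcData(std::size_t* length) {
    assert(length != nullptr);
    *length = sizeof(kEtcData);  // element size is 1, so bytes == elements
    return kEtcData;
}
```

On the C# side, once the length is known, something like Marshal.Copy(IntPtr, byte[], int, int) can copy the bytes into a managed array.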

Related

How to drop part of an array without allocating new memory?

I have a byte array. I need to drop the first 4 bytes, like this:
byte[] newArray = new byte[byteArray.Length - 4];
Buffer.BlockCopy(byteArray, 4, newArray, 0, byteArray.Length - 4);
But can I just move the pointer, C/C++ style?
byte[] byteMsg = byteArray + 4;
I do not want to allocate extra memory until absolutely required, because this code is executed pretty often.
Update: I receive the data from a Socket, so I should probably just use another overload of Receive: count = s.Receive(byteArray);
No, you can't do that. A .NET array is always of a fixed size, and you can't do pointer arithmetic on it outside unsafe code.
Try using ArraySegment instead.
I wouldn't worry, the GC will take care of cleaning up the memory that you're no longer using provided it is not being referenced.
Arrays in C# have a fixed size. You can't shrink one in place, so if you need an actual array without the first 4 bytes, you're going to have to allocate a new one. As thecoop suggests, I'd take a look at ArraySegment and pass that around to other functions, if these first 4 bytes are not important to you.
It's also worth noting that yes, in C++, we'd use a bit of pointer arithmetic, but definitely keep hold of the original pointer, lest we end up deallocating from the wrong address and losing 4 bytes to the Demons :)
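A minimal C++ sketch of that pointer-arithmetic idea, assuming a hypothetical ByteView struct and SkipPrefix helper (neither comes from the original answers). The view only describes a range; the original pointer stays with the caller and is the one used for deallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Non-owning view: a start pointer plus a count, similar in spirit to
// ArraySegment. It never frees anything.
struct ByteView {
    const std::uint8_t* data;
    std::size_t size;
};

// Returns a view that skips the first `count` bytes of `buffer`.
// No bytes are copied and no memory is allocated.
ByteView SkipPrefix(const std::uint8_t* buffer, std::size_t length,
                    std::size_t count) {
    assert(count <= length);
    return ByteView{buffer + count, length - count};
}
```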
Just leave the byte array untouched and use a MemoryStream and its offset capability. This won't change your array, and you can skip the first n bytes.
var memoryStream = new MemoryStream(byteArray, 4, byteArray.Length - 4); // stream over the same array, starting at offset 4
// do whatever you want with the memory stream

Converting int[] to byte[]: how to look at an int[] as if it were a byte[]?

To explain: I have array of ints as input. I need to convert it to array of bytes, where 1 int = 4 bytes (big endian). In C++, I can easily just cast it and then access to the field as if it was byte array, without copying or counting the data - just direct access. Is this possible in C#? And in C# 2.0?
Yes, using unsafe code:
int[] arr =...
fixed(int* ptr = arr) {
byte* ptr2 = (byte*)ptr;
// now access ptr2[n]
}
If the compiler complains, add a (void*):
byte* ptr2 = (byte*)(void*)ptr;
You can create a byte[] 4 times the length of your int[].
Then you iterate through your integer array and get the bytes for each element from:
BitConverter.GetBytes(int32);
Next you copy the 4 bytes returned by this function to the correct offset (i * 4) using Buffer.BlockCopy.
Have a look at the BitConverter class. You could iterate through the array of int, and call BitConverter.GetBytes(Int32) to get a byte[4] for each one.
If you write unsafe code, you can fix the array in memory, get a pointer to its beginning, and cast that pointer.
unsafe
{
fixed(int* pi=arr)
{
byte* pb=(byte*)pi;
...
}
}
An array in .net is prefixed with the number of elements, so you can't safely convert between int[] and byte[] that points to the same data. You can cast between uint[] and int[] (at least as far as .net is concerned, the support for this feature in C# itself is a bit inconsistent).
There is also a union based trick to reinterpret cast references, but I strongly recommend not using it.
The usual way to get individual integers from a byte array in native-endian order is BitConverter, but it's relatively slow. Manual code is often faster. And of course it doesn't support the reverse conversion.
One way to manually convert assuming little-endian (managed about 400 million reads per second on my 2.6GHz i3):
byte GetByte(int[] arr, int index)
{
uint elem=(uint)arr[index>>2];
return (byte)(elem>>( (index&3)* 8));
}
I recommend manually writing code that uses bitshifting to access individual bytes if you want to go with managed code, and pointers if you want the last bit of performance.
You also need to be careful about endianness issues. Some of these methods only support native endianness.
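To make the endianness point concrete, here is a C++ sketch of the same shift-based access with the byte order chosen explicitly rather than inherited from the platform. The names GetByteLE and GetByteBE are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte `index` of the array viewed as little-endian bytes
// (least significant byte of each element first).
std::uint8_t GetByteLE(const std::int32_t* arr, std::size_t index) {
    std::uint32_t elem = static_cast<std::uint32_t>(arr[index >> 2]);
    return static_cast<std::uint8_t>(elem >> ((index & 3) * 8));
}

// Same view in big-endian order (most significant byte first),
// which is the layout the question asks for.
std::uint8_t GetByteBE(const std::int32_t* arr, std::size_t index) {
    std::uint32_t elem = static_cast<std::uint32_t>(arr[index >> 2]);
    return static_cast<std::uint8_t>(elem >> ((3 - (index & 3)) * 8));
}
```

Because the shifts operate on values rather than on memory, both functions give the same results on little-endian and big-endian machines.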
The simplest way in type-safe managed code is to use:
byte[] result = new byte[intArray.Length * sizeof(int)];
Buffer.BlockCopy(intArray, 0, result, 0, result.Length);
That doesn't quite do what I think your question asked, since on little endian architectures (like x86 or ARM), the result array will end up being little endian, but I'm pretty sure the same is true for C++ as well.
If you can use unsafe{}, you have other options:
unsafe{
fixed(int* p = intArray){
byte* result = (byte*)p; // view the pinned ints as raw bytes
// Do stuff with result.
}
}

About the "GetBytes" implementation in BitConverter

I've found that the implementation of the GetBytes function in .net framework is something like:
public unsafe static byte[] GetBytes(int value)
{
byte[] bytes = new byte[4];
fixed(byte* b = bytes)
*((int*)b) = value;
return bytes;
}
I'm not so sure I understand the full details of these two lines:
fixed(byte* b = bytes)
*((int*)b) = value;
Could someone provide a more detailed explanation here? And how should I implement this function in standard C++?
"Could someone provide a more detailed explanation here?"
The MSDN documentation for fixed comes with numerous examples and explanation -- if that's not sufficient, then you'll need to clarify which specific part you don't understand.
"And how should I implement this function in standard C++?"
#include <cstring>
#include <vector>
std::vector<unsigned char> GetBytes(int value)
{
std::vector<unsigned char> bytes(sizeof(int));
std::memcpy(&bytes[0], &value, sizeof(int));
return bytes;
}
fixed tells the garbage collector not to move a managed object, so that you can access it through ordinary pointers.
In C++, if you're not using C++/CLI (i.e. not using .NET), then you can just use a pointer to a byte-sized type (char* or unsigned char*) and loop through the bytes of whatever you're trying to convert.
Just be aware of endianness...
First, fixed has to be used because we want to assign a pointer to a managed variable:
The fixed statement prevents the garbage collector from relocating a
movable variable. The fixed statement is only permitted in an unsafe
context. Fixed can also be used to create fixed size buffers.
The fixed statement sets a pointer to a managed variable and "pins"
that variable during the execution of the statement. Without fixed,
pointers to movable managed variables would be of little use since
garbage collection could relocate the variables unpredictably. The
C# compiler only lets you assign a pointer to a managed variable in a
fixed statement. Ref.
Then we declare a pointer to byte and assign to the start of the byte array.
Then we cast the pointer to byte to a pointer to int, dereference it, and assign the passed-in int through it, which writes the int's 4 bytes into the array.
The function creates a byte array that contains the same binary data as your platform's representation of the integer value. In C++, this can be achieved (for any type really) like so:
int value = 0; // or any type!
unsigned char b[sizeof(value)];
unsigned char const * const p = reinterpret_cast<unsigned char const *>(&value);
std::copy(p, p + sizeof(value), b); // std::copy requires #include <algorithm>
Now b is an array of as many bytes as the size of the type int (or whichever type you used).
In C# you need to say fixed to obtain a raw pointer, since usually you do not have raw pointers in C# on account of objects not having a fixed location in memory -- the garbage collector can move them around at any time. fixed prevents this and fixes the object in place so a raw pointer can make sense.
You can implement GetBytes() for any POD type with a simple function template.
#include <vector>
template <typename T>
std::vector<unsigned char> GetBytes(T value)
{
return std::vector<unsigned char>(reinterpret_cast<unsigned char*>(&value),
reinterpret_cast<unsigned char*>(&value) + sizeof(value));
}
Here is a C++ header-only library that may be of help: BitConverter.
The idea of implementing the GetBytes function in C++ is straightforward: compute each byte of the value according to the specified layout. For example, let's say we need to get the bytes of an unsigned 16-bit integer in big endian. We can divide the value by 256 to get the first byte, and take the remainder as the second byte.
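The 16-bit example can be sketched as follows. The function name GetBytesBigEndian is chosen for this illustration and is not taken from the library mentioned above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Big-endian bytes of an unsigned 16-bit value:
// quotient by 256 is the high byte, remainder is the low byte.
std::array<std::uint8_t, 2> GetBytesBigEndian(std::uint16_t value) {
    return {{static_cast<std::uint8_t>(value / 256),
             static_cast<std::uint8_t>(value % 256)}};
}
```

Since the bytes are computed arithmetically, the result does not depend on the byte order of the machine running the code.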
For floating-point numbers, the algorithm is a little bit more complicated. We need to get the sign, exponent, and mantissa of the number, and encode them as bytes. See https://en.wikipedia.org/wiki/Double-precision_floating-point_format
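A sketch of extracting those three fields, assuming the platform's double is an IEEE 754 binary64 value (true on common hardware, but not guaranteed by the C++ standard). DoubleFields and Decompose are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct DoubleFields {
    std::uint64_t sign;      // 1 bit
    std::uint64_t exponent;  // 11 bits, biased by 1023
    std::uint64_t mantissa;  // 52 fraction bits, without the implicit leading 1
};

// Splits a double into its sign, exponent, and mantissa fields.
// Assumption: double uses the IEEE 754 binary64 layout.
DoubleFields Decompose(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to read the bits
    const std::uint64_t mantissaMask = (std::uint64_t{1} << 52) - 1;
    return DoubleFields{bits >> 63, (bits >> 52) & 0x7FF, bits & mantissaMask};
}
```

Once the fields are known, they can be packed into bytes in whatever order the target layout requires.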

Can someone tell me what this crazy c++ statement means in C#?

First off, no I am not a student...just a C# guy porting a C++ library.
What do these two crazy lines mean? What are they equivalent to in C#? I'm mostly concerned with the size_t and sizeof. Not concerned about static_cast or assert..I know how to deal with those.
size_t Index = static_cast<size_t>((y - 1620) / 2);
assert(Index < sizeof(DeltaTTable)/sizeof(double));
y is a double and DeltaTTable is a double[]. Thanks in advance!
size_t is a typedef for an unsigned integer type. It is used for sizes of things, and may be 32 or 64 bits in size. The particular size of a size_t is implementation defined, but it is unsigned.
I suppose in C# you could use a 64-bit unsigned integer type.
All sizeof does is return the size in bytes of a C++ type. Every type takes up a certain quantity of room, and sizeof returns that size.
What your code is doing is computing the number of doubles (64-bit floats) that the DeltaTTable takes up. Essentially, it's ensuring that the table is larger than some size based on y, whatever that is.
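Here is a small C++ sketch of what the two lines do. The table contents are placeholders, and kTableCount and IndexFor are names invented for this example; only the index formula and the sizeof division come from the question.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the real table; the values are placeholders.
static const double DeltaTTable[] = {124.0, 115.0, 106.0, 98.0};

// Element count: total size in bytes divided by the size of one element.
constexpr std::size_t kTableCount = sizeof(DeltaTTable) / sizeof(double);

// Maps y to a table index using the formula from the question.
// Assumes y >= 1620, so the conversion to an unsigned type is well defined.
std::size_t IndexFor(double y) {
    std::size_t index = static_cast<std::size_t>((y - 1620) / 2);
    assert(index < kTableCount);  // same bounds check as in the question
    return index;
}
```

In C#, DeltaTTable.Length gives the element count directly, so the division is unnecessary there.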
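C# does have a sizeof operator, but it applies to unmanaged value types, not to arrays, and you don't need it here: DeltaTTable.Length already gives the element count. There is no reason to port the sizeof arithmetic to C#.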
The bad news first: there is no direct translation of these lines into C#. C# has no static_cast; a plain cast expression such as (int)((y - 1620) / 2) takes its place. The good news is that it doesn't matter.
The two lines of code assert that the index is within the bounds of the table, so that the code won't accidentally read some arbitrary memory location. The CLR takes care of that for you: an out-of-range array index throws an IndexOutOfRangeException. So when porting, you can drop the assert; the check is there automatically anyway.
Of course, this is an assumption based on the pattern of the code. There's no information on what y represents or how Index is used.
sizeof calculates how much memory, in bytes, the DeltaTTable array takes.
There is no equivalent way to calculate the size like this in C#, AFAIK; use DeltaTTable.Length for the element count instead.
size_t is an unsigned integer type in the C++ code, not a struct.

C# analog for getBytes in java

There is a wonderful method on Java's String class called getBytes.
In C# it is implemented in another class, Encoding, but unfortunately it returns an array of unsigned bytes, which is a problem.
How is it possible to get an array of signed bytes in C# from a string?
Just use Encoding.GetBytes but then convert the byte[] to an sbyte[] by using something like Buffer.BlockCopy. However, I'd strongly encourage you to use the unsigned bytes instead - work round whatever problem you're having with them instead of moving to signed bytes, which were frankly a mistake in Java to start with. The reason there's no built-in way of converting a string to a signed byte array is because it's rarely something you really want to be doing.
If you can tell us a bit about why the unsigned bytes are causing you a problem, we may well be able to help you with that instead.
