How to drop part of an array without allocating new memory? - c#

I have a byte array. I need to drop the first 4 bytes, like this:
byte[] newArray = new byte[byteArray.Length - 4];
Buffer.BlockCopy(byteArray, 4, newArray, 0, byteArray.Length - 4);
But can I just move the pointer, C/C++ style, like this:
byte[] byteMsg = byteArray + 4;
I do not want to allocate extra memory unless absolutely required, because this code is executed quite often.
Update: I receive the data from a Socket, so I should probably just use a different overload of Receive than count = s.Receive(byteArray);

No, you can't do that. A .NET array is always of a fixed size, and you can't do pointer arithmetic on it outside unsafe code.
Try using ArraySegment<byte> instead.
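A minimal sketch of the ArraySegment suggestion, assuming a made-up 8-byte buffer. The helper name SumSegment is hypothetical and only stands in for whatever code consumes the data. The segment is a view over the original array, so the payload bytes are not copied:

```csharp
using System;

static class SegmentDemo
{
    // Hypothetical consumer: reads only the bytes inside the segment.
    public static int SumSegment(ArraySegment<byte> segment)
    {
        int sum = 0;
        for (int i = segment.Offset; i < segment.Offset + segment.Count; i++)
        {
            sum += segment.Array[i];
        }
        return sum;
    }

    public static void Main()
    {
        byte[] byteArray = { 0xAA, 0xBB, 0xCC, 0xDD, 1, 2, 3, 4 };

        // View over the same array that skips the first 4 bytes; no copy is made.
        var byteMsg = new ArraySegment<byte>(byteArray, 4, byteArray.Length - 4);

        Console.WriteLine(byteMsg.Count);        // number of bytes in the view
        Console.WriteLine(SumSegment(byteMsg));  // sum of 1 + 2 + 3 + 4
    }
}
```

Any code that receives the segment has to honour Offset and Count itself, which is the price of not copying.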

I wouldn't worry; the GC will take care of cleaning up the memory that you're no longer using, provided it is not being referenced.
Arrays in C# have a fixed size. You can't resize them in place, so if you need an array without the first 4 bytes then you're going to have to allocate a new one and copy. As thecoop suggests, I'd take a look at ArraySegment and pass that around to other functions, if these first 4 bytes are not important to you.
It's also worth noting that yes, in C++ we'd use a bit of pointer arithmetic, but definitely keep hold of the original pointer, lest we end up deallocating the wrong address and losing 4 bytes to the Demons :)

Just leave the byte array untouched and use a MemoryStream and its offset capability. This won't change your array, and you have the ability to skip the first n bytes.
var memoryStream = new MemoryStream(byteArray);
// do whatever you want with the memory stream
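To actually skip the leading bytes, the MemoryStream(byte[], int, int, bool) constructor can be given a start index and a count. A rough sketch, where the helper name SkipFirst and the sample data are made up for illustration; note the stream object itself is still a small allocation, only the byte data is not copied:

```csharp
using System;
using System.IO;

static class StreamOffsetDemo
{
    // Wraps the array from `offset` to its end; the stream reads the original buffer in place.
    public static MemoryStream SkipFirst(byte[] byteArray, int offset)
    {
        return new MemoryStream(byteArray, offset, byteArray.Length - offset, false);
    }

    public static void Main()
    {
        byte[] byteArray = { 0xAA, 0xBB, 0xCC, 0xDD, 1, 2, 3, 4 };
        using (MemoryStream stream = SkipFirst(byteArray, 4))
        {
            Console.WriteLine(stream.Length);     // bytes visible through the stream
            Console.WriteLine(stream.ReadByte()); // first byte after the skipped prefix
        }
    }
}
```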

Related

find length of unsafe byte* pointing to native byte array c#

How can I find the length of a byte* in C#?
It's pointing to a native byte array in an unmanaged C++ library. I need to convert it to a C# byte[], but in order to do so, I need the length. .Length doesn't work.
byte* ETC = //Stuff from unmanaged c++ DLL;
int ETCLength = ????
You cannot know the length of something just from a pointer; the pointer is just the start. Usually, a pointer and a length are passed together. In the future, this may be improved by Span<T> - or maybe it won't! Time will tell.
You need to already know the length. This could be via an API, or it could be via documentation. There may be a pattern to the data that implies the end (nul terminators, for example, or the length being encoded in the first few bytes), but: that approach is how most buffer attacks start. You should always know the length if you're talking about pointers.
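Once the length is known, the copy into a byte[] is straightforward. A sketch using Marshal.Copy, where the buffer from Marshal.AllocHGlobal and the constant length only stand in for the real native pointer and whatever length the native API or its documentation reports (both are assumptions for the example):

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeCopyDemo
{
    // Copies `length` bytes starting at `source` into a new managed array.
    // The caller must supply the length; it cannot be derived from the pointer.
    public static byte[] ToManaged(IntPtr source, int length)
    {
        byte[] managed = new byte[length];
        Marshal.Copy(source, managed, 0, length);
        return managed;
    }

    public static void Main()
    {
        const int length = 4;                          // stand-in for the reported length
        IntPtr native = Marshal.AllocHGlobal(length);  // stand-in for the native buffer
        try
        {
            for (int i = 0; i < length; i++)
            {
                Marshal.WriteByte(native, i, (byte)(i + 1));
            }
            byte[] copy = ToManaged(native, length);
            Console.WriteLine(BitConverter.ToString(copy)); // 01-02-03-04
        }
        finally
        {
            Marshal.FreeHGlobal(native);
        }
    }
}
```

With a byte* such as ETC from the question, the pointer can be passed as (IntPtr)ETC inside an unsafe context.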

Converting int[] to byte: How to look at int[] as it was byte[]?

To explain: I have an array of ints as input. I need to convert it to an array of bytes, where 1 int = 4 bytes (big endian). In C++, I can simply cast it and then access it as if it were a byte array, without copying or counting the data - just direct access. Is this possible in C#? And in C# 2.0?
Yes, using unsafe code:
int[] arr =...
fixed(int* ptr = arr) {
byte* ptr2 = (byte*)ptr;
// now access ptr2[n]
}
If the compiler complains, add a (void*):
byte* ptr2 = (byte*)(void*)ptr;
You can create a byte[] 4 times the length of your int[].
Then you iterate through your integer array and get the bytes for each element from:
BitConverter.GetBytes(int32);
Next you copy the 4 bytes from this function to the correct offset (i * 4) using Buffer.BlockCopy.
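The steps above could be sketched roughly as follows. BitConverter.GetBytes returns bytes in the machine's native order, so this version reverses each chunk on little-endian machines to get the big-endian layout the question asks for. The method name ToBigEndianBytes is made up, and the per-element 4-byte array is an extra allocation this approach accepts:

```csharp
using System;

static class IntToBytesDemo
{
    // Converts each int to 4 bytes, most significant byte first (big-endian).
    public static byte[] ToBigEndianBytes(int[] values)
    {
        byte[] result = new byte[values.Length * sizeof(int)];
        for (int i = 0; i < values.Length; i++)
        {
            byte[] chunk = BitConverter.GetBytes(values[i]); // native byte order
            if (BitConverter.IsLittleEndian)
            {
                Array.Reverse(chunk); // flip to big-endian on little-endian machines
            }
            Buffer.BlockCopy(chunk, 0, result, i * sizeof(int), chunk.Length);
        }
        return result;
    }

    public static void Main()
    {
        byte[] bytes = ToBigEndianBytes(new[] { 1, 0x01020304 });
        Console.WriteLine(BitConverter.ToString(bytes)); // 00-00-00-01-01-02-03-04
    }
}
```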
Have a look at the BitConverter class. You could iterate through the array of int, and call BitConverter.GetBytes(Int32) to get a byte[4] for each one.
If you write unsafe code, you can fix the array in memory, get a pointer to its beginning, and cast that pointer.
unsafe
{
fixed(int* pi=arr)
{
byte* pb=(byte*)pi;
...
}
}
An array in .net is prefixed with the number of elements, so you can't safely convert between int[] and byte[] that points to the same data. You can cast between uint[] and int[] (at least as far as .net is concerned, the support for this feature in C# itself is a bit inconsistent).
There is also a union based trick to reinterpret cast references, but I strongly recommend not using it.
The usual way to get individual integers from a byte array in native-endian order is BitConverter, but it's relatively slow. Manual code is often faster. And of course it doesn't support the reverse conversion.
One way to manually convert assuming little-endian (managed about 400 million reads per second on my 2.6GHz i3):
byte GetByte(int[] arr, int index)
{
uint elem=(uint)arr[index>>2];
return (byte)(elem>>( (index&3)* 8));
}
I recommend manually writing code that uses bitshifting to access individual bytes if you want to go with managed code, and pointers if you want the last bit of performance.
You also need to be careful about endianness issues. Some of these methods only support native endianness.
The simplest way in type-safe managed code is to use:
byte[] result = new byte[intArray.Length * sizeof(int)];
Buffer.BlockCopy(intArray, 0, result, 0, result.Length);
That doesn't quite do what I think your question asked, since on little endian architectures (like x86 or ARM), the result array will end up being little endian, but I'm pretty sure the same is true for C++ as well.
If you can use unsafe{}, you have other options:
unsafe{
// An array can't be cast to a pointer directly; pin it as int* first.
fixed(int* ptr = intArray){
byte* result = (byte*)ptr;
// Do stuff with result.
}
}

Converting C# byte array to C++

I really appreciate this community and all the help it has provided towards my programming problems that I've had in the past.
Now unfortunately, I cannot seem to find an answer to this problem which, at first glance, seems like a no brainer. Please note that I am currently using C++ 6.0.
Here is the code that I am trying to convert from C#:
byte[] Data = new byte[0x200000];
uint Length = (uint)Data.Length;
In C++, I declared the new byte array Data as follows:
BYTE Data[0x200000];
DWORD Length = sizeof(Data) / sizeof(DWORD);
When I run my program, I receive stack overflow errors (go figure). I believe this is because the array is so large (2 MB if I'm not mistaken).
Is there any way to implement this size array in C++ 6.0?
Defining the array this way puts it on the stack, which ends in a stack overflow. You can create very big arrays on the heap by using pointers. For example:
BYTE *Data = new BYTE[0x200000];
Currently, you are allocating a lot of memory on the thread's stack, which will cause stack overflow, as stack space is usually limited to a few megabytes. You can create the array on the heap with new (by the way, you are calculating the array length incorrectly):
DWORD length = 0x200000;
BYTE* Data = new BYTE[length];
You might as well use vector<BYTE> instead of a raw array:
vector<BYTE> Data(0x200000); // sized up front; a default-constructed vector would be empty
size_t length = Data.size();

Encoding-free String class for handling bytes? (Or alternative approach)

I have an application converted from Python 2 (where strings are essentially lists of bytes) and I'm using a string as a convenient byte buffer.
I am rewriting some of this code in the Boo language (Python-like syntax, runs on .NET) and am finding that the strings have an intrinsic encoding type, such as ASCII, UTF-8, etc. Most of the information dealing with bytes refer to arrays of bytes, which are (apparently) fixed length, making them quite awkward to work with.
I can obviously get bytes from a string, but at the risk of expanding some characters into multiple bytes, or discarding/altering bytes above 127, etc. This is fine and I fully understand the reasons for this - but what would be handy for me is either (a) an encoding that guarantees no conversion or discarding of characters so that I can use a string as a convenient byte buffer, or (b) some sort of ByteString class that gives the convenience of the string class. (Ideally the latter as it seems less of a hack.) Do either of these already exist? (Or are trivial to implement?)
I am aware of System.IO.MemoryStream, but the prospect of creating one of those each time and then having to make a System.IO.StreamReader at the end just to get access to ReadToEnd() doesn't seem very efficient, and this is in performance-sensitive code.
(I hope nobody minds that I tagged this as C# as I felt the answers would likely apply there also, and that C# users might have a good idea of the possible solutions.)
EDIT: I've also just discovered System.Text.StringBuilder - again, is there such a thing for bytes?
Use the Latin-1 encoding as described in this answer. It maps values in the range 128-255 unchanged, useful when you want to roundtrip bytes to chars.
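A quick way to check the round trip, assuming the encoding is looked up by its "ISO-8859-1" name (the class and method names here are made up for the example). It pushes all 256 byte values through a string and back and compares the result:

```csharp
using System;
using System.Text;

static class Latin1Demo
{
    // ISO-8859-1 (Latin-1) maps each byte value 0-255 to the char with the same code.
    static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    public static string BytesToString(byte[] data) { return Latin1.GetString(data); }
    public static byte[] StringToBytes(string text) { return Latin1.GetBytes(text); }

    public static void Main()
    {
        byte[] all = new byte[256];
        for (int i = 0; i < all.Length; i++) all[i] = (byte)i;

        byte[] roundTripped = StringToBytes(BytesToString(all));

        bool same = roundTripped.Length == all.Length;
        for (int i = 0; same && i < all.Length; i++) same = all[i] == roundTripped[i];
        Console.WriteLine(same); // expected to report that the bytes survived unchanged
    }
}
```

Keep in mind that each byte still occupies a two-byte char while it lives in the string.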
UPDATE
Or if you want to manipulate bytes directly, use List<byte>:
List<byte> result = ...
...
// Add a byte at the end
result.Add(b);
// Add a collection of bytes at the end
byte[] bytesToAppend = ...
result.AddRange(bytesToAppend);
// Insert a collection of bytes at any position
byte[] bytesToInsert = ...
int insertIndex = ...
result.InsertRange(insertIndex, bytesToInsert);
// Remove a range of bytes
result.RemoveRange(index, count);
... etc ...
I've also just discovered System.Text.StringBuilder - again, is there such a thing for bytes?
The StringBuilder class is needed because regular strings are immutable, and a List<byte> gives you everything you might expect from a "StringBuilder for bytes".
I would suggest that you use MemoryStream combined with the GetBuffer() method to retrieve the end result (only the first Length bytes of that buffer are valid data). Strings are actually fixed length and immutable, and adding or replacing one byte in a string requires copying the whole thing into a new string, which is quite slow. To avoid this you would need to use a StringBuilder, which allocates memory and doubles the capacity when needed, but then you might just as well use MemoryStream instead, which does a similar thing but on bytes.
Each element in a string is a char, which actually takes two bytes because .NET strings are always UTF-16 in memory, which means you will also be wasting memory if you decide to store only one byte in each element.
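As a rough illustration of using MemoryStream as a "StringBuilder for bytes" (the Concat helper is a made-up name). This sketch swaps in ToArray() for the final step, because GetBuffer() hands back the internal buffer, which may be longer than the data written:

```csharp
using System;
using System.IO;

static class ByteBuilderDemo
{
    // Appends each chunk to a growable stream, then copies out exactly the bytes written.
    public static byte[] Concat(params byte[][] chunks)
    {
        using (var stream = new MemoryStream())
        {
            foreach (byte[] chunk in chunks)
            {
                stream.Write(chunk, 0, chunk.Length);
            }
            // ToArray() copies only the first stream.Length bytes of the internal buffer.
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        byte[] result = Concat(new byte[] { 1, 2 }, new byte[] { 3 }, new byte[] { 4, 5, 6 });
        Console.WriteLine(BitConverter.ToString(result)); // 01-02-03-04-05-06
    }
}
```

If the final copy matters, GetBuffer() together with the stream's Length avoids it, at the cost of the caller having to track how much of the buffer is valid.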

Remove First 16 Bytes?

How would I go about removing a number of bytes from a byte array?
EDIT: As nobugz's comment (and Reed Copsey's answer) mentions, if you don't actually need the result as a byte array, you should look into using ArraySegment<T>:
ArraySegment<byte> segment = new ArraySegment<byte>(full, 16, full.Length - 16);
Otherwise, copying will be necessary - arrays always have a fixed size, so you can't "remove" the first 16 bytes from the existing array. Instead, you'll have to create a new, smaller array and copy the relevant data into it.
Zach's suggestion is along the right lines for the non-LINQ approach, but it can be made simpler (this assumes you already know the original array is at least 16 bytes long):
byte[] newArray = new byte[oldArray.Length - 16];
Buffer.BlockCopy(oldArray, 16, newArray, 0, newArray.Length);
or
byte[] newArray = new byte[oldArray.Length - 16];
Array.Copy(oldArray, 16, newArray, 0, newArray.Length);
I suspect Buffer.BlockCopy will be slightly faster, but I don't know for sure.
Note that both of these could be significantly more efficient than the LINQ approach if the arrays involved are big: the LINQ approach requires each byte to be individually returned from an iterator, and potentially intermediate copies to be made (in the same way as adding items to a List<T> needs to grow the backing array periodically). Obviously don't micro-optimise, but it's worth checking if this bit of code is a performance bottleneck.
EDIT: I ran a very "quick and dirty" benchmark of the three approaches. I don't trust the benchmark to distinguish between Buffer.BlockCopy and Array.Copy - they were pretty close - but the LINQ approach was over 100 times slower.
On my laptop, using byte arrays of 10,000 elements, it took nearly 10 seconds to perform 40,000 copies using LINQ; the above approaches took about 80ms to do the same amount of work. I upped the iteration count to 4,000,000 and it still only took about 7 seconds. Obviously the normal caveats around micro-benchmarks apply, but this is a pretty significant difference.
Definitely use the above approach if this is in a code path which is important to performance :)
You could do this:
using System.Linq;
// ...
var newArray = oldArray.Skip(numBytes).ToArray();
I will also mention - depending on how you plan to use the results, often, an alternative approach is to use ArraySegment<T> to just access the remaining portion of the array. This prevents the need to copy the array, which can be more efficient in some usage scenarios:
ArraySegment<byte> segment = new ArraySegment<byte>(originalArray, 16, originalArray.Length-16);
// Use segment how you'd use your array...
If you can't use Linq, you could do it this way:
byte[] myArray = // however you acquire the array
byte[] newArray = new byte[myArray.Length - 16];
for (int i = 0; i < newArray.Length; i++)
{
newArray[i] = myArray[i + 16];
}
// newArray is now myArray minus the first 16 bytes
You'll also need to handle the case where the array is less than 16 bytes long.
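One way the short-array case might be handled, as a sketch only. The SkipBytes name and the choice to return an empty array for inputs shorter than the skip count are assumptions; throwing instead would be equally reasonable depending on the caller:

```csharp
using System;

static class SkipDemo
{
    // Returns a copy of `source` without its first `count` bytes.
    // Choice made for this sketch: an input no longer than `count` yields an empty array.
    public static byte[] SkipBytes(byte[] source, int count)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (count < 0) throw new ArgumentOutOfRangeException("count");
        if (source.Length <= count) return new byte[0];

        byte[] result = new byte[source.Length - count];
        Buffer.BlockCopy(source, count, result, 0, result.Length);
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(SkipBytes(new byte[20], 16).Length); // 4
        Console.WriteLine(SkipBytes(new byte[10], 16).Length); // 0
    }
}
```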
