Get an array subset without copying, like in C with pointers - C#

I have an API call that takes a byte[] as a parameter, and my data is already in a byte[]. The problem is that I want to send this buffer in small chunks.
The slow solution would be to copy the array data into new arrays, but I don't want to do that because the copying is unnecessary. I just want a byte[] pointer that I can move around in my buffer, like in C or C++.
Here is a sample in pseudocode:
ArrayOriginal = { 0, 1, 2, 3, 4, ... 100 }
ArrayFirstChunk = { 0, 1, 2 } (pointer to the first element in the Original Array)
ArraySecondChunk = { 3, 4, 5 } (pointer to the fourth element in the Original Array)
...
Is this possible? The data should exist only once in memory.
thx

You don't say whether you can change the API. I assume not, but if you can, there is always IEnumerable<byte> - so you return
myarray.Skip(4).Take(4);
etc

You could try using unsafe to get a pointer to your array. Otherwise Buffer.BlockCopy is an efficient way of copying portions of arrays to another array. If sending small chunks of data you could just reuse the small array instance and leave it to garbage collection to release the memory from the array.

You can create a FakeArray type that contains an array, an offset and a length. That way you can work with a sub-range of the array, but it won't be an array.
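The framework already ships a type of this shape, ArraySegment<T>. The following is only a sketch of how it could be used for chunking; the ChunkDemo class and the Chunks helper are names made up for this illustration, and it only helps if the receiving API can accept a segment (or an array plus offset and count) rather than a bare byte[].

```csharp
using System;
using System.Collections.Generic;

public static class ChunkDemo
{
    // Hypothetical helper: yields windows over 'buffer' of at most
    // 'chunkSize' bytes each. No bytes are copied; every segment
    // refers to the same underlying array.
    public static IEnumerable<ArraySegment<byte>> Chunks(byte[] buffer, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        for (int offset = 0; offset < buffer.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, buffer.Length - offset);
            yield return new ArraySegment<byte>(buffer, offset, count);
        }
    }

    public static void Main()
    {
        byte[] data = { 0, 1, 2, 3, 4, 5, 6, 7 };
        foreach (ArraySegment<byte> chunk in Chunks(data, 3))
        {
            // chunk.Array is the original array; Offset and Count describe the window.
            Console.WriteLine($"offset={chunk.Offset}, count={chunk.Count}");
        }
        // prints offset=0, count=3 / offset=3, count=3 / offset=6, count=2
    }
}
```

The last chunk is simply shorter when the buffer length is not a multiple of the chunk size.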

When you pass an array as a parameter, you are passing just a pointer to that array. The array is stored only once in memory.
So, I think you just don't need to divide in little chunks.
If you want to process it in chunks, I would suggest just reading the desired elements, first from 0 to 2, then from 3 to 5, etc...
Hope that helps

The usual way of handling data inside an array is simply to specify a chosen offset and count, for example:
// {0, ..., 100}
byte[] data = Enumerable.Range(0, 101).Select(i => (byte)i).ToArray();
Write(data, 0, 3);
Write(data, 3, 3);
...
static void Write(byte[] buffer, int offset, int count)
{
for(int i = offset ; i < offset + count; i++)
Console.WriteLine(buffer[i]);
Console.WriteLine();
}
You can do something similar to the C approach with unsafe code (via byte* and fixed, inside an unsafe method or block), although I'm not sure it buys you much here:
fixed(byte* ptr = data)
{
Write(ptr, 3);
Write(ptr + 3, 3);
}
...
static unsafe void Write(byte* ptr, int count)
{
for (int i = 0; i < count; i++)
Console.WriteLine(ptr[i]);
Console.WriteLine();
}
You can encapsulate a buffer, offset and count, but then it won't be an array - probably not very helpful.
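On newer runtimes, Span<T> is a ready-made version of that "buffer, offset and count" wrapper. This is a rough sketch that assumes .NET Core 2.1 or later (or the System.Memory package) and assumes the consuming method can be changed to take a span; SpanDemo and Sum are hypothetical names used only for illustration.

```csharp
using System;

public static class SpanDemo
{
    // Hypothetical consumer that accepts a window instead of a byte[].
    public static int Sum(ReadOnlySpan<byte> chunk)
    {
        int total = 0;
        foreach (byte b in chunk) total += b;
        return total;
    }

    public static void Main()
    {
        byte[] data = { 0, 1, 2, 3, 4, 5 };

        // A view over data[3], data[4], data[5]; nothing is copied.
        Span<byte> second = data.AsSpan(3, 3);
        second[0] = 42;                             // writes through to data[3]

        Console.WriteLine(data[3]);                 // prints 42
        Console.WriteLine(Sum(data.AsSpan(0, 3)));  // prints 3 (0 + 1 + 2)
    }
}
```

Like the wrapper described above, a span is not an array, so it cannot be passed to an API that insists on byte[].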

Related

How can you fill an entire array pointer with a single value with a single write operation?

I have a pointer to a byte array, and I need to set the values of a certain region of this array to 0. I'm quite familiar with the methods available through the Marshal/Buffer/Array classes, and this problem is not at all hard.
The problem, however, is that I do not want to create excessive arrays, or write every byte one-by-one. All the methods I'm familiar with require full arrays, though, and they obviously don't work with single values.
I've seen several C methods that would achieve the result I'm looking for, but I don't believe I have access to these methods without including the whole C library, or without writing platform-specific code.
My current solution is shown below, but I'd like to achieve this without allocating a new byte array.
Marshal.Copy(new byte[Length], 0, ptr + offset, length);
So is there a method in C#, or in an unmanaged language/library that I can use to fill an array (via a pointer) at a certain offset and for a certain length, with one single value (0)?
Miraculously, ChatGPT came rather close when I asked what would be a good solution to this problem. It didn't figure it out, but it suggested that I use spans.
As such, this is the solution I've come up with:
Span<byte> span = new Span<byte>(ptr + offset, Length);
span.Fill(0);
This solution is about 25 times faster than having to allocate a byte array for very large arrays.
Example benchmarks:
int size = 100_000;
nint ArrayPointer = Marshal.AllocHGlobal(size);
int trials = 1_000_000;
// Runtime was 1582ms
Benchmark("Fill with span", () =>
{
Span<byte> span = new Span<byte>((void*) ArrayPointer, size);
span.Fill(0);
}, trials);
// Runtime was 40681ms
Benchmark("Fill with allocation", () =>
{
Marshal.Copy(new byte[size], 0, ArrayPointer, size);
}, trials);
// Far too slow to get a result with these settings
Benchmark("Fill individually", () =>
{
for (int i = 0; i < size; i++)
{
Marshal.WriteByte(ArrayPointer + i, 0);
}
}, trials);
// Results with size = 100_000 and trials = 100_000
// Fill with span: 176ms
// Fill with allocation: 4382ms
// Fill individually: 24672ms
You can use Array.Fill for this (it is a static method on Array, so the array is passed as the first argument):
Array.Fill(arrayName, 'X', 4, 10); // fill 10 elements of a char array, starting at index 4, with 'X'
https://learn.microsoft.com/en-us/dotnet/api/system.array.fill?view=net-7.0
Note: The documentation for C# is quite good. You can go to the website and see all the methods for Array. If you really care how this is implemented, you could even go to GitHub and read the source code.

How to take array segments out of a byte array after every X step?

I have a big byte array (around 50 KB) and I need to extract numeric values from it. Every three bytes represent one value.
I tried LINQ's Skip and Take, but that is really slow given the size of the array.
This is my very slow routine:
List<int> ints = new List<int>();
for (int i = 0; i <= fullFile.Count(); i+=3)
{
ints.Add(BitConverter.ToInt16(fullFile.Skip(i).Take(i + 3).ToArray(), 0));
}
I think I've got the wrong approach to this.
Your code
First of all, ToInt16 only uses two bytes. So your third byte will be discarded.
You can't use ToInt32 as it would include one extra byte.
Let's review this:
fullFile.Skip(i).Take(i + 3).ToArray()
...and take a careful look at Take(i + 3). It asks for a larger and larger buffer on every iteration. For instance, when i is 32000 you request up to 32003 bytes for your new buffer (capped by however many bytes remain after the skip).
That's why the code is quite slow.
The code is also slow because you allocate a lot of byte buffers that will need to be garbage collected: one extra buffer per iteration, which is roughly 17,000 buffers for a 50 KB array.
You could also have done it like this:
List<int> ints = new List<int>();
var workBuffer = new byte[4];
for (int i = 0; i + 3 <= fullFile.Length; i += 3)
{
// Copy the three bytes into the beginning of the temp buffer
Buffer.BlockCopy(fullFile, i, workBuffer, 0, 3);
// Now we can use ToInt32 as the last byte always is zero
var value = BitConverter.ToInt32(workBuffer, 0);
ints.Add(value);
}
Quite easy to understand, but not the fastest code.
A better solution
So the most efficient way is to do the conversion by yourself (bit shifting).
Something like:
List<int> ints = new List<int>();
for (int i = 0; i + 3 <= fullFile.Length; i += 3)
{
// This code assumes little-endian byte order
var value = (fullFile[i + 2] << 16)
+ (fullFile[i + 1] << 8)
+ fullFile[i];
ints.Add(value);
}
This code does not allocate anything extra (except the ints list), and should be quite fast.
You can read more about shift operators on MSDN, and about endianness.

Fastest way to extend array

I am looking for fastest way to extend an array.
Whether it grows by 1 or by x elements, it has to be as fast as possible.
Here is an example:
var arr = new int[200];
for (int i = 0; i < 200; i++)
arr[i] = i;
Now I want to extend arr by 5 items, inserted at index position 20.
var arr2 = new int[] { 999, 999, 999, 999, 999 };
How do I place arr2 inside arr in the fastest way in terms of performance?
The result shall look like this
0,1,2,3,4....20, 999, 999, 999, 999, 999, 21, 22, 23, 24....199
Create a new array which is the size you want, then use the static Array.Copy method to copy the original arrays into the new one.
You can't "extend" an array, you can only create a bigger one and copy the original into it.
Also, consider using List<int> or LinkedList<> instead of an array, unless you require extremely fine-grained control over what is in memory.
It is far easier to use List. But if you have to use arrays, you have to create new array of size 205 and copy values from both source arrays, since array size is constant.
Your best bet is to use something like List<int> rather than an array. But if you must use an array:
int[] arr1 = new int[200];
// initialize array
int[] arr2 = new int[]{999, 999, 999, 999, 999};
int targetPos = 20;
// resize the array (allocates a new, larger array and copies the items)
Array.Resize(ref arr1, arr1.Length + arr2.Length);
// move the tail of the array down; arr1.Length already includes arr2.Length here
Buffer.BlockCopy(arr1, 4*targetPos, arr1, 4*(targetPos+arr2.Length), 4*(arr1.Length - arr2.Length - targetPos));
// copy arr2 to the proper position
Buffer.BlockCopy(arr2, 0, arr1, 4*targetPos, 4*arr2.Length);
It might be faster to create a new array and copy the items, like this.
int[] newArray = new int[arr1.Length + arr2.Length];
// copy first part of original array
Buffer.BlockCopy(arr1, 0, newArray, 0, 4*targetPos);
// copy second array
Buffer.BlockCopy(arr2, 0, newArray, 4*targetPos, 4*arr2.Length);
// copy remainder of original array
Buffer.BlockCopy(arr1, 4*targetPos, newArray, 4*(targetPos + arr2.Length), 4*(arr1.Length - targetPos));
// and replace the original array
arr1 = newArray;
Which version is faster will depend on where targetPos is. The second version will be faster when targetPos is small. When targetPos is small, the first version has to copy a lot of data twice. The second version never copies more than it has to.
BlockCopy is kind of a pain to work with because it requires byte offsets, which is the reason for all the multiplications by four in the code. You might be better off using Array.Copy in the second version above. That will prevent you having to multiply everything by 4 (and forgetting sometimes).
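For comparison, here is a sketch of the second version written with Array.Copy, which works in element counts rather than byte offsets. ArrayInsert and InsertRange are hypothetical names chosen for this example, not framework APIs.

```csharp
using System;

public static class ArrayInsert
{
    // Returns a new array with 'items' inserted into 'source' starting at 'index'.
    public static int[] InsertRange(int[] source, int index, int[] items)
    {
        if (index < 0 || index > source.Length)
            throw new ArgumentOutOfRangeException(nameof(index));

        int[] result = new int[source.Length + items.Length];
        // copy first part of original array
        Array.Copy(source, 0, result, 0, index);
        // copy the inserted items
        Array.Copy(items, 0, result, index, items.Length);
        // copy remainder of original array
        Array.Copy(source, index, result, index + items.Length, source.Length - index);
        return result;
    }

    public static void Main()
    {
        int[] arr1 = { 0, 1, 2, 3 };
        int[] arr2 = { 999, 999 };
        arr1 = InsertRange(arr1, 2, arr2);
        Console.WriteLine(string.Join(", ", arr1)); // prints 0, 1, 999, 999, 2, 3
    }
}
```

All offsets and lengths are in elements, so there is no multiplication by 4 to forget.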
If you know how long the array will be, dimension it to that length:
var ints = new int[someFixedLength];
If you have only a vague idea of the length, use a generic list:
var ints = new List<int>(someVagueLength);
Both types implement IList, but the List type handles resizing its internal array for you, which is generally the fastest way.
Note: the initial Count of the List will be 0, but the internal array will be dimensioned to the size you pass to the constructor.
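A small sketch of the list approach applied to the question's numbers; ListDemo and Build are illustrative names, and the capacity of 205 is simply the final size known in advance for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ListDemo
{
    public static List<int> Build()
    {
        // Capacity 205 reserves room up front; Count starts at 0.
        var ints = new List<int>(205);
        for (int i = 0; i < 200; i++)
            ints.Add(i);

        // Shifts the tail and inserts the five items at index 20.
        ints.InsertRange(20, new[] { 999, 999, 999, 999, 999 });
        return ints;
    }

    public static void Main()
    {
        List<int> ints = Build();
        Console.WriteLine(ints.Count); // prints 205
        Console.WriteLine(ints[20]);   // prints 999
        Console.WriteLine(ints[25]);   // prints 20
    }
}
```

Because the capacity was reserved up front, the insert only has to shift the tail inside the existing internal array.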
If you need to copy data between arrays, the quickest way is Buffer.BlockCopy, so from your example
Buffer.BlockCopy(arr2, 0, arr, sizeof(int) * 20, sizeof(int) * 5);
copies all 5 ints from arr2 into indices 20, 21 ... 24 of arr, overwriting the values that were there.
There is currently no faster way to do this in C#.
An answer showing timing benchmarks is given here: Best way to combine two or more byte arrays in C# . If you consider the "array you insert into " as arrays 1 and 3, and the "array to be inserted" as array 2, then the "concatenate three arrays" example applies directly.
Note the point at the end of the accepted answer: the method that is faster at creating yields an array that is slower to access (which is why I asked if you cared about speed to create, or access speed).
Using System.Linq, you can do the following to extend an array by adding one new item to it (note that Union also removes duplicate values)...
int[] intA = new int[] { 1, 2, 3 };
int intB = 4;
intA = intA.Union(new int[] { intB }).ToArray();
...or you can extend an array by adding another array of items to it...
int[] intA = new int[] { 1, 2, 3 };
int[] intB = new int[] { 4, 5, 6 };
intA = intA.Union(intB).ToArray();
...or if you don't care about duplicates...
int[] intA = new int[] { 1, 2, 3 };
int[] intB = new int[] { 4, 5, 6 };
intA = intA.Concat(intB).ToArray();

C# Changing the number of dimensions in an array

Is it possible, in C#, to convert a multi-dimensional array into a 1D array without having to copy all the elements from one to the other, something like:
int[,] x = new int[3,4];
int[] y = (int[])x;
This would allow the use of x as if it were a 12-element 1D array (and to return it from a function as such), but the compiler does not allow this conversion.
As far as I'm aware, a 2D array (or higher number of dimensions) is laid out in contiguous memory, so it doesn't seem impossible that it could work somehow. Using unsafe and fixed can allow access through a pointer, but this doesn't help with returning the array as 1D.
While I believe I can just use a 1D array throughout in the case I'm working on at present, it would be useful if this function was part of an adapter between something which returns a multidimensional array and something else which requires a 1D one.
You can't; C# has no way to convert arrays like this. You might be able to do it with an external DLL (C/C++), but then you need to keep your array pinned at a fixed location.
Speed
Generally I would advise avoiding 2D arrays because they tend to be slow in C#; prefer a jagged array or, better still, a single-dimensional array with a little bit of index math.
Int32[] myArray = new Int32[xSize * ySize];
// Access
myArray[x + (y * xSize)] = 5;
In C#, arrays cannot be resized dynamically. One approach is to use System.Collections.ArrayList instead of a native array. Another (faster) solution is to re-allocate the array with a different size and to copy the contents of the old array to the new array. The generic function resizeArray (below) can be used to do that.
One example here :
// Reallocates an array with a new size, and copies the contents
// of the old array to the new array.
// Arguments:
// oldArray the old array, to be reallocated.
// newSize the new array size.
// Returns A new array with the same contents.
public static System.Array ResizeArray (System.Array oldArray, int newSize) {
int oldSize = oldArray.Length;
System.Type elementType = oldArray.GetType().GetElementType();
System.Array newArray = System.Array.CreateInstance(elementType,newSize);
int preserveLength = System.Math.Min(oldSize,newSize);
if (preserveLength > 0)
System.Array.Copy (oldArray,newArray,preserveLength);
return newArray; }
You can already iterate over a multidim as if it were a 1 dimensional array:
int[,] data = { { 1, 2, 3 }, { 3, 4, 5 } };
foreach (int i in data)
... // i := 1 .. 5
And you could wrap a 1-dim array in a class and provide an indexer property this[int x1, int x2].
But everything else will require unsafe code or copying. Both will be inefficient.
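As a rough illustration of the wrapper idea, the sketch below stores everything in a 1D array and exposes a two-index indexer on top of it. Grid<T>, AsFlatArray and GridDemo are hypothetical names invented for this example; the row-major index math mirrors how a rectangular array is laid out.

```csharp
using System;

public sealed class Grid<T>
{
    private readonly T[] _items;   // the only storage; row-major order

    public int Rows { get; }
    public int Columns { get; }

    public Grid(int rows, int columns)
    {
        if (rows < 0 || columns < 0) throw new ArgumentOutOfRangeException();
        Rows = rows;
        Columns = columns;
        _items = new T[rows * columns];
    }

    // Two-dimensional access on top of the flat storage.
    public T this[int row, int column]
    {
        get => _items[Index(row, column)];
        set => _items[Index(row, column)] = value;
    }

    // Hands out the underlying 1D array without copying.
    public T[] AsFlatArray() => _items;

    private int Index(int row, int column)
    {
        if ((uint)row >= (uint)Rows || (uint)column >= (uint)Columns)
            throw new IndexOutOfRangeException();
        return row * Columns + column;
    }
}

public static class GridDemo
{
    public static void Main()
    {
        var x = new Grid<int>(3, 4);
        x[1, 2] = 7;
        int[] y = x.AsFlatArray();      // same storage, 12 elements
        Console.WriteLine(y.Length);    // prints 12
        Console.WriteLine(y[1 * 4 + 2]); // prints 7
    }
}
```

This only works when you control the type that owns the data; an int[,] handed to you by someone else still has to be copied.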
Riding on the back of Felix K.'s answer and quoting a fellow developer:
You can't convert a square to a line without losing information
Try:
int[,] x = {{1, 2}, {2, 2}};
int[] y = new int[4];
// the last argument is a byte count, so 4 ints need 4 * sizeof(int) bytes
System.Buffer.BlockCopy(x, 0, y, 0, 4 * sizeof(int));
You cannot cast; you'll have to copy the elements:
int[] y = (from int i in x select i).ToArray();

Fastest way to chop array in two pieces

I have an array, say:
var arr1 = new [] { 1, 2, 3, 4, 5, 6 };
Now, when my array-size exceeds 5, I want to resize the current array to 3, and create a new array that contains the upper 3 values, so after this action:
arr1 = new [] { 1, 2, 3 };
newArr = new [] { 4, 5, 6 };
What's the fastest way to do this? I guess I'll have to look into the unmanaged corner, but no clue.
Some more info:
The arrays have to be able to size up without large performance hits
The arrays will only contain Int32's
Purpose of the array is to group the numbers in my source array without having to sort the whole list
In short: I want to split the following input array:
int[] arr = new int[] { 1, 3, 4, 29, 31, 33, 35, 36, 37 };
into
arr1 = 1, 3, 4
arr2 = 29, 31, 33, 35, 36, 37
but because the ideal speed is reached with an array size of 3, arr2 should be split into 2 evenly sized arrays.
Note
I know that an array's implementation in memory is quite naive (well, at least it is in C, where you can manipulate the count of items in the array so the array resizes). Also that there is a memory move function somewhere in the Win32 API. So I guess this would be the fastest:
Change arr1 so it only contains 3 items
Create new array arr2 with size 3
Memmove the bytes that aren't in arr1 anymore into arr2
I'm not sure there's anything better than creating the empty arrays, and then using Array.Copy. I'd at least hope that's optimized internally :)
int[] firstChunk = new int[3];
int[] secondChunk = new int[3];
Array.Copy(arr1, 0, firstChunk, 0, 3);
Array.Copy(arr1, 3, secondChunk, 0, 3);
To be honest, for very small arrays the overhead of the method call may be greater than just explicitly assigning the elements - but I assume that in reality you'll be using slightly bigger ones :)
You might also consider not actually splitting the array, but instead using ArraySegment to have separate "chunks" of the array. Or perhaps use List<T> to start with... it's hard to know without a bit more context.
If speed is really critical, then unmanaged code using pointers may well be the fastest approach - but I would definitely check whether you really need to go there before venturing into unsafe code.
Are you looking for something like this?
static unsafe void DoIt(int* ptr)
{
Console.WriteLine(ptr[0]);
Console.WriteLine(ptr[1]);
Console.WriteLine(ptr[2]);
}
static unsafe void Main()
{
var bytes = new byte[1024];
new Random().NextBytes(bytes);
fixed (byte* p = bytes)
{
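// step one chunk of three ints at a time, and stop before DoIt would read past the end of the buffer
for (int i = 0; i + 3 * sizeof(int) <= bytes.Length; i += 3 * sizeof(int))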
{
DoIt((int*)(p + i));
}
}
Console.ReadKey();
}
This avoids creating new arrays (which cannot be resized, not even with unsafe code!) entirely and just passes a pointer into the array to some method which reads the first three integers.
If your array will always contain 6 items how about:
var newarr1 = new []{oldarr[0], oldarr[1],oldarr[2]};
var newarr2 = new []{oldarr[3], oldarr[4],oldarr[5]};
Reading from memory is fast.
Since arrays are not dynamically resized in C#, this means your first array must have a minimum length of 5 or maximum length of 6, depending on your implementation. Then, you're going to have to dynamically create new statically sized arrays of 3 each time you need to split. Only after each split will your array items be in their natural order unless you make each new array a length of 5 or 6 as well and only add to the most recent. This approach means that each new array will have 2-3 extra pointers as well.
Unless you have a known number of items to go into your array BEFORE compiling the application, you're also going to have to have some form of holder for your dynamically created arrays, meaning you're going to have to have an array of arrays (a jagged array). Since your jagged array is also statically sized, you'll need to be able to dynamically recreate and resize it as each new dynamically created array is instantiated.
I'd say copying the items into the new array is the least of your worries here. You're looking at some pretty big performance hits as well as the array size(s) grow.
UPDATE: So, if this were absolutely required of me...
public class MyArrayClass
{
private int[][] _master = new int[10][];
private int[] _current = new int[3];
private int _currentCount, _masterCount;
public void Add(int number)
{
_current[_currentCount] = number;
_currentCount += 1;
if (_currentCount == _current.Length)
{
_master[_masterCount] = _current; // store the full chunk directly; the slot is null, so Array.Copy into it would throw
_currentCount = 0;
_current = new int[3];
_masterCount += 1;
if (_masterCount == _master.Length)
{
int[][] newMaster = new int[_master.Length + 10][];
Array.Copy(_master, 0, newMaster, 0, _master.Length);
_master = newMaster;
}
}
}
public int[][] GetMyArray()
{
return _master;
}
public int[] GetMinorArray(int index)
{
return _master[index];
}
public int GetItem(int MasterIndex, int MinorIndex)
{
return _master[MasterIndex][MinorIndex];
}
}
Note: This probably isn't perfect code, it's a horrible way to implement things, and I would NEVER do this in production code.
The obligatory LINQ solution:
if(arr1.Length > 5)
{
var newArr = arr1.Skip(arr1.Length / 2).ToArray();
arr1 = arr1.Take(arr1.Length / 2).ToArray();
}
LINQ is faster than you might think; this will basically be limited by the Framework's ability to spin through an IEnumerable (which on an array is pretty darn fast). This should execute in roughly linear time, and can accept any initial size of arr1.
