VirtualAlloc in C# to allocate large memory

I'm trying to adapt a vendor's C# example code for interfacing with a PCI-Express device. The code basically allocates a large buffer as an int array, and then pins it via the fixed keyword before handing it off to hardware to be filled with data.
This works great, but it eventually fails because .NET limits an array to about 2 billion elements. I can push the limit out to 16 GB by using an array of long and the gcAllowVeryLargeObjects setting, but eventually I still run into .NET limitations.
In unmanaged code I could call VirtualAlloc and request 40 or 50 GB directly; however, it's not clear to me whether this is possible in C#, and I haven't been able to find any good example code. I realize I could be doing this in a different language, but on Windows at least I'm more familiar with .NET, and aside from this relatively small portion of the program, there is very little hardware-specific code, so I'd like to try and stick with what I have.

You can P/Invoke VirtualAlloc. The signature is:
[DllImport("kernel32.dll", SetLastError=true)]
static extern IntPtr VirtualAlloc(IntPtr lpAddress, UIntPtr dwSize, AllocationType flAllocationType, MemoryProtection flProtect);
AllocationType and MemoryProtection are not framework types; you have to declare them yourself as enums carrying the Win32 flag values.
You can find most P/Invoke structures and signatures on pinvoke.net: VirtualAlloc
Alternatively, take a look at the Marshal.AllocHGlobal function.
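A minimal sketch of the VirtualAlloc route, assuming a 64-bit Windows process. The class and method names (LargeNativeBuffer, Allocate, Free) are hypothetical, and plain uint constants with the Win32 header values stand in for the AllocationType and MemoryProtection enums from the signature above:

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical helper; Windows only, because it binds to kernel32.dll.
static class LargeNativeBuffer
{
    // Flag values as defined in the Win32 headers.
    const uint MEM_COMMIT = 0x1000;
    const uint MEM_RESERVE = 0x2000;
    const uint MEM_RELEASE = 0x8000;
    const uint PAGE_READWRITE = 0x04;

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr VirtualAlloc(IntPtr lpAddress, UIntPtr dwSize,
        uint flAllocationType, uint flProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool VirtualFree(IntPtr lpAddress, UIntPtr dwSize, uint dwFreeType);

    // Reserves and commits byteCount bytes of read/write memory.
    public static IntPtr Allocate(ulong byteCount)
    {
        IntPtr p = VirtualAlloc(IntPtr.Zero, new UIntPtr(byteCount),
            MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
        if (p == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return p;
    }

    // Releases a block returned by Allocate.
    public static void Free(IntPtr p)
    {
        // With MEM_RELEASE the size argument must be zero.
        if (!VirtualFree(p, UIntPtr.Zero, MEM_RELEASE))
            throw new Win32Exception(Marshal.GetLastWin32Error());
    }
}
```

The returned IntPtr can be handed to the hardware API directly; no fixed statement is needed, since the garbage collector never moves this memory.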

gcAllowVeryLargeObjects should work just fine; are you sure it doesn't? You may have to explicitly set the target CPU to x64 only.
In any case, you could use a hack struct to get a large value-type (which you can use as an array):
unsafe struct Test
{
    public fixed byte Data[1024];
}
unsafe void Main()
{
    Test[] test = new Test[16 * 1024 * 1024];
    // We've got 16 * 1024 * 1024 * 1024 bytes (16 GB) here.
    fixed (Test* pTest = test)
    {
    }
}
This does have its limits (the unsafe structure has a maximum size), but it should get you where you need to be.
However, it might be a better idea to simply call VirtualAlloc through P/Invoke. Or better, use Marshal.AllocHGlobal; it should serve the same purpose here (although you can't specify any parameters except for the size).
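A short sketch of the Marshal.AllocHGlobal alternative. It relies on the overload that takes an IntPtr size, which is what lets a 64-bit process ask for more than 2 GB. The names HGlobalBuffer and RoundTripLastByte are hypothetical, and the method only touches the last byte to show that the whole range is addressable:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper around Marshal.AllocHGlobal.
static class HGlobalBuffer
{
    // Allocates byteCount bytes, writes value to the last byte,
    // reads it back, and frees the block again.
    public static byte RoundTripLastByte(long byteCount, byte value)
    {
        // The IntPtr overload accepts sizes beyond int.MaxValue on x64.
        IntPtr buffer = Marshal.AllocHGlobal(new IntPtr(byteCount));
        try
        {
            // Compute the address with 64-bit arithmetic; the int-offset
            // overloads of Marshal.WriteByte cannot reach past 2 GB.
            IntPtr last = new IntPtr(buffer.ToInt64() + byteCount - 1);
            Marshal.WriteByte(last, value);
            return Marshal.ReadByte(last);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

AllocHGlobal throws OutOfMemoryException when the request cannot be satisfied, so there is no null return to check.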

Related

Is there a portable way to copy a block of memory in C#?

If you want to do a memory copy between two arrays, there's an Array.Copy function for that in .NET:
char[] GetCopy(char[] buf)
{
    char[] result = new char[buf.Length];
    Array.Copy(buf, result, buf.Length);
    return result;
}
This is usually faster than manually for-looping to copy all the characters in buf to result, because it's a block copy operation that simply takes a chunk of memory and writes it to the destination.
Similarly, if I was instead given a char* and an int specifying its size, what are my options? Here are the ones I've considered:
Buffer.BlockCopy: requires src and dst to be arrays
Buffer.MemoryCopy: exactly what I'm looking for, but only available on .NET Desktop
Marshal.Copy: stopped being supported in .NET 2.0
Or, if there's no alternative, is there some kind of way to construct an array from a pointer char* and a length int? If I could do that, I could just convert the pointers to arrays and pass them into Array.Copy.
I'm not looking for for-loops because as I said they're not very efficient compared to block copies, but if that's the only way I guess that would have to do.
TL;DR: I'm basically looking for memcpy(), but in C# and can be used in PCLs.
You state that Marshal.Copy is no longer supported; however, I can find no indication of that in the documentation of the Marshal class.
Quite the contrary, the class is available for the following framework versions: 4.6, 4.5, 4, 3.5, 3.0, 2.0, 1.1
One possible implementation of a utility function based on this method would be:
public static void CopyFromPtrToPtr(IntPtr src, uint srcLen,
                                    IntPtr dst, uint dstLen)
{
    var buffer = new byte[srcLen];
    Marshal.Copy(src, buffer, 0, buffer.Length);
    Marshal.Copy(buffer, 0, dst, (int)Math.Min(buffer.Length, dstLen));
}
However, it is quite possible that copying the bytes from IntPtr to byte[] and back again negates any performance gain over pinning the unsafe memory and copying it in a loop.
Depending on how portable the code needs to be, you could also consider using P/Invoke to actually use the memcpy method. This should work well on Windows 2000 and onwards (all systems that have the Microsoft Visual C Run-Time Library installed).
[DllImport("msvcrt.dll", EntryPoint = "memcpy", CallingConvention = CallingConvention.Cdecl, SetLastError = false)]
static extern IntPtr memcpy(IntPtr dst, IntPtr src, UIntPtr count);
Edit: The second approach would not be suited for portable class libraries, the first however would be.
I don't recall what built-in APIs the BCL has but you can copy the BCL memcpy code and use it. Use Reflector and search for "memcpy" to find it. Unsafe code is perfectly defined and portable.
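On the question of building an array from a char* and a length: Marshal.Copy has an overload that copies from an IntPtr into a char[], so a sketch could look like the following. The names PointerToArray, FromPointer and RoundTrip are hypothetical, and the code needs the unsafe compiler option because of the pointer parameter:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: turns (char*, length) into a managed char[].
static class PointerToArray
{
    public static unsafe char[] FromPointer(char* source, int length)
    {
        char[] result = new char[length];
        // Marshal.Copy(IntPtr, char[], int, int) performs a block copy.
        Marshal.Copy((IntPtr)source, result, 0, length);
        return result;
    }

    // Convenience wrapper for trying it out: pins a string, copies its
    // characters through the pointer overload, and returns the copy.
    public static unsafe string RoundTrip(string text)
    {
        fixed (char* p = text)
        {
            return new string(FromPointer(p, text.Length));
        }
    }
}
```

Once the data is in an array, Array.Copy and Buffer.BlockCopy become available, at the cost of the extra copy this step introduces.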

P/Invoke with arrays of double - marshalling data between C# and C++

I've read the various MSDN pages on C++ Interop with P/Invoke here and here but I am still confused.
I have some large arrays of doubles that I need to get into native code, and some resulting arrays that need to get back. I do not know the sizes of the output arrays in advance. For simplicity, I will use only a single array in the example. The platform is x64; I read that marshalling internals are quite different between 32- and 64-bit environments so this might be important.
C#
[DllImport("NativeLib.dll")]
public static extern void ComputeSomething(double[] inputs, int inlen,
    [Out] out IntPtr outputs, [Out] out int outlen);
[DllImport("NativeLib.dll")]
public static extern void FreeArray(IntPtr outputs);
public void Compute(double[] inputs, out double[] outputs)
{
    IntPtr output_ptr;
    int outlen;
    ComputeSomething(inputs, inputs.Length, out output_ptr, out outlen);
    outputs = new double[outlen];
    Marshal.Copy(output_ptr, outputs, 0, outlen);
    FreeArray(output_ptr);
}
C++
extern "C"
{
    void ComputeSomething(double* inputs, int input_length,
                          double** outputs, int* output_length)
    {
        //...
        *output_length = ...;
        *outputs = new double[*output_length];
        //...
    }
    void FreeArray(double* outputs)
    {
        delete[] outputs;
    }
}
It works, that is, I can read out the doubles I wrote into the array on the C++ side. However, I wonder:
Is this really the right way to use P/Invoke?
Aren't my signatures needlessly complicated?
Can P/Invoke be used more efficiently to solve this problem?
I believe I read that marshalling for single dimensional arrays of built-in types can be avoided. Is there a way around Marshal.Copy?
Note that we have a working C++/Cli version, but there are some problems related to local statics in third-party library code that lead to crashes. Microsoft marked this issue as WONTFIX, which is why I am looking for alternatives.
It is okayish. The complete lack of a way to return an error code is pretty bad, that's going to hurt when the arrays are large and the program runs out of memory. The hard crash you get is pretty undiagnosable.
The need to copy the arrays and to explicitly release them doesn't win any prizes of course. You solve that by letting the caller pass a pointer to its own array and you just write the elements. You however need a protocol to let the caller figure out how large the array needs to be, that is going to require calling the method twice. The first call returns the required size, the second call gets the job done.
A boilerplate example would be:
[DllImport("foo.dll")]
private static extern int ReturnData(double[] data, ref int dataLength);
And a sample usage:
int len = 0;
double[] data = null;
int err = ReturnData(data, ref len);
if (err == ERROR_MORE_DATA) { // NOTE: expected
    data = new double[len];
    err = ReturnData(data, ref len);
}
No need to copy, no need to release memory, good thing. The native code can corrupt the GC heap if it doesn't pay attention to the passed len, not such a good thing. But of course easy to avoid.
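The two-call protocol can be tried out without a native DLL. In this sketch the ReturnData method is a managed stand-in for the native export (an assumption made purely so the example runs on its own), and the error constants use the Win32 values for ERROR_SUCCESS and ERROR_MORE_DATA. TwoCallProtocol, Source and Fetch are hypothetical names:

```csharp
using System;

static class TwoCallProtocol
{
    const int ERROR_SUCCESS = 0;
    const int ERROR_MORE_DATA = 234; // Win32 value

    // Pretend this data lives on the native side.
    static readonly double[] Source = { 1.0, 2.0, 3.0 };

    // Managed stand-in for the native ReturnData export. It respects the
    // caller's length and never writes past the end of the buffer.
    static int ReturnData(double[] data, ref int dataLength)
    {
        if (data == null || dataLength < Source.Length)
        {
            dataLength = Source.Length; // report the required size
            return ERROR_MORE_DATA;
        }
        Array.Copy(Source, data, Source.Length);
        dataLength = Source.Length;     // report the count actually written
        return ERROR_SUCCESS;
    }

    public static double[] Fetch()
    {
        // First call: no buffer, only ask for the required length.
        int len = 0;
        int err = ReturnData(null, ref len);
        if (err != ERROR_MORE_DATA)
            throw new InvalidOperationException("Unexpected status " + err);

        // Second call: the caller owns the buffer, the callee fills it.
        double[] data = new double[len];
        err = ReturnData(data, ref len);
        if (err != ERROR_SUCCESS)
            throw new InvalidOperationException("Unexpected status " + err);
        return data;
    }
}
```

With a real native function, the stand-in would be replaced by the DllImport declaration shown above; the calling code stays the same.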
If it were practical to separate the code that determines the output length from the code that populates the output then you could:
Export a function that returned the output length.
Call that from the C# code and then allocate the output buffer.
Call the unmanaged code again, this time asking it to populate the output buffer.
But I'm assuming that you have rejected this option because it is impractical. In which case your code is a perfectly reasonable way to solve your problem. In fact I would say that you've done a very good job.
The code will work just the same in x86 once you fix the calling convention mismatch. On the C++ side the calling convention is cdecl, but on the C# side it is stdcall. That doesn't matter on x64 since there is only one calling convention. But it would be a problem under x86.
Some comments:
You don't need to use [Out] as well as out. The latter implies the former.
You can avoid exporting the deallocator by allocating off a shared heap. For instance, allocate with CoTaskMemAlloc on the C++ side, and then deallocate with Marshal.FreeCoTaskMem on the C# side.
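A sketch of the shared-heap idea from the C# side. Since no native library is available here, ProduceNativeArray is a stand-in that plays the role of the C++ code by calling Marshal.AllocCoTaskMem, which draws from the same allocator that CoTaskMemAlloc uses. All names in the class are hypothetical:

```csharp
using System;
using System.Runtime.InteropServices;

static class CoTaskMemRoundTrip
{
    // Stand-in for the native function: allocates on the COM task
    // allocator and fills the block with the given values.
    static IntPtr ProduceNativeArray(double[] values)
    {
        IntPtr p = Marshal.AllocCoTaskMem(values.Length * sizeof(double));
        Marshal.Copy(values, 0, p, values.Length);
        return p;
    }

    // What the C# caller does: copy the doubles out, then free the block
    // directly. No FreeArray export is needed.
    public static double[] Consume(IntPtr nativeArray, int length)
    {
        try
        {
            double[] result = new double[length];
            Marshal.Copy(nativeArray, result, 0, length);
            return result;
        }
        finally
        {
            Marshal.FreeCoTaskMem(nativeArray);
        }
    }

    public static double[] Demo()
    {
        double[] values = { 1.5, 2.5, 4.0 };
        return Consume(ProduceNativeArray(values), values.Length);
    }
}
```

The copy through Marshal.Copy is still there; only the exported deallocator goes away.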
If you knew the array size beforehand, you could write a C++/CLI DLL that takes the managed array as parameter, pins it, and calls the native C++ DLL on the pinned pointer it obtains.
But if it's output-only, I don't see any version without a copy. You can use a SAFEARRAY so P/Invoke does the copying instead of you, but that's all.

PInvoke: Issue with returned array of doubles?

I am using PInvoke to call a C++ function from my C# program. The code looks like this:
IntPtr data = Poll(this.vhtHand);
double[] arr = new double[NR_FINGERS /* = 5 */ * NR_JOINTS /* = 3*/];
Marshal.Copy(data, arr, 0, arr.Length);
With Poll()'s signature looking like this:
[DllImport("VirtualHandBridge.dll")]
static public extern IntPtr Poll(IntPtr hand);
The C-function Poll's signature:
extern "C" __declspec(dllexport) double* Poll(CyberHand::Hand* hand)
Unless I'm having a huge brain failure (admittedly, fairly common for me), this looks to me like it should be working.
However, the double values I am getting are completely incorrect, and I think this is because of incorrect memory usage. I have looked it up, and I think doubles in C# and C++ are identical in size, but maybe there is some other issue playing here. One thing that rubs me the wrong way is that Marshal.Copy is never told what type of data it should expect, but I read that it is supposed to be used this way.
Any clues, anyone? If needed, I can post the correct results and the returned results.
You are missing the CallingConvention property, it is Cdecl.
You really want to favor a better function signature, the one you have is extremely brittle due to the memory management problem, the required manual marshaling, the uncertainty of getting the right size array and the requirement to copy the data. Always favor the caller passing a buffer that your native code fills in:
// C++
extern "C" __declspec(dllexport)
int __stdcall Poll(CyberHand::Hand* hand, double* buffer, int bufferSize);
// C#
[DllImport("foo.dll")]
private static extern int Poll(IntPtr hand, double[] buffer, int bufferSize);
Use the int return value to report a status code. Like a negative value to report an error code, a positive value to return the number of elements actually copied into the buffer.
You shouldn't even need to marshal the data like that, as long as you declare the P/Invoke correctly.
If your CyberHand::Hand* is in reality a pointer to a double, then you should declare your P/Invoke as
[DllImport("VirtualHandBridge.dll")]
static public extern IntPtr Poll(double[] data);
And then just call it with your array of doubles.
If it isn't really an array of doubles, then you certainly can't do what you're doing.
Also, how does your 'C' function know how big the array will be? Is it a fixed size?
The IntPtr return value will be a problem. What is the double* pointing to? An array or a single item?
You could find that it's easier (if you can) to write a simpler more friendly 'C' wrapper for the function you're calling, and call the wrapper function itself. You can of course only do that if you can change the source code of the 'C' DLL. But without knowing exactly what your function does, I can't give you specific advice.
[EDIT]
Ok, your code should theoretically work if the memory being passed back isn't being messed around with (e.g. freed up). If it's not working, then I suspect something like that is happening. You'd definitely be better writing a wrapper 'C' function that fills in an array allocated by the C# and passed to the function, rather than passing back a pointer to some internal memory.
BTW: I don't like code which passes around pointers to blocks of memory without also passing the size of that block. Seems a bit prone to nasty things.

Calling _msize() via PInvoke from C#

I'm writing a C# library where the calling app will pass in a large amount of contiguous, unmanaged memory. This calling app can be either from .Net or Visual C++ (it will go through an intermediate C++/CLI library before calling my library if from C++). It would be useful to validate that there is sufficient memory, so I decided to call the _msize() function. Unfortunately, _msize always seems to give me the wrong size back.
I went back and modified the allocation routine in my sample app so that it calls _msize immediately after allocating. Here is my code snippet:
public unsafe class MyMemory
{
    /// <returns></returns>
    [DllImport("msvcrt.dll", SetLastError = true)]
    public static extern int _msize(IntPtr handle);
    public static IntPtr MyAlloc(int size)
    {
        IntPtr retVal = Marshal.AllocHGlobal(size);
        ...
        int memSize = MyMemory._msize(retVal);
        if (memSize < size)
        {
            ...
        }
        return retVal;
    }
}
When I pass in the size 199229440, I get back memSize of 199178885. I've seen similar results for different numbers. It is less than 0.01% off, which I would totally understand if it was over, but the fact is it is under, meaning _msize thinks the allocated memory is less than what was asked for. Anyone have any clue why this is? And any recommendations on what I should do instead would be appreciated as well.
P/Invoke the LocalSize function instead.
_msize is for determining the size of a block allocated with malloc (and its friends). AllocHGlobal is a wrapper around GlobalAlloc or LocalAlloc (depending on what reference you believe; but I think the two are equivalent), and you want the LocalSize function to determine the size of the block that was actually returned. So far as I can tell, Marshal doesn't contain a wrapper for LocalSize, but you can call it using P/Invoke.
So it seems like it's only by sheer good luck that _msize is returning anything useful for you at all. Perhaps malloc uses GlobalAlloc (or LocalAlloc), either always or just when asked for large blocks, and requests a bit of extra space for bookkeeping; in which case _msize would be trying to compensate for that.
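A minimal sketch of the LocalSize call, assuming Windows and a block that came from Marshal.AllocHGlobal. HGlobalSize and GetSize are hypothetical names. LocalSize returns a SIZE_T, hence the UIntPtr return type, and it reports failure by returning zero:

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical helper; Windows only, because it binds to kernel32.dll.
static class HGlobalSize
{
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern UIntPtr LocalSize(IntPtr hMem);

    // Returns the actual size, in bytes, of a block from AllocHGlobal.
    // Assumes the block is non-empty, so zero always means failure.
    public static ulong GetSize(IntPtr block)
    {
        UIntPtr size = LocalSize(block);
        if (size == UIntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return size.ToUInt64();
    }
}
```

This only validates blocks that really were allocated with LocalAlloc or AllocHGlobal. Memory that a C++ caller obtained from malloc or new belongs to a different allocator, so a library that accepts buffers from both kinds of callers may be better served by having the caller pass the size explicitly.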

How can I copy unmanaged data in C# and how fast is it?

I have two unmanaged pointers in the form of IntPtr and want to copy data between them. How can I do this? I know the method Marshal.Copy, but it can only copy between unmanaged and managed.
And the second part: Is copying unmanaged data from C# slower than doing it in unmanaged C/C++ using memcpy?
Edit: I would be especially interested in a platform-independent implementation.
You can call the C runtime's memcpy function (exported by msvcrt.dll) via P/Invoke.
[DllImport("msvcrt.dll", EntryPoint = "memcpy", CallingConvention = CallingConvention.Cdecl, SetLastError = false)]
static extern IntPtr memcpy(IntPtr dest, IntPtr src, UIntPtr count);
Apart from the (slight) overhead of calling a native function from managed code, the actual copy performance should be the same as C/C++ code that uses the same function.
Don't forget that you can also use an unsafe block (and compiler option) and simply copy the data one byte/int/long at a time:
unsafe
{
    // srcPtr and destPtr are IntPtrs pointing to valid memory locations.
    // size is the number of bytes to copy and must be a multiple of
    // sizeof(long), which is always 8 in C#.
    long* src = (long*)srcPtr;
    long* dest = (long*)destPtr;
    for (int i = 0; i < size / sizeof(long); i++)
    {
        dest[i] = src[i];
    }
}
This removes the platform dependency, but you need to be very careful with the bounds checking and pointer arithmetic.
Try System.Buffer.MemoryCopy, see the bottom of the page for supported target frameworks.
I believe that the main difference between this and the other solutions that use P/Invoke is that this method avoids the P/Invoke for smaller sizes and just does the copying directly.
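A sketch of how Buffer.MemoryCopy might be applied to the two IntPtr values from the question. UnmanagedCopy and Copy are hypothetical names, and the code needs the unsafe compiler option because the API takes void* arguments:

```csharp
using System;

// Hypothetical wrapper around Buffer.MemoryCopy for IntPtr arguments.
static class UnmanagedCopy
{
    // Copies byteCount bytes from source to destination.
    // destinationCapacity is the size of the destination block in bytes;
    // Buffer.MemoryCopy throws ArgumentOutOfRangeException when
    // byteCount exceeds it, which guards against overruns.
    public static unsafe void Copy(IntPtr source, IntPtr destination,
        long destinationCapacity, long byteCount)
    {
        Buffer.MemoryCopy(source.ToPointer(), destination.ToPointer(),
            destinationCapacity, byteCount);
    }
}
```

Unlike the msvcrt.dll and kernel32.dll imports, this call has no dependency on a Windows library.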
Here's the guts of the implementation in .NET Core (latest as of 2020-09-04).
Without making comments on performance, purely because I have not tested it. You can achieve the same performance as unmanaged copy by using either CopyMemory or MoveMemory from Kernel32 via interop.
Here is the declaration for CopyMemory
[DllImport("kernel32.dll")]
static extern void CopyMemory(IntPtr destination, IntPtr source, uint length);
You could look at System.Runtime.CompilerServices.Unsafe.CopyBlock
It seems to allow you to copy bytes from the source address (designated by a void*) to the destination address (designated by a void*).
It is also overloaded to accept ref byte for the source and destination.
[edit]
Disappointingly it appears not to be implemented in Mono
[edit] For those who are interested in this and using Unity, you should instead look to Unity's UnsafeUtility.MemCpy
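A small sketch of the ref byte overload of Unsafe.CopyBlock mentioned above, assuming a runtime where System.Runtime.CompilerServices.Unsafe is available (it ships with current .NET and as a NuGet package for older targets). CopyBlockDemo and Clone are hypothetical names. Managed byte arrays are used here only so that the example needs no unsafe code:

```csharp
using System;
using System.Runtime.CompilerServices;

static class CopyBlockDemo
{
    // Returns a copy of source made with a single block copy.
    public static byte[] Clone(byte[] source)
    {
        byte[] destination = new byte[source.Length];
        if (source.Length > 0)
        {
            // The refs point at the first element of each array;
            // the third argument is the number of bytes to copy.
            Unsafe.CopyBlock(ref destination[0], ref source[0],
                (uint)source.Length);
        }
        return destination;
    }
}
```

For two raw addresses, the void* overload takes the pointers directly and is called the same way inside an unsafe context.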
CopyMemory aka RtlCopyMemory aka memcpy() will be just as fast whether called from C# or C (other than the tiny overhead of PInvoking the method itself).
Something to keep in mind, though, is that CopyMemory should only be used when you're sure that the source and destination ranges do not overlap. If they do overlap, you need to use MoveMemory instead, which will be slower.
Here is a declaration for CopyMeSomeMemory, showing how many different ways you can do the same thing in .Net:
[DllImport("kernel32.dll", EntryPoint = "RtlCopyMemory")]
public static extern void CopyMeSomeMemory(IntPtr Destination,
IntPtr Source, uint Length);
For the record, I think Buffer.BlockCopy in .Net just wraps one of these functions, too.
