I am looking for the best way to transfer a large amount of data from C++ (struct or a value class?) into a C# class doing as little data copying as possible. In the sample code below, I have a vector of SubClass objects that has the potential to be very large (10+ million). So I want to avoid a data copy if possible.
Should I/can I just allocate the objects in GC first and use them directly in c++ and forget about the native c++ structures? (Performance is my concern with this one.)
Or, is there some trick that I leverage what is allocated in C++ without causing a data copy?
Here is a sample of something along the lines of what I want to use as a transfer between managed and unmanaged code.
#include <string>
#include <vector>
struct SubClass {
std::string DataItem1;
// lots more here
std::string DataItem50;
};
struct Sample {
int IntValue;
std::string StringValue;
std::vector<std::string> SmallList;
std::vector<SubClass> HugeList;
};
If I can avoid getting into the weeds with pinvoke and COM classes, I would prefer it.
Following the example from Unity (who uses C#), Native plugin example uses a GC handle to transfer data from C# to C++. We can try the opposite to send data to C++ from C#.
Pin down a C# variable to allow faster copying.
using System;
using System.Collections;
using System.Runtime.InteropServices;
// vertices is a Vector3[], where Vector3 is a struct
// of 3 floats using a sequential layout attribute
void test(){
GCHandle gcVertices = GCHandle.Alloc (vertices, GCHandleType.Pinned);
}
Transfer the handle to C++ using marshaling. It's unavoidable that you have to copy something. Here copying a pointer should be good enough. More on marshaling according to Microsoft doc.
[DllImport("your dll")]
private static extern void SendHandle(IntPtr vertexHandle, int vertexCount);
SendHandle(gcVertices, vertices.Length);
Inside C++, you'll receive the handle as a pointer type to a C++ type of your choosing. In this case, vertices are a list of structs of 3 floats. The reference code decided to use float *. You just need to do pointer arithmetic properly depending on the pointed type, including the case of void *.
extern "C" __decl(dllexport) void SendHandle(float* vertices, int vertexCount);
Here the example code copies data directly from the pointer, but you can also write to the pointer's location.
for (int i = 0 ; i < vertexCount; i++)
{
// read from C# heap
float x = vertices[0];
float y = vertices[1];
float z = vertices[2];
// write to C# heap
*vertices = sqrt(x);
*(vertices + 1) = sqrt(y);
*(vertices + 2) = sqrt(z);
vertices += 3; // because it is a list of struct of 3 floats
}
Clean up the pinned handle from the C# side to resume the garbage collector.
gcVertices.Free();
As for strings, I believe the interop library has an implementation that handles pointer arithmetic and copying for you. You could probably just use a string type directly inside the exposed export function, as long as you specify how to marshal it with the MarshalAs attribute in C# and a library in C++ if you are not converting to the C type char *.
Related
I have an array of arrays of this struct (shown here in C#, but existing in C++ as well):
[StructLayout(LayoutKind.Sequential)]
public struct MyStruct
{
IntPtr name; //pointer to string, char* on C++ side
long pValues;
long jValues;
long eValues;
long kValues;
int cost;
};
and an algorithm in a C++ DLL that does work on it, being called from managed C# code. It's CPU-heavy, which is what necessitates this as it runs much faster in C++ than C#. The managed (C#) side never has to know the contents of the struct data, as the algorithm only returns a single array of ints.
So, how would I go about storing this data in the most efficient way (ie with the least overhead), for the lifetime of the application? I think I have it narrowed down to two options:
Initialize structs and set values in C#, pin memory with GCHandle and pass reference to C++ whenever I want to do work (see this post on Unity forums)
Initialize structs and set values in C++, have structs persist in memory on unmanaged side
So my questions are very specific:
With 1, I'm confused as to how marshalling works. It looks like in MSDN: Copying and Pinning that you are able to pass arrays of structures by pinning and passing a reference to the pinned data, without having to copy or convert any of it (and as long as the struct looks the same on both sides). Am I reading that correctly, is that how it actually works? Referring to the Unity3d forum post, I see Marshal.PtrToStructure being called; I thought that performs copying operations? As the data would be stored on the managed side in this instance, having to copy and/or convert the data every time the C++ function is called would cause a lot of overhead, unless I'm thinking that those type of operations are a lot more expensive than they actually are.
With 2, I'm wondering if it's possible to have persistence between C++ calls. To the best of my knowledge, if you're P/Invoking from a DLL, you can't have persistent data on the unmanaged side, so I can't just define and store my struct arrays there, making the only data transferred between managed and unmanaged the int array resulting from the unmanaged algorithm. Is this correct?
Thank you very much for taking the time to read and help!
If the C# code does not need to know the internals of the array and the structure, don't expose it to the C# code. Do all the work on this type in the unmanaged code and avoid marshalling overhead.
Essentially, you want to follow this basic pattern. I'm sure the details will differ, but this should give you the basic concept.
C++
MyStruct* newArray(const int len)
{
return new MyStruct[len];
}
void workOnArray(MyStruct* array, const int len)
{
// do stuff with the array
}
void deleteArray(const MyStruct* array)
{
delete[] array;
}
C#
[DllImport(dllname)]
static extern IntPtr newArray(int len);
[DllImport(dllname)]
static extern void workOnArray(IntPtr array int len);
[DllImport(dllname)]
static extern void deleteArray(IntPtr array);
I'm trying to create a solution where I can run a 2D int array within a C# program through CUDA, so the approach I'm currently taking to try and do this is by creating a C++ dll which can handle the CUDA code then return the 2D array. The code I'm using to send my array to the dll and back again is below.
#include "CudaDLL.h"
#include <stdexcept>
int** cudaArrayData;
void CudaDLL::InitialiseArray(int arrayRows, int arrayCols, int** arrayData)
{
cudaArrayData = new int*[arrayCols];
for(int i = 0; i < arrayCols; i++)
{
cudaArrayData[i] = new int[arrayRows];
}
cudaArrayData = arrayData;
}
int** CudaDLL::ReturnArray()
{
return cudaArrayData;
}
The problem however is I get an error in C# on the return, "Cannot marshal 'return value': Invalid managed/unmanaged type combination." My hope was if I returned the array back as a pointer C# might have hopefully understood and accepted it however no such luck.
Any idea's?
As you’re using int[,] in C#, int** is not the right corresponding type in C++. int[][] is an array of arrays of ints, similar to int** in C++; whereas int[,] is one array of ints with 2D indexing: index = x + y * width. Using int** in C#/C++ interop is difficult, as you have many pointers to either managed or unmanaged memory, which is not directly accessible from one to another (see further down).
Already in InitialiseArray(..., int** arrayData) you read somewhere in your memory but not in array values as you don't pass an array with pointers to arrays of ints, you pass one single array of int.
When you return int** in ReturnArray(), your problem is that .net has no clue how to interpret that pointer to an pointer.
To fix this, use only int* on C++ side and don’t return the array as a function return value, this would only give you a pointer to the unmanaged array and not the entire data in managed memory. It is possible to use this from C#, but probably not in the way you intend to do. Use an array allocated in C# as function argument in void ReturnArray(int* retValues) to copy the data to.
The other problem is then data copying and memory allocations. You can avoid all of these steps if you handle memory in C# right, i.e. forbid the garbage collector to move your data around (it does that when cleaning up unused objects). Either use a fixed{} statement or do it manually via GCHandle.Alloc(array, GCHandleType.Pinned). Doing so, you can directly use the C# allocated arrays from within C++.
Finally, if all you need is to let a CUDA kernel run on your C# data, have a look at some cuda wrappers that handle all the Pinvoke hazards for you. Like managedCuda (I maintain this project) or cudafy and some others.
I have a C++ library, header of which looks like:
void NotMyFun(double * a, int * b);
The function reads from a, and writes to b. To call the library I have created a C++/CLI wrapper, where below function is defined:
static void MyWrapperFun(double * a, int * b)
{
NotMyFun(a,b);
}
and works OK. From C# code, say, I have two managed arrays, i.e. double[] ma and double[] mb, where ma already holds some meaningful data, and mb is -meaningfully- filled when wrapper is called. Is below an OK way to call the wrapper function?
unsafe
{
fixed (double* pma = ma)
{
fixed (int* pmb = mb)
{
MyWrapperNS.MyWrapperClass.MyWrapperFun(pma,pmb);
}
}
}
Are the unsafe pointers a fast way? Is any data copying involved here while passing and retrieving to/from C++/CLI wrapper? Or pointers are already pointing to a continuous memory space in C# arrays?
Besides, do I need any manual memory cleaning here? If the pointers are tied to the memory of managed C# arrays, I guess they are properly garbage collected after, but just want to be sure.
Personally I think you are over-complicating things. I'd avoid the unsafe code and skip the C++/CLI layer. I'd use a simple p/invoke declared like this:
[DllImport(#"mylib.dll")]
static extern void NotMyFun(double[] a, int[] b);
Because double and int are blittable types, no copying is necessary. The marshaller just pins the arrays for the duration of the call.
I want to send a struct in C# to C++ using sockets.
For example, I use this struct:
[StructLayout(LayoutKind.sequential, Pack = 1)]
struct pos {
public int i;
public float x;
};
If I somehow convert it into bytes and send over the network, I should be able to cast it to this in c++:
struct pos {
int i;
float x;
};
... I think.
1) how do you break down a struct instance in C# to send it over the network?
2) can I safely cast it to the c++ struct once I get it?
Thanks
The marshaller helps you with converting between .NET structs and raw bytes. In this answer, I posted a simple solution, which boils down to Marshal.StructureToPtr and Marshal.PtrToStructure. In contrast to the more advanced solutions provided by Johann du Toit, this is in my opinion the best thing you can do if all you want to do is to push some structures through a byte stream.
If you do this, you can safely cast to the C++ struct if the length is correct, and your C++ struct is declared with the same packing as the C# struct (i.e. #pragma pack in VC++ or __attribute__((packed)) in GCC).
Note that this also works with fixed length C strings, but will not take care of the endianness of larger values. I found it a simple solution to provide getters and setters for the latter problem which just swap the bytes accordingly (with BitConverter).
Some elaboration on the packing:
Take the following structure:
struct MyStruct {
uint8_t a;
float b;
};
With the C# declaration with StructLayout, Pack = 1, this struct will have a size of five bytes. The C++ struct, however, may have eight bytes (or even more), depending on the default packing of the compiler, who may happily insert some padding bytes to align the float value on a 32-bit boundary (just an example). Because of this, you have to apply the very same packing to both the C# and C++ struct. In Visual C++:
#pragma pack(push, 1)
// ... struct declarations...
#pragma pack(pop)
This means all structs declared between the two pragmas will have a packing of one. In GCC:
struct x {
// ...
} __attribute__((packed));
This will do the same. You can #define __attribute__(x) on Windows platforms and #ifdef _WIN32 around the pragmas to make the code compatible with both worlds.
You can either encode it in a format like JSON (There are a lot of JSON parsers out there, check on the json.org website for a list), XML or just roll your own. You could also try already built libraries like Protobuf, which allows you to serialize your structures that you would create with a file in .proto format (And use Protobuf-Net for C#). Another option would be Thrift which provides a way to serialize but also supplies a ready to use RCP system. It support c#, c++ and a ton of other languages by default.
So it's depends on taste, take your pick :D
I have a C# front end and a C++ backend for performance reasons.
Now I would like to call a C++ function like for example:
void findNeighbors(Point p, std::vector<Point> &neighbors, double maxDist);
What I'd like to have is a C# wrapper function like:
List<Point> FindNeigbors(Point p, double maxDist);
I could pass a flat array like Point[] to the unmanaged C++ dll, but the problem is, that I don't know how much memory to allocate, because I don't know the number of elements the function will return...
Is there an elegant way to handle this without having troubles with memory leaks?
Thanks for your help!
Benjamin
The best solution here is to write a wrapper function in C which is limited to non-C++ classes. Non-trivial C++ classes are essentially unmarshable via the PInvoke layer [1]. Instead have the wrapper function use a more traditional C signature which is easy to PInvoke against
void findNeigborsWrapper(
Point p,
double maxDist,
Point** ppNeighbors,
size_t* pNeighborsLength)
[1] Yes there are certain cases where you can get away with it but that's the exception and not the rule.
The impedance mismatch is severe. You have to write a wrapper in the C++/CLI language so that you can construct a vector. An additional problem is Point, your C++ declaration for it is not compatible with the managed version of it. Your code ought to resemble this, add it to a class library project from the CLR node.
#include <vector>
using namespace System;
using namespace System::Collections::Generic;
struct Point { int x; int y; };
void findNeighbors(Point p, std::vector<Point> &neighbors, double maxDist);
namespace Mumble {
public ref class Wrapper
{
public:
List<System::Drawing::Point>^ FindNeigbors(System::Drawing::Point p, double maxDist) {
std::vector<Point> neighbors;
Point point; point.x = p.X; point.y = p.Y;
findNeighbors(point, neighbors, maxDist);
List<System::Drawing::Point>^ retval = gcnew List<System::Drawing::Point>();
for (std::vector<Point>::iterator it = neighbors.begin(); it != neighbors.end(); ++it) {
retval->Add(System::Drawing::Point(it->x, it->y));
}
return retval;
}
};
}
Do note the cost of copying the collection, this can quickly erase the perf advantage you might get out of writing the algorithm in native C++.
In order to reduce overhead from copying (if that does cause performance problems) it would be possible to write a C++/CLI ref class around std::vector<>. That way the c++ algorithm can work on c++ types and C# code can access the same data without excessive copying.
The C++/CLI class could implement operator[] and Count in order to avoid relying on IEnumerable::GetEnumerator ().
Or write your wrapper in C++/CLI. Have it take a CLS-compliant type such as IEnumerable and then (sigh) copy each element into your vector, then call the PInvoke.