Calling unsafe code from managed (C#). Reading byte array - c#

I have a method that I need to call from my application, but I don't really know how to do it.
This is the function that I need to call.
[DllImport(dll_Path)]
public static extern int DTS_GetDataToBuffer(int Position, int Length, char* Buffer, int* DataRead);
In my code, I have this function and I'm missing its implementation.
internal static void GetDataToBuffer(int position, int length, out byte[] data, out int dataRead)
{
unsafe
{
// the code I need
}
}
I think most of this is self-explanatory. I need to implement the latter function so that it reads the data into the buffer and reports the amount of data read (which should be the same as data.Length, but the manufacturer returns it as a separate value, so I need it).
Can anyone help? Is this clear enough?
Thank you
Edit: Here is the unmanaged declaration from the .h file. Hope it helps.
extern NAG_DLL_EXPIMP int DTS_GetDataToBuffer(int Position,
int Length,
unsigned char *Buffer,
int *DataRead );
Edit #2:
Position - the position from which to start reading the data.
Length - The amount of data to read (this would be the buffer size).
DataRead - the actual data size that was read.

I don't think you really need to use unsafe pointers here.
Declare function as
[DllImport(dll_Path)]
public static extern int DTS_GetDataToBuffer(
int position,
int length,
byte[] buffer,
ref int dataRead);
Reasonable C# wrapper for this function:
internal static byte[] GetDataToBuffer()
{
// set BufferSize to your most common data length
const int BufferSize = 1024 * 8;
// list of data blocks
var chunks = new List<byte[]>();
int dataRead = 1;
int position = 0;
int totalBytes = 0;
while(true)
{
var chunk = new byte[BufferSize];
// get new block of data
DTS_GetDataToBuffer(position, BufferSize, chunk, ref dataRead);
position += BufferSize;
if(dataRead != 0)
{
totalBytes += dataRead;
// append data block
chunks.Add(chunk);
if(dataRead < BufferSize)
{
break;
}
}
else
{
break;
}
}
switch(chunks.Count)
{
case 0: // no data blocks read - return empty array
return new byte[0];
case 1: // single data block
if(totalBytes < BufferSize)
{
// truncate data block to actual data size
var data = new byte[totalBytes];
Array.Copy(chunks[0], data, totalBytes);
return data;
}
else // single data block with size of Exactly BufferSize
{
return chunks[0];
}
default: // multiple data blocks
{
// construct new array and copy all data blocks to it
var data = new byte[totalBytes];
position = 0;
for(int i = 0; i < chunks.Count; ++i)
{
// copy data block
Array.Copy(chunks[i], 0, data, position, Math.Min(totalBytes, BufferSize));
position += BufferSize;
// we need to handle last data block correctly,
// it might be shorted than BufferSize
totalBytes -= BufferSize;
}
return data;
}
}
}
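If one call per request is enough, the signature from the question can be implemented directly on top of the byte[] / ref int declaration above. This is a minimal sketch under assumptions: the ReadFn delegate is a hypothetical stand-in for the imported DTS_GetDataToBuffer (so the wrapper can be exercised without the vendor DLL), and clamping the reported count is a defensive choice, not documented behaviour of the library.

```csharp
using System;

static class DtsWrapperSketch
{
    // Hypothetical stand-in with the same shape as the imported DTS_GetDataToBuffer.
    public delegate int ReadFn(int position, int length, byte[] buffer, ref int dataRead);

    // Reads up to 'length' bytes starting at 'position'.
    // 'data' is trimmed to the number of bytes actually read.
    public static void GetDataToBuffer(ReadFn nativeRead, int position, int length,
                                       out byte[] data, out int dataRead)
    {
        var buffer = new byte[length];
        int read = 0;
        nativeRead(position, length, buffer, ref read);

        // Defensive assumption: never trust a count outside [0, length].
        read = Math.Max(0, Math.Min(read, length));
        if (read != length)
        {
            Array.Resize(ref buffer, read);
        }
        data = buffer;
        dataRead = read;
    }
}
```

With the real import, the method group can be passed as the delegate, for example GetDataToBuffer(DTS_GetDataToBuffer, position, length, out data, out dataRead), because the parameter lists match.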

I can't test this, but I think you should let the marshaller do the conversion(s):
[DllImport(dll_Path)]
public static extern int DTS_GetDataToBuffer(int position, int length, [Out] byte[] data, out int dataRead);

I agree that you don't need an unsafe block. You are using P/Invoke, so I hope the links below are useful:
http://msdn.microsoft.com/en-us/magazine/cc164123.aspx
http://www.pinvoke.net/
There are posts on Stack Overflow about this too.

Related

Copy unmanaged System.IntPtr byte vector into GPU row of 2D device byte array

I am using C# and CUDAfy.net (yes, this problem is easier in straight C with pointers, but I have my reasons for using this approach given the larger system).
I have a video frame grabber card that collects byte[1024 x 1024] image data at 30 FPS. Every 33.3 ms it fills a slot in a circular buffer and returns a System.IntPtr that points to that unmanaged 1D vector of bytes (byte*). The circular buffer has 15 slots.
On the GPU device (Tesla K40) I want to have a global 2D array that is organized as a dense 2D array. That is, I want something like the Circular Queue but on the GPU organized as a dense 2D array.
byte[15, 1024*1024] rawdata;
// if CUDAfy.NET supported jagged arrays I could use byte[15][1024*1024], but it does not
How can I fill in a different row each 33ms? Do I use something like:
gpu.CopyToDevice<byte>(inputPtr, 0, rawdata, offset, length) // length = 1024*1024
// offset is computed as rowID*(1024*1024), where rowID wraps to 0 via modulo 15.
// inputPtr is the System.IntPtr that points to the (unmanaged) buffer in the circular queue.
// rawdata is a device buffer allocated with gpu.Allocate<byte>(1024*1024);
And in my kernel header is:
[Cudafy]
public static void filter(GThread thread, byte[,] rawdata, int frameSize, byte[] result)
I did try something along these lines, but there is no overload in CUDAfy for:
GPGPU.CopyToDevice(T) Method (IntPtr, Int32, T[,], Int32, Int32, Int32)
So I used the gpu.Cast function to change the 2D device array to 1D.
I tried the code below, but I am getting a CUDA.NET exception: ErrorLaunchFailed
FYI: When I try the CUDA emulator, it aborts on the CopyToDevice call, claiming that the data is not host allocated.
public static byte[] process(System.IntPtr data, int slot)
{
Stopwatch watch = new Stopwatch();
watch.Start();
byte[] output = new byte[FrameSize];
int offset = slot*FrameSize;
gpu.Lock();
byte[] rawdata = gpu.Cast<byte>(grawdata, FrameSize); // What is the size supposed to be? Documentation lacking
gpu.CopyToDevice<byte>(data, 0, rawdata, offset, FrameSize * frameCount);
byte[] goutput = gpu.Allocate<byte>(output);
gpu.Launch(height, width).filter(rawdata, FrameSize, goutput);
runTime = watch.Elapsed.ToString();
gpu.CopyFromDevice(goutput, output);
gpu.Free(goutput);
gpu.Synchronize();
gpu.Unlock();
watch.Stop();
totalRunTime = watch.Elapsed.ToString();
return output;
}
I propose this "solution", for now, either:
1. Run the program only in native mode (not in emulation mode).
or
2. Do not handle the pinned-memory allocation yourself.
There seems to be an open issue with that now. But this happens only in emulation mode.
see: https://cudafy.codeplex.com/workitem/636
If I understand your question properly, you are looking to convert the byte* you get from the circular buffer into a multi-dimensional byte array to be sent to the graphics card API.
int slots = 15;
int rows = 1024;
int columns = 1024;

// Try this: copy element by element into a 2D array
for (int currentSlot = 0; currentSlot < slots; currentSlot++)
{
    IntPtr intPtrToUnManagedMemory = CopyContextFrom(currentSlot);
    byte[,] rawForGpu = new byte[rows, columns];
    int offset = 0;
    for (int m = 0; m < rows; m++)
        for (int n = 0; n < columns; n++)
        {
            // read one byte at a time from unmanaged memory
            rawForGpu[m, n] = Marshal.ReadByte(intPtrToUnManagedMemory, offset++);
        }
    // then send rawForGpu to your GPU method
}

// Or try this: copy the whole frame with Marshal.Copy, then reshape it
for (int currentSlot = 0; currentSlot < slots; currentSlot++)
{
    IntPtr intPtrToUnManagedMemory = CopyContextFrom(currentSlot);
    byte[] byteData = CopyIntPtrToByteArray(intPtrToUnManagedMemory, rows * columns);
    byte[,] rawForGpu = ConvertTo2DArray(byteData, rows, columns);
}

private static byte[] CopyIntPtrToByteArray(IntPtr source, int length)
{
    byte[] data = new byte[length];
    Marshal.Copy(source, data, 0, length);
    return data;
}

private static byte[,] ConvertTo2DArray(byte[] byteArr, int rows, int columns)
{
    byte[,] data = new byte[rows, columns];
    // A rectangular array is stored row by row, so one block copy converts 1D to 2D
    Buffer.BlockCopy(byteArr, 0, data, 0, rows * columns);
    return data;
}

private static IntPtr CopyContextFrom(int slotNumber)
{
    // code that returns the byte* from the circular buffer
    return IntPtr.Zero;
}
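For the layout described in the question (one row per slot, wrapping modulo the slot count), the host-side part can be sketched as below. This is only a staging sketch in managed memory under assumptions: FrameStagingSketch and CopyFrameToRow are hypothetical names, slot is assumed non-negative, and nothing here uses or stands in for the CUDAfy device API.

```csharp
using System;
using System.Runtime.InteropServices;

static class FrameStagingSketch
{
    // Copies one frame from unmanaged memory into row (slot % slotCount)
    // of a [slotCount, frameSize] array and returns the row that was written.
    public static int CopyFrameToRow(IntPtr frame, byte[,] rows, int slot)
    {
        int slotCount = rows.GetLength(0);
        int frameSize = rows.GetLength(1);
        int row = slot % slotCount; // wrap around like the circular buffer

        var staging = new byte[frameSize];
        Marshal.Copy(frame, staging, 0, frameSize); // unmanaged -> managed

        // Rectangular arrays are stored row by row, so a row starts at
        // byte offset row * frameSize.
        Buffer.BlockCopy(staging, 0, rows, row * frameSize, frameSize);
        return row;
    }
}
```

In a 30 FPS loop the staging array would normally be allocated once and reused; it is allocated per call here only to keep the sketch short.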
You should consider using the GPGPU async functionality that's built in; it is a really efficient way to move data between host and device. Launch the kernel with gpuKern.LaunchAsync(...).
Check out http://www.codeproject.com/Articles/276993/Base-Encoding-on-a-GPU for an efficient way to use this. Another great example can be found in the CudafyExamples project; look for PinnedAsyncIO.cs. It has everything you need to do what you're describing.
This is in CudaGPU.cs in Cudafy.Host project, which matches the method you're looking for (only it's async):
public void CopyToDeviceAsync<T>(IntPtr hostArray, int hostOffset, DevicePtrEx devArray,
int devOffset, int count, int streamId = 0) where T : struct;
public void CopyToDeviceAsync<T>(IntPtr hostArray, int hostOffset, T[, ,] devArray,
int devOffset, int count, int streamId = 0) where T : struct;
public void CopyToDeviceAsync<T>(IntPtr hostArray, int hostOffset, T[,] devArray,
int devOffset, int count, int streamId = 0) where T : struct;
public void CopyToDeviceAsync<T>(IntPtr hostArray, int hostOffset, T[] devArray,
int devOffset, int count, int streamId = 0) where T : struct;

Get byte[] in C# from char* in C++

In C# I have a byte[] that I want to fill using a C++ function that returns char*.
The C++ function (in ImageData.dll)
char* pMemoryBuffer = NULL;
char* LoadData(const char *fileName)
{
// processing pMemoryBuffer ...
return pMemoryBuffer;
}
Import native dll into C#:
[DllImport(".\\Modules_Native\\ImageData.dll", EntryPoint = "LoadData")]
private extern static byte[] LoadData(string fileName);
The byte[] data in C#
byte[] buffer = new byte[256*256];
buffer = LoadData("D:\\myPic.tif");
Apparently it is not working yet, but it presents the idea of what I want to do. So I am wondering how to make this work, and what is the right way to do it. Thanks very much for your education.
Try this:
// C++
void LoadData(unsigned char** pMemoryBuffer, const char* fileName)
{
// processing ...
*pMemoryBuffer = results; // pointer to the loaded data
}
Import native dll into C#:
[DllImport(".\\Modules_Native\\ImageData.dll", EntryPoint = "LoadData")]
private extern static void LoadData(out IntPtr data, string fileName);
When the function returns, data will point to the array and you can read the contents using the Marshal class. I guess you would copy it to a new byte array:
IntPtr data;
LoadData(out data, "D:\\myPic.tif");
byte[] buffer = new byte[256*256];
Marshal.Copy(data, buffer, 0, buffer.Length); // Marshal.Copy returns void; it fills buffer in place
This should do it:
[DllImport(@".\Modules_Native\ImageData.dll")]
private extern static IntPtr LoadData(string fileName);
byte[] buffer = new byte[256*256];
Marshal.Copy(LoadData("D:\\myPic.tif"), buffer, 0, buffer.Length);
However, it won't free the memory. Hopefully the C(++) library frees it automatically during the next call, or else provides a deallocation function.
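If the library does provide a deallocation function, the copy and the release can be paired so the native buffer is released even when the copy throws. This is a minimal sketch under assumptions: NativeBufferSketch and CopyAndRelease are hypothetical names, and the free delegate stands for whatever deallocation routine the native library actually exports (none is shown in the question).

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeBufferSketch
{
    // Copies 'length' bytes from a native buffer into a new managed array,
    // then always hands the pointer to 'free'.
    public static byte[] CopyAndRelease(IntPtr nativeBuffer, int length, Action<IntPtr> free)
    {
        if (nativeBuffer == IntPtr.Zero)
        {
            return new byte[0]; // nothing was loaded
        }
        try
        {
            var managed = new byte[length];
            Marshal.Copy(nativeBuffer, managed, 0, length);
            return managed;
        }
        finally
        {
            free(nativeBuffer); // runs even if the copy throws
        }
    }
}
```

Memory must be released by the same allocator that produced it, so a buffer created with new or malloc inside the DLL should go back to a function in that DLL rather than to Marshal.FreeHGlobal.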
A better approach is to use a caller-allocated buffer; then you would just do:
byte[] buffer = new byte[256*256];
LoadData("D:\\myPic.tif", buffer);
For this, the C(++) code would need to be changed to
int LoadData(const char *fileName, char* pMemoryBuffer)
{
// processing pMemoryBuffer ...
return 1; // if success
}
and the p/invoke declaration to
[DllImport(@".\Modules_Native\ImageData.dll")]
private extern static int LoadData(string fileName, byte[] buffer);
I'm not sure, but my gut says that you can't assign a char* to a byte array, just as you can't in C++ itself. You can either use an IntPtr in C# (probably not super useful), OR, you can pass C++ a byte[] buffer and a number of bytes to write. In other words, I think the following would work:
char* pMemoryBuffer = NULL;
int size = 0;
int seek = 0;
bool LoadData(const char* filename)
{
// load filename
// set seek = 0
// set size to data size
}
int ReadData(char* buffer, int nBytesToRead)
{
// nCopyBytes = min(nBytesToRead, size - seek)
// copy nCopyBytes from pMemoryBuffer+seek to buffer
// seek += nCopyBytes
// return nCopyBytes
}
From C#, you'd use it like this:
byte[] buffer = new byte[256*256];
LoadData("foo.tif");
int bytesRead = ReadData(buffer, 256*256);
Sorry if you specifically want to avoid doing something like this.

Searching Memory For Specific Values C# (Xbox 360)

EDIT: I think I have an idea for a possible solution to the actual searching of values. Making sure the user input ends in 0 should resolve the issue. This would involve getting the last digit of the uint (which I do not know how to do, short of converting to a string, trimming the end, and converting back to uint, which is ugly but I guess could work) and then subtracting it. If anyone has any tips on how to do this, please help me out!
I've been working on a program to search memory on the Xbox 360 for specific values, if you are familiar, it is similar to "Cheat Engine". I've gotten the basics down, but I just ran into an issue. My method to search memory is dependent on starting your search at an address that will line up with your value. If that doesn't make sense to you here is the code:
private void searchInt32(int Value, uint Address, uint BytesToSearch)
{
for (uint i = 0; i <= BytesToSearch; i+=4)
{
int recoveredMem = XboxSupport.littleEndtoInt(XboxSupport.GetMem(Address + i, 4), 0);
//Recover Memory (As Bytes) and convert to integer from address (incremented based on for loop)
if (recoveredMem == Value) //Check if recovered mem = search value
{
writeToFile(Address + i, Convert.ToString(Value)); //If recovered mem = search value, write to a text file
}
siStatus.Caption = String.Format("Searching Bytes {0} out of {1}...", i, BytesToSearch); //Update status caption
}
}
As you can see, the code is kept to a minimum and it's also about as fast as possible when it comes to recovering memory from a console. But, if the 4 bytes it recovers don't line up with the value, it will never return what you want. That's obviously a serious issue because the user won't know where their value is or what address to start at to return the correct value. I then attempted to use the following code to fix the issue:
private void searchUInt32(uint Value, uint Address, uint BytesToSearch)
{
siStatus.Caption = String.Format("Recovering Memory...");
byte[] data = XboxSupport.GetMem(Address, BytesToSearch); //Dump Console Memory
FileStream output = new FileStream("SearchData.dat", FileMode.Create);
BinaryWriter writer = new BinaryWriter(output);
writer.Write(data); //Write dump to file
writer.Close();
output = new FileStream("SearchData.dat", FileMode.Open);
BinaryReader reader = new BinaryReader(output); //Open dumped file
for (uint i = 0; i *4 < reader.BaseStream.Length; i++)
{
byte[] bytes = reader.ReadBytes(4); //Read the 4 bytes
Array.Reverse(bytes);
uint currentValue = BitConverter.ToUInt32(bytes, 0); //Convert to UInt
if(currentValue == Value) //Compare
writeToFile(Address + i * 4, Convert.ToString(Value));
siStatus.Caption = String.Format("Searching Bytes {0} out of {1}...", i * 4, BytesToSearch);
}
reader.Close();
File.Delete("SearchData.dat");
}
There is a lot more code, but essentially it does the same thing, just using a file. My original goal was to have users be able to input their own memory blocks to be searched, but right now it seems that just won't work. I do not really want to have the program search all of the memory because that might end up being a slow process (depending on the size of the process being dumped) and often times the values being looked for can be narrowed down to areas of writeable code, removing junk addresses from the executable portion of the process. I am just looking to see if anyone has any suggestions, I was thinking I could possibly get the entry address from the process (I have a function for it) and using a little math correct user input addresses to work properly but I wasn't entirely sure how to do it. If anyone has any suggestions or solutions I'd appreciate any help I can get. If any of my post needs to be clarified/cleaned up please let me know, I'll be glad to do anything that might help me to an answer.
Thanks!
Edit: Temporary (hopefully) Solution:
When I load addresses into the tool they are loaded as strings from a text file, then a conversion to uint is attempted. I solved the alignment issue using the following code:
sA[0] = sA[0].Remove(sA[0].Length - 1) + "0"; //Remove last character and replace w/ 0
//Add 16 to the search length
Instead of dumping memory to disk and reading every iteration, scan the target process' memory in chunks, and then marshal the data to leverage the efficiency of pointer arithmetic.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.InteropServices;
namespace MemoryScan {
internal class Program {
[DllImport("kernel32.dll", SetLastError = true)]
private static extern bool ReadProcessMemory(IntPtr hProcess, IntPtr lpBaseAddress, [Out] byte[] lpBuffer, int dwSize, out int lpNumberOfBytesRead);
private static unsafe void Main(string[] args) {
Process process = Process.GetProcessesByName("notepad")[0]; //example target process
int search = 100; //search value
int segment = 0x10000; //stay below the large object heap threshold (85,000 bytes)
int range = 0x7FFFFFFF - segment; //32-bit example
int bytesRead;
List<int> addresses = new List<int>();
DateTime start = DateTime.Now;
for (int i = 0; i < range; i += segment) {
byte[] buffer = new byte[segment];
if (!ReadProcessMemory(process.Handle, new IntPtr(i), buffer, segment, out bytesRead)) {
continue;
}
IntPtr data = Marshal.AllocHGlobal(bytesRead);
Marshal.Copy(buffer, 0, data, bytesRead);
for (int j = 0; j + sizeof(int) <= bytesRead; j++) { //stop before a 4-byte read would run past the copied data
int current = *(int*)(data + j);
if (current == search) {
addresses.Add(i + j);
}
}
Marshal.FreeHGlobal(data);
}
Console.WriteLine("Duration: {0} seconds", (DateTime.Now - start).TotalSeconds);
Console.WriteLine("Found: {0}", addresses.Count);
Console.ReadLine();
}
}
}
Test Results
Duration: 1.142 seconds
Found: 3204
Create a helper class with a generic method to make type marshaling easier, like so:
public static class MarshalHelper
{
public unsafe static T Read<T>(IntPtr address)
{
object value;
switch (Type.GetTypeCode(typeof(T)))
{
case TypeCode.Int16:
value = *(short*)address;
break;
case TypeCode.Int32:
value = *(int*)address;
break;
case TypeCode.Int64:
value = *(long*)address;
break;
default:
throw new ArgumentOutOfRangeException();
}
return (T)value;
}
}
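The alignment problem from the question can also be avoided without touching the user's start address: once a block has been dumped into a byte[], compare at every byte offset instead of every fourth one. This is a minimal sketch under assumptions: MemorySearchSketch and FindUInt32BigEndian are hypothetical names, the dump is assumed to come from something like the question's XboxSupport.GetMem, and values are assumed to be stored big-endian, as the Array.Reverse call in the question implies.

```csharp
using System;
using System.Collections.Generic;

static class MemorySearchSketch
{
    // Returns the address of every occurrence of 'value' (stored big-endian)
    // in 'data', checking every byte offset so alignment does not matter.
    public static List<uint> FindUInt32BigEndian(byte[] data, uint baseAddress, uint value)
    {
        var hits = new List<uint>();

        // Split the value once, most significant byte first.
        byte b0 = (byte)(value >> 24);
        byte b1 = (byte)(value >> 16);
        byte b2 = (byte)(value >> 8);
        byte b3 = (byte)value;

        // Stop while four bytes are still available.
        for (int i = 0; i + 4 <= data.Length; i++)
        {
            if (data[i] == b0 && data[i + 1] == b1 && data[i + 2] == b2 && data[i + 3] == b3)
            {
                hits.Add(baseAddress + (uint)i);
            }
        }
        return hits;
    }
}
```

Searching a dump taken in one GetMem call also avoids one console round trip per candidate address; updating the status caption every few thousand offsets rather than on every iteration keeps the loop fast.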

Marshal float* to C#

I have a DLL which exports a function that returns a float*, and I would like to use it in my C# code. I am not sure how to marshal my float* so that I can safely use it in C#.
So, in my C++ DLL, I have declared:
static float* GetSamples(int identifier, int dataSize);
In my C# script, I have:
[DllImport ("__Internal")]
public static extern float[] GetSamples (int identifier, int dataSize);
The C++ GetSamples(int,int) allocates memory and returns a pointer to the float array. How do I declare the C# GetSamples to marshal my float array, and how do I access the data (either by iteration or Marshal.Copy)?
Also, can I delete the float* from C# or do I have to call another C++ function to delete the allocated memory?
EDIT:
So this is what I have tried up to now.
First, on the C# side:
Declaration:
[DllImport ("__Internal")]
public static extern int GetSamples ([In, Out]IntPtr buffer,int length, [Out] out IntPtr written);
Trying to call it:
IntPtr dataPointer = new IntPtr();
IntPtr outPtr;
GetSamples(dataPointer, data.Length, out outPtr);
for (var i = 0; i < data.Length; i++){
copiedData[i] = Marshal.ReadByte(dataPointer, i);
}
Then in my C++ lib:
int AudioReader::RetrieveSamples(float * sampleBuffer, size_t dataLength, size_t * /* out */ written)
{
float* mydata = new float[dataLength];
//This is where I copy the actual data into mydata
memcpy(sampleBuffer, mydata, dataLength*sizeof(float));
delete data;
return dataLength;
}
I don't really know what outPtr is for... And I know I have some additional copying steps that I can remove; I just want to get it working for now.
So this is a bit of a complicated answer...
.NET doesn't know how to free memory allocated by C++ (for example with new), so returning a float* is dangerous at best. Furthermore, the interop marshaller assumes that memory it takes ownership of was allocated with CoTaskMemAlloc, not that it really helps you here. So here is what I would suggest:
int AudioReader::RetrieveSamples(
    float * sampleBuffer,
    int dataLength,          // size of sampleBuffer in bytes
    int * /* out */ written)
{
    // assuming mydata (float*) and mydataBytes (its size in bytes) are already defined;
    // sizeof(mydata) would only give the size of the pointer
    if(sampleBuffer == NULL || dataLength == 0)
    {
        *written = mydataBytes; // report the required buffer size in bytes
        return -1;
    }
    ZeroMemory(sampleBuffer, dataLength);
    int toCopy = min(dataLength, mydataBytes);
    // This is where the actual data is copied from mydata into sampleBuffer
    memcpy(sampleBuffer, mydata, toCopy);
    *written = toCopy;
    return 0;
}

[DllImport("__Internal")]
private static extern int GetSamples(
    [In, Out] IntPtr buffer,
    [In] int length,
    [Out] out int written);

float[] RetrieveFloats()
{
    // First call: no buffer, just ask how many bytes are needed
    int bytesToAllocate = 0;
    GetSamples(IntPtr.Zero, 0, out bytesToAllocate);
    if(bytesToAllocate == 0)
        return null;
    int floatCount = bytesToAllocate / sizeof(float);
    float[] toReturn = new float[floatCount];
    IntPtr allocatedMemory = Marshal.AllocHGlobal(bytesToAllocate);
    int written = 0;
    // Second call: let the native side fill the buffer
    if(GetSamples(allocatedMemory, bytesToAllocate, out written) != -1)
    {
        floatCount = written / sizeof(float);
        Marshal.Copy(allocatedMemory, toReturn, 0, floatCount);
    }
    Marshal.FreeHGlobal(allocatedMemory);
    return toReturn;
}
Passing a buffer length of zero returns the space required for the buffer, which can then be allocated and passed in.
You need to allocate the memory for the buffer in C#; memory allocated with new in C++ cannot be freed from C#.
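The two-call pattern can be made exception-safe by releasing the unmanaged buffer in a finally block. This is a minimal sketch under assumptions: SampleReaderSketch is a hypothetical name, the GetSamplesFn delegate is a stand-in with the same shape as the GetSamples import above (so it can be run without the native library), and returning an empty array instead of null is a design choice of the sketch.

```csharp
using System;
using System.Runtime.InteropServices;

static class SampleReaderSketch
{
    // Hypothetical stand-in with the same shape as the GetSamples import.
    public delegate int GetSamplesFn(IntPtr buffer, int lengthInBytes, out int writtenBytes);

    public static float[] RetrieveFloats(GetSamplesFn getSamples)
    {
        // First call: no buffer, only ask for the required size in bytes.
        int bytesNeeded;
        getSamples(IntPtr.Zero, 0, out bytesNeeded);
        if (bytesNeeded <= 0)
        {
            return new float[0];
        }

        IntPtr native = Marshal.AllocHGlobal(bytesNeeded);
        try
        {
            // Second call: the native side fills the caller-allocated buffer.
            int written;
            if (getSamples(native, bytesNeeded, out written) != 0)
            {
                return new float[0];
            }
            var samples = new float[written / sizeof(float)];
            Marshal.Copy(native, samples, 0, samples.Length);
            return samples;
        }
        finally
        {
            Marshal.FreeHGlobal(native); // runs even if the native call throws
        }
    }
}
```

With the real import, the method group can be passed directly, for example RetrieveFloats(GetSamples).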

Suggestions for a thread safe non-blocking buffer manager

I've created a simple buffer manager class to be used with asynchronous sockets. It is meant to protect against memory fragmentation and improve performance. Any suggestions for further improvements or other approaches?
public class BufferManager
{
private int[] free;
private byte[] buffer;
private readonly int blocksize;
public BufferManager(int count, int blocksize)
{
buffer = new byte[count * blocksize];
free = new int[count];
this.blocksize = blocksize;
for (int i = 0; i < count; i++)
free[i] = 1;
}
public void SetBuffer(SocketAsyncEventArgs args)
{
for (int i = 0; i < free.Length; i++)
{
if (1 == Interlocked.CompareExchange(ref free[i], 0, 1))
{
args.SetBuffer(buffer, i * blocksize, blocksize);
return;
}
}
args.SetBuffer(new byte[blocksize], 0, blocksize);
}
public void FreeBuffer(SocketAsyncEventArgs args)
{
int offset = args.Offset;
byte[] buff = args.Buffer;
args.SetBuffer(null, 0, 0);
if (buffer == buff)
free[offset / blocksize] = 1;
}
}
Edit:
The original answer below addresses a code construction issue of overly tight coupling. However, considering the solution as a whole, I would avoid using just one large buffer and handing out slices of it in this way. You expose your code to buffer overrun (and, shall we call them, buffer "underrun") issues. Instead I would manage an array of byte arrays, each being a discrete buffer. The offset handed over is always 0 and the size is always the length of the buffer. Any bad code that attempts to read or write beyond the boundaries will be caught.
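A minimal sketch of that discrete-buffer idea, under assumptions: DiscreteBufferPool, Rent and Return are hypothetical names, and ConcurrentStack is one possible lock-free container, not something the original code uses.

```csharp
using System;
using System.Collections.Concurrent;

sealed class DiscreteBufferPool
{
    private readonly ConcurrentStack<byte[]> free = new ConcurrentStack<byte[]>();
    private readonly int blockSize;

    public DiscreteBufferPool(int count, int blockSize)
    {
        this.blockSize = blockSize;
        // Allocate every buffer up front, one array per block.
        for (int i = 0; i < count; i++)
        {
            free.Push(new byte[blockSize]);
        }
    }

    // Never blocks: falls back to a fresh array when the pool is empty.
    public byte[] Rent()
    {
        byte[] buffer;
        return free.TryPop(out buffer) ? buffer : new byte[blockSize];
    }

    // Accepts any array of the right size, so fallback buffers join the pool.
    public void Return(byte[] buffer)
    {
        if (buffer != null && buffer.Length == blockSize)
        {
            free.Push(buffer);
        }
    }
}
```

A consumer would call args.SetBuffer(buffer, 0, buffer.Length) with the rented array. The sketch does not detect a buffer being returned twice, and separate arrays give up the single contiguous block of the original design, which may matter if the goal is to limit fragmentation caused by pinned buffers.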
Original answer
You've coupled the class to SocketAsyncEventArgs, when in fact all it needs is a function to assign the buffer. Change SetBuffer to:-
public void SetBuffer(Action<byte[], int, int> fnSet)
{
for (int i = 0; i < free.Length; i++)
{
if (1 == Interlocked.CompareExchange(ref free[i], 0, 1))
{
fnSet(buffer, i * blocksize, blocksize);
return;
}
}
fnSet(new byte[blocksize], 0, blocksize);
}
Now you can call from consuming code something like this:-
myMgr.SetBuffer((buf, offset, size) => myArgs.SetBuffer(buf, offset, size));
I'm not sure that type inference is clever enough to resolve the types of buf, offset, size in this case. If not you will have to place the types in the argument list:-
myMgr.SetBuffer((byte[] buf, int offset, int size) => myArgs.SetBuffer(buf, offset, size));
However now your class can be used to allocate a buffer for all manner of requirements that also use the byte[], int, int pattern which is very common.
Of course you need to decouple the free operation too, but that's:-
public void FreeBuffer(byte[] buff, int offset)
{
if (buffer == buff)
free[offset / blocksize] = 1;
}
This requires you to call SetBuffer on the EventArgs in consuming code in the case of SocketAsyncEventArgs. If you are concerned that this approach reduces the atomicity of freeing the buffer and removing it from the socket's use, then sub-class this adjusted buffer manager and include SocketAsyncEventArgs-specific code in the sub-class.
I've created a new class with a completely different approach.
I have a server class that receives byte arrays. It will then invoke different delegates handing them the buffer objects so that other classes can process them. When those classes are done they need a way to push the buffers back to the stack.
public class SafeBuffer
{
private static Stack bufferStack;
private static byte[][] buffers;
private byte[] buffer;
private int offset, length;
private SafeBuffer(byte[] buffer)
{
this.buffer = buffer;
offset = 0;
length = buffer.Length;
}
public static void Init(int count, int blocksize)
{
bufferStack = Stack.Synchronized(new Stack());
buffers = new byte[count][];
for (int i = 0; i < buffers.Length; i++)
buffers[i] = new byte[blocksize];
for (int i = 0; i < buffers.Length; i++)
bufferStack.Push(new SafeBuffer(buffers[i]));
}
public static SafeBuffer Get()
{
return (SafeBuffer)bufferStack.Pop();
}
public void Close()
{
bufferStack.Push(this);
}
public byte[] Buffer
{
get
{
return buffer;
}
}
public int Offset
{
get
{
return offset;
}
set
{
offset = value;
}
}
public int Length
{
get
{
return buffer.Length;
}
}
}
