C# Char* to String

C# Char* to String - c#

I've looked around a lot and can't seem to find a solution to anything similar to what I'm doing. I have two applications, a native C++ app and a managed C# app. The C++ app allocates a pool of bytes that are used in a memory manager. Each object allocated in this memory manager has a header, and each header has a char* that points to a name given to the object. The C# app acts as a viewer for this memory. I use memory mapped files to allow the C# app to read the memory while the C++ app is running. My issue is that I am trying to read the name of the object from the header structure and display it in C# (or just store it in a String, whatever). Using unsafe code I am able to convert the four bytes that make up the char* into an IntPtr, convert that to a void*, and call Marshal.PtrToStringAnsi. This is the code:
IntPtr namePtr = new IntrPtr(BitConverter.ToInt32(bytes, index));
unsafe
{
void* ptr = namePtr.ToPointer();
char* cptr = (char*)ptr;
output = Marshal.PtrToStringAnsi((IntPtr)ptr);
}
In this case, bytes is the array read from the memory mapped file that represents all of the pool of bytes created by the native app, and index is the index of the first byte of the name pointer.
I have verified that, on the managed side of things, the address returned by the call to namePtr.ToPointer() is exactly the address of the name pointer in the native app. If this were native code, I would simply cast ptr to a char* and it would be fine, but in managed code I've read I must use the Marshaller to do this.
This code yields varying results. Sometimes cptr is null, sometimes it points to \0, and other times it points to a few Asian characters (which when run through the PtrToStringAnsi method produce seemingly irrelevant characters). I thought it might be a fixed thing, but ToPointer produces a fixed pointer. And sometimes after the cast to a char* the debugger says Unable to evaluate the expression. The pointer is not valid or something like that (it's not easy to repro every varying thing that comes back). And other times I get an access violation when reading the memory, which leads me to the C++ side of things.
On the C++ side, I figured there might be some issues with actually reading the memory because although the memory that stores the pointer is part of the memory mapped file the actual bytes that make up the text are not. So I looked at how to change read/write access to memory (on Windows, mind you) and found the VirtualProtect method in the Windows libraries, which I use to change access to the memory to PAGE_EXECUTE_WRITECOPY, which I figured would give any application that has a pointer to that address will be able to at least read what's there. But that didn't solve the issue either.
To put it shortly:
I have a pointer (in C#) to the first char in a char array (that was allocated in a C++ app) and am trying to read that array of char's into a string in C#.
EDIT:
The source header looks like this:
struct AllocatorHeader
{
// These bytes are reserved, and their purposes may change.
char _reserved[4];
// A pointer to a destructor mapping that is associated with this object.
DestructorMappingBase* _destructor;
// The size of the object this header is for.
unsigned int _size;
char* _name;
};
The _name field is the one I'm trying to dereference in C#.
EDIT:
As of now, even using the solutions provided below, I am unable to dereference this char* in managed code. As such, I have simply made a copy of the char* in the pool referenced by the memory mapped file and use a pointer to that. This works, which makes me believe this is a protection-related issue. If I find a way to circumvent this at some point, I will answer my own question. Until then, this will be my workaround. Thanks to all who helped!

This seems to work for me in my simple test:
private static unsafe String MarshalUnsafeCStringToString(IntPtr ptr, Encoding encoding) {
void *rawPointer = ptr.ToPointer();
if (rawPointer == null) return "";
char* unsafeCString = (char*)rawPointer;
int lengthOfCString = 0;
while (unsafeCString[lengthOfCString] != '\0') {
lengthOfCString++;
}
// now that we have the length of the string, let's get its size in bytes
int lengthInBytes = encoding.GetByteCount (unsafeCString, lengthOfCString);
byte[] asByteArray = new byte[lengthInBytes];
fixed (byte *ptrByteArray = asByteArray) {
encoding.GetBytes(unsafeCString, lengthOfCString, ptrByteArray, lengthInBytes);
}
// now get the string
return encoding.GetString(asByteArray);
}
Perhaps a bit convoluted, but assuming your string is NUL-terminated it should work.

Your code fails because pointers are only valid and meaningful in the process which owns them. You map the memory into a different process and there is a completely different virtual address space. Nothing says that the memory will be mapped into the same virtual address.
If you knew the base address in both processes you could adjust the pointers. But frankly your entire approach is deeply flawed. You really should use some real IPC here. Named pipes, sockets, WCF, even good old windows messages would suffice.
In a comment you say:
If the char array is allocated on the heap, I can dereference this pointer fine. But when the char* is initialized to something like char* a = "Hello world"; it doesn't work.
Of course not. The string literal is statically allocated by the compiler and resides in the executable module's address space. You'd need to allocate a string in your shared heap and use strcpy to copy the literal over to the shared string.

Related

How do I get a byte* for the bits in a BitArray?

I am working on a C++ CLI wrapper of a C API. One function in the C API expected data in the form:
void setData(byte* dataPtr, int offset, int length);
void getData(byte* buffer, int offset, int length);
For the C++ CLI it was suggested that we use a System.Collections.BitArray (Yes the individual Bits have meaning). A BitArray can be constructed from an array of bytes and copied to one:
array<System::Byte>^ bytes = gcnew array<System::Byte>(40);
System::Collections::BitArray^ ba = gcnew System::Collections::BitArray(bytes);
int length = ((ba->Length - 1)/8) +1;
array<System::Byte>^ newBytes = gcnew array<System::Byte>(length);
ba->CopyTo(newBytes, 0);
pin_ptr<unsigned char> rawDataPtr = &buffer[0];
My concern is the last line. Is it valid to get a pointer from the array in this way? Is there a better alternative in C# for working with arbitrary bits? Remember the individual bits have meaning.

Is it valid to get a pointer from the array in this way?
Yes, that's valid. The pin_ptr<> helper class calls GCHandle.Alloc() under the hood, asking for GCHandleType.Pinned. So the pointer is stable and can be passed to unmanaged code without fear that the garbage collector is going to move the array and make the pointer invalid.
A very important detail is missing from the question however. The reason that pin_ptr<> exists instead of just letting you use GCHandle directly is exactly when the GCHandle.Free() method will be called. You don't do this explicitly, pin_ptr<> does it for you, it uses the standard C++ RAII pattern. In other words, the Free() method is automatically called, it happens when the variable goes out of scope. Which gets the C++ compiler to emit the destructor call, it in turns calls Free().
This will go very, very wrong when the C function stores the passed dataPtr and uses it later. Later being the problem, the array won't be pinned anymore and can now exist at an arbitrary address. Major data corruption, very hard to diagnose. The getData() function strongly suggests that is fact the case. This is not good.
You will need to fix this, using GCHandle::Alloc() yourself to pin the array permanently is very painful to garbage collector, a rock in the road that won't budge and has a long-lasting effect on the efficiency of the program. Instead you should copy the managed array to stable memory that you allocate with, say, malloc() or Marshal::AllocHGlobal(). That's unmanaged memory, it will never move. Marshal::Copy() is a simple way to copy it.

Are there memory security levels in .NET interop?

I have a quite strange problem:
I am testing several function calls to a unmanaged C dll with NUnit. The odd thing is, the test fails when it runs normally, but when i run it with the debugger (even with no break point) it passes fine.
So, has the debugger a wider memory access as the plain NUnit application?
i have isolated the call which fails. its passing back a char pointer to a string, which the marshaller should convert to a C# string. the C side looks like this:
#define get_symbol(a) ((a).a_w.w_symbol->s_name)
EXTERN char *atom_get_symbol(t_atom *a);
...
char *atom_get_symbol(t_atom *a) {
return get_symbol(*a);
}
and the C# code:
[DllImport("csharp.dll", EntryPoint="atom_get_symbol")]
[return:MarshalAs(UnmanagedType.LPStr)]
private static extern string atom_get_symbol(IntPtr a);
the pointer which is returned from c is quite deep inside the code and part of a list. so do i just miss some security setting?
EDIT: here is the exception i get:
System.AccessViolationException : (translated to english:) there was an attempt to read or write protected memory. this might be an indication that other memory is corrupt.
at Microsoft.Win32.Win32Native.CoTaskMemFree(IntPtr ptr)
at ....atom_get_symbol(IntPtr a)
SOLUTION:
the problem was, that the marshaller wanted to free the memory which was part of a C struct. but it sould just make a copy of the string and leave the memory as is:
[DllImport("csharp.dll", EntryPoint="atom_get_symbol")]
private static extern IntPtr atom_get_symbol(IntPtr a);
and then in the code get a copy of the string with:
var string = Marshal.PtrToStringAnsi(atom_get_symbol(ptrToStruct));
great!

This will always cause a crash on Vista and up, how you avoided it at all isn't very clear. The stack trace tells the tale, the pinvoke marshaller is trying to release the string buffer that was allocated for the string. It always uses CoTaskMemFree() to do so, the only reasonable guess at an allocator that might have been used to allocate the memory for the string. But that rarely works out well, C or C++ code almost always uses the CRT's private heap. This doesn't crash on XP, it has a much more forgiving memory manager. Which produces undiagnosable memory leaks.
Notable is that the C declaration doesn't give much promise that you can pinvoke the function, it doesn't return a const char*. The only hope you have is to declare the return type as IntPtr instead of string so the pinvoke marshaller doesn't try to release the pointed-to memory. You'll need to use Marshal.PtrToStringAnsi() to convert the returned IntPtr to a string.
You'll need to test the heck out of it, call the function a billion times to ensure that you don't leak memory. If that test crashes with an OutOfMemoryException then you have a big problem. The only alternative then is to write a wrapper in the C++/CLI language and make sure that it uses the exact same version of the CRT as the native code so that they both use the same heap. Which is tricky and impossible if you don't have the source code. This function is just plain difficult to call from any language, including C. It should have been declared as int atom_get_symbol(t_atom* a, char* buf, size_t buflen) so it can be called with a buffer that's allocated by the client code.

Can IntPtr be cast into a byte array without doing a Marshal.Copy?

I want to get data from an IntPtr pointer into a byte array. I can use the following code to do it:
IntPtr intPtr = GetBuff();
byte[] b = new byte[length];
Marshal.Copy(intPtr, b, 0, length);
But the above code forces a copy operation from IntPtr into the byte array. It is not a good solution when the data in question is large.
Is there any way to cast an IntPtr to a byte array? For example, would the following work:
byte[] b = (byte[])intPtr
This would eliminate the need for the copy operation.
Also: how can we determine the length of data pointed to by IntPtr?

As others have mentioned, there is no way you can store the data in a managed byte[] without copying (with the current structure you've provided*). However, if you don't actually need it to be in a managed buffer, you can use unsafe operations to work directly with the unmanaged memory. It really depends what you need to do with it.
All byte[] and other reference types are managed by the CLR Garbage Collector, and this is what is responsible for allocation of memory and deallocation when it is no longer used. The memory pointed to by the return of GetBuffer is a block of unmanaged memory allocated by the C++ code and (memory layout / implementation details aside) is essentially completely separate to your GC managed memory. Therefore, if you want to use a GC managed CLR type (byte[]) to contain all the data currently held within your unmanaged memory pointed to by your IntPtr, it needs to be moved (copied) into memory that the GC knows about. This can be done by Marshal.Copy or by a custom method using unsafe code or pinvoke or what have you.
However, it depends what you want to do with it. You've mentioned it's video data. If you want to apply some transform or filter to the data, you can probably do it directly on the unmanaged buffer. If you want to save the buffer to disk, you can probably do it directly on the unmanaged buffer.
On the topic of length, there is no way to know the length of an unmanaged memory buffer unless the function that allocated the buffer also tells you what the length is. This can be done in lots of ways, as commenters have mentioned (first field of the structure, out paramtere on the method).
*Finally, if you have control of the C++ code it might be possible to modify it so that it is not responsible for allocating the buffer it writes the data to, and instead is provided with a pointer to a preallocated buffer. You could then create a managed byte[] in C#, preallocated to the size required by your C++ code, and use the GCHandle type to pin it and provide the pointer to your C++ code.

Try this:
byte* b = (byte*)intPtr;
Requires unsafe (in the function signature, block, or compiler flag /unsafe).

You can't have a managed array occupy unmanaged memory. You can either copy the unmanaged data one chunk at a time, and process each chunk, or create an UnmanagedArray class that takes an IntPtr and provides an indexer which will still use Marshal.Copy for accessing the data.
As #Vinod has pointed out, you can do this with unsafe code. This will allow you to access the memory directly, using C-like pointers. However, you will need to marshal the data into managed memory before you call any unsafe .NET method, so you're pretty much limited to your own C-like code. I don't think you should bother with this at all, just write the code in C++.

Check out this Code Project page for a solution to working with unmanaged arrays.

Converting C# void* to byte[]

In C#, I need to write T[] to a stream, ideally without any additional buffers. I have a dynamic code that converts T[] (where T is a no-objects struct) to a void* and fixes it in memory, and that works great. When the stream was a file, I could use native Windows API to pass the void * directly, but now I need to write to a generic Stream object that takes byte[].
Question: Can anyone suggest a hack way to create a dummy array object which does not actually have any heap allocations, but rather points to an already existing (and fixed) heap location?
This is the pseudo-code that I need:
void Write(Stream stream, T[] buffer)
{
fixed( void* ptr = &buffer ) // done with dynamic code generation
{
int typeSize = sizeof(T); // done as well
byte[] dummy = (byte[]) ptr; // <-- how do I create this fake array?
stream.Write( dummy, 0, buffer.Length*typeSize );
}
}
Update:
I described how to do fixed(void* ptr=&buffer) in depth in this article. I could always create a byte[], fix it in memory and do an unsafe byte-copying from one pointer to another, and than send that array to the stream, but i was hoping to avoid unneeded extra allocation and copying.
Impossible?
Upon further thinking, the byte[] has some meta data in heap with the array dimensions and the element type. Simply passing a reference (pointer) to T[] as byte[] might not work because the meta data of the block would still be that of T[]. And even if the structure of the meta data is identical, the length of the T[] will be much less than the byte[], hence any subsequent access to byte[] by managed code will generate incorrect results.
Feature requested # Microsoft Connect
Please vote for this request, hopefully MS will listen.

This kind of code can never work in a generic way. It relies on a hard assumption that the memory layout for T is predictable and consistent. That is only true if T is a simple value type. Ignoring endianness for a moment. You are dead in the water if T is a reference type, you'll be copying tracking handles that can never be deserialized, you'll have to give T the struct constraint.
But that's not enough, structure types are not copyable either. Not even if they have no reference type fields, something you can't constrain. The internal layout is determined by the JIT compiler. It swaps fields at its leisure, selecting one where the fields are properly aligned and the structure value take the minimum storage size. The value you'll serialize can only be read properly by a program that runs with the exact same CPU architecture and JIT compiler version.
There are already plenty of classes in the framework that do what you are doing. The closest match is the .NET 4.0 MemoryMappedViewAccessor class. It needs to do the same job, making raw bytes available in the memory mapped file. The workhorse there is the System.Runtime.InteropServices.SafeBuffer class, have a look-see with Reflector. Unfortunately, you can't just copy the class, it relies on the CLR to make the transformation. Then again, it is only another week before it's available.

Because stream.Write cannot take a pointer, you cannot avoid copying memory, so you will have some slowdown. You might want to consider using a BinaryReader and BinaryWriter to serialize your objects, but here is code that will let you do what you want. Keep in mind that all members of T must also be structs.
unsafe static void Write<T>(Stream stream, T[] buffer) where T : struct
{
System.Runtime.InteropServices.GCHandle handle = System.Runtime.InteropServices.GCHandle.Alloc(buffer, System.Runtime.InteropServices.GCHandleType.Pinned);
IntPtr address = handle.AddrOfPinnedObject();
int byteCount = System.Runtime.InteropServices.Marshal.SizeOf(typeof(T)) * buffer.Length;
byte* ptr = (byte*)address.ToPointer();
byte* endPtr = ptr + byteCount;
while (ptr != endPtr)
{
stream.WriteByte(*ptr++);
}
handle.Free();
}

Check out my answer to a related question:
What is the fastest way to convert a float[] to a byte[]?
In it I temporarily transform an array of floats to an array of bytes without memory allocation and copying.
To do this I changed the CLR's metadata using memory manipulation.
Unfortunately, this solution does not lend itself well to generics. However, you can combine this hack with code generation techniques to solve your problem.

Look this article Inline MSIL in C#/VB.NET and Generic Pointers the best way to get dream code :)

Access memory address in c#

I am interfacing with an ActiveX component that gives me a memory address and the number of bytes.
How can I write a C# program that will access the bytes starting at a given memory address? Is there a way to do it natively, or am I going to have to interface to C++? Does the ActiveX component and my program share the same memory/address space?

You can use Marshal.Copy to copy the data from native memory into an managed array. This way you can then use the data in managed code without using unsafe code.

I highly suggest you use an IntPtr and Marshal.Copy. Here is some code to get you started.
memAddr is the memory address you are given, and bufSize is the size.
IntPtr bufPtr = new IntPtr(memAddr);
byte[] data = new byte[bufSize];
Marshal.Copy(bufPtr, data, 0, bufSize);
This doesn't require you to use unsafe code which requires the the /unsafe compiler option and is not verifiable by the CLR.
If you need an array of something other than bytes, just change the second line. Marshal.Copy has a bunch of overloads.

I think you are looking for the IntPtr type. This type (when used within an unsafe block) will allow you to use the memory handle from the ActiveX component.

C# can use pointers. Just use the 'unsafe' keyword in front of your class, block, method, or member variable (not local variables). If a class is marked unsafe all member variables are unsafe as well.
unsafe class Foo
{
}
unsafe int FooMethod
{
}
Then you can use pointers with * and & just like C.
I don't know about ActiveX Components in the same address space.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.