I have a structure that represents a wire format packet. In this structure is an array of other structures. I have generic code that handles this very nicely for most cases but this array of structures case is throwing the marshaller for a loop.
Unsafe code is a no go since I can't get a pointer to a struct with an array (argh!).
I can see from this codeproject article that there is a very nice, generic approach involving C++/CLI that goes something like...
public ref class Reader abstract sealed
{
public:
generic <typename T> where T : value class
static T Read(array<System::Byte>^ data)
{
T value;
pin_ptr<System::Byte> src = &data[0];
pin_ptr<T> dst = &value;
memcpy((void*)dst, (void*)src,
/*System::Runtime::InteropServices::Marshal::SizeOf(T::typeid)*/
sizeof(T));
return value;
}
};
Now if just had the structure -> byte array / writer version I'd be set! Thanks in advance!
Using memcpy to copy an array of bytes to a structure is extremely dangerous if you are not controlling the byte packing of the structure. It is safer to marshall and unmarshall a structure one field at a time. Of course you will lose the generic feature of the sample code you have given.
To answer your real question though (and consider this pseudo code):
public ref class Writer abstract sealed
{
public:
generic <typename T> where T : value class
static System::Byte[] Write(T value)
{
System::Byte buffer[] = new System::Byte[sizeof(T)]; // this syntax is probably wrong.
pin_ptr<System::Byte> dst = &buffer[0];
pin_ptr<T> src = &value;
memcpy((void*)dst, (void*)src,
/*System::Runtime::InteropServices::Marshal::SizeOf(T::typeid)*/
sizeof(T));
return buffer;
}
};
This is probably not the right way to go. CLR is allowed to add padding, reorder the items and alter the way it's stored in memory.
If you want to do this, be sure to add [System.Runtime.InteropServices.StructLayout] attribute to force a specific memory layout for the structure. In general, I suggest you not to mess with memory layout of .NET types.
Unsafe code can be made to do this, actually. See my post on reading structs from disk: Reading arrays from files in C# without extra copy.
Not altering the structure is certainly sound advice. I use liberal amounts of StructLayout attributes to specify the packing, layout and character encoding. Everything flows just fine.
My issue is just that I need a performant and preferably generic solution. Performance because this is a server application and generic for elegance. If you look at the codeproject link you'll see that the StructureToPtr and PtrToStructure methods perform on the order of 20 times slower than a simple unsafe pointer cast. This is one of those areas where unsafe code is full of win. C# will only let you have pointers to primitives (and it's not generic - can't get a pointer to a generic), so that's why CLI.
Thanks for the psuedocode grieve, I'll see if it gets the job done and report back.
Am I missing something? Why not create a new array of the same size and initialise each element seperately in a loop?
Using an array of byte data is quite dangerous unless you are targetting one platform only... for example your method doesn't consider differing endianness between the source and destination arrays.
Something I don't really understand about your question as well is why having an array as a member in your class is causing a problem. If the class comes from a .NET language you should have no issues, otherwise, you should be able to take the pointer in unsafe code and initialise a new array by going through the elements pointed at one by one (with unsafe code) and adding them to it.
Related
I'm writing a B+-tree implementation in C#, and the tree implementation I chose for my application has a very specific structure which is cache-conscious. To achieve these properties, it has strict layout policies on tree nodes.
What I want is simply expressed using C#'s fixed keyword for fixed-sized buffers:
public abstract class Tree<K, T> { }
sealed class Node<K, T> : Tree<K, T>
{
Node<K, T> right;
fixed Tree<K, T> nodes[127]; // inline array of 128 nodes
}
Unfortunately, fixed-sized buffers can only be used with primitive value types, like int and float. Just using plain arrays would add pointer indirections which destroy the cache-friendly properties of this tree type.
I also can't generate 128 fields and use pointer arithmetic to extract the field I want because there are no conversions between pointer types and managed objects.
About the only thing left is generating 128 fields with an indexer that selects the right one based on a switch (which can't be fast), or writing it as a C library and using P/Invoke, which would also destroy the performance.
Any suggestions?
Use C++/CLI. That gives you total control over layout, just like C, but the cost of managed/unmanaged transitions is much reduced from p/invoke (probably no extra cost at all).
Unfortunately managed code is not very good for "cache-conscious" work: inside the managed heap you are powerless to avoid false-sharing. C++/CLI lets you use the unmanaged allocator, so you can not only keep the data contiguous, but aligned to cache lines.
Also note: Using the class keyword creates a "reference type" which already adds that level of indirection you wanted to avoid. With some reorganization, you might be able to use a struct and an array and not have any more indirection that the code you proposed.
I recently came across the link below which I have found quite interesting.
http://en.wikipedia.org/wiki/XOR_linked_list
General-purpose debugging tools
cannot follow the XOR chain, making
debugging more difficult; [1]
The price for the decrease in memory
usage is an increase in code
complexity, making maintenance more
expensive;
Most garbage collection schemes do
not work with data structures that do
not contain literal pointers;
XOR of pointers is not defined in
some contexts (e.g., the C language),
although many languages provide some
kind of type conversion between
pointers and integers;
The pointers will be unreadable if
one isn't traversing the list — for
example, if the pointer to a list
item was contained in another data
structure;
While traversing the list you need to
remember the address of the
previously accessed node in order to
calculate the next node's address.
Now I am wondering if that is exclusive to low level languages or if that is also possible within C#?
Are there any similar options to produce the same results with C#?
TL;DR I quickly wrote a proof-of-concept XorLinkedList implementation in C#.
This is absolutely possible using unsafe code in C#. There are a few restrictions, though:
XorLinkedList must be "unmanaged structs", i.e., they cannot contain managed references
Due to a limitation in C# generics, the linked list cannot be generic (not even with where T : struct)
The latter seems to be because you cannot restrict the generic parameter to unmanaged structs. With just where T : struct you'd also allow structs that contain managed references.
This means that your XorLinkedList can only hold primitive values like ints, pointers or other unmanaged structs.
Low-level programming in C#
private static Node* _ptrXor(Node* a, Node* b)
{
return (Node*)((ulong)a ^ (ulong)b);//very fragile
}
Very fragile, I know. C# pointers and IntPtr do not support the XOR-operator (probably a good idea).
private static Node* _allocate(Node* link, int value = 0)
{
var node = (Node*) Marshal.AllocHGlobal(sizeof (Node));
node->xorLink = link;
node->value = value;
return node;
}
Don't forget to Marshal.FreeHGlobal those nodes afterwards (Implement the full IDisposable pattern and be sure to place the free calls outside the if(disposing) block.
private static Node* _insertMiddle(Node* first, Node* second, int value)
{
var node = _allocate(_ptrXor(first, second), value);
var prev = _prev(first, second);
first->xorLink = _ptrXor(prev, node);
var next = _next(first, second);
second->xorLink = _ptrXor(node, next);
return node;
}
Conclusion
Personally, I would never use an XorLinkedList in C# (maybe in C when I'm writing really low level system stuff like memory allocators or kernel data structures. In any other setting the small gain in storage efficiency is really not worth the pain. The fact that you can't use it together with managed objects in C# renders it pretty much useless for everyday programming.
Also storage is almost free today, even main memory and if you're using C# you likely don't care about storage much. I've read somewhere that CLR object headers were around ~40 bytes, so this one pointer will be the least of your concerns ;)
C# doesn't generally let you manipulate references at that level, so no, unfortunately.
As an alternative to the unsafe solutions that have been proposed.
If you backed your linked list with an array or list collection where instead of a memory pointer 'next' and 'previous' indicate indexes into the array you could implement this xor without resorting to using unsafe features.
There are ways to work with pointers in C#, but you can have a pointer to an object only temporarily, so you can't use them in this scenario. The main reason for this is garbage collection – as long as you can do things like XOR pointers and unXOR them later, the GC has no way of knowing whether it's safe to collect certain object or not.
You could make something very similar by emulating pointers using indexes in one big array, but you would have to implement a simple form of memory management yourself (i.e. when creating new node, where in the array should I put it?).
Another option would be to go with C++/CLI which allows you both the full flexibility of pointers on one hand and GC and access to the framework when you need it on the other.
Sure. You would just need to code the class. the XOR operator in c# is ^
That should be all you need to start the coding.
Note this will require the code to be declared "unsafe." See here: for how to use pointers in c#.
Making a broad generalization here: C# appears to have gone the path of readability and clean interfaces and not the path of bit fiddling and packing all the information as dense as possible.
So, unless you have a specific need here, you should use the List you are provided. Future maintenance programmers will thank you for it.
It is possible however you have to understand how C# looks at objects. An instance variable does not actually contain an object but a pointer to the object in memory.
DateTime dt = DateTime.Now;
dt is a pointer to a struct in memory containing the DateTime scheme.
So you could do this type of linked list although I am not sure why you would as the framework typically has already implemented the most efficient collections. As a thought expirament it is possible.
If one could put an array of pointers to child structs inside unsafe structs in C# like one could in C, constructing complex data structures without the overhead of having one object per node would be a lot easier and less of a time sink, as well as syntactically cleaner and much more readable.
Is there a deep architectural reason why fixed arrays inside unsafe structs are only allowed to be composed of "value types" and not pointers?
I assume only having explicitly named pointers inside structs must be a deliberate decision to weaken the language, but I can't find any documentation about why this is so, or the reasoning for not allowing pointer arrays inside structs, since I would assume the garbage collector shouldn't care what is going on in structs marked as unsafe.
Digital Mars' D handles structs and pointers elegantly in comparison, and I'm missing not being able to rapidly develop succinct data structures; by making references abstract in C# a lot of power seems to have been removed from the language, even though pointers are still there at least in a marketing sense.
Maybe I'm wrong to expect languages to become more powerful at representing complex data structures efficiently over time.
One very simple reason: dotNET has a compacting garbage collector. It moves things around. So even if you could create arrays like that, you would have to pin every allocated block and you would see the system slow down to a crawl.
But you are trying to optimize based on an assumption. Allocation and cleanup of objects in dotNET is highly optimized. So write a working program first and then use a profiler to find your bottlenecks. It will most likely not be the allocation of your objects.
Edit, to answer the latter part:
Maybe I'm wrong to expect languages to
become more powerful at representing
complex data structures efficiently
over time.
I think C# (or any managed language) is much more powerful at representing
complex data structures (efficiently). By changing from low level pointers to garbage collected references.
I'm just guessing, but it might have to do with different pointer sizes for different target platforms. It seems that the C# compiler is using the size of the elements directly for index calculations (i.e. there is no CLR support for calculating fixed sized buffers indices...)
Anyway you can use an array of ulongs and cast the pointers to it:
unsafe struct s1
{
public int a;
public int b;
}
unsafe struct s
{
public fixed ulong otherStruct[100];
}
unsafe void f() {
var S = new s();
var S1 = new s1();
S.otherStruct[4] = (ulong)&S1;
var S2 = (s1*)S.otherStruct[4];
}
Putting a fixed array of pointers in a struct would quickly make it a bad candidate for a struct. The recommended size limit for a struct is 16 bytes, so on a x64 system you would be able to fit only two pointers in the array, which is pretty pointless.
You should use classes for complex data structures, if you use structures they become very limited in their usage. You wouldn't for example be able to create a data structure in a method and return it, as it would then contain pointers to structs that no longer exists as they were allocated in the stack frame of the method.
In C#, I need to write T[] to a stream, ideally without any additional buffers. I have a dynamic code that converts T[] (where T is a no-objects struct) to a void* and fixes it in memory, and that works great. When the stream was a file, I could use native Windows API to pass the void * directly, but now I need to write to a generic Stream object that takes byte[].
Question: Can anyone suggest a hack way to create a dummy array object which does not actually have any heap allocations, but rather points to an already existing (and fixed) heap location?
This is the pseudo-code that I need:
void Write(Stream stream, T[] buffer)
{
fixed( void* ptr = &buffer ) // done with dynamic code generation
{
int typeSize = sizeof(T); // done as well
byte[] dummy = (byte[]) ptr; // <-- how do I create this fake array?
stream.Write( dummy, 0, buffer.Length*typeSize );
}
}
Update:
I described how to do fixed(void* ptr=&buffer) in depth in this article. I could always create a byte[], fix it in memory and do an unsafe byte-copying from one pointer to another, and than send that array to the stream, but i was hoping to avoid unneeded extra allocation and copying.
Impossible?
Upon further thinking, the byte[] has some meta data in heap with the array dimensions and the element type. Simply passing a reference (pointer) to T[] as byte[] might not work because the meta data of the block would still be that of T[]. And even if the structure of the meta data is identical, the length of the T[] will be much less than the byte[], hence any subsequent access to byte[] by managed code will generate incorrect results.
Feature requested # Microsoft Connect
Please vote for this request, hopefully MS will listen.
This kind of code can never work in a generic way. It relies on a hard assumption that the memory layout for T is predictable and consistent. That is only true if T is a simple value type. Ignoring endianness for a moment. You are dead in the water if T is a reference type, you'll be copying tracking handles that can never be deserialized, you'll have to give T the struct constraint.
But that's not enough, structure types are not copyable either. Not even if they have no reference type fields, something you can't constrain. The internal layout is determined by the JIT compiler. It swaps fields at its leisure, selecting one where the fields are properly aligned and the structure value take the minimum storage size. The value you'll serialize can only be read properly by a program that runs with the exact same CPU architecture and JIT compiler version.
There are already plenty of classes in the framework that do what you are doing. The closest match is the .NET 4.0 MemoryMappedViewAccessor class. It needs to do the same job, making raw bytes available in the memory mapped file. The workhorse there is the System.Runtime.InteropServices.SafeBuffer class, have a look-see with Reflector. Unfortunately, you can't just copy the class, it relies on the CLR to make the transformation. Then again, it is only another week before it's available.
Because stream.Write cannot take a pointer, you cannot avoid copying memory, so you will have some slowdown. You might want to consider using a BinaryReader and BinaryWriter to serialize your objects, but here is code that will let you do what you want. Keep in mind that all members of T must also be structs.
unsafe static void Write<T>(Stream stream, T[] buffer) where T : struct
{
System.Runtime.InteropServices.GCHandle handle = System.Runtime.InteropServices.GCHandle.Alloc(buffer, System.Runtime.InteropServices.GCHandleType.Pinned);
IntPtr address = handle.AddrOfPinnedObject();
int byteCount = System.Runtime.InteropServices.Marshal.SizeOf(typeof(T)) * buffer.Length;
byte* ptr = (byte*)address.ToPointer();
byte* endPtr = ptr + byteCount;
while (ptr != endPtr)
{
stream.WriteByte(*ptr++);
}
handle.Free();
}
Check out my answer to a related question:
What is the fastest way to convert a float[] to a byte[]?
In it I temporarily transform an array of floats to an array of bytes without memory allocation and copying.
To do this I changed the CLR's metadata using memory manipulation.
Unfortunately, this solution does not lend itself well to generics. However, you can combine this hack with code generation techniques to solve your problem.
Look this article Inline MSIL in C#/VB.NET and Generic Pointers the best way to get dream code :)
In my C# application, I have a large struct (176 bytes) that is passed potentially a hundred thousand times per second to a function. This function then simply takes a pointer to the struct and passes the pointer to unmanaged code. Neither the function nor the unmanaged code will make any modifications to the struct.
My question is, should I pass the struct to the function by value or by reference? In this particular case, my guess is that passing by reference would be much faster than pushing 176 bytes onto the call stack, unless the JIT happens to recognize that the struct is never modified (my guess is it can't recognize this since the struct's address is passed to unmanaged code) and optimizes the code.
Since we're at it, let's also answer the more general case where the function does not pass the struct's pointer to unmanaged code, but instead performs some read-only operation on the contents of the struct. Would it be faster to pass the struct by reference? Would in this case the JIT recognize that the struct is never modified and thus optimize? Presumably it is not more efficient to pass a 1-byte struct by reference, but at what struct size does it become better to pass a struct by reference, if ever?
Thanks.
EDIT:
As pointed out below, it's also possible to create an "equivalent" class for regular use, and then use a struct when passing to unmanaged code. I see two options here:
1) Create a "wrapper" class that simply contains the struct, and then pin and pass a pointer to the struct to the unmanaged code when necessary. A potential issue I see is that pinning has its own performance consequences.
2) Create an equivalent class whose fields are copied to the struct when the struct is needed. But copying would take a lot of time and seems to me to defeat the point of passing by reference in the first place.
EDIT:
As mentioned a couple times below, I could certainly just measure the performance of each of these methods. I will do this and post the results. However, I am still interested in seeing people's answers and reasonings from an intellectual perspective.
I did some very informal profiling, and the results indicate that, for my particular application, there is a modest performance gain for passing by reference. For by-value I got about 10,050,000 calls per second, whereas for by-reference I got about 11,200,000 calls per second.
Your mileage may vary.
Before you ask whether or not you should pass the struct by reference, you should ask yourself why you've got such an enormous struct in the first place. Does it really need to be a struct? If you need to use a struct at some point for P/Invoke, would it be worth having a struct just for that, and then the equivalent class elsewhere?
A struct that big is very, very unusual...
See the Design Guidelines for Developing Class Libraries section on Choosing Between Classes and Structures for more guidance on this.
The only way you can get an answer to this question is to code up both and measure the performance.
You mention unmanaged/managed interop. My experience is that it takes a surprisingly long time to to the interop. You could try changing your code from:
void ManagedMethod(MyStruct[] items) {
foreach (var item in items) {
unmanagedHandle.ProcessOne(item);
}
}
To:
void ManagedMethod(MyStruct[] items) {
unmanagedHandle.ProcessMany(items, items.Count);
}
This technique helped me in a similar case, but only measurements will tell you if it works for your case.
Why not just use a class instead, and pass your class to your P/Invoke function?
Using class will pass nicely around in your managed code, and will work the same as passing a struct by reference to a P/Invoke function.
e.g.
// What you have
public struct X
{
public int data;
}
[DllImport("mylib.dll")]
static extern void Foo( ref X arg);
// What you could do
[StructLayout(LayoutKind.Sequential)]
public class Y
{
public int data;
}
[DllImport("mylib.dll")]
static extern void Bar( Y arg );