In another topic, a nice guy told me by quote Eric Lippert's words:The significance of static has to do with the knowledge and certainties the compiler has at compile time of a certain class/struct/field what have you. It has nothing to do with memory locations and them being fixed or not, etc.
But I'm still not so sure, because the compiler allows something shown below happen.
struct MyStruct
{
public static int[] Arr = {1,3,5};
}
static void Test<T>(ref T t) where T:struct
{
Console.WriteLine (t);
}
void Main()
{
Test(ref MyStruct.Arr[2]);//output: as expected 5
}
Are the ref arguments totally different things compare to the c++ references, or a behind the scene pin happens every time some args are passed by ref? If the static members are movable, how does the runtime guarantee the address of an array element won't change during the execution of the called function? I've learned from an experiment that the return values of objects' Item prop other than arrays' are not allowed to be passed byRef. I thought that's because array elements are allocated in a chunk of continuous memory, but if the whole array is movable, how can one take an address of its elements?
I'm kinda stuck by this uncertainty. I'd very much appreciate it if someone could give some certain answer. Thanks in advance!
~~~~~~~~~~~~~~~~~~
Trying to understand it:
So, any managed operation as long as the compiler allows it happen, We shouldn't sweat it, right? I have some C/C++ background, I think I understand the meaning of "static" pretty well for c++, only the movability thing of managed code make me dubious. Any managed object, no matter it's on a stack or managed heap, A ref arg can always point to it correctly, right?
C# ref arguments aren't totally different from C++ references, but they are different in this respect.
C# ref arguments are known to the garbage collector and it will adjust them if it promotes objects to a different generation.
C++ references are invisible to the .NET garbage collector, they will break if the target is not pinned and the garbage collector runs.
(C++/CLI supports both .NET references and native references)
If the static members are movable, how does the runtime guarantee the address of a array element won't change during the execution of the called function?
It doesn't. But the function will use the updated address, because it's .NET code also.
And none of this changes depending on whether there's a static field referring to the array. (In fact the array itself isn't static, only the field referring to it. This fact makes the whole question nonsensical.)
Related
Structs are value types and thus are fully copied every time there is a manipulation on the struct. Since they are value types, structs are allocated in the stack and not in the heap.
I can see how structs can degrade the performance of methods when structs are passed as parameters, since they will be always copied in the stack, specially if they are big with lots of inner fields.
But I am curious about how C# deals with the return of structs.
In C the return is made by registers, or by reference using the heap if the value to be returned is too big for the registers. And practically all C# struct tutorials say structs lives in the stack, never in the heap.
So in the following code:
MyStruct ms = GetMyValue();
Where GetMyValue() is
MyStruct GetMyValue();
How will C# deal with the return of the struct for the ms variable? Specially if it's is too big for the registers? Will it in fact copy it to the heap and then copy it back again to the caller of the method and assign it to ms?
EDIT:
To address the comments left in the post:
I have read a few tutorial on C# structs before posting this, this tutorial in particular uses the word stack more times than I bother to count. And this MSDN tutorial also speaks about the stack, although it's from 2003, I don't think structs changed since then.
I am aware this might not be realted at all with C# but in fact be a matter of the JIT compiler it self or the CLR or something else I am not aware of. That's the purpose of my question, to learn more about the inner workings of C#, even if this is not actually related to the language itself.
There are C function call conventions, the best support for my Post is this StackOverflow post. When I first posted it in here I just said what I remembered, but since the SO answer says:
As for your specific question, it depends the ABI. Sometimes if the return value is larger than 4 bytes but not larger than 8 bytes, it can be split into EAX and EDX. But most of the time the calling function will just allocate some memory (usually on the stack) and pass a pointer to this area to the called function.
I might be wrong on this one, and I say might, because the answer says usually.
The true reason why I want to understand how structs are handled is because I have a project where I have to read a Serial Port multiple times to poll for data, this data will be returned by a method.
Since the data is just some bytes I thought I could get some performance out of structs instead of using a class to abstract the bytes incoming by the Serial Port, but if the return would pass the struct as a heap allocation my expectations on performance increase could be false.
Yes, I can make a simple test and compare performance, I know, but I wanted to actually learn how it's done behind the curtains, and not only memorize the outcome of my simulation. I like to know how the things that I work with actually work, and not only learn how to use them.
Value types are not only located on a stack. They also live in fields and in arrays. The key distinction to reference types is that value types are copied by value and have no identity. The stack vs. heap idea is false.
In C the return is made by registers, or by reference using the heap if the value to be returned is too big for the registers
The heap is not involved. The caller allocates spaces for the return value to be placed in. It passes a pointer to that space. The callee can fill that space. The .NET CLR does this as well. Of course this is an implementation detail.
but I wanted to actually learn
This is very good. You could not have tested what I just told you. You need to be a little more critical in what you believe what others say. Either you had bad tutorials or you read them in an imprecise way.
I can see how structs can degrade the performance of methods when structs are passed as parameters, since they will be always copied in the stack
This is not always the case I think. I'm not quite sure but I think the JIT can sometimes pass structs in registers. The .NET JITs really do not optimize much but I think this is an optimization that works to a certain degree. Probably driven by the existence of some one-field structs such as DateTime.
structs do not always live on the stack. if you allocate a struct inside of a function, it lives its life on the stack. if it's a field of a reference type(class/array(implicitly derived from System.Array/Object), it lives its life on the heap. as far as how theyre returned, that might be up to the ABI for that CPU architecture.
from the sounds of it, you've never dealt with IL/assembly/code generation, so lets build a dynamic method thats equivalent to MyStruct ms = GetMyValue()/what the compiler would generate in context of the word stack. "things" are never actually returned. thing(s, in a tuple sense i'm sure), are pushed onto the stack, and then a return instruction is emitted. leaving the return value(s) for the caller. we're going to assume GetMyValue() allocates a new MyStruct and assigns it to a local variable. the generated code would look something like this(i extend the ILGenerator class):
ILGenerator generator = dynamicMethod.GetILGenerator();
generator
.DeclareLocal(typeof(MyStruct))
.EmitCall(OpCodes.Call, typeof(EncapsulatingClass).GetMethod("GetMyValue"))
.Emit(OpCodes.Stloc, 0);
what happens here is(some of this is my assumption on how the CLI runtime works):
the calling function reserves a slot of typeof(MyStruct) at the current local list index.
GetMyValue() is called, reserves a MyStruct local the same way the method we are building does, emits an OpCodes.Newobj, which allocates and adjusts ESP(extended stack pointer) downward in the amount of sizeof(MyStruct), emits OpCodes.Stloc to store ESP minus sizeof(MyStruct) into the reserved local index, does some stuff with its fields, calls Emit(OpCodes.Ldloc, 0) to push the address the local points to onto the evaluation stack for the calling function, and emits an OpCodes.Ret to return.
the calling function emits an OpCodes.Stloc to store(copy) the contents of the MyStruct the top of the evaluation stack points to(how this happens, well i'm sure the answer is it depends, unfortunately), at local index 0.
i'm not an expert on how the CLI runtime is constructed by any means, so a lot of this is an assumption of what happens. take it with a grain of salt, and i'm by no means a CPU engineering expert. how the instruction stream segment of OpCodes.Ldloc, OpCodes.Ret, OpCodes.Stloc -- ms = GetMyValue() -- is treated, is probably up to how the JITer translates the IL into actual cpu specific machine instructions. such as X86. what determines if a struct will be returned into a register, is probably limited to one register only, so whatever the biggest register is, and if whatever struct will fit inside of it. i know CPU's can combine registers for memory offsets, but i'm not sure if that applies to returning structs inside of multiple registers. another thing to keep in mind, GetMyValue() went out of scope, which means the struct GetMyValue() allocated, in a scope sense, doesn't exist anymore, but in a stack sense(where it was allocated), it does, so the JITer could very well have just taken the address OpCodes.Ldloc pushed onto the stack, and placed it directly into the callers local index 0. since nothing can possibly copy it anymore due to the function returning. making the caller the new owner of the struct. avoiding any copying and registers altogether in this special case. this might be where calling conventions come into play as well. the problem is, if you allocated three structs in GetMyValue() for whatever reason, returning any struct after the first struct allocated would break that optimization, which is where the next optimization, return the struct inside a register(if it fits), comes into play. leaving the worst case scenario, copying its contents purely onto the stack again for the caller. i could be wrong, and anyone is more than welcome to chime in and correct me. a good place to start, would be github and see how the runtime handles OpCodes.Ldloc/Stloc for structs. i would imagine that's a good spot to look when it comes to getting the answers you need.
EDIT: any tutorial you've read that says structs are always allocated on the stack, have them all DDoS'd.
I recently came across the link below which I have found quite interesting.
http://en.wikipedia.org/wiki/XOR_linked_list
General-purpose debugging tools
cannot follow the XOR chain, making
debugging more difficult; [1]
The price for the decrease in memory
usage is an increase in code
complexity, making maintenance more
expensive;
Most garbage collection schemes do
not work with data structures that do
not contain literal pointers;
XOR of pointers is not defined in
some contexts (e.g., the C language),
although many languages provide some
kind of type conversion between
pointers and integers;
The pointers will be unreadable if
one isn't traversing the list — for
example, if the pointer to a list
item was contained in another data
structure;
While traversing the list you need to
remember the address of the
previously accessed node in order to
calculate the next node's address.
Now I am wondering if that is exclusive to low level languages or if that is also possible within C#?
Are there any similar options to produce the same results with C#?
TL;DR I quickly wrote a proof-of-concept XorLinkedList implementation in C#.
This is absolutely possible using unsafe code in C#. There are a few restrictions, though:
XorLinkedList must be "unmanaged structs", i.e., they cannot contain managed references
Due to a limitation in C# generics, the linked list cannot be generic (not even with where T : struct)
The latter seems to be because you cannot restrict the generic parameter to unmanaged structs. With just where T : struct you'd also allow structs that contain managed references.
This means that your XorLinkedList can only hold primitive values like ints, pointers or other unmanaged structs.
Low-level programming in C#
private static Node* _ptrXor(Node* a, Node* b)
{
return (Node*)((ulong)a ^ (ulong)b);//very fragile
}
Very fragile, I know. C# pointers and IntPtr do not support the XOR-operator (probably a good idea).
private static Node* _allocate(Node* link, int value = 0)
{
var node = (Node*) Marshal.AllocHGlobal(sizeof (Node));
node->xorLink = link;
node->value = value;
return node;
}
Don't forget to Marshal.FreeHGlobal those nodes afterwards (Implement the full IDisposable pattern and be sure to place the free calls outside the if(disposing) block.
private static Node* _insertMiddle(Node* first, Node* second, int value)
{
var node = _allocate(_ptrXor(first, second), value);
var prev = _prev(first, second);
first->xorLink = _ptrXor(prev, node);
var next = _next(first, second);
second->xorLink = _ptrXor(node, next);
return node;
}
Conclusion
Personally, I would never use an XorLinkedList in C# (maybe in C when I'm writing really low level system stuff like memory allocators or kernel data structures. In any other setting the small gain in storage efficiency is really not worth the pain. The fact that you can't use it together with managed objects in C# renders it pretty much useless for everyday programming.
Also storage is almost free today, even main memory and if you're using C# you likely don't care about storage much. I've read somewhere that CLR object headers were around ~40 bytes, so this one pointer will be the least of your concerns ;)
C# doesn't generally let you manipulate references at that level, so no, unfortunately.
As an alternative to the unsafe solutions that have been proposed.
If you backed your linked list with an array or list collection where instead of a memory pointer 'next' and 'previous' indicate indexes into the array you could implement this xor without resorting to using unsafe features.
There are ways to work with pointers in C#, but you can have a pointer to an object only temporarily, so you can't use them in this scenario. The main reason for this is garbage collection – as long as you can do things like XOR pointers and unXOR them later, the GC has no way of knowing whether it's safe to collect certain object or not.
You could make something very similar by emulating pointers using indexes in one big array, but you would have to implement a simple form of memory management yourself (i.e. when creating new node, where in the array should I put it?).
Another option would be to go with C++/CLI which allows you both the full flexibility of pointers on one hand and GC and access to the framework when you need it on the other.
Sure. You would just need to code the class. the XOR operator in c# is ^
That should be all you need to start the coding.
Note this will require the code to be declared "unsafe." See here: for how to use pointers in c#.
Making a broad generalization here: C# appears to have gone the path of readability and clean interfaces and not the path of bit fiddling and packing all the information as dense as possible.
So, unless you have a specific need here, you should use the List you are provided. Future maintenance programmers will thank you for it.
It is possible however you have to understand how C# looks at objects. An instance variable does not actually contain an object but a pointer to the object in memory.
DateTime dt = DateTime.Now;
dt is a pointer to a struct in memory containing the DateTime scheme.
So you could do this type of linked list although I am not sure why you would as the framework typically has already implemented the most efficient collections. As a thought expirament it is possible.
If one could put an array of pointers to child structs inside unsafe structs in C# like one could in C, constructing complex data structures without the overhead of having one object per node would be a lot easier and less of a time sink, as well as syntactically cleaner and much more readable.
Is there a deep architectural reason why fixed arrays inside unsafe structs are only allowed to be composed of "value types" and not pointers?
I assume only having explicitly named pointers inside structs must be a deliberate decision to weaken the language, but I can't find any documentation about why this is so, or the reasoning for not allowing pointer arrays inside structs, since I would assume the garbage collector shouldn't care what is going on in structs marked as unsafe.
Digital Mars' D handles structs and pointers elegantly in comparison, and I'm missing not being able to rapidly develop succinct data structures; by making references abstract in C# a lot of power seems to have been removed from the language, even though pointers are still there at least in a marketing sense.
Maybe I'm wrong to expect languages to become more powerful at representing complex data structures efficiently over time.
One very simple reason: dotNET has a compacting garbage collector. It moves things around. So even if you could create arrays like that, you would have to pin every allocated block and you would see the system slow down to a crawl.
But you are trying to optimize based on an assumption. Allocation and cleanup of objects in dotNET is highly optimized. So write a working program first and then use a profiler to find your bottlenecks. It will most likely not be the allocation of your objects.
Edit, to answer the latter part:
Maybe I'm wrong to expect languages to
become more powerful at representing
complex data structures efficiently
over time.
I think C# (or any managed language) is much more powerful at representing
complex data structures (efficiently). By changing from low level pointers to garbage collected references.
I'm just guessing, but it might have to do with different pointer sizes for different target platforms. It seems that the C# compiler is using the size of the elements directly for index calculations (i.e. there is no CLR support for calculating fixed sized buffers indices...)
Anyway you can use an array of ulongs and cast the pointers to it:
unsafe struct s1
{
public int a;
public int b;
}
unsafe struct s
{
public fixed ulong otherStruct[100];
}
unsafe void f() {
var S = new s();
var S1 = new s1();
S.otherStruct[4] = (ulong)&S1;
var S2 = (s1*)S.otherStruct[4];
}
Putting a fixed array of pointers in a struct would quickly make it a bad candidate for a struct. The recommended size limit for a struct is 16 bytes, so on a x64 system you would be able to fit only two pointers in the array, which is pretty pointless.
You should use classes for complex data structures, if you use structures they become very limited in their usage. You wouldn't for example be able to create a data structure in a method and return it, as it would then contain pointers to structs that no longer exists as they were allocated in the stack frame of the method.
I come from a managed world and c++ automatic memory management is quite unclear to me
If I understand correctly, I encapsulate a pointer within a stack object and when auto_ptr becomes out of scope, it automatically calls delete on the pointed object?
What kind of usage should I make of it and how should I naturally avoid inherent c++ problems?
auto_ptr is the simplest implementation of RAII in C++. Your understanding is correct, whenever its destructor is called, the underlying pointer gets deleted.
This is a one step up from C where you don't have destructors and any meaningful RAII is impossible.
A next step up towards automagic memory management is shared_ptr. It uses reference counting to keep track of whether or not the object is alive. This allows the programmer to create the objects a bit more freely, but still not as powerful as the garbage collection in Java and C#. One example where this method fails is circular references. If A has a ref counted pointer to B and B has a ref counted pointer to A, they will never get destructed, even though no other object is using either.
Modern object orianted languages use some sort of variation of mark and sweep. This technique allows managing circular references and is reliable enough for most programming tasks.
Yes, std::auto_ptr calls delete on its content when it goes out of scope. You use auto_ptr only if no shared ownership takes place.
auto_ptr isn't particularly flexible, you can't use it with objects created with new[] or anything else.
Shared ownership is usually approached with shared pointers, which e.g. boost has implementations of. The most common usage, implemented e.g. in Boosts shared_ptr, employs a reference counting scheme and cleans up the pointee when the last smart pointer goes out of scope.
shared_ptr has one big advantage - it lets you specify custom deleters. With that you can basically put every kind of resource in it and just have to specify what deleter it should use.
Here's how you use a smart pointer. For the sake of example, I'll be using a shared_ptr.
{
shared_ptr<Foo> foo(new Foo);
// do things with foo
}
// foo's value is released here
Pretty much all smart pointers aim to achieve something similar to the above, in that the object being held in the smart pointer gets released at the end of the smart pointer's scope. However, there are three types of smart pointers that are widely used, and they have very different semantics on how ownership is handled:
shared_ptr uses "shared ownership": the shared_ptr can be held by more than one scope/object, and they all own a reference to the object. When the last reference falls off, the object is deleted. This is done using reference counting.
auto_ptr uses "transferable ownership": the auto_ptr's value can be held only in one place, and each time the auto_ptr is assigned, the assignee receives ownership of the object, and the assigner loses its reference to the object. If an auto_ptr's scope is exited without the object being transferred to another auto_ptr, the object is deleted. Since there is only one owner of the object at a time, no reference counting is needed.
unique_ptr/scoped_ptr uses "nontransferable ownership": the object is held only at the place it's created, and cannot be transferred elsewhere. When the program leaves the scope where the unique_ptr is created, the object is deleted, no questions asked.
It's a lot to take in, I'll grant, but I hope it'll all sink in soon. Hope it helps!
You should use boost::shared_ptr instead of std::auto_ptr.
auto_ptr and shared_ptr simply keep an instance of the pointer and because they are local stack objects they get deallocated when they go out of scope. Once they are deallocated they call delete on internal pointer.
Simple example, the actuall shared_ptr and auto_ptr are more sophisticated (they have methods for assignment and conversion/access to internal pointer):
template <typename T>
struct myshrdptr
{
T * t;
myshrdptr(T * p) : t(p) {}
~myshrdptr()
{
cout << "myshrdptr deallocated" << endl;
delete t;
}
T * operator->() { return t; }
};
struct AB
{
void dump() { cout << "AB" << endl; }
};
void testShrdptr()
{
myshrdptr<AB> ab(new AB());
ab->dump();
// ab out of scope destructor called
// which calls delete on the internal pointer
// which deletes the AB object
}
From somewhere else:
int main()
{
testShrdptr();
cout << "done ..." << endl;
}
output something like (you can see that the destructor is called):
AB
myshrdptr deallocated
done ...
Rather than trying to understand auto_ptr and its relation to garbage-collected references, you should really try to see the underlying pattern:
In C++, all local objects have their destructors called when they go out of scope. This can be harnessed to clean up memory. For example, we could write a class which, in its constructor, is given a pointer to heap-allocated memory, and in its destructor, frees this pointer.
That is pretty much what auto_ptr does. (Unfortunately, auto_ptr also has some notoriously quirky semantics for assignment and copying)
It's also what boost::shared_ptr or other smart pointers do. There's no magic to any of those. They are simply classes that are given a pointer in their constructor, and, as they're typically allocated on the stack themselves, they'll automatically go out of scope at some point, and so their destructor is called, which can delete the pointer you originally passed to the constructor. You can write such classes yourself. Again, no magic, just a straightforward application of C++'s lifetime rules: When a local object goes out of scope, its destructor is called.
Many other classes cut out the middleman and simply let the same class do both allocation and deallocation. For example, std::vector calls new as necessary to create its internal array -- and in its destructor, it calls delete to release it.
When the vector is copied, it takes care to allocate a new array, and copy the contents from the original one, so that each object ends up with its own private array.
auto_ptr, or smart pointers in general, aren't the holy grail. They don't "solve" the problem of memory management. They are one useful part of the recipe, but to avoid memory management bugs and headaches, you need to understand the underlying pattern (commonly known as RAII) -- that is, whenever you have a resource allocation, it should be tied to a local variable which is given responsibility for also cleaning it up.
Sometimes, this means calling new yourself to allocate memory, and then passing the result to an auto_ptr, but more often, it means not calling new in the first place -- simply create the object you need on the stack, and let it call new as required internally. Or perhaps, it doesn't even need to call new internally. The trick to memory management is really to just rely on local stack-allocated objects instead of heap allocations. Don't use new by default.
Choose an imperative language (such as C, C++, or ADA) that provides pointer types.
Redesign that language to abolish pointer types, instead allowing programmers to define recursive types directly.
Consider carefully the issue of copy semantics vs reference semantics. Implement an interpreter for the language using DrRacket .
In my C# application, I have a large struct (176 bytes) that is passed potentially a hundred thousand times per second to a function. This function then simply takes a pointer to the struct and passes the pointer to unmanaged code. Neither the function nor the unmanaged code will make any modifications to the struct.
My question is, should I pass the struct to the function by value or by reference? In this particular case, my guess is that passing by reference would be much faster than pushing 176 bytes onto the call stack, unless the JIT happens to recognize that the struct is never modified (my guess is it can't recognize this since the struct's address is passed to unmanaged code) and optimizes the code.
Since we're at it, let's also answer the more general case where the function does not pass the struct's pointer to unmanaged code, but instead performs some read-only operation on the contents of the struct. Would it be faster to pass the struct by reference? Would in this case the JIT recognize that the struct is never modified and thus optimize? Presumably it is not more efficient to pass a 1-byte struct by reference, but at what struct size does it become better to pass a struct by reference, if ever?
Thanks.
EDIT:
As pointed out below, it's also possible to create an "equivalent" class for regular use, and then use a struct when passing to unmanaged code. I see two options here:
1) Create a "wrapper" class that simply contains the struct, and then pin and pass a pointer to the struct to the unmanaged code when necessary. A potential issue I see is that pinning has its own performance consequences.
2) Create an equivalent class whose fields are copied to the struct when the struct is needed. But copying would take a lot of time and seems to me to defeat the point of passing by reference in the first place.
EDIT:
As mentioned a couple times below, I could certainly just measure the performance of each of these methods. I will do this and post the results. However, I am still interested in seeing people's answers and reasonings from an intellectual perspective.
I did some very informal profiling, and the results indicate that, for my particular application, there is a modest performance gain for passing by reference. For by-value I got about 10,050,000 calls per second, whereas for by-reference I got about 11,200,000 calls per second.
Your mileage may vary.
Before you ask whether or not you should pass the struct by reference, you should ask yourself why you've got such an enormous struct in the first place. Does it really need to be a struct? If you need to use a struct at some point for P/Invoke, would it be worth having a struct just for that, and then the equivalent class elsewhere?
A struct that big is very, very unusual...
See the Design Guidelines for Developing Class Libraries section on Choosing Between Classes and Structures for more guidance on this.
The only way you can get an answer to this question is to code up both and measure the performance.
You mention unmanaged/managed interop. My experience is that it takes a surprisingly long time to to the interop. You could try changing your code from:
void ManagedMethod(MyStruct[] items) {
foreach (var item in items) {
unmanagedHandle.ProcessOne(item);
}
}
To:
void ManagedMethod(MyStruct[] items) {
unmanagedHandle.ProcessMany(items, items.Count);
}
This technique helped me in a similar case, but only measurements will tell you if it works for your case.
Why not just use a class instead, and pass your class to your P/Invoke function?
Using class will pass nicely around in your managed code, and will work the same as passing a struct by reference to a P/Invoke function.
e.g.
// What you have
public struct X
{
public int data;
}
[DllImport("mylib.dll")]
static extern void Foo( ref X arg);
// What you could do
[StructLayout(LayoutKind.Sequential)]
public class Y
{
public int data;
}
[DllImport("mylib.dll")]
static extern void Bar( Y arg );