In Delphi, the "ZeroMemory" procedure takes two parameters.
Code example:
procedure ZeroMemory(Destination: Pointer; Length: DWORD);
begin
FillChar(Destination^, Length, 0);
end;
I want to do this, or something similar, in C#... so what's the equivalent?
thanks in advance!
.NET framework objects are always initialized to a known state
.NET framework value types are automatically 'zeroed' -- which means that the framework guarantees a value type is initialized to its natural default value before it is returned to you for use. Things that are made up of value types (e.g. arrays, structs, objects) have their fields similarly initialized.
In general, in .NET all managed objects are initialized to default, and there is never a case when the contents of an object is unpredictable (because it contains data that just happens to be in that particular memory location) as in other unmanaged environments.
Answer: you don't need to do this, as .NET will automatically "zero" the object for you. However, you should know what the default value for each value type is. For example, the default of a bool is false, and the default of an int is zero.
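For example, a minimal sketch (the type and variable names are just illustrative):
struct Point { public int X; public int Y; }   // illustrative value type

int[] numbers = new int[16];       // every element is 0
bool[] flags = new bool[16];       // every element is false
Point[] points = new Point[16];    // every X and Y is 0
Console.WriteLine(default(int));   // prints 0
Console.WriteLine(default(bool));  // prints False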
Unmanaged objects
"Zero-ing" a region of memory is usually only necessary in interoping with external, non-managed libraries.
If you have a pinned pointer to a region of memory containing data that you intend to pass to an outside non-managed library (written in C, say), and you want to zero that section of memory, then your pointer most likely points to a byte array and you can use a simple for-loop to zero it.
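For example, a minimal sketch (the buffer name is just illustrative) of zeroing a managed byte array before handing it to native code:
byte[] buffer = new byte[256];
// ... buffer has been used and now contains stale data ...
for (int i = 0; i < buffer.Length; i++)
    buffer[i] = 0;                        // simple loop, the managed-side equivalent of ZeroMemory
// Array.Clear(buffer, 0, buffer.Length); would do the same thing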
Off-topic note
On the flip side, if a large object is allocated in .NET, try to reuse it instead of throwing it away and allocating a new one. That's because any new object is automatically "zeroed" by the .NET framework, and for large objects this clearing will cause a hidden performance hit.
You very rarely need unsafe code in C#. Usually only when interacting with native libraries.
The Marshal class has some low-level helper functions, but I'm not aware of any that zero out memory.
Firstly, in .NET (including C#) value types are zeroed by default - so this takes away one of the common uses of ZeroMemory.
Secondly, if you want to zero a list of type T then try a method like:
void ZeroMemory<T>(IList<T> destination)
{
for (var i = 0; i < destination.Count; i++)
{
destination[i] = default(T);
}
}
If a list isn't available... then I think I'd need to see more of the calling code.
Technically there is Array.Clear, but it only works on managed arrays. What do you want to do?
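For what it's worth, a minimal sketch of Array.Clear on a managed array:
int[] values = { 1, 2, 3, 4 };
Array.Clear(values, 0, values.Length);   // every element is now 0, i.e. default(int)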
Related
In a .NET Core 3.0 project, I have an interface that returns a Span<byte>. This works for a large set of classes except one particular implementation, which can generate its data on the fly (because it does not cache it).
The implementation looks like:
public Span<byte> Data => CompileBytes();
where CompileBytes would be something like this (abstract code, but pretty close to a real use case):
public byte[] CompileBytes()
{
using (MemoryStream stream = new MemoryStream())
{
foreach (IDataSource data in DataSources)
stream.Write(data.ByteArray);
return stream.ToArray();
}
}
I have been looking around online to see if there's a guarantee that this is safe to do but haven't found any.
My worry is that Span is a very thin layer around the data that the GC will ignore, assuming we will not let the span outlive the underlying buffer. The temporary byte array that is created will eventually get GC'd, which means I could potentially have a ticking time bomb on my hands if a GC happens while some other code is using the span. Is this the case? Can I return a Span<> for a temporary object and be perfectly okay (under the assumption it is properly used by staying within the bounds of the span)?
The definition seems to be implementation dependent so I can't figure out with my limited knowledge if it holds onto the reference or not... because if so, then I am safe and my question is answered.
MSDN says 'memory safe', but I am unsure of the exact specifics by which they define memory safe and whether it covers my definition. As such, if it does, then this question is answered.
I am not using any unsafe code.
Having a reference to a managed array in a Span<T> is safe, even if it is the only reference to it.
As described in the article All About Span: Exploring a New .NET Mainstay, Span<T> uses a special way of storing these references, a ByReference<T>, which is implemented as a JIT intrinsic.
Quoting the linked article (section How Is Span<T> Implemented?):
Span<T> is actually written to use a special internal type in the runtime that’s treated as a just-in-time (JIT) intrinsic, with the JIT generating for it the equivalent of a ref T field
And section What Is Memory<T> and Why Do You Need It?
Span<T> is a ref-like type as it contains a ref field, and ref fields can refer not only to the beginning of objects like arrays, but also to the middle of them [...] These references are called interior pointers, and tracking them is a relatively expensive operation for the .NET runtime’s garbage collector.
The last part of that quote clarifies that the reference stored in Span<T> is indeed tracked by the GC, so it will not clean up memory that is still being referenced.
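To illustrate, a minimal sketch (the names are mine, not from the question) of returning a span over an array that has no other live reference:
static Span<byte> MakeSpan()
{
    byte[] buffer = new byte[16];   // no other reference to this array is kept anywhere
    buffer[0] = 42;
    return buffer;                  // implicit byte[] -> Span<byte> conversion
}

// The span returned here keeps the underlying array alive,
// so the GC will not collect it while the span is reachable.
Span<byte> span = MakeSpan();
Console.WriteLine(span[0]);         // 42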
It's actually a general question, but it occurred to me now that I'm working with Go and C#.
Say we want to assign a variable from user's input in Go:
func main() {
var input float64
fmt.Scan(&input)
}
It's pretty clear why we need a memory location to put our new value in. But why, in languages like Java or C#, do we not follow the same logic?
var input = Convert.ToInt32(Console.ReadLine());
// and not &input ...
Java and C# are higher-level languages which abstract away most of the memory management and other particulars required in lower-level languages like C.
In this case, Console.ReadLine() allocates memory to store the console input, and the converted result is then assigned to the input variable.
Since these languages have garbage collection, allocating and deallocating memory is done automatically, so the framework doesn't require you to explicitly pass a memory address to write to, and doesn't expect you to free the memory when you are done using it.
Edit:
See #kostix comment for a great improvement to this answer.
In Go, like C/C++, pointer variables are how types can be passed by reference.
Languages like Java and C# discourage the use of pointer variables. C# has the "ref" keyword and "boxing" for passing value types by reference.
See here for more on "ref": https://msdn.microsoft.com/en-us/library/14akc2c7.aspx
See here for more on "boxing: https://msdn.microsoft.com/en-us/library/yz2be5wk.aspx
Structs are value types and thus are fully copied every time there is a manipulation on the struct. Since they are value types, structs are allocated on the stack and not on the heap.
I can see how structs can degrade the performance of methods when structs are passed as parameters, since they will always be copied on the stack, especially if they are big with lots of inner fields.
But I am curious about how C# deals with the return of structs.
In C the return is made by registers, or by reference using the heap if the value to be returned is too big for the registers. And practically all C# struct tutorials say structs lives in the stack, never in the heap.
So in the following code:
MyStruct ms = GetMyValue();
Where GetMyValue() is
MyStruct GetMyValue();
How will C# deal with the return of the struct to the ms variable? Especially if it's too big for the registers? Will it in fact copy it to the heap and then copy it back again to the caller of the method and assign it to ms?
EDIT:
To address the comments left in the post:
I have read a few tutorials on C# structs before posting this; this tutorial in particular uses the word stack more times than I bother to count, and this MSDN tutorial also speaks about the stack. Although it's from 2003, I don't think structs have changed since then.
I am aware this might not be related to C# at all, but might in fact be a matter of the JIT compiler itself, or the CLR, or something else I am not aware of. That's the purpose of my question, to learn more about the inner workings of C#, even if this is not actually related to the language itself.
There are C function calling conventions; the best support for my post is this Stack Overflow post. When I first posted this I just said what I remembered, but since the SO answer says:
As for your specific question, it depends the ABI. Sometimes if the return value is larger than 4 bytes but not larger than 8 bytes, it can be split into EAX and EDX. But most of the time the calling function will just allocate some memory (usually on the stack) and pass a pointer to this area to the called function.
I might be wrong on this one, and I say might, because the answer says usually.
The true reason why I want to understand how structs are handled is that I have a project where I have to read a serial port multiple times to poll for data, and this data will be returned by a method.
Since the data is just some bytes, I thought I could get some performance out of structs instead of using a class to abstract the bytes coming in from the serial port, but if the return passes the struct via a heap allocation, my expectation of a performance increase could be wrong.
Yes, I can make a simple test and compare performance, I know, but I wanted to actually learn how it's done behind the curtains, and not only memorize the outcome of my simulation. I like to know how the things that I work with actually work, and not only learn how to use them.
Value types are not only located on a stack. They also live in fields and in arrays. The key distinction to reference types is that value types are copied by value and have no identity. The stack vs. heap idea is false.
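A minimal sketch of that copy-by-value behavior (the type is illustrative):
struct Sample { public int X; }

Sample a = new Sample { X = 1 };
Sample b = a;                 // a full copy, not a second reference to the same instance
b.X = 42;
Console.WriteLine(a.X);       // still 1: a and b are independent values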
In C the return is made by registers, or by reference using the heap if the value to be returned is too big for the registers
The heap is not involved. The caller allocates space for the return value to be placed in and passes a pointer to that space. The callee can fill that space. The .NET CLR does this as well. Of course, this is an implementation detail.
but I wanted to actually learn
This is very good. You could not have tested what I just told you. You need to be a little more critical in what you believe what others say. Either you had bad tutorials or you read them in an imprecise way.
I can see how structs can degrade the performance of methods when structs are passed as parameters, since they will be always copied in the stack
This is not always the case I think. I'm not quite sure but I think the JIT can sometimes pass structs in registers. The .NET JITs really do not optimize much but I think this is an optimization that works to a certain degree. Probably driven by the existence of some one-field structs such as DateTime.
Structs do not always live on the stack. If you allocate a struct inside of a function, it lives its life on the stack. If it's a field of a reference type (class/array, implicitly derived from System.Array/Object), it lives its life on the heap. As far as how they're returned, that might be up to the ABI for that CPU architecture.
From the sounds of it, you've never dealt with IL/assembly/code generation, so let's build a dynamic method that's equivalent to MyStruct ms = GetMyValue() - what the compiler would generate, in the context of the word "stack". "Things" are never actually returned; things (in a tuple sense, I'm sure) are pushed onto the stack, and then a return instruction is emitted, leaving the return value(s) for the caller. We're going to assume GetMyValue() allocates a new MyStruct and assigns it to a local variable. The generated code would look something like this (written against the standard ILGenerator API):
ILGenerator generator = dynamicMethod.GetILGenerator();
LocalBuilder msLocal = generator.DeclareLocal(typeof(MyStruct));   // local 0, of type MyStruct
generator.EmitCall(OpCodes.Call, typeof(EncapsulatingClass).GetMethod("GetMyValue"), null);
generator.Emit(OpCodes.Stloc, msLocal);                            // store the returned value in local 0
What happens here is (some of this is my assumption of how the CLI runtime works):
The calling function reserves a slot of type MyStruct at the current local list index.
GetMyValue() is called; it reserves a MyStruct local the same way the method we are building does, emits an OpCodes.Newobj (which allocates and adjusts ESP, the extended stack pointer, downward by sizeof(MyStruct)), emits OpCodes.Stloc to store ESP minus sizeof(MyStruct) into the reserved local index, does some stuff with its fields, calls Emit(OpCodes.Ldloc, 0) to push the address the local points to onto the evaluation stack for the calling function, and emits an OpCodes.Ret to return.
The calling function emits an OpCodes.Stloc to store (copy) the contents of the MyStruct that the top of the evaluation stack points to (how this happens - well, I'm sure the answer is "it depends", unfortunately) at local index 0.
I'm not an expert on how the CLI runtime is constructed by any means, so a lot of this is an assumption of what happens; take it with a grain of salt, and I'm by no means a CPU engineering expert either. How the instruction stream segment of OpCodes.Ldloc, OpCodes.Ret, OpCodes.Stloc - ms = GetMyValue() - is treated is probably up to how the JITer translates the IL into actual CPU-specific machine instructions, such as x86. What determines whether a struct will be returned in a register is probably limited to one register only, so whatever the biggest register is and whether the struct will fit inside of it. I know CPUs can combine registers for memory offsets, but I'm not sure if that applies to returning structs inside of multiple registers.
Another thing to keep in mind: GetMyValue() went out of scope, which means the struct GetMyValue() allocated no longer exists in a scope sense, but in a stack sense (where it was allocated) it does. So the JITer could very well have just taken the address OpCodes.Ldloc pushed onto the stack and placed it directly into the caller's local index 0, since nothing can possibly copy it anymore once the function returns, making the caller the new owner of the struct and avoiding any copying and registers altogether in this special case. This might be where calling conventions come into play as well. The problem is, if you allocated three structs in GetMyValue() for whatever reason, returning any struct after the first struct allocated would break that optimization, which is where the next optimization - returning the struct inside a register (if it fits) - comes into play. That leaves the worst-case scenario: copying its contents purely onto the stack again for the caller.
I could be wrong, and anyone is more than welcome to chime in and correct me. A good place to start would be GitHub, to see how the runtime handles OpCodes.Ldloc/Stloc for structs. I would imagine that's a good spot to look when it comes to getting the answers you need.
EDIT: any tutorial you've read that says structs are always allocated on the stack, have them all DDoS'd.
My understanding has always been, regardless of C++ or C# or Java, that when we use the new keyword to create an object it allocates memory on the heap. I thought that new is only needed for reference types (classes), and that primitive types (int, bool, float, etc.) never use new and always go on the stack (except when they're a member variable of a class that gets instantiated with new). However, I have been reading information that makes me doubt this long standing assumption, at least for Java and C#.
For example, I just noticed that in C# the new operator can be used to initialize a value type, see here. Is this an exception to the rule, a helper feature of the language, and if so, what other exceptions would there be?
Can someone please clarify this?
I thought that new is only needed for reference types (classes), and that primitive types (int, bool, float, etc.) never use new
In C++, you can allocate primitive types on the heap if you want to:
int* p = new int(42);
This is useful if you want a shared counter, for example in the implementation of shared_ptr<T>.
Also, you are not forced to use new with classes in C++:
void function()
{
MyClass myObject(1, 2, 3);
}
This will allocate myObject on the stack. Note that new is rarely used in modern C++.
Furthermore, you can overload operator new (either globally or class-specific) in C++, so even if you say new MyClass, the object does not necessarily get allocated on the heap.
I don't know precisely about Java (and it seems quite difficult to find documentation about it).
In C#, new invokes the constructor and returns a fresh object. If it is of value type, it is allocated on the stack (eg. local variable) or on the heap (eg. boxed object, member of a reference type object). If it is of reference type, it always goes on the heap and is managed by the garbage collector. See http://msdn.microsoft.com/en-us/library/fa0ab757(v=vs.80).aspx for more details.
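A minimal sketch (the types are illustrative) of how "new" behaves differently for the two kinds of types:
struct PointStruct { public int X; public PointStruct(int x) { X = x; } }
class PointClass { public int X; public PointClass(int x) { X = x; } }

PointStruct s = new PointStruct(1);  // value type: stored in the variable itself, no GC allocation
PointClass c = new PointClass(1);    // reference type: object on the GC heap, c holds a reference
object boxed = s;                    // boxing copies the struct into a new heap object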
In C++, a "new expression" returns a pointer to an object with dynamic storage duration (ie. that you must destroy yourself). There is no mention of heap (with this meaning) in the C++ standard, and the mechanism through which such an object is obtained is implementation defined.
My understanding has always been, regardless of C++ or C# or Java, that when we use the new keyword to create an object it allocates memory on the heap.
Your understanding has been incorrect:
new may work differently in different programming languages, even when these languages are superficially alike. Don't let the similar syntax of C#, C++, and Java mislead you!
The terms "heap" and "stack" (as they are understood in the context of internal memory management) are simply not relevant to all programming languages. Arguably, these two concepts are more often implementation details than that they are part of a programming language's official specification.
(IIRC, this is true for at least C# and C++. I don't know about Java.)
The fact that they are such widespread implementation details doesn't imply that you should rely on that distinction, nor that you should even know about it! (However, I admit that I usually find it beneficial to know "how things work" internally.)
I would suggest that you stop worrying too much about these concepts. The important thing you need to get right is to understand a language's semantics; e.g., for C# or any other .NET language, the difference between reference and value type semantics.
Example: What the C# specification says about operator new:
Note how the following part of the C# specification published by ECMA (4th edition) does not mention any "stack" or "heap":
14.5.10 The new operator
The new operator is used to create new instances of types. […]
The new operator implies creation of an instance of a type, but does not necessarily imply dynamic allocation of memory. In particular, instances of value types require no additional memory beyond the variables in which they reside, and no dynamic allocations occur when new is used to create instances of value types.
Instead, it talks of "dynamic allocation of memory", but that is not the same thing: You could dynamically allocate memory on a stack, on the heap, or anywhere else (e.g. on a hard disk drive) for that matter.
What it does say, however, is that instances of value types are stored in-place, which is exactly what value type semantics are all about: Value type instances get copied during an assignment, while reference type instances are referenced / "aliased". That is the important thing to understand, not the "heap" or the "stack"!
In C#, a class instance always lives on the heap. A struct can be either on the heap or the stack:
variables (except captures and iterator blocks), and fields of a struct that is itself on the stack, live on the stack
captures, iterator blocks, fields of something that is on the heap, and values in an array live on the heap, as do "boxed" values (see the sketch below)
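A minimal sketch of the capture case (the lambda forces "counter" onto the heap):
int counter = 0;                    // looks like a local, but it is captured below
Func<int> next = () => ++counter;   // the compiler hoists "counter" into a heap-allocated closure class
Console.WriteLine(next());          // 1
Console.WriteLine(next());          // 2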
Java 7 does escape analysis to determine if an object can be allocated on the stack, according to http://download.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html.
However, you cannot instruct the runtime to allocate an object on the heap or on the stack. It's done automatically.
Regarding c#, read The Truth About Value Types.
You will see that value types can go on the heap as well.
And this question suggests that reference types could go on the stack (but that does not happen at the moment).
(Referring to Java) What you said is correct - primitives are allocated on the stack (there are exceptions, e.g. closures). However, you might be referring to objects such as:
Integer n = new Integer(2);
This refers to an Integer object, and not a primitive int. Perhaps this was your source of confusion? In this case, n will be allocated on the heap. Perhaps your confusion was due to autoboxing rules? Also see this question for more details on autoboxing. Check out comments on this answer for exceptions to the rule where primitives are allocated on the heap.
In Java and C#, we don't need to allocate primitive types on the heap. They can be allocated on the stack (not that they are restricted to the stack). In C++, on the other hand, both primitive and user-defined types can be allocated on either the stack or the heap.
In C++, there's an additional way to use the new operator, and that's via 'placement new'. The memory you point it to could exist anywhere.
See What uses are there for "placement new"?
I recently came across the link below which I have found quite interesting.
http://en.wikipedia.org/wiki/XOR_linked_list
General-purpose debugging tools cannot follow the XOR chain, making debugging more difficult;
The price for the decrease in memory usage is an increase in code complexity, making maintenance more expensive;
Most garbage collection schemes do not work with data structures that do not contain literal pointers;
XOR of pointers is not defined in some contexts (e.g., the C language), although many languages provide some kind of type conversion between pointers and integers;
The pointers will be unreadable if one isn't traversing the list - for example, if the pointer to a list item was contained in another data structure;
While traversing the list you need to remember the address of the previously accessed node in order to calculate the next node's address.
Now I am wondering if that is exclusive to low level languages or if that is also possible within C#?
Are there any similar options to produce the same results with C#?
TL;DR I quickly wrote a proof-of-concept XorLinkedList implementation in C#.
This is absolutely possible using unsafe code in C#. There are a few restrictions, though:
The XorLinkedList nodes must be "unmanaged structs", i.e., they cannot contain managed references
Due to a limitation in C# generics, the linked list cannot be generic (not even with where T : struct)
The latter seems to be because you cannot restrict the generic parameter to unmanaged structs. With just where T : struct you'd also allow structs that contain managed references.
This means that your XorLinkedList can only hold primitive values like ints, pointers or other unmanaged structs.
Low-level programming in C#
private static Node* _ptrXor(Node* a, Node* b)
{
return (Node*)((ulong)a ^ (ulong)b);//very fragile
}
Very fragile, I know. C# pointers and IntPtr do not support the XOR-operator (probably a good idea).
private static Node* _allocate(Node* link, int value = 0)
{
var node = (Node*) Marshal.AllocHGlobal(sizeof (Node));
node->xorLink = link;
node->value = value;
return node;
}
Don't forget to Marshal.FreeHGlobal those nodes afterwards (implement the full IDisposable pattern and be sure to place the free calls outside the if (disposing) block).
private static Node* _insertMiddle(Node* first, Node* second, int value)
{
var node = _allocate(_ptrXor(first, second), value);
var prev = _prev(first, second);
first->xorLink = _ptrXor(prev, node);
var next = _next(first, second);
second->xorLink = _ptrXor(node, next);
return node;
}
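The _prev and _next helpers referenced above aren't shown in the snippet; here is a minimal sketch of what they might look like, given that each node stores prev XOR next (this is my reading of the intent, not the original code):
private static Node* _prev(Node* first, Node* second)
{
    // first->xorLink == prev ^ second, so XORing with second recovers prev
    return _ptrXor(first->xorLink, second);
}

private static Node* _next(Node* first, Node* second)
{
    // second->xorLink == first ^ next, so XORing with first recovers next
    return _ptrXor(first, second->xorLink);
}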
Conclusion
Personally, I would never use an XorLinkedList in C# (maybe in C when I'm writing really low-level system stuff like memory allocators or kernel data structures). In any other setting the small gain in storage efficiency is really not worth the pain. The fact that you can't use it together with managed objects in C# renders it pretty much useless for everyday programming.
Also, storage is almost free today, even main memory, and if you're using C# you likely don't care about storage much. I've read somewhere that CLR object headers were around ~40 bytes, so this one pointer will be the least of your concerns ;)
C# doesn't generally let you manipulate references at that level, so no, unfortunately.
As an alternative to the unsafe solutions that have been proposed: if you backed your linked list with an array or list collection, where instead of a memory pointer 'next' and 'previous' indicate indexes into the array, you could implement this XOR trick without resorting to unsafe features.
There are ways to work with pointers in C#, but you can have a pointer to an object only temporarily, so you can't use them in this scenario. The main reason for this is garbage collection - as long as you can do things like XOR pointers and un-XOR them later, the GC has no way of knowing whether it's safe to collect a certain object or not.
You could make something very similar by emulating pointers using indexes in one big array, but you would have to implement a simple form of memory management yourself (i.e. when creating a new node, where in the array should I put it?).
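A minimal sketch of that idea (names are illustrative; index -1 plays the role of the null pointer):
struct XorNode
{
    public int Value;
    public int Link;        // prevIndex ^ nextIndex, with -1 as the "null" index
}

// Walking forward: given the previous index and the current index,
// the next index is recovered the same way as with XORed pointers.
static int Next(XorNode[] nodes, int prevIndex, int currentIndex)
{
    return nodes[currentIndex].Link ^ prevIndex;
}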
Another option would be to go with C++/CLI which allows you both the full flexibility of pointers on one hand and GC and access to the framework when you need it on the other.
Sure. You would just need to code the class. The XOR operator in C# is ^.
That should be all you need to start the coding.
Note this will require the code to be declared "unsafe." See here for how to use pointers in C#.
Making a broad generalization here: C# appears to have gone down the path of readability and clean interfaces, not the path of bit fiddling and packing all the information as densely as possible.
So, unless you have a specific need here, you should use the List you are provided. Future maintenance programmers will thank you for it.
It is possible; however, you have to understand how C# looks at objects. An instance variable does not actually contain an object but a pointer to the object in memory.
DateTime dt = DateTime.Now;
dt is a pointer to a struct in memory containing the DateTime scheme.
So you could do this type of linked list, although I am not sure why you would, as the framework typically has already implemented the most efficient collections. As a thought experiment it is possible.