This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
C# parameters by reference and .net garbage collection
I was thinking of using ref arguments to limit the bounds checking of an array. For example, the code for swapping two elements is:
class Test {
int[] array;
private void qSort() {
...blah...
int temp = array[a];
array[a] = array[b];
array[b] = temp;
}
}
which has 4 accesses to the array.
The alternative would be:
class Test {
int[] array;
private void qSort() {
...blah...
Swap( ref array[a], ref array[b] );
}
static void Swap(ref int a,ref int b) {
int temp = a;
a=b;
GC.Collect(); // suppose this happens
b=temp;
}
}
which theoretically has only 2 accesses to the array.
What confuses me is that I don't know what exactly happens when I pass an array element by ref. If the garbage collector kicks in while code in the Swap function is executing, will it be able to move the array? Or is the array pinned for the duration of the call?
Mind that the above code is a simple test case. I want to use it in much more complex scenarios.
Edit: As BrokenGlass pointed out, this is answered by Eric Lippert here: C# parameters by reference and .net garbage collection
The array will not be pinned; the garbage collector can move it and will accordingly update any ref to one of its elements that resides on the stack.
The Swap function still accesses the array 3 or 4 times, so it does not offer any performance advantage over the simpler code. It might be useful if it is reused.
static void Swap(ref int a, ref int b)
{
int temp = a; //<-- Here, a is in the array
a=b; //<-- a and b are in the array
b=temp; //<-- b is in the array
}
The garbage collector will not release memory you have a reference to, as happens when you pass by reference.
The stack could look like this:
qSort() has a reference to the array
Swap()
So if GC.Collect() executes in Swap(), there is still a reference to the array in qSort(), which means the array won't be collected.
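A minimal sketch of the scenario discussed above, assuming a plain console program; the class name RefSwapDemo is hypothetical. It forces a collection while two refs to array elements are live. Whether the array actually moves is up to the runtime, but the swap completes correctly either way and no pinning code is needed.

```csharp
using System;

public static class RefSwapDemo
{
    public static void Swap(ref int a, ref int b)
    {
        int temp = a;
        a = b;
        GC.Collect(); // force a collection while both refs are still in use
        b = temp;
    }

    public static void Main()
    {
        int[] array = { 1, 2, 3 };
        Swap(ref array[0], ref array[2]);
        Console.WriteLine(string.Join(",", array)); // prints 3,2,1
    }
}
```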
This question already has answers here:
Ref returns restrictions in C# 7.0
(4 answers)
Closed 1 year ago.
I'm reading <Essential C# 7.0>.
As for "return by reference", it says
There are two important restrictions on return by reference—both due to object lifetime: Object references shouldn’t be garbage collected while they’re still referenced, and they shouldn’t consume memory when they no longer have any references. To enforce these restrictions, you can only return the following from a reference-returning function:
• References to fields or array elements
• Other reference-returning properties or functions
• References that were passed in as parameters to the by-reference returning function
I did some experiments, and the results are certainly as the book says.
namespace App
{
public class App
{
static void Main()
{
}
class Test
{
public int x;
}
ref int RefReturn()
{
int[] a = { 1, 2 };
return ref a[0];
}
ref int RefReturn2()
{
Test t = new Test();
return ref t.x;
}
ref int RefReturnError()
{
int a = 1;
return ref a; //Error
}
ref Test RefReturnError2()
{
Test t = new Test();
return ref t; //Error
}
}
}
I cannot quite understand this.
(Why can exactly these 3 kinds be returned by reference, as the book says?)
For example, I can ref return a field of a class instance, but not the instance itself
("ref int RefReturn2()" and "ref Test RefReturnError2()" in my code).
I think it has something to do with "object lifetime" and "garbage collection" in C#, which I don't know much about.
I would also like to know typical situations for using return by reference.
I think typical situations would also help me understand.
The key takeaway from the rules for managed references (ref) is: a returned managed reference must not point to a local variable of the returning method, or to part of one (in the case of a struct), because the reference would outlive the location it points to. It must point to a location that survives the return, such as a heap location or one passed in by the caller.
Let's take each version one by one:
ref int RefReturn()
{
int[] a = { 1, 2 };
return ref a[0];
}
In the above example, the returned reference points to the interior of the array; it does not point to the local variable. The interior of an array is effectively a field of a heap object, and the array will outlive the function call.
ref int RefReturn2()
{
Test t = new Test();
return ref t.x;
}
In this one, Test is a reference type, and therefore lives on the heap. The reference points to the field x of the object referred to by t, which also lives on the heap. The fact that t is a local variable is immaterial, because the reference does not point to t.
ref int RefReturnError()
{
int a = 1;
return ref a; //Error
}
In this case, the reference points to the actual location of the local variable. This lives on the stack, and the location will disappear at the end of the function.
Note that the same problem is visible when taking a reference to a field of a struct, when the struct's location is a local variable.
ref int RefReturnError1A()
{
MyStruct a = new MyStruct();
return ref a.x; //Error
}
ref Test RefReturnError2()
{
Test t = new Test();
return ref t; //Error
}
In this one, although t is a reference-type and itself points to a heap object, our reference does not point to that object which t points to. It points to the location of t itself which contains that object reference.
Note that a reference to a boxed struct is disallowed for a different reason: due to C#'s unboxing rules, unboxing (logically) creates a copy, so you cannot change it in place. Coding in IL directly (or in C++/CLI), you can do the equivalent of the following in a perfectly verifiable way:
ref int RefReturnBox()
{
object a = (object)1;
return ref (int)a; // CS0445: Cannot modify the result of an unboxing conversion
}
A local variable has the same lifetime as the method, the memory location of the variable itself is on the stack.
So neither variable int a or Test t exist after the method returns.
But a[0] and t.x exist inside objects that are stored in heap memory. They will still be there when the method returns since they live outside the stack.
So how about an example? Why would you want to use a ref local or ref return? How about when defining a linked list:
using System;

internal class Node<T>
    where T : class
{
    internal T Item;
    internal Node<T> Next;
}
public class LinkedList<T>
    where T : class
{
    private Node<T> Root;
    private ref Node<T> Find(Func<T, bool> comparison)
    {
        // Walk a reference along the chain of Root / Next fields.
        ref var ret = ref Root;
        while (ret != null && comparison(ret.Item))
            ret = ref ret.Next;
        return ref ret;
    }
    public void Insert(T newItem, Func<T, bool> comparison)
    {
        ref var position = ref Find(comparison);
        // Link the node currently at this position (possibly null) after the new one.
        position = new Node<T>
        {
            Item = newItem,
            Next = position
        };
    }
}
Find returns a reference to a Node<T> field, which you can then assign a new node to. So you can handle assigning to either Root or somenode.Next without needing two special cases.
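Another situation where a ref return can help, sketched here under assumptions: handing the caller a reference to an array element so it can be updated in place without a second lookup. The names RefReturnDemo and FindFirst are hypothetical and not from the book or the answers above.

```csharp
using System;

public static class RefReturnDemo
{
    // Hypothetical helper: returns a reference to the first element equal to value.
    // The element lives inside the heap-allocated array, so returning a ref to it is allowed.
    public static ref int FindFirst(int[] items, int value)
    {
        for (int i = 0; i < items.Length; i++)
        {
            if (items[i] == value)
                return ref items[i];
        }
        throw new InvalidOperationException("value not found");
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3 };
        ref int slot = ref FindFirst(data, 2);
        slot = 20; // writes straight into data[1]
        Console.WriteLine(data[1]); // prints 20
    }
}
```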
This question already has answers here:
Why would ref be used for array parameters in C#?
(2 answers)
What is the use of "ref" for reference-type variables in C#?
(10 answers)
Closed 7 years ago.
I am slightly confused about this, as I have read that an int[] array is a reference-type variable because it is an array, even though int is a primitive type.
What is the difference, then, between a method such as:
public static void ChangeSomething(ref int[] array)
{
array[0] = 100;
}
and
public static void ChangeSomething(int[] array)
{
array[0] = 100;
}
When the array is modified, I can see the new value of 100 at index 0 for both of these calls.
Is there something different that happens under the covers which makes one better than another? Does the VS IDE allow both simply because perhaps the "ref" keyword clarifies the intention?
The difference is that with ref you can assign to the caller's original variable directly in the method. If you change your method to this:
public static void ChangeSomething(ref int[] array)
{
array = new int[2];
}
And call it like this:
var myArray = new int[10];
ChangeSomething(ref myArray);
Console.WriteLine(myArray.Length);
You will see that myArray only has a length of 2 after the call. Without the ref keyword you can only change the contents of the array, since the reference to the array is copied into the method.
If you modify the items of the array, there is no difference.
But if you replace the array itself with a larger array, there is a difference:
public static void ChangeSomething(ref int[] array)
{
array = new int[100]; //you are changing the variable of caller
}
and
public static void ChangeSomething(int[] array)
{
array = new int[100]; //you are changing local copy of array variable, the caller array remains same.
}
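The two answers above can be checked side by side with a small sketch. The method names ReplaceByValue and ReplaceByRef are hypothetical, chosen only to tell the two variants apart.

```csharp
using System;

public static class RefArrayDemo
{
    // Reassigns only the method's local copy of the reference.
    public static void ReplaceByValue(int[] array)
    {
        array = new int[2];
    }

    // Reassigns the caller's variable.
    public static void ReplaceByRef(ref int[] array)
    {
        array = new int[2];
    }

    public static void Main()
    {
        var myArray = new int[10];

        ReplaceByValue(myArray);
        Console.WriteLine(myArray.Length); // prints 10

        ReplaceByRef(ref myArray);
        Console.WriteLine(myArray.Length); // prints 2
    }
}
```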
Yesterday I was reading the C# reference, and there I saw the following statement.
Context:
the use of a struct rather than a class for a Point can make a large difference in the number of
memory allocations performed at run time. The program below creates and initializes an array of 100 points.
With Point implemented as a class, 101 separate objects are instantiated—one for the array and one each
for the 100 elements.
class Point
{
public int x, y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
}
class Test
{
static void Main() {
Point[] points = new Point[100];
for (int i = 0; i < 100; i++)
points[i] = new Point(i, i*i);
}
}
If Point is instead implemented as a struct, as in
struct Point
{
public int x, y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
}
only one object is instantiated—the one for the array. The Point instances are allocated in-line within the
array. This optimization can be misused. Using structs instead of classes can also make an application run slower or take up more memory, as passing a struct instance by value causes a copy of that struct to be
created.
Question:
Here my question is: how is memory allocated for value types and reference types?
Confusion:
Why does the reference guide say that only 1 object will be instantiated? As per my understanding, separate memory will be allocated for each object in the array.
Edit: Possible Duplicate
This question is a bit different from the possible duplicate suggested by jason. My concern is solely how memory is allocated for value types and reference types, while that question just gives an overview of value types and reference types.
Perhaps the difference between an array of a reference type and an array of a value type is easier to understand with an illustration:
Array of a reference type
Each Point as well as the array is allocated on the heap and the array stores references to each Point. In total you need N + 1 allocations where N is the number of points. You also need an extra indirection to access a field of a particular Point because you have to go through a reference.
Array of a value type
Each Point is stored directly in the array. There is only one allocation on the heap. Accessing a field does not involve indirection. The memory address of the field can be computed directly from the memory address of the array, the index of the item in the array and the location of the field inside the value type.
An array with reference types will consist of an array of references. Each reference points to a memory area which contains the actual object:
array[0] == ref0 -> robj0
array[1] == ref1 -> robj1
...
So there is one memory allocation for the array of references (size: arraylength * sizeof(reference)) and a separate memory allocation for each object (sizeof(robj)).
An array with value types (like structs) will contain just the objects:
array[0] == vobj0
array[1] == vobj1
...
so there is just one memory allocation with size arraylength * sizeof(vobj).
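One observable consequence of the two layouts, as a rough sketch: a freshly created array of a class type holds only null references until each element is allocated, while an array of a struct type already contains usable values. PointClass and PointStruct are hypothetical stand-ins for the two Point versions quoted in the question.

```csharp
using System;

public class PointClass { public int x, y; }
public struct PointStruct { public int x, y; }

public static class AllocationDemo
{
    public static void Main()
    {
        var classPoints = new PointClass[3];   // one allocation: the array of references
        var structPoints = new PointStruct[3]; // one allocation: the array holding the structs inline

        // No PointClass objects exist yet, only null references.
        Console.WriteLine(classPoints[0] == null); // prints True

        // The structs are already there and can be modified in place.
        structPoints[0].x = 5;
        Console.WriteLine(structPoints[0].x); // prints 5
    }
}
```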
I have a multidimensional array of pointers to integers (of unknown rank) being passed into my function as such:
public unsafe static void MyMethod(Array source, ...)
{
//...
}
Multidimensional arrays of pointers are being constructed outside of the method and being passed in. Here's an example:
int*[,,,] testArray = new int*[10,10,5,5];
MyMethod(testArray);
How can I set a value at a runtime-computed index in the array? Array.SetValue(...) works perfectly fine for non-pointer arrays, but refuses to work for my int* array. Using Reflector, I see SetValue reduces down to calling InternalSetValue, which takes an object for the value, but it's marked as extern and I can't see the implementation. I took a shot in the dark and tried passing in a boxed pointer, but no luck.
This works:
unsafe static void MyMethod(int** array)
{
array[10] = (int*)0xdeadbeef;
}
private static unsafe void Main()
{
int*[, , ,] array = new int*[10, 10, 10, 10];
fixed (int** ptr = array)
{
MyMethod(ptr);
}
int* x = array[0, 0, 1, 0]; // == 0xdeadbeef
}
Does that help?
Question to the experts: Is it wrong to assume that the array is allocated consecutively in memory?
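If the consecutive, row-major layout that the snippet above relies on is assumed, a runtime-computed multidimensional index can be turned into a flat offset for an array of any rank. The following is only a sketch under that assumption (and assuming zero lower bounds); FlatIndex is a hypothetical helper, and an int array is used so the example does not need unsafe code.

```csharp
using System;

public static class FlatIndexDemo
{
    // Hypothetical helper: converts per-dimension indices into a row-major flat offset.
    // Assumes every dimension has a lower bound of 0.
    public static int FlatIndex(Array source, params int[] indices)
    {
        if (indices.Length != source.Rank)
            throw new ArgumentException("one index per dimension is required");

        int flat = 0;
        for (int d = 0; d < source.Rank; d++)
            flat = flat * source.GetLength(d) + indices[d];
        return flat;
    }

    public static void Main()
    {
        var grid = new int[10, 10, 10, 10];
        Console.WriteLine(FlatIndex(grid, 0, 0, 1, 0)); // prints 10
    }
}
```

This matches the snippet above, where writing to offset 10 through the pointer shows up at array[0, 0, 1, 0].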
This doesn't work because it's not possible to box a pointer in .NET, so you can never call the Array.SetValue and pass an int*.
Can you declare MyMethod to accept int*[,,,] instead?
Edit: for further reading, an interesting recent post from Eric Lippert.
I read this post about card shuffling and in many shuffling and sorting algorithms you need to swap two items in a list or array. But what does a good and efficient Swap method look like?
Let's say for a T[] and for a List<T>. How would you best implement a method that swaps two items in those two?
Swap(ref cards[i], ref cards[n]); // How is Swap implemented?
Well, the code you have posted (ref cards[n]) can only work with an array (not a list) - but you would use simply (where foo and bar are the two values):
static void Swap(ref int foo, ref int bar) {
int tmp = foo;
foo = bar;
bar = tmp;
}
Or possibly (if you want the write to foo to be atomic; the swap as a whole is still not atomic):
bar = Interlocked.Exchange(ref foo, bar);
Personally, I don't think I'd bother with a swap method, though - just do it directly; this means that you can use (either for a list or for an array):
int tmp = cards[n];
cards[n] = cards[i];
cards[i] = tmp;
If you really wanted to write a swap method that worked on either a list or an array, you'd have to do something like:
static void Swap(IList<int> list, int indexA, int indexB)
{
int tmp = list[indexA];
list[indexA] = list[indexB];
list[indexB] = tmp;
}
(it would be trivial to make this generic) - however, the original "inline" version (i.e. not a method) working on an array will be faster.
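A generic version of that method might look like the sketch below. The class name ListSwapDemo is hypothetical; arrays work here because T[] implements IList<T>.

```csharp
using System;
using System.Collections.Generic;

public static class ListSwapDemo
{
    public static void Swap<T>(IList<T> list, int indexA, int indexB)
    {
        T tmp = list[indexA];
        list[indexA] = list[indexB];
        list[indexB] = tmp;
    }

    public static void Main()
    {
        var cards = new List<string> { "A", "K", "Q" };
        Swap(cards, 0, 2);
        Console.WriteLine(string.Join(" ", cards)); // prints Q K A

        int[] numbers = { 1, 2, 3 };
        Swap<int>(numbers, 0, 1);
        Console.WriteLine(string.Join(" ", numbers)); // prints 2 1 3
    }
}
```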
11 years later and we have tuples...
(foo, bar) = (bar, foo);
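The tuple form also works directly on indexed elements, so it covers both arrays and lists without a helper method. A small sketch, assuming C# 7.0 or later; the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class TupleSwapDemo
{
    public static void Main()
    {
        int[] cards = { 10, 20, 30 };
        (cards[0], cards[2]) = (cards[2], cards[0]);
        Console.WriteLine(string.Join(" ", cards)); // prints 30 20 10

        var list = new List<int> { 1, 2, 3 };
        (list[0], list[1]) = (list[1], list[0]);
        Console.WriteLine(string.Join(" ", list)); // prints 2 1 3
    }
}
```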
A good swap is one where you don't swap the contents. In C/C++ this would be akin to swapping pointers instead of swapping the contents. This style of swapping is fast and comes with some exception guarantee. Unfortunately, my C# is too rusty to allow me to put it in code. For simple data types, this style doesn't give you much. But once you are used to, and have to deal with larger (and more complicated) objects, it can save your life.
What about this?
It's a generic implementation of a swap method. The JIT will create a compiled version ONLY for your closed types, so you don't have to worry about performance!
/// <summary>
/// Swap two elements
/// Generic implementation by LMF
/// </summary>
public static void Swap<T>(ref T itemLeft, ref T itemRight) {
    T dummyItem = itemLeft; // save the left value before it is overwritten
    itemLeft = itemRight;
    itemRight = dummyItem;
}
HTH
Lorenzo
Use:
void swap(int &a, int &b)
{
// &a != &b
// a == b OK
a ^= b;
b ^= a;
a ^= b;
return;
}
I did not realize I was in the C# section. This is C++ code, but it should have the same basic idea. I believe ^ is XOR in C# as well. It looks like instead of & you may need "ref"(?). I am not sure.
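A C# rendering of the same idea could look like the sketch below, using ref parameters in place of C++ references. The early return on equal values is an added assumption, not part of the original snippet: it makes the call harmless when both refs name the same location, which would otherwise zero the value. The name XorSwap is hypothetical.

```csharp
using System;

public static class XorSwapDemo
{
    public static void XorSwap(ref int a, ref int b)
    {
        // Equal values need no swap; this also covers the case where a and b alias the same variable.
        if (a == b) return;
        a ^= b;
        b ^= a;
        a ^= b;
    }

    public static void Main()
    {
        int x = 3, y = 5;
        XorSwap(ref x, ref y);
        Console.WriteLine($"{x} {y}"); // prints 5 3

        int z = 7;
        XorSwap(ref z, ref z);
        Console.WriteLine(z); // prints 7
    }
}
```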
You can now use tuples to accomplish this swap without having to manually declare a temporary variable:
static void Swap<T>(ref T foo, ref T bar)
{
    (foo, bar) = (bar, foo);
}
For anyone wondering, swapping can also be done with extension methods (C# 3.0 and newer).
Before C# 7.2 the "this" parameter of an extension method could not be declared ref (and since then only for value types), so you need to return the value and overwrite the old one.
public static class GeneralExtensions {
public static T SwapWith<T>(this T current, ref T other) {
T tmpOther = other;
other = current;
return tmpOther;
}
}
This extension method can be then used like this:
int val1 = 10;
int val2 = 20;
val1 = val1.SwapWith(ref val2);
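For value types, C# 7.2 and later allow the "this" parameter itself to be passed by ref, which removes the need to return and reassign. A sketch under that assumption; the class name SwapExtensions is hypothetical, and the struct constraint is required by the language for a generic ref this parameter.

```csharp
using System;

public static class SwapExtensions
{
    // 'ref this' requires C# 7.2+ and a value type, hence the struct constraint.
    public static void SwapWith<T>(ref this T current, ref T other) where T : struct
    {
        T tmp = current;
        current = other;
        other = tmp;
    }
}

public static class SwapExtensionsDemo
{
    public static void Main()
    {
        int val1 = 10;
        int val2 = 20;
        val1.SwapWith(ref val2);
        Console.WriteLine($"{val1} {val2}"); // prints 20 10
    }
}
```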