Common question but I could use an "english" explanation.
Is it like Java where
Cat myCat
actually is a pointer to Cat?
Should I really create copy constructors in C#?
I understand we are passing by value, but now my question is are we passing by pointer value or full copy of the object?
If it's the latter, isn't that too expensive performance/memory wise? Is that when you have to use the ref keyword?
As @rstevens answered, if it is a class, myCat is a reference. But if you pass myCat to a method call, the reference itself is passed by value: the parameter refers to the same object, but it is a completely new reference. If you assign null or a new object to the parameter, the original myCat reference still points to the original object.
SomeMethod(myCat);
void SomeMethod(Cat cat)
{
cat.Miau(); // makes the original myCat object miau
cat = null; // only the local cat is set to null; myCat still points to the original object
}
Jon Skeet has a good article about it.
Remember that a pointer is not exactly the same as a reference, but you can just about think of it that way if you want.
I swear I saw another SO question on this not 10 minutes ago, but I can't find the link now. In the other question I saw, they were talking about passing arguments by ref vs by value, and it came down to this:
By default in .Net, you don't pass objects by reference. You pass references to objects by value.
The difference is subtle but important, especially if, for example, you want to assign to your passed object in the method.
If you declared Cat as
class Cat {...}
then it is.
If you declared Cat as
struct Cat {...}
then your variable "is" the structure itself.
This is the difference between reference types and value types in .Net.
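To make the class/struct difference concrete, here is a minimal sketch. CatClass and CatStruct are hypothetical names made up for this illustration (the original post only shows `class Cat {...}` and `struct Cat {...}`); the point is what happens when a variable of each kind is copied.

```csharp
using System;

// Hypothetical types for illustration only.
class CatClass { public string Name = ""; }
struct CatStruct { public string Name; }

static class ValueVsReferenceDemo
{
    // Copying a class variable copies the reference, so both names see one object.
    public static string RenameThroughClassCopy()
    {
        var original = new CatClass { Name = "Tom" };
        CatClass copy = original;   // copies the reference only
        copy.Name = "Felix";        // changes the single shared object
        return original.Name;
    }

    // Copying a struct variable copies the whole value.
    public static string RenameThroughStructCopy()
    {
        var original = new CatStruct { Name = "Tom" };
        CatStruct copy = original;  // copies the data itself
        copy.Name = "Felix";        // only the copy changes
        return original.Name;
    }

    static void Main()
    {
        Console.WriteLine(RenameThroughClassCopy());  // Felix
        Console.WriteLine(RenameThroughStructCopy()); // Tom
    }
}
```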
Yes, it's about pointers, but not really... The thing that messed me up originally is that it isn't really about protecting your variable from changes within the method. If you change the object within the method, those changes are visible to the calling code regardless of whether it is passed with "ref" or not.
The difference (as I understand it) is whether the variable you send in has its reference updated coming back out if you change the object that variable references. So given this method
public void DoSomething(ref CoolShades glasses)
{
glasses.Vendor = "Ray Ban";
glasses = new CoolShades();
}
the variable you passed in as a parameter now contains a reference to the new CoolShades rather than whatever object it referenced before. The original parameter object's Vendor property will be changed to "Ray Ban" regardless of whether you passed the parameter ref or not.
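A small side-by-side sketch of that behaviour, assuming a CoolShades class with a single Vendor field (the real class is not shown in the post, so this definition is a guess for illustration):

```csharp
using System;

// Assumed minimal definition; the original CoolShades class is not shown.
class CoolShades { public string Vendor = "Unknown"; }

static class RefDemo
{
    static void DoSomething(CoolShades glasses)
    {
        glasses.Vendor = "Ray Ban";   // mutates the caller's object
        glasses = new CoolShades();   // rebinds only the local parameter
    }

    static void DoSomethingRef(ref CoolShades glasses)
    {
        glasses.Vendor = "Ray Ban";   // mutates the caller's object
        glasses = new CoolShades();   // rebinds the caller's variable as well
    }

    public static (bool sameObject, string originalVendor) WithoutRef()
    {
        var original = new CoolShades();
        CoolShades variable = original;
        DoSomething(variable);
        return (object.ReferenceEquals(variable, original), original.Vendor);
    }

    public static (bool sameObject, string originalVendor) WithRef()
    {
        var original = new CoolShades();
        CoolShades variable = original;
        DoSomethingRef(ref variable);
        return (object.ReferenceEquals(variable, original), original.Vendor);
    }

    static void Main()
    {
        Console.WriteLine(WithoutRef()); // (True, Ray Ban)
        Console.WriteLine(WithRef());    // (False, Ray Ban)
    }
}
```

In both cases the original object ends up with Vendor "Ray Ban"; only the ref version leaves the caller's variable pointing at a different object.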
I have an object that holds the in-memory state of my program, and some worker functions that I pass the object to so they can modify that state. I have been passing it by ref to the worker functions. However, I came across the following function.
byte[] received_s = new byte[2048];
IPEndPoint tmpIpEndPoint = new IPEndPoint(IPAddress.Any, UdpPort_msg);
EndPoint remoteEP = (tmpIpEndPoint);
int sz = soUdp_msg.ReceiveFrom(received_s, ref remoteEP);
It confuses me because both received_s and remoteEP are returning stuff from the function. Why does remoteEP need a ref and received_s does not?
I am also a C programmer, so I am having a problem getting pointers out of my head.
Edit:
It looks like objects in C# are pointers to the object under the hood. So when you pass an object to a function, you can modify the object's contents through the pointer, and the only thing passed to the function is the pointer, so the object itself is not copied. You use ref or out if you want to be able to swap in or create a new object in the function, which is like a double pointer.
Short answer: read my article on argument passing.
Long answer: when a reference type parameter is passed by value, only the reference is passed, not a copy of the object. This is like passing a pointer (by value) in C or C++. Changes to the value of the parameter itself won't be seen by the caller, but changes in the object which the reference points to will be seen.
When a parameter (of any kind) is passed by reference, that means that any changes to the parameter are seen by the caller - changes to the parameter are changes to the variable.
The article explains all of this in more detail, of course :)
Useful answer: you almost never need to use ref/out. It's basically a way of getting another return value, and should usually be avoided precisely because it means the method's probably trying to do too much. That's not always the case (TryParse etc are the canonical examples of reasonable use of out) but using ref/out should be a relative rarity.
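For reference, the TryParse pattern mentioned above looks roughly like this. ParseOrDefault is a hypothetical helper written for this sketch; int.TryParse itself is the real framework method.

```csharp
using System;

static class OutDemo
{
    // Hypothetical helper: returns the parsed number, or a fallback when
    // the text is not a valid integer.
    public static int ParseOrDefault(string text, int fallback)
    {
        // int.TryParse reports success through its return value and
        // hands back the parsed number through the 'out' parameter.
        return int.TryParse(text, out int value) ? value : fallback;
    }

    static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", -1));  // 42
        Console.WriteLine(ParseOrDefault("cat", -1)); // -1
    }
}
```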
Think of a non-ref parameter as being a pointer, and a ref parameter as a double pointer. This helped me the most.
You should almost never pass values by ref. I suspect that if it wasn't for interop concerns, the .Net team would never have included it in the original specification. The OO way of dealing with most problems that ref parameters solve is:
For multiple return values:
    Create structs that represent the multiple return values.
For primitives that change in a method as the result of the method call (the method has side effects on primitive parameters):
    Implement the method in an object as an instance method and manipulate the object's state (not the parameters) as part of the method call.
    Use the multiple return value solution and merge the return values into your state.
    Create an object that contains state that can be manipulated by a method, and pass that object as the parameter rather than the primitives themselves.
You could probably write an entire C# app and never pass any objects/structs by ref.
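As a rough sketch of the "struct for multiple return values" option: DivisionResult and Divide are names invented for this example, not part of any framework. The method returns both values in one struct instead of using ref or out parameters.

```csharp
using System;

// Hypothetical result type bundling two return values.
readonly struct DivisionResult
{
    public DivisionResult(int quotient, int remainder)
    {
        Quotient = quotient;
        Remainder = remainder;
    }

    public int Quotient { get; }
    public int Remainder { get; }
}

static class MultipleReturnDemo
{
    // Returns both results together; no ref/out parameters needed.
    public static DivisionResult Divide(int dividend, int divisor)
    {
        return new DivisionResult(dividend / divisor, dividend % divisor);
    }

    static void Main()
    {
        DivisionResult result = Divide(17, 5);
        Console.WriteLine($"{result.Quotient} remainder {result.Remainder}"); // 3 remainder 2
    }
}
```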
I had a professor who told me this:
The only place you'd use refs is where you either:
Want to pass a large object (ie, the objects/struct has
objects/structs inside it to multiple levels) and copying it would
be expensive and
You are calling a Framework, Windows API or other API that requires
it.
Don't do it just because you can. You can get bit in the ass by some
nasty bugs if you start changing the values in a param and aren't
paying attention.
I agree with his advice, and in my five plus years since school, I've never had a need for it outside of calling the Framework or Windows API.
Since received_s is an array, you're passing a pointer to that array. The function manipulates that existing data in place, not changing the underlying location or pointer. The ref keyword signifies that you're passing the actual pointer to the location and updating that pointer in the outside function, so the value in the outside function will change.
E.g. the byte array is a pointer to the same memory before and after, the memory has just been updated.
The Endpoint reference is actually updating the pointer to the Endpoint in the outside function to a new instance generated inside the function.
Think of a ref as meaning you are passing a pointer by reference. Not using a ref means you are passing a pointer by value.
Better yet, ignore what I just said (it's probably misleading, especially with value types) and read This MSDN page.
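The received_s / remoteEP behaviour can be imitated without any sockets. In this sketch FakeEndPoint and FakeReceiveFrom are made-up stand-ins (they are not the real Socket API); the method fills the caller's array in place and replaces the caller's endpoint reference through ref, which is presumably what ReceiveFrom does with its own arguments.

```csharp
using System;

// Made-up stand-in for EndPoint, for illustration only.
class FakeEndPoint { public string Address = "0.0.0.0"; }

static class ReceiveDemo
{
    // Made-up stand-in for Socket.ReceiveFrom.
    static int FakeReceiveFrom(byte[] buffer, ref FakeEndPoint remote)
    {
        buffer[0] = 0xAB;                                   // writes into the caller's array
        remote = new FakeEndPoint { Address = "10.0.0.7" }; // swaps the caller's reference
        return 1;                                           // number of bytes "received"
    }

    public static (byte firstByte, bool endpointReplaced, string address) Run()
    {
        var buffer = new byte[4];
        var initial = new FakeEndPoint();
        FakeEndPoint remote = initial;

        FakeReceiveFrom(buffer, ref remote);

        return (buffer[0], !object.ReferenceEquals(initial, remote), remote.Address);
    }

    static void Main()
    {
        var result = Run();
        Console.WriteLine(result.firstByte);        // 171
        Console.WriteLine(result.endpointReplaced); // True
        Console.WriteLine(result.address);          // 10.0.0.7
    }
}
```

The array needs no ref because the method never has to make the caller's variable point somewhere else; it only writes into the existing array.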
While I agree with Jon Skeet's answer overall and some of the other answers, there is a use case for using ref, and that is for tightening up performance optimizations. It has been observed during performance profiling that setting the return value of a method has slight performance implications, whereas using ref as an argument whereby the return value is populated into that parameter results in this slight bottleneck being removed.
This is really only useful where optimization efforts are taken to extreme levels, sacrificing readability and perhaps testability and maintainability for saving milliseconds or perhaps split-milliseconds.
My understanding is that all objects derived from the Object class are passed as pointers, whereas ordinary types (int, struct) are not passed as pointers and require ref. I am not sure about string (is it ultimately derived from the Object class?).
Ground rule first: primitives are passed by value (stack) and non-primitives by reference (heap), in the context of the types involved.
Parameters themselves are passed by value by default.
A good post that explains things in detail:
http://yoda.arachsys.com/csharp/parameters.html
Student myStudent = new Student {Name="A",RollNo=1};
ChangeName(myStudent);
static void ChangeName(Student s1)
{
s1.Name = "Z"; // myStudent.Name also changes from A to Z,
// because s1 and myStudent both refer to the same object on the heap
// (Student is a non-primitive, i.e. reference, type)
}
ChangeNameVersion2(ref myStudent);
static void ChangeNameVersion2(ref Student s1)
{
s1.Name = "Z"; // no difference here (same as ChangeName)
}
static void ChangeNameVersion3(ref Student s1)
{
s1 = new Student{Name="Champ"};
// the caller's reference (myStudent) now also points to this new object
// the previous Student object becomes eligible for garbage collection once nothing references it
}
We can say (with a warning) that non-primitive types are nothing but pointers,
and that when we pass them by ref we are passing a double pointer.
I know this question is old, but I would like to give a brief answer for any curious new developers.
Try as much as you can to avoid using ref/out; it works against writing clean code and tends to produce poor-quality code.
Passing types by reference (using out or ref) requires experience with pointers, understanding how value types and reference types differ and handling methods that have multiple return values. Also, the difference between out and ref parameters is not widely understood.
please check
https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1045
I just want to check my understanding of C#'s ways of handling things, before I delve too deeply into designing my classes. My current understanding is that:
Struct is a value type, meaning it actually contains the data members defined within.
Class is a reference type, meaning it contains references to the data members defined within.
A method signature passes parameters by value, which means a copy of the value is passed to the inside of the method, making it expensive for large arrays and data structures.
A method signature that defines a parameter with the ref or out keywords will instead pass a parameter by reference, which means a pointer to the object is provided instead.
What I don't understand is what happens when I invoke a method, what actually happens. Does new() get invoked? Does it just automagically copy the data? Or does it actually just point to the original object? And how does using ref and out affect this?
The short answer:
The empty constructor will not be called automatically, and it actually just points to the original object.
using ref and out does not affect this.
The long answer:
I think it would be easier to understand how C# handles passing arguments to a function.
Actually everything is being passed by value
Really?! Everything by value?
Yes! Everything!
Of course there must be some kind of a difference between passing classes and simple typed objects, such as an Integer, otherwise, it would be a huge step back performance wise.
Well, the thing is that behind the scenes, when you pass an instance of a class to a function, what is really being passed is a pointer to that instance. The pointer, of course, can be passed by value without causing performance issues.
Actually, everything is being passed by value; it's just that when
you're "passing an object", you're actually passing a reference to that
object (and you're passing that reference by value).
Once we are in the function, we can use the pointer argument to refer to the original object.
You don't actually need to do anything for this; you can refer directly to the instance passed as the argument (as said before, this whole process is done behind the scenes).
After understanding this, you probably understand that the empty constructor will not be called automatically, and it actually just points to the original object.
EDITED:
As for out and ref, they allow functions to change the value of an argument and have that change persist outside the scope of the function.
In a nutshell, using the ref keyword for value types will act as follows:
int i = 42;
foo(ref i);
would translate roughly to the following in C++:
int i = 42;
int* ptrI = &i;
foo(ptrI);
while omitting the ref would simply translate to:
int i = 42;
foo(i);
Using those keywords with reference types allows you to assign a new object to the passed argument and have that assignment persist outside the scope of the function (for more details, please refer to the MSDN page).
Side note:
The difference between ref and out is that out requires the called function to assign a value to the argument before returning, so the caller does not need to initialize it. ref has no such requirement; instead, the caller must initialize the variable before passing it. Thus, ref implies that the initial value of the argument is important to the function and might affect its behaviour.
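A minimal sketch of the ref/out difference; AddTen and GetAnswer are hypothetical methods written just for this example.

```csharp
using System;

static class RefOutDemo
{
    // 'ref': the caller must initialize the variable, and the method may read it.
    static void AddTen(ref int number)
    {
        number += 10;
    }

    // 'out': the method must assign the parameter before it returns.
    static void GetAnswer(out int number)
    {
        number = 42;
    }

    public static int UseRef()
    {
        int i = 1;          // must be assigned before being passed with ref
        AddTen(ref i);
        return i;
    }

    public static int UseOut()
    {
        int j;              // may be left unassigned before being passed with out
        GetAnswer(out j);
        return j;
    }

    static void Main()
    {
        Console.WriteLine(UseRef()); // 11
        Console.WriteLine(UseOut()); // 42
    }
}
```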
Passing a value-type variable to a method means passing a copy of the variable to the method. Any changes to the parameter that take place inside the method have no effect on the original data stored in the variable.
If you want the called method to change the value of the parameter, you have to pass it by reference, using the ref or out keyword.
When you pass a reference-type parameter by value, it is possible to change the data pointed to by the reference, such as the value of a class member. However, you cannot change the value of the reference itself; that is, you cannot use the same reference to allocate memory for a new class and have it persist outside the block. To do that, pass the parameter using the ref (or out) keyword.
Reference: Passing Parameters(C#)
Tragically, there is no way to pass an object by value in C# or VB.NET. I suggest instead you pass, for example, New Class1(Object1) where Object1 is an instance of Class1. You will have to write your own New method to do this but at least you then have an easy pass-by-value capability for Class1.
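In C# terms, the suggestion above amounts to writing a copy constructor and passing a fresh copy. A minimal sketch, assuming Class1 has a single Value field (the real Class1 is not shown, so its members here are invented):

```csharp
using System;

// Assumed shape of Class1; only the copy constructor idea comes from the post.
class Class1
{
    public int Value;

    public Class1(int value) { Value = value; }

    // Copy constructor: builds a new, independent object from an existing one.
    public Class1(Class1 other) { Value = other.Value; }
}

static class CopyDemo
{
    static void Mutate(Class1 item) { item.Value = 99; }

    public static int MutateCopy()
    {
        var object1 = new Class1(1);
        Mutate(new Class1(object1)); // the method only ever sees the copy
        return object1.Value;
    }

    public static int MutateOriginal()
    {
        var object1 = new Class1(1);
        Mutate(object1);             // the method sees the original object
        return object1.Value;
    }

    static void Main()
    {
        Console.WriteLine(MutateCopy());     // 1
        Console.WriteLine(MutateOriginal()); // 99
    }
}
```

Note that this copy is shallow unless the copy constructor also copies any nested objects.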
I have code like the following:
struct A
{
void SomeMethod()
{
var items = Enumerable.Range(0, 10).Where(i => i == _field);
}
int _field;
}
... and then I get the following compiler error:
Anonymous methods inside structs can not access instance members of 'this'.
Can anybody explain what's going on here?
Variables are captured by reference (even if they were actually value-types; boxing is done then).
However, this in a ValueType (struct) cannot be boxed, and hence you cannot capture it.
Eric Lippert has a nice article on the surprises of capturing ValueTypes. Let me find the link
The Truth About Value Types
Note in response to the comment by Chris Sinclair:
As a quick fix, you can store the struct in a local variable: A thisA = this; var items = Enumerable.Range(0, 10).Where(i => i == thisA._field); – Chris Sinclair 4 mins ago
Beware of the fact that this creates surprising situations: the identity of thisA is not the same as this. More explicitly, if you choose to keep the lambda around longer, it will have the boxed copy thisA captured by reference, and not the actual instance that SomeMethod was called on.
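The surprise can be shown in a few lines. Counter and MakeReader are hypothetical names for this sketch; the pattern is the same local-copy workaround from the comment above.

```csharp
using System;

// Hypothetical struct for illustration.
struct Counter
{
    public int Value;

    public Func<int> MakeReader()
    {
        Counter copy = this;       // snapshot of the struct at this moment
        return () => copy.Value;   // the lambda captures 'copy', not 'this'
    }
}

static class CaptureCopyDemo
{
    public static int Run()
    {
        var counter = new Counter { Value = 1 };
        Func<int> read = counter.MakeReader();
        counter.Value = 2;         // change the original after creating the lambda
        return read();             // the lambda still sees the copy
    }

    static void Main()
    {
        Console.WriteLine(Run()); // 1
    }
}
```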
When you have an anonymous method it will be compiled into a new class, and that class will have one method (the one you define). It will also have a reference to each variable that you used that was outside of the scope of the anonymous method. It's important to emphasize that it is a reference to, not a copy of, that variable. "Lambdas close over variables, not values", as the saying goes. This means that if you close over a variable outside of the scope of a lambda, and then change that variable after defining the anonymous method (but before invoking it), you will see the changed value when you do invoke it.
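A tiny sketch of "lambdas close over variables, not values" (ClosureDemo is just a name picked for this example):

```csharp
using System;

static class ClosureDemo
{
    public static int Run()
    {
        int x = 1;
        Func<int> read = () => x; // captures the variable x itself
        x = 5;                    // changed after the lambda was defined
        return read();            // reads the current value of x
    }

    static void Main()
    {
        Console.WriteLine(Run()); // 5
    }
}
```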
So, what's the point of all of that. Well, if you were to close over this for a struct, which is a value type, it's possible for the lambda to outlive the struct. The anonymous method will be in a class, not a struct, so it will go on the heap, live as long as it needs to, and you are free to pass a reference to that class (directly or indirectly) wherever you want.
Now imagine that we have a local variable holding a struct of the type you've defined here. We use this named method to generate a lambda, and let's assume for a moment that the query items is returned (instead of the method being void). We could then store that query in another instance (instead of local) variable, and iterate over that query some time later in another method. What would happen here? In essence, we would have held onto a reference to a value type that was on the stack after it is no longer in scope.
What does that mean? The answer is, we have no idea. (Please look over the link; it's kinda the crux of my argument.) The data could just happen to be the same, it could have been zeroed out, it could have been filled by entirely different objects, there is no way of knowing. C# goes to great lengths, as a language, to prevent you from doing things like this. Languages such as C or C++ don't try so hard to stop you from shooting your own foot.
Now, in this particular case, it's possible that you aren't going to use the lambda outside of the scope of what this refers to, but the compiler doesn't know that, and if it lets you create the lambda it has no way of determining whether or not you expose it in a way that could result in it outliving this, so the only way to prevent this problem is to disallow some cases that aren't actually problematic.
If you have a class and a constructor which takes in an object as an input param - is that object passed by reference or is it passed by value?
And is it true to assume that for class methods, object input parameters are passed by value by default unless the ref keyword is used?
What about the out keyword? Does this still mean that it is passed by reference?
If you have a class and a constructor which takes in an object as an input param - is that object passed by reference or is it passed by value?
All parameters are passed by value in C# unless the parameter is marked with out or ref.
This is a huge source of confusion. I'll state things a little more explicitly.
All parameters have their value copied unless the parameter is marked with out or ref. For value types, this means that a copy of the value being passed is made. For reference types this means that a copy of the reference is made. For this last point, the value of a reference type is the reference.
And is it true to assume that for class methods, object input parameters are passed by value by default unless the ref keyword is used?
Again, all parameters are passed by value in C# unless the parameter is marked with out or ref. For a parameter marked with ref, a reference to the parameter is passed to the method and now you can think of that parameter as an alias. Thus, when you say
void M(ref int m) { m = 10; }
int n = 123;
M(ref n);
you can think of m in M as an alias for n. That is m and n are just two different names for the same storage location.
This is very different from
string s = "Hello, world!";
string t = s;
In this case, s and t are not aliases for the same storage location. These are two different variables that happen to refer to the same object.
What about the out keyword? Does this still mean that it is passed by reference?
Yes. The only difference between ref and out is that ref requires the variable to be initialized before being passed, whereas out does not but requires the method to assign it before returning.
The reference to the object will be passed by value.
.NET has reference types and value types - classes are all reference types and structs are value types. You can pass either one by value or by reference.
By default, everything is passed by value, the difference being that with reference types the reference is passed in.
The ref and out keywords will cause the parameters to be passed by reference - in the case of value types that means you can now make changes that will be reflected in the passed in object. With reference types that means you can now change the object that the reference refers to.
An object is always passed by reference to the actual object. So no copy (aka "by value") is being performed of the object.
Just, as Oded notes, the reference to the object is being copied.
The default passing mechanism for parameters in .Net is by value. This is true for both reference and value types. In the reference case though it's the actual reference which is passed by value, not the object.
When the ref or out keyword is used then the value is indeed passed by reference (once again true for both value and reference types). At a CLR level there is actually no difference between ref and out. The out keyword is a C# notion which is expressed by marking a ref param (as far as I know with an [out] flag on the parameter in metadata, not a modopt).
An important thing to understand with reference types is that almost anything one does with a variable of reference type is implicitly done to the thing being referred to by the reference, not to the reference itself. I find it helpful to think of reference types as instance ID's. To use an analogy, think of instances as cars, and reference types as slips of paper with automotive vehicle identification numbers (VINs) written on them. If I copy a VIN onto a slip of paper, hand it to someone in the shop, and say "paint this blue", what I really mean is "find the car with this VIN and paint it blue", not "paint this slip of paper blue". What I'm handing the person is not a car, but merely a VIN; what I'm telling him to paint blue, however, is the car that's sitting in the shop, and not the piece of paper (nor anything else) that I'm actually handing him. Such usage would be passing by value.
Suppose, however, what I wanted was for someone to buy a car and give me the VIN. I might write out on some slip of papers the make, model, color, etc. that I want, and also hand the person a slip of paper on which to write the VIN. In that case, I would want to get back the slip of paper with the new VIN on it. Such usage would be passing the VIN by reference, since the person would be writing the VIN on a piece of paper I supplied and giving it back to me.
@Supercat: It is rather interesting. Perhaps the confusion lies in understanding why you would want to pass a reference type by reference!
Extending the analogy for ref types only (I think value types are easier to understand):
One may write out the same VIN (vehicle identification number) on multiple slips of paper, hence all the slips in your hand refer to the same car. What if you write 'paint blue' on one slip and 'paint red' on another? Well, this demonstrates that the slips can only contain the VIN (object address) and all other information is stored in the car itself.
If you are interested in getting the car painted at the workshop, you don't have to send a slip, you can just tell them the VIN. That's all they need to know, the value: pass by val. You still keep your slip and they can't change what's written on it, hence it is safer. They write down the VIN on their own slip: a copy of the reference.
On the other hand, you may ask a colleague to get the slip for the last washed car from the shelf, go to the forecourt, choose a car that is not the last washed car, and return the slip with the new VIN of the washed car written on it: by ref. The actual slip is used, and you have referred to the address of the actual slip (the shelf) so that he gets the slip from there. He had better not lose it or get it wet... less safe.
In all this palaver, no one is talking about copying, taking or moving the actual car, as this is NOT referring to value types.
It's passed by value, if you intended to pass it by reference, you would use the ref parameter modifier. Not sure though whether this is allowed in constructors...
Do they? Or to speed up my program should I pass them by reference?
The reference is passed by value.
Arrays in .NET are objects on the heap, so you have a reference. That reference is passed by value, meaning that changes to the contents of the array will be seen by the caller, but reassigning the array won't:
void Foo(int[] data) {
data[0] = 1; // caller sees this
}
void Bar(int[] data) {
data = new int[20]; // but not this
}
If you add the ref modifier, the reference is passed by reference - and the caller would see either change above.
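Putting the three cases side by side as a runnable sketch. Foo and Bar are the methods from above; BarRef is an added variant with the ref modifier, and the Run wrapper is only there to collect the results.

```csharp
using System;

static class ArrayDemo
{
    static void Foo(int[] data)
    {
        data[0] = 1;           // caller sees this
    }

    static void Bar(int[] data)
    {
        data = new int[20];    // caller does not see this
    }

    static void BarRef(ref int[] data)
    {
        data = new int[20];    // caller sees this, because of ref
    }

    public static (int first, int lengthAfterBar, int lengthAfterBarRef) Run()
    {
        int[] numbers = new int[3];

        Foo(numbers);
        int first = numbers[0];

        Bar(numbers);
        int lengthAfterBar = numbers.Length;

        BarRef(ref numbers);
        int lengthAfterBarRef = numbers.Length;

        return (first, lengthAfterBar, lengthAfterBarRef);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // (1, 3, 20)
    }
}
```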
They are passed by value (as are all parameters that are neither ref nor out), but the value is a reference to the object, so they are effectively passed by reference.
Yes, they are passed by reference by default in C#. All objects in C# are, except for value types. To be a little bit more precise, they're passed "by reference by value"; that is, the value of the variable that you see in your methods is a reference to the original object passed. This is a small semantic point, but one that can sometimes be important.
(1) No one explicitly answered the OP's question, so here goes:
No. Explicitly passing the array or list as a reference will not affect performance.
What the OP feared might be happening is avoided because the function is already operating on a reference (which was passed by value). The top answer nicely explains what this means, giving an Ikea way to answer the original question.
(2) Good advice for everyone:
Read Eric Lippert's advice on when/how to approach optimization. Premature optimization is the root of much evil.
(3) Important, not already mentioned:
Use cases that require passing anything - values or references - by reference are rare.
Doing so gives you extra ways to shoot yourself in the foot, which is why C# makes you use the "ref" keyword on the method call as well. Older (pre-Java) languages only made you indicate pass-by-reference on the method declaration. And this invited no end of problems. Java touts the fact that it doesn't let you do it at all.