Generic Methods that overwrite parameters without reference? - c#

Okay, this might be a really stupid question but I will risk my rep anyway. I'm pretty new to programming so take it easy, will ya ;)
So I just got into TCP when I encountered something I don't quite understand.
To be specific:
int length = Socket.Receive(MyByteArray);
To my understanding this method returns the length of the data being received and writes the received data into my byte array. So how does it write into my byte array without me telling it to? After some research I learned that you can use references to do this kind of thing, but this method doesn't require "ref MyByteArray", which leaves me puzzled. Is this a different kind of method, or is it down to what is going on inside the method (duh)?
Thanks in advance you utterly awesome person.

Passing a reference type into a method can sometimes be an unintuitive thing for developers. Consider these two pieces of code (neither of which uses the ref keyword):
void Method1(SomeType myObj)
{
    myObj = new SomeType();
}
void Method2(SomeType myObj)
{
    myObj.SomeProperty = 1;
}
The first method has no side effects. The reference type was passed into the method, which for lack of a better term is basically a "pointer" (itself passed by value) to the object in memory. If you set the variable to a new object in memory, the original remains unchanged. There are then two objects in memory. (Though the new one will go away once the method ends, because nothing uses it.)
The second method, however, does have a side-effect. It uses the same reference to the object in memory, but it modifies the object itself. So anything which examines the object after the method is called will see that modification.
Presumably, Socket.Receive() does something similar to the second method above. It uses the reference to modify the object.
To illustrate how the ref keyword would change this:
void Method3(ref SomeType myObj)
{
    myObj = new SomeType();
}
In this scenario, there is also a side-effect. Any code which calls the method and then, afterward, examines the object it sent to the method will then see that the object has been replaced with a new one. In this case there wasn't a second "pointer" to the same location in memory. The method used the actual pointer that the calling code used.
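The three methods above can be exercised side by side. The following is a minimal sketch, not the actual Socket code: SomeType is filled in with a single property, and FillBuffer is a hypothetical stand-in for Socket.Receive that only shows how a method can write into the caller's array without ref.

```csharp
using System;

public class SomeType
{
    public int SomeProperty { get; set; }
}

public static class PassingDemo
{
    // Reassigns only the method's own copy of the reference.
    public static void Method1(SomeType myObj) { myObj = new SomeType { SomeProperty = -1 }; }

    // Mutates the object that the caller's variable also refers to.
    public static void Method2(SomeType myObj) { myObj.SomeProperty = 1; }

    // Reassigns the caller's variable itself.
    public static void Method3(ref SomeType myObj) { myObj = new SomeType { SomeProperty = 99 }; }

    // Hypothetical stand-in for Socket.Receive: fills the caller's array, returns the count.
    public static int FillBuffer(byte[] buffer)
    {
        int count = Math.Min(3, buffer.Length);
        for (int i = 0; i < count; i++) buffer[i] = (byte)(i + 1);
        return count;
    }

    public static void Main()
    {
        var original = new SomeType();
        var obj = original;

        Method1(obj);
        Console.WriteLine(ReferenceEquals(obj, original)); // True: caller still sees the old object
        Console.WriteLine(obj.SomeProperty);               // 0

        Method2(obj);
        Console.WriteLine(obj.SomeProperty);               // 1

        Method3(ref obj);
        Console.WriteLine(ReferenceEquals(obj, original)); // False: caller's variable was replaced

        byte[] data = new byte[8];
        int length = FillBuffer(data);
        Console.WriteLine($"{length}: {data[0]}, {data[1]}, {data[2]}"); // 3: 1, 2, 3
    }
}
```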

Related

C# How to set or copy a new object to a global variable which references a parameter via constructor [duplicate]

I have an object that is the in-memory state of my program, and I also have some other worker functions that I pass the object to in order to modify the state. I have been passing it by ref to the worker functions. However, I came across the following function.
byte[] received_s = new byte[2048];
IPEndPoint tmpIpEndPoint = new IPEndPoint(IPAddress.Any, UdpPort_msg);
EndPoint remoteEP = (tmpIpEndPoint);
int sz = soUdp_msg.ReceiveFrom(received_s, ref remoteEP);
It confuses me because both received_s and remoteEP are returning stuff from the function. Why does remoteEP need a ref and received_s does not?
I am also a c programmer so I am having a problem getting pointers out of my head.
Edit:
It looks like objects in C# are pointers to the object under the hood. So when you pass an object to a function, you can modify the object's contents through the pointer, and the only thing passed to the function is the pointer to the object, so the object itself is not copied. You use ref or out if you want to be able to switch out or create a new object in the function, which is like a double pointer.
Short answer: read my article on argument passing.
Long answer: when a reference type parameter is passed by value, only the reference is passed, not a copy of the object. This is like passing a pointer (by value) in C or C++. Changes to the value of the parameter itself won't be seen by the caller, but changes in the object which the reference points to will be seen.
When a parameter (of any kind) is passed by reference, that means that any changes to the parameter are seen by the caller - changes to the parameter are changes to the variable.
The article explains all of this in more detail, of course :)
Useful answer: you almost never need to use ref/out. It's basically a way of getting another return value, and should usually be avoided precisely because it means the method's probably trying to do too much. That's not always the case (TryParse etc are the canonical examples of reasonable use of out) but using ref/out should be a relative rarity.
Think of a non-ref parameter as being a pointer, and a ref parameter as a double pointer. This helped me the most.
You should almost never pass values by ref. I suspect that if it wasn't for interop concerns, the .NET team would never have included it in the original specification. The OO way of dealing with most problems that ref parameters solve is to:
For multiple return values
Create structs that represent the multiple return values
For primitives that change in a method as the result of the method call (method has side-effects on primitive parameters)
Implement the method in an object as an instance method and manipulate the object's state (not the parameters) as part of the method call
Use the multiple return value solution and merge the return values to your state
Create an object that contains state that can be manipulated by a method and pass that object as the parameter, and not the primitives themselves.
You could probably write an entire C# app and never pass any objects/structs by ref.
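As a rough illustration of the "struct for multiple return values" suggestion, here is a small sketch that contrasts an out parameter with a result struct. DivisionResult, DivideWithOut and Divide are hypothetical names invented for this example.

```csharp
using System;

// Hypothetical result type that bundles two return values.
public readonly struct DivisionResult
{
    public DivisionResult(int quotient, int remainder)
    {
        Quotient = quotient;
        Remainder = remainder;
    }

    public int Quotient { get; }
    public int Remainder { get; }
}

public static class MultipleReturnDemo
{
    // out style: the second result comes back through a parameter.
    public static int DivideWithOut(int dividend, int divisor, out int remainder)
    {
        remainder = dividend % divisor;
        return dividend / divisor;
    }

    // struct style: both results come back as one return value.
    public static DivisionResult Divide(int dividend, int divisor) =>
        new DivisionResult(dividend / divisor, dividend % divisor);

    public static void Main()
    {
        int q = DivideWithOut(17, 5, out int r);
        Console.WriteLine($"{q} r {r}");                               // 3 r 2

        DivisionResult result = Divide(17, 5);
        Console.WriteLine($"{result.Quotient} r {result.Remainder}"); // 3 r 2
    }
}
```

The struct version keeps the method signature free of ref/out at the cost of one extra type.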
I had a professor who told me this:
The only place you'd use refs is where you either:
Want to pass a large object (ie, the objects/struct has
objects/structs inside it to multiple levels) and copying it would
be expensive and
You are calling a Framework, Windows API or other API that requires
it.
Don't do it just because you can. You can get bit in the ass by some
nasty bugs if you start changing the values in a param and aren't
paying attention.
I agree with his advice, and in my five plus years since school, I've never had a need for it outside of calling the Framework or Windows API.
Since received_s is an array, you're passing a pointer to that array. The function manipulates that existing data in place, not changing the underlying location or pointer. The ref keyword signifies that you're passing the actual pointer to the location and updating that pointer in the outside function, so the value in the outside function will change.
E.g. the byte array is a pointer to the same memory before and after, the memory has just been updated.
The Endpoint reference is actually updating the pointer to the Endpoint in the outside function to a new instance generated inside the function.
Think of a ref as meaning you are passing a pointer by reference. Not using a ref means you are passing a pointer by value.
Better yet, ignore what I just said (it's probably misleading, especially with value types) and read This MSDN page.
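To make the received_s versus remoteEP difference concrete, here is a sketch with a made-up FakeReceiveFrom method. It does no networking and is not how Socket.ReceiveFrom is implemented; it only assumes, as the answer above describes, that the buffer is modified in place while the endpoint variable is replaced with a new instance.

```csharp
using System;
using System.Net;

public static class ReceiveFromDemo
{
    // Hypothetical stand-in for Socket.ReceiveFrom.
    public static int FakeReceiveFrom(byte[] buffer, ref EndPoint remoteEP)
    {
        // Same array as the caller's, new contents: no ref needed.
        buffer[0] = 42;

        // The caller's variable is pointed at a different object: this needs ref.
        remoteEP = new IPEndPoint(IPAddress.Loopback, 9000);
        return 1;
    }

    public static void Main()
    {
        byte[] received = new byte[2048];
        EndPoint remoteEP = new IPEndPoint(IPAddress.Any, 0);
        EndPoint before = remoteEP;

        int size = FakeReceiveFrom(received, ref remoteEP);

        Console.WriteLine(size);                              // 1
        Console.WriteLine(received[0]);                       // 42
        Console.WriteLine(ReferenceEquals(before, remoteEP)); // False
        Console.WriteLine(remoteEP);                          // 127.0.0.1:9000
    }
}
```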
While I agree with Jon Skeet's answer overall and some of the other answers, there is a use case for using ref, and that is for tightening up performance optimizations. It has been observed during performance profiling that setting the return value of a method has slight performance implications, whereas using ref as an argument whereby the return value is populated into that parameter results in this slight bottleneck being removed.
This is really only useful where optimization efforts are taken to extreme levels, sacrificing readability and perhaps testability and maintainability for saving milliseconds or perhaps split-milliseconds.
My understanding is that all objects derived from the Object class are passed as pointers, whereas ordinary types (int, struct) are not passed as pointers and require ref. I am not sure about string (is it ultimately derived from the Object class?)
Ground zero rule first, Primitives are passed by value(stack) and Non-Primitive by reference(Heap) in the context of TYPES involved.
Parameters involved are passed by Value by default.
Good post which explain things in details.
http://yoda.arachsys.com/csharp/parameters.html
Student myStudent = new Student { Name = "A", RollNo = 1 };
ChangeName(myStudent);
static void ChangeName(Student s1)
{
    s1.Name = "Z"; // myStudent.Name will also change from A to Z,
                   // as s1 and myStudent both refer to the same heap memory
                   // (Student being a non-primitive type)
}
ChangeNameVersion2(ref myStudent);
static void ChangeNameVersion2(ref Student s1)
{
    s1.Name = "Z"; // no difference (same as ChangeName)
}
static void ChangeNameVersion3(ref Student s1)
{
    s1 = new Student { Name = "Champ" };
    // myStudent will now also point to this new object in new memory;
    // the previous myStudent memory will be released once nothing points to it
}
We can say(with warning) Non-primitive types are nothing but Pointers
And when we pass them by ref we can say we are passing Double Pointer
I know this question is old, but I would like to briefly answer it for any curious new developers.
Try as much as you can to avoid using ref/out; it works against writing clean code and tends to lower code quality.
Passing types by reference (using out or ref) requires experience with pointers, understanding how value types and reference types differ and handling methods that have multiple return values. Also, the difference between out and ref parameters is not widely understood.
please check
https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1045

Instantiate an object in an IF condition

I have a class that has a method CheckValues(someVar) which returns true or false after taking in a parameter which is being checked for null or empty first. This class's method is called in a WCF service running off IIS and also in a multi-threaded application. Which of the two ways below is better?
1:
MyClass obj = new MyClass();
if( !String.IsNullOrEmpty(someVar) && obj.CheckValues(someVar))
{
...
}
2:
if( !String.IsNullOrEmpty(someVar) && new MyClass().CheckValues(someVar))
{
...
}
The first method is pretty conventional. The second gives me the benefit of creating an object only if the variable someVar has some value, not otherwise.
Is there any problem with the second approach, or is it bad practice? Will it matter if this variable is either a value type or a reference type?
There isn't anything inherently bad about the second code snippet. In fact, as you have noted, it would be a small optimization over the first.
That said, it smells bad. Why are you creating an object for exactly one use, to call a method with no parameters? Should that method be static, as it doesn't need state? Should you just reuse an existing object?
The main problem I see is that the existence of that logic indicates a problem somewhere else in your design.
To your comments:
The existence of a parameter doesn't really change the smell; as you are still making a one-time use object for a single function call.
Given your second comment, the design could be reasonable. Make sure that object does nothing but generate the state necessary to handle the check though. If it is some large object, you just introduced a bunch of overhead for nothing.
You can make CheckValues() a static method, and then you won't have to create an object of the class to call it.
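A small sketch of both points: the second form only constructs the object when the left side of && is true, and a static method avoids the object altogether. The length check and the InstancesCreated counter are invented for illustration (the counter is not thread-safe and exists only to make the construction visible).

```csharp
using System;

public class MyClass
{
    // Demo-only counter showing how many instances were constructed.
    public static int InstancesCreated;

    public MyClass() { InstancesCreated++; }

    // Placeholder rule; the real check is not shown in the question.
    public bool CheckValues(string value) => value.Length > 3;

    // Static variant: no instance required.
    public static bool CheckValuesStatic(string value) => value.Length > 3;
}

public static class ShortCircuitDemo
{
    // && short-circuits, so new MyClass() runs only when someVar is non-empty.
    public static bool Check(string someVar) =>
        !string.IsNullOrEmpty(someVar) && new MyClass().CheckValues(someVar);

    public static void Main()
    {
        Console.WriteLine(Check(""));                 // False
        Console.WriteLine(MyClass.InstancesCreated);  // 0: nothing was constructed

        Console.WriteLine(Check("hello"));            // True
        Console.WriteLine(MyClass.InstancesCreated);  // 1

        Console.WriteLine(MyClass.CheckValuesStatic("hello")); // True
        Console.WriteLine(MyClass.InstancesCreated);           // still 1
    }
}
```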

Passing objects by reference vs value

I just want to check my understanding of C#'s ways of handling things, before I delve too deeply into designing my classes. My current understanding is that:
Struct is a value type, meaning it actually contains the data members defined within.
Class is a reference type, meaning it contains references to the data members defined within.
A method signature passes parameters by value, which means a copy of the value is passed to the inside of the method, making it expensive for large arrays and data structures.
A method signature that defines a parameter with the ref or out keywords will instead pass a parameter by reference, which means a pointer to the object is provided instead.
What I don't understand is what happens when I invoke a method, what actually happens. Does new() get invoked? Does it just automagically copy the data? Or does it actually just point to the original object? And how does using ref and out affect this?
What I don't understand is what happens when I invoke a method, what actually happens. Does new() get invoked? Does it just automagically copy the data? Or does it actually just point to the original object? And how does using ref and out affect this?
The short answer:
The empty constructor will not be called automatically, and it actually just points to the original object.
using ref and out does not affect this.
The long answer:
I think it would be easier to understand how C# handles passing arguments to a function.
Actually everything is being passed by value
Really?! Everything by value?
Yes! Everything!
Of course there must be some kind of a difference between passing classes and simple typed objects, such as an Integer, otherwise, it would be a huge step back performance wise.
Well the thing is, that behind the scenes when you pass a class instance of an object to a function, what is really being passed to the function is the pointer to the class. the pointer, of course, can be passed by value without causing performance issues.
Actually, everything is being passed by value; it's just that when
you're "passing an object", you're actually passing a reference to that
object (and you're passing that reference by value).
once we are in the function, given the argument pointer, we can relate to the object passed by reference.
You don't actually need to do anything for this, you can relate directly to the instance passed as the argument (as said before, this whole process is being done behind the scenes).
After understanding this, you probably understand that the empty constructor will not be called automatically, and it actually just points to the original object.
EDITED:
As to out and ref, they allow functions to change the value of an argument and have that change persist outside of the scope of the function.
In a nutshell, using the ref keyword for value types will act as follows:
int i = 42;
foo(ref i);
will translate in C++ to:
int i = 42;
int* ptrI = &i;
foo(ptrI);
while omitting the ref will simply translate to:
int i = 42;
foo(i);
using those keywords for reference type objects, will allow you to reallocate memory to the passed argument, and make the reallocation persist outside of the scope of the function (for more details please refer to the MSDN page)
Side note:
The difference between ref and out is that out requires the called function to assign a value to the out argument, while ref does not have this restriction. With ref, you have to assign some initial value yourself before the call; thus, ref implies that the initial value of the argument is important to the function and might affect its behaviour.
Passing a value-type variable to a method means passing a copy of the variable to the method. Any changes to the parameter that take place inside the method have no effect on the original data stored in the variable.
If you want the called method to change the value of the parameter, you have to pass it by reference, using the ref or out keyword.
When you pass a reference-type parameter by value, it is possible to change the data pointed to by the reference, such as the value of a class member. However, you cannot change the value of the reference itself; that is, you cannot use the same reference to allocate memory for a new class and have it persist outside the block. To do that, pass the parameter using the ref (or out) keyword.
Reference: Passing Parameters(C#)
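For value types, the rules above can be checked with a few lines. This is a minimal sketch; the method names are made up for the example.

```csharp
using System;

public static class ValueTypeDemo
{
    // Receives a copy; the caller's variable is unaffected.
    public static void IncrementCopy(int n) { n++; }

    // Receives the caller's variable itself.
    public static void IncrementInPlace(ref int n) { n++; }

    // out: the method must assign before returning; the caller need not initialize.
    public static void Initialize(out int n) { n = 10; }

    public static void Main()
    {
        int i = 42;
        IncrementCopy(i);
        Console.WriteLine(i); // 42

        IncrementInPlace(ref i);
        Console.WriteLine(i); // 43

        int j;                // unassigned is fine for out, but would not compile for ref
        Initialize(out j);
        Console.WriteLine(j); // 10
    }
}
```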
Tragically, there is no way to pass an object by value in C# or VB.NET. I suggest instead you pass, for example, New Class1(Object1) where Object1 is an instance of Class1. You will have to write your own New method to do this but at least you then have an easy pass-by-value capability for Class1.
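In C#, the "write your own New method" idea corresponds to a copy constructor. A minimal sketch, assuming Class1 has a single Value property (a real class would need to decide how deeply to copy nested objects):

```csharp
using System;

public class Class1
{
    public int Value { get; set; }

    public Class1() { }

    // Copy constructor: builds an independent object with the same data.
    public Class1(Class1 other) { Value = other.Value; }
}

public static class CopyDemo
{
    public static void Mutate(Class1 c) { c.Value = 100; }

    public static void Main()
    {
        var object1 = new Class1 { Value = 1 };

        Mutate(new Class1(object1));      // the method only sees a copy
        Console.WriteLine(object1.Value); // 1

        Mutate(object1);                  // the method sees the original
        Console.WriteLine(object1.Value); // 100
    }
}
```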

What happens when you create an instance of an object containing no state in C#?

I think I am OK at algorithmic programming, if that is the right term? I used to play with Turbo Pascal and 8086 assembly language back in the 1980s as a hobby. But only very small projects, and I haven't really done any programming in the 20ish years since then. So I am struggling for understanding like a drowning swimmer.
So maybe this is a very naive question or I'm just making no sense at all, but say I have an object kind of like this:
class Something : IDoer
{
    public void Do(ISomethingElse x)
    {
        x.DoWhatEverYouWant(42);
    }
}
And then I do
var Thing1 = new Something();
var Thing2 = new Something();
Thing1.Do(blah);
Thing2.Do(blah);
Does Thing1 = Thing2? Does "new Something()" create anything? Or is it not much different from having a static class, except I can pass it around and swap it out etc.
Is the "Do" procedure in the same location in memory for both the Thing1(blah) and Thing2(blah) objects? I mean when executing it, does it mean there are two Something.Do procedures or just one?
They are two separate objects; they just don't have state.
Consider this code:
var obj1 = new object();
var obj2 = new object();
Console.WriteLine(object.ReferenceEquals(obj1, obj2));
It will output False.
Just because an object has no state doesn't mean it doesn't get allocated just like any other object. It just takes very little space (just like an object).
In response to the last part of your question: there is only one Do method. Methods are not stored per instance but rather per class. If you think about it, it would be extremely wasteful to store them per instance. Every method call to Do on a Something object is really the same set of instructions; all that differs between calls from different objects is the state of the underlying object (if the Something class had any state to begin with, that is).
What this means is that instance methods on class objects are really behaviorally the same as static methods.
You might think of it as if all instance-level methods were secretly translated as follows (I'm not saying this is strictly true, just that you could think of it this way and it does kind of make sense):
// appears to be instance-specific, so you might think
// it would be stored for every instance
public void Do() {
    Do(this);
}
// is clearly static, so it is much clearer it only needs
// to be stored in one place
private static void Do(Something instance) {
    // do whatever Do does
}
Interesting side note: the above hypothetical "translation" explains pretty much exactly how extension methods work: they are static methods, but by qualifying their first parameter with the this keyword, they suddenly look like instance methods.
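The extension-method side note can be shown directly. In this sketch, Describe is a hypothetical method added to an empty Something class; the same static method is called once with instance syntax and once as an ordinary static call.

```csharp
using System;

public class Something { }

public static class SomethingExtensions
{
    // A static method; the `this` modifier on the first parameter
    // allows it to be called as if it were an instance method.
    public static string Describe(this Something instance) =>
        "a " + instance.GetType().Name;
}

public static class ExtensionDemo
{
    public static void Main()
    {
        var thing = new Something();

        Console.WriteLine(thing.Describe());                    // a Something
        Console.WriteLine(SomethingExtensions.Describe(thing)); // a Something
    }
}
```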
There are most definitely two different objects in memory. Each object will consume 8 bytes on the heap (at least on 32-bit systems); 4 for the syncblock and 4 for the type handle (which includes the method table). Other than the system-defined state data there is no other user-defined state data in your case.
There is a single instance of the code for the Something.Do method. The type handle pointer that each object holds is how the CLR locates the different methods for the class. So even though there are two different objects in memory they both execute the same code. Since Something.Do was declared as an instance method it will have a this pointer passed to it internally so that the code can modify the correct instance members depending on which object was invoking the method. In your case the Something class has no instance members (and thus no user-defined state) and so this is quite irrelevant, but still happens nevertheless.
No they are not the same. They are two separate instances of the class Something. They happen to be identically instantiated, that is all.
You would create 2 "empty" objects, there would be a small allocation on the heap for each object.
But the "Do" method is always in the same place, that has nothing to do with the absence of state. Code is not stored 'in' a class/object. There is only 1 piece of code corresponding to Do() and it has a 'hidden' parameter this that points to the instance of Something it was called on.
Conceptually, Thing1 and Thing2 are different objects, but there is only one Something.Do procedure.
The .NET runtime allocates a little bit of memory to each of the objects you create - one chunk to Thing1 and another to Thing2. The purpose of this chunk of memory is to store (1) the state of the object and (2) the address of any procedures that belong to the object. I know you don't have any state, but the runtime doesn't care - it still keeps two separate references to two separate chunks of memory.
Now, your "Do" method is the same for both Thing1 and Thing2, so the runtime only keeps one version of the procedure in memory.
The memory allocated to Thing1 includes the address of the Do method. When you invoke the Do method on Thing1, it looks up the address of the Do method for Thing1 and runs the method. The same thing happens with the other object, Thing2. Although the objects are different, the same Do method is called for both Thing1 and Thing2.
What this boils down to is that Thing1 and Thing2 are different, in that the names "Thing1" and "Thing2" refer to different areas of memory. The contents of this memory are the same in both cases - a single address that points to the "Do" method.
Well, that's the theory, anyway. Under the hood, there might be some kind of optimisation going on (See http://www.wrox.com/WileyCDA/Section/CLR-Method-Call-Internals.id-291453.html if you're interested), but for most practical purposes, what I have said is the way things work.
Thing1 != Thing2
These are two different objects in memory.
The Do method code is in the same place for both objects. There is no need to store two different copies of the method.
Each reference type (Thing1, Thing2) is pointing to a different physical address in main memory, as they have been instantiated separately. The thing pointed to in memory is the bytes used by the object, whether it has a state or not (it always has a state, but whether it has a declared/initialised state).
If you assigned a reference type to another reference type (Thing2 = Thing1;) then it would be the same portion of memory used by two different reference types, and no new instantiation would take place.
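These claims are easy to check. The sketch below uses a simplified Something with a parameterless Do (an assumption, to keep the example self-contained) and compares the reflected method behind two delegates as a rough proxy for "the same code".

```csharp
using System;

public class Something
{
    public void Do() { }
}

public static class InstanceDemo
{
    // Compares the method metadata behind two instance delegates.
    public static bool SameMethod(Something a, Something b)
    {
        Action doA = a.Do;
        Action doB = b.Do;
        return doA.Method.Equals(doB.Method);
    }

    public static void Main()
    {
        var thing1 = new Something();
        var thing2 = new Something();

        Console.WriteLine(ReferenceEquals(thing1, thing2)); // False: two objects
        Console.WriteLine(SameMethod(thing1, thing2));      // True: one Do method

        thing2 = thing1;                                    // no new instance is created
        Console.WriteLine(ReferenceEquals(thing1, thing2)); // True: both refer to one object
    }
}
```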
A good way to think of new with a constructor is that you are really just calling the method inside your class whose sole responsibility is to produce a new instance of an object that is cookie-cut from your class.
So now you can have multiple instances of the same class running around at runtime, handling all sorts of situations :D
As far as the CLR is concerned, you are in fact getting 2 separate instances in memory, each with its own pointer to it. It is very similar to any other OOP language, but we do not have to actually interact with the pointers; they are used with the same syntax as a non-reference type, so we don't have to worry about them!
(there are pointers in C# if you wish to whip out your [unsafe] keyword!)

Is new-ing objects obsolete?

Ok, so here's the question... is the new keyword obsolete?
Consider in C# (and java, I believe) there are strict rules for types. Classes are reference types and can only be created on the heap. POD types are created on the stack; if you want to allocate them on the heap you have to box them in an object type. In C#, structs are the exception, they can be created on the stack or heap.
Given these rules, does it make sense that we still have to use the new keyword? Wouldn't it make sense for the language to use the proper allocation strategy based on the type?
For example, we currently have to write:
SomeClassType x = new SomeClassType();
instead of
SomeClassType x = SomeClassType();
or even just
SomeClassType x;
The compiler would, based on that the type being created is a reference type, go ahead and allocate the memory for x on the heap.
This applies to other languages like ruby, php, et al. C/C++ allow the programmer more control over where objects are created, so it has good reason to require the new keyword.
Is new just a holdover from the 60's and our C based heritage?
SomeClassType x = SomeClassType();
in this case SomeClassType() might be a method located somewhere else, how would the compiler know whether to call this method or create a new class.
SomeClassType x;
This is not very useful, most people declare their variables like this and sometimes populate them later when they need to. So it wouldn't be useful to create an instance in memory each time you declare a variable.
Your third method will not work, since sometimes we want to define a object of one type and assign it to a variable of another type. For instance:
Stream strm = new NetworkStream();
I want a stream type (perhaps to pass on somewhere), but internally I want a NetworkStream type.
Also many times I create a new object while calling a method:
myobj.Foo(new NetworkStream());
doing that this way:
myobj.Foo(NetworkStream());
is very confusing. Am I creating an object, or calling a method when I say NetworkStream()?
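A runnable version of the same point, using MemoryStream instead of NetworkStream (an assumption made here because NetworkStream needs a connected Socket to construct):

```csharp
using System;
using System.IO;

public static class StreamDemo
{
    // Accepts any Stream; the caller chooses the concrete type with new.
    public static long Measure(Stream stream) => stream.Length;

    public static void Main()
    {
        // Declared type and constructed type differ, so the variable
        // declaration alone could not tell the compiler what to create.
        Stream strm = new MemoryStream(new byte[] { 1, 2, 3 });

        Console.WriteLine(strm.GetType().Name);         // MemoryStream
        Console.WriteLine(Measure(strm));               // 3
        Console.WriteLine(Measure(new MemoryStream())); // 0
    }
}
```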
If you could just write SomeClassType x; and have it automatically initialized, that wouldn't allow for constructors with any parameters. Not every SomeClassType will have a parameterless constructor; how would the compiler know what arguments to supply?
public class Repository
{
private IDbConnection connection;
public Repository(IDbConnection connection)
{
if (connection == null)
{
throw new ArgumentNullException("connection");
}
this.connection = connection;
}
}
How would you instantiate this object with just Repository rep;? It requires a dependent object to function properly.
Not to mention, you might want to write code like so:
Dictionary<int, SomeClass> instances = GetInstancesFromSomewhere();
SomeClass instance;
if (instances.TryGetValue(1, out instance))
{
// Do something
}
Would you really want it auto-initializing for you?
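The TryGetValue fragment can be fleshed out into a complete example. SomeClass and the dictionary contents below are placeholders; the point is that the declaration creates no object, and the variable is only filled in by the out argument.

```csharp
using System;
using System.Collections.Generic;

public class SomeClass
{
    public string Name { get; set; } = "";
}

public static class TryGetValueDemo
{
    public static string Lookup(Dictionary<int, SomeClass> instances, int key)
    {
        SomeClass instance; // declared only; no object is created here
        if (instances.TryGetValue(key, out instance))
        {
            return instance.Name;
        }
        return "(missing)";
    }

    public static void Main()
    {
        var instances = new Dictionary<int, SomeClass>
        {
            [1] = new SomeClass { Name = "first" }
        };

        Console.WriteLine(Lookup(instances, 1)); // first
        Console.WriteLine(Lookup(instances, 2)); // (missing)
    }
}
```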
If you just wrote SomeClassType x = SomeClassType() then this makes no distinction between a constructor and a method in scope.
More generally:
I think there's a fundamental misunderstanding of what the new keyword is for. The fact that value types are allocated on the stack and "reference" types are allocated on the heap is an implementation detail. The new keyword is part of the specification. As a programmer, you don't care whether or not it's allocated on the heap or stack (most of the time), but you do need to specify how the object gets initialized.
There are other valid types of initializers too, such as:
int[] values = { 1, 2, 3, 4 };
Voilà, an initialization with no new. In this case the compiler was smart enough to figure it out for you because you provided a literal expression that defines the entire object.
So I guess my "answer" is, don't worry about where the object exists memory-wise; use the new keyword as it's intended, as an object initializer for objects that require initialization.
For starters:
SomeClassType x;
is not initialized so no memory should be allocated.
Other than that, how do you avoid problems where there is a method with the same name as the class.
Say you write some code:
int World() { return 3; }
int hello = World();
and everything is nice and jolly.
Now you write a new Class later:
class World
{
...
}
Suddenly your int hello = World() line is ambiguous.
For performance reasons, this might be a bad idea. For instance, if you wanted to have x be a reference for an object that's already been created, it would be a waste of memory and processor time to create a new object then immediately dispose of it.
Wouldn't it make sense for the language to use the proper allocation strategy based on the type?
That's exactly what the C# compiler/runtime already does. The new keyword is just the syntax for constructing an object in whatever way makes sense for that object.
Removing the new keyword would make it less obvious that a constructor is being called. For a similar example, consider out parameters:
myDictionary.TryGetValue(key, out val);
The compiler already knows that val is an out. If you don't say so, it complains. But it makes the code more readable to have it stated.
At least, that is the justification - in modern IDEs these things could be found and highlighted in other ways besides actual inserted text.
Is new just a holdover from the 60's and our C based heritage?
Definitely not. C doesn't have a new keyword.
I've been programming with Java for a number of years and I have never cared whether my object is on the heap or the stack. From that perspective it is all the same to me whether I type new or not.
I guess this would be more relevant for other languages.
The only thing I care about is that the class has the right operations and my objects are created properly.
BTW, I use (or try to use) the new keyword only in the factory method, so my client code looks like this anyway:
SomeClassType x = SomeClassType.newInstance();
See: Effective Java Item:1
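The factory-method style translates to C# as well. This is a sketch only: the Label property and the WithLabel factory are invented for illustration, and the answer's newInstance is spelled NewInstance to follow C# naming conventions.

```csharp
using System;

public class SomeClassType
{
    public string Label { get; }

    // Private constructor: new is used only inside the factory methods.
    private SomeClassType(string label) { Label = label; }

    public static SomeClassType NewInstance() => new SomeClassType("default");

    // Named factories can stand in for overloaded constructors.
    public static SomeClassType WithLabel(string label) => new SomeClassType(label);
}

public static class FactoryDemo
{
    public static void Main()
    {
        SomeClassType x = SomeClassType.NewInstance();
        SomeClassType y = SomeClassType.WithLabel("custom");

        Console.WriteLine(x.Label); // default
        Console.WriteLine(y.Label); // custom
    }
}
```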
If you don't have a parameterless constructor, this could get ugly.
If you have multiple constructors, this could get real ugly.
