I'm studying C# and came across a piece of code that I don't understand. I was hoping that you could clarify it for me.
CreateCustomerTask.<>c__DisplayClass0 cDisplayClass0 =
new CreateCustomerTask.<>c__DisplayClass0();
What does the <> signify? And why is there a . (dot) in front of it?
You're looking at some decompiled code - specifically, something that was generated by the compiler.
The compiler uses <> (this is an implementation detail) because, whilst it's valid for a CLR identifier to start with such characters, it's not valid in C# - so it's guaranteed that the name will not conflict with any names in the C# code.
Why the compiler has generated this code varies - it can be the implementation of a lambda, an iterator or an async block, and possibly some other reasons as well.
And, hopefully the other part of your question is also answered - there's a . in front of it for the usual reasons - to separate namespace portions, or more likely in this case, to separate the name of a nested class from the name of the enclosing class.
As others have pointed out, what you're seeing is a name generated by the compiler that is deliberately not legal C#, so that no one can ever accidentally (or deliberately!) cause a name conflict.
The name is generated because this code:
class C
{
    void M()
    {
        int x = 1;
        Func<int, int> f = y=>x+y;
    }
}
is generated by the compiler as though you had written:
class C
{
    private class DisplayClass
    {
        public int x;
        public int AnonymousMethod(int y)
        {
            return this.x + y;
        }
    }
    void M()
    {
        C.DisplayClass d = new C.DisplayClass();
        d.x = 1;
        Func<int, int> f = d.AnonymousMethod;
    }
}
Except that of course all the names are deliberately mangled, as you've discovered.
The reason that a closure class is called "DisplayClass" is a bit unfortunate: this is jargon used by the debugger team to describe a class that has special behaviours when displayed in the debugger. Obviously we do not want to display "x" as a field of an impossibly-named class when you are debugging your code; rather, you want it to look like any other local variable. There is special gear in the debugger to handle doing so for this kind of display class. It probably should have been called "ClosureClass" instead, to make it easier to read disassembly.
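One observable consequence of hoisting x into a field of the closure class is that the lambda and the enclosing method share a single variable rather than each holding a copy. A minimal sketch of that behaviour (the names ClosureDemo and Run are made up for illustration and appear nowhere in the original code):

```csharp
using System;

static class ClosureDemo
{
    public static int Run()
    {
        int x = 1;
        Func<int, int> f = y => x + y; // x is captured as a variable, not as the value 1

        x = 10;      // writes to the same storage the lambda reads from
        return f(1); // evaluates 10 + 1
    }
}
```

If the lambda had captured a copy of the value, Run would return 2; because the local lives in the compiler-generated closure object, it returns 11.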
Use this answer by Eric Lippert to decode names such as <>c__DisplayClass0. According to the table provided in the answer, you are looking at an anonymous method closure class. Do not rely on this always being true in the future; it is an implementation detail subject to change at any time.
In Scala I can write (and it will mean exactly the same thing it means in C#)
var v = 1;
v = 2;
but I can't write the following (more precisely, I can write it and the syntax is correct, but it will not compile)
val v = 1;
v = 2;
Semicolons are not necessary in Scala but can be used voluntarily, so I've included them to make the code correspond to C# more closely. val means an immutable value: a kind of variable that can be assigned only once but which, unlike C# consts, can be initialized with the result of a run-time expression; unlike C# readonly fields, can be introduced anywhere in the code that a var variable can; and, unlike C# immutable types, is immune to reference replacement, not just modification of the referenced object.
I enjoy the way C# introduces more and more functional coding candies in every new version of the language but miss this (arguably the simplest and most essential) one heavily. In the majority of cases I only assign values to my variables once, so re-assignment is usually not meant to happen, which makes it unexpected and a potential source of bugs. Might I, perhaps, just be unaware of such a feature? I don't mind such a declaration looking a little bit clumsy (some F# imports perhaps, whatever they might look like in C# code).
UPDATE: As it seems there is indeed no such feature as of now (March 2017, C# language version 7.0), and as suggested by others, I have submitted an issue at the C# language design GitHub repository.
Basically, you can't - at least as of C# 6 - with one notable exception mentioned below. Maybe something will change in future versions of C#. There were plans for "record types" for C# 7, which could open a way when paired with anonymous inline objects. However, I actually don't know what exactly gets added in C# 7.
The only normal support for anything like that is at class member scope:
class Foo
{
    public readonly int shoeSize; // readonly field
    public int ShoeSize { get { .. } } // readonly property
    public int ToeSize { get; } = 5; // readonly property with initializer
    // ..etc
}
with read-only field being settable only during object member initialization or in constructor, and getters - well, should be more or less obvious.
At the scope of normal code, any 'variable' (as opposed to 'member' above, or 'constants' you mentioned) you create is (almost) always writable, and assignment semantics will always differ depending on the kind (struct/class) of the variable's type.
EDIT: I've found one! Your note about clumsy syntax gave me an idea. The foreach iteration variable is guarded against assignment by the compiler, so you can use it with Enumerable.Repeat to quickly open a foreach scope that iterates just once:
// requires: using System; using System.Linq;
static void Main()
{
    foreach(int x in Enumerable.Repeat(5/*value for X*/, 1/*single run*/))
    {
        x=4; // <- compile time error!
        Console.WriteLine(x);
    }
}
EDIT2: another option, nicer, tuple literal that is said to be added in C#7
public static void Main()
{
    var pair1 = (42, "hello");
    System.Console.Write(pair1.Item2); // unnamed elements are accessed as Item1, Item2
    var pair2 = (code: 43, message: "world");
    System.Console.Write(pair2.message);
}
The fields/properties of a Tuple are not writable, so such a tuple literal would be quite handy, apart from the extra 'pair2' identifier you have to write (and some cost of creating and disposing of a tuple object).
However, I actually don't know whether they are mutable or not. They are called "tuples", so I immediately think of "Tuple<>", whose properties are read-only, but then this old article says:
Tuples are value types, and their elements are simply public, mutable fields.
Now, I don't (yet) have VS2017 installed.. It will take some time, maybe someone else will be able to check that sooner than me.
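For anyone who wants to check this themselves, here is a small sketch that contrasts the two kinds of tuple. It relies on C# 7 tuple literals compiling to System.ValueTuple, whose elements are public fields, whereas System.Tuple exposes get-only properties. The names TupleDemo and Run are hypothetical and only serve the illustration:

```csharp
using System;

static class TupleDemo
{
    public static int Run()
    {
        var pair = (code: 43, message: "world");
        pair.code = 44;               // compiles: tuple-literal elements are mutable fields

        var classic = Tuple.Create(43, "world");
        // classic.Item1 = 44;        // would not compile: Item1 has no setter

        return pair.code + classic.Item1; // 44 + 43
    }
}
```

So a tuple literal does not give you a write-once local: both the elements and the variable holding the tuple can be reassigned.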
if(true)
{
    string var = "VAR";
}
string var = "New VAR!";
This will result in:
Error 1 A local variable named 'var'
cannot be declared in this scope
because it would give a different
meaning to 'var', which is already
used in a 'child' scope to denote
something else.
Nothing earth shattering really, but isn't this just plain wrong? A fellow developer and I were wondering if the first declaration should be in a different scope, thus the second declaration cannot interfere with the first declaration.
Why is C# unable to differentiate between the two scopes? Should the first IF scope not be completely separate from the rest of the method?
I cannot call var from outside the if, so the error message is wrong, because the first var has no relevance in the second scope.
The issue here is largely one of good practice and guarding against inadvertent mistakes. Admittedly, the C# compiler could theoretically be designed so that there is no conflict between scopes here. However, that would be a lot of effort for little gain, as I see it.
Consider that if the declaration of var in the parent scope were before the if statement, there would be an unresolvable naming conflict. The compiler simply does not differentiate between the following two cases. Analysis is done purely based on scope, and not order of declaration/use, as you seem to be expecting.
The theoretically acceptable (but still invalid as far as C# is concerned):
if(true)
{
    string var = "VAR";
}
string var = "New VAR!";
and the unacceptable (since it would be hiding the parent variable):
string var = "New VAR!";
if(true)
{
    string var = "VAR";
}
are both treated precisely the same in terms of variables and scopes.
Now, is there any actual reason in this scenario why you can't just give one of the variables a different name? I assume (hope) your actual variables aren't called var, so I don't really see this being a problem. If you're still intent on reusing the same variable name, just put them in sibling scopes:
if(true)
{
    string var = "VAR";
}
{
    string var = "New VAR!";
}
This however, while valid to the compiler, can lead to some amount of confusion when reading the code, so I recommend against it in almost any case.
isn't this just plain wrong?
No, this is not wrong at all. This is a correct implementation of section 7.5.2.1 of the C# specification, "Simple names, invariant meanings in blocks".
The specification states:
For each occurrence of a given
identifier as a simple-name in an
expression or declarator, within the
local variable declaration space
of that occurrence, every
other occurrence of the same
identifier as a simple-name in an
expression or declarator must refer to the same
entity. This rule ensures that the
meaning of a name is always the same
within a given block, switch block,
for-, foreach- or using-statement, or
anonymous function.
Why is C# unable to differentiate between the two scopes?
The question is nonsensical; obviously the compiler is able to differentiate between the two scopes. If the compiler were unable to differentiate between the two scopes then how could the error be produced? The error message says that there are two different scopes, and therefore the scopes have been differentiated!
Should the first IF scope not be completely separate from the rest of the method?
No, it should not. The scope (and local variable declaration space) defined by the block statement in the consequence of the conditional statement is lexically a part of the outer block which defines the body of the method. Therefore, rules about the contents of the outer block apply to the contents of the inner block.
I cannot call var from outside the if,
so the error message is wrong, because
the first var has no relevance in the
second scope.
This is completely wrong. It is specious to conclude that just because the local variable is no longer in scope, that the outer block does not contain an error. The error message is correct.
The error here has nothing to do with whether the scope of any variable overlaps the scope of any other variable; the only thing that is relevant here is that you have a block -- the outer block -- in which the same simple name is used to refer to two completely different things. C# requires that a simple name have one meaning throughout the block which first uses it.
For example:
class C
{
    int x;
    void M()
    {
        int x = 123;
    }
}
That is perfectly legal; the scope of the outer x overlaps the scope of the inner x, but that is not an error. What is an error is:
class C
{
    int x;
    void M()
    {
        Console.WriteLine(x);
        if (whatever)
        {
            int x = 123;
        }
    }
}
because now the simple name "x" means two different things inside the body of M -- it means "this.x" and the local variable "x". It is confusing to developers and code maintainers when the same simple name means two completely different things in the same block, so that is illegal.
We do allow parallel blocks to contain the same simple name used in two different ways; this is legal:
class C
{
    int x;
    void M()
    {
        if (whatever)
        {
            Console.WriteLine(x);
        }
        if (somethingelse)
        {
            int x = 123;
        }
    }
}
because now the only block that contains two inconsistent usages of x is the outer block, and that block does not directly contain any usage of "x", only indirectly.
This is valid in C++, but it is a source of many bugs and sleepless nights. I think the C# guys decided that it's better to throw a warning/error since it is, in the vast majority of cases, a bug rather than something the coder actually wants.
Here's an interesting discussion on what parts of the specification this error comes from.
EDIT (some examples) -----
In C++, the following is valid (and it doesn't really matter if the outer declaration is before or after the inner scope, it will just be more interesting and bug-prone if it's before).
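int foo(int a)
{
    int count = 0;
    for(int i = 0; i < a; ++i)
    {
        int count = 0; // bug: declares a new variable that shadows the outer count
        count += i;    // only the inner count changes
    }
    return count;      // the outer count, which is still 0
}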
Now imagine the function being a few lines longer, and it might be easy to miss the error. The compiler never complains (not in the old days; I'm not sure about newer versions of C++), and the function always returns 0.
The behaviour is clearly a bug, so it would be good if a C++ lint program or the compiler pointed this out. If it's not a bug, it is easy to work around by just renaming the inner variable.
To add insult to injury I remember that GCC and VS6 had different opinions on where the counter variable in for loops belonged. One said it belonged to the outer scope and the other said it didn't. A bit annoying to work on cross-platform code. Let me give you yet another example to keep my line count up.
for(int i = 0; i < 1000; ++i)
{
    if(array[i] > 100)
        break;
}
printf("The first very large value in the array exists at %d\n", i);
This code worked in VS6 IIRC and not in GCC. Anyway, C# has cleaned up a few things, which is good.
We were having the (never-ending) underscore prefix versus no underscore prefix debate on member variables, and someone mentioned that if you use "this." instead of "_", your code will be slower due to the "." in "this.". Is this true, and can anyone quantify it?
No, that makes no sense at all. Just look at the IL, and kick that developer in the ass.
Also FWIW, I like the underscore in member variables.
There doesn't seem to be any difference when using the this keyword. If you have the following code:
class Class3
{
    private long id;
    public void DoWork()
    {
        id = 1;
        this.id = 2;
    }
}
When you run it through reflector you will see the following output:
internal class Class3
{
    // Fields
    private long id;
    // Methods
    public void DoWork()
    {
        this.id = 1L;
        this.id = 2L;
    }
}
Seems to me that "this." is a disambiguator at compile time. It tells the compiler the scope of the variable. It may be unnecessary, since the compiler will need to figure out scope in any case. But I can't imagine there is any performance downside, perhaps even a microscopic upside as you are "hinting".
Once the code is compiled (ie, at runtime), I imagine "this." is utterly irrelevant.
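To illustrate the "disambiguator" point, here is a minimal sketch of the one common case where "this." actually changes what the compiler binds to: a parameter with the same name as a field. The Customer class is hypothetical and only serves the example:

```csharp
class Customer
{
    private long id;

    public Customer(long id)
    {
        // A bare "id" here would refer to the parameter, so "this." is needed
        // to reach the field.
        this.id = id;
    }

    // No parameter or local is in the way, so "id" and "this.id" mean the same field.
    public long Id => id;
}
```

Either way the choice is resolved entirely at compile time, which is consistent with the identical decompiled output shown above.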
So it's a style choice. Some people prefer terseness. I like "this." because it adds clarity, when used correctly. It tells other developers where a function or property lives. I use it for any public method or property. I don't usually use it with private members.
Juval Lowy has a very nice C# style guide here: http://www.idesign.net/
Variables represent locations in memory. When compiled, a 100-character variable name and a one-letter variable name are both converted into numbers. In the same way, special characters are translated and won't make any difference to the speed.
Who needs underscores when you've got camelCasing? Also, I've got to say that what you're doing sounds like a crazy idea.