How to translate or convert CompilerGenerated code? - c#

If you try to use decompilers like: jetbrains dotpeek, redgate reflector, telerik justdecompile, whatever.. Sometimes if you need a code to copy or just to understand, it is not possible because are shown somethings like it:
[CompilerGenerated]
private sealed class Class15
{
// Fields
public Class11.Class12 CS$<>8__locals25;
public string endName;
// Methods
public Class15();
public bool <Show>b__11(object intelliListItem_0);
}
I'm not taking about obfuscation, this is happens at any time, I didsome tests (my own code), and occurs using lambdas and iterators. I'm not sure, could anyone give more information about when and why..?
So, by standard Visual Studio not compile $ and <> keywords in c# (like the code above)...
There is a way to translate or convert this decompiled code automatically?

Lambdas are a form of closure which is a posh way of saying it's a unit of code you can pass around like it was an object (but with access to its original context). When the compiler finds a lambda it generates a new type (Type being a class or struct) which encapsulates the code and any fields accessed by the lambda in its original context.
The problem here is, how do you generate code which will never conflict with user written code?
The compiler's answer is to generate code which is illegal in the language you are using, but legal in IL. IL is "Intermediate Language" it's the native language used by the Common Language Runtime. Any language which runs on the CLR (C#, vb.net, F#) compiles into IL. This is how you get to use VB.Net assemblies in C# code and so on.
So this is why the decompilers generate the hideous code you see. Iterators follow the exact same model as do a bunch of other language features that require generated types.
There is an interesting side effect. The Lambda may capture a variable in its original context:
public void TestCapture()
{
StringBuilder b = new StringBuilder();
Action l = () => b.Append("Kitties!");
}
So by capture I mean the variable b here is included in the package that defines the closure.
The compiler tries to be efficient and create as few types as possible, so you can end up with one generated class that supports all the lambdas found in a specific class, including fields for all the captured variables. In this way, if you're not careful, you can accidentally capture something you expect to be released, causing really tricky to trace memory leaks.

Is there an option to change the target framework?... I know with some decompilers they default to the lowest level framework (C# 1.0)

Related

Source code for the "nameof" operator of C#

Where can I get the source code for "nameof" of C# or how do I decompile it?
I checked https://referencesource.microsoft.com/, but I couldn't find it.
It's not something you can decompile as such, or show you source code for. It's part of the C# compiler: when you use nameof(Foo) the compiler just injects "Foo" into the source code. The IL for the methods is exactly the same:
static void PrintMyName()
{
Console.WriteLine(nameof(PrintMyName));
}
vs
static void PrintMyName()
{
Console.WriteLine("PrintMyName");
}
As noted in comments, it's not just that the name is taken literally as per the operand; it's the last part of the name that's used. So for example, if you have:
string x = "10";
string text = nameof(x.Length);
then that will resolve to "Length". (This doesn't use the value of x at execution time, either - it's fine if x is null. Or you could use nameof(string.Length) or nameof(String.Length).)
nameof is a keyword, so you would need to look into the compiler for the source code of how it is processed. Fortunately for you, the C# compiler is now open-sourced under the Roslyn project. Understanding a compiler is not a trivial task – source code is passed through pipelines of transformations, which each one adding more syntactic or semantic information. To start you off, the GetContextualKeywordKind parses the nameof keyword into a SyntaxKind.NameOfKeyword, which then gets matched in TryBindNameofOperator.
As to your other question of creating another such operator: Yes, you can, by cloning and modifying the Roslyn source. However, your new operator would obviously only work on the modified compiler, so you'd need to supply this to whoever will be compiling your code. This is something that's rarely done; you're normally better off defining extension methods for your custom functionality, unless you need something particularly esoteric.

Dynamically create compiler error based on API usage?

Did not know how to search for an answer to this, but I am wondering if there is a way in a compiled language such as Java, C#, Scala etc... To force a compiler error where an API is used incorrectly.
Say you have some sort of API you are working with and you know you need to call a specific setup method X before calling some other method Y, is it possible to set things up such that the compiler will catch the error and avoid having to do so at run time?
It would be fairly useful for enforcing some code standards or fixing broken API's. No idea if its even possible though.
Tools like NDepend (for .NET) or JArchitect (for java) lets write custom code rules over LINQ queries that can emit warning or error at analysis time (in the IDE, or at Build Process time). For example the following CQLinq code rule enforces that if a method is calling MethodA(), it must call MethodB():
warnif count > 0
from m in Application.Methods where
m.IsUsing("MyNamespace.MyClass.MyMethodA()") &&
!m.IsUsing("MyNamespace.MyClass.MyMethodB()")
select m
Generally, you can't create arbitrary custom compile-time errors in most static languages. There are a few exceptions (like #error directives in C and C++, but even then the preprocessor comes strictly before the main compilation so that won't help).
However, you can utilize a language feature specifically designed to catch errors at compile time:
The type system is your friend.
Requiring methods to be called in a particular order is unfriendly to API consumers, and is a code smell in a high-level language.
A trivial solution is to expose a single method that performs both operations in the correct order. More generally (e.g. if the 2nd method doesn't have to be called), have the 2nd method call the 1st method at the top. But presumably you want the calls to be under the control of the API consumer.
In that case, refactor your code so that it is not possible to call methods in the wrong order without creating a type error. For example, suppose you're writing a regex library where regexes have to be compiled before matching. In Scala, refactor:
class Regex(pattern : String) {
private[this] var compiled : Option[CompiledData] = None
def compile() {
// do stuff
this.compiled = ...
}
def search(s : String) : MatchResult = compiled match {
case Some(c) => ... // match the string
case None => throw new IllegalStateException("must compile regex first")
}
}
to
class Regex(pattern : String) {
def compile : CompiledRegex = {
// do stuff
CompiledRegex(...)
}
}
class CompiledRegex(c : CompiledData) {
def search(s : String) : MatchResult = {
... // match the string
}
}
Since the search method is now only available on CompiledRegex, and CompiledRegex is obtained by calling Regex.compile, it's impossible to call search before that action is made valid by compiling the regex.
This setup also helps out API consumers, because if the user needs an object of type MatchResult and has already typed new Regex("[abc123]*"), a good IDE can autocomplete or suggest the needed method .compile; that would not be possible with the original setup.
(In this example, it also no longer needs mutable state, which is often avoided in Scala.)
Static assertions
As an alternative solution, some languages (I think D and C++11, though not the ones you listed) support static assertions.
the usage of specific functions ist on another level than the compiler operation ... therefore i don't think that a compiler should do things like that ...
on the other hand: compilers are commonly integrated into IDEs ... why don't you put that into an IDE module too? ... most IDEs allow pre compiler operations ...
you could write some sort of checking tool (based on CodeDOM or something similar) to test for specific function usage, and based on that check-result, abort the compiler run if the quality criteria is not matched ...

How to alias a built-in type in C#?

So in C++, I'm used to being able to do:
typedef int PeerId;
This allows me to make a type more self-documenting, but additionally also allows me to make PeerId represent a different type at any time without changing all of the code. I could even turn PeerId into a class if I wanted. This kind of extensibility is what I want to have in C#, however I am having trouble figuring out how to create an alias for 'int' in C#.
I think I can use the using statement, but it only has scope in the current file I believe, so that won't work (The alias needs to be accessible between multiple files without being redefined). I also can't derive a class from built-in types (but normally this is what I would do to alias ref-types, such as List or Dictionary). I'm not sure what I can do. Any ideas?
You need to use the full type name like this:
using DWORD = System.Int32;
You could (ab)use implicit conversions:
struct PeerId
{
private int peer;
public static implicit operator PeerId(int i)
{
return new PeerId {peer=i};
}
public static implicit operator int(PeerId p)
{
return p.peer;
}
}
This takes the same space as an int, and you can do:
PeerId p = 3;
int i = p;
But I agree you probably don't need this.
Summary
Here's the short answer:
Typedefs are actually a variable used by compile-time code generators.
C# is being designed to avoid adding code generation language constructs.
Therefore, the concept of typedefs doesn't fit in well with the C# language.
Long Answer
In C++, it makes more sense: C++ started off as a precompiler that spit out C code, which was then compiled. This "code generator" beginning still has effects in modern C++ features (i.e., templates are essentially a Turing-complete language for generating classes and functions at compile time). In this context, a typedef makes sense because it's a way to get the "result" of a compile-time type factory or "algorithm" that "returns" a type.
In this strange meta-language (which few outside of Boost have mastered), a typedef is actually a variable.
What you're describing is less complex, but you're still trying to use the typedef as a variable. In this case, it's used as an input variable. So when other code uses the typedef, it's really not using that type directly. Rather, it's acting as a compile-time code generator, building classes and methods based on typedef'ed input variables. Even if you ignore C++ templates and just look at C typedefs, the effect is the same.
C++ and Generative Programming
C++ was designed to be a multi-paradign language (OO and procedural, but not functional until Boost came out). Interestingly enough, templates have evolved an unexpected paradign: generative programming. (Generative programming was around before C++, but C++ made it popular). Generative programs are actually meta-programs that - when compiled - generate the needed classes and methods, which are in turn compiled into executables.
C# and Generative Programming
Our tools are slowly evolving in the same direction. Of course, reflection emit can be used for "manual" generative programming, but it is quite painful. The way LINQ providers use expression trees is very generative in nature. T4 templates get really close but still fall short. The "compiler as a service" which will hopefully be part of C# vNext appears most promising of all, if it could be combined with some kind of type variable (such as a typedef).
This one piece of the puzzle is still missing: generative programs need some sort of automatic trigger mechanism (in C++, this is handled by implicit template instantiation).
However, it is explicitly not a goal of C# to have any kind of "code generator" in the C# language like C++ templates (probably for the sake of understandability; very few C++ programmers understand C++ templates). This will probably be a niche satisfied by T4 rather than C#.
Conclusion (repeating the Summary)
All of the above is to say this:
Typedefs are a variable used by code generators.
C# is being designed to avoid adding code generation language constructs.
Therefore, the concept of typedefs doesn't fit in well with the C# language.
I also sometimes feel I need (integer) typedefs for similar purposes to the OP.
If you do not mind the casts being explicit (I actually want them to be) you can do this:
enum PeerId : int {};
Will also work for byte, sbyte, short, ushort, uint, long, or ulong (obviously).
Not exactly the intended usage of enum, but it does work.
Since C# 10 you can use global using:
global using PeerId = System.Int32;
It works for all files.
It should appear before all using directives without the global modifier.
See using directive.
Redefining fundamental types just for the sake of changing the name is C++ think and does not sit well with the more pure Object Orientated C#. Whenever you get the urge to shoehorn a concept from one language into another, you must stop and think whether or not it makes sense and try to stay native to the platform.
The requirement of being able to change the underlying type easily can be satisfied by defining your own value type. Coupled with implicit conversion operators and arithmetic operators, you have the power to define very powerful types. If you are worried about performance for adding layers on top of simple types, don't. 99% chance that it won't, and the 1% chance is that in case it does, it will not the be "low hanging fruit" of performance optimization.

Why is C# statically typed?

I am a PHP web programmer who is trying to learn C#.
I would like to know why C# requires me to specify the data type when creating a variable.
Class classInstance = new Class();
Why do we need to know the data type before a class instance?
As others have said, C# is static/strongly-typed. But I take your question more to be "Why would you want C# to be static/strongly-typed like this? What advantages does this have over dynamic languages?"
With that in mind, there are lots of good reasons:
Stability Certain kinds of errors are now caught automatically by the compiler, before the code ever makes it anywhere close to production.
Readability/Maintainability You are now providing more information about how the code is supposed to work to future developers who read it. You add information that a specific variable is intended to hold a certain kind of value, and that helps programmers reason about what the purpose of that variable is.
This is probably why, for example, Microsoft's style guidelines recommended that VB6 programmers put a type prefix with variable names, but that VB.Net programmers do not.
Performance This is the weakest reason, but late-binding/duck typing can be slower. In the end, a variable refers to memory that is structured in some specific way. Without strong types, the program will have to do extra type verification or conversion behind the scenes at runtime as you use memory that is structured one way physically as if it were structured in another way logically.
I hesitate to include this point, because ultimately you often have to do those conversions in a strongly typed language as well. It's just that the strongly typed language leaves the exact timing and extent of the conversion to the programmer, and does no extra work unless it needs to be done. It also allows the programmer to force a more advantageous data type. But these really are attributes of the programmer, rather than the platform.
That would itself be a weak reason to omit the point, except that a good dynamic language will often make better choices than the programmer. This means a dynamic language can help many programmers write faster programs. Still, for good programmers, strongly-typed languages have the potential to be faster.
Better Dev Tools If your IDE knows what type a variable is expected to be, it can give you additional help about what kinds of things that variable can do. This is much harder for the IDE to do if it has to infer the type for you. And if you get more help with the minutia of an API from the IDE, then you as a developer will be able to get your head around a larger, richer API, and get there faster.
Or perhaps you were just wondering why you have to specify the class name twice for the same variable on the same line? The answer is two-fold:
Often you don't. In C# 3.0 and later you can use the var keyword instead of the type name in many cases. Variables created this way are still statically typed, but the type is now inferred for you by the compiler.
Thanks to inheritance and interfaces sometimes the type on the left-hand side doesn't match the type on the right hand side.
It's simply how the language was designed. C# is a C-style language and follows in the pattern of having types on the left.
In C# 3.0 and up you can kind of get around this in many cases with local type inference.
var variable = new SomeClass();
But at the same time you could also argue that you are still declaring a type on the LHS. Just that you want the compiler to pick it for you.
EDIT
Please read this in the context of the users original question
why do we need [class name] before a variable name?
I wanted to comment on several other answers in this thread. A lot of people are giving "C# is statically type" as an answer. While the statement is true (C# is statically typed), it is almost completely unrelated to the question. Static typing does not necessitate a type name being to the left of the variable name. Sure it can help but that is a language designer choice not a necessary feature of static typed languages.
These is easily provable by considering other statically typed languages such as F#. Types in F# appear on the right of a variable name and very often can be altogether ommitted. There are also several counter examples. PowerShell for instance is extremely dynamic and puts all of its type, if included, on the left.
One of the main reasons is that you can specify different types as long as the type on the left hand side of the assignment is a parent type of the type on the left (or an interface that is implemented on that type).
For example given the following types:
class Foo { }
class Bar : Foo { }
interface IBaz { }
class Baz : IBaz { }
C# allows you to do this:
Foo f = new Bar();
IBaz b = new Baz();
Yes, in most cases the compiler could infer the type of the variable from the assignment (like with the var keyword) but it doesn't for the reason I have shown above.
Edit: As a point of order - while C# is strongly-typed the important distinction (as far as this discussion is concerned) is that it is in fact also a statically-typed language. In other words the C# compiler does static type checking at compilation time.
C# is a statically-typed, strongly-typed language like C or C++. In these languages all variables must be declared to be of a specific type.
Ultimately because Anders Hejlsberg said so...
You need [class name] in front because there are many situations in which the first [class name] is different from the second, like:
IMyCoolInterface obj = new MyInterfaceImplementer();
MyBaseType obj2 = new MySubTypeOfBaseType();
etc. You can also use the word 'var' if you don't want to specify the type explicitely.
Why do we need to know the data type
before a class instance?
You don't! Read from right to left. You create the variable and then you store it in a type safe variable so you know what type that variable is for later use.
Consider the following snippet, it would be a nightmare to debug if you didn't receive the errors until runtime.
void FunctionCalledVeryUnfrequently()
{
ClassA a = new ClassA();
ClassB b = new ClassB();
ClassA a2 = new ClassB(); //COMPILER ERROR(thank god)
//100 lines of code
DoStuffWithA(a);
DoStuffWithA(b); //COMPILER ERROR(thank god)
DoStuffWithA(a2);
}
When you'r thinking you can replace the new Class() with a number or a string and the syntax will make much more sense. The following example might be a bit verbose but might help to understand why it's designed the way it is.
string s = "abc";
string s2 = new string(new char[]{'a', 'b', 'c'});
//Does exactly the same thing
DoStuffWithAString("abc");
DoStuffWithAString(new string(new char[]{'a', 'b', 'c'}));
//Does exactly the same thing
C#, as others have pointed out, is a strongly, statically-typed language.
By stating up front what the type you're intending to create is, you'll receive compile-time warnings when you try to assign an illegal value. By stating up front what type of parameters you accept in methods, you receive those same compile-time warnings when you accidentally pass nonsense into a method that isn't expecting it. It removes the overhead of some paranoia on your behalf.
Finally, and rather nicely, C# (and many other languages) doesn't have the same ridiculous, "convert anything to anything, even when it doesn't make sense" mentality that PHP does, which quite frankly can trip you up more times than it helps.
c# is a strongly-typed language, like c++ or java. Therefore it needs to know the type of the variable. you can fudge it a bit in c# 3.0 via the var keyword. That lets the compiler infer the type.
That's the difference between a strongly typed and weakly typed language. C# (and C, C++, Java, most more powerful languages) are strongly typed so you must declare the variable type.
When we define variables to hold data we have to specify the type of data that those variables will hold. The compiler then checks that what we are doing with the data makes sense to it, i.e. follows the rules. We can't for example store text in a number - the compiler will not allow it.
int a = "fred"; // Not allowed. Cannot implicitly convert 'string' to 'int'
The variable a is of type int, and assigning it the value "fred" which is a text string breaks the rules- the compiler is unable to do any kind of conversion of this string.
In C# 3.0, you can use the 'var' keyword - this uses static type inference to work out what the type of the variable is at compile time
var foo = new ClassName();
variable 'foo' will be of type 'ClassName' from then on.
One things that hasn't been mentioned is that C# is a CLS (Common Language Specification) compliant language. This is a set of rules that a .NET language has to adhere to in order to be interopable with other .NET languages.
So really C# is just keeping to these rules. To quote this MSDN article:
The CLS helps enhance and ensure
language interoperability by defining
a set of features that developers can
rely on to be available in a wide
variety of languages. The CLS also
establishes requirements for CLS
compliance; these help you determine
whether your managed code conforms to
the CLS and to what extent a given
tool supports the development of
managed code that uses CLS features.
If your component uses only CLS
features in the API that it exposes to
other code (including derived
classes), the component is guaranteed
to be accessible from any programming
language that supports the CLS.
Components that adhere to the CLS
rules and use only the features
included in the CLS are said to be
CLS-compliant components
Part of the CLS is the CTS the Common Type System.
If that's not enough acronyms for you then there's a tonne more in .NET such as CLI, ILasm/MSIL, CLR, BCL, FCL,
Because C# is a strongly typed language
Static typing also allows the compiler to make better optimizations, and skip certain steps. Take overloading for example, where you have multiple methods or operators with the same name differing only by their arguments. With a dynamic language, the runtime would need to grade each version in order to determine which is the best match. With a static language like this, the final code simply points directly to the appropriate overload.
Static typing also aids in code maintenance and refactoring. My favorite example being the Rename feature of many higher-end IDEs. Thanks to static typing, the IDE can find with certainty every occurrence of the identifier in your code, and leave unrelated identifiers with the same name intact.
I didn't notice if it were mentioned yet or not, but C# 4.0 introduces dynamic checking VIA the dynamic keyword. Though I'm sure you'd want to avoid it when it's not necessary.
Why C# requires me to specify the data type when creating a variable.
Why do we need to know the data type before a class instance?
I think one thing that most answers haven't referenced is the fact that C# was originally meant and designed as "managed", "safe" language among other things and a lot of those goals are arrived at via static / compile time verifiability. Knowing the variable datatype explicitly makes this problem MUCH easier to solve. Meaning that one can make several automated assessments (C# compiler, not JIT) about possible errors / undesirable behavior without ever allowing execution.
That verifiability as a side effect also gives you better readability, dev tools, stability etc. because if an automated algorithm can understand better what the code will do when it actually runs, so can you :)
Statically typed means that Compiler can perform some sort of checks at Compile time not at run time. Every variable is of particular or strong type in Static type. C# is strongly definitely strongly typed.

What is 'unverifiable code' and why is it bad?

I am designing a helper method that does lazy loading of certain objects for me, calling it looks like this:
public override EDC2_ORM.Customer Customer {
get { return LazyLoader.Get<EDC2_ORM.Customer>(
CustomerId, _customerDao, ()=>base.Customer, (x)=>Customer = x); }
set { base.Customer = value; }
}
when I compile this code I get the following warning:
Warning 5 Access to member
'EDC2_ORM.Billing.Contract.Site'
through a 'base' keyword from an
anonymous method, lambda expression,
query expression, or iterator results
in unverifiable code. Consider moving
the access into a helper method on the
containing type.
What exactly is the complaint here and why is what I'm doing bad?
"base.Foo" for a virtual method will make a non-virtual call on the parent definition of the method "Foo". Starting with CLR 2.0, the CLR decided that a non-virtual call on a virtual method can be a potential security hole and restricted the scenarios in which in can be used. They limited it to making non-virtual calls to virtual methods within the same class hierarchy.
Lambda expressions put a kink in the process. Lambda expressions often generate a closure under the hood which is a completely separate class. So the code "base.Foo" will eventually become an expression in an entirely new class. This creates a verification exception with the CLR. Hence C# issues a warning.
Side Note: The equivalent code will work in VB. In VB for non-virtual calls to a virtual method, a method stub will be generated in the original class. The non-virtual call will be performed in this method. The "base.Foo" will be redirected into "StubBaseFoo" (generated name is different).
I suspect the problem is that you're basically saying, "I don't want to use the most derived implementation of Customer - I want to use this particular one" - which you wouldn't be able to do normally. You're allowed to do it within a derived class, and for good reasons, but from other types you'd be violating encapsulation.
Now, when you use an anonymous method, lambda expression, query expression (which basically uses lambda expressions) or iterator block, sometimes the compiler has to create a new class for you behind the scenes. Sometimes it can get away with creating a new method in the same type for lambda expressions, but it depends on the context. Basically if any local variables are captured in the lambda expression, that needs a new class (or indeed multiple classes, depending on scope - it can get nasty). If the lambda expression only captures the this reference, a new instance method can be created for the lambda expression logic. If nothing is captured, a static method is fine.
So, although the C# compiler knows that really you're not violating encapsulation, the CLR doesn't - so it treats the code with some suspicion. If you're running under full trust, that's probably not an issue, but under other trust levels (I don't know the details offhand) your code won't be allowed to run.
Does that help?
copy/pasting from here:
Codesta: C#/CLR has 2 kinds of code, safe and unsafe. What is it trying to provide and how did this affect the virtual machine?
Peter Hallam: For C# the terms are safe and unsafe. The CLR uses the terms verifiable and unverifiable.
When running verifiable code the CLR can enforce security policies; the CLR can prevent verifiable code from doing things that it doesn't have permission to do. When running potentially malicious code, code that was downloaded from the internet for example, the CLR will only run verifiable code, and will ensure that the untrusted code doesn't access anything that it doesn't have permission to access.
The use of standard C style pointers creates unverifiable code. The CLR supports C style pointers natively. Once you've got a C style pointer you can read or write to any byte of memory in the process, so the runtime cannot enforce security policy. Actually it could but the performance penalty would make it impractical.
Now, that does not fully answer your question (i.e. WHY is this now unverifiable code), but at least it explains that "unverifiable" is the CLR-term for "unsafe". I assume that anonymous methods and base classes result in some funky pointer-magic internally.
By the Way: I think that the code snippet does not match the Warning message. The code is talking about a Customer, the Warning is about the Billing. Is it possible to post the actuon code the warning is generated for? Maybe you have something else in that code that would better explain why you get the warning.

Categories

Resources