What is 'unverifiable code' and why is it bad?

I am designing a helper method that does lazy loading of certain objects for me, calling it looks like this:
public override EDC2_ORM.Customer Customer {
    get { return LazyLoader.Get<EDC2_ORM.Customer>(
        CustomerId, _customerDao, ()=>base.Customer, (x)=>Customer = x); }
    set { base.Customer = value; }
}
When I compile this code I get the following warning:
Warning 5 Access to member 'EDC2_ORM.Billing.Contract.Site' through a 'base' keyword from an anonymous method, lambda expression, query expression, or iterator results in unverifiable code. Consider moving the access into a helper method on the containing type.
What exactly is the complaint here and why is what I'm doing bad?

"base.Foo" for a virtual method will make a non-virtual call on the parent definition of the method "Foo". Starting with CLR 2.0, the CLR decided that a non-virtual call on a virtual method can be a potential security hole and restricted the scenarios in which it can be used. They limited it to making non-virtual calls to virtual methods within the same class hierarchy.
Lambda expressions put a kink in the process. Lambda expressions often generate a closure under the hood which is a completely separate class. So the code "base.Foo" will eventually become an expression in an entirely new class. This creates a verification exception with the CLR. Hence C# issues a warning.
Side Note: The equivalent code will work in VB. In VB for non-virtual calls to a virtual method, a method stub will be generated in the original class. The non-virtual call will be performed in this method. The "base.Foo" will be redirected into "StubBaseFoo" (generated name is different).
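The workaround that the warning itself suggests, moving the base access into a helper method on the containing type, can be sketched as below. BaseEntity, DerivedEntity and the helper names are hypothetical stand-ins rather than the asker's ORM types; the only point is that the lambdas call ordinary methods of the class, and those methods perform the base access.

```csharp
using System;

// Hypothetical stand-in for the ORM base class.
public class BaseEntity
{
    public virtual string Customer { get; set; }
}

public class DerivedEntity : BaseEntity
{
    // Ordinary methods on the containing type: the non-virtual
    // base call is emitted here, inside the class hierarchy.
    private string GetBaseCustomer() { return base.Customer; }
    private void SetBaseCustomer(string value) { base.Customer = value; }

    public override string Customer
    {
        get
        {
            // Capturing a local forces the compiler to use a closure class,
            // which is the situation the warning is about.
            string fallback = "loaded lazily";
            Func<string> read = () => GetBaseCustomer();
            Action write = () => SetBaseCustomer(fallback);

            if (read() == null) write();
            return read();
        }
        set { SetBaseCustomer(value); }
    }
}
```

The lambdas no longer mention base at all, so whatever class the compiler generates for them only makes normal calls back into DerivedEntity.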

I suspect the problem is that you're basically saying, "I don't want to use the most derived implementation of Customer - I want to use this particular one" - which you wouldn't be able to do normally. You're allowed to do it within a derived class, and for good reasons, but from other types you'd be violating encapsulation.
Now, when you use an anonymous method, lambda expression, query expression (which basically uses lambda expressions) or iterator block, sometimes the compiler has to create a new class for you behind the scenes. Sometimes it can get away with creating a new method in the same type for lambda expressions, but it depends on the context. Basically if any local variables are captured in the lambda expression, that needs a new class (or indeed multiple classes, depending on scope - it can get nasty). If the lambda expression only captures the this reference, a new instance method can be created for the lambda expression logic. If nothing is captured, a static method is fine.
So, although the C# compiler knows that really you're not violating encapsulation, the CLR doesn't - so it treats the code with some suspicion. If you're running under full trust, that's probably not an issue, but under other trust levels (I don't know the details offhand) your code won't be allowed to run.
Does that help?
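The capture cases described above can be observed with reflection. This is a minimal sketch with hypothetical names (CaptureDemo, Describe); where exactly the compiler places the non-capturing and this-only lambdas differs between compiler versions, so only the local-capturing case should be relied on.

```csharp
using System;
using System.Runtime.CompilerServices;

public class CaptureDemo
{
    private int _field = 1;

    // Captures nothing.
    public Func<int> NoCapture() { return () => 42; }

    // Captures only 'this'.
    public Func<int> CapturesThis() { return () => _field; }

    // Captures a local variable (and 'this'): needs a closure class.
    public Func<int> CapturesLocal()
    {
        int local = 2;
        return () => local + _field;
    }

    // Reports which type the compiler placed the lambda body in.
    public static string Describe(Delegate d)
    {
        Type owner = d.Method.DeclaringType;
        bool generated = owner.IsDefined(typeof(CompilerGeneratedAttribute), false);
        return owner.Name + (generated ? " (compiler-generated)" : "");
    }
}
```

Calling CaptureDemo.Describe on each of the three delegates prints the owning type; for CapturesLocal it is a compiler-generated nested class rather than CaptureDemo itself, which is why a base access inside such a lambda ends up outside the original class.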

copy/pasting from here:
Codesta: C#/CLR has 2 kinds of code, safe and unsafe. What is it trying to provide and how did this affect the virtual machine?
Peter Hallam: For C# the terms are safe and unsafe. The CLR uses the terms verifiable and unverifiable.
When running verifiable code the CLR can enforce security policies; the CLR can prevent verifiable code from doing things that it doesn't have permission to do. When running potentially malicious code, code that was downloaded from the internet for example, the CLR will only run verifiable code, and will ensure that the untrusted code doesn't access anything that it doesn't have permission to access.
The use of standard C style pointers creates unverifiable code. The CLR supports C style pointers natively. Once you've got a C style pointer you can read or write to any byte of memory in the process, so the runtime cannot enforce security policy. Actually it could but the performance penalty would make it impractical.
Now, that does not fully answer your question (i.e. WHY is this now unverifiable code), but at least it explains that "unverifiable" is the CLR-term for "unsafe". I assume that anonymous methods and base classes result in some funky pointer-magic internally.
By the way: I think the code snippet does not match the warning message. The code is about a Customer, while the warning is about Billing.Contract.Site. Is it possible to post the actual code that the warning is generated for? Maybe you have something else in that code that would better explain why you get the warning.

Related

Using F# quotations to detect code changes

I need to cache results from heavy calculations done by several different classes inheriting from the same base class. I was doing run-time caching by subclass name. Now I need to store the results on disk/DB to avoid long recalculations when I restart my app, but I need to invalidate cache if I change the code inside Calculate().
[<AbstractClass>]
type BaseCalculator() =
    // Has a public getter; the output is cached by the caller of the Calculate() method
    let output = ConcurrentDictionary<string, Series<DateTime, float>>()
    abstract Calculate: unit -> unit
type FirstCalculator() =
    inherit BaseCalculator()
    override this.Calculate() = ... do heavy work here ...
From this question and answers I have learned that I could use [<ReflectedDefinition>] on my calculate method. But I have never worked with quotations myself before.
The questions are:
Could I use a hash code of quotations to uniquely identify the body of the method, or are there some GUIDs or timestamps inside quotations?
Could [<ReflectedDefinition>] be applied to the abstract method, and will it work if I override the method in C# but call the method from an F# runner? (I use reflection to load all implementations of the base class in all DLLs in a folder.)
Is there another simple and reliable method to detect code changes in an assembly automatically, without quotations? Using the last modified time of an assembly file (DLL) could work, but any change in an assembly will invalidate all calculators in the assembly. This could work if I separate stable and WIP calculators into separate assemblies, but more granularity is preferred.

Could I use a hash code of quotations to uniquely identify the body of the method, or are there some GUIDs or timestamps inside quotations?
I think this would work. Have a look at FsPickler (which is used by MBrace to serialize quotations). I think it can give you a hash of a quotation too. But keep in mind that other parts of your code might change (e.g. another method in another type).
Could ReflectedDefinition be applied to the abstract method and will it work if I override the method in C#, but call the method from F# runner?
No, the attribute only works on F# methods compiled using the F# compiler.
Is there another simple and reliable method to detect code changes in an assembly automatically, without quotations?
I'm not sure. You could use GetMethodBody method and .NET reflection to get the IL, but this only gives you the immediate body - not including e.g. lambda functions, so changes that happen elsewhere will not be easy to detect.
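A minimal sketch of the GetMethodBody idea, written in C# and using only standard reflection and hashing APIs. MethodBodyHasher and SampleCalculators are hypothetical names introduced for illustration. As noted above, this only covers the immediate body; in addition, the IL bytes embed metadata tokens, so it is an assumption that unrelated edits elsewhere in the assembly leave the hash stable.

```csharp
using System;
using System.Reflection;
using System.Security.Cryptography;

public static class MethodBodyHasher
{
    // Returns a hex SHA-256 of the method's raw IL bytes,
    // or null when the method has no body (abstract, extern).
    public static string HashIL(MethodInfo method)
    {
        MethodBody body = method.GetMethodBody();
        if (body == null) return null;

        byte[] il = body.GetILAsByteArray();
        using (SHA256 sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(il)).Replace("-", "");
        }
    }
}

// Hypothetical calculators, used only to exercise the helper.
public class SampleCalculators
{
    public int First() { return 1; }
    public int Second() { return 2; }
}
```

A cache key could then combine the calculator's type name with MethodBodyHasher.HashIL(type.GetMethod("Calculate")), accepting that helper methods and lambdas called from Calculate are not part of the hash.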
A completely different approach that might work better would be to keep the calculations in FSX files in plain text and compile them on the fly using F# Compiler Service. Then you could just hash the source of the individual FSX files (on a per-computation basis).

I have found that Mono.Cecil is surprisingly easy to use, and with it I could hash all member bodies of a type, including base type.
This SO question was very helpful as a starting point.

How to translate or convert CompilerGenerated code?

If you use decompilers such as JetBrains dotPeek, Redgate Reflector or Telerik JustDecompile, you sometimes cannot copy the code, or even understand it, because the output looks like this:
[CompilerGenerated]
private sealed class Class15
{
    // Fields
    public Class11.Class12 CS$<>8__locals25;
    public string endName;
    // Methods
    public Class15();
    public bool <Show>b__11(object intelliListItem_0);
}
I'm not talking about obfuscation; this happens all the time. I did some tests with my own code, and it occurs when using lambdas and iterators. Could anyone give more information about when and why this happens?
By default, Visual Studio will not compile names containing $ and <> in C# (like the code above).
Is there a way to translate or convert this decompiled code automatically?

Lambdas are a form of closure which is a posh way of saying it's a unit of code you can pass around like it was an object (but with access to its original context). When the compiler finds a lambda it generates a new type (Type being a class or struct) which encapsulates the code and any fields accessed by the lambda in its original context.
The problem here is, how do you generate code which will never conflict with user written code?
The compiler's answer is to generate code which is illegal in the language you are using, but legal in IL. IL is "Intermediate Language"; it's the native language used by the Common Language Runtime. Any language which runs on the CLR (C#, VB.NET, F#) compiles into IL. This is how you get to use VB.NET assemblies in C# code and so on.
So this is why the decompilers generate the hideous code you see. Iterators follow the exact same model as do a bunch of other language features that require generated types.
There is an interesting side effect. The Lambda may capture a variable in its original context:
public void TestCapture()
{
    StringBuilder b = new StringBuilder();
    Action l = () => b.Append("Kitties!");
}
So by capture I mean the variable b here is included in the package that defines the closure.
The compiler tries to be efficient and create as few types as possible, so you can end up with one generated class that supports all the lambdas found in a specific class, including fields for all the captured variables. In this way, if you're not careful, you can accidentally capture something you expect to be released, causing really tricky to trace memory leaks.
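The accidental-capture effect can be made visible with a small sketch. SharedClosureDemo and its members are hypothetical names; the sharing shown here is how current Microsoft C# compilers behave for lambdas declared in the same scope, not something the language specification promises.

```csharp
using System;

public static class SharedClosureDemo
{
    public static void Create(out Func<int> cheap, out Func<int> expensive)
    {
        int small = 42;
        byte[] big = new byte[1024 * 1024];

        // Two lambdas in the same scope, each capturing a different local.
        cheap = () => small;
        expensive = () => big.Length;
    }

    // True when both delegates point at the same generated closure object.
    public static bool ShareOneClosureObject()
    {
        Func<int> cheap, expensive;
        Create(out cheap, out expensive);
        return ReferenceEquals(cheap.Target, expensive.Target);
    }
}
```

When ShareOneClosureObject returns true, holding on to cheap alone also keeps the one-megabyte array reachable, because both locals are fields of the same generated object.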

Is there an option to change the target framework?... I know with some decompilers they default to the lowest level framework (C# 1.0)

Method analysis using Reflection and CodeDom

The context of this question is too elaborate to describe here and will likely adversely affect responses so I am not including it. I want to assert certain things about a method in a unit test. Some of these things are possible using reflection such as format of the try/finally block, class fields and method local variables, etc. I already know the type and method signature.
protected override bool OnTest ()
{
    bool result = false;
    SomeCOMObject com = null; // System.__ComObject
    try
    {
    }
    finally
    {
        System.Runtime.InteropServices.Marshal.ReleaseComObject(com);
    }
    return (result);
}
What I have not been able to achieve are things like:
Whether the method contains only a single return (result); statement and whether that statement is the last one in the function.
Whether all variables of type System.__ComObject have been manually de-referenced using System.Runtime.InteropServices.Marshal.ReleaseComObject(object) in the finally block.
Since some of these things are not possible using reflection, and source code text analysis is far from ideal, I turned to CodeDom but have not been able to get a grip on it. I have been told that creating expression trees from source code is not possible. Nor is it possible to create expression trees from the runtime type. If that is correct, how can I leverage CodeDom to achieve things in the list above?
I have used CodeDom in the past for code generation and compiling simple code classes to assemblies. But I have no idea how it could be used to analyze the internals of a method. Please advise.

In general, reflection built into programming languages provides no access to the content of functions. So you pretty much can't do this with reflection.
You might be able to do it if you have access to the byte-code equivalent, but byte code can't really answer questions about the syntax of the method, e.g., "how many return statements exist returning the same expression".
If you want to reason about code, you need to reason about the source code. This means you need access to a parser, and often other useful facts ("What is the declaration of X?", "Are the types of X and Y compatible?", "Does data flow from X to Y?"), etc.
Roslyn provides some of this information. There are also commercial solutions (I have one).
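To illustrate the limit of the byte-code route mentioned above, here is a minimal sketch using only standard reflection. MethodShapeInspector and SampleTest are hypothetical names. It can tell that a finally region exists in the compiled method, but nothing in it can say how many return statements the source contained.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class MethodShapeInspector
{
    // Counts the finally regions recorded in the method's IL metadata.
    public static int CountFinallyClauses(MethodInfo method)
    {
        MethodBody body = method.GetMethodBody();
        if (body == null) return 0; // abstract or extern method

        return body.ExceptionHandlingClauses
                   .Count(c => c.Flags == ExceptionHandlingClauseOptions.Finally);
    }
}

// Hypothetical method under test, shaped like the one in the question.
public class SampleTest
{
    public bool OnTest()
    {
        bool result = false;
        object resource = new object();
        try
        {
            result = resource != null;
        }
        finally
        {
            GC.KeepAlive(resource);
        }
        return result;
    }
}
```

Checking what happens inside the finally block would mean decoding the IL opcodes in that region by hand, which is where a source-level parser becomes the more practical tool.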

Property / Method inlining and impact on Reflection

Valentin Kuzub commented on my answer to one of the questions on SO, arguing that inlining of a property by the JIT compiler will cause reflection to stop working.
The case is as follows:
class Foo
{
    public string Bar { get; set; }
    public void Fuzz<T>(Expression<Func<Foo, T>> lambda)
    {
    }
}
new Foo().Fuzz(x => x.Bar);
The Fuzz function accepts a lambda expression and uses reflection to find the property. It is a common practice in MVC in HtmlHelper extensions.
I don't think that reflection will stop working even if the Bar property gets inlined, as it is a call to Bar that will be inlined, and typeof(Foo).GetProperty("Bar") will still return a valid PropertyInfo.
Could you please confirm this, or is my understanding of method inlining wrong?

The JIT compiler operates at runtime and can't rewrite the metadata stored in the assembly, and reflection reads the assembly to access this metadata. So the JIT compiler has no impact on reflection.
EDIT:
Actually, there are a couple of places where the C# compiler itself "inlines" some information during compilation. For example, constants, enum values and default argument values are copied into the code that uses them, so reflection over that code will not show where the values came from. But this is definitely not related to your particular case.

Yeah, when I think about it more, I guess the only way inlining properties could break the correct working of INotifyPropertyChanged would be if you were using a reflection-based method like this:
public int Count
{
    get { return m_Count; }
    set
    {
        m_Count = value;
        GetCurrentPropertyNameUsingReflectionAndNotifyItChanged();
    }
}
If it is used the way you suggest, the metadata indeed exists in the assembly and the property name will be successfully taken from there.
Got us both thinking though.

I personally agree with @Sergey:
Considering that inlining happens on the JIT compiler side, but the metadata is generated before that, it shouldn't impact reflection in any way. By the way, good question, like it +1

Expression trees can't be in-lined anyway since they are a representation of the expression (abstract syntax tree) rather than the expression itself.
Delegates, even if they can be in-lined, will still carry the data about the method and target being called in their properties.
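Both points can be checked with a short sketch. Foo mirrors the class from the question; PropertyLookup and its members are hypothetical names. The expression tree is plain data describing the access to Bar, so the PropertyInfo is read from that data and the getter is never executed.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public class Foo
{
    public string Bar { get; set; }
}

public static class PropertyLookup
{
    // Reads the accessed property out of the expression tree without
    // ever invoking the getter, so JIT inlining cannot affect the result.
    public static PropertyInfo FromExpression<T>(Expression<Func<Foo, T>> lambda)
    {
        MemberExpression member = lambda.Body as MemberExpression;
        return member == null ? null : member.Member as PropertyInfo;
    }

    // A delegate still exposes the method it was created from.
    public static string MethodNameOf(Delegate d)
    {
        return d.Method.Name;
    }
}
```

For example, PropertyLookup.FromExpression(x => x.Bar) yields the PropertyInfo for Foo.Bar, and a delegate created from foo.ToString reports "ToString" with foo as its Target.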

Is using var actually slow? If so, why?

I am learning C# and .NET, and I frequently use the keyword var in my code. I got the idea from Eric Lippert and I like how it increases my code's maintainability.
I am wondering, though... much has been written in blogs about slow heap-located refs, yet I am not observing this myself. Is this actually slow? I am referring to slow compile times due to type inferencing.

You state:
I am referring to slow time for compile due to type 'inferencing'
This does not slow down the compiler. The compiler already has to know the result type of the expression, in order to check compatibility (direct or indirect) of the assignment. In some ways using this already-known type removes a few things (the potential to have to check for inheritance, interfaces and conversion operators, for example).
It also doesn't slow down the runtime; they are fully statically compiled, like regular C# variables (which they are).
In short... it doesn't.

'var' in C# is not a VARIANT like you're used to in VB. var is simply syntactic sugar the compiler lets you use to shorthand the type. The compiler figures out the type of the right-hand side of the expression and sets your variable to that type. It has no performance impact at all - just the same as if you'd typed the full type expression:
var x = new X();
is exactly the same as
X x = new X();
This seems like a trivial example, and it is. This really shines when the expression is much more complex or even 'unexpressable' (like anonymous types) and enumerables.
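One way to see that var only affects the source text is to ask the compiler which static type it assigned. This is a minimal sketch with hypothetical names (VarDemo, StaticTypeOf); the generic parameter T is inferred from the compile-time type of the argument, not from the runtime type of the object.

```csharp
using System;
using System.Collections.Generic;

public static class VarDemo
{
    // T is inferred from the compile-time type of the argument.
    public static Type StaticTypeOf<T>(T value) { return typeof(T); }

    public static Type[] Run()
    {
        var inferred = new List<string>();               // compiler picks List<string>
        List<string> spelledOut = new List<string>();    // same type, written by hand
        object widened = new List<string>();             // explicitly widened to object

        return new[]
        {
            StaticTypeOf(inferred),
            StaticTypeOf(spelledOut),
            StaticTypeOf(widened)
        };
    }
}
```

The first two entries are both List<string>, while the third is object, which shows that the declared type matters and that var simply declares the type of the initializer.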

Var is replaced at compile time with your actual variable type. Are you thinking of dynamic?

A "variant" is typeless, so access to state (or internal state conversion) always must go through two steps: (1) Determine the "real" internal type, and (2) Extract the relevant state from that "real" internal type.
You do not have that two-step process when you start with a typed object.
True, a "variant" thus has this additional overhead. The appropriate use is in those cases where you want the convenience of any-type for code simplicity, like is done with most scripting languages, or very high-level APIs. In those cases, the "variant" overhead is often not significant (since you are working at a high-level API anyway).
If you're talking about "var", though, then that is merely a convenient way for you to say, "Compiler, put the proper type here" because you don't want to do that work, and the compiler should be able to figure it out. In that case, "var" doesn't represent a (runtime) "variant", but rather a mere source-code specification syntax.

The compiler infers the type from the initializer expression.
var myString = "123"; is no different from string myString = "123";
Also, generally speaking, instances of reference types live on the heap and local variables of value types live on the stack, regardless of whether they're declared using var.
