Method analysis using Reflection and CodeDom

The context of this question is too elaborate to describe here and will likely adversely affect responses so I am not including it. I want to assert certain things about a method in a unit test. Some of these things are possible using reflection such as format of the try/finally block, class fields and method local variables, etc. I already know the type and method signature.
    protected override bool OnTest()
    {
        bool result = false;
        SomeCOMObject com = null; // System.__ComObject
        try
        {
        }
        finally
        {
            System.Runtime.InteropServices.Marshal.ReleaseComObject(com);
        }
        return (result);
    }
What I have not been able to achieve are things like:
Whether the method contains only a single return (result); statement and whether that statement is the last one in the function.
Whether all variables of type System.__ComObject have been manually de-referenced using System.Runtime.InteropServices.Marshal.ReleaseComObject(object) in the finally block.
Since some of these things are not possible using reflection, and source code text analysis is far from ideal, I turned to CodeDom but have not been able to get a grip on it. I have been told that creating expression trees from source code is not possible. Nor is it possible to create expression trees from the runtime type. If that is correct, how can I leverage CodeDom to achieve things in the list above?
I have used CodeDom in the past for code generation and compiling simple code classes to assemblies. But I have no idea how it could be used to analyze the internals of a method. Please advise.

In general, reflection built into programming languages provides no access to the content of functions. So you pretty much can't do this with reflection.
You might be able to do it if you have access to the byte-code equivalent, but byte code can't really answer questions about the syntax of the method, e.g., "how many return statements exist that return the same expression?".
If you want to reason about code, you need to reason about the source code. This means you need access to a parser, and often other useful facts ("What is the declaration of X?", "Are the types of X and Y compatible?", "Does data flow from X to Y?", etc.).
Roslyn provides some of this information. There are also commercial solutions (I have one).
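As a rough illustration of the Roslyn route, the sketch below parses a cut-down copy of the method from the question and runs the two checks purely on syntax. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the class name Sample and the variable names are made up for the example, and the COM local is declared as object only so the snippet has no other dependencies.

```csharp
// Assumes a reference to the Microsoft.CodeAnalysis.CSharp NuGet package.
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Cut-down copy of the method under test (illustrative only).
const string source = @"
class Sample
{
    protected bool OnTest()
    {
        bool result = false;
        object com = null;
        try { }
        finally
        {
            System.Runtime.InteropServices.Marshal.ReleaseComObject(com);
        }
        return (result);
    }
}";

var root = CSharpSyntaxTree.ParseText(source).GetRoot();
var method = root.DescendantNodes()
                 .OfType<MethodDeclarationSyntax>()
                 .Single(m => m.Identifier.Text == "OnTest");

// Check 1: exactly one return statement, and it is the last statement of the body.
// Note: DescendantNodes also visits nested lambdas and local functions.
var returns = method.DescendantNodes().OfType<ReturnStatementSyntax>().ToList();
bool singleTrailingReturn =
    returns.Count == 1 && method.Body?.Statements.LastOrDefault() == returns[0];

// Check 2: collect the argument text of every ReleaseComObject call inside a finally block.
var released = method.DescendantNodes()
    .OfType<TryStatementSyntax>()
    .Where(t => t.Finally != null)
    .SelectMany(t => t.Finally.Block.DescendantNodes().OfType<InvocationExpressionSyntax>())
    .Where(inv => inv.Expression.ToString().EndsWith("Marshal.ReleaseComObject"))
    .SelectMany(inv => inv.ArgumentList.Arguments)
    .Select(arg => arg.Expression.ToString())
    .ToHashSet();

Console.WriteLine($"Single trailing return: {singleTrailingReturn}");
Console.WriteLine($"Released in finally: {string.Join(", ", released)}");
```

This only matches text in the syntax tree. Deciding which locals are actually COM objects would need type information from a compilation's semantic model, which is not shown here.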

Related

Using F# quotations to detect code changes

I need to cache results from heavy calculations done by several different classes inheriting from the same base class. I was doing run-time caching by subclass name. Now I need to store the results on disk/DB to avoid long recalculations when I restart my app, but I need to invalidate cache if I change the code inside Calculate().
    type BaseCalculator() =
        let output = ConcurrentDictionary<string, Series<DateTime, float>>() // has public getter, the output is cached by the caller of Calculate() method
        abstract Calculate: unit->unit
    type FirstCalculator() =
        inherit BaseCalculator()
        override this.Calculate() = ... do heavy work here ...
From this question and answers I have learned that I could use [<ReflectedDefinition>] on my calculate method. But I have never worked with quotations myself before.
The questions are:
Could I use a hash code of quotations to uniquely identify the body of the method, or are there GUIDs or timestamps inside quotations?
Could [<ReflectedDefinition>] be applied to the abstract method and will it work if I override the method in C#, but call the method from F# runner? (I use reflection to load all implementations of the base class in all dlls in a folder)
Is there another simple and reliable method to detect code changes in an assembly automatically, without quotations? Using last modified time of an assembly file (dll) could work, but any change in an assembly will invalidate all calculators in the assembly. This could work if I separate stable and WIP calculators into separate assemblies, but more granularity is preferred.

Could I use a hash code of quotations to uniquely identify the body of the method, or are there GUIDs or timestamps inside quotations?
I think this would work. Have a look at FsPickler (which is used by MBrace to serialize quotations). I think it can give you a hash of a quotation too. But keep in mind that other parts of your code might change (e.g. another method in another type).
Could ReflectedDefinition be applied to the abstract method and will it work if I override the method in C#, but call the method from F# runner?
No, the attribute only works on F# methods compiled using the F# compiler.
Is there another simple and reliable method to detect code changes in an assembly automatically, without quotations?
I'm not sure. You could use GetMethodBody method and .NET reflection to get the IL, but this only gives you the immediate body - not including e.g. lambda functions, so changes that happen elsewhere will not be easy to detect.
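To make the GetMethodBody idea concrete, here is a minimal sketch, written in C# (the reflection calls are the same from F#). The helper name HashIl and the two sample lambdas are invented for the example; it hashes only the raw IL bytes of a single method, so it has exactly the limitation described above.

```csharp
using System;
using System.Reflection;
using System.Security.Cryptography;

// Hash the raw IL of one method body. Throws for methods without IL
// (for example abstract or extern methods).
static string HashIl(MethodInfo method)
{
    var body = method.GetMethodBody()
        ?? throw new InvalidOperationException($"{method.Name} has no IL body.");
    byte[] il = body.GetILAsByteArray() ?? Array.Empty<byte>();
    return Convert.ToHexString(SHA256.HashData(il));
}

// Two hypothetical "calculations" with different bodies.
Func<int, int> addOne = x => x + 1;
Func<int, int> timesTwo = x => x * 2;

Console.WriteLine(HashIl(addOne.Method));
Console.WriteLine(HashIl(timesTwo.Method));
```

The hash changes when the compiled body changes, but it says nothing about code the method calls. IL also embeds metadata tokens, which may shift when unrelated members are added to the assembly, so treat a changed hash as "possibly changed" rather than proof.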
A completely different approach that might work better would be to keep the calculations in FSX files in plain text and compile them on the fly using F# Compiler Service. Then you could just hash the source of the individual FSX files (on a per-computation basis).

I have found that Mono.Cecil is surprisingly easy to use, and with it I could hash all member bodies of a type, including base type.
This SO question was very helpful as a starting point.

Resharper search pattern to detect methods that return a value that is not used

I would like a resharper pattern to detect unhandled IDisposables if possible. If I have a method
IDisposable Subscribe(...){....}
and call it without assigning and using that IDisposable I would like to be told about it. I have tried the following pattern
;$expr$;
where expr is of type IDisposable. The following happens.
The first is detected correctly, but the second is an error, because simple assignment to an existing variable is also an expression in C#, whereas assignment using var is not. Is it possible to detect that the return value is assigned via structural search?
I notice that resharper has the following code quality options
but I'm guessing they are built with something more sophisticated than the structural search parser.

Unfortunately, this can't be done with structural search and replace. For one thing, there is no construct to match against the absence of something, so there's no way to match against a method invocation that does NOT have an assignment of its return value.
As you note, there are inspections that track pure functions that don't use the return value, and they're not implemented with SSR. You can make them apply to your methods by applying the [Pure] attribute to them. However, this is implying that the method actually is pure, i.e. has no side effects, so may be the wrong semantic in this instance.

pass properties as reference using Expressions

This post details workarounds for passing properties as references including using Expressions such as
public void StoreProperty(Expression<Func<T, object>> expr)
This approach is OK, and I note many frameworks appear to use it (e.g. AutoMapper, Autofac), as detailed in James Gregory's Introduction to static reflection, where he states:
The great thing about this is that if you change the name of a member inside a lambda, you’ll get a compile error if you haven’t updated all the references! No more hidden bugs.
Whilst I much prefer this approach, it is still not perfect, as you can pass any expression returning an object (or whatever your return value is), e.g.
x => x.Name) //fine
x => x.Name+"x") //runtime error
Is there currently any better way to reference the property (by locking down the Expression, or some other way)?
If not, how might a future version of C# lock down the Expression? For example, something like:
public void StoreProperty(Expression<Func<T, object>> expr) where expr.Member is PropertyInfo
Clarification: the above is only an example. I know this isn't currently supported; that's what I'm trying to discuss.

Well, I don't see how it would be possible.
I wouldn't say it's "outrageous" to suggest; it's simply not the intended use.
In fact this whole concept, while very useful in this situation, was not designed for this case. LINQ was intended as an extensible query language with a rich expression mechanism.
LINQ extends the language syntax to allow strong, type-safe expressions on the provided types, and this is why you can use it in this way to get a strongly typed expression.
In this case, you are creating a function that transforms one data type (the T object) into another, a general object.
If I were prohibited from writing something like p=>p.Name+"something", I would lose a lot of the inherent flexibility of the language.
For example, p=>p.X + p.Y, a query result that returns a sum of elements, would not be possible.
The solution you showed is designed to utilize a feature of LINQ: strong, type-safe property names. It provides an elegant way of using LINQ to solve a problem, but like any solution, it is open to possible abuse.
A developer who passes p=>p.Name+"something" did not grok the intended use of the solution, which is a matter for training.
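Since the compiler cannot restrict the shape of the lambda, one option is to validate the expression tree when it is received and fail fast. The following is only a sketch of that idea: GetProperty is a made-up helper name, and Exception and string stand in for the question's T.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Accepts only lambdas of the form x => x.Property and rejects anything else.
static PropertyInfo GetProperty<T>(Expression<Func<T, object>> expr)
{
    Expression body = expr.Body;

    // Value-type properties are boxed to object, which wraps the access in a Convert node.
    if (body is UnaryExpression { NodeType: ExpressionType.Convert } convert)
        body = convert.Operand;

    // The access must be a property read directly on the lambda parameter.
    if (body is MemberExpression { Member: PropertyInfo property, Expression: ParameterExpression })
        return property;

    throw new ArgumentException($"'{expr}' is not a simple property access.", nameof(expr));
}

Console.WriteLine(GetProperty<Exception>(e => e.Message).Name); // Message
Console.WriteLine(GetProperty<string>(s => s.Length).Name);     // Length

try
{
    GetProperty<Exception>(e => e.Message + "x");
}
catch (ArgumentException ex)
{
    Console.WriteLine(ex.Message);
}
```

This moves the failure from wherever the expression is eventually used to the call that receives it, but it remains a run-time check, not the compile-time constraint the question asks about.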

When should one use dynamic keyword in c# 4.0?

When should one use the dynamic keyword in C# 4.0? Is there a good example that explains its usage?

Dynamic should be used only when not using it is painful, as with the MS Office libraries. In all other cases it should be avoided, as compile-time type checking is beneficial. The following are good situations for using dynamic:
Calling JavaScript methods from Silverlight.
COM interop.
Maybe reading XML or JSON without creating custom classes.

How about this? Something I've been looking for and was wondering why it was so hard to do without 'dynamic'.
    interface ISomeData {}
    class SomeActualData : ISomeData {}
    class SomeOtherData : ISomeData {}
    interface ISomeInterface
    {
        void DoSomething(ISomeData data);
    }
    class SomeImplementation : ISomeInterface
    {
        public void DoSomething(ISomeData data)
        {
            dynamic specificData = data;
            HandleThis( specificData );
        }
        private void HandleThis(SomeActualData data)
        { /* ... */ }
        private void HandleThis(SomeOtherData data)
        { /* ... */ }
    }
You may need to catch the runtime exception (a RuntimeBinderException) and handle it as you see fit if there is no overload that takes the concrete type.
The equivalent without dynamic would be:
    public void DoSomething(ISomeData data)
    {
        if (data is SomeActualData)
            HandleThis((SomeActualData) data);
        else if (data is SomeOtherData)
            HandleThis((SomeOtherData) data);
        ...
        else
            throw new SomeRuntimeException();
    }

As described here, dynamic can make poorly designed external libraries easier to use: Microsoft provides the example of the Microsoft.Office.Interop.Excel assembly.
With dynamic, you can avoid a lot of annoying, explicit casting when using this assembly.
Also, contrary to @user2415376's answer, it is definitely not a way to handle interfaces, since polymorphism has been part of the language from the beginning!
You can use
ISomeData specificData = data;
instead of
dynamic specificData = data;
Plus it will make sure that you do not pass a wrong type of data object instead.

Check this blog post, which talks about the dynamic keyword in C#. Here is the gist:
The dynamic keyword is powerful indeed, it is irreplaceable when used with dynamic languages but can also be used for tricky situations while designing code where a statically typed object simply will not do.
Consider the drawbacks:
There is no compile-time type checking, this means that unless you have 100% confidence in your unit tests (cough) you are running a risk.
The dynamic keyword uses more CPU cycles than your old fashioned statically typed code due to the additional runtime overhead, if performance is important to your project (it normally is) don’t use dynamic.
Common mistakes include returning anonymous types wrapped in the dynamic keyword in public methods. Anonymous types are specific to an assembly, returning them across assembly (via the public methods) will throw an error, even though simple testing will catch this, you now have a public method which you can use only from specific places and that’s just bad design.
It’s a slippery slope, inexperienced developers itching to write something new and trying their best to avoid more classes (this is not necessarily limited to the inexperienced) will start using dynamic more and more if they see it in code, usually I would do a code analysis check for dynamic / add it in code review.

Here is a recent case in which using dynamic was a straightforward solution. This is essentially 'duck typing' in a COM interop scenario.
I had ported some code from VB6 into C#. This ported code still needed to call other methods on VB6 objects via COM interop.
The classes needing to be called looked like this:
    class A
    {
        void Foo() {...}
    }
    class B
    {
        void Foo() {...}
    }
(i.e., this would be the way the VB6 classes looked in C# via COM interop.)
Since A and B are independent of each other you can't cast one to the other, and they have no common base class (COM doesn't support that AFAIK and VB6 certainly didn't. And they did not implement a common interface - see below).
The original VB6 code which was ported did this:
    ' Obj must be either an A or a B
    Sub Bar(Obj As Object)
        Call Obj.Foo()
    End Sub
Now in VB6 you can pass things around as Object and the runtime will figure out if those objects have method Foo() or not. But in C# a literal translation would be:
    // Obj must be either an A or a B
    void Bar(object Obj)
    {
        Obj.Foo();
    }
Which will NOT work. It won't compile because object does not have a method called "Foo", and C# being typesafe won't allow this.
So the simple "fix" was to use dynamic, like this:
    // Obj must be either an A or a B
    void Bar(dynamic Obj)
    {
        Obj.Foo();
    }
This defers type checking until run time, but, assuming you've done it right, it works just fine.
I wouldn't endorse this for new code, but in this situation (which I think is not uncommon judging from other answers here) it was valuable.
Alternatives considered:
Using reflection to call Foo(). Probably would work, but more effort and less readable.
Modifying the VB6 library wasn't on the table here, but maybe there could be an approach to define A and B in terms of a common interface, which VB6 and COM would support. But using dynamic was much easier.
Note: This probably will turn out to be a temporary solution. Eventually if the remaining VB6 code is ported over then a proper class structure can be used.
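For comparison, the reflection alternative mentioned in the list above might look roughly like this. It is a sketch only: InvokeByName is a made-up helper, and StringBuilder and List<int> stand in for the unrelated classes A and B, with Clear() playing the role of Foo().

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Text;

// Late-bound call of a public parameterless instance method, looked up by name at run time.
static void InvokeByName(object target, string methodName)
{
    MethodInfo method = target.GetType().GetMethod(methodName, Type.EmptyTypes)
        ?? throw new MissingMethodException(target.GetType().FullName, methodName);
    method.Invoke(target, null);
}

// Two unrelated types that both happen to expose a parameterless Clear() method.
var builder = new StringBuilder("abc");
var list = new List<int> { 1, 2, 3 };

InvokeByName(builder, "Clear");
InvokeByName(list, "Clear");

Console.WriteLine($"{builder.Length} {list.Count}"); // 0 0
```

It works, but the method name becomes a string and the call site is noisier than Obj.Foo(), which is the readability cost noted above.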

I would like to quote an excerpt from the CodeProject post:
Why use dynamic?
In the statically typed world, dynamic gives developers a lot of rope to hang themselves with. When dealing with objects whose types can be known at compile time, you should avoid the dynamic keyword at all costs. Earlier, I said that my initial reaction was negative, so what changed my mind? To quote Margaret Atwood, context is all. When statically typing, dynamic doesn't make a stitch of sense. If you are dealing with an unknown or dynamic type, it is often necessary to communicate with it through Reflection. Reflective code is not easy to read, and has all the pitfalls of the dynamic type above. In this context, dynamic makes a lot of sense.
Some characteristics of the dynamic keyword are:
Dynamically typed: the type of the declared variable is resolved at run time rather than at compile time.
No need to initialize at the time of declaration, e.g.,
    dynamic str;
    str = "I am a string"; // Works fine and compiles
    str = 2; // Works fine and compiles
Errors are caught at run time.
IntelliSense is not available, since the type and its related methods and properties can be known at run time only. [https://www.codeproject.com/Tips/460614/Difference-between-var-and-dynamic-in-Csharp]

It is definitely a bad idea to use dynamic in all cases where it can be used. This is because your programs will lose the benefits of compile-time checking and they will also be much slower.

What is 'unverifiable code' and why is it bad?

I am designing a helper method that does lazy loading of certain objects for me, calling it looks like this:
    public override EDC2_ORM.Customer Customer {
        get { return LazyLoader.Get<EDC2_ORM.Customer>(
            CustomerId, _customerDao, ()=>base.Customer, (x)=>Customer = x); }
        set { base.Customer = value; }
    }
when I compile this code I get the following warning:
Warning 5 Access to member 'EDC2_ORM.Billing.Contract.Site' through a 'base' keyword from an anonymous method, lambda expression, query expression, or iterator results in unverifiable code. Consider moving the access into a helper method on the containing type.
What exactly is the complaint here and why is what I'm doing bad?

"base.Foo" for a virtual method will make a non-virtual call on the parent definition of the method "Foo". Starting with CLR 2.0, the CLR decided that a non-virtual call on a virtual method can be a potential security hole and restricted the scenarios in which it can be used. They limited it to making non-virtual calls to virtual methods within the same class hierarchy.
Lambda expressions put a kink in the process. Lambda expressions often generate a closure under the hood which is a completely separate class. So the code "base.Foo" will eventually become an expression in an entirely new class. This creates a verification exception with the CLR. Hence C# issues a warning.
Side Note: The equivalent code will work in VB. In VB for non-virtual calls to a virtual method, a method stub will be generated in the original class. The non-virtual call will be performed in this method. The "base.Foo" will be redirected into "StubBaseFoo" (generated name is different).
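The fix the warning itself suggests (moving the base access into a helper method on the containing type) can be sketched as follows. The types Contract and LazyContract and the Load method are hypothetical stand-ins for the question's classes and LazyLoader.Get, simplified so the snippet compiles on its own.

```csharp
using System;

var contract = new LazyContract();
Console.WriteLine(contract.Customer); // base customer

class Contract
{
    public virtual string Customer { get; set; } = "base customer";
}

class LazyContract : Contract
{
    public override string Customer
    {
        // The lambdas call ordinary helper methods, so no 'base' access appears inside them.
        get { return Load(() => GetBaseCustomer(), x => SetBaseCustomer(x)); }
        set { base.Customer = value; }
    }

    // The non-virtual 'base' calls stay in methods of the containing type.
    private string GetBaseCustomer() => base.Customer;
    private void SetBaseCustomer(string value) => base.Customer = value;

    // Hypothetical stand-in for LazyLoader.Get: read the value, store it back, return it.
    private static string Load(Func<string> getter, Action<string> setter)
    {
        string current = getter();
        setter(current);
        return current;
    }
}
```

This is essentially a hand-written version of the stub described for VB above: the lambda may end up in a compiler-generated class, but it only makes a normal call back into the derived type.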

I suspect the problem is that you're basically saying, "I don't want to use the most derived implementation of Customer - I want to use this particular one" - which you wouldn't be able to do normally. You're allowed to do it within a derived class, and for good reasons, but from other types you'd be violating encapsulation.
Now, when you use an anonymous method, lambda expression, query expression (which basically uses lambda expressions) or iterator block, sometimes the compiler has to create a new class for you behind the scenes. Sometimes it can get away with creating a new method in the same type for lambda expressions, but it depends on the context. Basically if any local variables are captured in the lambda expression, that needs a new class (or indeed multiple classes, depending on scope - it can get nasty). If the lambda expression only captures the this reference, a new instance method can be created for the lambda expression logic. If nothing is captured, a static method is fine.
So, although the C# compiler knows that really you're not violating encapsulation, the CLR doesn't - so it treats the code with some suspicion. If you're running under full trust, that's probably not an issue, but under other trust levels (I don't know the details offhand) your code won't be allowed to run.
Does that help?

copy/pasting from here:
Codesta: C#/CLR has 2 kinds of code, safe and unsafe. What is it trying to provide and how did this affect the virtual machine?
Peter Hallam: For C# the terms are safe and unsafe. The CLR uses the terms verifiable and unverifiable.
When running verifiable code the CLR can enforce security policies; the CLR can prevent verifiable code from doing things that it doesn't have permission to do. When running potentially malicious code, code that was downloaded from the internet for example, the CLR will only run verifiable code, and will ensure that the untrusted code doesn't access anything that it doesn't have permission to access.
The use of standard C style pointers creates unverifiable code. The CLR supports C style pointers natively. Once you've got a C style pointer you can read or write to any byte of memory in the process, so the runtime cannot enforce security policy. Actually it could but the performance penalty would make it impractical.
Now, that does not fully answer your question (i.e. WHY is this now unverifiable code), but at least it explains that "unverifiable" is the CLR-term for "unsafe". I assume that anonymous methods and base classes result in some funky pointer-magic internally.
By the way: I think the code snippet does not match the warning message. The code is about a Customer, while the warning is about Billing.Contract.Site. Is it possible to post the actual code the warning is generated for? Maybe you have something else in that code that would better explain why you get the warning.
