Replacing instructions in a method's MethodBody - c#

(First of all, this is a very lengthy post, but don't worry: I've already implemented all of it, I'm just asking your opinion, or possible alternatives.)
I'm having trouble implementing the following; I'd appreciate some help:
I get a Type as parameter.
I define a subclass using reflection. Notice that I don't intend to modify the original type, but create a new one.
I create a property per field of the original class, like so:
public class OriginalClass {
private int x;
}
public class Subclass : OriginalClass {
private int x;
public int X {
get { return x; }
set { x = value; }
}
}
For every method of the superclass, I create an analogous method in the subclass. The method's body must be the same except that I replace the instructions ldfld x with callvirt this.get_X, that is, instead of reading from the field directly I call the get accessor.
I'm having trouble with step 4. I know you're not supposed to manipulate code like this, but I really need to.
Here's what I've tried:
Attempt #1: Use Mono.Cecil. This would allow me to parse the body of the method into human-readable Instructions, and easily replace instructions. However, the original type isn't in a .dll file, so I can't find a way to load it with Mono.Cecil. Writing the type to a .dll, then load it, then modify it and write the new type to disk (which I think is the way you create a type with Mono.Cecil), and then load it seems like a huge overhead.
Attempt #2: Use Mono.Reflection. This would also allow me to parse the body into Instructions, but then I have no support for replacing instructions. I've implemented a very ugly and inefficient solution using Mono.Reflection, but it doesn't yet support methods that contain try-catch statements (although I guess I can implement this) and I'm concerned that there may be other scenarios in which it won't work, since I'm using the ILGenerator in a somewhat unusual way. Also, it's very ugly ;). Here's what I've done:
private void TransformMethod(MethodInfo methodInfo) {
// Create a method with the same signature.
ParameterInfo[] paramList = methodInfo.GetParameters();
Type[] args = new Type[paramList.Length];
for (int i = 0; i < args.Length; i++) {
args[i] = paramList[i].ParameterType;
}
MethodBuilder methodBuilder = typeBuilder.DefineMethod(
methodInfo.Name, methodInfo.Attributes, methodInfo.ReturnType, args);
ILGenerator ilGen = methodBuilder.GetILGenerator();
// Declare the same local variables as in the original method.
IList<LocalVariableInfo> locals = methodInfo.GetMethodBody().LocalVariables;
foreach (LocalVariableInfo local in locals) {
ilGen.DeclareLocal(local.LocalType);
}
// Get readable instructions.
IList<Instruction> instructions = methodInfo.GetInstructions();
// I first need to define labels for every instruction in case I
// later find a jump to that instruction. Once the instruction has
// been emitted I cannot label it, so I'll need to do it in advance.
// Since I'm doing a first pass on the method's body anyway, I could
// instead just create labels where they are truly needed, but for
// now I'm using this quick fix.
Dictionary<int, Label> labels = new Dictionary<int, Label>();
foreach (Instruction instr in instructions) {
labels[instr.Offset] = ilGen.DefineLabel();
}
foreach (Instruction instr in instructions) {
// Mark this instruction with a label, in case there's a branch
// instruction that jumps here.
ilGen.MarkLabel(labels[instr.Offset]);
// If this is the instruction that I want to replace (ldfld x)...
if (instr.OpCode == OpCodes.Ldfld) {
// ...get the get accessor for the accessed field (get_X())
// (I have the accessors in a dictionary; this isn't relevant),
MethodInfo safeReadAccessor = dataMembersSafeAccessors[((FieldInfo) instr.Operand).Name][0];
// ...instead of emitting the original instruction (ldfld x),
// emit a call to the get accessor,
ilGen.Emit(OpCodes.Callvirt, safeReadAccessor);
// Else (it's any other instruction), reemit the instruction, unaltered.
} else {
Reemit(instr, ilGen, labels);
}
}
}
And here comes the horrible, horrible Reemit method:
private void Reemit(Instruction instr, ILGenerator ilGen, Dictionary<int, Label> labels) {
// If the instruction doesn't have an operand, emit the opcode and return.
if (instr.Operand == null) {
ilGen.Emit(instr.OpCode);
return;
}
// Else (it has an operand)...
// If it's a branch instruction, retrieve the corresponding label (to
// which we want to jump), emit the instruction and return.
if (instr.OpCode.FlowControl == FlowControl.Branch) {
ilGen.Emit(instr.OpCode, labels[Int32.Parse(instr.Operand.ToString())]);
return;
}
// Otherwise, simply emit the instruction. I need to use the right
// Emit call, so I need to cast the operand to its type.
Type operandType = instr.Operand.GetType();
if (typeof(byte).IsAssignableFrom(operandType))
ilGen.Emit(instr.OpCode, (byte) instr.Operand);
else if (typeof(double).IsAssignableFrom(operandType))
ilGen.Emit(instr.OpCode, (double) instr.Operand);
else if (typeof(float).IsAssignableFrom(operandType))
ilGen.Emit(instr.OpCode, (float) instr.Operand);
else if (typeof(int).IsAssignableFrom(operandType))
ilGen.Emit(instr.OpCode, (int) instr.Operand);
... // you get the idea. This is a pretty long method, all like this.
}
Branch instructions are a special case because instr.Operand is SByte, but Emit expects an operand of type Label. Hence the need for the Dictionary labels.
As you can see, this is pretty horrible. What's more, it doesn't work in all cases, for instance with methods that contain try-catch statements, since I haven't emitted them using methods BeginExceptionBlock, BeginCatchBlock, etc, of ILGenerator. This is getting complicated. I guess I can do it: MethodBody has a list of ExceptionHandlingClause that should contain the necessary information to do this. But I don't like this solution anyway, so I'll save this as a last-resort solution.
Attempt #3: Go bare-back and just copy the byte array returned by MethodBody.GetILAsByteArray(), since I only want to replace a single instruction for another single instruction of the same size that produces the exact same result: it loads the same type of object on the stack, etc. So there won't be any labels shifting and everything should work exactly the same. I've done this, replacing specific bytes of the array and then calling MethodBuilder.CreateMethodBody(byte[], int), but I still get the same error with exceptions, and I still need to declare the local variables or I'll get an error... even when I simply copy the method's body and don't change anything.
So this is more efficient but I still have to take care of the exceptions, etc.
Sigh.
Here's the implementation of attempt #3, in case anyone is interested:
private void TransformMethod(MethodInfo methodInfo, Dictionary<string, MethodInfo[]> dataMembersSafeAccessors, ModuleBuilder moduleBuilder) {
ParameterInfo[] paramList = methodInfo.GetParameters();
Type[] args = new Type[paramList.Length];
for (int i = 0; i < args.Length; i++) {
args[i] = paramList[i].ParameterType;
}
MethodBuilder methodBuilder = typeBuilder.DefineMethod(
methodInfo.Name, methodInfo.Attributes, methodInfo.ReturnType, args);
ILGenerator ilGen = methodBuilder.GetILGenerator();
IList<LocalVariableInfo> locals = methodInfo.GetMethodBody().LocalVariables;
foreach (LocalVariableInfo local in locals) {
ilGen.DeclareLocal(local.LocalType);
}
byte[] rawInstructions = methodInfo.GetMethodBody().GetILAsByteArray();
IList<Instruction> instructions = methodInfo.GetInstructions();
int k = 0;
foreach (Instruction instr in instructions) {
if (instr.OpCode == OpCodes.Ldfld) {
MethodInfo safeReadAccessor = dataMembersSafeAccessors[((FieldInfo) instr.Operand).Name][0];
// Copy the opcode: Callvirt.
byte[] bytes = toByteArray(OpCodes.Callvirt.Value);
for (int m = 0; m < OpCodes.Callvirt.Size; m++) {
rawInstructions[k++] = bytes[put.Length - 1 - m];
}
// Copy the operand: the accessor's metadata token.
bytes = toByteArray(moduleBuilder.GetMethodToken(safeReadAccessor).Token);
for (int m = instr.Size - OpCodes.Ldfld.Size - 1; m >= 0; m--) {
rawInstructions[k++] = bytes[m];
}
// Skip this instruction (do not replace it).
} else {
k += instr.Size;
}
}
methodBuilder.CreateMethodBody(rawInstructions, rawInstructions.Length);
}
private static byte[] toByteArray(int intValue) {
byte[] intBytes = BitConverter.GetBytes(intValue);
if (BitConverter.IsLittleEndian)
Array.Reverse(intBytes);
return intBytes;
}
private static byte[] toByteArray(short shortValue) {
byte[] intBytes = BitConverter.GetBytes(shortValue);
if (BitConverter.IsLittleEndian)
Array.Reverse(intBytes);
return intBytes;
}
(I know it isn't pretty. Sorry. I put it quickly together to see if it would work.)
I don't have much hope, but can anyone suggest anything better than this?
Sorry about the extremely lengthy post, and thanks.
UPDATE #1: Aggh... I've just read this in the msdn documentation:
[The CreateMethodBody method] is
currently not fully supported. The
user cannot supply the location of
token fix ups and exception handlers.
I should really read the documentation before trying anything. Some day I'll learn...
This means option #3 can't support try-catch statements, which makes it useless for me. Do I really have to use the horrible #2? :/ Help! :P
UPDATE #2: I've successfully implemented attempt #2 with support for exceptions. It's quite ugly, but it works. I'll post it here when I refine the code a bit. It's not a priority, so it may be a couple of weeks from now. Just letting you know in case someone is interested in this.
Thanks for your suggestions.

I am trying to do a very similar thing. I have already tried your #1 approach, and I agree, that creates a huge overhead (I haven't measured it exactly though).
There is a DynamicMethod class which is - according to MSDN - "Defines and represents a dynamic method that can be compiled, executed, and discarded. Discarded methods are available for garbage collection."
Performance wise it sounds good.
With the ILReader library I could convert normal MethodInfo to DynamicMethod. When you look into the ConvertFrom method of the DyanmicMethodHelper class of the ILReader library you can find the code we'd need:
byte[] code = body.GetILAsByteArray();
ILReader reader = new ILReader(method);
ILInfoGetTokenVisitor visitor = new ILInfoGetTokenVisitor(ilInfo, code);
reader.Accept(visitor);
ilInfo.SetCode(code, body.MaxStackSize);
Theoretically this let's us modify the code of an existing method and run it as a dynamic method.
My only problem now is that Mono.Cecil does not allow us to save the bytecode of a method (at least I could not find the way to do it). When you download the Mono.Cecil source code it has a CodeWriter class to accomplish the task, but it is not public.
Other problem I have with this approach is that MethodInfo -> DynamicMethod transformation works only with static methods with ILReader. But this can be worked around.
The performance of the invocation depends on the method I used. I got following results after calling short method 10'000'000 times:
Reflection.Invoke - 14 sec
DynamicMethod.Invoke - 26 sec
DynamicMethod with delegates - 9 sec
Next thing I'm going to try is:
load original method with Cecil
modify the code in Cecil
strip off of the unmodified code from the assembly
save the assembly as MemoryStream instead of File
load the new assembly (from memory) with Reflection
call the method with reflection invoke if its a one-time call
generate DynamicMethod's delegates and store them if I want to call that method regularly
try to find out if I can unload the not necessary assemblies from memory (free up both MemoryStream and run-time assembly representation)
It sounds like a lot of work and it might not work, we'll see :)
I hope it helps, let me know what you think.

Have you tried PostSharp? I think that it already provides all you'd need out of the box via the On Field Access Aspect.

Maybe i unterstood something wrong, but if you like to extend, intercept an existing instance of a class you can take a look into Castle Dynamic Proxy.

You'd have to define the properties in the base class as virtual or abstract first.
Also,the fields then need to be modified to be 'protected' as opposed to 'private'.
Or am I misunderstanding something here?

What about using SetMethodBody instead of CreateMethodBody (this would be a variation of #3)? It's a new method introduced in .NET 4.5 and seems to support exceptions and fixups.

Basically you are copying the program text of the original class, and then making regular changes to it. Your current method is to copy the object code for the class and patch that. I can understand why that seems ugly; you're working at an extremely low level.
This seems like it would be easy to do with source-to-source program transformations.
This operates on the AST for the source code rather than the source code itself for precisions. See DMS Software Reengineering Toolkit for such a tool. DMS has a full C# 4.0 parser.

Related

Loading a ParameterInfo using IL Emit

I am currently using Calling a method of existing object using IL Emit as a guide, and I can already do whatever is asked in question. Now, I have an attribute added to a parameter and I want to load that particular parameter's attribute so that I can call a method inside that attribute.
I know it can be done by loading the MethodInfo and then getting the ParameterInfo and then getting attribute of that ParameterInfo in IL; I am simply trying to avoid writing that much IL.
Is there a way to load a parameter's attribute in IL just like it is mentioned in the linked post?
Edit:
I have a method with a signature like
Method([Attr] int Parameter)
and I want to load a method which is part of the Attr. I was just hoping I could load ParameterInfo (obtained using MethodInfo.GetParameters()) directly onto the stack. Turns out, LdToken doesn't really allow putting ParameterInfo. Only other way I can think of doing this is to load MethodInfo (LdToken supports that) and then use GetParameters() in IL to get an array of Parameters and then loop through them in IL one by one to get each one's Attribute (using .GetCustomAttribute(Type)) and then call the method on that attribute. Note that I don't have to get a field of an attribute, I need to call a method on that attribute.
K, third time lucky based on another interpretation of the question; here, we're assuming that we want to invoke methods on an attribute instance. We need to consider that attributes only kinda sorta exist at runtime - we can create synthetic instances of the attribute as represented by the metadata, but this isn't particularly cheap or fast, so we should ideally only do this once (the metadata isn't going to change, after all). This means we might want to store the instance as a field somewhere. This could be an instance field, or a static field - in many cases, a static field is fine. Consider:
using System;
using System.Reflection;
using System.Reflection.Emit;
public class SomethingAttribute : Attribute
{
public SomethingAttribute(string name)
=> Name = name;
public string Name { get; }
public void SomeMethod(int i)
{
Console.WriteLine($"SomeMethod: {Name}, {i}");
}
}
public static class P
{
public static void Foo([Something("Abc")] int x)
{
Console.WriteLine($"Foo: {x}");
}
public static void Main()
{
// get the attribute
var method = typeof(P).GetMethod(nameof(Foo));
var p = method.GetParameters()[0];
var attr = (SomethingAttribute)Attribute.GetCustomAttribute(p, typeof(SomethingAttribute));
// define an assembly, module and type to play with
AssemblyBuilder asm = AssemblyBuilder.DefineDynamicAssembly(new AssemblyName("Evil"), AssemblyBuilderAccess.Run);
var module = asm.DefineDynamicModule("Evil");
var type = module.DefineType("SomeType", TypeAttributes.Public);
// define a field where we'll store our synthesized attribute instance; avoid initonly, unless you're
// going to write code in the .cctor to initialize it; leaving it writable allows us to assign it via
// reflection
var attrField = type.DefineField("s_attr", typeof(SomethingAttribute), FieldAttributes.Static | FieldAttributes.Private);
// declare the method we're working on
var bar = type.DefineMethod("Bar", MethodAttributes.Static | MethodAttributes.Public, typeof(void), new[] { typeof(int) });
var il = bar.GetILGenerator();
// use the static field instance as our target to invoke the attribute method
il.Emit(OpCodes.Ldsfld, attrField); // the attribute instance
il.Emit(OpCodes.Ldarg_0); // the integer
il.EmitCall(OpCodes.Callvirt, typeof(SomethingAttribute).GetMethod(nameof(SomethingAttribute.SomeMethod)), null);
// and also call foo
il.Emit(OpCodes.Ldarg_0); // the integer
il.EmitCall(OpCodes.Call, typeof(P).GetMethod(nameof(P.Foo)), null);
il.Emit(OpCodes.Ret);
// complete the type
var actualType = type.CreateType();
// assign the synthetic attribute instance on the concrete type
actualType.GetField(attrField.Name, BindingFlags.Static | BindingFlags.NonPublic).SetValue(null, attr);
// get a delegate to the method
var func = (Action<int>)Delegate.CreateDelegate(typeof(Action<int>), actualType.GetMethod(bar.Name));
// and test it
for (int i = 0; i < 5; i++)
func(i);
}
}
Output from the final loop (for (int i = 0; i < 5; i++) func(i);):
SomeMethod: Abc, 0
Foo: 0
SomeMethod: Abc, 1
Foo: 1
SomeMethod: Abc, 2
Foo: 2
SomeMethod: Abc, 3
Foo: 3
SomeMethod: Abc, 4
Foo: 4
As a side note; in many ways it is easier to do this with expression-trees, since expression-trees have Expression.Constant which can be the attribute instance, and which is treated like a field internally. But you mentioned TypeBuilder, so I went this way :)
I know it can be done by loading the MethodInfo and then getting the ParameterInfo and then getting attribute of that ParameterInfo in IL; I am simply trying to avoid writing that much IL.
Yeah, that's pretty much it, in IL; IL is powerful, but is not particularly terse or simple. Just like in the linked post, you'd end up loading the parameter (ldarg or ldarga, maybe some .s), then depending on whether the member is a field or a property, using either ldfld or callvirt on the property getter. About 3 lines, so not huge; perhaps something like:
static void EmitLoadPropertyOrField(ILGenerator il, Type type, string name)
{
// assumes that the target *reference* has already been loaded; only
// implements reference-type semantics currently
var member = type.GetMember(name, BindingFlags.Public | BindingFlags.Instance).Single();
switch (member)
{
case FieldInfo field:
il.Emit(OpCodes.Ldfld, field);
break;
case PropertyInfo prop:
il.EmitCall(OpCodes.Callvirt, prop.GetGetMethod(), null);
break;
default:
throw new InvalidOperationException();
}
}
If you're trying to save complexity (or are sick of InvalidProgramException), another viable approach is to use expression trees; then you have many more convenience features, but in particular things like:
var p = Expression.Parameter(typeof(Foo), "x");
var name = Expression.PropertyOrField(p, "Name");
// ...
var lambda = Expression.Lambda<YourDelegateType>(body, p);
var del = lambda.Compile();
Note that expression trees cannot be used in all scenarios; for example, they can't really be used with TypeBuilder; conversely, though, they can do things that IL-emit can't - for example, in AOT scenarios where IL-emit is prohibited, they can work as a runtime reflection evaluated tree, so it still works (but: slower). They add some additional processing (building and then parsing the tree), but: they are simpler than IL-emit, especially for debugging.
With the clarification that by attribute you really did mean a .NET attribute (not a field or property), this becomes simpler in many ways; consider:
class SomethingAttribute : Attribute
{
public SomethingAttribute(string name)
=> Name = name;
public string Name { get; }
}
static class P
{
public static void Foo([Something("Abc")] int x) {}
static void Main()
{
var method = typeof(P).GetMethod(nameof(Foo));
var p = method.GetParameters()[0];
var attr = (SomethingAttribute)Attribute.GetCustomAttribute(
p, typeof(SomethingAttribute));
string name = attr?.Name;
// you can now "ldstr {name}" etc
}
}
The important point here is that the attribute isn't going to change at runtime - it is pure metadata; so, we can load it once with reflection when we are processing the model, then just emit the processed data, i.e. the line
// you can now "ldstr {name}" etc
For future reference, I actually went ahead and loaded ParameterInfo using IL only. Marc's solution was good, but it quickly became infeasible after the number of parameter based attributes increased. We plan to use attributes to contain some state specific information, we would have to use a ton of reflection from outside the type to find and assign correct attribute to the field. Overall, it would have become quite hectic to deal with it.
To load ParameterInfo using IL
IL.Emit(OpCodes.Ldtoken, Method); // Method is MethodInfo as we can't use GetParameters with MethodBuilder
IL.Emit(OpCodes.Call, typeof(MethodBase).GetMethod(nameof(MethodBase.GetMethodFromHandle), new[] { typeof(RuntimeMethodHandle) }));
IL.Emit(OpCodes.Callvirt, typeof(MethodInfo).GetMethod(nameof(MethodInfo.GetParameters)));
var ILparameters = IL.DeclareLocal(typeof(ParameterInfo[]));
IL.Emit(OpCodes.Stloc, ILparameters);
that loads ParameterInfo onto stack and then stores it into a LocalBuilder called ILparameters. Note that it is an array. Items of this array can then be accessed like
IL.Emit(OpCodes.Ldloc, ILparameters);
IL.Emit(OpCodes.Ldc_I4, Number); // Replace with Ldc_I4_x if number < 8
IL.Emit(OpCodes.Ldelem_Ref);
I prefer to create two helper functions for the two code pieces. It works pretty well.

Discard feature significance in C# 7.0?

While going through new C# 7.0 features, I stuck up with discard feature. It says:
Discards are local variables which you can assign but cannot read
from. i.e. they are “write-only” local variables.
and, then, an example follows:
if (bool.TryParse("TRUE", out bool _))
What is real use case when this will be beneficial? I mean what if I would have defined it in normal way, say:
if (bool.TryParse("TRUE", out bool isOK))
The discards are basically a way to intentionally ignore local variables which are irrelevant for the purposes of the code being produced. It's like when you call a method that returns a value but, since you are interested only in the underlying operations it performs, you don't assign its output to a local variable defined in the caller method, for example:
public static void Main(string[] args)
{
// I want to modify the records but I'm not interested
// in knowing how many of them have been modified.
ModifyRecords();
}
public static Int32 ModifyRecords()
{
Int32 affectedRecords = 0;
for (Int32 i = 0; i < s_Records.Count; ++i)
{
Record r = s_Records[i];
if (String.IsNullOrWhiteSpace(r.Name))
{
r.Name = "Default Name";
++affectedRecords;
}
}
return affectedRecords;
}
Actually, I would call it a cosmetic feature... in the sense that it's a design time feature (the computations concerning the discarded variables are performed anyway) that helps keeping the code clear, readable and easy to maintain.
I find the example shown in the link you provided kinda misleading. If I try to parse a String as a Boolean, chances are I want to use the parsed value somewhere in my code. Otherwise I would just try to see if the String corresponds to the text representation of a Boolean (a regular expression, for example... even a simple if statement could do the job if casing is properly handled). I'm far from saying that this never happens or that it's a bad practice, I'm just saying it's not the most common coding pattern you may need to produce.
The example provided in this article, on the opposite, really shows the full potential of this feature:
public static void Main()
{
var (_, _, _, pop1, _, pop2) = QueryCityDataForYears("New York City", 1960, 2010);
Console.WriteLine($"Population change, 1960 to 2010: {pop2 - pop1:N0}");
}
private static (string, double, int, int, int, int) QueryCityDataForYears(string name, int year1, int year2)
{
int population1 = 0, population2 = 0;
double area = 0;
if (name == "New York City")
{
area = 468.48;
if (year1 == 1960) {
population1 = 7781984;
}
if (year2 == 2010) {
population2 = 8175133;
}
return (name, area, year1, population1, year2, population2);
}
return ("", 0, 0, 0, 0, 0);
}
From what I can see reading the above code, it seems that the discards have a higher sinergy with other paradigms introduced in the most recent versions of C# like tuples deconstruction.
For Matlab programmers, discards are far from being a new concept because the programming language implements them since very, very, very long time (probably since the beginning, but I can't say for sure). The official documentation describes them as follows (link here):
Request all three possible outputs from the fileparts function:
helpFile = which('help');
[helpPath,name,ext] = fileparts('C:\Path\data.txt');
The current workspace now contains three variables from fileparts: helpPath, name, and ext. In this case, the variables are small. However, some functions return results that use much more memory. If you do not need those variables, they waste space on your system.
Ignore the first output using a tilde (~):
[~,name,ext] = fileparts(helpFile);
The only difference is that, in Matlab, inner computations for discarded outputs are normally skipped because output arguments are flexible and you can know how many and which one of them have been requested by the caller.
I have seen discards used mainly against methods which return Task<T> but you don't want to await the output.
So in the example below, we don't want to await the output of SomeOtherMethod() so we could do something like this:
//myClass.cs
public async Task<bool> Example() => await SomeOtherMethod()
// example.cs
Example();
Except this will generate the following warning:
CS4014 Because this call is not awaited, execution of the
current method continues before the call is completed. Consider
applying the 'await' operator to the result of the call.
To mitigate this warning and essentially ensure the compiler that we know what we are doing, you can use a discard:
//myClass.cs
public async Task<bool> Example() => await SomeOtherMethod()
// example.cs
_ = Example();
No more warnings.
To add another use case to the above answers.
You can use a discard in conjunction with a null coalescing operator to do a nice one-line null check at the start of your functions:
_ = myParam ?? throw new MyException();
Many times I've done code along these lines:
TextBox.BackColor = int32.TryParse(TextBox.Text, out int32 _) ? Color.LightGreen : Color.Pink;
Note that this would be part of a larger collection of data, not a standalone thing. The idea is to provide immediate feedback on the validity of each field of the data they are entering.
I use light green and pink rather than the green and red one would expect--the latter colors are dark enough that the text becomes a bit hard to read and the meaning of the lighter versions is still totally obvious.
(In some cases I also have a Color.Yellow to flag something which is not valid but neither is it totally invalid. Say the parser will accept fractions and the field currently contains "2 1". That could be part of "2 1/2" so it's not garbage, but neither is it valid.)
Discard pattern can be used with a switch expression as well.
string result = shape switch
{
Rectangule r => $"Rectangule",
Circle c => $"Circle",
_ => "Unknown Shape"
};
For a list of patterns with discards refer to this article: Discards.
Consider this:
5 + 7;
This "statement" performs an evaluation but is not assigned to something. It will be immediately highlighted with the CS error code CS0201.
// Only assignment, call, increment, decrement, and new object expressions can be used as a statement
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/compiler-messages/cs0201?f1url=%3FappId%3Droslyn%26k%3Dk(CS0201)
A discard variable used here will not change the fact that it is an unused expression, rather it will appear to the compiler, to you, and others reviewing your code that it was intentionally unused.
_ = 5 + 7; //acceptable
It can also be used in lambda expressions when having unused parameters:
builder.Services.AddSingleton<ICommandDispatcher>(_ => dispatcher);

What is it that makes Enum.HasFlag so slow?

I was doing some speed tests and I noticed that Enum.HasFlag is about 16 times slower than using the bitwise operation.
Does anyone know the internals of Enum.HasFlag and why it is so slow? I mean twice as slow wouldn't be too bad but it makes the function unusable when its 16 times slower.
In case anyone is wondering, here is the code I am using to test its speed.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
namespace app
{
public class Program
{
[Flags]
public enum Test
{
Flag1 = 1,
Flag2 = 2,
Flag3 = 4,
Flag4 = 8
}
static int num = 0;
static Random rand;
static void Main(string[] args)
{
int seed = (int)DateTime.UtcNow.Ticks;
var st1 = new SpeedTest(delegate
{
Test t = Test.Flag1;
t |= (Test)rand.Next(1, 9);
if (t.HasFlag(Test.Flag4))
num++;
});
var st2 = new SpeedTest(delegate
{
Test t = Test.Flag1;
t |= (Test)rand.Next(1, 9);
if (HasFlag(t , Test.Flag4))
num++;
});
rand = new Random(seed);
st1.Test();
rand = new Random(seed);
st2.Test();
Console.WriteLine("Random to prevent optimizing out things {0}", num);
Console.WriteLine("HasFlag: {0}ms {1}ms {2}ms", st1.Min, st1.Average, st1.Max);
Console.WriteLine("Bitwise: {0}ms {1}ms {2}ms", st2.Min, st2.Average, st2.Max);
Console.ReadLine();
}
static bool HasFlag(Test flags, Test flag)
{
return (flags & flag) != 0;
}
}
[DebuggerDisplay("Average = {Average}")]
class SpeedTest
{
public int Iterations { get; set; }
public int Times { get; set; }
public List<Stopwatch> Watches { get; set; }
public Action Function { get; set; }
public long Min { get { return Watches.Min(s => s.ElapsedMilliseconds); } }
public long Max { get { return Watches.Max(s => s.ElapsedMilliseconds); } }
public double Average { get { return Watches.Average(s => s.ElapsedMilliseconds); } }
public SpeedTest(Action func)
{
Times = 10;
Iterations = 100000;
Function = func;
Watches = new List<Stopwatch>();
}
public void Test()
{
Watches.Clear();
for (int i = 0; i < Times; i++)
{
var sw = Stopwatch.StartNew();
for (int o = 0; o < Iterations; o++)
{
Function();
}
sw.Stop();
Watches.Add(sw);
}
}
}
}
Results:
HasFlag: 52ms 53.6ms 55ms
Bitwise: 3ms 3ms 3ms
Does anyone know the internals of Enum.HasFlag and why it is so slow?
The actual check is just a simple bit check in Enum.HasFlag - it's not the problem here. That being said, it is slower than your own bit check...
There are a couple of reasons for this slowdown:
First, Enum.HasFlag does an explicit check to make sure that the type of the enum and the type of the flag are both the same type, and from the same Enum. There is some cost in this check.
Secondly, there is an unfortunate box and unbox of the value during a conversion to UInt64 that occurs inside of HasFlag. This is, I believe, due to the requirement that Enum.HasFlag work with all enums, regardless of the underlying storage type.
That being said, there is a huge advantage to Enum.HasFlag - it's reliable, clean, and makes the code very obvious and expressive. For the most part, I feel that this makes it worth the cost - but if you're using this in a very performance critical loop, it may be worth doing your own check.
Decompiled code of Enum.HasFlags() looks like this:
public bool HasFlag(Enum flag)
{
if (!base.GetType().IsEquivalentTo(flag.GetType()))
{
throw new ArgumentException(Environment.GetResourceString("Argument_EnumTypeDoesNotMatch", new object[] { flag.GetType(), base.GetType() }));
}
ulong num = ToUInt64(flag.GetValue());
return ((ToUInt64(this.GetValue()) & num) == num);
}
If I were to guess, I would say that checking the type was what's slowing it down most.
Note that in recent versions of .Net Core, this has been improved and Enum.HasFlag compiles to the same code as using bitwise comparisons.
The performance penalty due to boxing discussed on this page also affects the public .NET functions Enum.GetValues and Enum.GetNames, which both forward to (Runtime)Type.GetEnumValues and (Runtime)Type.GetEnumNames respectively.
All of these functions use a (non-generic) Array as a return type--which is not so bad for the names (since String is a reference type)--but is quite inappropriate for the ulong[] values.
Here's a peek at the offending code (.NET 4.7):
public override Array /* RuntimeType.*/ GetEnumValues()
{
if (!this.IsEnum)
throw new ArgumentException();
ulong[] values = Enum.InternalGetValues(this);
Array array = Array.UnsafeCreateInstance(this, values.Length);
for (int i = 0; i < values.Length; i++)
{
var obj = Enum.ToObject(this, values[i]); // ew. boxing.
array.SetValue(obj, i); // yuck
}
return array; // Array of object references, bleh.
}
We can see that prior to doing the copy, RuntimeType goes back again to System.Enum to get an internal array, a singleton which is cached, on demand, for each specific Enum. Notice also that this version of the values array does use the proper strong signature, ulong[].
Here's the .NET function (again we're back in System.Enum now). There's a similar function for getting the names (not shown).
internal static ulong[] InternalGetValues(RuntimeType enumType) =>
GetCachedValuesAndNames(enumType, false).Values;
See the return type? This looks like a function we'd like to use... But first consider that a second reason that .NET re-copys the array each time (as you saw above) is that .NET must ensure that each caller gets an unaltered copy of the original data, given that a malevolent coder could change her copy of the returned Array, introducing a persistent corruption. Thus, the re-copying precaution is especially intended to protect the cached internal master copy.
If you aren't worried about that risk, perhaps because you feel confident you won't accidentally change the array, or maybe just to eke-out a few cycles of (what's surely premature) optimization, it's simple to fetch the internal cached array copy of the names or values for any Enum:
→ The following two functions comprise the sum contribution of this article ←
→ (but see edit below for improved version) ←
static ulong[] GetEnumValues<T>() where T : struct =>
(ulong[])typeof(System.Enum)
.GetMethod("InternalGetValues", BindingFlags.Static | BindingFlags.NonPublic)
.Invoke(null, new[] { typeof(T) });
static String[] GetEnumNames<T>() where T : struct =>
(String[])typeof(System.Enum)
.GetMethod("InternalGetNames", BindingFlags.Static | BindingFlags.NonPublic)
.Invoke(null, new[] { typeof(T) });
Note that the generic constraint on T isn't fully sufficient for guaranteeing Enum. For simplicity, I left off checking any further beyond struct, but you might want to improve on that. Also for simplicity, this (ref-fetches and) reflects directly off the MethodInfo every time rather than trying to build and cache a Delegate. The reason for this is that creating the proper delegate with a first argument of non-public type RuntimeType is tedious. A bit more on this below.
First, I'll wrap up with usage examples:
var values = GetEnumValues<DayOfWeek>();
var names = GetEnumNames<DayOfWeek>();
and debugger results:
'values' ulong[7]
[0] 0
[1] 1
[2] 2
[3] 3
[4] 4
[5] 5
[6] 6
'names' string[7]
[0] "Sunday"
[1] "Monday"
[2] "Tuesday"
[3] "Wednesday"
[4] "Thursday"
[5] "Friday"
[6] "Saturday"
So I mentioned that the "first argument" of Func<RuntimeType,ulong[]> is annoying to reflect over. However, because this "problem" arg happens to be first, there's a cute workaround where you can bind each specific Enum type as a Target of its own delegate, where each is then reduced to Func<ulong[]>.)
Clearly, its pointless to make any of those delegates, since each would just be a function that always return the same value... but the same logic seems to apply, perhaps less obviously, to the original situation as well (i.e., Func<RuntimeType,ulong[]>). Although we do get by with a just one delegate here, you'd never really want to call it more than once per Enum type. Anyway, all of this leads to a much better solution, which is included in the edit below.
[edit:]Here's a slightly more elegant version of the same thing. If you will be calling the functions repeatedly for the same Enum type, the version shown here will only use reflection one time per Enum type. It saves the results in a locally-accessible cache for extremely rapid access subsequently.
static class enum_info_cache<T> where T : struct
{
static _enum_info_cache()
{
values = (ulong[])typeof(System.Enum)
.GetMethod("InternalGetValues", BindingFlags.Static | BindingFlags.NonPublic)
.Invoke(null, new[] { typeof(T) });
names = (String[])typeof(System.Enum)
.GetMethod("InternalGetNames", BindingFlags.Static | BindingFlags.NonPublic)
.Invoke(null, new[] { typeof(T) });
}
public static readonly ulong[] values;
public static readonly String[] names;
};
The two functions become trivial:
static ulong[] GetEnumValues<T>() where T : struct => enum_info_cache<T>.values;
static String[] GetEnumNames<T>() where T : struct => enum_info_cache<T>.names;
The code shown here illustrates a pattern of combining three specific tricks that seem to mutually result in an unusualy elegant lazy caching scheme. I've found the particular technique to have surprisingly wide application.
using a generic static class to cache independent copies of the arrays for each distinct Enum. Notably, this happens automatically and on demand;
related to this, the loader lock guarantees unique atomic initialization and does this without the clutter of conditional checking constructs. We can also protect static fields with readonly (which, for obvious reasons, typically can't be used with other lazy/deferred/demand methods);
finally, we can capitalize on C# type inference to automatically map the generic function (entry point) into its respective generic static class, so that the demand caching is ultimately even driven implicitly (viz., the best code is the code that isn't there--since it can never have bugs)
You probably noticed that the particular example shown here doesn't really illustrate point (3) very well. Rather than relying on type inference, the void-taking function has to manually propagate forward the type argument T. I didn't choose to expose these simple functions such that there would be an opportunity to show how C# type inference makes the overall technique shine...
However, you can imagine that when you do combine a static generic function that can infer its type argument(s)--i.e., so you don't even have to provide them at the call site--then it gets quite powerful.
The key insight is that, while generic functions have the full type-inference capability, generic classes do not, that is, the compiler will never infer T if you try to call the first of the following lines. But we can still get fully inferred access to a generic class, and all the benefits that entails, by traversing into them via generic function implicit typing (last line):
int t = 4;
typed_cache<int>.MyTypedCachedFunc(t); // no inference from 't', explicit type required
MyTypedCacheFunc<int>(t); // ok, (but redundant)
MyTypedCacheFunc(t); // ok, full inference
Designed well, inferred typing can effortlessly launch you into the appropriate automatically demand-cached data and behaviors, customized for each type (recall points 1. and 2). As noted, I find the approach useful, especially considering its simplicity.
The JITter ought to be inlining this as a simple bitwise operation. The JITter is aware enough to custom-handle even certain framework methods (via MethodImplOptions.InternalCall I think?) but HasFlag seems to have escaped Microsoft's serious attention.

Which one is better to use and why in c#

Which one is better to use?
int xyz = 0;
OR
int xyz= default(int);
int xyz = 0;
Why make people think more than necessary? default is useful with generic code, but here it doesn't add anything. You should also think if you're initializing it in the right place, with a meaningful value. Sometimes you see, with stack variables, code like:
int xyz = 0;
if(someCondition)
{
// ...
xyz = 1;
// ...
}
else
{
// ...
xyz = 2;
// ...
}
In such cases, you should delay initialization until you have the real value. Do:
int xyz;
if(someCondition)
{
// ...
xyz = 1;
// ...
}
else
{
// ...
xyz = 2;
// ...
}
The compiler ensures you don't use an uninitialized stack variable. In some cases, you have to use meaningless values because the compiler can't know code will never execute (due to an exception, call to Exit, etc.). This is the exception (no pun intended) to the rule.
It depends what you want to achieve.
I would prefer
int xyz = 0;
as I believe it is more readable and not confusing.
default keyword is mostly suitable for Generics.
The purpose of the default operator is to provide you with the default value for a type, but it was added primarily to allow generics to have a valid value for values declared to be of its generic type arguments.
I don't have hard evidence but I suspect the compiler will emit the same code for both in your specific case.
However, here there is a legitimate use of default:
public T Aggregate<T>(IEnumerable<T> collection, Func<T, T, T> aggregation)
{
T result = default(T);
foreach (T element in collection)
result = aggregation(result, element);
return result;
}
Without default, the above code would need some hacks in order to compile and function properly.
So use the first, set it to 0.
no performance difference between your codes. to see clearly use int xyz = 0;
Given that the emitted CIL is identical (you get
IL_0001: ldc.i4.0
IL_0002: stloc.0
in both cases), the rule is to choose the one you feel better communicates the intent of the code. Normally, questions of feeling are subjective and hard-to-decide; in this case, however, were I the code reviewer, I would have to be presented with an extremely compelling reason to accept what looks at first sight to be an entirely superfluous use of default().
int xyz = default(int);
I like this way when working with Generics bcoz it give you flexibility to get default of whatever type you are working with.
int xyz=0;
On the other hand this is easy and fast and obviously won't work in generic cases.
Both have their pros and cons..
Regards,
int xyz = 0 is moreclear, defaut is generally used with generics
the best is
int xyz;
because you can't access to uninitialized variable.

How can I evaluate C# code dynamically?

I can do an eval("something()"); to execute the code dynamically in JavaScript. Is there a way for me to do the same thing in C#?
An example of what I am trying to do is: I have an integer variable (say i) and I have multiple properties by the names: "Property1", "Property2", "Property3", etc.
Now, I want to perform some operations on the " Propertyi " property depending on the value of i.
This is really simple with Javascript. Is there any way to do this with C#?
Using the Roslyn scripting API (more samples here):
// add NuGet package 'Microsoft.CodeAnalysis.Scripting'
using Microsoft.CodeAnalysis.CSharp.Scripting;
await CSharpScript.EvaluateAsync("System.Math.Pow(2, 4)") // returns 16
You can also run any piece of code:
var script = await CSharpScript.RunAsync(#"
class MyClass
{
public void Print() => System.Console.WriteLine(1);
}")
And reference the code that was generated in previous runs:
await script.ContinueWithAsync("new MyClass().Print();");
DISCLAIMER: This answer was written back in 2008. The landscape has changed drastically since then.
Look at the other answers on this page, especially the one detailing Microsoft.CodeAnalysis.CSharp.Scripting.
Rest of answer will be left as it was originally posted but is no longer accurate.
Unfortunately, C# isn't a dynamic language like that.
What you can do, however, is to create a C# source code file, full with class and everything, and run it through the CodeDom provider for C# and compile it into an assembly, and then execute it.
This forum post on MSDN contains an answer with some example code down the page somewhat:
create a anonymous method from a string?
I would hardly say this is a very good solution, but it is possible anyway.
What kind of code are you going to expect in that string? If it is a minor subset of valid code, for instance just math expressions, it might be that other alternatives exists.
Edit: Well, that teaches me to read the questions thoroughly first. Yes, reflection would be able to give you some help here.
If you split the string by the ; first, to get individual properties, you can use the following code to get a PropertyInfo object for a particular property for a class, and then use that object to manipulate a particular object.
String propName = "Text";
PropertyInfo pi = someObject.GetType().GetProperty(propName);
pi.SetValue(someObject, "New Value", new Object[0]);
Link: PropertyInfo.SetValue Method
Not really. You can use reflection to achieve what you want, but it won't be nearly as simple as in Javascript. For example, if you wanted to set the private field of an object to something, you could use this function:
protected static void SetField(object o, string fieldName, object value)
{
FieldInfo field = o.GetType().GetField(fieldName, BindingFlags.Instance | BindingFlags.NonPublic);
field.SetValue(o, value);
}
This is an eval function under c#. I used it to convert anonymous functions (Lambda Expressions) from a string.
Source: http://www.codeproject.com/KB/cs/evalcscode.aspx
public static object Eval(string sCSCode) {
CSharpCodeProvider c = new CSharpCodeProvider();
ICodeCompiler icc = c.CreateCompiler();
CompilerParameters cp = new CompilerParameters();
cp.ReferencedAssemblies.Add("system.dll");
cp.ReferencedAssemblies.Add("system.xml.dll");
cp.ReferencedAssemblies.Add("system.data.dll");
cp.ReferencedAssemblies.Add("system.windows.forms.dll");
cp.ReferencedAssemblies.Add("system.drawing.dll");
cp.CompilerOptions = "/t:library";
cp.GenerateInMemory = true;
StringBuilder sb = new StringBuilder("");
sb.Append("using System;\n" );
sb.Append("using System.Xml;\n");
sb.Append("using System.Data;\n");
sb.Append("using System.Data.SqlClient;\n");
sb.Append("using System.Windows.Forms;\n");
sb.Append("using System.Drawing;\n");
sb.Append("namespace CSCodeEvaler{ \n");
sb.Append("public class CSCodeEvaler{ \n");
sb.Append("public object EvalCode(){\n");
sb.Append("return "+sCSCode+"; \n");
sb.Append("} \n");
sb.Append("} \n");
sb.Append("}\n");
CompilerResults cr = icc.CompileAssemblyFromSource(cp, sb.ToString());
if( cr.Errors.Count > 0 ){
MessageBox.Show("ERROR: " + cr.Errors[0].ErrorText,
"Error evaluating cs code", MessageBoxButtons.OK,
MessageBoxIcon.Error );
return null;
}
System.Reflection.Assembly a = cr.CompiledAssembly;
object o = a.CreateInstance("CSCodeEvaler.CSCodeEvaler");
Type t = o.GetType();
MethodInfo mi = t.GetMethod("EvalCode");
object s = mi.Invoke(o, null);
return s;
}
I have written an open source project, Dynamic Expresso, that can convert text expression written using a C# syntax into delegates (or expression tree). Expressions are parsed and transformed into Expression Trees without using compilation or reflection.
You can write something like:
var interpreter = new Interpreter();
var result = interpreter.Eval("8 / 2 + 2");
or
var interpreter = new Interpreter()
.SetVariable("service", new ServiceExample());
string expression = "x > 4 ? service.SomeMethod() : service.AnotherMethod()";
Lambda parsedExpression = interpreter.Parse(expression,
new Parameter("x", typeof(int)));
parsedExpression.Invoke(5);
My work is based on Scott Gu article http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx .
All of that would definitely work. Personally, for that particular problem, I would probably take a little different approach. Maybe something like this:
class MyClass {
public Point point1, point2, point3;
private Point[] points;
public MyClass() {
//...
this.points = new Point[] {point1, point2, point3};
}
public void DoSomethingWith(int i) {
Point target = this.points[i+1];
// do stuff to target
}
}
When using patterns like this, you have to be careful that your data is stored by reference and not by value. In other words, don't do this with primitives. You have to use their big bloated class counterparts.
I realized that's not exactly the question, but the question has been pretty well answered and I thought maybe an alternative approach might help.
I don't now if you absolutely want to execute C# statements, but you can already execute Javascript statements in C# 2.0. The open-source library Jint is able to do it. It's a Javascript interpreter for .NET. Pass a Javascript program and it will run inside your application. You can even pass C# object as arguments and do automation on it.
Also if you just want to evaluate expression on your properties, give a try to NCalc.
You can use reflection to get the property and invoke it. Something like this:
object result = theObject.GetType().GetProperty("Property" + i).GetValue(theObject, null);
That is, assuming the object that has the property is called "theObject" :)
You also could implement a Webbrowser, then load a html-file wich contains javascript.
Then u go for the document.InvokeScript Method on this browser. The return Value of the eval function can be catched and converted into everything you need.
I did this in several Projects and it works perfectly.
Hope it helps
Uses reflection to parse and evaluate a data-binding expression against an object at run time.
DataBinder.Eval Method
I have written a package, SharpByte.Dynamic, to simplify the task of compiling and executing code dynamically. The code can be invoked on any context object using extension methods as detailed further here.
For example,
someObject.Evaluate<int>("6 / {{{0}}}", 3))
returns 3;
someObject.Evaluate("this.ToString()"))
returns the context object's string representation;
someObject.Execute(#
"Console.WriteLine(""Hello, world!"");
Console.WriteLine(""This demonstrates running a simple script"");
");
runs those statements as a script, etc.
Executables can be gotten easily using a factory method, as seen in the example here--all you need is the source code and list of any expected named parameters (tokens are embedded using triple-bracket notation, such as {{{0}}}, to avoid collisions with string.Format() as well as Handlebars-like syntaxes):
IExecutable executable = ExecutableFactory.Default.GetExecutable(executableType, sourceCode, parameterNames, addedNamespaces);
Each executable object (script or expression) is thread-safe, can be stored and reused, supports logging from within a script, stores timing information and last exception if encountered, etc. There is also a Copy() method compiled on each to allow creating cheap copies, i.e. using an executable object compiled from a script or expression as a template for creating others.
Overhead of executing an already-compiled script or statement is relatively low, at well under a microsecond on modest hardware, and already-compiled scripts and expressions are cached for reuse.
You could do it with a prototype function:
void something(int i, string P1) {
something(i, P1, String.Empty);
}
void something(int i, string P1, string P2) {
something(i, P1, P2, String.Empty);
}
void something(int i, string P1, string P2, string P3) {
something(i, P1, P2, P3, String.Empty);
}
and so on...
I was trying to get a value of a structure (class) member by it's name. The structure was not dynamic. All answers didn't work until I finally got it:
public static object GetPropertyValue(object instance, string memberName)
{
return instance.GetType().GetField(memberName).GetValue(instance);
}
This method will return the value of the member by it's name. It works on regular structure (class).
You might check the Heleonix.Reflection library. It provides methods to get/set/invoke members dynamically, including nested members, or if a member is clearly defined, you can create a getter/setter (lambda compiled into a delegate) which is faster than reflection:
var success = Reflector.Set(instance, null, $"Property{i}", value);
Or if number of properties is not endless, you can generate setters and chache them (setters are faster since they are compiled delegates):
var setter = Reflector.CreateSetter<object, object>($"Property{i}", typeof(type which contains "Property"+i));
setter(instance, value);
Setters can be of type Action<object, object> but instances can be different at runtime, so you can create lists of setters.
Unfortunately, C# doesn't have any native facilities for doing exactly what you are asking.
However, my C# eval program does allow for evaluating C# code. It provides for evaluating C# code at runtime and supports many C# statements. In fact, this code is usable within any .NET project, however, it is limited to using C# syntax. Have a look at my website, http://csharp-eval.com, for additional details.
the correct answer is you need to cache all the result to keep the mem0ry usage low.
an example would look like this
TypeOf(Evaluate)
{
"1+1":2;
"1+2":3;
"1+3":5;
....
"2-5":-3;
"0+0":1
}
and add it to a List
List<string> results = new List<string>();
for() results.Add(result);
save the id and use it in the code
hope this helps

Categories

Resources