Do closures break serialization - c#

Today I faced a SerializationException that referred to some anonymous inner class +<>c__DisplayClass10, stating it was not serializable, when IIS tried to store the session in the ASP.NET State Service:
Type 'Xyz.GeneralUnderstandingTests+ASerializableType+<>c__DisplayClass10' in Assembly 'Xyz, Version=1.2.5429.24450, Culture=neutral, PublicKeyToken=null' is not marked as serializable.
I looked for lambdas in my code and found quite a few, but most of them were not new and had never caused any serialization issues. But then I noticed that I had added a new lambda expression that "happened" to build up a closure.
Searching Stack Overflow for closure serialization, I found some Q&As revealing that closures cannot be serialized in other languages such as PHP *), but I did not manage to find such a statement for C#. I have, however, been able to build a very simple example that seems to confirm that closures are not serializable whereas "normal" functions are:
[TestFixture]
public class GeneralUnderstandingTests
{
    [Serializable]
    private class ASerializableType
    {
        private readonly Func<int> thisIsAClosure;
        private readonly Func<int> thisIsNotAClosure;
        public ASerializableType()
        {
            const int SomeConst = 12345;
            thisIsNotAClosure = () => SomeConst; // succeeds to serialize
            var someVariable = 12345;
            thisIsAClosure = () => someVariable; // fails to serialize
        }
    }
    [Test]
    public void ASerializableType_CanBeSerialized()
    {
        var sessionStateItemCollection = new
            System.Web.SessionState.SessionStateItemCollection();
        sessionStateItemCollection["sut"] = new ASerializableType();
        sessionStateItemCollection.Serialize(new BinaryWriter(new MemoryStream()));
    }
}
This test fails but goes green as soon as the line thisIsAClosure = ... is commented out. The line thisIsNotAClosure = ... however does not cause any issues as SomeConst is not a variable but a constant, that is, it does not build a closure but is inlined by the compiler.
Can you confirm that what I have concluded is correct?
=> Is there a way to circumvent this issue?
=> Could it be that this depends on the internals of the compiler used? After all, it is the compiler that turns the lambda/closure expression into an anonymous inner class.
*) Links:
Exception: Serialization of 'Closure' is not allowed
Closures Can't Be Serialized, How To Do Callbacks via AJAX to PHP?
Serialization of 'Closure' is not allowed with php pthreads
How to serialize object that has closures inside properties?
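On the compiler-internals question, one way to look at what the compiler generated, without involving any formatter, is to inspect the delegate's Target. The following is a minimal sketch, assuming a reasonably recent C# compiler; ClosureInspection, ValueHolder and the method names are hypothetical and not part of the original code. It checks whether the class behind a capturing lambda carries the serializable flag, and shows a hand-written [Serializable] holder as one possible way around the problem:

```csharp
using System;
using System.Reflection;

public static class ClosureInspection
{
    // Hypothetical hand-written replacement for the captured local variable.
    [Serializable]
    public sealed class ValueHolder
    {
        private readonly int value;
        public ValueHolder(int value) { this.value = value; }
        public int GetValue() { return value; }
    }

    // True if the type carries the metadata flag that [Serializable] sets.
    public static bool HasSerializableFlag(Type type)
    {
        return (type.Attributes & TypeAttributes.Serializable) != 0;
    }

    // A lambda that captures a local: the compiler hoists the local
    // into a generated class, which becomes the delegate's Target.
    public static Func<int> MakeCapturingLambda()
    {
        var someVariable = 12345;
        return () => someVariable;
    }

    // Same behaviour, but the state lives in a class we control and
    // can mark [Serializable] ourselves.
    public static Func<int> MakeHolderDelegate()
    {
        return new ValueHolder(12345).GetValue;
    }
}
```

Calling ClosureInspection.HasSerializableFlag(d.Target.GetType()) on each delegate should show the difference between the two. Whether the holder-based delegate then survives a round trip through the ASP.NET State Service is not verified here; storing plain data in the session and rebuilding delegates after loading is the more conservative option.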

Related

C# compiler : class from lambda [duplicate]

According to this answer, when code uses local variables from inside lambda methods the compiler generates extra classes that can have names such as c__DisplayClass1. For example, the following (completely useless) code:
class Program
{
    static void Main()
    {
        try {
            implMain();
        } catch (Exception e) {
            Console.WriteLine(e.ToString());
        }
    }
    static void implMain()
    {
        for (int i = 0; i < 10; i++) {
            invoke(() => {
                Console.WriteLine(i);
                throw new InvalidOperationException();
            });
        }
    }
    static void invoke(Action what)
    {
        what();
    }
}
outputs the following call stack:
System.InvalidOperationException
at ConsoleApplication1.Program.<>c__DisplayClass2.<implMain>b__0()
at ConsoleApplication1.Program.invoke(Action what)
at ConsoleApplication1.Program.implMain()
at ConsoleApplication1.Program.Main()
Note that there's c__DisplayClass2 in there which is a name of a class generated by the compiler to hold the loop variable.
According to this answer c__DisplayClass "means"
c --> anonymous method closure class ("DisplayClass")
Okay, but what does "DisplayClass" mean here?
What does this generated class "display"? In other words why is it not "MagicClass" or "GeneratedClass" or any other name?
From an answer to a related question by Eric Lippert:
The reason that a closure class is called "DisplayClass" is a bit unfortunate: this is jargon used by the debugger team to describe a class that has special behaviours when displayed in the debugger. Obviously we do not want to display "x" as a field of an impossibly-named class when you are debugging your code; rather, you want it to look like any other local variable. There is special gear in the debugger to handle doing so for this kind of display class. It probably should have been called "ClosureClass" instead, to make it easier to read disassembly.
You can get some insight from the C# compiler source as available from the SSCLI20 distribution, csharp/sccomp subdirectory. Searching the code for "display" gives most hits in the fncbind.cpp source code file. You'll see it used in code symbols as well as comments.
The comments strongly suggest that this was a term used internally by the team, possibly as far back as the design meetings. This is .NET 2.0 vintage code, there was not a lot of code rewriting going on yet. Just iterators and anonymous methods, both implemented in very similar ways. The term "display class" is offset from "user class" in the comments, a clear hint that they used the term to denote auto-generated classes. No strong hint why "display" was favored, I suspect that it might have something to do with these classes being visible in the metadata of the assembly.
Based on Reflector output, DisplayClass can be read as CompilerGeneratedClass:
[CompilerGenerated]
private sealed class <>c__DisplayClass16b
{
    // Fields
    public MainForm <>4__this;
    public object sender;
    // Methods
    public void <cmdADSInit_Click>b__16a()
    {
        ADS.Initialize();
        this.<>4__this._Sender = this.sender;
        this.<>4__this.SelectedObject = ADS.Instance;
    }
}
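To make the shape of such a generated class more concrete, here is a rough hand-written equivalent of a closure over a for-loop variable. This is only a sketch of the general idea, not the compiler's actual output; LoopVariableHolder and the other names are made up for illustration. The captured local becomes a field, the lambda body becomes an instance method, and the loop operates on the field directly:

```csharp
using System;
using System.Collections.Generic;

public static class ClosureLoweringSketch
{
    // Hand-written stand-in for a generated <>c__DisplayClass: the captured
    // local becomes a public field, the lambda body an instance method.
    private sealed class LoopVariableHolder
    {
        public int i;
        public int ReadI() { return i; }
    }

    // What one writes: every lambda captures the same loop variable i.
    public static int[] WithLambdas()
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            readers.Add(() => i);
        }
        return InvokeAll(readers);
    }

    // Roughly the same thing by hand: one holder shared by the whole loop.
    public static int[] LoweredByHand()
    {
        var readers = new List<Func<int>>();
        var holder = new LoopVariableHolder();
        for (holder.i = 0; holder.i < 3; holder.i++)
        {
            readers.Add(holder.ReadI);
        }
        return InvokeAll(readers);
    }

    // Calls every delegate after the loop has finished.
    private static int[] InvokeAll(List<Func<int>> readers)
    {
        var results = new int[readers.Count];
        for (int n = 0; n < readers.Count; n++)
        {
            results[n] = readers[n]();
        }
        return results;
    }
}
```

Both methods return three copies of the final value of i, because all delegates read the same field. That shared field is also what the debugger has to present as an ordinary local, which fits the "display class" explanation quoted above.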

How to mitigate "Access to modified closure" in cases where the delegate is called directly

My understanding is that the "Access to modified closure" warning is there to warn me about accessing local variables from a delegate when the delegate might be stored and called later or called on a different thread so that the local variable isn't actually available at the time of actual code execution. This is sensible of course.
But what if I am creating a delegate that I know is going to be called immediately in the same thread? The warning then is not needed. For example the warning is generated in the code:
delegate void Consume();
private void ConsumeConsume(Consume c)
{
    c();
}
public int Hello()
{
    int a = 0;
    ConsumeConsume(() => { a += 9; });
    a = 1;
    return a;
}
There can be no problem here since ConsumeConsume always calls the function immediately. Is there any way around this? Is there some way to annotate the function ConsumeConsume to indicate to ReSharper that the delegate will be called immediately?
Interestingly, when I replace the ConsumeConsume(() => { a += 9; }); line with:
new List<int>(new[] {1}).ForEach(i => { a += 9; });
which does the same thing, no warning is generated. Is this just an in-built exception for ReSharper or is there something I can do similarly to indicate that the delegate is called immediately?
I am aware that I can disable these warnings but that is not a desired outcome.
Install the JetBrains.Annotations package with NuGet: https://www.nuget.org/packages/JetBrains.Annotations
Mark the passed in delegate with the InstantHandle attribute.
private void ConsumeConsume([InstantHandle] Consume c)
{
c();
}
From InstantHandle's description:
Tells code analysis engine if the parameter is completely handled when the invoked method is on stack. If the parameter is a delegate, indicates that delegate is executed while the method is executed. If the parameter is an enumerable, indicates that it is enumerated while the method is executed.
Source: https://www.jetbrains.com/help/resharper/Reference__Code_Annotation_Attributes.html
If you don't want to add the whole package to your project, it's enough to just add the attribute yourself, although it's hacky in my opinion.
namespace JetBrains.Annotations
{
[AttributeUsage(AttributeTargets.Parameter)]
public class InstantHandleAttribute : Attribute { }
}

In TPL Dataflow, is it possible to change DataflowBlockOptions after block is created but before it is used?

... and have it take effect?
I'd like to defer setting the ExecutionDataflowBlockOptions.SingleProducerConstrained property until I'm ready to link the network together. (Because, I want to separate creating the blocks, with their semantics, from linking the network together, with its semantics.)
But as far as I can tell you can only set the ExecutionDataflowBlockOptions when the block is created (e.g., for TransformBlock, TransformManyBlock, etc, you pass it in to the constructor and it is not visible otherwise).
However ... it hasn't escaped my notice that the properties have public setters. So ... can I create the block with a placeholder instance of ExecutionDataflowBlockOptions and hold on to it so that I can later set SingleProducerConstrained=true if I desire, when linking the blocks together (and that it will take effect)?
(BTW, is there any way to tell if SingleProducerConstrained is having any effect other than measuring throughput?)
Update: @i3amon correctly pointed out in his answer that this can't be done, because dataflow blocks clone the DataflowBlockOptions you pass in and use the clone. But I did it anyway, using internal data structures I can access via reflection and dynamic. I put that in an answer below.
It isn't possible. Modifying the options after the fact won't work. The options are cloned inside the block's constructor. Changing the options later will have no effect.
You can see examples of that here and here and it's simple to verify:
var options = new ExecutionDataflowBlockOptions
{
NameFormat = "bar",
};
var block = new ActionBlock<int>(_ => { }, options);
options.NameFormat = "hamster";
Console.WriteLine(block.ToString());
Output:
bar
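The behaviour above is what a defensive copy in a constructor looks like from the outside. The following self-contained sketch reproduces the effect with hypothetical stand-in types (BlockOptions and Block are not the real TPL Dataflow classes, and the real cloning code may differ in detail):

```csharp
using System;

public static class OptionsCloneSketch
{
    // Hypothetical stand-in for an options type with public setters.
    public sealed class BlockOptions
    {
        public string NameFormat { get; set; }

        public BlockOptions Clone()
        {
            return new BlockOptions { NameFormat = NameFormat };
        }
    }

    // Hypothetical stand-in for a block that snapshots its options.
    public sealed class Block
    {
        private readonly BlockOptions options;

        public Block(BlockOptions options)
        {
            // Defensive copy: later changes to the caller's instance
            // never reach the block.
            this.options = options.Clone();
        }

        public override string ToString() { return options.NameFormat; }
    }

    // Mirrors the experiment above using the stand-in types.
    public static string Demo()
    {
        var options = new BlockOptions { NameFormat = "bar" };
        var block = new Block(options);
        options.NameFormat = "hamster";
        return block.ToString();
    }
}
```

With this design the public setters on the options type are only useful before construction, which matches the observation that the setters exist but have no effect afterwards.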
Let me answer my own question. Using information from DotNetInside's decompile of the Dataflow assembly, for example, TransformBlock here (thanks @i3amon again for the link to dotnetinside.com), and the very nice ExposedObject package at codeplex here (which I learned about at this blog post), I did the following:
The TPL Dataflow blocks all implement debugger visualizers via the DebuggerTypeProxy attribute, which, applied to a type, names another type to use in the Visual Studio debugger whenever the original type is to be displayed (e.g., watch window).
Each of these DebuggerTypeProxy-named classes is an inner class of the dataflow block the attribute is attached to, usually named DebugView. That class is always private and sealed. It exposes lots of cool stuff about the dataflow block, including its genuine (not a copy) DataflowBlockOptions and also - if the block is a source block - an ITargetBlock[], which can be used to trace the dataflow network from its start block after construction.
Once you get an instance of the DebugView you can use dynamic via ExposedObject to get any of the properties exposed by the class - ExposedObject lets you take an object and use ordinary method and property syntax to access its methods and properties.
Thus you can get the DataflowBlockOptions out of the dataflow block and change its NameFormat, and if it is an ExecutionDataflowBlockOptions (and you haven't yet hooked up the block to other blocks) you can change its SingleProducerConstrained value.
However you can't use dynamic to find or construct the instance of the inner DebugView class. You need reflection for that. You start by getting the DebuggerTypeProxy attribute off your dataflow block's type, fetch the name of the debugging class, assume it is an inner class of the dataflow block's type and search for it, convert it to a closed generic type, and finally construct an instance.
Be fully aware that you're using undocumented code from the dataflow internals. Use your own judgement about whether this is a good idea. In my opinion, the developers of TPL Dataflow did a lot of work to support viewing these blocks in the debugger, and they'll probably keep it up. Details may change, but, if you're doing proper error checking on your reflection and dynamic use of these types, you will be able to discover when your code stops working with a new version of TPL Dataflow.
The following code fragments probably don't compile together - they're simply cut&pasted out of my working code, from different classes, but they certainly give you the idea. I made it work fine. (Also, for brevity, I elided all error checking.) (Also, I developed/tested this code with version 4.5.20.0 only of TPL dataflow, so you may have to adapt it for past - or future! - versions.)
// Set (change) the NameFormat of a dataflow block after construction
public void SetNameFormat(IDataflowBlock block, string nameFormat)
{
try
{
dynamic debugView = block.GetInternalData(Logger);
if (null != debugView)
{
var blockOptions = debugView.DataflowBlockOptions as DataflowBlockOptions;
blockOptions.NameFormat = nameFormat;
}
}
catch (Exception ex)
{
...
}
}
// Get access to the internal data of a dataflow block via its DebuggerTypeProxy class
public static dynamic GetInternalData(this IDataflowBlock block)
{
Type blockType = block.GetType();
try
{
// Get the DebuggerTypeProxy attribute, which names the debug class type.
DebuggerTypeProxyAttribute debuggerTypeProxyAttr =
blockType.GetCustomAttributes(true).OfType<DebuggerTypeProxyAttribute>().Single();
// Get the name of the debug class type
string debuggerTypeProxyNestedClassName =
GetNestedTypeNameFromTypeProxyName(debuggerTypeProxyAttr.ProxyTypeName);
// Get the actual Type of the nested class type (it will be open generic)
Type openDebuggerTypeProxyNestedClass = blockType.GetNestedType(
debuggerTypeProxyNestedClassName,
System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.NonPublic);
// Close it with the actual type arguments from the outer (dataflow block) Type.
Type debuggerTypeProxyNestedClass =
openDebuggerTypeProxyNestedClass.CloseNestedTypeOfClosedGeneric(blockType);
// Now create an instance of the debug class directed at the given dataflow block.
dynamic debugView = ExposedObject.New(debuggerTypeProxyNestedClass, block);
return debugView;
}
catch (Exception ex)
{
...
return null;
}
}
// Given a (Type of a) (open) inner class of a generic class, return the (Type
// of the) closed inner class.
public static Type CloseNestedTypeOfClosedGeneric(
this Type openNestedType,
Type closedOuterGenericType)
{
Type[] outerGenericTypeArguments = closedOuterGenericType.GetGenericArguments();
Type closedNestedType = openNestedType.MakeGenericType(outerGenericTypeArguments);
return closedNestedType;
}
// A cheesy helper to pull a type name for a nested type out of a full assembly name.
private static string GetNestedTypeNameFromTypeProxyName(string value)
{
// Expecting it to have the following form: full assembly name, e.g.,
// "System.Threading...FooBlock`1+NESTEDNAMEHERE, System..."
Match m = Regex.Match(value, @"^.*`\d+[+]([_\w-[0-9]][_\w]+),.*$", RegexOptions.IgnoreCase);
if (!m.Success)
return null;
else
return m.Groups[1].Value;
}
// Added to IgorO.ExposedObjectProject.ExposedObject class to let me construct an
// object using a constructor with an argument.
public ExposedObject {
...
public static dynamic New(Type type, object arg)
{
return new ExposedObject(Create(type, arg));
}
private static object Create(Type type, object arg)
{
// Create instance using Activator
object res = Activator.CreateInstance(type, arg);
return res;
// ... or, alternatively, this works using reflection, your choice:
Type argType = arg.GetType();
ConstructorInfo constructorInfo = GetConstructorInfo(type, argType);
return constructorInfo.Invoke(new object[] { arg });
}
...
}

Can I emit existing implementations in "temporary" assembly

Take the following C# code
namespace lib.foo {
public class A {
public A (int x) {}
public int GetNumber() { return calculateNumber(); }
private int calculateNumber() { return lib.bar.B.ProduceNumber(); }
public void irrelevantMethod() {}
}
}
namespace lib.bar {
public class B {
public static int ProduceNumber() { return something; }
public void IrrelevantMethod() {}
}
}
I want to produce an assembly that contains the functionality of lib.foo.A.GetNumber(), store it, and later load it dynamically and then execute it.
In order for that to work, I'd need a program that can trace all the required dependencies (listed below), and emit them - including their implementation(!) - in one assembly for storage.
* lib.foo.A(int)
* lib.foo.A.GetNumber()
* lib.foo.A.calculateNumber()
* lib.bar.B.ProduceNumber()
Can it be done? How?
In case anyone is wondering, I want to build a system where machine A tells machine B (using WCF) what to do. Since serializing delegates is impossible, my plan is to
1) transport an assembly from machine A to B,
2) load the assembly on machine B,
3) have machine A instruct machine B to invoke the desired method, which is implemented in this new assembly.
Note - this isn't really an answer, more of a nitpicking correction (of sorts).
When you say "Since serializing delegates is impossible", this isn't strictly true, although I would NOT recommend doing it. This example code effectively "serializes" a delegate:
void Main()
{
    Func<int,int> dlgt = FuncHolder.SomeMethod;
    var ser = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
    byte[] buffer;
    using(var ms = new MemoryStream())
    {
        ser.Serialize(ms, dlgt);
        buffer = ms.ToArray();
    }
    Console.WriteLine("{0} was serialized to {1} bytes", dlgt.GetType().Name, buffer.Length);
    using(var ms = new MemoryStream(buffer))
    {
        dynamic whatzit = ser.Deserialize(ms);
        whatzit(1);
    }
}
[Serializable]
public struct FuncHolder
{
    public static int SomeMethod(int i)
    {
        Console.WriteLine("I was called with {0}, returning {1}", i, i+1);
        return i+1;
    }
}
Output:
Func`2 was serialized to 978 bytes
I was called with 1, returning 2
I must emphasize, however, that you probably shouldn't do this. :)
As for the original question:
I'd be very careful about transporting and executing arbitrary code, especially in a production environment; the potential for security breaches is considerable, mainly via injection routes. If you were to take, for example, one of the above suggestions and just blast over the source to execute dynamically, there's little stopping someone from injecting who-knows-what into your "Give me code to run" service.
You'd really need to spell out your exact needs here to really come up with a "good" solution, as there are multiple ways to accomplish the same basic idea:
* As mentioned, pass actual source code to the service to load/compile/execute, potentially in a "sandbox" for some aspect of security/protection.
* Distribute all executable code paths in a shared plugin/assembly which is pushed by some trusted process to all remote servers, and reduce your executor code to a single "DoWork" method invocation (i.e., wrap all the details inside the plugin).
* Cobble together a rough DSL or other type of pseudo-language, restricted in what it can/can't do, and pass that source around.
* Rely on .NET remoting: actually go remotely call the methods in the assembly on a remote object via proxy.
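As a rough illustration of the shared-plugin option, the sketch below keeps every allowed operation in code that is already deployed on the remote machine and exposes a single DoWork entry point, so only an operation name and an argument need to travel over the wire. WorkDispatcherSketch and the operation names are hypothetical; a real system would sit behind the WCF contract and would need authentication on top:

```csharp
using System;
using System.Collections.Generic;

public static class WorkDispatcherSketch
{
    // Operations deployed to the remote machine ahead of time.
    // Nothing outside this table can be invoked by a caller.
    private static readonly Dictionary<string, Func<int, int>> Operations =
        new Dictionary<string, Func<int, int>>
        {
            { "increment", x => x + 1 },
            { "double", x => x * 2 },
        };

    // Single entry point: the caller sends a name and data, never code.
    public static int DoWork(string operationName, int argument)
    {
        Func<int, int> operation;
        if (!Operations.TryGetValue(operationName, out operation))
        {
            throw new ArgumentException(
                "Unknown operation: " + operationName, "operationName");
        }
        return operation(argument);
    }
}
```

For example, WorkDispatcherSketch.DoWork("double", 21) evaluates the pre-deployed "double" operation on the receiving side. The trade-off is that new behaviour requires a new deployment, which is exactly what limits the injection surface.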

Change object type at runtime maintaining functionality

Long story short
Say I have the following code:
// a class like this
class FirstObject {
public Object OneProperty {
get;
set;
}
// (other properties)
public Object OneMethod() {
// logic
}
}
// and another class with properties and methods names
// which are similar or exact the same if needed
class SecondObject {
public Object OneProperty {
get;
set;
}
// (other properties)
public Object OneMethod(String canHaveParameters) {
// logic
}
}
// the consuming code would be something like this
public static void main(String[] args) {
FirstObject myObject=new FirstObject();
// Use its properties and methods
Console.WriteLine("FirstObject.OneProperty value: "+myObject.OneProperty);
Console.WriteLine("FirstObject.OneMethod returned value: "+myObject.OneMethod());
// Now, for some reason, continue to use the
// same object but with another type
// -----> CHANGE FirstObject to SecondObject HERE <-----
// Continue to use properties and methods but
// this time calls were being made to SecondObject properties and Methods
Console.WriteLine("SecondObject.OneProperty value: "+myObject.OneProperty);
Console.WriteLine("SecondObject.OneMethod returned value: "+myObject.OneMethod(oneParameter));
}
Is it possible to change the FirstObject type to SecondObject and continue to use its properties and methods?
I've total control over FirstObject, but SecondObject is sealed and totally out of my scope!
Can I achieve this through reflection? How? What do you think of the work it might take? Obviously both classes can be a LOT more complex than the example above.
Both classes can be generic, like FirstObject<T> and SecondObject<T>, which makes me hesitant to use reflection for such a task!
Problem in reality
I've tried to state my problem in the simplest way possible, to extract some knowledge to solve it, but looking at the answers it seems obvious that, to help me, you need to understand my real problem, because changing the object type is only the tip of the iceberg.
I'm developing a Workflow Definition API. The main objective is to have an API that is reusable on top of any engine I might want to use (CLR through WF4, NetBPM, etc.).
For now I'm writing the middle layer that translates that API to WF4 to run workflows through the CLR.
What I've already accomplished
The API concept, at this stage, is somehow similar to WF4 with ActivityStates with In/Out Arguments and Data(Variables) running through the ActivityStates using their arguments.
Very simplified API in pseudo-code:
class Argument {
object Value;
}
class Data {
String Name;
Type ValueType;
object Value;
}
class ActivityState {
String DescriptiveName;
}
class MyIf: ActivityState {
InArgument Condition;
ActivityState Then;
ActivityState Else;
}
class MySequence: ActivityState {
Collection<Data> Data;
Collection<ActivityState> Activities;
}
My initial approach to translating this to WF4 was to run through the ActivityState graph and do a more or less direct assignment of properties, using reflection where needed.
Again simplified pseudo-code, something like:
new Activities.If() {
DisplayName=myIf.DescriptiveName,
Condition=TranslateArgumentTo_WF4_Argument(myIf.Condition),
Then=TranslateActivityStateTo_WF4_Activity(myIf.Then),
Else=TranslateActivityStateTo_WF4_Activity(myIf.Else)
}
new Activities.Sequence() {
DisplayName=mySequence.DescriptiveName,
Variables=TranslateDataTo_WF4_Variables(mySequence.Data),
Activities=TranslateActivitiesStatesTo_WF4_Activities(mySequence.Activities)
}
At the end of the translation I would have an executable System.Activities.Activity object. I've already accomplished this easily.
The big issue
A big issue with this approach appeared when I began the Data object to System.Activities.Variable translation. The problem is WF4 separates the workflow execution from the context. Because of that both Arguments and Variables are LocationReferences that must be accessed through var.Get(context) function for the engine to know where they are at runtime.
Something like this is easily accomplished using WF4:
Variable<string> var1=new Variable<string>("varname1", "string value");
Variable<int> var2=new Variable<int>("varname2", 123);
return new Sequence {
Name="Sequence Activity",
Variables=new Collection<Variable> { var1, var2 },
Activities=new Collection<Activity>(){
new Write() {
Name="WriteActivity1",
Text=new InArgument<string>(
context =>
String.Format("String value: {0}", var1.Get(context)))
},
new Write() {
//Name = "WriteActivity2",
Text=new InArgument<string>(
context =>
String.Format("Int value: {0}", var2.Get(context)))
}
}
};
but if I want to represent the same workflow through my API:
Data<string> var1=new Data<string>("varname1", "string value");
Data<int> var2=new Data<int>("varname2", 123);
return new Sequence() {
DescriptiveName="Sequence Activity",
Data=new Collection<Data> { var1, var2 },
Activities=new Collection<ActivityState>(){
new Write() {
DescriptiveName="WriteActivity1",
Text="String value: "+var1 // <-- BIG PROBLEM !!
},
new Write() {
DescriptiveName="WriteActivity2",
Text="Int value: "+Convert.ToInt32(var2) // ANOTHER BIG PROBLEM !!
}
}
};
I end up with a BIG PROBLEM when using Data objects as Variables. I really don't know how to allow developers using my API to use Data objects wherever they want (just like in WF4) and later translate that Data to System.Activities.Variable.
Solutions come to mind
If you now understand my problem, FirstObject and SecondObject are Data and System.Activities.Variable respectively. Like I said, translating Data to Variable is just the tip of the iceberg, because I might use Data.Get() in my code and don't know how to translate it to Variable.Get(context) while doing the translation.
Solutions that I've tried or thought of:
Solution 1
Instead of a direct translation of properties I would develop NativeActivities for each flow-control activity (If, Sequence, Switch, ...) and make use of the CacheMetadata() function to specify Arguments and Variables. The problem remains because they are both accessed through var.Get(context).
Solution 2
Give my Data class its own Get() function. It would be only an abstract method, without any logic inside, that would somehow translate to the Get() function of System.Activities.Variable. Is this even possible using C#? Guess not! Another problem is that Variable.Get() takes one parameter.
Solution 3
The worst solution that I thought of was CIL manipulation: try to replace the code where Data/Argument is used with Variable/Argument code. This smells like a nightmare to me. I know next to nothing about System.Reflection.Emit, and even if I learn it my guess is that it would take ages ... and it might not even be possible.
Sorry if I ended up introducing a bigger problem but I'm really stuck here and desperately needing a tip/path to go on.
This is called "duck typing" (if it looks like a duck and quacks like a duck you can call methods on it as though it really were a duck). Declare myObject as dynamic instead of as a specific type and you should then be good to go.
EDIT: to be clear, this requires .NET 4.0
dynamic myObject = new FirstObject();
// do stuff
myObject = new SecondObject();
// do stuff again
Reflection isn't necessarily the right tool for this. If SecondObject is out of your control, your best option is likely to just make an extension method that instantiates a new copy of it and copies across the data, property by property.
You could use reflection for the copying process, and work that way, but that is really a separate issue.
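A minimal sketch of such a copying extension method, using reflection for the property-by-property step. SourceThing and TargetThing are hypothetical stand-ins for FirstObject and the sealed SecondObject, and CopyTo is an invented name. It assumes the target has a public parameterless constructor and only copies readable, non-indexed properties whose names and types line up:

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the class you control.
public sealed class SourceThing
{
    public object OneProperty { get; set; }
}

// Hypothetical stand-in for the sealed class you do not control.
public sealed class TargetThing
{
    public object OneProperty { get; set; }
}

public static class CopyExtensions
{
    // Creates a TTarget and copies every matching public property across.
    public static TTarget CopyTo<TTarget>(this object source)
        where TTarget : class, new()
    {
        var target = new TTarget();
        foreach (PropertyInfo from in source.GetType().GetProperties())
        {
            // Skip indexers and write-only properties on the source.
            if (!from.CanRead || from.GetIndexParameters().Length != 0) continue;

            PropertyInfo to = typeof(TTarget).GetProperty(from.Name);
            if (to == null || !to.CanWrite) continue;

            // Only copy when the value is assignable without conversion.
            if (!to.PropertyType.IsAssignableFrom(from.PropertyType)) continue;

            to.SetValue(target, from.GetValue(source, null), null);
        }
        return target;
    }
}
```

Usage would look like var second = first.CopyTo<TargetThing>();. Note that this produces a second object rather than changing the type of the first, so any references to the original still point at the old instance.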
