Why does this wrong object initialisation with curly braces even compile? [duplicate] - c#

While creating some dummy data for a collection for a WPF/MVVM project, I produced the following wrong code, which compiles fine, but throws an exception at runtime.
There's a nested structure of data objects, which I wrongly instantiate with only the curly braces (looks like writing JavaScript does cause permanent damage to the brain).
using System.Collections.ObjectModel;
namespace testapp
{
class Program
{
static void Main(string[] args)
{
var collection = new ObservableCollection<TopLevelDataObject>();
collection.Add(new TopLevelDataObject{Nested = {Bar = 5}}); // NullReferenceException
}
class TopLevelDataObject
{
public NestedDataObject Nested { get; set; }
public string Foo { get; set; }
}
class NestedDataObject
{
public double Bar { get; set; }
}
}
}
Why does that compile?
If I create an anonymous type, like Nested = new {Bar = 5}, I get the error message during compilation (which thus fails):
Cannot implicitly convert type '<anonymous type: int Bar>' to 'testapp.Program.NestedDataObject'
Why do I not get such an error when omitting the new operator?
It even gives me a code hint for the property:
My guess would be that {Bar = 5} is simply a code block, which on its own is a valid thing to have.
But why is it valid to assign a code block to anything (in this case, the Nested property)?

Why does that compile?
Because when that code is compiled, it is compiled as just a set of assignment operations. It doesn't all have to be new instances you create.
If you construct a new instance of Nested from the constructor, you can assign a value to Nested.Bar.
Change public NestedDataObject Nested { get; set; } to this to see how it works:
public NestedDataObject Nested { get; } = new NestedDataObject();
(Note you can never assign a value to Nested outside the constructor in the above code!)

Why does that compile?
Because var x = new SomeType{Property = value}; is the same as:
var x = new SomeType();
x.Property = value;
Indeed we can even leave in the () to have var x = new SomeType(){Property = value}; or even var x = new SomeType(argument){Property = value}; combining passing an argument to the constructor and setting a value.
As such you can see that a constructor is always called, and if you leave out the parentheses that make that explicit, it's always the nullary (no-argument) constructor.
Meanwhile a type with no explicit constructor always has a public nullary constructor (the "default constructor").
Hence new TopLevelDataObject{Nested = {Bar = 5}} is the same as:
var temp = new TopLevelDataObject();
temp.Nested.Bar = 5; // NRE on temp.Nested
Because TopLevelDataObject could have a constructor that sets Nested, the code you have could work, so it should compile. Of course, because it has no such constructor, it fails at runtime.
(Note that initialisers don't operate quite the same way with anonymous types. There the code is rewritten to call a hidden constructor, which is why the properties can be read-only even though initialisers cannot set read-only properties of non-anonymous types. The syntax makes the two look alike so they are easily understood as similar, but the result is not the same.)
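To make the expansion concrete, here is a minimal sketch that contrasts the two cases. InitialisedTopLevel and UninitialisedTopLevel are hypothetical stand-ins for the question's TopLevelDataObject, introduced only for illustration; the first has a constructor that creates Nested, the second does not.

```csharp
using System;

class NestedDataObject
{
    public double Bar { get; set; }
}

// Hypothetical variant whose constructor creates the nested object.
class InitialisedTopLevel
{
    public InitialisedTopLevel() { Nested = new NestedDataObject(); }
    public NestedDataObject Nested { get; set; }
}

// Same shape as the class in the question: Nested stays null.
class UninitialisedTopLevel
{
    public NestedDataObject Nested { get; set; }
}

static class NestedInitialiserDemo
{
    // Expands to: temp = new InitialisedTopLevel(); temp.Nested.Bar = 5;
    public static double WithConstructor()
    {
        var obj = new InitialisedTopLevel { Nested = { Bar = 5 } };
        return obj.Nested.Bar;
    }

    // Expands to the same two statements, but temp.Nested is null here.
    public static bool ThrowsWithoutConstructor()
    {
        try
        {
            var obj = new UninitialisedTopLevel { Nested = { Bar = 5 } };
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(WithConstructor());          // 5
        Console.WriteLine(ThrowsWithoutConstructor()); // True
    }
}
```

The initializer expression is identical in both methods; only the constructor decides whether it succeeds, which is why the compiler cannot reject it.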

Related

Exception serializing Anonymous Type to JSON when nested property names are the same in .net5+

I'm migrating some test projects from .netcore 3.1 to .net6 and hit an unexpected exception while serializing an anonymous type to json.
var obj = new
{
input = new {
foo = "foo"
},
INPUT = new {
foo = "foo"
}
};
var json = JsonSerializer.Serialize(obj); // throws an exception
Previously in .netcore 3.1 this was acceptable and would return:
{"input":{"foo":"foo"},"INPUT":{"foo":"foo"}}
But in .net6 an exception is thrown:
Unhandled exception. System.InvalidOperationException:
Members 'input' and 'INPUT' on type
'<>f__AnonymousType0`2[<>f__AnonymousType1`1[System.String],<>f__AnonymousType1`1[System.String]]'
cannot both bind with parameter 'input' in the deserialization constructor.
I've tried various JsonSerializerOptions thinking that case sensitivity or insensitivity was the issue with no luck.
However if I created my Anonymous Type as classes, JsonSerialization has no problems:
public class Obj
{
public Input input { get; set; }
public Input INPUT { get; set; }
}
public class Input
{
public string foo { get; set; }
}
var obj = new Obj(){ input = new Input() { foo = "bar" }, INPUT = new Input() { foo = "bar" }};
var json = JsonSerializer.Serialize(obj);
returns:
{"input":{"foo":"bar"},"INPUT":{"foo":"bar"}}
So what is it about serializing anonymous types that is causing this exception?
The breaking change (arguably a regression) was introduced in .NET 5:
Support deserializing objects using parameterized constructors
It looks to be symptomatic only for types with parameterized constructors with multiple arguments whose names differ only in case, specifically input and INPUT in your example. As documented in How to use immutable types and non-public accessors with System.Text.Json,
The parameter names of a parameterized constructor must match the property names and types. Matching is case-insensitive, and the constructor parameter must match the actual property name even if you use [JsonPropertyName] to rename a property.
It appears that it is the case-insensitive constructor parameter matching that is tripping the serializer up. An anonymous type has exactly one constructor -- an auto-generated parameterized constructor whose argument names match the property names exactly. And because the case-insensitive parameter-to-constructor-argument binding algorithm does not distinguish between input and INPUT in the argument list, an exception gets thrown during contract generation that multiple properties cannot both bind with parameter 'input'.
Now, arguably, the exception should not be thrown during contract generation, but during deserialization. After all, if System.Text.Json cannot determine a unique constructor to use during deserialization, it still will allow you to serialize the type (demo here). You may want to report an issue to Microsoft regarding the regression.
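The claim about the auto-generated constructor can be checked with reflection. The following is a small sketch, not part of the original answer; AnonymousCtorDemo is a hypothetical name, and it assumes the compiler emits the anonymous type's single public constructor with parameter names equal to the property names, as described above.

```csharp
using System;
using System.Linq;

static class AnonymousCtorDemo
{
    // Returns the parameter names of the anonymous type's only constructor.
    public static string[] ConstructorParameterNames()
    {
        var obj = new { input = 1, INPUT = 2 };
        var ctor = obj.GetType().GetConstructors().Single();
        return ctor.GetParameters().Select(p => p.Name).ToArray();
    }

    static void Main()
    {
        var names = ConstructorParameterNames();
        Console.WriteLine(string.Join(", ", names));

        // Number of names left once case is ignored; a count lower than
        // names.Length means a case-insensitive match is ambiguous.
        Console.WriteLine(
            names.Distinct(StringComparer.OrdinalIgnoreCase).Count());
    }
}
```

If the two parameter names differ only in case, the second number printed is smaller than the number of parameters, which is the ambiguity the serializer reports.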
As a workaround, you will need to serialize using types for which this is not a problem, e.g.:
Use mutable types with parameterless constructors (which you are doing with your Obj class).
Use records with distinct property names whose JSON property names are overridden via JsonPropertyNameAttribute, e.g.:
public record Foo(string foo);
public record Input([property:JsonPropertyName("input")] Foo lowerInput, [property:JsonPropertyName("INPUT")] Foo upperInput);
var obj = new Input(new("foo"), new("foo"));
var json = JsonSerializer.Serialize(obj); // Works fine.
Demo #1 here.
Or if you prefer a more generic solution, you can use a record for the object with the case-invariant duplicate names and anonymous types for everything else like so:
public static class InputExtensions
{
public record Input<TInput>([property:JsonPropertyName("input")] TInput lowerInput, [property:JsonPropertyName("INPUT")] TInput upperInput);
public static Input<TInput> ToInput<TInput>(this (TInput, TInput) pair) => new Input<TInput>(pair.Item1, pair.Item2);
}
var obj = (new { foo = "foo" }, new { foo = "foo" }).ToInput();
var json = JsonSerializer.Serialize(obj); // works fine.
Demo #2 here.
Use a dictionary:
var obj = new [] { ("input", new { foo = "foo" } ), ("INPUT", new { foo = "foo" } ) }
.ToDictionary(i => i.Item1, i => i.Item2);
var json = JsonSerializer.Serialize(obj); // works fine.
Demo #3 here.
Revert back to Json.NET temporarily until the regression is fixed by Microsoft, as suggested by Serge.

Why does a combination of object and collection initializers use Add method?

The following combination of object and collection initializers does not give a compilation error, but it is fundamentally wrong (https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/object-and-collection-initializers#examples), because the Add method will be used in the initialization:
public class Foo
{
public List<string> Bar { get; set; }
}
static void Main()
{
var foo = new Foo
{
Bar =
{
"one",
"two"
}
};
}
So you'll get a NullReferenceException. What is the reason for making such an unsafe decision while developing the syntax of the language? Why not initialize a new collection, for example?
First, this is not only about the combination of object and collection initializers. What you are referring to here is called a nested collection initializer, and the same rule (or issue, in your opinion) applies to nested object initializers. So if you have the following classes:
public class Foo
{
public Bar Bar { get; set; }
}
public class Bar
{
public string Baz { get; set; }
}
and you use the following code
var foo = new Foo
{
Bar = { Baz = "one" }
};
you'll get the same NRE at runtime, because no new Bar is created; the code instead attempts to set the Baz property of the existing Foo.Bar, which is null.
In general the syntax for object/collection initializer is
target = source
where the source could be an expression, object initializer or collection initializer. Note that new List<Bar> { … } is not a collection initializer - it's an object creation expression (after all, everything is an object, including collection) combined with collection initializer. And here is the difference - the idea is not to omit the new, but give you a choice to either use creation expression + object/collection initializer or only initializers.
Unfortunately the C# documentation does not explain that concept, but C# specification does that in the Object Initializers section:
A member initializer that specifies an object initializer after the equals sign is a nested object initializer, i.e. an initialization of an embedded object. Instead of assigning a new value to the field or property, the assignments in the nested object initializer are treated as assignments to members of the field or property. Nested object initializers cannot be applied to properties with a value type, or to read-only fields with a value type.
and
A member initializer that specifies a collection initializer after the equals sign is an initialization of an embedded collection. Instead of assigning a new collection to the target field, property or indexer, the elements given in the initializer are added to the collection referenced by the target.
So why is that? First, because it clearly does exactly what you are telling it to do. If you need new, then use new, otherwise it works as assignment (or add for collections).
There are other reasons too. The target property might not be settable (as already mentioned in other answers). It might also be of a type that cannot be instantiated (e.g. an interface or an abstract class). And even when it is a concrete class (as opposed to a struct), how would the compiler decide that it should use new List<Bar> (or new Bar in my example) instead of new MyBarList, if we have
class MyBarList : List<Bar> { }
or new MyBar if we have
class MyBar : Bar { }
As you can see, the compiler cannot make such assumptions, so IMO the language feature is designed to work in the quite clear and logical way. The only confusing part probably is the usage of the = operator for something else, but I guess that was a tradeoff decision - use the same operator = and add new after that if needed.
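The points above can be seen together in one small sketch. Holder and the string-based MyBarList below are hypothetical names used only for illustration: the property is getter-only and typed as an interface, so the compiler could neither assign a new collection nor know which concrete type to create, yet the nested collection initializer still works because it only calls Add on the existing instance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical derived list, in the spirit of MyBarList above.
class MyBarList : List<string> { }

class Holder
{
    // Getter-only and interface-typed: nothing here could be "new"-ed
    // or assigned by an initializer.
    public IList<string> Items { get; } = new MyBarList();
}

static class NestedCollectionDemo
{
    public static Holder Build()
    {
        // Expands to Items.Add("one"); Items.Add("two"); on the
        // collection that Holder created itself.
        return new Holder { Items = { "one", "two" } };
    }

    static void Main()
    {
        var holder = Build();
        Console.WriteLine(holder.Items.Count);          // 2
        Console.WriteLine(holder.Items.GetType().Name); // MyBarList
    }
}
```

The owning class stays in control of which concrete collection is used, and callers still get the compact initializer syntax.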
Take a look at this code and the output of it due to the Debug.WriteLine():
public class Foo
{
public ObservableCollection<string> _bar = new ObservableCollection<string>();
public ObservableCollection<string> Bar
{
get
{
Debug.WriteLine("Bar property getter called");
return _bar;
}
set
{
Debug.WriteLine("Bar allocated");
_bar = value;
}
}
public Foo()
{
_bar.CollectionChanged += _bar_CollectionChanged;
}
private void _bar_CollectionChanged(object sender, NotifyCollectionChangedEventArgs e)
{
Debug.WriteLine("Item added");
}
}
public MainWindow()
{
Debug.WriteLine("Starting..");
var foo = new Foo
{
Bar =
{
"one",
"two"
}
};
Debug.WriteLine("Ending..");
}
The output is:
Starting..
Bar property getter called
Item added
Bar property getter called
Item added
Ending..
As for your question:
What is the reason for making such an unsafe decision while developing the syntax of the language? Why not to use initialization of a new collection for example?
Answer:
As you can see the intention of the designer of that feature was not to reallocate the collection but rather to help you add items to it more easily considering that you manage your collection allocation by yourself.
Hope this clears things up ;)
Consider the following code:
class Program
{
static void Main()
{
var foo = new Foo
{
Bar =
{
"one",
"two"
}
};
}
}
public class Foo
{
public List<string> Bar { get; set; } = new List<string>();
}
The compiler does not know whether you already created a new list instance within the class constructor (or in another method).
Recall that a collection initializer is a series of calls to the Add method on an existing collection!
See also:
Custom Collection Initializers
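As a minimal sketch of the "series of calls to Add" point, the hypothetical Tags class below (not from the original answers) implements IEnumerable and exposes a public Add method, which is what makes the collection initializer syntax available for it. The AddCalls counter exists only to make the calls observable.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical custom collection used for illustration only.
class Tags : IEnumerable<string>
{
    private readonly List<string> _items = new List<string>();

    // Counts how often Add was invoked.
    public int AddCalls { get; private set; }

    public void Add(string tag)
    {
        AddCalls++;
        _items.Add(tag.ToLowerInvariant());
    }

    public IEnumerator<string> GetEnumerator() => _items.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

static class CustomInitializerDemo
{
    // The braces compile to two calls of Tags.Add on the new instance.
    public static Tags Build() => new Tags { "One", "Two" };

    static void Main()
    {
        var tags = Build();
        Console.WriteLine(tags.AddCalls);          // 2
        Console.WriteLine(string.Join(",", tags)); // one,two
    }
}
```

Because the values pass through Add, any logic in that method (here the lower-casing) runs for every element listed in the initializer.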
Also note that this initializer applies to a collection that was exposed as a property. Hence the collection initializer is possible as part of the outer object initializer (the Foo object in your example).
However, if it were a simple variable, the compiler would not let you initialize the collection this way. Here is an example:
List<string> list =
{
"one",
"two"
};
This will produce a compilation error.
As a last example, the output of the following code will be: "one, two, three, four, ". I think you now understand why.
Pay attention to the static list instance, as well as to the private modifier on the set accessor of the Bar property. The modifier does not matter, because the initializer just calls the Add method, which is accessible even when the Bar setter is private.
class Program
{
static void Main()
{
var foo1 = new Foo
{
Bar =
{
"one",
"two"
}
};
var foo2 = new Foo
{
Bar =
{
"three",
"four"
}
};
PrintList(foo1.Bar);
}
public static void PrintList(List<string> list)
{
foreach (var item in list)
{
Console.Write(item + ", ");
}
Console.WriteLine();
}
}
public class Foo
{
private static readonly List<string> _bar = new List<string>();
public List<string> Bar { get; private set; } = _bar;
}
I believe the key thing to understand here is that there are two syntactic sugar flavors at play (or at least, there should be):
Object Initialization
Collection Initialization
Take away the List for a moment and look at the field as an object:
public class Foo
{
public object Bar { get; set; }
}
When using Object Initialization, you assign an object (or null):
var foo = new Foo()
{
Bar = new object() // or Bar = null
};
Now, let's go back to your original example and slap Collection Initialization on top of this. This time around, the compiler sees that the property's type implements IEnumerable and has an Add method that accepts the values you have provided, so it emits calls to that Add method. It must first fetch the object, which in your case is null because you haven't initialized it internally. If you debug through this, you will find that the getter gets called and returns null, hence the error.
The correct way of mixing both features then would be for you to assign a new object that you initialize with your values:
var foo = new Foo()
{
Bar = new List<string>(){ "one", "two" }
};
If you debug this version, you will find that the setter is called instead, with the new instance you initialized.
Alternatively, you can initialize your property internally:
public List<string> Bar { get; set; } = new List<string>();
If you debug this version, you will find that the property is first initialized with a value and your version of the code then executes without error (by calling the getter first):
var foo = new Foo()
{
Bar = {"one", "two"}
};
To illustrate the syntactic sugar aspect, Collection Initialization only works within the confines of a constructor calling statement:
List<string> bar = {"one", "two" }; //ERROR: Can only use array initializer expressions to assign to array types. Try using a new expression instead.
List<string> bar = new[] { "one", "two" }; //ERROR: Cannot implicitly convert type 'string[]' to 'System.Collections.Generic.List<string>'
List<string> bar = new List<string>() { "one", "two" }; //This works!
If you wish to allow initialization like in your original example, then the expectation is that the variable will be set to an instance before the Add method can be called. This is true whether you use syntactic sugar or not. I could just as well run into the same error by doing this:
var foo = new Foo();
foo.Bar.Add("one");
So you may want to initialize the variable in order to cover all bases, unless of course a null value has a semantic meaning in your application.

What is happening with this C# object initializer code?

What is going on with this C# code? I'm not even sure why it compiles. Specifically, what's going on where it's setting Class1Prop attempting to use the object initializer syntax? It seems like invalid syntax but it compiles and produces a null reference error at runtime.
void Main()
{
var foo = new Class1
{
Class1Prop =
{
Class2Prop = "one"
}
};
}
public class Class1
{
public Class2 Class1Prop { get; set; }
}
public class Class2
{
public string Class2Prop { get; set; }
}
This is allowed by object initializer syntax in the C# specification, where it is called a nested object initializer. It is equivalent to:
var _foo = new Class1();
_foo.Class1Prop.Class2Prop = "one";
var foo = _foo;
It should be a little more obvious why this throws a null reference exception. Class1Prop was never initialized in the constructor of Class1.
The benefit of this syntax is that the caller can use the convenient object initializer syntax even when the properties are getter-only to set mutable properties on nested objects. For example, if Class1Prop was a getter-only property the example is still valid.
Note that there is an inaccessible temporary variable created to prevent the access of a field or array slot before the full initialization has run.
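One observable consequence of that temporary is that a failed initializer never touches the target variable. The sketch below reuses the Class1 and Class2 shapes from the question; TemporaryDemo and its method name are hypothetical.

```csharp
using System;

class Class2 { public string Class2Prop { get; set; } }
class Class1 { public Class2 Class1Prop { get; set; } }

static class TemporaryDemo
{
    public static bool TargetKeepsOldValue()
    {
        var original = new Class1 { Class1Prop = new Class2() };
        var foo = original;
        try
        {
            // Throws inside the initializer: the new object's
            // Class1Prop is null when Class2Prop is assigned.
            foo = new Class1 { Class1Prop = { Class2Prop = "one" } };
        }
        catch (NullReferenceException)
        {
            // Swallowed on purpose so the state of foo can be inspected.
        }

        // The half-initialized object lived only in the hidden temporary,
        // so foo still refers to the original instance.
        return ReferenceEquals(foo, original);
    }

    static void Main()
    {
        Console.WriteLine(TargetKeepsOldValue()); // True
    }
}
```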

What is the difference between an array indexer and any other object indexer

Consider the following two data types:
class C
{
public int I { get; set; }
}
struct S
{
public int I { get; set; }
}
Let's try to use them inside the list, for example:
var c_list = new List<C> { new C { I = 1 } };
c_list[0].I++;
var s_list = new List<S> { new S { I = 1 } };
s_list[0].I++; // (a) CS1612 compilation error
As expected, there is a compilation error on line (a): CS1612 Cannot modify the return value of 'List<UserQuery.S>.this[int]' because it is not a variable. This is fine, because we are actually trying to modify a temporary copy of S, which is an r-value in the given context.
But let's try to do same thing for an array:
var c_arr = new[] { new C { I = 1 } };
c_arr[0].I++;
var s_arr = new[] { new S { I = 1 } };
s_arr[0].I++; // (b)
And.. this works.
But
var s_arr_list = (IList<S>) s_arr;
s_arr_list[0].I++;
will not compile, as expected.
If we look at the produced IL, we will find the following:
IL_0057: ldloc.1 // s_arr
IL_0058: ldc.i4.0 // index
IL_0059: ldelema UserQuery.S // managed pointer to the element
ldelema loads the address of the array element onto the top of the evaluation stack. Such behavior is expected with fixed arrays and unsafe pointers, but in a safe context it is a bit unexpected. Why is there a special, non-obvious case for arrays? And why is there no option to achieve the same behavior for members of other types?
An array access expression is classified as a variable. You can assign to it, pass it by reference etc. An indexer access is classified separately... in the list of classifications (C# 5 spec section 7.1.)
An indexer access. Every indexer access has an associated type, namely the element type of the indexer. Furthermore, an indexer access has an associated instance expression and an associated argument list. When an accessor (the get or set block) of an indexer access is invoked, the result of evaluating the instance expression becomes the instance represented by this (§7.6.7), and the result of evaluating the argument list becomes the parameter list of the invocation.
Think of this as similar to the difference between a field and a property:
public class Test
{
public int PublicField;
public int PublicProperty { get; set; }
}
...
public void MethodCall(ref int x) { ... }
...
Test test = new Test();
MethodCall(ref test.PublicField); // Fine
MethodCall(ref test.PublicProperty); // Not fine
Fundamentally, an indexer is a pair of methods (or a single one) whereas an array access gives you a storage location.
Note that if you weren't using a mutable struct to start with, you wouldn't see the difference in this way - I'd strongly advise against using mutable structs at all.
A class indexer like the one in List<T> is actually a syntactically convenient way of calling a method.
With arrays, however, you are actually accessing the structure in memory. There is no method call in that case.
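Since the List<T> indexer is a get/set method pair, changing a struct element means going through both methods explicitly. The following sketch shows that copy, modify, write-back pattern next to the direct array access; StructElementDemo and its method names are hypothetical, and S matches the struct from the question.

```csharp
using System;
using System.Collections.Generic;

struct S
{
    public int I { get; set; }
}

static class StructElementDemo
{
    public static int IncrementInList()
    {
        var list = new List<S> { new S { I = 1 } };

        var copy = list[0]; // indexer getter: returns a copy of the struct
        copy.I++;           // modifies the local copy only
        list[0] = copy;     // indexer setter: stores the copy back

        return list[0].I;
    }

    public static int IncrementInArray()
    {
        var arr = new[] { new S { I = 1 } };

        // Array element access is a variable, so this updates the
        // element in place without any method call.
        arr[0].I++;

        return arr[0].I;
    }

    static void Main()
    {
        Console.WriteLine(IncrementInList());  // 2
        Console.WriteLine(IncrementInArray()); // 2
    }
}
```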

The difference between new List() {...} and new List {...} [duplicate]

When initializing a new List in C#, both of the following will compile:
(1) List<string> s = new List<string>() { "value" };
and
(2) List<string> s = new List<string> { "value" };
What is the difference between case 1 and case 2?
This holds true for any type. Generally, you specify the () if you need to pass something to the type's constructor that it doesn't expose for setting publicly. If no parameters are needed, the () are just fluff.
Consider the scenario that you may want to do some additional validation/logic to a property and you don't allow direct manipulation of it:
public class Foo
{
private string Bar { get; set; }
public string FooBar { get; set; }
public Foo (string bar)
{
Bar = bar + "foo";
}
}
So this is allowed:
var foo = new Foo("bar")
{
FooBar = "foobar"
};
Yet this isn't:
var foo = new Foo
{
Bar = "bar",
FooBar = "foobar"
};
There are some C# types that only allow you to set certain properties within the constructor.
There is no difference. It will be translated as:
List<string> s1 = new List<string>();
s1.Add("value");
List<string> s2 = new List<string>();
s2.Add("value");
From 7.6.10.1 Object creation expressions of the C# reference 5.0:
An object creation expression can omit the constructor argument list and enclosing parentheses provided it includes an object initializer or collection initializer. Omitting the constructor argument list and enclosing parentheses is equivalent to specifying an empty argument list.
The latter is called Object Initialization. You may also use MyClass m = new MyClass { MyVar = 3 };
