String.IsNullOrEmpty Monad - c#

I have lately been dipping my toes into the fascinating world of functional programming, largely due to gaining experience in FP-influenced platforms like React and reading blogs such as https://blog.ploeh.dk/. As a primarily imperative programmer, this has been an interesting transition, but I am still trying to get my feet under me.
I am getting a little tired of using string.IsNullOrEmpty the usual way. Much of the time I find myself littering my code with expressions such as
_ = string.IsNullOrEmpty(str) ? "default text here" : str;
which isn't so bad as far as it goes, but say I wanted to chain a bunch of fallbacks past that check, e.g.
_ = string.IsNullOrEmpty(str) ? (
        util.TryGrabbingMeAnother() ??
        "default text here") : str;
Yuck. I'd much rather have something like this --
_ = monad.NonEmptyOrNull(str) ??
    util.TryGrabbingMeAnother() ??
    "default text here";
As the sample indicates, I am using a function that I am referring to as a monad to help reduce string.IsNullOrEmpty to a null-chainable operation:
public string NonEmptyOrNull(string source) =>
    string.IsNullOrEmpty(source) ? null : source;
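For what it's worth, the same helper reads a little more naturally as an extension method (just a restatement of the function above, not a change in behavior):

public static class StringExtensions
{
    public static string NonEmptyOrNull(this string source) =>
        string.IsNullOrEmpty(source) ? null : source;
}

// usage:
_ = str.NonEmptyOrNull() ?? util.TryGrabbingMeAnother() ?? "default text here";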
My question is: is this proper terminology? I know Nullable<T> can be considered a monad (see Can Nullable be used as a functor in C#? and Monad in plain English? (For the OOP programmer with no FP background)). These materials are good references, but I still don't have enough of an intuitive grasp of the subject to know whether I'm being confusing or inconsistent here. For example, I know monads are supposed to enable function chaining like I have above, but they are also "type amplifiers" -- so my little example seems to behave like a monad in that it enables chaining, but converting null/empty to just null looks like a reduction rather than an amplification, so I question whether this actually is a monad. For this particular application, could someone with a little more experience in FP tell me whether or not it is accurate to call NonEmptyOrNull a monad, and why or why not?

A monad is a triple consisting of:
A single-argument type constructor M
A function unit of type a -> M a
A function join of type M (M a) -> M a
which satisfies the monad laws.
A type constructor is a type-level function which takes a number of type arguments and returns a type. C# doesn't have this feature directly, but when encoding monads you need a single-argument generic type, e.g. List<T> or Task<T>. For some generic type M you therefore need two functions: one which constructs an instance of the generic type from a single value, and one which 'flattens' a nested instance of the type. For example, for List<T>:
public static List<T> unit<T>(T value) { return new List<T> { value }; }
public static List<T> join<T>(List<List<T>> l) { return l.SelectMany(x => x).ToList(); }
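For reference, the monad laws these two functions must satisfy can be spot-checked for List<T>; a minimal sketch of my own (using SequenceEqual for structural comparison):

using System;
using System.Collections.Generic;
using System.Linq;

static class MonadLawsDemo
{
    static List<T> Unit<T>(T value) => new List<T> { value };
    static List<T> Join<T>(List<List<T>> l) => l.SelectMany(x => x).ToList();

    static void Main()
    {
        var m = new List<int> { 1, 2, 3 };

        // Identity laws: wrapping then flattening is a no-op, whether we
        // wrap the whole list or wrap each element.
        Console.WriteLine(Join(Unit(m)).SequenceEqual(m));                         // True
        Console.WriteLine(Join(m.Select(x => Unit(x)).ToList()).SequenceEqual(m)); // True

        // Associativity: flattening the outer layer first equals
        // flattening the inner layers first.
        var mmm = Unit(Unit(m));
        Console.WriteLine(Join(Join(mmm)).SequenceEqual(
            Join(mmm.Select(x => Join(x)).ToList())));                             // True
    }
}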
From this definition you can see that a single function cannot satisfy the definition of a monad, so your example is not an example of a monad.
By this definition, Nullable<T> also does not have a monad instance since the nested type Nullable<Nullable<T>> cannot be constructed, so join cannot be implemented.
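For instance, the nested type cannot even be written down:

// Neither spelling compiles: the T in Nullable<T> is constrained to be a
// non-nullable value type, so Nullable<Nullable<int>> violates the constraint.
// Nullable<Nullable<int>> nested;  // error CS0453
// int?? nested2;                   // not even valid syntax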

This is more like a filter operation. In C#, you'd idiomatically call it Where. It may be easier to see if we make the distinction between absent and populated values more explicit, which we can do with the Maybe container:
public static Maybe<T> Where<T>(
    this Maybe<T> source,
    Func<T, bool> predicate)
{
    return source.SelectMany(x => predicate(x) ? x.ToMaybe() : Maybe.Empty<T>());
}
There are only a few containers that support filtering. The two most common ones are Maybe (AKA Option) and various collections (i.e. IEnumerable<T>).
In Haskell (which has a more powerful type system than C#) this is enabled via a class named MonadPlus, but I think that the type class Alternative actually ought to be enough to implement filtering. Alternative is described as a monoid on applicative functors. I'm not sure that that's particularly helpful, though.
With the above Where method, you could thread Maybe values through checks like IsNullOrEmpty like this:
var m = "foo".ToMaybe();
var inspected = m.Where(s => !string.IsNullOrEmpty(s));
This will let m pass through unchanged, while the following will not:
var m = "".ToMaybe();
var inspected = m.Where(s => !string.IsNullOrEmpty(s));
You could do the same with Nullable<T>, but I'll leave that as an exercise 😉
It's also possible that you could do it with the new nullable reference types language feature of C# 8, but I haven't tried yet.

I believe in the FP paradigm this is usually solved a step before validating null: the str value must never be null in the first place. Instead, the original method should return an empty collection. That way, chained methods do not have to validate null; the next operation simply does not execute, because there are no elements to operate on (sketched below).
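A sketch of the idea (all names here are hypothetical, just for illustration):

using System;
using System.Collections.Generic;
using System.Linq;

static class Catalog
{
    // Hypothetical data source, purely for illustration.
    private static readonly Dictionary<int, List<string>> titlesById =
        new Dictionary<int, List<string>> { [1] = new List<string> { "alpha", "beta" } };

    // Return an empty sequence rather than null when nothing is found.
    public static IEnumerable<string> FindTitles(int id) =>
        titlesById.TryGetValue(id, out var titles) ? titles : Enumerable.Empty<string>();
}

static class Demo
{
    static void Main()
    {
        // No null checks anywhere: on a miss the chain simply yields nothing,
        // and FirstOrDefault lets us supply the fallback at the end.
        var text = Catalog.FindTitles(42)
            .Select(t => t.ToUpper())
            .FirstOrDefault() ?? "default text here";
        Console.WriteLine(text); // prints "default text here"
    }
}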
There are multiple references related to this on the internet; https://www.informit.com/articles/article.aspx?p=2133373&seqNum=5 is one I could quickly grab.
I learnt this from a Zoran Horvat course on Pluralsight; if you have access, please check it out. The course is "Tactical Design Patterns in .NET: Control Flow" and the module is "Null Object and Special Case Patterns".
Talking of interest in FP, Zoran Horvat also has other courses that help convert OO code into more functional code. I'm quite excited to respond here because lately I've been looking into FP as well. Good luck!

Related

When using object initializers, what does the parenthesis do? [duplicate]

It seems that the C# 3.0 object initializer syntax allows one to exclude the open/close pair of parentheses in the constructor when there is a parameterless constructor existing. Example:
var x = new XTypeName { PropA = value, PropB = value };
As opposed to:
var x = new XTypeName() { PropA = value, PropB = value };
I'm curious why the constructor open/close parentheses pair is optional here after XTypeName?
This question was the subject of my blog on September 20th 2010. Josh and Chad's answers ("they add no value so why require them?" and "to eliminate redundancy") are basically correct. To flesh that out a bit more:
The feature of allowing you to elide the argument list as part of the "larger feature" of object initializers met our bar for "sugary" features. Some points we considered:
the design and specification cost was low
we were going to be extensively changing the parser code that handles object creation anyway; the additional development cost of making the parameter list optional was not large compared to the cost of the larger feature
the testing burden was relatively small compared to the cost of the larger feature
the documentation burden was relatively small compared to the cost of the larger feature
the maintenance burden was anticipated to be small; I don't recall any bugs reported in this feature in the years since it shipped.
the feature does not pose any immediately obvious risks to future features in this area. (The last thing we want to do is make a cheap, easy feature now that makes it much harder to implement a more compelling feature in the future.)
the feature adds no new ambiguities to the lexical, grammatical or semantic analysis of the language. It poses no problems for the sort of "partial program" analysis that is performed by the IDE's "IntelliSense" engine while you are typing. And so on.
the feature hits a common "sweet spot" for the larger object initialization feature; typically if you are using an object initializer it is precisely because the constructor of the object does not allow you to set the properties you want. It is very common for such objects to simply be "property bags" that have no parameters in the ctor in the first place.
Why then did you not also make empty parentheses optional in the default constructor call of an object creation expression that does not have an object initializer?
Take another look at that list of criteria above. One of them is that the change does not introduce any new ambiguity in the lexical, grammatical or semantic analysis of a program. Your proposed change does introduce a semantic analysis ambiguity:
class P
{
    class B
    {
        public class M { }
    }
    class C : B
    {
        new public void M() {}
    }
    static void Main()
    {
        new C().M(); // 1
        new C.M();   // 2
    }
}
Line 1 creates a new C, calls the default constructor, and then calls the instance method M on the new object. Line 2 creates a new instance of B.M and calls its default constructor. If the parentheses on line 1 were optional then line 2 would be ambiguous. We would then have to come up with a rule resolving the ambiguity; we could not make it an error because that would then be a breaking change that changes an existing legal C# program into a broken program.
Therefore the rule would have to be very complicated: essentially that the parentheses are only optional in cases where they don't introduce ambiguities. We'd have to analyze all the possible cases that introduce ambiguities and then write code in the compiler to detect them.
In that light, go back and look at all the costs I mention. How many of them now become large? Complicated rules have large design, spec, development, testing and documentation costs. Complicated rules are much more likely to cause problems with unexpected interactions with features in the future.
All for what? A tiny customer benefit that adds no new representational power to the language, but does add crazy corner cases just waiting to yell "gotcha" at some poor unsuspecting soul who runs into it. Features like that get cut immediately and put on the "never do this" list.
How did you determine that particular ambiguity?
That one was immediately clear; I am pretty familiar with the rules in C# for determining when a dotted name is expected.
When considering a new feature how do you determine whether it causes any ambiguity? By hand, by formal proof, by machine analysis, what?
All three. Mostly we just look at the spec and noodle on it, as I did above. For example, suppose we wanted to add a new prefix operator to C# called "frob":
x = frob 123 + 456;
(UPDATE: frob is of course await; the analysis here is essentially the analysis that the design team went through when adding await.)
"frob" here is like "new" or "++" - it comes before an expression of some sort. We'd work out the desired precedence and associativity and so on, and then start asking questions like "what if the program already has a type, field, property, event, method, constant, or local called frob?" That would immediately lead to cases like:
frob x = 10;
does that mean "do the frob operation on the result of x = 10, or create a variable of type frob called x and assign 10 to it?" (Or, if frobbing produces a variable, it could be an assignment of 10 to frob x. After all, *x = 10; parses and is legal if x is int*.)
G(frob + x)
Does that mean "frob the result of the unary plus operator on x" or "add expression frob to x"?
And so on. To resolve these ambiguities we might introduce heuristics. When you say "var x = 10;" that's ambiguous; it could mean "infer the type of x" or it could mean "x is of type var". So we have a heuristic: we first attempt to look up a type named var, and only if one does not exist do we infer the type of x.
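That heuristic is observable today (a contrived sketch just to illustrate the lookup rule):

// If a type named var is in scope, "var" binds to that type
// and no inference happens.
class var { }

class Demo
{
    static void Main()
    {
        var x = new var(); // x is declared as the class var, not inferred
        // var y = 10;     // would be an error: cannot convert int to var
    }
}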
Or, we might change the syntax so that it is not ambiguous. When they designed C# 2.0 they had this problem:
yield(x);
Does that mean "yield x in an iterator" or "call the yield method with argument x?" By changing it to
yield return(x);
it is now unambiguous.
In the case of optional parens in an object initializer it is straightforward to reason about whether there are ambiguities introduced or not because the number of situations in which it is permissible to introduce something that starts with { is very small. Basically just various statement contexts, statement lambdas, array initializers and that's about it. It's easy to reason through all the cases and show that there's no ambiguity. Making sure the IDE stays efficient is somewhat harder but can be done without too much trouble.
This sort of fiddling around with the spec usually is sufficient. If it is a particularly tricky feature then we pull out heavier tools. For example, when designing LINQ, one of the compiler guys and one of the IDE guys who both have a background in parser theory built themselves a parser generator that could analyze grammars looking for ambiguities, and then fed proposed C# grammars for query comprehensions into it; doing so found many cases where queries were ambiguous.
Or, when we did advanced type inference on lambdas in C# 3.0 we wrote up our proposals and then sent them over the pond to Microsoft Research in Cambridge where the languages team there was good enough to work up a formal proof that the type inference proposal was theoretically sound.
Are there ambiguities in C# today?
Sure.
G(F<A, B>(0))
In C# 1 it is clear what that means. It's the same as:
G( (F<A), (B>0) )
That is, it calls G with two arguments that are bools. In C# 2, that could mean what it meant in C# 1, but it could also mean "pass 0 to the generic method F that takes type parameters A and B, and then pass the result of F to G". We added a complicated heuristic to the parser which determines which of the two cases you probably meant.
Similarly, casts are ambiguous even in C# 1.0:
G((T)-x)
Is that "cast -x to T" or "subtract x from T"? Again, we have a heuristic that makes a good guess.
Because that's how the language was specified. They add no value, so why include them?
It's also very similar to implicitly typed arrays:
var a = new[] { 1, 10, 100, 1000 }; // int[]
var b = new[] { 1, 1.5, 2, 2.5 }; // double[]
var c = new[] { "hello", null, "world" }; // string[]
var d = new[] { 1, "one", 2, "two" }; // Error
Reference: http://msdn.microsoft.com/en-us/library/ms364047%28VS.80%29.aspx
This was done to simplify the construction of objects. The language designers have not (to my knowledge) specifically said why they felt that this was useful, though it is explicitly mentioned in the C# Version 3.0 Specification page:
An object creation expression can omit the constructor argument list and enclosing parentheses, provided it includes an object or collection initializer. Omitting the constructor argument list and enclosing parentheses is equivalent to specifying an empty argument list.
I suppose that they felt the parentheses, in this instance, were not necessary in order to show developer intent, since the object initializer shows the intent to construct and set the properties of the object instead.
In your first example, the compiler infers that you're calling the default constructor (the C# 3.0 Language Specification states that if no parentheses are provided, the default constructor is called).
In the second, you explicitly call the default constructor.
You can also use that syntax to set properties while explicitly passing values to the constructor. If you had the following class definition:
public class SomeTest
{
    public string Value { get; private set; }
    public string AnotherValue { get; set; }
    public string YetAnotherValue { get; set; }
    public SomeTest() { }
    public SomeTest(string value)
    {
        Value = value;
    }
}
All three statements are valid:
var obj = new SomeTest { AnotherValue = "Hello", YetAnotherValue = "World" };
var obj = new SomeTest() { AnotherValue = "Hello", YetAnotherValue = "World"};
var obj = new SomeTest("Hello") { AnotherValue = "World", YetAnotherValue = "!"};
I am no Eric Lippert, so I can't say for sure, but I would assume it is because the empty parentheses are not needed by the compiler in order to infer the initialization construct. They therefore become redundant information.

Syntax alternatives to casting of dynamic objects

I have an implementation of DynamicDictionary where all of the entries in the dictionary are of a known type:
public class FooClass
{
    public void SomeMethod()
    {
    }
}

dynamic dictionary = new DynamicDictionary<FooClass>();
dictionary.foo = new FooClass();
dictionary.foo2 = new FooClass();
dictionary.foo3 = DateTime.Now; // <-- throws exception since DateTime is not FooClass
What I'd like is to be able to have Visual Studio Intellisense work when referencing a method of one of the dictionary entries:
dictionary.foo.SomeMethod() <--would like SomeMethod to pop up in intellisense
The only way I've found to do this is:
((FooClass)dictionary.foo).SomeMethod()
Can anyone recommend a more elegant syntax? I'm comfortable writing a custom implementation of DynamicDictionary with IDynamicMetaObjectProvider.
UPDATE:
Some have asked why dynamics and what my specific problem is. I have a system that lets me do something like this:
UI.Map<Foo>().Action<int, object>(x => x.SomeMethodWithParameters).Validate((parameters) =>
{
    //do some method validation on the parameters
    return true; //return true for now
}).WithMessage("The parameters are not valid");
In this case the method SomeMethodWithParameters has the signature
public void SomeMethodWithParameters(int index, object target)
{
}
What I have right now for registering validation for individual parameters looks like this:
UI.Map<Foo>().Action<int, object>(x => x.SomeMethodWithParameters).GetParameter("index").Validate((val) =>
{
    return true; //valid
}).WithMessage("index is not valid");
What I'd like it to be is:
UI.Map<Foo>().Action<int, object>(x => x.SomeMethodWithParameters).index.Validate((val) =>
{
    return true;
}).WithMessage("index is not valid");
This works using dynamics, but you lose IntelliSense after the reference to index - which is fine for now. The question is: is there a clever syntactical way (other than the ones mentioned above) to get Visual Studio to recognize the type somehow? So far it sounds like the answer is "no".
It seems to me that if there was a generic version of IDynamicMetaObjectProvider,
IDynamicMetaObjectProvider<T>
this could be made to work. But there isn't, hence the question.
In order to get intellisense, you're going to have to cast something to a value that is not dynamic at some point. If you find yourself doing this a lot, you can use helper methods to ease the pain somewhat:
GetFoo(dictionary.Foo).SomeMethod();
But that isn't much of an improvement over what you've got already. The only other way to get intellisense would be to cast the value back to a non-dynamic type or avoid dynamic in the first place.
If you want to use Intellisense, it's usually best to avoid using dynamic in the first place.
typedDictionary["foo"].SomeMethod();
Your example makes it seem likely that you have specific expectations about the structure of your dynamic object. Consider whether there's a way to create a static class structure that would fulfill your needs.
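For instance, if every value is a FooClass anyway, an ordinary generic dictionary keeps full IntelliSense (a minimal illustration):

using System.Collections.Generic;

var typedDictionary = new Dictionary<string, FooClass>();
typedDictionary["foo"] = new FooClass();
typedDictionary["foo"].SomeMethod(); // IntelliSense works: the value type is statically known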
Update
In response to your update: If you don't want to drastically change your syntax, I'd suggest using an indexer so that your syntax can look like this:
UI.Map<Foo>().Action<int, object>(x => x.SomeMethodWithParameters)["index"].Validate((val) => {...});
Here's my reasoning:
You only add four characters (and subtract one) compared to the dynamic approach.
Let's face it: you are using a "magic string." By requiring an actual string, this fact will be immediately obvious to programmers who look at this code. Using the dynamic approach, there's nothing to indicate that "index" is not a known value from the compiler's perspective.
If you're willing to change things around quite a bit, you may want to investigate the way Moq plays with expressions in their syntax, particularly the It.IsAny<T>() method. It seems like you might be able to do something more along these lines:
UI.Map<Foo>().Action(
(x, v) => x.SomeMethodWithParameters(
v.Validate<int>(index => {return index > 1;})
.WithMessage("index is not valid"),
v.AlwaysValid<object>()));
Unlike your current solution:
This wouldn't break if you ended up changing the names of the parameters in the method signature: Just like the compiler, the framework would pay more attention to the location and types of the parameters than to their names.
Any changes to the method signature would cause an immediate flag from the compiler, rather than a runtime exception when the code runs.
Another syntax that's probably slightly easier to accomplish (since it wouldn't require parsing expression trees) might be:
UI.Map<Foo>().Action((x, v) => x.SomeMethodWithParameters)
.Validate(v => new{
index = v.ByMethod<int>(i => {return i > 1;}),
target = v.IsNotNull()});
This doesn't give you the advantages listed above, but it still gives you type safety (and therefore intellisense). Pick your poison.
Aside from an Explicit Cast,
((FooClass)dictionary.foo).SomeMethod();
or a Safe Cast,
(dictionary.foo as FooClass).SomeMethod();
the only other way to switch back to static invocation (which will allow IntelliSense to work) is to do an Implicit Cast:
FooClass foo = dictionary.foo;
foo.SomeMethod();
Declared casting is your only option, can't use helper methods because they will be dynamically invoked giving you the same problem.
Update:
Not sure if this is more elegant but doesn't involve casting a bunch and gets intellisense outside of the lambda:
public class DynamicDictionary<T> : IDynamicMetaObjectProvider
{
    ...
    public T Get(Func<dynamic, dynamic> arg)
    {
        return arg(this);
    }
    public void Set(Action<dynamic> arg)
    {
        arg(this);
    }
}
...
var dictionary = new DynamicDictionary<FooClass>();
dictionary.Set(d => d.Foo = new FooClass());
dictionary.Get(d => d.Foo).SomeMethod();
As has already been said (in the question and StriplingWarrior answer) the C# 4 dynamic type does not provide intellisense support. This answer is provided merely to provide an explanation why (based on my understanding).
To the C# compiler, dynamic is little more than object, with only limited compile-time knowledge of which members an instance supports. The difference is that, at run-time, dynamic attempts to resolve members called on its instances against the actual type of the instance (providing a form of late binding).
Consider the following:
dynamic v = 0;
v += 1;
Console.WriteLine("First: {0}", v);
// ---
v = "Hello";
v += " World";
Console.WriteLine("Second: {0}", v);
In this snippet, v represents both an instance of Int32 (as seen in the first section of code) and an instance of String in the latter. The use of the += operator actually differs between the two different calls to it because the types involved are inferred at run-time (meaning the compiler doesn't understand or infer usage of the types at compile-time).
Now consider a slight variation:
dynamic v;
if (DateTime.Now.Second % 2 == 0)
v = 0;
else
v = "Hello";
v += 1;
Console.WriteLine("{0}", v);
In this example, v could potentially be either an Int32 or a String depending on the time at which the code is run. An extreme example, I know, though it clearly illustrates the problem.
Considering a single dynamic variable could potentially represent any number of types at run-time, it would be nearly impossible for the compiler or IDE to make assumptions about the types it represents prior to its execution, so design-time or compile-time resolution of a dynamic variable's potential members is unreasonable (if not impossible).

Using Expression to call a property and object and determine if the object is null or not

I want to be able to call properties on objects that might be null but not explicitly have to check whether they are null or not when calling.
Like this:
var something = someObjectThatMightBeNull.Property;
My idea is to create a method that takes an Expression, something like this:
var something = GetValueSafe(() => someObjectThatMightBeNull.Property);
TResult? GetValueSafe<TResult>(Expression<Func<TResult>> expression)
    where TResult : struct
{
    // what must I do?
}
What I need to do is to inspect the expression and determine if someObjectThatMightBeNull is null or not. How would I do this?
If there is any smarter way of being lazy I'd appreciate that too.
Thanks!
It's complicated, but it can be done, without leaving "expression-land":
// Get the initial property expression from the left
// side of the initial lambda. (someObjectThatMightBeNull.Property)
var propertyCall = (MemberExpression)expression.Body;
// Next, remove the property, by calling the Expression
// property from the MemberExpression (someObjectThatMightBeNull)
var initialObjectExpression = propertyCall.Expression;
// Next, create a null constant expression, which will
// be used to compare against the initialObjectExpression (null)
var nullExpression = Expression.Constant(null, initialObjectExpression.Type);
// Next, create an expression comparing the two:
// (someObjectThatMightBeNull == null)
var equalityCheck = Expression.Equal(initialObjectExpression, nullExpression);
// Next, create a lambda expression, so the equalityCheck
// can actually be called ( () => someObjectThatMightBeNull == null )
var newLambda = Expression.Lambda<Func<bool>>(equalityCheck, null);
// Compile the expression.
var function = newLambda.Compile();
// Run the compiled delegate.
var isNull = function();
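Assembled into the method shape from the question, that fragment might look like this (a sketch that assumes the expression body is a simple member access on a possibly-null reference):

using System;
using System.Linq.Expressions;

public static class NullSafe
{
    public static TResult? GetValueSafe<TResult>(Expression<Func<TResult>> expression)
        where TResult : struct
    {
        var propertyCall = (MemberExpression)expression.Body;
        var initialObjectExpression = propertyCall.Expression;
        var nullCheck = Expression.Lambda<Func<bool>>(
            Expression.Equal(
                initialObjectExpression,
                Expression.Constant(null, initialObjectExpression.Type)));

        // If the receiver is null, short-circuit; otherwise evaluate the property.
        return nullCheck.Compile()() ? (TResult?)null : expression.Compile()();
    }
}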
That being said, as Andras Zoltan has so eloquently put in the comments: "Just because you can doesn't mean you should." Make sure you have a good reason to do this. If there's a better way to, then do that instead. Andras has a great workaround.
What you're talking about is called null-safe dereferencing - this SO question asks specifically about it: C# if-null-then-null expression.
Expressions aren't really the answer (see below for clarification of my reasons for that statement). This extension method might be, though:
public static TResult? GetValueSafe<TInstance, TResult>(this TInstance instance,
    Func<TInstance, TResult> accessor)
    where TInstance : class
    where TResult : struct
{
    return instance != null ? (TResult?)accessor(instance) : (TResult?)null;
}
And now you can do:
MyObject o = null;
int? i = o.GetValueSafe(obj => obj.SomeIntProperty);
Assert.IsNull(i);
Obviously this is most useful when the property is a struct; you can reduce to any type and just use default(TResult) - but then you'd get 0 for ints, doubles etc:
public static TResult GetValueSafe<TInstance, TResult>(this TInstance instance,
    Func<TInstance, TResult> accessor, TResult def = default(TResult))
    where TInstance : class
{
    return instance != null ? accessor(instance) : def;
}
This second version is more useful specifically because it works for any TResult. I've extended it with an optional parameter to allow the caller to provide the default, e.g. (using o from the previous code):
int i = o.GetValueSafe(obj => obj.SomeIntProperty); //yields 0
i = o.GetValueSafe(obj => obj.SomeIntProperty, -1); //yields -1
//while this yields string.Empty instead of null
string s = o.GetValueSafe(obj => obj.SomeStringProperty, string.Empty);
Edit - in response to David's comment
David suggested my answer is wrong because it doesn't provide an expression-based solution, and that is what was asked for. My point is that any truly correct, and indeed responsible, answer on SO should always try to seek a simpler solution for the person asking the question if one exists. I believe it is widely accepted that over-complicated solutions to otherwise simple problems should be avoided in our day-to-day professional lives; and SO is only as popular as it is because its community behaves in the same way.
David also took issue with my unjustified statement that 'they're not the solution' - so I'm going to expand upon that now and show why the expression-based solution is largely pointless, except in a rare edge-case that the OP doesn't actually ask for (which, incidentally, David's answer doesn't cover either).
The irony being that it makes this answer in itself perhaps unnecessarily complicated :) You can safely ignore from here down if you don't actually care why expressions aren't the best route.
Whilst it is correct to say that you can solve this with expressions, for the examples laid out in the question there is simply no reason to use them - it's over-complicating what is ultimately quite a simple issue; and at runtime the overhead of compiling the expression (and subsequently throwing it away, unless you put caching in, which is going to be tricky to get right unless you emit something like call sites, like the DLR uses) is huge compared to the solution I present here.
Ultimately the motivation of any solution is to try and keep the work required by the caller to a minimum, but at the same time you also need to keep the work that is to be done by the expression analyzer to a minimum as well, otherwise the solution becomes almost unsolvable without a lot of work. To illustrate my point - let's look at the simplest we can achieve with a static method that takes an expression, given our object o:
var i = GetValueSafe(obj => obj.SomeIntProperty);
Uh-oh, that expression doesn't actually do anything - because it's not passing the o to it - the expression itself is useless to us, because we need the actual reference to o that could be null. So - the first of the solutions to this, naturally, is to explicitly pass the reference:
var i = GetValueSafe(o, obj => obj.SomeIntProperty);
(Note - could also be written as an extension method)
Thus the static method's job is to take the first parameter and pass it to the compiled expression when it invokes it. This also helps with identifying the type of the expression whose property is sought. It also, however, completely nullifies the reason to use an expression in the first place; since the method itself can immediately make the decision to access the property or not - since it has a reference to the object that could be null. Thus, in this case, it's easier, simpler and faster to simply pass a reference and accessor delegate (instead of an expression), as my extension method does.
As I mentioned, there is a way to get around having to pass the instance and that is to do one of the following:
var i = GetValueSafe(obj => o.SomeIntProperty);
Or
var i = GetValueSafe(() => o.SomeIntProperty);
We're discounting the extension method version - because with that we get a reference passed to the method, and as soon as we get a reference we can do away with expressions, as my last point proved.
Here we're relying on the caller to understand that they must include an expression that represents the actual instance (be it an in-scope property or field or local variable) in the body of the expression, on the left hand side of the member read, so that we can actually get a concrete value from it in order to do the null-check.
This is not a natural use of expression parameters, first of all, so I believe your callers could be confused. There's also another issue, which I think will be a killer if you intend to use this a lot - you cannot cache these expressions, because each time the instance, whose 'null-ness' you want to sidestep, is being baked into the expression that is passed. This means that you are always having to recompile the expression for every call; and that is going to be really slow. If you parameterise the instance in the expression you can then cache it - but then you end up with our first solution which required the instance to be passed; and again I've already shown there that we can then just use a delegate!
It is relatively easy - using the ExpressionVisitor class - to write something that can turn all property/field reads (and method calls for that matter) into 'safe' calls like you want. However, I cannot see any benefit to doing this unless you intend to do a safe read on something like this: a.b.c.d. But then the augmentation of value types to nullable versions of themselves is going to cause you a good few headaches in the expression-tree rewriting I can tell you; leaving a solution that hardly anyone will understand :)

Type inference in C#

I know MSDN should probably be the first place to go, and it will be after I get the scoop here. What MSDN would not really provide as part of the technical specification is what I am about to ask now:
How exactly is the subject useful in the day-to-day development process?
Does it have any correlation with anonymous types inside the CLR?
What does it allow that could not have been done without it?
Which .NET features depend upon the subject and could not have existed without it?
To bring a note of specifics to the question, it would be really interesting to know (in pseudo code) how the compiler can actually determine the needed type when a method is called using lambdas and type inference.
I am looking to see the compiler's logical flow for locating that type.
Type inference occurs in many places in C#, at least the following:
The var keyword, which tells the compiler to infer (deduce) the correct type for the variable from what you initialize it with
The ability to leave type parameters out of a generic method call as long as they can be deduced from the parameters
The ability to leave out the types of lambda expression parameters, as long as they can be deduced (all three are shown in the sketch below)
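A minimal sketch:

using System;

class InferenceDemo
{
    static void Main()
    {
        var n = 42;                         // 1. var: n is inferred as int
        var pair = Tuple.Create("a", 1);    // 2. generic type arguments inferred from the arguments
        Func<int, int> square = x => x * x; // 3. lambda parameter type inferred from the delegate type
        Console.WriteLine(square(n) + pair.Item2);
    }
}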
And to answer your questions:
1) It saves a lot of typing, especially when using the so-called "LINQ methods". Compare for example
List<string> myList = new List<string>();
// ...
IEnumerable<string> result = myList.Where<string>((string s) => s.Length > 0)
                                   .Select<string, string>((string s) => s.ToLower());
versus
var myList = new List<string>();
// ...
var result = myList.Where(s => s.Length > 0).Select(s => s.ToLower());
2) I don't know what you mean by "correlation", but without the var keyword you couldn't have variables refer to anonymous types in a type-safe way (you could always use object or dynamic), which makes it pretty important when using anonymous types.
3) Nothing as far as I can think of. It's only a convenience feature. Of course its absence would make, for instance, the aforementioned anonymous types less useful, but they're mostly a convenience feature as well.
4) I think 3) answers this as well.
1) It is syntactic sugar.
2) Not that I know of.
3) It greatly simplifies the programmer's job.
4) LINQ.

Are there any good reasons why ternaries in C# are limited?

Fails:
object o = ((1==2) ? 1 : "test");
Succeeds:
object o;
if (1 == 2)
{
o = 1;
}
else
{
o = "test";
}
The error in the first statement is:
Type of conditional expression cannot be determined because there is no implicit conversion between 'int' and 'string'.
Why does there need to be, though? I'm assigning those values to a variable of type object.
Edit: The example above is trivial, yes, but there are examples where this would be quite helpful:
int? subscriptionID; // comes in as a parameter
EntityParameter p1 = new EntityParameter("SubscriptionID", DbType.Int32)
{
    Value = ((subscriptionID == null) ? DBNull.Value : subscriptionID),
};
use:
object o = ((1==2) ? (object)1 : "test");
The issue is that the type of the conditional expression cannot be unambiguously determined. Between int and string there is no best choice. The compiler uses the type of one arm only when the other arm is implicitly convertible to it; here there is no implicit conversion in either direction, so it is an error.
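To illustrate with the operand conversions spelled out:

// long/int is fine: int converts implicitly to long, so the expression is long.
long ok = (1 == 2) ? 1L : 0;

// int/string fails: no implicit conversion in either direction.
// object bad = (1 == 2) ? 1 : "test";      // compile error

// Casting one arm to object fixes it: string converts implicitly to object.
object fine = (1 == 2) ? (object)1 : "test";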
Edit:
In your second example:
int? subscriptionID; // comes in as a parameter
EntityParameter p1 = new EntityParameter("SubscriptionID", DbType.Int32)
{
    Value = subscriptionID.HasValue ? (object)subscriptionID : DBNull.Value,
};
PS:
That is not called the 'ternary operator.' It is a ternary operator, but it is called the 'conditional operator.'
Though the other answers are correct, in the sense that they make true and relevant statements, there are some subtle points of language design here that haven't been expressed yet. Many different factors contribute to the current design of the conditional operator.
First, it is desirable for as many expressions as possible to have an unambiguous type that can be determined solely from the contents of the expression. This is desirable for several reasons. For example: it makes building an IntelliSense engine much easier. You type x.M(some-expression. and IntelliSense needs to be able to analyze some-expression, determine its type, and produce a dropdown BEFORE IntelliSense knows what method x.M refers to. IntelliSense cannot know what x.M refers to for sure if M is overloaded until it sees all the arguments, but you haven't typed in even the first argument yet.
Second, we prefer type information to flow "from inside to outside", because of precisely the scenario I just mentioned: overload resolution. Consider the following:
void M(object x) {}
void M(int x) {}
void M(string x) {}
...
M(b ? 1 : "hello");
What should this do? Should it call the object overload? Should it sometimes call the string overload and sometimes call the int overload? What if you had another overload, say M(IComparable x) -- when do you pick it?
Things get very complicated when type information "flows both ways". Saying "I'm assigning this thing to a variable of type object, therefore the compiler should know that it's OK to choose object as the type" doesn't wash; it's often the case that we don't know the type of the variable you're assigning to because that's what we're in the process of attempting to figure out. Overload resolution is exactly the process of working out the types of the parameters, which are the variables to which you are assigning the arguments, from the types of the arguments. If the types of the arguments depend on the types to which they're being assigned, then we have a circularity in our reasoning.
Type information does "flow both ways" for lambda expressions; implementing that efficiently took me the better part of a year. I've written a long series of articles describing some of the difficulties in designing and implementing a compiler that can do analysis where type information flows into complex expressions based on the context in which the expression is possibly being used; part one is here:
http://blogs.msdn.com/ericlippert/archive/2007/01/10/lambda-expressions-vs-anonymous-methods-part-one.aspx
You might say "well, OK, I see why the fact that I'm assigning to object cannot be safely used by the compiler, and I see why it's necessary for the expression to have an unambiguous type, but why isn't the type of the expression object, since both int and string are convertible to object?" This brings me to my third point:
Third, one of the subtle but consistently-applied design principles of C# is "don't produce types by magic". When given a list of expressions from which we must determine a type, the type we determine is always in the list somewhere. We never magic up a new type and choose it for you; the type you get is always one that you gave us to choose from. If you say to find the best type in a set of types, we find the best type IN that set of types. In the set {int, string}, there is no best common type, the way there is in, say, "Animal, Turtle, Mammal, Wallaby". This design decision applies to the conditional operator, to type inference unification scenarios, to inference of implicitly typed array types, and so on.
The reason for this design decision is that it makes it easier for ordinary humans to work out what the compiler is going to do in any given situation where a best type must be determined; if you know that a type that is right there, staring you in the face, is going to be chosen then it is a lot easier to work out what is going to happen.
It also avoids us having to work out a lot of complex rules about what's the best common type of a set of types when there are conflicts. Suppose you have types {Foo, Bar}, where both classes implement IBlah, and both classes inherit from Baz. Which is the best common type, IBlah, that both implement, or Baz, that both extend? We don't want to have to answer this question; we want to avoid it entirely.
Finally, I note that the C# compiler actually gets the determination of the types subtly wrong in some obscure cases. My first article about that is here:
http://blogs.msdn.com/ericlippert/archive/2006/05/24/type-inference-woes-part-one.aspx
It's arguable that in fact the compiler does it right and the spec is wrong; the implementation design is in my opinion better than the spec'd design.
Anyway, that's just a few reasons for the design of this particular aspect of the ternary operator. There are other subtleties here, for instance, how the CLR verifier determines whether a given set of branching paths are guaranteed to leave the correct type on the stack in all possible paths. Discussing that in detail would take me rather far afield.
Why is feature X this way is often a very hard question to answer. It's much easier to answer the actual behavior.
My educated guess as to why: the conditional operator is allowed to succinctly and tersely use a boolean expression to pick between 2 related values. They must be related because they are being used in a single location. If the user instead picks 2 unrelated values, perhaps they had a subtle typo or bug in their code, and the compiler is better off alerting them to it rather than implicitly casting to object, which may be something they did not expect.
"int" is a primitive type, not an object while "string" is considered more of a "primitive object". When you do something like "object o = 1", you're actually boxing the "int" to an "Int32". Here's a link to an article about boxing:
http://msdn.microsoft.com/en-us/magazine/cc301569.aspx
Generally, boxing should be avoided due to performance losses that are hard to trace.
When you use a ternary expression, the compiler does not look at the assignment variable at all to determine what the final type is. To break down your original statement into what the compiler is doing:
Statement:
object o = ((1==2) ? 1 : "test");
Compiler:
What are the types of "1" and "test" in '((1==2) ? 1 : "test")'? Do they match?
Does the final type from #1 match the assignment operator type for 'object o'?
Since the compiler doesn't evaluate #2 until #1 is done, it fails.
