In C# there is an is operator for checking whether an object is compatible with some type. The operator effectively asks whether a cast to that type would succeed: it evaluates to true if it would and false if it would not.
From Jeffrey Richter's CLR via C#:
The is operator checks whether an object is compatible with a given
type, and the result of the evaluation is a Boolean: true or false.
if (o is Employee)
{
    Employee e = (Employee) o;
    // Use e within the remainder of the 'if' statement.
}
In this code, the CLR is actually checking the object’s type twice:
The is operator first checks to see if o is compatible with the
Employee type. If it is, inside the if statement, the CLR again
verifies that o refers to an Employee when performing the cast. The
CLR’s type checking improves security, but it certainly comes at a
performance cost, because the CLR must determine the actual type of
the object referred to by the variable (o), and then the CLR must walk
the inheritance hierarchy, checking each base type against the
specified type (Employee).
Also, from the same book:
Employee e = o as Employee;
if (e != null)
{
    // Use e within the 'if' statement.
}
In this code, the CLR checks if o is compatible with the Employee
type, and if it is, as returns a non-null reference to the same
object. If o is not compatible with the Employee type, the as operator
returns null. Notice that the as operator causes the CLR to verify an
object’s type just once. The if statement simply checks whether e is
null; this check can be performed faster than verifying an object’s
type.
So, my question is: why do we need the is operator? In which cases is the is operator preferable to as?
why do we need the is operator?
We don't need it. It is redundant. If the is operator were not in the language you could emulate it by simply writing
(x as Blah) != null
for reference types and
(x as Blah?) != null
for value types.
In fact, that is all is is; if you look at the IL, both is and as compile down to the same IL instruction.
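For example, a minimal sketch of that emulation (x and Blah stand in for your own variable and type, as above):

object x = "hello";

bool viaIs = x is string;               // true
bool viaAs = (x as string) != null;     // same result for reference types

bool isInt = x is int;                  // false: the boxed value is a string
bool asInt = (x as int?) != null;       // value types need the nullable form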
Your first question cannot be answered because it presumes a falsehood. Why do we need this operator? We don't need it, so there is no reason why we need it. So this is not a productive question to ask.
In which cases is the is operator preferable to as?
I think you meant to ask
why would I write the "inefficient" code that does two type checks -- is followed by a cast -- when I could write the efficient code that does one type check using as and a null check?
First of all, the argument from efficiency is weak. Type checks are cheap, and even if they are expensive, they're probably not the most expensive thing you do. Don't change code that looks perfectly sensible just to save those few nanoseconds. If you think the code looks nicer or is easier to read using is rather than as, then use is rather than as. There is no product in the marketplace today whose success or failure was predicated on using as vs is.
Or, look at it the other way. Both is and as are evidence that your program doesn't even know what the type of a value is, and programs where the compiler cannot work out the types tend to be (1) buggy, and (2) slow. If you care so much about speed, don't write programs that do one type test instead of two; write programs that do zero type tests instead of one! Write programs where typing can be determined statically.
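To make that last point concrete, here is a hedged sketch of what "zero type tests" can look like; the class design is illustrative, not something from the original answer:

// Give the base type a virtual member instead of asking whether an
// Employee happens to be a Manager.
class Employee
{
    public virtual int Reports => 0;
}

class Manager : Employee
{
    private readonly int reports;
    public Manager(int reports) { this.reports = reports; }
    public override int Reports => reports;
}

// The call site needs no is, as, or cast at all:
int CountReports(Employee e) => e.Reports;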
Second, there are situations in C# where you need an expression, not a statement, and C# unfortunately does not have "let" expressions outside of queries. You can write
... e is Manager ? ((Manager)e).Reports : 0 ...
as an expression but pre C# 7 there was no way to write
Manager m = e as Manager;
in an expression context. In a query you could write either
from e in Employees
select e is Manager ? ((Manager)e).Reports : 0
or
from e in Employees
let m = e as Manager
select m == null ? 0 : m.Reports
but there is no "let" in an expression context outside of queries. It would be nice to be able to write
... let m = e as Manager in m == null ? 0 : m.Reports ...
in an arbitrary expression. But we can get some of the way there. In C# 7 you'll (probably) be able to write
e is Manager m ? m.Reports : 0 ...
which is a nice sugar and eliminates the inefficient double-check. The is-with-new-variable syntax nicely combines everything together: you get a Boolean type test and a named, typed reference.
Now, what I just said is a slight lie; as of C# 6 you can write the code above as
(e as Manager)?.Reports ?? 0
which does the type check once. But pre C# 6.0 you were out of luck; you pretty much always had to do the type check twice if you were in an expression context.
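For instance, the earlier query could be written in the C# 6 style as something like this (a sketch reusing the Employees, Manager, and Reports names from above):

var reportCounts =
    from e in Employees
    select (e as Manager)?.Reports ?? 0;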
With C# 7, the is operator can be less wordy than as.
Compare this
Employee e = o as Employee;
if (e != null)
{
    // Use e within the 'if' statement.
}
and this
if (o is Employee e)
{
    // Use e within the 'if' statement.
}
Information from here, in the section "Pattern Matching with is Expressions".
There are times when you might want to just check the type, not actually go through the effort of casting it.
In those cases you can use the is operator to confirm your object is compatible and run whatever logic you want. In other scenarios you may want to cast (safely or otherwise) and use the returned value.
Ultimately, because is just returns a boolean, you can use it purely for checking.
as and a (T)MyType cast are for actually converting: as safely yields null on failure, while the cast throws an exception.
How to: Safely Cast by Using as and is Operators (C# Programming Guide)
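A small sketch of that distinction (GetSomething is a hypothetical source of an object of unknown type):

object obj = GetSomething();   // hypothetical: some object of unknown type

// Only need a yes/no answer? 'is' alone is enough:
if (obj is IDisposable)
{
    Console.WriteLine("This object will need disposing.");
}

// Need to use the value? Convert once with 'as' and check the result:
var disposable = obj as IDisposable;
if (disposable != null)
{
    disposable.Dispose();
}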
At least one use case I can think of is checking whether a certain variable is a value type (as cannot be used in that case).
For instance,
var x = ...;
if (x is bool)
{
    // do something
}
It can also be useful when you don't necessarily need to use the cast, but are simply interested whether or not something is of a certain underlying type.
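For example, a hedged sketch of using is purely as a test, with no cast at all (the items array is illustrative):

using System.Linq;

object[] items = { 1, "two", 3.0, "four" };
bool containsAnyStrings = items.Any(o => o is string);          // true
int numericCount = items.Count(o => o is int || o is double);   // 2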
Related
Sorry if my question seems stupid.
In C++, this code works:
Foo* foo = new Foo();
if (foo)
    ....;
else
    ....;
In C#, this doesn't work:
Object obj = new Object();
if (obj)
    ....;
else
    ....;
because the Object class cannot be implicitly converted to bool (obviously, no problem with that), and it doesn't implement the true operator.
So my question is: why doesn't Object implement the true operator (just check whether it is null or not, sounds easy enough)? Is it just because of code readability or something?
It's because of code clarity. A lot of design choices for C# were made with the goal that the code should be written in such a way that it is immediately obvious what it is trying to do. If you have something like:
Object obj = ...;
if (obj)
....
What does if(obj) mean? Is it checking whether obj is true? Whether it is null? Whether it is 0? It's not clear to someone glancing at the code, and it forces the programmer to consult the C# documentation to see what this particular syntax does. So instead, C# has you say
Object obj = ...;
if (obj == null)
....
This way, it is obvious what you are trying to do.
This is the same reason why C# requires you to initialize your local variables, as well as declare them, before you can use them in code. The value of an uninitialized variable is ambiguous and depends on the compiler's configuration, so instead of forcing you to do research or perform guesswork, C# makes you write code in such a way that your intention is clear.
The fundamental answer to your question is the one given in the accepted answer: because C# was designed to avoid, not perpetuate the design errors of C.
But more specifically: you ask why a type does not implement operator true. The answer to that question is: the purpose of operator true is to implement the short circuiting && and || operators. Since there is no desire to have object implement either & or && or | or ||, there is no reason to implement operator true or operator false.
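To make that concrete, here is a minimal sketch of the kind of type operator true is actually for; MaybeFlag is an illustrative type, not anything from the framework:

struct MaybeFlag
{
    public bool? Value;

    public static bool operator true(MaybeFlag m)  => m.Value == true;
    public static bool operator false(MaybeFlag m) => m.Value == false;

    public static MaybeFlag operator &(MaybeFlag a, MaybeFlag b)
        => new MaybeFlag { Value = a.Value & b.Value };
}

// With those three operators defined, both forms compile:
//   MaybeFlag x = ..., y = ...;
//   if (x) { ... }
//   if (x && y) { ... }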
Boxing.
It would really be horribly confusing for conditionals involving booleans to do the exact opposite when the value is boxed.
What do you want from:
object o = false; // Yes, this is legal, it makes a boxed System.Boolean
if (o) { ... }
For this reason it's fine for bool-like types to be tested in conditions, but not for System.Object which is the base class for all boxed values.
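If you really do have a boxed bool and want its value, the explicit route looks like this (a small sketch):

object o = false;              // boxed System.Boolean

if (o is bool && (bool)o)      // test the type, then unbox to read the value
{
    // runs only when the boxed value is true
}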
I'm interested in some design choices of the C# language.
There is a rule in the C# spec that allows method groups to be used as the expression of the is operator:
class Foo {
static void Main() { if (Main is Foo) Main(); }
}
The condition above is always false, as the specification says:
7.10.10 The is operator
• If E is a method group or the null literal, or if the type of E is a reference type
or a nullable type and the value of E is null, the result is false.
My question: what is the purpose/point/reason of allowing a C# language element with no runtime representation in the CLR, such as a method group, as the operand of such a "runtime" operator as is?
what is the purpose/point/reason of allowing a C# language element with no runtime representation in the CLR, such as a method group, as the operand of such a "runtime" operator as is?
The language design notes archive make no mention of why this decision was made, so any guess at an answer will be a conjecture. They do mention that if the result of the "is" can be statically determined to always be true or false, that it be so determined and produce a warning. It seems plausible that this could simply be an error.
The most common reason for turning what could rightly be an error into a warning (or simply allowing it) is because that lessens the burden upon producers of programs that automatically generate code. However, I don't see a really compelling scenario here.
UPDATE:
I just checked the C# 1.0 specification. It does not have this language in it. It does not say anything about nulls or method group arguments. And of course it says nothing about method group conversions because in C# 1.0 there were no implicit method group conversions; you had to explicitly call "new D(M)" if you wanted to convert method M to delegate type D.
This latter point is justification for "M is D" returning false rather than true; you couldn't legally say "D d = M;" so why should "M is D" be true?
Of course in C# 2.0 this makes less sense, since you can say "D d = M;" in C# 2.0.
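To illustrate the asymmetry (D and M are the placeholder names used above; a sketch, and compiler behavior has varied across versions):

delegate void D();

class C
{
    static void M() { }

    static void Main()
    {
        D d = M;            // fine since C# 2.0: implicit method group conversion
        // bool b = M is D; // per the spec text quoted above, always false
        //                  // (older compilers warned; newer ones reject the operand outright)
    }
}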
I also just asked one of the people present when the "is" operator was designed and he had no memory of ever deciding this question one way or the other. His suspicion was that the original design of the "is" operator was to not give any errors, only warnings, and that all the text in the spec about what to do with method groups and nulls and whatnot was added post-hoc, for the C# 2.0 version of the spec, and based on what the compiler actually did.
In short, it looks like this was a design hole in C# 1.0 that was papered over when the spec was updated for C# 2.0. It does not look like this specific behaviour was desired and deliberately implemented.
This theory is reinforced by the fact that anonymous methods do produce an error when used as an argument to "is" in C# 2.0. It would not be a breaking change to do so, but it would be a breaking change to make "M is D" suddenly start returning true or being an error.
FURTHER UPDATE:
While investigating this I learned something interesting. (To me.) When the feature was originally designed, the design was to allow either the name of a type, or a Type object as the right-hand argument to "is". That idea was abandoned well before C# 1.0 shipped though.
First of all, a method is not a type; MSDN clearly states the following:
The is operator is used to check
whether the run-time type of an object
is compatible with a given type
Example
public static void Test(object o)
{
    Class1 a;
    if (o is Class1) { }
}
From MSDN:
An is expression evaluates to true if both of the following conditions are met:
expression is not null.
expression can be cast to type. That is, a cast expression of the form (type)(expression) will complete without throwing an exception. For more information, see 7.6.6 Cast expressions.
So the reason your example evaluates to false comes down to the second point: the expression has to be castable to the specified type.
I hope I did not misunderstand the question.
Related
How does the "is" operator work in C#?
I have been told that this:
if (x is string)
{
    string y = x as string;
    // Do something
}
is not as good as this:
string y = x as string;
if (y != null)
{
    // Do something
}
Which is better and why?
FxCop issues Warning CA1800 in the first scenario (and not only when using as, but also when using an unchecked cast) as both is and the actual casts require certain type checking operations to determine whether the cast is successful or whether to throw an InvalidCastException.
You might save a few operations by just using as once and then checking the result for null if you are going to use the cast value anyway, rather than checking explicitly with is and then casting anew.
I think the second is better, because in the first case the object's type is checked twice: once by the is operator and again by the as operator, while in the second case it is checked only once.
The is operator checks whether an object can be cast to a specific type.
like
if (someObj is StringBuilder)
{
    StringBuilder ss = someObj as StringBuilder;
    ....
}
The as operator casts an object to a specific type, and returns null if it fails.
like
StringBuilder b = someObj as StringBuilder;
if (b != null) ...
I would use the first approach when more than one type is expected to be stored in x. (You might be passed a string, or a StringBuilder etc). This would allow for non-exception based handling.
If you are always expecting x to hold a certain type of value, then I would use the second option. The check for null is inexpensive and provides a level of validation.
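A sketch of that first approach, assuming x may hold either a string or a StringBuilder (the branch bodies are placeholders):

public static void Handle(object x)
{
    if (x is string)
    {
        string s = (string)x;
        Console.WriteLine("Got a string of length " + s.Length);
    }
    else if (x is StringBuilder)
    {
        StringBuilder sb = (StringBuilder)x;
        Console.WriteLine("Got a builder holding " + sb.Length + " chars");
    }
    else
    {
        Console.WriteLine("Unexpected type: " + x.GetType());
    }
}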
-- Edit --
Wow. After reading some of the comments below, I started looking for more updated information. There is a LOT more to consider when using as vs is vs (casting). Here are two interesting reads I found.
Does it make sense to use "as" instead of a cast even if there is no null check? and http://blogs.msdn.com/b/ericlippert/archive/2009/10/08/what-s-the-difference-between-as-and-cast-operators.aspx?PageIndex=1#comments
Both of which seem to be well summarized by Jon Skeet's blog. http://www.yoda.arachsys.com/csharp/faq/#cast.as
Related
In C#, why ever cast reference types when you can use "as"?
Casting can generate exceptions whereas "as" will evaluate to null if the cast fails.
Wouldn't "as" be easier to use with reference types in all cases?
eg:
MyObject as DataGridView
rather than,
(DataGridView)MyObject
Consider the following alternatives:
Foo(someObj as SomeClass);
and:
Foo((SomeClass)someObj);
If someObj is of the wrong type, the first version passes null to Foo. Some time later, this results in a NullReferenceException being thrown. How much later? It depends on what Foo does. It might store the null in a field, and then minutes later it's accessed by some code that expects it to be non-null.
But with the second version, you find the problem immediately.
Why make it harder to fix bugs?
Update
The OP asked in a comment: isn't it easier to use as and then check for null in an if statement?
If the null is unexpected and is evidence of a bug in the caller, you could say:
SomeClass c = someObj as SomeClass;
if (c == null)
{
    // hmm...
}
What do you do in that if-block? There are two general solutions. One is to throw an exception, so it is the caller's responsibility to deal with their mistake. In which case it is definitely simpler to write:
SomeClass c = (SomeClass)someObj;
It simply saves you writing the if/throw logic by hand.
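In other words, the one-line cast stands in for roughly this hand-written version (a sketch; the real exception message differs):

SomeClass c = someObj as SomeClass;
if (c == null)
{
    throw new InvalidCastException("someObj is not a SomeClass");
}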
There is another alternative though. If you have a "stock" implementation of SomeClass that you are happy to use where nothing better is available (maybe it has methods that do nothing, or return "empty" values, etc.) then you could do this:
SomeClass c = (someObj as SomeClass) ?? _stockImpl;
This will ensure that c is never null. But is that really better? What if the caller has a bug; don't you want to help find bugs? By swapping in a default object, you disguise the bug. That sounds like an attractive idea until you waste a week of your life trying to track down a bug.
(In a way this mimics the behaviour of Objective-C, in which any attempt to use a null reference will never throw; it just silently does nothing.)
The as operator works with reference types (and nullable value types) only.
Sometimes, you want the exception to be thrown. Sometimes, you want to try to convert and nulls are OK. As already stated, as will not work with value types.
One definite reason is that the object is, or could be (when writing a generic method, you may not know at coding-time) being cast to a value type, in which case as isn't allowed.
One more dubious reason is that you already know that the object is of the type in question. Just how dubious depends on how you already know that. In the following case:
if (obj is MyType)
    DoMyTypeStuff((MyType)obj);
else
    DoMoreGeneralStuff(obj);
It's hard to justify using as here, as the only thing it really does is add a redundant check (maybe it'll be optimised away, maybe it won't). At the other extreme, if you are half-way to a trance state with the amount of information you've got in your brain's paged-in memory, and on the basis of that you are pretty sure that the object must be of the type in question, maybe it's better to add in the check.
Another good reason is that the difference between being of the wrong type and being null gets hidden by as. If it's reasonable to be passing in a string to a given method, including a null string, but it's not reasonable to pass in an int, then val as string has just made the incorrect usage look like a completely different correct usage, and you've just made the bug harder to find and potentially more damaging.
Finally, maybe if you don't know the type of the object, the calling code should. If the calling code has called yours incorrectly, they should receive an exception. Either allowing the InvalidCastException to pass back, or catching it and throwing an ArgumentException or similar, is a reasonable and clear means of doing so.
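A sketch of that last option, translating the failure into an argument error for the caller (the method and parameter names are illustrative):

public void Process(object input)
{
    MyType value;
    try
    {
        value = (MyType)input;
    }
    catch (InvalidCastException ex)
    {
        throw new ArgumentException("input must be a MyType", "input", ex);
    }

    // use value ...
}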
If, when you write the code to make the cast, you are sure that the cast should work, you should use (DataGridView)MyObject. This way, if the cast fails in the future, your assumption about the type of MyObject will cause an invalid cast exception at the point where you make the cast, instead of a null reference exception at some point later.
If you do want to handle the case where MyObject is not a DataGridView, then use as, and presumably check for it being null before doing anything with it.
tl;dr If your code assumes something, and that assumption is wrong at run-time, the code should throw an exception.
From MSDN (as (C# reference)):
the as operator only performs reference conversions and boxing conversions. The as operator cannot perform other conversions, such as user-defined conversions, which should instead be performed using cast expressions.
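A minimal sketch of that difference; Celsius is an illustrative type with a user-defined conversion:

class Celsius
{
    public double Degrees;

    // user-defined conversion from double
    public static explicit operator Celsius(double d) => new Celsius { Degrees = d };
}

class Demo
{
    static void Main()
    {
        double reading = 21.5;
        Celsius c1 = (Celsius)reading;        // OK: the cast uses the user-defined conversion
        // Celsius c2 = reading as Celsius;   // compile error: 'as' never invokes it
    }
}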
Taking into consideration all of the comments, we came across this just the other day and wondered why you would do a direct cast over using the keyword as. What if you want the cast to fail? That is sometimes the desirable effect you want from a cast when the object is not of the expected type: you then push the exception up the call stack.
So, if you want something to fail, use a direct cast, if you're okay with it not failing, use the as keyword.
As is faster and doesn't throw exceptions. Therefore it is generally preferred. Reasons to use casts include:
Using as, you can only assign types that are lower in the inheritance tree to ones that are higher. For example:
object o = "abc" as object;
DataGridView d = "abc" as DataGridView // doesn't do anything
DataGridView could create a custom cast that does allow this. Casts are defined on the target type and therefore allow everything, as long as it's defined.
Another problem with as is that it doesn't always work. Consider this method:
IEnumerable<T> GetList<T>(T item)
{
    return (from ... select ...) as IEnumerable<T>;
}
This code fails because T could also be a value type. You can't use as on those because they can never be null. This means you'd have to put a constraint on T, while it is actually unnecessary. If you don't know whether you're going to have a reference type or not, you can never use as.
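A sketch of the constraint in question: as T only compiles once the compiler knows T is a reference type.

static T FindAs<T>(object candidate) where T : class
{
    return candidate as T;   // remove 'where T : class' and this line no longer compiles
}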
Of course, you should always check for null when you use the as keyword. Don't assume no exceptions will be thrown just because the keyword itself doesn't throw any. Don't put a try {} catch (NullReferenceException) {} around it; that's unnecessary bloat. Just assign the value to a variable and check it for null before you use it. Never use it inline in a method call.
Fails:
object o = ((1==2) ? 1 : "test");
Succeeds:
object o;
if (1 == 2)
{
    o = 1;
}
else
{
    o = "test";
}
The error in the first statement is:
Type of conditional expression cannot be determined because there is no implicit conversion between 'int' and 'string'.
Why does there need to be, though? I'm assigning those values to a variable of type object.
Edit: The example above is trivial, yes, but there are examples where this would be quite helpful:
int? subscriptionID; // comes in as a parameter
EntityParameter p1 = new EntityParameter("SubscriptionID", DbType.Int32)
{
    Value = ((subscriptionID == null) ? DBNull.Value : subscriptionID),
};
use:
object o = ((1==2) ? (object)1 : "test");
The issue is that the type of a conditional expression cannot be unambiguously determined here. Between int and string there is no best choice: the compiler only ever picks the type of one of the two operands, and only when the other operand can be implicitly converted to it. Since neither int nor string converts to the other, it's an error.
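A couple of illustrative cases (not from the original answer) of how the operand types decide the result:

var a = true ? 1 : 2.0;              // double: int converts implicitly to double
var b = true ? "one" : (object)2;    // object: string converts implicitly to object
// var c = true ? 1 : "test";        // error: neither operand converts to the other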
Edit:
In your second example:
int? subscriptionID; // comes in as a parameter
EntityParameter p1 = new EntityParameter("SubscriptionID", DbType.Int32)
{
    Value = subscriptionID.HasValue ? (object)subscriptionID : DBNull.Value,
};
PS:
That is not called the 'ternary operator.' It is a ternary operator, but it is called the 'conditional operator.'
Though the other answers are correct, in the sense that they make true and relevant statements, there are some subtle points of language design here that haven't been expressed yet. Many different factors contribute to the current design of the conditional operator.
First, it is desirable for as many expressions as possible to have an unambiguous type that can be determined solely from the contents of the expression. This is desirable for several reasons. For example: it makes building an IntelliSense engine much easier. You type x.M(some-expression. and IntelliSense needs to be able to analyze some-expression, determine its type, and produce a dropdown BEFORE IntelliSense knows what method x.M refers to. IntelliSense cannot know what x.M refers to for sure if M is overloaded until it sees all the arguments, but you haven't typed in even the first argument yet.
Second, we prefer type information to flow "from inside to outside", because of precisely the scenario I just mentioned: overload resolution. Consider the following:
void M(object x) {}
void M(int x) {}
void M(string x) {}
...
M(b ? 1 : "hello");
What should this do? Should it call the object overload? Should it sometimes call the string overload and sometimes call the int overload? What if you had another overload, say M(IComparable x) -- when do you pick it?
Things get very complicated when type information "flows both ways". Saying "I'm assigning this thing to a variable of type object, therefore the compiler should know that it's OK to choose object as the type" doesn't wash; it's often the case that we don't know the type of the variable you're assigning to because that's what we're in the process of attempting to figure out. Overload resolution is exactly the process of working out the types of the parameters, which are the variables to which you are assigning the arguments, from the types of the arguments. If the types of the arguments depend on the types to which they're being assigned, then we have a circularity in our reasoning.
Type information does "flow both ways" for lambda expressions; implementing that efficiently took me the better part of a year. I've written a long series of articles describing some of the difficulties in designing and implementing a compiler that can do analysis where type information flows into complex expressions based on the context in which the expression is possibly being used; part one is here:
http://blogs.msdn.com/ericlippert/archive/2007/01/10/lambda-expressions-vs-anonymous-methods-part-one.aspx
You might say "well, OK, I see why the fact that I'm assigning to object cannot be safely used by the compiler, and I see why it's necessary for the expression to have an unambiguous type, but why isn't the type of the expression object, since both int and string are convertible to object?" This brings me to my third point:
Third, one of the subtle but consistently-applied design principles of C# is "don't produce types by magic". When given a list of expressions from which we must determine a type, the type we determine is always in the list somewhere. We never magic up a new type and choose it for you; the type you get is always one that you gave us to choose from. If you say to find the best type in a set of types, we find the best type IN that set of types. In the set {int, string}, there is no best common type, the way there is in, say, "Animal, Turtle, Mammal, Wallaby". This design decision applies to the conditional operator, to type inference unification scenarios, to inference of implicitly typed array types, and so on.
The reason for this design decision is that it makes it easier for ordinary humans to work out what the compiler is going to do in any given situation where a best type must be determined; if you know that a type that is right there, staring you in the face, is going to be chosen then it is a lot easier to work out what is going to happen.
It also avoids us having to work out a lot of complex rules about what's the best common type of a set of types when there are conflicts. Suppose you have types {Foo, Bar}, where both classes implement IBlah, and both classes inherit from Baz. Which is the best common type, IBlah, that both implement, or Baz, that both extend? We don't want to have to answer this question; we want to avoid it entirely.
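A sketch of that rule, using the names from the paragraph above and an implicitly typed array:

interface IBlah { }
class Baz { }
class Foo : Baz, IBlah { }
class Bar : Baz, IBlah { }

class Demo
{
    static void Main()
    {
        // var items = new[] { new Foo(), new Bar() };   // error: no best type in {Foo, Bar}
        var items = new Baz[] { new Foo(), new Bar() };  // fine once you name the type yourself
    }
}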
Finally, I note that the C# compiler actually gets the determination of the types subtly wrong in some obscure cases. My first article about that is here:
http://blogs.msdn.com/ericlippert/archive/2006/05/24/type-inference-woes-part-one.aspx
It's arguable that in fact the compiler does it right and the spec is wrong; the implementation design is in my opinion better than the spec'd design.
Anyway, that's just a few reasons for the design of this particular aspect of the ternary operator. There are other subtleties here, for instance, how the CLR verifier determines whether a given set of branching paths are guaranteed to leave the correct type on the stack in all possible paths. Discussing that in detail would take me rather far afield.
"Why is feature X this way?" is often a very hard question to answer. It's much easier to describe the actual behavior.
My educated guess as to why: the conditional operator lets you succinctly and tersely use a boolean expression to pick between two related values. They must be related because they are being used in a single location. If the user instead picks two unrelated values, perhaps they had a subtle typo or bug in their code, and the compiler is better off alerting them to this rather than implicitly casting to object, which may be something they did not expect.
"int" is a primitive type, not an object while "string" is considered more of a "primitive object". When you do something like "object o = 1", you're actually boxing the "int" to an "Int32". Here's a link to an article about boxing:
http://msdn.microsoft.com/en-us/magazine/cc301569.aspx
Generally, boxing should be avoided due to performance losses that are hard to trace.
When you use a ternary expression, the compiler does not look at the assignment variable at all to determine what the final type is. To break down your original statement into what the compiler is doing:
Statement:
object o = ((1==2) ? 1 : "test");
Compiler:
1. What are the types of "1" and "test" in '((1==2) ? 1 : "test")'? Do they match?
2. Does the final type from #1 match the type of the assignment target 'object o'?
Since the compiler doesn't evaluate #2 until #1 is done, it fails.