Why can’t down-casting be checked at compile time? - c#

Why can’t the compiler detect at compile time that obj references an object of type B, and thus report an error when we try to cast it to type A?
public class A { }
public class B { }
static void Main(string[] args)
{
    B b = new B();
    object obj = (object)b;
    A a = (A)obj; // throws InvalidCastException at runtime
}

Because of the halting problem. This essentially means that you cannot decide which execution path the program will follow (and there is a mathematical proof of that). For example, the following code may or may not be correct:
object o = SomeTest() ? (object)new A() : new B(); // cast gives the branches a common type
A a = (A)o;
If the SomeTest method always returns true then the cast is correct. Unfortunately, it is not possible to decide that in general. However, there is a lot of research going on in this field. Even though it cannot always be checked, there are tools that can sometimes verify that something will always succeed, or give you an example of an execution path for which the assumption fails.
A good example of this technique is Code Contracts, which will be part of Visual Studio 2010. I believe you could use them to prove that your down-casting is correct. However, there is no explicit support for this - although it would be useful!
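For illustration, a minimal sketch of how the assumption could be stated for such a checker, using Contract.Assume from System.Diagnostics.Contracts (SomeTest is the hypothetical method from above):
using System.Diagnostics.Contracts;

object o = SomeTest() ? (object)new A() : new B();
Contract.Assume(o is A); // record the assumption for the static checker
A a = (A)o;              // the checker can then try to verify this cast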

Let me turn the question around: if the compiler could prove that, then why would we need casts at all? The purpose of a cast is to tell the compiler "I know more about this code than you do, and I promise you that this cast is valid. I am so sure of that fact that I am willing to let you generate code that throws an exception if I'm wrong." The compiler can't prove that the cast is valid precisely because the cast is for scenarios where the compiler can't prove that it is valid.

A compiler certainly could implement checks that would work in trivial cases like this. But doing so would be unlikely to help "real" code very much, since programmers rarely write such obviously wrong code.
To handle more complicated cases, a compiler would have to perform much more complicated analysis. This is harder for the compiler writer to do, and is also slower for your machine to run, and it still wouldn't be able to catch every possible bad cast. And again, because most code doesn't have easily-identifiable errors of this sort, it's not clear that the payoff would be worth the cost of writing the analysis.
Two drawbacks of more complicated static analysis are error messages and false positives. First, having a tool explain a problem in code is often an order of magnitude harder than having the tool merely check for the problem. Second, as checked-for problems turn from "bad thing X will definitely happen" to "bad thing Y might happen", it becomes much more likely that the tool will flag things that aren't ever a problem in practice.
There's an interesting essay by a company selling static analysis tools (one spun off from academic research). One thing they discovered is that they often made fewer sales with more complicated analyses! A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World

You want the compiler to follow the control flow and determine ahead of time that the cast will cause an exception? Why bother? In a real program, the control flow will be far too complicated to figure this out.

Even static analysis tools wouldn't be able to solve this problem. What if your code uses reflection?
void Test(string typeName)
{
    Type t = Type.GetType(typeName);
    object obj = Activator.CreateInstance(t);
    A a = (A)obj;
    // etc.
}
Will this throw an exception? There is absolutely no way to know without actually running it. No amount of code-path analysis will unravel a bug that depends on the value of some particular parameter. And if you have to run the code to detect the bug, that makes it a runtime error, not a compile-time one.
This is exactly why you need to test your code. Compilers can't ensure that your code is correct, only that it's syntactically valid and follows the rules of the language.
And although this might seem like a contrived example, reflection is used pretty much everywhere these days, from your O/R mapper to your DI framework. It's actually quite common in a modern application not to know the type of some instance, or at least not the specific concrete type, until runtime.
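For instance, assuming A and B live in the calling assembly's global namespace (so Type.GetType can resolve their simple names), the outcome of the method above depends entirely on a runtime value:
Test("A"); // fine: obj is an A
Test("B"); // throws InvalidCastException at runtime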

Because you'd sit there for days while compilers tried every possible path through your code.

As others have mentioned, the general problem is that the compiler would have to trace back through all possible execution paths to see where that variable may have come from - and then determine if the cast is valid.
Imagine the object was passed in to the function, which then downcasts it. The compiler would have to know the run-time type of the object passed in, yet the calling code may not even exist at compile time, if this is a library.
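A minimal sketch of that library situation (the AsA helper is hypothetical):
public static A AsA(object obj)
{
    // Compiled as part of a library: the compiler cannot see every caller,
    // so it cannot know whether obj will actually be an A at run time.
    return (A)obj;
}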

In a basic example like yours, one might think it would be easy for a compiler to intelligently look for all references to a particular object and then see if it's being illegally cast. But consider this counterexample:
public class A { }
public class B { }
static void Main(string[] args)
{
    B b = new B();
    object obj = (object)b;

    // re-using the obj reference
    obj = new A();
    A a = (A)obj; // cast is now valid
}
There are so many possible permutations of ways you could re-use and cast a particular base reference that a compiler writer could never foresee them all. It gets even more complicated when the obj reference is passed as a parameter to a method. Compile-time checking becomes intractable, making compilation times potentially much longer, and it still couldn't be guaranteed to catch all invalid casts.

Related

Why are casting and conversion operations syntactically indistinguishable?

Stack Overflow has several questions about casting boxed values: 1, 2.
The solution requires unboxing the value first and only then casting it to another type. Nevertheless, a boxed value "knows" its own type, and I see no reason why the conversion operator could not be called.
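For a boxed value, that two-step solution looks like this:
object boxed = 42;          // an int, boxed
// long l = (long)boxed;    // InvalidCastException: must unbox to the exact type
long l = (long)(int)boxed;  // unbox to int first, then convert to long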
Moreover, the same issue is valid for reference types:
void Main()
{
    object obj = new A();
    B b = (B)obj;
}
public class A { }
public class B { }
This code throws an InvalidCastException. So it's not a matter of value vs. reference types; it's a question of how the compiler behaves.
For the code above it emits castclass B, and for the code
void Main()
{
    A obj = new A();
    B b = (B)obj;
}
public class A
{
    public static explicit operator B(A obj)
    {
        return new B();
    }
}
public class B { }
it emits call A.op_Explicit.
Aha! Here the compiler sees that A has an operator and calls it. But what happens if B inherits from A? Not so fast, the compiler is quite clever... it just says:
'A.explicit operator B(A)': user-defined conversions to or from a derived class are not allowed
Ha! No ambiguity!
But why on Earth did they allow two rather different operations to look the same?! What was the reason?
Your observation is, as far as I can tell, the observation that I made here:
http://ericlippert.com/2009/03/03/representation-and-identity/
There are two basic usages of the cast operator in C#:
(1) My code has an expression of type B, but I happen to have more information than the compiler does. I claim to know for certain that at runtime, this object of type B will actually always be of derived type D. I will inform the compiler of this claim by inserting a cast to D on the expression. Since the compiler probably cannot verify my claim, the compiler might ensure its veracity by inserting a run-time check at the point where I make the claim. If my claim turns out to be inaccurate, the CLR will throw an exception.
(2) I have an expression of some type T which I know for certain is not of type U. However, I have a well-known way of associating some or all values of T with an “equivalent” value of U. I will instruct the compiler to generate code that implements this operation by inserting a cast to U. (And if at runtime there turns out to be no equivalent value of U for the particular T I’ve got, again we throw an exception.)
The attentive reader will have noticed that these are opposites. A neat trick, to have an operator which means two contradictory things, don’t you think?
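For illustration, a minimal pair showing the two usages side by side:
object o = "hello";
string s = (string)o; // usage (1): a claim about the run-time type, checked by the CLR
double d = 1.5;
int i = (int)d;       // usage (2): a representation-changing conversion to an "equivalent" value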
So apparently you are one of the "attentive readers" I called out who have noticed that we have one operation that logically means two rather different things. This is a good observation!
Your question is "why is that the case?" This is not a good question! :-)
As I have noted many times on this site, I cannot answer "why" questions satisfactorily. "Because that's what the specification says" is a correct answer but unsatisfactory. Really what the questioner is usually looking for is a summary of the language design process.
When the C# language design team designs features, the debates can go on for literally months; they can involve a dozen people discussing many different proposals, each with its own pros and cons, and they generate hundreds of pages of notes. Even if I had the relevant information from the late-1990s meetings about cast operations, which I don't, it seems hard to summarize it concisely in a manner that would be satisfactory to the original questioner.
Moreover, in order to satisfactorily answer this question one would of course have to discuss the entire historical perspective. C# was designed to be immediately productive for existing C, C++ and Java programmers, and so it borrows many of the conventions of these languages, including its basic mechanisms for conversion operators. In order to properly answer the question we would have to discuss the history of the cast operator in C, C++ and Java as well. This seems like far too much information to expect in an answer on StackOverflow.
Frankly, the most likely explanation is that this decision was not the result of long debate between the merits of different positions. Rather, it's likely the language design team considered how it is done in C, C++ and Java, made a reasonable compromise position that didn't look too terrible, and moved on to other more interesting business. A proper answer would therefore be almost entirely historical; why did Ritchie design the cast operator like he did for C? I don't know, and we can't ask him.
My advice to you is that you stop asking "why?" questions about the history of programming language design and instead ask specific technical questions about specific code, questions that have short answers.
Conversion operators are essentially "glorified method calls", so the compiler (as opposed to the runtime) already needs to know that you want to use the conversion operator and not a typecast. Basically the compiler needs to check whether a conversion exists to be able to generate the appropriate bytecode for it.
Your first code sample essentially looks like "convert from object to B", as the compiler has no idea that the variable can only contain an A. According to the rules, that means the compiler must emit a typecast operation.
Your second code sample is obvious to the compiler, because "convert from A to B" can be done with the conversion operator.
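A small sketch of that difference, reusing the A and B types with the explicit operator from above:
A a = new A();
object hidden = a;
B viaOperator = (B)a;      // static type is A: the compiler binds the conversion operator
B viaTypecast = (B)hidden; // static type is object: castclass B, throws at run time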

Does C# support type inference of the return type?

This is just a curiosity about whether there is a fundamental thing stopping something like this (or correct me if there's already some way):
public TTo Convert<TTo, TFrom>(TFrom from)
{
...
}
Called like this:
SomeType someType = converter.Convert(someOtherType);
Because what would happen if you did this?
static void M(int x) { }
static void M(double x) { }
static T N<T>() { return default(T); }
...
M(N());
Now what is T? int or double?
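As the language actually stands, the caller has to resolve the ambiguity by supplying the type argument explicitly:
M(N<int>());    // T is int, calls M(int)
M(N<double>()); // T is double, calls M(double)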
It's all very easy to solve the problem when you know what the type you're assigning to is, but much of the time the type you're assigning to is the thing you're trying to figure out in the first place.
Reasoning from inside to outside is hard enough. Reasoning from outside to inside is far more difficult, and doing both at the same time is extremely difficult. If it is hard for the compiler to understand what is going on, imagine how hard it is for the human trying to read, understand and debug the code when inferences can be made both from and to the type of the context of an expression. This kind of inference makes programs harder to understand, not easier, and so it would be a bad idea to add it to C#.
Now, that said, C# does support this feature with lambda expressions. When faced with an overload resolution problem in which the lambda can be bound two, three, or a million different ways, we bind it two, three or a million different ways and then evaluate those million different possible bindings to determine which one is "the best". This makes overload resolution at least NP-HARD in C#, and it took me the better part of a year to implement. We were willing to make that investment because (1) lambdas are awesome, and (2) most of the time people write programs that can be analyzed in a reasonable amount of time and can be understood by humans. So it was worth the cost. But in general, that kind of advanced analysis is not worth the cost.
C# expressions always* have a fixed type, regardless of their surroundings.
You're asking for an expression whose type is determined by whatever it's assigned to; that would violate this principle.
*) except for lambda expressions, method groups, and the null literal.
Unlike Java, type inference in C# is not based on the return type. And don't ask me why; Eric Lippert has already answered these "why can't C# ..." questions:
because no one ever designed, specified, implemented, tested,
documented and shipped that feature

Automatically exchange explicit type with var-keyword

I want to automatically remove all explicit types and exchange them with the var keyword in a big solution, e.g. instead of
int a = 1;
I want to have:
var a = 1;
This is just cosmetics; the code in the solution works perfectly fine. I just want things to be consistent, as I started out using explicit types but later on switched to the var keyword.
I'm guessing I would have to write some sort of code parser - sounds a little cumbersome. Does anybody know an easy solution to this?
This isn't an answer per se, but it's too long for a comment.
You should strongly consider not doing this. There's no stylistic concern with mixing explicit and inferential typing (you should infer types when you need to, either when using anonymous types or when it makes the code easier to read), and there are plenty of potential issues you'll encounter with this:
Declarations without assignment are ineligible
Declarations that are assigned to null are ineligible
Declarations that are of a supertype but initialized to an instance of a subtype (or compatible but different type) would change their meaning.
For example:
object foo = "test";
...
foo = 2;
Obviously, this is a simple (and unlikely) example, but changing foo from object to var would result in foo being typed as a string instead of object, and would change the semantics of the code (it wouldn't even compile in this case, but you could easily run into more difficult to find scenarios where it changes overload resolution but doesn't produce a compile-time error).
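With var, the first example changes meaning:
var foo = "test"; // foo is now typed as string, not object
// foo = 2;       // no longer compiles: cannot convert int to string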
In other words, don't do this, please.
Firstly, this is probably not such a good idea. There is no advantage to var over int; many declarations will be almost as simple.
But if you must...
A partly manual solution is to turn ReSharper's "Use var" hint into a warning and get it to fix them all up. I don't know if ReSharper will do it en masse, but I often rifle through a badly-done piece of third-party code with a rapid sequence of Alt+PgDn, Alt+Enter.
This has the significant advantage that ReSharper respects the semantics of your code. It won't replace types indiscriminately, and I'm pretty sure it will only make changes that don't affect the meaning of your program. E.g.: It won't replace object o = "hello"; (I think; I'm not in front of VS to check this).
Look into Lex & Yacc. You could combine that with a perl or awk script to mechanically edit your source.
You could also do this in emacs, using CEDET. It parses code modules and produces a table of its code analysis.
In either case you will need to come up with an analysis of the code that describes... class declarations (class name, parent types, start and end points), method declarations (similar), variable declarations, and so on. Then you will write some code (perl, awk, powershell, elisp, whatever) that walks the table, and does the replace on each appropriate variable declaration.
I'd be wary of doing this in an automated fashion. There are places where this may actually change the semantics of the program or introduce errors. For example,
IEnumerable<string> list = MethodThatReturnsListType();
or
string foo = null;
if (!dict.TryGetValue("bar", out foo))
{
    foo = "default";
}
Since these aren't errors, I would simply replace them as you touch the code for other reasons. That way you can inspect the surrounding code and make sure you aren't changing the semantics and avoid introducing errors that need to be fixed.
What about search/replace in the Visual Studio IDE? For example, search for 'int ' and replace it with 'var '.

Use of "var" type in variable declaration

Our internal audit suggests that we use explicit variable type declarations instead of the var keyword. They argue that using var "may lead to unexpected results in some cases".
I am not aware of any difference between an explicit type declaration and the use of var once the code is compiled to MSIL.
The auditor is a respected professional so I cannot simply refuse such a suggestion.
How about this...
double GetTheNumber()
{
    // get the important number from somewhere
    return 42.0; // placeholder so the example compiles
}
And then elsewhere...
var theNumber = GetTheNumber();
DoSomethingImportant(theNumber / 5);
And then, at some point in the future, somebody notices that GetTheNumber only ever returns whole numbers so refactors it to return int rather than double.
Bang! No compiler errors and you start seeing unexpected results, because what was previously floating-point arithmetic has now become integer arithmetic without anybody noticing.
Having said that, this sort of thing should be caught by your unit tests etc, but it's still a potential gotcha.
I tend to follow this scheme:
var myObject = new MyObject(); // OK as the type is clear
var myObject = otherObject.SomeMethod(); // Bad as the return type is not clear
If the return type of SomeMethod ever changes then this code will still compile. In the best case you get compile errors further along, but in the worst case (depending on how myObject is used) you might not. What you will probably get in that case is run-time errors which could be very hard to track down.
Some cases could really lead to unexpected results. I'm a var fan myself, but this could go wrong:
var myDouble = 2;
var myHalf = 1 / myDouble;
Obviously this is a mistake and not an "unexpected result". But it is a gotcha...
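Making the intended type explicit in the initializer avoids the gotcha:
var myDouble = 2.0;        // inferred as double
var myHalf = 1 / myDouble; // 0.5, as intended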
var is not a dynamic type; it is simply syntactic sugar. The only exception to this is with anonymous types. From the Microsoft docs:
In many cases the use of var is optional and is just a syntactic convenience. However, when a variable is initialized with an anonymous type you must declare the variable as var if you need to access the properties of the object at a later point.
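A minimal example of that anonymous-type case:
var person = new { Name = "Ada", Age = 36 }; // var is required here
Console.WriteLine(person.Name);              // properties are strongly typed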
There is no difference once compiled to IL, unless you have explicitly defined the type as different from the one that would be inferred (although I can't think of why you would). The compiler will not let you change the type of a variable declared with var at any point.
From the Microsoft documentation (again)
An implicitly typed local variable is strongly typed just as if you had declared the type yourself, but the compiler determines the type
In some cases var can impede readability. The Microsoft docs further state:
The use of var does have at least the potential to make your code more difficult to understand for other developers. For that reason, the C# documentation generally uses var only when it is required.
In the non-generic world you might get different behavior when using var instead of the type whenever an implicit conversion would occur, e.g. within a foreach loop.
In the example below, an implicit conversion from object to XmlNode takes place (the non-generic IEnumerator interface only returns object). If you simply replace the explicit declaration of the loop variable with the var keyword, this implicit conversion no longer takes place:
using System;
using System.Xml;
class Program
{
    static void Foo(object o)
    {
        Console.WriteLine("object overload");
    }

    static void Foo(XmlNode node)
    {
        Console.WriteLine("XmlNode overload");
    }

    static void Main(string[] args)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><child/></root>");
        foreach (XmlNode node in doc.DocumentElement.ChildNodes)
        {
            Foo(node);
        }
        foreach (var node in doc.DocumentElement.ChildNodes)
        {
            // oops! node is now of type object!
            Foo(node);
        }
    }
}
The result is that this code actually produces different outputs depending on whether you used var or an explicit type. With var the Foo(object) overload will be executed, otherwise the Foo(XmlNode) overload will be. The output of the above program therefore is:
XmlNode overload
object overload
Note that this behavior is perfectly according to the C# language specification. The only problem is that var infers a different type (object) than you would expect and that this inference is not obvious from looking at the code.
I did not add the IL to keep it short. But if you want you can have a look with ildasm to see that the compiler actually generates different IL instructions for the two foreach loops.
It's an odd claim that var should never be used because it "may lead to unexpected results in some cases": there are subtleties in the C# language far more complex than the use of var.
One of these is the implementation details of anonymous methods which can lead to the R# warning "Access to modified closure" and behaviour that is very much not what you might expect from looking at the code. Unlike var which can be explained in a couple of sentences, this behaviour takes three long blog posts which include the output of a disassembler to explain fully:
The implementation of anonymous methods in C# and its consequences (part 1)
The implementation of anonymous methods in C# and its consequences (part 2)
The implementation of anonymous methods in C# and its consequences (part 3)
Does this mean that you also shouldn't use anonymous methods (i.e. delegates, lambdas) and the libraries that rely on them such as Linq or ParallelFX just because in certain odd circumstances the behaviour might not be what you expect?
Of course not.
It means that you need to understand the language you're writing in, know its limitations and edge cases, and test that things work as you expect them to. Excluding language features on the basis that they "may lead to unexpected results in some cases" would mean that you were left with very few language features to use.
If they really want to argue the toss, ask them to demonstrate that a number of your bugs can be directly attributed to use of var and that explicit type declaration would have prevented them. I doubt you'll hear back from them soon.
They argue that using var "may lead to unexpected results in some cases".
If unexpected is, "I don't know how to read the code and figure out what it is doing," then yes, it may lead to unexpected results. The compiler has to know what type to make the variable based on the code written around the variable.
The var keyword is a compile time feature. The compiler will put in the appropriate type for the declaration. This is why you can't do things like:
var my_variable = null
or
var my_variable;
The var keyword is great because you have to put less information in the code itself; the compiler figures out what to do for you. It's almost like always programming to an interface when you use it (where the interface methods and properties are defined by what you use within the declaration space of the variable defined by var). If the type of a variable needs to change (within reason, of course), you don't need to worry about changing the variable declaration; the compiler handles this for you. This may sound like a trivial matter, but what happens if you have to change the return type of a function that is used all throughout the program? If you didn't use var, then you have to find and replace every place that variable is used. With the var keyword, you don't need to worry about that.
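A sketch of that refactoring scenario (GetItems is a hypothetical method):
// If GetItems() later changes its return type, say from List<string> to string[],
// this call site still compiles unchanged:
var items = GetItems();
foreach (var item in items)
    Console.WriteLine(item);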
When coming up with guidelines, as an auditor has to do, it is probably better to err on the side of caution, that is, whitelisting good practices / blacklisting bad practices, as opposed to telling people to simply be sensible and do the right thing based on an assessment of the situation at hand.
If you just say "don't use var anywhere in code", you get rid of a lot of ambiguity in the coding guidelines. This should make code look & feel more standardized without having to solve the question of when to do this and when to do that.
I personally love var. I use it for all local variables. All the time. If the resulting type is not clear, then this is not an issue with var, but an issue with the (naming of) methods used to initialize a variable...
I follow a simple principle when it comes to using the var keyword. If you know the type beforehand, don't use var.
In most cases, I use var with linq as I might want to return an anonymous type.
var is best used when you have an obvious declaration. This:
ArrayList<Entity> en = new ArrayList<Entity>()
complicates readability, whereas
var en = new ArrayList<Entity>()
is lazy, clear code. I like it.
I use var only where it is clear what type the variable is, or where there is no need to know the type at all (e.g. GetPerson() should return a Person, Person_Class, etc.).
I do not use var for primitive types, enums, and strings. I also do not use it for value types, because a value type is copied on assignment, so the type of the variable should be declared explicitly.
About your auditor's comments, I would say that adding more lines of code, as we do every day, also "leads to unexpected results in some cases". The validity of that argument has already been proven by the bugs we have created; therefore, I would suggest freezing the code base forever to prevent that.
Using var is lazy code if you know what the type is going to be. It's just easier and cleaner to read. When looking at lots and lots of code, easier and cleaner is always better.
There is absolutely no difference in the IL output between a variable declaration using var and one explicitly specified (you can prove this using Reflector). I generally only use var for long nested generic types, foreach loops, and anonymous types, as I like to have everything explicitly specified. Others may have different preferences.
var is just shorthand notation for an explicit type declaration.
You can only use var in certain circumstances; you have to initialize the variable at declaration time when using var.
You cannot afterwards assign a value of another type to the variable.
It seems to me that many people tend to confuse the var keyword with the Variant datatype in VB6.
The "only" benefit that i see towards using explicit variable declaration, is with well choosen typenames you state the intent of your piece of code much clearer (which is more important than anything else imo). The var keyword's benefit really is what Pieter said.
I also think that you will run into trouble if you declare your doubles without the D suffix on the end. When you compile the release version, the compiler will likely strip off the double and make them floats to save space, since it will not consider your precision.
var will compile to the same thing as the static type that could be specified. It just removes the need to be explicit with that type in your code. It is not a dynamic type and does not/cannot change at runtime. I find it very useful in foreach loops.
foreach (var item in items)
{
    item.name = ______;
}
When working with enumerations, sometimes the specific type is unknown or time-consuming to look up. Using var instead of the static type will yield the same result.
I have also found that the use of var lends itself to easier refactoring: when an enumeration of a different type is used, the foreach will not need to be updated.
Use of var might hide logical programming errors that you would otherwise have been warned about by the compiler or the IDE. See this example:
float distX = innerDiagramRect.Size.Width / (numObjInWidth + 1);
Here, all the types in the calculation are int, and you get a warning about possible loss of fraction because you pick up the result in a float variable.
Using var:
var distX = innerDiagramRect.Size.Width / (numObjInWidth + 1);
Here you get no warning, because the type of distX is inferred as int. If you intended to use float values, this is a logical error that is hidden from you and hard to spot at runtime, unless the result of this initial calculation is < 1 (and therefore truncated to 0) and triggers a divide-by-zero exception in a later calculation.
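If you do want var there, one way to keep the float semantics is to force the conversion inside the initializer:
var distX = (float)innerDiagramRect.Size.Width / (numObjInWidth + 1); // distX is now float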

Does it make sense to use "as" instead of a cast even if there is no null check? [closed]

In development blogs, online code examples and (recently) even a book, I keep stumbling over code like this:
var y = x as T;
y.SomeMethod();
or, even worse:
(x as T).SomeMethod();
That doesn't make sense to me. If you are sure that x is of type T, you should use a direct cast: (T)x. If you are not sure, you can use as but need to check for null before performing some operation. All that the above code does is to turn a (useful) InvalidCastException into a (useless) NullReferenceException.
Am I the only one who thinks that this a blatant abuse of the as keyword? Or did I miss something obvious and the above pattern actually makes sense?
Your understanding is correct. That sounds like micro-optimization to me. You should use a normal cast when you are sure of the type. Besides generating a more sensible exception, it also fails fast. If you're wrong about your assumption about the type, your program will fail immediately, and you'll be able to see the cause of the failure immediately, rather than waiting for a NullReferenceException or ArgumentNullException or even a logical error sometime in the future. In general, an as expression that's not followed by a null check somewhere is a code smell.
On the other hand, if you are not sure about the cast and expect it to fail, you should use as instead of a normal cast wrapped with a try-catch block. Moreover, use of as is recommended over a type check followed by a cast. Instead of:
if (x is SomeType)
    ((SomeType)x).SomeMethod();
which generates an isinst instruction for the is keyword, and a castclass instruction for the cast (effectively performing the cast twice), you should use:
var v = x as SomeType;
if (v != null)
    v.SomeMethod();
This only generates an isinst instruction. The former method has a potential flaw in multithreaded applications as a race condition might cause the variable to change its type after the is check succeeded and fail at the cast line. The latter method is not prone to this error.
The following solution is not recommended for use in production code. If you really hate such a fundamental construct in C#, you might consider switching to VB or some other language.
In case one desperately hates the cast syntax, he/she can write an extension method to mimic the cast:
public static T To<T>(this object o) // name it as you like: As, Cast, To, ...
{
    return (T)o;
}
and use a neat[?] syntax:
obj.To<SomeType>().SomeMethod()
IMHO, as only makes sense when combined with a null check:
var y = x as T;
if (y != null)
    y.SomeMethod();
Using as does not apply user-defined conversions, while the cast will use them where appropriate. That can be an important difference in some cases.
I wrote a bit about this here:
http://blogs.msdn.com/ericlippert/archive/2009/10/08/what-s-the-difference-between-as-and-cast-operators.aspx
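A short sketch of that difference, using types with an explicit operator as in the conversion-operator question above:
public class A
{
    public static explicit operator B(A a) { return new B(); }
}
public class B { }
...
A a = new A();
B viaCast = (B)a;    // compiles: invokes the user-defined operator
// B viaAs = a as B; // compile-time error CS0039: as never uses user-defined conversions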
I understand your point. And I agree with the thrust of it: that a cast operator communicates "I am sure that this object can be converted to that type, and I am willing to risk an exception if I'm wrong", whereas an "as" operator communicates "I am not sure that this object can be converted to that type; give me a null if I'm wrong".
However, there is a subtle difference. (x as T).Whatever() communicates "I know not just that x can be converted to a T, but moreover, that doing so involves only reference or unboxing conversions, and furthermore, that x is not null". That does communicate different information than ((T)x).Whatever(), and perhaps that is what the author of the code intends.
I've often seen references to this misleading article as evidence that "as" is faster than casting.
One of the more obvious misleading aspects of this article is the graphic, which does not indicate what is being measured: I suspect it's measuring failed casts (where "as" is obviously much faster as no exception is thrown).
If you take the time to do the measurements, then you'll see that casting is, as you'd expect, faster than "as" when the cast succeeds.
I suspect this may be one reason for "cargo cult" use of the as keyword instead of a cast.
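If you want to check this yourself, here is a rough sketch of measuring the successful-cast case (absolute numbers will vary by runtime and hardware):
object o = "hello";
int sink = 0; // keeps the loop bodies from being optimized away
var sw = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0; i < 100000000; i++) { string s = (string)o; sink += s.Length; }
sw.Stop();
Console.WriteLine("cast: {0} ms", sw.ElapsedMilliseconds);
sw.Restart();
for (int i = 0; i < 100000000; i++) { string s = o as string; sink += s.Length; }
sw.Stop();
Console.WriteLine("as:   {0} ms (sink={1})", sw.ElapsedMilliseconds, sink);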
The direct cast needs a pair of parentheses more than the as keyword. So even in the case where you're 100% sure of the type, it reduces visual clutter.
Agreed on the exception thing, though. But at least for me, most uses of as boil down to check for null afterwards, which I find nicer than catching an exception.
99% of the time, I use as when I'm not sure what the actual object type is:
var x = obj as T;
if (x != null)
{
    // x was type T!
}
and I don't want to catch explicit cast exceptions nor make the cast twice using "is":
// I don't like this
if (obj is T)
{
    var x = (T)obj;
}
It's just because people like the way it looks, it's very readable.
Let's face it: the casting/conversion operator in C-like languages is pretty terrible, readability-wise. I would like it better if C# adopted either the JavaScript syntax of:
object o = 1;
int i = int(o);
Or define a to operator, the casting equivalent of as:
object o = 1;
int i = o to int;
People like as so much because it makes them feel safe from exceptions... like a guarantee on a box. A guy puts a fancy guarantee on the box 'cause he wants you to feel all warm and toasty inside. You figure you put that little box under your pillow at night, the Guarantee Fairy might come down and leave a quarter, am I right, Ted?
Back on topic... when using a direct cast, there is the possibility of an invalid cast exception. So people apply as as a blanket solution to all of their casting needs, because as (by itself) will never throw an exception. But the funny thing is that, in the example you gave, (x as T).SomeMethod(); trades an invalid cast exception for a null reference exception, which obfuscates the real problem when you see the exception.
I generally don't use as too much. I prefer the is test because, to me, it appears more readable and makes more sense than trying a cast and checking for null.
This has to be one of my top peeves.
Stroustrup's D&E and/or some blog post I can't find right now discusses the notion of a to operator which would address the point made by https://stackoverflow.com/users/73070/johannes-rossel (i.e., the same syntax as as but with DirectCast semantics).
The reason this didn't get implemented is that a cast should cause pain and be ugly, so you get pushed away from using it.
Pity that 'clever' programmers (often book authors (Juval Lowy, IIRC)) step around this by abusing as in this fashion (C++ doesn't offer an as, probably for this reason).
Even VB has more consistency in having a uniform syntax that forces you to choose a TryCast or DirectCast and make up your mind!
I believe that the as keyword could be thought of as a more elegant looking version of the
dynamic_cast from C++.
It's probably more popular for no technical reason, but just because it's easier to read and more intuitive. (Not saying that makes it better; just trying to answer the question.)
One reason for using "as":
T t = obj as T;
//some other thread changes obj to another type...
if (t != null) action(t); //still works
Instead of (bad code):
if (obj is T)
{
    // bang: some other thread changes obj to another type...
    action((T)obj); // InvalidCastException
}
