Currently I'm teaching a class of C++ programmers the basics of the C# language. As we discussed the topic operators I used C# standard categories of primary, unary etc. operators.
One of the attendees was puzzled because the C# standard puts the "postfix ++/--" operators in the category of primary operators rather than the "prefix ++/--" operators. Her reasoning behind this confusion was that, coming from C++, she would implement the postfix ++/-- operator in terms of the prefix ++/-- operator. In other words, she would rather count prefix ++/-- as the primary operator. I understand her point, but I can't give her a rationale for the categorization. Granted, the postfix ++/-- operators have a higher precedence than the prefix ++/--, but is that the only rationale behind it?
The spec mentioned it in section "14.2.1 Operator precedence and associativity".
So my very neutral question: Why are Postfix ++/-- categorized as primary Operators in C#?
Is there a deeper truth in it?
Since the ECMA standard itself does not define what a 'Primary' operator is, other than by order of precedence (i.e. coming before 'Unary'), there can be no other significance. The choice of words was probably just unfortunate.
Take into account that in many C-like languages, postfix operators tend to create a temporary variable where the expression's intermediate result is stored (see: "Prefer prefix operators over postfix" at Semicolon). Thus, they are fundamentally different from the prefix version.
Nonetheless, quickly checking how Mono and Visual Studio compile for-loops using the postfix and prefix forms, I saw that the IL code produced is identical. Only if you use the postfix/prefix expression's value does it translate to different IL (only affecting where the 'dup' instruction is placed), at least with the implementations mentioned.
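The value difference can be sketched directly in C# (nothing assumed beyond the language itself): in statement position the two forms behave identically, but when the expression's value is used, postfix yields the old value and prefix yields the new one.

```csharp
using System;

class IncrementDemo
{
    // Returns (postfix expression value, i afterwards, prefix expression value, j afterwards).
    public static (int, int, int, int) Run()
    {
        int i = 4;
        int post = i++;   // post gets the old value; i becomes 5
        int j = 4;
        int pre = ++j;    // pre gets the new value; j becomes 5
        return (post, i, pre, j);
    }

    static void Main() => Console.WriteLine(Run());  // (4, 5, 5, 5)
}
```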
EDIT: Okay, now I'm back home, I've removed most of the confusing parts...
I don't know why x++ is classified as a primary expression but ++x isn't; although I doubt it makes much difference in terms of the code you would write. Ditto precedence. I wonder whether the postfix is deemed primary as it's used more commonly? The annotated C# specs don't have any annotations around this, by the way, in either the ECMA edition or the Microsoft C# 4 editions. (I can't immediately find my C# 3 edition to check.)
However, in terms of implementation, I would think of ++ as a sort of pseudo-operator which is used by both prefix and postfix expressions. In particular, when you overload the ++ operator, that overload is used for both postfix and prefix increment. This is unlike C++, as stakx pointed out in a comment.
One thing to note is that while a post-increment/post-decrement expression has to have a primary expression as an operand, a pre-increment/pre-decrement expression only has to have a unary expression as an operand. In both cases the operand has to be classified as a variable, property access or indexer access though, so I'm not sure what practical difference that makes, if any.
EDIT: Just to give another bit of commentary, even though it seems arbitrary, I agree it does seem odd when the spec states:
Primary expressions include the simplest forms of expressions
But the list of steps for pre-increment is shorter/simpler than list of steps for post-increment (as it doesn't include the "save the value" step).
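To illustrate the shared-overload point above, here is a minimal sketch (Counter is a made-up type): a single operator ++ definition serves both x++ and ++x, and the compiler decides whether the expression yields the old or the new value.

```csharp
using System;

struct Counter
{
    public int Value;
    public Counter(int v) { Value = v; }

    // One overload serves both ++c and c++; the compiler, not this code,
    // decides which value (old or new) the expression produces.
    public static Counter operator ++(Counter c) => new Counter(c.Value + 1);
}

class CounterDemo
{
    public static string Run()
    {
        var a = new Counter(1);
        var post = a++;   // post.Value == 1 (old), a.Value == 2
        var b = new Counter(1);
        var pre = ++b;    // pre.Value == 2 (new), b.Value == 2
        return $"{post.Value} {a.Value} {pre.Value} {b.Value}";
    }

    static void Main() => Console.WriteLine(Run());  // 1 2 2 2
}
```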
The difference is that:
a[i++] will access the element at index i.
a[++i] will access the element at index i+1.
In both cases, after execution of a[i++] or a[++i], i will be i+1.
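The indexing behaviour described above, sketched in C#:

```csharp
using System;

class IndexDemo
{
    public static string Run()
    {
        var a = new[] { 10, 20, 30 };

        int i = 0;
        int x = a[i++];   // reads a[0], then i becomes 1
        int j = 0;
        int y = a[++j];   // j becomes 1 first, then reads a[1]

        return $"{x} {y} {i} {j}";
    }

    static void Main() => Console.WriteLine(Run());  // 10 20 1 1
}
```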
This can cause trouble because, in C and C++, you can't make assumptions about the order in which parameters are evaluated:
function(i++, i++, i++)
will increment i 3 times, but you don't know in which order. If i is initially 4 you could get
function(4,5,6)
but also function(6,5,4)
or function(6,4,5).
(C# itself defines argument evaluation as left to right, so there the result is always function(4,5,6).)
And that is still nothing, because I used native types (for example int) as examples; things get worse when you have classes.
When you overload the operator, the result of the increment is not changed; what changes is when the increment is applied relative to producing the expression's value, and this too can cause trouble. In one case ++ is applied before the value is produced, in the other case it is applied after the value is produced. And when you overload it, it is probably better to have it applied before producing the value (so ++something is much better than something++, at least from an overloading point of view).
Take a generic class with an overloaded ++ (of which we have two instances, foo and bar):
foo = bar++; // roughly: foo gets bar's old value, then bar is incremented
foo = ++bar; // roughly: bar is incremented first, then foo gets the new value
and there's much difference. Especially when you don't just assign the reference but do something more complex with it, or when your object internally holds state that involves shallow vs. deep copies.
Related
In C# 7.2, we saw the introduction of the in modifier for method parameters to pass read-only references to objects. I'm working on a new .NET Standard project using 7.2, and out of curiosity I tried compiling with the in keyword on the parameters for the equality operators for a struct.
i.e. - public static bool operator == (in Point l, in Point r)
not - public static bool operator == (Point l, Point r)
I was initially a bit surprised that this worked, but after thinking about it a bit I realized that there is probably no functional difference between the two versions of the operator. I wanted to confirm these suspicions, but after a somewhat thorough search around, I can't find anything that explicitly talks about using the in keyword in operator overloads.
So my question is whether or not this actually has a functional difference, and if it does, whether there is any particular reason to encourage or discourage the use of in with operator arguments. My initial thought is that there is no difference, particularly if the operator is inlined. However, if it does make a difference, it seems like in parameters should be used everywhere (everywhere that read-only references make sense, that is), as they provide a speed bonus and, unlike ref and out, don't require the user to prepend those keywords when passing objects. This would allow more efficient value-type passing without a single change on the caller's side.
Overall, this may go beyond the sort of small-scale optimizations that most C# developers worry about, but I am curious as to whether or not it has an effect.
whether or not this actually has a functional difference... My initial thoughts are that there is no difference, particularly if the operator is inlined
Since the operator == overload is invoked like a regular static method in MSIL, there is a functional difference: it can help avoid unnecessary copying, just like in a regular method.
is there any particular reason to encourage or discourage the use of in with operator arguments.
According to this article it is recommended to apply in modifier when value types are larger than System.IntPtr.Size. But it is important that the value type should be readonly struct. Otherwise in modifier can harm the performance because the compiler will create a defensive copy when calling struct's methods and properties since they can change the state of the argument.
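A sketch of that recommendation (Point here is a hypothetical readonly struct; making the struct readonly is what lets the compiler skip the defensive copies mentioned above):

```csharp
using System;

// readonly struct: members cannot mutate state, so the compiler never
// needs a defensive copy when calling them on an 'in' argument.
readonly struct Point
{
    public readonly double X, Y;
    public Point(double x, double y) { X = x; Y = y; }

    // 'in' passes a read-only reference instead of copying the struct.
    public static bool operator ==(in Point l, in Point r) => l.X == r.X && l.Y == r.Y;
    public static bool operator !=(in Point l, in Point r) => !(l == r);

    public override bool Equals(object o) => o is Point p && this == p;
    public override int GetHashCode() => X.GetHashCode() ^ Y.GetHashCode();
}

class PointDemo
{
    public static bool Run()
    {
        var a = new Point(1, 2);
        var b = new Point(1, 2);
        return a == b;   // callers write no 'in' at the call site
    }

    static void Main() => Console.WriteLine(Run());  // True
}
```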
Languages like i.e. Java and C# have both bitwise and logical operators.
Logical operators make only sense with boolean operands, bitwise operators work with integer types as well. Since C had no boolean type and treats all non-zero integers as true, the existence of both logical and bitwise operators makes sense there. However, languages like Java or C# have a boolean type so the compiler could automatically use the right kind of operators, depending on the type context.
So, is there some concrete reason for having both logical and bitwise operators in those languages? Or were they just included for familiarity reasons?
(I am aware that you can use the "bitwise" operators in a boolean context to circumvent the short-circuiting in Java and C#, but I have never needed such behaviour, so I guess it is a mostly unused special case.)
1) is there some concrete reason for having both logical and bitwise operators in those languages?
Yes:
We have boolean operators to do boolean logic (on boolean values).
We have bitwise operators to do bitwise logic (on integer values).
2) I am aware that you can use the "bitwise" operators in a boolean context to circumvent the short-circuiting in Java and C#,
As far as C# goes, this simply is not true.
C# has, for example, 2 boolean AND operators: & (full) and && (short), but it does not allow bitwise operations on booleans.
So, there really is no 'overlap' or redundancy between logical and bitwise operators. The two do not apply to the same types.
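The behavioural difference between the two boolean AND operators shows up when the right operand has a side effect; a minimal C# sketch:

```csharp
using System;

class ShortCircuitDemo
{
    static int calls;
    static bool True() { calls++; return true; }

    // Returns how many times True() ran for '&' and for '&&'.
    public static (int, int) Run()
    {
        calls = 0;
        bool r1 = false & True();    // & always evaluates both operands
        int eager = calls;           // 1

        calls = 0;
        bool r2 = false && True();   // && short-circuits; True() never runs
        int lazy = calls;            // 0

        return (eager, lazy);
    }

    static void Main() => Console.WriteLine(Run());  // (1, 0)
}
```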
in C#, with booleans
&& is a short circuiting logical operator
& is a non short circuiting logical operator
Bitwise, it just uses & as legacy syntax from C/C++... but it's really quite different. If anything, it would be better as a completely different symbol to avoid any confusion. But there aren't many symbols left, unless you wanted to go for &&& or |||, but that's a bit ugly.
Late answer, but I'll try to get to your real point.
You are correct. And the easiest way to make your point is to mention that other typed languages (like Visual Basic) have logical operators that can act on both Boolean and Integer expressions.
VB Or operator: http://msdn.microsoft.com/en-us/library/06s37a7f.aspx
VB Bitwise example: http://visualbasic.about.com/od/usingvbnet/a/bitops01_2.htm
This was very much a language design decision. Java and C# didn’t have to be the way they are. They just are the way they are. Java and C# did indeed inherit much of their syntax from C for the sake of familiarity. Other languages didn’t and work just fine.
A design decision like this has consequences. Short circuit evaluation is one. Disallowing mixed types (which can be confusing for humans to read) is another. I’ve come to like it but maybe I’ve just been staring at Java for too long.
Visual Basic added AndAlso and OrElse as a way to do short-circuit evaluation. Unlike Basic's other logical operators, these work only on Booleans.
VB OrElse: http://msdn.microsoft.com/en-us/library/ea1sssb2.aspx
Short-circuit description: http://support.microsoft.com/kb/817250
The distinction wasn’t made because strong typing makes it impossible to only have one set of logic operators in a language. It was made because they wanted short circuit evaluation and they wanted a clear way to signal to the reader what was happening.
The other reason (besides short-circuiting) that C and C++ have different kinds of logical operators is to allow any non-zero number to be reinterpreted as TRUE and zero as FALSE. To do that, they need operators that say to interpret values that way. Java rejects this whole reinterpretation idea and throws an error in your face if you try to do it with its logical operators. If it weren't for short-circuit evaluation, the only reason left would simply be that they wanted the operators to look different when doing different things.
So yes, if the Java and C# language designers hadn't cared about any of that they could have used one set of logical operators for both bitwise and boolean logic and figured out which to do based on operand type like some other languages do. They just didn't.
I'll say for Java: logical operators are used with booleans and bitwise operators are used with ints. They can't be mixed.
Why not reduce them to one operator such as & or |? Java was designed to be friendly for C/C++ users, so it inherited their syntax. Nowadays these operators cannot be merged because of backwards compatibility.
As you have already said, there's some difference between & and && (the same goes for | and ||) so you need two sets of boolean operators.
Now, independently of the above, you may need bitwise operators, and &, | and so on are the natural choice, since there is no confusion with the boolean forms to avoid. Why complicate things and use the two-character versions?
The compiler cannot infer the proper operator by looking only at the arguments; which one to use is a decision about your program's logic, because it is about lazy evaluation. E.g.:

public boolean a() {
    doStuffA();
    return false;
}

public boolean b() {
    doStuffB();
    return true;
}

and now: a() & b() will execute doStuffB(), while a() && b() will not.
It's probably a very lame question, but I found no references to round brackets in the C# specification. Please point me to the spec or MSDN if the answer to this question is obvious.
What is the inner difference between (MyType)SomeObj.Property1 and (MyType)(SomeObj.Property1) in C# ?
AFAIK, in the first case (the (MyType)SomeObj.Property1 cast) it will be a reference of the concrete type (MyType) to Property1. In the second case, such a reference will execute the get accessor SomeObj.get_Property1.
That could eventually lead to subtle errors if the get accessor has any side effects (and get accessors often do).
Could anyone point me to exact documentation where such behaviour specified ?
Updated: Thank you for the pointers. And I deeply apologize for such a dumb question: after posting it, I found a typo in the example I was fiddling with and realized that the second case's behaviour was not based on the code I tried to compile, but on previously compiled, completely different code. So my question was based on my own blindness...
They are equivalent. This is determined by the operator precedence rules in the C# language, chapter 7.2.1 in the C# Language Specification:
The . operator is at the top of this list, the cast operator is 2nd. The . operator "wins". You will have to use parentheses if you need the cast because Property1 is a property of the MyType class:
((MyType)SomeObj).Property1
There is absolutely no difference. The . operator binds more tightly than the typecast operator, so the extra parentheses make no difference. See here for details of operator precedence; the operators in question are in the first two groups.
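A sketch of the equivalence (Holder is a made-up type with an object-typed property, standing in for the question's SomeObj):

```csharp
using System;

class Holder
{
    // A property whose static type is object, so a cast is needed at the use site.
    public object Property1 => "hello";
}

class CastDemo
{
    public static (string, string) Run()
    {
        var someObj = new Holder();

        // Equivalent spellings: '.' binds tighter than the cast, so the
        // cast applies to the property's value either way.
        string s1 = (string)someObj.Property1;
        string s2 = (string)(someObj.Property1);

        // To cast the receiver before the member access instead,
        // you must parenthesize the cast: ((Holder)obj).Property1
        return (s1, s2);
    }

    static void Main() => Console.WriteLine(Run());  // (hello, hello)
}
```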
What's the point of the post-increment ++ operator having higher precedence than the pre-increment ++ operator? That is, is there a situation where x++ having the same level of precedence as ++x would cause an expression to return a wrong result?
Let's start with defining some terms, so that we're all talking about the same thing.
The primary operators are postfix "x++" and "x--", the member access operator "x.y", the call operator "f(x)", the array dereference operator "a[x]", and the new, typeof, default, checked, unchecked and delegate operators.
The unary operators are "+x", "-x", "~x", "!x", "++x", "--x" and the cast "(T)x".
The primary operators are by definition of higher precedence than the unary operators.
Your question is
is there a situation where x++ having same level of precedence as ++x would cause an expression to return a wrong result?
It is not at all clear to me what you mean logically by "the wrong result". If we changed the rules of precedence in such a way that the value of an expression changed then the new result would be the right result. The right result is whatever the rules say the right result is. That's how we define "the right result" -- the right result is what you get when you correctly apply the rules.
We try to set up the rules so that they are useful and make it easy to express the meaning you intend to express. Is that what you mean by "the wrong result" ? That is, are you asking if there is a situation where one's intuition about what the right answer is would be incorrect?
I submit to you that if that is the case, then this is not a helpful angle to pursue because almost no one's intuition about the "correct" operation of the increment operators actually matches the current specification, much less some hypothetical counterfactual specification. In almost every C# book I have edited, the author has in some subtle or gross way mis-stated the meaning of the increment operators.
These are side-effecting operations with unusual semantics, and they come out of a language - C - with deliberately vague operational semantics. We have tried hard in the definition of C# to make the increment and decrement operators sensible and strictly defined, but it is impossible to come up with something that makes intuitive sense to everyone, since everyone has a different experience with the operators in C and C++.
Perhaps it would be helpful to approach the problem from a different angle. Your question presupposes a counterfactual world in which postfix and prefix ++ are specified to have the same precedence, and then asks for a criticism of that design choice in that counterfactual world. But there are many different ways that could happen. We could make them have the same precedence by putting both into the "primary" category. Or we could make them have the same precedence by putting them both into the "unary" category. Or we could invent a new level of precedence between primary and unary. Or below unary. Or above primary. We could also change the associativity of the operators, not just their precedence.
Perhaps you could clarify the question as to which counterfactual world you'd like to have criticized. Given any of those counterfactuals, I can give you a criticism of how that choice would lead to code that was unnecessarily verbose or confusing, but without a clear concept of the counterfactual design you're asking for criticism on, I worry that I'd spend a lot of time criticising something other than what you actually have in mind.
Make a specific proposed design change, and we'll see what its consequences are.
John, you have answered the question yourself: these two constructions are mostly used around function calls: ++x when you want to increase the value first and then call the function, and x++ when you want to call the function first and then make the increase. That can be very useful, depending on the context. Looking at return x++ vs return ++x, I see no room for error: the code means exactly how it reads :) The only problem is the programmer, who might use these two constructions without understanding the operators' precedence and thus miss the meaning.
Ada, Pascal and many other languages support ranges, a way to subtype integers.
A range is a signed integer value which ranges from a value (first) to another (last).
It's easy to implement a class that does the same in OOP, but I think that supporting the feature natively could let the compiler do additional static checks.
I know that it's impossible to verify statically that a variable defined with a range is not going to "overflow" at runtime, e.g. due to bad input, but I think that something could be done.
I think about the Design by Contract approach (Eiffel) and the Spec# ( C# Contracts ), that give a more general solution.
Is there a simpler solution that checks, at least, static out-of-bound assignment at compile time in C++, C# and Java? Some kind of static-assert?
edit: I understand that "ranges" can be used for different purpose:
iterators
enumerators
integer subtype
I would focus on the latter, because the former two map easily onto C-family languages.
I'm thinking of a closed set of values, something like music volume, i.e. a range that goes from 1 up to 100. I would like to increment or decrement it by a value, and I would like a compile error in case of a statically detectable overflow, something like:
volume=rangeInt(0,100);
volume=101; // compile error!
volume=getIntFromInput(); // possible runtime exception
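C# can't give the compile error asked for, but a runtime-checked sketch of the idea (the question's rangeInt spelled here as a hypothetical RangeInt struct) might look like:

```csharp
using System;

// Hypothetical runtime-checked subrange type; C# cannot reject an
// out-of-range assignment at compile time, only at runtime.
readonly struct RangeInt
{
    public readonly int Min, Max, Value;
    public RangeInt(int min, int max, int value)
    {
        if (value < min || value > max)
            throw new ArgumentOutOfRangeException(nameof(value));
        Min = min; Max = max; Value = value;
    }
    public RangeInt With(int value) => new RangeInt(Min, Max, value);
}

class RangeDemo
{
    public static string Run()
    {
        var volume = new RangeInt(0, 100, 50);
        volume = volume.With(100);              // in range: fine
        try { volume = volume.With(101); }      // caught only at runtime
        catch (ArgumentOutOfRangeException) { return "out of range"; }
        return "ok";
    }

    static void Main() => Console.WriteLine(Run());  // out of range
}
```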
Thanks.
Subrange types are not actually very useful in practice. We do not often allocate fixed length arrays, and there is also no reason for fixed sized integers. Usually where we do see fixed sized arrays they are acting as an enumeration, and we have a better (although "heavier") solution to that.
Subrange types also complicate the type system. It would be much more useful to bring in constraints between variables than to fixed constants.
(Obligatory mention that integers should be arbitrary size in any sensible language.)
Ranges are most useful when you can do something over that range, concisely. That means closures. For Java and C++ at least, a range type would be annoying compared to an iterator because you'd need to define an inner class to define what you're going to do over that range.
Java has had an assert keyword since version 1.4. If you're doing programming by contract, you're free to use those to check proper assignment. And any mutable attribute inside an object that should fall within a certain range should be checked prior to being set. You can also throw an IllegalArgumentException.
Why no range type? My guess is that the original designers didn't see one in C++ and didn't consider it as important as the other features they were trying to get right.
For C++, a lib for constrained values variables is currently being implemented and will be proposed in the boost libraries : http://student.agh.edu.pl/~kawulak/constrained_value/index.html
Pascal (and also Delphi) uses a subrange type but it is limited to ordinal types (integer, char and even boolean).
It is primarily an integer with extra type checking. You can fake that in another language using a class; this has the advantage that you can apply more complex ranges.
I would add to Tom Hawtin's response (with which I agree) that, for C++, the existence of ranges would not imply they would be checked - if you want to be consistent with the general language behavior - as array accesses, for instance, are also not range-checked.
For C# and Java, I believe the decision was based on performance - to check ranges would impose a burden and complicate the compiler.
Notice that ranges are mainly useful during the debugging phase - a range violation should never occur in production code (theoretically). So range checks are better implemented not inside the language itself, but in pre- and post-conditions, which can (should) be stripped out when producing the release build.
This is an old question, but just wanted to update it. Java doesn't have ranges per se, but if you really want the functionality you can use Commons Lang, which has a number of range classes including IntRange:
IntRange ir = new IntRange(1, 10);
Bizarrely, this doesn't exist in Commons Math. I kind of agree with the accepted answer in part, but I don't believe ranges are useless, particularly in test cases.
C++ allows you to implement such types through templates, and I think there are a few libraries available doing this already. However, I think in most cases, the benefit is too small to justify the added complexity and compilation speed penalty.
As for static assert, it already exists.
Boost has a BOOST_STATIC_ASSERT, and on Windows, I think Microsoft's ATL library defines a similar one.
boost::type_traits and boost::mpl are probably your best friends in implementing something like this.
The flexibility to roll your own is better than having it built into the language. What if you want saturating arithmetic, for example, instead of throwing an exception for out-of-range values? I.e.:
MyRange<0,100> volume = 99;
volume += 10; // results in volume==100
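C# lacks non-type generic parameters, so MyRange<0,100> can't be written directly, but a saturating variant of such a type can be sketched with the bounds as constructor arguments (SaturatingRange is hypothetical):

```csharp
using System;

// Hypothetical saturating range: out-of-range results clamp to the
// nearest bound instead of throwing.
readonly struct SaturatingRange
{
    public readonly int Min, Max, Value;
    public SaturatingRange(int min, int max, int value)
    {
        Min = min; Max = max;
        Value = Math.Min(max, Math.Max(min, value));  // clamp on construction
    }
    public static SaturatingRange operator +(SaturatingRange r, int d)
        => new SaturatingRange(r.Min, r.Max, r.Value + d);
}

class SatDemo
{
    public static int Run()
    {
        var volume = new SaturatingRange(0, 100, 99);
        volume += 10;          // 109 clamps to the upper bound
        return volume.Value;
    }

    static void Main() => Console.WriteLine(Run());  // 100
}
```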
In C# you can do this:
foreach (int i in System.Linq.Enumerable.Range(0, 10))
{
    // Do something
}
JSR-305 provides some support for ranges but I don't know when if ever this will be part of Java.