If I have two bytes a and b, how come:
byte c = a & b;
produces a compiler error about casting byte to int? It does this even if I put an explicit cast in front of a and b.
Also, I know about this question, but I don't really see how it applies here. This seems to be a question about the return type of operator &(byte operand, byte operand2), which the compiler should be able to sort out just like any other operator.
Why do C#'s bitwise operators always return int regardless of the format of their inputs?
I disagree with "always". This works, and the result of a & b is of type long:
long a = 0xffffffffffff;
long b = 0xffffffffffff;
long x = a & b;
The return type is not int if one or both of the arguments are long, ulong or uint.
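A small sketch (my addition, not part of the original answer) confirming this: with wider operands the result keeps the wider type, so no cast back is needed.
uint u1 = 0xFF00FF00, u2 = 0x00FF00FF;
uint u3 = u1 & u2;   // compiles: uint & uint yields uint
ulong z = 1UL & 2UL; // ulong & ulong yields ulong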
Why do C#'s bitwise operators return int if their inputs are bytes?
The result of byte & byte is an int because there is no & operator defined on byte. (Source)
An & operator exists for int and there is also an implicit cast from byte to int so when you write byte1 & byte2 this is effectively the same as writing ((int)byte1) & ((int)byte2) and the result of this is an int.
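A minimal sketch of the fix (my addition, not part of the answer): the & itself is computed on ints, so the int result must be cast back down to byte. Casting a and b individually does not help, because they are promoted right back to int.
byte a = 0x0F, b = 0xFF;
byte c = (byte)(a & b); // compiles, whereas byte c = a & b; does not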
This behavior is a consequence of the design of IL, the intermediate language generated by all .NET compilers. While it supports the short integer types (byte, sbyte, short, ushort), it has only a very limited number of operations on them: load, store, convert, create array. That's all. This is not an accident; those are the kinds of operations you could execute efficiently on a 32-bit processor back when IL was designed and RISC was the future.
The binary comparison and branch operations only work on int32, int64, native int, native floating point, object, and managed reference. These operands are 32 or 64 bits wide on any current CPU core, ensuring that the JIT compiler can generate efficient machine code.
You can read more about it in Ecma 335, Partition I, chapter 12.1 and Partition III, chapter 1.5.
I wrote a more extensive post about this over here.
Binary operators are not defined for byte types (among others). In fact, all binary (numeric) operators act only on the following native types:
int
uint
long
ulong
float
double
decimal
If any other types are involved, the operands are converted to one of the above.
It's all in the C# specification, version 5.0 (section 7.3.6.2):
Binary numeric promotion occurs for the operands of the predefined +, –, *, /, %, &, |, ^, ==, !=, >, <, >=, and <= binary operators. Binary numeric promotion implicitly converts both operands to a common type which, in case of the non-relational operators, also becomes the result type of the operation. Binary numeric promotion consists of applying the following rules, in the order they appear here:
If either operand is of type decimal, the other operand is converted to type decimal, or a compile-time error occurs if the other operand is of type float or double.
Otherwise, if either operand is of type double, the other operand is converted to type double.
Otherwise, if either operand is of type float, the other operand is converted to type float.
Otherwise, if either operand is of type ulong, the other operand is converted to type ulong, or a compile-time error occurs if the other operand is of type sbyte, short, int, or long.
Otherwise, if either operand is of type long, the other operand is converted to type long.
Otherwise, if either operand is of type uint and the other operand is of type sbyte, short, or int, both operands are converted to type long.
Otherwise, if either operand is of type uint, the other operand is converted to type uint.
Otherwise, both operands are converted to type int.
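A quick sketch (my addition, not part of the quoted spec) that makes these promotion rules visible via GetType():
byte b1 = 1, b2 = 2;
uint u = 3;
long l = 4;
Console.WriteLine((b1 & b2).GetType()); // System.Int32: both bytes promoted to int
Console.WriteLine((u & b1).GetType());  // System.UInt32: the byte is promoted to uint
Console.WriteLine((l & u).GetType());   // System.Int64: the uint is promoted to long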
It's because & is defined on integers, not on bytes, and the compiler implicitly casts your two arguments to int.
The following code prints UInt32:
var myUint = 1U;
Console.WriteLine(myUint.GetType().Name);
As per this SO answer I wanted to see what would happen if you try to use the U literal suffix with a compile-time negative number. This code (changing 1U to -1U) prints Int64 (long):
var myUint = -1U;
Console.WriteLine(myUint.GetType().Name);
I thought it would just be a compile-time error, but instead it returns a long with the value -1. What is going on? Why does the compiler do this?
The minus sign is not part of the integer literal specification. So when you write var x = -1U, the compiler applies the following rules:
If the literal is suffixed by U or u, it has the first of these types in which its value can be represented: uint, ulong.
So that's the 1U part becoming a uint / UInt32, so far conforming to your expectations.
But then the minus is applied:
For an operation of the form -x, unary operator overload resolution (§7.3.3) is applied to select a specific operator implementation. The operand is converted to the parameter type of the selected operator, and the type of the result is the return type of the operator. The predefined negation operators are:
Integer negation:
int operator -(int x);
long operator -(long x);
[...]
If the operand of the negation operator is of type uint, it is converted to type long, and the type of the result is long.
So the type of the expression -1U is long, as per the C# specification. This then becomes the type of x.
Obviously -1U cannot be stored as a uint. Since you use var, the compiler infers the type from the expression; because the value is the negation of 1 as an unsigned integer, that type is long.
You would get a compile time error if you defined your type explicitly:
uint myUint = -1U;
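If the goal was an all-bits-set uint, here are two sketches (my suggestions, not from the answer) that avoid the long-typed -1U expression:
uint allBits1 = 0xFFFFFFFFu;         // spell out the bits
uint allBits2 = unchecked((uint)-1); // reinterpret -1 via an unchecked cast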
Why, if I write
char ch = 0;
do I get a compiler error, as I would expect, whereas
bool allZero = "000".All(ch => ch == 0);
doesn't produce any error? C# is strongly typed, and I'd prefer at least a warning in this case. This was a bug in my software.
This is explained in the C# language specification.
6.1.2 Implicit numeric conversions states:
The implicit numeric conversions are:
(... some text omitted)
• From char to ushort, int, uint, long, ulong, float, double, or decimal.
And goes on to explicitly state:
There are no implicit conversions to the char type, so values of the other integral types do not automatically convert to the char type.
7.3.6.2 Binary numeric promotions states:
Binary numeric promotion occurs for the operands of the predefined +, –, *, /, %, &, |, ^, ==, !=, >, <, >=, and <= binary operators. Binary numeric promotion implicitly converts both operands to a common type which, in case of the non-relational operators, also becomes the result type of the operation. Binary numeric promotion consists of applying the following rules, in the order they appear here:
• If either operand is of type decimal, the other operand is converted to type decimal, or a binding-time error occurs if the other operand is of type float or double.
(... some text omitted)
• Otherwise, both operands are converted to type int.
So when char == 0 is compiled, the compiler will promote the char to an int before generating the comparison code.
There is nothing in the standard that allows an int to be implicitly converted to a char (even a constant value in the valid range for a char); in fact, it is explicitly disallowed, which is why char ch = 0; does not compile.
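A sketch of the fix implied by "this was a bug in my software" (my reading, not stated in the answer): compare against the character '0' rather than the integer 0, since '0' has code point 48.
bool allZeroChars = "000".All(ch => ch == '0'); // true
bool allZeroInts  = "000".All(ch => ch == 0);   // false: every ch is 48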
If a numeric expression contains operands (constants and variables) of different numeric types, are the operands promoted to larger types according to the following rules?
If operands are of types byte, sbyte, char, short, or ushort, they are converted to int
If one of the operands is int, then all operands are converted to int
If the expression also contains operands of types uint and int, then all operands are converted to long
If one of the operands is long, then all operands are converted to long
If the expression contains operands of types ulong and long, then the operands are converted to float
If one of the operands is float, then all operands are converted to float
If one of the operands is double, then all operands are converted to double
Assuming a numeric expression contains operands of different types, will all operands first be converted to a single numeric type, and only then will the runtime compute the result? For example, if variables b1 and b2 are of type byte, while i1 is of type int, will b1 and b2 be converted to int before (b1 + b2) is computed?
int i2 = (b1 + b2) + i1;
The parenthesized expression binds more tightly than the outer +, so b1 + b2 is evaluated first and the conversion to the outer operand's type would normally happen after that addition. However, the + operator has no overload for bytes, so the bytes must first be promoted to ints.
Further reading:
operator precedence
binary numeric promotion
Your rules have some elements of truth, but they are technically imprecise.
Here are the relevant excerpts from the C# Language Specification
7.3.6.2 Binary numeric promotions
Binary numeric promotion occurs for the operands of the predefined +, –, *, /, %, &, |, ^, ==, !=, >, <, >=, and <= binary operators. Binary numeric promotion implicitly converts both operands to a common type which, in case of the non-relational operators, also becomes the result type of the operation. Binary numeric promotion consists of applying the following rules, in the order they appear here:
If either operand is of type decimal, the other operand is converted to type decimal, or a compile-time error occurs if the other operand is of type float or double.
Otherwise, if either operand is of type double, the other operand is converted to type double.
Otherwise, if either operand is of type float, the other operand is converted to type float.
Otherwise, if either operand is of type ulong, the other operand is converted to type ulong, or a compile-time error occurs if the other operand is of type sbyte, short, int, or long.
Otherwise, if either operand is of type long, the other operand is converted to type long.
Otherwise, if either operand is of type uint and the other operand is of type sbyte, short, or int, both operands are converted to type long.
Otherwise, if either operand is of type uint, the other operand is converted to type uint.
Otherwise, both operands are converted to type int.
int i2 = (b1 + b2) + i1;
As per the specification above: yes, the byte operands b1 and b2 are subject to binary numeric promotion to int for the binary + operator.
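A sketch (my addition, not from the answer) showing the practical consequence: assigning the sum back to int needs no cast, while assigning it back to byte does.
byte b1 = 10, b2 = 20;
int i1 = 30;
int i2 = (b1 + b2) + i1;   // fine: b1 + b2 is already an int
byte b3 = (byte)(b1 + b2); // explicit cast required, since the sum is an int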
I have this code.
byte dup = 0;
Encoding.ASCII.GetString(new byte[] { (0x80 | dup) });
When I try to compile I get:
Cannot implicitly convert type 'int' to 'byte'. An explicit conversion exists (are you missing a cast?)
Why does this happen? Shouldn't OR-ing two bytes give a byte? Both of the following work, ensuring that each item is a byte.
Encoding.ASCII.GetString(new byte[] { (dup) });
Encoding.ASCII.GetString(new byte[] { (0x80) });
It's that way by design in C#, and in fact it dates all the way back to C/C++: the latter also promotes operands to int, you just usually don't notice because the int -> char conversion there is implicit, while in C# it is not. This doesn't just apply to | either, but to all arithmetic and bitwise operators; adding two bytes will give you an int as well. I'll quote the relevant part of the spec here:
Binary numeric promotion occurs for the operands of the predefined +, –, *, /, %, &, |, ^, ==, !=, >, <, >=, and <= binary operators. Binary numeric promotion implicitly converts both operands to a common type which, in case of the non-relational operators, also becomes the result type of the operation. Binary numeric promotion consists of applying the following rules, in the order they appear here:
If either operand is of type decimal, the other operand is converted to type decimal, or a compile-time error occurs if the other operand is of type float or double.
Otherwise, if either operand is of type double, the other operand is converted to type double.
Otherwise, if either operand is of type float, the other operand is converted to type float.
Otherwise, if either operand is of type ulong, the other operand is converted to type ulong, or a compile-time error occurs if the other operand is of type sbyte, short, int, or long.
Otherwise, if either operand is of type long, the other operand is converted to type long.
Otherwise, if either operand is of type uint and the other operand is of type sbyte, short, or int, both operands are converted to type long.
Otherwise, if either operand is of type uint, the other operand is converted to type uint.
Otherwise, both operands are converted to type int.
I don't know the exact rationale for this, but I can think of one. For the arithmetic operators especially, it might be surprising for people if (byte)200 + (byte)100 suddenly equaled 44, even if it makes some sense once one carefully considers the types involved. On the other hand, int is generally considered a type that's "good enough" for arithmetic on most typical numbers, so by promoting both arguments to int you get a kind of "just works" behavior for the most common cases.
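A sketch of that surprise (my example, not from the answer): the sum is computed in int, so 300 survives; truncating back to byte wraps it to 44.
byte x = 200, y = 100;
int sum = x + y;              // 300: both operands promoted to int
byte wrapped = (byte)(x + y); // 44: 300 mod 256 after the cast
Console.WriteLine(sum + " " + wrapped); // prints 300 44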
As to why this logic was also applied to bitwise operators: I imagine it's mostly for consistency, since it results in a single simple rule common to all non-boolean binary operators.
But this is all mostly guessing. Eric Lippert would probably be the one to ask about the real motives behind this decision for C# at least (though it would be a bit boring if the answer is simply "it's how it's done in C/C++ and Java, and it's a good enough rule as it is, so we saw no reason to change it").
The literal 0x80 has type int, so you are not OR-ing bytes.
That you can pass it in the byte[] initializer only works because 0x80, as a constant, is within the range of byte.
Edit: even if 0x80 were cast to a byte, the code would still not compile, since OR-ing two bytes still gives an int. To make it compile, the result of the OR must be cast: (byte)(0x80 | dup)
byte dup = 0;
Encoding.ASCII.GetString(new byte[] { (byte)(0x80 | dup) });
The result of a bitwise Or (|) on two bytes is always an int.