0x80000000 == 2147483648 in C# but not in VB.NET - c#

In C#:
0x80000000==2147483648 //outputs True
In VB.NET:
&H80000000=2147483648 'outputs False
How is this possible?

This is related to the history behind the languages.
C# has always supported unsigned integers. The values you use are too large for int, so the compiler picks the next type that can correctly represent them, which is uint for both literals.
VB.NET didn't acquire unsigned integer support until version 8 (.NET 2.0), so traditionally the compiler was forced to pick Long as the type for the 2147483648 literal. The rule was different for hexadecimal literals, however: they traditionally supported specifying the bit pattern of a negative value (see section 2.4.2 in the language spec). So &H80000000 is a literal of type Integer with the value -2147483648, and 2147483648 is a Long. Thus the mismatch.
If you think VB.NET is a quirky language then I'd invite you to read this post :)
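The C# half of this explanation can be checked with a small sketch. The class and helper names (LiteralTypes, TypeOf) are made up for illustration; the helper simply boxes the literal and reports the runtime type, which matches the type the compiler chose for it.

```csharp
using System;

public static class LiteralTypes
{
    // Boxes the argument and returns the name of its runtime type,
    // which is the type the compiler assigned to the literal.
    public static string TypeOf(object boxed) => boxed.GetType().Name;

    public static void Main()
    {
        Console.WriteLine(TypeOf(0x7FFFFFFF));        // Int32: still fits in int
        Console.WriteLine(TypeOf(0x80000000));        // UInt32: too large for int
        Console.WriteLine(TypeOf(2147483648));        // UInt32: same value in decimal
        Console.WriteLine(0x80000000 == 2147483648);  // True
    }
}
```

Both literals end up as uint with the same value, so the comparison is True in C#.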

The VB version should be:
&H80000000L=2147483648
Without the 'long' specifier ('L'), VB will interpret &H80000000 as an Integer. If you force it to be treated as a Long, you'll get the same result as in C#.
&H80000000UI will also work - actually this is the type (UInt32) that C# regards the literal as.

This happens because the type of the hexadecimal number is UInt32 in C# and Int32 in VB.NET.
The binary representation of the hexadecimal number is:
10000000000000000000000000000000
Both UInt32 and Int32 take 32 bits, but because Int32 is signed, the first bit is considered a sign to indicate whether the number is negative or not: 0 for positive, 1 for negative. To convert a negative binary number to decimal, do this:
Invert the bits. You get 01111111111111111111111111111111.
Convert this to decimal. You get 2147483647.
Add 1 to this number. You get 2147483648.
Make this negative. You get -2147483648, which is equal to &H80000000 in VB.NET.
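The four steps above can be sketched in C# as follows. This is only an illustration; the names TwosComplement and Decode are hypothetical, and the last line shows that an unchecked cast yields the same answer without doing the steps by hand.

```csharp
using System;

public static class TwosComplement
{
    // Decodes a 32-bit pattern as a signed value using the manual steps:
    // invert the bits, add 1, then negate (only when the sign bit is set).
    public static long Decode(uint bits)
    {
        bool negative = (bits & 0x80000000u) != 0;
        if (!negative) return bits;           // sign bit clear: value is unchanged

        uint inverted = ~bits;                // step 1: invert the bits
        long magnitude = (long)inverted + 1;  // steps 2-3: read as a number, add 1
        return -magnitude;                    // step 4: make it negative
    }

    public static void Main()
    {
        Console.WriteLine(Decode(0x80000000u));          // -2147483648
        Console.WriteLine(Decode(0xFFFFFFFFu));          // -1
        Console.WriteLine(unchecked((int)0x80000000u));  // -2147483648
    }
}
```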

Related

C#: Why is 0xFFFFFFFF a uint when it represents -1?

I don't understand why C# considers the literal 0xFFFFFFFF as a uint when it also represents -1 for int types.
The following code was entered into the Immediate Window, shown here with the output:
int i = -1;
-1
string s = i.ToString("x");
"ffffffff"
int j = Convert.ToInt32(s, 16);
-1
int k = 0xFFFFFFFF;
Cannot implicitly convert type 'uint' to 'int'. An explicit conversion exists (are you missing a cast?)
int l = Convert.ToInt32(0xFFFFFFFF);
OverflowException was unhandled: Value was either too large or too small for an Int32.
Why can the string hex number be converted without problems but the literal only be converted using unchecked?
Why is 0xFFFFFFFF a uint when it represents -1?
Because you're not writing the bit pattern when you write
i = 0xFFFFFFFF;
you're writing a number by C#'s rules for integer literals. With C#'s integer literals, to write a negative number we write a - followed by the magnitude of the number (e.g., -1), not the bit pattern for what we want. It's really good that we aren't expected to write the bit pattern, it would make it really awkward to write negative numbers. When I want -3, I don't want to have to write 0xFFFFFFFD. :-) And I really don't want to have to vary the number of leading Fs based on the size of the type (0xFFFFFFFFFFFFFFFD for a long -3).
The rule for choosing the type of the literal is covered by the above link by saying:
If the literal has no suffix, it has the first of these types in which its value can be represented: int, uint, long, ulong.
0xFFFFFFFF doesn't fit in an int, which has a maximum positive value of 0x7FFFFFFF, so the next in the list is uint, which it does fit in.
0xffffffff is 4294967295, a UInt32 that just happens to have a bit pattern equal to the Int32 -1 due to the way negative numbers are represented on computers. Just because they have the same bit pattern, that doesn't mean 4294967295 = -1. They're completely different numbers, so of course you can't just trivially convert between the two. You can force the reinterpretation of the bit pattern with an explicit cast to int; because the operand is a constant, the cast must be in an unchecked context: unchecked((int)0xffffffff).
The C# docs say that the compiler will try to fit the number you provide in the smallest type that can fit it. That doc is a bit old, but it applies still. It always assumes that the number is positive.
As a fallback you can always coerce the type.
The C# language rules state that 0xFFFFFFFF is an unsigned literal.
A C# signed int is 2's complement type. That scheme uses 0xFFFFFFFF to represent -1. (2's complement is a clever scheme since it doesn't have a signed zero).
For an unsigned int, 0xFFFFFFFF is the largest value it can take, and due to its size, it can't be converted to a signed int.
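To make the difference between the value and the bit pattern concrete, here is a minimal sketch. The names HexToInt and Reinterpret are invented for this example; Convert.ToInt32(string, int) and the checked/unchecked operators are the standard ones mentioned in the question.

```csharp
using System;

public static class HexToInt
{
    // Reinterprets the 32 bits of a uint as an int without range checking.
    public static int Reinterpret(uint value) => unchecked((int)value);

    public static void Main()
    {
        uint allOnes = 0xFFFFFFFF;                           // the literal's type is uint
        Console.WriteLine(allOnes);                          // 4294967295
        Console.WriteLine(Reinterpret(allOnes));             // -1
        Console.WriteLine(Convert.ToInt32("ffffffff", 16));  // -1

        try
        {
            // A range-checked conversion rejects the value instead of reinterpreting it.
            Console.WriteLine(checked((int)allOnes));
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow");                   // this branch runs
        }
    }
}
```

The string overload parses the digits as a bit pattern, while the literal and the checked conversion treat 4294967295 as a number that is out of range for int.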

Will a c# "int" ever be 64 bits? [duplicate]

In my C# source code I may have declared integers as:
int i = 5;
or
Int32 i = 5;
In the currently prevalent 32-bit world they are equivalent. However, as we move into a 64-bit world, am I correct in saying that the following will become the same?
int i = 5;
Int64 i = 5;
No. The C# specification rigidly defines that int is an alias for System.Int32 with exactly 32 bits. Changing this would be a major breaking change.
The int keyword in C# is defined as an alias for the System.Int32 type, and this is (judging by the name) meant to be a 32-bit integer. From the specifications:
CLI specification section 8.2.2 (Built-in value and reference types) has a table with the following:
System.Int32 - Signed 32-bit integer
C# specification section 8.2.1 (Predefined types) has a similar table:
int - 32-bit signed integral type
This guarantees that both System.Int32 in CLR and int in C# will always be 32-bit.
Will sizeof(testInt) ever be 8?
No, sizeof(testInt) is an error. testInt is a local variable. The sizeof operator requires a type as its argument. This will never be 8 because it will always be an error.
VS2010 compiles a c# managed integer as 4 bytes, even on a 64 bit machine.
Correct. I note that section 18.5.8 of the C# specification defines sizeof(int) as being the compile-time constant 4. That is, when you say sizeof(int) the compiler simply replaces that with 4; it is just as if you'd said "4" in the source code.
Does anyone know if/when the time will come that a standard "int" in C# will be 64 bits?
Never. Section 4.1.4 of the C# specification states that "int" is a synonym for "System.Int32".
If what you want is a "pointer-sized integer" then use IntPtr. An IntPtr changes its size on different architectures.
int is always synonymous with Int32 on all platforms.
It's very unlikely that Microsoft will change that in the future, as it would break lots of existing code that assumes int is 32-bits.
I think what you may be confused by is that int is an alias for Int32, so it will always be 4 bytes, but IntPtr is supposed to match the word size of the CPU architecture, so it will be 4 bytes on a 32-bit system and 8 bytes on a 64-bit system.
According to the C# specification ECMA-334, section "11.1.4 Simple Types", the reserved word int will be aliased to System.Int32. Since this is in the specification it is very unlikely to change.
No matter whether you're using the 32-bit version or 64-bit version of the CLR, in C# an int will always mean System.Int32 and long will always mean System.Int64.
The following will always be true in C#:
sbyte signed 8 bits, 1 byte
byte unsigned 8 bits, 1 byte
short signed 16 bits, 2 bytes
ushort unsigned 16 bits, 2 bytes
int signed 32 bits, 4 bytes
uint unsigned 32 bits, 4 bytes
long signed 64 bits, 8 bytes
ulong unsigned 64 bits, 8 bytes
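The sizes listed above can be checked with the sizeof operator, which is a compile-time constant for these built-in types. The sketch below uses hypothetical names (IntegerSizes, FixedSizes) and contrasts the fixed sizes with IntPtr.Size, which depends on the running process.

```csharp
using System;

public static class IntegerSizes
{
    // Sizes in bytes, in the order sbyte, byte, short, ushort, int, uint, long, ulong.
    public static int[] FixedSizes() => new[]
    {
        sizeof(sbyte), sizeof(byte), sizeof(short), sizeof(ushort),
        sizeof(int), sizeof(uint), sizeof(long), sizeof(ulong)
    };

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FixedSizes())); // 1, 1, 2, 2, 4, 4, 8, 8
        Console.WriteLine(IntPtr.Size); // 4 in a 32-bit process, 8 in a 64-bit process
    }
}
```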
An integer literal is just a sequence of digits (eg 314159) without any of these explicit types. C# assigns it the first type in the sequence (int, uint, long, ulong) in which it fits. This seems to have been slightly muddled in at least one of the responses above.
Weirdly, the unary minus operator (minus sign) showing up before a string of digits does not reduce the choice to (int, long). The literal is always positive; the minus sign really is an operator. So -314159 is exactly the same thing as -((int)314159). There is, however, a special case to get -2147483648 straight into an int; otherwise it would be -((uint)2147483648), and because unary minus promotes a uint operand to long, the result would be a long rather than an int.
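How unary minus interacts with literal typing can be observed directly. The following is a small sketch with made-up names (NegatedLiterals, TypeOf); it relies only on the literal typing rule quoted in this thread and on the fact that negating a uint yields a long.

```csharp
using System;

public static class NegatedLiterals
{
    // Boxes the argument and returns the name of its runtime type.
    public static string TypeOf(object boxed) => boxed.GetType().Name;

    public static void Main()
    {
        Console.WriteLine(TypeOf(-314159));      // Int32: int literal, negated
        Console.WriteLine(TypeOf(-2147483648));  // Int32: the special case for int.MinValue
        Console.WriteLine(TypeOf(-3000000000));  // Int64: uint literal, negation promotes to long

        uint u = 2147483648;
        Console.WriteLine(TypeOf(-u));           // Int64: no special case for a variable
        Console.WriteLine(-u);                   // -2147483648
    }
}
```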
Somehow it seems safe to predict that C# (and friends) will never bother with "squishy name" types for >=128 bit integers. We'll get nice support for arbitrarily large integers and super-precise support for UInt128, UInt256, etc. as soon as processors support doing math that wide, and hardly ever use any of it. 64-bit address spaces are really big. If they're ever too small it'll be for some esoteric reason like ASLR or a more efficient MapReduce or something.
Yes, as Jon said, and unlike the 'C/C++ world', Java and C# aren't dependent on the system they're running on. They have strictly defined lengths for byte/short/int/long and single/double precision floats, equal on every system.
An integer literal without a suffix can be either 32-bit or 64-bit; it depends on the value it represents.
as defined in MSDN:
When an integer literal has no suffix, its type is the first of these types in which its value can be represented: int, uint, long, ulong.
Here is the address:
https://msdn.microsoft.com/en-us/library/5kzh1b5w.aspx

.NET primitives and type hierarchies, why was it designed like this?

I would like to understand why on .NET there are nine integer types: Char, Byte, SByte, Int16, UInt16, Int32, UInt32, Int64, and UInt64; plus other numeric types: Single, Double, Decimal; and all these types have no relation at all.
When I first started coding in C# I thought "cool, there's a uint type, I'm going to use that when negative values are not allowed". Then I realized no API used uint but int instead, and that uint is not derived from int, so a conversion was needed.
What are the real-world applications of these types? Why not have, instead, integer and positiveInteger? These are types I can understand. A person's age in years is a positiveInteger, and since positiveInteger is a subset of integer there's no need for conversion whenever integer is expected.
The following is a diagram of the type hierarchy in XPath 2.0 and XQuery 1.0. If you look under xs:anyAtomicType you can see the numeric hierarchy decimal > integer > long > int > short > byte. Why wasn't .NET designed like this? Will the new framework "Oslo" be any different?
My guess would be because the underlying hardware breaks that class hierarchy. There are (perhaps surprisingly) many times when you care that a UInt32 is 4 bytes big and unsigned, so a UInt32 is not a kind of Int32, nor is an Int32 a type of Int64.
And you almost always care about the difference between an int and a float.
Fundamentally, inheritance & the class hierarchy are not the same as mathematical set inclusion. The fact that the values a UInt32 can hold are a strict subset of the values an Int64 can hold does not mean that a UInt32 is a type of Int64. Less obviously, an Int32 is not a type of Int64 - even though there's no conceptual difference between them, their underlying representations are different (4 bytes versus 8 bytes). Decimals are even more different.
XPath is different: the representations for all the numeric types are fundamentally the same - a string of ASCII digits. There, the difference between a short and a long is one of possible range rather than representation - "123" is both a valid representation of a short and a valid representation of a long with the same value.
Decimal is intended for calculations that need precision (basically, money).
See here: http://msdn.microsoft.com/en-us/library/364x0z75(VS.80).aspx
Singles/Doubles are different to decimals, because they're intended to be an approximation (basically, for scientific calculations).
That's why they're not related.
As for bytes and chars, they're totally different: a byte is 0-255, whereas a char is a character, and can hence store unicode characters (there are a lot more than 255 of them!)
Uints and ints don't convert automatically, because they can each store values that are impossible for the other (uints have twice the positive range of ints).
Once you get the hang of it all, it actually does make a lot of sense.
As for your ages thing, I'd simply use an int ;)

Hexadecimal notation and signed integers

This is a follow-up question. So, Java stores integers in two's complement and you can do the following:
int ALPHA_MASK = 0xff000000;
In C# this requires the use of an unsigned integer, uint, because it interprets this to be 4278190080 instead of -16777216.
My question: how do I declare negative values in hexadecimal notation in C#, and how exactly are integers represented internally? What are the differences to Java here?
C# (rather, .NET) also uses the two's complement, but it supports both signed and unsigned types (which Java doesn't). A bit mask is more naturally an unsigned thing - why should one bit be different than all the other bits?
In this specific case, it is safe to use an unchecked cast:
int ALPHA_MASK = unchecked((int)0xFF000000);
To "directly" represent this number as a signed value, you write
int ALPHA_MASK = -0x1000000; // == -16777216
Hexadecimal is not (or should not be) any different from decimal: to represent a negative number, you need to write a negative sign, followed by the digits representing the absolute value.
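Both ways of writing the mask can be compared in a short sketch. The names AlphaMaskDemo, AlphaMask and Alpha are illustrative only, and the sample colour value is arbitrary.

```csharp
using System;

public static class AlphaMaskDemo
{
    // The bit pattern 0xFF000000 stored in a signed int.
    public const int AlphaMask = unchecked((int)0xFF000000);

    // Extracts the top byte of a 32-bit ARGB value using the mask.
    public static byte Alpha(int argb) =>
        (byte)(unchecked((uint)(argb & AlphaMask)) >> 24);

    public static void Main()
    {
        Console.WriteLine(AlphaMask);                 // -16777216
        Console.WriteLine(AlphaMask == -0x1000000);   // True
        Console.WriteLine(AlphaMask.ToString("X8"));  // FF000000

        int argb = unchecked((int)0x80112233);        // arbitrary sample colour
        Console.WriteLine(Alpha(argb));               // 128
    }
}
```

The shift is done on a uint so that the sign bit is not copied into the result.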
Well, you can use an unchecked block and a cast:
unchecked
{
int ALPHA_MASK = (int)0xff000000;
}
or
int ALPHA_MASK = unchecked((int)0xff000000);
Not terribly convenient, though... perhaps just use a literal integer?
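And just to add insult to injury, a negated hexadecimal literal compiles too, though note that this particular one is -2130706432 (bit pattern 0x81000000), not the alpha mask above:
-0x7F000000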

