What is the difference between "int" and "uint" / "long" and "ulong"? - C#

I know about int and long (32-bit and 64-bit numbers), but what are uint and ulong?

The primitive data types prefixed with "u" are the unsigned versions, with the same bit sizes. They cannot store negative numbers, but in exchange their maximum positive value is roughly twice that of their signed counterparts. The signed counterparts have no "u" prefix.
The limits for int (32-bit) are:
int: -2147483648 to 2147483647
uint: 0 to 4294967295
And for long (64-bit):
long: -9223372036854775808 to 9223372036854775807
ulong: 0 to 18446744073709551615

uint and ulong are the unsigned versions of int and long. That means they can't be negative. Instead they have a larger maximum value.
Type Min Max CLS-compliant
int -2,147,483,648 2,147,483,647 Yes
uint 0 4,294,967,295 No
long -9,223,372,036,854,775,808 9,223,372,036,854,775,807 Yes
ulong 0 18,446,744,073,709,551,615 No
To write an unsigned int literal in your source code you can use the suffix u or U, for example 123U.
You should not use uint and ulong in your public interface if you wish to be CLS-compliant.
Read the documentation for more information:
int
uint
long
ulong
By the way, there are also short and ushort, and byte and sbyte.
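For illustration, here is a small sketch showing the U suffix mentioned above (and UL for a ulong literal) together with the documented maxima:
using System;

uint u = 123U;                       // the U suffix makes the literal unsigned
ulong big = 18446744073709551615UL;  // UL marks an unsigned long (ulong) literal
Console.WriteLine(u);                // 123
Console.WriteLine(big);              // 18446744073709551615 (== ulong.MaxValue)
Console.WriteLine(uint.MaxValue);    // 4294967295
Console.WriteLine(ulong.MaxValue);   // 18446744073709551615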

The difference is that uint and ulong are unsigned data types, so their ranges are different: they cannot hold negative values.
int range: -2,147,483,648 to 2,147,483,647
uint range: 0 to 4,294,967,295
long range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
ulong range: 0 to 18,446,744,073,709,551,615

u means unsigned, so ulong is a large number without a sign: you can store a bigger value in ulong than in long, but no negative numbers are allowed.
A long value is stored in 64 bits, with the most significant bit used as the sign. ulong is also 64 bits, but all 64 bits store the magnitude, so the maximum of ulong is 2^64 - 1, while the maximum of long is 2^63 - 1.

Based on the other answers here and a little review, you can understand it this way: "unsigned" refers to the absence of an explicit sign (think of the "-" in -1), so these types cannot hold negative numbers.
Because the negative half of the range is removed, that capacity is reallocated to the positive end, which is why the maximum positive value doubles. Instead of the bit range being split between negative and positive values, for ushort, uint, ulong, etc. it is devoted entirely to non-negative values.

It's been a while since I C++'d but these answers are off a bit.
As far as the size goes, 'int' isn't anything. It's a notional value of a standard integer; assumed to be fast for purposes of things like iteration. It doesn't have a preset size.
So, the answers are correct with respect to the differences between int and uint, but are incorrect when they talk about "how large they are" or what their range is. That size is not fixed by the language; more accurately, it is implementation-defined and changes with the compiler and platform.
It's never polite to discuss the size of your bits in public.
When you compile a program, int does have a size, as you've taken the abstract C/C++ and turned it into concrete machine code.
So, TODAY, practically speaking with most common compilers, they are correct. But do not assume this.
Specifically: if you're writing a 32-bit program, int will be one thing; for 64-bit it can be different, and 16-bit is different again. I've gone through all three and briefly looked at the 6502 (shudder).
A brief google search shows this:
https://www.tutorialspoint.com/cprogramming/c_data_types.htm
This is also good info:
https://docs.oracle.com/cd/E19620-01/805-3024/lp64-1/index.html
Use int if you really don't care how large your bits are; the size can change.
Use size_t and ssize_t if you want to know how large something is.
If you're reading or writing binary data, don't use int. Use a (usually platform/source dependent) specific keyword. WinSDK has plenty of good, maintainable examples of this. Other platforms do too.
I've spent a LOT of time going through code from people that "SMH" at the idea that this is all just academic/pedantic. These are the people that write unmaintainable code. Sure, it's easy to use type 'int' without all the extra darn typing, but it's a lot of work to figure out what they really meant, and a bit mind-numbing.
It's fragile coding when you mix int and assume sizes.
Use int and uint when you just want a fast integer and don't care about the range (other than signed vs. unsigned).

Related

Calculators working with larger numbers than 18446744073709551615

When I initialize a ulong with the value 18446744073709551615 and then add 1 to it and display it to the console, it displays 0, which is totally expected.
I know this question sounds stupid, but I have to ask it: if my computer has a 64-bit architecture CPU, how is my calculator able to work with numbers larger than 18446744073709551615?
I suppose floating-point has a lot to do here.
I would like to know exactly how this happens.
Thank you.
working with larger numbers than 18446744073709551615
"if my Computer has a 64-bit architecture CPU" --> The architecture bit size is largely irrelevant.
Consider how you are able to add 2 decimal digits whose sum is more than 9. There is a carry generated and then used when adding the next most significant decimal place.
The CPU can do the same but with base 18446744073709551616 instead of base 10. It uses a carry bit as well as a sign and overflow bit to perform extended math.
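To make the carry idea concrete, here is a small C# sketch (an illustration of the principle, not how any real calculator is implemented) that adds two 128-bit values held as pairs of ulong "digits" in base 2^64:
using System;

(ulong hi, ulong lo) Add128((ulong hi, ulong lo) a, (ulong hi, ulong lo) b)
{
    ulong lo = unchecked(a.lo + b.lo);          // low "digit", may wrap around
    ulong carry = lo < a.lo ? 1UL : 0UL;        // wrap-around means a carry occurred
    ulong hi = unchecked(a.hi + b.hi + carry);  // high "digit" absorbs the carry
    return (hi, lo);
}

var max64 = (hi: 0UL, lo: ulong.MaxValue);      // 18446744073709551615
var one = (hi: 0UL, lo: 1UL);
Console.WriteLine(Add128(max64, one));          // (1, 0) instead of wrapping to 0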
I suppose floating-point has a lot to do here.
This has nothing to do with floating point.
You say you're using ulong, which means you're using unsigned 64-bit arithmetic. The largest value you can store is therefore "all ones" for 64 bits - aka UInt64.MaxValue, which, as you've discovered, wraps to 0 when you add 1: https://learn.microsoft.com/en-us/dotnet/api/system.uint64.maxvalue
If you want to store arbitrarily large numbers: there are APIs for that - for example BigInteger. However, arbitrary size comes at a cost, so it isn't the default, and it certainly isn't what you get when you use ulong (or double, or decimal, etc. - all the compiler-level numeric types have a fixed size).
So: consider using BigInteger
Either way you have a 64-bit processor that is limited to 64-bit math in hardware. Your problem is a bit hard to explain without an explicit example of how this is solved with BigInteger in the System.Numerics namespace, available in .NET Framework 4.8 for example. The basic idea is to 'decompose' the number into an array representation.
'Decompose' here means:
"express (a number or function) as a combination of simpler components."
Internally BigInteger uses an internal array (actually multiple internal constructs) and a helper class called BigIntegerBuilder. It can implicitly convert a UInt64 integer without problems; for even bigger numbers you can use the + operator, for example.
BigInteger bignum = new BigInteger(18446744073709551615UL);  // ulong.MaxValue
bignum += 1;                                                 // now 18446744073709551616
You can read about the implicit operator here:
https://referencesource.microsoft.com/#System.Numerics/System/Numerics/BigInteger.cs
public static BigInteger operator +(BigInteger left, BigInteger right)
{
    left.AssertValid();
    right.AssertValid();

    if (right.IsZero) return left;
    if (left.IsZero) return right;

    int sign1 = +1;
    int sign2 = +1;
    BigIntegerBuilder reg1 = new BigIntegerBuilder(left, ref sign1);
    BigIntegerBuilder reg2 = new BigIntegerBuilder(right, ref sign2);

    if (sign1 == sign2)
        reg1.Add(ref reg2);
    else
        reg1.Sub(ref sign1, ref reg2);

    return reg1.GetInteger(sign1);
}
In the code above from Reference Source you can see that the BigIntegerBuilder is used to add the left and right operands, which are themselves BigInteger values.
Interestingly, BigInteger keeps its internal state in a private array called "_bits", and that is the answer to your question: it tracks an array of 32-bit integers and is therefore able to handle big integers, even beyond 64 bits.
You can drop this code into a console application or LINQPad (which has the .Dump() method I use here), import the System.Numerics and System.Reflection namespaces, and inspect it:
BigInteger bignum = new BigInteger(18446744073709551615);
bignum.GetType().GetField("_bits",
    BindingFlags.NonPublic | BindingFlags.Instance).GetValue(bignum).Dump();
A detail about BigInteger is revealed in a comment in its source code on Reference Source: for values that fit in an Int32, BigInteger stores the value in the _sign field; for all other values the _bits field is used.
Obviously, the internal array needs to be convertible to a decimal (base-10) representation so humans can read it; the ToString() method produces that string representation.
For a deeper understanding, consider .NET source stepping to step into the code and see how the arithmetic is carried out. The short version: BigInteger keeps an internal representation composed of a 32-bit integer array, which can be rendered in a readable format and allows numbers bigger than even Int64. The comment from the source reads:
// For values int.MinValue < n <= int.MaxValue, the value is stored in sign
// and _bits is null. For all other values, sign is +1 or -1 and the bits are in _bits
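As a quick usage note, here is a minimal self-contained version of the idea above (assuming a project that references System.Numerics):
using System;
using System.Numerics;

BigInteger big = ulong.MaxValue;            // implicit conversion from ulong
big += 1;                                   // 18446744073709551616, no wrap-around
Console.WriteLine(big);
Console.WriteLine(BigInteger.Pow(2, 128));  // numbers far beyond 64 bits are fine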

Why Int64 and UInt64 are of same size

I'm asking in the context of C#. Why are both Int64 and UInt64 the same size, i.e. 64 bits? The same goes for the other integer types as well.
I was expecting UInt to be at least a bit smaller, since it does not need to represent the - sign. Am I expecting something illogical here?
The 64 in Int64 and UInt64 means the size of the integer is 64 bits. That's why they are the same size.
The size is the same, but the range is different (because of how the most significant bit is used):
UInt64: 0 to 18,446,744,073,709,551,615;
Int64: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
The fundamental reason is that modern CPUs are 64-bit ones.
So it's far easier to implement:
64-bit integer - one CPU register (say RAX)
32-bit integer - half CPU register (EAX)
16-bit integer - 1/4 (AX)
8-bit integer - 1/8 (AH)
Otherwise (for, say, 96-bit) you would have to design your own support for things like integer overflow detection, which would be slow.
It all comes down to what the most significant bit is used for. In signed data it indicates the sign; in unsigned data it is just another bit. This means that unsigned data can represent values twice as large in magnitude, but can only represent positive values (and zero) - for example, with 64 bits, unsigned values range from 0 to 18,446,744,073,709,551,615, whereas signed values range from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
It is easier to see for 8-bit: -128 to 127 versus 0 to 255. The number of possible values is identical.
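A small sketch of that idea: the same 64 bits interpreted as unsigned and as signed (two's complement):
using System;

ulong allOnes = ulong.MaxValue;                 // 0xFFFFFFFFFFFFFFFF, all 64 bits set
long reinterpreted = unchecked((long)allOnes);  // same bits, signed interpretation
Console.WriteLine(allOnes);                     // 18446744073709551615
Console.WriteLine(reinterpreted);               // -1 under two's complement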
Here is the Integral Types Table for C#. If you look there you can see:
long -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 Signed 64-bit integer
ulong 0 to 18,446,744,073,709,551,615 Unsigned 64-bit integer
If you look, the "range" of both is the same: 2^64 possible values (including 0).
Probably easier with smaller data types:
sbyte -128 to 127 Signed 8-bit integer
byte 0 to 255 Unsigned 8-bit integer
Both have a "range" of 256 values, both of them are 8 bits, and 2^8 == 256
Note that signed and unsigned types have the same number of representable values because nearly all processors historically use two's complement for signed numbers. Had they used ones' complement, the signed type would effectively represent one fewer value than the unsigned type, because there would be two representations of 0 (positive and negative zero).

Why are the unsigned CLR types so difficult to use in C#?

I came from a mostly C/C++ background before I began using C#. One of the things I did with my first project in C# was make a class like this
class Element {
    public uint Size;
    public ulong BigThing;
}
I was then mortified by the fact that this requires casting:
int x = MyElement.Size;
as does
int x = 5;
uint total = MyElement.Size + x;
Why did the language designers decide to make the signed and unsigned integer types not implicitly castable? And why are the unsigned types not used more throughout the .Net library? For instance String.Length can never be negative, yet it is a signed integer.
Why did the language designers decide to make the signed and unsigned integer types not
implicitly castable?
Because that could lose data or throw an exception, neither of which is generally a good thing to allow implicitly. (The implicit conversion from long to double can lose data too, admittedly, but in a different way.)
And why are the unsigned types not used more throughout the .Net library
Unsigned types aren't CLS-compliant - not all .NET languages have always supported them. For example, Visual Basic didn't have "native" support for unsigned data types in .NET 1.0 and 1.1; it was added to the language for 2.0. (You could still use them, but they weren't part of the language itself - you couldn't use the normal arithmetic operators, for example.)
Along with Jon's answer, just because an unsigned number can't be negative doesn't mean it isn't bigger than a signed one. uint is 0 to 4,294,967,295 but int is -2,147,483,648 to 2,147,483,647. Plenty of room above int's max for loss.
Because implicitly converting an unsigned integer holding, say, 3 billion into a signed 32-bit integer is going to blow up.
Unsigned types have twice the maximum value of their signed counterparts. It's the same reason you can't implicitly convert a long to an int.
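A brief sketch of what that "blow up" looks like (illustrative values only):
using System;

uint big = 3_000_000_000;               // representable as uint, not as int
int lossy = unchecked((int)big);        // silently wraps to a negative number
Console.WriteLine(lossy);               // -1294967296
try
{
    int strict = checked((int)big);     // with overflow checking the cast throws
    Console.WriteLine(strict);
}
catch (OverflowException)
{
    Console.WriteLine("checked cast threw OverflowException");
}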
I was then mortified by the fact that this requires casting:
int x=MyElement.Size;
But you are contradicting yourself here. If you really (really) need Size to be unsigned, then assigning it to (signed) x is an error. A deep flaw in your code.
For instance String.Length can never be negative, yet it is a signed integer
But String.IndexOf can return a negative number, and it would be awkward if String.Length and index values were of different types.
And while in theory there would be merit in an unsigned String.Length (4 GB cap), in practice even the current 2GB is large enough (because strings of that length are rare and unworkable anyway).
So the real answer is: Why use unsigned in the first place?
On the second count: because they wanted the CLR to be compatible with languages that don't have unsigned datatypes (read: VB.NET).

Will a C# "int" ever be 64 bits? [duplicate]

In my C# source code I may have declared integers as:
int i = 5;
or
Int32 i = 5;
In the currently prevalent 32-bit world they are equivalent. However, as we move into a 64-bit world, am I correct in saying that the following will become the same?
int i = 5;
Int64 i = 5;
No. The C# specification rigidly defines that int is an alias for System.Int32 with exactly 32 bits. Changing this would be a major breaking change.
The int keyword in C# is defined as an alias for the System.Int32 type and this is (judging by the name) meant to be a 32-bit integer. To the specification:
CLI specification section 8.2.2 (Built-in value and reference types) has a table with the following:
System.Int32 - Signed 32-bit integer
C# specification section 8.2.1 (Predefined types) has a similar table:
int - 32-bit signed integral type
This guarantees that both System.Int32 in CLR and int in C# will always be 32-bit.
Will sizeof(testInt) ever be 8?
No, sizeof(testInt) is an error. testInt is a local variable. The sizeof operator requires a type as its argument. This will never be 8 because it will always be an error.
VS2010 compiles a c# managed integer as 4 bytes, even on a 64 bit machine.
Correct. I note that section 18.5.8 of the C# specification defines sizeof(int) as being the compile-time constant 4. That is, when you say sizeof(int) the compiler simply replaces that with 4; it is just as if you'd said "4" in the source code.
Does anyone know if/when the time will come that a standard "int" in C# will be 64 bits?
Never. Section 4.1.4 of the C# specification states that "int" is a synonym for "System.Int32".
If what you want is a "pointer-sized integer" then use IntPtr. An IntPtr changes its size on different architectures.
int is always synonymous with Int32 on all platforms.
It's very unlikely that Microsoft will change that in the future, as it would break lots of existing code that assumes int is 32-bits.
I think what you may be confused by is that int is an alias for Int32, so it will always be 4 bytes, but IntPtr is supposed to match the word size of the CPU architecture, so it will be 4 bytes in a 32-bit process and 8 bytes in a 64-bit process.
According to the C# specification ECMA-334, section "11.1.4 Simple Types", the reserved word int will be aliased to System.Int32. Since this is in the specification it is very unlikely to change.
No matter whether you're using the 32-bit version or 64-bit version of the CLR, in C# an int will always mean System.Int32 and long will always mean System.Int64.
The following will always be true in C#:
sbyte signed 8 bits, 1 byte
byte unsigned 8 bits, 1 byte
short signed 16 bits, 2 bytes
ushort unsigned 16 bits, 2 bytes
int signed 32 bits, 4 bytes
uint unsigned 32 bits, 4 bytes
long signed 64 bits, 8 bytes
ulong unsigned 64 bits, 8 bytes
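A short illustration of those fixed sizes (as noted above, sizeof on the built-in numeric types is a compile-time constant in C#):
using System;

Console.WriteLine(sizeof(int));    // 4 on every platform
Console.WriteLine(sizeof(uint));   // 4
Console.WriteLine(sizeof(long));   // 8 on every platform
Console.WriteLine(sizeof(ulong));  // 8
Console.WriteLine(IntPtr.Size);    // 4 or 8, depending on the process, not the language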
An integer literal is just a sequence of digits (e.g. 314159) without any explicit type suffix. C# assigns it the first type in the sequence (int, uint, long, ulong) in which it fits. This seems to have been slightly muddled in at least one of the responses above.
Weirdly the unary minus operator (minus sign) showing up before a string of digits does not reduce the choice to (int, long). The literal is always positive; the minus sign really is an operator. So presumably -314159 is exactly the same thing as -((int)314159). Except apparently there's a special case to get -2147483648 straight into an int; otherwise it'd be -((uint)2147483648). Which I presume does something unpleasant.
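A quick sketch of that literal-typing rule (int, then uint, then long, then ulong) and the -2147483648 special case:
using System;

var a = 2147483647;             // fits in int -> int
var b = 2147483648;             // too big for int -> uint
var c = 4294967296;             // too big for uint -> long
var d = 9223372036854775808;    // too big for long -> ulong
Console.WriteLine((a.GetType(), b.GetType(), c.GetType(), d.GetType()));
// (System.Int32, System.UInt32, System.Int64, System.UInt64)
Console.WriteLine(-2147483648); // special-cased by the compiler so it is an int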
Somehow it seems safe to predict that C# (and friends) will never bother with "squishy name" types for >=128 bit integers. We'll get nice support for arbitrarily large integers and super-precise support for UInt128, UInt256, etc. as soon as processors support doing math that wide, and hardly ever use any of it. 64-bit address spaces are really big. If they're ever too small it'll be for some esoteric reason like ASLR or a more efficient MapReduce or something.
Yes, as Jon said, and unlike the 'C/C++ world', Java and C# aren't dependent on the system they're running on. They have strictly defined lengths for byte/short/int/long and single/double precision floats, equal on every system.
An integer literal without a suffix can end up typed as 32-bit or 64-bit; it depends on the value it represents (the int type itself is always 32-bit).
As defined in MSDN:
When an integer literal has no suffix, its type is the first of these types in which its value can be represented: int, uint, long, ulong.
Here is the address:
https://msdn.microsoft.com/en-us/library/5kzh1b5w.aspx
