When adding two positive Int32 values whose mathematical result is greater than Int32.MaxValue, can I count on the overflowed value always being negative?
I mean to do this as a way to check for overflow without using a checked context and exception handling (as proposed here: http://sandbox.mc.edu/~bennet/cs110/tc/orules.html ), but is this behaviour guaranteed?
From what I've read so far, signed integer overflow is defined behaviour in C# (Is C#/.NET signed integer overflow behavior defined?), in contrast to C/C++, and Int32 is two's complement, so I would be thankful if someone with a better understanding of this subject than me could verify this.
Update
Quote from link 1:
The rules for detecting overflow in a two's complement sum are simple:
If the sum of two positive numbers yields a negative result, the sum has overflowed.
If the sum of two negative numbers yields a positive result, the sum has overflowed.
Otherwise, the sum has not overflowed.
Rule #2 from
http://sandbox.mc.edu/~bennet/cs110/tc/orules.html
is incorrect
If the sum of two negative numbers yields a positive result, the sum has overflowed.
Counter example:
int a = int.MinValue;
int b = int.MinValue;
unchecked
{
    Console.Write(a + b); // prints 0
}
However, the rule can easily be amended:
If the sum of two negative numbers yields a non-negative result, the sum has overflowed.
As for Rule #1
If the sum of two positive numbers yields a negative result, the sum has overflowed.
it is correct.
No, you cannot.
You already talk about checked context, where you know that overflow causes an exception to be thrown. However, you seem to assume that the lack of a checked keyword indicates you're in an unchecked context. That isn't necessarily the case. The default context when neither checked nor unchecked is specified is configurable, can be different in multiple projects that share the same source files, and can even be different between different configurations of the same project.
If you want integer overflow to wrap, be explicit and use the unchecked keyword.
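For example, here is a minimal sketch of that sign-check approach with the unchecked keyword made explicit (AddWouldOverflow is a hypothetical helper name, not an existing API):

// Detects overflow when adding two positive Int32 values by checking the
// sign of the wrapped result, as in the linked rules.
static bool AddWouldOverflow(int a, int b)
{
    // Explicit unchecked context, so wrapping happens regardless of the
    // project's default overflow-checking setting.
    unchecked
    {
        int sum = a + b;
        // For two positive operands, a negative two's complement result
        // means the mathematical sum exceeded Int32.MaxValue.
        return a > 0 && b > 0 && sum < 0;
    }
}

For instance, AddWouldOverflow(int.MaxValue, 1) returns true, while AddWouldOverflow(1, 2) returns false.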
Related
Situation:
I have a system which reads a continuous stream of numbers (integers only).
The numbers are both positive and negative.
Some of the numbers are the results of overflowed arithmetic operations, so these numbers appear as negative values in the stream.
Problem:
How can I differentiate between overflowed negative numbers and non-overflowed negative numbers in the stream? Is there a way to find and discard overflowed numbers? I'm developing in C# and don't have control over the stream source, so I can't change the code or add any checks.
There is no way to achieve what you want in the general case. An int (or any other built-in value type) carries no information about how it was produced; there is no difference in the bit pattern of a value obtained by a regular operation and one produced by an overflow.
Your options:
some cases may be detected by range checking: if normal values are relatively small, large values are likely errors
capture more information when constructing the stream of values: include an "overflow" flag paired with each value (see the sketch after the note below)
use a wider type for the values (long or BigInteger)
Note that overflow can also produce small or positive numbers, so there is no chance of filtering out all invalid values unless you know more about how the computation is performed.
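If you ever do gain control over the producer, a hedged sketch of the second option above might look like this (AddWithFlag is a made-up name; it pairs each result with an "overflowed" flag):

// Producer-side sketch: emit (value, overflowed) pairs instead of raw ints,
// so consumers can discard values that came from an overflowing operation.
static (int Value, bool Overflowed) AddWithFlag(int a, int b)
{
    try
    {
        // checked: throws OverflowException instead of wrapping.
        return (checked(a + b), false);
    }
    catch (OverflowException)
    {
        // Fall back to the wrapped value, but mark it as invalid.
        return (unchecked(a + b), true);
    }
}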
You can write defensive code to guard against overflow.
All of the built-in numeric types have MinValue and MaxValue fields, so if the value can be read into a wider type (such as long) you can range-check it:
if (number >= Int32.MinValue && number <= Int32.MaxValue)
{
    // within Int32 range; safe to use as an Int32
}
I'm just curious: why, in IEEE-754, does any non-zero float number divided by zero result in an infinite value? It's nonsense from the mathematical perspective, so I think the correct result for this operation is NaN.
The function f(x) = 1/x is not defined when x = 0, if x is a real number. For example, the sqrt function is not defined for any negative number, and sqrt(-1.0f) in IEEE-754 produces a NaN value. But 1.0f/0 is Inf.
But for some reason this is not the case in IEEE-754. There must be a reason for this, maybe some optimization or compatibility reasons.
So what's the point?
It's nonsense from the mathematical perspective.
Yes. No. Sort of.
The thing is: Floating-point numbers are approximations. You want to use a wide range of exponents and a limited number of digits and get results which are not completely wrong. :)
The idea behind IEEE-754 is that every operation could trigger "traps" which indicate possible problems. They are
Illegal (senseless operation like sqrt of negative number)
Overflow (too big)
Underflow (too small)
Division by zero (The thing you do not like)
Inexact (This operation may give you wrong results because you are losing precision)
Now many people, like scientists and engineers, do not want to be bothered with writing trap routines. So Kahan, the primary architect of IEEE-754, decided that every operation should also return a sensible default value if no trap routines exist.
They are
NaN for illegal values
signed infinities for Overflow
signed zeroes for Underflow
NaN for indeterminate results (0/0), and infinities for division by zero (x/0, x != 0)
normal operation result for Inexact
The thing is that in 99% of all cases zeroes are caused by underflow, and therefore in 99% of all cases Infinity is "correct", even if wrong from a mathematical perspective.
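As a quick illustration, here is what those non-trapping defaults look like from C# (assuming using System; the exact output text may vary by runtime):

Console.WriteLine(Math.Sqrt(-1.0));   // NaN: illegal operation
Console.WriteLine(1.0 / 0.0);         // positive infinity: division by zero
Console.WriteLine(-1.0 / 0.0);        // negative infinity
Console.WriteLine(0.0 / 0.0);         // NaN: indeterminate
double big = double.MaxValue;
Console.WriteLine(big * 2);           // positive infinity: overflow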
I'm not sure why you would believe this to be nonsense.
The simplistic definition of a / b, at least for non-zero b, is the unique number of bs that has to be subtracted from a before you get to zero.
Expanding that to the case where b can be zero, the number that has to be subtracted from any non-zero number to get to zero is indeed infinite, because you'll never get to zero.
Another way to look at it is to talk in terms of limits. As a positive number n approaches zero, the expression 1 / n approaches "infinity". You'll notice I've quoted that word because I'm a firm believer in not propagating the delusion that infinity is actually a concrete number :-)
NaN is reserved for situations where the number cannot be represented (even approximately) by any other value (including the infinities); it is considered distinct from all those other values.
For example, 0 / 0 (using our simplistic definition above) can have any number of bs subtracted from a to reach 0. Hence the result is indeterminate - it could be 1, 7, 42, 3.14159 or any other value.
Similarly, things like the square root of a negative number, which has no value on the real line used by IEEE 754 (you have to go to the complex plane for that), cannot be represented.
In mathematics, division by zero is undefined because zero has no sign; therefore two results are equally possible, and mutually exclusive: negative infinity or positive infinity (but not both).
In (most) computing, 0.0 has a sign. Therefore we know what direction we are approaching from, and what sign infinity would have. This is especially true when 0.0 represents a non-zero value too small to be expressed by the system, as is frequently the case.
The only time NaN would be appropriate is if the system knows with certainty that the denominator is truly, exactly zero. And it can't unless there is a special way to designate that, which would add overhead.
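A small C# sketch of the signed-zero point (the comparison shows the two zeroes are still "equal" even though the sign bit differs):

double posZero = 0.0;
double negZero = -0.0;                 // 0.0 with its sign bit set
Console.WriteLine(1.0 / posZero);      // positive infinity
Console.WriteLine(1.0 / negZero);      // negative infinity
Console.WriteLine(posZero == negZero); // True: the two zeroes compare equal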
NOTE:
I re-wrote this following a valuable comment from #Cubic.
I think the correct answer to this has to come from calculus and the notion of limits. Consider the limit of f(x)/g(x) as x->0 under the assumption that g(0) == 0. There are two broad cases that are interesting here:
If f(0) != 0, then the limit as x->0 is either plus or minus infinity, or it's undefined. If g(x) takes both signs in the neighborhood of x==0, then the limit is undefined (left and right limits don't agree). If g(x) has only one sign near 0, however, the limit will be defined and be either positive or negative infinity. More on this later.
If f(0) == 0 as well, then the limit can be anything, including positive infinity, negative infinity, a finite number, or undefined.
In the second case, generally speaking, you cannot say anything at all. Arguably, in the second case NaN is the only viable answer.
Now in the first case, why choose one particular sign when either is possible or it might be undefined? As a practical matter, it gives you more flexibility in cases where you do know something about the sign of the denominator, at relatively little cost in the cases where you don't. You may have a formula, for example, where you know analytically that g(x) >= 0 for all x, say, for example, g(x) = x*x. In that case the limit is defined and it's infinity with sign equal to the sign of f(0). You might want to take advantage of that as a convenience in your code. In other cases, where you don't know anything about the sign of g, you cannot generally take advantage of it, but the cost here is just that you need to trap for a few extra cases - positive and negative infinity - in addition to NaN if you want to fully error check your code. There is some price there, but it's not large compared to the flexibility gained in other cases.
Why worry about general functions when the question was about "simple division"? One common reason is that if you're computing your numerator and denominator through other arithmetic operations, you accumulate round-off errors. The presence of those errors can be abstracted into the general formula format shown above. For example f(x) = x + e, where x is the analytically correct, exact answer, e represents the error from round-off, and f(x) is the floating point number that you actually have on the machine at execution.
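To make the g(x) = x*x case above concrete, here is a purely illustrative C# sketch: the denominator is non-negative for every x, and once x*x underflows to 0.0 the quotient lands on +Infinity, which is consistent with the limit:

foreach (double x in new[] { 1e-100, -1e-100, 1e-200, -1e-200, 0.0 })
{
    // x*x is >= 0 for every x; for the last three values it underflows to 0.0,
    // so 1.0 / (x * x) becomes positive infinity from either side.
    Console.WriteLine($"x = {x}   1/(x*x) = {1.0 / (x * x)}");
}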
The following runs fine without error and diff is 1:
int max = int.MaxValue;
int min = int.MinValue;
//Console.WriteLine("Min is {0} and max is {1}", min, max);
int diff = min - max;
Console.WriteLine(diff);
Wouldn't all programs then be suspect? a+b is no longer the sum of a and b, where a and b are of type int. Sometimes it is, but sometimes it is the sum* of a, b and 2*int.MinValue.
* Sum as in the ordinary English meaning of addition, ignoring any computer knowledge or word size
In PowerShell, it looks better, but it still is not a hardware exception from the add operation. It appears to use a long before casting back to an int:
[int]$x = [int]::minvalue - [int]::maxvalue
Cannot convert value "-4294967295" to type "System.Int32". Error: "Value was either too large or too small for an Int32."
By default, overflow checking is turned off in C#. Values simply "wrap round" in the common way.
If you compiled the same code with /checked or used a checked { ... } block, it would throw an exception.
Depending on what you're doing, you may want checking or explicitly not want it. For example, in Noda Time we have overflow checking turned on by default, but explicitly turn it off for GetHashCode computations (where we expect overflow and have no problem with it) and computations which we know won't overflow, and where we don't want the (very slight) performance penalty of overflow checking.
See the checked and unchecked pages in the C# reference, or section 7.6.12 of the C# language specification, for more details.
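As a hedged illustration of the GetHashCode case mentioned above (FieldA and FieldB are placeholder members, not anything from Noda Time):

// Typical hash-combining pattern: overflow is expected and harmless,
// so the arithmetic is wrapped in an explicit unchecked block.
public override int GetHashCode()
{
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + FieldA.GetHashCode();
        hash = hash * 31 + FieldB.GetHashCode();
        return hash;
    }
}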
When not specified, .NET will not check for numeric overflow when doing operations on numeric data types (int, etc.).
You can enable it either by building in checked mode (passing /checked to the compiler) or by using a checked block in a code segment:
checked
{
    int i = int.MaxValue + int.MaxValue;
    Console.WriteLine(i);
}
To my understanding, that should give you an overflow error, and when I write it like this:
public static void Main()
{
    Console.WriteLine(int.MaxValue - int.MinValue);
}
it does correctly give me an overflow error.
However:
public static void Main()
{
    Console.WriteLine(test());
}

public static Int32 test(int minimum = int.MinValue, int maximum = int.MaxValue)
{
    return maximum - minimum;
}
will output -1
Why does it do this? It should throw an error because it's clearly an overflow!
int.MaxValue - int.MinValue is a value which an int cannot hold, so the number wraps around to -1.
It is like 2147483647 - (-2147483648) = 4294967295, which does not fit in an int.
Int32.MinValue Field
The value of this constant is -2,147,483,648; that is, hexadecimal 0x80000000.
And Int32.MaxValue Field
The value of this constant is 2,147,483,647; that is, hexadecimal 0x7FFFFFFF.
From MSDN
When integer overflow occurs, what happens depends on the execution context, which can be checked or unchecked. In a checked context, an OverflowException is thrown. In an unchecked context, the most significant bits of the result are discarded and execution continues.
Thus, C# gives you the choice of handling or ignoring overflow.
This is because of compile-time overflow checking of your code. The line
Console.WriteLine(int.MaxValue - int.MinValue);
would not actually error at runtime; it would simply write "-1", but due to overflow checking you get the compile error "The operation overflows at compile time in checked mode".
To get around the compile-time overflow checking in this case you can do:
unchecked
{
    Console.WriteLine(int.MaxValue - int.MinValue);
}
This will run fine and output "-1".
The default project-level setting that controls this is set to "unchecked" by default. You can turn on overflow checking by going to the project properties, Build tab, Advanced button. The popup allows you to turn on overflow checking. The .NET Fiddle tool that you link to seems to perform some additional static analysis that is preventing you from seeing the true out-of-the-box runtime behavior. (The error for your first code snippet above is "The operation overflows at compile time in checked mode." You aren't seeing a runtime error.)
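One way to see the runtime behaviour without the compile-time constant check getting in the way is to use variables instead of constants (a small sketch, assuming the default unchecked project setting):

// No constant folding here, so there is no compile-time error; with the
// default unchecked setting the subtraction simply wraps at runtime.
int max = int.MaxValue;
int min = int.MinValue;
Console.WriteLine(max - min);   // prints -1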
I think it goes even further than overflows.
If I look at this:
Int64 max = Int32.MaxValue;
Console.WriteLine(max.ToString("X16"));      // 000000007FFFFFFF
Int64 min = Int32.MinValue;
Console.WriteLine(min.ToString("X"));        // FFFFFFFF80000000
Int64 subtract = max - min;
Console.WriteLine(subtract.ToString("X16")); // 00000000FFFFFFFF <- not an overflow, since it's a 64-bit number
Int32 neg = -1;
Console.WriteLine(neg.ToString("X"));        // FFFFFFFF
Here you see that if you do the subtraction in 64 bits and then truncate to the low 32 bits (dropping the leading zeros), the 2's complement bit pattern you get is the one for -1 in a 32-bit number.
2's complement arithmetic can be very fun: http://en.wikipedia.org/wiki/Two's_complement
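Tying that back to the 32-bit result, a small sketch of the truncation step:

long wide = (long)int.MaxValue - int.MinValue;   // 4294967295 = 0x00000000FFFFFFFF
int narrow = unchecked((int)wide);               // only the low 32 bits survive: 0xFFFFFFFF
Console.WriteLine(narrow);                       // -1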
Consider this
int i = 2147483647;
var n = i + 3;
i = n;
Console.WriteLine(i); // prints -2147483646 (1)
Console.WriteLine(n); // prints -2147483646 (2)
Console.WriteLine(n.GetType()); // prints System.Int32 (3)
I am confused by the following:
(1) How can an int hold the value -2147483646? (int range = -2,147,483,648 to 2,147,483,647)
(2) Why does this print -2147483648 and not 2147483648? (The compiler should choose a better type, as the int range is exceeded.)
(3) If it is converted somewhere, why does n.GetType() give System.Int32?
Edit1: Made the correction; now you will get what I am getting (sorry for that). Changed var n = i + 1; to var n = i + 3;
Edit2: One more thing: if it is an overflow, why is an exception not raised?
Addition: as the overflow occurs, wouldn't it be right to set the type of var n in the statement var n = i + 3; to another type accordingly?
You are welcome to suggest a better title, as this is not making sense to... me, at least.
Thanks
Update: Poster fixed his question.
1) This output is expected because you added 3 to int.MaxValue, causing an overflow. In .NET this is by default a legal operation in unchecked code, giving a wrap-around to negative values, but if you add a checked block around the code it will throw an OverflowException instead (see the sketch below).
2) The type of a variable declared with var is determined at compile time, not at runtime. It's a rule that adding two Int32s gives an Int32, not a UInt32, an Int64 or something else. So even though at runtime you can see that the result is too big for an Int32, it still has to return an Int32.
3) It's not converted to another type.
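For example, adapting the question's snippet, a checked block makes the overflow visible instead of wrapping (a small sketch):

int i = 2147483647;
try
{
    checked
    {
        var n = i + 3;          // throws OverflowException instead of wrapping to -2147483646
        Console.WriteLine(n);
    }
}
catch (OverflowException e)
{
    Console.WriteLine(e.Message);
}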
1) -2147483646 is bigger than -2,147,483,648
2) 2147483648 is out of range
3) int is an alias for Int32
1)
First of all, the value in the variable is not -2147483646, it's -2147483648. Run your test again and check the result.
There is no reason that an int could not hold the value -2147483646. It's within the range -2147483648..2147483647.
2)
The compiler chooses the data type of the variable to be the type of the result of the expression. The expression returns an int value, and even if the compiler chose a larger data type for the variable, the expression would still return an int and you would get the same value as the result.
It's the operation in the expression that overflows; it's not the assignment of the result to the variable that overflows (see the sketch below).
3)
It's not converted anywhere.
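A small sketch of point 2): even assigning the result to a wider variable doesn't help, because the int + int operation itself is what overflows (assuming the default unchecked context):

int i = int.MaxValue;
long wider = i + 3;          // the addition is still int + int, so it wraps before the conversion to long
Console.WriteLine(wider);    // -2147483646, not 2147483650
long correct = (long)i + 3;  // widen an operand and the addition is done in 64 bits
Console.WriteLine(correct);  // 2147483650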
This is an overflow; your number wrapped around and went negative.
This isn't the compiler's job to catch, as a loop at runtime can cause the same thing.
int is an alias for System.Int32; they are equivalent in .NET.
This is because of the bit representation.
You use Int32, but the same goes for the smaller signed types, e.g. sbyte (8 bits).
The first bit holds the sign, and the remaining bits hold the number (in two's complement).
So with the remaining 7 bits you can represent the 128 non-negative numbers 0 to 127 (0111 1111).
When you go one further, to 1000 0000, the sign bit gets set, so the computer reads it as -128 instead.
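A quick sbyte illustration of that wrap (a sketch, using an explicit unchecked cast):

sbyte small = sbyte.MaxValue;                   // 127 = 0111 1111
sbyte wrapped = unchecked((sbyte)(small + 1));  // 1000 0000
Console.WriteLine(wrapped);                     // -128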
Arithmetic operations in .NET don't change the actual type.
You start off with a (32-bit) integer, and the +3 isn't going to change that.
That's also why you get an unexpectedly round number when you do this:
int a = 2147483647;
double b = a / 4;   // integer division happens first, so b is 536870911.0, not 536870911.75
or
int a = 2147483647;
var b = a / 4;      // b is inferred as int here, again 536870911
for that matter.
EDIT:
There is no exception because .NET lets the number overflow (wrap around).
The overflow exception will only occur on assignment operations or, as Mark explains, when you set the conditions to generate the exception.
If you want an exception to be thrown, write
var n = checked(i + 3);
instead. That will check for overflow.
Also, in C#, the default setting is not to throw exceptions on overflow, but you can change that option in your project's properties.
You could make this easier on us all by using hex notation.
Not everyone knows that the eighth Mersenne prime is 0x7FFFFFFF
Just sayin'