Why does .NET use int instead of uint in certain classes? - c#

I always come across code that uses int for things like .Count, etc, even in the framework classes, instead of uint.
What's the reason for this?

UInt32 is not CLS compliant so it might not be available in all languages that target the Common Language Specification. Int32 is CLS compliant and therefore is guaranteed to exist in all languages.

int, in c, is specifically defined to be the default integer type of the processor, and is therefore held to be the fastest for general numeric operations.

Unsigned types only behave like whole numbers if the sum or product of a signed and unsigned value will be a signed type large enough to hold either operand, and if the difference between two unsigned values is a signed value large enough to hold any result. Thus, code which makes significant use of UInt32 will frequently need to compute values as Int64. Operations on signed integer types may fail to operate like whole numbers when the operands are overly large, but they'll behave sensibly when operands are small. Operations on unpromoted arguments of unsigned types pose problems even when operands are small. Given UInt32 x; for example, the inequality x-1 < x will fail for x==0 if the result type is UInt32, and the inequality x<=0 || x-1>=0 will fail for large x values if the result type is Int32. Only if the operation is performed on type Int64 can both inequalities be upheld.
While it is sometimes useful to define unsigned-type behavior in ways that differ from whole-number arithmetic, values which represent things like counts should generally use types that will behave like whole numbers--something unsigned types generally don't do unless they're smaller than the basic integer type.

UInt32 isn't CLS-Compliant. http://msdn.microsoft.com/en-us/library/system.uint32.aspx
I think that over the years people have come to the conclusions that using unsigned types doesn't really offer that much benefit. The better question is what would you gain by making Count a UInt32?

Some things use int so that they can return -1 as if it were "null" or something like that. Like a ComboBox will return -1 for it's SelectedIndex if it doesn't have any item selected.

If the number is truly unsigned by its intrinsic nature then I would declare it an unsigned int. However, if I just happen to be using a number (for the time being) in the positive range then I would call it an int.
The main reasons being that:
It avoids having to do a lot of type-casting as most methods/functions are written to take an int and not an unsigned int.
It eliminates possible truncation warnings.
You invariably end up wishing you could assign a negative value to the number that you had originally thought would always be positive.
Are just a few quick thoughts that came to mind.
I used to try and be very careful and choose the proper unsigned/signed and I finally realized that it doesn't really result in a positive benefit. It just creates extra work. So why make things hard by mixing and matching.

Some old libraries and even InStr use negative numbers to mean special cases. I believe either its laziness or there's negative special values.

Related

Why is C# DateTime Seconds property type int and not uint?

Is the reason same as in Why is Array.Length an int, and not an uint? I am asking because I will need to do some additional casting/validation in my code which will unnecessarily reduce readability and I think there should not be any issue with just casting Seconds to uint like below:
uint modulo = (uint)DateTime.Now.Second % triggerModuloSeconds;
Using int as the default data type tend to make programming easier, since it is large enough for most common use cases, and limits the risk someone makes a mistake with unsigned arithmetic. It will often have much greater range than actually needed, but that is fine, memory is cheap, and cpus may be optimized for accessing 32-bit chunks of data.
If you wanted the most precise datatype a byte would be most appropriate, but what would you gain from using that instead of an int? There might be a point if you have millions of values, but that would be rare to store something like seconds in that way.
As mentioned in the comments, unsigned types are not CLS compliant, so will limit compatibility with other languages that do not support unsigned types. And this would be needed for a type like DateTime.
You should also prefer to use specific types over primitives. For example using TimeSpan to represent, well, a span of time.

Why does HashSet<T>.Count return an int instead of uint? [duplicate]

I always come across code that uses int for things like .Count, etc, even in the framework classes, instead of uint.
What's the reason for this?
UInt32 is not CLS compliant so it might not be available in all languages that target the Common Language Specification. Int32 is CLS compliant and therefore is guaranteed to exist in all languages.
int, in c, is specifically defined to be the default integer type of the processor, and is therefore held to be the fastest for general numeric operations.
Unsigned types only behave like whole numbers if the sum or product of a signed and unsigned value will be a signed type large enough to hold either operand, and if the difference between two unsigned values is a signed value large enough to hold any result. Thus, code which makes significant use of UInt32 will frequently need to compute values as Int64. Operations on signed integer types may fail to operate like whole numbers when the operands are overly large, but they'll behave sensibly when operands are small. Operations on unpromoted arguments of unsigned types pose problems even when operands are small. Given UInt32 x; for example, the inequality x-1 < x will fail for x==0 if the result type is UInt32, and the inequality x<=0 || x-1>=0 will fail for large x values if the result type is Int32. Only if the operation is performed on type Int64 can both inequalities be upheld.
While it is sometimes useful to define unsigned-type behavior in ways that differ from whole-number arithmetic, values which represent things like counts should generally use types that will behave like whole numbers--something unsigned types generally don't do unless they're smaller than the basic integer type.
UInt32 isn't CLS-Compliant. http://msdn.microsoft.com/en-us/library/system.uint32.aspx
I think that over the years people have come to the conclusions that using unsigned types doesn't really offer that much benefit. The better question is what would you gain by making Count a UInt32?
Some things use int so that they can return -1 as if it were "null" or something like that. Like a ComboBox will return -1 for it's SelectedIndex if it doesn't have any item selected.
If the number is truly unsigned by its intrinsic nature then I would declare it an unsigned int. However, if I just happen to be using a number (for the time being) in the positive range then I would call it an int.
The main reasons being that:
It avoids having to do a lot of type-casting as most methods/functions are written to take an int and not an unsigned int.
It eliminates possible truncation warnings.
You invariably end up wishing you could assign a negative value to the number that you had originally thought would always be positive.
Are just a few quick thoughts that came to mind.
I used to try and be very careful and choose the proper unsigned/signed and I finally realized that it doesn't really result in a positive benefit. It just creates extra work. So why make things hard by mixing and matching.
Some old libraries and even InStr use negative numbers to mean special cases. I believe either its laziness or there's negative special values.

Why is the division result between two integers truncated?

All experienced programmers in C# (I think this comes from C) are used to cast on of the integers in a division to get the decimal / double / float result instead of the int (the real result truncated).
I'd like to know why is this implemented like this? Is there ANY good reason to truncate the result if both numbers are integer?
C# traces its heritage to C, so the answer to "why is it like this in C#?" is a combination of "why is it like this in C?" and "was there no good reason to change?"
The approach of C is to have a fairly close correspondence between the high-level language and low-level operations. Processors generally implement integer division as returning a quotient and a remainder, both of which are of the same type as the operands.
(So my question would be, "why doesn't integer division in C-like languages return two integers", not "why doesn't it return a floating point value?")
The solution was to provide separate operations for division and remainder, each of which returns an integer. In the context of C, it's not surprising that the result of each of these operations is an integer. This is frequently more accurate than floating-point arithmetic. Consider the example from your comment of 7 / 3. This value cannot be represented by a finite binary number nor by a finite decimal number. In other words, on today's computers, we cannot accurately represent 7 / 3 unless we use integers! The most accurate representation of this fraction is "quotient 2, remainder 1".
So, was there no good reason to change? I can't think of any, and I can think of a few good reasons not to change. None of the other answers has mentioned Visual Basic which (at least through version 6) has two operators for dividing integers: / converts the integers to double, and returns a double, while \ performs normal integer arithmetic.
I learned about the \ operator after struggling to implement a binary search algorithm using floating-point division. It was really painful, and integer division came in like a breath of fresh air. Without it, there was lots of special handling to cover edge cases and off-by-one errors in the first draft of the procedure.
From that experience, I draw the conclusion that having different operators for dividing integers is confusing.
Another alternative would be to have only one integer operation, which always returns a double, and require programmers to truncate it. This means you have to perform two int->double conversions, a truncation and a double->int conversion every time you want integer division. And how many programmers would mistakenly round or floor the result instead of truncating it? It's a more complicated system, and at least as prone to programmer error, and slower.
Finally, in addition to binary search, there are many standard algorithms that employ integer arithmetic. One example is dividing collections of objects into sub-collections of similar size. Another is converting between indices in a 1-d array and coordinates in a 2-d matrix.
As far as I can see, no alternative to "int / int yields int" survives a cost-benefit analysis in terms of language usability, so there's no reason to change the behavior inherited from C.
In conclusion:
Integer division is frequently useful in many standard algorithms.
When the floating-point division of integers is needed, it may be invoked explicitly with a simple, short, and clear cast: (double)a / b rather than a / b
Other alternatives introduce more complication both the programmer and more clock cycles for the processor.
Is there ANY good reason to truncate the result if both numbers are integer?
Of course; I can think of a dozen such scenarios easily. For example: you have a large image, and a thumbnail version of the image which is 10 times smaller in both dimensions. When the user clicks on a point in the large image, you wish to identify the corresponding pixel in the scaled-down image. Clearly to do so, you divide both the x and y coordinates by 10. Why would you want to get a result in decimal? The corresponding coordinates are going to be integer coordinates in the thumbnail bitmap.
Doubles are great for physics calculations and decimals are great for financial calculations, but almost all the work I do with computers that does any math at all does it entirely in integers. I don't want to be constantly having to convert doubles or decimals back to integers just because I did some division. If you are solving physics or financial problems then why are you using integers in the first place? Use nothing but doubles or decimals. Use integers to solve finite mathematics problems.
Calculating on integers is faster (usually) than on floating point values. Besides, all other integer/integer operations (+, -, *) return an integer.
EDIT:
As per the request of the OP, here's some addition:
The OP's problem is that they think of / as division in the mathematical sense, and the / operator in the language performs some other operation (which is not the math. division). By this logic they should question the validity of all other operations (+, -, *) as well, since those have special overflow rules, which is not the same as would be expected from their math counterparts. If this is bothersome for someone, they should find another language where the operations perform as expected by the person.
As for the claim on perfomance difference in favor of integer values: When I wrote the answer I only had "folk" knowledge and "intuition" to back up the claim (hece my "usually" disclaimer). Indeed as Gabe pointed out, there are platforms where this does not hold. On the other hand I found this link (point 12) that shows mixed performances on an Intel platform (the language used is Java, though).
The takeaway should be that with performance many claims and intuition are unsubstantiated until measured and found true.
Yes, if the end result needs to be a whole number. It would depend on the requirements.
If these are indeed your requirements, then you would not want to store a decimal and then truncate it. You would be wasting memory and processing time to accomplish something that is already built-in functionality.
The operator is designed to return the same type as it's input.
Edit (comment response):
Why? I don't design languages, but I would assume most of the time you will be sticking with the data types you started with and in the remaining instance, what criteria would you use to automatically assume which type the user wants? Would you automatically expect a string when you need it? (sincerity intended)
If you add an int to an int, you expect to get an int. If you subtract an int from an int, you expect to get an int. If you multiple an int by an int, you expect to get an int. So why would you not expect an int result if you divide an int by an int? And if you expect an int, then you will have to truncate.
If you don't want that, then you need to cast your ints to something else first.
Edit: I'd also note that if you really want to understand why this is, then you should start looking into how binary math works and how it is implemented in an electronic circuit. It's certainly not necessary to understand it in detail, but having a quick overview of it would really help you understand how the low-level details of the hardware filter through to the details of high-level languages.

Why are the unsigned CLR types so difficult to use in C#?

I came from a mostly C/C++ background before I began using C#. One of the things I did with my first project in C# was make a class like this
class Element{
public uint Size;
public ulong BigThing;
}
I was then mortified by the fact that this requires casting:
int x=MyElement.Size;
as does
int x=5;
uint total=MyElement.Size+x;
Why did the language designers decide to make the signed and unsigned integer types not implicitly castable? And why are the unsigned types not used more throughout the .Net library? For instance String.Length can never be negative, yet it is a signed integer.
Why did the language designers decide to make the signed and unsigned integer types not
implicitly castable?
Because that could lose data or throw any exception, neither of which is generally a good thing to allow implicitly. (The implicit conversion from long to double can lose data too, admittedly, but in a different way.)
And why are the unsigned types not used more throughout the .Net library
Unsigned types aren't CLS-compliant - not all .NET languages have always supported them. For example, Visual Basic didn't have "native" support for unsigned data types in .NET 1.0 and 1.1; it was added to the language for 2.0. (You could still use them, but they weren't part of the language itself - you couldn't use the normal arithmetic operators, for example.)
Along with Jon's answer, just because an unsigned number can't be negative doesn't mean it isn't bigger than a signed one. uint is 0 to 4,294,967,295 but int is -2,147,483,648 to 2,147,483,647. Plenty of room above int's max for loss.
Because implicitly converting an unsigned integer of 3B into an signed integer is going to blow up.
Unsigned has twice the maximum value of signed. It's the same reason you can't cast a long to an int.
I was then mortified by the fact that this requires casting:
int x=MyElement.Size;
But you are contradicting yourself here. If you really (really) need Size to be unsigned than assigning it to (signed) x is an error. A deep flaw in your code.
For instance String.Length can never be negative, yet it is a signed integer
But String.IndexOf can return a negative number, and it would be awkward if String.Length and Index values where of different types.
And while in theory there would be merit in an unsigned String.Length (4 GB cap), in practice even the current 2GB is large enough (because strings of that length are rare and unworkable anyway).
So the real answer is: Why use unsigned in the first place?
On the second count: because they wanted the CLR to be compatible with languages that don't have unsigned datatypes (read: VB.NET).

Converting to int16, int32, int64 - how do you know which one to choose?

I often have to convert a retreived value (usually as a string) - and then convert it to an int. But in C# (.Net) you have to choose either int16, int32 or int64 - how do you know which one to choose when you don't know how big your retrieved number will be?
Everyone here who has mentioned that declaring an Int16 saves ram should get a downvote.
The answer to your question is to use the keyword "int" (or if you feel like it, use "Int32").
That gives you a range of up to 2.4 billion numbers... Also, 32bit processors will handle those ints better... also (and THE MOST IMPORTANT REASON) is that if you plan on using that int for almost any reason... it will likely need to be an "int" (Int32).
In the .Net framework, 99.999% of numeric fields (that are whole numbers) are "ints" (Int32).
Example: Array.Length, Process.ID, Windows.Width, Button.Height, etc, etc, etc 1 million times.
EDIT: I realize that my grumpiness is going to get me down-voted... but this is the right answer.
Just wanted to add that... I remembered that in the days of .NET 1.1 the compiler was optimized so that 'int' operations are actually faster than byte or short operations.
I believe it still holds today, but I'm running some tests now.
EDIT: I have got a surprise discovery: the add, subtract and multiply operations for short(s) actually return int!
Repeatedly trying TryParse() doesn't make sense, you have a field already declared. You can't change your mind unless you make that field of type Object. Not a good idea.
Whatever data the field represents has a physical meaning. It's an age, a size, a count, etc. Physical quantities have realistic restraints on their range. Pick the int type that can store that range. Don't try to fix an overflow, it would be a bug.
Contrary to the current most popular answer, shorter integers (like Int16 and SByte) do often times take up less space in memory than larger integers (like Int32 and Int64). You can easily verify this by instantiating large arrays of sbyte/short/int/long and using perfmon to measure managed heap sizes. It is true that many CLR flavors will widen these integers for CPU-specific optimizations when doing arithmetic on them and such, but when stored as part of an object, they take up only as much memory as is necessary.
So, you definitely should take size into consideration especially if you'll be working with large list of integers (or with large list of objects containing integer fields). You should also consider things like CLS-compliance (which disallows any unsigned integers in public members).
For simple cases like converting a string to an integer, I agree an Int32 (C# int) usually makes the most sense and is likely what other programmers will expect.
If we're just talking about a couple numbers, choosing the largest won't make a noticeable difference in your overall ram usage and will just work. If you are talking about lots of numbers, you'll need to use TryParse() on them and figure out the smallest int type, to save ram.
All computers are finite. You need to define an upper limit based on what you think your users requirements will be.
If you really have no upper limit and want to allow 'unlimited' values, try adding the .Net Java runtime libraries to your project, which will allow you to use the java.math.BigInteger class - which does math on nearly-unlimited size integer.
Note: The .Net Java libraries come with full DevStudio, but I don't think they come with Express.

Categories

Resources