Regarding an implementation detail of dotnet/runtime - C#

I've recently been excited to read parts of corefx (which recently moved to the dotnet/runtime repo). I came across this Array.Copy overload:
public static void Copy(Array sourceArray, long sourceIndex, Array destinationArray, long destinationIndex, long length)
{
    int isourceIndex = (int)sourceIndex;
    int idestinationIndex = (int)destinationIndex;
    int ilength = (int)length;
    if (sourceIndex != isourceIndex)
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.sourceIndex, ExceptionResource.ArgumentOutOfRange_HugeArrayNotSupported);
    if (destinationIndex != idestinationIndex)
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.destinationIndex, ExceptionResource.ArgumentOutOfRange_HugeArrayNotSupported);
    if (length != ilength)
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.length, ExceptionResource.ArgumentOutOfRange_HugeArrayNotSupported);
    Copy(sourceArray, isourceIndex, destinationArray, idestinationIndex, ilength);
}
The thing that caught my eye is that the method converts the long arguments to int (which discards the most significant bits), compares them to the original arguments, and throws an exception if they are not equal. As far as I can see, the exception is thrown if and only if any of these long arguments is greater than int.MaxValue or less than int.MinValue. The question is: why are those arguments long in the first place? Why not just make them int?

Only Microsoft can answer the "why?" question definitively. That said…
That method, and others like it, are provided as a convenience. There are situations where one might have long values describing the region of the array(s) to be copied, for example when working with unmanaged memory that is later copied piece-wise to a managed array.
Without such overloads, the caller would have to do the same work to have the same degree of safety, and quite possibly would do it incorrectly (e.g. just cast the value without checking for overflow).
By providing the long-parameter overloads, the framework offers both convenience and better resistance to client-code bugs. This way the client can still use long values when appropriate, and pass them directly to the framework API without having to do extra work to convert them properly.
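To make the convenience concrete, here is a rough sketch of what a caller holding 64-bit offsets would otherwise have to write by hand (the arrays and values below are made up purely for illustration):
using System;

class CopyExample
{
    static void Main()
    {
        int[] source = new int[100];
        int[] destination = new int[100];

        // Imagine these came from 64-bit arithmetic (file positions, unmanaged buffer sizes, ...).
        long sourceIndex = 10;
        long length = 50;

        // Without the long overload, every caller would need this narrowing check by hand
        // (and the same again for length):
        if (sourceIndex > int.MaxValue || sourceIndex < int.MinValue)
            throw new ArgumentOutOfRangeException(nameof(sourceIndex));
        Array.Copy(source, (int)sourceIndex, destination, 0, (int)length);

        // With the long overload, the framework performs the check and throws for "huge" values:
        Array.Copy(source, sourceIndex, destination, 0L, length);
    }
}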

The reason for this is that the API defines the parameters as long to support a large range of values, but this implementation does not support copying "huge arrays", so it throws an exception. The API does not say that "huge arrays" shouldn't be supported; it's just this implementation. Other implementations might support any array size a long can describe, or this implementation could be changed to support it. If the parameter type were int, a breaking API change would be needed.
The .NET APIs usually handle lengths and sizes with long anyway, so nothing needs to change if there is ever a need for huge values.

Related

Why can I not encode const structs?

I wish to encode a hard-coded value of a const Point struct.
Why does the compiler not allow either internal or arbitrary structs to be replaced during compilation? Since the internal bitwise representation can be established at compile time (in both cases), there is no apparent reason for the restriction.
My question is: Is there a way to hard-code a predefined set of bytes in C# that can be interpreted at compile time as the appropriate type, since all structs have a predetermined memory layout?
EDIT:
To clarify: Compile time means C# -> IL byte-code as stored in the output assembly.
The use case example:
public void Draw(Bitmap bmp, Point Location = new Point(0,0)) // invalid
This is an error because new Point(0,0) cannot be evaluated at compile time. I can pass in int X = 0, int Y = 0, or a nullable Point? Location = null and construct the struct inside the method (sketched below), or overload the method without the optional parameters and have it call the main method with the default values, but that technique incurs a performance penalty in terms of the extra method calls required.
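For what it's worth, the nullable-parameter workaround mentioned above might look roughly like this (a sketch only; the Renderer class and the Draw body are illustrative, and System.Drawing is assumed for Point/Bitmap):
using System.Drawing;

class Renderer
{
    // null IS a compile-time constant, so it can serve as the default;
    // the real default Point is constructed inside the method instead.
    public void Draw(Bitmap bmp, Point? location = null)
    {
        Point loc = location ?? new Point(0, 0);
        // ... draw bmp at loc ...
    }
}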
This may not be appropriate for all structs, since the constructor could rely on, or change, external state or randomness.
FINAL EDIT:
This is now possible, making the question moot. Yay.
The issue was the incorrect belief that the new keyword always implied heap allocation or dynamic stack allocation; with constant arguments, neither is the case.
Why does the compiler not allow either internal or arbitrary structs to be replaced during compilation? Since the internal bitwise representation can be established at compile time (in both cases), there is no apparent reason for the restriction.
All not-implemented features are not implemented for the same reason. To be implemented, a feature must be thought of, judged to be appropriate, designed, specified, implemented, tested, documented and shipped. All those things must happen. For your proposed feature, none of them happened. Therefore, no feature.
Programming language designers are not required to provide a justification for why a feature was not implemented. Rather, the people who want the feature are required to provide a reason why programming language designers should spend their valuable time implementing a feature that you want.
The C# design process is open, and the compiler source code is available. Why have you not designed and implemented the feature? If it is fair for you to ask the designers that question, it's fair for them to ask it of you! You're a computer programmer; get busy programming computers and build the feature if you think it is worthwhile, and then convince the language team to accept your pull request. If you don't think it is worth your time to do that, well, probably the language designers feel the same way.
My question is: Is there a way to hard-code a predefined set of bytes in C# that can be interpreted at compile time as the appropriate type, since all structs have a predetermined memory layout?
I'm not sure what you mean by "at compile time"; can you clarify?
There are ways to store byte arrays in an assembly, sure. Make a C# program with a byte array initialized to all constant values and ildasm the assembly; you'll see the code that the C# compiler generates to get the byte array image out of the metadata and into the array.
You could implement similar shenanigans to get a byte array, fix the array in place, and then use unsafe pointer magic to reinterpret the array bytes as struct bytes. That sounds extraordinarily dangerous, and might mess up the performance of the garbage collector. I would not wish to do so myself, but you seem pretty keen on this feature, so go for it and report back what you find out!
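If you do try that, a somewhat tamer modern variant of the reinterpret-the-bytes idea uses MemoryMarshal rather than pinning and raw pointers. This is only a sketch of the general technique, not anything the compiler does for you; the Point3 struct is made up:
using System;
using System.Runtime.InteropServices;

struct Point3 { public int X, Y, Z; }

class Reinterpret
{
    static void Main()
    {
        // Bytes that could have been baked into the assembly as a constant blob (little-endian).
        byte[] raw = { 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0 };

        // Reinterpret the byte image as a struct (works for structs with no reference fields).
        Point3 p = MemoryMarshal.Read<Point3>(raw);

        Console.WriteLine($"{p.X}, {p.Y}, {p.Z}"); // 1, 2, 3 on a little-endian machine
    }
}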
Alternatively, C++/CLI probably implements the feature you want; I've never used it, but that seems like the sort of thing it would do. You could write a little program in C++/CLI that does what you want, and then either (1) use that program's assembly as a dependency of your assembly, (2) compile it as a netmodule and link it into your assembly via the usual netmodule linking gear (yuck), or (3) deduce how they implemented the feature and do the same.
You can convert a struct into a byte array and encode the byte array. It may work like this (please compile and fix any errors; typing from mobile):
Suppose your struct is:
public struct TestStruct
{
    public int x;
    public string y;
}
// Requires: using System; using System.Linq; using System.Text;
public byte[] GetTestByte(TestStruct c)
{
    var intGuy = BitConverter.GetBytes(c.x);
    var stringGuy = Encoding.UTF8.GetBytes(c.y);
    // String bytes first, then the int bytes, then a single zero byte as a terminator.
    var both = stringGuy.Concat(intGuy).Concat(new byte[1]).ToArray();
    return both;
}
Now you can encode a byte array like:
Convert.ToBase64String(byteArray);
There is no direct way to encode a struct. It's a value type, and is probably there as a legacy of C and C++.

Declaring a field as a bit (as a single bit, as opposed to a byte multiple) in C#

C# 6.0 in a Nutshell by Joseph Albahari and Ben Albahari (O’Reilly).
Copyright 2016 Joseph Albahari and Ben Albahari, 978-1-491-92706-9.
introduces, on page 312, BitArray as one of the collection types .NET provides:
BitArray
A BitArray is a dynamically sized collection of compacted bool values.
It is more memory-efficient than both a simple array of bool and a
generic List of bool, because it uses only one bit for each value,
whereas the bool type otherwise occupies one byte for each value.
It's nice to have the possibility of declaring a collection of bits instead of working with bytes when you are interested in binary values only, but what about declaring a single-bit field?
like:
public class X
{
    public [bit-type] MyBit { get; set; }
}
Does .NET not support that?
The existing posts that touch on the topic talk about setting individual bits within, ultimately, a byte variable. I am asking whether .NET, having thought of supporting bit values in a collection, also supports declaring a single, non-collection variable of that kind.
So your question is whether .NET supports this or not. The answer is no.
Why? It's fundamentally possible to have such a feature. But the demand is really low. It's better to invest the developer time elsewhere.
If you want to make use of memory below the byte granularity you will need to build this yourself. BitArray is not intrinsic to the runtime. It manipulates the bits of some bigger type (I think it's int-based). You can do the same thing.
BitVector32 is a built-in struct that you can use to individually address 32 bits.
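As a rough illustration (my example, not from the original answer), BitVector32 from System.Collections.Specialized can be used like this:
using System;
using System.Collections.Specialized;

class BitFlags
{
    static void Main()
    {
        var bits = new BitVector32(0);

        // CreateMask produces masks for bit 0, bit 1, ... in sequence.
        int isVisible = BitVector32.CreateMask();
        int isEnabled = BitVector32.CreateMask(isVisible);

        bits[isVisible] = true;   // set bit 0
        bits[isEnabled] = false;  // clear bit 1

        Console.WriteLine(bits[isVisible]); // True
        Console.WriteLine(bits.Data);       // 1 (the underlying int)
    }
}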
As you can see in the .NET Reference Source, BitArray internally stores the values in an array of int:
public BitArray(int length, bool defaultValue) {
    ...
    m_array = new int[GetArrayLength(length, BitsPerInt32)];
    m_length = length;
    int fillValue = defaultValue ? unchecked(((int)0xffffffff)) : 0;
    for (int i = 0; i < m_array.Length; i++) {
        m_array[i] = fillValue;
    }
    _version = 0;
}
So the least that gets allocated with a BitArray is already an entire int, and even more once you store data in it. This also makes sense, since the memory used for addressing anything comes in whole data words; those are, depending on the architecture, at least 4 bytes long already.
You can of course define your own type for storing a single bit, but it will also take at least a byte to do so, if not a complete word plus a byte if it is a reference type. Memory is allocated to a program by the OS in terms of memory addresses, which usually address bytes, so anything smaller is not really available.
It takes a lot of binary values to make up for the space already lost by using the type in the first place, so the only really useful application of this technique of storing bits is when you have lots of them, so that you can profit from the 8:1 memory ratio.
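If you did want to roll the same trick yourself, a hand-made flag holder might look like the sketch below. The type and member names are made up, and the point above stands: a single flag still costs at least a whole byte unless you pack several together.
// A hypothetical hand-rolled flag holder: eight boolean values share one backing byte.
struct PackedFlags
{
    private byte _bits;

    public bool Get(int index) => (_bits & (1 << index)) != 0;

    public void Set(int index, bool value)
    {
        if (value) _bits |= (byte)(1 << index);
        else       _bits &= (byte)~(1 << index);
    }
}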

Use uint or int

I definitely know the basic differences between unsigned integers (uint) and signed integers (int).
I noticed that in the .NET public classes, a property called Length always uses a signed integer.
Maybe this is because unsigned integers are not CLS compliant.
However, for example, in my static function:
public static double GetDistributionDispersion(int tokens, int[] positions)
The parameter tokens and all elements of positions cannot be negative; if they are, the final result is useless. So if I use int for both tokens and positions, I have to check the values every time this function is called (and return nonsense values or throw an exception if negative values are found?), which is tedious.
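For illustration, the defensive checks I mean would look something like this (just a sketch of the validation; the actual dispersion computation is omitted):
// Requires: using System;
public static double GetDistributionDispersion(int tokens, int[] positions)
{
    if (tokens < 0)
        throw new ArgumentOutOfRangeException(nameof(tokens), "tokens must be non-negative.");
    foreach (int position in positions)
        if (position < 0)
            throw new ArgumentOutOfRangeException(nameof(positions), "all positions must be non-negative.");

    // ... the actual dispersion computation would go here ...
    return 0.0;
}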
OK, then we should use uint for both parameters. This really makes sense to me.
I found, however, that a lot of public APIs almost always use int. Does that mean that inside their implementations they always check each value for negativity (where it is supposed to be non-negative)?
So, in a word, what should I do?
I could provide two cases:
This function will only be called by myself in my own solution;
This function will be used as a library by others in other team.
Should we use different schemes for these two cases?
Peter
P.S.: I did do a lot of research, and there is still no reason to convince me not to use uint :-)
I see three options.
Use uint. The framework doesn't because it's not CLS compliant. But do you have to be CLS compliant? (There are also some not-fun issues with arithmetic that crop up; it's no fun to cast all over the place. I tend to avoid uint for this reason.)
Use int but use contracts:
Contract.Requires(tokens >= 0);
Contract.Requires(Contract.ForAll(positions, position => position >= 0));
Make explicit exactly what you require.
Create a custom type that encapsulates the requirement:
struct Foo {
    public readonly int foo;

    public Foo(int foo) {
        Contract.Requires(foo >= 0);
        this.foo = foo;
    }

    public static implicit operator int(Foo foo) {
        return foo.foo; // static operator: 'this' is not available here
    }

    public static explicit operator Foo(int foo) {
        return new Foo(foo);
    }
}
Then:
public static double GetDistributionDispersion(Foo tokens, Foo[] positions) { }
Ah, nice. We don't have to worry about it in our method. If we're getting a Foo, it's valid.
You have a reason for requiring non-negativity in your domain; it's modeling some concept. You might as well promote that concept to a bona fide object in your domain model and encapsulate all the concepts that come with it.
I use uint.
Yes, the other answers are all correct... but I still prefer uint, for one reason:
It makes the interface clearer. If a parameter (or a return value) is unsigned, it is because it cannot be negative (have you ever seen a negative collection count?). Otherwise, I need to check parameters, document the parameters (and return value) that cannot be negative, and then write additional unit tests checking parameters and return values (wow, and someone complains about doing casts? Are int casts so frequent? No, not in my experience).
Additionally, users are required to test return values for negativity, which may be worse.
I don't care about CLS compliance, so why should I? From my point of view the question should be reversed: why should I use int when a value cannot be negative?
As for returning negative values to carry additional information (errors, for example): I don't like such C-ish design. I think there is a more modern way to accomplish this (i.e. using exceptions, or alternatively out values with a boolean return value).
Yes, go for int. I once tried to go uint myself, only to refactor everything back as soon as I had had my share of annoying casts in my lib.
I guess the decision to go for int is for historical reasons, where a result of -1 often indicates some sort of error (for example in IndexOf).
int. It gives more flexibility when you need to make an API change in the near future.
For example, negative indexes are often used by Python to indicate counting backwards from the end of a string.
Values also turn negative when they overflow, and an assertion will catch that.
It's a trade-off between speed and robustness.
As you have read, uint violates the Common Language Specification rules. But then, how often will the function be used? And if it is going to be combined with other methods, it is normal for them to expect int parameters, which would get you into the trouble of casting the values.
If you are going to make it available as a library, then it's better to stick to the conventional int; otherwise you need to take care of the conditions where you might not get a positive value, which means littering the checks across the pages.
Interesting Read - SO link

Should I use int or Int32

In C#, int and Int32 are the same thing, but I've read a number of times that int is preferred over Int32 with no reason given. Is there a reason, and should I care?
The two are indeed synonymous; int will be a little more familiar-looking, while Int32 makes the 32-bitness more explicit to those reading your code. I would be inclined to use int where I just need 'an integer', and Int32 where the size is important (cryptographic code, structures), so future maintainers will know it's safe to enlarge an int if appropriate but should take care when changing Int32s in the same way.
The resulting code will be identical: the difference is purely one of readability or code appearance.
ECMA-334:2006 C# Language Specification (p18):
Each of the predefined types is shorthand for a system-provided type. For example, the keyword int refers to the struct System.Int32. As a matter of style, use of the keyword is favoured over use of the complete system type name.
They both declare 32 bit integers, and as other posters stated, which one you use is mostly a matter of syntactic style. However they don't always behave the same way. For instance, the C# compiler won't allow this:
public enum MyEnum : Int32
{
    member1 = 0
}
but it will allow this:
public enum MyEnum : int
{
    member1 = 0
}
Go figure.
I always use the system types - e.g., Int32 instead of int. I adopted this practice after reading Applied .NET Framework Programming - author Jeffrey Richter makes a good case for using the full type names. Here are the two points that stuck with me:
Type names can vary between .NET languages. For example, in C#, long maps to System.Int64 while in C++ with managed extensions, long maps to Int32. Since languages can be mixed-and-matched while using .NET, you can be sure that using the explicit class name will always be clearer, no matter the reader's preferred language.
Many framework methods have type names as part of their method names:
BinaryReader br = new BinaryReader( /* ... */ );
float val = br.ReadSingle(); // OK, but it looks a little odd...
Single val = br.ReadSingle(); // OK, and is easier to read
int is a C# keyword and is unambiguous.
Most of the time it doesn't matter, but two things go against Int32:
You need to have a "using System;" statement; using "int" requires no using directive.
It is possible to define your own class called Int32 (which would be silly and confusing); int always means int.
As already stated, int = Int32. To be safe, be sure to always use int.MinValue/int.MaxValue when implementing anything that cares about the data type boundaries. Suppose .NET decided that int would now be Int64; your code would be less dependent on the bounds.
Byte size for types is not too interesting when you only have to deal with a single language (and for code where you don't have to remind yourself about math overflows). The part that becomes interesting is when you bridge between one language and another, C# to a COM object, etc., or when you're doing some bit-shifting or masking and you need to remind yourself (and your code-review co-workers) of the size of the data.
In practice, I usually use Int32 just to remind myself what size they are because I do write managed C++ (to bridge to C# for example) as well as unmanaged/native C++.
long, as you probably know, is 64 bits in C#, but in native C++ it ends up as 32 bits, and char is Unicode/16 bits in C# while in C++ it is 8 bits. But how do we know this? The answer is: because we've looked it up in the manual and it said so.
With time and experience, you will become more type-conscious when you write code that bridges between C# and other languages (some readers here are thinking "why would you?"), but IMHO I believe it is a better practice, because I cannot remember what I coded last week (or I don't have to specify in my API document that "this parameter is a 32-bit integer").
In F# (although I've never used it), they define int, int32, and nativeint. The same question should arise: "which one do I use?". As others have mentioned, in most cases it should not matter (it should be transparent). But I for one would choose int32 and uint32, just to remove the ambiguity.
I guess it would just depend on what applications you are coding, who's using them, what coding practices you and your team follow, etc., to justify when to use Int32.
Addendum:
Incidentally, since I answered this question a few years ago, I've started using both F# and Rust. F# is all about type inference, and when bridging/interop'ing between C# and F# the native types match, so there is no concern; I've rarely had to explicitly define types in F# (it's almost a sin if you don't use type inference). In Rust, they have completely removed such ambiguities and you have to use i32 vs u32; all in all, reducing ambiguity helps reduce bugs.
There is no difference between int and Int32, but as int is a language keyword many people prefer it stylistically (just as with string vs String).
In my experience it's been a convention thing. I'm not aware of any technical reason to use int over Int32, but it's:
Quicker to type.
More familiar to the typical C# developer.
A different color in the default Visual Studio syntax highlighting.
I'm especially fond of that last one. :)
I always use the aliased types (int, string, etc.) when defining a variable and use the real name when accessing a static method:
int x, y;
...
String.Format ("{0}x{1}", x, y);
It just seems ugly to see something like int.TryParse(). There's no other reason I do this other than style.
Though they are (mostly) identical (see below for the one [bug] difference), you definitely should care and you should use Int32.
The name for a 16-bit integer is Int16. For a 64 bit integer it's Int64, and for a 32-bit integer the intuitive choice is: int or Int32?
The question of the size of a variable of type Int16, Int32, or Int64 answers itself, but the question of the size of a variable of type int is perfectly valid, and questions, no matter how trivial, are distracting, lead to confusion, waste time, hinder discussion, etc. (the fact that this question exists proves the point).
Using Int32 promotes awareness of the developer's choice of type. How big is an int again? Oh yeah, 32. The likelihood that the size of the type will actually be considered is greater when the size is included in the name. Using Int32 also promotes knowledge of the other choices. When people aren't forced to at least recognize that there are alternatives, it becomes far too easy for int to become "THE integer type".
The class within the framework intended to interact with 32-bit integers is named Int32. Once again, which is more intuitive, less confusing, and free of an (unnecessary) translation (not a translation in the system, but in the mind of the developer): int lMax = Int32.MaxValue or Int32 lMax = Int32.MaxValue?
int isn't a keyword in all .NET languages.
Although there are arguments why it's not likely to ever change, int may not always be an Int32.
The drawbacks are two extra characters to type and [bug].
This won't compile
public enum MyEnum : Int32
{
    AEnum = 0
}
But this will:
public enum MyEnum : int
{
    AEnum = 0
}
I know that the best practice is to use int, and all MSDN code uses int. However, there's no reason beyond standardisation and consistency as far as I know.
You shouldn't care. You should use int most of the time. It will help with porting your program to a wider architecture in the future (currently int is an alias for System.Int32, but that could change). Only when the bit width of the variable matters (for instance, to control the layout in memory of a struct) should you use Int32 and the others (with the associated "using System;").
int is the C# language's shortcut for System.Int32
Whilst this does mean that Microsoft could change this mapping, a post on FogCreek's discussions stated [source]
"On the 64 bit issue -- Microsoft is indeed working on a 64-bit version of the .NET Framework but I'm pretty sure int will NOT map to 64 bit on that system.
Reasons:
1. The C# ECMA standard specifically says that int is 32 bit and long is 64 bit.
2. Microsoft introduced additional properties & methods in Framework version 1.1 that return long values instead of int values, such as Array.GetLongLength in addition to Array.GetLength.
So I think it's safe to say that all built-in C# types will keep their current mapping."
int is the same as System.Int32 and when compiled it will turn into the same thing in CIL.
We use int by convention in C# since C# wants to look like C and C++ (and Java) and that is what we use there...
BTW, I do end up using System.Int32 when declaring imports of various Windows API functions. I am not sure if this is a defined convention or not, but it reminds me that I am going to an external DLL...
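As an illustration of that habit (my example, not necessarily the answerer's own code), a P/Invoke declaration with the width spelled out might look like this:
using System;
using System.Runtime.InteropServices;

static class NativeMethods
{
    // Spelling out Int32 mirrors the native signature: int GetWindowTextLengthW(HWND hWnd).
    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    internal static extern Int32 GetWindowTextLength(IntPtr hWnd);
}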
Once upon a time, the int datatype was pegged to the register size of the machine targeted by the compiler. So, for example, a compiler for a 16-bit system would use a 16-bit integer.
However, we thankfully don't see much 16-bit any more, and when 64-bit started to get popular people were more concerned with making it compatible with older software and 32-bit had been around so long that for most compilers an int is just assumed to be 32 bits.
I'd recommend using Microsoft's StyleCop.
It is like FxCop, but for style-related issues. The default configuration matches Microsoft's internal style guides, but it can be customised for your project.
It can take a bit to get used to, but it definitely makes your code nicer.
You can include it in your build process to automatically check for violations.
It makes no difference in practice and in time you will adopt your own convention. I tend to use the keyword when assigning a type, and the class version when using static methods and such:
int total = Int32.Parse("1009");
int and Int32 are the same; int is an alias for Int32.
You should not care. If size is a concern I would use byte, short, int, then long. The only reason you would use an int larger than int32 is if you need a number higher than 2147483647 or lower than -2147483648.
Other than that I wouldn't care, there are plenty of other items to be concerned with.
int is an alias for System.Int32, as defined in this table:
Built-In Types Table (C# Reference)
I use int in the event that Microsoft changes the default implementation for an integer to some new fangled version (let's call it Int32b).
Microsoft can then change the int alias to Int32b, and I don't have to change any of my code to take advantage of their new (and hopefully improved) integer implementation.
The same goes for any of the type keywords.
You should not care in most programming languages, unless you need to write very specific mathematical functions or code optimized for one specific architecture... Just make sure the size of the type is enough for you (use something bigger than an int if you know you'll need more than 32 bits, for example).
It doesn't matter. int is the language keyword and Int32 its actual system type.
See also my answer here to a related question.
Use of int or Int32 is the same; int is just sugar to simplify the code for the reader.
Use the nullable variant int? or Int32? when you work with databases on fields containing null. That will save you from a lot of runtime issues.
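For example (a sketch assuming an open data reader named reader and a nullable integer column at ordinal 0):
// Column 0 may contain NULL in the database; int? can represent that directly.
int? quantity = reader.IsDBNull(0) ? (int?)null : reader.GetInt32(0);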
Some compilers have different sizes for int on different platforms (not C# specific)
Some coding standards (MISRA C) requires that all types used are size specified (i.e. Int32 and not int).
It is also good to specify prefixes for different type variables (e.g. b for 8 bit byte, w for 16 bit word, and l for 32 bit long word => Int32 lMyVariable)
You should care because it makes your code more portable and more maintainable.
Portable may not be applicable to C# if you are always going to use C# and the C# specification will never change in this regard.
Maintainable, imho, will always be applicable, because the person maintaining your code may not be aware of this particular C# specification and may miss a bug where the int occasionally becomes more than 2147483647.
In a simple for-loop that counts, for example, the months of the year, you won't care, but when you use the variable in a context where it could possibly overflow, you should care.
You should also care if you are going to do bit-wise operations on it.
Using the Int32 type requires a namespace reference to System, or fully qualifying (System.Int32). I tend toward int, because it doesn't require a namespace import, therefore reducing the chance of namespace collision in some cases. When compiled to IL, there is no difference between the two.
According to the Immediate Window in Visual Studio 2012 Int32 is int, Int64 is long. Here is the output:
sizeof(int)
4
sizeof(Int32)
4
sizeof(Int64)
8
Int32
int
base {System.ValueType}: System.ValueType
MaxValue: 2147483647
MinValue: -2147483648
Int64
long
base {System.ValueType}: System.ValueType
MaxValue: 9223372036854775807
MinValue: -9223372036854775808
int
int
base {System.ValueType}: System.ValueType
MaxValue: 2147483647
MinValue: -2147483648
Also consider Int16. If you need to store an integer in memory in your application and you are concerned about the amount of memory used, then you could go with Int16, since it uses less memory and has a smaller min/max range than Int32 (which is what int is).
It's 2021 and I've read all the answers. Most say it's basically the same (it's an alias), or that it depends on "what you like", or "by convention use int...". No answer gives you a clear when, where, and why to use Int32 over int. That's why I'm here.
98% of the time, you can get away with int, and that's perfectly fine. What about the other 2%?
IO with records (structs, native types, organization and compression). Someone said a useless application is one that can read and manipulate data but is not actually capable of writing new data to a defined storage. But in order not to reinvent the wheel, at some point, those dealing with old data have to retrieve the documentation on how to read it. And chances are it was compiled in an era where a long was always a 32-bit integer.
It happened before, when some had trouble remembering that a db is a byte, a dw is a word, a dd is a double word, but how many bits was that again? And that will likely happen again with C# 43.0 on a 256-bit platform... where the (future) developers have never heard of "by convention, use int instead of Int32". That's the 2% where Int32 matters over int. MSDN saying today that int is recommended is irrelevant; it usually works with the current C# version, but that may get dropped from future MSDN pages in 2028, or 2034? Fewer and fewer people encounter WORD and DWORD today, yet two decades ago they were common. The same thing will happen to int in the very case of dealing with precise, fixed-length data.
In memory, a ushort (UInt16) can be a Decimal as long as its fractional part is zero, it is positive or zero, and it does not exceed 65535. But inside a file, it must be a short, 16 bits long. And when you read documentation about a file structure from another era (inside the source code), you realize there are 3545 record definitions, some nested inside others, each record having between a couple and hundreds of fields of varying types.
Somewhere in 2028 someone thought he could just get away with Ctrl-H-ing int to Int32, whole word only, match case... ~67000 changes in the whole solution. Hit Run and still get CTDs. Clap clap clap. Go figure which ints you should have changed to Int32 and which ones you should have changed to var. It is also worth pointing out that pointers are useful when you deal with terabytes of data (have a virtual representation of an entire planet on some cloud, download on demand, and render to the user's screen). Pointers are really fast in the ~1% of cases where there is so much data to compute in real time that you must trade with unsafe code. Again, it's to come up with an actually useful application, instead of being fancy and wasting time porting to managed code. So, be careful: is IntPtr 32 bits or 64 bits already? Could you get away with your code without caring how many bytes you read/skip? Or just go (Int32*) int32Ptr = (Int32*) int64Ptr;...
An even more factual example is a file containing data and their respective processing commands (methods in the source code), such as internal branching (a conditional continue, or a jump to an address if the test fails):
An IfTest record in the file says: if value equals someConstant, jump to address. Here address is a 16-bit integer representing a relative pointer inside the file (you can go back towards the start of the file up to 32768 bytes, or up to 32767 bytes further down). But 10 years later, platforms can handle larger files and larger data, and now you have 32-bit relative addresses. Your method in the source code was named IfTestMethod(...); now how would you name the new one? IfTestMethodInt() or IfTestMethod32()? Would you also rename the old method IfTestMethodShort() or IfTestMethod16()? Then a decade later, you get a new command with long (Int64) relative addresses... What about a 128-bit command some 10 years later? Be consistent! Principles are great, but sometimes logic is better.
The problem is not me or you writing code today, which appears okay to us. It is being in the place of the one person trying to understand what we wrote 10 or 20 years later: how much does it cost in time (= money) to come up with working, updated code? Being explicit, or writing redundant comments, will actually save time. Which one do you prefer: Int32 val; or var val; // 32 bits.
Also, working with foreign data from other platforms or compile directives is a concept (today it involves Interop, COM, P/Invoke...), and it is a concept we cannot get rid of, whatever the era, because it takes time to update (reformat) data (via serialization, for example). Upgrading DLLs to managed code also takes time. We took time to leave assembler behind and go full C. We are taking time to move from 32-bit data to 64-bit, yet we still need to care about 8- and 16-bit. What next in the future? Move from 128-bit to 256, or directly to 1024? Do not assume a keyword that is explicit to you will remain explicit for the people reading your documentation 20 years later (and documentation usually contains errors, mainly because of copy/paste).
So here it is: where should you use Int32 today over int?
It is when you are producing code that is data-size sensitive (IO, network, cross-platform data...), and at some point in the future - it could be decades later - someone will have to understand and port your code. The key reason is era-based. For 1000 lines of code it's okay to use int; for 100000 lines it isn't anymore. That's a rare duty only a few will have to perform, and hell yeah, they struggle, if only someone had been a little more explicit instead of relying on "by convention", or "it looks pretty in the IDE, Int32 is so ugly", or "they are the same, don't bother, it's a waste of time to write those two extra digits while holding the shift key", or "int is unambiguous", or "those who don't like int are just VB fanboys - go learn C#, you noob" (yeah, that's the underlying meaning of a few comments right here).
Do not take what I wrote as a generalized prescription, nor as an attempt to promote Int32 in all cases. I clearly stated the specific case (as it seems to me this was not clear from the other answers), to advocate for the few getting blamed by their supervisors for being fancy and writing Int32, while the very same supervisor does not understand what takes so long to rewrite that C DLL in C#. It's an edge case, but at least for those reading, "Int32" has at least one purpose in its life.
The point can be further discussed by turning the question the other way around: why not just get rid of Int32, Int64, and all the other variants in future C# specifications? What would that imply?

Using Unsigned Primitive Types

Most of the time we represent concepts which can never be less than 0. For example, to declare a length, we write:
int length;
The name expresses its purpose well but you can assign negative values to it. It seems that for some situations, you can represent your intent more clearly by writing it this way instead:
uint length;
Some disadvantages that I can think of:
unsigned types (uint, ulong, ushort) are not CLS compliant, so you can't use them with other languages that don't support them
.NET classes use signed types most of the time, so you have to cast
Thoughts?
“When in Rome, do as the Romans do.”
While there is theoretically an advantage in using unsigned values where applicable because it makes the code more expressive, this is simply not done in C#. I'm not sure why the developers initially didn't design the interfaces to handle uints and make the type CLS compliant but now the train has left the station.
Since consistency is generally important I'd advise taking the C# road and using ints.
If you decrement a signed number with a value of 0, it becomes negative and you can easily test for this. If you decrement an unsigned number with a value of 0, it underflows and becomes the maximum value for the type - somewhat more difficult to check for.
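A small sketch of that difference, assuming default (unchecked) arithmetic:
using System;

class UnderflowDemo
{
    static void Main()
    {
        int signed = 0;
        signed--;                      // -1: easy to test with (signed < 0)

        uint unsigned = 0;
        unsigned--;                    // wraps around to uint.MaxValue, no exception by default

        Console.WriteLine(signed);     // -1
        Console.WriteLine(unsigned);   // 4294967295
    }
}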
Your second point is the most important. Generally you should just use int, since that's a pretty good "catch-all" for integer values. I would only use uint if you absolutely need the ability to count higher than int allows, but without using the extra memory long requires (it's not much more memory, so don't be cheap :-p).
I think the subtle use of uint vs. int will cause confusion among developers unless it is written into the developer guidelines for the company.
If the length, for example, can't be less than zero then it should be expressed clearly in the business logic so future developers can read the code and know the true intent.
Just my 2 cents.
I will point out that in C# you can turn on /checked to check for arithmetic overflow/underflow, which isn't a bad idea anyway. If performance matters in a critical section, you can still use unchecked to avoid this.
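A sketch of what checked and unchecked give you here (my illustration, not from the original answer):
using System;

class CheckedDemo
{
    static void Main()
    {
        uint length = 0;

        try
        {
            checked
            {
                length--;              // throws OverflowException instead of silently wrapping
            }
        }
        catch (OverflowException)
        {
            Console.WriteLine("Underflow caught.");
        }

        unchecked
        {
            length--;                  // wraps to uint.MaxValue; acceptable in hot paths where wrapping is fine
        }
    }
}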
For internal code (i.e. code that won't be referenced in any interop manner by other languages) I vote for using unsigned when the situation warrants it, such as the length variables mentioned earlier. This - along with checked arithmetic - provides one more safety net for developers, catching subtle bugs earlier.
Another point in the signed vs unsigned debate is that some programmers use values such as -1 to indicate errors, when they wouldn't otherwise have meaning. I subscribe to the view that each variable should have only one purpose, but if you - or colleagues you code with - like to indicate errors in this way, leaving variables signed gives you the flexibility to add error states later.
Your two points are good. The primary reason to avoid it is casting, though. Casting makes them incredibly annoying to use. I tried using unsigned variables once, but I had to sprinkle casts absolutely everywhere because the framework methods all use signed integers. Therefore, whenever you call a framework method, you have to cast.
