Define enum representation type in C#

I would like to know whether there is any possibility in C# to change the default (integer) representation of an enum to something with less weight, like char.
Many of you will ask why I want to do this. The answer is simple:
I have to work with a huge, huge array.
My PC allows me to allocate memory for an array of 540 000 000 int elements (2048 * 2048 * 128). Everyone knows an integer needs around 4 times more memory than a char.
A char representation would give me 2 000 000 000 elements to manipulate.
Working with an enum is much easier for massive algorithms than working with chars, but if changing the representation isn't possible, I will have to work on characters.

Yes, you can change the underlying type of an enum, but not to char. For your case, byte will work well since it's a 1-byte type. Check enum (C# Reference) on MSDN (emphasis mine):
Every enumeration type has an underlying type, which can be any integral type except char. The default underlying type of enumeration elements is int. To declare an enum of another integral type, such as byte, use a colon after the identifier followed by the type, as shown in the following example.
enum Days : byte {Sat, Sun, Mon, Tue, Wed, Thu, Fri};
The approved types for an enum are byte, sbyte, short, ushort, int, uint, long, or ulong.
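For the array in the question, a byte-backed enum cuts each element from 4 bytes down to 1. A minimal sketch (the Cell enum and its members are invented for illustration, and the full-size allocation needs roughly 512 MB of free memory):
enum Cell : byte { Empty, Wall, Water, Lava }

class Demo
{
    static void Main()
    {
        // 2048 * 2048 * 128 elements at 1 byte each is about 512 MB,
        // versus about 2 GB for the default int-backed enum.
        var grid = new Cell[2048 * 2048 * 128];
        grid[0] = Cell.Wall;
        System.Console.WriteLine(grid.Length); // 536870912
    }
}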

Yes, you can specify the underlying type:
public enum MyEnum : byte
{
}

enum MyEnum : byte
{
    ...
}
That's all.

You can specify the underlying type of an enum. From enum (C# Reference):
Every enumeration type has an underlying type, which can be any
integral type except char. The default underlying type of enumeration
elements is int. To declare an enum of another integral type, such as
byte, use a colon after the identifier followed by the type
public enum YourEnum : byte
{
    Foo,
    Bar
}

The documentation shows you how.
Every enumeration type has an underlying type, which can be any
integral type except char. The default underlying type of enumeration
elements is int. To declare an enum of another integral type, such as
byte, use a colon after the identifier followed by the type, as shown
in the following example.
enum Days : byte {Sat=1, Sun, Mon, Tue, Wed, Thu, Fri};
Note well that in C# char is two bytes wide. You presumably mean to use byte. But as you can see from the documentation, the compiler would have rejected your attempt to use char.
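The sizes are easy to confirm with sizeof on the built-in types (a tiny sketch; no unsafe context is needed for these):
class SizeDemo
{
    static void Main()
    {
        System.Console.WriteLine(sizeof(byte)); // 1
        System.Console.WriteLine(sizeof(char)); // 2 -- a UTF-16 code unit
        System.Console.WriteLine(sizeof(int));  // 4 -- the default underlying type of an enum
    }
}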

Simply declare the enum with another integral type, such as byte, like this:
public enum MyEnum : byte {}

Related

C#: Why is 0xFFFFFFFF a uint when it represents -1?

I don't understand why C# considers the literal 0xFFFFFFFF as a uint when it also represents -1 for int types.
The following code was entered into the Immediate Window; it is shown here with its output:
int i = -1;
-1
string s = i.ToString("x");
"ffffffff"
int j = Convert.ToInt32(s, 16);
-1
int k = 0xFFFFFFFF;
Cannot implicitly convert type 'uint' to 'int'. An explicit conversion exists (are you missing a cast?)
int l = Convert.ToInt32(0xFFFFFFFF);
OverflowException was unhandled: Value was either too large or too small for an Int32.
Why can the string hex number be converted without problems but the literal only be converted using unchecked?
Why is 0xFFFFFFFF a uint when it represents -1?
Because you're not writing the bit pattern when you write
i = 0xFFFFFFFF;
you're writing a number by C#'s rules for integer literals. With C#'s integer literals, to write a negative number we write a - followed by the magnitude of the number (e.g., -1), not the bit pattern for what we want. It's really good that we aren't expected to write the bit pattern, it would make it really awkward to write negative numbers. When I want -3, I don't want to have to write 0xFFFFFFFD. :-) And I really don't want to have to vary the number of leading Fs based on the size of the type (0xFFFFFFFFFFFFFFFD for a long -3).
The rule for choosing the type of the literal is covered by the above link by saying:
If the literal has no suffix, it has the first of these types in which its value can be represented: int, uint, long, ulong.
0xFFFFFFFF doesn't fit in an int, which has a maximum positive value of 0x7FFFFFFF, so the next in the list is uint, which it does fit in.
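You can watch the compiler apply that rule by boxing the literals and asking for their types (a small sketch):
class LiteralTypeDemo
{
    static void Main()
    {
        System.Console.WriteLine((0x7FFFFFFF).GetType()); // System.Int32  -- fits in int
        System.Console.WriteLine((0xFFFFFFFF).GetType()); // System.UInt32 -- next type in the list that fits
        System.Console.WriteLine((-1).GetType());         // System.Int32  -- a minus sign plus the magnitude 1
    }
}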
0xffffffff is 4294967295, a UInt32 that just happens to have a bit pattern equal to the Int32 -1 due to the way negative numbers are represented on computers. Just because they have the same bit pattern, that doesn't mean 4294967295 = -1. They're completely different numbers, so of course you can't just trivially convert between the two. You can force the reinterpretation of the bit pattern with an explicit cast to int inside an unchecked context: unchecked((int)0xffffffff).
The C# docs say that the compiler will try to fit the number you provide into the smallest type that can hold it. That doc is a bit old, but it still applies; an unsuffixed literal is always taken to be positive.
As a fallback you can always cast to the type you want.
The C# language rules state that 0xFFFFFFFF is an unsigned literal.
A C# signed int is a two's complement type. That scheme uses 0xFFFFFFFF to represent -1. (Two's complement is a clever scheme since it doesn't have a signed zero.)
For an unsigned int, 0xFFFFFFFF is the largest value it can take, and because of its magnitude it can't be implicitly converted to a signed int.
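A minimal sketch tying this back to the question's unchecked remark (assuming a plain console program):
class ReinterpretDemo
{
    static void Main()
    {
        // Casting the uint constant inside unchecked reinterprets the same
        // 32-bit pattern as a signed value instead of checking its range.
        int k = unchecked((int)0xFFFFFFFF);
        System.Console.WriteLine(k); // -1

        // Convert.ToInt32 checks the numeric value 4294967295, not the bit
        // pattern, so it throws an OverflowException at run time:
        // int l = Convert.ToInt32(0xFFFFFFFF);
    }
}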

0x80000000 == 2147483648 in C# but not in VB.NET

In C#:
0x80000000==2147483648 //outputs True
In VB.NET:
&H80000000=2147483648 'outputs False
How is this possible?
This is related to the history behind the languages.
C# has always supported unsigned integers. The values you use are too large for int, so the compiler picks the next type that can correctly represent them, which is uint for both.
VB.NET didn't acquire unsigned integer support until version 8 (.NET 2.0). So traditionally, the compiler was forced to pick Long as the type for the 2147483648 literal. The rule, however, was different for the hexadecimal literal: it traditionally supported specifying the bit pattern of a negative value (see section 2.4.2 of the language spec). So &H80000000 is a literal of type Integer with the value -2147483648, while 2147483648 is a Long. Thus the mismatch.
If you think VB.NET is a quirky language then I'd invite you to read this post :)
The VB version should be:
&H80000000L=2147483648
Without the long specifier ('L'), VB will try to interpret &H80000000 as an Integer. If you force it to be treated as a Long, then you'll get the same result.
&H80000000UI will also work - actually this is the type (UInt32) that C# regards the literal as.
This happens because the type of the hexadecimal number is UInt32 in C# and Int32 in VB.NET.
The binary representation of the hexadecimal number is:
10000000000000000000000000000000
Both UInt32 and Int32 take 32 bits, but because Int32 is signed, the first bit is considered a sign to indicate whether the number is negative or not: 0 for positive, 1 for negative. To convert a negative binary number to decimal, do this:
1. Invert the bits. You get 01111111111111111111111111111111.
2. Convert this to decimal. You get 2147483647.
3. Add 1 to this number. You get 2147483648.
4. Make this negative. You get -2147483648, which is equal to &H80000000 in VB.NET.
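The same arithmetic can be checked from the C# side (a small sketch; the literals are those from the question):
class LiteralComparisonDemo
{
    static void Main()
    {
        // Both literals are typed as uint in C#, so the comparison is unsigned:
        System.Console.WriteLine(0x80000000 == 2147483648);   // True
        System.Console.WriteLine((0x80000000).GetType());     // System.UInt32

        // Reinterpreting the same bit pattern as a signed Int32 yields the
        // negative value that the VB.NET literal &H80000000 denotes:
        System.Console.WriteLine(unchecked((int)0x80000000)); // -2147483648
    }
}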

Why can't I base an enum off UInt16?

Given the code below:
static void Main()
{
    Console.WriteLine(typeof(MyEnum).BaseType.FullName);
}

enum MyEnum : ushort
{
    One = 1,
    Two = 2
}
It outputs System.Enum, which means the colon here has nothing to do with inheritance, and it just specifies the basic type of the enum, am I right?
But if I change my code as follows:
enum MyEnum : UInt16
{
    One = 1,
    Two = 2
}
I would get a compilation error. Why? Aren't UInt16 and ushort the same?
You are correct: the colon does not denote inheritance, and reflection's BaseType does not report it. The specification calls it the enum's "underlying type", and you can retrieve it with Enum.GetUnderlyingType instead.
The type named by ushort and the type System.UInt16 are precisely the same.
However, the syntax of enum does not call for a type. Instead it calls for one of a limited set of keywords, which control the underlying type. While System.UInt16 is a valid underlying type, it is not one of the keywords which the C# grammar permits to appear in that location.
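For example, applied to the enum from the question (a short, self-contained sketch), BaseType still reports System.Enum while Enum.GetUnderlyingType reports the underlying type:
using System;

enum MyEnum : ushort { One = 1, Two = 2 }

class Program
{
    static void Main()
    {
        Console.WriteLine(typeof(MyEnum).BaseType.FullName);       // System.Enum
        Console.WriteLine(Enum.GetUnderlyingType(typeof(MyEnum))); // System.UInt16
    }
}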
Quoting the grammar:
enum-declaration:
    attributes_opt enum-modifiers_opt enum identifier enum-base_opt enum-body ;_opt
enum-base:
    : integral-type
integral-type:
    sbyte
    byte
    short
    ushort
    int
    uint
    long
    ulong
    char
Because the valid types for an enum are explicitly specified to be the integral types (except char).
The approved types for an enum are byte, sbyte, short, ushort, int, uint, long, or ulong.
http://msdn.microsoft.com/en-us/library/sbbt4032.aspx
One would expect UInt16 to be equivalent to ushort, given the documentation for built-in types:
The C# type keywords and their aliases are interchangeable. For example, you can declare an integer variable by using either of the following declarations...
http://msdn.microsoft.com/en-us/library/ya5y69ds.aspx
Edit: I reworked this answer a few times before grasping the correct explanation. @BenVoight is correct: the accepted list is the set of integral-type keywords (other than char). System.UInt16 is exactly the same type as ushort, but it is not an integral-type keyword as specified by the grammar; it is merely the name of the struct type.
That's compiler error CS1008, and it pretty much provides the answer. The approved types for an enum:
The approved types for an enum are byte, sbyte, short, ushort, int,
uint, long, or ulong.
The first part of your question is answered by others, but no one has addressed the second part yet. (Someone other than the OP has since edited the second question, so my answer may no longer apply.)
UInt16 and UInt are not the same: UInt16 is an unsigned 16-bit integer, while UInt is an unsigned 32-bit integer. They differ considerably in their maximum values.
Just for completeness, I'm including the answer to the first question also:
The approved types for an enum are byte, sbyte, short, ushort, int, uint, long, or ulong.
As for why?
My guess is CLS compliance.

.NET primitives and type hierarchies, why was it designed like this?

I would like to understand why .NET has nine integer types: Char, Byte, SByte, Int16, UInt16, Int32, UInt32, Int64, and UInt64; plus other numeric types: Single, Double, Decimal; and why all these types have no relation to one another at all.
When I first started coding in C# I thought "cool, there's a uint type, I'm going to use that when negative values are not allowed". Then I realized no API used uint but int instead, and that uint is not derived from int, so a conversion was needed.
What are the real-world applications of these types? Why not have, instead, integer and positiveInteger? Those are types I can understand. A person's age in years is a positiveInteger, and since positiveInteger is a subset of integer there's no need for conversion whenever an integer is expected.
The following is a diagram of the type hierarchy in XPath 2.0 and XQuery 1.0. If you look under xs:anyAtomicType you can see the numeric hierarchy decimal > integer > long > int > short > byte. Why wasn't .NET designed like this? Will the new framework "Oslo" be any different?
My guess would be because the underlying hardware breaks that class hierarchy. There are (perhaps surprisingly) many times when you care that a UInt32 is 4 bytes big and unsigned, so a UInt32 is not a kind of Int32, nor is an Int32 a kind of Int64.
And you almost always care about the difference between an int and a float.
Fundamentally, inheritance & the class hierarchy are not the same as mathematical set inclusion. The fact that the values a UInt32 can hold are a strict subset of the values an Int64 can hold does not mean that a UInt32 is a type of Int64. Less obviously, an Int32 is not a type of Int64 - even though there's no conceptual difference between them, their underlying representations are different (4 bytes versus 8 bytes). Decimals are even more different.
XPath is different: the representations for all the numeric types are fundamentally the same - a string of ASCII digits. There, the difference between a short and a long is one of possible range rather than representation - "123" is both a valid representation of a short and a valid representation of a long with the same value.
Decimal is intended for calculations that need precision (basically, money).
See here: http://msdn.microsoft.com/en-us/library/364x0z75(VS.80).aspx
Singles/Doubles are different to decimals, because they're intended to be an approximation (basically, for scientific calculations).
That's why they're not related.
As for bytes and chars, they're totally different: a byte is 0-255, whereas a char is a character, and can hence store Unicode characters (there are a lot more than 255 of them!).
Uints and ints don't convert automatically, because they can each store values that are impossible for the other (uints have twice the positive range of ints).
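For example (a small sketch; the specific values are arbitrary):
class UnsignedDemo
{
    static void Main()
    {
        uint u = 3000000000;          // valid uint, but too large for int
        // int i = u;                 // compile error: no implicit uint -> int conversion
        int i = unchecked((int)u);    // explicit cast reinterprets the bits
        System.Console.WriteLine(i);  // -1294967296

        int n = -1;
        // uint v = n;                // likewise no implicit int -> uint conversion
        uint v = unchecked((uint)n);
        System.Console.WriteLine(v);  // 4294967295
    }
}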
Once you get the hang of it all, it actually does make a lot of sense.
As for your ages thing, I'd simply use an int ;)

Implicit type cast of char to int in C#

I have a question about implicit type conversion.
Why does this implicit type conversion work in C#? I've learned that implicit conversions usually don't work like this.
Here is a code sample about the implicit type conversion:
char c = 'a';
int x = c;
int n = 5;
int answer = n * c;
Console.WriteLine(answer);
UPDATE: I am using this question as the subject of my blog today. Thanks for the great question. Please see the blog for future additions, updates, comments, and so on.
http://blogs.msdn.com/ericlippert/archive/2009/10/01/why-does-char-convert-implicitly-to-ushort-but-not-vice-versa.aspx
It is not entirely clear to me what exactly you are asking. "Why" questions are difficult to answer. But I'll take a shot at it.
First, code which has an implicit conversion from char to int (note: this is not an "implicit cast", this is an "implicit conversion") is legal because the C# specification clearly states that there is an implicit conversion from char to int, and the compiler is, in this respect, a correct implementation of the specification.
Now, you might sensibly point out that the question has been thoroughly begged. Why is there an implicit conversion from char to int? Why did the designers of the language believe that this was a sensible rule to add to the language?
Well, first off, the obvious things which would prevent this from being a rule of the language do not apply. A char is implemented as an unsigned 16 bit integer that represents a character in a UTF-16 encoding, so it can be converted to a ushort without loss of precision, or, for that matter, without change of representation. The runtime simply goes from treating this bit pattern as a char to treating the same bit pattern as a ushort.
It is therefore possible to allow a conversion from char to ushort. Now, just because something is possible does not mean it is a good idea. Clearly the designers of the language thought that implicitly converting char to ushort was a good idea, but implicitly converting ushort to char is not. (And since char to ushort is a good idea, it seems reasonable that char-to-anything-that-ushort-goes-to is also reasonable, hence, char to int. Also, I hope that it is clear why allowing explicit casting of ushort to char is sensible; your question is about implicit conversions.)
So we actually have two related questions here: First, why is it a bad idea to allow implicit conversions from ushort/short/byte/sbyte to char? and second,
why is it a good idea to allow implicit conversions from char to ushort?
Unlike you, I have the original notes from the language design team at my disposal. Digging through those, we discover some interesting facts.
The first question is covered in the notes from April 14th, 1999, where the question of whether it should be legal to convert from byte to char arises. In the original pre-release version of C#, this was legal for a brief time. I've lightly edited the notes to make them clear without an understanding of 1999-era pre-release Microsoft code names. I've also added emphasis on important points:
[The language design committee] has chosen to provide
an implicit conversion from bytes to
chars, since the domain of one is
completely contained by the other.
Right now, however, [the runtime
library] only provide Write methods
which take chars and ints, which means
that bytes print out as characters
since that ends up being the best
method. We can solve this either by
providing more methods on the Writer
class or by removing the implicit
conversion.
There is an argument for why the
latter is the correct thing to do.
After all, bytes really aren't
characters. True, there may be a
useful mapping from bytes to chars, but ultimately, 23 does not denote the
same thing as the character with ascii
value 23, in the same way that 23B
denotes the same thing as 23L. Asking
[the library authors] to provide this
additional method simply because of
how a quirk in our type system works
out seems rather weak. So I would
suggest that we make the conversion
from byte to char explicit.
The notes then conclude with the decision that byte-to-char should be an explicit conversion, and integer-literal-in-range-of-char should also be an explicit conversion.
Note that the language design notes do not call out why ushort-to-char was also made illegal at the same time, but you can see that the same logic applies. When calling a method overloaded as M(int) and M(char), when you pass it a ushort, odds are good that you want to treat the ushort as a number, not as a character. And a ushort is NOT a character representation in the same way that a ushort is a numeric representation, so it seems reasonable to make that conversion illegal as well.
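That overload behaviour is easy to observe; the sketch below uses invented method names and simply mirrors the two overloads mentioned above:
using System;

class OverloadDemo
{
    static void M(int x)  { Console.WriteLine("M(int): " + x); }
    static void M(char c) { Console.WriteLine("M(char): " + c); }

    static void Main()
    {
        ushort u = 65;
        M(u);       // prints "M(int): 65" -- ushort -> int is implicit, ushort -> char is not
        M((char)u); // prints "M(char): A" -- an explicit cast opts in to the character overload
    }
}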
The decision to make char go to ushort was made on the 17th of September, 1999; the design notes from that day on this topic simply state "char to ushort is also a legal implicit conversion", and that's it. No further exposition of what was going on in the language designer's heads that day is evident in the notes.
However, we can make educated guesses as to why implicit char-to-ushort was considered a good idea. The key idea here is that the conversion from number to character is a "possibly dodgy" conversion. It's taking something that you do not KNOW is intended to be a character, and choosing to treat it as one. That seems like the sort of thing you want to call out that you are doing explicitly, rather than accidentally allowing it. But the reverse is much less dodgy. There is a long tradition in C programming of treating characters as integers -- to obtain their underlying values, or to do mathematics on them.
In short: it seems reasonable that using a number as a character could be an accident and a bug, but it also seems reasonable that using a character as a number is deliberate and desirable. This asymmetry is therefore reflected in the rules of the language.
Does that answer your question?
The basic idea is that conversions which cannot lead to data loss can be implicit, whereas conversions which may lead to data loss have to be explicit (using, for instance, a cast operator).
So implicitly converting from char to int will work in C#.
[edit]As others pointed out, a char is a 16-bit number in C#, so this conversion is just from a 16-bit integer to a 32-bit integer, which is possible without data-loss.[/edit]
C# supports implicit conversions; the "usually don't work" part probably comes from some other language, most likely C++, where some glorious string implementations provided implicit conversions to various pointer types, creating gigantic bugs in applications.
When you provide type conversions, in whatever language, you should default to explicit conversions and only provide implicit ones for special cases.
From C# Specification
6.1.2 Implicit numeric conversions
The implicit numeric conversions are:
• From sbyte to short, int, long, float, double, or decimal.
• From byte to short, ushort, int, uint, long, ulong, float, double, or decimal.
• From short to int, long, float, double, or decimal.
• From ushort to int, uint, long, ulong, float, double, or decimal.
• From int to long, float, double, or decimal.
• From uint to long, ulong, float, double, or decimal.
• From long to float, double, or decimal.
• From ulong to float, double, or decimal.
• From char to ushort, int, uint, long, ulong, float, double, or decimal.
• From float to double.
Conversions from int, uint, long, or ulong to float and from long or ulong to double may cause a loss of precision, but will never cause a loss of magnitude. The other implicit numeric conversions never lose any information. There are no implicit conversions to the char type, so values of the other integral types do not automatically convert to the char type.
From the MSDN page about char (char (C# Reference)):
A char can be implicitly converted to ushort, int, uint, long, ulong, float, double, or decimal. However, there are no implicit conversions from other types to the char type.
It's because the language defines an implicit conversion from char to all those types. As for why they defined them, I'm really not sure; probably to help when working with the numeric (ASCII/Unicode) values of chars, or something like that.
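A short sketch of the asymmetry those quotes describe:
class CharConversionDemo
{
    static void Main()
    {
        char a = 'a';
        int code = a;                   // implicit char -> int
        System.Console.WriteLine(code); // 97

        // char c = 97;                 // compile error: cannot implicitly convert 'int' to 'char'
        char c = (char)97;              // the explicit cast is required
        System.Console.WriteLine(c);    // a
    }
}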
An implicit conversion is only allowed when it cannot cause data loss. Here char is 16 bits and int is 32 bits, so the conversion happens without loss of data.
Real life example: we can put a small vessel into a big vessel but not vice versa without external help.
The core of Eric Lippert's blog entry is his educated guess at the reasoning behind this decision by the C# language designers:
"There is a long tradition in C programming of treating characters as integers
-- to obtain their underlying values, or to do mathematics on them."
It can cause errors though, such as:
var s = new StringBuilder('a');
Which you might think initialises the StringBuilder with an 'a' character to start with, but actually sets the capacity of the StringBuilder to 97.
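A hedged illustration of that pitfall and the call that was probably intended (the variable names here are invented):
using System.Text;

class StringBuilderPitfall
{
    static void Main()
    {
        var byChar = new StringBuilder('a');           // binds to StringBuilder(int capacity), with capacity 97
        System.Console.WriteLine(byChar.Length);       // 0 -- no 'a' was appended
        System.Console.WriteLine(byChar.Capacity);     // 97

        var byString = new StringBuilder("a");         // what was probably intended
        System.Console.WriteLine(byString.ToString()); // a
    }
}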
It works because each character is handled internally as a number, hence the cast is implicit.
The char is implicitly converted to its Unicode numeric value, which is an integer.
The implicit conversion from char to number types makes no sense, in my opinion, because a loss of information happens. You can see it from this example:
string ab = "ab";
char a = ab[0];
char b = ab[1];
var d = a + b; //195
We have put all pieces of information from the string into chars. If by any chance only the information from d is kept, all that is left to us is a number which makes no sense in this context and cannot be used to recover the previously provided information. Thus, the most useful way to go would be to implicitly convert the "sum" of chars to a string.
