is java byte the same as C# byte? - c#

Native method from dll works in java if the input parameter is array of bytes - byte[].
If we use the same method from c# it throws EntryPointNotFoundException.
Is that because of byte[] in java and c# are different things? and if it's so how should I use native function from c#?

Java lacks the unsigned types. In particular, Java lacks a primitive type for an unsigned byte. The Java byte type is signed, while the C# byte is unsigned and sbyte is signed.

Is that because of byte[] in java and c# are different things?
Yes.
Endianness: Java stores things internally as Big Endian, while .NET is Little Endian by default.
Signedness: C# bytes are unsigned. Java bytes are signed.
See different results when converting int to byte array - .NET vs Java.

What's the signature of the native function? How do you declare it in Java and in C#?
The most common reason for EntryPointNotFoundException is that function name is mangled (esp. true if function is written in C++) or misspelled.
Another source of problem is 'W' and 'A' suffixes for WinAPI function used to distinguish ANSI and Unicode versions of functions. .NET interop mechanism can try to guess the function suffix, so that may be the source of confusion,

Java Byte:
java byte: The byte data type is an 8-bit signed two's complement integer. It has a minimum value of -128 and a maximum value of 127 (inclusive). The byte data type can be useful for saving memory in large arrays, where the memory savings actually matters. They can also be used in place of int where their limits help to clarify your code; the fact that a variable's range is limited can serve as a form of documentation.
more for Java Byte
C# Byte
Byte Represents an 8-bit unsigned integer,Byte is an immutable value type that represents unsigned integers with values that range from 0 (which is represented by the Byte.MinValue constant) to 255 (which is represented by the Byte.MaxValue constant). The .NET Framework also includes a signed 8-bit integer value type, SByte, which represents values that range from -128 to 127.
more for c# Byte

Related

char (C++) manipulation in C#

I am trying to rewrite old code written in C++ to C# - code does binary manipulation with chars, but I recieve different results (probably I do some bad manipulation because of Unicode in C#).
I need to rewrite this C++ code to C#:
myChar = 'K' ^ 128;
Result of this code in C++ is -53 ('Ë') in C++'s data type char.
Same operation in C# results in 203 (again 'Ë') in C#'s data type char.
So char is ok, but I need same byte value as in C++ (because I do math operation with that). Can you recommend way, how to safe convert C# char to equivalent C++ byte values?
Thanks
In a single byte two's complement representation 203 is an unsigned interpretation of of -53.
If you would like to use an equivalent representation of C++ signed char, the type should be sbyte:
sbyte myChar = (sbyte)('K' ^ 128);
Note that C++ standard leaves it up to the implementation to decide whether a char is signed or unsigned, which means that some standard-compliant C++ will print 203 for myChar, not -58, without any change to your code.

How to read/write unsigned byte array between C# and Java on a file?

This question is a bit similar to my previous one, where I asked a "cross-language" way to write and read integers between a Java and a C# program. Problem was the endianess.
Anyway, I'm facing a secondary problem. My goal is now to store and retrieve an array of unsigned bytes (values from 0 to 255) in a way that it can be processed by both Java and C#.
In C# it's easy, since unsigned byte[] exists:
BinaryWriterBigEndian writer = new BinaryWriterBigEndian(fs);
// ...
writer.Write(byteData, 0, byteData.Length);
BinaryWriterBigEndian is... well... a big-endian binary writer ;)
This way, the file will contain a sequence composed by, for example, the following values (in a big-endian representation):
[0][120][50][250][221]...
Now it's time to do the same thing under Java. Since unsigned byte[] does not exist here, the array is stored in memory as a (signed) int[] in order to have the possibility to represent values higher than 127.
How to write it as a sequence of unsigned byte values like C# does?
I tried with this:
ByteBuffer byteBuffer = ByteBuffer.allocate(4 * dataLength);
IntBuffer intBuffer = byteBuffer.asIntBuffer();
intBuffer.put(intData);
outputStream.write(byteBuffer.array());
Writing goes well, but C# is not able to read it in the proper way.
Since unsigned byte[] does not exist [...]
You don't care. Signed or unsigned, a byte is ultimately 8 bits. Just use a regular ByteBuffer and write your individual bytes in it.
In C# as well in Java, 1000 0000 (for instance) is exactly the same binary representation of a byte; the fact that in C# it can be treated as an unsigned value, and not in Java, is irrelevant as long as you don't do any arithmetic on the value.
When you need a readable representation of it and you'd like it to be unsigned, you can use (int) (theByte & 0xff) (you need the mask, otherwise casting will "carry" the sign bit).
Or, if you use Guava, you can use UnsignedBytes.toString().

C# Numeric Data Type Naming

Looking at the C# numeric data types, I noticed that most of the types have a signed and unsigned version. I noticed that whereas the "default" integer, short and long are signed, and have their unsigned counterpart as uint, ushort and ulong; the "default" byte is instead unsigned - and have a signed counterpart in sbyte.
Just out of curiosity, why is byte so different from the rest? Was there a specific reason behind this or it is "just the way things are"?
Hope the question isn't too confusing due to my phrasing and excessive use of quotes. Heh..
I would say a byte is not considered a numeric type but defines a structure with 8 bits in size. Besides there is no signed byte notion, it is unsigned. Numbers on the otherhand are firstly considered to be signed, so stating they are unsigned which is less common warrants the prefix
[EDIT]
Forgot there is a signed byte (sbyte). I suppose it is rather historical and practical application. Ints are more common than UInts and byte is more common than sbyte.
Historically the terms byte, nibble and bit indicate a unit of storage, a mnemonic or code...not a numeric value. Having negative mega-bytes of memory or adding ASCII codes 1 and 2 expecting code 3 is kinda silly. In many ways there is no such thing as a signed "byte". Sometimes the line between "thing" and "value" is very blurry....as with most languages that treat byte as a thing and a value.
It's more so a degree of corruption of the terms. A byte is not inherently numeric in any form, it's simply a unit of storage.
However, bytes, characters, and 8-bit signed/unsigned integers have had their names used interchangeably where they probably should not have:
Byte denotes 8 bits of data, says
nothing about the format of the data.
Character denotes some data that
stores a representation of a single
text character.
"UINT8"/"INT8" denotes 8 bits of
data, in signed or unsigned format,
storing numeric integer values.
It really just comes down to being intuitive versus being consistent. It probably would have been cleaner if the .NET Framework used System.UInt8 and System.Int8 for consistency with the other integer types. But yeah it does seem a bit arbitrary.
For what it's worth MSIL (which all .NET languages compile to anyhow) is more consistent in that a sbyte is called an int8 and a byte is called an unsigned int8, short is called int16, etc.
But the term byte is typically not used to describe a numeric type but rather a set of 8 bits such as when dealing with files, serialization, sockets, etc. For example if Stream.Read worked with a System.Int8[] array, that would be a very unusual looking API.

Will a c# "int" ever be 64 bits? [duplicate]

In my C# source code I may have declared integers as:
int i = 5;
or
Int32 i = 5;
In the currently prevalent 32-bit world they are equivalent. However, as we move into a 64-bit world, am I correct in saying that the following will become the same?
int i = 5;
Int64 i = 5;
No. The C# specification rigidly defines that int is an alias for System.Int32 with exactly 32 bits. Changing this would be a major breaking change.
The int keyword in C# is defined as an alias for the System.Int32 type and this is (judging by the name) meant to be a 32-bit integer. To the specification:
CLI specification section 8.2.2 (Built-in value and reference types) has a table with the following:
System.Int32 - Signed 32-bit integer
C# specification section 8.2.1 (Predefined types) has a similar table:
int - 32-bit signed integral type
This guarantees that both System.Int32 in CLR and int in C# will always be 32-bit.
Will sizeof(testInt) ever be 8?
No, sizeof(testInt) is an error. testInt is a local variable. The sizeof operator requires a type as its argument. This will never be 8 because it will always be an error.
VS2010 compiles a c# managed integer as 4 bytes, even on a 64 bit machine.
Correct. I note that section 18.5.8 of the C# specification defines sizeof(int) as being the compile-time constant 4. That is, when you say sizeof(int) the compiler simply replaces that with 4; it is just as if you'd said "4" in the source code.
Does anyone know if/when the time will come that a standard "int" in C# will be 64 bits?
Never. Section 4.1.4 of the C# specification states that "int" is a synonym for "System.Int32".
If what you want is a "pointer-sized integer" then use IntPtr. An IntPtr changes its size on different architectures.
int is always synonymous with Int32 on all platforms.
It's very unlikely that Microsoft will change that in the future, as it would break lots of existing code that assumes int is 32-bits.
I think what you may be confused by is that int is an alias for Int32 so it will always be 4 bytes, but IntPtr is suppose to match the word size of the CPU architecture so it will be 4 bytes on a 32-bit system and 8 bytes on a 64-bit system.
According to the C# specification ECMA-334, section "11.1.4 Simple Types", the reserved word int will be aliased to System.Int32. Since this is in the specification it is very unlikely to change.
No matter whether you're using the 32-bit version or 64-bit version of the CLR, in C# an int will always mean System.Int32 and long will always mean System.Int64.
The following will always be true in C#:
sbyte signed 8 bits, 1 byte
byte unsigned 8 bits, 1 byte
short signed 16 bits, 2 bytes
ushort unsigned 16 bits, 2 bytes
int signed 32 bits, 4 bytes
uint unsigned 32 bits, 4 bytes
long signed 64 bits, 8 bytes
ulong unsigned 64 bits, 8 bytes
An integer literal is just a sequence of digits (eg 314159) without any of these explicit types. C# assigns it the first type in the sequence (int, uint, long, ulong) in which it fits. This seems to have been slightly muddled in at least one of the responses above.
Weirdly the unary minus operator (minus sign) showing up before a string of digits does not reduce the choice to (int, long). The literal is always positive; the minus sign really is an operator. So presumably -314159 is exactly the same thing as -((int)314159). Except apparently there's a special case to get -2147483648 straight into an int; otherwise it'd be -((uint)2147483648). Which I presume does something unpleasant.
Somehow it seems safe to predict that C# (and friends) will never bother with "squishy name" types for >=128 bit integers. We'll get nice support for arbitrarily large integers and super-precise support for UInt128, UInt256, etc. as soon as processors support doing math that wide, and hardly ever use any of it. 64-bit address spaces are really big. If they're ever too small it'll be for some esoteric reason like ASLR or a more efficient MapReduce or something.
Yes, as Jon said, and unlike the 'C/C++ world', Java and C# aren't dependent on the system they're running on. They have strictly defined lengths for byte/short/int/long and single/double precision floats, equal on every system.
int without suffix can be either 32bit or 64bit, it depends on the value it represents.
as defined in MSDN:
When an integer literal has no suffix, its type is the first of these types in which its value can be represented: int, uint, long, ulong.
Here is the address:
https://msdn.microsoft.com/en-us/library/5kzh1b5w.aspx

Is an int a 64-bit integer in 64-bit C#?

In my C# source code I may have declared integers as:
int i = 5;
or
Int32 i = 5;
In the currently prevalent 32-bit world they are equivalent. However, as we move into a 64-bit world, am I correct in saying that the following will become the same?
int i = 5;
Int64 i = 5;
No. The C# specification rigidly defines that int is an alias for System.Int32 with exactly 32 bits. Changing this would be a major breaking change.
The int keyword in C# is defined as an alias for the System.Int32 type and this is (judging by the name) meant to be a 32-bit integer. To the specification:
CLI specification section 8.2.2 (Built-in value and reference types) has a table with the following:
System.Int32 - Signed 32-bit integer
C# specification section 8.2.1 (Predefined types) has a similar table:
int - 32-bit signed integral type
This guarantees that both System.Int32 in CLR and int in C# will always be 32-bit.
Will sizeof(testInt) ever be 8?
No, sizeof(testInt) is an error. testInt is a local variable. The sizeof operator requires a type as its argument. This will never be 8 because it will always be an error.
VS2010 compiles a c# managed integer as 4 bytes, even on a 64 bit machine.
Correct. I note that section 18.5.8 of the C# specification defines sizeof(int) as being the compile-time constant 4. That is, when you say sizeof(int) the compiler simply replaces that with 4; it is just as if you'd said "4" in the source code.
Does anyone know if/when the time will come that a standard "int" in C# will be 64 bits?
Never. Section 4.1.4 of the C# specification states that "int" is a synonym for "System.Int32".
If what you want is a "pointer-sized integer" then use IntPtr. An IntPtr changes its size on different architectures.
int is always synonymous with Int32 on all platforms.
It's very unlikely that Microsoft will change that in the future, as it would break lots of existing code that assumes int is 32-bits.
I think what you may be confused by is that int is an alias for Int32 so it will always be 4 bytes, but IntPtr is suppose to match the word size of the CPU architecture so it will be 4 bytes on a 32-bit system and 8 bytes on a 64-bit system.
According to the C# specification ECMA-334, section "11.1.4 Simple Types", the reserved word int will be aliased to System.Int32. Since this is in the specification it is very unlikely to change.
No matter whether you're using the 32-bit version or 64-bit version of the CLR, in C# an int will always mean System.Int32 and long will always mean System.Int64.
The following will always be true in C#:
sbyte signed 8 bits, 1 byte
byte unsigned 8 bits, 1 byte
short signed 16 bits, 2 bytes
ushort unsigned 16 bits, 2 bytes
int signed 32 bits, 4 bytes
uint unsigned 32 bits, 4 bytes
long signed 64 bits, 8 bytes
ulong unsigned 64 bits, 8 bytes
An integer literal is just a sequence of digits (eg 314159) without any of these explicit types. C# assigns it the first type in the sequence (int, uint, long, ulong) in which it fits. This seems to have been slightly muddled in at least one of the responses above.
Weirdly the unary minus operator (minus sign) showing up before a string of digits does not reduce the choice to (int, long). The literal is always positive; the minus sign really is an operator. So presumably -314159 is exactly the same thing as -((int)314159). Except apparently there's a special case to get -2147483648 straight into an int; otherwise it'd be -((uint)2147483648). Which I presume does something unpleasant.
Somehow it seems safe to predict that C# (and friends) will never bother with "squishy name" types for >=128 bit integers. We'll get nice support for arbitrarily large integers and super-precise support for UInt128, UInt256, etc. as soon as processors support doing math that wide, and hardly ever use any of it. 64-bit address spaces are really big. If they're ever too small it'll be for some esoteric reason like ASLR or a more efficient MapReduce or something.
Yes, as Jon said, and unlike the 'C/C++ world', Java and C# aren't dependent on the system they're running on. They have strictly defined lengths for byte/short/int/long and single/double precision floats, equal on every system.
int without suffix can be either 32bit or 64bit, it depends on the value it represents.
as defined in MSDN:
When an integer literal has no suffix, its type is the first of these types in which its value can be represented: int, uint, long, ulong.
Here is the address:
https://msdn.microsoft.com/en-us/library/5kzh1b5w.aspx

Categories

Resources