C# byte array calculation with BigInteger not working properly

So, I need to do calculations on byte arrays in my program, and I noticed something weird:
string aaa = "F8F9FAFBFCFD";
string aaaah = "10101010101";
BigInteger dsa = BigInteger.Parse(aaa, NumberStyles.HexNumber) + BigInteger.Parse(aaaah, NumberStyles.HexNumber);
MessageBox.Show(dsa.ToString("X"));
When I add aaa + aaaah, it displays 9FAFBFCFDFE, but it should display F9FAFBFCFDFE. When I subtract (aaa - aaaah), it correctly displays F7F8F9FAFBFC. As far as I can tell, everything in my code is right.

BigInteger.Parse interprets "F8F9FAFBFCFD" as the negative number -7,722,435,347,203 (using two's complement) and not 273,752,541,363,453 as you were probably expecting.
From the documentation for BigInteger.Parse:
If value is a hexadecimal string, the Parse(String, NumberStyles)
method interprets value as a negative number stored by using two's
complement representation if its first two hexadecimal digits are
greater than or equal to 0x80. In other words, the method interprets
the highest-order bit of the first byte in value as the sign bit.
To get the result you are expecting, prefix aaa with a 0 to force it to be interpreted as a positive value:
string aaa = "0F8F9FAFBFCFD";
string aaaah = "10101010101";
BigInteger dsa = BigInteger.Parse(aaa, NumberStyles.HexNumber)
+ BigInteger.Parse(aaaah, NumberStyles.HexNumber);
MessageBox.Show(dsa.ToString("X")); // outputs 0F9FAFBFCFDFE
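To make the sign rule concrete, here is a minimal sketch that parses the same digits with and without the leading 0. The helper name ParseUnsignedHex is hypothetical (it is not part of .NET); it simply prepends "0" so the first digit can never be read as a sign bit.

```csharp
using System;
using System.Globalization;
using System.Numerics;

static class HexParsingDemo
{
    // Hypothetical helper: parses hex digits as a non-negative value by
    // prepending "0", so the highest-order bit is never treated as a sign bit.
    public static BigInteger ParseUnsignedHex(string hex)
    {
        return BigInteger.Parse("0" + hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        BigInteger asSigned = BigInteger.Parse("F8F9FAFBFCFD", NumberStyles.HexNumber);
        BigInteger asUnsigned = ParseUnsignedHex("F8F9FAFBFCFD");

        Console.WriteLine(asSigned);   // -7722435347203
        Console.WriteLine(asUnsigned); // 273752541363453

        BigInteger sum = asUnsigned + ParseUnsignedHex("10101010101");
        Console.WriteLine(sum.ToString("X")); // 0F9FAFBFCFDFE
    }
}
```

Note that ToString("X") applies the same convention on output: a positive value whose first hex digit is 8 or higher is printed with a leading 0.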

Related

Why do I get a different value after turning an integer into ASCII and then back to an integer?

Why do I get a different value when I convert an int to bytes, then to ASCII, and back again?
Example:
var asciiStr = new string(Encoding.ASCII.GetChars(BitConverter.GetBytes(2000)));
var intVal = BitConverter.ToInt32(Encoding.ASCII.GetBytes(asciiStr), 0);
Console.WriteLine(intVal);
// Result: 1855

ASCII is only 7-bit - code points above 127 are unsupported. Unsupported characters are converted to ? per the docs on Encoding.ASCII:
The ASCIIEncoding object that is returned by this property might not have the appropriate behavior for your app. It uses replacement fallback to replace each string that it cannot encode and each byte that it cannot decode with a question mark ("?") character.
So 2000 decimal = D0 07 00 00 hexadecimal (little endian) = [unsupported character] [BEL character] [NUL character] [NUL character] = ? [BEL character] [NUL character] [NUL character] = 3F 07 00 00 hexadecimal (little endian) = 1855 decimal.

TL;DR: Everything's fine. But you're a victim of character replacement.
We start with 2000. Let's acknowledge, first, that this number can be represented in hexadecimal as 0x000007d0.
BitConverter.GetBytes
BitConverter.GetBytes(2000) is an array of 4 bytes, because 2000 is a 32-bit integer literal. So the 32-bit integer representation, in little endian (least significant byte first), is given by the byte sequence { 0xd0, 0x07, 0x00, 0x00 }. In decimal, those same bytes are { 208, 7, 0, 0 }.
Encoding.ASCII.GetChars
Uh oh! Problem. Here's where things likely took an unexpected turn for you.
You're asking the system to interpret those bytes as ASCII-encoded data. The problem is that ASCII uses codes from 0-127. The byte with value 208 (0xd0) doesn't correspond to any character encodable by ASCII. So what actually happens?
When the ASCII decoder encounters a byte outside the range 0-127, it decodes that byte to a replacement character and moves on to the next byte. This replacement character is a question mark (?). So the 4 chars you get back from Encoding.ASCII.GetChars are ?, BEL (bell), NUL (null) and NUL (null).
BEL is the ASCII name of the character with code 7, which traditionally elicits a beep when presented on a capable terminal. NUL (code 0) is a null character traditionally used for representing the end of a string.
new string
Now you create a string from that array of chars. In C# a string is perfectly capable of representing a NUL character within its body, so your string will have two NUL chars in it. They can be represented in C# string literals with "\0", in case you want to try that yourself. A C# string literal that represents the string you have would be "?\a\0\0". Did you know that the BEL character can be represented with the escape sequence \a? Many people don't.
Encoding.ASCII.GetBytes
Now you begin the reverse journey. Your string consists entirely of characters in the ASCII range. The encoding of a question mark is code 63 (0x3F), BEL is 7, and NUL is 0, so the bytes are { 0x3f, 0x07, 0x00, 0x00 }. Surprised? Well, you're encoding a question mark now where before you provided a 208 (0xd0) byte that was not representable with ASCII encoding.
BitConverter.ToInt32
Converting these four bytes back to a 32-bit integer gives the integer 0x0000073f, which, in decimal, is 1855.
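The walkthrough above can be reproduced with a short sketch that prints the bytes before and after the ASCII round trip. The class and method names are made up for illustration, and the byte order shown assumes a little-endian machine, as in the explanation above.

```csharp
using System;
using System.Text;

static class AsciiRoundTripDemo
{
    // Hypothetical helper: pushes the bytes of an int through ASCII decoding
    // and encoding, printing the bytes at both ends of the trip.
    public static int RoundTripThroughAscii(int value)
    {
        byte[] original = BitConverter.GetBytes(value);
        string text = Encoding.ASCII.GetString(original);   // bytes above 127 become '?'
        byte[] reencoded = Encoding.ASCII.GetBytes(text);

        Console.WriteLine(BitConverter.ToString(original));
        Console.WriteLine(BitConverter.ToString(reencoded));
        return BitConverter.ToInt32(reencoded, 0);
    }

    static void Main()
    {
        // Prints D0-07-00-00, then 3F-07-00-00, then 1855.
        Console.WriteLine(RoundTripThroughAscii(2000));
    }
}
```

A value such as 100, whose bytes all fall in the range 0-127, survives the same trip unchanged.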

String encoding (ASCII, UTF8, SHIFT_JIS, etc.) is designed to pigeonhole human language into a binary (byte) form. It isn't designed to store arbitrary binary data, such as the binary form of an integer.
While your binary data will be interpreted as a string, some of the information will be lost, meaning that storing binary data in this way will fail in the general case. You can see the point where this fails using the following code:
for (int i = 0; i <= 255; ++i)
{
    // Decode a single byte as ASCII, then encode the result back to bytes
    var byteData = new byte[] { (byte)i };
    var stringData = System.Text.Encoding.ASCII.GetString(byteData);
    var encodedAsBytes = System.Text.Encoding.ASCII.GetBytes(stringData);
    Console.WriteLine("{0} vs {1}", i, (int)encodedAsBytes[0]);
}
As you can see, it starts off well because all of the character codes correspond to ASCII characters, but once we get to higher numbers (i.e. 128 and beyond), we need more than 7 bits to store the binary value. At this point the value is no longer decoded correctly, and we start seeing 63 come back instead of the input value.
Ultimately you will have this problem encoding binary data using any string encoding. You need to choose an encoding method specifically meant for storing binary data as a string.
Two popular methods are:
Hexadecimal
Base64 using ToBase64String and FromBase64String
Hexadecimal example (ByteArrayToString and StringToByteArray are hex helper methods that are not built into .NET and must be defined separately):
int initialValue = 2000;
Console.WriteLine(initialValue);
// Convert from int to bytes and then to hex
byte[] bytesValue = BitConverter.GetBytes(initialValue);
string stringValue = ByteArrayToString(bytesValue);
Console.WriteLine("As hex: {0}", stringValue); // outputs D0070000
// Convert from hex to bytes and then to int
byte[] decodedBytesValue = StringToByteArray(stringValue);
int intValue = BitConverter.ToInt32(decodedBytesValue, 0);
Console.WriteLine(intValue);
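The hexadecimal example calls ByteArrayToString and StringToByteArray, which are not part of .NET. One possible way to write them is sketched below; this is an assumed implementation for illustration, not necessarily the one the answer originally linked to.

```csharp
using System;

static class HexHelpers
{
    // Assumed implementation: BitConverter.ToString yields "D0-07-00-00",
    // so removing the dashes leaves plain uppercase hex.
    public static string ByteArrayToString(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "");
    }

    // Assumed implementation: reads the string two hex digits at a time.
    public static byte[] StringToByteArray(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must have an even number of digits.", nameof(hex));

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    static void Main()
    {
        string hex = ByteArrayToString(new byte[] { 0xD0, 0x07, 0x00, 0x00 });
        Console.WriteLine(hex); // D0070000
        Console.WriteLine(StringToByteArray(hex).Length); // 4
    }
}
```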
Base64 example:
int initialValue = 2000;
Console.WriteLine(initialValue);
// Convert from int to bytes and then to base64
byte[] bytesValue = BitConverter.GetBytes(initialValue);
string stringValue = Convert.ToBase64String(bytesValue);
Console.WriteLine("As base64: {0}", stringValue); // outputs 0AcAAA==
// Convert from base64 to bytes and then to int
byte[] decodedBytesValue = Convert.FromBase64String(stringValue);
int intValue = BitConverter.ToInt32(decodedBytesValue, 0);
Console.WriteLine(intValue);
P.S. If you simply wanted to convert your integer to a string (e.g. "2000") then you can simply use .ToString():
int initialValue = 2000;
string stringValue = initialValue.ToString();

What does {1:X} mean in C#?

I don't quite get what {1:X} is in this piece of code:
ushort secretKey = 0x0088; // The cyphering key.
char character = 'A'; // The initial key to be cyphered.
Console.WriteLine("Initial symbol: {0}, its code in the symbols' table: {1:X}", character, (byte)character);
I mean, I realize that {0:X} means a lowercase hexadecimal, but does that mean that {1:X} is a decimal? Thanks for explaining.

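This is known as composite formatting. The 1 is the zero-based index of the argument to insert (here the second argument, (byte)character), and X formats it as uppercase hexadecimal; a lowercase x would produce lowercase hex digits. With no format specifier at all, as in {1}, the value is printed in decimal. The full list is in the documentation on standard numeric format strings.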
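A small sketch of how the index and the format specifier interact. The class name, method name, and message text are made up for illustration; the same argument is reused with three different specifiers.

```csharp
using System;

static class CompositeFormatDemo
{
    // {0} is the first argument; {1...} is the second, formatted three ways.
    public static string Describe(char character)
    {
        return string.Format(
            "symbol: {0}, upper hex: {1:X}, lower hex: {1:x}, decimal: {1}",
            character, (byte)character);
    }

    static void Main()
    {
        // prints "symbol: J, upper hex: 4A, lower hex: 4a, decimal: 74"
        Console.WriteLine(Describe('J'));
    }
}
```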

Strange FormatException when converting to octal

So I have an integer, 208. I don't expect many to understand why I am doing this, but the end result I am after is the base-10 representation of the octal number 208 (two-zero-eight). I expect the confusing part (for people trying to answer this question) is that, while 208 is an integer, I am using it more like a string containing the characters two, zero, and eight. Please let me know if you have any questions, as I think this may cause some confusion.
Anyway, to get the base-10 representation of "208" here is what I do:
Convert int 208 into string "208".
Take the string "208", and parse from octal to decimal.
Then, here is the corresponding source code:
public byte OctalToDecimal(int octalDigits)
{
    byte decimalValue = 0;
    string octalString = string.Empty;
    // first, get a string representation of the integer number
    octalString = octalDigits.ToString();
    // now, get the decimal value of the octal string
    decimalValue = Convert.ToByte(octalString, 8);
    // return the decimal value
    return decimalValue;
}
I get a FormatException when octalDigits = 208, with a message about additional characters in the value of octalString. Why would that be? All I do is convert from int to string; it's very short and simple, and I don't append anything. What is going on?

Octal digits only range from 0 to 7, so 8 is not a valid octal digit.
Here are some helpful links:
the octal representations of bytes range from 000 to 377?
http://www.asciitable.com/

Octal numbers cannot contain the digit 8, just as a base-10 number cannot contain a "digit" 10 and a binary number cannot contain the digit 2.
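To see the difference, compare a valid octal string with the one from the question. The helper name TryOctalToByte is hypothetical; it only wraps Convert.ToByte and reports whether the digits were valid. Values too large for a byte (above octal 377) would raise an OverflowException instead, which this sketch does not catch.

```csharp
using System;

static class OctalDemo
{
    // Hypothetical helper: returns false instead of throwing when the
    // string contains a character that is not an octal digit (0-7).
    public static bool TryOctalToByte(string octalDigits, out byte value)
    {
        try
        {
            value = Convert.ToByte(octalDigits, 8);
            return true;
        }
        catch (FormatException)
        {
            value = 0;
            return false;
        }
    }

    static void Main()
    {
        byte result;
        Console.WriteLine(TryOctalToByte("207", out result)); // True
        Console.WriteLine(result);                            // 135 (2*64 + 0*8 + 7)
        Console.WriteLine(TryOctalToByte("208", out result)); // False: 8 is not an octal digit
    }
}
```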

Converting Double to String with multiple format specifiers

I'm trying to convert a Double to String, I need both round-tripping R and lowercase exponential notation enabled. I've looked at NumberFormatInfo, but it doesn't seem to define the exponent symbol.
I'm aware of x.ToString("R").ToLower(), but I prefer a direct approach if there is any.
The references I've examined:
https://msdn.microsoft.com/en-us/library/dwhawy9k(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/0c899ak8(v=vs.110).aspx

double value = 12345.6789;
//Roundtrip
Console.WriteLine(value.ToString("R")); // Result: 12345.6789
//Lowercase "e" denotes lowercase exponent symbol
Console.WriteLine(value.ToString("e")); // Result: 1.234568e+004
//Likewise uppercase "E" denotes uppercase exponent symbol
Console.WriteLine(value.ToString("E")); // Result: 1.234568E+004
//add precision after the "e"
Console.WriteLine(value.ToString("e8")); // Result: 1.23456789e+004
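//Variable precision: count the significant digits in the round-trip string.
//Note: this only works when the "R" output is plain decimal notation with no sign or exponent.
int precision = value.ToString("R").Replace(".", "").Length - 1;
Console.WriteLine(value.ToString("e" + precision)); //Result: 1.23456789e+004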
https://msdn.microsoft.com/en-us/library/dwhawy9k(v=vs.110).aspx#EFormatString
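If always using exponential notation is acceptable, a fixed precision avoids counting digits. The sketch below is one possible approach, not an equivalent of "R": it uses "e16", which asks for 17 significant digits, the number generally needed to identify a double uniquely, so parsing the text back should return the original value. The helper name ToLowerExponent is made up, and the output is usually longer than what "R" produces.

```csharp
using System;
using System.Globalization;

static class ExponentFormatDemo
{
    // Hypothetical helper: exponential notation with a lowercase "e" and
    // 17 significant digits (1 before the decimal point, 16 after).
    public static string ToLowerExponent(double value)
    {
        return value.ToString("e16", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        double value = 12345.6789;
        string text = ToLowerExponent(value);
        double parsed = double.Parse(text, CultureInfo.InvariantCulture);

        Console.WriteLine(text);
        Console.WriteLine(parsed == value); // checks that the round trip preserved the value
    }
}
```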

Conversion of a Char variable to an integer

Why is the integer equivalent of '8' 56 in C#? I want to convert it to the integer 8 and not any other number.

You'll need to subtract the offset from '0'.
int zero = (int)'0'; // 48
int eight = (int)'8'; // 56
int value = eight - zero; // 8

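56 is the Unicode value of the character '8'. To get the number 8, convert the character to a string and parse it:
char myChar = '8';
int value = Int32.Parse(myChar.ToString()); // 8
Or use Convert.ToInt32 on the string. Note that calling Convert.ToInt32(myChar) with the char itself returns the character code (56), not the digit:
int sameValue = Convert.ToInt32(myChar.ToString()); // 8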

The right way to convert Unicode digit characters in C#/.NET is to use the corresponding Char methods IsDigit and GetNumericValue (http://msdn.microsoft.com/en-us/library/e7k33ktz.aspx).
If you are absolutely sure that your input will contain no non-ASCII digits, then ChaosPandion's suggestion is fine too.
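The two approaches from the answers can be compared side by side. The method names below are hypothetical; the first subtracts '0' and only handles ASCII digits, while the second relies on char.IsDigit and char.GetNumericValue and therefore also accepts decimal digits from other scripts.

```csharp
using System;

static class CharDigitDemo
{
    // ASCII-only: '0'..'9' have consecutive codes 48..57, so subtracting '0' gives 0..9.
    public static int AsciiDigitToInt(char c)
    {
        if (c < '0' || c > '9')
            throw new ArgumentOutOfRangeException(nameof(c), "Expected an ASCII digit.");
        return c - '0';
    }

    // Unicode-aware: GetNumericValue returns a double, so cast it after checking IsDigit.
    public static int UnicodeDigitToInt(char c)
    {
        if (!char.IsDigit(c))
            throw new ArgumentOutOfRangeException(nameof(c), "Expected a decimal digit.");
        return (int)char.GetNumericValue(c);
    }

    static void Main()
    {
        Console.WriteLine(AsciiDigitToInt('8'));        // 8
        Console.WriteLine(UnicodeDigitToInt('8'));      // 8
        Console.WriteLine(UnicodeDigitToInt('\u0668')); // 8 (ARABIC-INDIC DIGIT EIGHT)
    }
}
```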
