CRC32 hex in .NET - C#

I'm trying to find a CRC32 computation algorithm that outputs its result as a positive 8-character hex string (like the CRC column in WinRAR, for example).
All the algorithms I've found return a positive/negative integer that I don't know how to handle...

I'll bet that all of the CRC32 algorithms you found return a 32-bit integer (e.g. int or uint). When you view it, you're probably viewing it as a base-10 number. By viewing I mean formatting the integer as a string without passing any format options to Int32.ToString or String.Format.
If you were to put Visual Studio into Hexadecimal View, you would get your "expected" output. The algorithms are all correct here; it's only your expectation of the default formatting that is off.
Instead, use a Number Format which produces the string representation you desire:
uint crc32 = Crc32Class.GetMeAValue(data);
// For example, write the output to the console in base-10 and base-16
Console.WriteLine("dec: {0}", crc32);
Console.WriteLine("hex: {0:X8}", crc32);

BigInteger.Parse() trouble reading in large numbers

Presently I am attempting to do this challenge (http://cryptopals.com/sets/1/challenges/1) and I am having some trouble completing the task in C#. I cannot seem to parse the number into a BigInteger.
So my code looks like this:
string output = "";
BigInteger hexValue = BigInteger.Parse("49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6");
output = Convert.ToBase64String(hexValue.ToByteArray());
Console.WriteLine(hexValue);
Console.WriteLine(output);
Console.ReadKey();
return "";
At present, the problem is that when I run the program it fails with the error
System.FormatException: 'The value could not be parsed.' and I am not entirely sure why.
So, what is the appropriate way to get a large integer from a string into a BigInt?
The initial problem
The BigInteger.Parse method expects the value to be decimal, not hex. You can "fix" that by passing in NumberStyles.HexNumber.
The bigger problem with using BigInteger for this
If you're just trying to convert a string of hex digits into bytes, I would avoid using BigInteger at all. For one thing, you could end up with problems if the original byte sequence starts with zeroes: the zeroes wouldn't be in the resulting byte array. (Sample input: "0001" - you want to get two bytes out, but you'll only get one, after persuading it to parse hex.)
Even if you don't lose any information, the byte[] you receive from BigInteger.ToByteArray() isn't what you were probably expecting. For example, consider this code, which just converts the data to byte[] and back to hex via BitConverter:
BigInteger bigInt = BigInteger.Parse("1234567890ABCDEF", NumberStyles.HexNumber);
byte[] bytes = bigInt.ToByteArray();
Console.WriteLine(BitConverter.ToString(bytes));
The output of that is "EF-CD-AB-90-78-56-34-12" - because BigInteger.ToByteArray returns the data in little-endian order:
The individual bytes in the array returned by this method appear in little-endian order. That is, the lower-order bytes of the value precede the higher-order bytes.
That's not what you want - because it means the last part of the original string is the first part of the byte array, etc.
Avoiding BigInteger altogether
Instead, parse the data directly into a byte array, as shown in many existing questions about converting hex strings to byte arrays. It's simple enough, with different options depending on whether you're optimizing for simple source code or for an efficient program.
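For example, a minimal sketch (the HexToBytes helper is just an illustration with a made-up name; it assumes an even number of hex digits, and on .NET 5 or later Convert.FromHexString does the same job in one call):
using System;

// Illustrative helper: two hex characters per byte.
static byte[] HexToBytes(string hex)
{
    byte[] result = new byte[hex.Length / 2];
    for (int i = 0; i < result.Length; i++)
    {
        // Convert.ToByte with fromBase 16 parses each two-character pair.
        result[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
    }
    return result;
}

Console.WriteLine(Convert.ToBase64String(HexToBytes("49276d20")));   // "SSdtIA==" for this sample prefix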
General advice on conversions
In general it's a good idea to avoid intermediate representations of data unless you're absolutely convinced that you won't lose information in the process - as you would here. It's fine to convert the hex string to a byte array before converting the result to base64, because that's not a lossy transformation.
So your conversions are:
String (hex) to BigInteger: lossy (in the context of leading 0s being significant, as they are in this situation)
BigInteger to byte[]: not lossy
byte[] to String (base64): not lossy
I'm recommending:
String (hex) to byte[]: not lossy (assuming you have an even number of nybbles to convert, which is generally a reasonable assumption)
byte[] to String (base64): not lossy
Use NumberStyles.HexNumber:
BigInteger.Parse("49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6",
NumberStyles.HexNumber,
CultureInfo.InvariantCulture);
If your number is supposed to be always positive, add a leading zero to your string.
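A tiny illustration of the leading-zero trick (the short strings are made-up samples):
BigInteger negative = BigInteger.Parse("F", NumberStyles.HexNumber);   // -1: the high bit of the first digit is treated as a sign bit
BigInteger positive = BigInteger.Parse("0F", NumberStyles.HexNumber);  // 15: the leading zero forces a positive value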
The problem is that the input is not decimal but hexadecimal, therefore you need to pass an additional parameter for parsing:
BigInteger number = BigInteger.Parse(
hexString,
NumberStyles.AllowHexSpecifier);

BitConverter.DoubleToInt64Bits equivalent in python

I figure the best way to really get a feel for double-precision numbers is to play around with them a bit, and one of the things I want to do is look at their (almost) binary representation. For this, in C#, the function BitConverter.DoubleToInt64Bits is very useful, as it (after converting to hexadecimal) lets me look at what the "real" nature of the floating-point number is.
The problem is that I can't seem to find the equivalent function in Python. Is there a way to do the same thing as BitConverter.DoubleToInt64Bits in Python?
Thank you.
EDIT:
An answer below suggested using binascii.hexlify(struct.pack('d', 123.456)) to convert the double into a hexadecimal representation, but I am still getting strange results.
For example,
binascii.hexlify(struct.pack('d', 123.456))
does indeed return '77be9f1a2fdd5e40' but if I run the code that should be equivalent in C#, i.e.
BitConverter.DoubleToInt64Bits(123.456).ToString("X")
I get a completely different number: "405EDD2F1A9FBE77". Where have I made my mistake?
How about using struct.pack and binascii.hexlify?
>>> import binascii
>>> import struct
>>> struct.pack('d', 0.0)
'\x00\x00\x00\x00\x00\x00\x00\x00'
>>> binascii.hexlify(struct.pack('d', 0.0))
'0000000000000000'
>>> binascii.hexlify(struct.pack('d', 1.0))
'000000000000f03f'
>>> binascii.hexlify(struct.pack('d', 123.456))
'77be9f1a2fdd5e40'
The struct format character d represents the C double type (8 bytes = 64 bits). For other formats, see Format characters.
UPDATE
By specifying @, =, <, >, or ! as the first character of the format, you can indicate the byte order. (Byte Order, Size, and Alignment)
>>> binascii.hexlify(struct.pack('<d', 123.456)) # little-endian
'77be9f1a2fdd5e40'
>>> binascii.hexlify(struct.pack('>d', 123.456)) # big-endian
'405edd2f1a9fbe77'
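For comparison, the same experiment on the C# side (a small sketch of my own; it assumes a little-endian machine, which is the common case) shows the two views of the same 64 bits:
double value = 123.456;

// Memory order of the eight bytes (little-endian on typical hardware): 77-BE-9F-1A-2F-DD-5E-40
Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(value)));

// The same bits as a single Int64, printed most-significant digit first: 405EDD2F1A9FBE77
Console.WriteLine(BitConverter.DoubleToInt64Bits(value).ToString("X16"));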

Int64 seems too short in C#

I'm trying to write the largest Int64 value to the command line. I tried using 0x1111111111111111, which is 16 ones, and Visual Studio says that is an Int64. I would have assumed that would be an Int16. What am I missing here?
0x is the prefix for hexadecimal, not binary, literals. This means that the binary representation of your number is 0001000100010001000100010001000100010001000100010001000100010001.
There were no binary literals in C# before C# 7.0 (which added the 0b prefix), so you either have to work out the equivalent hexadecimal value yourself or use the Convert class, for example:
short s = Convert.ToInt16("1111111111111111", 2); // "2" for binary
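With C# 7.0 or later there is also a binary literal syntax, so the same value can be written directly (the unchecked cast is needed because the literal itself is an int holding 65535):
short s = unchecked((short)0b1111_1111_1111_1111);   // -1, the same value as the Convert call above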
In order to just get the largest Int64 number, you don't need to perform any calculations of your own, as it is already available for you in this field:
Int64.MaxValue
The literal 0x1111111111111111 is a hexadecimal number. Each hexadecimal digit can be represented using four bits so with 16 hexadecimal digits you need 4*16 = 64 bits. You probably intended to write the binary number 1111111111111111. You can convert from a binary literal string to an integer using the following code:
Convert.ToInt16("1111111111111111", 2)
This will return the desired number (-1).
To get the largest Int64 you can use Int64.MaxValue (0x7FFFFFFFFFFFFFFF) or if you really want the unsigned value you can use UInt64.MaxValue (0xFFFFFFFFFFFFFFFF).
The largest Int64 value is Int64.MaxValue. To print this in hex, try:
Console.WriteLine(Int64.MaxValue.ToString("X"));

Hash function to obtain a limited length result

I need to hash a number (about 22 digits) and the result length must be less than 12 characters. It can be a number or a mix of characters, and must be unique. (The number entered will be unique too).
For example, if the number entered is 000000000000000000001, the result should be something like 2s5As5A62s.
I looked at the typical algorithms, like MD5, SHA-1, etc., but they produce results that are too long.
The problem with your question is that the input space is larger than the output space. If you're expecting a unique output, it won't happen. The reason is that if you have an input space of, say, 22 numeric digits (10^22 possibilities) and an output space of 11 hexadecimal digits (16^11 possibilities), you end up with more input possibilities than output possibilities.
A quick pigeonhole calculation shows that you would need an output space of at least 19 hexadecimal digits and a perfect one-to-one function; otherwise you will have collisions pretty often (more than 50% of the time). I assume this is something you do not want, but you did not specify.
Since what you want cannot be done, I would suggest rethinking your design or using a checksum such as the cyclic redundancy check (CRC). CRC-64 will produce a 64-bit output and, when encoded with any Base64 algorithm, will give you something along the lines of what you want. This does not provide cryptographic strength like SHA-1, so it should never be used in anything related to information security.
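A sketch of that CRC-64 idea, assuming the System.IO.Hashing NuGet package (the package choice and the sample input are my assumptions):
using System;
using System.IO.Hashing;
using System.Text;

string input = "1234567890123456789012";                           // hypothetical 22-digit input
byte[] checksum = Crc64.Hash(Encoding.ASCII.GetBytes(input));      // 8 bytes

// 8 bytes encode to 11 Base64 characters plus one '=' of padding.
Console.WriteLine(Convert.ToBase64String(checksum).TrimEnd('='));  // 11 characters, not guaranteed unique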
However, if you were able to change your criteria to allow for long hash outputs, then I would strongly suggest you look at SHA-512, as it will provide high quality outputs with an extremely low chance of duplication. By a low chance I mean that no two inputs have yet been found to equal the same hash in the history of the algorithm.
If neither of these suggestions works for you, then your last alternative is probably just Base64-encoding the input data. It essentially uses the standard English alphabet in the best way possible to represent your data, reducing the number of characters as much as possible while retaining a complete representation of the input. This is not a hash function, but simply a method for encoding binary data.
Why not take MD5 or SHA-N, re-encode it as Base64 (or base-whatever), and keep only 12 characters of it?
NB: in any case the hash will never be unique (but it can offer a low collision probability).
You can't use a hash if it has to be unique.
You need about 74 bits to store such a number. If you convert it to Base64 it will be about 12-13 characters.
Can you elaborate on what your requirement is for the hashing? Do you need to make sure the result is diverse? (i.e. not 1 = a, 2 = b)
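To illustrate the size estimate above (the sample value is made up; this is plain BigInteger-to-Base64 encoding, not a hash):
using System;
using System.Numerics;

BigInteger n = BigInteger.Parse("1234567890123456789012");   // 22 decimal digits
byte[] bytes = n.ToByteArray();                               // about 9-10 bytes for numbers this size
Console.WriteLine(Convert.ToBase64String(bytes));             // roughly 12-16 characters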
Just thinking out loud, and a little laterally, but could you not apply run-length encoding to your number, treating it as data you want to compress? You could then use the Base64 form of the compressed result.

converting a string of binary into a string of decimal c#

The problem is to convert a string of binary digits into its decimal representation. Easy, eh?
Well, it needs to be able to handle input of over 64 bits in length and convert it without using any external libraries or data types like BigInteger.
How can I do this?
So far I have a string called input which holds the binary digits.
I then access each digit using input[0] etc. to get a char representing that digit.
Now I manipulate it, multiply by the corresponding power of 2 that its index represents, and move through the array storing the total as I go.
I use a BigInteger to store the total, as for large numbers the primitive types don't work.
My first solution works perfectly; how can I do this without using BigInteger to store the total, i.e. only using strings to store answers?
Any ideas?
Thanks
You will need an array of digits to hold the result. An array of ints would be easiest, but you can also use the final string. You can calculate its length from the length of the input string; you may have to remove leading zeros at the end.
Calculate your result as before, but do the adding (including the carrying) in the result array.
I think this is a (homework) assignment that wants you to implement a version of the "pen & paper" addition method.
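A minimal sketch of that pen-and-paper approach (my own illustration, with a hypothetical helper name; it keeps the decimal digits in an int list and turns them into a string at the end):
using System;
using System.Collections.Generic;

// Converts a binary string of arbitrary length to its decimal representation
// using only digit-by-digit arithmetic (no BigInteger).
static string BinaryToDecimal(string binary)
{
    var digits = new List<int> { 0 };   // decimal digits, least significant first

    foreach (char bit in binary)
    {
        int carry = bit - '0';          // seed the carry with the incoming bit

        // total = total * 2 + bit, carried out one decimal digit at a time
        for (int i = 0; i < digits.Count; i++)
        {
            int value = digits[i] * 2 + carry;
            digits[i] = value % 10;
            carry = value / 10;
        }
        if (carry > 0)
            digits.Add(carry);
    }

    var chars = new char[digits.Count];
    for (int i = 0; i < digits.Count; i++)
        chars[i] = (char)('0' + digits[digits.Count - 1 - i]);   // most significant digit first
    return new string(chars);
}

// 65 ones would overflow every primitive integer type:
Console.WriteLine(BinaryToDecimal(new string('1', 65)));   // 36893488147419103231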
