I'm using xxHash for C# to hash a value for consistency.
ComputeHash returns a byte[], but I need to store the results in a long.
I'm able to convert the results into an int32 using the BitConverter. Here is what I've tried:
var xxHash = new System.Data.HashFunction.xxHash();
byte[] hashedValue = xxHash.ComputeHash(Encoding.UTF8.GetBytes(valueItem));
long value = BitConverter.ToInt64(hashedValue, 0);
When I use int this works fine, but when I change to ToInt64 it fails.
Here's the exception I get:
Destination array is not long enough to copy all the items in the collection. Check array index and length.
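When you construct your xxHash object, you need to supply a hash size:
var hasher = new xxHash(64);
Valid hash sizes are 32 and 64. A 32-bit hash is only 4 bytes, which is why ToInt32 works and ToInt64 fails; pass 64 to get the 8 bytes that BitConverter.ToInt64 needs.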
See https://github.com/brandondahler/Data.HashFunction/blob/master/src/xxHash/xxHash.cs for the source.
Adding a new answer because the current implementation of xxHash from Brandon Dahler uses a hashing factory, where you initialize the factory with a configuration containing the hash size and seed:
using System.Data.HashFunction.xxHash;
//can also set seed here, (ulong) Seed=234567
xxHashConfig config = new xxHashConfig() { HashSizeInBits = 64 };
var factory = xxHashFactory.Instance.Create(config);
byte[] hashedValue = factory.ComputeHash(Encoding.UTF8.GetBytes(valueItem)).Hash;
BitConverter.ToInt64 expects hashedValue to have 8 bytes (= 64 bits). You could manually pad the array to 8 bytes and then pass it.
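A minimal sketch of that padding approach, assuming you want a 4-byte hash widened with zero bytes. HashToLong and ToInt64Padded are hypothetical names, not part of Data.HashFunction:

```csharp
using System;

// Hypothetical helper, not part of Data.HashFunction.
public static class HashToLong
{
    // Widens a hash of up to 8 bytes to a long by padding it with zero bytes.
    public static long ToInt64Padded(byte[] hash)
    {
        if (hash == null) throw new ArgumentNullException(nameof(hash));
        if (hash.Length > 8) throw new ArgumentException("Hash is longer than 8 bytes.", nameof(hash));

        // Copy into a zero-filled 8-byte buffer; the unused trailing bytes stay 0.
        // On little-endian platforms those are the high-order bytes, so a 4-byte
        // hash keeps its unsigned 32-bit value.
        var padded = new byte[8];
        Array.Copy(hash, padded, hash.Length);
        return BitConverter.ToInt64(padded, 0);
    }
}
```

With a 64-bit hash the helper is equivalent to calling BitConverter.ToInt64 directly, so requesting a 64-bit hash is still the better fix if you want all 64 bits to carry information.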
Why does this return a hash size of 512 bits ...
var text = "Hello World";
var buffer = Encoding.UTF8.GetBytes(text);
var hmac = new System.Security.Cryptography.HMACSHA512();
hmac.Key = GetRandomBits(512);
hmac.ComputeHash(buffer);
Assert.That(hmac.HashSize, Is.EqualTo(512));
... and this a hash size of 160 bits?
var text = "Hello World";
var buffer = Encoding.UTF8.GetBytes(text);
var hmac = System.Security.Cryptography.HMACSHA512.Create();
hmac.Key = GetRandomBits(512);
hmac.ComputeHash(buffer);
Assert.That(hmac.HashSize, Is.EqualTo(512)); // failure
The constructor and the factory are both related to HMACSHA512, so I assumed the same output.
There is no HMACSHA512.Create(). You're actually calling HMAC.Create() (because the language allows writing calls to static methods off of derived types).
So you're just getting "an HMAC", which seems to be HMACSHA1.
It looks to me like the Create factory method is not doing HMACSHA512 when used in this way.
The documentation breaks it down for us.
Return Value
Type: System.Security.Cryptography.HMAC
A new SHA-1 instance, unless the default settings have been changed by using the <cryptoClass> element.
So it looks like the reason they are different in size is because the Create Method is returning a SHA-1 instance instead of the HMACSHA512 instance as you expected.
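To get the 512-bit result reliably, constructing HMACSHA512 directly avoids the inherited factory method altogether. A minimal sketch, where HmacSizeDemo and its key parameter are placeholders for illustration (the question's GetRandomBits helper is not reproduced here):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical demo class for illustration only.
public static class HmacSizeDemo
{
    // Builds an HMACSHA512 through its constructor, hashes the text,
    // and returns the hash size in bits that the instance reports.
    public static int HashSizeInBits(byte[] key, string text)
    {
        using (var hmac = new HMACSHA512(key))
        {
            hmac.ComputeHash(Encoding.UTF8.GetBytes(text));
            return hmac.HashSize;
        }
    }

    // Length of the produced hash in bytes (HashSize is in bits).
    public static int HashLengthInBytes(byte[] key, string text)
    {
        using (var hmac = new HMACSHA512(key))
        {
            return hmac.ComputeHash(Encoding.UTF8.GetBytes(text)).Length;
        }
    }
}
```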
I use the SHA512Managed class to hash the user's password string. I initially create an etalon (reference) string in the following way:
Convert the password string (for example "Johnson_#1") to a byte array;
Get the hash value of this byte array using the SHA512Managed.ComputeHash method. As you know, the hash value returned by SHA512Managed.ComputeHash(byte[]) is a byte array too.
Then (in the program loop) I convert this hash byte array to a string in the following way:
System.Text.StringBuilder sBuilder = new System.Text.StringBuilder();
for (int i = 0; i < passwordCache.Length; i++)
{
sBuilder.Append(passwordCache[i].ToString("x2"));
}
string passwordCacheString = sBuilder.ToString();
where passwordCache is the hash byte array and passwordCacheString is the resulting string.
Finally, I store the resulting string in an MS SQL Server database table as the etalon string.
The problem is the following: if I periodically call the SHA512Managed.ComputeHash(byte[]) method and each time pass it the same byte array as input (for example, one obtained from the "Johnson_#1" string), the content of the returned hash byte array differs from call to call.
So, if I convert such a hash byte array to a string (as shown above) and compare it to the etalon string in the database table, the two strings differ even though the same string ("Johnson_#1") underlies both.
To define the question better:
My question is: is there a way to determine that two SHA512Managed hash byte arrays with different content were created from the same string? Your help will be highly appreciated.
As xanatos mentioned in his comments, hash functions must be deterministic.
That is for the same input, you'll get the same hash output.
Try it for yourself:
SHA512Managed sha512Managed = new SHA512Managed();
for (int i = 0; i < 1000; i++) {
var input = Guid.NewGuid().ToString();
byte[] data = sha512Managed.ComputeHash(Encoding.UTF8.GetBytes(input));
byte[] data2 = sha512Managed.ComputeHash(Encoding.UTF8.GetBytes(input));
if (!data.SequenceEqual(data2)) { // compare the raw bytes (needs using System.Linq); decoding a hash as UTF-8 is lossy
throw new InvalidOperationException("Hash functions as we know them are useless");
}
}
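The same check can be made on the hex strings that actually get stored. The sketch below assumes UTF-8 for turning the password into bytes, which the question does not specify; PasswordHashDemo and Sha512Hex are hypothetical names:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only.
public static class PasswordHashDemo
{
    // Hashes the text with SHA-512 and returns lowercase hex,
    // formatted the same way as the "x2" loop in the question.
    public static string Sha512Hex(string text)
    {
        using (var sha = SHA512.Create())
        {
            // Assumption: UTF-8. Whatever encoding is used must be the same
            // when the etalon string is created and when it is checked.
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(text));
            var sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }
    }
}
```

If two calls with the same string ever produce different hex, the input bytes were probably not the same; a different encoding or stray whitespace would be the first things to check.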
I am currently working with the .NET port of BouncyCastle and I am having some trouble converting a big integer into a System.Guid using the native .NET BigInteger.
For some context, I am using BouncyCastle in one ("source") application to convert a System.Guid to a Org.BouncyCastle.Math.BigInteger. This value is then saved as a string in the format 3A2B847A960F0E4A8D49BD62DDB6EB38.
Then, in another ("destination") application, I am taking this saved string value of a BigInteger and am trying to convert it back into a System.Guid.
Please note that in the destination application I do not want BouncyCastle as a reference and would like to use core .NET libraries for the conversion. However, since I am running into problems converting with core .NET classes, I am using BouncyCastle and the following code does exactly what I would like:
var data = "3A2B847A960F0E4A8D49BD62DDB6EB38";
var integer = new Org.BouncyCastle.Math.BigInteger( data, 16 );
var bytes = integer.ToByteArrayUnsigned();
var guid = new Guid( bytes ); // holds expected value: (7A842B3A-0F96-4A0E-8D49-BD62DDB6EB38)
As you can see, there is a ToByteArrayUnsigned method on the Org.BouncyCastle.Math.BigInteger that makes this work. If I use the ToByteArray on the System.Numerics.BigInteger (even when resizing the array as discussed in this question) it does not work and I get a different System.Guid than expected.
So, what is the best way to perform the equivalent to the above operation using native .NET classes?
Solution
Thanks to #John-Tasler's suggestion, it turns out this was due to endianness... darn you, endianness... will your endianness ever end? :P
var parsed = System.Numerics.BigInteger.Parse( "3A2B847A960F0E4A8D49BD62DDB6EB38", NumberStyles.HexNumber ).ToByteArray();
Array.Resize( ref parsed, 16 );
var guid = new Guid( parsed.Reverse().ToArray() ); // Hoorrrrayyyy!
What's the actual value of the resulting Guid when using .NET's BigInteger?
It could be that the two implementations are just storing the bytes differently: endianness, etc.
To see what comes back from each implementation's array of bytes:
var sb = new StringBuilder();
foreach (var b in bytes)
{
sb.AppendFormat("{0:X2} ", b);
}
sb.ToString();
It'd be an interesting comparison.
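For the value in the question, the comparison could be sketched as below. GuidHexDemo is a hypothetical helper; the leading "0" passed to Parse is an addition of this sketch, so that a first hex digit of 8 to F is not read as a sign bit:

```csharp
using System;
using System.Globalization;
using System.Numerics;

// Hypothetical helper, not part of .NET or BouncyCastle.
public static class GuidHexDemo
{
    // Bytes as System.Numerics.BigInteger returns them: least significant byte first.
    public static string BigIntegerByteDump(string hex)
    {
        byte[] bytes = BigInteger.Parse(hex, NumberStyles.HexNumber).ToByteArray();
        return BitConverter.ToString(bytes);
    }

    // Rebuilds the Guid by restoring the byte order of the hex string.
    public static Guid ToGuid(string hex)
    {
        // The leading "0" forces a non-negative value.
        byte[] bytes = BigInteger.Parse("0" + hex, NumberStyles.HexNumber).ToByteArray();
        Array.Resize(ref bytes, 16); // pads short values, drops a trailing sign byte
        Array.Reverse(bytes);        // little-endian back to the order of the hex string
        return new Guid(bytes);
    }
}
```

The dump comes back as 38-EB-...-2B-3A, the reverse of the order in the hex string, which is why the byte array has to be reversed before it goes into the Guid constructor.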
I would avoid BigInteger here entirely. Your data bytes are already in the correct order so you can convert to byte[] directly.
var data = "3A2B847A960F0E4A8D49BD62DDB6EB38";
var bytes = new byte[16];
for (var i = 0; i < 16; i++)
bytes[i] = byte.Parse(data.Substring(i * 2, 2), NumberStyles.HexNumber);
var guid = new Guid(bytes); // 7a842b3a-0f96-4a0e-8d49-bd62ddb6eb38
I am computing the MD5 hash of files to check whether they are identical, so I wrote the following:
private static byte[] GetMD5(string p)
{
FileStream fs = new FileStream(p, FileMode.Open);
HashAlgorithm alg = new HMACMD5();
byte[] hashValue = alg.ComputeHash(fs);
fs.Close();
return hashValue;
}
and to test it, I first called it like this:
var v1 = GetMD5("C:\\test.mp4");
var v2 = GetMD5("C:\\test.mp4");
In the debugger I inspected the values of v1 and v2, and they are different! Why is that?
It's because you're using HMACMD5, a keyed hashing algorithm, which combines a key with the input to produce a hash value. When you create an HMACMD5 via its default constructor, it will use a random key each time, therefore the hashes will always be different.
You need to use MD5:
private static byte[] GetMD5(string p)
{
using(var fs = new FileStream(p, FileMode.Open))
{
using(var alg = new MD5CryptoServiceProvider())
{
return alg.ComputeHash(fs);
}
}
}
I've changed the code to use usings as well.
From the HMACMD5 constructor doc:
HMACMD5 is a type of keyed hash algorithm that is constructed from the
MD5 hash function and used as a Hash-based Message Authentication Code
(HMAC). The HMAC process mixes a secret key with the message data,
hashes the result with the hash function, mixes that hash value with
the secret key again, and then applies the hash function a second
time. The output hash will be 128 bits in length.
With this constructor, a 64-byte, randomly generated key is used.
(Emphasis mine)
With every call to GetMD5(), you're generating a new random key.
You might want to use System.Security.Cryptography.MD5Cng
My guess is that you did something like:
Console.WriteLine(v1);
Console.WriteLine(v2);
or
Console.WriteLine(v1 == v2);
That just shows that the variable values refer to distinct arrays - it doesn't say anything about the values within those arrays.
Instead, try this (to print out the hex):
Console.WriteLine(BitConverter.ToString(v1));
Console.WriteLine(BitConverter.ToString(v2));
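If you want the comparison in code rather than by eye, compare the contents instead of the references. A small sketch; HashCompare and SameContent are hypothetical names:

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only.
public static class HashCompare
{
    // True when both arrays hold the same bytes in the same order.
    // (v1 == v2 only tells you whether both variables refer to the same array.)
    public static bool SameContent(byte[] a, byte[] b)
    {
        if (a == null || b == null)
        {
            return ReferenceEquals(a, b); // equal only if both are null
        }
        return a.SequenceEqual(b);
    }
}
```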
Use the BitConverter.ToString() method to see the values in the byte array; calling ToString() on the array itself only prints the type name.
I have run into a weird problem with the following code. The loop below is supposed to stop after one iteration, but it just keeps going. However, if I remove the last "result_bytes = md5.ComputeHash(orig_bytes);" then it works. Has anyone faced a similar problem before?
MD5 md5;
byte[] orig_bytes;
byte[] result_bytes;
Dictionary<byte[], string> hashes = new Dictionary<byte[], string>();
string input = "NEW YORK";
result_bytes = UnicodeEncoding.Default.GetBytes("HELLO");
while (!hashes.ContainsKey(result_bytes))
{
md5 = new MD5CryptoServiceProvider();
orig_bytes = UnicodeEncoding.Default.GetBytes(input);
result_bytes = md5.ComputeHash(orig_bytes);
hashes.Add(result_bytes, input);
Console.WriteLine(BitConverter.ToString(result_bytes));
Console.WriteLine(hashes.ContainsKey(result_bytes));
result_bytes = md5.ComputeHash(orig_bytes);
}
When you reassign result_bytes to a new value in the last line, you have a new reference to a byte array, which is not equal to the one in the collection, therefore hashes.ContainsKey returns false.
You're assuming that byte arrays override Equals and GetHashCode to compare for equality: they don't. They just use the default identity test - so without the extra assignment at the end, you're just checking whether the exact key object you've just added is still in the dictionary - which of course it is.
One way round this would be to store a reversible string representation of the hash (e.g. using base64), instead of the hash itself. Or write your own implementation of IEqualityComparer<byte[]> and pass that to the Dictionary constructor, so that it uses that implementation to find the hash code of byte arrays and compare them with each other.
In short: this has nothing to do with MD5, and everything to do with the fact that
Console.WriteLine(new byte[0].Equals(new byte[0]));
will print False :)
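A minimal sketch of the comparer option mentioned above. ByteArrayComparer is a name made up for this example, and the hash-code formula is just one simple choice:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer that treats byte arrays with equal content as equal keys.
public sealed class ByteArrayComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Length != y.Length) return false;
        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] != y[i]) return false;
        }
        return true;
    }

    public int GetHashCode(byte[] obj)
    {
        if (obj == null) throw new ArgumentNullException(nameof(obj));
        unchecked
        {
            // Simple multiplicative hash over all bytes; equal content gives equal codes.
            int hash = 17;
            foreach (byte b in obj)
            {
                hash = hash * 31 + b;
            }
            return hash;
        }
    }
}
```

Passing it to the constructor, as in new Dictionary<byte[], string>(new ByteArrayComparer()), makes ContainsKey return true for a freshly computed hash with the same bytes, so the loop in the question would stop after the first iteration.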