I've run into a curious behaviour when trying to hash a string password and then display the hash in the console.
My code is:
static void Main(string[] args)
{
    string password = "password";
    ConvertPasswordToHash(password);
}

private static void ConvertPasswordToHash(string password)
{
    using (HashAlgorithm sha = SHA256.Create())
    {
        byte[] result = sha.ComputeHash(Encoding.UTF8.GetBytes(password));
        string hashText = Encoding.UTF8.GetString(result);
        Console.WriteLine(hashText);

        StringBuilder sb = new StringBuilder();
        foreach (var item in result)
        {
            sb.Append((char)item);
        }
        Console.WriteLine(sb);
    }
}
The problem is twofold:
1) hashText and sb contain different values (both are 32 characters long before outputting), and 2) the console outputs are even stranger: they are not 32 characters in length, and the two outputs differ slightly:
When examining the strings before outputting them, I noticed that hashText contains, for instance, \u0004, which could be a Unicode character of some sort, while sb does not contain that at all (that is, before the values are output to the console).
My questions are:
Which way is the correct way of getting a string of chars from the provided array of bytes?
Why are the console outputs different, but only slightly? It does not look like it is the fault of using the wrong Encoding.
How do I output the correct hash (32 symbols) to the console? I've tried adding '#' before the strings to cancel any possible carriage returns etc., pretty much without any result.
Maybe I am missing something obvious. Thank you.
The correct logic should be as follows:
private static void ConvertPasswordToHash(string password)
{
    using (HashAlgorithm sha = SHA256.Create())
    {
        byte[] result = sha.ComputeHash(Encoding.UTF8.GetBytes(password));
        StringBuilder sb = new StringBuilder();
        foreach (var item in result)
        {
            sb.Append(item.ToString("x2"));
        }
        Console.WriteLine(sb);
    }
}
ToString("x2") formats the string as two hexadecimal characters.
Live example: https://dotnetfiddle.net/QkREkX
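As a side note, if you target .NET 5 or later (an assumption about your framework), the StringBuilder loop can be replaced by a single call to Convert.ToHexString:

byte[] result = sha.ComputeHash(Encoding.UTF8.GetBytes(password));
// Convert.ToHexString is available from .NET 5 onward and returns uppercase hex;
// append ToLowerInvariant() to match the lowercase output of the loop above.
Console.WriteLine(Convert.ToHexString(result).ToLowerInvariant());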
Another way is just to represent your byte[] array as a Base64 string; no StringBuilder required.
byte[] result = sha.ComputeHash(Encoding.UTF8.GetBytes(password));
Console.WriteLine(Convert.ToBase64String(result));
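Since Base64 encodes every 3 bytes as 4 characters, the 32-byte SHA-256 hash comes out as a 44-character string (including padding), and it round-trips cleanly:

string base64 = Convert.ToBase64String(result);          // 44 characters for 32 bytes
byte[] roundTripped = Convert.FromBase64String(base64);  // recovers the original 32 bytes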
Related
Below are two similar code blocks. They take a string, hash it with SHA512, then convert the result to Base64. I had trouble getting the second code block to produce the same results as my manual test using online calculators and encoders. So I broke the process down step by step and discovered that it could produce the same results as my manual test, but only if it behaved like the first code block. Why do these two code blocks produce different results? Thanks!
private void EditText_AfterTextChanged(object sender, AfterTextChangedEventArgs e)
{
    // This builds the string to hash.
    string domain = txtDomain.Text;
    string username = txtUsername.Text;
    string pin = txtPin.Text;
    txtPreview.Text = string.Format("{0}+{1}+{2}", domain, username, pin);

    // This takes the above string and hashes it.
    StringBuilder Sb = new StringBuilder();
    SHA512Managed HashTool = new SHA512Managed();
    Byte[] PhraseAsByte = System.Text.Encoding.UTF8.GetBytes(string.Concat(txtPreview.Text));
    Byte[] EncryptedBytes = HashTool.ComputeHash(PhraseAsByte);
    HashTool.Clear();

    // This rebuilds the calculated hash as hex for manual comparison.
    foreach (Byte b in EncryptedBytes)
        Sb.Append(b.ToString("x2"));
    txtHash.Text = Sb.ToString();

    // This takes the rebuilt hex string and re-converts it to bytes before encoding it in Base64.
    EncryptedBytes = System.Text.Encoding.UTF8.GetBytes(string.Concat(txtHash.Text));
    txtResult.Text = Convert.ToBase64String(EncryptedBytes);
}
and
private void EditText_AfterTextChanged(object sender, AfterTextChangedEventArgs e)
{
    // This builds the string to hash.
    string domain = txtDomain.Text;
    string username = txtUsername.Text;
    string pin = txtPin.Text;
    txtPreview.Text = string.Format("{0}+{1}+{2}", domain, username, pin);

    // This takes the above string and hashes it.
    StringBuilder Sb = new StringBuilder();
    SHA512Managed HashTool = new SHA512Managed();
    Byte[] PhraseAsByte = System.Text.Encoding.UTF8.GetBytes(string.Concat(txtPreview.Text));
    Byte[] EncryptedBytes = HashTool.ComputeHash(PhraseAsByte);
    HashTool.Clear();

    // This takes the raw hash bytes and converts them directly to Base64.
    txtResult.Text = Convert.ToBase64String(EncryptedBytes);

    // This renders the hash as hex for manual comparison.
    foreach (Byte b in EncryptedBytes)
        Sb.Append(b.ToString("x2"));
    txtHash.Text = Sb.ToString();
}
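The two blocks diverge only in the final step: the first Base64-encodes the UTF-8 bytes of the 128-character hex string, while the second Base64-encodes the 64 raw hash bytes. A minimal standalone sketch of the two pipelines (the sample input "domain+username+pin" is made up):

using System;
using System.Security.Cryptography;
using System.Text;

class Base64Comparison
{
    static void Main()
    {
        byte[] hash;
        using (var sha = SHA512.Create())
            hash = sha.ComputeHash(Encoding.UTF8.GetBytes("domain+username+pin"));

        // Rebuild the lowercase hex string the way both blocks do (128 characters for SHA-512).
        var sb = new StringBuilder();
        foreach (byte b in hash)
            sb.Append(b.ToString("x2"));
        string hex = sb.ToString();

        // First block: Base64 of the UTF-8 bytes of the hex string -> 172 characters.
        Console.WriteLine(Convert.ToBase64String(Encoding.UTF8.GetBytes(hex)));

        // Second block: Base64 of the 64 raw hash bytes -> 88 characters.
        Console.WriteLine(Convert.ToBase64String(hash));
    }
}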
Found the answer, no thanks to your less-than-useful downvotes.
Encoding.Unicode is Microsoft's misleading name for UTF-16 (a double-wide encoding, used in the Windows world for historical reasons but not used by anyone else). http://msdn.microsoft.com/en-us/library/system.text.encoding.unicode.aspx
If you inspect your bytes array, you'll see that every second byte is 0x00 (because of the double-wide encoding).
You should be using Encoding.UTF8.GetBytes instead.
But also, you will see different results depending on whether or not you consider the terminating '\0' byte to be part of the data you're hashing. Hashing the two bytes "Hi" will give a different result from hashing the three bytes "Hi". You'll have to decide which you want to do. (Presumably you want to do whichever one your friend's PHP code is doing.)
For ASCII text, Encoding.UTF8 will definitely be suitable. If you're aiming for perfect compatibility with your friend's code, even on non-ASCII inputs, you'd better try a few test cases with non-ASCII characters such as é and 家 and see whether your results still match up. If not, you'll have to figure out what encoding your friend is really using; it might be one of the 8-bit "code pages" that used to be popular before the invention of Unicode. (Again, I think Windows is the main reason that anyone still needs to worry about "code pages".)
Source: Hashing a string with Sha256
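To make the point above concrete, here is a small sketch comparing the bytes the two encodings produce for the same ASCII string:

using System;
using System.Text;

class EncodingComparison
{
    static void Main()
    {
        // UTF-16 little-endian (Encoding.Unicode): every ASCII character is followed by 0x00.
        Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes("Hi"))); // 48-00-69-00

        // UTF-8: plain ASCII characters are single bytes.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes("Hi")));    // 48-69
    }
}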
I have a string that I need to hash in order to access an API. The API creator has provided a code snippet in Python, which hashes the string like this:
hashed_string = hashlib.sha1(string_to_hash).hexdigest()
When using this hashed string to access the API, everything is fine. I have tried to get the same hashed string result in C#, but without success. I have tried a great many approaches, but nothing has worked so far. I am aware of the hexdigest part as well, and I have kept that in mind when trying to mimic the behaviour.
Does anyone know how to get the same result in C#?
EDIT:
This is one of the many ways I have tried to reproduce the same result in C#:
public string Hash(string input)
{
    using (SHA1Managed sha1 = new SHA1Managed())
    {
        var hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(input));
        var sb = new StringBuilder(hash.Length * 2);
        foreach (byte b in hash)
        {
            sb.Append(b.ToString("X2"));
        }
        return sb.ToString().ToLower();
    }
}
This code is taken from: Hashing with SHA1 Algorithm in C#
Another way
public string ToHexString(string myString)
{
    // Note: the parameterless HMACSHA1 constructor generates a random key,
    // so this computes a keyed HMAC-SHA1 (different on every run), not plain SHA-1.
    HMACSHA1 hmSha1 = new HMACSHA1();
    Byte[] hashMe = new ASCIIEncoding().GetBytes(myString);
    Byte[] hmBytes = hmSha1.ComputeHash(hashMe);
    StringBuilder hex = new StringBuilder(hmBytes.Length * 2);
    foreach (byte b in hmBytes)
    {
        hex.AppendFormat("{0:x2}", b);
    }
    return hex.ToString();
}
This code is taken from: Python hmac and C# hmac
EDIT 2
Some input/output:
C# (using second method provided in above description)
input: callerId1495610997apiKey3*_&E#N#B1)O)-1Y
output: 1ecded2b66e152f0965adb96727d96b8f5db588a
Python
input: callerId1495610997apiKey3*_&E#N#B1)O)-1Y
output: bf11a12bbac84737a39152048e299fa54710d24e
C# (using first method provided in above description)
input: callerId1495611935apiKey{[B{+%P)s;WD5&5x
output: 7e81e0d40ff83faf1173394930443654a2b39cb3
Python
input: callerId1495611935apiKey{[B{+%P)s;WD5&5x
output: 512158bbdbc78b1f25f67e963fefdc8b6cbcd741
C#:
public static string Hash(string input)
{
    using (SHA1Managed sha1 = new SHA1Managed())
    {
        var hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(input));
        var sb = new StringBuilder(hash.Length * 2);
        foreach (byte b in hash)
        {
            sb.Append(b.ToString("x2")); // "x2" is lowercase hex
        }
        return sb.ToString().ToLower();
    }
}

public static void Main()
{
    var x = "callerId1495611935apiKey{[B{+%P)s;WD5&5x";
    Console.WriteLine(Hash(x)); // prints 7e81e0d40ff83faf1173394930443654a2b39cb3
}
Python
import hashlib

s = u'callerId1495611935apiKey{[B{+%P)s;WD5&5x'
enc = s.encode('utf-8')  # encode as UTF-8
hash = hashlib.sha1(enc)
formatted = hash.hexdigest()
print(formatted)  # prints 7e81e0d40ff83faf1173394930443654a2b39cb3
Your main problem is that you are using different encodings for the same string in C# and Python. Use UTF-8 in both languages and use the same casing, and the output is the same.
Note that inside your input string (between callerId1495611935 and apiKey{[B{+%P)s;WD5&5x) there is a hidden \u200b character. That's why encoding your string in UTF-8 gives a different result than encoding it using ASCII. Does that character have to be inside your string?
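A small sketch of why the hidden character matters: ASCII cannot represent U+200B and substitutes '?', while UTF-8 encodes it as three bytes, so the two encodings feed different input to the hash:

using System;
using System.Text;

class HiddenCharacter
{
    static void Main()
    {
        string s = "a\u200bb"; // ZERO WIDTH SPACE between 'a' and 'b'

        // UTF-8 encodes U+200B as the three bytes E2 80 8B.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(s)));  // 61-E2-80-8B-62

        // ASCII cannot represent U+200B and substitutes '?' (0x3F).
        Console.WriteLine(BitConverter.ToString(Encoding.ASCII.GetBytes(s))); // 61-3F-62
    }
}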
My application has an auto-update feature. To verify that it downloaded the file successfully, I compare two hashes: one from the xml and one generated after downloading. The two hashes look the same, but the comparison says they are not. When I check the sizes, the xml hash string has 66 characters and the other has 36. I used the Trim method, but still no luck.
string file = ((string[])e.Argument)[0];
string updateMD5 = "--" + ((string[])e.Argument)[1].ToUpper() + "--";
string xx = "--" + Hasher.HashFile(file, HashType.MD5).ToUpper() + "--";

// Hash the file and compare to the hash in the update xml.
int xxx = updateMD5.Trim().Length;
int xxxxx = xx.Trim().Length;
if (String.Equals(updateMD5.Trim(), xx.Trim(), StringComparison.InvariantCultureIgnoreCase))
    e.Result = DialogResult.OK;
else
    e.Result = DialogResult.No;
Hasher code:
internal static string HashFile(string filePath, HashType algo)
{
    // Open the file once and dispose it when done (the original leaked the FileStream).
    using (var stream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
    {
        switch (algo)
        {
            case HashType.MD5:
                return MakeHashString(MD5.Create().ComputeHash(stream));
            case HashType.SHA1:
                return MakeHashString(SHA1.Create().ComputeHash(stream));
            case HashType.SHA512:
                return MakeHashString(SHA512.Create().ComputeHash(stream));
            default:
                return "";
        }
    }
}

private static string MakeHashString(byte[] hash)
{
    StringBuilder s = new StringBuilder();
    foreach (byte b in hash)
        s.Append(b.ToString("x2").ToLower());

    return s.ToString();
}
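A usage sketch (the file path is made up):

string md5 = Hasher.HashFile(@"C:\updates\update.zip", HashType.MD5);
Console.WriteLine(md5); // 32 lowercase hex characters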
NOTE: I use the '--' to check whether there are trailing spaces.
StringBuilder s = new StringBuilder();
foreach (char c in updateMD5.Trim())
    s.AppendLine(string.Format("{0}=={1}", c, (int)c));
Once you showed the character-for-character output of the longer string, the explanation became clear.
As to why this happens, that's pretty much impossible to tell from our end, given the nature of the problem.
Anyway, the problem is these two:
==8204
==8203
Those two code points are 0x200C and 0x200B aka:
0x200C = ZERO WIDTH NON-JOINER
0x200B = ZERO WIDTH SPACE
These are invisible characters meant to give hints to word-breaking algorithms and similar gory stuff.
Simply put, somewhere in your code where you concatenate strings you have those two characters as part of your source code. Since they're not visible in your source code either (zero width, remember) they can be hard to spot.
I would take a look at all the strings involved; in particular, I would start with the "x2" format string used to build up the hash, or possibly the code that returns the MD5 hash for the update to apply.
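If the invisible characters cannot be hunted down in the source, one workaround (my sketch, not part of the original answer) is to strip the two zero-width code points from both strings before comparing:

using System.Linq;

static string StripZeroWidth(string s)
{
    // Removes ZERO WIDTH SPACE (U+200B) and ZERO WIDTH NON-JOINER (U+200C).
    return new string(s.Where(c => c != '\u200B' && c != '\u200C').ToArray());
}

Then compare StripZeroWidth(updateMD5) with StripZeroWidth(xx) as before.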
From a third party I get a string like this: "123123".
I'll have to wrap it in some XML; however, I get this error: System.ArgumentException: '', hexadecimal value 0x04, is an invalid character.
Can I either decode the hex value into something meaningful, or just delete it? The solution must be able to handle other hex values as well.
I ended up creating this method:
public static string RemoveInvalidXmlChars(string str)
{
    var sb = new StringBuilder();
    var decodedString = HttpUtility.HtmlDecode(str);
    foreach (var c in decodedString)
    {
        // Keep only characters that are legal in XML 1.0.
        if (XmlConvert.IsXmlChar(c))
            sb.Append(c);
    }
    return sb.ToString();
}
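A quick usage sketch (the sample input is made up; HttpUtility lives in System.Web and XmlConvert in System.Xml):

string raw = "123\u0004123";                   // contains the invalid 0x04 character
Console.WriteLine(RemoveInvalidXmlChars(raw)); // prints "123123"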
I am trying to read a string in the UTF-16 encoding scheme and perform MD5 hashing on it. But strangely, Java and C# return different results when I do so.
The following is the piece of code in Java:
public static void main(String[] args) {
    String str = "preparar mantecado con coca cola";
    try {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        digest.update(str.getBytes("UTF-16"));
        byte[] hash = digest.digest();
        String output = "";
        for (byte b : hash) {
            output += Integer.toString((b & 0xff) + 0x100, 16).substring(1);
        }
        System.out.println(output);
    } catch (Exception e) {
    }
}
The output for this is: 249ece65145dca34ed310445758e5504
The following is the piece of code in C#:
public static string GetMD5Hash()
{
    string input = "preparar mantecado con coca cola";
    System.Security.Cryptography.MD5CryptoServiceProvider x = new System.Security.Cryptography.MD5CryptoServiceProvider();
    byte[] bs = System.Text.Encoding.Unicode.GetBytes(input);
    bs = x.ComputeHash(bs);
    System.Text.StringBuilder s = new System.Text.StringBuilder();
    foreach (byte b in bs)
    {
        s.Append(b.ToString("x2").ToLower());
    }
    string output = s.ToString();
    Console.WriteLine(output);
    return output; // the method is declared string, so it must return the hash
}
The output for this is: c04d0f518ba2555977fa1ed7f93ae2b3
I am not sure why the outputs are not the same. How do we change the above pieces of code so that both of them return the same output?
UTF-16 != UTF-16.
In Java, getBytes("UTF-16") returns a big-endian representation with a byte-order mark. C#'s System.Text.Encoding.Unicode.GetBytes returns a little-endian representation without one. I can't check your code from here, but I think you'll need to specify the conversion precisely.
Try getBytes("UTF-16LE") in the Java version.
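A sketch of the endianness difference from the C# side (Encoding.BigEndianUnicode is the big-endian counterpart of Encoding.Unicode):

using System;
using System.Text;

class EndiannessDemo
{
    static void Main()
    {
        // Little-endian UTF-16, as produced by Encoding.Unicode: low byte first.
        Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes("Hi")));          // 48-00-69-00

        // Big-endian UTF-16: high byte first, matching the body of Java's "UTF-16"
        // output (Java additionally prepends the BOM FE-FF).
        Console.WriteLine(BitConverter.ToString(Encoding.BigEndianUnicode.GetBytes("Hi"))); // 00-48-00-69
    }
}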
The first thing I can find, and this might not be the only problem, is that C#'s Encoding.Unicode.GetBytes() is little-endian, while Java's natural byte order is big-endian.
You could use System.Text.Encoding.Unicode.GetString(byte[]) to convert back from bytes to a string. That way you can be sure everything happens within the same Unicode encoding.