Hashing non-ASCII characters in C#


Here are two hash generators:
http://www.md5hashgenerator.com/index.php
http://www.miraclesalad.com/webtools/md5.php
Now, my question is:
Why do the hashes differ when trying to hash the char '€' (0x80)?
I assume it happens because '€' is not a normal ASCII character.
Which of the two hashes is 'correct'?
I'm trying to calculate the hash returned by hash generator 1 with C#.
This hashing function doesn't return it.
private string GetMD5Hash(string TextToHash)
{
    if ((TextToHash == null) || (TextToHash.Length == 0))
    {
        return string.Empty;
    }
    MD5 md5 = new MD5CryptoServiceProvider();
    byte[] textToHash = Encoding.Default.GetBytes(TextToHash);
    byte[] result = md5.ComputeHash(textToHash);
    return BitConverter.ToString(result).Replace("-", "").ToLower();
}
How could I change it so it returns the hash I want?
Additional Info:
I made a little AutoIt script:
#include <Crypt.au3>
ConsoleWrite(StringLower(StringMid(_Crypt_HashData(Chr(128), $CALG_MD5),3)) & @CRLF)
and it returns the hash I want!
However I need a C# code :)

It comes down to which encoding you use to turn the string into a byte[] (hence my suggestion to try UTF-8, which is a pretty common choice here; any full Unicode encoding would work as long as both sides know which one to use). For example, based on the hash of the string "abc€", we can deduce that the first site might be using any of:
874: Thai (Windows)
936: Chinese Simplified (GB2312)
1250: Central European (Windows)
1252: Western European (Windows)
1253: Greek (Windows)
1254: Turkish (Windows)
1255: Hebrew (Windows)
1256: Arabic (Windows)
1257: Baltic (Windows)
1258: Vietnamese (Windows)
50227: Chinese Simplified (ISO-2022)
51936: Chinese Simplified (EUC)
52936: Chinese Simplified (HZ)
Personally, I'd use UTF-8!
Here's the code I used to find the candidate encodings:
MD5 md5 = new MD5CryptoServiceProvider();
foreach (var enc in Encoding.GetEncodings())
{
    byte[] textToHash = enc.GetEncoding().GetBytes("abc€");
    byte[] result = md5.ComputeHash(textToHash);
    var output = BitConverter.ToString(result).Replace("-", "").ToLower();
    if (output == "7a66042043b2cc38ba16a13c596d740e")
    { // result from http://www.md5hashgenerator.com/index.php
        Console.WriteLine(enc.CodePage + ": " + enc.DisplayName);
    }
}
Further, testing with the string "dnos ʇǝqɐɥdʃɐ" shows that the second site is definitely using UTF-8. For that string the first site matches none of the encodings, so I guess it is using a code-page-based encoding; in short, it will not work reliably with the full range of Unicode.
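To make the encoding point concrete, here is a minimal sketch that prints the bytes and the resulting MD5 for '€' under three encodings. It assumes .NET 5 or later, where the Windows code pages have to be registered through CodePagesEncodingProvider before Encoding.GetEncoding(1252) works. The helper names Hex and Md5Hex are made up for this example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Assumption: .NET 5 or later. Code-page encodings such as 1252 must be
// registered before Encoding.GetEncoding can return them.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Hypothetical helper: hex dump of a byte array, e.g. "E2-82-AC".
string Hex(byte[] bytes) => BitConverter.ToString(bytes);

// Hypothetical helper: lower-case MD5 hex digest of the given bytes.
string Md5Hex(byte[] bytes)
{
    using var md5 = MD5.Create();
    return BitConverter.ToString(md5.ComputeHash(bytes)).Replace("-", "").ToLowerInvariant();
}

string euro = "\u20AC"; // the '€' character, escaped so the source file encoding does not matter

var candidates = new (string Name, Encoding Enc)[]
{
    ("windows-1252", Encoding.GetEncoding(1252)),
    ("utf-8", Encoding.UTF8),
    ("utf-16le", Encoding.Unicode),
};

foreach (var (name, enc) in candidates)
{
    byte[] bytes = enc.GetBytes(euro);
    // Same character, different bytes, therefore different digests.
    Console.WriteLine($"{name,-12} bytes={Hex(bytes),-8} md5={Md5Hex(bytes)}");
}
```

Under windows-1252 the character is the single byte 80, under UTF-8 it is E2-82-AC, and under UTF-16LE it is AC-20, so the three digests cannot agree.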

Both of the MD5 pages you've shown describe MD5 as an operation which works on strings. It isn't - it's an operation which works on byte sequences. In order to convert from a string to a byte sequence, you need to use an encoding.
You've chosen Encoding.Default, which is almost always a bad choice; I'd generally choose Encoding.UTF8. Importantly, though, neither of those sites says which encoding it uses. In real life I would hope you'd either have control over both hashing processes (assuming there really are two) or that any hashing code you don't control will specify which encoding to use.
Note that there's a simpler way of creating an instance of MD5 - just use MD5.Create. You should also generally put it in a using statement as it implements IDisposable:
private static string GetMD5Hash(string text)
{
    if (string.IsNullOrEmpty(text))
    {
        return "";
    }
    using (var md5 = MD5.Create())
    {
        byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(text));
        return BitConverter.ToString(hash).Replace("-", "").ToLower();
    }
}

Related

C# AES decrypted output is not the same as input

I wanted to write a simple message encrypter to dip my toes into the matter, but I can't make it work. The problem is that whatever input I start with, it sometimes encrypts it, but when I try to decrypt the result it doesn't return the original string. It would be really helpful if you could tell me what I'm doing wrong or point me in the right direction.
Complete code
These are the sections in charge of encrypting and decrypting.
void Decrypt()
{
    using var crypt = Aes.Create();
    string[] input = ClipboardService.GetText()?.Split(SEPARATOR) ?? Array.Empty<string>();
    byte[] key = input[0].ToBytes();
    byte[] IV = input[^1].ToBytes();
    byte[] value = string.Join(string.Empty, input[1..^1]).ToBytes();
    crypt.IV = IV;
    crypt.Key = key;
    var decryptedValue = crypt.DecryptCbc(value, IV, PaddingMode.Zeros);
    string decryptedValueInText = decryptedValue.ToUnicodeString();
    ClipboardService.SetText(decryptedValueInText);
    LogInfoMessage($"{decryptedValueInText}: {decryptedValue.Length}");
    crypt.Clear();
}
void Encrypt()
{
    using var crypt = Aes.Create();
    crypt.GenerateKey();
    string value = ClipboardService.GetText() ?? string.Empty;
    var encryptedValue = crypt.EncryptCbc(value.ToBytes(), crypt.IV, PaddingMode.Zeros);
    string encryptedValueInText = $"{crypt.Key.ToUnicodeString()}{SEPARATOR}{encryptedValue.ToUnicodeString()}{SEPARATOR}{crypt.IV.ToUnicodeString()}";
    ClipboardService.SetText(encryptedValueInText);
    LogInfoMessage($"{encryptedValueInText}: {encryptedValue.Length}");
    crypt.Clear();
}
There are two extension methods:
public static string ToUnicodeString(this byte[] bytes) => Encoding.Unicode.GetString(bytes);
public static byte[] ToBytes(this string str) => Encoding.Unicode.GetBytes(str);
Example
The input links were:
https://www.youtube.com/
https://www.youtube.com/watch?v=bSA91XTzeuA
I don't think it matters, because the key and IV are autogenerated every time anyway, but still.

Per our discussion...
Using the clipboard to store binary data as Unicode text will fail because arbitrary bytes do not form valid UTF-16. UTF-16 encodes Unicode code points from the supplementary planes as surrogate pairs, using 32 bits in total. There are plenty of primers on the UTF-16 encoding, but basically you have a pair of 16-bit values where the first must be in the range 0xD800-0xDBFF and the second must be in the range 0xDC00-0xDFFF; a surrogate without its partner is invalid. The odds are that your encrypted data will break this rule.
As noted, if your encrypted binary data must be sent through a text-only transport you should encode the bytes in the encrypted block using Base64 or similar.
I'd also like to stress that writing methods that can be called with parameters rather than directly accessing the clipboard for I/O makes it much simpler to do testing, including round-trip tests on the various parts of the problem. Proving that the codec is working without reference to the clipboard is a good test and separation of concerns helps to more readily identify the source of problems in the future.
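Both suggestions above can be sketched together: methods that take parameters instead of touching the clipboard, and Base64 as the text form of the binary fields. This is only a sketch under assumptions, not the asker's code: it assumes .NET 6 or later (for EncryptCbc/DecryptCbc), swaps in UTF-8 for the text and the default PKCS7 padding, and the names Separator, Encrypt and Decrypt are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

const char Separator = ':'; // hypothetical; ':' never occurs in Base64 output

// Encrypts plain text and packs key, ciphertext and IV as three Base64 fields.
string Encrypt(string plainText)
{
    using var aes = Aes.Create(); // a random key and IV are generated on demand
    byte[] cipher = aes.EncryptCbc(Encoding.UTF8.GetBytes(plainText), aes.IV);
    return string.Join(Separator,
        Convert.ToBase64String(aes.Key),
        Convert.ToBase64String(cipher),
        Convert.ToBase64String(aes.IV));
}

// Reverses Encrypt: unpacks the three fields and decrypts the middle one.
string Decrypt(string packed)
{
    string[] fields = packed.Split(Separator);
    using var aes = Aes.Create();
    aes.Key = Convert.FromBase64String(fields[0]);
    byte[] plain = aes.DecryptCbc(
        Convert.FromBase64String(fields[1]),
        Convert.FromBase64String(fields[2]));
    return Encoding.UTF8.GetString(plain);
}

// Round-trip test with no clipboard involved.
string original = "https://www.youtube.com/watch?v=bSA91XTzeuA";
Console.WriteLine(Decrypt(Encrypt(original)) == original);

// Why UTF-16 text is not a safe container for arbitrary bytes:
// 00-D8 decodes to a lone high surrogate (U+D800), which gets replaced.
byte[] raw = { 0x00, 0xD8 };
byte[] viaText = Encoding.Unicode.GetBytes(Encoding.Unicode.GetString(raw));
Console.WriteLine(BitConverter.ToString(viaText)); // no longer 00-D8
```

PKCS7 padding is stripped again on decryption, whereas PaddingMode.Zeros leaves the trailing zero bytes in the decrypted output, which is a second way for the round trip to come back different from the input.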

Why do these code blocks produce different results?

Below are two similar code blocks. They take a string, hash it with SHA-512, then convert the result to Base64. I had trouble getting the second code block to produce the same results as my manual test using online calculators and encoders, so I broke the process down step by step and discovered that it could produce the same results as my manual test, but only if it behaved like the first code block. Why do these two code blocks produce different results? Thanks!
private void EditText_AfterTextChanged(object sender, AfterTextChangedEventArgs e)
{
    //This builds a string to encrypt.
    string domain = txtDomain.Text;
    string username = txtUsername.Text;
    string pin = txtPin.Text;
    txtPreview.Text = string.Format("{0}+{1}+{2}", domain, username, pin);
    //This takes the above string, encrypts it.
    StringBuilder Sb = new StringBuilder();
    SHA512Managed HashTool = new SHA512Managed();
    Byte[] PhraseAsByte = System.Text.Encoding.UTF8.GetBytes(string.Concat(txtPreview.Text));
    Byte[] EncryptedBytes = HashTool.ComputeHash(PhraseAsByte);
    HashTool.Clear();
    //This rebuilds the calculated hash for manual comparison.
    foreach (Byte b in EncryptedBytes)
        Sb.Append(b.ToString("x2"));
    txtHash.Text = Sb.ToString();
    //This takes the rebuilt hash and re-converts it to bytes before encoding it in Base64
    EncryptedBytes = System.Text.Encoding.UTF8.GetBytes(string.Concat(txtHash.Text));
    txtResult.Text = Convert.ToBase64String(EncryptedBytes);
}
and
private void EditText_AfterTextChanged(object sender, AfterTextChangedEventArgs e)
{
    //This builds a string to encrypt.
    string domain = txtDomain.Text;
    string username = txtUsername.Text;
    string pin = txtPin.Text;
    txtPreview.Text = string.Format("{0}+{1}+{2}", domain, username, pin);
    //This takes the above string, encrypts it.
    StringBuilder Sb = new StringBuilder();
    SHA512Managed HashTool = new SHA512Managed();
    Byte[] PhraseAsByte = System.Text.Encoding.UTF8.GetBytes(string.Concat(txtPreview.Text));
    Byte[] EncryptedBytes = HashTool.ComputeHash(PhraseAsByte);
    HashTool.Clear();
    //This takes the EncryptedBytes and converts them to base64.
    txtResult.Text = Convert.ToBase64String(EncryptedBytes);
    //This reverses the EncryptedBytes into readable hash for manual comparison
    foreach (Byte b in EncryptedBytes)
        Sb.Append(b.ToString("x2"));
    txtHash.Text = Sb.ToString();
}
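Reading the two blocks side by side, the only difference that affects txtResult appears to be what gets passed to Convert.ToBase64String: the first block Base64-encodes the UTF-8 bytes of the 128-character hex string, while the second Base64-encodes the 64 raw hash bytes. A minimal sketch of that difference, using a made-up input string and variable names that are not from the question:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

byte[] hash;
using (var sha = SHA512.Create())
{
    // Made-up input in the "domain+username+pin" shape used by the question.
    hash = sha.ComputeHash(Encoding.UTF8.GetBytes("example.com+alice+1234"));
}

// Hex text of the hash: 128 characters for a 64-byte SHA-512 digest.
string hex = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();

// What the second block does: Base64 of the 64 raw bytes.
string base64OfRawHash = Convert.ToBase64String(hash);

// What the first block does: Base64 of the 128 bytes of hex text.
string base64OfHexText = Convert.ToBase64String(Encoding.UTF8.GetBytes(hex));

Console.WriteLine(base64OfRawHash.Length); // 88
Console.WriteLine(base64OfHexText.Length); // 172
```

Online tools that take the hex digest as pasted text and Base64-encode it behave like the first block, which would explain why only that variant matched the manual test.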

Found the answer, no thanks to your less-than-useful downvotes..
Encoding.Unicode is Microsoft's misleading name for UTF-16 (a double-wide encoding, used in the Windows world for historical reasons but not used by anyone else). http://msdn.microsoft.com/en-us/library/system.text.encoding.unicode.aspx
If you inspect your bytes array, you'll see that every second byte is 0x00 (because of the double-wide encoding).
You should be using Encoding.UTF8.GetBytes instead.
But also, you will see different results depending on whether or not you consider the terminating '\0' byte to be part of the data you're hashing. Hashing the two bytes "Hi" will give a different result from hashing the three bytes "Hi\0". You'll have to decide which you want to do. (Presumably you want to do whichever one your friend's PHP code is doing.)
For ASCII text, Encoding.UTF8 will definitely be suitable. If you're aiming for perfect compatibility with your friend's code, even on non-ASCII inputs, you'd better try a few test cases with non-ASCII characters such as é and 家 and see whether your results still match up. If not, you'll have to figure out what encoding your friend is really using; it might be one of the 8-bit "code pages" that used to be popular before the invention of Unicode. (Again, I think Windows is the main reason that anyone still needs to worry about "code pages".)
Source: Hashing a string with Sha256
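The points in the quoted answer can be checked with a few lines. This is a minimal sketch; the helper name Sha256Hex is made up for the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: lower-case SHA-256 hex digest of raw bytes.
string Sha256Hex(byte[] data)
{
    using var sha = SHA256.Create();
    return BitConverter.ToString(sha.ComputeHash(data)).Replace("-", "").ToLowerInvariant();
}

byte[] utf16 = Encoding.Unicode.GetBytes("Hi");   // UTF-16LE: every second byte is 00
byte[] utf8 = Encoding.UTF8.GetBytes("Hi");       // two bytes
byte[] utf8Nul = Encoding.UTF8.GetBytes("Hi\0");  // three bytes, with the terminator

Console.WriteLine(BitConverter.ToString(utf16));   // 48-00-69-00
Console.WriteLine(BitConverter.ToString(utf8));    // 48-69
Console.WriteLine(BitConverter.ToString(utf8Nul)); // 48-69-00

// Three different byte sequences, so three different digests.
Console.WriteLine(Sha256Hex(utf16));
Console.WriteLine(Sha256Hex(utf8));
Console.WriteLine(Sha256Hex(utf8Nul));
```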

Trying to replicate C# hashing in Perl (on Linux)

I'm not coming up with the same values (using a known password).
I suspect it may be something having to do with encodings, but all the things I've tried haven't worked thus far:
windows code (c#?):
private static string EncodePassword(string password, string salt)
{
    string encodedPassword = password;
    HMACSHA1 hash = new HMACSHA1 { Key = Convert.FromBase64String(salt) };
    encodedPassword = Convert.ToBase64String(hash.ComputeHash(Encoding.Unicode.GetBytes(password)));
    return encodedPassword;
}
perl code run on linux:
use Modern::Perl '2015';
use Digest::SHA qw(hmac_sha1 hmac_sha1_base64);
use MIME::Base64 qw(decode_base64 encode_base64);
use Unicode::String qw(utf16be utf16le);
say encode_base64(hmac_sha1($password, decode_base64($salt)));
# (or, equivalently)
say hmac_sha1_base64($password, decode_base64($salt));
my $le16 = utf16le($password);
my $be16 = utf16be($password);
say "ok, try utf-16 (le, then be)...";
say encode_base64(hmac_sha1($le16, decode_base64($salt)));
say encode_base64(hmac_sha1($be16, decode_base64($salt)));
# try reversing the hmac output?
my $hmac_bytes = hmac_sha1($password, decode_base64($salt));
my $rev_bytes = reverse $hmac_bytes;
say encode_base64($rev_bytes);

In the original C# code, in this line:
encodedPassword = Convert.ToBase64String(hash.ComputeHash(Encoding.Unicode.GetBytes(password)));
a call to Encoding.Unicode.GetBytes transforms the password to a byte array via a UTF-16LE encoder.
You have to do the same transformation to get the same hash in Perl:
use Digest::SHA qw(hmac_sha1);
use MIME::Base64 qw(decode_base64 encode_base64);
use Encode qw(encode);
my $utf16LEPassword = encode("UTF-16LE", $password);
print encode_base64(hmac_sha1($utf16LEPassword, decode_base64($salt)));
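When the two sides still disagree, it can help to dump the intermediate bytes on the C# side and compare them with what the Perl script feeds into hmac_sha1. A minimal sketch, assuming made-up sample values for the password and salt (substitute the real ones):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Made-up sample inputs; replace with the real password and Base64 salt.
string password = "secret";
string salt = Convert.ToBase64String(new byte[] { 1, 2, 3, 4, 5, 6, 7, 8 });

byte[] key = Convert.FromBase64String(salt);
byte[] message = Encoding.Unicode.GetBytes(password); // UTF-16LE, no byte-order mark

using var hmac = new HMACSHA1(key);
string encoded = Convert.ToBase64String(hmac.ComputeHash(message));

// Each of these should match the corresponding value on the Perl side.
Console.WriteLine("key     " + BitConverter.ToString(key));
Console.WriteLine("message " + BitConverter.ToString(message));
Console.WriteLine("result  " + encoded);
```

If the key and message bytes match but the result does not, the remaining suspect is the argument order or the Base64 step rather than the text encoding.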

classic asp use capicom for md5 hashing - result differs from .net System.Security.Cryptography

When using CAPICOM in Classic ASP (VBScript) to perform MD5 hashing like so:
With server.CreateObject("CAPICOM.HashedData")
    .Algorithm = 3 ' CAPICOM_HASH_ALGORITHM_MD5
    .Hash "password"
    md5Pwd = .Value
End With
I get this result: B081DBE85E1EC3FFC3D4E7D0227400CD
When I use .NET, I get this result: 5f4dcc3b5aa765d61d8327deb882cf99
Why are the MD5 strings different? What am I doing wrong?
Here is my C# function:
MD5 md5Hasher = MD5.Create();
byte[] data = md5Hasher.ComputeHash( Encoding.Default.GetBytes( val ) );
StringBuilder sBuilder = new StringBuilder();
// Loop through each byte of the hashed data
// and format each one as a hexadecimal string.
for( int i = 0; i < data.Length; i++ ) {
    sBuilder.Append( data[i].ToString( "x2" ) );
}
// Return the hexadecimal string.
return sBuilder.ToString();

The problem is that you are using Encoding.Default, which is the system's ANSI code page on .NET Framework (and UTF-8 on .NET Core and later), so ASCII text such as "password" becomes one byte per character. At the same time, “CAPICOM manipulates only Unicode strings while validating and generating digital signatures”.
So, Encoding.Default.GetBytes produces one-byte characters here (and, with an ANSI code page, loses any character the code page cannot represent), while CAPICOM.HashedData hashes the string as 2-byte UTF-16 code units.
Replace Encoding.Default with Encoding.Unicode to make your .NET implementation compatible with CAPICOM.
One more note, use data[i].ToString("X2") to produce upper-case result, as you have in CAPICOM implementation.
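Putting both changes together, a minimal sketch of a CAPICOM-compatible hash could look like this. The helper name Md5Hex is made up, and the expected values in the comments are the two digests quoted in the question.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: upper-case MD5 hex digest of the text in the given encoding.
string Md5Hex(string text, Encoding encoding)
{
    using var md5 = MD5.Create();
    byte[] digest = md5.ComputeHash(encoding.GetBytes(text));
    var sb = new StringBuilder(digest.Length * 2);
    foreach (byte b in digest)
    {
        sb.Append(b.ToString("X2")); // "X2" gives upper-case hex, as CAPICOM does
    }
    return sb.ToString();
}

// UTF-16LE bytes: the CAPICOM value from the question.
Console.WriteLine(Md5Hex("password", Encoding.Unicode));

// One byte per character: the .NET value from the question, upper-cased.
Console.WriteLine(Md5Hex("password", Encoding.UTF8));
```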

How can I reproduce a SHA512 hash in C# that fits the PHP SHA512?

The question is pretty much self-explanatory. I Googled many sites, many methods, tried many encodings, but I can't get it to match.
I'm trying to make the string "asdasd" match. (http://www.fileformat.info/tool/hash.htm?text=asdasd)

Try this
using System;
using System.Security.Cryptography;
using System.Text;
public static string HashPassword(string unhashedPassword)
{
    // Encoding.Default varies by machine; UTF-8 yields the same bytes for ASCII input.
    using (var sha512 = SHA512.Create())
        return BitConverter.ToString(sha512.ComputeHash(Encoding.UTF8.GetBytes(unhashedPassword))).Replace("-", String.Empty).ToUpper();
}

BitConverter works just fine ...
var testVal = "asdasd";
var enc = new ASCIIEncoding();
var bytes = enc.GetBytes( testVal );
var sha = new SHA512Managed();
var result = sha.ComputeHash( bytes );
var resStr = BitConverter.ToString( result );
var nodash = resStr.Replace( "-", "" );
nodash.Dump();
(Fixed for 512-bit hash, sorry :)
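For comparison with PHP, here is a minimal sketch that assumes the PHP side calls hash('sha512', $text) on a UTF-8 string, which returns lower-case hex. The helper name Sha512Hex is made up for this example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: lower-case SHA-512 hex digest of the UTF-8 bytes of the text.
string Sha512Hex(string text)
{
    using var sha = SHA512.Create();
    byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(text));
    return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
}

// Compare this line character by character with the PHP output.
Console.WriteLine(Sha512Hex("asdasd"));
```

If the digits match but the comparison still fails, check the letter case: BitConverter produces upper-case hex unless it is lowered explicitly.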

I just spent several hours trying to get a .NET hash function to match PHP's Crypt function. Not fun.
There are multiple challenges here, since the PHP implementation of Crypt returns a string in its own base64-like encoding and performs multiple hashing rounds (5000 is the default for SHA-512 Crypt), which a single call to a .NET hash class doesn't do. I was not able to get matching output from .NET using several libraries until I found CryptSharp. It accepts a salt similar to the one taken by PHP's (or the original C) function (e.g. "$6$rounds=5000$mysalt"). Note that there is no trailing $, and that if you don't provide a salt it will autogenerate a random one.
You can find CryptSharp here:
http://www.zer7.com/software.php?page=cryptsharp
Good background reading:
- http://www.akkadia.org/drepper/SHA-crypt.txt
